Supervised Data Compression using Linear Discriminant Analysis (LDA)
As discussed previously, PCA transforms the features into a new set of variables (components) that capture as much of the variance in the features as possible. PCA is unsupervised: the output labels are not considered when fitting the model. LDA, by contrast, uses the dependent variable to compress the data into features that best discriminate between the classes of the outcome variable. In this section, we will walk through how to use LDA as a supervised data compression technique.
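The difference shows up directly in how the two models are fit. The following is a minimal sketch using scikit-learn, assuming synthetic stand-in data from make_classification (not the glass data used in this chapter); the variable names are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in data (an assumption for illustration):
# 300 samples, 6 features, 3 classes
X, y = make_classification(
    n_samples=300,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# PCA is unsupervised: only the features are passed to the model
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# LDA is supervised: the class labels are required when fitting
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)

print(X_pca.shape, X_lda.shape)
```

Both calls return two columns here, but only the LDA components were chosen with the class labels in view.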
To demonstrate LDA as a supervised data compression technique, we will:
- Fit an LDA model with all possible n_components
- Transform our features to n_components
- Tune the value of n_components
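As a rough preview, the three steps above can be sketched with scikit-learn. This sketch assumes the wine dataset bundled with scikit-learn as a stand-in (not the glass data used in the exercises), and assumes a logistic regression scored with 5-fold cross-validation as the way to compare values of n_components; both choices are illustrative. Note that LDA can produce at most min(number of classes - 1, number of features) components.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Stand-in data (an assumption): 178 samples, 13 features, 3 classes
X, y = load_wine(return_X_y=True)

# LDA allows at most min(n_classes - 1, n_features) components
max_components = min(len(np.unique(y)) - 1, X.shape[1])

# Step 1: fit an LDA model with all possible components
lda = LinearDiscriminantAnalysis(n_components=max_components)
lda.fit(X, y)
print(lda.explained_variance_ratio_)

# Step 2: transform the features into the LDA components
X_lda = lda.transform(X)
print(X_lda.shape)

# Step 3: tune n_components by cross-validating a downstream classifier
# (the classifier and scoring scheme are illustrative choices)
scores = {}
for n in range(1, max_components + 1):
    model = make_pipeline(
        LinearDiscriminantAnalysis(n_components=n),
        LogisticRegression(max_iter=1000),
    )
    scores[n] = cross_val_score(model, X, y, cv=5).mean()

best_n = max(scores, key=scores.get)
print(scores, best_n)
```

Placing LDA inside the pipeline means it is refit on each training fold, so the held-out labels do not leak into the compression step.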
Exercise 42: Fitting LDA Model
To fit the model as a supervised learner using the default parameters of the LDA algorithm, we will use a slightly different glass dataset, glass_w_outcome.csv (https://github.com/TrainingByPackt/Data-Science-with-Python/tree/master/Chapter04). This dataset...