Using PCA and LDA for dimensionality reduction
We’ll start our hands-on activities in this chapter with dimensionality reduction using PCA and LDA. We can use the wine dataset within scikit-learn as an example. I always wish I could impress my friends by being a wine expert, but I can barely tell a $10 bottle from a $500 bottle, so instead, I’ll use data science to develop impressive knowledge.
The wine dataset is an example of a multivariate dataset that contains the results of a chemical analysis of wines grown in the same region in Italy but derived from three different types of grapes (referred to as cultivars). The analysis focused on quantifying 13 constituents found in each of the three types of wines.
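As a quick sanity check on that description, here is a minimal sketch that loads the dataset with scikit-learn's load_wine function and inspects its shape. The variable names (wine, X, y, and so on) are just illustrative choices.

```python
from sklearn.datasets import load_wine

# Load the wine dataset bundled with scikit-learn
wine = load_wine()
X, y = wine.data, wine.target

# Rows are individual wines; columns are the measured constituents
n_samples, n_features = X.shape
feature_names = list(wine.feature_names)
class_names = list(wine.target_names)

print(f"{n_samples} wines, {n_features} features")
print("Features:", feature_names)
print("Classes (cultivars):", class_names)
```

The number of feature columns should match the 13 constituents, and the number of classes should match the three cultivars.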
Using PCA on this dataset will help us understand which features matter most. By looking at the weights of the original features in each principal component, we can see which features contribute most to the variability in the wine dataset.
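To make the idea of feature weights concrete, here is a minimal sketch of how it might look in scikit-learn. It makes two assumptions that are not stated above: the features are standardized first with StandardScaler (PCA is sensitive to scale, and the constituents are measured in different units), and only two components are kept. The weights live in the components_ attribute of the fitted PCA object, with one row per component and one column per original feature.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

wine = load_wine()

# Assumption: standardize so that no feature dominates purely because of its units
X_scaled = StandardScaler().fit_transform(wine.data)

# Assumption: keep two components for illustration
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Each row of components_ holds the weights of the 13 original features
for i, component in enumerate(pca.components_, start=1):
    # Rank features by the absolute size of their weight in this component
    order = np.argsort(np.abs(component))[::-1][:3]
    top = [(wine.feature_names[j], round(float(component[j]), 3)) for j in order]
    print(f"PC{i} top features:", top)

# Fraction of the total variance captured by each component
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

Features with the largest absolute weights in the first component are the ones that contribute most to the main direction of variability. The sign of a weight only indicates direction along the component, so it is the magnitude that matters for this comparison.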
Again, we can use...