Principal Component Analysis (PCA)
At a high level, PCA is a technique for creating uncorrelated linear combinations of the original features, termed principal components. The first component explains the greatest proportion of variance in the data, and each subsequent component accounts for progressively less.
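To make these two properties concrete, the following minimal sketch fits PCA to a small synthetic dataset and checks that the explained variance ratios decrease and that the component scores are uncorrelated. The data, the mixing matrix, and the variable names are assumptions made up for illustration; they are not the data used in the exercises.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: four correlated features built by mixing
# four independent random signals (not the exercise data).
rng = np.random.default_rng(0)
signals = rng.normal(size=(200, 4))
mixing = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0, 0.3],
])
X = signals @ mixing

# Keep all components so the ratios cover the full variance.
pca = PCA()
scores = pca.fit_transform(X)

# Proportion of variance explained by each component, largest first.
ratios = pca.explained_variance_ratio_
print(ratios)

# Correlation matrix of the component scores; the off-diagonal
# entries should be zero up to floating-point error.
corr = np.corrcoef(scores, rowvar=False)
print(np.round(corr, 6))
```

With all components retained, the ratios sum to 1, which is why a cumulative sum of them can later be compared against a variance threshold.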
To demonstrate PCA, we will:
- Fit a PCA model with all principal components
- Tune the number of principal components by setting a threshold for the proportion of explained variance to retain
- Fit a k-means cluster analysis on those components and compare k-means performance before and after the PCA transformation
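The three steps above can be sketched end to end as follows. This is only an illustrative outline under stated assumptions: the data comes from make_blobs rather than the exercise dataset, the 0.95 variance threshold and the choice of three clusters are arbitrary, and the silhouette score is used as one possible way to compare k-means performance, not necessarily the measure used in the exercises.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the scaled exercise data.
X, _ = make_blobs(n_samples=300, n_features=8, centers=3, random_state=0)
scaled = StandardScaler().fit_transform(X)

# Step 1: fit PCA with all principal components.
full_pca = PCA().fit(scaled)

# Step 2: keep the smallest number of components whose cumulative
# explained variance reaches the threshold (0.95 is an assumption).
threshold = 0.95
cumulative = np.cumsum(full_pca.explained_variance_ratio_)
n_keep = int(np.argmax(cumulative >= threshold)) + 1
reduced = PCA(n_components=n_keep).fit_transform(scaled)

# Step 3: run k-means before and after the transformation and
# compare with the silhouette score (an assumed comparison metric).
def kmeans_silhouette(data, k=3):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
    return silhouette_score(data, labels)

score_before = kmeans_silhouette(scaled)
score_after = kmeans_silhouette(reduced)
print(f"components kept: {n_keep}")
print(f"silhouette before PCA: {score_before:.3f}")
print(f"silhouette after PCA:  {score_after:.3f}")
```

scikit-learn can also select the number of components directly when n_components is given as a fraction between 0 and 1, for example PCA(n_components=0.95, svd_solver="full"); the explicit cumulative sum is used here to keep the thresholding step visible.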
Exercise 39: Fitting a PCA Model
In this exercise, you will learn to fit a generic PCA model using the data we prepared in Exercise 34, Building an HCA Model, and the brief explanation of PCA above.
- Instantiate a PCA model as shown here:
from sklearn.decomposition import PCA
model = PCA()
- Fit the PCA model to scaled_features, as shown in the following code...