A deeper look into the principal components
Before we move on to our second feature transformation algorithm, it is important to understand how principal components are interpreted:
- Our iris dataset is a 150 x 4 matrix, and when we calculated our PCA components with n_components set to 2, we obtained a components matrix of size 2 x 4:
# how to interpret and use components
pca.components_ # a 2 x 4 matrix

array([[ 0.52237162, -0.26335492, 0.58125401, 0.56561105],
       [ 0.37231836, 0.92555649, 0.02109478, 0.06541577]])
- Just like in our manual example of calculating eigenvectors, the components_ attribute can be used to project data using matrix multiplication. We do so by multiplying our original dataset with the transpose of the components_ matrix:
# Multiply original matrix (150 x 4) by components transposed (4 x 2) to get new columns (150 x 2)
np.dot(X_scaled, pca.components_.T)[:5,]

array([[-2.26454173, 0.5057039 ],
       [-2.0864255 , -0.65540473],
       [-2.36795045, -0.31847731],
       [-2...
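
The projection described above can be checked end to end with a short, self-contained sketch. It assumes that X_scaled was produced by applying scikit-learn's StandardScaler to the iris data, which is not shown in this section; the names manual, builtin, shapes, and projections_match are introduced here only for illustration. Because PCA's transform method subtracts the column means learned during fit before projecting, the manual matrix product should agree with it only when the data is already centered, as standardized data is.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: X_scaled is the standardized iris data (150 x 4)
X = load_iris().data
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA with two components; components_ is then a 2 x 4 matrix
pca = PCA(n_components=2)
pca.fit(X_scaled)

# Manual projection: (150 x 4) times (4 x 2) gives (150 x 2)
manual = np.dot(X_scaled, pca.components_.T)

# Built-in projection for comparison
builtin = pca.transform(X_scaled)

shapes = (X_scaled.shape, pca.components_.shape, manual.shape)
projections_match = np.allclose(manual, builtin)

print(shapes)
print(projections_match)
```

If the data were not centered, the two results would differ by a constant offset per column, so the manual product is best read as a way of understanding what transform does rather than as a replacement for it.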