Principal component analysis
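One of the most commonly used methods of dimensionality reduction is Principal Component Analysis (PCA). Conceptually, PCA computes the axes along which the variation in the data is greatest. You may recall that in Chapter 3, Finding Patterns in the Noise – Clustering and Unsupervised Learning, we calculated the eigenvalues of the adjacency matrix of a dataset to perform spectral clustering. In PCA, we also want to find eigenvalues and their eigenvectors, but here, instead of an adjacency matrix, we will use the covariance matrix of the data, which describes the variation within each column (the variances on its diagonal) and between each pair of columns (the covariances off the diagonal). The covariance for columns $x_i$ and $x_j$ in the data matrix $X$ is given by:

$$\mathrm{Cov}(x_i, x_j) = \frac{1}{n} \sum_{k=1}^{n} \left(x_{ki} - \bar{x}_i\right)\left(x_{kj} - \bar{x}_j\right)$$

where $n$ is the number of rows in $X$, $x_{ki}$ is the value in row $k$ of column $i$, and $\bar{x}_i$ is the mean of column $i$.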
This is the average product of the offsets from the mean column values. We saw this value before when we computed the correlation coefficient in Chapter 3, Finding Patterns in the Noise – Clustering and Unsupervised Learning, as it is the numerator of the Pearson coefficient. Let us use a simple...
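
The covariance definition and the eigenvalue computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the chapter's own code: the helper names `covariance_matrix` and `principal_axes` and the toy matrix `X` are assumptions introduced here. The sketch divides by n to match the "average product" wording, whereas `np.cov` divides by n - 1 by default.

```python
import numpy as np


def covariance_matrix(X):
    """Average product of the offsets from the column means (divides by n)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)  # subtract each column's mean
    return centered.T @ centered / X.shape[0]


def principal_axes(X):
    """Eigenvalues and eigenvectors of the covariance matrix, largest variance first."""
    cov = covariance_matrix(X)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]


# Hypothetical toy data: four rows (observations), two columns (features)
X = np.array([[1.0, 2.0],
              [2.0, 4.1],
              [3.0, 5.9],
              [4.0, 8.2]])

cov = covariance_matrix(X)
eigenvalues, eigenvectors = principal_axes(X)
print(cov)
print(eigenvalues)
print(eigenvectors[:, 0])  # direction of greatest variation
```

The first column of `eigenvectors` is the axis along which the toy data varies most, and the matching entry of `eigenvalues` is the variance along that axis.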