Using principal component analysis
A very different approach to feature selection from any of the methods we have discussed so far is principal component analysis (PCA). PCA allows us to replace the existing feature set with a limited number of components, each of which explains a substantial share of the variance. It does this by finding the component that captures the largest amount of variance, followed by a second component that captures the largest amount of the remaining variance, then a third component, and so on. One key advantage of this approach is that these components, known as principal components, are uncorrelated with one another. We discuss PCA in detail in Chapter 15, Principal Component Analysis.
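To make the idea concrete before we turn to real data, here is a minimal sketch using scikit-learn's PCA on synthetic data. The data, the variable names, and the choice of three components are all assumptions made for illustration; they are not taken from the NLS data. The sketch standardizes five features (two correlated pairs plus one independent feature), extracts three components, and then checks the two properties described above: each successive component explains less variance than the one before it, and the components are uncorrelated.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): two latent factors
# drive four noisy features, and a fifth feature is independent noise.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 2))
X = np.column_stack([
    latent[:, 0] + 0.1 * rng.normal(size=n),
    latent[:, 0] + 0.1 * rng.normal(size=n),
    latent[:, 1] + 0.1 * rng.normal(size=n),
    latent[:, 1] + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
])

# PCA is sensitive to scale, so standardize the features first.
X_scaled = StandardScaler().fit_transform(X)

# Keep three components; this number is an arbitrary choice for the sketch.
pca = PCA(n_components=3)
components = pca.fit_transform(X_scaled)

# Share of total variance explained by each component, in decreasing order.
explained = pca.explained_variance_ratio_
print(explained.round(3))

# Correlation matrix of the component scores; off-diagonal values
# should be zero up to floating-point error.
component_corr = np.corrcoef(components, rowvar=False)
print(component_corr.round(3))
```

In this sketch, three components stand in for five features while retaining most of the variance, which is the trade-off we are looking for when we use PCA in place of the original feature set.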
Although I include PCA here as a feature selection approach, it is probably better to think of it as a tool for dimension reduction, because each component is a linear combination of the original features rather than a subset of them. We use it for feature selection when we need to limit the number of dimensions without sacrificing too much explanatory power.
Let's work with the NLS data again and use PCA to...