PCA
As described previously, PCA is a commonly used and very effective dimensionality reduction technique that often forms a pre-processing stage for machine learning models and techniques. For this reason, we dedicate this section of the book to looking at PCA in more detail than any of the other methods. PCA reduces the sparsity in the dataset by separating the data into a series of components, where each component represents a source of information within the data. As its name suggests, the first component produced in PCA, the principal component, captures the largest share of the information, or variance, within the data. The principal component can often be thought of as contributing the most interesting information in addition to the mean. Each subsequent component contributes less information, but more subtlety, to the compressed data. If we consider all of these components together, there will be no benefit from using PCA, as the original dataset will...
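The ordering of components by variance can be illustrated with a minimal sketch. It assumes NumPy and uses a small synthetic dataset. The helper names `pca` and `reconstruct` are hypothetical and introduced only for illustration, and the eigendecomposition of the covariance matrix used here is one common way to compute PCA, not necessarily the approach taken later in this book.

```python
import numpy as np


def pca(data, n_components):
    """Hypothetical helper: PCA via eigendecomposition of the covariance matrix."""
    mean = data.mean(axis=0)
    centered = data - mean                       # PCA works on mean-centred data
    cov = np.cov(centered, rowvar=False)         # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # re-sort so the largest variance is first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()              # fraction of total variance per component
    return eigvecs[:, :n_components], ratio[:n_components], mean


def reconstruct(data, components, mean):
    """Hypothetical helper: project onto the components and map back."""
    scores = (data - mean) @ components
    return scores @ components.T + mean


# Synthetic data (an assumption for illustration): two correlated
# features driven by one latent signal, plus a weaker independent one.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
data = np.hstack([3.0 * latent, latent, 0.5 * rng.normal(size=(200, 1))])
data = data + rng.normal(scale=0.1, size=data.shape)

for k in (1, 2, 3):
    components, ratio, mean = pca(data, k)
    error = np.mean((data - reconstruct(data, components, mean)) ** 2)
    print(f"k={k}  variance ratios={np.round(ratio, 3)}  reconstruction MSE={error:.6f}")
```

In this sketch, each additional component accounts for a smaller fraction of the variance, and the reconstruction error shrinks as more components are kept. With all three components retained, the projection is only a rotation of the centred data, so nothing has been compressed.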