Principal Component Analysis
In some datasets, features are heavily correlated with each other. For example, the speed and the fuel consumption would be heavily correlated in a go-kart with a single gear. While finding these correlations can be useful for some applications, data mining algorithms typically do not need the redundant information.
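To make the go-kart example concrete, here is a minimal sketch that measures how strongly two features correlate. The readings are synthetic and invented purely for illustration (the linear relationship, the noise level, and the units are all assumptions), and NumPy is assumed to be installed.

```python
import numpy as np

# Synthetic, made-up readings for a hypothetical single-gear go-kart.
rng = np.random.default_rng(14)
speed = rng.uniform(5, 40, size=200)                # assumed unit: km/h
# Assumption for this sketch: fuel use is roughly linear in speed, plus noise.
fuel = 0.05 * speed + rng.normal(0, 0.1, size=200)  # assumed unit: litres/hour

# Pearson correlation between the two features.
# A value close to 1 means the second feature adds little new information.
correlation = np.corrcoef(speed, fuel)[0, 1]
print(f"correlation: {correlation:.3f}")
```

A correlation this close to 1 is the kind of redundancy the text describes: knowing one feature tells us almost everything about the other.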
The ads dataset has heavily correlated features, as many of the keywords are repeated across the alt text and caption.
The Principal Component Analysis (PCA) algorithm aims to find combinations of features that describe the dataset using less information. It aims to discover principal components: new features, each a combination of the original ones, that do not correlate with each other and that explain the information (specifically, the variance) of the dataset. In practice, this means that we can often capture most of the information in a dataset with fewer features.
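The idea can be sketched by hand with NumPy. The following is not the implementation any particular library uses; it is one standard way to compute principal components, by eigendecomposing the covariance matrix of the centred data. The dataset is synthetic and made up for illustration: three features, where the second is nearly a scaled copy of the first.

```python
import numpy as np

rng = np.random.default_rng(14)
n_samples = 500

# Synthetic data (an assumption for this sketch): feature 2 is almost a
# scaled copy of feature 1, and feature 3 is independent noise.
a = rng.normal(size=n_samples)
X = np.column_stack([
    a,
    2 * a + rng.normal(scale=0.1, size=n_samples),
    rng.normal(size=n_samples),
])

# Centre the data, then eigendecompose its covariance matrix.
X_centred = X - X.mean(axis=0)
covariance = np.cov(X_centred, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# eigh returns eigenvalues in ascending order; put the largest first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Project the data onto the eigenvectors to get the principal components.
components = X_centred @ eigenvectors

# Share of the total variance explained by each component.
explained_variance_ratio = eigenvalues / eigenvalues.sum()
print(np.round(explained_variance_ratio, 3))

# The covariance of the components is diagonal: they do not correlate.
print(np.round(np.cov(components, rowvar=False), 3))
```

In this sketch, the first two components account for nearly all of the variance, so the third could be dropped with little loss. The near-zero off-diagonal entries in the second printout show that the components do not correlate with each other.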
We apply PCA just like any other transformer. It has one key parameter, which is the number of components to find. By default, it will result...