Feature engineering for numerical data
We’ll introduce feature engineering for numerical data with a technique that we used previously for visualizing data: principal component analysis (PCA).
PCA
PCA transforms a set of variables into components that are linearly uncorrelated with one another. The first component explains the largest share of the variability in the data, so it tends to be correlated with most of the original variables; each subsequent component explains as much of the remaining variability as possible. Figure 7.3 illustrates such a transformation:
Figure 7.3 – Graphical illustration of the PCA transformation from two dimensions to two dimensions
This figure contains two sets of axes: the blue ones, which are the original coordinates, and the orange ones, which are the imaginary axes that provide the coordinates for the principal components. The transformation does not move the data points relative to one another; instead, it finds a new set of axes that align with the directions in which the data points vary the most. Here, we can see that the transformed Y axis...
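The transformation from two dimensions to two dimensions can be illustrated with a minimal sketch in NumPy. The data here is synthetic and made up purely for illustration, and the variable names (`data`, `centered`, `components`, and so on) are assumptions of this sketch, not names taken from the chapter's code. The sketch obtains the new axes from the eigenvectors of the covariance matrix, which is one common way to compute principal components; library implementations may take a different numerical route.

```python
import numpy as np

# Synthetic, made-up 2D data: y depends roughly linearly on x, plus noise.
rng = np.random.default_rng(seed=42)
x = rng.normal(loc=0.0, scale=3.0, size=500)
y = 0.5 * x + rng.normal(loc=0.0, scale=1.0, size=500)
data = np.column_stack([x, y])

# PCA works on centered data, so subtract the mean of each variable.
centered = data - data.mean(axis=0)

# The eigenvectors of the covariance matrix give the directions of the new
# axes; the eigenvalues are the variances of the data along those axes.
covariance = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# eigh returns eigenvalues in ascending order; put the largest first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Express every data point in the coordinates of the new axes.
components = centered @ eigenvectors

# Share of the total variance carried by each component.
explained_variance_ratio = eigenvalues / eigenvalues.sum()

# Correlation between the first and the second component.
component_correlation = np.corrcoef(components, rowvar=False)[0, 1]

print("Share of variance per component:", explained_variance_ratio)
print("Correlation between components:", component_correlation)
```

If the sketch behaves as intended, the first component should carry most of the variance, and the correlation between the two components should be zero up to floating-point error. The distance of each point from the center of the data is the same before and after the transformation, which is what it means for the new axes to be a re-orientation of the old ones and not a distortion of the data.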