Principal Component Analysis (PCA) is a dimensionality reduction technique widely used in data analysis when a dataset has many numerical variables, some of which may be correlated, and we want to reduce the number of dimensions needed to understand it.
It is useful both for interpretation, since data in more than three dimensions is hard to visualize, and for speeding up computationally intensive algorithms, especially when there are many variables. With PCA, we can often concentrate most of the information into just a few new variables (sometimes only one or two), constructed in a very specific way: they capture as much variance as possible and, as an added benefit, are uncorrelated with one another by construction.
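To make the two properties just described concrete (most of the variance in a few new variables, and no correlation between them), here is a minimal sketch using NumPy. The function name `pca`, the synthetic dataset, and the choice to compute the components through a singular value decomposition of the centered data are all assumptions made for illustration, not a prescribed implementation.

```python
import numpy as np


def pca(X, n_components):
    """Illustrative PCA via SVD of the centered data (hypothetical helper).

    Returns the projected data (scores), the component directions, and the
    fraction of total variance captured by each kept component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each variable at zero
    # Rows of Vt are orthonormal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained_variance = S**2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    components = Vt[:n_components]
    scores = Xc @ components.T  # the new, uncorrelated variables
    return scores, components, ratio[:n_components]


# Synthetic example: three correlated variables driven by one hidden factor
# plus a small amount of independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X = latent @ np.array([[2.0, 1.0, -1.0]]) + 0.1 * rng.normal(size=(500, 3))

scores, components, ratio = pca(X, n_components=2)
print("explained variance ratio:", ratio)
print("covariance of the new variables:\n", np.cov(scores, rowvar=False))
```

In this made-up dataset the first new variable should carry almost all of the variance, and the off-diagonal entries of the printed covariance matrix should be zero up to floating-point error. Real data is rarely this clean, so how many components are worth keeping has to be judged case by case.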
The first principal component is a linear combination of the original variables which captures the maximum...