Principal Component Analysis
Finding the truly important fields in a database with a huge number of variables can be a challenging task for the data scientist. This is where Principal Component Analysis (PCA) comes into the picture: it finds the core components of the data. It was invented more than 100 years ago by Karl Pearson, and it has been widely used in diverse fields since then.
The objective of PCA is to represent the data in a more meaningful structure with the help of an orthogonal transformation. This linear transformation is intended to reveal the internal structure of the dataset by re-expressing it in a new basis of the vector space, chosen so that it best explains the variance of the data. In plain English, this simply means that we compute new, uncorrelated variables from the original data, ordered so that the first one accounts for the largest share of the original variance and each later one accounts for progressively less.
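To make the idea of "new variables in decreasing order of variance" concrete, here is a minimal sketch in Python with NumPy. It assumes one common way of computing PCA, eigendecomposition of the sample covariance matrix. The function name `pca_eig` and the toy data are illustrative only and do not come from the text.

```python
import numpy as np

def pca_eig(X):
    """Illustrative PCA via eigendecomposition of the sample covariance matrix."""
    # Center each column so that variances are measured around the mean.
    Xc = X - X.mean(axis=0)
    # Covariance matrix of the variables (columns).
    cov = np.cov(Xc, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so that the component with the largest variance comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Project the centered data onto the new orthogonal basis.
    scores = Xc @ eigvecs
    return scores, eigvals, eigvecs

# Toy data (an assumption for this example): two variables driven by a
# shared latent factor, plus one independent noise variable.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(200, 1)),
    2 * latent + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

scores, variances, components = pca_eig(X)
# Share of the total variance carried by each new variable, largest first.
explained_ratio = variances / variances.sum()
```

In this sketch the columns of `scores` are the new variables, `variances` holds their variances in decreasing order, and the columns of `components` form the new orthogonal basis. The total variance is unchanged by the transformation; it is only redistributed across the new variables.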
This can be done either by eigendecomposition of the covariance or correlation matrix (the so-called R-mode PCA)...