Introduction
The abundance of data available nowadays can be mind-boggling: datasets grow not only in the number of observations but also in the richness of the metadata collected for each one.
In this chapter, we will present techniques that allow you to extract the most important features from your data and use them in modeling. Many of these techniques replace the raw features with derived ones, such as principal components, each of which is a combination of several original variables. The drawback of modeling with principal components instead of raw features is that it becomes almost impossible to meaningfully explain the model's coefficients, that is, to understand what drives your predictions or classifications.
If your aim is to build a forecaster with the highest attainable accuracy, and understanding the drivers is not the focus of your project, the methods presented in this chapter might be of interest to you.
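To make the interpretability trade-off concrete, here is a minimal sketch of why a principal component is hard to explain. It is not one of the chapter's recipes. It assumes two made-up, correlated features (the names `income` and `spending` are hypothetical) and uses only the Python standard library, computing the first principal component from the closed-form eigendecomposition of the 2x2 covariance matrix.

```python
import math
import random


def first_principal_component(xs, ys):
    """Return (loadings, variance_share) of the first principal component
    of two features, via the closed-form eigendecomposition of their
    2x2 sample covariance matrix."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sample variances and covariance (the entries of the covariance matrix).
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

    # Largest eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    largest = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)

    # Matching eigenvector, normalized to unit length.
    if sxy == 0:
        vec = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    else:
        vx, vy = sxy, largest - sxx
        norm = math.hypot(vx, vy)
        vec = (vx / norm, vy / norm)

    # Share of the total variance captured by this single component.
    return vec, largest / (sxx + syy)


# Hypothetical data: 'spending' depends on 'income' plus some noise.
random.seed(0)
income = [random.gauss(0, 1) for _ in range(500)]
spending = [0.8 * x + random.gauss(0, 0.3) for x in income]

loadings, share = first_principal_component(income, spending)
print("loadings on (income, spending):", loadings)
print("share of variance explained:", share)
```

Both loadings come out clearly non-zero, so the single component mixes the two inputs. A model coefficient attached to that component therefore cannot be read as the effect of `income` or of `spending` alone, even though the component retains most of the variance in the data.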