Dimension reduction
Without prior knowledge of the data domain, data scientists include all possible features in their first attempt to create a classification, prediction, or regression model. After all, making assumptions is a poor and dangerous approach to reduce the search space. It is not uncommon for a model to use hundreds of features, adding complexity and significant computation costs to build and validate the model.
Noise-filtering techniques reduce the sensitivity of the model to features that are associated with sporadic behavior. However, these noise-related features are not known prior to the training phase, and therefore, cannot be discarded. As a consequence, training of the model becomes a very cumbersome and time-consuming task.
Overfitting is another hurdle that can arise from a large feature set. A training set of limited size does not allow you to create a model with a large number of features.
Dimension reduction techniques alleviate these problems by detecting features...