Challenging model complexity
Without prior knowledge of the problem domain, data scientists tend to include all possible features in their first attempt to create a classification, prediction, or regression model. After all, making assumptions is a poor and dangerous approach to reducing the search space. Such models may require hundreds or thousands of features, which adds complexity and significant computation cost to building and validating them.
Noise-filtering techniques reduce the sensitivity of a model to features associated with sporadic behavior. However, these noise-related features are not known prior to the training phase and therefore cannot simply be discarded up front. As a consequence, training a model becomes a cumbersome and time-consuming task.
Overfitting is another hurdle that arises from a large feature set: a training set of limited size does not allow you to create an accurate model that uses all the features.
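The overfitting risk can be illustrated with a minimal sketch (in Python with NumPy, which is an assumption since the surrounding text names no language or library): when a model has more features than training observations, ordinary least squares can fit pure noise perfectly, yet the fitted model generalizes no better than chance.

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 training observations but 100 features: far more parameters than data.
n_train, n_features = 20, 100
X_train = rng.normal(size=(n_train, n_features))
y_train = rng.normal(size=n_train)  # pure noise: there is no real signal

# Ordinary least squares (minimum-norm solution): with more features than
# samples, the model can interpolate the training labels exactly.
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
train_error = np.mean((X_train @ w - y_train) ** 2)

# Fresh data drawn from the same signal-free distribution.
X_test = rng.normal(size=(200, n_features))
y_test = rng.normal(size=200)
test_error = np.mean((X_test @ w - y_test) ** 2)

print(train_error)  # near zero: the noise was memorized
print(test_error)   # around the variance of y: no generalization at all
```

The near-zero training error paired with a large test error is the signature of overfitting that a reduced feature set is meant to prevent.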
There are three approaches to reduce the number of features...