Feature engineering
According to a survey published by Forbes, data scientists spend around 80% of their time on data preparation:
Figure 4: Breakdown of time spent by data scientists (source: Forbes)
This statistic highlights the importance of data preparation and feature engineering in data science.
Just as judicious and systematic feature selection can make models faster and more performant by removing features, feature engineering can accomplish the same by adding new ones. This seems contradictory at first blush, but the features being added are not the ones that were removed during feature selection. They are features that might not have been in the initial dataset at all. You might have the most powerful and well-designed machine learning algorithm in the world, but if your input features are not relevant, you will never be able to produce useful results. Let's analyze a couple of simple examples to get...