Having built the projects and prototypes in this book, you have certainly noticed that feature engineering and feature selection are essential to every modern data science product, especially machine learning projects. According to research, over 50% of the time spent building a model goes to cleaning, processing, and selecting the data required to train it. It is your responsibility to design, represent, and select the features.
Most machine learning algorithms cannot work on raw data; they are not smart enough to do so. Feature engineering is therefore needed to transform raw data into a form that algorithms can understand and consume. Professor Andrew Ng once said:
"Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine...