Engineering new columns
– Owen Zhang, Kaggle Winner
Many Kagglers and data scientists have confessed to spending considerable time on research and feature engineering. In this section, we will use pandas
to engineer new columns of data.
What is feature engineering?
Machine learning models are only as good as the data they train on. When the data is insufficient, building a robust machine learning model is impossible.
A more revealing question than whether the data is sufficient is whether it can be improved. When new columns are derived from existing columns, those new columns are said to be engineered.
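As a minimal sketch of what deriving new columns can look like in pandas, the example below builds three columns from a small DataFrame. The `rides` data and every column name in it are hypothetical, invented here purely for illustration; they do not come from any dataset used in this section.

```python
import pandas as pd

# Hypothetical ride data, invented purely for illustration.
rides = pd.DataFrame({
    "pickup": pd.to_datetime(["2021-03-01 08:15:00", "2021-03-06 23:40:00"]),
    "distance_km": [4.0, 12.0],
    "duration_min": [16.0, 24.0],
})

# Pull parts of the timestamp out into their own columns.
rides["hour"] = rides["pickup"].dt.hour
# dayofweek counts Monday as 0, so 5 and 6 are Saturday and Sunday.
rides["is_weekend"] = rides["pickup"].dt.dayofweek >= 5

# Combine two existing columns into a new one (average speed in km/h).
rides["speed_kmh"] = rides["distance_km"] * 60 / rides["duration_min"]

print(rides)
```

Each new column carries information that was already present in the original columns, but in a form a model may find easier to use. Whether any particular engineered column actually helps depends on the data and the model.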
Feature engineering is the process of developing new columns of data from the original columns. The question is not whether you should...