Feature Engineering for Regression
Feature engineering is the process of transforming raw data into inputs, called features, that a model can use to make predictions. The goal is to create features that capture what matters for the outcome of interest. This process requires both data expertise and domain knowledge: you need to know what can be done with the data you have, and what is likely to be predictive of the outcome you're interested in.
Once the features are created, they need to be assessed. One way is to look for relationships between each feature and the outcome of interest. Alternatively, you can measure how much a feature changes a model's performance and use that to decide whether to include it. We will first look at how to transform data to create features, and then at how to clean the resulting features so that models are trained on high-quality data.
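As a rough illustration of the two assessment approaches, here is a minimal sketch on synthetic data. Everything in it is an assumption made for the example: the feature names (`useful`, `noise`, `other`), the way the outcome is generated, and the choice of in-sample R^2 from an ordinary least squares fit as the performance measure. In practice you would likely score on held-out data, since in-sample R^2 never decreases when a feature is added.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept; return in-sample R^2."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data (hypothetical): the outcome depends on `useful` and `other`,
# while `noise` is unrelated to it.
rng = np.random.default_rng(0)
n = 200
useful = rng.normal(size=n)
noise = rng.normal(size=n)
other = rng.normal(size=n)
y = 3.0 * useful + 0.5 * other + rng.normal(scale=0.5, size=n)

# Approach 1: look at the relationship between each feature and the outcome.
corr_useful = np.corrcoef(useful, y)[0, 1]
corr_noise = np.corrcoef(noise, y)[0, 1]

# Approach 2: compare model performance with and without each candidate feature.
base = r_squared(other.reshape(-1, 1), y)
gain_useful = r_squared(np.column_stack([other, useful]), y) - base
gain_noise = r_squared(np.column_stack([other, noise]), y) - base

print(f"correlation with outcome: useful={corr_useful:.2f}, noise={corr_noise:.2f}")
print(f"R^2 gain over baseline:   useful={gain_useful:.3f}, noise={gain_noise:.3f}")
```

Under these assumptions, both checks should point the same way: the informative feature correlates strongly with the outcome and noticeably improves the fit, while the unrelated one does neither.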
Feature Creation
In order to perform a regression, we first need data to be in a format that allows it. In many...