Standardized code to train and evaluate machine learning models
Training a machine learning model has two main ingredients: the data and the model itself. To standardize the pipeline, we defined three configuration classes (FeatureConfig, MissingValueConfig, and ModelConfig) and a wrapper class (MLForecast) over scikit-learn-style estimators (those exposing .fit and .predict) to make the process smooth. Let’s look at each of them.
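As a rough orientation before the details, the following is a minimal sketch of how three configuration objects and a wrapper can be combined around a fit/predict estimator. Everything in it is a simplified, hypothetical stand-in written for illustration: the class names ending in Sketch, their attributes, the helper methods, and the toy MeanRegressor are assumptions and do not reproduce the actual classes in the src folder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

Row = Dict[str, Optional[float]]


@dataclass
class FeatureConfigSketch:
    """Hypothetical stand-in: records which columns are features and which is the target."""
    target: str
    continuous_features: List[str] = field(default_factory=list)

    def get_X(self, rows: List[Row]) -> List[List[Optional[float]]]:
        # Keep only the configured feature columns, in a fixed order.
        return [[row[col] for col in self.continuous_features] for row in rows]

    def get_y(self, rows: List[Row]) -> List[float]:
        return [row[self.target] for row in rows]


@dataclass
class MissingValueConfigSketch:
    """Hypothetical stand-in: a single constant used to fill missing values."""
    fill_value: float = 0.0

    def impute(self, X: List[List[Optional[float]]]) -> List[List[float]]:
        return [[self.fill_value if v is None else v for v in row] for row in X]


class MeanRegressor:
    """Toy estimator with a scikit-learn-style interface (fit returns self)."""

    def fit(self, X: List[List[float]], y: List[float]) -> "MeanRegressor":
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X: List[List[float]]) -> List[float]:
        # Ignores the features and always predicts the training mean.
        return [self.mean_ for _ in X]


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in: holds the estimator and a display name."""
    model: Any
    name: str = "model"


class MLForecastSketch:
    """Hypothetical wrapper: applies the configs, then delegates to the estimator."""

    def __init__(self, model_config, feature_config, missing_config):
        self.model_config = model_config
        self.feature_config = feature_config
        self.missing_config = missing_config

    def fit(self, rows: List[Row]) -> "MLForecastSketch":
        X = self.missing_config.impute(self.feature_config.get_X(rows))
        y = self.feature_config.get_y(rows)
        self.model_config.model.fit(X, y)
        return self

    def predict(self, rows: List[Row]) -> List[float]:
        X = self.missing_config.impute(self.feature_config.get_X(rows))
        return self.model_config.model.predict(X)


# Usage: two training rows (one with a missing feature), one row to predict.
train_rows = [{"lag_1": 1.0, "y": 2.0}, {"lag_1": None, "y": 4.0}]
forecaster = MLForecastSketch(
    model_config=ModelConfigSketch(model=MeanRegressor(), name="mean baseline"),
    feature_config=FeatureConfigSketch(target="y", continuous_features=["lag_1"]),
    missing_config=MissingValueConfigSketch(fill_value=0.0),
)
preds = forecaster.fit(train_rows).predict([{"lag_1": 3.0}])
```

The point of this arrangement is that swapping the estimator, the feature set, or the missing-value strategy means changing one configuration object rather than the training code itself.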
Notebook alert
To follow along with the code, use the 01-Forecasting with ML.ipynb notebook in the chapter08 folder and the code in the src folder.
FeatureConfig
FeatureConfig is a Python dataclass that defines a few key attributes and functions that are necessary while processing the data. For instance, continuous, categorical, and Boolean columns need separate kinds of preprocessing before being fed into the machine learning model. Let’s see what FeatureConfig holds:
date: A mandatory column that sets the...