Generating a feature set and training data
We will refactor some of the code previously developed in our local environment to generate features for training, and add it to the data pipeline of our MLflow project.
We will now create the feature_set_generation.py file, which will be responsible for generating our features and saving them in the training folder, where all the data is valid and ready to be used for ML training. You can view the contents of the file in the repository at https://github.com/PacktPublishing/Machine-Learning-Engineering-with-MLflow/blob/master/Chapter07/psystock-data-features-main/feature_set_generation.py:
- We need to import the following dependencies:
    import mlflow
    from datetime import date
    from dateutil.relativedelta import relativedelta
    import pprint
    import pandas as pd
    import pandas_datareader
    import pandas_datareader.data as web
    import numpy as np
- Before delving into the main component of the code, we'll now proceed to...
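To make the idea of a feature-generation step more concrete, the following is a minimal sketch of how a price series could be turned into a windowed feature set and written to a training folder. It is not the code from the repository: the function names build_feature_set and sliding_windows, the lag_N and target column names, the window size, and the temporary training/data.csv path are all assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def sliding_windows(values: np.ndarray, window: int) -> np.ndarray:
    """Return a 2-D array whose rows are consecutive windows of `values`."""
    return np.lib.stride_tricks.sliding_window_view(values, window)


def build_feature_set(prices: pd.Series, window: int) -> pd.DataFrame:
    """Build a hypothetical feature set from a price series.

    Each row holds `window` past up/down flags as features and the
    next flag as the target.
    """
    # 1 when the price rose relative to the previous row, else 0.
    # The first row has no previous price, so it is dropped.
    going_up = (prices.pct_change().dropna() > 0).astype(int).to_numpy()

    # Windows of length window + 1: the first `window` values are the
    # features and the last value is the target.
    windows = sliding_windows(going_up, window + 1)
    columns = [f"lag_{window - i}" for i in range(window)]
    features = pd.DataFrame(windows[:, :window], columns=columns)
    features["target"] = windows[:, window]
    return features


if __name__ == "__main__":
    # Tiny made-up price series, used only to exercise the functions.
    prices = pd.Series([100.0, 101.0, 100.0, 102.0, 103.0, 101.0], name="close")
    feature_set = build_feature_set(prices, window=3)

    # Assumed output location: a "training" folder inside a temporary
    # directory, so the sketch does not touch any real project data.
    with tempfile.TemporaryDirectory() as tmp:
        training_dir = Path(tmp) / "training"
        training_dir.mkdir()
        output_path = training_dir / "data.csv"
        feature_set.to_csv(output_path, index=False)
        print(pd.read_csv(output_path))
```

In a real pipeline step, the input would be read from the data produced by the previous stage rather than built inline, and the output path would point at the project's own training folder.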