Operationalizing your ML models
Once a model has been validated and is used regularly to run predictions, it should be operationalized. Operationalizing removes the manual work of retraining and helps ensure that the model stays accurate as the data distribution changes over time, a phenomenon known as data drift. When data drift occurs, you need to retrain the model on an updated training set.
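As a rough illustration of how data drift might be detected, the following Python sketch compares a feature's distribution in the original training data with its distribution in recent data, using the Population Stability Index (PSI). PSI is one common drift measure and is an assumption here, not a method this chapter prescribes. The function names, the 10-bucket split, and the 0.2 threshold (a frequently quoted rule of thumb) are likewise illustrative choices.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float], recent: Sequence[float], bins: int = 10
) -> float:
    """Return the PSI between a baseline sample and a recent sample of one feature."""
    lo, hi = min(baseline), max(baseline)
    # Equal-width buckets over the baseline range; fall back to 1.0 if all values are equal.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range are clamped into the first or last bucket.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps the logarithm defined when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    expected = bucket_shares(baseline)
    actual = bucket_shares(recent)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def needs_retraining(
    baseline: Sequence[float], recent: Sequence[float], threshold: float = 0.2
) -> bool:
    """Flag retraining when PSI exceeds the (assumed) threshold."""
    return population_stability_index(baseline, recent) > threshold


# Hypothetical feature values: the recent sample is shifted upward by 50.
training_values = [float(v) for v in range(100)]
recent_values = [v + 50.0 for v in training_values]
print(needs_retraining(training_values, recent_values))
```

In practice, a check like this would run on a schedule against each important input feature, and a flagged result would trigger the retraining process.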
In the following sections, we will perform a simple model retraining and then show how to create a version from an existing model.
Model retraining process without versioning
To walk through the retraining process, we will use one of our previously used models.
In Chapter 7, we discussed different regression models, so let’s use the chapter7_regressionmodel.predict_ticket_price_auto model. This model solved a multi-input regression problem, and SageMaker Autopilot chose the XGBoost algorithm.
Let’s assume this model...