Managing a Model Training Workflow
In the previous chapter, you created a data science project and a workbench in OpenShift Data Science (ODS). In this chapter, you will learn how to build a model training pipeline. You will see how to version your data using partner software available in Red Hat OpenShift, and how to build automated pipelines that retrain your model as new data becomes available. You will use the Jupyter notebook that you configured in your workbench and write Python code to build a simple machine learning (ML) model.
Before we introduce the concept of model serving, it is important to understand how to manually embed a model into an application. This chapter covers the following sections:
- Configuring Pachyderm
- Versioning your data with Pachyderm
- Training a model using Red Hat ODS
- Building a model training pipeline