Chapter 11: Tuning Hyperparameters and Versioning Your Model
The journey of a data scientist is always an iterative one. A scalable, repeatable process lets you move smoothly through every phase of data cleaning and model discovery.
In this chapter, we will cover how to create a pipeline that combines many of the small steps we have learned throughout the book into a single, easier flow. We will then see how to use a grid search to find the hyperparameters that give the best-performing model among the candidates you try. Finally, we will show you how to save and version your models so that you can return to a previous model at any point in time. Together, these skills make your process more flexible and bring you closer to the end goal of a maintainable workflow.
Specifically, we will cover the following in this chapter:
- Creating a scikit-learn pipeline
- Finding optimal hyperparameters with GridSearchCV...
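To make these topics concrete before we dig in, here is a minimal sketch of how a pipeline, a grid search, and a saved model can fit together. It is not the example developed in this chapter: the Iris dataset, the `StandardScaler` and `LogisticRegression` steps, the values tried for `C`, and the `model_v1.joblib` file name are all assumptions chosen only for illustration.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data only; the chapter's own dataset may differ.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A pipeline chains preprocessing and the estimator into one object,
# so the scaler is refit on the training part of every CV fold.
pipeline = Pipeline(
    [
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)

# Parameters of a step are addressed as <step name>__<parameter name>.
# The candidate values here are arbitrary.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}

# Try every combination in the grid with 5-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Save the refit best pipeline under a version-tagged file name
# (a hypothetical naming scheme), then load it back.
model_dir = Path(tempfile.mkdtemp())
model_path = model_dir / "model_v1.joblib"
joblib.dump(search.best_estimator_, model_path)

restored = joblib.load(model_path)
test_accuracy = restored.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Because the grid search wraps the whole pipeline, the object written to disk contains both the fitted scaler and the fitted classifier, so a restored version can score raw, unscaled inputs directly.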