In the previous chapter, we trained a basic machine learning (ML) model. However, most real-world scenarios require models to be accurate, which means that both the model and its features need to be improved and fine-tuned for the specific task. This process is usually long and iterative, and it relies on trial and error.
In this chapter, we will see how to improve and validate model quality while keeping track of every experiment along the way, including logging its metrics and parameters. In particular, we will cover the following topics:
- Understanding cross-validation and overfitting
- Adding features in order to improve models
- Wrapping models and transformations into pipelines
- Versioning our datasets and metrics using the dvc package