Summary
In this chapter, we discussed how to prepare for an ML process. Starting from the business requirements, we need to understand the problem and determine whether ML is the best solution for it. We then define the ML problem, set up performance measurement, and identify the data to be used for ML modeling, making sure that we have a high-quality dataset.
Because data plays such an important role, we also discussed data preparation and feature engineering in this chapter. From data collection and construction to data transformation, feature selection, and feature synthesis, data pipelines prepare the dataset for ML model training. Mastering these data preparation and feature engineering skills gives us insight into the data and helps us in model development. In the next chapter, we will discuss the ML model development process, from model training and validation to model testing and deployment.