Data science process models
Applying data science involves much more than selecting a suitable machine learning algorithm and running it on the data. Machine learning is only a small part of a project; other parts include understanding the problem, collecting the data, testing the solution, and deploying it to production.
When working on any project, not just a data science one, it helps to break the work down into smaller, manageable pieces and complete them one by one. For data science, the best practices for doing this are captured in process models. Several such models exist, including CRISP-DM and OSEMN.
This chapter explains CRISP-DM. The alternative, Obtain, Scrub, Explore, Model, and iNterpret (OSEMN), is more suitable for data analysis tasks and addresses many important steps to a lesser extent.
CRISP-DM
Cross Industry Standard Process for Data Mining (CRISP-DM) is a process methodology for developing...