Understanding ML deployments and paradigms
Data science is not the same as data engineering. Data science is geared toward taking a business problem and converting it into data problems using scientific methods. We develop mathematical models and then optimize their performance. Data engineering is mainly concerned with the reliability of the data in the data lake. It focuses on the tools that make data pipelines scalable and maintainable while meeting service-level agreements (SLAs).
When we talk about ML deployments, we are talking about bridging the gap between data science and data engineering.
The following figure visualizes the entire process of ML deployment:
Figure 7.1 – Displaying the ML deployment process
On the right-hand side, we have the data science process, which is highly interactive and iterative. We first understand the business problem and discover the datasets that can add value to our analysis. Then, we build data pipelines...