Chapter 8: Automating the Machine Learning Process Using Apache Airflow
When building an ML model, there is a fundamental principle that every ML practitioner is aware of: an ML model is only as robust as the data on which it was trained. In the previous four chapters, we primarily focused on automating the ML process using a source code-centric mechanism. In other words, we applied a DevOps methodology of Continuous Integration and Continuous Deployment to automate the ML process by supplying the model source code, the tuning parameters, and the ML workflow source code. Any change to these artifacts would trigger a change-release process in the CI/CD pipeline.
However, we also supplied the abalone data, downloaded from the UCI Machine Learning Repository, as a source artifact, and we never made any changes to it. Under a typical DevOps methodology, this data artifact is static and therefore will never trigger a change release of the CI/CD process.
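To make this concrete, the following is a minimal sketch, not the pipeline code used in the earlier chapters, of how a content-based change trigger behaves. The artifact paths and the release_manifest.json file name are hypothetical stand-ins for whatever the repository actually contains; a real CI/CD service would detect changes through commit history rather than explicit hashing, but the effect is the same: an artifact that never changes never starts a release.

```python
"""Sketch: why a static data artifact never triggers a change release."""
import hashlib
import json
from pathlib import Path

# Hypothetical artifact paths; adjust to match your repository layout.
TRACKED_ARTIFACTS = [
    "model/model.py",              # model source code
    "model/hyperparameters.json",  # tuning parameters
    "workflow/pipeline.py",        # ML workflow source code
    "data/abalone.data",           # static abalone dataset
]
MANIFEST = Path("release_manifest.json")  # digests recorded at the last release


def fingerprint(path: str) -> str:
    """Return the SHA-256 digest of an artifact's current contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_artifacts() -> list:
    """Return the artifacts whose contents differ from the last release."""
    previous = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    return [a for a in TRACKED_ARTIFACTS if fingerprint(a) != previous.get(a)]


if __name__ == "__main__":
    to_release = changed_artifacts()
    if to_release:
        print("Triggering a change release for:", to_release)
        # Record the new digests so the next run compares against them.
        MANIFEST.write_text(
            json.dumps({a: fingerprint(a) for a in TRACKED_ARTIFACTS}, indent=2)
        )
    else:
        # Because the abalone data file never changes, it never shows up
        # in the changed list and so never initiates a release on its own.
        print("No artifact changes detected; no release triggered.")
```

Editing the model source code, the tuning parameters, or the workflow code produces a new digest and therefore a new release, whereas the unchanged abalone file never does, which is exactly the gap this chapter sets out to address.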
Accordingly...