Managing continuous training
In Chapter 9, Building the ML Workflow Using Amazon Managed Workflow for Apache Airflow, we learned how Airflow can be used to create a data-centric ML process and train the Age Calculator model on new Abalone survey data. In Chapter 10, An Introduction to the Machine Learning Software Development Life Cycle (MLSDLC), we learned how the Data Team applied this technique to the ACME web application by codifying the acme-data-workflow Airflow DAG. The following diagram shows a graphical representation of the Airflow DAG:
As you can see, the Airflow DAG starts when new Abalone survey data is added to the S3 bucket. The survey data is then preprocessed to engineer the relevant training features, and these features are ingested into the Feature Store. Once ingestion completes, a release change of the MLSDLC process is triggered to automate the process of releasing a new...
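The stage ordering described above can be sketched in plain Python, without Airflow itself, to make the dependencies explicit. This is only an illustrative sketch: the task names and the execution_order helper are hypothetical and are not taken from the acme-data-workflow DAG, which may name and group its tasks differently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names (assumptions for illustration only).
# Each key maps to the set of tasks that must finish before it can start.
DEPENDENCIES = {
    "detect_new_survey_data": set(),                      # new data lands in S3
    "preprocess_survey_data": {"detect_new_survey_data"}, # engineer training features
    "ingest_features": {"preprocess_survey_data"},        # write to the Feature Store
    "trigger_release_change": {"ingest_features"},        # start the MLSDLC release
}


def execution_order(dependencies):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(DEPENDENCIES), start=1):
        print(f"{step}. {task}")
```

Because each stage has exactly one upstream stage, there is only one valid ordering, which is what makes the workflow a simple linear chain rather than a branching graph.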