What this book covers
Chapter 1, Getting Started with Automated Machine Learning on AWS, provides an overview of what the ML pipeline/process looks like and highlights the typical challenges you will face when building the pipeline. The main challenge it highlights is the interactive nature of the process, and it explains why automation is crucial to a successful process. The chapter then introduces the concept of AutoML and shows how it can alleviate these challenges.
Chapter 2, Automating Machine Learning Model Development Using SageMaker Autopilot, provides an overview of what SageMaker Autopilot is and how it can be useful in automating the ML process. Using an example use case (ACME Fishing Logistics), the chapter shows you how to apply SageMaker Autopilot in practice. It does this by walking you through each step of the process and comparing it to the model framing example to highlight the benefits of process automation.
Chapter 3, Automating Complicated Model Development with AutoGluon, provides you with an overview of what AutoGluon is, how it differs from SageMaker Autopilot, and the value it adds for use cases that involve deep learning models that make use of text, image, and tabular data. It further elaborates on AutoGluon's capabilities for process automation by walking you through the hands-on ACME Fishing Logistics example and a deep learning-based model for computer vision.
Chapter 4, Continuous Integration and Continuous Delivery (CI/CD) for Machine Learning, introduces you to the concept of continuous integration and continuous delivery (CI/CD) and how it can be applied to an ML use case. The chapter accomplishes this by introducing DevOps culture and highlighting how the DevOps process can evolve into an MLOps process. The chapter also focuses on how the various CI/CD services within AWS can be applied to the use case, introducing you to the Cloud Development Kit (CDK) and the Cloud9 development environment. Finally, the chapter shows you, in practice, how to set up the development workspace, install and configure the CDK, set up the artifact repositories, and start codifying the primary artifacts that will be leveraged by the CI/CD pipeline.
Chapter 5, Continuous Deployment of a Production ML Model, introduces you to the typical tasks performed by the ML practitioner, within the context of the deployed CI/CD pipeline and DevOps culture. The chapter will walk you through creating the model assets, which trigger the pipeline execution, and show you how to manage and monitor the progress.
Chapter 6, Automating the Machine Learning Process Using AWS Step Functions, highlights how the CI/CD process can be further optimized, by including the ML practitioner in the majority of the pipeline build process. This chapter shows how this can be done by introducing AWS Step Functions and the Data Science SDK for Step Functions. It will then walk you through how to integrate the Data Science SDK into the CI/CD pipeline process.
Chapter 7, Building the ML Workflow Using AWS Step Functions, elaborates on the role and tasks of the ML practitioner, within the context of further optimizing the CI/CD pipeline, by walking you through how to build the codified ML workflow, perform integration testing on the workflow, and deploy the ML model into production, using the workflow.
Chapter 8, Automating the Machine Learning Process Using Apache Airflow, introduces you to a data-centric workflow, why its application to the ML process is important, and the team members normally responsible for executing this part of the process. The chapter elaborates on the common tools used to perform this function, namely Apache Airflow and Amazon Managed Workflows for Apache Airflow (MWAA). The chapter will then walk you through how to build a managed Airflow environment.
Chapter 9, Building the ML Workflow Using Amazon Managed Workflows for Apache Airflow, leverages the environment created in the previous chapter and focuses on the role and tasks that the ML practitioner performs, within the context of further optimizing the CI/CD pipeline. The chapter accomplishes this by walking you through how to build the codified ML workflow, perform integration testing on the workflow, and deploy the ML model into production using the workflow running on the Amazon Managed Workflows for Apache Airflow (MWAA) environment.
Chapter 10, An Introduction to the Machine Learning Software Development Life Cycle (MLSDLC), introduces you to the MLSDLC methodology and explains why adopting this methodology provides a holistic solution for automating the entirety of the ML-based application. The chapter highlights the key success criterion for an MLSDLC implementation: a cross-functional, agile team. It illustrates this criterion by walking through each of the team member roles, how each role interacts with the other team members, and how to build the codified artifacts that each role is responsible for.
Chapter 11, Continuous Integration, Deployment, and Training for the MLSDLC, walks through the process of creating a self-mutating CI/CD pipeline using the CDK, from the perspective of the platform engineering team. The chapter will show you how to take the various cross-functional teams' artifacts and combine them into an automated process for continuous integration of both the ACME Fishing Logistics application and the ML model in a development and QA environment. The chapter will also highlight how to include automated integration and QA test procedures for the web application, plus ML model inferences, in the overall MLSDLC workflow. The chapter will then show you how to take the application from the test environment into the production environment, to produce the production version of the overall ML application. The last part of the chapter focuses on the various tasks and procedures from the perspective of the data engineering team. It closes the loop on the MLSDLC process by walking you through how to apply continuous training to the pipeline, based on new data and the lessons learned from Chapter 8, Automating the Machine Learning Process Using Apache Airflow.