You're reading from Automated Machine Learning on AWS Fast-track the development of your production-ready machine learning applications the AWS way

Product type Paperback

Published in Apr 2022

Publisher Packt

ISBN-13 9781801811828

Length 420 pages

Edition 1st Edition

Tools

AWS

Concepts

Machine Learning

Author (1):

Trenton Potgieter


Table of Contents (18) Chapters

Preface

1. Section 1: Fundamentals of the Automated Machine Learning Process and AutoML on AWS

2. Chapter 1: Getting Started with Automated Machine Learning on AWS

3. Chapter 2: Automating Machine Learning Model Development Using SageMaker Autopilot

4. Chapter 3: Automating Complicated Model Development with AutoGluon

5. Section 2: Automating the Machine Learning Process with Continuous Integration and Continuous Delivery (CI/CD)

6. Chapter 4: Continuous Integration and Continuous Delivery (CI/CD) for Machine Learning

7. Chapter 5: Continuous Deployment of a Production ML Model

8. Section 3: Optimizing a Source Code-Centric Approach to Automated Machine Learning

9. Chapter 6: Automating the Machine Learning Process Using AWS Step Functions

10. Chapter 7: Building the ML Workflow Using AWS Step Functions

11. Section 4: Optimizing a Data-Centric Approach to Automated Machine Learning

12. Chapter 8: Automating the Machine Learning Process Using Apache Airflow

13. Chapter 9: Building the ML Workflow Using Amazon Managed Workflows for Apache Airflow

14. Section 5: Automating the End-to-End Production Application on AWS

15. Chapter 10: An Introduction to the Machine Learning Software Development Life Cycle (MLSDLC)

16. Chapter 11: Continuous Integration, Deployment, and Training for the MLSDLC

17. Other Books You May Enjoy

Overview of the ML process

Unfortunately, there is no established how-to guide for performing ML. This is because every ML use case is unique and specific to the application that leverages the resultant ML model. Instead, most data scientists, ML engineers, and ML practitioners follow a general process pattern. This process model is called the Cross-Industry Standard Process for Data Mining (CRISP-DM), and while not everyone follows its steps verbatim, most production ML models have probably been built, in some shape or form, within the guardrails that the CRISP-DM methodology provides.

So, when we refer to the ML process, we are invariably referring to the overall methodology of building production-ready ML models using the guardrails from CRISP-DM.

The following diagram shows an overview of the CRISP-DM guidelines for creating a typical process that an ML practitioner might follow:

Figure 1.1 – Overview of a typical ML process

In a nutshell, the process starts with the ML practitioner being tasked with providing an ML model that addresses a specific business use case. The ML practitioner then finds, ingests, and analyzes an appropriate dataset that can be effectively leveraged to accomplish the goals of the ML project.

Once the data has been analyzed, the ML practitioner determines the most applicable modeling techniques that extract the most relevant information from the data to address the use case. These techniques include the following:

Determining the most applicable ML algorithm
Creating new aspects (engineering new features) of the data that can further improve the chosen model's overall effectiveness
Separating the data into training and testing sets for model training and evaluation
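
To make the last two techniques more concrete, the following is a minimal sketch that uses only the Python standard library. The toy dataset, the engineered area feature, and the helper names (add_area_feature and train_test_split) are assumptions made purely for illustration; they are not taken from the book's own code:

```python
import random


def add_area_feature(rows):
    """Return copies of the rows with an engineered 'area' feature added."""
    return [{**row, "area": row["length"] * row["width"]} for row in rows]


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into training and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy dataset: ten records with two raw columns and a label.
rows = [
    {"length": float(i), "width": float(i + 1), "label": i % 2}
    for i in range(1, 11)
]

featured = add_area_feature(rows)
train_rows, test_rows = train_test_split(featured)
```

Holding back the testing set before any training takes place is what allows the later evaluation to reflect how the model behaves on data it has not seen.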

The ML practitioner then codifies the algorithm's architecture and training/testing/evaluation routines. These routines are then executed to determine the best possible model parameters – ones that optimize the model to fit both the data and the business use case.
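
As a rough illustration of what such routines might look like, the following sketch searches a small set of candidate parameters for a deliberately simple threshold classifier and then evaluates the chosen parameter on held-out data. The model, the data, and the function names are all hypothetical and were chosen only to keep the example short:

```python
def predict(threshold, x):
    """Predict class 1 when the input reaches the threshold, otherwise class 0."""
    return 1 if x >= threshold else 0


def accuracy(threshold, samples):
    """Return the fraction of (x, label) samples that are predicted correctly."""
    correct = sum(1 for x, label in samples if predict(threshold, x) == label)
    return correct / len(samples)


def fit_threshold(train_samples, candidates):
    """Training routine: pick the candidate with the best training accuracy."""
    return max(candidates, key=lambda t: accuracy(t, train_samples))


# Hypothetical (x, label) pairs for training and testing.
train_samples = [(0.5, 0), (1.0, 0), (1.5, 0), (2.5, 1), (3.0, 1), (3.5, 1)]
test_samples = [(0.8, 0), (2.8, 1), (1.2, 0), (3.2, 1)]
candidates = [0.0, 1.0, 2.0, 3.0, 4.0]

best_threshold = fit_threshold(train_samples, candidates)
test_accuracy = accuracy(best_threshold, test_samples)
print(f"best threshold: {best_threshold}, test accuracy: {test_accuracy:.2f}")
```

Real projects replace the threshold with an actual learning algorithm and the candidate list with a larger search space, but the shape of the routine (train, evaluate, keep the best) stays the same.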

Finally, the best model is deployed into production to serve predictions that match the initial objective of the business use case.

As you can see, the overall process seems relatively straightforward and easy to follow. So, you may be wondering what all the fuss is about. For example, you may be asking yourself, "Where is the complexity in this process?" or "Why do you say that this is so hard to automate?"

While the process may look simple, the reality of executing it is vastly different. The following diagram provides a more realistic representation of what an ML practitioner may observe when developing an ML use case:

Figure 1.2 – Overview of a realistic ML process

As you can see, the overall process is far more convoluted than the typical representation shown in Figure 1.1. There are potentially multiple paths that can be taken through the process, and each course of action is based on the results captured from the previous step. Additionally, a particular course of action may not always yield the desired results, forcing the ML practitioner to reset or go back and choose a different set of criteria that will hopefully produce a better result.
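
One way to picture this back-and-forth is as a loop that keeps trying alternative courses of action until one of them meets the goal. The following sketch is only an assumption about how such a loop could be expressed; the candidate names, the scores, and the run_experiments helper are hypothetical:

```python
def run_experiments(candidates, evaluate, target_score):
    """Try each candidate in turn and stop at the first one that meets the target.

    Returns the chosen candidate (or None if none qualified) together with the
    history of every attempt, so that earlier results can inform later choices.
    """
    history = []
    for candidate in candidates:
        score = evaluate(candidate)
        history.append((candidate, score))
        if score >= target_score:
            return candidate, history
    return None, history


# Hypothetical evaluation scores for three different courses of action.
hypothetical_scores = {
    "baseline": 0.62,
    "more_features": 0.71,
    "different_algorithm": 0.86,
}

chosen, history = run_experiments(
    list(hypothetical_scores), hypothetical_scores.get, target_score=0.8
)
```

In practice, the list of candidates is rarely known up front. The practitioner usually decides what to try next based on the history of previous attempts, which is part of what makes the process hard to automate.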

So, now that we have provided a high-level overview of what the typical ML process should entail, let's examine some of the complexities and challenges that make the ML process difficult.
