Automated Machine Learning on AWS

Fast-track the development of your production-ready machine learning applications the AWS way

Product type: Paperback
Published: Apr 2022
Publisher: Packt
ISBN-13: 9781801811828
Length: 420 pages
Edition: 1st Edition
Author: Trenton Potgieter
Table of Contents (18 chapters)

Preface
  1. Section 1: Fundamentals of the Automated Machine Learning Process and AutoML on AWS
  2. Chapter 1: Getting Started with Automated Machine Learning on AWS
  3. Chapter 2: Automating Machine Learning Model Development Using SageMaker Autopilot
  4. Chapter 3: Automating Complicated Model Development with AutoGluon
  5. Section 2: Automating the Machine Learning Process with Continuous Integration and Continuous Delivery (CI/CD)
  6. Chapter 4: Continuous Integration and Continuous Delivery (CI/CD) for Machine Learning
  7. Chapter 5: Continuous Deployment of a Production ML Model
  8. Section 3: Optimizing a Source Code-Centric Approach to Automated Machine Learning
  9. Chapter 6: Automating the Machine Learning Process Using AWS Step Functions
  10. Chapter 7: Building the ML Workflow Using AWS Step Functions
  11. Section 4: Optimizing a Data-Centric Approach to Automated Machine Learning
  12. Chapter 8: Automating the Machine Learning Process Using Apache Airflow
  13. Chapter 9: Building the ML Workflow Using Amazon Managed Workflows for Apache Airflow
  14. Section 5: Automating the End-to-End Production Application on AWS
  15. Chapter 10: An Introduction to the Machine Learning Software Development Life Cycle (MLSDLC)
  16. Chapter 11: Continuous Integration, Deployment, and Training for the MLSDLC
  17. Other Books You May Enjoy

Overview of the ML process

Unfortunately, there is no established how-to guide for performing ML, because every ML use case is unique and specific to the application that leverages the resultant ML model. Instead, there is a general process pattern that most data scientists, ML engineers, and ML practitioners follow. This process model is called the Cross-Industry Standard Process for Data Mining (CRISP-DM). While not everyone follows its steps verbatim, most production ML models have probably been built, in some shape or form, using the guardrails that the CRISP-DM methodology provides.

So, when we refer to the ML process, we are invariably referring to the overall methodology of building production-ready ML models using the guardrails from CRISP-DM.

The following diagram shows an overview of the CRISP-DM guidelines for creating a typical process that an ML practitioner might follow:

Figure 1.1 – Overview of a typical ML process

In a nutshell, the process starts with the ML practitioner being tasked with providing an ML model that addresses a specific business use case. The ML practitioner then finds, ingests, and analyzes an appropriate dataset that can be effectively leveraged to accomplish the goals of the ML project.

Once the data has been analyzed, the ML practitioner determines the most applicable modeling techniques that extract the most relevant information from the data to address the use case. These techniques include the following:

  1. Determining the most applicable ML algorithm
  2. Creating new aspects (engineering new features) of the data that can further improve the chosen model's overall effectiveness
  3. Separating the data into training and testing sets for model training and evaluation
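The second and third techniques above can be made concrete with a minimal sketch that uses only the Python standard library. The toy dataset, the `area` feature, and the helper names `add_area_feature` and `split_train_test` are hypothetical and exist only for illustration; they are not taken from the book, and a real project would typically rely on a dedicated ML library for this work.

```python
import random


def add_area_feature(rows):
    """Engineer a new feature: area = width * height (hypothetical example)."""
    return [{**row, "area": row["width"] * row["height"]} for row in rows]


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into training and testing sets."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy dataset of 10 rows.
rows = [
    {"width": w, "height": h, "price": 10 * w * h}
    for w in range(1, 6)
    for h in range(1, 3)
]
features = add_area_feature(rows)
train_rows, test_rows = split_train_test(features, test_fraction=0.2)
```

Fixing the seed is a deliberate choice here: it lets two experiments be compared on exactly the same split, so differences in the results come from the model rather than from the shuffle.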

The ML practitioner then codifies the algorithm's architecture and training/testing/evaluation routines. These routines are then executed to determine the best possible model parameters – ones that optimize the model to fit both the data and the business use case.
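As a rough illustration of such a routine, the sketch below trains and evaluates one candidate model per parameter value and keeps the best one. The choice of a k-nearest-neighbors regressor, the synthetic data, and the function names are all assumptions made for this example; the book does not prescribe this algorithm.

```python
def knn_predict(train, x, k):
    """Predict y for x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_squared_error(train, test, k):
    """Evaluate a k-NN model on the testing set."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
    return sum(errors) / len(errors)


def select_best_k(train, test, candidate_ks):
    """Evaluate every candidate k and return the one with the lowest error."""
    scores = {k: mean_squared_error(train, test, k) for k in candidate_ks}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Synthetic data following y = 2x (hypothetical).
train = [(x, 2 * x) for x in range(0, 20, 2)]
test = [(x, 2 * x) for x in range(1, 20, 4)]

best_k, scores = select_best_k(train, test, candidate_ks=[1, 2, 5])
```

In practice, a separate validation set is usually held out for this kind of selection so that the testing set remains an unbiased estimate of how the chosen model will perform.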

Finally, the best model is deployed into production to serve predictions that match the initial objective of the business use case.

As you can see, the overall process seems relatively straightforward and easy to follow. So, you may be wondering what all the fuss is about. For example, you may be asking yourself, "Where is the complexity in this process?" or "Why do you say that this is so hard to automate?"

While the process may look simple, executing it in practice is a very different matter. The following diagram provides a more realistic representation of what an ML practitioner may observe when developing an ML use case:

Figure 1.2 – Overview of a realistic ML process

As you can see, the overall process is far more convoluted than the typical representation shown in Figure 1.1. There are potentially multiple paths through the process, and each course of action is based on the results captured from the previous step. Additionally, a particular course of action may not yield the desired results, forcing the ML practitioner to reset or go back and choose a different set of criteria that will hopefully produce a better result.
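One way to picture this back-and-forth is as a small state machine in which the outcome of each step decides where the practitioner goes next. The sketch below is only an illustration: the stage names and the transitions between them are hypothetical and are not a transcription of Figure 1.2.

```python
# Hypothetical stages and transitions; the actual paths in Figure 1.2 may differ.
TRANSITIONS = {
    "ingest_data": {"ok": "analyze_data", "retry": "ingest_data"},
    "analyze_data": {"ok": "engineer_features", "retry": "ingest_data"},
    "engineer_features": {"ok": "train_model", "retry": "analyze_data"},
    "train_model": {"ok": "evaluate_model", "retry": "engineer_features"},
    "evaluate_model": {"ok": "deploy_model", "retry": "engineer_features"},
}


def walk_process(outcomes, start="ingest_data", final="deploy_model"):
    """Follow the transitions for a sequence of step outcomes and return the path taken."""
    path = [start]
    current = start
    for outcome in outcomes:
        if current == final:
            break
        # Each outcome ("ok" or "retry") selects the next stage.
        current = TRANSITIONS[current][outcome]
        path.append(current)
    return path


# Every step succeeds on the first attempt.
happy_path = walk_process(["ok"] * 5)

# The first evaluation falls short, so the practitioner goes back to feature engineering.
realistic_path = walk_process(["ok", "ok", "ok", "ok", "retry", "ok", "ok", "ok"])
```

Even a single disappointing evaluation lengthens the path from six stages to nine, which hints at why the realistic process is harder to automate than the linear one.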

So, now that we have provided a high-level overview of what the typical ML process should entail, let's examine some of the complexities and challenges that make the ML process difficult.
