Amazon SageMaker Best Practices

You're reading from Amazon SageMaker Best Practices: Proven tips and tricks to build successful machine learning solutions on Amazon SageMaker

Product type Paperback
Published in Sep 2021
Publisher Packt
ISBN-13 9781801070522
Length 348 pages
Edition 1st Edition
Authors (3):
Randy DeFauw
Shelbee Eigenbrode
Sireesha Muppala


Feature tour of training and tuning capabilities

In this section, we'll dive into SageMaker's model training capabilities. By the end of this section, you should understand the basics of SageMaker training jobs, Autopilot and Hyperparameter Optimization (HPO), SageMaker Debugger, and SageMaker Experiments.

SageMaker training jobs

When you launch a model training job, SageMaker manages a series of steps for you. It launches one or more training instances, transfers training data from S3 or other supported storage systems to the instances, gets your training code from a Docker image repository, and starts the job. It monitors job progress and collects model artifacts and metrics from the job. The following screenshot shows an example of the hyperparameters tracked in a training job:

Figure 1.12 – SageMaker training jobs capture data such as input hyperparameter values
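To make the managed steps above concrete, here is a minimal sketch of the information a training job request carries: where the training image lives, where the data comes from, where artifacts go, and which instances to launch. The field names follow the request shape of the low-level CreateTrainingJob API as exposed by boto3; the helper name build_training_job_request and every ARN, image URI, and bucket name are hypothetical placeholders. The sketch only assembles a dictionary and makes no AWS call.

```python
def build_training_job_request(job_name, image_uri, role_arn, train_s3_uri,
                               output_s3_uri, hyperparameters,
                               instance_type="ml.m5.xlarge", instance_count=1):
    """Assemble keyword arguments for a CreateTrainingJob call (no AWS call is made)."""
    return {
        "TrainingJobName": job_name,
        # Docker image that holds the training code.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Training data that SageMaker copies from S3 to the instances.
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        # Where model artifacts are written when the job finishes.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The API expects hyperparameter values as strings.
        "HyperParameters": {k: str(v) for k, v in hyperparameters.items()},
    }


# All identifiers below are placeholders, not real resources.
request = build_training_job_request(
    job_name="example-training-job",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    hyperparameters={"max_depth": 5, "eta": 0.2},
)
print(request["HyperParameters"])
```

With valid credentials and real resources, a dictionary like this could be passed to boto3.client("sagemaker").create_training_job(**request); the hyperparameters it contains are what the training job console view displays.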

For larger training datasets, SageMaker manages distributed training. It will distribute subsets of data from storage to different training instances and manage the inter-node communication during the training job. The specifics vary based on the ML framework you're using, but note that most of the supported frameworks and several of the SageMaker built-in algorithms support distributed training.
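As a rough illustration of what "distributing subsets of data" means, the sketch below splits a list of object keys across training instances in round-robin order. This is an assumption made for illustration only: the function shard_keys and the file names are hypothetical, and the exact assignment SageMaker uses is not described here.

```python
def shard_keys(object_keys, instance_count):
    """Assign object keys to instances in round-robin order (illustrative only)."""
    shards = [[] for _ in range(instance_count)]
    for index, key in enumerate(sorted(object_keys)):
        shards[index % instance_count].append(key)
    return shards


# Hypothetical dataset split into five files, trained on two instances.
keys = [f"train/part-{i:04d}.csv" for i in range(5)]
shards = shard_keys(keys, 2)
for instance_id, shard in enumerate(shards):
    print(instance_id, shard)
```

Each instance sees only its own subset, which is why the framework or algorithm must then coordinate across nodes during training.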

Autopilot

If you are working with tabular data and solving regression or classification problems, you may find that you're performing a lot of repetitive work. You may have settled on XGBoost as a high-performing algorithm, and you may always one-hot encode low-cardinality categorical features, normalize numeric features, and so on. Autopilot performs many of these routine steps for you. In the following diagram, you can see the logical steps for an Autopilot job:

Figure 1.13 – Autopilot process

Autopilot saves you time by automating much of that routine work. It runs standard feature preparation tasks, tries the three supported algorithms (Linear Learner, XGBoost, and a multilayer perceptron), and runs hyperparameter tuning. Autopilot is a great place to start even if you end up needing to refine the output, as it generates a notebook with the code used for the entire process.
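For readers who have not written these preparation steps by hand, the following sketch shows the kind of routine work in question: one-hot encoding a low-cardinality categorical column and scaling a numeric column. It is plain Python written for illustration, not the code Autopilot generates; the function names and sample values are hypothetical, and min-max scaling is just one of several ways to normalize.

```python
def one_hot(values):
    """Return the sorted category list and one indicator row per input value."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


categories, encoded = one_hot(["red", "blue", "red"])
scaled = min_max_scale([10, 20, 30])
print(categories, encoded, scaled)
```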

HPO

Some ML algorithms accept tens of hyperparameters as inputs. Tuning these by hand is time-consuming. Hyperparameter Optimization (HPO) simplifies that process by letting you define the hyperparameters you want to experiment with, the ranges to work over, and the metric you want to optimize. The following screenshot shows example output for an HPO job:

Figure 1.14 – Hyperparameter tuning jobs showing the objective metric of interest
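The three inputs described above (the hyperparameters, their ranges, and the objective metric) can be sketched with a small random search. This is a simplified stand-in, not SageMaker's tuning implementation: the function random_search, the toy objective, and the range values are all hypothetical, and a real tuning job would train a model for each trial instead of evaluating a formula.

```python
import random


def random_search(objective, ranges, max_jobs, seed=0):
    """Sample each hyperparameter uniformly from its range and keep the best trial."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_jobs):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in ranges.items()}
        trials.append((objective(params), params))
    best = max(trials, key=lambda trial: trial[0])
    return best, trials


def toy_validation_score(params):
    """Stand-in objective metric; higher is better, with a peak at eta=0.3, subsample=0.8."""
    return -((params["eta"] - 0.3) ** 2) - ((params["subsample"] - 0.8) ** 2)


ranges = {"eta": (0.01, 1.0), "subsample": (0.5, 1.0)}
best, trials = random_search(toy_validation_score, ranges, max_jobs=20, seed=42)
print("best score:", best[0], "with", best[1])
```

The list of trials plays the role of the tuning job output: one row per training job, each with its sampled hyperparameters and the resulting objective metric.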

SageMaker Debugger

SageMaker Debugger helps you debug and, depending on your ML framework, profile your training jobs. While making training jobs run faster is always helpful, debugging is particularly useful if you are writing your own deep learning code with neural networks. Problems such as exploding gradients or unexpected NaN values in your tensors are tough to track down, particularly in distributed training jobs. Debugger effectively lets you set breakpoints to see where things are going wrong. The following figure shows an example of the training and validation loss captured by SageMaker Debugger:

Figure 1.15 – Visualization of tensors captured by SageMaker Debugger
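To illustrate the kind of check that catches these problems, the sketch below scans a sequence of per-step gradient norms for non-finite values and for values above a threshold. It is a hand-written illustration of the idea, not Debugger's API: the function find_gradient_issues, the threshold, and the sample values are hypothetical.

```python
import math


def find_gradient_issues(gradient_norms, explode_threshold=1e3):
    """Return (step, issue) pairs for non-finite or very large gradient norms."""
    issues = []
    for step, norm in enumerate(gradient_norms):
        if math.isnan(norm) or math.isinf(norm):
            issues.append((step, "non-finite"))
        elif norm > explode_threshold:
            issues.append((step, "exploding"))
    return issues


# Hypothetical gradient norms recorded at four training steps.
issues = find_gradient_issues([0.5, 2.0, 5e4, float("nan")])
print(issues)
```

Knowing the first step at which a check fires narrows the search considerably, which is the "breakpoint" effect described above.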

SageMaker Experiments

ML is an iterative process. When you're tuning a model, you may try several variations of hyperparameters, features, and even algorithms. It's important to track that work systematically so you can reproduce your results later on. That's where SageMaker Experiments comes into the picture. It helps you track, organize, and compare different trials. The following screenshot shows an example of SageMaker Experiments information:

Figure 1.16 – Trial results in SageMaker Experiments
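The bookkeeping involved in tracking trials can be sketched with a small in-memory log. This is only an illustration of what "track, organize, and compare" means in practice; the class ExperimentLog, its methods, and the sample values are hypothetical and are not the SageMaker Experiments API.

```python
class ExperimentLog:
    """Minimal in-memory record of trials, their parameters, and their metrics."""

    def __init__(self, name):
        self.name = name
        self.trials = []

    def log_trial(self, trial_name, parameters, metrics):
        # Copy the inputs so later changes by the caller do not alter the record.
        self.trials.append({
            "trial": trial_name,
            "parameters": dict(parameters),
            "metrics": dict(metrics),
        })

    def best_trial(self, metric, maximize=True):
        """Return the trial with the best value for the given metric, or None."""
        candidates = [t for t in self.trials if metric in t["metrics"]]
        if not candidates:
            return None
        pick = max if maximize else min
        return pick(candidates, key=lambda t: t["metrics"][metric])


log = ExperimentLog("example-experiment")
log.log_trial("xgboost-depth-3", {"max_depth": 3}, {"validation:auc": 0.81})
log.log_trial("xgboost-depth-6", {"max_depth": 6}, {"validation:auc": 0.86})
best = log.best_trial("validation:auc")
print(best["trial"], best["parameters"])
```

Because each record keeps the parameters next to the metrics, you can see which settings produced the best result and rerun them later.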

Now that we've introduced several SageMaker capabilities for training and tuning, let's move on to model management and deployment capabilities.
