Deploying multiple models behind a single inference endpoint
A SageMaker inference endpoint is a logical entity that fronts a load balancer and one or more instances running your inference container. Behind a single endpoint, you can deploy either multiple versions of the same model or entirely different models. In this section, we'll look at both use cases.
Multiple versions of the same model
A SageMaker endpoint lets you host multiple model versions, each serving a configurable percentage of incoming traffic. This capability supports common continuous integration (CI)/continuous delivery (CD) practices such as canary and blue/green deployments. While these strategies are similar, they serve slightly different purposes, as explained here:
- In a canary deployment, you route a small percentage of traffic to the new version of the model, which lets you test it on a subset of requests until you are satisfied that it is working well.
- In a blue/green deployment, you run the new version alongside the current one and switch all traffic to the new version at once, keeping the previous version in place so you can roll back quickly if problems appear.
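To make the traffic-splitting idea concrete, here is a minimal sketch of the `ProductionVariants` structure that the SageMaker `CreateEndpointConfig` API accepts; the model and variant names are illustrative, and boto3 calls are shown only in comments so the snippet stays self-contained.

```python
# Sketch: an endpoint configuration with two production variants set up for
# a canary deployment. Traffic is split in proportion to each variant's
# InitialVariantWeight. Model/variant names below are hypothetical.

production_variants = [
    {
        "VariantName": "current-version",
        "ModelName": "my-model-v1",       # hypothetical existing model
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 2,
        "InitialVariantWeight": 9.0,      # ~90% of traffic
    },
    {
        "VariantName": "canary-version",
        "ModelName": "my-model-v2",       # hypothetical new model version
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,      # ~10% of traffic
    },
]

def traffic_split(variants):
    """Each variant's traffic share is its weight divided by the sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}

print(traffic_split(production_variants))
# → {'current-version': 0.9, 'canary-version': 0.1}

# In real code, you would then pass this list to the SageMaker API, e.g.:
#   boto3.client("sagemaker").create_endpoint_config(
#       EndpointConfigName="my-endpoint-config",
#       ProductionVariants=production_variants,
#   )
```

Because weights are relative rather than absolute percentages, you can later shift traffic between the variants (for example, with `UpdateEndpointWeightsAndCapacities`) without redeploying the models.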