You're reading from Machine Learning Engineering on AWS: Build, scale, and secure machine learning systems and MLOps pipelines in production

Product type Paperback

Published in Oct 2022

Publisher Packt

ISBN-13 9781803247595

Length 530 pages

Edition 1st Edition

Tools

AWS

Concepts

Machine Learning

Author (1):

Joshua Arvin Lat

Deploying a pre-trained model to a serverless inference endpoint

In the initial chapters of this book, we’ve worked with several serverless services that allow us to manage and reduce costs. If you are wondering whether there’s a serverless option for deploying ML models in SageMaker, the answer is a sweet yes. When you are dealing with intermittent and unpredictable traffic, hosting your ML model on a serverless inference endpoint can be a more cost-effective option. Let’s say that we can tolerate cold starts (where a request takes longer to process after a period of inactivity) and we only expect a few requests per day. In that case, we can use a serverless inference endpoint instead of the real-time option. Real-time inference endpoints are best used when we can keep the endpoint busy, since we pay for the underlying instance whether or not it is serving requests. If you’re expecting your endpoint to be utilized most of the time, then the real-time option may be the better fit.
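To make the serverless option a bit more concrete, here is a minimal sketch of the request we might pass to the `create_endpoint_config` operation of the SageMaker API. What makes an endpoint serverless is the `ServerlessConfig` block inside the production variant, which takes the place of an instance type and instance count. The helper name `build_serverless_endpoint_config`, the resource names, and the default memory and concurrency values are assumptions made for illustration. The list of allowed memory sizes reflects the AWS documentation as we understand it and may change over time.

```python
# Memory sizes (in MB) accepted for serverless endpoints.
# Assumption: taken from the AWS documentation; verify before relying on it.
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def build_serverless_endpoint_config(config_name, model_name,
                                     memory_mb=2048, max_concurrency=5):
    """Build the keyword arguments for create_endpoint_config.

    Hypothetical helper: it only assembles a dictionary and does not
    call AWS, so it can be run without credentials.
    """
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")

    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # No InstanceType or InitialInstanceCount here:
                # ServerlessConfig replaces them for serverless endpoints.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


# Placeholder names; the model must already exist in SageMaker.
request = build_serverless_endpoint_config(
    config_name="demo-serverless-config",
    model_name="demo-pretrained-model",
)
print(request)
```

With AWS credentials configured and a SageMaker model already registered under the given name, the dictionary could be sent using `boto3.client("sagemaker").create_endpoint_config(**request)`, followed by a `create_endpoint` call that references the same configuration name. A small `MaxConcurrency` value suits the "few requests per day" scenario described above.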

...