You're reading from Machine Learning Engineering on AWS: Build, scale, and secure machine learning systems and MLOps pipelines in production

Product type Paperback

Published in Oct 2022

Publisher Packt

ISBN-13 9781803247595

Length 530 pages

Edition 1st Edition

Author (1):

Joshua Arvin Lat

Getting started with serverless data management

Years ago, developers, data scientists, and ML engineers had to spend hours or even days setting up the infrastructure needed for data management and data engineering. If a large dataset stored in S3 needed to be analyzed, a team of data scientists and ML engineers performed the following sequence of steps:

1. Launch and configure a cluster of EC2 instances.
2. Copy the data from S3 to the volumes attached to the EC2 instances.
3. Perform queries on the data using one or more of the applications installed in the EC2 instances.
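
To make the manual workflow above more concrete, here is a minimal Python sketch of the first two steps. It is an illustration under stated assumptions, not the book's own code: the helper names (`build_user_data`, `launch_query_cluster`), the `/data` mount directory, and every argument value are hypothetical, and the sketch assumes the chosen AMI already has the AWS CLI and the query application installed. The `ec2_client` argument is expected to behave like a boto3 EC2 client, whose `run_instances` call is the only AWS API used here.

```python
from typing import List


def build_user_data(bucket: str, prefix: str, mount_dir: str = "/data") -> str:
    """Return a first-boot shell script that copies the dataset from S3.

    Assumes (hypothetically) that the AWS CLI is installed on the AMI and
    that the instance profile grants read access to the bucket.
    """
    return "\n".join(
        [
            "#!/bin/bash",
            f"mkdir -p {mount_dir}",
            # Step 2: copy the data from S3 to the volume attached to the instance.
            f"aws s3 sync s3://{bucket}/{prefix} {mount_dir}",
        ]
    )


def launch_query_cluster(
    ec2_client,
    *,
    image_id: str,
    instance_type: str,
    node_count: int,
    bucket: str,
    prefix: str,
    instance_profile: str,
) -> List[str]:
    """Step 1: launch a fixed-size group of EC2 instances and return their IDs."""
    response = ec2_client.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=node_count,
        MaxCount=node_count,
        UserData=build_user_data(bucket, prefix),
        IamInstanceProfile={"Name": instance_profile},
    )
    # Step 3 (running queries) would happen on the instances themselves,
    # using whichever query application is installed there.
    return [instance["InstanceId"] for instance in response["Instances"]]
```

With boto3 installed and credentials configured, the function could be called as `launch_query_cluster(boto3.client("ec2"), ...)`. Even in this compact form, the instances keep running (and accruing cost) until someone explicitly stops or terminates them, which is the pain point discussed next.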

One known challenge with this approach is that the provisioned resources may end up underutilized. If the schedule of the data query operations is unpredictable, it is also difficult to manage the uptime, cost, and compute specifications of the setup. In addition, system administrators and DevOps engineers need to spend time managing the security, stability, performance...
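
The underutilization problem can be made concrete with a little arithmetic. The following Python sketch is only an illustration: the function name `cluster_cost_summary` is hypothetical, and the hourly rate and usage hours in the usage note are made-up figures, not actual AWS pricing.

```python
def cluster_cost_summary(
    node_count: int,
    hourly_rate_per_node: float,
    hours_provisioned: float,
    hours_busy: float,
) -> dict:
    """Summarize how much of an always-on cluster's cost pays for idle time.

    All inputs are hypothetical planning figures supplied by the caller.
    """
    if hours_provisioned <= 0:
        raise ValueError("hours_provisioned must be positive")
    if not 0 <= hours_busy <= hours_provisioned:
        raise ValueError("hours_busy must be between 0 and hours_provisioned")

    cluster_hourly_rate = node_count * hourly_rate_per_node
    total_cost = cluster_hourly_rate * hours_provisioned
    # Cost incurred while the cluster is running but no queries are executing.
    idle_cost = cluster_hourly_rate * (hours_provisioned - hours_busy)
    return {
        "total_cost": total_cost,
        "idle_cost": idle_cost,
        "utilization": hours_busy / hours_provisioned,
    }


if __name__ == "__main__":
    # Hypothetical scenario: 4 nodes at $0.50/hour kept up for a 720-hour
    # month, but only queried for 36 hours in total.
    print(cluster_cost_summary(4, 0.50, 720, 36))
```

In this made-up scenario the cluster is busy 5% of the time, so $1,368 of the $1,440 monthly bill pays for idle capacity. Shutting the cluster down between queries reduces that waste, but it is hard to plan when the query schedule is unpredictable.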