Amazon SageMaker Best Practices

You're reading from Amazon SageMaker Best Practices: Proven tips and tricks to build successful machine learning solutions on Amazon SageMaker

Product type Paperback

Published in Sep 2021

Publisher Packt

ISBN-13 9781801070522

Length 348 pages

Edition 1st Edition

Languages

Python

Tools

Amazon SimpleDB

Concepts

Machine Learning

Authors (3):

Randy DeFauw

Shelbee Eigenbrode

Sireesha Muppala

Table of Contents (20) Chapters

Preface

1. Section 1: Processing Data at Scale

2. Chapter 1: Amazon SageMaker Overview

3. Chapter 2: Data Science Environments

4. Chapter 3: Data Labeling with Amazon SageMaker Ground Truth

5. Chapter 4: Data Preparation at Scale Using Amazon SageMaker Data Wrangler and Processing

6. Chapter 5: Centralized Feature Repository with Amazon SageMaker Feature Store

7. Section 2: Model Training Challenges

8. Chapter 6: Training and Tuning at Scale

9. Chapter 7: Profile Training Jobs with Amazon SageMaker Debugger

10. Section 3: Manage and Monitor Models

11. Chapter 8: Managing Models at Scale Using a Model Registry

12. Chapter 9: Updating Production Models Using Amazon SageMaker Endpoint Production Variants

13. Chapter 10: Optimizing Model Hosting and Inference Costs

14. Chapter 11: Monitoring Production Models with Amazon SageMaker Model Monitor and Clarify

15. Section 4: Automate and Operationalize Machine Learning

16. Chapter 12: Machine Learning Automated Workflows

17. Chapter 13: Well-Architected Machine Learning with Amazon SageMaker

18. Chapter 14: Managing SageMaker Features across Accounts

19. Other Books You May Enjoy

Machine learning use case and dataset

Throughout this book, we use examples to demonstrate best practices that apply across the ML life cycle. To keep those examples consistent, we focus on a single ML use case and an open dataset related to it.

The primary use case we'll explore in this book is predicting air quality readings. Given a location (weather station) and date, we'll try to predict a value for a particular type of air quality measurement (for example, pm25 or o3). We'll treat this as a regression problem and explore XGBoost and neural network-based model approaches.
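To make the regression framing concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the book's code: the sample readings, the field names (`location`, `date`, `parameter`, `value`), and the helper functions are all hypothetical, and a per-station mean stands in as a simple baseline where the book would use XGBoost or a neural network.

```python
from datetime import date
from math import sqrt
from statistics import mean

# Hypothetical sample readings, made up for illustration only.
READINGS = [
    {"location": "station-a", "date": date(2021, 1, 4), "parameter": "pm25", "value": 10.0},
    {"location": "station-a", "date": date(2021, 1, 5), "parameter": "pm25", "value": 14.0},
    {"location": "station-b", "date": date(2021, 1, 4), "parameter": "pm25", "value": 30.0},
    {"location": "station-b", "date": date(2021, 1, 5), "parameter": "pm25", "value": 34.0},
    {"location": "station-a", "date": date(2021, 1, 4), "parameter": "o3", "value": 0.03},
]


def make_features(location, day):
    """Turn a (location, date) pair into a flat feature dict."""
    return {
        "location": location,
        "day_of_week": day.weekday(),  # Monday is 0
        "month": day.month,
        "day_of_year": day.timetuple().tm_yday,
    }


def fit_station_mean(rows):
    """Baseline regressor: predict the mean value observed at each station.

    Unseen stations fall back to the overall mean of the training rows.
    """
    by_station = {}
    for row in rows:
        by_station.setdefault(row["location"], []).append(row["value"])
    overall = mean(row["value"] for row in rows)
    station_means = {loc: mean(vals) for loc, vals in by_station.items()}
    return lambda features: station_means.get(features["location"], overall)


def rmse(model, rows):
    """Root mean squared error of the model's predictions on the given rows."""
    squared_errors = [
        (model(make_features(row["location"], row["date"])) - row["value"]) ** 2
        for row in rows
    ]
    return sqrt(mean(squared_errors))


# One model per measurement type: keep only the pm25 readings as the target.
train = [row for row in READINGS if row["parameter"] == "pm25"]
model = fit_station_mean(train)
prediction = model(make_features("station-a", date(2021, 1, 6)))
```

A baseline like this gives an error figure that any more capable model should beat. The date-derived features are ignored by the baseline but are the kind of inputs a tree-based or neural model could use.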

For this, we'll use data from OpenAQ (https://registry.opendata.aws/openaq/) that includes air quality data from public data sources. We will use two datasets: the realtime dataset (https://openaq-fetches.s3.amazonaws.com/index.html) and the realtime-parquet-gzipped dataset (https://openaq-fetches.s3.amazonaws.com/index.html), which includes daily...

Authors (3)

Sireesha Muppala, PhD is a Principal Enterprise Solutions Architect, AI/ML at Amazon Web Services (AWS). Sireesha holds a PhD in computer science and post-doctorate from the University of Colorado. She is a prolific content creator in the ML space with multiple journal articles, blogs, and public speaking engagements. Sireesha is a co-creator and instructor of the Practical Data Science specialization on Coursera. She is a co-director of Women In Big Data (WiBD), Denver chapter. Sireesha enjoys helping organizations design, architect, and implement ML solutions at scale.

Randy DeFauw is a Principal Solution Architect at AWS. He holds an MSEE from the University of Michigan, where his graduate thesis focused on computer vision for autonomous vehicles. He also holds an MBA from Colorado State University. Randy has held a variety of positions in the technology space, ranging from software engineering to product management. He entered the big data space in 2013 and continues to explore that area. He is actively working on projects in the ML space, including reinforcement learning. He has presented at numerous conferences, including GlueCon and Strata, published several blogs and white papers, and contributed many open source projects to GitHub.

Shelbee Eigenbrode is a Principal AI and ML Specialist Solutions Architect at AWS. She holds six AWS certifications and has been in technology for 23 years, spanning multiple industries, technologies, and roles. She is currently focusing on combining her DevOps and ML background to deliver and manage ML workloads at scale. With over 35 patents granted across various technology domains, she has a passion for continuous innovation and using data to drive business outcomes. Shelbee co-founded the Denver chapter of Women in Big Data.
