Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Amazon SageMaker Best Practices
Amazon SageMaker Best Practices

Amazon SageMaker Best Practices : Proven tips and tricks to build successful machine learning solutions on Amazon SageMaker

Arrow left icon
Profile Icon Muppala Profile Icon Eigenbrode Profile Icon DeFauw
Arrow right icon
NZ$71.99
Full star icon Full star icon Full star icon Full star icon Half star icon 4.9 (8 Ratings)
Paperback Sep 2021 348 pages 1st Edition
eBook
NZ$39.99 NZ$57.99
Paperback
NZ$71.99
Subscription
Free Trial
Arrow left icon
Profile Icon Muppala Profile Icon Eigenbrode Profile Icon DeFauw
Arrow right icon
NZ$71.99
Full star icon Full star icon Full star icon Full star icon Half star icon 4.9 (8 Ratings)
Paperback Sep 2021 348 pages 1st Edition
eBook
NZ$39.99 NZ$57.99
Paperback
NZ$71.99
Subscription
Free Trial
eBook
NZ$39.99 NZ$57.99
Paperback
NZ$71.99
Subscription
Free Trial

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
Table of content icon View table of contents Preview book icon Preview Book

Amazon SageMaker Best Practices

Chapter 1: Amazon SageMaker Overview

This chapter will provide a high-level overview of the Amazon SageMaker capabilities that map to the various phases of the machine learning (ML) process. This will set a foundation for the best practices discussion of using SageMaker capabilities in order to handle various data science challenges. 

In this chapter, we're going to cover the following main topics:

  • Preparing, building, training and tuning, deploying, and managing ML models
  • Discussion of data preparation capabilities
  • Feature tour of model-building capabilities
  • Feature tour of training and tuning capabilities
  • Feature tour of model management and deployment capabilities

Technical requirements

All notebooks with coding exercises will be available at the following GitHub link:

https://github.com/PacktPublishing/Amazon-SageMaker-Best-Practices

Preparing, building, training and tuning, deploying, and managing ML models

First, let's review the ML life cycle. By the end of this section, you should understand how SageMaker's capabilities map to the key phases of the ML life cycle. The following diagram shows you what the ML life cycle looks like:

Figure 1.1 – Machine learning life cycle

Figure 1.1 – Machine learning life cycle

As you can see, there are three phases of the ML life cycle at a high level:

  • In the Data Preparation phase, you collect and explore data, label a ground truth dataset, and prepare your features. Feature engineering, in turn, has several steps, including data normalization, encoding, and calculating embeddings, depending on the ML algorithm you choose.
  • In the Model Training phase, you build your model and tune it until you achieve a reasonable validation score that aligns with your business objective.
  • In the Operations phase, you test how well your model performs against real-world data, deploy it, and monitor how well it performs. We will cover model monitoring in more detail in Chapter 11, Monitoring Production Models with Amazon SageMaker Model Monitor and Clarify.

This diagram is purposely simplified; in reality, each phase may have multiple smaller steps, and the whole life cycle is iterative. You're never really done with ML; as you gather data on how your model performs in production, you'll likely try to improve it by collecting more data, changing your features, or tuning the model.

So how do SageMaker capabilities map to the ML life cycle? Before we answer that question, let's take a look at the SageMaker console (Figure 1.2):

Figure 1.2 – Navigation pane in the SageMaker console

Figure 1.2 – Navigation pane in the SageMaker console

The appearance of the console changes frequently and the preceding screenshot shows the current appearance of the console at the time of writing.

These capability groups align to the ML life cycle, shown as follows:

Figure 1.3 – Mapping of SageMaker capabilities to the ML life cycle

Figure 1.3 – Mapping of SageMaker capabilities to the ML life cycle

SageMaker Studio is not shown here, as it is an integrated workbench that provides a user interface for many SageMaker capabilities. The marketplace provides both data and algorithms that can be used across the life cycle.

Now that we have had a look at the console, let's dive deeper into the individual capabilities of SageMaker in each life cycle phase.

Discussion of data preparation capabilities

In this section, we'll dive into SageMaker's data preparation and feature engineering capabilities. By the end of this section, you should understand when to use SageMaker Ground Truth, Data Wrangler, Processing, Feature Store, and Clarify.

SageMaker Ground Truth

Obtaining labeled data for classification, regression, and other tasks is often the biggest barrier to ML projects, as many companies have a lot of data but have not explicitly labeled it according to business properties such as anomalous and high lifetime value. SageMaker Ground Truth helps you systematically label data by defining a labeling workflow and assigning labeling tasks to a human workforce.

Over time, Ground Truth can learn how to label data automatically, while still sending low-confidence results to humans for review. For advanced datasets such as 3D point clouds, which represent data points like shape coordinates, Ground Truth offers assistive labeling features, such as adding bounding boxes to the middle frames of a sequence once you label the start and end frames. The following diagram shows an example of labels applied to a dataset:

Figure 1.4 – SageMaker Ground Truth showing the labels applied to sentiment reviews

Figure 1.4 – SageMaker Ground Truth showing the labels applied to sentiment reviews

The data is sourced from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences). To counteract individual worker bias or error, a data object can be sent to multiple workers. In this example, we only have one worker, so the confidence score is not used.

Note that you can also use Ground Truth in other phases of the ML life cycle; for example, you may use it to check the labels generated by a production model.

SageMaker Data Wrangler

Data Wrangler helps you understand your data and perform feature engineering. Data Wrangler works with data stored in S3 (optionally accessed via Athena) and Redshift and performs typical visualization and transformations, such as correlation plots and categorical encoding. You can combine a series of transformations into a data flow and export that flow into an MLOps pipeline. The following screenshot shows an example of Data Wrangler information for a dataset:

Figure 1.5 – Data Wrangler displaying summary table information regarding a dataset

Figure 1.5 – Data Wrangler displaying summary table information regarding a dataset

You may also use Data Wrangler in the operations phase of the ML life cycle if you want to analyze the data coming into an ML model for production inference.

SageMaker Processing

SageMaker Processing jobs help you run data processing and feature engineering tasks on your datasets. By providing your own Docker image containing your code, or using a pre-built Spark or sklearn container, you can normalize and transform data to prepare your features. The following diagram shows the logical flow of a SageMaker Processing job:

Figure 1.6 – Conceptual overview of a Spark processing job. Spark jobs are particularly handy for processing larger datasets

Figure 1.6 – Conceptual overview of a Spark processing job. Spark jobs are particularly handy for processing larger datasets

You may also use processing jobs to evaluate the performance of ML models during the Model Training phase and to check data and model quality in the Model Operations phase.

SageMaker Feature Store

SageMaker Feature Store helps you organize and share your prepared features. Using a feature store improves quality and saves time by letting you reuse features rather than duplicate complex feature engineering code and computations that have already been done. Feature Store supports both batch and stream storage and retrieval. The following screenshot shows an example of feature group information:

Figure 1.7 – Feature Store showing a feature group with a set of related features

Figure 1.7 – Feature Store showing a feature group with a set of related features

Feature Store also helps during the Model Operations phase, as you can quickly look up complex feature vectors to help obtain real-time predictions.

SageMaker Clarify

SageMaker Clarify helps you understand model behavior and calculate bias metrics from your model. It checks for imbalance in the dataset, models that give different results based on certain attributes, and bias that appears due to data drift. It can also use leading explainability algorithms such as SHAP to help you explain individual predictions to get a sense of which features drive model behavior. The following figure shows an example of class imbalance scores for a dataset, where we have many more samples from the Gift Card category than the other categories:

Figure 1.8 – Clarify showing class imbalance scores in a dataset. Class imbalance can lead to biased results in an ML model

Figure 1.8 – Clarify showing class imbalance scores in a dataset. Class imbalance can lead to biased results in an ML model

Clarify can be used throughout the entire ML life cycle, but consider using it early in the life cycle to detect imbalanced data (datasets that have many examples of one class but few of another).

Now that we've introduced several SageMaker capabilities for data preparation, let's move on to model-building capabilities.

Feature tour of model-building capabilities

In this section, we'll dive into SageMaker's model-building capabilities. By the end of this section, you should understand when to use SageMaker Studio or SageMaker notebook instances, and how to choose between SageMaker's built-in algorithms, frameworks, and libraries, versus a bring your own (BYO) approach.

SageMaker Studio

SageMaker Studio is an integrated development environment (IDE) for ML. It brings together Jupyter notebooks, experiment management, and other tools into a unified user interface. You can easily share notebooks and notebook snapshots with other team members using Git or a shared filesystem. The following screenshot shows an example of one of SageMaker Studio's built-in visualizations:

Figure 1.9 – SageMaker Studio showing an experiment graph

Figure 1.9 – SageMaker Studio showing an experiment graph

SageMaker Studio can be used in all phases of the ML life cycle.

SageMaker notebook instances

If you prefer a more traditional Jupyter or JupyterLab experience, and you don't need the additional integrations and collaboration tools that Studio provides, you can use a regular SageMaker notebook instance. You choose the notebook instance compute capacity (that is, whether you want GPUs and how much storage you need), and SageMaker provisions the environment with the Jupyter Notebook and JupyterLab and several of the common ML frameworks and libraries installed.

The notebook instance also supports Docker in case you want to build and test containers with ML code locally. Best of all, the notebook instances come bundled with over 100 example notebooks. The following figure shows an example of the JupyterLab interface in a notebook:

Figure 1.10 – JupyterLab interface in a SageMaker notebook, showing a list of example notebooks

Figure 1.10 – JupyterLab interface in a SageMaker notebook, showing a list of example notebooks

Similar to SageMaker Studio, you can perform almost any part of the ML life cycle in a notebook instance.

SageMaker algorithms

SageMaker bundles open source and proprietary algorithms for many common ML use cases. These algorithms are a good starting point as they are tuned for performance, often supporting distributed training. The following table lists the SageMaker algorithms provided for different types of ML problems:

Figure 1.11 – SageMaker algorithms for various ML scenarios

Figure 1.11 – SageMaker algorithms for various ML scenarios

BYO algorithms and scripts

If you prefer to write your own training and inference code, you can work with a supported ML, graph, or RL framework, or bundle your own code into a Docker image. The BYO approach works well if you already have a library of model code, or if you need to build a model for a use case where a pre-built algorithm doesn't work well. Data scientists who use R like to use this approach. SageMaker supports the following frameworks:

  • Supported machine learning frameworks: XGBoost, sklearn
  • Supported deep learning frameworks: TensorFlow, PyTorch, MXNet, Chainer
  • Supported reinforcement learning frameworks: Ray RLLib, Coach
  • Supporting graph frameworks: Deep Graph Library

Now that we've introduced several SageMaker capabilities for model building, let's move on to training and tuning capabilities.

Feature tour of training and tuning capabilities

In this section, we'll dive into SageMaker's model training capabilities. By the end of this section, you should understand the basics of SageMaker training jobs, Autopilot and Hyperparameter Optimization (HPO), SageMaker Debugger, and SageMaker Experiments.

SageMaker training jobs

When you launch a model training job, SageMaker manages a series of steps for you. It launches one or more training instances, transfers training data from S3 or other supported storage systems to the instances, gets your training code from a Docker image repository, and starts the job. It monitors job progress and collects model artifacts and metrics from the job. The following screenshot shows an example of the hyperparameters tracked in a training job:

Figure 1.12 – SageMaker training jobs capture data such as input hyperparameter values

Figure 1.12 – SageMaker training jobs capture data such as input hyperparameter values

For larger training datasets, SageMaker manages distributed training. It will distribute subsets of data from storage to different training instances and manage the inter-node communication during the training job. The specifics vary based on the ML framework you're using, but note that most of the supported frameworks and several of the SageMaker built-in algorithms support distributed training.

Autopilot

If you are working with tabular data and solving regression or classification problems, you may find that you're performing a lot of repetitive work. You may have settled on XGBoost as a high-performing algorithm, always one-hot encoding for low-cardinality categorical features, normalizing numeric features, and so on. Autopilot performs many of these routine steps for you. In the following diagram, you can see the logical steps for an Autopilot job:

Figure 1.13 – Autopilot process

Figure 1.13 – Autopilot process

Autopilot saves you time by automating a lot of that routine process. It will run normal feature preparation tasks, try the three supported algorithms (Linear Learner, XGBoost, and a multilayer perceptron), and run hyperparameter tuning. Autopilot is a great place to start even if you end up needing to refine the output, as it generates a notebook with the code used for the entire process.

HPO

Some ML algorithms accept tens of hyperparameters as inputs. Tuning these by hand is time-consuming. Hyperparameter Optimization (HPO) simplifies that process by letting you define the hyperparameters you want to experiment with, the ranges to work over, and the metric you want to optimize. The following screenshot shows example output for an HPO job:

Figure 1.14 – Hyperparameter tuning jobs showing the objective metric of interest

Figure 1.14 – Hyperparameter tuning jobs showing the objective metric of interest

SageMaker Debugger

SageMaker Debugger helps you debug and, depending on your ML framework, profile your training jobs. While making training jobs run faster is always helpful, debugging is particularly useful if you are writing your own deep learning code with neural networks. Problems such as exploding gradients or mysterious NaN in your tensors are quite tough to track down, particularly in distributed training jobs. Debugger can effectively help you set breakpoints to see where things are going wrong. The following figure shows an example of the training and validation loss captured by SageMaker Debugger:

Figure 1.15 – Visualization of tensors captured by SageMaker Debugger

Figure 1.15 – Visualization of tensors captured by SageMaker Debugger

SageMaker Experiments

ML is an iterative process. When you're tuning a model, you may try several variations of hyperparameters, features, and even algorithms. It's important to track that work systematically so you can reproduce your results later on. That's where SageMaker Experiments comes into the picture. It helps you track, organize, and compare different trials. The following screenshot shows an example of SageMaker Experiments information:

Figure 1.16 – Trial results in SageMaker Experiments

Figure 1.16 – Trial results in SageMaker Experiments

Now that we've introduced several SageMaker capabilities for training and tuning, let's move on to model management and deployment capabilities.

Feature tour of model management and deployment capabilities

In this section, we'll dive into SageMaker's model hosting and monitoring capabilities. By the end of this section, you should understand the basics of SageMaker model endpoints along with the use of SageMaker Model Monitor. You'll also learn about deploying models on edge devices with SageMaker Edge Manager.

Model Monitor 

In some organizations, the gap between the ML team and the operations team causes real problems. Operations teams may not understand how to monitor an ML system in production, and ML teams don't always have deep operational expertise.

Model Monitor tries to solve that problem: it will instrument a model endpoint and collect data about the inputs to, and outputs from, an ML model used for inference. It can then analyze that data for data drift and other quality problems, as well as model accuracy or quality problems. The following diagram shows an example of model monitoring data captured for an inference endpoint:

Figure 1.17 – Model Monitor checking data quality on inference inputs

Figure 1.17 – Model Monitor checking data quality on inference inputs

Model endpoints

In some cases, you need to get a large number of inferences at once, in which case SageMaker provides a batch inference capability. But if you need to get inferences closer to real time, you can host your model in a SageMaker managed endpoint. SageMaker handles the deployment and scaling of your endpoints. Just as important, SageMaker lets you host multiple models in a single endpoint. That's useful both for A/B testing (that is, you can direct some percentage of traffic to a newer model) and for hosting multiple models that are tuned for different traffic segments.

You can also host an inference pipeline with multiple containers chained together, which is convenient if you need to preprocess inputs before performing inference. The following screenshot shows a model endpoint with two models serving different percentages of traffic:

Figure 1.18 – Multiple models configured behind a single inference endpoint

Figure 1.18 – Multiple models configured behind a single inference endpoint

Edge Manager

In some cases, you need to get model inferences on a device rather than from the cloud. You may need a lower response time that doesn't allow for an API call to the cloud, or you may have intermittent network connectivity. In video use cases, it's not always feasible to stream data to the cloud for inference. In such cases, Edge Manager and related tools such as SageMaker Neo help you compile models optimized to run on devices, deploy them, manage them, and get operational metrics back to the cloud. The following screenshot shows an example of a virtual device managed by Edge Manager:

Figure 1.19 – A device registered to an Edge Manager device fleet

Figure 1.19 – A device registered to an Edge Manager device fleet

Before we conclude with the summary, let's have a recap of the SageMaker capabilities provided for the following primary ML phases:

  • For data preparation:
Figure 1.20 – SageMaker capabilities for data preparation

Figure 1.20 – SageMaker capabilities for data preparation

  • For operations:
Figure 1.21 – SageMaker capabilities for operations

Figure 1.21 – SageMaker capabilities for operations

  • For model training:
Figure 1.22 – SageMaker capabilities for model training

Figure 1.22 – SageMaker capabilities for model training

With this, we have come to the end of this chapter.

Summary

In this chapter, you saw how to map SageMaker capabilities to different phases of the ML life cycle. You got a quick look at important SageMaker capabilities. In the next chapter, you will learn about the technical requirements and the use case that will be used throughout. You'll also learn about setting up managed data science environments for scaling model-building activities.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Learn best practices for all phases of building machine learning solutions - from data preparation to monitoring models in production
  • Automate end-to-end machine learning workflows with Amazon SageMaker and related AWS
  • Design, architect, and operate machine learning workloads in the AWS Cloud

Description

Amazon SageMaker is a fully managed AWS service that provides the ability to build, train, deploy, and monitor machine learning models. The book begins with a high-level overview of Amazon SageMaker capabilities that map to the various phases of the machine learning process to help set the right foundation. You'll learn efficient tactics to address data science challenges such as processing data at scale, data preparation, connecting to big data pipelines, identifying data bias, running A/B tests, and model explainability using Amazon SageMaker. As you advance, you'll understand how you can tackle the challenge of training at scale, including how to use large data sets while saving costs, monitoring training resources to identify bottlenecks, speeding up long training jobs, and tracking multiple models trained for a common goal. Moving ahead, you'll find out how you can integrate Amazon SageMaker with other AWS to build reliable, cost-optimized, and automated machine learning applications. In addition to this, you'll build ML pipelines integrated with MLOps principles and apply best practices to build secure and performant solutions. By the end of the book, you'll confidently be able to apply Amazon SageMaker's wide range of capabilities to the full spectrum of machine learning workflows.

Who is this book for?

This book is for expert data scientists responsible for building machine learning applications using Amazon SageMaker. Working knowledge of Amazon SageMaker, machine learning, deep learning, and experience using Jupyter Notebooks and Python is expected. Basic knowledge of AWS related to data, security, and monitoring will help you make the most of the book.

What you will learn

  • Perform data bias detection with AWS Data Wrangler and SageMaker Clarify
  • Speed up data processing with SageMaker Feature Store
  • Overcome labeling bias with SageMaker Ground Truth
  • Improve training time with the monitoring and profiling capabilities of SageMaker Debugger
  • Address the challenge of model deployment automation with CI/CD using the SageMaker model registry
  • Explore SageMaker Neo for model optimization
  • Implement data and model quality monitoring with Amazon Model Monitor
  • Improve training time and reduce costs with SageMaker data and model parallelism
Estimated delivery fee Deliver to New Zealand

Standard delivery 10 - 13 business days

NZ$20.95

Premium delivery 5 - 8 business days

NZ$74.95
(Includes tracking information)

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Sep 24, 2021
Length: 348 pages
Edition : 1st
Language : English
ISBN-13 : 9781801070522
Category :
Languages :

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
Estimated delivery fee Deliver to New Zealand

Standard delivery 10 - 13 business days

NZ$20.95

Premium delivery 5 - 8 business days

NZ$74.95
(Includes tracking information)

Product Details

Publication date : Sep 24, 2021
Length: 348 pages
Edition : 1st
Language : English
ISBN-13 : 9781801070522
Category :
Languages :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just NZ$7 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just NZ$7 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total NZ$ 233.97
Serverless Analytics with Amazon Athena
NZ$80.99
Machine Learning with Amazon SageMaker Cookbook
NZ$80.99
Amazon SageMaker Best Practices
NZ$71.99
Total NZ$ 233.97 Stars icon

Table of Contents

19 Chapters
Section 1: Processing Data at Scale Chevron down icon Chevron up icon
Chapter 1: Amazon SageMaker Overview Chevron down icon Chevron up icon
Chapter 2: Data Science Environments Chevron down icon Chevron up icon
Chapter 3: Data Labeling with Amazon SageMaker Ground Truth Chevron down icon Chevron up icon
Chapter 4: Data Preparation at Scale Using Amazon SageMaker Data Wrangler and Processing Chevron down icon Chevron up icon
Chapter 5: Centralized Feature Repository with Amazon SageMaker Feature Store Chevron down icon Chevron up icon
Section 2: Model Training Challenges Chevron down icon Chevron up icon
Chapter 6: Training and Tuning at Scale Chevron down icon Chevron up icon
Chapter 7: Profile Training Jobs with Amazon SageMaker Debugger Chevron down icon Chevron up icon
Section 3: Manage and Monitor Models Chevron down icon Chevron up icon
Chapter 8: Managing Models at Scale Using a Model Registry Chevron down icon Chevron up icon
Chapter 9: Updating Production Models Using Amazon SageMaker Endpoint Production Variants Chevron down icon Chevron up icon
Chapter 10: Optimizing Model Hosting and Inference Costs Chevron down icon Chevron up icon
Chapter 11: Monitoring Production Models with Amazon SageMaker Model Monitor and Clarify Chevron down icon Chevron up icon
Section 4: Automate and Operationalize Machine Learning Chevron down icon Chevron up icon
Chapter 12: Machine Learning Automated Workflows Chevron down icon Chevron up icon
Chapter 13:Well-Architected Machine Learning with Amazon SageMaker Chevron down icon Chevron up icon
Chapter 14: Managing SageMaker Features across Accounts Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.9
(8 Ratings)
5 star 87.5%
4 star 12.5%
3 star 0%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




Wesley Pasfield Sep 24, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Overall this is an excellent and practical overview of the full breadth of the SageMaker platform, I’d recommend this to anyone who is using SageMaker on AWS for full end-to-end machine learning workflows. There are an overwhelming number of products within the SageMaker platform, and this book does a great job of clearly explaining the benefit of each product, and placing it within the broader AWS ecosystem. Often with machine learning platforms and tutorials there’s an over-emphasis on the modeling portion of the process, but this book covers the full machine learning lifecycle including model management, versioning and deployment, and I really appreciated the book’s practical focus.I especially enjoyed the focus on integration with other AWS services, and notably sections with detail on CloudFormation orchestration (particularly the setting up Data Science environments chapter), and would have liked to see more CloudFormation focus in the deployment section. It also would have been nice to see the pros and cons of some of the new SageMaker features vs. their prior AWS service predecessors to better understand the value proposition (ex. Online Feature Store vs. DynamoDB). I thought the comparison of Model Registry options (SageMaker, AWS Custom, and Open Source) was especially strong and a helpful given the vast number of architecture decisions that are possible with AWS
Amazon Verified review Amazon
J. Wu Dec 18, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
As a Data Scientist I have done many ML projects in the Jupyter Notebook and I have always been intrigued about AWS SageMaker. The book gave a great overall view of what Safemaker does. This is what I learned from the book: that Sagemaker is not magic. For mature algorithms, it has prebuilt scripts. For more customized projects you need to write your own code. The main point of SafeMaker is to automate and scale. It could potentially automate labor intense labeling. It could facilitate ETL, the whole ML pipeline and scale to big data. Regarding how to do so, this book provides examples of good practices. Overall this book is definitely valuable for someone who is able to dive into AWS SageMaker.
Amazon Verified review Amazon
Amrita Oct 28, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
The book is a great addition to the documentation around AWS Sagemaker - from my personal experience, Sagemaker is a very powerful MLOps tool, but the official documentation/examples/tutorials are definitely too sparse on many details. This book would be a great resource for both beginners and advanced users. I think one thing the book could have covered in more detail is examples on how Sagemaker can be adapted to existing ML workflows - since that is not always obvious, as the book covers each Sagemaker feature separately.
Amazon Verified review Amazon
laura Feb 21, 2022
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This book contains a detailed overview of the full machine learning lifecycle. Inside you will learn how to process and prepare data, use big data pipelines, and run A/B tests. What is unique and great about the book is its use of example code files/cloudformation provided on GitHub, which allows the reader to follow each chapter and test the features described throughout the book.To get the most out of this book, the reader needs to have a working knowledge of Amazon SageMaker and experience with the AWS console.I particularly enjoyed the fact that the AWS Well Architected Framework is taken into account by including an overview of the AWS services necessary to stay within the well architected guidelines.
Amazon Verified review Amazon
Jim Oct 28, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This is a comprehensive book on applying Amazon SageMaker to the whole lifecycle of a machine learning process. I highly recommend it to anyone who uses Amazon SageMaker to develop machine learning applications, especially the systems with large-scale datasets. The book provides many advices with examples on how to address data science challenges with large-scale datasets such as data pre-processing and preparation at scale, model training at scale using big data pipelines. In addition, the book also shows how to integrate machine learning pipelines with MLOps principles to automate the development of a secure and performant machine learning application system.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is the delivery time and cost of print book? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge? Chevron down icon Chevron up icon

Customs duty are charges levied on goods when they cross international borders. It is a tax that is imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order? Chevron down icon Chevron up icon

The orders shipped to the countries that are listed under EU27 will not bear custom charges. They are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea:

A custom duty or localized taxes may be applicable on the shipment and would be charged by the recipient country outside of the EU27 which should be paid by the customer and these duties are not included in the shipping charges been charged on the order.

How do I know my custom duty charges? Chevron down icon Chevron up icon

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order? Chevron down icon Chevron up icon

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except for the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or material defect in book), Packt Publishing will not accept returns.

What is your returns and refunds policy? Chevron down icon Chevron up icon

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.

What tax is charged? Chevron down icon Chevron up icon

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use? Chevron down icon Chevron up icon

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal
What is the delivery time and cost of print books? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela