Using the SageMaker training toolkit with scikit-learn
In this example, we're going to build a custom Python container with the SageMaker Training Toolkit. We'll use it to train a scikit-learn model on the Boston Housing dataset, using Script Mode and the SKLearn estimator.
We need three building blocks:
- The training script: Thanks to Script Mode, we can use exactly the same code as in the Scikit-Learn example from Chapter 7, Extending Machine Learning Services with Built-in Frameworks.
- We need a Dockerfile and Docker commands to build our custom container.
- We also need a SKLearn estimator configured to use our custom container.
Let's take care of the container:
- A Dockerfile can get quite complicated. No need for that here! We start from the official Python 3.7 image available on Docker Hub (https://hub.docker.com/_/python). We install scikit-learn, numpy, pandas, joblib, and the SageMaker training toolkit:
    FROM python:3.7
    RUN pip3 install...