Using the SageMaker Training Toolkit with scikit-learn
In this example, we're going to build a custom Python container with the SageMaker Training Toolkit. We'll use it to train a scikit-learn model on the Boston Housing dataset, using script mode and the SKLearn estimator.
We need three building blocks:
- The training script. Since script mode will be available, we can use exactly the same code as in the scikit-learn example from Chapter 7, Extending Machine Learning Services Using Built-In Frameworks.
- A Dockerfile and Docker commands to build our custom container.
- An SKLearn estimator configured to use our custom container.
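To make the third building block more concrete, here is a minimal sketch of how an SKLearn estimator could be pointed at a custom container. It assumes version 2 of the SageMaker Python SDK, where the estimator accepts an image_uri parameter. The script name, instance type, and the helper functions build_estimator_args and create_estimator are hypothetical names chosen for illustration. The image URI and role are placeholders that you would replace with your own values.

```python
def build_estimator_args(image_uri, role,
                         entry_point="sklearn-boston-housing.py"):
    """Collect keyword arguments for sagemaker.sklearn.SKLearn.

    The entry point name and instance settings are illustrative
    assumptions, not values taken from the text.
    """
    return {
        "entry_point": entry_point,       # script-mode training script
        "image_uri": image_uri,           # our custom container in Amazon ECR
        "role": role,                     # IAM execution role for SageMaker
        "instance_count": 1,
        "instance_type": "ml.m5.large",   # assumed instance type
    }


def create_estimator(args):
    """Build the estimator (requires the sagemaker package)."""
    # Imported lazily so build_estimator_args can be used without the SDK.
    from sagemaker.sklearn import SKLearn
    return SKLearn(**args)


if __name__ == "__main__":
    args = build_estimator_args(
        image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>",
        role="<execution-role-arn>",
    )
    print(args)
    # With valid values and AWS credentials, training would then start with:
    # create_estimator(args).fit({"training": "<s3-uri-of-training-data>"})
```

Passing image_uri is what tells the estimator to run our own image instead of the built-in scikit-learn container. The rest of the configuration stays the same as for the built-in framework.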
Let's take care of the container:
- A Dockerfile can get quite complicated. No need for that here! We start from the official Python 3.7 image available on Docker Hub (https://hub.docker.com/_/python). We install scikit-learn, numpy, pandas, joblib, and the SageMaker Training Toolkit:
FROM python...