Machine Learning with Amazon SageMaker Cookbook

You're reading from Machine Learning with Amazon SageMaker Cookbook: 80 proven recipes for data scientists and developers to perform machine learning experiments and deployments.

Product type: Paperback
Published in: Oct 2021
Publisher: Packt
ISBN-13: 9781800567030
Length: 762 pages
Edition: 1st Edition
Author: Joshua Arvin Lat
Table of Contents (11 chapters)

Preface
Chapter 1: Getting Started with Machine Learning Using Amazon SageMaker
Chapter 2: Building and Using Your Own Algorithm Container Image
Chapter 3: Using Machine Learning and Deep Learning Frameworks with Amazon SageMaker
Chapter 4: Preparing, Processing, and Analyzing the Data
Chapter 5: Effectively Managing Machine Learning Experiments
Chapter 6: Automated Machine Learning in Amazon SageMaker
Chapter 7: Working with SageMaker Feature Store, SageMaker Clarify, and SageMaker Model Monitor
Chapter 8: Solving NLP, Image Classification, and Time-Series Forecasting Problems with Built-in Algorithms
Chapter 9: Managing Machine Learning Workflows and Deployments
Other Books You May Enjoy

Building and testing the custom Python algorithm container image

In this recipe, we will prepare a Dockerfile for the custom Python container image. We will use the train and serve scripts that we prepared in the previous recipes. After that, we will run the docker build command to build the image before pushing it to an Amazon ECR repository.

Tip

Wait! What's a Dockerfile? It's a text document containing the directives (commands) used to prepare and build a container image. This container image then serves as the blueprint when running containers. Feel free to check out https://docs.docker.com/engine/reference/builder/ for more information on Dockerfiles.

Getting ready

Make sure you have completed the Preparing and testing the serve script in Python recipe.

How to do it…

The initial steps in this recipe focus on preparing a Dockerfile. Let's get started:

  1. Double-click the Dockerfile file in the file tree to open it in the Editor pane. Make sure that this is the same Dockerfile that's inside the ml-python directory:
    Figure 2.55 – Opening the Dockerfile inside the ml-python directory


    Here, we can see a Dockerfile inside the ml-python directory. Remember that we created an empty Dockerfile in the Setting up the Python and R experimentation environments recipe. Clicking it in the file tree should open an empty file in the Editor pane:

    Figure 2.56 – Empty Dockerfile in the Editor pane


    Here, we have an empty Dockerfile. In the next step, we will update this by adding three lines of code.

  2. Update Dockerfile with the following block of configuration code:
    FROM arvslat/amazon-sagemaker-cookbook-python-base:1
    COPY train /usr/local/bin/train
    COPY serve /usr/local/bin/serve

    Here, we are planning to build on top of an existing image called amazon-sagemaker-cookbook-python-base. This image already has a few prerequisites installed. These include the Flask, pandas, and Scikit-learn libraries so that you won't have to worry about getting the installation steps working properly in this recipe. For more details on this image, check out https://hub.docker.com/r/arvslat/amazon-sagemaker-cookbook-python-base:

    Figure 2.57 – Docker Hub page for the base image


    Here, we can see the Docker Hub page for the amazon-sagemaker-cookbook-python-base image.

    Tip

    You can access a working copy of this Dockerfile in the Machine Learning with Amazon SageMaker Cookbook GitHub repository: https://github.com/PacktPublishing/Machine-Learning-with-Amazon-SageMaker-Cookbook/blob/master/Chapter02/ml-python/serve.

    With the Dockerfile ready, we will proceed with using the Terminal until the end of this recipe:

  3. You can use a new Terminal tab or an existing one to run the next set of commands:
    Figure 2.58 – New Terminal


    Here, we can see how to create a new Terminal. Note that the Terminal pane is under the Editor pane in the AWS Cloud9 IDE.

  4. Navigate to the ml-python directory containing our Dockerfile:
    cd /home/ubuntu/environment/opt/ml-python
  5. Specify the image name and the tag number:
    IMAGE_NAME=chap02_python
    TAG=1
  6. Build the Docker container using the docker build command:
    docker build --no-cache -t $IMAGE_NAME:$TAG .

The docker build command makes use of what is written inside our Dockerfile: we start with the image specified in the FROM directive and then copy the train and serve files into the container image.

  7. Use the docker run command to test if the train script works:
    docker run --name pytrain --rm -v /opt/ml:/opt/ml $IMAGE_NAME:$TAG train

Let's quickly discuss the options used in this command. The --name flag assigns a name (pytrain) to the container, the --rm flag makes Docker clean up the container after it exits, and the -v flag mounts the /opt/ml directory from the host system to the /opt/ml directory of the container:

    Figure 2.59 – Result of the docker run command (train)


    Here, we can see the results after running the docker run command. It should show logs similar to what we had in the Preparing and testing the train script in Python recipe.

  8. Use the docker run command to test if the serve script works:
    docker run --name pyserve --rm -v /opt/ml:/opt/ml $IMAGE_NAME:$TAG serve

    After running this command, the Flask API server starts successfully. We should see logs similar to what we had in the Preparing and testing the serve script in Python recipe:

    Figure 2.60 – Result of the docker run command (serve)


    Here, we can see that the API is running on port 8080. In the base image we used, we added EXPOSE 8080 to allow us to access this port in the running container.
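As an aside, instead of looking up the container's bridge network IP as the next steps do, the container's port can also be published to the host with the -p flag. This is a hedged sketch of that alternative, reusing the image name and tag set in step 5, and not part of the book's workflow:

```shell
# Publish container port 8080 on host port 8080 so the API is reachable
# via localhost instead of the bridge network IP:
docker run --name pyserve --rm -p 8080:8080 -v /opt/ml:/opt/ml chap02_python:1 serve

# Then, in another terminal:
curl http://localhost:8080/ping
```

This works because the Flask server inside the container listens on all interfaces on port 8080.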

  9. Open a new Terminal tab:
    Figure 2.61 – New Terminal


As the API is already running in the first Terminal tab, we create a new one for the next set of commands.

  10. In the new Terminal tab, run the following command to get the IP address of the running Flask app:
    SERVE_IP=$(docker network inspect bridge | jq -r ".[0].Containers[].IPv4Address" | awk -F/ '{print $1}')
    echo $SERVE_IP

We should get an IP address similar to 172.17.0.2, although the exact value may differ.
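An alternative that avoids parsing the full bridge-network JSON is docker inspect with a format template. This is a sketch of that approach, assuming the container was started with the name pyserve as in step 8:

```shell
# Extract the container's IP address directly using a Go template:
SERVE_IP=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' pyserve)
echo $SERVE_IP
```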

  11. Next, test the ping endpoint URL using the curl command:
    curl http://$SERVE_IP:8080/ping

    We should get an OK after running this command.

  12. Finally, test the invocations endpoint URL using the curl command:
    curl -d "1" -X POST http://$SERVE_IP:8080/invocations

We should get a value close to 881.3428400857507 after invoking the invocations endpoint.

At this point, it is safe to say that the custom container image we have prepared in this recipe is ready. Now, let's see how this works!

How it works…

In this recipe, we built a custom container image using the Dockerfile configuration we specified. When you have a Dockerfile, the standard set of steps would be to use the docker build command to build the Docker image, authenticate with ECR to gain the necessary permissions, use the docker tag command to tag the image appropriately, and use the docker push command to push the Docker image to the ECR repository.
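The build, authenticate, tag, and push flow just described can be sketched in shell form. This is a minimal sketch, not the book's exact commands: the account ID (123456789012) and region (us-east-1) are placeholders, and it assumes the AWS CLI is configured with credentials and that an ECR repository named chap02_python already exists:

```shell
# Placeholder values -- substitute your own account ID and region
ACCOUNT_ID=123456789012
REGION=us-east-1
IMAGE_NAME=chap02_python
TAG=1
ECR_URI=$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com/$IMAGE_NAME

# Build the image from the Dockerfile in the current directory
docker build -t $IMAGE_NAME:$TAG .

# Authenticate the Docker CLI with ECR
aws ecr get-login-password --region $REGION \
    | docker login --username AWS --password-stdin "$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"

# Tag the local image with the repository URI and push it
docker tag $IMAGE_NAME:$TAG $ECR_URI:$TAG
docker push $ECR_URI:$TAG
```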

Let's discuss what we have inside our Dockerfile. If this is your first time encountering Dockerfiles, they are simply text files containing the commands used to build an image.

Using the arvslat/amazon-sagemaker-cookbook-python-base image as the base image allowed us to write a shorter Dockerfile that focuses only on copying the train and serve files into the container image. Behind the scenes, the flask, pandas, scikit-learn, and joblib packages, along with their prerequisites, are already pre-installed in this base image, so we will not run into installation issues when building the custom container image. Here is a quick look at the Dockerfile that was used to build this base image:

FROM ubuntu:18.04

RUN apt-get -y update
RUN apt-get install -y python3.6
RUN apt-get install -y --no-install-recommends python3-pip
RUN apt-get install -y python3-setuptools

# Make python and pip point to the Python 3 binaries
RUN ln -s /usr/bin/python3 /usr/bin/python && \
    ln -s /usr/bin/pip3 /usr/bin/pip

RUN pip install flask
RUN pip install pandas
RUN pip install scikit-learn
RUN pip install joblib

WORKDIR /usr/local/bin
EXPOSE 8080

In this Dockerfile, we can see that we are using ubuntu:18.04 as the base image. Note that we can use other base images as well, depending on the libraries and frameworks we want installed in the container image.

Once we have the container image built, the next step will be to test if the train and serve scripts will work inside the container once we use docker run. Getting the IP address of the running container may be the trickiest part, as shown in the following block of code:

SERVE_IP=$(docker network inspect bridge | jq -r ".[0].Containers[].IPv4Address" | awk -F/ '{print $1}')

We can divide this into the following parts:

  • docker network inspect bridge: This provides detailed information about the bridge network in JSON format. It should return an output with a structure similar to the following JSON value:
    [
        {
            ...
            "Containers": {
                "1b6cf4a4b8fc5ea5...": {
                    "Name": "pyserve",
                    "EndpointID": "ecc78fb63c1ad32f0...",
                    "MacAddress": "02:42:ac:11:00:02",
                    "IPv4Address": "172.17.0.2/16",
                    "IPv6Address": ""
                }
            },
            ...
        }
    ]
  • jq -r ".[0].Containers[].IPv4Address": This parses through the JSON response value from docker network inspect bridge. Piping this after the first command would yield an output similar to 172.17.0.2/16.
  • awk -F/ '{print $1}': This splits the result from the jq command using the / separator and returns the value before /. After getting the AA.BB.CC.DD/16 value from the previous command, we get AA.BB.CC.DD after using the awk command.
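The awk stage of this pipeline can be checked in isolation by feeding it a sample value in the same AA.BB.CC.DD/16 format that the jq stage produces:

```shell
# Strip the /16 CIDR suffix, keeping only the address before the slash:
echo "172.17.0.2/16" | awk -F/ '{print $1}'
# prints 172.17.0.2
```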

Once we have the IP address of the running container, we can ping the /ping and /invocations endpoints, similar to how we did in the Preparing and testing the serve script in Python recipe.

In the next recipes in this chapter, we will use this custom container image when we do training and deployment with the SageMaker Python SDK.
