Serverless ML deployment with Lambda’s container image support
Now that we have the model.pth file, what do we do with it? The answer is simple: we will deploy this model as a serverless API using an AWS Lambda function and an Amazon API Gateway HTTP API, as shown in the following diagram:
Figure 3.11 – Serverless ML deployment with an API Gateway and AWS Lambda
As we can see, the HTTP API should be able to accept GET requests from “clients” such as mobile apps and other web servers that interface with end users. These requests are then passed to the AWS Lambda function as input event data. The Lambda function then loads the model from the model.pth file and uses it to compute the predicted y value from the x value in the input event data.
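The handler logic described above can be sketched roughly as follows. This is a minimal, runnable illustration, not the book's actual function: in the real deployment the parameters would come from torch.load("model.pth"), whereas here a plain linear stand-in with made-up weights keeps the sketch self-contained, and the event shape assumes API Gateway's query-string payload format.

```python
import json

# Placeholders for the learned parameters that model.pth would provide;
# in the real function these would be loaded once at module scope with
# torch.load("model.pth") so warm invocations reuse the model.
W, B = 2.0, 1.0

def predict(x):
    # Stand-in for calling the PyTorch model; a simple linear model y = Wx + B
    return W * x + B

def lambda_handler(event, context):
    # API Gateway HTTP APIs pass GET query parameters to Lambda
    # under "queryStringParameters" in the event payload
    x = float(event["queryStringParameters"]["x"])
    y = predict(x)
    return {
        "statusCode": 200,
        "body": json.dumps({"y": y}),
    }
```

For example, a GET request with `?x=3` arrives as `{"queryStringParameters": {"x": "3"}}` and, with the placeholder weights above, yields a body of `{"y": 7.0}`.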
Building the custom container image
Our AWS Lambda function code needs to utilize PyTorch functions and utilities to load the model. To get this setup working properly, we will build...
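As a rough sketch of what such a custom container image could look like, the Dockerfile below starts from one of AWS's public Lambda base images for Python and installs PyTorch on top of it. The file names (app.py, requirements.txt), the handler path, and the Python version tag are assumptions for illustration, not the book's exact setup:

```dockerfile
# Hypothetical sketch: AWS-provided Lambda base image for Python
FROM public.ecr.aws/lambda/python:3.9

# Install PyTorch and any other dependencies listed in requirements.txt
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the function code and the serialized model into the task root
COPY app.py model.pth ${LAMBDA_TASK_ROOT}/

# Point Lambda at the handler function (module.function)
CMD ["app.lambda_handler"]
```

The image would then be built and pushed to Amazon ECR so that the Lambda function can be created from it.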