Model serving in PyTorch
In this section, we will begin by building a simple PyTorch inference pipeline that can make predictions given some input data and the location of a previously trained and saved PyTorch model. We will then place this inference pipeline on a model server that can listen for incoming data requests and return predictions. Finally, we will advance from developing a model server to creating a model microservice using Docker.
Creating a PyTorch model inference pipeline
We will be working with the handwritten digits image classification CNN model that we built in Chapter 1, Overview of Deep Learning Using PyTorch, on the MNIST
dataset. Using this trained model, we will build an inference pipeline that can predict a digit between 0 and 9 for a given handwritten-digit input image.
For the process of building and training the model, please refer to the Training a neural network using PyTorch section of Chapter 1, Overview of Deep Learning Using PyTorch.
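Before wiring the pipeline into a server, it helps to see its core in isolation: restore the saved model and run a forward pass on a single image. The sketch below is a minimal, hypothetical version of such a pipeline; the `ConvNet` architecture, the checkpoint filename `convnet.pth`, and the helper names `load_model` and `predict` are illustrative assumptions, not the book's actual code, and the architecture you instantiate must match the one whose weights were saved.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-in for the Chapter 1 MNIST CNN. In practice this class
# must define exactly the same layers as the model that produced the checkpoint,
# or load_state_dict() will fail.
class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3)
        self.fc1 = nn.Linear(32 * 5 * 5, 64)
        self.fc2 = nn.Linear(64, 10)  # one logit per digit class, 0-9

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # 28x28 -> 26x26 -> 13x13
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # 13x13 -> 11x11 -> 5x5
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        return self.fc2(x)

def load_model(checkpoint_path):
    """Rebuild the architecture and restore the saved weights for inference."""
    model = ConvNet()
    model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    model.eval()  # disable dropout/batch-norm training behavior
    return model

def predict(model, image_tensor):
    """Return the predicted digit (0-9) for a (1, 1, 28, 28) input tensor."""
    with torch.no_grad():  # no gradients needed at inference time
        logits = model(image_tensor)
    return int(logits.argmax(dim=1).item())

if __name__ == "__main__":
    # Demo only: randomly initialized weights stand in for the trained checkpoint.
    torch.save(ConvNet().state_dict(), "convnet.pth")
    model = load_model("convnet.pth")
    digit = predict(model, torch.rand(1, 1, 28, 28))
    print(f"Predicted digit: {digit}")
```

Keeping `load_model` and `predict` as separate functions mirrors how the pipeline will later sit behind a server: the model is loaded once at startup, while `predict` runs per request.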