Summary
In this chapter, we have explored the world of deploying trained PyTorch deep learning models in production systems. We began by building a local model inference pipeline that makes predictions with a pre-trained model in a few lines of Python code. We then reused the inference logic of this pipeline to build our own model server using Python's Flask library, and extended that server into a self-contained model microservice using Docker, one that can be deployed and scaled with a single command.
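For reference, the heart of such a Flask server fits in a few lines. The sketch below is a hedged reconstruction rather than the chapter's exact listing; the checkpoint path, image transforms, route name, and port number are illustrative assumptions.

```python
# Minimal sketch of a Flask model server (names and paths are illustrative).
import io

import torch
from flask import Flask, request, jsonify
from PIL import Image
from torchvision import transforms

app = Flask(__name__)

# Assumption: a full-model checkpoint saved as "convnet.pth" (hypothetical path).
model = torch.load("convnet.pth", map_location="cpu")
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

@app.route("/predict", methods=["POST"])
def predict():
    # The client POSTs an image file under the "file" form field.
    image = Image.open(io.BytesIO(request.files["file"].read())).convert("RGB")
    batch = preprocess(image).unsqueeze(0)  # add a batch dimension
    with torch.no_grad():
        logits = model(batch)
    return jsonify({"predicted_class": int(logits.argmax(dim=1).item())})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8890)
```

Wrapping this script in a Dockerfile that installs PyTorch and Flask yields the self-contained microservice described above, which can then be launched with a single `docker run -p 8890:8890 <image-name>` command (the image name being whatever you tag the build with).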
Next, we explored TorchServe, a recently developed, dedicated model-serving framework for PyTorch. We learned how to serve PyTorch models with it in a few lines of code and discussed the advanced capabilities it offers, such as model versioning and metrics monitoring. Thereafter, we elaborated on how to export PyTorch models.
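TorchServe can serve a model using its built-in handlers alone, but a custom handler gives control over pre- and post-processing. The snippet below is a minimal sketch of such a handler for an image-classification model; the class name, transforms, and request-field handling are illustrative assumptions, not the chapter's exact code.

```python
# Hedged sketch of a custom TorchServe handler (illustrative names throughout).
import io

import torch
from PIL import Image
from torchvision import transforms
from ts.torch_handler.base_handler import BaseHandler


class ConvNetHandler(BaseHandler):
    """BaseHandler.initialize() loads the serialized model and sets self.device."""

    image_processing = transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
    ])

    def preprocess(self, data):
        # TorchServe passes a list of requests; each carries raw bytes
        # under "data" or "body".
        images = []
        for row in data:
            payload = row.get("data") or row.get("body")
            image = Image.open(io.BytesIO(payload)).convert("RGB")
            images.append(self.image_processing(image))
        return torch.stack(images).to(self.device)

    def postprocess(self, inference_output):
        # Return one predicted class index per request in the batch.
        return inference_output.argmax(dim=1).tolist()
```

Once packaged into a `.mar` archive with `torch-model-archiver` and launched with `torchserve --start --model-store <dir> --models <name>=<archive>.mar`, the model is exposed over HTTP, and TorchServe's management and metrics APIs provide the versioning and monitoring capabilities mentioned above.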
We first learned the two different ways of doing so using TorchScript: tracing and scripting....
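The difference between the two export paths is easiest to see on a model with data-dependent control flow. The toy module below is purely illustrative, and the file names are placeholders.

```python
# Brief sketch contrasting TorchScript tracing and scripting (toy model).
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        # Data-dependent control flow: tracing records only the branch taken
        # for the example input, while scripting preserves the if/else.
        if x.sum() > 0:
            return self.fc(x)
        return self.fc(-x)


model = TinyNet().eval()
example_input = torch.randn(1, 4)

# Tracing: runs the model once with an example input and records the ops.
traced_model = torch.jit.trace(model, example_input)

# Scripting: compiles the Python source, keeping control flow intact.
scripted_model = torch.jit.script(model)

# Both can be serialized and later loaded without the Python class definition.
traced_model.save("tinynet_traced.pt")
scripted_model.save("tinynet_scripted.pt")
loaded = torch.jit.load("tinynet_scripted.pt")
print(loaded(example_input))
```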