Serving a PyTorch model using TorchServe
TorchServe, released in April 2020, is a dedicated PyTorch model-serving framework. With TorchServe, we can serve multiple models at the same time with low prediction latency and without writing much custom code. Furthermore, TorchServe offers features such as model versioning, metrics monitoring, and built-in data preprocessing and postprocessing.
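To make the preprocessing and postprocessing hooks concrete, here is a minimal sketch of a TorchServe custom handler for our digit classifier. The class name `MNISTDigitHandler` and the grayscale-image assumptions are illustrative placeholders, not part of the original text; the `BaseHandler` class and its `preprocess`/`postprocess` hooks are TorchServe's actual extension points.

```python
import io

import torch
from PIL import Image
from torchvision import transforms
from ts.torch_handler.base_handler import BaseHandler


class MNISTDigitHandler(BaseHandler):
    """Plugs into TorchServe's preprocess -> inference -> postprocess pipeline."""

    def preprocess(self, data):
        # TorchServe delivers each request as a dict; the raw payload sits
        # under "data" or "body" depending on how the client sent it.
        images = []
        for row in data:
            payload = row.get("data") or row.get("body")
            image = Image.open(io.BytesIO(payload)).convert("L")
            images.append(transforms.ToTensor()(image))
        # self.device is set by BaseHandler.initialize() at model load time.
        return torch.stack(images).to(self.device)

    def postprocess(self, output):
        # Return one predicted digit per request in the batch.
        return output.argmax(dim=1).tolist()
```

A handler like this is packaged together with the model weights via `torch-model-archiver`'s `--handler` flag, so the same serving endpoint handles decoding the incoming image and formatting the prediction.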
This clearly makes TorchServe a more advanced model-serving alternative to the model microservice we developed in the previous section. However, building a custom model microservice still proves to be a powerful solution for complicated machine learning pipelines (which are more common than we might think).
In this section, we will continue working with our handwritten digit classification model and demonstrate how to serve it using TorchServe. After reading this section, you should be able to get started with TorchServe and go on to utilize its full...