Moving a Model to Production
Moving a model to production means making it available for consumption by an external party: we expose the model to the world and start serving predictions on real, unseen input.
A trained PyTorch model is not sufficient for deployment on its own. We need additional server components that carry requests from the real world to the PyTorch model and return its predictions. We need to know how to create an API (through which a user can interact with the model), wrap it as a self-contained application (so that it can be deployed on any computer), and ship it to the cloud, so that anybody with the required URL and credentials can interact with the model. All of these steps are necessary to move a model to production successfully.
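To make the first of these steps concrete, here is a minimal sketch of an API that sits between the outside world and a model. It uses only the Python standard library so that it runs anywhere. The `/predict` endpoint, the `features` field, and the `predict` function are all hypothetical names chosen for illustration, and `predict` is a stand-in that simply averages its inputs where a real service would call a trained PyTorch model.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen


def predict(features):
    """Stand-in for a trained model's forward pass (hypothetical).

    A real service would run the PyTorch model here; this placeholder
    just returns the mean of the input features.
    """
    return {"score": sum(features) / len(features)}


class PredictHandler(BaseHTTPRequestHandler):
    """Accepts POST /predict with a JSON body and replies with JSON."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = predict(payload["features"])
        except (ValueError, KeyError, TypeError, ZeroDivisionError):
            # Malformed JSON, missing key, wrong types, or empty input.
            self.send_error(400, "expected JSON with a features list")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real service would log requests


def start_server(port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(port, features):
    """Client side: send features to the API and return the parsed reply."""
    request = Request(
        f"http://127.0.0.1:{port}/predict",
        data=json.dumps({"features": features}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request) as response:
        return json.loads(response.read())
```

Calling `server = start_server()` and then `query(server.server_address[1], [1.0, 2.0, 3.0])` sends the input over HTTP and returns the model's reply as a dictionary. The sketch covers only the API step; packaging the application and shipping it to the cloud are separate concerns, and a production service would also need authentication and a more robust server than the standard library's.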
In addition, we will have to deal with constraints around the latency of predictions (the time taken to get a prediction) and the size of the model (when deploying on edge...