Deploying locally for batch and web service inference
For development and testing purposes, we usually need to deploy our model locally to verify it works as expected. Let's see how to do it for two scenarios: batch inference and web service inference.
Batch inference
For batch inference, follow these instructions:
- Make sure you have completed Chapter 7, Multi-Step Deep Learning Inference Pipeline. This will produce an MLflow pyfunc DL inference pipeline model URI that can be loaded using standard MLflow Python functions. The logged model can be uniquely located by the run_id and model name as follows (see the load-and-score sketch after this list):

  logged_model = 'runs:/37b5b4dd7bc04213a35db646520ec404/inference_pipeline_model'

  The model can also be identified by the model name and version number through the model registry as follows:

  logged_model = 'models:/inference_pipeline_model/6'
- Follow the instructions under the Batch inference at-scale using PySpark UDF function section...
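To sanity-check the loaded pipeline locally before scaling out, here is a minimal load-and-score sketch. It reuses the runs:/ URI shown above and assumes a single text input column; the actual input schema must match the inference pipeline you logged in Chapter 7.

```python
import mlflow.pyfunc
import pandas as pd

# Either URI form shown above works here; the runs:/ URI is used as an example.
logged_model = 'runs:/37b5b4dd7bc04213a35db646520ec404/inference_pipeline_model'

# Load the logged inference pipeline as a generic pyfunc model.
loaded_model = mlflow.pyfunc.load_model(logged_model)

# Score a small local batch. The 'text' column is an assumed input name;
# adjust it to the signature of the pipeline logged in Chapter 7.
batch = pd.DataFrame({'text': [
    'what a wonderful movie',
    'terrible plot and wooden acting',
]})
predictions = loaded_model.predict(batch)
print(predictions)
```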
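For the at-scale approach referenced in the last step, the same model URI can be wrapped as a PySpark UDF. The following is only a sketch: the input column name, the Spark session settings, and the result_type are assumptions that need to match your pipeline's signature and environment.

```python
import mlflow.pyfunc
from pyspark.sql import SparkSession
from pyspark.sql.functions import struct

spark = SparkSession.builder.appName('batch-inference').getOrCreate()

logged_model = 'runs:/37b5b4dd7bc04213a35db646520ec404/inference_pipeline_model'

# Wrap the logged pyfunc pipeline as a Spark UDF for distributed batch scoring.
predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri=logged_model, result_type='string')

# Hypothetical input DataFrame; in practice this would be read from your batch data source.
df = spark.createDataFrame(
    [('what a wonderful movie',), ('terrible plot and wooden acting',)],
    ['text'],
)

# Apply the UDF to the input column(s) expected by the pipeline.
scored = df.withColumn('prediction', predict_udf(struct('text')))
scored.show(truncate=False)
```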