Section 4 – Deploying a Deep Learning Pipeline at Scale
In this section, we will learn how to implement and deploy a multi-step inference pipeline for production use. We will start with an overview of four patterns of inference workflows in production. We will then implement a multi-step inference pipeline that wraps preprocessing and postprocessing steps around a fine-tuned deep learning (DL) model using the MLflow PyFunc API. With a ready-to-deploy, MLflow PyFunc-compatible DL inference pipeline in hand, we will survey the available deployment tools and hosting environments and decide which tool suits a given deployment scenario. We will then implement and deploy a batch inference pipeline using MLflow's Spark user-defined function (UDF). From there, we will move on to deploying a web service using either MLflow's built-in model serving tool or Ray Serve's MLflow deployment plugin. Finally, we will show a complete step-by-step guide to deploying...