Chapter 8: Deploying a DL Inference Pipeline at Scale
Deploying a deep learning (DL) inference pipeline for production use is both exciting and challenging. The exciting part is that the DL model pipeline can finally make predictions on real-world production data, delivering real value to the business. The challenging part is that there are many DL model serving platforms and host environments, and it is not easy to choose the right framework for a given serving scenario: one that minimizes deployment complexity while still providing the best model serving experience in a scalable, cost-effective way.
This chapter begins with an overview of different deployment scenarios and host environments, and then provides hands-on guidance on deploying to those environments, including local and remote cloud environments, using MLflow's deployment tools. By the end of this chapter, you should be able to confidently deploy an MLflow DL...
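As a small preview of the local-deployment workflow covered later, the sketch below shows how a client typically prepares a request for an MLflow pyfunc scoring server, which accepts JSON payloads at its /invocations endpoint. The column name, input row, model URI, and port here are illustrative assumptions, not values from a specific model in this book.

```python
import json

# Hypothetical feature columns and input rows for an illustrative model;
# a real pipeline would follow the input schema the model was logged with.
columns = ["text"]
rows = [["MLflow makes DL deployment easier"]]

# The MLflow pyfunc scoring server accepts tabular input in the
# "dataframe_split" orientation as a JSON payload.
payload = json.dumps({"dataframe_split": {"columns": columns, "data": rows}})

# For a model served locally, e.g. with:
#   mlflow models serve -m models:/my_model/1 -p 5000
# the client would POST this payload to the scoring endpoint:
#   requests.post("http://127.0.0.1:5000/invocations",
#                 data=payload,
#                 headers={"Content-Type": "application/json"})
print(payload)
```

The same payload shape works regardless of whether the model is served locally or behind a remote cloud endpoint, which is one reason the pyfunc abstraction simplifies moving between host environments.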