Chapter 9: Managing Machine Learning Workflows and Deployments
In the previous chapters, we focused on relatively straightforward machine learning model deployments with SageMaker; that is, using the deploy() function to deploy a single model to an inference endpoint. For simple experiments and deployments, this does the trick. However, when dealing with requirements that involve a more complex setup, we need a few more tricks up our sleeves.
In this chapter, we will work with a more complex set of solutions for real-time endpoint deployments and automated workflows. As shown in the following diagram, this chapter has three primary focus areas – deep learning model deployment for Hugging Face models, multi-model endpoint deployments, and ML workflows:
The first focus area involves fine-tuning and deploying state-of-the-art NLP models in SageMaker. We...