Chapter 7: Hosting ML Models in the Cloud: Best Practices
After you've successfully trained a model, you want to make the model available for inference, don't you? ML models are often the product of a business that is ML-driven. Your customers consume the ML prediction from your model, not your training jobs or processed data. How do you provide a satisfying customer experience, starting with a good experience with your ML models?
SageMaker has several options for ML hosting and inferencing, depending on your use case. Options are welcomed in many aspects of life, but it can be difficult to find the best option. This chapter will help you understand how to host models for batch inference and for online real-time inference, how to use multi-model endpoints to save costs, and how to conduct resource optimization for your inference needs.
In this chapter, we will be covering the following topics:
- Deploying models in the cloud after training
- Inferencing in batches...