How to Deploy Your Model
In this chapter, we’ll introduce you to a variety of techniques for deploying your model, including real-time endpoints, serverless inference, batch options, and more. These concepts apply to many compute environments, but we’ll focus on the capabilities available on AWS within Amazon SageMaker. We’ll discuss why you should try to shrink your model before deploying it, along with techniques for doing so for both vision and language models. We’ll also cover distributed hosting techniques for scenarios where you can’t shrink your model or don’t need to. Lastly, we’ll explore model-serving techniques and concepts that can help you optimize the end-to-end performance of your model.
We will cover the following topics in this chapter:
- What is model deployment?
- What is the best way to host my model?
- Model deployment options on AWS with SageMaker
- Techniques for reducing your model size
- Hosting distributed...