Chapter 11: Deploying Machine Learning Models
In previous chapters, we've deployed models in the simplest way possible: configuring an estimator, calling the fit() application programming interface (API) to train the model, and then calling the deploy() API to create a real-time endpoint. This workflow is the most convenient one for development and testing, but it's not the only one.
Models can be imported. For example, you could take an existing model that you trained on your local machine, import it into SageMaker, and deploy it as if you had trained it on SageMaker.
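For instance, if you trained a scikit-learn model locally and packaged it as a model.tar.gz archive in S3, a sketch along these lines would import and deploy it. The artifact path, inference script, and framework version are assumptions for illustration:

```python
import sagemaker
from sagemaker.sklearn import SKLearnModel

role = sagemaker.get_execution_role()

# Hypothetical artifact: a model trained on your local machine,
# packaged as model.tar.gz, and uploaded to S3 beforehand.
model = SKLearnModel(
    model_data="s3://my-bucket/models/model.tar.gz",
    role=role,
    entry_point="inference.py",   # defines model_fn() to load the model
    framework_version="1.2-1",
)

# Deploy it exactly as if it had been trained on SageMaker
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.t2.medium",
)
```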
In addition, models can be deployed in different configurations, as follows:
- A single model on a real-time endpoint, which is what we've done so far, or several variants of a model hosted on the same endpoint.
- A sequence of up to five models, called an inference pipeline (see the sketch after this list).
- An arbitrary number of related models that are loaded on demand on the same endpoint, known as a multi-model endpoint.
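To make the second option concrete, here's a hedged sketch of an inference pipeline built with the SageMaker Python SDK's PipelineModel, chaining a scikit-learn preprocessing container with an XGBoost model. The artifact locations, script names, endpoint name, and framework versions are illustrative placeholders:

```python
import sagemaker
from sagemaker.pipeline import PipelineModel
from sagemaker.predictor import Predictor
from sagemaker.sklearn import SKLearnModel
from sagemaker.xgboost import XGBoostModel

role = sagemaker.get_execution_role()

# Hypothetical artifacts: a preprocessing model followed by an XGBoost model
preprocessor = SKLearnModel(
    model_data="s3://my-bucket/models/preprocessor.tar.gz",
    role=role,
    entry_point="preprocessing.py",
    framework_version="1.2-1",
)
xgb_model = XGBoostModel(
    model_data="s3://my-bucket/models/xgb-model.tar.gz",
    role=role,
    entry_point="inference.py",
    framework_version="1.7-1",
)

# Chain the containers: each one's output becomes the next one's input,
# so their content types must be compatible.
pipeline = PipelineModel(
    name="my-inference-pipeline",
    role=role,
    models=[preprocessor, xgb_model],
)
pipeline.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="my-inference-pipeline",
)

# Attach a generic Predictor to the new endpoint to send requests
predictor = Predictor(endpoint_name="my-inference-pipeline")
```

At prediction time, the endpoint runs each request through the containers in order, so the output of the preprocessing step is fed to the XGBoost model, and only the final result is returned to the caller.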