Hosting multiple models with multi-model endpoints
In the previous recipe, we prepared the prerequisites for a multi-model endpoint deployment: the pre-trained model files and the S3 paths where these files will be uploaded.
In this recipe, we will deploy multiple models within a single endpoint using the multi-model endpoint support of SageMaker. Multi-model endpoints reduce costs because several models share a single endpoint instead of each model requiring its own dedicated endpoint. This approach also works well in staging or test environments, where occasional cold-start delays can be tolerated for infrequently used models, since a model is loaded on demand the first time it is invoked.
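To make the idea of several models sharing one endpoint more concrete, here is a minimal sketch of how a single multi-model endpoint might be asked to use a specific model at invocation time. It relies on the TargetModel parameter of the SageMaker runtime invoke_endpoint call. The endpoint name and model file names are hypothetical placeholders, and the text/csv payload is an assumption based on the a and b numerical features described in this recipe.

```python
def build_invoke_args(endpoint_name, target_model, a, b):
    """Build the keyword arguments for a sagemaker-runtime invoke_endpoint call.

    target_model is the model artifact path relative to the S3 prefix
    configured for the multi-model endpoint (hypothetical values below).
    """
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,
        "ContentType": "text/csv",  # assumption: CSV input with the a and b features
        "Body": f"{a},{b}",
    }


def predict(runtime_client, endpoint_name, target_model, a, b):
    """Invoke the shared endpoint and return the raw prediction as text."""
    args = build_invoke_args(endpoint_name, target_model, a, b)
    response = runtime_client.invoke_endpoint(**args)
    # The response body is a stream; read and decode it.
    return response["Body"].read().decode("utf-8")


# Example usage (requires AWS credentials and a deployed endpoint;
# all names here are hypothetical):
#
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# print(predict(runtime, "my-multi-model-endpoint", "model_a.tar.gz", 10, 20))
# print(predict(runtime, "my-multi-model-endpoint", "model_b.tar.gz", 10, 20))
```

Both calls target the same endpoint; only the TargetModel value changes. This is also where a cold-start delay may appear, on the first request for a model that has not been loaded yet.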
Note
If you are wondering where we got these pre-trained models, we simply reused two of the XGBoost models we trained in Chapter 5, Effectively Managing Machine Learning Experiments. These models simply accept numerical values for the a and b features and return the predicted label value. The...