References
Refer to the following resources for more information:
- Hosting multiple models on a single endpoint: https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html
- Amazon EC2 Inf1 instances: https://aws.amazon.com/ec2/instance-types/inf1/
- Using your own inference code with SageMaker Batch Transform: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-batch-code.html
- Multiple models with different containers behind a single endpoint: https://docs.aws.amazon.com/sagemaker/latest/dg/multi-container-endpoints.html
- Serverless inference on Amazon SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html
- Blue/green deployments using Amazon SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/deployment-guardrails-blue-green.html
- Canary traffic shifting: https://docs.aws.amazon.com/sagemaker/latest/dg/deployment-guardrails-blue-green-canary.html
- SageMaker’s Transformer class...