Managing real-time endpoints
SageMaker endpoints serve real-time predictions using models hosted on fully managed infrastructure. They can be created and managed either with the SageMaker SDK or with an AWS language SDK such as boto3. The latter gives us more flexibility and control. For instance, we can deploy several Production Variants on the same endpoint, and also configure Auto Scaling.
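To make the idea of several Production Variants more concrete, here is a minimal sketch of what an endpoint configuration built for boto3 could look like. The endpoint, configuration, variant, and model names are hypothetical placeholders, as are the instance type and the 90/10 traffic split; the helper functions are ours, not part of boto3. The sketch assumes that both models already exist in SageMaker, and the call that actually talks to AWS is left commented out.

```python
def build_endpoint_config(config_name, variants):
    """Build keyword arguments for boto3's SageMaker create_endpoint_config().

    `variants` is a list of (variant_name, model_name, instance_type, weight)
    tuples. All values used below are hypothetical placeholders.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": variant_name,
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
                # Weights are relative: a variant's traffic share is its
                # weight divided by the sum of all weights.
                "InitialVariantWeight": weight,
            }
            for variant_name, model_name, instance_type, weight in variants
        ],
    }


def create_endpoint(endpoint_name, config):
    """Create the endpoint configuration, then the endpoint (needs AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(**config)
    sm.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config["EndpointConfigName"],
    )


# Two hypothetical models sharing traffic 90/10 on the same endpoint.
config = build_endpoint_config(
    "my-endpoint-config",
    [
        ("variant-a", "my-model-a", "ml.m5.large", 9.0),
        ("variant-b", "my-model-b", "ml.m5.large", 1.0),
    ],
)
# create_endpoint("my-endpoint", config)  # uncomment to create resources on AWS
```

Keeping the configuration as a plain dictionary lets us inspect or log it before any AWS resource is created.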
First, let's look at the SageMaker SDK in greater detail.
Managing endpoints with the SageMaker SDK
The SageMaker SDK lets you work with endpoints in several ways:
- Configure an estimator, train it with fit(), deploy an endpoint with deploy(), and invoke it with predict().
- Deploy an existing model.
- Invoke an existing endpoint.
- Update an existing endpoint.
We've used the first scenario in many examples so far. Let's look at the other ones.
Deploying an existing model
This is useful when you want to import a model that wasn't trained...