SageMaker Deployment Solutions
After training our machine learning (ML) model, we can proceed with deploying it to a web API. This API can then be invoked by other applications (for example, a mobile application) to perform a “prediction” or inference. For example, the ML model we trained in Chapter 1, Introduction to ML Engineering on AWS, can be deployed to a web API and then be used to predict the likelihood that customers will cancel their reservations, given a set of inputs. Deploying the ML model to a web API makes it accessible to different applications and systems.
A few years ago, ML practitioners had to spend time building a custom backend API from scratch to host and deploy a model. If you were given this requirement, you might have used a Python framework such as Flask, Pyramid, or Django to deploy the ML model. Building a custom API to serve as an inference endpoint can take a week or so since most of the application logic needs...
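To make the "custom backend API" approach concrete, here is a minimal sketch of what such a hand-rolled inference endpoint might look like in Flask. The route name (`/predict`), the request fields, and the commented-out model file name (`model.pkl`) are all hypothetical; a real service would also need input validation, error handling, logging, and a production WSGI server, which is part of why this approach takes time.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# In a real service, the trained model would be loaded once at startup,
# for example (hypothetical file name):
#   import joblib
#   model = joblib.load("model.pkl")

@app.route("/predict", methods=["POST"])
def predict():
    """Accept a JSON payload of features and return a prediction."""
    payload = request.get_json()
    features = payload.get("features", [])
    # A real endpoint would call something like model.predict_proba(features);
    # a fixed placeholder score keeps this sketch self-contained.
    score = 0.5
    return jsonify({
        "cancellation_probability": score,
        "n_features": len(features),
    })

# To serve locally, you would run the app, e.g.:
#   app.run(host="0.0.0.0", port=8080)
```

A client application (for example, the mobile app mentioned earlier) would then send an HTTP POST request with the customer's booking details and receive the predicted cancellation probability in the JSON response.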