Understanding model inferencing with Seldon Core
In the previous chapter, you built a model. Data science teams build such models to be used in production, where they serve prediction requests. There are many ways to use a model in production, such as embedding it directly in your customer-facing application, but the most common approach is to expose it as a REST API, which any application can then call. Running a model in production and answering prediction requests with it is generally called model serving.
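To make the idea concrete, here is a minimal sketch of the "model behind a REST API" pattern: a handler that accepts a JSON request body, runs the model, and returns a JSON response. The `DummyModel` class, the `instances`/`predictions` payload shape, and the `handle_predict` function are all hypothetical stand-ins, not part of any particular framework; in a real service, a web framework such as Flask or FastAPI would route an HTTP POST to a handler like this.

```python
import json

# Hypothetical stand-in for a trained model: it "predicts" the sum of the
# features in each row. In a real service this would be a model loaded
# from disk (e.g. with joblib or a framework-specific loader).
class DummyModel:
    def predict(self, features):
        return [sum(row) for row in features]

MODEL = DummyModel()

def handle_predict(request_body: str) -> str:
    """Handle one prediction request: JSON in, JSON out.

    A web framework would route POST /predict to this handler.
    """
    payload = json.loads(request_body)
    predictions = MODEL.predict(payload["instances"])
    return json.dumps({"predictions": predictions})

# Any HTTP-capable application can now consume the model by sending JSON:
print(handle_predict('{"instances": [[1.0, 2.0], [3.0, 4.0]]}'))
```

Because the contract is just JSON over HTTP, the calling application needs no knowledge of the model's framework or language, which is what makes the REST approach so widely applicable.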
Once a model is in production, however, it must be monitored for performance and updated when it no longer meets the expected criteria. A hosted model solution lets you not only serve the model but also monitor its performance and generate alerts that can trigger retraining.
Seldon is a UK-based firm that created a set of tools to manage the machine learning model life cycle. Seldon Core is an open source framework that helps expose ML models so they can be consumed as REST APIs...
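In Seldon Core, a model is exposed by applying a `SeldonDeployment` custom resource to a Kubernetes cluster; the framework then builds the REST endpoint for you. The following is a minimal sketch of such a manifest, assuming a scikit-learn model stored in a cloud bucket; the deployment name and the `modelUri` placeholder are illustrative, not real artifacts.

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: sklearn-example          # hypothetical deployment name
spec:
  predictors:
  - name: default
    replicas: 1
    graph:
      name: classifier
      # Use Seldon's prepackaged scikit-learn server rather than
      # building a custom container image for the model.
      implementation: SKLEARN_SERVER
      # Placeholder: point this at the bucket holding your saved model.
      modelUri: gs://<your-bucket>/<path-to-model>
```

Once applied, Seldon Core wires the model into its service orchestrator, which is also where the monitoring and alerting capabilities mentioned above attach.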