Introducing the ensemble pattern
In this section, we will discuss the ensemble pattern of serving models and the different types of ensembles that can be used to serve a model.
In the ensemble pattern of serving models, more than one model is served together, and an inference decision is made by combining the inferences from all the models in the ensemble.
The final response, $y$, from the input, $x$, will be generated as a combined inference from the models $M_1$, $M_2$, $\dots$, $M_n$, as shown in the following equation:

$$y = f\big(M_1(x), M_2(x), \dots, M_n(x)\big)$$

In this equation, $y$ is the response and $f$ is the combination function that combines the responses from all the models. $M_1$, $M_2$, $\dots$, $M_n$ are different models and $x$ is the input passed to the models.
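The combination step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a serving setup. The names `ensemble_predict`, `majority_vote`, and `mean_response`, as well as the three toy models, are hypothetical and introduced only for this example. Each model is assumed to be a plain callable that takes the input and returns a prediction.

```python
from collections import Counter
from typing import Any, Callable, Sequence

# Assumption: a "model" is any callable that maps an input to a prediction.
Model = Callable[[Any], Any]


def ensemble_predict(models: Sequence[Model],
                     combine: Callable[[list], Any],
                     x: Any) -> Any:
    """Pass the same input to every model, then combine their responses."""
    predictions = [model(x) for model in models]
    return combine(predictions)


def majority_vote(predictions: list) -> Any:
    """Combination function for class labels: the most frequent label wins."""
    return Counter(predictions).most_common(1)[0][0]


def mean_response(predictions: list) -> float:
    """Combination function for numeric outputs: the arithmetic mean."""
    return sum(predictions) / len(predictions)


# Hypothetical stand-in models; real models would wrap trained predictors.
def model_a(text: str) -> str:
    return "spam" if "free" in text else "ham"


def model_b(text: str) -> str:
    return "spam" if "winner" in text else "ham"


def model_c(text: str) -> str:
    return "spam" if len(text) > 20 else "ham"


classifiers = [model_a, model_b, model_c]
label = ensemble_predict(classifiers, majority_vote,
                         "free prize for the winner today")
```

Swapping `majority_vote` for `mean_response` changes only the combination function, so the same `ensemble_predict` call covers both classification-style and regression-style ensembles.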
We can combine multiple models into an ensemble in various scenarios. The first four of these scenarios are introduced in this article: https://www.anyscale.com/blog/serving-ml-models-in-production-common-patterns. The following scenarios are examples of where the ensemble pattern can be...