ML serving patterns
Some patterns are specific to serving ML models. This section introduces the categories of these patterns and then describes each category separately.
Model serving patterns can be classified into the following two categories at a high level:
- Serving philosophies: This group of patterns concerns the principles and best practices to keep in mind when serving a model – for example, whether serving should be stateful or stateless, or whether the performance of the model should be evaluated continuously or intermittently.
- Serving approaches: This group of patterns describes the different ways a model can be served – for example, how to serve a model when there is a large quantity of distributed data, how to serve it when the most recent data must have an immediate impact, or how to serve on edge devices.
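To make the stateful versus stateless distinction from the first category concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: `toy_model` stands in for a trained model, and `stateless_predict` and `StatefulServer` are made-up names rather than part of any serving library.

```python
from dataclasses import dataclass, field
from typing import List


def toy_model(features: List[float]) -> float:
    """Hypothetical stand-in for a trained model: returns the mean of its inputs."""
    return sum(features) / len(features)


def stateless_predict(features: List[float]) -> float:
    """Stateless serving: the response depends only on the current request."""
    return toy_model(features)


@dataclass
class StatefulServer:
    """Stateful serving: the response also depends on earlier requests."""
    history: List[float] = field(default_factory=list)

    def predict(self, value: float) -> float:
        # Remember the new value, then predict from everything seen so far.
        self.history.append(value)
        return toy_model(self.history)


if __name__ == "__main__":
    # The same request always gives the same answer.
    print(stateless_predict([1.0, 3.0]))  # 2.0
    print(stateless_predict([1.0, 3.0]))  # 2.0

    # The same input (4.0) gives different answers depending on history.
    fresh = StatefulServer()
    print(fresh.predict(4.0))   # 4.0

    warmed = StatefulServer()
    warmed.predict(2.0)
    print(warmed.predict(4.0))  # 3.0
```

In this sketch, the stateless function can be replicated freely because any replica returns the same answer for the same request, whereas the stateful server has to keep, or be able to restore, its history for its answers to stay consistent.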
We will...