Summary
In this chapter, we looked at stateful and stateless functions in more detail. We saw how stateful serving gets in the way of scalable and resilient serving. When each server keeps its own state, the servers can drift out of sync. Once that happens, the same request can receive a different response depending on which server handles it, which violates the consistency principle.
We discussed the different kinds of state found in machine learning models and how each one affects inference. We also saw that a model carrying this state is a major barrier to resilient and scalable serving.
We covered some techniques for decoupling state from ML models and worked through demos that serve dummy models from a Flask API server.
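As a quick recap of the core idea, the following minimal sketch contrasts a stateful replica with a stateless prediction function. It is not the chapter's demo code: the names `StatefulReplica`, `stateless_predict`, and `request_count` are hypothetical, the "model" is a dummy arithmetic expression, and Flask is left out so the example runs on its own.

```python
class StatefulReplica:
    """Hypothetical server replica that keeps a request counter in memory."""

    def __init__(self):
        self.request_count = 0  # state local to this replica

    def predict(self, x):
        self.request_count += 1
        # Dummy model: the output depends on this replica's private state.
        return x * 2 + self.request_count


def stateless_predict(x, request_count):
    """Same dummy model, but the state is supplied by the caller."""
    return x * 2 + request_count


# Two replicas behind a load balancer; only replica_a has seen traffic so far.
replica_a = StatefulReplica()
replica_b = StatefulReplica()
replica_a.predict(1.0)

# The same request now yields different answers depending on the replica.
stateful_a = replica_a.predict(1.0)
stateful_b = replica_b.predict(1.0)

# With the state passed in explicitly, any replica returns the same answer.
stateless_a = stateless_predict(1.0, request_count=2)
stateless_b = stateless_predict(1.0, request_count=2)

print(stateful_a, stateful_b)    # the two values differ
print(stateless_a, stateless_b)  # the two values match
```

In a real deployment the counter would live in a store shared by all replicas rather than being passed by hand. The sketch only shows why moving state out of the serving process restores consistent responses.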
In the next chapter, we will learn about continuous model evaluation. We will discuss what the continuous model evaluation pattern is and why it is necessary, along with examples.