Introducing online model serving
In online model serving, the model is retrained in the background as user input arrives, so its responses reflect the most recent data available for training. Whenever we send a prediction request with an input, the model updates its weights and biases by running a training step on that input, and the prediction response is returned at the same time.
Figure 7.1 shows how the model update is coupled with the prediction requests in online model serving. Every prediction request also involves updating the model:
Figure 7.1 – Online model serving usually couples the model updating with the prediction requests
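To make this coupling concrete, here is a minimal sketch in plain Python. It is not a production implementation: the OnlineLinearModel class, the serve function, and the learning rate are hypothetical names and values chosen for illustration, and the sketch assumes that the true label arrives together with the prediction request so that a training step can run immediately.

```python
from dataclasses import dataclass, field


@dataclass
class OnlineLinearModel:
    """Hypothetical linear model trained one example at a time with SGD."""
    n_features: int
    learning_rate: float = 0.1  # assumed value, for illustration only
    weights: list = field(default_factory=list)
    bias: float = 0.0

    def __post_init__(self):
        # Start from all-zero weights unless the caller supplied some.
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict(self, x):
        # Linear prediction: w . x + b
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def update(self, x, y):
        # One gradient step on the squared error for a single example.
        error = self.predict(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


def serve(model, x, y):
    """Handle one request: answer with the current model, then train on it."""
    prediction = model.predict(x)  # response uses the weights seen so far
    model.update(x, y)             # the same request also updates the model
    return prediction


if __name__ == "__main__":
    model = OnlineLinearModel(n_features=1)
    print(serve(model, [1.0], 2.0))  # prediction before any training
    print(serve(model, [1.0], 2.0))  # prediction after one update
```

Sending the same input twice shows the effect: the first response comes from the untrained model, while the second is closer to the target because the first request has already updated the weights and bias.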
Let’s go into more detail on serving.
Serving requests
We have seen that in online model serving, each prediction request also triggers a model update request. Now, this...