Inferencing (online or batch)
Inferencing is the process of using a trained machine learning model to make predictions on new, unseen data. Online inferencing refers to making predictions in real time on live data as it arrives; latency is of utmost importance here, so that the end user experiences no lag.
The other type is batch inferencing, where predictions are made offline on a large set of already collected data.
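The contrast between the two modes can be sketched with a toy model. This is a minimal illustration, not part of the text's example; the model, data, and values are invented for demonstration, and any trained model with a `predict` method would fit the same pattern:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# A toy trained model (stand-in for any trained ML model).
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X_train, y_train)

# Online inferencing: score a single record the moment it arrives.
# Low latency matters, so only one (or very few) rows are scored at a time.
live_record = np.array([[2.5]])
online_pred = model.predict(live_record)[0]

# Batch inferencing: score a large, already-collected dataset offline.
collected_data = np.array([[0.5], [1.5], [2.5], [3.5]])
batch_preds = model.predict(collected_data)
```

The code path is identical in both cases; what differs is when the data arrives and how many records are scored per call.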
Figure A.2 – Process flow when live data comes to the model for scoring (inferencing)
The following are the steps involved in the online inferencing process:
- Input data: The first step is to receive the new input data to be classified or predicted. This data could be text, images, audio, or any other data format.
- Transform data: Before...