Understanding micro batching
Micro batching is the procedure in which an incoming stream of messages is processed by dividing it into small batches. This provides the performance benefits of batch processing while keeping the processing latency of each message low.
Here, an incoming stream of messages is treated as a flow of small batches of messages and each batch is passed to a processing engine, which outputs a processed stream of batches.
In Spark Streaming, a micro batch is created based on time instead of size; that is, events received within a certain time interval, usually specified in milliseconds, are grouped together into a batch. This ensures that no message has to wait long before being processed, which helps to keep processing latency under control.
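The time-based grouping described above can be illustrated with a small, self-contained sketch in plain Python. It does not use Spark's API; the names `micro_batches`, `process`, and `interval_ms`, as well as the sample events, are hypothetical and chosen only for illustration. Each event carries an assumed arrival time in milliseconds, and events that fall into the same interval end up in the same batch.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption for this sketch: an event is (arrival time in ms, payload).
Event = Tuple[int, str]


def micro_batches(events: Iterable[Event], interval_ms: int) -> List[List[str]]:
    """Group events into batches by the time window in which they arrived."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    windows: Dict[int, List[str]] = {}
    for arrival_ms, payload in events:
        # Events in [k * interval_ms, (k + 1) * interval_ms) share window k.
        windows.setdefault(arrival_ms // interval_ms, []).append(payload)
    # Emit batches in window order; empty windows are skipped in this sketch.
    return [windows[k] for k in sorted(windows)]


def process(batch: List[str]) -> List[str]:
    # Stand-in for the processing engine: one call handles a whole batch.
    return [payload.upper() for payload in batch]


events = [(0, "a"), (120, "b"), (480, "c"), (510, "d"), (1490, "e")]
batches = micro_batches(events, interval_ms=500)
processed = [process(batch) for batch in batches]
```

With a 500 ms interval, the first three events share one batch, while the events arriving at 510 ms and 1490 ms each land in a batch of their own. The batch boundary depends only on arrival time, so a message waits at most one interval before its batch is handed to the processing step.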
Another advantage of micro batching is that it helps to keep the volume of control messages low. For example, if a system requires an acknowledgement to be sent by the...