Kafka Streams internally uses the Kafka producer and consumer libraries. It is tightly coupled with Apache Kafka and allows you to leverage the capabilities of Kafka to achieve data parallelism, fault tolerance, and many other powerful features.
In this section, we will discuss how Kafka Streams works internally and which components are involved in building stream-processing applications with it. The following figure shows an internal representation of how Kafka Streams works:
A Streams instance consists of multiple tasks, where each task processes a non-overlapping subset of the records (its assigned input partitions). If you want to increase parallelism, you can simply start more instances with the same application ID, and Kafka Streams will automatically distribute the tasks among them, as shown in the sketch below.
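As a minimal sketch of what such an instance looks like (the topic names, application ID, and broker address here are illustrative assumptions, not taken from the figure), the same program can simply be started multiple times; every copy that shares the same `application.id` joins the same group, and tasks are rebalanced across the running instances:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

public class StreamsInstanceDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // All instances sharing this application.id form one group;
        // Kafka Streams distributes the tasks (input partitions) among them.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-parallelism-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Optionally run more than one stream thread inside a single instance;
        // each thread executes one or more tasks.
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2);

        StreamsBuilder builder = new StreamsBuilder();
        // One task is created per input partition of "input-topic";
        // starting another copy of this program rebalances tasks across instances.
        builder.<String, String>stream("input-topic")
               .mapValues(value -> value.toUpperCase())
               .to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```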
Let's discuss a few important components seen in the previous figure:
- Stream...