Introduction to Apache Kafka
Apache Kafka is a distributed streaming platform that allows applications to publish and subscribe to a stream of records. Apache Kafka is not just a message queue, it also allows applications to publish the events that are then stored by Kafka as an immutable log in a fault-tolerant way. It allows the producers and consumers of the events to scale horizontally without affecting each other. Since the events are logged in the same sequence as they are published within Kafka, it allows consumers to replay the log from and up to the desired point to reconstruct views of the application state.
Design principles
Kafka is run as a cluster of one or more servers that act as message brokers (https://en.wikipedia.org/wiki/Message_broker) in the system. Kafka categorizes the stream of records under topics that are used by producers and consumers to produce records and consume them, respectively. Each record consists of a key-value pair and a timestamp.
A typical Kafka cluster...