Suppose that we have in our possession a purchase table and we want to read records for an item from the purchase table that belongs to a certain category, say, electronics. In the normal course of events, we will simply filter out other records, but what if we partition our table in such a way that we will be able to read the records of our choice quickly?
This is exactly what happens when topics are broken into partitions known as units of parallelism in Kafka. This means that the greater the number of partitions, the more throughput. This does not mean that we should choose a huge number of partitions. We will talk about the pros and cons of increasing the number of partitions further.
While creating topics, you can always mention the number of partitions that you require for a topic. Each of the messages will be appended to partitions and each message is...