Kafka Connect is used to copy data into and out of Kafka. Many tools already exist for moving data from one system to another. There are many use cases in which you want to run both real-time analytics and batch analytics on the same data. The data may come from different sources but ultimately fall into the same category or type.
We may want to bring this data into Kafka topics and then either pass it to a real-time processing engine or store it for batch processing. If you look closely at the following figure, you will see that several processes are involved:
Kafka Connect
Let's look into each component in detail:
- Ingestion in Kafka: Data is inserted into Kafka topics from different sources, and most of the time the types of sources are common. For example, you may want to insert server logs into Kafka topics, or insert all records from the database...
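
To make the server-log case concrete, the following is a minimal sketch of the kind of configuration a Kafka Connect source connector takes. It builds the JSON payload that a Connect worker accepts when a connector is created through its REST interface (`POST /connectors`). It assumes the demo `FileStreamSourceConnector` that ships with Apache Kafka; the connector name, log path, and topic name are hypothetical values chosen for illustration, and the helper function `build_file_source_config` is not part of any Kafka library.

```python
import json


def build_file_source_config(name: str, path: str, topic: str) -> dict:
    """Build a connector-creation payload for a Kafka Connect worker.

    Hypothetical helper: it only assembles a dictionary and does not
    contact any Kafka Connect worker.
    """
    return {
        "name": name,
        "config": {
            # Demo connector bundled with Apache Kafka; it tails a file
            # and publishes each line as a record.
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            # Reading a single file needs only one task.
            "tasks.max": "1",
            # Source file to read (assumed path, for illustration only).
            "file": path,
            # Destination Kafka topic (assumed name, for illustration only).
            "topic": topic,
        },
    }


payload = build_file_source_config(
    name="server-log-source",
    path="/var/log/app/server.log",
    topic="server-logs",
)

# Print the JSON body that would be sent to the worker's /connectors endpoint.
print(json.dumps(payload, indent=2))
```

The same keys could equally be written as a properties file for a standalone worker. The point is that ingestion from a common source type is expressed as configuration rather than as custom producer code.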