Flume Channel
A channel is the mechanism a Flume agent uses to transfer data from a source to a sink. Events are persisted in the channel and remain there until a sink takes them. This persistence lets the sink retry an event if a failure occurs while writing the data to the final store (for example, HDFS).
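The retry behavior described above can be sketched with a small toy model. This is not Flume's API: `ToyChannel`, `FlakyStore`, and `drain` are hypothetical names invented for this illustration. In Flume itself the same effect comes from channel transactions, where a sink commits after a successful write and rolls back after a failure. The sketch only shows the core idea that an event leaves the channel only once delivery has succeeded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model (not Flume's API): an event stays in the channel until delivery succeeds. */
public class ChannelRetrySketch {

    /** Hypothetical stand-in for a channel: a FIFO buffer of events. */
    static class ToyChannel {
        private final Deque<String> events = new ArrayDeque<>();

        void put(String event) { events.addLast(event); }

        /** Returns the next event without removing it, or null if the channel is empty. */
        String peek() { return events.peekFirst(); }

        /** Removes the next event; called only after a successful delivery. */
        void commit() { events.pollFirst(); }

        int size() { return events.size(); }
    }

    /** Hypothetical final store that rejects the first few writes, then accepts all. */
    static class FlakyStore {
        private int failuresLeft;
        final List<String> written = new ArrayList<>();

        FlakyStore(int failures) { this.failuresLeft = failures; }

        void write(String event) {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new IllegalStateException("store unavailable");
            }
            written.add(event);
        }
    }

    /** Drains the channel, retrying each event until the store accepts it. Returns the attempt count. */
    static int drain(ToyChannel channel, FlakyStore store) {
        int attempts = 0;
        String event;
        while ((event = channel.peek()) != null) {
            attempts++;
            try {
                store.write(event);
                channel.commit(); // delivered, so the event can now leave the channel
            } catch (IllegalStateException e) {
                // Write failed: the event is still in the channel and is retried on the next pass.
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        ToyChannel channel = new ToyChannel();
        channel.put("e1");
        channel.put("e2");
        FlakyStore store = new FlakyStore(2); // first two writes fail
        int attempts = drain(channel, store);
        System.out.println(store.written + " after " + attempts + " attempts");
    }
}
```

Because the event is removed only after `write` returns normally, a failed write never loses data; the cost is that the sink may spend several attempts on the same event.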
Channels can be broadly categorized into two types:
- In-memory: Events are available only while the channel component is alive:
  - Queue: Events are held in in-memory queues within the channel. This gives the lowest processing latency because events are kept in memory.
- Durable: Persisted events remain available even after the component dies, and they are processed once the component comes back online:
  - File (WAL, or write-ahead log): The most commonly used channel type. It is durable and should be backed by RAID, SAN, or similar storage.
  - JDBC: An RDBMS-backed channel that provides ACID compliance.
  - Kafka: Events are stored in a Kafka cluster.
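As a rough illustration, the channel types listed above might be declared in an agent's properties file along the following lines. The agent name `a1`, the channel names, the directory paths, the broker addresses, the topic name, and the capacity values are all placeholders chosen for this sketch, and a real agent would normally define only the channel it actually uses.

```properties
# Hypothetical agent "a1"; every name, path and size below is a placeholder
a1.channels = memCh fileCh jdbcCh kafkaCh

# In-memory queue: lowest latency, but events are lost if the agent process dies
a1.channels.memCh.type = memory
a1.channels.memCh.capacity = 10000
a1.channels.memCh.transactionCapacity = 100

# File channel (write-ahead log): events survive an agent restart
a1.channels.fileCh.type = file
a1.channels.fileCh.checkpointDir = /var/flume/checkpoint
a1.channels.fileCh.dataDirs = /var/flume/data

# JDBC channel: events are stored in a database
a1.channels.jdbcCh.type = jdbc

# Kafka channel: events are stored in a Kafka topic
a1.channels.kafkaCh.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.kafkaCh.kafka.bootstrap.servers = broker1:9092,broker2:9092
a1.channels.kafkaCh.kafka.topic = flume-channel
```

The choice between these is mostly a trade-off between latency and durability: the memory channel is the fastest, while the file, JDBC, and Kafka channels keep events across an agent failure.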
There is another special channel called...