Now that we are familiar with the Storm topology concept, let us look at how to integrate Apache Storm with Apache Kafka. Kafka is one of the data sources most widely used with Storm in production applications. The following APIs are available for integration:
- KafkaSpout: A spout in Storm consumes data from a source system and passes it to bolts for further processing. KafkaSpout is designed specifically to consume data from Kafka as a stream and emit it to bolts. KafkaSpout accepts a SpoutConfig, which contains the ZooKeeper and Kafka broker information and the topic to connect to.
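To make the role of these settings more concrete, here is a minimal, dependency-free sketch of the kind of information a SpoutConfig bundles together. The `SpoutSettings` class, its field names, and the sample values are hypothetical stand-ins for illustration only, not the storm-kafka classes. The offset path layout (`<zkRoot>/<id>/partition_<n>`) is an assumption about how the spout stores its state in ZooKeeper and should be checked against the version you use.

```java
// Hypothetical stand-in for the settings a SpoutConfig carries.
// This is NOT the storm-kafka class, only a dependency-free illustration.
final class SpoutSettings {
    final String zkConnect; // ZooKeeper connection string, e.g. "localhost:2181"
    final String topic;     // Kafka topic the spout reads from
    final String zkRoot;    // ZooKeeper root path under which spout state is kept
    final String id;        // identifier for this consumer (the consumer group)

    SpoutSettings(String zkConnect, String topic, String zkRoot, String id) {
        // ZooKeeper paths are absolute, so the root must start with "/".
        if (!zkRoot.startsWith("/")) {
            throw new IllegalArgumentException("zkRoot must start with '/': " + zkRoot);
        }
        this.zkConnect = zkConnect;
        this.topic = topic;
        this.zkRoot = zkRoot;
        this.id = id;
    }

    // Assumed layout for committed offsets: <zkRoot>/<id>/partition_<n>.
    String offsetPath(int partition) {
        return zkRoot + "/" + id + "/partition_" + partition;
    }

    public static void main(String[] args) {
        SpoutSettings settings =
            new SpoutSettings("localhost:2181", "events", "/kafka-spout", "demo-group");
        System.out.println(settings.offsetPath(0));
    }
}
```

Keeping the consumer identifier in the path means two topologies with different identifiers track their read positions independently, while redeploying a topology with the same identifier lets it resume from where it left off.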
The following code creates a SpoutConfig:
SpoutConfig spoutConfig = new SpoutConfig(hosts, inputTopic, "/" + zkRootDir, consumerGroup);
spoutConfig.scheme = new SchemeAsMultiScheme...