Connecting Kafka to Spark Streaming
This section walks you through a program that reads streaming data from a Kafka topic and counts the words. The code covers the following aspects:
- Kafka-Spark Streaming integration
- Creating and consuming from DStreams in Spark
- Reading from an unbounded stream in a streaming application to generate results
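Before turning to the Spark program itself, it may help to see the word count in isolation. A DStream is a sequence of small batches, one per batch interval, so the sketch below uses plain Java with no Spark or Kafka dependency to count words one batch at a time. The class name MicroBatchWordCount, the countWords helper, and the sample messages are hypothetical and used only for illustration. The sketch also assumes that words are separated by whitespace and that counts are computed per batch rather than accumulated across batches.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration: plain Java, not the Spark or Kafka API.
class MicroBatchWordCount {

    // Assumption: words in a message are separated by whitespace.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Counts the words in a single micro-batch of messages.
    static Map<String, Integer> countWords(List<String> batch) {
        // LinkedHashMap keeps words in the order they were first seen.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String message : batch) {
            for (String word : WHITESPACE.split(message.trim())) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical messages, grouped into two micro-batches.
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("kafka spark kafka", "spark streaming"),
                Arrays.asList("kafka"));

        // Each batch is counted on its own, which mirrors how a basic
        // per-batch transformation treats each interval of a stream.
        for (List<String> batch : batches) {
            System.out.println(countWords(batch));
        }
        // prints {kafka=2, spark=2, streaming=1}
        // prints {kafka=1}
    }
}
```

In the streaming application, the batches are not a fixed list. They keep arriving for as long as the topic receives messages, which is why the program produces a new set of counts for every interval.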
Let's take a look at the program's code, starting with the package declaration and the imports:
package com.example.spark;

Import files:

import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Pattern;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils...