Spark Streaming and Kafka are commonly combined in data pipelines. This section presents some examples of consuming Kafka streams with Spark.
Streaming data with Kafka and Spark
Apache Kafka
Apache Kafka (http://kafka.apache.org/) is an open source message broker written in Scala and Java. It was originally developed at LinkedIn, released as open source in 2011, and is currently maintained by the Apache Software Foundation.
Here are some of the reasons why you might prefer Kafka to a traditional JMS message broker:
- It's fast: A single Kafka broker running on commodity hardware can handle hundreds of megabytes of reads and writes per second from thousands of clients
- Great scalability: It can be easily...