Chapter 5. Real-Time Analytics with Spark Streaming and Structured Streaming
Spark Streaming supports real-time processing of fast-moving, streaming data so that businesses can gain insights and make decisions in real time or near real time. It is an extension of Spark Core that adds support for stream processing. Spark Streaming is production-ready and is used in many organizations. This chapter helps you get started with writing real-time applications, including applications that integrate with Kafka and HBase. It also introduces Structured Streaming, a new concept introduced in Spark 2.0.
This chapter is divided into the following subtopics:
- Introducing real-time processing
- Architecture of Spark Streaming
- Stateless and stateful stream processing
- Transformations and actions
- Input sources and output stores
- Spark Streaming with Kafka and HBase
- Advanced Spark Streaming concepts
- Monitoring applications
- Introducing Structured Streaming