Section 3: Beyond Batch – Building Real-Time Data Pipelines
In this section, you will learn how batch processing, which is what you have been doing so far, differs from stream processing. You will be introduced to a new set of tools that allow you to stream and process data in real time. First, you will learn how to build an Apache Kafka cluster to stream real-time data. To process this data, you will build and deploy an Apache Spark cluster. Lastly, you will learn two more advanced NiFi topics: how to stream data to NiFi from an Internet of Things device using MiNiFi, and how to cluster NiFi for more processing power.
This section comprises the following chapters:
- Chapter 12, Building an Apache Kafka Cluster
- Chapter 13, Streaming Data with Kafka
- Chapter 14, Data Processing with Apache Spark
- Chapter 15, Real-Time Edge Data – Kafka, Spark, and MiNiFi