Part 2: Data Engineering Toolset
In this part, we will go through the bread-and-butter tools found in most companies. We will take an in-depth look at Apache Spark, Delta Lake, batch processing, and streaming. We will then look at how to work with Kafka, the most popular dedicated streaming tool.
This part has the following chapters:
- Chapter 3, Apache Spark Deep Dive
- Chapter 4, Batch and Stream Data Processing using PySpark
- Chapter 5, Streaming Data with Kafka