Chapter 2. Transformations and Actions with Spark RDDs
Now that we have had a basic overview of Spark's architecture and its key software components, we will cover Spark RDDs in this chapter. Over the course of the chapter, we'll walk through the following topics:
- How to construct RDDs
- Operations on RDDs, such as transformations and actions
- Passing functions to Spark (Scala, Java, and Python)
- Transformations such as map, filter, flatMap, and sample
- Set operations such as distinct, intersection, and union
- Actions such as reduce, collect, count, take, and first
- PairRDDs
- Shared variables, such as broadcast variables
Let's get cracking!