Learning Apache Spark 2

Chapter 2. Transformations and Actions with Spark RDDs

Now that we have had a basic overview of Spark's architecture and key software components, this chapter covers Spark RDDs. We'll walk through the following topics:

How to construct RDDs
Operations on RDDs, such as transformations and actions
Passing functions to Spark (Scala, Java, and Python)
Transformations such as map, filter, flatMap, and sample
Set operations such as distinct, intersection, and union
Actions such as reduce, collect, count, take, and first
PairRDDs
Shared and broadcast variables

Let's get cracking!
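As a rough preview of the set operations and actions listed above, the following sketch uses plain Python lists as stand-ins for RDDs. It does not call any Spark API: the variable names and sample data are made up for illustration, and the comments describe the behavior the like-named RDD operations are generally documented to have.

```python
from functools import reduce

# Plain Python lists standing in for two RDDs (illustrative data, not Spark objects).
a = [1, 2, 2, 3]
b = [3, 4]

# distinct: drop duplicate elements. Sorted here only to make the output
# deterministic; Spark does not promise any particular order.
distinct_a = sorted(set(a))

# union: in Spark, union keeps duplicates, so concatenation is the closer analogy.
union_ab = a + b

# intersection: elements present in both datasets, without duplicates.
intersection_ab = sorted(set(a) & set(b))

# Actions return a value to the caller instead of producing a new dataset.
total = reduce(lambda x, y: x + y, a)  # reduce: combine all elements pairwise
count = len(a)                         # count: number of elements
first = a[0]                           # first: the first element
first_two = a[:2]                      # take(2): the first two elements

print(distinct_a, union_ab, intersection_ab)
print(total, count, first, first_two)
```

On a real cluster the same names would be methods on an RDD, and the data would be split across partitions rather than held in one list.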

Shambhu Nath Mar 13, 2018

Delivery is awesome! This book uses simple English, so it is good for beginners. The explanations are also good, but I felt the print quality of the screenshots is not so good. Thanks, Shambhu Nath

Ivan Falcão Oct 17, 2017

Excellent book. It presents a considerable theoretical foundation, as well as several practical examples. Certainly one of the best options for anyone who wants to learn more about Spark.

Kalaiselvan Dec 25, 2017

Simple language, good book for hands-on development.

Deepak May 20, 2019

Not much detail. It covers topics only at a high level and gives links to refer to for further details. Returned it. Ordered Learning Spark from O'Reilly.

Jose VL Jun 21, 2017

Nice reading to learn about Spark. I'd have liked to see more information for developers ... maybe next edition :-)

Most Common Transformations
`map(func)`	`coalesce(numPartitions)`
`filter(func)`	`repartition(numPartitions)`
`flatMap(func)`	`repartitionAndSortWithinPartitions(partitioner)`
`mapPartitions(func)`	`join(otherDataset, [numTasks])`
`mapPartitionsWithIndex(func)`	`cogroup(otherDataset, [numTasks])`
`sample(withReplacement, fraction, seed)`	`cartesian(otherDataset)`
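To give a feel for what a few of the transformations in the table do, here is a minimal sketch in plain Python. It is an analogy only: ordinary lists stand in for an RDD and its partitions, no Spark API is called, and the sample data and variable names are assumptions made for illustration.

```python
from itertools import chain

# A plain Python list standing in for an RDD of text lines (illustrative data).
lines = ["to be or", "not to be", ""]

# map(func): apply func to every element, producing exactly one output per input.
lengths = [len(line) for line in lines]

# filter(func): keep only the elements for which func returns True.
non_empty = [line for line in lines if line]

# flatMap(func): func returns a sequence per element, and the results are flattened.
words = list(chain.from_iterable(line.split() for line in non_empty))

# mapPartitions(func) analogue: func receives a whole chunk ("partition") at once.
# The split into two chunks below is arbitrary and hypothetical.
partitions = [lines[:2], lines[2:]]
sizes_per_partition = [len(part) for part in partitions]

print(lengths)
print(non_empty)
print(words)
print(sizes_per_partition)
```

The difference between `map` and `flatMap` shows up in the shapes: `lengths` has one entry per input line, while `words` has one entry per word across all non-empty lines.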

Learning Apache Spark 2: A beginner's guide to real-time Big Data processing using the Apache Spark framework

What is an RDD?

Operations on RDD

Transformations

Actions

Passing functions to Spark (Scala)

Anonymous functions

Passing functions to Spark (Java)

Passing functions to Spark (Python)

Transformations

Map(func)

Set operations in Spark
