Overview of DataFrame transformations
Just like RDDs, DataFrames have both transformations and actions. As a reminder, transformations convert one DataFrame into another, while actions perform some computation on a DataFrame and normally return the result to the driver. Also, as with RDDs, DataFrame transformations are lazy: nothing is computed until an action is called.
In this recipe, we will review the most common transformations.
Getting ready
To execute this recipe, you need to have a working Spark 2.3 environment. You should have gone through the Specifying schema programmatically recipe, as we will be using the sample_data_schema DataFrame we created there.
There are no other requirements.
How to do it...
In this section, we will list some of the most common transformations available for DataFrames. The purpose of this list is not to provide a comprehensive enumeration of all available transformations, but to give you some intuition behind the most common ones.
The .select(...) transformation
The .select(...) transformation...