Dataset interfaces and functions
Now let's work through a few interesting examples, starting with a simple one and then moving on to progressively more complex operations.
Tip
The code files are in fdps-v3/code, and the data files are in fdps-v3/data. You can run the code either from a Scala IDE or just from the Spark Shell.
Start the Spark Shell from the bin directory of your Spark installation:
/Volumes/sdxc-01/spark-2.0.0/bin/spark-shell
Inside the shell, the following command will load the source:
:load /Users/ksankar/fdps-v3/code/DS01.scala
Read/write operations
As we saw earlier, SparkSession.read.* gives us a rich set of features to read different types of data with flexible control over the options. Dataset.write.* does the same for writing data:
val spark = SparkSession.builder
  .master("local")
  .appName("Chapter 9")
  .config("spark.logConf","true")
  .config("spark.logLevel","ERROR")
  .getOrCreate...
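To make the read/write symmetry concrete, here is a minimal sketch of a round trip: a small Dataset is written out with Dataset.write and read back with SparkSession.read. This is not the chapter's DS01.scala. The object name ReadWriteSketch, the two sample rows, and the temporary output directory are assumptions made up for illustration, and CSV with a header row is just one of the available formats.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical helper object; not part of the book's code files.
object ReadWriteSketch {

  // Writes a tiny in-memory DataFrame as CSV, reads it back,
  // and returns the number of rows that were read.
  def roundTrip(): Long = {
    val spark = SparkSession.builder
      .master("local")
      .appName("ReadWriteSketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows, used only to have something to write.
    val cars = Seq(("Mazda RX4", 21.0), ("Datsun 710", 22.8)).toDF("model", "mpg")

    // Assumed output location: a fresh temporary directory.
    val outDir = Files.createTempDirectory("fdps-sketch").resolve("cars-csv").toString

    // Dataset.write.*: options first, then the format-specific method.
    cars.write
      .option("header", "true")
      .mode(SaveMode.Overwrite)
      .csv(outDir)

    // SparkSession.read.*: the same option style, mirrored for reading.
    val readBack = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(outDir)

    readBack.printSchema()
    readBack.show()
    readBack.count()
  }

  def main(args: Array[String]): Unit = {
    println(s"Rows read back: ${roundTrip()}")
  }
}
```

From the Spark Shell, after loading a file containing this object, calling ReadWriteSketch.roundTrip() should reuse the shell's existing session, since getOrCreate returns a session that is already running. Swapping .csv for .parquet or .json on both sides follows the same pattern.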