Summary
In this chapter, we discussed SparkSession, which is the single entry point for Spark in 2.x versions. We talked about the unification of the Dataset and DataFrame APIs. Then, we created a Dataset from an RDD and discussed various Dataset operations with examples. We also learnt how to execute Spark SQL operations on a Dataset by creating temporary views. Finally, we learnt how to create UDFs in Spark SQL with examples.
In the next chapter, we will learn how to process real-time streams with Spark.