Programming Spark transformations and actions
In this section, we will use the functions exposed by the RDD API to analyze our Chicago crime dataset. We will start with simple operations and move on to more complex transformations. First, let's define some base classes, and then we will develop our transformation logic.
Perform the following steps to write the basic building blocks:
1. We will extend our Spark-Examples project and create a new Scala class named chapter.seven.ScalaCrimeUtil.scala. This class will contain some utility functions that will be used by our main transformation job.
2. Open and edit ScalaCrimeUtil.scala and add the following piece of code:

package chapter.seven

class ScalaCrimeUtil extends Serializable {
  /**
   * Create a Map of the data which is extracted by applying Regular expression.
   */
  def createDataMap(data: String): Map[String, String] = {
    //Replacing Empty columns with the blank Spaces,
    //so that split function...
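The listing above is cut off, so as a point of comparison, here is a minimal standalone sketch of the same idea: turning one comma-separated line into a Map keyed by column name. It is not the book's createDataMap. The object name CrimeLineSketch, the function toDataMap, and the five column names are hypothetical and chosen only for illustration. Instead of replacing empty columns with blank spaces, this sketch passes a limit of -1 to split so that trailing empty fields are kept.

```scala
object CrimeLineSketch {
  // Hypothetical column names for illustration only;
  // the real dataset has more columns than these.
  val columns: Seq[String] =
    Seq("id", "caseNumber", "date", "description", "arrest")

  // Matches only the commas that sit outside double quotes, so a quoted
  // field such as "RETAIL, OVER 500" stays in one piece.
  private val separator = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)"

  def toDataMap(line: String): Map[String, String] = {
    // A limit of -1 tells split to keep trailing empty fields
    // instead of silently dropping them.
    val fields = line
      .split(separator, -1)
      .map(_.trim.stripPrefix("\"").stripSuffix("\""))

    // Pair each column name with its value; extra fields are ignored.
    columns.zip(fields).toMap
  }
}
```

For a made-up line such as 1,XX000001,01/01/2015,"RETAIL, OVER 500", (note the trailing comma), calling CrimeLineSketch.toDataMap should give a map with five entries in which the last column, arrest, holds an empty string. Because the object has no Spark dependency, it can be tried in the Scala REPL before being used inside a transformation.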