In this section, we will run Apache Spark in local mode or standalone mode. First we will set up Scala, which is the prerequisite for Apache Spark. After the Scala setup, we will set up and run Apache Spark. We will also perform some basic operations on it. So let's start.
Since Apache Spark is written in Scala, it needs Scala to be set up on the system. You can download Scala from http://www.scala-lang.org/download/ (we will set up Scala 2.11.8 in the following examples).
Once Scala is downloaded, we can set it up on a Linux system as follows:
Also, it is recommended to set the SCALA_HOME environment variable and add Scala binaries to the PATH variable. You can set it in the .bashrc file or /etc/environment file as follows:
export SCALA_HOME=/usr/local/scala-2.11.8
export PATH=$PATH:/usr/local/scala-2.11.8/bin
It is also shown in the following...