Running Spark in standalone mode
Standalone mode means that Spark runs on its own built-in cluster manager, which ships with the Spark distribution, rather than on an externally provided cluster manager such as YARN or Mesos; a brief sketch of the daemons involved follows the list below. In this section, we'll look at the following key topics:
- Installing Spark standalone on a cluster
- Starting the cluster
  - Manually
  - Using launch scripts
- Launching an application in the cluster
- Monitoring
- Configuring high availability
  - Configuring standby masters with ZooKeeper
  - Recovery with the filesystem
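Before we get into the details, it helps to see what "Spark's own cluster manager" amounts to in practice: a master daemon plus one or more worker daemons, started with the scripts that ship in Spark's sbin directory. The following is only a quick sketch of that idea; the installation path and the master's hostname are placeholders, and note that Spark 2.x names the worker script start-slave.sh, while more recent releases call it start-worker.sh:

```
# On the node chosen as the cluster manager (placeholder install path /opt/spark):
cd /opt/spark
./sbin/start-master.sh                           # standalone master; web UI on port 8080 by default

# On each worker node, register the worker with the master
# (spark://master-host:7077 is a placeholder for the master's URL):
./sbin/start-slave.sh spark://master-host:7077
```

We'll walk through these steps, along with the equivalent cluster launch scripts, in the sections that follow.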
Installing Spark standalone on a cluster
In our example, we are going to build a 6-node Spark cluster and have a Windows machine submit Spark jobs to it. For the purpose of this exercise, we won't develop a new Spark program; instead, we'll submit the example applications that ship with the Spark framework. Our architecture looks like the following image:
Figure 8.2: Spark Standalone Cluster Deployment
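To give an idea of where we are heading, submitting one of the bundled examples, SparkPi in this case, to a standalone master looks roughly like the sketch below. The master URL and the examples JAR name are placeholders (the JAR under examples/jars/ carries the Scala and Spark versions in its name), and on the Windows client the equivalent bin\spark-submit.cmd script is used with the same arguments:

```
# Submit the SparkPi example that ships with Spark to the standalone master.
# spark://master-host:7077 is a placeholder for your master's URL.
$SPARK_HOME/bin/spark-submit \
  --master spark://master-host:7077 \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_*.jar 100   # 100 = number of partitions used for the Pi estimate
```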
The installation of a standalone cluster is very simple: you need to place...
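Assuming the step being described here is the usual one of placing the same pre-built Spark package on every node of the cluster, a rough sketch of one way to stage it is shown below; the download URL, Spark and Hadoop versions, node hostnames, and target directory are all illustrative placeholders:

```
# Illustrative only: fetch a pre-built Spark package and unpack the same build on every node.
wget https://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
for node in node1 node2 node3 node4 node5 node6; do
  scp spark-2.2.0-bin-hadoop2.7.tgz "$node":/opt/
  ssh "$node" 'cd /opt && tar -xzf spark-2.2.0-bin-hadoop2.7.tgz'
done
```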