Building the Uber JAR
The first step in deploying our Spark application on a cluster is to bundle it into a single Uber JAR, also known as the assembly JAR. In this recipe, we'll look at how to use the SBT assembly plugin to generate the assembly JAR, which we'll reuse in subsequent recipes when we run Spark in distributed mode. Alternatively, we could point Spark at our dependent JARs using the spark.driver.extraClassPath property (https://spark.apache.org/docs/1.3.1/configuration.html#runtime-environment), but this becomes inconvenient when there are many dependent JARs.
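To illustrate why this alternative scales poorly, here is a sketch of what passing extra classpath entries at submission time might look like. The JAR names and paths are hypothetical; every dependency (and its transitive dependencies) must be listed and present on the driver machine:

```shell
# Hypothetical example: supplying dependencies via extraClassPath
# instead of an assembly JAR. Paths and JAR names are illustrative only.
spark-submit \
  --class com.example.MyApp \
  --conf "spark.driver.extraClassPath=/opt/libs/dep1.jar:/opt/libs/dep2.jar" \
  myapp.jar
```

With dozens of dependencies this list quickly becomes unmanageable, which is why bundling everything into a single assembly JAR is the preferred approach.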
How to do it...
The goal of building the assembly JAR is to produce a single fat JAR that contains our Spark application along with all of its dependencies. Refer to the following screenshot, which shows the innards of an assembly JAR. You can see not only the application's files in the JAR, but also all the packages and files of the dependent libraries:
The assembly JAR can easily be built in SBT using the SBT assembly plugin...
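As a minimal sketch of the setup (plugin version and merge rules are assumptions, not taken from this recipe), the plugin is registered in project/plugins.sbt and the build is configured to exclude Spark itself, since the cluster already provides it:

```scala
// project/plugins.sbt — version is illustrative; pick one matching your sbt release
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

// build.sbt
name := "my-spark-app"

// Mark Spark as "provided" so it is excluded from the assembly JAR;
// the cluster supplies Spark's classes at runtime.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"

// Dependent libraries often ship conflicting files under META-INF;
// discard those and keep the first copy of any other duplicate.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case _                        => MergeStrategy.first
}
```

Running `sbt assembly` then produces the fat JAR under the target directory (for example, target/scala-2.10/my-spark-app-assembly-1.0.jar), ready to be passed to spark-submit.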