The training job executed by spark-submit needs to resolve all of its required dependencies at runtime. To manage this, we will build an uber-JAR that bundles the application code together with its runtime dependencies. The build is configured through the Maven settings in pom.xml, and the resulting uber-JAR is then passed to spark-submit to run the distributed training job in Spark.
In this recipe, we will create an uber-JAR using the Maven shade plugin for Spark training.
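The following pom.xml fragment is a minimal sketch of a Maven Shade plugin setup for this purpose; the plugin version and the signature-file filter are illustrative assumptions rather than values taken from the recipe, and should be adjusted to your project:

```xml
<!-- Sketch only: version and filters are assumptions; adapt to your project -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.4.1</version>
  <executions>
    <execution>
      <!-- Bind the shade goal to the package phase so `mvn package` builds the uber-JAR -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- Strip signature files that would otherwise invalidate the merged JAR -->
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With the Spark dependencies typically declared with `provided` scope (since the cluster supplies them at runtime), running `mvn package` produces the uber-JAR under `target/`, which can then be submitted along the lines of `spark-submit --class com.example.TrainApp --master yarn target/my-app-1.0.jar`, where the class and artifact names are placeholders for your own application.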