Spark configuration
There are several ways to configure your Spark jobs, and we will discuss them in this section. More specifically, as of the Spark 2.x release, the system can be configured in three places:
- Spark properties
- Environment variables
- Logging
Spark properties
As discussed previously, Spark properties control most of the application-specific parameters and can be set using a SparkConf object. Alternatively, these parameters can be set through Java system properties. SparkConf allows you to configure some of the common properties as follows:
setAppName()      // Set the application name
setMaster()       // Set the master URL
setSparkHome()    // Set the location where Spark is installed on worker nodes
setExecutorEnv()  // Set single or multiple environment variables to be used when launching executors
setJars()         // Set JAR files to distribute to the cluster
setAll()          // Set multiple parameters together
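For example, a minimal sketch might chain these setters to build a SparkConf and then create a SparkContext from it. The application name, master URL, Spark home, environment variable, JAR path, and memory setting below are purely illustrative placeholders, not values required by Spark:

import org.apache.spark.{SparkConf, SparkContext}

// Build a configuration object; all concrete values here are examples only
val conf = new SparkConf()
  .setAppName("MySparkApp")                            // Application name shown in the UI and logs
  .setMaster("spark://master:7077")                    // Hypothetical standalone cluster master URL
  .setSparkHome("/usr/local/spark")                    // Spark installation path on the worker nodes
  .setExecutorEnv("SPARK_LOCAL_DIRS", "/tmp/spark")    // Environment variable passed to executors
  .setJars(Seq("/path/to/myApp.jar"))                  // JAR files to distribute to the cluster
  .setAll(Map("spark.executor.memory" -> "2g").toSeq)  // Set several properties at once

// Create the SparkContext from the configuration
val sc = new SparkContext(conf)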
An application can be configured to use a number of available cores on your machine...
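For instance, when running locally, the master URL controls how many cores the application uses. A brief sketch (the application name is illustrative):

// Use four cores of the local machine
val confFour = new SparkConf().setAppName("CoreDemo").setMaster("local[4]")

// Or use all cores available on the machine
val confAll = new SparkConf().setAppName("CoreDemo").setMaster("local[*]")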