Debugging Spark applications

In this section, we will see how to debug Spark applications running locally (in Eclipse or IntelliJ), in standalone mode, or in cluster mode on YARN or Mesos. Before diving deeper, however, it is necessary to understand logging in a Spark application.

Logging with log4j with Spark: a recap
We have already discussed this topic in Chapter 14, Time to Put Some Order - Cluster Your Data with Spark MLlib. However, let's revisit the same content to align it with the current discussion of debugging Spark applications. As stated earlier, Spark uses log4j for its own logging. If Spark is configured properly, it logs all operations to the shell console. A sample snapshot of the...
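Spark's console logging is controlled through a log4j configuration file. As a minimal sketch (the property names below follow the log4j 1.x conventions that Spark ships with; the exact template in your distribution is conf/log4j.properties.template), you can copy the template to conf/log4j.properties and adjust the root log level to reduce the INFO noise on the shell console:

```properties
# conf/log4j.properties (sketch, based on Spark's bundled log4j 1.x template)
# Set everything to be logged to the console at WARN level instead of INFO
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Quiet down particularly chatty third-party packages
log4j.logger.org.spark_project.jetty=WARN
log4j.logger.org.apache.parquet=ERROR
```

Alternatively, the log level can be changed programmatically for a single application, for example with Logger.getLogger("org").setLevel(Level.ERROR) from the org.apache.log4j package, or with spark.sparkContext.setLogLevel("WARN") once a SparkSession is available.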