So far in this book, we have discussed how you can create your own Spark applications using RDDs and the DataFrame and Dataset APIs. We also covered basic Spark concepts, such as transformations, actions, caching, and repartitioning, which enable you to write efficient Spark applications. In this chapter, we'll discuss what happens under the hood when you run your Spark application, and we'll walk you through the tools and techniques available for monitoring your jobs. This chapter covers the following topics:
- Spark components and their respective roles in the application execution
- The life cycle of a Spark application
- Monitoring Spark applications