Spark components
Let’s look at the inner workings of each Spark component to understand the role it plays in efficient distributed data processing.
Spark driver
The Spark driver is the central coordinator of a Spark application. Spark follows what is commonly known as the master-worker architecture: consider the Spark driver as the master and the Spark executors as the workers. The driver has control and knowledge of all the executors at any given time. It is the driver's responsibility to know how many executors are present and whether any executor has failed, so that work assigned to a failed executor can be rescheduled on another one. To do this, the driver stays in communication with the executors for the whole lifetime of the application. The driver runs as a single process, typically on the master node of a cluster or, for a local run, on the one machine involved. When a Spark application starts running, the driver keeps track of all the information needed to run the application successfully.
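To make the driver's bookkeeping role more concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's implementation: the `ToyDriver` class, its method names, the executor and task IDs, and the 10-second timeout are all hypothetical and chosen only for illustration. It assumes executors report in periodically and that the driver treats a silent executor as failed and moves its tasks elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDriver:
    """Hypothetical stand-in for a driver; not part of any Spark API."""

    timeout: float  # seconds of silence before an executor counts as failed
    last_seen: dict = field(default_factory=dict)    # executor id -> last report time
    assignments: dict = field(default_factory=dict)  # task id -> executor id

    def heartbeat(self, executor_id, now):
        # An executor reports in; the driver records when it last heard from it.
        self.last_seen[executor_id] = now

    def live_executors(self, now):
        # Executors that reported within the timeout window, in sorted order.
        return sorted(
            e for e, seen in self.last_seen.items() if now - seen <= self.timeout
        )

    def assign(self, task_id, now):
        # Give the task to the live executor with the fewest tasks
        # (ties are broken by executor id).
        live = self.live_executors(now)
        if not live:
            raise RuntimeError("no live executors available")
        load = {e: 0 for e in live}
        for executor_id in self.assignments.values():
            if executor_id in load:
                load[executor_id] += 1
        chosen = min(live, key=lambda e: (load[e], e))
        self.assignments[task_id] = chosen
        return chosen

    def reschedule_lost(self, now):
        # Move every task owned by a failed executor onto a live one.
        live = set(self.live_executors(now))
        lost = [t for t, e in self.assignments.items() if e not in live]
        moved = {}
        for task_id in lost:
            del self.assignments[task_id]
            moved[task_id] = self.assign(task_id, now)
        return moved


driver = ToyDriver(timeout=10.0)
driver.heartbeat("exec-1", now=0.0)
driver.heartbeat("exec-2", now=0.0)
first = driver.assign("task-a", now=1.0)
second = driver.assign("task-b", now=1.0)

# exec-2 goes silent; only exec-1 keeps reporting.
driver.heartbeat("exec-1", now=12.0)
moved = driver.reschedule_lost(now=12.0)
print(moved)
```

In this sketch the driver is the only component that knows which executors exist and which tasks they hold, which is why it is also the one that notices a failure and reassigns the work.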
As shown in...