Understanding where YARN fits into Hadoop
Looking back at the Hadoop 1.x architecture in the first figure of this chapter, the JobTracker was mainly responsible for the following:
Managing the computational resources in terms of map and reduce slots
Scheduling submitted jobs
Monitoring the executions of the TaskTrackers
Restarting failed tasks
Performing speculative execution of tasks
Calculating the job counters
Clearly, the JobTracker alone handles both cluster resource management and the execution of every job, so it is overloaded with work.
This overloading led to the redesign of the JobTracker. YARN reduces its responsibilities by splitting them in the following ways:
Cluster resource management and scheduling responsibilities were moved to the global ResourceManager (RM)
Application life cycle management, that is, job execution and monitoring, was moved into a per-application ApplicationMaster (AM)
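The split described above can be illustrated with a small toy model. This is only a sketch of the division of responsibilities and does not use the YARN API: the class names `ToyResourceManager` and `ToyApplicationMaster`, the container counts, and the `fails_once` parameter are all hypothetical and chosen for illustration. The RM object only knows about cluster capacity, while each AM object runs, monitors, and retries the tasks of a single application.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical cluster-wide role: tracks free capacity, grants containers."""
    free_containers: int

    def allocate(self, requested: int) -> int:
        """Grant as many containers as are free, up to the request."""
        granted = min(requested, self.free_containers)
        self.free_containers -= granted
        return granted

    def release(self, count: int) -> None:
        """Return containers to the cluster-wide pool."""
        self.free_containers += count


@dataclass
class ToyApplicationMaster:
    """Hypothetical per-application role: runs and monitors one job's tasks."""
    name: str
    rm: ToyResourceManager
    pending: list = field(default_factory=list)
    completed: list = field(default_factory=list)

    def run(self, fails_once=()):
        """Run pending tasks; a task listed in fails_once fails on its first try."""
        already_failed = set()
        while self.pending:
            granted = self.rm.allocate(len(self.pending))
            if granted == 0:
                break  # no capacity; a real AM would wait and ask again
            batch, self.pending = self.pending[:granted], self.pending[granted:]
            for task in batch:
                if task in fails_once and task not in already_failed:
                    already_failed.add(task)
                    self.pending.append(task)  # the AM, not the RM, retries it
                else:
                    self.completed.append(task)
            self.rm.release(granted)


# One RM for the whole cluster, one AM per application (assumed sizes).
rm = ToyResourceManager(free_containers=2)
am_a = ToyApplicationMaster("job-a", rm, pending=["m1", "m2", "r1"])
am_b = ToyApplicationMaster("job-b", rm, pending=["m1"])
am_a.run(fails_once={"m2"})
am_b.run()
print(am_a.completed, am_b.completed, rm.free_containers)
```

In this sketch the RM never sees individual tasks, and a task failure in one application is handled entirely by that application's AM. That is the property that relieves the single overloaded JobTracker of Hadoop 1.x.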
The global ResourceManager is shown in the following image:
If you look at the preceding...