Chapter 2. YARN Architecture
This chapter dives deep into the YARN architecture, its core components, and how they interact to deliver better resource utilization, performance, and manageability. It also introduces some important YARN terminology.
In this chapter, we will cover the following topics:
- Core components of YARN architecture
- Interaction and flow of YARN components
- ResourceManager scheduling policies
- Recent developments in YARN
The motivation behind the YARN architecture is to support data processing models beyond MapReduce, such as Apache Spark, Apache Storm, Apache Giraph, Apache HAMA, and so on. YARN provides a platform for developing and executing distributed processing applications. It also improves cluster efficiency and resource sharing.
The key design decision behind the YARN architecture is to split the two major responsibilities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons, that is, a cluster-level ResourceManager...