Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Learning YARN

You're reading from   Learning YARN Moving beyond MapReduce - learn resource management and big data processing using YARN

Arrow left icon
Product type Paperback
Published in Aug 2015
Publisher
ISBN-13 9781784393960
Length 278 pages
Edition 1st Edition
Tools
Arrow right icon
Toc

Table of Contents (14) Chapters Close

Preface 1. Starting with YARN Basics FREE CHAPTER 2. Setting up a Hadoop-YARN Cluster 3. Administering a Hadoop-YARN Cluster 4. Executing Applications Using YARN 5. Understanding YARN Life Cycle Management 6. Migrating from MRv1 to MRv2 7. Writing Your Own YARN Applications 8. Dive Deep into YARN Components 9. Exploring YARN REST Services 10. Scheduling YARN Applications 11. Enabling Security in YARN 12. Real-time Data Analytics Using YARN Index

The YARN architecture

In the previous topic, we discussed the YARN components. Here we'll discuss the high-level architecture of YARN and look at how the components interact with each other.

The YARN architecture

The ResourceManager service runs on the master node of the cluster. A YARN client submits an application to the ResourceManager. An application can be a single MapReduce job, a directed acyclic graph of jobs, a java application, or any shell script. The client also defines an ApplicationMaster and a command to start the ApplicationMaster on a node.

The ApplicationManager service of resource manager will validate and accept the application request from the client. The scheduler service of resource manager will allocate a container for the ApplicationMaster on a node and the NodeManager service on that node will use the command to start the ApplicationMaster service. Each YARN application has a special container called ApplicationMaster. The ApplicationMaster container is the first container of an application.

The ApplicationMaster requests resources from the ResourceManager. The RequestRequest will have the location of the node, memory, and CPU cores required. The ResourceManager will allocate the resources as containers on a set of nodes. The ApplicationMaster will connect to the NodeManager services and request NodeManager to start containers. The ApplicationMaster manages the execution of the containers and will notify the ResourceManager once the application execution is over. Application execution and progress monitoring is the responsibility of ApplicationMaster rather than ResourceManager.

The NodeManager service runs on each slave of the YARN cluster. It is responsible for running application's containers. The resources specified for a container are taken from the NodeManager resources. Each NodeManager periodically updates ResourceManager for the set of available resources. The ResourceManager scheduler service uses this resource matrix to allocate new containers to ApplicationMaster or to start execution of a new application.

You have been reading a chapter from
Learning YARN
Published in: Aug 2015
Publisher:
ISBN-13: 9781784393960
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime