Understanding application execution flow
A YARN application can be a simple shell script, a MapReduce job, or any group of jobs. This section covers the YARN application submission and execution flow. To manage application execution over YARN, a client needs to define an ApplicationMaster. The client then submits an application context to the ResourceManager. Based on the application's needs, the ResourceManager allocates memory for the ApplicationMaster and containers for application execution.
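The submission flow described above can be pictured with a small toy model. This is only a sketch under stated assumptions: it does not use the Hadoop client API, and the names `SubmissionContext` and `ToyResourceManager`, as well as all memory figures, are hypothetical. It also collapses allocation into a single call, whereas in YARN the ApplicationMaster requests its execution containers after it has started.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionContext:
    """Hypothetical stand-in for the application context a client submits."""
    name: str
    am_memory_mb: int          # memory requested for the ApplicationMaster
    container_memory_mb: int   # memory requested per execution container
    num_containers: int        # number of execution containers wanted


@dataclass
class ToyResourceManager:
    """Toy model of a ResourceManager that tracks one pool of free memory."""
    free_memory_mb: int
    allocations: list = field(default_factory=list)

    def _allocate(self, role: str, memory_mb: int) -> bool:
        # Grant the request only if enough free memory remains.
        if memory_mb > self.free_memory_mb:
            return False
        self.free_memory_mb -= memory_mb
        self.allocations.append((role, memory_mb))
        return True

    def submit(self, ctx: SubmissionContext) -> int:
        """Allocate the ApplicationMaster first, then as many containers as fit.

        Returns the number of execution containers granted.
        Simplification: real YARN grants containers in response to requests
        made by the running ApplicationMaster, not at submission time.
        """
        if not self._allocate("ApplicationMaster", ctx.am_memory_mb):
            raise RuntimeError("not enough memory for the ApplicationMaster")
        granted = 0
        for _ in range(ctx.num_containers):
            if not self._allocate("container", ctx.container_memory_mb):
                break
            granted += 1
        return granted


rm = ToyResourceManager(free_memory_mb=4096)
ctx = SubmissionContext("demo-app", am_memory_mb=1024,
                        container_memory_mb=1024, num_containers=4)
granted = rm.submit(ctx)
print(granted, rm.free_memory_mb)  # prints "3 0"
```

With 4096 MB free, the ApplicationMaster takes 1024 MB and only three of the four requested 1024 MB containers fit, which is why the application's stated needs and the cluster's free capacity together determine what gets allocated.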
The complete process of application execution can be broadly divided into six phases, as shown in the following figure:
Phase 1 – Application initialization and submission
In the first phase of application execution, a client connects to the applications manager service of the ResourceManager daemon and requests a new application ID. The ResourceManager validates the client request and, if the client is an authorized user, it will send a new and unique application...