Configuring YARN for performance
Another important component to tune is the YARN framework. Until now, we have concentrated on the HDFS/storage layer, but we need to tune the scheduler and the compute layer as well.
In this recipe, we will see which important properties to take care of and how they can be optimized. To get a picture of the YARN layout and to correlate things better, please refer to the following diagram:
Getting ready
Make sure that the user has a running cluster with HDFS and YARN configured. The user must be able to execute HDFS and YARN commands. Please refer to Chapter 1, Hadoop Architecture and Deployment, for Hadoop installation and configuration.
How to do it...
Connect to the Namenode master1.cyrus.com and switch to the hadoop user.
The important file for this recipe is yarn-site.xml, and all the parameters in the following steps will be part of it.
The memory on the system after accounting for the operating system, any daemons like Namenode or Datanodes, and HBase regions...
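The subtraction described in this step can be sketched as follows. This is only an illustration: the function name memory_left_for_yarn_mb and every figure in it are hypothetical, and it assumes the remainder is what would be offered to YARN containers (for example through a property such as yarn.nodemanager.resource.memory-mb in yarn-site.xml, which is an assumption rather than something stated here).

```python
def memory_left_for_yarn_mb(total_mb, os_reserved_mb, daemons_mb, hbase_mb=0):
    """Return the memory (in MB) left once other consumers are subtracted.

    All arguments are in megabytes. The function name and the split into
    these three consumers are illustrative assumptions, not a Hadoop API.
    """
    remaining = total_mb - os_reserved_mb - daemons_mb - hbase_mb
    if remaining <= 0:
        raise ValueError("no memory left for YARN containers")
    return remaining


# Hypothetical 64 GB node; every number below is an assumed example value.
available = memory_left_for_yarn_mb(
    total_mb=64 * 1024,       # physical RAM on the node
    os_reserved_mb=4 * 1024,  # operating system and system services
    daemons_mb=2 * 1024,      # Hadoop daemons such as the Datanode
    hbase_mb=8 * 1024,        # HBase, if it runs on the same node
)
print(available)  # prints 51200
```

On a real cluster the inputs should come from measured usage on each node rather than from fixed guesses like these.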