Configuring HDFS and YARN logs
In this recipe, we will configure logs for HDFS and YARN, which are very important for troubleshooting and diagnosing job failures.
For larger clusters, it is important to manage logs in terms of disk space usage, ease of retrieval, and performance. It is always recommended to store logs on separate hard disks, preferably RAIDed disks for performance. The disks that the Namenode or Datanodes use for metadata or HDFS blocks must not be shared with logs.
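One quick way to confirm that a log directory does not share a disk with a metadata or block directory is to compare the devices backing each path. The following is a minimal sketch, not part of the original recipe: the helper names device_of and same_device are hypothetical, and the paths in the usage comment are placeholders to be replaced with the cluster's actual directories.

```shell
# Hypothetical helper: print the device (filesystem) that backs a directory.
# df -P forces the POSIX single-line output format, so field 1 of the
# second line is the device name.
device_of() {
  df -P "$1" | awk 'NR==2 {print $1}'
}

# Hypothetical helper: succeed (exit 0) when both paths sit on the same device.
same_device() {
  [ "$(device_of "$1")" = "$(device_of "$2")" ]
}

# Example usage (placeholder paths, adjust to the actual layout):
#   if same_device /var/log/hadoop /data/hdfs; then
#     echo "WARNING: logs share a device with HDFS data" >&2
#   fi
```

This check compares filesystems only. Two partitions on the same physical disk would still pass it, so the disk layout should also be verified separately.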
Getting ready
To complete this recipe, the user must have a running cluster with HDFS and YARN configured, and should have worked through Chapter 1, Hadoop Architecture and Deployment, and Chapter 2, Maintain Hadoop Cluster HDFS, to understand things better.
How to do it...
Connect to the master1.cyrus.com master node in the cluster and switch to user hadoop.
By default, the location of HDFS and YARN logs is defined by the settings $HADOOP_HOME/logs and $YARN_LOG_DIR/logs in the files hadoop-env.sh and yarn-env.sh...
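As an illustration of how these locations can be overridden, the sketch below sets the HADOOP_LOG_DIR and YARN_LOG_DIR environment variables, which the Hadoop daemon scripts read. The /var/log/hadoop paths are assumptions chosen for the example, not values from this recipe. The directories must already exist and be writable by the hadoop user.

```shell
# In hadoop-env.sh: directory for HDFS daemon logs.
# The path is an illustrative assumption, ideally on a disk dedicated to logs.
export HADOOP_LOG_DIR=/var/log/hadoop/hdfs

# In yarn-env.sh: directory for YARN daemon logs.
# Again an assumed path, kept separate so HDFS and YARN logs do not mix.
export YARN_LOG_DIR=/var/log/hadoop/yarn
```

The daemons pick up a changed log directory only after a restart.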