Resourcemanager HA using ZooKeeper
This recipe covers Resourcemanager (RM) high availability (HA). In a Hadoop cluster, if the RM goes offline for any reason, all the jobs on the cluster will fail. A production cluster often runs critical, long-running jobs, and restarting them from scratch because the RM failed is wasteful. Resourcemanager HA was introduced in Hadoop 2.4 and supports both manual and automatic failover.
As with the Namenode HA discussed in the earlier recipes, Resourcemanager HA has only one active node at any given time; the other RM stays in standby. Failover is initiated either manually by an admin command or automatically through ZooKeeper.
Resourcemanager HA can be configured with either the file system-based state store (FileSystemRMStateStore) or the ZooKeeper-based state store (ZKRMStateStore). In this recipe, we will configure automatic failover using ZKRMStateStore.
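To make the choice of state store concrete, the following is a minimal sketch of the yarn-site.xml properties that select ZKRMStateStore on a Hadoop 2.x cluster. The ZooKeeper hostnames are placeholders and not taken from this recipe; substitute the ensemble set up in the earlier ZooKeeper recipe. The fragment covers only the state store selection, not the full HA configuration.

```xml
<!-- yarn-site.xml fragment (sketch): store RM state in ZooKeeper -->
<property>
  <!-- Let a restarted or newly active RM recover application state -->
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Use the ZooKeeper-based state store -->
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
  <!-- Placeholder hostnames: replace with your own ZooKeeper ensemble -->
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
```

ZKRMStateStore is generally the better fit for HA because both RMs can reach the same ZooKeeper ensemble, and the store prevents two RMs from writing state at the same time.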
Getting ready
Before starting with this recipe, it is mandatory to complete the earlier recipe on ZooKeeper configuration and to make...