We have seen the architecture of Apache Hadoop in Chapter 1, Hadoop 3.0 - Background and Introduction. In this section, we will go through the High Availability (HA) feature of Apache Hadoop. HDFS already provides high availability of data through its replication factor. However, in the earlier Apache Hadoop 1.x, the NameNode was a single point of failure because it was the central gateway for accessing data blocks. Similarly, the Resource Manager is responsible for managing resources for MapReduce and YARN applications. We will study both of these components with respect to high availability.
High availability of Hadoop
High availability for NameNode
We have understood the challenges faced with Hadoop 1.x, so now let's understand...