Architecture of HDFS Federation
HDFS Federation allows multiple NameNodes to run in a single cluster. These NameNodes are independent of one another: each manages its own namespace, and none depends on any other. The DataNodes, however, are shared by all the NameNodes in the system. The NameNodes are said to be federated because they operate independently, without coordinating with one another.
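To make the independence of the NameNodes concrete, here is a minimal toy model in Python. It is not Hadoop code: the `NameNode` class, its methods, and the names `nn-sales` and `nn-logs` are hypothetical, chosen only for illustration. It shows two NameNodes that hold no reference to each other, so the same path can exist in both namespaces without conflict.

```python
class NameNode:
    """Toy stand-in for a NameNode (hypothetical, not the Hadoop class).

    Each instance owns exactly one namespace and knows nothing about
    any other NameNode in the cluster.
    """

    def __init__(self, name):
        self.name = name
        self.namespace = {}  # path -> list of block IDs

    def create_file(self, path, block_ids):
        # A path only has to be unique within this NameNode's namespace.
        if path in self.namespace:
            raise FileExistsError(f"{path} already exists in {self.name}")
        self.namespace[path] = list(block_ids)

    def list_paths(self):
        return sorted(self.namespace)


# Two federated NameNodes; neither holds a reference to the other.
nn_sales = NameNode("nn-sales")
nn_logs = NameNode("nn-logs")

# The same path is created in both namespaces without any coordination.
nn_sales.create_file("/data/2024/report.csv", ["blk_1", "blk_2"])
nn_logs.create_file("/data/2024/report.csv", ["blk_9"])
```

Creating the file a second time on `nn_sales` would raise `FileExistsError`, while `nn_logs` is unaffected by anything that happens in the other namespace.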
Each DataNode sends heartbeats and block reports to all the NameNodes in the cluster, and receives instructions from all of them. The DataNodes remain the common, shared storage resource of the cluster and still run on commodity hardware. However, they serve different NameNodes and, in turn, different namespaces. These independent namespaces provide isolation in a multitenant environment. Running many NameNodes lets the cluster scale horizontally, because the request load is spread across the NameNodes according to the namespace each one manages.
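The reporting pattern described above can be sketched as a small Python simulation. This is an illustrative model, not Hadoop's implementation: the classes, method names, and node names are all hypothetical. It also assumes that each DataNode groups its blocks by the NameNode they belong to (HDFS calls such a group a block pool), so that each NameNode learns only about the blocks of its own namespace.

```python
from collections import defaultdict


class NameNode:
    """Toy NameNode (hypothetical): tracks live DataNodes and block locations."""

    def __init__(self, name):
        self.name = name
        self.live_datanodes = set()
        self.block_locations = defaultdict(set)  # block ID -> DataNode IDs

    def heartbeat(self, datanode_id):
        self.live_datanodes.add(datanode_id)

    def block_report(self, datanode_id, block_ids):
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode_id)


class DataNode:
    """Toy DataNode (hypothetical): shared storage that serves every NameNode."""

    def __init__(self, dn_id, namenodes):
        self.dn_id = dn_id
        self.namenodes = namenodes
        # Assumption: blocks are grouped by owning NameNode.
        self.blocks = defaultdict(set)  # NameNode name -> block IDs

    def store_block(self, namenode_name, block_id):
        self.blocks[namenode_name].add(block_id)

    def report_to_all(self):
        # Heartbeat to every NameNode, but report only that NameNode's blocks.
        for nn in self.namenodes:
            nn.heartbeat(self.dn_id)
            nn.block_report(self.dn_id, self.blocks[nn.name])


nn_a, nn_b = NameNode("nn-a"), NameNode("nn-b")
dn1 = DataNode("dn-1", [nn_a, nn_b])
dn2 = DataNode("dn-2", [nn_a, nn_b])

dn1.store_block("nn-a", "blk_1")
dn1.store_block("nn-b", "blk_7")
dn2.store_block("nn-a", "blk_1")  # a replica of blk_1 on a second DataNode

for dn in (dn1, dn2):
    dn.report_to_all()
```

After the reports, both NameNodes consider `dn-1` and `dn-2` alive, but `nn-a` knows only the locations of `blk_1` and `nn-b` knows only the location of `blk_7`. The storage is shared while the namespace metadata stays separate.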
The following diagram shows the architecture of a federated HDFS...