Summary
Hadoop consists of a compute layer and a storage layer. In Hadoop 2.X, YARN has replaced the original compute layer, allowing other processing paradigms to coexist on the same Hadoop cluster hardware. The storage layer is making rapid progress toward a similar goal. Features such as HDFS Federation bring the storage layer one step closer to being generic. Coupling Block Storage only loosely to the Namespace could soon make this a reality.
The key takeaways from this chapter are as follows:
- With HDFS Federation, it is possible to run multiple NameNodes. This isolates namespaces from one another and can also improve performance by balancing load across the NameNodes. It also makes horizontal scaling of the NameNode easier.
- Block pools are the abstraction that facilitates federation. All blocks from a single Namespace belong to a single pool. Each pool is given an identifier so that it can be addressed. The DataNodes remain shared among the different NameNodes.
- In Hadoop 2.X, there are a number of different options to ensure NameNode recovery from failures....