Introduction
In the previous chapters, we learned about the storage layer, HDFS: how to configure it and what its different components are. We mainly talked about the Namenode, the Datanode, and the concepts behind them.
In this chapter, we will take a look at the processing layer, MapReduce, and the resource management framework, YARN. Prior to Hadoop 2.x, MapReduce was the only processing layer for Hadoop. The introduction of YARN as a resource management framework made the processing layer pluggable, so it could be MapReduce, Spark, and so on.
Note
While the recipes in this chapter will give you an overview of a typical configuration, we encourage you to adapt them to your needs. The deployment directory structure varies according to the IT policies within an organization.