When we set up a Hadoop cluster, Hadoop creates a virtual layer on top of your local filesystem (such as a Windows- or Linux-based filesystem). As you might have noticed, HDFS does not map to any physical filesystem on operating system, but Hadoop offers abstraction on top of your Local FS to provide a fault-tolerant distributed filesystem service with HDFS. The overall design and access pattern in HDFS is like a Linux-based filesystem. The following diagram shows the high-level architecture of HDFS:
We have covered NameNode, Secondary Name, and DataNode in Chapter 1, Hadoop 3.0 - Background and Introduction Each file sent to HDFS is sliced into a number of blocks that need to be distributed. The NameNode maintains the registry (or name table) of all of the nodes present in the data in the local filesystem path specified with dfs.namenode.name.dir in hdfs-site...