Configuring HDFS replication
For redundancy, it is important to have multiple copies of data. In HDFS, this is achieved by placing copies of blocks on different nodes. By default, the replication factor is 3, which means that for each block written to HDFS, there will be three copies in total on the nodes in the cluster.
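The cluster-wide default replication factor is controlled by the dfs.replication property in hdfs-site.xml. As a minimal sketch (the snippet file name here is illustrative only; on a real cluster this property lives in the hdfs-site.xml under your Hadoop configuration directory):

```shell
# Write a minimal hdfs-site.xml fragment setting the default block
# replication factor (dfs.replication) to 3, the HDFS default.
cat > hdfs-site-snippet.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
EOF
grep dfs.replication hdfs-site-snippet.xml
```

Individual files can also be written with a different replication factor, but this property sets the default for every new block.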
Before making any changes, make sure that the cluster is healthy and that the user can perform file operations on it.
Getting ready
Log in to any of the nodes in the cluster. It is best to use the edge node, as stated in Chapter 1, and switch to the user hadoop.
Create a simple text file named file1.txt using any of your favorite text editors, and write some content in it.
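For example, a quick way to create the file from the command line (the contents are arbitrary; anything will do):

```shell
# Create the sample file used later in this recipe.
echo "This is a test file for HDFS replication." > file1.txt
cat file1.txt
```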
How to do it...
- ssh to the Namenode, which in this case is nn1.cluster1.com, and switch to the user hadoop.
- Navigate to the /opt/cluster/hadoop/etc/hadoop directory. This is the directory where we installed Hadoop in Chapter 1, Hadoop Architecture and Deployment. If the user has installed it at a different location...
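These first two steps can be sketched as follows; the hostname nn1.cluster1.com and the install path /opt/cluster/hadoop come from the Chapter 1 setup, so substitute your own values if they differ:

```shell
# Step 1 (run from the edge node): log in to the Namenode as user hadoop.
# ssh hadoop@nn1.cluster1.com

# Step 2: locate the Hadoop configuration directory; adjust HADOOP_HOME
# if Hadoop was installed somewhere other than /opt/cluster/hadoop.
HADOOP_HOME=${HADOOP_HOME:-/opt/cluster/hadoop}
CONF_DIR="$HADOOP_HOME/etc/hadoop"
echo "$CONF_DIR"
```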