Nodes needed in the cluster
In this recipe, we will work out the number of nodes needed in the cluster based on its storage requirements.
In the earlier Disk space calculations recipe, we estimated that we need about 2 PB of storage for our cluster. Here, we will estimate the number of nodes required to run a stable Hadoop cluster with that capacity.
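Before touching the cluster, it helps to sketch the arithmetic. The figures below are illustrative assumptions, not values from the recipe: 12 disks of 4 TB per Datanode and the HDFS default replication factor of 3 applied to the 2 PB estimate. Real sizing would also reserve headroom for intermediate data and non-DFS use.

```shell
usable_pb=2       # usable storage needed, from the Disk space calculations recipe
replication=3     # HDFS default replication factor
disks_per_node=12 # assumption: disks per Datanode
disk_tb=4         # assumption: TB per disk

raw_tb=$(( usable_pb * replication * 1024 ))   # raw capacity needed, in TB
node_tb=$(( disks_per_node * disk_tb ))        # raw capacity per Datanode, in TB
nodes=$(( (raw_tb + node_tb - 1) / node_tb ))  # ceiling division

echo "$nodes Datanodes needed"   # prints: 128 Datanodes needed
```

With these assumptions, 2 PB of usable space triples to 6 PB raw, which at 48 TB per node comes to 128 Datanodes; plug in your own disk counts and sizes.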
Getting ready
To step through this recipe, the user must understand the Hadoop cluster daemons and their roles. It is recommended to have a running cluster with healthy HDFS and at least two Datanodes.
How to do it...
- Connect to the master node master1.cyrus.com in the cluster and switch to the user hadoop.
- Execute the following command to see the Datanodes available and the disk space on each node:
$ hdfs dfsadmin -report
- From the preceding command, we can tell the storage available per node, but we cannot tell the number of disks that make up that storage. Refer to the following screenshot for details:
- Log in to a Datanode, dn6.cyrus.com
...