Disk space calculations
In this recipe, we will calculate the disk storage needed for the Hadoop cluster. Once we know our storage requirement, we can plan the number of nodes in the cluster and narrow down the hardware options.
The intent of this recipe is not to tune performance, but to plan for capacity. Readers are encouraged to read Chapter 9, HBase Administration, for guidance on optimizing the Hadoop cluster.
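To give a feel for the kind of arithmetic involved, here is a minimal sketch of a capacity estimate. It is not the calculation from this recipe: the function names `required_raw_tb` and `nodes_needed` are hypothetical, and the data size, headroom percentage, and per-node disk size are illustrative assumptions. The only value taken from HDFS itself is the default replication factor of 3 (`dfs.replication`).

```shell
#!/usr/bin/env bash
# Rough raw-capacity estimate. All input numbers are illustrative assumptions.

# required_raw_tb DATA_TB REPLICATION HEADROOM_PCT
#   DATA_TB       expected data size in TB, before replication
#   REPLICATION   HDFS replication factor (dfs.replication, 3 by default)
#   HEADROOM_PCT  share of raw disk kept free for non-HDFS use and growth (assumed)
required_raw_tb() {
  local data_tb=$1 replication=$2 headroom_pct=$3
  awk -v d="$data_tb" -v r="$replication" -v h="$headroom_pct" \
    'BEGIN { printf "%.1f\n", (d * r) / (1 - h / 100) }'
}

# nodes_needed RAW_TB DISK_PER_NODE_TB
#   Divides the raw requirement by the disk per node and rounds up.
nodes_needed() {
  local raw_tb=$1 per_node_tb=$2
  awk -v t="$raw_tb" -v n="$per_node_tb" \
    'BEGIN { q = t / n; c = int(q); if (c < q) c++; print c }'
}

# Example: 100 TB of data, replication 3, 25% headroom, 48 TB of disk per node.
raw=$(required_raw_tb 100 3 25)
echo "Raw storage needed: ${raw} TB"
echo "DataNodes needed:   $(nodes_needed "$raw" 48)"
```

With these assumed inputs, the sketch reports 400.0 TB of raw storage and 9 DataNodes. Substitute your own figures; the headroom percentage in particular depends on the workload.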
Getting ready
To step through the recipe in this section, we need a Hadoop cluster set up and running. At a minimum, HDFS must be configured correctly. It is recommended to complete the first two chapters before starting this recipe.
How to do it...
- Connect to the master1.cyrus.com master node in the cluster and switch to the user hadoop.
- On the master node, execute the following command:
$ hdfs dfsadmin -report
This command shows how storage in the cluster is reported. The total cluster storage is the sum of the storage from each of the...