Sizing the cluster as per SLA
In this recipe, we will look at how service-level agreements (SLAs) can affect how we size a cluster. Organizations often run multitenant clusters that are funded by different business units, and each unit expects a guarantee for its share of the resources.
A good thing about YARN is that multiple users can run different kinds of jobs, such as MapReduce, Hive, Pig, HBase, and Spark, on the same cluster. While YARN guarantees the resources a job needs to start, it does not control how the job will finish. Users can still step on each other and put SLAs at risk.
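To make the idea of a guaranteed share concrete, here is a minimal sketch that turns a funding split into a per-tenant memory guarantee. The cluster size, the business unit names, and the percentages are all hypothetical values chosen for illustration; none of them come from this recipe, and the helper `guaranteed_memory_gb` is not part of any Hadoop API.

```python
# Hypothetical figures for illustration only; none come from the recipe.
CLUSTER_MEMORY_GB = 1024  # assumed total memory available to YARN

# Assumed funding split between business units, as percentages of the cluster.
funding_share_pct = {"finance": 50, "marketing": 30, "research": 20}


def guaranteed_memory_gb(total_gb, shares_pct):
    """Return the memory each tenant is guaranteed, in GB."""
    if sum(shares_pct.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    return {tenant: total_gb * pct / 100 for tenant, pct in shares_pct.items()}


guarantees = guaranteed_memory_gb(CLUSTER_MEMORY_GB, funding_share_pct)
for tenant, gb in guarantees.items():
    print(f"{tenant}: {gb:.0f} GB guaranteed")
```

A calculation like this only fixes the starting allocation for each tenant. As noted above, it says nothing about how long a job takes to finish once it is running, which is why the SLA still has to be validated with real workloads.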
Getting ready
For this recipe, you should have completed the Memory requirements and Nodes needed in the cluster recipes. It helps to have a running cluster with HDFS and YARN so that you can run quick commands for reference. It also helps to understand the scheduler recipes covered in Chapter 5, Schedulers.
How to do it...
- Connect to the master1.cyrus.com master node and switch to the user hadoop.
- Run a teragen and terasort on the cluster using...