Monitoring
Earlier in this chapter, we discussed Cloudera Manager as a visual monitoring tool and hinted that it could also be programmatically integrated with other monitoring systems. But before plugging Hadoop into any monitoring framework, it's worth considering just what it means to operationally monitor a Hadoop cluster.
Hadoop – where failures don't matter
Traditional systems monitoring tends to be quite binary; generally speaking, either something is working or it isn't. A host is alive or dead, and a web server is responding or it isn't. But in the Hadoop world, things are a little different; the important thing is service availability, and a service can still be treated as live even if particular pieces of hardware or software have failed. No Hadoop cluster should be in trouble if a single worker node fails. As of Hadoop 2, even the failure of a server process, such as the NameNode, shouldn't really be a concern if HA is configured. So, any monitoring of Hadoop needs to take...