Cluster monitoring
In a large distributed system, the administrator has the difficult task of staying aware of the overall status of the system while also knowing the state of each individual server. When a disaster-like situation occurs, it is hard to tell when and how the problem started just by looking at a handful of raw log files.
HBase is itself a distributed system that runs on top of Hadoop, so HBase cluster administrators need to continuously ensure that the cluster is up and operating as expected. To support this difficult task, HBase exposes a large number of metrics that describe its current status.
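As a rough illustration of what consuming such metrics can look like, the following Python sketch pulls a few values out of a JSON document. It assumes the metrics are delivered as JSON with a top-level "beans" list, which is the shape produced by the Hadoop-style /jmx HTTP endpoint. The bean name, the metric names, and the sample values are illustrative assumptions and are not taken from a real cluster, so verify them against the HBase version in use.

```python
import json


def extract_metrics(jmx_payload, bean_name, wanted):
    """Return {metric: value} for the requested keys of one named bean.

    jmx_payload is a JSON string with a top-level "beans" list (assumed shape).
    Metrics that are missing from the bean are silently skipped.
    """
    doc = json.loads(jmx_payload)
    for bean in doc.get("beans", []):
        if bean.get("name") == bean_name:
            return {key: bean[key] for key in wanted if key in bean}
    return {}  # bean not found


# Hypothetical sample payload; the names and numbers are made up for illustration.
SAMPLE_BEAN = "Hadoop:service=HBase,name=RegionServer,sub=Server"
SAMPLE = json.dumps({
    "beans": [
        {
            "name": SAMPLE_BEAN,
            "regionCount": 42,
            "storeFileCount": 17,
            "readRequestCount": 1200,
        }
    ]
})

metrics = extract_metrics(SAMPLE, SAMPLE_BEAN, ["regionCount", "readRequestCount"])
print(metrics)
```

In practice the payload would come from an HTTP request to a master or region server rather than from a hard-coded string. The parsing step is kept separate here so that it can be tried without a running cluster.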
The available solutions can be grouped into graphing solutions, monitoring solutions, or tools that do both. Graphing solutions, such as Ganglia (http://ganglia.sourceforge.net/), capture the metrics that a system exposes and display them as charts filtered by time range, such as daily, monthly, and so on. Monitoring solutions, such as Nagios (http://www.nagios.org/), use...