When operating a Ceph cluster, it is important to monitor its health and performance. Monitoring lets you confirm that the cluster is healthy and react quickly to any issues that arise. By capturing and graphing performance counters, you will also have the data needed to tune Ceph and observe the impact of your tuning on the cluster.
In this chapter, you will learn about the following topics:
- Why it is important to monitor Ceph
- How to monitor Ceph's health
- What should be monitored
- The states of PGs and what they mean
- How to capture Ceph's performance counters with collectd
- Example graphs using Graphite