The monitoring system is a critical component of any infrastructure. We rely on it to keep watch over everything – from servers and network devices to services and applications – and expect to be notified whenever there's a problem. However, when the problem lies in the monitoring stack itself, or with a notification provider so that alerts are generated but never reach us, how will we, as operators, know?
Guaranteeing that the monitoring stack is up and running, and that notifications can reach their recipients, is a commonly overlooked task. In this section, we will look at what can be done to mitigate these risks and improve overall confidence in the monitoring system.