Alerting – the art of doing it quietly
Since many monitoring tools have alerting functionality, we want to talk about how SREs define and handle alerts.
First, let’s understand what an alert is. As we have learned, modern monitoring systems work with events – normalized and structured monitoring data types. Some of those events are considered critical and urgent. When that happens, the monitoring system needs to raise an alert that goes to a notification system. The notification system lets the first responders know about the alert.
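The flow just described (event, alert, notification) can be sketched in a few lines of Python. This is only an illustrative model, not the API of any particular monitoring tool: the names Event, Alert, should_alert, and MonitoringSystem, as well as the severity values, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    """A normalized, structured piece of monitoring data (hypothetical shape)."""
    source: str
    severity: str  # assumed values: "info", "warning", "critical"
    urgent: bool
    message: str


@dataclass
class Alert:
    """Raised by the monitoring system for a critical and urgent event."""
    event: Event


def should_alert(event: Event) -> bool:
    # Only events that are both critical and urgent become alerts.
    return event.severity == "critical" and event.urgent


class MonitoringSystem:
    def __init__(self, notify: Callable[[Alert], None]) -> None:
        # notify stands in for the notification system that reaches
        # the first responders (pager, chat, email, and so on).
        self._notify = notify

    def ingest(self, event: Event) -> bool:
        """Process one event; return True if an alert was raised."""
        if should_alert(event):
            self._notify(Alert(event))
            return True
        return False


if __name__ == "__main__":
    monitoring = MonitoringSystem(
        notify=lambda alert: print(f"PAGE: {alert.event.message}")
    )
    monitoring.ingest(Event("disk", "info", False, "disk usage at 40%"))
    monitoring.ingest(Event("api", "critical", True, "checkout requests failing"))
```

In this sketch only the second event reaches the notification system. Keeping the decision in one small function such as should_alert makes it easier to review which events are allowed to interrupt a first responder.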
Alerting and notification are straightforward processes, but they also need to be cost-effective. As SREs, we should follow a few guidelines to ensure that the outcomes of these processes add value for the end user rather than creating more operational work for us. For that purpose, we will divide this topic into two sections.
The user perspective notification trigger principle
This SRE principle advises us on...