Alerting on metrics abnormalities
Metrics provide time-series measurements of the behavior of our applications and infrastructure, but on their own they do not notify us when those measurements deviate from what we expect. To react to abnormal behavior, we need to establish rules that define what normal behavior is and how we are notified when our applications deviate from it.
Alerting on metrics enables us to define behavioral norms and specify how we should be notified when our applications violate them. For example, if we expect our application to respond to HTTP requests in under 100 milliseconds and we observe a 5-minute span in which responses take longer than 100 milliseconds, we would want to be notified of the deviation.
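To make the idea of a sustained deviation concrete, here is a minimal sketch of the check described in the example. It does not show how Alertmanager works. The function name should_alert, the constants, and the (timestamp, latency) sample format are all hypothetical and chosen only for illustration. The sketch assumes that samples arrive in time order and that an alert should fire only when every sample over a continuous 5-minute span is above the threshold.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical values taken from the example in the text.
THRESHOLD_MS = 100.0   # expected upper bound on response time
WINDOW_SECONDS = 300   # 5 minutes


def should_alert(
    samples: Iterable[Tuple[float, float]],
    threshold_ms: float = THRESHOLD_MS,
    window_seconds: float = WINDOW_SECONDS,
) -> bool:
    """Return True if latency stayed above threshold_ms for window_seconds.

    samples is an iterable of (timestamp_in_seconds, latency_in_ms) pairs,
    assumed to be sorted by timestamp.
    """
    breach_start: Optional[float] = None  # when the current breach began
    for timestamp, latency_ms in samples:
        if latency_ms > threshold_ms:
            if breach_start is None:
                breach_start = timestamp
            # The breach has lasted long enough to be worth a notification.
            if timestamp - breach_start >= window_seconds:
                return True
        else:
            # A single healthy sample ends the breach and resets the clock.
            breach_start = None
    return False


# One sample per minute, all slow: the breach spans the full 5 minutes.
sustained = [(float(t), 150.0) for t in range(0, 301, 60)]

# The same period with one healthy sample in the middle.
interrupted = [
    (0.0, 150.0), (60.0, 150.0), (120.0, 80.0),
    (180.0, 150.0), (240.0, 150.0), (300.0, 150.0),
]
```

With these inputs, should_alert(sustained) is true and should_alert(interrupted) is false. Requiring the condition to hold for the whole window, rather than alerting on the first slow response, is a design choice that avoids notifications for brief spikes.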
In this section, we will learn how to extend our current configuration of services to include an Alertmanager...