Defining alerts based on key metrics
Collecting logs and metrics and displaying them in attractive dashboards is not enough. If we rely on dashboards alone, support staff must sit in front of a wall of monitors around the clock, every day of the year, just in case something goes wrong. To put it mildly, this job is tedious. And what happens if the person on duty nods off? We need a different approach: alerts that notify us automatically. Let’s start by defining what metrics are.
Metrics
Metrics are the input values for the rules on which alerts are based. We must identify the critical metrics and raise an alert when one of them exceeds a predetermined threshold repeatedly or for an extended period of time. CPU usage is a typical example.
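To make the idea of a rule concrete, here is a minimal Python sketch of a threshold rule for CPU usage. It is an illustration only, not the mechanism of any particular monitoring tool: the class name ThresholdAlert, the 80% threshold, and the requirement of three consecutive samples are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ThresholdAlert:
    """Hypothetical rule: fire when a metric stays above `threshold`
    for at least `min_consecutive` samples in a row."""
    threshold: float
    min_consecutive: int
    _streak: int = 0  # number of consecutive samples above the threshold

    def observe(self, value: float) -> bool:
        # Extend the streak on a breach; reset it on any sample at or below
        # the threshold, so a single short spike does not trigger an alert.
        if value > self.threshold:
            self._streak += 1
        else:
            self._streak = 0
        return self._streak >= self.min_consecutive


# Assumed values: alert when CPU usage exceeds 80% for 3 samples in a row.
cpu_alert = ThresholdAlert(threshold=80.0, min_consecutive=3)
samples = [50.0, 85.0, 90.0, 70.0, 82.0, 88.0, 95.0]  # made-up CPU readings in %
fired = [cpu_alert.observe(s) for s in samples]
print(fired)
```

With these made-up readings, the two-sample spike (85, 90) does not fire because the streak is reset by the 70% reading; only the last sample, the third consecutive breach, triggers the alert. Requiring a sustained breach rather than a single high value is what keeps such a rule from paging someone on every short-lived peak.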
Defining alerts based on key metrics is an important part of monitoring and maintaining the health of our Docker and Kubernetes systems. Alerts allow us to define conditions based on metrics and to send notifications when those...