Unit-testing alerting rules
Testing alerts is one of the less often discussed features of Prometheus. I’ve frequently seen alerts simply tested by crafting a query and running it while looking at a time range in which an incident occurred to confirm that the query would detect it. While it’s certainly valid to do so, it doesn’t really do anything to ensure the alert will keep working as expected when it’s inevitably tweaked and tuned. Building out unit tests for your Prometheus alerting rules allows you to specify series, their sample values, and what you expect firing alerts to look like.
Since alerts are defined and evaluated within Prometheus, we use promtool
to validate and test alerts instead of amtool
.
Unit testing is distinct from simply validating rules, which can be done with the following command:
$ promtool check rules ch5/test-rule.yaml
This will simply validate whether or not your rules have appropriate syntax. However, we also want...