Highly available (HA) alerting
Unlike Prometheus itself, Alertmanager has built-in high-availability capabilities and can natively run as a cluster. Cluster members share information with one another over a gossip protocol. To form a cluster, specify the peers when starting each Alertmanager instance using the repeatable --cluster.peer flag. Alternatively, if you don’t want clustering, you can set the --cluster.listen-address flag to an empty string to disable it altogether (not recommended for production deployments). Alertmanager has a variety of other flags under --cluster.*, but apart from the peer flag, these should not need to be modified for most use cases.
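As a rough illustration of how the repeatable peer flag is used, the following Python sketch assembles the command line for each member of a three-node cluster. The host names, the MEMBERS list, and the helper functions alertmanager_argv and disabled_clustering_argv are all hypothetical and exist only for this example; the flags themselves (--config.file, --cluster.listen-address, --cluster.peer) are Alertmanager's own, and 9094 is its default cluster port. Treat it as a sketch under those assumptions rather than a recommended deployment method.

```python
import shlex

# Hypothetical host names for illustration; 9094 is Alertmanager's default cluster port.
MEMBERS = [
    "am-1.example.internal:9094",
    "am-2.example.internal:9094",
    "am-3.example.internal:9094",
]


def alertmanager_argv(self_addr, members, config_file="alertmanager.yml"):
    """Build the argv for one cluster member, listing every other member as a peer."""
    port = self_addr.rsplit(":", 1)[1]
    argv = [
        "alertmanager",
        f"--config.file={config_file}",
        f"--cluster.listen-address=0.0.0.0:{port}",
    ]
    # --cluster.peer is repeatable: emit one flag per peer, skipping ourselves.
    argv += [f"--cluster.peer={m}" for m in members if m != self_addr]
    return argv


def disabled_clustering_argv(config_file="alertmanager.yml"):
    """Argv for a standalone instance: an empty listen address disables clustering."""
    return [
        "alertmanager",
        f"--config.file={config_file}",
        "--cluster.listen-address=",
    ]


if __name__ == "__main__":
    # Print the command each member would be started with.
    for member in MEMBERS:
        print(shlex.join(alertmanager_argv(member, MEMBERS)))
```

Each member ends up with one --cluster.peer flag for every other member. The second helper shows the empty --cluster.listen-address value that turns clustering off.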
Gossip between cluster members covers only certain information, such as silences and whether a notification has already been sent for an alert group. Consequently, gossip between Alertmanagers in a cluster is crucial to avoid...