Summary
In this chapter, we covered one of the most important aspects of service reliability work: alerting. You learned how to collect service metrics using Prometheus and the tally library, define service alerts using Alertmanager, and connect these components into an end-to-end service alerting pipeline.
The material in this chapter brings together what we learned about reliability and service telemetry in Chapter 10 and Chapter 11. By collecting telemetry data and setting up notifications with the alerting tools, we can quickly detect service issues and get notified whenever one needs to be mitigated.
In the next chapter, we will continue covering some advanced aspects of Go development, including system profiling and dashboarding.