You're reading from Mastering Prometheus: Gain expert tips to monitoring your infrastructure, applications, and services

Product type Paperback

Published in Apr 2024

Publisher Packt

ISBN-13 9781805125662

Length 310 pages

Edition 1st Edition

Tools

Prometheus

Concepts

DevOps

Author (1):

William Hegedus



Making robust alerts

The ability to build more robust alerts is one of the factors that distinguishes Prometheus from traditional, check-based monitoring solutions such as Nagios, because a single alert can take multiple signals into account. For example, rather than alerting on high memory usage alone, you can create an alert that fires only when a server shows both high memory usage and a high rate of major page faults, since that combination is generally a better indicator of memory pressure than memory usage alone. The goal is to craft alerts that produce as few false positives as possible, so that they fire only when real, visible impact is occurring. This is part of a larger discussion on the philosophy of alerting on symptoms vs. causes, which is covered comprehensively in Rob Ewaschuk’s excellent document entitled My Philosophy on Alerting (linked at the end of this chapter).
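As a rough illustration of the memory example, the rule below is a minimal sketch of an alert that combines the two signals with PromQL's `and` operator. It assumes the metrics come from the Node Exporter (`node_memory_MemAvailable_bytes`, `node_memory_MemTotal_bytes`, and `node_vmstat_pgmajfault`). The group name, alert name, thresholds (90% usage, 100 major faults per second), and `for` duration are illustrative assumptions, not recommended values, and would need tuning for a real environment.

```yaml
groups:
  - name: memory-pressure-example   # hypothetical group name
    rules:
      - alert: NodeMemoryPressure   # hypothetical alert name
        # Fires only when BOTH conditions hold for the same target:
        #   1. more than 90% of memory is in use (assumed threshold)
        #   2. major page faults exceed 100/s over 5m (assumed threshold)
        expr: |
          (
            (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.90
          )
          and on (instance, job)
          (
            rate(node_vmstat_pgmajfault[5m]) > 100
          )
        # Require the condition to persist before firing, to filter brief spikes.
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} appears to be under memory pressure"
          description: "Memory usage is above 90% and the major page fault rate is above 100/s."
```

The `and on (instance, job)` clause keeps only the series from the left-hand side that have a match with the same `instance` and `job` labels on the right-hand side, so a server with high memory usage but few major page faults produces no alert.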