Chapter 1, Monitoring Fundamentals, lays the foundations of several key concepts that are used throughout the book. This chapter also explores the approach Prometheus takes to metric collection and why some controversial decisions are vital for the design and architecture of its stack.
Chapter 2, An Overview of the Prometheus Ecosystem, gives a high-level overview of the entire Prometheus ecosystem: which components perform which jobs, and how they all fit together logically.
Chapter 3, Setting Up a Test Environment, presents the fundamentals of how to use the test environments provided throughout the book, and how to tinker with them to validate different configurations.
Chapter 4, Prometheus Metrics Fundamentals, explores metrics, the core resource of Prometheus. Understanding them correctly is essential to fully utilize, manage, or even extend the Prometheus stack.
Chapter 5, Running a Prometheus Server, focuses on the Prometheus server, providing common usage patterns and complete setup scenarios for virtual machines and containers.
Chapter 6, Exporters and Integrations, introduces some of the most useful exporters available and provides examples of how to use them.
Chapter 7, Prometheus Query Language – PromQL, dives into the powerful and flexible Prometheus query language to leverage its multi-dimensional data model, which allows ad hoc aggregation and the combination of time series.
Chapter 8, Troubleshooting and Validation, provides useful guidelines on how to quickly detect and fix problems. It also presents useful endpoints that expose critical information and explores promtool, the Prometheus command-line interface and validation tool.
Chapter 9, Defining Alerting and Recording Rules, covers the usage and testing of recording and alerting rules, providing examples along the way.
Chapter 10, Discovering and Creating Grafana Dashboards, delves into the visualization components of the Prometheus stack, covering not only the built-in console functionality but also exploring Grafana and how to build, share, and reuse dashboards.
Chapter 11, Understanding and Extending Alertmanager, introduces the alerting component of the stack, showing how to integrate it with several different alerting providers and how to correctly set up clustering to enable high availability with alert deduplication.
Chapter 12, Choosing the Right Service Discovery, explores multiple service discovery integrations and provides the requirements and knowledge you need to build your own integration if required.
Chapter 13, Scaling and Federating Prometheus, tackles the scaling of a Prometheus stack and introduces concepts such as sharding and global views, explaining each in context.
Chapter 14, Integrating Long-Term Storage with Prometheus, covers the concepts behind the Prometheus read and write endpoints. It then examines in depth the considerations for external and long-term metric storage, and finishes with an end-to-end example using Thanos.