Summary
In this chapter, we stood up both a Grafana server and a Prometheus server, and used Prometheus to scrape metrics from both. We used the ad hoc analysis functionality of Explore to identify interesting metrics, possibly with an eye toward monitoring them. We looked at how to aggregate certain metrics to capture how they change over time, and we examined the limitations of our data that we must respect for the sake of accuracy and integrity.
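As a reminder of that setup, a scrape configuration along these lines is enough to have Prometheus collect metrics from itself and from Grafana. This is only a minimal sketch: the job names are arbitrary, and the targets assume both servers are running locally on their default ports (9090 for Prometheus, 3000 for Grafana).

```yaml
# prometheus.yml - minimal sketch; job names and localhost targets are assumptions
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']   # Prometheus scraping its own /metrics endpoint
  - job_name: 'grafana'
    static_configs:
      - targets: ['localhost:3000']   # Grafana also exposes /metrics by default
```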
Essentially, we’ve established the foundations for building observability workflows: we first captured metrics from our services, then identified the important performance metrics and, where necessary, aggregated or otherwise transformed them. Once we had the metrics we were interested in, we monitored them in real time, and we discussed how to attach alerts that fire when our metrics deviate from normal.
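As a quick illustration of that aggregation step, a PromQL query of the following shape turns a raw counter into a rate and rolls it up per instance. The metric prometheus_http_requests_total is one of Prometheus's own self-monitoring metrics; the specific grouping and time window here are only an example, not the exact query used earlier in the chapter.

```promql
# Per-second HTTP request rate over the last 5 minutes,
# summed across all handlers for each scraped instance.
sum by (instance) (rate(prometheus_http_requests_total[5m]))
```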
In the next chapter, we’ll take some of the concepts we’ve picked up through playing around...