In the previous chapters, we converted our initial monolithic application into a set of microservices that now run distributed across our Kubernetes cluster. This paradigm shift added a new item to our list of project requirements: as system operators, we must be able to monitor the health of each individual service and be notified when problems arise.
We will begin this chapter by comparing the strengths and weaknesses of popular systems for capturing and aggregating metrics. We will then turn our attention to Prometheus, a widely used metrics collection system written in Go, and explore approaches for instrumenting our code so that metrics can be collected and exported efficiently. In the last part of this chapter, we will investigate the use of Grafana for...