Monitoring and Observability
In this final chapter, we will cover how to monitor the performance and health of the individual services and of the application as a whole. The most common approach to monitoring is logging, which records information and errors. The second most common approach is recording metrics, such as CPU usage and request latency. We will look at both of these forms of monitoring, as well as a third form known as distributed tracing.
We will also introduce OpenTelemetry and learn about its goals and how it works. We will then add it to our application to record each request as it works its way through the application.
We will end by looking at the tools that consume the data produced by our monitoring additions: Jaeger, Prometheus, and Grafana.
In this chapter, we are going to cover the following main topics:
- What are monitoring and observability? ...