Adding Prometheus metrics
In the previous section, we saw how logs can help us understand what our program is doing by tracing its operations in detail over time. However, most of the time, you can’t afford to keep an eye on the logs all day: they are useful for understanding and debugging a particular situation, but much less useful for getting the global insights that would alert you when something goes wrong.
To solve this, we’ll see in this section how to add metrics to our application. Their role is to measure things that matter in the execution of our program: the number of requests made, the time taken to give a response, the number of pending tasks in the worker queue, the accuracy of our ML predictions… In short, a metric is anything we can track over time, usually with charts and graphs, to monitor the health of our system. When we add such measurements, we say that we instrument our application.
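Before turning to dedicated tooling, it may help to see what instrumenting means in plain Python. The following is only a minimal sketch using the standard library: the names `instrumented`, `request_count`, `request_seconds`, and the `/predict` label are hypothetical and chosen for illustration. It counts calls and accumulates the time spent in a function, two of the measurements mentioned above.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory metric stores, for illustration only.
request_count = defaultdict(int)      # number of calls per label
request_seconds = defaultdict(float)  # cumulative duration per label


def instrumented(label):
    """Decorator recording how often and for how long a function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the function raises.
                request_count[label] += 1
                request_seconds[label] += time.perf_counter() - start
        return wrapper
    return decorator


@instrumented("/predict")
def predict(value):
    # Stand-in for real work, such as an ML prediction.
    return value * 2


for value in range(3):
    predict(value)

average = request_seconds["/predict"] / request_count["/predict"]
print(f"calls={request_count['/predict']} average_seconds={average:.6f}")
```

A sketch like this keeps its numbers inside the process and loses them on restart. A real metrics system also needs a way to expose these values so they can be collected, stored, and charted over time.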
To achieve this task, we’ll use two widely used technologies...