Instrumenting code with custom metrics
If we want to keep our application running smoothly, we need to be proactive. Observability isn't only about post-mortem analysis of logs and error reports. It is also about collecting metrics that give insight into service load, performance, and resource usage. If you monitor how your application behaves during normal operation, you will be able to spot anomalies and anticipate failures before they happen.
The key to monitoring software is defining metrics that are useful for assessing overall service health. Typical metrics can be divided into a few categories:
- Resource usage metrics: Typical metrics are memory, disk, network, and CPU time usage. You should always monitor these metrics because every infrastructure has limited resources. That's true even for cloud services, which provide seemingly unlimited resource pools. If one service has abnormal resource usage, it can starve...