Investigating common performance problems
Performance degradation is a symptom of an underlying issue, such as a race condition, a dependency slowdown, high load, or any other problem that pushes your service-level indicators (SLIs) beyond healthy limits and causes you to miss service-level objectives (SLOs). Such issues may affect multiple, if not all, code paths and APIs, even if they’re initially limited to a specific scenario.
For example, when a downstream service experiences issues, it can cause throughput to drop significantly for all APIs, including those that don’t depend on that downstream service. Retries, additional connections, or threads that handle downstream calls consume more resources than usual and take them away from other requests.
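The effect described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not a measurement of any real system: the pool size, the API names (`checkout`, `catalog`), the request rates, the timeout, and the retry count are all hypothetical, and the assumption that a saturated pool slows every API down proportionally is a deliberate simplification. It uses Little's law (average busy workers = arrival rate × time each request holds a worker) to estimate how much of a shared worker pool each API needs.

```python
from typing import Dict, Tuple

# (requests per second, milliseconds a request holds a worker, attempts per request)
Demand = Tuple[int, int, int]


def busy_workers(rate_per_s: int, hold_ms: int, attempts: int = 1) -> float:
    """Average workers kept busy, by Little's law: arrival rate x time held."""
    return rate_per_s * hold_ms * attempts / 1000


def achievable_throughput(pool_size: int, demands: Dict[str, Demand]) -> Dict[str, float]:
    """Scale every API's throughput down equally once the shared pool is saturated.

    Proportional sharing is a simplifying assumption; real schedulers and
    queues behave differently, but the direction of the effect is the same.
    """
    needed = sum(busy_workers(*d) for d in demands.values())
    scale = min(1.0, pool_size / needed) if needed > 0 else 1.0
    return {name: rate * scale for name, (rate, _, _) in demands.items()}


POOL_SIZE = 32  # hypothetical shared worker pool

# Healthy: "checkout" calls the downstream service (50 ms); "catalog" does not (20 ms).
healthy = achievable_throughput(POOL_SIZE, {
    "checkout": (100, 50, 1),
    "catalog": (200, 20, 1),
})

# Degraded: downstream calls hit a 1 s timeout and are attempted 3 times.
degraded = achievable_throughput(POOL_SIZE, {
    "checkout": (100, 1000, 3),
    "catalog": (200, 20, 1),
})

print(healthy)
print(degraded)
```

In this model, the healthy workload needs about 9 of the 32 workers, so both APIs run at full rate. Once the downstream calls time out and are retried, `checkout` alone would need about 300 workers, and `catalog` drops from 200 to roughly 21 requests per second even though it never calls the failing dependency.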
Note
Resource consumption alone, be it high or low, does not indicate a performance issue (or the lack of one). High CPU or memory utilization can be acceptable if users are not affected. It could still be important to investigate when...