Service Mesh Observability
Distributed systems built on a microservice architecture are complex and unpredictable. However diligently you write your code, failures, meltdowns, memory leaks, and similar problems are highly likely to occur. The best strategy for handling such incidents is to observe systems proactively, so that you can identify failures as well as conditions that might lead to failures or other adverse behavior.
Observing systems helps you understand system behavior and the underlying causes of faults, so that you can confidently troubleshoot issues and analyze the effects of potential fixes. In this chapter, you will read about why observability is important, how to collect telemetry information from Istio, the different types of metrics available and how to fetch them via APIs, and how to enable distributed tracing. We will do so by discussing the following topics:
- Understanding observability
- Metric scraping using Prometheus
- Customizing Istio metrics...