Understanding SLIs, SLOs, and SLAs
In the realm of site reliability, three crucial parameters guide SREs: service-level indicators (SLIs), which indicate availability; service-level objectives (SLOs), which define availability; and service-level agreements (SLAs), which set out the consequences of unavailability. Let’s start by exploring SLIs in detail.
SLIs
SLIs serve as quantifiable reliability metrics. Google defines them as “carefully defined quantitative measures of some aspect of the level of service provided.” Common examples include request latency, failure rate, and data throughput. SLIs are specific to user journeys, which are sequences of actions users perform to achieve specific goals. For instance, a user journey for our sample Blog App might involve creating a new blog post.
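To make the idea concrete, here is a minimal sketch of how two such SLIs, failure rate and request latency, might be computed from a batch of request records for the "create a blog post" journey. The RequestRecord type, the function names, and the choice to count only HTTP 5xx responses as failures are assumptions made for illustration; they do not come from the Blog App or from any particular monitoring tool.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One observed request in a user journey (hypothetical record shape)."""
    latency_ms: float
    status: int  # HTTP status code


def failure_rate(records):
    """Fraction of requests that failed; here a failure is assumed to be any 5xx status."""
    if not records:
        return 0.0
    failures = sum(1 for r in records if r.status >= 500)
    return failures / len(records)


def latency_percentile(records, pct):
    """Latency at the given percentile, using the nearest-rank method."""
    if not records:
        raise ValueError("no requests observed")
    latencies = sorted(r.latency_ms for r in records)
    rank = max(1, math.ceil(pct / 100 * len(latencies)))
    return latencies[rank - 1]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        RequestRecord(120, 201),
        RequestRecord(340, 201),
        RequestRecord(95, 500),
        RequestRecord(210, 201),
    ]
    print("failure rate:", failure_rate(sample))
    print("p95 latency (ms):", latency_percentile(sample, 95))
```

In practice, these values would usually come from a monitoring system rather than from application code, but the calculation behind them is the same: a ratio of bad events to total events, or a percentile over a measurement window.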
Google, the original advocate of SRE, has identified four golden signals that apply to most user journeys:
- Latency: This measures the time it takes for your service to...