Illustrating the impact of SLAs, SLOs, and error budgets relative to SLI
In this section, we will go through two hands-on scenarios to illustrate how SLO targets are met or missed based on SLI performance over time. SLO performance has a direct impact on SLAs and error budgets. Changes in the error budget, in turn, dictate whether releasing new features or improving service reliability takes priority. For ease of explanation, a 7-day period is used as the measurement window (ideally, a 28-day period is preferred).
Scenario 1 – New service features introduced; features are reliable; SLO is met
Here are the expectations for this scenario:
- Expected SLA—95%
- Expected SLO—98%
- Measured SLI—Service availability or uptime
- Measure duration—7 days
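The expectations above translate directly into an amount of tolerable downtime. The following is a minimal sketch, assuming availability is measured as a simple percentage of wall-clock time over the window. The function name `error_budget` and its parameters are illustrative and do not come from any particular monitoring tool.

```python
from datetime import timedelta


def error_budget(target_percent: float, window: timedelta) -> timedelta:
    """Return the downtime allowed by an availability target over a window.

    Assumption: availability is the fraction of wall-clock time the
    service is up, so the budget is (100 - target)% of the window.
    """
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    allowed_fraction = (100 - target_percent) / 100
    # Round to whole seconds to avoid floating-point noise in the result.
    return timedelta(seconds=round(window.total_seconds() * allowed_fraction))


window = timedelta(days=7)
print("SLO 98% budget:", error_budget(98, window))  # 3:21:36 (201.6 minutes)
print("SLA 95% budget:", error_budget(95, window))  # 8:24:00 (504 minutes)
```

Because the SLO (98%) is stricter than the SLA (95%), its budget is the smaller of the two. The service can exhaust its SLO error budget and still remain within the SLA.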
Given that the expected SLO for the service is 98%, here is how the allowed downtime or error budget is calculated (you can use this downtime calculator for reference: https...