Understanding the concept of SRE
SRE was originally meant for mission-critical systems, but it can be applied more broadly to make the DevOps process more efficient. The goal is to enable developers to deploy infrastructure quickly and without errors. To achieve this, deployment is fully automated. Working this way keeps operators from being swamped with requests to onboard and manage ever more systems.
The original description of SRE, as invented by Google, is well over 400 pages long. The Further reading section lists a good book for a real deep dive into SRE; this chapter is merely an introduction.
Key terms in SRE are the service-level indicator (SLI), the service-level objective (SLO), and the error budget: the number of failures, and the resulting unavailability, that a system may accumulate while still meeting its SLO. These terms are explained in more detail in the next paragraphs.
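To make the relationship between these three terms concrete, the following is a minimal sketch in Python. It assumes a simple request-based SLI (successful requests divided by total requests) and an SLO expressed as a target fraction. The names ServiceWindow, sli, error_budget, and budget_remaining are hypothetical and chosen only for this illustration; real SRE tooling may measure availability differently, for example in minutes of downtime.

```python
from dataclasses import dataclass


@dataclass
class ServiceWindow:
    """Request counts for one measurement window (hypothetical data model)."""
    total_requests: int
    good_requests: int


def sli(window: ServiceWindow) -> float:
    """SLI: the fraction of requests that were served successfully."""
    if window.total_requests == 0:
        return 1.0  # assumption: a window without traffic counts as available
    return window.good_requests / window.total_requests


def error_budget(slo_target: float, window: ServiceWindow) -> int:
    """Error budget: failed requests the SLO tolerates in this window."""
    return int(round((1.0 - slo_target) * window.total_requests))


def budget_remaining(slo_target: float, window: ServiceWindow) -> int:
    """Budget left after subtracting the failures already observed."""
    failed = window.total_requests - window.good_requests
    return error_budget(slo_target, window) - failed


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from a real system.
    window = ServiceWindow(total_requests=1_000_000, good_requests=999_400)
    slo = 0.999  # SLO: 99.9% of requests should succeed

    print(f"SLI: {sli(window):.4%}")
    print(f"Error budget: {error_budget(slo, window)} failed requests")
    print(f"Budget remaining: {budget_remaining(slo, window)} failed requests")
```

With these illustrative numbers, a 99.9% SLO over one million requests tolerates 1,000 failures. Since 600 requests have failed, 400 remain in the budget. A negative remainder would mean the SLO has been missed for that window.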
SLI and SLO differ from SLA, the service-level agreement. The SLA is an agreement between the supplier of a service and the end user of that...