Chapter 5: Architecting Next-Level DevOps with SRE
In previous chapters, we discussed the ins and outs of DevOps. It's called DevOps for a reason, but in practice, the Dev is typically emphasized: creating agility by speeding up development. Site Reliability Engineering (SRE) puts a strong focus on Ops. How does Ops survive under the ever-increasing speed and number of products that Dev delivers? The answer is SRE teams, which work with error budgets and aim to reduce toil.
After completing this chapter, you will have learned the basic principles of SRE and how you can help an enterprise adopt and implement them. You will also have a good understanding of how to define Key Performance Indicators (KPIs) for SRE and what benefits these bring to the organization.
In this chapter, we're going to cover the following main topics:
- Understanding the basic principles of SRE
- Assessing the enterprise for SRE readiness
- Architecting SRE using KPIs
- Implementing SRE ...