Summary
In this chapter, we learned about SRE principles and practices. We now know how to calculate time-based availability and how to define availability targets based on business expectations and needs.
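As a quick refresher, time-based availability is commonly expressed as uptime divided by the sum of uptime and downtime. The following is a minimal sketch of that calculation; the function name, the 30-day window, and the downtime figure are illustrative assumptions, not values taken from this chapter.

```python
def time_based_availability(uptime_minutes: float, downtime_minutes: float) -> float:
    """Return availability as a fraction: uptime / (uptime + downtime)."""
    total_minutes = uptime_minutes + downtime_minutes
    if total_minutes <= 0:
        raise ValueError("the measurement window must be longer than zero minutes")
    return uptime_minutes / total_minutes


# Hypothetical example: a 30-day window with 43.2 minutes of downtime.
window_minutes = 30 * 24 * 60
downtime_minutes = 43.2
availability = time_based_availability(window_minutes - downtime_minutes, downtime_minutes)
print(f"Availability: {availability:.4%}")
```

The business decides which target is acceptable; the calculation only reports where the service currently stands against it.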
We also explored the typical reliability challenges associated with the traditional team model and how forming an SRE team helps you find the right balance between system reliability and development. We underlined the importance of treating reliability properly and consistently.
Then, we learned about the aspects involved in taking a service from conception to a successful launch in production. We also highlighted key techniques such as applying SLOs and SLIs, reducing toil, fostering a post-mortem culture, and using an error budget efficiently to improve the dependability of systems and cloud services.
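To make the error budget idea concrete, here is a minimal sketch that derives the allowed downtime from an availability SLO and reports how much of it remains. It assumes a time-based budget measured in minutes; the function names and the sample numbers are hypothetical.

```python
def error_budget_minutes(slo: float, window_minutes: float) -> float:
    """Allowed downtime for the window: (1 - SLO) * window length."""
    if not 0 < slo <= 1:
        raise ValueError("slo must be a fraction in the range (0, 1]")
    return (1 - slo) * window_minutes


def remaining_budget_minutes(slo: float, window_minutes: float, downtime_minutes: float) -> float:
    """Budget left after the observed downtime; a negative value means it is overspent."""
    return error_budget_minutes(slo, window_minutes) - downtime_minutes


# Hypothetical example: a 99.9% SLO over 30 days with 20 minutes of downtime so far.
window_minutes = 30 * 24 * 60
budget = error_budget_minutes(0.999, window_minutes)
remaining = remaining_budget_minutes(0.999, window_minutes, 20)
print(f"Budget: {budget:.1f} min, remaining: {remaining:.1f} min")
```

A positive remainder leaves room for releases and experiments, while a negative one is a signal to prioritize reliability work.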
In the upcoming chapter, we’ll look at DevOps tools and capabilities to see how they can help you manage your software development life cycle.