Becoming a Rockstar SRE

You're reading from Becoming a Rockstar SRE: Electrify your site reliability engineering mindset to build reliable, resilient, and efficient systems

Product type Paperback
Published in Apr 2023
Publisher Packt
ISBN-13 9781803239224
Length 420 pages
Edition 1st Edition
Authors (2):
Jeremy Proffitt
Rod Anami L. Anami

DevOps engineers versus SRE versus others

One of the questions customers and organizations ask us most often is how the site reliability engineering profession differs from other existing technical roles. We have already discussed how SREs connect the different steps of the solution life cycle. Here, we focus first on the DevOps engineer role and then broaden the comparison. We have split this discussion into two sections:

  • DevOps and site reliability engineers
  • Software and site reliability engineers

DevOps and site reliability engineers

Google described the relationship between DevOps and SRE with a famous subtitle in The Site Reliability Workbook:

Class SRE implements interface DevOps
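
Read literally, the quote declares a class that fulfills an interface. The following is a minimal sketch of that reading in Python, using an abstract base class in place of a Java interface. The two method names are illustrative assumptions borrowed from DevOps practice areas; they do not come from the Workbook.

```python
from abc import ABC, abstractmethod


class DevOps(ABC):
    """Plays the part of the interface: it names practices, not how to do them."""

    @abstractmethod
    def accept_failure_as_normal(self) -> str:
        ...

    @abstractmethod
    def measure_everything(self) -> str:
        ...


class SRE(DevOps):
    """Plays the part of the implementing class: one concrete way to do them."""

    # Hypothetical method bodies, chosen only to illustrate the idea.
    def accept_failure_as_normal(self) -> str:
        return "error budgets and blameless postmortems"

    def measure_everything(self) -> str:
        return "SLIs, SLOs, and toil measurements"


sre = SRE()
print(isinstance(sre, DevOps))  # the implementation satisfies the interface
```

In this sketch, calling `DevOps()` directly raises a `TypeError`, which mirrors the point of the quote: the interface says what to do, and only an implementation such as `SRE` says how.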

Google's statement is an elegant way to define this link, and it borrows Java syntax: a class that implements an interface supplies concrete behavior for what the interface only declares. In other words, site reliability engineering is a concrete, deeper implementation of whatever DevOps is, and it follows that the two have much in common. However, what exactly does site reliability engineering implement from DevOps, and how does a site reliability engineer differ from a DevOps engineer? We have visualized these similarities and divergences in the following infographic:

Figure 1.3 – An infographic on SRE and DevOps

Notice that the two roles share values: both SREs and DevOps engineers need the values in the orange box at the bottom right of the infographic. The table at the bottom left shows how the roles differ. Typically, site reliability engineers resolve operational problems by applying the right software engineering disciplines. DevOps engineers, on the other hand, resolve development and delivery pipeline issues with systems management techniques, mainly automation and infrastructure-as-code. The two roles also concentrate different levels of effort on distinct phases of the solution life cycle, as depicted in the infographic.

It is common to hear that DevOps is a shift-right transformation while site reliability engineering is a shift-left one. That is, DevOps moves from the left (the development side of the equation) to the right (the operations side), while site reliability engineering moves in the opposite direction. Another term we hear a lot is DevSecOps, which adds security. Since security has always been implied in these roles, we think including new letters in the middle is confusing and redundant.

SREs and DevOps engineers are, in our opinion, two sides of the same coin. Because they share values, they should be more like best friends forever than opposing roles. Let’s check how SREs fulfill those values across the five main areas of DevOps:

  • Reduce organizational silos: SREs use the same tooling as developers or DevOps engineers. They also share objectives and performance metrics with them.
  • Accept failure as normal: SREs embrace risk by spending the error budget on new features. They quantify failure through service-level indicators (SLIs) and service-level objectives (SLOs), and they run postmortems in a blameless culture.
  • Implement gradual changes: SREs work to increase reliability, and more reliable systems allow more frequent changes and releases.
  • Leverage tooling and automation: SREs eliminate toil by automating operational tasks at a constant pace.
  • Measure everything: SREs measure reliability by collecting metrics, events, logs, and traces (MELT) and by implementing observability. They also have ways to identify and size toil.
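
The error budget mentioned in the list above follows directly from an SLO. The sketch below assumes a request-based availability SLI (good requests divided by total requests); the function name and the sample numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def error_budget_remaining(slo_target: float,
                           total_requests: int,
                           failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent.

    1.0 means the budget is untouched; 0.0 or less means it is exhausted.
    Assumes a request-based SLI: good requests / total requests.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The error budget is the share of requests the SLO allows to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


# Hypothetical month: a 99.9% SLO, one million requests, 250 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"SLI: {(1_000_000 - 250) / 1_000_000:.5f}")
print(f"Error budget remaining: {remaining:.0%}")
print("Room for riskier releases" if remaining > 0 else "Prioritize reliability work")
```

With these sample numbers, the SLO allows about 1,000 failed requests, so 250 failures leave roughly three quarters of the budget available for new features.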

Software and site reliability engineers

Another frequently asked question is how site reliability engineers differ from software engineers (SWEs). The short answer is simple: they have the same core skills but different work scopes.

What are SWEs? SWEs design, engineer, and architect applications using modeling languages and requirements analysis techniques. They use an integrated development environment (IDE) and develop code for use cases in one of the many available programming languages. They create test cases and testing suites. They also integrate software and service components and handle their dependencies. SWEs work with many software development life cycle tools and processes.

Site reliability engineers may execute the same activities, but they do so with the intent of improving reliability. For instance, developing code for an SRE is much more about instrumenting the application code, so that it generates more logs, than about coding a use case. Also, SREs treat operations as a software problem and see daily systems management tasks as possible software coding opportunities. Besides that, SREs have other core skills relating to systems thinking, systems management, and data science.
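
To make the contrast between coding a use case and instrumenting it concrete, here is a minimal sketch using Python's standard `logging` module. The `apply_discount` function, the `checkout` logger name, and the `instrumented` decorator are all hypothetical names chosen for illustration.

```python
import functools
import logging
import time

logger = logging.getLogger("checkout")  # hypothetical service name


def instrumented(func):
    """Wrap a function so every call logs its outcome and duration."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback at ERROR level.
            logger.exception("call failed: %s", func.__name__)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("call ok: %s duration_ms=%.2f", func.__name__, elapsed_ms)
        return result
    return wrapper


@instrumented
def apply_discount(price: float, percent: float) -> float:
    """The use case itself: the part an SWE would typically write."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    apply_discount(200.0, 25)
```

The business logic stays untouched. The decorator adds the log lines that later feed troubleshooting and reliability measurements, which is the kind of code change this paragraph attributes to SREs.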

Indeed, an SRE could become an SWE and vice versa, and that leads us to another principle that we find in the Google materials.

Common staffing pool

Another principle is hiring site reliability engineers and SWEs from the same staffing pool. This principle works well for companies where most employees are software developers and engineers, and having a shared pool means that site reliability and software engineering job roles are interchangeable. However, this principle may be much more challenging for enterprises with a mix of systems administrators and developers. Hence, we left it out of our list in the previous section.

We could compare the SRE’s unique profession to many others, but we have limited this topic to the most common comparisons. SREs are not architects, developers, systems administrators, or data scientists; they are more than all of these roles combined. Next, we look at the primary responsibilities of an SRE.
