Part 2 - Implementing Observability for Site Reliability Engineering
The second part focuses on the skills and knowledge any site reliability engineer (SRE) must have to succeed in this profession. It explains what an SRE should learn about observability and systems administration, and how data science fits into their daily work. Throughout the chapters in this part, the reader will find comparison tables that clarify how the site reliability engineering profession differs from other IT professions, as well as guidance on where to find more information to hone those skills.
This part covers the following chapters:
- Chapter 4, Essential Observability – Metrics, Events, Logs, and Traces (MELT)
- Chapter 5, Resolution Path – Master Troubleshooting
- Chapter 6, Operational Framework – Managing Infrastructure and Systems
- Chapter 7, Data Consumed – Observability Data Science