Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Practical Site Reliability Engineering

You're reading from   Practical Site Reliability Engineering Automate the process of designing, developing, and delivering highly reliable apps and services with SRE

Arrow left icon
Product type Paperback
Published in Nov 2018
Publisher Packt
ISBN-13 9781788839563
Length 390 pages
Edition 1st Edition
Tools
Arrow right icon
Authors (3):
Arrow left icon
Pethuru Raj Chelliah Pethuru Raj Chelliah
Author Profile Icon Pethuru Raj Chelliah
Pethuru Raj Chelliah
Shailender Singh Shailender Singh
Author Profile Icon Shailender Singh
Shailender Singh
Shreyash Naithani Shreyash Naithani
Author Profile Icon Shreyash Naithani
Shreyash Naithani
Arrow right icon
View More author details
Toc

Table of Contents (14) Chapters Close

Preface 1. Demystifying the Site Reliability Engineering Paradigm FREE CHAPTER 2. Microservices Architecture and Containers 3. Microservice Resiliency Patterns 4. DevOps as a Service 5. Container Cluster and Orchestration Platforms 6. Architectural and Design Patterns 7. Reliability Implementation Techniques 8. Realizing Reliable Systems - the Best Practices 9. Service Resiliency 10. Containers, Kubernetes, and Istio Monitoring 11. Post-Production Activities for Ensuring and Enhancing IT Reliability 12. Service Meshes and Container Orchestration Platforms 13. Other Books You May Enjoy

Summary


Monitoring is not a one-time task. We should be regularly measuring what's going on with our Kubernetes pods or our microservices. Monitoring plays a crucial role in the microservice system, as we need to monitor all endpoints in our microservices. To achieve a higher quality product, we should be able to detect failures before our customer does. We should enable anomaly detection and notify our operation team to troubleshoot the problem. We have to set up the necessary monitoring and alerts on both the infrastructure side and the application side.In this chapter, we saw how to use Prometheus and Grafana metrics to create powerful dashboards and alerts. 

In the next chapter, we will talk about post-production activities and best practices for ensuring and enhancing the IT reliability.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at AU $24.99/month. Cancel anytime