Monitoring Clusters and Workloads
So far in this book, we’ve spent a considerable amount of time standing up different aspects of an enterprise Kubernetes infrastructure. Once it’s stood up, how do you know it’s healthy? How do you know it’s running? Do you know when there’s a problem before your users do, or do you find out only when someone can’t access a critical system? Monitoring is a critical aspect of any well-run infrastructure, and it comes with its own challenges in the Kubernetes and cloud-native world. In this chapter, we’re going to look at two specific aspects of monitoring. First, we’re going to work with the Prometheus project and its integration with Kubernetes to understand how to inspect our cluster and what to look for. Next, we’re going to centralize our logs using the popular ELK stack. Along the way, we’ll include typical enterprise discussions around security and compliance to make sure...