Summary
In this chapter, we explored important Kubernetes metrics and learned about SRE best practices for maintaining high availability. We learned how to get a Prometheus and Grafana-based monitoring and visualization stack up and running, and we added custom application dashboards to our Grafana instance. We also learned how to get an ECK logging stack based on Elasticsearch, Kibana, and Fluent Bit up and running on our Kubernetes cluster.
In the next and final chapter, we will learn about best practices for operating Kubernetes. We will cover cluster maintenance topics such as upgrades and rotation, disaster recovery and avoidance, cluster and application troubleshooting, quality control, continuous improvement, and governance.