The final section of the book focuses on some of the elements that make a live system work long term in real life. These range from the observability of the system, which is critical for detecting and fixing problems quickly, to handling configuration that affects the whole system. The section also includes techniques for ensuring that different teams collaborate and develop systems in a coordinated fashion.
The first chapter of this section deals with how to discover what is happening in a live cluster in order to detect usage patterns and associated problems. It introduces the concept of observability and the two main tools for supporting it: logs and metrics. It also covers how to include them properly in a Kubernetes cluster.
The second chapter of this section deals with configuration that is shared across different...