Kubernetes is a distributed system with many moving parts that are hidden from view. AKS abstracts all of them for us, but it is still our responsibility to know where to look and how to respond when something goes wrong. Kubernetes handles much of the failure recovery automatically; still, you will run into situations where manual intervention is required. In this section, we will look in depth at the most common failure modes that require such intervention:
- Node failures
- Out-of-resource failures
- Storage mount issues
- Network issues
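As a first illustration of "knowing where to look", the following is a minimal sketch of how node failures, the first item in the list, might be spotted programmatically. It assumes the node list has been exported with `kubectl get nodes -o json` and relies only on the standard `Ready` node condition. The function name `not_ready_nodes` and the node names in the sample data are hypothetical and chosen purely for illustration.

```python
import json
from typing import List


def not_ready_nodes(node_list: dict) -> List[str]:
    """Return the names of nodes whose Ready condition is not "True"."""
    unhealthy = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        # A node counts as healthy only if it reports Ready with status "True".
        # "False", "Unknown", or a missing Ready condition are all treated as not ready.
        is_ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not is_ready:
            unhealthy.append(node["metadata"]["name"])
    return unhealthy


# Hypothetical, heavily trimmed sample in the shape of `kubectl get nodes -o json`.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "node-a"},
     "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "node-b"},
     "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
    {"metadata": {"name": "node-c"},
     "status": {"conditions": [{"type": "MemoryPressure", "status": "False"}]}}
  ]
}
""")

if __name__ == "__main__":
    print(not_ready_nodes(SAMPLE))
```

Against a real cluster, the same function could be fed the parsed output of `kubectl get nodes -o json` instead of the sample data. Treating a missing `Ready` condition as unhealthy is a deliberately cautious choice in this sketch, not a rule imposed by Kubernetes.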
To get an idea of the building blocks on which Kubernetes is built, see the excellent tutorial Kubernetes the Hard Way (https://github.com/kelseyhightower/kubernetes-the-hard-way). For the Azure version, see Kubernetes the Hard Way – Azure Translation (https://github.com/ivanfioravanti...