Troubleshooting
In this section, we will cover the troubleshooting process in a production cluster and the logical progression of actions to take. The pod lifecycle involves multiple phases, and failures can occur at each phase. In addition, a pod's containers go through their own mini lifecycle, where init containers run to completion and then the main containers start running. Let’s see what can go wrong along the way and how to handle it.
First, let’s look at pending pods.
Handling pending pods
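When a new pod is created, Kubernetes places it in the Pending state and tries to find a node to schedule it on. Before Kubernetes 1.26, the scheduler considered every pending pod right away. Since Kubernetes 1.26, however, there is an even earlier stage: a pod can be marked as not yet ready for scheduling, in which case the scheduler doesn't attempt to place it at all.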
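To make this earlier stage concrete, here is a minimal sketch of a pod that is held back from scheduling. It assumes the PodSchedulingReadiness feature gate is enabled on the cluster (the feature is alpha in 1.26). The pod name and the gate name are hypothetical and chosen only for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gated-pod                  # hypothetical pod name
spec:
  schedulingGates:
    - name: example.com/hold       # hypothetical gate; the scheduler ignores the pod while any gate is present
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
```

As long as the schedulingGates list is non-empty, kubectl get pods should report the pod's status as SchedulingGated. Once every gate has been removed from the pod spec, the pod becomes eligible for normal scheduling.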
Let’s create a new Kubernetes 1.26 kind cluster called “trouble” and enable the pod scheduling readiness feature. Here is the configuration file (cluster-config.yaml):
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: trouble...