Summary
In this chapter, you learned about common Kubernetes failure modes and how to recover from them. We started with an example of how Kubernetes automatically detects a node failure and starts new Pods to recover the workload. After that, you scaled out your workload and the cluster ran out of resources. You recovered from that situation by starting the failed node again, which added its resources back to the cluster.
Next, you saw how PVs let you store data outside of a Pod. You shut down all Pods on the cluster and saw that the PV ensured your application lost no data. In the final example, you saw how to recover from a node failure when PVs are attached. You recovered the workload by unmounting the disk from the node and forcefully deleting the terminating Pod, which brought the workload back to a healthy state.
This chapter has explained common failure modes in Kubernetes. In the next chapter, we...