Troubleshooting Proxmox cluster problems
In this section, we will go through some common problems encountered when operating a Proxmox cluster:
Unable to start the HA-enabled VM
The fenced cluster node cannot recover automatically
The cluster member node cannot join back with the cluster
The cluster service cannot be restarted because of the DLM lockspace
Activity blocked within the cluster
Unable to start the HA-enabled VM
Symptom: When you try to start an HA-enabled VM, the operation fails with the following output:
Executing HA start for VM 201
Member vmsrv01 trying to enable pvevm:201… Aborted; service failed
TASK ERROR: command 'clusvcadm -e pvevm:201 -m vmsrv01' failed: exit code 254
Root cause: VM 201 is in the failed status, so it cannot be turned on. If you try to start the HA-enabled VM again, the operation will fail with the same output.
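To confirm the failed status from the shell, we can read the service state from clustat, the rgmanager status tool. The following is a minimal sketch, assuming clustat prints one line per service with the service name in the first column and the state in the last column; service_state is a hypothetical helper name introduced here for illustration, not a Proxmox command:

```shell
#!/bin/sh
# Hypothetical helper: print the state of one rgmanager service.
# Reads clustat-style text from stdin and assumes the service name is the
# first whitespace-separated field and the state is the last one.
service_state() {
    awk -v svc="$1" '$1 == svc { print $NF }'
}

# Usage on a cluster node (needs rgmanager's clustat):
#   clustat | service_state pvevm:201
```

If the helper prints failed for pvevm:201, the VM is in the status described above, and repeating the HA start will not help.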
Solution: The HA service with the failed status cannot be used until we re-enable...