What caused the high load average?
While we have identified what rebooted the server, we have not yet reached the root cause of the issue: we still need to figure out what caused the high load average. Unfortunately, this is exactly the kind of information that is lost during a reboot.
If the system were still experiencing a high load average, we could simply use top or ps to find out which processes are using the most CPU time. Once the system was rebooted, however, any process that was causing the high load average would have been restarted.
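As a minimal sketch of what that check could look like on a system that is still under load, the following lists the busiest processes by CPU. The top_cpu helper name is hypothetical and not part of the original walkthrough, and the --sort option assumes the procps-ng version of ps found on most Linux distributions.

```shell
# Hypothetical helper (not from the original text): show the N busiest
# processes by CPU, assuming the procps-ng ps found on most Linux systems.
top_cpu() {
    n="${1:-10}"
    # pcpu is CPU time divided by process lifetime, not an instantaneous rate.
    # The extra line passed to head accounts for the header row.
    ps -eo pid,user,stat,pcpu,pmem,comm --sort=-pcpu | head -n "$((n + 1))"
}

top_cpu 5
```

Because ps averages CPU usage over each process's lifetime, a batch-mode snapshot such as top -b -n 1 may give a better picture of what is busy right now. On Linux the load average also counts tasks in uninterruptible sleep (state D in the stat column), so a high load does not always mean high CPU usage.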
Unless these processes start causing a high load average again, we have no way of identifying the source.
$ w
 02:13:07 up 23 min,  1 user,  load average: 0.00, 0.01, 0.05
USER     TTY      LOGIN@   IDLE   JCPU   PCPU WHAT
vagrant  pts/0    01:59    3.00s  0.26s  0.10s sshd: vagrant [priv]
However, we are able to identify when the load average started to increase and how high it went. This information might be useful as we investigate further...