Employing the kernel's hung task and workqueue stall detectors
A hung task is one that has become unresponsive. Similarly, the kernel itself can, on occasion, suffer from certain types of stalls (workqueue and RCU). In this section, we examine how to leverage the kernel's detectors for these conditions, so that when one occurs, an action (such as triggering a panic or emitting a warning with stack backtraces) can be taken. The logged warnings can then help you, the developer, understand what occurred and work to fix it.
Leveraging the kernel hung task detector
When configuring the kernel via the usual make menuconfig UI, look under the Kernel hacking | Debug Oops, Lockups and Hangs menu (refer to Figure 10.8); you'll find entries labeled as follows:
[*] Detect Hung Tasks
(120) Default timeout for hung task detection (in seconds)
[ ] Panic (Reboot) On Hung Tasks
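As a quick cross-check outside the menu UI, the following minimal sketch looks up these three options in the text of a kernel config file. It assumes the entries map to the Kconfig symbols CONFIG_DETECT_HUNG_TASK, CONFIG_DEFAULT_HUNG_TASK_TIMEOUT, and CONFIG_BOOTPARAM_HUNG_TASK_PANIC; the symbol names and types can differ between kernel versions, so verify them against your tree. The hung_task_settings() helper and the SAMPLE_CONFIG text are hypothetical, written only for illustration.

```python
import re

# Kconfig symbols assumed to back the three menu entries listed above.
HUNG_TASK_SYMBOLS = (
    "CONFIG_DETECT_HUNG_TASK",
    "CONFIG_DEFAULT_HUNG_TASK_TIMEOUT",
    "CONFIG_BOOTPARAM_HUNG_TASK_PANIC",
)

_SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def hung_task_settings(config_text):
    """Return {symbol: value}; None means the symbol is unset or absent."""
    found = dict.fromkeys(HUNG_TASK_SYMBOLS)
    for line in config_text.splitlines():
        line = line.strip()
        m = _SET_RE.match(line)
        if m and m.group(1) in found:
            found[m.group(1)] = m.group(2).strip('"')
            continue
        m = _UNSET_RE.match(line)
        if m and m.group(1) in found:
            found[m.group(1)] = None
    return found


# Illustrative fragment mirroring the menu defaults shown above
# (an assumption for the example, not captured from a real system).
SAMPLE_CONFIG = """\
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
"""

if __name__ == "__main__":
    for symbol, value in hung_task_settings(SAMPLE_CONFIG).items():
        print(symbol, "=", value)
```

On a real system, you could pass in the contents of the .config file in your kernel source tree, or of a distribution-provided config file such as /boot/config-$(uname -r), where one is available.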
These are the options we discuss here. The whole idea, when the detector is enabled, is to allow...