What this book covers
Chapter 1, A General Introduction to Debugging Software, begins this journey by covering what debugging software actually entails and how it's really a mix of science and art. A few select software "horror stories" will serve to underline the importance of careful design, good (and secure) coding, and the ability to debug issues. On the more practical side, you will then set up the required workspace on your Linux system (or VM) so that you can – very importantly – work on the examples and assignments that follow later.
Chapter 2, Approaches to Kernel Debugging, covers various approaches that can be taken to perform debugging at the level of kernel code. This will give you the insight to select the best, or the most viable, approach(es) depending on your particular situation and system constraints.
Chapter 3, Debug via Instrumentation – printk and Friends, refreshes the basics of using the common kernel printk() API. Next, we go into specifics of how to leverage it for the express purpose of kernel/driver debug via the instrumentation approach. The heart of this chapter – the kernel's powerful dynamic debug framework and how you can leverage it even in production – is then covered in detail.
Chapter 4, Debug via Instrumentation – Kprobes, explains the kernel's powerful Kprobes framework, a means to – among other things – instrument the kernel and modules by hooking into pretty much any kernel or module function. Because it works even on production systems, it can prove to be a practically useful way to debug them.
Chapter 5, Debugging Kernel Memory Issues – Part 1, looks at memory bugs and corruption – a very common issue when working with a language such as C. First, you'll learn why this is so and, importantly, which types of memory issues typically arise in such systems. Next, you will learn how to tackle these memory issues head-on, using the powerful compiler-based KASAN technology, as well as the kernel's compiler-based UBSAN technology.
Chapter 6, Debugging Kernel Memory Issues – Part 2, continues the coverage of debugging kernel memory issues. We delve in depth into the details of catching common memory issues on slab (SLUB) memory and then detecting difficult kernel memory leakage bugs with kmemleak. A detailed comparison between various memory corruption issues and the appropriate tooling to detect them rounds off these two chapters.
Chapter 7, Oops! Interpreting the Kernel Bug Diagnostic, covers a key topic – what a kernel "Oops" diagnostic message really is and, very importantly, how to interpret it in depth. Along this interesting journey, you will generate a simple kernel Oops and understand exactly how to interpret it. Further, several tools and techniques to help with this task will be shown. Getting to the bottom of an Oops often helps pinpoint the root cause of the kernel bug! A few actual Oops messages will also be pointed out.
Chapter 8, Lock Debugging, looks at an integral part of writing robust kernel or driver code: locking. Unfortunately, it's quite easy to end up with errors – deadlocks and such – that are difficult to debug after the fact. This chapter skims over the basics of lock debugging and spends the bulk of its time on a really powerful modern tool that helps uncover deep locking issues (data races) – the Kernel Concurrency Sanitizer (KCSAN). Here, you'll learn how to configure the (debug) kernel for KCSAN and how to use it in detail. We round it off by delving into several actual instances of kernel bugs whose root cause is a locking issue.
Chapter 9, Tracing the Kernel Flow, introduces powerful technologies that allow you to trace the flow of kernel code in detail, at the granularity of every function call made! Usage of the primary kernel tracing infrastructure – ftrace – is covered first. You will then learn how to use powerful frontends to ftrace: trace-cmd, the KernelShark GUI, and the perf-tools collection. We wrap up this topic with an introduction to using LTTng (and visualization with the TraceCompass GUI!) to perform kernel-level tracing and analysis.
Chapter 10, Kernel Panic, Lockups, and Hangs, explains precisely what a kernel panic is and walks through the code paths executed within the kernel when it panics. More importantly, you'll learn how to write a custom kernel panic handler routine so that your code (also) runs if and when the kernel does panic. Associated topics – detecting lockups, CPU and workqueue stalls, and hangs within the kernel – are covered as well.
Chapter 11, Using Kernel GDB (KGDB), introduces the powerful KGDB kernel source-level debug framework. You will learn how to configure and set up KGDB, after which you'll see how to make practical use of it to debug kernel/module code at the source level: setting breakpoints and hardware watchpoints, leveraging GDB Python scripts, and more.
Chapter 12, A Few More Kernel Debugging Approaches, rounds off this vast topic of kernel debugging by introducing other approaches you can – and at times should – use. This includes understanding what the powerful (though resource-intensive) Kdump/crash tooling is, which can at times be a lifesaver. Then, we explain why static analysis is key and survey the available tools for analyzing Linux kernel/module/driver code. An introduction to code coverage and kernel testing frameworks follows. We round off the discussion with an introduction to logging (via journalctl), kernel assertions, and warning macros.