Writing our very first kernel module
When introducing a new programming language or topic, it has become a widely accepted computer programming tradition to mimic the original Hello, world program as the very first piece of code. I’m happy to follow this venerated tradition to introduce the Linux kernel’s powerful LKM framework. In this section, you will learn the steps to code a simple LKM. We explain the code in detail.
Introducing our Hello, world LKM C code
Without further ado, here is some simple Hello, world C code, implemented to abide by the Linux kernel’s LKM framework:
For reasons of readability and space constraints, only the key parts of most source code are displayed here. To view the complete source code (with all comments), build it, and run it, the entire source tree for this book is available in its GitHub repository here: https://github.com/PacktPublishing/Linux-Kernel-Programming_2E. We definitely expect you to clone it and use it:
git clone https://github.com/PacktPublishing/Linux-Kernel-Programming_2E.git
$ cat <LKP2E_src>/ch4/helloworld_lkm/helloworld_lkm.c
// ch4/helloworld_lkm/helloworld_lkm.c
#include <linux/init.h>
#include <linux/module.h>
/* Module stuff */
MODULE_AUTHOR("<insert your name here>");
MODULE_DESCRIPTION("LKP2E book:ch4/helloworld_lkm: hello, world, our first LKM");
MODULE_LICENSE("Dual MIT/GPL");
MODULE_VERSION("0.2");
static int __init helloworld_lkm_init(void)
{
	printk(KERN_INFO "Hello, world\n");
	return 0; /* success */
}
static void __exit helloworld_lkm_exit(void)
{
	printk(KERN_INFO "Goodbye, world! Climate change has done us in...\n");
}
module_init(helloworld_lkm_init);
module_exit(helloworld_lkm_exit);
You can try out this simple Hello, world kernel module right away! Just cd to the correct source directory and use our helper lkm script to build and run it:
$ cd ~/Linux-Kernel-Programming_2E/ch4/helloworld_lkm
$ ../../lkm helloworld_lkm.c
Usage: lkm name-of-kernel-module-file ONLY (do NOT put any extension).
$ ../../lkm helloworld_lkm
Version info:
Distro: Ubuntu 22.04.2 LTS
Kernel: 5.19.0-45-generic
[ … ]
make || exit 1
------------------------------
make -C /lib/modules/5.19.0-45-generic/build/ M=/home/c2kp/Linux-Kernel-Programming_2E/ch4/helloworld_lkm modules
[ … ]
sudo dmesg
------------------------------
[ 4123.028252] Hello, world
$
The hows and whys will be explained in a lot of detail shortly. Though tiny, the code of this, our very first kernel module, requires careful perusal and understanding. Do read on!
Breaking it down
The following subsections explain pretty much each line of the preceding Hello, world C LKM code. Remember that although the program appears very small and trivial, there is a lot to be understood regarding it and the surrounding LKM framework. The rest of this chapter focuses on this and goes into detail. I highly recommend that you take the time to read through and understand these fundamentals first. This will help you immensely in later, possibly difficult-to-debug situations.
Kernel headers
The first thing our code does is use #include to pull in a few header files. Unlike in user space C application development, these are kernel headers (as mentioned in the Technical requirements section). Recall from Chapter 3, Building the 6.x Linux Kernel from Source – Part 2, that kernel modules were installed under a specific root-writable location, /lib/modules/$(uname -r)/.
Let’s check it out again (here, we’re running on our guest x86_64 Ubuntu VM with the 5.19.0-45-generic distro kernel):
$ ls -l /lib/modules/$(uname -r)
total 6712
lrwxrwxrwx 1 root root 40 Jun 7 19:53 build -> /usr/src/linux-headers-5.19.0-45-generic
drwxr-xr-x 2 root root 4096 Jun 23 09:47 initrd
[ … ]
Notice the symbolic (soft) link named build. It points to the location of the kernel headers on the system; in the preceding output, that is the directory /usr/src/linux-headers-5.19.0-45-generic/.
As you will see, we will supply this information to the Makefile used to build our kernel module. (Some systems also have a similar soft link called source.)
The kernel-headers or linux-headers package unpacks a limited kernel source tree onto the system (typically under /usr/src/…). This kernel code base isn’t complete, hence our use of the phrase limited source tree: building modules doesn’t require the complete kernel source tree, so only the required components (the headers, the Makefiles, and so on) are packaged and extracted.
The first line of code in our Hello, world kernel module is #include <linux/init.h>.
The compiler resolves this line by searching for the header file under /lib/modules/$(uname -r)/build/include/. Thus, by following the build soft link, we can see that it ultimately picks up this header file:
$ ls -l /usr/src/linux-headers-5.19.0-45-generic/include/linux/init.h
-rw-r--r-- 1 root root 11963 Aug 1 2022 /usr/src/linux-headers-5.19.0-45-generic/include/linux/init.h
The same follows for the other kernel headers included in the kernel module’s source code.
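As a rough user-space illustration of how that search path is formed, the sketch below builds the string that /lib/modules/$(uname -r)/build/include/<header> expands to, using the uname(2) system call. The helper name kernel_header_path() is made up for this example; it is not part of the kernel, glibc, or the book’s source tree.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: writes the expected location of a kernel header
 * (for example "linux/init.h") under the running kernel's 'build' link.
 * Returns 0 on success, or -1 if uname(2) fails or the buffer is too small.
 */
int kernel_header_path(const char *header, char *buf, size_t len)
{
	struct utsname u;
	int n;

	assert(header != NULL && buf != NULL && len > 0);
	if (uname(&u) < 0)
		return -1;
	/* u.release holds the same string that 'uname -r' prints */
	n = snprintf(buf, len, "/lib/modules/%s/build/include/%s",
		     u.release, header);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling kernel_header_path("linux/init.h", buf, sizeof(buf)) fills buf with the path that the build soft link ultimately resolves; whether the file actually exists there depends on the headers package being installed.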
Module macros
Next, we have a few module macros of the form MODULE_FOO() (colloquially called “module stuff”). Most are quite intuitive:
- MODULE_AUTHOR(): Specifies the author(s) of the kernel module
- MODULE_DESCRIPTION(): Briefly describes the function or purpose of this LKM
- MODULE_LICENSE(): Specifies the license(s) under which this kernel module is released
- MODULE_VERSION(): Specifies the (local) version string of the kernel module
In the absence of the source code, how will this information be conveyed to the end user (or customer)? The modinfo utility does precisely that! These macros and their information might seem trivial, but they are important in projects and products.
This information is relied upon, for example, by a vendor establishing the (open source) licenses that code is running under, by using grep on the modinfo output of all installed kernel modules. (These are the basic module macros; there are more that we shall cover as we go along.)
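One way to picture what modinfo reads: the module macros record their arguments as plain tag=value strings inside the module binary (in an ELF section named .modinfo). The user-space sketch below only mimics that lookup over a small in-memory table; demo_modinfo[] and demo_modinfo_get() are invented for illustration, and the real utility parses the .ko file instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for the "tag=value" strings a module carries */
static const char *const demo_modinfo[] = {
	"author=<insert your name here>",
	"description=LKP2E book:ch4/helloworld_lkm: hello, world, our first LKM",
	"license=Dual MIT/GPL",
	"version=0.2",
};

/* Returns the value stored for 'tag', or NULL if the tag is absent */
const char *demo_modinfo_get(const char *tag)
{
	size_t i, taglen;

	assert(tag != NULL);
	taglen = strlen(tag);
	for (i = 0; i < sizeof(demo_modinfo) / sizeof(demo_modinfo[0]); i++) {
		const char *entry = demo_modinfo[i];

		/* match "tag=" exactly, so "ver" does not match "version" */
		if (strncmp(entry, tag, taglen) == 0 && entry[taglen] == '=')
			return entry + taglen + 1;
	}
	return NULL;
}
```

A license audit of the kind described above then amounts to asking for the license tag of every installed module and comparing the returned strings.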
Entry and exit points
Never forget, kernel modules are, after all, kernel code running with kernel privilege. A kernel module is not an application and thus does not have the familiar main() function (that we know well and love) as its entry point. This, of course, raises the question: what are the entry and exit points of the kernel module? Notice, at the bottom of our simple kernel module, the following lines:
module_init(helloworld_lkm_init);
module_exit(helloworld_lkm_exit);
module_init() and module_exit() are macros specifying the entry and exit points, respectively. The parameter to each is a function pointer; in C, a function’s name by itself evaluates to a pointer to that function, so we can just specify the name of the function. Thus, in our code, the following applies:
- The helloworld_lkm_init() function is the entry point.
- The helloworld_lkm_exit() function is the exit point.
You can almost think of these entry and exit points as a constructor/destructor pair for a kernel module. Technically, of course, that’s not the case: this isn’t object-oriented C++ code, it’s plain C. Nevertheless, it’s perhaps a useful analogy.
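To make the constructor/destructor analogy a little more concrete, here is a user-space sketch in which a pair of function pointers is registered and then driven by a pretend loader. Everything in it (struct demo_module, demo_load(), demo_unload(), and the hello_* functions) is hypothetical; the kernel’s actual module loader is far more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for what module_init()/module_exit() register */
struct demo_module {
	int (*init)(void);	/* entry point: returns 0 or a negative errno */
	void (*exit)(void);	/* exit point: NULL if none was registered */
	int loaded;
};

/* Mimics insertion: run the entry point; keep the module only on success */
int demo_load(struct demo_module *m)
{
	int ret;

	assert(m != NULL && m->init != NULL);
	ret = m->init();
	m->loaded = (ret == 0);
	return ret;
}

/* Mimics removal: refuse when not loaded or when there is no exit point */
int demo_unload(struct demo_module *m)
{
	assert(m != NULL);
	if (!m->loaded || m->exit == NULL)
		return -1;
	m->exit();
	m->loaded = 0;
	return 0;
}

static int demo_calls;	/* counts init calls not yet undone by exit */

static int hello_init(void)
{
	demo_calls++;
	return 0;	/* success */
}

static void hello_exit(void)
{
	demo_calls--;
}

/* Passing the bare function names is enough: they decay to pointers */
struct demo_module hello = { .init = hello_init, .exit = hello_exit };
```

Note how demo_unload() has nothing to call when no exit function was registered; this sketch simply reports failure in that case.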
Return values
Notice that the signatures of the init and exit functions are as follows:
static int __init <modulename>_init(void);
static void __exit <modulename>_exit(void);
As good coding practice, we have named the functions <modulename>_{init|exit}(), where <modulename> is replaced with the name of the kernel module. This naming is merely a convention; technically speaking, it’s unnecessary, but it is intuitive and thus helpful (remember, we humans must write code for humans to read and understand, not machines). Clearly, neither routine receives any parameter.
Marking both functions with the static qualifier makes them private to this kernel module. That is what we want.
Now let’s move along to the important convention that is followed for the return value of a kernel module’s init function.
The 0/-E return convention
The kernel module’s init function returns an integer, a value of type int; this is a key aspect. The Linux kernel has evolved a style or convention, if you will, with regard to returning values from it (meaning from kernel space, where the module resides and runs, to the user space process).
To return a value, the LKM framework follows what is colloquially referred to as the 0/-E convention:
- Upon success, return the integer value 0.
- Upon failure, return the negative of the value you would like the user space global uninitialized integer errno to be set to.
Be aware that errno is a global integer residing in a user process VAS within its uninitialized data segment (with modern multithreaded glibc, each thread actually gets its own errno). With very few exceptions, whenever a Linux system call fails, -1 is returned and errno is set to a positive value representing the failure code or diagnostic; this work is carried out by glibc “glue” code on the syscall return path.
Furthermore, the errno value is an index into a global table of English error messages (const char * sys_errlist[]); this is how routines such as perror(3), strerror[_r](3), and the like can print out failure diagnostics.
By the way, you can look up the complete list of error (errno) codes available to you within these (kernel source tree) header files: include/uapi/asm-generic/errno-base.h and include/uapi/asm-generic/errno.h.
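As a small user-space illustration of this lookup, the sketch below takes a kernel-style negative return value, flips its sign to obtain the errno code, and asks strerror(3) for the matching message. The helper name demo_describe_error() is ours, purely for illustration; the exact message text depends on the C library and locale in use.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats a kernel-style negative return value
 * (such as -ENOMEM) together with its human-readable description.
 * Returns what snprintf() returns: the length of the formatted text.
 */
int demo_describe_error(long kret, char *buf, size_t len)
{
	int code;

	assert(kret < 0 && buf != NULL && len > 0);
	code = (int)-kret;	/* kernel-style values are negated errno codes */
	return snprintf(buf, len, "errno %d: %s", code, strerror(code));
}
```

For instance, demo_describe_error(-ENOMEM, buf, sizeof(buf)) leaves the numeric code followed by the library’s out-of-memory message in buf.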
A quick example of how to return from a kernel module’s init function will help make this key point clear. Say our kernel module’s init function is attempting to dynamically allocate some kernel memory (details on the kmalloc() API and so on will be covered in later chapters, of course; please ignore them for now). Then, we could code it like so:
[...]
ptr = kmalloc(87, GFP_KERNEL);
if (!ptr) {
	/* use pr_warn(); the older pr_warning() alias is gone from recent kernels */
	pr_warn("%s:%s():%d: kmalloc failed! Out of memory\n", __FILE__, __func__, __LINE__);
	return -ENOMEM;
}
[...]
return 0; /* success */
If the memory allocation does fail (very unlikely, but hey, it can happen on a bad day!), we do the following:
- First, we emit a warning printk (don’t worry, we’ll cover these syntax details, and much more on printk, later). In this particular case, being “out of memory”, it’s considered pedantic and unnecessary to emit a message. The kernel will certainly emit sufficient diagnostic information if a kernel-space memory allocation ever fails! See this link for more details: https://lkml.org/lkml/2014/6/10/382; we do so here merely as it’s early in the discussion and for reader continuity.
- Return the integer value -ENOMEM:
  - The layer to which this value will be returned in user space is actually glibc; it has some “glue” code that multiplies this value by -1 and sets the global integer errno to it.
  - Now, the [f]init_module() system call will return -1, indicating failure (this is because insmod actually invokes the finit_module() (or, earlier, the init_module()) system call, as you will soon see).
  - errno will be set to ENOMEM, reflecting the fact that the kernel module insertion failed due to a failure to allocate memory.
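The chain just described can be modeled in plain user-space C. In this sketch, a mock init routine returns -ENOMEM when its allocator fails, and a mock wrapper plays the role of the glue code by turning that value into a -1 return plus a positive errno. All of the names here (demo_alloc_fn, demo_failing_alloc(), demo_init(), demo_insert_module()) are hypothetical, and the allocator passed in is assumed to be malloc-compatible.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocator signature, so a caller can pass one that always fails */
typedef void *(*demo_alloc_fn)(size_t);

void *demo_failing_alloc(size_t n)
{
	(void)n;
	return NULL;	/* pretend we are out of memory */
}

/* Hypothetical init routine following the 0/-E convention */
int demo_init(demo_alloc_fn alloc)
{
	void *ptr;

	assert(alloc != NULL);
	ptr = alloc(87);
	if (!ptr)
		return -ENOMEM;	/* failure: negative errno value */
	free(ptr);		/* a real module would keep and use the memory */
	return 0;		/* success */
}

/* Hypothetical model of the user-space glue on the system call return path */
int demo_insert_module(demo_alloc_fn alloc)
{
	int kret = demo_init(alloc);

	if (kret < 0) {
		errno = -kret;	/* -ENOMEM becomes ENOMEM */
		return -1;	/* the caller sees -1 and consults errno */
	}
	return 0;
}
```

With demo_failing_alloc passed in, the caller sees -1 with errno set to ENOMEM; with malloc passed in, it sees 0.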
Conversely, the framework expects the init function to return the value 0 upon success. In fact, in older kernel versions, failure to return 0 upon success would cause the kernel module to be abruptly and immediately unloaded from kernel memory. Nowadays, if the init function returns a positive value, this removal of the kernel module does not happen; instead, the kernel emits a warning message regarding the fact that a suspicious non-zero value has been returned (a negative value, of course, still causes the insertion to fail). Moreover, modern compilers typically catch the fact that you aren’t returning a value when expected to, triggering an error message similar to this: error: no return statement in function returning non-void [-Werror=return-type].
There’s not much to be said for the cleanup routine. It receives no parameters and returns nothing (void). Its job is to perform any and all required cleanup (freeing memory objects, setting certain registers, perhaps, and so on, depending on what the module’s designed to do) before the kernel module is unloaded from kernel memory.
Not including the module_exit() macro in your kernel module makes it impossible to ever unload it (short of a system shutdown or reboot, of course). Interesting... I suggest you try this out as a small exercise! Of course, it’s never that simple: this behavior, which prevents the module from unloading, is guaranteed only if the kernel is built with the CONFIG_MODULE_FORCE_UNLOAD option disabled (the default).
The ERR_PTR and PTR_ERR macros
On the subject of return values, you now understand that the kernel module’s init routine must return an integer. What if you wish to return a pointer instead? The ERR_PTR() inline function comes to our rescue, allowing us to return an integer disguised as a pointer simply by typecasting it as void *.
It gets better: you can encode a negative error value into a pointer via the ERR_PTR() inline function, check for an error using the IS_ERR() inline function (which really just figures out whether the value is in the range [-1 to -4095]), and retrieve the value from the pointer using the converse routine PTR_ERR().
As a simple example, see the callee code given here. This time, we have the (sample) function myfunc() return a pointer (to a structure named mystruct) and not an integer:
struct mystruct *myfunc(void)
{
	struct mystruct *mys = NULL;
	mys = kzalloc(sizeof(struct mystruct), GFP_KERNEL);
	if (!mys)
		return ERR_PTR(-ENOMEM);
	[...]
	return mys;
}
The caller code is as follows:
[...]
	retp = myfunc();
	if (IS_ERR(retp)) {
		pr_warn("myfunc() mystruct alloc failed, aborting...\n");
		stat = PTR_ERR(retp); /* sets 'stat' to the value -ENOMEM */
		goto out_fail_1;
	}
	[...]
out_fail_1:
	return stat;
}
FYI, the inline ERR_PTR(), PTR_ERR(), and IS_ERR() functions all live within the (kernel header) include/linux/err.h file. One example of usage for these functions is here: https://elixir.bootlin.com/linux/v6.1.25/source/arch/x86/kernel/cpu/sgx/ioctl.c#L269 (and the ERR_PTR() on line 31).
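To see why the trick works, it can help to play with a user-space imitation. The sketch below re-creates the three helpers under different names (demo_err_ptr(), demo_ptr_err(), demo_is_err()) so that it can be compiled as an ordinary program; it is modeled on the idea described above and is not a copy of the kernel header. The struct demo_item type and demo_make_item() are likewise made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095	/* error values occupy the range [-4095, -1] */

/* Encode a negative errno value as a pointer */
static inline void *demo_err_ptr(long error)
{
	assert(error < 0 && error >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from such a pointer */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer value lies in the topmost 4095 addresses */
static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_item {
	int value;
};

/* Returns a valid object, or an encoded error when 'fail' is non-zero */
struct demo_item *demo_make_item(int fail)
{
	struct demo_item *it;

	if (fail)
		return demo_err_ptr(-ENOMEM);
	it = calloc(1, sizeof(*it));
	return it ? it : demo_err_ptr(-ENOMEM);
}
```

The encoding relies on the fact that no valid object lives at the very top of the address space, so those few thousand pointer values are free to carry error codes.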
The __init and __exit keywords
Recall our simple module’s init and cleanup functions:
static int __init helloworld_lkm_init(void)
{
[ … ]
static void __exit helloworld_lkm_exit(void)
{
[ … ]
A niggling leftover: what exactly are the __init and __exit macros we see within the preceding function signatures? They merely specify memory-optimization linker attributes.
The __init macro places the code in a .init.text section. Similarly, any data declared with the __initdata attribute goes into a .init.data section. The whole point here is that the code and data of the init function are used exactly once, during initialization.
Once it’s invoked, it will never be called again; so, once called, all the code and data in these init sections are freed up (via free_initmem()).
The deal is similar with the __exit macro, though, of course, this only makes sense with kernel modules. Once the cleanup function is called, all the memory is freed. If the code were instead part of the static kernel image (or if module support were disabled), this macro would have no effect.
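The underlying mechanism, asking the toolchain to place a function in a named section, can be tried out in user space too. The sketch below assumes GCC or Clang targeting ELF with a GNU-style linker, which defines __start_<name> and __stop_<name> symbols for sections whose names are valid C identifiers. The section name demo_init_text, the demo_init_attr macro, and both functions are hypothetical; unlike the kernel, nothing here frees the section afterward.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical analogue of __init: place code in a named ELF section */
#define demo_init_attr \
	__attribute__((section("demo_init_text"), used, noinline))

/* Section boundary symbols supplied by the linker (an ELF/GNU ld assumption) */
extern const char __start_demo_init_text[];
extern const char __stop_demo_init_text[];

static int demo_init_attr demo_setup(void)
{
	return 0;	/* one-time initialization would go here */
}

/* Returns 1 if demo_setup() was emitted inside the demo_init_text section */
int demo_setup_in_init_section(void)
{
	uintptr_t addr = (uintptr_t)demo_setup;
	uintptr_t start = (uintptr_t)__start_demo_init_text;
	uintptr_t stop = (uintptr_t)__stop_demo_init_text;

	assert(start < stop);	/* the section must not be empty */
	return addr >= start && addr < stop;
}
```

Because everything tagged this way ends up contiguous in one section, releasing all of it in a single step after initialization becomes straightforward, which is the optimization the kernel exploits.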
Fine, but so far, we have still not explained some practicalities: how exactly can you build your new kernel module, get it into kernel memory and have it execute, and then unload it, plus several other operations you might wish to perform? Let’s discuss these in the following section.