Linux kernel debugging

Rebuilding a single module

Configuring and compiling the entire Linux kernel is best avoided, so if you only need to make changes to some modules and have a computer that you don't mind crashing, you can be more surgical. Download and extract the tarball from the Kernel Archives that corresponds to the version that's currently running. You may want to look for any patches that have been applied to your specific kernel, likely by your distribution.
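
To pick the matching tarball, it helps to strip the distribution's suffix from the running release string. The following is a minimal sketch: the helper name upstream_version is made up for illustration, and it assumes that everything after the first hyphen in the output of uname -r is distribution-specific, which holds for Arch-style strings such as 4.14.11-1-ARCH but not necessarily elsewhere.

```shell
# Hypothetical helper: print the upstream part of a kernel release string.
# With no argument, it inspects the currently running kernel.
upstream_version() {
    release=${1:-$(uname -r)}
    # Drop the first hyphen and everything after it.
    printf '%s\n' "${release%%-*}"
}

# Usage:
#   upstream_version                  # running kernel
#   upstream_version 4.14.11-1-ARCH   # prints 4.14.11
```

Note that the Kernel Archives name the first release of a series without the trailing ".0", so a result of 6.7.0 would correspond to linux-6.7.tar.xz.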

Once inside the source directory, you'll need to collect some files:

cp /proc/config.gz .
gunzip config.gz
mv config .config
cp /lib/modules/"$(uname -r)"/build/Module.symvers .
cp /lib/modules/"$(uname -r)"/build/localversion.* .

You might need to install a "dev" or "headers" package (e.g.) for the build directory to exist. Then you can make the necessary preparations:

make oldconfig
make prepare
make modules_prepare

Hopefully, that's enough to build a module:

make M=path/to/dir modules

Once it's built, you can cross your fingers and insert it into the kernel:

insmod path/to/dir/module.ko

To make sure that any dependencies are loaded, you might first need to use modprobe on the original version of the module:

modprobe module
rmmod module

Setting up a VM

If you're unable to just reload some modules in situ, and need to run an entire custom kernel, it makes sense to do so in a VM. I found it straightforward to get Arch Linux going inside a VirtualBox VM, so that's assumed here, but anything will do.

Once the initial setup is done, you'll want to install the base-devel group, which includes (among other useful things) gcc, make, and sudo. Then you can set up networking and SSH, turn on port forwarding, and run the VM headless with VBoxManage startvm.

If your terminal is wonky (e.g. backspace doesn't render correctly), setting TERM ought to fix that right up. Ideally, you would set it to match your terminal, but this might require installing the appropriate terminfo package; if you're in a hurry,

export TERM=xterm

will probably work.

Snapshots

The VBoxManage snapshot command is very handy for restoring the VM to a previous working state. For example, take a snapshot before rebooting into a new kernel, in case it won't boot.

Building the whole kernel

We'll build the Linux kernel the same way that the one in the linux package is built. For this, we need the PKGBUILD and associated files. The simplest way to obtain them is with the asp utility:

asp export core/linux

Before we proceed to configuring the kernel, the PKGBUILD needs some adjustment. There are some very large dependencies which are only used for building the documentation, but we don't need it, so there's no reason to keep those dependencies or the _package-docs function. While we're at it, the _package-headers function can also go.

The kernel source is downloaded from a GitHub repo, which is extremely wasteful if we only want a single version. In that case, it's better to put a tarball from either GitHub (e.g.) or the Kernel Archives (e.g.) into the source array. (You may need to update the cd commands in the PKGBUILD.) If you're downloading the vanilla upstream source, you can apply any Arch-specific patches in the prepare function. The patches can be found by looking at the commits for a tag (e.g.) and downloaded by appending ".patch" to the URL (e.g.). There's already code to automatically apply all the patches found in the source array.

Now we can run

makepkg -so

to prepare the sources in the src directory.

If you've ever built the kernel from source, you'll know that configuring it is half the fun. Thus, we're going to skip as much manual configuration as possible. The other half of the fun is waiting for the kernel to build, which means we're going to use a minimal configuration to minimize the build time. To do this, you should modprobe all the modules you'll want to use with the new kernel. Then

make localmodconfig

will disable any modules in the configuration which are not currently loaded. I've found that localmodconfig doesn't always work perfectly, but when it does work, it's extremely helpful. It may be tempting to first make olddefconfig to bring the .config file up to date, but this seems to reliably break localmodconfig. If you find yourself needing to tweak some kernel settings,

make menuconfig

is your friend.

Then build and install the new kernel with

makepkg -efi

This will take a while the first time. Run it each time you make a change, and then reboot the VM to boot from the updated kernel. Subsequent builds should take only a fraction of the time.

If you run into "Error 137" while linking the kernel, your VM might need more RAM:

  LD      vmlinux.o
make[1]: *** [scripts/Makefile.vmlinux_o:68: vmlinux.o] Error 137

An exit status of 137 = 128 + 9 signifies that the process was killed by signal 9 (SIGKILL), which is the signal the kernel's out-of-memory killer sends.
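
As a rough illustration of that arithmetic, shells such as bash and dash report a child killed by signal N with exit status 128 + N. The sketch below (signal_of_status is a hypothetical helper, not a standard command) reverses the calculation and demonstrates it on a child shell that kills itself with SIGKILL:

```shell
# Hypothetical helper: print the signal number encoded in an exit status,
# or fail if the status does not indicate death by signal.
signal_of_status() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    else
        return 1
    fi
}

# Demonstration: the child shell sends SIGKILL to itself.
rc=0
sh -c 'kill -KILL $$' || rc=$?

# kill -l translates an exit status (or signal number) into a signal name.
echo "status $rc: signal $(signal_of_status "$rc") ($(kill -l "$rc"))"
```

If the signal turns out to be 9, the VM's kernel log (dmesg) is worth checking for out-of-memory killer messages.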

Starting from userspace

To talk to the kernel, you'll need an entry point from userspace. You can use strace (from the strace package) for this.

For example,

strace ls /test/path

might result in

...
open("/test/path", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 EACCES (Permission denied)
...
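
Real traces are long, so it can help to filter for the calls that failed. A small sketch, assuming strace's usual line format in which a failed call ends with "= -1 ERRNO (description)"; failed_syscalls is a name invented here:

```shell
# Hypothetical filter: keep only the lines of a trace that report a failed syscall.
failed_syscalls() {
    grep -E '= -1 E[A-Z0-9]+ \(.*\)$'
}

# Usage (strace's -o option writes the trace to a file instead of stderr):
#   strace -o trace.log ls /test/path
#   failed_syscalls < trace.log
```

Many failures in a trace are harmless (for example, probing for optional files), so the interesting line is usually the last failure before the program gives up.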

Then you could write a wrapper around this failing call to try to understand the problem better:

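#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s PATH\n", argv[0]);
        return 2;
    }

    /* Same flags as the failing call in the strace output. */
    int result = open(argv[1], O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC);
    /* Save errno right away; it is only meaningful if open() failed. */
    int errsv = (result == -1) ? errno : 0;

    printf("%d %d %s\n", result, errsv, strerror(errsv));

    return 0;
}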

Making the kernel speak

The kernel is in a privileged position, making it more challenging to debug than programs in userspace. After all, how do you step forward in the kernel if the kernel itself is paused and can't pass your key presses to the debugger process? There are out-of-band solutions for this, but it's often easier to start by printing useful data.

printk

The bluntest tool in the toolbox is printk. It's a lot like printf, but the format string is prefixed by a log level:

printk(KERN_WARNING "%s %s\n", name, root);

This will cause the message to be written to the kernel log buffer, which you can read using dmesg.

dump_stack

Often it's not clear where a piece of kernel code is being called from. In that case, calling

dump_stack();

will output an entire stack trace, like

dump_stack+0x63/0x83
ovl_dir_open+0x37/0x120 [overlay]
do_dentry_open+0x205/0x2e0
? ovl_dir_fsync+0x140/0x140 [overlay]
vfs_open+0x4c/0x70
path_openat+0x282/0x1170
? unlock_page_memcg+0x29/0x60
? page_add_file_rmap+0x5b/0x140
? filemap_map_pages+0x233/0x410
do_filp_open+0x91/0x100
? __alloc_fd+0xc9/0x180
do_sys_open+0x147/0x210
SyS_open+0x1e/0x20
entry_SYSCALL_64_fastpath+0x1a/0xa4

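The text in square brackets indicates the module containing the function. According to a Stack Overflow answer, a leading question mark indicates that the entry is unreliable: the address was found on the stack, but it may be left over from an earlier call rather than part of the current call chain.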
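
Each entry has the form symbol+offset/size: the offset is how far into the function the address lies, and the size is the length of the function, both in hexadecimal. As an illustration, the hypothetical shell helper parse_frame below splits an entry into its parts; labelling entries without a module as "built-in" is this sketch's own convention.

```shell
# Hypothetical helper: split one stack-trace entry of the form
#   [? ]symbol+0xOFFSET/0xSIZE[ [module]]
parse_frame() {
    entry=$1
    reliable=yes
    module=built-in

    # A leading "? " marks an unreliable entry.
    case $entry in
        '? '*) reliable=no; entry=${entry#'? '} ;;
    esac

    # A trailing " [name]" names the module containing the function.
    case $entry in
        *' ['*']')
            module=${entry##*' ['}
            module=${module%']'}
            entry=${entry%' ['*}
            ;;
    esac

    symbol=${entry%%+*}   # text before the "+"
    rest=${entry#*+}      # "0xOFFSET/0xSIZE"
    printf 'symbol=%s offset=%s size=%s module=%s reliable=%s\n' \
        "$symbol" "${rest%%/*}" "${rest#*/}" "$module" "$reliable"
}

# Usage:
#   parse_frame 'ovl_dir_open+0x37/0x120 [overlay]'
#   prints: symbol=ovl_dir_open offset=0x37 size=0x120 module=overlay reliable=yes
```

The symbol+offset/size form is also what scripts/faddr2line in the kernel source tree takes as input to map an entry to a source line, given an object file built with debug information.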

Dynamic debug

The kernel has a feature called "dynamic debug" for runtime control of some debug-specific behaviour. In the simplest case (without debugfs), you can enable debug printing for an entire module:

echo 'module overlay +p' >/proc/dynamic_debug/control
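
The same control file accepts narrower queries and can switch printing off again. A minimal sketch, assuming the kernel was built with CONFIG_DYNAMIC_DEBUG and that you are root; ddebug and the DDEBUG_CONTROL variable are names made up for this example, and the func query uses ovl_dir_open purely as an illustration:

```shell
# Hypothetical wrapper: send one query to the dynamic debug control file.
# Set DDEBUG_CONTROL to override the path, e.g. to
# /sys/kernel/debug/dynamic_debug/control when debugfs is mounted.
ddebug() {
    printf '%s\n' "$*" > "${DDEBUG_CONTROL:-/proc/dynamic_debug/control}"
}

# Usage:
#   ddebug 'module overlay +p'       # enable every debug call site in a module
#   ddebug 'func ovl_dir_open +p'    # enable only the call sites in one function
#   ddebug 'module overlay -p'       # disable them again
```

Only call sites written with pr_debug() or dev_dbg() are affected, and the enabled messages show up in dmesg alongside ordinary printk output.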

Finding symbols

To find where a symbol is defined, the Elixir tool from Bootlin is a blessing. For example, we can see that in Linux 4.14.11, ovl_dir_fsync is a function defined in /fs/overlayfs/readdir.c.

Memory allocation

Sometimes you'll need to allocate more memory, for example to call dentry_path_raw. This can be done using __get_free_page. You should then free the page using free_page so you don't leak memory. For example,

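char *buf = (char *)__get_free_page(GFP_USER);
if (buf) { /* __get_free_page returns 0 on failure */
    /* The path is built at the end of buf; p points into buf. */
    char *p = dentry_path_raw(filp->f_path.dentry, buf, PAGE_SIZE);
    if (!IS_ERR(p))
        printk(KERN_WARNING "path: %s\n", p);
    free_page((unsigned long)buf);
}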