How is the Linux kernel tested?
Have you ever wondered how the Linux kernel is tested? How do you maintain the quality of an open source project with millions of lines of code, developed by thousands of programmers worldwide?
It is not an easy task. But that doesn’t mean it is not possible!
A new Linux kernel version is released approximately every 3 months (10 to 12 weeks). The first 2 weeks of this period are reserved for the “merge window”, when everything that has already been developed and approved is merged into Linus Torvalds’ tree (mainline). The other 8 to 10 weeks are reserved for bug fixes and stabilization.
During the stabilization phase, new release candidate versions come out (usually every week). When Linus Torvalds “feels” that the code is “stable enough” and the number of changes in the latest release candidates has decreased, a new kernel version is released.
When I say that a new version is released when Linus Torvalds “feels” that the code is “stable enough”, I mean that the Linux kernel development process has no very formal criteria for releasing a new version. There is no single magical tool to test the kernel against, nor a formal test plan to ensure that the new kernel version works properly. The decision to release is really based on the feeling of Linux’s benevolent dictator for life, Linus Torvalds.
But that doesn’t mean that tests are not done on the Linux kernel. Quite the opposite!
It starts with kernel users and developers, who are encouraged to test development and production trees and report bugs on mailing lists and Bugzilla, the bug tracking tool (un)used by the Linux kernel community. Documentation on how to report bugs is available in the Linux kernel source code.
Linus Torvalds himself encourages these tests when he releases a new version of the kernel by (sometimes) ending the release email with the message “Go out and test!”.
The kernel development process itself also shows a strong focus on testing: only around 17% to 20% of each cycle (2 weeks) is reserved for code integration, while the remaining 80% or so (8 to 10 weeks) is focused on testing and bug fixes.
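As a quick check of these proportions, here is a small Python sketch of the arithmetic. It uses only the figures quoted in this article (a 2-week merge window inside a 10 to 12 week cycle); the function name is just an illustrative choice.

```python
def merge_window_share(cycle_weeks, merge_weeks=2):
    """Return (merge window %, stabilization %) for one release cycle."""
    merge = 100 * merge_weeks / cycle_weeks
    return round(merge, 1), round(100 - merge, 1)


# Cycle lengths taken from the "10 to 12 weeks" range mentioned in the text.
for weeks in (10, 11, 12):
    merge, stabilization = merge_window_share(weeks)
    print(f"{weeks}-week cycle: {merge}% merge window, {stabilization}% stabilization")
```

So the merge window takes between roughly 17% and 20% of a cycle, depending on how long the cycle runs.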
But to ensure that the kernel is released with good quality, we do not depend solely on the initiative of the developers and users. Different kinds of tools can help in the process, including static code analysis, test automation, and continuous integration tools.
So let’s talk about the most prominent tools that help find bugs and test the Linux kernel.
Static code analysis
Better than hunting for bugs in a running program is finding and fixing them before the program ever runs, right? This is the job of static analysis tools.
Static analysis tools are able to analyze the source code to find problems without running the program. If you want to better understand how these tools work, see the article “Bug hunting with static analysis tools”.
Some static analysis tools focus specifically on the Linux kernel and help developers identify and fix problems during development.
Sparse is a tool originally written by Linus Torvalds and integrated with the Linux source code, designed to find possible coding faults in the kernel. Using annotations in the source code, the tool performs several semantic checks. For example, it will report an error if kernel code tries to directly dereference a user space pointer (marked with __user). Information about this tool is available in the kernel documentation.
Smatch is a static analysis tool developed with a focus on the Linux kernel. It is able to identify programming errors such as null pointer dereferences, buffer overflows, use of freed memory, deadlocks, use of uninitialized variables, and many more. According to this LWN post, more than 3,000 bugs have already been identified and fixed with Smatch. More information about this tool is available on the project’s website.
Coccinelle is a very interesting tool capable of identifying patterns in the source code and making changes automatically. For example, let’s assume that you need to change the signature of a kernel API function used by hundreds of device drivers. Instead of opening the source code of each driver and making the change manually, you can use Coccinelle to automate the process. Coccinelle has also identified hundreds of bugs in the Linux kernel. For example, this patch was automatically generated by the tool to fix a memory leak in the kernel. More information, including a list of bugs found in the Linux kernel, is available on the project’s website.
In addition to static analysis tools, several test automation tools are currently used to identify bugs and regressions in the Linux kernel.
Test automation
Test automation tools can help programmers a lot during development, avoiding the execution of repetitive tasks, automatically identifying possible errors and regressions in the code, improving software quality and saving developers’ time.
ktest is a Perl script (ktest.pl) available in the Linux source code at tools/testing/ktest/, capable of automating the process of compiling, deploying and testing a kernel image. The tool receives a kernel configuration file, builds the kernel from it, then deploys the generated image on a remote machine and performs some type of automated test defined by the user, such as waiting for a login message on the serial console. All the build, deploy and test steps are configurable. It is a very useful tool to automate kernel testing. This message on LKML can serve as a reference for anyone interested in learning more about the tool.
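To give an idea of what a check like “wait for a login message on the serial console” involves, here is a minimal Python sketch. It is not how ktest.pl is implemented (ktest is a Perl script driven by its own configuration file); the function name, the patterns and the sample console lines are all assumptions made up for illustration.

```python
import re


def boot_succeeded(console_lines, success_pattern=r"login:", failure_pattern=r"Kernel panic"):
    """Scan console output line by line.

    Returns True as soon as the success marker appears, and False if a
    failure marker appears first or the output ends without a success marker.
    """
    success = re.compile(success_pattern)
    failure = re.compile(failure_pattern)
    for line in console_lines:
        if failure.search(line):
            return False
        if success.search(line):
            return True
    return False


# Hypothetical console captures, invented for this example.
good_boot = ["Booting Linux...", "Freeing unused kernel memory", "testbox login: "]
bad_boot = ["Booting Linux...", "Kernel panic - not syncing: VFS: Unable to mount root fs"]

print(boot_succeeded(good_boot))
print(boot_succeeded(bad_boot))
```

A real setup would read from the serial console with a timeout instead of from a list, but the pass/fail decision follows the same idea.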
kselftest is a testing framework available in the Linux source code at tools/testing/selftests. The tests are written in C or shell script and run in user space, each one exercising a specific part of the kernel. There are tests for several Linux kernel subsystems, libraries and APIs, including cpufreq, gpio, networking, rtc, watchdog, cgroup, ftrace, futex, ipc, etc. It is a very useful tool for both developers and users, and its documentation is available in the kernel source code and on this wiki page.
The Linux Test Project (LTP) is a project that offers a set of automated tests that validate not only the functionality of the Linux kernel but also the reliability, robustness, and stability of the operating system. The project is developed and maintained by companies like IBM, Cisco, Fujitsu, SUSE and Red Hat. The tests are written in C and shell script, and run directly on the target system. Information about the project, including the source code and documentation, is available on GitHub.
Autotest is a GPL-licensed test automation framework with a focus on the Linux kernel, developed and used by several organizations such as Google, IBM and Red Hat. It has a very modular architecture, including a client to run the tests, a server to control multiple clients and a web interface to trigger tests and view the results. The focus of the tool is not to implement the tests themselves, but to provide an infrastructure to automate the execution of tests implemented by other projects. For example, Autotest uses the test cases implemented by the Linux Test Project. The source code of this project and its documentation are available on GitHub.
KUnit is a unit testing framework for the Linux kernel proposed in late 2018 by Google’s Brendan Higgins on the Linux kernel mailing list. According to the author of the project, this framework was inspired by other unit testing tools such as JUnit (Java), unittest.mock (Python) and Googletest/Googlemock (C++). While most test automation tools require a machine with a running kernel to run the tests, KUnit does not. It uses a kernel feature called User-Mode Linux, a kind of special architecture that makes it possible to run the Linux kernel as if it were just another program. Thus, there is no need to run the tests on a separate machine. KUnit’s documentation is available in the Linux kernel source code.
Another very common test automation technique, used mainly to identify security problems, is called fuzzing. A fuzzing tool generates random or invalid inputs for an application and monitors the result to identify possible problems such as crashes or resource leaks. Two of the main Linux kernel fuzzing tools are Trinity and Syzkaller. There is also an automation tool called Syzbot that continuously runs Syzkaller on some Linux kernel trees and reports the problems it finds on this website.
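The basic idea of fuzzing can be sketched in a few lines of Python. This is a naive random fuzzer run against a toy parser with a deliberately planted bug. Both functions are invented for illustration, and the sketch says nothing about how Trinity or Syzkaller work internally.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy parser for [length][payload...][checksum], with a deliberate bug."""
    if not data:
        raise ValueError("empty input")      # clean, expected rejection
    length = data[0]
    payload = data[1:1 + length]
    checksum = data[1 + length]              # BUG: no bounds check, may raise IndexError
    if checksum != sum(payload) % 256:
        raise ValueError("bad checksum")     # clean, expected rejection
    return payload


def fuzz(target, iterations=1000, seed=0, max_len=8):
    """Feed random byte strings to target and collect unexpected exceptions."""
    rng = random.Random(seed)                # fixed seed keeps the run reproducible
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass                             # invalid input rejected cleanly: not a bug
        except Exception as exc:             # anything else counts as a "crash"
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record)
print(f"{len(crashes)} crashing inputs found")
if crashes:
    print("example:", crashes[0])
```

The fuzzer knows nothing about the record format, yet it quickly hits the missing bounds check. That is the same principle, at a much smaller scale, that makes fuzzing effective against kernel interfaces.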
Finally, there are also several tools developed to test specific kernel subsystems, including mmtests with a focus on memory management, xfstests for testing file systems and hackbench for testing the scheduler.
Many of these testing tools are used in continuous integration projects to identify and report bugs during the development of the Linux kernel.
Continuous integration
Hundreds of commits are made to the kernel development and production repositories every day. These commits can cause several problems, including regressions, build failures and merge conflicts with other branches and repositories.
The practice of continuous integration (CI) can help in this case. A CI tool integrates the work done by the developers on a daily basis, automating tests and making it possible to identify and fix bugs long before they reach production.
KernelCI is a Linux Foundation project and currently the most complete automated testing and continuous integration tool for the Linux kernel. Created by Linaro in 2014 and published on kernelci.org, the tool performs automatic build tests on several kernel development trees, boots the compiled kernels on a large number of hardware platforms and performs some automated tests. The results are published on the project’s mailing list and on its website. The source code of the tool is hosted on GitHub.
Another tool for continuous integration of the Linux kernel is the 0-Day Test Service. Created and maintained by Intel, this tool is focused on the x86 architecture and monitors the various kernel development and production trees, automatically performing build, boot, functional and performance tests. When a problem is detected, an email is sent to kernel maintainers. According to the project’s website, it has already helped to identify more than 40,000 bugs in the kernel development trees! More information is available on the project’s website.
LKFT (Linux Kernel Functional Testing) is a continuous integration tool from Linaro that performs functional tests on several kernel development trees to identify bugs and regressions. The builds are done with OpenEmbedded and the automated tests are performed on ARM and x86 platforms (32-bit and 64-bit). The results are published on the project’s website.
The Kerneltests project was created in 2013 to test stable releases of the Linux kernel. This tool continuously runs build tests on all architectures and some boot tests in QEMU. The test results are published on the project’s website.
There are also other tools for test automation and continuous integration, such as Fuego, LAVA and Beaker. Some communities that maintain Linux distributions use these tools to identify and report kernel problems.
So are we free of bugs?
Of course not. But as we can see, there are several initiatives to test the Linux kernel, starting with the user and the developer, going through tools for static code analysis, fuzzing, unit tests, test automation, and continuous integration. It is a collaborative and continuous effort by the community to maintain the quality of one of the largest open source projects in the world.
But there are still many challenges ahead.
Although test coverage has increased in recent years, the kernel source code base is very large and keeps growing with each release, so it takes a huge effort to maintain and increase test coverage. Another major challenge is to automate device driver and hardware platform tests, because they require access to the hardware being tested. There is also a lack of collaboration, interoperability and standardization between the tools.
So there is always room for improvement. And these challenges should encourage the community to improve existing tools and processes, develop new techniques and testing tools, and continue to improve the quality of the Linux kernel.