Even though seemingly contradictory, the Linux kernel is both monolithic and modular. The kernel is classified as a
monolithic kernel architecturally since the entire OS runs in kernel space. The design is modular since it can be assembled from
modules that in some cases are loaded and unloaded at runtime. It supports features once only available in closed source kernels of non-free operating systems. The rest of the article makes use of the UNIX and Unix-like operating systems convention of the
manual pages. The number that follows the name of a command, interface, or other feature specifies the section (i.e., the type of OS component or feature) it belongs to. For example, a name followed by (2) refers to a system call, and a name followed by (3) refers to a userspace library wrapper. The following is an overview of the architectural design and of noteworthy features. •
Concurrent computing and (with the availability of enough CPU cores for tasks that are ready to run) even
true parallel execution of many
processes at once (each of them having one or more
threads of execution) on
SMP and
NUMA architectures. • Selection and configuration of hundreds of kernel features and drivers (using one of the make *config family of commands before building). • Configuration (again using the make *config commands) and run-time modification of the policies (via nice(2), setpriority(2), and the sched_*(2) family of syscalls) of the
task schedulers that allow
preemptive multitasking (both in
user mode and, since the 2.6 series, in
kernel mode); the
earliest eligible virtual deadline first (EEVDF) scheduler has been the default scheduler of Linux since 2023, and it uses a
red-black tree that can search, insert and delete process information (
task struct) with
O(log n) time complexity, where
n is the number of runnable tasks. • Advanced
memory management with
paged virtual memory and a multi-generational least recently used (MGLRU)
page replacement algorithm. •
Inter-process communications and
synchronization mechanisms. • A
virtual filesystem on top of several concrete filesystems (
ext4,
Btrfs,
XFS,
JFS,
FAT32, and many more). • Configurable I/O schedulers, the ioctl(2) syscall that manipulates the underlying device parameters of special files (it is a non-standard system call, since its arguments, return values, and semantics depend on the device driver in question), and support for POSIX asynchronous I/O (because it scales poorly with multithreaded applications, however, a family of Linux-specific I/O system calls (io_*(2)) had to be created for the management of asynchronous I/O contexts suitable for concurrent processing). •
OS-level virtualization (with
Linux-VServer),
paravirtualization and
hardware-assisted virtualization (with
KVM or
Xen, and using
QEMU for hardware emulation). On the Xen
hypervisor, the Linux kernel provides support to build Linux distributions (such as openSUSE Leap and many others) that work as
Dom0, that is, virtual machine host servers that provide the management environment for the user's virtual machines (
DomU). • I/O Virtualization with
VFIO and
SR-IOV. Virtual Function I/O (VFIO) exposes direct device access to user space in a secure, memory-protected (IOMMU) environment. With VFIO, a VM guest can directly access hardware devices on the VM host server. This technique improves performance compared with both full virtualization and paravirtualization. With VFIO, however, devices cannot be shared among multiple VM guests. Single Root I/O Virtualization (SR-IOV) combines the performance gains of VFIO with the ability to share a device among several VM guests (but it requires special hardware that is capable of appearing to two or more VM guests as different devices). • Security mechanisms for
discretionary and
mandatory access control (SELinux, AppArmor, POSIX
ACLs, and others). Furthermore, the
X Window System and
Wayland, the windowing system and display server protocols that most people use with Linux, do not run within the kernel. In contrast, the actual interfacing with
GPUs of
graphics cards is an in-kernel subsystem called
Direct Rendering Manager (DRM). Unlike standard monolithic kernels, device drivers are easily configured as
modules, and loaded or unloaded while the system is running and can also be pre-empted under certain conditions in order to handle
hardware interrupts correctly and to better support
symmetric multiprocessing. Linux typically makes use of
memory protection and
virtual memory and can also handle
non-uniform memory access. In addition, the project has absorbed
μClinux, which also makes it possible to run Linux on
microcontrollers without virtual memory. The hardware is represented in the file hierarchy. User applications interact with device drivers via entries in the /dev or /sys directories. Process information is mapped into the /proc directory. The kernel provides system calls and other interfaces that are Linux-specific. In order to be included in the official kernel, the code must comply with a set of licensing rules. The
system calls are expected to never change in order to preserve
compatibility for
userspace programs that rely on them.
Loadable kernel modules (LKMs), by design, cannot rely on a stable ABI.
== Kernel-to-userspace API ==
The set of the
Linux kernel API that regards the interfaces exposed to user applications is fundamentally composed of UNIX and Linux-specific
system calls. A system call is an entry point into the Linux kernel. For example, among the Linux-specific ones there is the clone(2) family of system calls. Most extensions must be enabled by defining the _GNU_SOURCE
macro in a
header file or when the user-land code is being compiled. System calls can only be invoked via assembly instructions that enable the transition from unprivileged user space to privileged kernel space in
ring 0. For this reason, the
C standard library (libC) acts as a wrapper to most Linux system calls, by exposing C functions that, if needed, transparently enter the kernel, which will execute on behalf of the calling process. For system calls that libC does not expose, the library provides a function called syscall(2), which can be used to invoke them explicitly.
Pseudo filesystems (e.g., the
sysfs and
procfs filesystems) and
special files (e.g., /dev/random, /dev/sda, /dev/tty, and many others) constitute another layer of interface to kernel data structures representing hardware or logical (software) devices.
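As a minimal sketch of how special files expose kernel-managed devices through ordinary file operations, the following Python snippet inspects and reads two device nodes. The choice of /dev/null and /dev/urandom and the helper names `is_char_device` and `read_random_bytes` are assumptions made for illustration; these nodes exist on typical Linux systems but are not guaranteed everywhere.

```python
import os
import stat


def is_char_device(path):
    """Return True if `path` is a character special file."""
    return stat.S_ISCHR(os.stat(path).st_mode)


def read_random_bytes(count, path="/dev/urandom"):
    """Read `count` bytes from the kernel's random number generator."""
    with open(path, "rb") as dev:
        return dev.read(count)


if __name__ == "__main__":
    # Assumed device nodes; both are usually character devices on Linux.
    for node in ("/dev/null", "/dev/urandom"):
        print(node, "is a character device:", is_char_device(node))
    print("16 random bytes:", read_random_bytes(16).hex())
```

The same open and read calls used for regular files work here; the kernel routes them to the driver behind the device node instead of to an on-disk filesystem.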
== Kernel-to-userspace ABI ==
Because of the differences among the hundreds of implementations of the Linux OS, executable objects, even though they are compiled, assembled, and linked to run on a specific hardware architecture (that is, they use the
ISA of the target hardware), often cannot run on different Linux distributions. This issue is mainly due to distribution-specific configurations and a set of patches applied to the code of the Linux kernel, differences in system libraries, services (daemons), filesystem hierarchies, and environment variables. The main standard concerning application and binary compatibility of Linux distributions is the
Linux Standard Base (LSB). The LSB goes beyond what concerns the Linux kernel, because it also defines the desktop specifications, the X libraries and Qt that have little to do with it. The LSB version 5 is built upon several standards and drafts (POSIX, SUS, X/Open,
File System Hierarchy (FHS), and others). The parts of the LSB more relevant to the kernel are the
General ABI (gABI), especially the
System V ABI and the
Executable and Linking Format (ELF), and the
Processor Specific ABI (psABI), for example the
Core Specification for X86-64. The standard ABI for how x86_64 user programs invoke system calls is to load the syscall number into the
rax register, and the other parameters into
rdi,
rsi,
rdx,
r10,
r8, and
r9, and finally to put the
syscall assembly instruction in the code.
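The register convention above is normally hidden behind libc. As a rough illustration, the sketch below calls libc's generic syscall() function through Python's ctypes, passing only a system call number; libc then performs the register loading and the transition into the kernel. The table `SYS_GETPID`, the helper name `raw_getpid`, and the fallback for unknown machines are assumptions made for this example, and the two numbers shown apply only to Linux on x86-64 and AArch64.

```python
import ctypes
import os
import platform

# System call numbers are architecture-specific. These two values are the
# Linux getpid numbers for x86-64 and AArch64; other machines are not covered.
SYS_GETPID = {"x86_64": 39, "aarch64": 172}


def raw_getpid():
    """Invoke getpid by number through libc's generic syscall() function."""
    number = SYS_GETPID.get(platform.machine())
    if platform.system() != "Linux" or number is None:
        # Unknown platform: fall back to the regular wrapper. This branch is
        # an assumption to keep the sketch portable, not part of the technique.
        return os.getpid()
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(ctypes.c_long(number))


if __name__ == "__main__":
    print("raw syscall:", raw_getpid(), "wrapper:", os.getpid())
```

Both values should match, since the wrapper and the explicit call reach the same kernel entry point.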
== In-kernel API ==
Because there is no stable in-kernel ABI, AMD had to constantly adapt the former binary blob used by Catalyst. There are several internal kernel APIs between kernel subsystems. Some are available only within the kernel subsystems, while a somewhat limited set of in-kernel symbols (i.e., variables, data structures, and functions) is exposed to dynamically loadable modules (e.g., device drivers loaded on demand) if they are exported with the EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL() macros (the latter reserved to modules released under a GPL-compatible license). Linux provides in-kernel APIs that manipulate data structures (e.g.,
linked lists,
radix trees,
red-black trees,
queues) or perform common routines (e.g., copy data from and to user space, allocate memory, print lines to the system log, and so on) that have remained stable at least since Linux version 2.6. In-kernel APIs include libraries of low-level common services used by device drivers: •
SCSI Interfaces and libATA: respectively, a peer-to-peer, packet-based communication protocol for storage devices attached to USB, SATA, SAS, Fibre Channel, FireWire, or ATAPI devices, and an in-kernel library to support [S]ATA host controllers and devices.
• Direct Rendering Manager (DRM) and Kernel Mode Setting (KMS): for interfacing with GPUs and supporting the needs of modern 3D-accelerated video hardware, and for setting screen resolution, color depth and refresh rate.
• DMA buffers (DMA-BUF): for sharing buffers for hardware direct memory access across multiple device drivers and subsystems.
• Video4Linux: for video capture hardware.
• Advanced Linux Sound Architecture (ALSA): for sound cards.
• New API: for network interface controllers.
• mac80211 and cfg80211: for wireless network interface controllers.
== In-kernel ABI ==
The Linux developers chose not to maintain a stable in-kernel ABI. Modules compiled for a specific version of the kernel cannot be loaded into another version without being recompiled.

A new process can be created by calling the fork(2) family of system calls or the clone(2)
system call. Processes can be suspended and resumed by the kernel by sending signals like SIGSTOP and SIGCONT. A process can terminate itself by calling the exit(2) system call, or be terminated by another process sending signals like SIGKILL or SIGTERM. If the executable is dynamically linked to shared libraries, a
dynamic linker is used to find and load the needed objects, prepare the program to run and then run it. The
Native POSIX Thread Library (NPTL) provides the POSIX standard thread interface (
pthreads) to userspace. The kernel isn't aware of processes nor threads but it is aware of
tasks, thus threads are implemented in userspace. Threads in Linux are implemented as
tasks sharing resources, while tasks that share nothing are regarded as independent processes. The kernel provides the futex(7) (fast user-space mutex) mechanisms for user-space locking and synchronization. The majority of the operations are performed in userspace, but it may be necessary to communicate with the kernel using the futex(2) system call. Kernel threads are threads created by the kernel itself for specialized tasks; they are privileged like the kernel and aren't bound to any process or application.
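The process lifecycle described above (creation, suspension, resumption, and termination by signal) can be sketched from userspace. The following is a minimal illustration, assuming a Linux or other POSIX system; the function name `stop_resume_kill` and the particular signals used (SIGSTOP, SIGCONT, SIGKILL) are choices made for this example.

```python
import os
import signal


def stop_resume_kill():
    """Fork a child, suspend it, resume it, then terminate it.

    Returns (was_stopped, terminating_signal).
    """
    pid = os.fork()
    if pid == 0:
        # Child: do nothing but wait for signals; never return to the caller.
        try:
            while True:
                signal.pause()
        finally:
            os._exit(0)

    # Parent: suspend the child and wait until the kernel reports the stop.
    os.kill(pid, signal.SIGSTOP)
    _, status = os.waitpid(pid, os.WUNTRACED)
    was_stopped = os.WIFSTOPPED(status)

    # Resume the child, then terminate it with a signal it cannot catch.
    os.kill(pid, signal.SIGCONT)
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    killed_by = os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
    return was_stopped, killed_by


if __name__ == "__main__":
    print(stop_resume_kill())
```

SIGKILL is used for the final step because it cannot be caught or ignored, which keeps the outcome independent of any signal handlers the child inherited.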
== Scheduling ==
The Linux
process scheduler is modular, in the sense that it enables different scheduling classes and policies. Scheduler classes are pluggable scheduler algorithms that can be registered with the base scheduler code. Each class schedules different types of processes. The core code of the scheduler iterates over each class in order of priority and chooses the highest priority scheduler that has a schedulable entity of type struct sched_entity ready to run. Full
kernel preemption makes Linux more suitable for desktop and
real-time applications. This comes at a cost of throughput, as work being interrupted worsens cache behavior. The default scheduling policy is defined as a macro in a C header as SCHED_NORMAL. In other POSIX kernels, a similar policy known as SCHED_OTHER allocates CPU timeslices (i.e., it assigns absolute slices of the processor time depending on either the predetermined or the dynamically computed priority of each process). The Linux CFS does away with absolute timeslices and assigns a fair proportion of CPU time, as a function of parameters like the total number of runnable processes and the time they have already run; this function also takes into account a kind of weight that depends on their relative priorities (nice values). With kernel preemption, the kernel can preempt itself when an interrupt handler returns, when kernel tasks block, and whenever a subsystem explicitly calls the schedule() function. The kernel also contains two POSIX-compliant real-time scheduling classes named
SCHED_FIFO (realtime
first-in-first-out) and
SCHED_RR (realtime
round-robin), both of which take precedence over the default class. SCHED_DEADLINE takes precedence over all the other scheduling classes. Real-time
PREEMPT_RT patches, included into the mainline Linux since version 2.6, provide a
deterministic scheduler, the removal of preemption and interrupt disabling (where possible), PI Mutexes (i.e., locking primitives that avoid priority inversion), support for
High Precision Event Timers (HPET), preemptive
read-copy-update (RCU), (forced) IRQ threads, and other minor features. In 2023, Peter Zijlstra proposed replacing CFS with an
earliest eligible virtual deadline first (EEVDF) scheduler, to remove the need for CFS "latency nice" patches. The EEVDF scheduler replaced CFS in version 6.6 of the Linux kernel.
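The idea of assigning a fair proportion of CPU time weighted by nice values, as described for CFS above, can be modeled in a few lines. This is a simplified sketch, not the kernel's code: the kernel uses a precomputed integer weight table, which the formula below only approximates (weight 1024 at nice 0, roughly a factor of 1.25 per nice step). The function names `nice_to_weight` and `cpu_shares` are hypothetical.

```python
def nice_to_weight(nice):
    """Approximate scheduler load weight for a nice value in [-20, 19]."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    # Assumed approximation of the kernel's weight table.
    return 1024 / (1.25 ** nice)


def cpu_shares(nice_values):
    """Return each runnable task's proportional share of CPU time."""
    weights = [nice_to_weight(n) for n in nice_values]
    total = sum(weights)
    if total == 0:
        return []
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two tasks with equal nice values split the CPU evenly.
    print(cpu_shares([0, 0]))
    # A task at nice 5 receives a smaller share than a task at nice 0.
    print([round(s, 3) for s in cpu_shares([0, 5])])
```

Note that the share depends on the weights of all runnable tasks, not on an absolute timeslice: adding a third task shrinks every existing share.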
The kernel's synchronization tools include lockless algorithms (e.g.,
RCUs). Most lock-less algorithms are built on top of
memory barriers for the purpose of enforcing
memory ordering and prevent undesired side effects due to
compiler optimization. PREEMPT_RT code included in mainline Linux provides
RT-mutexes, a special kind of mutex that does not disable preemption and supports priority inheritance. Almost all locks are changed into sleeping locks when using the PREEMPT_RT configuration for realtime operation. Linux includes a kernel lock validator called
Lockdep.
== Interrupts ==
Although the management of
interrupts could be seen as a single job, it is divided into two parts. The split is due to the different time constraints and synchronization needs of the tasks that make up interrupt management. The first part is made up of an asynchronous
interrupt service routine (ISR) that in Linux is known as the
top half, while the second part is carried out by one of three types of the so-called
bottom halves (
softirq,
tasklets, and
work queues).

Memory is divided into zones, each of which has a specific purpose. • ZONE_DMA: this zone is suitable for
DMA. • ZONE_NORMAL: for normal memory operations. • ZONE_HIGHMEM: part of physical memory that is only accessible to the kernel using temporary mapping. Those zones are the most common, but others exist. The kernel is not
pageable (meaning it is always resident in physical memory and cannot be swapped to the disk) and there is no memory protection (no signals, unlike in user space), therefore memory violations lead to instability and system crashes. Small chunks of memory can be dynamically allocated in kernel space via the family of kmalloc() APIs and freed with the appropriate variant of kfree(). vmalloc() and kvfree() are used for large virtually contiguous chunks. alloc_pages() allocates the desired number of entire pages. The kernel used to include the SLAB, SLUB and SLOB allocators as configurable alternatives. The SLOB allocator was removed in Linux 6.4 and the SLAB allocator was removed in Linux 6.8. The sole remaining allocator is SLUB, which aims for simplicity and efficiency, and was introduced in Linux 2.6.
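Page-granular allocation can be illustrated with a little arithmetic. In the kernel, alloc_pages() takes an order and returns 2^order contiguous pages, so a byte count has to be rounded up to a power-of-two number of pages. The sketch below models that rounding in userspace; it loosely mirrors the idea of the kernel's get_order() helper but is not kernel code. The 4096-byte `PAGE_SIZE` and the function names are assumptions, since the real page size depends on the architecture and configuration.

```python
PAGE_SIZE = 4096  # assumed page size; varies by architecture and config


def pages_needed(size, page_size=PAGE_SIZE):
    """Number of whole pages required to hold `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return -(-size // page_size)  # ceiling division


def allocation_order(size, page_size=PAGE_SIZE):
    """Smallest order such that 2**order pages can hold `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    order = 0
    while (page_size << order) < size:
        order += 1
    return order


if __name__ == "__main__":
    for size in (1, 4096, 4097, 3 * 4096):
        print(size, "bytes ->", pages_needed(size), "page(s), order",
              allocation_order(size))
```

A request for three pages maps to order 2, i.e. four pages, which shows why page-level allocation can waste memory for sizes that are not powers of two.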
== Virtual filesystem ==
Since Linux supports numerous filesystems with different features and functionality, it is necessary to implement a generic filesystem that is independent from underlying filesystems. The
virtual file system interfaces with other Linux subsystems, userspace, or
APIs and abstracts away the different implementations of underlying filesystems. VFS implements system calls like create, open, read, write and close. VFS implements a generic
superblock and
inode block that is independent from the one that the underlying filesystem has. In this subsystem directories and files are represented by a struct file
data structure. When
userspace requests access to a file, it receives a
file descriptor (a non-negative integer value), but in
kernel space it is a struct file structure. This structure stores all the information the kernel knows about a file or directory.
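From userspace, the file descriptor is the only handle a program sees. The sketch below, which assumes a writable temporary directory, performs a write and a read using nothing but the integer descriptor; the function name `roundtrip_through_fd` is hypothetical.

```python
import os
import tempfile


def roundtrip_through_fd(payload):
    """Write `payload` and read it back using only descriptor-based calls.

    Returns (fd, data), where fd is the integer the kernel handed out.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)  # rewind to the start of the file
        data = os.read(fd, len(payload))
        return fd, data
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    fd, data = roundtrip_through_fd(b"hello vfs")
    print("descriptor:", fd, "data:", data)
```

The program never learns which concrete filesystem holds the temporary file; the same calls work unchanged on any filesystem behind the VFS.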
sysfs and
procfs are virtual filesystems that expose hardware information and
userspace programs' runtime information. These filesystems aren't present on disk and instead the kernel implements them as a
callback or routine that gets called when they are accessed by userspace.
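The on-demand nature of these filesystems can be observed from userspace. The sketch below, assuming a Linux system with procfs mounted at /proc, parses the status file of the current process; the function names `read_proc_status` and `reported_size` are hypothetical. On a typical system the file reports a size of zero even though reading it returns data, because the content is generated when the file is accessed.

```python
import os


def read_proc_status(path="/proc/self/status"):
    """Parse 'Key: value' lines generated by the kernel on each read."""
    fields = {}
    with open(path) as status_file:
        for line in status_file:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


def reported_size(path="/proc/self/status"):
    """Size that stat() reports for the file (typically 0 on procfs)."""
    return os.stat(path).st_size


if __name__ == "__main__":
    status = read_proc_status()
    print("Name:", status.get("Name"), "Pid:", status.get("Pid"))
    print("size reported by stat():", reported_size())
```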
== Supported architectures ==
While not originally designed to be
portable, Linux is now one of the most widely ported operating system kernels, running on a diverse range of systems from the
ARM architecture to IBM
z/Architecture mainframe computers. The first port was performed on the
Motorola 68000 platform. The modifications to the kernel were so fundamental that Torvalds viewed the Motorola version as a
fork and a "Linux-like operating system". Linux has also been ported to various handheld devices such as
Apple's iPhone 3G and
iPod.
== Supported devices ==
In 2007, the LKDDb project was started to build a comprehensive database of hardware and protocols known by Linux kernels. The database is built automatically by static analysis of the kernel sources. Later, in 2014, the Linux Hardware project was launched to automatically collect a database of all tested hardware configurations with the help of users of various Linux distributions.
== Live patching ==
Rebootless updates can even be applied to the kernel by using
live patching technologies such as
Ksplice,
kpatch and
kGraft. Minimalistic foundations for live kernel patching were merged into the Linux kernel mainline in kernel version 4.0, which was released on 12 April 2015. Those foundations, known as
livepatch and based primarily on the kernel's
ftrace functionality, form a common core capable of supporting hot patching by both kGraft and kpatch, by providing an
application programming interface (API) for kernel modules that contain hot patches and an
application binary interface (ABI) for the userspace management utilities. Nonetheless, the common core included in Linux kernel 4.0 supports only the
x86 architecture and does not provide any mechanisms for ensuring
function-level consistency while the hot patches are applied.
== Security ==
Kernel bugs present potential security issues. For example, they may allow for
privilege escalation or create
denial-of-service attack vectors. Over the years, numerous bugs affecting system security were found and fixed. New features are frequently implemented to improve the kernel's security. Capabilities(7) have already been introduced in the section about the processes and threads. Android makes use of them and
systemd gives administrators detailed control over the capabilities of processes. Linux offers a wealth of mechanisms to reduce kernel attack surface and improve security that are collectively known as the
Linux Security Modules (LSM). They comprise the
Security-Enhanced Linux (SELinux) module, whose code was originally developed and then released to the public by the
NSA, and
AppArmor among others. SELinux is now actively developed and maintained on
GitHub. SELinux and AppArmor provide support to access control security policies, including
mandatory access control (MAC), though they differ profoundly in complexity and scope. Another security feature is Seccomp BPF (SECure COMPuting with Berkeley Packet Filters), which works by filtering parameters and reducing the set of system calls available to user-land applications. Critics have accused kernel developers of covering up security flaws, or at least not announcing them; in 2008, Torvalds responded publicly to this criticism. Linux distributions typically release security updates to fix vulnerabilities in the Linux kernel. Many offer
long-term support releases that receive security updates for a certain Linux kernel version for an extended period of time. In 2024, researchers disclosed that the Linux kernel contained a serious vulnerability, CVE-2024-50264, located in the AF_VSOCK subsystem. This bug is a use-after-free flaw, a class of memory corruption issue that occurs when a program continues to use memory after it has been freed. Such flaws are particularly dangerous in the kernel, as they can allow attackers to escalate privileges. The bug was resolved in May 2025.
== Legal ==