Kexec
from Wikipedia

kexec (kernel execute), analogous to the Unix/Linux kernel call exec, is a mechanism of the Linux kernel that allows booting of a new kernel from the currently running one.

Details

Essentially, kexec skips the bootloader stage and hardware initialization phase performed by the system firmware (BIOS or UEFI), and directly loads the new kernel into main memory and starts executing it immediately. This avoids the long times associated with a full reboot, and can help systems to meet high-availability requirements by minimizing downtime.[1][2]

While feasible, implementing a mechanism such as kexec raises two major challenges:

  • Memory of the currently running kernel is overwritten by the new kernel, while the old one is still executing.
  • The new kernel will usually expect all hardware devices to be in a well defined state, in which they are after a system reboot because the system firmware resets them to a "sane" state. Bypassing a real reboot may leave devices in an unknown state, and the new kernel will have to recover from that.

Support for allowing only signed kernels to be booted through kexec was merged into version 3.17 of the Linux kernel mainline, which was released on October 5, 2014.[3] This prevents a root user from loading arbitrary code via kexec and executing it, complementing UEFI Secure Boot and the in-kernel security mechanisms that ensure only signed Linux kernel modules can be inserted into the running kernel.[4][5][6]

Kexec is used by LinuxBoot to boot the main kernel from the Linux kernel located in the firmware.

See also

  • kdump (Linux) – Linux kernel's crash dump mechanism, which internally uses kexec
  • kGraft – Linux kernel live patching technology developed by SUSE
  • kpatch – Linux kernel live patching technology developed by Red Hat
  • Ksplice – Linux kernel live patching technology developed by Ksplice, Inc. and later bought by Oracle

References

from Grokipedia
kexec is a system call in the Linux kernel that facilitates the direct loading and execution of a new kernel from within a running kernel, bypassing the traditional bootloader and firmware (such as BIOS or UEFI) initialization processes. This mechanism, often described as an in-kernel bootloader, enables rapid transitions between kernel versions without a full hardware reboot, significantly reducing downtime compared to conventional restarts. Developed primarily by Eric W. Biederman in the early 2000s, kexec was integrated into the Linux kernel during the 2.6 series to support efficient kernel switching for development, testing, and system maintenance.

One of its most prominent applications is kdump, a crash-dumping solution co-developed by Vivek Goyal, Eric W. Biederman, and Hariprasad Nellitheertha at IBM, which uses kexec to boot a secondary "capture kernel" in the event of a system panic, preserving the memory state of the crashed kernel for post-mortem analysis.

The kexec system call, implemented via the kexec_load(2) interface, allows userspace tools like the kexec utility to load kernel images (such as ELF or bzImage formats) into reserved memory regions, with options for normal reboots or crash scenarios. Key features include support for preserving system state across transitions, enabled through kernel configuration options like CONFIG_KEXEC=y, and compatibility with various architectures including x86 and PowerPC. In modern kernels, enhancements such as Kexec HandOver (KHO), introduced to serialize and transfer driver states, memory regions, and arbitrary properties to the new kernel, further improve reliability for scenarios like live migrations or complex hardware handoffs. This evolution underscores kexec's role in enabling resilient, high-availability environments, particularly in enterprise and embedded systems where minimizing reboot times is critical.

Overview

Definition and Purpose

kexec is a Linux kernel mechanism that enables the direct loading and execution of a new kernel from within a running kernel, primarily through the kexec_load() system call. This system call allows userspace applications to prepare a secondary kernel image in memory, which can later be booted without invoking hardware reinitialization or firmware components such as BIOS or UEFI. By bypassing these traditional boot stages, kexec avoids the lengthy initialization sequences typically required during system startup.

The primary purposes of kexec include accelerating system transitions for updates, streamlining kernel development and testing workflows through quicker iteration cycles, and enabling reliable kernel crash analysis via mechanisms like kdump. It addresses the inefficiencies of conventional reboot processes in environments where boot times can significantly delay operations.

In conceptual terms, kexec parallels the traditional exec system call used in user space to replace a running process with a new executable, but extends this paradigm to the kernel level, handing control from one kernel to the next while retaining access to system memory.

Basic Mechanism

kexec operates by allowing the currently running Linux kernel to directly load and execute a new kernel image in memory, bypassing traditional firmware and bootloader initialization sequences. This transition is initiated through a system call from user space, typically using tools like the kexec utility, which interfaces with the kernel to prepare the handover. The process ensures that the new kernel starts from a clean state while minimizing hardware reinitialization, thereby reducing boot time.

The mechanism proceeds in a two-phase load and execution sequence. In the first phase, the new kernel image, along with any associated initramfs and command-line parameters, is loaded into a designated memory region using the kexec_load system call. This phase relocates the kernel segments to avoid conflicts with the running kernel's memory usage, reserving specific address ranges (e.g., via options like --mem-min and --mem-max) to prevent overwriting critical data structures or code. A key component here is the purgatory code segment, an ELF-relocatable object that acts as an intermediary for verification and cleanup tasks, such as computing SHA-256 hashes to ensure the integrity of the loaded kernel image before execution.

In the second phase, triggered when a kexec reboot is requested, the running kernel performs a minimal shutdown procedure (halting non-crashing CPUs via interrupts if necessary) and transfers control to the purgatory segment. The purgatory code then executes any required post-shutdown actions, such as saving register states, before jumping directly to the entry point of the new kernel. This preserves the overall system memory layout where possible, with the reserved region spanning a small portion of physical RAM to house the purgatory, the new kernel, and its parameters without interfering with ongoing operations.

Conceptually, the process can be visualized as a linear transition: the old kernel loads the purgatory and new kernel segments into reserved memory, shuts down minimally, and passes control to purgatory, which in turn invokes the new kernel's startup routine. This avoids the full power cycle and hardware probing of a conventional reboot, enabling a direct kernel-to-kernel switch.
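The integrity check described above can be illustrated with a small user-space sketch. It is not the kernel's purgatory code: the segment contents and the function names digest_segments and verify_before_jump are hypothetical, and the only idea it borrows from the text is "record a SHA-256 digest at load time, recompute it just before handing over control".

```python
import hashlib


def digest_segments(segments):
    """Return one SHA-256 hex digest covering every segment, in order."""
    h = hashlib.sha256()
    for seg in segments:
        h.update(seg)
    return h.hexdigest()


def verify_before_jump(segments, expected_digest):
    """Recompute the digest and compare it with the one recorded at load time."""
    return digest_segments(segments) == expected_digest


# Phase 1 (load): record a digest of the staged segments (placeholder bytes).
staged = [b"kernel-image-bytes", b"initramfs-bytes", b"root=/dev/sda1"]
recorded = digest_segments(staged)

# Phase 2 (execute): only proceed if the staged memory is unchanged.
print("intact:", verify_before_jump(staged, recorded))

# A single flipped byte in any segment makes the check fail.
corrupted = [b"kernel-image-bytEs", staged[1], staged[2]]
print("corrupted:", verify_before_jump(corrupted, recorded))
```

The point of checking at the last moment is that the staged segments sit in memory for an arbitrary time between the load and the jump, so anything that scribbles on them in between is caught before the new kernel starts.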

History

Origins and Development

The development of kexec began in the early 2000s amid kernel mailing-list discussions, focusing on mechanisms for faster kernel switching to enhance development workflows. Eric W. Biederman emerged as the key contributor, posting the first substantial patches for a minimal kexec implementation on October 30, 2002, targeting kernel version 2.5.44 on the x86 architecture. These early efforts laid the groundwork for a system call enabling direct loading and execution of a new kernel from within a running one.

The primary motivations for kexec stemmed from the need to drastically reduce reboot times, which could take minutes on complex hardware due to firmware initialization, device probing, and bootloader overhead. This was particularly burdensome for kernel developers iterating through frequent tests. Biederman's patches addressed this by allowing a "warm" reboot that skips firmware stages, potentially cutting reboot duration from over a minute to seconds. An additional driver was the desire to facilitate reliable kernel crash dumping without full hardware resets, preserving system memory for analysis in production environments.

Biederman continued refining the implementation through 2003-2004, incorporating feedback from the community and expanding support for hardware compatibility. By mid-2005, kexec had matured sufficiently for mainline inclusion, with multiple architecture-specific fixes and cleanups merged into the Linux kernel. It became a standard feature in kernel version 2.6.13, released on August 29, 2005, marking its official adoption as a core capability for fast kernel transitions. This integration enabled broader use in both development and operational contexts, later extending to tools like kdump for crash handling.

Adoption and Milestones

kexec was integrated into the mainline Linux kernel with version 2.6.13 in 2005, enabling the core functionality for loading and booting a new kernel directly from a running one without firmware reinitialization. Subsequent enhancements included the introduction of file-based loading via the kexec_file_load system call in kernel 3.17 in 2014, which allowed loading kernels and initramfs directly from files and improved compatibility with secure environments.

Major Linux distributions adopted kexec shortly after its mainline inclusion. Red Hat integrated kexec-tools into Red Hat Enterprise Linux 5, released in 2007, to support fast reboots and crash dumping. SUSE has packaged kexec-tools since at least SUSE Linux Enterprise Server 10 in 2006, providing ongoing support for reboot acceleration and kdump. By the 2010s, kexec usage was also documented in distribution wikis, facilitating community adoption for custom kernel switching.

Key milestones in kexec's evolution include the extension of kexec support, including file-based loading, to additional architectures used in embedded and mobile systems. For x86_64, improvements for Secure Boot compatibility were implemented in 3.17 in 2014, allowing signed kernel loading to comply with firmware restrictions. Ongoing updates for EFI/UEFI environments continued into the 6.x series, with features like Kexec HandOver merged in kernel 6.16 in 2025 to enhance live kernel transitions in modern boot setups. kexec also influenced related projects, such as kboot, a 2006 proof-of-concept bootloader that leverages kexec to simplify multi-kernel booting from traditional loaders like GRUB.

Technical Implementation

Kernel-Level Components

The kexec subsystem is implemented primarily in the kernel source file kernel/kexec.c, which provides the core functionality for loading and executing a new kernel image from within the running kernel. This subsystem handles the allocation of memory for kernel segments, validation of loaded images, and coordination of device shutdown before jumping to the new kernel, ensuring a direct transition without firmware reinitialization. A mutex is employed to serialize operations, particularly during the loading of crash kernels, preventing concurrent modifications to shared resources.

Central to the subsystem is the kimage structure, which manages the segments of the kernel image being loaded. This structure includes fields such as start for the entry point, nr_segments to track the number of segments (limited to KEXEC_SEGMENT_MAX), and an array of kexec_segment entries that define memory ranges for the kernel and other components. The kimage facilitates segment validation, memory allocation via kimage_alloc_page, and preparation for execution, including handling control pages and swap pages to avoid conflicts during the transition. Additionally, the purgatory component, an architecture-specific piece of code, performs cleanup tasks such as disabling interrupts and preparing hardware state before the new kernel takes over.

The primary interface for loading kernel images is the kexec_load system call, defined as SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments, struct kexec_segment *, segments, unsigned long, flags). This call, restricted to processes holding the CAP_SYS_BOOT capability (in practice, root), allows loading up to KEXEC_SEGMENT_MAX segments into memory, with flags controlling aspects like architecture-specific behavior (e.g., KEXEC_ARCH_MASK) and crash kernel designation.
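To make the kexec_load arguments more concrete, here is a minimal sketch that mirrors the user-visible data layout in Python. The constant values and the four-field segment layout are taken from my reading of the kexec_load(2) manual page and the linux/kexec.h UAPI header, so treat them as assumptions to check against your kernel headers. The helper names build_flags and check_segment_count are hypothetical, and nothing here invokes the system call.

```python
import ctypes

# Values as documented for the kexec_load(2) interface (assumed, see lead-in).
KEXEC_SEGMENT_MAX = 16
KEXEC_ON_CRASH = 0x00000001
KEXEC_PRESERVE_CONTEXT = 0x00000002
KEXEC_ARCH_MASK = 0xFFFF0000
KEXEC_ARCH_X86_64 = 62 << 16  # ELF machine number in the upper 16 bits


class KexecSegment(ctypes.Structure):
    """User-space view of one segment: source buffer and destination range."""
    _fields_ = [
        ("buf", ctypes.c_void_p),    # start of the data in user memory
        ("bufsz", ctypes.c_size_t),  # number of bytes to copy from buf
        ("mem", ctypes.c_void_p),    # physical destination address
        ("memsz", ctypes.c_size_t),  # size of the destination range
    ]


def build_flags(arch, on_crash=False):
    """Combine an architecture value with the optional crash-kernel flag."""
    if arch & ~KEXEC_ARCH_MASK:
        raise ValueError("architecture bits must fit inside KEXEC_ARCH_MASK")
    return arch | (KEXEC_ON_CRASH if on_crash else 0)


def check_segment_count(nr_segments):
    """Reject segment counts the kernel would refuse."""
    if not 0 <= nr_segments <= KEXEC_SEGMENT_MAX:
        raise ValueError(f"nr_segments must be between 0 and {KEXEC_SEGMENT_MAX}")
    return nr_segments


print(hex(build_flags(KEXEC_ARCH_X86_64, on_crash=True)))
```

Keeping the architecture in the upper 16 bits and the behavior flags in the lower bits is what lets a single unsigned long carry both, which is why the mask check comes first.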
For enhanced security, particularly in environments with Secure Boot, the kexec_file_load system call was introduced in Linux kernel 3.17 as SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd, unsigned long, cmdline_len, const char __user *, cmdline, unsigned long, flags). It accepts file descriptors for the kernel and optional initramfs, enabling the kernel to verify signatures and load images directly from files rather than user-provided memory buffers.

Architecture-specific handling is integral to kexec, addressing variations in boot protocols and hardware transitions. On x86, the subsystem manages real-mode transitions by loading the new kernel's real-mode setup code and using purgatory to switch from protected mode back to real mode, mimicking the bootloader process to initialize the new kernel without BIOS/UEFI involvement. For ARM architectures, kexec passes the device tree blob (DTB) to the new kernel, ensuring hardware description continuity; this involves copying the DTB into a reserved segment and updating the kernel's entry parameters to reference it during the handoff. Relocation of the initrd or initramfs is handled by mapping segments to available physical memory, avoiding overlaps with the running kernel's address space, and adjusting pointers in the kimage structure accordingly.

Memory management for kexec, especially in crash scenarios, relies on the crashkernel boot parameter to reserve a dedicated memory region at boot time. Specified as crashkernel=<size>[@offset], such as crashkernel=256M, it allocates a contiguous block (e.g., 256 megabytes) from low physical memory, preventing its use by the main kernel and ensuring availability for the crash dump kernel loaded via kexec. This reservation is parsed early in the boot process and enforced through the buddy allocator, with the offset allowing placement below the 4 GiB boundary on systems with large amounts of memory.
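As an illustration of the crashkernel=<size>[@offset] form described above, the following sketch parses the value into byte counts. It is a simplified user-space parser, not the kernel's own: it handles only the plain size[@offset] form with optional K, M, or G suffixes, and the function name parse_crashkernel is hypothetical.

```python
import re

# Binary multipliers for the optional size suffix.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
_PATTERN = re.compile(r"^(\d+)([KMG]?)(?:@(\d+)([KMG]?))?$", re.IGNORECASE)


def parse_crashkernel(value):
    """Parse 'size[@offset]' and return (size_bytes, offset_bytes or None)."""
    match = _PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported crashkernel value: {value!r}")
    size = int(match.group(1)) * _UNITS[match.group(2).upper()]
    offset = None
    if match.group(3) is not None:
        offset = int(match.group(3)) * _UNITS[match.group(4).upper()]
    return size, offset


print(parse_crashkernel("256M"))
print(parse_crashkernel("256M@16M"))
```

Returning None for a missing offset keeps "let the kernel choose the placement" distinct from an explicit offset of zero.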

User-Space Tools and Interfaces

The primary user-space tool for interacting with kexec is the kexec command, provided by the kexec-tools package, which contains binaries and utilities to load kernel images into memory and initiate direct execution without going through the firmware or bootloader. This package includes the /sbin/kexec binary, enabling administrators to prepare and trigger kernel transitions from the running system. The kexec-tools package has been distributed in major repositories since the mid-2000s, with initial inclusion in Debian Etch (released April 2007, version 1.101-kdump10-2), and subsequent availability in Fedora (starting with Fedora Core 6 in 2006) and Ubuntu (from version 7.04 in 2007, with ongoing updates).

Key command-line options facilitate kernel loading and execution. The -l (or --load) flag loads a kernel image along with optional initrd and append parameters, as in the example kexec -l /boot/vmlinuz --initrd=/boot/initrd.img --append="root=/dev/sda1", which specifies the root device and other arguments. Following loading, the -e (or --exec) option executes the prepared kernel, bypassing hardware reinitialization for a faster transition. These options support formats like ELF, bzImage, and PE for compatibility across architectures.

For system integration, kexec works with init systems like systemd via kexec.target, a special target that activates systemd-kexec.service during shutdown to coordinate unmounting filesystems, disabling swap, and process termination before invoking the loaded kernel. This allows use of systemctl kexec for reboots in systemd-based environments. Additionally, automated setups can leverage GRUB configurations by employing helper tools that parse GRUB menu entries to load equivalent kernels via kexec, enabling script-based or default boot alignments without manual intervention.
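The load-then-execute workflow above can be sketched as a pair of helpers that only assemble the argument lists; they deliberately do not run anything, since executing them would reboot the machine. The flags (-l, --initrd=, --append=, -e) and systemctl kexec come from the text; the helper names and the example paths are illustrative.

```python
import shlex


def build_load_command(kernel, initrd=None, append=None):
    """Assemble the argv for staging a kernel with 'kexec -l'."""
    argv = ["kexec", "-l", kernel]
    if initrd:
        argv.append(f"--initrd={initrd}")
    if append:
        argv.append(f"--append={append}")
    return argv


def build_exec_command(use_systemd=True):
    """Assemble the argv that switches to the staged kernel.

    'systemctl kexec' lets systemd stop services and unmount filesystems
    first; 'kexec -e' jumps immediately without that orderly shutdown.
    """
    return ["systemctl", "kexec"] if use_systemd else ["kexec", "-e"]


load = build_load_command("/boot/vmlinuz", "/boot/initrd.img", "root=/dev/sda1")
print(shlex.join(load))
print(shlex.join(build_exec_command()))
```

Passing the result to subprocess.run as a list (rather than a shell string) avoids quoting problems when the kernel command line contains spaces.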

Applications and Use Cases

Fast System Rebooting

kexec enables fast system rebooting by permitting the running Linux kernel to directly load a new kernel into memory and execute it, circumventing the firmware's Power-On Self-Test (POST), hardware initialization, and bootloader phases such as GRUB. This bypass eliminates time-intensive steps like BIOS or UEFI checks and device probing, often shrinking reboot durations from minutes to seconds on compatible systems. Recent advancements, such as memory persistence over kexec using the Kexec HandOver (KHO) subsystem, allow preservation of non-movable memory pages across transitions, further reducing state loss in virtualized environments.

To set up kexec for fast reboots, administrators install the kexec-tools package and configure it to preload the production kernel, typically via distro-specific files such as /etc/kexec.conf or through scripts and services like kexec-load.service. These configurations specify the kernel image, initramfs, and command-line arguments (often mirroring the current boot parameters) to ensure a smooth transition upon invocation of the kexec command or systemctl kexec.

In high-availability servers, kexec minimizes downtime; in cloud environments like AWS EC2 instances, it accelerates restarts without full hardware resets; and on development machines, it supports swift kernel testing cycles by avoiding lengthy boot sequences. Benchmarks demonstrate substantial performance gains, with reboot times typically reduced by 50-90% depending on hardware complexity. Examples include a drop from 184 seconds to 30 seconds on an x86 server with multiple cores and large RAM, and a 77% reduction in controlled evaluations.
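The percentage figures quoted above follow from a simple ratio. As a quick check, the sketch below computes the reduction for the 184-second to 30-second example from the text; the function name is illustrative.

```python
def reboot_time_reduction(before_s, after_s):
    """Return the percentage by which the reboot time shrank."""
    if before_s <= 0:
        raise ValueError("the baseline reboot time must be positive")
    return (before_s - after_s) / before_s * 100.0


# 184 s for a full reboot versus 30 s with kexec.
print(f"{reboot_time_reduction(184, 30):.1f}%")  # prints 83.7%
```

That result sits inside the 50-90% range quoted for typical hardware.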

Kernel Crash Analysis (kdump)

Kdump is a kernel crash dumping mechanism that leverages kexec to boot a secondary "capture" kernel during a system panic, enabling the preservation and analysis of the crashed kernel's state. This approach allows for the creation of a memory dump file, known as vmcore, which captures the volatile contents of RAM that would otherwise be lost upon a full system reboot. The capture kernel is typically a minimal kernel configured specifically for dump collection, often paired with an initramfs image generated using tools like dracut. Utilities such as makedumpfile are employed to process the dump, compressing it and applying filters to exclude unnecessary data like free pages, zero pages, or user-space data, thereby reducing the dump size and focusing on kernel-relevant information.

Configuration of kdump begins with reserving a portion of physical memory for the capture kernel during the primary kernel's boot process, specified via the crashkernel= parameter in the bootloader configuration (e.g., GRUB). This parameter reserves a fixed amount of memory, such as crashkernel=256M, or uses auto-detection like crashkernel=auto for systems with varying RAM sizes; an optional offset (e.g., crashkernel=256M@16M) can be set to avoid conflicts with low-memory regions. The kexec tools load the capture kernel and initramfs into this reserved area. Further customization occurs through the /etc/kdump.conf file, which defines the dump target, such as a local path (e.g., path /var/crash), an NFS share (e.g., nfs server.example.com:/export/cores), or an SSH transfer (the ssh directive with a user and host, plus an SSH key), along with options for the core collector like makedumpfile for filtering and compression.

Upon a crash, triggered by events like panic(), a fatal die() kernel error, or a non-maskable interrupt (NMI), the primary kernel invokes kexec to immediately transfer control to the pre-loaded capture kernel without performing a hardware reset or BIOS reinitialization. The capture kernel then mounts the target file system (or network target if specified) and executes scripts to process the memory image accessible via /proc/vmcore, which appears as a file representing the entire physical memory of the crashed system. makedumpfile processes this vmcore by applying filters (e.g., dump level 31 to exclude zero pages, cache pages, user data, and free pages) and compression (e.g., zlib, LZO, or snappy) before saving the output to the configured target, after which the system can reboot normally. This process ensures a complete and consistent dump, even in cases of hardware faults or triple faults that would corrupt traditional netdump or diskdump methods.

The primary advantage of kdump lies in its ability to capture the full, unaltered kernel memory state, including registers, stacks, and dynamic data structures, which is lost with conventional crash dumping techniques, facilitating detailed post-mortem analysis with tools like the crash utility. By deferring the full reboot cycle until after the dump, kdump preserves the crash state and supports analysis of subtle issues like memory corruption or driver bugs. This feature has been integrated into the Linux kernel since version 2.6.13, with ongoing enhancements for architectures including x86_64, PowerPC, s390, and others.
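The makedumpfile dump level mentioned above is a bitmask, which a short sketch can decode. The bit assignments below follow my reading of the makedumpfile manual page (1 zero pages, 2 non-private cache, 4 private cache, 8 user data, 16 free pages, with the private-cache bit also excluding non-private cache pages); treat them as an assumption and confirm against the manual for your version. The function name is illustrative.

```python
# Bit meanings as described in the lead-in (assumed from the manual page).
DUMP_LEVEL_BITS = {
    1: "zero pages",
    2: "cache pages (non-private)",
    4: "cache pages (private)",
    8: "user process data",
    16: "free pages",
}


def excluded_page_types(level):
    """List the page types a given dump level asks makedumpfile to skip."""
    if not 0 <= level <= 31:
        raise ValueError("dump level must be between 0 and 31")
    if level & 4:
        level |= 2  # excluding private cache also excludes non-private cache
    return [name for bit, name in DUMP_LEVEL_BITS.items() if level & bit]


for lvl in (0, 17, 31):
    print(lvl, excluded_page_types(lvl))
```

Level 31 sets every bit, which is why it produces the smallest dump; level 0 keeps everything and is mainly useful when user-space memory matters to the investigation.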

Security and Limitations

Potential Vulnerabilities

The kexec mechanism introduces security risks primarily through its capability to load and execute a new kernel image directly from the running kernel, bypassing conventional bootloader and firmware safeguards. The core vulnerability stems from the kexec_load system call, which requires the CAP_SYS_BOOT Linux capability (typically granted only to processes running as root) to load kernel images into memory. A process possessing this capability can inject an unverified or malicious kernel, enabling full system takeover, as the loaded kernel executes with unrestricted privileges upon invocation. Without additional verification mechanisms, this allows attackers with sufficient privileges to replace the running kernel with one that evades detection.

In systems enabling Secure Boot, kernels prior to version 3.17 permitted the use of kexec_load to bypass signature enforcement, allowing the loading of unsigned kernel images despite Secure Boot's restrictions on boot-time code. This flaw undermined the integrity chain enforced by Secure Boot, as kexec operates post-boot and does not invoke the firmware's verification process. The introduction of the kexec_file_load system call in kernel 3.17 addressed this by mandating signature checks for loaded images when Secure Boot is active, restricting legacy kexec_load usage on locked-down systems.

Historical vulnerabilities in kexec-related components have exposed systems to exploitation. For instance, CVE-2021-20269 affected kexec-tools by setting overly permissive permissions (0666) on log files generated during kernel loading, enabling local unprivileged users to read these files and extract sensitive kernel details, such as internal structures or crash data.

Additional risks include denial-of-service conditions in resource-constrained systems, where failures in reserving memory for kexec images (such as for crash dumps) or overflows in the purgatory code (the intermediate loader between kernels) can halt operation or prevent recovery. For example, NULL pointer dereferences during kexec operations have been shown to crash the kernel, rendering the system unresponsive. These issues are exacerbated in low-memory or embedded environments, where improper image sizing leads to allocation failures without graceful degradation.
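The 0666 permission problem described above is easy to test for in general. The sketch below checks whether a mode grants read or write access to "other" users; it is a generic illustration and not part of kexec-tools, and the function name is hypothetical.

```python
import stat


def world_accessible(mode):
    """Return True if the mode lets users outside the owner and group read or write."""
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


for mode in (0o666, 0o640, 0o600):
    # stat.filemode renders the bits the way 'ls -l' would for a regular file.
    print(stat.filemode(stat.S_IFREG | mode), world_accessible(mode))
```

In practice the mode would come from os.stat(path).st_mode for each log or dump file; anything flagged here is readable by unprivileged local users, which is the exposure the CVE describes.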

Mitigation Strategies

Access controls form the first line of defense in securing kexec deployments by limiting who can invoke the necessary system calls. The kexec_load and kexec_file_load syscalls require the CAP_SYS_BOOT capability, which is typically restricted to the root user to prevent non-privileged processes from loading new kernels. Additionally, mandatory access control (MAC) frameworks like SELinux can confine the kexec-tools processes through targeted policies, ensuring they operate within defined domains and preventing unauthorized file accesses or transitions during kernel loading; for instance, Red Hat Enterprise Linux (RHEL) SELinux policies include rules that may deny kexec operations if not explicitly permitted. AppArmor provides similar path-based confinement for kexec-tools on distributions like Ubuntu, restricting the utility to specific file paths and capabilities to mitigate privilege escalation risks.

Secure loading mechanisms enhance integrity by verifying kernel images before execution. Introduced in Linux kernel 3.17, the kexec_file_load syscall supports loading kernels from file descriptors while enabling signature verification, particularly on systems configured with CONFIG_KEXEC_FILE and CONFIG_KEXEC_VERIFY_SIG to mandate signed images. This integrates with Integrity Measurement Architecture (IMA) and Extended Verification Module (EVM) for appraising file signatures stored in extended attributes like security.ima, allowing only verified kernels to load and preventing tampering in Secure Boot environments.

Configuration hardening further reduces exposure by disabling vulnerable legacy features and enforcing runtime protections. Kernel loading can be permanently disabled via the kernel.kexec_load_disabled sysctl (set to 1 in /proc/sys/kernel/), which blocks both kexec_load and kexec_file_load once enabled and is often used in conjunction with modules_disabled for comprehensive runtime restrictions. Since kernel 5.4, lockdown mode (enabled via lockdown=integrity or lockdown=confidentiality) prohibits unsigned kexec loads in protected states, waiving restrictions only for IMA-appraised images to maintain kernel integrity.

Monitoring and auditing provide visibility into kexec activities for threat detection in enterprise environments. The Linux Audit system (auditd) can log kexec-related syscalls, such as kexec_load and kexec_file_load (syscall numbers 246 and 320 on x86_64), using a rule like "-a always,exit -F arch=b64 -S kexec_load -S kexec_file_load -k kexec-load" added to /etc/audit/rules.d/audit.rules, enabling analysis of invocation attempts via ausearch. In RHEL deployments, signature enforcement is recommended through Secure Boot integration and STIG-compliant configurations that disable unsigned kexec loads, ensuring only vendor-signed kernels are permissible.
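As a small aid for the hardening step above, this sketch reads the kexec_load_disabled setting and reports it. The path under /proc/sys/kernel/ comes from the text; the function name and the choice to return None when the file cannot be read (for example on a kernel built without kexec) are assumptions of this example.

```python
from pathlib import Path

# Location of the sysctl described in the text.
SYSCTL_PATH = Path("/proc/sys/kernel/kexec_load_disabled")


def kexec_load_disabled(path=SYSCTL_PATH):
    """Return True if kexec loading is disabled, False if allowed, None if unknown."""
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None  # file missing or unreadable: the state cannot be determined
    return text == "1"


state = kexec_load_disabled()
print({True: "kexec loading is disabled",
       False: "kexec loading is allowed",
       None: "setting not available"}[state])
```

Because the setting cannot be switched back to 0 without a reboot, a compliance check like this only needs to run once per boot.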

References
