Express Data Path
from Wikipedia
XDP
Original authors: Brenden Blanco, Tom Herbert
Developers: Open source community, Google, Amazon, Intel, Microsoft[1]
Initial release: 2016
Written in: C
Operating system: Linux, Windows
Type: Packet filtering
License: Linux: GPL; Windows: MIT License

XDP (eXpress Data Path) is an eBPF-based high-performance network data path used to send and receive network packets at high rates by bypassing most of the operating system networking stack. It has been part of the Linux kernel since version 4.8.[2] The Linux implementation is licensed under the GPL. Large technology firms including Amazon, Google and Intel support its development. Microsoft released its free and open-source implementation, XDP for Windows, in May 2022;[1] it is licensed under the MIT License.[3]

Data path

Packet flow paths in the Linux kernel. XDP bypasses the networking stack and memory allocation for packet metadata.

The idea behind XDP is to add an early hook in the RX path of the kernel and let a user-supplied eBPF program decide the fate of the packet. The hook is placed in the network interface controller (NIC) driver just after interrupt processing and before any memory allocation needed by the network stack itself, because memory allocation can be an expensive operation. Due to this design, XDP can drop 26 million packets per second per core with commodity hardware.[4]

The eBPF program must pass a verifier check[5] before being loaded, to avoid executing malicious or unsafe code in kernel space. The verifier checks that the program contains no out-of-bounds memory accesses or unbounded loops.

The program is allowed to edit the packet data and, after the eBPF program returns, an action code determines what to do with the packet (a minimal example follows the list):

  • XDP_PASS: let the packet continue through the network stack
  • XDP_DROP: silently drop the packet
  • XDP_ABORTED: drop the packet and raise a tracepoint exception (signals a program error)
  • XDP_TX: bounce the packet back to the same NIC it arrived on
  • XDP_REDIRECT: redirect the packet to another NIC or user space socket via the AF_XDP address family
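As a minimal illustration of these return codes, the following sketch (not taken from the cited sources; the filtered port and program name are arbitrary) drops IPv4 UDP packets addressed to port 7, passes everything else, and returns XDP_ABORTED when a header is truncated:

```c
/* Minimal sketch of the XDP action codes; compile with, e.g.:
 *   clang -O2 -target bpf -c drop_udp_echo.c -o drop_udp_echo.o */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <linux/in.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int drop_udp_echo(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_ABORTED;            /* truncated frame: fires the xdp_exception tracepoint */
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;               /* not IPv4: hand to the normal stack */

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_UDP)
        return XDP_PASS;

    struct udphdr *udp = (void *)ip + ip->ihl * 4;
    if ((void *)(udp + 1) > data_end)
        return XDP_ABORTED;
    if (udp->dest == bpf_htons(7))     /* UDP echo port, chosen only as an example */
        return XDP_DROP;               /* silently discard */

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```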

XDP requires support in the NIC driver, but because not all drivers support it, it can fall back to a generic implementation that performs the eBPF processing inside the network stack, at the cost of lower performance.[6]

XDP has infrastructure to offload the eBPF program to a network interface controller that supports it, reducing CPU load. As of 2023, only Netronome[7] cards support this.

Microsoft is partnering with other companies and adding support for XDP in its MsQuic implementation of the QUIC protocol.[1]

AF_XDP


Along with XDP, a new address family entered the Linux kernel starting with version 4.18.[8] AF_XDP, formerly known as AF_PACKETv4 (which was never included in the mainline kernel),[9] is a raw socket optimized for high-performance packet processing that allows zero-copy transfers between the kernel and applications. As the socket can be used for both receiving and transmitting, it supports high-performance network applications implemented purely in user space.[10]


References

from Grokipedia
eXpress Data Path (XDP) is a high-performance, programmable networking framework integrated into the Linux kernel that enables fast packet processing directly within the kernel's network driver context, allowing incoming network packets to be handled at the earliest possible stage without requiring kernel-bypass techniques. Developed as part of the IO Visor Project, XDP leverages eBPF (extended Berkeley Packet Filter) programs to inspect, modify, forward, or drop packets, providing a safe and flexible environment for custom data-plane operations while maintaining compatibility with the existing networking stack. XDP was first introduced in 2016 through contributions from developers at Facebook and Red Hat, with its design formalized in a 2018 research paper presented at the ACM CoNEXT conference, and was integrated into the mainline Linux kernel starting from version 4.8.

The framework executes eBPF bytecode, compiled from high-level languages like C, early in the receive (RX) path of network interface controllers (NICs), enabling decisions such as packet rejection before memory allocation or stack traversal, which minimizes overhead and enhances security by avoiding userspace involvement for common tasks. Key actions supported include XDP_DROP for discarding packets, XDP_PASS for forwarding to the kernel stack, XDP_TX for immediate transmission, and XDP_REDIRECT for rerouting to other interfaces or sockets, all verified at load time via static analysis to prevent kernel crashes. In terms of performance, XDP achieves up to 24 million packets per second (Mpps) per core on commodity hardware, outperforming traditional kernel paths and even some userspace solutions by reducing latency and CPU utilization in high-throughput scenarios.

XDP supports advanced features such as stateful processing through eBPF maps for hash tables and counters, as well as integration with AF_XDP sockets for user-space access, making it suitable for applications such as DDoS mitigation, load balancing, and inline firewalls. Since its inception, XDP has been adopted in production environments by organizations such as Facebook and Cloudflare, with ongoing enhancements in recent kernels expanding hardware offload support and metadata access.

Overview

Definition

Express Data Path (XDP) is an eBPF-based technology designed for high-performance packet processing within the Linux kernel. It integrates directly into the network interface card (NIC) driver at the earliest receive (RX) point, allowing eBPF programs to execute on incoming packets before they proceed further into the kernel. The core purpose of XDP is to enable programmable decisions on incoming packets prior to kernel memory allocation or involvement of the full networking stack, thereby minimizing overhead and maximizing throughput. This approach supports processing rates of up to 26 million packets per second per core on commodity hardware. In contrast to traditional networking paths, XDP bypasses much of the operating system stack, for instance avoiding the initial allocation of socket buffer (skb) structures, to achieve lower latency and reduced CPU utilization. Originally developed as a GPL-licensed component of the Linux kernel, XDP received a Windows port in 2022, released under the MIT License. As of 2025, proposals such as XDP2 aim to further extend its capabilities for modern high-performance networking.

Advantages

XDP provides significant performance benefits by enabling line-rate packet processing directly in the network driver, achieving throughputs exceeding 100 Gbps on multi-core systems while maintaining low latency. This is accomplished by executing eBPF programs at the earliest possible stage in the receive path, before the creation of socket buffer (skb) structures or the invocation of the generic receive offload (GRO) and generic segmentation offload (GSO) layers, which reduces processing overhead for high-volume traffic scenarios such as DDoS mitigation and traffic filtering. For instance, simple packet-drop operations can reach up to 20 million packets per second (Mpps) per core, far surpassing traditional methods.

In terms of efficiency, XDP minimizes CPU utilization by allowing early decisions on packet fate, such as dropping invalid packets, thereby freeing kernel resources for other tasks and avoiding unnecessary memory allocations or context switches deeper in the networking stack. This approach scales across multiple cores without the need for kernel-bypass techniques like DPDK, while retaining the security and interoperability of the kernel networking subsystem. Additionally, XDP's support for zero-copy operation further reduces memory bandwidth consumption, enhancing overall system efficiency in bandwidth-intensive environments.

The flexibility of XDP stems from its integration with eBPF, enabling programmable custom logic for packet processing without requiring kernel modifications or recompilation, which facilitates rapid adaptation to evolving network requirements. Compared to conventional tools such as iptables, XDP can be significantly faster for basic filtering tasks, with speedups of up to 5 times, due to its position in the data path and avoidance of higher-layer overheads. Furthermore, XDP enhances observability through integration with tools like bpftrace, allowing efficient monitoring and tracing of network events in production environments.

History and Development

Origins

The development of Express Data Path (XDP) was initiated in 2016 by Jesper Dangaard Brouer, a principal kernel engineer at Red Hat, in response to growing demands for high-performance networking in environments where the traditional Linux kernel networking stack struggled at speeds exceeding 10 Gbps. Traditional kernel processing, including socket buffer (SKB) allocation and memory management, created significant bottlenecks under high packet rates, often limiting throughput to below line rate for multi-gigabit interfaces. The project aimed to enable programmable, kernel-integrated packet processing that could rival user-space solutions like DPDK while maintaining compatibility with the existing networking stack.

Key contributions came from the open-source Linux kernel community, with significant input from engineers at Google, Amazon, and Intel, who helped refine the design through collaborative patch reviews and testing. Early efforts built upon the eBPF (extended Berkeley Packet Filter) framework, which had advanced in 2014 to support more complex in-kernel programs, allowing XDP to extend programmable packet processing beyond existing hooks like traffic control (tc). Initial prototypes focused on integrating XDP hooks into network drivers, with testing conducted on Netronome SmartNICs to evaluate offloading capabilities and on Mellanox ConnectX-3 Pro adapters (supporting 10/40 Gbps Ethernet) to demonstrate drop rates of up to 20 million packets per second on a single core. These prototypes validated the feasibility of early packet inspection and processing directly in the driver receive path, minimizing overhead from higher-layer kernel components.

Milestones

XDP was initially merged into Linux kernel version 4.8 in 2016, introducing basic support for programmable packet processing at the driver level, with initial driver support in the Mellanox mlx4 driver. In 2018, Linux kernel 4.18 added AF_XDP, a socket family enabling efficient user-space access to XDP-processed packets and facilitating zero-copy data transfer between kernel and user space. Microsoft ported XDP to Windows in 2022, releasing an open-source implementation that integrates with the MsQuic library to accelerate QUIC protocol processing by bypassing the traditional network stack.

Between 2023 and 2024, XDP driver support expanded to additional Intel Ethernet controllers, such as the E810 series, while Netronome hardware offloading achieved greater stability through kernel enhancements for reliable eBPF program execution on SmartNICs. In 2024 and 2025, kernel updates addressed critical issues, including a fix for a race condition in the AF_XDP receive path identified as CVE-2025-37920, where improper synchronization in shared-UMEM mode could lead to concurrent access by multiple CPU cores; this was resolved by relocating the rx_lock to the buffer pool structure. The ecosystem around XDP has also grown, with the introduction of uXDP as a userspace runtime for executing verified XDP programs outside the kernel while maintaining compatibility, and workarounds enabling XDP-like processing for egress traffic. XDP's core implementation in Linux remains under the GPL license, matching the kernel's licensing requirements, whereas the Windows port adopts the more permissive MIT License to broaden adoption across platforms.

Core Functionality

Data Path Mechanics

The eXpress Data Path (XDP) hook is integrated at the earliest point in the receive (RX) path within the Linux kernel's network device driver, immediately following the network interface card (NIC)'s direct memory access (DMA) transfer of packet data from the RX descriptor ring into kernel memory buffers, but prior to any socket buffer (skb) allocation or engagement with the broader network stack. This placement minimizes latency by allowing programmable processing before traditional kernel overheads. In cases where a driver lacks native support, XDP falls back to a generic mode (also known as SKB mode) that runs in the kernel's NAPI processing after skb allocation, resulting in higher overhead but ensuring compatibility.

Upon DMA transfer, the raw packet data resides in a kernel buffer, where an eBPF program attached to the XDP hook executes directly on it, using metadata from the xdp_md context, such as the packet data pointers, ingress interface index, and RX queue ID, for contextual analysis. This flow enables rapid decisions on packet disposition without propagating the frame through the full kernel network stack, thereby reducing CPU cycles and memory usage in high-throughput scenarios. XDP supports three execution modes: native mode, which embeds the hook directly in the NIC driver for optimal performance on supported hardware; generic mode, a universal software fallback that integrates into the standard RX path with higher overhead; and offload mode, where the program is transferred to the NIC for hardware-accelerated execution, bypassing the host CPU entirely.

To enhance efficiency, XDP leverages the kernel's page pool API for memory management, allocating and recycling page-sized pools dedicated to XDP frames and associated skbs, which avoids frequent page allocations and reduces cache misses in high-rate environments. This approach supports multicast traffic handling, where replicated packets can be processed across the relevant queues, and integrates with Receive Side Scaling (RSS) to distribute ingress load via hardware hashing to multiple RX queues for parallel execution. Traditionally limited to ingress processing on the RX path, XDP saw 2025 advancements enabling egress support through eBPF-based techniques that manipulate kernel packet-direction heuristics, extending its applicability to outbound traffic without native TX hooks.
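As an illustration of the per-packet context described above, the following sketch (the interface index and queue number are arbitrary examples, not values from the text) branches on the xdp_md metadata fields that are available before any skb exists:

```c
/* Sketch: reading xdp_md metadata; the ifindex/queue policy is illustrative. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int per_queue_policy(struct xdp_md *ctx)
{
    /* Interface the packet arrived on and the hardware RX queue chosen by RSS. */
    __u32 ifindex = ctx->ingress_ifindex;
    __u32 queue   = ctx->rx_queue_index;

    if (ifindex == 2 && queue == 0)
        return XDP_DROP;   /* example policy: drop queue-0 traffic on ifindex 2 */

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```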

Actions

In eXpress Data Path (XDP), the possible decisions an XDP program can make on a received packet are expressed by returning one of the values from the enum xdp_action, which the kernel uses to execute the corresponding handling without further program involvement. These actions enable efficient packet processing at the driver level, allowing high-performance decisions such as dropping unwanted traffic or redirecting packets to alternative paths.

XDP_DROP instructs the kernel to immediately discard the packet, freeing the underlying DMA buffer directly in the driver without allocating kernel data structures such as sk_buff or passing the packet to the network stack. This action is particularly effective for early-stage filtering, such as mitigating DDoS attacks, as it minimizes CPU overhead and latency compared to traditional stack-based dropping.

XDP_PASS forwards the packet to the standard networking stack for further processing, such as routing, firewalling, or delivery to user space. It allows the XDP program to inspect or modify the packet before normal handling resumes, preserving compatibility with existing network functionality.

XDP_TX causes the kernel to transmit the packet back out through the same network interface it arrived on, often used for reflecting packets or simple redirects without changing the egress device. This action reuses the original buffer for transmission, enabling low-overhead operations such as packet mirroring or bouncing packets back toward the sender.

XDP_REDIRECT redirects the packet to a different network interface, CPU queue, or AF_XDP socket, typically invoked via the eBPF helper bpf_redirect() or map-based variants like bpf_redirect_map(). It supports advanced forwarding scenarios, such as load balancing across devices, by handing off the buffer to another device or processing context.

XDP_ABORTED, with a value of 0, signals an error or abort condition in the XDP program, resulting in the packet being dropped and the xdp_exception tracepoint being triggered. This action is primarily intended for debugging or testing purposes and is rarely returned intentionally in production environments.

The kernel interprets the returned enum xdp_action value to perform the specified operation immediately after program execution, ensuring minimal overhead in the data path. Statistics for these actions, including counts of drops, passes, transmissions, and redirects, are exposed by supported network drivers through the ethtool utility, allowing administrators to monitor XDP performance and efficacy.
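The redirect action is usually driven through a map. The following sketch (map name, size, and key are illustrative; user space is assumed to populate key 0 with a target interface index) forwards every packet to another device via a DEVMAP, which is one common way XDP_REDIRECT is used:

```c
/* Sketch of XDP_REDIRECT through a device map. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_DEVMAP);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
} tx_port SEC(".maps");

SEC("xdp")
int xdp_redirect_if(struct xdp_md *ctx)
{
    /* On recent kernels the low bits of the flags argument select the action
     * returned when the map lookup fails (here: fall back to XDP_PASS). */
    return bpf_redirect_map(&tx_port, 0, XDP_PASS);
}

char _license[] SEC("license") = "GPL";
```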

eBPF Integration

Program Development

eBPF programs for XDP are written in a restricted subset of C, using kernel headers such as <linux/bpf.h> and <bpf/bpf_helpers.h> to access the necessary types and helper functions. Developers define the main program function and annotate it with the SEC("xdp") macro to place it in the appropriate ELF section, ensuring it is recognized as an XDP program during loading. The function typically takes a struct xdp_md *ctx parameter, providing access to packet metadata such as ingress_ifindex for the incoming interface index. Programs must return an enum xdp_action value, such as XDP_DROP to discard packets or XDP_PASS to continue processing.

To compile the C source into an ELF object file containing eBPF bytecode, developers use LLVM/Clang with the BPF target architecture. The command clang -O2 -target bpf -c program.c -o program.o generates the object file, with the LLVM BPF backend supporting features such as bounded loops and helper-function inlining. This process ensures the bytecode adheres to the eBPF instruction constraints checked by the kernel.

Loading the program into the kernel uses the libbpf library, which provides the bpf_prog_load() function with BPF_PROG_TYPE_XDP as the program type. Once loaded, the program file descriptor is attached to a network device using bpf_set_link_xdp_fd() on the netdevice or, with newer kernels and libbpf versions, a BPF link of type BPF_LINK_TYPE_XDP created via bpf_link_create(). Alternatively, the iproute2 suite offers a command-line interface for attachment, ip link set dev <interface> xdp obj program.o sec xdp, simplifying deployment without custom userspace code. For inspection and management, the bpftool utility allows querying loaded programs via bpftool prog show and attached XDP programs via bpftool net show.

A representative example is a simple XDP program that drops packets with non-IPv4 Ethernet types:

```c
/* Drop every frame whose EtherType is not IPv4. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <linux/if_ether.h>

SEC("xdp")
int xdp_drop_non_ip(struct xdp_md *ctx)
{
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    struct ethhdr *eth = data;

    /* Bounds check required by the verifier. */
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_DROP;
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```

This program accesses the Ethernet header via the ctx pointers, performs bounds checking to prevent verifier rejection, and selectively drops non-IP traffic. Metadata such as ctx->ingress_ifindex can be used for interface-specific logic, for example applying different actions depending on the receiving device. Debugging XDP programs typically involves kernel-side tracing with bpf_trace_printk(), which logs messages to the tracing ring buffer readable from /sys/kernel/debug/tracing/trace_pipe, though this is unsuitable for production due to its performance overhead. For more scalable telemetry, developers populate eBPF maps with counters or statistics, which userspace applications can read and aggregate. The kernel verifier statically analyzes the bytecode for safety before accepting a program, so development often involves iterating through compilation and loading to resolve verification failures.
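Putting the loading steps above together, a user-space loader might look like the following sketch, which assumes libbpf 1.0 or later and uses bpf_xdp_attach() rather than the deprecated bpf_set_link_xdp_fd(); the object file, program name, and interface are placeholders:

```c
/* Sketch of a user-space XDP loader with libbpf; link with -lbpf. */
#include <bpf/libbpf.h>
#include <linux/if_link.h>   /* XDP_FLAGS_* */
#include <net/if.h>
#include <stdio.h>

int main(void)
{
    struct bpf_object *obj = bpf_object__open_file("program.o", NULL);
    if (!obj || bpf_object__load(obj))
        return 1;

    struct bpf_program *prog =
        bpf_object__find_program_by_name(obj, "xdp_drop_non_ip");
    if (!prog)
        return 1;

    int prog_fd = bpf_program__fd(prog);
    int ifindex = if_nametoindex("eth0");

    /* Attach in native driver mode; this fails if the driver lacks support. */
    if (bpf_xdp_attach(ifindex, prog_fd, XDP_FLAGS_DRV_MODE, NULL)) {
        fprintf(stderr, "XDP attach failed\n");
        return 1;
    }
    return 0;
}
```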

Safety Mechanisms

The eBPF verifier serves as a critical in-kernel static analyzer for XDP programs, simulating their execution paths to ensure safety before loading. It performs exhaustive checks for potential issues such as unreachable instructions, out-of-bounds memory accesses relative to packet boundaries (e.g., ensuring offsets do not exceed the data_end pointer in XDP contexts), invalid use of helper functions, and violations of the kernel's security model. If any unsafe behavior is detected, the verifier rejects the program, preventing it from being loaded and executed and thereby avoiding kernel crashes or exploits. This verification is mandatory for all eBPF program types, including XDP, and operates on the program's bytecode without adding runtime overhead during packet processing.

To enforce bounded execution, the verifier prohibits unbounded loops in eBPF programs, a restriction that originated with early eBPF designs to guarantee termination; since Linux kernel 5.3, bounded loops are permitted if the verifier can prove they will not exceed resource limits. A key safeguard is the instruction limit, capped at 1 million verified instructions per program (raised from 4,096 in kernel 5.2), which prevents excessive computation and potential denial-of-service scenarios. Additionally, map accesses, such as those to eBPF maps used for state in XDP filtering, are validated at load time, ensuring pointers remain within allocated bounds and avoiding arbitrary memory corruption. These mechanisms collectively ensure that XDP programs remain deterministic and resource-bounded, maintaining kernel stability even under high packet rates.

Following successful verification of the bytecode, the kernel may apply just-in-time (JIT) compilation to translate it into native machine code for improved execution performance. The verifier's safety checks occur solely on the portable bytecode, independent of the JIT process, ensuring that optimizations do not introduce vulnerabilities. For error handling, XDP programs must return one of the predefined actions (e.g., XDP_PASS to continue processing, XDP_DROP to discard the packet, or XDP_REDIRECT for forwarding); in cases of unrecoverable errors a program returns XDP_ABORTED, which drops the packet and triggers the xdp_exception tracepoint for logging. In recent developments through 2025, the verifier has seen enhancements supporting more complex operations, including improved precision for packet redirects (e.g., XDP_REDIRECT combined with tail calls) and metadata handling in XDP programs, where additional packet metadata can be accessed safely without bounds violations. These updates allow the verifier to handle intricate control flows more accurately while rejecting fewer valid programs, continuing efforts to balance safety and expressiveness in high-performance networking scenarios.
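As an illustration of code the verifier accepts (a sketch; the VLAN header layout is defined locally rather than taken from a kernel header), the following fragment uses a loop with a small compile-time bound, which kernels 5.3 and later can prove terminates, together with the per-access bounds checks the verifier requires:

```c
/* Sketch: a verifier-friendly bounded loop that skips up to two VLAN tags. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <linux/if_ether.h>

struct vlan_hdr {                      /* defined locally for the example */
    __be16 h_vlan_TCI;
    __be16 h_vlan_encapsulated_proto;
};

SEC("xdp")
int skip_vlan(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    __u16 proto = eth->h_proto;
    void *cursor = eth + 1;

    /* At most two iterations, each guarded by an explicit bounds check,
     * so the verifier can prove both termination and memory safety. */
    for (int i = 0; i < 2; i++) {
        if (proto != bpf_htons(ETH_P_8021Q) && proto != bpf_htons(ETH_P_8021AD))
            break;
        struct vlan_hdr *vh = cursor;
        if ((void *)(vh + 1) > data_end)
            return XDP_PASS;
        proto = vh->h_vlan_encapsulated_proto;
        cursor = vh + 1;
    }

    return proto == bpf_htons(ETH_P_IP) ? XDP_PASS : XDP_DROP;  /* example policy */
}

char _license[] SEC("license") = "GPL";
```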

User-Space Access

AF_XDP Sockets

AF_XDP sockets, introduced in Linux kernel 4.18, provide a specialized address family (AF_XDP, also referred to as PF_XDP) designed for high-performance, zero-copy operation, enabling direct packet transfer from kernel-space XDP programs to user-space applications and bypassing much of the traditional networking stack. This raw socket type allows XDP programs to redirect ingress traffic straight to user-space buffers, supporting applications that require low-latency, high-throughput networking.

To create an AF_XDP socket, applications invoke the standard socket syscall with the address family AF_XDP, socket type SOCK_RAW, and protocol 0: fd = socket(AF_XDP, SOCK_RAW, 0);. Following creation, the socket is configured via setsockopt() (for example to register the shared user memory, or UMEM) and then bound to a specific network interface and receive queue using the bind() syscall, specifying the interface index and queue identifier. This binding associates the socket with a particular hardware receive queue, enabling targeted packet reception from XDP-processed traffic on that queue.

The core of AF_XDP's efficiency lies in its user memory (UMEM) model, in which user space allocates a contiguous memory region and registers it with the kernel via setsockopt() at the SOL_XDP level. The UMEM is divided into fixed-size frames, and communication between kernel and user space occurs through four lock-free ring buffers: the RX ring for incoming packet descriptors from the kernel to user space, the TX ring for outgoing descriptors from user space to the kernel, the FILL ring for user space to supply empty frames to the kernel, and the COMPLETION ring for the kernel to notify user space of transmitted frames. Descriptors in these rings reference UMEM frame addresses and lengths, allowing shared access without data copying in optimal configurations.

AF_XDP supports two operational modes for packet handling: copy mode, which relies on traditional sk_buff structures for data transfer and is compatible with all XDP-capable drivers, and zero-copy mode, which gives the driver direct DMA access to UMEM pages for ingress and egress, minimizing overhead but requiring explicit driver support. Upon binding, the kernel attempts zero-copy if available; otherwise it defaults to copy mode. Driver support for zero-copy has expanded in recent kernels, improving performance on supported hardware.

As of 2025, AF_XDP has seen integrations aimed at broader ecosystem compatibility, including an AF_XDP poll-mode driver in the DPDK framework that allows DPDK applications to use AF_XDP sockets for raw packet I/O on supported NICs, easing migration from kernel-bypass libraries. Additionally, experimental implementations in DNS servers such as NSD use AF_XDP to handle UDP queries directly in user space, demonstrating improved query processing rates with minimal CPU overhead, including a reported 1.7x improvement over traditional UDP handling in experimental tests.
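The setup sequence described above can be sketched with raw syscalls as follows (error handling and the mmap()ing of the four rings via the offsets from getsockopt(XDP_MMAP_OFFSETS) are omitted; frame geometry, ring sizes, and fallback constants are illustrative):

```c
/* Sketch of AF_XDP socket creation, UMEM registration, and binding. */
#include <linux/if_xdp.h>
#include <net/if.h>
#include <sys/socket.h>
#include <sys/mman.h>

#ifndef AF_XDP
#define AF_XDP 44        /* older libc headers may lack these definitions */
#endif
#ifndef SOL_XDP
#define SOL_XDP 283
#endif

#define NUM_FRAMES 4096
#define FRAME_SIZE 2048

int setup_xsk(const char *ifname, int queue_id)
{
    int fd = socket(AF_XDP, SOCK_RAW, 0);

    /* Allocate the UMEM area in user space and register it with the kernel. */
    void *umem_area = mmap(NULL, (size_t)NUM_FRAMES * FRAME_SIZE,
                           PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct xdp_umem_reg umem_reg = {
        .addr       = (unsigned long long)(unsigned long)umem_area,
        .len        = (unsigned long long)NUM_FRAMES * FRAME_SIZE,
        .chunk_size = FRAME_SIZE,
        .headroom   = 0,
    };
    setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &umem_reg, sizeof(umem_reg));

    /* Size the four rings; each must then be mmap()ed (not shown). */
    int ring_size = 2048;
    setsockopt(fd, SOL_XDP, XDP_RX_RING, &ring_size, sizeof(ring_size));
    setsockopt(fd, SOL_XDP, XDP_TX_RING, &ring_size, sizeof(ring_size));
    setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING, &ring_size, sizeof(ring_size));
    setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING, &ring_size, sizeof(ring_size));

    /* Bind the socket to one RX queue of the interface. */
    struct sockaddr_xdp sxdp = {
        .sxdp_family   = AF_XDP,
        .sxdp_ifindex  = if_nametoindex(ifname),
        .sxdp_queue_id = queue_id,
    };
    bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
    return fd;
}
```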

Zero-Copy Features

AF_XDP enables zero-copy packet handling through a shared memory region known as UMEM, a contiguous block of user-allocated memory divided into fixed-size frames, typically 2 KB or 4 KB each, that holds packet data without intermediate copies between kernel and user space. The kernel driver writes packet descriptors into ring buffers associated with this UMEM, allowing the network interface card (NIC) to DMA packet data straight into the frames, while the user-space application accesses the data via these descriptors. Multiple AF_XDP sockets can share the same UMEM for efficient resource utilization in multi-queue setups.

Ring buffer operations in zero-copy mode rely on four memory-mapped rings associated with the UMEM: the fill ring, where user space provides available frames for incoming packets; the RX ring, where the kernel enqueues receive descriptors pointing to filled frames; the TX ring, for user-submitted transmit descriptors; and the completion ring, where the kernel signals TX completions. User space advances the producer and consumer indices of these single-producer/single-consumer rings to synchronize access, minimizing system calls through techniques such as busy polling or poll() notifications, while the kernel updates its side to reflect buffer states. This design avoids memcpy operations entirely, as both kernel and user space operate on the same underlying frames.

By eliminating data copying between kernel and user space, AF_XDP achieves significant performance improvements, such as line-rate receive-only processing on 40 Gbps NICs in user-space applications like packet capture. These gains stem from fewer CPU cycles spent on memory transfers and fewer context switches, enabling applications to handle high-throughput traffic more efficiently than traditional socket interfaces.

To request zero-copy mode explicitly, applications set the XDP_ZEROCOPY flag when binding the socket via the bind() syscall, which requires a compatible NIC driver with direct UMEM access, such as Intel's i40e for 40 Gbps Ethernet; if no flag is specified, the kernel attempts zero-copy where available and otherwise falls back to copy mode using SKB buffers, whereas an explicit XDP_ZEROCOPY request fails on unsupported drivers. Zero-copy is provided by drivers in native (XDP_DRV) mode, in contrast with the generic XDP_SKB mode, which always copies data.

In 2025, fixes improved reliability, including one for a race condition in the generic RX path under shared-UMEM scenarios (CVE-2025-37920), where insufficient locking could lead to races across multiple sockets; this was resolved by relocating the rx_lock to the buffer pool level in kernels after 6.9. Performance studies of mixed-mode deployments, combining zero-copy and copy-based sockets on programmable NICs, highlighted the benefits but noted potential bottlenecks from uneven buffer allocation, informing optimizations for hybrid environments.
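Continuing the sockaddr_xdp setup from the previous sketch, the flag handling might look like this (the errno-based fallback logic is an assumption about how an application would react, not a kernel-mandated sequence):

```c
/* Sketch: request zero-copy at bind time and fall back to copy mode. */
#include <linux/if_xdp.h>
#include <sys/socket.h>
#include <errno.h>

static int bind_xsk(int fd, struct sockaddr_xdp *sxdp)
{
    sxdp->sxdp_flags = XDP_ZEROCOPY;            /* require direct UMEM access */
    if (bind(fd, (struct sockaddr *)sxdp, sizeof(*sxdp)) == 0)
        return 0;                               /* driver supports zero-copy */

    if (errno == EOPNOTSUPP || errno == EINVAL) {
        sxdp->sxdp_flags = XDP_COPY;            /* explicit copy mode via skbs */
        return bind(fd, (struct sockaddr *)sxdp, sizeof(*sxdp));
    }
    return -1;
}
```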

Hardware Support

Offloading Modes

XDP supports hardware offloading through modes that execute eBPF programs directly on the network interface card (NIC), bypassing the host CPU for packet processing. The primary mode is selected with the XDP_FLAGS_HW_MODE flag, which attaches the eBPF program for full offload to the NIC when both driver and hardware support this capability. When hardware offload is unavailable or unsupported, XDP_FLAGS_SKB_MODE can be used as a fallback, running the program in software on the kernel's socket buffer (SKB) path. Additionally, launch-time offload of transmit (TX) metadata, merged in Linux kernel 6.14 (2025), allows the NIC to schedule packets based on specified timestamps without host intervention.

The offloading process involves compiling the eBPF program into a format compatible with the target hardware, such as P4 for programmable switches or NIC-specific firmware instructions, before loading it onto the device. This compilation ensures the program adheres to the hardware's instruction-set limitations. The program is loaded using the devlink interface, a kernel subsystem for managing device resources, which handles the transfer to the NIC firmware. The kernel verifier performs compatibility checks during loading to confirm that the program and hardware align, preventing mismatches that could lead to failures, and driver-specific hooks handle the attachment, ensuring integration with the NIC's data path.

Offloading provides significant benefits, including zero host-CPU involvement after initial setup, enabling line-rate packet processing on SmartNICs even under high traffic loads. It supports core XDP actions such as XDP_DROP entirely in hardware, allowing packets to be filtered or forwarded without reaching the host stack, which is particularly useful for DDoS mitigation and other performance-critical applications. However, hardware offload is constrained to a subset of eBPF features, excluding complex operations such as advanced map manipulations and certain helper functions, to match hardware capabilities. It may also require NIC firmware updates to gain new offload features, limiting adoption to compatible devices. Integration with Time-Sensitive Networking (TSN) has advanced, enabling XDP offload to support deterministic traffic scheduling in industrial and real-time environments.
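With libbpf, the mode selection can be sketched as a fallback chain over the attach flags (note that genuine hardware offload additionally requires the program to be loaded against the target device, for example via the object's ifindex, which is omitted here):

```c
/* Sketch: try hardware offload, then native driver mode, then generic mode. */
#include <bpf/libbpf.h>
#include <linux/if_link.h>

static int attach_best_effort(int ifindex, int prog_fd)
{
    const unsigned int modes[] = {
        XDP_FLAGS_HW_MODE,    /* run on the NIC itself            */
        XDP_FLAGS_DRV_MODE,   /* native hook in the NIC driver    */
        XDP_FLAGS_SKB_MODE,   /* generic fallback after skb alloc */
    };

    for (unsigned int i = 0; i < sizeof(modes) / sizeof(modes[0]); i++)
        if (bpf_xdp_attach(ifindex, prog_fd, modes[i], NULL) == 0)
            return (int)modes[i];
    return -1;
}
```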

Supported Devices

Express Data Path (XDP) hardware support is available on a range of network interface controllers (NICs) and platforms, enabling native or offloaded execution of XDP programs for high-performance packet processing. Intel Ethernet controllers provide full native XDP support through the ice driver for E810 series devices, covering XDP and AF_XDP operations on kernels 4.14 and later. The 700-series controllers, such as those based on the X710, achieve similar support via the i40e driver for native XDP on kernels 4.14 and later, with iavf handling virtual functions in SR-IOV configurations on kernels 5.10 and above. Netronome Agilio SmartNICs have offered early and stable XDP offload support since 2016, allowing eBPF/XDP programs to execute directly on the NIC hardware for packet filtering and processing tasks. NVIDIA (formerly Mellanox) ConnectX-6 and later NICs support driver-level XDP execution, enabling high-throughput packet handling in native mode on Linux; hardware offload for XDP is not supported on BlueField-2 DPUs as of the latest available information (2023), with development ongoing.

Other vendors include Broadcom, whose SmartNIC family supports XDP by running full Linux distributions on the device, facilitating eBPF program deployment for network functions, and Marvell, whose OCTEON DPUs (such as the OCTEON TX and OCTEON 10 series) provide XDP and eBPF acceleration in configurations like Asterfusion's SmartNICs, targeting security and load-balancing workloads. Software-based XDP support extends to virtualized environments via the virtio-net driver, available since Linux kernel 4.10 for both host and guest packet processing. On Windows, basic XDP functionality is available through the Windows Driver Kit (WDK) via the open-source XDP-for-Windows project, which implements a high-performance packet I/O interface similar to Linux XDP. Hardware offload is supported on select Azure NICs, such as Mellanox adapters in virtualized setups, though these are primarily optimized for Linux guests with experimental Windows extensions.

To query XDP support and status on Linux, administrators can use ethtool -l <interface> to view channel configurations relevant to XDP multi-queue operation and ethtool -S <interface> for statistics including XDP drop counts. For offload flags and parameters, devlink dev param show <device> displays hardware offload capabilities, such as XDP mode settings on supported NICs.

Applications

Use Cases

XDP has been widely deployed for DDoS mitigation, where it enables early dropping of malformed or suspicious traffic at the network interface level, typically using the XDP_DROP action to discard packets before they consume kernel resources. Integration with intrusion detection systems such as Suricata allows XDP to apply custom filters for real-time threat detection and blocking, as demonstrated in deployments throughout 2024 that handle high-volume attacks efficiently. For instance, Cloudflare's L4Drop tool leverages XDP to filter Layer 4 DDoS traffic, achieving rapid mitigation by processing packets directly in the driver.

In load balancing and telemetry applications, XDP supports packet redirection to specific queues or devices using the XDP_REDIRECT action, facilitating efficient traffic distribution in containerized environments. Cilium, an eBPF-based networking solution for Kubernetes, employs XDP to accelerate service load balancing and enable flow sampling for monitoring, providing cluster-wide visibility into network traffic without kube-proxy overhead. This approach is particularly effective in dynamic cloud-native setups, where XDP programs dynamically update rules based on telemetry data to optimize routing and detect anomalies.

For high-speed packet capture and forwarding, XDP combined with AF_XDP sockets enables user-space applications to bypass the kernel stack, serving as a foundation for tools that outperform traditional utilities such as tcpdump. Cloudflare's xdpcap, for example, uses XDP to capture packets at line rate directly from the driver, supporting forwarding scenarios in monitoring and analysis pipelines. Red Hat's xdpdump further illustrates this by using XDP for efficient traffic examination in enterprise environments.

XDP accelerates QUIC and HTTP processing by enabling receive-side scaling, distributing incoming connections across CPU cores for better throughput in modern web protocols. Microsoft's MsQuic implementation incorporates XDP to bypass the kernel for UDP packet handling, improving latency and throughput in high-performance networking stacks. Research on QUIC acceleration confirms XDP's role in offloading receive processing, making it suitable for web servers and content delivery networks.

Recent 2025 developments highlight XDP's expanding versatility, such as a technique exploiting virtual Ethernet (veth) interfaces to apply XDP programs to egress traffic for shaping and filtering, previously limited to ingress paths. In DNS serving, the Name Server Daemon (NSD) integrates AF_XDP sockets to handle elevated query rates and improve protection against amplification attacks by enabling rapid filtering and processing of UDP traffic on port 53. Linux kernel 6.11 (September 2024) improved multi-buffer support for AF_XDP, boosting performance in cloud-native environments.

Enterprise adoption of XDP is evident among cloud providers, where it supports VPC traffic filtering and flow optimization through eBPF integrations. AWS employs eBPF, including XDP capabilities, to enforce network security groups and tune VPC flows for enhanced observability and threat detection. Similarly, Google Cloud integrates an eBPF-based dataplane in Google Kubernetes Engine via Cilium, enabling efficient packet filtering and load balancing within shared VPC architectures.

Performance Metrics

XDP programs demonstrate high throughput in packet-drop operations, achieving up to 14.9 million packets per second (Mpps) per core on Intel i7 processors, as measured in 2019 tests on Linux 4.18 systems with simple filters. With hardware offload to SmartNICs, performance scales to around 18 Mpps, enabling efficient processing on 25 Gbps interfaces without host involvement. Comparisons highlight XDP's efficiency for filtering tasks, delivering roughly 5-10 times higher throughput than iptables: XDP sustains up to 7.2 Mpps under heavy drop loads, while iptables tops out at around 1.5 Mpps with minimal rules. For user-space access via AF_XDP sockets, zero-copy mode reaches approximately 90% of line rate on high-speed links, compared to about 50% with traditional copy-based sockets, by avoiding kernel-to-user transfers.

Key metrics include decision latencies under 1 μs for basic operations in the driver hook, though average forwarding latency measures around 7 μs at 1 Mpps loads. CPU utilization remains below 5% when handling 40 Gbps traffic with multi-core scaling via Receive Side Scaling (RSS), allowing efficient resource use across cores. Recent 2025 evaluations of userspace XDP (uXDP) implementations report up to 40% performance improvements over kernel-mode execution for certain network functions, such as load balancing. Performance testing commonly employs tools such as TRex for generating high-volume traffic and pktgen for kernel-based packet generation, while xdp-bench provides detailed statistics on XDP program execution across modes. Throughput scales roughly linearly with the number of CPU cores and queues, and hardware offload modes bypass host CPU cycles entirely for processed packets. In mixed deployments, 2024 studies of AF_XDP report end-to-end delays below 10 μs when using busy polling and optimized socket parameters, supporting latency-sensitive applications.
