Express Data Path
| XDP | |
|---|---|
| Original authors | Brenden Blanco, Tom Herbert |
| Developers | Open source community, Google, Amazon, Intel, Microsoft[1] |
| Initial release | 2016 |
| Written in | C |
| Operating system | Linux, Windows |
| Type | Packet filtering |
| License | Linux: GPL; Windows: MIT License |
XDP (eXpress Data Path) is an eBPF-based high-performance network data path used to send and receive network packets at high rates by bypassing most of the operating system networking stack. It has been part of the mainline Linux kernel since version 4.8.[2] The Linux implementation is licensed under the GPL. Large technology firms including Amazon, Google and Intel support its development. Microsoft released its free and open-source implementation, XDP for Windows, in May 2022;[1] it is licensed under the MIT License.[3]
Data path
The idea behind XDP is to add an early hook in the RX path of the kernel and let a user-supplied eBPF program decide the fate of the packet. The hook is placed in the network interface controller (NIC) driver just after interrupt processing and before any memory allocation needed by the network stack itself, because memory allocation can be an expensive operation. Due to this design, XDP can drop 26 million packets per second per core with commodity hardware.[4]
The eBPF program must pass a verification step[5] before being loaded, to prevent malicious or unsafe code from executing in kernel space. The verifier checks that the program contains no out-of-bounds accesses, unbounded loops, or global variables.
The program is allowed to edit the packet data and, after the eBPF program returns, an action code determines what happens to the packet (a minimal program returning these codes is sketched below):
- XDP_PASS: let the packet continue through the network stack
- XDP_DROP: silently drop the packet
- XDP_ABORTED: drop the packet and raise a tracepoint exception
- XDP_TX: bounce the packet back out of the same NIC it arrived on
- XDP_REDIRECT: redirect the packet to another NIC or to a user-space socket via the AF_XDP address family
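A minimal sketch of such a program is shown below. It is illustrative rather than drawn from the cited sources: it drops IPv4/UDP packets addressed to an arbitrary example port (7777) and passes everything else, exercising the XDP_DROP and XDP_PASS codes.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <linux/udp.h>

SEC("xdp")
int drop_example_udp(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;                  /* truncated frame: let the stack decide */
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;                  /* not IPv4 */

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;
    if (ip->protocol != IPPROTO_UDP)
        return XDP_PASS;

    /* Assumes a 20-byte IPv4 header (no options) for brevity. */
    struct udphdr *udp = (void *)(ip + 1);
    if ((void *)(udp + 1) > data_end)
        return XDP_PASS;

    if (udp->dest == bpf_htons(7777))     /* illustrative port only */
        return XDP_DROP;                  /* silently discard */

    return XDP_PASS;                      /* continue through the stack */
}

char LICENSE[] SEC("license") = "GPL";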
XDP requires support in the NIC driver, but as not all drivers support it, it can fall back to a generic implementation that performs the eBPF processing inside the network stack, at lower performance.[6]
XDP has infrastructure to offload the eBPF program to a network interface controller that supports it, further reducing CPU load. As of 2023, only Netronome[7] cards support this.
Microsoft is partnering with other companies and adding support for XDP in its MsQuic implementation of the QUIC protocol.[1]
AF_XDP
Along with XDP, a new address family entered the Linux kernel starting with version 4.18.[8] AF_XDP, formerly known as AF_PACKETv4 (which was never included in the mainline kernel),[9] is a raw socket optimized for high-performance packet processing that allows zero-copy transfers between the kernel and applications. As the socket can be used for both receiving and transmitting, it supports high-performance network applications purely in user space.[10]
References
[edit]- ^ a b c Jawad, Usama (25 May 2022). "Microsoft brings Linux XDP project to Windows". Neowin. Retrieved 26 May 2022.
- ^ "[GIT] Networking - David Miller". lore.kernel.org. Retrieved 2019-05-14.
- ^ Yasar, Erdem (25 May 2022). "Microsoft introduced open-source XDP for Windows". cloud7. Archived from the original on 25 May 2022. Retrieved 26 May 2022.
- ^ Høiland-Jørgensen, Toke (2019-05-03), Source text and experimental data for our paper describing XDP: tohojo/xdp-paper, retrieved 2019-05-21
- ^ "A thorough introduction to eBPF [LWN.net]". lwn.net. Retrieved 2019-05-14.
- ^ "net: Generic XDP". www.mail-archive.com. Retrieved 2019-05-14.
- ^ "BPF, eBPF, XDP and Bpfilter… What are these things and what do they mean for the enterprise? - Netronome". www.netronome.com. Archived from the original on 2020-09-24. Retrieved 2019-05-14.
- ^ "kernel/git/torvalds/linux.git - Linux kernel source tree". git.kernel.org. Retrieved 2019-05-16.
- ^ "Questions about AF_PACKET V4 and AF_XDP". Kernel.org.
- ^ Corbet, Jonathan (9 April 2018). "Accelerating networking with AF_XDP [LWN.net]". lwn.net. Retrieved 2019-05-16.
External links
- XDP documentation on Read the Docs
- AF_XDP documentation on kernel.org
- xdp-for-windows on GitHub
- XDP walkthrough at FOSDEM 2017 by Daniel Borkmann, Cilium
- AF_XDP at FOSDEM 2018 by Magnus Karlsson, Intel
- eBPF.io - Introduction, Tutorials & Community Resources
- L4Drop: XDP DDoS Mitigations, Cloudflare
- Unimog: Cloudflare's edge load balancer, Cloudflare
- Open-sourcing Katran, a scalable network load balancer, Facebook
- Cilium's L4LB: standalone XDP load balancer, Cilium
- Kube-proxy replacement at the XDP layer, Cilium
- eCHO Podcast on XDP and load balancing
Express Data Path
Overview
Definition
Express Data Path (XDP) is an eBPF-based technology designed for high-performance network packet processing within the Linux kernel. It integrates directly into the network interface card (NIC) driver at the earliest receive (RX) point, allowing eBPF programs to execute on incoming packets before they proceed further into the kernel.[4][2] The core purpose of XDP is to enable programmable decisions on incoming packets prior to kernel memory allocation or involvement of the full networking stack, thereby minimizing overhead and maximizing throughput. This approach supports processing rates of up to 26 million packets per second per core on commodity hardware.[5] In contrast to traditional networking paths, XDP bypasses much of the operating system stack, for instance by avoiding the initial allocation of socket buffer (skb) structures, to achieve lower latency and reduced CPU utilization.[6] Originally developed as a GPL-licensed component of the Linux kernel, XDP received a Windows port in 2022, released under the MIT license.[7] As of 2025, developments like XDP2 are being proposed to further extend its capabilities for modern high-performance networking.[8]
Advantages
XDP provides significant performance benefits by enabling line-rate packet processing directly in the network driver, achieving throughputs exceeding 100 Gbps on multi-core systems while maintaining low latency. This is accomplished by executing eBPF programs at the earliest possible stage in the receive path, before the creation of socket buffer (skb) structures or the invocation of the generic receive offload (GRO) and segmentation offload (GSO) layers, which reduces processing overhead for high-volume traffic scenarios such as DDoS mitigation and traffic filtering. For instance, simple packet drop operations can reach up to 20 million packets per second (Mpps) per core, far surpassing traditional methods.[9][10]

In terms of resource efficiency, XDP minimizes CPU utilization by allowing early decisions on packet fate, such as dropping invalid packets, thereby freeing kernel resources for other tasks and avoiding unnecessary memory allocations or context switches deeper in the networking stack. This approach supports scalable deployment across multiple cores without the need for kernel-bypass techniques like DPDK, while retaining the security and interoperability of the Linux networking subsystem. Additionally, XDP's potential for zero-copy operations further reduces memory bandwidth consumption, enhancing overall system efficiency in bandwidth-intensive environments.[11][6][10]

The flexibility of XDP stems from its integration with eBPF, enabling programmable custom logic for packet processing without requiring kernel modifications or recompilation, which facilitates rapid adaptation to evolving network requirements. Compared to conventional tools like iptables or nftables, XDP can be significantly faster for basic filtering tasks, with speedups of up to 5 times, due to its position in the data path and avoidance of higher-layer overheads. Furthermore, XDP enhances ecosystem observability through seamless integration with tools like bpftrace, allowing for efficient monitoring and debugging of network events in production environments.[9][11][10]
History and Development
Origins
The development of Express Data Path (XDP) was initiated in 2016 by Jesper Dangaard Brouer, a principal kernel engineer at Red Hat, in response to the growing demands for high-performance networking in cloud computing environments where traditional Linux kernel networking stacks struggled with speeds exceeding 10 Gbps.[12] Traditional kernel processing, including socket buffer (SKB) allocation and memory management, created significant bottlenecks under high packet rates, often limiting throughput to below line-rate performance for multi-gigabit interfaces.[1] The project aimed to enable programmable, kernel-integrated packet processing that could rival user-space solutions like DPDK while maintaining compatibility with the existing networking stack.[13]

Key contributions came from the open-source Linux kernel community, with significant input from engineers at Google, Amazon, and Intel, who helped refine the design through collaborative patch reviews and testing.[1] Early efforts built upon the eBPF (extended Berkeley Packet Filter) framework, which had advanced in 2014 to support more complex in-kernel programs, allowing XDP to extend programmable packet processing beyond existing hooks like traffic control (tc).[12]

Initial prototypes focused on integrating XDP hooks into network drivers, with testing conducted on Netronome SmartNICs to evaluate offloading capabilities and on Mellanox ConnectX-3 Pro adapters (supporting 10/40 Gbps Ethernet) to demonstrate drop rates of up to 20 million packets per second on a single core.[12] These prototypes validated the feasibility of early packet inspection and processing directly in the driver receive path, minimizing overhead from higher-layer kernel components.[1]
Milestones
XDP was initially merged into Linux kernel version 4.8 in 2016, introducing basic support for programmable packet processing at the driver level, with an initial implementation in the Intel ixgbe Ethernet driver.[14] In 2018, Linux kernel 4.18 added AF_XDP, a socket family enabling efficient user-space access to XDP-processed packets, facilitating zero-copy data transfer between kernel and user space.[15] Microsoft ported XDP to the Windows kernel in 2022, releasing an open-source implementation that integrated with the MsQuic library to accelerate QUIC protocol processing by bypassing the traditional network stack.[16]

Between 2023 and 2024, XDP driver support expanded to additional Intel Ethernet controllers, such as the E810 series, while Netronome hardware offloading achieved greater stability through kernel enhancements for reliable eBPF program execution on smart NICs.[17][4] In 2024 and 2025, kernel updates addressed critical issues, including a fix for race conditions in the AF_XDP receive path identified as CVE-2025-37920, where improper synchronization in shared umem mode could lead to concurrent access by multiple CPU cores; this was resolved by relocating the rx_lock to the buffer pool structure.[18] The eBPF ecosystem around XDP also grew, with the introduction of uXDP as a userspace runtime for executing verified XDP programs outside the kernel while maintaining compatibility, and workarounds enabling XDP-like processing for egress traffic via kernel loopholes.[19][20]

XDP's core implementation in Linux remains under the GPL license, ensuring integration with the kernel's licensing requirements, whereas the Windows port adopts the more permissive MIT license to broaden adoption across platforms.
Core Functionality
Data Path Mechanics
The eXpress Data Path (XDP) hook is integrated at the earliest point in the receive (RX) path within the Linux kernel's network device driver, immediately following the network interface card (NIC)'s direct memory access (DMA) transfer of packet data into kernel memory buffers from the RX descriptor ring, but prior to any socket buffer (skb) allocation or engagement with the broader network stack. This placement minimizes latency by allowing programmable processing before traditional kernel overheads. In cases where a driver lacks native support, XDP falls back to a generic mode (also known as SKB mode) that integrates into the kernel's NAPI processing after skb allocation, resulting in slightly higher overhead but ensuring compatibility.[21][22][23]

Upon DMA transfer, the raw packet data resides in a kernel buffer, where an eBPF program attached to the XDP hook executes directly on it, utilizing metadata from the xdp_md structure, such as packet length, ingress port, and RX queue ID, for contextual analysis. This flow enables rapid decision-making on packet disposition without propagating the frame through the full kernel network stack, thereby reducing CPU cycles and memory usage for high-throughput scenarios. XDP supports three execution modes: native mode, which embeds the hook directly in the driver for optimal performance on supported hardware; generic mode, a universal software fallback that integrates into the standard RX path with slightly higher overhead; and offload mode, where the eBPF program is transferred to the NIC for hardware-accelerated execution, bypassing the host CPU entirely.[21][24][4]

To enhance efficiency, XDP leverages the kernel's page pool API for memory management, allocating and recycling page-sized pools dedicated to XDP frames and associated skbs, which avoids frequent page allocations and reduces cache misses in high-rate environments. This approach supports multicast traffic handling, where replicated packets can be processed across relevant queues, and integrates with Receive Side Scaling (RSS) to distribute ingress load via hardware hashing to multiple RX queues for parallel eBPF execution. Traditionally limited to ingress processing on the RX path, XDP saw 2025 advancements enabling egress support through eBPF-based techniques that manipulate kernel packet direction heuristics, extending its applicability to outbound traffic without native TX hooks.[25][26][20]
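As an illustration of the xdp_md fields mentioned above, the following sketch (not taken from the cited sources; the interface index 2 is an arbitrary example) reads the ingress interface, the RX queue, and the packet length implied by the data pointers before issuing a verdict.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int inspect_metadata(struct xdp_md *ctx)
{
    __u32 ifindex = ctx->ingress_ifindex;        /* receiving net_device index */
    __u32 queue   = ctx->rx_queue_index;         /* hardware RX queue the frame came from */
    __u32 len     = ctx->data_end - ctx->data;   /* packet length implied by the two pointers */

    if (ifindex != 2)                            /* example policy: only accept frames from ifindex 2 */
        return XDP_DROP;

    bpf_printk("frame on queue %u, %u bytes\n", queue, len);
    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";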
Actions
In eXpress Data Path (XDP), the possible decisions an XDP program can make on a received packet are expressed by returning one of the values of the enum xdp_action, which the kernel uses to execute the corresponding handling without further program involvement.[27] These actions enable efficient packet processing at the driver level, allowing for high-performance decisions such as dropping unwanted traffic or redirecting packets to alternative paths.[4]
XDP_DROP instructs the kernel to immediately discard the packet, freeing the underlying DMA buffer directly in the driver without allocating kernel data structures like sk_buff or passing the packet to the network stack. This action is particularly effective for early-stage filtering, such as mitigating DDoS attacks, as it minimizes resource consumption and latency compared to traditional stack-based dropping.[21]
XDP_PASS forwards the packet to the standard Linux kernel networking stack for further processing, such as routing, firewalling, or delivery to user space.[4] It allows the XDP program to inspect or minimally modify the packet before normal handling resumes, preserving compatibility with existing network functionality.
XDP_TX causes the kernel to transmit the packet back out through the same network interface it arrived on, often used for reflecting packets or simple redirects without changing the egress device.[21] This action reuses the original buffer for transmission, enabling low-overhead operations like packet mirroring or bouncing invalid ingress traffic.[4]
XDP_REDIRECT redirects the packet to a different network interface, CPU queue, or AF_XDP socket, typically invoked via the eBPF helper bpf_redirect() or map-based variants like bpf_redirect_map(). It supports advanced forwarding scenarios, such as load balancing across devices, by handing off the buffer to another driver or processing context.[21]
XDP_ABORTED, with a value of 0, signals an error or abort condition in the XDP program; the packet is dropped and the xdp_exception tracepoint is raised, which can be monitored for diagnostics.[27] The action denotes a program error rather than a normal verdict and is therefore not intended as a deliberate return value in production code.[4]
The kernel interprets the returned enum xdp_action value to perform the specified operation atomically after the program execution, ensuring minimal overhead in the data path. Statistics for these actions, including counts of drops, passes, transmissions, and redirects, are exposed by supported network drivers through the ethtool utility, allowing administrators to monitor XDP performance and efficacy.[28]
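The XDP_TX action can be illustrated with a small reflector program. The sketch below (a common teaching pattern, not taken from the cited sources) swaps the Ethernet source and destination addresses of IPv4 frames and bounces them back out the ingress NIC.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <linux/if_ether.h>

SEC("xdp")
int reflect_ipv4(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;
    unsigned char tmp[ETH_ALEN];

    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;                 /* only reflect IPv4 in this demonstration */

    __builtin_memcpy(tmp, eth->h_dest, ETH_ALEN);            /* swap MAC addresses */
    __builtin_memcpy(eth->h_dest, eth->h_source, ETH_ALEN);
    __builtin_memcpy(eth->h_source, tmp, ETH_ALEN);

    return XDP_TX;                       /* transmit back out the same NIC */
}

char LICENSE[] SEC("license") = "GPL";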
eBPF Integration
Program Development
eBPF programs for XDP are written in a restricted subset of the C programming language, leveraging kernel headers such as <linux/bpf.h> and <bpf/bpf_helpers.h> to access the necessary types and helper functions.[29] Developers define the main program function and annotate it with the SEC("xdp") macro to place it in the appropriate ELF section, ensuring it is recognized as an XDP program during loading.[30] The function signature typically takes a struct xdp_md *ctx parameter, providing access to packet metadata like ingress_ifindex for the incoming interface index.[4] Programs must return an enum xdp_action value, such as XDP_DROP to discard packets or XDP_PASS to continue processing.[29]
To compile the C source into an ELF object file containing eBPF bytecode, developers use LLVM/Clang with the BPF target architecture. The command clang -O2 -target bpf -c program.c -o program.o generates the object file, enabling features like bounded loops and helper function inlining supported by the LLVM BPF backend.[31] This process ensures the bytecode adheres to eBPF instruction constraints verified by the kernel.
Loading the program into the kernel utilizes the libbpf library, which provides the bpf_prog_load() function with BPF_PROG_TYPE_XDP as the program type.[32] Once loaded, the program file descriptor is attached to a network device using bpf_set_link_xdp_fd() on the netdevice or, in newer kernels, bpf_link_create() with BPF_LINK_TYPE_XDP.[4] Alternatively, the iproute2 suite offers a command-line interface for attachment: ip link set dev <interface> xdp obj program.o sec xdp, simplifying deployment without custom userspace code. For inspection and management, bpftool (shipped with the kernel source tree) allows querying loaded programs via bpftool prog show and attached XDP programs via bpftool net show.
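A minimal user-space loader along these lines might look as follows. This is a sketch assuming libbpf 0.8 or newer, the program.o object produced above, and the xdp_drop_non_ip program defined in the example that follows; the interface name eth0 is an arbitrary placeholder and error handling is reduced to the bare minimum.

#include <stdio.h>
#include <net/if.h>
#include <bpf/libbpf.h>
#include <linux/if_link.h>

int main(void)
{
    struct bpf_object *obj;
    struct bpf_program *prog;
    int ifindex, prog_fd, err;

    ifindex = if_nametoindex("eth0");            /* example interface name */
    if (!ifindex)
        return 1;

    obj = bpf_object__open_file("program.o", NULL);
    if (!obj)
        return 1;
    if (bpf_object__load(obj))                   /* triggers the in-kernel verifier */
        return 1;

    prog = bpf_object__find_program_by_name(obj, "xdp_drop_non_ip");
    if (!prog)
        return 1;
    prog_fd = bpf_program__fd(prog);

    /* XDP_FLAGS_DRV_MODE requests native driver mode. */
    err = bpf_xdp_attach(ifindex, prog_fd, XDP_FLAGS_DRV_MODE, NULL);
    if (err)
        fprintf(stderr, "attach failed: %d\n", err);
    return err ? 1 : 0;
}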
A representative example is a simple XDP program that drops packets with non-IPv4 Ethernet types:
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <linux/if_ether.h>

SEC("xdp")
int xdp_drop_non_ip(struct xdp_md *ctx) {
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    struct ethhdr *eth = data;

    /* Bounds check: the verifier rejects any access past data_end. */
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    /* Drop every frame whose EtherType is not IPv4. */
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_DROP;

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
The program accesses the packet boundaries through the ctx metadata, performs bounds checking to prevent verifier rejection, and selectively drops non-IP traffic.[33] Metadata like ctx->ingress_ifindex can be used for interface-specific logic, such as conditional actions based on the receiving device.[4]
Debugging XDP programs involves kernel-side tracing with bpf_trace_printk() for logging messages to the kernel trace buffer, viewable via the tracefs trace_pipe, though it is of limited use in production due to its performance overhead. For more scalable telemetry, developers populate userspace-accessible eBPF maps with counters or statistics, which can be read and aggregated from userspace applications.[34] The kernel verifier plays a role in accepting programs by statically analyzing the bytecode for safety, so development typically involves iterating through compilation and loading to resolve verification failures.[4]
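A sketch of the map-based telemetry approach is shown below; the map and function names are illustrative rather than taken from the cited sources. The XDP program increments one per-CPU counter slot per frame, which a user-space reader can look up and sum across CPUs.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} rx_packets SEC(".maps");

SEC("xdp")
int count_rx(struct xdp_md *ctx)
{
    __u32 key = 0;
    __u64 *value = bpf_map_lookup_elem(&rx_packets, &key);

    if (value)
        (*value)++;                       /* per-CPU slot, so no atomic operation is needed */

    bpf_printk("rx packet seen\n");       /* low-volume debugging only */
    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";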
Safety Mechanisms
The eBPF verifier serves as a critical in-kernel static analyzer for XDP programs, simulating their execution paths to ensure safety before loading. It performs exhaustive checks for potential issues such as unreachable instructions, out-of-bounds memory accesses relative to packet boundaries (e.g., ensuring offsets do not exceed the data_end pointer in XDP contexts), invalid use of helper functions, and violations of the kernel's security model. If any unsafe behavior is detected, the verifier rejects the program, preventing it from being loaded and executed, thereby avoiding kernel crashes or exploits. This verification process is mandatory for all eBPF program types, including XDP, and operates on the program's bytecode without adding runtime overhead during packet processing.[35][4]
To enforce bounded execution, the verifier prohibits unbounded loops in eBPF programs, a restriction that originated with early eBPF designs to guarantee termination; since Linux kernel 5.3, bounded loops are permitted, but only if the verifier can prove they will not exceed resource limits. A key safeguard is the fixed instruction limit, capped at 1 million verified instructions per program (raised from 4,096 in kernels prior to 5.2), which prevents excessive computation and potential denial-of-service scenarios. Additionally, map accesses, such as those to eBPF maps used for state in XDP filtering, are validated at load time, ensuring pointers remain within allocated bounds and avoiding arbitrary memory corruption. These mechanisms collectively ensure that XDP programs remain deterministic and resource-bounded, maintaining kernel stability even under high packet rates.[36][37]
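The following sketch (illustrative only, not from the cited sources) shows the kind of loop the verifier accepts on kernel 5.3 and later: the trip count is a compile-time constant and every packet access is compared against data_end before the load.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define SCAN_BYTES 16    /* compile-time bound keeps the loop verifiable */

SEC("xdp")
int bounded_scan(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    __u32 sum = 0;

    /* Constant trip count plus a per-iteration bounds check lets the
     * verifier prove termination and in-bounds access. */
    for (int i = 0; i < SCAN_BYTES; i++) {
        unsigned char *byte = (unsigned char *)data + i;
        if ((void *)(byte + 1) > data_end)
            break;
        sum += *byte;
    }

    return sum == 0 ? XDP_DROP : XDP_PASS;   /* arbitrary use of the result */
}

char LICENSE[] SEC("license") = "GPL";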
Following successful verification of the bytecode, the kernel may optionally apply just-in-time (JIT) compilation to translate it into native machine code for improved performance during execution. However, the verifier's safety checks occur solely on the portable bytecode, independent of the JIT process, ensuring that optimizations do not introduce vulnerabilities. For error handling, XDP programs must return one of the predefined actions (e.g., XDP_PASS to continue processing, XDP_DROP to discard the packet, or XDP_REDIRECT for forwarding), which the kernel interprets to dictate the packet's fate; when a program detects an unrecoverable error, it can return XDP_ABORTED, which drops the packet and triggers the xdp_exception tracepoint for logging, keeping failures observable without disrupting the rest of the traffic flow.[38][39][4]
In recent Linux kernel developments through 2025, the eBPF verifier has seen enhancements to support more complex operations, including improved precision for packet redirects (e.g., via XDP_REDIRECT with tail calls) and metadata handling in XDP programs, where additional packet metadata can be safely accessed without bound violations. These updates, such as proof-based refinement mechanisms, allow the verifier to handle intricate control flows more accurately while rejecting fewer valid programs, building on ongoing efforts to balance safety and expressiveness in high-performance networking scenarios.[40][41]
User-Space Access
AF_XDP Sockets
AF_XDP sockets, introduced in Linux kernel version 4.18, provide a specialized address family (AF_XDP, also referred to as PF_XDP) designed for high-performance, zero-copy input/output operations that enable direct packet transfer from kernel-space XDP programs to user-space applications, bypassing much of the traditional networking stack.[42] This raw socket type facilitates efficient packet processing by allowing XDP eBPF programs to redirect ingress traffic straight to user-space buffers, supporting applications that require low-latency, high-throughput networking. To create an AF_XDP socket, applications invoke the standard socket() syscall with the address family AF_XDP, socket type SOCK_RAW, and protocol 0: fd = socket(AF_XDP, SOCK_RAW, 0);. Following creation, the socket must be bound to a specific network interface and receive queue ID using the bind() syscall, specifying parameters such as the interface index and queue identifier, with socket options configured via setsockopt() for features like shared user memory (UMEM) registration. This binding associates the socket with a particular hardware receive queue, enabling targeted packet reception from XDP-processed traffic on that queue.
The core of AF_XDP's efficiency lies in its user memory (UMEM) model, where user-space allocates a contiguous memory region registered with the kernel via setsockopt() using the SOL_XDP level. This UMEM is divided into fixed-size frames, and communication between kernel and user-space occurs through four lock-free ring buffers: the RX ring for incoming packet descriptors from the kernel to user-space, the TX ring for outgoing descriptors from user-space to the kernel, the FILL ring for user-space to supply empty frames to the kernel, and the COMPLETION ring for the kernel to notify user-space of processed frames. Descriptors in these rings reference UMEM frame addresses and lengths, allowing shared access without data copying in optimal configurations.
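A syscall-level sketch of this setup is given below. It is illustrative rather than taken from the cited sources: the frame and ring sizes are arbitrary, and the mmap() of the rings and the final bind() are omitted for brevity.

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

#ifndef SOL_XDP
#define SOL_XDP 283              /* socket option level for AF_XDP */
#endif

#define NUM_FRAMES 4096
#define FRAME_SIZE 2048          /* one packet buffer per 2 KB chunk */

/* Sketch: create an AF_XDP socket, register a UMEM area with the kernel and
 * request sizes for the fill, completion, RX and TX rings. */
int setup_xsk(void)
{
    int fd = socket(AF_XDP, SOCK_RAW, 0);
    if (fd < 0)
        return -1;

    void *umem_area;
    size_t umem_len = (size_t)NUM_FRAMES * FRAME_SIZE;
    if (posix_memalign(&umem_area, getpagesize(), umem_len))
        return -1;

    struct xdp_umem_reg reg = {
        .addr       = (uintptr_t)umem_area,   /* start of the user-allocated UMEM */
        .len        = umem_len,
        .chunk_size = FRAME_SIZE,
        .headroom   = 0,
    };
    if (setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &reg, sizeof(reg)))
        return -1;

    int ring_size = 1024;        /* descriptors per ring */
    setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING, &ring_size, sizeof(ring_size));
    setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING, &ring_size, sizeof(ring_size));
    setsockopt(fd, SOL_XDP, XDP_RX_RING, &ring_size, sizeof(ring_size));
    setsockopt(fd, SOL_XDP, XDP_TX_RING, &ring_size, sizeof(ring_size));

    return fd;                   /* ready for mmap() of the rings and bind() */
}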
AF_XDP supports two operational modes for packet handling: copy mode, which relies on traditional sk_buff structures for data transfer and is compatible with all XDP-capable drivers, and zero-copy mode, which grants the driver direct DMA access to UMEM pages for ingress and egress, minimizing overhead but requiring explicit driver support and being requested through the XDP_ZEROCOPY bind flag.[43] Upon binding, the kernel attempts zero-copy if available; otherwise, it defaults to copy mode. Driver support for zero-copy has expanded in recent kernels, improving performance on supported hardware.[43]
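The bind step and the zero-copy/copy fallback can be sketched as follows (illustrative names, not from the cited sources; assumes the socket configured in the previous example and an interface eth0 with queue 0):

#include <net/if.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

/* Sketch: bind an already configured AF_XDP socket to queue 0 of eth0,
 * preferring zero-copy and falling back to copy mode when the driver
 * cannot provide direct UMEM access. */
int bind_xsk(int fd)
{
    struct sockaddr_xdp addr = {
        .sxdp_family   = AF_XDP,
        .sxdp_ifindex  = if_nametoindex("eth0"),
        .sxdp_queue_id = 0,
        .sxdp_flags    = XDP_ZEROCOPY,               /* request zero-copy first */
    };

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        return 0;                                     /* zero-copy granted */

    addr.sxdp_flags = XDP_COPY;                       /* fall back to the copied SKB path */
    return bind(fd, (struct sockaddr *)&addr, sizeof(addr));
}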
As of 2025, AF_XDP has seen integrations aimed at broader ecosystem compatibility, including patches in the DPDK framework that enable AF_XDP poll-mode drivers for seamless migration from kernel-bypass libraries, allowing DPDK applications to leverage AF_XDP sockets for raw packet I/O on supported NICs.[44] Additionally, experimental implementations in DNS servers such as NSD use AF_XDP to handle UDP queries directly in user space, demonstrating roughly a 1.7x improvement in query processing rate over traditional UDP handling in experimental tests, with minimal CPU overhead.[45][46]
Zero-Copy Features
AF_XDP enables zero-copy packet handling through a shared memory region known as UMEM, which consists of a contiguous block of user-allocated memory divided into fixed-size frames, typically 2 KB or 4 KB each, to hold packet data without intermediate copies between kernel and user space.[47] The kernel driver writes packet descriptors directly into ring buffers mapped to this UMEM, allowing the network interface card (NIC) to DMA packet data straight into the frames, while the user-space application accesses the data via these descriptors.[47] This structure supports multiple AF_XDP sockets sharing the same UMEM for efficient resource utilization in multi-queue setups.[47]

Ring buffer operations in zero-copy mode rely on four memory-mapped rings associated with the UMEM: the fill ring, where user space provides available frames for incoming packets; the RX ring, where the kernel enqueues receive descriptors pointing to filled frames; the TX ring, for user-submitted transmit descriptors; and the completion ring, where the kernel signals TX completions.[47] User space polls the head and tail pointers of these single-producer/single-consumer rings to synchronize access, minimizing system calls through techniques like busy-polling or eventfd notifications, while the kernel updates them atomically to reflect buffer states.[47] This design ensures seamless data flow without memcpy operations, as both kernel and user space operate on the shared memory.[43]

By eliminating the overhead of data copying between kernel and user space, zero-copy AF_XDP achieves significant performance improvements, such as line-rate receive processing for capture-style workloads on 40 Gbps NICs.[48] These gains stem from reduced CPU cycles spent on memory transfers and fewer context switches, enabling applications to handle high-throughput traffic more efficiently than traditional socket interfaces.[49]

To enable zero-copy mode, applications must set the XDP_ZEROCOPY flag during socket binding via the bind() system call, which requires compatible NIC drivers supporting direct UMEM access, such as Intel's i40e for 40 Gbps Ethernet; if unsupported, the operation falls back to copy mode using SKB buffers.[47][50] Driver support is typically provided in XDP_DRV mode for native zero-copy, in contrast with the generic XDP_SKB mode, which always copies data.[47]

In 2025, advancements addressed reliability issues, including a fix for race conditions in the generic RX path under shared UMEM scenarios (CVE-2025-37920), where improper locking could lead to data races across multiple sockets, now resolved by relocating the rx_lock to the buffer pool level in Linux kernel versions after 6.9.[18] Performance studies on mixed-mode deployments, combining zero-copy and copy-based sockets on programmable NICs, highlighted scalability benefits but noted potential bottlenecks from uneven buffer allocation, informing optimizations for hybrid environments.[51]
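A receive-loop sketch written against the xsk_ring_* helpers from libxdp's xsk.h is shown below. The helper names assume that library, and the ring and UMEM variables are presumed to have been set up beforehand with its socket/UMEM creation calls; it illustrates the peek/refill/release cycle described above and is not a complete application.

#include <xdp/xsk.h>      /* libxdp; older libbpf releases shipped this header as <bpf/xsk.h> */

/* Sketch: one iteration of a zero-copy receive loop. 'rx' and 'fill' are the
 * RX and fill rings, 'umem_area' is the user-allocated UMEM buffer. */
static void rx_once(struct xsk_ring_cons *rx, struct xsk_ring_prod *fill,
                    void *umem_area)
{
    __u32 idx_rx = 0, idx_fill = 0;

    /* How many descriptors has the kernel completed for us? */
    unsigned int rcvd = xsk_ring_cons__peek(rx, 64, &idx_rx);
    if (!rcvd)
        return;

    /* Reserve the same number of slots on the fill ring to return frames. */
    if (xsk_ring_prod__reserve(fill, rcvd, &idx_fill) != rcvd) {
        xsk_ring_cons__cancel(rx, rcvd);
        return;
    }

    for (unsigned int i = 0; i < rcvd; i++) {
        const struct xdp_desc *desc = xsk_ring_cons__rx_desc(rx, idx_rx + i);
        unsigned char *pkt = xsk_umem__get_data(umem_area, desc->addr);

        (void)pkt;        /* packet processing would happen here, in place */

        /* Hand the frame back to the kernel via the fill ring. */
        *xsk_ring_prod__fill_addr(fill, idx_fill + i) = desc->addr;
    }

    xsk_ring_prod__submit(fill, rcvd);
    xsk_ring_cons__release(rx, rcvd);
}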
Hardware Support
Offloading Modes
XDP supports hardware offloading through specific modes that enable execution of programs directly on the network interface card (NIC), bypassing the host CPU for packet processing. The primary mode is specified by the XDP_FLAGS_HW_MODE flag, which attaches the eBPF program for full offload to the NIC when both driver and hardware support this capability.[4] As a fallback when hardware offload is unavailable or unsupported, XDP_FLAGS_SKB_MODE is used, directing the program to run in software mode using the kernel's socket buffer (SKB) path. Additionally, launch-time offload for transmit (TX) metadata allows the NIC to schedule packets based on specified timestamps without host intervention, merged in Linux kernel 6.14 (2025).[52][53]

The offloading process involves compiling the eBPF program into a format compatible with the target hardware, such as P4 for programmable switches or NIC-specific bytecode, before loading it onto the device. This compilation ensures the program adheres to the hardware's instruction set limitations. The program is then loaded using the devlink interface, a kernel subsystem for managing device resources, which handles the transfer to the NIC firmware. The kernel verifier performs compatibility checks during loading to confirm that the program and hardware align, preventing mismatches that could lead to failures. Driver-specific hooks facilitate the attachment, ensuring seamless integration with the NIC's data path.[54][55]

Offloading provides significant benefits, including zero involvement from the host CPU after initial setup, enabling line-rate packet processing on SmartNICs even under high traffic loads. It supports core XDP actions such as DROP and TX entirely on the hardware, allowing packets to be filtered or forwarded without reaching the host stack, which is particularly useful for security and performance-critical applications.[56][57] However, hardware offload is constrained to a subset of eBPF features, excluding complex operations like advanced map manipulations or certain helper functions to match hardware capabilities. It also requires periodic NIC firmware updates to incorporate new offload support, limiting adoption to compatible devices.[58] Integration with Time-Sensitive Networking (TSN) has advanced, enabling XDP offload to support deterministic traffic scheduling in industrial and real-time environments.[59]
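A loader that walks down this mode hierarchy might be sketched as follows, assuming libbpf 0.8+ and a program file descriptor supplied by the caller; note that genuine hardware offload additionally requires the program to have been loaded for the target device, so the first attempt simply fails and falls through otherwise.

#include <bpf/libbpf.h>
#include <linux/if_link.h>

/* Sketch: try full hardware offload first, then native driver mode, then the
 * generic (SKB) fallback. Returns the mode flag that succeeded, or -1. */
int attach_best_effort(int ifindex, int prog_fd)
{
    const __u32 modes[] = { XDP_FLAGS_HW_MODE, XDP_FLAGS_DRV_MODE, XDP_FLAGS_SKB_MODE };

    for (int i = 0; i < 3; i++) {
        if (bpf_xdp_attach(ifindex, prog_fd, modes[i], NULL) == 0)
            return modes[i];          /* attached in this mode */
    }
    return -1;                        /* no mode accepted the program */
}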
Supported Devices
Express Data Path (XDP) hardware support is available on select network interface controllers (NICs) and platforms, enabling native or offloaded execution of XDP programs for high-performance packet processing. Intel Ethernet controllers provide full native XDP support through the ice driver for E810 series devices, supporting XDP and AF_XDP zero-copy operations on Linux kernels 4.14 and later.[60] The 700-series controllers, such as those based on the X710, achieve similar support via the i40e driver for native XDP on kernels 4.14 and later, with iavf handling virtual functions in SR-IOV configurations on kernels 5.10 and above.[61][62]

Netronome Agilio SmartNICs have offered early and stable XDP offload support since 2016, allowing eBPF/XDP programs to execute directly on the NIC hardware for packet filtering and processing tasks.[63] NVIDIA (formerly Mellanox) ConnectX-6 and later NICs support driver-level XDP execution, enabling high-throughput packet handling in native mode on Linux.[64] Hardware offload for XDP is not supported on BlueField-2 DPUs as of the latest available information (2023), with development ongoing.[64] Other vendors include Broadcom's Stingray family of SmartNICs, which support XDP offload by running full Linux distributions on the device, facilitating eBPF program deployment for network functions.[65] Marvell OCTEON DPUs, such as those in the TX and 10 series, provide XDP and eBPF acceleration in configurations like Asterfusion's Helium SmartNICs, targeting security and load balancing workloads.[66]

Software-based XDP support extends to virtualized environments via the virtio-net driver, available since Linux kernel 4.10 for both host and guest packet processing.[67] On Windows, basic XDP functionality is available through the Windows Driver Kit (WDK) via the open-source XDP-for-Windows project, which implements a high-performance packet I/O interface similar to Linux XDP. Hardware offload is supported on select Azure NICs, such as NVIDIA Mellanox adapters in virtualized setups, though it is primarily optimized for Linux guests with experimental Windows extensions.[7]

To query XDP support and status on Linux, administrators can use ethtool -l <interface> to view channel configurations relevant to XDP multi-queue operations and ethtool -S <interface> for statistics including XDP drop counts.[68] For offload flags and parameters, devlink dev param show <device> displays hardware offload capabilities, such as XDP mode settings on supported NICs.
