Kqueue
from Wikipedia

Kqueue is a scalable event notification interface introduced in FreeBSD 4.1 in July 2000,[1][2] also supported in NetBSD, OpenBSD, DragonFly BSD, and macOS. Kqueue was originally authored in 2000 by Jonathan Lemon,[1][2] then involved with the FreeBSD Core Team. Kqueue makes it possible for software like nginx to solve the c10k problem.[3][4] The term "kqueue" refers to its function as a "kernel event queue".[1][2]

Kqueue provides efficient input and output event pipelines between the kernel and userland. Thus, it is possible to modify event filters as well as receive pending events while using only a single system call to kevent(2) per main event loop iteration. This contrasts with older traditional polling system calls such as poll(2) and select(2) which are less efficient, especially when polling for events on numerous file descriptors.

Kqueue not only handles file descriptor events but is also used for various other notifications such as file modification monitoring, signals, asynchronous I/O events (AIO), child process state change monitoring, and timers which support nanosecond resolution. Furthermore, kqueue provides a way to use user-defined events in addition to the ones provided by the kernel.

Some other operating systems which traditionally only supported select(2) and poll(2) also currently provide more efficient polling alternatives, such as epoll on Linux and I/O completion ports on Windows and Solaris.

libkqueue is a user space implementation of kqueue(2), which translates calls to an operating system's native backend event mechanism.[5]

API


The function prototypes and types are found in <sys/event.h>.[6]

int kqueue(void);

Creates a new kernel event queue and returns a descriptor.

int kevent(int kq, const struct kevent* changelist, int nchanges, struct kevent* eventlist, int nevents, const struct timespec* timeout);

Used to register events with the queue, then wait for and return any pending events to the user. In contrast to epoll, kqueue uses the same function to register and wait for events, and multiple event sources may be registered and modified using a single call. The changelist array can be used to pass modifications (changing the type of events to wait for, registering new event sources, etc.) to the event queue, which are applied before waiting for events begins. nevents is the size of the user-supplied eventlist array that is used to receive events from the event queue.

EV_SET(kev, ident, filter, flags, fflags, data, udata);

A macro that is used for convenient initialization of a struct kevent object.

See also


OS-independent libraries with support for kqueue:

Kqueue equivalent for other platforms:

  • on Solaris, Windows and AIX: I/O completion ports. Note that completion ports notify when a requested operation has completed, whereas kqueue can also notify when a file descriptor is ready to perform an I/O operation.
  • on Linux:
    • epoll system call has similar but not identical semantics.
    • inotify is a Linux kernel subsystem that notices changes to the filesystem and reports those to applications.

References

from Grokipedia
kqueue is a scalable event notification interface in BSD-derived operating systems that enables applications to efficiently monitor kernel events across diverse sources, including I/O, signals, timers, asynchronous I/O completions, process state changes, and filesystem modifications. It uses a unified, filter-based mechanism to register interests and retrieve notifications without the performance limitations of earlier interfaces like select() or poll(). Introduced in FreeBSD 4.1 in 2000 by developer Jonathan Lemon, it provides a generic API centered on two primary system calls: kqueue(), which creates an event queue and returns a file descriptor, and kevent(), which atomically adds, deletes, or fetches events via structured change and event lists. This allows for level-triggered notifications and extensibility to new event types through kernel filters without altering the user-space interface.

Designed to address the scalability issues of traditional polling methods, which require scanning entire descriptor sets and incur high overhead with growing numbers of monitored objects, kqueue maintains event state in the kernel. This enables constant-time operations for inactive items and reduces system call frequency, thus supporting high-performance applications like web servers handling thousands of concurrent connections. Key innovations include support for multiple event filters (e.g., EVFILT_READ for readable data, EVFILT_VNODE for file changes, EVFILT_PROC for forks/exits), user-defined events, and detailed event data such as pending byte counts, making it suitable for both network and system monitoring tasks.

Originally implemented in FreeBSD, kqueue has been adopted across various BSD derivatives and related systems, including NetBSD, OpenBSD, DragonFly BSD, and macOS (based on Darwin), where it serves as a core component for asynchronous programming and event-driven architectures.

Introduction

Definition and Purpose

kqueue is a kernel-level facility in BSD-derived operating systems that provides a scalable mechanism for monitoring multiple sources for various events, including I/O readiness on file descriptors, signals, timers, and filesystem changes. It operates by allowing applications to register interest in these events through kernel-maintained data structures known as knotes, which aggregate notifications efficiently without the per-process limitations common in earlier systems. Introduced in FreeBSD 4.1, kqueue serves as a generic event delivery system, enabling the kernel to notify user-space processes of conditions based on customizable filters.

The primary purpose of kqueue is to address the scalability challenges faced by high-performance applications, particularly network servers handling thousands of connections, which older interfaces like select() and poll() could not manage efficiently due to their linear scanning of file descriptors. By the late 1990s and early 2000s, the growth of networked applications demanded a more robust solution for diverse event types without performance degradation, as traditional mechanisms imposed significant overhead in terms of CPU cycles and system call frequency. kqueue overcomes these limitations by supporting arbitrary event registrations and delivering notifications in a way that scales to large numbers of monitored objects, making it well suited to servers and other I/O-intensive workloads.

In its basic workflow, an application first creates an event queue in the kernel to serve as a notification channel. It then registers knotes for specific events on file descriptors or other sources, associating user-defined data with each for context. Finally, the application polls the queue using a single system call to retrieve any pending events, so it learns of changes without a separate invocation for each monitored item. This design minimizes kernel-user space transitions and supports high-throughput scenarios by enabling the kernel to buffer and coalesce events internally.

Key Features and Benefits

kqueue offers significant scalability by supporting the monitoring of thousands of file descriptors and events without the linear limitations inherent in older mechanisms like select() or poll(), allowing applications to handle large numbers of concurrent connections efficiently up to system-imposed limits such as the per-user RLIMIT_KQUEUES resource limit. This design enables robust performance in environments with high event volumes, where traditional polling would incur prohibitive CPU and memory overhead.

A core strength of kqueue lies in its broad support for diverse event types, extending far beyond basic I/O readiness to include file status changes via the EVFILT_VNODE filter, signal notifications with EVFILT_SIGNAL, process lifecycle events through EVFILT_PROC, asynchronous I/O completion using EVFILT_AIO, and even user-defined events via the EVFILT_USER filter. This versatility allows developers to unify monitoring of multiple system conditions within a single interface, simplifying application logic for complex, event-driven architectures.

kqueue achieves high efficiency through O(1) time complexity for event registration, deletion, and notification operations, which minimizes CPU cycles compared to the repeated scanning required by polling-based systems. It further provides fine-grained control via level-triggered notifications by default, edge-triggered behavior using the EV_CLEAR flag to reset the event after retrieval, and one-time notifications via EV_ONESHOT, which removes the event after delivery. These attributes collectively reduce overhead and enhance responsiveness in non-blocking I/O scenarios.

The benefits of kqueue are particularly pronounced in high-concurrency applications such as web servers, where it improves throughput by efficiently dispatching events without blocking threads, as demonstrated in benchmarks showing superior performance over select() and poll() under load. This has contributed to its adoption in systems like macOS for scalable networking tasks.

History and Development

Origins in FreeBSD

kqueue was first implemented in the FreeBSD operating system as part of version 4.1, which was released on July 27, 2000. The facility was primarily developed by Jonathan Lemon, a contributor to the FreeBSD project, who authored the initial system calls and documentation. Lemon's work focused on enhancing the kernel's ability to handle event notifications efficiently, particularly for applications requiring high scalability in network and I/O operations.

The development of kqueue was motivated by the inherent limitations of earlier event notification mechanisms like select() and poll(), which scaled poorly as the number of file descriptors increased because of their O(N) complexity and their need to traverse descriptor lists repeatedly. These systems also involved redundant copies between user and kernel space, making them inefficient for modern network applications that needed to monitor thousands of connections simultaneously. Additionally, select() and poll() were restricted primarily to file descriptor-based I/O events and lacked native support for other sources such as signals, asynchronous I/O completions, or filesystem changes, prompting the need for a more versatile solution within the kernel.

Key design goals for kqueue included providing a unified, scalable interface capable of handling diverse event sources without the performance bottlenecks of prior methods, while ensuring simplicity for adoption and reliability through level-triggered notifications that avoided silent failures. This approach was inspired by earlier research on event notification, such as the "get next event" API proposed by Banga, Mogul, and Druschel, but was specifically optimized for integration into the BSD kernel architecture.

In its early releases, kqueue was integrated directly into the FreeBSD kernel via the kqueue() and kevent() system calls, with initial support centered on basic I/O filters such as EVFILT_READ and EVFILT_WRITE for monitoring sockets and files. This foundational implementation allowed applications to register interest in events efficiently and retrieve them in batches, marking a significant advancement in FreeBSD's support for scalable event-driven programming.

Adoption Across Unix-like Systems

Following its initial introduction in FreeBSD 4.1, kqueue saw continued enhancements in subsequent FreeBSD releases, including the addition of timer support via the EVFILT_TIMER filter in the 5.x series around 2003–2004, enabling periodic or one-shot notifications for time-based events. These improvements built on the core scalability features, allowing broader application in high-performance scenarios without altering the fundamental event queue mechanism.

kqueue was adopted into Apple's Darwin kernel, forming the basis for event notification in macOS and iOS; it has been available since Mac OS X 10.3 (Panther), released in 2003, providing scalable I/O handling that underpinned later frameworks like Grand Central Dispatch, introduced in Mac OS X 10.6 Snow Leopard. This integration has persisted, with kqueue remaining central to modern macOS and iOS for efficient monitoring of file, socket, and other events across Apple's ecosystem.

Other BSD variants incorporated kqueue shortly after its FreeBSD origins, with OpenBSD adding support in version 2.9 released in June 2001, evolving to version 3.4 in 2003 with minor variations such as adjusted filter flags for vnode events to align with its security-focused kernel policies. NetBSD integrated it in version 2.0 released in December 2004, supporting events for sockets, files, processes, and signals through a port of the original implementation. DragonFly BSD, which forked from FreeBSD 4.8-STABLE in 2003, has included kqueue since its first release in July 2004.

Linux lacks native kqueue support, relying instead on user-space ports in libraries such as libkqueue for compatibility, though kqueue's design influenced the development of epoll in the Linux kernel starting with version 2.5 in 2002, sharing goals of efficient, scalable event multiplexing for file descriptors. As of 2025, kqueue remains stable across supporting systems, with ongoing kernel refinements for security, such as mitigations for race conditions in event handling addressed in FreeBSD 14.0 released in November 2023 and subsequent advisories.

Architecture

Event Queues and Knotes

In kqueue, the event queue serves as a per-process kernel object that functions as a notification channel for monitoring various system events. It is created through the kqueue system call, which returns a file descriptor representing the queue; this descriptor is not inherited by child processes upon forking. The queue maintains registered events and delivers notifications when those events occur, enabling scalable event handling without the limitations of earlier mechanisms like select or poll.

At the core of the event queue are knotes, which are internal kernel data structures that encapsulate individual event registrations. Each knote links the monitored object (such as a file descriptor or signal), the associated filter for evaluating activity, the parent kqueue, and connections to related knotes. Knotes are dynamically allocated and stored within the queue's substructures, including an active list for ready events, a hash table for non-descriptor identifiers, and an array indexed by descriptors. This design allows knotes to represent a wide range of event sources efficiently.

The registration process for events involves applications submitting a changelist of kevent structures via the kevent system call to add, change, or delete knotes. Upon an EV_ADD flag, the kernel's register function allocates a new knote using the (ident, filter) pair as a key and invokes the filter's attach routine to associate it with the target object. Changes or deletions similarly update or detach existing knotes, ensuring the queue reflects the application's current interests. Various filters, such as those for file descriptors or signals, define the specific conditions monitored by each knote.

When an event triggers activity on a monitored object, the kernel's notification flow begins with a call to the knote function, which evaluates the relevant filters on the object's knote list. Qualifying knotes are marked active and moved to the queue's active list. The kqueue_scan function then dequeues these knotes, revalidates them against their filters, and copies the event details to the application's event list for retrieval via kevent. This process ensures timely and accurate delivery of events while minimizing unnecessary wakeups.

Event queues and knotes incorporate limits and management mechanisms to prevent resource exhaustion. Per-user limits on the number of kqueues are enforced via the RLIMIT_KQUEUES resource limit, while system-wide constraints apply, such as the kern.kq_calloutmax sysctl for the maximum number of timers. The queue's internal structures, including the active list, hash table, and descriptor array, expand dynamically without fixed upper bounds, though overall file descriptor limits (kern.maxfiles) indirectly cap usage. Knotes are automatically freed upon queue closure or object detachment, facilitating efficient cleanup. For custom event handling, such as user-defined notifications, the EVFILT_USER filter allows applications to inject events directly into the queue, aiding in scenarios like inter-thread signaling.

Filters and Event Types

kqueue supports a variety of filters to monitor different types of events across file descriptors, processes, signals, timers, and user-defined conditions, each selected by the filter field in a struct kevent. These filters allow applications to register interest in specific occurrences, such as data availability or changes, using shared parameters like fflags for filter-specific flags, data for numeric values (e.g., byte counts or statuses), and udata as an opaque user pointer for associating custom data.

The EVFILT_READ filter detects when input is available on a descriptor, such as sockets or pipes, triggering when data exceeds the low-water mark (customizable via NOTE_LOWAT in fflags) or upon EOF conditions like shutdowns, with data reporting the bytes available or, for regular files, the offset to EOF. The EVFILT_WRITE filter monitors output readiness, activating when buffer space is available for writing, with data indicating remaining space in buffers for sockets, pipes, or fifos, and setting EV_EOF if the reader disconnects. For asynchronous I/O, EVFILT_AIO watches completion of aio requests, using data to convey error status from aio_error(), tied to sigevent conditions without additional fflags.

File and directory changes are handled by EVFILT_VNODE, which supports fflags like NOTE_DELETE for deletions, NOTE_EXTEND for size increases, NOTE_RENAME for renames, and others for attributes, opens, closes, links, reads, writes, or revocations, returning triggered events in fflags upon notification. Process-related events use EVFILT_PROC, monitoring forks (NOTE_FORK), executions (NOTE_EXEC), exits (NOTE_EXIT with exit status in data), or child tracking (NOTE_CHILD with parent PID in data); NOTE_TRACK follows a process across fork() calls, and NOTE_TRACKERR is reported when a child could not be attached. The EVFILT_SIGNAL filter captures signal deliveries, counting occurrences in data and automatically applying edge-triggering semantics.

Timers are managed via EVFILT_TIMER, where data specifies the interval in milliseconds (or other units via fflags like NOTE_SECONDS or NOTE_NSECONDS), supporting periodic firing unless modified by EV_ONESHOT or absolute time (NOTE_ABSTIME). For user-defined events, EVFILT_USER enables posting custom notifications from user space using NOTE_TRIGGER in fflags. The lower 24 bits of fflags are available for application-specific flags, while control values such as NOTE_FFNOP or NOTE_FFOR determine how newly supplied flags are combined with the stored ones; no underlying kernel object is required.

Event notification operates in level-triggered mode by default, continuously reporting the current state (e.g., data remains available until read), but can switch to edge-triggered mode via the EV_CLEAR flag, which deasserts the event after one retrieval to notify only on state transitions. This distinction is particularly useful for filters like signals or timers, where EV_CLEAR prevents repeated notifications without changes. Edge cases include zero-length reads on EVFILT_READ, which may still trigger if underlying data is present but not fully consumed, and automatic removal of knotes associated with closed descriptors to avoid dangling events.

API

Core Functions

The primary interface to kqueue consists of a small set of system calls that enable the creation, management, and monitoring of event queues in kernel space. The kqueue() system call is used to create a new kernel event queue, returning a small integer file descriptor representing the queue upon success; its prototype is int kqueue(void);, and it returns -1 on failure, with errno set accordingly. This descriptor serves as the handle for all subsequent operations on the queue, allowing applications to monitor various events such as file descriptor readiness or process state changes.

The core manipulation of events is handled by the kevent() system call, which multiplexes the tasks of adding, modifying, and deleting events on the queue and retrieving pending events that are ready for processing. Its prototype is int kevent(int kq, const struct kevent *changelist, int nchanges, struct kevent *eventlist, int nevents, const struct timespec *timeout);, where kq is the queue descriptor, changelist and nchanges specify events to register or modify, eventlist and nevents receive returned events, and timeout controls the blocking duration (NULL for indefinite wait). The call returns the number of events placed in eventlist or -1 on error.

Central to these operations is the struct kevent structure, which encapsulates event details for both input changes and output events. It includes fields such as ident (an identifier like a file descriptor), filter (the type of event filter), flags (action indicators like EV_ADD for addition or EV_DELETE for removal), fflags (filter-specific flags), data (filter-specific data, such as bytes available), and udata (opaque user data). The EV_SET macro is typically used to populate this structure before passing it to kevent(). Event flags, such as those for adding or deleting knotes, are detailed in subsequent sections.

To destroy a kqueue, applications close the file descriptor using the standard close() system call, which deallocates the queue and removes all associated events referencing open file descriptors. Error handling in these functions follows POSIX conventions, with common errno values including EBADF for an invalid queue descriptor, EINTR for interruption by a signal, ENOMEM for insufficient kernel memory (in kqueue()), EMFILE for exceeding per-process file descriptor limits, and EINVAL for invalid arguments like negative list lengths (in kevent()). If kevent() encounters an error while processing an entry of the changelist and there is room in eventlist, it reports the failure by setting the EV_ERROR flag in a returned kevent structure, with the specific error code in the data field.

Event Operations and Flags

In the kqueue system, event operations are primarily managed through flags passed to the kevent() function, which allows for adding, deleting, modifying, and retrieving events associated with kernel notes (knotes). These flags control the lifecycle and behavior of events within the queue, enabling efficient monitoring of various I/O and system conditions. The core kevent() prototype, int kevent(int kq, const struct kevent *changelist, int nchanges, struct kevent *eventlist, int nevents, const struct timespec *timeout);, facilitates these operations by processing changes from the changelist and returning triggered events in the eventlist.

The add operation registers a new event using the EV_ADD flag, which appends the knote to the kqueue and implicitly enables it unless modified otherwise; combining EV_ADD with EV_ENABLE explicitly permits the event to be returned by kevent() when triggered. This mechanism supports initial setup for monitoring file descriptors, signals, or other filters without duplicating existing entries, as re-adding modifies the parameters of any matching knote.

Deletion removes events via the EV_DELETE flag, which detaches the knote from the kqueue, with automatic cleanup occurring on the last close of associated file descriptors; alternatively, EV_DISABLE suspends reporting of the event without removing it, allowing later reactivation. These operations ensure precise control over active monitoring, preventing unnecessary notifications during temporary pauses.

Modifying an existing knote involves reusing EV_ADD to update its parameters, often combined with flags like EV_ONESHOT for one-time notification, where the event is deleted after retrieval, or EV_CLEAR to reset its state post-retrieval, which is particularly useful for state-transition filters. Such changes enable dynamic adjustments to event sensitivity without full recreation.

Upon retrieval, flags in the returned struct kevent indicate event status: EV_EOF signals a filter-specific end-of-file condition, while EV_ERROR denotes an error with the errno value stored in the data field, allowing applications to handle failures gracefully. These retrieval indicators provide context for processing without additional system calls. Filter-specific flags further refine behavior; for instance, in the EVFILT_READ filter, NOTE_LOWAT sets a low-water mark threshold in the data field, triggering notifications only when sufficient data is available, overriding the default to optimize for buffered I/O scenarios.

Timeout handling in kevent() uses the timespec parameter to differentiate blocking from non-blocking polls: a non-NULL timespec specifies a maximum wait interval, with a zero-valued structure effecting an instantaneous poll, whereas NULL enables indefinite blocking until an event occurs. This flexibility supports both responsive and efficient long-running applications.

Usage and Implementations

Basic Programming Examples

To illustrate basic usage of kqueue for event notification, consider a simple C program that monitors standard input (file descriptor 0) for readability using the EVFILT_READ filter. This example creates a kqueue, registers the event, polls indefinitely for occurrences, and handles basic errors by checking return values from system calls. The following code snippet demonstrates this setup:

```c
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strerror() */
#include <unistd.h>

int main(void)
{
    int kq, nev;
    struct kevent event;
    struct kevent eventlist[1];
    int fd = STDIN_FILENO; /* File descriptor to monitor (stdin) */

    /* Create the kqueue */
    kq = kqueue();
    if (kq == -1) {
        perror("kqueue");
        exit(EXIT_FAILURE);
    }

    /* Set up the event for readability on the file descriptor */
    EV_SET(&event, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    if (kevent(kq, &event, 1, NULL, 0, NULL) == -1) {
        perror("kevent (add)");
        close(kq);
        exit(EXIT_FAILURE);
    }

    /* Poll for events with infinite timeout */
    while ((nev = kevent(kq, NULL, 0, eventlist, 1, NULL)) != -1) {
        if (nev > 0) {
            /* Process the event */
            if (eventlist[0].flags & EV_ERROR) {
                fprintf(stderr, "Error on event: %s\n",
                        strerror((int)eventlist[0].data));
                break;
            }
            if (eventlist[0].filter == EVFILT_READ) {
                printf("Data available for reading on fd %d\n",
                       (int)eventlist[0].ident);
                fflush(stdout); /* keep ordering with the write() below */

                /* In a real application, read the data here */
                char buf[1024];
                ssize_t n = read(fd, buf, sizeof(buf));
                if (n > 0) {
                    write(STDOUT_FILENO, buf, (size_t)n);
                } else if (n == 0) {
                    printf("EOF on fd %d\n", fd);
                    break;
                }
            }
        }
    }
    if (nev == -1) {
        perror("kevent (poll)");
    }

    /* Cleanup */
    close(kq);
    return EXIT_SUCCESS;
}
```

This program initializes a kqueue descriptor with kqueue(), which returns -1 on failure, prompting error reporting via perror() and program exit. It then registers a kevent structure using EV_SET() to monitor the specified file descriptor for read events, invoking kevent() with the changelist to add the event; a return value of -1 indicates failure, handled similarly with perror(). The main loop calls kevent() with no changelist and an eventlist of size 1, using a NULL timeout for blocking indefinitely until an event occurs or an error arises. Upon receiving events (nev > 0), it checks for EV_ERROR flags to report kernel errors via strerror() on the event's data field, and processes the read event by echoing input data or detecting EOF. Finally, the kqueue descriptor is closed with close() to release resources. To compile this program, include the <sys/event.h> header for kqueue structures and functions; no special linking is typically required as these are standard system calls available in the C library. For instance, use cc example.c -o example on a supported system like FreeBSD.

Platform-Specific Support and Libraries

kqueue received full support in FreeBSD starting with version 4.1, released in 2000, where it was introduced as a scalable event notification mechanism. The implementation allows tuning via sysctls such as kern.kq_calloutmax, which sets the system-wide limit on the number of timers that can be registered across all kqueues. Additionally, the RLIMIT_KQUEUES resource limit controls the maximum number of kqueues per user, helping manage resource usage in multi-process environments. In FreeBSD 14.0 (released November 2023), timerfd(2) was added for Linux compatibility, with kqueue's EVFILT_TIMER recommended for native use, alongside enhanced process visibility controls that support secure kqueue monitoring in jailed environments.

On macOS, kqueue has been available since Mac OS X version 10.3 (Panther) in 2003, providing compatibility with BSD-derived systems. It integrates deeply with libdispatch, Apple's Grand Central Dispatch framework, which uses kqueue under the hood to handle asynchronous event sources like file descriptors and timers, enabling higher-level concurrent programming without direct kevent calls.

NetBSD adopted kqueue with version 2.0 in 2004, offering an API similar to FreeBSD's for monitoring events on files, processes, and signals. OpenBSD included support starting from version 2.9 in 2001, maintaining API compatibility but with security-focused restrictions on certain filters, such as limited vnode monitoring (available only on UFS file systems). In OpenBSD, EVFILT_USER for user-defined events was added in 2025.

User-space libraries abstract kqueue for cross-platform development on BSD and macOS systems. libevent provides a wrapper that selects kqueue as the backend on supported platforms, offering a unified API for event handling across operating systems. Similarly, libev uses kqueue when available on BSD derivatives, prioritizing it for efficient polling in event-driven applications. Node.js, through its libuv library, employs kqueue as the backend for its event loop on macOS and BSD, facilitating JavaScript-based server-side event loops. Portability challenges arise on non-supporting systems like Linux, where libraries such as libevent and libev fall back to epoll for equivalent functionality. As of 2025, no major deprecations of kqueue have occurred across supported platforms.

Comparisons and Alternatives

With Traditional Mechanisms (select/poll)

The select() system call, a traditional mechanism for I/O multiplexing in Unix-like systems, suffers from significant limitations that hinder its scalability. It relies on fixed-size bitmasks (fd_set) to track file descriptors, with the maximum number typically capped at FD_SETSIZE, often 1024, beyond which monitoring fails without recompiling with a larger value. Additionally, select() exhibits O(n) time complexity, where n is the number of file descriptors, as the kernel scans the entire set on each invocation to check for readiness, leading to inefficiencies with growing descriptor counts.

Introduced to address some of select()'s shortcomings, the poll() system call uses a dynamic array of pollfd structures, eliminating the fixed FD_SETSIZE limit and allowing monitoring of arbitrarily many descriptors. However, poll() retains the O(n) scanning overhead: the full list of descriptors must be resubmitted with every call, which involves repeated user-kernel memory copies and kernel-side traversal, making it similarly unscalable for high-descriptor workloads.

In contrast, kqueue overcomes these issues through its stateful design. Registrations (tracked in the kernel as knotes) persist across calls without resubmission, so retrieving events costs work proportional to the number of ready descriptors rather than the number registered. This avoids the O(n) scanning and memory copy penalties of select() and poll(), imposes no artificial limit on descriptor count beyond system resources like memory, and supports efficient handling of thousands of concurrent events.

Performance evaluations underscore kqueue's advantages in server environments. In benchmarks using the thttpd web server with 500 requests per second and up to 10,000 idle connections, kqueue maintained low response times and CPU usage, while poll() failed to scale beyond approximately 600 idle connections due to excessive overhead. Similarly, in a web proxy cache test with 100 active connections and up to 4,000 cold connections, kqueue exhibited constant CPU utilization, whereas select() saturated around 2,000 connections. These results demonstrate kqueue's ability to manage roughly 10 times more connections with substantially less CPU expenditure compared to traditional mechanisms.

For use cases, select() and poll() remain suitable for small-scale applications or legacy code handling fewer than a few hundred file descriptors, where their simplicity outweighs scalability concerns. Kqueue, however, is preferred for high-performance servers managing thousands of clients, such as web proxies or network daemons, to achieve efficient resource utilization and avoid bottlenecks.
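The per-call resubmission and linear scan attributed to poll() above can be sketched in a few lines of C. This is a minimal illustration rather than a benchmark: the helper names first_readable and demo_poll_scan, the pipe count of eight, and the one-second timeout are all assumptions made for the example.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns the index of the first readable entry,
 * or -1 on timeout or error. */
int first_readable(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    /* The whole array is handed to the kernel again on every call. */
    int ready = poll(fds, nfds, timeout_ms);
    if (ready <= 0)
        return -1;

    /* poll() returns only a count, so the caller walks every entry: O(n). */
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN)
            return (int)i;
    }
    return -1;
}

/* Hypothetical demo: eight pipes, only one of which has data pending.
 * Returns the index that first_readable() reports, or -1 on failure. */
int demo_poll_scan(void)
{
    enum { N = 8, ACTIVE = 5 };
    int pipes[N][2];
    struct pollfd fds[N];
    int created = 0;
    int idx = -1;

    for (; created < N; created++) {
        if (pipe(pipes[created]) < 0)
            break;
        fds[created].fd = pipes[created][0];   /* watch the read end */
        fds[created].events = POLLIN;
        fds[created].revents = 0;
    }

    /* Make exactly one descriptor readable, then poll all of them. */
    if (created == N && write(pipes[ACTIVE][1], "x", 1) == 1)
        idx = first_readable(fds, N, 1000);

    for (int i = 0; i < created; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    return idx;
}
```

Even though a single descriptor is ready, both the kernel and the caller touch all eight entries. With kqueue, the registrations would be made once and each kevent() call would hand back only the ready entry.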

With Linux epoll

kqueue and Linux's epoll both offer scalable mechanisms for monitoring file descriptors and delivering I/O event notifications, addressing the limitations of earlier interfaces like select and poll by avoiding O(n) scanning of descriptor sets. Both support level-triggered (default) and edge-triggered modes, where level-triggering notifies as long as the condition persists and edge-triggering signals only on state changes, enabling efficient handling of high-volume connections without redundant wakeups. epoll was introduced in Linux kernel 2.5.44 in 2002, building on concepts similar to kqueue, which debuted in FreeBSD 4.1 in 2000.

Despite these parallels, the APIs diverge significantly, posing challenges for cross-platform code portability. epoll employs a split model with epoll_ctl() for adding, modifying, or deleting interests on file descriptors and epoll_wait() for retrieving events, requiring multiple system calls for common operations like batch updates. In contrast, kqueue unifies these actions in the kevent() call, allowing simultaneous registration, modification, deletion, and polling of events in a single invocation, which reduces overhead for dynamic workloads. Additionally, epoll is confined to file descriptor-based I/O events and lacks native support for timers or signals, necessitating auxiliary mechanisms like timerfd or signalfd for such notifications. kqueue, however, provides dedicated filters (EVFILT_TIMER for timers and EVFILT_SIGNAL for signals), enabling a more comprehensive event model without extra file descriptors.

Event mappings exist between the two, such as kqueue's EVFILT_READ approximating epoll's EPOLLIN for readability and EVFILT_WRITE approximating EPOLLOUT for writability, but nuances arise in user data handling. kqueue associates udata, a user-defined pointer or value, with each knote (event filter instance), offering finer-grained flexibility for per-event customization across diverse filter types. epoll's epoll_data_t, while similarly user-definable, is tied to the file descriptor in the interest set, limiting its granularity to I/O contexts.

Performance benchmarks show the two interfaces as largely comparable for core I/O tasks, with throughput scaling similarly under high connection loads on their respective platforms. However, kqueue demonstrates advantages in scenarios involving non-I/O events like timers or signals, owing to its unified interface that minimizes system call volume. Cross-platform event libraries mitigate portability issues by abstracting both backends, selecting epoll on Linux and kqueue on BSD/macOS systems to provide a consistent interface.

On Linux, io_uring represents a more recent advancement introduced in kernel 5.1 (2019), offering a completion-based interface that further reduces system calls through submission and completion queues, supporting multishot operations for efficient event batching as of 2025. While io_uring builds on epoll concepts for scalability, it provides broader support for file, network, and device I/O without traditional polling limitations, positioning it as an alternative to both epoll and kqueue in high-performance applications. BSD and macOS have no direct equivalent of io_uring, but kqueue continues to be optimized for similar use cases.

In terms of adoption, epoll dominates on Linux-based servers due to its integration in the kernel and widespread use in high-performance networking stacks. kqueue prevails in BSD derivatives (FreeBSD, NetBSD, OpenBSD, DragonFly BSD) and Apple's macOS ecosystem, powering event-driven applications in those environments. As of 2025, no notable efforts toward convergence or cross-implementation have emerged, leaving developers reliant on abstraction layers for multi-platform support.
