Fair queuing
Fair queuing is a family of scheduling algorithms used in some process and network schedulers. The algorithm is designed to achieve fairness when a limited resource is shared, for example to prevent flows with large packets, or processes that generate small jobs, from consuming more throughput or CPU time than other flows or processes.
Fair queuing is implemented in some advanced network switches and routers.
History
The term fair queuing was coined by John Nagle in 1985 while proposing round-robin scheduling in the gateway between a local area network and the internet to reduce network disruption from badly-behaving hosts.[1][2][3]
A byte-weighted version was proposed by Alan Demers, Srinivasan Keshav, and Scott Shenker in 1989, based on the earlier Nagle fair queuing algorithm.[4][5] The byte-weighted fair queuing algorithm aims to mimic bit-by-bit multiplexing by computing a theoretical departure date for each packet.
The concept has been further developed into weighted fair queuing, and the more general concept of traffic shaping, where queuing priorities are dynamically controlled to achieve desired flow quality of service goals or accelerate some flows.
Principle
Fair queuing uses one queue per packet flow and services them in rotation, such that each flow can "obtain an equal fraction of the resources".[1][2]
The advantage over conventional first in first out (FIFO) or priority queuing is that a high-data-rate flow, consisting of large packets or many data packets, cannot take more than its fair share of the link capacity.
Fair queuing is used in routers, switches, and statistical multiplexers that forward packets from a buffer. The buffer works as a queuing system, where the data packets are stored temporarily until they are transmitted.
With a link data-rate of R, at any given time the N active data flows (the ones with non-empty queues) are serviced each with an average data rate of R/N. In a short time interval the data rate may fluctuate around this value since the packets are delivered sequentially in turn.
Fairness
In the context of network scheduling, fairness has multiple definitions. Nagle's article uses round-robin scheduling of packets,[2] which is fair in terms of the number of packets, but not in terms of bandwidth use when packets vary in size. Several formal notions of fairness have been defined, including max-min fairness, worst-case fairness,[6] and the fairness index.[7]
Generalisation to weighted sharing
The initial idea gives each flow the same rate. A natural extension lets the user specify the portion of bandwidth allocated to each flow, leading to weighted fair queuing and generalized processor sharing.
A byte-weighted fair queuing algorithm
This algorithm attempts to emulate the fairness of bitwise round-robin sharing of link resources among competing flows. Packet-based flows, however, must be transmitted packetwise and in sequence. The byte-weighted fair queuing algorithm selects transmission order for the packets by modeling the finish time for each packet as if they could be transmitted bitwise round robin. The packet with the earliest finish time according to this modeling is the next selected for transmission.
The complexity of the algorithm is O(log(n)), where n is the number of queues/flows.
Algorithm details
Modeling actual finish times, while feasible, is computationally intensive. The model needs to be substantially recomputed every time a packet is selected for transmission and every time a new packet arrives into any queue.
To reduce computational load, the concept of virtual time is introduced. Finish time for each packet is computed on this alternate monotonically increasing virtual timescale. While virtual time does not accurately model the time packets complete their transmissions, it does accurately model the order in which the transmissions must occur to meet the objectives of the full-featured model. Using virtual time, it is unnecessary to recompute the finish time for previously queued packets. Although the finish time, in absolute terms, for existing packets is potentially affected by new arrivals, finish time on the virtual timeline is unchanged: the virtual timeline warps with respect to real time to accommodate any new transmission.
The virtual finish time for a newly queued packet is the virtual start time plus the packet's size. The virtual start time is the maximum of the previous virtual finish time of the same queue and the current instant.
Once the virtual finish times of all candidate packets (i.e., the packets at the head of all non-empty flow queues) are computed, fair queuing selects the minimum; the packet with the minimum virtual finish time is transmitted.
Pseudocode
Shared variables
    const N             // number of queues
    queues[1..N]        // per-flow queues
    lastVirFinish[1..N] // last virtual finish instant per queue

receive(packet)
    queueNum := chooseQueue(packet)
    queues[queueNum].enqueue(packet)
    updateTime(packet, queueNum)

updateTime(packet, queueNum)
    // virStart is the virtual start of service
    virStart := max(now(), lastVirFinish[queueNum])
    packet.virFinish := packet.size + virStart
    lastVirFinish[queueNum] := packet.virFinish

send()
    queueNum := selectQueue()
    packet := queues[queueNum].dequeue()
    return packet

selectQueue()
    it := 1
    minVirFinish := ∞
    while it ≤ N do
        queue := queues[it]
        if not queue.empty and queue.head.virFinish < minVirFinish then
            minVirFinish := queue.head.virFinish
            queueNum := it
        it := it + 1
    return queueNum
The function receive() is executed each time a packet is received, and send() is executed each time a packet to send must be selected, i.e., when the link is idle and the queues are not empty. This pseudocode assumes there is a function now() that returns the current virtual time, and a function chooseQueue() that selects the queue where the packet is enqueued.
The function selectQueue() selects the queue with the minimal virtual finish time. For the sake of readability, the pseudocode presented here performs a linear search. Maintaining a sorted data structure instead allows the selection in logarithmic time, leading to O(log(n)) complexity, at the cost of more complex code (see the sketch below).
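Below is a minimal runnable sketch of this pseudocode in Python, using a binary heap for the selection step just described. Names mirror the pseudocode; passing the virtual clock value now in from the caller, and the FIFO tie-break counter, are implementation choices the pseudocode leaves open.

    import heapq
    import itertools

    class FairQueue:
        """Byte-weighted fair queuing: transmit in order of virtual finish time."""
        def __init__(self, n):
            self.last_vir_finish = [0] * n      # lastVirFinish[1..N]
            self.heap = []                      # (virFinish, tiebreak, queueNum, size)
            self.tiebreak = itertools.count()   # preserves FIFO order on equal tags

        def receive(self, packet_size, queue_num, now):
            # updateTime: virStart is the virtual start of service
            vir_start = max(now, self.last_vir_finish[queue_num])
            vir_finish = vir_start + packet_size
            self.last_vir_finish[queue_num] = vir_finish
            heapq.heappush(self.heap,
                           (vir_finish, next(self.tiebreak), queue_num, packet_size))

        def send(self):
            # selectQueue + dequeue in O(log n) instead of the linear search above
            if not self.heap:
                return None
            vir_finish, _, queue_num, size = heapq.heappop(self.heap)
            return queue_num, size, vir_finish

    fq = FairQueue(2)
    fq.receive(1000, 0, now=0)   # flow 0 queues two large packets
    fq.receive(1000, 0, now=0)
    fq.receive(100, 1, now=0)    # flow 1 queues one small packet
    while (pkt := fq.send()) is not None:
        print(pkt)  # (1, 100, 100), then (0, 1000, 1000), (0, 1000, 2000)

The small packet is served first because its virtual finish time is earliest, illustrating how a low-rate flow is not stuck behind a burst from another flow.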
References
[edit]- ^ a b John Nagle: "On packet switches with infinite storage," RFC 970, IETF, December 1985.
- ^ a b c Nagle, J. B. (1987). "On Packet Switches with Infinite Storage". IEEE Transactions on Communications. 35 (4): 435–438. CiteSeerX 10.1.1.649.5380. doi:10.1109/TCOM.1987.1096782.
- ^ Phillip Gross (January 1986), Proceedings of the 16-17 January 1986 DARPA Gateway Algorithms and Data Structures Task Force (PDF), IETF, pp. 5, 98, retrieved 2015-03-04,
Nagle presented his "fair queuing" scheme, in which gateways maintain separate queues for each sending host. In this way, hosts with pathological implementations can not usurp more than their fair share of the gateway's resources. This invoked spirited and interested discussion.
- ^ Demers, Alan; Keshav, Srinivasan; Shenker, Scott (1989). "Analysis and simulation of a fair queueing algorithm". ACM SIGCOMM Computer Communication Review. 19 (4): 1–12. doi:10.1145/75247.75248.
- ^ Demers, Alan; Keshav, Srinivasan; Shenker, Scott (1990). "Analysis and Simulation of a Fair Queueing Algorithm" (PDF). Internetworking: Research and Experience. 1: 3–26.
- ^ Bennett, J. C. R.; Hui Zhang (1996). "WF/sup 2/Q: Worst-case fair weighted fair queueing". Proceedings of IEEE INFOCOM '96. Conference on Computer Communications. Vol. 1. p. 120. doi:10.1109/INFCOM.1996.497885. ISBN 978-0-8186-7293-4. S2CID 17558577.
- ^ Ito, Y.; Tasaka, S.; Ishibashi, Y. (2002). "Variably weighted round robin queueing for core IP routers". Conference Proceedings of the IEEE International Performance, Computing, and Communications Conference (Cat. No.02CH37326). p. 159. doi:10.1109/IPCCC.2002.995147. ISBN 978-0-7803-7371-6. S2CID 60787008.
Fair queuing
Overview
Definition and Motivation
Fair queuing refers to a class of scheduling algorithms employed in network devices, such as routers, to allocate bandwidth equitably among competing packet flows. In packet-switched networks, a flow consists of a sequence of packets sharing the same source-destination pair, and the link capacity represents the total transmission rate available on the outgoing interface. These algorithms approximate an idealized bit-by-bit round-robin discipline, where bandwidth is divided equally by servicing tiny portions of each flow in a cyclic manner, ensuring that no single flow monopolizes the resource.[6][7]
The primary motivation for fair queuing arises from the limitations of traditional first-in-first-out (FIFO) queuing, which uses a single queue for all incoming packets and services them in arrival order. FIFO suffers from head-of-line blocking, where a packet at the front of the queue delays all subsequent packets regardless of their flow, and it permits starvation of low-rate flows by allowing high-rate or large-packet flows to dominate the link. This lack of flow isolation enables ill-behaved sources to capture an arbitrarily large share of bandwidth, adversely affecting well-behaved traffic and exacerbating congestion.[6][7][8]
A classic example illustrates these issues: consider a router with a single FIFO queue handling two flows, a bulk file transfer sending large packets at a high rate and an interactive telnet session sending small packets sporadically. The file transfer can occupy most of the link capacity, causing excessive delays for telnet packets queued behind it, potentially rendering the interactive session unresponsive. Fair queuing mitigates this by maintaining separate queues per flow and servicing them proportionally, allowing the telnet flow to receive its fair share of bandwidth without interference from the bulk transfer.[6][7]
Basic Operation
Fair queuing operates by maintaining a separate queue for each distinct flow, typically identified by source-destination address pairs or additional port numbers, allowing packets from different conversations to be isolated and managed independently. This structure contrasts with traditional first-in-first-out (FIFO) queuing, where a single misbehaving flow with large packets can dominate the link and starve others of bandwidth.[2]
The service process assigns a virtual finish time to each arriving packet, computed as the maximum of its virtual start time (the time when the flow is eligible to send) and the previous packet's finish time in the flow, plus the packet's transmission time scaled by the flow's fair share (R/N for N equal-share flows). Packets are then served from the head of their flow's queue in order of increasing virtual finish times, emulating bit-by-bit round-robin service across flows. This ensures that each flow receives a proportional share of bandwidth regardless of packet sizes or arrival patterns.[2][7]
For the case of equal packet sizes, this process approximates round-robin scheduling, where each active flow receives approximately an equal share of the available bandwidth, specifically R/N, where R is the link transmission rate and N is the number of active flows. This allocation promotes fairness by ensuring that no flow is unduly penalized or favored, leading to more predictable performance across diverse traffic types.
To illustrate, consider a link with two flows (A and B) sharing a 1 Mbps capacity, where packets are 1000 bits each. Flow A arrives with packets at times 0, 1, and 3 ms, while Flow B arrives with packets at 0.5 and 2 ms. The scheduler serves alternately (a sketch reproducing this schedule follows the list):
- At t=0: Serve A's first packet (0-1 ms).
- At t=1: Serve B's first packet (1-2 ms).
- At t=2: Serve A's second packet (2-3 ms).
- At t=3: Serve B's second packet (3-4 ms), then A's third (4-5 ms).
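The schedule above can be reproduced with a simplified finish-tag calculation: each packet is tagged max(arrival time, previous tag of its flow) + transmission time, and packets are served in increasing tag order. Equating the tag clock with real arrival time is an assumption that happens to hold here because the link never goes idle; the Python sketch below is illustrative, not the full algorithm.

    # Two flows on a 1 Mbps link; 1000-bit packets take 1 ms each to transmit.
    arrivals = [("A", 0.0), ("B", 0.5), ("A", 1.0), ("B", 2.0), ("A", 3.0)]  # (flow, ms)
    service_ms = 1.0

    last_finish = {"A": 0.0, "B": 0.0}
    tagged = []
    for i, (flow, t) in enumerate(arrivals):
        tag = max(t, last_finish[flow]) + service_ms   # finish tag for this packet
        last_finish[flow] = tag
        tagged.append((tag, i, flow))

    for tag, _, flow in sorted(tagged):                # serve in tag order
        print(flow, tag)   # A 1.0, B 1.5, A 2.0, B 3.0, A 4.0: the alternation above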
Historical Development
Origins in Networking
The concept of fair queuing originated in the mid-1980s as a response to severe congestion problems in early Internet networks, particularly during the ARPANET era. John Nagle, in his 1984 RFC 896, first documented the phenomenon of congestion collapse, where bursty traffic from multiple sources overwhelmed shared links, leading to excessive retransmissions and network throughput dropping to near zero as gateways connected networks of differing bandwidths. This instability was exacerbated by the lack of mechanisms to isolate traffic flows, causing synchronized behavior among TCP connections that amplified overload on gateways.[9]
Building on these observations, Nagle proposed an initial fair queuing approach in his 1985 RFC 970, advocating for round-robin scheduling at packet switches to achieve congestion avoidance. In this scheme, a single first-in-first-out (FIFO) queue per outgoing link would be replaced with multiple queues, one per source host, to ensure equitable bandwidth sharing and prevent any single host from monopolizing the link due to bursty transmissions. Switches would service these queues in a round-robin fashion, dequeuing one packet from each non-empty queue in turn, while using "Source Quench" messages to signal hosts when queues exceeded a small threshold (e.g., two packets), thereby maintaining low latency and avoiding infinite storage buildup under overload.[3]
This per-source queuing idea laid the groundwork for isolating TCP flows, reducing the risk of global synchronization where all connections simultaneously ramp up transmission rates and then back off, a problem Nagle anticipated in heterogeneous networks. By apportioning bandwidth fairly among active sources, the approach aimed to stabilize datagram networks against the collapse seen in ARPANET interconnections, such as those between high-speed local networks and slower wide-area links.[3][9]
Key Milestones and Publications
The foundational publication on fair queuing appeared in 1989 with the paper "Analysis and Simulation of a Fair Queuing Algorithm" by Alan Demers, Srinivasan Keshav, and Scott Shenker, presented at ACM SIGCOMM. This work introduced byte-weighted fair queuing as a mechanism to ensure packet-level fairness in packet-switched networks by approximating bit-by-bit round-robin service through the use of virtual finishing times. Building on John Nagle's earlier conceptual suggestion of round-robin queuing in RFC 970, the paper provided both analytical models and simulation results demonstrating improved bandwidth allocation and reduced variance in delays compared to first-come-first-served queuing.[10]
Demers, Keshav, and Shenker played key roles in formalizing the core principles: the trio developed the virtual time mechanism to emulate idealized bit-by-bit round-robin scheduling without requiring fluid packet transmission, addressing practical implementation challenges in gateways. Their contributions shifted focus from coarse-grained congestion control to fine-grained resource sharing, influencing subsequent theoretical and practical developments.[10]
A significant extension came in 1993 with Abhay K. Parekh and Robert G. Gallager's paper on "A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks", which introduced weighted fair queuing (WFQ) to support proportional bandwidth allocation among flows based on assigned weights, laying groundwork for differentiated services.[4]
In the 1990s, fair queuing principles were integrated into broader Quality of Service (QoS) architectures, notably the Integrated Services (IntServ) framework outlined in RFC 1633, which used per-flow scheduling like weighted fair queuing to guarantee resource reservations for diverse traffic types. A practical approximation, Stochastic Fair Queuing (SFQ), was proposed by Paul E. McKenney in 1990 to reduce per-flow state overhead by hashing flows into a fixed number of queues, achieving approximate fairness with lower computational cost suitable for high-speed routers.[11]
These milestones marked a transition from theoretical constructs to deployable mechanisms, enabling fair queuing implementations in ATM networks for services like Available Bit Rate (ABR) and in IP routers for traffic management, thereby enhancing network stability and equity in shared infrastructures.[12]
Theoretical Principles
Fairness Metrics
Fair queuing disciplines aim to allocate bandwidth equitably among competing flows, with fairness serving as a core performance criterion. Max-min fairness is defined as an allocation where the minimum rate received by any flow is maximized, ensuring that no flow can increase its rate without decreasing the rate of another flow that currently receives a lower rate. This concept ensures protection for low-bandwidth flows against domination by high-bandwidth ones, as discussed in early network resource allocation studies such as Bertsekas and Gallager's work on data networks.[13] Proportional fairness, in contrast, maximizes the product of the rates across all flows, balancing aggregate throughput with equitable shares and providing a logarithmic utility interpretation that favors flows based on their elastic demands.[14]
Quantitative metrics evaluate how closely a queuing discipline achieves these ideals. Jain's fairness index, a widely used measure, is computed as J = (x_1 + ... + x_n)^2 / (n * (x_1^2 + ... + x_n^2)), where x_i is the rate allocated to flow i; the index ranges from 1/n (a single flow receives all service) to 1 (perfect equality).[15] Another key metric is the deviation from idealized Generalized Processor Sharing (GPS), a fluid model where bandwidth is shared continuously in fixed proportions; packet-based approximations like fair queuing bound this deviation so that a packet's completion lags its GPS finish time by at most roughly one maximum packet transmission time.[16]
Challenges in achieving fairness arise from practical constraints, particularly variable packet sizes, which can distort bandwidth shares since flows sending larger packets may receive disproportionate service in packetized systems approximating bit-level round-robin.[16] Fair queuing addresses this through virtual time computations to emulate fluid fairness, but residual unfairness persists if packets arrive in bursts or sizes vary significantly, potentially leading to short-term imbalances. Flow isolation, another challenge, requires strict separation of service guarantees so that one misbehaving flow cannot impact others, which fair queuing enforces via per-flow queues but at the cost of increased overhead in high-flow scenarios.[16]
For example, consider a link of capacity C shared by two flows: one sending small packets of 100 bytes and another large packets of 1500 bytes. Without size normalization, the large-packet flow could claim up to 15 times more bandwidth than the small-packet flow per packet transmission cycle, yielding a Jain's index of approximately 0.57 despite equal queue priorities, highlighting the need for byte-weighted adjustments to restore fairness (see the sketch below). In weighted extensions, these metrics adapt to proportional shares, but unweighted cases prioritize equal division as the baseline ideal.[14]
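The index for this example can be checked directly. The sketch below assumes per-cycle byte shares of 100 and 1500 (one packet each per round-robin cycle), as in the text:

    def jains_index(rates):
        """Jain's fairness index: (sum x_i)^2 / (n * sum x_i^2), in (0, 1]."""
        n = len(rates)
        return sum(rates) ** 2 / (n * sum(x * x for x in rates))

    print(jains_index([100, 1500]))   # ~0.566: one packet per flow per cycle
    print(jains_index([625, 625]))    # 1.0: byte-weighted shares restore fairness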
Weighted Generalizations
Weighted fair queuing (WFQ) generalizes the basic fair queuing mechanism by assigning weights to individual flows or classes of traffic, enabling proportional allocation of bandwidth such that flow i receives a share w_i / (w_1 + ... + w_n) of the total capacity R.[17] This approach builds on the equal-weight case of basic fair queuing, where all flows receive identical shares, but allows for differentiated service based on predefined priorities.[18]
The ideal theoretical model for WFQ is Generalized Processor Sharing (GPS), a fluid approximation that serves traffic bit-by-bit in a weighted round-robin fashion.[17] In GPS, the service rate for a backlogged flow i is given by r_i = R * w_i / (sum of w_j over all backlogged flows j), where R is the server's total service rate.[17] This fluid model assumes infinitely divisible packets, providing a benchmark for analyzing packet-based approximations and ensuring worst-case guarantees on throughput and delay independent of other flows' behavior.[17]
By supporting variable weights, WFQ enables Quality of Service (QoS) differentiation, such as allocating higher shares to latency-sensitive traffic like voice over less urgent data flows, thereby prioritizing real-time applications in integrated services networks.[17] This weighted allocation promotes efficient resource utilization while protecting well-behaved flows from misbehaving ones, a key advantage in congested environments (a rate computation sketch follows).[18]
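A direct transcription of the GPS rate formula as a sketch; the flow names, weights, and 100 Mbit/s link are illustrative values, not taken from the sources:

    def gps_rates(weights, backlogged, R):
        """Instantaneous GPS rates: r_i = R * w_i / sum(w_j for backlogged j)."""
        total = sum(weights[j] for j in backlogged)
        return {j: R * weights[j] / total for j in backlogged}

    weights = {"voice": 0.5, "video": 0.3, "data": 0.2}
    print(gps_rates(weights, {"voice", "data"}, R=100e6))
    # voice ~71.4 Mbit/s, data ~28.6 Mbit/s: the idle video flow's share is
    # redistributed to the backlogged flows in proportion to their weights.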
Core Algorithms
Virtual Time Scheduling
Virtual time scheduling serves as a foundational mechanism in fair queuing to emulate the idealized generalized processor sharing (GPS) discipline using discrete packet transmissions. In this approach, virtual time, denoted V(t), functions as a logical clock that decouples the progression of service allocation from physical real time, advancing only during periods when the server is actively transmitting data. This clock tracks the cumulative service provided to backlogged flows in a manner that mirrors the fluid, bit-by-bit service of GPS, ensuring that the scheduler can approximate proportional bandwidth sharing without requiring infinitesimal packet granularity.[4]
The primary goal of virtual time is to determine the hypothetical completion times for packets under GPS, allowing the packet-by-packet generalized processor sharing (PGPS) algorithm to select packets for transmission in an order that closely follows this ideal fluid model. Each flow maintains a virtual start time and finish time for its packets, where the start time is set to the maximum of the previous packet's finish time or the current virtual time at arrival, and the finish time reflects the service duration proportional to the flow's weight. The scheduler then selects the packet from the flow with the smallest virtual finish time, thereby preserving fairness by serving flows in the sequence they would complete under GPS. This mechanism ensures that no flow receives service out of proportion to its allocated share during contention.[4]
A critical aspect of virtual time management is its handling of server idle periods, during which V(t) remains frozen to avoid artificially advancing service credits for non-busy flows. When the server resumes operation, virtual time continues from its last value, adjusted based on the weights of currently backlogged flows, which prevents unfair penalization of flows that were idle due to external factors like network variability. This freezing behavior maintains the integrity of the GPS emulation across discontinuous transmission periods. Virtual time thus underpins finish time computations in fair queuing by providing a consistent reference for timestamping packets.[4]
Finish Time Computation
In fair queuing, the service order of packets is determined by computing a virtual finish time for each packet, which emulates the completion time that packet would have in an idealized fluid-based generalized processor sharing system. This computation applies the current virtual time to assign timestamps that preserve fairness across flows while respecting per-flow packet ordering.[17]
For the k-th packet of flow i arriving at time a_i^k with length L_i^k, the virtual finish time is given by
F_i^k = max(F_i^(k-1), V(a_i^k)) + L_i^k / (phi_i * R)
where F_i^(k-1) is the virtual finish time of the previous packet from the same flow (with F_i^0 = 0), V(a_i^k) is the system virtual time at arrival (advancing based on the weighted service to backlogged flows), phi_i is the allocated weight for flow i (weights summing to 1 across all flows), and R is the link capacity. This formula incorporates virtual time as input to simulate proportional service allocation.[17]
In the unweighted case, where all active flows receive equal shares (phi_i = 1/N for N flows), the formula reflects uniform bandwidth division. The original unweighted formulation uses F_i^k = max(F_i^(k-1), R(t)) + P_i^k, where R(t) is the virtual time (the number of bit-by-bit rounds completed) and P_i^k is the packet size in normalized time units; during busy periods, this round number closely tracks the service actually delivered.[19]
To dequeue the next packet, the scheduler selects the head-of-line packet across all flows with the smallest virtual finish time, ensuring the emulated fluid order is approximated in the packet system. The max(F_i^(k-1), V(a_i^k)) term handles bursts by preventing a flow from advancing its subsequent packets in virtual time until the prior packet's service slot elapses, thus avoiding any flow disproportionately jumping ahead during high-rate periods.[19]
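The recursion can be transcribed directly. In the sketch below, v_arrival stands for the system virtual time V(a_i^k) at the packet's arrival, which a full scheduler would have to track; here it is simply passed in by the caller:

    def finish_time(prev_finish, v_arrival, length, weight, capacity):
        # F_i^k = max(F_i^(k-1), V(a_i^k)) + L_i^k / (phi_i * R)
        return max(prev_finish, v_arrival) + length / (weight * capacity)

    # Two equal-weight flows (phi = 0.5) on a link of R = 1000 bytes per second:
    f1 = finish_time(0.0, 0.0, length=500, weight=0.5, capacity=1000)
    f2 = finish_time(f1, 0.0, length=500, weight=0.5, capacity=1000)
    print(f1, f2)   # 1.0 2.0: each 500-byte packet takes one second of the half share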
Byte-Weighted Fair Queuing
In packet-based fair queuing, treating each packet equally regardless of size leads to unfair bandwidth allocation in terms of bytes transmitted, as flows with smaller packets receive disproportionately more service opportunities and thus higher throughput. Byte-weighted fair queuing resolves this by emulating a bit-by-bit (or byte-by-byte) round-robin service discipline, ensuring that each flow receives an equal share of bytes over time rather than an equal share of packets. This adaptation promotes fairness at the granularity of data volume, preventing short-packet flows from monopolizing bandwidth at the expense of those with larger packets.[8]
The core of the algorithm involves computing a virtual finish time for each arriving packet to determine the transmission order. Upon arrival, a packet's virtual start time S is set to the maximum of the current system virtual time and the finish time of the previous packet from the same flow. The virtual finish time is then calculated as F = S + L/r, where L is the packet length in bytes and r is the flow's fair share rate (typically the link capacity divided by the number of active flows). Packets are dequeued and transmitted in increasing order of their virtual finish times, approximating the idealized fluid model of generalized processor sharing where service is allocated proportionally to bytes. This mechanism ensures that the byte throughput remains balanced across flows, even with varying packet sizes.
Implementation relies on a priority queue (such as a heap) to efficiently select the packet with the minimum virtual finish time for transmission. Each packet arrival, departure, or flow update requires insertions or deletions in the queue, yielding an overall complexity of O(log N) per packet operation, where N is the number of active flows.[8] This logarithmic overhead is practical for typical network routers, balancing fairness with computational efficiency.
For illustration, consider two flows sharing a link with a capacity of 1250 bytes per unit time with equal shares, so each gets r = 625 bytes per unit time. Flow 1 sends 400-byte packets, while Flow 2 sends 1000-byte packets. The first packet of Flow 1 finishes virtually at time 0.64 units (400/625), followed by Flow 2's at 1.6 units (1000/625). Subsequent packets from Flow 1 (finishing at 1.28, 1.92, etc.) interleave, allowing Flow 1 more transmissions to equalize byte service, resulting in both flows transmitting approximately 625 bytes per unit time on average (the trace below replays this schedule).[8]
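The two-flow example can be replayed with a short trace. The sketch below assumes both flows stay continuously backlogged, so each packet's virtual finish time is just the flow's previous finish time plus size/share:

    share = 625.0       # each flow's fair share: R/N = 1250/2 bytes per unit time
    horizon = 3.2       # virtual-time window over which to compare byte service

    events = []
    for flow, size in (("flow1", 400), ("flow2", 1000)):
        finish = 0.0
        while finish + size / share <= horizon + 1e-9:   # epsilon guards round-off
            finish += size / share                       # next virtual finish time
            events.append((finish, flow, size))

    bytes_sent = {"flow1": 0, "flow2": 0}
    for finish, flow, size in sorted(events):            # increasing finish time
        bytes_sent[flow] += size
        print(f"{flow} packet finishes at virtual time {finish:.2f}")
    print(bytes_sent)   # {'flow1': 2000, 'flow2': 2000}: equal bytes, unequal packets

Over the same virtual horizon, Flow 1 sends five 400-byte packets and Flow 2 sends two 1000-byte packets, so both receive the same byte service despite the size difference.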
Variant Algorithms
Weighted Fair Queuing
Weighted Fair Queuing (WFQ) extends the principles of byte-weighted fair queuing by incorporating configurable weights for each flow, enabling differentiated bandwidth allocation to support quality of service (QoS) requirements in packet-switched networks. In unweighted byte-weighted fair queuing, all flows receive equal shares of bandwidth, but WFQ assigns a weight w_i to each flow i, proportional to its desired share, allowing higher-priority flows to receive a larger portion of the link capacity R. This generalization maintains the emulation of an ideal fluid-rate scheduler while handling discrete packets, ensuring that the virtual time advances based on the weighted fluid rate of the active flows.[4]
The core of WFQ involves computing a virtual finish time for each arriving packet to determine scheduling order. Upon arrival of a packet of length L from flow i, the packet's virtual start time is the maximum of the current system virtual time and the finish time of the previous packet in the same flow. The virtual time increment for the packet is then L/w_i, and the packet's virtual finish time is the start time plus this increment. Packets are dequeued in increasing order of their virtual finish times, approximating the service that would be provided under a weighted fluid model. This mechanism ensures that bandwidth is allocated according to weights even under varying traffic loads.[4]
WFQ provides strong guarantees on performance and fairness, including worst-case delay bounds that are within one maximum packet transmission time of those under Generalized Processor Sharing (GPS), offering delay guarantees independent of the number of flows N. Additionally, it achieves flow isolation by bounding the impact of any misbehaving flow on others, preventing one flow from monopolizing resources and ensuring that compliant flows receive their entitled shares. These properties make WFQ effective for protecting latency-sensitive traffic from bursty or adversarial sources.[21]
WFQ forms the basis for practical implementations in networking hardware and has been standardized as a foundational QoS mechanism. Cisco Systems adopted WFQ in their routers starting in the mid-1990s, integrating it with flow-based classification for efficient congestion management. The Internet Engineering Task Force (IETF) has referenced WFQ principles in recommendations for differentiated services and integrated services architectures, such as RFC 2205 for RSVP and subsequent QoS frameworks.[22]
Deficit Round Robin
Deficit Round Robin (DRR) is a packet scheduling algorithm designed as an efficient approximation to Weighted Fair Queuing (WFQ), providing near-optimal fairness with significantly reduced computational overhead.[5] It was proposed by M. Shreedhar and G. Varghese in 1995 specifically for implementation in high-speed routers, addressing the challenges of variable packet sizes in fair bandwidth allocation among multiple flows.[5]
In DRR, each flow maintains a dedicated queue along with two key parameters: a fixed quantum (a credit amount, typically set relative to the maximum packet size, such as 500 bytes) and a deficit counter initialized to zero. The algorithm operates in a round-robin fashion, cycling through all active (non-empty) queues using a simple pointer. For a selected flow, the quantum is added to its current deficit counter. The scheduler then dequeues and transmits packets from that queue sequentially, subtracting each packet's size from the deficit counter after transmission, until either the remaining deficit is smaller than the size of the packet at the head of the queue (in which case the unused credit carries over to the flow's next round) or the queue becomes empty (in which case the deficit counter is reset to zero). This carry-over mechanism preserves credit accumulated while waiting behind large packets, preventing starvation of flows with smaller packets (see the sketch below).[5]
A primary advantage of DRR is its constant-time O(1) complexity per packet, achieved through the straightforward round-robin traversal without the need for per-flow timestamps or priority queues.[5] Unlike WFQ, which requires O(log N) operations for timestamp maintenance and sorting across N flows, DRR eliminates these by leveraging the deficit counter to handle variable packet lengths directly in a cyclic manner.[5] Regarding fairness, DRR approximates the ideal fluid-model fairness of WFQ within a bound proportional to the maximum packet size, ensuring that no flow's service deviates from its entitled share by more than about one packet's worth per round, which is sufficient for most network applications.[5]
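A compact sketch of the DRR loop as described above; the flow names, packet sizes, and quantum are illustrative:

    from collections import deque

    def drr(queues, quantum):
        """Deficit Round Robin over a dict of flow -> deque of packet sizes (bytes)."""
        deficit = {flow: 0 for flow in queues}
        order = []
        while any(queues.values()):                 # cycle while any queue is non-empty
            for flow, q in queues.items():
                if not q:
                    continue
                deficit[flow] += quantum            # add one quantum of credit per round
                while q and q[0] <= deficit[flow]:  # send while credit covers head packet
                    size = q.popleft()
                    deficit[flow] -= size
                    order.append((flow, size))
                if not q:
                    deficit[flow] = 0               # an emptied queue forfeits leftover credit
        return order

    queues = {"a": deque([200, 200, 200]), "b": deque([500, 100])}
    print(drr(queues, quantum=500))
    # Round 1: a sends 200+200 (deficit 100 carries over), b sends 500 (deficit 0).
    # Round 2: a sends its last 200, b sends 100.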
QFQ+
Quick Fair Queueing Plus (QFQ+) is an efficient packet scheduling algorithm that provides near-optimal fair-queueing guarantees at the amortized cost of Deficit Round Robin (DRR). Proposed by Paolo Valente, QFQ+ enhances the Quick Fair Queueing (QFQ) scheduler by grouping classes into aggregates and scheduling the aggregates with O(1) complexity, while handling individual classes within aggregates using DRR-like mechanisms. It was integrated into the mainline Linux kernel starting from version 3.8.0, replacing QFQ to offer improved performance in hierarchical and weighted scheduling scenarios.[23][24]
QFQ+ matches DRR's O(1) time complexity per packet decision and achieves execution times close to or lower than DRR, particularly in environments with differentiated flow weights or TCP Segmentation Offload (TSO)/Generic Segmentation Offload (GSO). For instance, performance measurements on an Intel i7-2760QM processor show QFQ+ processing packets in as little as 63 ns compared to DRR's 56 ns in certain configurations, with up to 31% fewer instructions than the original QFQ and cache efficiency comparable to DRR. Unlike DRR, which can exhibit high delay and jitter due to its round-robin cycling and variable packet sizes, QFQ+ provides much tighter fairness bounds by emulating an ideal fluid-rate scheduler more closely, ensuring near-optimal worst-case deviation from perfect fairness. QFQ+ also offers strong delay guarantees: its worst-case packet completion time is that under QFQ plus the transmission time of three maximum-size packets at the class's reserved rate, which still yields far lower worst-case delay and jitter than DRR.[23][24]
These properties make QFQ+ a preferred choice for modern networking applications requiring low latency, such as quality of service in high-speed links and cloud environments, where it balances computational efficiency with robust service assurances.
Modern Applications
Quality of Service in Networks
Fair queuing plays a central role in Quality of Service (QoS) mechanisms within IP networks, particularly through its integration with Differentiated Services (DiffServ), where it enables per-class queuing to allocate bandwidth equitably among traffic classes such as expedited forwarding for low-latency needs and assured forwarding for controlled delays.[25] In DiffServ architectures, fair queuing variants like Weighted Fair Queuing (WFQ) schedule packets based on class weights, ensuring that higher-priority classes receive proportional service without starving lower-priority ones, thus maintaining end-to-end QoS guarantees across domains.[25] This integration addresses the limitations of simpler FIFO queuing by providing isolation between aggregated flows, which is essential for scalable QoS deployment in core routers.[26]
Stochastic Fair Queuing (SFQ), an approximation of ideal fair queuing, further enhances flow isolation in routers by hashing packet flows into a fixed number of queues (typically 16 to 128) using flow identifiers such as source and destination IP addresses and ports, thereby approximating per-flow fairness without maintaining state for every individual flow (see the sketch below).[27] SFQ ensures that short-lived or low-bandwidth flows are not overwhelmed by long-lived high-bandwidth ones, promoting statistical fairness in environments with thousands of concurrent connections.[27] The IETF's RFC 2309 highlights the value of such per-flow scheduling approaches, including fair queuing, for mitigating congestion and achieving approximate bandwidth allocation in Internet routers, recommending their use alongside active queue management to preserve network performance.[28]
One key benefit of fair queuing in QoS is its ability to prevent TCP greediness, where flows with shorter round-trip times or aggressive retransmissions could otherwise monopolize link capacity, starving other connections; by enforcing proportional sharing, it protects responsive TCP traffic from unresponsive or misbehaving flows.[29] This isolation is particularly vital for supporting real-time applications like Voice over IP (VoIP), as fair queuing assigns dedicated or weighted queues to latency-sensitive UDP-based VoIP packets, minimizing jitter and delay variations even when coexisting with bulk data transfers.[30] In modern IP routers, extensions like Flow Queuing with Controlled Delay (FQ-CoDel) combine fair queuing with active queue management to combat bufferbloat, delivering low latency for interactive traffic while maintaining fairness; FQ-CoDel hashes flows into queues and applies CoDel drop policies per queue, reducing queueing delays to under 5 ms for typical links even under load.[31]
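The flow-to-queue hashing step of SFQ can be sketched as follows. The 5-tuple fields, the CRC-32 hash, and the queue count of 128 are illustrative choices; real implementations (e.g., Linux sfq) also periodically change a perturbation value, mimicked by the salt here, to redistribute flows that happen to collide:

    import zlib

    NUM_QUEUES = 128        # fixed queue count, independent of the number of flows
    PERTURB = 0x9e3779b9    # salt rotated periodically to break unlucky collisions

    def sfq_queue(src_ip, dst_ip, src_port, dst_port, proto, perturb=PERTURB):
        """Hash a flow's 5-tuple into one of NUM_QUEUES queues."""
        key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}|{perturb}".encode()
        return zlib.crc32(key) % NUM_QUEUES

    print(sfq_queue("10.0.0.1", "10.0.0.2", 5000, 80, "tcp"))
    print(sfq_queue("10.0.0.1", "10.0.0.2", 5001, 80, "tcp"))  # usually a different queue

Because the number of queues is fixed, per-flow state stays bounded; two flows that hash to the same queue temporarily share one fair-queuing slot until the perturbation changes.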
Cloud Computing and 5G
In cloud computing environments, fair queuing principles are applied to ensure equitable resource allocation among virtual machines (VMs), mitigating issues such as resource monopolization by individual tenants. For instance, measurement-based fair queuing (MBFQ) allocates bandwidth proportionally to VMs based on observed usage, allowing a VM to achieve its allocated share within three scheduling intervals while reclaiming idle bandwidth in five periods. This approach addresses noisy-neighbor problems by dynamically adjusting shares without requiring precise a priori knowledge of traffic patterns. Similarly, in container orchestration platforms like Kubernetes, schedulers such as Apache YuniKorn implement hierarchical fair sharing, which draws from fair queuing to balance CPU, memory, and GPU resources across pods, preventing any single workload from dominating cluster capacity during AI and machine learning tasks.[32][33]
In 5G networks, weighted fair queuing (WFQ) variants support multi-slice radio access networks (RAN) by dynamically allocating resources among diverse services like enhanced mobile broadband (eMBB) and ultra-reliable low-latency communications (URLLC). Dynamic weighted fair queuing (DWFQ) in multi-slice setups ensures fairness across virtual operators by adjusting weights based on slice priorities, maintaining isolation while optimizing throughput for eMBB traffic and latency for URLLC. 3GPP specifications from Release 16 onward incorporate scheduling mechanisms akin to WFQ for network slicing, as studied in resource allocation frameworks that guarantee minimum data rates and fairness levels in coexistence scenarios. These adaptations enable efficient handling of heterogeneous traffic in sliced architectures, with open-source implementations using WFQ queues to prioritize eMBB and URLLC flows via OpenFlow.[34][35]
Recent advancements post-2020 integrate machine learning to optimize WFQ for adaptive weight assignment in dynamic environments. For example, deep reinforcement learning-based WFQ (WFQ continual-DRL) dynamically tunes queue weights in response to varying bandwidth demands, improving fairness and throughput in cloud and 5G contexts compared to static methods. In 5G active queue management (AQM), FQ-CoDel combines fair queuing with controlled delay to combat bufferbloat, reducing latency under mixed traffic loads in New Radio (NR) simulations compliant with RFC 7928 guidelines.[36] These developments address gaps in disaggregated RAN and edge computing by employing ML-driven AQM to manage buffers across centralized and distributed units, enhancing performance in low-latency edge applications without exacerbating congestion.[37][38]
Implementation Aspects
Pseudocode and Examples
Fair queuing algorithms can be implemented using a per-flow queue structure, where each packet is assigned a finish time upon enqueuing, and the scheduler selects the packet with the minimum finish time for dequeuing. This approach approximates the bit-by-bit round-robin service discipline described by Demers, Keshav, and Shenker.[19]
For simplicity, assume the link capacity is normalized to 1 (e.g., service times equal packet sizes in bytes), virtual time V starts at 0, and there is one queue per flow. Packets are enqueued with finish time F = max(V, F_last) + S, where S is the packet size and F_last is the finish time of the previous packet in that flow (0 initially). Upon dequeuing, V advances to the finish time of the selected packet. If the server is idle when a packet arrives, V is set to the current real time (also normalized).
The following pseudocode outlines the core operations for byte-weighted fair queuing (equal weights across flows):
Initialize:
- V = 0 // current virtual time
- For each flow f: queue_f = empty, F_last_f = 0
- Use a global priority queue (min-heap) keyed by finish time F to hold all packets across flows
Enqueue(packet p for flow f, arrival time t, size S):
if server idle and t > V:
V = t // update virtual time on idle periods
F = max(V, F_last_f) + S
insert p into queue_f with tag F
insert (F, p, f) into global priority queue
F_last_f = F
Dequeue():
if global priority queue empty:
idle
else:
extract min (F_min, p, f) from global priority queue
remove p from queue_f
V = F_min // advance virtual time
transmit p
if queue_f not empty:
// next packet in flow f will use updated F_last_f
As a worked example (link rate normalized to 1), consider three flows A, B, and C:
- t=0: Enqueue A's first packet (size 1): F = max(0, 0) + 1 = 1. Enqueue A's second (size 2): F = max(0, 1) + 2 = 3. Enqueue B's (size 4): F = max(0, 0) + 4 = 4.
- Dequeue: Select min F=1 (A1), set V=1.
- t=1: Enqueue C's (size 1): F = max(1, 0) + 1 = 2.
- Dequeue: Select min F=2 (C), set V=2.
- Dequeue: Select min F=3 (A2), set V=3.
- t=3: Enqueue A's third (size 3): F = max(3, 3) + 3 = 6.
- Dequeue: Select min F=4 (B), set V=4.
- Dequeue: Select F=6 (A3), set V=6.
Despite A's burst at t=0, C's late small packet overtakes A's queued packets, while B's large packet is served as soon as its finish tag is the minimum.
Performance and Complexity
Fair queuing algorithms exhibit varying computational complexities depending on their implementation. Weighted Fair Queuing (WFQ) typically requires O(log N) time per packet operation, where N is the number of active flows, primarily due to the use of a priority queue (such as a heap) for sorting packets by their virtual finish times.[40] In contrast, Deficit Round Robin (DRR) achieves O(1) time complexity per packet through a simple round-robin traversal of queues, augmented by a deficit counter to handle variable packet sizes without sorting.[41]
Performance metrics for fair queuing emphasize guarantees on fairness and latency. Fairness is typically bounded such that the service received by any two flows deviates from their entitled shares by at most the maximum packet size, ensuring proportional allocation even under bursty traffic.[16] For latency, WFQ provides a worst-case delay bound that approximates the idealized Generalized Processor Sharing (GPS) scheduler, specifically within one maximum packet transmission time of the GPS delay, which for a flow with rate ρ_i and burstiness constraints remains independent of the total number of flows N.[42]
Scalability trade-offs arise with high flow counts, as WFQ's O(log N) operations become prohibitive in routers handling thousands of concurrent flows, leading to increased processing overhead. To mitigate this, approximations like Stochastic Fair Queuing (SFQ) employ hashing to map flows to a fixed number of bins, reducing complexity to O(1) while preserving fairness in expectation, though with a small probability of minor inaccuracies.[27]
Optimizations for fair queuing include hardware accelerations in Application-Specific Integrated Circuits (ASICs) and Field-Programmable Gate Arrays (FPGAs), which enable high-throughput implementations suitable for modern networks. For instance, dedicated WFQ architectures in ASICs can process packets at rates up to 50 Gb/s (as of 2011) by parallelizing queue management and timestamp computations.[43] Recent FPGA implementations, such as those for multi-tenant cloud and 5G networks as of 2025, support scalable fair scheduling at 100 Gb/s and beyond, facilitating low-latency QoS enforcement in programmable switches.[44][45]
References
- https://pbg.cs.illinois.edu/courses/cs598fa09/readings/pg93.pdf
