Consistent hashing
In computer science, consistent hashing[1][2] is a special kind of hashing technique such that when a hash table is resized, only n/m keys need to be remapped on average, where n is the number of keys and m is the number of slots. Consistent hashing evenly distributes cache keys across shards, even if some of the shards crash or become unavailable.[3] In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped because the mapping between the keys and the slots is defined by a modular operation.
Consistent hashing is used by Content Delivery Networks because it is useful for distributing requests for content from a rotating population of web servers. Tim Berners-Lee credits consistent hashing algorithms, and Daniel Lewin as their inventor, with solving the slashdotting problem which plagued the World Wide Web in the 1990s.[4]
History
The term "consistent hashing" was introduced by David Karger et al. at MIT for use in distributed caching, particularly for the web.[5] This academic paper from 1997 in the Symposium on Theory of Computing introduced the term "consistent hashing" as a way of distributing requests among a changing population of web servers.[6] Each slot is then represented by a server in a distributed system or cluster. The addition of a server and the removal of a server (during scalability or outage) requires only n/m items to be re-shuffled when the number of slots (i.e. servers) changes. The authors mention linear hashing and its ability to handle sequential server addition and removal, while consistent hashing allows servers to be added and removed in an arbitrary order.[1] The paper was later re-purposed to address the technical challenge of keeping track of a file in peer-to-peer networks such as a distributed hash table.[7][8]
Teradata used this technique in their distributed database[citation needed], released in 1986, although they did not use this term. Teradata still uses the concept of a hash table to fulfill exactly this purpose. Akamai Technologies was founded in 1998 by the scientists Daniel Lewin and F. Thomson Leighton (co-authors of the article coining "consistent hashing"). In Akamai's content delivery network,[9] consistent hashing is used to balance the load within a cluster of servers, while a stable marriage algorithm is used to balance load across clusters.[2]
Consistent hashing has also been used to reduce the impact of partial system failures in large web applications to provide robust caching without incurring the system-wide fallout of a failure.[10] Consistent hashing is also the cornerstone of distributed hash tables (DHTs), which employ hash values to partition a keyspace across a distributed set of nodes, then construct an overlay network of connected nodes that provide efficient node retrieval by key.
Rendezvous hashing, designed in 1996, is a simpler and more general technique.[citation needed] It achieves the goals of consistent hashing using the very different highest random weight (HRW) algorithm.
Basic technique
In the problem of load balancing, for example, when a BLOB has to be assigned to one of n servers on a cluster, a standard hash function could be used in such a way that we calculate the hash value for that BLOB. Assuming the resultant value of the hash is β, we perform a modular operation with the number of servers (n in this case) to determine the server in which we can place the BLOB: ζ = β % n; hence the BLOB will be placed in the server whose server ID is the successor of ζ in this case. However, when a server is added or removed during outage or scaling (when n changes), all the BLOBs in every server should be reassigned and moved due to rehashing, but this operation is expensive.
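The cost of the modular approach can be seen in a short Python sketch (the server counts, key names, and the use of MD5 here are arbitrary choices for illustration, not taken from the cited sources): it counts how many keys change servers when the cluster grows from 5 to 6 servers, and typically well over 80% of them move.

import hashlib

def modulo_assign(key: str, num_servers: int) -> int:
    # Traditional scheme: server index = hash(key) mod number of servers.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % num_servers

keys = [f"blob-{i}" for i in range(10_000)]
before = {k: modulo_assign(k, 5) for k in keys}   # cluster of 5 servers
after = {k: modulo_assign(k, 6) for k in keys}    # one server added

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.1%} of keys were remapped")   # expected around 5/6 of them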
Consistent hashing was designed to avoid the problem of having to reassign every BLOB when a server is added or removed throughout the cluster. The central idea is to use a hash function that maps both the BLOBs and the servers to a unit circle, usually 2π radians. For example, ζ = Φ % 360 (where Φ is the hash of a BLOB's or server's identifier, like an IP address or UUID). Each BLOB is then assigned to the next server that appears on the circle in clockwise order. Usually, binary search or linear search is used to find the "spot" or server to place a particular BLOB, in O(log N) or O(N) time respectively; and in every iteration, which happens in clockwise manner, an operation ζ ≤ Ψ (where Ψ is the value of the server within the cluster) is performed to find the server to place the BLOB. This provides an even distribution of BLOBs to servers. But, more importantly, if a server fails and is removed from the circle, only the BLOBs that were mapped to the failed server need to be reassigned to the next server in clockwise order. Likewise, if a new server is added, it is added to the unit circle, and only the BLOBs mapped to that server need to be reassigned.
Importantly, when a server is added or removed, the vast majority of the BLOBs maintain their prior server assignments, and the addition of the k-th server only causes a 1/k fraction of the BLOBs to relocate. Although the process of moving BLOBs across cache servers in the cluster depends on the context, commonly the newly added cache server identifies its "predecessor" and moves from it all the BLOBs whose mapping belongs to the new server (i.e. whose hash value is less than that of the new server). However, in the case of web page caches, most implementations involve no moving or copying at all, assuming the cached BLOB is small enough. When a request hits a newly added cache server, a cache miss happens, a request to the actual web server is made, and the BLOB is cached locally for future requests. The redundant BLOBs on the previously used cache servers are then removed as per the cache eviction policies.[11]
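A minimal Python sketch of the circle-based scheme described above follows; the ring size of 2^32, the use of MD5, and the server names are illustrative assumptions, not prescribed by the sources. It shows that adding a sixth server relocates only the BLOBs falling in one arc of the circle.

import bisect
import hashlib

RING_SIZE = 2 ** 32

def ring_position(identifier: str) -> int:
    # Map a server or BLOB identifier to a point on the circle [0, RING_SIZE).
    return int(hashlib.md5(identifier.encode()).hexdigest(), 16) % RING_SIZE

class HashRing:
    def __init__(self, servers):
        self.points = sorted(ring_position(s) for s in servers)
        self.owner = {ring_position(s): s for s in servers}

    def lookup(self, blob_key: str) -> str:
        # Walk clockwise from the BLOB's position to the next server; wrap if needed.
        i = bisect.bisect_left(self.points, ring_position(blob_key))
        return self.owner[self.points[i % len(self.points)]]

keys = [f"blob-{i}" for i in range(10_000)]
ring5 = HashRing(["s1", "s2", "s3", "s4", "s5"])
ring6 = HashRing(["s1", "s2", "s3", "s4", "s5", "s6"])
moved = sum(1 for k in keys if ring5.lookup(k) != ring6.lookup(k))
print(f"{moved / len(keys):.1%} of BLOBs moved")   # only the arc claimed by s6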
Implementation
Let h_b() and h_s() be the hash functions used for the BLOB and the server's unique identifier respectively. In practice, a binary search tree (BST) is used to dynamically maintain the h_s values within a cluster or hashring, and tree traversal is used to find the successor or minimum within the BST. A sketch of these operations follows the list below.
- Inserting a BLOB into the cluster
- Let β be the hash value of the BLOB, i.e. h_b(BLOB) = β. To insert the BLOB, find the successor of β in the BST of h_s values. If β is larger than all of the h_s values, the BLOB is placed in the server with the smallest h_s value.
- Deleting a BLOB from the cluster
- Find the successor of β in the BST and remove the BLOB from the returned server. If β has no successor, remove the BLOB from the server with the smallest h_s value.[12]
- Inserting a server into the cluster
- Let Φ be the hash value of the server's identifier, i.e. h_s(server ID) = Φ. Move all the BLOBs whose hash value is smaller than Φ from the server whose h_s is the successor of Φ. If Φ is the largest of all the h_s values, move the relevant BLOBs from the server with the smallest h_s into the new server.[13]
- Deleting a server from the cluster
- Find the successor of Φ in the BST and move the BLOBs from the departing server into its successor server. If Φ doesn't have a successor, move the BLOBs into the server with the smallest h_s value.[14]
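These operations can be sketched in Python as follows; a sorted list stands in for the BST, MD5 plays the role of both h_b and h_s, and the per-server dict stores are purely illustrative assumptions of the example.

import bisect
import hashlib

RING_SIZE = 2 ** 32

def h(identifier: str) -> int:
    # Used here for both h_b (BLOBs) and h_s (server identifiers).
    return int(hashlib.md5(identifier.encode()).hexdigest(), 16) % RING_SIZE

class Cluster:
    def __init__(self):
        self.points = []    # sorted h_s values (stand-in for the BST)
        self.stores = {}    # h_s value -> {blob key: blob}

    def _successor(self, value: int) -> int:
        # Smallest h_s >= value, wrapping to the minimum if none exists.
        i = bisect.bisect_left(self.points, value)
        return self.points[i % len(self.points)]

    def insert_blob(self, key: str, blob) -> None:
        self.stores[self._successor(h(key))][key] = blob

    def delete_blob(self, key: str) -> None:
        self.stores[self._successor(h(key))].pop(key, None)

    def insert_server(self, server_id: str) -> None:
        phi = h(server_id)
        donor = self._successor(phi) if self.points else None
        bisect.insort(self.points, phi)
        self.stores[phi] = {}
        if donor is not None:
            # Move the BLOBs that now map to the new server away from its successor.
            for key in [k for k in self.stores[donor] if self._successor(h(k)) == phi]:
                self.stores[phi][key] = self.stores[donor].pop(key)

    def delete_server(self, server_id: str) -> None:
        phi = h(server_id)
        self.points.remove(phi)
        orphans = self.stores.pop(phi)
        if self.points:
            # Hand everything over to the clockwise successor.
            self.stores[self._successor(phi)].update(orphans)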
Variance reduction
To avoid skewness caused by multiple nodes falling within a small arc of the circle, which happens due to the lack of a uniform distribution of the servers within the cluster, multiple labels are used. Those duplicate labels are called "virtual nodes", i.e. multiple labels which point to a single "real" label or server within the cluster. The number of virtual nodes or duplicate labels used for a particular server within a cluster is called the "weight" of that particular server.[15]
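A sketch of this idea in Python follows; the "server#i" label format and the use of MD5 are assumptions made for the example. A server of weight w is hashed under w distinct labels, each of which routes back to the same physical server.

import bisect
import hashlib

RING_SIZE = 2 ** 32

def position(label: str) -> int:
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % RING_SIZE

class WeightedRing:
    def __init__(self, weights):
        # weights: {server name: number of virtual nodes ("weight")}
        self.ring = sorted(
            (position(f"{server}#{i}"), server)
            for server, weight in weights.items()
            for i in range(weight)
        )

    def lookup(self, key: str) -> str:
        i = bisect.bisect_left(self.ring, (position(key), ""))
        return self.ring[i % len(self.ring)][1]

# "cache-c" carries twice the weight, so it receives roughly twice the keys.
ring = WeightedRing({"cache-a": 100, "cache-b": 100, "cache-c": 200})
print(ring.lookup("some-blob"))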
Practical extensions
A number of extensions to the basic technique are needed for effectively using consistent hashing for load balancing in practice. In the basic scheme above, if a server fails, all its BLOBs are reassigned to the next server in clockwise order, potentially doubling the load of that server. This may not be desirable. To ensure a more even redistribution of BLOBs on server failure, each server can be hashed to multiple locations on the unit circle. When a server fails, the BLOBs assigned to each of its replicas on the unit circle will get reassigned to a different server in clockwise order, thus redistributing the BLOBs more evenly. Another extension concerns a situation where a single BLOB gets "hot" and is accessed a large number of times and will have to be hosted in multiple servers. In this situation, the BLOB may be assigned to multiple contiguous servers by traversing the unit circle in clockwise order. A more complex practical consideration arises when two BLOBs are hashed near each other in the unit circle and both get "hot" at the same time. In this case, both BLOBs will use the same set of contiguous servers in the unit circle. This situation can be ameliorated by each BLOB choosing a different hash function for mapping servers to the unit circle.[2]
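One way to realize the "multiple contiguous servers" idea is sketched below in Python (the ring layout, label format, and value of k are assumptions of the example): starting at the BLOB's position, walk clockwise and collect the next k distinct physical servers, skipping repeated virtual nodes of the same server.

import bisect
import hashlib

RING_SIZE = 2 ** 32

def position(label: str) -> int:
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % RING_SIZE

def contiguous_servers(ring, blob_key, k):
    # ring: sorted list of (position, physical server) pairs, possibly with
    # several virtual-node entries per server.
    start = bisect.bisect_left(ring, (position(blob_key), ""))
    chosen = []
    for step in range(len(ring)):
        _, server = ring[(start + step) % len(ring)]
        if server not in chosen:
            chosen.append(server)
            if len(chosen) == k:
                break
    return chosen

ring = sorted((position(f"s{j}#{i}"), f"s{j}") for j in range(5) for i in range(100))
print(contiguous_servers(ring, "hot-blob", k=3))   # the hot BLOB's three hosts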
Comparison with rendezvous hashing and other alternatives
Rendezvous hashing, designed in 1996, is a simpler and more general technique, and permits fully distributed agreement on a set of k options out of a possible set of n options. It can in fact be shown that consistent hashing is a special case of rendezvous hashing. Because of its simplicity and generality, rendezvous hashing is now being used in place of consistent hashing in many applications.
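For comparison, the highest random weight idea can be sketched in a few lines of Python (the "key@node" scoring string and the use of MD5 are arbitrary choices of the example): every node is scored against the key and the key goes to the highest-scoring node, so no ring needs to be maintained.

import hashlib

def rendezvous_assign(key: str, nodes: list[str]) -> str:
    # Score each (key, node) pair and pick the node with the highest score.
    def score(node: str) -> int:
        return int(hashlib.md5(f"{key}@{node}".encode()).hexdigest(), 16)
    return max(nodes, key=score)

print(rendezvous_assign("blob-42", ["s1", "s2", "s3"]))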
If key values will always increase monotonically, an alternative approach using a hash table with monotonic keys may be more suitable than consistent hashing.[citation needed]
Complexity
|  | Classic hash table | Consistent hashing |
|---|---|---|
| add a node | O(K) | O(K/N + log N) |
| remove a node | O(K) | O(K/N + log N) |
| lookup a key | O(1) | O(log N) |
| add a key | O(1) | O(log N) |
| remove a key | O(1) | O(log N) |
Here K is the number of keys and N is the number of nodes (slots). The O(K/N) is an average cost for the redistribution of keys, and the O(log N) complexity for consistent hashing comes from the fact that a binary search among the node angles is required to find the next node on the ring.[citation needed]
Examples
Known examples of consistent hashing use include:
- Couchbase automated data partitioning [16]
- OpenStack's Object Storage Service Swift[17]
- Partitioning component of Amazon's storage system Dynamo[18]
- Data partitioning in Apache Cassandra[19]
- Data partitioning in ScyllaDB[20]
- Data partitioning in Voldemort[21]
- Akka's consistent hashing router[22]
- Riak, a distributed key-value database[23]
- Gluster, a network-attached storage file system[24]
- Akamai content delivery network[25]
- Discord chat application[26]
- Load balancing gRPC requests to a distributed cache in SpiceDB[27]
- Chord algorithm[28]
- MinIO object storage system[29]
References
- ^ a b Karger, D.; Lehman, E.; Leighton, T.; Panigrahy, R.; Levine, M.; Lewin, D. (1997). Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing. ACM Press New York, NY, USA. pp. 654–663. doi:10.1145/258533.258660.
- ^ a b c Bruce Maggs and Ramesh Sitaraman (2015). "Algorithmic nuggets in content delivery" (PDF). ACM SIGCOMM Computer Communication Review. 45 (3).
- ^ Designing Distributed Systems Patterns and Paradigms for Scalable, Reliable Services. O'Reilly Media. 2018. ISBN 9781491983607.
- ^ Berners-Lee, Tim (2025). This is for Everyone: the unfinished story of the World Wide Web. Farrar, Straus and Giroux. p. 156. ISBN 978-0-374-61246-7.
- ^ Roughgarden & Valiant 2021, p. 2.
- ^ Roughgarden & Valiant 2021, p. 7.
- ^ Roughgarden & Valiant 2021, p. 8.
- ^ I. Stoica et al., "Chord: a scalable peer-to-peer lookup protocol for Internet applications," in IEEE/ACM Transactions on Networking, vol. 11, no. 1, pp. 17–32, Feb. 2003, doi: 10.1109/TNET.2002.808407.
- ^ Nygren., E.; Sitaraman R. K.; Sun, J. (2010). "The Akamai Network: A Platform for High-Performance Internet Applications" (PDF). ACM SIGOPS Operating Systems Review. 44 (3): 2–19. doi:10.1145/1842733.1842736. S2CID 207181702. Archived (PDF) from the original on November 30, 2022. Retrieved August 29, 2023.
- ^ Karger, D.; Sherman, A.; Berkheimer, A.; Bogstad, B.; Dhanidina, R.; Iwamoto, K.; Kim, B.; Matkins, L.; Yerushalmi, Y. (1999). "Web Caching with Consistent Hashing". Computer Networks. 31 (11): 1203–1213. doi:10.1016/S1389-1286(99)00055-9. Archived from the original on 2008-07-21. Retrieved 2008-02-05.
- ^ Roughgarden & Valiant 2021, p. 6.
- ^ Moitra 2016, p. 2.
- ^ Moitra 2016, p. 2–3.
- ^ Moitra 2016, p. 3.
- ^ Roughgarden & Valiant 2021, p. 6–7.
- ^ "What Exactly Is Membase?". 16 December 2014. Retrieved 2020-10-29.
- ^ Holt, Greg (February 2011). "Building a Consistent Hashing Ring". openstack.org. Retrieved 2019-11-17.
- ^ DeCandia, G.; Hastorun, D.; Jampani, M.; Kakulapati, G.; Lakshman, A.; Pilchin, A.; Sivasubramanian, S.; Vosshall, P.; Vogels, Werner (2007). "Dynamo" (PDF). ACM SIGOPS Operating Systems Review. 41 (6): 205–220. doi:10.1145/1323293.1294281. Retrieved 2018-06-07.
- ^ Lakshman, Avinash; Malik, Prashant (2010). "Cassandra: a decentralized structured storage system". ACM SIGOPS Operating Systems Review. 44 (2): 35–40. doi:10.1145/1773912.1773922. S2CID 916681.
- ^ "NoSQL Comparison: MongoDB vs ScyllaDB". benchant.com. Retrieved 21 March 2024.
- ^ "Design -- Voldemort". www.project-voldemort.com/. Archived from the original on 9 February 2015. Retrieved 9 February 2015.
Consistent hashing is a technique that avoids these problems, and we use it to compute the location of each key on the cluster.
- ^ "Akka Routing". akka.io. Retrieved 2019-11-16.
- ^ "Riak Concepts". Archived from the original on 2015-09-19. Retrieved 2016-12-06.
- ^ "GlusterFS Algorithms: Distribution". gluster.org. 2012-03-01. Retrieved 2019-11-16.
- ^ Roughgarden, Tim; Valiant, Gregory (2016-03-28). "Modern Algorithmic Toolbox" (PDF). stanford.edu. Retrieved 2019-11-17.
- ^ Vishnevskiy, Stanislav (2017-07-06). "How Discord Scaled Elixir to 5,000,000 Concurrent Users". Retrieved 2022-08-16.
- ^ "Consistent Hash Load Balancing for gRPC". 24 November 2021. Retrieved 2023-09-04.
- ^ Stoica, I.; Morris, R.; Liben-Nowell, D.; Karger, D.; Kaashoek, M. F.; Dabek, F.; Balakrishnan, H. (25 Feb 2003). "Chord: a scalable peer-to-peer lookup protocol for Internet applications". IEEE/ACM Transactions on Networking. 11 (1): 17–32. doi:10.1109/TNET.2002.808407. S2CID 221276912.
- ^ "MinIO Versioning, Metadata and Storage Deep Dive". 3 January 2022. Retrieved 2023-10-24.
Works cited
- Moitra, Ankur (10 February 2016). "Advanced Algorithms, 6.854" (PDF). Massachusetts Institute of Technology. Archived (PDF) from the original on 13 April 2021. Retrieved 8 October 2021.
- Roughgarden, Tim; Valiant, Gregory (28 March 2021). "The Modern Algorithmic Toolbox, Introduction to Consistent Hashing" (PDF). Stanford University. Archived (PDF) from the original on 25 July 2021. Retrieved 7 October 2021.
External links
- Understanding Consistent hashing
- Consistent hashing by Michael Nielsen on June 3, 2009
- Consistent Hashing, Danny Lewin, and the Creation of Akamai
- Jump Consistent Hashing: A Fast, Minimal Memory, Consistent Hash Algorithm
- Rendezvous Hashing: an alternative to Consistent Hashing
- Implementations in various languages:
Consistent hashing
History
Origins in Distributed Systems
Consistent hashing emerged in 1997 from research at the Massachusetts Institute of Technology (MIT) aimed at improving distributed caching mechanisms for the burgeoning World Wide Web.[1] The technique was developed by David Karger, Eric Lehman, Tom Leighton, Matthew R. Levine, Daniel Lewin, and Rina Panigrahy to enable scalable, decentralized caching protocols that could adapt to dynamic network conditions without excessive reconfiguration.[5] This invention was detailed in the seminal paper "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web," presented at the 29th Annual ACM Symposium on Theory of Computing (STOC 1997).[5]

The work addressed the limitations of traditional hashing methods in distributed environments, where frequent changes in the set of available cache nodes—such as proxies joining or leaving—would otherwise require remapping nearly all keys, leading to high overhead and instability.[1] By introducing a "smooth" hashing scheme, the authors ensured that only an expected fraction of keys proportional to the change in the system needed reassignment, specifically bounding the number of remapped keys to O(n/m), where n represents the total number of keys and m the number of nodes.[1]

The primary motivation stemmed from the problem of hot spots in early web infrastructure, where a surge in requests for popular content—such as the Jet Propulsion Laboratory's site following the 1994 Shoemaker-Levy 9 comet impact or IBM's pages during the 1997 Deep Blue chess matches—overloaded individual servers or shared proxy caches, causing widespread delays and failures.[1] In distributed proxy systems, uneven load distribution exacerbated these issues, as conventional uniform hashing failed to balance traffic effectively across varying numbers of caches, often concentrating requests on a few overloaded nodes.[1] Consistent hashing provided a solution by mapping both keys and nodes to a fixed circular space, minimizing global recomputation and enabling local decision-making for cache placement, thus alleviating hot spots while supporting the Web's rapid growth.[1]

Evolution and Key Milestones
Following its initial proposal, consistent hashing saw early adoption in 1998 by Akamai Technologies, founded by the technique's co-inventors, who applied it for load balancing in their content delivery network (CDN) to distribute web caching across global servers and minimize disruptions from node changes.[6] This implementation addressed hot spots in web traffic, enabling scalable delivery of high-demand content like the 1999 Star Wars trailer, which handled millions of requests without overload.[7]

In the 2000s, consistent hashing expanded into peer-to-peer (P2P) networks and distributed hash tables (DHTs), powering decentralized systems for efficient key lookups. A key milestone was its integration into Chord, a scalable P2P lookup protocol introduced in 2001, which used consistent hashing to map keys to nodes on a ring, supporting dynamic joins and departures with only an O(1/n) fraction of keys affected, where n is the number of nodes.[8] Subsequent DHT variants, such as Pastry and Tapestry, built on this foundation, adapting consistent hashing for routing in large-scale, fault-tolerant overlays.

By 2007, consistent hashing gained prominence in NoSQL databases through Amazon's Dynamo, a highly available key-value store that employed it to partition data across nodes while prioritizing availability over strict consistency in distributed environments.[3] Dynamo's design influenced subsequent systems by demonstrating how consistent hashing, combined with virtual nodes, could balance load and enable seamless scaling in production settings handling petabytes of data.

The 2010s brought refinements for heterogeneous clusters, notably in Apache Cassandra, which from its 2008 origins drew from Dynamo but introduced virtual nodes (vnodes) in 2013 to approximate weighted consistent hashing.[9] This allowed nodes with varying capacities to receive proportional token assignments on the hash ring, improving data distribution and reducing imbalances in multi-node setups without manual intervention.[10]

More recent milestones include its adoption in service meshes, such as Envoy proxy's ring hash load balancer, which supports gRPC traffic by using consistent hashing for sticky sessions and fault-tolerant routing across microservices.[11] Similarly, in 2017, Discord scaled to 5 million concurrent users by implementing a custom consistent hashing scheme in Elixir for distributing chat state across nodes, ensuring low-latency presence updates during rapid growth.[12] As of 2025, consistent hashing is emerging in blockchain sharding protocols for decentralized storage, with weight-based variants proposed to allocate shards across heterogeneous edge nodes, enhancing scalability in systems like hierarchical sharding frameworks.

Core Concepts
The Hash Ring and Mapping
In consistent hashing, the hash space is conceptualized as a fixed-size circular structure, often referred to as the hash ring, spanning from 0 to 2^32 − 1 to accommodate standard 32-bit hash outputs, though the original formulation uses a unit interval [0, 1] scaled from a large modulus.[13] A consistent hash function h maps both keys (such as data identifiers) and nodes (such as servers) to points on this ring, ensuring uniform distribution assuming h is a strong hash like MD5 truncated to the ring size.[13] This ring topology derives its consistency property from the circular arrangement, which minimizes disruptions during dynamic changes in the node set, building on basic hashing principles where collisions are resolved spatially rather than via traditional modulo partitioning.[13]

Nodes are positioned on the ring at their hashed values h(node), sorted in clockwise order around the circle. For a given key k hashed to position h(k), the key is assigned to the successor node, defined as the first node encountered when traversing clockwise from h(k).[13] Mathematically, the successor of k is the node n that minimizes h(n) subject to h(n) ≥ h(k), with wrapping around to the node with the smallest h(n) if no node lies clockwise from h(k) before completing the circle.[13] This clockwise successor rule ensures that each key maps to exactly one node, partitioning the key space into contiguous arcs owned by each node.

The ring's design achieves consistency by localizing remappings during node additions or removals. When a node is added, only the keys in the arc between the new node's position and its immediate clockwise predecessor are reassigned to the new node; similarly, node removal affects only the arc it previously owned, which is redistributed to its successor.[13] On average, this impacts K/N keys, where K is the total number of keys and N the number of nodes, compared to the full remapping of K keys in modulo-based schemes.[13] This bounded disruption supports scalable distributed systems like caching protocols, where node churn is frequent.[13]

Key-to-Node Assignment Process
In consistent hashing, the assignment of a key to a node begins with computing the hash of the key, which maps it to a position on the circular hash ring. The key is then assigned to the node whose hashed position is the closest successor in the clockwise direction from the key's position on the ring.[1] This lookup process ensures that each key is deterministically routed to exactly one node without requiring global state synchronization among the nodes.[1]

When adding a new node, its identifier is hashed to determine its position on the ring, and it is inserted at that point. Only the keys whose positions fall within the arc between the new node's predecessor and the new node itself are remapped to the new node, as these are the keys for which the new node becomes the closest successor.[1] This localized remapping minimizes disruption, with the expected fraction of affected keys being approximately 1/(N+1), where N is the current number of nodes.[1]

Node removal follows a symmetric process: the removed node's position is deleted from the ring, and its arc is merged with that of its successor. All keys previously assigned to the removed node are reassigned to this successor node.[1] The expected fraction of keys remapped in this case is also approximately 1/N, where N is the number of nodes before removal.[1]

For instance, consider a system with three evenly spaced nodes A, B, and C on the ring. Adding a fourth node D at a random position will split one of the existing arcs, remapping approximately one-fourth of the keys to D.[1]

This assignment mechanism provides deterministic key-to-node mappings based on fixed hash positions, eliminating the need for state replication across nodes and thereby supporting partition tolerance in distributed systems by allowing operations to proceed with partial or inconsistent views of the node set while bounding load deviations.[1]
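The stated behaviour can be checked with a small Python simulation; the node names, key counts, use of MD5, and a single ring position per node are all assumptions of the example. With N = 10 nodes, adding an eleventh node reassigns, on average over random node positions, roughly 1/(N+1) ≈ 9% of the keys.

import bisect
import hashlib

RING_SIZE = 2 ** 32

def h(label: str) -> int:
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % RING_SIZE

def owner(positions, names, key):
    i = bisect.bisect_left(positions, h(key))
    return names[positions[i % len(positions)]]

def build(node_list):
    names = {h(n): n for n in node_list}
    return sorted(names), names

nodes = [f"node-{i}" for i in range(10)]               # N = 10
keys = [f"key-{i}" for i in range(50_000)]

pos_before, names_before = build(nodes)
pos_after, names_after = build(nodes + ["node-new"])   # N + 1 = 11

moved = sum(1 for k in keys
            if owner(pos_before, names_before, k) != owner(pos_after, names_after, k))
# The moved keys are exactly those falling in the new node's arc; the exact
# figure varies with the node positions, but averages about 1/11 of the keys.
print(f"{moved / len(keys):.1%} of keys were reassigned")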
Implementation
Algorithmic Procedures
The hash ring in consistent hashing is typically maintained as a sorted collection of hash values for nodes (or their virtual representatives), implemented using a balanced binary search tree such as a red-black tree or a sorted set data structure, which enables O(log N) time complexity for lookups and updates, where N denotes the number of points on the ring.[14] This structure stores pairs of hash values and associated node identifiers, ordered by the hash values to facilitate efficient successor queries in the circular space.[15]

To construct the ring, each physical node generates multiple virtual node identifiers (e.g., by appending a sequence number to the node name), hashes each identifier using a cryptographic hash function like MD5 or SHA-1 to produce a uniform distribution over a large fixed space (e.g., 128 or 160 bits), applies modulo arithmetic to map values onto the ring (0 to 2^m - 1), and inserts the resulting points into the sorted data structure while associating them with the original node.[3][15] This process ensures even distribution and is performed once during initialization or when the node set changes.[5]

Key-to-node lookup proceeds by hashing the key with the same function to obtain h(k), then querying the sorted structure for the ceiling (successor) of h(k) to identify the responsible node; the ring's circularity is handled by wrapping to the minimum value if no successor exists.[14] The following pseudocode illustrates this procedure:

function findNode(key):
h_k = hash(key) // e.g., MD5(key) modulo ring_size
successor = bst.ceiling(h_k) // Find smallest entry >= h_k
if successor is null:
successor = bst.minimum() // Wrap around to first entry
return successor.node // Return associated node
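A runnable Python equivalent of this procedure is sketched below, using the standard bisect module on a sorted list in place of a balanced BST; the node identifiers, the ring size of 2^32, and the choice of 100 virtual nodes per node are illustrative assumptions.

import bisect
import hashlib

RING_SIZE = 2 ** 32   # m = 32

def md5_point(label: str) -> int:
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % RING_SIZE

class Ring:
    def __init__(self, nodes, vnodes_per_node=100):
        # Construction: hash "node:i" for every virtual node and keep the
        # (position, node) pairs sorted, as the BST would.
        self.entries = sorted(
            (md5_point(f"{node}:{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )

    def find_node(self, key: str) -> str:
        # Ceiling lookup with wrap-around, mirroring the pseudocode above.
        h_k = md5_point(key)
        i = bisect.bisect_left(self.entries, (h_k, ""))
        if i == len(self.entries):     # no successor: wrap to the minimum entry
            i = 0
        return self.entries[i][1]

ring = Ring(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(ring.find_node("user:12345"))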
Virtual Nodes for Balance
In consistent hashing, virtual nodes address the potential for uneven distribution of keys across physical nodes by replicating each physical node multiple times on the hash ring. Specifically, for a system with N physical nodes, each node n is represented by v virtual nodes, where the hash values for these virtual nodes are computed as h(n || i) for i = 1 to v, with || denoting concatenation. These vN points are then placed on the ring, effectively subdividing the responsibility of each physical node into smaller, more evenly spaced segments. This approach spreads the positions of a single physical node's responsibilities more uniformly around the ring, reducing clustering that can occur with a single hash point per node.[1]

The primary benefit of virtual nodes is improved load balancing, where the maximum load on any node—measured as the deviation from the ideal share of 1/N—drops from O(log N) with high probability in basic consistent hashing to O(1) as v grows sufficiently large, such as v = Θ(log N). This probabilistic guarantee ensures that no single node bears a disproportionately large share of keys, enhancing overall system stability and performance under varying workloads. In practice, systems like Amazon's Dynamo employ virtual nodes to achieve near-uniform distribution, with empirical imbalance ratios as low as 10% even under heterogeneous node capacities.[1][3]

Implementation involves generating the v hash points for each physical node and inserting them into a balanced binary search tree (BST) that represents the ordered ring, allowing efficient successor lookups for key assignments. For a key k, its position h(k) is located in the BST, and the closest virtual node clockwise determines the responsible physical node. This maintains the core ring mapping from basic consistent hashing while distributing assignments more finely.[1]

However, increasing v introduces trade-offs: while balance improves, storage requirements grow to O(Nv) for maintaining the BST, and lookup times increase marginally from O(log N) to O(log(Nv)). The expected load per physical node remains approximately 1/N, with load variance reduced to roughly 1/(Nv), providing tighter concentration around the mean as v rises. This formulation highlights how virtual nodes statistically average out irregularities in hash distributions.[1][3]
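The effect of v on balance can be checked empirically with a short Python experiment (the choice of 50 nodes, 100,000 keys, the label format, and MD5 are arbitrary assumptions of the example): as v grows, the maximum-to-average load ratio approaches 1.

import bisect
import hashlib
from collections import Counter

RING_SIZE = 2 ** 32

def pt(label: str) -> int:
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % RING_SIZE

def max_over_avg(num_nodes: int, v: int, num_keys: int = 100_000) -> float:
    entries = sorted((pt(f"n{n}:{i}"), n) for n in range(num_nodes) for i in range(v))
    positions = [p for p, _ in entries]
    loads = Counter()
    for k in range(num_keys):
        i = bisect.bisect_left(positions, pt(f"key{k}")) % len(positions)
        loads[entries[i][1]] += 1
    return max(loads.values()) / (num_keys / num_nodes)

for v in (1, 10, 100):
    print(f"v = {v:3d}: max/avg load = {max_over_avg(50, v):.2f}")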
Advanced Techniques
Load Balancing Methods
Consistent hashing relies on the uniform hashing assumption, where keys are distributed evenly across the hash space, enabling balanced load distribution among nodes without requiring knowledge of the total number of nodes.[1] However, in practice, this assumption often fails due to non-uniform key distributions, such as skewed access patterns in real-world workloads, leading to load imbalances where certain nodes handle disproportionately more keys or requests.[3]

To enhance balance while supporting replication, consistent hashing assigns each key to multiple replicas by mapping it to the r nearest successor nodes in the clockwise direction on the hash ring, where r is the replication factor (typically 3 in production systems). This approach not only provides fault tolerance but also contributes to load balancing by spreading replicas across distinct nodes, reducing the impact of node failures on overall distribution.[3]

Virtual nodes, as introduced in practical extensions, improve balance by representing each physical node with multiple points on the ring, approximating uniform spacing and mitigating the effects of non-uniform hashing.[16] Empirical tuning of the number of virtual nodes per physical node (v) is essential for practical deployment; for instance, with N=100 physical nodes, setting v between 100 and 1000 yields a maximum load less than 1.2 times the average load, balancing distribution quality against lookup overhead.[3] Load balance is typically measured using the standard deviation of key counts across nodes or the ratio of maximum to average load; in evaluated systems, increasing virtual nodes reduces this standard deviation, achieving ratios as low as 1.18 with v=100.[3]

Mitigating Imbalances and Hotspots
In consistent hashing, hotspots arise when a small subset of popular keys, such as frequently accessed binary large objects (BLOBs) in content delivery networks, concentrate requests on a few nodes, leading to overload despite the use of virtual nodes for improved balance.[17] This skew often follows Zipf-like distributions in real-world workloads, where a few items account for the majority of accesses, exacerbating load imbalances on the hash ring.[17]

To mitigate hotspots, systems employ dynamic reassignment techniques, where hot keys are identified through monitoring and migrated to additional nodes or replicated across multiple replicas beyond the standard ring assignment. This reactive approach spreads request traffic, often combined with dedicated caching layers at the application level to absorb bursts without altering the core hashing structure. For instance, requests for hot content can be routed via tree-based protocols overlaid on the hash ring, ensuring distribution to O(log n) caches with high probability, where n is the number of caches.[17]

For known or anticipated hotspots, biased loading adjustments incorporate rendezvous hashing elements into the consistent hashing framework, applying rendezvous computations selectively to hot keys while retaining the ring for general traffic to maintain uniformity and minimize remapping. Rendezvous hashing, by selecting the node with the highest random weight for a key-node pair, provides finer-grained control over load distribution for skewed items without requiring ring modifications. Virtual nodes further reduce variance in load distribution, with probabilistic guarantees ensuring balanced assignment.[18] In practice, Akamai's content delivery network uses consistent hashing to balance load within clusters of servers.[19]

Extensions
Weighted and Biased Variants
Weighted consistent hashing addresses scenarios where nodes possess heterogeneous capacities, ensuring that the load assigned to each node is proportional to its processing or storage capability. In this approach, each physical node i with capacity c_i is represented by a number of virtual nodes v_i on the hash ring, where v_i is proportional to c_i (for example, v_i = V · c_i / Σ_j c_j) and V is the total number of virtual nodes in the system.[3] These virtual nodes are placed by hashing the node's identifier concatenated with an index for each virtual instance, distributing them evenly across the ring to approximate uniform load per capacity unit.[3] This method builds on the use of virtual nodes for balance, adapting their count to reflect capacity differences rather than treating all nodes equally.[3]

A well-known implementation of weighted consistent hashing is the Ketama library, which employs a continuum-based technique to position weighted points smoothly on the ring. In Ketama, weights determine the density of points for each node—higher-capacity nodes receive more points, generated via MD5 hashing of the node name and point indices for even spacing.[20] This ensures that key assignments favor higher-capacity nodes proportionally, minimizing load imbalances in production environments like distributed caches.[20]

Biased variants further modify the ring to handle non-uniform costs beyond simple capacity. In CDNs, this can involve popularity-aware biasing, where frequently requested content is preferentially mapped to multiple servers to reduce effective latency.[21]

Despite these benefits, weighted and biased variants introduce drawbacks, particularly increased remapping complexity when node weights or biases change. Updating the ring requires recomputing and repositioning multiple virtual nodes, which can temporarily elevate coordination overhead and disrupt ongoing assignments in dynamic clusters.[22]
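A sketch of the capacity-proportional assignment described above follows in Python; the total of 1,000 virtual nodes, the label format, and the measurement by arc length are assumptions of the example, not the actual Ketama algorithm. Node i receives round(V · c_i / Σ c_j) virtual nodes, and its share of the circle then tracks its share of the total capacity.

import hashlib

RING_SIZE = 2 ** 32
TOTAL_VNODES = 1_000   # V

def pt(label: str) -> int:
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % RING_SIZE

def build_weighted_ring(capacities):
    total = sum(capacities.values())
    entries = []
    for node, capacity in capacities.items():
        v_i = max(1, round(TOTAL_VNODES * capacity / total))
        entries.extend((pt(f"{node}#{i}"), node) for i in range(v_i))
    return sorted(entries)

def ring_share(entries):
    # Fraction of the circle owned by each node: each point owns the arc
    # that ends at it and starts at the previous point (with wrap-around).
    share = {}
    for idx, (pos, node) in enumerate(entries):
        prev = entries[idx - 1][0]
        arc = (pos - prev) % RING_SIZE
        share[node] = share.get(node, 0) + arc / RING_SIZE
    return share

print(ring_share(build_weighted_ring({"big-node": 4.0, "small-node": 1.0})))
# big-node ends up owning roughly 80% of the circle, matching its capacity share.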
Integration in Modern Frameworks
Consistent hashing has been integrated into several open-source libraries that facilitate its use in distributed systems. The libketama library, implemented in C, provides a robust framework for consistent hashing with support for weighted node assignments, enabling efficient key distribution across servers in caching setups like Memcached.[20][23] In Java, the Spymemcached client library incorporates consistent hashing for Memcached clusters, allowing seamless load balancing and minimal data remapping during node additions or failures.[24] For Python developers, libraries such as hash_ring offer pure-Python implementations of consistent hashing, suitable for sharding in distributed caches and databases with low overhead.[25]

Modern frameworks have adopted consistent hashing for advanced load balancing scenarios. The Envoy proxy, integrated with gRPC since its 2021 updates, employs ring-based consistent hashing for HTTP/2 traffic routing, ensuring sticky sessions and efficient distribution across upstream hosts in microservices environments.[11][26] This approach minimizes disruptions in service meshes by maintaining request affinity even as cluster topology changes.

Protocol extensions in content delivery networks (CDNs) leverage consistent hashing alongside emerging transport protocols. Post-2020 developments in QUIC-enabled CDNs utilize consistent hashing for server selection, combining QUIC's low-latency multiplexing to route client requests to optimal edge nodes while preserving consistency during dynamic scaling.

As of 2025, consistent hashing trends toward deeper integration in serverless computing paradigms. In AWS Lambda, it partitions asynchronous invocations across execution environments using consistent hashing, enabling scalable handling of billions of requests with reduced placement churn and improved reliability for event-driven workloads.[27]

Customizations of consistent hashing extend to consensus protocols for fault tolerance. Variants of the Raft consensus algorithm incorporate consistent hashing to manage data placement and leader election during failures. This adaptation builds on weighted variants to balance loads in replicated logs, ensuring quorum stability without full rehashing.

Analysis and Comparisons
Complexity Measures
In consistent hashing, the lookup operation to determine the successor node for a given key typically involves searching an ordered structure of node positions on the hash circle. When implemented using a balanced binary search tree (BST) to store the hashed positions of nodes (or their virtual nodes), the lookup time complexity is O(log N), where N is the number of physical nodes.[15] Alternatively, maintaining a sorted array of these positions enables binary search for lookups in O(log(Nv)) time, where v is the number of virtual nodes per physical node.[28]

Node addition or removal requires updating the ordered structure by inserting or deleting the corresponding virtual node positions, which takes O(v log(Nv)) time in a BST implementation (and comparable search time, ignoring element-shifting costs, in a sorted array). Additionally, only about a 1/N fraction of keys need remapping to new nodes, leading to an expected remapping cost of O(K/N) operations, where K is the total number of keys stored across the system. This arises because the addition or removal of a single node affects approximately 1/N of the hash circle, redistributing that share of keys.[15] Formally, the expected fraction of keys remapped upon such a change is approximately |ΔS|/|S|, where S is the current set of node positions on the circle and ΔS is the change due to the node operation, assuming uniform hashing and balanced virtual node placement.[1]

The space complexity for maintaining the hash ring structure is O(Nv), as each of the N nodes contributes v virtual positions that must be stored and ordered for efficient lookups. This overhead scales linearly with the number of virtual nodes used to improve load balance but remains modest compared to the total key storage.[28]

Overall, these measures render consistent hashing superior to simple modulo-based hashing schemes, where node changes necessitate remapping all K keys, potentially causing significant disruption in large-scale systems.[1]

Versus Alternative Hashing Schemes
Consistent hashing addresses key limitations of traditional simple modulo hashing, which assigns keys to nodes via a hash function modulo the current number of nodes n. When n changes—such as during node addition or failure—simple modulo hashing necessitates remapping nearly all K keys across the system, incurring O(K) computational cost and disrupting consistency by invalidating most prior assignments.[1] In contrast, consistent hashing localizes disruptions, remapping only O(K/n) keys in expectation, thereby maintaining higher availability and reducing overhead in dynamic clusters.[1]

Rendezvous hashing, proposed by Thaler and Ravishankar in 1996 as a name-based mapping scheme, provides an alternative to consistent hashing by selecting nodes via the highest random weight from pairwise key-node hash comparisons, typically using simple functions like XOR for efficiency.[29] Unlike consistent hashing's ring-based structure, rendezvous hashing avoids maintaining a shared coordinate space, enabling stateless operation where each client independently computes assignments without global synchronization.[29] This simplicity suits decentralized environments, though it demands O(n) hash evaluations per key lookup—scaling poorly with node count n—compared to consistent hashing's O(log n) lookup time when augmented with ordered node lists or trees.[29]

Consistent hashing also incorporates monotonicity as a core property: upon adding nodes, keys migrate only to new nodes and never rearrange among existing ones, preserving assignment stability.[1] Consistent hashing trades some order strictness for efficiency, using the ring to approximate balance with lower memory footprint, avoiding the full spatial overhead of purely monotonic designs.[29]

Key trade-offs between consistent hashing and alternatives like rendezvous hashing center on coordination and performance: the ring structure in consistent hashing requires distributed agreement on node positions, introducing synchronization overhead in highly volatile networks, while rendezvous hashing's stateless design eliminates this at the cost of repeated computations per operation.[29] Both achieve comparable load balance under uniform distributions, but consistent hashing better mitigates hotspots through virtual nodes, whereas rendezvous relies on hash quality for evenness.[29]

Consistent hashing is typically chosen for structured systems like databases, where its ring enables scalable partitioning with minimal remapping during growth.[30] Rendezvous hashing, conversely, excels in simplicity-driven scenarios such as IoT and sensor networks, where stateless coordination supports swarm-like, low-resource deployments without centralized state.[31]

Applications
In Distributed Storage Systems
Consistent hashing plays a pivotal role in distributed storage systems, particularly NoSQL databases and key-value stores, by enabling efficient data partitioning across nodes in a cluster. It maps keys to positions on a virtual ring, assigning ownership to nodes based on their positions, which facilitates horizontal scaling and fault tolerance without requiring full data reshuffling during cluster changes. This approach is foundational in systems prioritizing availability and partition tolerance over strict consistency, as per the CAP theorem implications in such architectures.[3]

Amazon Dynamo, introduced in 2007, employs consistent hashing for ring-based partitioning of its key space, using virtual nodes to achieve load balancing and uniform distribution. Each key is hashed to a point on the ring, with the successor node responsible for storage; to handle heterogeneous node capacities, Dynamo assigns multiple virtual nodes per physical node, typically 100-1000 per machine. Replication is managed with N=3 replicas by default, where each write operation targets the primary node and its N-1 successors on the ring, ensuring data availability even during node failures. This design minimizes the impact of node additions or removals, as only a fraction of keys (approximately 1/N) need remapping.[3]

Apache Cassandra, released in 2010, builds on similar principles with a token-based ring structure using consistent hashing to partition data across nodes. Keys are hashed to tokens within a fixed range (0 to 2^127 - 1 for the MD5-based RandomPartitioner used in early versions as described in the 2009 paper); each node owns a contiguous range of tokens; virtual nodes (vnodes) allow finer-grained distribution, with a default of 256 per node to balance load. Modern versions use the Murmur3 partitioner with a token range of -2^63 to 2^63 - 1. Cassandra supports tunable consistency levels, such as ONE, QUORUM, or ALL, allowing applications to trade off between availability and consistency during reads and writes. Node additions or removals trigger seamless token range transfers between neighbors, preserving data locality and enabling linear scalability.[4]

Riak, another Dynamo-inspired system, utilizes consistent hashing on a ring for data distribution, with a default ring size of 64 partitions distributed across the physical nodes, resulting in approximately 64/N virtual nodes per physical node where N is the number of nodes, to mitigate hotspots and ensure even partitioning. Like Dynamo, it replicates data across N nodes (typically 3), but incorporates active anti-entropy mechanisms using Merkle trees to detect and resolve inconsistencies across replicas in the background, promoting eventual consistency without blocking operations. This approach supports high availability in dynamic environments, where node failures are common, by periodically comparing hash trees to repair divergent data.[3][32]

The primary benefits of consistent hashing in these systems include horizontal scalability, as clusters can expand by adding nodes that absorb only a small portion of data (roughly 1/N of the ring), and minimized data movement during resizes, which reduces downtime and network overhead compared to traditional hashing schemes.
It also enhances fault tolerance by localizing the effects of node changes to adjacent segments, allowing the system to maintain operations with minimal disruption.[3]

A key challenge is propagating ring membership changes across the cluster, addressed via gossip protocols where nodes periodically exchange state information with random peers, achieving eventual consistency in topology awareness. While effective for decentralization, gossip can introduce temporary inconsistencies in routing during high churn, requiring careful tuning of intervals to balance convergence speed and overhead.[3]

In Content Delivery and Networking
Consistent hashing plays a crucial role in content delivery networks (CDNs) by enabling efficient distribution of user requests across edge servers, minimizing disruptions from server additions or failures. In Akamai's early implementation around 1998, consistent hashing was employed for selecting edge servers to serve web content, which helps reduce cache misses by ensuring that requests for the same content are consistently routed to the same server when possible, thereby improving hit rates and overall performance.[17] This approach addresses hot spots in web caching by mapping both content keys and server identifiers to a hash ring, allowing dynamic scaling without widespread remapping of requests.[17]

In modern real-time communication platforms, consistent hashing facilitates low-latency sharding of users across distributed servers. For instance, Discord adopted a custom consistent hashing ring in 2017 to shard over 5 million concurrent users across voice servers, which routes users to nearby servers and significantly reduces join latency for voice channels.[12] By hashing user identifiers onto the ring and assigning virtual nodes to servers, this method ensures balanced load distribution and minimal reshuffling during server changes, supporting seamless scaling for high-traffic voice interactions.[12]

For microservices architectures, weighted consistent hashing enhances routing in service meshes and proxies. Envoy Proxy utilizes a ring hash load balancer that incorporates endpoint weights to route requests across upstream services, enabling handling of thousands of requests per second (RPS) while preserving session affinity.[33] This weighted variant adjusts server capacities on the hash ring, directing more traffic to higher-capacity nodes and maintaining consistency in microservices environments like those using gRPC for inter-service communication.[33]

Other caching and proxy systems also leverage consistent hashing for upstream load balancing in networking contexts. Varnish Cache employs consistent hashing through its directors module to shard requests across backend servers, promoting even distribution and cache efficiency in high-throughput scenarios.[34] Similarly, Nginx's upstream module supports ketama-based consistent hashing via the hash directive with the consistent parameter, which routes HTTP traffic to backends while minimizing key remappings upon failures.[35]
A key performance advantage in these networking applications is the reduction of remapped requests to less than 1% during node failures or additions, as only a fraction proportional to 1/n (where n is the number of nodes) of the keys are affected, thereby enhancing system availability and reducing latency spikes.[17] This property proves particularly valuable in CDNs and load balancers, where maintaining request affinity directly impacts user experience and resource utilization.