Erasure code

from Wikipedia

In coding theory, an erasure code is a forward error correction (FEC) code under the assumption of bit erasures (rather than bit errors), which transforms a message of k symbols into a longer message (code word) with n symbols such that the original message can be recovered from a subset of the n symbols. The fraction r = k/n is called the code rate. The fraction k’/k, where k’ denotes the number of symbols required for recovery, is called reception efficiency. The recovery algorithm expects that it is known which of the n symbols are lost.

History

Erasure coding was invented by Irving Reed and Gustave Solomon in 1960.[1]

There are many different erasure coding schemes. The most popular erasure codes are Reed–Solomon coding, low-density parity-check (LDPC) codes, and turbo codes.[1]

As of 2023, modern data storage systems can be designed to tolerate the complete failure of a few disks without data loss, using one of 3 approaches:[2][3][4]

  • Replication
  • RAID
  • Erasure Coding

While technically RAID can be seen as a kind of erasure code,[5] "RAID" is generally applied to an array attached to a single host computer (which is a single point of failure), while "erasure coding" generally implies multiple hosts,[3] sometimes called a Redundant Array of Inexpensive Servers (RAIS). The erasure code allows operations to continue when any one of those hosts stops.[4][6]

Compared to block-level RAID systems, object storage erasure coding has some significant differences that make it more resilient.[7][8][9][10][11]

Optimal erasure codes

Optimal erasure codes have the property that any k out of the n code word symbols are sufficient to recover the original message (i.e., they have optimal reception efficiency). Optimal erasure codes are maximum distance separable codes (MDS codes).

Parity check

Parity check is the special case where n = k + 1. From a set of k values \{v_i\}_{1 \le i \le k}, a checksum is computed and appended to the k source values:

v_{k+1} = -\sum_{i=1}^{k} v_i

The set of k + 1 values \{v_i\}_{1 \le i \le k+1} is now consistent with regard to the checksum. If one of these values, v_e, is erased, it can be easily recovered by summing the remaining values:

v_e = -\sum_{i=1,\, i \ne e}^{k+1} v_i

RAID 5 is a widely used application of the parity check erasure code using XOR.[1]
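As a quick illustration, here is a minimal Python sketch of the checksum construction above over integers modulo 256; the function names are illustrative. Replacing modular addition with XOR (where every value is its own negative) gives exactly the RAID 5 parity rule.

```python
M = 256  # symbol alphabet size; all arithmetic is modulo M

def add_checksum(values):
    """Append v_{k+1} = -sum(v_i) mod M so that the k+1 values sum to zero."""
    return values + [(-sum(values)) % M]

def recover_erasure(values, erased_index):
    """Rebuild the single erased value by summing the surviving ones."""
    survivors = [v for i, v in enumerate(values) if i != erased_index]
    return (-sum(survivors)) % M

block = add_checksum([10, 200, 33, 97])   # k = 4 source values plus checksum
assert sum(block) % M == 0
assert recover_erasure(block, 2) == 33
# With XOR in place of modular addition, the same rule yields the RAID 5
# parity: a lost block is simply the XOR of all surviving blocks.
```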

Polynomial oversampling

Example: Err-mail (k = 2)

In the simple case where k = 2, redundancy symbols may be created by sampling different points along the line between the two original symbols. This is pictured with a simple example, called err-mail:

Alice wants to send her telephone number (555629) to Bob using err-mail. Err-mail works just like e-mail, except

  1. About half of all the mail gets lost.
  2. Messages longer than 5 characters are illegal.
  3. It is very expensive (similar to air-mail).

Instead of asking Bob to acknowledge the messages she sends, Alice devises the following scheme.

  1. She breaks her telephone number up into two parts a = 555, b = 629, and sends 2 messages – "A=555" and "B=629" – to Bob.
  2. She constructs a linear function, f(i), in this case f(i) = a + (b − a)(i − 1), such that f(1) = 555 and f(2) = 629.
  3. She computes the values f(3), f(4), and f(5), and then transmits three redundant messages: "C=703", "D=777" and "E=851".

Bob knows that the form of f(i) is a + (b − a)(i − 1), where a and b are the two parts of the telephone number. Now suppose Bob receives "D=777" and "E=851".

Bob can reconstruct Alice's phone number by computing the values of a and b from the values (f(4) and f(5)) he has received. Bob can perform this procedure using any two err-mails, so the erasure code in this example has a rate of 40%.
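A short Python sketch of the err-mail arithmetic, using ordinary integers; all names are illustrative:

```python
def f(i, a, b):
    """Alice's line through (1, a) and (2, b)."""
    return a + (b - a) * (i - 1)

a, b = 555, 629
messages = {i: f(i, a, b) for i in range(1, 6)}   # {1:555, 2:629, 3:703, 4:777, 5:851}

# Bob receives any two messages, e.g. D (i=4) and E (i=5), and solves for a and b.
i1, i2 = 4, 5
y1, y2 = messages[i1], messages[i2]
slope = (y2 - y1) // (i2 - i1)          # = b - a
a_rec = y1 - slope * (i1 - 1)           # f(1)
b_rec = a_rec + slope                   # f(2)
assert (a_rec, b_rec) == (555, 629)
```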

Note that Alice cannot encode her telephone number in just one err-mail, because it contains six characters, and that the maximum length of one err-mail message is five characters. If she sent her phone number in pieces, asking Bob to acknowledge receipt of each piece, at least four messages would have to be sent anyway (two from Alice, and two acknowledgments from Bob). So the erasure code in this example, which requires five messages, is quite economical.

This example is a little bit contrived. For truly generic erasure codes that work over any data set, we would need something other than the f(i) given.

General case

The linear construction above can be generalized to polynomial interpolation. Additionally, points are now computed over a finite field.

First we choose a finite field F with order of at least n, but usually a power of 2. The sender numbers the data symbols from 0 to k − 1 and sends them. He then constructs a (Lagrange) polynomial p(x) of degree less than k such that p(i) is equal to data symbol i. He then sends p(k), ..., p(n − 1). The receiver can now also use polynomial interpolation to recover the lost packets, provided he receives k symbols successfully. If the order of F is less than 2^b, where b is the number of bits in a symbol, then multiple polynomials can be used.

The sender can construct symbols k to n − 1 'on the fly', i.e., distribute the workload evenly between transmission of the symbols. If the receiver wants to do his calculations 'on the fly', he can construct a new polynomial q, such that q(i) = p(i) if symbol i < k was received successfully and q(i) = 0 when symbol i < k was not received. Now let r(i) = p(i) − q(i). Firstly we know that r(i) = 0 if symbol i < k has been received successfully. Secondly, if symbol i ≥ k has been received successfully, then r(i) = p(i) − q(i) can be calculated. So we have enough data points to construct r and evaluate it to find the lost packets. So both the sender and the receiver require O(n (n − k)) operations and only O(n − k) space for operating 'on the fly'.
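The following Python sketch illustrates the general construction over the prime field GF(257) rather than the power-of-two fields typical in practice, which keeps the arithmetic simple; the parameters and names are illustrative only:

```python
P = 257  # prime field order, at least n

def interpolate(points, x):
    """Lagrange-interpolate the unique degree-<k polynomial through `points`, evaluated at x mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

k, n = 4, 7
data = [12, 34, 56, 78]                       # data symbols 0..k-1, so p(i) = data[i]
source_points = list(enumerate(data))
codeword = data + [interpolate(source_points, x) for x in range(k, n)]  # append p(k)..p(n-1)

# Receiver: any k of the n symbols (with known positions) suffice to rebuild the data.
received = [(x, codeword[x]) for x in (1, 3, 5, 6)]
recovered = [interpolate(received, x) for x in range(k)]
assert recovered == data
```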

Real world implementation

This process is implemented by Reed–Solomon codes, with code words constructed over a finite field using a Vandermonde matrix.

Most practical erasure codes are systematic codes—each one of the original k symbols can be found copied, unencoded, as one of the n message symbols.[12] (Erasure codes that support secret sharing never use a systematic code).

Near-optimal erasure codes

Near-optimal erasure codes require (1 + ε)k symbols to recover the message (where ε>0). Reducing ε can be done at the cost of CPU time. Near-optimal erasure codes trade correction capabilities for computational complexity: practical algorithms can encode and decode with linear time complexity.

Fountain codes (also known as rateless erasure codes) are notable examples of near-optimal erasure codes. They can transform a k symbol message into a practically infinite encoded form, i.e., they can generate an arbitrary amount of redundancy symbols that can all be used for error correction. Receivers can start decoding after they have received slightly more than k encoded symbols.

Regenerating codes address the issue of rebuilding (also called repairing) lost encoded fragments from existing encoded fragments. This issue occurs in distributed storage systems where communication to maintain encoded redundancy is a problem.[12]

Applications of erasure coding in storage systems

Erasure coding is now standard practice for reliable data storage.[13][14][15] In particular, various implementations of Reed-Solomon erasure coding are used by Apache Hadoop, the RAID-6 built into Linux, Microsoft Azure, Facebook cold storage, and Backblaze Vaults.[15][12]

The classical way to recover from failures in storage systems was to use replication. However, replication incurs significant overhead in terms of wasted bytes. Therefore, increasingly large storage systems, such as those used in data centers, use erasure-coded storage. The most common form of erasure coding used in storage systems is the Reed-Solomon (RS) code, which enables regeneration of missing data from pieces of known data together with redundant pieces called parity blocks. In a (k, r) RS code, a given set of k data blocks, called "chunks", is encoded into (k + r) chunks. The total set of chunks comprises a stripe. The coding is done such that as long as at least k out of the (k + r) chunks are available, the entire data can be recovered. This means a (k, r) RS-encoded storage can tolerate up to r failures. (This differs from the standard RS(n, k) notation, in which n = k + r.)

RS(10,4)

Example: In RS(10, 4) code, which is used in Facebook for their HDFS (now part of Apache Hadoop), 10 MB of user data is divided into ten 1MB blocks. Then, four additional 1 MB parity blocks are created to provide redundancy. This can tolerate up to 4 concurrent failures. The storage overhead here is 14/10 = 1.4×.[16]

In the case of a fully replicated system, the 10 MB of user data would have to be stored as five copies (the original plus four replicas) to tolerate up to 4 concurrent failures. The storage overhead in that case is 50/10 = 5 times.

This gives an idea of the lower storage overhead of erasure-coded storage compared to full replication and thus the attraction in today's storage systems.
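The comparison can be checked with a few lines of Python (illustrative only):

```python
def overhead(data_blocks, redundant_blocks):
    """Total stored blocks divided by raw data blocks."""
    return (data_blocks + redundant_blocks) / data_blocks

print(overhead(10, 4))   # RS(10, 4): 1.4x, survives any 4 lost blocks
print(overhead(10, 40))  # five copies of every block (replication): 5.0x
```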

The Hitchhiker scheme can be combined with RS-coding to reduce the amount of computation and data-transfer required for data block reconstruction. It is also implemented as an HDFS codec, though a policy will need to be manually defined for it to be used.[12]

Hot data

Initially, erasure codes were used to reduce the cost of storing "cold" (rarely-accessed) data efficiently; but erasure codes can also be used to improve performance serving "hot" (more-frequently-accessed) data compared to simpler redundancy schemes (mirroring).[12]

The classic example of erasure coding improving performance is RAID 5, which provides the same single-drive failure protection while requiring fewer hard drives compared to RAID 1. The extra hard drives may then be used to store more data and make use of the improved read/write-speed multiplier on RAID 5. This also applies to RAID 6 (double-redundancy: one parity and one erasure code), assuming processing power is sufficient.[1] Generalized RAID can work with any number of redundancy drives. There are two notations for generalized RAID: RAID7.x refers to a system with x redundancy drives, allowing recovery when as many as x drives fail.[17] Alternatively, RAID N+M refers to N regular data drives with M redundancy drives, being able to recover all the data when any M drives fail.[1]

A more advanced example is EC-Cache, a cluster cache, i.e. a cache distributed among several nodes. Such systems can suffer from load imbalance when one node happens to host more popular items and a common method to address this issue is selective replication, i.e. create more replicas for more popular objects. However, this method is limited by the available amount of memory. By individually dividing every object into k splits and r redundancy units, perfect load balancing can instead be achieved with minimal waste of memory.[12]

from Grokipedia
An erasure code is a forward error correction (FEC) scheme that encodes a message of k symbols into a longer codeword of n symbols (n > k), such that the original message can be recovered from any k of the n symbols even if the remaining n − k symbols are lost or erased. These codes provide fault tolerance in storage and transmission by distributing redundant parity information across multiple components, enabling reconstruction of lost fragments from the surviving ones. Unlike general error-correcting codes that handle arbitrary errors at unknown positions, erasure codes specifically address known erasures, making them efficient for scenarios like disk failures or packet loss in networks.

The foundational erasure code, the Reed–Solomon code, was invented in 1960 by Irving S. Reed and Gustave Solomon as a subclass of Bose–Chaudhuri–Hocquenghem (BCH) codes, using polynomial evaluations over finite fields to achieve maximum distance separable (MDS) properties that optimize redundancy. Early applications emerged in the 1980s for consumer technologies, such as compact discs (CDs), where interleaved Reed–Solomon codes corrected burst errors in audio data. Over time, erasure codes evolved beyond Reed–Solomon variants to include regenerating codes, which minimize repair bandwidth during node recovery, and locally recoverable codes (LRCs), which reduce the number of helper nodes needed for repairs.

In modern distributed storage systems, erasure codes offer significant advantages over traditional replication by achieving higher storage efficiency (for instance, a (14, 10) Reed–Solomon code requires only about 1.4× the original data size, compared to 3× for triple replication) while maintaining comparable reliability against failures. They are deployed in large-scale cloud environments such as Microsoft Azure, where LRCs such as the (18, 14, 7) code provide approximately 1.29× overhead and support rapid local repairs from fewer nodes, and in open-source systems like Ceph and Hadoop for handling petabyte-scale data in data centers that experience more than 50 node failures per day. Key performance metrics include encoding/decoding complexity, repair bandwidth (data downloaded for recovery), and repair degree (number of contacted nodes), which continue to drive research into hybrid and optimized constructions.

Fundamentals

Definition and Principles

Erasure codes are a subclass of error-correcting codes designed specifically to protect against data erasures, where the positions of lost data are known, enabling the recovery of the original data from a sufficient subset of the remaining encoded fragments. In this framework, k original data symbols are encoded into n total symbols, with n > k, such that any k out of the n encoded symbols can reconstruct the original data exactly. The redundancy is provided by m = n − k additional symbols, which allow the system to tolerate up to m erasures without data loss. Unlike general error-correcting codes, which must detect and correct arbitrary errors in unknown positions within noisy channels, erasure codes exploit the knowledge of erasure locations to simplify decoding and achieve higher efficiency in recovery. This distinction makes erasure decoding computationally easier, as it involves solving a system of linear equations over the known surviving symbols rather than searching for error patterns.

A fundamental limit on erasure code performance is given by the Singleton bound, which states that the minimum distance d of an (n, k) code satisfies d ≤ n − k + 1, with codes achieving equality known as maximum distance separable (MDS) codes. Erasure codes offer significant storage savings compared to replication methods, requiring less overhead while providing comparable durability; for instance, a (6, 5) code incurs only a 1.2× storage multiplier versus 2× for simple mirroring. This stems from distributing redundancy across multiple fragments, allowing recovery from any k survivors even after m failures. A simple illustrative example is a (3, 2) parity code over XOR, where two data blocks D_1 and D_2 are encoded into three blocks with the third being the parity P = D_1 ⊕ D_2; the original data can then be recovered from any two of the three blocks, such as D_1 = P ⊕ D_2 or D_2 = P ⊕ D_1.
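The closing (3, 2) example can be checked directly with a minimal Python sketch:

```python
D1, D2 = 0b10110010, 0b01101001      # two data blocks (single bytes for brevity)
P = D1 ^ D2                           # parity block of the (3, 2) code

# Any two of the three stored blocks recover the third.
assert D1 == P ^ D2
assert D2 == P ^ D1
assert P == D1 ^ D2
```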

Mathematical Foundations

Erasure codes are constructed as linear codes over finite fields, also known as Galois fields GF(q), where q is typically a power of 2 for computational efficiency in practical applications. In such fields, addition is performed using bitwise XOR operations, which correspond to vector addition modulo 2. Multiplication is more involved and is defined relative to an irreducible primitive polynomial of degree b for GF(2^b); for byte-level coding, GF(256) = GF(2^8) is commonly used, with multiplication implemented via precomputed lookup tables for logarithms and antilogarithms to enable fast exponentiation-based computation.

Linear erasure codes are defined by a generator matrix G, a k × n matrix over GF(q), where k is the dimension (number of data symbols) and n is the block length (total symbols including redundancy). A codeword is formed by multiplying a k-dimensional message vector m by G, yielding the n-dimensional codeword c = m G. In systematic form, G is structured as [I_k | P], where I_k is the k × k identity matrix and P is a k × (n − k) parity submatrix; this allows the encoded block to be E = [D | P_D], with D the k data symbols and P_D = D P the (n − k) parity symbols. The parity-check matrix H is an (n − k) × n matrix satisfying G H^T = 0, used in decoding to compute the syndrome s = r H^T for a received vector r, where non-zero syndromes indicate erasures or errors.

A fundamental limit on the performance of block codes is the Singleton bound, which states that for a code of block length n, dimension k, and minimum distance d over an alphabet of size q, the distance satisfies d ≤ n − k + 1. Codes achieving equality in this bound are known as maximum distance separable (MDS) codes, which provide the optimal trade-off between redundancy and error/erasure correction capability for a given rate. The efficiency of an erasure code is characterized by its rate R = k/n, which represents the fraction of data symbols in the encoded block, and the corresponding storage overhead factor of 1/R = n/k, quantifying the redundancy introduced. Decoding erasure codes involves solving a linear system derived from the known symbol positions using the parity-check or generator matrix. For up to (n − k) erasures in a systematic code, this reduces to inverting a k × k submatrix via Gaussian elimination over the finite field, with O(k^3) complexity in the number of arithmetic operations.
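A minimal Python sketch of the table-based GF(2^8) arithmetic described above, assuming the commonly used primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) with generator 2; other implementations choose different polynomials:

```python
# Build exp/log tables for GF(2^8) with primitive element 2 and polynomial 0x11D.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):       # duplicate so LOG[a] + LOG[b] never overflows
    EXP[i] = EXP[i - 255]

def gf_add(a, b):
    """Addition in GF(2^8) is bitwise XOR."""
    return a ^ b

def gf_mul(a, b):
    """Multiplication via log/antilog lookup: a*b = exp(log a + log b)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

assert gf_mul(0x53, 0xCA) == gf_mul(0xCA, 0x53)              # commutative
assert gf_mul(7, gf_mul(9, 11)) == gf_mul(gf_mul(7, 9), 11)  # associative
```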

Historical Development

Early Innovations

The concept of parity checks emerged in the 1950s as a fundamental mechanism for achieving simple redundancy in early computing systems, enabling basic error detection and correction. These checks involved adding redundant bits to data to verify integrity, laying groundwork for erasure coding principles by allowing recovery from single failures through parity computations. In 1950, Richard Hamming formalized error-correcting codes in his seminal work on parity-check-based schemes, initially focused on detecting and correcting transmission errors in computing machines, which influenced later erasure-tolerant designs. A significant practical application arrived in 1988 with the introduction of RAID by David A. Patterson, Garth Gibson, and Randy H. Katz, who proposed using XOR-based parity schemes for disk arrays to tolerate disk failures. RAID levels 5 and 6 specifically employed these erasure codes, distributing parity information across drives to recover from one or two simultaneous failures, respectively, thereby enhancing reliability in storage systems without excessive redundancy.

In 1960, Irving S. Reed and Gustave Solomon developed Reed–Solomon codes, a class of non-binary cyclic error-correcting codes based on polynomial evaluations over finite fields, capable of correcting multiple symbol erasures or errors. These codes found early real-world use in deep-space communication, notably in NASA's Voyager missions launched in 1977, where a concatenated (255, 223) Reed–Solomon code over GF(256) was implemented to protect telemetry data against channel noise and erasures during transmission from billions of kilometers away. During the 1990s, erasure codes began appearing in academic prototypes for distributed systems, exploring redundancy schemes beyond traditional replication. The OceanStore project, initiated in 2000 by John Kubiatowicz and colleagues, integrated Byzantine fault-tolerant erasure coding into its global-scale persistent storage architecture, using techniques like Reed–Solomon variants to ensure data durability across untrusted, wide-area networks while minimizing storage overhead.

Modern Advancements

In the 2000s, academic research increasingly applied erasure codes to distributed storage systems, emphasizing scalability for wide-area networks. The Total Recall system, introduced in 2004, used erasure codes to store large files across nodes, automating availability management by detecting and repairing failures through redundant fragments. Similarly, DHash++, developed in 2006, integrated erasure coding into a distributed hash table architecture to enhance lookup efficiency and durability in wide-area environments, reducing the overhead of traditional replication.

Production adoption accelerated in the 2010s as large-scale systems sought cost savings over replication. Google's transition from the Google File System (GFS) to Colossus around 2010 incorporated Reed–Solomon codes, such as the RS(6,3) configuration, to achieve efficient redundancy with a storage overhead of 1.5× across clusters. In parallel, the HDFS-RAID facility used with Apache Hadoop, released in 2012, applied erasure codes to convert replicated cold data into parity-protected stripes, reducing storage requirements by up to 50% without compromising durability.

Efficiency breakthroughs addressed repair bottlenecks in erasure-coded systems. Minimum Storage Regenerating (MSR) codes, proposed by Dimakis et al. in 2007, minimized both per-node storage and repair bandwidth by leveraging network coding principles, enabling optimal trade-offs for distributed repair scenarios. Building on this, Locally Repairable Codes (LRCs), introduced in 2012 for Windows Azure Storage, localized single-failure repairs to a small subset of nodes, reducing repair traffic by up to 75% compared to standard Reed–Solomon codes while maintaining low storage overhead.

By the 2020s, erasure codes had become integral to cloud infrastructure. Microsoft Azure continued deploying LRCs for blob storage, achieving multi-rack fault tolerance with repair times under 10% of global reconstruction costs. Recent advancements include maximally recoverable LRCs, which enhance resilience in multi-failure scenarios and have been further deployed in Azure as of 2022. A survey of 285 papers from 2002 to 2023 highlights the field's maturity, with 76.1% addressing code constructions, algorithmic optimizations, and adaptations for emerging storage architectures.

Standardization efforts facilitated broader deployment. The IETF's FEC Framework (RFC 6363, 2011) and Reed–Solomon FEC specifications (RFC 5510, 2009) provided protocols for erasure codes in IP networks, influencing storage implementations. OpenStack Swift integrated erasure code support from the 2014 Juno release onward, enabling policy-based use of codes like Reed–Solomon for object storage durability.

Optimal Erasure Codes

Parity Check Codes

Parity check codes represent a foundational class of optimal erasure codes constructed using linear operations over the binary field GF(2), where addition corresponds to the exclusive-OR (XOR) function. These codes generate m parity symbols from k data symbols to form an n = k + m symbol codeword, achieving maximum distance separable (MDS) properties for small m by ensuring that any k symbols suffice for reconstruction. The encoding process relies on a parity-check matrix H of dimensions m × n, where the code satisfies H c = 0 over GF(2), with c the systematic codeword vector containing the data symbols followed by the parities. In simple constructions, each row of H places a 1 in the corresponding parity column and 1s across specific data columns to define the linear dependencies, enabling efficient XOR-based encoding and decoding while maintaining invertibility of the relevant submatrices for MDS guarantees.

For the basic case of m = 1, the construction yields a single even parity symbol P computed as the XOR of all k data symbols d_1, ..., d_k:

P = \bigoplus_{i=1}^{k} d_i

This corresponds to the parity-check matrix row [1, 1, ..., 1 | 1], where the final 1 enforces the even parity condition. Such codes, exemplified by RAID-5 arrays, tolerate one erasure by recomputing the lost symbol via XOR of the remaining n − 1 symbols, making them computationally lightweight with O(n) operations for encoding and decoding.

A prominent extension to m = 2 appears in the EVENODD code, which provides an MDS construction tolerating two erasures with minimal redundancy. Designed for k data columns where k is an odd prime, EVENODD arranges the information symbols into a (k − 1) × k array and computes two parity columns: the first (horizontal parity column) as the row-wise XOR of all k data symbols in each row, and the second (diagonal parity column) along wrapped diagonals of the array, with an adjustment for even/odd parity on a special diagonal to ensure systematic form and MDS distance 3. Encoding requires O(k^2) XORs, but leverages RAID-5 hardware for efficiency. Decoding for single or double erasures uses syndrome equations solved via recursive XORs on available symbols, recovering lost data in O(k) to O(k^2) time depending on the failure pattern. This achieves optimality per the Singleton bound, using only two parities for distance 3, outperforming general Reed–Solomon codes in XOR count for small fields (e.g., 50% fewer operations for n = 15).

These codes exhibit MDS properties for small m, such as m = 1 in RAID-5 or m = 2 in EVENODD, allowing recovery from any m erasures through simple XOR combinations of surviving symbols without general finite-field multiplication. However, a key limitation is high repair bandwidth: repairing one erased symbol requires downloading approximately k symbols from other nodes, as the linear dependencies necessitate accessing nearly the full stripe. Additionally, the quadratic complexity in k for encoding, decoding, and updates renders them non-scalable for large n in distributed systems, where prime constraints on k further restrict flexibility. In practice, parity check codes underpin RAID-6 deployments, which employ two parities for double-failure tolerance and optimize computations using Cauchy matrices to minimize XOR operations (e.g., reducing encoding cost via sparse, invertible structures over GF(2^b)).
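For a concrete feel for two-parity protection, the sketch below follows the Linux RAID-6 style P/Q construction (P is plain XOR, Q uses powers of the GF(2^8) generator 2 over the 0x11D polynomial) rather than the EVENODD or Cauchy constructions discussed above, and shows only single-erasure repair; it is an illustrative sketch, not a production implementation:

```python
# GF(2^8) tables with primitive polynomial 0x11D and generator 2.
EXP, LOG = [0] * 512, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gmul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gdiv(a, b):
    return 0 if a == 0 else EXP[LOG[a] - LOG[b] + 255]

def encode_pq(data):
    """P = XOR of data symbols, Q = sum of g^i * d_i over GF(2^8)."""
    P = Q = 0
    for i, d in enumerate(data):
        P ^= d
        Q ^= gmul(EXP[i], d)
    return P, Q

data = [0x12, 0x34, 0x56, 0x78, 0x9A]
P, Q = encode_pq(data)

j = 2                                    # say data[j] is lost
# Repair it either from P (cheap XOR of the survivors) ...
from_p = P
for i, d in enumerate(data):
    if i != j:
        from_p ^= d
# ... or from Q, by stripping the other terms and dividing by g^j.
from_q = Q
for i, d in enumerate(data):
    if i != j:
        from_q ^= gmul(EXP[i], d)
from_q = gdiv(from_q, EXP[j])

assert from_p == from_q == data[j]
```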

Reed-Solomon Codes

Reed-Solomon codes are a class of optimal maximum distance separable (MDS) erasure codes that achieve the Singleton bound, allowing recovery of the original data from any k out of n encoded symbols, where n − k is the number of parity symbols. These codes operate over finite fields, typically Galois fields GF(q), and are constructed from polynomial evaluations to ensure the MDS property.

In the construction of a Reed-Solomon code RS(n, k), the k data symbols are treated as the coefficients of a polynomial f(x) of degree less than k over GF(q), where q ≥ n to provide n distinct evaluation points α_i. The encoded symbols consist of the n evaluations of f(x) at these distinct points α_1, α_2, ..., α_n in the field, with the first k evaluations associated with the data and the remaining n − k serving as parities. This polynomial-based approach ensures that any n − k erasures can be tolerated, as the minimum distance d = n − k + 1 matches the MDS requirement. The encoding formula for each symbol is c_i = f(α_i) for i = 1 to n, where f(x) = d_0 + d_1 x + ... + d_{k−1} x^{k−1} and the d_j are the data symbols. Each evaluation can be computed directly using Horner's rule for efficiency, resulting in an overall encoding complexity of O(nk) field operations.

Decoding in the erasure-only case involves recovering f(x) from any k of the received symbols using polynomial interpolation. Lagrange interpolation provides a direct method to reconstruct the polynomial from the k points, while the Berlekamp-Massey algorithm offers an efficient alternative by solving for the shortest linear recurrence that generates the received sequence, enabling recovery in O(k^2) time. These methods handle up to n − k erasures by interpolating the unique polynomial of degree less than k that passes through the available points.

A simple example is the (3, 2) Reed-Solomon code, often illustrated for basic error-resilient messaging like "Err-mail." Consider data symbols d_0 and d_1 in GF(3), forming f(x) = d_0 + d_1 x; evaluating at α_1 = 0, α_2 = 1, α_3 = 2 gives c_1 = d_0, c_2 = d_0 + d_1, c_3 = d_0 + 2 d_1. If c_3 is erased, interpolation from c_1 and c_2 recovers f(x) and thus the data.

Reed-Solomon codes exhibit O(nk) encoding and O(k^2) decoding complexity in their standard forms, making them suitable for practical implementations. They have been widely adopted, such as in compact discs (CDs) for error correction since the 1980s using cross-interleaved (32, 28) and (28, 24) codes over GF(2^8), and in QR codes for data recovery with varying error correction levels based on Reed-Solomon parities. A key limitation is the field size constraint q ≥ n, which bounds the code length to the field order and necessitates sub-packetization (dividing the data into smaller coding units) for large-scale applications where n would exceed practical field sizes such as 2^8 or 2^16.
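The (3, 2) toy example over GF(3) can be verified exhaustively with a short script (an illustrative sketch):

```python
q = 3                                   # field GF(3); arithmetic is mod 3
alphas = [0, 1, 2]                      # evaluation points

def encode(d0, d1):
    """Codeword c_i = f(alpha_i) for f(x) = d0 + d1*x over GF(3)."""
    return [(d0 + d1 * a) % q for a in alphas]

def decode(two_points):
    """Recover (d0, d1) from any two (alpha, value) pairs by interpolation."""
    (x1, y1), (x2, y2) = two_points
    d1 = ((y2 - y1) * pow(x2 - x1, -1, q)) % q   # slope of the line
    d0 = (y1 - d1 * x1) % q
    return d0, d1

# Every message survives the erasure of any single codeword symbol.
for d0 in range(q):
    for d1 in range(q):
        c = encode(d0, d1)
        for erased in range(3):
            survivors = [(alphas[i], c[i]) for i in range(3) if i != erased]
            assert decode(survivors) == (d0, d1)
```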

Near-Optimal Erasure Codes

Locally Repairable Codes

Locally repairable codes (LRCs) are a class of erasure codes designed to minimize the amount of data read during the repair of a single failed node in distributed storage systems, achieving near-optimal trade-offs between storage overhead and repair bandwidth. Unlike maximum distance separable (MDS) codes, LRCs relax the strict MDS property to enable local repairability, where each symbol can be recovered using a small, fixed number r of other symbols within a local parity group. The construction typically divides the n coded symbols into g local groups, each consisting of r symbols plus one local parity symbol, resulting in g local parities. Additional h global parities are then computed across all data symbols, with total redundancy m = g + h − 1, as the sum of the local parities can serve as an implicit parity that need not be stored separately. This structure ensures locality r, so any single erasure can be repaired using only r other symbols from its local group.

A representative example is a (10, 6, 3) LRC, featuring two groups each of size 4 (three data symbols and one local parity computed via XOR or Reed–Solomon coding within the group), for a total of six data symbols and two local parities, augmented by two global parities computed over the entire set. In this setup, repairing a single failed symbol requires accessing only three other symbols from its group, significantly reducing repair I/O compared to reading k systematic symbols in an MDS code. The minimum distance d satisfies the near-MDS bound d = n − k + 1 − (g − 1) = 10 − 6 + 1 − (2 − 1) = 4, allowing correction of up to three erasures globally while maintaining repair efficiency; repair operations scale as O(r) in bandwidth, versus O(k) for traditional Reed–Solomon codes.

Encoding in LRCs involves first computing local parities independently for each group, often using simple XOR for binary fields or higher-order Reed–Solomon codes for enhanced local distance, followed by global parities derived from a systematic overall parity-check matrix that incorporates both data and local parities. This hierarchical approach keeps the code linear and efficient to encode and decode. Trade-offs include slightly higher storage overhead, such as a 1.5× factor versus 1.33× for an equivalent MDS code tolerating three failures, in exchange for 2-3× faster single-node repairs by limiting reads to the local group. Windows Azure Storage deploys an LRC with parameters (16, 12, 2), using 12 data symbols, two local parities (g = 2 groups of six data symbols each, plus their local parities), and two global parities (h = 2), achieving a 1.33× overhead while halving repair bandwidth relative to a Reed–Solomon baseline.
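A minimal sketch of the locality idea for the (10, 6, 3) layout above, using XOR local parities only; the two global parities, which would require Reed–Solomon-style coefficients, are omitted from this sketch:

```python
import functools

def xor(blocks):
    return functools.reduce(lambda a, b: a ^ b, blocks)

# Six data symbols split into two local groups of three.
data = [11, 22, 33, 44, 55, 66]
groups = [data[0:3], data[3:6]]
local_parities = [xor(g) for g in groups]   # one local parity per group

# Local repair: a single lost symbol is rebuilt from r = 3 symbols of its own
# group (two surviving data symbols plus the local parity), instead of reading
# k = 6 symbols as a comparable MDS code would require.
lost_group, lost_pos = 1, 2                  # say data[5] is lost
survivors = [s for i, s in enumerate(groups[lost_group]) if i != lost_pos]
rebuilt = xor(survivors + [local_parities[lost_group]])
assert rebuilt == data[5]
```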

Regenerating Codes

In distributed storage systems, repairing a failed node using traditional erasure codes like Reed–Solomon requires downloading a full node's worth of data from each of k surviving nodes, i.e., the equivalent of the entire file, which incurs significant network bandwidth costs. Regenerating codes address this by enabling a new node to download a reduced amount γ from d helper nodes (k ≤ d < n), where n is the total number of nodes, while ensuring that any k nodes can still reconstruct the original file of size B. This trade-off between storage per node α and repair bandwidth γ = dβ (with β downloaded from each helper) minimizes communication overhead in failure-prone environments like cloud storage.

The cut-set bound, derived from network coding principles, establishes the theoretical minimum for repair bandwidth; at the minimum-storage point (α = B/k) it gives γ ≥ dB/(k(d − k + 1)), assuming the file is divided evenly and helpers contribute equally. This bound arises from information flow graph analysis, where the minimum cut across any k nodes must support file reconstruction, and the repair flows from d helpers must suffice for recovery without exceeding storage constraints. Codes achieving the bound are optimal, balancing α and γ along the storage–bandwidth trade-off curve.

Minimum Storage Regenerating (MSR) codes achieve the cut-set bound at the minimum-storage point, where α = B/k (matching MDS codes like Reed–Solomon for space efficiency) and γ = dB/(k(d − k + 1)). Exact-regenerating MSR constructions recover the exact lost data using interference alignment techniques, where helper contributions are linearly combined to cancel interference and isolate the desired symbols. These codes maintain MDS properties for reconstruction while reducing repair traffic, with explicit constructions available for parameters where d ≥ 2k − 2.

Minimum Bandwidth Regenerating (MBR) codes prioritize bandwidth minimization, achieving α = γ = 2dB/(k(2d − k + 1)) at the cost of higher storage per node (α > B/k). This point on the curve allows simpler encoding but increases per-node capacity, making MBR suitable for bandwidth-constrained networks. Like MSR, MBR codes can be exactly repaired using interference alignment for feasible (n, k, d).

A representative example is the (4, 2) MSR code with d = 3 helpers and sub-packetization 2, where each node stores two sub-packets (α = B/2); to repair a failed node, each helper contributes one of its sub-packets, yielding a total download of three sub-packets (γ = 3B/4), less than the four sub-packets (the whole file) required by traditional MDS repair. This construction uses product-matrix methods to align interference across sub-packets, demonstrating bandwidth savings in small-scale systems. As of 2025, extensions like piggybacking codes, introduced in 2013, enable lazy repair by precomputing functions of the data during initial writes, further reducing effective bandwidth in MSR setups without altering the core parameters. These codes remain primarily in research prototypes, such as experimental distributed file systems, due to sub-packetization complexities limiting practical deployment. Regenerating codes differ from locally repairable codes by focusing on reducing global repair bandwidth across d helpers rather than on locality for fast single-node fixes.
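A quick numeric sketch of these trade-off points, using the formulas above with illustrative parameters:

```python
def msr_point(B, k, d):
    """Minimum-storage regenerating point: (alpha, gamma)."""
    return B / k, d * B / (k * (d - k + 1))

def mbr_point(B, k, d):
    """Minimum-bandwidth regenerating point: alpha = gamma."""
    g = 2 * d * B / (k * (2 * d - k + 1))
    return g, g

B, k, d = 4.0, 2, 3                  # the (4, 2) example with d = 3 helpers
alpha, gamma = msr_point(B, k, d)
print(alpha, gamma)                  # 2.0 3.0: store B/2, download 3 sub-packets vs 4 for MDS repair
print(mbr_point(B, k, d))            # (2.4, 2.4): less repair traffic, more storage per node
```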

Applications

Distributed Storage Systems

Erasure codes have been integrated into the Hadoop Distributed File System (HDFS) since the release of Hadoop 3.0 in 2017, with initial proposals and development beginning around 2015 through collaboration between Cloudera and Intel. (Note: in this section, erasure code parameters are denoted RS(n, k), where n is the total number of symbols and k is the number of data symbols.) HDFS supports configurable erasure coding policies, such as Reed–Solomon (RS) codes with parameters like RS(9,6) or RS(14,10), which replace triple replication for certain data sets, particularly cold or archival data. For example, an RS(14,10) configuration encodes 10 data blocks into 14 total blocks (10 data + 4 parity), enabling automatic tiering where frequently accessed hot data remains replicated while less-accessed data uses erasure coding to save space. This integration allows administrators to apply policies at the directory level, balancing durability with storage efficiency without altering core HDFS workflows.

In cloud storage systems, erasure codes are prominently deployed in Google's Colossus file system and Microsoft Azure Blob Storage. Colossus, the successor to the Google File System (GFS), employs RS(9,6) codes, which provide a redundancy factor of 1.5 while tolerating up to 3 simultaneous failures across distributed clusters. Azure utilizes locally repairable codes (LRCs), such as LRC(12,2,2) with 12 data fragments, 2 local parities, and 2 global parities (16 fragments in total), to enhance repair efficiency at large scale. These LRCs, which build on principles from regenerating codes, reduce the data read during a single-fragment repair from 12 blocks (as with a standard RS code) to 6, achieving approximately 50% lower repair times than equivalent RS configurations.

The primary benefits of erasure codes in distributed storage include significant cost savings through improved storage efficiency (up to 3 times better than traditional 3-way replication for low-overhead schemes like RS(13,10)) and robust resilience against 10-20% node failures in large clusters, depending on the parity configuration. In hyperscale environments, this translates to handling petabyte-scale data with minimal overhead, reducing hardware requirements and operational expenses while maintaining durability. However, challenges persist, including encoding and decoding latency, which can increase read/write times over replication due to computational overhead; these are often mitigated using hardware acceleration such as Intel's ISA-L library. Additionally, straggler effects during distributed repairs, where slow nodes delay overall recovery, can extend the mean time to repair (MTTR) in heterogeneous clusters, though techniques like partial parallel decoding help reduce this by up to 59%.

As of 2025, erasure codes are widely adopted in hyperscale distributed storage systems, powering platforms like HDFS, Colossus, and Azure for zettabyte-scale deployments. Open-source libraries such as OpenEC facilitate this adoption by providing a unified framework for implementing and configuring various erasure codes, including RS and LRC, directly in systems such as HDFS and Ceph, enabling customizable integration without proprietary dependencies.

Networking and Communication

Erasure codes play a crucial role in networking and communication by enabling reliable data transmission over unreliable channels, where packet losses occur due to congestion, interference, or other impairments. In such scenarios, packets are encoded prior to transmission to include redundancy, allowing receivers to recover lost data without retransmission requests, which is particularly beneficial in multicast or broadcast settings. For instance, Tornado codes, introduced in 1998, were specifically developed for efficient reliable distribution of large files, achieving near-linear time encoding and decoding complexity while outperforming traditional Reed–Solomon codes in bandwidth efficiency for erasure channels.

A key advancement in this domain is the development of fountain codes, which are rateless erasure codes capable of generating an unlimited number of encoded packets from a fixed set of source symbols. Luby Transform (LT) codes, proposed in 2002, form the foundational class of practical fountain codes, employing a robust soliton degree distribution to ensure efficient decoding; the encoding process randomly selects source symbols according to this distribution and combines them to create each output symbol. Building on LT codes, Raptor codes, introduced in 2006, combine a low-density parity-check precode with an LT-like fountain stage, enabling linear-time encoding and decoding and significantly reducing computational overhead for large-scale transmissions. As an illustrative example, for an LT code with 1000 source symbols, a receiver typically collects around 1050 encoded packets (a modest 5% overhead) to achieve successful decoding with high probability using the peeling algorithm, which iteratively resolves symbols starting from degree-one outputs.

These codes find widespread application in video streaming and satellite communications, where packet loss is common and retransmission-based automatic repeat request (ARQ) protocols introduce unacceptable delays. In video streaming, fountain codes protect against erasures in real-time delivery, maintaining playback quality without buffering interruptions; for example, they have been integrated into broadcast and multicast delivery protocols to minimize retransmissions. In satellite links, characterized by high latency and variable loss rates, erasure codes enhance throughput by correcting erasures proactively, with empirical studies reporting latency reductions of 20-30% compared to ARQ approaches in lossy environments. As of 2025, erasure codes are increasingly integrated into 5G and emerging 6G networks to support ultra-reliable low-latency communication (URLLC) services, such as industrial automation and autonomous vehicles, by providing robust FEC without feedback overhead. The IETF's FECFRAME framework further standardizes these mechanisms, defining architectures for applying FEC, including fountain-based schemes, across packet flows to ensure reliability and efficiency.
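A toy LT-style fountain code in Python, using a simplified degree distribution rather than the robust soliton distribution and a basic peeling decoder; everything here is an illustrative sketch:

```python
import random

def lt_encode_symbol(source, rng):
    """Produce one encoded symbol: the XOR of a randomly chosen set of source symbols.
    A simplified degree distribution is used instead of the robust soliton."""
    k = len(source)
    degree = min(rng.choice([1, 2, 2, 3, 3, 4, 8, 16]), k)
    neighbors = set(rng.sample(range(k), degree))
    value = 0
    for i in neighbors:
        value ^= source[i]
    return neighbors, value

def peel_decode(packets, k):
    """Peeling decoder: repeatedly resolve packets with exactly one unknown neighbor."""
    decoded = {}
    progress = True
    while progress and len(decoded) < k:
        progress = False
        for neighbors, value in packets:
            unknown = neighbors - decoded.keys()
            if len(unknown) == 1:
                i = unknown.pop()
                for j in neighbors - {i}:
                    value ^= decoded[j]       # strip already-decoded neighbors
                decoded[i] = value
                progress = True
    return decoded if len(decoded) == k else None

rng = random.Random(1)
source = [rng.randrange(256) for _ in range(30)]   # k = 30 source symbols

packets, result = [], None
while result is None:                               # collect packets until decodable
    packets.append(lt_encode_symbol(source, rng))
    result = peel_decode(packets, len(source))

assert [result[i] for i in range(len(source))] == source
print(len(packets), "encoded packets were enough for", len(source), "source symbols")
```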

Emerging Uses

Advanced Storage Architectures

In flash and solid-state drive (SSD) storage, erasure codes are adapted to address the unique constraints of multilevel cells, such as limited write/erase cycles and wear. Write-Once Memory (WOM) codes enable multiple writes to flash cells without erasure by leveraging the asymmetric programming property, where cells can only increase in charge level. These codes are particularly suited for multilevel cells (MLC) in NAND flash, allowing data to be rewritten up to t times per block while minimizing block erasures. For instance, rank modulation schemes represent data as permutations of cell charge levels rather than absolute values, facilitating rewriting without full block erasures and integrating with erasure-correcting mechanisms to tolerate erasures and deletions caused by cell errors. This approach extends flash lifetime by reducing erasure frequency, with rank-modulation rewriting codes achieving close to 2 bits per cell per write while preserving reliability.

In in-memory storage systems using volatile RAM clusters, erasure codes provide fault tolerance with lower overhead than traditional replication. EC-Cache, an online erasure coding scheme for cluster caching, splits objects into k data and r parity units during writes, enabling reconstruction from any k units. It employs (10, 6) codes, which store data across 16 units with 1.6× storage overhead, reducing overhead by over 3× compared to selective 3-replication while using a similar amount of memory. By reading k + Δ units (e.g., Δ = 1) to handle stragglers, EC-Cache improves load balancing by more than 3× and reduces median read latency by up to 2.64× and tail latency (99.9th percentile) by 1.79× for large objects such as 40 MB, scaling effectively in RAM-based clusters.

Disaggregated storage architectures separate compute and memory resources into pooled systems, where erasure codes enhance efficiency in memory pooling. MicroEC optimizes erasure coding for disaggregated memory by reusing auxiliary coding data and pipelining encoding with RDMA transmissions, supporting operations such as writes, reads, degraded reads, and recoveries. This reduces write latency by up to 44.35% compared to baseline erasure coding and 42.14% versus 3-way replication, while improving write throughput by 1.80× over erasure coding and 1.73× over replication for objects of 1 MB or larger. Complementing this, Hydra applies online erasure coding to individual remote memory pages in disaggregated setups, achieving single-digit-microsecond access latencies for fault-tolerant remote memory without full replication. It uses configurable coding groups placed via the CodingSets scheme for balanced load and availability, tolerating node failures in pooled environments.

Optimizations in these architectures further mitigate traditional erasure code limitations. Stripeless placement schemes, such as Nos, eliminate stripe alignment by having nodes independently replicate and XOR-encode data into parities based on symmetric balanced incomplete block designs (SBIBD), avoiding centralized coordination and stripe-induced overheads during writes. This enables recovery from up to p node failures while simplifying placement in in-memory systems. Evaluations show Nos-based systems achieving 1.61× to 2.60× higher throughput than stripe-based baselines, with comparable or lower latencies. Additionally, hardware acceleration such as GPUs speeds up encoding; for example, G-CRS uses GPU parallelism for Cauchy Reed–Solomon codes, delivering up to 10× higher throughput than CPU-based libraries for large-scale operations.

These advancements enable erasure codes to support scalability to petabyte-scale deployments in advanced architectures, such as those using NVMe-over-Fabrics (NVMe-oF) for high-speed interconnects. In NVMe-oF fabrics, erasure codes provide robust protection against multiple node or drive failures, as seen in systems like IBM Spectrum Scale Erasure Code Edition, which uses Reed–Solomon codes for declustered data protection across petabyte clusters, tolerating 1-3 node failures depending on configuration (e.g., 8+3 parity). This ensures data availability in disaggregated, high-density environments without excessive replication overhead.

Integration with AI and DNA Storage

Erasure codes have been integrated into distributed frameworks to enhance resilience during training es, particularly by mitigating the impact of stragglers—slow or failed compute nodes—in computations. In coded computing approaches, data and computations are encoded such that the results from a of nodes suffice for full recovery, tolerating up to a specified number of stragglers without halting the entire ; for instance, nested gradient codes allow flexible straggler mitigation by concatenating schemes for varying failure counts, reducing training time in large-scale setups. Recent advancements, such as ECRec for recommendation models, couple erasure coding with model-specific traits to achieve while minimizing overhead in distributed training environments. Additionally, models predict file access patterns or failure probabilities to enable adaptive erasure coding, as demonstrated in Azure Storage systems where optimizes code parameters for dynamic workloads, improving efficiency in AI-driven data management. In DNA storage, erasure codes address inherent biological errors during synthesis and sequencing, including 6.2% deletion rates and 5.7% insertion rates observed in photolithographic methods, by encoding data into redundant oligonucleotides that allow reconstruction from partial reads. codes, adapted for DNA's constraints like variable oligo recovery and biochemical limits, enable rateless encoding where sufficient fragments—regardless of dropouts—decode the original information, as pioneered in the DNA architecture that stores gigabytes scalably. This integration supports ultra-high densities, with theoretical capacities reaching up to 33 zettabytes in volumes comparable to a ping-pong ball, far surpassing traditional media and positioning DNA as a viable archival solution for exascale data. An example is the system, which applies erasure coding tiering in edge-cloud LSM-tree stores to balance redundancy and performance, extendable to bio-hybrid scenarios for reliable DNA data handling. As of 2025, trends emphasize hybrid erasure codes combining replication and coding for mixed failure modes in AI and DNA contexts, as surveyed in comprehensive reviews of storage systems, with applications in adaptive AI training and error-prone bio-storage. The market for erasure coding technologies is projected to experience significant growth by 2031, fueled by these interdisciplinary integrations in resilient computing and archival technologies. Challenges persist, including high decoding error rates in DNA due to indel dominance, necessitating advanced codes like HEDGES for direct correction, and computational overhead in AI distributed training, where encoding/decoding adds latency that coded schemes aim to offset through optimized designs.

References

  1. https://ntrs.nasa.gov/api/citations/19900019023/downloads/19900019023.pdf
  2. https://arxiv.org/abs/1206.3804