Low-density parity-check code
Low-density parity-check (LDPC) codes are a class of error-correction codes which (together with the closely related turbo codes) have gained prominence in coding theory and information theory since the late 1990s. The codes today are widely used in applications ranging from wireless communications to flash-memory storage. Together with turbo codes, they sparked a revolution in coding theory, achieving order-of-magnitude improvements in performance compared to traditional error correction codes.[1]
Central to the performance of LDPC codes is their adaptability to the iterative belief-propagation decoding algorithm. Under this algorithm, they can be designed to approach theoretical limits (capacities) of many channels[2] at low computation costs.
Theoretically, analysis of LDPC codes focuses on sequences of codes of fixed code rate and increasing block length. These sequences are typically tailored to a set of channels. For appropriately designed sequences, the decoding error under belief propagation can often be proven to be vanishingly small (approaches zero with the block length) at rates that are very close to the capacities of the channels. Furthermore, this can be achieved at a complexity that is linear in the block length.
This theoretical performance is made possible using a flexible design method that is based on sparse Tanner graphs (specialized bipartite graphs).[3]
History
LDPC codes were originally conceived by Robert G. Gallager (and are thus also known as Gallager codes). Gallager devised the codes in his doctoral dissertation[4] at the Massachusetts Institute of Technology in 1960.[5][6] The codes were largely ignored at the time, as their iterative decoding algorithm (despite having linear complexity) was prohibitively computationally expensive for the hardware available.
Renewed interest in the codes emerged following the invention of the closely related turbo codes (1993), whose similarly iterative decoding algorithm outperformed other codes used at that time. LDPC codes were subsequently rediscovered in 1996.[7] Initial industry preference for LDPC codes over turbo codes stemmed from patent-related constraints on the latter.[8] Since their rediscovery, advances in LDPC codes have seen them surpass turbo codes in terms of error floor and performance at higher code rates, leaving turbo codes better suited only to lower code rates.[9] Although the fundamental patent for turbo codes expired in 2013,[10][11] LDPC codes are still preferred for their technical merits.
Theoretical interest in LDPC codes also follows from their amenability to mathematical analysis. In his dissertation, Gallager showed that LDPC codes achieve the Gilbert–Varshamov bound for linear codes over binary fields with high probability. Over the binary erasure channel, code sequences were designed at rates arbitrarily close to channel capacity, with provably vanishing decoding error probability and linear decoding complexity.[12] In 2020 it was shown that Gallager's LDPC codes achieve list decoding capacity and also achieve the Gilbert–Varshamov bound for linear codes over general fields.[13]
Since 2013, LDPC codes have also been proposed as a means of correcting errors in quantum computers, as they require few additional qubits to correct errors, as demonstrated by Gottesman, the University of Strasbourg, Alice & Bob, and others.[14][15][16][17]
Applications
In 2003, an irregular repeat accumulate (IRA) style LDPC code beat six turbo codes to become the error-correcting code in the new DVB-S2 standard for digital television.[18] The DVB-S2 selection committee made decoder complexity estimates for the turbo-code proposals using a much-less-efficient serial decoder architecture rather than a parallel decoder architecture. This forced the turbo-code proposals to use frame sizes on the order of one-half the frame size of the LDPC proposals.[citation needed]
In 2008, LDPC beat convolutional turbo codes as the forward error correction (FEC) system for the ITU-T G.hn standard.[19] G.hn chose LDPC codes over turbo codes because of their lower decoding complexity (especially when operating at data rates close to 1.0 Gbit/s) and because the proposed turbo codes exhibited a significant error floor at the desired range of operation.[20]
LDPC codes are also used for 10GBASE-T Ethernet, which sends data at 10 gigabits per second over twisted-pair cables. As of 2009, LDPC codes are also part of the Wi-Fi 802.11 standard as an optional part of 802.11n and 802.11ac, in the High Throughput (HT) PHY specification.[21] LDPC is a mandatory part of 802.11ax (Wi-Fi 6).[22]
Some OFDM systems add an outer error-correcting code that fixes the occasional errors (the "error floor") that get past the inner LDPC code even at low bit error rates.
For example: the Reed-Solomon code with LDPC Coded Modulation (RS-LCM) uses a Reed-Solomon outer code.[23] The DVB-S2, DVB-T2, and DVB-C2 standards all use an outer BCH code to mop up residual errors after LDPC decoding.[24]
5G NR uses polar codes for the control channels and LDPC for the data channels.[25][26]
Although LDPC codes have been successful in commercial hard disk drives, fully exploiting their error-correction capability in SSDs demands unconventional fine-grained flash-memory sensing, which increases read latency. LDPC-in-SSD[27] is an approach that deploys LDPC in SSDs with only a very small latency increase. Since its introduction, LDPC has been widely adopted in commercial SSDs, in both consumer and enterprise grades, by major storage vendors. Many TLC (and later) SSDs use LDPC codes. A fast hard-decision decode (treating errors as binary erasures) is attempted first, and can fall back to the slower but more powerful soft-decision decoding.[28]
Operational use
Functionally, LDPC codes are defined by a sparse parity-check matrix. This sparse matrix is often randomly generated, subject to the sparsity constraints—LDPC code construction is discussed later. These codes were first designed by Robert Gallager in 1960.[6]
Below is a graph fragment of an example LDPC code using Forney's factor graph notation. In this graph, n variable nodes in the top of the graph are connected to (n−k) constraint nodes in the bottom of the graph.
This is a popular way of graphically representing an (n, k) LDPC code. The bits of a valid message, when placed on the Ts at the top of the graph, satisfy the graphical constraints. Specifically, all lines connecting to a variable node (box with an "=" sign) have the same value, and all values connecting to a factor node (box with a "+" sign) must sum, modulo two, to zero (in other words, they must sum to an even number, or there must be an even number of odd values).

Ignoring any lines going out of the picture, there are eight possible six-bit strings corresponding to valid codewords: 000000, 011001, 110010, 101011, 111100, 100101, 001110, and 010111. This LDPC code fragment represents a three-bit message encoded as six bits. Redundancy is used here to increase the chance of recovering from channel errors. This is a (6, 3) linear code, with n = 6 and k = 3.
Again ignoring lines going out of the picture, the parity-check matrix representing this graph fragment is

H =
[1 1 1 1 0 0]
[0 0 1 1 0 1]
[1 0 0 1 1 0]
In this matrix, each row represents one of the three parity-check constraints, while each column represents one of the six bits in the received codeword.
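To make the constraint checking concrete, a short Python sketch (not taken from any particular library) can enumerate all 64 six-bit strings against the matrix above and keep the ones whose checks are all satisfied; it assumes the reconstructed H shown above.

```python
import itertools

# Parity-check matrix of the (6, 3) example code, one row per check node.
H = [
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
]

def syndrome(H, c):
    """Compute H * c (mod 2) for a candidate codeword c."""
    return [sum(h * b for h, b in zip(row, c)) % 2 for row in H]

# A six-bit string is a codeword exactly when every parity check is satisfied.
codewords = [c for c in itertools.product([0, 1], repeat=6)
             if not any(syndrome(H, c))]
print(len(codewords))   # 8
for c in codewords:
    print("".join(map(str, c)))
```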
In this example, the eight codewords can be obtained by putting the parity-check matrix H into the form [−Pᵀ | I_(n−k)] through basic row operations in GF(2):

Step 1: H =
[1 1 1 1 0 0]
[0 0 1 1 0 1]
[1 0 0 1 1 0]

Step 2: Row 1 is added to row 3:
[1 1 1 1 0 0]
[0 0 1 1 0 1]
[0 1 1 0 1 0]

Step 3: Rows 2 and 3 are swapped:
[1 1 1 1 0 0]
[0 1 1 0 1 0]
[0 0 1 1 0 1]

Step 4: Row 1 is added to row 3:
[1 1 1 1 0 0]
[0 1 1 0 1 0]
[1 1 0 0 0 1]

From this, the generator matrix G can be obtained as G = [I_k | P] (noting that in the special case of a binary code −P = P), or specifically:

G =
[1 0 0 1 0 1]
[0 1 0 1 1 1]
[0 0 1 1 1 0]
Finally, by multiplying all eight possible 3-bit strings by G, all eight valid codewords are obtained. For example, the codeword for the bit-string "101" is obtained by

(1 0 1) ⊙ G = (1 0 1 0 1 1),

where ⊙ denotes mod-2 matrix multiplication.

As a check, the row space of G is orthogonal to H, such that G ⊙ Hᵀ = 0.

The input bit-string "101" is found as the first 3 bits of the codeword "101011", due to the presence of the identity matrix I_k in G. The trailing three bits "011" of the codeword are the parity bits.
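A similarly minimal Python sketch of the encoding step and the orthogonality check, again assuming the G and H shown above:

```python
# Generator matrix in systematic form [I_k | P] for the (6, 3) example.
G = [
    [1, 0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
]
H = [
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
]

def encode(u, G):
    """Codeword c = u * G over GF(2)."""
    return [sum(ui * gij for ui, gij in zip(u, col)) % 2
            for col in zip(*G)]

print(encode([1, 0, 1], G))        # [1, 0, 1, 0, 1, 1] -> "101011"

# Every row of G must satisfy every parity check: G * H^T = 0.
assert all(sum(g * h for g, h in zip(grow, hrow)) % 2 == 0
           for grow in G for hrow in H)
```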
Example encoder
During the encoding of a frame, the input data bits (D) are repeated and distributed to a set of constituent encoders. The constituent encoders are typically accumulators and each accumulator is used to generate a parity symbol. A single copy of the original data (S0…SK−1) is transmitted with the parity bits (P) to make up the code symbols. The S bits from each constituent encoder are discarded.
The parity bit may be used within another constituent code.
In an example using the DVB-S2 rate 2/3 code the encoded block size is 64800 symbols (N=64800) with 43200 data bits (K=43200) and 21600 parity bits (M=21600). Each constituent code (check node) encodes 16 data bits except for the first parity bit which encodes 8 data bits. The first 4680 data bits are repeated 13 times (used in 13 parity codes), while the remaining data bits are used in 3 parity codes (irregular LDPC code).
For comparison, classic turbo codes typically use two constituent codes configured in parallel, each of which encodes the entire input block (K) of data bits. These constituent encoders are recursive systematic convolutional (RSC) codes of moderate depth (8 or 16 states) that are separated by a code interleaver which interleaves one copy of the frame.
The LDPC code, in contrast, uses many low depth constituent codes (accumulators) in parallel, each of which encode only a small portion of the input frame. The many constituent codes can be viewed as many low depth (2 state) "convolutional codes" that are connected via the repeat and distribute operations. The repeat and distribute operations perform the function of the interleaver in the turbo code.
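The repeat, distribute, and accumulate operations can be illustrated with a toy Python sketch. The connection pattern below is made up for illustration (it is not the DVB-S2 pattern); each parity bit is the running mod-2 accumulation of the data bits routed to its check node.

```python
def ira_encode(data, connections):
    """Toy IRA-style encoder: each entry of `connections` lists the data-bit
    indices feeding one check; parity bits come from a running accumulator."""
    parity = []
    acc = 0
    for idx_list in connections:
        acc = (acc + sum(data[i] for i in idx_list)) % 2  # accumulate mod 2
        parity.append(acc)
    return data + parity  # systematic: data bits followed by parity bits

# Example: 6 data bits, 4 checks; some bits are repeated into several checks.
data = [1, 0, 1, 1, 0, 1]
connections = [[0, 1, 2], [0, 3], [1, 4, 5], [2, 3, 5]]
print(ira_encode(data, connections))
```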
The ability to more precisely manage the connections of the various constituent codes and the level of redundancy for each input bit gives more flexibility in the design of LDPC codes, which can lead to better performance than turbo codes in some instances. Turbo codes still seem to perform better than LDPCs at low code rates, or at least the design of well-performing low-rate codes is easier for turbo codes.
As a practical matter, the hardware that forms the accumulators is reused during the encoding process. That is, once a first set of parity bits are generated and the parity bits stored, the same accumulator hardware is used to generate a next set of parity bits.
Decoding
As with other codes, the maximum likelihood decoding of an LDPC code on the binary symmetric channel is an NP-complete problem,[29] shown by reduction from 3-dimensional matching. Thus, assuming P ≠ NP, which is widely believed, performing optimal decoding for an arbitrary code of any useful size is not practical.
However, sub-optimal techniques based on iterative belief propagation decoding give excellent results and can be practically implemented. The sub-optimal decoding techniques view each parity check that makes up the LDPC as an independent single parity check (SPC) code. Each SPC code is decoded separately using soft-in-soft-out (SISO) techniques such as SOVA, BCJR, MAP, and other derivatives thereof. The soft decision information from each SISO decoding is cross-checked and updated with other redundant SPC decodings of the same information bit. Each SPC code is then decoded again using the updated soft decision information. This process is iterated until a valid codeword is achieved or decoding is exhausted. This type of decoding is often referred to as sum-product decoding.
The decoding of the SPC codes is often referred to as the "check node" processing, and the cross-checking of the variables is often referred to as the "variable-node" processing.
In a practical LDPC decoder implementation, sets of SPC codes are decoded in parallel to increase throughput.
In contrast, belief propagation on the binary erasure channel is particularly simple, as it consists of iterative constraint satisfaction.
For example, consider that the valid codeword, 101011, from the example above, is transmitted across a binary erasure channel and received with the first and fourth bit erased to yield ?01?11. Since the transmitted message must have satisfied the code constraints, the message can be represented by writing the received message on the top of the factor graph.
In this example, the first bit cannot yet be recovered, because all of the constraints connected to it have more than one unknown bit. In order to proceed with decoding the message, constraints connecting to only one of the erased bits must be identified. In this example, only the second constraint suffices. Examining the second constraint, the fourth bit must have been zero, since only a zero in that position would satisfy the constraint.
This procedure is then iterated. The new value for the fourth bit can now be used in conjunction with the first constraint to recover the first bit as seen below. This means that the first bit must be a one to satisfy the leftmost constraint.

Thus, the message can be decoded iteratively. For other channel models, the messages passed between the variable nodes and check nodes are real numbers, which express probabilities and likelihoods of belief.
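This iterative constraint satisfaction is sometimes called peeling: repeatedly find a check with exactly one erased bit and solve for it. A minimal Python sketch of such a peeling decoder for the example above (None marks an erased bit; H is the reconstructed matrix from earlier):

```python
H = [
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
]
received = [None, 0, 1, None, 1, 1]   # "?01?11"

def peel(H, r):
    """Iteratively satisfy checks that involve exactly one erased bit."""
    r = list(r)
    progress = True
    while progress and None in r:
        progress = False
        for row in H:
            unknown = [i for i, h in enumerate(row) if h and r[i] is None]
            if len(unknown) == 1:
                i = unknown[0]
                known_sum = sum(r[j] for j, h in enumerate(row)
                                if h and j != i) % 2
                r[i] = known_sum     # the erased bit must make the check even
                progress = True
    return r

print(peel(H, received))   # [1, 0, 1, 0, 1, 1] -> "101011"
```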
This result can be validated by multiplying the corrected codeword r by the parity-check matrix H:

z = H ⊙ rᵀ = 0.

Because the outcome z (the syndrome) of this operation is the 3 × 1 zero vector, the resulting codeword r is successfully validated.
After the decoding is completed, the original message bits '101' can be extracted by looking at the first 3 bits of the codeword.
While illustrative, this erasure example does not show the use of soft-decision decoding or soft-decision message passing, which is used in virtually all commercial LDPC decoders.
Updating node information
In recent years[when?], there has also been a great deal of work spent studying the effects of alternative schedules for variable-node and constraint-node update. The original technique that was used for decoding LDPC codes was known as flooding. This type of update required that, before updating a variable node, all constraint nodes needed to be updated and vice versa. In later work by Vila Casado et al.,[30] alternative update techniques were studied, in which variable nodes are updated with the newest available check-node information.[citation needed]
The intuition behind these algorithms is that variable nodes whose values vary the most are the ones that need to be updated first. Highly reliable nodes, whose log-likelihood ratio (LLR) magnitude is large and does not change significantly from one update to the next, do not require updates with the same frequency as other nodes, whose sign and magnitude fluctuate more widely.[citation needed] These scheduling algorithms show greater speed of convergence and lower error floors than those that use flooding. These lower error floors are achieved by the ability of the Informed Dynamic Scheduling (IDS)[30] algorithm to overcome trapping sets of near codewords.[31]
When nonflooding scheduling algorithms are used, an alternative definition of iteration is used. For an (n, k) LDPC code of rate k/n, a full iteration occurs when n variable and n − k constraint nodes have been updated, no matter the order in which they were updated.[citation needed]
Code construction
For large block sizes, LDPC codes are commonly constructed by first studying the behaviour of decoders. As the block size tends to infinity, LDPC decoders can be shown to have a noise threshold below which decoding is reliably achieved, and above which decoding is not achieved,[32] colloquially referred to as the cliff effect. This threshold can be optimised by finding the best proportion of arcs from check nodes and arcs from variable nodes. An approximate graphical approach to visualising this threshold is an EXIT chart.[citation needed]
The construction of a specific LDPC code after this optimization falls into two main types of techniques:[citation needed]
- Pseudorandom approaches
- Combinatorial approaches
Construction by a pseudo-random approach builds on theoretical results that, for large block size, a random construction gives good decoding performance.[7] In general, pseudorandom codes have complex encoders, but pseudorandom codes with the best decoders can have simple encoders.[33] Various constraints are often applied to help ensure that the desired properties expected at the theoretical limit of infinite block size occur at a finite block size.[citation needed]
Combinatorial approaches can be used to optimize the properties of small block-size LDPC codes or to create codes with simple encoders.[citation needed]
Some LDPC codes are based on Reed–Solomon codes, such as the RS-LDPC code used in the 10 Gigabit Ethernet standard.[34] Compared to randomly generated LDPC codes, structured LDPC codes—such as the LDPC code used in the DVB-S2 standard—can have simpler and therefore lower-cost hardware—in particular, codes constructed such that the H matrix is a circulant matrix.[35]
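As a rough illustration of the circulant idea, the following Python sketch assembles a small quasi-cyclic H out of cyclically shifted identity blocks; the block size and shift values are arbitrary illustrative choices, not taken from any standard.

```python
import numpy as np

def circulant_shift(size, shift):
    """size x size identity matrix cyclically shifted right by `shift` columns."""
    return np.roll(np.eye(size, dtype=int), shift, axis=1)

def qc_ldpc_H(base_shifts, block_size):
    """Expand a base matrix of shift values (-1 means an all-zero block)
    into a full quasi-cyclic parity-check matrix."""
    rows = []
    for shift_row in base_shifts:
        blocks = [np.zeros((block_size, block_size), dtype=int) if s < 0
                  else circulant_shift(block_size, s)
                  for s in shift_row]
        rows.append(np.hstack(blocks))
    return np.vstack(rows)

# Toy 2x4 base matrix of shifts, expanded with 4x4 circulants -> an 8x16 H.
H = qc_ldpc_H([[0, 1, 2, -1],
               [3, -1, 0, 1]], block_size=4)
print(H.shape)   # (8, 16)
```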
Yet another way of constructing LDPC codes is to use finite geometries. This method was proposed by Y. Kou et al. in 2001.[36]
Compared to turbo codes
LDPC codes can be compared with other powerful coding schemes, e.g. turbo codes.[37] On the one hand, the BER performance of turbo codes is limited by their low minimum distance.[38] LDPC codes have no such minimum-distance limitation,[39] which indirectly means that LDPC codes may be more efficient at relatively high code rates (e.g. 3/4, 5/6, 7/8) than turbo codes. However, LDPC codes are not a complete replacement: turbo codes are the best solution at lower code rates (e.g. 1/6, 1/3, 1/2).[40][41]
See also
People
Theory
Applications
- G.hn/G.9960 (ITU-T Standard for networking over power lines, phone lines and coaxial cable)
- 802.3an or 10GBASE-T (10 gigabit/s Ethernet over twisted pair)
- CMMB (China Multimedia Mobile Broadcasting)
- DVB-S2 / DVB-T2 / DVB-C2 (digital video broadcasting, 2nd generation)
- DMB-T/H (digital video broadcasting)[42]
- WiMAX (IEEE 802.16e standard for microwave communications)
- IEEE 802.11n-2009 (Wi-Fi standard)
- DOCSIS 3.1
- ATSC 3.0 (Next generation North America digital terrestrial broadcasting)
- 3GPP (5G-NR data channel)
Other capacity-approaching codes
- Fountain codes
- LT codes
- Online codes
- Raptor codes
- Repeat-accumulate codes (a class of simple turbo codes)
- Serial concatenated convolutional codes
- Tornado codes (LDPC codes designed for erasure decoding)
- Turbo codes
Capacity-achieving codes
So far there is only one capacity-achieving code by design and proof: the polar code.
References
- ^ "Turbo Codes Explained: History, Examples, and Applications - IEEE Spectrum". spectrum.ieee.org. Retrieved December 18, 2024.
- ^ Richardson, T.J.; Shokrollahi, M.A.; Urbanke, R.L. (2001). "Design of capacity-approaching irregular low-density parity-check codes". IEEE Transactions on Information Theory. 47 (2): 619–637. Bibcode:2001ITIT...47..619R. doi:10.1109/18.910578.
- ^ Amin Shokrollahi, LDPC Codes: An Introduction (PDF), archived from the original (PDF) on May 17, 2017
- ^ Gallager, Robert G. (1960). Low density parity check codes (PDF) (Ph.D thesis). Massachusetts Institute of Technology.
- ^ Hardesty, L. (January 21, 2010). "Explained: Gallager codes". MIT News. Retrieved August 7, 2013.
- ^ a b Gallager, R.G. (January 1962). "Low density parity check codes". IRE Trans. Inf. Theory. 8 (1): 21–28. doi:10.1109/TIT.1962.1057683. hdl:1721.1/11804/32786367-MIT. S2CID 260490814.
- ^ a b MacKay, David JC; Neal, Radford M (1996). "Near Shannon limit performance of low density parity check codes" (PDF). Electronics Letters. 32 (18). IET: 1645–1646. Bibcode:1996ElL....32.1645M. doi:10.1049/el:19961141.
- ^ Erico Guizzo (March 1, 2004). "CLOSING IN ON THE PERFECT CODE". IEEE Spectrum. Archived from the original on September 2, 2021. "Another advantage, perhaps the biggest of all, is that the LDPC patents have expired, so companies can use them without having to pay for intellectual-property rights."
- ^ Telemetry Data Decoding, Design Handbook
- ^ US 5446747
- ^ Mackenzie, D. (July 9, 2005). "Communication speed nears terminal velocity". New Scientist.
- ^ Richardson, T.J.; Shokrollahi, M.A.; Urbanke, R.L. (2001). "Design of capacity-approaching irregular low-density parity-check codes". IEEE Transactions on Information Theory. 47 (2): 619–637. Bibcode:2001ITIT...47..619R. doi:10.1109/18.910578.
- ^ Mosheiff, J.; Resch, N.; Ron-Zewi, N.; Silas, S.; Wootters, M. (2020). "Low-Density Parity-Check Codes Achieve List-Decoding Capacity". SIAM Journal on Computing. 53 (FOCS 2020): 38–73. arXiv:1909.06430. doi:10.1137/20M1365934. S2CID 244549036.
- ^ Gottesman, Daniel (2014). "Fault-tolerant quantum computation with constant overhead". Quantum Information and Computation. 14 (15–16). Rinton Press: 1338–1371. Bibcode:2014QuInf..14.1338G. ISSN 1533-7146.
- ^ Breuckmann, Nikolas P.; Eberhardt, Jens Niklas (October 11, 2021). "Quantum Low-Density Parity-Check Codes". PRX Quantum. 2 (4) 040101. Bibcode:2021PRXQ....2d0101B. doi:10.1103/PRXQuantum.2.040101. ISSN 2691-3399.
- ^ Ruiz, Diego; Guillaud, Jérémie; Leverrier, Anthony; Mirrahimi, Mazyar; Vuillot, Christophe (January 26, 2025). "LDPC-cat codes for low-overhead quantum computing in 2D". Nature Communications. 16 1040. Bibcode:2025NatCo..16.1040R. doi:10.1038/s41467-025-52400-9. ISSN 2041-1723.
- ^ Pecorari, Laura; Jandura, Sven; Brennen, Gavin K.; Pupillo, Guido (January 28, 2025). "High-rate quantum LDPC codes for long-range-connected neutral atom registers". Nature Communications. 16 1111. Bibcode:2025NatCo..16.1111P. doi:10.1038/s41467-025-52500-6. ISSN 2041-1723.
- ^ Presentation by Hughes Systems Archived 2006-10-08 at the Wayback Machine
- ^ HomePNA Blog: G.hn, a PHY For All Seasons
- ^ IEEE Communications Magazine paper on G.hn Archived 2009-12-13 at the Wayback Machine
- ^ IEEE Standard, section 20.3.11.6 "802.11n-2009", IEEE, October 29, 2009, accessed March 21, 2011.
- ^ "IEEE SA - IEEE 802.11ax-2021". IEEE Standards Association. Retrieved May 22, 2022.
- ^ Chih-Yuan Yang, Mong-Kai Ku. http://123seminarsonly.com/Seminar-Reports/029/26540350-Ldpc-Coded-Ofdm-Modulation.pdf "LDPC coded OFDM modulation for high spectral efficiency transmission"
- ^ Nick Wells. "DVB-T2 in relation to the DVB-x2 Family of Standards" Archived 2013-05-26 at the Wayback Machine
- ^ "5G Channel Coding" (PDF). Archived from the original (PDF) on December 6, 2018. Retrieved January 6, 2019.
- ^ Maunder, Robert (September 2016). "A Vision for 5G Channel Coding" (PDF). Archived from the original (PDF) on December 6, 2018. Retrieved January 6, 2019.
- ^ Kai Zhao; Wenzhe Zhao; Hongbin Sun; Tong Zhang; Xiaodong Zhang; Nanning Zheng (2013). LDPC-in-SSD: Making Advanced Error Correction Codes Work Effectively in Solid State Drives (PDF). FAST' 13. pp. 243–256.
- ^ "Soft-Decoding in LDPC based SSD Controllers". EE Times. 2015.
- ^ Robert McEliece, E. R. Berlekamp and H. Van Tilborg (1978). "On the Inherent Intractability of Certain Coding Problems". IEEE Trans. Inf. Theory. IEEE: 384–386. doi:10.1109/TIT.1978.1055873.
- ^ a b Casado, A.I.V.; Griot, M.; Wesel, R.D. (2007). Informed Dynamic Scheduling for Belief-Propagation Decoding of LDPC Codes. 2007 IEEE International Conference on Communications, Glasgow, UK. pp. 932–7. arXiv:cs/0702111. doi:10.1109/ICC.2007.158.
- ^ Richardson, T. (October 2003). "Error floors of LDPC codes" (PDF). Proceedings of the Annual Allerton Conference on Communication Control and Computing. 41 (3): 1426–35. ISSN 0732-6181.
- ^ Richardson, T.J.; Shokrollahi, M.A.; Urbanke, R.L. (February 2001). "Design of capacity-approaching irregular low-density parity-check codes". IEEE Transactions on Information Theory. 47 (2): 619–637. Bibcode:2001ITIT...47..619R. doi:10.1109/18.910578.
- ^ Richardson, T.J.; Urbanke, R.L. (February 2001). "Efficient encoding of low-density parity-check codes". IEEE Transactions on Information Theory. 47 (2): 638–656. Bibcode:2001ITIT...47..638R. doi:10.1109/18.910579.
- ^ Ahmad Darabiha, Anthony Chan Carusone, Frank R. Kschischang. "Power Reduction Techniques for LDPC Decoders"
- ^ Zhang, Z.; Anantharam, V.; Wainwright, M.J.; Nikolic, B. (April 2010). "An Efficient 10GBASE-T Ethernet LDPC Decoder Design With Low Error Floors" (PDF). IEEE Journal of Solid-State Circuits. 45 (4): 843–855. Bibcode:2010IJSSC..45..843Z. doi:10.1109/JSSC.2010.2042255. S2CID 10431486.
- ^ Kou, Y.; Lin, S.; Fossorier, M.P.C. (November 2001). "Low-density parity-check codes based on finite geometries: a rediscovery and new results". IEEE Transactions on Information Theory. 47 (7): 2711–36. Bibcode:2001ITIT...47.2711K. CiteSeerX 10.1.1.100.3023. doi:10.1109/18.959255.
- ^ Tahir, B.; Schwarz, S.; Rupp, M. (2017). BER comparison between Convolutional, Turbo, LDPC, and Polar codes. 2017 24th International Conference on Telecommunications (ICT), Limassol, Cyprus. pp. 1–7. doi:10.1109/ICT.2017.7998249.
- ^ Moon Todd, K. (2005). Error correction coding: mathematical methods and algorithms. Wiley. p. 614. ISBN 0-471-64800-0.
- ^ Moon Todd 2005, p. 653
- ^ Andrews, Kenneth S., et al. "The development of turbo and LDPC codes for deep-space applications." Proceedings of the IEEE 95.11 (2007): 2142-2156.
- ^ Hassan, A.E.S., Dessouky, M., Abou Elazm, A. and Shokair, M., 2012. Evaluation of complexity versus performance for turbo code and LDPC under different code rates. Proc. SPACOMM, pp.93-98.
- ^ Dill, Jeffery; Li, Huanlin (June 27, 2011). Cao, Yanyan (ed.). "Studies on Lowering the Error Floors of Finite Length LDPC codes". IEEE. Ohio University, Electrical Engineering (Engineering and Technology). Archived from the original on December 12, 2009.
External links
- Introducing Low-Density Parity-Check Codes (by Sarah J Johnson, 2010)
- LDPC Codes – a brief Tutorial (by Bernhard Leiner, 2005)
- LDPC Codes (TU Wien) Archived February 28, 2019, at the Wayback Machine
- MacKay, David J.C. (September 25, 2003). "47. Low-Density Parity-Check Codes". Information Theory, Inference, and Learning Algorithms. Cambridge University Press. pp. 557–573. ISBN 978-0-521-64298-9.
- Guruswami, Venkatesan (2006). "Iterative Decoding of Low-Density Parity Check Codes". arXiv:cs/0610022.
- LDPC Codes: An Introduction (by Amin Shokrollahi, 2003)
- Belief-Propagation Decoding of LDPC Codes (by Amir Bennatan, Princeton University)
- Turbo and LDPC Codes: Implementation, Simulation, and Standardization (West Virginia University)
- Information theory and coding (Marko Hennhöfer, 2011, TU Ilmenau) - discusses LDPC codes at pages 74–78.
- LDPC codes and performance results
- DVB-S.2 Link, Including LDPC Coding (MatLab)
- Source code for encoding, decoding, and simulating LDPC codes is available from a variety of locations:
- Binary LDPC codes in C
- Binary LDPC codes for Python (core algorithm in C)
- LDPC encoder and LDPC decoder in MATLAB
- A Fast Forward Error Correction Toolbox (AFF3CT) in C++11 for fast LDPC simulations
Fundamentals
Definition and Basic Properties
Low-density parity-check (LDPC) codes are a class of linear block codes over the binary field GF(2), where the codewords form a linear subspace of dimension k in the vector space GF(2)^n.[1] These codes are defined by a set of linear parity-check equations that the codewords must satisfy, without specifying a generator matrix explicitly.[6] In general, a linear block code of length n and dimension k has a rate R = k/n and can correct up to ⌊(d − 1)/2⌋ errors, where d is the minimum Hamming distance between any two distinct codewords.[7] An LDPC code is specified by an (n − k) × n parity-check matrix H that is sparse, meaning it contains a low density of nonzero entries (typically less than 0.01 for practical codes with large n).[1][6]

The rows of H correspond to parity-check equations, and the code rate is R = k/n. A vector c is a valid codeword if and only if it satisfies Hcᵀ = 0 (modulo 2).[1] For a received vector y = c + e (where e is the error vector), the syndrome is computed as s = Hyᵀ (modulo 2); a zero syndrome indicates no detectable errors, while a nonzero syndrome signals the presence of errors.[6]

The minimum distance d of an LDPC code, which determines its error-correcting capability, is closely tied to structural properties of the underlying bipartite Tanner graph, such as its girth (the length of the shortest cycle) and expansion (the growth rate of neighborhoods in the graph).[6] Larger girth helps avoid short cycles that degrade performance, providing lower bounds on d, while good expansion ensures reliable error correction for a fraction of errors up to the code's design threshold.[6] The Tanner graph representation, with variable nodes for code bits and check nodes for parity equations connected by edges corresponding to nonzero entries in H, illustrates these graph-theoretic features.[6]

LDPC codes exhibit capacity-approaching behavior when decoded using iterative message-passing algorithms, achieving rates arbitrarily close to the Shannon limit on the binary symmetric channel and binary erasure channel for sufficiently large block lengths.[1][7] This performance stems from the sparsity of H, which allows efficient probabilistic decoding that converges to near-optimal maximum-likelihood results in many cases.[7]

Parity-Check Matrix and Tanner Graph
The parity-check matrix H of a low-density parity-check (LDPC) code is an m × n binary matrix, where each of the m rows corresponds to a parity-check equation that the codeword must satisfy, and each of the n columns corresponds to one of the n bits in the codeword.[1] The codewords are those vectors c in GF(2)^n satisfying Hcᵀ = 0, ensuring the code rate is at least 1 − m/n.[8] The "low-density" property refers to the sparsity of H, where the density of 1s is very small, typically on the order of 1% or less, meaning each row and each column contains only a small number of 1s relative to n and m.[1] This sparsity is quantified by the row weight (number of 1s per row, denoted d_c for the check node degree) and column weight (number of 1s per column, denoted d_v for the variable node degree), which are fixed in regular LDPC codes but can vary in irregular ones.[8]

The Tanner graph provides a graphical representation of the parity-check matrix H, introduced as a bipartite graph model for constraint-based codes.[9] It consists of two sets of nodes: variable nodes (one for each codeword bit) on one side and check nodes (one for each parity-check equation) on the other, with an edge connecting variable node j to check node i if and only if H_ij = 1.[8] This structure visually encodes the linear constraints imposed by H, facilitating analysis of the code's dependencies.

In the Tanner graph, cycles are closed paths of even length (since the graph is bipartite), and the girth is defined as the length of the shortest such cycle, with the minimum possible girth being 4.[9] Short cycles, particularly those of length 4 or 6, can degrade decoding performance by introducing correlations that hinder the propagation of independent information during iterative processing, while larger girth generally improves error-correction capability by enhancing the code's minimum distance.[9] In regular LDPC codes, all variable nodes have the same degree d_v and all check nodes have the same degree d_c, leading to uniform connectivity, whereas irregular codes allow varying degrees to optimize performance metrics like threshold or error floor.[8]

For illustration, a small irregular LDPC code can be drawn as a Tanner graph in which the variable nodes are connected to the check nodes according to the 1s of its parity-check matrix; with column weights ranging from 1 to 2, such a graph is irregular, and it contains cycles whose lengths influence the code's structural properties.[10]

Historical Development
Early Invention
Low-density parity-check (LDPC) codes were first invented by Robert G. Gallager in his doctoral dissertation at the Massachusetts Institute of Technology, completed in 1960.[11] In this work, Gallager proposed these codes as a class of linear block codes defined by sparse parity-check matrices, where each row and column has a low, fixed number of nonzero entries.[1] The motivation stemmed from the desire to create error-correcting codes that could approach the Shannon limit while enabling simpler decoding through the sparsity, which reduces the complexity of processing interdependencies in the parity checks compared to denser matrices.[1]

Gallager expanded on these ideas in his seminal 1962 paper, providing a formal definition and analysis of LDPC codes.[1] A key contribution was the theoretical proof that ensembles of random LDPC codes, under maximum-likelihood decoding, exhibit an error exponent matching that of completely random codes, thereby achieving performance approaching channel capacity as block lengths grow large.[1] Additionally, he introduced simple iterative decoding algorithms based on probabilistic message-passing between variable and check nodes, which approximate the maximum-likelihood solution with significantly lower computational demands.[1]

Gallager's LDPC codes represented the earliest known examples of error-correcting codes with decoding thresholds close to the Shannon limit for binary symmetric channels.[1] Initial computer simulations in the early 1960s demonstrated their strong performance, with error probabilities dropping exponentially for code rates below capacity at moderate block lengths, outperforming contemporaneous codes like BCH.[1] However, despite these promising results, the codes were largely overlooked for decades due to the era's computational limitations, which made the iterative decoding impractical for real-time applications on available hardware. This neglect persisted until the 1990s; the codes' invention had predated that of turbo codes by over three decades.[12]

Modern Rediscovery and Advances
Low-density parity-check (LDPC) codes experienced a significant revival in the late 1990s after decades of obscurity following their initial proposal. In 1997, David J.C. MacKay and Radford M. Neal independently rediscovered Gallager's codes and demonstrated through simulations that LDPC codes, when decoded using belief propagation, could achieve bit error rates near the Shannon limit on binary symmetric and additive white Gaussian noise channels for moderate block lengths. This work highlighted the practical potential of iterative decoding, sparking renewed interest in sparse graph-based codes. Concurrently, in 1997, Michael Luby, Michael Mitzenmacher, Amin Shokrollahi, and Daniel Spielman introduced Tornado codes as a class of highly irregular LDPC codes tailored for efficient erasure correction in multicast scenarios, achieving low encoding and decoding complexity while approaching capacity on binary erasure channels. Key theoretical and design advances followed in the early 2000s, enabling LDPC codes to approach channel capacity more reliably. In 2001, Tom Richardson, Amin Shokrollahi, and Ruediger Urbanke proposed irregular LDPC codes, where variable and check node degrees are non-uniformly distributed to optimize performance; these codes improved decoding thresholds by up to 1 dB closer to capacity compared to regular counterparts on binary-input symmetric channels.[7] The same year, Richardson and Urbanke developed density evolution, an analytical tool that tracks the evolution of message distributions during iterative decoding, allowing precise prediction of code thresholds and guiding the optimization of degree distributions for various channels. These innovations established LDPC codes as capacity-achieving under message-passing decoding, with stability conditions ensuring convergence to low error rates.[13] The practical impact of these rediscoveries materialized through widespread adoption in communication standards. LDPC codes were standardized in the DVB-S2 satellite broadcasting system in 2005, providing rates from 1/4 to 9/10 with performance within 0.7 dB of Shannon limits at a bit error rate of 10^{-5}. They became optional in IEEE 802.11n Wi-Fi (2009) for high-throughput modes, enhancing reliability at data rates up to 600 Mbps, and were integrated into IEEE 802.11ac/ax for even higher speeds. In 5G New Radio (NR), released in 2018, LDPC codes replaced turbo codes for data channels, supporting block lengths up to 8448 bits and rates from 1/3 to 8/9 with low latency decoding suitable for enhanced mobile broadband. In the 2020s, LDPC research has extended to non-binary fields and quantum settings, alongside hardware optimizations for emerging standards. Non-binary LDPC codes over finite fields like GF(256) have gained traction for short-to-medium block lengths in ultra-reliable low-latency communications, offering 1-2 dB gains over binary versions due to better minimum distance properties. Quantum LDPC codes, adapted for stabilizer codes, have advanced with constructions achieving constant relative overhead (e.g., O(1) physical qubits per logical qubit) while tolerating error rates above 1%, as shown in lifted product constructions. For 6G proposals, enhanced LDPC variants like partially information-coupled designs target extreme reliability (error rates below 10^{-9}) and massive connectivity. 
Hardware accelerations, particularly FPGA implementations, have enabled real-time decoding at throughputs exceeding 1 Gbps for 5G NR codes, reducing power consumption by up to 50% through partially parallel architectures.

Code Design and Construction
Matrix Construction Methods
Low-density parity-check (LDPC) codes are defined by their sparse parity-check matrix H, and various methods have been developed to construct H such that the resulting code exhibits strong error-correcting performance. The original construction method, proposed by Robert G. Gallager, generates regular LDPC codes by ensuring fixed column and row weights while maintaining sparsity.[1] In Gallager's approach for a (d_v, d_c)-regular code (where d_v is the column weight and d_c is the row weight), the m rows of H (with n the block length and R the rate) are partitioned into d_v equal-sized subsets, each containing m/d_v rows.[14] For each subset, exactly one 1 is placed in every column, with positions chosen randomly across the rows in that subset; to achieve the target row weight d_c, this submatrix effectively comprises disjoint permutation matrices summed together, ensuring no multiple 1's in the same row-column pair within the subset.[1] The columns are then randomly permuted within each subset to distribute the 1's evenly and reduce the likelihood of short cycles in the corresponding Tanner graph.[14] This method produces codes with good average performance for large block lengths but can introduce short cycles in smaller codes due to the random placement.[1]

A more refined general method is the progressive edge-growth (PEG) algorithm, which constructs the Tanner graph representation of H by incrementally adding edges in a greedy manner to locally maximize the girth (the length of the shortest cycle). Introduced by Hu, Eleftheriou, and Arnold, the PEG algorithm begins with isolated variable and check nodes corresponding to the desired degrees, then adds edges one at a time, starting from the highest-degree nodes. For each new edge connecting a variable node to an available check node, it selects the check node that results in the longest possible shortest cycle involving the local neighborhood of that variable node, computed via breadth-first search up to a depth related to the target girth.[15] This process continues until all edges are placed, yielding a parity-check matrix with improved cycle distribution compared to purely random methods, particularly for short-to-moderate block lengths. Simulation results demonstrate that PEG-constructed codes achieve thresholds close to density-evolution limits with fewer iterations in belief propagation decoding.[15]

Structured constructions offer deterministic and hardware-efficient alternatives to random methods. Protograph-based LDPC codes start with a small protograph—a compact base bipartite graph with a few variable and check node types—and expand it by replacing each edge with a bundle of parallel edges, typically implemented as permutation matrices to preserve sparsity. A prominent example is the accumulate-repeat-accumulate (ARA) code family, developed by Divsalar, Dolinar, and Thorpe, where the protograph incorporates accumulator nodes for rate control and repetition for diversity, resulting in linear-time encodable LDPC codes with thresholds near the Shannon limit.[16] Quasi-cyclic (QC) LDPC codes further enhance structure by composing H from circulant permutation matrices (cyclic shifts of identity matrices), which facilitate parallel hardware implementations and memory-efficient storage.[17] These are particularly adopted in standards like 5G New Radio (NR), where QC-LDPC codes for the physical downlink shared channel support high-throughput rates from 1/3 to 8/9, with base matrices optimized for low latency and flexibility across lifting sizes.
In 5G NR, the parity-check matrix is built from a quasi-cyclic structure with two submatrices for systematic encoding, ensuring minimal short cycles through careful shift selections.[18] Key optimization criteria in matrix construction focus on graph properties that enhance decoding convergence. Avoiding short cycles (girth at least 6 or higher) prevents premature convergence in message-passing decoders by reducing correlations in iterative updates on the Tanner graph. Good expansion properties ensure that small sets of variable nodes connect to sufficiently many check nodes, promoting reliable information propagation and approaching capacity. Density evolution serves as a primary validation tool, tracking the probability distribution of log-likelihood ratios during decoding iterations in the asymptotic block-length limit to compute the noise threshold beyond which errors vanish; constructions are refined (e.g., via degree distribution optimization or cycle removal) until this threshold meets or exceeds target performance.

As an illustrative example, consider step-by-step construction of a small (3,6)-regular LDPC code with block length n = 12 (m = 6 parity checks, rate 1/2) using Gallager's method. Divide the 6 rows into 3 subsets, each with 2 rows (subsets A: rows 1-2, B: rows 3-4, C: rows 5-6). For subset A, partition the 12 columns into two groups of 6 (e.g., columns 1-6 to row 1, 7-12 to row 2) and place 1's in those positions, ensuring one 1 per column; this forms a 2×12 submatrix with row weight 6. Randomly permute the columns within subset A to decorrelate positions (e.g., swap columns 3 and 8). Repeat for subset B: assign another partitioning (e.g., columns 1-3 and 7-9 to row 3, 4-6 and 10-12 to row 4), place 1's, and permute (e.g., cycle shift by 2). For subset C, use a different random partitioning and permutation. The full H combines these submatrices vertically, resulting in a sparse matrix where each column has exactly three 1's (one per subset) and each row has six 1's, with the permutations helping to elongate cycles. For instance, after permutations, a partial H might appear as:

H = [
[1,1,1,1,1,1,0,0,0,0,0,0], // row 1 (subset A, permuted)
[0,0,0,0,0,0,1,1,1,1,1,1], // row 2
[1,0,1,0,1,0,1,0,1,0,1,0], // row 3 (subset B, example perm)
[0,1,0,1,0,1,0,1,0,1,0,1], // row 4
[1,1,0,0,1,1,0,0,1,1,0,0], // row 5 (subset C, example)
[0,0,1,1,0,0,1,1,0,0,1,1] // row 6
]
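A Gallager-style band construction of this sort can also be sketched in a few lines of Python; this is a simplified illustration of the procedure described above (one column-weight-1 band, stacked with randomly column-permuted copies), not a production code-design tool.

```python
import numpy as np

def gallager_regular_H(n, dv, dc, seed=0):
    """Random (dv, dc)-regular parity-check matrix via Gallager's band construction.
    Requires n divisible by dc; returns an (n*dv/dc) x n binary matrix."""
    rng = np.random.default_rng(seed)
    rows_per_band = n // dc
    # First band: row i has 1s in columns i*dc .. (i+1)*dc - 1.
    band = np.zeros((rows_per_band, n), dtype=int)
    for i in range(rows_per_band):
        band[i, i * dc:(i + 1) * dc] = 1
    # Remaining bands are random column permutations of the first band.
    bands = [band] + [band[:, rng.permutation(n)] for _ in range(dv - 1)]
    return np.vstack(bands)

H = gallager_regular_H(n=12, dv=3, dc=6)
print(H.sum(axis=0))   # every column weight is 3
print(H.sum(axis=1))   # every row weight is 6
```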
Regular and Irregular Codes
Low-density parity-check (LDPC) codes are classified into regular and irregular variants based on the uniformity of degrees in their associated bipartite Tanner graphs. Regular LDPC codes maintain a fixed variable node degree d_v and a fixed check node degree d_c for all nodes, resulting in a parity-check matrix where each column has exactly d_v ones and each row has exactly d_c ones. A canonical example is the (3,6)-regular LDPC code, where variable nodes connect to three checks and checks connect to six variables, promoting balanced message propagation during decoding.[19] This regularity simplifies matrix construction—often via progressive edge growth or cyclic shifts—and reduces decoding complexity, as all nodes follow identical update rules.[19] However, such codes typically yield performance thresholds that lag behind channel capacity by several decibels, limiting their efficiency in capacity-approaching scenarios.[19]

Irregular LDPC codes, by contrast, permit varying degrees among variable and check nodes, allowing tailored distributions that enhance overall error-correcting capability. These are defined by the variable degree distribution polynomial λ(x) = Σ_i λ_i x^(i−1), where λ_i denotes the fraction of edges connected to degree-i variable nodes, and the check degree distribution ρ(x), defined analogously for checks.[7] Optimization of λ(x) and ρ(x) occurs through methods like density evolution-guided linear programming, which minimizes bit error rates by concentrating low-degree variable nodes for reliable initial messages and higher-degree checks for constraint enforcement.[7] This irregularity produces steeper waterfall performance—rapid error rate drops at moderate signal-to-noise ratios—and suppresses error floors by mitigating trapping sets, outperforming regular codes in both aspects.[7]

When comparing the two, irregular LDPC codes achieve error rates substantially closer to Shannon capacity limits, often within 0.1 dB for optimized designs, but incur higher decoding complexity from heterogeneous node processing and larger memory requirements for variable schedules.[7] The key metric distinguishing their efficacy is the threshold ε*, the highest channel noise level (e.g., erasure probability on the binary erasure channel) under which density evolution predicts vanishing error probability for infinite block lengths. For the binary erasure channel, ε* is the supremum of erasure probabilities ε for which the density evolution recursion converges to zero:

x_(l+1) = ε · λ(1 − ρ(1 − x_l)),

with initial condition x_0 = ε, where x_l tracks the average erasure probability after l iterations.[7] Irregular distributions can push ε* near the capacity threshold of 1 − R (for rate R), while regular ones plateau lower.[7]

Irregular LDPC codes address limitations in outdated regular designs by enabling modern high-throughput applications; for instance, quasi-cyclic irregular constructions are standardized in IEEE 802.11ax (Wi-Fi 6) and IEEE 802.11be (Wi-Fi 7) to deliver robust error correction at multi-gigabit rates over wireless channels.[20]
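The binary-erasure-channel recursion above is simple enough to evaluate numerically. The Python sketch below bisects for the threshold of the (3,6)-regular ensemble, for which λ(x) = x² and ρ(x) = x⁵; the well-known answer is approximately 0.429. It is a coarse illustration, not an optimized density-evolution tool.

```python
def converges(eps, lam, rho, iters=2000, tol=1e-9):
    """Run the BEC density-evolution recursion x <- eps * lam(1 - rho(1 - x))."""
    x = eps
    for _ in range(iters):
        x = eps * lam(1 - rho(1 - x))
        if x < tol:
            return True
    return False

def threshold(lam, rho, lo=0.0, hi=1.0, steps=40):
    """Bisect for the largest erasure probability with vanishing error."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if converges(mid, lam, rho) else (lo, mid)
    return lo

# (3,6)-regular ensemble: lambda(x) = x^2, rho(x) = x^5.
print(threshold(lambda x: x**2, lambda x: x**5))   # approximately 0.429
```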
Encoding Techniques
General Encoding Algorithms
Low-density parity-check (LDPC) codes are linear block codes defined by a sparse parity-check matrix H of size m × n, where n is the code length and m = n − k, with k being the dimension, assuming H has full row rank. Encoding involves generating a codeword c = [u | p] from an information vector u of length k, such that Hcᵀ = 0, where p is the parity vector of length m.[21] In the systematic form, the parity-check matrix is partitioned as H = [H_u | H_p], with H_u of size m × k corresponding to information bits and H_p of size m × m to parity bits. This leads to the linear system H_p pᵀ = H_u uᵀ, which is solved for p. The requirement of full row rank in H ensures the code has the desired dimension and that H_p is invertible.[21]

A straightforward approach to encoding uses the systematic generator matrix G = [I_k | P], where P = (H_p⁻¹ H_u)ᵀ (modulo 2 for binary codes), yielding c = uG. However, since H is sparse but G is typically dense, this results in quadratic complexity O(n²), which is inefficient for large n. Direct inversion of H_p via Gaussian elimination also incurs cubic cost, making it impractical for long codes.[21]

To exploit the sparsity of H, efficient algorithms avoid explicit construction of the dense generator matrix. One seminal method, developed by Richardson and Urbanke, transforms H into an approximate lower triangular form through column permutations, partitioning it into submatrices where a core block is lower triangular. Parity bits are then computed sequentially via back-substitution, solving the system row by row while leveraging sparsity to minimize operations. This achieves near-linear complexity O(n + g²), where g is a small gap parameter measuring the approximation quality, often approaching O(n) for well-designed codes. For general sparse H, the remaining work scales with the number of nonzero entries of H, i.e., with n times the average column weight (variable node degree).[21]

For structured LDPC codes, such as quasi-cyclic constructions where H comprises circulant permutation matrices, encoding can be further optimized to strict linear time O(n). These methods use cyclic shifts and precomputed scalar multiplications over finite fields (or modulo 2 for binary cases), avoiding matrix-vector multiplications altogether and enabling hardware-efficient implementations via shift registers.
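For a small, unstructured H, the straightforward quadratic-cost route of deriving a systematic generator matrix can be sketched in Python as follows; this illustrates the naive approach discussed above (Gaussian elimination over GF(2)), not the Richardson–Urbanke linear-time method, and it assumes the last m columns of H can be reduced to the identity without column permutations.

```python
import numpy as np

def systematic_G(H):
    """Gaussian-eliminate H over GF(2) into [P | I_m] and return G = [I_k | P^T].
    Assumes H has full row rank and its last m columns form an invertible block."""
    H = np.array(H) % 2
    m, n = H.shape
    k = n - m
    for i in range(m):
        pivot_col = k + i
        pivot = next(r for r in range(i, m) if H[r, pivot_col])  # find a pivot row
        H[[i, pivot]] = H[[pivot, i]]
        for r in range(m):
            if r != i and H[r, pivot_col]:
                H[r] ^= H[i]                       # eliminate over GF(2)
    P = H[:, :k]                                   # H is now [P | I_m]
    return np.hstack([np.eye(k, dtype=int), P.T])  # G = [I_k | P^T]

H = [[1, 1, 1, 1, 0, 0],
     [0, 0, 1, 1, 0, 1],
     [1, 0, 0, 1, 1, 0]]
G = systematic_G(H)
print(G)
print((G @ np.array(H).T) % 2)   # all zeros: every row of G satisfies H
```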
Practical Encoder Example

To illustrate the encoding process for a low-density parity-check (LDPC) code, consider the (7,4) Hamming code as a simple example, where the parity-check matrix is sparse with a density of approximately 0.57 ones per entry, qualifying it as a trivial LDPC code.[22] The code has length n = 7, dimension k = 4, and minimum distance d = 3, with the parity-check matrix H defined over GF(2) as follows:[23]

H = [H1 | H2] =
[1 1 0 1 | 1 0 0]
[1 0 1 1 | 0 1 0]
[0 1 1 1 | 0 0 1]

This matrix can be partitioned as H = [H1 | H2], where H1 is the 3×4 submatrix corresponding to the information bits and H2 is the 3×3 submatrix for the parity bits, assuming a systematic encoding where the codeword c = [u | p] with information vector u ∈ GF(2)⁴ and parity vector p ∈ GF(2)³. For u = [1, 0, 1, 1]ᵀ, the parity bits are computed by solving the linear system H2 p = H1 u (mod 2). First, compute the syndrome-like vector s = H1 u (mod 2):

s = [0, 1, 0]ᵀ.

Next, solve H2 p = s (mod 2), where H2 is the 3×3 identity matrix, so p = s. The solution yields p = [0, 1, 0]ᵀ, so the codeword is c = [1, 0, 1, 1, 0, 1, 0].[24] To verify, compute H cᵀ = 0 (mod 2), confirming c lies in the code:

H cᵀ = [0, 0, 0]ᵀ.[25]

The following table summarizes the matrices and vectors in this example:

| Component | Value |
|---|---|
| H1 | [[1,1,0,1], [1,0,1,1], [0,1,1,1]] |
| H2 | [[1,0,0], [0,1,0], [0,0,1]] |
| u | [1, 0, 1, 1] |
| p | [0, 1, 0] |
| c | [1, 0, 1, 1, 0, 1, 0] |
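The arithmetic of this example can be reproduced with a few lines of Python, assuming the matrices tabulated above:

```python
import numpy as np

H1 = np.array([[1, 1, 0, 1],
               [1, 0, 1, 1],
               [0, 1, 1, 1]])
H2 = np.eye(3, dtype=int)                 # identity block multiplying the parity bits
H = np.hstack([H1, H2])                   # H = [H1 | H2]

u = np.array([1, 0, 1, 1])                # information bits
s = (H1 @ u) % 2                          # s = H1 * u (mod 2)
p = s.copy()                              # H2 is the identity here, so H2 * p = s gives p = s
c = np.concatenate([u, p])                # systematic codeword [u | p]

print(p)             # [0 1 0]
print(c)             # [1 0 1 1 0 1 0]
print((H @ c) % 2)   # [0 0 0] -> c satisfies every parity check
```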
Decoding Algorithms
Belief Propagation Decoding
Belief propagation (BP) decoding is a soft-decision iterative algorithm for low-density parity-check (LDPC) codes, operating on the Tanner graph representation of the code to exchange probabilistic messages between variable and check nodes. It initializes messages from variable nodes based on the channel outputs, typically represented as log-likelihood ratios (LLRs), where the LLR for a bit x_i is L(x_i) = ln[ P(x_i = 0 | y_i) / P(x_i = 1 | y_i) ], with y_i denoting the received signal. This approach leverages the sparse structure of the parity-check matrix to approximate maximum a posteriori decoding efficiently.

The algorithm proceeds in iterations, alternating between variable-to-check (V2C) and check-to-variable (C2V) message updates until convergence is achieved—such as when all parity checks are satisfied—or a maximum number of iterations is reached. In the sum-product variant, the core of BP, V2C messages convey extrinsic information excluding the target check node, while C2V messages enforce parity constraints. Practical implementations often employ the min-sum approximation to the sum-product algorithm, which replaces nonlinear operations with simpler min and sign computations to reduce complexity while maintaining near-optimal performance.

Messages are defined in terms of LLRs for numerical stability. The total LLR at a variable node v is computed as

L(v) = L_ch(v) + Σ_{c ∈ N(v)} m_{c→v},

where N(v) is the set of neighboring check nodes, L_ch(v) is the channel LLR, and the m_{c→v} are incoming C2V messages; the outgoing V2C message to a specific check excludes that check's incoming message to provide extrinsic information. For the C2V message from check node c to variable v, the sum-product update uses the hyperbolic tangent rule:

m_{c→v} = 2 tanh⁻¹( Π_{v′ ∈ N(c)\{v}} tanh( m_{v′→c} / 2 ) ),

where N(c) denotes the neighboring variables of c; the min-sum approximation simplifies this to

m_{c→v} ≈ ( Π_{v′ ∈ N(c)\{v}} sign(m_{v′→c}) ) · min_{v′ ∈ N(c)\{v}} |m_{v′→c}|.
After each iteration, hard decisions are made from the total LLRs, and the process repeats. The success of BP decoding depends on the channel noise level relative to the code's design threshold, determined via density evolution analysis, which tracks message distributions over iterations assuming a tree-like graph structure. If the noise (e.g., standard deviation in AWGN channels) is below this threshold, the bit error rate approaches zero as block length increases; otherwise, decoding fails with positive error probability. For well-designed irregular LDPC codes, thresholds can approach the Shannon limit within 0.06 dB.[7]
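As an illustration of these update rules, a compact min-sum decoder for the small (6, 3) example code used earlier in the article might look like the following; it is a didactic Python sketch (no scaling or offset correction, no optimized scheduling), not a production decoder.

```python
import numpy as np

def min_sum_decode(H, llr, max_iter=20):
    """Min-sum belief-propagation decoding for a binary LDPC code.
    H: parity-check matrix (0/1); llr: channel LLRs, positive values favour bit 0."""
    H = np.asarray(H)
    m, n = H.shape
    llr = np.asarray(llr, dtype=float)
    msg_vc = np.tile(llr, (m, 1)) * H            # variable-to-check messages
    hard = (llr < 0).astype(int)
    for _ in range(max_iter):
        # Check-node update: product of signs times minimum of the other magnitudes.
        msg_cv = np.zeros((m, n))
        for i in range(m):
            idx = np.flatnonzero(H[i])
            for j in idx:
                others = [k for k in idx if k != j]
                sign = np.prod(np.sign(msg_vc[i, others]))
                msg_cv[i, j] = sign * np.min(np.abs(msg_vc[i, others]))
        # Variable-node update and tentative hard decision.
        total = llr + msg_cv.sum(axis=0)
        hard = (total < 0).astype(int)
        if not np.any((H @ hard) % 2):           # all parity checks satisfied
            break
        for j in range(n):
            for i in np.flatnonzero(H[:, j]):
                msg_vc[i, j] = total[j] - msg_cv[i, j]   # extrinsic information only
    return hard

H = [[1, 1, 1, 1, 0, 0],
     [0, 0, 1, 1, 0, 1],
     [1, 0, 0, 1, 1, 0]]
# LLRs for the codeword 101011; the first bit is received in error (its LLR favours 0).
llr = [0.5, 2.1, -1.8, 1.9, -2.2, -2.0]
print(min_sum_decode(H, llr))   # [1 0 1 0 1 1]
```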
