Hamming code
| Binary Hamming codes | |
|---|---|
| The Hamming(7,4) code (with r = 3) | |
| Named after | Richard W. Hamming |
| Classification | |
| Type | Linear block code |
| Block length | 2^r − 1 where r ≥ 2 |
| Message length | 2^r − r − 1 |
| Rate | 1 − r/(2^r − 1) |
| Distance | 3 |
| Alphabet size | 2 |
| Notation | [2^r − 1, 2^r − r − 1, 3]_2-code |
| Properties | |
| Perfect code | |
In computer science and telecommunications, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection of uncorrected errors. By contrast, the simple parity code cannot correct errors, and can detect only an odd number of bits in error. Hamming codes are perfect codes, that is, they achieve the highest possible rate for codes with their block length and minimum distance of three.[1] Richard W. Hamming invented Hamming codes in 1950 as a way of automatically correcting errors introduced by punched card readers. In his original paper, Hamming elaborated his general idea, but specifically focused on the Hamming(7,4) code which adds three parity bits to four bits of data.[2]
In mathematical terms, Hamming codes are a class of binary linear code. For each integer r ≥ 2 there is a code with block length n = 2^r − 1 and message length k = 2^r − r − 1. Hence the rate of Hamming codes is R = k / n = 1 − r / (2^r − 1), which is the highest possible for codes with minimum distance of three (i.e., the minimal number of bit changes needed to go from any code word to any other code word is three) and block length 2^r − 1. The parity-check matrix of a Hamming code is constructed by listing all columns of length r that are non-zero, which means that the dual code of the Hamming code is the shortened Hadamard code, also known as a simplex code. The parity-check matrix has the property that any two columns are pairwise linearly independent.
Due to the limited redundancy that Hamming codes add to the data, they can only detect and correct errors when the error rate is low. This is the case in computer memory (usually RAM), where bit errors are extremely rare and Hamming codes are widely used. Memory with this correction system is known as ECC memory. In this context, an extended Hamming code having one extra parity bit is often used. Extended Hamming codes achieve a Hamming distance of four, which allows the decoder to distinguish between the case of a single bit error and the case of a double bit error. In this sense, extended Hamming codes are single-error correcting and double-error detecting, abbreviated as SECDED.
History
Richard Hamming, the inventor of Hamming codes, worked at Bell Labs in the late 1940s on the Bell Model V computer, an electromechanical relay-based machine with cycle times in seconds. Input was fed in on punched paper tape, seven-eighths of an inch wide, which had up to six holes per row. During weekdays, when errors in the relays were detected, the machine would stop and flash lights so that the operators could correct the problem. During after-hours periods and on weekends, when there were no operators, the machine simply moved on to the next job.
Hamming worked on weekends, and grew increasingly frustrated with having to restart his programs from scratch due to detected errors. In a taped interview, Hamming said, "And so I said, 'Damn it, if the machine can detect an error, why can't it locate the position of the error and correct it?'".[3] Over the next few years, he worked on the problem of error-correction, developing an increasingly powerful array of algorithms. In 1950, he published what is now known as Hamming code, which remains in use today in applications such as ECC memory.
Codes predating Hamming
A number of simple error-detecting codes were used before Hamming codes, but none were as effective as Hamming codes for the same overhead of space.
Parity
Parity adds a single bit that indicates whether the number of ones (bit-positions with values of one) in the preceding data was even or odd. If an odd number of bits is changed in transmission, the message will change parity and the error can be detected at this point; however, the bit that changed may have been the parity bit itself. The most common convention is that a parity value of one indicates that there is an odd number of ones in the data, and a parity value of zero indicates that there is an even number of ones. If the number of bits changed is even, the check bit will be valid and the error will not be detected.
Moreover, parity does not indicate which bit contained the error, even when it can detect it. The data must be discarded entirely and re-transmitted from scratch. On a noisy transmission medium, a successful transmission could take a long time or may never occur. However, while the quality of parity checking is poor, since it uses only a single bit, this method results in the least overhead.
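As a concrete illustration, here is a minimal Python sketch of this scheme (the function name and list-of-bits representation are illustrative choices, not from any particular library):

```python
def parity_bit(bits):
    """Even parity: the check bit is 1 exactly when the data holds an odd
    number of ones, so the transmitted word always has an even count."""
    return sum(bits) % 2

data = [1, 0, 1, 1, 0, 1, 0]
sent = data + [parity_bit(data)]
# Receiver-side check: any single flipped bit makes the total count odd.
assert sum(sent) % 2 == 0
```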
Two-out-of-five code
A two-out-of-five code is an encoding scheme which uses five bits consisting of exactly three 0s and two 1s. This provides ten possible combinations, enough to represent the digits 0–9. This scheme can detect all single bit-errors, all odd numbered bit-errors and some even numbered bit-errors (for example the flipping of both 1-bits). However it still cannot correct any of these errors.
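Since the two 1s can occupy any 2 of 5 positions, a short, purely illustrative Python snippet can enumerate the codewords and confirm the count:

```python
from itertools import combinations

# One codeword per way of placing the two 1s among five positions: C(5,2) = 10.
codewords = ["".join("1" if i in ones else "0" for i in range(5))
             for ones in combinations(range(5), 2)]
print(len(codewords))   # 10, one per decimal digit
print(codewords[:3])    # ['11000', '10100', '10010']
```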
Repetition
Another code in use at the time repeated every data bit multiple times in order to ensure that it was sent correctly. For instance, if the data bit to be sent is a 1, an n = 3 repetition code will send 111. If the three bits received are not identical, an error occurred during transmission. If the channel is clean enough, most of the time only one bit will change in each triple. Therefore, 001, 010, and 100 each correspond to a 0 bit, while 110, 101, and 011 correspond to a 1 bit, with the greater quantity of digits that are the same ('0' or a '1') indicating what the data bit should be. A code with this ability to reconstruct the original message in the presence of errors is known as an error-correcting code. This triple repetition code is a Hamming code with m = 2, since there are two parity bits, and 2^2 − 2 − 1 = 1 data bit.
Such codes cannot correctly repair all errors, however. In our example, if the channel flips two bits and the receiver gets 001, the system will detect the error, but conclude that the original bit is 0, which is incorrect. If we increase the size of the bit string to four, we can detect all two-bit errors but cannot correct them (the quantity of parity bits is even); at five bits, we can both detect and correct all two-bit errors, but not all three-bit errors.
Moreover, increasing the size of the parity bit string is inefficient, reducing throughput by three times in our original case, and the efficiency drops drastically as we increase the number of times each bit is duplicated in order to detect and correct more errors.
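A minimal sketch of a repetition coder with majority-vote decoding, assuming the n = 3 code described above (function names are illustrative):

```python
def encode_repetition(bits, n=3):
    """Repeat each data bit n times."""
    return [b for b in bits for _ in range(n)]

def decode_repetition(received, n=3):
    """Majority vote within each group of n copies."""
    return [int(sum(received[i:i + n]) * 2 > n)
            for i in range(0, len(received), n)]

sent = encode_repetition([1, 0])       # [1, 1, 1, 0, 0, 0]
garbled = [1, 0, 1, 0, 0, 0]           # one flip in the first triple
assert decode_repetition(garbled) == [1, 0]
```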
Description
If more error-correcting bits are included with a message, and if those bits can be arranged such that different incorrect bits produce different error results, then bad bits could be identified. In a seven-bit message, there are seven possible single bit errors, so three error control bits could potentially specify not only that an error occurred but also which bit caused the error.
Hamming studied the existing coding schemes, including two-of-five, and generalized their concepts. To start with, he developed a nomenclature to describe the system, including the number of data bits and error-correction bits in a block. For instance, parity includes a single bit for any data word, so assuming ASCII words with seven bits, Hamming described this as an (8,7) code, with eight bits in total, of which seven are data. The repetition example would be (3,1), following the same logic. The code rate is the second number divided by the first, for our repetition example, 1/3.
Hamming also noticed the problems with flipping two or more bits, and described this as the "distance" (it is now called the Hamming distance, after him). Parity has a distance of 2, so one bit flip can be detected but not corrected, and any two bit flips will be invisible. The (3,1) repetition has a distance of 3, as three bits need to be flipped in the same triple to obtain another code word with no visible errors. It can correct one-bit errors or it can detect - but not correct - two-bit errors. A (4,1) repetition (each bit is repeated four times) has a distance of 4, so flipping three bits can be detected, but not corrected. When three bits flip in the same group there can be situations where attempting to correct will produce the wrong code word. In general, a code with distance k can detect but not correct k − 1 errors.
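The distance between two words is easy to compute; a one-line illustrative Python helper:

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

assert hamming_distance("000", "111") == 3   # the (3,1) repetition distance
```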
Hamming was interested in two problems at once: increasing the distance as much as possible, while at the same time increasing the code rate as much as possible. During the 1940s he developed several encoding schemes that were dramatic improvements on existing codes. The key to all of his systems was to have the parity bits overlap, such that they managed to check each other as well as the data.
General algorithm
The following general algorithm generates a single-error correcting (SEC) code for any number of bits. The main idea is to choose the error-correcting bits such that the index-XOR (the XOR of all the bit positions containing a 1) is 0. We use positions 1, 10, 100, etc. (in binary) as the error-correcting bits, which guarantees it is possible to set the error-correcting bits so that the index-XOR of the whole message is 0. If the receiver receives a string with index-XOR 0, they can conclude there were no corruptions, and otherwise, the index-XOR indicates the index of the corrupted bit.
An algorithm can be deduced from the following description:
- Number the bits starting from 1: bit 1, 2, 3, 4, 5, 6, 7, etc.
- Write the bit numbers in binary: 1, 10, 11, 100, 101, 110, 111, etc.
- All bit positions that are powers of two (have a single 1 bit in the binary form of their position) are parity bits: 1, 2, 4, 8, etc. (1, 10, 100, 1000)
- All other bit positions, with two or more 1 bits in the binary form of their position, are data bits.
- Each data bit is included in a unique set of 2 or more parity bits, as determined by the binary form of its bit position.
- Parity bit 1 covers all bit positions which have the least significant bit set: bit 1 (the parity bit itself), 3, 5, 7, 9, etc.
- Parity bit 2 covers all bit positions which have the second least significant bit set: bits 2–3, 6–7, 10–11, etc.
- Parity bit 4 covers all bit positions which have the third least significant bit set: bits 4–7, 12–15, 20–23, etc.
- Parity bit 8 covers all bit positions which have the fourth least significant bit set: bits 8–15, 24–31, 40–47, etc.
- In general each parity bit covers all bits where the bitwise AND of the parity position and the bit position is non-zero.
If a byte of data to be encoded is 10011010, then the data word (using _ to represent the parity bits) would be __1_001_1010, and the code word is 011100101010.
The choice of the parity, even or odd, is irrelevant but the same choice must be used for both encoding and decoding.
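A compact Python sketch of this encoder, assuming even parity and a list-of-bits representation (the function name is illustrative); it reproduces the worked example above, encoding 10011010 as 011100101010:

```python
def hamming_encode(data_bits):
    """Place data bits at the non-power-of-two positions (1-indexed), then
    set each parity bit (positions 1, 2, 4, 8, ...) so that the XOR over
    the positions it covers is 0 (even parity)."""
    n = len(data_bits)
    code = {}
    pos, it = 1, iter(data_bits)
    while len(code) < n:
        if pos & (pos - 1):          # not a power of two: a data position
            code[pos] = next(it)
        pos += 1
    total = pos - 1
    # A parity bit at position p covers every position whose index has bit p set.
    for p in (1 << i for i in range(total.bit_length())):
        if p <= total:
            code[p] = 0
            code[p] = sum(v for q, v in code.items() if q & p) % 2
    return [code[i] for i in range(1, total + 1)]

bits = [int(b) for b in "10011010"]
assert "".join(map(str, hamming_encode(bits))) == "011100101010"
```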
This general rule can be shown visually:
| Bit position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Encoded data bits | p1 | p2 | d1 | p4 | d2 | d3 | d4 | p8 | d5 | d6 | d7 | d8 | d9 | d10 | d11 | p16 | d12 | d13 | d14 | d15 |
| Parity bit p1 coverage | X | | X | | X | | X | | X | | X | | X | | X | | X | | X | |
| Parity bit p2 coverage | | X | X | | | X | X | | | X | X | | | X | X | | | X | X | |
| Parity bit p4 coverage | | | | X | X | X | X | | | | | X | X | X | X | | | | | X |
| Parity bit p8 coverage | | | | | | | | X | X | X | X | X | X | X | X | | | | | |
| Parity bit p16 coverage | | | | | | | | | | | | | | | | X | X | X | X | X |
Shown are only 20 encoded bits (5 parity, 15 data) but the pattern continues indefinitely. The key thing about Hamming codes that can be seen from visual inspection is that any given bit is included in a unique set of parity bits. To check for errors, check all of the parity bits. The pattern of errors, called the error syndrome, identifies the bit in error. If all parity bits are correct, there is no error. Otherwise, the sum of the positions of the erroneous parity bits identifies the erroneous bit. For example, if the parity bits in positions 1, 2 and 8 indicate an error, then bit 1+2+8=11 is in error. If only one parity bit indicates an error, the parity bit itself is in error.
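In the index-XOR formulation from the start of this section, this entire check collapses to XOR-ing together the positions of the 1 bits in the received word; a minimal illustrative sketch:

```python
def index_xor(code):
    """XOR together the (1-indexed) positions of all 1 bits.
    0 means the word checks out; otherwise the value is the
    position of the single flipped bit."""
    s = 0
    for i, bit in enumerate(code, start=1):
        if bit:
            s ^= i
    return s

code = [int(b) for b in "011100101010"]
assert index_xor(code) == 0      # valid codeword
code[10] ^= 1                    # corrupt position 11
assert index_xor(code) == 11     # syndrome points at the flipped bit
```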
With m parity bits, bits from 1 up to 2^m − 1 can be covered. After discounting the parity bits, 2^m − m − 1 bits remain for use as data. As m varies, we get all the possible Hamming codes:
| Parity bits | Total bits | Data bits | Name | Rate |
|---|---|---|---|---|
| 2 | 3 | 1 | Hamming(3,1) (Triple repetition code) | 1/3 ≈ 0.333 |
| 3 | 7 | 4 | Hamming(7,4) | 4/7 ≈ 0.571 |
| 4 | 15 | 11 | Hamming(15,11) | 11/15 ≈ 0.733 |
| 5 | 31 | 26 | Hamming(31,26) | 26/31 ≈ 0.839 |
| 6 | 63 | 57 | Hamming(63,57) | 57/63 ≈ 0.905 |
| 7 | 127 | 120 | Hamming(127,120) | 120/127 ≈ 0.945 |
| 8 | 255 | 247 | Hamming(255,247) | 247/255 ≈ 0.969 |
| 9 | 511 | 502 | Hamming(511,502) | 502/511 ≈ 0.982 |
| ... | ... | ... | ... | ... |
| m | 2^m − 1 | 2^m − m − 1 | Hamming(2^m − 1, 2^m − m − 1) | (2^m − m − 1) / (2^m − 1) |
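The rows of this table follow directly from n = 2^m − 1 and k = 2^m − m − 1; a few lines of Python, included purely as a sketch, regenerate them:

```python
for m in range(2, 10):
    n = 2**m - 1   # total bits
    k = n - m      # data bits
    print(f"Hamming({n},{k})  rate = {k}/{n} ≈ {k/n:.3f}")
```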
Hamming codes with additional parity (SECDED)
Hamming codes have a minimum distance of 3, which means that the decoder can detect and correct a single error, but it cannot distinguish a double bit error of some codeword from a single bit error of a different codeword. Thus, some double-bit errors will be incorrectly decoded as if they were single bit errors and therefore go undetected, unless no correction is attempted.
To remedy this shortcoming, Hamming codes can be extended by an extra parity bit. This way, it is possible to increase the minimum distance of the Hamming code to 4, which allows the decoder to distinguish between single bit errors and two-bit errors. Thus the decoder can detect and correct a single error and at the same time detect (but not correct) a double error. If the decoder does not attempt to correct errors, it can reliably detect triple bit errors. If the decoder does correct errors, some triple errors will be mistaken for single errors and "corrected" to the wrong value. Error correction is therefore a trade-off between certainty (the ability to reliably detect triple bit errors) and resiliency (the ability to keep functioning in the face of single bit errors).
For k data bits, a SECDED scheme requires m + 1 check bits, determined as follows:
- Let m be the smallest integer such that k ≤ 2^m − m − 1; this is the number of parity bits of the underlying single-error-correcting Hamming code.
- Add one further bit, the overall parity bit, for double-error detection.
For example, k = 64 gives m = 7, so 8 check bits in total, as in the (72,64) code below.
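A small Python sketch of this rule (the function name is illustrative); it reproduces the check-bit counts of the common memory codes mentioned below:

```python
def secded_check_bits(k: int) -> int:
    """Check bits for SECDED on k data bits: the smallest m with
    k <= 2**m - m - 1 (SEC Hamming code), plus one overall parity bit."""
    m = 2
    while 2**m - m - 1 < k:
        m += 1
    return m + 1

assert secded_check_bits(32) == 7   # the (39,32) code
assert secded_check_bits(64) == 8   # the (72,64) code
```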
This extended Hamming code was popular in computer memory systems, starting with IBM 7030 Stretch in 1961,[4] where it is known as SECDED (or SEC-DED, abbreviated from single error correction, double error detection).[5] Common forms for memory systems include (39,32) and (72,64). (While it is more efficient to use a codeword length of the form 2^m − 1, computer data word sizes are powers of 2, which precludes this choice, though communication and data storage systems do take advantage of it.) Server computers in the 21st century, while typically keeping the SECDED level of protection, no longer use Hamming's method, relying instead on designs with longer codewords (128 to 256 bits of data) and modified balanced parity-check trees.[4] The (72,64) Hamming code is still popular in some hardware designs, including Xilinx FPGA families.[4]
[7,4] Hamming code
In 1950, Hamming introduced the [7,4] Hamming code. It encodes four data bits into seven bits by adding three parity bits. As explained earlier, it can either detect and correct single-bit errors or it can detect (but not correct) both single and double-bit errors.
With the addition of an overall parity bit, it becomes the [8,4] extended Hamming code and can both detect and correct single-bit errors and detect (but not correct) double-bit errors.
Construction of G and H
The matrix G := (I_k | A^T) is called a (canonical) generator matrix of a linear (n,k) code,
and H := (A | I_{n−k}) is called a parity-check matrix. (In general the generator is written G := (I_k | −A^T); over GF(2), −A^T = A^T.)
This is the construction of G and H in standard (or systematic) form. Regardless of form, G and H for linear block codes must satisfy H G^T = 0, an all-zeros matrix.[6]
Since [7, 4, 3] = [n, k, d] = [2^m − 1, 2^m − 1 − m, 3] with m = 3, the parity-check matrix H of a Hamming code is constructed by listing all columns of length m that are pairwise independent.
Thus H is a matrix whose left side consists of all the nonzero (n − k)-tuples with two or more 1s, where the order of these columns does not matter. The right-hand side is just the (n − k)-identity matrix.
So G can be obtained from H by taking the transpose of the left-hand side of H and placing it to the right of the k × k identity matrix on the left-hand side of G.
The code generator matrix G and the parity-check matrix H are:

    G := ( 1 0 0 0 1 1 0
           0 1 0 0 1 0 1
           0 0 1 0 0 1 1
           0 0 0 1 1 1 1 )

and

    H := ( 1 1 0 1 1 0 0
           1 0 1 1 0 1 0
           0 1 1 1 0 0 1 )
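This construction can be automated. The following sketch, using NumPy with an illustrative function name, builds systematic H and G for any m; with the column ordering chosen here, m = 3 reproduces the matrices above:

```python
import numpy as np

def hamming_H_G(m):
    """Systematic parity-check and generator matrices over GF(2).
    Columns of H are all nonzero m-bit vectors: the weight->=2 columns
    form A on the left, the unit vectors form the identity on the right."""
    n = 2**m - 1
    k = n - m
    cols = [[(j >> i) & 1 for i in range(m)] for j in range(1, n + 1)]
    A = np.array([c for c in cols if sum(c) >= 2]).T      # m x k
    H = np.hstack([A, np.eye(m, dtype=int)])              # (A | I_m)
    G = np.hstack([np.eye(k, dtype=int), A.T])            # (I_k | A^T)
    assert not ((H @ G.T) % 2).any()                      # H G^T = 0 mod 2
    return H, G

H, G = hamming_H_G(3)   # the [7,4] matrices shown above
```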
Finally, these matrices can be mutated into equivalent non-systematic codes by the following operations:[6]
- Column permutations (swapping columns)
- Elementary row operations (replacing a row with a linear combination of rows)
Encoding
Example
From the above matrix we have 2^k = 2^4 = 16 codewords. Let a = (a_1, a_2, a_3, a_4) be a row vector of binary data bits, a_i ∈ {0, 1}. The codeword x for any of the 16 possible data vectors is given by the standard matrix product x = aG, where the summing operation is done modulo 2.
For example, let a = (1, 0, 1, 1). Using the generator matrix G from above, we have (after applying modulo 2 to the sum) x = aG = (1, 0, 1, 1, 0, 1, 0).
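The same product is easy to verify numerically; a quick NumPy sketch using the generator matrix above:

```python
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

a = np.array([1, 0, 1, 1])
x = (a @ G) % 2        # matrix product, then reduce modulo 2
print(x)               # [1 0 1 1 0 1 0]
```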
[8,4] Hamming code with an additional parity bit
The [7,4] Hamming code can easily be extended to an [8,4] code by adding an extra parity bit on top of the (7,4) encoded word (see Hamming(7,4)). This can be summed up with the revised matrices:

    G := ( 1 1 1 0 0 0 0 1
           1 0 0 1 1 0 0 1
           0 1 0 1 0 1 0 1
           1 1 0 1 0 0 1 0 )

and

    H := ( 1 0 1 0 1 0 1 0
           0 1 1 0 0 1 1 0
           0 0 0 1 1 1 1 0
           1 1 1 1 1 1 1 1 )

Note that H is not in standard form. To obtain G, elementary row operations can be used to obtain an equivalent matrix to H in systematic form:

    H = ( 0 1 1 1 1 0 0 0
          1 0 1 1 0 1 0 0
          1 1 0 1 0 0 1 0
          1 1 1 0 0 0 0 1 )

For example, the first row in this matrix is the sum of the second and third rows of H in non-systematic form. Using the systematic construction for Hamming codes from above, the matrix A is apparent and the systematic form of G is written as

    G = ( 1 0 0 0 0 1 1 1
          0 1 0 0 1 0 1 1
          0 0 1 0 1 1 0 1
          0 0 0 1 1 1 1 0 )
The non-systematic form of G can be row reduced (using elementary row operations) to match this matrix.
The addition of the fourth row effectively computes the sum of all the codeword bits (data and parity) as the fourth parity bit.
For example, 1011 is encoded (using the non-systematic form of G at the start of this section) into 01100110, where bits 3, 5, 6 and 7 are the data bits, bits 1, 2 and 4 are the parity bits from the [7,4] Hamming code, and the final bit is the parity bit added by the [8,4] code; that last bit makes the parity of the whole codeword even.
Finally, it can be shown that the minimum distance has increased from 3, in the [7,4] code, to 4 in the [8,4] code. Therefore, the code can be defined as [8,4] Hamming code.
To decode the [8,4] Hamming code, first check the parity bit. If the parity bit indicates an error, single error correction (the [7,4] Hamming code) will indicate the error location, with "no error" indicating the parity bit. If the parity bit is correct, then single error correction will indicate the (bitwise) exclusive-or of two error locations. If the locations are equal ("no error") then a double bit error either has not occurred, or has cancelled itself out. Otherwise, a double bit error has occurred.
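The decoding rule just described can be written out directly. The sketch below assumes the non-systematic bit layout of the encoding example (parity bits at positions 1, 2 and 4, overall parity at position 8), under which the [7,4] syndrome is the XOR of the positions holding a 1; the function name and return convention are illustrative:

```python
def decode_secded(r):
    """r: 8 received bits, positions 1..7 of a [7,4] word plus overall parity.
    Returns (possibly corrected word, status string)."""
    syndrome = 0
    for i in range(1, 8):              # [7,4] positional syndrome
        if r[i - 1]:
            syndrome ^= i
    overall = sum(r) % 2               # 0 if overall parity is still even
    if syndrome == 0 and overall == 0:
        return r, "no error"
    if overall == 1:                   # odd parity: a single error occurred
        pos = syndrome if syndrome else 8   # syndrome 0: the extra bit itself
        fixed = r.copy()
        fixed[pos - 1] ^= 1
        return fixed, f"corrected bit {pos}"
    return r, "double error detected"  # parity even but syndrome nonzero

word = [0, 1, 1, 0, 0, 1, 1, 0]        # 1011 encoded, from the example above
word[4] ^= 1                           # flip bit 5
print(decode_secded(word))             # -> corrected bit 5
```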
Notes
- ^ See Lemma 12 of
- ^ Hamming (1950), pp. 153–154.
- ^ Thompson, Thomas M. (1983), From Error-Correcting Codes through Sphere Packings to Simple Groups, The Carus Mathematical Monographs (#21), Mathematical Association of America, pp. 16–17, ISBN 0-88385-023-0
- ^ a b c Kythe & Kythe 2012, p. 115.
- ^ Kythe & Kythe 2012, p. 95.
- ^ a b Moon T. Error correction coding: Mathematical Methods and Algorithms. John Wiley and Sons, 2005.(Cap. 3) ISBN 978-0-471-64800-0
References
- Hamming, Richard Wesley (1950). "Error detecting and error correcting codes" (PDF). Bell System Technical Journal. 29 (2): 147–160. doi:10.1002/j.1538-7305.1950.tb00463.x. hdl:10945/46756. S2CID 61141773. Archived (PDF) from the original on 2022-10-09.
- Moon, Todd K. (2005). Error Correction Coding. New Jersey: John Wiley & Sons. ISBN 978-0-471-64800-0.
- MacKay, David J.C. (September 2003). Information Theory, Inference and Learning Algorithms. Cambridge: Cambridge University Press. ISBN 0-521-64298-1.
- D.K. Bhattacharryya, S. Nandi. "An efficient class of SEC-DED-AUED codes". 1997 International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN '97). pp. 410–415. doi:10.1109/ISPAN.1997.645128.
- "Mathematical Challenge April 2013 Error-correcting codes" (PDF). swissQuant Group Leadership Team. April 2013. Archived (PDF) from the original on 2017-09-12.
- Kythe, Dave K.; Kythe, Prem K. (2012). "Extended Hamming Codes". Algebraic and Stochastic Coding Theory. CRC Press. pp. 95–116. ISBN 978-1-351-83245-8.