Decoding methods
from Wikipedia

In coding theory, decoding is the process of translating received messages into codewords of a given code. There are many common methods of mapping messages to codewords. These are often used to recover messages sent over a noisy channel, such as a binary symmetric channel.

Notation

$C \subset \mathbb{F}_2^n$ is considered a binary code with the length $n$; $x, y$ shall be elements of $\mathbb{F}_2^n$; and $d(x,y)$ is the distance between those elements.

Ideal observer decoding

One may be given the message $x \in \mathbb{F}_2^n$, then ideal observer decoding generates the codeword $y \in C$. The process results in this solution:

$$\mathbb{P}(y \text{ sent} \mid x \text{ received}).$$

For example, a person can choose the codeword $y$ that is most likely to be received as the message $x$ after transmission.

Decoding conventions

A received message may not correspond to a unique most-likely codeword: there may be more than one codeword with an equal likelihood of mutating into the received message. In such a case, the sender and receiver(s) must agree ahead of time on a decoding convention. Popular conventions include:

  1. Request that the codeword be resent – automatic repeat-request.
  2. Choose any codeword at random from the set of most likely codewords (i.e. those nearest to the received message).
  3. If another code follows, mark the ambiguous bits of the codeword as erasures and hope that the outer code disambiguates them.
  4. Report a decoding failure to the system.

Maximum likelihood decoding

Given a received vector $x \in \mathbb{F}_2^n$, maximum likelihood decoding picks a codeword $y \in C$ that maximizes

$$\mathbb{P}(x \text{ received} \mid y \text{ sent}),$$

that is, the codeword $y$ that maximizes the probability that $x$ was received, given that $y$ was sent. If all codewords are equally likely to be sent then this scheme is equivalent to ideal observer decoding. In fact, by Bayes' theorem,

$$\mathbb{P}(x \text{ received} \mid y \text{ sent}) = \frac{\mathbb{P}(x \text{ received}, y \text{ sent})}{\mathbb{P}(y \text{ sent})} = \mathbb{P}(y \text{ sent} \mid x \text{ received}) \cdot \frac{\mathbb{P}(x \text{ received})}{\mathbb{P}(y \text{ sent})}.$$

Upon fixing $x$, $\mathbb{P}(x \text{ received})$ is constant, and $\mathbb{P}(y \text{ sent})$ is constant as all codewords are equally likely to be sent. Therefore, $\mathbb{P}(x \text{ received} \mid y \text{ sent})$ is maximised as a function of the variable $y$ precisely when $\mathbb{P}(y \text{ sent} \mid x \text{ received})$ is maximised, and the claim follows.

As with ideal observer decoding, a convention must be agreed to for non-unique decoding.

The maximum likelihood decoding problem can also be modeled as an integer programming problem.[1]

The maximum likelihood decoding algorithm is an instance of the "marginalize a product function" problem which is solved by applying the generalized distributive law.[2]
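
As a concrete illustration of the maximization above, the following sketch brute-forces $\mathbb{P}(x \text{ received} \mid y \text{ sent})$ over a toy codebook on a binary symmetric channel. The repetition code, crossover probability, and received vector are illustrative assumptions, not anything specified in the article.

```python
# Minimal sketch of brute-force maximum likelihood decoding over a binary
# symmetric channel (BSC). The code, crossover probability, and received
# vector below are illustrative assumptions.

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def ml_decode_bsc(received, codewords, p):
    """Pick the codeword y maximizing P(x received | y sent) = (1-p)^(n-d) * p^d."""
    n = len(received)
    def likelihood(y):
        d = hamming_distance(received, y)
        return (1 - p) ** (n - d) * p ** d
    return max(codewords, key=likelihood)

# Toy example: the (3,1) repetition code {000, 111}.
code = [(0, 0, 0), (1, 1, 1)]
x = (0, 1, 0)                          # received vector with one bit flipped
print(ml_decode_bsc(x, code, p=0.1))   # -> (0, 0, 0)
```

Because $p < 1/2$, the likelihood $(1-p)^{n-d} p^d$ decreases in $d$, so the chosen codeword is also the nearest one in Hamming distance (see minimum distance decoding below).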

Minimum distance decoding

Given a received vector $x \in \mathbb{F}_2^n$, minimum distance decoding picks a codeword $y \in C$ to minimise the Hamming distance:

$$d(x,y) = \#\{ i : x_i \neq y_i \},$$

i.e. choose the codeword $y$ that is as close as possible to $x$.

Note that if the probability of error on a discrete memoryless channel $p$ is strictly less than one half, then minimum distance decoding is equivalent to maximum likelihood decoding, since if

$$d(x,y) = d,$$

then:

$$\mathbb{P}(y \text{ received} \mid x \text{ sent}) = (1-p)^{n-d} \cdot p^d = (1-p)^n \cdot \left( \frac{p}{1-p} \right)^d,$$

which (since $p$ is less than one half) is maximised by minimising $d$.

Minimum distance decoding is also known as nearest neighbour decoding. It can be assisted or automated by using a standard array. Minimum distance decoding is a reasonable decoding method when the following conditions are met:

  1. The probability that an error occurs is independent of the position of the symbol.
  2. Errors are independent events – an error at one position in the message does not affect other positions.

These assumptions may be reasonable for transmissions over a binary symmetric channel. They may be unreasonable for other media, such as a DVD, where a single scratch on the disk can cause an error in many neighbouring symbols or codewords.

As with other decoding methods, a convention must be agreed to for non-unique decoding.

Syndrome decoding

Syndrome decoding is a highly efficient method of decoding a linear code over a noisy channel, i.e. one on which errors are made. In essence, syndrome decoding is minimum distance decoding using a reduced lookup table. This is allowed by the linearity of the code.[3]

Suppose that $C \subset \mathbb{F}_2^n$ is a linear code of length $n$ and minimum distance $d$ with parity-check matrix $H$. Then clearly $C$ is capable of correcting up to

$$t = \left\lfloor \frac{d-1}{2} \right\rfloor$$

errors made by the channel (since if no more than $t$ errors are made then minimum distance decoding will still correctly decode the incorrectly transmitted codeword).

Now suppose that a codeword $x \in \mathbb{F}_2^n$ is sent over the channel and the error pattern $e \in \mathbb{F}_2^n$ occurs. Then $z = x + e$ is received. Ordinary minimum distance decoding would look up the vector $z$ in a table of size $|C|$ for the nearest match – i.e. an element (not necessarily unique) $c \in C$ with

$$d(c, z) \leq d(y, z)$$

for all $y \in C$. Syndrome decoding takes advantage of the property of the parity matrix that

$$Hx = 0$$

for all $x \in C$. The syndrome of the received $z = x + e$ is defined to be

$$Hz = H(x + e) = Hx + He = 0 + He = He.$$

To perform maximum likelihood decoding in a binary symmetric channel, one has to look up a precomputed table of size $2^{n-k}$, mapping $He$ to $e$.

Note that this is already of significantly less complexity than that of standard array decoding.

However, under the assumption that no more than $t$ errors were made during transmission, the receiver can look up the value $He$ in a further reduced table of size

$$\sum_{i=0}^{t} \binom{n}{i}.$$
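
A minimal sketch of the table-driven procedure just described, using the binary Hamming (7,4) code (so $t = 1$ and the table has $2^{n-k} = 8$ entries). The parity-check matrix and variable names are illustrative assumptions.

```python
# A minimal sketch of syndrome decoding for the binary Hamming (7,4) code.
# The parity-check matrix below is one standard choice.

from itertools import product

H = [(1, 0, 1, 0, 1, 0, 1),   # column j is the binary expansion of j+1 (LSB first)
     (0, 1, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]
n = 7

def syndrome(v):
    return tuple(sum(h[i] * v[i] for i in range(n)) % 2 for h in H)

# Precompute the lookup table syndrome -> correctable error pattern.
# For a single-error-correcting code (t = 1) it has one entry per weight-<=1 error.
table = {}
for i in range(-1, n):                      # -1 stands for "no error"
    e = tuple(1 if j == i else 0 for j in range(n))
    table[syndrome(e)] = e

def decode(z):
    e = table[syndrome(z)]                  # look up the most likely error pattern
    return tuple((zi + ei) % 2 for zi, ei in zip(z, e))

# Demonstration: take any codeword (zero syndrome), flip one bit, and decode.
codewords = [v for v in product((0, 1), repeat=n) if syndrome(v) == (0, 0, 0)]
c = codewords[5]
z = list(c); z[2] ^= 1                      # single channel error in position 2
assert decode(tuple(z)) == c
print("corrected:", decode(tuple(z)))
```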

Information set decoding

This is a family of Las Vegas-probabilistic methods, all based on the observation that it is easier to guess enough error-free positions than it is to guess all the error positions.

The simplest form is due to Prange: Let $G$ be the $k \times n$ generator matrix of $C$ used for encoding. Select $k$ columns of $G$ at random, and denote by $G'$ the corresponding submatrix of $G$. With reasonable probability $G'$ will have full rank, which means that if we let $c'$ be the sub-vector for the corresponding positions of any codeword $c = mG$ of $C$ for a message $m$, we can recover $m$ as $m = c' (G')^{-1}$. Hence, if we were lucky that these $k$ positions of the received word $y$ contained no errors, and hence equalled the positions of the sent codeword, then we may decode.

If $t$ errors occurred, the probability of such a fortunate selection of columns is given by $\binom{n-t}{k} / \binom{n}{k}$.

This method has been improved in various ways, e.g. by Stern[4] and Canteaut and Sendrier.[5]
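
A rough sketch of Prange's method as described above, under illustrative assumptions: plain Gaussian elimination over GF(2) is used to invert the selected submatrix, and a tiny repetition code stands in for a realistic one.

```python
# Sketch of Prange's information-set decoding. G is a k x n generator matrix
# over GF(2), y is the received word, and t bounds the number of channel errors.
# The names and the toy code are illustrative assumptions.

import random

def gf2_inverse(A):
    """Invert a k x k binary matrix over GF(2); return None if singular."""
    k = len(A)
    M = [row[:] + [1 if i == j else 0 for j in range(k)] for i, row in enumerate(A)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if M[r][col]), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col and M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
    return [row[k:] for row in M]

def vec_mat(vec, M):            # row vector times matrix, over GF(2)
    cols = len(M[0])
    return [sum(vec[i] * M[i][j] for i in range(len(vec))) % 2 for j in range(cols)]

def prange_decode(G, y, t, iterations=1000):
    k, n = len(G), len(G[0])
    for _ in range(iterations):
        S = random.sample(range(n), k)                 # guessed error-free positions
        Gs = [[G[i][j] for j in S] for i in range(k)]  # k x k submatrix G'
        inv = gf2_inverse(Gs)
        if inv is None:
            continue
        m = vec_mat([y[j] for j in S], inv)            # candidate message m = c' G'^-1
        c = vec_mat(m, G)                              # re-encode
        if sum(ci != yi for ci, yi in zip(c, y)) <= t: # consistent with <= t errors
            return m, c
    return None

# Toy example: the (3,1) repetition code, one error.
G = [[1, 1, 1]]
print(prange_decode(G, [1, 0, 1], t=1))   # -> ([1], [1, 1, 1])
```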

Partial response maximum likelihood

Partial response maximum likelihood (PRML) is a method for converting the weak analog signal from the head of a magnetic disk or tape drive into a digital signal.

Viterbi decoder

A Viterbi decoder uses the Viterbi algorithm for decoding a bitstream that has been encoded using forward error correction based on a convolutional code. The Hamming distance is used as a metric for hard decision Viterbi decoders. The squared Euclidean distance is used as a metric for soft decision decoders.
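
The following is a minimal hard-decision Viterbi decoder sketch for an assumed rate-1/2, constraint-length-3 convolutional code with octal generators (7, 5), using the Hamming distance as the branch metric. The encoder, message, and zero-termination convention are illustrative assumptions.

```python
# Hard-decision Viterbi decoding sketch for the rate-1/2, constraint length 3
# convolutional code with generators (7, 5) in octal.

def encode(bits, memory=2):
    s = [0] * memory                       # shift register (u[t-1], u[t-2])
    out = []
    for u in bits + [0] * memory:          # append zeros to terminate the trellis
        v1 = u ^ s[0] ^ s[1]               # g1 = 1 + D + D^2  (111)
        v2 = u ^ s[1]                      # g2 = 1 + D^2      (101)
        out += [v1, v2]
        s = [u, s[0]]
    return out

def viterbi_decode(received, memory=2):
    n_states = 2 ** memory
    INF = float("inf")
    metrics = [0] + [INF] * (n_states - 1)          # start in the all-zero state
    paths = [[] for _ in range(n_states)]
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        new_metrics = [INF] * n_states
        new_paths = [None] * n_states
        for state in range(n_states):
            if metrics[state] == INF:
                continue
            s0, s1 = (state >> 1) & 1, state & 1    # (u[t-1], u[t-2])
            for u in (0, 1):
                v = [u ^ s0 ^ s1, u ^ s1]           # expected channel symbols
                branch = (v[0] != r[0]) + (v[1] != r[1])   # Hamming branch metric
                nxt = (u << 1) | s0
                cand = metrics[state] + branch
                if cand < new_metrics[nxt]:         # keep the survivor path
                    new_metrics[nxt] = cand
                    new_paths[nxt] = paths[state] + [u]
        metrics, paths = new_metrics, new_paths
    best = min(range(n_states), key=lambda s: metrics[s])
    return paths[best][:-memory]                    # drop the termination zeros

msg = [1, 0, 1, 1, 0, 0, 1]
tx = encode(msg)
rx = list(tx); rx[3] ^= 1; rx[10] ^= 1              # two separated channel errors
assert viterbi_decode(rx) == msg
print(viterbi_decode(rx))
```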

Optimal decision decoding algorithm (ODDA)

An optimal decision decoding algorithm (ODDA) has been proposed for an asymmetric two-way relay channel (TWRC) system.[6]

from Grokipedia
Decoding methods refer to the set of algorithms and techniques employed to reconstruct an original transmitted message from a potentially corrupted received signal, primarily within the framework of error-correcting codes in coding theory and digital communications. These methods aim to minimize decoding errors by estimating the most likely codeword, often under constraints like maximum likelihood criteria, and are essential for reliable transmission over noisy channels such as the binary symmetric channel.

In coding theory, decoding approaches vary by code type and complexity requirements. For block codes, techniques like syndrome decoding use parity-check matrices to identify error patterns without exhaustive search, while maximum likelihood decoding compares the received sequence against all possible codewords to select the closest match in terms of Hamming or Euclidean distance. Convolutional codes, on the other hand, leverage trellis structures for sequential decoding; the Viterbi algorithm performs maximum likelihood sequence detection by maintaining survivor paths across a trellis of states, achieving finite complexity proportional to $2^\nu$, where $\nu$ is the constraint length, making it practical for real-time applications in wireless systems. The BCJR algorithm extends this by computing a posteriori probabilities for individual bits, providing soft outputs (e.g., log-likelihood ratios) that enhance performance in iterative decoding schemes like turbo codes.

More recent innovations, such as Guessing Random Additive Noise Decoding (GRAND), approximate maximum likelihood decoding by ordered enumeration of low-weight error patterns, offering efficiency for high-rate codes with short lengths. These methods underscore decoding's role in bridging theoretical limits, such as channel capacity, with practical communication systems, where ongoing research addresses trade-offs in speed, robustness, and resource efficiency.
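
The noise-guessing idea behind GRAND can be illustrated with a short sketch: error patterns are tested in order of increasing Hamming weight (the most likely patterns on a binary symmetric channel first) until removing one yields a codeword. The Hamming (7,4) parity-check matrix, the received word, and the weight cap below are illustrative assumptions, not part of the article.

```python
# Toy sketch of the noise-guessing idea behind GRAND: test putative error
# patterns in order of increasing Hamming weight and stop at the first one
# whose removal yields a codeword.

from itertools import combinations

H = [(1, 0, 1, 0, 1, 0, 1),          # Hamming (7,4) parity-check matrix
     (0, 1, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]
n = 7

def is_codeword(v):
    return all(sum(h[i] * v[i] for i in range(n)) % 2 == 0 for h in H)

def grand_decode(x, max_weight=3):
    for w in range(max_weight + 1):                 # most likely patterns first
        for positions in combinations(range(n), w):
            e = [1 if i in positions else 0 for i in range(n)]
            candidate = tuple(xi ^ ei for xi, ei in zip(x, e))
            if is_codeword(candidate):
                return candidate
    return None                                      # declare a decoding failure

received = (0, 1, 0, 1, 0, 0, 1)                     # a codeword with its first bit flipped
print(grand_decode(received))                        # -> (1, 1, 0, 1, 0, 0, 1)
```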

Fundamentals

Notation

In coding theory, a binary code $C$ is a nonempty subset of the vector space $\mathbb{F}_2^n$, where $n$ denotes the code length (or block length) and $\mathbb{F}_2 = \{0, 1\}$ is the binary field equipped with modulo-2 addition and multiplication. The elements of $C$ are referred to as codewords, denoted $y \in C$, each of which is a binary vector of length $n$. Upon transmission over a noisy channel, a codeword $y$ may be corrupted, resulting in a received vector $x \in \mathbb{F}_2^n$. The error vector is then given by $e = x + y$, where $+$ is componentwise addition modulo 2 (equivalent to the bitwise XOR operation), representing the positions where errors occurred.

The Hamming distance between two vectors $x, y \in \mathbb{F}_2^n$ measures their dissimilarity and is defined as $d(x, y) = |\{ i : x_i \neq y_i \}|$, the number of positions in which they differ. The minimum distance $d$ of the code $C$ is the smallest Hamming distance between any two distinct codewords, $d = \min \{ d(y_1, y_2) : y_1, y_2 \in C,\ y_1 \neq y_2 \}$, which quantifies the code's error-detecting and correcting capability. The Hamming weight of an error vector $e \in \mathbb{F}_2^n$ is $w(e) = d(e, 0)$, the number of nonzero (i.e., 1) entries in $e$, corresponding to the number of bit errors.

A common channel model for binary codes is the binary symmetric channel (BSC), in which each transmitted bit is independently flipped (i.e., 0 becomes 1 or vice versa) with crossover probability $p$ (where $0 < p < 1/2$ typically), and transmitted correctly with probability $1 - p$. For linear binary codes, which form a subspace of $\mathbb{F}_2^n$, the code $C$ has dimension $k$ (where $1 \leq k \leq n$) and can be generated by a $k \times n$ generator matrix $G$ whose rows form a basis for $C$, such that every codeword is $y = mG$ for some message vector $m \in \mathbb{F}_2^k$. Equivalently, $C$ is the null space of an $(n-k) \times n$ parity-check matrix $H$, satisfying $Hy = 0$ (the zero vector in $\mathbb{F}_2^{n-k}$) for all $y \in C$.
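
A small numerical companion to the notation above, using an assumed $[5, 2]$ toy linear code; the matrices, the message, and the use of numpy are illustrative choices, not part of the article.

```python
# Small numerical companion to the notation above, using an assumed [5,2] toy
# linear code; numpy handles the matrix arithmetic over GF(2).

import numpy as np

G = np.array([[1, 0, 1, 1, 0],      # generator matrix (k = 2, n = 5)
              [0, 1, 0, 1, 1]])
H = np.array([[1, 0, 1, 0, 0],      # parity-check matrix with H @ y = 0 for codewords
              [1, 1, 0, 1, 0],
              [0, 1, 0, 0, 1]])

def hamming_distance(x, y):
    return int(np.sum(x != y))

m = np.array([1, 1])                 # message vector in F_2^k
y = m @ G % 2                        # codeword y = mG
e = np.array([0, 0, 1, 0, 0])        # error vector of Hamming weight 1
x = (y + e) % 2                      # received vector

print(H @ y % 2)                     # -> [0 0 0], so y is a codeword
print(hamming_distance(x, y))        # -> 1, the Hamming weight of e
```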

Decoding Conventions

Decoding conventions in error-correcting codes establish standardized protocols for resolving ambiguities during the decoding process, ensuring predictable and consistent outcomes across various decoding algorithms. These conventions address scenarios where the received vector does not map uniquely to a valid codeword, such as when multiple codewords are equally likely or when the error pattern exceeds the code's correction capability. By defining fallback behaviors, these protocols help maintain reliability in noisy communication systems, distinguishing between successful correction, detection of errors without correction, and outright decoding failure.

The foundations of these conventions trace back to Claude Shannon's seminal 1948 work on information theory, which emphasized the possibility of reliable communication over noisy channels by introducing redundancy through coding, thereby setting the stage for systematic error handling in decoding. Shannon's framework highlighted the need for decoding rules that achieve arbitrarily low error probabilities as code length increases, influencing subsequent developments in error-correcting code design and ambiguity resolution.

A key distinction in decoding conventions lies between error detection and error correction thresholds. Error detection can identify up to $d - 1$ errors in a code with minimum distance $d$, as any such alteration prevents the received vector from matching a codeword exactly. However, correction is limited to at most $t = \lfloor (d-1)/2 \rfloor$ errors, beyond which decoding fails because the spheres of radius $t$ around codewords no longer cover the entire space without overlap, leading to potential misdecoding or undecodability. If the number of errors exceeds this threshold, the decoder declares a failure, often triggering mechanisms like retransmission requests in practical systems.

In ideal decoding scenarios, such as maximum likelihood decoding, ambiguities arise when no unique codeword is closest to the received vector or when ties occur among multiple candidates at the minimum distance. In such cases of ambiguity, the decoder typically outputs a failure indicator or selects one of the equally likely codewords according to the specific implementation, ensuring data integrity over unreliable channels. For ties, where multiple codewords share the same minimum Hamming distance to the received vector, the convention is to select one arbitrarily, randomly, or based on a predefined priority (e.g., the lowest-indexed codeword), though some decoders output a failure indicator to avoid erroneous decisions. These rules promote consistent behavior, with exhaustive search decoders explicitly returning a failure symbol (e.g., "?") in tie cases to signal undecodability.

Erasures, where error locations are known but values are unknown, are handled differently in partial decoding scenarios compared to unknown errors. Conventions treat erasures as partially resolved corruptions, allowing codes with minimum distance $d$ to correct up to $d - 1$ erasures alone, since their positions enable direct recovery via solving for missing symbols. In combined error-and-erasure channels, decoding succeeds if the weighted error pattern, typically $2e + s < d$ with $e$ unknown errors and $s$ erasures, falls within the code's capability; otherwise, failure is declared, often prompting partial recovery or resend of affected segments.
This approach, common in forward error correction systems, prioritizes known positions to extend correction beyond pure error cases.
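
As a small illustration of the combined error-and-erasure condition above, the sketch below simply checks whether a given mix of unknown errors and erasures satisfies $2e + s < d$; the numeric values are illustrative.

```python
# Tiny helper illustrating the error-and-erasure condition quoted above:
# a code with minimum distance d can handle e unknown errors plus s erasures
# whenever 2*e + s < d. The values are illustrative.

def correctable(d, errors, erasures):
    return 2 * errors + erasures < d

print(correctable(d=7, errors=2, erasures=2))   # True:  2*2 + 2 = 6 < 7
print(correctable(d=7, errors=3, erasures=1))   # False: 2*3 + 1 = 7, not < 7
```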

Probabilistic Decoding

Ideal Observer Decoding

Ideal observer decoding is a probabilistic method that selects the codeword $y$ from the code $C$ which maximizes the posterior probability $\mathbb{P}(y \text{ sent} \mid x \text{ received})$, where $x$ is the received vector. This is given by Bayes' theorem as

$$\mathbb{P}(y \mid x) = \frac{\mathbb{P}(x \mid y)\, \mathbb{P}(y)}{\mathbb{P}(x)},$$

where $\mathbb{P}(x \mid y)$ is the channel transition probability, $\mathbb{P}(y)$ is the prior probability of the codeword, and $\mathbb{P}(x)$ is the marginal probability of the received vector. The decoder thus computes this posterior for each possible codeword and chooses the one with the highest value, providing the theoretically optimal estimate under the Bayesian framework.

A key assumption in many coding scenarios is a uniform prior distribution over the codewords, where $\mathbb{P}(y) = 1/|C|$ for all $y \in C$. Under this assumption, the posterior simplifies such that maximizing $\mathbb{P}(y \mid x)$ is equivalent to maximizing the likelihood $\mathbb{P}(x \mid y)$, reducing ideal observer decoding to maximum likelihood decoding. This equivalence holds because the uniform prior and the denominator $\mathbb{P}(x)$ (constant for a fixed $x$) do not affect the argmax operation.

For the binary symmetric channel (BSC) with crossover probability $p < 0.5$, the posterior probability is proportional to $(1-p)^{n - d(x,y)}\, p^{d(x,y)}$, where $n$ is the code length and $d(x,y)$ is the Hamming distance between $x$ and $y$. This form highlights that closer codewords (smaller $d(x,y)$) are favored, as $1-p > p$, making the decoding rule bias toward low-distance candidates while incorporating the channel's error statistics.

In general, ideal observer decoding requires an exhaustive search over all $|C|$ codewords to evaluate the posteriors, rendering it computationally infeasible for large codes. The problem is NP-hard, as even approximating the optimal solution or related tasks like minimum-distance computation is intractable. The concept originates in statistical decision theory and was applied to communication coding by Claude Shannon in his foundational 1948 paper, where he described decoding as selecting the most probable transmitted signal based on a posteriori probabilities.
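
The posterior computation described above can be sketched for a BSC as follows; the toy code, the priors, and the crossover probability are illustrative assumptions. With a uniform prior the rule reduces to maximum likelihood decoding, while a strongly skewed prior can override the likelihood.

```python
# Sketch of ideal observer decoding for a BSC, including a non-uniform prior.
# The toy code, priors, and crossover probability are illustrative.

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def ideal_observer_decode(x, codewords, prior, p):
    """Return argmax_y P(y | x), proportional to P(x | y) P(y) for a BSC."""
    n = len(x)
    def posterior(y):
        d = hamming_distance(x, y)
        likelihood = (1 - p) ** (n - d) * p ** d
        return likelihood * prior[y]
    return max(codewords, key=posterior)

code = [(0, 0, 0), (1, 1, 1)]
uniform = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}   # uniform prior: reduces to ML decoding
print(ideal_observer_decode((1, 1, 0), code, uniform, p=0.1))   # -> (1, 1, 1)

skewed = {(0, 0, 0): 0.99, (1, 1, 1): 0.01}  # strong prior overrides the likelihood
print(ideal_observer_decode((1, 1, 0), code, skewed, p=0.1))    # -> (0, 0, 0)
```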

Maximum Likelihood Decoding

Maximum likelihood decoding is a decoding rule that selects the codeword $\hat{y}$ from the codebook $\mathcal{C}$ which maximizes the likelihood $\mathbb{P}(x \mid y \text{ sent})$ of observing the received vector $x$ given that codeword $y$ was transmitted over the channel. This approach assumes knowledge of the channel model and seeks the most probable transmitted codeword based solely on the likelihood, without incorporating prior probabilities on the messages. In practice, to facilitate computation and avoid numerical underflow, the maximization is often performed on the log-likelihood, expressed as $\sum_i \log \frac{\mathbb{P}(x_i \mid y_i)}{\mathbb{P}(x_i \mid \neg y_i)}$ for binary channels, where $\neg y_i$ denotes the complementary bit value.

For the binary symmetric channel (BSC) with crossover probability $p < 0.5$, maximum likelihood decoding is equivalent to minimum Hamming distance decoding, as the likelihood decreases monotonically with the number of bit errors, reducing the problem to finding the codeword closest to the received vector in Hamming distance. This equivalence holds because the channel transition probabilities satisfy $\mathbb{P}(x_i \mid y_i) > \mathbb{P}(x_i \mid \neg y_i)$ for $p < 0.5$, making the log-likelihood ratio proportional to the negative distance. However, maximum likelihood decoding generalizes beyond symmetric channels to asymmetric ones, where distance metrics fail, and the full probabilistic model must be used to account for unequal error probabilities.

Practical implementations of maximum likelihood decoding include integer programming formulations, which model the problem as an optimization over binary variables subject to code constraints and a linear objective based on the log-likelihood. Alternative algorithms leverage the generalized distributive law on factor graphs to perform exact or approximate inference through message passing, efficiently computing marginals for structured codes. Additionally, sphere packing bounds provide theoretical limits on the decoding error probability, demonstrating that maximum likelihood decoding can approach the channel capacity for random codes as block length increases.

Maximum likelihood decoding achieves the optimal error performance dictated by the channel capacity but is computationally intensive, with average complexity scaling as $O(|\mathcal{C}|)$ in the exhaustive search over the codebook. While traditional exhaustive methods are infeasible for large codes, integration with modern iterative techniques like belief propagation approximates maximum likelihood solutions efficiently for sparse codes such as low-density parity-check (LDPC) codes. Under uniform priors on messages, maximum likelihood decoding coincides with maximum a posteriori decoding.
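
A brief sketch of the log-domain formulation mentioned above for a BSC; summing per-bit log-probabilities differs from the quoted log-likelihood ratio only by a term independent of the codeword, so the argmax is unchanged. The toy codebook and parameters are illustrative assumptions.

```python
# Log-domain maximum likelihood decoding for a BSC: scoring each codeword by a
# sum of per-bit log-probabilities avoids numerical underflow for long blocks.
# The toy codebook and crossover probability are illustrative.

import math

def log_likelihood(x, y, p):
    """log P(x | y) for a BSC with crossover probability p."""
    return sum(math.log(1 - p) if xi == yi else math.log(p)
               for xi, yi in zip(x, y))

def ml_decode(x, codewords, p):
    return max(codewords, key=lambda y: log_likelihood(x, y, p))

code = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 1), (1, 1, 0, 0)]   # toy codebook
print(ml_decode((1, 1, 1, 0), code, p=0.05))    # -> (1, 1, 0, 0), the closest codeword
```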

Distance-Based Decoding

Minimum Distance Decoding

Minimum distance decoding is a fundamental hard-decision technique in error-correcting coding that selects the codeword from the codebook $C$ closest to the received vector $r$ according to the Hamming distance $d(\cdot, \cdot)$. The algorithm computes $\hat{c} = \arg\min_{c \in C} d(r, c)$, where $n$ denotes the block length of the code. This approach assumes a hard-decision channel output, treating each received symbol as the nearest code symbol without retaining reliability information. Decoding is guaranteed to be unique and correct if the number of errors does not exceed the error-correcting capability $t = \lfloor (d_{\min} - 1)/2 \rfloor$, where $d_{\min}$ is the minimum distance of the code.

The method relies on the assumption of independent bit errors over a binary symmetric channel (BSC) with crossover probability $p < 0.5$. Under this model, the probability of an error pattern decreases with its Hamming weight, so minimizing the distance to the received vector implements maximum likelihood decoding. For such channels, minimum distance decoding achieves the optimal performance of maximum likelihood when $p < 0.5$. Bounded-distance variants of the algorithm restrict the search to error patterns of weight at most $t$, declaring a decoding failure otherwise, which improves efficiency while maintaining correctness within the capability.

The computational complexity of exhaustive minimum distance decoding is $O(n|C|)$, as it requires comparing the received vector to each of the $|C|$ codewords, with each comparison taking $O(n)$ time; for binary linear codes, $|C| = 2^k$ where $k$ is the dimension, yielding $O(n 2^k)$. The error probability $P_e$ under this decoder over a BSC can be upper-bounded using the union bound:

$$P_e \leq |C| \sum_{j = t+1}^{n} \binom{n}{j} p^j (1-p)^{n-j},$$

which accounts for the probability that the received vector is closer to an incorrect codeword. This bound is loose but useful for analyzing performance at low error rates.

Minimum distance decoding finds application in classical block codes, notably the Hamming codes introduced in 1950, which correct single errors using a minimum distance of 3. These codes laid the groundwork for practical error correction in early computing and communication systems. Although foundational, the method is largely outdated for high-noise scenarios, where its exhaustive nature becomes impractical and more efficient or soft-decision alternatives prevail.
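
The union bound above can be evaluated numerically as a quick sanity check; the sketch below assumes an illustrative code with $n = 7$, $k = 4$, and $t = 1$ on a BSC.

```python
# Numerical illustration of the union bound quoted above,
#   P_e <= |C| * sum_{j=t+1}^{n} C(n, j) p^j (1-p)^(n-j),
# evaluated for an assumed code with n = 7, k = 4, t = 1 on a BSC.

from math import comb

def union_bound(n, k, t, p):
    tail = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(t + 1, n + 1))
    return (2 ** k) * tail          # |C| = 2^k for a binary linear code

for p in (1e-2, 1e-3, 1e-4):
    print(f"p = {p:g}:  P_e <= {union_bound(7, 4, 1, p):.3e}")
```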

Syndrome Decoding

Syndrome decoding is an efficient algebraic technique for correcting errors in linear error-correcting codes by computing a syndrome vector that identifies the error pattern without exhaustively searching all codewords. For a linear $[n, k]$ code over a finite field, the received vector $\mathbf{x}$ (of length $n$) is assumed to be the transmitted codeword $\mathbf{c}$ plus an error vector $\mathbf{e}$, so $\mathbf{x} = \mathbf{c} + \mathbf{e}$. The syndrome $\mathbf{s}$ is computed as $\mathbf{s} = H\mathbf{x} = H\mathbf{e}$, where $H$ is the $(n-k) \times n$ parity-check matrix satisfying $H\mathbf{c} = \mathbf{0}$ for all codewords $\mathbf{c}$. If no errors occur ($\mathbf{e} = \mathbf{0}$), then $\mathbf{s} = \mathbf{0}$; otherwise, $\mathbf{s}$ is nonzero and depends only on the error pattern, being the sum of the columns of $H$ indexed by the error positions.

To decode, a lookup table is precomputed containing all possible syndromes for correctable error patterns of weight up to the error-correcting capability $t$, where the table size is $\sum_{i=0}^{t} \binom{n}{i}$ entries (one for each possible error vector of weight at most $t$). Upon receiving $\mathbf{x}$, the syndrome $\mathbf{s}$ is matched to the table to find the corresponding error vector $\mathbf{e}$ such that $H\mathbf{e} = \mathbf{s}$, typically selecting the minimum-weight such $\mathbf{e}$; the decoded vector is then $\hat{\mathbf{c}} = \mathbf{x} + \mathbf{e}$. This approach leverages the linearity of the code to map syndromes directly to error cosets, enabling correction of up to $t$ errors when the minimum distance $d$ satisfies $d \geq 2t + 1$.

The computational advantages include syndrome calculation in $O(n^2)$ time (dominated by the matrix-vector multiplication with $H$) and constant-time lookup once the table is built, making it practical for hardware implementation; the space requirement is efficient for small $t$ relative to $n$, though table construction is exponential in $t$. A key variant uses coset leaders, where for each coset of the code (partitioned by syndromes), the leader is the minimum-weight vector in that coset, ensuring the decoded error is the most likely under a minimum-distance criterion.

Algebraic extensions of syndrome decoding, such as the Berlekamp-Massey algorithm, enable efficient handling of multiple errors in structured cyclic codes like BCH codes by solving the key equation for the error-locator polynomial from the syndrome sequence, avoiding exhaustive lookup for larger $t$. This method iteratively finds the shortest linear feedback shift register that generates the syndrome, locating error positions in $O(t^2)$ time. Syndrome decoding was developed by Eugene Prange in 1958 specifically for cyclic codes, introducing simple algorithms that exploit the cyclic structure to compute and interpret syndromes efficiently.
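
The coset-leader variant described above can be sketched by enumerating error patterns in order of increasing weight and recording the first pattern seen for each syndrome; the $[5, 2]$ parity-check matrix below is an illustrative assumption (the same toy code used in the notation example).

```python
# Building a coset-leader table for a small linear code: enumerate error
# patterns by increasing weight and record the first (hence minimum-weight)
# pattern seen for each syndrome. The [5,2] code is an illustrative assumption.

from itertools import combinations

H = [(1, 0, 1, 0, 0),
     (1, 1, 0, 1, 0),
     (0, 1, 0, 0, 1)]
n, r = 5, 3                                   # length and number of parity checks

def syndrome(v):
    return tuple(sum(h[i] * v[i] for i in range(n)) % 2 for h in H)

leaders = {}
for w in range(n + 1):                        # weight-ordered enumeration
    for pos in combinations(range(n), w):
        e = tuple(1 if i in pos else 0 for i in range(n))
        leaders.setdefault(syndrome(e), e)    # keep only the first (lightest) leader
    if len(leaders) == 2 ** r:                # every coset now has a leader
        break

def decode(z):
    e = leaders[syndrome(z)]                  # coset leader = most likely error
    return tuple((zi + ei) % 2 for zi, ei in zip(z, e))

print(decode((1, 1, 0, 0, 1)))                # -> (1, 1, 1, 0, 1): error in position 2 corrected
```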

List and Search-Based Decoding

List Decoding

List decoding extends the capabilities of error-correcting codes beyond the unique decoding radius $t$, where a single codeword is guaranteed to be the closest, by outputting a small list of possible codewords within a larger error radius $\delta > t$. Specifically, given a received word $x$ and code $C$, the decoder produces all codewords $y \in C$ such that the Hamming distance $d(x, y) \leq \delta$, or a list of at most $L$ such codewords if the total number within the radius is polynomially bounded in the code length $n$. This approach is particularly useful when the number of errors exceeds $t$, allowing post-processing to select the correct codeword based on additional context, such as cyclic redundancy checks or higher-layer protocols.

The foundational algorithm for list decoding Reed-Solomon (RS) codes was introduced by Madhu Sudan in 1997; it solves an interpolation problem to find a low-degree bivariate polynomial that agrees with the received word in sufficiently many positions and factors it to recover candidate codewords. This method achieves list decoding up to an error fraction of roughly $1 - \sqrt{R}$, where $R$ is the rate of the code.
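
The definition above can be illustrated with a brute-force sketch that simply returns every codeword within radius $\delta$ of the received word; the toy code is an illustrative assumption, and practical list decoders for Reed-Solomon codes avoid this exhaustive search.

```python
# Brute-force sketch of the list-decoding definition: return every codeword
# within Hamming distance delta of the received word. The toy code is assumed.

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def list_decode(x, codewords, delta):
    return [y for y in codewords if hamming_distance(x, y) <= delta]

code = [(0, 0, 0, 0, 0), (1, 1, 1, 0, 0), (0, 0, 1, 1, 1), (1, 1, 0, 1, 1)]
x = (1, 0, 1, 0, 1)
print(list_decode(x, code, delta=1))   # unique decoding radius: empty list here
print(list_decode(x, code, delta=2))   # larger radius: two candidate codewords
```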