Fibonacci coding
View on WikipediaThis article includes a list of general references, but it lacks sufficient corresponding inline citations. (January 2013) |
| Part of a series on |
| Numeral systems |
|---|
| List of numeral systems |
In mathematics and computing, Fibonacci coding is a universal code[1] which encodes positive integers into binary code words. It is one example of representations of integers based on Fibonacci numbers. Each code word ends with "11" and contains no other instances of "11" before the end.
The Fibonacci code is closely related to the Zeckendorf representation, a positional numeral system that uses Zeckendorf's theorem and has the property that no number has a representation with consecutive 1s. The Fibonacci code word for a particular integer is exactly the integer's Zeckendorf representation with the order of its digits reversed and an additional "1" appended to the end.
Definition
[edit]For a number , if represent the digits of the code word representing then we have:
where F(i) is the ith Fibonacci number, and so F(i+2) is the ith distinct Fibonacci number starting with . The last bit is always an appended bit of 1 and does not carry place value.
It can be shown that such a coding is unique, and the only occurrence of "11" in any code word is at the end (that is, d(k−1) and d(k)). The penultimate bit is the most significant bit and the first bit is the least significant bit. Also, leading zeros cannot be omitted as they can be in, for example, decimal numbers.
The first few Fibonacci codes are shown below, and also their so-called implied probability, the value for each number that has a minimum-size code in Fibonacci coding.
| Symbol | Fibonacci representation | Fibonacci code word | Implied probability |
|---|---|---|---|
| 1 | 11 | 1/4 | |
| 2 | 011 | 1/8 | |
| 3 | 0011 | 1/16 | |
| 4 | 1011 | 1/16 | |
| 5 | 00011 | 1/32 | |
| 6 | 10011 | 1/32 | |
| 7 | 01011 | 1/32 | |
| 8 | 000011 | 1/64 | |
| 9 | 100011 | 1/64 | |
| 10 | 010011 | 1/64 | |
| 11 | 001011 | 1/64 | |
| 12 | 101011 | 1/64 | |
| 13 | 0000011 | 1/128 | |
| 14 | 1000011 | 1/128 |
To encode an integer N:
- Find the largest Fibonacci number equal to or less than N; subtract this number from N, keeping track of the remainder.
- If the number subtracted was the ith Fibonacci number F(i), put a 1 in place i − 2 in the code word (counting the left most digit as place 0).
- Repeat the previous steps, substituting the remainder for N, until a remainder of 0 is reached.
- Place an additional 1 after the rightmost digit in the code word.
To decode a code word, remove the final "1", assign the remaining the values 1,2,3,5,8,13... (the Fibonacci numbers) to the bits in the code word, and sum the values of the "1" bits.
Comparison with other universal codes
[edit]Fibonacci coding has a useful property that sometimes makes it attractive in comparison to other universal codes: it is an example of a self-synchronizing code, making it easier to recover data from a damaged stream. With most other universal codes, if a single bit is altered, then none of the data that comes after it will be correctly read. With Fibonacci coding, on the other hand, a changed bit may cause one token to be read as two, or cause two tokens to be read incorrectly as one, but reading a "0" from the stream will stop the errors from propagating further. Since the only stream that has no "0" in it is a stream of "11" tokens, the total edit distance between a stream damaged by a single bit error and the original stream is at most three.
This approach, encoding using sequence of symbols, in which some patterns (like "11") are forbidden, can be freely generalized.[2]
Example
[edit]The following table shows that the number 65 is represented in Fibonacci coding as 0100100011, since 65 = 2 + 8 + 55. The first two Fibonacci numbers (0 and 1) are not used, and an additional 1 is always appended.
Generalizations
[edit]The Fibonacci encodings for the positive integers are binary strings that end with "11" and contain no other instances of "11". This can be generalized to binary strings that end with N consecutive 1s and contain no other instances of N consecutive 1s. For instance, for N = 3 the positive integers are encoded as 111, 0111, 00111, 10111, 000111, 100111, 010111, 110111, 0000111, 1000111, 0100111, …. In this case, the number of encodings as a function of string length is given by the sequence of tribonacci numbers.
For general constraints defining which symbols are allowed after a given symbol, the maximal information rate can be obtained by first finding the optimal transition probabilities using a maximal entropy random walk, then using an entropy coder (with switched encoder and decoder) to encode a message as a sequence of symbols fulfilling the found optimal transition probabilities.
See also
[edit]References
[edit]- ^ Basu, Manjusri; Prasad, Bandhu (2010-09-01). "Long range variations on the Fibonacci universal code". Journal of Number Theory. 130 (9): 1925–1931. doi:10.1016/j.jnt.2010.01.013. ISSN 0022-314X.
- ^ Duda, Jarek (2007). "Optimal encoding on discrete lattice with translational invariant constrains using statistical algorithms". arXiv:0710.3861 [cs.IT].
- Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. p. 105. ISBN 978-0-521-82332-6. Zbl 1086.11015.
- Fraenkel, Aviezri S.; Klein, Shmuel T. (1996). "Robust universal complete codes for transmission and compression". Discrete Applied Mathematics. 64 (1): 31–55. CiteSeerX 10.1.1.37.3064. doi:10.1016/0166-218X(93)00116-H. ISSN 0166-218X. Zbl 0874.94026.
Further reading
[edit]- Stakhov, A. P. (2009). The Mathematics of Harmony: From Euclid to Contemporary Mathematics and Computer Science. Singapore: World Scientific Publishing.
Fibonacci coding
View on GrokipediaFundamentals
Definition
Fibonacci coding is a universal binary encoding scheme for positive integers that leverages the Fibonacci number system to produce prefix-free codewords. Each positive integer is represented uniquely as a binary string corresponding to a sum of distinct non-consecutive Fibonacci numbers, where the codeword terminates with two consecutive 1s ("11") and contains no other adjacent 1s. This ensures the code is instantaneous, meaning no codeword is a prefix of another, facilitating efficient decoding without delimiters. The Fibonacci sequence employed in this coding starts with , , , , and generally for . This indexing shifts the conventional sequence (, ) to align with the coding requirements, beginning the summation from . Mathematically, a codeword of length with bits —where each , , and the final 1 is appended for termination—encodes . The appended 1 distinguishes the codeword while preserving the underlying representation. This scheme originates from Zeckendorf's theorem, which establishes the unique representation of any positive integer as a sum of non-consecutive Fibonacci numbers greater than or equal to , and was adapted into a prefix-free universal code by Apostolico and Fraenkel in 1987.[6][7]Relation to Zeckendorf Representation
Zeckendorf's theorem asserts that every positive integer can be uniquely expressed as a sum of distinct non-consecutive Fibonacci numbers, where "non-consecutive" means no two selected Fibonacci numbers have indices differing by 1.[7] This representation, known as the Zeckendorf representation, forms the foundation of Fibonacci coding by providing a binary-like encoding without adjacent 1s in the bit string corresponding to the coefficients of the Fibonacci basis.[6] The Fibonacci sequence used in this context is typically indexed as , , , , , and so on, with for .[7] The Zeckendorf representation of a positive integer employs Fibonacci numbers starting from (skipping to avoid redundancy), ensuring the sum uses non-adjacent terms such as where the indices satisfy and no two are consecutive.[7] The uniqueness of this representation follows from a key property of the Fibonacci sequence: for any , .[7] This inequality implies that the largest possible sum of non-consecutive Fibonacci numbers up to is strictly less than , preventing ambiguities or "carry-over" issues similar to those in binary addition. To sketch the proof, suppose two different Zeckendorf representations sum to the same ; let and be the sets of indices in each, and assume without loss of generality that the maximum index in exceeds that in . The difference in sums would then contradict the inequality, as the excess in cannot be compensated by smaller terms in without violating the non-consecutivity. Thus, the representations must coincide.[7] Fibonacci coding adapts the Zeckendorf representation to form a prefix-free code suitable for variable-length encoding of positive integers. Specifically, the bit string of the Zeckendorf representation (with 1s indicating selected Fibonacci numbers and 0s elsewhere, ordered from lowest to highest index) is modified by appending an extra 1 at the end.[6] This appendage creates a terminating sequence of "11" (since the original Zeckendorf string ends with a 1 followed by no adjacent 1, ensuring the prior bit is 0), which enforces the prefix property without introducing adjacent 1s elsewhere in the codeword. The resulting code is instantaneous and uniquely decodable, leveraging the inherent uniqueness of Zeckendorf while bounding error propagation in transmission.[6]Encoding and Decoding
Encoding Algorithm
The encoding algorithm for Fibonacci coding generates a binary codeword for a given positive integer by leveraging the Zeckendorf representation, which uniquely expresses as a sum of non-consecutive Fibonacci numbers starting from , , , and so on, where the sequence is defined by , , and for . This greedy approach ensures the representation uses the largest possible Fibonacci number at each step without adjacent selections, as guaranteed by Zeckendorf's theorem.[8] The procedure begins by identifying the largest Fibonacci number . The bit corresponding to position (indexing bits from the lowest position 0 for ) is set to 1, and is subtracted from to obtain the remainder. This process repeats on the remainder until it reaches 0, producing a bit string with no two consecutive 1s. To form the complete codeword, an extra 1 is appended to the end, ensuring the code ends with "11" for prefix-free decoding. The bit string is constructed with the lowest position (F_2) as the leftmost bit.[8] For edge cases, if , the Zeckendorf representation uses , yielding the bit string "1" followed by the appended 1, resulting in "11". For , it uses , with bits "01" (0 for , 1 for ) followed by 1, giving "011".[8] The following pseudocode outlines the algorithm, assuming a precomputed list of Fibonacci numbers up to exceeding :function fibonacci_encode(N):
if N == 0:
return "" // Not typically encoded, as coding starts from 1
fibs = generate_fibonacci_up_to(N) // List [F_2=1, F_3=2, F_4=3, ..., F_k] where F_k > N
code = []
i = len(fibs) - 1
while i >= 0:
if fibs[i] <= N:
code.append(1)
N -= fibs[i]
else:
code.append(0)
i -= 1
// code now has bits from high (F_k) to low (F_2)
// Reverse to have low (F_2) first to high last
code = code[::-1]
// No lstrip, as leading 0s (low bits) are needed
return ''.join(map(str, code)) + '1'
This greedy nature directly mirrors the Zeckendorf algorithm, producing the unique representation required for the code's completeness.[8]
Decoding Algorithm
The decoding of a Fibonacci codeword involves identifying the terminating "11" sequence, which delimits the end of the codeword due to the appended 1 during encoding.[9][10] The final 1 in this sequence is then discarded, as it functions solely as a delimiter to ensure prefix-free decoding.[9][10] In the remaining bit string, each 1 corresponds to a Fibonacci number, with positions interpreted from the least significant bit (leftmost) onward: the leftmost bit aligns with , the next (to the right) with , the following with , and so on up to for higher positions to the right.[9][10] The original integer is recovered by summing the Fibonacci numbers at these 1 positions.[9][10] This procedure exhibits a self-synchronizing property, allowing decoding to resume from any point in the bitstream by searching for the next "11" sequence, which confines desynchronization errors to at most one affected codeword.[10][9] For edge cases, the codeword "11" decodes to 1, as discarding the final 1 leaves a single bit "1" at the position.[10][9] Likewise, "011" decodes to 2, with the remaining "01" having its 1 at the position.[10][9] The algorithm's reliability stems from the prefix-free property enforced by the termination rule, which introduces the only instance of consecutive 1s in the codeword, preventing ambiguity since internal representations adhere to the no-adjacent-1s constraint of Zeckendorf's theorem.[9][10]Properties and Analysis
Uniqueness and Completeness
The uniqueness of Fibonacci coding stems directly from Zeckendorf's theorem, which guarantees that every positive integer has a unique representation as a sum of distinct non-consecutive Fibonacci numbers, where the Fibonacci sequence is defined with , , and for .[11] In Fibonacci coding, this representation is converted to a binary string by placing 1s at positions corresponding to the selected Fibonacci numbers (starting from the least significant bit) and appending an extra 1 to terminate the codeword, preserving the uniqueness since no two different integers share the same Zeckendorf representation.[12] The appended 1 ensures the prefix-free property, meaning no valid codeword is a prefix of another, as every codeword ends with "11" while the no-adjacent-1s rule (inherited from the Zeckendorf representation) prevents any internal "11" sequence that could cause ambiguity during prefix detection.[11] This termination rule eliminates the possibility of one codeword being misinterpreted as the beginning of a longer one, guaranteeing instantaneous decodability without lookahead beyond the termination bits.[12] Completeness follows from the fact that Zeckendorf's theorem covers all positive integers exactly once, establishing a bijection between the positive integers and the set of valid Fibonacci codewords, with the appended 1 making the mapping prefix-free and thus suitable for concatenation in streams.[11] Fibonacci coding satisfies the Kraft inequality for uniquely decodable codes, where the sum over all codewords of (with the length of codeword ) holds with equality, confirming that the code is complete and no additional codewords can be added without violating unique decodability.[6] Violations of the no-adjacent-1s rule, such as allowing consecutive 1s in the representation before termination, would introduce multiple possible Zeckendorf-like decompositions for the same integer, thereby breaking the uniqueness of the mapping and potentially leading to decoding ambiguities.[11]Efficiency and Code Length
Fibonacci coding achieves codeword lengths that scale logarithmically with the value being encoded. The length of the codeword for a positive integer is , where is the golden ratio. Specifically, the codeword length is , where is the index of the largest Fibonacci number , and . This follows from the growth rate of the Fibonacci sequence, , ensuring that the representation remains compact for large .[13] The average codeword length for uniformly distributed integers from 1 to is approximately bits. This asymptotic expression accounts for the varying lengths across the range, with shorter codewords for smaller integers pulling the average below the maximum length. Additionally, the exact asymptotic density of ones in the codewords is per bit, reflecting the sparse nature of the Zeckendorf representation where no two consecutive ones appear in the coefficient bits (excluding the terminating one).[14] These properties contribute to the code's efficiency in representing sequences of integers without prior knowledge of their distribution.[13] Relative to information-theoretic bounds, for a uniform distribution over 1 to , the average codeword length is asymptotically bits, compared to the entropy bits, resulting in redundancy of approximately bits per symbol. This factor arises because each codeword requires about bits while the entropy is , yielding . The code's completeness and uniqueness ensure full coverage without gaps or overlaps.[13] Encoding and decoding in Fibonacci coding exhibit time complexity, achieved through precomputation of a table of Fibonacci numbers up to the required index, which takes space and time to build. The encoding process involves greedy selection of the largest possible Fibonacci number, while decoding scans the bitstream until the terminating "11" pattern, leveraging the no-adjacent-ones property for rapid synchronization. This linearithmic performance makes it practical for applications requiring variable-length integer encoding.[13]Comparisons
With Unary and Binary Codes
Fibonacci coding offers significant advantages over unary coding for representing positive integers, particularly as values grow larger. Unary coding encodes the integer as a sequence of ones followed by a zero, yielding a codeword length of bits, which scales linearly and becomes impractical for even moderately large . In contrast, Fibonacci coding employs a variable-length, prefix-free scheme based on the Zeckendorf representation, achieving code lengths that grow logarithmically, approximately bits on average. For instance, encoding 65 requires 66 bits in unary but only 10 bits in Fibonacci coding. Compared to standard binary coding, Fibonacci coding provides prefix-free encoding without needing additional delimiters or fixed block sizes, but at the cost of greater length. Binary representation of uses bits in fixed-length form, which is optimal for known ranges but lacks the prefix property, necessitating extra bits for separation in variable-length contexts. Fibonacci codes, while inherently instantaneous and suitable for concatenation, incur about 44% more bits on average due to the inefficiency of the Fibonacci base versus base 2; for example, numbers up to 825 require 15 bits in Fibonacci versus 10 in binary. However, for small integers in fixed-width binary (e.g., 1 to 31), Fibonacci's variable lengths average 4.6 bits per number, outperforming 5-bit binary blocks. Key trade-offs highlight Fibonacci coding's niche: it eliminates the need for predefined block sizes or length prefixes, making it ideal for streaming unbounded integers, though the base- overhead increases redundancy relative to binary's efficiency. Unary suits minimal values where linear growth is tolerable and decoding is trivial, binary excels with bounded ranges for minimal bits, and Fibonacci is preferred for prefix-free encoding of arbitrary positive integers without length metadata.With Other Universal Codes
Fibonacci coding is often compared to other universal codes such as Elias gamma and exponential Golomb codes, highlighting differences in efficiency, implementation, and error handling. The Elias gamma code encodes a positive integer using a unary prefix of length followed by the binary representation of (omitting the leading 1), yielding an average code length of approximately bits.[15] In contrast, Fibonacci coding leverages the Zeckendorf representation terminated by "11", resulting in code lengths of roughly bits, where is the golden ratio; this makes Fibonacci codes shorter than Elias gamma for many values of , such as 11 bits for versus 13 bits for Elias gamma.[16] However, Elias gamma benefits from simpler implementation relying on standard binary shifts and unary counting, whereas Fibonacci encoding requires Fibonacci number computations or tables.[17] Against exponential Golomb codes, which are parameterized (e.g., via parameter ) for geometric distributions and encode as a quotient in unary followed by a binary remainder, Fibonacci coding offers parameter-free universality suitable for unknown distributions.[17] Benchmarks show Fibonacci codes achieving comparable or better compression for uniform integer data, such as disk access costs of 7,175 MB versus Golomb's 8,477 MB on random 6D datasets, though Golomb excels when the distribution matches its parameter.[17] A key strength of Fibonacci coding is its self-synchronization: the "11" terminator acts as a marker, confining bit errors or insertions/deletions to a single symbol and enabling quick resynchronization, unlike Elias gamma where misalignment can propagate errors across multiple codewords.[18] This prefix-free property allows instantaneous decoding without lookahead, a shared trait with other universal codes, but Fibonacci's no-adjacent-1s constraint (except at termination) reduces flexibility for highly non-uniform sources.[18] Overall, while optimal for representations in the Fibonacci number system, Fibonacci coding trades some efficiency for robustness, typically yielding 20-50% longer codewords than entropy-approaching adaptive methods like Huffman coding under uniform distributions (e.g., 10.6 bits versus 7 bits for integers up to 128).[18]Examples and Illustrations
Basic Numerical Examples
To illustrate Fibonacci coding, consider the Zeckendorf representation, which expresses each positive integer as a unique sum of non-consecutive Fibonacci numbers starting from , where the Fibonacci sequence is defined as , , and for . The codeword is formed by writing the binary digits of this representation from lowest to highest Fibonacci index (left to right), ensuring no two consecutive 1s, and appending a final 1 to make the code prefix-free.[19] For , the Zeckendorf representation is . The binary bits are 1 (for ), and appending 1 yields the codeword "11". To decode, read until the ending "11", discard the last 1, and sum the remaining bits weighted by Fibonacci numbers starting from as the rightmost bit before the terminator: the bit for is 1, so .[19] For , the Zeckendorf representation is . The binary bits are 01 (0 for , 1 for ), and appending 1 yields "011". Decoding discards the last 1, leaving 01; the rightmost bit (before terminator) is for , and the next left is for , summing to 2.[19] For , the Zeckendorf representation is . The binary bits are 001 (0 for , 0 for , 1 for ), and appending 1 yields "0011". Decoding leaves 001 after discarding the last 1; weights from right: , , , summing to 3.[19] A more involved example is , with Zeckendorf representation . The binary bits from to are 010010001 (0 for , 1 for , 0 for , 0 for , 1 for , 0 for , 0 for , 0 for , 1 for ), and appending 1 yields "0100100011". Decoding discards the last 1, leaving 010010001; weights from right: , , , , , , , , , summing to 65.[19] For a step-by-step encoding of using the greedy algorithm from the encoding process: the largest Fibonacci number is ; subtract to get remainder 0, so the representation is . The binary bits from to are 000001 (0 for to , 1 for ), and appending 1 yields "0000011". Decoding leaves 000001, with weights from right: , others 0, summing to 13.[19]| Integer | Zeckendorf Representation | Codeword | Decoded Sum |
|---|---|---|---|
| 1 | 11 | ||
| 2 | 011 | ||
| 3 | 0011 | ||
| 13 | 0000011 | ||
| 65 | 0100100011 |
Visual Representation
Fibonacci coding can be visually represented through structured tables and diagrams that illustrate the mapping of positive integers to their binary codewords, highlighting the underlying Zeckendorf representations as sums of non-adjacent Fibonacci numbers. These visualizations emphasize the no-adjacent-1s property in the representation bits and the terminating "11" suffix, which ensures unique decodability.[7] The following table lists the codewords for the first 10 positive integers , along with their lengths and the corresponding Fibonacci components based on the Zeckendorf representation (using the Fibonacci sequence where , , , , , etc.). The codeword bits (excluding the terminating 1) directly correspond to the coefficients for (leftmost) to higher terms (rightward before termination).[7]| Codeword | Length | Fibonacci Components | |
|---|---|---|---|
| 1 | 11 | 2 | |
| 2 | 011 | 3 | |
| 3 | 0011 | 4 | |
| 4 | 1011 | 4 | |
| 5 | 00011 | 5 | |
| 6 | 10011 | 5 | |
| 7 | 01011 | 5 | |
| 8 | 000011 | 6 | |
| 9 | 100011 | 6 | |
| 10 | 010011 | 6 |
N=1: 11 (F2:1 | term:1)
N=2: 011 (F2:0 F3:1 | term:1)
N=3: 0011 (F2:0 F3:0 F4:1 | term:1)
N=4: 1011 (F2:1 F3:0 F4:1 | term:1)
N=5: 00011 (F2:0 F3:0 F4:0 F5:1 | term:1)
N=6: 10011 (F2:1 F3:0 F4:0 F5:1 | term:1)
N=7: 01011 (F2:0 F3:1 F4:0 F5:1 | term:1)
F2 F3 F4 F5
This alignment shows how 1s map to non-adjacent indices, with the left-to-right order starting from the lowest , ensuring the representation avoids consecutive 1s except at the termination.[7]
The self-synchronization property of Fibonacci codes is illustrated by their ability to recover from bit errors in a concatenated stream, as the "11" suffix uniquely demarcates codeword boundaries and does not appear internally. Consider the stream for : 110110011. If a bit error flips the third bit (yielding 111110011), the decoder correctly decodes the first codeword as "11" to 1, but misdecodes the second as "11" to 1 (instead of 2), and the third as "10011" to 6 (instead of 3). This robustness allows decoding to continue after the error, with typically only a few codewords affected.
Observing patterns across codewords, lengths increase incrementally with (e.g., groups of 2, 3, 5, 8 codewords at lengths 2 through 6, mirroring Fibonacci growth), while the density of 0s rises as higher dominate, reflecting the sparse, non-adjacent 1s in Zeckendorf sums.