Recent from talks
Nothing was collected or created yet.
Lehmer code
View on WikipediaIn mathematics and in particular in combinatorics, the Lehmer code is a particular way to encode each possible permutation of a sequence of n numbers. It is an instance of a scheme for numbering permutations and is an example of an inversion table.
The Lehmer code is named in reference to D. H. Lehmer,[1] but the code had been known since 1888 at least.[2]
The code
[edit]The Lehmer code makes use of the fact that there are
permutations of a sequence of n numbers. If a permutation σ is specified by the sequence (σ1, ..., σn) of its images of 1, ..., n, then it is encoded by a sequence of n numbers, but not all such sequences are valid since every number must be used only once. By contrast the encodings considered here choose the first number from a set of n values, the next number from a fixed set of n − 1 values, and so forth decreasing the number of possibilities until the last number for which only a single fixed value is allowed; every sequence of numbers chosen from these sets encodes a single permutation. While several encodings can be defined, the Lehmer code has several additional useful properties; it is the sequence
in other words the term L(σ)i counts the number of terms in (σ1, ..., σn) to the right of σi that are smaller than it, a number between 0 and n − i, allowing for n + 1 − i different values.
A pair of indices (i,j) with i < j and σi > σj is called an inversion of σ, and L(σ)i counts the number of inversions (i,j) with i fixed and varying j. It follows that L(σ)1 + L(σ)2 + … + L(σ)n is the total number of inversions of σ, which is also the number of adjacent transpositions that are needed to transform the permutation into the identity permutation. Other properties of the Lehmer code include that the lexicographical order of the encodings of two permutations is the same as that of their sequences (σ1, ..., σn), that any value 0 in the code represents a right-to-left minimum in the permutation (i.e., a σi smaller than any σj to its right), and a value n − i at position i similarly signifies a right-to-left maximum, and that the Lehmer code of σ coincides with the factorial number system representation of its position in the list of permutations of n in lexicographical order (numbering the positions starting from 0).
Variations of this encoding can be obtained by counting inversions (i,j) for fixed j rather than fixed i, by counting inversions with a fixed smaller value σj rather than smaller index i, or by counting non-inversions rather than inversions; while this does not produce a fundamentally different type of encoding, some properties of the encoding will change correspondingly. In particular counting inversions with a fixed smaller value σj gives the inversion table of σ, which can be seen to be the Lehmer code of the inverse permutation.
Encoding and decoding
[edit]The usual way to prove that there are n! different permutations of n objects is to observe that the first object can be chosen in n different ways, the next object in n − 1 different ways (because choosing the same number as the first is forbidden), the next in n − 2 different ways (because there are now 2 forbidden values), and so forth. Translating this freedom of choice at each step into a number, one obtains an encoding algorithm, one that finds the Lehmer code of a given permutation. One need not suppose the objects permuted to be numbers, but one needs a total ordering of the set of objects. Since the code numbers are to start from 0, the appropriate number to encode each object σi by is the number of objects that were available at that point (so they do not occur before position i), but which are smaller than the object σi actually chosen. (Inevitably such objects must appear at some position j > i, and (i,j) will be an inversion, which shows that this number is indeed L(σ)i.)
This number to encode each object can be found by direct counting, in several ways (directly counting inversions, or correcting the total number of objects smaller than a given one, which is its sequence number starting from 0 in the set, by those that are unavailable at its position). Another method which is in-place, but not really more efficient, is to start with the permutation of {0, 1, ... n − 1} obtained by representing each object by its mentioned sequence number, and then for each entry x, in order from left to right, correct the items to its right by subtracting 1 from all entries (still) greater than x (to reflect the fact that the object corresponding to x is no longer available). Concretely a Lehmer code for the permutation B,F,A,G,D,E,C of letters, ordered alphabetically, would first give the list of sequence numbers 1,5,0,6,3,4,2, which is successively transformed
where the final line is the Lehmer code (at each line one subtracts 1 from the larger entries to the right of the boldface element to form the next line).
For decoding a Lehmer code into a permutation of a given set, the latter procedure may be reversed: for each entry x, in order from right to left, correct the items to its right by adding 1 to all those (currently) greater than or equal to x; finally interpret the resulting permutation of {0, 1, ... n − 1} as sequence numbers (which amounts to adding 1 to each entry if a permutation of {1, 2, ... n} is sought). Alternatively the entries of the Lehmer code can be processed from left to right, and interpreted as a number determining the next choice of an element as indicated above; this requires maintaining a list of available elements, from which each chosen element is removed. In the example this would mean choosing element 1 from {A,B,C,D,E,F,G} (which is B) then element 4 from {A,C,D,E,F,G} (which is F), then element 0 from {A,C,D,E,G} (giving A) and so on, reconstructing the sequence B,F,A,G,D,E,C.
Applications to combinatorics and probabilities
[edit]Independence of relative ranks
[edit]The Lehmer code defines a bijection from the symmetric group Sn to the Cartesian product , where [k] designates the k-element set . As a consequence, under the uniform distribution on Sn, the component L(σ)i defines a uniformly distributed random variable on [n − i], and these random variables are mutually independent, because they are projections on different factors of a Cartesian product.
Number of right-to-left minima and maxima
[edit]Definition : In a sequence u=(uk)1≤k≤n, there is right-to-left minimum (resp. maximum) at rank k if uk is strictly smaller (resp. strictly bigger) than each element ui with i>k, i.e., to its right.
Let B(k) (resp. H(k)) be the event "there is right-to-left minimum (resp. maximum) at rank k", i.e. B(k) is the set of the permutations which exhibit a right-to-left minimum (resp. maximum) at rank k. We clearly have
Thus the number Nb(ω) (resp. Nh(ω)) of right-to-left minimum (resp. maximum) for the permutation ω can be written as a sum of independent Bernoulli random variables each with a respective parameter of 1/k :
Indeed, as L(k) follows the uniform law on
The generating function for the Bernoulli random variable is
therefore the generating function of Nb is
(using the rising factorial notation), which allows us to recover the product formula for the generating function of the Stirling numbers of the first kind (unsigned).
The secretary problem
[edit]This is an optimal stop problem, a classic in decision theory, statistics and applied probabilities, where a random permutation is gradually revealed through the first elements of its Lehmer code, and where the goal is to stop exactly at the element k such as σ(k)=n, whereas the only available information (the k first values of the Lehmer code) is not sufficient to compute σ(k).
In less mathematical words: a series of n applicants are interviewed one after the other. The interviewer must hire the best applicant, but must make his decision (“Hire” or “Not hire”) on the spot, without interviewing the next applicant (and a fortiori without interviewing all applicants).
The interviewer thus knows the rank of the kth applicant, therefore, at the moment of making his kth decision, the interviewer knows only the k first elements of the Lehmer code whereas he would need to know all of them to make a well informed decision. To determine the optimal strategies (i.e. the strategy maximizing the probability of a win), the statistical properties of the Lehmer code are crucial.
Allegedly, Johannes Kepler clearly exposed this secretary problem to a friend of his at a time when he was trying to make up his mind and choose one out eleven prospective brides as his second wife. His first marriage had been an unhappy one, having been arranged without himself being consulted, and he was thus very concerned that he could reach the right decision.[3]
Similar concepts
[edit]Several related constructions have also been put into use. One of them is often called inversion vector, e.g. by Wolfram Alpha. See also Inversion (discrete mathematics) § Inversion related vectors.
References
[edit]- ^
Lehmer, D.H. (1960), "Teaching combinatorial tricks to a computer", Combinatorial Analysis, Proceedings of Symposia in Applied Mathematics, vol. 10, pp. 179–193, doi:10.1090/psapm/010/0113289, ISBN 978-0-8218-1310-2, MR 0113289
{{citation}}: ISBN / Date incompatibility (help) - ^ Laisant, Charles-Ange (1888), "Sur la numération factorielle, application aux permutations" [On factorial numbering, application to permutations], Bulletin de la Société Mathématique de France (in French), 16: 176–183, doi:10.24033/bsmf.378
- ^ Ferguson, Thomas S. (August 1989), "Who solved the secretary problem?" (PDF), Statistical Science, 4 (3): 282–289, doi:10.1214/ss/1177012493, JSTOR 2245639
Bibliography
[edit]- Mantaci, Roberto; Rakotondrajao, Fanja (2001), "A permutation representation that knows what "Eulerian" means", Discrete Mathematics and Theoretical Computer Science (4): 101–108, archived from the original on 2004-11-16
- Knuth, Donald (1981), The Art of Computer Programming, vol. 3, Reading: Addison-Wesley, pp. 12–13
Lehmer code
View on GrokipediaHistory and Definition
Historical Background
The Lehmer code traces its origins to the late 19th century, when French mathematician Charles-Ange Laisant introduced a factorial numbering system for representing permutations in his seminal paper. In this work, Laisant described a method to enumerate and encode permutations using factorial bases, providing a systematic way to assign unique indices to each permutation without employing the modern terminology of "Lehmer code." This innovation emerged within the broader context of enumerative combinatorics, where Laisant and contemporaries sought efficient tools for cataloging combinatorial objects during an era of growing interest in permutation theory.[4] The encoding was first proposed around 1906 by American mathematician Derrick Norman Lehmer and later elaborated in detail by his son, Derrick Henry Lehmer. The concept gained prominence in the mid-20th century through the efforts of Derrick Henry Lehmer, who adapted and popularized the encoding scheme for computational purposes. In his 1960 paper, Lehmer presented the code as a practical tool for generating and manipulating permutations on early computers, emphasizing its utility in combinatorial enumeration algorithms. This development reflected the era's shift toward computer-assisted mathematics, where Lehmer's expertise in numerical analysis and permutation generation addressed the need for efficient data structures in automated problem-solving.[5] Although the Lehmer code is named after Derrick Henry Lehmer, its foundational ideas bear resemblance to earlier notions like inversion tables, which served as precursors in permutation analysis. The code's evolution from Laisant's theoretical framework to the Lehmers' computational applications underscores its enduring role in bridging classical combinatorics with modern computing.Formal Definition
The Lehmer code of a permutation of the set is the sequence , where each entry is defined as for .[1] This counts, for each position , the number of elements to the left of in the permutation that are larger than , providing a measure of the left-inversions involving the element at position .[1] For example, consider the permutation of {{grok:render&&&type=render_inline_citation&&&citation_id=4&&&citation_type=wikipedia}}.- For , ; no elements to the left, so .
- For , ; the element to the left is (one element), so .
- For , ; the elements to the left are and (zero elements larger), so .
- For , ; the elements to the left are , , (two elements), so .
Thus, .[1]
Encoding and Decoding
Encoding a Permutation
To encode a permutation of elements into its Lehmer code , where each satisfies , proceed sequentially from left to right, tracking the set of unused elements at each step. This process counts, for each position , the number of unused elements smaller than that have not yet appeared in the prefix to . Equivalently, is the number of elements to the right of position that are smaller than , as the unused elements at step consist precisely of and the elements in positions to .[3] The algorithm is as follows:- Initialize the set of available elements as (assuming the permutation is of the numbers 1 through ).
-
For each position from 1 to :
- Identify from the permutation.
- Count the number of elements in the current that are strictly smaller than ; this count is .
- Remove from .
- The resulting sequence is the Lehmer code.
- Start with . For , ; smaller elements in : 1 (count=1). Set , remove 2; .
- For , ; smaller in : 1,3,4,5 (count=4). Set , remove 6; .
- For , ; smaller in : none (count=0). Set , remove 1; .
- For , ; smaller in : 3,4,5 (count=3). Set , remove 7; .
- For , ; smaller in : 3 (count=1). Set , remove 4; .
- For , ; smaller in : 3 (count=1). Set , remove 5; .
- For , ; smaller in : none (count=0). Set , remove 3; .
Decoding a Lehmer Code
The Lehmer code establishes a bijection between the set of all permutations of elements and the set of integer sequences satisfying for each .[6] Decoding reconstructs the unique permutation corresponding to via an algorithm that operates in linear time .[6] The algorithm begins with the set of available elements , sorted in increasing order. For each position from 1 to , it selects the -th smallest element from the current as and removes that element from . This process builds from left to right, ensuring each choice respects the inversion counts encoded in .[6] In detail, for each , specifies the number of smaller unused elements to skip before selecting the next one; the selected element is thus the one that would have exactly remaining smaller elements placed after it in the permutation. This step-wise selection guarantees the bijection, as every valid produces a unique and vice versa.[6] The decoding is the inverse of the encoding operation, which counts right inversions for each position.[6] Consider the example with and . Start with .- For , : Select the 2nd smallest in (which is 2); set , update .
- For , : Select the 5th smallest in (1, 3, 4, 5, 6 → 6); set , update .
- For , : Select the 1st smallest in (1); set , update .
- For , : Select the 4th smallest in (3, 4, 5, 7 → 7); set , update .
- For , : Select the 2nd smallest in (3, 4 → 4); set , update .
- For , : Select the 2nd smallest in (3, 5 → 5); set , update .
- For , : Select the 1st smallest in (3); set , update .
Properties
Relation to the Factorial Number System
The Lehmer code provides a direct bijection between the set of all permutations of elements, denoted , and the integers in the range through its interpretation as a representation in the factorial number system. This connection was introduced by Derrick Lehmer to facilitate computational enumeration of permutations by assigning each one a unique numerical index corresponding to its position in colexicographic order. The factorial number system, also known as the factoradic system, is a mixed radix numeral system where the place values are the factorials for positions indexed from the right starting at , and each digit satisfies . This ensures every nonnegative integer up to has a unique representation without leading zeros beyond the necessary positions. For a permutation , the reversed Lehmer code (where is as defined) uses digits that exactly fit these constraints, as the values count the inversions in a way that aligns with colex ordering from right to left.[7] The rank , or colexicographic index, is then computed as the value of this factoradic number: This formula maps each permutation uniquely to its ordinal position, starting from 0 for the identity permutation. For instance, with and Lehmer code (so reversed ), the rank is This places as the 1222nd permutation (1-indexed) in colexicographic order among the total.[8] This numerical bijection enables efficient sorting and traversal of permutations by treating them as integers in the factorial base, which underpins algorithms for generating permutations in colexicographic sequence without redundant computations.Statistical Properties and Independence
When considering a uniformly random permutation of , the components of its Lehmer code exhibit notable probabilistic behavior. Specifically, each is uniformly distributed over the set , and the are mutually independent random variables.[9] This uniformity arises because, for each fixed , the value counts the number of elements to the left of position that are larger than ; since is random, is equally likely to be in any rank among the prefix , yielding for . The independence follows from the fact that the relative ordering within each prefix is uniformly random and the counts decouple across positions due to the exchangeability of the permutation.[9] The sum equals the total number of inversions in , and under the uniform distribution, this sum has expected value . This expectation can be derived by linearity: each possible pair with forms an inversion with probability , and there are such pairs. In the Lehmer code, the positions where precisely correspond to the left-to-right maxima of , as means no element to the left of is larger than . The number of left-to-right maxima is thus the count of such zeros in , while the number of right-to-left maxima can be expressed analogously using a suitable transformation of the code components.[9] The independence of the under uniform random permutations enables efficient sampling and simulation techniques, facilitating Monte Carlo methods in modern combinatorics, such as learning distributions over permutation spaces in post-2020 computational studies.[9]Applications
Combinatorial Enumeration
Lehmer codes play a central role in combinatorial enumeration by enabling unranking, the process of generating the specific permutation corresponding to a given rank in the range under lexicographic order. The rank is expressed in the factorial number system, yielding digits where and . These digits form the Lehmer code, and decoding proceeds by successively selecting the -th (0-based index) unused element from the remaining set to build the permutation.[10] This bijection ensures every rank maps uniquely to a permutation, facilitating direct access without generating preceding ones.[11] In applications, Lehmer codes support efficient algorithms for enumerating all permutations without duplicates or omissions, as successive unranking from ranks 0 to produces the complete lexicographic list. Historically, Derrick H. Lehmer incorporated these codes into mechanical computing devices, known as teaching machines, to solve combinatorial problems like permutation generation and counting, as detailed in his foundational work on machine tools for combinatorics.[12] Such methods allowed early digital and analog computers to tackle enumeration tasks that were otherwise infeasible by exhaustive search.[8] For illustration, consider and rank (0-based). The factorial representation is , yielding Lehmer code digits . Starting with the set :- Select the 0-th element: 1; remaining: .
- Select the 2-nd element: 4; remaining: .
- Select the 1-st element: 3; remaining: .
- Select the 0-th element: 2.
itertools.permutations introduced in version 2.6 (2008), efficient generation leverages lexicographic ordering akin to unranking techniques, enabling scalable enumeration for practical combinatorial tasks.[15][16] The independence of Lehmer code digits further supports uniform sampling in enumeration algorithms.[10]
