Lyndon word

In mathematics, in the areas of combinatorics and computer science, a Lyndon word is a nonempty string that is strictly smaller in lexicographic order than all of its rotations. Lyndon words are named after the mathematician Roger Lyndon, who investigated them in 1954 under the name standard lexicographic sequences. Anatoly Shirshov had introduced them a year earlier, in 1953, under the name regular words. Lyndon words are a special case of Hall words; almost all properties of Lyndon words are shared by Hall words.

Several equivalent definitions exist.

A $k$-ary Lyndon word of length $n > 0$ is an $n$-character string over an alphabet of size $k$ that is the unique minimum element, in lexicographic order, of the multiset of all its rotations. Being the unique smallest rotation implies that a Lyndon word differs from each of its non-trivial rotations, and is therefore aperiodic.
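
The rotation-based definition can be checked directly by brute force. The following Python sketch is only an illustration: the function names `rotations` and `is_lyndon_by_rotations` are chosen here and are not standard, and it assumes that the alphabet order agrees with Python's built-in string comparison (true for the digits and lowercase letters used below).

```python
def rotations(word: str) -> list[str]:
    """Return all len(word) rotations of word, starting with word itself."""
    return [word[i:] + word[:i] for i in range(len(word))]


def is_lyndon_by_rotations(word: str) -> bool:
    """True if word is nonempty and strictly smaller than every non-trivial rotation."""
    if not word:
        return False
    # Skip index 0, which is the trivial rotation (the word itself).
    # A periodic word equals one of its non-trivial rotations, so the
    # strict comparison rejects it.
    return all(word < rot for rot in rotations(word)[1:])


if __name__ == "__main__":
    for w in ["0", "01", "10", "00", "0101", "00101", "aab"]:
        print(w, is_lyndon_by_rotations(w))
```

In this sketch "0101" is rejected because it coincides with its rotation by two positions, which is the periodic case excluded by the definition.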

Alternatively, a word $w$ is a Lyndon word if and only if it is nonempty and lexicographically strictly smaller than any of its proper suffixes, that is, $w < v$ for all nonempty words $v$ such that $w = uv$ and $u$ is nonempty.
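
The suffix characterisation translates into an equally short test. This is a minimal sketch with a hypothetical function name, `is_lyndon_by_suffixes`. It relies on the fact that Python's string comparison treats a proper prefix as smaller than the longer string, the usual convention for lexicographic order on words of different lengths.

```python
def is_lyndon_by_suffixes(word: str) -> bool:
    """True if word is nonempty and strictly smaller than each proper suffix."""
    if not word:
        return False
    # word[i:] for 1 <= i < len(word) enumerates the nonempty proper suffixes.
    return all(word < word[i:] for i in range(1, len(word)))


if __name__ == "__main__":
    for w in ["0", "01", "10", "00", "0101", "00101"]:
        print(w, is_lyndon_by_suffixes(w))
```

For example, "00" fails because its proper suffix "0" is a prefix of "00" and therefore smaller, while a one-letter word passes vacuously because it has no nonempty proper suffix.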

Another characterisation is the following: a Lyndon word is nonempty and, whenever it is split into two nonempty substrings, the left substring is lexicographically less than the right substring. That is, if $w$ is a Lyndon word and $w = uv$ is any factorization into two substrings, with $u$ and $v$ understood to be nonempty, then $u < v$. This definition implies that a string $w$ of length $\geq 2$ is a Lyndon word if and only if there exist Lyndon words $u$ and $v$ such that $u < v$ and $w = uv$. Although there may be more than one choice of $u$ and $v$ with this property, there is a particular choice, called the standard factorization, in which $v$ is as long as possible.
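
To make the factorization statement concrete, the sketch below lists every way of writing a Lyndon word as a product of two Lyndon words with the left factor smaller, and picks the one whose right factor is longest. The names `is_lyndon`, `lyndon_factor_pairs` and `standard_factorization` are hypothetical, and the search is a plain brute force over split points rather than an efficient algorithm.

```python
def is_lyndon(word: str) -> bool:
    """Suffix test: nonempty and strictly smaller than each proper suffix."""
    return bool(word) and all(word < word[i:] for i in range(1, len(word)))


def lyndon_factor_pairs(word: str) -> list[tuple[str, str]]:
    """All (u, v) with word == u + v, u and v Lyndon words, and u < v.

    Pairs are listed by increasing length of u, so the right factor of
    the first pair is the longest one available.
    """
    pairs = []
    for i in range(1, len(word)):
        u, v = word[:i], word[i:]
        if is_lyndon(u) and is_lyndon(v) and u < v:
            pairs.append((u, v))
    return pairs


def standard_factorization(word: str) -> tuple[str, str]:
    """Return the pair (u, v) in which the right factor v is as long as possible."""
    if len(word) < 2 or not is_lyndon(word):
        raise ValueError("expected a Lyndon word of length at least 2")
    return lyndon_factor_pairs(word)[0]


if __name__ == "__main__":
    print(lyndon_factor_pairs("0011"))      # [('0', '011'), ('001', '1')]
    print(standard_factorization("0011"))   # ('0', '011')
    print(standard_factorization("00101"))  # ('001', '01')
```

The word "0011" shows that the pair is not unique: both ("0", "011") and ("001", "1") qualify, and the first has the longer right factor.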

The Lyndon words over the two-symbol binary alphabet {0,1}, sorted by length and then lexicographically within each length class, form an infinite sequence that begins

0, 1, 01, 001, 011, 0001, 0011, 0111, 00001, 00011, 00101, 00111, 01011, 01111, ...

The first string that does not belong to this sequence, "00", is omitted because it is periodic (it consists of two repetitions of the substring "0"); the second omitted string, "10", is aperiodic but is not minimal in its permutation class as it can be cyclically permuted to the smaller string "01".
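
The start of this sequence can be reproduced by enumerating all binary strings in the stated order and keeping those that pass a Lyndon test. The sketch below is a brute-force illustration with hypothetical function names, not an efficient generation algorithm; it uses `itertools.product`, which yields tuples in lexicographic order of the input alphabet.

```python
from itertools import product


def is_lyndon(word: str) -> bool:
    """Suffix test: nonempty and strictly smaller than each proper suffix."""
    return bool(word) and all(word < word[i:] for i in range(1, len(word)))


def binary_lyndon_words(max_length: int) -> list[str]:
    """Binary Lyndon words of length 1..max_length, by length, then lexicographically."""
    words = []
    for n in range(1, max_length + 1):
        # product("01", repeat=n) visits the 2**n strings of length n
        # in lexicographic order, so no extra sorting is needed.
        for chars in product("01", repeat=n):
            candidate = "".join(chars)
            if is_lyndon(candidate):
                words.append(candidate)
    return words


if __name__ == "__main__":
    print(", ".join(binary_lyndon_words(5)))
    # 0, 1, 01, 001, 011, 0001, 0011, 0111, 00001, 00011, 00101, 00111, 01011, 01111
```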

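If the nonemptiness requirement is dropped, the empty string also meets the definition, as a Lyndon word of length zero. Counting it, the numbers of binary Lyndon words of each length, starting with length zero, form the integer sequence

1, 2, 1, 2, 3, 6, 9, 18, 30, 56, 99, 186, 335, ...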
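
As a sanity check, these counts can be recomputed for small lengths by grouping binary strings into classes of mutual rotations and counting the classes that are aperiodic, since each such class contains exactly one Lyndon word (its smallest member). The function name `count_binary_lyndon_words` is hypothetical, the value returned for length zero simply encodes the convention of counting the empty string, and the running time grows as $2^n$, so this is only a sketch for small $n$.

```python
from itertools import product


def count_binary_lyndon_words(n: int) -> int:
    """Count binary Lyndon words of length n via aperiodic rotation classes."""
    if n == 0:
        return 1  # convention: the empty string is counted as a Lyndon word
    representatives = set()
    for chars in product("01", repeat=n):
        word = "".join(chars)
        rots = {word[i:] + word[:i] for i in range(n)}
        # A word is aperiodic exactly when its n rotations are all distinct.
        if len(rots) == n:
            # The smallest rotation is the Lyndon word of this class.
            representatives.add(min(rots))
    return len(representatives)


if __name__ == "__main__":
    print([count_binary_lyndon_words(n) for n in range(11)])
    # [1, 2, 1, 2, 3, 6, 9, 18, 30, 56, 99]
```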
