Suffix array
Suffix array
Main page

Suffix array

logo
Community Hub0 subscribers
What are your thoughts?
Be the first to start a discussion here.
Be the first to start a discussion here.
Suffix array

In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used in, among others, full-text indices, data-compression algorithms, and the field of bibliometrics.

Suffix arrays were introduced by Manber & Myers (1990) as a simple, space efficient alternative to suffix trees. They had independently been discovered by Gaston Gonnet in 1987 under the name PAT array (Gonnet, Baeza-Yates & Snider 1992).

Li, Li & Huo (2016) gave the first in-place time suffix array construction algorithm that is optimal both in time and space, where in-place means that the algorithm only needs additional space beyond the input string and the output suffix array.

Enhanced suffix arrays (ESAs) are suffix arrays with additional tables that reproduce the full functionality of suffix trees preserving the same time and memory complexity. A sorted array of only some (rather than all) suffixes of a string is called a sparse suffix array.

Let be an -string and let denote the substring of ranging from to inclusive.

The suffix array of is now defined to be an array of integers providing the starting positions of suffixes of in lexicographical order. This means, an entry contains the starting position of the -th smallest suffix in and thus for all : .

Each suffix of shows up in exactly once. Suffixes are simple strings. These strings are sorted (as in a paper dictionary), before their starting positions (integer indices) are saved in .

Consider the text =banana$ to be indexed:

See all
User Avatar
No comments yet.