A quick tour on suffix arrays and compressed suffix arrays

doi:10.1016/j.tcs.2010.12.036

Abstract

Suffix arrays are a key data structure for solving a range of problems on texts and sequences, from data compression and information retrieval to biological sequence analysis and pattern discovery. In their simplest version, they can be seen as a permutation of the elements in $\{1, 2, \dots, n\}$, encoding the sorted sequence of suffixes of a given text of length $n$ under the lexicographic order. Yet they are on a par with the ubiquitous and sophisticated suffix trees. Over the years, many interesting combinatorial properties have been identified for this special class of permutations: for instance, they can implicitly encode extra information, and they are a well-characterized subset of the $n!$ permutations. This paper gives a short tutorial on suffix arrays and their compressed version, exploring and reviewing some of their algorithmic features and discussing the space issues related to their usage in text indexing, combinatorial pattern matching, and data compression.
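
To make the definition concrete, the following minimal Python sketch builds the suffix array of a short text by sorting its suffixes directly. It is an illustration only, not a construction taken from the paper: the function name `suffix_array` is hypothetical, positions are 1-based to match the set $\{1, 2, \dots, n\}$, no end-of-text sentinel is assumed, and the naive sort is far slower than the specialized construction algorithms used in practice.

```python
def suffix_array(text):
    """Return the 1-based start positions of the suffixes of `text`,
    listed in lexicographic order of the suffixes they begin.

    Naive illustration: each comparison may scan O(n) characters,
    so this is only suitable for short texts.
    """
    n = len(text)
    # text[i - 1:] is the suffix starting at 1-based position i.
    return sorted(range(1, n + 1), key=lambda i: text[i - 1:])


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array(text)
    print(sa)  # [6, 4, 2, 1, 5, 3]
    # The result is a permutation of {1, ..., n}.
    print(sorted(sa) == list(range(1, len(text) + 1)))  # True
```

For the text "banana", the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana", which start at positions 6, 4, 2, 1, 5, 3 respectively.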

Theoretical Computer Science
