Example

$ \def\sa#1{\tt{#1}} \def\dd{\dot{}\dot{}} \def\SA{\mathrm{SA}} \def\LCP{\mathrm{LCP}} \def\Rank{\mathrm{Rank}} \def\lcp{\mathit{lcp}} $

Below are the tables associated with the example word $\sa{aababa}$. Its sorted list of suffixes is $\sa{a}$, $\sa{aababa}$, $\sa{aba}$, $\sa{ababa}$, $\sa{ba}$ and $\sa{baba}$ whose starting positions are 5, 0, 3, 1, 4 and 2. This latter list is stored in $\SA$ indexed by suffix ranks.

$i$ 0 1 2 3 4 5

$x[i]$ $\sa{a}$ $\sa{a}$ $\sa{b}$ $\sa{a}$ $\sa{b}$ $\sa{a}$

$\Rank[i]$ 1 3 5 2 4 0

$r$ 0 1 2 3 4 5 6 7 8 9 10 11 12

$\SA[r]$ 5 0 3 1 4 2

$\LCP[r]$ 0 1 1 3 0 2 0 0 1 0 0 0 0

Suffix Array

The Suffix array of a word is also used to produce indexes but proceeds differently than with trees or automata. It consists primarily in sorting the non-empty suffixes of the word to allow binary search for its factors. To get actually efficient searches another feature is considered: the longest common prefixes of successive suffixes in the sorted list.

The information is stored in two arrays, $\SA$ and $\LCP$. The array $\SA$ is the inverse of the array $\Rank$ that gives the rank of each suffix attached at its starting position.

The table $\LCP$ essentially contains longest common prefixes stored as maximal lengths of common prefixes between successive suffixes: \[\LCP[r]=|\lcp(x[\SA[r-1]\dd |x|-1],x[\SA[r]\dd |x|-1])|,\] where $\lcp$ denotes the longest common prefix between two words. This gives $\LCP[0\dd 6]$ for the example. The next values in $\LCP[7\dd 12]$ correspond to the same information for suffixes starting at positions $d$ and $f$ when the pair $(d,f)$ appears in the binary search. Formally, for such a pair, the value is stored at position $|x|+1+\lfloor (d+f)/2 \rfloor$. For example, in the above $\LCP$ array the value $1$ corresponding to the pair $(0,2)$, maximal length of prefixes between $x[5\dd 5]$ and $x[3\dd 5]$, is stored at position~$8$.

The table $\Rank$ is used in applications of the Suffix array that are mainly other than searching.

$i$	0	1	2	3	4	5
$x[i]$	$\sa{a}$	$\sa{a}$	$\sa{b}$	$\sa{a}$	$\sa{b}$	$\sa{a}$
$\Rank[i]$	1	3	5	2	4	0
$r$	0	1	2	3	4	5	6	7	8	9	10	11	12
$\SA[r]$	5	0	3	1	4	2
$\LCP[r]$	0	1	1	3	0	2	0	0	1	0	0	0	0

125 Problems in Text Algorithms

Example

Suffix Array