Problem 55: Bare Suffix Tree |
Suffix trees provide a data structure for indexing texts. Optimal-time constructions of them suffer from a rather high memory requirement, larger than for Suffix arrays with the same usage. The problem deals with a moderately efficient but not completely naive and very simple construction of Suffix trees.
The Suffix tree $T$ of a word $x$ ending with a unique end marker is the compacted trie of suffixes of $x$. A leaf corresponds to a suffix and an internal node to a factor having at least two occurrences followed by different letters. Each edge is labelled by a factor $x[i\dd j]$ of $x$, represented by the pair $(i,j)$. Its word-length is $|x[i\dd j]|=j-i+1$. The word-length of a path in $T$ is the sum of word-lengths of its edges, while the length of the path is its number of edges. Let $\depth(T)$ be the maximum length of a path in $T$ from the root to a leaf. Let $l_i$ be the leaf ending the branch labelled by $x[i\dd n-1]$.