CLR cover

Problem 48: Linear Suffix Trie

The Suffix trie of a word can be of quadratic size according to the word length. On the contrary, its Suffix tree requires only a linear amount of space for its storage, but the space should include the word itself.

The goal is to design a Suffix trie with edges labelled by single letters and that can be stored in linear space without the word itself. This is done by adding extra nodes and a few elements to the Suffix tree.

A node of the Suffix trie of $y$ is identified to the factor of $y$ that labels the path from the root to the node. Nodes in the linear Suffix trie $\LST(y)$ that are not in the Suffix tree $\aSTree(y)$ are of the form $au$, where $a$ is a letter and $u$ is a node of $\aSTree(y)$. That is, denoting by $s$ the suffix link of the tree, $s(au)=u$. When nodes are added to $\aSTree(y)$ to create $\LST(y)$ edges are relabelled accordingly.

Show the number of extra nodes added to the Suffix tree of a word $y$ to create its linear Suffix trie is less than $|y|$.

Labels of edges in $\LST(y)$ are reduced to the first letter of the corresponding factor as follows. If $v$, $|v|\gt 1$, labels the edge from $u$ to $uv$ in $\aSTree(y)$, the label of the associated edge in $\LST(y)$ is the first letter of $v$ and the node $uv$ is marked with the $+$ sign to indicate the actual label is longer.

Design an algorithm that checks if $x$ occurs in $y$ using the linear Suffix trie $\LST(y)$ and runs in time $O(|x|)$ on a fixed alphabet.
Edge labels can be recovered using suffix links.

References

  • M. Crochemore, C. Epifanio, R. Grossi, and F. Mignosi. Linear-size suffix tries. Theor. Comput. Sci., 638:171-178, 2016.
  • D. Hendrian, T. Takagi, and S. Inenaga. Online algorithms for constructing linear-size suffix trie. CoRR, abs/1901.10045, 2019.