# Problem 22: Prefix Table

$$\def\sa#1{\tt{#1}} \def\tpref{\mathit{pref}} \def\dd{\dot{}\dot{}}$$

The prefix table, like the border table of Problem 19, is a basic tool for building efficient algorithms on words. It is used mostly when searching texts for various types of given patterns.

Let $x$ be a non-empty string. The prefix table of $x$ is defined on its positions $i$, $i=0, \dots, |x|-1$, by: $\tpref[i]$ is the length of the longest prefix of $x$ starting at position $i$. Obviously $\tpref[0]=|x|$.

Here is the prefix table of the word $\sa{abaababaaba}$:
 $i$ 0 1 2 3 4 5 6 7 8 9 10 $x[i]$ $\sa a$ $\sa b$ $\sa a$ $\sa a$ $\sa b$ $\sa a$ $\sa b$ $\sa a$ $\sa a$ $\sa b$ $\sa a$ $\tpref[i]$ 11 0 1 3 0 6 0 1 3 0 1

Show that Algorithm Prefixes computes in linear time the prefix table of its input word $x$ and executes no more than $2|x|-2$ letter comparisons.

Prefixes$(x \textrm{ non-empty word})$
    \begin{algorithmic}
\STATE $pref[0]\leftarrow |x|$
\STATE $g\leftarrow 0$
\FOR{$i\leftarrow 1$ \TO $|x|-1$}
\IF{$i\lt g \mbox{ and } pref[i-f]\ne g-i$}
\STATE $pref[i]\leftarrow \min\{pref[i-f],g-i\}$
\ELSE
\STATE $(f,g)\leftarrow (i,\max\{g,i\})$
\WHILE{$g\lt |x|$ and $x[g]=x[g-f]$}
\STATE $g\leftarrow g+1$
\ENDWHILE
\STATE $pref[f]\leftarrow g-f$
\ENDIF
\ENDFOR
\RETURN{$pref$}
\end{algorithmic}


The key idea in Algorithm Prefixes that computes the table sequentially from left to right is to benefit from what has been computed before the current position.

When $v=x[f\dd g-1]$ is a prefix of $x$ and position $i$ is between $f$ and $g$ (see picture), the first step for computing $\tpref[i]$ is to check whether or not its value can be deduced from the work done on the prefix occurrence of $v$, that is, at position $i-f$ on $x$. This saves enough work to get a linear-time algorithm.

## References

• M. Crochemore, C. Hancart, and T. Lecroq. Algorithms on Strings. Cambridge University Press, 2007. 392 pages.
• M. Crochemore and W. Rytter. Text algorithms. Oxford University Press, 1994. 412 pages.
• D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge, 1997.
• G. Navarro and M. Raffinot. Flexible pattern matching in strings - practical on-line search algorithms for texts and biological sequences. Cambridge University Press, 2002.
• B. Smyth. Computing Patterns in Strings. Pearson Education Limited, Harlow, England, 2003. 423 pages.