|
Problem 19: Border Table
|
|
The border table, as well as the prefix table in
Problem 22, are
basic tools for building efficient algorithms on words.
They are used mostly
for searching texts online for various types of given patterns.
The border table of a non-empty word $x$ is defined on the
lengths $\ell$, $\ell=0, \dots, |x|$, of its prefixes both by
$border[0] = -1$ and, for $\ell\gt 0$, by $border[\ell] = |Border(x[0..\ell-1])|$.
Here is the border table of the word abaababaaba:
$i$ | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
$x[i]$ | | a | b | a | a | b | a | b | a | a | b | a |
$\ell$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
$border[\ell]$ | -1 | 0 | 0 | 1 | 1 | 2 | 3 | 2 | 3 | 4 | 5 | 6 |
Prove that Algorithm Borders correctly computes the border table of
a non-empty word $x$ and executes a maximum of $2|x|-3$ letter comparisons when $|x|\gt 1$.
Borders$(x \mbox{ non-empty word})$
\begin{algorithmic}
\STATE $border[0]\leftarrow -1$
\FOR{$i\leftarrow 0$ \TO $|x|-1$}
\STATE $\ell\leftarrow border[i]$
\WHILE{$\ell\geq 0$ \AND $x[\ell]\neq x[i]$}
\STATE $\ell\leftarrow border[\ell]$
\ENDWHILE
\STATE $border[i+1]\leftarrow \ell+1$
\ENDFOR
\RETURN $border$
\end{algorithmic}
Example
Let us consider the prefix $u=$
abaababa of the above word.
Its border is $v=$
aba of length $3=border[8]$.
The next letter
a extends the border, that is, $Border(u$
a$) = v$
a.
If the next letter $c$ is not
a, the border of $uc$ is the border
of $vc$, which sketches the proof of the recurrence relation, for $u$ a word
and $c$ a letter:
$$
Border(uc)=\cases{
Border(u)c & if $Border(u)c$ is a prefix of $u$, \cr
Border(Border(u)c) & otherwise. \cr
}
$$
Show how to detect the non-primitive prefixes of a word using its border table.
References
M. Crochemore, C. Hancart, and T. Lecroq. Algorithms on Strings.
Cambridge University Press, 2007. 392 pages.
M. Crochemore and W. Rytter. Text algorithms. Oxford University
Press, 1994. 412 pages.
D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer
Science and Computational Biology. Cambridge University Press, Cambridge,
1997.
G. Navarro and M. Raffinot. Flexible pattern matching in strings -
practical on-line search algorithms for texts and biological sequences.
Cambridge University Press, 2002.
B. Smyth. Computing Patterns in Strings. Pearson Education Limited,
Harlow, England, 2003. 423 pages.