CLR cover
Previous problem

Problem 33: Good-Suffix Table

Next problem
\( \def\dd{\dot{}\dot{}} \def\tbsuff{\mathit{good\text{-}suff}} \def\per{\mathit{per}} \)

The Boyer-Moore algorithm (BM in Problem 34) applies the sliding window strategy on the text to locate occurrences of a pattern. It requires a pattern preprocessing to accelerate the search.

At a given step, the algorithm compares the pattern and a window on the text by computing their longest common suffix $u$. If $u=x$ a match occurs. Otherwise, in the generic situation, pattern $x[0\dd m-1]$ is aligned with the window, factor $y[j-m+1\dd j]$ of the text, $au$ is a suffix of $x$ and $bu$ a suffix of the window, for different letters $a$ and $b$.

Good Suffix Table

To continue the search, Algorithm BM slides the window according to the period of $x$ in case of a match or otherwise to the factor $bu$ of the text to avoid positions of the window where no occurrence of $x$ is possible. To do so it uses the good-suffix table $\tbsuff$ defined for a position $i$ on $x$ and $u=x[i+1\dd m-1]$ by $$\tbsuff[i]=\min\{|v| \mid x \mbox{ suffix of } uv \mbox{ or } cuv \mbox{ suffix of } x, c\neq x[i]\}.$$

The condition "$cuv \mbox{ suffix of } x$" with $c\neq a=x[i]$ ensures that when letter $c$ is aligned with letter $b=y[j-m+1+i]$ after sliding the window, the same mismatch does not re-occur immediately. From the definition note that $\tbsuff[0]=\per(x)$.

Design a linear-time algorithm for computing the good suffix table of a word.
Use the reverse table of prefixes of $x^\mathrm{R}$.

References

  • A. Apostolico and R. Giancarlo. The Boyer-Moore-Galil string searching strategies revisited. SIAM J. Comput., 15(1):98-105, 1986.
  • R. S. Boyer and J. S. Moore. A fast string searching algorithm. Commun. ACM, 20(10):762-772, 1977.
  • M. Crochemore and T. Lecroq. Tight bounds on the complexity of the Apostolico-Giancarlo algorithm. Inf. Process. Lett., 63(4):195-203, 1997.
  • D. E. Knuth, J. H. Morris Jr., and V. R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(2):323-350, 1977.
  • W. Rytter. A correct preprocessing algorithm for Boyer-Moore stringsearching. SIAM J. Comput., 9(3):509-512, 1980.