|
Problem 53: LPF Table
|
|
\(
\def\sa#1{\tt{#1}}
\def\dd{\dot{}\dot{}}
\def\LPF{\mathrm{LPF}}
\def\LCP{\mathrm{LCP}}
\def\Rank{\mathrm{Rank}}
\def\prev{\mathit{prev}}
\def\nextt{\mathit{next}}
\)
The problem deals with yet another table on words called abusively the
longest previous factor table.
It is a useful tool to factorise words
for data compression (see Problem 97) and
more generally to design efficient algorithms for finding repeats in texts.
For a non-empty word $y$, its table $\LPF$ stores lengths of
repeating factors.
More precisely, for a position $j$ on $y$, $\LPF[j]$ is
the maximal length of factors that starts both at position $j$ and at a
previous (i.e. smaller) position.
Here is the table for $\sa{abaabababbabbb}$.
$j$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
$y[j]$ | $\sa{a}$ | $\sa{b}$ | $\sa{a}$ | $\sa{a}$ | $\sa{b}$ | $\sa{a}$ | $\sa{b}$ | $\sa{a}$ | $\sa{b}$ | $\sa{b}$ | $\sa{a}$ | $\sa{b}$ | $\sa{b}$ | $\sa{b}$ |
$\LPF[j]$ | 0 | 0 | 1 | 3 | 2 | 4 | 3 | 2 | 1 | 4 | 3 | 2 | 2 | 1 |
The next algorithm computes the table $\LPF$ for its input word $y$.
It utilises the Suffix array of $y$ and the table $\Rank$ that gives ranks of
its suffixes in lexicographic order.
Tables $\prev$ and $\nextt$ are links
for a list representation of suffix ranks.
Lpf$(y \textrm{ non-empty word})$
\begin{algorithmic}
\FOR{$r\leftarrow 0$ \TO $|y|-1$}
\STATE $(prev[r],next[r])\leftarrow (r-1,r+1)$
\ENDFOR
\FOR{$j\leftarrow |y|-1$ \DOWNTO $0$}
\STATE $r\leftarrow \mathrm{Rank}[j]$
\STATE $\mathrm{LPF}[j]\leftarrow \max\{\mathrm{LCP}[r],\mathrm{LCP}[next[r]]\}$
\STATE $\mathrm{LCP}[next[r]]\leftarrow \min\{\mathrm{LCP}[r],\mathrm{LCP}[next[r]]\}$
\IF{$prev[r] \geq 0$}
\STATE $next[prev[r]]\leftarrow next[r]$
\ENDIF
\IF{$next[r] \lt |y|$}
\STATE $prev[next[r]]\leftarrow prev[r]$
\ENDIF
\ENDFOR
\RETURN{$LPF$}
\end{algorithmic}
Show that Algorithm Lpf correctly computes the table $\LPF$ and works
in linear time.
Looking accurately at the algorithm proves more than what it is designed for:
lengths in $\LPF$ form a permutation of lengths in $\LCP$.
Show both that values in the $\LPF$ table are permuted from values in the
$\LCP$ table and that the $\LCP$ table can be transformed into the
$\LPF$ table.
References
S. Chairungsee and M. Crochemore. Efficient computing of longest previous
reverse factors. In Y. Shoukourian, editor, Seventh International
Conference on Computer Science and Information Technologies (CSIT
2009), pages 27-30. The National Academy of Sciences of Armenia Publishers,
Yerevan, Armenia, 2009.
S. Chairungsee and M. Crochemore. Longest previous non-overlapping
factors table computation. In X. Gao, H. Du, and M. Han, editors,
Combinatorial Optimization and Applications - 11th International Conference,
COCOA 2017, Shanghai, China, December 16-18, 2017, Proceedings,
Part II, volume 10628 of Lecture Notes in Computer Science,
pages 483-491. Springer, 2017.
M. Crochemore and L. Ilie. Computing longest previous factors in linear
time and applications. Inf. Process. Lett., 106(2):75-80, 2008.
M. Crochemore, L. Ilie, C. S. Iliopoulos, M. Kubica, W. Rytter, and
T. Walen. Computing the longest previous factor. Eur. J. Comb.,
34(1):15-26, 2013.
M. Crochemore and G. Tischler. Computing longest previous nonoverlapping
factors. Inf. Process. Lett., 111(6):291-295, 2011.