Problem 24: Periodicity Test |
The detection of periodicities in words is an essential question to cope with when designing text searching or text compression methods. The goal is to test the periodicity of a word in a space-economical way.
A word $x$ is said to be periodic if its (smallest) period is no more than half its length, $\per(x) \leq |x|/2$. Equivalently, $x$ is periodic if it is of the form $u^kv$ for two words $u$ and $v$ and an integer $k$, $u$ non-empty, $v$ a proper prefix of $u$ and $k\geq 2$.
Checking the property is straightforward from the border or prefix tables of $x$. The goal here is to do it still in linear time but with only constant extra space (in addition to input). The idea is to use the function MaxSuffixPos of Problem 40 that gives the position and period of the maximal suffix of $x$.
The examples below illustrate variations of the period of a word according to its maximal suffix. The maximal suffix starts at position $\ms$ and is in the form $u^kv$ (here $k=1$), where $u=x[\ms\dd\ms+p-1]$, $p$ is its period and $v$ is a proper prefix of $u$.
First, $x=\sa{ababbaababbaab}$. The prefix $\sa{aba}$ of $x$ preceding the maximal suffix $uv$ is a suffix of $u$. Then $x$ is periodic with period $6$.
Second, $x=\sa{ababbaaabbaababbaa}$. The prefix of $x$ preceding $uv$ is longer than $u$. The period of $x$ is $11\gt |uv|$ and $x$ is not periodic.
Third, $x=\sa{baabbaababbaab}$. The prefix of $x$ preceding $uv$ is shorter than $u$ but not a suffix of it. The period of $x$ is $10\gt |x|-|v|$ and $x$ is not periodic.