CLR cover

Problem 78: Almost Square-Free Words

\( \newcommand{\Occur}{\mathit{Occ}}% \def\sa#1{\tt{#1}} \def\dd{\dot{}\dot{}} \)

Testing if a word that contains no short squares is square free can be done in a more efficient and simpler way than with the methods treating ordinary words. This is the object of the problem.

A word $w$ is said to be almost square free if it does not contain any square factor of length smaller than $|w|/2$. Such words have a useful property stated in the observation, in which $\Occur(z,w)$ denotes the set of starting positions of occurrences of $z$ in the word $w$.

Observation 1. If $z$ is a factor of length $|w|/8$ of an almost square-free word $w$, then $z$ is non-periodic (its smallest period is larger than $|z|/2$), $|\Occur(z,w)|\lt 8$ and $\Occur(z,w)$ can be computed in linear time and constant space.

Under the hypothesis of the observation, the computation of $\Occur(z,w)$ is realised, for example, by Algorithm NaiveSearch, a naive version of Algorithm KMP.

NaiveSearch$(z,w \textrm{ non-empty words})$
    \begin{algorithmic}
  \STATE $(i,j)\leftarrow (0,0)$
  \STATE $Occur(z,w)\leftarrow \emptyset$
  \WHILE{$j \leq |w|-|z|$}
    \WHILE{$i\lt |z| \mbox{ and } z[i]=w[j+i]$}
      \STATE $i\leftarrow i+1$
    \ENDWHILE
    \IF{$i=|z|$}
      \STATE $Occur(z,w)\leftarrow Occur(z,w)\cup\{j\}$
    \ENDIF
    \STATE $(j,i)\leftarrow (j+\max\{1,\lfloor i/2 \rfloor \},0)$
  \ENDWHILE
  \RETURN{$Occur(z,w)$}
    \end{algorithmic}

Design an algorithm that checks in linear time with constant space if an almost square-free word $w$ is square free, assuming for simplicity that $|w|=2^k$, $k\geq3$.
Use a factorisation of $w$ into short factors, and apply Algorithm NaiveSearch and Observation 1.

References

  • M. G. Main and R. J. Lorentz. Linear time recognition of square-free strings. In A. Apostolico and Z. Galil, editors, Combinatorial Algorithms on Words, volume 12 of Series F: Computer and System Sciences, pages 271-278. Springer, 1985.