CLR cover
Previous problem

Problem 75: Testing Overlaps in a Binary Word

Next problem
\( \def\sa#1{\tt{#1}} \def\dd{\dot{}\dot{}} \def\mv{\varepsilon} \)

The goal of the problem is to design an efficient algorithm to test whether a binary word contains an overlap factor. An overlap is a factor whose exponent is larger than $2$. A word contains an overlap if equivalently it has a factor of the form $auaua$, where $a$ is a letter and $u$ is a word.

The Thue-Morse word $\mu^\infty(\sa{a})$ is an example of an infinite overlap-free word. It is generated by the Thue-Morse morphism $\mu$ (defined by $\mu(\sa{a})=\sa{ab}$ and $\mu(\sa{b})=\sa{ba}$) that preserves overlap-freeness of words.

For a binary word $x$ we define its decomposition $uyv=x$, formally a triple $(u,y,v)$, in which $|u|$ is the smallest position on $x$ of a longest factor $y$ that belongs to $\{\sa{ab},\sa{ba}\}^+$. The decomposition is called a Restivo-Salemi (RS) factorisation if $u,v\in \{\mv,\sa{a},\sa{b},\sa{aa},\sa{bb}\}$. RS-factorisations are transformed into words in $\{\sa{ab},\sa{ba}\}^*$ by the partial functions $f$ or $g$ as follows ($c$ and $d$ are letters and the bar function exchanges letters $\sa{a}$ and $\sa{b}$): \[f(uyv)=\left\{\begin{array}{ll} y & \mbox{if } u=v=\mv \cr \bar{c}cy & \mbox{if } u=c \mbox{ or } cc \mbox{ and } v=\mv \cr yd\bar{d} & \mbox{if } u=\mv \mbox{ and } v=d \mbox{ or } dd \cr \bar{c}cyd\bar{d} & \mbox{if } u=c \mbox{ or } cc \mbox{ and } v=d \mbox{ or } dd \end{array}\right. \] \[g(uyv)=\left\{\begin{array}{ll} y & \mbox{if } u=v=\mv \cr \bar{c}cy & \mbox{if } u=c \mbox{ or } cc \mbox{ and } v=\mv \cr yd\bar{d} & \mbox{if } u=\mv \mbox{ or } c \mbox{ or } cc \mbox{ and } v=d \mbox{ or } dd \end{array}\right. \]

OverlapFree$(x \textrm{ non-empty binary word})$
    \begin{algorithmic}
 \WHILE{$|x|\gt 6$}
   \COMMENT{below $c$ and $d$ are letter variables}
   \STATE $uyv\leftarrow \mbox{RS-factorisation}(x)$
   \IF{$uyv$ is not an RS-factorisation}
     \RETURN{False}
   \ENDIF
   \IF{$[u=cc$ \AND $(ccc$ \OR $cc\bar{c}cc\bar{c}c \mbox{ prefix of } uy)]$ \OR
   $[v=dd$ \AND $(ddd$ \OR $d\bar{d}dd\bar{d}dd \mbox{ suffix of } uy)]$}
     \RETURN{False}
   \ENDIF
   \IF{$(u=c$ \OR $u=cc)$ \AND $(v=d$ \OR $v=dd)$ \AND
       $uyv \mbox{ is a square}$}
     \STATE $x\leftarrow \mu^{-1}(g(uyv))$
   \ELSE
     \STATE $x\leftarrow \mu^{-1}(f(uyv))$
   \ENDIF
 \ENDWHILE
 \RETURN{True}
    \end{algorithmic}

Show that Algorithm OverlapFree runs in linear time for testing if its binary input word is overlap free.

References

  • A. J. Kfoury. A linear-time algorithm to decide whether A binary word contains an overlap. ITA, 22(2):135-145, 1988.
  • A. Restivo and S. Salemi. Overlap-free words on two symbols. In M. Nivat and D. Perrin, editors, Automata on Infinite Words, École de Printemps d'Informatique Théorique, Le Mont Dore, May 14-18, 1984, volume 192 of Lecture Notes in Computer Science, pages 198-206. Springer, 1985.