CLR cover

Problem 134: Compressed pattern matching in Thue-Morse words

\( \def\EVEN{\mathit{EVEN}} \def\first{\mathit{first}} \def\last{\mathit{last}} \)

The Thue-Morse binary word on the alphabet $\{\sa{0}, \sa{1}\}$ is produced by iterating infinitely form 0 the Thue-Morse morphism $\mu$ from $\{\sa{0}, \sa{1}\}^*$ to itself defined by $$\mu(\sa{0}) = \sa{01}, \ \mu(\sa{1}) = \sa{10}. $$ Eventually the iteration produces the infinite Thue-Morse word: $$\mathbf{t}\,=\, \sa{01101001100101101001011001101001}\cdots. $$

For a pattern $x$ of even length, let $\mu^{-1}(x)$ be the word $z$ for which $\mu(z)=x$ if it exists and $nil$ otherwise. We introduce also the set $\EVEN=\{\sa{0110}, \sa{1010}, \sa{0101}, \sa{1001}\}$.

Let us denote by $\first_4(x)$ the prefix of $x$ of length 4, if there is any. $\first(x)$ and $\last(x)$ the first and last letters of $x$, respectively, and $\bar{s}$ is the negation of a bit $s$ ($\bar{\sa{0}}=\sa{1}$ and $\bar{\sa{1}}=\sa{0}$).

The following algorithm tests in linear time and in a very simple way if a finite binary pattern $x$ is a factor of $\mathbf{t}$.

Test$(x \textrm{ non-empty word})$
    \begin{algorithmic}
\IF{$x=nil$}
  \RETURN{false}
\ENDIF
\IF{$|x|\lt 4$}
  \RETURN{$x\ne 111$ and $x\neq 000$}
\ENDIF
\IF{$\mathit{first}_4(x)\not\in EVEN$}
  \STATE{$x\leftarrow \overline{\mathit{first}(x)}\cdot x$}
\ENDIF
\IF{$|x|$ is odd}
  \STATE{$x\leftarrow x\cdot \overline{\mathit{last}(x)}$}
\ENDIF
\RETURN{Test$(\mu^{-1}(x))$} 
    \end{algorithmic}

Show why this algorithm correctly tests in linear time if $x$ is a factor of the infinite Thue-Morse word $\mathbf{t}$?

References