CLR cover

Additional Problem 2: Attractors on Thue-Morse Words

\( \newcommand{\Att}{\mathrm{Att}} \)

The notion of an attractor provides a unifying framework for known dictionary-based compressors. A string attractor for a word $x$ is a subset $\Att$ of positions on $x$ for which each factor $u$ of $x$ has an occurrence covering at least one position in $\Att$. That is, there are positions $i$, $j$ and $t$ satisfying $u=x[i\dd j]$, $t\in\Att$ and $i\leq t\leq j$.

In this problem we concentrate on attractors on Thue-Morse words. Thue-Morse words are defined by the following recurrence: \[\cases{ \tau_0 = \sa{a}, \cr \tau_{k+1} = \tau_k\overline{\tau_k}, & for $k \geq 0$, }\] where the bar morphism is defined by $\overline{\sa{a}}\,=\sa{b}$ and $\overline{\sa{b}}\,=\sa{a}$. Note the length of the $k$th Thue-Morse word is $|\tau_k|=2^k$ and $\overline{\tau_{k+1}} = \overline{\tau_k}\tau_k$.

Show that each Thue-Morse word $\tau_k$, $k\geq 4$, has an attractor of size at most $4$.