CLR cover

Problem 127: String attractors

\( \newcommand{\Att}{\mathrm{Att}} \)

The notion of the string attractor provides a unifying framework for known dictionary-based compressors. A string attractor on a word $x$ is a subset $\Att$ of positions on $x$ for which each factor $u$ of $x$ has an occurrence covering at least one position in $\Att$. That is, there are positions $i$, $j$ and $t$ satisfying $u=x[i\dd j]$, $t\in\Att$ and $i\leq t\leq j$. We concentrate on attractors on two families of words.

Thue-Morse words

Thue-Morse words are defined as follows: \[ \tau_0 = \sa{a}, \tau_{k+1} = \tau_k\overline{\tau_k}, \mbox{ for } k \geq 0, \] where the bar morphism is defined by $\overline{\sa{a}}\,=\sa{b}$ and $\overline{\sa{b}}\,=\sa{a}$. Note the length of the $k$th Thue-Morse word is $|\tau_k|=2^k$ and $\overline{\tau_{k+1}} = \overline{\tau_k}\tau_k$.

Construct an attractor of size at most $4$. for Thue-Morse words $\tau_k$, $k\geq 4$.

Attractors on Fibonacci words

The Fibonacci word $\fib_k$ is $\phi^k(\sa{a})$, $k\geq 0$, where the morphism $\phi$ is defined by $\phi(\sa{a})=\sa{ab}$ and $\phi(\sa{b})=\sa{a}$. The length of $\fib_k$ is the Fibonacci number $F_{k+2}$ ($F_0=0$, $F_1=1$, $F_2=1$, $F_3=2$, $\dots$).

Find two essentially distinct attractors of size $2$ for each Fibonacci word $\fib_k$, $k\geq 2$.

References