CLR cover

Problem 68: Searching an Infinite Word

\( \def\sa#1{\tt{#1}} \def\dd{\dot{}\dot{}} \)

The goal is to design an algorithm for matching a pattern in an infinite word. Since there is no answer for general infinite words, we restrict the question to a pure morphic word. It is an infinite word obtained by iterating a morphism $\theta$ from $A^+$ to itself, where $A=\{\sa{a}, \sa{b}, \dots\}$ is a finite alphabet. To do so, we assume that $\theta$ is prolongable over the letter $a$, that is, $\theta(\sa{a}) = \sa{a}u$ for $u\in A^+$. Then $\Theta=\theta^\infty(\sa{a})$ exists and is $\sa{a}u\theta(u)\theta^2(u)\cdots$. The infinite word $\Theta$ is a fixed point of $\theta$, i.e. $\theta(\Theta)=\Theta$.

To avoid trivial cases, like that of the morphism $\eta$ defined by $\eta(\sa{a})=\sa{ab}$, $\eta(\sa{b})=\sa{c}$, $\eta(\sa{c})=\sa{b}$ and $\eta(\sa{d})=\sa{d}$ where the letter $\sa{d}$ is useless and the letter $\sa{a}$ appear only once in $\Theta$, we assume that $\theta$ is irreducible. It means that any letter is accessible from any letter: for any distinct letter $c,d\in A$ the letter $d$ appears in $\theta^k(c)$ for some integer $k$.

The Thue-Morse morphism $\mu$ and Fibonacci morphism $\phi$ (see Remarkable Words) are both irreducible morphisms.

Show how to test if a morphism is irreducible.

Design an algorithm to compute the set of factors of length $m$ occurring in the infinite word $\Theta=\theta^\infty(\sa{a})$, where $\theta$ is an irreducible morphism.

When the set of length-$m$ factors of $\Theta$ is represented by a (deterministic) trie, testing if a pattern of length $m$ appears in $\Theta$ becomes a mere top-down traversal of the trie.

References

  • F. Durand and J. Leroy. The constant of recognizability is computable for primitive morphisms. CoRR, abs/1610.05577, 2016.