CLR cover

Problem 122: Unavoidable Sets via Lyndon Words

Lyndon words often surprisingly appear in seemingly unrelated problems. We show that they appear as basic components in certain decompositions, which are the main tool in the problem.

The problem deals with unavoidable sets of words. A set $X \subseteq \{\sa{0},\sa{1}\}^*$ is unavoidable if any infinite binary word has a factor in $X$.

To start with, let $\mathcal{N}_k$ be the set of necklaces of length $k$, $k\gt 0$. A necklace is the lexicographically smallest word in its conjugacy class. Each necklace is a power of a Lyndon word.

Observation 1. If $X\subseteq \{\sa{0},\sa{1}\}^k$ is unavoidable then $|X| \geq |\mathcal{N}_k|$.

Indeed, for each $w\in\{\sa{0},\sa{1}\}^k$, length-$k$ factors of $w^{\infty}$ are all conjugates of $w$ and at least one of them should be in the unavoidable set $X$. The first question provides a more flexible condition to be unavoidable.

Show that if for each necklace $y\in\{\sa{0},\sa{1}\}^*$, $|y|\geq 2k$, the word $y^2$ contains a word in $X\subseteq \{\sa{0},\sa{1}\}^k$ then $X$ is unavoidable.

An ingenious construction of an unavoidable set is based on the notion of pre-primes. A pre-prime $w$ is a prefix of a necklace.

Observation 2. A pre-prime is a prefix of a power of a Lyndon word.

The observation justifies the special decomposition of a pre-prime $w$ as $u^ev$, where $u$ is a Lyndon word, $e\geq 1$, $|v|\lt |u|$ and $u$ is the shortest possible. Head and tail of $w$ are defined by $\head(w)=u^e$ and $\tail(w)=v$.

Note that $v$ is not necessarily a proper prefix of $u$, that is, $|u|$ is not usually the period of $w$. It is clear that such factorisation of a pre-prime always exists. The factorisation is the key-concept in this problem.

Show that there is an unavoidable set $X_k\subseteq \{0,1\}^k$ of size $|\mathcal{N}_k|$. Consequently, $|\mathcal{N}_k|$ is the smallest size of such a set.
Consider words of the form $\tail(w)\cdot\head(w)$.


  • J.-M. Champarnaud, G. Hansel, and D. Perrin. Unavoidable sets of constant length. IJAC, 14(2):241-251, 2004.