CLR cover

Problem 4: Oldenburger-Kolakoski Sequence

\( \def\K{\mathbf{K}} \def\blocks{\mathit{blocks}} \def\sa#1{\tt{#1}} \)

The Oldenburger-Kolakoski sequence is an autodescriptive and self-generating infinite sequence of symbols $\{\sa{1},\sa{2}\}$. More technically, it is its own run-length encoding. The sequence, denoted here by $\K$, is one of the strangest sequences. Despite the simplicity of its generation it appears to have a random behaviour.

By a block of letters in a word we mean a run of letters, that is, a maximal factor consisting of occurrences of the same letter. The operation $\blocks(S)$ replaces each block of a word $S$ by its length. For example, $$\blocks(\sa{2}\,\sa{1}\,\sa{1}\,\sa{1}\,\sa{2}\,\sa{2}\,\sa{1}\,\sa{2}\,\sa{2}\,\sa{2}) =\sa{1}\,\sa{3}\,\sa{2}\,\sa{1}\,\sa{3}.$$

The sequence $\K$ is the unique infinite sequence over the alphabet $\{\sa{1},\sa{2}\}$ that starts with $\sa{2}$ and satisfies $\blocks(\K)=\K$.

Remark. Usually the sequence is defined to start with $\sa{1}$, but it is more convenient here that it starts with $\sa{2}$. In fact, these are the same sequences after removing the first occurrence of $\sa{1}$.

Show that we can generate online the first $n$ symbols of the sequence $\K$ in $O(n)$ time and $O(\log n)$ space.
Produce $\K$ by iterating $h=\blocks^{-1}$ from $\sa{2}$.

The very small space used for the generation of $\K$ is the most interesting element of the question.

References

  • S. Brlek, D. Jamet, and G. Paquin. Smooth words on 2-letter alphabets having same parity. Theor. Comput. Sci., 393(1-3):166-181, 2008.
  • W. Kolakoski. Problem 5304. American Mathematical Monthly, 72(674), 1965.
  • J. Nilsson. Letter frequencies in the Kolakoski sequence. Acta Physica Polonica A, 126(2):549-552, 2014.
  • R. Oldenburger. Exponent trajectories in symbolic dynamics. Transactions of the American Mathematical Society, 46:453-466, 1939.