CLR cover

Problem 86: The Number of Runs

A run is a maximal periodicity occurring in a word. Formally, a run in $x$ is an interval $[i\dd j]$ of positions on $x$ whose associated factor $x[i\dd j]$ is periodic (i.e., its smallest period $p$ satisfies $2p \leq |x[i\dd j]|=(j-i+1)$) and the periodicity does not extend to the right nor to the left (i.e., $x[i-1\dd j]$ and $x[i\dd j+1]$ have larger periods when defined).

We consider an ordering $\lt$ on the word alphabet and the corresponding lexicographic ordering denoted $\lt$ as well. We also consider the lexicographic ordering $\widetilde{\lt}$, called the reverse ordering, inferred by the inverse alphabet ordering $\lt^{-1}$. Each run $[i\dd j]$ is associated with its greatest suffix according to one of the two orderings as follows. Let $p=\per(x[i\dd j])$. If $j+1\lt n$ and $x[j+1] \gt x[j-p+1]$ we assign to the run the position $k$ for which $x[k\dd j]$ is the greatest proper suffix of $x[i\dd j]$ according to $\lt$. Otherwise, $k$ is the starting position of the greatest proper suffix of $x[i\dd j]$ according to $\widetilde{\lt}$. The position $k$ assigned this way to a run is called its special position. These positions are intimately linked to Lyndon words (defined in Section Ordering), subject of the first question.

Show that, if the special position $k$ of a run of period $p$ is defined according to $\widetilde{\lt}$ (resp. $\lt$), $x[k\dd k+p-1]$ is the longest Lyndon factor of $x$ starting at position $k$ according to $\lt$ (resp. $\widetilde{\lt}$).
The special position $k$ of a run $[i\dd j]$ of period $p$ satisfies $k\leq i+p$; see Problem 40.

Show two distinct runs have no special position in common and deduce that the number of runs in a word is smaller than its length.

References

  • H. Bannai, M. Giraud, K. Kusano, W. Matsubara, A. Shinohara, and J. Simpson. The number of runs in a ternary word. In J. Holub and J. Zdárek, editors, Proceedings of the Prague Stringology Conference 2010, Prague, Czech Republic, August 30 - September 1, 2010, pages 178-181. Prague Stringology Club, Department of Theoretical Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2010.
  • [26] H. Bannai, T. I, S. Inenaga, Y. Nakashima, M. Takeda, and K. Tsuruta. The "runs" theorem. SIAM J. Comput., 46(5):1501-1514, 2017.
  • M. Crochemore and L. Ilie. Maximal repetitions in strings. J. Comput. Syst. Sci., 74(5):796-807, 2008.
  • M. Crochemore, L. Ilie, and L. Tinta. The "runs" conjecture. Theor. Comput. Sci., 412(27):2931-2941, 2011.
  • M. Crochemore, C. S. Iliopoulos, M. Kubica, J. Radoszewski, W. Rytter, and T. Walen. The maximal number of cubic runs in a word. J. Comput. Syst. Sci., 78(6):1828-1836, 2012.
  • M. Crochemore and R. Mercas. On the density of Lyndon roots in factors. Theor. Comput. Sci., 656:234-240, 2016. In honor of Bill Smyth.
  • A. Deza and F. Franek. A d-step approach to the maximum number of distinct squares and runs in strings. Discrete Applied Mathematics, 163(3):268-274, 2014.
  • J. Fischer, Š. Holub, T. I, and M. Lewenstein. Beyond the runs theorem. In 22nd SPIRE, volume 9309 of LNCS, pages 272-281, 2015.
  • C. S. Iliopoulos, D. Moore, and W. F. Smyth. A characterization of the squares in a Fibonacci string. Theor. Comput. Sci., 172(1-2):281-291, 1997.
  • R. M. Kolpakov and G. Kucherov. Finding maximal repetitions in a word in linear time. In 40th Annual Symposium on Foundations of Computer Science, FOCS '99, 17-18 October, 1999, New York, NY, USA, pages 596-604. IEEE Computer Society, 1999.
  • S. J. Puglisi, J. Simpson, and W. F. Smyth. How many runs can a string contain? Theor. Comput. Sci., 401(1-3):165-171, 2008.
  • W. Rytter. The number of runs in a string. Information and Computation, 205(9):1459-1469, 2007.