#
Problem 73: Tight Bounds on Occurrences of Powers

\(
\def\sa#1{\tt{#1}}
\def\dd{\dot{}\dot{}}
\)

The problem considers lower bounds on the number of occurrences of integer
powers occurring in a word.

An integer power is a word in the form $u^k$ for some non-empty word $u$ and
some integer $k\gt 1$.
The size of the set of square factors
(Problem 72) and the number of runs
(Problem 86) in a word
are known to be linear in the length of the word.
This contrasts with the number of occurrences of integer powers that does
not satisfy this property.

To avoid trivial lower bounds we consider primitively rooted integer powers,
that is, powers of the form $u^k$, where $u$ is a primitive word (i.e. not
itself a power).

To start with, let us consider the word $\sa{a}^n$.
Though it contains a quadratic number occurrences of squares,
it contains exactly $n-1$ occurrences of primitively rooted
squares (underlined below).

But if a few occurrences of $\sa{a}$ are changed to $\sa{b}$ in the word (see
below) the number of primitively rooted squares increases, although some
occurrences of short squares disappear (when $n$ is large enough).

\[\left\{\begin{array}{rcll}
x_0 & = & \sa{a}^5\sa{b}, \cr
x_{i+1} & = & (x_i)^3\sa{b}, & \mbox{ for } i\geq 0.
\end{array}\right.
\]

Show that $x_i$ contains asymptotically $\Omega(|x_i|\log |x_i|)$ occurrences
of primitively rooted squares.

In fact, the property on squares also holds for powers of any integer
exponent $k\geq 2$.

For a given integer $k\geq 2$, define a sequence of words $y_i$, for $i\geq 0$,
containing asymptotically $\Omega(|y_i|\log |y_i|)$ occurrences of
primitively rooted $k$th powers.

Notice the bound is tight due to the upper bound on square prefixes in
Problem 72.

## References

M. Crochemore. An optimal algorithm for computing the repetitions in
a word. *Inf. Process. Lett.*, **12**(5):244-250, 1981.
M. Crochemore, S. Z. Fazekas, C. S. Iliopoulos, and I. Jayasekera. Number
of occurrences of powers in strings. *Int. J. Found. Comput. Sci.*,
**21**(4):535-547, 2010.
D. Gusfield and J. Stoye. Linear time algorithms for finding and representing
all the tandem repeats in a string. *J. Comput. Syst. Sci.*,
**69**(4):525-546, 2004.