Problem 112: Factoring with Border-Free Prefixes |
Searching for a border-free pattern in texts is done very efficiently without any sophisticated solution by BM algorithm (see Problem 33) because two occurrences of the pattern cannot overlap. When a pattern is not border free, its factorisation into border-free words may lead to efficient searching methods.
We say that a non-empty word $u$ is border free if none of its proper non-empty prefix is also a suffix, that is, $\Bord(u) = \varepsilon$, or equivalently, if its smallest period is its length, that is, $\per(u) = |u|$.
The aim of the problem is to show how a word factorises into its border-free prefixes.
Consider, for example, the word $\sa{aababaaabaababaa}$. Its set of border-free prefixes is $\{\sa{a},\sa{aab},\sa{aabab}\}$ and its factorisation on the set is
The factorisation of $x$ can be represented by the list of its factor lengths. On the preceding example it is $(5,1,3,5,1,1)$ and the list of factors is $x[0\dd 4]$, $x[5\dd 5]$, $x[6\dd 8]$, $x[9\dd 13]$, $x[14\dd 14]$, $x[15\dd 15]$.