Problem 30: Words with Singleton Variables |
The problem shows the flexibility of the border table notion and of the fast algorithm computing this table. Having such a table is a valuable element to design efficient pattern matching, searching here for patterns with a variable.
We consider words over the alphabet $\Al=\{\sa{a},\sa{b},\dots\}$ in which letters are considered as singleton variables, that is, each letter represents a distinct unknown letter of the alphabet.
Two words $u$ and $v$ are said to be equivalent, denoted as $u \equiv v$, if there is a bijective letter-to-letter morphism $h:\alph(u)^*\rightarrow \alph(v)^*$ for which $h(u)=v$. For example, $\sa{aacbaba}\equiv\sa{bbdabab}$ through the morphism $h$ on $\Al^*$ to itself defined by $h(\sa{a})=\sa{b}$, $h(\sa{b})=\sa{a}$, $h(\sa{c})=\sa{d}$ and $h(\sa{d})=\sa{c}$. When $\alph(u)=\alph(v)$ and $u \equiv v$ the words become equal after permuting thei r letters.
The pattern-matching problem is naturally redefined as follows: given a pattern word $x$ and a text $y$, check if a factor $z$ of $y$ is equivalent to $x$: $z \equiv x$. For example, the pattern $x=\sa{aacbaba}$ occurs in $y=\sa{babbdababbacb}$ because its factor $z=\sa{bbdabab}$ is equivalent to $x$.