With $\A=\{\sa{a},\sa{b},\sa{c}\}$ and $\varA=\{t,u,v,w\}$, $\sa{a}u\sa{b}v\sa{a}u\sa{b}$ and $\sa{a}w\sa{b}u\sa{a}w\sa{b}$ p-match by mapping $u$ to $w$ and $v$ to $u$. But $\sa{a}u\sa{b}v\sa{a}u\sa{b}$ and $\sa{a}v\sa{b}w\sa{a}z\sa{b}$ do not p-match, since $u$ should be mapped to both $v$ and $z$.
With $y=\sa{a}z\sa{b}u\sa{a}z\sa{b}z\sa{a}v\sa{b}w\sa{a}v\sa{b}$ the pattern $x=\sa{a}u\sa{b}v\sa{a}u\sa{b}$ occurs at position $0$ by mapping $u$ to $z$ and $v$ to $u$ and at position $8$ by mapping $u$ to $v$ and $v$ to $w$.
Problem 32: Parameterised Matching |
The problem considers a more general and more complex version of Problem 30 where some symbols are unknown and some others are fixed constant symbols. Searching texts for a fixed pattern is rather restrictive in some contexts, and parameterised string matching provides an efficient solution in several applications by introducing variables in patterns. The problem was initially stated to detect code duplicates in which, for example, identifiers are substituted for the original names.
Let $\A$ and $\varA$ be two disjoint alphabets: $\A$ is the alphabet of constant letters and $\varA$ is the alphabet of variable letters. We assume that no alphabet contains integers. A word over $\A\cup\varA$ is called a parameterised word or a p-word. Two p-words $x$ and $y$ are said to match or p-match if $x$ can be transformed into $y$ by applying a one-to-one mapping on symbols of $\varA$ occurring in $x$.
The parameterised pattern matching problem can be stated as follows: given a pattern $x\in (\A\cup\varA)^*$ and a text $y\in(\A\cup\varA)^*$ find all the p-occurrences of $x$ in $y$, that is, find all the positions $j$ on $y$, $0\le j \le |y|-|x|$, for which $x$ and $y[j \dd j+|x|-1]$ p-match.