The superstring $\sa{fbdiachbgegeakhiacbd}$ is produced by GreedySCS from the set $\{\sa{egeakh},\sa{fbdiac},\sa{hbgege},\sa{iacbd},\sa{bdiach}\}$.
Problem 61: Greedy Superstring |
A superstring of a set of words can be used to store the set in a compact way. Formally, a common superstring of a set $X$ of words is a word $z$ in which all elements of $X$ occur as factors, that is, $X\subseteq\Fact(z)$. The shortest common superstring of $X$ is denoted by $\SCS(X)$.
The greedy paradigm leads to a simple algorithm that produces a fairly good approximation of $\SCS(X)$. The goal of the problem is to show an efficient implementation of the method.
For two words $u$ and $v$, $\Overlap(u,v)$ is the longest suffix of $u$ that is a prefix of $v$. If $w=\Overlap(u,v)$, $u=u'w$ and $v=wv'$, $u\otimes v$ is defined as the word $u'wv'$. Note that $\SCS(u,v)$ is either $u\otimes v$ or $v\otimes u$. Also note that a word in $X$ factor of another word in $X$ can be discarded without changing $\SCS(X)$. Then $X$ is supposed to be factor free.
\begin{algorithmic} \IF{$|X|=1$} \RETURN{$x \in X$} \ELSE \STATE let $x,y\in X, x\neq y$, with $|Overlap(x,y)|$ maximal \RETURN{GreedySCS$(X \setminus \{x,y\} \cup \{x\otimes y\}$} \ENDIF \end{algorithmic}