Problem 107: Compression Ratio of Greedy Superstrings |
The problem considers Algorithm GreedySCS (presented under different forms in Problem 61) that computes a superstring $\Greedy(X)$ for a set $X$ of words of total length $n$. The superstring can be viewed as a compressed text representing all words in $X$ and from this point of view it is interesting to quantify the gain of representing $X$ by a superstring.
Let $\mathit{GrCompr}(X) = n-|\Greedy(X)|$ denote the compression achieved by the greedy algorithm. Similarly define $\mathit{OptCompr}(X)=n-|\mathit{OPT}(X)|$ where $\mathit{OPT}$ is an optimal (unknown) superstring for $X$.
The fraction $\frac{\mathit{GrCompr}(X)}{\mathit{OptCompr}(X)}$ is called the compression ratio of Algorithm GreedySCS.