|
Problem 101: Online Huffman Coding
|
|
\(
\def\weight{\mathit{weight}}
\)
The two main drawbacks of the static Huffman compression method arethat first,
if the frequencies of letters in the source text are not known a
priori, the source text has to be scanned twice and second, the Huffman coding
tree must be included in the compressed file.
The problem shows a solution that avoids these drawbacks.
The solution is based on a dynamic method in which the coding tree is updated
each time a symbol is read from the source text.
The current Huffman tree relates to the part of the text that has
already been processed and evolves
exactly in the same way during the decoding process.
Design a Huffman compression method that reads only once the source text and
does not need to store the coding tree in the compressed text.
Huffman trees are characterised by the siblings property.
Siblings property
Let $T$ be a Huffman tree with $n$ leaves (a complete binary weighted tree in
which all leaves have positive weights).
Nodes of $T$ can be arranged in a
list $(t_0,t_1,\ldots,t_{2n-2})$ that satisfies
-
Nodes are in decreasing order of their weights:
$\weight(t_0)\geq\weight(t_1)\geq\cdots\geq\weight(t_{2n-2})$.
-
For any $i$, $0\leq i\leq n-2$, the consecutive nodes $t_{2i}$ and
$t_{2i+1}$ are siblings (they have the same parent).
References
G. V. Cormack and R. N. Horspool. Algorithms for adaptive Huffman
codes. Inf. Process. Lett., 18(3):159-165, 1984.
N. Faller. An adaptive system for data compression. In Record of the
7th Asilomar Conference on Circuits, Systems, and Computers, pages
593-597, 1973.
A. Fruchtman, Y. Gross, S. T. Klein, and D. Shapira. Weighted adaptive
Huffman coding. In A. Bilgin, M. W. Marcellin, J. Serra-Sagristà,
and J. A. Storer, editors, Data Compression Conference, DCC 2020,
Snowbird, UT, USA, March 24-27, 2020, pages 368-385. IEEE, 2020.
http://arxiv.org/abs/2005.08232v1.
R. G. Gallager. Variations on a theme by Huffman. IEEE Trans. Information
Theory, 24(6):668-674, 1978.
D. E. Knuth. Dynamic Huffman coding. J. Algorithms, 6(2):163-180,
1985.
J. S. Vitter. Design and analysis of dynamic Huffman codes. J. ACM,
34(4):825-845, 1987.