Problem 133: Huffman codes vs entropy

We consider $n$ items with positive probabilities $p_1,p_2,\ldots,p_n$ satisfying $\sum_i\, p_i=1$, and let $\mathbf{p}=(p_1,p_2,\ldots,p_n)$. Huffman's algorithm constructs a full binary tree (each non-leaf node has exactly two children) with the items assigned to its leaves. Let $l_i$ be the depth of the leaf holding $p_i$, i.e., the number of edges on the path from the root to that leaf. The average code length of the Huffman code is $\Huf(\mathbf{p})=\sum_i\, p_i\cdot l_i$.

An important concept in information theory is entropy. The sequence $\mathbf{p}$ is treated as a source of information, and its entropy is $$\Entropy(\mathbf{p})\,=\, -\sum_i p_i\cdot \log_2 p_i.$$

For example, the Huffman tree corresponding to the sequence $\mathbf{p}=(0.1,0.1,0.3,0.5)$ first merges the two weights $0.1$, then merges the resulting $0.2$ with $0.3$, and finally merges that $0.5$ with the remaining $0.5$, so the four leaves sit at depths $3,3,2,1$. Then $\Huf(\mathbf{p})$ is
$$0.1\cdot 3 + 0.1\cdot 3 + 0.3\cdot 2 + 0.5\cdot 1 = 1.7$$
and
$$\Entropy(\mathbf{p}) = -\left(0.1\log_2 0.1 + 0.1\log_2 0.1 + 0.3\log_2 0.3 + 0.5\log_2 0.5\right) \approx 1.68548.$$
We have $1.68548 \leq 1.7 \leq 2.68548$, so indeed $\Entropy(\mathbf{p}) \leq \Huf(\mathbf{p}) \leq \Entropy(\mathbf{p})+1$.
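To make the example reproducible, here is a minimal Python sketch of the greedy construction; the helper names (`huffman_lengths`, `huf`, `entropy`) are our own and not part of the problem. It recomputes the depths $3,3,2,1$, the average length $1.7$, and the entropy $\approx 1.68548$.

```python
import heapq
import math

def huffman_lengths(p):
    """Code length l_i of each item under Huffman's greedy construction.

    Repeatedly merge the two smallest weights; every merge pushes all
    leaves of the two merged subtrees one level deeper in the tree.
    """
    # Heap entries: (weight, unique tie-breaker, indices of leaves in subtree).
    heap = [(w, i, [i]) for i, w in enumerate(p)]
    heapq.heapify(heap)
    lengths = [0] * len(p)
    while len(heap) > 1:
        w1, _, leaves1 = heapq.heappop(heap)
        w2, tb, leaves2 = heapq.heappop(heap)
        for i in leaves1 + leaves2:  # one more edge from the root to each leaf
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, tb, leaves1 + leaves2))
    return lengths

def huf(p):
    """Average code length Huf(p) = sum_i p_i * l_i."""
    return sum(pi * li for pi, li in zip(p, huffman_lengths(p)))

def entropy(p):
    """Entropy(p) = -sum_i p_i * log2(p_i)."""
    return -sum(pi * math.log2(pi) for pi in p)

p = [0.1, 0.1, 0.3, 0.5]
print(huffman_lengths(p))  # [3, 3, 2, 1]
print(huf(p))              # 1.7 (up to floating-point rounding)
print(entropy(p))          # approximately 1.68548
assert entropy(p) <= huf(p) <= entropy(p) + 1
```

The tie-breaker in each heap entry only keeps tuple comparison away from the index lists when two subtrees have equal weight; which of the tied subtrees is merged first can change the tree's shape but not the average length.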