Truncated Huffman Code
Constructing a Huffman code requires a large number of computations: for N source symbols, N-2 source reductions (sorting operations) and N-2 code assignments must be made. It is therefore sometimes worthwhile to sacrifice some coding efficiency in order to reduce the number of computations.
Truncated Huffman coding is a variation of standard Huffman coding. In truncated Huffman coding, the K most probable of the J source symbols, together with a hypothetical symbol whose probability equals the sum of the probabilities of the J-K less probable symbols, are coded with a standard Huffman code.
The J-K less probable source symbols are assigned the Huffman code of that hypothetical symbol concatenated with a natural (fixed-length) binary code of length ⌈log2 (J-K)⌉. For example, if J = 10 and K = 4, then J-K = 6 and we append a binary code of length 3 (since 6 < 2^3, i.e., ⌈log2 6⌉ = 3).
The constant K < J may be chosen arbitrarily, and if K = J, truncated Huffman coding is equivalent to standard Huffman coding. Because the Huffman code is built for only K+1 symbols, truncated Huffman coding makes (K+1)-2 = K-1 source reductions and K-1 code assignments, thus taking less time, at the cost of a greater average code length and lower efficiency.
Truncated Huffman Algorithm:
A truncated Huffman code is generated by the following steps (a code sketch is given after the list):
1. Arranging the source symbols so that their probabilities are monotonically decreasing.
2. Dividing the J source symbols into two groups: the first group consists of the K most probable symbols and the second group consists of the remaining J-K symbols.
3. Adding a hypothetical symbol to the first group. Its probability equals the sum of the probabilities of the J-K less probable source symbols.
4. Arranging the new set of K+1 source symbols so that their probabilities are monotonically decreasing.
5. Encoding the new set of symbols with the standard Huffman code.
6. Assigning to each of the J-K less probable source symbols the Huffman code of the hypothetical symbol concatenated with a natural binary code of length ⌈log2 (J-K)⌉.
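The six steps above can be condensed into a short Python sketch. This is an illustration written for these notes, not code from the source; the function names huffman_code and truncated_huffman_code are my own, and heap ties are broken arbitrarily, so individual codewords may differ from the tables that follow, although the average code length does not.

    # Illustrative sketch of truncated Huffman coding (not from the original notes).
    import heapq
    import math
    from itertools import count

    def huffman_code(probs):
        # Standard Huffman code for a dict {symbol: probability}.
        tie = count()  # tie-breaker so the heap never compares the code dicts
        heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, c1 = heapq.heappop(heap)  # two least probable nodes
            p2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + w for s, w in c1.items()}
            merged.update({s: "1" + w for s, w in c2.items()})
            heapq.heappush(heap, (p1 + p2, next(tie), merged))
        return heap[0][2]

    def truncated_huffman_code(probs, k):
        # Truncated Huffman code with parameter K, following steps 1-6 above.
        ordered = sorted(probs, key=probs.get, reverse=True)   # step 1
        top, rest = ordered[:k], ordered[k:]                   # step 2
        if not rest:                                           # K = J: plain Huffman
            return huffman_code(dict(probs))
        hypothetical = object()                                # step 3
        reduced = {s: probs[s] for s in top}
        reduced[hypothetical] = sum(probs[s] for s in rest)
        code = huffman_code(reduced)                           # steps 4-5
        prefix = code.pop(hypothetical)
        width = max(1, math.ceil(math.log2(len(rest))))        # ⌈log2 (J-K)⌉ bits
        for i, s in enumerate(rest):                           # step 6
            code[s] = prefix + format(i, f"0{width}b")
        return code

Note that one sort and one Huffman pass over K+1 symbols is all the work done; each of the remaining J-K codewords costs only a fixed-length binary counter.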
Example 1:
The source of information A generates the symbols shown below. Encoding the source
symbols with the binary encoder and the truncated Huffman encoder gives:
Source Symbol    Pi           Binary Code
A0               0.3          0000
A1               0.2          0001
A2               0.15         0010
A3               0.1          0011
A4               0.08         0100
A5               0.06         0101
A6               0.05         0110
A7               0.04         0111
A8               0.02         1000
                 H = 2.778    Lavg = 4
The entropy of the source is

H = -Σ Pi log2 Pi = 2.778 bits/symbol

Since we have 9 symbols (9 < 16 = 2^4), we need at least 4 bits to represent each symbol in binary (fixed-length code). Hence the average length of the binary code is Lavg = 4 bits/symbol. Thus the efficiency of the binary code is

η = H / Lavg = 2.778 / 4 ≈ 69.45%
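These figures can be reproduced with a few lines of Python (an illustrative check, not part of the original notes):

    import math

    p = [0.3, 0.2, 0.15, 0.1, 0.08, 0.06, 0.05, 0.04, 0.02]
    H = -sum(pi * math.log2(pi) for pi in p)   # entropy in bits/symbol
    print(f"H = {H:.3f}")                      # H = 2.778
    print(f"efficiency = {H / 4:.2%}")         # 4-bit fixed code: about 69.45%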
Taking the three most probable symbols, we have K = 3 of J = 9 symbols. Let us introduce a hypothetical symbol Ax whose probability is 0.35, equal to the sum of the probabilities of the six least probable symbols (A3 to A8). The new set of symbols is shown in the table below.
Source Symbol    Pi      Truncated Huffman
Ax               0.35    1
A0               0.3     01
A1               0.2     000
A2               0.15    001
Then the resultant code is:

Source Symbol    Pi           Truncated Huffman
A0               0.3          01
A1               0.2          000
A2               0.15         001
A3               0.1          1 000
A4               0.08         1 001
A5               0.06         1 010
A6               0.05         1 011
A7               0.04         1 100
A8               0.02         1 101
                 H = 2.778    Lavg = 3.05
The six less probable source symbols (A3 to A8) are assigned the Huffman code of the hypothetical symbol Ax (1) concatenated with a natural binary code of length ⌈log2 6⌉ = 3.
The average length of the truncated Huffman code is Lavg = 3.05 bits/symbol. Thus the efficiency of the truncated Huffman code is

η = H / Lavg = 2.778 / 3.05 ≈ 91.1%
This example demonstrates that the efficiency of the truncated Huffman encoder is much
higher than that of the binary encoder.
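As a cross-check, the truncated_huffman_code sketch given after the algorithm reproduces this average length for K = 3 (assuming that sketch is in scope; the individual codewords may differ because of tie-breaking, but the average length does not):

    probs = {f"A{i}": p for i, p in
             enumerate([0.3, 0.2, 0.15, 0.1, 0.08, 0.06, 0.05, 0.04, 0.02])}
    code = truncated_huffman_code(probs, k=3)
    L_avg = sum(probs[s] * len(code[s]) for s in probs)
    print(f"Lavg = {L_avg:.2f}")   # Lavg = 3.05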
Applying standard Huffman coding to the same source, we get the following codewords:
Source Symbol    Pi           Huffman Code
A0               0.3          00
A1               0.2          01
A2               0.15         100
A3               0.1          110
A4               0.08         1010
A5               0.06         1011
A6               0.05         1110
A7               0.04         11110
A8               0.02         11111
                 H = 2.778    Lavg = 2.81
The average length of the Huffman code is Lavg = 2.81 bits/symbol. Thus the efficiency of the Huffman code is

η = H / Lavg = 2.778 / 2.81 ≈ 98.9%

This example demonstrates that the efficiency of the truncated Huffman encoder is slightly lower than that of the standard Huffman encoder. However, the coding time is reduced: the truncated code needs only 2 (= 4 - 2) stages of reduction, while the standard Huffman code needs 7 (= 9 - 2).
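The comparison can be made concrete by recomputing both averages and efficiencies from the tabulated code lengths (again a small illustrative snippet, not part of the original notes):

    p = [0.3, 0.2, 0.15, 0.1, 0.08, 0.06, 0.05, 0.04, 0.02]
    lengths = {"truncated Huffman": [2, 3, 3, 4, 4, 4, 4, 4, 4],
               "standard Huffman":  [2, 2, 3, 3, 4, 4, 4, 5, 5]}
    for name, lens in lengths.items():
        L = sum(pi * li for pi, li in zip(p, lens))
        print(f"{name}: Lavg = {L:.2f}, efficiency = {2.778 / L:.1%}")
    # truncated Huffman: Lavg = 3.05, efficiency = 91.1%
    # standard Huffman:  Lavg = 2.81, efficiency = 98.9%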
Exercise 1:
The source of information A generates the symbols shown below. Encode the source symbols
with the binary encoder, the Huffman encoder and the truncated Huffman encoder.
Source Symbol    Pi
A0               0.35
A1               0.24
A2               0.16
A3               0.1
A4               0.1
A5               0.02
A6               0.02
A7               0.01
Compare the efficiency of the three codes and comment on the results.
Exercise 2:
The source of information A generates the symbols shown below. Encode the source symbols
with the binary encoder, the Huffman encoder and the truncated Huffman encoder.
Source Symbol    Pi      Source Symbol    Pi
A1               0.2     A12              0.03
A2               0.1     A13              0.03
A3               0.1     A14              0.03
A4               0.06    A15              0.03
A5               0.05    A16              0.02
A6               0.05    A17              0.02
A7               0.05    A18              0.02
A8               0.04    A19              0.02
A9               0.04    A20              0.02
A10              0.04    A21              0.01
A11              0.04
Compare the efficiency of the three codes and comment on the results. (Hint: let K = 12)