Lecture 2: Huffman Coding
Huffman Coding
Huffman coding (Huffman, 1951) uses the frequencies of symbols in a string to build a variable-rate prefix code. Each symbol is mapped to a binary string. More frequent symbols have shorter codes. No code is a prefix of another.
Example: a = 0, b = 100, c = 101, d = 11
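The prefix property is what makes decoding unambiguous: reading bits left to right, the first codeword that matches is the only one that can match. The following is a minimal sketch in Python, using the example code table above. The function names `is_prefix_free`, `encode`, and `decode` are illustrative choices, not part of the lecture.

```python
from itertools import permutations

# Example code table from the text above.
CODE = {"a": "0", "b": "100", "c": "101", "d": "11"}

def is_prefix_free(code):
    """Return True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(x.startswith(y) for x, y in permutations(words, 2))

def encode(text, code):
    """Concatenate the codewords of the symbols in text."""
    return "".join(code[s] for s in text)

def decode(bits, code):
    """Decode left to right; the prefix property makes each match unambiguous."""
    inverse = {w: s for s, w in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

bits = encode("abad", CODE)
print(is_prefix_free(CODE))  # True
print(bits)                  # 0100011
print(decode(bits, CODE))    # abad
```

A table such as {a: 0, b: 01} would fail the check, since after reading 0 the decoder could not tell whether a is complete or b has just started.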
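Let $p_1, p_2, \ldots, p_m$ be the probabilities of the symbols $a_1, a_2, \ldots, a_m$ at the leaves of a code tree T. The cost of T is
$HC(T) = \sum_{i=1}^{m} p_i r_i$
where $r_i$ is the length of the path from the root to $a_i$. HC(T) is the expected length of the code of a symbol coded by the tree T; that is, HC(T) is the bit rate of the code.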
Example of Cost
Example: probabilities a = 1/2, b = 1/8, c = 1/8, d = 1/4
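As a worked check, the cost of a tree is each symbol's probability times its path length, summed over all symbols. The sketch below assumes these probabilities go with the earlier example code a = 0, b = 100, c = 101, d = 11. The tree itself did not survive in the text, so that pairing is an assumption. The function name `cost` is illustrative.

```python
from fractions import Fraction

# Probabilities from the example above.
probs = {
    "a": Fraction(1, 2),
    "b": Fraction(1, 8),
    "c": Fraction(1, 8),
    "d": Fraction(1, 4),
}
# Assumed code: the prefix code from the first example in these notes.
code = {"a": "0", "b": "100", "c": "101", "d": "11"}

def cost(probs, code):
    """HC(T): sum of p_i * r_i, where r_i is the codeword length of symbol i."""
    return sum(p * len(code[s]) for s, p in probs.items())

hc = cost(probs, code)
print(hc, float(hc))  # 7/4 1.75
```

Under that assumption the cost is 1/2 x 1 + 1/8 x 3 + 1/8 x 3 + 1/4 x 2 = 1.75 bits per symbol, compared with 2 bits per symbol for a fixed-length code on four symbols.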
Huffman Tree
Input: probabilities $p_1, p_2, \ldots, p_m$ for symbols $a_1, a_2, \ldots, a_m$, respectively.
Output: a tree that minimizes the average number of bits (bit rate) to code a symbol. That is, it minimizes
$HC(T) = \sum_{i=1}^{m} p_i r_i$
where $r_i$ is the length of the path from the root to $a_i$. Such a tree is a Huffman tree, and the code it defines is a Huffman code.
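One standard way to build such a tree is the greedy procedure usually associated with Huffman's name: repeatedly merge the two least probable subtrees until one tree remains. The text above states only the input and output, so the following is a sketch of that standard construction, not necessarily the exact presentation used in the lecture. The names `huffman_code` and `cost`, and the use of a counter to break ties, are illustrative choices.

```python
import heapq
from fractions import Fraction
from itertools import count

def huffman_code(probs):
    """Build a prefix code by repeatedly merging the two least probable subtrees."""
    if len(probs) == 1:
        # Degenerate case: a single symbol still needs one bit.
        return {next(iter(probs)): "0"}
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    # Each heap entry: (probability, tiebreak, {symbol: codeword so far}).
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        # Merging adds one edge above every leaf in both subtrees.
        merged = {s: "0" + w for s, w in left.items()}
        merged.update({s: "1" + w for s, w in right.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

def cost(probs, code):
    """HC(T): sum of p_i * r_i over all symbols."""
    return sum(p * len(code[s]) for s, p in probs.items())

probs = {
    "a": Fraction(1, 2),
    "b": Fraction(1, 8),
    "c": Fraction(1, 8),
    "d": Fraction(1, 4),
}
code = huffman_code(probs)
print(cost(probs, code))  # 7/4
```

When probabilities tie, different merge orders can give different codewords, and sometimes different code lengths, but every tree the procedure produces has the same minimum cost.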
Huffman Code
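For the five symbols a, b, c, d, e, the average number of bits per symbol is 0.4 x 1 + 0.1 x 4 + 0.3 x 2 + 0.1 x 3 + 0.1 x 4 = 2.1, where each term is a symbol's probability times its code length.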