
Huffman_coding


Lecture 7

Huffman Coding
Greedy Algorithm
What is Huffman Coding?
Huffman coding is an efficient method of compressing data without losing information. In
computer science, information is encoded as bits—0's and 1's. Strings of bits encode the information
that tells a computer which instructions to carry out.

Huffman coding produces an efficient, unambiguous code by analyzing how frequently each symbol
appears in a message. Symbols that appear more often are encoded as shorter bit strings, while
symbols that are used less often are encoded as longer bit strings. Since symbol frequencies
vary across messages, no single Huffman code works for all messages.

Note: This technique reduces the size of data sent over a network by compressing it, which
lowers the cost of transmission.
Note
Three cases will be analyzed with this technique:
1. If the message is sent as is, what will the cost be?
2. With fixed-size encoding, what will the cost be?
3. With variable-size encoding (Huffman coding), what will the cost be?

NB: "cost" here means the cost of transmission, measured in bits.

Example 1
Message: BCCABBDDAECCBBAEDDCC
Length: 20 characters

ASCII code uses 8 bits per character.

Ex:
Char  ASCII  Binary
A     65     01000001
B     66     01000010
C     67     01000011
D     68     01000100
E     69     01000101

Total cost to send this message without using any technique:
Total = 8 x 20 = 160 bits
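The ASCII cost above can be checked with a few lines of Python. This is only a sketch; the variable names `message` and `ascii_cost` are chosen here for illustration and are not part of the lecture.

```python
message = "BCCABBDDAECCBBAEDDCC"

# Show the 8-bit ASCII pattern of every distinct character in the message.
for ch in sorted(set(message)):
    print(ch, ord(ch), format(ord(ch), "08b"))

# Without compression, each of the 20 characters costs 8 bits.
ascii_cost = 8 * len(message)
print("ASCII cost in bits:", ascii_cost)
```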
Example 2: Fixed-size code
Message: BCCABBDDAECCBBAEDDCC

Character  Count/Frequency  Code
A          3                000
B          5                001
C          6                010
D          4                011
E          2                100
Total      20

Note: 8 bits can represent 256 symbols (ASCII itself defines 128), but this message has only 5 distinct characters.
Example:
1 bit: 0, 1 (2 combinations)
2 bits: 00, 01, 10, 11 (4 combinations)
3 bits: 000 to 111 (8 combinations), of which only 5 are needed

1. Size of the message: 20 x 3 = 60 bits
2. Size of the table: 5 x 8 + 5 x 3 = 55 bits (an 8-bit ASCII character plus its 3-bit code, for each of the 5 characters)
Total cost: 60 + 55 = 115 bits
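The fixed-size calculation can be sketched in Python as follows. The names `bits_per_symbol`, `message_bits`, and `table_bits` are illustrative assumptions, and the table cost follows the lecture's convention of 8 bits per character plus one fixed-size code per character.

```python
from collections import Counter
from math import ceil, log2

message = "BCCABBDDAECCBBAEDDCC"
counts = Counter(message)  # frequency of each distinct character

# Smallest number of bits that can distinguish all distinct characters:
# 5 characters need ceil(log2(5)) = 3 bits.
bits_per_symbol = ceil(log2(len(counts)))

message_bits = bits_per_symbol * len(message)
# Table: one 8-bit character plus its fixed-size code, per distinct character.
table_bits = len(counts) * 8 + len(counts) * bits_per_symbol
total_bits = message_bits + table_bits

print(message_bits, table_bits, total_bits)
```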
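Example 3: Variable-size code (Huffman code)
Message: BCCABBDDAECCBBAEDDCC

Char   Count   Code
A      3
B      5
C      6
D      4
E      2
Total  20

[Figure: optimal merge tree built from the counts sorted in increasing order (2, 3, 4, 5, 6); the internal nodes include 5 and 11, and the root is 20]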
Huffman coding
Now we can use this code to encode our message.

So, if we use this method, what will the size of the message be?
Size of the message

The message size is 45 bits (the sum of count x code length over all characters).

NB: alongside the message, we need to send the table or the tree
so that the receiver can decode it.

Characters: 5 x 8 bits = 40 bits
Codes: 12 bits
Total = 52 bits

Total size of the message including the decoding table: 45 + 52 = 97 bits
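The figures above can be reproduced with a small greedy merge. The following is a minimal sketch, not the lecture's own implementation: `huffman_code_lengths` is a hypothetical helper that tracks only the depth of each character, since the cost depends on code lengths rather than on the exact bit patterns.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length} by repeatedly merging the two smallest weights."""
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Every symbol under the new parent moves one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

message = "BCCABBDDAECCBBAEDDCC"
counts = Counter(message)
lengths = huffman_code_lengths(counts)

message_bits = sum(counts[s] * lengths[s] for s in counts)
code_bits = sum(lengths.values())          # total length of all codes in the table
table_bits = len(counts) * 8 + code_bits   # 8-bit characters plus their codes
print(message_bits, table_bits, message_bits + table_bits)
```

Different tie-breaking rules can produce different trees, but for this message the total cost stays the same.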


Problem

A file contains the following characters with the frequencies as shown.

Characters  Frequencies
a           10
e           15
i           12
o           3
u           4
s           13
t           1

If Huffman coding is used for data compression, determine:
1. Huffman code for each character
2. Average code length
3. Length of the Huffman encoded message (in bits)
Solution

First let us construct the Huffman tree. The Huffman tree is constructed in the following steps.

[Figures: Steps 1 to 7 of the construction, followed by the complete Huffman tree]
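The order of merges in the construction can be sketched with a min-heap. This is an illustration under the usual greedy rule (always combine the two smallest weights); `merge_steps` is a hypothetical helper name, and it records weights only, not the shape of the tree.

```python
import heapq

freqs = {"a": 10, "e": 15, "i": 12, "o": 3, "u": 4, "s": 13, "t": 1}

def merge_steps(weights):
    """Return the list of (smaller, larger, combined) weights in merge order."""
    heap = list(weights)
    heapq.heapify(heap)
    steps = []
    while len(heap) > 1:
        x = heapq.heappop(heap)   # smallest remaining weight
        y = heapq.heappop(heap)   # second smallest remaining weight
        steps.append((x, y, x + y))
        heapq.heappush(heap, x + y)
    return steps

steps = merge_steps(freqs.values())
for x, y, total in steps:
    print(f"{x} + {y} -> {total}")
```

Seven leaves need six merges, and the last combined weight equals the total number of characters in the file.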
1. Huffman Code for Characters

To write the Huffman code for any character, traverse the Huffman tree from the root node to the
leaf node of that character, writing down the bit on each edge along the way.
Following this rule, the Huffman code for each character is:

a = 111
e = 10
i = 00
o = 11001
u = 1101
s = 01
t = 11000

NB: From here, we can observe:

Characters occurring less frequently in the text are assigned longer codes.
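Because no code in the table is a prefix of another, a bit string can be decoded without separators. The sketch below checks that property for the codes listed above and round-trips a short sample; the helper names and the sample word "seat" are assumptions made for illustration only.

```python
codes = {"a": "111", "e": "10", "i": "00", "o": "11001",
         "u": "1101", "s": "01", "t": "11000"}

def is_prefix_free(codes):
    """True if no code word is a prefix of a different code word."""
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

def encode(text, codes):
    return "".join(codes[ch] for ch in text)

def decode(bits, codes):
    """Read bits left to right and emit a character whenever a code word completes."""
    inverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

bits = encode("seat", codes)
print(is_prefix_free(codes), bits, decode(bits, codes))
```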
2. Average Code Length

Using formula-01, we have:

Average code length
= ∑ (frequency_i x code length_i) / ∑ (frequency_i)
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)
= 146 / 58
≈ 2.52 bits per character
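3. Length of Huffman Encoded Message

Using formula-02, we have:

Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= 58 x (146 / 58)
= 146 bits

Using the rounded average instead gives 58 x 2.52 = 146.16, which overstates the true length; the exact total is ∑ (frequency x code length) = 146 bits.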
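Both quantities can be computed directly from the frequency table and the code lengths. In this sketch the encoded length is summed as frequency x code length, which yields an exact integer and sidesteps any rounding of the average; the dictionary names are illustrative.

```python
freqs = {"a": 10, "e": 15, "i": 12, "o": 3, "u": 4, "s": 13, "t": 1}
codes = {"a": "111", "e": "10", "i": "00", "o": "11001",
         "u": "1101", "s": "01", "t": "11000"}

total_chars = sum(freqs.values())
# Exact encoded length: each character contributes frequency x code length bits.
total_bits = sum(freqs[ch] * len(codes[ch]) for ch in freqs)
average_length = total_bits / total_chars

print(total_chars, total_bits, round(average_length, 2))
```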
Problem 2
How many bits may be required for encoding the
message ‘mississippi’?
1. Total number of bits to encode the message?
2. What is the average bits per character?
Solution
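First let us construct the Huffman tree based on the message
“mississippi”.

Character  Frequency
m          1
p          2
s          4
i          4
Total      11

Character  Frequency  Code  Code length
m          1          000   3
p          2          001   3
s          4          01    2
i          4          1     1

[Figure: Huffman tree with leaves m (1), p (2), s (4), i (4), internal nodes 3 and 7, and root 11; each left edge is labelled 0 and each right edge 1]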
1. Total number of bits to encode the message?

Total number of bits = freq(m) * code_length(m) + freq(p) * code_length(p) + freq(s) * code_length(s) + freq(i) * code_length(i)
= 1*3 + 2*3 + 4*2 + 4*1 = 21

Total number of bits to encode the message = 21

2. What is the average bits per character?

The average number of bits per character is:

Total number of bits required / total number of characters

21 / 11 ≈ 1.909
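As a quick check, the message can be encoded with the codes from the table and the resulting bits counted. This is a small sketch; the variable names are chosen here for illustration.

```python
from collections import Counter

message = "mississippi"
codes = {"m": "000", "p": "001", "s": "01", "i": "1"}

counts = Counter(message)
encoded = "".join(codes[ch] for ch in message)

total_bits = len(encoded)
average_bits = total_bits / len(message)
print(dict(counts), encoded, total_bits, round(average_bits, 3))
```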
Homework 1
Construct a Huffman tree using these nodes.

Value      A  B   C  D   E  F
Frequency  5  25  7  15  4  12
Homework 2
Construct a Huffman tree for the following string.
"go go gophers"
Homework 3
Obtain a set of Huffman codes for the messages (m1, ..., m7) with relative frequencies (q1, ..., q7) =
(4, 5, 7, 8, 10, 12, 20).

1. Draw the Huffman tree for the given frequencies.

2. Write down the codes based on the tree.


Homework 4
Given the following encoded message and frequencies, can you decode the
message using the Huffman coding technique?

Frequencies: A: 6, B: 1, C: 6, D: 2, E: 5
Encoded data:
0000000000001100101010101011111111010101010
Questions?
