Huffman Coding
BS(SE)-V (Evening)
1.1 BACKGROUND
This project was conducted for the Massey University 159.333 Individual
Programming Project paper. The main purpose of the project is to research and
implement data compression algorithms. The objectives of the project are:
1. Research and apply data compression algorithms using C++ and appropriate
data structures.
2. Implement compression algorithms and compare the effectiveness of the
implementations.
3. Extend current methodological approaches and test the designs of the
developed algorithms.
In this project, two lossless data compression algorithms were implemented and
discussed. Lossless data compression exploits statistical redundancy to
represent data more compactly without losing any information.
2. LITERATURE REVIEW
This section discusses compression-related research in the areas of information
theory and data compression algorithms. For this project, entropy and lossless
compression algorithms are the main topics; therefore, a brief introduction to
data compression and lossless compression is given below.
2.1 GENERAL IDEA OF DATA COMPRESSION
2.1.1 SCHEMA OF DATA COMPRESSION
In computer science and communication theory, data compression (or source
coding) is the art of encoding original data, by some specific mechanism, into
fewer bits (or other information-related units). For example, if the word
“compression” in the previous sentence is encoded as “comp”, then the sentence
can be stored in a text file using less data. A popular everyday example is the
ZIP file format, which is widely used on PCs. It not only provides compression
but also offers archiving tools (archivers) that can store many files in a
single archive file.
Data compression is possible because most real-world data contain a large
amount of statistical redundancy, which can exist in various forms. For
example, in an English text file the letter “e” appears far more frequently
than the letter “q”, and the probability that the letter “q” is followed by “z”
is dramatically small. Data redundancy also exists in multimedia information.
Broadly, data redundancy has the following types:
1. Spatial redundancy. (The static architectural background, blue sky, and lawn
of an image contain many identical pixels. If such an image is stored pixel by
pixel, a lot of space is wasted.)
2. Temporal redundancy. (In television and animation, adjacent frames are
likely to share the same background, with only slight changes in the positions
of moving objects, so it is only worth storing the portion that differs between
adjacent frames.)
3. Structural redundancy.
4. Knowledge redundancy.
5. Information entropy redundancy. (Also known as coding redundancy: the gap
between the bits actually used and the entropy that the data carry.)
Hence, one aspect of compression is redundancy removal. Characterizing
redundancy involves some form of modeling; for the English-letter example
above, the model of the redundancy is English text. This process is also known
as de-correlation. After the modeling process, the information needs to be
encoded into a binary representation; this encoding stage involves the coding
algorithms that are applied next.
2.1.2 CODING AND ALGORITHM
3. METHODOLOGY
This section will mainly present various methods which were used in this project.
Implementations and experimental results will be discussed in next sections. All
the methods were conducted for testing and making extensions on current data
compression algorithms, such as Huffman Algorithm. A series of programs were
written in C++ to make adaptation of already available device, algorithm and
design, in order to research and improve the productivity of current compression
algorithms. These programs are tested using Visual Studio 2017 IDE; on an Intel
Core i7-4790k 2.7GHz equipped PC machine under Windows 10 operation system.
HuffManCoding.h File
#ifndef HUFFMANCODING_H
#define HUFFMANCODING_H
#include <iostream>
#include <cstdlib>
using namespace std;
#define MAX_TREE_HT 100
// A node of the Huffman tree: a symbol, its frequency, and two children.
struct MinHeapNode {
    char data;
    unsigned freq;
    MinHeapNode *left, *right;
};
// A min-heap of tree nodes, ordered by frequency.
struct MinHeap {
    unsigned size;
    unsigned capacity;
    MinHeapNode** array;
};
class HuffManCoding
{
public:
    HuffManCoding();
    ~HuffManCoding();
    MinHeapNode* newNode(char data, unsigned freq);
    MinHeap* createMinHeap(unsigned capacity);
    void swapMinHeapNode(MinHeapNode** a, MinHeapNode** b);
    void minHeapify(MinHeap* minHeap, int idx);
    int isSizeOne(MinHeap* minHeap);
    MinHeapNode* extractMin(MinHeap* minHeap);
    void insertMinHeap(MinHeap* minHeap, MinHeapNode* minHeapNode);
    void buildMinHeap(MinHeap* minHeap);
    void printArr(int arr[], int n);
    int isLeaf(MinHeapNode* root);
    MinHeap* createAndBuildMinHeap(char data[], int freq[], int size);
    MinHeapNode* buildHuffmanTree(char data[], int freq[], int size);
    void printCodes(MinHeapNode* root, int arr[], int top);
    void HuffmanCodes(char data[], int freq[], int size);
};
#endif // HUFFMANCODING_H
HuffManCoding.cpp
#include "HuffManCoding.h"
HuffManCoding::HuffManCoding()
{
}
HuffManCoding::~HuffManCoding()
{
}
// Allocate a new tree node for a symbol and its frequency.
MinHeapNode* HuffManCoding::newNode(char data, unsigned freq)
{
    MinHeapNode* temp = new MinHeapNode();
    temp->left = temp->right = NULL;
    temp->data = data;
    temp->freq = freq;
    return temp;
}
// Allocate an empty min-heap with the given capacity.
MinHeap* HuffManCoding::createMinHeap(unsigned capacity)
{
    MinHeap* minHeap = new MinHeap();
    minHeap->size = 0;
    minHeap->capacity = capacity;
    minHeap->array = new MinHeapNode*[capacity];
    return minHeap;
}
void HuffManCoding::swapMinHeapNode(MinHeapNode** a, MinHeapNode** b)
{
    MinHeapNode* t = *a;
    *a = *b;
    *b = t;
}
// Restore the min-heap property at index idx by sifting down.
void HuffManCoding::minHeapify(MinHeap* minHeap, int idx)
{
    int smallest = idx;
    int left = 2 * idx + 1;
    int right = 2 * idx + 2;
    if (left < (int)minHeap->size
        && minHeap->array[left]->freq < minHeap->array[smallest]->freq)
        smallest = left;
    if (right < (int)minHeap->size
        && minHeap->array[right]->freq < minHeap->array[smallest]->freq)
        smallest = right;
    if (smallest != idx) {
        swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
        minHeapify(minHeap, smallest);
    }
}
int HuffManCoding::isSizeOne(MinHeap* minHeap)
{
    return minHeap->size == 1;
}
// Remove and return the node with the smallest frequency.
MinHeapNode* HuffManCoding::extractMin(MinHeap* minHeap)
{
    MinHeapNode* temp = minHeap->array[0];
    minHeap->array[0] = minHeap->array[minHeap->size - 1];
    --minHeap->size;
    minHeapify(minHeap, 0);
    return temp;
}
// Insert a node, sifting it up to keep the min-heap property.
void HuffManCoding::insertMinHeap(MinHeap* minHeap, MinHeapNode* minHeapNode)
{
    ++minHeap->size;
    int i = minHeap->size - 1;
    while (i && minHeapNode->freq < minHeap->array[(i - 1) / 2]->freq) {
        minHeap->array[i] = minHeap->array[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    minHeap->array[i] = minHeapNode;
}
// Heapify every internal node, bottom-up.
void HuffManCoding::buildMinHeap(MinHeap* minHeap)
{
    int n = minHeap->size - 1;
    int i;
    for (i = (n - 1) / 2; i >= 0; --i)
        minHeapify(minHeap, i);
}
void HuffManCoding::printArr(int arr[], int n)
{
    for (int i = 0; i < n; ++i)
        cout << arr[i];
    cout << endl;
}
int HuffManCoding::isLeaf(MinHeapNode* root)
{
    return !(root->left) && !(root->right);
}
// Build a heap containing one leaf node per symbol.
MinHeap* HuffManCoding::createAndBuildMinHeap(char data[], int freq[], int size)
{
    MinHeap* minHeap = createMinHeap(size);
    for (int i = 0; i < size; ++i)
        minHeap->array[i] = newNode(data[i], freq[i]);
    minHeap->size = size;
    buildMinHeap(minHeap);
    return minHeap;
}
// Repeatedly merge the two least-frequent nodes until one tree remains.
MinHeapNode* HuffManCoding::buildHuffmanTree(char data[], int freq[], int size)
{
    MinHeapNode *left, *right, *top;
    MinHeap* minHeap = createAndBuildMinHeap(data, freq, size);
    while (!isSizeOne(minHeap)) {
        left = extractMin(minHeap);
        right = extractMin(minHeap);
        // '$' marks an internal node; it is never printed as a symbol.
        top = newNode('$', left->freq + right->freq);
        top->left = left;
        top->right = right;
        insertMinHeap(minHeap, top);
    }
    return extractMin(minHeap);
}
// Walk the tree, recording 0 for left and 1 for right; print at each leaf.
void HuffManCoding::printCodes(MinHeapNode* root, int arr[], int top)
{
    if (root->left) {
        arr[top] = 0;
        printCodes(root->left, arr, top + 1);
    }
    if (root->right) {
        arr[top] = 1;
        printCodes(root->right, arr, top + 1);
    }
    if (isLeaf(root)) {
        cout << root->data << ": ";
        printArr(arr, top);
    }
}
// Build the Huffman tree and print the code of every symbol.
void HuffManCoding::HuffmanCodes(char data[], int freq[], int size)
{
    MinHeapNode* root = buildHuffmanTree(data, freq, size);
    int arr[MAX_TREE_HT], top = 0;
    printCodes(root, arr, top);
}
Main.cpp
#include <iostream>
#include "HuffManCoding.h"
using namespace std;
int main()
{
    HuffManCoding HMC;
    char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };
    int freq[] = { 5, 9, 12, 13, 16, 45 };
    int size = sizeof(arr) / sizeof(arr[0]);
    HMC.HuffmanCodes(arr, freq, size);
    return 0;
}
5. BIBLIOGRAPHY
A. Lesne. (2011). Shannon entropy: a rigorous mathematical notion at the
crossroads between probability, information theory, dynamical systems and
statistical physics.
Retrieved from: http://www.lptmc.jussieu.fr/user/lesne/MSCS-entropy.pdf
A. Rényi. (1961). On Measures of Entropy and Information.
Retrieved from: http://l.academicdirect.org/Horticulture/GAs/Refs/Renyi_1961.pdf
A. Shahbahrami, R. Bahrampour, M. S. Rostami, M. A. Mobarhan. (2011).
Evaluation of Huffman and Arithmetic Algorithms for Multimedia Compression
Standards.
Retrieved from: http://arxiv.org/ftp/arxiv/papers/1109/1109.0216.pdf