Data Compression
Cleophas Mochoge
INTE 412
1
In computer science and information theory, data
compression, source coding, or bit-rate
reduction involves encoding information using
fewer bits than the original representation.
Compression can be either lossy or lossless.
Lossless compression reduces bits by identifying
and eliminating statistical redundancy. No
information is lost in lossless compression.
Lossy compression reduces bits by identifying
marginally important information and removing it.
2
The process of reducing the size of a data file is
popularly referred to as data compression,
although its formal name is source coding
(coding done at the source of the data, before it
is stored or transmitted). Compression is useful
because it helps reduce resource usage, such
as data storage space or transmission capacity.
Because compressed data must be decompressed
before it can be used, this extra processing
imposes computational or other costs.
3
Lossless
• Lossless data compression algorithms
usually exploit statistical redundancy to
represent data more concisely without
losing information. Lossless compression is
possible because most real-world data has
statistical redundancy.
4
For example, an image may have areas
of colour that do not change over several
pixels; instead of coding "red pixel, red
pixel, ..." the data may be encoded as
"279 red pixels". This is a simple example
of run-length encoding; there are many
schemes to reduce size by eliminating
redundancy.
5
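To make the idea concrete, here is a minimal run-length encoding sketch. Python and the function name rle_encode are assumptions for illustration, not part of the original slides.

def rle_encode(data):
    """Collapse runs of repeated symbols into (symbol, count) pairs."""
    runs = []
    for symbol in data:
        if runs and runs[-1][0] == symbol:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([symbol, 1])  # start a new run
    return [(s, c) for s, c in runs]

# "red pixel, red pixel, ..." collapses to a single (symbol, count) pair
print(rle_encode(["red"] * 279))      # [('red', 279)]
print(rle_encode("AAABBC"))           # [('A', 3), ('B', 2), ('C', 1)]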
Lossy
Lossy data compression is contrasted
with lossless data compression. In these
schemes, some loss of information is
acceptable. Depending upon the
application, detail can be dropped from
the data to save storage space. Generally,
lossy data compression schemes are guided by
research on how people perceive the data in
question.
6
For example, the human eye is more sensitive
to subtle variations in luminance than it is to
variations in color. JPEG image compression
works in part by "rounding off" less-important
visual information. There is a corresponding
trade-off between the information lost and the
reduction in size. A number of popular
compression formats exploit these perceptual
differences, including those used in music
files, images, and video.
7
Introduction
8
9
Compression Principles
Entropy Encoding
Run-length encoding
Lossless & independent of the type of source
information
Used when the source information comprises
long substrings of the same character or
binary digit
Encoded as (string or bit pattern, # of
occurrences) pairs, as in FAX
e.g.) 000000011111111110000011……
→ 0,7 1,10 0,5 1,2…… or simply 7,10,5,2……
10
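As a rough sketch of the bit-oriented variant above (Python assumed; bit_run_lengths is an illustrative name), the runs alternate between 0 and 1, so only the first bit and the run counts need to be sent:

from itertools import groupby

def bit_run_lengths(bits):
    # Runs alternate between 0 and 1, so the first bit plus the
    # sequence of run lengths is enough to reconstruct the stream.
    return bits[:1], [sum(1 for _ in group) for _, group in groupby(bits)]

first_bit, runs = bit_run_lengths("000000011111111110000011")
print(first_bit, runs)   # 0 [7, 10, 5, 2]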
Compression Principles
Entropy Encoding
Statistical encoding
Based on the probability of occurrence of a
pattern
The more probable, the shorter codeword
“Prefix property”: a shorter codeword must not
form the start of a longer codeword
11
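The prefix property can be checked mechanically. The following is a small sketch (Python assumed; is_prefix_free is an illustrative name, not from the slides):

def is_prefix_free(codewords):
    # After sorting lexicographically, any prefix violation shows up
    # between adjacent entries, so checking neighbours is enough.
    codes = sorted(codewords)
    return all(not codes[i + 1].startswith(codes[i])
               for i in range(len(codes) - 1))

print(is_prefix_free(["1", "01", "001", "000"]))   # True  - valid prefix code
print(is_prefix_free(["01", "010"]))               # False - "01" starts "010"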
Huffman Coding
Huffman coding is one type of entropy
coding in which each character is encoded
according to the probability of its
occurrence. The Huffman coding
algorithm determines the optimal code
using the minimum number of bits. The
length (number of bits) of the coded
characters will differ.
12
To determine the Huffman code, it is useful to
construct a binary tree. The leaves (nodes) of
the tree represent the characters that are to be
encoded. Every node contains the probability of
occurrence; 0 and 1 are assigned to the
branches of the tree. Every character has an
associated weight equal to the number of times
the character occurs in the data stream.
13
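One common way to carry out this construction is with a min-heap, repeatedly merging the two least-probable nodes. The sketch below assumes Python and its heapq module; it illustrates the technique rather than reproducing the exact procedure pictured on the slides.

import heapq
from itertools import count

def huffman_codes(freqs):
    # Build Huffman codes from a {symbol: weight} map.
    tiebreak = count()   # keeps heap comparisons away from the node tuples
    heap = [(w, next(tiebreak), (sym, None, None)) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two least-probable nodes
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (None, left, right)))
    _, _, root = heap[0]

    codes = {}
    def walk(node, prefix):
        sym, left, right = node
        if sym is not None:
            codes[sym] = prefix or "0"   # lone-symbol edge case
        else:
            walk(left, prefix + "0")     # 0 assigned to one branch
            walk(right, prefix + "1")    # 1 assigned to the other
    walk(root, "")
    return codes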
Stream of characters
• P(A) = 0.16
• P(B) = 0.51
• P(C) = 0.09
• P(D) = 0.13
• P(E) = 0.11
14
15
e.g.) symbols A, B, C, D, E with probabilities
A(0.16), B(0.51), C(0.09), D(0.13), E(0.11)
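Feeding these probabilities to the huffman_codes sketch given earlier (an assumption for illustration, not the slides' own worked tree) yields codeword lengths matching the expected Huffman result:

codes = huffman_codes({"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11})
print(codes)
# The exact 0/1 labels depend on tie-breaking, but B always gets a 1-bit
# codeword and A, C, D, E get 3-bit codewords, so the average length is
# 0.51*1 + 0.49*3 = 1.98 bits per symbol.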
Huffman Encoding
Entropy, H: theoretical min. avg. # of bits required
to transmit a particular stream
H = -Σ (i=1..n) Pi log2(Pi)
Efficiency, E = H / H'
where H' = avg. # of bits per codeword = Σ (i=1..n) Ni Pi
Ni: # of bits of symbol i
17
e.g.) symbols M(10), F(11), Y(010), N(011), 0(000),
1(001) with probabilities 0.25, 0.25, 0.125, 0.125,
0.125, 0.125
18
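A quick check of these formulas on the example above (plain Python; the variable names are illustrative):

import math

probs   = [0.25, 0.25, 0.125, 0.125, 0.125, 0.125]   # M, F, Y, N, 0, 1
lengths = [2, 2, 3, 3, 3, 3]                          # bits per codeword

H     = -sum(p * math.log2(p) for p in probs)         # entropy = 2.5 bits
H_bar = sum(p * n for p, n in zip(probs, lengths))    # avg length = 2.5 bits
print(H, H_bar, H / H_bar)                            # efficiency E = 1.0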
Huffman Algorithm
19
Huffman Algorithm
20
• Huffman coding
21
Huffman coding
22
• Right branch of the binary tree: 1
• Left branch of the binary tree: 0
• Prefix (example)
– e: "01", b: "010"
– "01" is a prefix of "010", so "010" would be
misread as "e" followed by "0"; the prefix
property is violated
• Same frequency: need a consistent rule for
assigning left or right
23
• Example (64 data values)
• R K K K K K K K
• K K K R R K K K
• K K R R R R G G
• K K B C C C R R
• G G G M C B R R
• B B B M Y B B R
• G G G G G G G R
• G R R R R G R R
24
• Color   Frequency   Huffman code
• =================================
• R          19          00
• K          17          01
• G          14          10
• B           7          110
• C           4          1110
• M           2          11110
• Y           1          11111
25
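As a quick sanity check of the gain, a sketch assuming a fixed-length code would need 3 bits per symbol for these 7 colors:

freqs = {"R": 19, "K": 17, "G": 14, "B": 7, "C": 4, "M": 2, "Y": 1}
codes = {"R": "00", "K": "01", "G": "10", "B": "110",
         "C": "1110", "M": "11110", "Y": "11111"}

huffman_bits = sum(freqs[s] * len(codes[s]) for s in freqs)   # 152 bits
fixed_bits   = sum(freqs.values()) * 3                        # 64 * 3 = 192 bits
print(huffman_bits, fixed_bits)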
26
Static Huffman Coding
Huffman (Code) Tree
Given: a number of symbols (or characters) and their
relative probabilities in advance
Must hold the "prefix property" among codes

Symbol   Occurrence
A        4/8
B        2/8
C        1/8
D        1/8

[Huffman code tree: the root node (weight 8) branches
1 to leaf A and 0 to a branch node (weight 4), which
branches 1 to leaf B and 0 to a branch node (weight 2),
which branches 1 to leaf C and 0 to leaf D]

Symbol   Code
A        1
B        01
C        001
D        000

4x1 + 2x2 + 1x3 + 1x3 = 14 bits are required to
transmit "AAAABBCD"
Prefix property!
27
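A closing sketch using the code table above (Python assumed) to confirm the 14-bit total and show why the prefix property makes greedy decoding unambiguous:

codes = {"A": "1", "B": "01", "C": "001", "D": "000"}
encoded = "".join(codes[ch] for ch in "AAAABBCD")
print(encoded, len(encoded))          # 11110101001000 -> 14 bits

decode = {v: k for k, v in codes.items()}
out, buf = [], ""
for bit in encoded:                   # prefix-free, so read greedily
    buf += bit
    if buf in decode:
        out.append(decode[buf])
        buf = ""
print("".join(out))                   # AAAABBCD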
The end
Thank you
28