Text and Text Compression
95 printable characters
Alphabetic, Numeric, Punctuation
A – 1000001 (65)
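The figures above can be checked with a minimal sketch. It assumes the slide refers to 7-bit ASCII, which the slide itself does not name; the variable name `printable` is illustrative only.

```python
# Assumption: the character set is 7-bit ASCII (codes 0 to 127).
# Printable characters run from space (32) through '~' (126).
printable = [chr(code) for code in range(32, 127)]

print(len(printable))            # 95
print(format(ord("A"), "07b"))   # 1000001
print(ord("A"))                  # 65
```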
5) Scan the text again and create a new file using the Huffman
codes.
Examples
• Consider the following text:
BCAACADBDCADAEEEABACDBACADCBADABEABEAAA
• A: 15; B: 7; C: 6; D: 6; E: 5
(39)
  0: A(15)
  1: (24)
       0: (13), which splits again into branches 0 and 1
       1: (11), which splits again into branches 0 and 1
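The example can be reproduced with a short sketch of Huffman coding. This is one possible implementation, not necessarily the procedure used on the slides: the function names `huffman_codes` and `decode` are hypothetical, and the way ties between equal weights are broken is an assumption, so individual bit patterns may differ from the tree while the code lengths stay the same.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol to its Huffman bit string."""
    freq = Counter(text)
    # Heap entries: (weight, tie_breaker, {symbol: partial code}).
    # The integer tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    if len(heap) == 1:  # only one distinct symbol: give it a 1-bit code
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)    # the two lightest subtrees
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

def decode(bits, codes):
    """Decode a bit string by matching prefix-free codewords left to right."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

text = "BCAACADBDCADAEEEABACDBACADCBADABEABEAAA"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
print(len(encoded))  # 87
```

A receives a 1-bit code and B, C, D, E receive 3-bit codes, so the text needs 15×1 + 24×3 = 87 bits, compared with 39×3 = 117 bits for a fixed 3-bit code.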
[Figure: arithmetic coding interval subdivision for symbols a1, a2, a3 over five steps; interval lower bounds at successive steps: 0, 0, 0.04, 0.056, 0.0624]
Arithmetic Coding: Q&A
• Demonstrate how to encode the five-symbol sequence BABAB
from the alphabet {A, B} using the arithmetic coding
algorithm, given p(A) = 1/5 and p(B) = 4/5.
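One way to work through this exercise is sketched below. It assumes that A occupies the lower part of each interval, [0, 1/5), and B the upper part, [1/5, 1); the slide does not fix this ordering, and the opposite choice gives a different but equally valid interval. The function name `arithmetic_interval` is hypothetical. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def arithmetic_interval(message, probs):
    """Narrow [0, 1) symbol by symbol and return the final (low, high)."""
    # Cumulative lower bound of each symbol, in the order given by probs.
    cum, total = {}, Fraction(0)
    for sym, p in probs.items():
        cum[sym] = total
        total += p
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        low += width * cum[sym]   # move to the symbol's sub-interval
        width *= probs[sym]       # shrink by the symbol's probability
    return low, low + width

# Assumed ordering: A first (lower sub-interval), then B.
probs = {"A": Fraction(1, 5), "B": Fraction(4, 5)}
low, high = arithmetic_interval("BABAB", probs)
print(low, high)                  # 741/3125 161/625
print(float(low), float(high))    # 0.23712 0.2576
```

Under this ordering the intervals narrow as [1/5, 1), [1/5, 9/25), [29/125, 9/25), [29/125, 161/625), [741/3125, 161/625). The final width is (4/5)^3 × (1/5)^2 = 64/3125, and any number inside the final interval identifies BABAB.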
• Context
▫ Huffman needs a tree for every context
▫ Arithmetic needs a small table of frequencies for every context
• Adaptation
▫ Huffman has an elaborate adaptive algorithm
▫ Arithmetic has a simple adaptive mechanism
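The points about context and adaptation can be illustrated with a small sketch. It is an assumption about what such a mechanism might look like, not the scheme from the slides: the class `AdaptiveModel`, the function `ideal_bits`, and the choice to start every count at 1 are all hypothetical. The model keeps one frequency table per context and increments a count after each symbol; the returned value is the sum of -log2 p, the size that an arithmetic coder driven by this model would approach.

```python
import math
from collections import defaultdict

class AdaptiveModel:
    """One small frequency table per context, updated after every symbol."""

    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        # Every count starts at 1 so that no symbol has zero probability.
        self.tables = defaultdict(lambda: {s: 1 for s in self.alphabet})

    def probability(self, context, symbol):
        table = self.tables[context]
        return table[symbol] / sum(table.values())

    def update(self, context, symbol):
        self.tables[context][symbol] += 1

def ideal_bits(text, alphabet, order=1):
    """Sum of -log2 p(symbol | previous `order` symbols) over the text."""
    model, bits = AdaptiveModel(alphabet), 0.0
    for i, sym in enumerate(text):
        context = text[max(0, i - order):i]
        bits += -math.log2(model.probability(context, sym))
        model.update(context, sym)   # the whole adaptation step
    return bits

sample = "AB" * 10
print(ideal_bits(sample, "AB", order=0))  # no context
print(ideal_bits(sample, "AB", order=1))  # previous symbol as context
```

On this alternating sample the order-1 model costs far fewer bits than the order-0 model, because each context table quickly learns which symbol follows.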
Arithmetic Coding: Q&A
• Demonstrate how to encode the four-symbol sequence
x1 x2 x2 x3 using the arithmetic coding algorithm, given
p(x1) = 0.5, p(x2) = 0.3 and p(x3) = 0.2. Calculate the data
size after compression.
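A possible worked solution is sketched below. It assumes the sub-intervals are ordered x1, x2, x3 from the bottom of each interval, and it measures the compressed size as the length of the shortest binary fraction inside the final interval. Both are assumptions; the helper names `final_interval` and `shortest_codeword` are hypothetical.

```python
import math
from fractions import Fraction

def final_interval(message, probs):
    """Return the (low, high) interval that arithmetic coding assigns to message."""
    cum, total = {}, Fraction(0)
    for sym, p in probs.items():
        cum[sym] = total
        total += p
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        low += width * cum[sym]
        width *= probs[sym]
    return low, low + width

def shortest_codeword(low, high):
    """Shortest bit string b1...bk whose value 0.b1...bk lies in [low, high)."""
    k = 1
    while True:
        m = math.ceil(low * 2**k)        # smallest k-bit numerator >= low
        if Fraction(m, 2**k) < high:
            return format(m, f"0{k}b")
        k += 1

# Assumed ordering of sub-intervals: x1, then x2, then x3.
probs = {"x1": Fraction(1, 2), "x2": Fraction(3, 10), "x3": Fraction(1, 5)}
low, high = final_interval(["x1", "x2", "x2", "x3"], probs)
code = shortest_codeword(low, high)
print(float(low), float(high))  # 0.361 0.37
print(code, len(code))          # 0101111 7
```

The intervals narrow as [0, 0.5), [0.25, 0.4), [0.325, 0.37), [0.361, 0.37). The final width is 0.5 × 0.3 × 0.3 × 0.2 = 0.009, and ceil(-log2 0.009) = 7, which matches the 7-bit codeword 0.0101111 (binary) = 47/128 ≈ 0.367. A practical coder may spend an extra bit or two and must also signal where the message ends.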
Lempel-Ziv Coding
• Lempel-Ziv coding relies on recurring patterns to save
data space.
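The idea of reusing recurring patterns can be sketched as follows. The slide does not say which Lempel-Ziv variant is meant; this sketch assumes LZ78, where each output pair refers to a previously seen phrase by index and appends one new character. The function names `lz78_encode` and `lz78_decode` are hypothetical.

```python
def lz78_encode(text):
    """Return (phrase_index, next_char) pairs; index 0 means the empty phrase."""
    dictionary = {}            # phrase -> index (1-based)
    output, phrase = [], ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch       # keep extending a phrase seen before
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ""
    if phrase:                 # leftover phrase that is already in the dictionary
        output.append((dictionary[phrase], ""))
    return output

def lz78_decode(pairs):
    """Rebuild the text by replaying the same dictionary growth."""
    phrases, out = [""], []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)

pairs = lz78_encode("ABABABA")
print(pairs)               # [(0, 'A'), (0, 'B'), (1, 'B'), (3, 'A')]
print(lz78_decode(pairs))  # ABABABA
```

The seven input characters become four pairs because the repeated phrases AB and ABA are each replaced by an index to an earlier phrase plus one character.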