
Digital Image Processing

Lossless Compression – Lossy Compression - JPEG

DR TANIA STATHAKI
READER (ASSOCIATE PROFESSOR) IN SIGNAL PROCESSING
IMPERIAL COLLEGE LONDON
Elements from Information Theory

• Any information-generating process can be viewed as a source that
  emits a sequence of symbols chosen from a finite alphabet.
   ASCII symbols (text)
   $n$-bit image values ($2^n$ symbols)

• The simplest form of an information source is the so-called Discrete
  Memoryless Source (DMS). Successive symbols produced by such a
  source are statistically independent.

• A DMS is completely specified by the source alphabet $S = \{s_1, s_2, \ldots, s_n\}$
  and the associated probabilities $P = \{p_1, p_2, \ldots, p_n\}$.
Elements from Information Theory

• The Self-Information of a symbol $s_i$ with probability $p_i$ is defined as:

$$I(s_i) = \log_2 \frac{1}{p_i} = -\log_2 p_i$$
 The occurrence of a less probable event provides more information.
 The information of a sequence of independent events taken as a
single event equals the sum of their individual information.
 An event can be the occurrence of a symbol.

• The Average Information per Symbol or Entropy of a DMS is:

$$H(S) = \sum_{i=1}^{n} p_i I(s_i) = -\sum_{i=1}^{n} p_i \log_2 p_i$$

• The Entropy is measured in bits/symbol.
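
As a quick illustration of the two definitions above, here is a minimal Python sketch (not part of the original slides) that computes self-information and entropy; the 4-symbol probabilities are an assumed example that happens to give the 1.75 bits/symbol entropy quoted later in the slides.

```python
import math

def self_information(p):
    """Self-information of a symbol with probability p, in bits."""
    return -math.log2(p)

def entropy(probs):
    """Entropy of a DMS in bits/symbol: H(S) = -sum_i p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed example: a 4-symbol source with probabilities 1/2, 1/4, 1/8, 1/8.
print(self_information(0.125))               # 3.0 bits
print(entropy([0.5, 0.25, 0.125, 0.125]))    # 1.75 bits/symbol
```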


What is actually entropy?
• Interpretation of entropy:
 By definition it is the average amount of information that symbols of
a specific source carry.
 Entropy is also a measure of the "disorder" (uncertainty) of a system.

• In Physics
 the "disorder" above really refers to the number of different
microscopic states a system can be in, given that the system has a
particular fixed composition, volume, energy, pressure, and
temperature. By "microscopic states", we mean the exact states of
all the molecules making up the system.

• When we design a coding scheme, the average number of bits per
  symbol we can achieve is never less than the entropy. Therefore,
  the entropy is the best we can do in terms of bits/symbol!
Extension of a DMS

• Given a DMS of size $n$, it might be beneficial to group the original
  symbols of the source into blocks of $N$ symbols. Each block can now be
  considered as a single source symbol generated by a source $S^N$ which
  has $n^N$ symbols.

• In this case the entropy of the new source is

$$H(S^N) = N \times H(S)$$

• We observe that when the source is extended the entropy increases;
  however, the symbols also increase in length. The entropy per original
  symbol remains the same.
Noiseless source coding theorem

• Let $S$ be a source with alphabet size $n$ and entropy $H(S)$.

• Consider coding blocks of $N$ source symbols into binary codewords. For
  any $\delta > 0$ (with $\delta$ a small number), it is possible, by choosing $N$ large
  enough, to construct a code such that the average number of bits per
  original source symbol $l_{avg}$ satisfies:

$$H(S) \le l_{avg} < H(S) + \delta$$
• We observe that for $\delta$ small enough, the average number of bits per
  symbol approaches the entropy of the source. This is the best coding
  we can achieve.

• The above is not practical, however, since the alphabet size grows very
  quickly with $N$!
Examples of possible codes for a 4-symbol source

• In the table below we see four different codes for a four-symbol
  alphabet. The entropy of the source is 1.75 bits/symbol.

• The Average Length of a code is

$$l_{avg} = \sum_i l_i p_i$$

• $l_i$ is the length (in bits) of the codeword which corresponds to the symbol $s_i$.
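
The table of codes itself is not reproduced in this text version. As a hedged illustration, the sketch below computes $l_{avg}$ for an assumed source with entropy 1.75 bits/symbol (probabilities 1/2, 1/4, 1/8, 1/8) and an assumed prefix code; these are not necessarily the entries of the slide's table.

```python
def average_length(lengths, probs):
    """l_avg = sum_i l_i * p_i, in bits/symbol."""
    return sum(l * p for l, p in zip(lengths, probs))

# Assumed 4-symbol source with H(S) = 1.75 bits/symbol and an assumed
# prefix code {0, 10, 110, 111}, i.e. codeword lengths 1, 2, 3, 3.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]
print(average_length(lengths, probs))   # 1.75 -> this code reaches the entropy
```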
Code characteristics

• It is desirable for a code to exhibit Unique Decodability.

• Prefix Codes: no codeword is a prefix of another codeword.
• A prefix code is always uniquely decodable. The reverse is not true.
• A code can be depicted as a binary tree where the symbols are the
  nodes and the branches are labelled 0 or 1. The codeword of a symbol is
  found by concatenating the 0s and 1s scanned along the path from the
  "root" of the tree until we reach that symbol.
• In a prefix code the codewords are associated only with the external
  (leaf) nodes.
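
A minimal sketch (assuming nothing beyond the definition above) that checks whether a candidate set of codewords has the prefix property:

```python
def is_prefix_code(codewords):
    """True if no codeword is a prefix of another codeword."""
    for c1 in codewords:
        for c2 in codewords:
            if c1 != c2 and c2.startswith(c1):
                return False
    return True

print(is_prefix_code(["0", "10", "110", "111"]))  # True  (a prefix code)
print(is_prefix_code(["0", "01", "11"]))          # False ("0" is a prefix of "01")
```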
Huffman coding (1952)

• Huffman coding is a very popular coding method.

• Huffman codes are:


 Prefix codes.
 Optimal for a given model (set of probabilities).

• Huffman coding is based on the following two observations:


 Symbols that occur more frequently will have shorter codewords
than symbols that occur less frequently.
 The two symbols that occur least frequently will have codewords of
the same length.
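
The construction steps that follow are shown as figures in the original slides. As a hedged sketch (a standard heap-based construction, not taken from the slides), a Huffman code can be built as follows:

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for symbols 0..n-1 with the given probabilities.
    Returns a dict {symbol: codeword string}."""
    # Each heap entry: (probability, tie-breaker, list of (symbol, code-so-far)).
    heap = [(p, i, [(i, "")]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2][0][0]: "0"}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, grp1 = heapq.heappop(heap)   # least probable group
        p2, _, grp2 = heapq.heappop(heap)   # second least probable group
        # Prepend one bit: 0 to the first group, 1 to the second.
        merged = [(s, "0" + c) for s, c in grp1] + [(s, "1" + c) for s, c in grp2]
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return dict(heap[0][2])

print(huffman_code([0.5, 0.25, 0.125, 0.125]))
# {0: '0', 1: '10', 2: '110', 3: '111'}  (the 0/1 labelling may differ)
```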
Huffman coding (1952)

[Figure: step-by-step construction of the Huffman tree, obtained by repeatedly
merging the two least probable symbols; the probabilities visible in the extracted
text include 1/20, 1/20, 0.05, 0.1 and 0.05.]
Properties of Huffman Codes

• $H(S) \le l_{avg} \le H(S) + 1$
• If $p_{max} < 0.5$ then $l_{avg} \le H(S) + p_{max}$
• If $p_{max} \ge 0.5$ then $l_{avg} \le H(S) + p_{max} + 0.086$
• $H(S) = l_{avg}$ if the probabilities of the symbols are of the form $2^k$, with $k$
  a negative integer.
• For the $N$-th extension of a DMS we have $H(S) \le l_{avg} \le H(S) + \frac{1}{N}$
• The complement of a Huffman code is also a valid Huffman code.
• A minimum variance Huffman code is obtained by placing the
  combined letter in the sorted list as high as possible.
• The code efficiency is defined as $H(S)/l_{avg}$.
• The code redundancy is defined as $l_{avg} - H(S)$.
Huffman Decoding: Bit-Serial Decoding

• This method is a fixed-input bit rate but variable-output symbol rate
  scheme. It consists of the following steps:
  1. Read the input compressed stream bit by bit and traverse the tree
     until a leaf node is reached.
  2. As each bit in the input stream is used, it is discarded. When the leaf
     node is reached, the Huffman decoder outputs the symbol at the leaf
     node. This completes the decoding for this symbol.

• We repeat these steps until all of the input is consumed. Since the
codewords are not of the same length, the decoding bit rate is not the
same for all symbols. Hence, this scheme has a fixed input bit rate but a
variable output symbol rate.
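
A minimal sketch of bit-serial decoding, assuming the Huffman code is given as a symbol-to-codeword dictionary (the decoding tree is rebuilt from it first):

```python
def build_tree(code):
    """Build a binary decoding tree from {symbol: codeword}; leaves hold symbols."""
    tree = {}
    for symbol, word in code.items():
        node = tree
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol
    return tree

def decode_bit_serial(bits, code):
    """Fixed input bit rate, variable output symbol rate decoding."""
    tree = build_tree(code)
    symbols, node = [], tree
    for bit in bits:                      # read the stream bit by bit
        node = node[bit]                  # traverse the tree
        if not isinstance(node, dict):    # leaf reached: output symbol, restart at root
            symbols.append(node)
            node = tree
    return symbols

code = {"a": "0", "b": "10", "c": "110", "d": "111"}
print(decode_bit_serial("010110111", code))   # ['a', 'b', 'c', 'd']
```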
Huffman Decoding: Lookup-Table-Based Decoding

• Lookup-table-based methods have a variable input bit rate and a constant
  decoding symbol rate.
• We have to construct the so-called Lookup Table using the symbol-to-
  codeword mapping table (the Huffman code). If the longest codeword in
  this table is $L$ bits, then the lookup table will have $2^L$ rows.
• Let $c_i$ be the codeword that corresponds to symbol $s_i$, and assume that
  $c_i$ has $l_i$ bits. In this method we associate $s_i$ not with the single codeword
  $c_i$ but with $2^{L-l_i}$ words of length $L$: all the $L$-bit words whose first $l_i$ bits
  are the codeword $c_i$ and whose last $L - l_i$ bits run over all possible binary
  numbers with $L - l_i$ bits, i.e. $2^{L-l_i}$ words in total. Therefore each symbol
  $s_i$ is associated with $2^{L-l_i}$ words of fixed length $L$.
• The $2^{L-l_i}$ pairs $(s_i, \mathrm{codeword}_{ij})$, $j = 1, \ldots, 2^{L-l_i}$, are stored in the
  Lookup Table.
Huffman Decoding: Lookup-Table-Based Decoding (cont.)

• When we receive the bit stream for decoding, we read the first $L$ bits.
• By checking the Lookup Table we find the symbol $s_i$ which has the read
  $L$-bit word as one of its possible entries.
• When we find this symbol we know that the "true" codeword for that
  symbol is formed by the first $l_i$ bits only of the read $L$-bit word.
• The first $l_i$ bits are discarded from the buffer.
• The next $l_i$ bits of the stream are appended to the buffer so that the next
  $L$-bit word for investigation is formed.
• We carry on with this procedure until the entire bit stream is examined.
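
A sketch of the lookup-table decoder described above, assuming the same dictionary representation of the code as in the previous sketch:

```python
def build_lookup_table(code):
    """Table indexed by every L-bit word (L = longest codeword length).
    Each entry stores (symbol, length of the true codeword l_i)."""
    L = max(len(w) for w in code.values())
    table = {}
    for symbol, word in code.items():
        pad = L - len(word)
        for j in range(2 ** pad):              # all 2**(L - l_i) completions
            suffix = format(j, "0{}b".format(pad)) if pad else ""
            table[word + suffix] = (symbol, len(word))
    return table, L

def decode_lookup(bits, code):
    table, L = build_lookup_table(code)
    symbols, pos = [], 0
    while pos < len(bits):
        window = bits[pos:pos + L].ljust(L, "0")   # pad the tail with dummy bits
        symbol, used = table[window]
        symbols.append(symbol)
        pos += used                                # discard only the true codeword bits
    return symbols

code = {"a": "0", "b": "10", "c": "110", "d": "111"}
print(decode_lookup("010110111", code))   # ['a', 'b', 'c', 'd']
```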
Example of Huffman coding with poor performance

• Consider the following example of a Huffman code for a 3-symbol
  alphabet:

• In this case $H = 0.816$ bits/symbol.
• The average number of bits per symbol is $l_{avg} = 1.2$ bits/symbol.
• The redundancy is $l_{avg} - H = 1.2 - 0.816 = 0.384$ bits/symbol, which is
  47% of the entropy.
Huffman code of extended source
• Consider the previous example where the source is extended to blocks of
  2 symbols (the second extension of the source).
• The average number of bits per new symbol is
  $l_{avg} = 1.7516$ bits/(new symbol).
• The average number of bits per original symbol is
  $l_{avg} = 0.8758$ bits/(original symbol).
• The redundancy is
  $l_{avg} - H = 0.8758 - 0.816 = 0.06$ bits/symbol.
• This is about 7% of the entropy.
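
The probability table for this example is not reproduced in the text. As a hedged sketch, assuming the commonly used textbook probabilities {0.8, 0.18, 0.02} (which reproduce the quoted entropy of about 0.816 bits/symbol), the huffman_code() function given earlier can be used to compare the rate of the original code with that of the second extension:

```python
from itertools import product

# huffman_code() is the sketch given earlier in this document.
probs = [0.8, 0.18, 0.02]            # assumed 3-symbol source, H ~ 0.816 bits/symbol

code1 = huffman_code(probs)
rate1 = sum(len(code1[i]) * p for i, p in enumerate(probs))

# Second extension: 9 block symbols with product probabilities.
pairs = list(product(range(3), repeat=2))
probs2 = [probs[i] * probs[j] for i, j in pairs]
code2 = huffman_code(probs2)
rate2 = sum(len(code2[k]) * p for k, p in enumerate(probs2))

print(rate1)        # 1.2 bits per original symbol
print(rate2 / 2)    # bits per original symbol for the extension (much closer to H)
```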
Example of Huffman coding with poor performance

• Consider the following example of a Huffman code for a 3-symbol
  alphabet:

• In this case $H = 0.335$ bits/symbol.
• The average number of bits per symbol is $l_{avg} = 1.05$ bits/symbol.
• The redundancy is $l_{avg} - H = 1.05 - 0.335 = 0.715$ bits/symbol, which is
  213% of the entropy.
Huffman code of extended source-Poor performance also!
• Consider the previous example where the source is extended to blocks of
  2 symbols.
• The average number of bits per new symbol is
  $l_{avg} = 1.222$ bits/(new symbol).
• The average number of bits per original symbol is
  $l_{avg} = 0.611$ bits/(original symbol).
• The redundancy is
  $l_{avg} - H = 0.611 - 0.335 = 0.276$ bits/symbol.
• This is about 82% of the entropy.
• The redundancy drops to acceptable values only for $N = 8$; the alphabet
  size is then $3^8 = 6561$ symbols.
Differential Coding
• Huffman coding works well when the distribution of the symbol
  probabilities deviates from uniform.
• In the case where the symbols are image intensities, we can change the
  distribution of probabilities by replacing each pixel intensity with its
  differential.
• The differential is the difference between the intensity of the pixel of interest
  and a function of the neighboring intensities.
• The function of the neighboring intensities acts as a prediction of the pixel
  of interest.
• This method falls within the so-called predictive coding.
• For example, $f(x, y)$ can be replaced by $g(x, y) = f(x, y) - f(x, y-1)$.
• Another alternative is
  $g(x, y) = f(x, y) - \frac{1}{3}\left[ f(x, y-1) + f(x-1, y) + f(x-1, y-1) \right]$.
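
A minimal NumPy sketch of horizontal pixel-to-pixel differencing (the first predictor above), assuming an 8-bit grayscale image stored as a 2-D array; for a real photograph the difference image has a much lower empirical entropy than the original (a random stand-in image, as used below, will not show the effect):

```python
import numpy as np

def horizontal_difference(image):
    """g(x, y) = f(x, y) - f(x, y-1); the first column is kept as it is."""
    f = image.astype(np.int16)            # avoid uint8 wrap-around
    g = f.copy()
    g[:, 1:] = f[:, 1:] - f[:, :-1]
    return g                              # values now range over [-255, 255]

def empirical_entropy(values):
    """Entropy (bits/symbol) of the value histogram of an array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

image = np.random.randint(0, 256, (64, 64)).astype(np.uint8)   # stand-in image
print(empirical_entropy(image), empirical_entropy(horizontal_difference(image)))
```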
Differential Coding
• The histogram of the original image is shown in the top figure.

• The histogram of the difference image obtained by using horizontal
  pixel-to-pixel differencing is shown in the bottom figure.

• The dynamic range increases from 256 to 512.
The Lossless JPEG Standard
• We use Differential Coding to form prediction residuals.
• The residuals are then coded with either a Huffman coder or an arithmetic
  coder. We will focus on Huffman coding.
• In lossless JPEG, one forms a prediction residual using "previous" pixels
  in the current line and/or the previous line.
• If $x$ is the pixel of interest, the prediction residual is $r = y - x$, with
  $y = f(a, b, c)$ and $a, b, c$ the "previous" (top and left) neighbours, arranged
  as shown below. The allowed predictors are:
  Neighbourhood:    c  b
                    a  x

  Predictors:
  $y = 0$
  $y = a$,  $y = b$,  $y = c$
  $y = a + b - c$
  $y = a + \frac{b - c}{2}$,  $y = b + \frac{a - c}{2}$
  $y = (a + b)/2$
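
A sketch of these predictors in Python; numbering the modes 0-7 follows the usual lossless-JPEG convention, which is an assumption here (the slide lists the formulas without indices):

```python
def predict(a, b, c, mode):
    """Prediction y from the left (a), top (b) and top-left (c) neighbours."""
    return {
        0: 0,                    # no prediction
        1: a,
        2: b,
        3: c,
        4: a + b - c,
        5: a + (b - c) // 2,
        6: b + (a - c) // 2,
        7: (a + b) // 2,
    }[mode]

# Residual for the worked example further below: a=100, b=191, c=100, x=180.
a, b, c, x = 100, 191, 100, 180
y = predict(a, b, c, mode=7)     # (a + b) / 2 = 145 (integer part)
print(y, y - x)                  # 145 -35
```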
Encoding of the Prediction Residual
• The residual is expressed as a pair of symbols: the category and the
  actual value (magnitude).

• The magnitude is expressed in binary form, with the Most Significant Bit
  (MSB) always 1 if the residual is positive.

• The category represents the number of bits needed to encode the
  magnitude. This value ONLY is Huffman coded.

  Example: Assume that the residual has magnitude 42.
  $(42)_{10} = (101010)_2$ belongs to Category 6.
  Codeword: (Huffman-code-for-6) $\cup$ (binary number for 42, with MSB 1)

• If the residual is negative, then the code for the magnitude is the one's
  complement of its absolute value.

• Codewords for negative residuals therefore always start with a zero bit.

Category – Range of Prediction Residual
Encoding of the Prediction Residual

• Let $a = 100$, $b = 191$, $c = 100$ and $x = 180$, with the neighbourhood
  arranged as before ($c$ and $b$ above, $a$ to the left of $x$).

• With the predictor $y = (a + b)/2$ we get $y = 145$, so the residual is
  $r = y - x = -35$.

• $|{-35}| = (100011)_2$ needs 6 bits, so the category is 6; the one's
  complement of 100011 is 011100.

• Suppose the Huffman code for category 6 is 1110; then $-35$ is coded by
  the 10-bit codeword 1110011100.

• Without entropy coding, $-35$ would require 16 bits.
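
A hedged sketch of this category/amplitude mapping (the category Huffman table itself is not given on the slides; the codeword for category 6 below is the one quoted in the example above):

```python
def residual_bits(r):
    """(category, amplitude bits) for a lossless-JPEG style prediction residual."""
    if r == 0:
        return 0, ""                        # category 0 carries no amplitude bits
    category = abs(r).bit_length()          # number of bits needed for |r|
    if r > 0:
        bits = format(r, "b")               # positive: plain binary, MSB is 1
    else:
        # negative: one's complement of |r| on 'category' bits (starts with 0)
        bits = format((1 << category) - 1 - abs(r), "0{}b".format(category))
    return category, bits

huffman_for_category = {6: "1110"}          # assumed: only the codeword quoted above
cat, bits = residual_bits(-35)
print(cat, bits)                            # 6 011100
print(huffman_for_category[cat] + bits)     # 1110011100
```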


Lossy Compression

• Lossy compression of images deals with compression processes where
  decompression yields an imperfect reconstruction of the original
  image data.

• Regardless of the compression method that is being used, for a given level
  of image loss (or distortion) there is always a bound on the minimum bit
  rate of the compressed bit stream.

• The analysis that relates signal distortion and minimum bit rate falls within
  the so-called Rate-Distortion theory.
Lossy Compression

• The figure below demonstrates the rate-distortion relationship $R(D)$.

• For a discrete signal, zero-distortion coding is achieved at $R(0) =$ the
  source entropy. For a continuous source the rate rises without limit as the
  distortion approaches zero (the dashed curve in the figure).
Block-based Coding

• Spatial-domain block coding
  The pixels are grouped into blocks and the blocks are then compressed
  in the spatial domain.
  Example: vector quantization.

• Transform-domain block coding
  The pixels are grouped into blocks and the blocks are then transformed to
  another domain, such as the frequency domain.
  Examples: DCT, DFT, DHT, KL transform.
A Generic DCT-Based Image Coding System
DCT Based Coding Example – Low activity region

• The input block (labeled original) is taken from a low activity region; that
is, there are very small differences among pixel values in that area.
DCT Based Coding Examples

• In order to provide for uniform processing, most standard DCT coders
  require that image pixels are preprocessed so that their expected mean
  value is zero.
• After subtracting 128 from each element of the block, the 8×8 DCT of the
  shifted block is computed; a sketch of this step is given below.
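
The numerical blocks shown on the slides are not reproduced here. A minimal sketch of the level shift and the 8×8 two-dimensional DCT, assuming SciPy is available:

```python
import numpy as np
from scipy.fft import dctn, idctn

def forward_dct_block(block):
    """Level-shift an 8x8 block of 8-bit pixels and apply the 2-D DCT-II."""
    x = block.astype(np.float64) - 128.0       # zero expected mean
    return dctn(x, type=2, norm="ortho")       # orthonormal 8x8 DCT

block = np.random.randint(0, 256, (8, 8))      # stand-in for the slide's block
Y = forward_dct_block(block)
print(np.round(Y, 1))
```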
DCT Based Coding Examples

• It is the process of quantization that leads to compression in DCT-domain
  coding. Quantization of $y_{kl}$ is expressed as:

$$z_{kl} = \mathrm{round}\left(\frac{y_{kl}}{q_{kl}}\right) = \frac{y_{kl} \pm q_{kl}/2}{q_{kl}}, \qquad k, l = 0, 1, \ldots, 7$$

• $q_{kl}$ are the elements of a quantisation matrix $Q$.

• The choice of $Q$ depends on:
   Psychovisual characteristics.
   Compression ratio considerations.
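
A sketch of the quantization step. The slide's quantization matrix is not reproduced in the text; the widely published JPEG example luminance table is used below purely as a stand-in assumption:

```python
import numpy as np

# JPEG example luminance quantization table (a commonly used default),
# used here only as a stand-in for the slide's matrix Q.
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(Y, Q):
    """z_kl = round(y_kl / q_kl); most high-frequency coefficients become zero."""
    return np.round(Y / Q).astype(np.int32)

Z = quantize(Y, Q)      # Y is the DCT output from the sketch above
print(np.count_nonzero(Z), "nonzero coefficients out of 64")
```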
Quantization matrix
Quantized DCT

• The quantized DCT matrix $Z$ is given by the block shown on the slide.
• Only 11 values are needed to represent $Z$.
• A compression ratio of 5.8 is achieved.
• $Z$ is entropy coded.
Decompression

• We perform entropy decoding of the coded bit stream to get back $Z$.

• Inverse quantization on $Z$ gives $\hat{z}_{kl} = z_{kl}\, q_{kl}$.
8x8 IDCT

• The inverse DCT (IDCT) of the dequantized block gives the reconstructed
  block $\hat{X}$.

• Observe that $\hat{X} \neq X$.
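
Continuing the sketch above (same assumed names Z, Q and block; idctn was imported from scipy.fft earlier), decompression of the block looks like this:

```python
def dequantize_and_idct(Z, Q):
    """Inverse quantization followed by the inverse 2-D DCT and level un-shift."""
    Z_hat = Z * Q                                   # z_hat_kl = z_kl * q_kl
    x_hat = idctn(Z_hat.astype(np.float64), type=2, norm="ortho") + 128.0
    return np.clip(np.round(x_hat), 0, 255).astype(np.uint8)

X_hat = dequantize_and_idct(Z, Q)
print(np.abs(X_hat.astype(int) - block).max())      # reconstruction error: X_hat != X
```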
DCT Based Coding Example – High activity region

• The input block (labeled original) is taken from a high activity region; that
  is, there are substantial differences among pixel values in that area.
DCT Based Coding Example – High activity region
Huffman Coding of DC Coefficients

• Let $DC_i$ and $DC_{i-1}$ denote the DC coefficients of blocks $i$ and $i-1$.

• Due to the high correlation of DC values among adjacent blocks, JPEG
  uses differential coding for the DC coefficients.

• $(DC_i - DC_{i-1}) \in [-2047, 2047]$; this range is divided into 12 size
  categories.

• Each DC differential can be described by the pair (size, amplitude).

• From this pair of values, only the first (size) is Huffman coded.
Example

• The DC differential has an amplitude of 195.

• $195 = (11000011)_2$, so size $= 8$.

• Thus, 195 is described by the pair (8, 11000011).

• If the Huffman codeword for size $= 8$ is 111110, then 195 is coded as
  11111011000011.

• Similarly, $-195$ would be coded as 11111000111100 (the amplitude bits
  are the one's complement 00111100).

• Huffman decoding is straightforward.
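
The residual_bits() sketch given earlier applies directly to DC differentials. A hedged example that reproduces the numbers above (only the size-8 codeword is known from the slide; the rest of the size table is assumed unavailable):

```python
dc_size_codes = {8: "111110"}               # assumed: only the codeword quoted above

def encode_dc(diff):
    """(size, amplitude) coding of a DC differential; only 'size' is Huffman coded."""
    size, amplitude = residual_bits(diff)   # residual_bits() defined earlier
    return dc_size_codes[size] + amplitude

print(encode_dc(195))    # 11111011000011
print(encode_dc(-195))   # 11111000111100
```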


Generic diagram of Huffman coding in baseline JPEG
Huffman Coding of AC Coefficients

• After quantization, most of the AC coefficients will be zero; thus, only the
  nonzero AC coefficients need to be coded.

• $AC \in [-1023, 1023]$; this range is divided into 10 size categories.

• Each nonzero AC coefficient can be described by the pair
  (run/size, amplitude), where "run" is the number of zero coefficients that
  precede it.

• From this pair of values, only the first (run/size) is Huffman coded.

• "Size" is what we previously called the category.

Example

• Assume an AC coefficient is preceded by six zeros and has a value of
  $-18$.

• $-18$ falls into category (size) 5.

• The one's complement of $|-18| = (10010)_2$ is 01101.

• Hence, this coefficient is represented by (6/5, 01101).

• The pair (6/5) is Huffman coded, and the 5-bit value of $-18$ is
  appended to that code.

• If the Huffman codeword for (6/5) is 1101, then the codeword for six zeros
  followed by $-18$ is 110101101.
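
A sketch of run/size coding of a single nonzero AC coefficient, again reusing the residual_bits() helper from earlier; only the (6/5) codeword quoted above is known, the rest of the AC table is assumed unavailable:

```python
ac_runsize_codes = {(6, 5): "1101"}         # assumed: only the codeword quoted above

def encode_ac(run, value):
    """(run/size, amplitude) coding of one nonzero AC coefficient."""
    size, amplitude = residual_bits(value)  # residual_bits() defined earlier
    return ac_runsize_codes[(run, size)] + amplitude

print(encode_ac(6, -18))   # 110101101
```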
Special cases

• A run-length value larger than 15 cannot be represented directly.
   In that case, JPEG uses the symbol (15/0) to denote a run of 15 zeros
    followed by a zero, i.e. 16 zeros in total.
   Such symbols can be cascaded as needed; however, the codeword
    for the last AC coefficient must have a nonzero amplitude.

• If, after a nonzero AC value, all the remaining coefficients are zero, then
  the special symbol (0/0) denotes an end of block (EOB).
Conventional and zig-zag ordering

[Figure: the conventional (row-by-row) ordering and the zig-zag ordering of the
coefficients of an 8×8 block.]
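
A minimal sketch that generates the zig-zag scan order of an 8×8 block by sorting the indices along anti-diagonals (one common way of producing it; not taken from the slides):

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of the zig-zag scan of an n x n block."""
    indices = [(r, c) for r in range(n) for c in range(n)]
    # Visit the coefficients anti-diagonal by anti-diagonal (r + c constant);
    # even diagonals are scanned towards the top-right, odd ones towards the
    # bottom-left.
    return sorted(indices,
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[1] if (rc[0] + rc[1]) % 2 == 0 else -rc[1]))

print(zigzag_order()[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```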


A coding example: DC

• Assume that the values of a quantized DCT matrix are those given on the
  slide.

• If the DC value of the previous block is 40, then $DC_i - DC_{i-1} = 2$.
• This can be expressed as the (size, amplitude) pair (2, 2), i.e. size 2 and
  amplitude $2 = (10)_2$.
• If the Huffman codeword for size 2 is 011, then the codeword for the DC
  value is 01110.
A coding example (cont.): AC

• In total, 82 bits are required to encode the AC coefficients.
• 5 bits are required to encode the DC coefficient.
• The average bit rate is $\frac{87}{64} \approx 1.36$ bits per pixel.
• The compression ratio is $\frac{8}{1.36} \approx 5.88$.
Compression efficiency of entropy coding in JPEG

• The DC and AC coefficients are treated separately. This is motivated by
  the fact that the statistics of the DC and AC coefficients are quite
  dissimilar.

• Many of the AC coefficients within a block will be zero-valued.

• Values of the DC differentials range between $-2047$ and $2047$, and the
  AC coefficients range between $-1023$ and $1023$.

• Direct Huffman coding of these values would require code tables with
  4095 and 2047 entries, respectively!

• By Huffman coding only the size or the (run/size) information, the sizes of
  these tables are reduced to 12 and 162 entries, respectively!
