Image Compression
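JPEG ENCODER
[Figure: JPEG encoder block diagram. Each color component (Y, Cb, or Cr) is processed in 8x8 blocks by the FDCT and then by the quantizer, which uses a quantization table. The DC coefficient goes through difference encoding and Huffman coding; the AC coefficients go through zig-zag reordering and Huffman coding. Each Huffman coder uses a Huffman table, and the outputs form the JPEG bit-stream.]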
JPEG DECODER
2-D DCT
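Images are two-dimensional, so a 2-D DCT is required.
Two series of 1-D transforms, one applied row-wise and one applied column-wise, result in a 2-D transform, as shown in the figure below.
[Figure: the 8x8 block f(i, j) passes through a row-wise 1-D transform and a column-wise 1-D transform, giving the 8x8 coefficient block F(u, v).]
F(0, 0) is called the DC component; the remaining 63 coefficients are called AC components.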
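The row-wise and column-wise decomposition can be sketched in a few lines. This is a minimal illustration, assuming NumPy and the orthonormal DCT-II scaling; the helper names `dct_matrix`, `dct2` and `idct2` are made up for this example and are not part of any JPEG library.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal 1-D DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # the DC row has a different scale factor
    return c

def dct2(block):
    """2-D DCT built from two series of 1-D DCTs."""
    c = dct_matrix(block.shape[0])
    rows_done = block @ c.T           # 1-D DCT of every row
    return c @ rows_done              # 1-D DCT of every column

def idct2(coeffs):
    """Inverse 2-D DCT (C is orthonormal, so its inverse is its transpose)."""
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c

flat = np.full((8, 8), 10.0)          # a constant block has only a DC component
F = dct2(flat)
```

For the constant block, all of the energy ends up in F[0, 0] and every AC coefficient is zero up to rounding error.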
Quantization
Quantization reduces the number of bits per sample:
F_q(u,v) = round(F(u,v) / q(u,v))
Quantization error is the main source of loss in lossy compression.
1. Uniform Quantization: q(u,v) is a constant.
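A small numeric sketch of uniform quantization, assuming NumPy; the coefficient values and the step size q = 16 are made up for illustration. Note that `np.round` rounds exact halves to the nearest even integer.

```python
import numpy as np

def quantize(F, q):
    """F_q(u,v) = round(F(u,v) / q(u,v)); q may be a scalar or an array."""
    return np.round(F / q).astype(int)

def dequantize(Fq, q):
    """Decoder side: scale the integer indices back up (the rounding error stays)."""
    return Fq * q

F = np.array([[100.0, -37.0],
              [12.0, 3.0]])       # made-up DCT coefficients
q = 16                            # uniform quantization: one constant step size
Fq = quantize(F, q)               # [[6, -2], [1, 0]]
F_hat = dequantize(Fq, q)         # [[96, -32], [16, 0]]
max_error = np.abs(F - F_hat).max()
```

The reconstruction error per coefficient is at most q/2, which is the loss the slide refers to.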
Zig-Zag Scan
Groups the low-frequency coefficients at the top of the vector and the high-frequency coefficients at the bottom.
Maps the 8 x 8 matrix to a 1 x 64 vector.
[Figure: zig-zag scan path mapping an 8x8 block to a 1x64 vector.]
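One compact way to generate the zig-zag order is to sort the block positions by anti-diagonal and alternate the direction on every other diagonal. The sketch below is plain Python; `zigzag_indices` and `zigzag` are hypothetical helper names.

```python
def zigzag_indices(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        d = r + c                      # index of the anti-diagonal
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag(block):
    """Flatten a square block (list of rows) into a vector, low frequencies first."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]

# Label every position with r*8 + c so the scan order is easy to read off.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
vector = zigzag(block)
```

The first entries of `vector` are 0, 1, 8, 16, 9, 2, 3, 10, and the last entry is 63.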
DPCM on DC Components
The DC component value in each 8x8 block is large and varies across
blocks, but is often close to that in the previous block.
Differential Pulse Code Modulation (DPCM): encode the difference
between the DC value of the current 8x8 block and that of the previous
block. Smaller numbers need fewer bits.
[Figure: DC values at the head of successive 1x64 block vectors, and the same vectors after each DC value has been replaced by its difference from the previous block.]
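A minimal sketch of DPCM on the DC terms. The DC values are illustrative, and the predictor for the first block is assumed to be 0.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous block's DC value."""
    diffs, prev = [], 0            # assumption: the predictor starts at 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def dpcm_decode(diffs):
    """Rebuild the DC values with a running sum."""
    values, prev = [], 0
    for d in diffs:
        prev += d
        values.append(prev)
    return values

dc = [45, 54, 48, 32]              # illustrative DC values of four blocks
diffs = dpcm_encode(dc)            # [45, 9, -6, -16]
```

After the first block, the differences are smaller in magnitude than the original values, so they need fewer bits.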
RLE on AC Components
The 1x64 vectors have a lot of zeros in them, more so towards the end of the
vector.
Entries further along the vector capture higher-frequency DCT components, which
tend to carry less of the image content.
Many of these zeros are a result of quantization with the quantization table.
Encode a series of 0s as a (skip, value) pair, where skip is the number of zeros
and value is the next non-zero component.
Send (0,0) as end-of-block sentinel value.
[Figure: example 1x64 vector in which a run of five zeros followed by 1 is coded as (5,1) and a run of seven zeros followed by 2 is coded as (7,2).]
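The (skip, value) scheme can be sketched as below. It follows the simplified description on this slide; baseline JPEG additionally limits a run to 15 zeros and uses a special symbol for longer runs, which this sketch leaves out. The function names are hypothetical.

```python
def rle_encode_ac(ac):
    """Encode the AC coefficients as (skip, value) pairs, ending with (0, 0)."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1                   # extend the current run of zeros
        else:
            pairs.append((run, v))     # zeros skipped, then the nonzero value
            run = 0
    pairs.append((0, 0))               # end-of-block sentinel; trailing zeros are implied
    return pairs

def rle_decode_ac(pairs, length=63):
    """Invert rle_encode_ac for a vector of the given length."""
    ac = []
    for run, v in pairs:
        if (run, v) == (0, 0):
            break
        ac.extend([0] * run)
        ac.append(v)
    return ac + [0] * (length - len(ac))

# Five zeros, 1, seven zeros, 2, then zeros to the end of the 63 AC entries.
ac = [0] * 5 + [1] + [0] * 7 + [2] + [0] * 49
pairs = rle_encode_ac(ac)              # [(5, 1), (7, 2), (0, 0)]
```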
Value            Code
0                ---
-1, 1            0, 1
-3, -2, 2, 3     00, 01, 10, 11
...
SIZE: the number of bits needed to code the value of the next nonzero AC component (0 to A, in hexadecimal).
S2: (Value)
Run/SIZE   Code Length   Code
0/0        4             1010
0/1        2             00
0/2        2             01
0/3        3             100
0/4        4             1011
0/5        5             11010
0/6        7             1111000
0/7        8             11111000
0/8        10            1111110110
0/9        16            1111111110000010
0/A        16            1111111110000011
1/1        4             1100
1/2        5             11011
1/3        7             1111001
1/4        9             111110110
1/5        11            11111110110
1/6        16            1111111110000100
1/7        16            1111111110000101
1/8        16            1111111110000110
1/9        16            1111111110000111
1/A        16            1111111110001000
...        (more such rows, up to 15/A)
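Putting the two tables together: a (run, value) pair is sent as the Huffman code of its Run/SIZE symbol followed by SIZE extra bits that identify the value. The sketch below copies only a few rows of the Run/SIZE table above into a dictionary named `AC_CODES` (a name chosen for this example); a real encoder would hold the full table.

```python
# A few Run/SIZE -> Huffman code entries copied from the table above.
AC_CODES = {
    (0, 0): "1010", (0, 1): "00", (0, 2): "01", (0, 3): "100", (0, 4): "1011",
    (1, 1): "1100", (1, 2): "11011", (1, 3): "1111001",
}

def size_of(value):
    """SIZE: number of bits needed to represent the magnitude of value."""
    return abs(value).bit_length()

def value_bits(value):
    """The SIZE extra bits: -1 -> 0, 1 -> 1, -3 -> 00, -2 -> 01, 2 -> 10, 3 -> 11."""
    size = size_of(value)
    if size == 0:
        return ""                      # value 0 carries no extra bits
    if value < 0:
        value += (1 << size) - 1       # map negatives onto the lower half of the codes
    return format(value, "0{}b".format(size))

def encode_pair(run, value):
    """Huffman code for the Run/SIZE symbol, then the value bits."""
    return AC_CODES[(run, size_of(value))] + value_bits(value)

bits = encode_pair(1, -2)              # Run/SIZE 1/2 -> "11011", value -2 -> "01"
```

Here `bits` is "1101101", and the end-of-block pair (0, 0) encodes to "1010" with no value bits.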
JPEG Modes
Sequential Mode:
Each image is encoded in a single left-to-right, top-to-bottom scan.
The technique we have been discussing so far is an example of such a mode,
also referred to as the Baseline Sequential Mode.
It supports only 8-bit images as opposed to 12-bit images as described
before.
Lossless Mode:
Truly lossless
It is a predictive coding mechanism, as opposed to the baseline mechanism, which
is based on the DCT and quantization (the source of the loss).
Progressive Mode: It allows a coarse version of an image to be
transmitted at a low rate, which is then progressively improved
over subsequent transmissions.
Results
SNR
JPEG2000
JPEG 2000 (JP2) is an image compression standard and
coding system.
It was created by the Joint Photographic Experts Group
committee in 2000 with the intention of superseding their
original discrete cosine transform-based JPEG standard
(created in 1992) with a newly designed, wavelet-based
method.
JPEG 2000 code streams offer several mechanisms to support
spatial random access or region-of-interest access at varying
degrees of granularity.
It is possible to store different parts of the same picture
at different quality.
[Figure: generic image codec. Encoder: Source Image Data -> Forward Transform -> Quantization -> Entropy Encoding -> Compressed Image Data (store or transmit). Decoder: Compressed Image Data -> Entropy Decoding -> Dequantization -> Inverse Transform -> Reconstructed Image Data.]
[Figure: JPEG2000 codec. Encoder: Image -> Multi-Component Transform -> Discrete Wavelet Transform -> Quantization -> Tier-1 Encoder -> Tier-2 Encoder -> Coded Stream, with an ROI Mask block feeding the encoder. Decoder: Coded Stream -> Tier-2 Decoder -> Tier-1 Decoder -> Dequantization -> Inverse Wavelet Transform -> Inverse Multi-Component Transform -> Reconstructed Image.]
JPEG2000 Encoder
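[Figure: JPEG2000 encoder data flow. Each image component is divided into tiles; the DWT splits each tile into subbands; each subband is divided into code blocks; BPC produces context and data for each code block; BAC turns these into compressed data; layer formation assembles the bit stream.]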
Multi-Component Transform
Part 1 of the standard allows a color transform on the first
three components.
Irreversible Color Transform (ICT)
The ICT is nothing more than the classic RGB-to-YCbCr color space
transform.
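A sketch of the ICT as a 3x3 matrix multiply, assuming NumPy and the usual RGB-to-YCbCr coefficients. The function names are illustrative, and any level shifting applied before the transform is left out.

```python
import numpy as np

# Rows give Y, Cb and Cr as weighted sums of R, G and B.
ICT = np.array([
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
])

def rgb_to_ycbcr(rgb):
    """Forward ICT on an array whose last axis holds (R, G, B)."""
    return rgb @ ICT.T

def ycbcr_to_rgb(ycc):
    """Inverse ICT, here obtained by inverting the forward matrix numerically."""
    return ycc @ np.linalg.inv(ICT).T

gray = np.array([128.0, 128.0, 128.0])
y, cb, cr = rgb_to_ycbcr(gray)     # a gray pixel has no chroma: Cb and Cr are ~0
```

The transform is called irreversible because it works in floating point, so a round trip is exact only up to rounding error.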
Wavelet Transform
Floating point 9/7 wavelet filter for lossy compression
Best performance at low bit rate
High implementation complexity, especially for hardware
Integer 5/3 wavelet filter for lossless coding
Integer arithmetic, low implementation complexity
Each row and column is filtered with a high-pass and a low-pass filter.
In JPEG2000, the transform is applied in multiple stages.
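To make the integer 5/3 filter concrete, here is a one-dimensional lifting sketch in plain Python. It assumes an even-length signal and symmetric extension at the borders; `fwd53` and `inv53` are hypothetical names. A 2-D stage would apply it to every row and then every column.

```python
def fwd53(x):
    """One stage of the reversible 5/3 transform: returns (low-pass, high-pass)."""
    assert len(x) % 2 == 0 and len(x) >= 2
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: high-pass = odd sample minus the floor-average of its even neighbors.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update step: low-pass = even sample plus a rounded quarter of the nearby details.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d

def inv53(s, d):
    """Undo fwd53 exactly by running the lifting steps in reverse order."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x

signal = [3, 7, 1, 8, 2, 9, 4, 6]      # made-up sample values
low, high = fwd53(signal)
restored = inv53(low, high)
```

Because every step uses only integer additions and shifts, `restored` equals `signal` exactly, which is what makes this filter usable for lossless coding.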
Quantization
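The wavelet coefficients are quantized using a uniform
quantizer with deadzone. For each subband b, a basic
quantizer step size $\Delta_b$ is used to quantize all
the coefficients in that subband according to:
$$q = \operatorname{sign}(y) \left\lfloor \frac{|y|}{\Delta_b} \right\rfloor$$
where y is a wavelet coefficient in subband b and q is its quantized value.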
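A numeric sketch of the deadzone quantizer, assuming NumPy. The step size, the sample coefficients and the reconstruction offset `r = 0.5` are illustrative choices, not values taken from the slides.

```python
import numpy as np

def deadzone_quantize(y, delta):
    """q = sign(y) * floor(|y| / delta)."""
    return (np.sign(y) * np.floor(np.abs(y) / delta)).astype(int)

def deadzone_dequantize(q, delta, r=0.5):
    """Reconstruct inside the decoded interval; r = 0.5 (the midpoint) is an assumption."""
    return np.sign(q) * (np.abs(q) + r) * delta

y = np.array([-7.9, -2.0, -1.9, 0.0, 1.9, 2.0, 7.9])   # made-up coefficients
q = deadzone_quantize(y, delta=2.0)                    # [-3, -1, 0, 0, 0, 1, 3]
y_hat = deadzone_dequantize(q, delta=2.0)
```

Every coefficient with magnitude below the step size maps to 0, so the zero bin is twice as wide as the other bins. That wider bin is the deadzone.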
Code-blocks are then coded one bit-plane at a time, from the most
significant bit-plane to the least significant bit-plane.
For each bit-plane, there are 3 coding passes, similar to those in
EZW or SPIHT:
3 Passes Scanning
1. Significance Propagation Pass:
. If a bit is insignificant (=0) but at least one of its eight
neighbors is significant (=1), then it is encoded.
. If that bit is a 1, its significance flag is
set to 1 and the sign of the symbol is encoded.
2. Magnitude Refinement Pass:
. It codes samples which are already significant and were not coded in the
significance propagation pass.
3. Clean-up Pass:
. It codes all bits which were passed over by the previous
two coding passes (insignificant bits). It is the first pass
for the MSB plane.
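The way samples are divided among the three passes can be illustrated with a simplified sketch. It only labels which pass would code each sample in each bit-plane. It assumes a plain raster scan (the real coder uses a stripe-oriented scan) and leaves out sign coding, context modeling and the arithmetic coder. The name `classify_passes` and the sample magnitudes are made up.

```python
def classify_passes(mags):
    """For each bit-plane (MSB first), label every sample with the pass (1, 2 or 3) that codes it."""
    rows, cols = len(mags), len(mags[0])
    nplanes = max(max(row) for row in mags).bit_length()
    significant = [[False] * cols for _ in range(rows)]

    def has_significant_neighbor(r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols and significant[rr][cc]:
                    return True
        return False

    labels = []
    for p in range(nplanes - 1, -1, -1):
        plane = [[0] * cols for _ in range(rows)]      # 0 = not coded yet in this plane
        # Pass 1: insignificant samples with at least one significant neighbor.
        for r in range(rows):
            for c in range(cols):
                if not significant[r][c] and has_significant_neighbor(r, c):
                    plane[r][c] = 1
                    if (mags[r][c] >> p) & 1:
                        significant[r][c] = True       # a 1 bit makes the sample significant
        # Pass 2: significant samples that pass 1 did not code.
        for r in range(rows):
            for c in range(cols):
                if significant[r][c] and plane[r][c] == 0:
                    plane[r][c] = 2
        # Pass 3: everything that is still uncoded.
        for r in range(rows):
            for c in range(cols):
                if plane[r][c] == 0:
                    plane[r][c] = 3
                    if (mags[r][c] >> p) & 1:
                        significant[r][c] = True
        labels.append(plane)
    return labels

labels = classify_passes([[5, 1], [0, 2]])             # made-up coefficient magnitudes
```

In the most significant plane nothing is significant yet, so every sample falls to the clean-up pass. In later planes, the neighbors of significant samples move into pass 1 and the significant samples themselves into pass 2.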
Rate Control
Rate control is the process by which the code-stream is
altered so that a target bit rate can be reached.
Once the entire image has been compressed, a postprocessing operation passes over all the compressed
blocks and determines the extent to which each block's
embedded bit stream should be truncated in order to
achieve the target bit rate.
The ideal truncation strategy is one that minimizes
distortion while still reaching the target bit rate.
The code-blocks are compressed independently, so any
bit stream truncation policy can be used.
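One possible truncation policy, shown only as a sketch, trades distortion against rate with a Lagrange multiplier and searches for the multiplier that just meets the budget. The (rate, distortion) pairs below are invented, each block is assumed to have a zero-rate truncation point, and the function names are hypothetical. The slides do not say that this is the policy a particular encoder uses.

```python
def pick_truncations(blocks, lam):
    """For each block, pick the truncation point that minimizes distortion + lam * rate."""
    return [min(range(len(pts)), key=lambda i: pts[i][1] + lam * pts[i][0])
            for pts in blocks]

def total_rate(blocks, choice):
    return sum(pts[i][0] for pts, i in zip(blocks, choice))

def rate_control(blocks, budget, iters=60):
    """Bisect on lam until the chosen truncation points fit the rate budget."""
    choice = pick_truncations(blocks, 0.0)     # lam = 0: minimum distortion, ignore rate
    if total_rate(blocks, choice) <= budget:
        return choice
    lo, hi = 0.0, 1e6                          # assumption: lam = 1e6 forces the zero-rate points
    for _ in range(iters):
        mid = (lo + hi) / 2
        if total_rate(blocks, pick_truncations(blocks, mid)) <= budget:
            hi = mid                           # feasible: try a smaller penalty on rate
        else:
            lo = mid
    return pick_truncations(blocks, hi)

# Each block lists (rate, distortion) at its candidate truncation points.
blocks = [
    [(0, 100), (10, 40), (20, 10)],            # extra bits help this block a lot
    [(0, 50), (10, 45), (20, 44)],             # extra bits barely help this block
]
choice = rate_control(blocks, budget=20)       # [2, 0]
```

With a budget of 20, the whole budget goes to the first block, where each bit removes the most distortion. This works block by block only because the code-blocks are compressed independently.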
Region-of-interest (ROI)
A ROI is a part of an image that is encoded with higher
quality than the rest of the image (the background). The
encoding is done in such a way that the information
associated with the ROI precedes the information
associated with the background.
Bit-stream parsing
A combination of spatial and quality scalability.
It is possible to progress by spatial scalability to a given (resolution) level and then change the progression by SNR at a higher level.
Resolution scalability
Quality scalability
Examples