Basics of Video Compression
Neena Raj N. R.
Department of Computer Science and Engineering
Mar Baselios College of Engineering and Technology, Nalanchira
Syllabus
Module 3: Video Compression
Basics of Video Compression: Analog Video and Digital Video,
Motion Compensation, MPEG-1 standard and Video Syntax, MPEG-1
Pel Reconstruction, MPEG-4 standard, Functionalities for MPEG-4
Neena Raj N. R. CS1U43D DCT 2 / 52
Course Outcomes
CO1 (Understand): Describe the fundamental principles of data compression.
CO2 (Apply): Make use of statistical and dictionary-based compression techniques for various applications.
CO3 (Apply): Illustrate various image compression standards.
CO4 (Understand): Summarize video compression mechanisms to reduce the redundancy in video.
CO5 (Understand): Use the fundamental properties of digital audio to compress audio data.
Video
Video is an electronic medium for the recording, copying, playback,
broadcasting, and display of moving visual media.
Analog Video
An analog video camera converts the image it “sees” through its lens
to an electric voltage (a signal) that varies with time according to the
intensity and color of the light emitted from the different image parts.
Such a signal is called analog, since it is analogous (proportional) to
the light intensity.
Fig. (a) CRT Operation. (b) Persistence.
Fig. (c) Odd Scan Lines. (d) Even Scan Lines.
The signal instructs the hardware to turn the beam off, move it to the
top-left corner of the screen, turn it on, and sweep a horizontal line
on the screen.
While the beam is swept horizontally along the top scan line, the
analog signal is used to adjust the beam’s intensity according to the
image parts being displayed.
At the end of the first scan line, the signal instructs the television
hardware to turn the beam off, move it back and slightly down, to the
start of the third (not the second) scan line, turn it on, and sweep
that line.
Moving the beam to the start of the next scan line is known as a
retrace.
The time it takes to retrace is the horizontal blanking time.
This way, one field of the picture is created on the screen line by line,
using just the odd-numbered scan lines.
At the end of the last line, the signal contains instructions for a frame
retrace (the vertical retrace).
This turns the beam off and moves it to the start of the next field
(the second scan line) to scan the field of even-numbered scan lines.
The time it takes to do the vertical retrace is the vertical blanking
time.
The picture is therefore created in two fields that together make a
frame.
The picture is said to be interlaced.
Composite Video
The common television receiver found in many homes receives from
the transmitter a composite signal, where the luminance and
chrominance components are multiplexed.
This type of signal was designed in the early 1950s, when color was
added to television transmissions.
The basic black-and-white signal becomes the luminance (Y)
component, and two chrominance components C1 and C2 are added.
Those can be U and V, Cb and Cr, I and Q, or any other
chrominance components.
Fig. Main components of a transmitter and a receiver using a composite signal
The main point is that only one signal is needed.
If the signal is sent on the air, only one frequency is needed. If it is
sent on a cable, only one cable is used.
Composite video is cheap but has problems such as cross-luminance
and cross-chrominance artifacts in the displayed image.
Component Video
Component video is an analog video signal that has been split into
two or more component channels.
In popular use, it refers to a type of component analog video (CAV)
information that is transmitted or stored as three separate signals.
Component video can be contrasted with composite video in which all
the video information is combined into a single signal that is used in
analog television.
Like composite video cables, component video cables do not carry audio
and are often paired with audio cables.
Component video requires more bandwidth and good synchronization of the
three components.
Fig. Main components of a transmitter and a receiver using a component signal
Digital Video
Digital video is the case where the original image is generated, in the
camera, in the form of pixels.
An analog image seems to have infinite resolution, whereas a digital
image has a fixed, finite resolution that cannot be increased without
loss of image quality.
Digital video has the following advantages:
1. It can be easily edited. This makes it possible to produce special
effects.
2. It can be stored on any digital medium, such as hard disks, removable
cartridges, CD-ROMs, or DVDs.
3. It can be compressed. This allows for more storage (when video is
stored on a digital medium) and also for fast transmission.
Digital video is, in principle, a sequence of images, called frames,
displayed at a certain frame rate (so many frames per second, or fps)
to create the illusion of animation.
This rate, as well as the image size and pixel depth, depends heavily
on the application.
Surveillance cameras, for example, use the very low frame rate of five
fps, while HDTV displays 25 fps.
Most video applications also involve sound. It is part of the overall
video data and has to be compressed with the video image.
A few video applications do not include sound. Three
common examples are: (1) a surveillance camera, (2) an old, silent
movie being restored and converted from film to video, and (3) a
video presentation taken underwater.
A complete piece of video is sometimes called a presentation.
It consists of a number of acts, where each act is broken down into
several scenes.
A scene is made of several shots or sequences of action, each a
succession of frames, where there is a small change in scene and
camera position between consecutive frames.
The hierarchy is thus
piece → act → scene → sequence → frame
Video Compression
Video compression is based on two principles.
The first is the spatial redundancy that exists in each frame.
The second is the fact that most of the time, a video frame is very
similar to its immediate neighbors. This is called temporal
redundancy.
A typical technique for video compression should therefore start by
encoding the first frame using a still image compression method.
It should then encode each successive frame by identifying the
differences between the frame and its predecessor, and encoding these
differences.
If the frame is very different from its predecessor (as happens with the
first frame of a shot), it should be coded independently of any other
frame.
In the video compression literature, a frame that is coded using its
predecessor is called an inter frame (or just inter), while a frame that is
coded independently is called an intra frame (or just intra).
Video compression is normally lossy.
Encoding a frame Fi in terms of its predecessor Fi−1 introduces some
distortions.
As a result, encoding frame Fi+1 in terms of Fi increases the
distortion.
Even in lossless video compression, a frame may lose some bits, for
example through transmission errors.
If a frame Fi has lost some bits, then all the frames following it, up to
the next intra frame, are decoded improperly, perhaps even leading to
accumulated errors.
This is why intra frames should be used from time to time inside a
sequence, not just at its beginning.
An intra frame is labeled I, and an inter frame is labeled P (for
predictive).
An inter frame can be coded based on one of its predecessors and also
on one of its successors.
A frame that is encoded based on both past and future frames is
labeled B (for bidirectional).
We usually don’t mind if the encoder is slow, but the decoder has to
be fast.
A typical case is video recorded on a hard disk or on a DVD, to be
played back.
The encoder can take minutes or hours to encode the data.
The decoder, however, has to play it back at the correct frame rate
(so many frames per second), so it has to be fast.
This is why a typical video decoder works in parallel.
It has several decoding circuits working simultaneously on several
frames.
An I frame is decoded independently of any other frame.
A P frame is decoded using the preceding I or P frame.
A B frame is decoded using the preceding and following I or P frames.
In the figure, the frame labeled 2 should be displayed after frame 5,
so each frame should have two time stamps, its coding time and its
display time.
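The figure is not reproduced in this text, so the following is only a minimal sketch of why coding time and display time differ. It assumes a hypothetical display-order pattern given as a string of frame types (for example "IBBBP"); the function name coding_order is an illustration, not part of any standard. Each B frame is held back until the I or P frame that follows it in display order has been emitted, because the decoder needs that later frame first.

```python
def coding_order(display_types):
    """Return 1-based display indices in the order the frames must be coded.

    display_types has one letter per frame in display order, e.g. "IBBBP".
    A B frame depends on the next I or P frame, so that reference frame is
    emitted first and the waiting B frames follow it.
    """
    order = []
    pending_b = []
    for index, frame_type in enumerate(display_types, start=1):
        if frame_type == "B":
            pending_b.append(index)
        else:  # I or P: a reference frame
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B frames that have no later reference
    return order


if __name__ == "__main__":
    for coding_time, display_time in enumerate(coding_order("IBBBP"), start=1):
        print(f"coding time {coding_time}: display time {display_time}")
```

With the assumed pattern "IBBBP", the frame coded second is the P frame, which is displayed fifth, after the three B frames that were coded after it.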
A few intuitive video compression methods
Subsampling
The encoder selects every other frame and writes it on the
compressed stream.
This yields a compression factor of 2.
The decoder inputs a frame and duplicates it to create two frames.
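Subsampling can be sketched in a few lines. This is an illustration only: frames are represented as arbitrary Python objects, and the original_count argument is an assumption added so that a sequence with an odd number of frames is restored to its original length.

```python
def subsample_encode(frames):
    """Keep every other frame (indices 0, 2, 4, ...)."""
    return frames[::2]


def subsample_decode(kept, original_count):
    """Duplicate each kept frame, then trim to the original frame count."""
    restored = []
    for frame in kept:
        restored.append(frame)
        restored.append(frame)
    return restored[:original_count]


if __name__ == "__main__":
    frames = ["f0", "f1", "f2", "f3", "f4"]
    kept = subsample_encode(frames)
    print(kept)
    print(subsample_decode(kept, len(frames)))
```

The decoded sequence has the right length, but every skipped frame is replaced by a copy of its predecessor, which is where the loss comes from.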
Differencing
A frame is compared to its predecessor.
If the difference between them is small (just a few pixels), the encoder
encodes the pixels that are different by writing three numbers on the
compressed stream for each pixel: its image coordinates, and the
difference between the values of the pixel in the two frames.
If the difference between the frames is large, the current frame is
written on the output in raw format.
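A minimal sketch of differencing, assuming grayscale frames stored as lists of rows of integers. The max_changes parameter is a hypothetical stand-in for "just a few pixels"; the text does not specify how the limit is chosen.

```python
def encode_frame(prev, cur, max_changes=4):
    """Encode cur relative to prev as (row, column, difference) triples.

    Falls back to a raw copy of the frame when too many pixels changed.
    """
    changes = [
        (r, c, cur[r][c] - prev[r][c])
        for r in range(len(cur))
        for c in range(len(cur[0]))
        if cur[r][c] != prev[r][c]
    ]
    if len(changes) <= max_changes:
        return ("diff", changes)
    return ("raw", [row[:] for row in cur])


def decode_frame(prev, record):
    """Rebuild a frame from its predecessor and an encoded record."""
    kind, payload = record
    if kind == "raw":
        return [row[:] for row in payload]
    frame = [row[:] for row in prev]
    for r, c, delta in payload:
        frame[r][c] += delta
    return frame


if __name__ == "__main__":
    prev = [[10, 10, 10], [10, 10, 10]]
    cur = [[10, 12, 10], [10, 10, 7]]
    record = encode_frame(prev, cur)
    print(record)
    print(decode_frame(prev, record) == cur)
```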
Block Differencing
This is a further improvement of differencing.
The image is divided into blocks of pixels, and each block B in the
current frame is compared to the corresponding block P in the
preceding frame.
If the blocks differ by more than a certain amount, then B is
compressed by writing its image coordinates, followed by the values of
all its pixels (expressed as differences) on the compressed stream.
The advantage is that the block coordinates are small numbers
(smaller than a pixel’s coordinates), and these coordinates have to be
written just once for the entire block.
On the downside, the values of all the pixels in the block, even those
that haven’t changed, have to be written on the output.
However, since these values are expressed as differences, they are
small numbers.
Consequently, this method is sensitive to the block size.
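Block differencing might be sketched as follows. The sketch assumes square blocks, frame dimensions that are multiples of the block size, and the sum of absolute differences as the "certain amount" test; all three are illustrative choices rather than requirements stated above.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose upper-left pixel is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def encode_blocks(prev, cur, size=2, threshold=0):
    """Return (block_row, block_col, differences) for each changed block."""
    records = []
    for top in range(0, len(cur), size):
        for left in range(0, len(cur[0]), size):
            b = get_block(cur, top, left, size)
            p = get_block(prev, top, left, size)
            diffs = [[bv - pv for bv, pv in zip(br, pr)] for br, pr in zip(b, p)]
            total = sum(abs(d) for row in diffs for d in row)
            if total > threshold:
                # Block coordinates are written once for the whole block.
                records.append((top // size, left // size, diffs))
    return records


def decode_blocks(prev, records, size=2):
    """Apply the stored block differences to a copy of the previous frame."""
    frame = [row[:] for row in prev]
    for block_row, block_col, diffs in records:
        for i, row in enumerate(diffs):
            for j, d in enumerate(row):
                frame[block_row * size + i][block_col * size + j] += d
    return frame


if __name__ == "__main__":
    prev = [[0] * 4 for _ in range(4)]
    cur = [row[:] for row in prev]
    cur[2][3] = 5
    records = encode_blocks(prev, cur)
    print(records)
    print(decode_blocks(prev, records) == cur)
```

Only one pixel changed in the demo, yet all four differences of its block are written, three of them zero. With a threshold above zero, blocks with small changes are skipped entirely, so the reconstruction becomes lossy.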
Motion Compensation
Anyone who has watched movies knows that the difference between
consecutive frames is small because it is the result of moving the
scene, the camera, or both between frames.
This feature can therefore be exploited to get better compression.
If the encoder discovers that a part P of the preceding frame has been
rigidly moved to a different location in the current frame, then P can
be compressed by writing the following three items on the compressed
stream: its previous location, its current location, and information
identifying the boundaries of P.
In principle, such a part can have any shape.
In practice, we are limited to equal-size blocks (normally square but
can also be rectangular).
The encoder scans the current frame block by block.
For each block B it searches the preceding frame for an identical
block C (if compression is to be lossless) or for a similar one (if it can
be lossy).
Finding such a block, the encoder writes the difference between its
past and present locations on the output. This difference is of the
form
$(C_x - B_x,\ C_y - B_y) = (\Delta x, \Delta y)$
so it is called a motion vector.
Motion compensation is effective if objects are just translated, not
scaled or rotated.
Drastic changes in illumination from frame to frame also reduce the
effectiveness of this method.
In general, motion compensation is lossy.
Main aspects of motion compensation
Frame Segmentation
The current frame is divided into equal-size nonoverlapping blocks.
The blocks may be square or rectangular.
The block size is important, since large blocks reduce the chance of
finding a match, and small blocks result in many motion vectors.
In practice, block sizes that are integer powers of 2, such as 8 or 16,
are used, since this simplifies the software.
Search Threshold
Each block B in the current frame is first compared to its counterpart
C in the preceding frame.
If they are identical, or if the difference between them is less than a
preset threshold, the encoder assumes that the block hasn’t been
moved.
Block Search
This is a time-consuming process, and so has to be carefully designed.
If B is the current block in the current frame, then the previous frame
has to be searched for a block identical to or very close to B.
The search is normally restricted to a small area (called the search area)
around B, defined by the maximum displacement parameters dx and dy.
These parameters specify the maximum horizontal and vertical
distances, in pixels, between B and any matching block in the previous
frame.
If B is a square with side b, the search area will contain (b + 2dx)(b +
2dy) pixels and will consist of (2dx + 1)(2dy + 1) distinct, overlapping
b×b squares.
The number of candidate blocks in this area is therefore proportional to
dx·dy.
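An exhaustive (full) block search over this area can be sketched as below. The sketch assumes grayscale frames stored as lists of rows, uses the mean absolute difference as the matching cost, and skips candidates that would fall outside the frame; the function and parameter names are illustrative. The returned vector follows the convention (Cx − Bx, Cy − By).

```python
def mean_abs_diff(cur, prev, top, left, c_top, c_left, b):
    """Mean absolute difference between block B (in cur) and candidate C (in prev)."""
    total = 0
    for i in range(b):
        for j in range(b):
            total += abs(cur[top + i][left + j] - prev[c_top + i][c_left + j])
    return total / (b * b)


def full_search(cur, prev, top, left, b, dx, dy):
    """Examine every candidate in the search area.

    Returns (motion_vector, cost, candidates_examined), where the motion
    vector is (delta_x, delta_y) from B to the best-matching candidate C.
    """
    height, width = len(prev), len(prev[0])
    best_vector, best_cost, examined = None, None, 0
    for delta_y in range(-dy, dy + 1):
        for delta_x in range(-dx, dx + 1):
            c_top, c_left = top + delta_y, left + delta_x
            if c_top < 0 or c_left < 0 or c_top + b > height or c_left + b > width:
                continue  # candidate block would leave the frame
            examined += 1
            cost = mean_abs_diff(cur, prev, top, left, c_top, c_left, b)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (delta_x, delta_y), cost
    return best_vector, best_cost, examined


def make_frame(size, patch_top, patch_left):
    """Build a black test frame containing one 2 x 2 patch (demo helper)."""
    frame = [[0] * size for _ in range(size)]
    patch = [[50, 60], [70, 80]]
    for i in range(2):
        for j in range(2):
            frame[patch_top + i][patch_left + j] = patch[i][j]
    return frame


if __name__ == "__main__":
    prev = make_frame(8, 3, 2)  # patch at row 3, column 2
    cur = make_frame(8, 3, 4)   # same patch moved two pixels to the right
    print(full_search(cur, prev, top=3, left=4, b=2, dx=2, dy=2))
```

In the demo, b = 2 and dx = dy = 2, so the search examines (2·2 + 1)(2·2 + 1) = 25 candidates and finds the patch two pixels to the left in the previous frame, giving the vector (-2, 0).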
Distortion Measure
This is the most sensitive part of the encoder.
The distortion measure selects the best match for block B.
Mean absolute difference (or mean absolute error) calculates the
average of the absolute differences between a pixel $B_{ij}$ in B and its
counterpart $C_{ij}$ in a candidate block C:
\[ \frac{1}{b^2} \sum_{i=1}^{b} \sum_{j=1}^{b} |B_{ij} - C_{ij}| \]
Mean square difference is a similar measure, where the square, rather
than the absolute value, of a pixel difference is calculated:
\[ \frac{1}{b^2} \sum_{i=1}^{b} \sum_{j=1}^{b} (B_{ij} - C_{ij})^2 \]
The pel difference classification (PDC) measure counts how many
differences $|B_{ij} - C_{ij}|$ are smaller than the PDC parameter p.
The integral projection measure computes the sum of a row of B and
subtracts it from the sum of the corresponding row of C. The absolute
value of the difference is added to the absolute value of the difference
of the column sums:
\[ \sum_{i=1}^{b} \left| \sum_{j=1}^{b} B_{ij} - \sum_{j=1}^{b} C_{ij} \right| + \sum_{j=1}^{b} \left| \sum_{i=1}^{b} B_{ij} - \sum_{i=1}^{b} C_{ij} \right| \]
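The four distortion measures can be written directly from their definitions. This is a minimal sketch assuming B and C are b×b blocks stored as lists of rows; the function names are illustrative.

```python
def mean_absolute_difference(B, C):
    """Average of |B_ij - C_ij| over the block."""
    b = len(B)
    return sum(abs(B[i][j] - C[i][j]) for i in range(b) for j in range(b)) / (b * b)


def mean_square_difference(B, C):
    """Average of (B_ij - C_ij)^2 over the block."""
    b = len(B)
    return sum((B[i][j] - C[i][j]) ** 2 for i in range(b) for j in range(b)) / (b * b)


def pel_difference_classification(B, C, p):
    """Number of pixel pairs whose absolute difference is smaller than p."""
    b = len(B)
    return sum(1 for i in range(b) for j in range(b) if abs(B[i][j] - C[i][j]) < p)


def integral_projection(B, C):
    """Sum of absolute row-sum differences plus absolute column-sum differences."""
    b = len(B)
    rows = sum(abs(sum(B[i]) - sum(C[i])) for i in range(b))
    cols = sum(
        abs(sum(B[i][j] for i in range(b)) - sum(C[i][j] for i in range(b)))
        for j in range(b)
    )
    return rows + cols


if __name__ == "__main__":
    B = [[1, 2], [3, 4]]
    C = [[1, 4], [0, 4]]
    print(mean_absolute_difference(B, C))
    print(mean_square_difference(B, C))
    print(pel_difference_classification(B, C, p=3))
    print(integral_projection(B, C))
```

Note that PDC counts matching pixels, so a larger value indicates a better match, whereas for the other three measures a smaller value is better.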
Suboptimal Search Methods
These methods search some, instead of all, the candidate blocks in the
(b + 2dx)(b + 2dy) area.
They speed up the search for a matching block, at the expense of
compression efficiency.
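As one illustration of the idea (not a method named above), the sketch below first examines candidates on a coarse grid and then refines the search around the best coarse candidate. The step parameter, the sum of absolute differences as the cost, and all names are assumptions made for this example.

```python
def block_cost(cur, prev, top, left, c_top, c_left, b):
    """Sum of absolute differences between block B and candidate C."""
    return sum(
        abs(cur[top + i][left + j] - prev[c_top + i][c_left + j])
        for i in range(b)
        for j in range(b)
    )


def coarse_to_fine_search(cur, prev, top, left, b, dx, dy, step=2):
    """Search a coarse grid of displacements, then refine near the best one.

    Returns (motion_vector, cost, candidates_examined).
    """
    height, width = len(prev), len(prev[0])
    costs = {}  # (delta_x, delta_y) -> cost of every candidate examined

    def evaluate(delta_x, delta_y):
        c_top, c_left = top + delta_y, left + delta_x
        inside = 0 <= c_top <= height - b and 0 <= c_left <= width - b
        in_range = abs(delta_x) <= dx and abs(delta_y) <= dy
        if inside and in_range and (delta_x, delta_y) not in costs:
            costs[(delta_x, delta_y)] = block_cost(cur, prev, top, left, c_top, c_left, b)

    evaluate(0, 0)  # the counterpart block is always a valid candidate
    for delta_y in range(-dy, dy + 1, step):  # stage 1: coarse grid
        for delta_x in range(-dx, dx + 1, step):
            evaluate(delta_x, delta_y)
    best = min(costs, key=costs.get)
    for delta_y in range(best[1] - step + 1, best[1] + step):  # stage 2: refine
        for delta_x in range(best[0] - step + 1, best[0] + step):
            evaluate(delta_x, delta_y)
    best = min(costs, key=costs.get)
    return best, costs[best], len(costs)


if __name__ == "__main__":
    prev = [[0] * 8 for _ in range(8)]
    cur = [[0] * 8 for _ in range(8)]
    for i, row in enumerate([[50, 60], [70, 80]]):
        for j, value in enumerate(row):
            prev[3 + i][2 + j] = value  # patch at column 2 in the previous frame
            cur[3 + i][4 + j] = value   # same patch at column 4 in the current frame
    print(coarse_to_fine_search(cur, prev, top=3, left=4, b=2, dx=2, dy=2))
```

In this demo the sketch examines 14 candidates instead of the 25 in the full search area. The saving comes with a risk: if the true match lies far from the best coarse candidate, the refinement stage never sees it, which is the loss of compression efficiency mentioned above.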
Motion Vector Correction:
Once a block C has been selected as the best match for B, a motion
vector is calculated as the difference between the upper-left corner of C
and that of B.
Regardless of how the matching was determined, the motion vector
may be wrong because of noise, local minima in the frame, or because
the matching algorithm is not ideal.
It is possible to apply smoothing techniques to the motion vectors after
they have been calculated, in an attempt to improve the matching.
Spatial correlations in the image suggest that the motion vectors
should also be correlated.
If certain vectors are found to violate this, they can be corrected.
This step is costly and may even backfire.
A video presentation may involve slow, smooth motion of most objects,
but also swift, jerky motion of some small objects.
Correcting motion vectors may interfere with the motion vectors of
such objects and cause distortions in the compressed frames.
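The text does not name a specific smoothing technique. As one possible illustration, the sketch below replaces a motion vector by the component-wise median of its neighbors when it deviates from that median by more than a tolerance. The median filter, the tolerance parameter, and the function name are all assumptions made for this example.

```python
from statistics import median_low


def smooth_vectors(field, tolerance=2):
    """Replace outlier vectors by the median of their neighboring vectors.

    field is a 2D list of (x, y) motion vectors, one per block.
    """
    rows, cols = len(field), len(field[0])
    result = [row[:] for row in field]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                field[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            med_x = median_low([v[0] for v in neighbors])
            med_y = median_low([v[1] for v in neighbors])
            vx, vy = field[r][c]
            if abs(vx - med_x) > tolerance or abs(vy - med_y) > tolerance:
                result[r][c] = (med_x, med_y)  # treat the vector as an error
    return result


if __name__ == "__main__":
    field = [[(1, 0)] * 3 for _ in range(3)]
    field[1][1] = (7, -5)  # either a bad match or a small, fast object
    print(smooth_vectors(field))
```

The demo also shows how the correction can backfire: the filter cannot tell whether the center vector is a matching error or the true motion of a small, fast object, and it overwrites the vector in both cases.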
Coding Motion Vectors:
Two properties of motion vectors help in encoding them: (1) They are
correlated and (2) their distribution is nonuniform.
As we scan the frame block by block, adjacent blocks normally have
motion vectors that don’t differ by much; they are correlated.
The vectors also don’t point in all directions.
There are usually one or two preferred directions in which all or most
motion vectors point; the vectors are thus nonuniformly distributed.
No single method has proved ideal for encoding the motion vectors.
Arithmetic coding, adaptive Huffman coding, and various prefix codes
have been tried, and all seem to perform well.
Two other methods are:
1. Predict a motion vector based on its predecessors in the same row and
its predecessors in the same column of the current frame. Calculate the
difference between the prediction and the actual vector, and Huffman
encode it. This method is important. It is used in MPEG and other
compression methods.
2. Group the motion vectors in blocks. If all the vectors in a block are
identical, the block is encoded by encoding this vector. Other blocks
are encoded as in method 1 above. Each encoded block starts with a code
identifying its type.
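The prediction step of the first method can be sketched as follows. The predictor used here (the integer average of the left and upper neighbors, falling back to whichever exists, or to (0, 0) for the first vector) is an assumption for illustration; the text does not fix a particular predictor, and the Huffman coding of the residuals is left out.

```python
def predict(field, r, c):
    """Predict vector (r, c) from already-coded neighbors in its row and column."""
    candidates = []
    if c > 0:
        candidates.append(field[r][c - 1])  # predecessor in the same row
    if r > 0:
        candidates.append(field[r - 1][c])  # predecessor in the same column
    if not candidates:
        return (0, 0)
    n = len(candidates)
    return (sum(v[0] for v in candidates) // n, sum(v[1] for v in candidates) // n)


def to_residuals(field):
    """Replace each vector by its difference from the prediction."""
    residuals = []
    for r, row in enumerate(field):
        out_row = []
        for c, (vx, vy) in enumerate(row):
            px, py = predict(field, r, c)
            out_row.append((vx - px, vy - py))
        residuals.append(out_row)
    return residuals


def from_residuals(residuals):
    """Rebuild the vector field in raster order, as a decoder would."""
    field = [[None] * len(row) for row in residuals]
    for r, row in enumerate(residuals):
        for c, (dx, dy) in enumerate(row):
            px, py = predict(field, r, c)
            field[r][c] = (px + dx, py + dy)
    return field


if __name__ == "__main__":
    field = [[(2, 1), (2, 1), (3, 1)], [(2, 1), (2, 1), (3, 2)]]
    residuals = to_residuals(field)
    print(residuals)
    print(from_residuals(residuals) == field)
```

Because neighboring vectors are correlated, most residuals in the demo are (0, 0) or close to it, which is the skewed distribution that a Huffman code can exploit.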
Coding the Prediction Error
Motion compensation is lossy, since a block B is normally matched to a
somewhat different block C.
Compression can be improved by coding the difference between the
current uncompressed and compressed frames on a block-by-block basis,
and only for blocks that differ significantly.
This is usually done by transform coding.
The difference is written on the output, following each frame, and is
used by the decoder to improve the frame after it has been decoded.
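A minimal sketch of the prediction-error step for a single block, assuming the sum of absolute differences decides whether a block "differs much". The threshold parameter and function names are hypothetical, and the transform coding of the residual is omitted.

```python
def prediction_error(block_b, block_c, threshold):
    """Return the pixel-wise residual B - C, or None if the blocks are close."""
    residual = [
        [bv - cv for bv, cv in zip(row_b, row_c)]
        for row_b, row_c in zip(block_b, block_c)
    ]
    energy = sum(abs(v) for row in residual for v in row)
    return residual if energy > threshold else None


def apply_correction(block_c, residual):
    """Decoder side: add the residual (if any) to the motion-compensated block."""
    if residual is None:
        return [row[:] for row in block_c]
    return [
        [cv + rv for cv, rv in zip(row_c, row_r)]
        for row_c, row_r in zip(block_c, residual)
    ]


if __name__ == "__main__":
    B = [[10, 12], [14, 16]]  # actual block in the current frame
    C = [[10, 11], [15, 16]]  # matched block from the previous frame
    residual = prediction_error(B, C, threshold=1)
    print(residual)
    print(apply_correction(C, residual) == B)
```

When the residual is stored exactly, as here, the decoder recovers B. In a real codec the residual is transform coded and quantized, so the correction only reduces the error.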