CS 4763 Fundamentals of Multimedia Systems
- Introduction to Image Processing
Qi Tian
Computer Science Department
University of Texas at San Antonio
[email protected]
http://www.cs.utsa.edu/~qitian/
Image Processing
Manipulation of multidimensional signals
image (photo): f(x, y)
video: f(x, y, t)
CT, MRI: f(x, y, z, t)
Fluid flow: v(x, y, z, t)
A Typical Image Processing System
Imaging systems: X-ray, radar imaging, infrared imaging, ultrasound imaging, medical imaging, geophysical imaging
[Block diagram: object -> imaging systems (observe) -> sample and quantize, A/D (digitize) -> digital storage, disk (store) -> digital computer (process) -> on-line buffer (refresh/store) -> display (output) -> record]
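The "sample and quantize" stage of this system can be illustrated with a minimal sketch. It assumes a hypothetical continuous intensity function `scene(x, y)` with values in [0, 1], samples it on a regular grid over the unit square, and quantizes each sample uniformly to 8 bits. The function names, the grid size, and the test pattern are illustrative assumptions, not part of the course material.

```python
import math

def scene(x, y):
    """Hypothetical continuous image f(x, y) with values in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)

def sample_and_quantize(f, width, height, bits=8):
    """Sample f at pixel centres of a width x height grid and quantize uniformly."""
    levels = 2 ** bits
    image = []
    for row in range(height):
        y = (row + 0.5) / height
        line = []
        for col in range(width):
            x = (col + 0.5) / width
            value = f(x, y)
            # Map [0, 1] to an integer code in [0, levels - 1].
            code = max(0, min(levels - 1, int(value * levels)))
            line.append(code)
        image.append(line)
    return image

digital = sample_and_quantize(scene, width=4, height=4)
for line in digital:
    print(line)
```

Increasing `width` and `height` refines the sampling; increasing `bits` refines the quantization.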
Fundamentals of Image Processing
Representation
From acquisition, digitization, and display to the mathematical
characterization of images for subsequent processing
A prerequisite for efficient processing techniques such as
enhancement, filtering, and restoration
Processing Techniques
Image compression, image restoration, and image reconstruction
Statistical image processing techniques
Communications
Multimedia Processing Techniques
Coding/compression
Storage and communications
- JPEG, JPEG2000
- MPEG-1 (CD, MP3), MPEG-2 (HDTV, DVD)
- H.261, H.263
Enhancement, restoration, reconstruction
- feature extraction for image analysis and visual information display
- removal of degradation in an image: LS, ML, max entropy, MAP
- 2D -> 3D image: MRI, CT, Radon transform
Analysis, detection, recognition, understanding
quantitative measurements from an image to produce a description of it
Visualization
Advanced Processing Techniques
Statistical processing techniques
Hidden Markov model (HMM)
Probabilistic graphical models
Bayesian networks
Markov random field
Many applications to speech recognition, pattern classification, data
compression, and channel coding, etc.
History of Image/Video Coding
[Timeline, 1950 to 2000+]
Signal processing based: PCM, DPCM, transform coding, VQ, subband coding, wavelets
Based on math, PR, CV, CG: fractal coding, 3-D model based coding
Reference:
F. Nebeker, Signal Processing: The Emergence of a Discipline,
1948-1998, IEEE History Center, 1998
Broadband TV (NTSC)
500 × 500 × 8 × 3 × 30 bits/sec = 180 Mb/sec, on the order of 100 Mb/sec (compression is necessary!)
Modem: 56 Kb/sec
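The arithmetic behind this slide can be checked with a short sketch. It assumes the five factors are frame width, frame height, bits per sample, color channels, and frames per second, and that 1 Kb means 1000 bits; the function name is illustrative. The product evaluates to 1.8 × 10^8 bits/sec, the same order of magnitude as the 100 Mb/sec figure quoted above.

```python
def raw_bit_rate(width, height, bits_per_sample, channels, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_sample * channels * frames_per_second

video_bps = raw_bit_rate(500, 500, 8, 3, 30)
modem_bps = 56_000  # 56 Kb/sec modem, assuming 1 Kb = 1000 bits

# Compression ratio needed to send the raw video over the modem in real time.
ratio = video_bps / modem_bps

print(video_bps)     # 180000000
print(round(ratio))  # 3214
```

A reduction of more than three orders of magnitude is required, which is why compression is unavoidable at modem rates.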
Picture Element
Pixel: West coast people at USC
Pel: East coast people at MIT
Image/Video Compression
Signal-Processing Based:
[Block diagram. Encoder: f(x, y), H, signal processing, representation g(x, y). Decoder: g(x, y) back to f(x, y).]
Image/Video Compression
3D Model-Based:
[Block diagram. Encoder: f(x, y) -> analysis (H, model) -> model parameters P; representation: P. Decoder: P -> 3D model -> f(x, y).]
Image/Video Compression
Fractal-Based:
Encoder: find a system S for which f(x, y) is an attractor.
Representation: S
Decoder: starting from any signal, iterate S; the iteration converges to f(x, y).
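The attractor idea can be sketched on a one-dimensional toy signal. This is a minimal illustration under stated assumptions, not a practical fractal coder: the system S here is a hypothetical contractive map (a cyclic shift scaled by 0.5 plus a fixed offset) built so that the target signal is its fixed point. The offset is as large as the signal itself, so nothing is compressed; real fractal coders search for a compact S. All names are illustrative.

```python
def make_system(target, scale=0.5):
    """Build a contractive map S whose fixed point (attractor) is `target`.

    S(x)[i] = scale * x[(i + 1) % n] + offset[i], with |scale| < 1.
    """
    n = len(target)
    shifted = [target[(i + 1) % n] for i in range(n)]
    # Choose the offset so that S(target) == target.
    offset = [t - scale * s for t, s in zip(target, shifted)]

    def S(x):
        return [scale * x[(i + 1) % n] + offset[i] for i in range(n)]

    return S

def decode(S, start, iterations=60):
    """Iterate S from an arbitrary starting signal."""
    x = list(start)
    for _ in range(iterations):
        x = S(x)
    return x

target = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
S = make_system(target)

# Two very different starting signals converge to the same attractor.
from_zeros = decode(S, [0.0] * len(target))
from_hundreds = decode(S, [100.0] * len(target))

error_zeros = max(abs(a - b) for a, b in zip(from_zeros, target))
error_hundreds = max(abs(a - b) for a, b in zip(from_hundreds, target))
print(error_zeros < 1e-9, error_hundreds < 1e-9)
```

Because each iteration halves the distance to the fixed point, the starting signal is irrelevant after enough iterations, which is why the decoder can begin from any signal.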
Image/Video Compression
Standards
Facsimile:
Fax Group 1, 2, 3, 4
JBIG (Joint Bi-level Image Experts Group)
Images:
JPEG (http://www.jpeg.org/)
JPEG2000: based on wavelet transform
Video:
H.261, H.263: P × 64 Kb/s (P = 1 ~ 30)
MPEG 1: 1.2 Mb/s video, CD, MP3
MPEG 2: 1.2 ~ 20 Mb/s, sports, HDTV, DVD
MPEG 4: 1 kb/s ~ 1 Mb/s, very low speed video coding, multimedia
MPEG 7: multimedia description, audio/video
MPEG 21: multimedia framework
Lena
A de facto standard test image for the past three decades, owing to its rich texture
What are Challenging Problems in
Multimedia Processing?
Multimedia Processing is taken in a broad sense,
including:
Image/Video compression, enhancement, restoration,
reconstruction, analysis, recognition, understanding,
visualization, and synthesis/animation.
Examples
Face modeling, detection, and recognition
Emotion recognition
Gesture recognition
Gender/age/ethnicity recognition
Audio-visual speech recognition
Image/video superresolution
Image/video browsing, indexing, and retrieval
Biometrics
Face Related Research
Face modeling
Face detection
Face recognition
Facial expression recognition
Generic Face Model
Face model
Texture mapping
Morphing
Generic Face Model
The generic face model is generated from an MRI data set
Customize a Generic Face Model to an Individual
Polygon Mesh: 2240 Vertices + 3946 Triangles.
Non-Uniform Rational B-Splines (NURBS): 63 control points.
Avatar talking head
The iFACE system in a distributed collaborative environment. (a)
Avatar in the head mounted display, (b) avatar in the desk screen of
MIC3E, (c) avatar in the main screen of MIC3E
University of Illinois at Urbana-Champaign
Text-Driven Face Animation
"We strive to make the meter on animation production, and are always looking for new technology that will enable faster, more appealing character creation," said Joel Kransove, Digital Director of Nickelodeon. (Source: Digital Producer)
Speech-Driven Face Animation
"Game characters have become synthetic actors and dialogue is an essential element of the effect we create. The quality of the lip-synching can make or break the sense of reality," said Scott Cronce, vice president and CTO at Electronic Arts. (Source: Gamepro)
Video-Driven Face Animation
Emotion Recognition
Face Detection Techniques
Face Recognition: Why is it easy?
Face Recognition: Why is it hard?
Beauty Check
What Are the Causes and Consequences of
Human Facial Attractiveness?
Babyfaceness
Symmetry
Social perception
University of Regensburg, Germany
Which is more attractive?
University of Regensburg, Germany
Babyfaceness
Large head
Large curved forehead
Facial elements (eyes, nose, mouth) located relatively low
Large, round eyes
Small, short nose
Round cheeks
Small chin
[Images: 4-year-old girl; Kate Moss]
Include mature female features: high, prominent cheekbones and concave cheeks
Which one is cuter?
Miss Germany (2002)
A selection of the 22 contestants of the final round of the contest
Real vs. Virtual Miss Germany
Image Analysis
Texture synthesis and transfer
Image Super-resolution
Image Repairs
Illumination/Lighting changes and transfer
Texture Synthesis and Transfer
synthesis
+
transfer
Efros & Freeman, MIT, SIGGRAPH 2001
Texture Synthesis and Transfer
Image Superresolution
Intelligent guess about details of texture
[Image panels: true; sub-sampled; Gaussian filter; bicubic interpolation; median filter; Wiener filter; dynamic resolution enhancement (Amos Storkey)]
Image Repairs
[Figure panels: original image; segmentation; curve connection; image synthesis based on tensor voting; result]
Illumination Effects on Images
Relighting Basic Algorithm
Step 1: Align image with generic 3D face model
Step 2: Approximate radiance environment map
Step 3: Synthesize novel appearance by adjusting
the 9 spherical harmonic coefficients
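Step 3 can be sketched as follows. This is a minimal illustration, assuming the standard real spherical harmonic basis up to second order (9 functions of the unit surface normal) and a simple per-pixel ratio between new and old lighting; the function names, the coefficient values, and the ratio-based update are assumptions for illustration, not the exact method on the slides.

```python
def sh9(normal):
    """First 9 real spherical harmonic basis values for a unit normal (x, y, z)."""
    x, y, z = normal
    return [
        0.282095,                        # l = 0
        0.488603 * y,                    # l = 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                # l = 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]

def radiance(coefficients, normal):
    """Lighting at a surface point: linear combination of the 9 basis values."""
    return sum(c * b for c, b in zip(coefficients, sh9(normal)))

def relight_pixel(intensity, normal, old_coefficients, new_coefficients):
    """Scale a pixel by the ratio of new to old lighting at its normal."""
    return intensity * radiance(new_coefficients, normal) / radiance(old_coefficients, normal)

# Hypothetical lighting: ambient term plus a directional term along z.
old_light = [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
new_light = [1.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

facing_camera = (0.0, 0.0, 1.0)
print(relight_pixel(0.8, facing_camera, old_light, new_light))
```

The per-pixel normals would come from the aligned 3D face model of Step 1, and the old coefficients from the approximation in Step 2.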
Lighting Transfer
[Figure panels: input; target; results]
Image/Video Retrieval
Image database
CBIR based on color, texture,
shape/structure
MARS: Multimedia Analysis and Retrieval System
[System diagram components: feature extraction (C/C++): color, texture, structure; user interface (Visual C++); memory; metadata; feature weighting; similarity ranking]
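The feature extraction, feature weighting, and similarity ranking stages can be sketched with a single color feature. This is a minimal illustration, not the MARS implementation: it assumes a coarse joint RGB histogram as the color feature, histogram intersection as the similarity measure, and a hypothetical `weights` dictionary for per-feature weighting. All names and the tiny in-memory database are illustrative.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of (r, g, b) pixels with values 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def rank(query_features, database, weights):
    """Return image names sorted by weighted similarity to the query, best first."""
    scored = []
    for name, features in database.items():
        score = sum(
            weight * histogram_intersection(query_features[feature], features[feature])
            for feature, weight in weights.items()
        )
        scored.append((score, name))
    return [name for score, name in sorted(scored, reverse=True)]

# Hypothetical database of two tiny images.
database = {
    "red": {"color": color_histogram([(250, 10, 10)] * 4)},
    "blue": {"color": color_histogram([(10, 10, 250)] * 4)},
}
query = {"color": color_histogram([(240, 20, 20)] * 4)}

ranking = rank(query, database, weights={"color": 1.0})
print(ranking)  # ['red', 'blue']
```

Texture and structure features would be added as further entries in each feature dictionary, with their own weights.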
State-of-the-art CBIR Systems
QBIC (IBM), PhotoBook (Media Lab), Netra (UCSB),
VisualSeek (Columbia), PicHunter (NEC-NJ), Amore (NEC-CA), El Niño (Praja), MARS (UIUC), Virage (Virage Inc.),
CORE, PictoSeek, Piction, InfoScope
Research Communities
Computer Vision, Image/Video Processing, Library and
Information Science, Database and Management Systems
Leading Journals & Standard
PAMI, ACM Multimedia, IJCV, CVIU
MPEG-7
MARS using global features
Biometrics
Security Threats:
We now live in a global society of increasingly desperate and dangerous
people whom we can no longer trust based on identification documents,
which may have been compromised.
A challenging Pattern Recognition Problem
Enabling technology to make our society safer,
reduce fraud and offer user convenience.
Too many passwords to remember
Identification Problems
Identity Theft: Identity thieves steal PIN (e.g., date of birth) to open
credit card accounts, withdraw money from accounts, and take out loans
3.3 million identity thefts in U.S. in 2002; 6.7 million victims of credit card fraud
Surrogate representations of identity such as passwords
and ID cards no longer suffice
Biometrics
Automatic recognition of people based on their
distinctive anatomical (e.g., face, fingerprint, iris,
retina, hand geometry) and behavioral (e.g.,
signature, gait) characteristics.
Person identification is now an integral part of the
infrastructure needed for diverse business sectors
such as banking, border control, and law enforcement.
Biometric Applications
There are ~500 million border
crossings per year (each way) in the US
Want to charge it?
Biometric Characteristics
Biometric Market Growth
International Biometric Group
State-of-the-art Error Rate
False accept rate (FAR): proportion of impostors accepted
False reject rate (FRR): proportion of genuine users rejected
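Both error rates can be computed from match scores once a decision threshold is fixed. The sketch below assumes a hypothetical matcher that outputs higher scores for better matches and accepts a claim when the score is at or above the threshold; the scores are made-up illustration data, not measured results.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a matcher that accepts when score >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr

# Made-up scores for illustration only.
genuine = [0.91, 0.85, 0.78, 0.60, 0.95]
impostor = [0.10, 0.35, 0.62, 0.20, 0.05]

strict = error_rates(genuine, impostor, threshold=0.7)
lenient = error_rates(genuine, impostor, threshold=0.5)
print(strict)   # (0.0, 0.2)
print(lenient)  # (0.2, 0.0)
```

Raising the threshold lowers FAR at the cost of a higher FRR, and lowering it does the opposite, so the two rates are always reported together for a given operating point.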
Multibiometrics
Soft Biometrics
Privacy Concerns
Tracking