CS294-127: Computational Imaging Lecture #2
University of California, Berkeley Monday, 29 August 2016
Imaging Basics and Image Formation
Lecture #2: Monday, 29 August 2016
Lecturer: Ren Ng
Scribe: Pratul Srinivasan
Reviewer: Ben Mildenhall
1 Introduction
In this lecture, we reviewed the basic principles of image formation from the perspective
of geometric optics, and considered the effects of important imaging system parameters
for photography. In a camera, light enters the lens (or pinhole) and forms an image on
the sensor, which accumulates irradiance during the exposure length, as controlled by
the shutter.
2 Pinhole Image Formation
The history of the pinhole camera can be traced all the way back to the 4th century BC,
to the writings of the Chinese philosopher Mo Tzu, who described the idea of a camera
obscura (Latin for "dark room"), in which an inverted image of the scene could be viewed
on the back wall of the room.
In an ideal pinhole camera (this neglects light capture efficiency and diffraction, as
discussed in Section 5), only a single ray from every point within the field of view hits
the sensor, so every point appears in focus. This is visualized below in Fig. 1.
3 Focal Length, Sensor Size, and Field of View
As shown in Fig. 1, the focal length is the distance along the optical axis from the pinhole
location to the sensor plane. The field of view is defined as the angular extent of rays
that enter the imaging system and intersect the sensor. It is clear from Fig. 1 that
the equation for the field of view of a pinhole camera is:
\[ \mathrm{FOV} = 2 \arctan\left(\frac{h}{2f}\right) \tag{1} \]
where h is the sensor width and f is the focal length. From this equation, we can see
that for a fixed sensor size, decreasing the focal length increases the field of view, and
for a fixed focal length, increasing the sensor size increases the field of view.
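As a quick numerical check of Eq. (1), here is a minimal Python sketch; the 36 mm sensor width and the focal lengths are example values, not from the lecture:

import math

def fov_degrees(sensor_width_mm, focal_length_mm):
    # Field of view of a pinhole camera, Eq. (1).
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

# Horizontal FOV for a 36 mm wide (full-frame) sensor at a few focal lengths.
for f in [24, 50, 100]:
    print(f"f = {f} mm: FOV = {fov_degrees(36, f):.1f} degrees")

# Output: roughly 73.7, 39.6, and 20.4 degrees, confirming that shorter focal
# lengths give wider fields of view.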
Figure 1: A pinhole camera imaging a chicken.
4 Perspective Composition
In a typical photography setting, since our sensor size is fixed, we must choose both the
focal length and the distance from the camera to the scene. These choices can have a
dramatic effect on the composition of the resulting photos, as illustrated in Fig. 2. For
example, if we want to keep the size of the subject in the photograph constant, we can
place the camera close to the subject and use a short focal length, resulting in a wide
field of view; place the camera far from the subject and use a long focal length, resulting
in a narrow field of view; or choose anything in between.
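To make this trade-off concrete, here is a minimal sketch (the subject height, desired image size, and focal lengths are made-up example values): in the pinhole model, the subject's height on the sensor is h_i = f h_o / d, so keeping h_i fixed forces the subject distance d to grow in proportion to the focal length f.

# Keeping the subject the same size on the sensor while changing focal length.
# Pinhole model: image height h_i = f * h_o / d for a subject at distance d.
h_o = 1.8    # subject height in meters (example value)
h_i = 0.024  # desired image height in meters (fills a 24 mm tall sensor)

for f_mm in [24, 50, 100, 200]:
    f = f_mm / 1000.0
    d = f * h_o / h_i  # distance needed to keep the subject the same size
    print(f"f = {f_mm:3d} mm -> stand about {d:.1f} m from the subject")

# The required distance doubles every time the focal length doubles, which is
# exactly the camera-position / focal-length trade-off shown in Fig. 2.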
5 Imaging with Lenses
Although images can be formed with pinhole cameras, they suffer from a few disadvan-
tages that motivate the use of lenses instead of pinholes. To create a sharp image, the
pinhole must be very small. However, diffraction effects also decrease the sharpness of
the image when the pinhole is very small. Additionally, a small pinhole lets in very little
light at any given time, so the exposure must be lengthened. Furthermore, since pinhole
images are in focus across the whole field of view, the photographer does not have con-
trol over depth of field effects that they might want to use to draw attention to specific
subjects in a photo.
For these reasons, most imaging systems use lenses to refract and redirect light rays.
Figure 2: Illustration of perspective composition. Changing the focal length and the
distance from the camera to the subject can result in dramatically different fields of view
while keeping the subject size in the photograph constant. In the pictured sequence (from
Canon EF Lens Work III), the distance from the subject increases with the focal length
to maintain the image size of the human subject; notice the dramatic change in background
perspective.
Light rays bend towards the surface normal when they enter materials with higher
refractive indices, and this is described by Snell’s Law:
\[ n_i \sin(\theta_i) = n_t \sin(\theta_t) \tag{2} \]
where ni and nt are the refractive indices of the materials on either side of the interface,
and θi and θt are the angles from the normal vectors on either side of the interface.
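For example (a minimal sketch; the refractive indices are standard textbook values for air and glass, not from the lecture), Eq. (2) can be solved directly for the refracted angle:

import math

def refracted_angle_deg(n_i, n_t, theta_i_deg):
    # Solve Snell's Law n_i sin(theta_i) = n_t sin(theta_t) for theta_t.
    s = n_i * math.sin(math.radians(theta_i_deg)) / n_t
    return math.degrees(math.asin(s))  # assumes no total internal reflection

# A ray entering glass (n ~ 1.5) from air (n ~ 1.0) at 30 degrees bends toward
# the surface normal, emerging at about 19.5 degrees.
print(refracted_angle_deg(1.0, 1.5, 30.0))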
While real lenses have aberrations that prevent parallel light rays from converging
to a single point, we will for now focus on ideal thin lenses, which have a focal length
f defined as the distance along the optical axis to the focal point where parallel rays
converge.
To figure out where the image of an object is formed, we can use Gauss’ ray tracing
construction, as visualized in Fig. 3. We consider 3 rays which intersect at the image
plane: the parallel ray, which is parallel to the optical axis and is refracted to go through
the lens focal point, the chief ray, which intersects the lens plane at the optical axis and
is not refracted, and the focal ray, which intersects the optical axis at a focal length in
front of the lens, and is refracted to be parallel to the optical axis.
Figure 3: Illustration of parallel, chief, and focal rays.
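As a small sketch of this construction (the coordinate convention and the numbers are assumptions for illustration, not from the lecture), intersecting just the parallel ray and the chief ray behind an ideal thin lens locates the image numerically:

# Gauss' construction behind an ideal thin lens: lens plane at x = 0, optical
# axis along x, object to the left of the lens. Example values only.
f   = 50.0   # focal length (mm)
z_o = 200.0  # object distance in front of the lens (mm)
h_o = 10.0   # object height (mm)

# For x > 0, each traced ray is a line y = y0 + slope * x:
chief    = (0.0, -h_o / z_o)  # passes through the lens center, undeviated
parallel = (h_o, -h_o / f)    # arrives at height h_o, bent through (f, 0)

# Intersect the two lines to find where the image forms.
(y0_c, m_c), (y0_p, m_p) = chief, parallel
z_i = (y0_c - y0_p) / (m_p - m_c)  # x-coordinate of the intersection
h_i = y0_c + m_c * z_i             # image height (negative: inverted)

print(z_i, h_i)  # ~66.7 mm behind the lens, image height ~ -3.33 mm

The intersection agrees with the thin lens equation derived next: 1/50 = 1/66.7 + 1/200.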
To calculate the location of an object’s image, we can construct similar triangles, as
in Fig. 4. After simplification, we arrive at the thin lens equation:
\[ \frac{1}{f} = \frac{1}{z_i} + \frac{1}{z_o} \tag{3} \]
where f is the lens focal length, zi is the distance from the lens plane to the image, and
zo is the distance from the lens plane to the object. One important consequence of this
equation is that we cannot focus on objects closer than the lens focal length.
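As a quick illustration of Eq. (3) (a minimal sketch; the focal length and object distances are example values), solving for the image distance shows the image receding as the object approaches the focal length:

f = 50.0  # focal length in mm (example value)

# Solve the thin lens equation 1/f = 1/z_i + 1/z_o for the image distance z_i.
for z_o in [1000.0, 200.0, 100.0, 60.0]:
    z_i = 1.0 / (1.0 / f - 1.0 / z_o)
    print(f"z_o = {z_o:6.1f} mm -> z_i = {z_i:6.1f} mm")

# As z_o approaches f = 50 mm, z_i diverges to infinity; objects closer than f
# form no real image, which is the consequence noted above.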
It is also important to consider the magnification in height of the formed image, which
can be determined using similar triangles:
\[ m = \frac{h_i}{h_o} = \frac{z_i}{z_o} \tag{4} \]
where m is the magnification, hi is the image height, and ho is the object height.
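Continuing the same sketch (example values), Eq. (4) follows immediately once the image distance is known:

f, z_o, h_o = 50.0, 200.0, 10.0  # mm; example values as above

z_i = 1.0 / (1.0 / f - 1.0 / z_o)  # thin lens equation, Eq. (3)
m = z_i / z_o                      # magnification, Eq. (4)
h_i = m * h_o

print(m, h_i)  # m = 1/3, so a 10 mm object forms a roughly 3.33 mm image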
6 Defocus
Light rays from objects that are not at the focal plane of the camera will not converge to
a point on the sensor plane. Instead, they will form a circle called the circle of confusion.
As can be seen in Fig. 5, the size of the circle of confusion is proportional to the size of
the aperture:
\[ \frac{C}{A} = \frac{|z_s - z_i|}{z_i} \tag{5} \]
where C is the size of the circle of confusion, A is the size of the aperture, zs is the
distance from the lens plane to the sensor plane, and zi is the distance from the lens
plane to the object's image.
Figure 4: Similar triangles used for deriving the thin lens equation.
Figure 5: Illustration of the circle of confusion.
One interesting observation about the circle of confusion in
plane to the object’s image. One interesting observation about the circle of confusion in
perspective composition is that if we increase the distance to the subject and the focal
length by a factor k, the size of the circle of confusion will also increase by a factor k.
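To make Eq. (5) concrete, here is a minimal sketch (focal length, aperture diameter, and distances are example values): focus the lens on a subject, then evaluate the circle of confusion for objects at other depths.

f   = 50.0    # focal length (mm)
A   = 25.0    # aperture diameter (mm), i.e. roughly an f/2 lens
D_S = 2000.0  # subject (focus) distance (mm)

def image_distance(z_o, f):
    # Thin lens equation, Eq. (3), solved for the image distance.
    return 1.0 / (1.0 / f - 1.0 / z_o)

z_s = image_distance(D_S, f)  # sensor position when focused on the subject

for z_o in [1000.0, 2000.0, 4000.0, 1e9]:
    z_i = image_distance(z_o, f)
    C = A * abs(z_s - z_i) / z_i  # circle of confusion diameter, Eq. (5)
    print(f"object at {z_o:>10.0f} mm -> C = {C:.3f} mm")

# Only the object at the subject distance (2000 mm) has C = 0; objects nearer
# or farther produce progressively larger blur circles.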
When considering the amount of defocus of objects in a scene, a useful quantity is
the depth of field, which is the range of object depths in the scene that are
imaged with acceptable sharpness. For digital image sensors, the maximum acceptable
blur spot for a point to be considered sharp is typically the size of 1 pixel. We can
derive the depth of field by successively applying the thin lens equation for objects at
the minimum and maximum depths that would result in a given circle of confusion size,
as visualized in Fig. 6. This results in the equations:
Figure 6: Relevant quantities used to measure the depth of field.
\[ \mathrm{DOF} = D_F - D_N \tag{6} \]
\[ D_F = \frac{D_S f^2}{f^2 - N C (D_S - f)} \tag{7} \]
\[ D_N = \frac{D_S f^2}{f^2 + N C (D_S - f)} \tag{8} \]
where DOF is the depth of field, DF is the distance to the furthest object with a given
circle of confusion size, DN is the distance to the nearest object with a given circle of
confusion size, DS is the distance to the subject, C is the circle of confusion size, and
N = f/A is the f-number of the lens.
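As a numerical sanity check of Eqs. (6)-(8) (a minimal sketch; the focal length, f-number, pixel-sized circle of confusion, and subject distance are example values):

f   = 50.0    # focal length (mm)
N   = 2.8     # f-number, N = f / A
C   = 0.005   # acceptable circle of confusion (mm), roughly one 5-micron pixel
D_S = 2000.0  # subject distance (mm)

D_F = D_S * f**2 / (f**2 - N * C * (D_S - f))  # far limit, Eq. (7)
D_N = D_S * f**2 / (f**2 + N * C * (D_S - f))  # near limit, Eq. (8)
DOF = D_F - D_N                                # Eq. (6)

print(f"in focus from {D_N:.0f} mm to {D_F:.0f} mm (DOF = {DOF:.0f} mm)")
# A 50 mm lens at f/2.8 focused at 2 m yields only a few centimeters of
# depth of field under this strict (one-pixel) sharpness criterion.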
Another important quantity to consider is the hyperfocal distance H, which is the
focus distance that maximizes the depth of field, such that infinity is at the limit of
acceptable sharpness. This amounts to letting DF go to infinity, which requires the
denominator of Eq. (7) to vanish, so DS = H = f^2/(NC) + f, and then DN = H/2.
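Continuing with the same example values, the hyperfocal distance can be checked numerically: focusing there sends DF to infinity and places DN at H/2.

f, N, C = 50.0, 2.8, 0.005  # mm; example values as above

H = f**2 / (N * C) + f  # hyperfocal distance (mm)
print(H)                # about 178,600 mm, i.e. roughly 179 m

# Focusing at D_S = H puts the near limit of acceptable sharpness at H / 2.
D_S = H
D_N = D_S * f**2 / (f**2 + N * C * (D_S - f))
print(D_N, H / 2)  # both about 89,300 mm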
7 Discussion
During the discussion portion of this lecture, we talked about the paper “Computational
Cameras: Convergence of Optics and Processing” by Changyin Zhou and Shree K. Nayar,
which was published in the IEEE Transactions on Image Processing in 2011. We mainly
focused on 5 classes of coding methods: object side, pupil plane, sensor side, illumination,
and “other”. For each coding method, we discussed the pros, cons, and other interesting
observations.
Object Side Coding generally seems easy to prototype and add to a system. Addi-
tionally, it lets you add extra information into your system. Some downsides are that it
can result in larger optical hardware than other types of coding.
Pupil Plane Coding can be thought of as implementing a Fourier space filter, re-
sulting in spatially-invariant PSF modifications. Some concerns are that the pupil plane
can be hard to physically access, and the size of the modification must be considered for
manufacturing.
Sensor Side Coding can be very effective for particular applications, but a downside
is that it can result in more specialized imaging systems. Additionally, sensor side
coding can be hard to prototype. Sensor side coding presents interesting opportunities
to encode time as well as motion.
Illumination Coding presents interesting opportunities to add more information,
such as spatial and temporal resolution, contrast, and 3D information, by cleverly con-
trolling the illumination. Additionally, illumination coding is convenient because it is
relatively easy to implement. One downside is that it can be limiting in certain situations, and
the coherency of the illumination has to be carefully considered.
Other Coding Methods than those listed above include multi-camera systems, lens-
less cameras, gradient-measuring cameras, and other non-conventional systems. Multi-
camera systems such as gigapixel imaging systems and camera arrays present opportuni-
ties to deal with lots of interesting data, but they are typically expensive. Additionally,
lensless imaging can enable us to build cheap and small systems, but contrast is typically
a concern for such systems.