Multivariable Calculus
Don Shimamoto
Swarthmore College
© 2019 Don Shimamoto
ISBN: 978-1-7082-4699-0
This work is licensed under the Creative Commons Attribution 4.0 International
License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Unless noted otherwise, the graphics in this work were created using the software packages
Cinderella, Mathematica, and, in a couple of instances, Adobe Illustrator and Canvas.
Contents
Preface
I Preliminaries
1 $\mathbb{R}^n$
1.1 Vector arithmetic
1.2 Linear transformations
1.3 The matrix of a linear transformation
1.4 Matrix multiplication
1.5 The geometry of the dot product
1.6 Determinants
1.7 Exercises for Chapter 1
Index
Preface
This book is based on a course that I taught several times at Swarthmore College. The material is
standard, though this particular course is geared towards students who enjoy learning mathematics
for its own sake. As a result, there is a priority placed on understanding why things are true and
a recognition that, when details are sketched or omitted, that should be acknowledged. Otherwise,
the level of rigor is fairly normal. The course has a prerequisite of a semester of linear algebra.
That is not necessary for this book, but it helps. Many of the students go on to more advanced
courses in mathematics, and the course is calibrated to be part of the path on the way to the more
theoretical, proof-oriented nature of upper-level material. Indeed, the hope is that the course will
help inspire the students to continue on.
Roughly speaking, the book is organized into three main parts depending on the type of function
being studied: vector-valued functions of one variable, real-valued functions of many variables, and,
finally, the general case of vector-valued functions of many variables. The table of contents gives a
pretty good idea of the topics that are covered, but here are a few notes:
• Chapter 1 contains the basics of working with vectors in Rn . For students who have studied
linear algebra, this is review.
• Chapter 2 is concerned with curves in Rn . This belongs to the study of functions of one
variable, and many students will have studied parametric equations in their first-year calculus
courses, at least for curves in the plane. One of the appealing aspects of the topic is that it is
possible to give a reasonably complete proof of a substantial result—the classification of space
curves—in a way that illustrates the basic vector concepts that have just been introduced.
Many of the proofs along the way and in the related exercises feel like calculations. This
allows the students to ease into the mindset of proving things before encountering the more
argument-based proofs to come.
• Chapters 3, 4, and 5 study real-valued functions of many variables. This includes the topics
most closely associated with multivariable calculus: partial derivatives and multiple integrals.
The discussion of differentiation emphasizes first-order approximations and the notion of
differentiability. For integration, the focus is almost entirely on functions of two variables
and, after the change of variables theorem is introduced later on, functions of three variables.
• Chapters 6 and 7 begin the study of vector-valued functions of many variables. The main
results here are the chain rule and the change of variables theorem. Their importance to the
subsequent theory is reiterated throughout the rest of the book.
• Chapters 8, 9, and 10 introduce vector fields and their integrals over curves in Rn and surfaces
in R3 , in other words, line and surface integrals. This leads to the theorems of Green, Stokes,
and Gauss, which are often taken as the target destinations of a multivariable calculus course.
• The book closes in Chapter 11 with a brief discussion of differential forms and their relation
to what has come before. The treatment is superficial but hopefully still illuminating. The
chapter is a success if the students come away believing that something interesting is going
on and curious enough to want to learn more.
As is always the case, the most useful way for students to learn the material is by doing problems,
and this book is written to get to the exercises as quickly as possible. In some cases, proofs are
sketched in the text and the details are left for the exercises. I have tried to make sure that there is
enough information in the text for the students to be able to do all the problems, but some students
and instructors may find it helpful to see more worked examples. I believe that there are enough
exercises in the book that instructors can choose to include some of them—ranging from routine
calculations to proofs—as part of their own presentation of the material. Or students can try some
of the exercises on their own and check their answers against those in the back of the book. The
exercises are written with these possibilities in mind. I hope that this also gives instructors the
flexibility to integrate the approach taken in the book more easily with their personal perspectives
on the subject.
In writing the book, it has become clear that my own view of the material is heavily influenced
by the multivariable calculus course I took as a student. It was taught by Greg Brumfiel, and I
thank him for getting me excited about the subject in such a lasting way. I also thank my colleague
Ralph Gomez for reading an entire draft of the manuscript and making numerous suggestions
that improved the book considerably. I wish I had thought of them myself. Lastly, I thank the
students who took my courses for their responsiveness and feedback, which allowed
my approach to the material to evolve over the years to its current state, such as it is. They did
not know that the course they were taking might someday become the basis for a textbook, but
then I wasn’t aware of it either.
Part I
Preliminaries
Chapter 1
$\mathbb{R}^n$
Let $\mathbb{R}$ denote the set of real numbers. Its elements are also called scalars. If $n$ is a positive integer,
then $\mathbb{R}^n$ is defined to be the set of all sequences $\mathbf{x}$ of $n$ real numbers:
$$\mathbf{x} = (x_1, x_2, \ldots, x_n). \tag{1.1}$$
The elements of Rn are called points, vectors, or n-tuples. We follow the convention of indicating
vectors in boldface and scalars in plainface. For a vector x, the individual scalar entries xi for
i = 1, 2, . . . , n are called coordinates or components.
Multivariable calculus studies functions between these sets, that is, functions of the form
f : Rn → Rm , or, more accurately, of the form f : A → Rm , where A is a subset of Rn . In
this context, if x represents a typical point of Rn , the coordinates x1 , x2 , . . . , xn are referred to as
variables. For example, first-year calculus studies real-valued functions of one variable, functions
of the form f : R → R.
This chapter collects some of the background information about Rn that we use throughout
the book. The presentation is meant to be self-contained, though readers who have studied linear
algebra are likely to have a greater perspective on how the pieces fit together as part of a bigger
picture.
• Addition: x + y = (x1 + y1 , x2 + y2 , . . . , xn + yn ).
• Scalar multiplication: cx = (cx1 , cx2 , . . . , cxn ).
Because of our familiarity with R2 , we illustrate these concepts there in some detail. An element
of R2 is an ordered pair x = (x1 , x2 ). Geometrically, x is a point in the plane plotted in the usual
way. The origin (0, 0) is called the zero vector and is denoted by 0. Alternatively, we may
visualize x by drawing the arrow starting at (0, 0) and ending at (x1 , x2 ). We’ll go back and forth
freely between the point/arrow viewpoints.
Given two vectors x = (x1 , x2 ) and y = (y1 , y2 ) in R2 , the sum x + y as defined above is the
point that results from adding the displacements in each of the horizontal and vertical directions,
respectively. For instance, if x = (1, 2) and y = (3, 4), then x + y = (4, 6). Thinking of x and y as
arrows emanating from 0, this places x + y at the vertex opposite the origin in the parallelogram
convenient to translate vectors, especially when they represent quantities for which length and
direction are the most relevant characteristics. Nevertheless, it’s important to remember that the
translations are only copies: the real vector starts at the origin.
Similarly, if c is a real number, then cx results from multiplying each of the coordinate dis-
placements by a factor of c. For instance, if x = (1, 2), then 3x = (3, 6). In general, cx is an arrow
|c| times as long as x and in the same direction as x if c > 0, the opposite direction if c < 0. See
Figure 1.3. In particular, (−1)x is a copy of x rotated by 180° to reverse the direction. It is usually
denoted −x since it satisfies x + (−1)x = 0.
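The componentwise definitions of addition and scalar multiplication translate directly into code. The following is a minimal sketch, assuming vectors are stored as Python tuples; the function names `add` and `scale` are illustrative and not part of the text.

```python
def add(x, y):
    """Componentwise sum x + y of two vectors with the same number of components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(x, y))


def scale(c, x):
    """Scalar multiple cx: every component of x is multiplied by c."""
    return tuple(c * a for a in x)


x = (1, 2)
y = (3, 4)
print(add(x, y))             # (4, 6)
print(scale(3, x))           # (3, 6)
print(add(x, scale(-1, x)))  # (0, 0), that is, x + (-1)x = 0
```

The three printed values reproduce the examples in the text: the sum (4, 6), the multiple 3x = (3, 6), and the identity x + (−1)x = 0.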
Going back to the parallelogram used to visualize x+y, we could also look at the other diagonal,
say drawn as an arrow from y to x, as indicated in Figure 1.4. We sometimes denote this arrow by
$\overrightarrow{\mathbf{y}\mathbf{x}}$. It is what you would add to y to get to x. In other words, it is the difference x − y:
$$\overrightarrow{\mathbf{y}\mathbf{x}} = \mathbf{x} - \mathbf{y}.$$
In the case of R2 , the coordinates are usually denoted x, y, rather than x1 , x2 . Then R2 is
the usual xy-plane. The notation is potentially confusing since x is also often used to denote the
generic vector in Rn , as in equation (1.1) above. Hopefully, the context and the use of boldface will
clarify whether a coordinate or vector is intended. Similarly, R3 represents 3-dimensional space.
Its coordinates are often denoted x, y, z.
Returning to the general case, every vector x = (x1 , x2 , . . . , xn ) in Rn can be decomposed as a
sum along the coordinate directions:
The vectors e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en = (0, 0, . . . , 1), with a 1 in a single compo-
nent and zeros everywhere else, are called the standard basis vectors. Thus:
x = x1 e1 + x2 e2 + · · · + xn en . (1.2)
This geometric argument is so simple that it may not be clear that there is any actual reasoning
behind it. The reader is encouraged to go through it carefully to pin down how it proves what
we want. Also, the case where x and y are collinear, so that they don’t determine an honest
parallelogram, requires a separate argument. We won’t give it, though see the reasoning in the
next paragraph.
Similarly, T rotates the line through the origin and x to the line through the origin and T (x).
The point on this line c times as far from the origin as x is rotated to the point c times as far from
the origin as T (x). In other words, T (cx) = c T (x). This is shown on the right in the figure.
Example 1.2. Likewise, the function T : R2 → R2 that reflects a point x = (x1 , x2 ) in the x1 -axis
is a linear transformation. The appropriate supporting diagrams are shown in Figure 1.6.
Example 1.3. Let T : R3 → R2 be the function that projects a point x = (x1 , x2 , x3 ) in R3 onto
the x1 x2 -plane, that is, T (x1 , x2 , x3 ) = (x1 , x2 ). Rather than use pictures, this time we show that
T is linear by calculating.
For instance:
T (x + y) = T (x1 + y1 , x2 + y2 , x3 + y3 ) = (x1 + y1 , x2 + y2 ).
For example, in R3 , if x = (1, 2, 3) and y = (4, 5, 6), then x·y = 1·4+2·5+3·6 = 4+10+18 = 32.
The dot product satisfies a variety of elementary properties, such as x · y = y · x. The ones
we shall use are pretty obvious, so we won’t bother listing them out, though please see Exercises
5.5–5.10 if you’d like to see some of them collected together.
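As a quick check of the arithmetic, here is a short sketch of the dot product as a sum of componentwise products. The name `dot` and the tuple representation are assumptions made for illustration.

```python
def dot(x, y):
    """Dot product x · y: the sum of the products of corresponding components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(x, y))


x = (1, 2, 3)
y = (4, 5, 6)
print(dot(x, y))               # 32
print(dot(x, y) == dot(y, x))  # True
```

The first line of output matches the example above, and the second illustrates the property x · y = y · x for this particular pair.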
Returning to the main point, we have shown in equation (1.4) that every real-valued linear
transformation T : Rn → R has the form:
T (x) = a · x
for some vector a = (a1 , a2 , . . . , an ) in Rn .
The analysis for the general case of a linear transformation T : Rn → Rm follows the same
pattern except that now the values aj = T (ej ) are vectors in Rm . We record these values by
putting them in the columns of a rectangular table. That is, say aj is the vector (a1j , a2j , . . . , amj )
in Rm , and let A be the table:
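$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mj} & \cdots & a_{mn} \end{bmatrix},$$
where $\mathbf{a}_j$ forms the $j$th column. Such a table is called a matrix. In fact, A is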
called an m by n matrix, which means that it has m rows and n columns. The subscripting has
been chosen so that aij is the entry in row i, column j, where the rows are numbered starting from
the top and the columns starting from the left.
The matrix obtained in this way is called the matrix of T with respect to the standard bases.
Since, by equation (1.3), the transformation T is completely determined by the columns aj , the
matrix contains all the data we need to find T (x) for all x in Rn .
We illustrate this with the three examples of linear transformations considered earlier.
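Example 1.4. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the counterclockwise rotation by $\frac{\pi}{3}$ about the origin. Then $T$
rotates the vector $\mathbf{e}_1 = (1, 0)$ to the vector on the unit circle that makes an angle of $\frac{\pi}{3}$ with the
$x_1$-axis. That is, $T(\mathbf{e}_1) = (\cos\frac{\pi}{3}, \sin\frac{\pi}{3}) = (\frac{1}{2}, \frac{\sqrt{3}}{2})$. Similarly, $T(\mathbf{e}_2) = (\cos\frac{5\pi}{6}, \sin\frac{5\pi}{6}) = (-\frac{\sqrt{3}}{2}, \frac{1}{2})$.
Hence the matrix of $T$ with respect to the standard bases is:
$$A = \begin{bmatrix} \frac{1}{2} & -\frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & \frac{1}{2} \end{bmatrix}. \tag{1.5}$$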
Example 1.5. If T : R2 → R2 is the reflection in the x1 -axis, then T (e1 ) = T (1, 0) = (1, 0) and
T (e2 ) = T (0, 1) = (0, −1). Hence:
$$A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \tag{1.6}$$
Example 1.6. Lastly, if T : R3 → R2 is the projection of x1 x2 x3 -space onto the x1 x2 -plane, then
T (e1 ) = T (1, 0, 0) = (1, 0), T (e2 ) = T (0, 1, 0) = (0, 1), and T (e3 ) = T (0, 0, 1) = (0, 0), so:
$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}. \tag{1.7}$$
To use the matrix A to compute T (x) in a systematic way, we observe the convention that
vectors are identified with matrices having a single column. Thus:
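$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad \text{and} \quad \mathbf{a}_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}.$$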
This is similar to the real-valued case (1.4) in that each component is a dot product. Specifically,
the ith component is the dot product of the ith row of A and x. This expression is given a name.
Definition (Preliminary version of matrix multiplication). Let A be an m by n matrix, and let
x be a column vector in Rn . Then the product Ax is defined to be the column vector y in Rm
whose ith component is the dot product of the ith row of A and x:
2. Conversely, given any m by n matrix A, the formula T (x) = Ax defines a linear transforma-
tion T : Rn → Rm .
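The row-by-row description of the product Ax can be sketched in a few lines. This is only an illustration under stated assumptions: a matrix is stored as a list of rows, and the helper name `mat_vec` is hypothetical.

```python
def mat_vec(A, x):
    """Product Ax: the ith component is the dot product of the ith row of A with x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    return [sum(a * b for a, b in zip(row, x)) for row in A]


# Matrix (1.7) of the projection of R^3 onto the x1x2-plane.
A_proj = [[1, 0, 0],
          [0, 1, 0]]
print(mat_vec(A_proj, [7, 8, 9]))  # [7, 8]

# Matrix (1.6) of the reflection in the x1-axis.
A_refl = [[1, 0],
          [0, -1]]
print(mat_vec(A_refl, [2, 5]))     # [2, -5]

# Multiplying by a standard basis vector picks out the corresponding column.
print(mat_vec(A_proj, [0, 0, 1]))  # [0, 0]
```

The last line illustrates that Ae_j is the jth column of A, which is how the matrix of a linear transformation was built in the first place.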
We apply this to the examples previously considered. For instance, if T is the counterclockwise
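rotation of $\mathbb{R}^2$ by $\frac{\pi}{3}$ about the origin, then by (1.5):
$$T(\mathbf{x}) = \begin{bmatrix} \frac{1}{2} & -\frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}x_1 - \frac{\sqrt{3}}{2}x_2 \\ \frac{\sqrt{3}}{2}x_1 + \frac{1}{2}x_2 \end{bmatrix} = \left( \tfrac{1}{2}x_1 - \tfrac{\sqrt{3}}{2}x_2,\ \tfrac{\sqrt{3}}{2}x_1 + \tfrac{1}{2}x_2 \right).$$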
Once you get the hang of it, this may be the simplest way to find a formula for a rotation.
For the reflection in the x1 -axis, (1.6) gives:
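$$T(\mathbf{x}) = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \cdot x_1 + 0 \cdot x_2 \\ 0 \cdot x_1 - 1 \cdot x_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ -x_2 \end{bmatrix} = (x_1, -x_2).$$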
We didn’t need matrix methods to come up with this formula, but at least it’s correct.
Lastly, for the projection of R3 onto the x1 x2 -plane, we find from (1.7) that:
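$$T(\mathbf{x}) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 + 0 + 0 \\ 0 + x_2 + 0 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = (x_1, x_2),$$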
as expected.
Working backwards and looking at the columns of this matrix, this tells us that $(S \circ T)(\mathbf{e}_1) = (\frac{1}{2}, -\frac{\sqrt{3}}{2}) = (\cos(-\frac{\pi}{3}), \sin(-\frac{\pi}{3}))$
and $(S \circ T)(\mathbf{e}_2) = (-\frac{\sqrt{3}}{2}, -\frac{1}{2}) = (\cos\frac{7\pi}{6}, \sin\frac{7\pi}{6})$. These points are
plotted in Figure 1.7. A linear transformation that has the same effect on $\mathbf{e}_1$ and $\mathbf{e}_2$ is the reflection
in the line $\ell$ that makes an angle of $-\frac{\pi}{6}$ with the positive $x_1$-axis. But linear transformations are
completely determined by what they do to the standard basis: two transformations that do the
same thing must be the same transformation. Thus we conclude that $S \circ T$ is the reflection in $\ell$.
This can be verified with geometric reasoning as well.
By comparison, the composition T ◦ S (first reflect, then rotate) is represented by the same
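matrices multiplied in the opposite order:
$$\begin{bmatrix} \frac{1}{2} & -\frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{bmatrix}. \tag{1.9}$$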
Note that this is different from the matrix of S ◦ T . Matrix multiplication need not obey the
commutative law AB = BA. (Can you describe geometrically the linear transformation that (1.9)
represents?)
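The failure of commutativity can be confirmed numerically. The sketch below assumes matrices stored as lists of rows and uses a hypothetical helper `mat_mul`; the entries are floating-point approximations of the exact values in (1.5) and (1.6).

```python
import math


def mat_mul(A, B):
    """Product AB: entry (i, j) is the dot product of row i of A with column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


h = math.sqrt(3) / 2
R = [[0.5, -h], [h, 0.5]]   # rotation by pi/3, matrix (1.5)
S = [[1, 0], [0, -1]]       # reflection in the x1-axis, matrix (1.6)

ST = mat_mul(S, R)          # first rotate, then reflect
TS = mat_mul(R, S)          # first reflect, then rotate

print(ST)
print(TS)
print(ST == TS)             # False
```

The columns of `ST` agree with the values of (S ∘ T)(e1) and (S ∘ T)(e2) found above, while `TS` agrees with the right-hand side of (1.9).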
We return to the familiar setting of the plane and examine these notions there. For instance,
if $\mathbf{x} = (x_1, x_2)$, then $\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2}$. By the Pythagorean theorem, this is the length of the
hypotenuse of a right triangle with legs $|x_1|$ and $|x_2|$. If we think of x as an arrow emanating from
the origin, then $\|\mathbf{x}\|$ is the length of the arrow. If we think of x as a point, then $\|\mathbf{x}\|$ is the distance
from x to the origin.
Given two points x and y in $\mathbb{R}^2$, the distance between them is the length of the arrow that
connects them, $\overrightarrow{\mathbf{y}\mathbf{x}} = \mathbf{x} - \mathbf{y}$. Hence:
Next, let x = (x1 , x2 ) and y = (y1 , y2 ) be nonzero elements of R2 , regarded as arrows emanating
from the origin. Suppose that the arrows are perpendicular. If x and y do not form a horizon-
tal/vertical pair, then the slopes of the lines through the origin that contain them are defined and
are negative reciprocals. The slope is the ratio of vertical displacement to horizontal displacement,
so this gives $\frac{x_2}{x_1} = -\frac{y_1}{y_2}$. See Figure 1.8. This is easily rearranged to become $x_1 y_1 + x_2 y_2 = 0$, or
Figure 1.8: Perpendicular vectors in the plane: for instance, note that x has slope x2 /x1 .
For vectors in the plane in general, let θ denote the angle between two given vectors x and y,
where 0 ≤ θ ≤ π. To study the relationship between the dot product and θ, assume for the moment
that neither x nor y is a scalar multiple of the other, so $\theta \neq 0, \pi$, and consider the triangle whose
vertices are 0, x, and y. Two of the sides of this triangle have lengths $\|\mathbf{x}\|$ and $\|\mathbf{y}\|$, and the length
of the third side is the length of the arrow $\overrightarrow{\mathbf{y}\mathbf{x}} = \mathbf{x} - \mathbf{y}$. See Figure 1.9. Thus by the law of cosines,
$\|\mathbf{x} - \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 - 2\|\mathbf{x}\|\,\|\mathbf{y}\| \cos\theta$. By Proposition 1.10, this is the same as:
x · y = (a1 u1 + a2 u2 ) · (b1 u1 + b2 u2 )
= a1 b1 u1 · u1 + (a1 b2 + a2 b1 )u1 · u2 + a2 b2 u2 · u2
= a1 b1 + a2 b2 .
Thus the dot product in Rn agrees with the result we would expect in terms of the newly created
internal coordinates in P. In particular, $\|\mathbf{x}\|^2 = \mathbf{x} \cdot \mathbf{x} = a_1^2 + a_2^2$, so the norm in $\mathbb{R}^n$ represents our
intuitive notion of length in P. This is true even though the coordinates of x = (x1 , x2 , . . . , xn ) in
Rn don’t really have anything to do with P.
Continuing in this way, we can build up P as a copy of R2 and transfer over the familiar concepts
of plane geometry, such as distance and angle. The following terminology reflects this intuition.
For example, the standard basis vectors ei = (0, 0, . . . , 0, 1, 0, . . . , 0) are unit vectors.
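Corollary 1.11. If $\mathbf{x}$ is a nonzero vector in $\mathbb{R}^n$, then $\mathbf{u} = \frac{1}{\|\mathbf{x}\|}\mathbf{x}$ is a unit vector. It is called the
unit vector in the direction of $\mathbf{x}$.
Proof. We calculate that $\mathbf{u} \cdot \mathbf{u} = \left(\frac{1}{\|\mathbf{x}\|}\mathbf{x}\right) \cdot \left(\frac{1}{\|\mathbf{x}\|}\mathbf{x}\right) = \frac{1}{\|\mathbf{x}\|^2}\,\mathbf{x} \cdot \mathbf{x} = \frac{1}{\|\mathbf{x}\|^2}\,\|\mathbf{x}\|^2 = 1$. Thus by Proposition
1.10, $\|\mathbf{u}\| = 1$.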
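Corollary 1.11 amounts to dividing a vector by its norm. A minimal sketch follows, assuming tuples for vectors; the names `norm` and `unit` are illustrative only.

```python
import math


def norm(x):
    """Norm of x: the square root of x · x."""
    return math.sqrt(sum(a * a for a in x))


def unit(x):
    """Unit vector in the direction of a nonzero vector x."""
    n = norm(x)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(a / n for a in x)


x = (3, 4)
u = unit(x)
print(norm(x))  # 5.0
print(u)        # (0.6, 0.8)
# Up to floating-point rounding, u has norm 1.
print(math.isclose(norm(u), 1.0))  # True
```

The zero vector is rejected explicitly because the corollary applies only to nonzero vectors.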
Our main result in this section is that the relation between the dot product and angles that we
derived for the plane in equation (1.11) is true in Rn for all n.
where θ is the angle between x and y in the plane that they span.
Proof. The case that x or y is a scalar multiple of the other is proved in the same way as in R2 .
Otherwise, x and y span a plane P, and the points 0, x, and y are the vertices of a “triangle” in
P. Since we can make P into a geometric replica of R2 , the law of cosines remains true, and, since
the norm in Rn represents length in P, this takes the form:
Lastly, we introduce the standard terminology for the case of perpendicular vectors, that is,
when $\theta = \frac{\pi}{2}$, so $\cos\theta = 0$.
1.6 Determinants
The determinant is a function that assigns a real number to an n by n matrix. There’s a separate
function for each n. We shall focus almost exclusively on the cases n = 2 and n = 3, since those are
the cases we really need later. The determinant is not defined for matrices in which the numbers
of rows and columns are unequal.
The determinant of a 2 by 2 matrix is defined to be:
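$$\det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$
It’s the product of the diagonal entries minus the product of the off-diagonal entries. For instance,
$\det \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = 1 \cdot 4 - 2 \cdot 3 = 4 - 6 = -2$.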
One often thinks of the determinant as a function of the rows of the matrix. If x = (x1 , x2 ) and
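$\mathbf{y} = (y_1, y_2)$, let $\begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix}$ denote the matrix whose rows are x and y:
$$\det \begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix} = \det \begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix} = x_1 y_2 - x_2 y_1.$$
Proposition 1.13. Let x and y be vectors in $\mathbb{R}^2$, neither a scalar multiple of the other. Then:
$$\left| \det \begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix} \right| = \text{Area of the parallelogram determined by } \mathbf{x} \text{ and } \mathbf{y}.$$
Proof. We show the equivalent result that $\left( \det \begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix} \right)^2 = (\text{Area})^2$. The area of the parallelogram
is (base)(height). As base, we use the arrow x, which has length $\|\mathbf{x}\|$. For the height h, we drop a
perpendicular from the point y to the base. See Figure 1.10. Let θ be the angle between x and y,
where as usual 0 ≤ θ ≤ π. Then the height is given by $h = \|\mathbf{y}\| \sin\theta$, so $\text{Area} = \|\mathbf{x}\|\,\|\mathbf{y}\| \sin\theta$ and: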
Meanwhile:
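$$\begin{aligned}
\left( \det \begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix} \right)^2 &= (x_1 y_2 - x_2 y_1)^2 \\
&= x_1^2 y_2^2 - 2 x_1 x_2 y_1 y_2 + x_2^2 y_1^2 \\
&= (x_1^2 + x_2^2)(y_1^2 + y_2^2) - x_1^2 y_1^2 - x_2^2 y_2^2 - 2 x_1 x_2 y_1 y_2 \\
&= (x_1^2 + x_2^2)(y_1^2 + y_2^2) - (x_1 y_1 + x_2 y_2)^2 \\
&= \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 - (\mathbf{x} \cdot \mathbf{y})^2 \\
&= \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 - \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 \cos^2\theta \\
&= \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 \sin^2\theta.
\end{aligned} \tag{1.13}$$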
If x or y is a scalar multiple of the other, for instance, if y = cx for some scalar c, then the
“parallelogram” they determine degenerates into a line segment, which has area 0. At the same
time, $\det \begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix} = \det \begin{bmatrix} x_1 & x_2 \\ c x_1 & c x_2 \end{bmatrix} = c x_1 x_2 - c x_1 x_2 = 0$, so the proposition remains valid when things
degenerate, too.
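The identity at the heart of this proof, that the squared determinant equals ‖x‖²‖y‖² − (x · y)², is easy to test on specific vectors. The following sketch uses integer entries so that the arithmetic is exact; the names `det2` and `area` are assumptions made for illustration.

```python
def det2(x, y):
    """Determinant of the 2 by 2 matrix whose rows are x and y."""
    return x[0] * y[1] - x[1] * y[0]


def dot(x, y):
    """Dot product of two vectors in the plane."""
    return x[0] * y[0] + x[1] * y[1]


def area(x, y):
    """Area of the parallelogram determined by x and y."""
    return abs(det2(x, y))


x = (3, 1)
y = (1, 2)
print(det2(x, y))  # 5
print(area(x, y))  # 5
# The squared determinant equals |x|^2 |y|^2 - (x . y)^2.
print(det2(x, y) ** 2 == dot(x, x) * dot(y, y) - dot(x, y) ** 2)  # True
# Degenerate case: y is a scalar multiple of x.
print(area(x, (6, 2)))  # 0
```

Swapping the two rows changes the sign of `det2` but not the area, which is why the absolute value is needed.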
Determinants satisfy a great many algebraic properties. We list only the ones that we shall use,
which barely begins to scratch the surface. In part (c) of the following proposition, the transpose
of a matrix A, denoted At , refers to the matrix obtained by turning the rows into columns and the
columns into rows. So the (i, j)th entry of At is the (j, i)th entry of A. For instance:
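$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^t = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}, \quad \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^t = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}^t = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}, \quad \text{and so on.}$$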
If A is an m by n matrix, then At is n by m.
Proposition 1.14.
(a) If two of the rows are equal, the determinant is 0.
(b) Interchanging two rows flips the sign of the determinant.
(c) det(At ) = det A.
(d) det(AB) = (det A)(det B).
Proof. For 2 by 2 determinants, the proofs are easy.
(a) $\det \begin{bmatrix} \mathbf{x} \\ \mathbf{x} \end{bmatrix} = \det \begin{bmatrix} x_1 & x_2 \\ x_1 & x_2 \end{bmatrix} = x_1 x_2 - x_1 x_2 = 0$.
+a22 b22 )−(a11 b12 +a12 b22 )(a21 b11 +a22 b21 ). After multiplying out, the terms involving a11 a21 b11 b12
and a12 a22 b21 b22 cancel in pairs, leaving:
det(AB) = a11 b11 a22 b22 + a12 b21 a21 b12 − a11 b12 a22 b21 − a12 b22 a21 b11 .
One can check that $(\det A)(\det B) = (a_{11}a_{22} - a_{12}a_{21})(b_{11}b_{22} - b_{12}b_{21})$ expands to the same thing.
The signs in the sum alternate, and the pattern is that the terms run along the entries of the first
row, a1j , each multiplied by the 2 by 2 determinant obtained by deleting the first row and jth
column of the original matrix. The process is known as expansion along the first row. For
instance:
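$$\begin{aligned}
\det \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} &= 1 \cdot \det \begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix} - 2 \cdot \det \begin{bmatrix} 4 & 6 \\ 7 & 9 \end{bmatrix} + 3 \cdot \det \begin{bmatrix} 4 & 5 \\ 7 & 8 \end{bmatrix} \\
&= 1 \cdot (45 - 48) - 2 \cdot (36 - 42) + 3 \cdot (32 - 35) = -3 + 12 - 9 = 0.
\end{aligned}$$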
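Expansion along the first row can be written out directly for the 3 by 3 case. This is a sketch under the assumption that a matrix is a list of three rows; `det2x2` and `det3` are hypothetical names.

```python
def det2x2(a, b, c, d):
    """Determinant of the 2 by 2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c


def det3(A):
    """3 by 3 determinant by expansion along the first row."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    # Each first-row entry multiplies the 2 by 2 determinant left after
    # deleting the first row and that entry's column; the signs alternate.
    return (a11 * det2x2(a22, a23, a32, a33)
            - a12 * det2x2(a21, a23, a31, a33)
            + a13 * det2x2(a21, a22, a31, a32))


print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 0
print(det3([[2, 0, 1], [1, 3, 0], [0, 1, 4]]))  # 25
```

The first call reproduces the worked example above. The second matrix is an additional illustration chosen here, for which the expansion gives 2(12 − 0) − 0(4 − 0) + 1(1 − 0) = 25.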
One can show that 3 by 3 determinants satisfy all four of the algebraic properties of Proposition
1.14. The calculations are longer than in the 2 by 2 case but still straightforward, except for the
product formula det(AB) = (det A)(det B) which calls for a new approach.
One consequence of the properties is that there’s a formula for expanding the determinant along
any row. To expand along the ith row, interchange row i with the row above it repeatedly until it
reaches the top, then expand using equation (1.14). The result has the same form as (1.14), except
that the leading scalar factors come from the ith row and sometimes the signs might alternate
beginning with a minus sign depending on how many sign flips were introduced in getting the ith
row to the top.
In addition, since det(At ) = det(A), any general statement about rows applies to columns as
well, so there are formulas for expanding along any column, too. We won’t write down those
formulas precisely.
There is also a geometric interpretation of 3 by 3 determinants in terms of 3-dimensional volume.
This is discussed in the next chapter.
For larger matrices, the same pattern continues, that is, n by n determinants can be defined in
terms of (n − 1) by (n − 1) determinants using expansion along the first row. For the 4 by 4 case,
the formula is:
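$$\begin{aligned}
\det \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}
&= a_{11} \cdot \det \begin{bmatrix} a_{22} & a_{23} & a_{24} \\ a_{32} & a_{33} & a_{34} \\ a_{42} & a_{43} & a_{44} \end{bmatrix}
- a_{12} \cdot \det \begin{bmatrix} a_{21} & a_{23} & a_{24} \\ a_{31} & a_{33} & a_{34} \\ a_{41} & a_{43} & a_{44} \end{bmatrix} \\
&\quad + a_{13} \cdot \det \begin{bmatrix} a_{21} & a_{22} & a_{24} \\ a_{31} & a_{32} & a_{34} \\ a_{41} & a_{42} & a_{44} \end{bmatrix}
- a_{14} \cdot \det \begin{bmatrix} a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}.
\end{aligned}$$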
The algebraic properties of Proposition 1.14 remain true for n by n determinants whatever the
value of n, though, to prove this, one should really develop the algebraic structure of determinants
in a systematic way rather than hope for success with brute force calculation. We leave this level
of generality for a course in linear algebra. In what follows, we work for the most part with the
cases n = 2 and n = 3.
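The recursive pattern, in which an n by n determinant is defined through (n − 1) by (n − 1) determinants, lends itself to a short recursive function. The sketch below is meant only to illustrate the definition: the names are hypothetical, the 1 by 1 base case det [a] = a is an assumption chosen so that the recursion reproduces the 2 by 2 formula, and the method is far too slow for large matrices.

```python
def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-indexed)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant of a square matrix by expansion along the first row."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("the determinant is defined only for square matrices")
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))


def transpose(A):
    """Turn the rows of A into columns."""
    return [list(col) for col in zip(*A)]


A = [[1, 0, 2, 0],
     [0, 1, 0, 0],
     [3, 0, 1, 0],
     [0, 0, 0, 2]]
print(det(A))                       # -10
print(det(transpose(A)) == det(A))  # True

# Interchanging two rows flips the sign of the determinant.
swapped = [A[1], A[0], A[2], A[3]]
print(det(swapped) == -det(A))      # True
```

These checks illustrate parts (b) and (c) of Proposition 1.14 for one 4 by 4 matrix; they are spot checks, not proofs.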
1.2. Find $\overrightarrow{\mathbf{y}\mathbf{x}}$ and $\mathbf{y} + \overrightarrow{\mathbf{y}\mathbf{x}}$.
1.3. If x + y + z = 0, find z.
1.4. If x − 2y + 3z = 0, find z.
(a) Show that the midpoint of line segment xy is given by $\mathbf{m} = \frac{1}{2}\mathbf{x} + \frac{1}{2}\mathbf{y}$. (Hint: What do
you add to x to get to the midpoint?)
(b) Find an analogous expression in terms of x and y for the point p that is 2/3 of the way
from x to y.
(c) Let x = (1, 1, 0) and y = (0, 1, 1). Find the point z in R3 such that y is the midpoint of
line segment xz.
2.2. Let T : R2 → R3 be the function defined by T (x1 , x2 ) = (x1 , x2 , 0). Show that T is a linear
transformation.
T (x1 , x2 , . . . , xn ) = x1 a1 + x2 a2 + · · · + xn an
is a linear transformation.
3.1. Let T : R2 → R2 be the linear transformation such that T (e1 ) = (1, 2) and T (e2 ) = (3, 4).
3.2. Let T : R3 → R3 be the linear transformation such that T (e1 ) = (1, 0, −1), T (e2 ) = (−1, 1, 0),
and T (e3 ) = (0, −1, 1).
(c) Find the set of all points x = (x1 , x2 , x3 ) in R3 such that T (x) = 0.
In Exercises 3.3–3.6, find the matrix of the given linear transformation T : R2 → R2 with respect
to the standard bases.
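3.7. Let $\rho_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation about the origin counterclockwise by an angle θ. Show that
the matrix of $\rho_\theta$ with respect to the standard bases is:
$$R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. \tag{1.15}$$
The matrix $R_\theta$ is called a rotation matrix.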
3.8. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation whose matrix with respect to the standard
bases is $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$. Describe T geometrically.
3.9. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation whose matrix with respect to the standard
bases is $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$.
(a) Find T (e1 ) and T (e2 ).
(b) Describe the image of T , that is, the set of all y in R2 such that y = T (x) for some x in
R2 .
3.10. Let v1 = (1, 1) and v2 = (−1, 1).
(a) Find scalars c1 , c2 such that c1 v1 + c2 v2 = e1 .
(b) Find scalars c1 , c2 such that c1 v1 + c2 v2 = e2 .
(c) Let T : R2 → R2 be the linear transformation such that T (v1 ) = (−1, −1) and T (v2 ) =
(−2, 2). Find the matrix of T with respect to the standard bases.
(d) Let S : R2 → R2 be the linear transformation such that S(v1 ) = e1 and S(v2 ) = e2 .
Find the matrix of S with respect to the standard bases.
3.11. Let T : R2 → R3 be the linear transformation given by T (x1 , x2 ) = (x1 , x2 , 0). Find the
matrix of T with respect to the standard bases.
3.12. Let T : R3 → R3 be the rotation about the x3 -axis by π/2 counterclockwise as viewed looking
down from the positive x3 -axis. Find the matrix of T with respect to the standard bases.
3.13. Let T : R3 → R3 be the rotation by π about the line x1 = x2 , x3 = 0, in the x1 x2 -plane. Find
the matrix of T with respect to the standard bases.
3.14. Let A and B be m by n matrices. If Ax = Bx for all column vectors x in Rn , show that
A = B.
4.7. We have seen that the commutative law AB = BA does not hold in general for matrix
multiplication. In fact, the situation is worse than that: it’s rare for AB and BA even to
both be defined. In that sense, the preceding half dozen exercises are misleading.
(a) Find an example of matrices A and B such that neither AB nor BA is defined.
(b) Find an example where AB is defined but BA is not.
(AB)C = A(BC).
Since it does not matter how the terms are grouped, we shall write the product henceforth
simply as ABC, without parentheses. (Hint: See Exercise 3.14.)
In Exercises 4.9–4.10:
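• let $R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ be the matrix (1.15) that represents the rotation $\rho_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ about
the origin counterclockwise by angle θ, and
• let $S = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ be the matrix (1.6) that represents the reflection $r : \mathbb{R}^2 \to \mathbb{R}^2$ in the $x_1$-axis.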
(a) Find x · y.
(b) Find $\|\mathbf{x}\|$ and $\|\mathbf{y}\|$.
(c) Find the angle between x and y.
5.2. Find the unit vector in the direction of x = (2, −1, −2).
5.3. Find all unit vectors in R2 that are orthogonal to x = (1, 2).
5.4. What does the sign of x · y tell you about the angle between x and y?
In Exercises 5.5–5.10, show that the dot product satisfies the given property. The properties
are true for vectors in Rn , though you may assume in your arguments that the vectors are in R2 ,
i.e., x = (x1 , x2 ), y = (y1 , y2 ), and so on. The proofs for Rn in general are similar.
5.5. x · y = y · x
5.8. w · (x + y) = w · x + w · y
5.11. Recall that a rhombus is a planar quadrilateral whose sides all have the same length. Use the
dot product to show that the diagonals of a rhombus are perpendicular to each other.
5.12. The Pythagorean theorem states that, for a right triangle in the plane, a2 + b2 = c2 , where
a and b are the lengths of the legs and c is the length of the hypotenuse. Use vector algebra
and the dot product to show that the theorem remains true for right triangles in Rn . (Hint:
The hypotenuse is a diagonal of a rectangle.)
Section 6 Determinants
In Exercises 6.1–6.4, find the given determinant.
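6.1. $\det \begin{bmatrix} 2 & 5 \\ -3 & 4 \end{bmatrix}$
6.2. $\det \begin{bmatrix} 0 & 5 \\ -3 & 4 \end{bmatrix}$
6.3. $\det \begin{bmatrix} 1 & -2 & 3 \\ -4 & 5 & -6 \\ 7 & -8 & 9 \end{bmatrix}$
6.4. $\det \begin{bmatrix} 1 & 0 & 0 \\ 2 & 3 & 0 \\ 4 & 5 & 6 \end{bmatrix}$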
6.5. Find the area of the parallelogram in R2 determined by x = (4, 0) and y = (1, 3).
6.6. Find the area of the parallelogram in R2 determined by x = (−2, −3) and y = (−3, 2).
6.7. Let $A = \begin{bmatrix} 1 & -2 \\ 2 & -4 \end{bmatrix}$. Use the product rule for determinants to show that there is no 2 by 2
matrix B such that $AB = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.
0 1
6.8. Let A be an n by n matrix, and let At be its transpose. Show that det(At A) = (det A)2 .
Chapter 2
This chapter is concerned with curves in Rn . While we may have an intuitive sense of what a
curve is, at least in R2 or R3 , the formal description here is somewhat indirect in that, rather than
requiring a curve to have a defining equation, we describe it by how it is swept out, like the trace
of a skywriter. Thus in addition to studying geometric features of the curve, such as its length,
we also look at quantities related to the skywriter’s motion, such as its velocity and acceleration.
The goal of the chapter is a remarkable result about curves in R3 that describes measurements
that characterize the geometry of a space curve completely. Along the way, we shall gain valuable
experience applying vector methods.
The functions in this chapter take their values in Rn , but they are functions of one variable. As
a result, we treat the material as a continuation of first-year calculus without going back to redefine
or redevelop concepts that are introduced there. At times, this assumed familiarity may lead to a
rather loose treatment of certain basic topics, such as continuity, derivatives, and integrals. When
we study functions of more than one variable, we shall go back and define these concepts carefully,
and what we say then applies retroactively to what we cover here. We hope that any reader who
becomes anxious about a possible lack of rigor will be willing to wait.
2.1 Parametrizations
Let I be an interval of real numbers, typically, I = [a, b], (a, b), or R.
Definition. A continuous function α : I → Rn is called a path. As t varies over I, α(t) traces out
a curve C. More precisely:
C = {x ∈ Rn : x = α(t) for some t in I}.
This is also known as the image of α. We say that α is a parametrization of the curve. We often
refer to the input variable t as time and think of α(t) as the position of a moving object at time
t. See Figure 2.1.
A path is a vector-valued function of one variable. To follow our notational convention, we
should write the value in boldface as α(t), but for the most part we continue to use plainface,
usually reserving boldface for a particular type of vector-valued function that we begin studying in
Chapter 8. For each t in I, α(t) is a point of Rn , so we may write α(t) = (x1 (t), x2 (t), . . . , xn (t)),
where each of the n coordinates is a real number that depends on t, i.e., a real-valued function of
one variable.
Here are some of the standard examples of parametrized curves that we shall refer to frequently.
If b > 0, the helix is said to be “right-handed” and if b < 0 “left-handed.” Each type is shown in
Figure 2.4.
Figure 2.4: Helices: right-handed (at left) and left-handed (at right)
If the domain of α is restricted to an interval of finite length, then α parametrizes a finite segment
of the line.
Example 2.5. Find a parametrization of the line in R3 that passes through the points a = (1, 2, 3)
and b = (4, 5, 6).
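The line passes through the point a = (1, 2, 3) and is parallel to $\mathbf{v} = \overrightarrow{ab} = \mathbf{b} - \mathbf{a}$ = (4, 5, 6) −
(1, 2, 3) = (3, 3, 3). The setup is indicated in Figure 2.6. Therefore one parametrization is:
α(t) = (1, 2, 3) + t (3, 3, 3) = (1 + 3t, 2 + 3t, 3 + 3t).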
Figure 2.6: Parametrizing the line through two given points a and b
Any nonzero scalar multiple of v = (3, 3, 3) is also parallel to the line and could be used as
part of a parametrization as well. For instance, w = (1, 1, 1) is such a multiple, and β(t) =
(1, 2, 3) + t (1, 1, 1) = (1 + t, 2 + t, 3 + t) is another parametrization of the same line.
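As a quick numerical sanity check that the two direction vectors describe the same line, the following sketch compares the parametrization through a with direction v = (3, 3, 3) against β. The function names `alpha` and `beta` and the sample values are illustrative choices, not part of the text; the check relies on the observation that β(t) should coincide with α(t/3).

```python
def alpha(t):
    # a + t*v with a = (1, 2, 3) and v = b - a = (3, 3, 3)
    return (1 + 3 * t, 2 + 3 * t, 3 + 3 * t)

def beta(t):
    # a + t*w with w = (1, 1, 1), a scalar multiple of v
    return (1 + t, 2 + t, 3 + t)

# beta(t) should equal alpha(t / 3) for every t; spot-check a few values
samples = [-2.0, 0.0, 0.5, 3.0]
same_line = all(
    all(abs(p - q) < 1e-12 for p, q in zip(beta(t), alpha(t / 3)))
    for t in samples
)
print(same_line)
```

The two paths trace the same set of points but at different speeds: β needs three times as long as α to travel from a to b.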
provided the limit exists. The derivative is also called the velocity v(t) of α.
Example 2.6. For the helix parametrized by α(t) = (cos t, sin t, t):
Now, choose some time t0, and, for any time t, let s(t) be the distance traced out along the
path from α(t0) to α(t). Then s(t) is called the arclength function, and we think of the rate of
change of distance ds/dt as the speed. Unfortunately, these terms should be defined more carefully,
and, to get everything in the right logical order, it seems best to define the speed first.
To define what we expect ds/dt to be, we make the intuitive approximation that the distance Δs
along the curve between two nearby points α(t) and α(t + Δt) is approximately the straight line
distance between them, as indicated in Figure 2.7. We assume that Δt is positive. If Δt is small:
$v(t) = \|\mathbf{v}(t)\|$.
Note that the speed $v(t)$ is a scalar quantity, whereas the velocity $\mathbf{v}(t)$ is a vector.
Now, to define the length of the path from t = a to t = b, we integrate the speed.
Definition. The arclength from t = a to t = b is defined to be $\int_a^b v(t)\,dt = \int_a^b \|\mathbf{v}(t)\|\,dt$.
The arclength function s(t) we considered above is then given by integrating the speed v(t),
so, by the fundamental theorem of calculus, ds/dt = v(t). Thus the definitions realize the intuitive
relations with which we began.
Example 2.7. For the helix parametrized by α(t) = (cos t, sin t, t), find:
First, as just shown in Example 2.6, the velocity is v(t) = (− sin t, cos t, 1). Thus:
(a) speed: $v(t) = \|\mathbf{v}(t)\| = \sqrt{\sin^2 t + \cos^2 t + 1} = \sqrt{2}$.
(b) arclength: $\int_0^{4\pi} v(t)\,dt = \int_0^{4\pi} \sqrt{2}\,dt = \sqrt{2}\,t\,\Big|_0^{4\pi} = 4\pi\sqrt{2}$.
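The two computations above can be checked numerically. The sketch below is only an illustration: the helper names `velocity`, `speed`, and `arclength`, and the choice of a midpoint rule with 1000 subintervals, are assumptions made here rather than anything prescribed by the text.

```python
import math

def velocity(t):
    # derivative of alpha(t) = (cos t, sin t, t)
    return (-math.sin(t), math.cos(t), 1.0)

def speed(t):
    # magnitude of the velocity vector
    return math.sqrt(sum(c * c for c in velocity(t)))

def arclength(a, b, n=1000):
    # midpoint-rule approximation of the integral of the speed over [a, b]
    h = (b - a) / n
    return sum(speed(a + (j + 0.5) * h) for j in range(n)) * h

length = arclength(0.0, 4 * math.pi)
print(speed(1.3), length, 4 * math.pi * math.sqrt(2))
```

Because the speed of this helix is constant, the midpoint rule reproduces the exact value up to rounding; for a path with varying speed, the approximation would improve as `n` grows.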
Summary of definitions
velocity: $\mathbf{v}(t) = \alpha'(t)$
acceleration: $\mathbf{a}(t) = \alpha''(t)$
speed: $v(t) = \|\mathbf{v}(t)\| = \frac{ds}{dt}$
arclength: $\int_a^b v(t)\,dt$
We can rewrite the sums in a way that leads to a normal first-year calculus integral over an
interval by using a parametrization α : [a, b] → Rn of C. We assume that [a, b] can be subdivided
into a sequence of consecutive subintervals of lengths Δt1, Δt2, . . . , Δtk such that the subinterval
of length Δtj gets mapped to the curve segment of length Δsj for each j = 1, 2, . . . , k. Then the
sum that models the integral can be written as:
$$\sum_j f(\mathbf{x}_j)\,\Delta s_j = \sum_j f(\alpha(c_j))\,\frac{\Delta s_j}{\Delta t_j}\,\Delta t_j,$$
where cj is a value of the parameter for which α(cj) = xj. In the limit as Δtj goes to 0, these
sums approach the integral $\int_a^b f(\alpha(t))\,\frac{ds}{dt}\,dt$. In this last expression, ds/dt is the speed, also denoted
v(t). This intuition is formalized in the following definition.
Definition. Let α : [a, b] → Rn be a differentiable parametrization of a curve C, and let v(t) denote
its speed. If f : C → R is a continuous real-valued function, the integral of f with respect to
arclength, denoted $\int_\alpha f\,ds$, is defined by:
$$\int_\alpha f\,ds = \int_a^b f(\alpha(t))\,v(t)\,dt.$$
We also denote this integral by $\int_C f\,ds$, though this raises some issues that are addressed later when
we have more practice with parametrizations and their effect on calculations. (See page 208.)
Example 2.8. If f = 1 (constant function), then the definition of the integral says $\int_C 1\,ds = \int_a^b v(t)\,dt$.
On the other hand, this is also precisely the definition of arclength. Hence:
$$\int_C 1\,ds = \text{arclength of } C.$$
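The definition translates directly into a numerical recipe: sample the parameter interval, evaluate f at α(t), and weight by the speed. The sketch below assumes a midpoint rule and a central-difference estimate of the speed; the name `integrate_ds`, the step sizes, and the second test function f(x, y, z) = z are illustrative choices, not taken from the text.

```python
import math

def integrate_ds(f, alpha, a, b, n=2000, h=1e-5):
    """Approximate the integral of f with respect to arclength along alpha on [a, b]."""
    def speed(t):
        # central-difference estimate of ||alpha'(t)||
        p, q = alpha(t + h), alpha(t - h)
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q))) / (2 * h)

    dt = (b - a) / n
    total = 0.0
    for j in range(n):
        t = a + (j + 0.5) * dt              # midpoint of the j-th subinterval
        total += f(alpha(t)) * speed(t) * dt
    return total

def helix(t):
    return (math.cos(t), math.sin(t), t)

# f = 1 recovers the arclength, as in Example 2.8
length = integrate_ds(lambda x: 1.0, helix, 0.0, 4 * math.pi)
# f(x, y, z) = z integrates the height along the helix
height_integral = integrate_ds(lambda x: x[2], helix, 0.0, 4 * math.pi)
print(length, height_integral)
```

For the helix, the speed is √2, so the second integral should be close to √2 times the integral of t from 0 to 4π, that is, 8π²√2.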
Figure 2.10: The unit tangent vector T(t), translated to start at α(t)
The remaining two curve-related coordinate directions are orthogonal to this first one. In R3 ,
there is a whole plane of orthogonal possibilities, but one of the possibilities turns out to be naturally
distinguished. Identifying it requires some preliminary work.
Proposition 2.10 (Product rule for the dot product). If α and β are differentiable paths in Rn ,
then:
(α · β)′ = α′ · β + α · β′.
The expression within the limit can be put into more manageable form by using the trick of “adding
zero,” suitably disguised, in the middle:
$$\begin{aligned}
\frac{1}{h}\bigl(\alpha(t+h)\cdot\beta(t+h) - \alpha(t)\cdot\beta(t)\bigr)
&= \frac{1}{h}\bigl(\alpha(t+h)\cdot\beta(t+h) - \alpha(t)\cdot\beta(t+h) \\
&\qquad\quad + \alpha(t)\cdot\beta(t+h) - \alpha(t)\cdot\beta(t)\bigr) \\
&= \frac{\alpha(t+h)-\alpha(t)}{h}\cdot\beta(t+h) + \alpha(t)\cdot\frac{\beta(t+h)-\beta(t)}{h}.
\end{aligned}$$
Now, taking the limit as h goes to 0 gives the desired result: (α · β)′(t) = α′(t) · β(t) + α(t) · β′(t).
Proof. By Proposition 1.10 of Chapter 1, ‖α‖² = α · α, so by the product rule (‖α‖²)′ = (α · α)′ =
α′ · α + α · α′ = 2α · α′.
In words, a vector-valued function of one variable has constant magnitude if and only if it is
orthogonal to its derivative at all times.
Proof. ‖α‖ is constant if and only if ‖α‖² is constant. This in turn is true if and only if (‖α‖²)′ = 0.
By the preceding corollary, this is equivalent to saying that α · α′ = 0.
Returning to the geometry of curves, the unit tangent vector T(t) satisfies ‖T(t)‖ = 1, which
is constant. Therefore T and T′ are orthogonal. The unit vector in the direction of T′ will be
orthogonal to T, too. This gives us our second coordinate direction.
Figure 2.11: The unit tangent and principal normal vectors, T(t) and N(t)
Example 2.13. For the helix parametrized by α(t) = (cos t, sin t, t), find the unit tangent T(t)
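and the principal normal N(t).
As calculated in Examples 2.6 and 2.7, α′(t) = (− sin t, cos t, 1) and ‖α′(t)‖ = v(t) = √2. Thus:
$$\mathbf{T}(t) = \frac{1}{\|\alpha'(t)\|}\,\alpha'(t) = \frac{1}{\sqrt{2}}\,(-\sin t, \cos t, 1).$$
Continuing with this, $\mathbf{T}'(t) = \frac{1}{\sqrt{2}}(-\cos t, -\sin t, 0)$ and $\|\mathbf{T}'(t)\| = \frac{1}{\sqrt{2}}\sqrt{\cos^2 t + \sin^2 t + 0^2} = \frac{1}{\sqrt{2}}$.
Hence:
$$\begin{aligned}
\mathbf{N}(t) &= \frac{1}{\|\mathbf{T}'(t)\|}\,\mathbf{T}'(t) \\
&= \frac{1}{1/\sqrt{2}}\cdot\frac{1}{\sqrt{2}}\,(-\cos t, -\sin t, 0) \\
&= (-\cos t, -\sin t, 0).
\end{aligned}$$
As a check, we verify that the unit tangent and principal normal are orthogonal, as predicted
by the theory:
$$\begin{aligned}
\mathbf{T}(t)\cdot\mathbf{N}(t) &= \frac{1}{\sqrt{2}}\bigl((-\sin t)(-\cos t) + (\cos t)(-\sin t) + 1\cdot 0\bigr) \\
&= \frac{1}{\sqrt{2}}\bigl(\sin t\cos t - \cos t\sin t + 0\bigr) = 0.
\end{aligned}$$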
The remaining step is to find a third and final coordinate direction, that is, a unit vector in
R3 orthogonal to both T(t) and N(t). There are two choices for such a vector, and, in the next
section, we describe a systematic way to single out one of them, denoted B(t). Once this is done,
we will have a basis of orthogonal unit vectors (T(t), N(t), B(t)) at each point of the curve that is
naturally adapted to the motion along the curve.
Our goal for the moment is this: given vectors v and w in R3 , find a vector, written v × w,
that is orthogonal to both v and w. We concentrate initially not so much on what v × w actually
is but rather on requiring it to have a certain critical property. Namely, whatever v × w is, we
insist that it satisfy the following.
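Key Property. For all x in R3, $\mathbf{x}\cdot(\mathbf{v}\times\mathbf{w}) = \det\begin{bmatrix} x_1 & x_2 & x_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix}$, that is:
$$\mathbf{x}\cdot(\mathbf{v}\times\mathbf{w}) = \det\begin{bmatrix} \mathbf{x} & \mathbf{v} & \mathbf{w} \end{bmatrix} \tag{P}$$
where $\begin{bmatrix} \mathbf{x} & \mathbf{v} & \mathbf{w} \end{bmatrix}$ denotes the 3 by 3 matrix whose rows are x, v, w.
For instance, i × j must satisfy:
$$\begin{aligned}
\mathbf{x}\cdot(\mathbf{i}\times\mathbf{j}) = \det\begin{bmatrix} \mathbf{x} & \mathbf{i} & \mathbf{j} \end{bmatrix}
&= \det\begin{bmatrix} x_1 & x_2 & x_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \\
&= x_1\cdot(0-0) - x_2\cdot(0-0) + x_3\cdot(1-0) \\
&= x_3 \quad\text{for all } \mathbf{x} = (x_1, x_2, x_3) \text{ in } \mathbf{R}^3.
\end{aligned}$$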
But x · k = (x1 , x2 , x3 ) · (0, 0, 1) = x3 for all x as well, so k satisfies the key property (P) required
of i × j. We would expect that i × j = k.
Assuming for the moment that v × w can be defined in general to satisfy (P), our main goal
follows immediately.
Proposition 2.14. v × w is orthogonal to v and w (Figure 2.12).
Proof. We take dot products and use the key property:
v · (v × w) = det [v v w] = 0 (two equal rows ⇒ det = 0).
Thus v × w is orthogonal to v. For the same reason, w · (v × w) = det [w v w] = 0.
Working backwards, it is not hard to verify that the formula given by (×) does indeed satisfy
(P) (Exercise 5.8), so that we have actually accomplished something.
The odd looking determinant in (×) is somewhat illegitimate in that the entries in the top row
are vectors, not scalars. It is meant to be calculated in the usual way by expansion along the top
row. To understand how it works, perhaps it is best to do some examples.
Example 2.15. First, we reproduce our result for i × j, putting i = (1, 0, 0) and j = (0, 1, 0) in the
second and third rows of (×) and expanding along the first row:
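$$\mathbf{i}\times\mathbf{j} = \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} = \mathbf{i}\cdot(0-0) - \mathbf{j}\cdot(0-0) + \mathbf{k}\cdot(1-0) = \mathbf{k},$$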
as expected.
Similarly, j × k = i and k × i = j.
Example 2.16. If v = (1, 2, 3) and w = (4, 5, 6), then:
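$$\begin{aligned}
\mathbf{v}\times\mathbf{w} &= \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \\
&= \mathbf{i}\cdot(12-15) - \mathbf{j}\cdot(6-12) + \mathbf{k}\cdot(5-8) \\
&= -3\mathbf{i} + 6\mathbf{j} - 3\mathbf{k} \\
&= (-3, 6, -3).
\end{aligned}$$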
As a check against possible computational error, one can go back and confirm that the result is
orthogonal to both v and w.
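The suggested check is easy to automate. The sketch below expands the symbolic determinant along its top row, exactly as in the examples, and then takes dot products with v and w. The function names `cross` and `dot` are illustrative and not from the text.

```python
def cross(v, w):
    # expansion along the top row (i, j, k) of the symbolic determinant
    return (
        v[1] * w[2] - v[2] * w[1],
        -(v[0] * w[2] - v[2] * w[0]),
        v[0] * w[1] - v[1] * w[0],
    )

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

v, w = (1, 2, 3), (4, 5, 6)
n = cross(v, w)
print(n, dot(v, n), dot(w, n))   # prints (-3, 6, -3) 0 0
```

Both dot products vanish, which is the orthogonality promised by Proposition 2.14.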
Example 2.17. With the same v and w as in the previous example, let u = (1, −2, 3). Then:
u · (v × w) = (1, −2, 3) · (−3, 6, −3) = −3 − 12 − 9 = −24. (2.1)
Suppose that we use the same three factors but in a permuted order, say w · (v × u). Taking
advantage of the result of (2.1), what can you say about the value of this product without calculating
the individual dot and cross products involved?
We use (P) and properties of the determinant:
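$$\begin{aligned}
\mathbf{w}\cdot(\mathbf{v}\times\mathbf{u}) &= \det\begin{bmatrix} \mathbf{w} & \mathbf{v} & \mathbf{u} \end{bmatrix} && \text{(property (P))} \\
&= -\det\begin{bmatrix} \mathbf{u} & \mathbf{v} & \mathbf{w} \end{bmatrix} && \text{(row switch)} \\
&= -\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}) && \text{(property (P))} \\
&= -(-24) && \text{(by (2.1))} \\
&= 24.
\end{aligned}$$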
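The sign change under a row switch can also be confirmed by brute force. The following sketch computes both 3 by 3 determinants by cofactor expansion along the first row; the helper name `det3` is an assumption made for this illustration.

```python
def det3(r1, r2, r3):
    # cofactor expansion along the first row of the matrix with rows r1, r2, r3
    return (
        r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
        - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
        + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0])
    )

u, v, w = (1, -2, 3), (1, 2, 3), (4, 5, 6)
print(det3(u, v, w), det3(w, v, u))   # prints -24 24
```

By property (P), the first value is u · (v × w) and the second is w · (v × u), so the output agrees with (2.1) and with the row-switch argument.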
(Justification. This appears as Exercise 5.9. It uses v · w = ‖v‖ ‖w‖ cos θ.)
The length formula (2.2) for v × w has some useful consequences:
(b) Consider the parallelogram determined by v and w. We can find its area by thinking of v as
base and dropping a perpendicular from w to get the height, as in the case of Figure 1.10 in
Chapter 1. See Figure 2.13.
In other words:
(c) We can now describe the geometric interpretation of 3 by 3 determinants as volume that
was mentioned in Chapter 1. Let u, v, and w be points in R3 such that 0, u, v, and w
are not coplanar. This determines a “parallelepiped,” which is the 3-dimensional analogue of
a parallelogram in the plane, as shown in Figure 2.14. It consists of all points of the form
au + bv + cw, where a, b, c are scalars such that 0 ≤ a ≤ 1, 0 ≤ b ≤ 1, and 0 ≤ c ≤ 1. Then:
|det [u v w]| is the volume of the parallelepiped determined by u, v, and w.
This follows from property (P), which we can use to reverse course and take what we’ve
learned about dot and cross products to tell us about determinants. We leave the details
for the exercises (Exercise 5.13 to be precise), though the relevant ingredients are labeled in
Figure 2.14. In the figure, ϕ denotes the angle between u and v × w.
Proposition 2.19. If v and w are vectors in R3 , neither a scalar multiple of the other, then the
triple (v, w, v × w) is right-handed.
In R3 , the right-handed orientation gets its name from a rule of thumb known as the “right-hand
rule.” This is based on a convention regarding the standard basis (i, j, k), namely, if you rotate the
fingers of your right hand from i to j, your thumb points in the direction of k. This is not so
much a fact as it is an agreement: whenever we draw R3 , we agree to orient the positive x, y, and
z-axes so that the right-hand rule for (i, j, k) is satisfied, as in Figure 2.15. As we continue with
further material, figures in R3 become increasingly prominent, so it is probably good to state this
convention explicitly.
More generally, given a basis (v, w, z) of R3 , the plane P spanned by v and w separates R3
into two half-spaces, or “sides.” Once we adopt the right-hand rule for (i, j, k), it turns out that
(v, w, z) is right-handed in the sense of the definition if and only if, when you rotate the fingers
of your right hand from v to w, your thumb lies on the same side of P as z. In particular, by
Proposition 2.19, your thumb lies on the same side as v × w. In this way, your right thumb gives
the direction of v × w.
Proving the connection between the sign of a determinant and the right-hand rule would be
too great a digression. One approach uses the continuity of the determinant and the geometry of
rotations to show that, if the rule works for (i, j, k), then it works in general. In any case, the
right-hand rule is sometimes a convenient way to find cross products geometrically. For instance,
in addition to i × j = k, it allows us to see that j × k = i, not −i, and that k × i = j, as pictured
in Figure 2.15.
These definitions make sense only if α′(t) ≠ 0 and T′(t) ≠ 0, so we assume that this is the case
from now on. Both T(t) and N(t) are constructed to be unit vectors, and we showed before that
T(t) · N(t) = 0. At this point, the third basis vector is easily defined.
Definition. The cross product B(t) = T(t) × N(t) is called the binormal vector.
As a cross product, B(t) is orthogonal to T(t) and N(t). Moreover, its length is ‖B(t)‖ =
‖T(t)‖ ‖N(t)‖ sin θ = 1 · 1 · sin(π/2) = 1. Thus (T(t), N(t), B(t)) is a collection of orthogonal unit
vectors. The vectors are called the Frenet vectors of α and are illustrated in Figure 2.16.
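To see the three definitions working together, here is a rough numerical sketch that builds T, N, and B for an arbitrary path by following them literally: normalize α′ to get T, normalize T′ to get N, and take B = T × N. The derivatives are approximated by central differences, and the names `derivative`, `frenet`, and `helix` as well as the step size h are assumptions made here, not part of the text.

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def scale(c, p):
    return tuple(c * a for a in p)

def unit(p):
    return scale(1.0 / math.sqrt(sum(a * a for a in p)), p)

def cross(v, w):
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def derivative(path, t, h=1e-4):
    # central-difference approximation of path'(t)
    return scale(1.0 / (2 * h), sub(path(t + h), path(t - h)))

def frenet(alpha, t):
    def T(s):
        return unit(derivative(alpha, s))      # unit tangent: alpha' / ||alpha'||
    N = unit(derivative(T, t))                 # principal normal: T' / ||T'||
    B = cross(T(t), N)                         # binormal: T x N
    return T(t), N, B

def helix(t):
    return (math.cos(t), math.sin(t), t)

T, N, B = frenet(helix, 0.7)
print(T, N, B)
```

For the helix α(t) = (cos t, sin t, t), the values of T and N should agree with the closed forms found in Example 2.13 up to the finite-difference error.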
Example 2.20 (The Frenet vectors of a helix). We shall use the helix with parametrization
α(t) = (a cos t, a sin t, bt)
as a running example to illustrate new concepts as they arise. Here, a and b are constant, and
a > 0. Recall that the helix is called right-handed if b > 0 and left-handed if b < 0. If b = 0, the
curve collapses to a circle of radius a in the xy-plane.
To find the Frenet vectors T(t), N(t), and B(t) of the helix, we simply and carefully follow the
definitions.
Unit tangent: α′(t) = (−a sin t, a cos t, b), so $\|\alpha'(t)\| = \sqrt{a^2\sin^2 t + a^2\cos^2 t + b^2} = \sqrt{a^2+b^2}$. Thus:
$$\mathbf{T}(t) = \frac{1}{\|\alpha'(t)\|}\,\alpha'(t) = \frac{1}{\sqrt{a^2+b^2}}\,(-a\sin t, a\cos t, b).$$
$$\begin{aligned}
\mathbf{N}(t) = \frac{1}{\|\mathbf{T}'(t)\|}\,\mathbf{T}'(t)
&= \frac{1}{a/\sqrt{a^2+b^2}}\cdot\frac{1}{\sqrt{a^2+b^2}}\,(-a\cos t, -a\sin t, 0) \\
&= \frac{1}{a}\,(-a\cos t, -a\sin t, 0) \\
&= (-\cos t, -\sin t, 0).
\end{aligned}$$
Binormal: Lastly:
i j k
Curvature of a helix: $\kappa(t) = \dfrac{a/\sqrt{a^2+b^2}}{\sqrt{a^2+b^2}} = \dfrac{a}{a^2+b^2}$.
In particular, for a circle of radius a, in which case b = 0, the curvature is $\kappa = \frac{a}{a^2+0^2} = \frac{1}{a}$.
Next, to measure wobbling, we look at the binormal B, which is orthogonal to the plane spanned
by T and N. This plane is the “plane of motion” in the sense that it contains the velocity and
acceleration vectors (see Exercise 9.4), so B′ can be thought of as representing the rate of wobble
of that plane.
We prove in Lemma 2.24 below that B′(t) is always a scalar multiple of N(t), that is:
B′(t) = c(t)N(t),
where c(t) is a scalar that depends on t. For the moment, let’s accept this as true. Then c(t)
represents the rate of wobble. We again normalize by dividing by the speed. Moreover, the
convention is to throw in a minus sign.
Definition. Given that B′(t) = c(t)N(t) as above, then the torsion is defined by:
$$\tau(t) = -\frac{c(t)}{v(t)},$$
where v(t) is the speed.
1
What Exercise 1.13 in Chapter 9 shows is that the curvature is essentially a function of the points on the curve
C in the sense that, if x ∈ C and if there is only one value of the parameter t such that α(t) = x, then the quantity
‖T′(t)‖/v(t) is the same for all such parametrizations of C. Thus we could denote the curvature by κ(x) and calculate it
without worrying about which parametrization we use.
Example 2.22 (The torsion of a helix). To find the torsion of the helix α(t) = (a cos t, a sin t, bt),
we again piggyback on the calculations of Example 2.20. For instance, using the result that
$\mathbf{B}(t) = \frac{1}{\sqrt{a^2+b^2}}(b\sin t, -b\cos t, a)$, we obtain $\mathbf{B}'(t) = \frac{1}{\sqrt{a^2+b^2}}(b\cos t, b\sin t, 0)$. To find c(t), we want to
manipulate this to be a scalar multiple of N(t) = (− cos t, − sin t, 0):
$$\mathbf{B}'(t) = \frac{1}{\sqrt{a^2+b^2}}\,(b\cos t, b\sin t, 0) = -\frac{b}{\sqrt{a^2+b^2}}\,(-\cos t, -\sin t, 0) = -\frac{b}{\sqrt{a^2+b^2}}\,\mathbf{N}(t).$$
Torsion of a helix: $\tau(t) = -\dfrac{c(t)}{v(t)} = -\dfrac{-b/\sqrt{a^2+b^2}}{\sqrt{a^2+b^2}} = \dfrac{b}{a^2+b^2}$.
Note that the torsion is positive if the helix is right-handed (b > 0) and negative if left-handed
(b < 0). In the case of a circle (b = 0), the torsion is zero. This reflects the fact that planar curves
don’t wobble in space at all.
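The curvature and torsion of the helix can be recovered numerically from the Frenet vectors alone. The sketch below takes the closed-form T, N, and B of the helix, differentiates T and B by central differences, and then applies κ = ‖T′‖/v and τ = −c/v with c read off as B′ · N. The function names, the step size h, and the sample values a = 3, b = 4 are assumptions chosen for illustration.

```python
import math

def helix_frame(t, a, b):
    # closed-form Frenet vectors of alpha(t) = (a cos t, a sin t, b t)
    r = math.hypot(a, b)
    T = (-a * math.sin(t) / r, a * math.cos(t) / r, b / r)
    N = (-math.cos(t), -math.sin(t), 0.0)
    B = (b * math.sin(t) / r, -b * math.cos(t) / r, a / r)
    return T, N, B

def curvature_and_torsion(t, a, b, h=1e-5):
    v = math.hypot(a, b)                               # speed of the helix
    T1, _, B1 = helix_frame(t + h, a, b)
    T0, _, B0 = helix_frame(t - h, a, b)
    _, N, _ = helix_frame(t, a, b)
    dT = [(p - q) / (2 * h) for p, q in zip(T1, T0)]   # approximates T'(t)
    dB = [(p - q) / (2 * h) for p, q in zip(B1, B0)]   # approximates B'(t)
    kappa = math.sqrt(sum(x * x for x in dT)) / v
    c = sum(p * q for p, q in zip(dB, N))              # B' = c N, so c = B' . N
    return kappa, -c / v

kappa, tau = curvature_and_torsion(0.4, a=3.0, b=4.0)
print(kappa, tau)
```

With a = 3 and b = 4, the formulas give κ = 3/25 and τ = 4/25, and replacing b by −4 flips the sign of the torsion while leaving the curvature unchanged.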
To complete the discussion of torsion, it remains to justify the claim that B′ is a scalar multiple
of N. The argument uses a product rule for the cross product.
Proposition 2.23 (Product rules). Let I be an interval of real numbers, and let α, β : I → Rn be
differentiable paths and f : I → R a differentiable real-valued function. Then:
Proof. The first of these was proven earlier (Proposition 2.10), and the others are the same, swapping
in the appropriate type of product. We leave them as exercises (Exercise 7.3).
Lemma 2.24. The derivative B′ of the binormal is always a scalar multiple of the principal normal
N.
Proof. We show that B′ is orthogonal (a) to B and (b) to T. The only vectors orthogonal to both
are precisely the scalar multiples of N, so the lemma follows.
(a) B is a unit vector, so ‖B‖ is constant. Hence B and its derivative B′ are always orthogonal
(Corollary 2.12).
(b) By definition, B = T × N, so using the product rule:
B′ = T′ × N + T × N′
= (‖T′‖ N) × N + T × N′ (definition of N)
= ‖T′‖ 0 + T × N′ (v × v = 0 always)
= T × N′.
T′ = ‖T′‖N = κvN.
For (3), we know that B′ = cN and τ = −c/v for some scalar-valued function c. Hence c = −τv,
and B′ = −τvN.
Lastly, for (2), the right-hand rule gives N = B × T (see Figure 2.16). Therefore, according to
the product rule and the two Frenet-Serret formulas that were just proven:
N′ = B′ × T + B × T′
= −τvN × T + B × (κvN).
But, again by the right-hand rule, N × T = −B and B × N = −T. Substituting into the last
expression gives N′ = −κvT + τvB, completing the proof.
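As a concrete check of the formula N′ = −κvT + τvB, the sketch below plugs in the closed-form data for the helix α(t) = (a cos t, a sin t, bt): speed √(a² + b²), curvature a/(a² + b²), torsion b/(a² + b²), and the Frenet vectors from Examples 2.20 and 2.22. The function name `check_second_formula` and the sample parameters are illustrative assumptions.

```python
import math

def check_second_formula(t, a, b):
    r = math.hypot(a, b)                  # speed v of the helix
    kappa, tau = a / r**2, b / r**2       # curvature and torsion of the helix
    T = (-a * math.sin(t) / r, a * math.cos(t) / r, b / r)
    B = (b * math.sin(t) / r, -b * math.cos(t) / r, a / r)
    dN = (math.sin(t), -math.cos(t), 0.0)  # derivative of N = (-cos t, -sin t, 0)
    rhs = tuple(-kappa * r * p + tau * r * q for p, q in zip(T, B))
    # largest componentwise discrepancy between N' and -kappa v T + tau v B
    return max(abs(x - y) for x, y in zip(dN, rhs))

print(check_second_formula(1.1, 2.0, -0.5))
```

The returned discrepancy should be at the level of floating-point rounding, for left-handed helices (b < 0) as well as right-handed ones.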
The same conclusion applies to rotations, though we don’t really have the tools to give a rigorous
proof at this point. So we try an intuitive explanation. Suppose that α is rotated in R3 to obtain
a new path β. The velocities α0 (t) and β 0 (t) are related by the same rotation, and then so are
the respective Frenet vectors (T(t), N(t), B(t)). So the Frenet vectors of the two paths are not the
same, but, since they are rotated versions of one another, the corresponding vectors change at the
same rates. In particular, the speed, curvature, and torsion are the same.
Our main theorem is a converse to these considerations. That is, we show that, if two paths α
and β have the same speed, curvature, and torsion, then they differ by a translation and/or rotation
in the sense that there is a composition of translations and rotations F : R3 → R3 that transforms
α into β, i.e., F (α(t)) = β(t) for all t. Hence measuring the three scalar quantities v, κ, τ suffices to
determine the geometry of a space curve. It is striking and satisfying that such a complete answer
is possible. In the spirit of the theorems in plane geometry about congruent triangles (SAS, ASA,
SSS, etc.), we call the theorem the “vκτ theorem.”
Theorem 2.26 (vκτ theorem). Let I be an interval of real numbers, and let α, β : I → R3 be
differentiable paths in R3 such that v and T′ are nonzero so that the Frenet vectors are defined for
all t. Then either path can be translated and rotated so that it fits exactly on top of the other if and
only if they have:
In particular, two paths with the same speed, curvature, and torsion are congruent to each other.
Proof. That translations and rotations preserve speed, curvature, and torsion was discussed above.
For the converse, assume that α and β have the same speed, curvature, and torsion. We proceed
in three stages to move α on top of β.
Step 1. A translation.
Choose now and for the rest of the proof a point a in the interval I. We first translate α so
that α(a) moves to β(a). In other words, we shift α by the constant vector d = β(a) − α(a) to get
a new path γ(t) = α(t) + d = α(t) + β(a) − α(a). Then:
Moreover, as a translate, γ has the same speed, curvature, and torsion as α and hence as β as well.
Step 2. Two rotations.
Let x0 denote the common point γ(a) = β(a). The unit tangents to γ and β at x0 may not
be equal, but we can rotate γ about x0 until they are. Then, rotate γ again using this tangent
direction as axis of rotation until the principal normals at x0 coincide, too. The binormals are then
automatically the same, since they are the cross products of T and N. Moreover, neither rotation
changes the speed, curvature, or torsion.
Call this final path α̃. In this way, we have translated and rotated α into a path α̃ satisfying
the following three conditions:
• α̃(a) = β(a) = x0,
• at the point x0, α̃ and β have the same Frenet vectors, and
• α̃ and β have the same speed, curvature, and torsion for all t.
Step 3. A calculation.
We show that α̃ = β. Since α̃ has been obtained from α by translation and rotation, the
theorem follows.
The argument uses the Frenet-Serret formulas, which we repeat for convenience:
T′ = κvN, N′ = −κvT + τvB, B′ = −τvN.
f(t) = T̃(t) · T(t) + Ñ(t) · N(t) + B̃(t) · B(t),
−(τv Ñ · B + τv B̃ · N).
Note that the theorem does not say what happens when a curve is reflected in a plane, that is,
how the curvature and torsion of a curve are related to those of its mirror image. For a clue as to
what happens in this case, see Exercise 9.14.
2.10 Exercises for Chapter 2
1.3. α : R → R2 , α(t) = $(e^t, e^{-t})$ (Hint: How are the x and y-coordinates related?)
1.13. Find a parametrization of the line in R3 that passes through the point a = (1, 2, 3) and is
parallel to v = (4, 5, 6).
1.14. Let ℓ be the line in R3 parametrized by α(t) = (1 + 2t, −3t, 4 + 5t). Find a point that lies on
ℓ and a vector that is parallel to ℓ.
1.15. Find a parametrization of the line in R3 that passes through the points a = (1, 0, 0) and
b = (0, 0, 1).
1.16. Find a parametrization of the line in R3 that passes through the points a = (1, −2, 3) and
b = (−4, 5, −6).
1.17. Let ℓ be the line in Rn parametrized by α(t) = a + tv, and let p be a point in Rn . Suppose
that q is the foot of the perpendicular dropped from p to ℓ (Figure 2.17).
(a) Since q lies on ℓ, there is a value of t such that q = a + tv. Find a formula for t in terms
of a, v, and p. (Hint: ℓ lies in a direction perpendicular to p − q.)
(b) Find a formula for the point q.
1.18. Let ℓ be the line in R3 parametrized by α(t) = (1 + t, 1 + 2t, 1 + 3t), and let p = (0, 0, 4).
Find the foot of the perpendicular dropped from p to ℓ.
(a) Explain why the problem of determining whether ℓ and m intersect in R3 amounts to
solving a system of three equations in two unknowns t1 and t2 .
(b) Determine whether the lines parametrized by α(t) = (−1, 0, 1) + t (1, −3, 2) and β(t) =
(1, 5, 4) + t (−1, 1, 6) intersect by writing down the corresponding system of equations,
as described in part (a), and solving the system.
(c) Repeat for the lines parametrized by α(t) = (2, 1, 0) + t (0, 0, 1) and β(t) = (0, 1, 3) +
t (1, 0, 0).
2.4. Let a, b be real numbers, a > 0, and let α(t) = (a cos t, a sin t, bt).
2.5. Find a parametrization of the line through the points a = (1, 1, 1) and b = (2, 3, 4) that
traverses the line with constant speed 1.
3.3. Let C be the line segment in R2 from (1, 0) to (0, 1). With as little calculation as possible,
find the values of the following integrals.
(a) $\int_C 5\,ds$
(b) $\int_C (x + y)\,ds$
(a) If the speed is constant, show that the velocity and acceleration vectors are always
orthogonal. (Hint: Consider ‖v(t)‖².)
(b) Conversely, if the velocity and acceleration are always orthogonal, show that the speed
is constant.
4.4. Let C be a curve in Rn traced out by a differentiable path α : (a, b) → Rn defined on an open
interval (a, b). Assume that there is a point α(t0 ) on C that is closer to the origin than any
other point of C. Prove that the tangent vector α′(t0) is orthogonal to α(t0). Draw a sketch
that illustrates the conclusion.
5.5. Find a nonzero vector in R3 that points in a direction perpendicular to the plane that contains
the origin and the points p = (1, 1, 1) and q = (2, 1, −3).
5.6. Find a nonzero vector in R3 that points in a direction perpendicular to the plane that contains
the points p = (1, 0, 0), q = (0, 1, 0), and r = (0, 0, 1).
5.7. Let v = (−1, 1, −2), w = (4, −1, −1), and u = (−3, 2, 1).
(a) Find v × w.
(b) What is the area of the parallelogram determined by v and w?
(c) Find u · (v × w). Then, use your answer to find v · (u × w) and (u × v) · w.
(d) What is the volume of the parallelepiped determined by u, v, and w?
5.8. Verify that the cross product v × w as defined in equation (×) satisfies the key property that
x · (v × w) = det [x v w] for all x in R3 .
(a) Use the definitions of the dot and cross products in terms of coordinates to prove that:
(b) Use part (a) to give a proof that ‖v × w‖ = ‖v‖ ‖w‖ sin θ, where θ is the angle between
v and w.
5.10. Do there exist nonzero vectors v and w in R3 such that v · w = 0 and v × w = 0? Explain.
5.11. In R3 , let ℓ be the line parametrized by α(t) = a + tv, and let p be a point.
$$d = \frac{\|(\mathbf{p} - \mathbf{a})\times\mathbf{v}\|}{\|\mathbf{v}\|}.$$
The following calculations are kind of messy, but the conclusion may be unexpected.
In Exercises 9.4–9.8, α is a path in R3 with velocity v(t), speed v(t), acceleration a(t), and
Frenet vectors T(t), N(t), and B(t). You may assume that v(t) ≠ 0 and T′(t) ≠ 0 for all t so that
the Frenet vectors are defined.
(a) v = vT
(b) a = v′T + κv²N (Hence v′ is the tangential component of acceleration and κv² the
normal component.)
9.7. Let C be the curve in R3 parametrized by α(t) = (t, t², t³). Use the formula from the previous
exercise to find the curvature at the point (1, 1, 1).
9.8. According to Exercise 9.4, the velocity and acceleration vectors lie in the plane spanned by
T and N. Thus, like the binormal vector B, the cross product v × a is orthogonal to that
plane. It follows that the unit vector in the direction of v × a must be ±B. Show that in fact
it is B. In other words, show that $\mathbf{B} = \frac{\mathbf{v}\times\mathbf{a}}{\|\mathbf{v}\times\mathbf{a}\|}$.
(a) Find the curvature κ(t). (You may use the result of Exercise 9.6 if you wish.)
(b) For this curve, what does v × a tell you about the torsion τ (t)?
w = τ v T + κv B.
9.11. Suppose that the Darboux vector w = τ v T + κv B from the previous exercise is a constant
vector.
(a) Prove that τv and κv are both constant. (Hint: Calculate w′, and use the Frenet-Serret
formulas. Note that T and B need not be constant even if w is.)
(b) If α has constant speed, prove that α is congruent to a helix.
9.12. Let α be a path in R3 that lies on the sphere of radius a centered at the origin, that is:
Let T(t) be the unit tangent vector, N(t) the principal normal, v(t) the speed, and κ(t) the
curvature. Assume that v(t) ≠ 0 and T′(t) ≠ 0 for all t so that the Frenet vectors are defined.
(c) Show that κ(t) ≥ 1/a for all t.
• constant speed v = 1,
• positive torsion τ (t), and
• binormal vector $\mathbf{B}(t) = \left(-\frac{\sqrt{2}}{2},\ \frac{\sqrt{2}}{2}\sin 2t,\ \frac{\sqrt{2}}{2}\cos 2t\right)$.
(a) Find, in whatever order you wish, the unit tangent vector T(t), the principal normal
N(t), the curvature κ(t), and the torsion τ (t).
(b) In addition, if α(0) = (0, 0, 0), find a formula for α(t) as a function of t. (Hint: Use your
formula for T(t).)
9.14. In this problem, we determine the effect of a reflection, specifically the reflection in the xy-
plane, on the Frenet vectors, the curvature, and the torsion of a path. To fix some notation,
if p = (x, y, z) is a point in R3 , let p∗ = (x, y, −z) denote its reflection in the xy-plane.
Now, let α(t) = (x(t), y(t), z(t)) be a path in R3 whose Frenet vectors are defined for all t,
and let β(t) be its reflection:
Real-valued functions
Chapter 3
Real-valued functions: preliminaries
Thus far, we have studied functions α for which the input is a real number t and the output is a
vector α(t) = (x1 (t), x2 (t), . . . , xn (t)). The techniques from calculus that we used were basically
familiar from first-year calculus. Now, we reverse the roles and consider functions where the input is
a vector x = (x1 , x2 , . . . , xn ) and the output is a real number y = f (x) = f (x1 , x2 , . . . , xn ). Thinking
of the coordinates x1 , x2 , . . . , xn as variables, these are real-valued functions of n variables. More
formally, they are functions f : A → R where the domain A is a subset of Rn . When more than
one variable is involved, we shall need methods that go beyond first-year calculus.
Example 3.1. We have seen that a helix can be parametrized by α(t) = (a cos t, a sin t, bt). The
values of a and b give the radius of the cylinder around which the helix winds and the rate at which
it rises, respectively. If a and b vary, the size and shape of the helix change.
The curvature of the helix is given by $\kappa = \frac{a}{a^2+b^2}$. We can think of this as a real-valued function
that describes how the curvature depends on the geometric parameters a and b. In a way, it’s like
a function on the set of helices, but, strictly speaking, it is a function κ : A → R whose domain
as far as helices go is A = {(a, b) ∈ R2 : a > 0}, a subset of R2 . Likewise, the torsion of a helix
b
τ = a2 +b 2 is a real-valued function of the same two variables with the same domain.
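To make the idea of a function of two variables concrete, here is a minimal Python sketch of the curvature and torsion of a helix as functions of (a, b). The function names are our own choice for illustration, and the check a > 0 simply mirrors the domain A described above.

```python
def helix_curvature(a: float, b: float) -> float:
    """Curvature a / (a^2 + b^2) of the helix (a cos t, a sin t, b t)."""
    if a <= 0:
        raise ValueError("(a, b) is outside the domain A, which requires a > 0")
    return a / (a**2 + b**2)


def helix_torsion(a: float, b: float) -> float:
    """Torsion b / (a^2 + b^2) of the same helix, with the same domain."""
    if a <= 0:
        raise ValueError("(a, b) is outside the domain A, which requires a > 0")
    return b / (a**2 + b**2)
```

For instance, with b = 0 the helix degenerates to a circle of radius a, and `helix_curvature(a, 0.0)` returns 1/a, as one would expect.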
56 CHAPTER 3. REAL-VALUED FUNCTIONS: PRELIMINARIES
For instance, consider the cross-sections with the three coordinate planes. The yz-plane is where
x = 0, so the equation of the intersection of the graph with this plane is $z = \sqrt{0^2 + y^2} = |y|$ and
x = 0, a V-shaped curve. Similarly, the cross-section with the xz-plane is z = |x|, y = 0, another
V-shaped curve at right angles to the first. These cross-sections are shown in Figure 3.1. Lastly,
the cross-section with the xy-plane, where z = 0, is $0 = \sqrt{x^2 + y^2}$, z = 0. This consists of the single
point (0, 0, 0).
Figure 3.1: Two cross-sections with coordinate planes: z = |y|, x = 0 (left) and z = |x|, y = 0
(right)
In general, cross-sections with horizontal planes z = c are also useful. Here, they are described
by $c = \sqrt{x^2 + y^2}$, or $x^2 + y^2 = c^2$, where c ≥ 0. This is a circle of radius c in the plane z = c. As c
increases, the circles get bigger. Some of them are shown in Figure 3.2.
By putting all this information together, we can assemble a pretty good picture of the whole
graph $z = \sqrt{x^2 + y^2}$. It is the circular cone with vertex at the origin pictured in Figure 3.3.
Figure 3.3: A circular cone: the graph $z = \sqrt{x^2 + y^2}$
3.1. GRAPHS AND LEVEL SETS 57
The information about horizontal cross-sections in Figure 3.2 can also be presented by projecting
the cross-sections onto the xy-plane and presenting them like a topographical map. As we saw
above, the cross-section with z = c is described by the points (x, y) such that x2 + y 2 = c2 , a
circle of radius c. When projected onto the xy-plane, this is called a level curve. Sketching a
representative sample of level curves in one picture and labeling them with the corresponding value
of c is called a contour map. In this case, it consists of concentric circles about the origin. See
Figure 3.4. The function f is constant on each circle and increases uniformly as you move out.
Figure 3.4: Level curves of $f(x, y) = \sqrt{x^2 + y^2}$
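The statement that f is constant on each circle can be checked numerically. The following Python sketch samples points on the level curve f(x, y) = c and evaluates f at them. The helper name `level_curve_points` and the choice of equally spaced angles are assumptions made only for illustration.

```python
import math


def f(x: float, y: float) -> float:
    """The function f(x, y) = sqrt(x^2 + y^2), whose graph is the cone."""
    return math.hypot(x, y)


def level_curve_points(c: float, n: int = 8) -> list:
    """Return n points on the level curve f(x, y) = c.

    For c >= 0 this is the circle of radius c about the origin
    (a single repeated point when c = 0). For c < 0 the level set is
    empty, because f never takes negative values.
    """
    if c < 0:
        return []
    return [
        (c * math.cos(2 * math.pi * k / n), c * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]
```

Evaluating `f` at every point returned by `level_curve_points(c)` gives c up to rounding error, which is the sense in which the contour map labels each circle with a single value.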
A level set is a subset of the domain A, which is contained in Rn . So we can hope to draw level
sets only for functions of one, two, or three variables.
Example 3.3. Let f : R2 → R be given by f (x, y) = x2 − y 2 . Sketch some level sets and the graph
of f .
Level sets: We set f (x, y) = x2 − y 2 = c and choose a few values of c. For example:
c         Level set
c = 1     $x^2 - y^2 = 1$, a hyperbola
c = 0     $x^2 - y^2 = 0$, or y = ±x, two lines
c = −1    $x^2 - y^2 = -1$, or $y^2 - x^2 = 1$, a hyperbola
These curves and a couple of others are sketched in Figure 3.5.
[Figure 3.5: level curves of $f(x, y) = x^2 - y^2$, labeled with values of c from −2 to 2]
The graph: For the graph, the level sets are taken out of the xy-plane and raised to height z = c.
Furthermore, identifying the cross-sections with the coordinate planes provides some additional
framework.
Coordinate plane      Equation of cross-section in that plane
yz-plane: x = 0       $z = -y^2$, a downward parabola
xz-plane: y = 0       $z = x^2$, an upward parabola
These are shown in Figure 3.6.
Reconstructing the graph from these pieces gives a saddle-like surface, also known as a hy-
perbolic paraboloid. See Figure 3.7. It has the distinctive feature that, in the cross-section with
the yz-plane, the origin looks like a local maximum, while, in the xz-plane, it looks like a local
minimum. (This last information can be deduced from the contour map, too. See Figure 3.5.)
Equation of a sphere: x2 + y 2 + z 2 = a2
The sphere of radius 1 is shown in Figure 3.8, left.
3.2. MORE SURFACES IN R3 59
Figure 3.8: The unit sphere x2 + y 2 + z 2 = 1 (left) and the cylinder x2 + y 2 = 1 (right)
Figure 3.9: Cross-sections with the yz-plane (left), xz-plane (middle), and horizontal planes (right)
These cross-sections fit together to form a surface that looks like a nuclear cooling tower, as
shown in Figure 3.10.
In fact, each of these last three surfaces is a level set of a function of three variables. The
sphere is the level set of f (x, y, z) = x2 + y 2 + z 2 corresponding to c = a2 ; the cylinder the
level set of f (x, y, z) = x2 + y 2 corresponding to c = a2 ; and the cooling tower the level set of
f (x, y, z) = x2 + y 2 − z 2 corresponding to c = 1.
This suggests a way to visualize the behavior of functions of three variables, which would require
four dimensions to graph. Namely, determine the level sets given by f (x, y, z) = c. Then the way
these surfaces morph into one another as c varies reflects which points get mapped to which values.
For example, the function f (x, y, z) = x2 + y 2 + z 2 is constant on its level sets, which are spheres,
and the larger the sphere, the greater the value of f .
Example 3.7. A graph of a function f (x, y) of two variables can be regarded as a level set, too, but
of a function of three variables. For let F (x, y, z) = f (x, y) − z. Then the graph z = f (x, y) is the
level set of F corresponding to c = 0. Both are defined by the condition F (x, y, z) = f (x, y) − z =
0.
Ax + By + Cz = D. (3.1)
To understand what this level set is, we rewrite (3.1) as a dot product (A, B, C) · (x, y, z) = D, or,
writing the typical point (x, y, z) of R3 as x:
n · x = D,
where n = (A, B, C). Suppose that we happen to know one particular point p on the level set.
Then n · p = D, so all points on the level set satisfy:
n · x = D = n · p. (3.2)
3.3. THE EQUATION OF A PLANE IN R3 61
In other words, n·(x−p) = 0. This says that x lies on the level set if and only if x−p is orthogonal
to n. Geometrically, this is true if and only if x lies in the plane that is perpendicular to n and
that passes through p. This is illustrated in Figure 3.11. We say that n is a normal vector to the
plane. Thus the level sets of the linear transformation T are planes, all having as normal vector
the vector n whose components are the entries of the matrix that represents T .
For future reference, we record the results of (3.1) and (3.2).
Proposition 3.8.
1. In R3 , the equation of the plane through p with normal vector n is:
n · x = n · p, where x = (x, y, z).
We use the form n · x = n · p. As a point in the desired plane, we can take p = (1, 2, 3). From
its parametrization, the line α passes through the point a = (4, 5, 6) and is parallel to v = (7, 8, 9).
Hence as normal vector to the plane, we can use the cross product:
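$$\begin{aligned}
\overrightarrow{ap} \times v = (p - a) \times v &= \big((1, 2, 3) - (4, 5, 6)\big) \times (7, 8, 9) \\
&= (-3, -3, -3) \times (7, 8, 9) \\
&= \det \begin{bmatrix} i & j & k \\ -3 & -3 & -3 \\ 7 & 8 & 9 \end{bmatrix} \\
&= (-3, 6, -3) \\
&= -3(1, -2, 1).
\end{aligned}$$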
The scalar multiple n = (1, −2, 1) is also a normal vector, and, since it’s a little simpler, it’s the
one we use. Substituting into n · x = n · p gives x − 2y + z = 1 − 4 + 3, that is, x − 2y + z = 0.
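As a numerical check of this example, the following Python sketch recomputes the cross product and confirms that the plane x − 2y + z = 0 contains both the point p and points of the line. The helper names `cross`, `dot`, and `on_plane` are our own and are not taken from the text.

```python
def cross(u, v):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(ui * vi for ui, vi in zip(u, v))


p = (1, 2, 3)  # point in the desired plane
a = (4, 5, 6)  # point on the line
v = (7, 8, 9)  # direction of the line

ap = tuple(pi - ai for pi, ai in zip(p, a))  # displacement p - a
normal = cross(ap, v)                        # a normal vector to the plane
n = (1, -2, 1)                               # simpler scalar multiple of normal


def on_plane(x):
    """True if x satisfies n . x = n . p."""
    return dot(n, x) == dot(n, p)
```

Here `normal` reproduces the vector (−3, 6, −3) found above, and `on_plane` can be applied to p, to a, and to any point a + tv of the line.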
equations that define the planes, the normals are n1 = (1, 1, −1) and n2 = (1, −2, 3). Then
n1 · n2 = 1 − 2 − 3 = −4, which is nonzero. The planes are not perpendicular.
Definition. Let a be a point in Rn , and assume that r > 0. The open ball with center a and
radius r, denoted B(a, r), is defined to be:
B(a, r) = {x ∈ Rn : ‖x − a‖ < r}.
Example 3.11. In R, an open ball is an open interval (a − r, a + r). In R2 , it is a disk, the region
in the interior of a circle. These are shown in Figure 3.14. In R3 , an open ball is a solid ball, the
region in the interior of a sphere.
3.4. OPEN SETS 63
Definition. A subset U of Rn is called open if, given any point a in U , there exists a positive real
number r such that B(a, r) ⊂ U .
So to show that a set U is open, one starts with an arbitrary point a of U and finds a value of r
so that the open ball about a of radius r stays entirely within U . The value of r depends typically
on the conditions that define the set U as well as on the point a. For the time being, we shall be
content to find a concrete expression for r and accept geometric intuition to justify that it works.
If, for some point a in U , there is no value of r that works, then U is not open.
Example 3.12. In R2 , let U be the open ball B((0, 0), 1) of radius 1 centered at the origin. This
is the set of all x in R2 such that ‖x‖ < 1. We verify that this open ball is in fact an open set.
Let a be a point of U , and let r = 1 − ‖a‖. Note that r > 0 since ‖a‖ < 1. Then B(a, r) ⊂ U ,
so, given any a, we’ve found an r that works. See Figure 3.15. Hence U is an open set.
Those who would like a more detailed justification that B(a, r) ⊂ U should see Exercise
7.2. Note that the expression r = 1 − ‖a‖ gets smaller as a approaches the boundary of the disk.
That this is necessary makes sense from the geometry of the situation. By modifying the argument
slightly, one can show that every open ball in Rn is an open set in Rn (Exercise 7.3).
Example 3.13. Prove that the set U = {(x, y) ∈ R2 : x > 0, y < 0} is open in R2 .
This is the fourth quadrant of the plane, not including the coordinate axes. Let a be a point
of U . Write a = (c, d), so c > 0 and d < 0. The closest point on the boundary of U lies on one of
the axes, either c or |d| units away (Figure 3.16). We choose r to be whichever of these is smaller,
that is, r = min{c, |d|}. Then B(a, r) ⊂ U . Thus U is open.
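The choice r = min{c, |d|} can be tested numerically, though sampling is only evidence and not a proof. The following Python sketch picks a point a = (c, d) of U, computes r, and checks that sampled points of B(a, r) stay in U. The function names and the sampling scheme (a polar grid with radii strictly less than r) are assumptions made for illustration.

```python
import math


def in_U(x: float, y: float) -> bool:
    """Membership in U = {(x, y) : x > 0, y < 0}, the open fourth quadrant."""
    return x > 0 and y < 0


def radius_for(c: float, d: float) -> float:
    """The radius r = min{c, |d|} used in the example, for a = (c, d) in U."""
    if not in_U(c, d):
        raise ValueError("the point (c, d) must lie in U")
    return min(c, abs(d))


def sample_ball(c: float, d: float, r: float, n_angles: int = 36, n_radii: int = 10):
    """Sample points of the open ball B((c, d), r) on a polar grid.

    The sampled distances from the center are r*j/n_radii for j < n_radii,
    so every sample lies strictly inside the ball.
    """
    points = []
    for i in range(n_angles):
        t = 2 * math.pi * i / n_angles
        for j in range(n_radii):
            s = r * j / n_radii
            points.append((c + s * math.cos(t), d + s * math.sin(t)))
    return points
```

For example, with a = (0.3, −2.0) the radius is 0.3, and every sampled point of the ball satisfies x > 0 and y < 0.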
Example 3.14. Is U = {(x, y) ∈ R2 : x > 0, y ≤ 0} open in R2 ?
This is the same set as in the previous example with the positive x-axis added in. The change is
enough to keep the set from being open. For instance, the point a = (1, 0) is in U , but every open
ball about a includes points outside of U , namely, points in the upper half-plane. This is shown in
Figure 3.17. Thus no value of r works for this choice of a.
Example 3.15. If U and V are open sets in Rn , prove that their intersection U ∩ V is open, too.
Let a be a point of U ∩ V . Then a ∈ U , and, since U is open, there exists a positive number
r1 such that B(a, r1 ) ⊂ U . Likewise, a ∈ V , and there exists r2 such that B(a, r2 ) ⊂ V . Let
r = min{r1 , r2 } (Figure 3.18). Then B(a, r) ⊂ B(a, r1 ) ⊂ U and B(a, r) ⊂ B(a, r2 ) ⊂ V , so
U3 = B((0, 0), 1/3), etc. Each Un is open, but the only point common to all of them is the origin.
Thus U1 ∩ U2 ∩ · · · = {(0, 0)}. This is no longer an open set.
There is also a notion of closed set, though the definition may not be what one would guess.
Example 3.16. Let K = {x ∈ R2 : ‖x‖ ≤ 1}. This is the open ball centered at the origin of radius
1 together with its boundary, the unit circle, as shown in Figure 3.19. Its complement R2 − K is
the set of all points x such that ‖x‖ > 1, which is an open set. (Briefly, given a in R2 − K, then
r = ‖a‖ − 1 works in the definition of open set.) Hence K is closed.
Example 3.17. Let K = {x ∈ R2 : either ‖x‖ < 1 or $x^2 + y^2 = 1$ where y ≥ 0}. This is the same
as the closed ball of the preceding example except that the lower semicircle has been deleted. See
Figure 3.20.
K is no longer closed: for instance, the point a = (0, −1) belongs to R2 − K, but no open ball
about a is contained entirely within R2 − K. At the same time, K is not open either: the point
a = (0, 1) belongs to K, but no open ball about a stays within K.
3.5 Continuity
We are ready for continuity. Intuitively, the idea is that a function f is continuous at a point a if
limx→a f (x) = f (a). That is, as x gets close to a, f (x) gets close to f (a). We make this precise by
expressing the requirement in terms of open balls.
Definition. Let U be an open set in Rn , and let f : U → R be a real-valued function. We say that
f is continuous at a point a of U if, given any open ball B(f (a), ε) about f (a), there exists an
open ball B(a, δ) about a such that:
f (B(a, δ)) ⊂ B(f (a), ε).
In other words, if x ∈ B(a, δ), then f (x) ∈ B(f (a), ε) (Figure 3.21). Equivalently, writing out the
definition of open ball, f is continuous at a if, given any ε > 0, there exists a δ > 0 such that:
if ‖x − a‖ < δ, then |f (x) − f (a)| < ε.
Figure 3.21: f maps the open ball B(a, δ) about a into the open ball (f (a) − ε, f (a) + ε) about
f (a).
The definition says that f is continuous at a if you can guarantee that f (x) will be as close
to f (a) as you want (“within ε”) by making sure that x is close enough to a (“within δ”). The
strategy for proving that a function is continuous is similar formally to proving that a set is open.
There, one starts with a point a and tries to find a radius r. Here, one starts with an ε and tries
to find a δ.
It may seem that using the open ball B(f (a), ε) in the definition is an unnecessarily confusing
way to write the interval (f (a) − ε, f (a) + ε), but we have in mind the generalization of continuity
to vector-valued functions in Chapter 6. The definition phrased in terms of open balls extends
naturally to the more general case, as we shall see.
Definition. A function f : U → R is simply called continuous if it is continuous at every point
of its domain U .
Example 3.18. Let f : R2 → R be defined by f (x, y) = x. In other words, f is the projection of
the xy-plane onto the x-axis. Is f a continuous function?
Let a be a point of R2 . To get a sense of what answer to expect, we consider limx→a f (x).
(Of course, we haven’t defined rigorously what a limit is yet, so this is meant to be completely
informal.) Write x = (x, y) and a = (c, d). Then:
$$\lim_{x \to a} f(x) = \lim_{(x,y) \to (c,d)} f(x, y) = \lim_{(x,y) \to (c,d)} x = c.$$
At the same time, f (a) = f (c, d) = c, which agrees. Thus we expect intuitively that f is continuous
at a.
To prove this rigorously, let ε > 0 be given. We want to find a radius δ so that the open ball
B((c, d), δ) is projected inside the open interval B(f (c, d), ε) = B(c, ε) = (c − ε, c + ε). From the
geometry of the projection, taking δ = ε works. See Figure 3.22. That is, if (x, y) ∈ B((c, d), ε),
then its x-coordinate satisfies c − ε < x < c + ε. In other words, c − ε < f (x, y) < c + ε, or
f (x, y) ∈ (c − ε, c + ε). Therefore δ = ε works in the definition of continuity.
Actually, δ could be anything less than ε, too, but the definition requires only that we find one
δ that works, not all of them.
Figure 3.23: The graph $z = \frac{xy}{x^2+y^2}$: can it be made continuous at (0, 0)?
Say we set f (0, 0) = c. First, the intuitive approach: we want to know if it’s possible to choose
c so that:
$$\lim_{(x,y) \to (0,0)} f(x, y) = c.$$
To get a feel for this, we try approaching the origin in a couple of different ways. For instance,
suppose that we approach along the x-axis. Then $f(x, 0) = \frac{x \cdot 0}{x^2 + 0^2} = 0$, so:
$$\lim_{(x,0) \to (0,0)} f(x, 0) = \lim_{(x,0) \to (0,0)} 0 = 0.$$
On the other hand, if we approach along the line y = x, then $f(x, x) = \frac{x \cdot x}{x^2 + x^2} = \frac{1}{2}$, so:
$$\lim_{(x,x) \to (0,0)} f(x, x) = \lim_{(x,x) \to (0,0)} \frac{1}{2} = \frac{1}{2}.$$
Thus f (x, y) can approach different values depending on how you approach the origin. As a result,
intuitively, the limit does not exist, so f cannot be continuous regardless of what c is.
To prove this rigorously, we need to articulate what it means for the criterion for continuity to
fail. The definition says that a function is continuous if, for every ε, there exists a δ. The negation
of this is that there is some ε for which there is no δ. We try to find such a bad ε.
Our intuitive calculations above showed that there are points arbitrarily close to the origin
where f = 0 and other points arbitrarily close where f = 1/2. We choose ε so that these values of f
cannot both be within ε of f (0, 0) = c.
For this, let ε = 1/8. Then B(f (0, 0), ε) = B(c, 1/8) = (c − 1/8, c + 1/8), an interval of length 1/4. As just
noted, in any open ball B((0, 0), δ) about the origin, there will be points (x, 0) where f (x, 0) = 0
and points (x, x) where f (x, x) = 1/2. It’s impossible to fit both these values inside an interval of
length 1/4. So for ε = 1/8, there is no δ > 0 such that f (B((0, 0), δ)) ⊂ (c − 1/8, c + 1/8). Thus f is not
continuous at (0, 0) no matter what the value of f (0, 0) is.
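The two approach paths used in this example are easy to reproduce numerically. The following Python sketch evaluates f(x, y) = xy/(x² + y²) along the x-axis and along the line y = x at points closer and closer to the origin. It illustrates the argument but does not replace it, and the particular sample distances are arbitrary.

```python
def f(x: float, y: float) -> float:
    """f(x, y) = xy / (x^2 + y^2), defined for (x, y) != (0, 0)."""
    return x * y / (x**2 + y**2)


# Points approaching the origin along two different lines.
distances = [1e-1, 1e-3, 1e-6]
along_x_axis = [f(t, 0.0) for t in distances]  # values of f(x, 0)
along_diagonal = [f(t, t) for t in distances]  # values of f(x, x)
```

The first list stays at 0 and the second stays at 1/2 no matter how small the distances are, so the two sets of values are never closer together than 1/2. That is why they cannot both fit in an interval of length 1/4.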
The εδ-definition of continuity is awkward and difficult to work with. One of its virtues, however,
is that, not only can it be used to verify that a function is continuous, but it also gives an
unambiguous condition for proving that a function is not. That is the moral of the preceding
example.
Example 3.20. In the same spirit, let $f(x, y) = \frac{x^3+y^3}{x^2+y^2}$ for all (x, y) ≠ (0, 0). The graph of f is
shown in Figure 3.24. Is it possible to assign a value to f (0, 0) so that f is continuous at (0, 0)?
Figure 3.24: The graph $z = \frac{x^3+y^3}{x^2+y^2}$
As before, intuitively, in order to be continuous, f (0, 0) should equal $\lim_{(x,y) \to (0,0)} \frac{x^3+y^3}{x^2+y^2}$, if the
limit exists. We try approaching the origin along various lines:
Along the x-axis:        $f(x, 0) = \frac{x^3+0^3}{x^2+0^2} = x \to 0$ as $x \to 0$
Along the y-axis:        $f(0, y) = \frac{0^3+y^3}{0^2+y^2} = y \to 0$ as $y \to 0$
Along the line y = mx:   $f(x, mx) = \frac{x^3+m^3x^3}{x^2+m^2x^2} = \frac{1+m^3}{1+m^2}\,x \to 0$ as $x \to 0$
3.6. SOME PROPERTIES OF CONTINUOUS FUNCTIONS 69
We get a consistent answer of 0, but this does not prove that the limit is 0. We have not
exhausted all possible ways of approaching the origin, and being close to the origin is different
from being close to it along any individual curve. Nevertheless, the evidence suggests that choosing
f (0, 0) = 0 might make f continuous, and this gives us something concrete to shoot for.
So set f (0, 0) = 0, and let ε > 0 be given. We want to find a δ > 0 such that, if ‖(x, y) − (0, 0)‖ <
δ, then |f (x, y) − 0| < ε. This is satisfied automatically when (x, y) = (0, 0) for any value of δ since
f (0, 0) = 0, so we assume (x, y) ≠ (0, 0) and look for a δ such that, if ‖(x, y)‖ < δ, then $\left|\frac{x^3+y^3}{x^2+y^2}\right| < \epsilon$.
To hunt for a connection between the quantities ‖(x, y)‖ and $\frac{x^3+y^3}{x^2+y^2}$, we introduce polar coordinates:
x = r cos θ and y = r sin θ, where $r = \sqrt{x^2 + y^2} = \|(x, y)\|$. See Figure 3.25.
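One way the polar substitution can be used, offered here as a sketch and not necessarily the route the text takes: in polar coordinates the quotient becomes r(cos³θ + sin³θ), and since |cos³θ + sin³θ| ≤ |cos θ|³ + |sin θ|³ ≤ 2, we get |f(x, y)| ≤ 2‖(x, y)‖. Under that bound, δ = ε/2 would be one workable choice. The Python code below checks the inequality on sampled points. The names `bound_holds` and `delta_for` are our own.

```python
import math


def f(x: float, y: float) -> float:
    """f(x, y) = (x^3 + y^3) / (x^2 + y^2), defined for (x, y) != (0, 0)."""
    return (x**3 + y**3) / (x**2 + y**2)


def bound_holds(x: float, y: float) -> bool:
    """Check the inequality |f(x, y)| <= 2 * ||(x, y)|| at one point."""
    return abs(f(x, y)) <= 2 * math.hypot(x, y)


def delta_for(eps: float) -> float:
    """One choice of delta suggested by the bound above (an assumption)."""
    return eps / 2


def sample_points(radius: float, n_angles: int = 72):
    """Points at distance `radius` from the origin, at equally spaced angles."""
    return [
        (radius * math.cos(2 * math.pi * k / n_angles),
         radius * math.sin(2 * math.pi * k / n_angles))
        for k in range(n_angles)
    ]
```

For a given ε, sampling points at a distance slightly less than `delta_for(eps)` from the origin and evaluating `f` there gives values of absolute value less than ε, consistent with the bound.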
Example 3.22. We showed earlier in Example 3.18 that the projection onto the x-axis f : R2 → R,
f (x, y) = x, is continuous. Likewise, the projection onto the y-axis g : R2 → R, g(x, y) = y, is
continuous. It then follows immediately from the proposition that functions like x + y, 3x, xy,
$x^2 = x \cdot x$, $x^3 - 4x^2y^2 + 5y$, and, at points other than (0, 0), $\frac{xy}{x^2+y^2}$
are all continuous.
Proof. We prove the statement about the sum f + g and leave the rest for the exercises. (See
Exercises 7.4–7.7.) For this, we want to argue that, as x gets close to a, f (x) + g(x) gets close
to f (a) + g(a). Since f (x) and g(x) get close to f (a) and g(a) individually, this conclusion seems
clear, and it’s just a matter of feeding the intuition into the formal definition. The key ingredient
in the argument below is the “triangle inequality.”
Let ε > 0 be given. We need to come up with a δ > 0 such that f (x) + g(x) ∈ B(f (a) + g(a), ε)
whenever x ∈ B(a, δ), in other words, a δ such that:
Note that:
|(f (x) + g(x)) − (f (a) + g(a))| = |(f (x) − f (a)) + (g(x) − g(a))|. (3.3)
The triangle inequality says that, for any real numbers r and s, |r + s| ≤ |r| + |s|. This seems
reasonable, but it has a generalization to Rn that is important enough that we discuss it separately
in the next section. Accepting that it is true for the time being, we find from equation (3.3) that:
|(f (x) + g(x)) − (f (a) + g(a))| ≤ |f (x) − f (a)| + |g(x) − g(a)|. (3.4)
This separates the f contribution and the g contribution into terms that can be manipulated
independently.
Since f is continuous at a, taking the positive quantity ε/2 as the given input in the definition
of continuity, there is a δ1 > 0 such that |f (x) − f (a)| < ε/2 whenever x ∈ B(a, δ1 ). Similarly, there
is a δ2 > 0 such that |g(x) − g(a)| < ε/2 whenever x ∈ B(a, δ2 ). Let δ = min{δ1 , δ2 }. Then the last
two inequalities remain true for all x in B(a, δ), and substituting them into (3.4) gives:
|(f (x) + g(x)) − (f (a) + g(a))| < ε/2 + ε/2 = ε.
Proposition 3.23. Compositions of continuous functions are continuous. That is, let U be an
open set in Rn and V an open set in R, and let f : U → R and g : V → R be functions such
that f (x) ∈ V for all x in U . (This assumption guarantees that the composition g ◦ f : U → R is
defined.) Then, if f and g are continuous, so is g ◦ f .
Example 3.24. We accept without comment that the familiar functions of one variable from first-year
calculus that were said to be continuous there really are continuous. This includes |x|, $\sqrt[n]{x}$,
sin x, cos x, $e^x$, and ln x. The proposition then implies that functions like $\sqrt{x^2 + y^2}$, sin(x + y),
and $e^{xy}$ are continuous. For example, the first is the composition $(x, y) \mapsto x^2 + y^2 \mapsto \sqrt{x^2 + y^2}$,
which is a composition of continuous steps.
3.7. THE CAUCHY-SCHWARZ AND TRIANGLE INEQUALITIES 71
Proof. Let a be a point of U , and let ε > 0 be given. We want to find an open ball about a that
is mapped by g ◦ f inside the interval B(g(f (a)), ε) = (g(f (a)) − ε, g(f (a)) + ε). We work our
way backwards from g(f (a)) to a using the definition of continuity twice to find such a ball. The
ingredients are depicted in Figure 3.26.
First, since g is continuous at the point f (a), there exists δ′ > 0 such that:
g(B(f (a), δ′)) ⊂ B(g(f (a)), ε).
But then since f is continuous at a, treating δ′ as the given input, there exists δ > 0 such that:
f (B(a, δ)) ⊂ B(f (a), δ′).
Then:
(g ◦ f )(B(a, δ)) = g(f (B(a, δ))) ⊂ g(B(f (a), δ′)) ⊂ B(g(f (a)), ε).
Figure 3.26: The composition of continuous functions: given an ε, a δ′ exists (g is continuous), then
a δ exists (f is continuous).
|v · w| ≤ ‖v‖ ‖w‖
for all v, w in Rn .
Proof. Recall that v · w = ‖v‖ ‖w‖ cos θ, where θ is the angle between v and w. Thus since
| cos θ| ≤ 1, we have |v · w| = ‖v‖ ‖w‖ | cos θ| ≤ ‖v‖ ‖w‖.
From this follows a generalization of the property of R that was alluded to in the proof of
Proposition 3.21.
Theorem 3.26 (Triangle inequality).
‖v + w‖ ≤ ‖v‖ + ‖w‖
for all v, w in Rn .
Proof. It’s equivalent to square both sides and prove that $\|v + w\|^2 \le (\|v\| + \|w\|)^2$. But:
$$\begin{aligned}
\|v + w\|^2 &= (v + w) \cdot (v + w) \\
&= v \cdot v + v \cdot w + w \cdot v + w \cdot w \\
&= \|v\|^2 + 2\, v \cdot w + \|w\|^2. \qquad (3.5)
\end{aligned}$$
Also, v · w ≤ |v · w| ≤ ‖v‖ ‖w‖, where the second inequality is Cauchy-Schwarz. Substituting this
into (3.5) gives:
$$\|v + w\|^2 \le \|v\|^2 + 2\|v\|\,\|w\| + \|w\|^2 = (\|v\| + \|w\|)^2.$$
Figure 3.27: The geometry of the triangle inequality: the length ‖v + w‖ of one side of a triangle
is less than or equal to the sum ‖v‖ + ‖w‖ of the lengths of the other two sides.
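Both inequalities can be spot-checked on concrete vectors. The following Python sketch does so for vectors given as tuples of equal length. The helper names and the small tolerance `tol`, which absorbs floating-point rounding in the equality cases, are assumptions made for this illustration.

```python
import math


def dot(v, w):
    """Dot product of two vectors of the same length."""
    return sum(vi * wi for vi, wi in zip(v, w))


def norm(v):
    """Euclidean norm ||v|| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))


def cauchy_schwarz_holds(v, w, tol: float = 1e-12) -> bool:
    """Check |v . w| <= ||v|| ||w||, up to rounding error."""
    return abs(dot(v, w)) <= norm(v) * norm(w) + tol


def triangle_inequality_holds(v, w, tol: float = 1e-12) -> bool:
    """Check ||v + w|| <= ||v|| + ||w||, up to rounding error."""
    total = [vi + wi for vi, wi in zip(v, w)]
    return norm(total) <= norm(v) + norm(w) + tol
```

When w is a positive multiple of v, for example v = (1, 2) and w = (2, 4), both sides of each inequality agree up to rounding. This is the degenerate case in which the triangle of Figure 3.27 collapses to a line segment.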
3.8 Limits
We used our intuition about limits as a way to think about continuity. Now that continuity has
been defined rigorously, we turn the tables and use continuity to give a formal definition of limits.
One technical, though important, point is that, in determining how f (x) behaves as x ap-
proaches a, what is happening right at a does not matter. The value of f (a), or even whether f is
defined there, is irrelevant.
Definition. Let U be an open set in Rn , and let a be a point of U . If f is a real-valued function
that is defined on U , except possibly at the point a, then we say that limx→a f (x) exists if there is
a number L such that the function f̃ : U → R defined by:
$$\tilde{f}(x) = \begin{cases} f(x) & \text{if } x \neq a, \\ L & \text{if } x = a \end{cases}$$
is continuous at a. In that case, we write limx→a f (x) = L.
$$\lim_{(x,y) \to (0,0)} \frac{x^3 + y^3}{x^2 + y^2} = 0.$$
Because of the close connection between the two concepts, properties of continuous functions
have analogues for limits. Here is the version of Proposition 3.21 for limits.
Proposition 3.28. Assume that limx→a f (x) and limx→a g(x) both exist. Then:
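1. $\lim_{x \to a} \big(f(x) + g(x)\big) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x)$.
2. $\lim_{x \to a} c f(x) = c \lim_{x \to a} f(x)$ for any scalar c.
3. $\lim_{x \to a} f(x) g(x) = \big(\lim_{x \to a} f(x)\big)\big(\lim_{x \to a} g(x)\big)$.
4. $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}$, provided that $\lim_{x \to a} g(x) \neq 0$.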
1.1. f (x, y) = x2 + y 2
1.2. f (x, y) = x2 + y 2 + 1
1.3. f (x, y) = y − x2
1.4. f (x, y) = x + y
1.6. f (x, y) = xy
1.7. $f(x, y) = \frac{y}{x^2+1}$
1.8. The graph of the function f (x, y) = x3 − 3xy 2 is called a monkey saddle. You will need to
bring one along whenever you invite a monkey to go riding with you. The level sets of f are
not easy to sketch directly, but it is still possible to get a reasonable idea of what the graph
looks like.
(a) Sketch the level set corresponding to c = 0. (Hint: $x^3 - 3xy^2 = x(x + \sqrt{3}\,y)(x - \sqrt{3}\,y)$.)
(b) Draw the region of the xy-plane in which f (x, y) > 0 and the region in which f (x, y) < 0.
(Hint: Part (a) might help.)
(c) Use the information from parts (a) and (b) to make a rough sketch of the monkey saddle.
2.1. x2 + y 2 + z 2 = 4
2.2. x2 + y 2 = 4
2.3. x2 + z 2 = 4
2.4. $x^2 + \frac{y^2}{9} + \frac{z^2}{4} = 1$
In Exercises 2.5–2.9, sketch the level sets corresponding to the indicated values of c for the given
function f (x, y, z) of three variables. Make a separate sketch for each individual level set.
2.5. f (x, y, z) = x2 + y 2 + z 2 , c = 0, 1, 2
2.6. f (x, y, z) = x2 + y 2 , c = 0, 1, 2
2.10. Can the saddle surface z = x2 − y 2 in R3 be described as a level set? If so, how? What about
the parabola y = x2 in R2 ?
2 For more information about hypars, see http://erikdemaine.org/hypar/.
3.9. EXERCISES FOR CHAPTER 3 75
3.1. Find an equation of the plane through the point p = (1, 2, 3) with normal vector n = (4, 5, 6).
3.2. Find an equation of the plane through the point p = (2, −1, 3) with normal vector n =
(1, −4, 5).
3.3. Find a point that lies on the plane x + 3y + 5z = 9 and a normal vector to the plane.
3.4. Find a point that lies on the plane x + z = 1 and a normal vector to the plane.
3.7. Find an equation of the plane that contains the points p = (1, 0, 0), q = (0, 2, 0), and
r = (0, 0, 3).
3.8. Find an equation of the plane that contains the points p = (1, 1, 1), q = (2, −1, 2), and
r = (−1, 2, 0).
3.9. Find an equation of the plane that contains the point p = (1, 1, 1) and the line parametrized
by α(t) = (1, 0, 1) + t (2, 3, −4).
3.11. What point on the plane x − y + 2z = 3 is closest to the point q = (2, 1, −1)? (Hint: Find a
parametrization of an appropriate line through q.)
3.12. Two planes in R3 are perpendicular to each other. Their line of intersection is described by
the parametric equations:
x = 2 − t, y = −1 + t, z = −4 + 2t.
If one of the planes has equation 2x + 4y − z = 4, find an equation for the other plane.
3.14. Sketch and describe some of the level sets of the function f (x, y, z) = x + y + z.
π1 : Ax + By + Cz = D1 and π2 : Ax + By + Cz = D2 .
n · (p2 − p1 ) = D2 − D1 ,
(b) Show that the perpendicular distance d between π1 and π2 is given by:
$$d = \frac{|D_2 - D_1|}{\sqrt{A^2 + B^2 + C^2}}.$$
3.16. Let α be a path in R3 that has constant torsion τ = 0. You may assume that α′(t) ≠ 0 and
T′(t) ≠ 0 for all t so that the Frenet vectors of α are always defined.
f (t) = B · (α(t) − x0 )
4.4. x2 + y 2 = 1
4.6. y > x
4.7. y ≥ x
4.8. Is R2 − {(0, 0)}, the plane with the origin removed, an open set in R2 ? Justify your answer.
4.9. Prove that the union of any collection of open sets in Rn is open. That is, if {Uα } is a
collection of open sets in Rn , then $\bigcup_\alpha U_\alpha$ is open.
Section 5 Continuity
(a) Let a be a positive real number. Show that, if ‖(x, y)‖ < a, then $|f(x, y)| < a^2$.
(b) Use the εδ-definition of continuity to determine whether f is continuous at (0, 0).
$$f(x, y) = \frac{x^2 y^4}{(x^2 + y^4)^2} \quad \text{if } (x, y) \neq (0, 0).$$
Determine what happens to the value of f (x, y) as (x, y) approaches the origin along:
6.1. Let a = (1, 2). Show that the function f : R2 → R given by f (x) = a · x is continuous.
In Exercises 5.1–5.2, you determined the continuity of the following two functions at (0, 0).
Now, consider points (x, y) other than (0, 0). Determine all such points at which f is continuous.
6.2. $f(x, y) = \frac{x^3 y}{x^2 + y^2}$ if (x, y) ≠ (0, 0), f (0, 0) = 0
6.3. $f(x, y) = \frac{x^2 y^4}{(x^2 + y^4)^2}$ if (x, y) ≠ (0, 0)
6.4. Let U be an open set in Rn , and let f : U → R be a function that is continuous at a point a
of U .
(a) If f (a) > 0, show that there exists an open ball B = B(a, r) centered at a such that
f (x) > f (a)/2 for all x in B. (Hint: Let ε = f (a)/2.)
(b) Similarly, if f (a) < 0, show that there exists an open ball B = B(a, r) such that
f (x) < f (a)/2 for all x in B.
In particular, if f (a) ≠ 0, there is an open ball B centered at a throughout which f (x) has
the same sign as f (a).
6.5. Let U be the set of all points (x, y) in R2 such that y > sin x, that is:
Prove that U is an open set in R2 . (Hint: Apply the previous exercise to an appropriate
continuous function.)
7.2. (a) Let a be a point of Rn . If r > 0, show that B(a, r) ⊂ B(0, ‖a‖ + r). Draw a picture that
illustrates the result in R2 . (Hint: To prove a set inclusion of this type, you must show
that, if x ∈ B(a, r), then x ∈ B(0, ‖a‖ + r), that is, if ‖x − a‖ < r, then ‖x − 0‖ < ‖a‖ + r.)
(b) In R2 , if a ∈ B((0, 0), 1) and r = 1 − ‖a‖, show that B(a, r) ⊂ B((0, 0), 1). (Recall that
this inclusion was used in Example 3.12 to prove that B((0, 0), 1) is an open set.)
7.3. Let a be a point in Rn , and let r be a positive real number. Prove that the open ball B(a, r)
is an open set in Rn .
Exercises 7.4–7.7 provide the missing proofs of parts of Proposition 3.21 about properties of
continuous functions. Here, U is an open set in Rn , f, g : U → R are real-valued functions defined
on U , and a is a point of U .
7.4. If f is continuous at a and c is a scalar, prove that cf is continuous at a as well. (Hint: One
approach is to consider the two cases c = 0 and c ≠ 0 separately. Another is to start by
showing that |cf (x) − cf (a)| ≤ (|c| + 1) |f (x) − f (a)|.)
7.5. If f is continuous at a, prove that there exists a δ > 0 such that, if x ∈ B(a, δ), then
|f (x)| < |f (a)| + 1. (Hint: Use Exercise 7.1.)
7.6. If f and g are continuous at a, prove that their product f g is continuous at a. (Hint: Use
an “add zero” trick and the previous exercise to show that there is a δ1 > 0 such that, if
x ∈ B(a, δ1 ), then $|f(x)g(x) - f(a)g(a)| \le (|f(a)| + 1)\,|g(x) - g(a)| + (|g(a)| + 1)\,|f(x) - f(a)|$.)
(a) Prove that $\frac{1}{g}$ is continuous at a. (Hint: Use Exercise 6.4 to show that there exists a
δ1 > 0 such that, if x ∈ B(a, δ1 ), then $\left|\frac{1}{g(x)} - \frac{1}{g(a)}\right| \le \frac{2}{|g(a)|^2}\,|g(x) - g(a)|$.)
(b) Prove that the quotient $\frac{f}{g}$ is continuous at a. (Hint: $\frac{f}{g} = f \cdot \frac{1}{g}$.)
Section 8 Limits
8.1. Prove parts 1 and 3 of Proposition 3.28 about limits of sums and products. (Hint: By using
the corresponding properties of continuous functions in Proposition 3.21, you should be able
to avoid ε’s and δ’s entirely.)
(a) Let f, g, h : U → R be real-valued functions such that f (x) ≤ g(x) ≤ h(x) for all x in
U . If f (a) = h(a)—let’s call the common value c—and if f and h are continuous at a,
prove that g(a) = c and that g is continuous at a as well.
(b) Let f, g, h be real-valued functions defined on U , except possibly at the point a, such
that f (x) ≤ g(x) ≤ h(x) for all x in U , except possibly when x = a. If limx→a f (x) =
limx→a h(x) = L, prove that limx→a g(x) = L, too.
[Figures: Steps 1, 3, 4, 5, and 6]
6. Fold the bottom edge to the lower crease line (the one you made in step 3), once again only
creasing between the diagonals. Unfold back to the square.
7. Repeat for all four sides. At this point, the square should be divided into four concentric
square rings. See the figure on the next page.
8. Turn the square over.
9. The goal now is to create additional creases halfway between the ones constructed so far, for
a total of eight concentric square rings. You can do this by folding the bottom edge up to
previously constructed folds, but use only every other fold. That is, fold the bottom edge
up to the existing creases 7/8, 5/8, 3/8, and 1/8 of the way to the top, again only creasing
between the diagonals. Do this for all four sides.
[Figures: Steps 7 and 9]
10. Turn the paper over. The creases should alternate as you move from one concentric square
to the next: mountain, valley, mountain, valley, . . . .
11. Now, try to fold all the creases. It’s easiest to start from the outer ring and work your way in,
pinching in at the corners and trying to compress adjacent faces of the ring flat as you move
towards the center. Half of the short diagonal creases between the rings need to be reversed
so that the rings nest inside each other, though the paper can be coaxed to do this without
too much trouble. In the end, you should get something shaped like an X, or a four-armed
starfish.
12. Open up the arms of the starfish. The paper will naturally assume the hypar shape.
Chapter 4
Real-valued functions: differentiation
We now extend the notion of derivative from real-valued functions of one variable, as in first-year
calculus, to real-valued functions of $n$ variables $f : U \to \mathbb{R}$, where $U$ is an open set in $\mathbb{R}^n$. The
most natural choice would be to copy the one-variable definition verbatim and let the derivative
at a point $\mathbf{a}$ be $\lim_{\mathbf{x} \to \mathbf{a}} \frac{f(\mathbf{x}) - f(\mathbf{a})}{\mathbf{x} - \mathbf{a}}$. Unfortunately, this involves dividing by a vector, which does not
make sense.
One potential remedy would be to consider $\lim_{\mathbf{x} \to \mathbf{a}} \frac{f(\mathbf{x}) - f(\mathbf{a})}{\|\mathbf{x} - \mathbf{a}\|}$ instead. Here, it turns out that
the limit fails to exist even for very simple functions. For example, for the function $f(x) = x$ of one
variable, this would be $\lim_{x \to a} \frac{x - a}{|x - a|}$, which approaches $+1$ from the right and $-1$ from the left. So
we must look harder to find a formulation of the one-variable derivative that can be generalized.
Figure 4.1: Approximating a differentiable function $f(x)$ by its tangent line $\ell(x)$
\[ \lim_{x \to a} \frac{f(x) - \ell(x)}{x - a} = 0. \tag{4.2} \]
In particular, when $x$ is near $a$, $f(x) - \ell(x)$ must be much much smaller than $x - a$ in order for
$\frac{f(x) - \ell(x)}{x - a}$ to be near 0. Note that $f(x) - \ell(x)$ is the error in using the tangent line to approximate
$f$, so, in order for $f$ to be differentiable at $a$, not only must this error go to 0 as $x$ approaches $a$, it
must go to 0 much faster than $x - a$. This is the principle we generalize in moving to functions of
more than one variable.
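The principle in (4.2) can be observed numerically. The following is a minimal sketch, assuming the usual tangent line $\ell(x) = f(a) + f'(a)(x - a)$ and using $f = \sin$ at $a = 1$ purely as an illustrative choice; the function names are hypothetical.

```python
import math

def tangent_line(f, fprime, a):
    """Return the tangent-line function l(x) = f(a) + f'(a) * (x - a)."""
    fa, slope = f(a), fprime(a)
    return lambda x: fa + slope * (x - a)

def error_ratio(f, fprime, a, x):
    """(f(x) - l(x)) / (x - a): the quantity whose limit is 0 in (4.2)."""
    line = tangent_line(f, fprime, a)
    return (f(x) - line(x)) / (x - a)

if __name__ == "__main__":
    a = 1.0  # illustrative base point (an assumption, not from the text)
    for h in (1e-1, 1e-2, 1e-3):
        # The ratio should shrink roughly in proportion to h.
        print(h, error_ratio(math.sin, math.cos, a, a + h))
```

The printed ratios should shrink along with $h$, which is the sense in which the error goes to 0 faster than $x - a$.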
Definition. A function $\ell : \mathbb{R}^n \to \mathbb{R}^m$ is called an affine function if it has the form $\ell(\mathbf{x}) = T(\mathbf{x}) + \mathbf{b}$,
where:
This happens to have the same form as the tangent line (4.1), which seems like a good sign.
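Substituting into equation (4.3), we say, still provisionally, that $f$ is differentiable at $\mathbf{a}$ if there
exists a linear transformation $T : \mathbb{R}^n \to \mathbb{R}$ such that:
\[ \lim_{\mathbf{x} \to \mathbf{a}} \frac{f(\mathbf{x}) - f(\mathbf{a}) - T(\mathbf{x} - \mathbf{a})}{\|\mathbf{x} - \mathbf{a}\|} = 0. \tag{4.4} \]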
There is really only one choice for $T$ as well. For, as a linear transformation, $T$ is represented
by a matrix, in this case, a 1 by $n$ matrix $A = \begin{bmatrix} m_1 & m_2 & \cdots & m_n \end{bmatrix}$, where $m_j = T(\mathbf{e}_j)$ for
$j = 1, 2, \dots, n$. (See Proposition 1.7.) We shall determine the values of the entries $m_j$.
Suppose that $\mathbf{x}$ approaches $\mathbf{a}$ in the $x_1$-direction, i.e., let $\mathbf{x} = \mathbf{a} + h\mathbf{e}_1$, and let $h$ go to 0. Then
$\mathbf{x} - \mathbf{a} = h\mathbf{e}_1$, and $\|\mathbf{x} - \mathbf{a}\| = |h|$. Assume for a moment that $h > 0$. In order for equation (4.4) to
hold, it must be true that:
\[ \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{e}_1) - f(\mathbf{a}) - hT(\mathbf{e}_1)}{h} = 0. \tag{4.5} \]
If $h < 0$, then the denominator in equation (4.4) is $|h| = -h$. The minus sign can be factored out
and canceled, so equation (4.5) still holds. This can be solved for $T(\mathbf{e}_1)$:
\[ m_1 = T(\mathbf{e}_1) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{e}_1) - f(\mathbf{a})}{h} = \lim_{h \to 0} \frac{f(a_1 + h, a_2, \dots, a_n) - f(a_1, a_2, \dots, a_n)}{h}. \]
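This limit of a difference quotient is the definition of the ordinary one-variable derivative where
$x_2, \dots, x_n$ are held fixed at $x_2 = a_2, \dots, x_n = a_n$, and only $x_1$ varies. It is called the partial
derivative $\frac{\partial f}{\partial x_1}(\mathbf{a})$.
Similarly, by approaching $\mathbf{a}$ in the $x_2, \dots, x_n$ directions, we find that $m_2 = \frac{\partial f}{\partial x_2}(\mathbf{a}), \dots, m_n = \frac{\partial f}{\partial x_n}(\mathbf{a})$
and hence that $A = \begin{bmatrix} \frac{\partial f}{\partial x_1}(\mathbf{a}) & \frac{\partial f}{\partial x_2}(\mathbf{a}) & \cdots & \frac{\partial f}{\partial x_n}(\mathbf{a}) \end{bmatrix}$.
\[ \frac{\partial f}{\partial x_j}(\mathbf{a}) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{e}_j) - f(\mathbf{a})}{h}. \]
Note that $\mathbf{a} + h\mathbf{e}_j = (a_1, \dots, a_j + h, \dots, a_n)$, so $\mathbf{a} + h\mathbf{e}_j$ and $\mathbf{a}$ differ only in the $j$th coordinate.
Thus the partial derivative is defined by the one-variable difference quotient for the derivative
with variable $x_j$.
Other common notations for the partial derivative are $f_{x_j}(\mathbf{a})$ and $(D_j f)(\mathbf{a})$.
2. $Df(\mathbf{a})$ is defined to be the 1 by $n$ matrix $Df(\mathbf{a}) = \begin{bmatrix} \frac{\partial f}{\partial x_1}(\mathbf{a}) & \frac{\partial f}{\partial x_2}(\mathbf{a}) & \cdots & \frac{\partial f}{\partial x_n}(\mathbf{a}) \end{bmatrix}$. The
corresponding vector $\nabla f(\mathbf{a}) = \left( \frac{\partial f}{\partial x_1}(\mathbf{a}), \frac{\partial f}{\partial x_2}(\mathbf{a}), \dots, \frac{\partial f}{\partial x_n}(\mathbf{a}) \right)$ in $\mathbb{R}^n$ is called the gradient of
$f$. (The intent of the notation might be clearer if these objects were denoted by $(Df)(\mathbf{a})$ and
$(\nabla f)(\mathbf{a})$, respectively, but the extra parentheses are usually omitted.)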
Example 4.1. Let $f(x, y) = x^3 + 2x^2 y - 3y^2$, and let $\mathbf{a} = (2, 1)$. To find $\frac{\partial f}{\partial x}(\mathbf{a})$, we fix $y = 1$ and
differentiate with respect to $x$. Since $f(x, 1) = x^3 + 2x^2 - 3$, this gives $\frac{\partial f}{\partial x}(x, 1) = 3x^2 + 4x$ and
then $\frac{\partial f}{\partial x}(2, 1) = 12 + 8 = 20$.
Similarly, for the partial derivative at $\mathbf{a}$ with respect to $y$, $f(2, y) = 8 + 8y - 3y^2$, so $\frac{\partial f}{\partial y}(2, y) = 8 - 6y$
and $\frac{\partial f}{\partial y}(2, 1) = 8 - 6 = 2$.
In practice, this is usually not how partial derivatives are calculated. Instead, one works directly
with the general formula for $f$. For instance, to find the partial derivative with respect to $x$,
differentiate with respect to $x$ thinking of all other variables (in this case, only $y$) as constant:
$\frac{\partial f}{\partial x} = 3x^2 + 4xy - 0 = 3x^2 + 4xy$, so $\frac{\partial f}{\partial x}(2, 1) = 12 + 8 = 20$ as before. Likewise, $\frac{\partial f}{\partial y} = 2x^2 - 6y$, and
$\frac{\partial f}{\partial y}(2, 1) = 8 - 6 = 2$.
In any case, $Df(\mathbf{a}) = \begin{bmatrix} 20 & 2 \end{bmatrix}$, and $\nabla f(\mathbf{a}) = (20, 2)$.
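The values in Example 4.1 can be sanity-checked with difference quotients. The sketch below is only a numerical illustration: it uses a central difference (a swapped-in variant of the one-sided quotient in the definition), and the helper names `partial` and `gradient` are hypothetical.

```python
def f(x, y):
    """The function from Example 4.1."""
    return x**3 + 2 * x**2 * y - 3 * y**2

def partial(func, point, j, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to coordinate j (0-based) at the given point."""
    forward = list(point)
    backward = list(point)
    forward[j] += h
    backward[j] -= h
    return (func(*forward) - func(*backward)) / (2 * h)

def gradient(func, point, h=1e-6):
    """Tuple of estimated partial derivatives, one per coordinate."""
    return tuple(partial(func, point, j, h) for j in range(len(point)))

if __name__ == "__main__":
    # Should be close to the exact gradient (20, 2) found in the example.
    print(gradient(f, (2.0, 1.0)))
```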
We now have all the ingredients needed to state the formal definition of differentiability, as
motivated by equation (4.4).
The function in this last expression is continuous, as it is a composition of sums and products of
continuous pieces. Thus the value of the limit is simply the value of the function at the point (1, 2).
In other words:
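\[ \lim_{(x,y) \to (1,2)} \frac{x^2 + y^2 - 5 - \begin{bmatrix} 2 & 4 \end{bmatrix} \begin{bmatrix} x - 1 \\ y - 2 \end{bmatrix}}{\sqrt{(x - 1)^2 + (y - 2)^2}} = \sqrt{(1 - 1)^2 + (2 - 2)^2} = 0. \]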
This shows that the answer is: yes, f is differentiable at a = (1, 2).
In addition, the preceding calculations show that the first-order approximation of $f(x, y) = x^2 + y^2$
at $\mathbf{a} = (1, 2)$ is:
\[ \ell(x, y) = 2x + 4y - 5. \tag{4.6} \]
For instance, $(x, y) = (1.05, 1.95)$ is near $\mathbf{a}$, and $f(1.05, 1.95) = 1.05^2 + 1.95^2 = 4.905$ while
$\ell(1.05, 1.95) = 2(1.05) + 4(1.95) - 5 = 4.9$. The values are pretty close.
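As a small check on these numbers, the following sketch evaluates $f(x, y) = x^2 + y^2$ and the approximation $\ell(x, y) = 2x + 4y - 5$, along with the error divided by the distance to $(1, 2)$. The function names are hypothetical; the algebra in the comment is the identity $f - \ell = (x-1)^2 + (y-2)^2$.

```python
import math

def f(x, y):
    return x**2 + y**2

def ell(x, y):
    """First-order approximation of f at (1, 2)."""
    return 2 * x + 4 * y - 5

def relative_error(x, y, a=(1.0, 2.0)):
    """|f - ell| divided by the distance from (x, y) to a.

    Since f - ell = (x - 1)^2 + (y - 2)^2, this equals the distance itself,
    so it goes to 0 as (x, y) approaches a.
    """
    dist = math.hypot(x - a[0], y - a[1])
    return abs(f(x, y) - ell(x, y)) / dist

if __name__ == "__main__":
    print(f(1.05, 1.95), ell(1.05, 1.95), relative_error(1.05, 1.95))
```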
The same sort of reasoning can be used to show that, more generally, the function $f(x, y) = x^2 + y^2$
is differentiable at every point of $\mathbb{R}^2$. This is Exercise 1.16.
Example 4.3. Let
\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{if } (x, y) = (0, 0). \end{cases} \]
Its graph is shown in Figure 4.2. Is $f$ differentiable at $\mathbf{a} = (0, 0)$?
Figure 4.2: The graph $z = \frac{xy}{x^2 + y^2}$
Using the definition requires knowing the partial derivatives at $(0, 0)$. We could apply the
quotient rule to the formula that defines $f$, but the result would be valid only when $(x, y) \ne (0, 0)$,
which is precisely what we don't need. So instead we go back to the basic idea behind partial
derivatives, namely, that they are one-variable derivatives where all variables but one are held
constant. For example, to find $\frac{\partial f}{\partial x}(0, 0)$, fix $y = 0$ in $f(x, y)$ and look at the resulting function of
$x$. By the formula for $f$, $f(x, 0) = \frac{x \cdot 0}{x^2 + 0^2} = 0$ if $x \ne 0$. This expression also holds when $x = 0$:
by definition, $f(0, 0) = 0$. Thus $f(x, 0) = 0$ for all $x$. Differentiating this with respect to $x$ gives
$\frac{\partial f}{\partial x}(x, 0) = \frac{d}{dx}\, 0 = 0$, whence $\frac{\partial f}{\partial x}(0, 0) = 0$. A symmetric calculation results in $\frac{\partial f}{\partial y}(0, 0) = 0$. Hence
$Df(0, 0) = \begin{bmatrix} 0 & 0 \end{bmatrix}$.
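Moreover, $\mathbf{x} - \mathbf{a} = \begin{bmatrix} x - 0 \\ y - 0 \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix}$, and $\|\mathbf{x} - \mathbf{a}\| = \sqrt{x^2 + y^2}$. Thus, according to the definition
of differentiability, we need to check if:
\[ \lim_{(x,y) \to (0,0)} \frac{\dfrac{xy}{x^2 + y^2} - 0 - \begin{bmatrix} 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}}{\sqrt{x^2 + y^2}} = 0, \]
i.e., if $\lim_{(x,y) \to (0,0)} \frac{xy}{(x^2 + y^2)^{3/2}} = 0$.
For this, suppose we approach the origin along the line $y = x$. Then we are looking at the
behavior of $\frac{x \cdot x}{(x^2 + x^2)^{3/2}} = \frac{x^2}{2^{3/2} |x|^3} = \frac{1}{2^{3/2} |x|}$. This blows up as $x$ goes to 0. It certainly doesn't
approach 0. Hence there's no way that $\lim_{(x,y) \to (0,0)} \frac{xy}{(x^2 + y^2)^{3/2}}$ can equal 0 (in fact, the limit does
not exist).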
This example shows that a function of more than one variable can fail to be differentiable at a
point even though all of its partial derivatives exist there. This differs from the one-variable case.
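Both halves of this conclusion can be seen numerically. The sketch below (function names hypothetical) shows that the difference quotient for the partial derivative in $x$ at the origin is exactly 0, while the quotient from the definition of differentiability grows without bound along the line $y = x$.

```python
import math

def f(x, y):
    """The function of Example 4.3, with f(0, 0) defined to be 0."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x**2 + y**2)

def partial_x_at_origin(h):
    """Difference quotient for the partial derivative in x at (0, 0)."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def differentiability_quotient(x, y):
    """(f(x, y) - f(0, 0) - Df(0, 0) . (x, y)) / ||(x, y)||, using Df(0, 0) = [0 0]."""
    return (f(x, y) - 0.0 - 0.0) / math.hypot(x, y)

if __name__ == "__main__":
    print(partial_x_at_origin(1e-3))
    for t in (1e-1, 1e-2, 1e-3):
        # Along y = x the quotient equals 1 / (2^(3/2) * |t|), which blows up.
        print(t, differentiability_quotient(t, t))
```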
were differentiable at (0, 0). In fact, we had shown before that f is not continuous at (0, 0) (see
Example 3.19 in Chapter 3). In first-year calculus, discontinuous functions cannot be differentiable.
If the same were true for functions of more than one variable, we could have saved ourselves a lot
of work. We state the result in contrapositive form.
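Proof. We just outline the idea and leave the details for the exercises (Exercise 2.3). If $f$ is
differentiable at $\mathbf{a}$, then $\lim_{\mathbf{x} \to \mathbf{a}} (f(\mathbf{x}) - \ell(\mathbf{x})) = 0$, where $\ell(\mathbf{x})$ is the first-order approximation at $\mathbf{a}$.
In other words:
\[ \lim_{\mathbf{x} \to \mathbf{a}} \left( f(\mathbf{x}) - f(\mathbf{a}) - Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) \right) = 0. \]
The matrix product term $Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})$ goes to 0 as $\mathbf{x}$ approaches $\mathbf{a}$ (it's a continuous function of
$\mathbf{x}$), so the last line becomes $\lim_{\mathbf{x} \to \mathbf{a}} (f(\mathbf{x}) - f(\mathbf{a}) - 0) = 0$. Hence $\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = f(\mathbf{a})$. As a result,
$f$ is continuous at $\mathbf{a}$.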
Example 4.5. Returning to function (4.7) above, we could have simply said from the start that f
is not continuous at (0, 0) and therefore is not differentiable there either, avoiding the definition of
differentiability entirely.
Conversely, we know from first-year calculus that a function can be continuous without being
differentiable. The absolute value function f (x) = |x| is the standard example. It is continuous,
but, because its graph has a corner, it is not differentiable at the origin. Sometimes, we can use
the failure of a one-variable derivative to tell us about the differentiability of a function of more
than one variable.
Proposition 4.6. If some partial derivative $\frac{\partial f}{\partial x_j}$ does not exist at $\mathbf{a}$, then $f$ is not differentiable at
$\mathbf{a}$.
Proof. This follows simply because if $\frac{\partial f}{\partial x_j}(\mathbf{a})$ doesn't exist, then neither does $Df(\mathbf{a})$, so the condition
for differentiability cannot be satisfied.
Example 4.7. One analogue of the absolute value function is the function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = \|(x, y)\| = \sqrt{x^2 + y^2}$.
We have seen that its graph is a cone with vertex at the origin (Example
3.2).
To find $\frac{\partial f}{\partial x}$ at the origin, we fix $y = 0$ to get $f(x, 0) = \sqrt{x^2} = |x|$. The derivative with respect to
$x$ does not exist at $x = 0$, so, by definition, $\frac{\partial f}{\partial x}(0, 0)$ does not exist either. Thus $f$ is not differentiable
at $(0, 0)$ (though it is continuous there).
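The failure of the partial derivative in Example 4.7 shows up as a mismatch between one-sided difference quotients. A minimal sketch, with a hypothetical helper name:

```python
import math

def f(x, y):
    """f(x, y) = ||(x, y)||, whose graph is a cone with vertex at the origin."""
    return math.hypot(x, y)

def x_difference_quotient(h):
    """(f(h, 0) - f(0, 0)) / h, which equals |h| / h for h != 0."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

if __name__ == "__main__":
    # The quotient is +1 from the right and -1 from the left,
    # so the two-sided limit as h -> 0 does not exist.
    print(x_difference_quotient(1e-6), x_difference_quotient(-1e-6))
```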
We have saved the most useful criterion for differentiability for last. To understand why it is
true, however, we take a slight detour.
the mean value theorem applies to each of the differences on the right.
That is, in the second difference, using $x$ as variable and $y = a_2$ fixed, the mean value theorem
implies that there exists a point $\mathbf{p} = (p_1, a_2)$ on the line segment between $\mathbf{a}$ and $\mathbf{c}$ such that:
\[ f(\mathbf{c}) - f(\mathbf{a}) = \frac{\partial f}{\partial x}(\mathbf{p}) \cdot (b_1 - a_1). \tag{4.10} \]
Similarly, for the first difference, using $y$ as variable and $x = b_1$ fixed, there exists a point $\mathbf{q} = (b_1, q_2)$
on the line segment between $\mathbf{c}$ and $\mathbf{b}$ such that:
\[ f(\mathbf{b}) - f(\mathbf{c}) = \frac{\partial f}{\partial y}(\mathbf{q}) \cdot (b_2 - a_2). \tag{4.11} \]
Substituting (4.10) and (4.11) into equation (4.9) results in $f(\mathbf{b}) - f(\mathbf{a}) = \frac{\partial f}{\partial y}(\mathbf{q}) \cdot (b_2 - a_2) + \frac{\partial f}{\partial x}(\mathbf{p}) \cdot (b_1 - a_1)$,
which is the desired conclusion (4.8).
Proof. We give the proof in the case of two variables, that is, $U \subset \mathbb{R}^2$. This contains the main new
idea, and we leave for the reader the generalization to $n$ variables, which is a relatively straightforward
extension.
We use the definition of differentiability. Let $\mathbf{a}$ be a point of $U$. We first analyze the error
$f(\mathbf{x}) - f(\mathbf{a}) - Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})$ in the first-order approximation of $f$ at $\mathbf{a}$. Since $U$ is an open set and
we are interested only in what happens when $\mathbf{x}$ is near $\mathbf{a}$, we may assume that $\mathbf{x}$ lies in an open
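ball about $\mathbf{a}$ that is contained in $U$. Say $\mathbf{a} = (c, d)$ and $\mathbf{x} = (x, y)$. By the result we just obtained
in Example 4.9, there are points $\mathbf{p}$ and $\mathbf{q}$ such that $\|\mathbf{p} - \mathbf{a}\| \le \|\mathbf{x} - \mathbf{a}\|$, $\|\mathbf{q} - \mathbf{a}\| \le \|\mathbf{x} - \mathbf{a}\|$, and:
\[ f(\mathbf{x}) - f(\mathbf{a}) = \frac{\partial f}{\partial x}(\mathbf{p}) \cdot (x - c) + \frac{\partial f}{\partial y}(\mathbf{q}) \cdot (y - d). \]
Also:
\[ Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) = \begin{bmatrix} \frac{\partial f}{\partial x}(\mathbf{a}) & \frac{\partial f}{\partial y}(\mathbf{a}) \end{bmatrix} \begin{bmatrix} x - c \\ y - d \end{bmatrix} = \frac{\partial f}{\partial x}(\mathbf{a}) \cdot (x - c) + \frac{\partial f}{\partial y}(\mathbf{a}) \cdot (y - d). \]
Hence:
\[ f(\mathbf{x}) - f(\mathbf{a}) - Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) = \left( \frac{\partial f}{\partial x}(\mathbf{p}) - \frac{\partial f}{\partial x}(\mathbf{a}) \right) \cdot (x - c) + \left( \frac{\partial f}{\partial y}(\mathbf{q}) - \frac{\partial f}{\partial y}(\mathbf{a}) \right) \cdot (y - d), \]
so:
\[ \begin{aligned} \frac{|f(\mathbf{x}) - f(\mathbf{a}) - Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})|}{\|\mathbf{x} - \mathbf{a}\|} &= \frac{\left| \left( \frac{\partial f}{\partial x}(\mathbf{p}) - \frac{\partial f}{\partial x}(\mathbf{a}) \right) \cdot (x - c) + \left( \frac{\partial f}{\partial y}(\mathbf{q}) - \frac{\partial f}{\partial y}(\mathbf{a}) \right) \cdot (y - d) \right|}{\|\mathbf{x} - \mathbf{a}\|} \\ &\le \left| \frac{\partial f}{\partial x}(\mathbf{p}) - \frac{\partial f}{\partial x}(\mathbf{a}) \right| \cdot \frac{|x - c|}{\|\mathbf{x} - \mathbf{a}\|} + \left| \frac{\partial f}{\partial y}(\mathbf{q}) - \frac{\partial f}{\partial y}(\mathbf{a}) \right| \cdot \frac{|y - d|}{\|\mathbf{x} - \mathbf{a}\|}, \end{aligned} \]
where the last step uses the triangle inequality. Note that $\frac{|x - c|}{\|\mathbf{x} - \mathbf{a}\|} = \frac{\sqrt{(x - c)^2}}{\sqrt{(x - c)^2 + (y - d)^2}} \le 1$. Similarly,
$\frac{|y - d|}{\|\mathbf{x} - \mathbf{a}\|} \le 1$. Therefore:
\[ \frac{|f(\mathbf{x}) - f(\mathbf{a}) - Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})|}{\|\mathbf{x} - \mathbf{a}\|} \le \left| \frac{\partial f}{\partial x}(\mathbf{p}) - \frac{\partial f}{\partial x}(\mathbf{a}) \right| + \left| \frac{\partial f}{\partial y}(\mathbf{q}) - \frac{\partial f}{\partial y}(\mathbf{a}) \right|. \tag{4.12} \]
At last, we bring in the assumption that $f$ has continuous partial derivatives. Since $\|\mathbf{p} - \mathbf{a}\| \le \|\mathbf{x} - \mathbf{a}\|$
and $\|\mathbf{q} - \mathbf{a}\| \le \|\mathbf{x} - \mathbf{a}\|$, it follows that, as $\mathbf{x}$ approaches $\mathbf{a}$, so do $\mathbf{p}$ and $\mathbf{q}$. By the continuity
of the partial derivatives, this means that $\frac{\partial f}{\partial x}(\mathbf{p})$ approaches $\frac{\partial f}{\partial x}(\mathbf{a})$ and $\frac{\partial f}{\partial y}(\mathbf{q})$ approaches $\frac{\partial f}{\partial y}(\mathbf{a})$.
Hence, in the limit as $\mathbf{x}$ goes to $\mathbf{a}$, the right side of (4.12) approaches 0, and therefore so must the
left. By definition, $f$ is differentiable at $\mathbf{a}$.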
of (4.7). We saw that $f$ is not differentiable at the origin. On the other hand, at points other than
the origin, the partial derivatives can be found using the quotient rule. This gives:
\[ \frac{\partial f}{\partial x} = \frac{(x^2 + y^2) \cdot y - xy \cdot 2x}{(x^2 + y^2)^2} = \frac{y^3 - x^2 y}{(x^2 + y^2)^2} = \frac{y(y^2 - x^2)}{(x^2 + y^2)^2} \]
and
\[ \frac{\partial f}{\partial y} = \frac{(x^2 + y^2) \cdot x - xy \cdot 2y}{(x^2 + y^2)^2} = \frac{x(x^2 - y^2)}{(x^2 + y^2)^2}. \]
As algebraic combinations of $x$ and $y$, both of these partial derivatives are continuous on the open
set $\mathbb{R}^2 - \{(0, 0)\}$. As a result, the $C^1$ test says that $f$ is differentiable at all points other than the
origin.
Thus $\frac{f(\alpha(t)) - f(\alpha(t_0))}{t - t_0} \approx \nabla f(\alpha(t_0)) \cdot \frac{\alpha(t) - \alpha(t_0)}{t - t_0}$. In the limit as $t$ approaches $t_0$, the left side approaches
the one-variable derivative $(f \circ \alpha)'(t_0)$ and the right approaches $\nabla f(\alpha(t_0)) \cdot \alpha'(t_0)$. Writing t in
In words, the derivative of the composition is “gradient dot velocity.” While we outlined the
idea why this is true, a complete proof of the result as stated entails a more careful treatment of
the approximations involved. The details appear in the exercises (see Exercise 5.6).
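The "gradient dot velocity" formula is easy to test numerically. The sketch below uses an illustrative function $f(x, y) = x^2 y$ and path $\alpha(t) = (\cos t, \sin t)$; both are assumptions chosen for the demonstration and do not come from the text.

```python
import math

def f(x, y):
    """Illustrative function (an assumption): f(x, y) = x^2 * y."""
    return x**2 * y

def grad_f(x, y):
    """Gradient of f, computed by hand: (2xy, x^2)."""
    return (2 * x * y, x**2)

def alpha(t):
    """Illustrative path (an assumption): the unit circle."""
    return (math.cos(t), math.sin(t))

def alpha_prime(t):
    """Velocity of the path."""
    return (-math.sin(t), math.cos(t))

def chain_rule_derivative(t):
    """Gradient dot velocity, as in the Little Chain Rule."""
    gx, gy = grad_f(*alpha(t))
    vx, vy = alpha_prime(t)
    return gx * vx + gy * vy

def numerical_derivative(t, h=1e-6):
    """Central-difference estimate of the derivative of f(alpha(t))."""
    return (f(*alpha(t + h)) - f(*alpha(t - h))) / (2 * h)

if __name__ == "__main__":
    for t in (0.0, 0.7, 2.0):
        print(t, chain_rule_derivative(t), numerical_derivative(t))
```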
of an open set $U$ in $\mathbb{R}^n$. To represent a direction going away from $\mathbf{a}$, we choose a unit vector $\mathbf{u}$
pointing in that direction. We travel in that direction on the line through $\mathbf{a}$ parallel to $\mathbf{u}$, which
is parametrized by $\alpha(t) = \mathbf{a} + t\mathbf{u}$. The parametrization has velocity $\alpha'(t) = \mathbf{u}$ and hence constant
speed $\|\mathbf{u}\| = 1$.
If $f : U \to \mathbb{R}$ is a real-valued function, then the derivative $(f \circ \alpha)'(t)$ is the rate of change of $f$
along this line. In particular, since $\alpha(0) = \mathbf{a}$, we think of $(f \circ \alpha)'(0)$ as the rate of change at $\mathbf{a}$ in
the direction of $\mathbf{u}$.
Definition. Given the input described above, $(f \circ \alpha)'(0)$ is called the directional derivative of
$f$ at $\mathbf{a}$ in the direction of the unit vector $\mathbf{u}$, denoted $(D_{\mathbf{u}} f)(\mathbf{a})$. If $\mathbf{v}$ is any nonzero vector in
$\mathbb{R}^n$, we define the directional derivative in the direction of $\mathbf{v}$ to be $(D_{\mathbf{u}} f)(\mathbf{a})$, where $\mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|}$ is the
unit vector in the direction of $\mathbf{v}$.
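Directional derivatives are easy to compute, for by the Little Chain Rule:
\[ (D_{\mathbf{u}} f)(\mathbf{a}) = (f \circ \alpha)'(0) = \nabla f(\alpha(0)) \cdot \alpha'(0) = \nabla f(\mathbf{a}) \cdot \mathbf{u}. \]
In other words: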
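Proposition 4.14. If $f$ is differentiable at $\mathbf{a}$ and $\mathbf{u}$ is a unit vector, then the directional derivative
is given by:
\[ (D_{\mathbf{u}} f)(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{u}. \]
As a result, $(D_{\mathbf{u}} f)(\mathbf{a}) = \|\nabla f(\mathbf{a})\| \, \|\mathbf{u}\| \cos\theta = \|\nabla f(\mathbf{a})\| \cos\theta$, where $\theta$ is the angle between
$\nabla f(\mathbf{a})$ and the direction vector $\mathbf{u}$. Keeping $\mathbf{a}$ fixed but varying the direction, we see that $(D_{\mathbf{u}} f)(\mathbf{a})$
is maximized when $\theta = 0$, i.e., when $\mathbf{u}$ points in the same direction as $\nabla f(\mathbf{a})$, and that, in this
direction, $(D_{\mathbf{u}} f)(\mathbf{a}) = \|\nabla f(\mathbf{a})\|$.
Corollary 4.15. The maximum directional derivative of $f$ at $\mathbf{a}$:
• occurs in the direction of $\nabla f(\mathbf{a})$ and
• has a value of $\|\nabla f(\mathbf{a})\|$.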
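Proposition 4.14 and Corollary 4.15 translate directly into a few lines of code. The sketch below assumes the sample function $f(x, y, z) = x^2 + y^2 + z^2$, the point $(1, 2, 3)$, and the direction $(2, -1, 1)$; the helper names are hypothetical.

```python
import math

def grad_f(x, y, z):
    """Gradient of the sample function f(x, y, z) = x^2 + y^2 + z^2."""
    return (2 * x, 2 * y, 2 * z)

def norm(v):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))

def directional_derivative(grad, v):
    """Gradient dot (v / ||v||); v must be a nonzero vector."""
    length = norm(v)
    return sum(g * c for g, c in zip(grad, v)) / length

if __name__ == "__main__":
    g = grad_f(1, 2, 3)
    print(directional_derivative(g, (2, -1, 1)))  # rate of change in direction v
    print(directional_derivative(g, g), norm(g))  # maximum rate, attained along the gradient
```

The second line of output illustrates the corollary: taking the direction to be the gradient itself reproduces the length of the gradient.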
Sometimes, weather reports mention the “temperature gradient.” Usually, this does not refer to
the gradient vector that we have been studying but rather to one of the aspects of the gradient in
the corollary, that is, the magnitude and/or direction of the maximum rate of temperature increase.
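Example 4.16. Let $f : \mathbb{R}^3 \to \mathbb{R}$ be $f(x, y, z) = x^2 + y^2 + z^2$, and let $\mathbf{a} = (1, 2, 3)$.
(a) Find the directional derivative of $f$ at $\mathbf{a}$ in the direction of $\mathbf{v} = (2, -1, 1)$.
(b) Find the direction and rate of maximum increase at $\mathbf{a}$.
(a) First, $\nabla f = (2x, 2y, 2z)$, so $\nabla f(\mathbf{a}) = (2, 4, 6)$. Next, the unit vector in the direction of $\mathbf{v}$ is
$\mathbf{u} = \frac{1}{\|\mathbf{v}\|} \mathbf{v} = \frac{1}{\sqrt{6}} (2, -1, 1)$. Thus:
\[ (D_{\mathbf{u}} f)(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{u} = (2, 4, 6) \cdot \tfrac{1}{\sqrt{6}} (2, -1, 1) = \tfrac{1}{\sqrt{6}} (4 - 4 + 6) = \sqrt{6}. \]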
This is illustrated in Figure 4.4. Then $f(\alpha(t)) = c$ for all $t$, so $(f \circ \alpha)'(t) = c' = 0$. By the
Little Chain Rule, this can be written $\nabla f(\alpha(t)) \cdot \alpha'(t) = 0$. In particular, when $t = t_0$, we obtain
$\nabla f(\mathbf{a}) \cdot \mathbf{v} = 0$. This is true for all vectors $\mathbf{v}$ tangent to $S$ at $\mathbf{a}$, which has the following consequence.
Example 4.18. Find an equation for the plane tangent to the graph $z = x^2 + y^2$ at the point
$(1, 2, 5)$.
As with the equation of any plane, it suffices to find a point $\mathbf{p}$ on the plane and a normal vector
$\mathbf{n}$. As a point, we can use the point of tangency, $\mathbf{p} = (1, 2, 5)$. To find a normal vector, we rewrite
the graph as a level set. That is, rewrite $z = x^2 + y^2$ as $x^2 + y^2 - z = 0$. Then the graph is the level set
of the function of three variables $F(x, y, z) = x^2 + y^2 - z$ corresponding to $c = 0$. In particular,
$\nabla F(\mathbf{p})$ is a normal vector to the level set, that is, to the graph, at $\mathbf{p}$, so it is normal to the tangent
plane as well. See Figure 4.5.
The rest is calculation: $\nabla F = (2x, 2y, -1)$, so $\mathbf{n} = \nabla F(\mathbf{p}) = (2, 4, -1)$ is our normal vector.
Substituting into the equation $\mathbf{n} \cdot \mathbf{x} = \mathbf{n} \cdot \mathbf{p}$ for a plane gives:
\[ 2x + 4y - z = 5, \quad \text{or} \quad z = 2x + 4y - 5. \]
Note that we just studied the graph of the function $f(x, y) = x^2 + y^2$ of two variables, and
we considered the point $\mathbf{p} = (1, 2, 5) = (\mathbf{a}, f(\mathbf{a}))$ on the graph, where $\mathbf{a} = (1, 2)$. In the remarks
following Example 4.2, we saw that the first-order approximation of $f$ at $\mathbf{a}$ is given by $\ell(x, y) = 2x + 4y - 5$
(see equation (4.6)). The graph of the approximation $\ell$ is given by $z = 2x + 4y - 5$,
which is exactly the tangent plane we just found.
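The recipe of Example 4.18 (gradient of $F$ as normal vector, then $\mathbf{n} \cdot \mathbf{x} = \mathbf{n} \cdot \mathbf{p}$) can be sketched in code as follows. The helper names are hypothetical, and the last function solves the plane equation for $z$ so that it can be compared with the first-order approximation $2x + 4y - 5$.

```python
def grad_F(x, y, z):
    """Gradient of F(x, y, z) = x^2 + y^2 - z."""
    return (2 * x, 2 * y, -1)

def tangent_plane(p):
    """Return (n, d) describing the plane n . x = d through p with normal grad F(p)."""
    n = grad_F(*p)
    d = sum(ni * pi for ni, pi in zip(n, p))
    return n, d

def plane_height(n, d, x, y):
    """Solve n . (x, y, z) = d for z (requires n[2] != 0)."""
    return (d - n[0] * x - n[1] * y) / n[2]

if __name__ == "__main__":
    n, d = tangent_plane((1, 2, 5))
    print(n, d)                              # normal vector and right-hand side
    print(plane_height(n, d, 1.05, 1.95))    # height of the tangent plane near (1, 2)
```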
In other words, geometrically, the approximation we get by using the tangent plane at (a, f (a))
to approximate the graph coincides with the first-order approximation at a. This extends a familiar
idea from first-year calculus, where tangent lines are used to obtain the first-order approximation
of a differentiable function. In fact, it is the idea we used to motivate our multivariable definition
of differentiability in the first place.
Example 4.19. Suppose that T (x, y) represents the temperature at the point (x, y) in an open
set U of the plane. There are two natural families of curves associated with this function.
The first is the curves along which the temperature is constant. They are called isotherms
and are the level sets of $T$: all points on a single isotherm have the same temperature. See Figure
4.6.³
Figure 4.6: Isotherms are level curves of temperature. Heat travels along curves that are always
orthogonal to the isotherms.
The second is the curves along which heat flows. The principle here is that, at every point, heat
travels in the direction in which the temperature is decreasing as rapidly as possible, from hot to
cold.
From our work on directional derivatives, we know that the maximum rate of decrease occurs
in the direction of −∇T . And we’ve just seen that ∇T is always normal to the level sets, the
isotherms. In other words, the isotherms and the curves of heat flow are mutually orthogonal
families of curves. If you have a map of the isotherms, you can visualize the flow of heat: always
move at right angles to the isotherms.
³ Figure taken from https://www.weather.gov/images/jetstream/synoptic/ll analyze temp soln3.gif by National
Weather Service, public domain.
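\[ \frac{\partial f}{\partial x} = 3x^2 - 6xy^4 + ye^{xy} \]
and
\[ \frac{\partial f}{\partial y} = -12x^2 y^3 + xe^{xy}. \]
Both of these are again real-valued functions of $x$ and $y$, so they have partial derivatives of their
own. In the case of $\frac{\partial f}{\partial x}$:
\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) = 6x - 6y^4 + y^2 e^{xy} \]
and
\[ \frac{\partial^2 f}{\partial y\, \partial x} = \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right) = -24xy^3 + e^{xy} + xye^{xy}. \]
Similarly, for $\frac{\partial f}{\partial y}$:
\[ \frac{\partial^2 f}{\partial x\, \partial y} = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial y} \right) = -24xy^3 + e^{xy} + xye^{xy} \]
and
\[ \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) = -36x^2 y^2 + x^2 e^{xy}. \]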
These are the second-order partial derivatives. They too have partial derivatives, and the
process could continue indefinitely to obtain partial derivatives of higher and higher order.
Perhaps the most striking part of the preceding calculation is that the two partial derivatives
$\frac{\partial^2 f}{\partial y\, \partial x}$ and $\frac{\partial^2 f}{\partial x\, \partial y}$ turned out to be equal. They are called the mixed partials. To see whether
they might be equal in general, we return to the definition of a partial derivative as a limit of
difference quotients and try to describe the second-order partials in terms of the original function
$f$. In fact, what we are about to discuss is a method one might use to numerically approximate
the second-order partials.
Let $\mathbf{a} = (c, d)$. First, consider $\frac{\partial^2 f}{\partial x\, \partial y} = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial y} \right)$. The function being differentiated is $\frac{\partial f}{\partial y}$. It is
being differentiated with respect to $x$, so we let $x$ vary and hold $y$ constant:
\[ \frac{\partial^2 f}{\partial x\, \partial y}(c, d) = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial y} \right)(c, d) \approx \frac{\frac{\partial f}{\partial y}(c + h, d) - \frac{\partial f}{\partial y}(c, d)}{h}. \]
The approximation is meant to hold when $h$ is small. The terms in the numerator are both partials
of $f$ with respect to $y$, so in each case we differentiate $f$, letting $y$ vary and holding $x$ constant:
\[ \begin{aligned} \frac{\partial^2 f}{\partial x\, \partial y}(c, d) &\approx \frac{\frac{f(c+h,\, d+k) - f(c+h,\, d)}{k} - \frac{f(c,\, d+k) - f(c,\, d)}{k}}{h} \\ &\approx \frac{f(c + h, d + k) - f(c + h, d) - f(c, d + k) + f(c, d)}{hk} \end{aligned} \tag{4.13} \]
when $h$ and $k$ are small.
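This expression is symmetric enough in $h$ and $k$ that perhaps you are willing to believe that
you would get the same thing when the order of differentiation is reversed. But if you are nervous
about it, here is the parallel calculation for $\frac{\partial^2 f}{\partial y\, \partial x}$:
\[ \begin{aligned} \frac{\partial^2 f}{\partial y\, \partial x}(c, d) &= \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right)(c, d) \\ &\approx \frac{\frac{\partial f}{\partial x}(c, d + k) - \frac{\partial f}{\partial x}(c, d)}{k} \\ &\approx \frac{\frac{f(c+h,\, d+k) - f(c,\, d+k)}{h} - \frac{f(c+h,\, d) - f(c,\, d)}{h}}{k} \\ &\approx \frac{f(c + h, d + k) - f(c, d + k) - f(c + h, d) + f(c, d)}{hk}. \end{aligned} \tag{4.14} \]
Indeed, comparing the approximations shows that they are the same, so:
\[ \frac{\partial^2 f}{\partial x\, \partial y}(c, d) \approx \frac{\partial^2 f}{\partial y\, \partial x}(c, d). \]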
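The quotient appearing in (4.13) and (4.14) can be tried out as a numerical method. The sketch below assumes the example function is $f(x, y) = x^3 - 3x^2 y^4 + e^{xy}$; that formula is an inference from the first partial derivatives listed above (it reproduces them), not something quoted from the text. The evaluation point and step sizes are likewise illustrative.

```python
import math

def f(x, y):
    """Assumed example function; its first partials match those listed above."""
    return x**3 - 3 * x**2 * y**4 + math.exp(x * y)

def mixed_partial_exact(x, y):
    """The common value of the mixed partials computed in the text."""
    e = math.exp(x * y)
    return -24 * x * y**3 + e + x * y * e

def second_difference_quotient(func, c, d, h, k):
    """The quotient common to approximations (4.13) and (4.14)."""
    return (func(c + h, d + k) - func(c + h, d) - func(c, d + k) + func(c, d)) / (h * k)

if __name__ == "__main__":
    c, d = 1.0, 0.5  # illustrative point
    for step in (1e-2, 1e-3, 1e-4):
        q = second_difference_quotient(f, c, d, step, step)
        print(step, q, mixed_partial_exact(c, d))
```

Because the quotient is unchanged when the roles of $(c, h)$ and $(d, k)$ are swapped, the same number approximates both mixed partials.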
A proper proof that the two sides are actually equal requires tightening up the approximations that
are involved. We leave the details for the exercises (Exercise 8.7), but the main idea is to use the
mean value theorem to show that the quotient $Q = \frac{f(c+h,\, d+k) - f(c+h,\, d) - f(c,\, d+k) + f(c,\, d)}{hk}$ common to
approximations (4.13) and (4.14) is equal on the nose to the values of the mixed partials at nearby
points, that is:
\[ \frac{\partial^2 f}{\partial x\, \partial y}(\mathbf{p}) = Q = \frac{\partial^2 f}{\partial y\, \partial x}(\mathbf{q}), \tag{4.15} \]
where $\mathbf{p}$ and $\mathbf{q}$ are points that are close to $\mathbf{a}$ but depend on $h$ and $k$. As $h$ and $k$ go to 0, $\mathbf{p}$ and $\mathbf{q}$
approach $\mathbf{a}$. If the mixed partials were known to be continuous at $\mathbf{a}$, then, in the limit, equation
(4.15) becomes:
\[ \frac{\partial^2 f}{\partial x\, \partial y}(\mathbf{a}) = \frac{\partial^2 f}{\partial y\, \partial x}(\mathbf{a}). \]
Thus the mixed partials are equal provided one invokes a requirement of continuity. In fact, there
are functions for which the mixed partials are not equal, so some sort of restriction of this type is
unavoidable.
For functions of more than two variables, the second-order mixed partials treat all but two vari-
ables as fixed, so we remain essentially in the two-variable case. In other words, if the mixed partials
are equal for functions of two variables, they are also equal for more than two. We summarize the
discussion with the following result.
\[ \frac{\partial^2 f}{\partial x_i\, \partial x_j} = \frac{\partial^2 f}{\partial x_j\, \partial x_i} \]
Example 4.22. Find $\frac{\partial^4 f}{\partial x\, \partial y^2\, \partial z}$ if $f(x, y, z) = \left( x^2 + y^3 + \sin(xz) \right) (\cos x - xz^5)$.
The order of differentiation doesn't matter, so we may as well choose one that simplifies the
bookkeeping. Differentiating with respect to $x$ or $z$ would involve the product rule, and who wants
that, so let's differentiate first with respect to $y$: $\frac{\partial f}{\partial y} = 3y^2 (\cos x - xz^5)$. The next choices don't
make much difference, but let's try differentiating with respect to $z$: $\frac{\partial^2 f}{\partial z\, \partial y} = -15xy^2 z^4$. Then, with
respect to $x$: $\frac{\partial^3 f}{\partial x\, \partial z\, \partial y} = -15y^2 z^4$. And lastly with respect to $y$ again:
\[ \frac{\partial^4 f}{\partial y\, \partial x\, \partial z\, \partial y} = -30yz^4. \]
By the equality of mixed partials, this is the same as the originally requested $\frac{\partial^4 f}{\partial x\, \partial y^2\, \partial z}$.
this for functions of more than one variable. This is a big subject that overflows with applications.
Our treatment is not comprehensive. Instead, we have selected a couple of topics that illustrate
some of the similarities and differences that pertain to moving to the multivariable case.
Definition. Let f : U → R be a function defined on an open set U in Rn . f is said to have a local
maximum at a point a of U if there is an open ball B(a, r) centered at a such that f (a) ≥ f (x)
for all x in B(a, r). It is said to have a local minimum at a if instead f (a) ≤ f (x) for all x in
B(a, r). See Figure 4.7. It has a global maximum or global minimum at a if the associated
inequalities are true for all x in U , not just in B(a, r).
If $f$ is smooth and has a local maximum or local minimum at $\mathbf{a}$, then it has a local maximum or
local minimum in every direction, in particular in each of the coordinate directions. The derivatives
in these directions are the partial derivatives. They are essentially one-variable derivatives, so from
first-year calculus, at a local maximum or minimum, all of them must be 0, that is, $\frac{\partial f}{\partial x_j}(\mathbf{a}) = 0$ for
$j = 1, 2, \dots, n$.
Definition. A point $\mathbf{a}$ is called a critical point of $f$ if $\frac{\partial f}{\partial x_j}(\mathbf{a}) = 0$ for $j = 1, 2, \dots, n$. Equivalently,
$Df(\mathbf{a}) = \begin{bmatrix} 0 & 0 & \cdots & 0 \end{bmatrix}$, or $\nabla f(\mathbf{a}) = \mathbf{0}$.
The preceding discussion proves the following.
Proposition 4.23. Let U be an open set in Rn , and let f : U → R be a smooth function. If f has
a local maximum or a local minimum at a, then a is a critical point of f .
Hence if one is looking for local maxima or minima of a smooth function on an open set, the
critical points are the only possible candidates. This is just like the one-variable case. Conversely,
as in the one-variable case, not every critical point is necessarily a local max or min, though there
are new ways in which this failure can occur. Good examples to keep in mind are functions whose
graphs are saddles, such as $f(x, y) = x^2 - y^2$. At a saddle point, the function has a local maximum
in one direction and a local minimum in another, so there is no open ball about the point in which
it is exclusively one or the other.
Even so, in first-year calculus, a given critical point can often be classified as a local maximum
or minimum using the second derivative. There is an analogue of this for more than one variable
that is similar in spirit while reflecting the greater number of possibilities. We discuss the situation
for two variables.
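Let $U$ be an open set in $\mathbb{R}^2$, and let $f : U \to \mathbb{R}$ be a smooth function. If $\mathbf{a} \in U$, consider the
matrix of second-order partials:
\[ H(\mathbf{a}) = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2}(\mathbf{a}) & \frac{\partial^2 f}{\partial y\, \partial x}(\mathbf{a}) \\ \frac{\partial^2 f}{\partial x\, \partial y}(\mathbf{a}) & \frac{\partial^2 f}{\partial y^2}(\mathbf{a}) \end{bmatrix}. \]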
It is called the Hessian matrix of f at a. By the equality of mixed partials, the two off-diagonal
terms are actually equal.
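Theorem 4.24 (Second derivative test for functions of two variables). Let $\mathbf{a}$ be a critical point of
$f$ such that $\det H(\mathbf{a}) \ne 0$. Under this assumption, $\mathbf{a}$ is called a nondegenerate critical point.
1. If $\det H(\mathbf{a}) > 0$ and:
(a) if $\frac{\partial^2 f}{\partial x^2}(\mathbf{a}) > 0$, then $f$ has a local minimum at $\mathbf{a}$,
(b) if $\frac{\partial^2 f}{\partial x^2}(\mathbf{a}) < 0$, then $f$ has a local maximum at $\mathbf{a}$.
2. If $\det H(\mathbf{a}) < 0$, then $f$ has a saddle point at $\mathbf{a}$.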
Figure 4.8: Prototypical nondegenerate critical points at $(0, 0)$: $f(x, y) = x^2 + y^2$ (left), $f(x, y) = -x^2 - y^2$
(middle), $f(x, y) = x^2 - y^2$ (right)
(b) By the same calculation, $(0, 0)$ is the only critical point, only this time:
\[ H(0, 0) = \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix}. \]
Hence $\det H(0, 0) = 4 > 0$ and $\frac{\partial^2 f}{\partial x^2}(0, 0) = -2 < 0$, so $f$ has a local maximum at $(0, 0)$.
(c) Again, $(0, 0)$ is the only critical point, but:
\[ H(0, 0) = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}. \]
The first equation implies that $y = x^2$, and substituting this into the second gives $24x^4 - 3x = 0$,
or $3x(8x^3 - 1) = 0$. Hence $x = 0$, in which case $y = 0^2 = 0$, or $x^3 = \frac{1}{8}$, i.e., $x = \frac{1}{2}$, in which case
$y = \left( \frac{1}{2} \right)^2 = \frac{1}{4}$. As a result, $f$ has two critical points, $(0, 0)$ and $\left( \frac{1}{2}, \frac{1}{4} \right)$.
To classify their types, we consider the Hessian:
\[ H(x, y) = \begin{bmatrix} 6x & -3 \\ -3 & 48y \end{bmatrix}. \]
For instance, at the critical point $(0, 0)$, $\det H(0, 0) = \det \begin{bmatrix} 0 & -3 \\ -3 & 0 \end{bmatrix} = -9$. Since this is negative,
$(0, 0)$ is a saddle point.
Similarly, $\det H\left( \frac{1}{2}, \frac{1}{4} \right) = \det \begin{bmatrix} 3 & -3 \\ -3 & 12 \end{bmatrix} = 36 - 9 = 27$. This is positive and $\frac{\partial^2 f}{\partial x^2}\left( \frac{1}{2}, \frac{1}{4} \right) = 3$ is
positive (it's the upper left entry of the Hessian), so, by the second derivative test, $\left( \frac{1}{2}, \frac{1}{4} \right)$ is a local
minimum.
As a check on the plausibility of these conclusions, see Figure 4.10 for what the graph of f looks
like near each of the critical points.
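The classification just carried out by hand can be mirrored in a short sketch. It assumes the function is $f(x, y) = x^3 + 8y^3 - 3xy$ (the formula given in the caption of Figure 4.10), and the helper names and returned strings are hypothetical.

```python
def grad(x, y):
    """Gradient of f(x, y) = x^3 + 8y^3 - 3xy."""
    return (3 * x**2 - 3 * y, 24 * y**2 - 3 * x)

def hessian(x, y):
    """Hessian of f(x, y) = x^3 + 8y^3 - 3xy."""
    return ((6 * x, -3), (-3, 48 * y))

def classify(H):
    """Second derivative test for a 2 by 2 Hessian at a critical point."""
    (a, b), (c, d) = H
    det = a * d - b * c
    if det < 0:
        return "saddle point"
    if det > 0:
        # When det > 0 the upper left entry cannot be 0, so its sign decides.
        return "local minimum" if a > 0 else "local maximum"
    return "degenerate (test does not apply)"

if __name__ == "__main__":
    for point in ((0.0, 0.0), (0.5, 0.25)):
        print(point, grad(*point), classify(hessian(*point)))
```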
Figure 4.10: The graph of $f(x, y) = x^3 + 8y^3 - 3xy$ near $(0, 0)$ (left) and near $\left( \frac{1}{2}, \frac{1}{4} \right)$ (right)
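\[ f(\mathbf{x}) \approx f(\mathbf{a}) + Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) + \frac{1}{2} (\mathbf{x} - \mathbf{a})^t \cdot H(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}). \tag{4.17} \]
The various dots are matrix products, where, writing $\mathbf{x} = (x, y)$ and $\mathbf{a} = (c, d)$:
• $H(\mathbf{a}) = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2}(\mathbf{a}) & \frac{\partial^2 f}{\partial y\, \partial x}(\mathbf{a}) \\ \frac{\partial^2 f}{\partial x\, \partial y}(\mathbf{a}) & \frac{\partial^2 f}{\partial y^2}(\mathbf{a}) \end{bmatrix}$ is the Hessian matrix at $\mathbf{a}$,
• $\mathbf{x} - \mathbf{a} = \begin{bmatrix} x - c \\ y - d \end{bmatrix}$, and
• $(\mathbf{x} - \mathbf{a})^t = \begin{bmatrix} x - c & y - d \end{bmatrix}$ is its transpose.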
To justify the approximation, let $\mathbf{x}$ be a point near $\mathbf{a}$, and let $\mathbf{v} = \mathbf{x} - \mathbf{a}$. Consider the real-valued
function $g$ defined by $g(t) = f(\mathbf{a} + t\mathbf{v})$. Then $g(0) = f(\mathbf{a})$ and $g(1) = f(\mathbf{a} + \mathbf{v}) = f(\mathbf{x})$, so
finding an approximation for $f(\mathbf{x})$ is the same as approximating $g(1)$. As a real-valued function of
one variable, $g(t)$ has a second-order Taylor approximation (4.16) at 0, which for $g(1)$ gives:
\[ \begin{aligned} f(\mathbf{x}) = g(1) &\approx g(0) + g'(0) \cdot (1 - 0) + \frac{g''(0)}{2} \cdot (1 - 0)^2 \\ &\approx f(\mathbf{a}) + g'(0) + \frac{g''(0)}{2}. \end{aligned} \tag{4.18} \]
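To find the derivatives of $g$, note that, by construction, $g$ is the composition $g(t) = f(\alpha(t))$,
where $\alpha(t) = \mathbf{a} + t\mathbf{v}$. Hence by the Little Chain Rule:
\[ g'(t) = \nabla f(\alpha(t)) \cdot \alpha'(t) = \nabla f(\alpha(t)) \cdot \mathbf{v}. \tag{4.19} \]
In particular, $g'(0) = \nabla f(\mathbf{a}) \cdot \mathbf{v}$, or in matrix form, $g'(0) = Df(\mathbf{a}) \cdot \mathbf{v}$. Substituting this into (4.18)
and recalling that $\mathbf{v} = \mathbf{x} - \mathbf{a}$ gives:
\[ f(\mathbf{x}) \approx f(\mathbf{a}) + Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) + \frac{g''(0)}{2}. \tag{4.20} \]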
So far, we have succeeded only in reconstructing the first-order approximation of $f$ at $\mathbf{a}$. The
really new information comes from determining $g''(0)$. From equation (4.19), note that $g'$ is also
a composition, namely, $g'(t) = u(\alpha(t))$, where $u(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{v}$ and $\alpha(t) = \mathbf{a} + t\mathbf{v}$ as before.
Therefore by the same Little Chain Rule calculation:
Substituting this into (4.20) gives the second-order approximation $f(\mathbf{x}) \approx f(\mathbf{a}) + Df(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) + \frac{1}{2} (\mathbf{x} - \mathbf{a})^t \cdot H(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a})$
as stated in (4.17). This accomplishes what we set out to do.
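For readers who want the intermediate step spelled out, here is one way the calculation of $g''(0)$ can go. This is a sketch reconstructed from the definitions above, not a quotation of the displayed equations; it assumes $\mathbf{v} = (h, k)$, so that $u = h \frac{\partial f}{\partial x} + k \frac{\partial f}{\partial y}$, and it uses the equality of mixed partials in the second line.

```latex
\[
\begin{aligned}
g''(0) &= \nabla u(\mathbf{a}) \cdot \mathbf{v}
        = h\,\frac{\partial u}{\partial x}(\mathbf{a}) + k\,\frac{\partial u}{\partial y}(\mathbf{a}) \\
       &= h^2\,\frac{\partial^2 f}{\partial x^2}(\mathbf{a})
          + 2hk\,\frac{\partial^2 f}{\partial x\,\partial y}(\mathbf{a})
          + k^2\,\frac{\partial^2 f}{\partial y^2}(\mathbf{a}) \\
       &= \mathbf{v}^t \cdot H(\mathbf{a}) \cdot \mathbf{v}
        = (\mathbf{x} - \mathbf{a})^t \cdot H(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}).
\end{aligned}
\]
```

Dividing by 2 gives the quadratic term of the second-order approximation.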
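We remark that one can continue in this way to obtain higher-order approximations of $f$ by
using higher-order Taylor approximations of $g$. The coefficients involve the higher-order derivatives
$g^{(n)}(0)$ of $g$. The idea is that, by the Little Chain Rule, each additional derivative of $g$ corresponds
to applying the “operator” $\nabla \cdot \mathbf{v} = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right) \cdot (h, k) = h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}$. Thus $g^{(n)}(0) = \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)^n f(\mathbf{a})$,
where $\left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)^n$ denotes the $n$-fold composition $\left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right) \circ \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right) \circ \cdots \circ \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)$.
We leave for the curious and ambitious reader the work of checking that this formula is consistent
with our results for the first two derivatives and of working out a general formula for the $n$th-order
approximation of $f$. For example, see Exercise 11.3 for the third-order approximation.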
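\[ \begin{aligned} f(\mathbf{x}) = f(\mathbf{a} + \mathbf{v}) &\approx f(\mathbf{a}) + \frac{1}{2} (\mathbf{x} - \mathbf{a})^t \cdot H(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) \\ &\approx f(\mathbf{a}) + \frac{1}{2} \mathbf{v}^t \cdot H(\mathbf{a}) \cdot \mathbf{v}, \end{aligned} \]
where $\mathbf{v} = \mathbf{x} - \mathbf{a}$. The approximation holds when $\mathbf{v}$ is near $\mathbf{0}$.
We assume that $\det H(\mathbf{a}) \ne 0$ and determine the nature of the critical point $\mathbf{a}$. To do this,
it is simpler to write the quadratic term of the approximation using the notation of (4.23) with
$\mathbf{v} = (h, k)$ so that:
\[ f(\mathbf{a} + \mathbf{v}) \approx f(\mathbf{a}) + \frac{1}{2} (Ah^2 + 2Bhk + Ck^2) \tag{4.24} \]
and $H(\mathbf{a}) = \begin{bmatrix} A & B \\ B & C \end{bmatrix}$.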
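We consider the case $A \ne 0$. Similar arguments apply if $A = 0$ (see Exercise 11.5). Completing
the square on the expression in parentheses in (4.24) gives:
\[ \begin{aligned} Ah^2 + 2Bhk + Ck^2 &= A\left( h^2 + \frac{2B}{A} hk + \frac{B^2}{A^2} k^2 \right) - \frac{B^2}{A} k^2 + Ck^2 \\ &= A\left( h + \frac{B}{A} k \right)^2 + \left( C - \frac{B^2}{A} \right) k^2 \\ &= A\left( h + \frac{B}{A} k \right)^2 + \frac{AC - B^2}{A} k^2 \\ &= A\tilde{h}^2 + \frac{AC - B^2}{A} k^2, \end{aligned} \]
where $\tilde{h} = h + \frac{B}{A} k$. Then (4.24) becomes:
\[ f(\mathbf{a} + \mathbf{v}) \approx f(\mathbf{a}) + \frac{1}{2} \left( A\tilde{h}^2 + \frac{AC - B^2}{A} k^2 \right). \]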
The important point is that the expression in parentheses is now a combination of squares,
and we know from Example 4.25 what to expect of expressions of this type. For instance, if both
coefficients A and $\frac{AC - B^2}{A}$ are positive, then f has a local minimum at v = 0, like $x^2 + y^2$. If
both coefficients are negative, then f has a local maximum, like $-x^2 - y^2$. If the coefficients have
opposite signs, then f has a saddle point, like $x^2 - y^2$. Note that this last case arises precisely when
the product of the coefficients is negative, that is, when $A \cdot \frac{AC - B^2}{A} = AC - B^2 < 0$.
Recall that $A = \frac{\partial^2 f}{\partial x^2}(a)$, and observe that the expression $AC - B^2$ is the determinant of the
Hessian: $\det H(a) = \det \begin{bmatrix} A & B \\ B & C \end{bmatrix} = AC - B^2$. Collecting the information in the previous paragraph
4.12. MAX/MIN: LAGRANGE MULTIPLIERS 105
about the signs of the coefficients A and $\frac{AC - B^2}{A}$ then gives the second-derivative test as stated in
Theorem 4.24.
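The sign analysis of the two coefficients can be summarized in a short Python sketch. The function name `classify_critical_point` is hypothetical, and the code assumes only what the completing-the-square argument gives: the signs of A and of AC − B² decide the type of a nondegenerate critical point.

```python
def classify_critical_point(A, B, C):
    """Classify a critical point whose Hessian is [[A, B], [B, C]]."""
    det = A * C - B * B            # determinant of the Hessian
    if det == 0:
        return "degenerate (no conclusion)"
    if det < 0:
        # The coefficients A and (AC - B^2)/A have opposite signs,
        # or A = 0, which also forces det = -B^2 < 0.
        return "saddle point"
    # det > 0 forces A != 0, and both coefficients share the sign of A.
    return "local minimum" if A > 0 else "local maximum"

print(classify_critical_point(2, 0, 2))     # Hessian of x^2 + y^2
print(classify_critical_point(-2, 0, -2))   # Hessian of -x^2 - y^2
print(classify_critical_point(2, 0, -2))    # Hessian of x^2 - y^2
print(classify_critical_point(0, 1, 0))     # Hessian of xy, the case A = 0
```

The three prototypes x² + y², −x² − y², and x² − y² come out as a local minimum, a local maximum, and a saddle point, respectively.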
The key idea of the argument was to complete the square to turn f into one of the prototypes
of being a sum and/or difference of squares, at least up to second order. This is not as ad hoc as it
might seem. A theorem in linear algebra states that, if M is a symmetric matrix in the sense that
$M^t = M$, then it is always possible to complete the square and convert the expression $v^t M v$ into
a sum and/or difference of squares. This is one interpretation of a powerful result known as the
spectral theorem. Thanks to the equality of mixed partials, the Hessian matrix H(a) is symmetric,
so the spectral theorem applies. In fact, the spectral theorem is true for n by n symmetric matrices
for all n. Using similar ideas, this leads to a corresponding classification of nondegenerate critical
points for real-valued functions of n variables when n > 2.
Figure 4.11: The ellipsoid $\frac{x^2}{4} + \frac{y^2}{9} + z^2 = 1$
Here, we want to maximize and minimize f = x + 2y + 3z subject to the constraint
$\frac{x^2}{4} + \frac{y^2}{9} + z^2 = 1$. One approach is to convert this to a two-variable problem, for instance, writing
$z = \pm\sqrt{1 - \frac{x^2}{4} - \frac{y^2}{9}}$ on S, so $f = x + 2y + 3\sqrt{1 - \frac{x^2}{4} - \frac{y^2}{9}}$ or $f = x + 2y - 3\sqrt{1 - \frac{x^2}{4} - \frac{y^2}{9}}$. These
are functions of two variables, so we could find the critical points, i.e., set the partials with respect
to x and y equal to 0, and proceed as in the previous two sections.
Instead, we try a new approach based on organizing the values of f on S according to the level
sets of f . Each level set f = x + 2y + 3z = c is a plane that intersects S typically in a curve, if at
all. As c varies, we get a family of level curves on S with f constant on each. See Figure 4.12.
Let’s say we want to maximize f on S. Then, starting at some point, we move on S in the
direction of increasing c. We continue doing this until we reach a point a where we can’t go any
farther. Let cmax denote the value of f at this point a. Observe that at a:
Figure 4.12: Intersecting S with a level set f = c (left) and a family of intersections (right)
• The level set f = cmax must be tangent to S. Otherwise we would be able to cross the level
curve at a and make the value of f larger while remaining on S.
• ∇f (a) is a normal vector to the level set f = cmax . This is because gradients are always
normal to level sets (Proposition 4.17).
• Similarly, the surface S is itself a level set, namely, the set where:
\[ g(x, y, z) = \frac{x^2}{4} + \frac{y^2}{9} + z^2 = 1. \]
Hence ∇g(a) is a normal vector to S.
Figure 4.13: The level set f = cmax and the ellipsoid S are tangent at a.
Since the level set f = cmax and the surface S are tangent at a, their normal vectors are scalar
multiples of each other. In other words, there is a scalar λ such that ∇f (a) = λ∇g(a). This is the
idea behind the following general principle.
Proposition 4.28 (Method of Lagrange multipliers). Let U be an open set in Rn , and suppose that
you want to maximize or minimize a smooth function f : U → R subject to the constraint g(x) = c
for some smooth function g : U → R and constant c. If f has a local maximum or local minimum
at a point a and if ∇g(a) ≠ 0, then there is a scalar λ such that:
∇f (a) = λ∇g(a).
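the solutions occur at points where ∇f = λ∇g, that is, where $(1, 2, 3) = \lambda (\frac{1}{2} x, \frac{2}{9} y, 2z)$. This gives
a system of equations:
\begin{align*}
1 &= \tfrac{1}{2} \lambda x \\
2 &= \tfrac{2}{9} \lambda y \\
3 &= 2 \lambda z.
\end{align*}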
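There are four unknowns and only three equations which may seem like not enough information,
but we must remember that the constraint provides a fourth equation that can be included as part
of the system. In this case, the equations in the system imply that λ ≠ 0, so we can solve for each
of x, y, and z in terms of λ, then substitute into the constraint: $x = \frac{2}{\lambda}$, $y = \frac{9}{\lambda}$, and $z = \frac{3}{2\lambda}$, and:
\[ \frac{(2/\lambda)^2}{4} + \frac{(9/\lambda)^2}{9} + \Big(\frac{3}{2\lambda}\Big)^2 = 1. \]
This simplifies to $\frac{1}{\lambda^2} + \frac{9}{\lambda^2} + \frac{9}{4\lambda^2} = 1$, or $\frac{4 + 36 + 9}{4\lambda^2} = 1$. Hence $4\lambda^2 = 49$, so $\lambda = \pm\frac{7}{2}$.
If $\lambda = \frac{7}{2}$, then:
\[ x = \frac{2}{\lambda} = \frac{2}{7/2} = \frac{4}{7}, \quad y = \frac{9}{\lambda} = \frac{9}{7/2} = \frac{18}{7}, \quad z = \frac{3}{2\lambda} = \frac{3}{2 \cdot \frac{7}{2}} = \frac{3}{7}, \]
and $f = x + 2y + 3z = \frac{4}{7} + \frac{36}{7} + \frac{9}{7} = 7$. If $\lambda = -\frac{7}{2}$, it’s the same thing with minus signs. Thus, on
S, f attains a maximum value of 7 at $(\frac{4}{7}, \frac{18}{7}, \frac{3}{7})$ and a minimum value of −7 at $(-\frac{4}{7}, -\frac{18}{7}, -\frac{3}{7})$.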
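The result can be sanity-checked numerically. The Python sketch below verifies that the point found for λ = 7/2 satisfies the constraint and the multiplier equation, then compares it with a brute-force search over a grid of points on S. The parametrization `point_on_S` and the grid size are assumptions made for this check, not part of the method itself.

```python
import math

def f(x, y, z):
    return x + 2 * y + 3 * z

def g(x, y, z):
    return x**2 / 4 + y**2 / 9 + z**2

lam = 7 / 2
a = (2 / lam, 9 / lam, 3 / (2 * lam))     # the candidate point (4/7, 18/7, 3/7)

grad_f = (1.0, 2.0, 3.0)
grad_g = (a[0] / 2, 2 * a[1] / 9, 2 * a[2])

print(g(*a))                               # should be 1 up to rounding
print(f(*a))                               # should be 7 up to rounding
print([lam * c for c in grad_g], grad_f)   # the two lists should agree

def point_on_S(theta, phi):
    """A point of the ellipsoid x^2/4 + y^2/9 + z^2 = 1 (assumed parametrization)."""
    return (2 * math.sin(phi) * math.cos(theta),
            3 * math.sin(phi) * math.sin(theta),
            math.cos(phi))

n = 200
best = max(f(*point_on_S(2 * math.pi * i / n, math.pi * j / n))
           for i in range(n) for j in range(n + 1))
print(best)   # close to 7 and never more than 7, up to rounding
```

A grid search of this kind only suggests the answer; it is the Lagrange multiplier computation that pins down the exact point and value.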
Example 4.29. Suppose that, when a manufacturer spends x dollars on labor, y dollars on equip-
ment, and z dollars on research, the total number of units that it produces is:
\[ f(x, y, z) = 60 x^{1/6} y^{1/3} z^{1/2}. \]
This is an example of what is called a Cobb-Douglas production function. If the manufacturer has
a total budget of B dollars, how should it distribute its spending among the three categories in
order to maximize production?
Here, we want to maximize the function f above subject to the budget constraint g = x+y +z =
B. Following the method of Lagrange multipliers, we set ∇f = λ∇g:
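\[ (10 x^{-5/6} y^{1/3} z^{1/2},\; 20 x^{1/6} y^{-2/3} z^{1/2},\; 30 x^{1/6} y^{1/3} z^{-1/2}) = \lambda (1, 1, 1). \]
In other words:
\begin{align}
10 x^{-5/6} y^{1/3} z^{1/2} &= \lambda \tag{4.25} \\
20 x^{1/6} y^{-2/3} z^{1/2} &= \lambda \tag{4.26} \\
30 x^{1/6} y^{1/3} z^{-1/2} &= \lambda. \tag{4.27}
\end{align}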
Both (4.25) and (4.26) equal λ, and setting them equal gives:
\[ 10 x^{-5/6} y^{1/3} z^{1/2} = 20 x^{1/6} y^{-2/3} z^{1/2}. \tag{4.28} \]
If z = 0, then f = 0, which is clearly not the maximum value. Thus we may assume z ≠ 0 so that
the factors of $z^{1/2}$ in equation (4.28) can be canceled, leaving $10 x^{-5/6} y^{1/3} = 20 x^{1/6} y^{-2/3}$, or y = 2x.
Similarly, setting (4.25) and (4.27) equal and assuming y ≠ 0 gives $10 x^{-5/6} z^{1/2} = 30 x^{1/6} z^{-1/2}$, or
z = 3x.
Thus the budget constraint becomes x + y + z = x + 2x + 3x = 6x = B, so $x = \frac{B}{6}$. We conclude
that the optimal level of production is achieved by spending $x = \frac{B}{6}$ dollars on labor, $y = 2x = \frac{B}{3}$
dollars on equipment, and $z = 3x = \frac{B}{2}$ dollars on research.
It’s worth stepping back to review this calculation, as there is more here than meets the eye.
A partial derivative can be viewed as a one-variable rate of change. For instance, in the previous
example, $\frac{\partial f}{\partial x}$ is the rate at which the output changes per dollar spent on labor. This is called the
marginal productivity with respect to labor. There are similar interpretations of $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$.
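\begin{align*}
\lambda &= \frac{10\, y^{1/3} z^{1/2}}{x^{5/6}} \\
&= \frac{10 \big(\frac{B}{3}\big)^{1/3} \big(\frac{B}{2}\big)^{1/2}}{\big(\frac{B}{6}\big)^{5/6}} \\
&= 10 \Big(\frac{1}{3}\Big)^{1/3} \Big(\frac{1}{2}\Big)^{1/2} B^{5/6} \cdot \frac{6^{5/6}}{B^{5/6}} \\
&= 60 \Big(\frac{1}{3}\Big)^{1/3} \Big(\frac{1}{2}\Big)^{1/2} \Big(\frac{1}{6}\Big)^{1/6} \\
&\approx 21.82.
\end{align*}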
So, at the optimal level, an additional dollar spent on any combination of labor, equipment, or
research increases production by approximately 21.82 units. The seemingly extraneous Lagrange
multiplier has told us something interesting.
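The interpretation of λ as a common marginal productivity can be checked numerically. In the Python sketch below, the budget B = 600 and the way the extra dollar is split are hypothetical choices; the formulas for f, its partial derivatives, and λ are the ones from Example 4.29.

```python
def f(x, y, z):
    return 60 * x**(1/6) * y**(1/3) * z**(1/2)

def marginals(x, y, z):
    """The three partial derivatives of f, as in (4.25)-(4.27)."""
    return (10 * x**(-5/6) * y**(1/3) * z**(1/2),
            20 * x**(1/6) * y**(-2/3) * z**(1/2),
            30 * x**(1/6) * y**(1/3) * z**(-1/2))

B = 600.0                        # hypothetical budget
x, y, z = B / 6, B / 3, B / 2    # optimal spending found in the example

lam = 60 * (1/3)**(1/3) * (1/2)**(1/2) * (1/6)**(1/6)
print(lam)                       # about 21.82
print(marginals(x, y, z))        # all three should agree with lam

# Spend one additional dollar, split evenly among the three categories.
extra = f(x + 1/3, y + 1/3, z + 1/3) - f(x, y, z)
print(extra)                     # close to lam
```

The three marginal productivities coincide at the optimum, which is exactly what ∇f = λ∇g says when ∇g = (1, 1, 1).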
1.7. $f(x, y) = \frac{1}{x^2 + y^2}$
1.8. f (x, y) = xy
In Exercises 1.9–1.13, find (a) the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$ and (b) the gradient
∇f(x, y, z).
1.9. f (x, y, z) = x + 2y + 3z + 4
1.14. Let: y
yx x3 y + x4
yx ln x
f (x, y) = x sin(xy) + 2 arctan 5 .
y + xy + 1 x + y6
Evaluate $\frac{\partial f}{\partial y}(1, \pi)$. (Hint: Do not try to find a general formula for $\frac{\partial f}{\partial y}$. Ever.)
1.15. Let f(x, y) = x + 2y. Use the definition of differentiability to prove that f is differentiable at
every point a = (c, d) of $R^2$.
1.16. Use the definition of differentiability to show that the function $f(x, y) = x^2 + y^2$ is
differentiable at every point a = (c, d) of $R^2$.
(c) Use the definition of differentiability to determine whether f is differentiable at (0, 0).
(Hint: Polar coordinates might be useful.)
(a) Find the values of the partial derivatives $\frac{\partial f}{\partial x}(0, 0)$ and $\frac{\partial f}{\partial y}(0, 0)$. (This shouldn’t require
much calculation.)
(b) Use the definition of differentiability to determine whether f is differentiable at (0, 0).
1.19. Let f and g be differentiable real-valued functions defined on an open set U in Rn . Prove
that:
∇(f g) = f ∇g + g ∇f.
(Hint: Partial derivatives are one-variable derivatives.)
1.20. Let f, g : U → R be real-valued functions defined on an open set U of Rn , and let a be a point
of U . Use the definition of differentiability to prove the following statements.
(a) If f and g are differentiable at a, so is f + g. (It’s reasonably clear that D(f + g)(a) =
Df (a) + Dg(a), so the problem is to show that the limit in the definition of differentia-
bility for f + g is 0.)
(b) If f is differentiable at a, so is cf for any scalar c.
2.2. Is the function $f(x, y) = x^{2/3} + y^{2/3}$ differentiable at (0, 0)? If so, find Df(0, 0).
2.3. This exercise gives the proof of Proposition 4.4, that differentiability implies continuity. Let
U be an open set in Rn , f : U → R a real-valued function, and a a point of U . Define a
function Q : U → R by:
\[ Q(x) = \begin{cases} \dfrac{f(x) - f(a) - \nabla f(a) \cdot (x - a)}{\|x - a\|} & \text{if } x \neq a, \\ 0 & \text{if } x = a. \end{cases} \]
Note that the quotient is the one that appears in the definition of differentiability using the
gradient form of the derivative.
3.1. Let B = B(a, r) be an open ball in R2 centered at the point a = (c, d), and let f : B → R be
a differentiable function such that:
\[ Df(x) = \begin{bmatrix} 0 & 0 \end{bmatrix} \quad \text{for all } x \text{ in } B. \]
Prove that f (x) = f (a) for all x in B. In other words, if the derivative of f is zero everywhere,
then f is a constant function.
In Exercises 4.2–4.6, use the various criteria for differentiability discussed so far to determine
the points at which the function is differentiable and the points at which it is not. Your reasons
may be brief as long as they are clear and precise. It should not be necessary to use the definition
of differentiability.
5.1. Let $f(x, y) = x^2 + y^2$, and let $\alpha(t) = (t^2, t^3)$. Calculate $(f \circ \alpha)'(1)$ in two different ways:
5.2. Let f(x, y, z) = xyz and α(t) = (cos t, sin t, t). Calculate $(f \circ \alpha)'(\frac{\pi}{6})$ in two different ways:
5.3. Let f(x, y, z) be a differentiable real-valued function of three variables, and let α(t) =
(x(t), y(t), z(t)) be a differentiable path in $R^3$. If w = f(α(t)), use the Little Chain Rule
to find a formula for $\frac{dw}{dt}$ in terms of the partial derivatives of f and the derivatives with
respect to t of x, y, and z.
5.4. The Little Chain Rule often lurks in the background of a type of first-year calculus problem
known as “related rates.” For instance, as an (admittedly artificial) example, suppose that
the length ℓ and width w of a rectangular region in the plane are changing in time, so that
the area A = ℓw also changes.
5.5. Let a be a point of Rn , and let B = B(a, r) be an open ball centered at a. Let f : B → R be
a differentiable function. If b is any point of B, show that there exists a point c on the line
segment connecting a and b such that:
Note that this is a generalization of the mean value theorem to real-valued functions of more
than one variable. (Hint: Explain why the line segment can be parametrized by α(t) =
a + t(b − a), 0 ≤ t ≤ 1. Then, note that the composition f ◦ α is a real-valued function of one
variable, so the one-variable mean value theorem applies.)
5.6. In this exercise, we prove the Little Chain Rule (Theorem 4.13). Let U be an open set in Rn ,
α : I → U a path in U defined on an open interval I, and f : U → R a real-valued function.
Let t0 be a point of I. Assume that α is differentiable at t0 and f is differentiable at α(t0 ).
Let a = α(t0 ), and let Q : U → R be the function defined in Exercise 2.3:
\[ Q(x) = \begin{cases} \dfrac{f(x) - f(a) - \nabla f(a) \cdot (x - a)}{\|x - a\|} & \text{if } x \neq a, \\ 0 & \text{if } x = a. \end{cases} \]
5.7. Let f : $R^3$ → R be a differentiable function with the property that ∇f(x) points in the same
direction as x for all nonzero x in $R^3$. If a > 0, prove that f is constant on the sphere
$x^2 + y^2 + z^2 = a^2$. (Hint: If p and q are any two points on the sphere, there is a differentiable
path α on the sphere from p to q.)
6.4. f(x, y, z) = sin x sin y cos z, $a = (0, \frac{\pi}{2}, \pi)$, v = (π, 2π, 2π)
6.5. For a certain differentiable real-valued function f : R2 → R, the maximum directional deriva-
tive at the point a = (0, 0) has a value of 6 and occurs in the direction from a to b = (4, 1).
Find ∇f (a).
\[ f(x, y) = r x^2 + s y^2, \]
where r and s are constant. Find values of r and s so that the maximum directional derivative
of f at the point a = (1, 2) has a value of 10 and occurs in the direction from a to the point
b = (2, 3).
6.9. The temperature in a certain region of space, in degrees Celsius, is modeled by the function
$T(x, y, z) = 20 e^{-x^2 - 2y^2 - 4z^2}$, where x, y, z are measured in meters. At the point a = (2, −1, 3):
(b) If u is a unit vector in Rn , find a formula for the directional derivative (Du f )(x).
6.11. You discover that your happiness is a function of your location in the plane. At the point
(x, y), your happiness is given by the formula:
where “happ” is a unit of happiness. For instance, at the point (1, 0), you are H(1, 0) = 1
happ happy.
(a) At the point (1, 0), in which direction is your happiness increasing most rapidly? What
is the value of the directional derivative in this direction?
(b) Still at the point (1, 0), in which direction is your happiness decreasing most rapidly?
What is the value of the directional derivative in this direction?
(c) Suppose that, starting at (1, 0), you follow a curve so that you are always traveling in
the direction in which your happiness increases most rapidly. Find an equation for the
curve, and sketch the curve. An equation describing the relationship between x and y
along the curve is fine. No further parametrization is necessary. (Hint: What can you
say about the slope at each point of the curve?)
(d) Suppose instead that, starting at (1, 0), you travel along a curve such that the directional
derivative in the tangent direction is always equal to zero. Find an equation describing
this curve, and sketch the curve.
6.12. Does there exist a differentiable function f : U → R defined on an open subset U of Rn with
the property that, for some point a in U , the directional derivatives at a satisfy (Du f )(a) > 0
for all unit vectors u in Rn , i.e., the directional derivative at a is positive in every direction?
Either find such a function and point a, or explain why none exists.
7.1. Find an equation of the tangent plane to the surface $z = x^2 - y^2$ at the point a = (1, 2, −3).
7.2. Find an equation of the tangent plane to the surface $x^2 + y^2 - z^2 = 1$ at the point a = (1, −1, 1).
7.4. Let S be the ellipsoid $x^2 + \frac{1}{2} y^2 + \frac{1}{4} z^2 = 4$. Let a = (1, 2, 2), and let n be the unit normal
vector to S at a that points outward from S. If f (x, y, z) = xyz, find the directional derivative
(Dn f )(a).
7.6. Find all values of c such that, at every point of intersection of the spheres
8.5. Let $f(x, y) = x^4 y^3$. Find i and j so that the (i + j)th-order partial derivative
$\frac{\partial^{i+j} f}{\partial x^i\, \partial y^j}(0, 0)$ is
nonzero. What is the value of this partial derivative?
\[ f(x, y) = \begin{cases} \dfrac{x^3 y}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0), \\ 0 & \text{if } (x, y) = (0, 0). \end{cases} \]
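(a) Find formulas for the partial derivatives $\frac{\partial f}{\partial x}(x, y)$ and $\frac{\partial f}{\partial y}(x, y)$ if (x, y) ≠ (0, 0).
(b) Find the values of $\frac{\partial f}{\partial x}(0, 0)$ and $\frac{\partial f}{\partial y}(0, 0)$.
(c) Use your answers to parts (a) and (b) to evaluate the second-order partials
\[ \frac{\partial^2 f}{\partial y\, \partial x}(0, 0) = \frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial x}\Big)(0, 0) \quad \text{and} \quad \frac{\partial^2 f}{\partial x\, \partial y}(0, 0) = \frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial y}\Big)(0, 0) \]
at the origin (and only at the origin). (Warning: Do not assume that the second-order
partial derivatives are continuous.)
(d) Is f of class $C^2$ on $R^2$? Is it of class $C^1$?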
8.7. This exercise gives the details of a full proof of the equality of mixed partials (Theorem 4.21).
Let a = (c, d) be a point of R2 , and let f : B → R be a real-valued function defined on an open
ball B = B(a, r) centered at a. Assume that the first and second-order partial derivatives of
f are continuous on B. Let R be the rectangle whose vertices are a = (c, d), u = (c + h, d),
v = (c + h, d + k), and w = (c, d + k), where h and k are nonzero real numbers small enough
that R ⊂ B. See Figure 4.15.
Let:
10.7. Let k be a real number, and let $f(x, y) = x^3 + y^3 - kxy$. Find all critical points of f, and
classify each one as a local maximum, local minimum, or neither. (The answer may depend
on the value of k.)
4.13. EXERCISES FOR CHAPTER 4 117
Note that using a line y = mx + b to predict the data value $y = y_i$ when $x = x_i$ gives an error
of magnitude $|m x_i + b - y_i|$. Consequently E is called the total squared error. It depends on
the line, which we identify here by its slope m and y-intercept b. Thus E is a function of the
two variables m and b, and we can apply the methods we have been studying to find where
it assumes its minimum.
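Calculate the partial derivatives $\frac{\partial E}{\partial m}$ and $\frac{\partial E}{\partial b}$, and show that the critical points (m, b) of E are
the solutions of the system of equations:
\[ \begin{cases} \Big(\sum\limits_{i=1}^n x_i^2\Big) m + \Big(\sum\limits_{i=1}^n x_i\Big) b = \sum\limits_{i=1}^n x_i y_i \\ \Big(\sum\limits_{i=1}^n x_i\Big) m + n b = \sum\limits_{i=1}^n y_i. \end{cases} \tag{4.30} \]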
In Exercises 10.11–10.12, find the best fitting line in the sense of least squares for the given
data points by solving the system (4.30). Then, plot the points and sketch the line.
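Since (4.30) is a pair of linear equations in m and b, it can also be solved mechanically. The following Python sketch does so with Cramer's rule. The function name `best_fit_line` and the sample data are hypothetical, and the code assumes that the x-coordinates are not all equal, so that the determinant of the system is nonzero.

```python
def best_fit_line(points):
    """Solve the system (4.30) for the slope m and intercept b."""
    n = len(points)
    sx = sum(x for x, _ in points)         # sum of x_i
    sy = sum(y for _, y in points)         # sum of y_i
    sxx = sum(x * x for x, _ in points)    # sum of x_i^2
    sxy = sum(x * y for x, y in points)    # sum of x_i y_i
    det = sxx * n - sx * sx                # determinant of the 2 by 2 system
    if det == 0:
        raise ValueError("all x-coordinates are equal; no unique solution")
    m = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b

# Hypothetical data lying exactly on the line y = 2x + 1.
print(best_fit_line([(0, 1), (1, 3), (2, 5)]))
# Hypothetical data that is not collinear.
print(best_fit_line([(0, 0), (1, 2), (2, 3), (3, 5)]))
```

For data that already lie on a line, the solution of (4.30) recovers that line, which is a quick way to check a hand calculation.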
10.13. In this exercise, we use the second derivative test to verify that, for the best fitting line in the
sense of least squares, the critical point (m, b) given by the system (4.30) is a local minimum
of the total squared error E.
(a) If $x_1, x_2, \ldots, x_n$ are real numbers, show that $\Big(\sum\limits_{i=1}^n x_i\Big)^2 \le n \Big(\sum\limits_{i=1}^n x_i^2\Big)$ and that equality
holds if and only if $x_1 = x_2 = \cdots = x_n$. (Hint: Consider the dot product x · v, where
$x = (x_1, x_2, \ldots, x_n)$ and v = (1, 1, . . . , 1).)
(b) Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be given data points. Show that the Hessian of the
total squared error E is given by:
\[ H(m, b) = \begin{bmatrix} 2 \sum\limits_{i=1}^n x_i^2 & 2 \sum\limits_{i=1}^n x_i \\ 2 \sum\limits_{i=1}^n x_i & 2n \end{bmatrix}. \]
(c) Assuming that the data points don’t all have the same x-coordinate, show that the
critical point of E given by (4.30) is a local minimum. (In fact, it is a global minimum,
as follows once one factors in that E is a quadratic polynomial in m and b, though we
won’t go through the details to justify this.)
11.3. Let U be an open set in R2 , and let f : U → R be a smooth real-valued function defined on
U . Let a be a point of U , and let x = a + v be a nearby point, where v = (h, k).
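Due to the equality of mixed partials, the powers of the operator $\nabla \cdot v = h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}$ under
composition expand in the same way as ordinary binomial expressions. For instance:
\begin{align*}
\Big(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\Big)^2 f &= \Big(\Big(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\Big) \circ \Big(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\Big)\Big) f \\
&= \Big(h^2 \frac{\partial}{\partial x} \circ \frac{\partial}{\partial x} + hk \frac{\partial}{\partial x} \circ \frac{\partial}{\partial y} + hk \frac{\partial}{\partial y} \circ \frac{\partial}{\partial x} + k^2 \frac{\partial}{\partial y} \circ \frac{\partial}{\partial y}\Big) f \\
&= \Big(h^2 \frac{\partial^2}{\partial x^2} + 2hk \frac{\partial^2}{\partial x\, \partial y} + k^2 \frac{\partial^2}{\partial y^2}\Big) f \\
&= h^2 \frac{\partial^2 f}{\partial x^2} + 2hk \frac{\partial^2 f}{\partial x\, \partial y} + k^2 \frac{\partial^2 f}{\partial y^2}.
\end{align*}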
Note that this agrees with the expression derived in (4.22) as part of the quadratic term in
the second-order approximation of f .
(a) Find the corresponding expansion of $\Big(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\Big)^3 f$.
(b) Find a formula for the third-order approximation of f at a. (Recall from first-year
calculus that the third-order Taylor approximation of a function of one variable is
$f(x) = f(a + h) \approx f(a) + f'(a)\, h + \frac{f''(a)}{2} h^2 + \frac{f'''(a)}{3!} h^3$.)
11.4. Let $f(x, y) = a x^2 + b y^2$, where a and b are nonzero constants. For which values of a and b
is the origin a local maximum? For which is it a local minimum? For which is it a saddle
point?
11.5. Complete the description of the behavior of the second-order approximation near a critical
point a by considering the case that det H(a) ≠ 0 and A = 0, where $H(a) = \begin{bmatrix} A & B \\ B & C \end{bmatrix}$.
(b) Show that the quadratic term $A h^2 + 2Bhk + C k^2 = 2Bhk + C k^2$ of the second-order
approximation can be written as a difference of two perfect squares. (Hence, based on
our prototypical models, we predict a to be a saddle point. Hint: In the case that C = 0,
too, consider the expansions of $(a \pm b)^2$.)
12.1. Consider the problem of finding the maximum and minimum values of the function f (x, y) =
xy subject to the constraint x + 2y = 1.
(a) Explain why a minimum value does not exist.
(b) On the other hand, you may assume that a maximum does exist. Use Lagrange multi-
pliers to find the maximum value and the point at which it is attained.
12.2. (a) Find the points on the curve $x^2 - xy + y^2 = 4$ at which the function $f(x, y) = x^2 + y^2$
attains its maximum and minimum values. You may assume that these maximum and
minimum values exist.
(b) Use your answer to part (a) to sketch the curve $x^2 - xy + y^2 = 4$. (Hint: Note that
$f(x, y) = x^2 + y^2$ represents the square of the distance from the origin.)
12.3. At what points on the unit sphere $x^2 + y^2 + z^2 = 1$ does the function f(x, y, z) = 2x − 3y + 4z
attain its maximum and minimum values?
12.4. Find the minimum value of the function $f(x, y, z) = x^2 + y^2 + z^2$ subject to the constraint
2x − 3y + 4z = 1. At what point is the minimum attained?
12.5. Find the positive values of x, y, and z for which the function $f(x, y, z) = x y^2 z^3$ attains its
maximum value subject to the constraint 3x + 4y + 5z = 12. You may assume that a maximum
exists.
12.6. (a) Use the method of Lagrange multipliers to find the positive values of x, y, and z such
that xyz = 1 and x + y + z is as small as possible. What is the minimum value? (You
may assume that a minimum exists.)
(b) Use part (a) to show that, for all positive real numbers a, b, c,
\[ \frac{a + b + c}{3} \ge \sqrt[3]{abc}. \]
This is called the inequality of the arithmetic and geometric means. (Hint: Let
$k = \sqrt[3]{abc}$, and consider a/k, b/k, and c/k.)
12.7. Let A be a 3 by 3 symmetric matrix, $A = \begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix}$, and consider the quadratic polynomial:
\[ Q(\mathbf{x}) = \mathbf{x}^t A \mathbf{x} = \begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \]
where $\mathbf{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$. Use Lagrange multipliers to show that, when Q is restricted to the unit
sphere $x^2 + y^2 + z^2 = 1$, any point $\mathbf{x}$ at which it attains its maximum or minimum value satisfies
$A\mathbf{x} = \lambda \mathbf{x}$ for some scalar λ and that the maximum or minimum value is the corresponding
value of λ. (In the language of linear algebra, $\mathbf{x}$ is called an eigenvector of A and λ is called
the corresponding eigenvalue.) In fact, maxima and minima do exist, so the conclusion is
not an empty one.
12.8. You discover that your happiness is a function of your daily routine. After a great deal of
soul-searching, you decide to focus on three activities to which you will devote your entire
day: if x, y, and z are the number of hours a day that you spend eating, sleeping, and
studying multivariable calculus, respectively, then, based on data provided by the federal
government, your happiness is given by the function:
There is a mix of eating, sleeping, and studying that leads to the greatest possible daily
happiness (you may assume this). What is it? Answer in two ways, as follows.
(a) Substitute for z in terms of x and y in the formula for h, and find the critical points of
the resulting function of x and y.
(b) Use Lagrange multipliers.
12.9. After conducting extensive market research, a manufacturer of monkey saddles discovers that,
if it produces a saddle consisting of x ounces of leather, y ounces of copper, and z ounces of
premium bananas, the monkey rider experiences a total of:
Leather costs $1 per ounce, copper $2 per ounce, and premium bananas $3 per ounce, and the
manufacturer is willing to spend at most $1000 per saddle. What combination of ingredients
yields the most satisfied monkey? You may assume that a maximum exists.
5
See Exercise 6.11, page 114.
Chapter 5
Figure 5.1: The region W under the graph z = f (x, y) and above D
Consider the volume of W, which we denote variously by $\iint_D f(x, y)\, dA$, $\iint_D f(x, y)\, dx\, dy$, or
just $\iint_D f\, dA$. To calculate it, we apply the following principle from first-year calculus:
Volume = Integral of cross-sectional area.
For instance, in first-year calculus, the cross-sections may turn out to be disks, washers, or, in a
slight variant, cylinders.
Example 5.1. Find the volume under the plane z = 4 − x + 2y and above the square 1 ≤ x ≤ 3,
2 ≤ y ≤ 4, in the xy-plane. See Figure 5.2.
122 CHAPTER 5. REAL-VALUED FUNCTIONS: INTEGRATION
base is a line segment from y = 2 to y = 4, and the top is a curve lying in the graph z = 4 − x + 2y.
(In fact, the cross-sections in this example are trapezoids, but let’s not worry about that.) In other
words, A(x) is an area under a curve, so it too is an integral: $A(x) = \int_2^4 (4 - x + 2y)\, dy$, where,
for the given cross-section, x is fixed. Integrating the cross-sectional area as in equation (5.1) and
evaluating gives:
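\begin{align*}
\text{Volume} &= \int_1^3 \Big( \int_2^4 (4 - x + 2y)\, dy \Big) dx \\
&= \int_1^3 \Big( 4y - xy + y^2 \Big) \Big|_{y=2}^{y=4}\, dx \\
&= \int_1^3 \big( (16 - 4x + 16) - (8 - 2x + 4) \big)\, dx \\
&= \int_1^3 (20 - 2x)\, dx \\
&= \big( 20x - x^2 \big) \Big|_1^3 \\
&= (60 - 9) - (20 - 1) \\
&= 32.
\end{align*}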
5.1. VOLUME AND ITERATED INTEGRALS 123
Alternatively, we could have used cross-sections perpendicular to the y-axis, in which case the
cross-sections occur from y = 2 to y = 4 (Figure 5.4):
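\[ \text{Volume} = \int_2^4 A(y)\, dy. \tag{5.2} \]
Each cross-section lies below the graph and above a line segment that goes from x = 1 to x = 3,
so $A(y) = \int_1^3 (4 - x + 2y)\, dx$. This time the integral of the cross-sectional area (5.2) gives:
\begin{align*}
\text{Volume} &= \int_2^4 \Big( \int_1^3 (4 - x + 2y)\, dx \Big) dy \\
&= \int_2^4 \Big( 4x - \frac{1}{2} x^2 + 2xy \Big) \Big|_{x=1}^{x=3}\, dy \\
&= \int_2^4 \Big( \big(12 - \frac{9}{2} + 6y\big) - \big(4 - \frac{1}{2} + 2y\big) \Big)\, dy \\
&= \int_2^4 (4 + 4y)\, dy \\
&= \big( 4y + 2y^2 \big) \Big|_2^4 \\
&= 48 - 16 \\
&= 32.
\end{align*}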
According to this approach, the volume is the integral of an integral, or an iterated integral.
Evaluating the integral consists of a sequence of “partial antidifferentiations” with respect to one
variable at a time, all other variables being held constant. Once you integrate with respect to a
particular variable, that variable disappears from the remainder of the calculation.
Example 5.2. Find the volume of the solid that lies inside the cylinders $x^2 + z^2 = 1$ and $y^2 + z^2 = 1$
and above the xy-plane.
The cylinders have axes along the y and x-axis, respectively, and intersect one another at right
angles, as in Figure 5.5. Perhaps it’s simplest to focus on the quarter of the solid that lies in the first
“octant,” i.e., the region of $R^3$ in which x, y, and z are all nonnegative. The base of this portion
is the unit square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 (Figure 5.6, left). Half of the solid lies under $x^2 + z^2 = 1$
(Figure 5.6, right), running in the y-direction above the triangle in the xy-plane bounded by the
x-axis, the line x = 1, and the line y = x. Call this triangle $D_1$. The other half runs in the
x-direction, under $y^2 + z^2 = 1$ and above a complementary triangle $D_2$ within the unit square.
Figure 5.5: The cylinders $x^2 + z^2 = 1$ (left), $y^2 + z^2 = 1$ (middle), and the region contained in both
(right)
Figure 5.6: The portion in the first octant (left) and the portion under only $x^2 + z^2 = 1$ (right)
To calculate the volume over D1 , let’s try cross-sections perpendicular to the x-axis. These
exist from x = 0 to x = 1:
\[ \text{Volume} = 8 \int_0^1 A(x)\, dx. \]
The base of each cross-section is a line segment perpendicular to the x-axis within D1 . The lower
endpoint is always at y = 0, but the upper endpoint depends on x. Indeed, the upper endpoint lies
on the line y = x. See Figure 5.7.
Consequently:
\[ \text{Volume} = 8 \int_0^1 A(y)\, dy = 8 \int_0^1 \Big( \int_y^1 \sqrt{1 - x^2}\, dx \Big) dy. \]
While not impossible, it is a little less obvious how to do the first partial antidifferentiation with
respect to x by hand. In other words, our original order of antidifferentiation might be preferable.
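Both orders of integration over the triangle $D_1$ can be compared numerically. In the Python sketch below, the midpoint-rule helper `integrate` and the number of subintervals are assumptions made for illustration. The limits are the ones described for Example 5.2: y from 0 to x when integrating first with respect to y, and x from y to 1 when integrating first with respect to x. With the original order, the inner integral is just $x\sqrt{1 - x^2}$, and the substitution $u = 1 - x^2$ gives the exact value 8/3 for comparison.

```python
import math

def integrate(func, lo, hi, n=300):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * width) for i in range(n)) * width

def height(x):
    return math.sqrt(1 - x * x)     # z on the cylinder x^2 + z^2 = 1

# dy dx: for each x in [0, 1], y runs from 0 to x.
dy_dx = 8 * integrate(lambda x: integrate(lambda y: height(x), 0, x), 0, 1)

# dx dy: for each y in [0, 1], x runs from y to 1.
dx_dy = 8 * integrate(lambda y: integrate(lambda x: height(x), y, 1), 0, 1)

print(dy_dx, dx_dy, 8 / 3)          # the first two approximate the third
```

The two approximations agree to several decimal places, even though only the first order leads to an easy antiderivative by hand.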
Example 5.3. Consider the iterated integral $\int_0^2 \Big( \int_{x^2}^4 x^3 e^{y^3}\, dy \Big) dx$.
(b) To evaluate the integral as presented, we would antidifferentiate first with respect to y, treating
x as constant:
\[ \int_0^2 \Big( \int_{x^2}^4 x^3 e^{y^3}\, dy \Big) dx = \int_0^2 x^3 \Big( \int_{x^2}^4 e^{y^3}\, dy \Big) dx. \]
The innermost antiderivative looks hard. So, having nothing better to do, we try switching the
order of antidifferentiation. Changing to cross-sections perpendicular to the y-axis, we see from the
description of D that y goes from y = 0 to y = 4, and, for each y, x goes from x = 0 to $x = \sqrt{y}$.
The thinking behind the switched order is illustrated in Figure 5.10.
Therefore:
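\begin{align*}
\int_0^2 \Big( \int_{x^2}^4 x^3 e^{y^3}\, dy \Big) dx &= \int_0^4 \Big( \int_0^{\sqrt{y}} x^3 e^{y^3}\, dx \Big) dy \\
&= \int_0^4 \Big( \frac{1}{4} x^4 e^{y^3} \Big) \Big|_{x=0}^{x=\sqrt{y}}\, dy \\
&= \int_0^4 \Big( \frac{1}{4} y^2 e^{y^3} - 0 \Big)\, dy \qquad (\text{let } u = y^3,\ du = 3y^2\, dy) \\
&= \frac{1}{4} \cdot \frac{1}{3} e^{y^3} \Big|_0^4 \\
&= \frac{1}{12} (e^{64} - 1).
\end{align*}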
To define the integral, we take as a guiding principle that it is a limit of weighted sums. This is
often what gives the integral its power.
We begin by reviewing the one-variable case. Let f : [a, b] → R be defined on a closed interval
[a, b]. We subdivide the interval into n subintervals of widths $\Delta x_1, \Delta x_2, \ldots, \Delta x_n$ and choose a point
$p_i$ in the ith subinterval for i = 1, 2, . . . , n. See Figure 5.11. We then form the sum $\sum_i f(p_i)\, \Delta x_i$.
This is called a Riemann sum, and we refer to the pi as sample points. We think of the
Figure 5.11: Subdividing [a, b] into n subintervals of widths $\Delta x_1, \Delta x_2, \ldots, \Delta x_n$ with sample points
$p_1, p_2, \ldots, p_n$
Riemann sum as the sum of the values f (pi ) at the sample points weighted by the width of the
subintervals (though it may be equally appropriate on occasion to think of it as the widths of the
subintervals weighted by the function values). The integral is defined as a limit of the Riemann
sums:
\[ \int_a^b f(x)\, dx = \lim_{\Delta x_i \to 0} \sum_i f(p_i)\, \Delta x_i. \]
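A Riemann sum is easy to compute directly. The Python sketch below uses subintervals of equal width and lets a parameter choose where in each subinterval the sample point sits. The function name `riemann_sum`, the `sample` parameter, and the test function x² on [0, 1] are assumptions made for illustration; the definition itself allows unequal widths and arbitrary sample points.

```python
def riemann_sum(f, a, b, n, sample=0.5):
    """Sum of f(p_i) * dx over n equal subintervals of [a, b].

    sample in [0, 1] places p_i within each subinterval:
    0 is the left endpoint, 0.5 the midpoint, 1 the right endpoint.
    """
    dx = (b - a) / n
    return sum(f(a + (i + sample) * dx) * dx for i in range(n))

def square(x):
    return x * x

for n in (10, 100, 1000):
    print(n,
          riemann_sum(square, 0, 1, n, 0.0),   # left endpoints
          riemann_sum(square, 0, 1, n, 0.5),   # midpoints
          riemann_sum(square, 0, 1, n, 1.0))   # right endpoints
```

As n grows, all three columns approach 1/3, the value of the integral of x² over [0, 1], regardless of how the sample points are chosen.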
Definition. If X and Y are sets, then the Cartesian product X × Y is defined to be the set of
all ordered pairs:
X × Y = {(x, y) : x ∈ X, y ∈ Y }.
For instance, the product of two closed intervals R = [a, b] × [c, d] = {(x, y) : x ∈ [a, b], y ∈
[c, d]} = {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d} is a rectangle in R2 whose sides are parallel to the
coordinate axes (Figure 5.12). We define how to integrate a function f (x, y) over such a rectangle
R.
We chop up R into a grid of subrectangles with sides parallel to the axes and of dimensions
$\Delta x_i$ by $\Delta y_j$. A simple example is shown in Figure 5.13. We choose a sample point $p_{ij}$ in each
subrectangle, and consider the Riemann sum $\sum_{i,j} f(p_{ij})\, \Delta x_i\, \Delta y_j$. This is a weighted sum in which
the function values at the sample points are weighted by the areas of the subrectangles.
Now, let $\Delta x_i$ and $\Delta y_j$ go to 0.
assuming that the limit exists. When it does, we say that f is integrable on R.
There are other versions of integrals, too, and the one defined here is known technically as the
Riemann integral. Integrals of functions of two variables are called double integrals.
We begin with a standard, somewhat artificial example that is constructed to make a point.
Example 5.4. Let R = [0, 1] × [0, 1], and define f : R → R by:
\[ f(x, y) = \begin{cases} 0 & \text{if } x \text{ and } y \text{ are both rational,} \\ 1 & \text{otherwise.} \end{cases} \]
For instance, $f(\frac{1}{2}, \frac{3}{5}) = 0$, while $f(\frac{1}{2}, \frac{\sqrt{2}}{2}) = 1$ and $f(\frac{\pi}{6}, \frac{\sqrt{2}}{2}) = 1$. The graph z = f(x, y) consists of
two 1 by 1 squares, one at height z = 0 and one at height z = 1, both with lots of missing points.
To evaluate the integral $\iint_R f(x, y)\, dA$ using the definition, consider a subdivision of R into
subrectangles. Regardless of the subdivision, each subrectangle contains some points where f = 0
and others where f = 1. Thus some Riemann sums have the form $\sum_{i,j} 0 \cdot \Delta x_i\, \Delta y_j = 0$, while others
have the form $\sum_{i,j} 1 \cdot \Delta x_i\, \Delta y_j$ = Area of R = 1. This is true no matter what $\Delta x_i$ and $\Delta y_j$ are, so
$\sum_{i,j} f(p_{ij})\, \Delta x_i\, \Delta y_j$ does not approach a limit as $\Delta x_i$ and $\Delta y_j$ go to 0. Thus f is not integrable
on R.
This raises the question of whether it’s possible to tell when a function is integrable. For the
preceding function, in some sense the problem is that it is too discontinuous.
5.2. THE DOUBLE INTEGRAL 129
Theorem 5.6 (Fubini’s theorem). If f is integrable on R = [a, b] × [c, d], then $\iint_R f(x, y)\,dA$ is
equal to both of the iterated integrals:
$$\iint_R f(x, y)\,dA = \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx = \int_c^d \left( \int_a^b f(x, y)\,dx \right) dy.$$
In particular, the iterated integrals that we computed in Section 5.1 were actually examples of
double integrals.
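As an informal numerical illustration of Fubini's theorem, the following sketch approximates both iterated integrals with midpoint sums and compares them. The integrand x·y² on [0, 1] × [0, 2] and the helper name `iterated_midpoint` are assumptions chosen for illustration; they do not come from the text. For this integrand, both orders should be close to the exact value 4/3.

```python
def iterated_midpoint(f, a, b, c, d, n=200, x_outer=True):
    """Approximate an iterated integral of f over [a, b] x [c, d].

    Uses n midpoint samples in each variable. If x_outer is True the
    inner integration is with respect to y (order dy dx); otherwise the
    inner integration is with respect to x (order dx dy).
    """
    dx = (b - a) / n
    dy = (d - c) / n
    xs = [a + (i + 0.5) * dx for i in range(n)]
    ys = [c + (j + 0.5) * dy for j in range(n)]
    total = 0.0
    if x_outer:
        for x in xs:
            inner = sum(f(x, y) for y in ys) * dy   # integrate out y first
            total += inner * dx
    else:
        for y in ys:
            inner = sum(f(x, y) for x in xs) * dx   # integrate out x first
            total += inner * dy
    return total


def sample_integrand(x, y):
    # Hypothetical integrand used only for this illustration.
    return x * y ** 2


order_dy_dx = iterated_midpoint(sample_integrand, 0.0, 1.0, 0.0, 2.0, x_outer=True)
order_dx_dy = iterated_midpoint(sample_integrand, 0.0, 1.0, 0.0, 2.0, x_outer=False)
print(order_dy_dx, order_dx_dy)
```

Agreement of the two numbers is only a sanity check, not a proof: both are finite sums over the same grid, so they agree up to rounding regardless of the theorem. The meaningful comparison is with the exact value.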
Step 2. Integrals over bounded sets in general.
Let f : D → R be a bounded function defined on a bounded subset D of $\mathbb{R}^2$. By definition, D is
contained in a rectangle R. By taking an even larger rectangle, if necessary, we may assume that
R has the form [a, b] × [c, d]. Define a function $\tilde{f} : R \to \mathbb{R}$ by:
$$\tilde{f}(x, y) = \begin{cases} f(x, y) & \text{if } (x, y) \in D, \\ 0 & \text{otherwise.} \end{cases}$$
The integral of $\tilde{f}$, as a function defined on a rectangle, has been covered in Step 1.
Definition. We say that f : D → R is integrable on D if $\tilde{f}$ is integrable on R. If so, we define
$$\iint_D f(x, y)\,dA = \iint_R \tilde{f}(x, y)\,dA.$$
Thus the integral of f over D is defined as a limit of Riemann sums for $\tilde{f}$, $\sum_{i,j} \tilde{f}(p_{ij})\,\Delta x_i\,\Delta y_j$,
based on subdivisions of the larger rectangle R into subrectangles. Since $\tilde{f} = 0$ away from D,
however, only those subrectangles that intersect D contribute to the Riemann sum. Hence we
might write:
$$\iint_D f(x, y)\,dA = \lim_{\substack{\Delta x_i \to 0 \\ \Delta y_j \to 0}} \; \sum_{\substack{\text{subrectangles} \\ \text{that intersect } D}} f(p_{ij})\,\Delta x_i\,\Delta y_j.$$
Figure 5.16: The general case: subdividing a rectangle containing D into subrectangles. Only those
subrectangles that intersect D may contribute to a Riemann sum for $\tilde{f}$.
As Figure 5.15 indicates, even if f is continuous on D, the function $\tilde{f}$ might well be discontinuous
on the boundary of D. Nevertheless, by Theorem 5.5, $\tilde{f}$ will still be integrable on R as long as the
boundary of D consists of finitely many smooth curves. In other words, the integral can tolerate
a certain amount of bad behavior on the boundary. Having this latitude is something we take
advantage of later on.
Here are some basic properties of the integral. They are quite reasonable and can be proven
using the definition of the integral as a limit of sums.
where the subrectangles come from subdividing a rectangle R that contains D. Now, we go about
trying to understand why this might be useful.
First, the integral is a limit of sums. Therefore, if a typical summand $f(p_{ij})\,\Delta x_i\,\Delta y_j$ represents
“something,” then $\iint_D f(x, y)\,dA$ represents the total “something.” Also, we have noted before that
the Riemann sum is the sum of the sample values $f(p_{ij})$ weighted by the area of the subrectangles.
We consider a few situations where such a weighted sum might arise.
Example 5.8. The first is when f (x, y) represents some sort of density, or quantity per unit
area, at the point (x, y), for instance, mass density or population density. Then $f(p_{ij})\,\Delta x_i\,\Delta y_j$ is
approximately the quantity on a typical subrectangle, and
$$\iint_D (\text{density})\,dA = \text{total quantity in } D,$$
Example 5.9. Suppose that f is the constant function f (x, y) = 1 for all (x, y) in D. Then
$f(p_{ij})\,\Delta x_i\,\Delta y_j = 1 \cdot \Delta x_i\,\Delta y_j$ is the area of a subrectangle, so:
$$\iint_D 1\,dA = \text{Area}(D).$$
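To make the area formula concrete, here is a small sketch that applies the extend-by-zero idea from Step 2: it forms a Riemann sum over a rectangle containing D and counts only sample points that land in D. The choice of D as the unit disk, the grid size, and the names `riemann_sum_over_region` and `in_unit_disk` are assumptions made for illustration. With f = 1, the sum should come out close to the area π of the disk.

```python
import math


def riemann_sum_over_region(f, in_region, a, b, c, d, n=400):
    """Midpoint Riemann sum of f extended by zero outside a region.

    The rectangle [a, b] x [c, d] is assumed to contain the region, and
    in_region(x, y) reports whether a sample point belongs to it.
    """
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            if in_region(x, y):          # the extension is 0 elsewhere
                total += f(x, y) * dx * dy
    return total


def in_unit_disk(x, y):
    # Hypothetical region D: the closed unit disk.
    return x * x + y * y <= 1.0


disk_area = riemann_sum_over_region(lambda x, y: 1.0, in_unit_disk,
                                    -1.0, 1.0, -1.0, 1.0)
print(disk_area, math.pi)
```

The subrectangles that straddle the boundary circle are the only source of error here, which matches the remark that the integral can tolerate some bad behavior on the boundary of D.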
Example 5.10. Let’s return to the graph of a function z = f (x, y), where f (x, y) ≥ 0 for all (x, y)
in D. Then $f(p_{ij})$ is the height of the graph above the sample point $p_{ij}$, and $f(p_{ij})\,\Delta x_i\,\Delta y_j$ is
approximately the volume below the graph and above a typical subrectangle (Figure 5.17). Thus:
$$\iint_D f(x, y)\,dA = \text{volume below the graph of } f \text{ and above } D.$$
Figure 5.17: Visualizing a Riemann sum as approximating the volume under a graph
Taking the limit as $\Delta x_i$ and $\Delta y_j$ go to 0 leads to the following definition.
Definition.
1. The average value of a bounded function f (x, y) on a bounded subset D of $\mathbb{R}^2$, denoted $\bar{f}$,
is defined to be:
$$\bar{f} = \frac{1}{\text{Area}(D)} \iint_D f\,dA.$$
the left by y = −x, on the right by y = x, and on top by y = 2. We use cross-sections perpendicular
to the y-axis. These cross-sections exist from y = 0 to y = 2, and, for each y, x goes from x = −y
to x = y. Hence:
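$$\begin{aligned}
\iint_D (y^2 - x^2)\,dA &= \int_0^2 \left( \int_{-y}^{y} (y^2 - x^2)\,dx \right) dy \\
&= \int_0^2 \left( y^2 x - \tfrac{1}{3} x^3 \right) \Big|_{x=-y}^{x=y}\,dy \\
&= \int_0^2 \left( \left( y^3 - \tfrac{1}{3} y^3 \right) - \left( -y^3 + \tfrac{1}{3} y^3 \right) \right) dy \\
&= \int_0^2 \tfrac{4}{3} y^3\,dy \\
&= \tfrac{1}{3} y^4 \Big|_0^2 \\
&= \frac{16}{3}.
\end{aligned}$$
Therefore $\bar{f} = \frac{1}{4} \cdot \frac{16}{3} = \frac{4}{3}$.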
As a side note, suppose we had tried to calculate the double integral using cross-sections per-
pendicular to the x-axis instead. These cross-sections exist from x = −2 to x = 2, but the y
endpoints of a given cross-section are different for the left and right halves of the triangle. When x
is between −2 and 0, the cross-section at x goes from y = −x to y = 2, whereas when x is between
0 and 2, it goes from y = x to y = 2 (Figure 5.20). Thus to describe D this way, the integral must
be split into two pieces:
$$\iint_D (y^2 - x^2)\,dA = \int_{-2}^{0} \left( \int_{-x}^{2} (y^2 - x^2)\,dy \right) dx + \int_{0}^{2} \left( \int_{x}^{2} (y^2 - x^2)\,dy \right) dx.$$
Actually, we could try to take advantage of symmetry properties of D and f to avoid having
to evaluate both halves. We take up this idea more fully when we discuss the change of variables
theorem in Chapter 7.
(b) Because of the symmetry of D in the y-axis, it’s intuitively clear that the average x-coordinate
is 0: $\bar{x} = 0$. Again, we justify this more carefully later (see Example 7.7 in Chapter 7).
For the average y-coordinate, by definition, $\bar{y} = \frac{1}{\text{Area}(D)} \iint_D y\,dA = \frac{1}{4} \iint_D y\,dA$. We describe D
using the same limits of integration as in part (a) with cross-sections perpendicular to the y-axis:
$$\begin{aligned}
\iint_D y\,dA &= \int_0^2 \left( \int_{-y}^{y} y\,dx \right) dy \\
&= \int_0^2 xy \Big|_{x=-y}^{x=y}\,dy \\
&= \int_0^2 \left( y^2 - (-y^2) \right) dy \\
&= \int_0^2 2y^2\,dy \\
&= \tfrac{2}{3} y^3 \Big|_0^2 \\
&= \frac{16}{3}.
\end{aligned}$$
Thus, $\bar{y} = \frac{1}{4} \cdot \frac{16}{3} = \frac{4}{3}$, and the centroid is the point $(\bar{x}, \bar{y}) = (0, \frac{4}{3})$.
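The values found in parts (a) and (b) can be checked numerically. The sketch below is only a sanity check under stated assumptions: it describes D by cross-sections perpendicular to the y-axis (0 ≤ y ≤ 2, −y ≤ x ≤ y), as in the text, and replaces each integral by a midpoint sum. The function name `integrate_over_triangle` and the grid size are illustrative choices, not part of the original.

```python
def integrate_over_triangle(f, n=400):
    """Midpoint approximation of the iterated integral
    int_{y=0}^{2} int_{x=-y}^{y} f(x, y) dx dy."""
    dy = 2.0 / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * dy
        dx = 2.0 * y / n                 # the cross-section at height y has length 2y
        inner = sum(f(-y + (i + 0.5) * dx, y) for i in range(n)) * dx
        total += inner * dy
    return total


area = integrate_over_triangle(lambda x, y: 1.0)
avg_f = integrate_over_triangle(lambda x, y: y ** 2 - x ** 2) / area
x_bar = integrate_over_triangle(lambda x, y: x) / area
y_bar = integrate_over_triangle(lambda x, y: y) / area
print(area, avg_f, x_bar, y_bar)
```

The printed values should be close to 4, 4/3, 0, and 4/3 respectively, in agreement with the hand computation.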
We next describe surfaces by how they are traced out. This is like the way we described curves using
parametrizations, except that now, since surfaces are two-dimensional, we need two parameters. In
other words, surfaces will be described by functions σ : D → Rn , where D ⊂ R2 , as illustrated in
Figure 5.21. We want the domain of the parametrization to be two-dimensional, so we assume that
D is an open set U in R2 together with all of its boundary points. The technical term is that D
is the closure of U . Actually, the theory carries over if some of the boundary points are missing,
5.4. PARAMETRIZATIONS OF SURFACES 135
but, in our examples, D will often be a rectangle or a closed disk in R2 , in which case all boundary
points are included.
The coordinates of D are the parameters. There are often natural choices for what to call them
depending on the surface, though generically we call them s and t. For every (s, t) in D, σ(s, t)
is a point in Rn , so we may write σ(s, t) = (x1 (s, t), x2 (s, t), . . . , xn (s, t)). We shall focus almost
exclusively on surfaces in R3 , in which case, σ(s, t) = (x(s, t), y(s, t), z(s, t)).
The reason for introducing the topic now is that this is a good time to talk about how to integrate
real-valued functions over surfaces. Because surfaces are two-dimensional objects, in principle, these
integrals should be extensions of the double integrals we are in the midst of studying. For example,
the interpretations of limits of weighted sums that apply to integrals over regions in the plane
should then carry over to integrals over surfaces as well.
In the remainder of this section, we look at several examples of parametrizations of surfaces
and, in particular, some common geometric parameters. Integrals we leave for the next section.
Example 5.13. Consider the portion of the cone $z = \sqrt{x^2 + y^2}$ in $\mathbb{R}^3$ where 0 ≤ z ≤ 2. See Figure
5.22. The surface is already described in terms of x and y, so we use (x, y) as parameters with
$z = \sqrt{x^2 + y^2}$, that is, let:
$$\sigma(x, y) = (x, y, \sqrt{x^2 + y^2}).$$
The domain D of σ is the set of all (x, y) such that $0 \le \sqrt{x^2 + y^2} \le 2$, or $x^2 + y^2 \le 4$. This is a
disk of radius 2 in the xy-plane. As (x, y) ranges over D, σ(x, y) sweeps out the given portion of
the cone.
In this example, we used the x and y coordinates of the xyz-coordinate system as parameters
for the surface. This is always an option for surfaces that are graphs z = f (x, y) of functions of
two variables. We next introduce some other coordinate systems that can be used to locate points
in the plane or in three-dimensional space. These coordinate systems provide the most convenient
way to approach a variety of situations, though, for the time being, we just show how two of the
coordinates are natural parameters for certain common surfaces.
(r cos θ, r sin θ) in rectangular coordinates. Thus the conversions from polar to rectangular coordinates are:
$$\begin{cases} x = r\cos\theta \\ y = r\sin\theta. \end{cases}$$
Also, by the Pythagorean theorem, $r = \sqrt{x^2 + y^2}$. All of $\mathbb{R}^2$ can be described by choosing r and θ
in the ranges r ≥ 0 and 0 ≤ θ ≤ 2π. Actually, θ = 0 and θ = 2π represent the same points, along
the positive x-axis, but it is convenient to allow this degree of redundancy.
For example, the disk D of radius 2 about the origin in Example 5.13 can be described in polar
coordinates by the conditions 0 ≤ r ≤ 2, 0 ≤ θ ≤ 2π, that is, (r, θ) ∈ [0, 2] × [0, 2π]. We think of
imposing polar coordinates on D as a transformation T : [0, 2] × [0, 2π] → R2 from the rθ-plane to
the xy-plane, where T (r, θ) = (r cos θ, r sin θ). Horizontal segments 0 ≤ r ≤ 2, θ = constant, in the
rθ-plane get mapped to radial segments emanating from the origin in the xy-plane, and vertical
segments r = constant, 0 ≤ θ ≤ 2π, get mapped to circles centered at the origin. This is illustrated
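in Figure 5.24.
Example 5.14. We return to the cone $z = \sqrt{x^2 + y^2}$, 0 ≤ z ≤ 2, from Example 5.13. In polar
coordinates, the equation of the cone is z = r, and combining this with the polar description of the
disk D gives a parametrization $\tilde{\sigma}$ of the cone with r and θ as parameters:
$$\tilde{\sigma} : [0, 2] \times [0, 2\pi] \to \mathbb{R}^3, \qquad \tilde{\sigma}(r, \theta) = (r\cos\theta, r\sin\theta, r).$$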
See Figure 5.25. The actual cone is the same one as before, but one advantage of this parametriza-
tion is that, for the purposes of integration, rectangles are nicer domains than disks.
Figure 5.24: The polar coordinate transformation from the rθ-plane to the xy-plane
Again $r = \sqrt{x^2 + y^2}$. All of $\mathbb{R}^3$ is covered by choosing r ≥ 0, 0 ≤ θ ≤ 2π, −∞ < z < ∞.
Example 5.15 (Circular cylinders). We parametrize the circular cylinder of radius a and height
h given by the conditions:
$$x^2 + y^2 = a^2, \qquad 0 \le z \le h.$$
The axis of the cylinder is the z-axis.
In cylindrical coordinates, along the cylinder, r is fixed at the radius a, but θ and z can vary
independently. The surface is traced out by setting r = a and letting θ and z range over 0 ≤ θ ≤ 2π,
0 ≤ z ≤ h. In other words, we can parametrize the cylinder using θ and z as parameters. With
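r = a fixed, the conversions from cylindrical to rectangular coordinates give the parametrization:
$$\sigma : [0, 2\pi] \times [0, h] \to \mathbb{R}^3, \qquad \sigma(\theta, z) = (a\cos\theta, a\sin\theta, z).$$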
It is instructive to think about how σ transforms the rectangle D = [0, 2π] × [0, h] to become
the cylinder S. In polar/cylindrical coordinates, θ = 0 and θ = 2π represent the same thing in
xyz-space. Thus σ maps the points (0, z) and (2π, z) in the θz-parameter plane to the same point
in R3 . Apart from this, distinct points in D get mapped to distinct points in xyz-space. One way
to construct a cylinder is to take a rectangle like D and glue together, or identify, points (0, z) on
the left edge with points (2π, z) on the right. It is clear geometrically that the result is a cylinder,
and σ gives a formula that carries it out.
To plot a point in R3 with spherical coordinates (ρ, φ, θ), first go out ρ units along the positive z-
axis. This brings you to the point (0, 0, ρ) in rectangular coordinates. Then, rotate counterclockwise
about the positive y-axis by an angle φ, bringing you to the point (ρ sin φ, 0, ρ cos φ) in the xz-plane,
a distance ρ from the origin and r = ρ sin φ from the z-axis. Finally, rotate counterclockwise about
the positive z-axis by an angle θ. This lands at the point (ρ sin φ cos θ, ρ sin φ sin θ, ρ cos φ). The
coordinates are labeled in Figure 5.28.
Thus the conversions from spherical to rectangular coordinates are:
$$\begin{cases} x = \rho\sin\phi\cos\theta \\ y = \rho\sin\phi\sin\theta \\ z = \rho\cos\phi. \end{cases}$$
From the Pythagorean theorem, the distance from the origin is $\rho = \sqrt{x^2 + y^2 + z^2}$. All of $\mathbb{R}^3$ is
covered by spherical coordinates in the intervals ρ ≥ 0, 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π. (Note that the
angular interval 0 ≤ φ ≤ π ranges from the direction of the positive z-axis to the direction of
the negative z-axis, and the interval 0 ≤ θ ≤ 2π sweeps all the way around the z-axis. This is
why values of φ in the interval π < φ ≤ 2π are not needed—they would duplicate points already
covered.)
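The spherical-to-rectangular conversions translate directly into code. The sketch below implements them as stated above, together with one possible inverse. The inverse is an assumption added for illustration: the text gives only ρ = √(x² + y² + z²), and the formulas used here for φ and θ (via `acos` and `atan2`), as well as the conventions at the origin and along the z-axis, are choices made for this example.

```python
import math


def spherical_to_rect(rho, phi, theta):
    """Convert spherical coordinates (rho, phi, theta) to (x, y, z)."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return (x, y, z)


def rect_to_spherical(x, y, z):
    """One possible inverse, with 0 <= phi <= pi and 0 <= theta < 2*pi.

    Conventions assumed here: phi = 0 at the origin, and theta = 0 for
    points on the z-axis, where theta is not determined.
    """
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return (0.0, 0.0, 0.0)
    # Clamp to guard against rounding pushing the ratio outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, z / rho)))
    theta = math.atan2(y, x) % (2.0 * math.pi)
    return (rho, phi, theta)


point = (1.0, -2.0, 3.0)
rho, phi, theta = rect_to_spherical(*point)
round_trip = spherical_to_rect(rho, phi, theta)
print((rho, phi, theta), round_trip)
```

Converting a point to spherical coordinates and back should reproduce it up to rounding, and any input with φ = 0 lands on the positive z-axis regardless of θ.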
Example 5.16 (Spheres). We parametrize the sphere of radius a centered at the origin:
$$x^2 + y^2 + z^2 = a^2.$$
In spherical coordinates, the sphere is traced out by keeping the value of ρ fixed at the radius a,
while φ and θ vary independently. More precisely, set ρ = a, and let φ and θ range over 0 ≤ φ ≤ π,
0 ≤ θ ≤ 2π. Hence we use φ and θ as parameters, and, from the spherical to rectangular conversions
with ρ = a fixed, obtain the parametrization:
σ : [0, π] × [0, 2π] → R3 , σ(φ, θ) = (a sin φ cos θ, a sin φ sin θ, a cos φ).
This is illustrated in Figure 5.29.
Again the parametrization gives precise instructions for how to transform the rectangle D =
[0, π] × [0, 2π] into a sphere. Here, all points where φ = 0 get mapped to the north pole. These are
the points along the left edge of D. Similarly, the right edge of D, where φ = π, gets mapped to
the south pole. Finally, points where θ = 0 and θ = 2π, that is, along the top and bottom edges,
are identified in pairs along a meridian on the sphere in the xz-plane. Thus to construct a sphere
from a rectangle, collapse each of the left and right sides to points and zip together the remaining
two sides to close up the surface.
Example 5.17. We return one last time to the cone $z = \sqrt{x^2 + y^2}$, 0 ≤ z ≤ 2, which we
have parametrized twice already, once using rectangular coordinates (x, y) and once using po-
lar/cylindrical coordinates (r, θ) as parameters. It can be parametrized using spherical coordinates
as well.
Here, along the cone, the angle φ down from the positive z-axis is fixed at $\frac{\pi}{4}$. To see this, note,
for instance, that the cross-section with the yz-plane is z = |y|. On the other hand, ρ and θ can
vary independently. For example, ρ ranges from 0 to $2\sqrt{2}$. (To determine the upper endpoint, one
can either use trigonometry or solve for ρ using z = ρ cos φ with z = 2 and $\phi = \frac{\pi}{4}$.) Also, θ rotates
all the way around from 0 to 2π. From the spherical conversions to rectangular coordinates, we
obtain a parametrization:
$$\hat{\sigma} : [0, 2\sqrt{2}] \times [0, 2\pi] \to \mathbb{R}^3,$$
$$\hat{\sigma}(\rho, \theta) = \left( \rho\sin\tfrac{\pi}{4}\cos\theta,\ \rho\sin\tfrac{\pi}{4}\sin\theta,\ \rho\cos\tfrac{\pi}{4} \right) = \left( \tfrac{\sqrt{2}}{2}\rho\cos\theta,\ \tfrac{\sqrt{2}}{2}\rho\sin\theta,\ \tfrac{\sqrt{2}}{2}\rho \right).$$
We leave as an exercise the instructions that $\hat{\sigma}$ gives for turning a rectangle into a cone. See Figure
5.30.
As the cone illustrates, it may be possible to parametrize a given surface in several different
ways.
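As a quick check that the three descriptions really trace out the same cone, the sketch below evaluates each parametrization and tests that the resulting points satisfy z = √(x² + y²) with 0 ≤ z ≤ 2. The polar version is taken to be (r cos θ, r sin θ, r), which is what the description z = r in Example 5.14 amounts to. The function names, sample values, and tolerance are illustrative assumptions.

```python
import math


def sigma_rect(x, y):
    """Cone parametrized by rectangular coordinates (x, y), x^2 + y^2 <= 4."""
    return (x, y, math.sqrt(x * x + y * y))


def sigma_polar(r, theta):
    """Cone parametrized by polar/cylindrical (r, theta) in [0, 2] x [0, 2*pi]."""
    return (r * math.cos(theta), r * math.sin(theta), r)


def sigma_spherical(rho, theta):
    """Cone parametrized by spherical (rho, theta), with phi fixed at pi/4."""
    s = math.sqrt(2.0) / 2.0             # sin(pi/4) = cos(pi/4)
    return (s * rho * math.cos(theta), s * rho * math.sin(theta), s * rho)


def on_cone(p, tol=1e-12):
    """True if p = (x, y, z) satisfies z = sqrt(x^2 + y^2) and 0 <= z <= 2."""
    x, y, z = p
    return abs(z - math.sqrt(x * x + y * y)) <= tol and -tol <= z <= 2.0 + tol


# The same point of the cone, reached through each parametrization.
r, theta = 1.5, 2.0
p_polar = sigma_polar(r, theta)
p_rect = sigma_rect(r * math.cos(theta), r * math.sin(theta))
p_spherical = sigma_spherical(math.sqrt(2.0) * r, theta)   # rho = sqrt(2) * r when phi = pi/4
print(p_polar, p_rect, p_spherical)
```

The relation ρ = √2·r used in the last line comes from r = ρ sin φ with φ = π/4.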
two-dimensional partition of small pieces of surface area $\Delta S_{ij}$, choose a sample point $p_{ij}$ in each
piece, form a sum $\sum_{i,j} f(p_{ij})\,\Delta S_{ij}$, and take the limit as the size of the pieces goes to zero.
We use a parametrization of S to convert the calculation into a double integral over a region in
the plane. The approach is similar to what we did when we defined integrals over curves with respect
to arclength, though that was a long time ago (Section 2.3 to be precise). Thus let σ : D → R3 be
a parametrization of S, where D ⊂ R2 . We write σ(s, t) = (x(s, t), y(s, t), z(s, t)). We subdivide
D (or, really, a rectangle that contains D) into small subrectangles of dimensions $\Delta s_i$ by $\Delta t_j$, and
choose a sample point $a_{ij}$ in each subrectangle. Intuitively, the expectation is that σ transforms
each subrectangle into a small “curvy quadrilateral” in S. This is one of the small pieces of S
alluded to earlier. We denote its area by $\Delta S_{ij}$ and take $p_{ij} = \sigma(a_{ij})$ as the corresponding sample
point. This is depicted in Figure 5.31.
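Figure 5.32: A blow-up of the previous figure: σ sends a small $\Delta s_i$ by $\Delta t_j$ subrectangle to a curvy
quadrilateral whose area $\Delta S_{ij}$ is approximately $\left\| \frac{\partial\sigma}{\partial s} \times \frac{\partial\sigma}{\partial t} \right\| \Delta s_i\,\Delta t_j$.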
We should acknowledge a point that was also glossed over in connection with integrals with respect
to arc length. Namely, we motivated the surface integral using Riemann sums $\sum_{i,j} f(p_{ij})\,\Delta S_{ij}$
based on the surface, but the actual definition relies on a parametrization σ. Perhaps a better
notation for the integral as defined would be $\iint_\sigma f\,dS$. The issue is whether the choice of parametrization
affects the value of the integral. Once we have a few more tools available, we shall see that there is
no need for concern: using the definition of the integral with two different parametrizations of the
same surface gives the same value. The effect of parametrizations on surface integrals is discussed
at the end of Chapter 10.
Example 5.18. If f (x, y, z) = 1 for all (x, y, z) in S, then $\sum_{i,j} f(p_{ij})\,\Delta S_{ij} = \sum_{i,j} 1 \cdot \Delta S_{ij}$, the
sum of the small pieces of surface area. This is the total area of S, that is:
$$\text{Surface area of } S = \iint_S 1\,dS. \tag{5.3}$$
We use the parametrization of a sphere derived in Example 5.16 with spherical coordinates
(φ, θ) as parameters and ρ = a fixed:
σ : [0, π] × [0, 2π] → R3 , σ(φ, θ) = (a sin φ cos θ, a sin φ sin θ, a cos φ).
Let D denote the domain of the parametrization, D = [0, π] × [0, 2π].
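(a) Applying the previous example, we integrate the function f = 1. Then, using the definition of
the integral:
$$\text{Area}(S) = \iint_S 1\,dS = \iint_D 1 \cdot \left\| \frac{\partial\sigma}{\partial\phi} \times \frac{\partial\sigma}{\partial\theta} \right\| d\phi\,d\theta.$$
We calculate $\frac{\partial\sigma}{\partial\phi}$ and $\frac{\partial\sigma}{\partial\theta}$ and place them in the rows of the determinant for a cross product:
$$\begin{aligned}
\frac{\partial\sigma}{\partial\phi} \times \frac{\partial\sigma}{\partial\theta} &= \det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a\cos\phi\cos\theta & a\cos\phi\sin\theta & -a\sin\phi \\ -a\sin\phi\sin\theta & a\sin\phi\cos\theta & 0 \end{bmatrix} \\
&= \left( a^2\sin^2\phi\cos\theta,\ a^2\sin^2\phi\sin\theta,\ a^2\cos\phi\sin\phi\,(\cos^2\theta + \sin^2\theta) \right),
\end{aligned}$$
so:
$$\begin{aligned}
\left\| \frac{\partial\sigma}{\partial\phi} \times \frac{\partial\sigma}{\partial\theta} \right\| &= a^2\sin\phi\,\sqrt{\sin^2\phi\cos^2\theta + \sin^2\phi\sin^2\theta + \cos^2\phi} \\
&= a^2\sin\phi\,\sqrt{\sin^2\phi + \cos^2\phi} \\
&= a^2\sin\phi.
\end{aligned} \tag{5.4}$$
Hence:
$$\begin{aligned}
\text{Area}(S) &= \iint_D a^2\sin\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi} \left( \int_0^{\pi} a^2\sin\phi\,d\phi \right) d\theta \\
&= \int_0^{2\pi} \left( -a^2\cos\phi \right) \Big|_{\phi=0}^{\phi=\pi}\,d\theta \\
&= \int_0^{2\pi} \left( a^2 - (-a^2) \right) d\theta \\
&= \int_0^{2\pi} 2a^2\,d\theta \\
&= 2a^2\theta \Big|_0^{2\pi} \\
&= 4\pi a^2.
\end{aligned}$$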
This formula gets used enough that we put it in a box:
The surface area of a sphere of radius a is $4\pi a^2$.
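(b) Here, $f(x, y, z) = z^2$, and substituting for z in terms of the parameters gives $f(\sigma(\phi, \theta)) =
a^2\cos^2\phi$. Using the result of (5.4) for $\left\| \frac{\partial\sigma}{\partial\phi} \times \frac{\partial\sigma}{\partial\theta} \right\|$, we obtain:
$$\begin{aligned}
\iint_S z^2\,dS &= \iint_D f(\sigma(\phi, \theta)) \left\| \frac{\partial\sigma}{\partial\phi} \times \frac{\partial\sigma}{\partial\theta} \right\| d\phi\,d\theta \\
&= \iint_D (a^2\cos^2\phi) \cdot (a^2\sin\phi)\,d\phi\,d\theta \\
&= a^4 \iint_D \cos^2\phi\,\sin\phi\,d\phi\,d\theta \\
&= a^4 \int_0^{2\pi} \left( \int_0^{\pi} \cos^2\phi\,\sin\phi\,d\phi \right) d\theta \qquad (\text{let } u = \cos\phi,\ du = -\sin\phi\,d\phi) \\
&= a^4 \int_0^{2\pi} \left( -\tfrac{1}{3}\cos^3\phi \right) \Big|_{\phi=0}^{\phi=\pi}\,d\theta \\
&= a^4 \int_0^{2\pi} \left( \tfrac{1}{3} - \left( -\tfrac{1}{3} \right) \right) d\theta \\
&= a^4 \int_0^{2\pi} \tfrac{2}{3}\,d\theta \\
&= \tfrac{2}{3} a^4\,\theta \Big|_0^{2\pi} \\
&= \tfrac{4}{3}\pi a^4.
\end{aligned}$$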
(c) This time $f(x, y, z) = x^2 + y^2 + z^2$. We use a trick! On the sphere S, $x^2 + y^2 + z^2 = a^2$, so
$\iint_S (x^2 + y^2 + z^2)\,dS = \iint_S a^2\,dS = a^2 \iint_S 1\,dS = a^2\,\text{Area}(S)$. Using the formula for the area from
part (a) then gives:
$$\iint_S (x^2 + y^2 + z^2)\,dS = a^2 \cdot 4\pi a^2 = 4\pi a^4.$$
The trick enabled us to evaluate the integral with almost no calculation. This is a highly
desirable situation. In retrospect, we can get even more mileage out of it by going back and using
the result of part (c) to redo $\iint_S z^2\,dS$ in part (b). For the sphere S is symmetric in x, y, and z,
whence $\iint_S x^2\,dS = \iint_S y^2\,dS = \iint_S z^2\,dS$. Therefore $\iint_S (x^2 + y^2 + z^2)\,dS = 3 \iint_S z^2\,dS$, or:
$$\iint_S z^2\,dS = \frac{1}{3} \iint_S (x^2 + y^2 + z^2)\,dS = \frac{1}{3} \cdot 4\pi a^4 = \frac{4}{3}\pi a^4,$$
where the next-to-last step used part (c). Happily, this is the same as the answer we obtained in
part (b) originally by grinding it out with a parametrization.
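The three sphere integrals can also be checked numerically. The sketch below is a rough check, not part of the original development: it approximates the integral of f over the sphere of radius a by a midpoint sum over the parameter rectangle [0, π] × [0, 2π], using the factor a² sin φ from (5.4). The function name `sphere_surface_integral`, the radius a = 2, and the grid size are assumptions chosen for illustration.

```python
import math


def sphere_surface_integral(f, a, n=200):
    """Midpoint approximation of the surface integral of f over the sphere
    of radius a, using sigma(phi, theta) and the factor a^2 sin(phi)."""
    dphi = math.pi / n
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        stretch = a * a * math.sin(phi)          # norm of the cross product, as in (5.4)
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x = a * math.sin(phi) * math.cos(theta)
            y = a * math.sin(phi) * math.sin(theta)
            z = a * math.cos(phi)
            total += f(x, y, z) * stretch * dphi * dtheta
    return total


a = 2.0
area = sphere_surface_integral(lambda x, y, z: 1.0, a)
z_squared = sphere_surface_integral(lambda x, y, z: z * z, a)
r_squared = sphere_surface_integral(lambda x, y, z: x * x + y * y + z * z, a)
print(area, z_squared, r_squared)
```

For a = 2 the three values should be close to 4πa² = 16π, (4/3)πa⁴ = (64/3)π, and 4πa⁴ = 64π, and the second should be about one third of the third, as the symmetry argument predicts.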
one subdivides W into small n-dimensional subboxes of dimensions $\Delta x_1$ by $\Delta x_2$ by · · · by $\Delta x_n$,
chooses a sample point p in each subbox, and considers sums of the form $\sum f(p)\,\Delta x_1\,\Delta x_2 \cdots \Delta x_n$,
where the summation is over all the subboxes. The integral is the limit of these sums as the
dimensions of the subboxes go to zero. It is denoted:
$$\iint \cdots \int_W f\,dV \quad \text{or} \quad \iint \cdots \int_W f\,dx_1\,dx_2 \cdots dx_n.$$
Figure 5.33: Three of the planes that bound W : z = y (left), y = x (middle), and x = 1 (right).
The fourth is z = 0, the xy-plane.
Example 5.20. Find $\iiint_W (y + z)\,dV$ if W is the solid region in $\mathbb{R}^3$ bounded by the planes z = y,
y = x, x = 1, and z = 0. See Figure 5.33.
The planes in question slice through one another in such a way that the solid W that they
bound is a tetrahedron. Each of its four triangular faces is contained in one of the planes. Looking
at W towards the origin from the first octant, the z = y face is the top, z = 0 is the bottom, x = 1
is the front, and y = x is the back. This is shown on the left of Figure 5.34.
Figure 5.34: The region of integration W (left) and its projection on the xy-plane (right)
To find the endpoints that describe W in an iterated integral, our approach is to focus on
choosing which variable to integrate with respect to first. Once that variable is integrated out of
the picture, what remains is a double integral, which by now is a much more familiar situation.
For instance, say we decide to integrate first with respect to z, treating x and y as constant.
We need to know the values of x and y for which this process makes sense, that is, the pairs (x, y)
for which there is at least one z such that (x, y, z) ∈ W . These pairs are precisely the projection of
W on the xy-plane. In this example, this is the triangle D at the base of the tetrahedron, which is
shown on the right of Figure 5.34. In the xy-plane, it is bounded by the x-axis and the lines x = 1
and y = x.
For each (x, y) in D, the values of z for which (x, y, z) ∈ W go from z = 0 to z = y. Thus:
$$\iiint_W (y + z)\,dV = \iint_D \left( \int_0^y (y + z)\,dz \right) dx\,dy.$$
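One way to sanity-check a setup like this is to evaluate it numerically. The sketch below assumes the description of W as 0 ≤ x ≤ 1, 0 ≤ y ≤ x, 0 ≤ z ≤ y, which corresponds to the triangle D bounded by the x-axis and the lines x = 1 and y = x, with the order of integration dz dy dx. The function name and grid size are illustrative. A direct hand calculation with these limits gives 1/8 for the integral of y + z, and 1/6 for the volume of W.

```python
def integrate_over_tetrahedron(f, n=60):
    """Midpoint approximation of
    int_{x=0}^{1} int_{y=0}^{x} int_{z=0}^{y} f(x, y, z) dz dy dx."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        dy = x / n                       # y runs from 0 to x
        for j in range(n):
            y = (j + 0.5) * dy
            dz = y / n                   # z runs from 0 to y
            inner = sum(f(x, y, (k + 0.5) * dz) for k in range(n)) * dz
            total += inner * dy * dx
    return total


integral_value = integrate_over_tetrahedron(lambda x, y, z: y + z)
volume = integrate_over_tetrahedron(lambda x, y, z: 1.0)
print(integral_value, volume)
```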
The remaining limits of integration come from describing D as a double integral in the usual way,
for instance, using the order of integration dy dx. After that, the integral can be evaluated one
1.5. Evaluate $\iint_D \frac{x^2}{y}\,dA$ if D is the rectangle described by 1 ≤ x ≤ 3, 2 ≤ y ≤ 4.
1.6. Evaluate $\iint_D (x + y)\,dA$ if D is the quarter-disk $x^2 + y^2 \le 1$, x ≥ 0, y ≥ 0.
1.7. Evaluate $\iint_D e^{x+y}\,dA$ if D is the triangular region with vertices (0, 0), (0, 2), (1, 0).
1.8. Evaluate $\iint_D xy\,dA$ if D is the triangular region with vertices (0, 0), (1, 1), (2, 0).
In Exercises 1.9–1.14, (a) sketch the domain of integration D in the xy-plane and (b) write an
equivalent expression with the order of integration reversed.
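1.9. $\displaystyle\int_1^2 \left( \int_3^4 f(x, y)\,dy \right) dx$
1.10. $\displaystyle\int_0^1 \left( \int_{e^x}^{e} f(x, y)\,dy \right) dx$
1.11. $\displaystyle\int_0^1 \left( \int_{y^2}^{y} f(x, y)\,dx \right) dy$
1.12. $\displaystyle\int_0^{\pi/2} \left( \int_0^{\sin y} f(x, y)\,dx \right) dy$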
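1.13. $\displaystyle\int_{-1}^{1} \left( \int_0^{\sqrt{1 - x^2}} f(x, y)\,dy \right) dx$
1.14. $\displaystyle\int_{-3}^{6} \left( \int_{\frac{1}{3}y}^{-1 + \sqrt{3 + y}} f(x, y)\,dx \right) dy$
1.15. Consider the iterated integral $\displaystyle\int_{\frac{1}{4}}^{1} \left( \int_{\frac{1}{x}}^{4} y e^{xy}\,dy \right) dx$.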
1.18. Find the volume of the wedge-shaped solid that lies above the xy-plane, below the plane
z = y, and inside the cylinder $x^2 + y^2 = 1$.
1.19. Find the volume of the pyramid-shaped solid in the first octant bounded by the three coor-
dinate planes and the planes x + z = 1 and y + 2z = 2.
1.20. (a) If c is a positive constant, find the volume of the tetrahedron in the first octant bounded
by the plane x + y + z = c and the three coordinate planes.
(b) Consider the tetrahedron W bounded by the plane x+y +z = 1 and the three coordinate
planes. Suppose that you want to divide W into three pieces of equal volume by slicing
it with two planes parallel to x + y + z = 1, i.e., with planes of the form x + y + z = c.
How should the slices be made?
1.21. Let D be the unit square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and let f : D → R be the function given by
f (x, y) = min{x, y}. Find $\iint_D f(x, y)\,dx\,dy$.
2.1. Let R = [0, 1] × [0, 1], and let f : R → R be the function given by:
$$f(x, y) = \begin{cases} 1 & \text{if } (x, y) = (\frac{1}{2}, \frac{1}{2}), \\ 0 & \text{otherwise.} \end{cases}$$
(a) Let R be subdivided into a 3 by 3 grid of subrectangles of equal size (Figure 5.36). What
are the possible values of Riemann sums based on this subdivision? (Different choices
of sample points may give different values.)
(b) Repeat for a 4 by 4 grid of subrectangles of equal size.
(c) How about for an n by n grid of subrectangles of equal size?
5.7. EXERCISES FOR CHAPTER 5 149
2.2. Let R = [0, 1] × [0, 1], and let f : R → R be the function given by:
$$f(x, y) = \begin{cases} 1 & \text{if } x = 0 \text{ or } y = 0, \\ 0 & \text{otherwise.} \end{cases}$$
(a) Let R be subdivided into a 3 by 3 grid of subrectangles of equal size (Figure 5.36). What
are the possible values of Riemann sums based on this subdivision?
(b) Repeat for a 4 by 4 grid of subrectangles of equal size.
(c) How about for an n by n grid of subrectangles of equal size?
(d) Is f integrable on R? If so, what is the value of $\iint_R f(x, y)\,dA$?
2.3. Here is a result that will come in handy increasingly as we calculate more integrals. Let
R = [a, b] × [c, d] be a rectangle, and let F : R → R be a real-valued function such that the
variables x and y separate into two continuous factors, that is, F (x, y) = f (x)g(y), where
f : [a, b] → R and g : [c, d] → R are continuous. Show that:
$$\iint_R f(x) g(y)\,dx\,dy = \left( \int_a^b f(x)\,dx \right) \left( \int_c^d g(y)\,dy \right).$$
2.4. Let f : U → R be a continuous function defined on an open set U in $\mathbb{R}^2$, and let a be a point
of U .
(a) If f (a) > 0, prove that there exists a rectangle R containing a such that:
$$\iint_R f\,dA > 0.$$
(b) Let g : U → R be another continuous function on U . If f (a) > g(a), prove that there
exists a rectangle R containing a such that:
$$\iint_R f\,dA > \iint_R g\,dA.$$
2.5. In this exercise, we use the double integral to give another proof of the equality of mixed
partial derivatives. Let f (x, y) be a real-valued function defined on an open set U in R2
whose first and second-order partial derivatives are continuous on U .
$$\iint_R \frac{\partial^2 f}{\partial x\,\partial y}(x, y)\,dA = \iint_R \frac{\partial^2 f}{\partial y\,\partial x}(x, y)\,dA.$$
(Hint: Use Fubini’s theorem, with appropriate orders of integration, and the fundamental
theorem of calculus to show that both sides are equal to f (b, d) − f (b, c) − f (a, d) +
f (a, c).)
(b) Show that $\frac{\partial^2 f}{\partial x\,\partial y}(a) = \frac{\partial^2 f}{\partial y\,\partial x}(a)$ for all a in U . (Hint: Use Exercise 2.4 to show that
something goes wrong if they are not equal.)
3.1. The population density of birds in a wildlife refuge decreases at a uniform rate with the
distance from a river. If the river is modeled as the x-axis in the plane, then the density at
the point (x, y) is given by f (x, y) = 5 − |y| hundred birds per square mile, where x and y are
measured in miles. Find the total number of birds in the rectangle R = [−2, 2] × [0, 1].
3.2. A small cookie has the shape of the region in the first quadrant bounded by the curves $y = x^2$
and $x = y^2$, where x and y are measured in inches. Chocolate is poured unevenly on top
of the cookie in such a way that the density of chocolate at the point (x, y) is given by
f (x, y) = 100(x + y) grams per square inch. Find the total mass of chocolate on the cookie.
3.3. The eye of a tornado is positioned directly over the origin in the plane. Suppose that the
wind speed on the ground at the point (x, y) is given by $v(x, y) = 30(x^2 + y^2)$ miles per hour.
(a) Find the average wind speed on the square R = [0, 2] × [0, 2].
(b) Find all points (x, y) of R at which the wind speed equals the average.
3.5. Find the centroid of the triangular region in R2 with vertices (0, 0), (1, 2), and (1, 3).
3.6. Find the centroid of the triangular region in R2 with vertices (0, 0), (2, 2), and (1, 3).
3.7. Let D be the fudgsicle-shaped region of R2 that consists of the rectangle [−1, 1] × [0, h] of
height h topped by a half-disk of radius 1, as shown in Figure 5.37. Find the value of h such
that the point (0, h) is the centroid of D.
(a) Describe the curve that is traced out by σ if r = 1/2 is held fixed and θ varies, that is,
the curve parametrized by the path $\alpha(\theta) = \sigma(\frac{1}{2}, \theta)$, 0 ≤ θ ≤ 4π.
(b) Similarly, describe the curve that is traced out by σ if θ = π/4 is held fixed and r varies,
that is, the curve parametrized by $\beta(r) = \sigma(r, \frac{\pi}{4})$, 0 ≤ r ≤ 1.
(c) Describe S in a few words, and draw a sketch.
5.1. Let S be the triangular surface in R3 whose vertices are (1, 0, 0), (0, 1, 0), and (0, 0, 1).
(a) Find a parametrization of S using x and y as parameters. What is the domain of your
parametrization? (Hint: The triangle is contained in a plane.)
(b) Use your parametrization and formula (5.3) to find the area of S. Check that your
answer agrees with what you would get using the formula: Area = $\frac{1}{2}$(base)(height).
(c) Find $\iint_S 7\,dS$. (Hint: Use your answer to part (b).)
5.3. Consider the graph z = f (x, y) of a smooth real-valued function f defined on a bounded
subset D of $\mathbb{R}^2$. Show that the surface area of the graph is given by the formula:
$$\text{Surface area} = \iint_D \sqrt{1 + \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}\,dx\,dy.$$
5.4. Let f : [a, b] → R be a smooth real-valued function of one variable, where a ≥ 0, and let S be
the surface of revolution that is swept out when the curve z = f (x) in the xz-plane is rotated
all the way around the z-axis. Show that the surface area of S is given by the formula:
$$\text{Surface area} = 2\pi \int_a^b x \sqrt{1 + f'(x)^2}\,dx.$$
5.5. Let W be the solid bounded on top by the plane z = y + 5, on the sides by the cylinder
$x^2 + y^2 = 4$, and on the bottom by the plane z = 0.
(a) Sketch W .
(b) Let S be the surface that bounds W (top, sides, and bottom). Find the area of S.
(c) Find $\iint_S z\,dS$.
5.7. Let W be the solid region of Example 5.2 that lies inside the cylinders $x^2 + z^2 = 1$ and
$y^2 + z^2 = 1$ and above the xy-plane, and let S be the exposed part of the surface that bounds
W , that is, the cylindrical surfaces but not the base.
6.7. Let a be a positive number, and let W = [0, a] × [0, a] × [0, a] in $\mathbb{R}^3$. Find the average value
of $f(x, y, z) = x^2 + y^2 + z^2$ on W .
For each of the triple integrals in Exercises 6.8–6.11, (a) sketch the domain of integration in
R3 , (b) write an equivalent expression in which the first integration is with respect to x, and (c)
write an equivalent expression in which the first integration is with respect to y.
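6.8. $\displaystyle\int_{-2}^{2} \int_0^{\sqrt{2 - \frac{x^2}{2}}} \int_0^{\sqrt{1 - \frac{x^2}{4} - \frac{y^2}{2}}} f(x, y, z)\,dz\,dy\,dx$
6.9. $\displaystyle\int_{-1}^{1} \int_0^{\sqrt{1 - x^2}} \int_0^1 f(x, y, z)\,dz\,dy\,dx$
6.10. $\displaystyle\int_0^1 \int_x^1 \int_0^x f(x, y, z)\,dz\,dy\,dx$
6.11. $\displaystyle\int_0^1 \int_x^1 \int_0^y f(x, y, z)\,dz\,dy\,dx$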
Part IV
Vector-valued functions
Chapter 6
So far, we have studied functions of which at least one of the domain or codomain is a subset of $\mathbb{R}$,
in other words:
• vector-valued functions of one variable, also known as paths, $\alpha : I \to \mathbb{R}^n$, where $I \subset \mathbb{R}$, or
• real-valued functions of n variables $f : U \to \mathbb{R}$, where $U \subset \mathbb{R}^n$.
Now, we consider the general case of vector-valued functions of n variables, that is, functions of the
form $f : U \to \mathbb{R}^m$, where $U \subset \mathbb{R}^n$ and both n and m are allowed to be greater than 1.
We can leverage what we know about real-valued functions because, if $x \in U$, then, as an
element of $\mathbb{R}^m$, $f(x)$ has m coordinates:
\[ f(x) = (f_1(x), f_2(x), \ldots, f_m(x)), \]
where each $f_i(x)$ is a real number. In other words, each component is a real-valued function
$f_i : U \to \mathbb{R}$. Thus a vector-valued function is also a sequence of real-valued functions. We use
the notation above for the components consistently from now on. For example, if $f : \mathbb{R}^2 \to \mathbb{R}^2$ is
the polar coordinate transformation $f(r, \theta) = (r \cos\theta, r \sin\theta)$, then $f_1(r, \theta) = r \cos\theta$ and $f_2(r, \theta) = r \sin\theta$. Also, as with paths, we denote f in plainface type, reserving boldface for a special type of
vector-valued function that we study beginning in Chapter 8.
We treated the case of real-valued functions fairly rigorously, and we shall see that much of the
theory carries over to vector-valued functions without incident. On the other hand, we discussed
paths back in Chapter 2 more informally, so what we are about to do here in the vector-valued case
can be taken as establishing the theory behind what we did then.
158 CHAPTER 6. DIFFERENTIABILITY AND THE CHAIN RULE
Figure 6.1: Continuity of a vector-valued function: given any ball $B(f(a), \epsilon)$ about $f(a)$, there is a
ball $B(a, \delta)$ about a such that $f(B(a, \delta)) \subset B(f(a), \epsilon)$.
To repeat a point made in the real-valued case, the derivative is not a single number, but rather
a matrix or, even better, the linear part of a good affine approximation of f near a.
We begin by calculating a few simple examples of Df(a). In general, it is an m by n matrix
whose entries are various partial derivatives. The ith row is the gradient $\nabla f_i$, and the jth column
is the “velocity” $\frac{\partial f}{\partial x_j}$ of f with respect to $x_j$.
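Example 6.3. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be $f(r, \theta) = (r \cos\theta, r \sin\theta)$. Then:
\[ Df(r, \theta) = \begin{bmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{bmatrix}. \]
The formula above gives what the matrix looks like at a general point $a = (r, \theta)$. At a specific
point, such as $a = (2, \frac{\pi}{4})$, we have:
\[ Df(a) = Df(2, \tfrac{\pi}{4}) = \begin{bmatrix} \frac{\sqrt{2}}{2} & -2 \cdot \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & 2 \cdot \frac{\sqrt{2}}{2} \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\sqrt{2} \\ \frac{\sqrt{2}}{2} & \sqrt{2} \end{bmatrix}. \]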
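Example 6.4. Let $g : \mathbb{R}^3 \to \mathbb{R}^2$ be the projection of xyz-space onto the xy-plane, $g(x, y, z) = (x, y)$.
Then:
\[ Dg(x, y, z) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}. \]
In other words, $Dg(a) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$ for all a in $\mathbb{R}^3$.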
Example 6.5. Let $h : \mathbb{R}^2 \to \mathbb{R}^3$ be $h(x, y) = (x^2 + y^3, \; x^4 y^5, \; e^{6x+7y})$. Then, if $a = (x, y)$:
\[ Dh(a) = Dh(x, y) = \begin{bmatrix} 2x & 3y^2 \\ 4x^3 y^5 & 5x^4 y^4 \\ 6e^{6x+7y} & 7e^{6x+7y} \end{bmatrix}. \]
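The derivative matrices in Examples 6.3 and 6.5 can be sanity-checked numerically. The following is a minimal sketch, not part of the text's development: it approximates each entry of Df(a) by a central difference, with a step size `h` and a helper name `jacobian` chosen here purely for illustration.

```python
import math

def jacobian(f, a, h=1e-6):
    """Approximate the m-by-n matrix Df(a) by central differences (illustrative helper)."""
    m, n = len(f(*a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus, minus = list(a), list(a)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = f(*plus), f(*minus)
        for i in range(m):
            # Entry (i, j) approximates the partial derivative of f_i with respect to x_j.
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J

def polar(r, theta):
    # Example 6.3: the polar coordinate transformation.
    return (r * math.cos(theta), r * math.sin(theta))

def h_map(x, y):
    # Example 6.5.
    return (x**2 + y**3, x**4 * y**5, math.exp(6 * x + 7 * y))

J_polar = jacobian(polar, (2.0, math.pi / 4))  # compare with Df(2, pi/4)
J_h = jacobian(h_map, (0.5, 0.25))             # compare with Dh(0.5, 0.25)

for row in J_polar:
    print(row)
```

The sample point (0.5, 0.25) for h is arbitrary; any point gives the same kind of comparison with the formula for Dh(x, y).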
To determine whether a function is differentiable without going through the definition, we may
use criteria similar to those in the real-valued case. For instance, the same reasoning as before
shows that a differentiable function must be continuous and that all the partial derivatives $\frac{\partial f_i}{\partial x_j}$
must exist in order for f to be differentiable.
In addition, the condition for differentiability in the definition (6.1) requires that a certain limit
be the zero vector. This can happen if and only if the limit of each of the components is zero, too.
This follows from Proposition 6.2. In other words, we have the following componentwise criterion
for differentiability.
Proposition 6.6. If f (x) = (f1 (x), f2 (x), . . . , fm (x)), then f is differentiable at a if and only if
f1 , f2 , . . . , fm are all differentiable at a.
Example 6.7. Let I be an open interval in $\mathbb{R}$, and let $\alpha : I \to \mathbb{R}^n$ be a path in $\mathbb{R}^n$, where
$\alpha(t) = (x_1(t), x_2(t), \ldots, x_n(t))$. By the last proposition, α is differentiable if and only if each of
its components $x_1, x_2, \ldots, x_n$ is differentiable. The latter condition is essentially how we defined
differentiability of paths in Chapter 2, so what we did there is supported by the theory we now
have in place. Moreover, by definition of the derivative matrix:
\[ D\alpha(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix}. \]
This agrees with $\alpha'(t)$, the velocity, under the usual identification of a vector in $\mathbb{R}^n$ with a column
matrix.
6.3. THE CHAIN RULE: A CONCEPTUAL APPROACH 161
According to Proposition 6.6, we may apply the $C^1$ test for real-valued functions to each of the
components of a vector-valued function to obtain the following criterion.
Theorem 6.8 (The $C^1$ test). If all the partial derivatives $\frac{\partial f_i}{\partial x_j}$ (i.e., the entries of Df(x)) exist
and are continuous on U, then f is differentiable at every point of U.
To illustrate this, in each of Examples 6.3–6.5 above, all entries of the D matrices are continuous,
so those three functions f, g, h are differentiable at every point of their domains.
A function f is called smooth if each of its component functions f1 , f2 , . . . , fm is smooth in the
sense we have discussed previously (see Section 4.9), that is, they have continuous partial derivatives
of all orders. As before, we generally state our results about differentiability for smooth functions
without worrying about whether this is a stronger assumption than necessary.
when x is near a. The challenge is to figure out the expression that goes into the box for D(g ◦f )(a).
The idea is that the first-order approximation of the composition g ◦ f should be the composition
of the first-order approximations of g and f individually. Let b = f(a), and assume that f is
differentiable at a and g is differentiable at b. Then there are good first-order approximations:
• for f: $f(x) \approx \ell_f(x) \approx f(a) + Df(a) \cdot (x - a)$ when x is near a, (6.3)
• for g: $g(y) \approx \ell_g(y) \approx g(b) + Dg(b) \cdot (y - b)$ when y is near b
$\approx g(f(a)) + Dg(f(a)) \cdot (y - f(a))$. (6.4)
Note that, by continuity, when x is near a, f (x) is near f (a) = b, so substituting y = f (x) in
equation (6.4) gives:
g(f (x)) ≈ g(f (a)) + Dg(f (a)) · (f (x) − f (a)).
Then, using approximation (6.3) to substitute f (x) − f (a) ≈ Df (a) · (x − a) yields:
In terms of the sizes of the various parties involved, the left side of the chain rule is a p by n
matrix, while the right side is a product (p by m) × (m by n). This need not be memorized. In
practice, it usually takes care of itself.
The chain rule says that the derivative of a composition is the product of the derivatives, or,
to be even more sophisticated, the composition of the linear parts of the associated first-order
approximations. Though the objects involved are more complicated, this is actually the same as
what happens back in first-year calculus. There, if u is a function of x, say u = f (x), and y is a
function of u, say y = g(u), then y becomes a function of x via composition: y = g(f (x)). The
one-variable chain rule says:
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}. \]
Again, the derivative of a composition is the product of the derivatives of the composed steps.
The line of reasoning we used to obtain the chain rule is natural and reasonably straightforward,
but a proper proof requires greater attention to the various approximations involved. The letter ε
is likely to appear. The arguments are a little like those needed for a correct proof of the Little
Chain Rule back in Exercise 5.6 of Chapter 4, though the details here are more complicated. We
shall say more about the proof of the chain rule in the next section.
The following example is meant to illustrate how the pieces of the chain rule fit together.
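since $f(2, \frac{\pi}{2}) = (2 \cos\frac{\pi}{2}, 2 \sin\frac{\pi}{2}) = (0, 2)$. Calculating the various derivatives involved is straightforward:
\[ Dg(x, y) = \begin{bmatrix} 2x & 2y \\ 2 & 3 \\ y^2 & 2xy \end{bmatrix} \;\Rightarrow\; Dg(0, 2) = \begin{bmatrix} 0 & 4 \\ 2 & 3 \\ 4 & 0 \end{bmatrix}, \]
\[ Df(r, \theta) = \begin{bmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{bmatrix} \;\Rightarrow\; Df(2, \tfrac{\pi}{2}) = \begin{bmatrix} 0 & -2 \\ 1 & 0 \end{bmatrix}. \]
Therefore:
\[ D(g \circ f)(2, \tfrac{\pi}{2}) = \begin{bmatrix} 0 & 4 \\ 2 & 3 \\ 4 & 0 \end{bmatrix} \begin{bmatrix} 0 & -2 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 4 & 0 \\ 3 & -4 \\ 0 & -8 \end{bmatrix}. \]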
This is remarkably concise and elegant. Nevertheless, under the hood, there is a lot going on. Each
entry of a matrix product involves considerable information: the (i, j)th entry of the product is the
6.4. THE CHAIN RULE: A COMPUTATIONAL APPROACH 163
There is one such dot product for each entry of D(g ◦ f )(a). We shall write out these entries in
a few cases in a moment and discover that we have seen this type of product before. You can
think of the discovery either as an alternative way to derive the chain rule or as confirmation of
the consistency of the theory.
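As a numerical cross-check of the matrix product computed at (2, π/2), the sketch below differentiates the composition directly and compares it with Dg(f(a)) · Df(a). It rests on an assumption: the formula g(x, y) = (x² + y², 2x + 3y, xy²) is inferred here from the entries of Dg(x, y) displayed above, and the helper names `jacobian` and `matmul` are illustrative only.

```python
import math

def jacobian(F, a, h=1e-6):
    """Central-difference approximation of DF(a) (illustrative helper)."""
    m, n = len(F(*a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus, minus = list(a), list(a)
        plus[j] += h
        minus[j] -= h
        Fp, Fm = F(*plus), F(*minus)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * h)
    return J

def matmul(A, B):
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def f(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def g(x, y):
    # Assumed formula, chosen so that Dg matches the matrix shown in the example.
    return (x**2 + y**2, 2 * x + 3 * y, x * y**2)

def g_of_f(r, theta):
    return g(*f(r, theta))

a = (2.0, math.pi / 2)
lhs = jacobian(g_of_f, a)                          # D(g o f)(a), computed directly
rhs = matmul(jacobian(g, f(*a)), jacobian(f, a))   # Dg(f(a)) . Df(a)

for row_l, row_r in zip(lhs, rhs):
    print([round(v, 4) for v in row_l], [round(v, 4) for v in row_r])
```

Both matrices should agree, up to rounding, with the 3 by 2 matrix found in the example.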
We take a simple situation that we have studied previously and then tweak it a couple of times.
Example 6.11. Let $\alpha : \mathbb{R} \to \mathbb{R}^3$ be a smooth path in $\mathbb{R}^3$, $\alpha(t) = (x(t), y(t), z(t))$, and let $g : \mathbb{R}^3 \to \mathbb{R}$
be a smooth real-valued function of three variables, $w = g(x, y, z)$. Then w becomes a function of
t by composition: $w = g(\alpha(t)) = g(x(t), y(t), z(t))$. As a matter of notation, from here on, we use
w to denote the value of the composition, reserving g to stand for the original function of x, y, z.
The composition is a real-valued function of one variable, so the Little Chain Rule applies. It
says that the derivative of the composition is “gradient dot velocity”:
\[ \frac{dw}{dt} = \nabla g(\alpha(t)) \cdot \alpha'(t) = \left( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}, \frac{\partial g}{\partial z} \right) \cdot \left( \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right) = \frac{\partial g}{\partial x} \frac{dx}{dt} + \frac{\partial g}{\partial y} \frac{dy}{dt} + \frac{\partial g}{\partial z} \frac{dz}{dt}, \tag{6.5} \]
where the partial derivatives $\frac{\partial g}{\partial x}$, $\frac{\partial g}{\partial y}$, $\frac{\partial g}{\partial z}$ are evaluated at the point $\alpha(t) = (x(t), y(t), z(t))$. This
calculation can be visualized using a “dependence diagram,” as shown in Figure 6.2. It illustrates
the dependence of the various variables involved: g is a function of x, y, z, and in turn x, y, z are
functions of t. The Little Chain Rule (6.5) says that, to find the derivative of the top with respect
to the bottom, multiply the derivatives along each possible path from top to bottom and add these
products.
[Figure 6.2: dependence diagram with w = g at the top and x, y, z beneath it]
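To see "gradient dot velocity" at work on concrete numbers, here is a small sketch. The function g(x, y, z) = xy + z² and the path α(t) = (cos t, sin t, t) are hypothetical choices made only for this illustration; they do not come from the text.

```python
import math

def g(x, y, z):
    # Hypothetical real-valued function of three variables.
    return x * y + z**2

def grad_g(x, y, z):
    # Partial derivatives of g, computed by hand: (y, x, 2z).
    return (y, x, 2 * z)

def alpha(t):
    # Hypothetical smooth path in R^3.
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    # alpha'(t), computed by hand.
    return (-math.sin(t), math.cos(t), 1.0)

def dw_dt_chain_rule(t):
    # Little Chain Rule: one product per path x, y, z in the dependence diagram, then add.
    return sum(p * v for p, v in zip(grad_g(*alpha(t)), velocity(t)))

def dw_dt_numeric(t, h=1e-6):
    # Differentiate w(t) = g(alpha(t)) directly by a central difference.
    return (g(*alpha(t + h)) - g(*alpha(t - h))) / (2 * h)

t0 = 0.7
print(dw_dt_chain_rule(t0), dw_dt_numeric(t0))
```

For this particular choice, w = cos t sin t + t², so both printed values should be close to cos 2t + 2t evaluated at t = 0.7.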
Tweak 1. Suppose that x, y, z are functions of two variables s and t. In other words, replace
$\alpha : \mathbb{R} \to \mathbb{R}^3$ with a smooth function $f : \mathbb{R}^2 \to \mathbb{R}^3$, where $f(s, t) = (x(s, t), y(s, t), z(s, t))$. Then the
composition $g \circ f : \mathbb{R}^2 \to \mathbb{R}^3 \to \mathbb{R}$ is a function of s and t: $w = (g \circ f)(s, t) = g(x(s, t), y(s, t), z(s, t))$.
The corresponding dependence diagram is shown in Figure 6.3.
[Figure 6.3: dependence diagram with w = g at the top, x, y, z in the middle, and s, t at the bottom]
To compute $\frac{\partial w}{\partial s}$ (or $\frac{\partial w}{\partial t}$), only one variable actually varies. Because the composition is again
real-valued, the Little Chain Rule applies with respect to that variable, just as in the previous case.
Notationally, because w, x, y, and z are no longer functions of just one variable, we need to replace
the ordinary derivative terms $\frac{d}{ds}$ (or $\frac{d}{dt}$) in equation (6.5) with the partial derivatives $\frac{\partial}{\partial s}$ (or $\frac{\partial}{\partial t}$).
For instance, for $\frac{\partial w}{\partial s}$, the Little Chain Rule says:
\[ \frac{\partial w}{\partial s} = \frac{\partial g}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial g}{\partial y} \frac{\partial y}{\partial s} + \frac{\partial g}{\partial z} \frac{\partial z}{\partial s}. \tag{6.6} \]
The relevant paths in the dependence diagram are shown in bold in the figure. Similarly:
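\[ \frac{\partial w}{\partial t} = \frac{\partial g}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial g}{\partial y} \frac{\partial y}{\partial t} + \frac{\partial g}{\partial z} \frac{\partial z}{\partial t}. \tag{6.7} \]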
In fact, equations (6.6) and (6.7) can be rewritten as a single matrix equation:
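\[ \begin{bmatrix} \frac{\partial w}{\partial s} & \frac{\partial w}{\partial t} \end{bmatrix} = \begin{bmatrix} \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & \frac{\partial g}{\partial z} \end{bmatrix} \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix}. \tag{6.8} \]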
The entries corresponding to equation (6.6) are shown in red. The point is that the matrix on the
left of equation (6.8) is the derivative D(g ◦f ), while those on the right are Dg and Df , respectively.
Tweak 2. We alter g so that it is vector-valued, say $g : \mathbb{R}^3 \to \mathbb{R}^2$, where $g(x, y, z) = (g_1(x, y, z), g_2(x, y, z))$, and
leave $f : \mathbb{R}^2 \to \mathbb{R}^3$ as in tweak 1. The composition $g \circ f : \mathbb{R}^2 \to \mathbb{R}^2$ is vector-valued,
$w = g(f(s, t)) = \big( g_1(f(s, t)), g_2(f(s, t)) \big)$. Each of the components is a real-valued function of s
and t: $w_1 = g_1(f(s, t)) = g_1(x(s, t), y(s, t), z(s, t))$ and $w_2 = g_2(f(s, t)) = g_2(x(s, t), y(s, t), z(s, t))$.
The dependence diagram is shown in Figure 6.4.
[Figure 6.4: dependence diagram with $w_1 = g_1$ and $w_2 = g_2$ at the top, x, y, z in the middle, and s, t at the bottom]
Both w1 and w2 have partial derivatives with respect to s and t, and each of them fits exactly
into the context of tweak 1. For example:
as highlighted in bold in the figure. Again, the collection of the four individual partial derivatives
of the composition computed in this way can be organized into a single matrix equation:
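\[ \begin{bmatrix} \frac{\partial w_1}{\partial s} & \frac{\partial w_1}{\partial t} \\ \frac{\partial w_2}{\partial s} & \frac{\partial w_2}{\partial t} \end{bmatrix} = \begin{bmatrix} \frac{\partial g_1}{\partial x} & \frac{\partial g_1}{\partial y} & \frac{\partial g_1}{\partial z} \\ \frac{\partial g_2}{\partial x} & \frac{\partial g_2}{\partial y} & \frac{\partial g_2}{\partial z} \end{bmatrix} \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix}. \]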
Once again, D(g ◦ f ) appears on the left, while the product of Dg and Df is on the right.
The same pattern persists in general for any composition. The individual partial derivatives
of a composition can be calculated using the Little Chain Rule and then combined as the matrix
equation D(g ◦ f )(a) = Dg(f (a)) · Df (a). In a way, the matrix version of the chain rule is a
bookkeeping device that keeps track of many, many Little Chain Rule calculations!
Indeed, since we presented a rigorous proof of the Little Chain Rule in Exercise 5.6 of Chapter
4, we can think of this approach as supplying a proof of the general chain rule as well.
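Example 6.12. The substitutions $x = r \cos\theta$, $y = r \sin\theta$ convert a smooth real-valued function
$f(x, y)$ of x and y into a function of r and θ: $w = f(r \cos\theta, r \sin\theta)$. For instance, if $f(x, y) = x^2 y^3$,
then $w = (r \cos\theta)^2 (r \sin\theta)^3 = r^5 \cos^2\theta \sin^3\theta$.
We find general formulas for $\frac{\partial w}{\partial r}$ and $\frac{\partial w}{\partial \theta}$ in terms of $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$. As just discussed, we can
compute these individual partials using the Little Chain Rule and a dependence diagram (Figure 6.5).
[Figure 6.5: dependence diagram with w = f at the top, x, y in the middle, and r, θ at the bottom]
Hence:
\[ \frac{\partial w}{\partial r} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial r} = \frac{\partial f}{\partial x} \cos\theta + \frac{\partial f}{\partial y} \sin\theta, \]
\[ \frac{\partial w}{\partial \theta} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial \theta} = -\frac{\partial f}{\partial x} \, r \sin\theta + \frac{\partial f}{\partial y} \, r \cos\theta. \]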
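The two polar formulas can be tested on the sample function f(x, y) = x²y³ mentioned at the start of the example. The sketch below is only a numerical check under illustrative assumptions: the evaluation point (r, θ) = (1.5, 0.8) and the step size are arbitrary.

```python
import math

def f(x, y):
    return x**2 * y**3

def fx(x, y):
    return 2 * x * y**3        # partial derivative of f with respect to x

def fy(x, y):
    return 3 * x**2 * y**2     # partial derivative of f with respect to y

def w(r, theta):
    # f after the substitutions x = r cos(theta), y = r sin(theta).
    return f(r * math.cos(theta), r * math.sin(theta))

def chain_rule_partials(r, theta):
    x, y = r * math.cos(theta), r * math.sin(theta)
    dw_dr = fx(x, y) * math.cos(theta) + fy(x, y) * math.sin(theta)
    dw_dtheta = -fx(x, y) * r * math.sin(theta) + fy(x, y) * r * math.cos(theta)
    return dw_dr, dw_dtheta

def numeric_partials(r, theta, h=1e-6):
    # Central differences applied directly to w(r, theta).
    dw_dr = (w(r + h, theta) - w(r - h, theta)) / (2 * h)
    dw_dtheta = (w(r, theta + h) - w(r, theta - h)) / (2 * h)
    return dw_dr, dw_dtheta

r0, theta0 = 1.5, 0.8
print(chain_rule_partials(r0, theta0))
print(numeric_partials(r0, theta0))
```

The two printed pairs should agree to several decimal places.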
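\[ \frac{dw}{dx} = \frac{\partial f}{\partial x} \frac{dx}{dx} + \frac{\partial f}{\partial y} \frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \frac{dy}{dx}. \]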
Figure 6.6: The level set f (x, y) = c defines y implicitly as a function of x on an interval of x values.
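[Figure: dependence diagram with w = f at the top and x, y beneath it]
On the other hand, w has a constant value of c, so $\frac{dw}{dx} = 0$. Hence:
\[ \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \frac{dy}{dx} = 0. \tag{6.9} \]
Solving for $\frac{dy}{dx}$ gives $\frac{dy}{dx} = -\frac{\partial f / \partial x}{\partial f / \partial y}$.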
For example, consider the ellipse $\frac{x^2}{9} + \frac{y^2}{4} = 1$. It’s the level set of $f(x, y) = \frac{x^2}{9} + \frac{y^2}{4}$ corresponding
to c = 1. Hence by the previous example, along the ellipse, we have:
\[ \frac{dy}{dx} = -\frac{\partial f / \partial x}{\partial f / \partial y} = -\frac{2x/9}{2y/4} = -\frac{4x}{9y}. \]
Of course, implicit differentiation problems of this type appear in first-year calculus, and the
multivariable chain rule is never mentioned. There, to find the derivative along the ellipse, one
simply differentiates both sides of the equation of the ellipse and uses the one-variable chain rule.
In other words, one computes $\frac{d}{dx}\left( \frac{x^2}{9} + \frac{y^2}{4} \right) = \frac{d}{dx}(1)$, or:
\[ \frac{2}{9} x + \frac{1}{2} y \frac{dy}{dx} = 0. \]
This is exactly the same as the result given by equation (6.9), from which the general formula
for $\frac{dy}{dx}$ followed immediately. The multivariable chain rule gives us a way of articulating what the
one-variable calculation is doing.
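For the ellipse, the formula dy/dx = −4x/(9y) can be compared with a direct numerical slope. This sketch assumes we work on the upper half of the ellipse, where y can be solved for explicitly as y = 2√(1 − x²/9); the sample point x = 1.2 is arbitrary.

```python
import math

def y_upper(x):
    # Upper half of the ellipse x^2/9 + y^2/4 = 1, solved for y.
    return 2 * math.sqrt(1 - x * x / 9)

def slope_formula(x, y):
    # dy/dx = -(df/dx)/(df/dy) = -4x/(9y), from equation (6.9).
    return -4 * x / (9 * y)

def slope_numeric(x, h=1e-6):
    # Central difference applied to the explicit function y_upper.
    return (y_upper(x + h) - y_upper(x - h)) / (2 * h)

x0 = 1.2
print(slope_formula(x0, y_upper(x0)), slope_numeric(x0))
```

The implicit formula needs only the point (x, y) on the curve, whereas the numerical slope needs the explicit solution for y; that is the practical advantage of equation (6.9).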
1.1. (a) If $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ are elements of $\mathbb{R}^n$, show that $|x_i - y_i| \le \|x - y\|$ for $i = 1, 2, \ldots, n$.
6.5. EXERCISES FOR CHAPTER 6 167
3.3. Let U and V be open sets in R2 , and let f : U → V and g : V → U be inverse functions. That
is:
3.4. Exercise 5.5 of Chapter 4 described a generalization of the mean value theorem for real-valued
functions of n variables. Does the same generalization apply to vector-valued functions? More
precisely, let a be a point of Rn , and let B = B(a, r) be an open ball centered at a. Is it
necessarily true that, if f : B → Rm is a differentiable function on B and b is any point of B,
then there exists a point c on the line segment connecting a and b such that:
4.1. Let f (x, y, z) be a smooth real-valued function of x, y, and z. The substitutions x = s + 2t,
y = 3s+4t, and z = 5s+6t convert f into a function of s and t: w = f (s+2t, 3s+4t, 5s+6t).
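Find expressions for $\frac{\partial w}{\partial s}$ and $\frac{\partial w}{\partial t}$ in terms of $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$.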
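$x = s^2 - t^2 + u^2 + 2s - 2t$ and $y = stu$.
Find expressions for $\frac{\partial w}{\partial s}$, $\frac{\partial w}{\partial t}$, and $\frac{\partial w}{\partial u}$ in terms of $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$.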
4.3. The spherical substitutions x = ρ sin ϕ cos θ, y = ρ sin ϕ sin θ, and z = ρ cos ϕ convert a
smooth real-valued function f (x, y, z) into a function of ρ, ϕ, and θ:
4.6. The substitutions x = r cos θ and y = r sin θ convert a smooth real-valued function f (x, y)
into a function of r and θ: w = f (r cos θ, r sin θ).
4.7. (a) Suppose that w = f (u, v) is a smooth real-valued function, where u = x/z and v = y/z.
Show that:
\[ x \frac{\partial w}{\partial x} + y \frac{\partial w}{\partial y} + z \frac{\partial w}{\partial z} = 0. \tag{6.10} \]
(b) Without calculating any partial derivatives, deduce that $w = \frac{x^2 + xy + y^2}{z^2}$ satisfies equation
(6.10).
4.8. Let f : U → R be a smooth real-valued function of three variables defined on an open set U
in R3 . Assume that the condition f (x, y, z) = c defines z implicitly as a smooth function of
x and y, z = z(x, y), on some open set of points (x, y) in R2 . Show that, on this open set:
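\[ \frac{\partial z}{\partial x} = -\frac{\partial f / \partial x}{\partial f / \partial z} \quad \text{and} \quad \frac{\partial z}{\partial y} = -\frac{\partial f / \partial y}{\partial f / \partial z}. \]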
4.10. (a) Let f : R3 → R2 be a smooth function. By definition, the level set S of f corresponding
to the value c = (c1 , c2 ) is the set of all (x, y, z) such that f (x, y, z) = c or, equivalently,
where f1 (x, y, z) = c1 and f2 (x, y, z) = c2 . Typically, the last two equations define a
curve in R3 , the curve of intersection of the surfaces defined by f1 = c1 and f2 = c2 .
Assume that the condition f (x, y, z) = c defines x and y implicitly as smooth functions
of z on some interval of z values in R, that is, x = x(z) and y = y(z) are functions
that satisfy f (x(z), y(z), z) = c. Then the corresponding portion of the level set is
parametrized by α(z) = (x(z), y(z), z).
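Show that $x'(z)$ and $y'(z)$ satisfy the matrix equation:
\[ \begin{bmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\ \frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \end{bmatrix} \begin{bmatrix} x'(z) \\ y'(z) \end{bmatrix} = - \begin{bmatrix} \frac{\partial f_1}{\partial z} \\ \frac{\partial f_2}{\partial z} \end{bmatrix}, \]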
where the partial derivatives are evaluated at the point (x(z), y(z), z). (Hint: Let w =
(w1 , w2 ) = f (x(z), y(z), z).)
(b) Let f (x, y, z) = (x − cos z, y − sin z), and let C be the level set of f corresponding to
c = (0, 0). Find the functions x = x(z) and y = y(z) and the parametrization α of C
described in part (a), and describe C geometrically. Then, use part (a) to find Dα(z),
and verify that your answer makes sense.
4.11. Let $f(x, y)$ be a smooth real-valued function of x and y defined on an open set U in $\mathbb{R}^2$. The
partial derivative $\frac{\partial f}{\partial x}$ is also a real-valued function of x and y (as is $\frac{\partial f}{\partial y}$). If $\alpha(t) = (x(t), y(t))$
is a smooth path in U, then substituting for x and y in terms of t converts $\frac{\partial f}{\partial x}$ into a function
of t, say $w = \frac{\partial f}{\partial x}(x(t), y(t))$.
Chapter 7
Change of variables
We now turn to the counterpart for integrals of the chain rule. It is called the change of variables
theorem. For functions of one variable, the corresponding notion is the method of substitution.
The one-variable and multivariable versions are expressed in equations that are similar formally,
but the approaches one might take to understand where they come from are quite different. It’s
exciting when a change in perspective still leads to a result of recognizable form. We make a few
remarks comparing the two versions later, after the theorem has been stated.
In many respects, the change of variables theorem and the chain rule hold together the foun-
dation of the more advanced theory of multivariable calculus. We begin, however, with a concrete
example where changing variables just seems like the right thing to do.
Example 7.1. Let D be the disk $x^2 + y^2 \le 4$ in the xy-plane. Find $\iint_D \sqrt{x^2 + y^2} \, dx \, dy$. For
instance, if one wanted to know the average distance to the origin on D, one would evaluate this
integral and divide by Area (D) = 4π.
Setting up the integral with the order of integration dy dx in the usual way (Figure 7.1) gives:
\[ \iint_D \sqrt{x^2 + y^2} \, dx \, dy = \int_{-2}^{2} \int_{-\sqrt{4 - x^2}}^{\sqrt{4 - x^2}} \sqrt{x^2 + y^2} \, dy \, dx. \]
While not impossible, this looks potentially messy. On the other hand, expressing the problem in
terms of polar coordinates seems promising for two reasons.
• The function to be integrated is simpler: $\sqrt{x^2 + y^2} = r$.
• The region of integration is simpler: in polar coordinates, D is described by 0 ≤ r ≤ 2,
0 ≤ θ ≤ 2π, in other words, by a rectangle in the rθ-plane.
174 CHAPTER 7. CHANGE OF VARIABLES
The obstacle is what to do about the dA = dx dy part of the integral. For this, we need to see how
small pieces of area in the xy-plane are related to those in the rθ-plane. We return to this example
later.
To begin, subdivide $D^*$ into small subrectangles of dimensions $\Delta u_i$ by $\Delta v_j$, as in Figure 7.2.
The transformation T sends a typical subrectangle of area $\Delta u_i \, \Delta v_j$ in the uv-plane to a small curvy
quadrilateral in the xy-plane. Let $\Delta A_{ij}$ denote the area of the curvy quadrilateral. If we choose
a sample point $p_{ij}$ in each curvy quadrilateral, we can form a Riemann sum-like approximation of
the integral over D:
\[ \iint_D f(x, y) \, dx \, dy \approx \sum_{i,j} f(p_{ij}) \, \Delta A_{ij}. \tag{7.1} \]
To turn this into a Riemann sum over $D^*$, we need to relate the areas of the subrectangles and the
curvy quadrilaterals.
To do so, we use the first-order approximation of T . Choose a point aij in the (i, j)th subrect-
angle of D∗ such that T (aij ) = pij . Then the first-order approximation says that, for points u in
7.1. CHANGE OF VARIABLES FOR DOUBLE INTEGRALS 175
D∗ near aij :
It may be difficult to get a foothold on the behavior of T itself, but describing how the approximation
transforms the (i, j)th subrectangle is quite tractable.
For this, we bring in some linear algebra. Note that the derivative $DT(a_{ij})$ is a 2 by 2 matrix.
As shown in Chapter 1, any 2 by 2 matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ determines a linear transformation
$L : \mathbb{R}^2 \to \mathbb{R}^2$ given by matrix multiplication: $L(x) = A x$. This transformation sends the unit
square determined by $e_1$ and $e_2$ to the parallelogram determined by $L(e_1) = \begin{bmatrix} a \\ c \end{bmatrix}$ and $L(e_2) = \begin{bmatrix} b \\ d \end{bmatrix}$,
which are the columns of A (Figure 7.3). We saw in Proposition 1.13 of Chapter 1 that the area
of this parallelogram is $\left| \det \begin{bmatrix} L(e_1) & L(e_2) \end{bmatrix} \right|$, where $\begin{bmatrix} L(e_1) & L(e_2) \end{bmatrix}$ is the matrix whose rows are
$L(e_1)$ and $L(e_2)$. This is precisely the matrix transpose $A^t$. Thus the area of the parallelogram is
$|\det(A^t)| = |\det A|$, where we have used the general fact that an n by n matrix and its transpose
have the same determinant (Proposition 1.14, Chapter 1). In particular, L has changed the area
by a factor of $|\det A|$.
Figure 7.3: A linear transformation L sends the square determined by $e_1$ and $e_2$ to the parallelogram
determined by $L(e_1)$ and $L(e_2)$.
This principle extends easily to a couple of slightly more general cases.
• Given nonzero scalars $\ell$ and w, let R be the $|\ell|$ by $|w|$ rectangle determined by $\ell e_1$ and $w e_2$.
Then L sends R to the parallelogram P determined by $L(\ell e_1) = \ell L(e_1)$ and $L(w e_2) = w L(e_2)$.
Compared to the original case, the areas of both rectangle and parallelogram have changed
by a common factor of $|\ell w|$, hence they still differ by a factor of $|\det A|$.
• Suppose that we translate the rectangle R in the previous case by a constant vector a. This
results in another $|\ell|$ by $|w|$ rectangle $R'$ consisting of all points of the form $a + x$, where $x \in R$.
By linearity $L(a + x) = L(a) + L(x)$, so L transforms $R'$ to the translation of the parallelogram
$L(R) = P$ by $L(a)$. Let’s call the translated parallelogram $P'$. Since translations don’t affect
areas, $R'$ and its image $P'$ again differ by a factor of $|\det A|$.
Lemma 7.2. A linear transformation L : R2 → R2 represented with respect to the standard bases by
a 2 by 2 matrix A sends rectangles whose sides are parallel to the coordinate axes to parallelograms
and alters the area of the rectangles by a factor of | det A|.
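The lemma is easy to test on a specific case. In the sketch below, the matrix A and the rectangle [1, 3] × [−1, 2] are sample choices made for illustration, and the area of the image parallelogram is computed independently of the determinant by the shoelace formula for polygon area.

```python
def apply(A, p):
    """Multiply the 2 by 2 matrix A by the point p = (x, y)."""
    return (A[0][0] * p[0] + A[0][1] * p[1],
            A[1][0] * p[0] + A[1][1] * p[1])

def shoelace_area(pts):
    """Area of a polygon whose vertices are listed in order around the boundary."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = [[3, -1], [2, 1]]                           # sample matrix, det A = 5
rect = [(1, -1), (3, -1), (3, 2), (1, 2)]       # corners of [1, 3] x [-1, 2], area 6
image = [apply(A, p) for p in rect]             # corners of the image parallelogram

area_rect = shoelace_area(rect)
area_image = shoelace_area(image)
print(area_rect, area_image, abs(det2(A)) * area_rect)
```

Here the rectangle has area 6 and det A = 5, so the image should have area 30.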
Letting $\Delta u_i$ and $\Delta v_j$ go to zero, this becomes an integral over $D^*$, giving the following major
result.
Theorem 7.3 (Change of variables theorem for double integrals). Let D and $D^*$ be bounded subsets
of $\mathbb{R}^2$, and let $T : D^* \to D$ be a smooth function such that $T(D^*) = D$ and that is one-to-one, except
possibly on the boundary of $D^*$. If f is integrable on D, then:
\[ \iint_D f(x, y) \, dx \, dy = \iint_{D^*} f(T(u, v)) \, |\det DT(u, v)| \, du \, dv. \]
We verify that T satisfies the hypotheses of the theorem, though we shall grow less meticulous
about writing out the details on this point after a couple more examples. All the points on the
left side of D∗ , where r = 0, are mapped to the origin, that is, T (0, θ) = (0, 0) for 0 ≤ θ ≤ 2π.
Also, along the top and bottom of D∗ , θ = 0 and θ = 2π correspond to the same thing in the
xy-plane: T (r, 0) = T (r, 2π) for 0 ≤ r ≤ 2. Apart from this, distinct points in D∗ are mapped to
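distinct values in D. In other words, T is one-to-one away from the boundary of $D^*$, so the change
of variables theorem applies.
Since $\sqrt{x^2 + y^2} = r$, this gives:
\[
\begin{aligned}
\iint_D \sqrt{x^2 + y^2} \, dx \, dy &= \iint_{D^*} r \cdot r \, dr \, d\theta \\
&= \int_0^{2\pi} \int_0^2 r^2 \, dr \, d\theta \\
&= \int_0^{2\pi} \left( \tfrac{1}{3} r^3 \Big|_{r=0}^{r=2} \right) d\theta \\
&= \int_0^{2\pi} \tfrac{8}{3} \, d\theta \\
&= \tfrac{8}{3} \, \theta \Big|_0^{2\pi} \\
&= \frac{16\pi}{3}.
\end{aligned}
\]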
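As a rough numerical check of the value 16π/3, the sketch below approximates the integral two ways: a midpoint Riemann sum of r · r over the rectangle in the rθ-plane, and a brute-force midpoint sum over a grid of small squares covering the disk in the xy-plane. The grid sizes are arbitrary choices, and the Cartesian sum is only accurate to a couple of decimal places because of the jagged boundary.

```python
import math

def polar_sum(n=400):
    """Midpoint Riemann sum of r * r over 0 <= r <= 2, 0 <= theta <= 2*pi."""
    dr = 2.0 / n
    # The integrand does not depend on theta, so the theta-direction contributes 2*pi.
    return 2 * math.pi * sum(((i + 0.5) * dr) ** 2 * dr for i in range(n))

def cartesian_sum(n=400):
    """Midpoint sum of sqrt(x^2 + y^2) over grid squares whose centers lie in the disk."""
    h = 4.0 / n
    total = 0.0
    for i in range(n):
        x = -2 + (i + 0.5) * h
        for j in range(n):
            y = -2 + (j + 0.5) * h
            if x * x + y * y <= 4:
                total += math.sqrt(x * x + y * y) * h * h
    return total

exact = 16 * math.pi / 3
print(polar_sum(), cartesian_sum(), exact)
```

The extra factor of r in the polar sum is exactly the |det DT| term from the change of variables theorem; omitting it would not reproduce the Cartesian value.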
geometric transformation goes the other way, from (u, v) to (x, y). This is an inherent aspect of
substitution.
Sometimes, the original integrand is denoted by an expression like $\eta = f(x, y) \, dx \, dy$, and then
the transformed integrand $f(T(u, v)) \, |\det DT(u, v)| \, du \, dv$ is denoted by $T^*(\eta)$. Another appropriate
notation for it might be $\eta^*$. With this notation, the change of variables theorem becomes:
\[ \iint_{T(D^*)} \eta = \iint_{D^*} T^*(\eta). \tag{7.3} \]
Actually, what we just said is not entirely accurate and the situation is a little more complicated,6
but it serves to illustrate the general principle: in a substitution, the geometry is “pushed forward,”
and the integral is “pulled back.”
Here, the integrand $f(x, y) = x^2 + y^2$ is fairly simple, but the region of integration poses some
problems. D is the region inside the upper half of an ellipse. The endpoints of the iterated integral
with respect to x and y are somewhat hard to work with, and polar coordinates don’t help either—
the relationship between r and θ needed to describe D is a little complicated. On the other hand,
we can think of D as a transformed semicircular disk, which is easy to describe in polar coordinates.
Thus we attempt to map the half-disk $D^*$ of radius 1 given in the uv-plane by $u^2 + v^2 \le 1$, $v \ge 0$,
over to D and hope that this does not complicate the function to be integrated too much.
The x and y-intercepts of the ellipse are ±3 and ±2, respectively, so, to achieve the transfor-
mation, we stretch horizontally by a factor of 3 and vertically by a factor of 2. In other words,
define:
T (u, v) = (3u, 2v).
The transformation is illustrated in Figure 7.5. In effect, we are making the substitutions x = 3u and
y = 2v. Then T transforms $D^*$ to D: if (u, v) satisfies $u^2 + v^2 \le 1$ and $v \ge 0$, then (x, y) = T(u, v)
satisfies $\frac{x^2}{9} + \frac{y^2}{4} = \frac{(3u)^2}{9} + \frac{(2v)^2}{4} = u^2 + v^2 \le 1$ and $y = 2v \ge 0$.
Moreover, T is one-to-one on all of $\mathbb{R}^2$. This seems reasonable geometrically, or, in terms of
equations, if $(u_1, v_1) \ne (u_2, v_2)$, then $(3u_1, 2v_1) \ne (3u_2, 2v_2)$. Perhaps it is more readable to phrase
this in the contrapositive: if $(3u_1, 2v_1) = (3u_2, 2v_2)$, then $(u_1, v_1) = (u_2, v_2)$. This is clear. For
instance, $3u_1 = 3u_2 \Rightarrow u_1 = u_2$.
6 The correct expressions are $\eta = f(x, y) \, dx \wedge dy$ and $T^*(\eta) = f(T(u, v)) \det DT(u, v) \, du \wedge dv$. We shall learn more
about integrands like this, and more importantly how to integrate them, in the last two chapters. In particular, for
the correct version of equation (7.3), see Exercise 4.20 in Chapter 11.
7.3. EXAMPLES: LINEAR CHANGES OF VARIABLES, SYMMETRY 179
Finally, $DT(u, v) = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}$, so $|\det DT(u, v)| = 6$, and $f(T(u, v)) = (3u)^2 + (2v)^2 = 9u^2 + 4v^2$.
Thus by the change of variables theorem:
\[ \iint_D (x^2 + y^2) \, dx \, dy = \iint_{D^*} (9u^2 + 4v^2) \cdot 6 \, du \, dv. \]
We convert this to polar coordinates in the uv-plane using u = r cos θ, v = r sin θ, and du dv =
r dr dθ. The half-disk D∗ is described in polar coordinates by 0 ≤ r ≤ 1, 0 ≤ θ ≤ π, in other words,
by the rectangle [0, 1] × [0, π] in the rθ-plane. Thus:
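\[
\begin{aligned}
\iint_D (x^2 + y^2) \, dx \, dy &= \iint_{D^* \text{ in polar}} (9r^2 \cos^2\theta + 4r^2 \sin^2\theta) \cdot 6 \cdot r \, dr \, d\theta \\
&= 6 \int_0^{\pi} \int_0^1 r^3 (9 \cos^2\theta + 4 \sin^2\theta) \, dr \, d\theta.
\end{aligned} \tag{7.4}
\]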
At this point, we pause to put on the record a couple of facts from the exercises that we shall
use freely from now on (see Exercise 2.3 in Chapter 5 and Exercise 1.1 in this chapter):
• For a function F whose variables separate into two independent continuous factors, $F(x, y) = f(x) g(y)$, the integral over a rectangle is given by:
\[ \int_c^d \int_a^b f(x) g(y) \, dx \, dy = \left( \int_a^b f(x) \, dx \right) \left( \int_c^d g(y) \, dy \right). \]
Note that integrals over rectangles are characterized by the property that all the limits of
integration are constant.
• $\int_0^{\pi} \cos^2\theta \, d\theta = \int_0^{\pi} \sin^2\theta \, d\theta = \frac{\pi}{2}$. (An easy way to remember this is to note that $\int_0^{\pi} \cos^2\theta \, d\theta = \int_0^{\pi} \sin^2\theta \, d\theta$ from the graphs of the sine and cosine functions and that $\int_0^{\pi} (\cos^2\theta + \sin^2\theta) \, d\theta = \int_0^{\pi} 1 \, d\theta = \theta \big|_0^{\pi} = \pi$.)
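Applied to equation (7.4), the two facts above split the integral into a radial factor and an angular factor: 6 · (1/4) · (9 · π/2 + 4 · π/2) = 39π/4. The following sketch checks this arithmetic numerically with a simple midpoint rule; the helper name `midpoint` and the number of subintervals are illustrative choices.

```python
import math

def midpoint(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Radial factor: integral of r^3 over [0, 1], which equals 1/4.
radial = midpoint(lambda r: r**3, 0.0, 1.0)

# Angular factor: integral of 9 cos^2 + 4 sin^2 over [0, pi], which equals 13*pi/2.
angular = midpoint(lambda t: 9 * math.cos(t)**2 + 4 * math.sin(t)**2, 0.0, math.pi)

value = 6 * radial * angular
print(value, 39 * math.pi / 4)
```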
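\[ T(u, v) = \begin{bmatrix} 3 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 3u - v \\ 2u + v \end{bmatrix} = (3u - v, 2u + v). \]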
Then T maps the unit square D∗ = [0, 1] × [0, 1] in the uv-plane onto D. See Figure 7.6. We leave
for the exercises the verification that T is one-to-one on R2 (Exercise 3.3), so we can use the change
of variables theorem to pull back the original integral to an integral over D∗ .
The substitutions given by T are x = 3u − v and y = 2u + v. Also, $DT(u, v) = \begin{bmatrix} 3 & -1 \\ 2 & 1 \end{bmatrix}$, and
We apply the change of variables theorem with f (x, y) = x so that f (T (x, y)) = f (−x, y) = −x.
Then:
\[
\begin{aligned}
\iint_D x \, dx \, dy &= \iint_{D^* = D} f(T(x, y)) \, |\det DT(x, y)| \, dx \, dy \\
&= \iint_D -x \cdot 1 \, dx \, dy \\
&= -\iint_D x \, dx \, dy.
\end{aligned}
\]
As a result, $2 \iint_D x \, dx \, dy = 0$, so $\iint_D x \, dx \, dy = 0$.
(b) Here, the intuition is that the contributions to the Riemann sums $\sum y \, \Delta x \, \Delta y$ for $\iint_D y \, dx \, dy$
are the same for (x, y) and (−x, y), hence the sums converge to the same value on L and on R.
More formally, again let T(x, y) = (−x, y). Then T(L) = R, i.e., $R^* = L$, and $|\det DT(x, y)| = 1$
as before. Here, f(x, y) = y, so f(T(x, y)) = f(−x, y) = y. Consequently:
\[
\begin{aligned}
\iint_R y \, dx \, dy &= \iint_{R^* = L} f(T(x, y)) \, |\det DT(x, y)| \, dx \, dy \\
&= \iint_L y \cdot 1 \, dx \, dy \\
&= \iint_L y \, dx \, dy.
\end{aligned}
\]
More informally, one uses T to substitute for $x_1, x_2, \ldots, x_n$ in terms of $u_1, u_2, \ldots, u_n$ and sets
$dx_1 \, dx_2 \cdots dx_n = |\det DT(u)| \, du_1 \, du_2 \cdots du_n$. This pulls back the integral over W to an integral
over $W^*$.
The intuition behind this generalization is the same as for double integrals: A linear transfor-
mation L : Rn → Rn represented by a matrix A alters n-dimensional volume by a factor of | det A|.
So, to obtain the integral, one chops up W ∗ into small pieces and, on each one, approximates T by
its first-order approximation, which alters volume by a factor of | det DT (u)|.
For triple integrals, there are two standard sets of alternate variables.
• Cylindrical coordinates. For this, the change of variables is T (r, θ, z) = (x, y, z), where,
as we saw in Section 5.4.2, the substitutions are given by:
x = r cos θ
y = r sin θ
z = z.
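Therefore $DT(r, \theta, z) = \begin{bmatrix} \cos\theta & -r \sin\theta & 0 \\ \sin\theta & r \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$. After expanding along the third column, this
gives $|\det DT(r, \theta, z)| = \left| \det \begin{bmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{bmatrix} \right| = r$, i.e.:
\[ dx \, dy \, dz = r \, dr \, d\theta \, dz. \]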
7.4. CHANGE OF VARIABLES FOR N -FOLD INTEGRALS 183
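See Section 5.4.3. Then $DT(\rho, \varphi, \theta) = \begin{bmatrix} \sin\varphi \cos\theta & \rho \cos\varphi \cos\theta & -\rho \sin\varphi \sin\theta \\ \sin\varphi \sin\theta & \rho \cos\varphi \sin\theta & \rho \sin\varphi \cos\theta \\ \cos\varphi & -\rho \sin\varphi & 0 \end{bmatrix}$. We leave as
an exercise the calculation that $|\det DT(\rho, \varphi, \theta)| = \rho^2 \sin\varphi$ (Exercise 2.8 in Chapter 6), i.e.:
\[ dx \, dy \, dz = \rho^2 \sin\varphi \, d\rho \, d\varphi \, d\theta. \]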
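Without carrying out the symbolic calculation left to the exercise, one can at least spot-check the formula numerically. The sketch below evaluates the 3 by 3 matrix above at a sample point and compares its determinant with ρ² sin φ; the sample values and helper names are arbitrary choices for illustration.

```python
import math

def spherical_jacobian(rho, phi, theta):
    """The matrix DT(rho, phi, theta) for the spherical coordinate transformation."""
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return [
        [sp * ct, rho * cp * ct, -rho * sp * st],
        [sp * st, rho * cp * st, rho * sp * ct],
        [cp, -rho * sp, 0.0],
    ]

def det3(M):
    """Determinant of a 3 by 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

rho, phi, theta = 2.0, 0.9, 1.3
print(det3(spherical_jacobian(rho, phi, theta)), rho**2 * math.sin(phi))
```

A numerical match at a few points is evidence, not a proof; the exercise still asks for the general calculation.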
Example 7.8. Find the volume of a closed ball of radius a in $\mathbb{R}^3$, that is, the volume of:
\[ W = \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 \le a^2 \}. \]
The variables of the integrand separate into independent factors and all endpoints are constant, so,
That is:
This is often referred to as the volume of a sphere, though that is a misnomer. A sphere is a
surface, not a solid.
For those who would like more practice setting up these types of integrals, this example could
have been solved using cylindrical coordinates as well, where $dx \, dy \, dz = r \, dr \, d\theta \, dz$. We use the
order of integration dz dr dθ. For fixed r and θ, z varies from $z = -\sqrt{a^2 - x^2 - y^2} = -\sqrt{a^2 - r^2}$
to $z = +\sqrt{a^2 - r^2}$. Moreover, the possible values of r and θ come from the two-dimensional disk
of radius a in the xy-plane, which in polar coordinates is described by $0 \le r \le a$ and $0 \le \theta \le 2\pi$.
Thus:
\[
\begin{aligned}
\operatorname{Vol}(W) &= \int_0^{2\pi} \int_0^a \int_{-\sqrt{a^2 - r^2}}^{\sqrt{a^2 - r^2}} 1 \cdot r \, dz \, dr \, d\theta \\
&= \int_0^{2\pi} \int_0^a \left( r z \Big|_{z = -\sqrt{a^2 - r^2}}^{z = \sqrt{a^2 - r^2}} \right) dr \, d\theta \\
&= \int_0^{2\pi} \int_0^a 2r \sqrt{a^2 - r^2} \, dr \, d\theta.
\end{aligned}
\]
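The last integral can be evaluated with the substitution u = a² − r², and it should reproduce the familiar volume (4/3)πa³ of the ball. As a hedged numerical check, the sketch below approximates the r-integral with a midpoint rule; the function name `ball_volume` and the number of subintervals are illustrative choices.

```python
import math

def ball_volume(a, n=20000):
    """Approximate the integral of 2 r sqrt(a^2 - r^2) over 0 <= r <= a, times 2*pi."""
    dr = a / n
    radial = 0.0
    for k in range(n):
        r = (k + 0.5) * dr            # midpoint of the k-th subinterval, always less than a
        radial += 2 * r * math.sqrt(a * a - r * r) * dr
    # The integrand does not depend on theta, so the theta-integral contributes 2*pi.
    return 2 * math.pi * radial

for a in (1.0, 2.5):
    print(a, ball_volume(a), 4 * math.pi * a**3 / 3)
```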
Example 7.9. Let W be the region in $\mathbb{R}^3$ inside both the circular cylinder $x^2 + y^2 = 4$ and the
sphere $x^2 + y^2 + z^2 = 9$ and above the xy-plane. Find $\iiint_W (x^2 + y^2) z \, dx \, dy \, dz$.
The region of integration W is a silo-shaped solid bounded laterally by a circular cylinder and
capped with a sphere, as drawn in Figure 7.10. Because the base is a disk in the xy-plane, perhaps
the simplest way to describe W is with polar coordinates together with the z-coordinate, that is,
with cylindrical coordinates. Substituting x = r cos θ, y = r sin θ, and dx dy dz = r dr dθ dz gives:
\[ \iiint_W (x^2 + y^2)z\,dx\,dy\,dz = \iiint_{W \text{ in cylindrical}} r^2 z \cdot r\,dr\,d\theta\,dz = \iiint_{W \text{ in cylindrical}} r^3 z\,dr\,d\theta\,dz. \]
Figure 7.10: The region bounded by a circular cylinder of radius 2, a sphere of radius 3, and the
xy-plane
Note that $W$ is symmetric in the yz-plane in the sense that $(-x, y, z)$ is in $W$ whenever $(x, y, z)$ is in $W$. In addition, the function being integrated, namely, $f(x, y, z) = x$, satisfies $f(-x, y, z) = -x = -f(x, y, z)$. The same sort of change of variables argument that we applied in Example 7.7 to study symmetry for double integrals shows that $\iiint_W x\,dx\,dy\,dz = 0$, and hence $\bar{x} = 0$. The argument that $\bar{y} = 0$ is similar using symmetry in the xz-plane.
To find $\bar{z} = \frac{1}{\operatorname{Vol}(W)} \iiint_W z\,dx\,dy\,dz$, we actually must do some calculation. We begin with $\operatorname{Vol}(W) = \iiint_W 1\,dx\,dy\,dz$. It is simplest to describe $W$ using spherical coordinates. Using the
Figure 7.11: An ice cream cone-shaped solid, capped by a sphere of radius $2\sqrt{2}$
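order of integration $d\rho\,d\phi\,d\theta$, for fixed $\phi$ and $\theta$, $\rho$ goes along a radial segment from $\rho = 0$ to $\rho = 2\sqrt{2}$, the radius of the spherical cap. Then, for fixed $\theta$, the angle $\phi$ varies from $\phi = 0$ to $\phi = \frac{\pi}{4}$. Lastly $\theta$ goes all the way around from $\theta = 0$ to $\theta = 2\pi$. Hence:
\[
\begin{aligned}
\operatorname{Vol}(W) &= \iiint_W 1\,dx\,dy\,dz = \iiint_{W \text{ in spherical}} 1 \cdot \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/4} \int_0^{2\sqrt{2}} \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \\
&= \left( \int_0^{2\sqrt{2}} \rho^2\,d\rho \right) \left( \int_0^{\pi/4} \sin\phi\,d\phi \right) \left( \int_0^{2\pi} 1\,d\theta \right) \\
&= \left( \tfrac{1}{3}\rho^3\,\Big|_0^{2\sqrt{2}} \right) \left( -\cos\phi\,\Big|_0^{\pi/4} \right) \left( \theta\,\Big|_0^{2\pi} \right) \\
&= \frac{16\sqrt{2}}{3} \cdot \left( -\frac{\sqrt{2}}{2} - (-1) \right) \cdot 2\pi \\
&= \frac{32\pi}{3}(\sqrt{2} - 1).
\end{aligned}
\]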
\[ (\bar{x}, \bar{y}, \bar{z}) = \left( 0, 0, \frac{3}{4(\sqrt{2} - 1)} \right). \]
As a reality check, note that this is approximately the point $(0, 0, 1.8)$, whereas the radius of the sphere is $2\sqrt{2} \approx 2.8$. Hence the centroid is roughly two-thirds of the way to the spherical top of the region. Given the top-heavy shape of $W$, this is plausible.
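The reality check can also be carried out numerically. The sketch below is only an illustration: it assumes the spherical description of $W$ used above ($0 \le \rho \le 2\sqrt{2}$, $0 \le \phi \le \frac{\pi}{4}$, $0 \le \theta \le 2\pi$), writes $z = \rho\cos\phi$, and approximates each one-variable factor with a midpoint rule. The helper name `midpoint` is a hypothetical choice.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

RHO_MAX = 2 * math.sqrt(2)   # radius of the spherical cap
PHI_MAX = math.pi / 4        # opening angle of the cone

# Vol(W): the integrand rho^2 sin(phi) separates into three factors.
volume = (midpoint(lambda rho: rho**2, 0.0, RHO_MAX)
          * midpoint(math.sin, 0.0, PHI_MAX)
          * 2 * math.pi)

# Integral of z = rho cos(phi) over W: integrand rho^3 cos(phi) sin(phi).
z_moment = (midpoint(lambda rho: rho**3, 0.0, RHO_MAX)
            * midpoint(lambda phi: math.cos(phi) * math.sin(phi), 0.0, PHI_MAX)
            * 2 * math.pi)

z_bar = z_moment / volume
print("Vol(W) ~", volume, " closed form:", 32 * math.pi / 3 * (math.sqrt(2) - 1))
print("z_bar  ~", z_bar, " closed form:", 3 / (4 * (math.sqrt(2) - 1)))
```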
Example 7.11. Find the volume of the 4-dimensional closed unit ball:
Figure 7.12: The volume of a 4-dimensional ball as a triple integral of a single integral, $\iiint \left( \int \dots\,dx_4 \right) dx_1\,dx_2\,dx_3$
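Hence:
\[
\begin{aligned}
\operatorname{Vol}(W) &= \iiint_B \left( \int_{-\sqrt{1 - x_1^2 - x_2^2 - x_3^2}}^{\sqrt{1 - x_1^2 - x_2^2 - x_3^2}} 1\,dx_4 \right) dx_1\,dx_2\,dx_3 \qquad (7.5) \\
&= \iiint_B 2\sqrt{1 - x_1^2 - x_2^2 - x_3^2}\,dx_1\,dx_2\,dx_3.
\end{aligned}
\]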
This triple integral can be evaluated, say using spherical coordinates, though it gets a little messy. Here is an alternative. One can think of equation (7.5) above as expressing the volume of $W$ as the triple integral of a single integral. We look at it instead as a double integral of a double integral, along the lines of $\operatorname{Vol}(W) = \iint_{x_3, x_4} \left( \iint_{x_1, x_2} 1\,dx_1\,dx_2 \right) dx_3\,dx_4$. More precisely, the projection of $W$ onto the $x_3x_4$-plane is the set of all pairs $(x_3, x_4)$ for which there is at least one $(x_1, x_2)$ such
that $(x_1, x_2, x_3, x_4)$ is in $W$. This is satisfied by all points in the unit disk $x_3^2 + x_4^2 \le 1$. In fact, given such a $(x_3, x_4)$, the corresponding points $(x_1, x_2)$ are those such that $x_1^2 + x_2^2 \le 1 - x_3^2 - x_4^2$. We write this way of describing $W$ as:
\[ \operatorname{Vol}(W) = \iint_{x_3^2 + x_4^2 \le 1} \left( \iint_{x_1^2 + x_2^2 \le 1 - x_3^2 - x_4^2} 1\,dx_1\,dx_2 \right) dx_3\,dx_4. \]
Figure 7.13: The volume of a 4-dimensional ball as a double integral of a double integral, $\iint \left( \iint \dots\,dx_1\,dx_2 \right) dx_3\,dx_4$
The inner integral represents the area of a two-dimensional disk of radius $a = \sqrt{1 - x_3^2 - x_4^2}$ in the $x_1x_2$-plane. Hence $\iint_{x_1^2 + x_2^2 \le 1 - x_3^2 - x_4^2} 1\,dx_1\,dx_2 = \pi a^2 = \pi(1 - x_3^2 - x_4^2)$, whence:
\[ \operatorname{Vol}(W) = \iint_{x_3^2 + x_4^2 \le 1} \pi(1 - x_3^2 - x_4^2)\,dx_3\,dx_4. \]
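This is an integral over the unit disk in the $x_3x_4$-plane, so it can be evaluated using polar coordinates with $dx_3\,dx_4 = r\,dr\,d\theta$:
\[
\begin{aligned}
\operatorname{Vol}(W) &= \int_0^{2\pi} \int_0^1 \pi(1 - r^2) \cdot r\,dr\,d\theta \\
&= \pi \left( \int_0^1 r(1 - r^2)\,dr \right) \left( \int_0^{2\pi} 1\,d\theta \right) \\
&= \pi \left( \left( \tfrac{1}{2}r^2 - \tfrac{1}{4}r^4 \right) \Big|_0^1 \right) \left( \theta\,\Big|_0^{2\pi} \right) \\
&= \pi \cdot \frac{1}{4} \cdot 2\pi \\
&= \frac{\pi^2}{2}.
\end{aligned}
\]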
We leave for the exercises the problems of finding the volume of a 5-dimensional ball and, more
broadly, a strategy for approaching the volume of an n-dimensional ball in general. See Exercises
4.9 and 4.10.
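For readers who would like an independent sanity check of the value $\frac{\pi^2}{2} \approx 4.93$ in Example 7.11, here is a crude sketch that does not use the iterated-integral argument at all. It counts midpoints of a grid on $[0, 1]^4$ that fall inside the unit ball and multiplies by 16 for the sixteen orthants. The grid size `n` and the function name are arbitrary illustrative choices, and the estimate is only accurate to a couple of decimal places.

```python
import math

def ball4_volume_estimate(n=24):
    """Estimate the volume of the unit ball in R^4 by counting the midpoints
    of an n x n x n x n grid on [0, 1]^4 that satisfy
    x1^2 + x2^2 + x3^2 + x4^2 <= 1, then using symmetry across orthants."""
    h = 1.0 / n
    squares = [((k + 0.5) * h) ** 2 for k in range(n)]  # squared midpoints
    count = 0
    for s1 in squares:
        for s2 in squares:
            s12 = s1 + s2
            if s12 > 1.0:
                continue
            for s3 in squares:
                s123 = s12 + s3
                if s123 > 1.0:
                    continue
                for s4 in squares:
                    if s123 + s4 <= 1.0:
                        count += 1
    # Each grid cell has 4-dimensional volume h^4; there are 16 orthants.
    return 16 * count * h ** 4

estimate = ball4_volume_estimate()
print("grid estimate:", estimate, " pi^2 / 2 =", math.pi ** 2 / 2)
```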
7.5. EXERCISES FOR CHAPTER 7 189
1.1. This exercise involves two standard one-variable integrals that appear regularly enough that it seems like a good idea to get out into the open how they can be evaluated quickly. Namely, we compute the integrals $\int_0^{n\pi} \cos^2 x\,dx$ and $\int_0^{n\pi} \sin^2 x\,dx$, where $n$ is a positive integer. They can be found using trigonometric identities, but there is another way that is easier to reproduce on the spot.
(a) First, sketch the graphs of $y = \cos^2 x$ and $y = \sin^2 x$ for $x$ in the interval $[0, n\pi]$, and use them to illustrate that $\int_0^{n\pi} \cos^2 x\,dx = \int_0^{n\pi} \sin^2 x\,dx$.
(b) This is an optional exercise for those who are uneasy about drawing the conclusion in part (a) based only on a picture.
(i) Show that $\int_0^{n\pi} \sin^2 x\,dx = \int_{\pi/2}^{n\pi + \pi/2} \cos^2 x\,dx$. (Hint: Use the substitution $u = x + \frac{\pi}{2}$ and the identity $\sin(\theta - \frac{\pi}{2}) = -\cos\theta$.)
(ii) Show that $\int_0^{\pi/2} \cos^2 x\,dx = \int_{n\pi}^{n\pi + \pi/2} \cos^2 x\,dx$. (Hint: $\cos(\theta - \pi) = -\cos\theta$.)
Deduce that $\int_0^{n\pi} \cos^2 x\,dx = \int_0^{n\pi} \sin^2 x\,dx$.
(c) Integrate the identity $\cos^2 x + \sin^2 x = 1$, and use part (a) (or (b)) to show that:
\[ \int_0^{n\pi} \cos^2 x\,dx = \frac{n\pi}{2} \quad \text{and} \quad \int_0^{n\pi} \sin^2 x\,dx = \frac{n\pi}{2}. \]
1.3. Find $\iint_D xy\,dx\,dy$, where $D$ is the region in the first quadrant lying inside the circle $x^2 + y^2 = 4$
1.5. Let $D$ be the region in the xy-plane satisfying $x^2 + y^2 \le 2$, $y \ge x$, and $x \ge 0$. See Figure 7.14.
(a) Write an expression for the integral using the order of integration dy dx.
(b) Write an expression for the integral using the order of integration dx dy.
(c) Write an expression for the integral in polar coordinates in whatever order of integration
you prefer.
(d) Evaluate the integral using whichever approach seems best.
1.6. Let $W$ be the region in $\mathbb{R}^3$ lying above the xy-plane, inside the cylinder $x^2 + y^2 = 1$, and below the plane $x + y + z = 2$. Find the volume of $W$.
1.7. Let $W$ be the region in $\mathbb{R}^3$ lying above the xy-plane, inside the cylinder $x^2 + y^2 = 1$, and below the plane $x + y + z = 1$. Find the volume of $W$.
1.8. In this exercise, we evaluate the improper one-variable integral $\int_{-\infty}^{\infty} e^{-x^2}\,dx$ by following the unlikely strategy of replacing it with a more tractable improper double integral. Let $a$ be a positive real number.
(b) Let $D_a$ be the disk $x^2 + y^2 \le a^2$. Use polar coordinates to evaluate $\iint_{D_a} e^{-x^2 - y^2}\,dx\,dy$.
(c) Note that, as $a$ goes to $\infty$, both $R_a$ and $D_a$ fill out all of $\mathbb{R}^2$. It is true that both $\lim_{a \to \infty} \iint_{R_a} e^{-x^2 - y^2}\,dx\,dy$ and $\lim_{a \to \infty} \iint_{D_a} e^{-x^2 - y^2}\,dx\,dy$ exist and that they are equal. Their common value is the improper integral $\iint_{\mathbb{R}^2} e^{-x^2 - y^2}\,dx\,dy$. Use this information along
3.1. Let $D$ be the set of points $(x, y)$ in $\mathbb{R}^2$ such that $(x - 3)^2 + (y - 2)^2 \le 4$.
3.3. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear change of variables $T(u, v) = (3u - v, 2u + v)$ of Example 7.6.
(a) Show that the only point $(u, v)$ that satisfies $T(u, v) = 0$ is $(u, v) = 0$.
(b) Show that $T$ is one-to-one on $\mathbb{R}^2$. (Hint: Start by assuming that $T(a) = T(b)$, and use part (a) and the linearity of $T$.)
3.4. Find $\iint_D (2x + 4y)\,dx\,dy$ if:
(a) D is the parallelogram with vertices (0, 0), (3, 1), (5, 5), and (2, 4),
(b) D is the triangular region with vertices (3, 1), (5, 5), and (2, 4). (Hint: Take advantage
of the work you’ve already done in part (a). You should be able to use quite a bit of it.)
3.5. Find $\iint_D (x + y)\,e^{x^2 - y^2}\,dx\,dy$, where $D$ is the rectangle with vertices $(0, 0)$, $(1, -1)$, $(3, 1)$, and $(2, 2)$.
3.6. Find $\iint_D \sqrt{12 + x^2 + 3y^2}\,dx\,dy$, where $D$ is the region in $\mathbb{R}^2$ described by $x^2 + 3y^2 \le 12$ and $0 \le y \le \frac{1}{\sqrt{3}}x$.
Figure 7.15: The region bounded by $y = x$, $y = 4x$, $xy = 1$, and $xy = 2$ (left) and the region $x^2 - 4xy + 8y^2 \le 4$ (right)
3.8. Let $D$ be the region in $\mathbb{R}^2$ described by $x^2 - 4xy + 8y^2 \le 4$. See Figure 7.15, right. Find $\iint_D y^2\,dx\,dy$. (Hint: $x^2 - 4xy + 8y^2 = (x - 2y)^2 + 4y^2$.)
3.9. Find $\iint_D \frac{(x - y)^2}{(x + y)^2}\,dx\,dy$, where $D$ is the square with vertices $(2, 0)$, $(4, 2)$, $(2, 4)$, and $(0, 2)$.
3.10. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a one-to-one linear transformation from the uv-plane to the xy-plane represented by a matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, i.e., $(x, y) = T(u, v) = A \cdot \begin{bmatrix} u \\ v \end{bmatrix}$. Let $D^*$ and $D$ be bounded regions of $\mathbb{R}^2$ of positive area such that $T$ maps $D^*$ to $D$, as shown in Figure 7.16.
(a) Let $k = |\det A|$. Use the change of variables theorem to show that:
\[ \operatorname{Area}(D) = k \cdot \operatorname{Area}(D^*). \]
(b) Let $(\bar{u}, \bar{v})$ and $(\bar{x}, \bar{y})$ denote the centroids of $D^*$ and $D$, respectively. Show that:
3.11. A region D in the xy-plane is called symmetric in the line y = x if, whenever a point
(x, y) is in D, so is the point (y, x). If D is such a region, prove that its centroid lies on the
line y = x.
3.12. Let $D$ be a subset of $\mathbb{R}^2$ that is symmetric about the origin, that is, whenever $(x, y)$ is in $D$, so is $(-x, -y)$.
4.4. A class of enthusiastic multivariable calculus students celebrates the change of variables theo-
rem by drilling a cylindrical hole of radius 1 straight through the center of the earth. Assuming
that the earth is a 3-dimensional closed ball of radius 2, find the volume of the portion of the
earth that remains.
4.5. Find the centroid of the half-ball of radius $a$ in $\mathbb{R}^3$ that lies inside the sphere $x^2 + y^2 + z^2 = a^2$ and above the xy-plane.
4.6. Let $W$ be the wedge-shaped solid in $\mathbb{R}^3$ that lies under the plane $z = y$, inside the cylinder $x^2 + y^2 = 1$, and above the xy-plane. Find the centroid of $W$.
4.7. Let $W$ be the region of $\mathbb{R}^3$ that lies inside the three cylinders $x^2 + z^2 = 1$, $y^2 + z^2 = 1$, $x^2 + y^2 = 1$, and above the xy-plane. Find the volume of $W$. (Hints: The integral $\int \frac{1}{\cos^2\theta}\,d\theta = \int \sec^2\theta\,d\theta = \tan\theta + C$ might be helpful. Also, $\sin^3\theta = \sin^2\theta \cdot \sin\theta = (1 - \cos^2\theta)\sin\theta$.)
4.8. Within the solid ball $x^2 + y^2 + z^2 \le a^2$ of radius $a$ in $\mathbb{R}^3$, find the average distance to the origin.
4.10. If $a > 0$, let $W_n(a)$ denote the closed ball $x_1^2 + x_2^2 + \dots + x_n^2 \le a^2$ in $\mathbb{R}^n$ of radius $a$ centered at the origin, and let $\operatorname{Vol}(W_n(a))$ denote its n-dimensional volume. For instance, we showed in Example 7.11 that $\operatorname{Vol}(W_4(1)) = \frac{\pi^2}{2}$.
(a) Use the change of variables theorem to show that $\operatorname{Vol}(W_n(a)) = a^n \operatorname{Vol}(W_n(1))$.
(b) By thinking of $\operatorname{Vol}(W_n(1))$ as a double integral of an $(n - 2)$-fold integral, find a relationship between $\operatorname{Vol}(W_n(1))$ and $\operatorname{Vol}(W_{n-2}(1))$.
(c) Verify that your answer to part (b) predicts a correct relationship between $\operatorname{Vol}(W_4(1))$ and $\operatorname{Vol}(W_2(1))$.
(d) Find $\operatorname{Vol}(W_6(1))$, the volume of the 6-dimensional closed unit ball.
Part V
Chapter 8
Vector fields
Now that we know what it means to differentiate vector-valued functions of more than one variable,
we turn to integrating them. The functions that we integrate, however, are of a special type. The
integrals we consider are rather specialized, too. The functions are called vector fields, and this
short chapter is devoted to introducing them. The integration begins in the next chapter.
A distinctive feature of vector fields is that both the elements $x$ of the domain and their values $F(x)$ are in $\mathbb{R}^n$, as opposed to a function in general from $\mathbb{R}^n$ to $\mathbb{R}^m$.
As the figure shows, the sample vectors often end up running into each other, making the picture
a little muddled. As a result, we usually depict a vector field by drawing a constant positive scalar
multiple of the vectors F(x), where the scalar factor is chosen to improve the readability of the
picture. For the vector field in the last example, this is done on the right of the figure. The lengths
of the vectors are scaled down, but the arrows still point in the correct direction.
198 CHAPTER 8. VECTOR FIELDS
Figure 8.1: The constant vector field F(x, y) = (1, 0) = i literally (left) and scaled down (right)
Figure 8.2: The vector field F(x, y) = (−y, x) at the points (1, 0) and (0, 1)
We can get some qualitative information about how F acts by noting, for example, that in the
first quadrant, where both x and y are positive, F(x, y) = (−y, x) has a negative first component
and positive second component, i.e., F(+, +) = (−, +), so the arrow points to the left and up.
Similarly, in the second quadrant F(−, +) = (−, −) (to the left and down), in the third quadrant
F(−, −) = (+, −) (to the right and down), and in the fourth F(+, −) = (+, +) (to the right and
up).
In fact, we can identify both the length and direction of F(x, y) more precisely:
8.1. EXAMPLES OF VECTOR FIELDS 199
• $\|F(x, y)\| = \sqrt{(-y)^2 + x^2} = \|(x, y)\|$,
• $F(x, y) \cdot (x, y) = (-y, x) \cdot (x, y) = -yx + xy = 0$.
In other words, the length of F(x, y) equals the distance of (x, y) from the origin, and its direction
is orthogonal to (x, y) in the counterclockwise direction. After drawing in a sample of arrows, one
can imagine F as describing a circular counterclockwise flow, or vortex, around the origin, growing
in magnitude as one moves further out, as in Figure 8.3.
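The two properties above, together with the counterclockwise sense of rotation, are easy to test at sample points. The following sketch is illustrative only: `check_point` is a hypothetical helper, and the sign of the planar cross product $x F_2 - y F_1$ is used here as the test for "counterclockwise from $(x, y)$".

```python
import math

def F(x, y):
    """The circulating vector field F(x, y) = (-y, x)."""
    return (-y, x)

def check_point(x, y):
    """Return three booleans for a point (x, y) other than the origin:
    whether |F| equals |(x, y)|, whether F is orthogonal to (x, y), and
    whether F is a counterclockwise rotation of (x, y)."""
    fx, fy = F(x, y)
    same_length = math.isclose(math.hypot(fx, fy), math.hypot(x, y))
    orthogonal = math.isclose(fx * x + fy * y, 0.0, abs_tol=1e-12)
    counterclockwise = x * fy - y * fx > 0   # z-component of (x, y) x F
    return same_length, orthogonal, counterclockwise

for point in [(1.0, 0.0), (0.0, 1.0), (-2.0, 3.0), (0.5, -4.0)]:
    print(point, F(*point), check_point(*point))
```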
Example 8.3. Let $W : \mathbb{R}^2 - \{(0, 0)\} \to \mathbb{R}^2$ be given by $W(x, y) = \left( -\frac{y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right)$, where $(x, y) \ne (0, 0)$.
This is a nonconstant positive scalar multiple of the previous example. The scalar factor is $\frac{1}{x^2 + y^2}$. Thus the vectors point in the same direction as before, but now:
\[ \|W(x, y)\| = \frac{1}{x^2 + y^2}\,\|(x, y)\| = \frac{\|(x, y)\|}{\|(x, y)\|^2} = \frac{1}{\|(x, y)\|}. \]
Hence the vector field also circulates about the origin, but its length is inversely proportional to the
distance from the origin. See Figure 8.4. We shall see later that this vector field has some notable
features.
Figure 8.4: The vector field $W(x, y) = \left( -\frac{y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right)$
Example 8.4 (The inverse square field). An important vector field G from classical physics is characterized by the following properties.
• The length of G is inversely proportional to the square of the distance to the origin. In other words, $\|G(x, y, z)\| = \frac{c}{\|(x, y, z)\|^2}$ for some positive constant $c$.
• The direction of $G(x, y, z)$ is from $(x, y, z)$ to the origin. That is, $G(x, y, z)$ is a negative scalar multiple of $(x, y, z)$.
See Figure 8.5. For instance, in physics, the gravitational force due to a mass at the origin and the
electrostatic force due to a charged particle at the origin are both modeled by such a field.
To find a formula for G, we write $G(x, y, z) = \frac{c}{\|(x, y, z)\|^2}\,u$, where $u$ is the unit vector in the direction from $(x, y, z)$ to $(0, 0, 0)$, that is, $u = -\frac{(x, y, z)}{\|(x, y, z)\|}$. Hence $G(x, y, z) = -\frac{c}{\|(x, y, z)\|^2} \cdot \frac{(x, y, z)}{\|(x, y, z)\|} = -\frac{c}{\|(x, y, z)\|^3}\,(x, y, z)$.
This is called an inverse square field. To summarize, it is the vector field $G : \mathbb{R}^3 - \{(0, 0, 0)\} \to \mathbb{R}^3$ given variously by:
\[
\begin{aligned}
G(x, y, z) &= -\frac{c}{\|(x, y, z)\|^3}\,(x, y, z) \\
&= -\frac{c}{(x^2 + y^2 + z^2)^{3/2}}\,(x, y, z) \\
&= \left( -\frac{cx}{(x^2 + y^2 + z^2)^{3/2}}, -\frac{cy}{(x^2 + y^2 + z^2)^{3/2}}, -\frac{cz}{(x^2 + y^2 + z^2)^{3/2}} \right),
\end{aligned}
\]
for some positive constant $c$.
In the future, we take the constant of proportionality to be c = 1 and refer to the resulting
vector field G as “the” inverse square field.
These examples shall often serve as test cases for the concepts we are about to develop.
1.7. (a) Find a smooth vector field F on $\mathbb{R}^2$ such that, at each point $(x, y)$, $F(x, y)$ is a unit vector normal to the parabola of the form $y = x^2 + c$ that passes through that point.
(b) Find a smooth vector field F on $\mathbb{R}^2$ such that, at each point $(x, y)$, $F(x, y)$ is a unit vector tangent to the parabola of the form $y = x^2 + c$ that passes through that point.
1.8. (a) Find a smooth vector field F on $\mathbb{R}^2$ such that $\|F(x, y)\| = \|(x, y)\|$ and, at each point $(x, y)$ other than the origin, $F(x, y)$ is a vector normal to the curve of the form $xy = c$ that passes through that point.
(b) Find a smooth vector field F on $\mathbb{R}^2$ such that $\|F(x, y)\| = \|(x, y)\|$ and, at each point $(x, y)$ other than the origin, $F(x, y)$ is a vector tangent to the curve of the form $xy = c$ that passes through that point.
Then $\alpha'(t) = (-a\sin t, a\cos t)$, and $F(\alpha(t)) = F(a\cos t, a\sin t) = (-a\sin t, a\cos t)$. Since the two are equal, $\alpha$ is an integral path of F. The integral curves are circles centered at the origin. This reinforces our sense that F describes counterclockwise circular flow around the origin.
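The defining condition $\alpha'(t) = F(\alpha(t))$ can be spot-checked numerically. The sketch below assumes the path $\alpha(t) = (a\cos t, a\sin t)$ used in the computation above, with an arbitrary sample radius, and estimates $\alpha'(t)$ by a central difference. The helper names are hypothetical.

```python
import math

def F(x, y):
    """The circulating vector field F(x, y) = (-y, x)."""
    return (-y, x)

def alpha(t, a=2.0):
    """Circle of radius a (sample value) traversed counterclockwise."""
    return (a * math.cos(t), a * math.sin(t))

def velocity(path, t, h=1e-5):
    """Central-difference estimate of path'(t)."""
    x1, y1 = path(t + h)
    x0, y0 = path(t - h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

def max_mismatch(path, field, times):
    """Largest componentwise gap between path'(t) and field(path(t))."""
    worst = 0.0
    for t in times:
        vx, vy = velocity(path, t)
        fx, fy = field(*path(t))
        worst = max(worst, abs(vx - fx), abs(vy - fy))
    return worst

print(max_mismatch(alpha, F, [0.0, 0.5, 1.0, 2.0, 3.0]))  # should be tiny
```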
1.9. Let $F(x, y) = (-2y, x)$. (This vector field also appears in Exercise 1.4.)
(a) If $a$ is a real number, show that $\alpha(t) = \left( \sqrt{2}\,a\cos(\sqrt{2}\,t), a\sin(\sqrt{2}\,t) \right)$ is an integral path of F.
(b) Show that the integral curves satisfy the equation $\frac{x^2}{2} + y^2 = a^2$, and sketch a representative sample of them. Indicate the direction in which the curves are traversed by the integral paths.
1.10. Let $F(x, y) = (y, x)$. (This vector field also appears in Exercise 1.5.)
(a) If $a$ is a real number, show that $\alpha(t) = \left( a(e^t + e^{-t}), a(e^t - e^{-t}) \right)$ and $\alpha(t) = \left( a(e^t - e^{-t}), a(e^t + e^{-t}) \right)$ are integral paths of F.
(b) Describe the integral curves, and sketch a representative sample of them. Indicate the direction in which the curves are traversed by the integral paths. (Hint: To find a relationship between $x$ and $y$ on the integral paths, compute $x(t)^2$ and $y(t)^2$.)
1.11. Find integral paths for the vector field $W(x, y) = \left( -\frac{y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right)$, where $(x, y) \ne (0, 0)$, from Example 8.3 by modifying the integral paths for $F(x, y) = (-y, x)$ given in equation (8.1).
Finding explicit formulas for the integral paths of a vector field may be difficult in practice. To simplify the discussion, we restrict ourselves to two-dimensional vector fields $F(x, y) = (F_1(x, y), F_2(x, y))$ defined on open sets of $\mathbb{R}^2$. If $\alpha(t) = (x(t), y(t))$ is a path, then the condition $\alpha'(t) = F(\alpha(t))$ that it must satisfy to be an integral path can be written in terms of coordinates
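as $\left( \frac{dx}{dt}, \frac{dy}{dt} \right) = \left( F_1(x(t), y(t)), F_2(x(t), y(t)) \right)$. In other words, the components $x(t)$ and $y(t)$ of $\alpha(t)$ must satisfy the differential equations:
\[
\begin{cases}
\dfrac{dx}{dt} = F_1(x, y) \\[2ex]
\dfrac{dy}{dt} = F_2(x, y).
\end{cases}
\qquad (8.2)
\]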
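When explicit formulas are out of reach, the system (8.2) can still be solved approximately. As a hedged illustration, the sketch below applies the classical fourth-order Runge-Kutta method (a standard numerical technique, not something developed in this text) to the circulating field $F(x, y) = (-y, x)$ starting at $(1, 0)$. After time $\frac{\pi}{2}$ the approximate path should arrive near $(0, 1)$, a quarter of the way around the unit circle. The function names and step count are illustrative choices.

```python
import math

def rk4_path(F, x0, y0, t_end, steps=1000):
    """Approximate the integral path of F with alpha(0) = (x0, y0) by
    classical RK4, returning the approximate position alpha(t_end)."""
    h = t_end / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = F(x, y)
        k2 = F(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = F(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = F(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

def circulating(x, y):
    """F(x, y) = (-y, x), whose integral curves are circles about the origin."""
    return (-y, x)

x_end, y_end = rk4_path(circulating, 1.0, 0.0, math.pi / 2)
print(x_end, y_end)  # expected to be close to (0, 1)
```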
Let (x0 , y0 ) be a point of R2 . For the vector fields in Exercises 1.12–1.13, (a) solve the differential
equations (8.2) to find the integral path α(t) = (x(t), y(t)) of F that satisfies the condition α(0) =
(x0 , y0 ) and (b) describe the integral curves of F geometrically and sketch a representative sample
of them, including the direction in which they are traversed by the integral paths. (Note that these
vector fields also appear in Exercises 1.1 and 1.6, respectively.)
Line integrals
We begin integrating vector fields in the case that the domain of integration is a curve. The integrals are called line integrals. By comparison, we learned in Section 2.3 about the integral over a curve $C$ of a real-valued function $f$. By definition, $\int_C f\,ds = \int_a^b f(\alpha(t))\,\|\alpha'(t)\|\,dt$, where $\alpha : [a, b] \to \mathbb{R}^n$
204 CHAPTER 9. LINE INTEGRALS
Figure 9.1: The component Ftan of a vector field in the tangent direction
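if $T = \frac{\alpha'}{\|\alpha'\|}$ is the unit tangent vector, then:
By the definition of the integral with respect to arclength with $f = F_{\tan}$, we obtain:
\[
\begin{aligned}
\int_C F_{\tan}\,ds = \int_C F \cdot T\,ds &= \int_a^b F(\alpha(t)) \cdot T(t)\,\|\alpha'(t)\|\,dt \\
&= \int_a^b F(\alpha(t)) \cdot \frac{\alpha'(t)}{\|\alpha'(t)\|}\,\|\alpha'(t)\|\,dt \\
&= \int_a^b F(\alpha(t)) \cdot \alpha'(t)\,dt.
\end{aligned}
\]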
The integrand F1 dx1 + F2 dx2 + · · · + Fn dxn is called a differential form, or, more precisely,
a differential 1-form. This notation is very convenient, because it is easy to remember how to
get from (9.2) to (9.1) in order to evaluate the integral. One simply uses the parametrization to
substitute for x1 , x2 , . . . , xn in terms of the parameter t wherever they appear in the integrand.
This includes the substitution $dx_i = \frac{dx_i}{dt}\,dt$ for $i = 1, 2, \dots, n$. We use differential form notation for
One final word about notation: we have chosen to write the line integral as an integral over
α rather than over the underlying curve C to acknowledge that the definition makes use of the
specific parametrization α. This is an important point, and we elaborate on it momentarily.
Just as in first-year calculus a function can be integrated over many intervals, so can a vector
field be integrated over many paths. The relation between the vector field and the path is reflected
in the value of the integral.
Example 9.1. Let F be the vector field on R2 given by F(x, y) = (−y, x). This is the vector field
from Chapter 8 that circulates around the origin, shown again in Figure 9.2.
Figure 9.3: The vector field F(x, y) = (−y, x) along the unit circle (left), the line y = x (middle),
and the line x + y = 1 (right)
(a) The path α1 traces out the unit circle counterclockwise. The direction of the vector field and
the direction of motion are aligned at every point. See Figure 9.3 at left. As a result, F · T is
always positive, and we expect the integral to be positive as well. In fact, for this parametrization,
(b) The path α2 is a line segment radiating out from (0, 0) to (3, 3) along the line y = x. At every
point of this path, the vector field is orthogonal to the direction of motion (Figure 9.3, middle). In
other words, $F \cdot T = 0$ at every point, so we would expect the integral to be zero, too. To confirm this:
\[ \int_{\alpha_2} -y\,dx + x\,dy = \int_0^3 \left( -t \cdot 1 + t \cdot 1 \right) dt = \int_0^3 0\,dt = 0. \]
(c) This again moves along the unit circle, but only along the arc from $(1, 0)$ to $(0, 1)$. Using the work from part (a):
\[ \int_{\alpha_3} -y\,dx + x\,dy = \int_0^{\pi/2} 1\,dt = t\,\Big|_0^{\pi/2} = \frac{\pi}{2}. \]
(d) The path α4 also goes from (1, 0) to (0, 1), but here the parametrization satisfies x + y =
(1 − t) + t = 1 so the curve is a segment contained in the line x + y = 1. The vector field is neither
perfectly aligned with the direction of motion nor perpendicular to it, but it does have a positive
component in the direction of motion at every point (Figure 9.3, right). Thus we expect the line
integral to be positive. Indeed:
\[ \int_{\alpha_4} -y\,dx + x\,dy = \int_0^1 \left( -t \cdot (-1) + (1 - t) \cdot 1 \right) dt = \int_0^1 1\,dt = 1. \]
(e) $\alpha_5$ is another path from $(1, 0)$ to $(0, 1)$, and, again, $x + y = \cos^2 t + \sin^2 t = 1$. Thus the path traces out the same line segment as in part (d), though in a different way. The value of the line integral is:
\[
\begin{aligned}
\int_{\alpha_5} -y\,dx + x\,dy &= \int_0^{\pi/2} \left( -\sin^2 t \cdot 2\cos t \cdot (-\sin t) + \cos^2 t \cdot 2\sin t\cos t \right) dt \\
&= \int_0^{\pi/2} \left( 2\sin^3 t\cos t + 2\cos^3 t\sin t \right) dt \\
&= \int_0^{\pi/2} 2\sin t\cos t\,(\sin^2 t + \cos^2 t)\,dt \\
&= \int_0^{\pi/2} 2\sin t\cos t\,dt \\
&= \sin^2 t\,\Big|_0^{\pi/2} \\
&= 1.
\end{aligned}
\]
9.1. DEFINITIONS AND EXAMPLES 207
Note that each of the paths α3 , α4 , and α5 above goes from (1, 0) to (0, 1). The values of the
line integral are not all equal, however. Thus changing the path between the endpoints can change
the integral. On the other hand, α4 and α5 not only go between the same two points but also trace
out the same curve, just parametrized in different ways. The integrals over these two paths are
equal. Is this a coincidence?
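The three values can be reproduced numerically, which makes the comparison concrete. In the sketch below, the parametrizations are inferred from the computations in parts (c), (d), and (e): $\alpha_3(t) = (\cos t, \sin t)$ and $\alpha_5(t) = (\cos^2 t, \sin^2 t)$ on $[0, \frac{\pi}{2}]$, and $\alpha_4(t) = (1 - t, t)$ on $[0, 1]$. The function `line_integral` is a hypothetical helper that applies the midpoint rule to $F(\alpha(t)) \cdot \alpha'(t)$, with $\alpha'(t)$ estimated by a central difference.

```python
import math

def line_integral(F, alpha, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of
    F(alpha(t)) . alpha'(t) dt for a planar path alpha."""
    h = (b - a) / n
    eps = 1e-6  # step for the central-difference estimate of alpha'(t)
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x1, y1 = alpha(t + eps)
        x0, y0 = alpha(t - eps)
        dx, dy = (x1 - x0) / (2 * eps), (y1 - y0) / (2 * eps)
        fx, fy = F(*alpha(t))
        total += (fx * dx + fy * dy) * h
    return total

def F(x, y):
    return (-y, x)

def alpha3(t):  # quarter of the unit circle from (1, 0) to (0, 1)
    return (math.cos(t), math.sin(t))

def alpha4(t):  # segment on x + y = 1, constant speed
    return (1.0 - t, t)

def alpha5(t):  # same segment, different parametrization
    return (math.cos(t) ** 2, math.sin(t) ** 2)

I3 = line_integral(F, alpha3, 0.0, math.pi / 2)
I4 = line_integral(F, alpha4, 0.0, 1.0)
I5 = line_integral(F, alpha5, 0.0, math.pi / 2)
print(I3, I4, I5)  # compare with pi/2, 1, and 1
```

The paths $\alpha_4$ and $\alpha_5$ give matching values, while $\alpha_3$, which follows a different curve, does not.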
In general, suppose that α : [a, b] → U and β : [c, d] → U are smooth parametrizations of the
same curve C. To help distinguish between them, we use t to denote the parameter of α and u for
the parameter of β. It simplifies the argument somewhat to assume that α is one-to-one, except
possibly at the endpoints, where α(a) = α(b) is allowed, and likewise for β. This is not strictly
necessary, though it has been true of all the examples of parametrizations we have encountered
thus far.
We want to compare the integrals $\int_\alpha F \cdot ds = \int_a^b F(\alpha(t)) \cdot \alpha'(t)\,dt$ and $\int_\beta F \cdot ds = \int_c^d F(\beta(u)) \cdot \beta'(u)\,du$ obtained using the respective parametrizations. Consider first the case that $\alpha$ and $\beta$ trace out $C$ in the same direction, that is, both start at a point $p = \alpha(a) = \beta(c)$ and end at a point $q = \alpha(b) = \beta(d)$, as in Figure 9.4.
Let x be a point of C other than the endpoints. Then x = α(t) for some value of t and x = β(u)
for some value of u. We write this as t = g(u), that is, g(u) is the value of t such that α(t) = β(u).
We also set g(c) = a and g(d) = b. Thus:
β(u) = α(g(u)) for all u in [c, d]. (9.3)
For instance, in the case of the line segments $\alpha_4$ and $\alpha_5$ above, $\alpha_5(u) = (\cos^2 u, \sin^2 u) = (1 - \sin^2 u, \sin^2 u) = \alpha_4(\sin^2 u)$, so $g(u) = \sin^2 u$.
Figure 9.4: Two parametrizations α and β that trace out the same curve in the same direction:
g(c) = a and g(d) = b
In other words, both parametrizations give the same value of the line integral.
If the parametrizations traverse C in opposite directions so that, say p = α(a) = β(d) and
q = α(b) = β(c), then the only difference is that the endpoints are reversed. That is, g(d) = a and
g(c) = b, so:
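\[
\begin{aligned}
\int_\beta F \cdot ds &= \int_{g(c)}^{g(d)} F(\alpha(t)) \cdot \alpha'(t)\,dt \qquad \text{(just as before, but \dots)} \\
&= \int_b^a F(\alpha(t)) \cdot \alpha'(t)\,dt \\
&= -\int_a^b F(\alpha(t)) \cdot \alpha'(t)\,dt \qquad (9.5) \\
&= -\int_\alpha F \cdot ds.
\end{aligned}
\]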
where the choice of sign depends on whether they traverse C in the same or opposite direction.
This allows us to define a line integral over a curve, as opposed to over a parametrization.
Definition. An oriented curve is a curve $C$ with a specified direction of traversal. Given such a $C$, define:
\[ \int_C F \cdot ds = \int_\alpha F \cdot ds, \]
where α is any smooth parametrization that traverses C in the given direction. By the preceding
proposition, any two such parametrizations give the same answer.
The point is that the line integral depends only on the underlying curve and the direction
in which it is traversed. This is critical in that it allows us to think of the line integral as an
integral over an oriented geometric set, independent of the parametrization that is used to trace
it out. This is an issue that we overlooked previously when we discussed integrals over curves
with respect to arclength and integrals over surfaces with respect to surface area. In both of those
cases, the integrals were defined using parametrizations, but we did not check that the answers
were independent of the particular parametrization used to trace out the curve or surface.
In the case of the integral with respect to arclength, the story is that $\int_\alpha f\,ds = \int_\beta f\,ds$ for any smooth parametrizations $\alpha$ and $\beta$ of $C$. The new twist is that the integral is not only independent of the parametrization but also of the direction in which it traverses $C$. The proof proceeds as before, but a difference arises because the definition $\int_\alpha f\,ds = \int_a^b f(\alpha(t))\,\|\alpha'(t)\|\,dt$ involves the norm $\|\alpha'(t)\|$, not $\alpha'(t)$ itself. This introduces a factor of $|g'(u)|$ after applying the chain rule as in equation (9.4) and taking norms. If $\alpha$ and $\beta$ traverse $C$ in opposite directions, then $g'(u) < 0$, so $|g'(u)| = -g'(u)$. This minus sign is counteracted by the minus sign that appears in this case
when reversing the limits of integration in (9.5), so the two minus signs cancel. We leave the details
for the exercises (Exercise 1.12). The role of parametrizations in surface integrals is discussed in
Chapter 10.
Having cleared the air on that lingering matter, we now return to studying line integrals.
Example 9.3. Evaluate $\int_C x\,dx + 2y\,dy + 3z\,dz$ if $C$ is the oriented curve consisting of three quarter-circular arcs on the unit sphere in $\mathbb{R}^3$ from $(1, 0, 0)$ to $(0, 1, 0)$ to $(0, 0, 1)$ and then back to $(1, 0, 0)$. See Figure 9.5.
The vector field being integrated here is given by $F(x, y, z) = (x, 2y, 3z)$. Since $C$ consists of three arcs $C_1, C_2, C_3$, we can imagine traversing each of them in succession and breaking up the integral as $\int_C F \cdot ds = \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds + \int_{C_3} F \cdot ds$. Moreover, we may parametrize each arc however we like as long as it is traversed in the proper direction. In this way, we can avoid having to parametrize all of $C$ over a single parameter interval.
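For $C_1$, we use the usual parametrization of the unit circle in the xy-plane from $(1, 0, 0)$ to $(0, 1, 0)$: $\alpha_1(t) = (\cos t, \sin t, 0)$, $0 \le t \le \frac{\pi}{2}$.
\[
\begin{aligned}
\int_{C_1} x\,dx + 2y\,dy + 3z\,dz &= \int_0^{\pi/2} \left( \cos t \cdot (-\sin t) + 2\sin t \cdot \cos t + 3 \cdot 0 \cdot 0 \right) dt \\
&= \int_0^{\pi/2} \sin t\cos t\,dt = \frac{1}{2}\sin^2 t\,\Big|_0^{\pi/2} = \frac{1}{2}.
\end{aligned}
\]
For $C_2$, we use a similar parametrization, only now in the yz-plane from $(0, 1, 0)$ to $(0, 0, 1)$: $\alpha_2(t) = (0, \cos t, \sin t)$, $0 \le t \le \frac{\pi}{2}$.
\[
\begin{aligned}
\int_{C_2} x\,dx + 2y\,dy + 3z\,dz &= \int_0^{\pi/2} \left( 0 \cdot 0 + 2\cos t \cdot (-\sin t) + 3\sin t \cdot \cos t \right) dt \\
&= \int_0^{\pi/2} \sin t\cos t\,dt = \frac{1}{2}\sin^2 t\,\Big|_0^{\pi/2} = \frac{1}{2}.
\end{aligned}
\]
Lastly for $C_3$ we travel in the xz-plane from $(0, 0, 1)$ to $(1, 0, 0)$: $\alpha_3(t) = (\sin t, 0, \cos t)$, $0 \le t \le \frac{\pi}{2}$.
\[
\begin{aligned}
\int_{C_3} x\,dx + 2y\,dy + 3z\,dz &= \int_0^{\pi/2} \left( \sin t \cdot \cos t + 2 \cdot 0 \cdot 0 + 3\cos t \cdot (-\sin t) \right) dt \\
&= \int_0^{\pi/2} -2\sin t\cos t\,dt = -\sin^2 t\,\Big|_0^{\pi/2} = -1.
\end{aligned}
\]
Adding these three calculations gives $\int_C x\,dx + 2y\,dy + 3z\,dz = \frac{1}{2} + \frac{1}{2} - 1 = 0$.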
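The arc-by-arc bookkeeping in Example 9.3 translates directly into a short computation. The sketch below is illustrative: it uses the three parametrizations $\alpha_1, \alpha_2, \alpha_3$ from the example together with their derivatives written out by hand, and a hypothetical helper `line_integral` based on the midpoint rule.

```python
import math

def F(x, y, z):
    return (x, 2 * y, 3 * z)

def line_integral(field, path, velocity, a, b, n=10000):
    """Midpoint-rule approximation of the integral over [a, b] of
    field(path(t)) . velocity(t) dt, where velocity is path'."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sum(f * v for f, v in zip(field(*path(t)), velocity(t))) * h
    return total

# (path, derivative) pairs for C1, C2, C3, each on 0 <= t <= pi/2.
ARCS = [
    (lambda t: (math.cos(t), math.sin(t), 0.0),
     lambda t: (-math.sin(t), math.cos(t), 0.0)),
    (lambda t: (0.0, math.cos(t), math.sin(t)),
     lambda t: (0.0, -math.sin(t), math.cos(t))),
    (lambda t: (math.sin(t), 0.0, math.cos(t)),
     lambda t: (math.cos(t), 0.0, -math.sin(t))),
]

pieces = [line_integral(F, path, vel, 0.0, math.pi / 2) for path, vel in ARCS]
total = sum(pieces)
print(pieces, total)  # compare with 1/2, 1/2, -1 and a total of 0
```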
This example illustrates a general circumstance that we should make explicit. In addition to
integrating over smooth oriented curves, we can integrate over curves C that are concatenations of
smooth oriented curves, that is, curves of the form C = C1 ∪ C2 ∪ · · · ∪ Ck , where C1 , C2 , . . . , Ck are
traversed in succession and each of them has a smooth parametrization. Such a curve C is called
piecewise smooth. Under these conditions, we define:
\[ \int_C F \cdot ds = \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds + \dots + \int_{C_k} F \cdot ds. \]
This allows us to deal with curves that have corners, as in the example, without worrying about
whether the entire curve can be parametrized in a smooth way.
\[ \alpha^*(dx_i) = \frac{dx_i}{dt}\,dt. \]
Actually, the $x_i$'s on each side of this equation mean slightly different things (coordinate label vs. function of $t$) and it would be better to write $\alpha^*(dx_i) = \frac{d\alpha_i}{dt}\,dt$, but hopefully our cavalier
in terms of $t$. If we denote $F_1\,dx_1 + F_2\,dx_2 + \dots + F_n\,dx_n$ by $\omega$, then the definition of the line integral reads:
\[ \int_{\alpha(I)} \omega = \int_I \alpha^*(\omega). \qquad (9.6) \]
This looks a lot like the change of variables theorem. For instance, compare with equation (7.3) of
Chapter 7. We verified that the value of the integral is independent of the parametrization α so
9.3. CONSERVATIVE FIELDS 211
that it is well-defined as an integral over the oriented curve C = α(I). The chain rule played an
essential, though easily missed, role in the verification. See equation (9.4).
This approach recurs more generally. In order to define a new type of integral, we use substitu-
tion to pull the integral back to a familiar situation. This is done in a way that allows the change
of variables theorem and the chain rule to conspire to make the process legitimate.
that traces out $C$ with the proper orientation. Now, suppose (and this is a big assumption) that F is the gradient of a smooth real-valued function $f$, that is, $F = \nabla f$, or, in terms of components, $(F_1, F_2, \dots, F_n) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n} \right)$. Then $F(\alpha(t)) \cdot \alpha'(t) = \nabla f(\alpha(t)) \cdot \alpha'(t) = (f \circ \alpha)'(t)$, where the last equality is the Little Chain Rule. In other words, the integrand in the definition of the line integral is a derivative. So:
\[ \int_C F \cdot ds = \int_a^b (f \circ \alpha)'(t)\,dt = (f \circ \alpha)(t)\,\Big|_a^b = f(\alpha(b)) - f(\alpha(a)) = f(q) - f(p). \]
C a
This depends only on the endpoints p and q and not at all on the curve that connects them.
In order to apply the Little Chain Rule, this argument requires that C have a smooth parametri-
zation. If C is piecewise smooth instead, say C = C1 ∪ C2 ∪ · · · ∪ Ck , where each Ci is smooth and
C1 goes from p to p1 , C2 from p1 to p2 , and so on until Ck goes from pk−1 to q, then applying the
previous result to the pieces gives a telescoping sum in which the intermediate endpoints appear
twice with opposite signs:
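\[
\begin{aligned}
\int_C F \cdot ds &= \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds + \dots + \int_{C_k} F \cdot ds \\
&= \left( f(p_1) - f(p) \right) + \left( f(p_2) - f(p_1) \right) + \dots + \left( f(q) - f(p_{k-1}) \right) \\
&= -f(p) + f(q) \\
&= f(q) - f(p).
\end{aligned}
\]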
In other words, in the end, the same result holds, and again only p and q matter.
Definition. Let U be an open set in Rn . A vector field F : U → Rn is called a conservative, or
a gradient, field on U if there exists a smooth real-valued function f : U → R such that F = ∇f .
Such an f is called a potential function for F.
Figure 9.7: Examples of closed curves: an ellipse (left) and a figure eight (middle) in R2 and a knot
in R3 (right)
Corollary 9.5. If F is conservative on $U$, then for any piecewise smooth oriented closed curve $C$ in $U$:
\[ \int_C F \cdot ds = 0. \]
Example 9.6. Let F be the vector field on R3 given by F(x, y, z) = (x, 2y, 3z). Is F a conservative
field?
We look to see if there is a real-valued function f such that ∇f = F, that is, $\frac{\partial f}{\partial x} = x$, $\frac{\partial f}{\partial y} = 2y$,
and $\frac{\partial f}{\partial z} = 3z$. By inspection, it’s easy to see that $f(x, y, z) = \frac{1}{2}x^2 + y^2 + \frac{3}{2}z^2$ works. Thus F is
conservative with f as potential function.
It follows that $\int_C x\,dx + 2y\,dy + 3z\,dz = 0$ for any piecewise smooth oriented closed curve C in
R3 . This explains the answer of 0 that we obtained in Example 9.3 when we integrated F around
the closed curve consisting of three circular arcs (Figure 9.5).
Similarly, if C is any curve from (1, 0, 0) to (0, 1, 0), such as the arc C1 in the xy-plane of
Example 9.3, then by the conservative vector field theorem:
\[ \int_C x\,dx + 2y\,dy + 3z\,dz = f(0, 1, 0) - f(1, 0, 0) = (0 + 1 + 0) - \left(\tfrac{1}{2} + 0 + 0\right) = \tfrac{1}{2}. \]
Again, this agrees with what we got before. More importantly, it is a much simpler way to evaluate
the integral.
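As a numerical sanity check on Example 9.6, the following sketch approximates the line integral of F(x, y, z) = (x, 2y, 3z) along two different paths from (1, 0, 0) to (0, 1, 0) and compares both with f(q) − f(p). The helper `line_integral`, the midpoint rule, and the second path `bump` are illustrative assumptions and do not come from the text.

```python
import math

def line_integral(F, alpha, dalpha, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F . ds along alpha on [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        field = F(*alpha(t))
        velocity = dalpha(t)
        total += sum(Fi * vi for Fi, vi in zip(field, velocity)) * h
    return total

def F(x, y, z):
    return (x, 2 * y, 3 * z)

def f(x, y, z):
    # Potential function found by inspection in Example 9.6.
    return 0.5 * x**2 + y**2 + 1.5 * z**2

# Path 1: quarter circle in the xy-plane from (1, 0, 0) to (0, 1, 0).
def arc(t):
    return (math.cos(t), math.sin(t), 0.0)

def d_arc(t):
    return (-math.sin(t), math.cos(t), 0.0)

# Path 2 (hypothetical): same endpoints, but the path rises out of the xy-plane.
def bump(t):
    return (1 - t, t, math.sin(math.pi * t))

def d_bump(t):
    return (-1.0, 1.0, math.pi * math.cos(math.pi * t))

I_arc = line_integral(F, arc, d_arc, 0.0, math.pi / 2)
I_bump = line_integral(F, bump, d_bump, 0.0, 1.0)
exact = f(0, 1, 0) - f(1, 0, 0)
print(I_arc, I_bump, exact)
```

Both approximations should agree with f(0, 1, 0) − f(1, 0, 0) up to discretization error, which is what path independence predicts.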
9.3. CONSERVATIVE FIELDS 213
To satisfy the first condition, we might try f(x, y) = −xy, but then $\frac{\partial f}{\partial y} = -x$, not x, and there
doesn’t seem to be any way to fix it.
This shows that our first guess didn’t work, but it’s possible that we are simply bad guessers.
To show that (9.7) is truly inconsistent, we bring in the second-order partials. For if there were a
function f that satisfied (9.7), then:
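\[ \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}(-y) = -1 \quad\text{and}\quad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}(x) = 1. \]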
This contradicts the equality of mixed partials. Hence no such f exists.
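The same conclusion can be seen numerically through Corollary 9.5: a conservative field must integrate to 0 around every closed curve. The sketch below assumes the field under discussion is the circulating field F(x, y) = (−y, x), as the partials −y and x above suggest, and integrates it around the unit circle. The comparison field `gradient_like` and the helper name are hypothetical, chosen only for illustration.

```python
import math

def loop_integral(F, radius=1.0, n=20000):
    """Approximate the integral of F1 dx + F2 dy around a counterclockwise
    circle of the given radius centered at the origin (midpoint rule)."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t), radius * math.cos(t)
        F1, F2 = F(x, y)
        total += (F1 * dx + F2 * dy) * h
    return total

def circulating(x, y):
    return (-y, x)

def gradient_like(x, y):
    # Gradient of x^2/2 + y^2, a planar analogue of Example 9.6.
    return (x, 2 * y)

I_circ = loop_integral(circulating)
I_grad = loop_integral(gradient_like)
print(I_circ, I_grad)
```

The gradient field gives a value near 0, while the circulating field gives a value near 2π, so the latter cannot have a potential function on R2.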
Theorem 9.9 (Mixed partials theorem). Let F be a smooth vector field on an open set U in Rn ,
written in terms of components as F = (F1 , F2 , . . . , Fn ). If F is conservative, then:
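\[ \frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i} \quad \text{for all } i, j = 1, 2, \dots, n. \]
Equivalently, if $\frac{\partial F_i}{\partial x_j} \neq \frac{\partial F_j}{\partial x_i}$ for some i, j, then F is not conservative.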
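Proof. Assume that F is conservative, and let f be a potential function. Then F = ∇f, that is,
$(F_1, F_2, \dots, F_n) = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n}\right)$. Therefore $F_i = \frac{\partial f}{\partial x_i}$ and $F_j = \frac{\partial f}{\partial x_j}$, and so $\frac{\partial F_i}{\partial x_j} = \frac{\partial^2 f}{\partial x_j\,\partial x_i}$ and
$\frac{\partial F_j}{\partial x_i} = \frac{\partial^2 f}{\partial x_i\,\partial x_j}$. By the equality of mixed partials, these are equal.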
We look at what the mixed partials theorem says for R2 and R3. For a vector field F = (F1, F2)
on a subset of R2, there is just one mixed partials pair: $\frac{\partial F_1}{\partial y}$ and $\frac{\partial F_2}{\partial x}$. Thus:
In R2, if F = (F1, F2) is conservative, then $\frac{\partial F_1}{\partial y} = \frac{\partial F_2}{\partial x}$.
Next, for F = (F1, F2, F3) on a subset of R3, there are three mixed partials pairs: (i) $\frac{\partial F_1}{\partial y}$ and
This is called the curl of F. Note that its components are precisely the differences of the three
mixed partials pairs. Hence:
to get $\frac{\partial f}{\partial z} = 4x^2y^3z^3 + y^2$, we modify again to $f = x^2y^3z^4 + \frac{1}{2}x^2 + 3y + y^2z$. Unfortunately, this
spoils $\frac{\partial f}{\partial y}$. It seems unlikely that a potential function exists.
9.4. GREEN’S THEOREM 215
= (2y, 0, 0).
Taking into account the orientations, we write C = C1 ∪ C2 ∪ (−C3 ) ∪ (−C4 ), where the minus
signs indicate that the segment has been given the orientation opposite the direction in which it is
actually traversed in C. Then the line integral over C can be written as:
\[ \int_C F_1\,dx + F_2\,dy = \left(\int_{C_1} + \int_{C_2} - \int_{C_3} - \int_{C_4}\right) F_1\,dx + F_2\,dy. \tag{9.8} \]
Hopefully, the notation, where the integral signs are meant to distribute over the integrand, is
self-explanatory.
For the horizontal segment C1, we use x as parameter, keeping y = c fixed: α1(x) = (x, c),
a ≤ x ≤ b. Then, along C1, $\frac{dx}{dx} = 1$ and $\frac{dy}{dx} = \frac{d}{dx}(c) = 0$, so:
\[ \int_{C_1} F_1\,dx + F_2\,dy = \int_a^b \big(F_1(x, c) \cdot 1 + F_2(x, c) \cdot 0\big)\,dx = \int_a^b F_1(x, c)\,dx. \]
Similarly, for C3 :
\[ \int_{C_3} F_1\,dx + F_2\,dy = \int_a^b F_1(x, d)\,dx. \]
For the vertical segments C2 and C4, we use y as parameter, c ≤ y ≤ d, keeping x fixed. Hence
$\frac{dx}{dy} = 0$ and $\frac{dy}{dy} = 1$, which gives:
\[ \int_{C_2} F_1\,dx + F_2\,dy = \int_c^d F_2(b, y)\,dy \quad\text{and}\quad \int_{C_4} F_1\,dx + F_2\,dy = \int_c^d F_2(a, y)\,dy. \]
Note that, for the boundaries of R1 , R2 , R3 , R4 , the portions in the interior of D appear as part of
two different Cj traversed in opposite directions. Thus these portions of the line integrals cancel
out, leaving only the parts left exposed around the boundary of the original region D. Hence if C
denotes the boundary of D, we have:
\[ \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dx\,dy = \int_C F_1\,dx + F_2\,dy. \]
This is the same relation as we obtained for the filled-in rectangle in equation (9.9). As the
argument shows, it is valid provided that C is oriented so that the outer perimeter is traversed
counterclockwise and the inner perimeter is traversed clockwise, as shown in Figure 9.10.
Definition. We have said that a curve C is called closed if it starts and ends at the same point.
In addition, C is said to be simple if, apart from this, it does not intersect itself.
For instance, this describes the orientation of the boundary of the rectangle-with-rectangular-hole
in Figure 9.10. Note that oriented boundaries are completely different from partial derivatives,
even though the same symbol “∂ ” happens to be used for both.
The relation we found for line integrals around the oriented boundaries of rectangles or rectan-
gles with holes holds in greater generality.
Theorem 9.11 (Green’s theorem). Let U be an open set in R2 , and let D be a bounded subset of
U such that the boundary of D consists of a finite number of piecewise smooth simple closed curves.
If F = (F1 , F2 ) is a smooth vector field on U , then:
\[ \int_{\partial D} F_1\,dx + F_2\,dy = \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dx\,dy. \]
If F happens to be a conservative field, then $\int_{\partial D} F_1\,dx + F_2\,dy = 0$, since ∂D consists of closed
curves and the integral of a conservative field around any closed curve is 0. On the other hand, by
the mixed partials theorem, the mixed partials $\frac{\partial F_2}{\partial x}$ and $\frac{\partial F_1}{\partial y}$ are equal, so $\iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dx\,dy = 0$
as well. Thus for conservative fields, Green’s theorem confirms the generally accepted belief that
0 = 0.
We have proven Green’s theorem only in very special cases. The argument given for a rectangle
with a hole, however, can be extended easily to any domain D that is a union of finitely many
rectangles that intersect at most in pairs along portions of their common boundaries. For a general
region D, one strategy for giving a proof is to show that D can be approximated by such a union
of rectangles so that Green’s theorem is approximately true for D. By taking a limit of such
approximations, the theorem can be proven. This requires considerable technical expertise which
we leave for another course.
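For a filled-in rectangle, both sides of Green’s theorem are easy to approximate, which gives a concrete check of the edge-by-edge bookkeeping used in the rectangle argument. The sketch below uses the field F(x, y) = (x − y², xy) on the rectangle [0, 2] × [0, 1]. The rectangle, the function names, and the midpoint rule are assumptions made for illustration only.

```python
def F(x, y):
    return (x - y**2, x * y)

def curl_z(x, y):
    # dF2/dx - dF1/dy = y - (-2y) = 3y
    return 3 * y

def boundary_integral(a, b, c, d, n=2000):
    """Integral of F1 dx + F2 dy counterclockwise around [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * hx
        y = c + (k + 0.5) * hy
        total += F(x, c)[0] * hx   # bottom edge, left to right: only dx contributes
        total += F(b, y)[1] * hy   # right edge, upward: only dy contributes
        total -= F(x, d)[0] * hx   # top edge, traversed right to left
        total -= F(a, y)[1] * hy   # left edge, traversed downward
    return total

def double_integral(a, b, c, d, n=400):
    """Midpoint-rule approximation of the double integral of curl_z over the rectangle."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += curl_z(x, y) * hx * hy
    return total

lhs = boundary_integral(0.0, 2.0, 0.0, 1.0)
rhs = double_integral(0.0, 2.0, 0.0, 1.0)
print(lhs, rhs)
```

Both quantities come out close to 3, the value one gets by hand from the double integral of 3y over the rectangle.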
Example 9.12. Let $\mathbf{F}(x, y) = (x - y^2, xy)$. Find $\int_C \mathbf{F} \cdot d\mathbf{s}$ if C is the simple closed curve consisting
of the semicircular arc in the upper half-plane from (2, 0) to (−2, 0) followed by the line segment
from (−2, 0) to (2, 0). See Figure 9.11.
Example 9.13. Once again, consider the vector field F(x, y) = (−y, x) that circulates counter-
clockwise about the origin. Let C be a piecewise smooth simple closed curve in R2 , oriented
counterclockwise. For example, it could be the one shown in Figure 9.12. Can we say anything
about the integral of F over C?
There are portions of C that go counterclockwise about the origin, for instance, the parts
furthest from the origin. The tangential component of F along these portions is positive. But there
may also be places where C moves in the clockwise direction around the origin. For instance, if C
does not encircle the origin, this happens at points nearest the origin. The tangential component
is negative along such places. The two types of contributions counteract one another, but the
magnitude of F increases as you move away from the origin so we might expect the positive
contribution to be greater. In fact, using Green’s theorem, we can measure precisely how much
greater it is.
Figure 9.12: Integrating the circulating vector field F(x, y) = (−y, x) around a simple closed curve C
Let D be the filled-in region bounded by C. Then, by Green’s theorem:
\[ \int_C -y\,dx + x\,dy = \iint_D \left(\frac{\partial}{\partial x}(x) - \frac{\partial}{\partial y}(-y)\right) dx\,dy = \iint_D 2\,dx\,dy = 2\,\text{Area}(D). \]
Corollary 9.14. Let D be a subset of R2 that satisfies the assumptions of Green’s theorem. Then:
\[ \text{Area}(D) = \frac{1}{2} \int_{\partial D} -y\,dx + x\,dy. \]
This illustrates the striking principle that information about all of D can be recovered from
measurements taken only along its boundary.
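When ∂D is a polygon, the boundary integral in Corollary 9.14 can be evaluated edge by edge in closed form: on the segment from (x0, y0) to (x1, y1), the integral of −y dx + x dy equals x0·y1 − x1·y0. The sketch below sums these contributions. The function name `polygon_area` and the sample polygons are hypothetical, and the vertices are assumed to be listed counterclockwise around a simple polygon.

```python
def polygon_area(vertices):
    """Area enclosed by a simple polygon whose vertices are listed counterclockwise.

    Applies Area(D) = (1/2) * integral over the boundary of -y dx + x dy,
    using the exact value x0*y1 - x1*y0 of the integral on each edge.
    """
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]   # wrap around to close the curve
        total += x0 * y1 - x1 * y0
    return 0.5 * total

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
right_triangle = [(0, 0), (4, 0), (0, 3)]

print(polygon_area(unit_square))
print(polygon_area(right_triangle))
# Reversing the orientation flips the sign of the result.
print(polygon_area(list(reversed(right_triangle))))
```

Listing the vertices clockwise reverses the orientation of the boundary, so the formula returns the negative of the area.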
This was Example 8.3 in Chapter 8. Like our usual circulating vector field, W circulates counterclockwise
around the origin, but, due to the factor of $\frac{1}{x^2+y^2}$, the arrows get shorter as you move
away (Figure 9.13). We are about to see that W is an instructive example to keep in mind.
First, we integrate W around the circle Ca of radius a centered at the origin and oriented
counterclockwise. We evaluate the integral in two different ways.
Approach A. We use Green’s theorem. Let Da be the filled-in disk of radius a so that Ca = ∂Da .
See Figure 9.14.
Approach B. We calculate the line integral directly using a parametrization of Ca , say α(t) =
(a cos t, a sin t), 0 ≤ t ≤ 2π. This gives:
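\[
\begin{aligned}
\int_{C_a} -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy
&= \int_0^{2\pi} \left(-\frac{a\sin t}{a^2\cos^2 t + a^2\sin^2 t} \cdot (-a\sin t) + \frac{a\cos t}{a^2\cos^2 t + a^2\sin^2 t} \cdot (a\cos t)\right) dt \\
&= \int_0^{2\pi} \frac{a^2\sin^2 t + a^2\cos^2 t}{a^2}\,dt \\
&= \int_0^{2\pi} \frac{a^2}{a^2}\,dt \\
&= \int_0^{2\pi} 1\,dt \\
&= 2\pi.
\end{aligned}
\]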
9.5. THE VECTOR FIELD W 221
We broaden the analysis by considering any piecewise smooth simple closed curve C in the
plane, oriented counterclockwise, that does not pass through the origin, meaning that (0, 0) ∉ C.
This doesn’t seem like very much to go on, but it turns out that we can say a lot about the integral
of W over C.
There are two cases, corresponding to the fact that C separates the plane into two regions,
which we refer to as the interior and the exterior.
Case 1. (0, 0) lies in the exterior of C. See Figure 9.15.
Then the region D bounded by C does not contain the origin, so Approach A above is valid and
Green’s theorem applies with C = ∂D. Since the mixed partials of W are equal, it follows that:
\[ \int_C -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy = \iint_D \left(\frac{\partial W_2}{\partial x} - \frac{\partial W_1}{\partial y}\right) dx\,dy = 0. \]
But we showed in equation (9.10) that the integral of W around any counterclockwise circle about
the origin, such as C , is 2π. Hence the value around C must be the same.
To summarize, we have proven:
\[ \int_C -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy = \begin{cases} 0 & \text{if } (0, 0) \text{ lies in the exterior of } C, \\ 2\pi & \text{if } (0, 0) \text{ lies in the interior of } C. \end{cases} \tag{9.11} \]
If C is oriented clockwise instead, then the change in orientation reverses the sign of the integral, so
the corresponding values in the two cases are 0 and −2π, respectively. That there are exactly three
possible values of the integral over all piecewise smooth oriented simple closed curves in R2 −{(0, 0)}
may be somewhat surprising.
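The three possible values can be observed numerically. The sketch below integrates W around circles that do or do not enclose the origin, in either direction. The particular centers and radii, the helper `circle_integral`, and the midpoint rule are assumptions chosen for illustration.

```python
import math

def W(x, y):
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circle_integral(center, radius, clockwise=False, n=20000):
    """Approximate the integral of W1 dx + W2 dy around a circle that misses the origin."""
    cx, cy = center
    sign = -1.0 if clockwise else 1.0
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = sign * (k + 0.5) * h              # the angle runs backwards when clockwise
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t) * sign     # derivative with respect to the parameter
        dy = radius * math.cos(t) * sign
        W1, W2 = W(x, y)
        total += (W1 * dx + W2 * dy) * h
    return total

outside = circle_integral((3.0, 0.0), 1.0)                     # origin in the exterior
inside = circle_integral((0.5, 0.0), 2.0)                      # origin in the interior
inside_cw = circle_integral((0.5, 0.0), 2.0, clockwise=True)   # same circle, clockwise
print(outside, inside, inside_cw)
```

Up to rounding, the three results are 0, 2π, and −2π, even though the second circle is not centered at the origin.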
C1. the integral of F over any piecewise smooth oriented curve C depends only on the endpoints
of C,
C2. the integral of F over any piecewise smooth oriented closed curve is 0, and
C3. all mixed partial pairs are equal: $\frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}$ for all i, j.
So these conditions are necessary for F to be conservative. Are any of them sufficient? It turns
out that the converses corresponding to C1 and C2 are true, but we know from the example of W
in the previous section that the converse of C3 is not. Computing the mixed partials and seeing
if they are equal is the simplest of the three conditions to check, however, so we could ask if there
are special circumstances under which the converse of C3 does hold. We investigate this next and
toss in a few remarks about the other two conditions along the way.
9.6. THE CONVERSE OF THE MIXED PARTIALS THEOREM 223
For example, Rn itself is simply connected. So are rectangles and balls. A sphere in R3 is simply
connected as well. A closed curve in any of them can be collapsed to a point within the set rather
like a rubber band shrinking in on itself. See Figure 9.17.
Figure 9.17: Examples of simply connected sets: a disk, a rectangle, and a sphere
The punctured plane R2 − {(0, 0)} is not simply connected, however. A closed curve that goes
around the origin cannot be shrunk to a point without passing through the origin, i.e., without
leaving the set. Similarly, an annulus in R2 , that is, the ring between two concentric circles, is not
simply connected. An example of a non-simply connected surface is the surface of a donut, which is
called a torus. A loop that goes through the hole cannot be contracted to a point while remaining
on the surface. These examples are shown in Figure 9.18.
Figure 9.18: Examples of non-simply connected sets and closed curves that cannot be contracted
within the set: the punctured plane, an annulus, and a torus
Theorem 9.15 (Converse of the mixed partials theorem). Let F = (F1 , F2 , . . . , Fn ) be a smooth
vector field on an open set U in Rn . If:
• $\frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}$ for all i, j = 1, 2, . . . , n and
• U is simply connected,
then F is a conservative vector field on U.
A rigorous proof of this theorem is quite hard, so we postpone even any discussion of it until
after we look at an example.
Example 9.16. Consider again the vector field $\mathbf{W}(x, y) = \left(-\frac{y}{x^2+y^2}, \frac{x}{x^2+y^2}\right)$, (x, y) ≠ (0, 0), of
Section 9.5. We established that W is not conservative on R2 − {(0, 0)}. On the other hand,
the mixed partials of W are equal; therefore Theorem 9.15 implies that W is conservative when
restricted to any simply connected subset of R2 − {(0, 0)}.
For instance, let U be the right half-plane, U = {(x, y) ∈ R2 : x > 0}. This is a simply connected
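set, so a potential function for W must exist on U. Indeed, one can check that $w(x, y) = \arctan\left(\frac{y}{x}\right)$
works. For instance, $\frac{\partial w}{\partial x} = \frac{1}{1 + \left(\frac{y}{x}\right)^2} \cdot \left(-\frac{y}{x^2}\right) = -\frac{y}{x^2+y^2}$.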
To illustrate how this might be used, suppose that C is a piecewise smooth curve in the right
half-plane that goes from a point p, say on the line y = −x in the fourth quadrant, to a point q
on the line y = x in the first quadrant, as in Figure 9.19.
Figure 9.19: A curve in the right half-plane from the line y = −x to the line y = x
Note that this difference in the arctangent is the angle about the origin swept out by C.
The same potential function works in the left half-plane x < 0. Similarly, on the upper half-plane
y > 0 and the lower half-plane y < 0, $v(x, y) = -\arctan\left(\frac{x}{y}\right)$ works as a potential function for
W. Integrating W over a curve in any of these half-planes amounts to finding the counterclockwise
change in angle traced out by the curve.
We can use these observations to go beyond the result of (9.11) and give a more complete
description of the integral of W over curves in R2 − {(0, 0)}. Any such curve C can be divided
into pieces, each of which lies in one of the four half-planes above. By summing the integrals over
each of these pieces, we find that the integral of W over all of C is the total counterclockwise angle
about the origin swept out by C from start to finish.
If C is a closed curve so that it comes back to where it started, the total angle is an integer
multiple of 2π: $\int_C -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy = 2\pi n$. This conclusion does not require that C be
simple. The integer n represents the net number of times that C goes around the origin in the
counterclockwise sense and is given a special name.
Definition. Let C be a piecewise smooth oriented closed curve in R2 −{(0, 0)}. Then the winding
number of C with respect to (0, 0) is the integer defined by the equation:
\[ \text{winding number} = \frac{1}{2\pi} \int_C -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy. \]
For instance, if Ca is the circle of radius a centered at the origin, traversed once counterclockwise,
then we showed in Section 9.5 that $\int_{C_a} -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy = 2\pi$ (see equation (9.10)). Hence
according to the definition, Ca has winding number 1.
Figure 9.20: Oriented closed curves with winding number 1 (left), 2 (middle), and −1 (right)
One reason that this discussion is interesting is that, in first-year calculus, we tend to focus on
the integral of a function as telling us mainly about the function. The winding number, on the
other hand, is an instance where the integral is telling us about the geometry of the domain, i.e.,
the curve, over which the integral is taken.
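For a closed polygonal curve that misses the origin, the winding number can be computed without numerical integration, because the integral of W along each straight edge is the signed angle that the edge subtends at the origin. The sketch below adds up those angles with `math.atan2`. The function name and the sample curves are hypothetical, and the approach assumes that no edge passes through the origin.

```python
import math

def winding_number(points):
    """Winding number about the origin of the closed polygon through `points`.

    The signed angle swept along the edge from p to q is
    atan2(cross(p, q), dot(p, q)), which lies strictly between -pi and pi
    whenever the edge does not pass through the origin.
    """
    total = 0.0
    n = len(points)
    for k in range(n):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % n]   # wrap around to close the curve
        total += math.atan2(x0 * y1 - x1 * y0, x0 * x1 + y0 * y1)
    return round(total / (2 * math.pi))

square_ccw = [(1, 1), (-1, 1), (-1, -1), (1, -1)]    # encircles the origin once
offset_square = [(2, 1), (3, 1), (3, 2), (2, 2)]     # origin in the exterior
# Twelve points that go twice around the unit circle in 60-degree steps.
double_loop = [(math.cos(4 * math.pi * k / 12), math.sin(4 * math.pi * k / 12))
               for k in range(12)]

print(winding_number(square_ccw))
print(winding_number(list(reversed(square_ccw))))
print(winding_number(offset_square))
print(winding_number(double_loop))
```

The final rounding only removes floating-point error: the total angle for a closed curve is already an integer multiple of 2π.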
We close by going over some of the ideas behind the proof of Theorem 9.15, the converse of
the mixed partials theorem. We do this only in the case of a vector field F = (F1 , F2 ) on an open
set U in R2. Thus we assume that $\frac{\partial F_2}{\partial x} = \frac{\partial F_1}{\partial y}$ and that U is simply connected, and we want to
prove that F is conservative on U, that is, that it has a potential function f. We think of f as an
“anti-gradient” of F. Then the method of finding f is a lot like what is done in first-year calculus
for the fundamental theorem, where an antiderivative is constructed by integrating.
Choose a point a of U. If x is any point of U, then by assumption there is a piecewise smooth
curve C in U from a to x. The idea is to show that the line integral $\int_C \mathbf{F} \cdot d\mathbf{s} = \int_C F_1\,dx + F_2\,dy$ is
the same regardless of which curve C is chosen. For suppose that C1 is another piecewise smooth
curve in U from a to x. See Figure 9.21. We want to prove that:
\[ \int_C F_1\,dx + F_2\,dy = \int_{C_1} F_1\,dx + F_2\,dy. \]
Figure 9.21: Two curves C and C1 from a to x and the region D that fills in the closed curve
C2 = C ∪ (−C1 )
Note that, whether U is simply connected or not, as long as condition C1 above holds—namely,
the line integral depends only on the endpoints of a curve—then the formula in equation (9.14)
describes a well-defined function f . The reasoning above also contains the argument that, if con-
dition C2 holds—that the integral around any closed curve is 0—that too is enough to show that
the function f in (9.14) is well-defined. This basically follows from equation (9.13).
Thus to show that any of C1, C2, or, on simply connected domains, C3 suffices to imply that F
is conservative, it remains only to confirm that f in (9.14) is a potential function for F: $\frac{\partial f}{\partial x} = F_1$
and $\frac{\partial f}{\partial y} = F_2$. This part is actually not so bad, and we leave it for the exercises (Exercise 6.4).
1.3. Find $\int_C \mathbf{F} \cdot d\mathbf{s}$ if $\mathbf{F}(x, y) = (\cos^2 x, \sin y \cos y)$ and C is the curve parametrized by $\alpha(t) = (t, t^2)$,
0 ≤ t ≤ π.
1.4. Find $\int_C \mathbf{F} \cdot d\mathbf{s}$ if $\mathbf{F}(x, y, z) = (x + y, y + z, x + z)$ and C is the curve parametrized by $\alpha(t) = (t, t^2, t^3)$, 0 ≤ t ≤ 1.
1.5. Find $\int_C \mathbf{F} \cdot d\mathbf{s}$ if $\mathbf{F}(x, y, z) = (-y, x, z)$ and C is the curve parametrized by $\alpha(t) = (\cos t, \sin t, t)$,
0 ≤ t ≤ 2π.
1.6. Consider the line integral $\int_C (x + y)\,dx + (y - x)\,dy$.
(c) Evaluate the integral if C consists of the line segment from (0, 0) to (1, 0) followed by
the line segment from (1, 0) to (1, 1).
(d) Evaluate the integral if C consists of the line segment from (0, 0) to (0, 1) followed by
the line segment from (0, 1) to (1, 1).
1.7. Consider the line integral $\int_C xy\,dx + yz\,dy + xz\,dz$.
1.8. Find $\int_C (x - y^2)\,dx + 3xy\,dy$ if C is the portion of the parabola $y = x^2$ from (−1, 1) to (2, 4).
and the plane z = x + y, oriented counterclockwise as viewed from high above the xy-plane,
looking down.
For the line integrals in Exercises 1.10–1.11, (a) sketch the curve C over which the integral
is taken and (b) evaluate the integral. (Hint: Look for ways to reduce, or eliminate, messy
calculation.)
1.10. $\int_C 8z\,e^{(x+y)^2}\,dx + (2x^2 + y^2)^2\,dy + 6xy^2z\,dz$, where C is parametrized by:
Consider both cases that α and β traverse C in the same or opposite directions. You may
assume α and β are one-to-one, except possibly at the endpoints of their respective domains.
1.13. In this exercise, we return to the geometry of curves in R3 and study the role of the
parametrization in defining curvature. Recall that, if C is a curve in R3 with a smooth
parametrization α : I → R3 , then the curvature is defined by the equation:
\[ \kappa_\alpha(t) = \frac{\|\mathbf{T}'_\alpha(t)\|}{v_\alpha(t)}, \]
where Tα (t) is the unit tangent vector and vα (t) is the speed. We have embellished our original
notation with the subscript α to emphasize our interest in the effect of the parametrization.
We assume that $v_\alpha(t) \neq 0$ for all t so that $\kappa_\alpha(t)$ is defined.
Let β : J → R3 be another such parametrization of C. As in the text, we assume that α and
β are one-to-one, except possibly at the endpoints of their respective domains. As in equation
(9.3), there is a function g : J → I, assumed to be smooth, such that β(u) = α(g(u)) for all
u in J. In other words, g(u) is the value of t such that β(u) = α(t).
(a) Show that the velocities of α and β are related by $\mathbf{v}_\beta(u) = \mathbf{v}_\alpha(g(u))\,g'(u)$.
(b) Show that the unit tangents are related by $\mathbf{T}_\beta(u) = \pm\mathbf{T}_\alpha(g(u))$.
(c) Show that $\kappa_\beta(u) = \kappa_\alpha(g(u))$.
In other words, if x = β(u) = α(t) is any point of C other than an endpoint, then
κβ (u) = κα (t). We denote this common value by κ(x), i.e., κ(x) is defined to be κα (t)
for any smooth parametrization α of C, where t is the value of the parameter such that
α(t) = x. In other words, the curvature at x depends only on the point x and not on
how C is parametrized.
(a) Is F a conservative vector field? If so, find a potential function for it. If not, explain
why not.
(b) Find $\int_C \mathbf{F} \cdot d\mathbf{s}$ if C is the line segment from the point (4, 1) to the point (2, 3).
In Exercises 3.3–3.4, (a) find the curl of F and (b) determine if F is conservative. If it is
conservative, find a potential function for it, and, if not, explain why not.
3.4. F(x, y, z) = (y − z, z − x, y − x)
In Exercises 3.5–3.8, determine whether the vector field F is conservative. If it is, find a potential
function for it. If not, explain why not.
3.9. Let p be a positive real number, and let F be the vector field on R3 − {(0, 0, 0)} given by:
\[ \mathbf{F}(x, y, z) = \left(-\frac{x}{(x^2+y^2+z^2)^p}, -\frac{y}{(x^2+y^2+z^2)^p}, -\frac{z}{(x^2+y^2+z^2)^p}\right). \]
For instance, the inverse square field is the case p = 3/2.
(a) The norm of F is given by $\|\mathbf{F}(x, y, z)\| = \|(x, y, z)\|^q$ for some exponent q. Find q in
terms of p.
(b) For which values of p is F a conservative vector field? The case p = 1 may require special
attention.
3.10. Find $\int_C e^y\,dx + xe^y\,dy$ if C is the curve parametrized by $\alpha(t) = (e^{t^2}, t^3)$, 0 ≤ t ≤ 1.
3.11. Let a and b be real numbers. If C is a piecewise smooth oriented closed curve in the punctured
plane R2 − {(0, 0)}, show that:
\[ \int_C \frac{ax}{x^2+y^2}\,dx + \frac{by}{x^2+y^2}\,dy = \int_C \frac{(b-a)y}{x^2+y^2}\,dy. \]
3.12. Let $\mathbf{G}(x, y, z) = \left(-\frac{x}{(x^2+y^2+z^2)^{3/2}}, -\frac{y}{(x^2+y^2+z^2)^{3/2}}, -\frac{z}{(x^2+y^2+z^2)^{3/2}}\right)$ be the inverse square field.
that consists of the sequence of line segments, each parallel to one of the coordinate axes,
from (0, 0, 0) to (1, 0, 0) to (1, 1, 0) and finally to (1, 1, 1).
3.14. Let C be a piecewise smooth oriented curve in R3 from a point p = (x1, y1, z1) to a point
q = (x2, y2, z2). Show that $\int_C 1\,dx = x_2 - x_1$. (Similar formulas apply to $\int_C 1\,dy$ and $\int_C 1\,dz$.)
3.15. (a) Find a vector field F = (F1 , F2 ) on R2 with the property that, whenever C is a piecewise
smooth oriented curve in R2 , then:
\[ \int_C F_1\,dx + F_2\,dy = \|\mathbf{q}\|^2 - \|\mathbf{p}\|^2, \]
where p and q are the starting and ending points, respectively, of C? Either describe such a
vector field as precisely as you can, or explain why none exists.
3.17. Let f, g : Rn → R be smooth real-valued functions. Prove that, for all piecewise smooth
oriented closed curves C in Rn :
\[ \int_C (f\,\nabla g) \cdot d\mathbf{s} = -\int_C (g\,\nabla f) \cdot d\mathbf{s}. \]
3.18. For a vector field F = (F1 , F2 , F3 ) on an open set in R3 , is the cross product ∇ × F, i.e., the
curl, necessarily orthogonal to F? Either prove that it is, or find an example where it isn’t.
3.19. Newton’s second law of motion, F = ma, relates the force F acting on an object to the
object’s mass m and acceleration a. Let F be a smooth force field on an open set of R3 , and
assume that an object of mass m travels under the influence of F along a curve C having a
smooth parametrization α : [a, b] → R3 with velocity v(t) and acceleration a(t).
(a) The quantity $K(t) = \frac{1}{2} m \|\mathbf{v}(t)\|^2$ is called the kinetic energy of the object. Use Newton’s
second law to show that:
\[ \int_C \mathbf{F} \cdot d\mathbf{s} = K(b) - K(a). \]
In other words, the work done by the force equals the change in the object’s kinetic
energy.
(b) Assume, in addition, that F is a conservative field with potential function f . Show that
the function
\[ E(t) = \frac{1}{2} m \|\mathbf{v}(t)\|^2 - f(\alpha(t)) \]
is constant. (The function −f is called a potential energy for F. The fact that, for a
conservative force field F, the sum of the kinetic and potential energies along an object’s
path is constant is known as the principle of conservation of energy.)
3.20. Let F be a smooth vector field on an open set U in Rn . A smooth path α : I → U defined
on an interval I is called an integral path of F if $\alpha'(t) = \mathbf{F}(\alpha(t))$ for all t. In other words, at
time t, when the position is the point α(t), the velocity equals the vector field vector at that
point. (See page 201.)
(a) If α : [a, b] → U is a nonconstant integral path of F and C is the curve parametrized by
α, show that $\int_C \mathbf{F} \cdot d\mathbf{s} > 0$. (Hint: The assumption that α is not constant means that
there is some point t in [a, b] where $\alpha'(t) \neq \mathbf{0}$.)
(b) Let f : U → R be a smooth real-valued function on U , and let F = ∇f . If α is a
nonconstant integral path of F going from a point p to a point q, show that f (q) > f (p).
(c) Show that a conservative vector field F cannot have a nonconstant closed integral path.
3.21. Let U be an open set in Rn with the property that every pair of points p, q in U can be joined
by a piecewise smooth path in U . Let f : U → R be a smooth real-valued function defined on
U . If ∇f (x) = 0 for all x in U , use line integrals to give a simple proof that f is a constant
function.
4.1. Find $\int_C (x^2 - y^2)\,dx + 2xy\,dy$ if C is the boundary of the square with vertices (0, 0), (1, 0),
clockwise.
4.3. Find $\int_C -y\cos x\,dx + xy\,dy$ if C is the boundary of the parallelogram with vertices (0, 0),
(2π, π), (3π, 3π), and (π, 2π), oriented counterclockwise.
9.7. EXERCISES FOR CHAPTER 9 231
4.4. Find $\int_C (e^{x+2y} - 3y)\,dx + (4x + 2e^{x+2y})\,dy$ if C is the track shown in Figure 9.22 consisting
4.5. (a) Use the parametrization α(t) = (a cos t, a sin t), 0 ≤ t ≤ 2π, and Corollary 9.14 to verify
that the area of the disk x2 + y 2 ≤ a2 of radius a is πa2 .
(b) By modifying the parametrization in part (a), use Corollary 9.14 to find the area of the
region $\frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1$ inside an ellipse.
In Exercises 4.6–4.11, evaluate the line integral using whatever methods seem best.
4.6. $\int_C (x + y)\,dx + (x - y)\,dy$, where C is the curve consisting of the line segment from (0, 1) to
(1, 0) followed by the line segment from (1, 0) to (2, 1)
x
4.7. C x2e+y2 dx − x2 +y
x 2 2 2
R
2 dy, where C is the unit circle x + y = 1 in R , oriented counterclockwise
4.8. $\int_C (3x^2y - xy^2 - 3x^2y^2)\,dx + (x^3 - 2x^3y + x^2y)\,dy$, where C is the closed triangular curve in
R2 with vertices (0, 0), (1, 1), and (0, 1), oriented counterclockwise
4.10. $\int_C (6x^2 - 4y + 2xy)\,dx + (2x - 2\sin y + 3x^2)\,dy$, where C is the diamond-shaped curve |x| + |y| = 1,
oriented counterclockwise
and the saddle surface z = x2 − y 2 , oriented counterclockwise as viewed from high above the
xy-plane, looking down
4.13. Let D be a bounded subset of U whose boundary consists of a finite number of piecewise
smooth simple closed curves. If h is harmonic on U , prove that:
\[ \int_{\partial D} -h\,\frac{\partial h}{\partial y}\,dx + h\,\frac{\partial h}{\partial x}\,dy = \iint_D \|\nabla h(x, y)\|^2\,dx\,dy. \]
4.14. Let D be a region as in the preceding exercise, and suppose that every pair of points p, q in
D can be joined by a piecewise smooth path in D. If h is harmonic on U and if h = 0 at all
points of ∂D, prove that h = 0 on all of D. (Hint: See Exercises 2.4 of Chapter 5 and 3.21
of this chapter.)
4.15. Continuing with the previous exercise, prove that a harmonic function is completely deter-
mined on D by its values on the boundary in the sense that, if h1 and h2 are harmonic on U
and if h1 = h2 at all points of ∂D, then h1 = h2 on all of D.
(a) Find $\int_C F_1\,dx + F_2\,dy$ if C is the oriented fish-shaped curve shown in the figure.
(b) Draw a piecewise smooth oriented closed curve K such that $\int_K F_1\,dx + F_2\,dy = 1$. Your
curve need not be simple. Include in your drawing the direction in which K is traversed.
6.1. (a) Let C be the curve in R2 − {(0, 0)} parametrized by α(t) = (cos t, − sin t), 0 ≤ t ≤ 2π.
Describe C geometrically, and calculate its winding number using the definition.
(b) More generally, let n be an integer (positive, negative, or zero allowed), and let Cn be the
curve parametrized by α(t) = (cos nt, sin nt), 0 ≤ t ≤ 2π. Give a geometric description
of Cn , and calculate its winding number using the definition.
6.2. Let C be a piecewise smooth simple closed curve in R2 , oriented counterclockwise, such that
the origin lies in the exterior of C. What are the possible values of the winding number of
C?
6.3. Let F = (F1, F2) be a smooth vector field on the punctured plane U = R2 − {(0, 0)} such that
$\frac{\partial F_2}{\partial x} = \frac{\partial F_1}{\partial y}$ on U, and let C be a piecewise smooth oriented closed curve in U.
(a) If C does not intersect the x-axis, explain why $\int_C \mathbf{F} \cdot d\mathbf{s} = 0$.
(b) Suppose instead that C may intersect the positive x-axis but not the negative x-axis. Is
it necessarily still true that $\int_C \mathbf{F} \cdot d\mathbf{s} = 0$? Explain.
6.4. (a) Let c = (c, d) be a point of R2 , and let B = B(c, r) be an open ball centered at c. For
the moment, we think of R2 as the uv-plane to avoid having x and y mean too many
different things later on. Let F(u, v) = (F1 (u, v), F2 (u, v)) be a continuous vector field
on B.
Given any point x = (x, y) in B, let C1 be the curve from (c, d) to (x, y) consisting of
a vertical line segment followed by a horizontal line segment, as indicated on the left of
Figure 9.24. Define a function f1 : B → R by:
\[ f_1(x, y) = \int_{C_1} F_1\,du + F_2\,dv. \]
Figure 9.24: The curves of integration C1 and C2 for functions f1 and f2 , respectively
(b) Similarly, let C2 be the curve from (c, d) to (x, y) consisting of a horizontal segment
followed by a vertical segment, as shown on the right of Figure 9.24, and define:
\[ f_2(x, y) = \int_{C_2} F_1\,du + F_2\,dv. \]
Show that $\frac{\partial f_2}{\partial y}(x, y) = F_2(x, y)$ for all (x, y) in B.
(c) Now, let U be an open set in R2 such that any two points of U can be joined by a
piecewise smooth path in U . Let a be a point of U . Assume that F is a smooth vector
field on U such that the function f : U → R given in equation (9.14) is well-defined.
That is:
\[ f(\mathbf{x}) = \int_{\mathbf{a}}^{\mathbf{x}} F_1\,du + F_2\,dv, \]
where the notation means $\int_C F_1\,du + F_2\,dv$ for any piecewise smooth curve C in U from
a to x.
Show that f is a potential function for F on U, i.e., $\frac{\partial f}{\partial x} = F_1$ and $\frac{\partial f}{\partial y} = F_2$. In particular,
F is a conservative vector field. (Hint: Given a point c in U , choose an open ball
B = B(c, r) that is contained in U . Define functions f1 and f2 as in parts (a) and (b).
For x in B, how is f (x) related to f (c) + f1 (x) and f (c) + f2 (x)?)
Chapter 10
Surface integrals
We next study how to integrate vector fields over surfaces. One important caveat: our discussion
applies only to surfaces in R3 . For surfaces in Rn with n > 3, the expressions that one integrates
are more complicated than vector fields.
It is easy to get lost in the weeds with the details of surface integrals, so, in the first section
of the chapter, we just try to build some intuition about what the integral measures and how to
compute that measurement. The formal definition of the integral appears in the second section.
Calculating surface integrals is not necessarily difficult but it can be messy, so having a range of
options is especially welcome. In addition to the definition and the intuitive approach, sometimes
a surface integral can be converted to a different type of integral altogether. For example, one
of the theorems of the chapter, Stokes’s theorem, relates surface integrals to line integrals, and
another, Gauss’s theorem, relates them to triple integrals. It would be a mistake to think of these
results primarily as computational tools, however. Their theoretical consequences are at least as
important, and we try to give a taste of some of the new lines of reasoning about integrals that
they open up. The chapter closes with a couple of sections in which we tie up some loose ends.
This also puts us in position to wrap things up in the next, and final, chapter.
in contrast with the interpretation of the line integral, where we were interested in the degree to
which the vector field flowed along a curve.
To begin, we need to designate a direction of flow through S that is considered to be positive.
Definition. An orientation of a smooth surface S in R3 is a continuous vector field n : S → R3
such that, for every x in S, n(x) is a unit normal vector to S at x. See Figure 10.2. An oriented
surface is a surface S together with a specified orientation n.
Example 10.1. Suppose that S is a level set defined by f (x, y, z) = c for some smooth real-valued
function f of three variables. If x ∈ S, then, by Proposition 4.17, ∇f (x) is a normal vector to S
at x, so the formula $\mathbf{n}(\mathbf{x}) = \frac{1}{\|\nabla f(\mathbf{x})\|}\,\nabla f(\mathbf{x})$ describes an orientation of S, provided that $\nabla f(\mathbf{x}) \neq \mathbf{0}$
for all x in S.
For instance, let S be the sphere $x^2 + y^2 + z^2 = a^2$ of radius a. Here, $f(x, y, z) = x^2 + y^2 + z^2$,
so $\nabla f = (2x, 2y, 2z) = 2(x, y, z)$. As a result, an orientation of the sphere is given by:
\[ \mathbf{n}(x, y, z) = \frac{1}{2\sqrt{x^2 + y^2 + z^2}}\, 2(x, y, z) = \frac{1}{\sqrt{a^2}}\,(x, y, z) = \frac{1}{a}\,(x, y, z) \]
for all points (x, y, z) on the sphere. This vector points away from the origin, so we say that it
orients S with the outward normal. See Figure 10.3, left.
We could equally well orient the sphere with the inward unit normal. This would be given by
$\mathbf{n}(x, y, z) = -\frac{1}{a}\,(x, y, z)$.
Or let S be the circular cylinder $x^2 + y^2 = a^2$. This is a level set of $f(x, y, z) = x^2 + y^2$. Then
$\nabla f = (2x, 2y, 0) = 2(x, y, 0)$, and an orientation is:
\[ \mathbf{n}(x, y, z) = \frac{1}{2\sqrt{x^2 + y^2}}\, 2(x, y, 0) = \frac{1}{a}\,(x, y, 0) \]
for all (x, y, z) on the cylinder. Once again, this is the outward normal (Figure 10.3, right).
10.1. WHAT THE SURFACE INTEGRAL MEASURES 237
In general, let S be an oriented surface with orienting normal n. To measure the flow of a vector
field $\mathbf{F} = (F_1, F_2, F_3)$ through S, we find the scalar component $F_{\text{norm}}$ of F in the normal direction
at every point x of S and integrate it over S. This component is given by $F_{\text{norm}} = \|\mathbf{F}\| \cos\theta$,
where θ is the angle between F and n, as shown in Figure 10.5.
Figure 10.5: The component of a vector field in the normal direction at a point x
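Hence:
\[ F_{\text{norm}} = \|\mathbf{F}\| \cos\theta = \|\mathbf{F}\|\,\|\mathbf{n}\| \cos\theta = \mathbf{F}\cdot\mathbf{n}. \]
The integral of F over S, denoted $\iint_S \mathbf{F}\cdot d\mathbf{S}$, can be expressed as:
\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S F_{\text{norm}}\,dS = \iint_S \mathbf{F}\cdot\mathbf{n}\,dS. \tag{10.2} \]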
8
Photo credit: https://pixabay.com/photos/nature-animal-porcupine-mammal-3588682/ by analogicus under
Pixabay license.
As the integral of the real-valued function F · n, the expression on the right is an integral with
respect to surface area of the type considered previously in (10.1). We take equation (10.2) as a
tentative definition of the integral of F and use it to work through some examples.
Example 10.2. Let S be the sphere $x^2 + y^2 + z^2 = a^2$ of radius a, oriented by the outward normal.
Find $\iint_S \mathbf{F}\cdot d\mathbf{S}$ if:
(a) $\mathbf{F}(x, y, z) = -\frac{1}{(x^2 + y^2 + z^2)^{3/2}}\,(x, y, z)$ (i.e., the inverse square field),
(a) The inverse square field, illustrated in Figure 10.6, points directly inwards towards the origin,
\[ \mathbf{F}\cdot\mathbf{n} = -\frac{1}{a^3}\,(x, y, z)\cdot\frac{1}{a}\,(x, y, z) = -\frac{1}{a^4}\,(x^2 + y^2 + z^2) = -\frac{1}{a^4}\cdot a^2 = -\frac{1}{a^2}. \]
Thus:
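\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S \mathbf{F}\cdot\mathbf{n}\,dS = \iint_S -\frac{1}{a^2}\,dS = -\frac{1}{a^2}\,\text{Area}(S) = -\frac{1}{a^2}\cdot 4\pi a^2 = -4\pi. \]
This uses the formula that the surface area is $4\pi a^2$ from Example 5.19 in Chapter 5. We shall refer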
to this surface integral later, so to repeat for emphasis: the integral of the inverse square field over
a sphere of radius a centered at the origin and with the outward orientation is −4π, regardless of
the radius of the sphere.
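As a numerical sanity check of this independence of the radius, here is a minimal Python sketch that approximates $\iint_S \mathbf{F}\cdot\mathbf{n}\,dS$ for the inverse square field with a midpoint Riemann sum in spherical coordinates. The function name `inverse_square_flux` and the grid size `n` are our own illustrative choices, and the sketch assumes the standard area element $a^2 \sin\phi\,d\phi\,d\theta$ on a sphere of radius a.

```python
import math

def inverse_square_flux(a, n=200):
    """Midpoint-rule estimate of the flux of the inverse square field
    through the sphere of radius a, oriented by the outward normal."""
    dphi = math.pi / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            # Point on the sphere in spherical coordinates.
            x = a * math.sin(phi) * math.cos(theta)
            y = a * math.sin(phi) * math.sin(theta)
            z = a * math.cos(phi)
            r3 = (x * x + y * y + z * z) ** 1.5
            # F . n with F = -(x, y, z)/r^3 and n = (x, y, z)/a.
            f_dot_n = -(x * x + y * y + z * z) / (r3 * a)
            # Area element on the sphere: a^2 sin(phi) dphi dtheta.
            total += f_dot_n * a * a * math.sin(phi) * dphi * dtheta
    return total

for radius in (0.5, 1.0, 3.0):
    print(radius, inverse_square_flux(radius), -4 * math.pi)
```

Each printed estimate should sit close to $-4\pi$, whatever radius is used.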
(b) The vector field F(x, y, z) = (0, 0, z) points straight upward when z > 0 and straight downward
when z < 0 (Figure 10.7). In either region, the component in the outward direction of S is positive.
Thus we expect the integral to be positive.
In fact, $\mathbf{F}\cdot\mathbf{n} = (0, 0, z)\cdot\frac{1}{a}\,(x, y, z) = \frac{1}{a}\,z^2$, so:
\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \frac{1}{a}\iint_S z^2\,dS. \]
Figure 10.7: The vector field F(x, y, z) = (0, 0, z) flowing through a sphere
This last integral is precisely one of the examples we calculated using a parametrization when we
studied integrals with respect to surface area (Example 5.19 again). The answer is $\iint_S z^2\,dS = \frac{4}{3}\pi a^4$.
(We also gave a symmetry argument that made use of the facts that $\iint_S x^2\,dS = \iint_S y^2\,dS =
\iint_S z^2\,dS$ and $\iint_S (x^2 + y^2 + z^2)\,dS = \iint_S a^2\,dS = a^2\cdot(4\pi a^2) = 4\pi a^4$.) Thus:
\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \frac{1}{a}\cdot\frac{4}{3}\pi a^4 = \frac{4}{3}\pi a^3. \]
Figure 10.8: The vector field F(x, y, z) = (0, 0, −1) flowing through a sphere
By symmetry, this last integral is indeed 0. More precisely, S is symmetric in the xy-plane, i.e.,
(x, y, −z) is in S whenever (x, y, z) is, and the integrand $f(x, y, z) = z$ satisfies $f(x, y, -z) =
-z = -f(x, y, z)$, so the contributions of the northern and southern hemispheres cancel. Thus
$\iint_S \mathbf{F}\cdot d\mathbf{S} = 0$.
In this last example, if we had integrated instead only over the northern hemisphere, the same
line of thought would lead us to expect that the integral is negative. As a test of your intuition
about what the integral represents, you might think about whether this is related to the following
example.
Example 10.3. Let S be the disk of radius a in the xy-plane described by:
\[ x^2 + y^2 \le a^2, \quad z = 0, \]
Figure 10.9: The vector field F(x, y, z) = (0, 0, −1) flowing through a disk in the xy-plane
The upward unit normal to S is n = k = (0, 0, 1), so F · n = (0, 0, −1) · (0, 0, 1) = −1. Therefore:
\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S -1\,dS = -\text{Area}(S) = -\pi a^2. \]
Continuing with the comments right before the example, the result turned out to be negative,
but is the flow of F = (0, 0, −1) through the disk really related to the flow through the northern
hemisphere? We shall be able to give a precise answer based on general principles once we learn a
little more about surface integrals (see Exercises 4.4 and 5.5 at the end of the chapter).
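We say that σ preserves orientation if the sign is + and that it reverses orientation if −.
Now, to formulate the definition of the surface integral, recall that the idea was that $\iint_S \mathbf{F}\cdot d\mathbf{S}$
should equal $\iint_S \mathbf{F}\cdot\mathbf{n}\,dS$. Using (10.3) to substitute for n, this becomes
$\pm\iint_S \mathbf{F}\cdot\frac{\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}}{\left\|\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}\right\|}\,dS$. Then
the definition of the integral with respect to surface area (10.1) with $f = \mathbf{F}\cdot\frac{\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}}{\left\|\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}\right\|}$
gives:
\begin{align*}
\iint_S \mathbf{F}\cdot\mathbf{n}\,dS
&= \pm\iint_D \mathbf{F}(\sigma(s, t))\cdot\frac{\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}}{\left\|\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}\right\|}\,\left\|\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}\right\|\,ds\,dt \\
&= \pm\iint_D \mathbf{F}(\sigma(s, t))\cdot\left(\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}\right)ds\,dt.
\end{align*}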
This is the thinking behind the following definition.
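Definition. Let U be an open set in $\mathbb{R}^3$, and let $\mathbf{F}\colon U \to \mathbb{R}^3$ be a continuous vector field. Let S
be an oriented surface contained in U, and let $\sigma\colon D \to \mathbb{R}^3$ be a smooth parametrization of S such
that $\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}$ is never $\mathbf{0}$.⁹ Then the integral of F over S, denoted $\iint_S \mathbf{F}\cdot d\mathbf{S}$, is defined by:
\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \pm\iint_D \mathbf{F}(\sigma(s, t))\cdot\left(\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}\right)ds\,dt, \tag{10.4} \]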
where + is used if σ is orientation-preserving and − if orientation-reversing.
We defer the discussion that the definition is independent of the parametrization σ until the
end of the chapter after we have gotten used to working with the definition and there are fewer
new things to absorb. Also, the definition extends in the obvious way to surfaces that are piecewise
smooth. That is, if a surface S is a union S = S1 ∪ S2 ∪ · · · ∪ Sk of oriented smooth surfaces that
intersect at most in pairs along portions of their boundaries, we define:
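\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_{S_1} \mathbf{F}\cdot d\mathbf{S} + \iint_{S_2} \mathbf{F}\cdot d\mathbf{S} + \cdots + \iint_{S_k} \mathbf{F}\cdot d\mathbf{S}. \]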
As a first example, we calculate an integral that we evaluated using the tentative definition in
Example 10.2(b).
Example 10.4. Let S be the sphere $x^2 + y^2 + z^2 = a^2$, oriented by the outward normal. Find
$\iint_S \mathbf{F}\cdot d\mathbf{S}$ if $\mathbf{F}(x, y, z) = (0, 0, z)$.
We apply the definition using the standard parametrization of S with spherical coordinates φ
and θ as parameters and ρ = a fixed:
\[ \sigma(\phi, \theta) = (a \sin\phi \cos\theta,\ a \sin\phi \sin\theta,\ a \cos\phi), \quad 0 \le \phi \le \pi,\ 0 \le \theta \le 2\pi. \]
The domain of the parametrization is the rectangle D = [0, π] × [0, 2π] in the φθ-plane. See Figure
10.10.
9 Actually, it’s allowed for $\frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}$ to be $\mathbf{0}$ at points (s, t) on the boundary of D. As noted earlier, as far as
integrals are concerned, bad behavior can be neglected if confined to the boundary (page 130).
Substituting in terms of the parameters gives F(σ(φ, θ)) = (0, 0, z(φ, θ)) = (0, 0, a cos φ). In
addition:
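\begin{align*}
\frac{\partial\sigma}{\partial\phi}\times\frac{\partial\sigma}{\partial\theta}
&= \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a \cos\phi \cos\theta & a \cos\phi \sin\theta & -a \sin\phi \\ -a \sin\phi \sin\theta & a \sin\phi \cos\theta & 0 \end{bmatrix} \\
&= \ \vdots \quad \text{(we calculated this before in Example 5.19)} \\
&= a^2 \sin\phi\,(\sin\phi \cos\theta,\ \sin\phi \sin\theta,\ \cos\phi). \tag{10.5}
\end{align*}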
Figure 10.11: The tangent vectors $\frac{\partial\sigma}{\partial\phi}$ and $\frac{\partial\sigma}{\partial\theta}$
10.2. THE DEFINITION OF THE SURFACE INTEGRAL 243
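\[ \mathbf{F}(\sigma(\phi, \theta))\cdot\left(\frac{\partial\sigma}{\partial\phi}\times\frac{\partial\sigma}{\partial\theta}\right) = (0, 0, a \cos\phi)\cdot a^2 \sin\phi\,(\sin\phi \cos\theta,\ \sin\phi \sin\theta,\ \cos\phi) = a^3 \sin\phi \cos^2\phi. \]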
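Thus:
\begin{align*}
\iint_S \mathbf{F}\cdot d\mathbf{S}
&= +\iint_D \mathbf{F}(\sigma(\phi, \theta))\cdot\left(\frac{\partial\sigma}{\partial\phi}\times\frac{\partial\sigma}{\partial\theta}\right)d\phi\,d\theta \\
&= \iint_D a^3 \sin\phi \cos^2\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi}\left(\int_0^{\pi} a^3 \sin\phi \cos^2\phi\,d\phi\right)d\theta \\
&= a^3\left(\int_0^{\pi} \sin\phi \cos^2\phi\,d\phi\right)\left(\int_0^{2\pi} 1\,d\theta\right) \\
&= a^3\left(-\frac{1}{3}\cos^3\phi\,\Big|_0^{\pi}\right)\left(\theta\,\Big|_0^{2\pi}\right) \\
&= a^3\cdot\left(-\frac{1}{3}(-1 - 1)\right)\cdot 2\pi \\
&= a^3\cdot\frac{2}{3}\cdot 2\pi \\
&= \frac{4}{3}\pi a^3.
\end{align*}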
This agrees with the answer we got earlier in Example 10.2(b), but the calculation here was
at least as messy as the one obtained there by integrating the normal component of F. In other
words, the definition may not always be the most efficient way to go. We’ll return to this example
one more time later and apply a method that’s simplest of all.
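To see the definition (10.4) at work without doing the algebra by hand, the following Python sketch approximates the double integral numerically. It is only an illustration under stated assumptions: the helper names `cross` and `surface_integral`, the central-difference step `h`, the grid size `n`, and the choice a = 2 are ours, and the partial derivatives of σ are estimated by finite differences rather than computed exactly.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def surface_integral(F, sigma, s_range, t_range, sign=1, n=200, h=1e-6):
    """Midpoint-rule estimate of sign * (double integral over D of
    F(sigma(s, t)) . (sigma_s x sigma_t) ds dt), as in equation (10.4)."""
    (s0, s1), (t0, t1) = s_range, t_range
    ds, dt = (s1 - s0) / n, (t1 - t0) / n
    total = 0.0
    for i in range(n):
        s = s0 + (i + 0.5) * ds
        for j in range(n):
            t = t0 + (j + 0.5) * dt
            # Central-difference estimates of the partial derivatives of sigma.
            sigma_s = tuple((p - q) / (2 * h)
                            for p, q in zip(sigma(s + h, t), sigma(s - h, t)))
            sigma_t = tuple((p - q) / (2 * h)
                            for p, q in zip(sigma(s, t + h), sigma(s, t - h)))
            normal = cross(sigma_s, sigma_t)
            field = F(*sigma(s, t))
            total += sum(f * c for f, c in zip(field, normal)) * ds * dt
    return sign * total

a = 2.0  # illustrative radius

def sphere(phi, theta):
    """Spherical-coordinate parametrization of the sphere of radius a."""
    return (a * math.sin(phi) * math.cos(theta),
            a * math.sin(phi) * math.sin(theta),
            a * math.cos(phi))

def F(x, y, z):
    return (0.0, 0.0, z)

flux = surface_integral(F, sphere, (0.0, math.pi), (0.0, 2 * math.pi))
print(flux, 4 / 3 * math.pi * a ** 3)
```

The two printed numbers should agree to a few decimal places. Passing `sign=-1` would model an orientation-reversing parametrization.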
Example 10.5. Let $\mathbf{F}(x, y, z) = (0, 1, -3)$. This is a constant vector field which we think of as
“rain” falling through space. See Figure 10.12, left. Find $\iint_S \mathbf{F}\cdot d\mathbf{S}$ if:
(a) S is the cone $z = \sqrt{x^2 + y^2}$, where $x^2 + y^2 \le 4$, oriented by the upward normal (Figure 10.12,
middle),
(b) S is the disk that caps the cone in part (a), i.e., the disk $x^2 + y^2 \le 4$ in the plane $z = 2$,
oriented by the upward normal (Figure 10.12, right).
(a) As we saw in Chapter 5, there are several ways to parametrize the cone, but it’s the graph of
a function of x and y so one way is to use x and y as parameters:
\[ \sigma(x, y) = \left(x, y, \sqrt{x^2 + y^2}\right), \quad x^2 + y^2 \le 4. \]
Figure 10.12: Rain (left), the cone $z = \sqrt{x^2 + y^2}$, $x^2 + y^2 \le 4$ (middle), and the cap of the cone,
$x^2 + y^2 \le 4$, $z = 2$ (right)
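The z-component is positive, so $\frac{\partial\sigma}{\partial x}\times\frac{\partial\sigma}{\partial y}$ points upward. Thus σ is orientation-preserving. In
addition:
\begin{align*}
\mathbf{F}(\sigma(x, y))\cdot\left(\frac{\partial\sigma}{\partial x}\times\frac{\partial\sigma}{\partial y}\right)
&= (0, 1, -3)\cdot\left(-\frac{x}{\sqrt{x^2 + y^2}},\ -\frac{y}{\sqrt{x^2 + y^2}},\ 1\right) \\
&= -\frac{y}{\sqrt{x^2 + y^2}} - 3.
\end{align*}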
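Thus by definition:
\begin{align*}
\iint_S \mathbf{F}\cdot d\mathbf{S}
&= \iint_D \left(-\frac{y}{\sqrt{x^2 + y^2}} - 3\right)dx\,dy \\
&= -\iint_D \frac{y}{\sqrt{x^2 + y^2}}\,dx\,dy - 3\iint_D 1\,dx\,dy.
\end{align*}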
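The double integral above can be checked numerically. The sketch below is a rough illustration, not the text's own method: it switches to polar coordinates (our choice, with $dx\,dy = r\,dr\,d\theta$) and uses a midpoint Riemann sum over the disk D of radius 2. The function name `cone_flux` and the grid size are assumptions made for the example.

```python
import math

def cone_flux(radius=2.0, n=200):
    """Midpoint-rule estimate of the double integral of
    (-y / sqrt(x^2 + y^2) - 3) over the disk of the given radius,
    computed in polar coordinates."""
    dr = radius / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            integrand = -y / math.sqrt(x * x + y * y) - 3.0
            # dx dy = r dr dtheta in polar coordinates.
            total += integrand * r * dr * dtheta
        # The -y / sqrt(...) part sums to (nearly) zero over each full circle.
    return total

print(cone_flux(), -12 * math.pi)
```

The first term contributes nothing by symmetry in y, so the estimate should match $-3\,\text{Area}(D) = -12\pi$.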
We close by picking up the thread with which we ended the previous section, which is to note
that the answers to parts (a) and (b) of the last example are the same. We can interpret this to
say that the flow through the top of the cone equals the flow through the sides. If we could have
been sure that this was true in advance, then we could have calculated the integral over the cone
by replacing it with the much simpler calculation over the capping disk. The next three sections
have something to say about why this would have been valid from a couple of different perspectives.
They also contain the last two big theorems of introductory multivariable calculus.
10.3. STOKES’S THEOREM 245
Figure 10.13: The boundary of a hemisphere is a circle and that of a cylinder is two circles. The
boundaries of a sphere and torus are empty.
vector field integral over the surface itself. It can be seen as an extension of Green’s theorem from
double integrals to surface integrals. The theorem is not obvious, and we shall try to show where
it comes from, though in a very special case similar to the one we used to explain Green’s theorem.
Namely, we assume that S is an oriented surface with a smooth orientation-preserving
parametrization $\sigma(s, t) = (x(s, t), y(s, t), z(s, t))$ such that:
• the domain of σ is a rectangle R = [a, b] × [c, d] and
• σ is one-to-one and, in particular, transforms the boundary of R precisely onto the boundary
of S.
The situation is shown in Figure 10.14.
Figure 10.14: A magic carpet: a surface S with parametrization σ defined on a rectangle R. The
sides of R are oriented so that ∂R = A1 ∪ A2 ∪ (−A3 ) ∪ (−A4 ).
Let A denote the boundary of R and B the boundary of S. We orient A counterclockwise and
break it into its four sides A1 , A2 , A3 , A4 , where the horizontal sides A1 and A3 are oriented to the
right and the vertical sides A2 and A4 are oriented upward. Taking orientation into account, we
have A = A1 ∪ A2 ∪ (−A3 ) ∪ (−A4 ). Applying σ then breaks B up into four curves B1 , B2 , B3 , B4 ,
where Bj = σ(Aj ) for j = 1, 2, 3, 4. B inherits the orientation B = B1 ∪ B2 ∪ (−B3 ) ∪ (−B4 ).
Let F = (F1 , F2 , F3 ) be a smooth vector field defined on an open set in R3 that contains S. We
consider the line integral of F around B, the boundary of S:
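\[ \int_B \mathbf{F}\cdot d\mathbf{s} = \left(\int_{B_1} + \int_{B_2} - \int_{B_3} - \int_{B_4}\right)\mathbf{F}\cdot d\mathbf{s}. \]
We use differential form notation: $\int_B \mathbf{F}\cdot d\mathbf{s} = \int_B F_1\,dx + F_2\,dy + F_3\,dz$. To cut down on the
amount of notation in any one place, we find $\int_B F_1\,dx$, $\int_B F_2\,dy$, $\int_B F_3\,dz$ separately and add the
results at the end.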
We can parametrize each of the four pieces of B by applying σ to the corresponding side of A,
fixing one of s or t and using the other as parameter:
For B1 : α1 (s) = σ(s, c), a ≤ s ≤ b.
For B2 : α2 (t) = σ(b, t), c ≤ t ≤ d.
For B3 : α3 (s) = σ(s, d), a ≤ s ≤ b.
For B4 : α4 (t) = σ(a, t), c ≤ t ≤ d.
So for $\int_B F_1\,dx$, we integrate over each of $B_1$ through $B_4$ and take the sum with appropriate
signs. For example, for $B_1$, the parametrization is $\alpha_1(s) = \sigma(s, c) = (x(s, c), y(s, c), z(s, c))$,
where t = c is held fixed. Hence the derivatives with respect to the parameter are derivatives
with respect to s, partial derivatives where appropriate. For instance, $\alpha_1'(s) = \frac{\partial\sigma}{\partial s}(s, c) =
\left(\frac{\partial x}{\partial s}(s, c), \frac{\partial y}{\partial s}(s, c), \frac{\partial z}{\partial s}(s, c)\right)$. As a result, we obtain:
\[ \int_{B_1} F_1\,dx = \int_a^b F_1(\sigma(s, c))\,\frac{\partial x}{\partial s}(s, c)\,ds. \]
Continuing in this way with B3 and B4 and combining the results gives:
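\begin{align*}
\int_B F_1\,dx = {} & \int_a^b F_1(\sigma(s, c))\,\frac{\partial x}{\partial s}(s, c)\,ds + \int_c^d F_1(\sigma(b, t))\,\frac{\partial x}{\partial t}(b, t)\,dt \\
& - \int_a^b F_1(\sigma(s, d))\,\frac{\partial x}{\partial s}(s, d)\,ds - \int_c^d F_1(\sigma(a, t))\,\frac{\partial x}{\partial t}(a, t)\,dt. \tag{10.6}
\end{align*}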
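Upon closer examination, each of the terms in this last expression is a line integral over one of
the sides of the boundary of R back in the st-plane. For instance, the first term is $\int_{A_1} (F_1\circ\sigma)\,\frac{\partial x}{\partial s}\,ds$
and the second is $\int_{A_2} (F_1\circ\sigma)\,\frac{\partial x}{\partial t}\,dt$, line integrals over $A_1$ and $A_2$, respectively. In fact, since t is
constant on $A_1$ and s is constant on $A_2$, we can give these terms a more uniform appearance by
writing them as:
\[ \int_{A_1} (F_1\circ\sigma)\,\frac{\partial x}{\partial s}\,ds + (F_1\circ\sigma)\,\frac{\partial x}{\partial t}\,dt \quad\text{and}\quad \int_{A_2} (F_1\circ\sigma)\,\frac{\partial x}{\partial s}\,ds + (F_1\circ\sigma)\,\frac{\partial x}{\partial t}\,dt, \]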
respectively. The extra summands are superfluous to the integrals, since the derivatives of the extra
coordinates are zero on the corresponding segment. After converting the remaining two terms to
line integrals over A3 and A4 , equation (10.6) becomes a line integral over all of A:
\[ \int_B F_1\,dx = \int_A (F_1\circ\sigma)\,\frac{\partial x}{\partial s}\,ds + (F_1\circ\sigma)\,\frac{\partial x}{\partial t}\,dt. \]
But now that we are back to a line integral in the plane, Green’s theorem applies. In other
words, we can replace the line integral over A with a double integral over the rectangle R that it
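bounds. Thus:
\[ \int_B F_1\,dx = \iint_R \left(\frac{\partial}{\partial s}\left((F_1\circ\sigma)\,\frac{\partial x}{\partial t}\right) - \frac{\partial}{\partial t}\left((F_1\circ\sigma)\,\frac{\partial x}{\partial s}\right)\right)ds\,dt. \tag{10.7} \]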
We work with the integrand of the double integral. Using the product rule to calculate the
partial derivatives with respect to s and t, the integrand becomes:
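\[ \frac{\partial}{\partial s}(F_1\circ\sigma)\cdot\frac{\partial x}{\partial t} + (F_1\circ\sigma)\cdot\frac{\partial^2 x}{\partial s\,\partial t} - \frac{\partial}{\partial t}(F_1\circ\sigma)\cdot\frac{\partial x}{\partial s} - (F_1\circ\sigma)\cdot\frac{\partial^2 x}{\partial t\,\partial s}. \]
Thanks to the equality of the mixed partials $\frac{\partial^2 x}{\partial s\,\partial t} = \frac{\partial^2 x}{\partial t\,\partial s}$, the second and fourth terms cancel,
leaving:
\[ \frac{\partial}{\partial s}(F_1\circ\sigma)\,\frac{\partial x}{\partial t} - \frac{\partial}{\partial t}(F_1\circ\sigma)\,\frac{\partial x}{\partial s}. \]
So after substituting into equation (10.7), at this point, we have:
\[ \int_B F_1\,dx = \iint_R \left(\frac{\partial}{\partial s}(F_1\circ\sigma)\,\frac{\partial x}{\partial t} - \frac{\partial}{\partial t}(F_1\circ\sigma)\,\frac{\partial x}{\partial s}\right)ds\,dt. \tag{10.8} \]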
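We compute the partials of $F_1\circ\sigma$ using the chain rule. Fans of dependence diagrams will
welcome Figure 10.15. By the chain rule, the integrand in equation (10.8) becomes:
[Figure 10.15: dependence diagram in which $F_1$ depends on x, y, z, each of which depends on s and t]
\begin{align*}
&\left(\frac{\partial F_1}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial F_1}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial F_1}{\partial z}\frac{\partial z}{\partial s}\right)\frac{\partial x}{\partial t} - \left(\frac{\partial F_1}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial F_1}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial F_1}{\partial z}\frac{\partial z}{\partial t}\right)\frac{\partial x}{\partial s} \\
&\qquad = \frac{\partial F_1}{\partial z}\left(\frac{\partial z}{\partial s}\frac{\partial x}{\partial t} - \frac{\partial x}{\partial s}\frac{\partial z}{\partial t}\right) - \frac{\partial F_1}{\partial y}\left(\frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial y}{\partial s}\frac{\partial x}{\partial t}\right).
\end{align*}
Plugging this into (10.8) yields:
\[ \int_B F_1\,dx = \iint_R \left(\frac{\partial F_1}{\partial z}\left(\frac{\partial z}{\partial s}\frac{\partial x}{\partial t} - \frac{\partial x}{\partial s}\frac{\partial z}{\partial t}\right) - \frac{\partial F_1}{\partial y}\left(\frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial y}{\partial s}\frac{\partial x}{\partial t}\right)\right)ds\,dt. \]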
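The calculations of $\int_B F_2\,dy$ and $\int_B F_3\,dz$ are similar and end up giving:
\[ \int_B F_2\,dy = \iint_R \left(-\frac{\partial F_2}{\partial z}\left(\frac{\partial y}{\partial s}\frac{\partial z}{\partial t} - \frac{\partial z}{\partial s}\frac{\partial y}{\partial t}\right) + \frac{\partial F_2}{\partial x}\left(\frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial y}{\partial s}\frac{\partial x}{\partial t}\right)\right)ds\,dt \]
and
\[ \int_B F_3\,dz = \iint_R \left(\frac{\partial F_3}{\partial y}\left(\frac{\partial y}{\partial s}\frac{\partial z}{\partial t} - \frac{\partial z}{\partial s}\frac{\partial y}{\partial t}\right) - \frac{\partial F_3}{\partial x}\left(\frac{\partial z}{\partial s}\frac{\partial x}{\partial t} - \frac{\partial x}{\partial s}\frac{\partial z}{\partial t}\right)\right)ds\,dt. \]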
Adding the three calculations and suitably regrouping the terms results in the formula:
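\begin{align*}
\int_B F_1\,dx + F_2\,dy + F_3\,dz = \iint_R \Bigg( &\left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\left(\frac{\partial y}{\partial s}\frac{\partial z}{\partial t} - \frac{\partial z}{\partial s}\frac{\partial y}{\partial t}\right) \\
&+ \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\left(\frac{\partial z}{\partial s}\frac{\partial x}{\partial t} - \frac{\partial x}{\partial s}\frac{\partial z}{\partial t}\right) \\
&+ \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\left(\frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial y}{\partial s}\frac{\partial x}{\partial t}\right)\Bigg)\,ds\,dt. \tag{10.9}
\end{align*}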
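This looks awful, but actually the ungainly double integral can be expressed much more
concisely. The integrand is the dot product of the vectors:
\[ \mathbf{v} = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z},\ \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x},\ \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) \]
and
\[ \mathbf{w} = \left(\frac{\partial y}{\partial s}\frac{\partial z}{\partial t} - \frac{\partial z}{\partial s}\frac{\partial y}{\partial t},\ \frac{\partial z}{\partial s}\frac{\partial x}{\partial t} - \frac{\partial x}{\partial s}\frac{\partial z}{\partial t},\ \frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial y}{\partial s}\frac{\partial x}{\partial t}\right). \]
For v, the components are the differences of the mixed partial pairs of F, and in fact $\mathbf{v} = \nabla\times\mathbf{F}$,
the curl of F. Likewise, w is also a cross product:
\[ \mathbf{w} = \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial x}{\partial s} & \frac{\partial y}{\partial s} & \frac{\partial z}{\partial s} \\ \frac{\partial x}{\partial t} & \frac{\partial y}{\partial t} & \frac{\partial z}{\partial t} \end{bmatrix} = \frac{\partial\sigma}{\partial s}\times\frac{\partial\sigma}{\partial t}. \]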
This last expression is precisely the definition of the surface integral of the vector field ∇ × F over
S. We have arrived at the final formula:
\[ \int_B \mathbf{F}\cdot d\mathbf{s} = \int_B F_1\,dx + F_2\,dy + F_3\,dz = \iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S}. \tag{10.10} \]
This was a long calculation. To recap the strategy, to evaluate the line integral over the boundary
B of S, we used a parametrization of S to pull back to a line integral in the plane, applied Green’s
theorem there, and lastly identified the resulting double integral (10.9) as the pullback of an integral
over the surface.
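To make the recap concrete, here is a small Python sketch that checks formula (10.10) numerically on one "magic carpet." Everything specific in it is a hypothetical choice of ours: the parametrization $\sigma(s, t) = (s, t, st)$ on $R = [0, 1]\times[0, 2]$, the field $\mathbf{F} = (z, x, y)$, and its curl $(1, 1, 1)$, which we computed by hand. The line integral is assembled side by side as $B_1 + B_2 - B_3 - B_4$, exactly as in the derivation.

```python
def midpoint(g, lo, hi, n=100):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * step) for k in range(n)) * step

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Hypothetical data for the illustration (not taken from the text).
a, b, c, d = 0.0, 1.0, 0.0, 2.0           # R = [a, b] x [c, d]

def sigma(s, t):
    return (s, t, s * t)

def sigma_s(s, t):                        # partial derivative of sigma in s
    return (1.0, 0.0, t)

def sigma_t(s, t):                        # partial derivative of sigma in t
    return (0.0, 1.0, s)

def F(x, y, z):
    return (z, x, y)

def curl_F(x, y, z):                      # curl of (z, x, y), computed by hand
    return (1.0, 1.0, 1.0)

# Line integral of F around B = B1 + B2 - B3 - B4.
line = (midpoint(lambda s: dot(F(*sigma(s, c)), sigma_s(s, c)), a, b)
        + midpoint(lambda t: dot(F(*sigma(b, t)), sigma_t(b, t)), c, d)
        - midpoint(lambda s: dot(F(*sigma(s, d)), sigma_s(s, d)), a, b)
        - midpoint(lambda t: dot(F(*sigma(a, t)), sigma_t(a, t)), c, d))

# Surface integral of the curl, using definition (10.4) with the + sign.
surface = midpoint(
    lambda s: midpoint(
        lambda t: dot(curl_F(*sigma(s, t)), cross(sigma_s(s, t), sigma_t(s, t))),
        c, d),
    a, b)

print(line, surface)
```

For this particular choice both sides work out to −1, so the two printed values should agree up to rounding.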
The argument behind (10.10) assumed that the domain of the parametrization is a rectangle, but
the conclusion remains true for surfaces more broadly. In order to formulate a general statement,
we need to say something about the relation between the orientation of a surface and the orientation
of its boundary. Let S be a smooth oriented surface in R3 with orienting normal n. We think of n
as defining “up” and imagine walking around S with our feet on the surface and our heads in the
direction of n. Then:
If S happens to be contained in the plane R2 , this is consistent with the way we oriented its
boundary in Green’s theorem, assuming that S is oriented by the upward normal n = (0, 0, 1).
For example, let S be a sphere that is oriented by the outward normal. Then the boundary of
the northern hemisphere is the equatorial circle oriented counterclockwise, while the boundary of
the southern hemisphere is the same circle but oriented clockwise. If S is a pair of pants oriented
by the outward normal, then ∂S consists of three closed curves: the waist oriented one way—let’s
call it clockwise—and the bottoms of the two legs oriented counterclockwise. See Figure 10.16.
For a piecewise smooth surface S = S1 ∪ S2 ∪ · · · ∪ Sk , we require that the smooth pieces
S1 , S2 , . . . , Sk are oriented compatibly in the sense that the oriented boundaries of adjacent pieces
are traversed in opposite directions wherever they intersect. These intersections are not considered
to be part of the boundary of the whole surface S. This is illustrated in Figure 10.17.
For the extension of (10.10) beyond surfaces whose parameter domain is a rectangle, one ap-
proach might be to mimic the strategy we outlined informally for Green’s theorem. Namely, first
verify it in the case where the parameter domain is a union of rectangles that overlap at most
along parts of their boundaries and then argue that the domain of any parametrization can be
approximated arbitrarily well by such a union. We won’t give the details.
With these preliminaries out of the way, we can state the theorem.
Theorem 10.6 (Stokes’s theorem). Let S be a piecewise smooth oriented surface in R3 that is
bounded as a subset of R3 , and let F be a smooth vector field defined on an open set containing S.
Then:
\[ \int_{\partial S} \mathbf{F}\cdot d\mathbf{s} = \iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S}. \]
In the form stated, Stokes’s theorem converts a line integral to a surface integral. Since surface
integrals are usually messier than line integrals, this is not the usual way in which the theorem
is applied, at least as far as calculating specific examples goes. (See Exercises 3.3–3.6 for some
exceptions, however.) Sometimes though, the theorem can be used to go the other way, converting
a surface integral to a line integral. This possibility has implications for the general properties of
surface integrals.
Example 10.7. Let F be the “falling rain” vector field $\mathbf{F}(x, y, z) = (0, 1, -3)$, and let S be the
cone $z = \sqrt{x^2 + y^2}$, $x^2 + y^2 \le 4$, oriented by the upward normal. In Example 10.5, we integrated
F over S using a parametrization. Since we evaluated the integral before, there seems no harm in
giving away a spoiler: the answer is $\iint_S \mathbf{F}\cdot d\mathbf{S} = -12\pi$.
Could this surface integral have been computed using Stokes’s theorem? The answer is yes
provided that there is a vector field G such that $\mathbf{F} = \nabla\times\mathbf{G}$, for then Stokes’s theorem gives
$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S (\nabla\times\mathbf{G})\cdot d\mathbf{S} = \int_{\partial S} \mathbf{G}\cdot d\mathbf{s}$. In other words, we could find the surface integral of F
over S by computing the line integral of G around the boundary.
We try to find such a G = (G1 , G2 , G3 ) by setting F = ∇ × G and guessing:
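\[ \mathbf{F} = (0, 1, -3) = \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ G_1 & G_2 & G_3 \end{bmatrix}. \]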
From the second equation, we guess G1 = z, from the third G2 = −3x, and from the first G3 = 0.
One can check that these guesses actually are consistent with all three equations. Thus G(x, y, z) =
(z, −3x, 0) works! Our virtuous lifestyle has paid off.
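The consistency check mentioned above can also be done mechanically. The following Python sketch approximates the curl of G = (z, −3x, 0) by central differences at a sample point; the helper name `curl`, the step `h`, and the sample point are illustrative assumptions. Because G is linear, the difference quotients are exact up to rounding.

```python
def curl(G, point, h=1e-5):
    """Central-difference approximation of the curl of G at a point."""
    def partial(component, variable):
        # d(G_component)/d(variable), with variables indexed 0, 1, 2 for x, y, z.
        forward = list(point)
        backward = list(point)
        forward[variable] += h
        backward[variable] -= h
        return (G(*forward)[component] - G(*backward)[component]) / (2 * h)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def G(x, y, z):
    return (z, -3 * x, 0.0)

result = curl(G, (0.7, -1.2, 2.5))
print(result)
```

The printed vector should be very close to (0, 1, −3), the falling rain field.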
This gives:
\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \int_{\partial S} \mathbf{G}\cdot d\mathbf{s} = \int_{\partial S} z\,dx - 3x\,dy + 0\,dz = \int_{\partial S} z\,dx - 3x\,dy. \]
We calculate the line integral by parametrizing C = ∂S, which is a circle of radius 2 in the plane
z = 2, oriented counterclockwise (see Figure 10.12, middle, or, if you’re willing to peek ahead,
Figure 10.18, left). The circle can be parametrized by α(t) = (2 cos t, 2 sin t, 2), 0 ≤ t ≤ 2π. Thus:
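\begin{align*}
\iint_S \mathbf{F}\cdot d\mathbf{S} = \int_C z\,dx - 3x\,dy
&= \int_0^{2\pi} \left(2\cdot(-2 \sin t) - 6 \cos t\cdot 2 \cos t\right)dt \\
&= \int_0^{2\pi} \left(-4 \sin t - 12 \cos^2 t\right)dt \\
&= 4 \cos t\,\Big|_0^{2\pi} - 12\cdot\frac{2\pi}{2} \tag{10.11} \\
&= 0 - 12\pi \\
&= -12\pi,
\end{align*}
where in the step labeled (10.11) we used our trick to integrate $\cos^2 t$ (Exercise 1.1, Chapter 7).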
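For readers who want to confirm the value without the $\cos^2 t$ trick, here is a short Python sketch that evaluates the line integral $\int_C \mathbf{G}\cdot d\mathbf{s}$ with a midpoint Riemann sum. The function name `line_integral` and the number of subintervals are our own choices; the parametrization α and the field G are the ones from the example.

```python
import math

def line_integral(G, alpha, alpha_prime, t0, t1, n=1000):
    """Midpoint-rule estimate of the integral of G . ds along the path alpha."""
    dt = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * dt
        field = G(*alpha(t))
        velocity = alpha_prime(t)
        total += sum(g * v for g, v in zip(field, velocity)) * dt
    return total

def G(x, y, z):
    return (z, -3 * x, 0.0)

def alpha(t):
    # Circle of radius 2 in the plane z = 2, traversed counterclockwise.
    return (2 * math.cos(t), 2 * math.sin(t), 2.0)

def alpha_prime(t):
    return (-2 * math.sin(t), 2 * math.cos(t), 0.0)

result = line_integral(G, alpha, alpha_prime, 0.0, 2 * math.pi)
print(result, -12 * math.pi)
```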
could be the disk in the plane z = 2 that caps off the cone, oriented by the upward normal, which
was also considered in Example 10.5. See Figure 10.18.
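If we integrate the falling rain vector field F over the new surface $\tilde{S}$, we obtain:
\begin{align*}
\iint_{\tilde{S}} \mathbf{F}\cdot d\mathbf{S}
&= \iint_{\tilde{S}} (\nabla\times\mathbf{G})\cdot d\mathbf{S} \quad \text{(where } \mathbf{G} = (z, -3x, 0) \text{ as before)} \\
&= \int_C \mathbf{G}\cdot d\mathbf{s} \\
&= -12\pi \quad \text{(by the calculation in the previous example).}
\end{align*}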
In other words, the integral of F is −12π over all oriented surfaces whose boundary is C.
This idea can be formulated more generally.
For instance, we saw in Example 10.7 that F = (0, 1, −3) is a curl field on R3 with “anti-curl”
G = (z, −3x, 0).
One perspective on curl fields is the following general working principle: curl fields are to surface
integrals as conservative fields are to line integrals. We consider a couple of illustrations of this.
Example 10.8. For line integrals: The integral of a conservative field over an oriented curve
depends only on the endpoints of the curve.
For surface integrals: The integral of a curl field over an oriented surface depends only on the
oriented boundary of the surface. More precisely:
Theorem 10.9. Let S1 and S2 be piecewise smooth oriented surfaces in an open set U in R3 that
have the same oriented boundary, i.e., ∂S1 = ∂S2 . If F is a curl field on U , then:
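\[ \iint_{S_1} \mathbf{F}\cdot d\mathbf{S} = \iint_{S_2} \mathbf{F}\cdot d\mathbf{S}. \]
Proof. Suppose that $\mathbf{F} = \nabla\times\mathbf{G}$. Then by Stokes’s theorem, both sides of the equation are equal
to $\int_C \mathbf{G}\cdot d\mathbf{s}$, where $C = \partial S_1 = \partial S_2$.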
Example 10.10. For line integrals: The integral of a conservative field over a piecewise smooth
oriented closed curve is 0.
For surface integrals:
For example, spheres and tori are closed surfaces, but hemispheres and cylinders are not.
Theorem 10.11. If F is a curl field on an open set U in $\mathbb{R}^3$, then $\iint_S \mathbf{F}\cdot d\mathbf{S} = 0$ for any piecewise
S1 and S2 inherit the orienting unit normal from S, so, in order to keep them on the left, their
boundaries must be traversed in opposite directions. This is consistent with our requirement for
orientations of piecewise smooth surfaces. In other words, ∂S1 and ∂S2 are the same curve but
with opposite orientations. Hence the line integrals in equation (10.12) cancel each other out.
Lastly, we look for an analogue for curl fields of the mixed partials theorem for conservative
fields. In R3 , the mixed partials theorem takes the form that, if F is conservative,
then ∇ × F = 0.
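Now, let F be a curl field, say $\mathbf{F} = \nabla\times\mathbf{G}$, so $(F_1, F_2, F_3) = \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ G_1 & G_2 & G_3 \end{bmatrix}$. Hence:
\[ \left\{\begin{aligned}
F_1 &= \frac{\partial G_3}{\partial y} - \frac{\partial G_2}{\partial z} \\
F_2 &= \frac{\partial G_1}{\partial z} - \frac{\partial G_3}{\partial x} \\
F_3 &= \frac{\partial G_2}{\partial x} - \frac{\partial G_1}{\partial y}.
\end{aligned}\right. \]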
10.5. GAUSS’S THEOREM 253
The terms on the right side are mixed partial pairs that appear with opposite signs. Thus, by the
equality of mixed partials, taking the sum gives:
\[ \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} = 0. \tag{10.13} \]
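For readers who would like to see the cancellation written out, the following display spells out the step. It assumes, as the appeal to the equality of mixed partials requires, that the components of G have continuous second-order partial derivatives. Differentiating the expressions for $F_1$, $F_2$, $F_3$ with respect to x, y, z, respectively, and adding gives:

```latex
\begin{align*}
\frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}
&= \left(\frac{\partial^2 G_3}{\partial x\,\partial y} - \frac{\partial^2 G_2}{\partial x\,\partial z}\right)
 + \left(\frac{\partial^2 G_1}{\partial y\,\partial z} - \frac{\partial^2 G_3}{\partial y\,\partial x}\right)
 + \left(\frac{\partial^2 G_2}{\partial z\,\partial x} - \frac{\partial^2 G_1}{\partial z\,\partial y}\right) \\
&= \left(\frac{\partial^2 G_3}{\partial x\,\partial y} - \frac{\partial^2 G_3}{\partial y\,\partial x}\right)
 + \left(\frac{\partial^2 G_1}{\partial y\,\partial z} - \frac{\partial^2 G_1}{\partial z\,\partial y}\right)
 + \left(\frac{\partial^2 G_2}{\partial z\,\partial x} - \frac{\partial^2 G_2}{\partial x\,\partial z}\right) \\
&= 0 + 0 + 0 = 0.
\end{align*}
```

Each pair in the second line consists of the same mixed partial taken in the two possible orders, which is why it vanishes.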
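The sum can also be written in terms of the ∇ operator as $\frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)\cdot
(F_1, F_2, F_3) = \nabla\cdot\mathbf{F}$, where the product is dot product. This quantity has a name.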
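Definition. If $\mathbf{F} = (F_1, F_2, F_3)$ is a smooth vector field on an open set U of $\mathbb{R}^3$, then:
\[ \nabla\cdot\mathbf{F} = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \]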
is called the divergence of F.
Note that, while the curl of a vector field is again a vector field, the divergence is a real-valued
function.
From (10.13), we have proven the following:
Theorem 10.12. Let U be an open set in R3 , and let F be a smooth vector field on U . If F is a
curl field on U , then:
\[ \nabla\cdot\mathbf{F} = 0 \]
at all points of U. Equivalently, given a smooth vector field F on U, if $\nabla\cdot\mathbf{F} \neq 0$ at some point,
then F is not a curl field.
Example 10.13. The vector field $\mathbf{F}(x, y, z) = (2xy + z^2, 2yz + x^2, 2xz + y^2)$ is conservative on
$\mathbb{R}^3$. It has $f(x, y, z) = x^2 y + y^2 z + z^2 x$ as a potential function. Is F also a curl field on $\mathbb{R}^3$?
We could write down the conditions for being a curl field and try to guess a vector field G
such that $\mathbf{F} = \nabla\times\mathbf{G}$ the way we did before. It is easy, however, to compute the divergence:
$\nabla\cdot\mathbf{F} = \frac{\partial}{\partial x}\left(2xy + z^2\right) + \frac{\partial}{\partial y}\left(2yz + x^2\right) + \frac{\partial}{\partial z}\left(2xz + y^2\right) = 2y + 2z + 2x$. Since this is not identically
0, F is not a curl field.
Figure 10.20: The rectangular box W = [a, b]×[c, d]×[e, f ] in R3 and the orientation of its boundary
surface S
We integrate F over S. To do so, we consider each face separately and add the results:
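\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \left(\iint_{\text{top}} + \iint_{\text{bottom}} + \iint_{\text{front}} + \iint_{\text{back}} + \iint_{\text{left}} + \iint_{\text{right}}\right)\mathbf{F}\cdot d\mathbf{S}. \tag{10.14} \]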
For instance, the top may be parametrized using x and y as parameters with z = f fixed: σ(x, y) =
(x, y, f ), a ≤ x ≤ b, c ≤ y ≤ d. Then:
\[ \frac{\partial\sigma}{\partial x}\times\frac{\partial\sigma}{\partial y} = (1, 0, 0)\times(0, 1, 0) = \mathbf{i}\times\mathbf{j} = \mathbf{k} = (0, 0, 1). \]
This points upward, hence away from the box W , so σ is orientation-preserving. Therefore:
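\begin{align*}
\iint_{\text{top}} \mathbf{F}\cdot d\mathbf{S}
&= \iint_{[a,b]\times[c,d]} \mathbf{F}(\sigma(x, y))\cdot\left(\frac{\partial\sigma}{\partial x}\times\frac{\partial\sigma}{\partial y}\right)dx\,dy \\
&= \iint_{[a,b]\times[c,d]} (F_1, F_2, F_3)\cdot(0, 0, 1)\,dx\,dy \\
&= \iint_{[a,b]\times[c,d]} F_3(x, y, f)\,dx\,dy.
\end{align*}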
For the face on the bottom, the similar parametrization σ(x, y) = (x, y, e) works, only now σ is
orientation-reversing since the orientation of S on the bottom points downward in order to point
away from W . This gives:
\[ \iint_{\text{bottom}} \mathbf{F}\cdot d\mathbf{S} = -\iint_{[a,b]\times[c,d]} F_3(x, y, e)\,dx\,dy. \]
Adding up the results over all the faces as in equation (10.14) then gives:
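\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_W \left(\frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}\right)dx\,dy\,dz = \iiint_W \nabla\cdot\mathbf{F}\,dx\,dy\,dz. \tag{10.15} \]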
One can extend this formula to the case that W is a union of rectangular boxes that intersect
at most in pairs along their boundaries by applying equation (10.15) to the individual boxes and
adding. A key observation is that when two neighboring boxes intersect in the interior of W on
a portion of a common face, then the orienting normals of each box point in opposite directions
in order to point away from the respective boxes. Hence the normal components F · n are also
opposite, so that the surface integrals over the overlaps cancel out in the sum. This leaves only the
surface integral over the boundary of W . One then claims that equation (10.15) remains true for
more general regions by some sort of limiting argument.
Before stating the theorem, we identify the orientation convention implicit in the argument
given above, namely, if W is a bounded set in R3 whose boundary consists of one or more closed
surfaces, then:
Theorem 10.14 (Gauss’s theorem). Let U be an open set in R3 , and let F be a smooth vector
field on U . If W is a bounded subset of U whose boundary consists of a finite number of piecewise
smooth closed surfaces, then:
\[ \iint_{\partial W} \mathbf{F}\cdot d\mathbf{S} = \iiint_W \nabla\cdot\mathbf{F}\,dx\,dy\,dz. \]
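As a quick numerical illustration of the theorem in the simplest setting, a rectangular box, the Python sketch below compares the outward flux through the six faces with the triple integral of the divergence. The field $\mathbf{F} = (xy, yz, zx)$, its hand-computed divergence $x + y + z$, the box $[0, 1]\times[0, 2]\times[0, 3]$, and the helper names are all hypothetical choices made for this example.

```python
def midpoint_2d(g, u_range, v_range, n=40):
    """Midpoint-rule approximation of the double integral of g(u, v) over a rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += g(u0 + (i + 0.5) * du, v0 + (j + 0.5) * dv)
    return total * du * dv

def box_flux(F, xr, yr, zr):
    """Outward flux of F through the boundary of the box xr x yr x zr.
    Opposite faces are paired; the minus signs come from the outward normals."""
    (a, b), (c, d), (e, f) = xr, yr, zr
    flux = midpoint_2d(lambda y, z: F(b, y, z)[0] - F(a, y, z)[0], yr, zr)
    flux += midpoint_2d(lambda x, z: F(x, d, z)[1] - F(x, c, z)[1], xr, zr)
    flux += midpoint_2d(lambda x, y: F(x, y, f)[2] - F(x, y, e)[2], xr, yr)
    return flux

def divergence_integral(div_F, xr, yr, zr, n=40):
    """Midpoint-rule approximation of the triple integral of div_F over the box."""
    e, f = zr
    dz = (f - e) / n
    total = 0.0
    for k in range(n):
        z = e + (k + 0.5) * dz
        total += midpoint_2d(lambda x, y: div_F(x, y, z), xr, yr, n) * dz
    return total

def F(x, y, z):
    return (x * y, y * z, z * x)

def div_F(x, y, z):
    # Divergence of (xy, yz, zx), computed by hand.
    return y + z + x

box = ((0.0, 1.0), (0.0, 2.0), (0.0, 3.0))
flux = box_flux(F, *box)
triple = divergence_integral(div_F, *box)
print(flux, triple)
```

For this choice both quantities equal 18, so the printed values should match up to rounding.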
In the case that F is a curl field, the left side of Gauss’s theorem is 0. This is because ∂W
consists of closed surfaces and the integral of a curl field over a closed surface is 0 (Theorem 10.11).
The right side is 0, too, because curl fields have divergence 0 (Theorem 10.12). Thus for curl fields,
Gauss’s theorem reflects properties already known to us.
To illustrate Gauss’s theorem, we begin by taking a new look at two surface integrals we
evaluated earlier.
Example 10.15. Let $S_a$ be the sphere $x^2 + y^2 + z^2 = a^2$ of radius a, oriented by the outward
normal. Find $\iint_{S_a} \mathbf{F}\cdot d\mathbf{S}$ if:
Figure 10.22: The solid ball Wa of radius a and its oriented boundary, the sphere Sa
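(a) This was Example 10.2(c). To answer it using Gauss’s theorem, let $W_a$ be the solid ball $x^2 +
y^2 + z^2 \le a^2$ so that $S_a = \partial W_a$, as in Figure 10.22. Since $\nabla\cdot\mathbf{F} = \frac{\partial}{\partial x}(0) + \frac{\partial}{\partial y}(0) + \frac{\partial}{\partial z}(-1) = 0$,
Gauss’s theorem says that:
\[ \iint_{S_a} \mathbf{F}\cdot d\mathbf{S} = \iiint_{W_a} \nabla\cdot\mathbf{F}\,dx\,dy\,dz = \iiint_{W_a} 0\,dx\,dy\,dz = 0. \]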
(b) This is the third time we have evaluated this integral (see Examples 10.2(b) and 10.4). Here,
∇ · F = 0 + 0 + 1 = 1, so with Wa as above:
\[ \iint_{S_a} \mathbf{F}\cdot d\mathbf{S} = \iiint_{W_a} 1\,dx\,dy\,dz = \text{Vol}(W_a) = \frac{4}{3}\pi a^3, \]
10.6. THE INVERSE SQUARE FIELD 257
where we have used the formula for the volume of a 3-dimensional ball from Example 7.8 in Chapter
7.
It seems odd to say that turning a problem into a triple integral makes it simpler, but, with the
help of Gauss’s theorem, finding the preceding two surface integrals became essentially trivial.
Example 10.16. Let $\mathbf{F}(x, y, z) = (x, y, z)$. This vector field radiates directly away from the origin
at every point, and the arrows get longer the further out you go. Let W be a solid region in $\mathbb{R}^3$ as
in Gauss’s theorem. Then $\nabla\cdot\mathbf{F} = 1 + 1 + 1 = 3$, so $\iint_{\partial W} \mathbf{F}\cdot d\mathbf{S} = \iiint_W 3\,dx\,dy\,dz = 3\,\text{Vol}(W)$,
that is:
\[ \text{Vol}(W) = \frac{1}{3}\iint_{\partial W} (x, y, z)\cdot d\mathbf{S}. \]
This is another instance where information about a set can be gleaned from a calculation just
involving its boundary. (Question: Thinking of the surface integral as the integral of the normal
component and keeping in mind that ∂W is oriented by the outward normal, does it make sense
that $\iint_{\partial W} (x, y, z)\cdot d\mathbf{S}$ is always positive? Note that the origin need not lie in W, so there may
be points where the orienting normal n on the boundary points towards the origin and the normal
component (x, y, z) · n is negative.)
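One way to explore the question is numerically. The Python sketch below estimates $\frac{1}{3}\iint_{\partial W} (x, y, z)\cdot\mathbf{n}\,dS$ for a ball W that does not contain the origin, and also records the smallest value of the normal component that it encounters. The center (3, 0, 0), the radius 1, the function name `volume_from_boundary`, and the grid size are hypothetical choices for illustration only.

```python
import math

def volume_from_boundary(center, a, n=200):
    """Estimate (1/3) * flux of (x, y, z) through the sphere of radius a
    about `center`, oriented by the outward normal. Also return the
    smallest normal component (x, y, z) . n seen on the sample grid."""
    dphi, dtheta = math.pi / n, 2 * math.pi / n
    total = 0.0
    min_normal_component = float("inf")
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            # Outward unit normal at the sample point.
            u = (math.sin(phi) * math.cos(theta),
                 math.sin(phi) * math.sin(theta),
                 math.cos(phi))
            point = tuple(c + a * w for c, w in zip(center, u))
            normal_component = sum(p * w for p, w in zip(point, u))
            min_normal_component = min(min_normal_component, normal_component)
            # Area element on the sphere: a^2 sin(phi) dphi dtheta.
            total += normal_component * a * a * math.sin(phi) * dphi * dtheta
    return total / 3, min_normal_component

volume, lowest = volume_from_boundary((3.0, 0.0, 0.0), 1.0)
print(volume, 4 / 3 * math.pi, lowest)
```

In this run the normal component is negative on the part of the sphere facing the origin, yet the estimate still comes out close to the volume $\frac{4}{3}\pi$ of the ball.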
Justification. We calculated and even emphasized this earlier in Example 10.2(a), which was the
first surface integral of a vector field that we found. Note that this implies that G is not a curl
field, since a curl field would integrate to 0 over the closed surface Sa .
Fact 2. ∇ · G = 0.
Justification. This is a brute force calculation. For instance:
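\begin{align*}
\frac{\partial G_1}{\partial x} &= \frac{\partial}{\partial x}\left( -\frac{x}{(x^2 + y^2 + z^2)^{3/2}} \right) \\
&= -\frac{(x^2 + y^2 + z^2)^{3/2}\cdot 1 - x\cdot\frac{3}{2}(x^2 + y^2 + z^2)^{1/2}\cdot 2x}{(x^2 + y^2 + z^2)^3} \\
&= -\frac{(x^2 + y^2 + z^2) - 3x^2}{(x^2 + y^2 + z^2)^{5/2}} \\
&= \frac{2x^2 - y^2 - z^2}{(x^2 + y^2 + z^2)^{5/2}}.
\end{align*}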
By symmetric calculations:
\[ \frac{\partial G_2}{\partial y} = \frac{2y^2 - x^2 - z^2}{(x^2 + y^2 + z^2)^{5/2}} \quad\text{and}\quad \frac{\partial G_3}{\partial z} = \frac{2z^2 - x^2 - y^2}{(x^2 + y^2 + z^2)^{5/2}}. \]
Taking the sum gives $\nabla\cdot\mathbf{G} = \frac{(2x^2 - y^2 - z^2) + (2y^2 - x^2 - z^2) + (2z^2 - x^2 - y^2)}{(x^2 + y^2 + z^2)^{5/2}} = 0$, as claimed.
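The brute force calculation can be spot-checked numerically. The following Python sketch is only a sanity check at one sample point, not a proof; the helper names, the sample point, and the step size `h` are assumptions made for illustration. It approximates the partial derivatives of the inverse square field with central differences.

```python
def G(x, y, z):
    """Inverse square field G = -(x, y, z) / (x^2 + y^2 + z^2)^(3/2)."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)


def divergence(F, p, h=1e-4):
    """Central-difference approximation of div F at the point p."""
    x, y, z = p
    dF1 = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dF2 = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dF3 = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dF1 + dF2 + dF3


def dG1_dx_formula(x, y, z):
    """Closed form for dG1/dx obtained in the text."""
    return (2 * x * x - y * y - z * z) / (x * x + y * y + z * z) ** 2.5


p = (1.0, -2.0, 0.5)   # arbitrary sample point away from the origin
h = 1e-4
dG1_dx_numeric = (G(p[0] + h, p[1], p[2])[0] - G(p[0] - h, p[1], p[2])[0]) / (2 * h)
print(divergence(G, p), dG1_dx_numeric, dG1_dx_formula(*p))
```

The first printed value should be very close to 0, and the last two should agree closely with each other.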
This fact has an interesting implication, too, for we know that curl fields have divergence 0.
This shows that the converse need not hold, since ∇ · G = 0 yet G is not a curl field by Fact 1.
Now, let $S$ be a piecewise smooth closed surface in $\mathbb{R}^3 - \{(0, 0, 0)\}$, oriented by the outward normal. Is that enough for us to be able to say something about $\iint_S \mathbf{G}\cdot d\mathbf{S}$? Strikingly, the answer is yes. There are two cases.
Case 1. Suppose that the origin 0 lies in the exterior of S (Figure 10.23).
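as before. This time, however, the boundary of $W$ consists of $S$ and $S_\epsilon$, oriented by the normal pointing away from $W$. This is the outward normal on $S$ but points inward on $S_\epsilon$. In other words, $\partial W = S \cup (-S_\epsilon)$, and therefore $\iint_{\partial W} \mathbf{G}\cdot d\mathbf{S} = \iint_S \mathbf{G}\cdot d\mathbf{S} - \iint_{S_\epsilon} \mathbf{G}\cdot d\mathbf{S}$. By equation (10.17), the difference is 0, from which it follows that $\iint_S \mathbf{G}\cdot d\mathbf{S} = \iint_{S_\epsilon} \mathbf{G}\cdot d\mathbf{S}$. And, by Fact 1, the integral of $\mathbf{G}$ over any sphere centered at the origin is $-4\pi$. Hence:
\[ \iint_S \mathbf{G}\cdot d\mathbf{S} = -4\pi. \]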
After a slight rearrangement, we can summarize the discussion as follows. For a piecewise
smooth closed surface S that doesn’t pass through the origin and is oriented by the outward
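normal:
\[ -\frac{1}{4\pi}\iint_S \mathbf{G}\cdot d\mathbf{S} = \begin{cases} 0 & \text{if } \mathbf{0} \text{ lies in the exterior of } S, \\ 1 & \text{if } \mathbf{0} \text{ lies in the interior of } S. \end{cases} \]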
As we noted back in Chapter 8, the inverse square field describes the gravitational, or electro-
static, force due to an object, or charged particle, located at the origin. If several objects of equal
mass, or equal charge, are placed at the points p1 , p2 , . . . , pk in R3 , then the total force at a point
x due to these objects is described by the vector field:
\[ \mathbf{F}(\mathbf{x}) = \sum_{i=1}^{k} \mathbf{G}(\mathbf{x} - \mathbf{p}_i), \tag{10.18} \]
where G is the same as above (10.16). It is not hard to verify that again ∇ · F = 0. A modification
of the argument just given shows that, if S is a piecewise smooth closed surface contained in the
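set $\mathbb{R}^3 - \{\mathbf{p}_1, \mathbf{p}_2, \dots, \mathbf{p}_k\}$ and oriented by the outward normal, then:
\[ -\frac{1}{4\pi}\iint_S \mathbf{F}\cdot d\mathbf{S} \ \text{ counts the number of the points } \mathbf{p}_1, \mathbf{p}_2, \dots, \mathbf{p}_k \text{ that lie inside } S. \]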
This is an instance of what is known in physics as Gauss’s law. In effect, the total mass, or charge,
contained inside S determines, and is determined by, the integral of F over S. An explicit formula
or parametrization for S is not needed to obtain this information.
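The counting statement can be tried out numerically. The Python sketch below is an illustration under assumptions of our own choosing: the surface is taken to be a sphere (so that it is easy to parametrize), the source points are arbitrary, and the function names are hypothetical. It approximates the surface integral with a midpoint rule in spherical coordinates and divides by −4π.

```python
import math


def G(x, y, z):
    """Inverse square field centered at the origin."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)


def total_field(points):
    """F(x) = sum of G(x - p_i) over the given points, as in (10.18)."""
    def F(x, y, z):
        fx = fy = fz = 0.0
        for (px, py, pz) in points:
            gx, gy, gz = G(x - px, y - py, z - pz)
            fx += gx
            fy += gy
            fz += gz
        return (fx, fy, fz)
    return F


def flux_through_sphere(F, center, radius, n=200):
    """Outward flux of F through a sphere, midpoint rule in (phi, theta)."""
    cx, cy, cz = center
    dphi = math.pi / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            # Outward unit normal at the sample point.
            nx = math.sin(phi) * math.cos(theta)
            ny = math.sin(phi) * math.sin(theta)
            nz = math.cos(phi)
            Fx, Fy, Fz = F(cx + radius * nx, cy + radius * ny, cz + radius * nz)
            dS = radius ** 2 * math.sin(phi) * dphi * dtheta
            total += (Fx * nx + Fy * ny + Fz * nz) * dS
    return total


def count_inside(points, center, radius):
    """Approximates -(1 / 4 pi) times the flux of F through the sphere."""
    return -flux_through_sphere(total_field(points), center, radius) / (4 * math.pi)


# Assumed sample configuration: four source points, none on the test spheres.
points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, -4.0, 1.0)]
print(count_inside(points, (0.0, 0.0, 0.0), 1.0))   # two points inside this sphere
print(count_inside(points, (3.0, 0.0, 0.0), 1.5))   # one point inside this sphere
```

The computed values are only approximately integers, since the integral is approximated, but rounding them recovers the number of enclosed points without ever locating the points relative to the surface directly.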
In comparison with the expanded definition (10.19), the notation in (10.20) says that, as part
of the substitutions for x, y, z in terms of s, t, one should make the substitution, for example:
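\[ dy \wedge dz = \det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial z}{\partial s} \\ \frac{\partial y}{\partial t} & \frac{\partial z}{\partial t} \end{bmatrix} ds\,dt. \]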
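Actually, this may look more familiar if we take the transpose of the matrix, which does not affect the value of the determinant:
\[ dy \wedge dz = \det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} ds\,dt. \tag{10.21} \]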
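This has the form of a change of variables, i.e., a substitution, for a transformation from the st-plane to the yz-plane. Similarly, to evaluate an integral presented in the notation of (10.20), one substitutes:
\[ dz \wedge dx = \det \begin{bmatrix} \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \\ \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \end{bmatrix} ds\,dt \quad\text{and}\quad dx \wedge dy = \det \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \end{bmatrix} ds\,dt, \tag{10.22} \]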
With this notation, the definition of the surface integral takes the form:
\[ \iint_{\sigma(D)} \eta = \iint_D \sigma^*(\eta), \tag{10.23} \]
where, on the right, an integral of the form $\iint_D f(s, t)\,ds \wedge dt$ is defined to be the usual double integral $\iint_D f(s, t)\,ds\,dt$. Equation (10.23) highlights that the definition has the same form as the change of variables theorem, continuing an emerging pattern (see (7.3) and (9.6)).
We say more in the next chapter about properties of differential forms, but our discussion so far
can be used to motivate some elementary manipulations. The substitution formulas (10.21) and
(10.22) for dy ∧ dz and the like suggest a connection between wedge products and determinants.
To reinforce this, we adopt the following rules for working with differential forms:
• dz ∧ dy = −dy ∧ dz,
dx ∧ dz = −dz ∧ dx,
dy ∧ dx = −dx ∧ dy,
• dx ∧ dx = 0,
dy ∧ dy = 0,
dz ∧ dz = 0.
The first collection mirrors the property that interchanging two rows changes the sign of a deter-
minant, and the second mirrors the property that two equal rows imply a determinant of 0. We
postpone until the next chapter the discussion of how these properties may be used.
integral after it has been expanded out in terms of coordinates as suggested by the differential form
notation, that is:
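\begin{align}
\iint_\sigma \mathbf{F}\cdot d\mathbf{S} = \iint_D \Bigg( & F_1(\sigma(s, t))\cdot\det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} + F_2(\sigma(s, t))\cdot\det \begin{bmatrix} \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \\ \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \end{bmatrix} \notag \\
& + F_3(\sigma(s, t))\cdot\det \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \end{bmatrix} \Bigg)\,ds\,dt, \tag{10.24}
\end{align}
as in equation (10.19), except that the 2 by 2 matrices have been transposed. The expression for $\iint_\tau \mathbf{F}\cdot d\mathbf{S}$ is similar, integrating over $D^*$ and using u’s and v’s instead of s’s and t’s.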
If x ∈ S, then x = σ(s, t) for some (s, t) in D and x = τ (u, v) for some (u, v) in D∗ . We write
this as (s, t) = T (u, v). This defines a function T : D∗ → D, where, given (u, v) in D∗ , T (u, v) is
the point (s, t) in D such that σ(s, t) = τ (u, v). Then, by construction:
τ (u, v) = σ(s, t) = σ(T (u, v)). (10.25)
See Figure 10.26. (If σ or τ is not one-to-one, which is permitted only on the boundaries of their
respective domains, there is some ambiguity on how to define T there, but, as we have observed
before, the integral can tolerate a certain amount of misbehavior on the boundary.)
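By equation (10.25) and the chain rule, $D\tau(u, v) = D\sigma(T(u, v))\cdot DT(u, v) = D\sigma(s, t)\cdot DT(u, v)$, or:
\[ \begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{bmatrix} = \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} \begin{bmatrix} \frac{\partial s}{\partial u} & \frac{\partial s}{\partial v} \\ \frac{\partial t}{\partial u} & \frac{\partial t}{\partial v} \end{bmatrix}. \tag{10.26} \]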
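We focus on the 2 by 2 blocks within the first two matrices of this equation. For instance, isolating the portions of the product involving the second and third rows gives:
\[ \begin{bmatrix} \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{bmatrix} = \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} \begin{bmatrix} \frac{\partial s}{\partial u} & \frac{\partial s}{\partial v} \\ \frac{\partial t}{\partial u} & \frac{\partial t}{\partial v} \end{bmatrix} = \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} \cdot DT(u, v). \]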
Since the determinant of a product is the product of the determinants—that is, det(AB) =
(det A)(det B)—we obtain:
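\[ \det \begin{bmatrix} \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{bmatrix} = \det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} \cdot \det DT(u, v). \tag{10.27} \]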
Referring to equation (10.24), the first determinant appears in the definition of $\iint_\tau \mathbf{F}\cdot d\mathbf{S}$, while the second appears in the definition of $\iint_\sigma \mathbf{F}\cdot d\mathbf{S}$. By choosing different pairs of rows in the chain rule (10.26), the same reasoning applies to the other two determinants that appear in (10.24). In each case, the determinants for τ and σ differ by a factor of det DT (u, v).
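In fact, these 2 by 2 determinants are precisely the components of the cross products $\frac{\partial\tau}{\partial u} \times \frac{\partial\tau}{\partial v}$ and $\frac{\partial\sigma}{\partial s} \times \frac{\partial\sigma}{\partial t}$, so this calculation shows that:
\[ \frac{\partial\tau}{\partial u} \times \frac{\partial\tau}{\partial v} = \left( \frac{\partial\sigma}{\partial s} \times \frac{\partial\sigma}{\partial t} \right) \cdot \det DT(u, v), \]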
where, contrary to standard practice, we have written the scalar factor det DT (u, v) on the right.
The two cross products are normal vectors to S. Since we assumed that σ is orientation-preserving,
we conclude that τ is orientation-preserving if det DT (u, v) is positive and orientation-reversing if
it is negative.
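The relation between the two cross products can be checked on a concrete case. The Python sketch below uses an assumed parametrization σ and an assumed change of parameters T, neither of which comes from the text, and approximates all partial derivatives with central differences; the helper names are hypothetical.

```python
def sigma(s, t):
    """Assumed sample parametrization of a surface (a graph over the st-plane)."""
    return (s, t, s * t + s * s)


def T(u, v):
    """Assumed sample change of parameters (s, t) = T(u, v)."""
    return (u + 2 * v, u * v)


def tau(u, v):
    """Second parametrization of the same surface: tau = sigma o T."""
    return sigma(*T(u, v))


def partials(f, a, b, h=1e-5):
    """Central-difference partial derivatives of f with respect to each argument."""
    fa = [(p - m) / (2 * h) for p, m in zip(f(a + h, b), f(a - h, b))]
    fb = [(p - m) / (2 * h) for p, m in zip(f(a, b + h), f(a, b - h))]
    return fa, fb


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


u, v = 1.0, 2.0
s, t = T(u, v)
tau_u, tau_v = partials(tau, u, v)
sig_s, sig_t = partials(sigma, s, t)
T_u, T_v = partials(T, u, v)                      # the columns of DT(u, v)
det_DT = T_u[0] * T_v[1] - T_v[0] * T_u[1]

lhs = cross(tau_u, tau_v)
rhs = tuple(c * det_DT for c in cross(sig_s, sig_t))
print(lhs, rhs, det_DT)
```

At this sample point det DT(u, v) is negative, so the two normal vectors point in opposite directions: with these choices, τ is orientation-reversing there relative to σ.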
We are ready to compare the integrals obtained using the two parametrizations. Beginning
with the expansion (10.24) with τ as parametrization and then using (10.27) and its analogues to
pull out a factor of det DT (u, v), we obtain:
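\begin{align}
\iint_\tau \mathbf{F}\cdot d\mathbf{S} &= \iint_{D^*} \Bigg( (F_1 \circ \tau)\cdot\det \begin{bmatrix} \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{bmatrix} + (F_2 \circ \tau)\cdot\det \begin{bmatrix} \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \\ \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \end{bmatrix} \notag \\
&\qquad\qquad + (F_3 \circ \tau)\cdot\det \begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{bmatrix} \Bigg)\,du\,dv \tag{10.28} \\
&= \iint_{D^*} \Bigg( (F_1 \circ \tau)\cdot\det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} + (F_2 \circ \tau)\cdot\det \begin{bmatrix} \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \\ \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \end{bmatrix} \notag \\
&\qquad\qquad + (F_3 \circ \tau)\cdot\det \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \end{bmatrix} \Bigg) \cdot \det DT(u, v)\,du\,dv. \tag{10.29}
\end{align}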
Next, note that det DT (u, v) = ±| det DT (u, v)|, where the sign depends on whether det DT (u, v) is
positive or negative or equivalently, as noted above, whether τ is orientation-preserving or reversing.
Keeping in mind that τ = σ ◦ T , we find:
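\begin{align*}
\iint_\tau \mathbf{F}\cdot d\mathbf{S} = \pm \iint_{D^*} \Bigg( & (F_1 \circ \sigma \circ T)\cdot\det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} + (F_2 \circ \sigma \circ T)\cdot\det \begin{bmatrix} \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \\ \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \end{bmatrix} \\
& + (F_3 \circ \sigma \circ T)\cdot\det \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \end{bmatrix} \Bigg) \cdot |\det DT(u, v)|\,du\,dv.
\end{align*}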
This last integral is the result of a change of variables. It is an integral over D∗ that is the
pullback of an integral over D using T as the change of variables transformation from the uv-plane
to the st-plane. The integral over D to which the change of variables is applied is:
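\[ \pm \iint_D \Bigg( (F_1 \circ \sigma)\cdot\det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} + (F_2 \circ \sigma)\cdot\det \begin{bmatrix} \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \\ \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \end{bmatrix} + (F_3 \circ \sigma)\cdot\det \begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \end{bmatrix} \Bigg)\,ds\,dt. \]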
In other words, using either σ or τ to pull back η gives the same value.
Don’t panic! The first equation follows since T was defined so that τ = σ ◦ T . The second
equation uses the fact that (σ ◦ T )∗ (η) = T ∗ (σ ∗ (η)). This is essentially the chain rule-based
calculation that brought out the factor of det DT (u, v) in going from (10.28) to (10.29), though
we would need to develop the properties of pullbacks more carefully to explain this. (To get a
taste of these properties, see Exercises 4.14–4.18 in Chapter 11.) Finally, the step going from D∗
to D uses the pullback form of the change of variables theorem, which is a refinement that takes
orientation into account and eliminates the absolute value around the determinant (see Exercise
4.20 in Chapter 11).
Though there are gaps that need to be filled in, the intended lesson here is that differential forms,
properly developed, provide a structure in which the arguments can be presented with remarkable
conciseness. Of course, the proper development may involve lengthy calculations similar to the
ones we presented, but their validity is guided by and rooted in basic principles: the chain rule and
change of variables.
1.5. Let $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ be a vector field such that $\mathbf{F}(-\mathbf{x}) = -\mathbf{F}(\mathbf{x})$ for all $\mathbf{x}$ in $\mathbb{R}^3$, and let $S$ be the sphere $x^2 + y^2 + z^2 = a^2$, oriented by the outward normal. Is it necessarily true that $\iint_S \mathbf{F}\cdot d\mathbf{S} = 0$? Explain.
2.1. Let $S$ be the portion of the plane $x + y + z = 3$ for which $0 \le x \le 1$ and $0 \le y \le 1$. Let $\mathbf{F}(x, y, z) = (2y, x, z)$. If $S$ is oriented by the upward normal, find $\iint_S \mathbf{F}\cdot d\mathbf{S}$.
2.5. Let $W$ be the solid region in $\mathbb{R}^3$ that lies inside the cylinders $x^2 + z^2 = 1$ and $y^2 + z^2 = 1$ and above the xy-plane. This solid appeared in Example 5.2 of Chapter 5, shown again in Figure 10.27. Let $S$ be the piecewise smooth surface that is the exposed part of the boundary of $W$, that is, the cylindrical surfaces but not the base, oriented by the upward normal.
Find $\iint_S \mathbf{F}\cdot d\mathbf{S}$ if $\mathbf{F}$ is the vector field $\mathbf{F}(x, y, z) = (0, 0, z)$.
3.1. Consider the vector field F(x, y, z) = (2yz, y sin z, 1 + cos z).
oriented by the upward-pointing normal. (Hint: Take advantage of what you’ve already
done.)
3.2. Consider the vector field F(x, y, z) = (ex+y − xey+z , ey+z − ex+y + yez , −ez ).
normal.
(d) Find $\iint_S \mathbf{F}\cdot d\mathbf{S}$ if $S$ is the hemisphere $x^2 + y^2 + z^2 = 4$, $z \le 0$, oriented by the outward normal.
(e) Find $\iint_S \mathbf{F}\cdot d\mathbf{S}$ if $S$ is the cylinder $x^2 + y^2 = 4$, $0 \le z \le 4$, oriented by the outward normal.
Exercises 3.3–3.6 illustrate situations where, by bringing in a surface integral, Stokes’s theorem
can be used to obtain information about line integrals that would be hard to find directly.
(a) Find ∇ × F.
(b) Let $C$ be the curve of intersection in $\mathbb{R}^3$ of the cylinder $x^2 + y^2 = 1$ and the saddle surface $z = x^2 - y^2$, oriented counterclockwise as viewed from high above the xy-plane, looking down. Use Stokes’s theorem to find the line integral $\int_C \mathbf{F}\cdot d\mathbf{s}$. (Hint: Find a surface whose boundary is $C$.)
3.5. (a) Find an example of a smooth vector field $\mathbf{F} = (F_1, F_2, F_3)$ defined on an open set $U$ of $\mathbb{R}^3$ such that $\frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}$ for all $i, j$ but there exists a piecewise smooth oriented simple closed curve $C$ in $U$ such that $\int_C F_1\,dx + F_2\,dy + F_3\,dz \ne 0$.
(b) On the other hand, show that, if $\mathbf{F}$ is any smooth vector field on an open set $U$ in $\mathbb{R}^3$ such that $\frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}$ for all $i, j$, then $\int_C F_1\,dx + F_2\,dy + F_3\,dz = 0$ for any piecewise smooth oriented simple closed curve $C$ that is the boundary of an oriented surface $S$ contained in $U$.
(c) What does part (b) tell you about the curve $C$ you found in part (a)?
3.6. Let $C_1$ and $C_2$ be piecewise smooth simple closed curves contained in the cylinder $x^2 + y^2 = 1$ that do not intersect one another, both oriented counterclockwise when viewed from high above the xy-plane looking down, as illustrated in Figure 10.28. If $\mathbf{F}$ is the vector field $\mathbf{F}(x, y, z) = (z - 2y^3, 2yz + x^5, y^2 + x)$, show that $\int_{C_1} \mathbf{F}\cdot d\mathbf{s} = \int_{C_2} \mathbf{F}\cdot d\mathbf{s}$.
4.4. This exercise ties up some loose ends left dangling after Example 10.3 back in Section 10.1.
Let $S_1$ be the hemisphere $x^2 + y^2 + z^2 = a^2$, $z \ge 0$, oriented by the outward normal, and let $S_2$ be the disk $x^2 + y^2 \le a^2$, $z = 0$, oriented by the upward normal.
Let $\mathbf{F}$ be the vector field $\mathbf{F}(x, y, z) = (0, 0, -1)$.
(a) Show that $\mathbf{F}$ is a curl field by finding a vector field $\mathbf{G}$ whose curl is $\mathbf{F}$.
(b) In Example 10.3, we showed that it is fairly easy to calculate that $\iint_{S_2} \mathbf{F}\cdot d\mathbf{S} = -\pi a^2$.
If so, find such a vector field, and justify why it works. If no such vector field exists, explain
why not.
5.1. Let $\mathbf{F}(x, y, z) = (x^3, y^3, z^3)$. Use Gauss’s theorem to find $\iint_S \mathbf{F}\cdot d\mathbf{S}$ if $S$ is the sphere $x^2 +$
W = [0, 1] × [0, 1] × [0, 1], oriented by the normal pointing away from W .
5.3. Let S be the silo-shaped closed surface consisting of:
• the cylinder $x^2 + y^2 = 4$, $0 \le z \le 3$,
• a hemispherical cap of radius 2 centered at $(0, 0, 3)$, i.e., $x^2 + y^2 + (z - 3)^2 = 4$, $z \ge 3$,
and
• a base consisting of the disk $x^2 + y^2 \le 4$, $z = 0$,
all oriented by the normal pointing away from the silo. Find $\iint_S \mathbf{F}\cdot d\mathbf{S}$ if $\mathbf{F}(x, y, z) = (3xz^2, 2y, x + y^2 - z^3)$.
5.4. One way to produce a torus is to take the circle C in the xz-plane of radius b and center
(a, 0, 0) and rotate it about the z-axis. We assume that a > b. The surface that is swept out
is a torus. The circle C can be parametrized by α(ψ) = (a + b cos ψ, 0, b sin ψ), 0 ≤ ψ ≤ 2π.
The distance of α(ψ) to the z-axis is a + b cos ψ, so rotating α(ψ) counterclockwise by θ about
the z-axis brings it to the point $\big((a + b\cos\psi)\cos\theta, (a + b\cos\psi)\sin\theta, b\sin\psi\big)$. This gives the following parametrization of the torus with ψ and θ as parameters, illustrated in Figure 10.29:
\[ \sigma(\psi, \theta) = \big((a + b\cos\psi)\cos\theta, (a + b\cos\psi)\sin\theta, b\sin\psi\big), \quad 0 \le \psi \le 2\pi, \quad 0 \le \theta \le 2\pi. \]
Let S denote the resulting torus, and assume that it is oriented by the unit normal vector
that points towards the exterior of the torus.
(a) Show that:
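\[ \frac{\partial\sigma}{\partial\psi} \times \frac{\partial\sigma}{\partial\theta} = -b(a + b\cos\psi)\big(\cos\psi\cos\theta, \cos\psi\sin\theta, \sin\psi\big). \]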
Figure 10.30: A solid region W lying between the planes z = a and z = b whose boundary consists
of a surface S, a top T , and a bottom B (left) and a lumpy shopping bag of volume 75 (right)
5.7. A shopping bag is modeled as a lumpy surface $S$ having an open circular top. Suppose that the bag is placed in $\mathbb{R}^3$ in such a way that the boundary of the opening is the circle $x^2 + y^2 = 4$ in the plane $z = 8$. See Figure 10.30, right. Assume that, when filled exactly to the brim, the bag has a volume of 75.
Let $\mathbf{F}(x, y, z) = (x, y, z)$. If $S$ is oriented by the outward normal, use Gauss’s theorem to find $\iint_S \mathbf{F}\cdot d\mathbf{S}$. (Note that $S$ is not a closed surface.)
5.10. Let U and W be as in the preceding exercise. Show that, if h1 and h2 are harmonic on U
and if h1 = h2 at all points of ∂W , then h1 = h2 on all of W . Thus a harmonic function is
determined on W by its values on the boundary.
Section 6 The inverse square field
6.1. Let $S_a$ denote the sphere $x^2 + y^2 + z^2 = a^2$ of radius $a$, oriented by the outward normal. If we did not already know the answer, it might be tempting to calculate the integral of the inverse square field $\mathbf{G}(x, y, z) = \Big( -\frac{x}{(x^2 + y^2 + z^2)^{3/2}}, -\frac{y}{(x^2 + y^2 + z^2)^{3/2}}, -\frac{z}{(x^2 + y^2 + z^2)^{3/2}} \Big)$ over $S_a$ by filling in the sphere with the solid ball $W_a$ given by $x^2 + y^2 + z^2 \le a^2$ and using Gauss’s theorem. Unfortunately, this is not valid (and would give an incorrect answer), because $W_a$ contains the origin and the origin is not in the domain of $\mathbf{G}$. On $S_a$, however, $\mathbf{G}$ agrees with the vector field $\mathbf{F}(x, y, z) = -\frac{1}{a^3}(x, y, z)$, and $\mathbf{F}$ is defined on all of $\mathbb{R}^3$.
Find $\iint_{S_a} \mathbf{G}\cdot d\mathbf{S}$ by replacing $\mathbf{G}$ with $\mathbf{F}$ and applying Gauss’s theorem.
6.2. Let F be a vector field defined on all of R3 , except at the two points p = (2, 0, 0) and
q = (−2, 0, 0). Let S1 , S2 , and S be the following spheres, centered at (2, 0, 0), (−2, 0, 0), and
(0, 0, 0), respectively, each oriented by the outward normal:
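$S_1$: $(x - 2)^2 + y^2 + z^2 = 1$
$S_2$: $(x + 2)^2 + y^2 + z^2 = 1$
$S$: $x^2 + y^2 + z^2 = 25$.
Assume that $\nabla\cdot\mathbf{F} = 0$. If $\iint_{S_1} \mathbf{F}\cdot d\mathbf{S} = 5$ and $\iint_{S_2} \mathbf{F}\cdot d\mathbf{S} = 6$, what is $\iint_S \mathbf{F}\cdot d\mathbf{S}$?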
6.3. Give the details of the justification of the instance of Gauss’s law cited in the text. That is, if $S$ is a piecewise smooth closed surface in $\mathbb{R}^3 - \{\mathbf{p}_1, \mathbf{p}_2, \dots, \mathbf{p}_k\}$, oriented by the outward normal, and $\mathbf{F}(\mathbf{x}) = \sum_{i=1}^{k} \mathbf{G}(\mathbf{x} - \mathbf{p}_i)$ is the vector field of (10.18), show that:
\[ -\frac{1}{4\pi}\iint_S \mathbf{F}\cdot d\mathbf{S} \ \text{ equals the number of the points } \mathbf{p}_1, \mathbf{p}_2, \dots, \mathbf{p}_k \text{ that lie inside } S. \]
6.4. Suppose that objects of mass m1 , m2 , . . . , mk , not necessarily equal, are located at the points
p1 , p2 , . . . , pk , respectively, in R3 . The gravitational force that they exert on another object
whose position is x is described by the vector field:
\[ \mathbf{F}(\mathbf{x}) = \sum_{i=1}^{k} m_i\,\mathbf{G}(\mathbf{x} - \mathbf{p}_i), \]
where $\mathbf{G}$ is our usual inverse square field (10.16). Let $S$ be a piecewise smooth closed surface in $\mathbb{R}^3 - \{\mathbf{p}_1, \mathbf{p}_2, \dots, \mathbf{p}_k\}$, oriented by the outward normal. What can you say about $-\frac{1}{4\pi}\iint_S \mathbf{F}\cdot d\mathbf{S}$ in this case?
7.1. Let $S$ be the unit sphere $x^2 + y^2 + z^2 = 1$, oriented by the outward normal. Consider the surface integral:
\[ \iint_S xy\,dy \wedge dz + yz\,dz \wedge dx + xz\,dx \wedge dy. \]
7.2. Suppose that a surface $S$ in $\mathbb{R}^3$ is the graph $z = f(x, y)$ of a smooth real-valued function $f : D \to \mathbb{R}$ whose domain $D$ is a bounded subset of $\mathbb{R}^2$. Let $S$ be oriented by the upward normal. Show that:
\[ \iint_S 1\,dx \wedge dy = \mathrm{Area}(D). \]
In other words, the integral is the area of the projection of $S$ on the xy-plane. Similar interpretations hold for $\iint_S 1\,dy \wedge dz$ and $\iint_S 1\,dz \wedge dx$ under analogous hypotheses.
Chapter 11
This final chapter is a whirlwind survey of how to work with differential forms. We focus particularly
on what differential forms have to say about some of the main theorems we have learned recently.
In what follows, all functions, including vector fields, are assumed to be smooth.
Our goal is to present just enough of the properties of differential forms to indicate how forms
enter the picture as far as these theorems are concerned. Further information is developed in the
exercises.
As indicated on the left of equation (11.1), we won’t use multiple integral signs with differential
forms. The dimension of the integral can be inferred from the type of form that is being
integrated.
• 0-forms. A 0-form is simply another name for a real-valued function $f$. 0-forms are integrated over 0-dimensional domains, i.e., individual points, by evaluating the function at that point, where we orient a point by attaching a + or − sign. So $\int_{+\{\mathbf{p}\}} f = f(\mathbf{p})$ and $\int_{-\{\mathbf{p}\}} f = -f(\mathbf{p})$. If $C$ is an oriented curve from $\mathbf{p}$ to $\mathbf{q}$, we define its oriented boundary to be the oriented points $\partial C = (+\{\mathbf{q}\}) \cup (-\{\mathbf{p}\})$.
In this notation, the four theorems (C), (Gr), (S), and (Ga) translate as integrals of forms as
follows.
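2. Green’s theorem.
\[ \int_D \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx \wedge dy = \int_{\partial D} F_1\,dx + F_2\,dy \tag{Gr} \]
3. Stokes’s theorem.
\begin{align}
& \int_S \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} \right) dy \wedge dz + \left( \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x} \right) dz \wedge dx + \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx \wedge dy \notag \\
& \qquad = \int_{\partial S} F_1\,dx + F_2\,dy + F_3\,dz \tag{S}
\end{align}
4. Gauss’s theorem.
\[ \int_W \left( \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right) dx \wedge dy \wedge dz = \int_{\partial W} F_1\,dy \wedge dz + F_2\,dz \wedge dx + F_3\,dx \wedge dy \tag{Ga} \]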
1. Arithmetic. These are the rules we encountered earlier that reflect a connection between
differential forms and determinants (see the end of Section 10.7):
(a) dy ∧ dx = −dx ∧ dy, dz ∧ dy = −dy ∧ dz, dx ∧ dz = −dz ∧ dx,
(b) dx ∧ dx = dy ∧ dy = dz ∧ dz = 0.
These rules have the effect of simplifying what differential forms can look like. For instance, a
2-form should be a generalized double integrand. In R3 , such an expression would have the form:
f1 dx ∧ dx + f2 dx ∧ dy + f3 dx ∧ dz + f4 dy ∧ dx + · · · + f9 dz ∧ dz,
where f1 , f2 , . . . , f9 are real-valued functions. According to the rules, however, three of the nine
terms—the ones involving dx ∧ dx, dy ∧ dy, and dz ∧ dz—are zero and therefore can be omitted.
Similarly, the remaining six terms can be combined in pairs: for instance, f2 dx ∧ dy + f4 dy ∧ dx =
f2 dx ∧ dy − f4 dx ∧ dy = (f2 − f4 ) dx ∧ dy. Hence we may assume that any 2-form on a subset of
R3 is a sum of three terms having the form η = F1 dy ∧ dz + F2 dz ∧ dx + F3 dx ∧ dy.
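The bookkeeping in this reduction is mechanical enough to sketch in code. The short Python example below is an illustration only: it treats the coefficients as numbers (say, the values of $f_1, \dots, f_9$ at one point), and the names `STANDARD` and `reduce_two_form` are hypothetical.

```python
# Sign of each ordered pair relative to the standard 2-forms
# dy ^ dz, dz ^ dx, dx ^ dy (rule (a): swapping the factors changes the sign).
STANDARD = {("y", "z"): 1, ("z", "y"): -1,
            ("z", "x"): 1, ("x", "z"): -1,
            ("x", "y"): 1, ("y", "x"): -1}


def reduce_two_form(coeffs):
    """Collapse coefficients of da ^ db (a, b among x, y, z) into F1, F2, F3.

    `coeffs` maps ordered pairs such as ("y", "x") to numbers.  The result
    maps ("y", "z"), ("z", "x"), ("x", "y") to F1, F2, F3, respectively.
    """
    result = {("y", "z"): 0, ("z", "x"): 0, ("x", "y"): 0}
    for (a, b), f in coeffs.items():
        if a == b:
            continue  # rule (b): dx ^ dx = dy ^ dy = dz ^ dz = 0
        sign = STANDARD[(a, b)]
        key = (a, b) if sign == 1 else (b, a)
        result[key] += sign * f
    return result


# f1 dx^dx + f2 dx^dy + f4 dy^dx with f1 = 7, f2 = 5, f4 = 2
reduced = reduce_two_form({("x", "x"): 7, ("x", "y"): 5, ("y", "x"): 2})
print(reduced)
```

In the sample call, the dx ∧ dx term disappears and the dx ∧ dy coefficient comes out as $f_2 - f_4$, as in the computation above.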
In (b) and (c), F1 , F2 , and F3 are 0-forms, so their derivatives are as defined in (a).
Example 11.1. Let x = r cos θ and y = r sin θ. These are real-valued functions of r and θ—in
other words, they are 0-forms—so by definition:
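\begin{align*}
dx &= \frac{\partial x}{\partial r}\,dr + \frac{\partial x}{\partial\theta}\,d\theta = \cos\theta\,dr - r\sin\theta\,d\theta \\
\text{and}\quad dy &= \frac{\partial y}{\partial r}\,dr + \frac{\partial y}{\partial\theta}\,d\theta = \sin\theta\,dr + r\cos\theta\,d\theta.
\end{align*}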
Then, by the arithmetic rules:
Example 11.2. Let σ(s, t) = (x(s, t), y(s, t), z(s, t)) be a smooth parametrization of a surface S.
Then x, y, and z are each real-valued functions of s and t, so, for example:
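\[ dy = \frac{\partial y}{\partial s}\,ds + \frac{\partial y}{\partial t}\,dt \quad\text{and}\quad dz = \frac{\partial z}{\partial s}\,ds + \frac{\partial z}{\partial t}\,dt. \]
It follows that:
\begin{align*}
dy \wedge dz &= \left( \frac{\partial y}{\partial s}\,ds + \frac{\partial y}{\partial t}\,dt \right) \wedge \left( \frac{\partial z}{\partial s}\,ds + \frac{\partial z}{\partial t}\,dt \right) \\
&= 0 + \frac{\partial y}{\partial s}\frac{\partial z}{\partial t}\,ds \wedge dt + \frac{\partial y}{\partial t}\frac{\partial z}{\partial s}\,dt \wedge ds + 0 \\
&= \left( \frac{\partial y}{\partial s}\frac{\partial z}{\partial t} - \frac{\partial y}{\partial t}\frac{\partial z}{\partial s} \right) ds \wedge dt \\
&= \det \begin{bmatrix} \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{bmatrix} ds \wedge dt.
\end{align*}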
This is the same as the expression for dy ∧ dz that we obtained when we discussed the substitutions
used to compute surface integrals, as in equation (10.21) of Chapter 10. Perhaps the calculation
given here is a more organic way of obtaining the substitution. Either way, the formula is consistent
with what the change of variables theorem says to do to pull back an integral with respect to y and
z to one with respect to s and t.
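The calculation in Example 11.2 boils down to one rule for 1-forms in two parameters: the wedge of $a\,ds + b\,dt$ with $c\,ds + d\,dt$ is $(ad - bc)\,ds \wedge dt$. The Python sketch below applies that rule to the polar coordinate 1-forms of Example 11.1 at one sample point. The function name `wedge`, the representation of a 1-form as a pair of coefficients, and the sample values of r and θ are assumptions made for illustration.

```python
import math


def wedge(omega1, omega2):
    """Coefficient of ds ^ dt in (a ds + b dt) ^ (c ds + d dt), namely ad - bc.

    Each 1-form is given as the pair of its coefficients evaluated at a point.
    """
    a, b = omega1
    c, d = omega2
    return a * d - b * c


# Example 11.1 with (s, t) = (r, theta), evaluated at an assumed sample point.
r, theta = 2.0, 0.7
dx = (math.cos(theta), -r * math.sin(theta))   # dx = cos(theta) dr - r sin(theta) dtheta
dy = (math.sin(theta), r * math.cos(theta))    # dy = sin(theta) dr + r cos(theta) dtheta

print(wedge(dx, dy))   # coefficient of dr ^ dtheta in dx ^ dy
print(wedge(dy, dx))   # opposite sign, as rule (a) requires
print(wedge(dx, dx))   # zero, as rule (b) requires
```

At every point the coefficient of dr ∧ dθ in dx ∧ dy works out to r, since $r\cos^2\theta + r\sin^2\theta = r$, which is the familiar factor in the change of variables to polar coordinates.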
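Similarly:
\begin{align*}
d(F_2\,dy) &= \frac{\partial F_2}{\partial x}\,dx \wedge dy - \frac{\partial F_2}{\partial z}\,dy \wedge dz \\
\text{and}\quad d(F_3\,dz) &= -\frac{\partial F_3}{\partial x}\,dz \wedge dx + \frac{\partial F_3}{\partial y}\,dy \wedge dz.
\end{align*}
\begin{align*}
d(F_1\,dx + F_2\,dy + F_3\,dz) = {} & \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} \right) dy \wedge dz \\
& + \left( \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x} \right) dz \wedge dx + \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx \wedge dy,
\end{align*}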
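Likewise:
\[ d(F_1\,dy \wedge dz) = \frac{\partial F_1}{\partial x}\,dx \wedge dy \wedge dz \quad\text{and}\quad d(F_3\,dx \wedge dy) = \frac{\partial F_3}{\partial z}\,dx \wedge dy \wedge dz. \]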
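For comparison, here is a sketch of the same kind of computation for a term of the form $F_2\,dz \wedge dx$. It assumes, in line with the two formulas just stated, that the derivative is found by replacing $F_2$ with $dF_2$ and then applying the arithmetic rules.

```latex
\begin{align*}
d(F_2\,dz \wedge dx)
  &= \left( \frac{\partial F_2}{\partial x}\,dx + \frac{\partial F_2}{\partial y}\,dy
     + \frac{\partial F_2}{\partial z}\,dz \right) \wedge dz \wedge dx \\
  &= \frac{\partial F_2}{\partial y}\,dy \wedge dz \wedge dx
     && \text{(terms with a repeated $dx$ or $dz$ are 0)} \\
  &= -\frac{\partial F_2}{\partial y}\,dy \wedge dx \wedge dz
     && \text{(interchange $dz$ and $dx$)} \\
  &= \frac{\partial F_2}{\partial y}\,dx \wedge dy \wedge dz
     && \text{(interchange $dy$ and $dx$)}.
\end{align*}
```

Two interchanges are needed to bring the factors into the order dx ∧ dy ∧ dz, so the sign is unchanged. This is why the natural middle term of a 2-form is written with dz ∧ dx rather than dx ∧ dz.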
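1. 0-forms
(a) $f = f(x)$: \quad $df = f'(x)\,dx$
(b) $f = f(x, y)$: \quad $df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$
(c) $f = f(x, y, z)$: \quad $df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz$
2. 1-forms
(a) $\omega = F_1(x, y)\,dx + F_2(x, y)\,dy$: \quad $d\omega = \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx \wedge dy$
(b) $\omega = F_1(x, y, z)\,dx + F_2(x, y, z)\,dy + F_3(x, y, z)\,dz$: \quad $d\omega = \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} \right) dy \wedge dz + \left( \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x} \right) dz \wedge dx + \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx \wedge dy$
3. 2-forms
$\eta = F_1\,dy \wedge dz + F_2\,dz \wedge dx + F_3\,dx \wedge dy$: \quad $d\eta = \left( \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right) dx \wedge dy \wedge dz$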
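2. Green’s theorem.
\[ \int_D \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx \wedge dy = \int_{\partial D} F_1\,dx + F_2\,dy \tag{Gr} \]
3. Stokes’s theorem.
\begin{align}
& \int_S \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} \right) dy \wedge dz + \left( \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x} \right) dz \wedge dx + \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx \wedge dy \notag \\
& \qquad = \int_{\partial S} F_1\,dx + F_2\,dy + F_3\,dz \tag{S}
\end{align}
4. Gauss’s theorem.
\[ \int_W \left( \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right) dx \wedge dy \wedge dz = \int_{\partial W} F_1\,dy \wedge dz + F_2\,dz \wedge dx + F_3\,dx \wedge dy \tag{Ga} \]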
They’re all the same theorem! But there’s even more to it, for suppose we go even further back, to first-year calculus and a real-valued function $f(x)$ of one variable. The fundamental theorem of calculus says that $\int_a^b f'(x)\,dx = f(b) - f(a)$. Denoting the interval $[a, b]$ by $I$ and converting to differential form notation, this becomes:
\[ \int_I df = \int_{\partial I} f. \]
In other words, the four theorems (C), (Gr), (S), and (Ga) of multivariable calculus are general-
izations of something that we’ve known for a long time.
The pattern continues in higher dimensions and with more variables. One studies subsets of Rn
that are k-dimensional in the sense that they can be traced out using k independent parameters.
In other words, they are the images of smooth functions of the form ψ : D → Rn , where D ⊂ Rk .
These objects are called k-dimensional manifolds. Differential k-forms can be integrated over
them, or at least over the ones for which there is a notion of orientation.
A typical $k$-form in $\mathbb{R}^n$ is a $k$-dimensional integrand, that is, a sum of terms of the form $f\,dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k}$, where $f$ is a real-valued function of $n$ variables $x_1, x_2, \dots, x_n$. By the arithmetic rules for differential forms, we may assume that the subscripts $i_1, i_2, \dots, i_k$ are distinct and in fact that they are arranged, say, in increasing order $i_1 < i_2 < \cdots < i_k$. If $\zeta$ is a $k$-form and $M$ is an oriented $k$-dimensional manifold in $\mathbb{R}^n$ with an orientation-preserving parametrization $\psi : D \to \mathbb{R}^n$ as above, then the integral $\int_M \zeta$ is defined using substitution and the rules of forms to pull the integral back to an ordinary Riemann integral over the $k$-dimensional parameter domain $D$ in $\mathbb{R}^k$. (See the remarks before Exercise 4.14 at the end of the chapter for more about the pullback process.) In the end, the relation between the integral and its pullback is summarized by the formula:
\[ \int_M \zeta = \int_D \psi^*(\zeta). \tag{11.2} \]
That the value of the integral does not depend on the parametrization rests ultimately on the
change of variables theorem and the chain rule. In fact, you have seen the proof, at least in outline
form. It is (10.30).
Perhaps it is worth reinforcing the point that equation (11.2) agrees with the expressions for
line integrals and surface integrals that we obtained in the previous two chapters. See (9.6) and
(10.23). In particular, line and surface integrals fit in as two special cases of a general type of
integral, something that may not have been clear from the original motivations in terms of tangent
and normal components of vector fields.
The general theory has a striking coherence and consistency. The culmination may be a theorem
that describes how information over the entire manifold is related to behavior along the boundary.
The boundary of a k-dimensional manifold M , denoted ∂M , is (k − 1)-dimensional, as was the case
with curves (k = 1) and surfaces (k = 2). If ω is a differential (k − 1)-form, then this theorem,
which is known as Stokes’s theorem, states that:
\[ \int_M d\omega = \int_{\partial M} \omega. \]
4.7. $f = x^2 + y^2$
ζ = F1 dx2 ∧ dx3 ∧ dx4 + F2 dx1 ∧ dx3 ∧ dx4 + F3 dx1 ∧ dx2 ∧ dx4 + F4 dx1 ∧ dx2 ∧ dx3 ,
where F1 , F2 , F3 , and F4 are real-valued functions. The notation is arranged so that, in the
term with leading factor Fi , the differential dxi is omitted.
As a 4-form on R4 , dζ has the form dζ = f dx1 ∧ dx2 ∧ dx3 ∧ dx4 for some real-valued function
f . Find a formula for f in terms of F1 , F2 , F3 , and F4 .
We lay out more formally some of the rules that govern substitutions and pullbacks. In Rm
with coordinates x1 , x2 , . . . , xm , a typical differential k-form is a sum of terms of the form:
γ ∗ (f ) = f ◦ γ.
γ ∗ (dxi ) = dγi ,
where the expression on the right is the derivative of the 0-form γi . For instance, if (x1 , x2 ) =
γ(u1 , u2 ) = (u1 cos u2 , u1 sin u2 ), then γ ∗ (dx1 ) = dγ1 = d(u1 cos u2 ) = cos u2 du1 − u1 sin u2 du2 .
Applying these substitutions en masse, the k-form ζ = f dxi1 ∧ dxi2 ∧ · · · ∧ dxik of equation (11.3)
pulls back as:
γ ∗ (ζ) = (f ◦ γ) dγi1 ∧ dγi2 ∧ · · · ∧ dγik .
Perhaps a slicker way of writing this is:
If ζ = ζ1 + ζ2 is a sum of k-forms, then carrying out the substitutions in both summands gives the
rule γ ∗ (ζ1 + ζ2 ) = γ ∗ (ζ1 ) + γ ∗ (ζ2 ).
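As a worked illustration of these rules, take the polar coordinate map $\gamma(u_1, u_2) = (u_1\cos u_2, u_1\sin u_2)$ from the example above and pull back the 2-form $dx_1 \wedge dx_2$. The choice of this particular 2-form is ours, made only to show the substitutions in action.

```latex
\begin{align*}
\gamma^*(dx_1) &= d(u_1\cos u_2) = \cos u_2\,du_1 - u_1\sin u_2\,du_2, \\
\gamma^*(dx_2) &= d(u_1\sin u_2) = \sin u_2\,du_1 + u_1\cos u_2\,du_2, \\
\gamma^*(dx_1 \wedge dx_2)
  &= (\cos u_2\,du_1 - u_1\sin u_2\,du_2) \wedge (\sin u_2\,du_1 + u_1\cos u_2\,du_2) \\
  &= u_1\cos^2 u_2\,du_1 \wedge du_2 - u_1\sin^2 u_2\,du_2 \wedge du_1 \\
  &= u_1(\cos^2 u_2 + \sin^2 u_2)\,du_1 \wedge du_2 \\
  &= u_1\,du_1 \wedge du_2.
\end{align*}
```

The terms involving $du_1 \wedge du_1$ and $du_2 \wedge du_2$ drop out, and the surviving coefficient $u_1$ is the determinant of the derivative matrix of γ.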
Exercises 4.14–4.18 are intended to provide some practice working with pullbacks.
4.14. Let α : R → R3 be given by α(t) = (cos t, sin t, t), where we think of R3 as xyz-space.
4.15. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the function from the uv-plane to the xy-plane given by $T(u, v) = (u^2 - v^2, 2uv)$.
4.16. Let σ : R2 → R3 be the function from the uv-plane to xyz-space given by σ(u, v) = (u cos v,
u sin v, u).
4.17. Let α : R → Rn be a smooth function, where α(t) = (x1 (t), x2 (t), . . . , xn (t)). If f : Rn → R is
a smooth real-valued function, f = f (x1 , x2 , . . . , xn ), then df is a 1-form on Rn . Show that:
d(α∗ (f )) = α∗ (df ).
4.18. Let T : R2 → R2 be a smooth function, regarded as a map from the uv-plane to the xy-plane,
so (x, y) = T (u, v) = (T1 (u, v), T2 (u, v)).
(a) Find T ∗ (dx) and T ∗ (dy) in terms of the partial derivatives of T1 and T2 .
(b) Let η be a 2-form on the xy-plane: η = f dx ∧ dy, where f is a real-valued function of x
and y. Show that:
T ∗ (η) = (f ◦ T ) · det DT du ∧ dv.
(c) Let S : R2 → R2 be a smooth function, regarded as a map from the st-plane to the
uv-plane. Then the composition T ◦ S : R2 → R2 goes from the st-plane to the xy-plane.
Let η = f dx ∧ dy, as in part (b). Show that:
i.e., (T ◦ S)∗ = S ∗ ◦ T ∗ for 2-forms on R2 . (Actually, the same relation holds for
compositions and k-forms in general.)
We next formulate a version of the change of variables theorem for double integrals that is
adapted for differential forms, something we have alluded to before. We are not trying to give
another proof of the theorem. Rather, we take the theorem as known and describe how to restate
it in the language of differential forms.
Differential forms are integrated over oriented domains, so, given a bounded subset D of R2 , we
first make a choice whether to assign it the positive orientation or the negative orientation. If this
seems too haphazard, think of it as analogous to choosing whether, as a subset of R3 , D is oriented
by the upward normal n = k or the downward normal n = −k.
A typical 2-form on D has the form f dx ∧ dy. We define its integral over D by:
\[ \int_D f\,dx \wedge dy = \begin{cases} \iint_D f(x, y)\,dx\,dy & \text{if } D \text{ has the positive orientation,} \\ -\iint_D f(x, y)\,dx\,dy & \text{if } D \text{ has the negative orientation,} \end{cases} \]
where, in each case, the integral on the right is the ordinary Riemann double integral. So, for instance, if $D$ has the negative orientation, then we could also say that, as an integral of a differential form, $\iint_D f\,dy \wedge dx$ is equal to the Riemann integral $\iint_D f(x, y)\,dx\,dy$.
Exercises 4.19–4.20 concern how to express the change of variables theorem in terms of integrals
of differential forms.
4.19. We say that a basis $(\mathbf{v}, \mathbf{w})$ of $\mathbb{R}^2$ has the positive orientation if $\det \begin{bmatrix} \mathbf{v} & \mathbf{w} \end{bmatrix} > 0$ and the negative orientation if $\det \begin{bmatrix} \mathbf{v} & \mathbf{w} \end{bmatrix} < 0$. Here, we use $\begin{bmatrix} \mathbf{v} & \mathbf{w} \end{bmatrix}$ to denote the 2 by 2 matrix whose columns are $\mathbf{v}$ and $\mathbf{w}$. In Section 2.5 of Chapter 2, we used the terms “right-handed” and “left-handed” to describe the same concepts, though there we put the vectors into the rows of the matrix, i.e., we worked with the transpose, which doesn’t affect the determinant.
Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be a linear transformation represented by a matrix $A$ with respect to the standard bases. Thus $L(\mathbf{x}) = A\mathbf{x}$. In addition, assume that $\det A \ne 0$.
(a) Show that $\big(L(\mathbf{e}_1), L(\mathbf{e}_2)\big)$ has the positive orientation if $\det A > 0$ and the negative orientation if $\det A < 0$.
(b) More generally, let $(\mathbf{v}, \mathbf{w})$ be any basis of $\mathbb{R}^2$. Show that $(\mathbf{v}, \mathbf{w})$ and $\big(L(\mathbf{v}), L(\mathbf{w})\big)$ have the same orientation if $\det A > 0$ and opposite orientations if $\det A < 0$.
4.20. Let D∗ and D be bounded subsets of R2 , and let T : D∗ → D be a smooth function that maps
D∗ onto D and that is one-to-one, except possibly on the boundary of D∗ . As in the change of
variables theorem, we think of T as a function from the uv-plane to the xy-plane. Moreover,
assume that det DT (u, v) has the same sign at all points (u, v) of D∗ , except possibly on the
boundary of D∗ , where det DT (u, v) = 0 is allowed.
Let D∗ be given an orientation, and let D have the orientation imposed on it by T . That is, let
T (D∗ ) denote the set D with the same orientation, positive or negative, as D∗ if det DT > 0
and the opposite orientation from D∗ if det DT < 0. In the first case, we say that T is
orientation-preserving and, in the second, that it is orientation-reversing. In other words, we
categorize how T acts on orientation based on the behavior of its first-order approximation.
Let η = f dx ∧ dy be a 2-form on D. Use the change of variables theorem to prove that:
\[
\int_{T(D^*)} \eta = \int_{D^*} T^*(\eta).
\]
This equation has appeared in various guises throughout the book. You may assume in your
proof that D∗ has the positive orientation. The arguments are similar if it has the negative
orientation. (Hint: See Exercise 4.18.)
The remaining exercises call for speculation rather than proofs. You are asked to propose
reasonable solutions consistent with patterns that have come before. It is considered a bonus if
what you say is true. The correct answers are known and are covered in more advanced courses that
study manifolds, for instance, advanced real analysis or perhaps differential geometry or differential
topology.
(a) Write down the general form of a differential form η in the variables x1 , x2 , x3 , x4 that
could be integrated over S. (Hint: Six terms.)
(b) Let σ : D → R⁴ be an orientation-preserving parametrization of S. The integral $\int_S \eta$
should be defined via the parametrization as a double integral $\iint_D f(s,t)\,ds\,dt$ over the
parameter domain D. Propose a formula for the function f.
4.22. Continuing with the preceding problem, assume that it is also known how to orient the boundary
∂S of an oriented surface S in R⁴ in a reasonable way. Propose an expression µ that makes
the following statement true:
\[
\int_{\partial S} F_1\,dx_1 + F_2\,dx_2 + F_3\,dx_3 + F_4\,dx_4 = \int_S \mu.
\]
where the volume on the left refers to four-dimensional volume. More than one answer for ζ
is possible. Try to find one with as much symmetry in the variables x1 , x2 , x3 , x4 as you can.
Answers to selected exercises
5.1. (a) −3
(b) $\|x\| = \sqrt{6}$, $\|y\| = 3\sqrt{2}$
(c) $\arccos\bigl(-\frac{1}{2\sqrt{3}}\bigr)$
6.1. 23
6.3. 0
6.5. 12
1.1. [Figure]
1.3. [Figure]
1.5.
1.7.
1.15. α(t) = (1 − t, 0, t)
1.17. (a) $\frac{v\cdot(p-a)}{v\cdot v}$
(b) $a + \frac{v\cdot(p-a)}{v\cdot v}\,v$
2.5. $\alpha(t) = \bigl(1 + \frac{1}{\sqrt{14}}\,t,\ 1 + \frac{2}{\sqrt{14}}\,t,\ 1 + \frac{3}{\sqrt{14}}\,t\bigr)$
9.1. (a) $\sqrt{2}$
(b) $\frac{1}{\sqrt{2}}(-\sin t, -\cos t, 1)$
(f) $-\frac{1}{2}$
(g) a = 1, b = −1. Both α and β trace out the same helix but in opposite directions. The
rotation T (x, y, z) = (x, −y, −z) about the x-axis by π transforms one parametrization
into the other.
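9.3. (a) $2\sqrt{2}$
(b) $\frac{1}{2\sqrt{2}}\bigl(1 + \sqrt{3}\cos t,\ \sqrt{3} - \cos t,\ -2\sin t\bigr)$
(c) $\frac{1}{2}\bigl(-\sqrt{3}\sin t,\ \sin t,\ -2\cos t\bigr)$
(d) $\frac{1}{2\sqrt{2}}\bigl(1 - \sqrt{3}\cos t,\ \sqrt{3} + \cos t,\ 2\sin t\bigr)$
(e) $\frac{1}{4}$
(f) $\frac{1}{4}$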
(g) It’s congruent to the helix parametrized by β(t) = (2 cos t, 2 sin t, 2t).
9.7. $\frac{1}{7}\sqrt{\frac{19}{14}}$
9.9. (a) $\frac{\sqrt{2}}{2\,(1+\sin^2 t)^{3/2}}$
(b) τ (t) = 0
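9.13. (a) $\mathbf{T}(t) = -\frac{\sqrt{2}}{2}(1, \sin 2t, \cos 2t)$
$\mathbf{N}(t) = (0, -\cos 2t, \sin 2t)$
$\kappa(t) = \sqrt{2}$
$\tau(t) = \sqrt{2}$
(b) $\alpha(t) = \bigl(-\frac{\sqrt{2}}{2}\,t,\ -\frac{\sqrt{2}}{4} + \frac{\sqrt{2}}{4}\cos 2t,\ -\frac{\sqrt{2}}{4}\sin 2t\bigr)$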
1.1. (b) [Figure]
(c) [Figure]
1.3. (b) [Figure]
(c) [Figure]
1.5. (b) [Figure]
(c) [Figure]
1.7. (b) [Figure]
(c) [Figure]
2.1.
2.3.
3.1. 4x + 5y + 6z = 32
3.5. Perpendicular
3.7. 6x + 3y + 2z = 6
3.9. 2x + z = 3
3.11. ( 83 , 31 , 31 )
3.13. (b) It’s the triangle with vertices (1, 0, 0), (0, 1, 0), and (0, 0, 1).
3.15. (c) $\frac{4}{\sqrt{3}}$
4.1. Let a = (c, d) be a point of U, and let r = min{c, 1 − c, d, 1 − d}. Then B(a, r) ⊂ U, so U is
an open set.
4.3. The point a = (1, 0) is in U, but no open ball centered at a stays within U. Thus U is not
an open set.
6.1. f (x, y) = (1, 2) · (x, y) = x + 2y. We know that the projections x and y are continuous, so, as
an algebraic combination of continuous functions, f is continuous as well.
7.1. (a) By the triangle inequality, ‖v‖ = ‖(v − w) + w‖ ≤ ‖v − w‖ + ‖w‖, so ‖v‖ − ‖w‖ ≤ ‖v − w‖.
(b) Reversing the roles of v and w gives ‖w‖ − ‖v‖ ≤ ‖w − v‖ = ‖v − w‖, or −(‖v‖ − ‖w‖) ≤
‖v − w‖. Hence ±(‖v‖ − ‖w‖) ≤ ‖v − w‖, so |‖v‖ − ‖w‖| ≤ ‖v − w‖.
7.4. First, note that |cf(x) − cf(a)| = |c| |f(x) − f(a)| ≤ (|c| + 1) |f(x) − f(a)|. Now, let ε > 0
be given. Since f is continuous at a, there exists a δ > 0 such that, if ‖x − a‖ < δ,
then |f(x) − f(a)| < ε/(|c| + 1). Then, for this choice of δ, if ‖x − a‖ < δ, |cf(x) − cf(a)| ≤
(|c| + 1) |f(x) − f(a)| < (|c| + 1) · ε/(|c| + 1) = ε. Hence cf is continuous at a.
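8.1. For sums: Let $L = \lim_{x\to a} f(x)$ and $M = \lim_{x\to a} g(x)$, and consider the functions:
\[
\tilde f(x) = \begin{cases} f(x) & \text{if } x \neq a,\\ L & \text{if } x = a \end{cases}
\quad\text{and}\quad
\tilde g(x) = \begin{cases} g(x) & \text{if } x \neq a,\\ M & \text{if } x = a. \end{cases}
\]
By definition of limit, $\tilde f$ and $\tilde g$ are continuous at a, hence so is their sum. In other words, the
function given by:
\[
\tilde f(x) + \tilde g(x) = \begin{cases} f(x) + g(x) & \text{if } x \neq a,\\ L + M & \text{if } x = a \end{cases}
\]
is continuous at a. By definition, this means that $\lim_{x\to a} (f(x) + g(x)) = L + M$.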
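1.5. (a) $\frac{\partial f}{\partial x} = (1 + xy)e^{xy}$
$\frac{\partial f}{\partial y} = x^2 e^{xy}$
(b) $\begin{bmatrix} (1 + xy)e^{xy} & x^2 e^{xy} \end{bmatrix}$
1.7. (a) $\frac{\partial f}{\partial x} = -\frac{2x}{(x^2+y^2)^2}$
$\frac{\partial f}{\partial y} = -\frac{2y}{(x^2+y^2)^2}$
(b) $\begin{bmatrix} -\frac{2x}{(x^2+y^2)^2} & -\frac{2y}{(x^2+y^2)^2} \end{bmatrix}$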
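1.9. (a) $\frac{\partial f}{\partial x} = 1$, $\frac{\partial f}{\partial y} = 2$, $\frac{\partial f}{\partial z} = 3$
(b) (1, 2, 3)
1.11. (a) $\frac{\partial f}{\partial x} = \bigl(\cos(x+2y) - 2x\sin(x+2y)\bigr)\,e^{-x^2-y^2-z^2}$
$\frac{\partial f}{\partial y} = 2\bigl(\cos(x+2y) - y\sin(x+2y)\bigr)\,e^{-x^2-y^2-z^2}$
$\frac{\partial f}{\partial z} = -2z\sin(x+2y)\,e^{-x^2-y^2-z^2}$
(b) $\Bigl(\bigl(\cos(x+2y) - 2x\sin(x+2y)\bigr)\,e^{-x^2-y^2-z^2},\ 2\bigl(\cos(x+2y) - y\sin(x+2y)\bigr)\,e^{-x^2-y^2-z^2},\ -2z\sin(x+2y)\,e^{-x^2-y^2-z^2}\Bigr)$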
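1.13. (a) $\frac{\partial f}{\partial x} = \frac{2x}{x^2+y^2}$
$\frac{\partial f}{\partial y} = \frac{2y}{x^2+y^2}$
$\frac{\partial f}{\partial z} = 0$
(b) $\bigl(\frac{2x}{x^2+y^2},\ \frac{2y}{x^2+y^2},\ 0\bigr)$
1.17. (a) $\frac{\partial f}{\partial x} = \frac{2x^3}{\sqrt{x^4+y^4}}$
$\frac{\partial f}{\partial y} = \frac{2y^3}{\sqrt{x^4+y^4}}$
(b) $\frac{\partial f}{\partial x}(0,0) = 0$, $\frac{\partial f}{\partial y}(0,0) = 0$.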
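5.1. (b) (f ◦ α)(t) = f(t², t³) = t⁴ + t⁶, so (f ◦ α)′(t) = 4t³ + 6t⁵ and (f ◦ α)′(1) = 10.
5.3. $\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}$, where $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$ are evaluated at α(t)
6.1. $-\frac{5}{\sqrt{17}}$
6.3. $\frac{2}{\sqrt{6}}$
6.5. $\frac{6}{\sqrt{17}}(4, 1)$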
6.7. $r = \frac{5}{\sqrt{2}}$, $s = \frac{5}{2\sqrt{2}}$
(c) $y = \frac{1}{3}x^2 - \frac{1}{3}$
[Figure]
(d) $y = -\frac{3}{2}\ln x$
[Figure]
7.1. 2x − 4y − z = −3
7.3. 4y + 3z = 7
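8.1. $\frac{\partial^2 f}{\partial x^2} = 12x^2 - 12xy + 6y^2$
$\frac{\partial^2 f}{\partial y\,\partial x} = -6x^2 + 12xy - 12y^2$
$\frac{\partial^2 f}{\partial x\,\partial y} = -6x^2 + 12xy - 12y^2$
$\frac{\partial^2 f}{\partial y^2} = 6x^2 - 24xy + 60y^2$
8.3. $\frac{\partial^2 f}{\partial x^2} = (4x^2 - 2)e^{-x^2-y^2}$
$\frac{\partial^2 f}{\partial y\,\partial x} = 4xy\,e^{-x^2-y^2}$
$\frac{\partial^2 f}{\partial x\,\partial y} = 4xy\,e^{-x^2-y^2}$
$\frac{\partial^2 f}{\partial y^2} = (4y^2 - 2)e^{-x^2-y^2}$
8.5. i = 4, j = 3, $\frac{\partial^7 f}{\partial x^4\,\partial y^3}(0, 0) = 144$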
10.7. k ≠ 0: (0, 0) saddle point; $(\frac{k}{3}, \frac{k}{3})$ local minimum if k > 0, local maximum if k < 0
k = 0: (0, 0) neither local maximum nor minimum (degenerate)
1
10.11. y = x + 3
4
-1 1 2 3 4
11.1. $1 - \frac{1}{2}(x + y)^2$
12.1. (a) x can be chosen large and positive and y large and negative so that x + 2y = 1. Thus
f (x, y) = xy can be made as large and negative as you like.
(b) Maximum value $\frac{1}{8}$ at $(\frac{1}{2}, \frac{1}{4})$
12.5. x = 23 , y = 1, z = 6
5
1.1. $\frac{49}{6}$
1.3. $\frac{7}{6}$
1.5. $\frac{26}{3}\ln 2$
1.7. e2 − 2e + 1
1.9. (a) [Figure]
(b) $\int_3^4 \int_1^2 f(x, y)\,dx\,dy$
1.11. (a) [Figure: curves labeled x = y² and x = y]
(b) $\int_0^1 \int_x^{\sqrt{x}} f(x, y)\,dy\,dx$
1.13. (a) [Figure]
(b) $\int_0^1 \int_{-\sqrt{1-y^2}}^{\sqrt{1-y^2}} f(x, y)\,dx\,dy$
1.15. (a) [Figure: curve labeled y = 1/x]
(b) $\int_1^4 \int_{1/y}^{1} y e^{xy}\,dx\,dy$
(c) $e^4 - 4e$
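An answer of this kind can be sanity-checked numerically. The sketch below approximates the iterated integral in part (b) with a midpoint rule and compares it with the closed form in part (c). The helper name `iterated_integral` and the grid size are assumptions made for the example, not anything prescribed by the text.

```python
import math


def iterated_integral(f, y_lo, y_hi, x_lo, x_hi, n=400):
    """Midpoint-rule approximation of
    int_{y_lo}^{y_hi} int_{x_lo(y)}^{x_hi(y)} f(x, y) dx dy,
    where x_lo and x_hi are functions of y."""
    dy = (y_hi - y_lo) / n
    total = 0.0
    for j in range(n):
        y = y_lo + (j + 0.5) * dy
        a, b = x_lo(y), x_hi(y)
        dx = (b - a) / n
        inner = sum(f(a + (i + 0.5) * dx, y) for i in range(n)) * dx
        total += inner * dy
    return total


# Integrand y e^{xy}, with 1/y <= x <= 1 and 1 <= y <= 4.
approx = iterated_integral(lambda x, y: y * math.exp(x * y),
                           1.0, 4.0,
                           lambda y: 1.0 / y, lambda y: 1.0)
exact = math.exp(4) - 4 * math.e
```

The two values should agree to within the discretization error of the midpoint rule.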
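1.17. $\frac{2}{3}$
1.19. $\frac{2}{3}$
1.21. $\frac{1}{3}$
2.1. (a) S = 0 or $\frac{1}{9}$
(b) S = 0, $\frac{1}{16}$, $\frac{1}{8}$, $\frac{3}{16}$, or $\frac{1}{4}$
(c) If n is odd, S = 0 or $\frac{1}{n^2}$. If n is even, S = 0, $\frac{1}{n^2}$, $\frac{2}{n^2}$, $\frac{3}{n^2}$, or $\frac{4}{n^2}$.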
(e) Let ε > 0 be given, and let $\delta = \frac{\sqrt{\varepsilon}}{2}$. If R is subdivided into subrectangles of dimensions
$\Delta x_i$ by $\Delta y_j$, where $\Delta x_i < \delta$ and $\Delta y_j < \delta$ for all i, j, then, by part (d), all Riemann
sums based on the subdivision satisfy $\bigl|\sum_{i,j} f(p_{ij})\,\Delta x_i\,\Delta y_j - 0\bigr| < 4\delta^2 = \varepsilon$. Therefore f
is integrable over R, and $\iint_R f(x, y)\,dA = 0$.
3.5. ( 23 , 35 )
3.7. $\sqrt{\frac{2}{3}}$
5.1. (a) σ : D → R3 , σ(x, y) = (x, y, 1 − x − y), where D is the triangular region in the xy-plane
with vertices (0, 0), (1, 0), (0, 1)
(b) $\frac{\sqrt{3}}{2}$
(c) $\frac{7\sqrt{3}}{2}$
5.2. 36π
5.5. (a)
(b) $4\pi(6 + \sqrt{2})$
(c) $2\pi(27 + 10\sqrt{2})$
5.7. (a) 8
(b) 4
6.7. a2
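6.8. (a)
(b) $\displaystyle\int_0^{\sqrt{2}} \int_0^{\sqrt{1-\frac{y^2}{2}}} \int_{-2\sqrt{1-\frac{y^2}{2}-z^2}}^{2\sqrt{1-\frac{y^2}{2}-z^2}} f(x, y, z)\,dx\,dz\,dy$
(c) $\displaystyle\int_{-2}^{2} \int_0^{\sqrt{1-\frac{x^2}{4}}} \int_0^{\sqrt{2-\frac{x^2}{2}-2z^2}} f(x, y, z)\,dy\,dz\,dx$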
6.10. (a)
(b) $\displaystyle\int_0^1 \int_0^y \int_z^y f(x, y, z)\,dx\,dz\,dy$
(c) $\displaystyle\int_0^1 \int_0^x \int_x^1 f(x, y, z)\,dy\,dz\,dx$
1.2. Given any ε > 0, there exists a δ > 0 such that ‖f(x) − L‖ < ε whenever ‖x − a‖ < δ, except
possibly when x = a.
2.3. $\begin{bmatrix} 1 & 2y & 9z^2 \end{bmatrix}$
1 0
2.5. 0 1
0 0
ex+y ex+y
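4.3. (a) $\frac{\partial w}{\partial \rho} = \frac{\partial f}{\partial x}\sin\phi\cos\theta + \frac{\partial f}{\partial y}\sin\phi\sin\theta + \frac{\partial f}{\partial z}\cos\phi$
$\frac{\partial w}{\partial \phi} = \frac{\partial f}{\partial x}\,\rho\cos\phi\cos\theta + \frac{\partial f}{\partial y}\,\rho\cos\phi\sin\theta - \frac{\partial f}{\partial z}\,\rho\sin\phi$
$\frac{\partial w}{\partial \theta} = -\frac{\partial f}{\partial x}\,\rho\sin\phi\sin\theta + \frac{\partial f}{\partial y}\,\rho\sin\phi\cos\theta$
(b) 0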
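The formulas in part (a) can be spot-checked against finite differences. The sketch below assumes the usual spherical substitution x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ (which is what the formulas suggest) and uses a hypothetical sample function f(x, y, z) = xy + z². The function names and the test point are illustrative choices only.

```python
import math


def f(x, y, z):
    # Hypothetical sample function; any smooth f would serve.
    return x * y + z ** 2


def grad_f(x, y, z):
    # Partial derivatives of the sample f, computed by hand.
    return (y, x, 2.0 * z)


def to_cartesian(rho, phi, theta):
    # Assumed spherical substitution.
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))


def w(rho, phi, theta):
    return f(*to_cartesian(rho, phi, theta))


def chain_rule_partials(rho, phi, theta):
    """Partials of w with respect to rho, phi, theta from the chain rule formulas."""
    fx, fy, fz = grad_f(*to_cartesian(rho, phi, theta))
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return (fx * sp * ct + fy * sp * st + fz * cp,
            fx * rho * cp * ct + fy * rho * cp * st - fz * rho * sp,
            -fx * rho * sp * st + fy * rho * sp * ct)


def finite_difference_partials(rho, phi, theta, h=1e-5):
    """Central-difference approximations of the same three partials."""
    p = (rho, phi, theta)
    result = []
    for i in range(3):
        up, down = list(p), list(p)
        up[i] += h
        down[i] -= h
        result.append((w(*up) - w(*down)) / (2 * h))
    return tuple(result)


point = (2.0, 0.7, 1.1)
formula = chain_rule_partials(*point)
numeric = finite_difference_partials(*point)
```

Agreement at a single point is not a proof, but a mismatch would flag a sign or factor error in the formulas.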
∂f ∂f
4.5. ∂x = 52 , ∂y = 3
4
∂z
4.9. ∂x = − 13 , ∂z
∂y = − 32
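4.11. (b) $\frac{\partial f}{\partial x}\frac{d^2x}{dt^2} + \frac{\partial f}{\partial y}\frac{d^2y}{dt^2} + \frac{\partial^2 f}{\partial x^2}\bigl(\frac{dx}{dt}\bigr)^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}\frac{dx}{dt}\frac{dy}{dt} + \frac{\partial^2 f}{\partial y^2}\bigl(\frac{dy}{dt}\bigr)^2$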
1.6. 2π
1.8. (b) $\pi(1 - e^{-a^2})$
[Figure: region D, with the point (3, 2) labeled]
4.5. $(0, 0, \frac{3}{8}a)$
4.7. $4(2 - \sqrt{2})$
8π 2
4.9. 15
1.1. [Figure]
1.3. [Figure]
1.5. [Figure]
[Figure]
1.1. (a) [Figure]
(b) For instance, any line segment radiating directly away from the origin. If α(t) = (t, t), 0 ≤
t ≤ 1, then $\int_C x\,dx + y\,dy = 1$.
(c) For instance, any circle centered at the origin. If α(t) = (cos t, sin t), 0 ≤ t ≤ 2π, then
$\int_C x\,dx + y\,dy = 0$.
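Both values can be confirmed numerically. The following sketch approximates the line integral of x dx + y dy along a parametrized curve with a midpoint rule. The helper name `line_integral` and its signature are assumptions introduced for illustration, and the derivative of each parametrization is supplied by hand.

```python
import math


def line_integral(F, alpha, alpha_prime, a, b, n=1000):
    """Midpoint-rule approximation of int_C F1 dx + F2 dy, where C is
    parametrized by alpha on [a, b] and alpha_prime is its derivative."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        x, y = alpha(t)
        dx, dy = alpha_prime(t)
        F1, F2 = F(x, y)
        total += (F1 * dx + F2 * dy) * dt
    return total


# The integrand x dx + y dy corresponds to the vector field F(x, y) = (x, y).
F = lambda x, y: (x, y)

# (b) Segment alpha(t) = (t, t), 0 <= t <= 1.
segment = line_integral(F, lambda t: (t, t), lambda t: (1.0, 1.0), 0.0, 1.0)

# (c) Unit circle alpha(t) = (cos t, sin t), 0 <= t <= 2 pi.
circle = line_integral(F,
                       lambda t: (math.cos(t), math.sin(t)),
                       lambda t: (-math.sin(t), math.cos(t)),
                       0.0, 2.0 * math.pi)
```

On the circle the field is perpendicular to the direction of motion at every point, which is why the integrand vanishes identically there.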
1.3. $\frac{1}{2}\bigl(\pi + \sin^2(\pi^2)\bigr)$
1.4. $\frac{179}{60}$
1.10. (a)
(b) e4 + 2
3.9. (a) q = 1 − 2p
4.1. 2
9π 3
4.3. 2
4.6. 4
4.8. $\frac{1}{2}$
4.10. 12
5.1. (a) 27
6.1. (a) C traverses the unit circle once clockwise. Winding number = −1.
(b) Cn goes around the unit circle |n| times, counterclockwise if n > 0 and clockwise if n < 0.
It stays stuck at the point (1, 0) if n = 0. Winding number = n.
6.2. Winding number = 0
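Winding numbers like these can be estimated numerically. The sketch below assumes the standard integral formula, namely 1/(2π) times the line integral of (−y dx + x dy)/(x² + y²) around the curve, and it assumes that Cₙ is parametrized by (cos nt, sin nt) for 0 ≤ t ≤ 2π. Neither the formula nor the parametrization is quoted from the text, and the names `winding_number` and `circle_n` are hypothetical.

```python
import math


def winding_number(alpha, alpha_prime, a, b, n=2000):
    """Approximate (1 / 2 pi) * int_C (-y dx + x dy) / (x^2 + y^2) by the midpoint rule.
    The curve must avoid the origin."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        x, y = alpha(t)
        dx, dy = alpha_prime(t)
        total += (-y * dx + x * dy) / (x * x + y * y) * dt
    return total / (2.0 * math.pi)


def circle_n(n):
    """Assumed parametrization of C_n and its derivative."""
    alpha = lambda t: (math.cos(n * t), math.sin(n * t))
    alpha_prime = lambda t: (-n * math.sin(n * t), n * math.cos(n * t))
    return alpha, alpha_prime


# n = -1 is one clockwise circuit, n = 0 stays at (1, 0), n = 3 is three circuits.
results = {n: winding_number(*circle_n(n), 0.0, 2.0 * math.pi) for n in (-1, 0, 3)}
```

Under these assumptions the computed values land very close to the integers −1, 0, and 3.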
6.2. 11
6.4. It’s the total mass contained in the interior of S, i.e., the sum of those mi for which pi is in
the interior of S.
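4.1. $(x^2 + y^2)\,dx \wedge dy$
4.5. 0
4.7. $2x\,dx + 2y\,dy$
4.9. $4y\,dx \wedge dy$
4.11. $(1 + xz\,e^{xyz})\,dx \wedge dy \wedge dz$
4.13. $\frac{\partial F_1}{\partial x_1} - \frac{\partial F_2}{\partial x_2} + \frac{\partial F_3}{\partial x_3} - \frac{\partial F_4}{\partial x_4}$
4.14. (a) $1 + t^2$
(c) $\alpha^*(dy) = \cos t\,dt$, $\alpha^*(dz) = dt$
(d) $(1 + t)\,dt$
4.16. (b) $u^2\,du \wedge dv$
4.18. (a) $T^*(dx) = \frac{\partial T_1}{\partial u}\,du + \frac{\partial T_1}{\partial v}\,dv$
$T^*(dy) = \frac{\partial T_2}{\partial u}\,du + \frac{\partial T_2}{\partial v}\,dv$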
4.21. (a) F1 dx1 ∧ dx2 + F2 dx1 ∧ dx3 + F3 dx1 ∧ dx4 + F4 dx2 ∧ dx3 + F5 dx2 ∧ dx4 + F6 dx3 ∧ dx4