MA1521 CALCULUS FOR COMPUTING

Leung Ka Hin, To Wing Keung

July 31, 2024

These notes are exclusively for students taking MA1521 in AY2024/25 Semester 1.


Contents

0 Real Numbers and Functions
0.1 Numbers
0.2 Absolute Value
0.3 Functions
0.4 Polynomials
0.5 Rational Functions
0.6 Trigonometric Functions
0.7 Exponential and Logarithmic Functions
0.8 More on the Domain and Range of a Function

1 Limits and Continuity
1.1 Limits
1.2 Continuity
1.3 Evaluation of limits
1.4 Limits at infinity
1.5 More on Limits
1.6 Squeeze (Sandwich) Theorem
1.7 Intermediate Value Theorem (IVT)
1.8 Appendix: The Precise Definition of the Limit of a Function

2 Derivatives
2.1 Differentiability
2.2 Standard Derivatives & Differentiation Rules
2.3 Implicit Differentiation
2.4 Derivatives of Inverse Functions
2.5 Higher-order Derivatives
2.6 Parametric Equations
2.7 Miscellaneous examples

3 Applications of Differentiation
3.1 Tangents and Normals
3.2 Increasing and Decreasing Functions
3.3 Concave Upward and Concave Downward Functions
3.4 Related Rates
3.5 Maximum and Minimum Values
3.6 Applied Maximum and Minimum Problems
3.7 L’Hôpital’s Rule
3.8 Rolle’s Theorem and Mean Value Theorem

4 Integrals
4.1 Antiderivatives
4.2 Standard Integrals
4.3 Partial Fractions
4.4 Integration by Substitution
4.5 Integration by Parts
4.6 Riemann Sums and Definite Integrals
4.7 Fundamental Theorem of Calculus (FTC)
4.8 Miscellaneous Examples
4.9 Improper Integrals

5 Applications of Integration
5.1 Area Between Curves
5.2 Volume of Solid of Revolution by Disk Method
5.3 Cylindrical Shell Method
5.4 Arc Length of a curve

6 Sequences and Series
6.1 Sequences
6.2 Finding the Limit of a Sequence
6.3 Limit Laws for Sequences
6.4 Series
6.5 Integral Test
6.6 The Comparison Test
6.7 The Ratio Test and Root Test
6.8 Alternating Series
6.9 Absolute Convergence
6.10 Power Series
6.11 Power Series Representation
6.12 Taylor and Maclaurin Series
6.13 Appendix: The Precise Definition of the Limit of a Sequence and some Proofs

7 Vectors and Geometry of Space
7.1 The 3D-Coordinate System
7.2 Vectors
7.3 The Dot Product
7.4 Projections
7.5 The Cross Product
7.6 Lines
7.7 Planes

8 Functions of Several Variables
8.1 Vector Functions of One Variable
8.2 Calculus of Vector Functions
8.3 Tangent Vector and Tangent Line to a Curve
8.4 Arc Length of a Space Curve
8.5 Functions of Two Variables
8.6 Cylinders and Quadric Surfaces
8.7 Functions of Three Variables
8.8 Partial Derivatives
8.9 Higher Order Partial Derivatives
8.10 Tangent Planes
8.11 Differentiability and Chain Rule
8.12 Implicit Differentiation
8.13 Increments and Differentials
8.14 Directional Derivatives and the Gradient Vector
8.15 Extrema of Functions of Two Variables

9 Double Integrals
9.1 Riemann Sum
9.2 Volume and Double Integral
9.3 Iterated Double Integral
9.4 A Special Case
9.5 Double Integral over General Region
9.6 Decomposing Domain into Smaller Domains
9.7 Properties of Double Integral
9.8 An Application – Finding Area
9.9 Double Integrals in Polar Coordinates
9.10 Surface Area

10 Ordinary Differential Equations
10.1 First Order Ordinary Differential Equations
10.2 Reduction to Separable Form
10.3 Linear First Order ODE
10.4 The Bernoulli Equation
10.5 Applications of ODE

11 More on ODE
11.1 Euler’s Method
11.2 2nd Order Linear Equations with Constant Coefficients
11.3 Method of Undetermined Coefficients
11.4 Appendix: Malthus Model of Population

Chapter 0

Real Numbers and Functions

Read Thomas’ Calculus, Chapter 1.

Remark. Solutions to the exercises in these lecture notes can be downloaded at

Canvas > Files > Solutions to Exercises in Lecture Notes (folder).

0.1 Numbers

The collection of all real numbers is denoted by R. Thus R includes the integers

. . . , −2, −1, 0, 1, 2, 3, . . . ,

the rational numbers p/q, where p and q are integers (q ≠ 0), and the irrational numbers, like √2, π, e, etc.

a ∈ R means a is a member of the set R. In other words, a is a real number. Given two real
numbers a and b with a < b, the closed interval [a, b] consists of all x such that a ≤ x ≤ b,
and the open interval (a, b) consists of all x such that a < x < b. Similarly, we may form the
half-open intervals [a, b) and (a, b].

0.2 Absolute Value
The absolute value of a number a ∈ R is written as |a| and is defined as

$$|a| = \begin{cases} a & \text{if } a \ge 0, \\ -a & \text{if } a < 0. \end{cases}$$

For example, |2| = 2, |−2| = 2.

Some properties of |x| are summarized as follows:

1. | − x| = |x|, for all x ∈ R.

2. |xy| = |x||y|, for all x, y ∈ R.

3. −|x| ≤ x ≤ |x|, for all x ∈ R.

4. For a fixed r > 0, |x| < r if and only if x ∈ (−r, r).



5. √(x²) = |x|, for all x ∈ R.

6. (Triangle Inequality) |x + y| ≤ |x| + |y| for all x, y ∈ R.

Example 0.1. Solve the inequality $\frac{2x-1}{2x+1} < 1$.

Solution.

$$\begin{aligned}
\frac{2x-1}{2x+1} < 1
&\iff 0 < 1 - \frac{2x-1}{2x+1} \\
&\iff 0 < \frac{2x+1-2x+1}{2x+1} \\
&\iff 0 < \frac{2}{2x+1} \\
&\iff 0 < 2x+1 \\
&\iff -\frac{1}{2} < x.
\end{aligned}$$

Example 0.2. Solve the inequality |x + 1| ≤ |2x − 1|.

Solution.

$$\begin{aligned}
|x+1| \le |2x-1|
&\iff |x+1|^2 \le |2x-1|^2 \\
&\iff x^2+2x+1 \le 4x^2-4x+1 \\
&\iff 0 \le 3x^2-6x \\
&\iff 0 \le 3x(x-2) \\
&\iff x \le 0 \text{ or } x \ge 2 \\
&\iff x \in (-\infty, 0] \cup [2, \infty).
\end{aligned}$$
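
The solution sets of Examples 0.1 and 0.2 can be sanity-checked by brute force. The sketch below is only an illustration, not part of the notes: the helper names and the choice of sample grid are assumptions, and checking finitely many points does not replace the algebraic argument.

```python
from fractions import Fraction

def in_solution_01(x):
    # Claimed solution set of (2x - 1)/(2x + 1) < 1: x > -1/2
    return x > Fraction(-1, 2)

def in_solution_02(x):
    # Claimed solution set of |x + 1| <= |2x - 1|: x <= 0 or x >= 2
    return x <= 0 or x >= 2

# Sample points -5, -4.75, ..., 5 as exact fractions (step 1/4).
samples = [Fraction(k, 4) for k in range(-20, 21)]

def check_01(x):
    if 2 * x + 1 == 0:
        # The quotient is undefined at x = -1/2, so there is nothing to compare.
        return True
    return ((2 * x - 1) / (2 * x + 1) < 1) == in_solution_01(x)

def check_02(x):
    return (abs(x + 1) <= abs(2 * x - 1)) == in_solution_02(x)

all_ok = all(check_01(x) and check_02(x) for x in samples)
print(all_ok)
```

Exact fractions are used so that boundary points such as x = 0 and x = 2 are compared without rounding error.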

Exercise 0.1. Let r > 0. Prove that |x − a| < r if and only if x ∈ (−r + a, a + r).

Exercise 0.2. Prove the triangle inequality |x + y| ≤ |x| + |y|.

Exercise 0.3. Prove that for any x, y ∈ R, ||x| − |y|| ≤ |x − y|.

0.3 Functions

A function f : A → B is a rule that assigns to each a ∈ A one specific member f(a) of B. Symbolically we may denote the function by a ↦ f(a). We can specify a function f by giving the rule for f(x).

Example 0.3. f(x) = x²/(1 − x) assigns the number x²/(1 − x) to each x ≠ 1 in R.

The set A is called the domain of f and B is the codomain of f .

The range of f is the subset of B consisting of all the values of f . That is, the range of f =
{f (x) ∈ B | x ∈ A}.

Given f : A −→ R, it means that f assigns a value f (x) in R to each x ∈ A.

Such a function is called a real-valued function.

For a real-valued function f : A → R defined on a subset A of R, the graph of f consists of all the points (x, f(x)) in the xy-plane.

If f : A → B and g : B → C, then the composite function of f and g is the function g ◦ f : A →
C given by g ◦ f (x) = g(f (x)).
Example 0.4. Let $f(x) = \frac{1}{x}$ and $g(x) = x^2 - 1$. Find g ∘ f and f ∘ g.

Solution. $g \circ f(x) = g(f(x)) = g\left(\frac{1}{x}\right) = \left(\frac{1}{x}\right)^2 - 1 = \frac{1}{x^2} - 1$.

$f \circ g(x) = f(g(x)) = f(x^2 - 1) = \frac{1}{x^2 - 1}$.
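
Composition translates directly into code. The following minimal sketch mirrors Example 0.4; the helper name `compose` is an assumption made for illustration.

```python
def compose(outer, inner):
    """Return the composite function outer ∘ inner, i.e. x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

f = lambda x: 1 / x          # f(x) = 1/x, defined for x != 0
g = lambda x: x**2 - 1       # g(x) = x^2 - 1

g_of_f = compose(g, f)       # x -> 1/x^2 - 1
f_of_g = compose(f, g)       # x -> 1/(x^2 - 1)

print(g_of_f(2.0))           # 1/4 - 1 = -0.75
print(f_of_g(2.0))           # 1/3
```

The two composites differ, which illustrates that g ∘ f and f ∘ g are in general different functions.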

Let f : A → B. If g : B → A is a function such that f(g(x)) = x for all x ∈ B and g(f(x)) = x for all x ∈ A, then g is called the inverse of f. Similarly, f is the inverse of g. The inverse function of f is usually denoted by f⁻¹.

Let f : A → B. f is called an injective function if for any x, y ∈ A, f(x) = f(y) ⇒ x = y. f is called a surjective function if for any z ∈ B, there is an x ∈ A such that f(x) = z. f is called a bijective function if f is injective and surjective.
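
For functions between finite sets, these three definitions can be tested exhaustively. The sketch below is an illustration under that finiteness assumption; the function names and the sample sets are hypothetical.

```python
def is_injective(f, domain):
    # Injective: distinct inputs give distinct outputs (domain has no repeats).
    values = [f(x) for x in domain]
    return len(set(values)) == len(values)

def is_surjective(f, domain, codomain):
    # Surjective: every member of the codomain is attained.
    return {f(x) for x in domain} == set(codomain)

def is_bijective(f, domain, codomain):
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

A = [-2, -1, 0, 1, 2]
square = lambda x: x * x
shift = lambda x: x + 1

print(is_injective(square, A))                   # False, since (-1)^2 = 1^2
print(is_surjective(square, A, [0, 1, 4]))       # True
print(is_bijective(shift, A, [-1, 0, 1, 2, 3]))  # True
```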

[Figure: three arrow diagrams of maps f from A to B. In the first, f is injective but not surjective; in the second, f is surjective but not injective; in the third, f is bijective.]


Exercise 0.4. Prove that if f⁻¹ exists, then f is a bijective function.

0.4 Polynomials

A function of the form $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where $a_0, \ldots, a_n$ are constants and $a_n \neq 0$, is called a polynomial of degree n.
For example, a quadratic function $p(x) = ax^2 + bx + c$ (with $a \neq 0$) is a polynomial of degree 2.
A polynomial of degree n can be factored as a product of linear and quadratic factors.
For example, $x^4 - 1 = (x^2 + 1)(x + 1)(x - 1)$.
In general, a polynomial p(x) of degree n has at most n real roots. (A root of a polynomial p(x) is a point c such that p(c) = 0.)
For example, $x^4 - 1$ has only two real roots, −1 and 1.
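
As a small illustration, a polynomial can be stored as its list of coefficients and evaluated by Horner's rule. The sketch below (the helper name `horner` and the coefficient ordering are assumptions) checks the two real roots of x⁴ − 1 and compares the polynomial with its factorisation at 11 integer points, which is enough to identify two polynomials of degree 4.

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients [a_n, ..., a_1, a_0] at x."""
    result = 0
    for a in coeffs:
        result = result * x + a
    return result

p = [1, 0, 0, 0, -1]   # x^4 - 1

print(horner(p, 1), horner(p, -1))   # both values are 0

# Two polynomials of degree at most 4 that agree at 11 points are identical.
factorisation_ok = all(
    horner(p, x) == (x * x + 1) * (x + 1) * (x - 1) for x in range(-5, 6)
)
print(factorisation_ok)
```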

0.5 Rational Functions

A rational function is a function of the form $\frac{p(x)}{q(x)}$, where p(x) and q(x) are polynomials. The domain of $\frac{p(x)}{q(x)}$ consists of all real numbers except the roots of q(x).
For example, the domain of $\frac{x^3 + 3}{x^4 - 1}$ is R \ {−1, 1}.

0.6 Trigonometric Functions

The 6 trigonometric functions are sin x, cos x, tan x, csc x, sec x, cot x. They are periodic functions: each repeats with period 2π (tan x and cot x already repeat with period π).

Exercise 0.5. Sketch the graph of csc(x).

0.7 Exponential and Logarithmic Functions


A function of the form $f(x) = a^x$, where a > 0, is called an exponential function. Its inverse function, denoted by $\log_a x$, is called the logarithmic function to the base a (a > 0 and a ≠ 1).
Let e = 2.718281828459045235360287 · · · be Euler's number. Then the inverse of the exponential function $e^x$ is the natural logarithm ln x.
We have $e^{\ln x} = x$ for x > 0 and $\ln e^x = x$ for all x.
The domain of $e^x$ is R and the range is the set R⁺ of all positive real numbers.

0.8 More on the Domain and Range of a Function


If the domain of a function is not specified, then it is understood that we will take the
domain to be as large as possible. This is called the maximal domain of the function.

In general it is not so easy to determine the range of a function. In some simple cases, basic
algebraic techniques can be used to find the range of a function.
Example 0.5. Find the maximal domain and the range of $f(x) = \frac{1}{x-1}$.

Solution. The maximal domain of f is R \ {1}.

Recall that the range of f = {f(x) ∈ R | x ≠ 1}.
To find the range of f, let y = f(x). That is, $y = \frac{1}{x-1}$. Solving for x, we get

$$y = \frac{1}{x-1} \implies x - 1 = \frac{1}{y} \implies x = 1 + \frac{1}{y}.$$

From this we see that if y ≠ 0 then we may choose $x = 1 + \frac{1}{y}$ to get f(x) = y. Conversely, $\frac{1}{x-1}$ is never equal to 0. Thus the range of f is R \ {0}.

Example 0.6. Find the maximal domain and range of $f(x) = x^2 - x + 1$.

Solution. The maximal domain of f is R.

To find the range of f, let y = f(x). That is, $y = x^2 - x + 1$. Solving for x (by using the quadratic formula), we get

$$y = x^2 - x + 1 \implies x^2 - x + (1 - y) = 0 \implies x = \frac{1 \pm \sqrt{1 - 4(1 - y)}}{2} \implies x = \frac{1}{2}\left(1 \pm \sqrt{4y - 3}\right).$$

From this we see that if $y \ge \frac{3}{4}$ then we may choose $x = \frac{1}{2}\left(1 \pm \sqrt{4y - 3}\right)$ to get f(x) = y. Thus the range of f is $[\frac{3}{4}, \infty)$.
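
The argument of Example 0.6 can be replayed numerically: for a target value y ≥ 3/4 the quadratic formula produces an x with f(x) = y. The sketch below is illustrative only; the helper name `preimage` and the sample values of y are assumptions.

```python
import math

def f(x):
    return x * x - x + 1

def preimage(y):
    """Return one x with f(x) = y via x = (1 + sqrt(4y - 3)) / 2; needs y >= 3/4."""
    if y < 0.75:
        raise ValueError("y is below 3/4, so it is not in the range of f")
    return (1 + math.sqrt(4 * y - 3)) / 2

for y in [0.75, 1.0, 3.0, 10.0]:
    x = preimage(y)
    print(y, x, f(x))    # f(x) reproduces y up to rounding

print(f(0.5))            # the smallest value, attained at x = 1/2
```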

Exercise 0.6. Let f(x) = x + 5 and g(x) = x² − 3. Find the maximal domain and range of g(f(x)).

Ans: Maximal domain is R, range is [−3, ∞).

Chapter 1

Limits and Continuity

Read Thomas’ Calculus, Chapter 2.

1.1 Limits
Let f be a real-valued function defined on some interval I (e.g. (a, b), or (a, b] or (a, ∞)).
Let c be a point in I.

• $\lim_{x \to c^-} f(x)$ is the value that f(x) approaches when x approaches c from the left.

• $\lim_{x \to c^+} f(x)$ is the value that f(x) approaches when x approaches c from the right.

• Let c be an interior point (i.e. not an end point). If $\lim_{x \to c^-} f(x) = \lim_{x \to c^+} f(x) = L \in \mathbb{R}$, we say that $\lim_{x \to c} f(x)$ exists and has value L.

c | Left limit $\lim_{x \to c^-} f(x)$ | Right limit $\lim_{x \to c^+} f(x)$ | Limit $\lim_{x \to c} f(x)$ | f(x)
0 | | | |
2 | | | |
4 | | | |
−3 | | | |
6 | | | |
1.2 Continuity
Let f be a real-valued function defined on some interval I (e.g. (a, b), or (a, b] or (a, ∞)).
Let c be a point in I.

Continuity at a point

Case 1: c is an interior point.

• f is continuous at x = c if
(i) $\lim_{x \to c} f(x)$ exists,
(ii) $\lim_{x \to c} f(x) = f(c)$.

Case 2: c is the left end-point.

• f is continuous at x = c if
(i) $\lim_{x \to c^+} f(x)$ exists,
(ii) $\lim_{x \to c^+} f(x) = f(c)$.

Case 3: c is the right end-point.

• f is continuous at x = c if
(i) $\lim_{x \to c^-} f(x)$ exists,
(ii) $\lim_{x \to c^-} f(x) = f(c)$.

Continuity on an interval

• f is continuous on the interval I if f is continuous at x = c for all points c in I.

Remark. Roughly speaking, f is continuous at x = c means that the values of f near x = c all become very close to f(c) when x is very close to c, so that there is no sudden jump in the values of f at x = c.

Example 1.1. Find the points of discontinuity of the function f whose graph on (−3, 6] is given
below.

Solution.

Point of discontinuity | Reason
x = 2 | $\lim_{x \to 2} f(x) \neq f(2)$
x = 4 | $\lim_{x \to 4} f(x)$ does not exist

1.3 Evaluation of limits

Results (Laws of limits)

The following results are true provided all the limits involved exist. The limit could be one-sided or two-sided. The number k is a constant.

1. $\lim_{x \to c} (f(x) \pm g(x)) = \lim_{x \to c} f(x) \pm \lim_{x \to c} g(x)$

2. $\lim_{x \to c} k f(x) = k \lim_{x \to c} f(x)$

3. $\lim_{x \to c} (f(x) g(x)) = \left(\lim_{x \to c} f(x)\right)\left(\lim_{x \to c} g(x)\right)$

4. $\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{\lim_{x \to c} f(x)}{\lim_{x \to c} g(x)}$, provided $\lim_{x \to c} g(x) \neq 0$

5. If g is continuous at the point b and $\lim_{x \to c} f(x) = b$, then $\lim_{x \to c} g(f(x)) = g(b) = g\left(\lim_{x \to c} f(x)\right)$.


Example 1.2. Suppose $\lim_{x \to 2} f(x) = 3$ and $\lim_{x \to 2} g(x) = 4$. Find $\lim_{x \to 2} \frac{f(x)g(x) + \sqrt{g(x)}}{g(x) - f(x)}$.

Solution.
$$\lim_{x \to 2} \frac{f(x)g(x) + \sqrt{g(x)}}{g(x) - f(x)} = \frac{3 \cdot 4 + \sqrt{4}}{4 - 3} = 14.$$
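
Example 1.2 does not specify f and g, only their limits. As a rough numerical illustration, the sketch below picks two hypothetical functions with the required limits at x = 2 (f(x) = x + 1 and g(x) = x², both assumptions made here) and evaluates the quotient close to 2.

```python
import math

def f(x):
    return x + 1      # hypothetical choice: f(x) -> 3 as x -> 2

def g(x):
    return x * x      # hypothetical choice: g(x) -> 4 as x -> 2

def q(x):
    # The quotient from Example 1.2: (f g + sqrt(g)) / (g - f)
    return (f(x) * g(x) + math.sqrt(g(x))) / (g(x) - f(x))

for h in [1e-1, 1e-3, 1e-5]:
    print(h, q(2 - h), q(2 + h))   # both columns approach 14 as h shrinks
```

Any other pair of functions with the same limits would give the same limiting value, which is exactly what the laws of limits assert.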


From the Laws of Limits, we have corresponding results on continuity of functions.

Results

• If f and g are continuous at x = c, then for any constant k and any positive constant n, each of the following functions is continuous at x = c.
(i) f ± g, (ii) $f^n$, (iii) kf, (iv) fg, (v) f/g provided g(c) ≠ 0.
• If g is continuous at x = c and f is continuous at x = g(c), then the composite function f ∘ g is continuous at x = c.
(Note (f ∘ g)(x) = f(g(x)).)

[ For example, if f and g are continuous at x = c, then we have $\lim_{x \to c} f(x) = f(c)$ and $\lim_{x \to c} g(x) = g(c)$. Thus
$$\lim_{x \to c} (f(x) + g(x)) = \lim_{x \to c} f(x) + \lim_{x \to c} g(x) = f(c) + g(c),$$
which implies that f + g is continuous at x = c. ]

Result. The following functions are continuous on any interval contained in their maximal
domain.

1. Polynomials
2. Trigonometric Functions
3. Exponential Functions
4. Logarithmic Functions
5. Any combination of the above, on the domain where it is defined.

For example, a rational function $\frac{P(x)}{Q(x)}$ is continuous at all points x where Q(x) ≠ 0.
In particular, the rational function $\frac{1}{(x - 1)(x - 2)}$ is continuous on R \ {1, 2}.

Example 1.3. Show that the function ln(x + 3) is continuous on the interval (−3, ∞).

Solution. Take any point c ∈ (−3, ∞). Then c + 3 > 0. The polynomial g(x) = x + 3 is continuous at x = c, and the logarithmic function f(x) = ln x is continuous at x = c + 3 > 0.
Thus the composition
f ∘ g(x) = f(g(x)) = f(x + 3) = ln(x + 3)
is continuous at x = c. Since c was an arbitrary point of (−3, ∞), ln(x + 3) is continuous on (−3, ∞).

Remark. Since $\lim_{x \to c} f(x) = f(c)$ when f is continuous at x = c, finding the limit at x = c of any of the above functions is a matter of evaluating f at x = c.

Example 1.4. Evaluate $\lim_{x \to -2} \frac{x + \ln(x + 3)}{\sqrt{x + 6}}$.

Solution. The function $f(x) = \frac{x + \ln(x + 3)}{\sqrt{x + 6}}$ is continuous at x = −2, which implies that

$$\lim_{x \to -2} f(x) = f(-2).$$

Thus
$$\lim_{x \to -2} \frac{x + \ln(x + 3)}{\sqrt{x + 6}} = \frac{-2 + \ln(-2 + 3)}{\sqrt{-2 + 6}} = -1.$$


Exercise 1.1. Evaluate $\lim_{x \to 0} \tan^3(\sin x)$.

Ans: 0.

1.4 Limits at infinity


Let f be defined on R.

• $\lim_{x \to \infty} f(x)$ is the value f(x) approaches as x tends to positive infinity.

• $\lim_{x \to -\infty} f(x)$ is the value f(x) approaches as x tends to negative infinity.

Graphically, if $\lim_{x \to \infty} f(x) = c \in \mathbb{R}$ or $\lim_{x \to -\infty} f(x) = c$, then the line y = c is a horizontal asymptote of the graph of f(x).

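Example 1.5. From the graph of $y = \frac{1}{x}$, one sees that $\frac{1}{x}$ approaches 0 as x tends to positive infinity. Thus we have $\lim_{x \to \infty} \frac{1}{x} = 0$.

Similarly we have $\lim_{x \to -\infty} \frac{1}{x} = 0$.

Then for any positive integer n, we also have

$$\lim_{x \to \infty} \frac{1}{x^n} = \lim_{x \to \infty} \left(\frac{1}{x}\right)^n = 0^n = 0, \quad \text{and} \quad \lim_{x \to -\infty} \frac{1}{x^n} = \lim_{x \to -\infty} \left(\frac{1}{x}\right)^n = 0^n = 0.$$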

Example 1.6. Using the result in the previous example, we have

$$\lim_{x \to \infty} \left(2 - \frac{1}{x}\right) = 2 - 0 = 2,$$

and thus y = 2 is a horizontal asymptote of the graph of $y = 2 - \frac{1}{x}$.

Also, $\lim_{x \to -\infty} \left(2 - \frac{1}{x}\right) = 2$.

Example 1.7. Evaluate $\lim_{x \to \infty} \left(\frac{3}{2x} + \sqrt{4 - e^{-x}}\right)^2$.

Solution. $\lim_{x \to \infty} \left(\frac{3}{2x} + \sqrt{4 - e^{-x}}\right)^2 = (0 + \sqrt{4})^2 = 4$.

Exercise 1.2. Evaluate $\lim_{x \to -\infty} \ln\left(3 - 2\sin\frac{4}{x}\right)$.

Ans: ln 3.

1.5 More on Limits


Indeterminate forms.

(a) A limit of the form $\lim_{x \to c} \frac{f(x)}{g(x)}$ where f(x) → 0 and g(x) → 0 as x → c is called an indeterminate form of the type $\frac{0}{0}$.

(b) A limit of the form $\lim_{x \to c} \frac{f(x)}{g(x)}$ where f(x) → ∞ and g(x) → ∞ as x → c is called an indeterminate form of the type $\frac{\infty}{\infty}$.

Replacement rule. Let I be an open interval containing the point x = c. Suppose f(x) = g(x) for all x ∈ I, except possibly at x = c. Then $\lim_{x \to c} f(x) = \lim_{x \to c} g(x)$.

Example 1.8. Evaluate $\lim_{x \to 6} \frac{x^2 - 7x + 6}{36 - x^2}$.

Solution. $\lim_{x \to 6} \frac{x^2 - 7x + 6}{36 - x^2} = \lim_{x \to 6} \frac{(x - 1)(x - 6)}{-(6 + x)(x - 6)} = \lim_{x \to 6} \frac{x - 1}{-(6 + x)} = -\frac{5}{12}$.
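
The replacement rule in Example 1.8 can be seen numerically: the original quotient is undefined at x = 6, but near 6 it agrees with the simplified expression, whose value at 6 is −5/12. The function names in this sketch are assumptions made for illustration.

```python
def q(x):
    # Original quotient; undefined at x = 6 (and at x = -6).
    return (x**2 - 7 * x + 6) / (36 - x**2)

def simplified(x):
    # After cancelling the common factor (x - 6).
    return (x - 1) / (-(6 + x))

for h in [1e-1, 1e-3, 1e-5]:
    print(h, q(6 - h), q(6 + h))   # both approach -5/12 = -0.41666...

print(simplified(6))
```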
Exercise 1.3. Evaluate $\lim_{x \to -3} \frac{\sqrt{x + 12} - \sqrt{6 - x}}{18 - 2x^2}$.

Ans: $\frac{1}{36}$.

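Result. Limits of the form $\lim_{x \to \pm\infty} \frac{P(x)}{Q(x)}$, where P(x) and Q(x) are polynomials in x.

Write $P(x) = Ax^{\alpha} + \cdots$ and $Q(x) = Bx^{\beta} + \cdots$, where $Ax^{\alpha}$ and $Bx^{\beta}$ are the leading terms. Then

$$\lim_{x \to \pm\infty} \frac{P(x)}{Q(x)} = \lim_{x \to \pm\infty} \frac{Ax^{\alpha} + \cdots}{Bx^{\beta} + \cdots} =
\begin{cases}
0 & \text{if } \alpha < \beta, \\
\dfrac{A}{B} & \text{if } \alpha = \beta, \\
\infty \text{ or } -\infty \text{ (depends on the question)} & \text{if } \alpha > \beta.
\end{cases}$$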
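
The three cases of this result depend only on the leading terms, so they are easy to encode. The sketch below handles x → +∞ only; the function name and the convention of listing coefficients from the highest power down are assumptions. For x → −∞ the sign in the last case also depends on whether α − β is even or odd, which is not handled here.

```python
import math

def limit_at_plus_infinity(p, q):
    """Limit of P(x)/Q(x) as x -> +infinity.

    p and q list the coefficients from the highest power to the constant
    term, and their first entries (the leading coefficients) are nonzero.
    """
    alpha, beta = len(p) - 1, len(q) - 1   # degrees of P and Q
    A, B = p[0], q[0]                      # leading coefficients
    if alpha < beta:
        return 0.0
    if alpha == beta:
        return A / B
    # alpha > beta: the quotient behaves like (A/B) x^(alpha - beta).
    return math.inf if A / B > 0 else -math.inf

print(limit_at_plus_infinity([2, 0, 1], [1, 5, 0, 0]))   # degree 2 over degree 3
print(limit_at_plus_infinity([3, 0, 1], [6, 1, 0]))      # equal degrees: 3/6
print(limit_at_plus_infinity([-1, 0, 0], [2, 1]))        # degree 2 over degree 1
```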


Example 1.9. Evaluate $\lim_{x \to \infty} \frac{(18x^2 + 5x - 1)(2\sqrt{x} - 1)^3}{(3x - 1)^4}$.

Solution. $\lim_{x \to \infty} \frac{(18x^2 + 5x - 1)(2\sqrt{x} - 1)^3}{(3x - 1)^4} = \lim_{x \to \infty} \frac{144x^{7/2} + \cdots}{81x^4 + \cdots} = 0$.

Exercise 1.4. Evaluate $\lim_{x \to -\infty} \frac{(1 + 2x)^3}{\sqrt{16x^6 + 9x - 1}}$.

Ans: −2.

Useful results.
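If $\lim_{x \to c} g(x) = 0$, then

• $\lim_{x \to c} \frac{\sin(g(x))}{g(x)} = \lim_{x \to c} \frac{g(x)}{\sin(g(x))} = 1$,

• $\lim_{x \to c} \frac{\tan(g(x))}{g(x)} = \lim_{x \to c} \frac{g(x)}{\tan(g(x))} = 1$.

In particular, when c = 0 and g(x) = x,

• $\lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{x}{\sin x} = 1$,

• $\lim_{x \to 0} \frac{\tan x}{x} = \lim_{x \to 0} \frac{x}{\tan x} = 1$.

For example, $\lim_{x \to 0} \frac{\sin 3x}{3x} = 1$, $\lim_{x \to 1} \frac{\ln x}{\tan(\ln x)} = 1$, $\lim_{x \to \infty} \frac{\sin(e^{-x})}{e^{-x}} = 1$.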

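Example 1.10. Evaluate $\lim_{x \to 0} \frac{\sin^2 x}{\sin(3x^2) + x\tan(2x)}$.

Solution.

$$\begin{aligned}
\lim_{x \to 0} \frac{\sin^2 x}{\sin(3x^2) + x\tan(2x)}
&= \lim_{x \to 0} \frac{\dfrac{\sin^2 x}{x^2}}{3 \cdot \dfrac{\sin(3x^2)}{3x^2} + 2 \cdot \dfrac{\tan(2x)}{2x}} \\
&= \lim_{x \to 0} \frac{\left(\dfrac{\sin x}{x}\right)^2}{3 \cdot \dfrac{\sin(3x^2)}{3x^2} + 2 \cdot \dfrac{\tan(2x)}{2x}}
= \frac{1^2}{3 + 2} = \frac{1}{5}.
\end{aligned}$$

Exercise 1.5. Evaluate $\lim_{x \to 0^+} \left(x^2 \cot(2x) \csc^2(3\sqrt{x})\right)$.

Ans: $\frac{1}{18}$.

Exercise 1.6. Evaluate $\lim_{x \to 4} \frac{\tan(\sqrt{x} - 2)}{\sin(16 - x^2)}$.

Ans: $-\frac{1}{32}$.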
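
The value 1/5 obtained in Example 1.10 above can be checked numerically by evaluating the quotient at points approaching 0. This is only a plausibility check under floating-point arithmetic, and the function name is an assumption.

```python
import math

def q(x):
    # The quotient from Example 1.10; undefined at x = 0 itself.
    return math.sin(x) ** 2 / (math.sin(3 * x * x) + x * math.tan(2 * x))

for x in [0.1, 0.01, 0.001]:
    print(x, q(x), q(-x))   # both columns approach 1/5 = 0.2
```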

1.6 Squeeze (Sandwich) Theorem


Squeeze Theorem. Suppose g(x) ≤ f(x) ≤ h(x) for all x in some open interval containing a point c, except possibly at x = c. If $\lim_{x \to c} g(x) = \lim_{x \to c} h(x) = L$, then $\lim_{x \to c} f(x) = L$.

Example 1.11. It is given that $3 - x^2 \le f(x) \le 1 + 2e^x$ for all x. Find $\lim_{x \to 0} f(x)$.

Solution. As $\lim_{x \to 0} (3 - x^2) = 3$ and $\lim_{x \to 0} (1 + 2e^x) = 3$, we have by the Squeeze Theorem that $\lim_{x \to 0} f(x) = 3$.


Exercise 1.7. Use the Squeeze Theorem to show that $\lim_{x \to c} |f(x)| = 0 \Rightarrow \lim_{x \to c} f(x) = 0$.

Remark. The converse of the above result, namely $\lim_{x \to c} f(x) = 0 \Rightarrow \lim_{x \to c} |f(x)| = 0$, is also true. Hence, we have

Result. $\lim_{x \to c} f(x) = 0 \iff \lim_{x \to c} |f(x)| = 0$.

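Example 1.12. Evaluate $\lim_{x \to 0} \left(x^3 \cos\left(\frac{2}{\sin x}\right)\right)$.

Solution. Notice that for all x with sin x ≠ 0, we have

$$\begin{aligned}
-1 \le \cos\left(\frac{2}{\sin x}\right) \le 1
&\implies \left|\cos\left(\frac{2}{\sin x}\right)\right| \le 1 \\
&\implies \left|x^3 \cos\left(\frac{2}{\sin x}\right)\right| \le |x^3| = |x|^3 \\
&\implies -|x|^3 \le x^3 \cos\left(\frac{2}{\sin x}\right) \le |x|^3.
\end{aligned}$$

Notice that $\lim_{x \to 0} |x|^3 = 0^3 = 0$ and thus also $\lim_{x \to 0} -|x|^3 = 0$.

Hence by the Squeeze Theorem, we have $\lim_{x \to 0} \left(x^3 \cos\left(\frac{2}{\sin x}\right)\right) = 0$.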
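
The bounds used in Example 1.12 can be observed numerically: the cosine factor oscillates wildly near 0, but the product stays between −|x|³ and |x|³. The sample points below are arbitrary choices made for illustration.

```python
import math

def f(x):
    # x^3 cos(2 / sin x), defined whenever sin x != 0
    return x**3 * math.cos(2 / math.sin(x))

sample_points = [0.1, -0.01, 0.001, -0.0001]
for x in sample_points:
    print(x, f(x), abs(x) ** 3)   # |f(x)| never exceeds |x|^3

squeezed = all(abs(f(x)) <= abs(x) ** 3 for x in sample_points)
print(squeezed)
```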

Exercise 1.8. Evaluate $\lim_{x \to \infty} \left(\frac{\sin(2\ln x)}{\ln x} + 2x\sin\left(\frac{1}{x}\right)\right)$.

Ans: 2.

1.7 Intermediate Value Theorem (IVT)
If a real-valued function f is continuous on [a, b] and k is a number between f (a) and f (b),
then f (c) = k for some c ∈ [a, b].

Example 1.13. Show that the equation $x^3 e^x = 10$ has a solution between 1 and 1.5.

Solution. Let $f(x) = x^3 e^x$. f is continuous on R. We have f(1) = e ≈ 2.718 and $f(1.5) = 1.5^3 e^{1.5} \approx 15.126$. Thus f(1) < 10 < f(1.5). By the Intermediate Value Theorem, there is a solution to f(x) = 10 between 1 and 1.5.
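
The Intermediate Value Theorem only asserts that a solution exists. Repeatedly halving the interval while keeping the sign change turns the same idea into a way of locating it. The bisection sketch below is an illustration added here, not a method taken from the notes; the function name `bisect` and the tolerance are assumptions.

```python
import math

def f(x):
    return x**3 * math.exp(x)

def bisect(func, target, lo, hi, tol=1e-10):
    """Locate c in [lo, hi] with func(c) close to target.

    Assumes func is continuous on [lo, hi] and that target lies between
    func(lo) and func(hi), as in the hypotheses of the IVT.
    """
    if (func(lo) - target) * (func(hi) - target) > 0:
        raise ValueError("target is not between func(lo) and func(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half-interval on which func - target still changes sign.
        if (func(lo) - target) * (func(mid) - target) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

root = bisect(f, 10, 1.0, 1.5)
print(root, f(root))   # root lies in (1, 1.5) and f(root) is close to 10
```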

Exercise 1.9. Show that the equation 10 = x + 2 tan(2x) has a solution between 3 and 4.

1.8 Appendix: The Precise Definition of the Limit of a Function
Remark: Section 1.8 will be excluded from the assessments (quizzes and the Final Exam).

Remark: The precise definition of the limit of a function introduced in this section will not
be needed in MA1521, and it will be studied in detail in MA2108 Mathematical Analysis I,
where many results in calculus will be proved rigorously.

Let f(x) be defined on an open interval containing the point c, except possibly at c itself. We say that the limit of f(x) as x approaches c is the number L, and write

$$\lim_{x \to c} f(x) = L,$$

if, for every number ε > 0, there exists a corresponding number δ > 0 such that for all x,

0 < |x − c| < δ ⇒ |f(x) − L| < ε.

Example 1.14. Prove from definition that $\lim_{x \to 1} (5x - 3) = 2$.

Solution. Note that |(5x − 3) − 2| = 5|x − 1|. Given ε > 0, we choose δ = ε/5. Then for all x,
0 < |x − 1| < δ ⇒ 5|x − 1| < 5δ ⇒ |(5x − 3) − 2| < ε.
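
The choice δ = ε/5 in Example 1.14 can be spot-checked on sample points. This is only a finite test and proves nothing by itself; the helper names and the sampling scheme are assumptions.

```python
def f(x):
    return 5 * x - 3

def delta_for(eps):
    # The choice made in Example 1.14.
    return eps / 5

def check(eps, samples=1000):
    """Test |f(x) - 2| < eps at sample points x with 0 < |x - 1| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples):
        offset = delta * k / samples          # satisfies 0 < offset < delta
        for x in (1 - offset, 1 + offset):
            if not abs(f(x) - 2) < eps:
                return False
    return True

print(all(check(eps) for eps in (1.0, 0.1, 1e-3)))
```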

Exercise 1.10. Prove from definition that $\lim_{x \to -6} \left(\frac{x}{3} + 3\right) = 1$.

Chapter 2

Derivatives

Read Thomas’ Calculus, Chapter 3.

2.1 Differentiability

Definition 2.1. The derivative of a function f at the point $x_0$, denoted by $f'(x_0)$, is given by the following limit

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}.$$

Consider the tangent to the curve y = f(x) at the point $(x_0, f(x_0))$. From the diagram, as h → 0, Q approaches P and hence, the gradient of the chord PQ, namely

$$\frac{f(x_0 + h) - f(x_0)}{h},$$

approaches the limiting value

$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0).$$

When $f'(x_0)$ exists, we say that f is differentiable at $x = x_0$. Geometrically in this case, $f'(x_0)$ is equal to the slope of the tangent line to the curve y = f(x) at the point $(x_0, f(x_0))$.

Remark. An alternative formula for $f'(x_0)$ is

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}.$$

(This can be seen by letting $x = x_0 + h$, and noting that ‘h → 0’ is the same as ‘$x \to x_0$’.)

Example 2.1. Let $f(x) = x^2$. Find $f'(3)$ (that is, $x_0 = 3$).

Solution.
$$\begin{aligned}
f'(3) &= \lim_{h \to 0} \frac{f(3 + h) - f(3)}{h} \\
&= \lim_{h \to 0} \frac{(3 + h)^2 - 3^2}{h} \\
&= \lim_{h \to 0} \frac{9 + 6h + h^2 - 9}{h} \\
&= \lim_{h \to 0} (6 + h) \\
&= 6.
\end{aligned}$$

Remark. Thus the slope of the tangent line to the curve $y = x^2$ at the point (3, 9) is equal to 6.
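
The limit in Example 2.1 can be watched numerically: for f(x) = x² at x₀ = 3 the difference quotient equals 6 + h, so it approaches 6 as h shrinks. The helper name `difference_quotient` below is an assumption.

```python
def difference_quotient(f, x0, h):
    """Gradient of the chord joining (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h

square = lambda x: x * x

for h in [1.0, 0.1, 0.01, 0.001]:
    # Each value equals 6 + h up to rounding error.
    print(h, difference_quotient(square, 3.0, h))
```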


Definition 2.2. Suppose the derivative $f'(x)$ exists for all x in an open interval I. We can then treat $f'(x)$ as a function defined on I. The process of finding the derivative of a function is called differentiation. If y = f(x), we can also write $\frac{d}{dx} f(x)$, $\frac{dy}{dx}$ or $\frac{df}{dx}$ to denote $f'(x)$.

In summary, we have

$$\frac{d}{dx} f(x) = \frac{dy}{dx} = \frac{df}{dx} = f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$

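Example 2.2. (a) Use the above definition of derivative to differentiate the function $\frac{1}{2 + \sqrt{x}}$.

(b) Find the equation of the tangent to the curve $y = \frac{1}{2 + \sqrt{x}}$ at the point $(1, \frac{1}{3})$.

Ans: (a) $\frac{-1}{2\sqrt{x}(2 + \sqrt{x})^2}$, (b) $(y - \frac{1}{3}) = -\frac{1}{18}(x - 1)$.

Solution. (a)

$$\begin{aligned}
\frac{d}{dx} \frac{1}{2 + \sqrt{x}}
&= \lim_{h \to 0} \frac{\dfrac{1}{2 + \sqrt{x + h}} - \dfrac{1}{2 + \sqrt{x}}}{h} \\
&= \lim_{h \to 0} \frac{1}{h} \cdot \frac{(2 + \sqrt{x}) - (2 + \sqrt{x + h})}{(2 + \sqrt{x + h})(2 + \sqrt{x})} \\
&= \lim_{h \to 0} \frac{1}{h} \cdot \frac{\sqrt{x} - \sqrt{x + h}}{(2 + \sqrt{x + h})(2 + \sqrt{x})} \\
&= \lim_{h \to 0} \frac{1}{h} \cdot \frac{\sqrt{x} - \sqrt{x + h}}{(2 + \sqrt{x + h})(2 + \sqrt{x})} \cdot \frac{\sqrt{x} + \sqrt{x + h}}{\sqrt{x} + \sqrt{x + h}} \\
&= \lim_{h \to 0} \frac{1}{h} \cdot \frac{-h}{(2 + \sqrt{x + h})(2 + \sqrt{x})(\sqrt{x} + \sqrt{x + h})} \\
&= \lim_{h \to 0} \frac{-1}{(2 + \sqrt{x + h})(2 + \sqrt{x})(\sqrt{x} + \sqrt{x + h})} \\
&= \frac{-1}{2\sqrt{x}(2 + \sqrt{x})^2}.
\end{aligned}$$

(b) From part (a), $\frac{d}{dx} \frac{1}{2 + \sqrt{x}} = \frac{-1}{2\sqrt{x}(2 + \sqrt{x})^2}$.

At x = 1, the derivative is $\frac{-1}{2\sqrt{1}(2 + \sqrt{1})^2} = -\frac{1}{18}$.

Thus the equation of the tangent at $(1, \frac{1}{3})$ is $\frac{y - \frac{1}{3}}{x - 1} = -\frac{1}{18}$.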
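
The closed form found in part (a) can be compared against a numerical derivative. The sketch below uses a central difference, which is an approximation chosen here for illustration and not the definition used in the notes; the step size and function names are assumptions.

```python
import math

def f(x):
    return 1 / (2 + math.sqrt(x))

def f_prime(x):
    # Closed form from Example 2.2(a), valid for x > 0.
    return -1 / (2 * math.sqrt(x) * (2 + math.sqrt(x)) ** 2)

def central_difference(func, x, h=1e-6):
    # Symmetric approximation to the derivative of func at x.
    return (func(x + h) - func(x - h)) / (2 * h)

def tangent_at_1(x):
    # Tangent line at (1, 1/3): y = 1/3 + f'(1) (x - 1)
    return 1 / 3 + f_prime(1.0) * (x - 1)

print(f_prime(1.0), central_difference(f, 1.0))   # both close to -1/18
print(tangent_at_1(1.0))                          # passes through y = 1/3
```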


Differentiability implies continuity.

Theorem 2.1. If f is differentiable at $x = x_0$, then f is continuous at $x = x_0$.

Proof. It is given that $f'(x_0)$ exists. For all x near $x_0$ (with $x \neq x_0$), we have

$$f(x) = \frac{f(x) - f(x_0)}{x - x_0} \cdot (x - x_0) + f(x_0).$$

Letting $x \to x_0$, we have

$$\begin{aligned}
\lim_{x \to x_0} f(x) &= \lim_{x \to x_0} \left(\frac{f(x) - f(x_0)}{x - x_0} \cdot (x - x_0) + f(x_0)\right) \\
&= f'(x_0) \cdot (x_0 - x_0) + f(x_0) \\
&= f(x_0).
\end{aligned}$$

Hence f is continuous at $x = x_0$.


Remark. The converse of the above result is not true in general. For example the absolute
value function f (x) = |x| is continuous at x = 0 but not differentiable at x = 0.

Differentiability on Intervals.

Definition 2.3. A function f is said to be differentiable on an interval I if it is differentiable


at every point in I.

Remark. If the interval has endpoints, then the limit in defining the derivative at an
endpoint should be replaced by the appropriate one-sided limit.

Exercise 2.1. Show that $f(x) = |x^2 - 2x|$ is not differentiable at $x = 2$.


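Example 2.3. Let $n$ be a positive integer. Show that $\frac{d}{dx} x^n = n x^{n-1}$.

Solution. Using the binomial theorem,

\begin{align*}
\frac{d}{dx} x^n &= \lim_{h\to 0} \frac{(x+h)^n - x^n}{h} \\
&= \lim_{h\to 0} \frac{\left(x^n + n x^{n-1} h + \frac{n(n-1)}{2} x^{n-2} h^2 + \cdots + h^n\right) - x^n}{h} \\
&= \lim_{h\to 0} \frac{n x^{n-1} h + \frac{n(n-1)}{2} x^{n-2} h^2 + \cdots + h^n}{h} \\
&= \lim_{h\to 0} \left(n x^{n-1} + \frac{n(n-1)}{2} x^{n-2} h + \cdots + h^{n-1}\right) = n x^{n-1}.
\end{align*}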

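Example 2.4. Show that $\frac{d}{dx} \sin x = \cos x$.

Solution. Using $\sin A - \sin B = 2\cos\frac{A+B}{2}\,\sin\frac{A-B}{2}$,

\begin{align*}
\frac{d}{dx}\sin x &= \lim_{h\to 0} \frac{\sin(x+h) - \sin x}{h} = \lim_{h\to 0} \frac{2\cos\frac{2x+h}{2}\,\sin\frac{h}{2}}{h} \\
&= \lim_{h\to 0} \cos\left(x + \frac{h}{2}\right)\cdot \lim_{h\to 0} \frac{\sin\frac{h}{2}}{\frac{h}{2}} \\
&= (\cos x)\cdot 1 = \cos x.
\end{align*}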
2.2 Standard Derivatives & Differentiation Rules

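Table 1

Function            Derivative

$x^n$               $n x^{n-1}$
$\cos(x)$           $-\sin(x)$
$\sin(x)$           $\cos(x)$
$\tan(x)$           $\sec^2(x)$
$\sec(x)$           $\sec(x)\tan(x)$
$\csc(x)$           $-\csc(x)\cot(x)$
$\cot(x)$           $-\csc^2(x)$
$e^x$               $e^x$
$\ln(x)$            $\frac{1}{x}$
$\sin^{-1}(x)$      $\frac{1}{\sqrt{1-x^2}}$
$\cos^{-1}(x)$      $-\frac{1}{\sqrt{1-x^2}}$
$\tan^{-1}(x)$      $\frac{1}{1+x^2}$
$\cot^{-1}(x)$      $-\frac{1}{1+x^2}$
$\sec^{-1}(x)$      $\frac{1}{|x|\sqrt{x^2-1}}$, $|x| > 1$
$\csc^{-1}(x)$      $-\frac{1}{|x|\sqrt{x^2-1}}$, $|x| > 1$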
Table 2

Function               Derivative

$(g(x))^n$             $n g'(x) (g(x))^{n-1}$
$\cos(g(x))$           $-g'(x)\sin(g(x))$
$\sin(g(x))$           $g'(x)\cos(g(x))$
$\tan(g(x))$           $g'(x)\sec^2(g(x))$
$\sec(g(x))$           $g'(x)\sec(g(x))\tan(g(x))$
$\csc(g(x))$           $-g'(x)\csc(g(x))\cot(g(x))$
$\cot(g(x))$           $-g'(x)\csc^2(g(x))$
$e^{g(x)}$             $g'(x) e^{g(x)}$
$\ln(g(x))$            $\frac{g'(x)}{g(x)}$
$\sin^{-1}(g(x))$      $\frac{g'(x)}{\sqrt{1-g(x)^2}}$
$\cos^{-1}(g(x))$      $-\frac{g'(x)}{\sqrt{1-g(x)^2}}$
$\tan^{-1}(g(x))$      $\frac{g'(x)}{1+g(x)^2}$
$\cot^{-1}(g(x))$      $-\frac{g'(x)}{1+g(x)^2}$
$\sec^{-1}(g(x))$      $\frac{g'(x)}{|g(x)|\sqrt{g(x)^2-1}}$, $|g(x)| > 1$
$\csc^{-1}(g(x))$      $-\frac{g'(x)}{|g(x)|\sqrt{g(x)^2-1}}$, $|g(x)| > 1$

Note that when g(x) = x, the formulae in Table 2 reduce to those in Table 1.
The formulae in Table 2 follow readily from those in Table 1 and the Chain Rule (which we
will see very soon).

Rules of Differentiation
Let u and v be differentiable functions of x, and let c be a constant.

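Constant Rule             $\frac{d}{dx}(c) = 0$

Constant Multiple Rule    $\frac{d}{dx}(cu) = c\,\frac{du}{dx}$

Sum Rule                  $\frac{d}{dx}(u + v) = \frac{du}{dx} + \frac{dv}{dx}$

Product Rule              $\frac{d}{dx}(uv) = \frac{du}{dx}\,v + u\,\frac{dv}{dx}$

Quotient Rule             $\frac{d}{dx}\left(\frac{u}{v}\right) = \frac{\frac{du}{dx}\,v - u\,\frac{dv}{dx}}{v^2}$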

Example 2.5. Differentiate $x^3 + x^2 \tan x$ with respect to $x$.

Solution. By the sum rule and product rule,

\begin{align*}
\frac{d}{dx}(x^3 + x^2\tan x) &= \frac{d}{dx} x^3 + \frac{d}{dx}(x^2 \tan x) \\
&= 3x^2 + \frac{d}{dx}(x^2)\cdot\tan x + x^2\cdot\frac{d}{dx}\tan x \\
&= 3x^2 + 2x\tan x + x^2\sec^2 x.
\end{align*}

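Example 2.6. Differentiate $\frac{x^2}{\ln x}$ with respect to $x$.

Solution. By the quotient rule,

\begin{align*}
\frac{d}{dx}\,\frac{x^2}{\ln x} &= \frac{\frac{d}{dx}(x^2)\cdot\ln x - x^2\cdot\frac{d}{dx}\ln x}{(\ln x)^2} \\
&= \frac{2x\ln x - x^2\cdot\frac{1}{x}}{(\ln x)^2} \\
&= \frac{2x\ln x - x}{(\ln x)^2}.
\end{align*}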
Let $f(u)$ be differentiable at $u = g(x)$, and let $g$ be a differentiable function of $x$.

Chain Rule    $\frac{d}{dx}\bigl(f(g(x))\bigr) = f'(g(x))\cdot g'(x)$

Write $y = f(g(x))$ and $u = g(x)$ as functions of $x$. Then the Chain Rule can be abbreviated as
\[ \frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dx}, \]
with the understanding that $\frac{dy}{du}$ is evaluated at $u = g(x)$, so that all three expressions are
regarded as functions of $x$.
Example 2.7. Differentiate $\sin(x^3 + x + 2)$ with respect to $x$.

Solution. Let $u = x^3 + x + 2$, so that

\[ y = \sin(x^3 + x + 2) = \sin u, \quad\text{and}\quad u = x^3 + x + 2. \]

Then by the Chain Rule,

\begin{align*}
\frac{dy}{dx} &= \frac{dy}{du}\cdot\frac{du}{dx} \\
&= \cos u\cdot(3x^2 + 1) \\
&= \bigl(\cos(x^3 + x + 2)\bigr)\cdot(3x^2 + 1) \\
\implies \frac{d}{dx}\sin(x^3 + x + 2) &= (3x^2 + 1)\cos(x^3 + x + 2).
\end{align*}
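
The Chain Rule result in Example 2.7 can be checked numerically. The sketch below is our own illustration: it compares the formula with a central difference approximation at a sample point. The helper `central_difference` and the sample point $x = 0.7$ are arbitrary assumptions, not part of the notes.

```python
import math

def y(x):
    # The composite function from Example 2.7.
    return math.sin(x**3 + x + 2)

def dy_dx(x):
    # Derivative given by the Chain Rule: (3x^2 + 1) cos(x^3 + x + 2).
    return (3 * x**2 + 1) * math.cos(x**3 + x + 2)

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient, accurate to order h^2.
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.7
print("chain rule:", dy_dx(x0))
print("numerical :", central_difference(y, x0))
```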

Example 2.8. Use the Chain Rule to derive the formula $\frac{d}{dx}(g(x))^n = n g'(x)(g(x))^{n-1}$ (in Table 2)
from a corresponding formula in Table 1.

Solution. Let $u = g(x)$, so that

\[ y = (g(x))^n = u^n, \quad\text{and}\quad u = g(x). \]

Then by the Chain Rule and the first formula in Table 1,

\begin{align*}
\frac{dy}{dx} &= \frac{dy}{du}\cdot\frac{du}{dx} \\
&= n u^{n-1}\cdot g'(x) \\
&= n (g(x))^{n-1}\cdot g'(x) \\
\implies \frac{d}{dx}(g(x))^n &= n g'(x)(g(x))^{n-1}.
\end{align*}
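Example 2.9. Differentiate $\ln\left(\frac{x^2}{(6x-7)^2}\right)$ with respect to $x$.

Ans: $\frac{2}{x} - \frac{12}{6x-7}$.

Solution. First we have

\[ \ln\left(\frac{x^2}{(6x-7)^2}\right) = 2\ln\left(\frac{x}{6x-7}\right) = 2\ln(x) - 2\ln(6x-7). \]

Thus
\[ \frac{d}{dx}\ln\left(\frac{x^2}{(6x-7)^2}\right) = \frac{2}{x} - 2\cdot\frac{1}{6x-7}\cdot 6 = \frac{2}{x} - \frac{12}{6x-7}. \]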


Exercise 2.2. Differentiate with respect to $x$.

(a) $(x+1)^2 \tan^{-1}(\sqrt{x})$

(b) $\frac{\sin^{-1}(2x)}{\sqrt{1-4x^2}}$

Ans: (a) $2(x+1)\tan^{-1}(\sqrt{x}) + \frac{x+1}{2\sqrt{x}}$, (b) $\frac{2}{1-4x^2} + \frac{4x\sin^{-1}(2x)}{(1-4x^2)^{3/2}}$.

2.3 Implicit Differentiation


The diagram shows the curve given implicitly by the equation

x3 + y 3 − 9xy = 0

known as the folium of Descartes.

Suppose we wish to find the gradient of the curve at the point $(2, 4)$. For this example, getting
an explicit expression for $y$ in terms of $x$ is challenging. It turns out that it is possible to find
$\frac{dy}{dx}$ by a method known as implicit differentiation. This consists of differentiating both sides
of the given equation with respect to $x$ and solving the resulting equation for $\frac{dy}{dx}$. When
differentiating a function in $y$ with respect to $x$, we need to use the Chain Rule as follows:

\[ \frac{d}{dx} g(y) = g'(y)\,\frac{dy}{dx}. \]

Example 2.10. Consider the curve $x^3 + y^3 - 9xy = 0$. Find $\frac{dy}{dx}$.

Ans: $\frac{dy}{dx} = -\frac{3x^2 - 9y}{3y^2 - 9x}$.

Solution. Differentiating both sides of the equation with respect to $x$, we have

\[ 3x^2 + 3y^2\,\frac{dy}{dx} - 9y - 9x\,\frac{dy}{dx} = 0. \]

Solving for $\frac{dy}{dx}$, we obtain

\begin{align*}
(3y^2 - 9x)\,\frac{dy}{dx} &= -3x^2 + 9y \\
\implies \frac{dy}{dx} &= -\frac{3x^2 - 9y}{3y^2 - 9x}.
\end{align*}
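
To illustrate how the implicit formula is used, the sketch below evaluates the gradient of the folium at the point $(2, 4)$ mentioned earlier and cross-checks it numerically. The cross-check is our own assumption-laden illustration: it solves the curve equation for $y$ near $y = 4$ with Newton's method (the helper `solve_y` is hypothetical) and then takes a central difference.

```python
def F(x, y):
    # Left-hand side of the curve equation x^3 + y^3 - 9xy = 0.
    return x**3 + y**3 - 9 * x * y

def slope(x, y):
    # dy/dx from Example 2.10 (valid where 3y^2 - 9x != 0).
    return -(3 * x**2 - 9 * y) / (3 * y**2 - 9 * x)

def solve_y(x, y_guess, steps=50):
    # Newton's method in y for fixed x, started near the branch through (2, 4).
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / (3 * y**2 - 9 * x)
    return y

h = 1e-6
numeric = (solve_y(2 + h, 4.0) - solve_y(2 - h, 4.0)) / (2 * h)
print("formula at (2, 4):", slope(2.0, 4.0))
print("numerical check  :", numeric)
```

Since $F(2, 4) = 8 + 64 - 72 = 0$, the point does lie on the curve, and the formula gives the gradient $-\frac{12 - 36}{48 - 18} = \frac{4}{5}$ there.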


Example 2.11. Find $\frac{dy}{dx}$ for points on the curve $x^3 e^y + \cos(xy) = 2024$.

Solution. Differentiating both sides of the equation with respect to $x$, we have

\[ \underbrace{3x^2 e^y + x^3 e^y\,\frac{dy}{dx}}_{\text{apply product rule to } x^3 e^y} - \sin(xy)\left(x\,\frac{dy}{dx} + y\right) = 0. \]

Solving for $\frac{dy}{dx}$, we obtain

\begin{align*}
\bigl(x^3 e^y - x\sin(xy)\bigr)\,\frac{dy}{dx} &= -3x^2 e^y + y\sin(xy) \\
\implies \frac{dy}{dx} &= \frac{-3x^2 e^y + y\sin(xy)}{x^3 e^y - x\sin(xy)}.
\end{align*}
2.4 Derivatives of Inverse Functions
Recall that
(i) a bijective function (or one-one function) $f$ has an inverse $f^{-1}$ defined on the range of $f$.
(ii) increasing or decreasing functions are bijective.

Theorem 2.2. Let $f$ be bijective and differentiable on an open interval $I$. Then, for every $a$ in the
range of $f$ with $f'(f^{-1}(a)) \neq 0$,

\[ (f^{-1})'(a) = \frac{1}{f'(f^{-1}(a))}. \]

Let $b = f^{-1}(a)$, so that $a = f(b)$. At the point $(a, f^{-1}(a)) = (a, b)$ on the curve $y = f^{-1}(x)$, the
gradient of the tangent is
\[ (f^{-1})'(a) = \frac{1}{f'(f^{-1}(a))} = \frac{1}{f'(b)}. \]

Proof. Let $b = f^{-1}(a)$, so that $a = f(b)$. As $f^{-1}$ is the inverse function of $f$, we have $f^{-1}(f(x)) = x$
for all $x$. By the chain rule,

\begin{align*}
& (f^{-1})'(f(x))\cdot f'(x) = 1 \\
\implies\ & (f^{-1})'(f(b))\cdot f'(b) = 1 \\
\implies\ & (f^{-1})'(a)\cdot f'(b) = 1 \\
\implies\ & (f^{-1})'(a) = \frac{1}{f'(b)} = \frac{1}{f'(f^{-1}(a))}.
\end{align*}
Remark. Using $a = f(b)$ and letting $b$ vary over $I$, one obtains the formula
\[ (f^{-1})'(f(x)) = \frac{1}{f'(x)} \]
relating the derivative functions of $f^{-1}$ and $f$. This formula is often abbreviated as follows:
Let $y = y(x)$ be a function. Suppose $y$ has an inverse function $x = x(y)$. Then

\[ \frac{dx}{dy} = \frac{1}{\frac{dy}{dx}}, \]

with the understanding that $\frac{dx}{dy}$ is evaluated at $y = y(x)$, and both expressions are regarded
as functions of $x$.

This can also be obtained directly as follows: Note that

\[ x = x(y(x)) \]

for all $x$. Differentiating both sides (and applying the abbreviated Chain Rule on the right
hand side), we get
\[ \frac{dx}{dx} = \frac{d}{dx}\bigl(x(y(x))\bigr) \implies 1 = \frac{dx}{dy}\cdot\frac{dy}{dx} \implies \frac{dy}{dx} = \frac{1}{\frac{dx}{dy}}, \]
with the understanding that $\frac{dx}{dy}$ is evaluated at $y = y(x)$.

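Example 2.12. Show that $\frac{d}{dx}\sin^{-1}(x) = \frac{1}{\sqrt{1-x^2}}$.

(Recall that $\sin^{-1}(x) \in [-\frac{\pi}{2}, \frac{\pi}{2}]$ for all $x \in [-1, 1]$.)

Solution. Let $y = \sin^{-1}(x)$, so that $x = \sin y$. Then

\[ \frac{dy}{dx} = \frac{1}{\frac{dx}{dy}} \implies \frac{d}{dx}\sin^{-1}(x) = \frac{1}{\frac{d}{dy}(\sin y)} = \frac{1}{\cos y}. \]

Note that
\[ y = \sin^{-1}(x) \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \implies \cos y \ge 0 \implies \cos y = \sqrt{1 - \sin^2 y} = \sqrt{1 - x^2}. \]

Thus $\frac{d}{dx}\sin^{-1}(x) = \frac{1}{\sqrt{1-x^2}}$ for $-1 < x < 1$.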
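
The two steps of Example 2.12 can be checked numerically. The following sketch (our own illustration, with an arbitrarily chosen sample point $x_0 = 0.3$) compares $\frac{1}{\cos y}$ with $\frac{1}{\sqrt{1-x^2}}$ and with a central difference of `math.asin`.

```python
import math

def arcsin_derivative(x):
    # Closed form from Example 2.12, valid for -1 < x < 1.
    return 1.0 / math.sqrt(1.0 - x * x)

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.3
y0 = math.asin(x0)              # y = arcsin(x), so x = sin(y)
via_cos = 1.0 / math.cos(y0)    # 1 / (dx/dy) with dx/dy = cos(y)
numeric = central_difference(math.asin, x0)

print(arcsin_derivative(x0), via_cos, numeric)
```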


Exercise 2.3. Let $f(x) = x^5 - 2e^{-2x} + 3$.
(a) Show that $f$ is bijective by showing that $f$ is increasing on $\mathbb{R}$.
(b) Find the gradient of the tangent to the curve $y = f^{-1}(x)$ at the point $(1, 0)$.
(c) Let $g(x) = \frac{1}{2x^{-1} + 3f^{-1}(x)}$. Find the value of $g'(1)$.

Ans: (b) $\frac{1}{4}$, (c) $\frac{5}{16}$.

2.5 Higher-order Derivatives


Given a differentiable function $f$, we can find the derivative of its derivative function $f'$
with respect to $x$ to get a new function, denoted by $f''$ or $f^{(2)}$, called the second derivative of
$f$, provided that $f'$ is differentiable. That is,

\[ f''(x) = \frac{d}{dx} f'(x). \]

Notations. Let $y = f(x)$.

\[ f^{(2)}(x) = f''(x) = \frac{d^2 y}{dx^2} = y'' = D^2 f(x). \]

In general, we can define the $n$th order derivative of $f$ for any positive integer $n$ provided
the derivative exists. For example, the third derivative of $y = f(x)$ is defined by
\[ f^{(3)}(x) = f'''(x) = \frac{d}{dx} f^{(2)}(x) = \frac{d}{dx}\left(\frac{d}{dx}\left(\frac{d}{dx} f(x)\right)\right). \]

We shall denote the $n$th order derivative of $f$ by

\[ f^{(n)}(x) = \frac{d^n y}{dx^n} = D^n y = D^n f(x). \]
Example 2.13. Let $n$ be a positive integer. For a non-negative integer $k$, find $\frac{d^k}{dx^k} x^n$.

Solution.

$k = 0$: no differentiation is performed, so the result is $x^n$ itself.

$k = 1$: $\frac{d}{dx} x^n = n x^{n-1}$.

$k = 2$: $\frac{d^2}{dx^2} x^n = n(n-1) x^{n-2}$.

$k = 3$: $\frac{d^3}{dx^3} x^n = n(n-1)(n-2) x^{n-3}$.

$\vdots$

$k \le n$: $\frac{d^k}{dx^k} x^n = n(n-1)(n-2)\cdots(n-k+1)\, x^{n-k}$.

In particular, when $k = n$, we have $\frac{d^n}{dx^n} x^n = n(n-1)(n-2)\cdots(1) = n!$.

$k = n+1$: $\frac{d^{n+1}}{dx^{n+1}} x^n = \frac{d}{dx}\left(\frac{d^n}{dx^n} x^n\right) = \frac{d}{dx}(n!) = 0$.

Similarly, we have $\frac{d^k}{dx^k} x^n = 0$ for all $k \ge n+1$.
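
The pattern in Example 2.13 translates directly into a small routine. The sketch below is our own illustration (the function name `kth_derivative_of_power` is hypothetical): it returns the coefficient and exponent of $\frac{d^k}{dx^k} x^n$ as a pair of integers.

```python
def kth_derivative_of_power(n, k):
    """Return (coefficient, exponent) of the k-th derivative of x**n.

    Assumes integers n >= 1 and k >= 0. For k > n the derivative is the
    zero function, reported here as (0, 0).
    """
    if k > n:
        return (0, 0)
    coefficient = 1
    for i in range(k):
        coefficient *= n - i      # n (n-1) ... (n-k+1)
    return (coefficient, n - k)

print(kth_derivative_of_power(5, 2))   # second derivative of x^5
print(kth_derivative_of_power(4, 4))   # k = n gives n!
print(kth_derivative_of_power(4, 5))   # k > n gives 0
```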

2.6 Parametric Equations


A curve defined by the parametric equations

x = f (t) and y = g(t), (t is the parameter)

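is differentiable at a point where $t = t_0$ if both $f$ and $g$ are differentiable at $t = t_0$. Usually we
also assume $f'(t_0) \neq 0$ or $g'(t_0) \neq 0$.
By the chain rule, wherever $f'(t) \neq 0$,

\[ \frac{dy}{dt} = \frac{dy}{dx}\cdot\frac{dx}{dt} \implies \frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{g'(t)}{f'(t)}, \]
and

\[ \frac{d^2 y}{dx^2} = \frac{d}{dt}\left(\frac{dy}{dx}\right) \div \frac{dx}{dt} = \frac{\frac{d}{dt}\left(\frac{g'(t)}{f'(t)}\right)}{f'(t)} = \frac{g''(t) f'(t) - g'(t) f''(t)}{f'(t)^3}. \]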

Examples of parametric curves

• Ellipses
$x = a\cos t + x_0$ and $y = b\sin t + y_0$,
where $a > 0$, $b > 0$, $x_0$ and $y_0$ are fixed constants and $0 \le t < 2\pi$.

• Circles
$x = r\cos t + x_0$ and $y = r\sin t + y_0$,
where $r > 0$, $x_0$ and $y_0$ are fixed constants and $0 \le t < 2\pi$.

• Hyperbolas
$x = a\sec t + x_0$ and $y = b\tan t + y_0$,

or
$x = a\tan t + x_0$ and $y = b\sec t + y_0$,

where $a > 0$, $b > 0$, $x_0$ and $y_0$ are fixed constants and $-\pi \le t \le \pi$, $t \neq -\frac{\pi}{2}, \frac{\pi}{2}$.

Example 2.14. For the parametric curve given by

\[ x = 2t - t^2, \quad y = t - t^3, \]

find the point(s) on the curve at which the tangent is parallel to the line $2y = x + 2024$.

Ans: $(0, 0)$ and $(\frac{5}{9}, \frac{8}{27})$.
Solution. We are given $x = 2t - t^2$, $y = t - t^3$.

First,
\[ \frac{dy}{dx} = \frac{\frac{dy}{dt}}{\frac{dx}{dt}} = \frac{1 - 3t^2}{2 - 2t}. \]

The gradient of the line $2y = x + 2024$ is $\frac{1}{2}$.

So we look for the value(s) of $t$ such that $\frac{dy}{dx} = \frac{1}{2}$. Thus
\[ \frac{1 - 3t^2}{2 - 2t} = \frac{1}{2} \iff 1 - 3t^2 = 1 - t \iff t(3t - 1) = 0 \iff t = 0, \frac{1}{3}. \]

When $t = 0$, $(x, y) = (2\cdot 0 - 0^2,\ 0 - 0^3) = (0, 0)$.

When $t = \frac{1}{3}$, $(x, y) = \left(2\cdot\frac{1}{3} - (\frac{1}{3})^2,\ \frac{1}{3} - (\frac{1}{3})^3\right) = \left(\frac{5}{9}, \frac{8}{27}\right)$.

The corresponding points on the curve are $(0, 0)$ and $(\frac{5}{9}, \frac{8}{27})$.
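
The answer to Example 2.14 can be confirmed with exact rational arithmetic. The sketch below is our own check, using Python's standard `fractions` module; the function names are chosen for illustration only.

```python
from fractions import Fraction

def x(t):
    return 2 * t - t**2

def y(t):
    return t - t**3

def slope(t):
    # dy/dx = (dy/dt) / (dx/dt); requires t != 1 so that dx/dt != 0.
    return (1 - 3 * t**2) / (2 - 2 * t)

for t in (Fraction(0), Fraction(1, 3)):
    print("t =", t, "point =", (x(t), y(t)), "gradient =", slope(t))
```

Both parameter values give gradient $\frac{1}{2}$, matching the gradient of the line $2y = x + 2024$.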


2.7 Miscellaneous examples

Functions of the form $f(x)^{g(x)}$.

The derivative of $y = f(x)^{g(x)}$ can be found by first finding the derivative of $\ln y$ and then
solving for $\frac{dy}{dx}$.

Example 2.15. Differentiate with respect to $x$.

(a) $(x^2 - 1)^{4\tan x}$,
(b) $5^{x\ln x}$.

Ans: (a) $(x^2 - 1)^{4\tan x}\left(4\sec^2 x \ln(x^2 - 1) + \frac{8x\tan x}{x^2 - 1}\right)$,
(b) $5^{x\ln x}\,\ln 5\,(1 + \ln x)$.

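Solution. (a) Let $y = (x^2 - 1)^{4\tan x}$, so that

\[ \ln y = \ln (x^2 - 1)^{4\tan x} = 4\tan x\cdot\ln(x^2 - 1). \]

Differentiating both sides with respect to $x$, we have
\begin{align*}
\frac{1}{y}\frac{dy}{dx} &= (4\tan x)'\ln(x^2 - 1) + \frac{(x^2 - 1)'}{x^2 - 1}\,(4\tan x) \\
&= 4\sec^2 x\,\ln(x^2 - 1) + \frac{2x}{x^2 - 1}\cdot 4\tan x \\
&= 4\sec^2 x\,\ln(x^2 - 1) + \frac{8x\tan x}{x^2 - 1} \\
\implies \frac{dy}{dx} &= y\cdot\left(4\sec^2 x\,\ln(x^2 - 1) + \frac{8x\tan x}{x^2 - 1}\right) \\
&= (x^2 - 1)^{4\tan x}\left(4\sec^2 x\,\ln(x^2 - 1) + \frac{8x\tan x}{x^2 - 1}\right).
\end{align*}

(b) Let $y = 5^{x\ln x}$. Then

\[ \ln y = \ln(5^{x\ln x}) = x\ln x\cdot\ln 5. \]

Differentiating both sides with respect to $x$, we have
\begin{align*}
\frac{1}{y}\frac{dy}{dx} &= \ln 5\cdot(x\ln x)' \\
&= \ln 5\cdot\left(x\cdot\frac{1}{x} + 1\cdot\ln x\right) \\
&= \ln 5\cdot(1 + \ln x) \\
\implies \frac{dy}{dx} &= y\cdot\ln 5\cdot(1 + \ln x) \\
\implies \frac{d}{dx}\left(5^{x\ln x}\right) &= 5^{x\ln x}\,\ln 5\,(1 + \ln x).
\end{align*}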


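Change of base formula

\[ \log_a x = \frac{\ln x}{\ln a}, \quad a > 0 \text{ and } a \neq 1. \]

Example 2.16. Differentiate $\log_{(1+x^2)} \sqrt{x}$ with respect to $x$.

Ans: $\frac{1}{2(\ln(1+x^2))^2}\left(\frac{\ln(1+x^2)}{x} - \frac{2x\ln x}{1+x^2}\right)$.

Solution.

\begin{align*}
\frac{d}{dx}\log_{(1+x^2)}\sqrt{x}
&= \frac{d}{dx}\,\frac{\ln\sqrt{x}}{\ln(1+x^2)} = \frac{d}{dx}\,\frac{\ln x}{2\ln(1+x^2)} \\
&= \frac{1}{2}\cdot\frac{(\ln x)'\ln(1+x^2) - \ln x\cdot(\ln(1+x^2))'}{(\ln(1+x^2))^2} \\
&= \frac{1}{2}\cdot\frac{\frac{1}{x}\ln(1+x^2) - \ln x\cdot\frac{2x}{1+x^2}}{(\ln(1+x^2))^2} \\
&= \frac{1}{2(\ln(1+x^2))^2}\left(\frac{\ln(1+x^2)}{x} - \frac{2x\ln x}{1+x^2}\right).
\end{align*}
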
Chapter 3

Applications of Differentiation

Read Thomas’ Calculus, Chapter 4.

3.1 Tangents and Normals

• The tangent at the point $(x_0, f(x_0))$ on the graph of a differentiable function $f$ has
equation
\[ y - f(x_0) = m(x - x_0). \]

• The normal at the point $(x_0, f(x_0))$ on the graph of a differentiable function $f$ has equation
\[ y - f(x_0) = -\frac{1}{m}(x - x_0) \quad (m \neq 0). \]

In both equations, $m = f'(x_0)$.

Example 3.1. The curve $C$ has equation $x^2 + y^2 + 3xy = 5$.

(a) Find the equations of the tangent and normal at the point $(1, 1)$.

(b) Find (if any) the equations of the tangents that are parallel to the axes.

Ans: (a) tangent: $y = -x + 2$, normal: $y = x$.

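Solution. (a) Differentiating the equation $x^2 + y^2 + 3xy = 5$ with respect to $x$, we have

\[ 2x + 2yy' + 3y + 3xy' = 0. \]

Thus $y' = -\frac{2x + 3y}{3x + 2y}$.

At $(1, 1)$, we have $y' = -\frac{5}{5} = -1$. Thus the gradient of the tangent at $(1, 1)$ is $-1$ and the
gradient of the normal is $1$.

Therefore, the equation of the tangent is $y - 1 = -(x - 1)$. That is, $y = -x + 2$.

The equation of the normal is $y - 1 = x - 1$. That is, $y = x$.

(b) Recall that $y' = -\frac{2x + 3y}{3x + 2y}$.

If a tangent is parallel to the $x$-axis, then its gradient is 0. Thus we shall find the points on
the curve such that $y' = 0$. That is, $-\frac{2x + 3y}{3x + 2y} = 0 \iff 2x + 3y = 0 \iff y = -\frac{2x}{3}$.

Substituting this into the equation of the curve $x^2 + y^2 + 3xy = 5$, we obtain

\[ x^2 + \left(-\frac{2x}{3}\right)^2 + 3x\left(-\frac{2x}{3}\right) = 5. \]

That is, $-\frac{5x^2}{9} = 5$, which has no solution.

Similarly, a tangent parallel to the $y$-axis can only occur where $3x + 2y = 0$, that is, $y = -\frac{3x}{2}$.
Substituting into the equation of the curve gives $x^2 + \frac{9x^2}{4} - \frac{9x^2}{2} = 5$, that is, $-\frac{5x^2}{4} = 5$,
which again has no solution. Hence no tangent to $C$ is parallel to either axis.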
9


If a curve is defined parametrically by

\[ x = x(t) \quad\text{and}\quad y = y(t), \]
then

• the equation of the tangent at the point where $t = t_0$ is

\[ y - y(t_0) = m(x - x(t_0)), \]

• the equation of the normal at the point where $t = t_0$ is

\[ y - y(t_0) = -\frac{1}{m}(x - x(t_0)), \]

where $m$ is the value of $\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt}$ at $t = t_0$.
Example 3.2. A curve is defined by

\[ x = t^2 - t \quad\text{and}\quad y = (t + 1)^2. \]

The tangent at the point $A$ on the curve passes through $(-1, 0)$ and $(1, 8)$. Find the equation of the
normal at the point $A$.

Ans: $y = -\frac{x}{4} + 4$.

Solution. The gradient of the line through $(-1, 0)$ and $(1, 8)$ is $\frac{8 - 0}{1 - (-1)} = 4$.

Also $\frac{dy}{dx} = \frac{y'(t)}{x'(t)} = \frac{2t + 2}{2t - 1}$.

Solving $\frac{dy}{dx} = 4$, that is, $\frac{2t + 2}{2t - 1} = 4$, gives $t = 1$.

Thus the point $A$ on the curve has coordinates $(1^2 - 1, (1 + 1)^2) = (0, 4)$.

The equation of the normal through $A$ is

\[ \frac{y - 4}{x - 0} = -\frac{1}{4}, \quad\text{or equivalently,}\quad y = -\frac{x}{4} + 4. \]


3.2 Increasing and Decreasing Functions

Definition 3.1. (a) The function $f$ is increasing on an interval $I$ if $f(x_2) > f(x_1)$ for all $x_1, x_2 \in I$
with $x_2 > x_1$.
(b) The function $f$ is decreasing on an interval $I$ if $f(x_2) < f(x_1)$ for all $x_1, x_2 \in I$ with $x_2 > x_1$.

Theorem 3.1. Let $f$ be differentiable on $(a, b)$ and continuous on $[a, b]$.

(a) $f$ is increasing on $[a, b]$ if $f'(x) > 0$ for all $x$ in $(a, b)$.
(b) $f$ is decreasing on $[a, b]$ if $f'(x) < 0$ for all $x$ in $(a, b)$.

More generally, if $f'(x) > 0$ ($< 0$) on $(a, b)$ except possibly at a finite number of points at which $f'(x) = 0$,
then $f$ is increasing (decreasing) on $[a, b]$.

Example 3.3. Show that the function $f(x) = 3x^3 - 3e^{-x} - \frac{4}{x}$ is a bijection from the interval
$(0, \infty)$ onto $\mathbb{R}$.

Solution. $f'(x) = 9x^2 + 3e^{-x} + \frac{4}{x^2} > 0$ for all $x \in (0, \infty)$.

Thus $f$ is increasing on $(0, \infty)$. It follows that $f$ is injective on $(0, \infty)$.

Note that $\lim_{x\to 0^+}\left(3x^3 - 3e^{-x} - \frac{4}{x}\right) = -\infty$ and $\lim_{x\to +\infty}\left(3x^3 - 3e^{-x} - \frac{4}{x}\right) = +\infty$.

As $f$ is continuous on $(0, \infty)$, the range of $f$ is $\mathbb{R}$.

Thus $f$ is bijective from $(0, \infty)$ onto its range $\mathbb{R}$.




3.3 Concave Upward and Concave Downward Functions

Definition 3.2. (a) The graph of $f$ is concave upward (downward) at $(c, f(c))$ if $f'(c)$ exists
and there is an open interval $I$ containing $c$ such that for all $x \neq c$ in $I$, the point $(x, f(x))$ on
the graph of $f$ is above (below) the tangent line to the graph of $f$ at $x = c$.
(b) The graph of $f$ is concave upward (downward) on $(a, b)$ if it is concave upward (downward)
at every point in $(a, b)$.

The following theorem gives a test for concavity.

Theorem 3.2. Let $f$ be differentiable on $(a, b)$. Let $c \in (a, b)$.

(a) If $f''(c) > 0$, then the graph of $f$ is concave upward at $(c, f(c))$.
(b) If $f''(c) < 0$, then the graph of $f$ is concave downward at $(c, f(c))$.

Note that the converse of the theorem is not true. For example, if $f(x) = x^4$, then the graph
of $f$ is concave upward at $(0, 0)$, but $f''(0) = 0$.

Definition 3.3. A point (c, f (c)) where the graph of a function f has a tangent line and where
the concavity changes is called a point of inflection.

Point of inflection: change of concavity

Theorem 3.3. Let $f$ be differentiable on $(a, b)$. Let $c \in (a, b)$. If $(c, f(c))$ is a point of inflection
of the graph of $f$ and $f''(c)$ exists, then $f''(c) = 0$.

Example 3.4. Determine the intervals on which the function

\[ f(x) = -2x^3 + 15x^2 - 24x + 7 \]

is
(i) increasing, (ii) decreasing, and its graph is (iii) concave upward (iv) concave downward. (v)
Find the point(s) of inflection of the graph of $f$.

Ans: (i) $[1, 4]$, (ii) $(-\infty, 1]$ and $[4, \infty)$, (iii) $(-\infty, \frac{5}{2})$, (iv) $(\frac{5}{2}, \infty)$, (v) $(\frac{5}{2}, \frac{19}{2})$.

Solution. $f'(x) = -6x^2 + 30x - 24 = -6(x - 1)(x - 4)$.

$x$        $x < 1$   $x = 1$   $1 < x < 4$   $x = 4$   $4 < x$
$f'(x)$    $-$       $0$       $+$           $0$       $-$

Thus $f$ is increasing on $[1, 4]$, and decreasing on each of the intervals $(-\infty, 1]$ and $[4, \infty)$.

We have $f''(x) = -12x + 30 = -12(x - \frac{5}{2})$.

$x$         $x < \frac{5}{2}$   $x = \frac{5}{2}$   $\frac{5}{2} < x$
$f''(x)$    $+$                 $0$                 $-$

Thus the graph of $f$ is concave upward on $(-\infty, \frac{5}{2})$ and concave downward on $(\frac{5}{2}, \infty)$.

There is a change of concavity of the graph of $f$ at $(\frac{5}{2}, \frac{19}{2})$. Thus $(\frac{5}{2}, \frac{19}{2})$ is a point of inflection
of the graph of $f$.

Summarizing, (i) increasing on $[1, 4]$, (ii) decreasing on $(-\infty, 1]$ and on $[4, \infty)$, (iii) concave upward
on $(-\infty, \frac{5}{2})$, (iv) concave downward on $(\frac{5}{2}, \infty)$, (v) inflection point at $x = \frac{5}{2}$.
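
The sign tables in Example 3.4 can be reproduced by sampling $f'$ and $f''$ at one test point in each interval. The sketch below is our own illustration; the test points and the helper `sign` are arbitrary choices.

```python
def f(x):
    return -2 * x**3 + 15 * x**2 - 24 * x + 7

def f1(x):
    # First derivative: -6(x - 1)(x - 4).
    return -6 * x**2 + 30 * x - 24

def f2(x):
    # Second derivative: -12(x - 5/2).
    return -12 * x + 30

def sign(value):
    return "+" if value > 0 else "-" if value < 0 else "0"

# One sample point per interval of the first sign table.
print([sign(f1(x)) for x in (0, 1, 2, 4, 5)])
# One sample point per interval of the second sign table.
print([sign(f2(x)) for x in (2, 2.5, 3)])
print("inflection point:", (2.5, f(2.5)))
```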

3.4 Related Rates

Let $y = f(x)$ and let $x$ and $y$ be functions of a third variable $t$ that represents, for example,
time. By the Chain Rule,
\[ \frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}. \]

Example 3.5. Water is flowing at a rate of 100 cm$^3$ per second into an inverted conical flask of
height 16 cm and base radius 4 cm. At the instant when the height of water level is 12 cm, the water
level is rising at the rate of 3 cm per sec. Calculate the rate at which water is leaking from the flask.

Ans: 15.18 cm$^3$ per second.

Solution. Let $r$ and $h$ be the radius and height of the water surface. By similar triangles, $\frac{r}{h} = \frac{4}{16} \Rightarrow r = \frac{h}{4}$. Then
\[ V = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi\left(\frac{h}{4}\right)^2 h. \]

Thus $V = \frac{\pi}{48} h^3$. Differentiating with respect to $t$,
\[ \frac{dV}{dt} = \frac{dV}{dh}\cdot\frac{dh}{dt} = \frac{\pi}{48}\cdot 3h^2\cdot\frac{dh}{dt}. \]

At $h = 12$, $\frac{dh}{dt} = 3$ cm per second. Thus the volume of water in the flask is increasing at the net rate
\[ \frac{dV}{dt} = \frac{\pi}{48}\times 3\times(12)^2\times 3 = 27\pi \approx 84.82 \text{ cm}^3 \text{ per second.} \]

Therefore, the leaking rate is $100 - 27\pi \approx 15.18$ cm$^3$ per second.
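
The arithmetic in Example 3.5 is easy to reproduce. The following sketch is our own illustration (variable names are hypothetical): it computes the net rate of change of volume from $V = \frac{\pi}{48}h^3$ and subtracts it from the inflow rate.

```python
import math

def dV_dt(h, dh_dt):
    # Chain Rule applied to V = (pi / 48) h^3.
    return math.pi / 48 * 3 * h**2 * dh_dt

inflow_rate = 100.0                  # cm^3 per second, given in the example
net_rate = dV_dt(12.0, 3.0)          # equals 27 * pi
leak_rate = inflow_rate - net_rate

print("net rate of increase:", net_rate)
print("leak rate           :", leak_rate)
```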

Exercise 3.1. A particle is moving horizontally in the x–y plane along the line y = 5 in such a
way that its distance from the origin (0, 0) is increasing at a rate of 1 unit per sec. Calculate the
rate at which the particle is moving horizontally at the instant when it is 13 units from (0, 0).

Ans: $\frac{13}{12}$ units per sec.

3.5 Maximum and Minimum Values

Definition 3.4. (Absolute Extrema) A function f has an

(a) absolute/global maximum at x = c if f (x) ≤ f (c) for all x in the domain of f . (Such
point x = c is called an absolute/global maximum point of f , and the value f (c) is called
the absolute/global maximum value of f .)

(b) absolute/global minimum at x = c if f (x) ≥ f (c) for all x in the domain of f . (Such point
x = c is called an absolute/global minimum point of f , and the value f (c) is called the
absolute/global minimum value of f .)

Definition 3.5. (Local Extrema) A function f defined on some interval I has a

(a) local/relative maximum at x = c if f (x) ≤ f (c) for x in some open interval containing
x = c. (Such point x = c is called a local/relative maximum point of f , and the value
f (c) is called a local/relative maximum value of f .)

(b) local/relative minimum at x = c if f (x) ≥ f (c) for x in some open interval containing
x = c. (Such point x = c is called a local/relative minimum point of f , and the value f (c)
is called a local/relative minimum value of f .)

Theorem 3.4. (Extreme Value Theorem) If f is continuous on a closed interval [a, b], then f
has an absolute maximum and an absolute minimum at some points in [a, b].

Question. What if f is not continuous or if the domain is not a closed interval of the form
[a, b]?

In such cases, the conclusion of the Extreme Value Theorem may not hold. (In the two
diagrams above, neither of the two functions has an absolute maximum on its domain.)

Theorem 3.5. If $f$ is differentiable on an open interval containing $x = c$ and $f$ has a local
extremum (i.e., local maximum or local minimum) at $x = c$, then $f'(c) = 0$.

Definition 3.6. (Critical Point) A number $c$ in the domain of a function $f$ is a critical point
of $f$ if the following two conditions hold:
(i) it is not an end-point,
(ii) either $f'(c) = 0$ or $f'(c)$ does not exist.

Theorem 3.6. If f has a local minimum/maximum at x = c, then c is a critical point of f .

However, the converse of this result is not true in general.

From the above results, we conclude that an absolute extremum occurs either at the end point or
at a critical point.

Hence, to find absolute extrema of a continuous function f defined on [a, b] , we

(1) find the values of f at all critical points of f on (a, b),


(2) find the values of f (a) and f (b).

The largest and smallest values from steps (1) and (2) are the absolute maximum value and
absolute minimum value of f respectively.

Remark. If the function f is continuous on [a, b] and is considered as defined on (a, b] and
the largest (smallest) value of f obtained from steps 1 and 2 occurs only at x = a, then f has
no absolute maximum (minimum) on (a, b].

Example 3.6. Find the absolute maximum and minimum values of

\[ g(x) = \frac{x}{x^2 + 1} \]
on (a) $[-2, 1]$, (b) $(-2, 1)$, (c) $(-1, 1]$.

Ans: (a) absolute maximum value $= \frac{1}{2}$, absolute minimum value $= -\frac{1}{2}$, (b) no absolute maximum
value, absolute minimum value $= -\frac{1}{2}$, (c) absolute maximum value $= \frac{1}{2}$, no absolute
minimum value.

Solution. $g'(x) = \frac{1\cdot(x^2 + 1) - x(2x)}{(x^2 + 1)^2} = \frac{1 - x^2}{(x^2 + 1)^2} = \frac{(1 - x)(1 + x)}{(x^2 + 1)^2}$.

$g'(x) = 0 \iff x = -1, 1$. Of these, only $x = -1$ can be an interior (critical) point of the given intervals; $x = 1$ is either an end-point or excluded.

$x$       $-2$              $-1$              $1$
$g(x)$    $-\frac{2}{5}$    $-\frac{1}{2}$    $\frac{1}{2}$

(a) on $[-2, 1]$: absolute maximum value $= \frac{1}{2}$, absolute minimum value $= -\frac{1}{2}$,

(b) on $(-2, 1)$: no absolute maximum value, absolute minimum value $= -\frac{1}{2}$,

(c) on $(-1, 1]$: absolute maximum value $= \frac{1}{2}$, no absolute minimum value.
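
For the closed interval in part (a), the two-step procedure (evaluate $g$ at the critical points and at the end-points, then compare) can be sketched as follows. This is our own illustration; the candidate list is entered by hand from the solution above rather than computed.

```python
def g(x):
    return x / (x**2 + 1)

# End-points of [-2, 1] together with the points where g'(x) = 0.
candidates = [-2.0, -1.0, 1.0]
values = {x: g(x) for x in candidates}

abs_max = max(values.values())
abs_min = min(values.values())

print(values)
print("absolute maximum value:", abs_max)
print("absolute minimum value:", abs_min)
```

On the open or half-open intervals of parts (b) and (c), the same table is used, but a largest or smallest value that occurs only at an excluded end-point is not attained.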

Exercise 3.2. Find the absolute maximum and minimum values of

\[ h(x) = x^{5/3} - x^{2/3} \]

on (a) $[-1, 8]$, (b) $(-1, 1)$.

Ans: (a) absolute maximum value $= 28$, absolute minimum value $= -2$.

Exercise 3.3. Let
\[ f(x) = \begin{cases} |x|, & -5 \le x < 2 \\ x^2 - 6x + 10, & 2 \le x \le 4 \end{cases}. \]
Find

(a) the critical points of $f$,

(b) the absolute maximum and minimum values of $f$.

Ans: (a) critical points at $x = 0, 2, 3$, (b) absolute maximum value $= 5$, absolute minimum
value $= 0$.

Theorem 3.7. (First Derivative Test for Absolute Extrema) Let $f$ be differentiable on an open
interval containing a critical point $c$ except possibly at $c$ and $f$ is continuous at $c$.

(1) If $f'(x) > 0$ for all $x < c$ and $f'(x) < 0$ for all $x > c$, then $f$ has an absolute maximum at
$c$.

(2) If $f'(x) < 0$ for all $x < c$ and $f'(x) > 0$ for all $x > c$, then $f$ has an absolute minimum at
$c$.

Example 3.7. Find the absolute maximum value of the function

\[ h(x) = 6x^{1/2} - x^{3/2} \]

on the interval $(1, \infty)$.

Ans: absolute maximum value $= 4\sqrt{2}$.

Solution. $h(x) = 6x^{1/2} - x^{3/2}$. Then $h'(x) = 3x^{-1/2} - \frac{3}{2}x^{1/2} = \frac{3(2 - x)}{2x^{1/2}}$.

Thus $h$ has one critical point in the interval $(1, \infty)$, given by $x = 2$.

When $1 < x < 2$, $h'(x) > 0$. When $x > 2$, $h'(x) < 0$.

Thus by the first derivative test for absolute extrema, $h$ has an absolute maximum on the
interval $(1, \infty)$ at $x = 2$, and the maximum value of $h$ is $h(2) = 6\cdot 2^{1/2} - 2^{3/2} = 6\sqrt{2} - 2\sqrt{2} = 4\sqrt{2}$.

Theorem 3.8. (First Derivative Test for Local Extrema) Let $f$ be differentiable on an open
interval containing a critical point $c$ except possibly at $c$ and $f$ is continuous at $c$.

(1) If $f'$ changes from positive to negative at $x = c$, then $f$ has a local maximum at $c$.

(2) If $f'$ changes from negative to positive at $x = c$, then $f$ has a local minimum at $c$.

(3) If $f'$ does not change sign at $x = c$, then $f$ has no local extremum at $c$.

Example 3.8. Let $f(x) = 3x^4 - 8x^3$.

(a) Find the critical points of $f$.
(b) Determine whether a local maximum or local minimum or neither occurs at each of these
points.

Ans: (a) critical points at $x = 0, 2$, (b) $f$ has a local minimum at $x = 2$ and no local extremum at $x = 0$.

Solution. It is given that $f(x) = 3x^4 - 8x^3$. Then $f'(x) = 12x^3 - 24x^2 = 12x^2(x - 2)$.
Thus $f'(x) = 0 \iff 12x^2(x - 2) = 0 \iff x = 0, 2$.
Hence $f$ has critical points at $x = 0, 2$.

$x$        $x < 0$   $0$   $0 < x < 2$   $2$   $2 < x$
$f'(x)$    $-$       $0$   $-$           $0$   $+$

By the first derivative test (for local extrema), $f$ has no local extremum at $x = 0$, and $f$ has a
local minimum at $x = 2$.


Theorem 3.9. (Second Derivative Test) Let $f$ be a twice differentiable function defined in an
open interval containing $c$.

(1) If $f'(c) = 0$ and $f''(c) < 0$, then $f$ has a local maximum at $c$.

(2) If $f'(c) = 0$ and $f''(c) > 0$, then $f$ has a local minimum at $c$.

(3) No conclusion can be drawn if $f''(c) = 0$.

Remark. At $x = 0$, the functions $x^4$, $-x^4$ and $x^3$ have a local minimum, a local maximum, and neither
a maximum nor a minimum, respectively, yet all three have zero second derivative at $x = 0$.

Example 3.9. Let $f(x) = (x - 4)e^x$.

(a) Find the critical points of $f$.

(b) Determine whether a local maximum or local minimum or neither occurs at each of these
points.

Ans: (a) critical point at $x = 3$, (b) $f$ has a local minimum at $x = 3$.

Solution. It is given that $f(x) = (x - 4)e^x$. Then $f'(x) = 1\cdot e^x + (x - 4)e^x = (x - 3)e^x$.

Thus $f'(x) = 0 \iff (x - 3)e^x = 0 \iff x = 3$. Hence $f$ has a critical point at $x = 3$.

Then $f''(x) = 1\cdot e^x + (x - 3)e^x = (x - 2)e^x$. Thus $f''(3) = (3 - 2)e^3 = e^3 > 0$.

By the second derivative test, $f$ has a local minimum at $x = 3$.



Exercise 3.4. Let
\[ f(x) = \begin{cases} 2x - x^2, & 0 \le x < 2 \\ (x - 2)^2, & x \ge 2 \end{cases}. \]
(a) Find the critical points of $f$.
(b) Determine whether a local maximum or local minimum or neither occurs at each of these
points.

Ans: (a) critical points at $x = 1$ and $x = 2$, (b) $f$ has a local maximum at $x = 1$ and a local
minimum at $x = 2$.

3.6 Applied Maximum and Minimum Problems

Example 3.10. The number of viewers of a YouTube video at time $t$ is given by

\[ V(t) = \frac{10^6\, t}{(t^2 + 400)^{3/2}}, \quad t > 0. \]

Find the value of $t$ when the number of viewers $V$ is at its peak. Justify your answer.

Ans: $t = 10\sqrt{2}$.

Solution. The domain of $V$ is given to be $t > 0$. Finding the time $t$ when the number of viewers is at its
peak is the same as finding the time $t$ at which the function $V$ has an absolute maximum.

\begin{align*}
\frac{dV}{dt} &= \frac{10^6\left[(t^2 + 400)^{3/2}\cdot 1 - t\cdot\frac{3}{2}(t^2 + 400)^{1/2}\cdot 2t\right]}{(t^2 + 400)^3} \\
&= \frac{10^6 (t^2 + 400)^{1/2}\left[(t^2 + 400) - 3t^2\right]}{(t^2 + 400)^3} \\
&= \frac{10^6 (400 - 2t^2)}{(t^2 + 400)^{5/2}} = \frac{10^6\cdot 2(200 - t^2)}{(t^2 + 400)^{5/2}} = \frac{10^6\cdot 2(10\sqrt{2} + t)(10\sqrt{2} - t)}{(t^2 + 400)^{5/2}}.
\end{align*}

Thus $\frac{dV}{dt} = 0 \iff t = \pm 10\sqrt{2}$.

Since $t > 0$, the only critical point of $V(t)$ is at $t = 10\sqrt{2}$.

When $t < 10\sqrt{2}$, $\frac{dV}{dt} > 0$. When $t > 10\sqrt{2}$, $\frac{dV}{dt} < 0$.

Thus, by the first derivative test for absolute extrema, $V$ has an absolute maximum at $t = 10\sqrt{2}$.
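
A quick numerical check of Example 3.10 (our own sketch, with arbitrarily chosen sample times) confirms the sign change of $\frac{dV}{dt}$ around $t = 10\sqrt{2}$ and that $V$ is smaller on either side of the peak.

```python
import math

def V(t):
    # Number of viewers at time t > 0.
    return 1e6 * t / (t**2 + 400) ** 1.5

def dV_dt(t):
    # Simplified derivative from the solution.
    return 1e6 * (400 - 2 * t**2) / (t**2 + 400) ** 2.5

t_peak = 10 * math.sqrt(2)

print("dV/dt before the peak:", dV_dt(10.0))
print("dV/dt after the peak :", dV_dt(20.0))
print("V at the peak        :", V(t_peak))
```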


Exercise 3.5. Determine the coordinates of the points on the curve $y = \frac{16}{x}$ that are closest to the
origin. With the aid of a graph, justify that your answer indeed gives the shortest distance.

Ans: $(4, 4)$, $(-4, -4)$.

Solution. Let $d$ be the distance from $(x, \frac{16}{x})$ to the origin.

By Pythagoras' theorem, $d^2 = x^2 + \left(\frac{16}{x}\right)^2$.

Note that $d$ attains a minimum value $L$ at $(x_0, y_0)$ if and only if $d^2$ attains a minimum value
$L^2$ at $(x_0, y_0)$.

Thus it suffices to minimize $d^2$. Let $D = d^2$. We have

\[ D = x^2 + \left(\frac{16}{x}\right)^2. \]

Note that $x \neq 0$.

Hence,
\[ \frac{dD}{dx} = 2x - \frac{2\cdot 16^2}{x^3} = \frac{2(x^4 - 4^4)}{x^3}. \]

\[ \frac{dD}{dx} = 0 \iff \frac{2(x^4 - 4^4)}{x^3} = 0 \iff x^4 = 4^4 \iff x = \pm 4. \]

When $0 < x < 4$, $\frac{dD}{dx} < 0$. When $4 < x$, $\frac{dD}{dx} > 0$.

Thus $D$ attains the absolute minimum at $x = 4$ on $(0, \infty)$.

Similarly, when $x < -4$, $\frac{dD}{dx} < 0$. When $-4 < x < 0$, $\frac{dD}{dx} > 0$.

Thus $D$ attains the absolute minimum at $x = -4$ on $(-\infty, 0)$.

The two points are $(4, 4)$ and $(-4, -4)$.




Exercise 3.6. A triangular plot $ABC$ in which $\angle A = 60^\circ$ and $AB$ is 80 metres long is to be divided
into two plots of equal areas by a fence built along the line $PQ$. The fence costs \$10 per metre. Let
the lengths of $AC$ and $AP$ be $10b$ and $10x$ respectively.
(a) Show that the length of $PQ$ is $10z$, where $z^2 = x^2 + \frac{16b^2}{x^2} - 4b$.
(b) Show that $4 \le x \le 8$.
(c) Determine, to the nearest dollar, the minimum cost of fencing, and the corresponding value of
$x$, when (i) $b = 9$, (ii) $b = 25$.

Ans: (c) (i) \$600, (ii) \$1096.59.

Exercise 3.7. A bus carries 60 passengers each day from a train station to a shopping mall. It
costs $1.50 per passenger to ride the bus. Research reveals that 4 more (fewer) people would ride
the bus for each 5 cents decrease (increase) in bus fare. Determine the bus fare (to the nearest cent)
per passenger that will maximise revenue of the bus operator.
Ans: $1.12.
Exercise 3.8. Find the point $P$ on the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ in the first quadrant such that the triangle
bounded by the axes of the ellipse and the tangent to the ellipse at the point $P$ has the least area.

Ans: $\left(\frac{a}{\sqrt{2}}, \frac{b}{\sqrt{2}}\right)$.

Exercise 3.9. Find the point $P$ on the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ in the first quadrant such that the line
segment tangent to the ellipse at the point $P$ with endpoints on the axes of the ellipse has the least
length.

Ans: $\left(\frac{a^{3/2}}{(a+b)^{1/2}}, \frac{b^{3/2}}{(a+b)^{1/2}}\right)$.

3.7 L’Hôpital’s Rule

Theorem 3.10. Let $f$ and $g$ be differentiable, with $g'(x) \neq 0$, at all points in some open interval containing
$x = c$ (except possibly at $c$). If $\lim_{x\to c} f(x) = 0 = \lim_{x\to c} g(x)$ or $\lim_{x\to c} f(x) = \infty = \lim_{x\to c} g(x)$, then

\[ \lim_{x\to c} \frac{f(x)}{g(x)} = \lim_{x\to c} \frac{f'(x)}{g'(x)}, \]

provided the limit on the right exists or equals $\infty$ or $-\infty$.

The result also holds

(1) for limits at infinity, i.e. $c = \pm\infty$.
(2) for one-sided limits.

Sketch of Proof. Here we give a proof in the special case when $\lim_{x\to c} f(x) = 0 = \lim_{x\to c} g(x)$, and
the functions $f$, $g$, $f'$ and $g'$ are continuous at $x = c$, and $g'(c) \neq 0$. Then

\[ f(c) = \lim_{x\to c} f(x) = 0 \quad\text{and}\quad g(c) = \lim_{x\to c} g(x) = 0. \]

Thus
\begin{align*}
\lim_{x\to c} \frac{f(x)}{g(x)} &= \lim_{x\to c} \frac{f(x) - 0}{g(x) - 0} \\
&= \lim_{x\to c} \frac{f(x) - f(c)}{g(x) - g(c)} \\
&= \lim_{x\to c} \frac{\frac{f(x) - f(c)}{x - c}}{\frac{g(x) - g(c)}{x - c}} \\
&= \frac{f'(c)}{g'(c)} \quad \text{(by definition of $f'(c)$ and $g'(c)$ and noting that $g'(c) \neq 0$)} \\
&= \frac{\lim_{x\to c} f'(x)}{\lim_{x\to c} g'(x)} \quad \text{(by continuity of $f'$, $g'$ at $x = c$)} \\
&= \lim_{x\to c} \frac{f'(x)}{g'(x)}.
\end{align*}
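Example 3.11. Evaluate
(a) $\lim_{x\to\infty} \frac{3x^2 + x\ln x}{x^2 + 2\ln x}$
(b) $\lim_{x\to 0} \left(2\csc 2x - \frac{1}{x}\right)$
(c) $\lim_{x\to 0^+} x\ln x$

Ans: (a) 3, (b) 0, (c) 0.

Solution. (a) We apply L'Hôpital's Rule twice.

\begin{align*}
\lim_{x\to\infty} \frac{3x^2 + x\ln x}{x^2 + 2\ln x}
&= \lim_{x\to\infty} \frac{6x + \ln x + 1}{2x + \frac{2}{x}} \quad \text{(by L'Hôpital's Rule)} \\
&= \lim_{x\to\infty} \frac{6 + \frac{1}{x} + 0}{2 - \frac{2}{x^2}} \quad \text{(by L'Hôpital's Rule)} \\
&= \frac{6 + 0}{2 - 0} = 3.
\end{align*}

(b) We apply L'Hôpital's Rule twice.

\begin{align*}
\lim_{x\to 0} \left(2\csc 2x - \frac{1}{x}\right)
&= \lim_{x\to 0} \frac{2x - \sin(2x)}{x\sin(2x)} \\
&= \lim_{x\to 0} \frac{2 - 2\cos(2x)}{\sin(2x) + 2x\cos(2x)} \quad \text{(by L'Hôpital's Rule)} \\
&= \lim_{x\to 0} \frac{0 + 4\sin(2x)}{2\cos(2x) + (2\cos(2x) - 4x\sin(2x))} \quad \text{(by L'Hôpital's Rule)} \\
&= \frac{0}{2 + 2 - 0} = 0.
\end{align*}

(c) Exercise.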


There are three indeterminate forms of exponential type:

(1) $0^0$, (2) $\infty^0$, (3) $1^\infty$.

Example 3.12.

$0^0$:
\[ \lim_{x\to 0^+} x^x = \lim_{x\to 0^+} e^{\ln(x^x)} = \lim_{x\to 0^+} e^{x\ln x} = e^{\lim_{x\to 0^+} x\ln x} = e^{\lim_{x\to 0^+} \frac{\ln x}{x^{-1}}} = e^{\lim_{x\to 0^+} \frac{x^{-1}}{-x^{-2}}} = e^{\lim_{x\to 0^+} (-x)} = e^0 = 1. \]

$\infty^0$:
\[ \lim_{x\to 0^+} \left(\frac{2}{x}\right)^x = \lim_{x\to 0^+} \frac{2^x}{x^x} = \frac{2^0}{1} = 1. \]

$1^\infty$:
\[ \lim_{x\to 0^+} (1 + x)^{\frac{1}{x}} = \lim_{x\to 0^+} e^{\ln\left((1+x)^{\frac{1}{x}}\right)} = \lim_{x\to 0^+} e^{\frac{\ln(1+x)}{x}} = e^{\lim_{x\to 0^+} \frac{\ln(1+x)}{x}} = e^{\lim_{x\to 0^+} \frac{\frac{1}{1+x}}{1}} = e^{\frac{1}{1+0}} = e. \]

Remark. In the last example, we have $\lim_{x\to 0^+} (1 + x)^{\frac{1}{x}} = e$ (note that the value of the limit is not 1
or $\infty$). When $x$ is positive and gets closer to 0, the expression $1 + x$ becomes smaller and gets closer
to 1, which tends to make the expression $(1 + x)^{\frac{1}{x}}$ smaller. On the other hand, the exponent $\frac{1}{x}$ gets
larger and tends to make the expression $(1 + x)^{\frac{1}{x}}$ larger. As it turns out, the combined effect of these
two opposing forces is that $\lim_{x\to 0^+} (1 + x)^{\frac{1}{x}} = e \approx 2.718$, which is somewhere between 1 and $\infty$.
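
The three limits in Example 3.12 can be observed numerically by evaluating each expression at small positive $x$. The sketch below is our own illustration; the function name `forms` and the sample values of $x$ are arbitrary, and floating-point evaluation only suggests, and does not prove, the limits.

```python
import math

def forms(x):
    # Returns (x^x, (2/x)^x, (1+x)^(1/x)) for a small positive x.
    return (x**x, (2 / x) ** x, (1 + x) ** (1 / x))

for x in (1e-1, 1e-3, 1e-6):
    print(x, forms(x))
print("e =", math.e)
```

The first two columns should drift towards 1 and the third towards $e$ as $x$ decreases.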

3.8 Rolle’s Theorem and Mean Value Theorem

Theorem 3.11. (Rolle's Theorem) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. If
$f(a) = f(b)$, then there is at least one number $c$ in $(a, b)$ such that $f'(c) = 0$.

Example 3.13. Use Rolle’s Theorem to prove that the equation

ln x + 2x = 3

has at most one positive solution.

Solution. Let $f(x) = \ln x + 2x$. Note that $f$ is a differentiable function defined for $x > 0$. Also $f'(x) = \frac{1}{x} + 2$. Suppose $f(x) = 3$ has two positive solutions $a$ and $b$ with $0 < a < b$. That is, $f(a) = f(b) = 3$. By Rolle’s Theorem applied to $f$ on the interval $[a, b]$, there exists $c \in (a, b)$ such that $f'(c) = \frac{1}{c} + 2 = 0$, which is not true as $c > a > 0$. This contradiction shows that $f(x) = 3$ cannot have two positive solutions. In other words, $\ln x + 2x = 3$ has at most one positive solution.

Remark. Note that

$$f(1) = \ln 1 + 2 = 2 \quad\text{and}\quad f(e) = \ln e + 2e = 1 + 2e > 3.$$

Thus we have $f(1) < 3 < f(e)$.

Also $f$ is differentiable, and thus continuous, on $[1, e]$. Thus by the Intermediate Value Theorem (see Theorem 1.7 of the Lecture Notes), there exists a number $c \in [1, e]$ such that $f(c) = 3$.

Since we have also proved that the equation $f(x) = 3$ has at most one positive solution, we know that the equation $f(x) = 3$ has exactly one positive solution.
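The existence argument above only says that a solution lies between 1 and e. As an illustration, the solution can be located numerically by bisection, which relies on exactly the same sign information f(1) < 3 < f(e). This is a minimal sketch; the function name `bisect` and the tolerance are assumptions made for the example.

```python
import math

def f(x):
    # f(x) = ln x + 2x, defined for x > 0
    return math.log(x) + 2 * x

def bisect(func, target, lo, hi, tol=1e-12):
    """Find x in [lo, hi] with func(x) = target, assuming func is
    continuous and func(lo) < target < func(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid   # the solution lies in the right half
        else:
            hi = mid   # the solution lies in the left half
    return (lo + hi) / 2

root = bisect(f, 3.0, 1.0, math.e)
print(root, f(root))  # f(root) should be very close to 3
```

Because the equation has exactly one positive solution, the value found here is the only one.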


Exercise 3.10. Use Rolle’s Theorem to prove that the equation

$$2e^x + x^2 + 3x = 0$$

has at most two real solutions.

Rolle’s Theorem can be used to prove the following result.

Theorem 3.12. (Mean Value Theorem) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then, there is at least one number $c$ in $(a, b)$ such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

Theorem 3.13. Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. If $f'(x) > 0$ $(< 0)$ for all $x \in (a, b)$, then $f$ is increasing (decreasing) on $[a, b]$.

Proof. Suppose $f'(x) > 0$ for all $x \in (a, b)$. Let $x_1, x_2 \in [a, b]$ with $x_1 < x_2$. Then $f$ is continuous on $[x_1, x_2]$ and differentiable on $(x_1, x_2)$, as $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$. By the Mean Value Theorem applied to $f$ on $[x_1, x_2]$, there is a number $c$ in $(x_1, x_2)$ such that
$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c).$$
Since $f'(c) > 0$ and $x_2 > x_1$, we have $f(x_2) > f(x_1)$. That is, $f$ is increasing on $[a, b]$.


Exercise 3.11. Use the Mean Value Theorem to prove that for all real numbers x and y,

| cos x − cos y| ≤ |x − y|.

Exercise 3.12. Use the Mean Value Theorem to show that if $f'(x) = 0$ for all $x$ in $(a, b)$, then $f$ is a constant function on $(a, b)$.

Exercise 3.13. Use the Mean Value Theorem to prove that there exists a real number $\theta \in \left(\frac{\pi}{5}, \frac{\pi}{4}\right)$ such that

$$\sin\frac{\pi}{5} = \sin\frac{\pi}{4} - \frac{\pi}{20}\cos\theta.$$

Deduce that $\sin\frac{\pi}{5} < \frac{\sqrt{2}}{40}(20 - \pi)$.

Chapter 4

Integrals

Read Thomas’ Calculus, Chapter 5 and 8.

4.1 Antiderivatives

Definition 4.1. $F$ is an antiderivative of $f$ on an interval $I$ if $F'(x) = f(x)$ for all $x$ in $I$.

Theorem 4.1. (1) If $F$ is an antiderivative of $f$ on an interval $I$, then so is $F + C$ for any constant $C$. Furthermore, any antiderivative of $f$ on $I$ is of the form $F + C$ for some constant $C$. This can be expressed as
$$\int f(x)\,dx = F(x) + C.$$

$\int f(x)\,dx$ is called an indefinite integral.

(2) Let $\alpha$ and $\beta$ be any constants. Then

$$\int \left(\alpha f(x) + \beta g(x)\right) dx = \alpha\int f(x)\,dx + \beta\int g(x)\,dx.$$

 
Example 4.1. An antiderivative of $x^n$ is $\frac{1}{n+1}x^{n+1}$, where $n \neq -1$, as $\frac{d}{dx}\left(\frac{1}{n+1}x^{n+1}\right) = x^n$. Thus
$$\int x^n\,dx = \frac{1}{n+1}x^{n+1} + C.$$

Example 4.2. Find all the antiderivatives of $2x - \cos 3x$.

Solution. $\int (2x - \cos 3x)\,dx = x^2 - \frac{1}{3}\sin 3x + C$.

4.2 Standard Integrals
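1. $\int (ax+b)^n\,dx = \dfrac{(ax+b)^{n+1}}{(n+1)a} + C \quad (n \neq -1)$

2. $\int \dfrac{1}{ax+b}\,dx = \dfrac{1}{a}\ln|ax+b| + C$

3. $\int e^{ax+b}\,dx = \dfrac{1}{a}e^{ax+b} + C$

4. $\int \sin(ax+b)\,dx = -\dfrac{1}{a}\cos(ax+b) + C$

5. $\int \cos(ax+b)\,dx = \dfrac{1}{a}\sin(ax+b) + C$

6. $\int \tan(ax+b)\,dx = \dfrac{1}{a}\ln|\sec(ax+b)| + C$

7. $\int \sec(ax+b)\,dx = \dfrac{1}{a}\ln|\sec(ax+b) + \tan(ax+b)| + C$

8. $\int \csc(ax+b)\,dx = -\dfrac{1}{a}\ln|\csc(ax+b) + \cot(ax+b)| + C$

9. $\int \cot(ax+b)\,dx = -\dfrac{1}{a}\ln|\csc(ax+b)| + C$

10. $\int \sec^2(ax+b)\,dx = \dfrac{1}{a}\tan(ax+b) + C$

11. $\int \csc^2(ax+b)\,dx = -\dfrac{1}{a}\cot(ax+b) + C$

12. $\int \sec(ax+b)\tan(ax+b)\,dx = \dfrac{1}{a}\sec(ax+b) + C$

13. $\int \csc(ax+b)\cot(ax+b)\,dx = -\dfrac{1}{a}\csc(ax+b) + C$

14. $\int \dfrac{1}{a^2+(x+b)^2}\,dx = \dfrac{1}{a}\tan^{-1}\left(\dfrac{x+b}{a}\right) + C$

15. $\int \dfrac{1}{\sqrt{a^2-(x+b)^2}}\,dx = \sin^{-1}\left(\dfrac{x+b}{a}\right) + C$

16. $\int \dfrac{-1}{\sqrt{a^2-(x+b)^2}}\,dx = \cos^{-1}\left(\dfrac{x+b}{a}\right) + C$

17. $\int \dfrac{1}{a^2-(x+b)^2}\,dx = \dfrac{1}{2a}\ln\left|\dfrac{x+b+a}{x+b-a}\right| + C$

18. $\int \dfrac{1}{(x+b)^2-a^2}\,dx = \dfrac{1}{2a}\ln\left|\dfrac{x+b-a}{x+b+a}\right| + C$

19. $\int \dfrac{1}{\sqrt{(x+b)^2+a^2}}\,dx = \ln\left|(x+b) + \sqrt{(x+b)^2+a^2}\right| + C$

20. $\int \dfrac{1}{\sqrt{(x+b)^2-a^2}}\,dx = \ln\left|(x+b) + \sqrt{(x+b)^2-a^2}\right| + C$

21. $\int \sqrt{a^2-x^2}\,dx = \dfrac{x}{2}\sqrt{a^2-x^2} + \dfrac{a^2}{2}\sin^{-1}\dfrac{x}{a} + C$

22. $\int \sqrt{x^2-a^2}\,dx = \dfrac{x}{2}\sqrt{x^2-a^2} - \dfrac{a^2}{2}\ln\left|x + \sqrt{x^2-a^2}\right| + C$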

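Example 4.3. Let $a \neq 0$. Show that $\int \frac{1}{ax+b}\,dx = \frac{1}{a}\ln|ax+b| + C$, for $x \neq -\frac{b}{a}$.

Solution. We have to show $\frac{d}{dx}\left(\frac{1}{a}\ln|ax+b|\right) = \frac{1}{ax+b}$.

When $ax + b > 0$ (that is, $x > -\frac{b}{a}$ if $a > 0$), we have $|ax+b| = ax+b$. Thus

$$\frac{d}{dx}\left(\frac{1}{a}\ln|ax+b|\right) = \frac{d}{dx}\left(\frac{1}{a}\ln(ax+b)\right) = \frac{1}{a}\cdot\frac{a}{ax+b} = \frac{1}{ax+b}.$$

When $ax + b < 0$ (that is, $x < -\frac{b}{a}$ if $a > 0$), we have $|ax+b| = -(ax+b)$. Thus

$$\frac{d}{dx}\left(\frac{1}{a}\ln|ax+b|\right) = \frac{d}{dx}\left(\frac{1}{a}\ln(-(ax+b))\right) = \frac{1}{a}\cdot\frac{-a}{-(ax+b)} = \frac{1}{ax+b}.$$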

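Example 4.4. Find

(a) $\int \dfrac{1}{\sqrt{x^2-4x+29}}\,dx$

(b) $\int \dfrac{1}{\sqrt{3+6x-9x^2}}\,dx$

Solution. (a)
$$\int \frac{1}{\sqrt{x^2-4x+29}}\,dx = \int \frac{1}{\sqrt{(x-2)^2+5^2}}\,dx = \ln\left|(x-2) + \sqrt{(x-2)^2+5^2}\right| + C = \ln\left|(x-2) + \sqrt{x^2-4x+29}\right| + C.$$

(b)
$$\int \frac{1}{\sqrt{3+6x-9x^2}}\,dx = \int \frac{1}{\sqrt{2^2-(3x-1)^2}}\,dx = \frac{1}{3}\int \frac{1}{\sqrt{\left(\frac{2}{3}\right)^2-\left(x-\frac{1}{3}\right)^2}}\,dx = \frac{1}{3}\sin^{-1}\left(\frac{x-\frac{1}{3}}{\frac{2}{3}}\right) + C = \frac{1}{3}\sin^{-1}\left(\frac{3x-1}{2}\right) + C.$$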

Exercise 4.1. Find

(a) $\int \left(\dfrac{3x-1}{2x+1}\right)^2 dx$

(b) $\int \dfrac{(2e^{2x-1} - e^{-x})^2}{e^{x+1}}\,dx$

Ans: (a) $\frac{9x}{4} - \frac{15}{4}\ln|2x+1| - \frac{25}{8(2x+1)} + C$, (b) $\frac{4}{3}e^{3x-3} - 4e^{-2}x - \frac{1}{3}e^{-3x-1} + C$.

Trigonometric Identities Useful for Integration

1. $\sec^2 x - 1 = \tan^2 x$

2. $\csc^2 x - 1 = \cot^2 x$

3. $\sin A\cos A = \frac{1}{2}\sin 2A$

4. $\cos^2 A = \frac{1}{2}(1 + \cos 2A)$

5. $\sin^2 A = \frac{1}{2}(1 - \cos 2A)$

6. $\sin A\cos B = \frac{1}{2}(\sin(A+B) + \sin(A-B))$

7. $\cos A\sin B = \frac{1}{2}(\sin(A+B) - \sin(A-B))$

8. $\cos A\cos B = \frac{1}{2}(\cos(A+B) + \cos(A-B))$

9. $\sin A\sin B = -\frac{1}{2}(\cos(A+B) - \cos(A-B))$

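Example 4.5. Find $\int \cos\frac{x}{6}\sin\frac{x}{3}\,dx$.

Solution.
$$\int \cos\frac{x}{6}\sin\frac{x}{3}\,dx = \int \frac{1}{2}\left(\sin\frac{x}{2} - \sin\left(-\frac{x}{6}\right)\right) dx$$
$$= \frac{1}{2}\int \left(\sin\frac{x}{2} + \sin\frac{x}{6}\right) dx$$
$$= \frac{1}{2}\left(-2\cos\frac{x}{2} - 6\cos\frac{x}{6}\right) + C$$
$$= -\cos\frac{x}{2} - 3\cos\frac{x}{6} + C.$$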


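Example 4.6. Show that $\int \cos^4 x\,dx = \frac{3}{8}x + \frac{1}{4}\sin 2x + \frac{1}{32}\sin 4x + C$.

Solution. First we have

$$\cos^4 x = \left(\tfrac{1}{2}(1 + \cos 2x)\right)^2 = \tfrac{1}{4}(1 + 2\cos 2x + \cos^2 2x)$$
$$= \tfrac{1}{4}\left(1 + 2\cos 2x + \tfrac{1}{2}(1 + \cos 4x)\right) = \tfrac{3}{8} + \tfrac{1}{2}\cos 2x + \tfrac{1}{8}\cos 4x.$$

Thus
$$\int \cos^4 x\,dx = \int \left(\tfrac{3}{8} + \tfrac{1}{2}\cos 2x + \tfrac{1}{8}\cos 4x\right) dx = \tfrac{3}{8}x + \tfrac{1}{4}\sin 2x + \tfrac{1}{32}\sin 4x + C.$$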


Exercise 4.2. Find $\int \left(\dfrac{\sin 4x}{1+\cos 4x}\right)^2 dx$.

Ans: $\frac{1}{2}\tan 2x - x + C$.

4.3 Partial Fractions


Let $P(x)$ and $Q(x)$ be two polynomials. Suppose $Q(x)$ is a product of linear or quadratic factors with real coefficients. Then, the rational function $\frac{P(x)}{Q(x)}$ can be expressed as a sum of simple fractions whose denominators are factors of $Q(x)$.

Factors of $Q(x)$                          Partial fractions

$ax + b$                                   $\dfrac{A}{ax+b}$

$(ax + b)^2$                               $\dfrac{A}{ax+b} + \dfrac{B}{(ax+b)^2}$

$ax^2 + bx + c$, $b^2 - 4ac < 0$           $\dfrac{Ax+B}{ax^2+bx+c}$

The rational function $\frac{P(x)}{Q(x)}$ is said to be a proper fraction if the degree of $P(x)$ is smaller than the degree of $Q(x)$. Otherwise, it is called an improper fraction. If $\frac{P(x)}{Q(x)}$ is an improper fraction, one can perform long division to write it as $A(x) + \frac{B(x)}{Q(x)}$, where $A(x)$ is a polynomial and $\frac{B(x)}{Q(x)}$ is a proper fraction.

Examples

(1) $\dfrac{2x+4}{x^2-9} = \dfrac{2x+4}{(x-3)(x+3)} = \dfrac{5/3}{x-3} + \dfrac{1/3}{x+3}$.

(2) $\dfrac{3x^2+x+4}{x^2+x-2} = 3 + \dfrac{-2x+10}{x^2+x-2} = 3 - \dfrac{14/3}{x+2} + \dfrac{8/3}{x-1}$.

(2) is an example of an improper fraction.

Example 4.7. Find $\int \dfrac{3x^2+x+4}{x^2+x-2}\,dx$.

Solution.
$$\int \frac{3x^2+x+4}{x^2+x-2}\,dx = \int \left(3 - \frac{14/3}{x+2} + \frac{8/3}{x-1}\right) dx = 3x - \frac{14}{3}\ln|x+2| + \frac{8}{3}\ln|x-1| + C.$$
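A partial fraction decomposition is easy to get wrong by a sign, so it is worth checking before integrating. The sketch below (function names `lhs` and `rhs` and the sample points are illustrative assumptions) compares both sides of the decomposition of (3x² + x + 4)/(x² + x − 2) in exact rational arithmetic.

```python
from fractions import Fraction

def lhs(x):
    # the original improper fraction (3x^2 + x + 4) / (x^2 + x - 2)
    return Fraction(3 * x * x + x + 4, x * x + x - 2)

def rhs(x):
    # the claimed decomposition 3 - (14/3)/(x + 2) + (8/3)/(x - 1)
    return 3 - Fraction(14, 3) / (x + 2) + Fraction(8, 3) / (x - 1)

# Sample points that avoid the poles x = -2 and x = 1.
points = [-5, -3, 0, 2, 3, 7]
print(all(lhs(x) == rhs(x) for x in points))
```

After clearing denominators, both sides are polynomials of degree at most 2, so agreement at three or more points already forces the identity; six points are used here simply as a margin.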


4.4 Integration by Substitution

Theorem 4.2. Let $u = g(x)$ be a differentiable function whose range is some interval $I$ and let $f$ be continuous on $I$. Then,
$$\int f(g(x))\,g'(x)\,dx = \int f(u)\,du.$$

Example 4.8. Find $\int \dfrac{e^{3x}}{\sqrt{2e^{3x}+4}}\,dx$.

Solution. Let $u = 2e^{3x} + 4$. Then $\frac{du}{dx} = 6e^{3x}$, so that $du = 6e^{3x}\,dx$. Then
$$\int \frac{e^{3x}}{\sqrt{2e^{3x}+4}}\,dx = \int \frac{1}{6\sqrt{u}}\,du = \frac{1}{3}\sqrt{u} + C = \frac{1}{3}\sqrt{2e^{3x}+4} + C.$$

Exercise 4.3. Find $\int \dfrac{(3 - \tan 4x)^5}{\cos^2 4x}\,dx$.

Ans: $-\frac{1}{24}(3 - \tan 4x)^6 + C$.

Exercise 4.4. Find $\int \dfrac{8}{x\sqrt{\ln x}}\,dx$.

Ans: $16\sqrt{\ln x} + C$.

Interchanging the roles of $u$ and $x$ in Theorem 4.2 and then renaming $u$ and $x$ by $x$ and $t$ respectively, we get

Theorem 4.3. Let $x = g(t)$ be a differentiable function whose range is some interval $I$ and let $f$ be continuous on $I$. Then
$$\int f(x)\,dx = \int f(g(t))\,g'(t)\,dt.$$

Example 4.9. Find $\int \dfrac{\ln x}{x}\,dx$.

Solution. Let $x = e^t$. Then $\frac{dx}{dt} = e^t$, so that $dx = e^t\,dt$. Then

$$\int \frac{\ln x}{x}\,dx = \int \frac{\ln(e^t)}{e^t}\,e^t\,dt = \int t\,dt = \frac{t^2}{2} + C = \frac{(\ln x)^2}{2} + C.$$

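Trigonometric Substitution

Expression                  Substitution                                                                          Identity involved

$\sqrt{a^2 - (x+b)^2}$      $x + b = a\sin\theta$, $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$                    $1 - \sin^2\theta = \cos^2\theta$

$\sqrt{a^2 + (x+b)^2}$      $x + b = a\tan\theta$, $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$                        $1 + \tan^2\theta = \sec^2\theta$

$\sqrt{(x+b)^2 - a^2}$      $x + b = a\sec\theta$, $0 \le \theta < \frac{\pi}{2}$ or $\pi \le \theta < \frac{3\pi}{2}$   $\sec^2\theta - 1 = \tan^2\theta$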

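Example 4.10. Find $\int \dfrac{\sqrt{25 - 4x^2}}{x^2}\,dx$.

Solution. Let $x = \frac{5}{2}\sin\theta$. That is, $\theta = \sin^{-1}\frac{2x}{5}$. Then $2x = 5\sin\theta$, so that $\sqrt{25 - 4x^2} = \sqrt{25 - 25\sin^2\theta} = 5\cos\theta$. Also $\frac{dx}{d\theta} = \frac{5}{2}\cos\theta$, so that $dx = \frac{5}{2}\cos\theta\,d\theta$. Then
$$\int \frac{\sqrt{25 - 4x^2}}{x^2}\,dx = \int \frac{5\cos\theta}{\left(\frac{5}{2}\sin\theta\right)^2}\cdot\frac{5}{2}\cos\theta\,d\theta$$
$$= \int 2\cot^2\theta\,d\theta$$
$$= \int 2(\csc^2\theta - 1)\,d\theta$$
$$= -2\cot\theta - 2\theta + C$$
$$= -\frac{1}{x}\sqrt{25 - 4x^2} - 2\sin^{-1}\left(\frac{2x}{5}\right) + C.$$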


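Example 4.11. Find $\int \dfrac{1}{x\sqrt{9x^2+1}}\,dx$.

Solution. Let $x = \frac{1}{3}\tan\theta$. That is, $\theta = \tan^{-1}(3x)$. Then $\sqrt{1 + 9x^2} = \sqrt{1 + \tan^2\theta} = \sec\theta$. Also, $\frac{dx}{d\theta} = \frac{1}{3}\sec^2\theta$, so that $dx = \frac{1}{3}\sec^2\theta\,d\theta$. Then
$$\int \frac{1}{x\sqrt{9x^2+1}}\,dx = \int \frac{1}{\frac{1}{3}\tan\theta\,\sec\theta}\cdot\frac{1}{3}\sec^2\theta\,d\theta$$
$$= \int \csc\theta\,d\theta$$
$$= -\ln|\csc\theta + \cot\theta| + C$$
$$= -\ln\left|\frac{\sqrt{9x^2+1}}{3x} + \frac{1}{3x}\right| + C$$
$$= -\ln\left|\frac{\sqrt{9x^2+1} + 1}{3x}\right| + C.$$

Exercise 4.5. Find $\int \sqrt{6x - x^2}\,dx$.

Ans: $\frac{9}{2}\sin^{-1}\left(\frac{x-3}{3}\right) + \frac{1}{2}(x-3)\sqrt{6x - x^2} + C$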

4.5 Integration by Parts


Recall the product rule of differentiation:

$$\frac{d}{dx}\left(f(x)g(x)\right) = f'(x)g(x) + f(x)g'(x).$$

Integrating both sides of the above equation with respect to $x$ gives
$$f(x)g(x) = \int f'(x)g(x)\,dx + \int f(x)g'(x)\,dx,$$

or
$$\int f'(x)g(x)\,dx = f(x)g(x) - \int f(x)g'(x)\,dx.$$

This suggests the following way of performing integration by parts:

$$\int \underbrace{f'(x)}_{\text{integrate } f'}\;\underbrace{g(x)}_{\text{keep } g}\,dx = \underbrace{f(x)}_{\text{integral of } f'}\cdot\underbrace{g(x)}_{\text{keep } g} - \int \underbrace{f(x)}_{\text{keep } f}\;\underbrace{g'(x)}_{\text{differentiate } g}\,dx.$$

Example 4.12. Find $\int x\sin 3x\,dx$.

Solution. We integrate $\sin 3x$ and differentiate $x$:
$$\int x\sin 3x\,dx = \int (\sin 3x)\cdot x\,dx = \left(\frac{-\cos 3x}{3}\right)x - \int \left(\frac{-\cos 3x}{3}\right)\cdot 1\,dx = \frac{-x\cos 3x}{3} + \frac{\sin 3x}{9} + C.$$

Example 4.13. Find $\int x\ln x\,dx$.

Solution. We integrate $x$ and differentiate $\ln x$:
$$\int x\ln x\,dx = \frac{x^2}{2}\ln x - \int \frac{x^2}{2}\cdot\frac{1}{x}\,dx = \frac{x^2\ln x}{2} - \frac{x^2}{4} + C.$$


Basic rules to determine which function to integrate and which function to differentiate.

The choice of the function to integrate follows the reverse order of the following table:

Types of functions                  Examples                                                           Remark

Logarithmic Functions               $\ln(ax+b)$ or its higher powers                                   differentiate it

Inverse Trigonometric Functions     $\sin^{-1}(ax+b)$, $\cos^{-1}(ax+b)$, $\tan^{-1}(ax+b)$            differentiate it

Algebraic Functions                 power functions $x^a$, polynomials                                 differentiate it

Trigonometric Functions             $\sin(ax+b)$, $\cos(ax+b)$, $\tan(ax+b)$, $\csc(ax+b)$,            differentiate it or integrate it
                                    $\sec(ax+b)$, $\cot(ax+b)$, or combinations of these

Exponential Functions               $e^{ax+b}$                                                         integrate it

Exercise 4.6. Find $\int (2x+1)\ln(2x-3)\,dx$.

Ans: $(x^2 + x)\ln(2x-3) - \frac{x^2}{2} - \frac{5x}{2} - \frac{15}{4}\ln(2x-3) + C$.

Exercise 4.7. Find $\int x\tan^{-1}(2x)\,dx$.

Ans: $\left(\frac{1}{8} + \frac{x^2}{2}\right)\tan^{-1} 2x - \frac{x}{4} + C$.

Exercise 4.8. Find $\int \dfrac{\sin 2x}{e^{2x}}\,dx$.

Ans: $-\frac{1}{4}e^{-2x}(\sin 2x + \cos 2x) + C$.

Exercise 4.9. Find $\int \sec^3 x\,dx$.

Ans: $\frac{1}{2}(\sec x\tan x + \ln|\sec x + \tan x|) + C$.

4.6 Riemann Sums and Definite Integrals

Let f be continuous on [a, b].

The following limit exists and is known as the definite integral of $f$ from $x = a$ to $x = b$:
$$\lim_{n\to\infty} \sum_{k=1}^{n} \left(\frac{b-a}{n}\right) f\left(a + k\left(\frac{b-a}{n}\right)\right).$$

It is denoted by
$$\int_a^b f(x)\,dx.$$

Geometrically the definite integral gives the area of the region under the graph of $f$ from $x = a$ to $x = b$ (at least in the case when $f(x) \ge 0$ and $a < b$). The numbers $a$ and $b$ are respectively called the lower and upper limits of the definite integral. The function $f(x)$ is called the integrand of the definite integral.

The finite series $\sum_{k=1}^{n} \left(\frac{b-a}{n}\right) f\left(a + k\left(\frac{b-a}{n}\right)\right)$ is known as a Riemann sum of $f$.

Summing up, we have the following definition of $\int_a^b f(x)\,dx$:
$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{k=1}^{n} \left(\frac{b-a}{n}\right) f\left(a + k\left(\frac{b-a}{n}\right)\right).$$

Approximation. For sufficiently large $n$,

$$\int_a^b f(x)\,dx \approx \sum_{k=1}^{n} \left(\frac{b-a}{n}\right) f\left(a + k\left(\frac{b-a}{n}\right)\right).$$
Example 4.14. Use a Riemann sum to compute $\int_0^3 x^2\,dx$.

The summation formula $\sum_{k=1}^{n} k^2 = \frac{1}{6}n(n+1)(2n+1)$ is needed to compute the sum.

Ans: 9.

Solution. Here $f(x) = x^2$, $a = 0$, $b = 3$. Thus for each $n$, the corresponding Riemann sum is given by
$$\sum_{k=1}^{n} \frac{3-0}{n}\cdot f\left(0 + k\cdot\frac{3-0}{n}\right) = \sum_{k=1}^{n} \frac{3}{n}\left(\frac{3k}{n}\right)^2.$$
Using the given summation formula, we get
$$\sum_{k=1}^{n} \frac{3}{n}\left(\frac{3k}{n}\right)^2 = \frac{27}{n^3}\sum_{k=1}^{n} k^2 = \frac{27}{n^3}\cdot\frac{n(n+1)(2n+1)}{6} = \frac{9(n+1)(2n+1)}{2n^2}.$$

Thus
$$\int_0^3 x^2\,dx = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{3}{n}\left(\frac{3k}{n}\right)^2 = \lim_{n\to\infty} \frac{9(n+1)(2n+1)}{2n^2} = \lim_{n\to\infty} \frac{9\left(1+\frac{1}{n}\right)\left(2+\frac{1}{n}\right)}{2} = \frac{9(1+0)(2+0)}{2} = 9.$$
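The right-endpoint Riemann sum used in this example is straightforward to compute directly. The following is a minimal sketch (the names `riemann_sum` and `square` are chosen for illustration) that evaluates the sum for increasing n, so that it can be compared with the closed form 9(n + 1)(2n + 1)/(2n²) and with the limit 9.

```python
def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + k * dx) for k in range(1, n + 1)) * dx

def square(x):
    return x * x

for n in [10, 100, 1000, 10000]:
    closed_form = 9 * (n + 1) * (2 * n + 1) / (2 * n * n)
    print(n, riemann_sum(square, 0.0, 3.0, n), closed_form)
```

The same function can be reused with any continuous integrand, which is the content of the approximation remark: for large n the sum is close to the definite integral.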

4.7 Fundamental Theorem of Calculus (FTC)


The method of using Riemann sums to evaluate definite integrals is tedious. The following
result, known as the First Fundamental Theorem of Calculus (FTC 1), provides us with a
simpler way of calculating definite integrals when the anti-derivatives of the integrand can
be found.

Theorem 4.4 (FTC 1). Let $f$ be continuous on $[a, b]$ and let $F$ be an antiderivative of $f$. Then,
$$\int_a^b f(x)\,dx = F(b) - F(a).$$

That is,
$$\int_a^b F'(x)\,dx = F(b) - F(a).$$

Notation: Often we denote $\Big[F(x)\Big]_a^b = F(b) - F(a)$. Then FTC 1 takes the form
$$\int_a^b F'(x)\,dx = \Big[F(x)\Big]_a^b = F(b) - F(a).$$

Remark. We will see later that FTC 1 follows from Theorem 4.5 (FTC 2).

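Example 4.15. Evaluate $\int_1^e \dfrac{(\ln x)^{\frac{1}{3}}}{x}\,dx$.

Ans: $\frac{3}{4}$.

Solution. We need to find an antiderivative of $f(x) = \frac{(\ln x)^{1/3}}{x}$. For this, we make the substitution $u = \ln x$ to get
$$\int \frac{(\ln x)^{\frac{1}{3}}}{x}\,dx = \int (\ln x)^{\frac{1}{3}}\,d(\ln x) = \frac{3}{4}(\ln x)^{\frac{4}{3}} + C.$$

Pick $C = 0$, and we get $F(x) = \frac{3}{4}(\ln x)^{\frac{4}{3}}$. Then by FTC 1, we have
$$\int_1^e \frac{(\ln x)^{\frac{1}{3}}}{x}\,dx = \left[\frac{3}{4}(\ln x)^{\frac{4}{3}}\right]_1^e = \frac{3}{4}(\ln e)^{\frac{4}{3}} - \frac{3}{4}(\ln 1)^{\frac{4}{3}} = \frac{3}{4} - 0 = \frac{3}{4}.$$

Remark. We often present our solution more compactly as follows:

$$\int_1^e \frac{(\ln x)^{\frac{1}{3}}}{x}\,dx = \int_1^e (\ln x)^{\frac{1}{3}}\,d(\ln x) = \left[\frac{3}{4}(\ln x)^{\frac{4}{3}}\right]_1^e = \frac{3}{4}(\ln e)^{\frac{4}{3}} - \frac{3}{4}(\ln 1)^{\frac{4}{3}} = \frac{3}{4} - 0 = \frac{3}{4}.$$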


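Example 4.16. Evaluate $\int_0^{\frac{\pi}{2}} x\cos x\,dx$.

Ans: $\frac{\pi}{2} - 1$.

Solution. Using integration by parts, we have

$$\int_0^{\frac{\pi}{2}} x\cos x\,dx = \Big[x\sin x\Big]_0^{\frac{\pi}{2}} - \int_0^{\frac{\pi}{2}} \sin x\,dx$$
$$= \left(\frac{\pi}{2} - 0\right) + \Big[\cos x\Big]_0^{\frac{\pi}{2}}$$
$$= \frac{\pi}{2} + (0 - 1)$$
$$= \frac{\pi}{2} - 1.$$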

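Example 4.17. Use a Riemann sum to show that $\lim_{n\to\infty} \sum_{k=1}^{n} \dfrac{1}{3k+n} = \dfrac{1}{3}\ln 4$.

Solution. Recall that

$$\lim_{n\to\infty} \sum_{k=1}^{n} \left(\frac{b-a}{n}\right) f\left(a + k\left(\frac{b-a}{n}\right)\right) = \int_a^b f(x)\,dx.$$

To express $\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{3k+n}$ as the limit of a Riemann sum, we need to identify the function $f(x)$ and the interval $[a, b]$. First we have
$$\sum_{k=1}^{n} \frac{1}{3k+n} = \sum_{k=1}^{n} \frac{1}{k\left(\frac{3}{n}\right) + 1}\cdot\frac{1}{n} = \frac{1}{3}\sum_{k=1}^{n} \frac{1}{1 + k\left(\frac{4-1}{n}\right)}\cdot\frac{4-1}{n}.$$

From this we see that $a = 1$, $b = 4$ and $f(x) = \frac{1}{x}$. Therefore,
$$\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{3k+n} = \lim_{n\to\infty} \frac{1}{3}\sum_{k=1}^{n} \frac{1}{1 + k\left(\frac{4-1}{n}\right)}\cdot\frac{4-1}{n} = \frac{1}{3}\int_1^4 \frac{1}{x}\,dx = \frac{1}{3}\Big[\ln x\Big]_1^4 = \frac{1}{3}\ln 4.$$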
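As a sanity check on the identification a = 1, b = 4, f(x) = 1/x, the partial sums can be computed directly and compared with (1/3) ln 4. This is a small sketch; the function name `partial_sum` is an assumption made for the example.

```python
import math

def partial_sum(n):
    # sum_{k=1}^{n} 1 / (3k + n)
    return sum(1.0 / (3 * k + n) for k in range(1, n + 1))

limit = math.log(4) / 3
for n in [10, 100, 1000, 10000]:
    print(n, partial_sum(n), limit - partial_sum(n))
```

Since 1/x is decreasing, these right-endpoint sums stay below the limit and approach it from below as n grows.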

Next we have the Second Fundamental Theorem of Calculus (FTC 2) as follows:

Theorem 4.5 (FTC 2). Let $f$ be continuous on $[a, b]$. The function $g$ defined by
$$g(x) = \int_a^x f(t)\,dt, \quad a \le x \le b,$$
is continuous on $[a, b]$ and differentiable on $(a, b)$, and $g'(x) = f(x)$. That is,
$$\frac{d}{dx}\int_a^x f(t)\,dt = f(x).$$

Remark. FTC 2 essentially says that the area function $\int_a^x f(t)\,dt$ is an antiderivative of (the integrand) $f$.

Sketch of Proof of Theorem 4.5 (FTC 2). For small $h$,

$$g(x+h) - g(x) = \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt$$
$$= \int_x^{x+h} f(t)\,dt$$
$$= \text{area under the graph of } f \text{ over } [x, x+h]$$
$$\approx f(x)\cdot h.$$

[Figure: graph of $y = f(t)$ with the strip between $t = x$ and $t = x + h$ shaded, of height $f(x)$. Area under the graph of $f$ over $[x, x+h] \approx f(x)\cdot h$.]

Thus
$$\frac{g(x+h) - g(x)}{h} \approx f(x).$$
Letting $h \to 0$, we get
$$g'(x) = \lim_{h\to 0} \frac{g(x+h) - g(x)}{h} = f(x).$$


As mentioned earlier, one can deduce FTC 1 from FTC 2 as follows:

Proof of Theorem 4.4 (FTC 1). Let $F$ be an antiderivative of a continuous function $f$ over an interval $I$, and let $a, b$ be in $I$. By Theorem 4.5 (FTC 2), $\int_a^x f(t)\,dt$ is also an antiderivative of $f$. Thus by Theorem 4.1, we have
$$F(x) = \int_a^x f(t)\,dt + C$$

for some constant $C$. Then

$$F(b) - F(a) = \left(\int_a^b f(t)\,dt + C\right) - \left(\int_a^a f(t)\,dt + C\right) = \int_a^b f(t)\,dt.$$

Remark. If $g(x)$ is differentiable, then using the Chain Rule, one has
$$\frac{d}{dx}\int_a^{g(x)} f(t)\,dt = f(g(x))\,g'(x).$$

Proof. We let
$$F(u) = \int_a^u f(t)\,dt \quad\text{and}\quad u = g(x).$$
Thus
$$F(g(x)) = \int_a^{g(x)} f(t)\,dt.$$
By Theorem 4.5 (FTC 2), one has
$$\frac{dF}{du} = f(u).$$
Thus,
$$\frac{d}{dx}\int_a^{g(x)} f(t)\,dt = \frac{d}{dx}F(g(x)) = \frac{dF}{du}\cdot\frac{du}{dx} \quad\text{(by Chain Rule)}$$
$$= f(g(x))\,g'(x).$$


Example 4.18. Find $\dfrac{d}{dx}\displaystyle\int_{-2}^{\sin x} \sqrt{1 + t^6}\,dt$.

Ans: $\cos x\,\sqrt{1 + \sin^6 x}$.

Solution. We apply the formula

$$\frac{d}{dx}\int_a^{g(x)} f(t)\,dt = f(g(x))\,g'(x)$$

(with $f(t) = \sqrt{1 + t^6}$ and $g(x) = \sin x$). Note that $g'(x) = \cos x$. Thus
$$\frac{d}{dx}\int_{-2}^{\sin x} \sqrt{1 + t^6}\,dt = \sqrt{1 + (\sin x)^6}\cdot\cos x = \cos x\,\sqrt{1 + \sin^6 x}.$$
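The formula can be checked numerically without finding an antiderivative of the integrand. The sketch below makes two assumptions that are not part of the notes: the integral is approximated by composite Simpson's rule, and the derivative by a central difference. All function names are illustrative.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def integrand(t):
    return math.sqrt(1 + t ** 6)

def G(x):
    # G(x) = integral of sqrt(1 + t^6) from t = -2 to t = sin x
    return simpson(integrand, -2.0, math.sin(x))

def dG_numeric(x, h=1e-5):
    # central-difference approximation of G'(x)
    return (G(x + h) - G(x - h)) / (2 * h)

def dG_formula(x):
    # f(g(x)) g'(x) with f(t) = sqrt(1 + t^6) and g(x) = sin x
    return math.cos(x) * math.sqrt(1 + math.sin(x) ** 6)

for x in [0.0, 0.5, 1.0, 2.0]:
    print(x, dG_numeric(x), dG_formula(x))
```

The two columns should agree to several decimal places, and the lower limit −2 plays no role in the derivative, as the formula predicts.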

Properties of Definite Integrals

Let $c \in [a, b]$ and $\alpha, \beta \in \mathbb{R}$.

1. $\int_a^b \alpha\,dx = \alpha(b - a)$

2. $\int_c^c f(x)\,dx = 0$

3. $\int_a^b (\alpha f(x) + \beta g(x))\,dx = \int_a^b \alpha f(x)\,dx + \int_a^b \beta g(x)\,dx$

4. $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$

5. $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$

6. $\int_a^b f(x)\,dx \ge 0$ if $f(x) \ge 0$ for $a \le x \le b$

7. $\int_a^b f(x)\,dx \le 0$ if $f(x) \le 0$ for $a \le x \le b$

8. $\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx$ if $f(x) \ge g(x)$ for $a \le x \le b$

9. $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$ if $f(x) \le g(x)$ for $a \le x \le b$

10. $m(b - a) \le \int_a^b f(x)\,dx \le M(b - a)$ if $m \le f(x) \le M$ for $a \le x \le b$

11. $\int_{-a}^a f(x)\,dx = 0$ if $f$ is an odd function defined on $[-a, a]$

12. $\int_{-a}^a f(x)\,dx = 2\int_0^a f(x)\,dx$ if $f$ is an even function defined on $[-a, a]$

4.8 Miscellaneous Examples


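Example 4.19. (Integrals of the type $\int \dfrac{px+q}{ax^2+bx+c}\,dx$)

Evaluate $\int_{-2}^{-1} \dfrac{3x+7}{x^2+4x+5}\,dx$.

Solution.
$$\int_{-2}^{-1} \frac{3x+7}{x^2+4x+5}\,dx = \int_{-2}^{-1} \left(\frac{3}{2}\cdot\frac{2x+4}{x^2+4x+5} + \frac{1}{x^2+4x+5}\right) dx$$
$$= \int_{-2}^{-1} \left(\frac{3}{2}\cdot\frac{2x+4}{x^2+4x+5} + \frac{1}{1+(x+2)^2}\right) dx$$
$$= \left[\frac{3}{2}\ln|x^2+4x+5|\right]_{-2}^{-1} + \Big[\tan^{-1}(x+2)\Big]_{-2}^{-1}$$
$$= \frac{3}{2}(\ln 2 - \ln 1) + \tan^{-1}(1) - \tan^{-1}(0)$$
$$= \frac{3}{2}\ln 2 + \frac{\pi}{4}.$$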

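Example 4.20. (Integrals of the type $\int \dfrac{px+q}{\sqrt{ax^2+bx+c}}\,dx$)

Show that $\int_1^2 \dfrac{2x+3}{\sqrt{4x-x^2}}\,dx = 2\sqrt{3} + \dfrac{7\pi}{6} - 4$.

Solution.
$$\int_1^2 \frac{2x+3}{\sqrt{4x-x^2}}\,dx = \int_1^2 \left(\frac{-(4-2x)}{\sqrt{4x-x^2}} + \frac{7}{\sqrt{4x-x^2}}\right) dx$$
$$= \int_1^2 \left(\frac{-(4-2x)}{\sqrt{4x-x^2}} + \frac{7}{\sqrt{2^2-(x-2)^2}}\right) dx$$
$$= \Big[-2\sqrt{4x-x^2}\Big]_1^2 + \left[7\sin^{-1}\left(\frac{x-2}{2}\right)\right]_1^2$$
$$= -2(2 - \sqrt{3}) + 7\left(\sin^{-1} 0 - \sin^{-1}\left(-\tfrac{1}{2}\right)\right)$$
$$= -2(2 - \sqrt{3}) - 7\left(-\frac{\pi}{6}\right)$$
$$= 2\sqrt{3} - 4 + \frac{7\pi}{6}.$$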


Exercise 4.10. Using the identity $1 + \cos x + \sin x = 2\cos^2\frac{x}{2}\left(1 + \tan\frac{x}{2}\right)$, evaluate
$$\int_0^{\frac{\pi}{2}} \frac{1}{1 + \cos x + \sin x}\,dx.$$

Ans: $\ln 2$.

Exercise 4.11. Using the substitution $x = \frac{\pi}{2} - y$, show that
$$\int_0^{\frac{\pi}{2}} \frac{\sin^2 x}{1 + \cos x + \sin x}\,dx = \int_0^{\frac{\pi}{2}} \frac{\cos^2 x}{1 + \cos x + \sin x}\,dx.$$

Exercise 4.12. Using the results in Exercises 4.10 and 4.11 and the identity $\sin^2 x + \cos^2 x = 1$, show that

$$\int_0^{\frac{\pi}{2}} \frac{\sin^2 x}{1 + \cos x + \sin x}\,dx = \frac{\ln 2}{2}.$$

4.9 Improper Integrals


The definite integrals that we have studied so far have the following characteristics: (i) the
domain of integration is a finite closed interval [a, b], (ii) the function or the integrand has
finite values on the domain of integration. It is possible that we would encounter problems
that do not meet these conditions.
The integral for the area under the curve $y = xe^{-x}$ from $x = 0$ to $x = \infty$ has a domain of integration which is infinite.

[Figure: the graph of $y = xe^{-x}$.]

The integral for the area under the curve $y = \frac{1}{\sqrt{x}}$ from $x = 0$ to $x = 10$ requires us to integrate the function $\frac{1}{\sqrt{x}}$, whose value at $x = 0$ is infinite.

[Figure: the graph of $y = \frac{1}{\sqrt{x}}$.]

In either case, the integrals are called improper integrals and are calculated as limits.

Definition 4.2. Integrals with infinite limits of integration are improper integrals of Type I.

1. If $f(x)$ is continuous on $[a, \infty)$, then
$$\int_a^\infty f(x)\,dx = \lim_{b\to\infty} \int_a^b f(x)\,dx.$$

2. If $f(x)$ is continuous on $(-\infty, b]$, then
$$\int_{-\infty}^b f(x)\,dx = \lim_{a\to-\infty} \int_a^b f(x)\,dx.$$

3. If $f(x)$ is continuous on $(-\infty, \infty)$, then
$$\int_{-\infty}^\infty f(x)\,dx = \int_{-\infty}^c f(x)\,dx + \int_c^\infty f(x)\,dx,$$
where $c$ is any real number.

In each case, if the limit is finite, we say that the improper integral converges and that the limit is the value of the improper integral. If the limit fails to exist, then we say that the improper integral diverges.

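Example 4.21. Evaluate the Type I improper integral

$$\int_1^\infty \frac{\ln x}{x^2}\,dx.$$

Solution. We first use integration by parts to compute the indefinite integral.

$$\int \ln x\cdot\frac{1}{x^2}\,dx = \ln x\cdot\frac{-1}{x} - \int \frac{-1}{x}\cdot\frac{1}{x}\,dx = -\frac{\ln x}{x} - \frac{1}{x} + C = -\frac{(1 + \ln x)}{x} + C.$$

Thus
$$\int_1^\infty \frac{\ln x}{x^2}\,dx = \lim_{b\to\infty} \int_1^b \frac{\ln x}{x^2}\,dx = \lim_{b\to\infty} \left[-\frac{(1 + \ln x)}{x}\right]_1^b = \lim_{b\to\infty} \left(1 - \frac{1 + \ln b}{b}\right) = 1.$$

Here $\lim_{b\to\infty} \frac{\ln b}{b} = 0$ by L’Hôpital’s Rule.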
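The definition of a Type I improper integral as a limit of proper integrals can be watched in action for this example. The sketch below is illustrative only: `partial_integral` uses the closed form obtained from the antiderivative above, and a simple midpoint rule (an assumption of this example, not a method from the notes) cross-checks it for b = 10.

```python
import math

def integrand(x):
    return math.log(x) / x ** 2

def partial_integral(b):
    # closed form of the integral from 1 to b: 1 - (1 + ln b) / b
    return 1 - (1 + math.log(b)) / b

def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Cross-check the closed form numerically on [1, 10].
print(midpoint(integrand, 1.0, 10.0, 20000), partial_integral(10.0))

# The proper integrals increase towards the value 1 as b grows.
for b in [1e1, 1e2, 1e4, 1e6, 1e8]:
    print(b, partial_integral(b))
```

The slow approach to 1 reflects the term (ln b)/b, which tends to 0 only gradually.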

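Example 4.22. Evaluate the Type I improper integral

$$\int_{-\infty}^\infty \frac{1}{1+x^2}\,dx.$$

Solution. We need to evaluate $\int_{-\infty}^0 \frac{1}{1+x^2}\,dx$ and $\int_0^\infty \frac{1}{1+x^2}\,dx$.

$$\int_{-\infty}^0 \frac{1}{1+x^2}\,dx = \lim_{b\to-\infty} \int_b^0 \frac{1}{1+x^2}\,dx = \lim_{b\to-\infty} \Big[\tan^{-1} x\Big]_b^0 = \lim_{b\to-\infty} (0 - \tan^{-1} b) = -\left(-\frac{\pi}{2}\right) = \frac{\pi}{2}.$$

Similarly,
$$\int_0^\infty \frac{1}{1+x^2}\,dx = \frac{\pi}{2}.$$

Therefore,
$$\int_{-\infty}^\infty \frac{1}{1+x^2}\,dx = \frac{\pi}{2} + \frac{\pi}{2} = \pi.$$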


Definition 4.3. Integrals of functions that become infinite at a point within the interval of integration are improper integrals of Type II.

1. If $f(x)$ is continuous on $(a, b]$ and is discontinuous at $a$, then
$$\int_a^b f(x)\,dx = \lim_{c\to a^+} \int_c^b f(x)\,dx.$$

2. If $f(x)$ is continuous on $[a, b)$ and is discontinuous at $b$, then
$$\int_a^b f(x)\,dx = \lim_{c\to b^-} \int_a^c f(x)\,dx.$$

3. If $f(x)$ is discontinuous at $c$ with $a < c < b$, then
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$

In each case, if the limit is finite, we say that the improper integral converges and that the limit is the value of the improper integral. If the limit fails to exist, the improper integral diverges.

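Example 4.23. Evaluate the Type II improper integral

$$\int_0^1 \frac{1}{(x-1)^{\frac{2}{3}}}\,dx.$$

Solution.
$$\int_0^1 \frac{1}{(x-1)^{\frac{2}{3}}}\,dx = \lim_{c\to 1^-} \int_0^c \frac{1}{(x-1)^{\frac{2}{3}}}\,dx = \lim_{c\to 1^-} \Big[3(x-1)^{\frac{1}{3}}\Big]_0^c = \lim_{c\to 1^-} \left(3(c-1)^{\frac{1}{3}} + 3\right) = 3.$$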


Exercise 4.13. Evaluate the Type I improper integral

$$\int_0^\infty \frac{\tan^{-1} x}{1+x^2}\,dx.$$

Ans: $\frac{\pi^2}{8}$.

Exercise 4.14. Evaluate the Type II improper integral

$$\int_0^1 \frac{1}{2\sqrt{x}\,(1+x)}\,dx.$$

Ans: $\frac{\pi}{4}$.

Exercise 4.15. Evaluate the Type I improper integral

$$\int_{-\infty}^\infty \frac{1}{e^{-x} + e^x}\,dx.$$

Ans: $\frac{\pi}{2}$.

Chapter 5

Applications of Integration

Read Thomas’ Calculus, Chapter 6.

5.1 Area Between Curves

Theorem 5.1. Let $f$ and $g$ be continuous on $[a, b]$ with $f(x) \ge g(x)$ for all $a \le x \le b$. The area of the region bounded by the curves $y = f(x)$, $y = g(x)$, and the lines $x = a$ and $x = b$ is given by
$$A = \int_a^b (f(x) - g(x))\,dx.$$

In particular, if $g(x) = 0$, we obtain the following result.

Theorem 5.2. Let $f$ be continuous on $[a, b]$ with $f(x) \ge 0$ for all $a \le x \le b$. The area of the region bounded by the curve $y = f(x)$, the x-axis $y = 0$, and the lines $x = a$ and $x = b$ is given by
$$A = \int_a^b f(x)\,dx.$$

In general we have the following result.

Theorem 5.3. Let $f$ and $g$ be continuous on $[a, b]$ (not necessarily with $f(x) \ge g(x)$ for all $a \le x \le b$). The area of the region bounded by the curves $y = f(x)$, $y = g(x)$, and the lines $x = a$ and $x = b$ is given by
$$A = \int_a^b |f(x) - g(x)|\,dx.$$

To evaluate the above integral, we split it into two or more integrals, each corresponding to the region where either $f(x) - g(x) \ge 0$ or $f(x) - g(x) \le 0$.

In particular, if $g(x) = 0$, then we obtain the following result.

Theorem 5.4. Let $f$ be continuous on $[a, b]$. The area of the region bounded by the curve $y = f(x)$, the x-axis, and the lines $x = a$ and $x = b$ is given by
$$A = \int_a^b |f(x)|\,dx.$$

To evaluate the above integral, we split it into two or more integrals, each corresponding to the region where either $f(x) \ge 0$ or $f(x) \le 0$.

Example 5.1. Find the area of the region bounded by the curve $y = x(9 - x^2)$ $(-2 \le x \le 2)$, the x-axis, the line $x = -2$ and the line $x = 2$.

Solution. Let $f(x) = x(9 - x^2)$. Notice that $f(x) \ge 0$ for $0 \le x \le 2$ and $f(x) \le 0$ for $-2 \le x \le 0$.

$$\text{Area} = \int_{-2}^2 |x(9 - x^2)|\,dx$$
$$= \int_{-2}^0 -x(9 - x^2)\,dx + \int_0^2 x(9 - x^2)\,dx$$
$$= \left[-\frac{9x^2}{2} + \frac{x^4}{4}\right]_{-2}^0 + \left[\frac{9x^2}{2} - \frac{x^4}{4}\right]_0^2$$
$$= 0 - (-18 + 4) + (18 - 4) - 0 = 28.$$
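The role of the absolute value in the area formula can be illustrated numerically: for this odd integrand the signed integral over [−2, 2] is 0, while the area is 28. The following sketch uses composite Simpson's rule as an assumed quadrature method; the names `simpson`, `area` and `signed` are illustrative.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def f(x):
    return x * (9 - x * x)

area = simpson(lambda x: abs(f(x)), -2.0, 2.0)   # integral of |f|
signed = simpson(f, -2.0, 2.0)                   # integral of f, with cancellation

print(area, signed)
```

Splitting the integral at the sign change x = 0, as in the solution, is what the absolute value accomplishes.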

Example 5.2. Find the area of the region bounded by the curves $y = e^{2x} - 2$ $(x \ge 0)$ and $y = 10 - e^x$ $(x \ge 0)$, and
(a) the x-axis,
(b) the y-axis.

Ans: (a) $-\frac{7}{2} - 12\ln 3 + 10\ln 10 + \ln 2$,
(b) $12\ln 3 - 6$.

Solution.

1. $e^{2x} - 2 = 0 \iff x = \dfrac{\ln 2}{2}$.

2. $e^{2x} - 2 = 10 - e^x \iff e^{2x} + e^x - 12 = 0 \iff (e^x + 4)(e^x - 3) = 0 \iff x = \ln 3$.

3. $10 - e^x = 0 \iff x = \ln 10$.

(a)
$$\text{Green Area} = \int_{\frac{\ln 2}{2}}^{\ln 3} (e^{2x} - 2)\,dx + \int_{\ln 3}^{\ln 10} (10 - e^x)\,dx$$
$$= \left[\frac{1}{2}e^{2x} - 2x\right]_{\frac{\ln 2}{2}}^{\ln 3} + \Big[10x - e^x\Big]_{\ln 3}^{\ln 10}$$
$$= \left(\frac{9}{2} - 2\ln 3\right) - (1 - \ln 2) + (10\ln 10 - 10) - (10\ln 3 - 3)$$
$$= -\frac{7}{2} - 12\ln 3 + 10\ln 10 + \ln 2.$$

(b)
$$\text{Yellow Area} = \int_0^{\ln 3} \left((10 - e^x) - (e^{2x} - 2)\right) dx = \int_0^{\ln 3} (12 - e^x - e^{2x})\,dx$$
$$= \left[12x - e^x - \frac{1}{2}e^{2x}\right]_0^{\ln 3} = \left(12\ln 3 - 3 - \frac{9}{2}\right) - \left(-1 - \frac{1}{2}\right) = 12\ln 3 - 6.$$

Theorem 5.5. Let $f$ and $g$ be continuous on $[c, d]$ with $f(y) \ge g(y)$ for all $c \le y \le d$. The area of the region bounded by the curves $x = f(y)$, $x = g(y)$, and the lines $y = c$ and $y = d$ is given by
$$A = \int_c^d (f(y) - g(y))\,dy.$$

In general, we have the following result.

Theorem 5.6. Let $f$ and $g$ be continuous on $[c, d]$ (not necessarily with $f(y) \ge g(y)$ for all $c \le y \le d$). The area of the region bounded by the curves $x = f(y)$, $x = g(y)$, and the lines $y = c$ and $y = d$ is given by
$$A = \int_c^d |f(y) - g(y)|\,dy.$$

To evaluate the above integral, we split it into two or more integrals, each corresponding to the region where either $f(y) - g(y) \ge 0$ or $f(y) - g(y) \le 0$.

Example 5.3. Find the area of the region bounded by the curves $y = \ln(x + 2)$, $y = 2\ln x$ and the x-axis.

Ans: $4\ln 2 - 1$.

Solution. Solving $y = \ln(x + 2)$ and $y = 2\ln x$ gives $(x, y) = (2, 2\ln 2)$. Also $y = \ln(x + 2) \iff x = e^y - 2$ and $y = 2\ln x \iff x = e^{\frac{y}{2}}$.

Thus the area is
$$A = \int_0^{2\ln 2} \left(e^{\frac{y}{2}} - e^y + 2\right) dy = \Big[2e^{\frac{y}{2}} - e^y + 2y\Big]_0^{2\ln 2} = 4\ln 2 - 1.$$


5.2 Volume of Solid of Revolution by Disk Method

Theorem 5.7. When the plane region bounded by the curve $y = f(x)$ and the lines $x = a$ and $x = b$ is revolved completely about the x-axis, the volume of the solid formed is
$$V = \pi\int_a^b f(x)^2\,dx.$$

The above formula is known as the disk method.

Remark. Informally we can visualize the above formula as follows: Think of the expression $\pi (f(x))^2\,dx$ as the volume of a thin vertical circular disk of radius $f(x)$ and thickness $dx$ (so that its volume is $\pi\cdot(\text{radius})^2\cdot\text{thickness} = \pi (f(x))^2\,dx$). Also we think of the definite integral sign $\int_a^b$ as the process of ‘taking the limit of a Riemann sum’. Then the formula essentially says that the volume $V$ of the solid of revolution is equal to the limit of the Riemann sum (i.e. the definite integral $\int_a^b$) of the volumes of the thin circular disks (given by $\pi (f(x))^2\,dx$).

Theorem 5.8. Let $f$ and $g$ be continuous on $[a, b]$ with $f(x) \ge g(x) \ge 0$ for all $a \le x \le b$. When the region bounded by the curves $y = f(x)$ and $y = g(x)$ for $a \le x \le b$ is revolved completely about the x-axis, the volume of the solid formed is
$$V = \pi\int_a^b f(x)^2\,dx - \pi\int_a^b g(x)^2\,dx.$$

Theorem 5.9. Let $f$ be continuous on $[c, d]$. When the plane region bounded by the curve $x = f(y)$ and the lines $y = c$ and $y = d$ is revolved completely about the y-axis, the volume of the solid formed is
$$V = \pi\int_c^d f(y)^2\,dy.$$

Remark. Informally we can visualize the above formula as follows: The volume $V$ of the solid of revolution is equal to the limit of a Riemann sum (i.e. $\int_c^d$) of volumes of thin horizontal circular disks (given by $\pi\cdot(f(y))^2\cdot dy$).

Theorem 5.10. Let $f$ and $g$ be continuous on $[c, d]$ with $f(y) \ge g(y) \ge 0$ for all $c \le y \le d$. When the region bounded by the curves $x = f(y)$ and $x = g(y)$ for $c \le y \le d$ is revolved completely about the y-axis, the volume of the solid formed is
$$V = \pi\int_c^d f(y)^2\,dy - \pi\int_c^d g(y)^2\,dy.$$

Example 5.4. Find the volume of the solid generated by rotating completely the region bounded by the curve $y^2 = 4x$, the line $y = 2x - 4$ and the y-axis about the
(a) x-axis,
(b) y-axis.

Ans: (a) $\frac{22\pi}{3}$, (b) $\frac{16\pi}{15}$.

Solution. The two curves intersect at $(1, -2)$ and $(4, 4)$. The required region is the one shaded in purple in the figure.

(a) (Rotating about the x-axis) Using the disk method, the volume is

$$V = \pi\int_0^1 \left((2x - 4)^2 - (-\sqrt{4x})^2\right) dx = 4\pi\int_0^1 (x^2 - 5x + 4)\,dx = 4\pi\left[\frac{x^3}{3} - \frac{5x^2}{2} + 4x\right]_0^1 = \frac{22\pi}{3}.$$

(b) (Rotating about the y-axis) Using the disk method, the volume is

$$V = \pi\int_{-2}^0 \left(\frac{1}{4}y^2\right)^2 dy + \pi\int_{-4}^{-2} \left(\frac{1}{2}(y + 4)\right)^2 dy = \pi\left[\frac{y^5}{80}\right]_{-2}^0 + \pi\left[\frac{(y+4)^3}{12}\right]_{-4}^{-2} = \pi\left(\frac{2}{5} + \frac{2}{3}\right) = \frac{16\pi}{15}.$$

5.3 Cylindrical Shell Method

Consider the solid of revolution obtained by revolving the region $R$ about the y-axis. If we apply the disk method, it is necessary to express the equation $y = f(x)$ in the form $x = g(y)$. Quite often, it is difficult or even impossible to do so.

The following result, known as the method of cylindrical shells, provides a solution to this problem.

Theorem 5.11. When the plane region bounded by the curve $y = f(x)$ and the lines $x = a$ and $x = b$, where $0 \le a < b$, is revolved completely about the y-axis, the volume of the solid formed is
$$V = 2\pi\int_a^b x|f(x)|\,dx.$$

Remark. Informally we can visualize the above formula as follows: Think of the expression $2\pi x\cdot|f(x)|\cdot dx$ as the volume of a thin vertical cylindrical shell of base radius $x$, of height $|f(x)|$ and of thickness $dx$ (so that its volume is given by (surface area of the cylinder) $\cdot$ thickness $= (2\pi x\cdot|f(x)|)\cdot dx$). Then the formula says that the volume $V$ of the solid of revolution is equal to the limit of a Riemann sum (i.e. $\int_a^b$) of volumes of thin vertical cylindrical shells (given by $2\pi x\cdot|f(x)|\cdot dx$).

Theorem 5.12. When the plane region bounded by the curves $y = f(x)$, $y = g(x)$ and the lines $x = a$ and $x = b$, where $0 \le a < b$, is revolved completely about the y-axis, the volume of the solid formed is
$$V = 2\pi\int_a^b x|f(x) - g(x)|\,dx.$$

Theorem 5.13. When the plane region bounded by the curve $x = f(y)$ and the lines $y = c$ and $y = d$, where $0 \le c < d$, is revolved completely about the x-axis, the volume of the solid formed is
$$V = 2\pi\int_c^d y|f(y)|\,dy.$$

Remark. Informally we can visualize the above formula as follows: The volume $V$ of the solid of revolution is equal to the limit of a Riemann sum (i.e. $\int_c^d$) of volumes of thin horizontal cylindrical shells (given by $2\pi y\cdot|f(y)|\cdot dy$).

Theorem 5.14. When the plane region bounded by the curves $x = f(y)$, $x = g(y)$ and the lines $y = c$ and $y = d$, where $0 \le c < d$, is revolved completely about the x-axis, the volume of the solid formed is
$$V = 2\pi\int_c^d y|f(y) - g(y)|\,dy.$$

Example 5.5. The region bounded by the curve $y = 2x - x^2$ $(1 \le x \le 3)$, the line $x = 1$, the line $x = 3$ and the x-axis is revolved completely about the y-axis. Calculate the volume of the solid generated.

Ans: $9\pi$.

Solution. We use the shell method. Since $2x - x^2 \ge 0$ for $1 \le x \le 2$ and $2x - x^2 \le 0$ for $2 \le x \le 3$, the volume is given by

$$V = 2\pi\int_1^3 x|f(x)|\,dx = 2\pi\int_1^3 x|2x - x^2|\,dx$$
$$= 2\pi\left(\int_1^2 x(2x - x^2)\,dx + \int_2^3 x(-2x + x^2)\,dx\right)$$
$$= 2\pi\left[\frac{2x^3}{3} - \frac{x^4}{4}\right]_1^2 + 2\pi\left[\frac{x^4}{4} - \frac{2x^3}{3}\right]_2^3 = 2\pi\left(\frac{11}{12} + \frac{43}{12}\right) = 9\pi.$$
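The shell formula V = 2π∫ x|f(x)| dx can be evaluated numerically as a check on the hand computation. This is a sketch under stated assumptions: composite Simpson's rule is used as the quadrature method (it is not part of the notes), and the names `simpson` and `volume` are illustrative.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def f(x):
    return 2 * x - x * x

# Each shell has radius x, height |f(x)| and thickness dx.
volume = 2 * math.pi * simpson(lambda x: x * abs(f(x)), 1.0, 3.0)

print(volume, 9 * math.pi)
```

The absolute value matters here because f changes sign at x = 2 inside the interval of integration.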

Example 5.6. Sketch the curve whose equation is $y = \ln(2x - 1)$ for $x > \frac{1}{2}$.
The region bounded by this curve, the axes and the line $y = \ln 3$ is rotated completely about the x-axis. Calculate the volume of the solid generated.

Ans: $\pi\left(3\ln 3 + \frac{1}{2}(\ln 3)^2 - 2\right)$.

Solution. We use the shell method. First

$$y = \ln(2x - 1) \iff x = \frac{1}{2}(e^y + 1).$$

The volume is given by

$$V = 2\pi\int_0^{\ln 3} y\,f(y)\,dy = 2\pi\int_0^{\ln 3} y\cdot\frac{1}{2}(e^y + 1)\,dy = \pi\int_0^{\ln 3} (ye^y + y)\,dy$$
$$= \pi\left[ye^y - e^y + \frac{y^2}{2}\right]_0^{\ln 3} = \pi\left(3\ln 3 + \frac{1}{2}(\ln 3)^2 - 2\right).$$



5.4 Arc Length of a curve


Let $f'$ be continuous on $[a, b]$. Using Riemann sums, we can prove:
The length of the curve $y = f(x)$, $a \le x \le b$, is
\[ \int_a^b \sqrt{1 + f'(x)^2}\,dx. \]

Remark. To visualize the above expression, we can think of a curve as consisting of many
small slanted line segments, and each slanted line segment is the hypotenuse of a right-
angled triangle of base length $dx$ and of height $dy$. Then by Pythagoras' Theorem, the length
of a slanted line segment is given by
\[ \sqrt{(dx)^2 + (dy)^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = \sqrt{1 + f'(x)^2}\,dx. \]
Then the length of the curve is the limit of a Riemann sum (i.e. $\int_a^b$) of lengths of small slanted line
segments given by $\sqrt{1 + f'(x)^2}\,dx$. ($ds \equiv \sqrt{1 + f'(x)^2}\,dx$ is called the arc length differential.)

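Example 5.7. Calculate the length of the curve
\[ y = \frac{x^4 + 3}{6x}, \quad 1 \le x \le 2. \]

Solution. Given $y = \frac{x^4 + 3}{6x} = \frac{1}{6}\left(x^3 + \frac{3}{x}\right)$, we have
\[ y' = \frac{1}{6}\left(3x^2 - \frac{3}{x^2}\right) = \frac{1}{2}\left(x^2 - \frac{1}{x^2}\right). \]
Thus $1 + y'^2 = 1 + \frac{1}{4}\left(x^2 - \frac{1}{x^2}\right)^2 = \left[\frac{1}{2}\left(x^2 + \frac{1}{x^2}\right)\right]^2$, so that $\sqrt{1 + y'^2} = \frac{1}{2}\left(x^2 + \frac{1}{x^2}\right)$.
The arc length is
\[ \int_1^2 \sqrt{1 + y'^2}\,dx = \int_1^2 \frac{1}{2}\left(x^2 + \frac{1}{x^2}\right)dx = \frac{1}{2}\left[\frac{x^3}{3} - \frac{1}{x}\right]_1^2 = \frac{17}{12}. \]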
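
As a numerical cross-check of Example 5.7, the arc length integral can be approximated directly. This is a sketch that is not part of the original notes; the names `simpson` and `arc_length` are illustrative assumptions, and the derivative $y' = \frac{1}{2}(x^2 - x^{-2})$ is entered by hand.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def arc_length(dydx, a, b, n=1000):
    """Approximate the integral of sqrt(1 + (dy/dx)^2) over [a, b]."""
    return simpson(lambda x: math.sqrt(1 + dydx(x) ** 2), a, b, n)

# Example 5.7: y = (x^4 + 3) / (6x), so dy/dx = (x^2 - 1/x^2) / 2.
length = arc_length(lambda x: 0.5 * (x**2 - 1 / x**2), 1.0, 2.0)
print(length, 17 / 12)  # the two values should agree closely
```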


Similarly, if the curve is given as the graph of a function of y, we have the following formula.

The length of the curve $x = g(y)$, $p \le y \le q$, is
\[ \int_p^q \sqrt{1 + g'(y)^2}\,dy. \]

Exercise 5.1. Calculate the length of the curve
\[ y = 1 + \left(\frac{3x}{2}\right)^{2/3}, \quad 0 \le x \le \frac{16}{3}. \]

Ans: $\frac{2}{3}\left(5^{3/2} - 1\right)$.

Exercise 5.2. Sketch the curves $y = 2 - e^{x-1}$ and $y = 4x^2 - 3$ for $x \ge 0$ on a single diagram. It is
given that the two curves meet at a point where $x = 1$. Calculate the area of the region bounded by
the two curves and the y-axis.

Ans: $\frac{8}{3} + \frac{1}{e}$.
Exercise 5.3. Let $I_n = \int_0^1 (2x + 1)^n e^{-x}\,dx$, where $n$ is a non-negative integer.
(a) Show that for $n \ge 1$,
\[ I_n = \left(1 - \frac{3^n}{e}\right) + 2n I_{n-1}. \]
(b) The region bounded by the curve $y = (2x + 1)^2 e^{-x}$, the axes and the line $x = 1$ is rotated
completely about the y-axis. Use the result in (a) to find the volume of the solid generated.

Ans: (b) $\pi\left(66 - \frac{172}{e}\right)$.

Exercise 5.4. Consider the region $R$ bounded by $y = 2x^2$, the line $x = 2$ and the x-axis. For
$0 < p < 2$, the vertical line $x = p$ divides $R$ into two parts $R_1$ and $R_2$, where $R_1$ denotes the part
on the right of $x = p$ and $R_2$ denotes the part on the left of $x = p$. Let $V_1$ be the volume of the
solid generated by revolving $R_1$ about the x-axis, and $V_2$ be the volume of the solid generated
by revolving $R_2$ about the y-axis. Find $V_1$ and $V_2$ in terms of $p$. Find also the value of $p$ that
maximizes the total volume given by $V = V_1 + V_2$.

Ans: $p = 1$ gives the maximum $V$.

Chapter 6

Sequences and Series

Read Thomas’ Calculus, Chapter 9.

6.1 Sequences
An infinite sequence of numbers is an infinite ordered list of numbers

\[ a_1, a_2, a_3, \dots, a_n, \dots \]

For each $n$, the $n$th number in the list is called the $n$th term of the sequence.

We usually denote a sequence by $\{a_n\}_{n=1}^{\infty}$ (or simply $\{a_n\}$ when the reference to $n$ is clear).
Formally,

Definition 6.1. An infinite sequence of numbers is a function whose domain is the set of
positive integers.

In this formal definition, an is the value of the function evaluated at n.


Example 6.1. The sequence of arithmetic progression is given by $\{a + (n - 1)d\}_{n=1}^{\infty}$, where $a$ is the
first term of the sequence and $d$ is called the common difference.

The function that defines this sequence is: $f(n) = a + (n - 1)d$.

The sequence of geometric progression is given by $\{a r^{n-1}\}_{n=1}^{\infty}$, where $a$ is the first term of the
sequence and $r$ is known as the common ratio of the geometric progression.

The function that defines this sequence is: $f(n) = a r^{n-1}$.

For example, $1, 3, 5, 7, 9, \dots$ is an arithmetic progression (with $a = 1$, $d = 2$).

$1, \frac{1}{2}, \frac{1}{2^2}, \frac{1}{2^3}, \frac{1}{2^4}, \dots$ is a geometric progression (with $a = 1$, $r = \frac{1}{2}$).

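We now consider the sequence $\{a_n\}_{n=1}^{\infty}$, where $a_n = \frac{n}{n+1}$, and determine what happens to $a_n$
when $n$ is getting large. The following graph shows how the terms approach 1.

[Figure: graph of the terms $a_n = \frac{n}{n+1}$ plotted against $n$.]

We say that the sequence $\left\{\frac{n}{n+1}\right\}$ approaches 1 as $n$ increases and write
\[ \lim_{n\to\infty} \frac{n}{n+1} = 1. \]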
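
The approach to 1 can also be seen by tabulating a few terms. The short Python sketch below is an illustration added here, not part of the original notes; the function name `a` simply mirrors the notation $a_n$.

```python
def a(n):
    """n-th term of the sequence a_n = n / (n + 1)."""
    return n / (n + 1)

# Print the term and its distance from the limit 1 for increasing n.
for n in (1, 10, 100, 1000, 10000):
    print(n, a(n), 1 - a(n))
```

The gap $1 - a_n$ equals $\frac{1}{n+1}$, so it shrinks towards 0 as $n$ grows.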

We now state formally the meaning of a limit of a sequence.

Let $\{a_n\}_{n=1}^{\infty}$ be a sequence of real numbers.

Then $\lim_{n\to\infty} a_n$ is the value $a_n$ approaches as $n$ tends to positive infinity.

If $\lim_{n\to\infty} a_n$ exists as a real (finite) number $L$, then we say that the sequence $\{a_n\}$ converges
(or, in more detail, $\{a_n\}$ converges to $L$). Sometimes we simply write $a_n \to L$.

We say that the sequence $\{a_n\}$ diverges if $\lim_{n\to\infty} a_n$ does not exist as a real (finite) number.

Example 6.2. (i) As we have seen, we have $\lim_{n\to\infty} \frac{n}{n+1} = 1$.

Thus the sequence $\left\{\frac{n}{n+1}\right\}$ converges (to 1), and we also write $\frac{n}{n+1} \to 1$.

(ii) Note that the sequence

$-1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, \dots$

does not converge to any real number. Thus the sequence $\{(-1)^n\}$ diverges.

(iii) Clearly we have $\lim_{n\to\infty} 2n = \infty$. But $\infty$ is not a real number, so the sequence $\{2n\}$ diverges.

6.2 Finding the Limit of a Sequence
The following theorem gives a shortcut for evaluating the limit of some sequences using the
limit of functions.

Theorem 6.1. Let $f(x)$ be a function, and $\{a_n\}$ be a sequence such that $f(n) = a_n$ for all $n$. If
$\lim_{x\to\infty} f(x) = L$, then $\lim_{n\to\infty} a_n = L$.

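Example 6.3. Find the following limits:

(i) $\lim_{n\to\infty} \frac{1}{n}$. (ii) $\lim_{n\to\infty} \frac{\ln n}{n}$.

Solution. (i) Consider the function $f(x) = \frac{1}{x}$. Note that $\frac{1}{n} = f(n)$ for all $n$.
Since $\lim_{x\to\infty} \frac{1}{x} = 0$, it follows from Theorem 6.1 that $\lim_{n\to\infty} \frac{1}{n} = 0$.
(ii) By L'Hôpital's rule, we have
\[ \lim_{x\to\infty} \frac{\ln x}{x} = \lim_{x\to\infty} \frac{1/x}{1} = \lim_{x\to\infty} \frac{1}{x} = 0. \]
Thus we also have $\lim_{n\to\infty} \frac{\ln n}{n} = 0$.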


6.3 Limit Laws for Sequences


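If $\{a_n\}$ and $\{b_n\}$ are convergent sequences and $c$ is a constant, then we have

• $\lim_{n\to\infty} c a_n = c \lim_{n\to\infty} a_n$.

• $\lim_{n\to\infty} (a_n \pm b_n) = \lim_{n\to\infty} a_n \pm \lim_{n\to\infty} b_n$.

• $\lim_{n\to\infty} a_n b_n = \lim_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n$.

• $\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{\lim_{n\to\infty} a_n}{\lim_{n\to\infty} b_n}$, if $\lim_{n\to\infty} b_n \neq 0$.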

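Example 6.4. From the previous example, we know that
\[ \lim_{n\to\infty} \frac{1}{n} = 0 \quad\text{and}\quad \lim_{n\to\infty} \frac{\ln n}{n} = 0. \]

It follows that we have
\[
\begin{aligned}
\lim_{n\to\infty} \frac{2n^2 + 3\ln n}{n^2} &= \lim_{n\to\infty} \left(2 + \frac{3\ln n}{n^2}\right) \\
 &= 2 + 3 \lim_{n\to\infty} \frac{\ln n}{n} \cdot \frac{1}{n} \\
 &= 2 + 3 \cdot 0 \cdot 0 = 2.
\end{aligned}
\]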

Theorem 6.2. (Squeeze Theorem for Sequences) If $a_n \le b_n \le c_n$ for all $n$ and $\lim_{n\to\infty} a_n = \lim_{n\to\infty} c_n = L$,
then $\lim_{n\to\infty} b_n = L$.

Example 6.5. Show that if $\lim_{n\to\infty} |a_n| = 0$, then $\lim_{n\to\infty} a_n = 0$.

Solution. Note that $-|a_n| \le a_n \le |a_n|$ for all $n$,
and it is given that $\lim_{n\to\infty} |a_n| = 0$, so that we also have $\lim_{n\to\infty} -|a_n| = -\lim_{n\to\infty} |a_n| = -0 = 0$. Thus it
follows from the Squeeze Theorem that $\lim_{n\to\infty} a_n = 0$.

Example 6.6. Show that $\lim_{n\to\infty} \frac{n!}{n^n} = 0$.

Solution. Since
\[ 0 \le \frac{n!}{n^n} = \frac{n}{n} \times \frac{n-1}{n} \times \frac{n-2}{n} \times \cdots \times \frac{2}{n} \times \frac{1}{n} \le \frac{1}{n} \]
for all $n$ and $\lim_{n\to\infty} \frac{1}{n} = 0$, the result follows from the Squeeze Theorem.
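
The squeeze bound in Example 6.6 can be checked with exact rational arithmetic. The sketch below is an added illustration, not part of the original notes; the function name `term` is an assumption.

```python
from fractions import Fraction
from math import factorial

def term(n):
    """Exact value of n! / n^n as a fraction."""
    return Fraction(factorial(n), n**n)

# Compare each term with the upper bound 1/n used in the squeeze argument.
for n in (1, 2, 5, 10, 20):
    print(n, float(term(n)), 1 / n)
```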


6.4 Series

An expression of the form
\[ \sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + a_4 + \cdots \]
is called an infinite series or simply a series. To compute the value (called the sum) of this
infinite series, we construct a new sequence $\{S_n\}$ defined by
\[ S_n = \sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n. \]

$\{S_n\}$ is called the sequence of partial sums of the given series. Then the sum of the infinite
series is defined as the limit of the sequence $\{S_n\}$. In other words, we have
\[ \sum_{n=1}^{\infty} a_n = \lim_{n\to\infty} S_n. \]

We say that the series $\sum_{n=1}^{\infty} a_n$ converges if the sequence $\{S_n\}$ converges (which means that
$\lim_{n\to\infty} S_n$ exists as a real number). Thus saying that the series $\sum_{n=1}^{\infty} a_n$ is convergent means that it has a
finite sum.

We say that the series $\sum_{n=1}^{\infty} a_n$ diverges if the sequence $\{S_n\}$ diverges (which means that $\lim_{n\to\infty} S_n$
does not exist as a real number). Thus saying that the series $\sum_{n=1}^{\infty} a_n$ is divergent means that it does not
have a finite sum.

Example 6.7. The constant sequence $\{1\}$ is a convergent sequence, since $\lim_{n\to\infty} 1 = 1$, which is a
real number.

Now consider the series $\sum_{n=1}^{\infty} 1 = 1 + 1 + 1 + 1 + \cdots$.

For each $n$, the $n$th partial sum is given by $S_n = \sum_{i=1}^{n} 1 = n$. Thus, we have
\[ \sum_{n=1}^{\infty} 1 = \lim_{n\to\infty} S_n = \lim_{n\to\infty} n = \infty. \]

But $\infty$ is not a real number, and thus the series $\sum_{n=1}^{\infty} 1$ is divergent.

Example 6.8. (a) An important example of an infinite series is the geometric series
\[ \sum_{n=1}^{\infty} a r^{n-1}, \quad (a \neq 0). \]

The geometric series is convergent with its sum equal to $\frac{a}{1-r}$ when $|r| < 1$, and it is divergent
when $|r| \ge 1$.

(b) Is the series $\sum_{n=1}^{\infty} 2^{2n} 3^{1-n}$ convergent or divergent?

(c) Find the sum of the series $\sum_{n=1}^{\infty} x^n$, for $|x| < 1$.

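Solution. (a) Note that for each $n$,
\[
S_n = \sum_{i=1}^{n} a r^{i-1} =
\begin{cases}
\dfrac{a(1 - r^n)}{1 - r} & \text{if } r \neq 1, \\[1ex]
an & \text{if } r = 1.
\end{cases}
\]

Note that $a \neq 0$. Then
\[
\lim_{n\to\infty} an =
\begin{cases}
\infty & \text{if } a > 0, \\
-\infty & \text{if } a < 0,
\end{cases}
\qquad
\lim_{n\to\infty} r^n =
\begin{cases}
\infty & \text{if } r > 1, \\
0 & \text{if } -1 < r < 1, \\
\text{does not exist} & \text{if } r \le -1.
\end{cases}
\]

Then it follows readily that
\[
\sum_{n=1}^{\infty} a r^{n-1} = \lim_{n\to\infty} S_n =
\begin{cases}
\infty & \text{if } a > 0 \text{ and } r \ge 1, \\
-\infty & \text{if } a < 0 \text{ and } r \ge 1, \\
\dfrac{a}{1 - r} & \text{if } -1 < r < 1, \\
\text{does not exist} & \text{if } r \le -1.
\end{cases}
\]

In summary, $\sum_{n=1}^{\infty} a r^{n-1}$ is convergent (with its sum $= \frac{a}{1-r}$) when $|r| < 1$, and $\sum_{n=1}^{\infty} a r^{n-1}$ is
divergent when $|r| \ge 1$.

(b) $\sum_{n=1}^{\infty} 2^{2n} 3^{1-n} = 3 \sum_{n=1}^{\infty} \left(\frac{4}{3}\right)^n$ is a geometric series with $r = \frac{4}{3} > 1$, and thus it is divergent.

(c) $\sum_{n=1}^{\infty} x^n = \sum_{n=1}^{\infty} x \cdot x^{n-1}$ is a geometric series with $a = x$ and $r = x$. When $|x| < 1$, it follows from
(a) that the series is convergent and $\sum_{n=1}^{\infty} x^n = \frac{x}{1 - x}$.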
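
The two behaviours of a geometric series can be observed numerically. The following Python sketch is an added illustration, not part of the original notes; the function name `geometric_partial_sum` is an assumption. It sums the first $n$ terms for a ratio with $|r| < 1$ and for the ratio $r = \frac{4}{3}$ from part (b), whose first term is $2^2 \cdot 3^0 = 4$.

```python
def geometric_partial_sum(a, r, n):
    """S_n = a + a*r + ... + a*r**(n-1), accumulated term by term."""
    total, term = 0.0, a
    for _ in range(n):
        total += term
        term *= r
    return total

# |r| < 1: partial sums settle near a / (1 - r).
a, r = 2 / 3, 1 / 3
print(geometric_partial_sum(a, r, 50), a / (1 - r))

# r = 4/3 > 1 (part (b), first term 4): partial sums keep growing.
for n in (10, 20, 50):
    print(n, geometric_partial_sum(4.0, 4 / 3, n))
```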


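Example 6.9. Show that the series $\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$ is convergent, and find its sum.

Solution. For each $n$,
\[
\begin{aligned}
S_n &= \sum_{i=1}^{n} \frac{1}{i(i+1)} = \sum_{i=1}^{n} \left(\frac{1}{i} - \frac{1}{i+1}\right) \\
 &= \left(\frac{1}{1} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right) \\
 &= 1 - \frac{1}{1+n}.
\end{aligned}
\]

Therefore, $\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \lim_{n\to\infty} \left(1 - \frac{1}{1+n}\right) = 1$, which is a real number. Thus the series
$\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$ converges.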
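
The telescoping identity $S_n = 1 - \frac{1}{n+1}$ can be confirmed with exact fractions. This sketch is an added illustration, not part of the original notes; `partial_sum` is an assumed name.

```python
from fractions import Fraction

def partial_sum(n):
    """Exact n-th partial sum of the series with terms 1 / (i (i + 1))."""
    return sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))

# Each partial sum is printed next to the closed form 1 - 1/(n + 1).
for n in (1, 2, 3, 10, 100):
    print(n, partial_sum(n), 1 - Fraction(1, n + 1))
```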



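Theorem 6.3. If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series, so are the series $\sum_{n=1}^{\infty} c a_n$ (where $c$ is a
constant) and $\sum_{n=1}^{\infty} (a_n + b_n)$. Moreover,

(a) $\sum_{n=1}^{\infty} c a_n = c \sum_{n=1}^{\infty} a_n$, and

(b) $\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n$.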


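Example 6.10. From Example 6.9, we know that $\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$. It follows that
\[ \sum_{n=1}^{\infty} \left(\frac{2}{3^n} + \frac{4}{n(n+1)}\right) = \sum_{n=1}^{\infty} \frac{2}{3^n} + 4 \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \frac{\frac{2}{3}}{1 - \frac{1}{3}} + 4 \cdot 1 = 1 + 4 = 5. \]

(Note that $\sum_{n=1}^{\infty} \frac{2}{3^n}$ is a geometric series with $a = \frac{2}{3}$ and $r = \frac{1}{3}$.)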



Lemma 6.4. If the series $\sum_{n=1}^{\infty} a_n$ is convergent, then $\lim_{n\to\infty} a_n = 0$.

Proof. Suppose $\sum_{n=1}^{\infty} a_n = L$. Let $S_n = a_1 + \cdots + a_n$. Then $\lim_{n\to\infty} S_n = L$. Note that $a_n = S_n - S_{n-1}$ for
all $n \ge 2$. Then
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} (S_n - S_{n-1}) = L - L = 0. \]

Theorem 6.5. (The nth Term Test for Divergence)

If $\lim_{n\to\infty} a_n$ does not exist or if $\lim_{n\to\infty} a_n \neq 0$, then the series $\sum_{n=1}^{\infty} a_n$ is divergent.

This is also known simply as the nth term test.

Example 6.11. Is the series $\sum_{n=1}^{\infty} \frac{n^2}{7n^2 + 3}$ convergent or divergent?

Solution. $\lim_{n\to\infty} \frac{n^2}{7n^2 + 3} = \lim_{n\to\infty} \frac{1}{7 + \frac{3}{n^2}} = \frac{1}{7 + 0} = \frac{1}{7} \neq 0$. Thus by the nth term test, the series
$\sum_{n=1}^{\infty} \frac{n^2}{7n^2 + 3}$ is divergent.


Warning. The nth term test is inconclusive if $\lim_{n\to\infty} a_n = 0$.

(To see this, consider the two series $\sum_{n=1}^{\infty} \frac{1}{n}$ and $\sum_{n=1}^{\infty} \frac{1}{n^2}$. Note that both series satisfy the condition
$\lim_{n\to\infty} a_n = 0$. But we will later show that $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent, while $\sum_{n=1}^{\infty} \frac{1}{n^2}$ is convergent.)


For series of nonnegative terms (i.e., each $a_n \ge 0$), we have the following fundamental result:

Theorem 6.6. A series $\sum_{n=1}^{\infty} a_n$ of nonnegative terms converges if and only if its partial sums
are bounded from above (i.e., there exists a constant $K$ such that $S_n < K$ for all $n$).

Remark. Basically, this theorem means that if each $a_n \ge 0$, then the sum of the series $\sum_{n=1}^{\infty} a_n$
is either a finite number or $\infty$.

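Example 6.12. The series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ is convergent.

Solution. Let $S_n = \sum_{k=1}^{n} \frac{1}{k^2}$. Then for each $n$,
\[
\begin{aligned}
S_n &= 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{(n-1)^2} + \frac{1}{n^2} \\
 &\le 1 + \frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \cdots + \frac{1}{(n-2) \times (n-1)} + \frac{1}{(n-1) \times n} \\
 &= 1 + \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n-2} - \frac{1}{n-1}\right) + \left(\frac{1}{n-1} - \frac{1}{n}\right) \\
 &= 2 - \frac{1}{n} < 2.
\end{aligned}
\]

Thus the sequence of partial sums $\{S_n\}$ is bounded above (by 2). Therefore by the above
theorem, $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges.

In general, it is difficult to determine whether a given series is convergent. In the next few sections,
we will discuss some methods that will enable us to test the convergence of certain series.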

6.5 Integral Test



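Example 6.13. The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent.

Solution. Observe that
\[
1 + \frac{1}{2} + \underbrace{\left(\frac{1}{3} + \frac{1}{4}\right)}_{\ge \frac{2}{4} = \frac{1}{2}}
 + \underbrace{\left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right)}_{\ge \frac{4}{8} = \frac{1}{2}}
 + \underbrace{\left(\frac{1}{9} + \frac{1}{10} + \cdots + \frac{1}{16}\right)}_{\ge \frac{8}{16} = \frac{1}{2}} + \cdots
\]
From this bracketing of the terms, one can see that the partial sums $\{S_n\}$ satisfy
\[ S_2 \ge \frac{1}{2}, \quad S_4 \ge 2 \cdot \frac{1}{2}, \quad S_8 \ge 3 \cdot \frac{1}{2}, \quad S_{16} \ge 4 \cdot \frac{1}{2}, \quad \cdots, \]
and more generally,
\[ S_{2^k} \ge \frac{k}{2} \quad \text{for all } k. \]

So the sequence of partial sums is not bounded from above. By the above theorem, the
harmonic series diverges.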
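
The bound $S_{2^k} \ge \frac{k}{2}$ can be observed numerically, although the growth is very slow. The sketch below is an added illustration, not part of the original notes; `harmonic` is an assumed name.

```python
def harmonic(n):
    """n-th partial sum of the harmonic series, 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))

# Compare S_(2^k) with the lower bound k/2 from the bracketing argument.
for k in range(0, 16, 3):
    print(k, 2**k, harmonic(2**k), k / 2)
```

The partial sums exceed any fixed bound eventually, but doubling the number of terms only adds about a constant amount, which is why the divergence is hard to guess from a short table.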


For $n = 1, 2, \dots$, the rectangle with height $\frac{1}{n}$ erected on the unit interval $[n, n + 1]$ has area $\frac{1}{n}$.
The harmonic series may be viewed as the sum of the areas of these rectangles. Graphically
we see that the union of all these rectangles contains the green region $R$ bounded by the
graph of $y = \frac{1}{x}$ and the x-axis for $x \ge 1$. However, the region $R$ has an infinite area as the
improper integral $\int_1^{\infty} \frac{1}{x}\,dx$ diverges. This implies the harmonic series diverges.

Theorem 6.7. (Integral Test) Let $\{a_n\}$ be a sequence of positive terms. Suppose that $a_n = f(n)$,
where $f$ is a continuous, positive, decreasing function of $x$ for all $x \ge 1$. Then the series $\sum_{n=1}^{\infty} a_n$
is convergent if and only if the improper integral $\int_1^{\infty} f(x)\,dx$ is convergent. In other words:

(i) If $\int_1^{\infty} f(x)\,dx$ is convergent, then $\sum_{n=1}^{\infty} a_n$ is convergent.

(ii) If $\int_1^{\infty} f(x)\,dx$ is divergent, then $\sum_{n=1}^{\infty} a_n$ is divergent.


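Example 6.14. Test the convergence of the series $\sum_{n=1}^{\infty} \frac{1}{n^2 + 1}$.

Solution. Clearly, $\frac{1}{x^2 + 1}$ is a continuous, positive, decreasing function for $x > 0$.
\[ \int_1^{\infty} \frac{1}{x^2 + 1}\,dx = \lim_{b\to\infty} \left[\tan^{-1} x\right]_1^b = \lim_{b\to\infty} (\tan^{-1} b - \tan^{-1} 1) = \frac{\pi}{2} - \frac{\pi}{4} = \frac{\pi}{4}. \]
Thus the improper integral $\int_1^{\infty} \frac{1}{x^2 + 1}\,dx$ converges. By the integral test, $\sum_{n=1}^{\infty} \frac{1}{n^2 + 1}$ converges. (But note that $\frac{\pi}{4}$ is
not the sum of the series.)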
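
To illustrate that $\frac{\pi}{4}$ is not the sum, one can compare a large partial sum with the integral. The sketch below is an added illustration, not part of the original notes. It relies on the standard estimate for a positive decreasing $f$, namely $\int_1^{\infty} f(x)\,dx \le \sum_{n=1}^{\infty} a_n \le a_1 + \int_1^{\infty} f(x)\,dx$, which is not stated in these notes; `partial_sum` is an assumed name.

```python
import math

def partial_sum(n_terms):
    """Partial sum of the series with terms 1 / (n^2 + 1), n = 1..n_terms."""
    return sum(1.0 / (n * n + 1) for n in range(1, n_terms + 1))

integral = math.pi / 4      # value of the improper integral from Example 6.14
first_term = 1 / (1 + 1)    # a_1 = 1/2
s = partial_sum(100_000)

# The partial sum lies strictly between the integral and a_1 + integral.
print(integral, s, first_term + integral)
```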


Theorem 6.8. (The p-series) The p-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ is convergent if and only if $p > 1$.

Proof. Let $p > 1$. Then
\[ \int_1^{\infty} \frac{1}{x^p}\,dx = \lim_{b\to\infty} \left[\frac{x^{-p+1}}{-p+1}\right]_1^b = \lim_{b\to\infty} \frac{1}{1-p}\left(\frac{1}{b^{p-1}} - 1\right) = \frac{1}{p-1}. \]
Thus the series converges by the integral test.

If $p = 1$, then the series is the harmonic series, which is divergent.

Let $0 < p < 1$. Then $1 - p > 0$ and
\[ \int_1^{\infty} \frac{1}{x^p}\,dx = \lim_{b\to\infty} \frac{1}{1-p}\left(b^{1-p} - 1\right) = \infty. \]
Thus the series diverges by the integral test.

Let $p \le 0$. Then $\lim_{n\to\infty} \frac{1}{n^p} \neq 0$. By Theorem 6.5 (the nth term test), the series diverges.



Example 6.15. The series $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$ is divergent, since it is a p-series with $p = \frac{1}{2} \le 1$.

The series $\sum_{n=1}^{\infty} \frac{1}{n\sqrt{n}}$ is convergent, since it is a p-series with $p = \frac{3}{2} > 1$.

6.6 The Comparison Test

Theorem 6.9. (Comparison Test)

Suppose $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are series with nonnegative terms such that $0 \le a_n \le b_n$ for all $n$.

(i) If $\sum_{n=1}^{\infty} b_n$ is convergent, then $\sum_{n=1}^{\infty} a_n$ is convergent.

(ii) If $\sum_{n=1}^{\infty} a_n$ is divergent, then $\sum_{n=1}^{\infty} b_n$ is divergent.

Remark. Roughly speaking, (i) means that if the bigger series has a finite sum, then the
smaller series also has a finite sum.

Also, (ii) means that if the sum of the smaller series is ∞, then the sum of the bigger series
is also ∞.

Remark. When applying the Comparison Test, we often compare a given series with an
appropriate p-series or geometric series.

Example 6.16. Determine the convergence of the series $\sum_{n=1}^{\infty} \frac{7}{2n^2 + 4n + 3}$.

Solution. Note that $0 \le \frac{7}{2n^2 + 4n + 3} \le \frac{7}{n^2}$ for all $n \ge 1$. We know $\sum_{n=1}^{\infty} \frac{7}{n^2} = 7 \sum_{n=1}^{\infty} \frac{1}{n^2}$ converges
as it is a p-series with $p = 2 > 1$. Thus by the comparison test, $\sum_{n=1}^{\infty} \frac{7}{2n^2 + 4n + 3}$ converges.


Example 6.17. Determine the convergence of the series $\sum_{n=1}^{\infty} \frac{1}{2^n - 1}$.

Solution. Note that for all $n \ge 1$,
\[ 2^n - 1 \ge \frac{1}{2} \cdot 2^n = 2^{n-1} \implies 0 \le \frac{1}{2^n - 1} \le \frac{1}{2^{n-1}}. \]
The series $\sum_{n=1}^{\infty} \frac{1}{2^{n-1}}$ is convergent, since it is a geometric series with $|r| = \frac{1}{2} < 1$. Thus by the
comparison test, $\sum_{n=1}^{\infty} \frac{1}{2^n - 1}$ is convergent.


6.7 The Ratio Test and Root Test



Theorem 6.10. (The Ratio Test) Suppose $\sum_{n=1}^{\infty} a_n$ is a series such that $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = L$ ($L$ is a
finite number or $\infty$).

(i) If $0 \le L < 1$, then $\sum_{n=1}^{\infty} a_n$ is absolutely convergent. That is, $\sum_{n=1}^{\infty} |a_n|$ is convergent.

(ii) If $L > 1$, then $\sum_{n=1}^{\infty} a_n$ is divergent.

(iii) If $L = 1$, then the ratio test is inconclusive.

Proof. (Optional Reading Exercise) See Section 6.13.

Remark. Theorem 6.13 (later) will tell us: $\sum_{n=1}^{\infty} |a_n|$ is convergent $\implies$ $\sum_{n=1}^{\infty} a_n$ is convergent.

Thus in (i) (when $0 \le L < 1$), the series $\sum_{n=1}^{\infty} a_n$ itself is also convergent.

Remark. For a geometric series $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} a r^{n-1}$, one has
\[ L = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \left|\frac{a r^n}{a r^{n-1}}\right| = \lim_{n\to\infty} |r| = |r|. \]

Thus roughly speaking, the Ratio Test says that if a series resembles a geometric series, then
its convergence/divergence property also resembles that of the geometric series (in the cases
of (i) and (ii)).

Remark. (iii) can be illustrated by the convergent series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ and the divergent series
$\sum_{n=1}^{\infty} \frac{1}{n}$, as both series have $L = 1$.

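Example 6.18. Test the series
\[ \sum_{n=1}^{\infty} (-1)^n \frac{n^3}{3^n} \]
for absolute convergence.

Solution.
\[ \lim_{n\to\infty} \left|\frac{(-1)^{n+1} \frac{(n+1)^3}{3^{n+1}}}{(-1)^n \frac{n^3}{3^n}}\right| = \lim_{n\to\infty} \frac{1}{3}\left(\frac{n+1}{n}\right)^3 = \lim_{n\to\infty} \frac{1}{3}\left(1 + \frac{1}{n}\right)^3 = \frac{1}{3} < 1. \]
Thus by the ratio test, $\sum_{n=1}^{\infty} (-1)^n \frac{n^3}{3^n}$ converges absolutely.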
n=1


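Example 6.19. Test the series
\[ \sum_{n=1}^{\infty} \frac{n!}{n^n} \]
for absolute convergence.

Solution.
\[ L = \lim_{n\to\infty} \frac{\frac{(n+1)!}{(n+1)^{n+1}}}{\frac{n!}{n^n}} = \lim_{n\to\infty} \frac{n^n}{(n+1)^n} = \frac{1}{\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n}. \]
By L'Hôpital's rule,
\[
\lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x
 = e^{\lim_{x\to\infty} \ln\left(1 + \frac{1}{x}\right)^x}
 = e^{\lim_{x\to\infty} x \ln\left(1 + \frac{1}{x}\right)}
 = e^{\lim_{x\to\infty} \frac{\ln\left(1 + \frac{1}{x}\right)}{1/x}}
 = e^{\lim_{x\to\infty} \frac{\frac{1}{1 + 1/x} \cdot \left(-\frac{1}{x^2}\right)}{-\frac{1}{x^2}}}
 = e^{\lim_{x\to\infty} \frac{1}{1 + \frac{1}{x}}}
 = e^{\frac{1}{1+0}} = e.
\]

Thus we have $\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e$, and it follows that $L = \frac{1}{e} < 1$.

Hence by the Ratio Test, the series $\sum_{n=1}^{\infty} \frac{n!}{n^n}$ is absolutely convergent.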
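
The limit $L = \frac{1}{e}$ in Example 6.19 can be observed by computing the ratios $\frac{a_{n+1}}{a_n}$ directly. The sketch below is an added illustration, not part of the original notes; `term` and `ratio` are assumed names, and exact fractions are used so that the large factorials cause no overflow.

```python
import math
from fractions import Fraction

def term(n):
    """Exact value of a_n = n! / n^n."""
    return Fraction(math.factorial(n), n**n)

def ratio(n):
    """The ratio a_(n+1) / a_n, converted to a float at the end."""
    return float(term(n + 1) / term(n))

# The ratios should drift towards 1/e as n grows.
for n in (1, 10, 100, 1000):
    print(n, ratio(n), 1 / math.e)
```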


Exercise 6.1. Test for convergence of the following series.

(a) $\sum_{n=1}^{\infty} \frac{2n + 5}{3^n}$ (b) $\sum_{n=1}^{\infty} \frac{(2n)!}{n!\,n!}$ (c) $\sum_{n=1}^{\infty} \frac{4^n (n!)^2}{(2n)!}$

Ans. (a) convergent, (b) divergent, (c) divergent.

Our next test resembles the Ratio Test closely:


Theorem 6.11. (The Root Test) Suppose $\sum_{n=1}^{\infty} a_n$ is a series such that $\lim_{n\to\infty} \sqrt[n]{|a_n|} = L$ ($L$ is a
finite number or $\infty$).

(i) If $0 \le L < 1$, then $\sum_{n=1}^{\infty} a_n$ is absolutely convergent.

(ii) If $L > 1$, then $\sum_{n=1}^{\infty} a_n$ is divergent.

(iii) If $L = 1$, then the root test is inconclusive.

Remark. For a geometric series $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} a r^{n-1}$, one has
\[ L = \lim_{n\to\infty} \sqrt[n]{|a_n|} = \lim_{n\to\infty} \sqrt[n]{|a r^{n-1}|} = \lim_{n\to\infty} |a|^{\frac{1}{n}} |r|^{\frac{n-1}{n}} = |a|^0 \cdot |r|^1 = 1 \cdot |r| = |r|. \]

As such, the conclusions in (i) and (ii) in the Root Test are consistent with what we know
about the convergence/divergence property of a geometric series.

Remark. Again (iii) can be illustrated by the convergent series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ and the divergent
series $\sum_{n=1}^{\infty} \frac{1}{n}$, as both series have $L = 1$.

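Example 6.20. Test the series
\[ \sum_{n=1}^{\infty} \left(\frac{2n + 3}{3n + 2}\right)^n \]
for convergence.

Solution.
\[ \lim_{n\to\infty} \sqrt[n]{\left|\left(\frac{2n + 3}{3n + 2}\right)^n\right|} = \lim_{n\to\infty} \frac{2n + 3}{3n + 2} = \frac{2}{3} < 1. \]
By the root test, $\sum_{n=1}^{\infty} \left(\frac{2n + 3}{3n + 2}\right)^n$ is convergent.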


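Example 6.21. Test the series
\[ \sum_{n=1}^{\infty} \frac{n^n}{3^{3n+1}} \]
for convergence.

Solution. $\lim_{n\to\infty} \sqrt[n]{\left|\frac{n^n}{3^{3n+1}}\right|} = \lim_{n\to\infty} \frac{n}{3^{3 + \frac{1}{n}}} = \infty$. By the root test, $\sum_{n=1}^{\infty} \frac{n^n}{3^{3n+1}}$ is divergent.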


Exercise 6.2. Test for convergence of the following series.

(a) $\sum_{n=1}^{\infty} \frac{n^2}{2^n}$ (b) $\sum_{n=1}^{\infty} \frac{2^n}{n^3}$ (c) $\sum_{n=1}^{\infty} \left(\frac{1}{1 + n}\right)^n$

Ans. (a) convergent, (b) divergent, (c) convergent.

6.8 Alternating Series
An alternating series is a series whose terms are alternately positive and negative.
Example 6.22. (i) An example of an alternating series is given by
\[ \sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n} = \frac{1}{1} - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots, \]
which is known as the alternating harmonic series. The name reflects the fact that $\sum_{n=1}^{\infty} \left|(-1)^{n-1} \frac{1}{n}\right| = \sum_{n=1}^{\infty} \frac{1}{n}$
is the harmonic series.

(ii) Another example of an alternating series is given by
\[ \sum_{n=1}^{\infty} (-1)^n n = -1 + 2 - 3 + 4 - 5 + 6 - \cdots \]


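Remark. The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent. Nonetheless, it turns out that the alternating
harmonic series $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n}$ is convergent.
This can be seen as follows:
\[
\begin{aligned}
\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n} &= \left(\frac{1}{1} - \frac{1}{2}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \left(\frac{1}{5} - \frac{1}{6}\right) + \cdots \\
 &= \frac{1}{1 \cdot 2} + \frac{1}{3 \cdot 4} + \frac{1}{5 \cdot 6} + \cdots \\
 &= \sum_{n=1}^{\infty} \frac{1}{(2n - 1) \cdot 2n}.
\end{aligned}
\]
For each $n \ge 1$, we have
\[ 2n - 1 \ge n \implies 0 \le \frac{1}{(2n - 1) \cdot 2n} \le \frac{1}{2n^2}. \]
The series $\sum_{n=1}^{\infty} \frac{1}{2n^2} = \frac{1}{2} \cdot \sum_{n=1}^{\infty} \frac{1}{n^2}$ is convergent, since it is a constant multiple of a p-series with $p = 2 > 1$. Thus
by the Comparison Test, $\sum_{n=1}^{\infty} \frac{1}{(2n - 1) \cdot 2n}$ is convergent. Thus the series $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n}$ is also
convergent.
The above conclusion also follows from the following result: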

Theorem 6.12. (The Alternating Series Test) If $\{b_n\}$ is a sequence of positive numbers such that

(i) $\{b_n\}$ is decreasing (that is, $b_n \ge b_{n+1}$ for all $n$), and

(ii) $\lim_{n\to\infty} b_n = 0$,

then the alternating series
\[ \sum_{n=1}^{\infty} (-1)^{n-1} b_n = b_1 - b_2 + b_3 - b_4 + \cdots \quad\text{and}\quad \sum_{n=1}^{\infty} (-1)^n b_n = -b_1 + b_2 - b_3 + b_4 - \cdots \]
are convergent.


Example 6.23. Use the Alternating Series Test to show that the series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ is convergent.

Proof. $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ is an alternating series with each $b_n = \frac{1}{n} > 0$. For each $n \ge 1$, we have
$\frac{1}{n} - \frac{1}{n+1} = \frac{1}{n(n+1)} > 0$, and thus the sequence $\left\{\frac{1}{n}\right\}$ is decreasing. Also we have $\lim_{n\to\infty} \frac{1}{n} = 0$.
Thus by the alternating series test, the series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ is convergent.
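
The convergence of the alternating harmonic series can be observed numerically. The sketch below is an added illustration, not part of the original notes. It uses the known value $\ln 2$ of the sum, which these notes do not derive; `alternating_harmonic` is an assumed name.

```python
import math

def alternating_harmonic(n):
    """n-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (i - 1) / i for i in range(1, n + 1))

# Print each partial sum and its distance from ln 2.
for n in (10, 100, 1000, 10000):
    s = alternating_harmonic(n)
    print(n, s, abs(s - math.log(2)))
```

The error shrinks roughly like $\frac{1}{n}$, which is slow: the series converges, but only conditionally (see Section 6.9).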

Example 6.24. Show that the series $\sum_{n=1}^{\infty} \frac{(-1)^n n^2}{n^3 + 1}$ is convergent.

Solution. $\sum_{n=1}^{\infty} \frac{(-1)^n n^2}{n^3 + 1}$ is an alternating series with each $b_n = \frac{n^2}{n^3 + 1} > 0$. For each $n \ge 1$, we
have, by direct calculation,
\[
\begin{aligned}
b_n - b_{n+1} &= \frac{n^2}{n^3 + 1} - \frac{(n+1)^2}{(n+1)^3 + 1} = \frac{n^2((n+1)^3 + 1) - (n+1)^2(n^3 + 1)}{(n^3 + 1)((n+1)^3 + 1)} \\
 &= \frac{n^4 + 2n^3 + n^2 - 2n - 1}{(n^3 + 1)((n+1)^3 + 1)} \\
 &= \frac{n^4 + 2n(n^2 - 1) + (n^2 - 1)}{(n^3 + 1)((n+1)^3 + 1)} \ge 0.
\end{aligned}
\]
Thus the sequence $\left\{\frac{n^2}{n^3 + 1}\right\}$ is decreasing.

Also, $\lim_{n\to\infty} \frac{n^2}{n^3 + 1} = \lim_{n\to\infty} \frac{\frac{1}{n}}{1 + \frac{1}{n^3}} = \frac{0}{1 + 0} = 0$.

Thus by the alternating series test, $\sum_{n=1}^{\infty} \frac{(-1)^n n^2}{n^3 + 1}$ is convergent.


6.9 Absolute Convergence



Given a series $\sum_{n=1}^{\infty} a_n$, we can construct a new series $\sum_{n=1}^{\infty} |a_n|$, whose terms are the absolute
values of the terms of the original series.

Theorem 6.13. If $\sum_{n=1}^{\infty} |a_n|$ is convergent, then $\sum_{n=1}^{\infty} a_n$ is convergent.


Proof. Note that $0 \le a_n + |a_n| \le 2|a_n|$ for all $n$. Since $\sum_{n=1}^{\infty} 2|a_n|$ converges, we have by the comparison
test that $\sum_{n=1}^{\infty} (a_n + |a_n|)$ converges too. Therefore,
\[ \sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (a_n + |a_n|) - \sum_{n=1}^{\infty} |a_n| \]
converges.



Definition 6.2. A series $\sum_{n=1}^{\infty} a_n$ is said to be absolutely convergent if $\sum_{n=1}^{\infty} |a_n|$ is convergent.
The series $\sum_{n=1}^{\infty} a_n$ is said to be conditionally convergent if it is convergent but not absolutely
convergent.

Theorem 6.13 states that every absolutely convergent series is convergent.


Example 6.25. Show that the series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2}$ is absolutely convergent.

Solution. The series $\sum_{n=1}^{\infty} \left|\frac{(-1)^{n-1}}{n^2}\right| = \sum_{n=1}^{\infty} \frac{1}{n^2}$ is convergent, since it is a p-series with $p = 2 > 1$.
Therefore, $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2}$ converges absolutely.

Example 6.26. Show that the series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ is conditionally convergent.

Solution. From Example 6.23, we know, by the alternating series test, that $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ is
convergent.
Also, $\sum_{n=1}^{\infty} \left|\frac{(-1)^{n-1}}{n}\right| = \sum_{n=1}^{\infty} \frac{1}{n}$ is the harmonic series, which is divergent.
Therefore, $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ converges conditionally.


Example 6.27. Show that the series $\sum_{n=1}^{\infty} \frac{\sin n}{n^2}$ is absolutely convergent.

6.10 Power Series


A power series is a series of the form
\[ \sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots, \]
where $x$ is a variable, and the $c_n$'s are constants called coefficients of the series. For each fixed
$x$, the power series is a series of numbers that we can test for convergence or divergence.

More generally, a series of the form
\[ \sum_{n=0}^{\infty} c_n (x - a)^n = c_0 + c_1 (x - a) + c_2 (x - a)^2 + c_3 (x - a)^3 + \cdots \]
is called a power series centred at $a$ or a power series about $a$.

Note that the power series $\sum_{n=0}^{\infty} c_n (x - a)^n$ always converges at $x = a$.

Example 6.28. For what values of $x$ is the series $\sum_{n=0}^{\infty} n!\,x^n$ convergent?

Solution. If $x \neq 0$, then $\lim_{n\to\infty} \left|\frac{(n+1)!\,x^{n+1}}{n!\,x^n}\right| = \lim_{n\to\infty} (n+1)|x| = \infty$. By the ratio test, $\sum_{n=0}^{\infty} n!\,x^n$ diverges.
Therefore, $\sum_{n=0}^{\infty} n!\,x^n$ converges if and only if $x = 0$.


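Example 6.29. For what values of $x$ is the series $\sum_{n=1}^{\infty} \frac{(x - 7)^n}{n}$ convergent?

Solution. $\lim_{n\to\infty} \left|\frac{\frac{(x-7)^{n+1}}{n+1}}{\frac{(x-7)^n}{n}}\right| = \lim_{n\to\infty} \frac{n}{n+1} |x - 7| = |x - 7|$. By the ratio test, we have the following
conclusions (i) and (ii).

(i) If $|x - 7| < 1$, then $\sum_{n=1}^{\infty} \frac{(x - 7)^n}{n}$ is absolutely convergent.

(ii) If $|x - 7| > 1$, then $\sum_{n=1}^{\infty} \frac{(x - 7)^n}{n}$ is divergent.

The ratio test is inconclusive when $|x - 7| = 1$, so the two endpoints are checked separately.

(iii) If $x = 6$, the series becomes $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$, which is an alternating series. By the alternating
series test, it is convergent.

(iv) If $x = 8$, the series becomes $\sum_{n=1}^{\infty} \frac{1}{n}$, which is the harmonic series and hence divergent.

Summarizing, the series $\sum_{n=1}^{\infty} \frac{(x - 7)^n}{n}$ is convergent if and only if $x \in [6, 8)$.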



Theorem 6.14. For a given power series $\sum_{n=0}^{\infty} c_n (x - a)^n$, exactly one of the following possibilities
holds:

(i) The series converges at $x = a$ only.

(ii) The series converges for all $x$.

(iii) There is a positive number $R$ such that the series converges absolutely if $|x - a| < R$ and
diverges if $|x - a| > R$.

The number $R$ in case (iii) is called the radius of convergence of the power series. By convention,
the radius of convergence is $R = 0$ in case (i) and $R = \infty$ in case (ii). The interval of
convergence of a power series is the interval consisting of all values of $x$ for which the series
converges.

In case (iii), the interval of convergence can be $(a - R, a + R)$, $[a - R, a + R)$, $(a - R, a + R]$ or $[a - R, a + R]$.


In some cases, we can compute R by the following method.


Theorem 6.15. Consider the power series $\sum_{n=0}^{\infty} c_n (x - a)^n$, where $c_n \neq 0$ for all $n$. If $\lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = L$
or $\lim_{n\to\infty} \sqrt[n]{|c_n|} = L$, where $L$ is a real number or $\infty$, then $R = \frac{1}{L}$.

By convention, if $L = 0$, then $R = \infty$, and if $L = \infty$, then $R = 0$.

Proof. Suppose $\lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = L$.
Thus $\lim_{n\to\infty} \left|\frac{c_{n+1} (x - a)^{n+1}}{c_n (x - a)^n}\right| = \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| |x - a| = L|x - a|$.

By the ratio test, the series converges absolutely for $L|x - a| < 1$, that is $|x - a| < \frac{1}{L}$; and the series
diverges for $L|x - a| > 1$, that is $|x - a| > \frac{1}{L}$. Therefore the radius of convergence is $\frac{1}{L}$.

The second case where $\lim_{n\to\infty} \sqrt[n]{|c_n|} = L$ follows similarly by the root test.


Example 6.30. Find the radius of convergence and interval of convergence of the power series
\[ \sum_{n=0}^{\infty} \frac{(-3)^n x^n}{\sqrt{n + 1}}. \]

Solution. $\lim_{n\to\infty} \left|\frac{\frac{(-3)^{n+1}}{\sqrt{n+2}}}{\frac{(-3)^n}{\sqrt{n+1}}}\right| = \lim_{n\to\infty} 3\sqrt{\frac{n+1}{n+2}} = 3$. Thus the radius of convergence is $\frac{1}{3}$.

When $x = \frac{1}{3}$, the series becomes $\sum_{n=0}^{\infty} \frac{(-1)^n}{\sqrt{n + 1}}$, which is an alternating series. By the alternating
series test, it is convergent.
When $x = -\frac{1}{3}$, the series becomes $\sum_{n=0}^{\infty} \frac{1}{\sqrt{n + 1}}$, which is divergent by the integral test.
Consequently, the interval of convergence is $\left(-\frac{1}{3}, \frac{1}{3}\right]$.
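
The radius found in Example 6.30 can be estimated numerically from the coefficient ratio $\left|\frac{c_n}{c_{n+1}}\right|$, which tends to $R = \frac{1}{L}$ by Theorem 6.15. The sketch below is an added illustration, not part of the original notes; `coeff` and `radius_estimate` are assumed names, and $n$ is kept moderate so that $3^n$ stays within floating-point range.

```python
import math

def coeff(n):
    """Coefficient c_n = (-3)^n / sqrt(n + 1) of the power series in Example 6.30."""
    return (-3.0) ** n / math.sqrt(n + 1)

def radius_estimate(n):
    """|c_n / c_(n+1)|, which approaches the radius of convergence as n grows."""
    return abs(coeff(n) / coeff(n + 1))

for n in (10, 100, 500):
    print(n, radius_estimate(n), 1 / 3)
```

Such an estimate says nothing about the endpoints $x = \pm\frac{1}{3}$; these still have to be tested separately, as in the solution above.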

Example 6.31. Find the radius of convergence and interval of convergence of the power series
\[ \sum_{n=0}^{\infty} \frac{n (x + 2)^n}{3^{n+1}}. \]

Solution. $\lim_{n\to\infty} \left|\frac{\frac{n+1}{3^{n+2}}}{\frac{n}{3^{n+1}}}\right| = \lim_{n\to\infty} \frac{n+1}{3n} = \frac{1}{3}$. Thus the radius of convergence is 3. The centre of the
power series is at $x = -2$. Next we consider the two endpoints $x = -2 \pm 3$, that is, $x = -5, 1$.
When $x = -5$, the series becomes $\sum_{n=0}^{\infty} \frac{n(-3)^n}{3^{n+1}}$, that is $\sum_{n=0}^{\infty} \frac{n(-1)^n}{3}$, which is divergent by the nth
term test.
When $x = 1$, the series becomes $\sum_{n=0}^{\infty} \frac{n \cdot 3^n}{3^{n+1}}$, that is $\sum_{n=0}^{\infty} \frac{n}{3}$, which is also divergent by the nth
term test.
Consequently, the interval of convergence is $(-5, 1)$.

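Example 6.32. Find the radius of convergence and interval of convergence of the power series
\[ \sum_{n=1}^{\infty} \frac{(4x - 3)^{2n+1}}{4^n n^{4/3}}. \]

Solution. Note that we cannot apply Theorem 6.15 directly since all the coefficients of the
even powers of $(4x - 3)$ are zero. Let $u_n = \frac{(4x - 3)^{2n+1}}{4^n n^{4/3}}$.
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{u_{n+1}}{u_n}\right| &= \lim_{n\to\infty} \left|\frac{(4x - 3)^{2n+3}}{4^{n+1} (n+1)^{4/3}} \cdot \frac{4^n n^{4/3}}{(4x - 3)^{2n+1}}\right| \\
 &= \lim_{n\to\infty} \left(\frac{n}{n+1}\right)^{4/3} \frac{|4x - 3|^2}{4} = \frac{|4x - 3|^2}{4}.
\end{aligned}
\]
By the ratio test, the power series converges absolutely for $\frac{|4x - 3|^2}{4} < 1 \iff \frac{|4x - 3|}{2} < 1 \iff \left|x - \frac{3}{4}\right| < \frac{1}{2}$,
and diverges for $\left|x - \frac{3}{4}\right| > \frac{1}{2}$.
Therefore, the radius of convergence is $\frac{1}{2}$.

At $x = \frac{5}{4}$, the series is $\sum_{n=1}^{\infty} \frac{2^{2n+1}}{4^n n^{4/3}} = \sum_{n=1}^{\infty} \frac{2}{n^{4/3}}$, which is convergent since it is a constant multiple of a p-series with
$p = \frac{4}{3} > 1$.
At $x = \frac{1}{4}$, the series is $\sum_{n=1}^{\infty} \frac{(-2)^{2n+1}}{4^n n^{4/3}} = \sum_{n=1}^{\infty} \frac{-2}{n^{4/3}}$, which is convergent since it is a constant multiple of a p-series with
$p = \frac{4}{3} > 1$.
Therefore, the interval of convergence is $\left[\frac{1}{4}, \frac{5}{4}\right]$.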


6.11 Power Series Representation


Recall that for $|x| < 1$, the geometric series
\[ \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots + x^n + \cdots = \frac{1}{1 - x}. \]

$\sum_{n=0}^{\infty} x^n$ is called a power series representation of the function $\frac{1}{1 - x}$ about $x = 0$.
n=0

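Example 6.33. (i) Find a power series representation of $\frac{1}{1 + x}$ about $x = 0$.
(ii) Find a power series representation of $\frac{x^3}{x + 2}$ about $x = 0$.

Solution. (i) We make use of the above geometric series.
\[ \frac{1}{1 + x} = \frac{1}{1 - (-x)} = \sum_{n=0}^{\infty} (-x)^n = 1 - x + x^2 - x^3 + x^4 - \cdots, \]
which is valid for $|-x| < 1$. That is,
\[ \frac{1}{1 + x} = \sum_{n=0}^{\infty} (-1)^n x^n \quad \text{for } |x| < 1. \]

(ii) Using the power series representation in (i), we get
\[ \frac{x^3}{x + 2} = \frac{x^3}{2} \cdot \frac{1}{1 + \frac{x}{2}} = \frac{x^3}{2} \cdot \sum_{n=0}^{\infty} (-1)^n \left(\frac{x}{2}\right)^n \quad \text{for } \left|\frac{x}{2}\right| < 1. \]
That is,
\[ \frac{x^3}{x + 2} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2^{n+1}} x^{n+3} \quad \text{for } |x| < 2. \]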

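Example 6.34. Find a power series representation of $\frac{1}{x^2 + 3x + 2}$ about $x = 0$.

Solution.
\[ \frac{1}{x^2 + 3x + 2} = \frac{1}{x + 1} - \frac{1}{x + 2} = \frac{1}{1 + x} - \frac{1}{2} \cdot \frac{1}{1 + \frac{x}{2}} = \sum_{n=0}^{\infty} (-1)^n x^n - \frac{1}{2} \cdot \sum_{n=0}^{\infty} (-1)^n \left(\frac{x}{2}\right)^n, \]
which is valid when both $|x| < 1$ and $\left|\frac{x}{2}\right| < 1$. Note that
\[ |x| < 1 \text{ and } \left|\frac{x}{2}\right| < 1 \iff |x| < 1 \text{ and } |x| < 2 \iff |x| < 1. \]

Thus
\[ \frac{1}{x^2 + 3x + 2} = \sum_{n=0}^{\infty} (-1)^n \left(1 - \frac{1}{2^{n+1}}\right) x^n \quad \text{for } |x| < 1. \]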
n=0


Theorem 6.16. If the power series $\sum_{n=0}^{\infty} c_n (x - a)^n$ has radius of convergence $R > 0$, then the
function $f$ defined by
\[ f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n \]
is differentiable on the interval $|x - a| < R$ and

(i) $f'(x) = \sum_{n=1}^{\infty} n c_n (x - a)^{n-1}$, for $|x - a| < R$.

(ii) $\int f(x)\,dx = \sum_{n=0}^{\infty} c_n \frac{(x - a)^{n+1}}{n + 1} + C$, for $|x - a| < R$.

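Example 6.35. Find a power series representation of $\ln(1 - x)$ and its radius of convergence.

Solution. For $|x| < 1$, $\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n$. Thus by Theorem 6.16 (ii),
\[
\begin{aligned}
 & \int \frac{1}{1 - x}\,dx = \sum_{n=0}^{\infty} \frac{x^{n+1}}{n + 1} + C \\
 \implies & -\ln(1 - x) = \sum_{n=0}^{\infty} \frac{x^{n+1}}{n + 1} + C, \quad \text{for } |x| < 1.
\end{aligned}
\]

When $x = 0$, we have $0 = 0 + C$ so that $C = 0$. Therefore,
\[ \ln(1 - x) = -\sum_{n=0}^{\infty} \frac{x^{n+1}}{n + 1} = \sum_{n=0}^{\infty} -\frac{1}{n + 1} x^{n+1}, \quad \text{for } |x| < 1. \]

The radius of convergence is 1 by the ratio test.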



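Example 6.36. Find a power series representation of $\tan^{-1} x$.

Solution. For $|x| < 1$, $\frac{1}{1 + x^2} = \sum_{n=0}^{\infty} (-x^2)^n = \sum_{n=0}^{\infty} (-1)^n x^{2n}$. Thus by Theorem 6.16 (ii),
\[
\begin{aligned}
 & \int \frac{1}{1 + x^2}\,dx = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n + 1} + C \\
 \implies & \tan^{-1} x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n + 1} + C, \quad \text{for } |x| < 1.
\end{aligned}
\]

When $x = 0$, we have $0 = 0 + C$ so that $C = 0$. Therefore,
\[ \tan^{-1} x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n + 1}, \quad \text{for } |x| < 1. \]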
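
The representation of $\tan^{-1} x$ can be compared with a library value at a point inside the interval $|x| < 1$. The sketch below is an added illustration, not part of the original notes; `arctan_series` is an assumed name and $x = 0.5$ is an arbitrary test point.

```python
import math

def arctan_series(x, n_terms):
    """Sum of the first n_terms terms of the series (-1)^n x^(2n+1) / (2n+1)."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(n_terms))

# Partial sums at x = 0.5 next to the library arctangent.
for n_terms in (1, 2, 5, 10, 20):
    print(n_terms, arctan_series(0.5, n_terms), math.atan(0.5))
```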

6.12 Taylor and Maclaurin Series


By repeated use of Theorem 6.16 (i), we deduce the following.

Theorem 6.17. If $f$ has a power series representation at $x = a$, that is
\[ f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n, \quad |x - a| < R, \text{ for some } R > 0, \]
then its coefficients are given by the formula
\[ c_n = \frac{f^{(n)}(a)}{n!}. \]

If $f$ has a power series representation at $x = a$, then it is unique and has the form
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n. \]

This is called the Taylor series of $f$ at $x = a$.

The Maclaurin series of $f$ is the special case of the Taylor series when $a = 0$:
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n. \]

Exercise 6.3. Assume that each of the functions $e^x$, $\sin x$ and $\cos x$ has a Maclaurin series
representation. Find the Maclaurin series and its radius of convergence for (a) $e^x$, (b) $\sin x$, (c) $\cos x$.

Ans: (a) $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, (b) $\sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n + 1)!}$, (c) $\cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$.
All three Maclaurin series converge for all $x$ in $\mathbb{R}$.

Solution. (a) Let's work out the Maclaurin series of $e^x$. Since $\frac{d^n e^x}{dx^n} = e^x$ for all $n \ge 0$, we have
$\frac{d^n e^x}{dx^n}(0) = 1$ for all $n \ge 0$.
Therefore, the Maclaurin series of $e^x$ is
\[ 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots. \]
Since $\lim_{n\to\infty} \frac{\frac{1}{(n+1)!}}{\frac{1}{n!}} = \lim_{n\to\infty} \frac{1}{n + 1} = 0$, the radius of convergence is $\infty$ and so the interval of
convergence is $\mathbb{R}$.
Assuming $e^x$ has a Maclaurin series representation, we thus have $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$ for all $x \in \mathbb{R}$.

(b), (c): Exercise.
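
The Maclaurin series of $e^x$ from part (a) can be compared with a library value. The sketch below is an added illustration, not part of the original notes; `exp_series` is an assumed name. Each term is obtained from the previous one via $\frac{x^{n+1}}{(n+1)!} = \frac{x^n}{n!} \cdot \frac{x}{n+1}$, which avoids computing large factorials.

```python
import math

def exp_series(x, n_terms):
    """Sum of the first n_terms terms of the Maclaurin series of e^x."""
    total, term = 0.0, 1.0   # term starts at x^0 / 0! = 1
    for n in range(n_terms):
        total += term
        term *= x / (n + 1)  # next term: x^(n+1) / (n+1)!
    return total

for x in (1.0, -2.0, 5.0):
    print(x, exp_series(x, 60), math.exp(x))
```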




6.13 Appendix: The Precise Definition of the Limit of a Sequence and some Proofs
Remark: Section 6.13 will be excluded from the assessments (quizzes and the Final Exam).

Remark: The precise definition of the limit of a sequence and the proofs given in this section
will be studied in detail in MA2108 Mathematical Analysis I.

Definition 6.3. The sequence $\{a_n\}$ converges to the number $L$ if for every positive number $\varepsilon$, there corresponds an integer $N$ such that for all $n$,

$$n > N \text{ implies that } |a_n - L| < \varepsilon.$$

If no such number $L$ exists, we say that $\{a_n\}$ diverges. If $\{a_n\}$ converges to $L$, we write

$$\lim_{n\to\infty} a_n = L$$

or simply $a_n \to L$, and call $L$ the limit of the sequence.

We write $\lim_{n\to\infty} a_n = \infty$ if $a_n$ can be made arbitrarily large by taking $n$ sufficiently large. Formally, it means that for every $M > 0$, there exists a number $N$ such that

$$n > N \Rightarrow a_n > M.$$
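Example 6.37. Show that $\lim_{n\to\infty} \dfrac{1}{n} = 0$.

Solution. Given $\varepsilon > 0$, choose a positive integer $N$ such that $\frac{1}{N} < \varepsilon$. Then for any integer $n > N$, we have $\left|\frac{1}{n} - 0\right| = \frac{1}{n} < \frac{1}{N} < \varepsilon$. This verifies the definition of limit. Therefore we have shown that $\lim_{n\to\infty} \dfrac{1}{n} = 0$.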
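The choice of $N$ in the argument above can be made concrete. In the sketch below, `choose_N` is a hypothetical helper that picks one valid $N$ for a given $\varepsilon$ (any larger integer would also work); the loop then spot-checks the inequality for a finite range of $n$, which illustrates the definition but is of course not a proof.

```python
import math

def choose_N(eps):
    """Return a positive integer N with 1/N < eps (one valid choice among many)."""
    return math.floor(1 / eps) + 1

for eps in (0.5, 0.1, 0.003):
    N = choose_N(eps)
    # Spot-check |1/n - 0| < eps for some integers n > N.
    ok = all(abs(1 / n - 0) < eps for n in range(N + 1, N + 1000))
    print(eps, N, ok)
```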


Now we give the proofs of the Ratio Test (Theorem 6.10) and the Alternating Series Test
(Theorem 6.12).

Proof of Theorem 6.10 (Ratio Test). Notation and setting as in Theorem 6.10. Suppose $\lim_{n\to\infty} \left|\dfrac{a_{n+1}}{a_n}\right| = L < 1$. Choose $r$ such that $L < r < 1$. Then there exists a positive integer $N$ such that

$$n > N \Rightarrow \left|\frac{a_{n+1}}{a_n}\right| < r.$$

Then $|a_n| \le |a_{N+1}|\, r^{n-N-1}$ for all $n > N$. Since the geometric series $\sum |a_{N+1}|\, r^{n-N-1}$ is convergent, by the comparison test, $\sum |a_n|$ is convergent.

Suppose $\lim_{n\to\infty} \left|\dfrac{a_{n+1}}{a_n}\right| = L > 1$. Choose $r$ such that $1 < r < L$. Then there exists a positive integer $N$ such that

$$n > N \Rightarrow \left|\frac{a_{n+1}}{a_n}\right| > r.$$

Then $|a_n| \ge |a_{N+1}|\, r^{n-N-1}$ for all $n > N$. Since $\lim_{n\to\infty} |a_{N+1}|\, r^{n-N-1} = \infty$ as $r > 1$, we have $\lim_{n\to\infty} |a_n| = \infty$. Thus $\lim_{n\to\infty} a_n \ne 0$. Therefore, $\sum_{n=1}^{\infty} a_n$ is divergent by the test for divergence (Theorem 6.5).


Proof of Theorem 6.12 (Alternating Series Test). Notation and setting as in Theorem 6.12.

For simplicity, we only consider the series $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$. Let $S_n = \sum_{i=1}^{n} (-1)^{i-1} b_i$. Then

$$S_{2n} = (b_1 - b_2) + \cdots + (b_{2n-1} - b_{2n}) = b_1 - (b_2 - b_3) - \cdots - (b_{2n-2} - b_{2n-1}) - b_{2n}.$$

The first equality shows that $S_{2n} \ge 0$ and $S_{2(n+1)} \ge S_{2n}$. The second equality shows that $S_{2n} \le b_1$ for all $n$. The sequence $\{S_{2n}\}$ is non-decreasing and bounded above, so it has a limit, say

$$\lim_{n\to\infty} S_{2n} = L.$$

As $S_{2n+1} = S_{2n} + b_{2n+1}$, we have

$$\lim_{n\to\infty} S_{2n+1} = \lim_{n\to\infty} S_{2n} + \lim_{n\to\infty} b_{2n+1} = L + 0 = L.$$

Lastly, for any term $S_n$ with $n \ge 2$, there exists a positive integer $m$ such that $S_{2m} \le S_n \le S_{2m+1}$. [For example, $S_2 \le S_2 \le S_3$, $S_2 \le S_3 \le S_3$, $S_4 \le S_4 \le S_5$, $S_4 \le S_5 \le S_5$, etc.] Also, as $n$ tends to infinity, $m$ tends to infinity. Thus

$$\lim_{m\to\infty} S_{2m} \le \lim_{n\to\infty} S_n \le \lim_{m\to\infty} S_{2m+1}.$$

Since $\lim_{m\to\infty} S_{2m} = \lim_{m\to\infty} S_{2m+1} = L$, we have by the Squeeze Theorem that $\lim_{n\to\infty} S_n = L$. This means the series $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$ is convergent.


Chapter 7

Vectors and Geometry of Space

Read Thomas’ Calculus, Chapter 11.


In this chapter,
• we introduce the coordinate systems for three-dimensional space $\mathbb{R}^3$. This provides the setting for our study of calculus of functions of two and three variables. The simplest geometric notion is the distance between two points in space.
• we define vectors geometrically, followed by the study of their algebraic properties. We emphasize the power of algebraic manipulation of vectors. In particular, we look at the dot product and cross product of vectors and their applications.
• we define lines and planes in $\mathbb{R}^3$.

7.1 The 3D-Coordinate System


We set up the 3D coordinate system by fixing a point O in space (called the origin) and take
three lines passing through O that are perpendicular to each other. These lines are labeled
as x-axis, y-axis and z-axis respectively.

A point P in space can be represented by an ordered triple (a, b, c), where a, b and c are the projections of the point P onto the x-, y- and z-axes respectively. The three-dimensional space is also called the xyz-space.
The direction of the z-axis is determined by the right-hand rule.

Any two of the axes determine a plane: the xy-plane, the yz-plane and the xz-plane.

Example 7.1. Describe and sketch the surface in $\mathbb{R}^3$ represented by the equation $x + y = 2$.

Solution. Note that $x + y = 2$ represents a line in the xy-plane. However, in $\mathbb{R}^3$, it represents the plane containing all points whose x- and y-coordinates sum to 2. This is a vertical plane.

Theorem 7.1 (Distance Formula).
The distance $|P_1P_2|$ between the points $P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$ is

$$|P_1P_2| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

Consequently, we have the following equation of a sphere:

Theorem 7.2 (Equation of Sphere).
An equation of a sphere with center $C(h, k, l)$ and radius $r$ is

$$(x - h)^2 + (y - k)^2 + (z - l)^2 = r^2.$$
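The two formulas above translate directly into code. The following is a small sketch with hypothetical helper names (`distance`, `on_sphere`) and made-up sample points; the tolerance parameter is an assumption needed only because of floating-point rounding.

```python
import math

def distance(p1, p2):
    """Distance between two points of R^3, by the distance formula."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))

def on_sphere(point, center, r, tol=1e-9):
    """True if the point lies on the sphere with the given center and radius (up to tol)."""
    return abs(distance(point, center) - r) <= tol

print(distance((1, 2, 3), (4, 6, 3)))        # differences are 3, 4, 0
print(on_sphere((3, 0, 0), (0, 0, 0), 3))    # a point at distance 3 from the origin
```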

7.2 Vectors
It can be complicated (messy) to describe an object in space directly using the coordinates
x, y and z. It turns out to be easier by using vectors.
A vector is often represented by an arrow.
• The length of the arrow represents the magnitude of the vector.
• The arrow points in the direction of the vector.
For instance, suppose a particle moves along a line segment from point A to point B.

The vector $\mathbf{v}$ has initial point A (the tail) and terminal point B (the tip). We indicate this by writing $\mathbf{v} = \overrightarrow{AB}$. Call this the displacement vector of a particle from A to B.
We denote a vector by either:
• Printing a letter in boldface $\mathbf{v}$, or
• Putting an arrow above the letter $\vec{v}$.
The zero vector, denoted by 0, has length 0. It is the only vector with no specific direction.
Note that a vector does not depend on its initial point.

Definition 7.1 (Adding Vectors – The Triangle Law).
Let $\mathbf{u}$ and $\mathbf{v}$ be two vectors. Then their sum $\mathbf{u} + \mathbf{v}$ is the vector from the initial point of $\mathbf{u}$ to the terminal point of $\mathbf{v}$ when we position the vectors so that the initial point of $\mathbf{v}$ coincides with the terminal point of $\mathbf{u}$.

Definition 7.2 (Scalar Multiplication).


Let c ∈ R and u be a vector. The scalar multiple cu is the vector whose length is |c| times the
length of u and whose direction is the same as u if c > 0 and is opposite to u if c < 0. If c = 0
or u = 0, then cu = 0.

Notice that two nonzero vectors are parallel if they are scalar multiples of each other. By the difference $\mathbf{u} - \mathbf{v}$, we mean
$$\mathbf{u} - \mathbf{v} = \mathbf{u} + (-\mathbf{v}).$$
To treat vectors systematically (algebraically), we place the initial point of a vector $\mathbf{u}$ at the origin O. In doing so, the terminal point has the coordinates $(u_1, u_2)$ or $(u_1, u_2, u_3)$ for some $u_1, u_2, u_3 \in \mathbb{R}$, depending on whether $\mathbf{u}$ is a vector in $\mathbb{R}^2$ or $\mathbb{R}^3$.
Denote $\mathbf{u}$ by
$$\mathbf{u} = \langle u_1, u_2, u_3 \rangle.$$
$u_1, u_2, u_3$ are called the components of $\mathbf{u}$.
$\langle u_1, u_2, u_3 \rangle$ is also called the position vector of the point $(u_1, u_2, u_3)$.
In other words, position vectors are vectors whose initial point is the origin.
Any vector can be represented by a position vector.
What is so nice about position vectors? The main advantage of representing vectors using position vectors is that we can simplify calculations by algebraic manipulations!
• To add position vectors, we can just add their corresponding components. If $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3 \rangle$, then
$$\mathbf{a} + \mathbf{b} = \langle a_1 + b_1, a_2 + b_2, a_3 + b_3 \rangle.$$
• To multiply $\mathbf{a}$ by a scalar $c$, we multiply each component by that scalar:
$$c\mathbf{a} = \langle ca_1, ca_2, ca_3 \rangle.$$
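Theorem 7.3.
Given the points $A(x_1, y_1, z_1)$ and $B(x_2, y_2, z_2)$, the vector $\mathbf{a}$ representing $\overrightarrow{AB}$ is

$$\mathbf{a} = \langle x_2 - x_1, y_2 - y_1, z_2 - z_1 \rangle.$$

Proof.
$$\begin{aligned}
\overrightarrow{AB} &= \overrightarrow{AO} + \overrightarrow{OB} \\
&= \overrightarrow{OB} - \overrightarrow{OA} \\
&= \langle x_2, y_2, z_2 \rangle - \langle x_1, y_1, z_1 \rangle \\
&= \langle x_2 - x_1, y_2 - y_1, z_2 - z_1 \rangle.
\end{aligned}$$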


Theorem 7.4 (Properties of Vectors).


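Suppose $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ are vectors, and $c, d \in \mathbb{R}$ are scalars. Then

(1) $\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$
(2) $\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c}$
(3) $\mathbf{a} + \mathbf{0} = \mathbf{a}$
(4) $\mathbf{a} + (-\mathbf{a}) = \mathbf{0}$
(5) $c(\mathbf{a} + \mathbf{b}) = c\mathbf{a} + c\mathbf{b}$
(6) $(c + d)\mathbf{a} = c\mathbf{a} + d\mathbf{a}$
(7) $(cd)\mathbf{a} = c(d\mathbf{a})$
(8) $1\mathbf{a} = \mathbf{a}$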

These properties are readily verified geometrically or algebraically.


Three vectors in $V_3$ play a special role. They are
$$\mathbf{i} = \langle 1, 0, 0 \rangle, \quad \mathbf{j} = \langle 0, 1, 0 \rangle, \quad \mathbf{k} = \langle 0, 0, 1 \rangle.$$

These vectors are called the standard basis vectors. They have length 1 and point in the direction of the positive x-, y- and z-axes respectively. Thus, any vector $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ can be written as

$$\begin{aligned}
\mathbf{a} &= \langle a_1, a_2, a_3 \rangle \\
&= a_1 \langle 1, 0, 0 \rangle + a_2 \langle 0, 1, 0 \rangle + a_3 \langle 0, 0, 1 \rangle \\
&= a_1 \mathbf{i} + a_2 \mathbf{j} + a_3 \mathbf{k}.
\end{aligned}$$

The length of a vector $\mathbf{u}$ is the length of any of its representations, and is denoted by $\|\mathbf{u}\|$ (in the textbook, the notation $|\mathbf{u}|$ is used). Using the distance formula, we have the following.
The length of the vector $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$ is
$$\|\mathbf{u}\| = \sqrt{u_1^2 + u_2^2 + u_3^2}.$$

A unit vector is a vector whose length is 1. For example, $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are unit vectors.
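Theorem 7.5.
If $\mathbf{a} \ne \mathbf{0}$, then a unit vector in the same direction as $\mathbf{a}$ is given by

$$\mathbf{u} = \frac{\mathbf{a}}{\|\mathbf{a}\|}.$$

Proof. Notice $1/\|\mathbf{a}\|$ is a positive scalar, so $\mathbf{u}$ is in the same direction as $\mathbf{a}$. Now,

$$\|\mathbf{u}\| = \left\| \frac{\mathbf{a}}{\|\mathbf{a}\|} \right\| = \frac{1}{\|\mathbf{a}\|} \|\mathbf{a}\| = 1.$$
So $\mathbf{u}$ is the unit vector in the same direction as $\mathbf{a}$.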

7.3 The Dot Product


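The dot product of two vectors $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3 \rangle$ is defined to be

$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3.$$

The dot product satisfies the following properties.

Theorem 7.6 (Properties of Dot Product).
For vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ and any scalar $d$,

(i) $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$ (commutativity)

(ii) $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$ (distributive law)

(iii) $(d\mathbf{a}) \cdot \mathbf{b} = d(\mathbf{a} \cdot \mathbf{b}) = \mathbf{a} \cdot (d\mathbf{b})$

(iv) $\mathbf{0} \cdot \mathbf{a} = 0$

(v) $\mathbf{a} \cdot \mathbf{a} = \|\mathbf{a}\|^2$.

Notice that $\mathbf{a} \cdot \mathbf{b} = 0$ does not imply that $\mathbf{a} = \mathbf{0}$ or $\mathbf{b} = \mathbf{0}$.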


For two nonzero vectors $\mathbf{a}$ and $\mathbf{b}$ in $V_3$, we define the angle $\theta$ between them to be the smaller angle between $\mathbf{a}$ and $\mathbf{b}$, formed by placing their initial points at the origin.

• $\mathbf{a}$ and $\mathbf{b}$ have the same direction iff $\theta = 0$.
• $\mathbf{a}$ and $\mathbf{b}$ have opposite directions iff $\theta = \pi$.
• $\mathbf{a}$ and $\mathbf{b}$ are orthogonal (perpendicular) iff $\theta = \frac{\pi}{2}$.

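Theorem 7.7.
Let $\theta$ be the angle between nonzero vectors $\mathbf{a}$ and $\mathbf{b}$. Then

$$\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \, \|\mathbf{b}\| \cos\theta.$$

Proof. Recall that the Law of Cosines says that

$$\|\mathbf{a} - \mathbf{b}\|^2 = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\|\mathbf{a}\| \, \|\mathbf{b}\| \cos\theta.$$

On the other hand,

$$\begin{aligned}
\|\mathbf{a} - \mathbf{b}\|^2 &= \|\langle a_1 - b_1, a_2 - b_2, a_3 - b_3 \rangle\|^2 \\
&= (a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 \\
&= (a_1^2 - 2a_1 b_1 + b_1^2) + (a_2^2 - 2a_2 b_2 + b_2^2) + (a_3^2 - 2a_3 b_3 + b_3^2) \\
&= (a_1^2 + a_2^2 + a_3^2) + (b_1^2 + b_2^2 + b_3^2) - 2(a_1 b_1 + a_2 b_2 + a_3 b_3) \\
&= \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\,\mathbf{a} \cdot \mathbf{b}.
\end{aligned}$$

Rearranging,

$$2\,\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - \|\mathbf{a} - \mathbf{b}\|^2 = 2\|\mathbf{a}\| \, \|\mathbf{b}\| \cos\theta,$$

where the last equality follows from the Law of Cosines. So

$$\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \, \|\mathbf{b}\| \cos\theta.$$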



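Example 7.2. Find the angle between the vectors $\mathbf{a} = \langle 2, 1, -3 \rangle$ and $\mathbf{b} = \langle 1, 5, 6 \rangle$.
Solution. We have
$$\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\| \, \|\mathbf{b}\|} = \frac{-11}{\sqrt{14}\sqrt{62}}.$$
It follows that
$$\theta = \cos^{-1}\!\left( \frac{-11}{\sqrt{14}\sqrt{62}} \right) \approx 1.953 \text{ radians}.$$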
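The computation in this example can be reproduced in a few lines. The helpers `dot`, `norm` and `angle` below are illustrative names, not notation from the notes; the sketch simply applies the formula for cos θ and then takes the inverse cosine.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as tuples of components."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a vector, using a . a = |a|^2."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle in radians between two nonzero vectors."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

a, b = (2, 1, -3), (1, 5, 6)
print(dot(a, b))      # the numerator of cos(theta)
print(angle(a, b))    # the angle in radians
```

Since the dot product is negative here, the angle is obtuse, which agrees with the value found above.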


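Theorem 7.8.
Two vectors $\mathbf{a}$ and $\mathbf{b}$ are orthogonal if and only if $\mathbf{a} \cdot \mathbf{b} = 0$.

Proof. If either $\mathbf{a}$ or $\mathbf{b}$ is $\mathbf{0}$, then $\mathbf{a} \cdot \mathbf{b} = 0$, and $\mathbf{a}$ and $\mathbf{b}$ are orthogonal since $\mathbf{0}$ is considered to be orthogonal to every vector.
So we may assume that $\mathbf{a}$ and $\mathbf{b}$ are nonzero. Then

$$\|\mathbf{a}\| \, \|\mathbf{b}\| \cos\theta = \mathbf{a} \cdot \mathbf{b} = 0$$

if and only if $\cos\theta = 0$, if and only if $\theta = \frac{\pi}{2}$, which is equivalent to saying that $\mathbf{a}$ and $\mathbf{b}$ are orthogonal.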

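7.4 Projections

The figure shows two vectors $\mathbf{a}$ and $\mathbf{b}$ with the same initial point, representing $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ respectively.

Let S be the foot of the perpendicular line from R to the line containing $\overrightarrow{PQ}$.

The vector $\overrightarrow{PS}$ is called the vector projection of $\mathbf{b}$ onto $\mathbf{a}$, denoted by

$$\operatorname{proj}_{\mathbf{a}} \mathbf{b}.$$

The scalar projection of $\mathbf{b}$ onto $\mathbf{a}$ (also called the component of $\mathbf{b}$ along $\mathbf{a}$) is defined to be the signed magnitude of the vector projection, and is denoted by

$$\operatorname{comp}_{\mathbf{a}} \mathbf{b}.$$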
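Notice

$$\operatorname{comp}_{\mathbf{a}} \mathbf{b} = \|\mathbf{b}\| \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|}.$$

This value is negative if $\frac{\pi}{2} < \theta \le \pi$, where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$.
Therefore,
$$\operatorname{proj}_{\mathbf{a}} \mathbf{b} = \operatorname{comp}_{\mathbf{a}} \mathbf{b} \times \frac{\mathbf{a}}{\|\mathbf{a}\|} = \left( \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|} \right) \frac{\mathbf{a}}{\|\mathbf{a}\|} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|^2}\, \mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}}\, \mathbf{a}.$$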

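Example 7.3. Let $\mathbf{a} = \langle -2, 3, 1 \rangle$ and $\mathbf{b} = \langle 1, 1, 2 \rangle$. Find the scalar projection and vector projection of $\mathbf{b}$ onto $\mathbf{a}$.

Solution. Notice $\|\mathbf{a}\| = \sqrt{(-2)^2 + 3^2 + 1^2} = \sqrt{14}$. So

$$\operatorname{comp}_{\mathbf{a}} \mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|} = \frac{(-2)(1) + 3(1) + 1(2)}{\sqrt{14}} = \frac{3}{\sqrt{14}}.$$

It follows that
$$\operatorname{proj}_{\mathbf{a}} \mathbf{b} = \frac{3}{\sqrt{14}} \, \frac{\mathbf{a}}{\|\mathbf{a}\|} = \frac{3}{14}\, \mathbf{a} = \left\langle -\frac{3}{7}, \frac{9}{14}, \frac{3}{14} \right\rangle.$$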
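The projection formulas can be evaluated exactly for integer vectors. The sketch below uses hypothetical helper names and Python's `fractions` module so that the vector projection comes out as exact rationals; integer components are an assumption of this illustration.

```python
import math
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scalar_projection(a, b):
    """comp_a b = (a . b) / |a|."""
    return dot(a, b) / math.sqrt(dot(a, a))

def vector_projection(a, b):
    """proj_a b = ((a . b) / (a . a)) a, exact for integer components."""
    scale = Fraction(dot(a, b), dot(a, a))
    return tuple(scale * x for x in a)

a, b = (-2, 3, 1), (1, 1, 2)
print(scalar_projection(a, b))
print(vector_projection(a, b))
```

Using the form with a · a in the denominator avoids square roots altogether for the vector projection.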

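Theorem 7.9 (Distance from a point to a plane).
The (shortest) distance from a point $P(x_0, y_0, z_0)$ to the plane $ax + by + cz = d$ is given by

$$\frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}}.$$

Proof. A normal vector to the plane is $\mathbf{n} = \langle a, b, c \rangle$. (See Section 7.7.) Pick any point $Q(x_1, y_1, z_1)$ on the plane, so that $ax_1 + by_1 + cz_1 = d$. Then the shortest distance from P to the plane is

$$\begin{aligned}
\left\| \operatorname{proj}_{\mathbf{n}} \overrightarrow{QP} \right\| &= \left| \operatorname{comp}_{\mathbf{n}} \overrightarrow{QP} \right| \\
&= \frac{\left| \overrightarrow{QP} \cdot \mathbf{n} \right|}{\|\mathbf{n}\|} \\
&= \frac{|\langle x_0 - x_1, y_0 - y_1, z_0 - z_1 \rangle \cdot \langle a, b, c \rangle|}{\|\mathbf{n}\|} \\
&= \frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}},
\end{aligned}$$

where the last step uses $ax_1 + by_1 + cz_1 = d$.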

Example 7.4. Find the distance from the point $(2, -3, 4)$ to the plane $x + 2y + 3z = 13$.

Solution. We have $(x_0, y_0, z_0) = (2, -3, 4)$ and $a = 1$, $b = 2$, $c = 3$, $d = 13$. Using the formula, the distance is
$$\frac{|2(1) + (-3)(2) + 4(3) - 13|}{\sqrt{1^2 + 2^2 + 3^2}} = \frac{5}{\sqrt{14}}.$$
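The distance formula of Theorem 7.9 is easy to wrap in a function. The name `point_plane_distance` and its argument order are assumptions of this sketch; the plane is passed as its normal vector ⟨a, b, c⟩ together with the right-hand side d.

```python
import math

def point_plane_distance(point, normal, d):
    """Distance from a point to the plane normal . (x, y, z) = d."""
    numerator = abs(sum(n * p for n, p in zip(normal, point)) - d)
    return numerator / math.sqrt(sum(n * n for n in normal))

# The point and plane of Example 7.4.
print(point_plane_distance((2, -3, 4), (1, 2, 3), 13))
print(5 / math.sqrt(14))   # the value obtained by hand, for comparison
```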


7.5 The Cross Product


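We now define a second type of product of vectors, called the cross product or vector product. While the dot product is a scalar, the cross product is a vector.
For two vectors $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3 \rangle$, define the cross product of $\mathbf{a}$ and $\mathbf{b}$ to be

$$\begin{aligned}
\mathbf{a} \times \mathbf{b} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \\
&= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \mathbf{k} \\
&= (a_2 b_3 - a_3 b_2)\mathbf{i} - (a_1 b_3 - a_3 b_1)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}.
\end{aligned}$$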

To compute a × b, we must write the components of a in the second row of the determinant,
and the components of b in the third row. The order is important!
One of the most important properties of the cross product is the following theorem.

Theorem 7.10.
The vector a × b is orthogonal to both a and b.

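Proof. To show $\mathbf{a} \times \mathbf{b}$ is orthogonal to $\mathbf{a}$, we compute their dot product as follows:

$$\begin{aligned}
(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{a} &= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} a_1 - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} a_2 + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} a_3 \\
&= (a_2 b_3 - a_3 b_2) a_1 - (a_1 b_3 - a_3 b_1) a_2 + (a_1 b_2 - a_2 b_1) a_3 \\
&= 0.
\end{aligned}$$

A similar computation shows that $(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{b} = 0$.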


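The vector $\mathbf{a} \times \mathbf{b}$ points in a direction perpendicular to both $\mathbf{a}$ and $\mathbf{b}$. This direction is given by the right-hand rule.

What is the geometric meaning of the length $\|\mathbf{a} \times \mathbf{b}\|$? This is given by the following theorem.

Theorem 7.11.
If $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, then

$$\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\| \, \|\mathbf{b}\| \sin\theta.$$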

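We can use the cross product
• to find the area of a parallelogram
• to find the distance from a point to a line in $\mathbb{R}^3$.
If $\mathbf{a}$ and $\mathbf{b}$ are represented by directed line segments with the same initial point, then they determine a parallelogram with base $\|\mathbf{a}\|$ and altitude $\|\mathbf{b}\| \sin\theta$. Therefore, the area of the parallelogram is given by
$$\|\mathbf{a}\| \, \|\mathbf{b}\| \sin\theta = \|\mathbf{a} \times \mathbf{b}\|.$$

The distance from Q to the line through P and R is

$$\left\| \overrightarrow{PQ} \right\| \sin\theta = \frac{\left\| \overrightarrow{PQ} \times \overrightarrow{PR} \right\|}{\left\| \overrightarrow{PR} \right\|},$$

where $\theta$ is the angle between $\overrightarrow{PQ}$ and $\overrightarrow{PR}$.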
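The component formula for the cross product and the two applications just described can be sketched as follows. All function names and the sample vectors are illustrative assumptions; vectors are plain tuples.

```python
import math

def cross(a, b):
    """Cross product a x b, component by component."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def parallelogram_area(a, b):
    """Area of the parallelogram determined by a and b, namely |a x b|."""
    return norm(cross(a, b))

def point_line_distance(q, p, r):
    """Distance from Q to the line through P and R: |PQ x PR| / |PR|."""
    pq = tuple(x - y for x, y in zip(q, p))
    pr = tuple(x - y for x, y in zip(r, p))
    return norm(cross(pq, pr)) / norm(pr)

print(cross((1, 0, 0), (0, 1, 0)))                           # i x j
print(parallelogram_area((2, 0, 0), (0, 3, 0)))              # a 2-by-3 rectangle
print(point_line_distance((0, 1, 0), (0, 0, 0), (1, 0, 0)))  # distance to the x-axis
```

Swapping the two arguments of `cross` flips the sign of every component, which is why the order of the rows in the determinant matters.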

Some of the usual laws of algebra hold for cross products.

Theorem 7.12.
If a, b and c are vectors and d is a scalar, then

(i) a × b = −b × a

(ii) (da) × b = d(a × b) = a × (db)

(iii) a × (b + c) = a × b + a × c

(iv) (a + b) × c = a × c + b × c

7.6 Lines

How do we write down an equation of a line in space? To do it, we must describe the
behavior of a general point on the line. We shall see that vectors can help us to achieve this
goal with minimal effort.
So let $P(x, y, z)$ denote an arbitrary point on the line L.
Let $\mathbf{r}$ and $\mathbf{r}_0$ denote the position vectors of $P$ and $P_0$ respectively, where $P_0$ is a point on L which we have fixed. Our aim is to describe $\mathbf{r}$, the position vector of an arbitrary point on L.
Let $\mathbf{v}$ be a vector parallel to L, so $\overrightarrow{P_0P} = t\mathbf{v}$ for some scalar $t$, since $\overrightarrow{P_0P}$ and $\mathbf{v}$ are parallel.

Then
$$\mathbf{r} = \mathbf{r}_0 + \overrightarrow{P_0P},$$
so that
$$\mathbf{r} = \mathbf{r}_0 + t\mathbf{v},$$
which is a vector equation of L.
Each value of the parameter $t$ gives the position vector $\mathbf{r}$ of a point on L. As $t$ varies, the line is traced out by the tip of the vector $\mathbf{r}$.
We can write the vector equation in component form:

$$\mathbf{v} = \langle a, b, c \rangle, \quad \mathbf{r}_0 = \langle x_0, y_0, z_0 \rangle, \quad \mathbf{r} = \langle x, y, z \rangle.$$

Two vectors are equal if and only if the corresponding components are equal. Therefore, $\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}$ becomes

$$\langle x, y, z \rangle = \langle x_0, y_0, z_0 \rangle + t\langle a, b, c \rangle.$$

Theorem 7.13 (Parametric Equation of Line).

$$x = x_0 + at, \quad y = y_0 + bt, \quad z = z_0 + ct.$$

Usually the parameter t (in the parametric equation of line) takes values on the entire R or
an interval I.
The numbers a, b and c are called direction numbers of the line L.
The vector equation and parametric equations of a line are not unique.
If we change the point r0 or the parameter t or choose a different parallel vector v, then the
equations change. Therefore, direction numbers are not unique.

Example 7.5. Find an equation of the line passing through $P(1, 2, -1)$ and $Q(5, -3, 4)$.

Solution. A vector parallel to the line is
$$\overrightarrow{PQ} = \langle 5 - 1, -3 - 2, 4 - (-1) \rangle = \langle 4, -5, 5 \rangle.$$

Pick a point on the line, say $(1, 2, -1)$. Then the parametric equations for the line are
$$x = 1 + 4t, \quad y = 2 - 5t, \quad z = -1 + 5t.$$


Let $L_1$ and $L_2$ be two lines in $\mathbb{R}^3$, with parallel vectors $\mathbf{a}$ and $\mathbf{b}$, respectively, and let $\theta$ be the angle between $\mathbf{a}$ and $\mathbf{b}$.
• The lines $L_1$ and $L_2$ are parallel whenever $\mathbf{a}$ and $\mathbf{b}$ are parallel.
• If $L_1$ and $L_2$ intersect, then $\theta$ is an angle between $L_1$ and $L_2$. Notice $\pi - \theta$ is also an angle between the lines.
In 2-D, two lines are either parallel or intersect. This is not true in 3-D. Non-parallel, non-intersecting lines are called skew lines.
Example 7.6. Show that the lines
$$L_1: \; x - 2 = -t, \quad y - 1 = 2t, \quad z - 5 = 2t,$$
$$L_2: \; x - 1 = s, \quad y - 2 = -s, \quad z - 1 = 3s,$$
are skew.

Solution. A vector parallel to $L_1$ is $\mathbf{a} = \langle -1, 2, 2 \rangle$ and a vector parallel to $L_2$ is $\mathbf{b} = \langle 1, -1, 3 \rangle$. Since $\mathbf{a}$ is not a scalar multiple of $\mathbf{b}$, these vectors are not parallel, so the lines are not parallel.
Assume for a contradiction that $L_1$ and $L_2$ intersect. Then there must exist a choice of the parameters $t$ and $s$ such that the values of $x$, $y$ and $z$ are the same. In particular, for the x-coordinate,
$$2 - t = 1 + s,$$
so that $s = 1 - t$.
On the other hand, the y-coordinate must satisfy
$$y = 1 + 2t = 2 - s.$$

Substituting $s = 1 - t$ into the last equation, we have $t = 0$ and so $s = 1$.

Now, the z-coordinate must satisfy
$$z = 5 + 2t = 5 \quad \text{and} \quad z = 1 + 3s = 4,$$
which is absurd!
Hence our assumption that $L_1$ and $L_2$ intersect was wrong. So the lines are skew, as desired.
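The same conclusion can be cross-checked by a different route from the one used in the solution. The sketch below is an alternative technique, not the method of the notes: two non-parallel lines with points $P_1$, $P_2$ and direction vectors a, b intersect exactly when $\overrightarrow{P_1P_2} \cdot (\mathbf{a} \times \mathbf{b}) = 0$. The function name `classify_lines` is hypothetical, and integer inputs are assumed so that comparisons with zero are exact.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def classify_lines(p1, a, p2, b):
    """Classify two lines r = p1 + t a and r = p2 + s b (integer data assumed)."""
    n = cross(a, b)
    if n == (0, 0, 0):
        return "parallel"          # includes the case of coincident lines
    w = tuple(q - p for p, q in zip(p1, p2))
    # Non-parallel lines meet exactly when P1P2 lies in the plane spanned by a and b.
    return "intersecting" if dot(w, n) == 0 else "skew"

# L1 passes through (2, 1, 5) with direction <-1, 2, 2>;
# L2 passes through (1, 2, 1) with direction <1, -1, 3>.
print(classify_lines((2, 1, 5), (-1, 2, 2), (1, 2, 1), (1, -1, 3)))
```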


7.7 Planes
To get an equation for the plane, we need to describe an arbitrary point P (x, y, z) on the
plane. Again, we use vectors to help us do that.
Let $\mathbf{r}$ and $\mathbf{r}_0$ denote the position vectors of $P$ and $P_0$ respectively, where $P_0$ is a fixed point on the given plane.
Then $\mathbf{r} - \mathbf{r}_0$ is represented by $\overrightarrow{P_0P}$.

The normal vector $\mathbf{n}$ (which is orthogonal to the plane) is always orthogonal to $\mathbf{r} - \mathbf{r}_0$. Therefore we have

Theorem 7.14 (Vector Equation of Plane).

$$\mathbf{n} \cdot (\mathbf{r} - \mathbf{r}_0) = 0,$$
which can be written as
$$\mathbf{n} \cdot \mathbf{r} = \mathbf{n} \cdot \mathbf{r}_0.$$

To obtain a scalar equation for the plane, write the vectors in component form:

$$\mathbf{n} = \langle a, b, c \rangle, \quad \mathbf{r} = \langle x, y, z \rangle, \quad \mathbf{r}_0 = \langle x_0, y_0, z_0 \rangle.$$

Then $\mathbf{n} \cdot \mathbf{r} = \mathbf{n} \cdot \mathbf{r}_0$ becomes

$$\langle a, b, c \rangle \cdot \langle x, y, z \rangle = \langle a, b, c \rangle \cdot \langle x_0, y_0, z_0 \rangle,$$

that is,

$$ax + by + cz = ax_0 + by_0 + cz_0.$$

Theorem 7.15 (Linear Equation of Plane).

$$ax + by + cz + d = 0,$$
where
$$d = -(ax_0 + by_0 + cz_0).$$

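Example 7.7. Find an equation of the plane that passes through the points $P(1, 3, 2)$, $Q(3, -1, 6)$, $R(5, 2, 0)$.

Solution. First, we need a vector $\mathbf{n}$ orthogonal to the plane. This can be given by

$$\mathbf{n} = \overrightarrow{PQ} \times \overrightarrow{PR}.$$

Notice

$$\overrightarrow{PQ} = \langle 2, -4, 4 \rangle, \quad \overrightarrow{PR} = \langle 4, -1, -2 \rangle.$$
So

$$\mathbf{n} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 2 & -4 & 4 \\ 4 & -1 & -2 \end{vmatrix} = 12\mathbf{i} + 20\mathbf{j} + 14\mathbf{k}.$$

With the point $P(1, 3, 2)$ and the normal vector $\mathbf{n}$, an equation of the plane is

$$12(x - 1) + 20(y - 3) + 14(z - 2) = 0$$

or, after simplification,
$$6x + 10y + 7z = 50.$$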
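The procedure of this example (two edge vectors, their cross product as normal, then the scalar equation) can be sketched in code. The function `plane_through` is a hypothetical helper that assumes integer coordinates and divides out the common factor, as in the simplification step above.

```python
from functools import reduce
from math import gcd

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_through(p, q, r):
    """Return (a, b, c, d) with a x + b y + c z = d for the plane through p, q, r."""
    pq = tuple(y - x for x, y in zip(p, q))
    pr = tuple(y - x for x, y in zip(p, r))
    n = cross(pq, pr)                        # a normal vector to the plane
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    d = sum(ni * pi for ni, pi in zip(n, p)) # n . r0
    g = reduce(gcd, (*n, d))                 # common factor of all four coefficients
    return tuple(v // g for v in (*n, d))

print(plane_through((1, 3, 2), (3, -1, 6), (5, 2, 0)))
```

Listing the points in a different order may flip the sign of every coefficient, which describes the same plane.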


Two planes are parallel if their normal vectors are parallel.

If two planes are not parallel, then
• They intersect in a straight line.
• The angle between the two planes is defined as the acute angle between their normal vectors.

Example 7.8. (a) Find the angle between the planes $x + 2y + z = 3$ and $x - 4y + 3z = 5$.
(b) Find the line of intersection of these two planes.

Solution. (a) The normal vectors of these planes are

$$\mathbf{n}_1 = \langle 1, 2, 1 \rangle, \quad \mathbf{n}_2 = \langle 1, -4, 3 \rangle.$$
So, if $\theta$ is the angle between them, then
$$\begin{aligned}
\theta &= \cos^{-1} \frac{\mathbf{n}_1 \cdot \mathbf{n}_2}{\|\mathbf{n}_1\| \, \|\mathbf{n}_2\|} \\
&= \cos^{-1} \frac{1(1) + 2(-4) + 1(3)}{\sqrt{1 + 4 + 1}\,\sqrt{1 + 16 + 9}} \\
&= \cos^{-1} \frac{-4}{\sqrt{156}} \approx 108.7^\circ.
\end{aligned}$$
Since this angle is obtuse, the angle between the planes is the acute angle $180^\circ - 108.7^\circ = 71.3^\circ$.
(b) Solving both equations for $x$,
$$x = 3 - 2y - z \quad \text{and} \quad x = 5 + 4y - 3z.$$
Setting them to be equal gives
$$3 - 2y - z = 5 + 4y - 3z.$$
Solving for $z$ gives
$$z = 3y + 1.$$
Substituting this into the first equation,
$$x = 3 - 2y - (3y + 1) = -5y + 2.$$
Letting $y = t$ be the parameter, we obtain parametric equations for the line of intersection:

$$x = -5t + 2, \quad y = t, \quad z = 3t + 1.$$
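The answer to part (b) can be verified in two ways: every point of the parametric line should satisfy both plane equations, and its direction vector ⟨−5, 1, 3⟩ should be parallel to $\mathbf{n}_1 \times \mathbf{n}_2$, since the line lies in both planes. The sketch below performs both checks; the helper names are assumptions for illustration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def point_on_line(t):
    """Point of the line x = -5t + 2, y = t, z = 3t + 1."""
    return (-5 * t + 2, t, 3 * t + 1)

def on_both_planes(p):
    x, y, z = p
    return x + 2 * y + z == 3 and x - 4 * y + 3 * z == 5

n1, n2 = (1, 2, 1), (1, -4, 3)
direction = cross(n1, n2)                    # a vector along the line of intersection

print(all(on_both_planes(point_on_line(t)) for t in range(-5, 6)))
print(direction, cross(direction, (-5, 1, 3)))   # a zero cross product means parallel
```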

Example 7.9. Consider the planes $\alpha: 3x + 4y + 12z = 26$ and $\beta: 3x + 4y + 12z = 39$.

(a) Find the distance between $\alpha$ and $\beta$.

(b) Let $\ell$ be the line with parametric equations
$$x = 2 + 3t, \quad y = 2 + 4t, \quad z = 1 + 12t.$$
Find the intersection point Q of $\ell$ with $\beta$.

Ans: (a) 1, (b) $\left( \frac{29}{13}, \frac{30}{13}, \frac{25}{13} \right)$.

Solution. (a) Note that $P(2, 2, 1)$ is a point on $\alpha$, and the two planes $\alpha$ and $\beta$ are parallel.

Then the distance between $\alpha$ and $\beta$ is the distance from P to $\beta$, which is given by

$$\frac{|3(2) + 4(2) + 12(1) - 39|}{\sqrt{3^2 + 4^2 + 12^2}} = \frac{|-13|}{13} = 1.$$

(b) Substituting the parametric equations of $\ell$ into the equation of $\beta$, we have

$$3(2 + 3t) + 4(2 + 4t) + 12(1 + 12t) = 39 \iff 26 + 169t = 39 \iff t = \tfrac{1}{13}.$$

Therefore, the intersection point Q is

$$\left( 2 + \frac{3}{13},\; 2 + \frac{4}{13},\; 1 + \frac{12}{13} \right) = \left( \frac{29}{13}, \frac{30}{13}, \frac{25}{13} \right).$$

Note that $\overrightarrow{PQ}$ is perpendicular to $\alpha$ and $\beta$, and $\|\overrightarrow{PQ}\| = 1$.




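Exercise 7.1. Let $\mathbf{a}$ and $\mathbf{b}$ be vectors in $\mathbb{R}^3$. Prove that $(\mathbf{a} \cdot \mathbf{b})^2 + \|\mathbf{a} \times \mathbf{b}\|^2 = \|\mathbf{a}\|^2 \|\mathbf{b}\|^2$.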
Exercise 7.2. Find the distance from the point P (3, 3, 3) to the line ` : x = t, y = 2t , z = t.

Ans. 2.
Exercise 7.3. Find the point on the surface $z = x^2 + y^2 + 10$ that is nearest to the plane $x + 2y - z = 0$.

Ans. $\left( \frac{1}{2}, 1, \frac{45}{4} \right)$.

Chapter 8

Functions of Several Variables

Read Thomas’ Calculus, Chapter 13.

8.1 Vector Functions of One Variable


Recall the vector equation of line:
r(t) = r0 + tv.
We have seen that the tip of the vector r(t) traces out a line as t varies.
We can rewrite the above as follows:

$$\mathbf{r}(t) = \langle x_0, y_0, z_0 \rangle + t\langle a, b, c \rangle = \langle x_0 + ta, \; y_0 + tb, \; z_0 + tc \rangle.$$

Notice that each component of $\mathbf{r}(t)$ is a scalar function of $t$.

In general, a vector-valued function is

$$\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$$

where $f(t)$, $g(t)$ and $h(t)$ are scalar functions of $t$.

Formally, a vector-valued function $\mathbf{r}(t)$ is a mapping from its domain $D \subseteq \mathbb{R}$ to its range $R \subseteq V_3$, so that for each $t \in D$, $\mathbf{r}(t) = \mathbf{v}$ for exactly one vector $\mathbf{v} \in V_3$.
We write a vector-valued function as

$$\mathbf{r}(t) = f(t)\mathbf{i} + g(t)\mathbf{j} + h(t)\mathbf{k}$$

or
$$\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$$
for some scalar functions $f$, $g$ and $h$ (called the component functions of $\mathbf{r}$).
Suppose r(t) traces out the curve C, we say that r(t) is a parametrization of C.

A curve C can have more than one parametrization.

For example, both

$$\mathbf{r}(t) = \langle t, t^2 \rangle, \quad t \in \mathbb{R}$$

and

$$\mathbf{r}(t) = \langle t^3, t^6 \rangle, \quad t \in \mathbb{R}$$

parametrize the parabola $y = x^2$ in the xy-plane.

Example 8.1. Sketch the curve traced out by the vector-valued function $\mathbf{r}(t) = \sin t\,\mathbf{i} - 3\cos t\,\mathbf{j} + 2t\,\mathbf{k}$.

Solution. There is a relationship between $x$ and $y$ here:

$$x^2 + \left( \frac{y}{3} \right)^2 = \sin^2 t + \cos^2 t = 1,$$

which is the equation of an ellipse in 2-D. In 3-D, since the equation does not involve $z$, it becomes the equation of an elliptic cylinder whose axis is the z-axis.

The curve will wind its way up the cylinder anticlockwise as t increases.

We call this curve an elliptical helix.



8.2 Calculus of Vector Functions


To extend differentiation and integration to vector-valued functions, just think

‘component-wise’!!!

Definition 8.1. The derivative $\mathbf{r}'(t)$ of the vector-valued function $\mathbf{r}(t)$ is defined by

$$\mathbf{r}'(t) = \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t}$$
for any values of $t$ for which the limit exists.

When the limit exists for $t = a$, we say that $\mathbf{r}$ is differentiable at $t = a$.


The derivative of a vector-valued function can be found directly from the derivatives of the
components.

Theorem 8.1 (Derivative of Vector-valued Function).
Let $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$ and suppose that the components $f$, $g$ and $h$ are all differentiable at $t = a$. Then $\mathbf{r}$ is differentiable at $t = a$ and its derivative is given by

$$\mathbf{r}'(a) = \langle f'(a), g'(a), h'(a) \rangle.$$
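Example 8.2. Let $\mathbf{r}(t) = \langle 3t^2 + 2t + 1, \sin(\pi t), e^{2t} \rangle$. Find $\mathbf{r}'(t)$, and compute $\mathbf{r}'(1)$.

Solution.

$$\begin{aligned}
\mathbf{r}'(t) &= \left\langle \frac{d}{dt}(3t^2 + 2t + 1), \; \frac{d}{dt}\sin(\pi t), \; \frac{d}{dt} e^{2t} \right\rangle \\
&= \langle 6t + 2, \; \pi\cos(\pi t), \; 2e^{2t} \rangle.
\end{aligned}$$

Hence
$$\mathbf{r}'(1) = \langle 6 + 2, \; \pi\cos\pi, \; 2e^2 \rangle = \langle 8, -\pi, 2e^2 \rangle.$$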
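A component-wise derivative such as the one above can be cross-checked numerically against the difference quotient in the definition. The sketch below uses a symmetric (central) difference quotient with a small step, a common numerical approximation; the step size `h` and the function names are assumptions of this illustration.

```python
import math

def r(t):
    """The vector-valued function of Example 8.2, as a tuple of components."""
    return (3 * t ** 2 + 2 * t + 1, math.sin(math.pi * t), math.exp(2 * t))

def r_prime(t):
    """Its derivative, computed component by component."""
    return (6 * t + 2, math.pi * math.cos(math.pi * t), 2 * math.exp(2 * t))

def central_difference(f, t, h=1e-6):
    """Approximate f'(t) by (f(t + h) - f(t - h)) / (2h), component-wise."""
    return tuple((p - m) / (2 * h) for p, m in zip(f(t + h), f(t - h)))

print(r_prime(1.0))                 # exact formula evaluated at t = 1
print(central_difference(r, 1.0))   # numerical approximation, should be close
```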

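Theorem 8.2 (Derivative Rules).
Suppose $\mathbf{r}(t)$ and $\mathbf{s}(t)$ are differentiable vector-valued functions, $f(t)$ is a differentiable scalar function and $c$ is a scalar constant. Then

(i) $\dfrac{d}{dt}\big(\mathbf{r}(t) + \mathbf{s}(t)\big) = \mathbf{r}'(t) + \mathbf{s}'(t)$

(ii) $\dfrac{d}{dt}\big(c\,\mathbf{r}(t)\big) = c\,\mathbf{r}'(t)$

(iii) $\dfrac{d}{dt}\big(f(t)\mathbf{r}(t)\big) = f'(t)\mathbf{r}(t) + f(t)\mathbf{r}'(t)$

(iv) $\dfrac{d}{dt}\big(\mathbf{r}(t) \cdot \mathbf{s}(t)\big) = \mathbf{r}'(t) \cdot \mathbf{s}(t) + \mathbf{r}(t) \cdot \mathbf{s}'(t)$

(v) $\dfrac{d}{dt}\big(\mathbf{r}(t) \times \mathbf{s}(t)\big) = \mathbf{r}'(t) \times \mathbf{s}(t) + \mathbf{r}(t) \times \mathbf{s}'(t)$.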

8.3 Tangent Vector and Tangent Line to a Curve

Recall that one interpretation of the derivative of a scalar function is that the value of the
derivative at a point gives the slope of the tangent line to the curve at that point.
There is a similar interpretation for the derivative of vector-valued functions.
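Recall

$$\mathbf{r}'(a) = \lim_{\Delta t \to 0} \frac{\mathbf{r}(a + \Delta t) - \mathbf{r}(a)}{\Delta t}.$$

Notice that for $\Delta t > 0$, the vector $\dfrac{\mathbf{r}(a + \Delta t) - \mathbf{r}(a)}{\Delta t}$ points in the same direction as $\mathbf{r}(a + \Delta t) - \mathbf{r}(a)$.
As $\Delta t \to 0$, $\dfrac{\mathbf{r}(a + \Delta t) - \mathbf{r}(a)}{\Delta t}$ approaches $\mathbf{r}'(a)$.

This is a vector tangent to the curve at $\mathbf{r}(a)$. We call $\mathbf{r}'(a)$ a tangent vector to the curve at $t = a$.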

Example 8.3. Find the tangent line L to the curve $\mathbf{r}(t) = \langle \cos t, \sin t, t \rangle$ at $(0, 1, \pi/2)$.

Solution. At the point $(0, 1, \pi/2)$, $t = \pi/2$. Since $\mathbf{r}'(t) = \langle -\sin t, \cos t, 1 \rangle$, a direction of the tangent line L is
$$\mathbf{r}'(\pi/2) = \langle -\sin(\pi/2), \cos(\pi/2), 1 \rangle = \langle -1, 0, 1 \rangle.$$

So parametric equations of L are

$$x = 0 + (-1)t, \quad y = 1 + (0)t, \quad z = \pi/2 + (1)t,$$

that is,
$$x = -t, \quad y = 1, \quad z = \pi/2 + t.$$

8.4 Arc Length of a Space Curve

Suppose that a smooth curve C is traced out by the vector-valued function $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$, where $f$, $f'$, $g$, $g'$, $h$, $h'$ are all continuous for $t \in [a, b]$, and the curve is traversed exactly once as $t$ increases from $a$ to $b$. Then the arc length of C is given by the following result.

Theorem 8.3 (Arc Length Formula).
Let C be the curve given by

$$\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle, \quad a \le t \le b,$$

where $f'$, $g'$ and $h'$ are continuous. If C is traversed exactly once as $t$ increases from $a$ to $b$, then its length is
$$s = \int_a^b \sqrt{f'(t)^2 + g'(t)^2 + h'(t)^2}\, dt = \int_a^b \|\mathbf{r}'(t)\|\, dt.$$

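Example 8.4. Find the arc length of the curve traced out by the endpoint of the vector-valued function $\mathbf{r}(t) = \langle 2t, \ln t, t^2 \rangle$ for $1 \le t \le e$.

Solution.

$$\begin{aligned}
s &= \int_1^e \sqrt{2^2 + \left( \frac{1}{t} \right)^2 + (2t)^2}\, dt \\
&= \int_1^e \sqrt{\frac{1 + 4t^2 + 4t^4}{t^2}}\, dt \\
&= \int_1^e \sqrt{\frac{(1 + 2t^2)^2}{t^2}}\, dt \\
&= \int_1^e \frac{1 + 2t^2}{t}\, dt = \int_1^e \left( \frac{1}{t} + 2t \right) dt \\
&= \Big[ \ln|t| + t^2 \Big]_1^e \\
&= (\ln e + e^2) - (\ln 1 + 1) = e^2.
\end{aligned}$$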
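The closed-form answer $e^2$ can be compared with a direct numerical evaluation of the arc length integral. The sketch below integrates the speed $\|\mathbf{r}'(t)\|$ with the composite Simpson rule; the choice of Simpson's rule, the number of subintervals, and the function names are assumptions made for this check, not part of the notes.

```python
import math

def speed(t):
    """|r'(t)| for r(t) = <2t, ln t, t^2>, i.e. the length of <2, 1/t, 2t>."""
    return math.sqrt(2 ** 2 + (1 / t) ** 2 + (2 * t) ** 2)

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

print(simpson(speed, 1.0, math.e))   # numerical arc length
print(math.e ** 2)                   # value obtained by hand, for comparison
```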

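So far we have seen functions of one variable, i.e. functions whose domain is a subset of $\mathbb{R}$:

Function          Domain D                    Range R
(scalar) f(t)     $D \subseteq \mathbb{R}$    $R \subseteq \mathbb{R}$
(vector) r(t)     $D \subseteq \mathbb{R}$    $R \subseteq \mathbb{R}^2$ or $\mathbb{R}^3$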

However, in the real world, physical quantities often depend on two or more variables.

8.5 Functions of Two Variables

Definition 8.2.
A function $f$ of two variables is a rule that assigns to each ordered pair of real numbers $(x, y)$ in a set $D \subseteq \mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ a unique real number denoted by $f(x, y)$.

If a function $f$ is given by a formula and no domain is specified, then the domain of $f$ is understood to be the set of all pairs $(x, y)$ for which the given expression is a well-defined real number.

Example 8.5. Find the domain of

$$f(x, y) = x \ln(y^2 - x).$$

Solution. $\ln(y^2 - x)$ is defined only when $y^2 - x > 0$, that is, $x < y^2$.

So the domain of $f$ is

$$D = \{(x, y) : x < y^2\}.$$


One way of visualizing $z = f(x, y)$ is to draw its graph.
If $f$ is a function of two variables with domain $D$, then the graph of $f$ is the set of all points $(x, y, z) \in \mathbb{R}^3$ such that $z = f(x, y)$ and $(x, y) \in D$.
The graph of a function $f$ of two variables is also called the surface S with equation $z = f(x, y)$.
We can visualize the graph S of $f$ as lying directly above or below its domain $D$ in the xy-plane.

Graphing functions of more than one variable is not simple!


For most functions of two variables, to identify the surface, we must
(1) take hints from the expressions z = f (x, y)
(2) think through the traces and piece together the clues

Example 8.6. Match the functions $f(x, y) = \ln(x^2 + y^2)$ and $g(x, y) = \cos(x^2 + y^2)$ to the surfaces shown below:

Solution. Notice both functions contain the expression $x^2 + y^2$. This is significant: given any value $r$ and any point $(x, y)$ on the circle $x^2 + y^2 = r^2$, the height of the surface at the point $(x, y)$ is a constant. That is, the surface has circular cross sections parallel to the xy-plane.
However, both surfaces shown have this property, so we cannot yet tell which surface is matched by which function.
Notice the cosine of any angle lies between $-1$ and $1$. So the second graph is $g(x, y)$.

An important property of $f(x, y)$ is that the logarithm tends to $-\infty$ as its argument $x^2 + y^2$ approaches 0. This appears in the first graph.

Another way to visualize functions of several variables is to use a contour plot, which provides the same information condensed into a 2-D picture.

Definition 8.3 (Level Curve).


A level curve of f (x, y) is the two-dimensional graph of the equation f (x, y) = k for some
constant k.

Definition 8.4 (Contour Plot).


A contour plot of f (x, y) is a graph of numerous level curves f (x, y) = k, for representative
values of k.

To sketch contour plots, we use values of k that are equally spaced. The surface is:
• steep where the level curves are close together.
• flatter where the level curves are farther apart.

Example 8.7. Sketch some level curves of $h(x, y) = 4x^2 + y^2$.

Solution. If $k < 0$, then $4x^2 + y^2 = k$ has no solution, so there are no level curves for $k < 0$.
If $k = 0$, then there is only one solution $(0, 0)$, so the level curve is the single point $(0, 0)$.

If $k > 0$, then $4x^2 + y^2 = k$ is an ellipse:
$$\frac{x^2}{(\sqrt{k}/2)^2} + \frac{y^2}{(\sqrt{k})^2} = 1.$$
Thus, larger $k$ gives rise to an ellipse with longer major and minor axes.
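The dependence of the level curves on $k$ can be tabulated. The sketch below uses a hypothetical helper `level_curve_semi_axes` that returns the semi-axes of the ellipse $4x^2 + y^2 = k$ read off from the standard form above, for a few equally spaced values of $k$ chosen for illustration.

```python
import math

def level_curve_semi_axes(k):
    """Semi-axes (along x, along y) of the ellipse 4x^2 + y^2 = k, for k > 0."""
    if k <= 0:
        raise ValueError("the level curve is an ellipse only for k > 0")
    return (math.sqrt(k) / 2, math.sqrt(k))

for k in (1, 2, 3, 4):
    a, b = level_curve_semi_axes(k)
    # (a, 0) and (0, b) are the points where the level curve meets the axes.
    print(k, a, b)
```

Both semi-axes grow like $\sqrt{k}$, so for equally spaced $k$ the ellipses get closer together as $k$ increases, reflecting that the surface becomes steeper away from the origin.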

8.6 Cylinders and Quadric Surfaces


We have seen that the graphs of functions of two variables are surfaces in space $\mathbb{R}^3$. To appreciate the calculus of functions of two variables, we need more examples of surfaces in space other than planes ($ax + by + cz = d$) and spheres ($(x - a)^2 + (y - b)^2 + (z - c)^2 = d^2$).
It is not easy to draw surfaces in $\mathbb{R}^3$. Our goal here is to identify and sketch some special types of surfaces, namely cylinders and some quadric surfaces.

8.6.1 Cylinders
When we mention the word cylinder, we probably think of the following object.

Actually, the term cylinder is used to refer to a surface more general than the one we saw.

Definition 8.5.
A surface is a cylinder if there is a plane P such that all the planes parallel to P intersect the surface in the same curve (when viewed in two dimensions).

Example 8.8. Show that the surface given by

$$y^2 + z^2 = 1$$

is a cylinder.

Solution. Notice $x$ is missing in the equation. When $x = 0$, $y^2 + z^2 = 1$ is a circle with radius 1 in the yz-plane, which is the intersection of the surface and the yz-plane.
Generally, $x = k$ represents a plane parallel to the yz-plane, and the intersection of the surface and this plane is always the circle $y^2 + z^2 = 1$.
Therefore, the surface $y^2 + z^2 = 1$ is a cylinder.
In fact, the graph in $\mathbb{R}^3$ of any equation in $x$, $y$ and $z$ where one of the variables is missing is a cylinder.

Example 8.9. Sketch the graph of the surface $z = x^2$.

Solution. Notice that the equation of the graph, $z = x^2$, does not involve $y$.
This means that any vertical plane with equation $y = k$ (parallel to the xz-plane) intersects the graph in a curve with equation $z = x^2$. So the surface $z = x^2$ is a cylinder.


8.6.2 Quadric Surface

Definition 8.6 (Quadric Surface).


A quadric surface is the graph of a second-degree equation in three variables $x$, $y$ and $z$:

$$Ax^2 + By^2 + Cz^2 + Dxy + Eyz + Fxz + Gx + Hy + Iz + J = 0$$

where A, B, . . ., J are constants.

There are six basic quadric surfaces. But we shall focus on two of them:

• Elliptic paraboloid

• Ellipsoid

Definition 8.7 (Elliptic paraboloid – symmetric about the z-axis).

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}$$

[Figure: the graph of the elliptic paraboloid $\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}$ when $c > 0$.]
Example 8.10. Identify and sketch the surface

\[ x^2 + 2z^2 - 6x - y + 10 = 0. \]

Solution. By completing the square in x, we rewrite the equation as

\[ y - 1 = (x - 3)^2 + \frac{z^2}{1/2}. \]

It represents an elliptic paraboloid. However, it has been shifted so that its vertex is the
point (3, 1, 0), and it is symmetric about the line that is parallel to the y-axis and passes
through (3, 1, 0).

Definition 8.8 (Ellipsoid).

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \]

If a = b = c, then the ellipsoid is a sphere.

Example 8.11. Sketch the quadric surface with equation

\[ x^2 + \frac{y^2}{9} + \frac{z^2}{4} = 1. \]

Solution. The surface intersects the xy-plane (z = 0) in the ellipse

\[ x^2 + \frac{y^2}{9} = 1. \]

In general, the surface intersects the plane z = k in the ellipse

\[ x^2 + \frac{y^2}{9} = 1 - \frac{k^2}{4}, \]

provided $1 - \frac{k^2}{4} > 0$, that is, −2 < k < 2.
The surface also intersects every plane x = k (which is parallel to the yz-plane x = 0) in the
ellipse

\[ \frac{y^2}{9} + \frac{z^2}{4} = 1 - k^2, \]

provided $1 - k^2 > 0$, that is, −1 < k < 1.
The surface also intersects every plane y = k (which is parallel to the xz-plane y = 0) in the
ellipse

\[ x^2 + \frac{z^2}{4} = 1 - \frac{k^2}{9}, \]

provided $1 - \frac{k^2}{9} > 0$, that is, −3 < k < 3.


8.7 Functions of Three Variables

Definition 8.9.
A function f of three variables is a rule that assigns to each ordered triple of real numbers
(x, y, z) in a set $D \subseteq \mathbb{R}^3 = \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ a unique real number denoted by f(x, y, z).

Unlike functions of two variables, it is very difficult to visualize a function f of three variables
by its graph, since the graph would lie in a four-dimensional space.
However, we do gain some insight into f by examining its level surfaces (the counterparts of
level curves in the two-variable case).

Definition 8.10 (Level Surface).
A level surface of f (x, y, z) is the three-dimensional graph of the equation f (x, y, z) = k for
some constant k.

If the point (x, y, z) moves along a level surface, the value of f (x, y, z) remains fixed.

Example 8.12. Find the level surfaces of the function

\[ f(x, y, z) = x^2 + y^2 + z^2. \]

Solution. The level surfaces are

\[ x^2 + y^2 + z^2 = k, \]

where k ≥ 0.

For k > 0, these form a family of concentric spheres with radius $\sqrt{k}$; for k = 0, the level surface is the single point (0, 0, 0).

8.8 Partial Derivatives


Recall that for a function f of a single variable, we define the derivative function as

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \]

for any values of x for which the limit exists.

At a particular point x = a, we interpret $f'(a)$ as the instantaneous rate of change of f with
respect to x at that point.
We want to generalize the notion of derivative to functions of more than one variable.
The idea is to ‘vary’ one variable and keep other variable(s) fixed.

Definition 8.11 (Partial Derivative).


If f is a function of two variables, its partial derivatives are the functions fx and fy defined by:

\[ f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \]

\[ f_y(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}. \]

Example 8.13. Let $f(x, y) = x^2 y$. Then

\[ f_x(x, y) = \lim_{h \to 0} \frac{(x + h)^2 y - x^2 y}{h} = \lim_{h \to 0} \frac{(x^2 y + 2xhy + h^2 y) - x^2 y}{h} = \lim_{h \to 0} (2xy + hy) = 2xy, \]

\[ f_y(x, y) = \lim_{h \to 0} \frac{x^2 (y + h) - x^2 y}{h} = \lim_{h \to 0} x^2 = x^2. \]

There are many alternative notations for partial derivatives:


Instead of $f_x$, we can write $f_1$ or $D_1 f$ (to indicate differentiation with respect to the first
variable x) or $\frac{\partial f}{\partial x}$.
To compute the partial derivative fx , one may simply do the following: Treat (temporar-
ily) the other variable y in f (x, y) as a constant, and differentiate f (x, y) with respect to the
variable x.

Similarly, to compute fy , one may simply do this: Treat (temporarily) the other variable x in
f (x, y) as a constant, and differentiate f (x, y) with respect to the variable y.
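Example 8.14. For $f(x, y) = e^{xy} + \frac{x}{y}$, compute $f_x$ and $f_y$. Find also $\frac{\partial f}{\partial x}(2, 1)$ and $\frac{\partial f}{\partial y}(2, 1)$.

Solution. Treating y as a constant, we have

\[ f_x(x, y) = y e^{xy} + \frac{1}{y}. \]

Treating x as a constant, we have

\[ f_y(x, y) = x e^{xy} - \frac{x}{y^2}. \]

Hence at (x, y) = (2, 1), we have

\[ \frac{\partial f}{\partial x}(2, 1) = f_x(2, 1) = 1 \cdot e^{2 \cdot 1} + \frac{1}{1} = e^2 + 1, \]

\[ \frac{\partial f}{\partial y}(2, 1) = f_y(2, 1) = 2 \cdot e^{2 \cdot 1} - \frac{2}{1^2} = 2e^2 - 2. \]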
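The recipe "hold the other variable fixed and differentiate" can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `partial_x` and `partial_y` and the step size `h` are illustrative choices, and central differences only approximate the limit in Definition 8.11.

```python
import math


def f(x, y):
    # The function from Example 8.14: f(x, y) = e^(xy) + x/y.
    return math.exp(x * y) + x / y


def partial_x(func, x, y, h=1e-6):
    # Hold y fixed and vary x only (central difference in x).
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def partial_y(func, x, y, h=1e-6):
    # Hold x fixed and vary y only (central difference in y).
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


fx_numeric = partial_x(f, 2.0, 1.0)
fy_numeric = partial_y(f, 2.0, 1.0)

# Closed-form values obtained in the solution of Example 8.14.
fx_exact = math.e ** 2 + 1
fy_exact = 2 * math.e ** 2 - 2

print(fx_numeric, fx_exact)
print(fy_numeric, fy_exact)
```

The two printed pairs should agree to several decimal places; the small discrepancy comes from the finite step size.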

To give a geometric interpretation of partial derivatives, we recall that the equation z =
f (x, y) represents a surface S (the graph of f ).
If f (a, b) = c, then the point P (a, b, c) lies on S.
By fixing y = b, we are restricting our attention to the curve C1 in which the vertical plane
y = b intersects S. That is, C1 is the trace of S in the plane y = b.
Likewise, the vertical plane x = a intersects S in a curve C2 .
Both the curves C1 and C2 pass through P .

• The curve C1 is the graph of the function g(x) = f(x, b). So, the slope of its tangent T1 at P
is $g'(a) = f_x(a, b)$.
• The curve C2 is the graph of the function h(y) = f(a, y). So, the slope of its tangent T2 at P
is $h'(b) = f_y(a, b)$.
Thus, the partial derivatives fx (a, b) and fy (a, b) can be interpreted geometrically as:
The slopes of the tangent lines at P (a, b, c) to the traces C1 and C2 of S in the planes y = b
and x = a.
For functions of more than two variables, such as w = f (x, y, z), we can similarly define

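\[ f_x,\ f_y,\ f_z, \qquad \frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y},\ \frac{\partial f}{\partial z}, \qquad \text{or} \qquad \frac{\partial w}{\partial x},\ \frac{\partial w}{\partial y},\ \frac{\partial w}{\partial z}. \]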
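Example 8.15. Find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ if z is defined implicitly as a function of x and y by the equation

\[ x^3 + y^3 + z^3 + 6xyz = 1. \]

Solution. Take the partial derivative with respect to x on both sides:

\[ 3x^2 + 3z^2 \frac{\partial z}{\partial x} + 6yz + 6xy \frac{\partial z}{\partial x} = 0. \]

Solving for $\frac{\partial z}{\partial x}$, we have

\[ (3z^2 + 6xy) \frac{\partial z}{\partial x} = -(3x^2 + 6yz) \implies \frac{\partial z}{\partial x} = -\frac{x^2 + 2yz}{z^2 + 2xy}. \]

Similarly, we have

\[ \frac{\partial z}{\partial y} = -\frac{y^2 + 2xz}{z^2 + 2xy}. \]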


8.9 Higher Order Partial Derivatives


If f is a function of two variables, then its partial derivatives fx and fy are also functions of
two variables.
So, we can consider their partial derivatives

\[ (f_x)_x, \quad (f_x)_y, \quad (f_y)_x, \quad (f_y)_y. \]

These are called the second partial derivatives of f .


If z = f (x, y), we use the following notation:

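\[ (f_x)_x = f_{xx} = f_{11} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 z}{\partial x^2} \]

\[ (f_x)_y = f_{xy} = f_{12} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial y\,\partial x} \]

\[ (f_y)_x = f_{yx} = f_{21} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial x\,\partial y} \]

\[ (f_y)_y = f_{yy} = f_{22} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 z}{\partial y^2} \]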

Thus, the notation fxy means that we first differentiate with respect to x and then with
respect to y.
In computing fyx , the order is reversed.

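Example 8.16. Find all second-order partial derivatives of $f(x, y) = x^2 y - y^3 + \ln x$.

Solution. First, we compute the first-order derivatives:

\[ f_x = 2xy + \frac{1}{x}, \qquad f_y = x^2 - 3y^2. \]

Then we have

\[ f_{xx} = \frac{\partial}{\partial x}\left(2xy + \frac{1}{x}\right) = 2y - \frac{1}{x^2}, \]

\[ f_{xy} = \frac{\partial}{\partial y}\left(2xy + \frac{1}{x}\right) = 2x, \]

\[ f_{yx} = \frac{\partial}{\partial x}\left(x^2 - 3y^2\right) = 2x, \]

\[ f_{yy} = \frac{\partial}{\partial y}\left(x^2 - 3y^2\right) = -6y. \]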
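As a numerical illustration of the two mixed partial derivatives in Example 8.16, the sketch below differentiates in both orders using nested central differences. The helpers `d_dx` and `d_dy`, the step size, and the sample point (1.5, 2) are assumptions made for illustration; the result is an approximation, not a proof.

```python
import math


def f(x, y):
    # The function from Example 8.16 (defined for x > 0).
    return x**2 * y - y**3 + math.log(x)


def d_dx(func, h=1e-4):
    # Return a new function approximating the partial derivative in x.
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)


def d_dy(func, h=1e-4):
    # Return a new function approximating the partial derivative in y.
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)


f_xy = d_dy(d_dx(f))  # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))  # differentiate in y first, then in x

x0, y0 = 1.5, 2.0     # hypothetical sample point with x0 > 0
mixed_xy = f_xy(x0, y0)
mixed_yx = f_yx(x0, y0)

print(mixed_xy, mixed_yx, 2 * x0)
```

Both estimates should be close to 2x = 3 at the chosen point, matching the closed-form answers.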


Notice $f_{xy} = f_{yx}$ in the preceding example. This is not a coincidence.
It turns out that the mixed partial derivatives $f_{xy}$ and $f_{yx}$ are equal for most (not all) functions
that one meets in practice.
The following theorem, discovered by the French mathematician Alexis Clairaut (1713–1765),
gives conditions under which we can assert that $f_{xy} = f_{yx}$.

Theorem 8.4 (Clairaut’s Theorem).

Suppose f is defined on a disk D that contains (a, b). If the functions $f_{xy}$ and $f_{yx}$ are both
continuous on D, then

\[ f_{xy}(a, b) = f_{yx}(a, b). \]

Partial derivatives of order 3 and higher can also be defined. For example, $f_{xyy} = (f_{xy})_y$.
Using Clairaut’s Theorem, it can be shown that

\[ f_{xyy} = f_{yxy} = f_{yyx} \]

if these functions are continuous.

8.10 Tangent Planes


Recall that we use the derivative $f'(a)$ to get the tangent line to the curve y = f(x) at x = a:

\[ y = f(a) + f'(a)(x - a). \]

In the same spirit, we shall use partial derivatives to obtain the tangent plane to the surface
z = f (x, y) at a given point.

Consider the surface S which is the graph of z = f (x, y). Suppose f has continuous first
partial derivatives.
Let P (a, b, c) be a point on S. Notice c = f (a, b).
Let C1 and C2 be the curves obtained by intersecting the vertical planes y = b and x = a with
the surface S. Notice P lies on both C1 and C2 .
Let T1 and T2 be the tangent lines to the curves C1 and C2 at the point P .

Then, the tangent plane to the surface S at the point P is defined to be the plane that contains
both tangent lines T1 and T2 .
How to find an equation for the tangent plane?
Recall that any plane passing through P (a, b, c) has an equation of the form

\[ \mathbf{n} \cdot \langle x - a,\ y - b,\ z - c \rangle = 0 \]

where n is a vector normal to the plane.


Notice the tangent line T1 lies on the plane y = b. Along T1 at x = a, a change of 1 unit in x
corresponds to a change of $f_x(a, b)$ in z (here we require $f_x$ to be continuous). The value of y
does not change along the line. A vector with the same direction as T1 is

\[ \langle 1, 0, f_x(a, b) \rangle. \]

Similarly, a vector with the same direction as T2 is

\[ \langle 0, 1, f_y(a, b) \rangle. \]

We have found two vectors parallel to the tangent plane:

\[ \langle 1, 0, f_x(a, b) \rangle, \qquad \langle 0, 1, f_y(a, b) \rangle. \]

A vector normal to the plane is given by the cross product:

\[ \langle 0, 1, f_y(a, b) \rangle \times \langle 1, 0, f_x(a, b) \rangle = \langle f_x(a, b), f_y(a, b), -1 \rangle. \]

Theorem 8.5 (Equation of Tangent Plane).


Suppose f(x, y) has continuous first partial derivatives at (a, b). A normal vector to the tangent
plane at (a, b, f(a, b)) to the surface z = f(x, y) is

\[ \langle f_x(a, b), f_y(a, b), -1 \rangle. \]

Further, an equation of the tangent plane is given by

\[ f_x(a, b)(x - a) + f_y(a, b)(y - b) - (z - f(a, b)) = 0 \]

or

\[ z = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b). \]

Example 8.17. Find the tangent plane to the elliptic paraboloid $z = 2x^2 + y^2$ at the point (1, 1, 3).

Solution. Notice

\[ f_x(x, y) = 4x, \qquad f_x(1, 1) = 4, \]

\[ f_y(x, y) = 2y, \qquad f_y(1, 1) = 2. \]

The equation of the plane is

\[ z = f(1, 1) + 4(x - 1) + 2(y - 1) = 3 + 4(x - 1) + 2(y - 1), \]

that is,

\[ z = 4x + 2y - 3. \]
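The tangent-plane formula of Theorem 8.5 translates directly into a few lines of code. This is a sketch under stated assumptions: the function name `tangent_plane` is hypothetical, and the partial derivatives are estimated by central differences rather than computed symbolically.

```python
def f(x, y):
    # The elliptic paraboloid from Example 8.17.
    return 2 * x**2 + y**2


def tangent_plane(func, a, b, h=1e-6):
    # Estimate f_x(a, b) and f_y(a, b) by central differences.
    fx = (func(a + h, b) - func(a - h, b)) / (2 * h)
    fy = (func(a, b + h) - func(a, b - h)) / (2 * h)
    c = func(a, b)

    def plane(x, y):
        # z = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)
        return c + fx * (x - a) + fy * (y - b)

    return fx, fy, plane


fx, fy, plane = tangent_plane(f, 1.0, 1.0)

# Near (1, 1) the plane should track the surface closely.
surface_value = f(1.01, 0.99)
plane_value = plane(1.01, 0.99)
print(fx, fy, surface_value, plane_value)
```

The returned `plane` function should agree with z = 4x + 2y − 3 everywhere, and with the surface itself only near the point of tangency.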

The figure shows the elliptic paraboloid and its tangent plane at (1, 1, 3) that we found in the
preceding example



8.11 Differentiability and Chain Rule


For a single-variable function f(x), we say that f is differentiable at a if and only if $f'(a)$
exists. For a two-variable function f(x, y), it is tempting to say that f is differentiable at (a, b)
if $f_x(a, b)$ and $f_y(a, b)$ exist. However, such a definition would fail to capture the true nature of
‘differentiability’!

Definition 8.12.
Informally, we say that f is differentiable at (a, b) if the tangent plane at (a, b) is a good
approximation to f at points close to (a, b).

The above definition is not precise since we have not defined what we mean by ‘a good
approximation’. Do not worry! All the functions we encounter in this course will be differentiable
at the points in their domains.
Recall that the Chain Rule for functions of a single variable
gives the following rule for differentiating a composite function.
If y = f(x) and x = g(t), where f and g are differentiable functions, then y is indirectly a
differentiable function of t, and

\[ \frac{dy}{dt} = \frac{dy}{dx} \frac{dx}{dt}. \]
We shall extend the chain rule to functions of several variables. This takes several slightly
different forms, depending on the number of independent variables.
The first version deals with a function z = f (x, y) where x = g(t) and y = h(t) are both func-
tions of a single variable t:

z = f (g(t), h(t)).

Theorem 8.6 (The Chain Rule - Case 1).
Suppose that z = f (x, y) is a differentiable function of x and y, where x = g(t) and y = h(t) are
both differentiable functions of t. Then, z is a differentiable function of t, and

\[ \frac{dz}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}. \]
Example 8.18. For $z = f(x, y) = x^2 e^y$, $x = g(t) = t^2 - 1$ and $y = h(t) = \sin t$, find the derivative $\frac{dz}{dt}$.

Solution. First, compute the partial derivatives:

\[ \frac{\partial z}{\partial x} = 2x e^y, \qquad \frac{\partial z}{\partial y} = x^2 e^y. \]

Next, compute the derivatives:

\[ \frac{dx}{dt} = 2t, \qquad \frac{dy}{dt} = \cos t. \]

Therefore, using the chain rule,

\[
\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x} \frac{dx}{dt} + \frac{\partial z}{\partial y} \frac{dy}{dt} \\
&= 2x e^y (2t) + x^2 e^y \cos t \\
&= 2(t^2 - 1) e^{\sin t} (2t) + (t^2 - 1)^2 e^{\sin t} \cos t.
\end{aligned}
\]

Notice that, in the preceding example, you could have first substituted for x and y and then
computed the derivative of

\[ f(g(t), h(t)) = (t^2 - 1)^2 e^{\sin t} \]

using the usual rules of differentiation for functions of a single variable.
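The two routes mentioned above (chain rule versus substituting first) can be compared numerically. The sketch below is illustrative only: the function names and the sample value t = 1.3 are assumptions, and the direct derivative is approximated by a central difference.

```python
import math


def z_of_t(t):
    # Substitute x = t^2 - 1 and y = sin t into z = x^2 e^y.
    x = t**2 - 1
    y = math.sin(t)
    return x**2 * math.exp(y)


def dz_dt_chain_rule(t):
    # Chain Rule, Case 1: dz/dt = z_x * dx/dt + z_y * dy/dt.
    x = t**2 - 1
    y = math.sin(t)
    dz_dx = 2 * x * math.exp(y)
    dz_dy = x**2 * math.exp(y)
    dx_dt = 2 * t
    dy_dt = math.cos(t)
    return dz_dx * dx_dt + dz_dy * dy_dt


def dz_dt_numeric(t, h=1e-6):
    # Differentiate the substituted single-variable function directly.
    return (z_of_t(t + h) - z_of_t(t - h)) / (2 * h)


t0 = 1.3  # hypothetical sample value of t
chain_value = dz_dt_chain_rule(t0)
numeric_value = dz_dt_numeric(t0)
print(chain_value, numeric_value)
```

The two printed values should agree up to the finite-difference error.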
We can easily extend The Chain Rule to the case of a function f (x, y) where x and y now are
both functions of two independent variables s and t, x = g(s, t) and y = h(s, t).
Then, z is indirectly a function of s and t:
z = f (g(s, t), h(s, t)).

We wish to find

\[ \frac{\partial z}{\partial s}, \qquad \frac{\partial z}{\partial t}. \]

Recall that, in computing $\frac{\partial z}{\partial t}$, we hold s fixed and compute the ordinary derivative of z with
respect to t. (This is the situation in The Chain Rule - Case 1.)
Similarly, in computing $\frac{\partial z}{\partial s}$, we hold t fixed and compute the ordinary derivative of z with
respect to s. (This is the situation in The Chain Rule - Case 1.)
We have the following (for free!)

Theorem 8.7 (The Chain Rule - Case 2).
Suppose that z = f (x, y) is a differentiable function of x and y, where x = g(s, t) and y = h(s, t)
are both differentiable functions of s and t. Then,

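\[ \frac{\partial z}{\partial s} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial s}, \]

\[ \frac{\partial z}{\partial t} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial t}. \]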

Case 2 of the Chain Rule contains three types of variables:


• s and t are independent variables.
• x and y are called intermediate variables.
• z is the dependent variable.

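Example 8.19. If $z = e^x \sin y$, where $x = st^2$ and $y = s^2 t$, find $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$.

Solution. Applying Case 2 of the Chain Rule,

\[
\begin{aligned}
\frac{\partial z}{\partial s} &= \frac{\partial z}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial s} \\
&= (e^x \sin y)(t^2) + (e^x \cos y)(2st) \\
&= t^2 e^{st^2} \sin(s^2 t) + 2st\, e^{st^2} \cos(s^2 t).
\end{aligned}
\]

\[
\begin{aligned}
\frac{\partial z}{\partial t} &= \frac{\partial z}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial t} \\
&= (e^x \sin y)(2st) + (e^x \cos y)(s^2) \\
&= 2st\, e^{st^2} \sin(s^2 t) + s^2 e^{st^2} \cos(s^2 t).
\end{aligned}
\]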


To remember the Chain Rule, it’s helpful to draw a tree diagram, as follows:
We draw branches from the dependent variable z to the intermediate variables x and y to
indicate that z is a function of x and y.
Then, we draw branches from x and y to the independent variables s and t. On each branch,
we write the corresponding partial derivative.

To find $\frac{\partial z}{\partial s}$, we find the product of the partial derivatives along each path from z to s and
then add these products:

\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial s}. \]

Similarly, we find $\frac{\partial z}{\partial t}$ by using the paths from z to t.

Theorem 8.8 (The Chain Rule - General Version).


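Suppose that u is a differentiable function of n variables $x_1, \ldots, x_n$, and each $x_j$ is a differentiable
function of m variables $t_1, \ldots, t_m$. Then u is a function of $t_1, \ldots, t_m$ and

\[ \frac{\partial u}{\partial t_i} = \frac{\partial u}{\partial x_1} \frac{\partial x_1}{\partial t_i} + \frac{\partial u}{\partial x_2} \frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial u}{\partial x_n} \frac{\partial x_n}{\partial t_i} \]

for each i = 1, . . . , m.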

Example 8.20. Write out the Chain Rule for the case where w = f (x, y, z, t) and x = x(u, v),
y = y(u, v), z = z(u, v), t = t(u, v).

Solution. The figure shows the tree diagram.

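With the aid of the tree diagram, we can now write the required expressions:

\[ \frac{\partial w}{\partial u} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial u} + \frac{\partial w}{\partial z} \frac{\partial z}{\partial u} + \frac{\partial w}{\partial t} \frac{\partial t}{\partial u}. \]

\[ \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial v} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial v} + \frac{\partial w}{\partial z} \frac{\partial z}{\partial v} + \frac{\partial w}{\partial t} \frac{\partial t}{\partial v}. \]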


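Example 8.21. If $w = f(x^2 - y^2,\ y^2 - x^2)$ and f is differentiable, show that

\[ y \frac{\partial w}{\partial x} + x \frac{\partial w}{\partial y} = 0. \]

Solution. Introduce intermediate variables:

\[ u = x^2 - y^2, \qquad v = y^2 - x^2. \]

Using the Chain Rule,

\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial w}{\partial v} \frac{\partial v}{\partial x} = \frac{\partial w}{\partial u}(2x) + \frac{\partial w}{\partial v}(-2x) \]

and

\[ \frac{\partial w}{\partial y} = \frac{\partial w}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial w}{\partial v} \frac{\partial v}{\partial y} = \frac{\partial w}{\partial u}(-2y) + \frac{\partial w}{\partial v}(2y). \]

Therefore

\[ y \frac{\partial w}{\partial x} + x \frac{\partial w}{\partial y} = \left( \frac{\partial w}{\partial u}(2xy) + \frac{\partial w}{\partial v}(-2xy) \right) + \left( \frac{\partial w}{\partial u}(-2xy) + \frac{\partial w}{\partial v}(2xy) \right) = 0. \]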


8.12 Implicit Differentiation
Consider a surface defined by an equation
F(x, y, z) = 0
where F(x, y, z) is differentiable.
Suppose z is implicitly defined as a function of independent variables x and y, that is, for
every choice of x and y, there is a unique z such that F(x, y, z) = 0.
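Suppose we are interested in $\frac{\partial z}{\partial x}$.
If we can solve the above equation for z, say z = f(x, y), then we can compute $\frac{\partial z}{\partial x}$ directly.
But life is complicated enough that we may not be able to solve for z. Instead, we use the Chain Rule to
differentiate the equation F(x, y, z) = 0 with respect to x:

\[ \frac{\partial F}{\partial x} \frac{\partial x}{\partial x} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial x} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial x} = 0. \]

But

\[ \frac{\partial x}{\partial x} = 1, \qquad \frac{\partial y}{\partial x} = 0 \]

since x and y are independent variables. Therefore, if $\frac{\partial F}{\partial z} \neq 0$, then

\[ \frac{\partial z}{\partial x} = -\frac{\partial F / \partial x}{\partial F / \partial z} = -\frac{F_x}{F_z}. \]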

Theorem 8.9 (Implicit Differentiation: Two Independent Variables).


Suppose the equation F(x, y, z) = 0, where F is differentiable, defines z implicitly as a differentiable
function of x and y. Then,

\[ \frac{\partial z}{\partial x} = -\frac{F_x(x, y, z)}{F_z(x, y, z)}, \qquad \frac{\partial z}{\partial y} = -\frac{F_y(x, y, z)}{F_z(x, y, z)} \]

provided $F_z(x, y, z) \neq 0$.

Example 8.22. Find $\frac{\partial z}{\partial x}$ if

\[ x^3 + y^3 + z^3 + 6xyz = 1. \]

Solution. Let $F(x, y, z) = x^3 + y^3 + z^3 + 6xyz - 1$. Then

\[ F_x = 3x^2 + 6yz, \qquad F_z = 3z^2 + 6xy. \]

Therefore, by the Implicit Differentiation Theorem,

\[ \frac{\partial z}{\partial x} = -\frac{F_x}{F_z} = -\frac{3x^2 + 6yz}{3z^2 + 6xy}. \]
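The formula can be sanity-checked even though z cannot easily be written in closed form. In the sketch below, z is found numerically by bisection and then differentiated by a central difference; the helper `solve_z`, the bracket [0, 1], and the sample point (x, y) = (0.5, 0.3) are all assumptions chosen for illustration, not part of the original notes.

```python
def F(x, y, z):
    # F(x, y, z) = 0 is the surface from Example 8.22.
    return x**3 + y**3 + z**3 + 6 * x * y * z - 1


def solve_z(x, y, lo=0.0, hi=1.0, iterations=100):
    # Bisection for the root of F(x, y, z) = 0 in z.
    # Assumes F changes sign on [lo, hi], which holds near (0.5, 0.3).
    f_lo = F(x, y, lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = F(x, y, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


def dz_dx_formula(x, y, z):
    # dz/dx = -F_x / F_z from the Implicit Differentiation Theorem.
    return -(3 * x**2 + 6 * y * z) / (3 * z**2 + 6 * x * y)


x0, y0 = 0.5, 0.3
z0 = solve_z(x0, y0)
h = 1e-5
dz_dx_numeric = (solve_z(x0 + h, y0) - solve_z(x0 - h, y0)) / (2 * h)
dz_dx_exact = dz_dx_formula(x0, y0, z0)
print(z0, dz_dx_numeric, dz_dx_exact)
```

The numerical slope of the implicitly defined z should match the value of −F_x/F_z at the same point.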


8.13 Increments and Differentials

Definition 8.13.
Let z = f(x, y). Suppose $\Delta x$ and $\Delta y$ are increments in the independent variables x and y
respectively.
Then the increment in z is defined by

\[ \Delta z = f(x + \Delta x, y + \Delta y) - f(x, y). \]

Definition 8.14.
Let z = f(x, y). Suppose $\Delta x$ and $\Delta y$ are increments in the independent variables x and y
respectively.
Then the differentials of the independent variables x and y are

\[ dx = \Delta x, \qquad dy = \Delta y. \]

The differential (or total differential) of the dependent variable z is

\[ dz = f_x(x, y)\,dx + f_y(x, y)\,dy. \]

Notice that
• the increment $\Delta z$ is the change in z as (x, y) changes from (a, b) to $(a + \Delta x, b + \Delta y)$.
• the differential dz is the change in height of the tangent plane as (x, y) changes from (a, b) to
$(a + \Delta x, b + \Delta y)$.

Example 8.23. Let $z = 2x^2 - xy$. Find $\Delta z$. Use this result to find the change in z if (x, y) changes
from (1, 1) to (0.98, 1.03).

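Solution.

\[
\begin{aligned}
\Delta z &= f(x + \Delta x, y + \Delta y) - f(x, y) \\
&= \left[ 2(x + \Delta x)^2 - (x + \Delta x)(y + \Delta y) \right] - (2x^2 - xy) \\
&= (4x - y)\Delta x - x\Delta y + 2(\Delta x)^2 - \Delta x \Delta y.
\end{aligned}
\]

As (x, y) changes from (1, 1) to (0.98, 1.03), we have $\Delta x = 0.98 - 1 = -0.02$ and $\Delta y = 1.03 - 1 = 0.03$.
Substituting these values (with x = y = 1) into the expression for $\Delta z$ above, we obtain

\[ \Delta z = -0.0886. \]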


From the previous example, it seems quite complicated to calculate $\Delta z$. Is there a way to
approximate $\Delta z$?
It turns out that dz gives a good approximation of $\Delta z$ provided $\Delta x$ and $\Delta y$ are small and
f(x, y) is differentiable.

Theorem 8.10.
Suppose f is differentiable at (a, b). Let $\Delta x$ and $\Delta y$ be small increments in x and y respectively
from (a, b). Then

\[ \Delta z \approx dz = f_x(a, b)\,dx + f_y(a, b)\,dy = f_x(a, b)\Delta x + f_y(a, b)\Delta y. \]
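To see how close dz is to the true increment, the sketch below evaluates both for the function and increments of Example 8.23. The function names `increment` and `differential` are illustrative; the partial derivatives 4x − y and −x are taken from the expansion in that example.

```python
def f(x, y):
    # z = 2x^2 - xy from Example 8.23.
    return 2 * x**2 - x * y


def increment(x, y, dx, dy):
    # Exact change in z.
    return f(x + dx, y + dy) - f(x, y)


def differential(x, y, dx, dy):
    # dz = f_x dx + f_y dy with f_x = 4x - y and f_y = -x.
    fx = 4 * x - y
    fy = -x
    return fx * dx + fy * dy


delta_z = increment(1.0, 1.0, -0.02, 0.03)
dz = differential(1.0, 1.0, -0.02, 0.03)
print(delta_z, dz, delta_z - dz)
```

Here the exact increment is −0.0886 while the differential gives −0.09; the gap is exactly the second-order part 2(Δx)² − ΔxΔy of the expansion.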

Example 8.24. The base radius and height of a circular cone are measured as 10 cm and 25 cm
respectively, with a possible error in measurement of as much as 0.1 cm in each. Use differentials to
estimate the maximum error in the calculated volume of the cone.

Solution. The volume of the cone is $V = \pi r^2 h / 3$. So

\[ dV = V_r\,dr + V_h\,dh = \frac{2\pi r h}{3}\,dr + \frac{\pi r^2}{3}\,dh. \]

Since each error is at most 0.1 cm, we can take dr = 0.1 and dh = 0.1 along with r = 10, h = 25
to give

\[ dV = \frac{500\pi}{3}(0.1) + \frac{100\pi}{3}(0.1) = 20\pi. \]

The estimated maximum error is $20\pi\ \text{cm}^3$.

8.14 Directional Derivatives and the Gradient Vector
Imagine you are hiking in the Grand Canyon. Let’s think of your altitude at the point given
by longitude x and latitude y as a function f (x, y).
Facing east (in the direction of the positive x-axis), the slope is given by the partial derivative $\frac{\partial f}{\partial x}$.
Facing north (in the direction of the positive y-axis), the slope is given by the partial derivative $\frac{\partial f}{\partial y}$.
How to compute the slope when you are facing any given direction, say north-east?

Definition 8.15 (Directional Derivative).


The directional derivative of f(x, y) at $(x_0, y_0)$ in the direction of the unit vector $\mathbf{u} = \langle a, b \rangle$ is

\[ D_{\mathbf{u}} f(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + ha, y_0 + hb) - f(x_0, y_0)}{h} \]

provided this limit exists.

By looking at the figure above, we can think of the directional derivative $D_{\mathbf{u}} f(x_0, y_0)$ as the
slope of the surface at the point $P(x_0, y_0, z_0)$ in the direction given by u.
Notice
• if $\mathbf{u} = \mathbf{i} = \langle 1, 0 \rangle$ then $D_{\mathbf{i}} f = f_x$.
• if $\mathbf{u} = \mathbf{j} = \langle 0, 1 \rangle$ then $D_{\mathbf{j}} f = f_y$.
In other words, the partial derivatives of f with respect to x and y are just special cases of
the directional derivative.
In practice, we do not usually compute the directional derivative using the definition. In-
stead, we compute it using the dot product of the vector consisting of partial derivatives and
the unit direction vector u.

Theorem 8.11 (Computing Directional Derivative).


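If f(x, y) is a differentiable function, then f has a directional derivative in the direction of any
unit vector $\mathbf{u} = \langle a, b \rangle$ and

\[ D_{\mathbf{u}} f(x, y) = f_x(x, y)\,a + f_y(x, y)\,b. \]

We can rewrite it in terms of vectors:

\[ D_{\mathbf{u}} f(x, y) = \langle f_x, f_y \rangle \cdot \langle a, b \rangle = \langle f_x, f_y \rangle \cdot \mathbf{u}. \]

Consider the vector $\langle f_x, f_y \rangle = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j}$.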
It turns out that this vector has much significance. So we give it a special name.

Definition 8.16 (Gradient).


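The gradient of f(x, y) is the vector-valued function

\[ \nabla f(x, y) = \langle f_x, f_y \rangle = f_x\,\mathbf{i} + f_y\,\mathbf{j} = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} \]

provided both partial derivatives exist.

$\nabla f$ is read ‘del f’.
With this notation, we have

\[ D_{\mathbf{u}} f(x, y) = \nabla f(x, y) \cdot \mathbf{u}. \]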
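Example 8.25. Find the directional derivative of the function $f(x, y) = x^2 y^3 - 4y$ at the point
(2, −1) in the direction of the vector $\mathbf{v} = 2\mathbf{i} + 5\mathbf{j}$.

Solution. First compute the gradient vector at (2, −1):

\[ \nabla f(x, y) = 2xy^3\,\mathbf{i} + (3x^2 y^2 - 4)\,\mathbf{j}, \]

\[ \nabla f(2, -1) = -4\mathbf{i} + 8\mathbf{j}. \]

Notice v is NOT a unit vector, since

\[ \|\mathbf{v}\| = \sqrt{2^2 + 5^2} = \sqrt{29}. \]

The unit vector in the direction of v is

\[ \mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{2}{\sqrt{29}}\mathbf{i} + \frac{5}{\sqrt{29}}\mathbf{j}. \]

Therefore

\[
\begin{aligned}
D_{\mathbf{u}} f(2, -1) &= \nabla f(2, -1) \cdot \mathbf{u} \\
&= \langle -4, 8 \rangle \cdot \left\langle \frac{2}{\sqrt{29}}, \frac{5}{\sqrt{29}} \right\rangle \\
&= \frac{32}{\sqrt{29}}.
\end{aligned}
\]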
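The steps of Example 8.25 (normalize the direction, then take the dot product with the gradient) can be sketched in code. The names `gradient` and `directional_derivative` are hypothetical helpers, and the gradient is estimated by central differences rather than by hand.

```python
import math


def gradient(func, x, y, h=1e-6):
    # Central-difference estimate of (f_x, f_y).
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return fx, fy


def directional_derivative(func, point, direction):
    # Normalize the direction first: the formula requires a unit vector.
    norm = math.hypot(direction[0], direction[1])
    if norm == 0:
        raise ValueError("direction must be a nonzero vector")
    u = (direction[0] / norm, direction[1] / norm)
    gx, gy = gradient(func, point[0], point[1])
    return gx * u[0] + gy * u[1]


def f(x, y):
    # The function from Example 8.25.
    return x**2 * y**3 - 4 * y


value = directional_derivative(f, (2.0, -1.0), (2.0, 5.0))
print(value, 32 / math.sqrt(29))
```

Forgetting the normalization step would give 32 instead of 32/√29, which is why the helper divides by the length of the direction vector before taking the dot product.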

For functions of three variables, we can define directional derivative in a similar manner.

Definition 8.17 (3-D Directional Derivative).


The directional derivative of f(x, y, z) at $(x_0, y_0, z_0)$ in the direction of the unit vector $\mathbf{u} = \langle a, b, c \rangle$
is

\[ D_{\mathbf{u}} f(x_0, y_0, z_0) = \lim_{h \to 0} \frac{f(x_0 + ha, y_0 + hb, z_0 + hc) - f(x_0, y_0, z_0)}{h} \]

provided this limit exists.

Just as with functions of two variables, we have

Theorem 8.12 (Computing 3-D Directional Derivative).

\[ D_{\mathbf{u}} f(x_0, y_0, z_0) = \nabla f(x_0, y_0, z_0) \cdot \mathbf{u} \]

where

\[ \nabla f = \langle f_x, f_y, f_z \rangle = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} \]

is the gradient vector.

What is so significant about ∇f ?

Theorem 8.13 (Level Curve vs ∇f ).
Suppose f(x, y) is a differentiable function of x and y at $(x_0, y_0)$.
Suppose $\nabla f(x_0, y_0) \neq \mathbf{0}$. Then $\nabla f(x_0, y_0)$ is perpendicular/normal to the level curve f(x, y) = k
at the point $(x_0, y_0)$ where $f(x_0, y_0) = k$.

Using a similar argument, we can prove that this phenomenon also holds for level surfaces
F(x, y, z) = k.

Theorem 8.14 (Level Surface vs ∇f ).


Suppose F(x, y, z) is a differentiable function of x, y and z at $(x_0, y_0, z_0)$. Suppose S is the level
surface F(x, y, z) = k containing $(x_0, y_0, z_0)$. Let C be any curve that lies on S and passes
through $(x_0, y_0, z_0)$. Let r(t) be a parametric equation of C such that $\mathbf{r}(t_0) = \langle x_0, y_0, z_0 \rangle$.
Suppose $\nabla F(x_0, y_0, z_0) \neq \mathbf{0}$. Then

\[ \nabla F(x_0, y_0, z_0) \cdot \mathbf{r}'(t_0) = 0. \]

That is, $\nabla F(x_0, y_0, z_0)$ is perpendicular/normal to the tangent vector $\mathbf{r}'(t_0)$ of any curve C on
the surface S that passes through $(x_0, y_0, z_0)$.

\[ F(x_0, y_0, z_0) = k, \qquad \nabla F(x_0, y_0, z_0) \cdot \mathbf{r}'(t_0) = 0. \]

Consequently, the tangent plane to the level surface F(x, y, z) = k at (x0 , y0 , z0 ) is given by
the equation

Theorem 8.15 (Tangent Plane to Level Surface).

\[ \nabla F(x_0, y_0, z_0) \cdot \langle x - x_0,\ y - y_0,\ z - z_0 \rangle = 0 \]

or equivalently,

\[ F_x(x_0, y_0, z_0)(x - x_0) + F_y(x_0, y_0, z_0)(y - y_0) + F_z(x_0, y_0, z_0)(z - z_0) = 0. \]

Example 8.26. Find the equations of the tangent plane and normal line at the point (−2, 1, −3) to
the ellipsoid

\[ \frac{x^2}{4} + y^2 + \frac{z^2}{9} = 3. \]

Solution. The ellipsoid is the level surface (with k = 3) of the function

\[ F(x, y, z) = \frac{x^2}{4} + y^2 + \frac{z^2}{9}. \]

Therefore,

\[ F_x(x, y, z) = \frac{x}{2}, \qquad F_y(x, y, z) = 2y, \qquad F_z(x, y, z) = \frac{2z}{9}, \]

\[ F_x(-2, 1, -3) = -1, \qquad F_y(-2, 1, -3) = 2, \qquad F_z(-2, 1, -3) = -\frac{2}{3}. \]

The equation of the tangent plane at (−2, 1, −3) is

\[ \nabla F(-2, 1, -3) \cdot \langle x - (-2),\ y - 1,\ z - (-3) \rangle = 0, \]

that is,

\[ -1(x + 2) + 2(y - 1) - \frac{2}{3}(z + 3) = 0, \]

which simplifies to

\[ 3x - 6y + 2z + 18 = 0. \]

A normal vector to the plane is $\langle 3, -6, 2 \rangle$. So the parametric equations of the normal line
are

\[ x = -2 + 3t, \qquad y = 1 - 6t, \qquad z = -3 + 2t, \qquad t \in \mathbb{R}. \]
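The computation in Example 8.26 follows a fixed pattern: evaluate the gradient at the point, then read off the plane and the normal line. A minimal sketch is given below; the function names are hypothetical, and the gradient is hard-coded for this particular F.

```python
def grad_F(x, y, z):
    # Gradient of F(x, y, z) = x^2/4 + y^2 + z^2/9.
    return (x / 2, 2 * y, 2 * z / 9)


def tangent_plane(point):
    # Coefficients (a, b, c, d) of the plane a*x + b*y + c*z + d = 0.
    x0, y0, z0 = point
    a, b, c = grad_F(x0, y0, z0)
    d = -(a * x0 + b * y0 + c * z0)
    return a, b, c, d


def normal_line(point, t):
    # Point on the normal line at parameter t, using the gradient as direction.
    n = grad_F(*point)
    return tuple(p + t * n_i for p, n_i in zip(point, n))


P = (-2.0, 1.0, -3.0)
a, b, c, d = tangent_plane(P)

# Multiplying through by -3 clears the fraction, as in the worked solution.
scaled = tuple(-3 * v for v in (a, b, c, d))
print(scaled)
```

The gradient ⟨−1, 2, −2/3⟩ and the vector ⟨3, −6, 2⟩ used in the solution are parallel, so `normal_line` traces the same line with a different parametrization.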

Remark. Theorem 8.5 is a special case of Theorem 8.15, and one may deduce Theorem 8.5
from Theorem 8.15 as follows:
Suppose S is the surface defined by z = f (x, y), and (a, b, f (a, b)) is a point on S. Now consider
the function F(x, y, z) = f (x, y) − z. Notice that
z = f (x, y) ⇐⇒ f (x, y) − z = 0 ⇐⇒ F(x, y, z) = 0.
Thus we may regard S as the level surface F(x, y, z) = 0. Note also that at a point (x0 , y0 , z0 ),
we have
Fx (x0 , y0 , z0 ) = fx (x0 , y0 ), Fy (x0 , y0 , z0 ) = fy (x0 , y0 ), Fz (x0 , y0 , z0 ) = −1.
Then from Theorem 8.15 (with (x0 , y0 , z0 ) = (a, b, f (a, b))), the equation of the tangent plane at
(a, b, f (a, b)) to S (as the level surface F(x, y, z) = 0) is given by
Fx (a, b, f (a, b))(x − a) + Fy (a, b, f (a, b))(y − b) + Fz (a, b, f (a, b))(z − f (a, b)) = 0
or equivalently,
fx (a, b)(x − a) + fy (a, b)(y − b) − (z − f (a, b)) = 0,
which is the same equation as given in Theorem 8.5.


Let’s return to the directional derivative Du f (x, y, z) and ask some questions.
We know that Du f (x, y, z) = ∇f (x, y, z) · u is a scalar function of x, y and z (because it is a dot
product of two vectors). Geometrically, we think of Du f (x0 , y0 , z0 ) as the rate of change of f
at (x0 , y0 , z0 ) in the direction of u.
Question: at a given point (x0 , y0 , z0 ), in which direction does f change the fastest? In other
words, what is the maximum rate of change of f at (x0 , y0 , z0 )?
The answer lies in ∇f !
Let θ be the angle between $\nabla f$ and u. Then

\[
\begin{aligned}
D_{\mathbf{u}} f &= \nabla f \cdot \mathbf{u} \\
&= \|\nabla f\|\,\|\mathbf{u}\| \cos\theta \\
&= \|\nabla f\| \cos\theta \qquad \text{since } \mathbf{u} \text{ is a unit vector.}
\end{aligned}
\]

\[ D_{\mathbf{u}} f = \|\nabla f\| \cos\theta. \]

The maximum value of cos θ is 1 and this happens when θ = 0.
So the maximum value of $D_{\mathbf{u}} f$ is $\|\nabla f\|$ and it occurs when θ = 0, i.e. u points in the direction
of $\nabla f$.
The minimum value of cos θ is −1 and this happens when θ = π.
So the minimum value of $D_{\mathbf{u}} f$ is $-\|\nabla f\|$ and it occurs when θ = π, i.e. u points in the direction
of $-\nabla f$.

Theorem 8.16 (Maximizing Rate of Increase/Decrease of f ).
Suppose f is a differentiable function of two or three variables. Let P denote a given point.
Assume $\nabla f(P) \neq \mathbf{0}$. Let u be a unit vector making an angle θ with $\nabla f$. Then

\[ D_{\mathbf{u}} f(P) = \|\nabla f(P)\| \cos\theta. \]

Moreover,
• $\nabla f(P)$ points in the direction of maximum rate of increase of f at P (the maximum value of
$D_{\mathbf{u}} f(P)$ is $\|\nabla f(P)\|$).
• $-\nabla f(P)$ points in the direction of maximum rate of decrease of f at P (the minimum value of
$D_{\mathbf{u}} f(P)$ is $-\|\nabla f(P)\|$).

Example 8.27. Let $f(x, y) = x e^y$. In what direction does f have the maximum rate of change at
the point P(2, 0)? What is this maximum rate of change?

Solution. Note that

\[ \nabla f(x, y) = \langle f_x, f_y \rangle = \langle e^y, x e^y \rangle. \]

f increases fastest in the direction of the gradient vector

\[ \nabla f(2, 0) = \langle 1, 2 \rangle. \]

The maximum rate of change is

\[ \|\nabla f(2, 0)\| = \sqrt{1^2 + 2^2} = \sqrt{5}. \]
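Theorem 8.16 can be illustrated by brute force: sample many unit vectors u = ⟨cos θ, sin θ⟩, evaluate the directional derivative for each, and see which angle wins. The grid of 3600 angles is an arbitrary illustrative choice, and the search only approximates the true maximizer.

```python
import math


def grad_f(x, y):
    # Gradient of f(x, y) = x e^y from Example 8.27.
    return (math.exp(y), x * math.exp(y))


def directional_derivative(point, theta):
    # D_u f with u = <cos(theta), sin(theta)>, which is always a unit vector.
    gx, gy = grad_f(point[0], point[1])
    return gx * math.cos(theta) + gy * math.sin(theta)


point = (2.0, 0.0)
n = 3600
angles = [2 * math.pi * k / n for k in range(n)]

# Pick the sampled direction with the largest rate of change.
best_theta = max(angles, key=lambda t: directional_derivative(point, t))
best_rate = directional_derivative(point, best_theta)

# Angle of the gradient <1, 2> itself, for comparison.
gradient_angle = math.atan2(2.0, 1.0)
print(best_theta, gradient_angle, best_rate, math.sqrt(5))
```

The best sampled angle should land next to the angle of the gradient ⟨1, 2⟩, and the best rate should be close to √5.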

Example 8.28. In a toy model of a neural network with only one artificial neuron, there are two
inputs a and b to the neuron, and they are attached with (variable) weights of x and y respectively.
In a (repeated) training of the neuron, the values of the inputs are fixed at (a, b) = (2, 3). The
activation function of the neuron is given by $\varphi(s) = \frac{1}{1 + e^{-s}}$ near s = −1, so that for (x, y) near
(4, 5), the (actual) output of the neuron is given by $\varphi(2x + 3y - 24) = \frac{1}{1 + e^{-(2x + 3y - 24)}}$ (here the
term −24 is called a ‘bias’; note also that 2 · 4 + 3 · 5 − 24 = −1). With the input fixed at (a, b) = (2, 3)
and the corresponding target output of the neuron set at $\frac{1}{2}$, the cost function is given by

\[ C(x, y) = \left( \frac{1}{2} - \frac{1}{1 + e^{-(2x + 3y - 24)}} \right)^2 \quad \text{for } (x, y) \text{ near } (4, 5), \]

which is a measure of the discrepancy between the actual output and the target output. Find the
direction in which the cost function C will have maximum rate of decrease when the weights are
at (x, y) = (4, 5).
[Figure: a single artificial neuron with inputs a and b, carrying weights x and y respectively, and one output.]

Remark. Such information is useful in training the neuron, which involves modifying the weights
(x, y) efficiently so that for the same input (a, b) = (2, 3) (but with the weights modified from
(x, y) = (4, 5)), the new output of the neuron will become closer to the target output.

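Solution. Recall that

\[ C(x, y) = \left( \frac{1}{2} - \frac{1}{1 + e^{-(2x + 3y - 24)}} \right)^2 \quad \text{for } (x, y) \text{ near } (4, 5). \]

We have, for (x, y) near (4, 5),

\[
\begin{aligned}
C_x &= 2 \left( \frac{1}{2} - \frac{1}{1 + e^{-(2x + 3y - 24)}} \right) \cdot \frac{(-1)(-1)}{(1 + e^{-(2x + 3y - 24)})^2} \cdot e^{-(2x + 3y - 24)} \cdot (-2) \\
&= -\frac{4 e^{-(2x + 3y - 24)}}{(1 + e^{-(2x + 3y - 24)})^2} \left( \frac{1}{2} - \frac{1}{1 + e^{-(2x + 3y - 24)}} \right), \\
C_y &= 2 \left( \frac{1}{2} - \frac{1}{1 + e^{-(2x + 3y - 24)}} \right) \cdot \frac{(-1)(-1)}{(1 + e^{-(2x + 3y - 24)})^2} \cdot e^{-(2x + 3y - 24)} \cdot (-3) \\
&= -\frac{6 e^{-(2x + 3y - 24)}}{(1 + e^{-(2x + 3y - 24)})^2} \left( \frac{1}{2} - \frac{1}{1 + e^{-(2x + 3y - 24)}} \right).
\end{aligned}
\]

A direct computation gives

\[ C_x(4, 5) = \frac{-2e(e - 1)}{(1 + e)^3}, \qquad C_y(4, 5) = \frac{-3e(e - 1)}{(1 + e)^3}. \]

At the point (x, y) = (4, 5), the cost function C decreases fastest in the direction of the negative
gradient vector

\[ -\nabla C(4, 5) = -\langle C_x(4, 5), C_y(4, 5) \rangle = -\left\langle \frac{-2e(e - 1)}{(1 + e)^3}, \frac{-3e(e - 1)}{(1 + e)^3} \right\rangle = \frac{e(e - 1)}{(1 + e)^3} \langle 2, 3 \rangle. \]

Thus the required direction is given by the unit vector

\[ \mathbf{u} = \frac{-\nabla C(4, 5)}{\|-\nabla C(4, 5)\|} = \frac{\langle 2, 3 \rangle}{\|\langle 2, 3 \rangle\|} = \frac{\langle 2, 3 \rangle}{\sqrt{13}}. \]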
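The Remark above says this direction is what a training procedure uses to modify the weights. As a rough sketch of one such update (a single gradient-descent step), the code below moves the weights a small distance along −∇C and checks that the cost drops. The step size `LEARNING_RATE` is a hypothetical value chosen for illustration and does not come from the notes.

```python
import math


def sigmoid(s):
    # Activation function phi(s) = 1 / (1 + e^(-s)).
    return 1.0 / (1.0 + math.exp(-s))


def cost(x, y):
    # C(x, y) with inputs (2, 3), bias -24 and target output 1/2.
    return (0.5 - sigmoid(2 * x + 3 * y - 24)) ** 2


def cost_gradient(x, y):
    out = sigmoid(2 * x + 3 * y - 24)
    # dC/ds = 2 (1/2 - out) * (-out'), where out' = out (1 - out) = e^(-s) / (1 + e^(-s))^2.
    dC_ds = -2.0 * (0.5 - out) * out * (1.0 - out)
    return 2.0 * dC_ds, 3.0 * dC_ds  # chain rule: ds/dx = 2, ds/dy = 3


x, y = 4.0, 5.0
gx, gy = cost_gradient(x, y)
norm = math.hypot(gx, gy)
direction = (-gx / norm, -gy / norm)   # unit vector along -grad C

LEARNING_RATE = 0.5                    # hypothetical step size
new_x = x - LEARNING_RATE * gx
new_y = y - LEARNING_RATE * gy
old_cost, new_cost = cost(x, y), cost(new_x, new_y)
print(direction, old_cost, new_cost)
```

The computed `direction` should match ⟨2, 3⟩/√13, and the cost after the step should be smaller than before, meaning the output has moved closer to the target 1/2.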

8.15 Extrema of Functions of Two Variables
In the real world, we always seek to optimize our resources.
Given our constraints (our time, ability, finance, health, family background and what not),
getting an A in MA1521 itself can be seen as an optimization problem.
Similar to the study of extrema of functions of one variable, two key concepts are the local
maximum/minimum and the absolute maximum/minimum.

Definition 8.18 (Local and Absolute Maximum).


Let f (x, y) : D → R. Then
• f has a local maximum at (a, b) if f (x, y) ≤ f (a, b) for all points in some disk with center
(a, b). Such point (a, b) is called a local maximum point of f , and the number f (a, b) is called
a local maximum value of f .
• f has an absolute maximum at (a, b) if f (x, y) ≤ f (a, b) for all points in the domain D.
Such point (a, b) is called an absolute maximum point of f , and the number f (a, b) is called
the absolute maximum value of f .

Definition 8.19 (Local and Absolute Minimum).


Let f (x, y) : D → R. Then
• f has a local minimum at (a, b) if f (x, y) ≥ f (a, b) for all points in some disk with center
(a, b). Such point (a, b) is called a local minimum point of f , and the number f (a, b) is called
a local minimum value of f .
• f has an absolute minimum at (a, b) if f (x, y) ≥ f (a, b) for all points in the domain D.
Such point (a, b) is called an absolute minimum point of f , and the number f (a, b) is called
the absolute minimum value of f .

8.15.1 Local Extrema


A key observation that will be used repeatedly when finding local extrema of functions is
the following.

Theorem 8.17.
If f has a local maximum or minimum at (a, b) and the first-order derivatives of f exist there,
then
fx (a, b) = fy (a, b) = 0.

Proof. Let g(x) = f(x, b). Then g is a function of a single variable x. If f has a local maximum/minimum
at (x, y) = (a, b), then g has a local maximum/minimum at x = a. So $g'(a) = 0$.
But $g'(a) = f_x(a, b)$. So $f_x(a, b) = 0$.
Similarly, $f_y(a, b) = 0$.

There is a geometric interpretation of the preceding theorem:
If f has a tangent plane at a local maximum/minimum (a, b), then the tangent plane has
equation

z = f (a, b) + fx (a, b)(x − a) + fy (a, b)(y − b) = f (a, b),

that is the tangent plane is a horizontal plane parallel to the xy-plane.


If I stand on a local maximum/minimum then ......

In the following, I am definitely NOT standing on local maximum/minimum ......

Definition 8.20 (Critical or Stationary Point).


Let f (x, y) : D → R. Then a point (a, b) is called a critical point of f if
• fx (a, b) = 0 and fy (a, b) = 0, OR
• one of the partial derivatives does not exist.

Clearly,

(a, b) local maximum/minimum point =⇒ (a, b) critical point.


However, the converse IS NOT TRUE!
Example 8.29. Find the extreme (maximum/minimum) values of $f(x, y) = y^2 - x^2$.

Solution. Extreme values can only occur at critical points. Since $f_x = -2x$ and $f_y = 2y$, the
only critical point is (0, 0).
We still have to check whether f(0, 0) is a maximum/minimum value.
Note that f(0, 0) = 0.
If y = 0 and x ≠ 0, then

\[ f(x, y) = -x^2 < 0 = f(0, 0). \]

If x = 0 and y ≠ 0, then

\[ f(x, y) = y^2 > 0 = f(0, 0). \]

Therefore, f(0, 0) cannot be an extreme value for f. So f has no extreme values.


In the preceding example, we see that a critical point need not be an extreme point. But
the behavior of the critical point in the preceding example is interesting.
If you look at the graph of $f(x, y) = y^2 - x^2$, you will see that f(0, 0) = 0 is a maximum in the
direction of the x-axis but a minimum in the direction of the y-axis.
This motivates the following definition.

Definition 8.21 (Saddle Point).


Let f (x, y) : D → R. Then

A point (a, b) is called a saddle point of f if

• it is a critical point of f , AND

• every open disk centered at (a, b) contains points (x, y) ∈ D for which f (x, y) < f (a, b) and
points (x, y) ∈ D for which f (x, y) > f (a, b).

Suppose you are standing on a surface and you are standing upright (parallel to
the z-axis). Moreover, when you begin walking, some directions take you uphill
while other directions take you downhill.

Then you are standing at a saddle point!


We cannot rely on our visualization of 3D-graphs to locate extreme points.
Luckily, we have the second derivative test to determine whether a given critical point is
local maximum/minimum, saddle point or neither.

Theorem 8.18 (Second Derivative Test).


Suppose f (x, y) has continuous second-order partial derivatives on some open disk centered at
(a, b). Suppose fx (a, b) = fy (a, b) = 0 (that is (a, b) is a critical point). Define the discriminant
D for the point (a, b) by
\[ D = D(a, b) = f_{xx}(a, b)\, f_{yy}(a, b) - \left[ f_{xy}(a, b) \right]^2 . \]

(a) If D > 0 and fxx (a, b) > 0, then f (a, b) is a local minimum.
(b) If D > 0 and fxx (a, b) < 0, then f (a, b) is a local maximum.
(c) If D < 0, then (a, b) is a saddle point of f .
(d) If D = 0, then no conclusion can be drawn.

Example 8.30. Locate and classify all critical points for $f(x, y) = x^3 - 2y^2 - 2y^4 + 3x^2 y$.

Solution. We have

\[ f_x = 3x^2 + 6xy, \qquad f_y = -4y - 8y^3 + 3x^2 . \]

Step 1. Locate critical points.


Let's solve the system
\[ 3x^2 + 6xy = 0 \qquad (1) \]
\[ -4y - 8y^3 + 3x^2 = 0 \qquad (2) \]

If $x = 0$, then by (2) we have $-4y - 8y^3 = 0 \Leftrightarrow -4y(1 + 2y^2) = 0 \Leftrightarrow y = 0$. Thus we obtain one
solution $(0, 0)$.

If $x \neq 0$, then by (1) we have $y = -\frac{x}{2}$. Substituting this into (2), we have

\[ -4\left(-\tfrac{x}{2}\right) - 8\left(-\tfrac{x}{2}\right)^3 + 3x^2 = 0 \Leftrightarrow 2x + x^3 + 3x^2 = 0 \Leftrightarrow x(x + 2)(x + 1) = 0 \Leftrightarrow x = -1, -2, \]

since $x \neq 0$.

Using $y = -\frac{x}{2}$, we obtain the two solutions $(-1, \frac{1}{2})$ and $(-2, 1)$.

Therefore, the critical points are $(0, 0)$, $(-1, \frac{1}{2})$, $(-2, 1)$.
Step 2. Classify (if possible) these critical points using Second Derivative Test.
We need

\[ f_{xx} = 6x + 6y, \qquad f_{xy} = 6x, \qquad f_{yy} = -4 - 24y^2 . \]

At the critical point $(-2, 1)$, we have

\[ D(-2, 1) = f_{xx}(-2, 1)\, f_{yy}(-2, 1) - [f_{xy}(-2, 1)]^2 = (-6) \cdot (-28) - (-12)^2 = 24 > 0. \]

Note also that $f_{xx}(-2, 1) = -6 < 0$. Thus by the Second Derivative Test, we know that f has a
local maximum point at $(-2, 1)$.
We can make similar computations at the other two critical points $(0, 0)$ and $(-1, \frac{1}{2})$, and the
results are tabulated as follows:

critical point    D          fxx            2nd Derivative Test's conclusion
(0, 0)            0          (not needed)   inconclusive
(−1, 1/2)         −6 < 0     (not needed)   saddle point
(−2, 1)           24 > 0     −6 < 0         local maximum

We need a different analysis to deal with the critical point (0, 0).
Notice that in the plane $y = 0$, $f(x, y) = f(x, 0) = x^3$. We know from Calculus that this curve has
an inflection point at $x = 0$. So there is no local extremum at this point.
Moreover, when we start walking from (0, 0) in positive direction of x, we will be walking
uphill; in the negative direction of x, we will be walking downhill.
So (0, 0) is a saddle point.
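
The classification in Example 8.30 can be checked mechanically. The following is a minimal sketch, not part of the original notes: the function names `classify_critical_point` and `second_partials` are hypothetical, and the second partial derivatives are typed in by hand from the solution above rather than computed symbolically.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the Second Derivative Test to the second partials at a critical point."""
    d = fxx * fyy - fxy ** 2  # the discriminant D
    if d > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"  # D = 0: the test gives no information


def second_partials(x, y):
    """Second partials (fxx, fyy, fxy) of f(x, y) = x^3 - 2y^2 - 2y^4 + 3x^2 y."""
    return 6 * x + 6 * y, -4 - 24 * y ** 2, 6 * x


for point in [(0, 0), (-1, 0.5), (-2, 1)]:
    print(point, classify_critical_point(*second_partials(*point)))
```

Note that the sketch reports (0, 0) as inconclusive, which is exactly why the separate argument along the line y = 0 is needed.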

Exercise 8.1. Find the critical points of $f(x, y) = -3x e^{-x^2 - y^2}$ and classify them.

Ans. Local minimum at $\left(\frac{1}{\sqrt{2}}, 0\right)$, local maximum at $\left(-\frac{1}{\sqrt{2}}, 0\right)$.

Exercise 8.2. A delivery company only accepts rectangular boxes the sum of whose length and
girth (perimeter of a cross-section) does not exceed 270 cm. Find the dimensions of an acceptable
box of largest volume.

Ans. 90cm × 45cm × 45cm.

Chapter 9

Double Integrals

Read Thomas’ Calculus, Chapter 14.

9.1 Riemann Sum


Having studied derivatives for functions of several variables and their applications, we now
turn to introducing the idea of integral for functions of several variables. It turns out that
these ideas are useful in many practical problems.
Recall that in Calculus of Single Variable, our attempt to find the area under a curve led to
the definition of a definite integral.
We now seek to find volume under a surface and in the process we arrive at the definition of
a double integral.
We start by reviewing how we arrive at the definite integral of functions of a single variable:
Step 1. Suppose $f(x)$ is defined for $a \le x \le b$. We divide the interval $[a, b]$ into $n$ subintervals
of equal size $\Delta x = \frac{b-a}{n}$.
Step 2. We choose sample points $x_i^*$ from these subintervals and form the Riemann sum

\[ \sum_{i=1}^{n} f(x_i^*)\,\Delta x. \]

Step 3. Take the limit of such sums as $n \to \infty$ to obtain the definite integral of $f$ from $a$ to $b$:

\[ \int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x. \]

In the special case where $f(x) \ge 0$, the integral $\int_a^b f(x)\,dx$ represents the area under the curve
$y = f(x)$ from $a$ to $b$.

9.2 Volume and Double Integral
Suppose f (x, y) is a function of two variables defined on a closed rectangle

R = [a, b] × [c, d] = {(x, y) ∈ R2 : a ≤ x ≤ b, c ≤ y ≤ d}.

Suppose f (x, y) ≥ 0. The graph of f is a surface with z = f (x, y) above the region R.
Let S be the solid that lies above R and under the graph of f .
How can we find the volume of S?

We can estimate the volume of S as follows:


Step 1. Divide the rectangle $R$ into subrectangles. We do this by
• dividing the interval $[a, b]$ into $m$ subintervals $[x_{i-1}, x_i]$ of equal length $\Delta x = \frac{b-a}{m}$, and
• dividing the interval $[c, d]$ into $n$ subintervals $[y_{j-1}, y_j]$ of equal length $\Delta y = \frac{d-c}{n}$.
Form subrectangles

\[ R_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j], \quad \text{for all } 1 \le i \le m,\ 1 \le j \le n. \]

Each of these subrectangles has area $\Delta A = \Delta x\,\Delta y$.


Step 2. Choose a sample point $(x_{ij}^*, y_{ij}^*)$ in each $R_{ij}$. Then approximate the part of $S$ that lies
above $R_{ij}$ by a thin rectangular box with base $R_{ij}$ and height $f(x_{ij}^*, y_{ij}^*)$. The volume of this box
is given by

\[ f(x_{ij}^*, y_{ij}^*)\,\Delta A. \]

It follows that by adding the volumes of all these thin boxes, we get an approximation of the
total volume of $S$:

\[ V \approx \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A. \]

Our intuition tells us that the approximation becomes better as $m, n \to \infty$. So we would
expect

\[ V = \lim_{m,n\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A. \]

We make the following definition:

Definition 9.1 (Double Integral).


The double integral of f over the rectangle R is

\[ \iint_R f(x, y)\,dA = \lim_{m,n\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A \]

provided the limit exists and is the same for any choice of the sample points $(x_{ij}^*, y_{ij}^*)$ in $R_{ij}$, for
$1 \le i \le m$, $1 \le j \le n$.
When this happens, we say that f is integrable over R.

Remark. It can be shown that all continuous functions are integrable.
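
To make the definition concrete, here is a minimal sketch of the double Riemann sum with midpoints as sample points. It is an illustration only: the function name `double_riemann_sum`, the integrand $x^2 y$, the rectangle $[0, 3] \times [1, 2]$ and the choice m = n = 200 are all assumptions made for the example, not part of the definition.

```python
def double_riemann_sum(f, a, b, c, d, m, n):
    """Riemann sum of f over [a, b] x [c, d] using midpoints as sample points."""
    dx = (b - a) / m
    dy = (d - c) / n
    total = 0.0
    for i in range(m):
        x_star = a + (i + 0.5) * dx      # midpoint of [x_{i-1}, x_i]
        for j in range(n):
            y_star = c + (j + 0.5) * dy  # midpoint of [y_{j-1}, y_j]
            total += f(x_star, y_star) * dx * dy
    return total


approx = double_riemann_sum(lambda x, y: x * x * y, 0.0, 3.0, 1.0, 2.0, 200, 200)
print(approx)  # close to 27/2 = 13.5 for this integrand
```

Increasing m and n brings the sum closer to the value of the double integral, as the limit in the definition suggests.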


By comparing our definition of integral and volume, we have

Theorem 9.1 (Volume as a Double Integral).


If f (x, y) ≥ 0, the volume V of the solid that lies above the rectangle R and below the surface
z = f (x, y) is
\[ V = \iint_R f(x, y)\,dA. \]

Some properties of double integral:

Assuming all the integrals exist, we have


1. $\iint_R \big(f(x, y) + g(x, y)\big)\,dA = \iint_R f(x, y)\,dA + \iint_R g(x, y)\,dA$.

2. $\iint_R c f(x, y)\,dA = c \iint_R f(x, y)\,dA$.

3. If $f(x, y) \ge g(x, y)$ for all $(x, y) \in R$, then
\[ \iint_R f(x, y)\,dA \ge \iint_R g(x, y)\,dA. \]

9.3 Iterated Double Integral


Recall that it is usually difficult to evaluate a single integral directly from definition, but the
Fundamental Theorem of Calculus provides a much easier method:
\[ \int_a^b f(x)\,dx = F(b) - F(a) \]

where F(x) is an antiderivative of f (x).


For double integral, it is even more difficult to compute it from first principles. We now
see how to express a double integral as an iterated integral which can be evaluated by
calculating two single integrals.
Suppose f (x, y) is integrable over the rectangle R = [a, b] × [c, d].
We use the notation $\int_c^d f(x, y)\,dy$ to mean that $x$ is held fixed and $f(x, y)$ is integrated with
respect to $y$ from $c$ to $d$.
This procedure is called partial integration with respect to y. Notice the similarity to partial
differentiation.
So $\int_c^d f(x, y)\,dy$ is a function of $x$, as it depends on the value of $x$: set
\[ A(x) = \int_c^d f(x, y)\,dy. \]

We now integrate $A(x)$ from $a$ to $b$:

\[ \int_a^b A(x)\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx. \]

The integral on the right-hand side is called an iterated integral.

Usually, we omit the brackets:
\[ \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx. \]

Definition 9.2 (Iterated Integral).
\[ \int_a^b \int_c^d f(x, y)\,dy\,dx \]
means we first integrate with respect to y from c to d (keeping x fixed) and then with respect
to x from a to b.
\[ \int_c^d \int_a^b f(x, y)\,dx\,dy \]
means we first integrate with respect to x from a to b (keeping y fixed) and then with respect
to y from c to d.

Example 9.1. Evaluate the iterated integral


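\[ \int_1^2 \int_0^3 x^2 y\,dx\,dy. \]

Solution. We first integrate with respect to x and then with respect to y:

\[
\begin{aligned}
\int_1^2 \int_0^3 x^2 y\,dx\,dy &= \int_1^2 \left[ \int_0^3 x^2 y\,dx \right] dy \\
&= \int_1^2 \left[ \frac{x^3 y}{3} \right]_{x=0}^{x=3} dy \\
&= \int_1^2 9y\,dy \\
&= \left[ \frac{9y^2}{2} \right]_1^2 \\
&= \frac{27}{2}.
\end{aligned}
\]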

Example 9.2. Evaluate the iterated integral


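\[ \int_0^3 \int_1^2 x^2 y\,dy\,dx. \]

Solution. We first integrate with respect to y and then with respect to x:

\[
\begin{aligned}
\int_0^3 \int_1^2 x^2 y\,dy\,dx &= \int_0^3 \left[ \int_1^2 x^2 y\,dy \right] dx \\
&= \int_0^3 \left[ \frac{x^2 y^2}{2} \right]_{y=1}^{y=2} dx \\
&= \int_0^3 \frac{3}{2} x^2\,dx \\
&= \left[ \frac{x^3}{2} \right]_0^3 \\
&= \frac{27}{2}.
\end{aligned}
\]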


Notice in both of the preceding examples, we obtained the same answer.
It seems that the order of integration (with respect to x or y first) does not matter. This is
similar to Clairaut’s Theorem for mixed partial derivatives.
Indeed, if f is continuous on R, this is always true. Moreover, the (iterated) integral is equal
to the corresponding double integral.
The following theorem gives a practical way for evaluating a double integral by expressing
it as an iterated integral (in either order):

Theorem 9.2 (Fubini’s Theorem).


If f is continuous on the rectangle R = [a, b] × [c, d], then
\[ \iint_R f(x, y)\,dA = \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy. \]

More generally, this is true if we assume that f is bounded on R, f is discontinuous only on a
finite number of smooth curves, and the iterated integrals exist.
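
Fubini's Theorem can be illustrated numerically by evaluating the two iterated integrals one single integral at a time. The sketch below is an assumption-laden illustration rather than a proof: it uses a hand-written composite Simpson's rule (the helper `simpson` is hypothetical, not from the notes) and the integrand $x^2 y$ on $[0, 3] \times [1, 2]$ from Examples 9.1 and 9.2.

```python
def simpson(g, lo, hi, n=100):
    """Composite Simpson's rule for a single integral, with n (even) subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(lo + i * h)
    return s * h / 3


def f(x, y):
    return x * x * y


# Inner integral in y (x held fixed), then outer integral in x.
dy_then_dx = simpson(lambda x: simpson(lambda y: f(x, y), 1, 2), 0, 3)
# Inner integral in x (y held fixed), then outer integral in y.
dx_then_dy = simpson(lambda y: simpson(lambda x: f(x, y), 0, 3), 1, 2)
print(dy_then_dx, dx_then_dy)  # both should be close to 27/2
```

Each nested call mirrors the bracketed form of an iterated integral: the inner `simpson` plays the role of partial integration with one variable held fixed.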

Example 9.3. Evaluate


\[ \iint_R y \sin(xy)\,dA \]

where R = [1, 2] × [0, π].

Solution 1. Using Fubini’s Theorem, let’s integrate first with respect to x.

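\[
\begin{aligned}
\iint_R y \sin(xy)\,dA &= \int_0^\pi \int_1^2 y \sin(xy)\,dx\,dy \\
&= \int_0^\pi \big[ -\cos(xy) \big]_{x=1}^{x=2}\,dy \\
&= \int_0^\pi (-\cos 2y + \cos y)\,dy \\
&= \left[ -\frac{1}{2} \sin 2y + \sin y \right]_0^\pi = 0.
\end{aligned}
\]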

Solution 2. By Fubini’s Theorem, we should get the same answer if we first integrate with
respect to y:
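\[ \iint_R y \sin(xy)\,dA = \int_1^2 \int_0^\pi y \sin(xy)\,dy\,dx. \]

We first need to compute

\[ \int_0^\pi y \sin(xy)\,dy. \]

Using integration by parts,

\[
\begin{aligned}
\int_0^\pi y \sin(xy)\,dy &= \left[ -\frac{y \cos(xy)}{x} \right]_{y=0}^{y=\pi} - \int_0^\pi \left( -\frac{\cos(xy)}{x} \right) dy \\
&= -\frac{\pi \cos \pi x}{x} + \frac{1}{x^2} \big[ \sin xy \big]_{y=0}^{y=\pi} \\
&= -\frac{\pi \cos \pi x}{x} + \frac{\sin \pi x}{x^2}.
\end{aligned}
\]

Now, integrating the first term by parts, we have

\[ \int \left( -\frac{\pi \cos \pi x}{x} \right) dx = -\frac{\sin \pi x}{x} - \int \frac{\sin \pi x}{x^2}\,dx. \]
So
\[ \int \left( -\frac{\pi \cos \pi x}{x} + \frac{\sin \pi x}{x^2} \right) dx = -\frac{\sin \pi x}{x}. \]
Hence

\[ \int_1^2 \int_0^\pi y \sin(xy)\,dy\,dx = \left[ -\frac{\sin \pi x}{x} \right]_1^2 = -\frac{\sin 2\pi}{2} + \sin \pi = 0. \]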


We now make some observations about the last example.
• Though both solutions give the same answer, the first solution is much easier than the
second one. Therefore, when we evaluate double integrals, it is wise to choose the right
order of integration that yields simpler calculations.
• Consider the surface z = y sin(xy).

This function takes both positive and negative values on R = [1, 2] × [0, π].
For such a function, $\iint_R f(x, y)\,dA$ is a difference of volumes: $V_1 - V_2$, where $V_1$ is the volume
above R and below the graph of f and $V_2$ is the volume below R and above the graph.
The fact that the integral is 0 in the preceding example means that these two volumes V1
and V2 are equal.

9.4 A Special Case
Sometimes f (x, y) can be factored as the product of a function of x only and a function of y
only. That is

f (x, y) = g(x)h(y).
Then Fubini’s Theorem gives

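\[
\begin{aligned}
\iint_R f(x, y)\,dA &= \int_c^d \int_a^b g(x) h(y)\,dx\,dy \\
&= \int_c^d \left[ \int_a^b g(x) h(y)\,dx \right] dy.
\end{aligned}
\]

In the inner integral, y is a constant, so h(y) is a constant and we can write

\[
\begin{aligned}
\iint_R f(x, y)\,dA &= \int_c^d \left[ h(y) \int_a^b g(x)\,dx \right] dy \\
&= \left( \int_a^b g(x)\,dx \right) \left( \int_c^d h(y)\,dy \right)
\end{aligned}
\]
since $\int_a^b g(x)\,dx$ is a constant.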
To summarize:

Theorem 9.3 (A Special Case).


\[ \iint_R g(x) h(y)\,dA = \left( \int_a^b g(x)\,dx \right) \left( \int_c^d h(y)\,dy \right) \]

where R = [a, b] × [c, d].

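Example 9.4. Evaluate
\[ \iint_R \frac{2x}{y}\,dA \]
where $R = [3, 4] \times [1, 2]$.

Solution. By Theorem 9.3,

\[ \iint_R \frac{2x}{y}\,dA = \iint_R 2x \cdot \frac{1}{y}\,dA = \left( \int_3^4 2x\,dx \right) \cdot \left( \int_1^2 \frac{1}{y}\,dy \right) = \big[ x^2 \big]_3^4 \cdot \big[ \ln |y| \big]_1^2 = (4^2 - 3^2) \cdot (\ln 2 - \ln 1) = 7 \ln 2. \]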
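
As a sanity check on Example 9.4, the sketch below compares the product of the two single integrals with the iterated integral computed directly. It is only an illustration: the composite Simpson's rule helper `simpson` is a hypothetical numerical stand-in for the exact antiderivatives used in the solution.

```python
import math


def simpson(g, lo, hi, n=100):
    """Composite Simpson's rule for a single integral, with n (even) subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(lo + i * h)
    return s * h / 3


# Product of two single integrals, as in Theorem 9.3.
product = simpson(lambda x: 2 * x, 3, 4) * simpson(lambda y: 1 / y, 1, 2)
# The same double integral evaluated as an iterated integral.
iterated = simpson(lambda y: simpson(lambda x: 2 * x / y, 3, 4), 1, 2)
print(product, iterated, 7 * math.log(2))
```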

9.5 Double Integral over General Region

So far we have defined double integrals over domains which are rectangles. In this section,
we shall define double integrals over domains which are more general than rectangles. In
particular, they are regions which are bounded between two continuous curves. They are
called Type I and Type II regions respectively.

Definition 9.3. Type I Region


A plane region D is said to be of Type I if it lies between the graphs of two continuous functions
of x, that is,
D = {(x, y) : a ≤ x ≤ b, g1 (x) ≤ y ≤ g2 (x)}
where g1 (x) and g2 (x) are continuous on [a, b].

Some examples of Type I region:

Definition 9.4. Type II Region


A plane region D is said to be of Type II if it lies between the graphs of two continuous
functions of y, that is,

D = {(x, y) : c ≤ y ≤ d, h1 (y) ≤ x ≤ h2 (y)}

where h1 (y) and h2 (y) are continuous on [c, d].

Some examples of Type II region:

How do we compute the integral of f (x, y) over Type I region D?

Theorem 9.4. Double Integral over Type I Domain


If f is continuous on a Type I domain D such that

D = {(x, y) : a ≤ x ≤ b, g1 (x) ≤ y ≤ g2 (x)}

then
\[ \iint_D f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx. \]

Observe that the expression on the right-hand side of

\[ \iint_D f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx \]

is an iterated integral similar to the ones we have for rectangular regions, except that in the
inner integral, we regard x as being constant not only in $f(x, y)$ but also in the limits of
integration, $g_1(x)$ and $g_2(x)$.
Similarly, we have

Theorem 9.5. Double Integral over Type II Domain


If f is continuous on a Type II domain D such that

D = {(x, y) : c ≤ y ≤ d, h1 (y) ≤ x ≤ h2 (y)}

then
\[ \iint_D f(x, y)\,dA = \int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy. \]

Example 9.5. Evaluate $\iint_D (x + 2y)\,dA$ where D is the region bounded by the parabolas $y = 2x^2$
and $y = 1 + x^2$.

Solution.
Step 1. Identify the region.
Notice that the parabolas intersect when $2x^2 = 1 + x^2$, that is, $x = \pm 1$.
We note that D is a Type I region:

\[ D = \{(x, y) : -1 \le x \le 1,\ 2x^2 \le y \le 1 + x^2\}. \]

Step 2. Set up the iterated integral.


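Therefore,
\[ \iint_D (x + 2y)\,dA = \int_{-1}^{1} \int_{2x^2}^{1+x^2} (x + 2y)\,dy\,dx. \]

Step 3. Evaluate the inner integral.

\[
\begin{aligned}
\int_{2x^2}^{1+x^2} (x + 2y)\,dy &= \big[ xy + y^2 \big]_{y=2x^2}^{y=1+x^2} \\
&= x(1 + x^2) + (1 + x^2)^2 - x(2x^2) - (2x^2)^2 \\
&= -3x^4 - x^3 + 2x^2 + x + 1.
\end{aligned}
\]

Step 4. Complete the computation.

\[
\begin{aligned}
\iint_D (x + 2y)\,dA &= \int_{-1}^{1} (-3x^4 - x^3 + 2x^2 + x + 1)\,dx \\
&= \left[ -3\frac{x^5}{5} - \frac{x^4}{4} + 2\frac{x^3}{3} + \frac{x^2}{2} + x \right]_{-1}^{1} \\
&= \frac{32}{15}.
\end{aligned}
\]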
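
The four steps of Example 9.5 translate directly into a numerical routine for Type I regions, in which the inner limits depend on x. This is a minimal sketch under stated assumptions: `integrate_type1` and `simpson` are hypothetical helper names, and Simpson's rule stands in for exact antiderivatives.

```python
def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule for a single integral, with n (even) subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(lo + i * h)
    return s * h / 3


def integrate_type1(f, a, b, g1, g2, n=200):
    """Integrate f over D = {a <= x <= b, g1(x) <= y <= g2(x)}."""
    def inner(x):
        # x is held constant both in f and in the limits g1(x), g2(x).
        return simpson(lambda y: f(x, y), g1(x), g2(x), n)
    return simpson(inner, a, b, n)


value = integrate_type1(
    lambda x, y: x + 2 * y,      # integrand
    -1, 1,                       # outer limits in x
    lambda x: 2 * x * x,         # lower boundary y = 2x^2
    lambda x: 1 + x * x,         # upper boundary y = 1 + x^2
)
print(value, 32 / 15)
```

A Type II region can be handled the same way by swapping the roles of x and y.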


When we set up a double integral, it is helpful to draw a diagram.

For Type I region, it is helpful to draw a vertical arrow which starts at the lower
boundary y = g1 (x) and ends at the upper boundary y = g2 (x). This corresponds to
the inner integral.

For Type II region, it is helpful to draw a horizontal arrow which starts at the left
boundary x = h1 (y) and ends at the right boundary x = h2 (y). This corresponds to
the inner integral.
Example 9.6. Evaluate $\iint_D xy\,dA$ where D is the region bounded by the line $y = x - 1$ and the
parabola $y^2 = 2x + 6$.

Solution. The region D can be of Type I or II:

But we prefer D as Type II because as a Type I region, the lower boundary of D is more
complicated, in particular, it consists of two parts: one for −3 ≤ x ≤ −1 and another for
−1 ≤ x ≤ 5.
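Therefore, we let

\[ D = \left\{ (x, y) : -2 \le y \le 4,\ \tfrac{1}{2} y^2 - 3 \le x \le y + 1 \right\}. \]

So
\[ \iint_D xy\,dA = \int_{-2}^{4} \int_{\frac{1}{2} y^2 - 3}^{y+1} xy\,dx\,dy. \]

Let's first compute the inner integral:

\[
\begin{aligned}
\int_{\frac{y^2}{2} - 3}^{y+1} xy\,dx &= \left[ y \cdot \frac{x^2}{2} \right]_{x = \frac{y^2}{2} - 3}^{x = y+1} \\
&= \frac{1}{2} y (y + 1)^2 - \frac{1}{2} y \left( \frac{y^2}{2} - 3 \right)^2 \\
&= \frac{1}{2} \left( -\frac{y^5}{4} + 4y^3 + 2y^2 - 8y \right).
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
\iint_D xy\,dA &= \int_{-2}^{4} \frac{1}{2} \left( -\frac{y^5}{4} + 4y^3 + 2y^2 - 8y \right) dy \\
&= \frac{1}{2} \left[ -\frac{y^6}{24} + y^4 + 2\frac{y^3}{3} - 4y^2 \right]_{-2}^{4} \\
&= 36.
\end{aligned}
\]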

Example 9.7. Find the volume of the tetrahedron T bounded by the planes x + 2y + z = 2, x = 2y,
x = 0 and z = 0.

Solution. For a question like this, it is wise to draw two diagrams:


• one for the solid (tetrahedron) T ,
• another for the domain D.
How do we start drawing?

There is no general rule; it depends on the problem.
Notice the plane x + 2y + z = 2 intersects the xy-plane in the line x + 2y = 2. (Set z = 0 in the
equation of the plane).
Together with the restrictions that the solid is bounded by z = 0 (above the xy-plane), x = 0
(the yz-plane) and the plane x = 2y, we see that T lies above the region D in the xy-plane
bounded by the lines:
• x = 2y,
• x + 2y = 2 (the intersection of the plane x + 2y + z = 2 and the plane z = 0),
• x = 0.

Notice that $(1, \frac{1}{2}, 0)$ and $(0, 1, 0)$ are two points on the plane $x + 2y + z = 2$.
There is another point on this plane: (0, 0, 2).
We can now draw the tetrahedron T as follows:

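So the solid with the required volume V lies under the graph $z = 2 - x - 2y$ and above

\[ D = \left\{ (x, y) : 0 \le x \le 1,\ \frac{x}{2} \le y \le 1 - \frac{x}{2} \right\}. \]
Therefore

\[
\begin{aligned}
V &= \iint_D (2 - x - 2y)\,dA \\
&= \int_0^1 \int_{x/2}^{1 - x/2} (2 - x - 2y)\,dy\,dx \\
&= \int_0^1 \big[ 2y - xy - y^2 \big]_{y = x/2}^{y = 1 - x/2}\,dx \\
&= \int_0^1 (x^2 - 2x + 1)\,dx \\
&= \left[ \frac{x^3}{3} - x^2 + x \right]_0^1 \\
&= \frac{1}{3}.
\end{aligned}
\]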

Example 9.8. Find the volume of the solid above the xy-plane and bounded by the graph of z =
y sin(xy) and xy-plane for 1 ≤ x ≤ 2 and 0 ≤ y ≤ π.

Ans. π/2.

Solution. First, for the rectangle [1, 2] × [0, π] in the xy-plane,

y sin(xy) = 0 ⇐⇒ y = 0 or sin(xy) = 0
⇐⇒ y = 0 or xy = π or (x, y) = (2, π).

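Therefore, the surface $z = y \sin(xy)$ intersects the interior of the rectangle $[1, 2] \times [0, \pi]$ in the
curve $y = \frac{\pi}{x}$. Denote the region bounded by the curve $y = \frac{\pi}{x}$ and the x-axis from $x = 1$ to
$x = 2$ by D. Note that D is a type I region given by
\[ D = \left\{ (x, y) : 1 \le x \le 2,\ 0 \le y \le \frac{\pi}{x} \right\}. \]
Note also that for points $(x, y) \neq (2, \pi)$ in the rectangle $[1, 2] \times [0, \pi]$ (so that $0 \le xy < 2\pi$), we
have
\[ z = y \sin(xy) \ge 0 \iff \sin(xy) \ge 0 \iff 0 \le xy \le \pi \iff 0 \le y \le \frac{\pi}{x} \iff (x, y) \text{ is in } D. \]

Thus the required volume V is given by

\[
\begin{aligned}
V = \iint_D y \sin(xy)\,dA &= \int_1^2 \int_0^{\pi/x} y \sin(xy)\,dy\,dx \\
&= \int_1^2 \left[ -\frac{y \cos(xy)}{x} + \frac{\sin xy}{x^2} \right]_{y=0}^{y=\pi/x} dx \\
&= \int_1^2 \left( -\frac{\pi \cos(\pi)}{x^2} + \frac{\sin(\pi)}{x^2} \right) dx \\
&= \int_1^2 \frac{\pi}{x^2}\,dx \\
&= \left[ -\frac{\pi}{x} \right]_1^2 = \frac{\pi}{2}.
\end{aligned}
\]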

Remark. (Interchanging the order of integration) When computing an iterated integral


over a domain which is of both type I and type II, it is sometimes easier to compute the
integral by interchanging the order of integration.

Example 9.9. Evaluate the iterated integral $I = \int_0^1 \int_x^1 \sin(y^2)\,dy\,dx$ by interchanging the
order of integration.
[Remark. It is difficult to compute the inner integral $\int_x^1 \sin(y^2)\,dy$.]

Ans: $\frac{1}{2}(1 - \cos 1)$.
Solution. The region of integration D is the triangular region bounded by the lines y = x, x =
0 and y = 1. Notice that D is a domain of both type I and type II.

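\[
\begin{aligned}
\int_0^1 \int_x^1 \sin(y^2)\,dy\,dx &= \iint_D \sin(y^2)\,dA = \int_0^1 \int_0^y \sin(y^2)\,dx\,dy \\
&= \int_0^1 \big[ x \sin(y^2) \big]_{x=0}^{x=y}\,dy = \int_0^1 y \sin(y^2)\,dy \\
&= \left[ -\frac{1}{2} \cos(y^2) \right]_0^1 = \frac{1}{2}(1 - \cos 1).
\end{aligned}
\]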
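
The effect of interchanging the order of integration in Example 9.9 can also be seen numerically. In the sketch below (an illustration only, using a hypothetical Simpson's rule helper named `simpson`), the original order needs a nested numerical integration, while the swapped order reduces to the single integral of $y \sin(y^2)$; both agree with $\frac{1}{2}(1 - \cos 1)$.

```python
import math


def simpson(g, lo, hi, n=100):
    """Composite Simpson's rule for a single integral, with n (even) subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(lo + i * h)
    return s * h / 3


# Original order: inner integral in y from x to 1, outer integral in x from 0 to 1.
original_order = simpson(lambda x: simpson(lambda y: math.sin(y * y), x, 1), 0, 1)
# Swapped order: the inner integral in x from 0 to y equals y*sin(y^2) exactly.
swapped_order = simpson(lambda y: y * math.sin(y * y), 0, 1)
exact = 0.5 * (1 - math.cos(1))
print(original_order, swapped_order, exact)
```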

9.6 Decomposing Domain into Smaller Domains


Double integrals are additive with respect to the domain: if D is the union of domains
D1 , . . . , Dn that do not overlap except possibly on boundary curves, then

Theorem 9.6. Additivity With Respect to Domain


\[ \iint_D f(x, y)\,dA = \iint_{D_1} f(x, y)\,dA + \cdots + \iint_{D_n} f(x, y)\,dA. \]

Additivity may be used to evaluate double integrals over a domain D which is neither of Type
I nor of Type II but can be decomposed into finitely many domains of Type I or II.
Example 9.10. Suppose we want to compute the double integral $\iint_D xy\,dA$, where D is the
shaded region bounded by the curve $y^2 = 2x$, and the lines $2x - 3y - 4 = 0$ and $3x + 8y - 6 = 0$ as
shown below.

[Figure: the region $D = D_1 \cup D_2$ in the xy-plane. It is bounded on the left by the parabola $x = \frac{y^2}{2}$ and on the right by the line $x = \frac{3y}{2} + 2$ above the x-axis (region $D_1$) and by the line $x = -\frac{8y}{3} + 2$ below the x-axis (region $D_2$). The parabola meets the two lines at $(8, 4)$ and $(18, -6)$ respectively.]

Notice that the region D is not a domain of type I or type II. Nonetheless, D is a union of two
domains of type II as follows:

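$D = D_1 \cup D_2$, where
\[ D_1 = \left\{ (x, y) : 0 \le y \le 4,\ \frac{y^2}{2} \le x \le \frac{3y}{2} + 2 \right\}, \]
\[ D_2 = \left\{ (x, y) : -6 \le y \le 0,\ \frac{y^2}{2} \le x \le -\frac{8y}{3} + 2 \right\}. \]
Then by Theorem 9.6, we have
\[
\begin{aligned}
\iint_D xy\,dA &= \iint_{D_1} xy\,dA + \iint_{D_2} xy\,dA \\
&= \int_0^4 \int_{\frac{y^2}{2}}^{\frac{3y}{2} + 2} xy\,dx\,dy + \int_{-6}^{0} \int_{\frac{y^2}{2}}^{-\frac{8y}{3} + 2} xy\,dx\,dy,
\end{aligned}
\]

and then the two iterated integrals in the last line can be computed readily.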

9.7 Properties of Double Integral


The following properties for double integral over D follow from the corresponding proper-
ties for double integrals over a rectangle region R:

Theorem 9.7.
\[ \iint_D [f(x, y) + g(x, y)]\,dA = \iint_D f(x, y)\,dA + \iint_D g(x, y)\,dA \]

Theorem 9.8.
\[ \iint_D c f(x, y)\,dA = c \iint_D f(x, y)\,dA \]

Theorem 9.9.
If $f(x, y) \ge g(x, y)$ for all $(x, y) \in D$ then
\[ \iint_D f(x, y)\,dA \ge \iint_D g(x, y)\,dA. \]

Example 9.11. (i) Let D be a domain in $\mathbb{R}^2$. Then by Theorem 9.7 and Theorem 9.8, we have
\[ \iint_D (x^2 + 2xy)\,dA = \iint_D x^2\,dA + 2 \iint_D xy\,dA. \]

(ii) Show that
\[ \iint_D (x^2 + y^2)\,dA \ge \iint_D 2xy\,dA. \]

Proof of (ii). For all $(x, y)$ in D, we have

\[ (x - y)^2 \ge 0 \implies x^2 - 2xy + y^2 \ge 0 \implies x^2 + y^2 \ge 2xy. \]

Thus by Theorem 9.9, we have $\iint_D (x^2 + y^2)\,dA \ge \iint_D 2xy\,dA$.


9.8 An Application – Finding Area


We can use double integral to compute area of a region D on the plane:

Theorem 9.10. Area of plane region
Let $f(x, y) = 1$ over a given region D. Then the area of D is
\[ A(D) = \iint_D 1\,dA. \]

Proof. By considering the constant function $f(x, y) = 1$ for all $(x, y)$ in D, we see that $\iint_D 1\,dA$
is the volume of the solid which is a cylinder whose base D has area A(D) and whose height is 1.
Another way of computing the volume of a cylinder is

area of base × height

which is
\[ A(D) \cdot 1 \]
in this case. So
\[ A(D) = \iint_D 1\,dA, \]
as required.


9.9 Double Integrals in Polar Coordinates

We have learned how to evaluate double integral over D where D is of the following type:
• rectangle;
• region of Type I or Type II.
Sometimes, the region D is not so easily described in terms of x and y coordinates.
Sometimes, such regions can be conveniently described using polar coordinates (r, θ). The
following figure shows the relationship between polar coordinates and rectangular
coordinates:

Instead of using (x, y), we note that any point on the xy-plane can be represented by an
ordered pair (r, θ) where
• r is the distance from the origin to the point
• θ is the angle from the positive x-axis to the straight line joining the origin and the point.
For example consider the following region given in terms of its polar coordinates:

Polar coordinates (r, θ) of a point are related to the rectangular coordinates (x, y) by the
following equations.

Theorem 9.11. Polar Coordinates Versus Rectangular Coordinates

\[ r^2 = x^2 + y^2, \qquad x = r \cos\theta, \qquad y = r \sin\theta. \]

Definition 9.5 (Polar Rectangle).


A polar rectangle is a region

R = {(r, θ) : a ≤ r ≤ b, α ≤ θ ≤ β}.

How do we compute
\[ \iint_R f(x, y)\,dA \]
where R is a polar rectangle?
Recall that for $\iint_R f(x, y)\,dA$ over the usual rectangle R, we can think of $dA = dx\,dy$ as the
area of the 'little rectangle' $\Delta A = \Delta x\,\Delta y$:
To compute $\iint_R f(x, y)\,dA$ over the polar rectangle R given by

\[ R = \{(r, \theta) : a \le r \le b,\ \alpha \le \theta \le \beta\}, \]

we partition R as follows:

Then we can think of $dA = r\,dr\,d\theta$ as the area of the 'little polar rectangle' $\Delta A \approx \Delta r \cdot r\Delta\theta$:

Note that the arc of the polar rectangle is $r\,\Delta\theta$, which depends on r.

Theorem 9.12. Change to Polar Coordinates in Double Integral
If f is continuous on a polar rectangle R given by

R = {(r, θ) : 0 ≤ a ≤ r ≤ b, α ≤ θ ≤ β}

where 0 ≤ β − α ≤ 2π, then


\[ \iint_R f(x, y)\,dA = \int_\alpha^\beta \int_a^b f(r \cos\theta, r \sin\theta)\, r\,dr\,d\theta. \]

The formula says that we convert from rectangular to polar coordinates in a double integral
by:

• writing x = r cos θ, y = r sin θ

• using the appropriate limits of integration for r and θ

• replacing dA by r dr dθ (do not forget the additional r in r dr dθ)
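
The three conversion steps above can be sketched in code. The function below is a hypothetical illustration (the names `polar_integral` and `simpson` are not from the notes): it substitutes x = r cos θ and y = r sin θ, integrates over the polar limits, and multiplies the integrand by the extra factor r. The sample integrand $3x + 4y^2$ on the half-annulus 1 ≤ r ≤ 2, 0 ≤ θ ≤ π is chosen only for the demonstration.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule for a single integral, with n (even) subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(lo + i * h)
    return s * h / 3


def polar_integral(f, a, b, alpha, beta, n=200):
    """Integrate f(x, y) over the polar rectangle a <= r <= b, alpha <= theta <= beta."""
    def inner(theta):
        # Substitute x = r cos(theta), y = r sin(theta); dA contributes the extra r.
        return simpson(
            lambda r: f(r * math.cos(theta), r * math.sin(theta)) * r, a, b, n
        )
    return simpson(inner, alpha, beta, n)


value = polar_integral(lambda x, y: 3 * x + 4 * y * y, 1, 2, 0, math.pi)
print(value, 15 * math.pi / 2)
```

Dropping the factor r from `inner` would give a wrong answer, which is a quick way to see why the extra r must not be forgotten.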


Example 9.12. Evaluate $\iint_R (3x + 4y^2)\,dA$ where R is the region in the upper half-plane bounded
by the circles $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$.

Solution. The region R is shown below:

So

R = {(r, θ) : 1 ≤ r ≤ 2, 0 ≤ θ ≤ π}.

Changing to polar coordinates for the double integral, we have

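\[
\begin{aligned}
\iint_R (3x + 4y^2)\,dA &= \int_0^\pi \int_1^2 (3r \cos\theta + 4r^2 \sin^2\theta)\, r\,dr\,d\theta \\
&= \int_0^\pi \int_1^2 (3r^2 \cos\theta + 4r^3 \sin^2\theta)\,dr\,d\theta \\
&= \int_0^\pi \big[ r^3 \cos\theta + r^4 \sin^2\theta \big]_{r=1}^{r=2}\,d\theta \\
&= \int_0^\pi (7 \cos\theta + 15 \sin^2\theta)\,d\theta \\
&= \int_0^\pi \left( 7 \cos\theta + \frac{15}{2}(1 - \cos 2\theta) \right) d\theta \\
&= \left[ 7 \sin\theta + \frac{15\theta}{2} - \frac{15}{4} \sin 2\theta \right]_0^\pi \\
&= \frac{15\pi}{2}.
\end{aligned}
\]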


Example 9.13. Find the volume of the solid bounded by the plane $z = 0$ and the paraboloid
$z = 1 - x^2 - y^2$.

Solution. Notice that the plane and the paraboloid intersect in the circle $x^2 + y^2 = 1$.
So the solid lies under the paraboloid and above the circular disk D given by $x^2 + y^2 \le 1$.

In polar coordinates, D is given by

\[ D = \{(r, \theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}. \]

Since $1 - x^2 - y^2 = 1 - r^2$, we have

\[
\begin{aligned}
\text{Volume} &= \iint_D (1 - x^2 - y^2)\,dA \\
&= \int_0^{2\pi} \int_0^1 (1 - r^2)\, r\,dr\,d\theta \\
&= \left( \int_0^{2\pi} d\theta \right) \left( \int_0^1 (r - r^3)\,dr \right) \\
&= 2\pi \left[ \frac{r^2}{2} - \frac{r^4}{4} \right]_0^1 \\
&= \frac{\pi}{2}.
\end{aligned}
\]

Exercise 9.1. Let R be the circular region bounded by the circle $x^2 + (y - 1)^2 = 1$. It is known that

\[ \iint_R \frac{dA}{(1 + 2x^2 + 2y^2)^2} = \frac{\pi}{a}, \]

where a is a positive integer. Determine the value of a.

[Hint: Use polar coordinates and evaluate the resulting integral by means of the substitution
t = tan θ].

Ans. 6.

9.10 Surface Area

Let f be a differentiable function of 2 variables defined on a domain D. We wish to find
the surface area of the graph of f over D. It is simply equal to $\iint_D dS$, where dS is the
differential of the surface area of the graph of f. Therefore we need to express dS in terms
of the differential dA of the domain. To do so, take any point $P'(x, y)$ in D and let P be the
corresponding point on the graph of f. Consider an increment dx along the x-direction and
an increment dy along the y-direction at the point $P'$. Thus $dA = |dx\,dy|$. These increments
sweep out an increment of surface area on the surface at P. The differential dS of this area
at P is given by the corresponding area on the tangent plane to the surface at P.

Let $\overrightarrow{PQ}$ be the vector on the tangent plane at P with x-component dx, and $\overrightarrow{PR}$ the vector
with y-component dy. Thus, $\overrightarrow{PQ} = \langle dx, 0, f_x(x, y)\,dx \rangle$ and $\overrightarrow{PR} = \langle 0, dy, f_y(x, y)\,dy \rangle$. The
area of the parallelogram spanned by $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ is the magnitude of the cross product
$\overrightarrow{PQ} \times \overrightarrow{PR}$.

\[
\overrightarrow{PQ} \times \overrightarrow{PR} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
dx & 0 & f_x\,dx \\
0 & dy & f_y\,dy
\end{vmatrix}
= \langle -f_x, -f_y, 1 \rangle\,dx\,dy.
\]

Therefore, $dS = |\langle -f_x, -f_y, 1 \rangle\,dx\,dy| = \sqrt{f_x^2 + f_y^2 + 1}\;dA$. Consequently,

\[ \text{Surface area} = \iint_D dS = \iint_D \sqrt{f_x^2 + f_y^2 + 1}\;dA. \]

Example 9.14. Find the area of the part of the paraboloid $z = x^2 + y^2$ that lies under the plane
$z = 9$.

Solution. The paraboloid lies above the circular disk

D = {(r, θ) | 0 ≤ θ ≤ 2π, 0 ≤ r ≤ 3}.

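The paraboloid is defined by $z = f(x, y)$, where $f(x, y) = x^2 + y^2$. Thus we have $f_x = 2x$, $f_y = 2y$.
Then
\[
\begin{aligned}
\text{Surface area} &= \iint_D \sqrt{f_x^2 + f_y^2 + 1}\;dA \\
&= \iint_D \sqrt{1 + 4(x^2 + y^2)}\;dA \\
&= \int_0^{2\pi} \int_0^3 \sqrt{1 + 4r^2}\; r\,dr\,d\theta \quad \text{(change to polar coordinates)} \\
&= 2\pi \left[ \frac{1}{12} (1 + 4r^2)^{\frac{3}{2}} \right]_0^3 \\
&= \frac{\pi}{6} \left( 37\sqrt{37} - 1 \right).
\end{aligned}
\]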
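
The surface area formula can be checked numerically for Example 9.14. The sketch below is an illustration under stated assumptions: `surface_area_over_disk` and `simpson` are hypothetical helper names, the partial derivatives are supplied by hand, and the domain is assumed to be a disk centered at the origin so that polar coordinates apply directly.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule for a single integral, with n (even) subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(lo + i * h)
    return s * h / 3


def surface_area_over_disk(fx, fy, radius, n=200):
    """Integrate sqrt(fx^2 + fy^2 + 1) over the disk r <= radius, in polar coordinates."""
    def inner(theta):
        def integrand(r):
            x, y = r * math.cos(theta), r * math.sin(theta)
            return math.sqrt(fx(x, y) ** 2 + fy(x, y) ** 2 + 1) * r  # dA = r dr dtheta
        return simpson(integrand, 0, radius, n)
    return simpson(inner, 0, 2 * math.pi, n)


# Paraboloid z = x^2 + y^2 under z = 9: fx = 2x, fy = 2y, disk of radius 3.
area = surface_area_over_disk(lambda x, y: 2 * x, lambda x, y: 2 * y, 3)
exact = math.pi / 6 * (37 * math.sqrt(37) - 1)
print(area, exact)
```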

Exercise 9.2. The surface area of the portion of the cylinder
\[ y^2 + z^2 = 1 \]
bounded by the planes $y = x + 2$ and $y = x - 2$ is equal to $\pi a$. Determine the value of a.

Ans. 8.
Exercise 9.3. Let $y = f(x)$ be a curve on the xy-plane, where $f'(x)$ is continuous and $f(x) \ge 0$ for
all $x \in [a, b]$. The curve $y = f(x)$, $x \in [a, b]$, situated on the xy-plane in $\mathbb{R}^3$, is rotated about the
x-axis through 360° to generate a surface S. Let D be the Type I region on the xy-plane bounded
between the curves $y = f(x)$ and $y = -f(x)$ for x from a to b. That is,
\[ D = \{(x, y) \mid -f(x) \le y \le f(x),\ a \le x \le b\}. \]

Show that the function $p(x, y)$ defined on D whose graph is the portion of S above the xy-plane is
given by $p(x, y) = \sqrt{f(x)^2 - y^2}$. Hence, or otherwise, show that the surface area of S is given by
\[ \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}\;dx. \]

Exercise 9.4. The curve $y = \sin x$, $x \in [0, \frac{\pi}{2}]$, situated on the xy-plane in $\mathbb{R}^3$, is rotated about the
x-axis through 360° to generate a surface S. Find the surface area of S.

Ans. $\pi\left[\sqrt{2} + \ln(1 + \sqrt{2})\right]$.

Chapter 10

Ordinary Differential Equations

Read Thomas’ Calculus, Chapter 16.

10.1 First Order Ordinary Differential Equations

Let y be a function of x. An equation involving x, y and at least one derivative of y is called
an ordinary differential equation (ODE). The order of an ODE is the order of the highest
derivative that occurs in the equation. We consider only first order ordinary differential
equations.

Separable ODE

A separable first order ODE is of the form

\[ \frac{dy}{dx} = f(x)\,g(y). \]

Separating the variables,

\[ \frac{1}{g(y)}\,dy = f(x)\,dx. \]

Integrating both sides,

\[ \int \frac{1}{g(y)}\,dy = \int f(x)\,dx + C. \]

Example 10.1. Solve $y' = (1 + y^2)e^x$.

Solution. First we note that the differential equation $\frac{dy}{dx} = (1 + y^2)e^x$ is separable. We separate
the variables to obtain $\frac{1}{1 + y^2}\,dy = e^x\,dx$. Thus $\int \frac{1}{1 + y^2}\,dy = \int e^x\,dx$. That is, $\tan^{-1} y = e^x + C$,
or $y = \tan(e^x + C)$.


Example 10.2. Experiments show that a radioactive substance decomposes at a rate proportional
to the amount present. Starting with 2 mg at certain time, say t = 0, what can be said about the
amount available at a later time?

Solution. Let y be the amount of substance in mg at time t in years. Then $\frac{dy}{dt} = -ky$, $y(0) = 2$,
where k is a positive constant. Thus $\frac{dy}{y} = -k\,dt$. Integrating, $\int \frac{dy}{y} = \int -k\,dt$. That is, $\ln|y| =
-kt + C$, or equivalently, $|y| = e^{-kt + C} = e^C e^{-kt}$. Therefore, $y = e^C e^{-kt}$ or $y = -e^C e^{-kt}$. In other
words, $y = Ae^{-kt}$, where A is a constant. As $y(0) = 2$, we have $2 = Ae^{-k \times 0} = A$. Consequently,
$y = 2e^{-kt}$.

Remark. How to find k? The value of k depends on the radioactive substance. Usually we
can calculate k by looking up the half-life of the substance in a chemistry table.
For example, suppose the half-life of the substance is T years. From the above solution, we know
$y = Ae^{-kt}$. Thus $\frac{A}{2} = Ae^{-kT}$. That is, $-\ln 2 = -kT$. From this we obtain $k = \frac{\ln 2}{T}$.

In the report ‘Stemming the tide 2020: The reality of the Fukushima radioactive water crisis’,
Greenpeace claimed that the contaminated water contained “dangerous levels of carbon-
14”, a radioactive substance that has the “potential to damage human DNA”.

Carbon-14 is unstable and has a half-life of 5730 ± 40 years.
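
The relation $k = \frac{\ln 2}{T}$ and the solution $y = Ae^{-kt}$ can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, the half-life 5730 years is the nominal carbon-14 value quoted above (ignoring the ±40 uncertainty), and the initial amount of 2 mg is borrowed from Example 10.2.

```python
import math


def decay_constant(half_life):
    """Decay constant k = ln(2) / T for a substance with half-life T."""
    return math.log(2) / half_life


def amount_remaining(initial, k, t):
    """Amount y(t) = A * exp(-k t) left after time t, starting from A."""
    return initial * math.exp(-k * t)


k = decay_constant(5730.0)                       # per year, carbon-14
after_one = amount_remaining(2.0, k, 5730.0)     # one half-life
after_two = amount_remaining(2.0, k, 11460.0)    # two half-lives
print(k, after_one, after_two)
```

After each half-life the remaining amount is halved, which is a direct consequence of the exponential form of the solution.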

Example 10.3. A copper ball is heated to 100 °C. At time t = 0, it is placed in water which is
maintained at 30 °C. At the end of 3 mins, the temperature of the ball is reduced to 70 °C. Find
the time at which the temperature of the ball is 31 °C.

[ Physical information: Experiments show that the rate of change of the temperature T of the
ball with respect to time t is proportional to the difference between T and the temperature of the
surrounding medium.

Also heat flows so rapidly in copper that at any time the temperature is practically the same at all
points of the ball.]

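Solution. Let $T$ be the temperature of the ball at time $t$. Then $\frac{dT}{dt} = k(T - 30)$, $T(0) = 100$, $T(3) = 70$. Thus $\int \frac{dT}{T - 30} = \int k\,dt$. That is, $\ln|T - 30| = kt + C$, or equivalently, $T - 30 = Ae^{kt}$.
$T(0) = 100 \Rightarrow 100 - 30 = Ae^{k \times 0} \Rightarrow A = 70$. Therefore, $T = 30 + 70e^{kt}$.

$T(3) = 70 \Rightarrow 70 = 30 + 70e^{3k} \Rightarrow 4 = 7e^{3k} \Rightarrow k = \frac{1}{3}\ln\frac{4}{7} = \frac{1}{3}(\ln 4 - \ln 7)$. Therefore, $T = 30 + 70e^{\frac{t}{3}(\ln 4 - \ln 7)}$. Then
\[
\begin{aligned}
T = 31 &\implies 31 = 30 + 70e^{\frac{t}{3}(\ln 4 - \ln 7)} \\
&\implies \frac{t}{3}(\ln 4 - \ln 7) = \ln\frac{1}{70} = -\ln 70 \\
&\implies t = \frac{3\ln 70}{\ln 7 - \ln 4} = 22.78 \text{ min}.
\end{aligned}
\]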
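
The arithmetic in Example 10.3 can be checked with a short script. This is only a sketch: the helper name `cooling_time` and its parameter names are chosen here for illustration, and it assumes the model $T = T_{env} + (T_0 - T_{env})e^{kt}$ derived above.

```python
import math

def cooling_time(T_env, T0, t1, T1, T_target):
    """Time at which T(t) = T_env + (T0 - T_env)*exp(k*t) reaches T_target,
    where k is fixed by one later measurement T(t1) = T1."""
    k = math.log((T1 - T_env) / (T0 - T_env)) / t1   # negative when cooling
    return math.log((T_target - T_env) / (T0 - T_env)) / k

# Copper ball: 100 degrees at t = 0, 70 degrees at t = 3 min, water at 30 degrees.
t = cooling_time(T_env=30, T0=100, t1=3, T1=70, T_target=31)
print(round(t, 2))   # 22.78
```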

Example 10.4. A skydiver together with his equipment has a combined mass of $m$ kg. After he jumps and the parachute opens at time $t = 0$, he is descending with velocity $v$ m/s at the moment when the time is $t$ s. The air resistance against his descending motion is known to be $bv^2$ N, where $b$ is a positive constant, and $v$ is his velocity at that moment. Show that the skydiver eventually approaches a terminal speed of $k \equiv \sqrt{\frac{mg}{b}}$ m/s, where $g = 9.81$ m/s$^2$ is the acceleration due to gravity.

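Solution. By Newton’s second law, we have $m\frac{dv}{dt} = mg - bv^2$.

Note that $k \equiv \sqrt{\frac{mg}{b}} \implies mg = bk^2$, and the equation can thus be rewritten as
\[
m\frac{dv}{dt} = bk^2 - bv^2 \implies \frac{dv}{dt} = -\frac{b}{m}(v^2 - k^2).
\]
Separating the variables, we get
\[
\begin{aligned}
\frac{dv}{v^2 - k^2} &= -\frac{b}{m}\,dt \\
\implies \frac{1}{2k}\left(\frac{1}{v - k} - \frac{1}{v + k}\right)dv &= -\frac{b}{m}\,dt \\
\implies \left(\frac{1}{v - k} - \frac{1}{v + k}\right)dv &= -\frac{2kb}{m}\,dt.
\end{aligned}
\]
Integrating, we get
\[
\begin{aligned}
\int \left(\frac{1}{v - k} - \frac{1}{v + k}\right)dv &= \int -\frac{2kb}{m}\,dt \\
\implies \ln|v - k| - \ln|v + k| &= -\frac{2kb}{m}t + C \\
\implies \ln\left|\frac{v - k}{v + k}\right| &= -\frac{2kb}{m}t + C \\
\implies \left|\frac{v - k}{v + k}\right| &= e^{C} \cdot e^{-\frac{2kb}{m}t} \\
\implies \frac{v - k}{v + k} &= Ae^{-\frac{2kb}{m}t},
\end{aligned}
\]
where $A$ is a constant ($= e^{C}$ or $-e^{C}$). Solving for $v$, we obtain
\[
v = \left(\frac{1 + Ae^{-\frac{2kb}{m}t}}{1 - Ae^{-\frac{2kb}{m}t}}\right)k.
\]
From this, we see that
\[
\lim_{t\to\infty} v = \frac{1 + 0}{1 - 0} \cdot k = k.
\]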
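
To see the approach to the terminal speed numerically, the sketch below evaluates the closed-form $v(t)$ obtained above. The values m = 100 kg, b = 30 and initial speed 20 m/s are hypothetical numbers picked only for illustration, and the constant $A$ is fixed by setting $t = 0$ in $(v - k)/(v + k) = Ae^{-2kbt/m}$.

```python
import math

def terminal_speed(m, b, g=9.81):
    """k = sqrt(m*g/b), the limiting speed."""
    return math.sqrt(m * g / b)

def velocity(t, v0, m, b, g=9.81):
    """Closed-form v(t) for m*dv/dt = m*g - b*v**2 with v(0) = v0 >= 0."""
    k = terminal_speed(m, b, g)
    A = (v0 - k) / (v0 + k)                      # value of (v-k)/(v+k) at t = 0
    decay = A * math.exp(-2 * k * b * t / m)
    return k * (1 + decay) / (1 - decay)

for t in (0, 0.5, 1, 2, 5):
    print(t, round(velocity(t, v0=20.0, m=100.0, b=30.0), 4))
print("terminal speed:", round(terminal_speed(100.0, 30.0), 4))
```

The printed speeds start at the initial value and settle quickly near $k$, as the limit computation predicts.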


Exercise 10.1. Solve the differential equation
\[
\frac{dy}{dx} = xe^{3x - 2y}.
\]

Ans: $\frac{1}{2}e^{2y} = \frac{1}{3}xe^{3x} - \frac{1}{9}e^{3x} + C$.

Exercise 10.2. A curve C that passes through the point (2, 1) is such that at any point (x, y) on
the curve,
\[
x^2\frac{dy}{dx} = y(x^3 + 4).
\]
Find the equation of the curve.

Ans: $y = e^{\frac{x^2}{2} - \frac{4}{x}}$.

10.2 Reduction to Separable Form


Certain first order differential equations are not separable, but they can be made separable
by a simple change of variables.

This holds for equations of the form
\[
y' = g\left(\frac{y}{x}\right),
\]
where $g$ is any function of $\frac{y}{x}$. Let $v = \frac{y}{x}$. Then $y = vx$ and $y' = v + xv'$. Then the equation $y' = g\left(\frac{y}{x}\right)$ can be written as $v + xv' = g(v)$, or equivalently $v' = \frac{g(v) - v}{x}$, which is separable. We can now solve for $v$, and then solve for $y$.

Example 10.5. Solve $2xyy' - y^2 + x^2 = 0$.

Solution. We may rewrite the equation as
\[
y' = \frac{y^2 - x^2}{2xy} \quad\text{or equivalently}\quad y' = \frac{-1 + \left(\frac{y}{x}\right)^2}{2\left(\frac{y}{x}\right)},
\]
where the right hand side is a function of $\frac{y}{x}$. Let $v = \frac{y}{x}$, so that $y' = (xv)' = v + xv'$. Then the equation can be written as
\[
v + xv' = \frac{-1 + v^2}{2v} \iff x\frac{dv}{dx} = \frac{-1 + v^2}{2v} - v = -\frac{1 + v^2}{2v}.
\]
Separating the variables, we get
\[
\frac{2v\,dv}{1 + v^2} = -\frac{dx}{x}.
\]
Integrating, we get
\[
\begin{aligned}
\int \frac{2v\,dv}{1 + v^2} &= \int -\frac{dx}{x} \\
\implies \ln|1 + v^2| &= -\ln|x| + C \\
\implies \ln|x(1 + v^2)| &= C \\
\implies |x(1 + v^2)| &= e^{C} \\
\implies x(1 + v^2) &= A,
\end{aligned}
\]
where $A$ is a constant ($= e^{C}$ or $-e^{C}$). Therefore, we have
\[
x\left(1 + \frac{y^2}{x^2}\right) = A, \quad\text{or equivalently,}\quad x^2 + y^2 = Ax.
\]


A differential equation of the form $y' = f(ax + by)$, where $f$ is continuous and $b \neq 0$, can
be solved by setting $u = ax + by$. (If $b = 0$, then the equation itself is separable.)

Example 10.6. Solve $(2x - 4y + 5)y' + x - 2y + 3 = 0$. (*)

Solution. Note that the equation can be rewritten as
\[
y' = \frac{-(x - 2y + 3)}{2x - 4y + 5} = \frac{-(x - 2y) - 3}{2(x - 2y) + 5},
\]
where the right hand side is a function of $x - 2y$.

Let $u = x - 2y$. Then $u' = 1 - 2y'$, and thus $y' = \frac{1 - u'}{2}$. Thus the equation becomes
\[
\frac{1 - u'}{2} = \frac{-u - 3}{2u + 5},
\]
which gives
\[
u' = \frac{4u + 11}{2u + 5} \iff \frac{du}{dx} = \frac{4u + 11}{2u + 5}.
\]
Separating the variables, we get
\[
\frac{2u + 5}{4u + 11}\,du = dx,
\]
which gives
\[
\frac{\frac{1}{2}(4u + 11) - \frac{1}{2}}{4u + 11}\,du = dx \iff \left(1 - \frac{1}{4u + 11}\right)du = 2\,dx.
\]
Integrating, we get
\[
\begin{aligned}
&\int \left(1 - \frac{1}{4u + 11}\right)du = \int 2\,dx \\
\implies\ & u - \frac{1}{4}\ln|4u + 11| = 2x + C_1 \\
\implies\ & 4u - \ln|4u + 11| = 8x + 4C_1 \\
\implies\ & 4x - 8y - \ln|4x - 8y + 11| = 8x + 4C_1 \quad (\text{since } u = x - 2y) \\
\implies\ & 4x + 8y + \ln|4x - 8y + 11| + C = 0, \quad \text{where } C = 4C_1.
\end{aligned}
\]

In the calculations above, the two expressions
\[
2u + 5 = 2x - 4y + 5 \quad\text{and}\quad 4u + 11 = 4x - 8y + 11
\]
have appeared in the denominators, and thus the equations
\[
2x - 4y + 5 = 0 \quad\text{and}\quad 4x - 8y + 11 = 0
\]
are possible solutions to the DE (*) that we may have missed out in our calculations.

Case 1: $2x - 4y + 5 = 0$. Then $y = \frac{2x + 5}{4}$ and $y' = \frac{1}{2}$. Substituting this into the left hand side of (*), we get
\[
(2x - 4y + 5)y' + x - 2y + 3 = 0 \cdot y' + x - 2 \cdot \frac{2x + 5}{4} + 3 = \frac{1}{2} \neq 0.
\]
Thus $2x - 4y + 5 = 0$ does not satisfy (*).

Case 2: $4x - 8y + 11 = 0$. Then $y = \frac{4x + 11}{8}$ and $y' = \frac{1}{2}$. Substituting this into the left hand side of (*), we get
\[
(2x - 4y + 5)y' + x - 2y + 3 = \left(2x - 4 \cdot \frac{4x + 11}{8} + 5\right) \cdot \frac{1}{2} + x - 2 \cdot \frac{4x + 11}{8} + 3 = \left(-\frac{11}{2} + 5\right) \cdot \frac{1}{2} + x - x - \frac{11}{4} + 3 = 0.
\]
Thus $4x - 8y + 11 = 0$ satisfies (*).

Thus the solutions to (*) are
\[
4x + 8y + \ln|4x - 8y + 11| + C = 0 \quad\text{and}\quad 4x - 8y + 11 = 0.
\]

Exercise 10.3. Solve the initial value problem $y' = \frac{y}{x} + \frac{2x^3\cos(x^2)}{y}$, $y(\sqrt{\pi}) = 0$.

Ans: $y = \pm x\sqrt{2\sin(x^2)}$.

Solution: Note that we may rewrite the differential equation as
\[
y' = \frac{y}{x} + \frac{2x^2\cos(x^2)}{\frac{y}{x}},
\]
where the right hand side is a function of $\frac{y}{x}$ and $x$. Let $v = \frac{y}{x}$. Then $y = xv$ and $y' = v + xv'$. Thus the equation becomes
\[
\begin{aligned}
v + xv' &= v + \frac{2x^2\cos(x^2)}{v} \\
\implies v' &= \frac{2x\cos(x^2)}{v} \\
\implies \frac{dv}{dx} &= \frac{2x\cos(x^2)}{v}.
\end{aligned}
\]
Separating the variables, we get
\[
v\,dv = 2x\cos(x^2)\,dx.
\]
Integrating, we get
\[
\begin{aligned}
\int v\,dv &= \int 2x\cos(x^2)\,dx \\
\implies \frac{1}{2}v^2 &= \sin(x^2) + C \\
\implies \frac{1}{2} \cdot \frac{y^2}{x^2} &= \sin(x^2) + C \\
\implies y^2 &= 2x^2(\sin(x^2) + C).
\end{aligned}
\]
Now
\[
y(\sqrt{\pi}) = 0 \implies 0 = 2\pi(\sin\pi + C) \implies C = 0.
\]
Consequently, the solution is $y^2 = 2x^2\sin(x^2)$, or $y = \pm x\sqrt{2\sin(x^2)}$.


Exercise 10.4. Solve $(x + 2y - 1) + 3(x + 2y)y' = 0$.

Ans: x + 3y + C = 3 ln |x + 2y + 2|, x + 2y + 2 = 0.

10.3 Linear First Order ODE


A linear first order ODE is of the form
\[
\frac{dy}{dx} + P(x)y = Q(x),
\]
where $P(x)$, $Q(x)$ are continuous functions. This is the standard form of a linear first order ODE. Note that the above ODE is separable if $P(x)$ is identically equal to $Q(x)$, since it then reads $\frac{dy}{dx} = P(x)(1 - y)$.
Let $I(x) = e^{\int P(x)\,dx}$. We call $I(x)$ an integrating factor. Multiplying both sides of the above ODE by $I(x)$, we get
\[
\frac{dy}{dx}e^{\int P(x)\,dx} + P(x)e^{\int P(x)\,dx}y = Q(x)e^{\int P(x)\,dx}.
\]
But
\[
\frac{dy}{dx}e^{\int P(x)\,dx} + P(x)e^{\int P(x)\,dx}y = \frac{d}{dx}\left(ye^{\int P(x)\,dx}\right),
\]
which can be shown by applying the product rule and the Fundamental Theorem of Calculus. Hence,
\[
\frac{d}{dx}\left(ye^{\int P(x)\,dx}\right) = Q(x)e^{\int P(x)\,dx}.
\]
Thus we have shown that
\[
\frac{d}{dx}\bigl(y \cdot I(x)\bigr) = Q(x) \cdot I(x).
\]
Integrating both sides gives
\[
y \cdot I(x) = \int Q(x) \cdot I(x)\,dx,
\]
from which the solution for $y$ can be obtained.

Example 10.7. Solve $xy' - 3y = x^2$, $x > 0$.

Solution. First we rewrite the DE in the standard form $y' - \frac{3}{x}y = x$. An integrating factor is $e^{\int -\frac{3}{x}\,dx} = e^{-3\ln x} = \frac{1}{x^3}$. Multiplying the DE (the standard form) by this integrating factor, we have $\left(y' - \frac{3}{x}y\right) \cdot \frac{1}{x^3} = x \cdot \frac{1}{x^3}$, which gives $\left(\frac{y}{x^3}\right)' = \frac{1}{x^2}$. Integrating, we have $\frac{y}{x^3} = -\frac{1}{x} + C$. That is,
$y = -x^2 + Cx^3$.
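
A quick way to gain confidence in a solution obtained with an integrating factor is to substitute it back into the original DE. The sketch below does this numerically for Example 10.7, approximating $y'$ by a central difference; the helper name `residual`, the step size and the sample points are arbitrary choices made here, not part of the notes.

```python
def residual(y, x, h=1e-5):
    """Left side minus right side of x*y' - 3*y = x**2, with y' approximated
    by a central difference of step h."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return x * dy - 3 * y(x) - x**2

# The family y = -x**2 + C*x**3 should give a residual near zero for any C.
for C in (-2.0, 0.0, 1.5):
    y = lambda x, C=C: -x**2 + C * x**3
    worst = max(abs(residual(y, x)) for x in (0.5, 1.0, 2.0))
    print(C, worst < 1e-6)
```

A function that is not a solution, such as $y = x^2$, leaves a residual of size $-2x^2$, so the check does distinguish solutions from non-solutions.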


Example 10.8. Solve $y' - y = e^{2x}$.

Solution. An integrating factor is $e^{\int -1\,dx} = e^{-x}$. Multiplying the DE by this integrating factor, we have $(y' - y) \cdot e^{-x} = e^{2x} \cdot e^{-x}$, which gives $(ye^{-x})' = e^{x}$. Integrating, we obtain $ye^{-x} = e^{x} + C$. That is, $y = e^{2x} + Ce^{x}$.

Exercise 10.5. Solve the differential equation
\[
(x + 1)^2\frac{dy}{dx} - (x + 1)y = 2, \quad x > -1.
\]
Ans: $y = -\frac{1}{x + 1} + C(x + 1)$.

Exercise 10.6. Solve the differential equation
\[
\frac{dy}{dx} = \frac{4 + y\sin x}{\cos x}, \quad -\frac{\pi}{2} < x < \frac{\pi}{2},
\]
given that $y = 6$ when $x = 0$.

Ans: $y = \frac{4x + 6}{\cos x}$.

Exercise 10.7. An object of mass $m$ is dropped from rest in a medium that offers a resistance proportional to the magnitude of the instantaneous velocity of the object. Let $x(t)$ be the displacement of the object measured vertically downward at time $t$ so that $x(0) = 0$. Show that
\[
x(t) = \frac{mg}{k}t + \frac{m^2g}{k^2}\left(e^{-\frac{k}{m}t} - 1\right),
\]
where $k$ is the (positive) constant of proportionality of the force of resistance of the medium.

[Set up the DE for the velocity first: $m\frac{dv}{dt} = mg - kv$.]

10.4 The Bernoulli Equation.
An ODE in the form
\[
y' + p(x)y = q(x)y^n,
\]
where $n \neq 0, 1$, is called the Bernoulli equation. The functions $p(x)$ and $q(x)$ are continuous
functions on an interval $J$.
Let $u = y^{1-n}$, so that $u' = (1 - n)y^{-n}y'$. Multiplying the Bernoulli equation by $(1 - n)y^{-n}$ and substituting, we get
\[
u' + (1 - n)p(x)u = (1 - n)q(x).
\]
This is a first order linear ODE.

Remark. (i) When n = 0 or 1, the Bernoulli equation itself is a first order linear ODE.
(ii) When n > 0, the constant zero function y(x) = 0 is automatically a solution of the
Bernoulli equation.

Example 10.9. Solve $y' + y = x^2y^2$.

Solution. This is a Bernoulli equation with $n = 2$. Let $z = y^{1-2} = y^{-1}$. Then $z' = -y^{-2}y'$, so
that $y' = -y^2z'$. Thus the given Bernoulli equation can be written as
\[
-y^2z' + y = x^2y^2 \iff z' - y^{-1} = -x^2 \iff z' - z = -x^2.
\]
This is a first order linear equation. Multiplying by the integrating factor $e^{\int -dx} = e^{-x}$, we
have
\[
(z' - z)e^{-x} = -x^2e^{-x},
\]
which gives
\[
(ze^{-x})' = -x^2e^{-x}.
\]
Integrating, we get $ze^{-x} = \int -x^2e^{-x}\,dx$.
Using integration by parts, we have
\[
\int -x^2e^{-x}\,dx = x^2e^{-x} + 2xe^{-x} + 2e^{-x} + C.
\]
Thus $z = e^{x}(x^2e^{-x} + 2xe^{-x} + 2e^{-x} + C) = x^2 + 2x + 2 + Ce^{x}$.

Therefore, $\frac{1}{y} = x^2 + 2x + 2 + Ce^{x}$, and thus $y = \frac{1}{x^2 + 2x + 2 + Ce^{x}}$.
Since n = 2 > 0, y = 0 is also a solution.
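
As a sanity check on Example 10.9, the sketch below substitutes the family $y = 1/(x^2 + 2x + 2 + Ce^x)$ back into $y' + y = x^2y^2$, using a central difference for $y'$. The function names, the step size and the sample values of $C$ and $x$ are illustrative choices; $C \ge 0$ is used so that the denominator stays positive.

```python
import math

def bernoulli_solution(C):
    """Return y(x) = 1 / (x**2 + 2*x + 2 + C*exp(x)) for a given constant C."""
    return lambda x: 1.0 / (x**2 + 2 * x + 2 + C * math.exp(x))

def bernoulli_residual(y, x, h=1e-5):
    """y' + y - x**2 * y**2, with y' approximated by a central difference."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return dy + y(x) - x**2 * y(x) ** 2

for C in (0.0, 1.0, 3.0):
    y = bernoulli_solution(C)
    print(C, max(abs(bernoulli_residual(y, x)) for x in (-1.0, 0.0, 1.0, 2.0)))
```

The printed residuals should be at the level of the finite-difference error, far below the size of the individual terms.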

Exercise 10.8. Solve $xy' + y = x^4y^3$.

Ans: $\frac{1}{y^2} = -x^4 + cx^2$, or $y = 0$.

10.5 Applications of ODE
Example 10.10. At time t = 0, a tank contains 20 kg of salt dissolved in 100 litres of water.
Assume that water containing 1/4 kg of salt per litre is entering the tank at the rate of 3 litres per
min, and the well-stirred solution is leaving the tank at the same rate. Find the amount of salt at
any time t.

Solution. First note that the volume of the solution remains constant at 100 litres. Let
$Q$ be the amount of salt in kg at time $t$. The concentration of salt in the solution is $Q/100$ kg
per litre. Suppose at time $t + dt$, the amount of salt is $Q + dQ$. Then
\[
dQ = \text{salt input} - \text{salt output} = 3 \times \frac{1}{4} \times dt - 3 \times \frac{Q}{100} \times dt.
\]
Thus
\[
\frac{dQ}{dt} = \frac{3}{4} - \frac{3Q}{100}.
\]
That is,
\[
\frac{dQ}{dt} = -\frac{3}{100}(Q - 25).
\]
The general solution to this first order linear DE is $Q = 25 + Ce^{-\frac{3t}{100}}$. Since $Q(0) = 20$, we have
$20 = 25 + C$ so that $C = -5$. Consequently, $Q = 25 - 5e^{-\frac{3t}{100}}$.
Note that $\lim_{t\to\infty} Q(t) = 25$. Thus after a sufficiently long time, the salt concentration will approach 25 kg per 100 litres.
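
The mixing model above can be evaluated directly. The sketch below encodes the solution $Q(t) = Q_\infty + (Q_0 - Q_\infty)e^{-rt/V}$ with the data of Example 10.10 as default arguments; the function and parameter names are introduced here only for illustration.

```python
import math

def salt(t, Q0=20.0, rate=3.0, volume=100.0, conc_in=0.25):
    """Amount of salt (kg) at time t (min) for equal inflow and outflow rates.

    rate is in litres per minute, conc_in in kg per litre."""
    Q_inf = conc_in * volume                 # equilibrium amount, 25 kg here
    return Q_inf + (Q0 - Q_inf) * math.exp(-rate * t / volume)

for t in (0, 10, 50, 100, 300):
    print(t, round(salt(t), 3))
```

The values rise from 20 kg towards 25 kg, matching the limit computed in the solution.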


Example 10.11. A body was found at a crime scene. You are a member of the CSI team and you
arrived at the crime scene at 8AM. Immediately upon arrival, you took the temperature of the
victim and found that it was 26°C. At 9AM, you took the temperature of the victim again and
found that it was 24°C. You estimate that the victim’s temperature was 37°C just before death
and that the temperature at the crime scene stayed approximately constant at 21°C. What is your
estimate on the time of death?

Remark: Newton’s law of cooling states that the rate of cooling of an object is proportional to the
difference in temperature between the object and its surroundings.

Solution. Set time $t = 0$ at 8AM, where $t$ is measured in hours. Let $T$ be the temperature
of the body at time $t$. By Newton’s law of cooling, we have $\frac{dT}{dt} = k(T - 21)$, where $k$ is a
constant. The general solution is $T = 21 + Ae^{kt}$. As $T(0) = 26$, we have $26 = 21 + A$ so that
$A = 5$. Therefore, $T = 21 + 5e^{kt}$. At 9AM, that is 1 hour later, $T(1) = 24$. Thus $24 = 21 + 5e^{k}$ so
that $k = \ln\left(\frac{3}{5}\right)$. Hence
\[
T = 21 + 5e^{t \ln\left(\frac{3}{5}\right)} = 21 + 5e^{\ln\left(\left(\frac{3}{5}\right)^t\right)} = 21 + 5\left(\frac{3}{5}\right)^t.
\]
If $\tau$ is the time of death, then $T(\tau) = 37$. Therefore,
\[
37 = 21 + 5\left(\frac{3}{5}\right)^{\tau}.
\]
That is, $\frac{16}{5} = \left(\frac{3}{5}\right)^{\tau} \iff \ln\left(\frac{16}{5}\right) = \tau\ln\left(\frac{3}{5}\right) \iff \tau = \ln\left(\frac{16}{5}\right)/\ln\left(\frac{3}{5}\right) = -2.277$ hours, that is, about 2 hours 17 minutes before 8AM. Thus the time of death is about 5:43AM.
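
The estimate in Example 10.11 can be reproduced in a few lines. In this sketch the function name `hours_since_first_reading` is made up, and the calendar date passed to `datetime` is arbitrary since only the clock time matters.

```python
import math
from datetime import datetime, timedelta

def hours_since_first_reading(T_env, T_first, T_second, gap_hours, T_alive):
    """Time (in hours, relative to the first reading) at which the body
    temperature equalled T_alive under Newton's law of cooling."""
    k = math.log((T_second - T_env) / (T_first - T_env)) / gap_hours
    return math.log((T_alive - T_env) / (T_first - T_env)) / k

tau = hours_since_first_reading(T_env=21, T_first=26, T_second=24,
                                gap_hours=1, T_alive=37)
first_reading = datetime(2024, 1, 1, 8, 0)     # 8AM on an arbitrary day
death = first_reading + timedelta(hours=tau)
print(round(tau, 3), death.strftime("%H:%M"))  # -2.277 05:43
```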

Exercise 10.9. Jurong Lake has a volume of 700000 m$^3$. At time t = 0, the government starts
a water cleaning process so that only fresh clean water flows into the lake. After 5 years, it is found
that the pollution in the lake is reduced by 50%. If fresh water flows into the lake at a rate of r
cubic metres per year and lake water flows out to the sea at the same rate, what is the value of r
correct to the nearest thousand?
Ans: 97000.
Exercise 10.10. Newton’s law of cooling states that the rate of cooling of an object is proportional
to the difference in temperature between the object and its surroundings. If an object is kept in an
environment whose temperature is kept constant at 15°C and the object takes 20 minutes to cool
from 95°C to 55°C, determine how much longer it will take for the object to cool down to 25°C.

Ans: 40 mins more.


Exercise 10.11. In a chemical reaction, the rate at which the mass $m$ (in grams) of a chemical
compound changes at time $t$ (in seconds) is proportional to $m^2 - 9m + 18$ $(0 < m < 3)$. Initially ($t = 0$),
we assume $m = 0$. After 1 second, the mass of the chemical compound has increased to 2 g. Write
down a differential equation in $m$ and $t$, and show that
\[
\frac{m - 6}{m - 3} = 2^{t+1}.
\]
Find the exact mass of the chemical compound after 2 seconds.

Ans: 18/7 grams.

Chapter 11

More on ODE

Read Thomas’ Calculus, Chapter 16.

Remark: Chapter 11 will be excluded from the Final Exam.

11.1 Euler’s Method

Remark: Section 11.1 will be excluded from the Final Exam.

Not all first order ODEs can be solved explicitly in closed form. In that case, we have
to rely on numerical solutions. In this section, we introduce Euler’s method, which is a
numerical method for approximating the solution of a first order ODE. Given a differential equation $\frac{dy}{dx} = f(x, y)$ and an initial condition $y(x_0) = y_0$, we can approximate the solution $y = y(x)$ by its
linearization
\[
L(x) = y(x_0) + y'(x_0)(x - x_0) \quad\text{or}\quad L(x) = y_0 + f(x_0, y_0)(x - x_0).
\]
The linear function $L(x)$ gives a good approximation to the solution in a short interval about
$x_0$. The idea of Euler’s method is to put together a sequence of such linearizations in a
successive manner to approximate the solution curve over a longer interval.

First we know that the point $(x_0, y_0)$ lies on the solution curve. Consider a small increment
from $x_0$ to $x_1 \equiv x_0 + h$. The graph of $L(x)$ is the tangent line with slope $f(x_0, y_0)$ to the solution
curve $y = y(x)$ at the point $(x_0, y_0)$. So if $h$ is small, $y_1 \equiv L(x_1)$ is a good approximation to $y(x_1)$.
In other words, the point $(x_1, y_1)$ is close to the solution curve $y = y(x)$.

Using the point (x1 , y1 ) and the slope f (x1 , y1 ) of the solution curve through (x1 , y1 ), we take
a second step. Setting x2 = x1 + h, we use the linearization of the solution curve through
(x1 , y1 ) to calculate y2 = y1 + f (x1 , y1 )h.

This gives the next approximation (x2 , y2 ) to the value along the solution curve y = y(x).
Continuing in this way, we take a third step from the point (x2 , y2 ) with slope f (x2 , y2 ) to
obtain the third approximation y3 = y2 + f (x2 , y2 )h, and so on.

In other words, we are building an approximation to one of the solutions by following the
direction of the slope field of the differential equation.

The following steps summarize Euler’s method. Suppose we wish to approximate the solution over the interval $[a, b]$. Choose an integer $n$ as the number of steps. Let $h = \frac{b - a}{n}$. Let
\[
\begin{aligned}
x_0 &= a \\
x_1 &= x_0 + h \\
x_2 &= x_1 + h \\
&\ \,\vdots \\
b = x_n &= x_{n-1} + h.
\end{aligned}
\]
Then calculate the approximations to the solution as follows.
\[
\begin{aligned}
y_1 &= y_0 + f(x_0, y_0)h \\
y_2 &= y_1 + f(x_1, y_1)h \\
&\ \,\vdots \\
y_n &= y_{n-1} + f(x_{n-1}, y_{n-1})h.
\end{aligned}
\]
The polygonal curve joining the points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$ successively is an approximation to the solution curve of the DE $y' = f(x, y)$ through the point $(x_0, y_0)$.
Example 11.1. Use Euler’s method to solve
\[
y' = 1 + y, \quad y(0) = 1,
\]
on the interval $[0, 1]$ starting at $x_0 = 0$ by taking $h = 0.1$. Find the approximate value of $y(1)$ and
compare it with the exact value.

Solution. Taking $n = 10$ and $h = 0.1$, the result is tabulated in the following table. The exact
solution to the DE is $y = 2e^{x} - 1$. Thus the exact value at $x = 1$ is $y(1) = 2e - 1 = 4.4366$. The
approximate value is 4.1875.

Euler solution of $y' = 1 + y$, $y(0) = 1$, $h = 0.1$

x      y (Euler)   y (Exact)   Error
0      1           1           0
0.1    1.2         1.2103      0.0103
0.2    1.42        1.4428      0.0228
0.3    1.662       1.6997      0.0377
0.4    1.9282      1.9836      0.0554
0.5    2.221       2.2974      0.0764
0.6    2.5431      2.6442      0.1011
0.7    2.8974      3.0275      0.1301
0.8    3.2872      3.4511      0.1639
0.9    3.7159      3.9192      0.2033
1      4.1875      4.4366      0.2491
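
The steps of Euler’s method translate almost line by line into code. The following is a minimal sketch (the function name `euler` and its argument order are choices made here) that regenerates the table for $y' = 1 + y$, $y(0) = 1$, $h = 0.1$ and compares it with the exact solution $y = 2e^x - 1$.

```python
import math

def euler(f, x0, y0, h, n):
    """Return lists xs, ys with y_{i+1} = y_i + f(x_i, y_i)*h for n steps."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))   # follow the tangent line
        xs.append(xs[-1] + h)
    return xs, ys

xs, ys = euler(lambda x, y: 1 + y, x0=0.0, y0=1.0, h=0.1, n=10)
print("x     Euler    Exact    Error")
for x, y in zip(xs, ys):
    exact = 2 * math.exp(x) - 1
    print(f"{x:.1f}  {y:7.4f}  {exact:7.4f}  {exact - y:7.4f}")
```

Halving $h$ (and doubling $n$) roughly halves the error at $x = 1$, which reflects the first order accuracy of the method.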

Exercise 11.1. Use Euler’s method to calculate the first three approximations to the initial value
problem:
\[
y' = 2xy + 2y, \quad y(0) = 3,
\]
by taking $h = 0.2$.

Ans: y1 = 4.2, y2 = 6.216, y3 = 9.697.

Remark. In the film Hidden Figures, Katherine Goble resorts to Euler’s method in calculat-
ing the re-entry of astronaut John Glenn from Earth orbit.

11.2 2nd Order Linear Equations with Constant Coefficients

Remark: Section 11.2 will be excluded from the Final Exam.

Let us begin with the second order homogeneous linear equation with constant coefficients
\[
y'' + ay' + by = 0, \tag{11.1}
\]
where $a$ and $b$ are real constants. We look for a solution of the form $y = e^{\lambda x}$. Plugging into
(11.1), we find that $e^{\lambda x}$ is a solution of (11.1) if and only if
\[
\lambda^2 + a\lambda + b = 0. \tag{11.2}
\]

(11.2) is called the auxiliary equation or characteristic equation of (11.1). The roots of (11.2)
are called characteristic values (or eigenvalues):
\[
\lambda_1 = \frac{1}{2}\left(-a + \sqrt{a^2 - 4b}\right), \qquad \lambda_2 = \frac{1}{2}\left(-a - \sqrt{a^2 - 4b}\right).
\]

1. If $a^2 - 4b > 0$, (11.2) has two distinct real roots $\lambda_1, \lambda_2$, and the general solution of (11.1)
is
\[
y = c_1e^{\lambda_1 x} + c_2e^{\lambda_2 x}.
\]

2. If $a^2 - 4b = 0$, (11.2) has one real root $\lambda$ (we may say that (11.2) has two equal roots
$\lambda_1 = \lambda_2$). The general solution of (11.1) is
\[
y = c_1e^{\lambda x} + c_2xe^{\lambda x}.
\]

3. If $a^2 - 4b < 0$, (11.2) has a pair of complex conjugate roots
\[
\lambda_1 = \alpha + i\beta, \quad \lambda_2 = \alpha - i\beta.
\]
The general solution of (11.1) is
\[
y = c_1e^{\alpha x}\cos(\beta x) + c_2e^{\alpha x}\sin(\beta x).
\]
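
The three cases can be summarised in a small helper that inspects the discriminant $a^2 - 4b$. This is an illustrative sketch only: the function name and the returned labels are invented here, and the exact test for a zero discriminant is reliable only when $a$ and $b$ are such that $a^2 - 4b$ is computed without rounding error.

```python
import math

def characteristic_roots(a, b):
    """Classify the roots of lambda**2 + a*lambda + b = 0.

    Returns (kind, p, q):
      "distinct real": p, q are the roots lambda1, lambda2
      "repeated real": p == q is the double root
      "complex":       roots are p + iq and p - iq (p = alpha, q = beta > 0)
    """
    disc = a * a - 4 * b
    if disc > 0:
        r = math.sqrt(disc)
        return ("distinct real", (-a + r) / 2, (-a - r) / 2)
    if disc == 0:
        return ("repeated real", -a / 2, -a / 2)
    return ("complex", -a / 2, math.sqrt(-disc) / 2)

print(characteristic_roots(1, -2))    # y'' + y' - 2y = 0
print(characteristic_roots(-4, 4))    # y'' - 4y' + 4y = 0
print(characteristic_roots(-2, 10))   # y'' - 2y' + 10y = 0
```

The returned pair then selects which of the three general-solution formulas above applies.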

Example 11.2. Solve $y'' + y' - 2y = 0$, $y(0) = 4$, $y'(0) = -5$.

Solution. The characteristic values are $1, -2$. Thus the solution is
\[
y = e^{x} + 3e^{-2x}.
\]

Example 11.3. Solve $y'' - 4y' + 4y = 0$, $y(0) = 3$, $y'(0) = 1$.

Solution. The characteristic value is 2 with multiplicity 2. Thus the solution is
\[
y = (3 - 5x)e^{2x}.
\]

Example 11.4. Solve $y'' - 2y' + 10y = 0$.

Solution. The characteristic values are $1 + 3i$, $1 - 3i$. Thus the solution is
\[
y = e^{x}(c_1\cos 3x + c_2\sin 3x).
\]

11.3 Method of Undetermined Coefficients
Remark: Section 11.3 will be excluded from the Final Exam.

Consider the equation $y'' + ay' + by = f(x)$, where $a$ and $b$ are real constants. To solve this
non-homogeneous linear DE, we look for a particular solution $y_p$ of $y'' + ay' + by = f(x)$.

Then the general solution is the sum of the general solution $y_c$ of the associated homogeneous linear DE $y'' + ay' + by = 0$ and this particular solution $y_p$. That is,
\[
y = y_c + y_p.
\]

Case 1. $f(x) = P_n(x)e^{\alpha x}$, where $P_n(x)$ is a polynomial of degree $n \geq 0$.

We look for a particular solution in the form
\[
y = Q(x)e^{\alpha x},
\]
where $Q(x)$ is a polynomial. Plugging it into $y'' + ay' + by = f(x)$ we find
\[
Q'' + (2\alpha + a)Q' + (\alpha^2 + a\alpha + b)Q = P_n(x). \tag{11.3}
\]

Subcase 1.1. If $\alpha^2 + a\alpha + b \neq 0$, namely, $\alpha$ is not a root of the characteristic equation, we
choose $Q = R_n$, a polynomial of degree $n$, and
\[
y = R_n(x)e^{\alpha x}.
\]
The coefficients of $R_n$ can be determined by comparing the terms of the same power on the two
sides of (11.3). Note that in this case both sides of (11.3) are polynomials of degree $n$.

Subcase 1.2. If $\alpha^2 + a\alpha + b = 0$ but $2\alpha + a \neq 0$, namely, $\alpha$ is a simple root of the characteristic
equation, then (11.3) is reduced to
\[
Q'' + (2\alpha + a)Q' = P_n. \tag{11.4}
\]
We choose $Q$ to be a polynomial of degree $n + 1$. Since the constant term of $Q$ does not appear
in (11.4), we may choose $Q(x) = xR_n(x)$, where $R_n(x)$ is a polynomial of degree $n$. Thus
\[
y = xR_n(x)e^{\alpha x}.
\]

Subcase 1.3. If $\alpha^2 + a\alpha + b = 0$ and $2\alpha + a = 0$, namely, $\alpha$ is a root of the characteristic equation
with multiplicity 2, then (11.3) is reduced to
\[
Q'' = P_n. \tag{11.5}
\]
We choose $Q(x) = x^2R_n(x)$, where $R_n(x)$ is a polynomial of degree $n$. Thus
\[
y = x^2R_n(x)e^{\alpha x}.
\]

Example 11.5. Find the general solution of $y'' - y' - 2y = 4x^2$.

Solution. The homogeneous equation has $\lambda^2 - \lambda - 2 = 0$ as its characteristic equation with
roots $\lambda = 2, -1$.
Therefore the general solution of the associated homogeneous equation is $y = c_1e^{2x} + c_2e^{-x}$.
Note that $4x^2 = 4x^2e^{0x}$ and 0 is not a root of the characteristic equation. We can try a
particular solution of the form
\[
y_p = A + Bx + Cx^2.
\]
Substituting this into the equation, we have
\[
2C - (B + 2Cx) - 2(A + Bx + Cx^2) = 4x^2.
\]
Equating coefficients, we have
\[
\begin{aligned}
2C - B - 2A &= 0 \\
-2C - 2B &= 0 \\
-2C &= 4
\end{aligned}
\]
Thus $A = -3$, $B = 2$, $C = -2$, and $y_p = -3 + 2x - 2x^2$.

The general solution is
\[
y = c_1e^{2x} + c_2e^{-x} - 3 + 2x - 2x^2.
\]
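
The coefficient comparison in Example 11.5 is a triangular linear system, so it can be solved by back-substitution starting from the highest power. The sketch below does that and then substitutes $y_p$ back into the left hand side; the variable and function names are chosen here for illustration.

```python
# Equating coefficients in 2C - (B + 2Cx) - 2(A + Bx + Cx^2) = 4x^2:
#   x^2:  -2C          = 4
#   x^1:  -2C - 2B     = 0
#   x^0:   2C - B - 2A = 0
C = 4 / -2
B = -C                      # from -2C - 2B = 0
A = (2 * C - B) / 2         # from 2C - B - 2A = 0
print(A, B, C)              # -3.0 2.0 -2.0

def lhs(x):
    """y_p'' - y_p' - 2*y_p for y_p = A + B*x + C*x**2, using exact derivatives."""
    yp = A + B * x + C * x**2
    dyp = B + 2 * C * x
    d2yp = 2 * C
    return d2yp - dyp - 2 * yp

# The left hand side should reproduce 4x^2 at every sample point.
print(all(abs(lhs(x) - 4 * x**2) < 1e-12 for x in (-2.0, 0.0, 0.5, 3.0)))
```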


Example 11.6. Solve $y'' - 2y' + y = xe^{x}$.

Solution. The general solution of the associated homogeneous DE is $C_1e^{x} + C_2xe^{x}$.

Here $\alpha = 1$ is a double root of the characteristic equation $\lambda^2 - 2\lambda + 1 = 0$. Therefore, we try a
particular solution of the form $y = x^2(A + Bx)e^{x}$.
We have $y' = (Bx^3 + (A + 3B)x^2 + 2Ax)e^{x}$ and $y'' = (Bx^3 + (A + 6B)x^2 + (4A + 6B)x + 2A)e^{x}$.
Substituting these into the DE, we have $(2A + 6Bx)e^{x} = xe^{x}$. Thus $A = 0$ and $B = \frac{1}{6}$.
Consequently, the general solution is $y = C_1e^{x} + C_2xe^{x} + \frac{1}{6}x^3e^{x}$.


Case 2. $f(x) = P_n(x)e^{\alpha x}\cos(\beta x)$ or $f(x) = P_n(x)e^{\alpha x}\sin(\beta x)$, where $P_n(x)$ is a polynomial of
degree $n \geq 0$.
We first look for a solution of
\[
y'' + ay' + by = P_n(x)e^{(\alpha + i\beta)x}. \tag{11.6}
\]
Using the method in Case 1 we obtain a complex-valued solution
\[
z(x) = u(x) + iv(x),
\]
where $u(x) = \operatorname{Re}(z(x))$, $v(x) = \operatorname{Im}(z(x))$. Substituting $z(x) = u(x) + iv(x)$ into (11.6) and taking
the real and imaginary parts, we can show that $u(x) = \operatorname{Re}(z(x))$ is a solution of
\[
y'' + ay' + by = P_n(x)e^{\alpha x}\cos(\beta x), \tag{11.7}
\]
and $v(x) = \operatorname{Im}(z(x))$ is a solution of
\[
y'' + ay' + by = P_n(x)e^{\alpha x}\sin(\beta x). \tag{11.8}
\]

Example 11.7. Solve $y'' - 2y' + 2y = e^{x}\cos x$.

Solution. The characteristic equation is $\lambda^2 - 2\lambda + 2 = 0$ with roots $1 + i$ and $1 - i$.

The general solution of the associated homogeneous DE is
\[
y = c_1e^{x}\cos x + c_2e^{x}\sin x.
\]
Now consider the DE $y'' - 2y' + 2y = e^{(1+i)x}$. Let’s find a particular solution.

Since $1 + i$ is a root of the characteristic equation, we should try a particular solution of the
form $y = Axe^{(1+i)x}$.
Thus $y' = (A + A(1 + i)x)e^{(1+i)x}$, $y'' = (2A(1 + i) + A(1 + i)^2x)e^{(1+i)x}$.
Therefore,
\[
\begin{aligned}
y'' - 2y' + 2y &= \left[(2A(1 + i) + A(1 + i)^2x) - 2(A + A(1 + i)x) + 2Ax\right]e^{(1+i)x} \\
&= 2Aie^{(1+i)x}.
\end{aligned}
\]
From this, $1 = 2Ai$ or $A = -\frac{i}{2}$.

Thus a particular solution is given by $y = -\frac{i}{2}xe^{(1+i)x}$, or equivalently, $y = \frac{1}{2}xe^{x}\sin x - \frac{i}{2}xe^{x}\cos x$.
Taking the real part, $y_p = \frac{1}{2}xe^{x}\sin x$ is a particular solution of the given DE.
Consequently, the general solution is
\[
y = c_1e^{x}\cos x + c_2e^{x}\sin x + \frac{1}{2}xe^{x}\sin x.
\]
2


Remark. Alternatively, to solve (11.7) or (11.8), one can try a solution of the form
\[
Q_n(x)e^{\alpha x}\cos(\beta x) + R_n(x)e^{\alpha x}\sin(\beta x)
\]
if $\alpha + i\beta$ is not a root of $\lambda^2 + a\lambda + b = 0$, and
\[
xQ_n(x)e^{\alpha x}\cos(\beta x) + xR_n(x)e^{\alpha x}\sin(\beta x)
\]
if $\alpha + i\beta$ is a root of $\lambda^2 + a\lambda + b = 0$, where $Q_n$ and $R_n$ are polynomials of degree $n$.

11.4 Appendix: Malthus Model of Population
Remark: Section 11.4 will be excluded from the Final Exam.

The total population N(t) of a country or a colony is clearly a function of time. Although N(t)
should be integer valued and greater than 0, it is treated as a continuous and in fact differentiable
function of time, which is reasonable because its value is usually very large.
Given the population now, can one predict the future population?
Suppose B is a function giving the “per capita birth rate” in a given society, i.e. B is the
number of babies born per second, divided by the total population N of the country at
that moment. Note that B could be small in a big country and large in a small country - it
depends on whether there is a strong social pressure on people to get married and have kids.
Now B could depend on time (people might gradually come to realise that large families are
no fun, etc.) and it could depend on N. But suppose you don’t believe these things. Instead
suppose people will always have as many kids as they can, no matter what. Then B is a
constant. Thus

number of babies born in time interval dt = BN dt.

Similarly, let D be the death rate per capita. Again it could be a function of t (in case
the society has better medicine, fewer smokers etc) or N (overcrowding leads to famine or
disease). But if we assume that it is constant, then

number of deaths in time interval dt = DN dt.

So the change in N , denoted by dN within the time interval dt is

dN = number of births − number of deaths,

provided there is no emigration or immigration. Thus

dN = (B − D)N dt.

That is,
\[
\frac{dN}{dt} = (B - D)N = kN, \tag{1}
\]
where $k = B - D$.
This model of society was put forward by Thomas Malthus in 1798. Clearly Malthus was
assuming a socially static society in which human reproductive behaviour never changes
with time or overcrowding, poverty etc. What does Malthus’ model predict?
Suppose that the population now is $\hat{N}$ and let $t = 0$ now.
From (1), $\frac{dN}{dt} = kN \Rightarrow \int \frac{dN}{N} = \int k\,dt = kt + C \Rightarrow \ln(N) = kt + C \Rightarrow N(t) = Ae^{kt}$.

Since $\hat{N} = N(0) = A$, we get
\[
N(t) = \hat{N}e^{kt}. \tag{2}
\]

(1) is the standard Malthus model and (2) is its solution.

The population collapses if k < 0 (more deaths than births per capita), remains stable if (and
only if) k = 0, and explodes if k > 0 (more births than deaths). Malthus observed that the
population of Europe was increasing, so he predicted a catastrophic population explosion;
since the food supply could not be expanded so fast, this would be disastrous.
In fact, this didn’t happen in Europe, as many millions went to the US and many millions
died in wars. So Malthus’ model is not quite correct.
Malthus’ model can be improved. Note that Malthus’ model is interesting because it shows
that static behaviour patterns can lead to disaster. But precisely because the term ekt grows
so quickly, Malthus’ assumptions must eventually go wrong - obviously there is a limit to
the possible population. Eventually, if we don’t control B, then D will have to increase. So
we have to assume that D is a function of N .
Clearly, D must be an increasing function of N , but which function? The simplest possible
choice is
D = sN , (3)
where s is a constant.
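Now we want to solve
\[
\frac{dN}{dt} = BN - sN^2, \quad N(0) = \hat{N}.
\]
Rewrite the equation as
\[
\frac{dN}{dt} - BN = -sN^2.
\]
This is a Bernoulli equation.
Let $z = N^{1-2} = \frac{1}{N}$. Then $\frac{dz}{dt} = -\frac{1}{N^2}\frac{dN}{dt}$. Thus $-N^2\frac{dz}{dt} - BN = -sN^2$. That is, $\frac{dz}{dt} + Bz = s$, which is a
linear equation in $z$. An integrating factor is $e^{Bt}$. Thus
\[
\frac{dz}{dt} + Bz = s \iff \frac{d(ze^{Bt})}{dt} = se^{Bt} \iff ze^{Bt} = \frac{s}{B}e^{Bt} + C \iff z = \frac{s}{B} + Ce^{-Bt}.
\]
That is, $\frac{1}{N} = \frac{s}{B} + Ce^{-Bt}$.
Let $N_\infty = \frac{B}{s}$, which is the carrying capacity. Then $\frac{1}{N} = \frac{1}{N_\infty} + Ce^{-Bt}$.
Now $N(0) = \hat{N} \Rightarrow \frac{1}{\hat{N}} = \frac{1}{N_\infty} + C \Rightarrow C = \frac{1}{\hat{N}} - \frac{1}{N_\infty}$.
Thus
\[
\frac{1}{N} = \frac{1}{N_\infty} + \left(\frac{1}{\hat{N}} - \frac{1}{N_\infty}\right)e^{-Bt}.
\]
Rearranging,
\[
N = \frac{N_\infty}{1 + \left(\frac{N_\infty}{\hat{N}} - 1\right)e^{-Bt}}. \tag{4}
\]
Note that $\lim_{t\to\infty} N = N_\infty$, as $B > 0$. Also $N(t)$ is increasing if $N_\infty > \hat{N}$, and $N(t)$ is decreasing if
$N_\infty < \hat{N}$. Thus $\frac{dN}{dt} \neq 0$ if $N_\infty \neq \hat{N}$. (4) is the solution of the improved Malthus’ model.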
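
Formula (4) is easy to explore numerically. The sketch below evaluates it for hypothetical values $B = 10$, $s = 1/250$ (so that $N_\infty = 2500$) and two different starting populations, one below and one above the carrying capacity; the function name `logistic` and the sample numbers are illustrative choices.

```python
import math

def logistic(t, N0, B, s):
    """Solution (4) of dN/dt = B*N - s*N**2 with N(0) = N0 > 0."""
    N_inf = B / s                                   # carrying capacity
    return N_inf / (1 + (N_inf / N0 - 1) * math.exp(-B * t))

for N0 in (100.0, 4000.0):
    values = [round(logistic(t, N0, B=10.0, s=1 / 250), 1) for t in (0, 0.2, 0.5, 1, 2)]
    print(N0, values)
```

Starting below $N_\infty$ the values increase towards 2500, and starting above they decrease towards it, in line with the monotonicity remark above.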

Example 11.8. The growth of rabbits in your rabbit farm followed a logistic population model
with a birth rate per capita of 10 rabbits per rabbit per year. You observed that their number had
approached a logistic equilibrium population of 2500 rabbits. One day your friend Dr. Good
visited your farm and suggested that you try to mix some of his latest invention of Vitamin X
into your rabbit feed to boost the reproduction rate. You followed his suggestion and after a long
period of time, observed that the rabbit population had reached a new logistic equilibrium of 3000
rabbits. If the new rabbit birth rate per capita after Vitamin X was introduced was B rabbits per
rabbit per year, what is the value of B?

Solution. We have $\frac{10}{s} = 2500$ so that $s = \frac{1}{250}$. Thus $\frac{B}{s} = 3000 \Rightarrow B = \frac{3000}{250} = 12$.


Exercise 11.2. Suppose N∞ > 2N̂ . Show that there is a point of inflection on the graph of N at
t > 0.
