Advanced Calculus MAST10021 Guide

Calculus 2: Advanced

MAST10021

Volker Schlue
University of Melbourne

Semester 2, 2023
(version: November 21, 2023)
Contents

I. Numbers, Functions, and Graphs
    1. Numbers
        1.1. Natural numbers and induction
        1.2. Rational and real numbers
        1.3. Inequalities
        1.4. Absolute value
        1.5. The real number line and euclidean space
        Problems
    2. Functions and Graphs
        2.1. Functions
        2.2. Graphs
        2.3. Functions of two variables
        Problems
    Additional: Vectors, functions, and graphs
        Euclidean Spaces and Vectors
        A mathematical definition of functions and domains
        Mappings
        Figures in plane geometry
        Additional problems

II. Limits, and Continuity
    3. Limits
        Problems
    4. Continuous functions
        4.1. Definition of continuity
        4.2. Consequences of continuity
        Problems
    5. Theorems about continuity
        5.1. Global properties of continuous functions
        5.2. Roots of polynomial equations
        5.3. What about the proof?
        Problems
    Additional: Limits and Continuity
        Additional Problems
        Additional Problems
    6. Functions of two variables: Limits and Continuity
        Problems
    Additional: Limits and Continuity
        Functions of several variables
        Continuity of elementary functions of two variables

III. Differentiation
    7. Differentiation in one variable
        7.1. Differentiability in one variable
            7.1.1. Geometric interpretation of the derivative
            7.1.2. Higher derivatives
        7.2. Differentiation
        Problems
    8. Differentiation in two variables
        8.1. Partial derivatives
        8.2. Differentiable functions
        8.3. Gradient of a function
        8.4. Directional derivatives
        Problems
    Additional: Chain rule
        Functions of one variable
        Functions of several variables

IV. Mean value theorem
    9. L’Hôpital’s rule
        9.1. Mean Value Theorem
        9.2. L’Hôpital’s rule
        Problems
    10. Inverse functions
        Problems
    Additional: Critical points, and continuity of the inverse
        10.1. Critical points
        10.2. Continuity and differentiability of the inverse
        Additional Problems

V. Integration
    11. The fundamental theorem of Calculus
        Problems
    12. The “simplest” differential equations
    13. Logarithm and exponential function
        13.1. Logarithm
        13.2. Exponential function
        Problems
    [Link] of integration
        14.1. Integration by parts, and substitution
        14.2. Applications of the substitution formula
        Problems

VI. Ordinary differential equations
    15. First order linear differential equations
        15.1. Linear first order equations
        Problems
    [Link] differential equations
        Problems
    [Link] of first order equations
        17.1. Examples of separable equations
        17.2. Reductions to separable equations
        17.3. Loss of uniqueness
        Problems
    Additional: Isoclines and Homogeneity
    18. Linear differential equations of second order with constant coefficients
        18.1. Existence of solutions by inspection
        18.2. General form of the solutions
        Problems
    Additional: Uniqueness of solutions to the initial value problem
        Problems
    Additional: The space of solutions
        First order equations
        Second order equations

VII. Complex numbers and complex exponentials
    19. Complex numbers
        19.1. Imaginary numbers and quadratic equations
        19.2. Complex plane
        Problems
    Additional: Complex numbers
        Proof of the triangle inequality
        Consequences of de Moivre’s theorem
    20. Hyperbolic functions
        20.1. Basic properties of hyperbolic functions
        20.2. Inverse hyperbolic functions
        20.3. The catenary curve
        Problems
    [Link] order differential equations
        21.1. Homogeneous equation and complex exponentials
        21.2. Inhomogeneous equations
        21.3. Special types of the inhomogeneous terms
        Problems
    Additional: Complex functions, and power series
        22.1. Complex functions
        22.2. Complex power series
        22.3. Applications to differentiation and integration
        Additional Problems
    Additional: Catenary problem
    Additional: Variation of constants
        Problems

[Link] polynomials
    [Link] by polynomial functions
        Problems
    [Link]’s theorem
        Problems
    Additional: Taylor’s theorem
        Proof of Taylor’s theorem with Lagrange remainder
        Euler’s number
        Local extrema
        Additional Problems
    [Link] Sequences
        Problems

Preface
I have written these lecture notes for the subject Calculus 2: Advanced which was
introduced at the University of Melbourne in 2020. Its ambition is to teach all the topics
covered in the “standard” Calculus 2 subject, but with more emphasis on concepts, and
proofs. Even though the name would suggest it, there is no Calculus 1: Advanced
subject. For most students, this is the first encounter with any proofs in Calculus. Thus
to get from the basics (say the definition of a limit) to the more advanced topics (say
discussions of solutions to differential equations), the pace is quite high, and a selection
of topics had to be made: I am mostly following (Spivak, Calculus), but I have tried to
make a division into “core material” (these are the Notes, that I expect every student to
study), and “additional topics” (that I usually do not have time to cover in class, but
give an opportunity for further study, along with some references to the literature). The
result is a class that can be taught over 12 weeks, accompanied by tutorials (which are
not contained in these lecture notes). I would like to thank the students who took this
subject in the years 2020-2023 for their feedback and I hope that taking this class made
them curious about mathematics. – VS, Melbourne, November 2023.

Literature. The following text books were useful when preparing these lecture notes.

(Spivak, Calculus) This is a beautiful introduction to Calculus. About 2/3 of the topics
covered in this subject can be found in this text, especially foundations.

(Folland, Advanced Calculus) This is a good text, but with a more advanced starting
point. About 1/4 of the topics covered in this course can be found in this text,
especially those related to multivariable calculus.

(Osserman, Two-Dimensional Calculus) I have found this text to be a delightful treatment
of Calculus in two and three variables. Unfortunately, we barely have time to
cover these topics.

(Apostol, Calculus I) This is a quite comprehensive treatment of Calculus, which in the
end I have not found very useful. We consult it for topics related to differential
equations.

(Walter, Gewöhnliche Differentialgleichungen) This is a very nice introduction to
differential equations, but it requires some familiarity with the German language.

Module I.

Numbers, Functions, and Graphs

Note 1.
Numbers
1.1. Natural numbers and induction
The simplest numbers are the natural numbers 1, 2, 3, . . ., which are collectively referred
to as N.
The natural numbers are very important in particular in relation to the principle
of mathematical induction: Suppose P (x) means that the property P holds for the
number x. Then the principle of mathematical induction states that P (x) is true for all
natural numbers x provided that
• P (1) is true,
• If P (k) is true, then P (k + 1) is true.
Example 1.1. Suppose we want to prove that for any natural number n,
\[ 1 + 2 + \dots + n = \frac{n(n+1)}{2} . \tag{1.1} \]
Then it suffices to demonstrate that this is valid in the case n = 1, and to show that its
validity for n = k implies that the formula holds for n = k + 1. The case n = 1 is immediate,
since both sides equal 1. Now assuming that the formula holds for n = k, we have that
\[ 1 + \dots + k + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{k^2 + 3k + 2}{2} = \frac{(k+2)(k+1)}{2} , \tag{1.2} \]
which shows that the formula holds for n = k + 1.
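A numerical check is of course not a proof, but it can be a useful sanity test of a formula before attempting an induction. The following is a minimal Python sketch that compares both sides of (1.1) for the first few hundred values of n; the function names `sum_first_n` and `closed_form` are our own illustrative choices.

```python
def sum_first_n(n: int) -> int:
    """Add 1 + 2 + ... + n term by term."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def closed_form(n: int) -> int:
    """Right-hand side of (1.1); n(n + 1) is always even, so // is exact."""
    return n * (n + 1) // 2


# Compare both sides of (1.1) for n = 1, ..., 300.
for n in range(1, 301):
    assert sum_first_n(n) == closed_form(n)

print(sum_first_n(10), closed_form(10))
```

The loop only confirms finitely many cases; the induction argument above is what covers all natural numbers at once.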
Closely related to proofs by induction are recursive definitions.
Example 1.2. The number n! (“n factorial”) is defined as follows:
• 1! = 1
• (n + 1)! = (n + 1)n!
Example 1.3. We write $\sum_{i=1}^{n} a_i$ for the sum $a_1 + \dots + a_n$. For example, the formula above
could be expressed as
\[ \sum_{i=1}^{n} i = \frac{n(n+1)}{2} . \tag{1.3} \]
A careful definition of this symbol would be a recursive one:

• $\sum_{i=1}^{1} a_i = a_1$

• $\sum_{i=1}^{n+1} a_i = \sum_{i=1}^{n} a_i + a_{n+1}$.
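Recursive definitions translate almost literally into recursive functions. The sketch below mirrors the two definitions above (the factorial and the summation symbol) in Python; the names `factorial` and `recursive_sum`, and the convention that a_i is stored at index i - 1 of a sequence, are assumptions made for this illustration.

```python
from typing import Sequence


def factorial(n: int) -> int:
    """n! defined by 1! = 1 and (n + 1)! = (n + 1) * n!."""
    if n < 1:
        raise ValueError("n must be a natural number (n >= 1)")
    if n == 1:
        return 1
    return n * factorial(n - 1)


def recursive_sum(a: Sequence[float], n: int) -> float:
    """Sum a_1 + ... + a_n, where a_i is stored at a[i - 1]."""
    if n < 1 or n > len(a):
        raise ValueError("n must satisfy 1 <= n <= len(a)")
    if n == 1:
        return a[0]
    # The sum up to n is the sum up to n - 1 plus the last term a_n.
    return recursive_sum(a, n - 1) + a[n - 1]


print(factorial(5))
print(recursive_sum([1, 2, 3, 4], 4))
```

In both functions the first branch plays the role of the base case of an induction, and the recursive call plays the role of the step from n to n + 1.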

1.2. Rational and real numbers


We could extend the system of natural numbers N to the integers, which is the collection
of numbers Z, given by
. . . , −2, −1, 0, 1, 2, 3, . . . .
A still larger system of numbers are the rational numbers Q obtained by taking
quotients m/n of integers with n ≠ 0. These numbers satisfy all the “basic properties” of
numbers that we are familiar with (and which we will spell out in the additional material
below). It is tempting to think that these are all the numbers we would ever need.
However, it is really Calculus that will lead us to yet a larger collection of numbers,
the real numbers R. The additional properties of these numbers are quite profound,
and very different from basic arithmetic properties. We will return to these in Lecture 5.
Instead of discussing these additional properties now it might be instructive to convince
ourselves that there are irrational numbers (real numbers which are not rational).

Example 1.4. The number $\sqrt{2}$ is not rational. The proof relies on the following observation:
The square of an even natural number, namely a number of the form n = 2k, is again
even; $(2k)^2 = 2(2k^2)$. The same is true for odd natural numbers, namely numbers of the
form n = 2k + 1:
\[ (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1 \]
This means that if n is even, then $n^2$ is even; and if n is odd, then $n^2$ is odd. Now suppose
$\sqrt{2}$ were rational, and equals p/q for some integers p and q ≠ 0, and we can assume that
p and q have no common divisor. Then
\[ p^2 = 2q^2 \]
which shows that $p^2$, and hence p, is even. So for p = 2k, we get
\[ 2k^2 = q^2 \]
which now shows that also q is even. But if both p and q are even, this contradicts the
fact that p and q have no common divisor.
The argument in this example shows that there is no rational number x such that
$x^2 = 2$. We have not shown that there exists a number $\sqrt{2}$ whose square is 2.
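To see the statement of Example 1.4 in action, one can search by brute force for integers p and q with p² = 2q². The following Python sketch does this for all denominators up to a bound; the function name `has_rational_sqrt2` and the bound are illustrative assumptions. A search over finitely many q can only illustrate the result, while the parity argument above proves it for every q.

```python
from math import isqrt


def has_rational_sqrt2(max_q: int) -> bool:
    """Return True if some 1 <= q <= max_q admits an integer p with p*p == 2*q*q."""
    for q in range(1, max_q + 1):
        target = 2 * q * q
        p = isqrt(target)  # largest integer p with p*p <= target
        if p * p == target:
            return True
    return False


# No fraction p/q with denominator up to 10000 squares to exactly 2.
print(has_rational_sqrt2(10_000))
```

The search uses exact integer arithmetic throughout, so there is no rounding error hiding a near miss.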

1.3. Inequalities
Although inequalities are often not discussed in elementary mathematics, they play a
prominent role in Calculus.


We write a < b to say “a is less than b”, and a > b to say “a is greater than b”. The
statements a < b and b > a mean the same; they are merely two ways of writing the same
assertion. The main properties of numbers pertaining to inequalities are:

(Trichotomy) For any real numbers a and b, one, and only one, of the following holds:
1. a = b
2. a < b
3. b < a

(Ordering) For any numbers a, b, and c, if a < b and b < c, then a < c.

(Closure under addition) For any numbers a, b, and c, if a < b, then a + c < b + c.

(Closure under multiplication) For any numbers a, b, and c, if a < b, and 0 < c, then
ac < bc.

The numbers satisfying a > 0 are called positive, while the numbers a < 0 are
negative. We also write a ≤ b to mean a < b or a = b, and a ≥ b to mean a > b or a = b.
All familiar facts about inequalities, however elementary they may seem, can be
derived from these basic properties.
Example 1.5. If a < 0 is negative, then −a > 0 is positive. Since a < 0 means the same as
0 > a it follows (from closure under addition) that 0 − a > a − a = 0, or simply −a > 0.
Example 1.6. More generally, if a < 0 and b < 0 are negative numbers, then ab > 0 is
positive: We have already shown that −a > 0 and −b > 0 are positive numbers, hence
(using closure under multiplication) it follows that ab = (−a)(−b) > 0.
Exercise 1.1. If a > 0 and b > 0 are positive, then ab > 0 is positive.
The fact that ab > 0 if a > 0, b > 0 and also if a < 0, b < 0 has one special consequence:
$a^2 > 0$ whenever a ≠ 0. In particular, we have proven that 1 = 1 · 1 > 0.
We will encounter many more examples of inequalities in Problem 7 below.
Example 1.7. If a < b and c < 0 then ac > bc. Since −c > 0 we have a(−c) = −ac <
b(−c) = −bc which is the same as bc < ac.

1.4. Absolute value


For any number a, we define the absolute value of a as:

\[ |a| = \begin{cases} a, & a \ge 0 \\ -a, & a \le 0 . \end{cases} \]

An important inequality is the triangle inequality, which is the statement that for
all numbers a and b, we have
|a + b| ≤ |a| + |b| . (1.4)


Proof. Since the absolute value is defined by cases, the proof amounts to verifying this
inequality in the following four cases:

(A) a ≥ 0, b ≥ 0
(B) a ≥ 0, b ≤ 0
(C) a ≤ 0, b ≥ 0
(D) a ≤ 0, b ≤ 0

In the case (A), the stated inequality is certainly true, because |a + b| = a + b = |a| + |b|.
Similarly in the case (D), we have |a + b| = −(a + b) = −a − b = |a| + |b|.
Now let us look at the case (B). Here we need to prove that

|a + b| ≤ a − b . (1.5)

From the assumption we do not know if a + b is positive, negative, or zero, but we can
consider the case a + b ≥ 0 first. Then we need to show that a + b ≤ a − b, which is the
same as b ≤ −b, which is certainly true since b ≤ 0. In the other case, when a + b ≤ 0,
we have to show that −(a + b) ≤ a − b which is the same as −a ≤ a, which is certainly
true for a ≥ 0.
Exercise 1.2. Verify the inequality in the case (C).

Another proof of the triangle inequality is based on the fact that
\[ |a| = \sqrt{a^2} . \tag{1.6} \]

Therefore for all numbers a and b,
\[ |a+b|^2 = (a+b)^2 = a^2 + 2ab + b^2 \le |a|^2 + 2|a||b| + |b|^2 = (|a|+|b|)^2 . \tag{1.7} \]

Exercise 1.3. Why does this imply (1.4)? See Problem 7.
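The case analysis in the first proof can be mirrored in a short program. The Python sketch below defines the absolute value by cases, exactly as in Section 1.4, and checks (1.4) on a grid of integers, including the observation that equality holds in cases (A) and (D). The name `abs_value` and the range of test values are assumptions chosen for illustration, and the check is no substitute for the proof.

```python
def abs_value(a):
    """Absolute value defined by cases, as in Section 1.4."""
    return a if a >= 0 else -a


values = range(-20, 21)
for a in values:
    for b in values:
        lhs = abs_value(a + b)
        rhs = abs_value(a) + abs_value(b)
        assert lhs <= rhs  # the triangle inequality (1.4)
        if (a >= 0 and b >= 0) or (a <= 0 and b <= 0):
            assert lhs == rhs  # cases (A) and (D): equality holds

print("checked", len(values) ** 2, "pairs")
```

Integers are used so that every comparison is exact.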

1.5. The real number line and euclidean space


Many properties of the real numbers are intuitively understood by picturing them
geometrically as points on a line. This is the real (number) line R. To associate
a number to each point on the line, pick arbitrarily a point which we label 0, and
one to the right which we label 1, which fixes the scale. We arrive at all natural
numbers N = {1, 2, 3, . . .} by marking equal distances to the right, and at all integers
Z = {. . . , −2, −1, 0, 1, 2, . . .} by proceeding similarly to the left. By subdividing the interval
from 0 to 1, it is clear how to make sense of the rational numbers Q = {p/q : p ∈ Z, q ∈ N},
and we take it for granted for now that also the irrational numbers fit into this scheme.
The relation a < b can be interpreted to mean that a lies to the left of b, and the number
|a − b| is the distance between a and b.


We frequently encounter the set of numbers x which satisfy |x − a| < ε. This is the
collection of points whose distance from a is less than ε > 0. This is an example of an
open interval
(a − ε, a + ε) = {x : a − ε < x < a + ε} (1.8)

There are also closed intervals denoted by

[a, b] = {x : a ≤ x ≤ b} (1.9)

where it is always understood that a ≤ b. Moreover we write

(a, ∞) = {x : x > a} [a, ∞) = {x : x ≥ a} (1.10)

but we emphasize that the symbol ∞ is purely suggestive, because there is no number
“∞” such that a < ∞ for all a ∈ R. Nonetheless with this notation the real line can be
viewed as an interval:
R = (−∞, ∞) (1.11)

We can intersect two real number lines at a right angle to form the plane $\mathbb{R}^2$ consisting
of pairs of numbers (a, b), and we also refer to a and b as the coordinates of the point
(a, b). Alternatively, we can view $\vec{v} = (a, b)$ as a vector, namely an arrow from the origin
to the point (a, b). This allows us to add two vectors $\vec{v} = (a, b)$, $\vec{w} = (c, d)$, and interpret
$\vec{v} + \vec{w} = (a + c, b + d)$ geometrically. Similarly we can define $\lambda\vec{v} = (\lambda a, \lambda b)$ for any real
number λ, and interpret the operation of scalar multiplication geometrically as scaling.
The distance between two points in the plane (a, b), and (c, d) is
\[ \sqrt{(a-c)^2 + (b-d)^2} . \tag{1.12} \]

More generally, we can view an ordered n-tuple of real numbers $(a_1, a_2, \dots, a_n)$ as
the coordinates of a point in the n-dimensional space $\mathbb{R}^n$. Alternatively, we can view
$\vec{a} = (a_1, a_2, \dots, a_n)$ as a vector. The geometry of euclidean space is tied to the dot
product
\[ \vec{a} \cdot \vec{b} = \sum_{i=1}^{n} a_i b_i , \tag{1.13} \]

which allows us to speak of vectors which are orthogonal, and to define the length of
a vector. We say two vectors $\vec{a}$ and $\vec{b}$ are orthogonal if $\vec{a} \cdot \vec{b} = 0$. Moreover, we define the
length (or norm) of a vector $\vec{a}$ by
\[ |\vec{a}| = \sqrt{\vec{a} \cdot \vec{a}} = \sqrt{a_1^2 + a_2^2 + \dots + a_n^2} . \tag{1.14} \]

With this notion of length the triangle inequality that we have seen above for
numbers extends to vectors:
\[ |\vec{a} + \vec{b}| \le |\vec{a}| + |\vec{b}| . \tag{1.15} \]


Remark 1.1. Indeed, it follows from the definition that
\[ |\vec{a} + \vec{b}|^2 = |\vec{a}|^2 + 2\,\vec{a} \cdot \vec{b} + |\vec{b}|^2 \tag{1.16} \]

and so the stated inequality follows, provided we can show that $|\vec{a} \cdot \vec{b}| \le |\vec{a}||\vec{b}|$. This is
Cauchy’s inequality (see the additional notes to Module I).
The distance between two points $(a_1, a_2, \dots, a_n)$ and $(b_1, b_2, \dots, b_n)$ in $\mathbb{R}^n$ is defined
by $|\vec{a} - \vec{b}|$; note that this agrees with (1.12) in $\mathbb{R}^2$. The reason (1.15) is called the triangle
inequality is that it implies that for any vectors $\vec{a}$, $\vec{b}$, $\vec{c}$,
\[ |\vec{a} - \vec{c}| \le |\vec{a} - \vec{b}| + |\vec{b} - \vec{c}| . \tag{1.17} \]

In other words, the distance from $\vec{a}$ to $\vec{c}$ is at most the sum of the distances from $\vec{a}$ to $\vec{b}$,
and from $\vec{b}$ to $\vec{c}$, for any intermediate point $\vec{b}$.
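The dot product, the norm, and the inequalities above are easy to experiment with numerically. The following Python sketch implements (1.13) and (1.14) for vectors stored as tuples and evaluates Cauchy's inequality, (1.15), and (1.17) on a few sample vectors; the helper names `dot`, `norm`, `add`, `sub` and the sample vectors are our own choices for illustration.

```python
from math import sqrt


def dot(a, b):
    """Dot product (1.13) of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Length (1.14) of a vector."""
    return sqrt(dot(a, a))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


a = (1.0, 2.0, 2.0)
b = (4.0, 0.0, -3.0)
c = (0.0, 1.0, 1.0)

print(norm(a), norm(b))                                       # lengths of a and b
print(abs(dot(a, b)) <= norm(a) * norm(b))                    # Cauchy's inequality
print(norm(add(a, b)) <= norm(a) + norm(b))                   # triangle inequality (1.15)
print(norm(sub(a, c)) <= norm(sub(a, b)) + norm(sub(b, c)))   # distance form (1.17)
print(dot((1.0, 2.0), (-2.0, 1.0)))                           # vanishes: the vectors are orthogonal
```

Since the norm involves a square root, comparisons of floating point values like these are only reliable when the two sides are not extremely close.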

Problems
1. Prove the following formula by induction:
\[ 1^2 + \dots + n^2 = \frac{n(n+1)(2n+1)}{6} \]

2. Find a formula for
\[ \sum_{i=1}^{n} (2i - 1) = 1 + 3 + 5 + \dots + (2n - 1) \]

3. Find all the numbers x for which
a) 4 − x < 3 − 2x
b) $5 - x^2 < -2$
c) (x − 1)(x − 3) > 0
d) $x^2 - 2x + 2 > 0$
e) $x^2 - x + 10 > 16$
f) $\frac{1}{x} + \frac{1}{1-x} > 0$
g) |x − 3| = 8
h) |x − 3| < 8
i) |x + 4| < 2
j) |x − 1| + |x − 2| > 1

4. a) Prove by induction on n that
\[ 1 + r + r^2 + \dots + r^n = \frac{1 - r^{n+1}}{1 - r} \]
b) Derive this result by setting $S = 1 + r + r^2 + \dots + r^n$, multiplying this equation
by r, and solving the two equations for S.

5. Prove that $\sqrt{3}$ is irrational.
Hint: Every integer is of the form 3n, or 3n + 1, or 3n + 2.

6. The following is a recursive definition of $a^n$: $a^1 = a$, and $a^{n+1} = a^n \cdot a$. Prove, by
induction, that
\[ a^{n+m} = a^n \cdot a^m , \qquad (a^n)^m = a^{nm} . \]

7. Use the basic properties of Section 1.3 to prove the following:
a) If a < b and c < d, then a + c < b + d.
b) If a < b, then −b < −a.
c) If a < b and c > d, then a − c < b − d.
d) If a < b and c < 0 then ac > bc.
e) If a > 1 then $a^2 > a$.
f) If 0 < a < 1 then $a^2 < a$.
g) If 0 ≤ a < b and 0 ≤ c < d then ac < bd.
h) If 0 ≤ a < b, then $a^2 < b^2$.
i) If a, b ≥ 0 and $a^2 < b^2$, then a < b.

8. Let $\vec{v} = (v_1, v_2)$, and $\vec{w} = (w_1, w_2)$ be vectors in the plane.
a) Given $\vec{v}$, find a vector $\vec{w}$ such that $\vec{v} \cdot \vec{w} = 0$.
b) Show that $\vec{v} \cdot (\vec{w} + \vec{z}) = \vec{v} \cdot \vec{w} + \vec{v} \cdot \vec{z}$
c) Show that $\vec{v} \cdot \vec{w} = \frac{1}{4}\left(|\vec{v} + \vec{w}|^2 - |\vec{v} - \vec{w}|^2\right)$

9. Let $\vec{a} = (3, -1, 2)$ and $\vec{b} = (2, 1, 0)$. Compute the norms of $\vec{a}$ and $\vec{b}$.

10. Given vectors $\vec{x}$, and $\vec{y}$ in $\mathbb{R}^n$ show that
a) $|\vec{x} + \vec{y}|^2 = |\vec{x}|^2 + 2\,\vec{x} \cdot \vec{y} + |\vec{y}|^2$
b) $|\vec{x} + \vec{y}|^2 + |\vec{x} - \vec{y}|^2 = 2\left(|\vec{x}|^2 + |\vec{y}|^2\right)$

11. Show that $|\vec{a}| - |\vec{b}| \le |\vec{a} - \vec{b}|$ for every $\vec{a}$, $\vec{b}$ in $\mathbb{R}^n$.

Note 2.
Functions and Graphs
2.1. Functions
A function f is a rule which assigns to each real number x (or only certain real numbers
x) another real number f (x), the value of f at x.
Example 2.1. The function which assigns to each number the square of that number,
\[ f(x) = x^2 \tag{2.1} \]

Example 2.2. The rule which assigns to each number c ≠ 1, −1 the number $c^3/(c^2 - 1)$,
\[ f(c) = \frac{c^3}{c^2 - 1} \qquad (c \neq 1, -1) \tag{2.2} \]

Example 2.3. The rule that assigns to each number t the number $t^3 + x$. This rule
obviously depends on x, and defines a family of functions $f_x$,
\[ f_x(t) = t^3 + x \tag{2.3} \]

Remark 2.1. A function need not be expressed by an algebraic formula, it can be any
rule that assigns numbers to certain other numbers.
Example 2.1 is a special example of an extremely important class of functions, namely
the polynomial functions. If $a_0, a_1, \dots, a_n$ are real numbers, and $a_n \neq 0$, then we say
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 \tag{2.4} \]

is a polynomial of degree n.
Example 2.2 is an example of a rational function, namely a function of the form p/q
where p, and q are polynomials.
Given two functions f , g, they can be combined to form a new function in various
ways: f + g is called the sum of f and g, f · g is the product, and f /g the quotient of
f , and g. All these are defined in the obvious ways, but we already see that some thought
has to be given to the domain: For example (f + g)(x) = f (x) + g(x) only makes sense
for numbers x for which both f and g are defined. The domain of a function is the set of
numbers to which the rule can be applied. So in the above example if A is the domain of
f , and B the domain of g, then f + g is only defined on the intersection of A, and B,
denoted by A ∩ B. Similarly for f · g.
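The role of the domain in forming f + g can be made explicit in code. The following Python sketch pairs a rule with a membership test for its domain, so that the sum is automatically defined on the intersection A ∩ B; the class `Function` and its attribute names are hypothetical constructions for this illustration, not a standard library feature. The two sample functions are the rational function of Example 2.2 and the function 1/x.

```python
class Function:
    """A rule together with a test for membership in its domain."""

    def __init__(self, rule, in_domain):
        self.rule = rule
        self.in_domain = in_domain

    def __call__(self, x):
        if not self.in_domain(x):
            raise ValueError(f"{x} is not in the domain")
        return self.rule(x)

    def __add__(self, other):
        # f + g is defined exactly where both f and g are defined (A ∩ B).
        return Function(
            lambda x: self.rule(x) + other.rule(x),
            lambda x: self.in_domain(x) and other.in_domain(x),
        )


# Example 2.2: f(c) = c^3 / (c^2 - 1), defined for c != 1, -1.
f = Function(lambda c: c**3 / (c**2 - 1), lambda c: c not in (1, -1))
# g(x) = 1/x, defined for x != 0.
g = Function(lambda x: 1 / x, lambda x: x != 0)

h = f + g
print(h(2))            # the value 8/3 + 1/2
print(h.in_domain(1))  # False: 1 is not in the domain of f
print(h.in_domain(0))  # False: 0 is not in the domain of g
```

The same pattern would extend to the product f · g; the quotient needs an additional condition on the domain, which is the subject of Exercise 2.1.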


Figure 2.1.: Graph of a function.

In order to indicate the domain A of a real valued function f , we write

f : A −→ R . (2.5)

This is often also indicated implicitly, when we give a formula for f (x) followed by

(x ∈ A) .

Exercise 2.1. What is the domain of f /g?


If f and g are any two functions, then the composition f ◦ g of f and g is defined by

(f ◦ g)(x) = f (g(x)) (2.6)

on the domain {x : x is in the domain of g, and g(x) is in the domain of f }.


Example 2.4. The function $x \mapsto \sin(x^2)$ is the composition f ◦ g of the functions f (x) = sin(x)
and $g(x) = x^2$.
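Composition is an operation on functions that most programming languages can express directly. Here is a small Python sketch of definition (2.6), applied to Example 2.4; the helper names `compose` and `square` are assumptions for illustration. It also shows that the order matters: f ◦ g and g ◦ f are in general different functions.

```python
import math


def compose(f, g):
    """Return the function x -> f(g(x)), as in (2.6)."""
    return lambda x: f(g(x))


def square(x):
    return x * x


sin_of_square = compose(math.sin, square)  # x -> sin(x^2), Example 2.4
square_of_sin = compose(square, math.sin)  # x -> (sin x)^2, the other order

x = 2.0
print(sin_of_square(x), square_of_sin(x))
```

Both sample functions are defined for all real numbers, so no domain restriction appears here; in general the composition is only defined where g(x) lies in the domain of f.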

2.2. Graphs
We call the graph of a function f the set of points

\[ (x, f(x)) \in \mathbb{R}^2 \tag{2.7} \]

where x is in the domain of f ; see Fig. 2.2.


Remark 2.2. In fact, one could go one step further and identify a function with its graph.
That leads to the formal definition of a function mentioned below (Additional).
The graph of a linear function f (x) = cx + d is a straight line with slope c through
the point (0, d).


Exercise 2.2. Given two points (a, b) and (c, d) find the linear function f whose graph
goes through both points.
Example 2.5. The graph of the function f (x) = x2 is a parabola.
Exercise 2.3. Sketch the graphs of

1. $f(x) = x^n$ for n = 2, 3, 4, . . .

2. $f(x) = x^2 + x$

3. $f(x) = x^3 - 3x$

4. f (x) = 1/x (x ≠ 0)

5. $f(x) = \frac{1}{1 + x^2}$

Example 2.6. Consider the function f (x) = sin(1/x) on its domain R \ {0}. To draw the
graph it helps to observe that

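\[ f(x) = 0 \quad \text{for } x = \frac{1}{\pi}, \frac{1}{2\pi}, \frac{1}{3\pi}, \dots \tag{2.8} \]
\[ f(x) = 1 \quad \text{for } x = \frac{1}{\pi/2}, \frac{1}{\pi/2 + 2\pi}, \frac{1}{\pi/2 + 4\pi}, \dots \tag{2.9} \]
\[ f(x) = -1 \quad \text{for } x = \frac{1}{3\pi/2}, \frac{1}{3\pi/2 + 2\pi}, \frac{1}{3\pi/2 + 4\pi}, \dots \tag{2.10} \]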

Moreover, when x is large, 1/x is small, so also f (x) is small; similarly when |x| is large,
for x < 0. We take away from this that while the graph of f approaches the horizontal
axis as |x| → ∞ (from above on the right, and from below on the left), it oscillates
infinitely many times between −1, and 1 near 0.
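The observations (2.8) to (2.10) can be checked numerically before drawing the graph. The Python sketch below evaluates f(x) = sin(1/x) at the first few points of each of the three sequences, and at two points far from the origin; the variable names are our own, and the printed values are subject to the usual floating point rounding.

```python
import math


def f(x):
    return math.sin(1 / x)


for k in range(1, 6):
    x_zero = 1 / (k * math.pi)                                   # points of (2.8)
    x_one = 1 / (math.pi / 2 + 2 * math.pi * (k - 1))            # points of (2.9)
    x_minus_one = 1 / (3 * math.pi / 2 + 2 * math.pi * (k - 1))  # points of (2.10)
    print(k, round(f(x_zero), 9), round(f(x_one), 9), round(f(x_minus_one), 9))

# For large |x| the values are close to 0: positive on the right, negative on the left.
print(f(1000.0), f(-1000.0))
```

All three sequences of points tend to 0, which is a concrete way of seeing that the graph oscillates between −1 and 1 infinitely often near the origin.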

Exercise 2.4. Draw the graph of f (x) = x sin(1/x).

2.3. Functions of two variables


A function of two variables is a rule f that assigns to each point in the plane (x, y) a real
number, the value of f at that point, denoted by f (x, y).
Example 2.7. f (x, y) = 1 − 2x − y
Example 2.8. $f(x, y) = x^2 + y^2$
Example 2.9. $f(x, y) = 2xy/(x^2 + y^2)$, (x, y) ≠ (0, 0).
The first two are examples of polynomials in two variables, namely the sum of
terms of the form $a x^m y^n$, where a is a constant and m, and n are nonnegative integers.
The number m + n is called the degree of this term, and the degree of the polynomial is
the highest degree of the terms it contains.
As we have seen for functions of one variable, one can gain a great deal of insight by
associating with a function f (x) its graph, consisting of those points in the xy plane for
which y = f (x). Similarly when visualizing a function f (x, y) of two variables, it is useful
to think about the graph of f as a surface in $\mathbb{R}^3$ consisting of all points
\[ (x, y, z = f(x, y)) \in \mathbb{R}^3 . \tag{2.11} \]

Figure 2.2.: The graph of the function f (x) = sin(1/x) for x > 0.

Example 2.10. The graph of the function in Example 2.7 is a plane. It intersects the
z-axis at z = 1, because f (0, 0) = 1. Similarly we obtain the intersections with the x and
y axes, and these three points, (0, 0, 1), (1/2, 0, 0), and (0, 1, 0) determine uniquely the
plane.
Example 2.11. The graph of the function in Example 2.8 is a surface of revolution, since
the value of f depends only on the distance |(x, y)| from the z-axis. The surface is a
paraboloid.
Example 2.12. An important difference between Example 2.9 and the previous ones is
that here f is not defined for all values of x, and y. More generally the quotient of two
polynomials
\[ f(x, y) = \frac{P(x, y)}{Q(x, y)} \tag{2.12} \]
is not defined at points where the denominator vanishes. In Example 2.9 this is a single
point, the origin. However, since the numerator also vanishes there, it is not immediately clear
how this function behaves near the origin.
Exercise 2.5. Introduce polar coordinates in the plane to study the behaviour of the
function $f(x, y) = 2xy/(x^2 + y^2)$ near the origin. Set x = r cos(θ), and y = r sin(θ), to
express f (x, y) as a function of r, and θ. Sketch the graph of f .


Figure 2.3.: The graph of the function from Example 2.9 is generated by rotating a ray
from center, while moving it up and down.

Solution 2.6. We have

f (r cos(θ), r sin(θ)) = 2 cos(θ) sin(θ) = sin(2θ) . (2.13)

In other words, the value only depends on θ, not on r. Therefore f is constant along
each ray (r cos(θ), r sin(θ)), r > 0, for constant θ, and the graph z = f (x, y)
lies entirely between the planes z = 1, and z = −1. We can visualize the surface as a
kind of “spiral ramp”, see Fig. 2.3.
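The identity (2.13) lends itself to a quick numerical confirmation. The following Python sketch evaluates f on several circles of different radius r and compares the result with sin(2θ); the sample radii and the number of angles are arbitrary choices made for this illustration.

```python
import math


def f(x, y):
    """The function of Example 2.9, defined for (x, y) != (0, 0)."""
    return 2 * x * y / (x * x + y * y)


max_error = 0.0
for r in (0.001, 0.5, 1.0, 10.0):
    for k in range(24):
        theta = 2 * math.pi * k / 24
        value = f(r * math.cos(theta), r * math.sin(theta))
        # By (2.13) the value should equal sin(2θ), independently of r.
        max_error = max(max_error, abs(value - math.sin(2 * theta)))

print(max_error)
```

The fact that the error stays at the level of rounding even for r = 0.001 reflects that f takes every value between −1 and 1 arbitrarily close to the origin.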
A common way to sketch a function of two variables is to consider the intersections of
the graph of f in R3 with a plane ax + by = 0. In other words, by restricting ourselves
to points in the xy plane with ax + by = 0 for some fixed a, and b, the values of f are a
function of one variable, and its graph is visualized as a curve in the plane.
Example 2.13.
f(x, y) = cos(x e^y)    (2.14)
Note that the graph of f lies between the horizontal planes z = 1 and z = −1. On each
straight line y = c in the xy plane, the function f(x, c) = cos(kx), with k = e^c, is an
oscillating function of x, which oscillates more rapidly the larger the value of c. The
graph of x ↦ f(x, c) is the cross section of the graph of f, as a surface in R^3, with the
plane y = c.
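To make "oscillating more rapidly" quantitative, one can compute the period of the cross section x ↦ cos(kx), which is 2π/k with k = e^c. The sketch below (the helper period_in_x is a name introduced here) checks that shifting x by this period leaves f unchanged and that the period shrinks as c grows.

```python
import math

def f(x, y):
    """f(x, y) = cos(x * e^y) from Example 2.13."""
    return math.cos(x * math.exp(y))

def period_in_x(c):
    """Period of the cross section x -> f(x, c) = cos(k x), k = e^c."""
    return 2 * math.pi / math.exp(c)

periods = [period_in_x(c) for c in (0.0, 1.0, 2.0)]
print(periods)
```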
Exercise 2.7. Sketch the graph of the function of the previous example.


Figure 2.4.: Level sets of the function f (x, y) = xy.

Another approach is to intersect the graph of a function of two variables with the
planes z = c. Here we are interested in the set of points (x, y) in the plane for which
f takes a given value c. The set of points (x, y) in the xy plane for which f (x, y) = c,
where c is a given constant, is a level curve of f . By choosing various values of c, and
constructing the corresponding level curves, we can often obtain a picture of the graph
of f .
Exercise 2.8. Sketch the level curves of the function

f (x, y) = xy . (2.15)

Convince yourself that the graph of f is a “saddle-shaped” surface.


Solution 2.9. The level curves xy = c are hyperbolas if c ≠ 0. For c > 0, they lie in the
first and third quadrants, while for c < 0, they lie in the second and fourth quadrants of
the xy plane; see Fig. 2.4. The level curve xy = 0 consists of the set of points on the x
and y axes.
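The quadrant statement in Solution 2.9 can be illustrated by sampling points on a level curve. This is only a sketch: the helpers quadrant and level_curve_points, and the sample values, are chosen here for illustration.

```python
def quadrant(x, y):
    """Return the quadrant (1 to 4) of a point that is not on an axis."""
    if x > 0:
        return 1 if y > 0 else 4
    return 2 if y > 0 else 3

def level_curve_points(c, xs):
    """Points (x, c/x) on the level curve xy = c, for nonzero x."""
    return [(x, c / x) for x in xs if x != 0]

xs = [-4.0, -1.0, -0.25, 0.25, 1.0, 4.0]
quadrants_pos = {quadrant(x, y) for x, y in level_curve_points(2.0, xs)}
quadrants_neg = {quadrant(x, y) for x, y in level_curve_points(-2.0, xs)}
print(quadrants_pos, quadrants_neg)
```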

Exercise 2.10. Ask yourself what is the relevance of level curves to reading a map of a
mountainous terrain?

Problems
1. Let
f(x) = \frac{1}{1 + x}    (2.16)
What is


a) f (1/x)
b) f (x + y)
c) f (x) + f (y)

2. Find the domain of the functions defined by the following formulas



a) f(x) = \sqrt{1 − x^2}
b) f(x) = \frac{1}{x − 1} + \frac{1}{x − 2}

3. A function is even if f(x) = f(−x) and odd if f(x) = −f(−x). For example, the
functions f(x) = x^2 and f(x) = |x| are even, while the functions f(x) = x and
f(x) = sin(x) are odd.
a) Determine whether f + g is even, odd, or not necessarily either, in the four
cases obtained by choosing f even or odd, and g even or odd.
b) Do the same for f · g, and f ◦ g.
c) Prove that every even function f can be written as f (x) = g(|x|), with a
function g that is not uniquely determined.

4. Prove or give a counterexample for each of the following assertions:


a) f ◦ (g + h) = f ◦ g + f ◦ h
b) (g + h) ◦ f = g ◦ f + h ◦ f
c) 1/(f ◦ g) = f ◦ (1/g)

5. Indicate on the real line the set of x satisfying the following relations, and write
these sets using the notation of intervals.
a) |x − 3| ≤ 1
b) 1/(1 + x^2) ≤ a
c) |x^2 − 1| < 1/2

6. Draw the set of all points (x, y) satisfying the following conditions.
a) x > y
b) |x − y| < 1
c) 1/(x + y) is an integer.
d) x = y^2
e) x = |y|

7. Sketch the graphs of the following functions, by plotting enough points to get a
good idea of its shape.
a) f (x) = x − 1/x
b) f(x) = x^2 + 1/x^2


8. Describe the general features of the graph of f if


a) f is even
b) f is odd
c) f is periodic, namely f (x) = f (x + a) for all x with some period a > 0.

9. For each of the following functions of two variables,


a) f(x, y) = x^2 − y^2
b) f(x, y) = xy^2
c) f(x, y) = sin(x + y)
draw the intersection of their graphs with the planes,
a) y = c
b) x = c
c) z = c (level curves)
for various values of c, and use these to sketch the graph of each function.

10. For each of the following functions of two variables,


a) f(x, y) = x + y^2
b) f(x, y) = x^2 + 2xy + y^2
c) f(x, y) = x^3 − 3xy^2
draw the level curve z = 0, and shade the region where f (x, y) > 0.

11. Let h(t) be a strictly increasing function of t, and let g(x, y) = h(f (x, y)).
a) How are the level curves of f and g related?
b) How are the graphs of f and g related?

Additional: Vectors, functions, and graphs
Further Reading

(Folland, Advanced Calculus, Chapter 1.1)

Vector Calculus: Advanced (MAST20032)

Euclidean Spaces and Vectors


Similarly to numbers, we have the basic operations of addition and scalar multiplication
for vectors \vec{x} = (x_1, …, x_n) in R^n:

\vec{x} + \vec{y} = (x_1 + y_1, …, x_n + y_n)    (2.1)
λ\vec{x} = (λx_1, …, λx_n) ,    λ ∈ R    (2.2)

We have already introduced the dot product between two vectors and the norm of a
vector. The fundamental inequality relating the two is Cauchy’s inequality:

Proposition 2.1. For any \vec{x}, \vec{y} ∈ R^n, we have

|\vec{x} · \vec{y}| ≤ |\vec{x}||\vec{y}| .    (2.3)

Exercise 2.1. This inequality has a geometrically instructive proof. Draw pictures while
you are reading it!

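Proof. Write \vec{a} = \vec{x} and \vec{b} = \vec{y}. If \vec{b} = 0 both sides of the inequality vanish, so
let us assume \vec{b} ≠ 0. Consider the function

f(t) = |\vec{a} − t\vec{b}|^2 = |\vec{a}|^2 − 2t \vec{a} · \vec{b} + t^2 |\vec{b}|^2 .    (2.4)

We know, on one hand, that f(t) ≥ 0 for all t. On the other hand, f is a
quadratic in t, and its minimum occurs at t = \vec{a} · \vec{b}/|\vec{b}|^2, where f takes the value

f(\vec{a} · \vec{b}/|\vec{b}|^2) = |\vec{a}|^2 − \frac{(\vec{a} · \vec{b})^2}{|\vec{b}|^2} .    (2.5)

Since f ≥ 0, in particular at the minimum, we obtain the inequality after multiplying
through by |\vec{b}|^2 and taking square roots.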
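The steps of this proof can be traced numerically for a concrete pair of vectors. The sketch below uses two arbitrarily chosen vectors in R^3 (an assumption made only for illustration) and compares the minimum of f with the value predicted by (2.5).

```python
import math

def dot(a, b):
    """Dot product of two vectors given as lists of equal length."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Euclidean norm |a|."""
    return math.sqrt(dot(a, a))

def f(t, a, b):
    """f(t) = |a - t b|^2, as in (2.4)."""
    diff = [ai - t * bi for ai, bi in zip(a, b)]
    return dot(diff, diff)

a = [1.0, 2.0, -3.0]   # sample vectors, chosen arbitrarily
b = [4.0, 0.5, 2.0]

t_min = dot(a, b) / dot(b, b)                      # location of the minimum
f_min = f(t_min, a, b)                             # value of f at the minimum
predicted = dot(a, a) - dot(a, b) ** 2 / dot(b, b)  # right-hand side of (2.5)
print(f_min, predicted, abs(dot(a, b)) <= norm(a) * norm(b))
```

Geometrically, t_min \vec{b} is the point on the line through \vec{b} closest to \vec{a}, and f_min is the squared distance from \vec{a} to that line.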

A consequence is the triangle inequality:


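Proposition 2.2. For any \vec{x}, \vec{y} ∈ R^n, we have

|\vec{x} + \vec{y}| ≤ |\vec{x}| + |\vec{y}| .    (2.6)

Proof. We have |\vec{x} + \vec{y}|^2 = |\vec{x}|^2 + 2\vec{x} · \vec{y} + |\vec{y}|^2, and by Cauchy’s inequality |\vec{x} · \vec{y}| ≤ |\vec{x}||\vec{y}|.
Therefore,
|\vec{x} + \vec{y}|^2 ≤ |\vec{x}|^2 + 2|\vec{x}||\vec{y}| + |\vec{y}|^2 = (|\vec{x}| + |\vec{y}|)^2 ,    (2.7)

so the inequality follows by taking square roots.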

Further Reading

(Spivak, Calculus, Chapter 3, 4)

A mathematical definition of functions and domains


We have introduced functions as “rules” but we will now illustrate how to turn this idea
into a proper mathematical definition, which in particular avoids the confusion that while
for example
f(x) = x^2    (2.8)

and
f(x) = x^2 + 3x + 3 − 3(x + 1)    (2.9)

are different rules, they certainly define the same function.

Definition 2.1. A function is a collection of pairs of numbers with the following


property: if (a, b) and (a, c) are both in the collection, then b = c.

Definition 2.2. The domain of f is the set of all numbers a for which there is some
b such that (a, b) is in f.

If a is in the domain of f , it follows from the first definition that there is a unique
number b such that (a, b) is in f . This unique number is denoted by f (a).
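For a finite collection of pairs, Definitions 2.1 and 2.2 translate directly into code. The following sketch is restricted to finite collections (an assumption made for illustration; the definitions themselves have no such restriction), and the helper names is_function, domain and value are introduced here.

```python
def is_function(pairs):
    """True if no first entry a is paired with two different second entries."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True

def domain(pairs):
    """The set of all a for which some pair (a, b) is in the collection."""
    return {a for a, _ in pairs}

def value(pairs, a):
    """The unique b with (a, b) in the collection; assumes is_function(pairs)."""
    for first, second in pairs:
        if first == a:
            return second
    raise KeyError(f"{a} is not in the domain")

# The two different rules (2.8) and (2.9) give the same collection of pairs.
square = {(x, x * x) for x in range(-3, 4)}
same_square = {(x, x * x + 3 * x + 3 - 3 * (x + 1)) for x in range(-3, 4)}

# Four points of the unit circle: 0 is paired with both 1 and -1.
circle_points = {(0, 1), (0, -1), (1, 0), (-1, 0)}

print(square == same_square, is_function(square), is_function(circle_points))
```

Representing a function by its pairs, rather than by a formula, is what makes the two rules indistinguishable.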


Mappings
More generally, a map (or mapping) is a rule f that assigns to each element of some
set A an element of some other set B (possibly equal to A). We write f : A → B. If
x ∈ A, the element in B assigned to x by f is called the value f (x). Thus functions are
maps, but the term “function” is typically reserved for mappings whose values are real
numbers (or complex numbers).
Given f : A → B we refer to A as the domain of f. If S is a subset of A, we denote by

f(S) = { f(x) : x ∈ S }    (2.10)

the image of S under f, and the set f(A) (a subset of B) is called the range of f.


A mapping f : A → B is said to be invertible if there is another mapping g : B → A
such that g(f(x)) = x for all x in A and f(g(y)) = y for all y in B. If such a mapping g
exists, it is called the inverse of f.

Figures in plane geometry


It is important to realise however that some of the most important figures of plane
geometry are not graphs of functions. Let us first introduce the notion of distance, as
captured by the Pythagorean theorem.

Definition 2.3. The distance between two points (a, b) and (c, d) in the plane R^2 is
defined by
\sqrt{(a − c)^2 + (b − d)^2} .    (2.11)

Example 2.1. The circle with centre (a, b) of radius r is the set of points (x, y) whose
distance from (a, b) is equal to r. Since for example both (a, b + r), and (a, b − r) are in
this collection of points, it is not a graph of a function.
Example 2.2. Given two points in the plane, the focal points, an ellipse is the set of points
for which the sum of the distances to the two focal points is constant. If we take for
simplicity the two focal points to be (−c, 0) and (c, 0) on the horizontal axis, and the sum
of the distances to be 2a > 0, then these are all points (x, y) for which
\sqrt{(x + c)^2 + y^2} + \sqrt{(x − c)^2 + y^2} = 2a    (2.12)

After some calculation one finds this implies

\frac{x^2}{a^2} + \frac{y^2}{a^2 − c^2} = 1 ,    (2.13)
where we take a > c.
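The relation between (2.12) and (2.13) can be tested on sample points. The sketch below parametrises solutions of (2.13) as (a cos t, b sin t) with b^2 = a^2 − c^2 (a standard parametrisation, not taken from the notes) and checks that the sum of the distances to the focal points is 2a. The values a = 5, c = 3 are arbitrary.

```python
import math

def distance(p, q):
    """Distance between two points of the plane, as in Definition 2.3."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def focal_sum(point, c):
    """Sum of the distances from point to the focal points (-c, 0) and (c, 0)."""
    return distance(point, (-c, 0.0)) + distance(point, (c, 0.0))

a, c = 5.0, 3.0                      # sample values with a > c
b = math.sqrt(a * a - c * c)         # b^2 = a^2 - c^2

# Points satisfying x^2/a^2 + y^2/(a^2 - c^2) = 1.
points = [(a * math.cos(t), b * math.sin(t)) for t in (0.0, 0.4, 1.3, 2.9, 4.0)]
sums = [focal_sum(p, c) for p in points]
print(sums)
```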
Example 2.3. The hyperbola is defined analogously, except that we require the difference
of the two distances to be a constant. This leads to the same equation, but we now
take c > a. The hyperbola has two branches because we can take the difference in
two different ways. Note that while the hyperbola is also not a graph, we can write, for
example in the case a^2 = 2 and a^2 − c^2 = −2, that
(x + y)(x − y) = 2    (2.14)
and so this hyperbola coincides with the graph of the function f(x) = 1/x after a rotation
of the axes by an angle of π/4.

Additional problems
1. a) If f is any function, define a new function |f | by |f |(x) = |f (x)|. If f and g
are functions, define two new functions max(f, g), min(f, g) by
max(f, g)(x) = max(f (x), g(x)) (2.15)
min(f, g)(x) = min(f (x), g(x)) (2.16)
Find an expression for max(f, g), and min(f, g) in terms of | · |.
b) Show that f = f+ + f− where f+ = max(f, 0) is the positive part, and
f− = min(f, 0) is the negative part of f .
c) A function f is called nonnegative if f (x) ≥ 0 for all x. Prove that any
function f can be written as f = g − h, where g and h are nonnegative, (and
not uniquely determined).
2. Prove that the graphs of the linear functions
f (x) = mx + b f (x) = nx + c (2.17)
are perpendicular if mn = −1.
Hint: Consider the triangle with vertices at (0, 0), (1, m), and (1, n), use the
Pythagorean theorem.
3. a) If x_1, …, x_n are distinct numbers, find a polynomial function f_i of degree n − 1
which is 1 at x_i and 0 at x_j for j ≠ i.
b) Find a polynomial function f of degree n − 1 such that f(x_i) = a_i, where
a_1, …, a_n are given numbers.
4. For which numbers a, b, c, d does the function
f(x) = \frac{ax + b}{cx + d}    (2.18)
satisfy f(f(x)) = x (for all x in the domain of f ◦ f)?
5. Convince yourself that the set of points (x, y) satisfying
ax^2 + bx + cy^2 + dy + e = 0    (2.19)
is either a parabola, an ellipse, a hyperbola, or in degenerate cases two lines, one
line, a point, or the empty set.

Module II.

Limits, and Continuity

Note 3.
Limits
In this lecture we shall make precise one of the most important notions in Calculus,
namely the limit of a function. We would like to say that “a function f approaches
the limit l near a, if we can make f (x) as close as we like to l by requiring that x be
sufficiently close to, but not equal to, a.”
Here it is irrelevant how, or even if, f is defined at the point a. For example, the functions

f(x) = x^2 ,    g(x) = x^2  (x ≠ a) ,    h(x) = \begin{cases} x^2 & x ≠ a \\ b & x = a \end{cases}    (3.1)

should all have the same limit l = a^2 at a.


A way to picture what we mean by “we can make f (x) as close as we like to l”, is
to draw the graph of f , and first choose an interval B around l, which determines two
horizontal lines in the plane. Then “by requiring that x be sufficiently close to a” means
that we can find an interval A around a, so that the graph of the function f above A lies
between the two horizontal lines, except perhaps at a. The idea is that if “f approaches
the limit l near a”, this should be possible no matter how small we choose the interval B.
Example 3.1. Consider the function f(x) = 3x with a = 5. The limit should be l = 15.
Suppose we want to make sure that f(x) is within 1/10 of 15. This means we want
15 − \frac{1}{10} < 3x < 15 + \frac{1}{10}    (3.2)
which we can also write as
−\frac{1}{30} < x − 5 < \frac{1}{30}    (3.3)
or simply |x − 5| < 1/30. This means that as long as we take x within a distance of 1/30
from a, f(x) will be within a distance of 1/10 from l.
Exercise 3.1. Convince yourself that the function f(x) = x^2 approaches l = 9 near a = 3
in this sense. Suppose we would like to make f (x) within distance 1 from l = 9. How
close does x have to be to a = 3?
Example 3.2. Consider the function f(x) = 1/x for x ≠ 0, and let us try to show that
f(x) approaches l = 1/3 near a = 3. Let us convince ourselves that f(x) will be arbitrarily
close to l if x is sufficiently close to a = 3. This means we would like to show that for any
chosen distance ε > 0 the inequality
\left| \frac{1}{x} − \frac{1}{3} \right| < ε    (3.4)


Figure 3.1.: Geometric interpretation of the definition of a limit.

can be satisfied provided x is in close range of a, namely |x − 3| is small. We begin with

\left| \frac{1}{x} − \frac{1}{3} \right| = \left| \frac{3 − x}{3x} \right| = \frac{1}{3} \frac{1}{|x|} |x − 3| .    (3.5)
We have to make sure that the factor 1/|x| is not too large in the range we consider, so
let us first require that |x − 3| < 1, which ensures that 2 < x < 4, so that
\frac{1}{4} < \frac{1}{x} < \frac{1}{2}    (3.6)
and so 1/|x| < 1/2 because 0 < x = |x|. Therefore
\left| \frac{1}{x} − \frac{1}{3} \right| < \frac{1}{6} |x − 3|    (|x − 3| < 1)    (3.7)
which shows that |1/x − 1/3| < ε provided |x − 3| < 6ε, which is also < 1 as long as we
started out with ε < 1/6. The emphasis here is on being able to choose ε small; any
upper bound on ε is irrelevant for the notion of a limit.
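The choice of δ in Example 3.2 can be spot-checked by sampling. The sketch below takes δ = min(1, 6ε), which combines the two requirements |x − 3| < 1 and |x − 3| < 6ε into one formula valid for every ε > 0; the function names are introduced here. Sampling does not replace the proof, it only fails to find a counterexample.

```python
def delta_for(eps):
    """A delta for f(x) = 1/x at a = 3: both |x - 3| < 1 and |x - 3| < 6 * eps."""
    return min(1.0, 6.0 * eps)

def check(eps, samples=1000):
    """Sample x with 0 < |x - 3| < delta and test |1/x - 1/3| < eps."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # strictly between 0 and delta
        for x in (3.0 - offset, 3.0 + offset):
            if not abs(1.0 / x - 1.0 / 3.0) < eps:
                return False
    return True

print([check(eps) for eps in (1.0, 0.1, 1e-3, 1e-6)])
```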

Definition 3.1 (Limit). A function f approaches the limit l near a if for every
ε > 0, there is some δ > 0 such that, for all x, if 0 < |x − a| < δ, then |f (x) − l| < ε.
This is a very important definition and you need to know it by heart!
Let us also make sure to get the logical negation of this statement right, namely to
understand what it means for a function not to approach a limit l at a:
A function does not approach the limit l at a, if there is some ε > 0 such
that for every δ > 0 there is some x which satisfies 0 < |x − a| < δ, but not
|f (x) − l| < ε.
Example 3.3. The function f (x) = sin(1/x) does not approach 0 near 0, because for
ε = 1/2 and any δ > 0, there is some x with 0 < |x| < δ such that sin(1/x) ≥ 1/2. Indeed,
we only need to choose x = 1/(π/2 + 2πn) for some n ∈ N, which becomes arbitrarily
small for n large.
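The negated definition asks for one bad point x in every punctured interval 0 < |x| < δ. A minimal sketch of how such a point can be produced for a given δ, following the choice x = 1/(π/2 + 2πn) from Example 3.3 (the function name witness is introduced here):

```python
import math

def witness(delta):
    """Return x = 1/(pi/2 + 2*pi*n) with 0 < x < delta, so that sin(1/x) = 1."""
    n = 0
    while 1.0 / (math.pi / 2 + 2 * math.pi * n) >= delta:
        n += 1
    return 1.0 / (math.pi / 2 + 2 * math.pi * n)

for delta in (1.0, 0.01, 1e-4):
    x = witness(delta)
    print(delta, x, math.sin(1.0 / x))
```

For each δ the loop simply increases n until the point falls inside the interval, which always happens because these points become arbitrarily small.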


Figure 3.2.: The function sin(1/x) does not have a limit at 0.

Exercise 3.2. In fact, more is true: The function f (x) = sin(1/x) does not approach any
limit l near 0.
Example 3.4. The function f(x) = x sin(1/x) approaches the limit 0 near 0. Since for all
x ≠ 0,
|x sin(1/x)| ≤ |x|    (3.8)
we can make |f(x)| < ε simply by requiring that 0 < |x| < δ with δ = ε.
Exercise 3.3. Show that the function f(x) = x^2 sin(1/x) approaches 0 near 0. What
about f(x) = \sqrt{|x|} sin(1/x)?
Since a function f cannot approach two different limits, we can talk about the limit l
that f approaches near a, which is denoted by

\lim_{x→a} f(x) .    (3.9)

The statement \lim_{x→a} f(x) = l has exactly the same meaning as the phrase “f approaches
l near a.” The possibility remains that f does not approach l near a for any l, and that
is expressed by saying “\lim_{x→a} f(x) does not exist.”
While the examples at the beginning of the lecture may give the impression that every
function in question has to be dealt with separately, the idea is of course to establish
general theorems which will make it easy to find limits.

Theorem 3.1 (Limit laws). If \lim_{x→a} f(x) = l and \lim_{x→a} g(x) = m, then

\lim_{x→a} (f + g)(x) = l + m    (3.10)
\lim_{x→a} (f · g)(x) = l · m    (3.11)

Moreover, if m ≠ 0, then
\lim_{x→a} \left( \frac{f}{g} \right)(x) = \frac{l}{m}    (3.12)


Example 3.5. Using the Theorem we can prove, trivially, such statements as

\lim_{x→a} \frac{x^3 + 7x^5}{x^2 + 1} = \frac{a^3 + 7a^5}{a^2 + 1}    (3.13)
without going through the laborious process of finding a δ for a given ε.
We only give a proof of the first “limit law”.

Proof. Let ε > 0. By assumption we know that there are δ1 , δ2 > 0 such that, for all x,

0 < |x − a| < δ1 =⇒|f (x) − l| < ε/2


0 < |x − a| < δ2 =⇒|g(x) − m| < ε/2.

So if we choose δ = min(δ1 , δ2 ) to be the smallest of the two, then both statements are
true for all 0 < |x − a| < δ, and moreover

|(f + g)(x) − (l + m)| = |f (x) − l + g(x) − m| ≤ |f (x) − l| + |g(x) − m| < ε/2 + ε/2 = ε

whenever 0 < |x − a| < δ.


The proofs of the other two statements are very similar but the choices of δ are more
intricate. For example, to achieve that

|f (x)g(x) − lm| < ε

we would write

|f (x)g(x) − lm| = |(f (x) − l)g(x) + l(g(x) − m)| ≤ |f (x) − l||g(x)| + |l||g(x) − m|

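and so a good choice of δ_1 ensures that

if 0 < |x − a| < δ_1, then |f(x) − l| < \frac{ε}{2(|m| + 1)} ,

which together with a choice of δ_2 such that

if 0 < |x − a| < δ_2, then |g(x) − m| < \min\left( 1, \frac{ε}{2(|l| + 1)} \right)

yields the desired inequality for all 0 < |x − a| < δ, where again δ = min(δ_1, δ_2). Indeed,
the second condition gives |g(x)| < |m| + 1, so each of the two terms on the right-hand
side of the estimate above is less than ε/2.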

Problems
1. Find the following limits

a) \lim_{x→1} \frac{1 − \sqrt{x}}{1 − x}

b) \lim_{x→0} \frac{1 − \sqrt{1 − x^2}}{x}


2. In each of the following cases, determine the limit l for the given a, and prove that
it is the limit by showing how to find a δ such that |f (x) − l| < ε for all x satisfying
0 < |x − a| < δ.
a) f(x) = x^2 + 5x − 2, a = 2
b) f(x) = x^4, for any a > 0.
c) f(x) = \sqrt{|x|}, a = 0.

3. Give an example of a function f for which the following assertion is false: If


|f (x) − l| < ε when 0 < |x − a| < δ, then |f (x) − l| < ε/2 when 0 < |x − a| < δ/2.

4. a) If \lim_{x→a} f(x) and \lim_{x→a} g(x) do not exist, can \lim_{x→a} (f(x) + g(x)) exist?
What about \lim_{x→a} f(x)g(x)?
b) If \lim_{x→a} f(x) exists and \lim_{x→a} (f(x) + g(x)) exists, must \lim_{x→a} g(x) exist?
c) If \lim_{x→a} f(x) exists and \lim_{x→a} g(x) does not exist, can \lim_{x→a} (f(x) + g(x))
exist?

5. a) Prove that \lim_{x→a} f(x) = l if and only if \lim_{x→a} (f(x) − l) = 0.
b) Prove that \lim_{x→0} f(x) = \lim_{x→a} f(x − a).
c) Prove that \lim_{x→0} f(x) = \lim_{x→0} f(x^3).

6. Prove that if \lim_{x→0} g(x) = 0 and |h(x)| ≤ M for all x, then \lim_{x→0} g(x)h(x) = 0.

Note 4.
Continuous functions
4.1. Definition of continuity
Intuitively, a function f is continuous if the graph contains no breaks, jumps, or wild
oscillations.

Definition 4.1 (Continuity). A function f is continuous at a if

\lim_{x→a} f(x) = f(a) .    (4.1)

Remark 4.1. There are several ways this can fail. For example, f might not be defined at
a, or the limit may not exist, in which cases this identity makes no sense. It could also
be that f is defined at a and the limit of f (x) at a exists, but these two numbers are not
the same.
Example 4.1. The function f (x) = sin(1/x) is not continuous at 0, because it is not even
defined at 0.
Example 4.2. The function f(x) = x sin(1/x) is not defined at 0 either, but the limit
\lim_{x→0} x sin(1/x) exists and is 0, so while f is not continuous at 0, we can define an
extension of f, namely the function

F(x) = \begin{cases} f(x) & x ≠ 0 \\ 0 & x = 0 \end{cases}    (4.2)

which is continuous at 0.
Example 4.3. Any monomial f(x) = x^n is continuous at any a, because repeated use of the
product rule (3.11), starting from \lim_{x→a} x = a, gives \lim_{x→a} x^n = a^n = f(a).

Theorem 4.1 (Continuity laws). If f and g are continuous at a, then f + g and f · g
are continuous at a, and moreover if g(a) ≠ 0, then 1/g is continuous at a.

Proof. This follows directly from Theorem 3.1. Indeed, if \lim_{x→a} f(x) = f(a) and
\lim_{x→a} g(x) = g(a), then

\lim_{x→a} (f + g)(x) = \lim_{x→a} f(x) + \lim_{x→a} g(x) = f(a) + g(a) = (f + g)(a) .    (4.3)

Similarly for f · g, and 1/g.


The theorem allows us to infer that rational functions are continuous at every point
in their domain. We defer the proof that the trigonometric functions are continuous,
but even if we know that we are still unable to prove the continuity of functions like
f (x) = sin(x2 ), before making a statement about compositions:
Theorem 4.2 (Composition of continuous functions). If g is continuous at a, and f is
continuous at g(a), then f ◦ g is continuous at a.
Proof. Let ε > 0. Since f is continuous at g(a) we can find a δ > 0 such that, if
|y − g(a)| < δ, then |f(y) − f(g(a))| < ε. So now choose η > 0, so that for all x, if
|x − a| < η, we have |g(x) − g(a)| < δ, which is possible because g is continuous at a.
Taking y = g(x), it follows that |f(g(x)) − f(g(a))| < ε whenever |x − a| < η, which is
the continuity of f ◦ g at a.

Example 4.4. With this theorem we can now infer that F(x) from (4.2) is continuous at
every point. Similarly for functions like f(x) = sin(x^2 + sin(x)), etc.
Exercise 4.1. Give another proof of the quotient case in the limit laws using Theorem 4.2.
In other words, use the theorem about compositions of continuous functions to show that
if g is continuous at a, and g(a) ≠ 0, then
\lim_{x→a} \frac{1}{g(x)} = \frac{1}{g(a)} .    (4.4)
So far we have talked about continuity at a point. The consequences of continuity are
more powerful when it refers to continuity on an interval: We say f is continuous on
(a, b) if f is continuous at x for all x ∈ (a, b). A special case is a function that is
continuous on all of R = (−∞, ∞).

4.2. Consequences of continuity


A continuous function is sometimes described, intuitively, as one whose graph can be
drawn without lifting your pencil off the paper. While Example 4.2 shows this description
is a little too optimistic, it is true that all the following theorems about continuity are
clear in this picture:
Theorem 4.3. Suppose f is continuous at a, and f (a) > 0. Then f (x) > 0 for all x in
some interval containing a.
Proof. Since f is continuous at a, we know that for ε = f (a) > 0 there is a δ > 0 so that
for all x, if |x − a| < δ, then |f (x) − f (a)| < ε, or −ε < f (x) − f (a) < ε, which means

0 < f (x) < 2f (a) (4.5)

for all x in the interval (a − δ, a + δ).

Exercise 4.2. Formulate and prove an analogous statement under the assumption f (a) < 0.
The next theorem says, geometrically, that the graph of a continuous function which
starts below the horizontal axis and ends above the horizontal axis must cross this axis
at some point.


Theorem 4.4. If f is continuous on an interval (c, d), c < a < b < d, and

f (a) < 0 < f (b) ,

then there is some x in (a, b) such that f (x) = 0.

This theorem requires continuity on the whole interval.


Exercise 4.3. Give an example of a function for which continuity fails to hold at a single
point, and the conclusion is false.
There are simple generalisations of this theorem: Firstly there is nothing special about
the number 0. In the situation that f (a) < c < f (b), there is some x ∈ (a, b) such that
f (x) = c. Indeed, simply apply the theorem to the function g = f − c. Moreover if
f (a) > c > f (b), then we can apply the theorem to g = −f + c. This can be summarized
in the Intermediate Value Theorem: If a continuous function on an interval takes
on two values, then it takes on every value in between.
The proof of Theorem 4.4 relies on a deep property of the real numbers that we will
not discuss at this point. However, we can guess that the proof of the intermediate
value theorem must rely on a property of the real numbers, as opposed to the rationals,
because Theorem 4.4 implies immediately the existence of square roots:

Theorem 4.5. Every positive number has a square root.

Proof. Consider the function f(x) = x^2. We want to show that if α > 0, there exists a
number x such that f(x) = x^2 = α. There is obviously a number b such that f(b) > α.
(In fact, if α > 1, take for example b = α, and if α < 1, take for example b = 1; for
α = 1 we simply have x = 1.) Since f is continuous, and f(0) = 0 < α < f(b), there exists
a number x in the interval [0, b] such that f(x) = α, so x^2 = α.
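The theorem only asserts that a square root exists. One way to approximate it in practice is bisection, which repeatedly halves an interval on whose endpoints f − α changes sign. This is a different technique from the argument in the notes, shown here only as an illustrative sketch; the function name and the iteration count are choices made for this example.

```python
def sqrt_by_bisection(alpha, iterations=100):
    """Approximate the positive x with x * x = alpha, for alpha > 0."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Start with lo * lo <= alpha <= hi * hi, as in the proof with b = max(alpha, 1).
    lo, hi = 0.0, max(alpha, 1.0)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid * mid < alpha:
            lo = mid        # the root lies in [mid, hi]
        else:
            hi = mid        # the root lies in [lo, mid]
    return (lo + hi) / 2

print(sqrt_by_bisection(2.0))
```

Each step keeps the sign change inside the interval, so the Intermediate Value Theorem guarantees that a solution of x^2 = α remains between lo and hi.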

Problems
1. For which of the following functions f is there a continuous extension F of f ? In
other words, for which of the following functions is there a continuous function F
on the real line such that F (x) = f (x) for all x in the domain of f .
a) f(x) = \frac{x^2 − 4}{x − 2}
b) f(x) = \frac{|x|}{x}
c) f(x) = x^2 sin(1/x^2)

2. a) Suppose that f is a function satisfying |f (x)| ≤ |x| for all x. Show that f is
continuous at 0.
b) Suppose that g is continuous at 0 and g(0) = 0, and |f (x)| ≤ |g(x)|. Prove
that f is continuous at 0.


3. Prove that if f is continuous at a then for any ε > 0 there is a δ > 0 so that
whenever |x − a| < δ and |y − a| < δ we have |f (x) − f (y)| < ε.

4. Find an integer n such that f (x) = 0 for some x between n and n + 1, where

f(x) = x^3 − x + 3    (4.6)

5. Suppose f is continuous, and that f (x) = 0 only for x = a.


a) Suppose f(x) > 0 for some x > a as well as for some x < a. What can be said
about f(x) for all x ≠ a?
b) Suppose f(x) > 0 for some x > a and f(x) < 0 for some x < a. What can be
said about f(x) for x ≠ a?

6. Suppose that f is a continuous function on [0, 1] and that f (x) is in [0, 1] for each
x. (Draw a picture!) Prove that f (x) = x for some number x in the unit interval.

Note 5.
Theorems about continuity
5.1. Global properties of continuous functions
We have already seen one important theorem about continuity, namely the Intermediate
value theorem which states that
Theorem 5.1 (Intermediate value theorem). If f is continuous on [a, b] and
f (a) < t < f (b),
then there is some x in (a, b) such that f (x) = t.
Let us now state two more theorems about continuity and explore some of their
consequences:
Theorem 5.2 (Bounded value theorem). If f is continuous on [a, b], then f is bounded
above on [a, b], that is there is some number N such that f (x) ≤ N for all x in [a, b].
Geometrically, this means that the graph of f lies below some horizontal line.
The third theorem states that a continuous function on a closed interval always achieves
a maximum:
Theorem 5.3 (Extreme value Theorem). If f is continuous on [a, b], then there is some
number y in [a, b] such that f (y) ≥ f (x) for all x in [a, b].
These theorems all rely on the continuity of f on the whole interval [a, b]. Indeed, the
conclusions can be false if continuity fails at even a single point.
Example 5.1. For Theorem 5.2, take the function

f(x) = \begin{cases} 1/x & x ≠ 0 \\ 0 & x = 0 \end{cases}    (5.1)
which is continuous at every point except 0, but f is not bounded above, for example on
the interval [0, 1].
Example 5.2. For Theorem 5.3, consider the function

f(x) = \begin{cases} x^2 & x < 1 \\ 0 & x ≥ 1 \end{cases}    (5.2)
On the interval [0, 1] the function is bounded above, but there is no y in [0, 1] such that
f(y) ≥ f(x) for all x in the interval.


These important theorems are stated in the simplest setting and are easily generalised.
For example, a continuous function on a closed interval always achieves a minimum, too:
Exercise 5.1. Use Theorem 5.3 to show that if f is continuous on [a, b] then there is some
y in [a, b] such that f (y) ≤ f (x) for all x in [a, b].

5.2. Roots of polynomial equations


We illustrate the power of these theorems with an application to polynomials. We have
already seen that Theorem 5.1 implies the existence of square roots. More generally, we
have
Theorem 5.4. If n is odd, then for any numbers a_0, …, a_{n−1}, the equation
x^n + a_{n−1} x^{n−1} + … + a_1 x + a_0 = 0    (5.3)
has a root.
This is true because the function f(x) = x^n + a_{n−1} x^{n−1} + … + a_1 x + a_0 is positive for
large positive x, and negative for large negative x, so f(x) = 0 for some x by Theorem 5.1.
More precisely, for n odd,
\lim_{x→±∞} f(x) = ±∞ .    (5.4)
Remark 5.1. The statement that \lim_{x→∞} f(x) = ∞ does not mean that f has a limit
as x becomes large, but that f is unbounded, in the sense that, for every N > 0, there
exists M > 0, so that
x > M =⇒ f(x) > N .    (5.5)
Similarly for statements such as \lim_{x→a} f(x) = ∞ (see the additional notes to Module II).
Exercise 5.2. Prove (5.4). If this is difficult, read on and return to it after the proof of
the Theorem below.
Examples of equations like x^2 + 1 = 0 show that we cannot hope to show the same
result for n even. However, in this case the equation x^2 + 1 = c has a solution for c ≥ 1.
While we are changing the problem (in effect we are playing with the constant a_0), a
result of this type can be proven more generally with the help of Theorem 5.3.
Theorem 5.5. Suppose n is even. Then there is a number m such that
x^n + a_{n−1} x^{n−1} + … + a_1 x + a_0 = c    (5.6)
has a solution for c ≥ m, and no solution for c < m.
The idea here is to show first that f(x) = x^n + a_{n−1} x^{n−1} + … + a_1 x + a_0 has an
absolute minimum, namely there is a number y such that f(y) ≤ f(x) for all x. Given
such a y, set m = f(y); then f(x) = c obviously has no solution for c < m, but for c = m
we have that x = y is a solution. In the case c > m we can find a number b > y with
f(b) > c, so f(y) = m < c < f(b) and we can apply Theorem 5.1 to infer the existence of
a solution of f(x) = c.


Exercise 5.3. Why can we find a number b > y with f (b) > c?
It remains to show:

Lemma 5.6. If n is even and f(x) = x^n + a_{n−1} x^{n−1} + … + a_0, then there is a number
y such that f(y) ≤ f(x) for all x.

Proof. First note that

\lim_{x→±∞} f(x) = ∞ .    (5.7)

Hence we can find an interval [−b, b] so large that, for any point x outside that interval,

|x| > b =⇒ f(x) > |f(0)| .    (5.8)

Now we can use Theorem 5.3, in the form of Exercise 5.1, to infer that f has a minimum
on the closed interval [−b, b]: there is some y ∈ [−b, b] such that
f(x) ≥ f(y)    (5.9)
for all x ∈ [−b, b]. The point y is a minimum not just on the interval [−b, b], because
for |x| > b we have
f(x) > |f(0)| ≥ f(0) ≥ f(y) .    (5.10)

Solution 5.4. In the proofs of the theorems above we have used that for any polynomial
of degree n ∈ N,
f(x) = x^n + a_{n−1} x^{n−1} + … + a_0    (5.11)
we have
\lim_{x→∞} f(x) = ∞ .    (5.12)
We can see this as follows:
f(x) = x^n \left( 1 + \frac{a_{n−1}}{x} + … + \frac{a_0}{x^n} \right)    (x > 0)    (5.13)
Now by choosing M = max(1, 2n|a_{n−1}|, …, 2n|a_0|) we get that for |x| ≥ M,
\left| \frac{a_{n−1}}{x} + … + \frac{a_0}{x^n} \right| ≤ \left( |a_{n−1}| + … + |a_0| \right) \frac{1}{M} ≤ \frac{1}{2} .    (5.14)
Therefore, for all x ≥ M,
\frac{1}{2} x^n ≤ x^n \left( 1 + \frac{a_{n−1}}{x} + … + \frac{a_0}{x^n} \right) = f(x) .    (5.15)
This shows that for any N > 0, we can find M > 0, so that, if x > M, then f(x) > N.
Indeed, for a given N > 1, let us choose M = max{1, 2n|a_{n−1}|, …, 2n|a_0|, 2N}; then,
whenever x > M, we have
f(x) ≥ \frac{1}{2} x^n > 2^{n−1} N^n ≥ N .    (5.16)
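The threshold M from this solution is explicit, so it can be computed and tested for a concrete polynomial. In the sketch below the polynomial x^4 − 5x^3 − 20x + 7 and the value N = 100 are hypothetical examples, and the helper names poly and threshold are introduced here.

```python
def poly(coeffs, x):
    """Evaluate x^n + a_{n-1} x^{n-1} + ... + a_0, with coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    return x ** n + sum(a * x ** k for k, a in enumerate(coeffs))

def threshold(coeffs, N):
    """M = max{1, 2n|a_{n-1}|, ..., 2n|a_0|, 2N}, as chosen before (5.16)."""
    n = len(coeffs)
    return max([1.0, 2.0 * N] + [2.0 * n * abs(a) for a in coeffs])

coeffs = [7.0, -20.0, 0.0, -5.0]   # x^4 - 5x^3 - 20x + 7 (hypothetical example)
N = 100.0
M = threshold(coeffs, N)

# Beyond M the polynomial should dominate x^n / 2, which in turn exceeds N.
xs = [M * s for s in (1.001, 2.0, 10.0)]
print(M, [poly(coeffs, x) >= 0.5 * x ** len(coeffs) > N for x in xs])
```

The bound is far from sharp; its purpose is only to show that some M exists for every N.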


5.3. What about the proof?


We have deferred the proofs of the theorems about continuity because they rely on a
property of the real numbers that we have not discussed yet. Let us see what this property
might be by trying to give a proof of Theorem 4.4, namely:
If f is continuous on [a, b] and f (a) < 0 < f (b), then there is some x in [a, b]
such that f (x) = 0.
In the situation that f (a) < 0 a promising idea seems to be to locate the first point
where f (x) = 0, namely the smallest point in [a, b] such that f (x) = 0. To find this point
consider the set of numbers A which contains all points x in [a, b] such that f is negative
on [a, x]. In view of Theorem 4.3 the set A contains some points close to a, while all
points sufficiently close to b are not in A.
Now it is clear that we can find a number α that is greater than all the numbers in
A (for example α = b is such a choice), but it is not so clear if we can find the smallest
number α which is greater than all the numbers in A. It is precisely this property of the
real numbers, that ensures the existence of α which allows us to proceed. A thorough
discussion of “least upper bounds” is beyond the scope of this subject.
We claim that f (α) = 0, and prove this by eliminating the possibilities that f (α) < 0
or f (α) > 0.
Suppose first that f (α) < 0. Then by Theorem 4.3 f (x) < 0 for all x in a small interval
containing α, in particular for some numbers greater than α, which contradicts that α is
an upper bound for A.
On the other hand, if f(α) > 0 then by Theorem 4.3 f(x) > 0 for all x in a small
interval containing α, in particular for some numbers smaller than α, which means that
these numbers are not in A, and we could have chosen a smaller upper bound for A,
which contradicts that α is the smallest upper bound for A.
Hence f (α) = 0.

Problems
1. For each of the following functions, decide which are bounded above and below on
the indicated interval, and which take on their maximum or minimum value.
a) f(x) = x^2 on (−1, 1)
b) f(x) = x^3 on (−1, 1)
c) f(x) = x^2 on R
d) f(x) = x^2 on [0, ∞)
e) f(x) = \begin{cases} x^2, & x ≤ a \\ a + 2, & x > a \end{cases} on (−a − 1, a + 1). (Assume here that a > −1.)
f) f(x) = \begin{cases} x^2, & x < a \\ a + 2, & x ≥ a \end{cases} on [−a − 1, a + 1]. (Again assume here that a > −1.)


2. Suppose f and g are continuous on [a, b] and that f (a) < g(a), but f (b) > g(b).
Prove that f (x) = g(x) for some x in [a, b].

3. Suppose that f is a continuous function with f (x) > 0 for all x, and

\lim_{x→−∞} f(x) = 0 = \lim_{x→∞} f(x) .    (5.17)

Prove that there is some number y such that f (y) ≥ f (x) for all x.

Additional: Limits and Continuity
Recommended Reading

(Spivak, Calculus, Chapter 5)

Real Analysis: Advanced (MAST20033)

In the lecture we have mentioned without proof that:

Theorem 5.1. A function f cannot approach two different limits near a.

Proof. We want to show that if f approaches l near a, and f approaches m near a, then
l = m.
By definition, for a given ε > 0, there exist δ1 > 0, and δ2 > 0 such that

0 < |x − a| < δ1 =⇒ |f (x) − l| < ε (5.1)


0 < |x − a| < δ2 =⇒ |f (x) − m| < ε (5.2)

and in particular both conclusions hold provided 0 < |x − a| < δ, if we choose δ =
min(δ1, δ2). If it were true that l ≠ m, then we could choose ε = |l − m|/2 > 0, and it
follows that for all x, if 0 < |x − a| < δ,

|l − m| = |l − f (x) + f (x) − m| ≤ |l − f (x)| + |f (x) − m| < 2ε = |l − m| (5.3)

which is a contradiction.

Exercise 5.1. First interpret separately, precisely in the sense of Definition 3.1, and then
prove equality of the expressions

\[ \lim_{x \to a} f(x) \quad \text{and} \quad \lim_{h \to 0} f(a + h) . \tag{5.4} \]

There are times we would like to talk about the limit of f as x approaches a “from
above”, or “from below”. These are situations when f (x) may not be defined for all x in
|x − a| < δ, but only say for x > a, or x < a, but the “one-sided” limits still exist.

Definition 5.1. A function f has a limit l as x approaches a from above, if for
every ε > 0 there is a δ > 0 such that, for all x, if 0 < x − a < δ, then |f(x) − l| < ε. We
then write
\[ \lim_{x \to a^+} f(x) = l . \tag{5.5} \]


As an exercise write down the definition of a limit from below.


Example 5.1. Consider the function

\[ f(x) = \begin{cases} -1, & x < 0 \\ 1, & x > 0 \end{cases} \tag{5.6} \]

This function does not approach any number near 0, but the limits from above and below
do exist:
\[ \lim_{x \to 0^+} f(x) = 1 , \qquad \lim_{x \to 0^-} f(x) = -1 . \tag{5.7} \]
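The one-sided limits of Example 5.1 can be illustrated numerically. The sketch below samples the step function at points approaching 0 from above and from below; the sample points 10⁻ᵏ are an arbitrary choice made for illustration, and sampling finitely many points is of course no substitute for the ε-δ argument.

```python
def f(x):
    """The step function of Example 5.1 (not defined at x = 0)."""
    if x == 0:
        raise ValueError("f is not defined at 0")
    return 1.0 if x > 0 else -1.0


# Sample f at points approaching 0 from above and from below.
from_above = [f(10.0 ** -k) for k in range(1, 8)]
from_below = [f(-(10.0 ** -k)) for k in range(1, 8)]
print(from_above)  # every entry is 1.0
print(from_below)  # every entry is -1.0
```

Since the two lists settle on different values, no single number l can serve as a two-sided limit at 0.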

Additional Problems
1. Prove that limx→a f (x) exists if limx→a+ f (x) = limx→a− f (x).

2. The function f(x) = 1/x² does not approach a limit near 0. Nonetheless it is
common to write limx→0 f(x) = ∞. In general we define “limx→a f(x) = ∞” to
mean that for all N there is a δ > 0 such that, for all x, if 0 < |x − a| < δ, then
f(x) > N.
a) Show that limx→3 1/(x − 3)² = ∞.
b) Prove that if f(x) > ε > 0 for all x, and limx→a g(x) = 0, then
\[ \lim_{x \to a} \frac{f(x)}{|g(x)|} = \infty . \]

Recommended Reading

(Spivak, Calculus, Chapter 6, 7)

Real Analysis: Advanced (MAST20033)

Exercise 5.2. You might have noticed that in the proof of Theorem 4.2 we have taken
limx→a f (x) = f (a) to mean that for every ε > 0, there exists a δ > 0, so that for all x,
if |x − a| < δ, then |f (x) − f (a)| < ε. So for a function f which is continuous at a, we
have dropped the condition that 0 < |x − a|. Why?


For the theorems about global properties of continuous functions, the notion of continuity
on a closed interval is important. We say a function f is continuous on [a, b] if f is
continuous at all x in (a, b), and
\[ \lim_{x \to a^+} f(x) = f(a) , \qquad \lim_{x \to b^-} f(x) = f(b) . \tag{5.8} \]

Recall here from Definition 5.1 (Additional: Limits) what it means for a function to
approach a limit from above or from below.
Another variation of the “one-sided” limit occurs when we talk about the limit of a
function f (x) “as x approaches ∞.”
Example 5.2. We have
\[ \lim_{x \to \infty} \sin(1/x) = 0 . \tag{5.9} \]

Definition 5.2. A function f(x) approaches a limit l “as x goes to infinity,”
\[ \lim_{x \to \infty} f(x) = l , \]
if for every ε > 0, there is a number N ∈ ℕ such that, for all x,
\[ x > N \implies |f(x) - l| < \varepsilon . \]
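To see how the definition works for the limit of sin(1/x) as x goes to infinity, we can write down an explicit N for each ε. The sketch below assumes the standard bound |sin(t)| ≤ |t|, which is not proved in these notes; the function name `witness_N` is hypothetical.

```python
import math


def witness_N(eps):
    """Return a natural number N with |sin(1/x)| < eps for all x > N.

    Uses |sin(t)| <= |t|: if x > N >= 1/eps, then |sin(1/x)| <= 1/x < eps.
    """
    return max(1, math.ceil(1.0 / eps))


eps = 1e-3
N = witness_N(eps)
# Spot-check the implication  x > N  =>  |sin(1/x) - 0| < eps  on sample points.
samples = [N + 0.5, 2 * N, 10 * N, 1000 * N]
ok = all(abs(math.sin(1.0 / x)) < eps for x in samples)
print(N, ok)
```

The spot-check only tests a few values of x; the actual proof is the inequality stated in the docstring.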

Additional Problems
1. a) Prove the following version of Theorem 4.3 for “right-hand continuity”: Suppose
that limx→a+ f (x) = f (a), and f (a) > 0. Then there is a number δ > 0 such
that f (x) > 0 for all x satisfying 0 ≤ x − a < δ. Similarly if f (a) < 0, then
there is a number δ > 0 such that f (x) < 0 for all x satisfying 0 ≤ x − a < δ.
b) Prove a version of Theorem 4.3 when limx→b− f (x) = f (b).

2. a) Prove that if f is continuous on [a, b], then there is a function g which is
continuous on R which satisfies g(x) = f(x) for all x ∈ [a, b].
b) Give an example to show that this assertion is false if [a, b] is replaced by
(a, b).

3. Prove that limx→0+ f (1/x) = limx→∞ f (x).

Note 6.
Functions of two variables: Limits and
Continuity
In Note 3 we have arrived at a definition of what it means for a function of one variable
to approach a limit at a point a. In this note we want to extend this notion to functions
f (x, y) of two variables.

Definition 6.1 (Limit). A function f (x, y) of two variables has the limit l as (x, y)
approaches (a, b) if for every ε > 0, there is some δ > 0 so that

|f (x, y) − l| < ε whenever 0 < |(x, y) − (a, b)| < δ .

In other words, the definition is conceptually exactly the same as for one variable just
that | · | now refers to the distance in R²: The set of points (x, y) satisfying
\[ 0 < |(x, y) - (a, b)| = \sqrt{(x - a)^2 + (y - b)^2} < \delta \tag{6.1} \]

is a (punctured, and open) disc of radius δ centered at the point (a, b).
Remark 6.1. It is often more convenient to use the following equivalent formulation:
A function f (x, y) has a limit l as (x, y) approaches (a, b) if for every ε > 0 there is a
δ > 0 so that |f(x, y) − l| < ε whenever (x, y) ≠ (a, b) and

max{|x − a|, |y − b|} < δ .

Indeed, if |f(x, y) − l| < ε on a punctured disc of radius δ, then this will also be
the case on a punctured square with side length 2δ′, provided √2 δ′ ≤ δ. Conversely, if
|f(x, y) − l| < ε on a punctured square of side length 2δ, then this will be true for all
points (x, y) on a punctured disc that fits into this square, namely a disc of radius δ.
The definition absolves us from the following dilemma. For a function of one variable
f (x) it is clear what we mean by “x approaches a”: we can approach the point a either
from the left, or the right. However, for functions of two variables there are infinitely many
ways to approach the point (a, b) in the plane, because any curve in the plane through
(a, b) gives a way to approach (a, b). Moreover, the value that a function approaches at a
point may depend on the direction in which this point is approached.


Example 6.1. Let
\[ g(x, y) = \begin{cases} \dfrac{x^2 y}{x^4 + y^2} & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0) . \end{cases} \tag{6.2} \]

First observe that g(x, 0) = 0 and g(0, y) = 0. In particular, along the coordinate axes g goes
to 0. Now let c ≠ 0, and consider g evaluated on the points (x, cx):
\[ g(x, cx) = \frac{c x^3}{x^4 + c^2 x^2} = \frac{c x}{c^2 + x^2} \to 0 \qquad (x \to 0) . \tag{6.3} \]
This shows that along any straight line through the origin the function g tends to 0.
However, if we approach the origin on a parabola y = cx², for any c ≠ 0, then
\[ g(x, c x^2) = \frac{c}{1 + c^2} \neq 0 . \tag{6.4} \]

Our definition does not make reference to the “way” in which (a, b) is approached. In
fact, according to our definition the function g(x, y) of the previous example does not
have a limit at (0, 0) at all!
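The behaviour of g in Example 6.1 can also be observed numerically. The sketch below evaluates g along one straight line y = cx and along one parabola y = cx²; the value c = 2 and the sample points are arbitrary choices made for illustration.

```python
def g(x, y):
    """The function of Example 6.1."""
    if x == 0 and y == 0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)


c = 2.0
for x in [1e-1, 1e-2, 1e-3]:
    on_line = g(x, c * x)          # tends to 0 as x -> 0
    on_parabola = g(x, c * x * x)  # stays at c / (1 + c^2) = 0.4
    print(x, on_line, on_parabola)
```

The values on the line shrink with x, while the values on the parabola do not move, however close the point is to the origin.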
Exercise 6.1. Why does our working of Example 6.1 show that g(x, y) does not have a
limit at the origin?
Exercise 6.2. Let
\[ f(x, y) = \frac{x y}{x^2 + y^2} \qquad (x, y) \neq (0, 0) \tag{6.5} \]
and let f(0, 0) = 0. Show that f does not have a limit as (x, y) approaches (0, 0).
Solution 6.3. First note that f(0, y) = 0, and f(x, 0) = 0. Now consider the values of f
on the straight lines through the origin: For any c ≠ 0,
\[ f(x, cx) = \frac{c x^2}{x^2 + c^2 x^2} = \frac{c}{1 + c^2} \qquad (x \neq 0) . \tag{6.6} \]
We need to show that for any l, we can find ε > 0, so that for all δ > 0, there exists a
point (x, y) in the punctured disc of radius δ, where |f(x, y) − l| > ε.
For l = 0, we can always choose a point (x, x) on the straight line y = x arbitrarily
close to the origin, where f(x, x) = 1/2, so we can arrange this for any ε < 1/2.
For l ≠ 0, simply choose ε = |l|/2; then for any point (x, 0) on the x-axis with x ≠ 0, |f(x, 0) − l| = |l| > ε.
Notice that the definition of a limit, which if it does exist for a function f (x, y) at the
point (a, b) we denote by
\[ \lim_{(x, y) \to (a, b)} f(x, y) , \tag{6.7} \]

does not involve the value f (a, b) at all; only the values of the function near (a, b) are
relevant here. Indeed the function need not even be defined at (a, b). However, if f is
defined at (a, b), and its value at the point agrees with its limit as we approach (a, b),
then the function is said to be continuous at (a, b):


Definition 6.2 (Continuity). A function f(x, y) is continuous at (a, b) if it has a limit
at (a, b), and
\[ \lim_{(x, y) \to (a, b)} f(x, y) = f(a, b) . \tag{6.8} \]

While the examples above may have given the impression that functions of two variables,
in general, do not have a limit, this is rather the exception than the rule:
The functions

f (x, y) = x + y (6.9)
f (x, y) = xy (6.10)

are continuous everywhere in the plane. Moreover

f (x, y) = x − y (6.11)
f(x, y) = x/y (y ≠ 0) (6.12)

are continuous on their domains. Furthermore (see the additional notes to Module II):
(Continuity laws) the sum and product of two continuous functions are continuous.
Moreover, the quotient of two continuous functions is continuous
(on the set where the denominator is not zero).
Since the “elementary functions” of one variable, in particular polynomials and trigono-
metric functions, are all continuous (on their domains), it is almost immediate that all
the “elementary functions” of two variables, namely those built up of these functions of
one variable, by arithmetic operations and compositions, are also continuous, where they
are defined.
Example 6.2. The function
\[ f(x, y) = \frac{\sin(3x + 2y)}{x^2 - y} \tag{6.13} \]
is continuous everywhere, except along the parabola y = x².
Exercise 6.4. Let f be defined by
\[ f(x, y) = \begin{cases} \dfrac{x y (x^2 - y^2)}{x^2 + y^2} & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0) . \end{cases} \tag{6.14} \]
Decide if f is continuous at the origin.


Solution 6.5. Experimentation with lines does not give any indication that the limit of f
near (0, 0) does not exist, or is anything other than 0. So let us try to show that

\[ \lim_{(x, y) \to (0, 0)} f(x, y) = 0 \tag{6.15} \]

which, if true, then shows that f is continuous at the origin. Since |x² − y²| ≤ x² + y²,
we have
\[ |f(x, y)| \le |x y| \tag{6.16} \]


for all (x, y) ≠ (0, 0). Since the function h(x, y) = xy is continuous at the origin, it follows
that f is continuous at the origin. Indeed, let ε > 0. Then for all (x, y) ≠ (0, 0), with
|(x, y)| < δ, we have
\[ |f(x, y)| \le |x y| \le \frac{1}{2} (x^2 + y^2) < \varepsilon , \tag{6.17} \]
provided δ is chosen so that δ² ≤ 2ε, say δ = √(2ε).
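The estimate (6.17) can be spot-checked on random points. In the sketch below δ is taken to be √(2ε), one choice for which the bound (x² + y²)/2 < δ²/2 gives ε; the random sampling, the seed and the helper name `delta_for` are illustrative assumptions, and a finite sample does not replace the proof.

```python
import math
import random


def f(x, y):
    """The function of Exercise 6.4."""
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def delta_for(eps):
    """A delta that works for the given eps, from the bound |f| <= (x^2 + y^2)/2."""
    return math.sqrt(2.0 * eps)


random.seed(0)
eps = 1e-2
delta = delta_for(eps)
violations = 0
for _ in range(10_000):
    # Random point in the disc of radius delta around the origin.
    r = delta * random.random() * 0.999
    t = 2.0 * math.pi * random.random()
    x, y = r * math.cos(t), r * math.sin(t)
    if (x, y) != (0.0, 0.0) and abs(f(x, y)) >= eps:
        violations += 1
print(violations)
```

No sampled point violates |f(x, y)| < ε, as the inequality predicts.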

Problems
1. Show that the following functions do not have a limit at the origin:
a) f(x, y) = (x² + y)/√(x² + y²), (x, y) ≠ (0, 0)
b) f(x, y) = x/(x⁴ + y⁴), (x, y) ≠ (0, 0)

2. Show that the following functions do have a limit at the origin:
a) f(x, y) = x²y²/(x² + y²)
b) f(x, y) = (3x⁵ − xy⁴)/(x⁴ + y⁴)

3. Let
\[ f(x, y) = \frac{1}{x} \sin(x y) \qquad (x \neq 0) \tag{6.18} \]
How should we define f(0, y) for any number y so as to make f a continuous
function on the plane?

4. Consider again the function
\[ f(x, y) = \frac{x y}{x^2 + y^2} . \tag{6.19} \]
In Exercise 6.2 we have shown that f is not continuous at (0, 0). Nonetheless, prove
that f(x, b) and f(a, y) are continuous functions of x and y, respectively, for any
numbers a and b (including a = 0, b = 0).
We say f is separately continuous.

Additional: Limits and Continuity
Further Reading

(Folland, Advanced Calculus, Chapter 1.3)

Vector Calculus: Advanced (MAST20032)

Functions of several variables


Definition 6.1 extends in an obvious way to functions of several variables: If f is a function
on Rⁿ, then we say f has a limit l as $\vec{x} = (x_1, x_2, \ldots, x_n)$ approaches $\vec{a} = (a_1, \ldots, a_n)$ if
for every ε > 0, we can find a δ > 0 so that, if $0 < |\vec{x} - \vec{a}| < \delta$, then $|f(\vec{x}) - l| < \varepsilon$.

Continuity of elementary functions of two variables


In our discussion of continuous functions of two variables we have stated without proof
that:

Proposition 6.1. The functions f1 (x, y) = x + y and f2 (x, y) = xy are continuous


functions on the plane.

This can be shown in an elementary way using the definition of continuity.


We have also used without proof that compositions preserve continuity.

Proposition 6.2. Suppose f : A → R², and g : B → R are continuous functions on
their domains A, B ⊂ R², respectively. If f(A) ⊂ B, then g ◦ f : A → R is a continuous
function.

This is used to obtain results about the continuity of elementary functions.

Proposition 6.3. The function f3(x, y) = x − y is continuous on the plane, and the
function f4(x, y) = x/y is continuous on {(x, y) : y ≠ 0}.

Proof. We have f3(x, y) = f1(x, f2(−1, y)), hence f3 is a composition of continuous functions.
Moreover, f4(x, y) = f2(x, g(y)), where g(y) = 1/y is continuous away from 0, hence f4
is a composition of continuous functions on the set where it is defined.

Module III.

Differentiation

Note 7.
Differentiation in one variable
7.1. Differentiability in one variable
Following our intuition, continuous functions are those “whose graphs can be drawn
without lifting the pen off the paper”; they are still allowed to have “sharp corners”. A
differentiable function does not have a graph like that and admits a well-defined “tangent
line” at each point.

Definition 7.1. A function f is differentiable at a if
\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \tag{7.1} \]
exists. In this case the limit is denoted by f′(a) and is called the derivative of f at a.

Remark 7.1. Note that the difference quotient (f(x + h) − f(x))/h is the slope of the line
through the points (x, f(x)) and (x + h, f(x + h)). Therefore we define the tangent line
to the graph of f at (a, f(a)) to be the line through the point (a, f(a)) with slope f′(a).
We say f is differentiable if f is differentiable at every point of its domain. More
generally, we say f is differentiable on, say, an interval A = (a, b) (or some set of points
A) if f is differentiable at every point a ∈ A, and we call the function f′ the derivative
of f on the domain A.
Example 7.1. The constant function f(x) = c is differentiable and f′(x) = 0.
Exercise 7.1. The linear functions f(x) = cx + d are differentiable, and f′(x) = c.
Example 7.2. Let us compute the derivative of the function f(x) = x² at x = a:
\[ f'(a) = \lim_{h \to 0} \frac{(a + h)^2 - a^2}{h} = \lim_{h \to 0} (2a + h) = 2a \tag{7.2} \]
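The limit in Example 7.2 can be watched numerically by evaluating the difference quotient for shrinking h. This is only a sketch: the point a = 3 and the step sizes are arbitrary, and the helper name `difference_quotient` is not notation from these notes.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def square(x):
    return x * x


a = 3.0
for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    # For f(x) = x^2 the quotient equals 2a + h (up to rounding), so it tends to 2a.
    print(h, difference_quotient(square, a, h))
```

The printed slopes approach 2a = 6 at the same rate as h approaches 0, in agreement with the intermediate expression 2a + h.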

Exercise 7.2. Show that the function f(x) = x³ is differentiable and f′(a) = 3a².
Example 7.3. The function
f (x) = |x| (7.3)
is not differentiable at 0. Indeed, the difference quotient at 0 is simply |h|/h, which is 1
for h > 0, and −1 for h < 0, so the limit as h approaches 0 does not exist.
Exercise 7.3. Show that the function f(x) = |x| is differentiable at every point a ≠ 0.


Figure 7.1.: Linear approximation of a function.

Example 7.4. The function f(x) = √|x| is also not differentiable at 0. In fact, the slopes
of the tangent lines at (x, f(x)) become infinite as we approach 0 from the right, and
negative infinite as we approach from the left.
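Both failures of differentiability at 0 can be seen in the difference quotients. The sketch below compares |x|, whose quotients jump between +1 and −1, with √|x|, whose quotients grow without bound; the step sizes are arbitrary and the helper names are illustrative.

```python
import math


def quotient_at_zero(f, h):
    """Difference quotient (f(0 + h) - f(0)) / h at the point 0."""
    return (f(h) - f(0.0)) / h


def root_abs(x):
    return math.sqrt(abs(x))


for h in [1e-2, 1e-4, 1e-6]:
    # |x|: the quotient is +1 from the right and -1 from the left.
    # sqrt(|x|): the quotient is +-1/sqrt(h), which is unbounded as h -> 0.
    print(h,
          quotient_at_zero(abs, h), quotient_at_zero(abs, -h),
          quotient_at_zero(root_abs, h), quotient_at_zero(root_abs, -h))
```

In the first case the one-sided limits of the quotient exist but disagree; in the second case neither one-sided limit exists as a real number.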
These are examples of functions which are continuous, but not differentiable. Conversely,
we have:
Theorem 7.1. If f is differentiable at a, then f is continuous at a.
Proof.
\[ \lim_{h \to 0} \big( f(a + h) - f(a) \big) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \cdot h = f'(a) \lim_{h \to 0} h = 0 . \tag{7.4} \]

7.1.1. Geometric interpretation of the derivative


Let us return to the geometric interpretation of the derivative given in Remark 7.1,
cf. Figure 7.1.
Suppose f is differentiable at a point a. This means that the graph of the linear function

l(x) = f′(a)(x − a) + f(a) (7.5)

approximates the graph of the function f near a. More precisely, we know that the
difference in height
h(x) = f (x) − l(x) (7.6)
tends to zero at a faster rate than x − a, as x approaches a:
\[ \lim_{x \to a} \frac{h(x)}{x - a} = \lim_{x \to a} \frac{f(x) - l(x)}{x - a} = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} - f'(a) = 0 . \tag{7.7} \]


In conclusion, we have

f(x) = l(x) + h(x) = f(a) + f′(a)(x − a) + h(x) , (7.8)

and we can view this as a linear approximation of the function f near a, and h as an
error which goes to zero faster than the distance to a.
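The claim that the error goes to zero faster than the distance to a can be checked on an example. The sketch below takes f = sin and a = 1, and assumes the derivative sin′ = cos, which these notes use without proof; the helper name `linear_approx` is illustrative.

```python
import math

a = 1.0


def linear_approx(x):
    """Linear approximation of sin at a: sin'(a) (x - a) + sin(a), with sin' = cos."""
    return math.cos(a) * (x - a) + math.sin(a)


ratios = []
for d in [1e-1, 1e-2, 1e-3, 1e-4]:
    x = a + d
    error = math.sin(x) - linear_approx(x)  # the error h(x) of the approximation
    ratios.append(error / (x - a))          # should tend to 0 as x -> a
    print(d, error, error / (x - a))
```

The error itself shrinks roughly like the square of the distance, so the ratio error/(x − a) shrinks roughly like the distance.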
Remark 7.2. The statement that
\[ \lim_{x \to a} h(x) = 0 \tag{7.9} \]
is the statement that f(x) is continuous at a. The fact that (7.9) is implied by (7.7) is
another proof of Theorem 7.1.

7.1.2. Higher derivatives


We have seen in Note 4 that the function
\[ f(x) = \begin{cases} x \sin(1/x) & x \neq 0 \\ 0 & x = 0 \end{cases} \tag{7.10} \]
is continuous. However, once again this function is not differentiable at 0: For any h ≠ 0,
\[ \frac{f(h) - f(0)}{h} = \sin(1/h) \tag{7.11} \]
and this function does not have a limit as h → 0. A very similar function, which is
differentiable at 0, is
\[ f(x) = \begin{cases} x^2 \sin(1/x) & x \neq 0 \\ 0 & x = 0 . \end{cases} \tag{7.12} \]
However, we will see that for this function the second derivative fails to exist at 0.
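The difference between the two functions shows up in their difference quotients at 0. The sketch below evaluates both along the sequence h_k = 1/(π/2 + kπ), chosen here because sin(1/h_k) alternates between +1 and −1; the names `q1` and `q2` are illustrative.

```python
import math


def q1(h):
    """Difference quotient at 0 of x*sin(1/x): equals sin(1/h)."""
    return (h * math.sin(1.0 / h) - 0.0) / h


def q2(h):
    """Difference quotient at 0 of x^2*sin(1/x): equals h*sin(1/h)."""
    return (h * h * math.sin(1.0 / h) - 0.0) / h


# Along h_k = 1/(pi/2 + k*pi) the value sin(1/h_k) alternates between +1 and -1,
# so q1 has no limit, while |q2(h)| <= |h| forces q2 -> 0.
hs = [1.0 / (math.pi / 2 + k * math.pi) for k in range(100, 106)]
for h in hs:
    print(h, q1(h), q2(h))
```

The extra factor of x is what tames the oscillation: the second quotient is squeezed between −|h| and |h|.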
For any function f, we obtain by taking the derivative another function f′ (whose
domain may be smaller than the domain of f). Clearly, now starting with the function
f′, we obtain another function (f′)′ whose domain we take to be all points where f′ is
differentiable. This is the second derivative f″ of f. In general, we also write
\begin{align}
f^{(0)} &= f \tag{7.13} \\
f^{(1)} &= f' \tag{7.14} \\
f^{(2)} &= f'' \tag{7.15} \\
f^{(k+1)} &= (f^{(k)})' , \tag{7.16}
\end{align}
and we also call $f^{(k)}$, for k ≥ 2, the higher order derivatives of f. The idea is that
the more derivatives of a function exist, the more regular it is.
Example 7.5. Consider the function

\[ f(x) = \begin{cases} x^2 & x \ge 0 \\ -x^2 & x \le 0 . \end{cases} \tag{7.17} \]


We know that f′(a) = 2a for a > 0, and f′(a) = −2a for a < 0. Moreover,
\[ \frac{f(h) - f(0)}{h} = |h| \tag{7.18} \]
and so f is differentiable at 0, and f′(0) = 0. We can summarize this conveniently by
\[ f'(x) = 2|x| . \tag{7.19} \]
As we have seen, this function is continuous but not differentiable at 0, hence f″(0) does not
exist.

7.2. Differentiation
The aim is now to prove a few theorems that will allow us to differentiate a large number
of functions without invoking the definition, and investigating a limit, every time.
Theorem 7.2 (Sum and product rule). If f and g are differentiable at a, then f + g
and f · g are also differentiable at a, and
\begin{align}
(f + g)'(a) &= f'(a) + g'(a) \tag{7.20} \\
(f \cdot g)'(a) &= f'(a) g(a) + f(a) g'(a) \tag{7.21}
\end{align}
The second formula is also called the product rule.
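Proof of (7.21).
\begin{align}
(f g)'(a) &= \lim_{h \to 0} \frac{(f g)(a + h) - (f g)(a)}{h} \\
&= \lim_{h \to 0} \frac{(f(a + h) - f(a)) g(a + h) + f(a) (g(a + h) - g(a))}{h} \tag{7.22} \\
&= f'(a) g(a) + f(a) g'(a)
\end{align}
where in the last step we have used Theorem 3.1, together with the fact that g is continuous at a (Theorem 7.1), so that g(a + h) → g(a) as h → 0.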
Another proof of (7.21). We can also use the ideas from Section 7.1.1 to give another
proof of the product rule: If f and g are differentiable at a, then they have a linear
approximation near a,
\[ f(x) = f(a) + f'(a)(x - a) + h(x) , \qquad g(x) = g(a) + g'(a)(x - a) + k(x) , \tag{7.23} \]
where h(x) and k(x) are functions that tend to zero faster than x − a. Therefore
\[ f(x) g(x) = f(a) g(a) + \big[ f(a) g'(a) + f'(a) g(a) \big] (x - a) + E(x) \tag{7.24} \]
where E(x) is a function that satisfies
\[ \lim_{x \to a} \frac{E(x)}{x - a} = 0 . \tag{7.25} \]
Therefore the graph of the function f g is approximated by a linear function with slope
f(a)g′(a) + f′(a)g(a).


Example 7.6. If f(x) = xⁿ for some natural number n ∈ ℕ, then we can now prove by
induction that
\[ f'(a) = n a^{n-1} . \tag{7.26} \]
Exercise 7.4. The above allows us to compute easily the derivatives of polynomials:
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0 \tag{7.27} \]
Compute a few derivatives, p′(x), $p^{(2)}(x)$, and note in particular that
\[ p^{(n)}(x) = n! \, a_n , \qquad p^{(k)}(x) = 0 \quad (k > n) . \tag{7.28} \]
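Differentiating a polynomial term by term is mechanical enough to write as a few lines of code. The sketch below represents a polynomial by its list of coefficients [a_0, a_1, ..., a_n], which is a representation chosen here for illustration, and applies the power rule to each term; the sample polynomial is arbitrary.

```python
import math


def derivative(coeffs):
    """Differentiate a polynomial given by coefficients [a_0, a_1, ..., a_n]."""
    # The term a_k x^k becomes k a_k x^(k-1); the constant term drops out.
    return [k * a for k, a in enumerate(coeffs)][1:]


def nth_derivative(coeffs, order):
    for _ in range(order):
        coeffs = derivative(coeffs)
    return coeffs


# p(x) = 5 - x + 4x^3 + 2x^4, so n = 4 and a_n = 2.
p = [5, -1, 0, 4, 2]
print(derivative(p))         # [-1, 0, 12, 8], i.e. p'(x) = -1 + 12x^2 + 8x^3
print(nth_derivative(p, 4))  # [48], the constant n! * a_n = 24 * 2
print(nth_derivative(p, 4) == [math.factorial(4) * p[-1]])
print(nth_derivative(p, 5))  # [], the zero polynomial
```

Every differentiation lowers the degree by one, which is why the n-th derivative is constant and all later ones vanish.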

Note that the product rule can also be used to differentiate any product of functions.
For example to compute (f · g · h)′, we could either write f · g · h = (f g) · h, and apply
the product rule to the functions f g and h, or we could write f · g · h = f · (gh), and
apply the product rule to f and gh; both give the same result:
\[ (f g h)' = f' g h + f g' h + f g h' \tag{7.29} \]

Next we turn to the quotient rule.

Theorem 7.3 (Quotient rule). If f and g are differentiable at a and g(a) ≠ 0, then f/g
is differentiable at a, and
\[ \Big( \frac{f}{g} \Big)'(a) = \frac{f'(a) g(a) - f(a) g'(a)}{(g(a))^2} \tag{7.30} \]

Proof. Let us start with the special case
\[ \Big( \frac{1}{g} \Big)'(a) = - \frac{g'(a)}{(g(a))^2} . \tag{7.31} \]
We have
\begin{align}
\Big( \frac{1}{g} \Big)'(a) &= \lim_{h \to 0} \frac{1}{h} \, \frac{g(a) - g(a + h)}{g(a + h) g(a)} \\
&= - \lim_{h \to 0} \frac{g(a + h) - g(a)}{h} \, \lim_{h \to 0} \frac{1}{g(a + h) g(a)} = - \frac{g'(a)}{(g(a))^2} . \tag{7.32}
\end{align}
Note that for the difference quotient to make sense we needed g(a + h) ≠ 0, at least
for |h| sufficiently small. However, we know that g is differentiable, hence continuous at
a, and since g(a) ≠ 0 it follows that g(a + h) ≠ 0, as long as |h| < δ, for some δ > 0;
cf. Theorem 7.1 and Theorem 4.3.
Since
\[ \frac{f}{g} = f \cdot \frac{1}{g} , \tag{7.33} \]
the formula for the derivative of the quotient then follows from the product rule.


Example 7.7.
\[ f(x) = \frac{x^2 - 1}{x^2 + 1} , \qquad f'(x) = \frac{4x}{(x^2 + 1)^2} \tag{7.34} \]
Example 7.8. If f(x) = x⁻ⁿ for some natural number n ∈ ℕ, then
\[ f(x) = \frac{1}{x^n} , \qquad f'(x) = \frac{-n x^{n-1}}{x^{2n}} = (-n) x^{-n-1} . \tag{7.35} \]
To investigate our favourite functions from above, say
\[ f(x) = x^2 \sin(1/x) \qquad (x \neq 0) \tag{7.36} \]
we need to understand differentiation of compositions, and we need to know the derivatives
of the trigonometric functions, which we take for now without proof:
\[ \sin'(x) = \cos(x) , \qquad \cos'(x) = - \sin(x) . \tag{7.37} \]
Finally the chain rule, whose proof we defer to the additional notes.

Theorem 7.4 (Chain rule). If g is differentiable at a, and f is differentiable at g(a),
then f ◦ g is differentiable at a, and
\[ (f \circ g)'(a) = f'(g(a)) \cdot g'(a) . \tag{7.38} \]

Example 7.9. Consider again the function f(x) from (7.36), or more precisely the extension
\[ f(x) = \begin{cases} x^2 \sin(1/x) & x \neq 0 \\ 0 & x = 0 . \end{cases} \tag{7.39} \]
We have already seen, directly from the definition, that f′(0) = 0. With the chain rule,
we can also compute that for x ≠ 0,
\[ f'(x) = 2x \sin(1/x) + x^2 \cos(1/x) \cdot \Big( - \frac{1}{x^2} \Big) = 2x \sin(1/x) - \cos(1/x) \tag{7.40} \]
In particular, f′(x) is not continuous at 0.
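To see concretely why the derivative of this function is not continuous at 0, we can evaluate the formula for f′ along a sequence tending to 0. The sketch below uses the points x_k = 1/(kπ), an illustrative choice for which the sine term vanishes and the cosine term alternates in sign.

```python
import math


def f_prime(x):
    """Derivative of x^2 sin(1/x): the chain-rule formula for x != 0, and f'(0) = 0."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)


# At x_k = 1/(k*pi) we have sin(1/x_k) = 0 and cos(1/x_k) = (-1)^k,
# so f'(x_k) = -(-1)^k although x_k -> 0 and f'(0) = 0.
for k in [10, 11, 1000, 1001]:
    x = 1 / (k * math.pi)
    print(k, x, f_prime(x))
```

The values keep jumping between approximately −1 and +1 arbitrarily close to 0, so they cannot approach f′(0) = 0.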
We illustrate how the chain rule is applied in practice with some more examples.
Example 7.10.
\[ f(x) = \sin(x^2) \tag{7.41} \]
When we apply the chain rule we view this as a composition
\[ f = \sin \circ S \tag{7.42} \]
where S(x) = x² (say S for “taking the square”). Then it is clear that
\[ f'(x) = \cos(S(x)) \cdot S'(x) = \cos(x^2) \cdot 2x , \qquad S'(x) = 2x . \tag{7.43} \]


Example 7.11.
\[ f(x) = \sin^2(x^2) \tag{7.44} \]
We could view f as the composition
\[ f = S \circ (\sin \circ S) \tag{7.45} \]
and the chain rule gives
\begin{align}
f' &= S'(\sin \circ S) \cdot (\sin \circ S)' \tag{7.46} \\
f'(x) &= 2 (\sin \circ S)(x) \cdot (\cos \circ S)(x) \cdot 2x = 2 \sin(x^2) \cdot \cos(x^2) \cdot 2x \tag{7.47}
\end{align}

The above notation is useful to clarify the compositions that make up a function, but
in practice one does not usually introduce additional notation.
Example 7.12.
\[ f(x) = \sin(\sin(x^2)) \tag{7.48} \]
We compute directly:
\[ f'(x) = \cos(\sin(x^2)) \cdot \cos(x^2) \cdot 2x \tag{7.49} \]
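A chain-rule computation such as (7.49) is easy to sanity-check against a numerical difference quotient. The sketch below uses the symmetric quotient (f(a + h) − f(a − h))/(2h); the step size and the sample points are arbitrary choices, and agreement at a few points is evidence rather than proof.

```python
import math


def f(x):
    return math.sin(math.sin(x * x))


def f_prime(x):
    """Formula (7.49), obtained by applying the chain rule twice."""
    return math.cos(math.sin(x * x)) * math.cos(x * x) * 2 * x


def central_difference(func, a, h=1e-6):
    """Symmetric difference quotient (func(a + h) - func(a - h)) / (2h)."""
    return (func(a + h) - func(a - h)) / (2 * h)


for a in [0.3, 1.0, 2.5]:
    print(a, f_prime(a), central_difference(f, a))
```

The same check can be applied to one's answers for the exercises that follow, by swapping in the relevant f and the candidate formula for f′.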
Exercise 7.5. Compute the derivatives of
\begin{align}
f(x) &= \sin\big( (\sin(x))^2 \big) \tag{7.50} \\
f(x) &= \sin^2(x \sin x) \tag{7.51}
\end{align}

Problems

1. Prove directly using the definition, that if f(x) = √x, then f′(a) = 1/(2√a) for any
a > 0.

2. Prove that if g(x) = f(x) + c, then g′(x) = f′(x). Also show that if g(x) = cf(x)
then g′(x) = cf′(x).

3. Let f be a function such that |f (x)| ≤ x2 for all x. Prove that f is differentiable at
0.

4. Show that if f is differentiable at a, then

a)
\[ f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \tag{7.52} \]

b)
\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a - h)}{2h} \tag{7.53} \]


5. a) Prove that if f is even, then f′ is odd:

f′(x) = −f′(−x) . (7.54)

Draw a picture!
b) Prove that if f is odd, then f′ is even: f′(x) = f′(−x).

6. Find the derivative f′ for the following functions f.

a) f(x) = sin(cos(x))
b) f(x) = sin(x + sin(x))

7. Find f′ for each of the following functions f.

a) f(x) = sin³(x² + sin(x))
b) f(x) = sin²((x + sin(x))²)
c) f(x) = (x + sin⁵(x))⁶
d) f(x) = sin(x²) sin²(x)/(1 + sin(x)) (x ≠ −π/2 + 2kπ, k ∈ ℤ)

8. Find the derivative f′ for the following functions f.

a) f(x) = sin(x + x²)
b) f(x) = sin(cos(x)/x) (x ≠ 0)


Note 8.
Differentiation in two variables

8.1. Partial derivatives


The simplest notion of a derivative of a function f (x, y) of two variables is the partial
derivative, which is the derivative of the function with respect to one of the variables,
while keeping the other fixed.

Definition 8.1 (Partial derivatives). For a given function f(x, y), the limits of the
following difference quotients, if they exist, are called the partial derivatives ∂x f and
∂y f of f at (a, b):
\begin{align}
\frac{\partial f}{\partial x}(a, b) &= \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} \tag{8.1} \\
\frac{\partial f}{\partial y}(a, b) &= \lim_{h \to 0} \frac{f(a, b + h) - f(a, b)}{h} . \tag{8.2}
\end{align}

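Example 8.1. The partial derivatives of the function
\[ f(x, y) = \frac{e^{2x} \sin(y)}{1 + y^2} \tag{8.3} \]
are
\begin{align}
\partial_x f(x, y) &= \frac{2 e^{2x} \sin(y)}{1 + y^2} = 2 f(x, y) \tag{8.4} \\
\partial_y f(x, y) &= \frac{e^{2x} \cos(y)}{1 + y^2} - \frac{e^{2x} \, 2y \sin(y)}{(1 + y^2)^2} = \Big( \cos(y) - \frac{2y \sin(y)}{1 + y^2} \Big) \frac{e^{2x}}{1 + y^2} . \tag{8.5}
\end{align}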
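Partial derivatives computed by hand can be checked numerically, one variable at a time. In the sketch below `fx` implements (8.4) and `fy` implements the difference of the two quotients in (8.5); both are compared with symmetric difference quotients. The evaluation point, the step size and the helper name `partial` are arbitrary choices.

```python
import math


def f(x, y):
    return math.exp(2 * x) * math.sin(y) / (1 + y * y)


def fx(x, y):
    """Partial derivative in x: differentiate in x with y held fixed."""
    return 2 * f(x, y)


def fy(x, y):
    """Partial derivative in y, by the quotient rule in the variable y."""
    e = math.exp(2 * x)
    return e * math.cos(y) / (1 + y * y) - e * 2 * y * math.sin(y) / (1 + y * y) ** 2


def partial(func, x, y, var, h=1e-6):
    """Symmetric difference quotient in the variable 'x' or 'y'."""
    if var == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


x0, y0 = 0.4, 1.3
print(fx(x0, y0), partial(f, x0, y0, "x"))
print(fy(x0, y0), partial(f, x0, y0, "y"))
```

Each pair of printed numbers should agree to several digits; a sign slip in a hand computation shows up immediately in such a comparison.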

The partial derivatives of a function tell us how the values of a function change along
the coordinate axes. However, even if they exist at a point (a, b), they do not necessarily
give us information about the behaviour of the function near (a, b).
Example 8.2. Let us take another look at the example
\[ f(x, y) = \begin{cases} \dfrac{x y}{x^2 + y^2} & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0) \end{cases} \tag{8.6} \]


Here both partial derivatives exist at the origin:

\begin{align}
\partial_x f(0, 0) &= \lim_{h \to 0} h^{-1} \big( f(h, 0) - f(0, 0) \big) = 0 \tag{8.7} \\
\partial_y f(0, 0) &= \lim_{h \to 0} h^{-1} \big( f(0, h) - f(0, 0) \big) = 0 \tag{8.8}
\end{align}

However, this does not tell us anything about the behaviour of the function f(x, y) near
(0, 0). The reason is, as we shall see, that this function is not differentiable at (0, 0).

8.2. Differentiable functions


The idea to characterize differentiable functions of two variables is the same as for
functions of one variable: Functions which are differentiable at a point should be well
approximated by linear functions near that point. In other words, the graph of a function
which is differentiable at a point should be approximately a plane through that point.
More precisely, a function $f(\vec{x})$ of two variables $\vec{x} = (x_1, x_2)$ is differentiable at a point
$\vec{a} = (a_1, a_2)$, if there is a linear function $l(\vec{x})$ such that $l(\vec{a}) = f(\vec{a})$, and the difference
$h(\vec{x}) = f(\vec{x}) - l(\vec{x})$ tends to zero faster than $|\vec{x} - \vec{a}|$, as $\vec{x}$ approaches $\vec{a}$. Now, a linear
function of two variables takes the form
\[ l(\vec{x}) = b + \vec{c} \cdot \vec{x} , \tag{8.9} \]
for some $b \in \mathbb{R}$, $\vec{c} = (c_1, c_2) \in \mathbb{R}^2$, and the condition $l(\vec{a}) = f(\vec{a})$ implies that $b = f(\vec{a}) - \vec{c} \cdot \vec{a}$,
so
\[ l(\vec{x}) = f(\vec{a}) + \vec{c} \cdot (\vec{x} - \vec{a}) . \tag{8.10} \]
Definition 8.2 (Differentiability). A function $f(x_1, x_2)$ is differentiable at a point
$\vec{a} = (a_1, a_2)$, if there is a vector $\vec{c} \in \mathbb{R}^2$ such that
\[ f(\vec{a} + \vec{h}) = f(\vec{a}) + \vec{c} \cdot \vec{h} + E(\vec{h}) \tag{8.11} \]
where the function $E(\vec{h})$ satisfies
\[ \lim_{\vec{h} \to 0} \frac{E(\vec{h})}{|\vec{h}|} = 0 . \tag{8.12} \]
As for functions of one variable, all differentiable functions are continuous.
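Theorem 8.1. If f is differentiable at $\vec{a}$, then f is continuous at $\vec{a}$.
The proof follows immediately from the linear approximation of a differentiable function.
Indeed if f is differentiable at $\vec{a}$, then
\[ f(\vec{a} + \vec{h}) = f(\vec{a}) + \nabla f(\vec{a}) \cdot \vec{h} + E(\vec{h}) , \tag{8.13} \]
where $\nabla f(\vec{a})$ denotes the vector $\vec{c}$ of Definition 8.2 (see Section 8.3) and $E(\vec{h})/|\vec{h}| \to 0$, hence in particular
\[ \lim_{\vec{h} \to 0} f(\vec{a} + \vec{h}) = f(\vec{a}) . \tag{8.14} \]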


8.3. Gradient of a function


We have seen in Definition 8.2 that if a function $f(\vec{x})$ is differentiable at a point $\vec{a}$, then
there exists a vector $\vec{c}$, so that
\[ f(\vec{x}) = f(\vec{a}) + \vec{c} \cdot (\vec{x} - \vec{a}) + E(\vec{x}) \tag{8.15} \]
where $E(\vec{x})/|\vec{x} - \vec{a}| \to 0$ as $\vec{x} \to \vec{a}$. Let us now determine the components of this vector
$\vec{c} = (c_1, c_2)$. In (8.15) simply choose $\vec{x} = (a_1 + h, a_2)$, then
\[ f(a_1 + h, a_2) - f(a_1, a_2) = \vec{c} \cdot (h, 0) + E(a_1 + h, a_2) = c_1 h + E(a_1 + h, a_2) \tag{8.16} \]
Hence, dividing by h ≠ 0, and taking the limit h → 0, shows that
\[ \frac{\partial f}{\partial x}(\vec{a}) = c_1 . \tag{8.17} \]
Similarly for the other component of $\vec{c}$, and we conclude
\[ \nabla f(\vec{a}) = \begin{pmatrix} \dfrac{\partial f}{\partial x}(\vec{a}) \\[1ex] \dfrac{\partial f}{\partial y}(\vec{a}) \end{pmatrix} . \tag{8.18} \]
We call this vector the gradient of f at $\vec{a}$.


In particular, we have proven that for a differentiable function the partial derivatives
always exist.

Theorem 8.2. If f(x, y) is differentiable at $\vec{a} = (a, b)$, then the partial derivatives ∂x f and
∂y f of f exist at $\vec{a}$ and are the components of the gradient vector $\nabla f(\vec{a})$.

Conversely, however, we have seen an example of a function whose partial derivatives


exist at a point, but which is nonetheless not differentiable at that point:
Example 8.3. The function
\[ f(x, y) = \begin{cases} \dfrac{x y}{x^2 + y^2} & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0) \end{cases} \tag{8.19} \]
is not continuous at the origin, hence in particular not differentiable at (0, 0), by Theorem 8.1.
Nonetheless the partial derivatives exist:
\[ \partial_x f(0, 0) = \partial_y f(0, 0) = 0 . \tag{8.20} \]
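The gap between "partial derivatives exist" and "differentiable" in Example 8.3 can be made visible with a few evaluations. The sketch below computes the difference quotients along the two coordinate axes and, for comparison, the value of f on the diagonal y = x; the step sizes are arbitrary.

```python
def f(x, y):
    """The function of Example 8.3."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)


for h in [1e-1, 1e-3, 1e-6]:
    dq_x = (f(h, 0.0) - f(0.0, 0.0)) / h  # difference quotient for the x-partial
    dq_y = (f(0.0, h) - f(0.0, 0.0)) / h  # difference quotient for the y-partial
    diagonal = f(h, h)                     # value on the line y = x
    print(h, dq_x, dq_y, diagonal)
```

Both quotients are identically 0 because f vanishes on the axes, yet on the diagonal f stays at 1/2 arbitrarily close to the origin, so f is not even continuous there.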

In summary, for a function f (x, y) to be differentiable at (a, b) it is necessary for the


partial derivatives ∂x f (a, b), ∂y f (a, b) to exist, but not sufficient.
A sufficient criterion for differentiability is the following, whose proof we will defer:


Theorem 8.3. Suppose the partial derivatives ∂x f and ∂y f of a function f(x, y) exist
at every point and, viewed as functions of (x, y), are continuous at
(a, b). Then f(x, y) is differentiable at (a, b).

Example 8.4. For the example above we compute away from the origin,

\[ \partial_x f(x, y) = \frac{y^3 - x^2 y}{(x^2 + y^2)^2} , \qquad \partial_y f(x, y) = \frac{x^3 - x y^2}{(x^2 + y^2)^2} \qquad (x, y) \neq (0, 0) , \tag{8.21} \]

and these functions are continuous everywhere, except at the origin.

8.4. Directional derivatives


Let $\vec{a} = (a_1, a_2)$ be a point, and $\vec{u} = (u_1, u_2)$ a direction, namely a vector of unit length,
$|\vec{u}| = 1$. Then for each number t,
\[ \vec{x}(t) = \vec{a} + t \vec{u} \tag{8.22} \]
is a point on the straight line through $\vec{a}$, in the direction $\vec{u}$.

Definition 8.3 (Directional derivative). The directional derivative of a function f at
$\vec{a}$ in the direction $\vec{u}$ is defined as the derivative of the function $f(\vec{a} + t\vec{u})$ at t = 0. If it
exists, it is denoted by
\[ \frac{\partial f}{\partial \vec{u}}(\vec{a}) = \lim_{t \to 0} \frac{f(\vec{a} + t \vec{u}) - f(\vec{a})}{t} . \tag{8.23} \]
Remark 8.1. If we think of $\vec{x}(t) = \vec{a} + t\vec{u}$ as a particle moving on a straight line, and the
function f as an observable, say the pressure in the atmosphere, then $\partial_{\vec{u}} f(\vec{a})$ is the rate
of change of the pressure in the direction that the particle is moving.
Remark 8.2. Note that if we choose
\[ \vec{u} = (1, 0) \tag{8.24} \]
then
\[ \frac{\partial f}{\partial \vec{u}} = \frac{\partial f}{\partial x} \tag{8.25} \]
is the partial derivative of f in x. Similarly ∂y f is the directional derivative in the
direction $\vec{u} = (0, 1)$.

Theorem 8.4 (Formula for the directional derivative). Suppose f is differentiable at $\vec{a}$.
Then the directional derivative at $\vec{a}$ in any direction $\vec{u}$ exists, and is given by
\[ \frac{\partial f}{\partial \vec{u}}(\vec{a}) = \nabla f(\vec{a}) \cdot \vec{u} . \tag{8.26} \]


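Proof. Since f is differentiable at $\vec{a}$, we know that (8.13) holds for any $\vec{h}$, in particular
for $\vec{h} = t\vec{u}$:
\[ f(\vec{a} + t\vec{u}) = f(\vec{a}) + \nabla f(\vec{a}) \cdot t\vec{u} + E(t\vec{u}) . \tag{8.27} \]
Hence
\[ \frac{f(\vec{a} + t\vec{u}) - f(\vec{a})}{t} = \nabla f(\vec{a}) \cdot \vec{u} + \frac{E(t\vec{u})}{t} , \tag{8.28} \]
and the formula follows by taking the limit t → 0, since $|E(t\vec{u})/t| = |E(t\vec{u})|/|t\vec{u}| \to 0$ for $|\vec{u}| = 1$.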

The above formula also provides a geometric interpretation of the gradient of a function:
Since for any vectors $\vec{a}$ and $\vec{b}$,
\[ |\vec{a} \cdot \vec{b}| \le |\vec{a}| |\vec{b}| \tag{8.29} \]
with equality when $\vec{a}$ and $\vec{b}$ are colinear, we have in particular with $|\vec{u}| = 1$ that
\[ \Big| \frac{\partial f}{\partial \vec{u}}(\vec{a}) \Big| \le \big| \nabla f(\vec{a}) \big| \tag{8.30} \]
with equality when $\nabla f(\vec{a})$ and $\vec{u}$ are colinear. This means that $\nabla f(\vec{a})$ points in the
direction of the steepest increase of f at $\vec{a}$, and its magnitude is the rate of increase of
f in that direction.
Exercise 8.1. Let f(x, y) = x² + 5xy², and $\vec{a} = (-2, 1)$.

1. Find the directional derivative of f at $\vec{a}$ in the direction of the vector $\vec{v} = (3, 4)$.

2. What is the largest directional derivative of f at $\vec{a}$, and in what direction does it
occur?

Solution 8.2. We have ∇f(x, y) = (2x + 5y², 10xy), so that ∇f(−2, 1) = (1, −20).
Note that $\vec{v}$ is not normalised, so let us first determine $\vec{u}$ colinear to $\vec{v}$ of unit length:
\[ \vec{u} = \frac{\vec{v}}{|\vec{v}|} = \frac{1}{5} (3, 4) . \tag{8.31} \]
The directional derivative in that direction is
\[ \partial_{\vec{u}} f(-2, 1) = \nabla f(-2, 1) \cdot \vec{u} = \frac{3 - 80}{5} = - \frac{77}{5} . \tag{8.32} \]
The largest directional derivative is
\[ |\nabla f(\vec{a})| = \sqrt{401} , \]
and it occurs in the direction $(1, -20)/\sqrt{401}$.
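The computation in this solution can be reproduced in a few lines. The sketch below hard-codes the gradient of f(x, y) = x² + 5xy² found above and normalises the direction vector before taking the dot product; the helper names `grad_f` and `directional_derivative` are illustrative.

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + 5 x y^2."""
    return (2 * x + 5 * y * y, 10 * x * y)


def directional_derivative(grad, v):
    """Dot product of the gradient with the unit vector in the direction of v."""
    norm = math.hypot(v[0], v[1])
    u = (v[0] / norm, v[1] / norm)
    return grad[0] * u[0] + grad[1] * u[1]


g = grad_f(-2.0, 1.0)
print(g)                                      # (1.0, -20.0)
print(directional_derivative(g, (3.0, 4.0)))  # -77/5 = -15.4, up to rounding
print(math.hypot(g[0], g[1]))                 # sqrt(401), about 20.025
```

Normalising inside the helper means the direction can be passed as any nonzero vector, which avoids the common mistake of forgetting to divide by the length of the direction vector.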


Problems
1. For each of the following functions f(x, y) find the linear function l(x, y) whose
graph is the tangent plane to the graph of f at the point (1, −2, f(1, −2)).
a) f(x, y) = x²y + sin(πxy)
b) f(x, y) = xy/(x² + y²)

2. Compute the gradient $\nabla f(\vec{x})$ of the following functions.

a) $f(\vec{x}) = 1/|\vec{x}|$ ($\vec{x} \neq 0$)
b) $f(\vec{x}) = (\vec{a} \cdot \vec{x})^2$

3. For each of the following functions f(x, y) compute the directional derivative of f
at the point (−1, 2) in the direction (3/5, 4/5).
a) f(x, y) = x²y + sin(πxy)
b) f(x, y) = xy/(x² + y²)

4. Suppose $f(\vec{x})$ and $g(\vec{x})$ are differentiable at $\vec{a}$. Does that imply that f + g and f g
are differentiable at $\vec{a}$? Find a formula for the gradient of f g and f + g at $\vec{a}$.

Additional: Chain rule
Further Reading

(Spivak, Calculus, Chapter 9, 10) (Folland, Advanced Calculus, Chapter 2.3)

Real Analysis: Advanced (MAST20033)
Vector Calculus: Advanced (MAST20032)

Functions of one variable


Let us first give a direct proof of the chain rule.

Proof of Theorem 7.4. The idea is to write the difference quotient as

\[ \frac{f(g(a+h)) - f(g(a))}{h} = \frac{f(g(a+h)) - f(g(a))}{g(a+h) - g(a)}\cdot\frac{g(a+h) - g(a)}{h} \tag{8.1} \]
which, however, we can only do when g(a + h) − g(a) ≠ 0. This can fail even for simple
choices of g (say g is a constant function), so define instead:

\[ \phi(h) = \begin{cases} \dfrac{f(g(a+h)) - f(g(a))}{g(a+h) - g(a)} , & \text{when } g(a+h) - g(a) \neq 0 , \\ f'(g(a)) , & \text{when } g(a+h) - g(a) = 0 . \end{cases} \tag{8.2} \]

We want to show that

\[ \lim_{h\to 0}\phi(h) = f'(g(a)) . \tag{$*$} \]
With this statement we can complete the proof, because
\[ \frac{f(g(a+h)) - f(g(a))}{h} = \phi(h)\cdot\frac{g(a+h) - g(a)}{h} \tag{8.3} \]
even if g(a + h) − g(a) = 0 (in which case both sides are zero), and taking the limit then
implies the formula.
It remains to show (∗), namely that for every ε > 0 there exists a δ > 0 such that, if
0 < |h| < δ, then
\[ |\phi(h) - f'(g(a))| < \varepsilon . \tag{8.4} \]
So let ε > 0. Since f is differentiable at g(a), we can find an η > 0 such that, if
0 < |k| < η, then
\[ \left|\frac{f(g(a)+k) - f(g(a))}{k} - f'(g(a))\right| < \varepsilon . \tag{8.5} \]


Moreover since g is differentiable at a, hence continuous at a, we can find δ > 0, so that,
if |h| < δ, then
\[ |g(a+h) - g(a)| < \eta . \tag{8.6} \]
Hence whenever |h| < δ, and k = g(a + h) − g(a) ≠ 0, we have

\[ |\phi(h) - f'(g(a))| = \left|\frac{f(g(a+h)) - f(g(a))}{g(a+h) - g(a)} - f'(g(a))\right| = \left|\frac{f(g(a)+k) - f(g(a))}{k} - f'(g(a))\right| < \varepsilon \tag{8.7} \]
and the same statement, |φ(h) − f′(g(a))| < ε, is obviously also true for all h with
k = g(a + h) − g(a) = 0, in view of the definition of φ(h). This proves (∗).

Functions of several variables


The chain rule is also related to the directional derivative: In (8.22) we have “parametrized”
the points on a straight line in R2 .
More generally, the equation

\[ \vec x(t) = \vec g(t) = (g_1(t), \dots, g_n(t)) , \tag{8.8} \]

where g_i, i = 1, 2, …, n, are functions of t, represents a parametrized curve in Rⁿ.


Remark 8.1. For n = 3, we can view ~x = ~g (t) as the trajectory of a particle, which at
time t has the position ~g (t). Then

\[ \vec g'(t) = (g_1'(t), g_2'(t), g_3'(t)) \tag{8.9} \]

is the velocity of the particle at time t.

Theorem 8.1 (Chain rule). Suppose that ~g(t) = (g_1(t), …, g_n(t)), and that the functions
g_i, i = 1, 2, …, n, are differentiable at t = a. Suppose moreover that f(x_1, …, x_n) is differentiable
at ~b = ~g(a). Then the function (f ◦ ~g)(t) is differentiable at t = a, and its derivative is
given by
\[ (f\circ\vec g)'(a) = \nabla f(\vec b)\cdot\vec g'(a) . \tag{8.10} \]

Proof. Since f is differentiable at ~g(a), we have

\[ f(\vec g(a+u)) = f(\vec g(a+u) - \vec g(a) + \vec g(a)) = f(\vec g(a)) + \nabla f(\vec g(a))\cdot\vec h + E_1(\vec h) , \tag{8.11} \]

where
\[ \vec h = \vec g(a+u) - \vec g(a) = \vec g'(a)u + \vec E_2(u) , \qquad \vec g'(a) = (g_1'(a), \dots, g_n'(a)) , \tag{8.12} \]

because each g_i(t), i = 1, …, n, is differentiable at t = a. Hence

\[ f(\vec g(a+u)) = f(\vec g(a)) + \nabla f(\vec g(a))\cdot\vec g'(a)u + \nabla f(\vec g(a))\cdot\vec E_2(u) + E_1(\vec h) , \tag{8.13} \]

and the chain rule follows, because the last two terms tend to zero faster than u as u
tends to zero.
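
As an illustration of formula (8.10), here is a small numerical sketch in Python (not part of the original notes). The curve ~g(t) = (cos t, t²) and the function f(x, y) = xy + y² are arbitrary choices made only for this example, and all helper names are hypothetical. It compares ∇f(~g(a)) · ~g′(a) with a symmetric difference quotient of f ◦ ~g.

```python
import math

def g(t):
    # Example curve in R^2 (chosen only for illustration)
    return (math.cos(t), t**2)

def g_prime(t):
    # Componentwise derivative of the curve
    return (-math.sin(t), 2 * t)

def f(x, y):
    # Example differentiable function of two variables
    return x * y + y**2

def grad_f(x, y):
    # Gradient computed by hand: (y, x + 2y)
    return (y, x + 2 * y)

def chain_rule_value(a):
    # Right-hand side of (8.10): grad f at b = g(a), dotted with g'(a)
    b = g(a)
    gf = grad_f(*b)
    gp = g_prime(a)
    return gf[0] * gp[0] + gf[1] * gp[1]

def difference_quotient(a, u=1e-6):
    # Symmetric difference quotient of the composition t -> f(g(t))
    return (f(*g(a + u)) - f(*g(a - u))) / (2 * u)

a = 0.7
lhs = difference_quotient(a)
rhs = chain_rule_value(a)
print(lhs, rhs)
```

The two printed values should agree up to the accuracy of the difference quotient.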

Exercise 8.1. Use this approach to give another proof of the chain rule for functions of
one variable.

Module IV.

Mean value theorem

Note 9.
L’Hôpital’s rule
The aim of this lecture is to prove:

Theorem 9.1 (L’Hôpital’s rule). Suppose that

\[ \lim_{x\to a} f(x) = 0 , \qquad \lim_{x\to a} g(x) = 0 , \tag{9.1} \]

and suppose that lim_{x→a} f′(x)/g′(x) exists. Then lim_{x→a} f(x)/g(x) exists, and

\[ \lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(x)}{g'(x)} . \tag{9.2} \]

There are many variations of L’Hôpital’s rule; see for example Problem 6 below.
The proof in turn relies on the mean value theorem, which says that given a
continuous function f on [a, b], which is differentiable on (a, b), there is some x ∈ (a, b)
such that
\[ f'(x) = \frac{f(b) - f(a)}{b - a} . \tag{9.3} \]
Geometrically, this means that there is some tangent line to the graph of f, which is
parallel to the line between (a, f(a)) and (b, f(b)); see Figure 9.1.
Before we discuss these concepts, here is an example of how L’Hôpital’s rule is applied:
Example 9.1. The theorem allows us to determine the limit of the function

\[ f(x) = \frac{\sin(x)}{x} \qquad (x \neq 0) \tag{9.4} \]

near a = 0. Since sin′(x) = cos(x), and (x)′ = 1, and moreover

\[ \lim_{x\to 0}\frac{\cos(x)}{1} = 1 , \tag{9.5} \]

the assumptions of L’Hôpital’s rule are satisfied, and we conclude that

\[ \lim_{x\to 0}\frac{\sin(x)}{x} = \lim_{x\to 0}\frac{\cos(x)}{1} = 1 . \tag{9.6} \]
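
The limit in this example can also be observed numerically. The following short Python sketch (an illustration, not part of the original notes) tabulates sin(x)/x and cos(x)/1 for a few small values of x; the sample points are an arbitrary choice.

```python
import math

def ratio(x):
    # The quotient f(x)/g(x) = sin(x)/x, defined for x != 0
    return math.sin(x) / x

def derivative_ratio(x):
    # The quotient of derivatives f'(x)/g'(x) = cos(x)/1
    return math.cos(x) / 1.0

# Sample points approaching 0 from the right
samples = [10.0 ** (-k) for k in range(1, 7)]
table = [(x, ratio(x), derivative_ratio(x)) for x in samples]

for x, r, d in table:
    print(f"x = {x:.0e}   sin(x)/x = {r:.12f}   cos(x)/1 = {d:.12f}")
```

Both columns approach 1 as x decreases, which is consistent with (9.6); a table of values is of course no substitute for the proof.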


Figure 9.1.: Geometric interpretation of the mean value theorem.

9.1. Mean Value Theorem


The theorems about derivatives we have seen in the previous lectures give us some
information about the derivative in terms of the function itself. It is more difficult to
infer knowledge of a function in terms of its derivative. For example, it is easy to see that
f′(x) = 0 for the constant function f(x) = c, but how do you prove that if f′(x) = 0
for all x, then f must be constant?
The mean value theorem allows us to draw this conclusion. It is probably the most
important theorem about derivatives and has many consequences. We first discuss a
special case:

Theorem 9.2 (Rolle’s Theorem). If f is continuous on [a, b] and differentiable on (a, b),
and f(a) = f(b), then there is a number x in (a, b) such that f′(x) = 0.

Proof. Since f is continuous on [a, b] it has a maximum point and a minimum point in
[a, b].
If the maximum, or the minimum, occurs at some x ∈ (a, b), then f′(x) = 0 at this point. More
precisely, we are using here that if x ∈ (a, b) is a local maximum (or minimum) point, then
necessarily f′(x) = 0. (See Additional for a proof of this statement.)
If both the maximum and minimum points lie on the boundary, then since f(a) = f(b)
the maximum and minimum values must be equal, and the function f is constant, hence
f′(x) = 0 for any x ∈ (a, b).


Figure 9.2.: Geometric interpretation of Rolle’s theorem

Exercise 9.1. Draw the graph of a function that is continuous on [a, b] but not differentiable
on (a, b), and for which the conclusion of Rolle’s theorem is false.

Theorem 9.3 (Mean Value Theorem). If f is continuous on [a, b] and differentiable on
(a, b), then there is a number x in (a, b) such that

\[ f'(x) = \frac{f(b) - f(a)}{b - a} . \tag{9.7} \]
Proof. Recall the geometric interpretation of this statement: At the point x the slope of
the tangent equals that of the line from (a, f(a)) to (b, f(b)).
Now the line from (a, f(a)) to (b, f(b)) is the graph of the linear function

\[ l(x) = \frac{f(b) - f(a)}{b - a}(x - a) + f(a) . \tag{9.8} \]
In particular l(a) = f(a), and l(b) = f(b), and we set

\[ h(x) = f(x) - l(x) , \tag{9.9} \]

which is the height of the graph of f over the line from (a, f(a)) to (b, f(b)). We have

\[ h(a) = 0 , \qquad h(b) = 0 . \tag{9.10} \]

Thus by Rolle’s theorem there is an x in (a, b) such that

\[ h'(x) = f'(x) - l'(x) = f'(x) - \frac{f(b) - f(a)}{b - a} = 0 . \tag{9.11} \]
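
To make the statement concrete, the following Python sketch (not part of the original notes) locates a point x as in (9.7) for the sample function f(x) = x³ on [0, 2], by applying bisection to h′(x) = f′(x) − (f(b) − f(a))/(b − a). The choice of f, the interval, and the function name `mean_value_point` are assumptions made for illustration; bisection needs a sign change of h′ at the endpoints, which holds in this example but not in general.

```python
import math

def f(x):
    # Sample function for the illustration
    return x**3

def f_prime(x):
    # Its derivative, computed by hand
    return 3 * x**2

def mean_value_point(a, b, tol=1e-12):
    # Slope of the line from (a, f(a)) to (b, f(b))
    slope = (f(b) - f(a)) / (b - a)

    def h_prime(x):
        # Derivative of h(x) = f(x) - l(x) from the proof
        return f_prime(x) - slope

    lo, hi = a, b
    if h_prime(lo) * h_prime(hi) > 0:
        # Bisection is only a valid search strategy if h' changes sign
        raise ValueError("h' does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h_prime(lo) * h_prime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

c = mean_value_point(0.0, 2.0)
print(c, f_prime(c), (f(2.0) - f(0.0)) / 2.0)
```

For this example the slope of the secant line is 4, so the point found solves 3x² = 4.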

We can now prove the obvious


Corollary 9.4. If f is defined on an interval and f′(x) = 0 for all x in the interval,
then f is constant on the interval.

Proof. Take any points a < b in that interval. By the mean value theorem there is
an x ∈ (a, b) with f′(x)(b − a) = f(b) − f(a), but f′(x) = 0, so f(a) = f(b).

Exercise 9.2. If f and g are defined on the same interval and f′(x) = g′(x) for all x, show
that there is a constant c such that f(x) = g(x) + c.

9.2. L’Hôpital’s rule


We now derive several consequences of the mean value theorem.

Theorem 9.5 (Cauchy mean value theorem). Let f and g be continuous on [a, b] and
differentiable on (a, b). Then there is a number x ∈ (a, b) such that

\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(x)}{g'(x)} \tag{9.12} \]

provided g(b) ≠ g(a) and g′(x) ≠ 0.

Remark 9.1. Note that the special case g(x) = x is the mean value theorem, but the
Cauchy mean value theorem is not a direct consequence of the mean value theorem,
because while f(b) − f(a) = f′(x)(b − a) for some x, and g(b) − g(a) = g′(y)(b − a) for
some y, x and y are not necessarily the same.

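Proof. Let
\[ h(x) = \bigl(f(b) - f(a)\bigr)\,g(x) - f(x)\,\bigl(g(b) - g(a)\bigr) , \tag{9.13} \]

then h is continuous on [a, b] and differentiable on (a, b), and

\[ h(a) = f(b)g(a) - f(a)g(b) = h(b) , \tag{9.14} \]

so by Rolle’s theorem, there is a number x in (a, b) such that h′(x) = 0, that is,
(f(b) − f(a)) g′(x) = f′(x) (g(b) − g(a)). Dividing by g′(x)(g(b) − g(a)) gives (9.12).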

This theorem is the main statement we need to evaluate limits of the form
\[ \lim_{x\to a}\frac{f(x)}{g(x)} \tag{9.15} \]

when lim_{x→a} f(x) = 0 and lim_{x→a} g(x) = 0.

Proof of Theorem 9.1. Recall that we assume that f and g approach the limit 0 near a,
so let us define (possibly redefine)

\[ f(a) = 0 , \qquad g(a) = 0 , \tag{9.16} \]

then f and g are continuous at a. Then by the Cauchy mean value theorem applied to f
and g on the interval [a, x], we get that there exists α_x with a < α_x < x, such that

\[ \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(\alpha_x)}{g'(\alpha_x)} . \tag{9.17} \]

Note that the assumptions of Theorem 9.5 are indeed satisfied: Since f′(x)/g′(x)
approaches a limit, in particular g′(x) ≠ 0 near a. This also shows that g(x) ≠ 0
near a, because if g(x) = 0 for some x > a, then by the mean value theorem there would
exist a y ∈ (a, x) with g′(y) = 0, contradicting that g′(x) ≠ 0 near a.
Furthermore, since f(a) = g(a) = 0, it follows that
\[ \lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(\alpha_x)}{g'(\alpha_x)} , \tag{9.18} \]

because α_x tends to a as x tends to a. More precisely, we know that the limit

\[ l = \lim_{x\to a}\frac{f'(x)}{g'(x)} \tag{9.19} \]

exists. This means that for any ε > 0, we can find δ > 0, so that, if 0 < |y − a| < δ, then

\[ \left|\frac{f'(y)}{g'(y)} - l\right| < \varepsilon . \tag{9.20} \]

Therefore, whenever 0 < |x − a| < δ,

\[ \left|\frac{f(x)}{g(x)} - l\right| = \left|\frac{f'(\alpha_x)}{g'(\alpha_x)} - l\right| < \varepsilon \tag{9.21} \]

because α_x lies between a and x, and so in particular |α_x − a| < |x − a| < δ.

Problems
1. A function f is increasing on an interval if f(a) < f(b) whenever a and b are two
numbers in the interval with a < b. Similarly for a decreasing function.
Show that if f′(x) > 0 for all x in an interval, then f is increasing on the interval.

2. a) Prove that if f′(x) ≥ M for all x in [a, b] then f(b) ≥ f(a) + M(b − a)
b) Prove that if f′(x) ≤ M for all x in [a, b] then f(b) ≤ f(a) + M(b − a)
c) Formulate a similar theorem when |f′(x)| ≤ M for all x in [a, b]

3. a) Suppose that f′(x) > g′(x) for all x and that f(a) = g(a). Show that f(x) >
g(x) for x > a and f(x) < g(x) for x < a


b) Show by an example that these conclusions do not follow without the hypothesis
f(a) = g(a).

4. Find all functions f such that
a) f′(x) = sin(x)
b) f″(x) = x³.

5. What is wrong with the following use of L’Hôpital’s rule:

\[ \lim_{x\to 1}\frac{x^3 + x - 2}{x^2 - 3x + 2} = \lim_{x\to 1}\frac{3x^2 + 1}{2x - 3} = \lim_{x\to 1}\frac{6x}{2} = 3 \tag{9.22} \]

Find the correct limit.

6. Prove the following variations of L’Hôpital’s rule (with much the same reasoning
as in the proof of Theorem 9.1).
a) If lim_{x→a} f(x) = lim_{x→a} g(x) = 0, and lim_{x→a} f′(x)/g′(x) = ∞, then
lim_{x→a} f(x)/g(x) = ∞.

b) If lim_{x→∞} f(x) = lim_{x→∞} g(x) = 0, and lim_{x→∞} f′(x)/g′(x) = l, then
lim_{x→∞} f(x)/g(x) = l.
Hint: Consider lim_{x→0+} f(1/x)/g(1/x).

Note 10.
Inverse functions
The graph of f⁻¹ is the graph of f reflected across the diagonal line consisting of all
points (x, x). For f⁻¹ to be a function it is necessary that, geometrically, no horizontal
line intersects the graph of f twice, and this property has a name:

Definition 10.1. A function f is one-to-one if f(a) ≠ f(b) whenever a ≠ b.

Example 10.1. Say f(x) = x³. Then f⁻¹ is the function that assigns to y = x³ the
unique number x, that is

\[ f^{-1}(y) = \sqrt[3]{y} . \tag{10.1} \]
More generally, the fact that f⁻¹(x) is the number y such that f(y) = x can be restated
as: f(f⁻¹(x)) = x for every point x in the domain of f⁻¹, or alternatively f⁻¹(f(x)) = x
for every point x in the domain of f.
We know that all increasing, and all decreasing, functions are one-to-one.
Exercise 10.1. Show that if f is increasing, then f⁻¹ is also increasing.
Exercise 10.2. Show that a function f is increasing if and only if −f is decreasing.
However, it is not true that every one-to-one function is either increasing or decreasing.
Example 10.2. The function

\[ f(x) = \begin{cases} x^2 , & 0 < x < 1 , \\ \dfrac{1}{x-1} + 1 , & x > 1 , \end{cases} \tag{10.2} \]

is a continuous one-to-one function, which is neither increasing, nor decreasing. However,
it is of course increasing on the interval (0, 1), and decreasing on the interval (1, ∞).
This is true more generally: A continuous function f which is one-to-one on an interval
is either increasing or decreasing on that interval (we will not prove this here; see the
additional notes to Module IV).
The purpose of this note is to record some very general properties of the inverse
of one-to-one functions. For example, if f is continuous, does that mean that f⁻¹ is
continuous? Moreover, if f is differentiable, does that mean that f⁻¹ is differentiable,
and if so what is the derivative?
A glance at the graph of a one-to-one function suggests the answer, see Figure 10.1:
Say L is the tangent line to the graph of f at the point (a, f(a)). Then the tangent line L′
to the graph of f⁻¹ at the point (f(a), a) is obtained by reflecting L across the diagonal,


Figure 10.1.: Tangent to the inverse function.

and so the slope of L′ is the reciprocal of the slope of L. In other words, this suggests
that:
\[ (f^{-1})'(f(a)) = \frac{1}{f'(a)} , \tag{10.3} \]
or alternatively,
\[ (f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))} . \tag{10.4} \]
There is another reason why this formula should be true: We know that
\[ f\bigl(f^{-1}(y)\bigr) = y \tag{10.5} \]

for all y in the domain of f⁻¹, so by the chain rule:

\[ f'\bigl(f^{-1}(y)\bigr)\cdot(f^{-1})'(y) = 1 . \tag{10.6} \]

This argument is not a proof because it presupposes that we know that f⁻¹ is differentiable,
but it does tell us that if f and f⁻¹ are differentiable, then (f⁻¹)′ must be given by this
formula.
This argument also tells us:
Corollary 10.1. If f is a continuous one-to-one function defined on an interval and
f′(f⁻¹(a)) = 0, then f⁻¹ is not differentiable at a.


Proof. Indeed, if f⁻¹ were differentiable at a, then (10.6) would imply 0 = 1.

Example 10.3. The function f(x) = x³ is continuous and one-to-one, and satisfies
f′(0) = 0, and indeed f⁻¹ is not differentiable at 0 = f⁻¹(0). (Draw a picture!)
Now finally, the positive results:
Theorem 10.2 (Continuity of the inverse). If f is continuous and one-to-one on an
interval, then f⁻¹ is also continuous.
This is surprisingly cumbersome to show and we will not go into the proof here (see
the additional notes to Module IV).
Theorem 10.3 (Differentiability of the inverse). Let f be a continuous one-to-one
function defined on an interval, and suppose that f is differentiable at f⁻¹(b), with
f′(f⁻¹(b)) ≠ 0. Then f⁻¹ is differentiable at b, and
\[ (f^{-1})'(b) = \frac{1}{f'(f^{-1}(b))} . \tag{10.7} \]
Proof. In order to prove the theorem, we need to look at the difference quotient
\[ \frac{f^{-1}(b+h) - f^{-1}(b)}{h} \tag{10.8} \]
where b = f(a). For given h, let us choose k, depending on h, so that b + h = f(a + k).
Then we can also write this as
\[ \frac{a + k - a}{f(a+k) - f(a)} = \frac{k}{f(a+k) - f(a)} . \tag{10.9} \]
Moreover, since f is differentiable at a, we have

\[ f(a+k) - f(a) = f'(a)k + E(k) , \tag{10.10} \]

where E(k) has the property that lim_{k→0} E(k)/k = 0.
Note that k is given by k = f⁻¹(b + h) − a = f⁻¹(b + h) − f⁻¹(b), and since f⁻¹ is
continuous at b (by Theorem 10.2), we know that k approaches 0 as h tends to 0.
Dividing numerator and denominator in (10.9) by k, the difference quotient equals
1/(f′(a) + E(k)/k), and therefore
\[ \lim_{h\to 0}\frac{f^{-1}(b+h) - f^{-1}(b)}{h} = \frac{1}{f'(a)} . \tag{10.11} \]
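
Formula (10.7) can be tested numerically even when no closed form for f⁻¹ is at hand. The following Python sketch (not part of the original notes) uses the increasing function f(x) = x³ + x, chosen only for illustration, computes f⁻¹ by bisection, and compares a difference quotient of f⁻¹ with 1/f′(f⁻¹(b)). The bracketing interval [−10, 10] and the name `f_inverse` are assumptions of this example.

```python
def f(x):
    # An increasing, hence one-to-one, sample function
    return x**3 + x

def f_prime(x):
    # Its derivative, which is never zero
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    # Bisection for the unique x with f(x) = y.
    # Assumes f(lo) <= y <= f(hi); valid here because f is increasing.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

b = 2.0                      # f(1) = 2, so the inverse of 2 should be 1
a = f_inverse(b)
h = 1e-5
quotient = (f_inverse(b + h) - f_inverse(b - h)) / (2 * h)
formula = 1 / f_prime(a)     # right-hand side of (10.7)
print(a, quotient, formula)
```

Here f′(1) = 4, so both the difference quotient and the formula should be close to 1/4.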

Example 10.4. Consider for any n ∈ N,

\[ f_n(x) = x^n . \tag{10.12} \]

For n odd, this function is continuous and one-to-one, and for n even it is so if we take
the domain to be [0, ∞). We have

\[ f_n^{-1}(x) = \sqrt[n]{x} = x^{1/n} , \tag{10.13} \]

whose domain is R when n is odd, and [0, ∞) if n is even. By Theorem 10.3 we have, for
x ≠ 0,
\[ (f_n^{-1})'(x) = \frac{1}{n\,x^{(n-1)/n}} = \frac{1}{n}\,x^{1/n - 1} . \tag{10.14} \]
Hence if f(x) = x^a for any integer a, or a the reciprocal of a natural number, then
f′(x) = a x^(a−1). In fact, for any rational number a = m/n, we can write

\[ f(x) = x^{m/n} = (x^{1/n})^m , \tag{10.15} \]

which entails by the chain rule:

\[ f'(x) = m\,(x^{1/n})^{m-1}\cdot\frac{1}{n}\,x^{1/n - 1} = \frac{m}{n}\,x^{m/n - 1} . \tag{10.16} \]

Problems
1. Find f⁻¹ for each of the following functions f.
a) f(x) = x³ + 1
b) f(x) = (x − 1)³

2. Describe the graph of f⁻¹ when
a) f is increasing and always positive.
b) f is increasing and always negative.

3. Prove that if f and g are one-to-one, then f ◦ g is also one-to-one. Find a formula
for (f ◦ g)⁻¹ in terms of f⁻¹ and g⁻¹.

4. On which intervals [a, b] will the following functions be one-to-one:
a) f(x) = x³ − 3x²
b) f(x) = (1 + x²)⁻¹.

5. Suppose f is a one-to-one function and that f⁻¹ has a derivative which is nowhere 0.
Prove that f is differentiable.
Hint: Apply Theorem 10.3

6. Find a formula for (f⁻¹)″(x) using the chain rule.

Additional: Critical points, and continuity of
the inverse
10.1. Critical points
Further Reading

(Spivak, Calculus, Chapter 11) (Folland, Advanced Calculus, Chapter 2.8)

Vector Calculus: Advanced (MAST20032)

Let us first consider a function f of one variable.

Definition 10.1. A point x is a local maximum point for a function f, if there is
some δ > 0 such that f(x) ≥ f(y) for every y with x − δ < y < x + δ. The number f(x)
itself is called a local maximum value of f. Similarly for a local minimum
point and value.

Theorem 10.1. Let f be any function defined on (a, b). If x is a local maximum (or
minimum) point for f on (a, b), and f is differentiable at x, then f′(x) = 0.

Proof. If x is a local maximum point, then the difference quotient

\[ \frac{f(x+h) - f(x)}{h} \tag{10.1} \]
is ≤ 0 for small h > 0, and ≥ 0 for small h < 0. Consequently,

\[ \lim_{h\to 0^+}\frac{f(x+h) - f(x)}{h} \le 0 , \qquad \lim_{h\to 0^-}\frac{f(x+h) - f(x)}{h} \ge 0 , \tag{10.2} \]
and since f is differentiable at x both limits exist and are equal to f′(x). Hence f′(x) = 0.

However, the converse is not true: A function f whose derivative is zero at
a point x does not necessarily have a minimum or maximum at that point. A simple
example is the function f(x) = x³, for which f′(0) = 0, yet it does not have a minimum or
maximum anywhere.

Definition 10.2. A critical point of a function f is a number x such that f′(x) = 0.
The number f(x) itself is called a critical value.


We have a similar notion for functions f (x1 , x2 ) of two variables ~x = (x1 , x2 ).

Definition 10.3. A point ~a = (a1 , a2 ) in the domain of f (x1 , x2 ) is a critical point if

∇f (~a) = 0 . (10.3)

Similarly, we can prove that if f(x_1, x_2) has a local maximum at a point ~a = (a_1, a_2),
in the sense that for some δ > 0,

\[ f(\vec x) \le f(\vec a) \quad\text{whenever } |\vec x - \vec a| < \delta , \tag{10.4} \]

and f is differentiable at ~a, then ~a is a critical point of f. Indeed, for any unit vector ~u
consider the function

\[ g(t) = f(\vec a + t\vec u) , \tag{10.5} \]

then by assumption g has a local maximum at t = 0. Since f is differentiable at ~a,
\[ g'(0) = \frac{\partial f}{\partial\vec u}(\vec a) = \nabla f(\vec a)\cdot\vec u . \tag{10.6} \]
From Theorem 10.1 we already know that g′(0) = 0, and since that holds for any vector
~u with |~u| = 1, we conclude that
\[ \nabla f(\vec a) = 0 . \tag{10.7} \]

10.2. Continuity and differentiability of the inverse


Further Reading

(Spivak, Calculus, Chapter 12)

We have seen that functions can be defined as pairs of numbers. The pairs of numbers
(x, y) consist of points x in the domain of f and the values y = f (x).

Definition 10.4. For any function f, the inverse of f, denoted by f⁻¹, is the set of
pairs (y, x) for which the pair (x, y) is in f.

However, we have seen that what makes a collection of pairs (a, b) a function f, is that
for each point a in the domain, there is a unique number b such that (a, b) is in f. So for
f⁻¹ to be a function, we need that for each y = f(x) there is a unique number x such
that f(x) = y. In other words, f needs to be one-to-one.
A function f which is one-to-one has an inverse function f⁻¹. The inverse function f⁻¹
is itself one-to-one, and (f⁻¹)⁻¹ = f. If the pair (a, b) is in f, then b = f(a); moreover if
f is one-to-one, then (b, a) is in f⁻¹, and a = f⁻¹(b).
We have seen that increasing, or decreasing functions are one-to-one.
Let us prove Theorem 10.2 in the case that the domain of f is an open interval, and f
is increasing on that interval.


Figure 10.1.: Proof of the continuity of the inverse.

Proof of Theorem 10.2. Suppose f is increasing on an open interval, and a is a point in
that interval. Let b = f(a).
We want to show that
\[ \lim_{y\to b} f^{-1}(y) = f^{-1}(b) . \tag{10.8} \]
This means that we need to show that for any ε > 0, there is a δ > 0, so that,
\[ \text{if } |y - b| < \delta , \text{ then } |f^{-1}(y) - f^{-1}(b)| < \varepsilon , \tag{10.9} \]
or equivalently,
\[ \text{if } f(a) - \delta < y < f(a) + \delta , \text{ then } a - \varepsilon < f^{-1}(y) < a + \varepsilon . \tag{10.10} \]
Now consider the points (f(a + ε), a + ε), (f(a − ε), a − ε) on the graph of f⁻¹;
see Figure 10.1. We can choose δ > 0 so that
\[ f(a - \varepsilon) \le f(a) - \delta < f(a) < f(a) + \delta \le f(a + \varepsilon) . \tag{10.11} \]
With this choice of δ, we have that for all y, if f(a) − δ < y < f(a) + δ, then
\[ f(a - \varepsilon) < y < f(a + \varepsilon) . \tag{10.12} \]
We will now use that since f is increasing, also f⁻¹ is increasing. Therefore:
\[ a - \varepsilon = f^{-1}(f(a - \varepsilon)) < f^{-1}(y) < f^{-1}(f(a + \varepsilon)) = a + \varepsilon . \tag{10.13} \]


Additional Problems
1. In Note 8, we have seen in Theorem 8.3 a sufficient criterion for differentiability of
a function of two variables, but we deferred the proof because it relies on the Mean
Value Theorem. This exercise guides you through the proof of Theorem 8.3:
Suppose the partial derivatives of a function f(x, y) exist at every point and are
continuous. We want to show that at every point (a, b),

\[ f(a+h, b+k) = f(a, b) + \nabla f(a, b)\cdot(h, k) + E(h, k) \tag{10.14} \]

where
\[ \nabla f(a, b) = (\partial_x f(a, b), \partial_y f(a, b)) \quad\text{and}\quad \lim_{(h,k)\to 0}\frac{E(h, k)}{|(h, k)|} = 0 . \tag{10.15} \]

In order to prove that, write

\[ E(h, k) = f(a+h, b+k) - f(a, b) - \partial_x f(a, b)\,h - \partial_y f(a, b)\,k \tag{10.16} \]

and apply the mean value theorem twice. Finally use that the partial derivatives
are continuous functions to evaluate the limit.

Module V.

Integration

Note 11.
The fundamental theorem of Calculus
Further Reading

(Spivak, Calculus, Chapter 15)

Real Analysis: Advanced (MAST20033)

Given a continuous function f ≥ 0 on [a, b] we can talk about the “area under the
graph of f”. In fact, this concept can be made precise for a larger class of bounded
functions which are called integrable (in particular they need not be nonnegative), and
the number ∫_a^b f which formalizes the concept of “area” is called the integral.
In this subject, we will not define the integral, or attempt to discuss this notion with
the same level of care as we have treated the topics of continuity, and differentiability,
for instance. Consequently, we cannot hope to prove any of the fundamental theorems
about integration, most importantly the fundamental theorems of Calculus stated
below, which relate integration and differentiation.
In this note, we will only state these theorems as facts.

Basic properties of the integral. The basic properties of the integral, which we denote
interchangeably by
\[ \int_a^b f(x)\,dx \quad\text{or}\quad \int_a^b f , \tag{11.1} \]
are that
\[ \int_a^b f = \int_a^c f + \int_c^b f \qquad (\text{for any } a < c < b) \tag{11.2} \]
\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g \tag{11.3} \]
\[ \int_a^b c f = c\int_a^b f \qquad (\text{for any } c \in \mathbb{R}) \tag{11.4} \]

Recall that given a differentiable function f, we obtained a new function f′ by differentiation.
In a similar way, given an integrable function f, a new function can be obtained
by integration. In fact, given any continuous function f on [a, b], consider the function
\[ F(x) = \int_a^x f \qquad (x \in [a, b]) . \tag{11.5} \]


It turns out this function is itself continuous in x, and more importantly always differentiable:
Theorem 11.1 (First fundamental theorem of Calculus). If f is continuous on [a, b],
then F is differentiable on (a, b), and
\[ F'(x) = f(x) . \tag{11.6} \]
Remark 11.1. The theorem says that given any continuous function f on [a, b] (in fact,
more generally any merely integrable function), there always exists a function F, whose
derivative is f; namely (11.5).
Also note that if G is defined by
\[ G(x) = \int_x^b f , \tag{11.7} \]

then G(x) = ∫_a^b f − ∫_a^x f, so consequently G′(c) = −f(c).
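
The statement F′(x) = f(x) can be illustrated numerically. The sketch below (Python, not part of the original notes) approximates F(x) = ∫_0^x cos by a composite midpoint rule and then differentiates F with a symmetric difference quotient. The midpoint rule stands in for the integral, which these notes do not define; the integrand cos, the base point 0, the step sizes, and the helper name `integral` are all assumptions of this example.

```python
import math

def integral(f, a, x, n=2000):
    # Composite midpoint rule as a numerical stand-in for the integral of f from a to x
    h = (x - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def F(x):
    # F(x) = integral of cos from 0 to x, as in (11.5) with f = cos and a = 0
    return integral(math.cos, 0.0, x)

x0 = 1.0
h = 1e-3
F_prime = (F(x0 + h) - F(x0 - h)) / (2 * h)   # approximate derivative of F at x0
print(F_prime, math.cos(x0))
```

The two printed numbers should agree to several decimal places, in line with (11.6).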

As a consequence, if it is known that a function f is the derivative of another function
g, the evaluation of its integral is a triviality:
Corollary 11.2. If f is continuous on [a, b] and f = g′ for some function g, then
\[ \int_a^b f = g(b) - g(a) . \tag{11.8} \]

Proof. Let F(x) be defined by (11.5), then F′ = f = g′ on [a, b]. Consequently by
the mean value theorem F = g + c for some constant c, cf. Corollary 9.4. In fact,
F(a) = 0 = g(a) + c, and thus F(x) = g(x) − g(a), in particular for x = b.

Example 11.1. If f(x) = x², and g(x) = x³/3, then g′(x) = x² = f(x), so

\[ \int_a^b x^2\,dx = (b^3 - a^3)/3 . \tag{11.9} \]
a

Exercise 11.1. Compute the integral of f(x) = xⁿ on the interval [a, b] for any natural
number n ∈ N.
Example 11.2. Also for f(x) = x⁻ⁿ, where n ∈ N, n ≠ 1, we know that we can find g
with g′(x) = f(x), at least for x ≠ 0: g(x) = (−n + 1)⁻¹ x^(−n+1). Thus for 0 < a < b
\[ \int_a^b x^{-n}\,dx = \frac{1}{n-1}\Bigl(\frac{1}{a^{n-1}} - \frac{1}{b^{n-1}}\Bigr) . \tag{11.10} \]
The exception n = 1 in the above example is significant. While there is no monomial
whose derivative is the function 1/x, we do know, by the fundamental theorem of Calculus,
that there exists a function g(x) whose derivative is f(x) = 1/x for x > 0, namely
\[ g(x) = \int_a^x \frac{1}{t}\,dt . \tag{11.11} \]
For a = 1 this serves as the definition of the logarithm (see the additional notes to
Module V).


Problems
1. The fundamental theorem of Calculus, together with the chain rule, allow us to
compute derivatives of a variety of functions defined in terms of integrals.
Example 11.3. Let us compute the derivative of the function

\[ f(x) = \int_a^{\sin x}\frac{1}{1 + \sin^2(t)}\,dt . \tag{11.12} \]

We can view f = F ◦ sin as the composition of sin with F, where

\[ F(x) = \int_a^x \frac{1}{1 + \sin^2(t)}\,dt . \tag{11.13} \]

Hence by the chain rule we have

\[ f'(x) = F'(\sin(x))\cos(x) = \frac{\cos(x)}{1 + \sin^2(\sin(x))} . \tag{11.14} \]

Find the derivatives of the following functions

a) f(x) = sin( ∫_a^x 1/(1 + sin²(t)) dt )
b) F(x) = ∫_a^(x³) sin³(t) dt
c) F(x) = ∫_0^x sin( ∫_0^y sin³(t) dt ) dy

2. Find (f⁻¹)′(0) if
\[ f(x) = \int_0^x 1 + \sin(\sin(t))\,dt \tag{11.15} \]

Hint: Do not try to evaluate the integral!

3. a) Find F′ if
\[ F(x) = \int_0^x x f(t)\,dt \tag{11.16} \]
Hint: The answer is not xf(x).
b) Prove that if f is continuous, then
\[ \int_0^x f(u)(x - u)\,du = \int_0^x\Bigl(\int_0^u f(t)\,dt\Bigr)du \tag{11.17} \]

c) Prove that moreover

\[ \int_0^x f(u)(x - u)^2\,du = 2\int_0^x\Bigl(\int_0^{u_2}\Bigl(\int_0^{u_1} f(t)\,dt\Bigr)du_1\Bigr)du_2 \tag{11.18} \]


4. Show that if h is continuous, and f and g are differentiable, and

\[ F(x) = \int_{f(x)}^{g(x)} h(t)\,dt , \tag{11.19} \]

then F′(x) = h(g(x))g′(x) − h(f(x))f′(x).

5. Suppose f > 0 on [a, b]. Find

\[ \int_a^b \frac{f'(t)}{f(t)}\,dt \tag{11.20} \]

Note 12.
The “simplest” differential equations
The simplest example of a differential equation is

\[ y'(x) = f(x) . \tag{12.1} \]

Here we are looking for a function y whose derivative is f. The fundamental theorem
of Calculus says that for any continuous function f(x) this differential equation has a
solution, namely
\[ y(x) = \int_a^x f(t)\,dt . \tag{12.2} \]


Figure 12.1.: The function f in (12.1) geometrically prescribes a slope, and gives rise
to the simplest example of a direction field. Solutions are functions whose
graphs are tangential to the indicated slopes at every point.


Another example is the equation

\[ y' = y . \tag{12.3} \]

Here we are looking for a function whose derivative equals itself.

Exercise 12.1. Draw the direction field for (12.3) as in Figure 12.1.
An example of a solution to this equation is the exponential function

\[ \exp(x) = e^x . \tag{12.4} \]

The exponential can be defined as the inverse of the logarithm; see Note 13 below.

\[ \log(x) = \int_1^x \frac{1}{t}\,dt \quad (x > 0) , \qquad \exp(x) = \log^{-1}(x) . \tag{12.5} \]

The statement that the exponential function solves the differential equation (12.3) then
follows from the formula for the derivative of the inverse given in Note 10:

\[ \exp'(x) = (\log^{-1})'(x) = \frac{1}{\log'(\exp(x))} = \exp(x) . \tag{12.6} \]

Of course e^x is not the only function with that property, for example also f(x) = c exp(x)
satisfies the relation f′(x) = f(x) for any constant c ∈ R. However, these are all:

Theorem 12.1. If f is differentiable and satisfies

\[ f'(x) = f(x) \quad\text{for all } x \in \mathbb{R} , \tag{12.7} \]

then there is a number c such that

\[ f(x) = c\,e^x \quad\text{for all } x \in \mathbb{R} . \tag{12.8} \]

Remark 12.1. In other words, the solutions of the differential equation y′ = y are all of
the form y(x) = c e^x for some constant c ∈ R.

Proof. Let g(x) = e^(−x) f(x) = f(x)/e^x, then

\[ g'(x) = \frac{e^x f'(x) - f(x)e^x}{(e^x)^2} = 0 . \tag{12.9} \]

Therefore there is a number c such that for all x:

\[ g(x) = \frac{f(x)}{e^x} = c , \tag{12.10} \]

that is, f(x) = c e^x.


An important example of a second order differential equation is

\[ y'' + y = 0 . \tag{12.11} \]

It is easy to verify that the trigonometric functions are solutions, namely for both

\[ f(x) = \cos(x) , \quad\text{and}\quad f(x) = \sin(x) , \tag{12.12} \]

we have f″ + f = 0, but they differ in their values of f and f′ at x = 0. We have,
respectively, for these two functions

\[ f(0) = 1 ,\ f'(0) = 0 ; \quad\text{and}\quad f(0) = 0 ,\ f'(0) = 1 . \tag{12.13} \]

It may at first be surprising that cos(x) and sin(x) are the only solutions to (12.11) with
these values at 0.

Lemma 12.2. Suppose that f is twice differentiable and that

\[ f'' + f = 0 , \tag{12.14} \]
\[ f(0) = 0 , \qquad f'(0) = 0 . \tag{12.15} \]

Then f = 0.

Proof. Since f satisfies the equation f″ + f = 0, we compute

\[ \bigl((f')^2 + f^2\bigr)' = 2f'\bigl(f'' + f\bigr) = 0 , \tag{12.16} \]

so (f′)² + f² must be constant, and evaluated at x = 0 it is 0, hence

\[ (f'(x))^2 + (f(x))^2 = 0 \tag{12.17} \]

for all x, which implies f(x) = 0 for all x.

This also means that if

\[ f'' + f = 0 , \tag{12.18} \]
\[ f(0) = a , \qquad f'(0) = b , \tag{12.19} \]

then
\[ f(x) = a\cos(x) + b\sin(x) . \tag{12.20} \]
Indeed, if we define
\[ g(x) = f(x) - a\cos(x) - b\sin(x) , \tag{12.21} \]
then g also satisfies g″ + g = 0, and g(0) = 0, and g′(0) = 0, from which we conclude
with the Lemma that g(x) = 0.
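
The uniqueness statement (12.20) can be observed numerically: integrating y″ + y = 0 from the initial values f(0) = a, f′(0) = b should reproduce a cos(x) + b sin(x). The Python sketch below (not part of the original notes) does this with the classical fourth-order Runge-Kutta method applied to the equivalent first-order system y′ = v, v′ = −y. The choice of method, the step count, the sample values of a and b, and the function names are assumptions made for illustration.

```python
import math

def deriv(y, v):
    # First-order system equivalent to y'' + y = 0: y' = v, v' = -y
    return v, -y

def rk4_step(y, v, h):
    # One step of the classical fourth-order Runge-Kutta method
    k1y, k1v = deriv(y, v)
    k2y, k2v = deriv(y + 0.5 * h * k1y, v + 0.5 * h * k1v)
    k3y, k3v = deriv(y + 0.5 * h * k2y, v + 0.5 * h * k2v)
    k4y, k4v = deriv(y + h * k3y, v + h * k3v)
    y_new = y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y_new, v_new

def solve(a, b, x_end, n=1000):
    # Integrate from x = 0 to x = x_end with initial values f(0) = a, f'(0) = b
    h = x_end / n
    y, v = a, b
    for _ in range(n):
        y, v = rk4_step(y, v, h)
    return y

a, b, x_end = 2.0, -1.0, 3.0
numeric = solve(a, b, x_end)
exact = a * math.cos(x_end) + b * math.sin(x_end)   # formula (12.20)
print(numeric, exact)
```

The numerical solution and the closed formula should agree up to the discretisation error of the method.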
An unexpected consequence are the addition theorems for trigonometric functions.


Proposition 12.3.

sin(x + y) = sin(x) cos(y) + cos(x) sin(y) (12.22)


cos(x + y) = cos(x) cos(y) − sin(x) sin(y) (12.23)

Proof. For fixed y, set

\[ f(x) = \sin(x + y) . \tag{12.24} \]
Then
\[ f''(x) + f(x) = 0 \tag{12.25} \]
and
\[ f(0) = \sin(y) , \qquad f'(0) = \cos(y) . \tag{12.26} \]
So by the argument above,

\[ f(x) = \sin(y)\cos(x) + \cos(y)\sin(x) . \tag{12.27} \]

This is (12.22); differentiating it with respect to x gives (12.23).

In particular we note that cos(2x) = cos²(x) − sin²(x), and sin(2x) = 2 cos(x) sin(x).

Note 13.
Logarithm and exponential function
Further Reading

(Spivak, Calculus, Chapter 18), with more motivation for the definition of the
logarithm. (Apostol, Calculus I, Chapter 8.3), with an emphasis on the exponential
function as the solution to a differential equation.

13.1. Logarithm
The logarithm is an example of a function that is defined by an integral.
Definition 13.1 (Logarithm). For x > 0, we set
\[ \log(x) = \int_1^x \frac{1}{t}\,dt . \tag{13.1} \]
Exercise 13.1. Sketch the graph of the logarithm.
Proposition 13.1. If x, y > 0, then
\[ \log(xy) = \log(x) + \log(y) . \tag{13.2} \]
Proof. Note that by the fundamental theorem of calculus log′(x) = 1/x. Now choose a
number y > 0, and let f(x) = log(xy). Then
\[ f'(x) = \log'(xy)\,y = \frac{y}{xy} = \frac{1}{x} , \tag{13.3} \]
which says that f′ = log′. This implies that there is a number c such that
\[ f(x) = \log(x) + c \tag{13.4} \]
for all x > 0, and we can find c by evaluating
\[ f(1) = \log(y) = \log(1) + c = c , \tag{13.5} \]
and therefore
\[ \log(xy) = f(x) = \log(x) + c = \log(x) + \log(y) . \tag{13.6} \]
Since this is true for all y > 0, the proposition is proved.


Exercise 13.2. Show by induction that if n is a natural number and x > 0, then
\[ \log(x^n) = n \log(x) \tag{13.7} \]
Corollary 13.2. If x, y > 0, then
\[ \log\left(\frac{x}{y}\right) = \log(x) - \log(y) . \tag{13.8} \]
Proof. This is true because
\[ \log(x) = \log\left(\frac{x}{y}\, y\right) = \log(x/y) + \log(y) . \tag{13.9} \]

The function log(x) is clearly increasing, but since $\log'(x) = 1/x$ the slope gets very
small when x is large, and consequently log(x) grows more and more slowly. It is not
immediately clear if the function is bounded or unbounded. However, for any n ∈ N,
\[ \log(2^n) = n \log(2) \tag{13.10} \]
and log 2 > 0 (because the integrand 1/t is positive on [1, 2]); similarly
\[ \log(2^{-n}) = -n \log(2) . \tag{13.11} \]
Thus log takes arbitrarily large positive and negative values, and since it is continuous,
by the intermediate value theorem the logarithm takes on any value t ∈ R.

13.2. Exponential function


We have seen that the logarithm is increasing on (0, ∞) and takes all values in R.
Therefore the inverse function $\log^{-1}$ exists and its domain is R. This function is the
exponential function.
Definition 13.2 (Exponential function). For any real number x, we set
\[ \exp(x) = \log^{-1}(x) . \tag{13.12} \]
Theorem 13.3. For all numbers x,
\[ \exp'(x) = \exp(x) \tag{13.13} \]
Moreover, for any two numbers x and y,
\[ \exp(x + y) = \exp(x)\exp(y) . \tag{13.14} \]
Proof. With the formula for the derivative of the inverse from Lecture 10,
\[ \exp'(x) = (\log^{-1})'(x) = \frac{1}{\log'(\exp(x))} = \exp(x) . \tag{13.15} \]

Moreover, since x = log(a), y = log(b) for some a, b > 0,


exp(x + y) = exp(log(a) + log(b)) = exp(log(ab)) = ab = exp(x) exp(y) . (13.16)

The number e = exp(1) is called Euler’s number. For any number x, we define
$e^x = \exp(x)$; see Additional.


Exponential growth. Another important observation is that “the exponential grows
faster than any polynomial”.

Theorem 13.4. For any natural number n,
\[ \lim_{x\to\infty} \frac{e^x}{x^n} = \infty \tag{13.17} \]

Proof. Let us first show that
\[ \lim_{x\to\infty} e^x = \infty . \tag{13.18} \]
From the definition of the logarithm we can infer that
\[ \log(x) < x \qquad (x > 0) . \tag{13.19} \]
Indeed, for 0 < x < 1 we have log(x) < 0; moreover log(1) = 0 < 1, and for x > 1,
\[ \log(x) = \int_1^x \frac{dt}{t} \le (x - 1) < x . \tag{13.20} \]
Therefore
\[ x = \exp(\log(x)) < e^x , \tag{13.21} \]
which in particular implies (13.18).
Next we prove that
\[ \lim_{x\to\infty} \frac{e^x}{x} = \infty . \tag{13.22} \]
Since
\[ \frac{e^x}{x} = \frac{1}{2} \left( \frac{e^{x/2}}{x/2} \right) e^{x/2} \tag{13.23} \]
and in view of (13.21) the factor in parentheses is bounded from below by 1, this statement
also follows from (13.18).
Similarly we can now write
\[ \frac{e^x}{x^n} = \left( \frac{e^{x/n}}{x} \right)^n = \frac{1}{n^n} \left( \frac{e^{x/n}}{x/n} \right)^n \tag{13.24} \]
which then proves the statement of the Theorem by (13.22).
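
To get a feeling for Theorem 13.4, one can tabulate the quotient $e^x/x^n$ for a fixed n and growing x. The following is a small sketch; the helper name `exp_over_power`, the choice n = 5, and the sample points are assumptions made for illustration only.

```python
import math

def exp_over_power(x, n):
    """Return e^x / x^n for x > 0, evaluated as exp(x - n*log(x)) to keep numbers in range."""
    return math.exp(x - n * math.log(x))

n = 5
samples = [10.0, 50.0, 100.0, 200.0]
ratios = [exp_over_power(x, n) for x in samples]
for x, r in zip(samples, ratios):
    print(f"x = {x:6.1f}   e^x / x^{n} = {r:.3e}")

# Inequality (13.19), log(x) < x, checked at the same sample points.
log_below_identity = all(math.log(x) < x for x in samples)
print(log_below_identity)
```

A table like this is of course no proof; it only illustrates how quickly the exponential overtakes a fixed power once x is large.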

Problems
1. a) Check that the derivative of $\log \circ f$ is $f'/f$.
Note: The derivative of $\log \circ f$ is called the logarithmic derivative and is sometimes
easier to compute than $f'$, because taking the logarithm turns products
into sums. The formula says that multiplying $(\log \circ f)'$ by f recovers $f'$, and
this process of finding the derivative of f is called logarithmic differentiation.


b) Use logarithmic differentiation to find the derivative of the following functions

i. $f(x) = (1 + x)(1 + e^{x^2})$
ii. $f(x) = \dfrac{e^x - e^{-x}}{e^{2x}(1 + x^3)}$

2. Draw the graph of each of the following functions

a) $f(x) = e^{x+1}$
b) $f(x) = e^x + e^{-x}$
c) $f(x) = e^x - e^{-x}$
Are these functions the solutions to a differential equation?

Note 14.
Methods of integration
Definition 14.1 (Primitive). A function F satisfying $F' = f$ is called a primitive of f.

Example 14.1. If F(x) = x log x − x, then $F'(x) = \log x$. So F is a primitive of the
logarithm. Consequently, by the fundamental theorem of calculus,
\[ \int_a^b \log(x)\,dx = b \log b - a \log a - b + a . \tag{14.1} \]

A continuous function f always has a primitive, namely
\[ F(x) = \int_a^x f(t)\,dt . \tag{14.2} \]

However, in this lecture we will try to find a primitive which can be written in terms
of elementary functions, namely the trigonometric functions and their inverses, and the
logarithmic and exponential functions, and rational functions formed thereof.
Remark 14.1. Elementary primitives usually cannot be found. For example, there is no
elementary function F such that
\[ F'(x) = e^{-x^2} . \tag{14.3} \]

The basic methods for finding elementary primitives are actually theorems which allow
us to express primitives of one function in terms of primitives of other functions. To
integrate we will therefore need a list of primitives for some functions, and such a list
can be obtained simply by differentiating various well-known functions.

Definition 14.2. For the primitive of a function f we often use the notation
\[ \int f(x)\,dx \quad\text{or}\quad \int f . \tag{14.4} \]

These are also called indefinite integrals, in contrast to definite integrals of a function f
with primitive F for which we adopt the notation
\[ \int_a^b f(x)\,dx = F(b) - F(a) = F(x)\Big|_a^b . \]


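Example 14.2. We can verify the following formulas by differentiating the right hand
sides:
\[ \int a\,dx = ax \tag{14.5} \]
\[ \int x^n\,dx = \frac{x^{n+1}}{n+1} \qquad (n \neq -1) \tag{14.6} \]
\[ \int \frac{1}{x}\,dx = \log x \tag{14.7} \]
\[ \int e^x\,dx = e^x \tag{14.8} \]
\[ \int \sin x\,dx = -\cos x \tag{14.9} \]
\[ \int \cos x\,dx = \sin x \tag{14.10} \]
\[ \int \sec^2 x\,dx = \tan x \tag{14.11} \]
\[ \int \frac{dx}{1 + x^2} = \arctan x \tag{14.12} \]
\[ \int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x \tag{14.13} \]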

14.1. Integration by parts, and substitution


We will now learn two important methods which can be used to express the primitive of
one function in terms of the primitive of another.
Exercise 14.1. Convince yourself by using differentiation that a primitive of f + g can
be obtained by adding a primitive of f to a primitive of g, and a primitive of c · f can
be obtained by multiplying a primitive of f by c. This is expressed by the two general
formulas:
\[ \int (f + g) = \int f + \int g \tag{14.14} \]
\[ \int c \cdot f = c \int f . \tag{14.15} \]

The product rule of differentiation yields the following important theorem:

Theorem 14.1 (Integration by parts). If $f'$ and $g'$ are continuous, then
\[ \int f g' = f g - \int f' g \tag{14.16} \]
\[ \int_a^b f(x) g'(x)\,dx = f(x) g(x) \Big|_a^b - \int_a^b f'(x) g(x)\,dx \tag{14.17} \]
Proof. The formula for the indefinite integrals states that a primitive of the function
$f'g + fg'$ is $fg$, which of course follows immediately from the product rule:
\[ (fg)' = f'g + fg' \tag{14.18} \]


The formula for the definite integral follows if we integrate both sides of this equation on
the interval [a, b].

Example 14.3.
\[ \int x e^x\,dx = x e^x - e^x \tag{14.19} \]

Example 14.4.
\[ \int x \sin x\,dx = -x \cos(x) + \int \cos x\,dx = -x \cos(x) + \sin(x) \tag{14.20} \]

Example 14.5.
\[ \int \log x\,dx = x \log x - \int x \cdot (1/x)\,dx = x \log x - x \tag{14.21} \]

Example 14.6. The computation of the primitive of log(x)/x is an example where the
result is obtained in two steps:
\[ \int \frac{1}{x} \log(x)\,dx = \log(x)^2 - \int \log(x) \frac{1}{x}\,dx \tag{14.22} \]
\[ \int \frac{\log x}{x}\,dx = \frac{1}{2} (\log x)^2 \tag{14.23} \]
Example 14.7. Any previously computed primitive can be used for integration by parts:
\[ \begin{aligned}
\int (\log(x))^2\,dx &= \int (\log x)(\log x)\,dx \\
&= \log(x) \big( x \log x - x \big) - \int \frac{1}{x} \big( x \log x - x \big)\,dx \\
&= \log(x) \big( x \log x - x \big) - \int \big( \log(x) - 1 \big)\,dx \\
&= x (\log x)^2 - 2x \log(x) + 2x
\end{aligned} \tag{14.24} \]
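
A primitive found by integration by parts can always be checked by differentiating it. The sketch below does this numerically for the result of Example 14.7, using a symmetric difference quotient; the names `F` and `central_difference`, the step size, and the sample points are illustrative assumptions.

```python
import math

def F(x):
    """Candidate primitive of (log x)^2 taken from (14.24)."""
    return x * math.log(x) ** 2 - 2 * x * math.log(x) + 2 * x

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

points = [0.5, 1.0, 2.0, 10.0]
errors = [abs(central_difference(F, x) - math.log(x) ** 2) for x in points]
print(max(errors))  # should be close to zero
```

The same two functions can be reused for the other examples of this section by swapping in the corresponding primitive and integrand.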

The next method is a consequence of the chain rule of differentiation.

Theorem 14.2 (Substitution formula). If f and $g'$ are continuous, then
\[ \int_{g(a)}^{g(b)} f(y)\,dy = \int_a^b f(g(x))\, g'(x)\,dx \tag{14.25} \]
\[ \int_{g(a)}^{g(b)} f = \int_a^b (f \circ g) \cdot g' \tag{14.26} \]

Proof. Let F be a primitive of f, namely for some c,
\[ F(x) = \int_c^x f(y)\,dy , \tag{14.27} \]

then $F' = f$, and
\[ (F \circ g)' = (F' \circ g) \cdot g' = (f \circ g) \cdot g' , \tag{14.28} \]


and so after integration, by the fundamental theorem of calculus,
\[ \int_a^b (f \circ g)(x) \cdot g'(x)\,dx = (F \circ g)\Big|_a^b = F(g(b)) - F(g(a)) = \int_{g(a)}^{g(b)} f . \tag{14.29} \]

Example 14.8.
\[ \int_a^b \sin^5(x) \cos(x)\,dx = \frac{1}{6} \sin^6(x) \Big|_a^b \tag{14.30} \]

because with $f(x) = x^5$, and g(x) = sin(x), this integral is of the form
\[ \int_a^b f(g(x))\, g'(x)\,dx = \int_{g(a)}^{g(b)} f(y)\,dy = (F \circ g)\Big|_a^b \tag{14.31} \]

where $F(x) = x^6/6$ is a primitive of f.
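
The two sides of the substitution formula (14.25) can also be compared numerically. The following sketch does this for Example 14.8 with a midpoint sum; the helper `midpoint_rule` and the interval endpoints a = 0.3, b = 1.2 are assumptions chosen only for illustration.

```python
import math

def midpoint_rule(func, a, b, steps=20_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

a, b = 0.3, 1.2
g = math.sin

# Left side of (14.25): integrate f(y) = y^5 from g(a) to g(b).
lhs = midpoint_rule(lambda y: y ** 5, g(a), g(b))
# Right side of (14.25): integrate f(g(x)) g'(x) = sin^5(x) cos(x) from a to b.
rhs = midpoint_rule(lambda x: g(x) ** 5 * math.cos(x), a, b)
# Closed form from (14.30).
closed_form = (g(b) ** 6 - g(a) ** 6) / 6
print(lhs, rhs, closed_form)
```

All three numbers should agree up to the discretisation error of the midpoint sums.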


Example 14.9.
\[ \int_a^b \tan(x)\,dx = -\int_a^b \frac{-\sin(x)}{\cos(x)}\,dx = -\log\cos(b) + \log(\cos(a)) \tag{14.32} \]
Example 14.10.
\[ \int_a^b \frac{dx}{x \log x} = \log\log(x) \Big|_a^b \tag{14.33} \]
We are usually interested in primitives rather than definite integrals, but if we can find
$\int_a^b f(x)\,dx$ for all a, and b, then we can certainly find $\int f$. For example, from (14.30) it
follows that
\[ \int \sin^5(x) \cos(x)\,dx = \frac{\sin^6 x}{6} . \tag{14.34} \]
It is quite uneconomical to find the primitive by first evaluating a definite integral. Instead
we have the following procedure:

Substitution procedure. Find a function g(x) so that after setting

\[ u = g(x), \qquad du = g'(x)\,dx \tag{14.35} \]

only the variable u appears. Then find a primitive in terms of u, and substitute
g(x) back in for u.

Example 14.11. Consider again Example 14.10. Set
\[ u = \log x, \qquad du = \frac{1}{x}\,dx \tag{14.36} \]
then
\[ \int \frac{1}{x \log x}\,dx = \int \frac{du}{u} = \log u \tag{14.37} \]
and substituting back in for u gives the answer log log(x).


Similarly for the other examples above:

Example 14.12. To evaluate (14.30), set

\[ u = \sin(x), \qquad du = \cos(x)\,dx \tag{14.38} \]

so
\[ \int \sin^5(x) \cos(x)\,dx = \int u^5\,du = \frac{1}{6} u^6 = \frac{1}{6} \sin^6(x) \tag{14.39} \]
Example 14.13. To evaluate
\[ \int \frac{x}{1 + x^2}\,dx \tag{14.40} \]
set
\[ u = 1 + x^2, \qquad du = 2x\,dx \tag{14.41} \]
so this integral equals
\[ \frac{1}{2} \int \frac{du}{u} = \frac{1}{2} \log u = \frac{1}{2} \log(1 + x^2) \tag{14.42} \]

14.2. Applications of the substitution formula


In many cases a factor $g'(x)$ is easily recognized, and integration by substitution is
straightforward.
Exercise 14.2. Find
\[ \int e^{\sin x} \cos(x)\,dx \tag{14.43} \]
\[ \int \frac{e^x}{\sqrt{1 - e^{2x}}}\,dx \tag{14.44} \]

Example 14.14.

\[ \int e^{3x}\,dx = \frac{e^{3x}}{3} \tag{14.45} \]
\[ \int \cos(4x)\,dx = \frac{\sin(4x)}{4} \tag{14.46} \]

More interesting uses of the substitution formula appear when the factor $g'(x)$ does
not appear.
Example 14.15. Consider
\[ \int \frac{1 + e^x}{1 - e^x}\,dx . \tag{14.47} \]
The obvious substitution to try is

\[ u = e^x, \qquad du = e^x\,dx \tag{14.48} \]


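and even though this factor does not appear in the integral we are led, writing $dx = du/u$, to
\[ \int \frac{1 + e^x}{1 - e^x}\,dx = \int \frac{1 + u}{1 - u}\,\frac{1}{u}\,du . \tag{14.49} \]
This can be integrated easily once we recognise that
\[ \frac{1 + u}{1 - u}\,\frac{1}{u} = \frac{2}{1 - u} + \frac{1}{u} , \tag{14.50} \]
hence
\[ \int \frac{1 + e^x}{1 - e^x}\,dx = -2 \log(1 - e^x) + \log(e^x) \tag{14.51} \]
Alternatively we could have set
\[ u = e^x, \qquad x = \log u, \qquad dx = \frac{1}{u}\,du \tag{14.52} \]
then immediately
\[ \int \frac{1 + e^x}{1 - e^x}\,dx = \int \frac{1 + u}{1 - u}\,\frac{1}{u}\,du . \tag{14.53} \]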

This trick works more generally: Suppose we make the substitution

\[ u = g(x) \tag{14.54} \]

and say we are in the situation that g is one-to-one, at least for all x under consideration,
in particular $g' \neq 0$, then we can solve

\[ x = g^{-1}(u) . \tag{14.55} \]

In order to find
\[ \int f(g(x))\,dx \tag{14.56} \]

we would conventionally write
\[ du = g'(x)\,dx \tag{14.57} \]
so
\[ \int f(g(x))\,dx = \int f(g(x)) \frac{1}{g'(x)}\, g'(x)\,dx = \int f(u) \frac{1}{g'(g^{-1}(u))}\,du . \tag{14.58} \]
Now note that
\[ \frac{1}{g'(g^{-1}(u))} = (g^{-1})'(u) \tag{14.59} \]
so we could have obtained the identity
\[ \int f(g(x))\,dx = \int f(u)\,(g^{-1})'(u)\,du \tag{14.60} \]

equally well with the substitution rule

\[ x = g^{-1}(u), \qquad dx = (g^{-1})'(u)\,du . \tag{14.61} \]


Example 14.16. Consider the integral
\[ \int \frac{e^{2x}}{\sqrt{e^x + 1}}\,dx . \tag{14.62} \]
Set
\[ u = \sqrt{e^x + 1} , \tag{14.63} \]
then
\[ u^2 = e^x + 1 \tag{14.64} \]
\[ x = \log(u^2 - 1), \qquad dx = \frac{2u}{u^2 - 1}\,du \tag{14.65} \]
hence
\[ \begin{aligned}
\int \frac{e^{2x}}{\sqrt{e^x + 1}}\,dx &= \int \frac{(u^2 - 1)^2}{u}\,\frac{2u}{u^2 - 1}\,du = 2 \int (u^2 - 1)\,du \\
&= \frac{2}{3} u^3 - 2u = \frac{2}{3} (e^x + 1)^{3/2} - 2 (e^x + 1)^{1/2}
\end{aligned} \tag{14.66} \]
Finally let us look at some examples for the integration of trigonometric functions
by substitution. When integrating monomials in trigonometric functions it is useful
to remember the formulas from Prop. 12.3, in particular

\[ \cos(2x) = \cos^2(x) - \sin^2(x) \tag{14.67} \]

from which we obtain $\cos(2x) = 2\cos^2(x) - 1 = 1 - 2\sin^2(x)$ or
\[ \cos^2(x) = \frac{\cos(2x) + 1}{2}, \qquad \sin^2(x) = \frac{1 - \cos(2x)}{2} . \tag{14.68} \]
Example 14.17. Consider the integral
\[ \int \sqrt{1 - x^2}\,dx \tag{14.69} \]

If we set
\[ x = \sin(u) \tag{14.70} \]

then $\sqrt{1 - x^2} = \cos(u)$ simplifies (for $-\pi/2 \le u \le \pi/2$, where cos(u) ≥ 0), so we are led to the substitution

\[ u = \arcsin(x) . \tag{14.71} \]

Then
\[ \int \sqrt{1 - x^2}\,dx = \int \sqrt{1 - \sin^2(u)}\,\cos(u)\,du = \int \cos^2(u)\,du . \tag{14.72} \]

This integral can be evaluated using that $\cos^2(u) = (1 + \cos(2u))/2$ and we find that
\[ \int \cos^2(u)\,du = \frac{u}{2} + \frac{\sin(2u)}{4} \tag{14.73} \]


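and substituting back in u = arcsin(x) we have an expression for the primitive of (14.69):
\[ \begin{aligned}
\int \cos^2(u)\,du &= \frac{\arcsin(x)}{2} + \frac{\sin(2 \arcsin(x))}{4} \\
&= \frac{\arcsin(x)}{2} + \frac{x \sqrt{1 - x^2}}{2}
\end{aligned} \tag{14.74} \]
We can also use the formulas (14.68) to integrate:
\[ \int \sin^n x\,dx , \qquad \int \cos^n x\,dx \tag{14.75} \]

Example 14.18. We see immediately that
\[ \int_0^{2\pi} \cos^2(x)\,dx = \pi \tag{14.76} \]

Example 14.19. Let us consider the case n = 3:
\[ \begin{aligned}
\int \sin^3(x)\,dx &= \int \sin(x) (1 - \cos^2(x))\,dx \\
&= -\cos(x) + \frac{1}{3} \cos^3(x)
\end{aligned} \tag{14.77} \]
This approach works for any n = 2k + 1 odd.
Example 14.20. Finally let us compute
\[ \begin{aligned}
\int_0^{2\pi} \cos^4(x)\,dx &= \int_0^{2\pi} \left( \frac{\cos(2x) + 1}{2} \right)^2 dx \\
&= \int_0^{2\pi} \left( \frac{1}{4} \cos^2(2x) + \frac{1}{2} \cos(2x) + \frac{1}{4} \right) dx \\
&= \frac{1}{4} \int_0^{2\pi} \left( \frac{\cos(4x) + 1}{2} + 1 \right) dx = \frac{3\pi}{4}
\end{aligned} \tag{14.78} \]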
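
The values in Examples 14.18 and 14.20 are easy to confirm numerically. The following is a minimal sketch with a midpoint sum over one full period; the helper name `midpoint_rule` and the number of steps are assumptions for illustration.

```python
import math

def midpoint_rule(func, a, b, steps=10_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

two_pi = 2 * math.pi
cos2 = midpoint_rule(lambda x: math.cos(x) ** 2, 0.0, two_pi)
cos4 = midpoint_rule(lambda x: math.cos(x) ** 4, 0.0, two_pi)

# Differences to the values pi and 3*pi/4 from (14.76) and (14.78).
print(cos2 - math.pi, cos4 - 3 * math.pi / 4)
```

Both printed differences should be at the level of rounding error, since equally spaced sums are very accurate for smooth periodic integrands over a full period.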


Problems
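1. Find elementary expressions for the following primitives.
a) $\displaystyle \int \frac{\sqrt[5]{x^3} + \sqrt[6]{x}}{\sqrt{x}}\,dx$
b) $\displaystyle \int \frac{dx}{\sqrt{x - 1} + \sqrt{x + 1}}$
c) $\displaystyle \int \frac{dx}{a^2 + x^2}$
d) $\displaystyle \int \frac{dx}{a^2 - x^2}$
2. Solve by substitution.
a) $\displaystyle \int x e^{-x^2}\,dx$
b) $\displaystyle \int \frac{e^x\,dx}{e^{2x} + 2e^x + 1}$
c) $\displaystyle \int \log(\cos(x)) \tan x\,dx$
3. Solve by integration by parts.
a) $\displaystyle \int x^3 e^{x^2}\,dx$
b) $\displaystyle \int (\log x)^3\,dx$
c) $\displaystyle \int \frac{\log(\log x)}{x}\,dx$
4. Find the following primitives in elementary terms using substitution.
a) $\displaystyle \int e^x \sin e^x\,dx$
b) $\displaystyle \int x \sqrt{1 - x^2}\,dx$
c) $\displaystyle \int \frac{\log(\log(x))}{x \log(x)}\,dx$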

5. The following integrations involve substitutions of various types. There is no
general rule, but try to substitute for an expression which appears frequently and
prominently. Also remember that it usually helps to express x directly in terms of
u.
a) $\displaystyle \int \frac{dx}{1 + \sqrt{x + 1}}$
b) $\displaystyle \int \frac{dx}{1 + e^x}$
c) $\displaystyle \int \frac{dx}{\sqrt{x} + \sqrt[3]{x}}$
d) $\displaystyle \int \frac{1}{\sqrt{1 + e^x}}\,dx$
e) $\displaystyle \int \frac{1}{2 + \tan(x)}\,dx$

Module VI.

Ordinary differential equations

Note 15.
First order linear differential equations
In the previous lectures we have already encountered the notion of a solution to a
differential equation, and the initial value problem.
The notion of a solution is more familiar for algebraic equations. For example the
equation $x^2 - 1 = 0$ has as its solutions x = ±1, which means inserting the values x = ±1
turns the equation into a true statement.
For a differential equation the solutions are functions, which upon inserting turn the
equation into a true statement.

Definition 15.1 (Solutions to first order differential equations). Let $U \subset \mathbb{R}^2$ and

\[ f : U \to \mathbb{R} \]

be a continuous real valued function.

(i) The expression
\[ y' = f(x, y) \tag{15.1} \]
is called an (explicit) first order differential equation.

(ii) A function $\varphi : I \to \mathbb{R}$ is a solution to (15.1) if the domain $I \subset \mathbb{R}$ is an interval,
ϕ is continuously differentiable on I (namely ϕ is differentiable and the derivative $\varphi'$ is a
continuous function; we write $\varphi \in C^1(I)$), $(x, \varphi(x)) \in U$ for all x ∈ I, and
\[ \varphi'(x) = f(x, \varphi(x)) . \tag{15.2} \]

(iii) Let $(x_0, y_0) \in U$. We say $\varphi \in C^1(I)$ is a solution to the initial value problem

\[ y' = f(x, y), \qquad y(x_0) = y_0 \tag{15.3} \]

if ϕ : I → R is a solution in the above sense, and $x_0 \in I$ and $\varphi(x_0) = y_0$.

Remark 15.1. The general theory of differential equations addresses the question under
which conditions on f there exist solutions to (15.1), and when they are unique. It is also
of interest when the solutions can be expressed in explicit terms.
Remark 15.2. In practice we often write y(x) for the solution, but conceptually it is
important to distinguish between the unknown y, and the solution ϕ(x).


Remark 15.3. We use the notation $C^0(I)$ to denote “the space of continuous functions on
the interval I”. For example, $\varphi \in C^0(I)$, where I = (a, b), means that ϕ is continuous on
(a, b). Similarly, $C^1(I)$ denotes the “space of continuously differentiable functions on I”.
So $\varphi \in C^1(I)$, where I = (a, b), means that ϕ is differentiable on (a, b), and $\varphi' \in C^0(I)$.
In Module V we have seen the exponential as the solution to an initial value problem.
Let us prove this characterisation of the exponential in yet another way, and thereby give
an example of an existence and uniqueness theorem for a differential equation.
Proposition 15.1. Let $a, y_0 \in \mathbb{R}$. Then f : I → R is a solution to the initial value
problem
\[ y' = ay, \qquad y(0) = y_0 \tag{15.4} \]
if and only if
\[ f(x) = y_0 e^{ax} . \tag{15.5} \]
Proof. Clearly, with f(x) given by (15.5) we have
\[ f'(x) = y_0 a e^{ax} = a f(x) \tag{15.6} \]
and $f(0) = y_0$, so f(x) solves the initial value problem. Conversely, let f(x) be a solution
to the initial value problem (15.4), and set $g(x) = f(x) e^{-ax}$ on the interval I where f is
defined. Then
\[ g'(x) = f'(x) e^{-ax} - a f(x) e^{-ax} = 0 . \tag{15.7} \]
Therefore (by the Mean Value Theorem) g(x) = g(0), hence
\[ f(x) e^{-ax} = g(x) = g(0) = f(0) = y_0 . \tag{15.8} \]

15.1. Linear first order equations


A special case are the linear differential equations of first order, which correspond to
the case when the function f in Definition 15.1 depends linearly on y.
Definition 15.2. Let P, Q : I → R be continuous functions on an interval I ⊂ R. A
linear first order differential equation is an expression of the form
\[ y' + P(x) y = Q(x) \tag{15.9} \]
and we say the equation is homogeneous if Q = 0, and inhomogeneous otherwise.
Assuming for a moment that the homogeneous equation (15.9), with Q = 0, has a
solution $\varphi \neq 0$, we see that we can integrate the homogeneous equation in the form
\[ \frac{\varphi'}{\varphi} = -P(x) \tag{15.10} \]
which yields for any $(x_0, x) \subset I$,
\[ \log \frac{|\varphi(x)|}{|\varphi(x_0)|} = -\int_{x_0}^x P(t)\,dt . \tag{15.11} \]


Proposition 15.2. Let I ⊂ R be an interval, $P \in C^0(I)$ and $x_0 \in I$, and $y_0 \in \mathbb{R}$. Then
ϕ : I → R is a solution to the initial value problem

\[ y' + P y = 0, \qquad y(x_0) = y_0 \tag{15.12} \]

if and only if
\[ \varphi(x) = y_0 \exp\Big[ -\int_{x_0}^x P(t)\,dt \Big] \tag{15.13} \]

Proof. By differentiating it is easily verified that (15.13) solves (15.12). Conversely, if
ϕ : I → R is a solution to (15.12), set
\[ \psi(x) = \varphi(x) \exp\Big[ \int_{x_0}^x P(t)\,dt \Big] , \tag{15.14} \]

then $\psi'(x) = 0$, hence $\psi(x) = \psi(x_0) = y_0$.

One can find the solutions to the inhomogeneous equation by the method of variation
of constants: Consider the function
\[ \varphi(x) = \varphi_0(x) e^{-G(x)}, \qquad G(x) = \int_{x_0}^x P(t)\,dt \tag{15.15} \]

which is obtained from the solution to the homogeneous equation by replacing the constant
$y_0$ by a function $\varphi_0 \in C^1(I)$. We will now derive a condition for the function $\varphi_0(x)$ for
ϕ(x) to be a solution to (15.9). We compute

\[ \varphi'(x) = \varphi_0'(x) e^{-G(x)} - \varphi_0(x) P(x) e^{-G(x)} = \varphi_0'(x) e^{-G(x)} - P(x) \varphi(x) \tag{15.16} \]

hence for ϕ(x) to solve (15.9) we need:

\[ \varphi_0' = e^{G(x)} Q(x) . \tag{15.17} \]

Integrating gives a formula for $\varphi_0(x)$ in terms of the known functions P(x), and Q(x).

Theorem 15.3. Let I ⊂ R be an interval, P, Q continuous functions on the interval I.

1. Let G be a primitive of P. Then ϕ is a solution to (15.9) if and only if
\[ \varphi(x) = e^{-G(x)} H(x) \tag{15.18} \]
for some primitive H of $e^G Q$.

2. Let $x_0 \in I$, $y_0 \in \mathbb{R}$. Then ϕ : I → R is a solution to the initial value problem

\[ y' + P(x) y = Q(x), \qquad y(x_0) = y_0 \tag{15.19} \]

if and only if
\[ \varphi(x) = e^{-\int_{x_0}^x P(t)\,dt} \Big( y_0 + \int_{x_0}^x e^{\int_{x_0}^t P(\tau)\,d\tau} Q(t)\,dt \Big) . \tag{15.20} \]


Exercise 15.1. Prove the first part of the theorem. Also verify that (15.20) solves the
initial value problem (15.19).
Example 15.1. Let us find all solutions of the equation

\[ x y' + (1 - x) y = e^{2x} \tag{15.21} \]

on the interval I = (0, ∞).

This equation takes the form (15.9) with P(x) = 1/x − 1 and $Q(x) = e^{2x}/x$. We shall
express the solutions $\varphi_{y_0}$ in terms of the initial value $y_0$ at $x_0 = 1$. We first compute
\[ G(x) = \int_1^x P(t)\,dt = \log(x) - (x - 1) \tag{15.22} \]

and thus $e^{-G(x)} = e^{x-1}/x$ and $e^{G(x)} = x e^{-(x-1)}$, and so

\[ \varphi(x; y_0) = \frac{e^{x-1}}{x} \Big( y_0 + \int_1^x e^{-(t-1)} e^{2t}\,dt \Big) = \frac{e^x}{x} \Big( y_0 e^{-1} + e^x - e^1 \Big) \tag{15.23} \]
Note that these solutions are unbounded as x tends to 0, unless $y_0/e + 1 - e = 0$ in
which case $\lim_{x \to 0} \varphi(x; y_0) = 1$. Here we used that $e^x$ is well approximated by the linear
function 1 + x near x = 0.
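
The formula (15.23) can be tested numerically by inserting it into (15.21). The sketch below does this with a symmetric difference quotient for the derivative; the function names `phi` and `residual`, the initial value $y_0 = 2$, and the sample points are assumptions for illustration.

```python
import math

def phi(x, y0):
    """Solution (15.23) with phi(1) = y0, defined for x > 0."""
    return math.exp(x) / x * (y0 / math.e + math.exp(x) - math.e)

def residual(x, y0, h=1e-6):
    """x*phi' + (1 - x)*phi - e^(2x), with phi' from a central difference."""
    dphi = (phi(x + h, y0) - phi(x - h, y0)) / (2 * h)
    return x * dphi + (1 - x) * phi(x, y0) - math.exp(2 * x)

y0 = 2.0
residuals = [abs(residual(x, y0)) for x in (0.5, 1.0, 3.0)]
initial_gap = abs(phi(1.0, y0) - y0)
print(max(residuals), initial_gap)  # both should be close to zero

# The exceptional initial value with y0/e + 1 - e = 0 stays bounded near x = 0.
y0_bounded = math.e * (math.e - 1)
near_zero = phi(1e-4, y0_bounded)
print(near_zero)  # close to the limit 1
```

For any other initial value the factor in parentheses does not vanish at x = 0, and the prefactor $e^x/x$ makes the solution blow up there.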

Problems
1. Solve the following initial value problems:
a) $y' - 3y = e^{2x}$ with y(0) = 0
b) $y' + y = e^{2x}$ with y(0) = 1

2. Find all solutions of $y' \sin(x) + y \cos(x) = 1$ on the interval (0, π). Prove that exactly
one of these solutions has a finite limit as x → 0, and another has a finite limit as
x → π.

3. We can view the following as a linear inhomogeneous differential equation:

\[ y' = x + y \tag{15.24} \]

What are its solutions?

Note 16.
Separable differential equations
In this lecture we consider a somewhat more challenging class of differential equations:
These are equations as in Definition 15.1 where the function f on the right hand side is
in fact a product of a function of x and a function of y.
More precisely, a separable differential equation is an equation of the form

\[ y' = f(x) g(y) \tag{16.1} \]

where f, and g are continuous functions on intervals I, and J respectively, and the
corresponding initial value problem is

\[ y' = f(x) g(y) , \qquad y(x_0) = y_0 , \tag{16.2} \]

for any values $x_0 \in I$, and $y_0 \in J$.

Let us first make the assumption

\[ g(y) \neq 0 \qquad (y \in J) \tag{16.3} \]

and illustrate the behaviour that may occur if g has a zero with an example later.
We have not proven yet that a solution to (16.2) exists, but we will obtain such a proof
by first assuming there is a solution, and deriving an explicit formula for the solution.
Indeed, suppose ϕ is a solution to the initial value problem (16.2), then in view of the
assumption (16.3), we obtain
\[ \int_{x_0}^x \frac{\varphi'(t)}{g(\varphi(t))}\,dt = \int_{x_0}^x f(t)\,dt \tag{16.4} \]

We can apply the substitution rule of Theorem 14.2 to write the left hand side as
\[ \int_{x_0}^x \frac{\varphi'(t)}{g(\varphi(t))}\,dt = \int_{y_0}^{\varphi(x)} \frac{du}{g(u)} \tag{16.5} \]
where we have used that $\varphi(x_0) = y_0$. Introducing the notation
\[ G(y) = \int_{y_0}^y \frac{du}{g(u)} , \qquad F(x) = \int_{x_0}^x f(t)\,dt , \tag{16.6} \]

we can write the relation (16.4) as

\[ G(\varphi(x)) = F(x) . \tag{16.7} \]


The function G is continuously differentiable and strictly monotone on J because g is
continuous and $g \neq 0$ on J. Hence G has an inverse, $G^{-1}$, and we find

\[ \varphi(x) = (G^{-1} \circ F)(x) \tag{16.8} \]

We have derived here a necessary condition for the solution: If there is a solution ϕ
to the initial value problem then it must be of this form. Conversely, we can now argue as
promised that given f, and g, we can define the functions F, and G as in (16.6) and
verify that with ϕ defined by (16.8),
\[ \varphi'(x) = \frac{1}{G'((G^{-1} \circ F)(x))} F'(x) = g(\varphi(x)) f(x) \tag{16.9} \]
and $\varphi(x_0) = G^{-1}(0) = y_0$.
In summary, we have proven that the solution to the initial value problem (16.2) is
precisely given by (16.8). In the same way one proves:
Theorem 16.1. Suppose f, and g are continuous functions on intervals I, and J,
respectively and $g \neq 0$ on J. Then ϕ is a solution to (16.1) if and only if

\[ \varphi(x) = (G^{-1} \circ F)(x) \qquad (x \in \tilde{I}) \tag{16.10} \]

where F and G are primitives of f, and 1/g respectively, and $\tilde{I}$ an interval.


Example 16.1. Let us consider the differential equation
\[ y' = \frac{x}{y} \tag{16.11} \]
on the upper half plane y > 0. We sketch the direction field as in Figure 16.1.
With the above notation f(x) = x, and g(y) = 1/y > 0, and $F(x) = x^2/2$ and
$G(y) = y^2/2$ are primitives of f, and 1/g, respectively. For y > 0, G is indeed invertible,
and the inverse is given by $G^{-1}(t) = \sqrt{2t}$. Thus the solutions are
\[ \varphi(x) = \sqrt{2 (x^2/2 + C)} = \sqrt{x^2 + 2C} \tag{16.12} \]

for some constant C, which are defined on
\[ \tilde{I} = \begin{cases}
(-\infty, \infty) & \text{if } C > 0 \\
(-\infty, 0) \text{ or } (0, \infty) & \text{if } C = 0 \\
(-\infty, -\sqrt{-2C}) \text{ or } (\sqrt{-2C}, \infty) & \text{if } C < 0 .
\end{cases} \tag{16.13} \]

Note that for all solutions
\[ \lim_{x \to \pm\infty} \frac{\varphi(x)}{x} = \pm 1 , \tag{16.14} \]
which confirms the qualitative behaviour inferred from the direction field. We may sketch
the solutions and see the three types of integral curves depending on whether C < 0,
C = 0, and C > 0.
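
By (16.12) the quantity $\varphi(x)^2 - x^2 = 2C$ stays constant along every solution. This can be observed on a numerically computed solution; the sketch below uses the classical fourth order Runge-Kutta scheme, which is a standard numerical method and not part of these notes. The function name `rk4`, the starting point, and the step count are assumptions for illustration.

```python
def rk4(f, x0, y0, x1, steps=1000):
    """Integrate y' = f(x, y) from x0 to x1 with the classical Runge-Kutta scheme; returns y(x1)."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return y

x0, y0, x1 = 0.5, 1.0, 3.0  # arbitrary starting point in the upper half plane
y1 = rk4(lambda x, y: x / y, x0, y0, x1)

# y^2 - x^2 should be the same at both ends of the computed integral curve.
drift = abs((y1 ** 2 - x1 ** 2) - (y0 ** 2 - x0 ** 2))
print(y1, drift)
```

The printed drift should be tiny, which is consistent with the integral curves being the hyperbolas $y^2 - x^2 = 2C$.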
Exercise 16.1. Solve the initial value problem (16.2) for (16.11).


Figure 16.1.: Direction field of the equation y 0 = x/y.

Problems
1. Determine the solutions to the differential equation

\[ y' = x + y . \tag{16.15} \]

Hint: This is not a separable equation, but it can be reduced to this case, by
considering the equation satisfied by

\[ z(x) = x + y(x) . \tag{16.16} \]

2. Consider a differential equation of the form

\[ y' = f(y/x) \tag{16.17} \]

Show that if ϕ is a solution to (16.17), then
\[ z(x) = \frac{\varphi(x)}{x} \tag{16.18} \]
is a solution to
\[ z' = \frac{f(z) - z}{x} . \tag{16.19} \]
Conversely, show that if ψ is a solution to (16.19), then ϕ(x) = xψ(x) is a solution
to (16.17).

Note 17.
Examples of first order equations
17.1. Examples of separable equations
We have discussed separable first order equations in general. Let us look at one more
example, and others will be discussed in the tutorials.
Example 17.1. Consider the non-linear equation $x y' + y = y^2$. By inspection we see that
y = 0, and y = 1 are solutions. The remaining solutions, when $y(y - 1) \neq 0$, and $x \neq 0$,
satisfy
\[ \frac{y'(x)}{y(y - 1)} = \frac{1}{x} \tag{17.1} \]
That means that a solution y = ϕ(x) satisfies
\[ \int \frac{\varphi'(x)}{\varphi(x)(\varphi(x) - 1)}\,dx = \int \frac{dx}{x} . \tag{17.2} \]
We can use partial fractions to rewrite the integrand
\[ \frac{1}{y(y - 1)} = \frac{1}{y - 1} - \frac{1}{y} \tag{17.3} \]
so by the substitution rule,
\[ \int \frac{\varphi'(x)}{\varphi(x)(\varphi(x) - 1)}\,dx = \int \frac{dy}{y(y - 1)} = \ln|y - 1| - \ln|y| \tag{17.4} \]
with y = ϕ(x), and on the right hand side $\int dx/x = \ln|x|$. Thus for some constant C,
\[ \ln\left| \frac{y - 1}{y} \right| = \ln|x| + C , \qquad y = \varphi(x) \tag{17.5} \]
which gives $|(y - 1)/y| = e^C |x|$, or (y − 1)/y = Kx for some constant K. We have
\[ \frac{y - 1}{y} - 1 = -\frac{1}{y} = Kx - 1 \tag{17.6} \]
which finally gives the formula for the solutions
\[ \varphi_K(x) = \frac{1}{1 - Kx} \qquad (x \in I_K) \tag{17.7} \]
where $I_K = (-\infty, 1/K)$, or $I_K = (1/K, \infty)$ depending on the parameter K ∈ R, $K \neq 0$.
Note that K = 0 corresponds to the solution y = 1.


Exercise 17.1. Find the solution passing through any given point (x0 , y0 ). Sketch all
solutions to the differential equation (17.1). Are there points with several solutions passing
through them?

17.2. Reductions to separable equations


The following two types of equations can be transformed to separable equations.
The first type are equations of the form

\[ y' = f(ax + by + c) \tag{17.8} \]

Here the direction field is constant on straight lines, and we can pass from y(x) to the
new unknown
\[ u(x) = ax + by(x) + c . \tag{17.9} \]
Then u satisfies
\[ u' = a + b y' = a + b f(u) \tag{17.10} \]
which is an equation of the form $u' = g(u)$. Conversely, any solution to (17.10) gives rise
to a solution of the original equation (17.8) using the relation (17.9).
Exercise 17.2. Verify this!
Example 17.2.
\[ y' = (x + y)^2 \tag{17.11} \]
We find that u(x) = x + y(x) satisfies $u' = 1 + u^2$, hence

\[ \arctan(u(x)) = x + C \tag{17.12} \]

for some constant C. The solutions of (17.11) are thus given by

\[ y(x) = \varphi_C(x) = \tan(x + C) - x . \tag{17.13} \]

The second type are equations of the form
\[ y' = f\left( \frac{y}{x} \right) . \tag{17.14} \]
Here the transformation u(x) = y(x)/x yields

\[ y' = u + x u' = f(u) , \tag{17.15} \]

which is a separable equation for the new unknown u,
\[ u' = \frac{f(u) - u}{x} . \tag{17.16} \]
Exercise 17.3. Verify that any solution to (17.16) yields a solution to (17.14).
Equations of this second type are examples of differential equations which are homogeneous
of degree zero (see the additional notes to Module VI).


17.3. Loss of uniqueness


In our treatment of separable equations in Note 16, we have assumed that

\[ y' = f(x) g(y) \qquad \text{where } g(y) \neq 0 \tag{17.17} \]

Let us consider an example where g(y) does have a zero.

Consider the differential equation
\[ y' = \sqrt{|y|} , \tag{17.18} \]

which corresponds to the case f(x) = 1, and $g(y) = \sqrt{|y|}$.
Exercise 17.4. Sketch the direction field.
Consider first the region y > 0. Here the procedure discussed in Note 16 applies with
$g(y) = \sqrt{y} > 0$. We find F(x) = x and $G(y) = 2\sqrt{y}$ are primitives, and thus the solutions
in the upper half plane are
\[ \varphi(x) = \frac{1}{4} (x + C)^2 , \qquad x \in (-C, \infty) . \tag{17.19} \]
Exercise 17.5. Show that
\[ \varphi(x) = -\frac{1}{4} (-x + C)^2 , \qquad x \in (-\infty, C) \tag{17.20} \]
are the solutions in the lower half plane y < 0.
Note that the solutions in the lower half plane all arrive on the y = 0 axis and their
graphs have vanishing slope there. Similarly the solutions in the upper half plane begin
on the y = 0 axis and have vanishing slope there. Furthermore ϕ(x) = 0 is a solution to
(17.18). This means that the solutions to (17.18) lose their uniqueness property precisely
at the points where g(y) = 0.
Exercise 17.6. Write down an explicit family of solutions to (17.18) whose graphs pass
through the point (a, 0), a ∈ R.
Remark 17.1. Finally note that the function g(y) is continuous but not differentiable. It
turns out that uniqueness holds for differential equations $y' = g(y)$ even at points where
g(y) = 0 provided g is differentiable.
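
The loss of uniqueness can be made concrete by comparing the zero solution with the solution (17.19) for C = 0: both satisfy (17.18) on x > 0, and both tend to 0 as x tends to 0. The following sketch checks this with difference quotients; the function names and sample points are assumptions for illustration.

```python
import math

def rhs(y):
    """Right hand side of (17.18)."""
    return math.sqrt(abs(y))

def zero_solution(x):
    return 0.0

def parabola_solution(x):
    """Solution (17.19) with C = 0, valid for x > 0."""
    return 0.25 * x ** 2

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

points = [0.1, 0.5, 1.0, 2.0]
gaps_zero = [abs(central_difference(zero_solution, x) - rhs(zero_solution(x))) for x in points]
gaps_parabola = [abs(central_difference(parabola_solution, x) - rhs(parabola_solution(x))) for x in points]
print(max(gaps_zero), max(gaps_parabola))  # both solve the equation on x > 0

# Both solutions take values arbitrarily close to 0 near x = 0.
print(zero_solution(1e-9), parabola_solution(1e-9))
```

Two different integral curves therefore approach the same point (0, 0) on the axis y = 0, which is exactly where g vanishes.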

Problems
1. Find formulas for the solutions of the following differential equations.
a) $(x + 1) y' + y^2 = 0$
b) $y' = (y - 1)(y - 2)$
c) $(x - 1) y' = xy$


2. Solve the following initial value problem:
\[ y' = \frac{e^{-y^2}}{y (2x + x^2)} , \qquad y(2) = 0 . \tag{17.21} \]

3. Find all solutions to the following differential equation:

\[ y' = (x - y + 3)^3 \tag{17.22} \]

4. Integrate the following differential equations:

a) $y' = -x/y$
b) $y' = 1 + y/x$
c) $y' = \dfrac{x^2 + 2y^2}{xy}$

Additional: Isoclines and Homogeneity
Further Reading

(Apostol, Calculus I, Chapter 8, Section 20-21, 23, 25–27)

Differential Equations (MAST20030)

Definition 17.1. A function f(x, y) is homogeneous of degree λ, if
\[ f(tx, ty) = t^\lambda f(x, y) \qquad (t > 0) . \tag{17.1} \]
This type of function often appears in differential equations.
Exercise 17.1. An example with f(x, y) homogeneous of degree 2 is
\[ y' = x^2 + y^2 . \tag{17.2} \]
Sketch the direction field, and determine qualitatively the solution passing through the
origin (0, 0).
In the construction of a direction field it is often useful to locate first those points at
which the slope $y'$ has a constant value c.
Definition 17.2. A level set of f(x, y) is called an isocline of the differential equation
$y' = f(x, y)$.
Exercise 17.2. The equation
\[ y' = x + y \tag{17.3} \]
is an example of a differential equation with f(x, y) homogeneous of degree 1. Show that
the isoclines of the differential equation form a one-parameter family of straight lines.
Plot the isoclines corresponding to the constant slopes 0, ±1/2, ±1, ±3/2, ±2. With
the aid of the isoclines construct a direction field and sketch the integral curve passing
through the origin. One of the integral curves is also an isocline; find this curve.
In the context of separable equations, the case of functions which are homogeneous of
degree 0 is distinguished:
Consider
\[ y' = f(x, y) \tag{17.4} \]
with the property that
\[ f(tx, ty) = f(x, y) \qquad (t > 0) , \tag{17.5} \]
namely the right hand side is invariant under rescalings of the variables (x, y). An equation
with this property is said to be homogeneous of degree zero.


Example 17.1. The following equations are homogeneous of degree zero:

\[ y' = \frac{y - x}{y + x} \tag{17.6} \]
\[ y' = \left( \frac{x^2 + y^2}{xy} \right)^3 . \tag{17.7} \]

All of these equations can be reduced to separable equations: Indeed, any first-order equation with the property
(17.5) can be written as
\[ y' = f(1, y/x) . \tag{17.8} \]

Exercise 17.3. Explain why.

Thus introducing a new unknown v = y/x, we have y = vx, $y' = v'x + v$, and thus
(17.8) becomes
\[ x v' = f(1, v) - v \tag{17.9} \]

which is separable, and can be solved as in Note 16.

Exercise 17.4. Show that also the substitution v = x/y transforms a differential equation
which is homogeneous of degree zero into a first-order equation for v which is separable.
Example 17.2. Let us solve the differential equation (17.6). As indicated above, the
equation transforms to

\[ x v' = \frac{v - 1}{v + 1} - v = -\frac{v^2 + 1}{v + 1} , \qquad v = y/x , \tag{17.10} \]

which we may integrate to get

\[ \int \frac{v\,dv}{v^2 + 1} + \int \frac{dv}{v^2 + 1} = -\int \frac{dx}{x} \tag{17.11} \]
\[ \frac{1}{2} \ln(v^2 + 1) + \arctan(v) = -\ln|x| + C \tag{17.12} \]

which shows that for every solution y = ϕ(x) there is a constant C such that

\[ \frac{1}{2} \ln\big( \varphi(x)^2 + x^2 \big) + \arctan\left( \frac{\varphi(x)}{x} \right) = C . \tag{17.13} \]

We have seen examples of differential equations (17.8) which are homogeneous of degree
zero in the sense of (17.5). Let us explore some of their properties in greater generality.
Exercise 17.5. Show that straight lines through the origin are isoclines of the differential
equations which are homogeneous of degree zero.
Exercise 17.6. Demonstrate this property for the equation $y' = -2y/x$. Sketch the
isoclines and the direction field.


Exercise 17.7. Given that all straight lines through the origin are isoclines, and the slope of the direction
field is unchanged along an isocline, we may guess that the integral curves are similar, in
the sense that if
\[ G = \{ (x, \varphi(x)) : x \in I \} \tag{17.14} \]
is the graph of a solution, then so is

\[ kG = \{ (kx, k\varphi(x)) : x \in I \} \tag{17.15} \]

for any k > 0. (This transformation is a concentric scaling, or similarity transformation.)
Prove that this is indeed true.
Example 17.3. An example of a differential equation where the geometric property of the
previous exercise is quite obvious is

\[ y' = -x/y \tag{17.16} \]

whose integral curves are concentric circles given by $x^2 + y^2 = C$ for some constant C > 0.

Note 18.
Linear differential equations of second order
with constant coefficients
The equation
\[ y'' = -k^2 y \tag{18.1} \]
describes the oscillations of a particle on a line around the origin $y = 0$, when a force
proportional to its displacement pulls it back to its equilibrium position at $y = 0$.¹
This is an example of a second order equation, and could be viewed as a system of
first order equations: Introducing the unknowns $y_1 = y$ and $y_2 = y'$ we could rewrite this
equation as
\[ y_1' = y_2, \qquad y_2' = -k^2 y_1. \tag{18.2} \]
We can think of $\vec{y} = (y_1, y_2)$ as a point in the plane, and view the solution $\vec{\phi}(t)$ as a point
moving through 2-dimensional space $\mathbb{R}^2$ with velocity $\vec{v}$:
\[ \vec{\phi}'(t) = \vec{v}(\vec{\phi}(t)) \tag{18.3} \]
Here $\vec{v}(y_1, y_2) = (y_2, -k^2 y_1)$ is a vectorfield.
Exercise 18.1. A vectorfield in $\mathbb{R}^2$, in comparison to a direction field which has a line
attached to each point, has in addition to the slope also a magnitude at each point.
In fact, with $\vec{v}(y_1, y_2) = (v_1(y_1, y_2), v_2(y_1, y_2))$ the magnitude at the point $(y_1, y_2)$ is
$|\vec{v}| = \sqrt{v_1^2 + v_2^2}$. Sketch the vectorfield $\vec{v}$ of (18.2).
Exercise 18.2. Show that in the case $k = 1$ concentric circles are solutions to (18.2). The
fact that $y_1^2/2 + y_2^2/2$ is constant along the solution curve can be interpreted as the law
of conservation of energy. What is the situation if $k \neq 1$?
While recasting a second order equation as a first order system is a very fruitful point
of view, we will not adopt this approach here, and study directly the linear differential
equation of second order with constant coefficients:
\[ y'' + ay' + by = 0 \tag{18.4} \]
More generally, a linear differential equation of second order is an expression
of the form
\[ y'' + g(x)y' + h(x)y = r(x) \tag{18.5} \]
In the case that r(x) vanishes identically we say the equation is homogeneous, otherwise
inhomogeneous.
¹ For the given equation, this relationship is linear, and is then also referred to as Hooke’s law in physics.


18.1. Existence of solutions by inspection


We consider first the equation (18.4) in the case that a = 0.

The equation $y'' = 0$. In this special case, when both coefficients $a = b = 0$, we see
immediately by integration that any solution of this equation has constant
derivative $y' = c_1$, and so integrating again

\[ y(x) = c_1 x + c_2 \tag{18.6} \]

with constant c2 . Conversely, for any numbers c1 , c2 ∈ R, the function given by (18.6) is
a solution, so we have found all solutions in this case.

The equation $y'' + by = 0$, when $b < 0$. Since $b < 0$, we can write $b = -k^2$ for some
$k > 0$, and the differential equation takes the form

\[ y'' = k^2 y. \tag{18.7} \]

We immediately verify that $y(x) = e^{kx}$ is a solution, and another is $y(x) = e^{-kx}$.
Therefore also linear combinations of these are solutions, and we conclude: For any
constants $c_1, c_2 \in \mathbb{R}$
\[ y(x) = c_1 e^{kx} + c_2 e^{-kx} \tag{18.8} \]
is a solution. We will prove below that these are in fact all solutions in this case.

The equation $y'' + by = 0$, when $b > 0$. In this case let us again write $b = k^2$ with $k > 0$, then the
equation (18.4) takes the form
\[ y''(x) = -k^2 y(x). \tag{18.9} \]
One may recognize that this relation is satisfied by the function $y(x) = \cos(kx)$, and
also $y(x) = \sin(kx)$. We find again a general solution by forming a linear combination:

\[ y(x) = c_1 \cos(kx) + c_2 \sin(kx) \tag{18.10} \]

This does not show that any solution of (18.9) is of this form, but we have actually
already given a proof of that in Note 12 in the case k = 1.
Exercise 18.3. Can you generalise the argument we have given in Lemma 12.2 to show
that any solution to (18.9) is of the form (18.10)?

The case $a \neq 0$. The cases considered above actually cover the case $a \neq 0$ as well, in
the sense that we can reduce the problem (18.4) to the problem with a = 0.
The idea is to consider solutions of (18.4) of the form

y(x) = u(x)v(x) . (18.11)


A simple calculation gives that

\[ y'' + ay' + by = (v'' + av' + bv)\,u + (2v' + av)\,u' + v u'' \tag{18.12} \]

and so by choosing $v$ such that $v' = -av/2$, which can evidently be arranged by
choosing $v(x) = e^{-ax/2}$, the coefficient of the $u'$ term disappears, and with this choice
the equation reduces to
\[ y'' + ay' + by = \Bigl( u'' + \frac{4b - a^2}{4}\, u \Bigr) v. \tag{18.13} \]
Since $v(x) = e^{-ax/2} \neq 0$, we have shown:

Proposition 18.1. Let y(x) = u(x) exp(−ax/2). Then the function y satisfies (18.4) if
and only if the function u satisfies the differential equation

\[ u'' + \frac{4b - a^2}{4}\, u = 0 \tag{18.14} \]
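Proposition 18.1 can be probed numerically. The sketch below picks the illustrative values a = 2, b = 5, for which (4b − a²)/4 = 4 and u(x) = cos(2x) solves (18.14), and then evaluates y'' + ay' + by for y = u·exp(−ax/2) by central differences. The helper name `residual`, the step size h and the sample points are assumptions made for this example only.

```python
import math

def residual(y, a, b, x, h=1e-4):
    # Central-difference approximation of y'' + a*y' + b*y at the point x.
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + a * d1 + b * y(x)

a, b = 2.0, 5.0
omega = math.sqrt((4 * b - a * a) / 4)      # here omega = 2

def u(x):
    # A solution of u'' + omega^2 u = 0, i.e. of (18.14) for these a, b.
    return math.cos(omega * x)

def y(x):
    # The function of Proposition 18.1: y = u * exp(-a x / 2).
    return u(x) * math.exp(-a * x / 2)

max_res = max(abs(residual(y, a, b, k / 10)) for k in range(-20, 21))
print("largest residual of y'' + a y' + b y on [-2, 2]:", max_res)
```

The residual is not exactly zero because of the finite-difference approximation, but it should be small compared with the size of y itself.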

18.2. General form of the solutions


Theorem 18.2. Let b ∈ R. Every solution ϕ(x) of the equation

\[ y'' + by = 0 \tag{18.15} \]

is a linear combination of two functions $f_1$, $f_2$, where

• when $b = 0$: $f_1(x) = 1$, $f_2(x) = x$

• when $b < 0$: set $b = -k^2$ with $k > 0$, and $f_1(x) = e^{kx}$, $f_2(x) = e^{-kx}$

• when $b > 0$: set $b = k^2$ with $k > 0$, and $f_1(x) = \cos(kx)$, $f_2(x) = \sin(kx)$.

In other words, for every solution ϕ(x) of the equation (18.15) there are constants
c1 , c2 ∈ R so that
ϕ(x) = c1 f1 (x) + c2 f2 (x) . (18.16)

Proof. Let $\phi(x)$ be a solution to (18.15), and let $f_1(x)$ and $f_2(x)$ be chosen as in the
statement of the theorem, depending on the value of $b$. Then for any constants $c_1$, $c_2$ also

\[ \psi(x) = \phi(x) - c_1 f_1(x) - c_2 f_2(x) \tag{18.17} \]

is a solution to (18.15).
Let us choose the constants $c_1$ and $c_2$ so that $\psi(0) = 0$ and $\psi'(0) = 0$. This amounts
to solving the equations:

\[ \phi(0) = c_1 f_1(0) + c_2 f_2(0) \tag{18.18a} \]
\[ \phi'(0) = c_1 f_1'(0) + c_2 f_2'(0) \tag{18.18b} \]


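In the case $b = 0$, these equations reduce to

\[ \phi(0) = c_1, \qquad \phi'(0) = c_2. \tag{18.19} \]

Now in the case $b < 0$, these two equations read:

\[ \phi(0) = c_1 + c_2, \qquad \phi'(0) = c_1 k - c_2 k \]

which we can solve to find

\[ c_1 = \frac{\phi(0) + \phi'(0)/k}{2}, \qquad c_2 = \frac{\phi(0) - \phi'(0)/k}{2}. \]

Similarly in the case $b > 0$, where the equations read $\phi(0) = c_1$ and $\phi'(0) = c_2 k$, so that $c_1 = \phi(0)$ and $c_2 = \phi'(0)/k$.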


Thus it remains to show that the only solution to the initial value problem

\[ y'' + by = 0, \qquad y(0) = 0, \qquad y'(0) = 0 \tag{18.20} \]

is the trivial solution y = 0.


In the case b = 1 we have already demonstrated this in Note 12; see Lemma 12.2.
Exercise 18.4. Generalize this argument to the case b > 0.
In the case b = 0 this follows from integrating twice.
Exercise 18.5. Show that if $f$ is twice differentiable and satisfies $f''(x) = 0$, and moreover
$f(0) = f'(0) = 0$, then $f(x) = 0$.
The case b < 0 is covered by the more general considerations below; (see the additional
notes to Module VI).
Exercise 18.6. Read the additional material on uniqueness of solutions to the initial value
problem and give a proof of the above statement that the only solution to (18.20) is the
trivial solution, in the special case that $b = -k^2$.
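The choice of constants made in the proof of Theorem 18.2 can be collected in a small Python sketch. Given b and the initial data ϕ(0), ϕ'(0), it returns the linear combination c1 f1 + c2 f2 with the constants from (18.18a), (18.18b), solved separately in the three cases. The function name `solution` and its argument names are hypothetical, and the b > 0 branch uses c1 = ϕ(0), c2 = ϕ'(0)/k, which follows from f1 = cos(kx), f2 = sin(kx).

```python
import math

def solution(b, y0, y1):
    """Return phi with phi'' + b*phi = 0, phi(0) = y0, phi'(0) = y1."""
    if b == 0:
        # f1 = 1, f2 = x, so c1 = y0 and c2 = y1.
        return lambda x: y0 + y1 * x
    k = math.sqrt(abs(b))
    if b < 0:
        # f1 = exp(kx), f2 = exp(-kx).
        c1 = (y0 + y1 / k) / 2
        c2 = (y0 - y1 / k) / 2
        return lambda x: c1 * math.exp(k * x) + c2 * math.exp(-k * x)
    # b > 0: f1 = cos(kx), f2 = sin(kx), so c1 = y0 and c2 = y1 / k.
    return lambda x: y0 * math.cos(k * x) + (y1 / k) * math.sin(k * x)

phi = solution(-1.0, 1.0, 0.0)   # b = -1 with data (1, 0)
print(phi(0.5), math.cosh(0.5))  # the two values should agree
```

For b = −1 and initial data (1, 0) the returned function is (e^x + e^{−x})/2, which is why the comparison with `math.cosh` is a reasonable spot check.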

Exercise 18.7. In view of Proposition 18.1 it is now also possible to state a theorem that
gives a characterisation of all solutions to (18.4), for any values of a, b ∈ R, depending
on the sign of the so-called discriminant of the differential equation (18.4):

\[ \Delta = a^2 - 4b. \tag{18.21} \]

State it!


Problems
1. Find explicitly the solution to the following initial value problems:
a) $y'' + ky = 0$, $y(0) = 0$, $y'(0) = y_1$, where $k > 0$ and $y_1 \in \mathbb{R}$ are fixed constants.
b) $y'' + ay' = 0$, $y(0) = 1$, $y'(0) = 0$, where $a \in \mathbb{R}$ is fixed.

2. Show that the solution (18.16) in the case $b > 0$ can also be written as

\[ \phi(x) = c \sin(kx + \delta) \tag{18.22} \]

where $c = \sqrt{c_1^2 + c_2^2}$, and $\delta$ is to be determined.

3. Find all solutions to the following differential equations:
a) $y'' + 4y' = 0$
b) $y'' + 2y' + y = 0$

4. Find all values of the constant $k$ so that $y'' + ky = 0$ has a non-trivial solution
$y = \phi_k(x)$ for which $\phi_k(0) = \phi_k(1) = 0$.

Additional: Uniqueness of solutions to the
initial value problem
We will give a proof of Theorem 18.2 above, in a way that applies to all cases, and
already outlines an approach to the general second order differential equation (18.5) (not
necessarily with constant coefficients).
In this section we will prove that there exists one and only one solution to the following

Initial value problem. Let $a, b \in \mathbb{R}$, and $x_0, y_0, y_1 \in \mathbb{R}$. The initial value problem for
(18.4) is to find a solution $\phi \in C^2(\mathbb{R})$ to the problem

\[ y'' + ay' + by = 0, \tag{18.1a} \]
\[ y(x_0) = y_0, \qquad y'(x_0) = y_1. \tag{18.1b} \]

In particular, we want to prove that the only solution to this initial value problem
with x0 = y0 = y1 = 0 is y = 0.
We begin with the following observation:

Proposition 18.1 (Wronskian determinant). Let $a, b \in \mathbb{R}$ and $\phi_1, \phi_2 \in C^2(\mathbb{R})$ be any
two solutions to the differential equation (18.1a). Let the Wronskian be defined as the
determinant
\[ W(\phi_1, \phi_2) := \begin{vmatrix} \phi_1 & \phi_2 \\ \phi_1' & \phi_2' \end{vmatrix} = \phi_1 \phi_2' - \phi_1' \phi_2 \tag{18.2} \]
then for some $c \in \mathbb{R}$ it holds:

\[ W(\phi_1, \phi_2)(x) = c\, e^{-ax}. \tag{18.3} \]

Note this means in particular that either $W(\phi_1, \phi_2)(x) = 0$ for all $x \in \mathbb{R}$ or for none.

Proof. Let us write $W(x) = W(\phi_1, \phi_2)(x)$ for short and note that $W \in C^1(\mathbb{R})$
because $\phi_1, \phi_2 \in C^2(\mathbb{R})$. We have

\[ W'(x) = \phi_1 \phi_2'' - \phi_1'' \phi_2 = \phi_1(-a\phi_2' - b\phi_2) - (-a\phi_1' - b\phi_1)\phi_2 = -a W(x) \tag{18.4} \]

therefore by Proposition 15.1,

\[ W(x) = W(0)\, e^{-ax}. \tag{18.5} \]
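The formula W(x) = W(0)e^{−ax} can be checked on a concrete pair of solutions. The sketch below uses the illustrative coefficients a = 3, b = 2, for which e^{−x} and e^{−2x} solve y'' + ay' + by = 0 (the characteristic roots are −1 and −2), and compares the Wronskian computed from its definition with the prediction of the proposition. All names and sample points are assumptions for this example.

```python
import math

a, b = 3.0, 2.0   # y'' + 3y' + 2y = 0 has the solutions exp(-x), exp(-2x)

def phi1(x):
    return math.exp(-x)

def dphi1(x):
    return -math.exp(-x)

def phi2(x):
    return math.exp(-2 * x)

def dphi2(x):
    return -2 * math.exp(-2 * x)

def wronskian(x):
    # Definition of the Wronskian: phi1 * phi2' - phi1' * phi2.
    return phi1(x) * dphi2(x) - dphi1(x) * phi2(x)

def predicted(x):
    # Prediction of the proposition: W(x) = W(0) * exp(-a x).
    return wronskian(0.0) * math.exp(-a * x)

for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, wronskian(x), predicted(x))
```

Here W(0) = −1, so the Wronskian never vanishes, in line with the remark that it vanishes either everywhere or nowhere.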


Exercise 18.1. With f1 , and f2 as defined in Theorem 18.2, compute W (f1 , f2 ) in all
cases $b > 0$, $b = 0$, $b < 0$. In particular, verify that $W(f_1, f_2) \neq 0$ in all cases!

Theorem 18.2 (Uniqueness of solutions). Let $b \in \mathbb{R}$, and $x_0, y_0, y_1 \in \mathbb{R}$, and suppose
$\phi_1, \phi_2 \in C^2(\mathbb{R})$ are two solutions to the initial value problem

\[ y'' + by = 0, \qquad y(x_0) = y_0, \qquad y'(x_0) = y_1. \tag{18.6} \]

Then
\[ \phi_1(x) = \phi_2(x) \qquad (x \in \mathbb{R}). \tag{18.7} \]

Proof. The difference $\phi(x) = \phi_1(x) - \phi_2(x)$ is a solution to the same equation with trivial
initial values, namely $\phi(x_0) = \phi'(x_0) = 0$. Assume that $\phi(x_1) \neq 0$ for some $x_1 \in \mathbb{R}$.
The idea is now to write down a solution to the initial value problem

\[ y'' + by = 0, \qquad y(x_1) = 0, \qquad y'(x_1) = \phi(x_1). \tag{18.8} \]

We can do this as follows: We know that

ψ(x) = c1 f1 (x) + c2 f2 (x) (18.9)

is a solution, and we can arrange for the initial conditions by choosing

\[ c_1 f_1(x_1) + c_2 f_2(x_1) = 0 \tag{18.10} \]
\[ c_1 f_1'(x_1) + c_2 f_2'(x_1) = \phi(x_1) \tag{18.11} \]

This system is solvable for $c_1$ and $c_2$, because

\[ \det \begin{pmatrix} f_1(x_1) & f_2(x_1) \\ f_1'(x_1) & f_2'(x_1) \end{pmatrix} = W(f_1, f_2)(x_1) \neq 0 \tag{18.12} \]

In fact, with $W(x_1) = W(f_1, f_2)(x_1)$, we obtain
\[ c_1 = -\frac{\phi(x_1) f_2(x_1)}{W(x_1)}, \qquad c_2 = \frac{\phi(x_1) f_1(x_1)}{W(x_1)}. \tag{18.13} \]
Since $\phi$ and $\psi$ are solutions to $y'' + by = 0$ we know from Proposition 18.1 that the
Wronskian determinant either vanishes for all $x \in \mathbb{R}$ or for none. However,

\[ W(\phi, \psi)(x_0) = \phi(x_0)\psi'(x_0) - \phi'(x_0)\psi(x_0) = 0 \]
\[ W(\phi, \psi)(x_1) = \phi(x_1)\psi'(x_1) - \phi'(x_1)\psi(x_1) = \phi(x_1)^2 > 0, \]

which is a contradiction, hence $\phi(x) = \phi_1(x) - \phi_2(x) = 0$ for all $x \in \mathbb{R}$.

Remark 18.1. In the proof we have made the following important observation: In order
to find the solution to the initial value problem (18.1a), (18.1b) it suffices to find any two
solutions of (18.1a) whose Wronskian determinant does not vanish at a point. Indeed, if
$\phi_1, \phi_2 \in C^2(\mathbb{R})$ are two solutions to (18.1a) such that $W(x_1) \neq 0$ for some $x_1 \in \mathbb{R}$, where
$W(x) = W(\phi_1, \phi_2)(x)$, then $W(x) \neq 0$ for all $x \in \mathbb{R}$ by virtue of Proposition 18.1, and
we can set for any $x_0, y_0, y_1 \in \mathbb{R}$,
\[ c_1 = \frac{1}{W(x_0)} \bigl( y_0 \phi_2'(x_0) - y_1 \phi_2(x_0) \bigr) \tag{18.14a} \]
\[ c_2 = \frac{1}{W(x_0)} \bigl( -y_0 \phi_1'(x_0) + y_1 \phi_1(x_0) \bigr). \tag{18.14b} \]

Then, with this choice of constants,

\[ \phi(x) = c_1 \phi_1(x) + c_2 \phi_2(x) \tag{18.15} \]

is a solution to the initial value problem (18.1a), (18.1b).
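The recipe of Remark 18.1 translates directly into a short Python sketch: given two solutions and their derivatives, compute the Wronskian at x0 and then c1, c2 from (18.14a), (18.14b). The function name `ivp_constants` is hypothetical, and the example pair cos, sin (solutions of y'' + y = 0) together with the data x0 = 0.7, y0 = 2, y1 = −1 is chosen only for illustration.

```python
import math

def ivp_constants(phi1, dphi1, phi2, dphi2, x0, y0, y1):
    # Wronskian of the two solutions at x0; it must not vanish.
    w = phi1(x0) * dphi2(x0) - dphi1(x0) * phi2(x0)
    if w == 0:
        raise ValueError("Wronskian vanishes: the solutions are not independent")
    c1 = (y0 * dphi2(x0) - y1 * phi2(x0)) / w     # formula (18.14a)
    c2 = (-y0 * dphi1(x0) + y1 * phi1(x0)) / w    # formula (18.14b)
    return c1, c2

def neg_sin(x):
    # Derivative of cos.
    return -math.sin(x)

x0, y0, y1 = 0.7, 2.0, -1.0
c1, c2 = ivp_constants(math.cos, neg_sin, math.sin, math.cos, x0, y0, y1)

def phi(x):
    return c1 * math.cos(x) + c2 * math.sin(x)

def dphi(x):
    return -c1 * math.sin(x) + c2 * math.cos(x)

print(phi(x0), dphi(x0))   # should reproduce y0 and y1 up to rounding
```

The check only confirms the initial conditions for one example; that ϕ solves the differential equation follows from linearity.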


Exercise 18.2. Verify this claim!
Exercise 18.3. The proof of the uniqueness of solutions we have given actually equally
applies to the initial value problem (18.1a), (18.1b), when $a \neq 0$. Indeed, Proposition 18.1
is valid in that case! Find the part of the proof of Theorem 18.2 where we have actually
used that $a = 0$, and adapt it to include the case $a \neq 0$!

Problems
1. Let $\phi_1$ and $\phi_2$ be two solutions to the differential equation (18.1a), and assume $\phi_1$ is
not identically zero.
a) Prove that $W(\phi_1, \phi_2)(0) = 0$ if and only if $\phi_2/\phi_1$ is constant.
b) Suppose $\phi_2/\phi_1$ is not constant. Let $\phi$ be any solution to (18.1a). Show that
there exist constants $c_1, c_2$ such that

\[ c_1 \phi_1(0) + c_2 \phi_2(0) = \phi(0), \qquad c_1 \phi_1'(0) + c_2 \phi_2'(0) = \phi'(0). \tag{18.16} \]

Moreover, show that

ϕ(x) = c1 ϕ1 (x) + c2 ϕ2 (x) . (18.17)

Additional: The space of solutions
Further Reading

(Apostol, Calculus I, Section 8.1-8.4) and (Apostol, Calculus I, Section 8.8-8.17)

Differential Equations (MAST20030)

First order equations


One can phrase the statement of Proposition 15.2 in the language of Linear Algebra as
follows: A linear map $T : C^1(I) \to C^0(I)$ is defined by

\[ T(y) = y' + P y. \tag{18.1} \]

A continuously differentiable function $\phi$ is a solution to the homogeneous equation (15.12)
if and only if it lies in the kernel of $T$, $T(\phi) = 0$. We have seen that the kernel as a
subspace of $C^1(I)$ is one dimensional, and spanned by $e^{-G}$ where $G$ is a primitive of $P$:

\[ \ker(T) = \{ \phi \in C^1(I) : \phi = c\, e^{-G} \text{ for some } c \in \mathbb{R} \} \tag{18.2} \]

Similarly $\phi$ is a solution to the inhomogeneous equation (15.9) if and only if $T(\phi) = Q$.
According to Theorem 15.3 we have that the space of solutions is

\[ \{ \phi \in C^1(I) : \phi = \phi_{in} + C e^{-G},\ C \in \mathbb{R} \} = \phi_{in} + \ker(T) \tag{18.3} \]

where $\phi_{in}$ is a solution to the inhomogeneous problem (15.9). The set of solutions to the
inhomogeneous equation is thus an affine subspace of $C^1(I)$ of dimension 1.

Second order equations


It is useful to use the language of linear algebra to describe the set of solutions to the
homogeneous equation (18.4).

Theorem 18.1 (Space of solutions). Let $a, b \in \mathbb{R}$, and $T : C^2(\mathbb{R}) \to C^0(\mathbb{R})$ be defined
by
\[ T(f) = f'' + a f' + b f \tag{18.4} \]
Then we have


(i) T is a linear map.

(ii) For f ∈ C 2 (R), f is a solution to (18.1a) if and only if f ∈ ker(T ). In particular the
set of solutions to the homogeneous second order differential equation with constant
coefficients is a linear subspace of C 2 (R).

(iii) If $f_1, f_2 \in \ker T$ then $f_1, f_2$ are linearly independent if and only if $W(f_1, f_2)(x) \neq 0$
for some $x \in \mathbb{R}$ (or equivalently $W(f_1, f_2)(x) \neq 0$ for all $x \in \mathbb{R}$).

(iv) $\dim \ker(T) = 2$.

(v) If $f_1, f_2 \in \ker(T)$ are linearly independent, then we say $\{f_1, f_2\}$ is a fundamental
system for the differential equation (18.1a), and

\[ \ker(T) = \{ f \in C^2(\mathbb{R}) : \text{there exist constants } c_1, c_2 \in \mathbb{R} \text{ such that } f = c_1 f_1 + c_2 f_2 \} \tag{18.5} \]

is the space of solutions to (18.1a) defined on $\mathbb{R}$.

We omit most of the proof except for (iii,iv) which are instructive.

Proof of (iii). If $f_1, f_2$ are linearly dependent, say $f_1 = \lambda f_2$, then clearly $W(f_1, f_2) = 0$.
Conversely suppose that $W(f_1, f_2) = 0$. If either $f_1$ or $f_2$ vanishes identically then they
are linearly dependent, so we can assume that $f_1 \neq 0$ and $f_2 \neq 0$. (Meaning $f_1, f_2$ are
not the “zero function” $0(x) = 0$, $x \in \mathbb{R}$.) Then by Theorem 18.2 $f_1$ and $f_2$ cannot have
trivial initial values at $x = 0$. Consider the case that $f_1'(0) = f_2'(0) = 0$. Then we must
have $f_1(0) \neq 0$ and $f_2(0) \neq 0$, and we can define

\[ f = f_2(0) f_1 - f_1(0) f_2 \in \ker(T) \tag{18.6} \]

which satisfies $f(0) = 0$ and $f'(0) = -W(f_1, f_2)(0) = 0$, so $f = 0$ again by Theorem 18.2,
which shows that $f_1, f_2$ are linearly dependent. In the case that either $f_1'(0) \neq 0$ or
$f_2'(0) \neq 0$ we can define
\[ f = f_2'(0) f_1 - f_1'(0) f_2 \in \ker(T) \tag{18.7} \]
and find $f(0) = W(0) = 0$ and $f'(0) = 0$, which implies that $f = 0$ by Theorem 18.2,
hence $f_1, f_2$ are also linearly dependent in this case.

Proof of (iv). Let $f_1, f_2$ be the solutions to (18.1a), (18.1b) with $(y_0, y_1) = (1, 0)$ and $(y_0, y_1) =
(0, 1)$ respectively (and $x_0 = 0$). Then $W(f_1, f_2)(0) = 1$ and so $f_1, f_2$ are linearly
independent by (iii). Hence $\dim \ker(T) \geq 2$. Given $f \in \ker(T)$, we claim that
$f = f(0) f_1 + f'(0) f_2$. Indeed $g = f - f(0) f_1 - f'(0) f_2 \in \ker(T)$, and $g(0) = f(0) - f(0) = 0$,
and $g'(0) = f'(0) - f'(0) = 0$, hence $g = 0$ identically by Theorem 18.2. Thus we have
shown that $f_1, f_2$ is a basis for $\ker(T)$, so $\dim \ker(T) = 2$.

Module VII.

Complex numbers and complex exponentials
Note 19.
Complex numbers
In Module VI we have found the solutions to the differential equation $y'' + by = 0$ “by
inspection.” What that means is that we have taken a guess that a certain function is a
solution, or as one also says we make the ansatz:

\[ f(x) = e^{\lambda x} \qquad (x \in \mathbb{R}) \tag{19.1} \]

and we hope to find a number $\lambda$ so that $f$ is indeed a solution.¹


Now inserting this ansatz into the differential equation we find that $f$ is a solution to
$y'' + by = 0$ if

\[ \lambda^2 + b = 0 \tag{19.2} \]

In the case when $b < 0$ there are two solutions, namely $\lambda = \pm\sqrt{|b|}$, and they correspond
to the two fundamental solutions $e^{\pm\sqrt{|b|}\,x}$. But what about the case $b > 0$? Does the
equation
\[ \lambda^2 + 1 = 0 \tag{19.3} \]

have a solution? There is no real number that solves this equation, but as we will see in
this lecture it does have the complex solutions $\lambda = i$ and $\lambda = -i$. The corresponding
fundamental solutions $e^{\pm ix}$ (of which we still have to make sense!) are in fact
related to the trigonometric functions that we have encountered in Section 18.1 as the
fundamental solutions to $y'' + by = 0$ when $b > 0$.
More generally, we verify that the ansatz (19.1) gives a solution to (18.4) if $\lambda$ is a zero
of the characteristic polynomial:

\[ \lambda^2 + a\lambda + b = 0 \tag{19.4} \]

Depending on the sign of the discriminant $\Delta$, see (18.21), this equation has either
real solutions ($\Delta > 0$, or $\Delta = 0$) or complex roots ($\Delta < 0$), corresponding to the three
systems of fundamental solutions that we have encountered in Theorem 18.2.
¹ This of course does not explain how one arrives at this particular ansatz. But it is clear that polynomials
cannot provide a solution, because differentiation reduces their degree. The exponential function already
led to success for the equation $y' = y$, so why not try again! This procedure of trial and error is also
called heuristics, which comes from Greek for “searching.”


19.1. Imaginary numbers and quadratic equations


Since $\lambda^2 + 1 > 0$ for all $\lambda \in \mathbb{R}$ there cannot be a real number that solves (19.3). The
name “imaginary number” reflects that at first the “number” $i$ that solves $i^2 + 1 = 0$ was
invented. While mysterious, this number allows us to solve every quadratic equation:
\[ ax^2 + bx + c = 0 \tag{19.5} \]
Formally the solutions are
\[ x = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad x = \frac{-b - \sqrt{b^2 - 4ac}}{2a}, \tag{19.6} \]
which seems to make sense even if $\Delta = b^2 - 4ac < 0$. Consider for example the case
$a = b = c = 1$, then this formula says
\[ x = \frac{-1 + \sqrt{-3}}{2} = -\frac{1}{2} + \frac{\sqrt{3}}{2}\, i, \qquad x = \frac{-1 - \sqrt{-3}}{2} = -\frac{1}{2} - \frac{\sqrt{3}}{2}\, i \tag{19.7} \]
where we have taken $\sqrt{-3} = \sqrt{3 i^2} = \sqrt{3}\, i$.
Exercise 19.1. Verify that these are formally solutions to $x^2 + x + 1 = 0$.
These are examples of complex numbers, which can always be written as $a + bi$ for
some real numbers $a$ and $b$, with the convention that $a = a + 0i$, and $i = 0 + 1i$. The laws
of arithmetic and the relation $i^2 = -1$ then show that
\[ (a + bi) + (c + di) = (a + c) + (b + d)i \tag{19.8a} \]
\[ (a + bi) \cdot (c + di) = (ac - bd) + (ad + bc)i \tag{19.8b} \]
Remark 19.1. An equation like (1 + 2i)(3 + 1i) = 1 + 7i may thus be regarded as an
abbreviation for two equations 1 · 3 − 2 · 1 = 1, and 1 · 1 + 2 · 3 = 7.
Any complex number $a + bi \neq 0$ has a (multiplicative) inverse denoted by $(a + bi)^{-1}$,
which is given by:
\[ \frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a - bi}{a^2 + b^2} \tag{19.9} \]
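The rules of arithmetic above are easy to experiment with. The following Python sketch represents a + bi as the pair (a, b) and implements addition, multiplication (with real part ac − bd and imaginary part ad + bc) and the inverse (19.9). The function names `add`, `mul`, `inv` are illustrative, and Python's built-in `complex` type is used only as an independent cross-check.

```python
def add(z, w):
    # (a + bi) + (c + di) = (a + c) + (b + d)i
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i^2 = -1
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def inv(z):
    # 1/(a + bi) = (a - bi)/(a^2 + b^2), defined for a + bi != 0
    a, b = z
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (a / n, -b / n)

print(mul((1, 2), (3, 1)))            # the product from Remark 19.1
print(mul((3.0, 4.0), inv((3.0, 4.0))))  # should be close to (1, 0)
print(complex(*mul((1.5, -2.0), (0.5, 3.0))), (1.5 - 2j) * (0.5 + 3j))
```

Representing complex numbers as pairs mirrors the point of view taken in the additional notes to Module VII, where this is used as the definition.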

19.2. Complex plane


It is clear that we can view complex numbers $z = a + bi$ as points $(a, b)$ in the plane,
where the horizontal axis consisting of the points $(a, 0)$ is called the real axis, and the vertical
axis consisting of the points $(0, b)$ is called the imaginary axis. In fact, we can use this
point of view to define complex numbers; (see the additional notes to Module VII).
Definition 19.1. For any complex number $z = x + iy$ ($x, y \in \mathbb{R}$), the conjugate is
defined by
\[ \bar{z} = x - iy, \tag{19.10} \]
and the modulus (or absolute value) is defined as
\[ |z| = \sqrt{x^2 + y^2}. \tag{19.11} \]


Exercise 19.2. Interpret these definitions geometrically.


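Exercise 19.3. Let $z$ and $w$ be complex numbers. Show that
1. $\bar{\bar{z}} = z$

2. $\bar{z} = z$ if $z$ is real

3. $\overline{z + w} = \bar{z} + \bar{w}$

4. $\overline{-z} = -\bar{z}$

5. $\overline{z \cdot w} = \bar{z} \cdot \bar{w}$

6. $\overline{z^{-1}} = (\bar{z})^{-1}$ (for $z \neq 0$)

7. $|z|^2 = z \bar{z}$

8. $|z \cdot w| = |z| \cdot |w|$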
A less straight-forward but very important statement is the following triangle inequality;
(see the additional notes to Module VII): For any complex numbers z, and w it holds

|z + w| ≤ |z| + |w| . (19.12)

Addition and multiplication both have geometric interpretations in the complex plane.
For the interpretation of multiplication note that for any complex number $z \neq 0$ we
can write
\[ z = |z| \, \frac{z}{|z|} \tag{19.13} \]
where $|z|$ is a positive factor, and $z/|z|$ is a complex number with unit absolute value.
Since any complex number with unit modulus can be written in the form $\cos\theta + i \sin\theta$,
we see that
\[ z = r(\cos\theta + i \sin\theta) \tag{19.14} \]
where $r = |z| > 0$, and $\theta \in \mathbb{R}$ (which is not unique, because if $\theta_0$ is one possibility, then
so are $\theta_0 + 2\pi k$, for any $k \in \mathbb{Z}$); $\theta$ is called the argument of $z$.
Exercise 19.4. Show that the product of two nonzero complex numbers $z = r(\cos\theta + i \sin\theta)$,
and $w = s(\cos\varphi + i \sin\varphi)$ is

\[ z \cdot w = rs\bigl(\cos(\theta + \varphi) + i \sin(\theta + \varphi)\bigr) \tag{19.15} \]

and give a geometric interpretation in the complex plane.


It is an easy matter to derive from (19.15), by induction, a formula for $z^n$, for any
given nonzero complex number $z$ and natural number $n$: If $z$ is given by (19.14), then

\[ z^n = |z|^n \bigl( \cos(n\theta) + i \sin(n\theta) \bigr). \tag{19.16} \]




This formula is also known as de Moivre’s theorem, and can be used to compute the nth
roots of a complex number; (see the additional notes to Module VII).


Problems
1. Find the absolute value and argument(s) of each of the following complex numbers:
a) $3 + 4i$
b) $(3 + 4i)^{-1}$
c) $(1 + i)^5$

2. Solve the following equations:
a) $x^2 + ix + 1 = 0$
b) $x^4 + x^2 + 1 = 0$
c) $x^2 + 2ix - 1 = 0$

3. Describe the set of all complex numbers $z$ such that
a) $\bar{z} = -z$
b) $\bar{z} = z^{-1}$
c) $|z - a| = |z - b|$

4. Prove that $|z| = |\bar{z}|$ and that the real part of $z$ can be written as $(z + \bar{z})/2$, while
the imaginary part is $(z - \bar{z})/(2i)$.

5. Consider a polynomial $p(z) = z^n + a_{n-1} z^{n-1} + \dots + a_0$ with real coefficients
$a_0, \dots, a_{n-1} \in \mathbb{R}$. Suppose $a + ib$ (with real $a$, $b$) is a root of $p$, $p(a + ib) = 0$. Show
that then also $a - ib$ is a root of the polynomial. (Thus the non-real roots of the
equation $p(z) = 0$ always occur in pairs, and the number of such roots is even.)

6. Find all the 4th roots of i.

Additional: Complex numbers
Further Reading

(Spivak, Calculus, Chapter 25) or (Apostol, Calculus I, Chapter 9)

The discussion suggests that we can arrive at a sensible definition of complex numbers
if we view them as pairs of real numbers:

Definition 19.1 (Complex numbers). A complex number z is an ordered pair of real


numbers (a, b), where a is called the real part, and b the imaginary part of z. The set
of complex numbers is denoted by C. If (a, b) and (c, d) are complex numbers, we define
addition and multiplication by (19.8), namely

(a, b) + (c, d) =(a + c, b + d) (19.1a)


(a, b) · (c, d) =(ac − bd, ad + bc) (19.1b)

Moreover, we define
i = (0, 1) (19.2)

Remark 19.1. We can identify complex numbers (a, 0) with the real number a ∈ R.
Moreover $i^2 = (0, 1) \cdot (0, 1) = (-1, 0)$, and so we have

(a, b) = (a, 0) + (0, b) = (a, 0) + b(0, 1) = a + bi . (19.3)

We will not verify explicitly that $\mathbb{C}$ satisfies all the properties of a number system, or
more precisely the axioms of a field; see for example (Spivak, Calculus, Chapter 1, and
Chapter 25). But let us figure out how to compute the multiplicative inverse. For $(a, b) \neq (0, 0)$
let us find $(x, y)$ such that
\[ (a, b) \cdot (x, y) = (1, 0) \tag{19.4} \]
For this to be true we need $ax - by = 1$, and $ay + bx = 0$, which has the solution
$x = a/(a^2 + b^2)$, and $y = -b/(a^2 + b^2)$. This proves (19.9).

Proof of the triangle inequality


Proposition 19.1. Let z, and w be complex numbers, then

|z + w| ≤ |z| + |w| . (19.5)


Proof. It is easy to see that this is true when $w = 0$, or when $z = \lambda w$ for some real number $\lambda$. So let us
assume that $z \neq \lambda w$ for any $\lambda \in \mathbb{R}$, and that $w \neq 0$. Then for all $\lambda \in \mathbb{R}$,

\[ 0 < |z - \lambda w|^2 = (z - \lambda w) \cdot (\bar{z} - \lambda \bar{w}) = |z|^2 + \lambda^2 |w|^2 - \lambda (z\bar{w} + w\bar{z}) \tag{19.6} \]

Since $z\bar{w} + w\bar{z}$ is real (verify this!), the right hand side is a quadratic in $\lambda$ with real
coefficients, which by the inequality cannot have a real zero. Therefore its discriminant must
be negative:
\[ (z\bar{w} + w\bar{z})^2 - 4|w|^2 |z|^2 < 0 \tag{19.7} \]
From this inequality it follows that $z\bar{w} + w\bar{z} < 2|w||z|$, and hence

\[ |z + w|^2 = (z + w) \cdot (\bar{z} + \bar{w}) = |z|^2 + |w|^2 + z\bar{w} + w\bar{z} < |z|^2 + |w|^2 + 2|w||z| = (|z| + |w|)^2 \tag{19.8} \]

which implies the stated inequality.

Consequences of de Moivre’s theorem


The formula (19.15) can be used to show by induction that for any nonzero complex
number z = r(cos θ + i sin θ) we have (and this holds for any argument θ of z):

\[ z^n = |z|^n \bigl( \cos(n\theta) + i \sin(n\theta) \bigr) \tag{19.9} \]




This formula is also known as de Moivre’s theorem, and has the important consequence:

Proposition 19.2. Every nonzero complex number has exactly n complex nth roots.

Proof. The statement is that for any complex number $w = s(\cos\varphi + i \sin\varphi) \neq 0$, and any
natural number $n$ there are precisely $n$ different complex numbers $z = r(\cos\theta + i \sin\theta)$
satisfying $z^n = w$. By de Moivre’s theorem this happens if and only if

\[ r^n = s \tag{19.10} \]
\[ \cos(n\theta) + i \sin(n\theta) = \cos\varphi + i \sin\varphi \tag{19.11} \]

So from the first equation $r = \sqrt[n]{s}$, and from the second for some integer $k$,

\[ n\theta = \varphi + 2\pi k \tag{19.12} \]

which has the solutions $\theta_k = (\varphi + 2\pi k)/n$. However, it remains to find out how many of these are distinct.
Since any integer $k$ can be written as $k = nq + k'$ for some integer $q$, and some integer $k'$
between $0$ and $n - 1$, we see that

\[ \theta_k = \frac{\varphi}{n} + 2\pi q + \frac{2\pi k'}{n} = \theta_{k'} + 2\pi q \tag{19.13} \]

and so $\theta_k$ and $\theta_{k'}$ are the arguments of the same root $z$ in the complex plane. Therefore
there are $n$ distinct $n$th roots

\[ z = \sqrt[n]{s}\, (\cos\theta_k + i \sin\theta_k), \qquad k = 0, 1, 2, \dots, n - 1. \tag{19.14} \]
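The method of the proof can be written as a short Python sketch: convert w to polar form, take the real nth root of the modulus, and use the n angles θ_k = (φ + 2πk)/n for k = 0, ..., n − 1. The function name `nth_roots` is hypothetical; `cmath.polar` and `cmath.rect` are the standard-library conversions between Cartesian and polar form. The example computes the cube roots of −8, chosen so as not to anticipate the exercises.

```python
import cmath
import math

def nth_roots(w, n):
    # Polar form w = s(cos(phi) + i sin(phi)); cmath.polar returns (s, phi).
    s, phi = cmath.polar(w)
    if s == 0:
        raise ValueError("w must be nonzero")
    r = s ** (1.0 / n)   # the positive real n-th root of the modulus
    # One root for each k = 0, ..., n-1 with argument (phi + 2*pi*k)/n.
    return [cmath.rect(r, (phi + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-8, 3)
for z in roots:
    print(z, z ** 3)   # each cube should be close to -8
```

Any other choice of argument φ for w would give the same set of roots, only listed in a different order.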

Exercise 19.1. The proof actually shows us a method to compute the nth root of a
complex number. Use it to compute the three cube roots of i.

Note 20.
Hyperbolic functions
Let us return to the differential equation

\[ y'' - y = 0. \tag{20.1} \]

We have learned in Note 18 that every solution to (20.1) is of the form

\[ y(x) = c_1 e^{x} + c_2 e^{-x} \tag{20.2} \]

for some constants $c_1, c_2 \in \mathbb{R}$. We now define the following special solutions, corresponding
to the choice of constants $c_1 = \frac{1}{2}$, and $c_2 = \pm\frac{1}{2}$:
\[ \cosh(x) := \frac{1}{2}\bigl( e^{x} + e^{-x} \bigr) \tag{20.3} \]
\[ \sinh(x) := \frac{1}{2}\bigl( e^{x} - e^{-x} \bigr). \tag{20.4} \]
Remark 20.1. We may view these functions as the unique solutions to the initial value
problem
\[ y'' - y = 0, \qquad y(0) = y_0, \qquad y'(0) = y_1 \tag{20.5} \]
with $(y_0, y_1) = (1, 0)$, and $(y_0, y_1) = (0, 1)$, respectively. Indeed,

\[ \cosh(0) = 1, \qquad \cosh'(0) = 0 \tag{20.6} \]
\[ \sinh(0) = 0, \qquad \sinh'(0) = 1. \tag{20.7} \]

In particular, we find that the Wronskian at $x = 0$ is

\[ W(\cosh, \sinh)(0) = \cosh(0)\sinh'(0) - \cosh'(0)\sinh(0) = 1 \tag{20.8} \]

which tells us that these two solutions are linearly independent, and hence every solution
to (20.1) can be written as a linear combination of these; cf. Theorem 18.1 in Note 18,
additional material to Module VI.

20.1. Basic properties of hyperbolic functions


We easily verify the identity

\[ \cosh^2(x) - \sinh^2(x) = 1 \tag{20.9} \]

which means geometrically that

\[ \{ (x, y) : x = \cosh(t),\ y = \sinh(t) \text{ for some } t \in \mathbb{R} \} \]

is one branch of the hyperbola

\[ x^2 - y^2 = 1. \]

Figure 20.1: Geometric interpretation of the hyperbolic functions.

This is the origin of the name hyperbolic functions, and we call cosh(·) the hyperbolic
cosine, and sinh(·) the hyperbolic sine.
Exercise 20.1. The geometric interpretation of the hyperbolic functions is not to be
confused with the graphs of the hyperbolic functions. Sketch the graphs of the hyperbolic
cosine and sine!
The hyperbolic cosine is an example of an even function, while the hyperbolic sine is
an example of an odd function:

cosh(−x) = cosh(x) (20.10)


sinh(−x) = − sinh(x) (20.11)

Recall that

Definition 20.1. A function f is called even if f (−x) = f (x) for all x ∈ R, and it is
called odd if f (−x) = −f (x) for all x ∈ R.

Exercise 20.2. Give other examples of even and odd functions.


Exercise 20.3. Show that for all x, y ∈ R:

sinh(x ± y) = sinh(x) cosh(y) ± cosh(x) sinh(y) (20.12)

Also note that the hyperbolic functions are each other’s derivatives:

\[ \cosh'(x) = \sinh(x) \tag{20.13} \]
\[ \sinh'(x) = \cosh(x) \tag{20.14} \]

20.2. Inverse hyperbolic functions


The hyperbolic sine is a strictly increasing function because $\sinh'(x) = \cosh(x) > 0$ for all
$x \in \mathbb{R}$. Similarly the hyperbolic cosine is strictly increasing on the interval $[0, \infty)$. Indeed
\[ \cosh'(x) = \sinh(x) = \frac{1}{2}\bigl( e^{x} - e^{-x} \bigr) > 0 \qquad (x > 0) \tag{20.15} \]
Therefore the inverse functions $\cosh^{-1}(x)$ (of the restriction of $\cosh$ to $[0, \infty)$) and $\sinh^{-1}(x)$ exist on the intervals $[1, \infty)$
and $\mathbb{R}$, respectively, and are also called area hyperbolic cosine $\operatorname{arcosh}(\cdot)$, and area
hyperbolic sine $\operatorname{arsinh}(\cdot)$.

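Proposition 20.1. The area hyperbolic functions can be expressed as

\[ \operatorname{arcosh}(x) = \log\bigl( x + \sqrt{x^2 - 1} \bigr) \qquad (x \in [1, \infty)) \tag{20.16} \]
\[ \operatorname{arsinh}(x) = \log\bigl( x + \sqrt{x^2 + 1} \bigr) \qquad (x \in \mathbb{R}) \tag{20.17} \]

Proof. For $x > 1$ we have that

\[ \operatorname{arcosh}'(x) = \frac{1}{\cosh'(\operatorname{arcosh}(x))} = \frac{1}{\sinh(\operatorname{arcosh}(x))} = \frac{1}{\sqrt{\cosh^2(\operatorname{arcosh}(x)) - 1}} = \frac{1}{\sqrt{x^2 - 1}} \tag{20.18} \]

We find the same result when differentiating

\[ \Bigl( \log\bigl( x + \sqrt{x^2 - 1} \bigr) \Bigr)' = \frac{1}{x + \sqrt{x^2 - 1}} \Bigl( 1 + \frac{x}{\sqrt{x^2 - 1}} \Bigr) = \frac{1}{\sqrt{x^2 - 1}} \tag{20.19} \]

Therefore the difference $g(x) = \operatorname{arcosh}(x) - \log(x + \sqrt{x^2 - 1})$ is a continuous function
on $[1, \infty)$ whose derivative exists on $(1, \infty)$ and is $g'(x) = 0$. Therefore $g(x)$ is constant,
and

\[ g(x) = g(1) = \operatorname{arcosh}(1) - \log(1) = \operatorname{arcosh}(\cosh(0)) - 0 = 0 \qquad (x \geq 1) \tag{20.20} \]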
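The logarithmic formulas of Proposition 20.1 can be compared with the inverse hyperbolic functions of Python's standard library. In the sketch below, `arcosh` and `arsinh` are illustrative names for the right-hand sides of (20.16) and (20.17); `math.acosh`, `math.asinh`, `math.cosh` and `math.sinh` are the library functions used for comparison.

```python
import math

def arcosh(x):
    # Right-hand side of (20.16), valid for x >= 1.
    return math.log(x + math.sqrt(x * x - 1))

def arsinh(x):
    # Right-hand side of (20.17), valid for all real x.
    return math.log(x + math.sqrt(x * x + 1))

for x in (1.0, 1.5, 2.5, 10.0):
    print(x, arcosh(x), math.acosh(x))

for x in (-0.7, 0.0, 0.3, 4.0):
    print(x, arsinh(x), math.asinh(x))

# Round trip: applying the formulas to cosh(t), sinh(t) should return t.
t = 1.3
print(arcosh(math.cosh(t)), arsinh(math.sinh(t)))
```

For negative x of large absolute value the expression x + sqrt(x² + 1) suffers from cancellation in floating point, so this direct formula is meant as an illustration rather than a robust implementation.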


Exercise 20.4. Write down an analogous proof for (20.17).


Exercise 20.5. The hyperbolic tangent is defined by
\[ \tanh(x) := \frac{\sinh(x)}{\cosh(x)} \qquad (x \in \mathbb{R}) \tag{20.21} \]
Show that $\tanh(\cdot)$ takes values in $(-1, 1)$, and
\[ \tanh'(x) = 1 - \tanh^2(x) = \frac{1}{\cosh^2(x)} \tag{20.22} \]
Sketch the graph of the hyperbolic tangent, and show that $\tanh(\cdot)$ is invertible. The inverse
function $\operatorname{artanh}(\cdot)$ is defined on $(-1, 1)$ and is called the area hyperbolic tangent.
Verify that
\[ \operatorname{artanh}'(x) = \frac{1}{1 - x^2} \qquad (x \in (-1, 1)), \tag{20.23} \]
and recall the partial fraction decomposition of Note 14 to write
\[ \frac{1}{1 - x^2} = \frac{1}{2} \Bigl[ \frac{1}{1 - x} + \frac{1}{1 + x} \Bigr] \tag{20.24} \]
Finally integrate this identity to find an explicit formula for the area hyperbolic tangent
function in terms of the logarithm.
We have seen that
\[ \int \frac{1}{\sqrt{x^2 - 1}}\, dx = \operatorname{arcosh}(x) \tag{20.25} \]
\[ \int \frac{1}{\sqrt{1 + x^2}}\, dx = \operatorname{arsinh}(x) \tag{20.26} \]

In general, the primitive of integrals involving $\sqrt{x^2 - 1}$ may often be found using the
substitution $x = \cosh(u)$; cf. Note 14. Similarly the substitution $x = \sinh(u)$ often works
for integrals involving $\sqrt{x^2 + 1}$.
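Example 20.1. To find the primitive
\[ \int \frac{dx}{x\sqrt{x^2 - 1}} \tag{20.27} \]
we make the substitution $x = \cosh(u)$ with $u > 0$, then $dx = \sinh(u)\,du$ and $\sqrt{x^2 - 1} = \sinh(u)$, so
\[ \int \frac{dx}{x\sqrt{x^2 - 1}} = \int \frac{du}{\cosh(u)} \tag{20.28} \]
and using the definition of the hyperbolic cosine, together with the substitution $e^u = y$,
$e^u\,du = dy$ we find
\[ \int \frac{du}{\cosh(u)} = \int \frac{2\,du}{e^{u} + e^{-u}} = \int \frac{2 e^{u}\,du}{1 + e^{2u}} = \int \frac{2\,dy}{1 + y^2} = 2\arctan(y) \tag{20.29} \]
and thus in view of Proposition 20.1, with $u = \operatorname{arcosh}(x)$,
\[ \int \frac{dx}{x\sqrt{x^2 - 1}} = 2\arctan(e^{u}) = 2\arctan\bigl( x + \sqrt{x^2 - 1} \bigr). \tag{20.30} \]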
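A primitive found by substitution can be cross-checked numerically. Since ∫ 2 dy/(1 + y²) = 2 arctan(y), the substitution in Example 20.1 leads to the candidate F(x) = 2 arctan(x + √(x² − 1)). The Python sketch below compares F(b) − F(a) with a composite Simpson approximation of the integral of 1/(x√(x² − 1)) over [a, b]; the interval [2, 5], the number of subintervals and all function names are illustrative assumptions.

```python
import math

def integrand(x):
    # The integrand of (20.27), defined for x > 1.
    return 1.0 / (x * math.sqrt(x * x - 1))

def F(x):
    # Candidate primitive obtained from the substitutions in Example 20.1.
    return 2 * math.atan(x + math.sqrt(x * x - 1))

def simpson(f, a, b, n=1000):
    # Composite Simpson rule with an even number n of subintervals.
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, b = 2.0, 5.0
numeric = simpson(integrand, a, b)
exact = F(b) - F(a)
print(numeric, exact)
```

The same value is obtained from the more familiar primitive arccos(1/x), which differs from F only by the additive constant π/2.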


20.3. The catenary curve


A famous application of hyperbolic functions is the solution to the following problem:

Catenary problem What shape does a chain of uniform density take under its own weight
when suspended between two points?

The solution to this problem invokes the theory of separable differential equations
from Note 16, and involves hyperbolic functions as discussed above; (see the additional
notes to Module VII).

Problems
1. Compute the following integrals using hyperbolic substitutions:
a)
\[ \int \frac{dx}{x\sqrt{1 + x^2}} \tag{20.31} \]
b)
\[ \int \frac{dx}{x^2\sqrt{x^2 - 1}} \tag{20.32} \]
c)
\[ \int \sqrt{x^2 + 1}\, dx \tag{20.33} \]
d)
\[ \int \sqrt{x^2 - 1}\, dx \tag{20.34} \]

Note 21.
Second order differential equations

21.1. Homogeneous equation and complex exponentials


We have considered the special case

\[ y'' - y = 0 \tag{21.1} \]

of a linear differential equation, which we solved by making the ansatz $f(x) = e^{\lambda x}$. This
led to the equation
\[ \lambda^2 - 1 = 0 \tag{21.2} \]
which has the solutions $\lambda = \pm 1$, corresponding to the linearly independent solutions $e^{\pm x}$.
Linear combinations of these define the hyperbolic functions
\[ \cosh(x) = \frac{1}{2}\bigl( e^{x} + e^{-x} \bigr), \qquad \sinh(x) = \frac{1}{2}\bigl( e^{x} - e^{-x} \bigr) \tag{21.3} \]
which we have studied in Note 20.
Recall from Note 18 that the special case

\[ y'' + y = 0 \tag{21.4} \]

has the trigonometric functions as its solutions. The same ansatz $f(x) = e^{\lambda x}$ leads, as
we have already seen in Note 19, to the equation

\[ \lambda^2 + 1 = 0 \tag{21.5} \]

which now has the complex solutions $\lambda = \pm i$, corresponding to the complex valued
solutions $e^{\pm ix}$.
These complex valued functions need to be discussed separately; (see the additional
notes to Module VII). However, given that $e^{\pm ix}$ are complex valued solutions, we can
already infer that their real and imaginary parts are also solutions. Since for any complex
number $w$ the real part is given by $\mathrm{Re}(w) = (w + \bar{w})/2$, and the imaginary part is given
by $\mathrm{Im}(w) = (w - \bar{w})/(2i)$, and given that moreover $\overline{e^{z}} = e^{\bar{z}}$ for any complex number $z$, we
obtain that the real and imaginary parts of $e^{ix}$ are given, respectively, by
\[ \mathrm{Re}(e^{ix}) = \frac{1}{2}\bigl( e^{ix} + e^{-ix} \bigr), \qquad \mathrm{Im}(e^{ix}) = \frac{1}{2i}\bigl( e^{ix} - e^{-ix} \bigr). \tag{21.6} \]


Therefore they are, respectively, the solutions of the initial value problems
$$ y'' + y = 0, \qquad y(0) = 1, \qquad y'(0) = 0 \tag{21.7} $$
$$ y'' + y = 0, \qquad y(0) = 0, \qquad y'(0) = 1. \tag{21.8} $$

In view of the uniqueness theorem for solutions to the initial value problem (see
Theorem 18.2) and the discussion in Section 18.1, we can infer that they must be
precisely the trigonometric functions:
$$ \cos(x) = \frac12\bigl(e^{ix} + e^{-ix}\bigr), \qquad \sin(x) = \frac{1}{2i}\bigl(e^{ix} - e^{-ix}\bigr). \tag{21.9} $$

Moreover, by the very definition of the trigonometric functions as the real and imaginary
parts of $e^{ix}$, we obtain Euler’s identity:
$$ e^{ix} = \cos(x) + i\sin(x) \tag{21.10} $$

We will discuss more properties of the trigonometric functions separately below.


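Remark 21.1. For example, we see directly from the definition that
$$ \cos^2(x) + \sin^2(x) = \bigl(\operatorname{Re}(e^{ix})\bigr)^2 + \bigl(\operatorname{Im}(e^{ix})\bigr)^2 = e^{ix}\,\overline{e^{ix}} = e^0 = 1, \tag{21.11} $$
and so geometrically the set $\{(\cos(\theta), \sin(\theta)) : \theta \in [0, 2\pi]\}$ is a circle.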

Let us now return to the general case (18.4) of a homogeneous linear second order
differential equation with constant coefficients:
$$ y'' + ay' + by = 0 \tag{21.12} $$
As we have seen in Note 19, the corresponding characteristic polynomial is (19.4), namely
the algebraic condition for $f(x) = e^{\lambda x}$ to be a solution is that $\lambda$ is a root of
$$ Q(\lambda) = \lambda^2 + a\lambda + b. \tag{21.13} $$
This polynomial may not have any real zeros, depending on the sign of the discriminant
$\Delta = a^2 - 4b$. Indeed $\lambda$ is a solution to $Q(\lambda) = 0$ if and only if
$$ \lambda = -\frac{a}{2} \pm \frac12\sqrt{a^2 - 4b} \tag{21.14} $$

Case $\Delta > 0$.
Exercise 21.1. Let $\lambda_1$, $\lambda_2$ be the two real zeros of (21.13). Show that $f_1(x) = e^{\lambda_1 x}$
and $f_2(x) = e^{\lambda_2 x}$ are two linearly independent solutions of (21.12).


Case $\Delta = 0$. In this case the two roots coincide, $\lambda_1 = \lambda_2 = -a/2$. Let us prove that
$$ f_1(x) = e^{-ax/2}, \qquad f_2(x) = x e^{-ax/2} \tag{21.15} $$
are two linearly independent solutions to (21.12).

Clearly $f_1(x)$ is a solution, and we verify that
$$ f_2''(x) + a f_2'(x) + b f_2(x) = -a e^{-ax/2} + x\,Q(-a/2)\,e^{-ax/2} + a e^{-ax/2} = 0. \tag{21.16} $$
Moreover the Wronskian of the two solutions is
$$ W(x) = f_1(x) f_2'(x) - f_1'(x) f_2(x) = e^{-ax/2}(1 - ax/2)e^{-ax/2} + (a/2)e^{-ax/2}\,x e^{-ax/2} = e^{-ax} \neq 0 \tag{21.17} $$
and thus $f_1(x)$ and $f_2(x)$ are linearly independent by Theorem 18.1.

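Case $\Delta < 0$. In this case we verify that
$$ \lambda = \alpha \pm i\beta, \qquad \alpha = -a/2, \qquad \beta = \sqrt{-\Delta}/2 \tag{21.18} $$
are zeros of the characteristic polynomial $Q(\lambda)$. Similarly to the special case (21.4), we
expect that the real and imaginary parts of $e^{\lambda x}$ are solutions to (21.12). Since
$$ e^{(\alpha+i\beta)x} = e^{\alpha x}e^{i\beta x}, \qquad \overline{e^{(\alpha+i\beta)x}} = e^{\alpha x}e^{-i\beta x} = e^{(\alpha-i\beta)x} \tag{21.19} $$
we obtain
$$ \operatorname{Re}\bigl(e^{(\alpha+i\beta)x}\bigr) = \frac12 e^{\alpha x}\bigl(e^{i\beta x} + e^{-i\beta x}\bigr) = e^{\alpha x}\cos(\beta x) \tag{21.20} $$
$$ \operatorname{Im}\bigl(e^{(\alpha+i\beta)x}\bigr) = \frac{1}{2i} e^{\alpha x}\bigl(e^{i\beta x} - e^{-i\beta x}\bigr) = e^{\alpha x}\sin(\beta x) \tag{21.21} $$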
2i
Exercise 21.2. Show that the functions $f_1(x) = e^{\alpha x}\cos(\beta x)$ and $f_2(x) = e^{\alpha x}\sin(\beta x)$ are
solutions to (21.12) in the case $\Delta < 0$.
Exercise 21.3. Show that the solutions $f_1(x)$ and $f_2(x)$ are linearly independent, i.e. that
for all $x \in \mathbb{R}$,
$$ W(f_1, f_2)(x) \neq 0. \tag{21.22} $$

Summary. In summary, to find the solutions to the homogeneous equation (21.12), one
may determine the zeros of the characteristic polynomial,
$$ Q(\lambda) = \lambda^2 + a\lambda + b = 0, \qquad \lambda \in \mathbb{C} \tag{21.23} $$
In the three possible cases

1. $\lambda_1, \lambda_2 \in \mathbb{R}$, $\lambda_1 \neq \lambda_2$

2. $\lambda_1 = \lambda_2 \in \mathbb{R}$

3. $\lambda_1 \in \mathbb{C}$, $\operatorname{Im}(\lambda_1) \neq 0$, $\lambda_2 = \overline{\lambda_1}$,

the functions $f_1$ and $f_2$ defined above form a fundamental system, in the sense that a
twice differentiable function $f$ is a solution to (21.12) if and only if there are constants
$c_1, c_2 \in \mathbb{R}$ (uniquely determined) such that $f = c_1 f_1 + c_2 f_2$.
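The three cases of the summary can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `fundamental_system` and `residual` are made up for this example, the double-root case is detected by an exact floating point comparison (adequate only for simple inputs), and the check that the returned functions solve (21.12) uses central finite differences rather than a proof.

```python
import math


def fundamental_system(a, b):
    """Return two real solutions (f1, f2) of y'' + a*y' + b*y = 0."""
    disc = a * a - 4.0 * b  # discriminant of Q(lambda) = lambda^2 + a*lambda + b
    if disc > 0:
        # two distinct real roots lambda_1, lambda_2
        root = math.sqrt(disc)
        lam1, lam2 = (-a + root) / 2.0, (-a - root) / 2.0
        return (lambda x: math.exp(lam1 * x)), (lambda x: math.exp(lam2 * x))
    if disc == 0:
        # double root -a/2 (exact comparison: only reliable for simple inputs)
        lam = -a / 2.0
        return (lambda x: math.exp(lam * x)), (lambda x: x * math.exp(lam * x))
    # complex conjugate roots alpha +/- i*beta
    alpha, beta = -a / 2.0, math.sqrt(-disc) / 2.0
    return (
        lambda x: math.exp(alpha * x) * math.cos(beta * x),
        lambda x: math.exp(alpha * x) * math.sin(beta * x),
    )


def residual(f, a, b, x, h=1e-3):
    """Central-difference estimate of f'' + a*f' + b*f at x (near 0 for a solution)."""
    d1 = (f(x + h) - f(x - h)) / (2.0 * h)
    d2 = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return d2 + a * d1 + b * f(x)


if __name__ == "__main__":
    # one example for each sign of the discriminant
    for a, b in [(0.0, -1.0), (2.0, 1.0), (2.0, 5.0)]:
        f1, f2 = fundamental_system(a, b)
        print(a, b, residual(f1, a, b, 0.3), residual(f2, a, b, 0.3))
```

The printed residuals are small but not exactly zero, since the derivatives are only approximated.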
Exercise 21.4. Convince yourself that the fundamental system of solutions found in
this way is exactly the same as we have found initially in Lecture 18. In particular
from Proposition 18.1 we know that any solution to (21.12) can be written as a linear
combination of
$$ f_1(x) = e^{-ax/2}u_1(x), \qquad f_2(x) = e^{-ax/2}u_2(x) \tag{21.24} $$
where $u_1(x)$ and $u_2(x)$ are as in Theorem 18.2 depending on the sign of the discriminant
$\Delta = a^2 - 4b$.

21.2. Inhomogeneous equations


A few comments are in order about the inhomogeneous equation
$$ y'' + ay' + by = r(x) \tag{21.25} $$
where $a$ and $b$ remain constants, but $r(x)$ is a continuous function on $(-\infty, \infty)$.
Let us first observe that if $y_1$ and $y_2$ are solutions to (21.25), then $y_1 - y_2$ is a solution
to the homogeneous equation (21.12), and thus can be written as a linear combination of
the fundamental solutions $f_1$ and $f_2$. In other words, there are constants $c_1$ and $c_2$ such
that
$$ y_1 - y_2 = c_1 f_1 + c_2 f_2. \tag{21.26} $$
This means in particular that if one particular solution $y_1$ to (21.25) is known, then any
solution $y_2$ to (21.25) can be written as $y_2 = c_1 f_1 + c_2 f_2 + y_1$ for suitable constants $c_1$, $c_2$.
Proposition 21.1. If $y_1$ is a particular solution to (21.25), then the general solution to
the inhomogeneous equation (21.25) is obtained by adding to $y_1$ the general solution to
the homogeneous equation (21.12).
It thus suffices to find one particular solution to the inhomogeneous problem.
In general, for any continuous function $r(x)$, a particular solution can be found by the
method of variation of constants (see the additional notes to Module VII).

21.3. Special types of the inhomogeneous terms


While Theorem 22.1 provides a general method to determine a particular solution to
$$ y'' + ay' + by = r(x), \tag{21.27} $$
there are special cases, namely when the function $r(x)$ takes a special form, for which
other methods are available. We illustrate this in the cases when $r(x)$ is a polynomial, or
a polynomial times an exponential.


Case: $r(x) = p_n(x)$ is a polynomial of degree $n$. If $b \neq 0$, then a particular solution is
given by a polynomial
$$ g(x) = \sum_{k=0}^{n} a_k x^k, \tag{21.28} $$
whose coefficients can be determined successively order by order, after substituting into
the equation. Moreover, if $b = 0$ but $a \neq 0$, then a polynomial of degree $n + 1$ satisfies the
equation
$$ y'' + ay' = p_n(x). \tag{21.29} $$
Example 21.1. Consider the equation
$$ y'' + y = x^3. \tag{21.30} $$
The ansatz
$$ y_1(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0 \tag{21.31} $$
leads to the equation
$$ a_3 x^3 + a_2 x^2 + (6a_3 + a_1)x + 2a_2 + a_0 = x^3 \tag{21.32} $$
which is evidently satisfied provided
$$ a_3 = 1, \qquad a_2 = 0, \qquad a_1 = -6a_3 = -6, \qquad a_0 = -2a_2 = 0. \tag{21.33} $$
Therefore a particular solution is $g(x) = x^3 - 6x$, and the general solution is given by
$$ y(x) = c_1\cos(x) + c_2\sin(x) + x^3 - 6x. \tag{21.34} $$
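The "order by order" determination of the coefficients can be written as a short backward recursion. The following is a sketch under the assumption $b \neq 0$; the function name `polynomial_particular_solution` and the convention of storing coefficients from lowest to highest degree are choices made for this illustration only. Comparing the coefficient of $x^k$ on both sides of $g'' + ag' + bg = p$ gives $(k+2)(k+1)c_{k+2} + a(k+1)c_{k+1} + b\,c_k = p_k$, which is solved for $c_k$ starting from the top degree.

```python
def polynomial_particular_solution(a, b, p):
    """Coefficients c[0..n] (lowest degree first) of a polynomial g with
    g'' + a*g' + b*g = p, where p = [p_0, ..., p_n] and b != 0."""
    if b == 0:
        raise ValueError("this sketch only covers the case b != 0")
    n = len(p) - 1
    c = [0.0] * (n + 3)  # c[n+1] and c[n+2] are padding zeros
    for k in range(n, -1, -1):
        # coefficient of x^k: (k+2)(k+1) c_{k+2} + a (k+1) c_{k+1} + b c_k = p_k
        c[k] = (p[k] - a * (k + 1) * c[k + 1] - (k + 2) * (k + 1) * c[k + 2]) / b
    return c[: n + 1]


if __name__ == "__main__":
    # Example 21.1: y'' + y = x^3, i.e. a = 0, b = 1, p = x^3
    print(polynomial_particular_solution(0, 1, [0, 0, 0, 1]))
```

For Example 21.1 the recursion reproduces the coefficients in (21.33).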

Exercise 21.5. Derive the particular solution to (21.30) by using Theorem 22.1 and verify
that the two methods give the same result.

Case: $r(x) = p_n(x)e^{mx}$ where $p_n$ is a polynomial of degree $n$, and $m \in \mathbb{R}$. In this
case we can always find a particular solution of the form
$$ y_1(x) = u(x)e^{mx} \tag{21.35} $$
because inserting this ansatz in (21.27) gives an equation for $u$,
$$ u'' + (2m + a)u' + (m^2 + am + b)u = p_n(x) \tag{21.36} $$
which is precisely of the form discussed above, and can be solved with a particular
solution of the form
$$ q(x) = \sum_{k=0}^{n} a_k x^k, \tag{21.37} $$
at least when $m^2 + am + b \neq 0$. If $m^2 + am + b = 0$ but $2m + a \neq 0$, then we can take $q$
to be a polynomial of degree $n + 1$.


Example 21.2. Consider the equation
$$ y'' + y = xe^{3x}. \tag{21.38} $$
Setting $y = ue^{3x}$ we find
$$ u'' + 6u' + 10u = x \tag{21.39} $$
which we can solve with the ansatz
$$ q(x) = a_1 x + a_0, \tag{21.40} $$
which gives the relations
$$ 10a_1 x + 6a_1 + 10a_0 = x \tag{21.41} $$
which can only be satisfied if $a_1 = 1/10$ and $a_0 = -6/100$, so
$$ q(x) = (5x - 3)/50 \tag{21.42} $$
and so the general solution is
$$ y(x) = c_1\cos(x) + c_2\sin(x) + (5x - 3)e^{3x}/50. \tag{21.43} $$
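As a sanity check of Example 21.2, the reduction (21.36) and the resulting particular solution can be tested numerically. This is a rough sketch: the helper names `reduced_coefficients`, `y_particular` and `residual` are invented here, and the second derivative is approximated by a central difference, so the residual is only expected to be small, not zero.

```python
import math


def reduced_coefficients(a, b, m):
    """Coefficients (A, B) of u'' + A*u' + B*u = p_n(x), obtained from y = u*e^{mx}."""
    return 2 * m + a, m * m + a * m + b


# Example 21.2: y'' + y = x e^{3x}, so a = 0, b = 1, m = 3
A, B = reduced_coefficients(a=0, b=1, m=3)

# linear ansatz q(x) = a1*x + a0 with right-hand side x: B*a1 = 1 and A*a1 + B*a0 = 0
a1 = 1.0 / B
a0 = -A * a1 / B


def y_particular(x):
    """Particular solution q(x) e^{3x} from the ansatz above."""
    return (a1 * x + a0) * math.exp(3.0 * x)


def residual(x, h=1e-3):
    """Finite-difference estimate of y'' + y - x e^{3x} for y = y_particular."""
    second = (y_particular(x + h) - 2.0 * y_particular(x) + y_particular(x - h)) / (h * h)
    return second + y_particular(x) - x * math.exp(3.0 * x)


if __name__ == "__main__":
    print(A, B, a1, a0, residual(0.5))
```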

Problems
1. Find the general solution of each of the following differential equations. Unless
defined on the whole real line, indicate the interval on which the solution is defined.
a) $y'' - y = x$
b) $y'' + y' = x^2 + x$
c) $y'' - 4y = e^{2x}$
d) $y'' - 2y' + y = x + 2xe^{2x}$

2. If $k \neq 0$, verify that the equation
$$ y'' - k^2 y = r(x) \tag{21.44} $$
has a particular solution given by
$$ g(x) = \frac{1}{k}\int_0^x r(t)\sinh\bigl(k(x - t)\bigr)\,dt. \tag{21.45} $$

Additional: Complex functions,
and power series

Further Reading

This note is only an introduction to complex valued functions. This material is
mostly beyond the scope of this subject (MAST10021). A deeper discussion, along
with many of the proofs, can be found in (Spivak, Calculus, Chapters 26, 27). For
applications also see (Apostol, Calculus I, Chapter 9).

Complex Analysis (MAST30021)

In this note, we want to understand the meaning of the expression
$$ f(x) = e^{ix} \tag{22.1} $$
that we have encountered as a formal solution to a differential equation in Note 21. $f$ is
an example of a complex-valued function: It assigns to each real number $x$ a complex
number $f(x)$. More generally we would like to understand $e^z$ for any complex number $z$.
This is an example of a complex function.

22.1. Complex functions


A complex function is a rule
$$ f : \mathbb{C} \to \mathbb{C}, \qquad z \mapsto f(z) \tag{22.2} $$
which assigns to each complex number, or to each complex number in a subset $D \subset \mathbb{C}$ (the
domain of $f$), another complex number.¹ We can always write
$$ f(z) = u(z) + iv(z) \tag{22.3} $$
where $u(z)$ and $v(z)$ are real numbers, and say that $f$ is real-valued if $v(z) = 0$.

¹ In much the same way that we have talked about functions in Note 2 it makes sense to give a formal
definition of a complex function as a collection of pairs of complex numbers, which does not contain
two distinct pairs with the same first element; cf. Additional material to Module I.


Example 22.1. A polynomial is given by
$$ f(z) = a_n z^n + a_{n-1}z^{n-1} + \dots + a_0 \tag{22.4} $$
where $a_0, \dots, a_n \in \mathbb{C}$.
Example 22.2. The conjugate function is defined by
$$ f(z) = \bar z. \tag{22.5} $$
Example 22.3. Since any complex number $z$ can be written in the form $z = x + iy$, many
explicit examples can be written down using (22.3), such as
$$ f(x + iy) = e^y\sin(x - y) + ix^3\cos(y). \tag{22.6} $$

In a similar fashion to Notes 3 and 4 we can talk about the existence of a limit, and
continuity:

The statement
$$ \lim_{z\to a} f(z) = l \tag{22.7} $$
means that for every (real) number $\varepsilon > 0$, there is a (real) number $\delta > 0$,
such that, for all $z$, if $0 < |z - a| < \delta$, then $|f(z) - l| < \varepsilon$.

While formally the same, the geometric interpretation is different; see Fig. 22.1. The
function $f(z)$ has the limit $l$ as $z$ approaches $a$ if $f(z)$ can be made to lie in a circle of
radius $\varepsilon$ around $l$ in the complex plane $\mathbb{C}$, by restricting $z$ to lie within a circle of radius $\delta$ around
$a$ in the complex plane $\mathbb{C}$.
Moreover a function $f(z)$ is continuous at $a$ if $\lim_{z\to a} f(z) = f(a)$, and continuous if
$f$ is continuous at every $a$ in the domain.
Remark 22.1. Compare this to the discussion of limits and continuity for functions of
two variables in Note 6!
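In much the same way as in Notes 3 and 4, one can prove that
$$ \lim_{z\to a} z = a \tag{22.8} $$
$$ \lim_{z\to a}\bigl(f(z) + g(z)\bigr) = \lim_{z\to a} f(z) + \lim_{z\to a} g(z) \tag{22.9} $$
$$ \lim_{z\to a} f(z)\cdot g(z) = \lim_{z\to a} f(z)\cdot\lim_{z\to a} g(z) \tag{22.10} $$
$$ \lim_{z\to a}\frac{f(z)}{g(z)} = \frac{\lim_{z\to a} f(z)}{\lim_{z\to a} g(z)}, \quad \text{if } \lim_{z\to a} g(z) \neq 0. \tag{22.11} $$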

The story is however more delicate, but eventually more beautiful, for differentiability.
We say that a function $f(z)$ is differentiable at $a$ if the limit
$$ f'(a) = \lim_{z\to a}\frac{f(z) - f(a)}{z - a} \tag{22.12} $$
exists, in which case the limit on the right is denoted by $f'(a)$ on the left. While the
familiar rules of differentiation from Note 7 can be verified for rational functions, in
particular for
$$ f(z) = z^n, \qquad f'(z) = nz^{n-1} \tag{22.13} $$
there are many perplexing examples of functions which are simply not differentiable.

Figure 22.1.: Definition of limits of complex functions.

Example 22.4. Consider the conjugate function $f(z) = \bar z$, which we can also write in the
form
$$ f(x + iy) = x - iy. \tag{22.14} $$
Note that for $z = x + iy$ with $y = 0$, the difference quotient at $a = 0$ reads
$$ \frac{f(z) - f(0)}{z - 0} = \frac{x - iy}{x + iy} = \frac{x}{x} = 1 \tag{22.15} $$
but the same expression restricted to $z = x + iy$ with $x = 0$ reads
$$ \frac{f(z) - f(0)}{z - 0} = \frac{x - iy}{x + iy} = \frac{-iy}{iy} = -1. \tag{22.16} $$
Thus no matter how small we take $0 < |z|$, the difference quotient does not approach any limit $l$.
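The direction dependence in Example 22.4 is easy to observe numerically with Python's built-in complex numbers. The sketch below uses hypothetical helper names (`difference_quotient`, `directional_quotients`) and contrasts the conjugate function with $f(z) = z^2$, for which both directions give approximately $2a$.

```python
def difference_quotient(f, a, z):
    """(f(z) - f(a)) / (z - a) for complex a != z."""
    return (f(z) - f(a)) / (z - a)


def conjugate(z):
    return z.conjugate()


def square(z):
    return z * z


def directional_quotients(f, a, t):
    """Difference quotients at a for z = a + t (horizontal) and z = a + i*t (vertical)."""
    return difference_quotient(f, a, a + t), difference_quotient(f, a, a + 1j * t)


if __name__ == "__main__":
    for t in (1e-1, 1e-3, 1e-6):
        # conjugate: horizontal quotient stays 1, vertical quotient stays -1
        print(t, directional_quotients(conjugate, 0j, t))
        # square: both quotients approach 2a = 2 + 2i
        print(t, directional_quotients(square, 1 + 1j, t))
```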

Nonetheless the most important differentiable functions can be defined by means of
infinite series.


22.2. Complex power series


The complex-valued function (22.1) is the most important example of a complex power
series:
$$ \exp(z) = e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \frac{z^3}{3!} + \dots \tag{22.17} $$
For real numbers $z = x$ this is an identity with the exponential function $e^x$ defined as in
Note 13:
$$ e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \dots \tag{22.18} $$
The precise interpretation of the expression $1 + x + x^2/2! + \dots$, and the explanation for
why this is an identity, is the subject of Module VIII and ??.
In fact, the idea for the interpretation of the right hand side of (22.17) is as follows:
For a given complex number $z$, the partial sums
$$ s_n(z) = \sum_{j=0}^{n}\frac{z^j}{j!} = 1 + z + \dots + \frac{z^n}{n!} \tag{22.19} $$
form a sequence of complex numbers. In Note 25 we will discuss the notion of
convergence of a sequence of numbers: Informally speaking, a sequence $\{a_n\}$ of complex
numbers has a limit $l$, if for any small circle drawn around $l$ in the complex plane, the
sequence is eventually, namely for large $n$, contained in that circle. We then write
$$ \lim_{n\to\infty} a_n = l. \tag{22.20} $$
It turns out that the sequence of numbers $\{s_n(z)\}$ of (22.19) converges for any complex
number $z$, and we denote the limit by
$$ \sum_{j=0}^{\infty}\frac{z^j}{j!} = \lim_{n\to\infty} s_n(z). \tag{22.21} $$
In Note ?? we will give a formal definition of convergent series along these lines.
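To get a feeling for the convergence of the partial sums (22.19), one can compute them for a few complex numbers and compare with a library implementation of the exponential. The function name `exp_partial_sum` is an assumption of this sketch; the comparison against `cmath.exp` is a numerical illustration, not a proof of convergence.

```python
import cmath


def exp_partial_sum(z, n):
    """Partial sum s_n(z) = sum_{j=0}^{n} z^j / j! from (22.19)."""
    term = 1 + 0j  # z^0 / 0!
    total = term
    for j in range(1, n + 1):
        term *= z / j  # turns z^(j-1)/(j-1)! into z^j/j!
        total += term
    return total


if __name__ == "__main__":
    z = 1 + 2j
    for n in (2, 5, 10, 30):
        # the distance to exp(z) shrinks rapidly as n grows
        print(n, abs(exp_partial_sum(z, n) - cmath.exp(z)))
```

Updating the term recursively avoids computing large powers and factorials separately.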
More generally, a complex power series is a series of the form
$$ f(z) = \sum_{n=0}^{\infty} a_n(z - a)^n = a_0 + a_1(z - a) + a_2(z - a)^2 + \dots \tag{22.22} $$
where $a, a_n \in \mathbb{C}$, and the power series centered at 0 reads
$$ f(z) = \sum_{n=0}^{\infty} a_n z^n = a_0 + a_1 z + a_2 z^2 + \dots \tag{22.23} $$
An important theorem about power series is that if the series $f(z_0)$ converges, then $f(z)$ converges
for any $|z| < |z_0|$. This means, geometrically, that if the power series converges for some
$z_0$, then it does so inside the entire circle of radius $|z_0|$. In fact, for power series of the
form (22.23) one of the following three possibilities must be true:


1. $\sum_{n=0}^{\infty} a_n z^n$ converges only for $z = 0$.

2. $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely for all $z \in \mathbb{C}$.

3. There is a number $R > 0$ such that $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely if $|z| < R$ and
diverges for $|z| > R$.

(We say a series $\sum_n a_n$ “converges absolutely” if $\sum_n |a_n|$ converges.)

The number $R$ is called the radius of convergence of the power series.² Inside the
circle of convergence a power series defines a differentiable function. In view of the
examples we have given in the previous section, this shows that power series provide a
large class of differentiable functions.
In fact, if the power series (22.23) has radius of convergence $R > 0$, then $f$ is
differentiable inside the circle of convergence $|z| < R$, and
$$ f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1}. \tag{22.24} $$
It follows that a power series is infinitely differentiable and continuous inside the circle of
convergence.
As we have seen, power series provide a way to define the complex exponential $\exp(z)$
as in (22.17). Then by the above mentioned results for power series, $\exp'(z) = \exp(z)$. We
compute in particular
$$ \begin{aligned} e^{iz} &= 1 + iz + \frac{(iz)^2}{2!} + \frac{(iz)^3}{3!} + \frac{(iz)^4}{4!} + \frac{(iz)^5}{5!} + \dots \\ &= 1 + iz - \frac{z^2}{2!} - i\frac{z^3}{3!} + \frac{z^4}{4!} + i\frac{z^5}{5!} + \dots \\ &= \Bigl(1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots\Bigr) + i\Bigl(z - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots\Bigr) \\ &= \cos(z) + i\sin(z) \end{aligned} \tag{22.25} $$
where we have defined the complex functions $\cos(z)$ and $\sin(z)$ by:
$$ \sin(z) = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots \tag{22.26} $$
$$ \cos(z) = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots \tag{22.27} $$
Then also $\sin'(z) = \cos(z)$ and $\cos'(z) = -\sin(z)$.
As for the exponential function in (22.18), the formulas (22.26) and (22.27) are identities
for real numbers $z = x$. This will be proven in Modules VIII and ??. In other words, the
definitions (22.26) and (22.27) extend statements for trigonometric functions and their
Taylor series to the domain of complex functions.

² Inside the circle of convergence the power series converges absolutely, but outside it diverges. What
happens on the circle is a more difficult question, and there are examples of series which converge on
this circle, and others which do not.
It is also clear from the definitions, namely the power series, that
$$ \sin(-z) = -\sin(z) \tag{22.28} $$
$$ \cos(-z) = \cos(z) \tag{22.29} $$
and so we also have
$$ e^{-iz} = \cos(z) - i\sin(z). \tag{22.30} $$
From these formulas we obtain that
$$ \cos(z) = \frac12\bigl(e^{iz} + e^{-iz}\bigr) \tag{22.31} $$
$$ \sin(z) = \frac{1}{2i}\bigl(e^{iz} - e^{-iz}\bigr) \tag{22.32} $$
Note that for real $z$ this agrees with our discussion in Note 21.
From this point of view the exponential function plays a central role in the definition
of all elementary functions. We also obtain the famous relation
$$ e^{i\pi} = -1, \tag{22.33} $$
from (22.25) with $z = \pi$, and more generally that $e^{2\pi i/n}$ is an $n$th root of 1.
Recall that for real numbers $z$, the values of $\sin(z)$ always lie between $-1$ and $1$.
However for complex $z$ this is not true at all: Take $z = iy$, for any real $y$, then
$$ \sin(iy) = \frac{e^{i(iy)} - e^{-i(iy)}}{2i} = \frac{e^{-y} - e^{y}}{2i} = i\sinh(y), \tag{22.34} $$
which is unbounded.
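Both Euler's formula for complex arguments and the growth of the sine along the imaginary axis can be observed with Python's standard `cmath` module. The helper names `euler_gap` and `sin_on_imaginary_axis` below are made up for this sketch; the checks hold only up to floating point rounding.

```python
import cmath
import math


def euler_gap(z):
    """|e^{iz} - (cos z + i sin z)|, which should vanish up to rounding, cf. (22.25)."""
    return abs(cmath.exp(1j * z) - (cmath.cos(z) + 1j * cmath.sin(z)))


def sin_on_imaginary_axis(y):
    """Return the pair (sin(iy), i*sinh(y)) for real y, cf. (22.34)."""
    return cmath.sin(1j * y), 1j * math.sinh(y)


if __name__ == "__main__":
    print(euler_gap(0.3 + 0.7j))
    print(cmath.exp(1j * math.pi))  # close to -1, cf. (22.33)
    for y in (0.5, 2.0, 10.0):
        lhs, rhs = sin_on_imaginary_axis(y)
        print(y, lhs, rhs, abs(lhs))  # |sin(iy)| grows like e^y / 2
```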

It may seem that the complex functions we have considered are very special, but this
is not quite true. The basic theorems of complex analysis show:

If a complex function is defined in some region $A$ of the complex plane and
is differentiable in $A$, then it is automatically infinitely differentiable in $A$.
Moreover, for each point $a$ in $A$ the function $f$ can be represented as a power
series inside a circle centered at the point $a$.

The power series that are relevant for this statement are Taylor series which are the
topic of Module VIII and ??.

22.3. Applications to differentiation and integration


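Let now $\lambda \in \mathbb{C}$ be a complex number, and consider
$$ f(x) = \exp(\lambda x). \tag{22.35} $$
We have seen above that
$$ f'(x) = \bigl(\exp(\lambda x)\bigr)' = \Bigl(1 + \lambda x + \frac12(\lambda x)^2 + \frac{1}{3!}(\lambda x)^3 + \dots\Bigr)' = \lambda + \lambda(\lambda x) + \frac{\lambda}{2}(\lambda x)^2 + \dots = \lambda\exp(\lambda x) = \lambda f(x) \tag{22.36} $$
or simply, for $\lambda \neq 0$,
$$ \Bigl(\frac{1}{\lambda}e^{\lambda x}\Bigr)' = e^{\lambda x} \tag{22.37} $$
and
$$ \int e^{\lambda x}\,dx = \frac{1}{\lambda}e^{\lambda x}. \tag{22.38} $$
The statement here is that $\lambda^{-1}\exp(\lambda x)$ is a primitive of $\exp(\lambda x)$. Here the notions of
continuity and differentiability are to be understood for complex valued functions of a
real variable.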
Exercise 22.1. A complex valued function of a real variable can always be written as
$$ f(x) = u(x) + iv(x). \tag{22.39} $$
Adapt the definition of continuity given in (22.7) to the case of a complex-valued function
of a real variable, and show that $f$ is continuous in that sense if and only if $u$ and $v$ are
continuous.
Exercise 22.2. Similarly, show that $f$ is differentiable if and only if $u$ and $v$ are
differentiable, and
$$ f'(x) = u'(x) + iv'(x). \tag{22.40} $$
Use this to give an alternative proof of (22.38) using Euler’s identity (21.10).
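As an application recall the formula (21.20). We obtain
$$ \int e^{\alpha x}\cos(\beta x)\,dx = \operatorname{Re}\int e^{(\alpha+i\beta)x}\,dx = \operatorname{Re}\Bigl(\frac{1}{\alpha+i\beta}e^{(\alpha+i\beta)x}\Bigr) = \frac{1}{\alpha^2+\beta^2}\Bigl(\alpha e^{\alpha x}\cos(\beta x) + \beta e^{\alpha x}\sin(\beta x)\Bigr) \tag{22.41} $$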
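Formula (22.41) can be cross-checked numerically. The sketch below compares the real part of the complex primitive with the closed form on the right of (22.41), and compares both with a composite Simpson approximation of the definite integral over $[0, x]$. The function names and the sample values $\alpha = -0.5$, $\beta = 2$ are arbitrary choices for this illustration.

```python
import cmath
import math


def antiderivative(alpha, beta, x):
    """Real part of e^{(alpha + i beta) x} / (alpha + i beta)."""
    lam = complex(alpha, beta)
    return (cmath.exp(lam * x) / lam).real


def closed_form(alpha, beta, x):
    """Right hand side of (22.41)."""
    return (
        math.exp(alpha * x)
        * (alpha * math.cos(beta * x) + beta * math.sin(beta * x))
        / (alpha ** 2 + beta ** 2)
    )


def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    alpha, beta, x = -0.5, 2.0, 1.3
    numeric = simpson(lambda t: math.exp(alpha * t) * math.cos(beta * t), 0.0, x)
    exact = closed_form(alpha, beta, x) - closed_form(alpha, beta, 0.0)
    print(numeric, exact, antiderivative(alpha, beta, x) - antiderivative(alpha, beta, 0.0))
```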

Exercise 22.3. Derive a similar formula from (21.21).

Additional Problems
1. Show that every complex number of absolute value 1 can be written as $e^{iy}$ for some
real number $y$.

2. For all complex numbers $z$ and $w$ it holds that $e^{z+w} = e^z e^w$. It also holds that
$\sin(z + w) = \sin(z)\cos(w) + \cos(z)\sin(w)$. Interpret these statements in terms of
series. How could we prove these?

3. Prove that $|e^{x+iy}| = e^x$ for real $x$ and $y$.

4. a) Prove that exp takes on every complex value except 0.
b) Prove that sin takes on every complex value.

5. Show that exp is not one-to-one on $\mathbb{C}$. Given $w \neq 0$ find $z$ so that $e^z = w$.

6. Prove that
$$ \sum_{k=1}^{n} e^{ikx} = e^{ix}\,\frac{1 - e^{inx}}{1 - e^{ix}} = \frac{\sin(nx/2)}{\sin(x/2)}\,e^{i(n+1)x/2}. $$

7. Use (21.9) to show
$$ \cos^2(\theta) = \frac12\bigl(1 + \cos(2\theta)\bigr) \tag{22.42} $$
Derive a similar formula for $\sin^2(\theta)$.

8. Prove that every trigonometric sum of the form
$$ S_n(x) = \frac12 a_0 + \sum_{k=1}^{n}\bigl(a_k\cos(kx) + b_k\sin(kx)\bigr) \tag{22.43} $$
can be expressed as a sum of complex exponentials,
$$ S_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}, \tag{22.44} $$
where $c_k = \frac12(a_k - ib_k)$ for $k = 1, 2, \dots, n$.

9. a) If $m$ and $n$ are integers, prove that
$$ \int_0^{2\pi} e^{inx}e^{-imx}\,dx = \begin{cases} 0 & m \neq n \\ 2\pi & m = n. \end{cases} \tag{22.45} $$
b) Use part (a) to deduce the following orthogonality relation: If $m$ and $n$ are
integers with $m^2 \neq n^2$, then
$$ \int_0^{2\pi}\sin(nx)\cos(mx)\,dx = 0. \tag{22.46} $$

Additional: Catenary problem
In this note we want to solve the

Catenary problem: What shape does a flexible rope¹ of uniform density take under its
own weight when suspended between two points?

The rope is in a static equilibrium, meaning that at every point on the rope the total
force acting on it vanishes. The rope lies in a plane and we can choose coordinates $(x, y)$
so that the $y$-axis aligns with the gravitational force and the lowest point on the rope
falls in the origin $(0, 0)$. At every point $(x_0, y_0)$ on the rope there is a force $F(x_0, y_0) \in \mathbb{R}^2$
corresponding to the pull of the rope segment with $x > x_0$. This force is not yet known
to us, but we know it is always tangential to the rope at $(x_0, y_0)$. Now consider the lowest
point where the “pull from the rope segment on the left” is balanced by the “pull from
the rope segment on the right”:
$$ (-F_0, 0) + F(0, 0) = 0 \tag{22.1} $$
for some $F_0 > 0$, representing the force exerted by the rope segment with $x < 0$.
Now consider some point $(x_0, y_0)$ on the rope on the right with $x_0 > 0$; cf. Figure 22.1.
The horizontal component of the force exerted by the rope segment $x < x_0$ is still $F_0$,
but in addition there is a vertical component related to the weight of the rope segment
between $0 < x < x_0$, which is proportional to the total mass of this segment given by $\rho s$,
where $\rho$ is the density (mass per unit length) and $s$ is the arc length of the rope segment
from the lowest point. Therefore
$$ (-F_0, -\rho s(x_0, y_0)g) + F(x_0, y_0) = 0 \tag{22.2} $$
where $g$ is the gravitational acceleration, and by writing $F$ in the form $F = |F|(\cos\theta, \sin\theta)$,
where $|F|$ and $\theta$ are functions of $(x_0, y_0)$, we obtain two equations
$$ |F|\cos\theta = F_0, \qquad |F|\sin\theta = \rho s g. \tag{22.3} $$

Let us now view the rope as a graph $y = y(x)$ over the $x$-axis. The arc length $s(x_0, y_0)$
of the rope segment from $(0, 0)$ to $(x_0, y_0)$ is then given by the length $s(x_0)$ of the curve
$x \mapsto (x, y(x))$, with $x \in [0, x_0]$. The length of this curve is given by
$$ s(x_0) = \int_0^{x_0}\sqrt{1 + \Bigl(\frac{dy}{dx}(x)\Bigr)^2}\,dx \tag{22.4} $$
where $m(x) = y'(x) = \frac{dy}{dx}(x)$ is the slope of the graph at $(x, y(x))$. Since $(\cos(\theta(x)), \sin(\theta(x)))$
is the unit tangent vector to the curve at $(x, y(x))$, with the angle $\theta$ as introduced above,
it follows from (22.3) that
$$ m(x) = \tan(\theta(x)) = \frac{\rho s(x) g}{F_0} \tag{22.5} $$

¹ The word catenary comes from the Latin word for chain, but I find it more pleasant to think about
ropes. The problem was first solved in the 1690s, independently by Leibniz, Huygens, and Bernoulli.

Figure 22.1.: Catenary problem.
In view of (22.4) we thus obtain the following ODE for the slope $m(x)$:
$$ m'(x) = \frac{\rho g}{F_0}s'(x) = \frac{\rho g}{F_0}\sqrt{1 + m(x)^2} \tag{22.6} $$
In Note 16 we have learned that this is an example of a separable differential equation,
which can be solved by writing
$$ \int\frac{dm}{\sqrt{1 + m^2}} = a\int dx \tag{22.7} $$
where for short $a = \rho g/F_0$. It follows from (20.26) that
$$ \operatorname{arsinh}(m(x)) = ax \tag{22.8} $$
where we used that $m(0) = 0$. Hence
$$ \frac{dy}{dx}(x) = m(x) = \sinh(ax) \tag{22.9} $$
which we can easily integrate:
$$ y(x) = \int_0^x \sinh(ax')\,dx' = \frac{\cosh(ax) - 1}{a} \tag{22.10} $$
where we used $y(0) = 0$.
This is the solution to the catenary problem, and obviously finds applications in civil
engineering, although the assumption of uniform density is rather idealized.
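The derivation can be checked for consistency numerically: for the curve (22.10), the arc length integral (22.4) should reproduce the relation $m(x) = a\,s(x)$ that follows from (22.5) with $a = \rho g/F_0$. The sketch below approximates the arc length by a midpoint rule; the function names and the sample values of $a$ and $x_0$ are assumptions of this illustration.

```python
import math


def catenary(x, a):
    """Height y(x) = (cosh(a x) - 1) / a from (22.10)."""
    return (math.cosh(a * x) - 1.0) / a


def slope(x, a):
    """Slope m(x) = sinh(a x) from (22.9)."""
    return math.sinh(a * x)


def arc_length(x0, a, n=2000):
    """Midpoint-rule approximation of the arc length integral (22.4) on [0, x0]."""
    h = x0 / n
    return h * sum(math.sqrt(1.0 + slope((i + 0.5) * h, a) ** 2) for i in range(n))


if __name__ == "__main__":
    a, x0 = 0.7, 2.0
    s = arc_length(x0, a)
    # both numbers should be close to sinh(a*x0)
    print(a * s, slope(x0, a))
```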
Exercise 22.1. Find the area under the catenary curve from x = 0 to x = 1/a.

Additional: Variation of constants
In this note we show how to obtain a particular solution to (21.25) by variation of
constants.¹
More precisely, let $f_1$ and $f_2$ be two linearly independent solutions of the homogeneous
equation as discussed in Section 21.1. We are then looking for a particular solution to
the inhomogeneous problem of the form
$$ g(x) = c_1(x)f_1(x) + c_2(x)f_2(x). \tag{22.1} $$
We compute
$$ g'(x) = c_1(x)f_1'(x) + c_2(x)f_2'(x) + c_1'(x)f_1(x) + c_2'(x)f_2(x) \tag{22.2} $$
$$ g''(x) = c_1(x)f_1''(x) + c_2(x)f_2''(x) + c_1'(x)f_1'(x) + c_2'(x)f_2'(x) + \bigl(c_1'(x)f_1(x) + c_2'(x)f_2(x)\bigr)' \tag{22.3} $$
Therefore, using the fact that $f_1(x)$ and $f_2(x)$ are solutions to the homogeneous equation,
$$ g''(x) + ag'(x) + bg(x) = c_1'(x)f_1'(x) + c_2'(x)f_2'(x) + \bigl(c_1'(x)f_1(x) + c_2'(x)f_2(x)\bigr)' + a\bigl(c_1'(x)f_1(x) + c_2'(x)f_2(x)\bigr) \tag{22.4} $$
So if $c_1(x)$ and $c_2(x)$ can be chosen such that
$$ c_1'(x)f_1(x) + c_2'(x)f_2(x) = 0 \tag{22.5} $$
then $g(x)$ is a solution to (21.25) provided also
$$ c_1'(x)f_1'(x) + c_2'(x)f_2'(x) = r(x) \tag{22.6} $$

The point is that (22.5) and (22.6) are a linear system of equations for $c_1'(x)$ and $c_2'(x)$
which is always solvable:
$$ \begin{pmatrix} f_1(x) & f_2(x) \\ f_1'(x) & f_2'(x) \end{pmatrix}\begin{pmatrix} c_1'(x) \\ c_2'(x) \end{pmatrix} = \begin{pmatrix} 0 \\ r(x) \end{pmatrix} \tag{22.7} $$
The determinant of this matrix is precisely the Wronskian $W(f_1, f_2)(x)$, and we have
shown in Theorem 18.1 that $W(f_1, f_2)(x)$ never vanishes. Therefore we can always solve
this linear system and obtain
$$ c_1'(x) = -\frac{f_2(x)r(x)}{W(f_1, f_2)(x)} \tag{22.8} $$
$$ c_2'(x) = \frac{f_1(x)r(x)}{W(f_1, f_2)(x)}. \tag{22.9} $$

¹ This idea was first used by Bernoulli in 1697 to solve linear equations of first order, and then by
Lagrange in 1774 to solve linear equations of second order.
Theorem 22.1. Let $(f_1, f_2)$ be a fundamental system for the homogeneous equation
(21.12). Then a particular solution to the inhomogeneous equation (21.25) is given by the
formula (22.1),
$$ y_1(x) = c_1(x)f_1(x) + c_2(x)f_2(x) \tag{22.10} $$
where $c_1$ and $c_2$ are primitives of the following functions:
$$ c_1 = -\int\frac{f_2(x)r(x)}{W(f_1, f_2)(x)}\,dx, \qquad c_2 = \int\frac{f_1(x)r(x)}{W(f_1, f_2)(x)}\,dx. \tag{22.11} $$
Example 22.1. Let us determine the general solution of the equation
$$ y'' + y = \sin(2x) \tag{22.12} $$
A fundamental system of solutions for the homogeneous equation is given by
$$ f_1(x) = \sin(x), \qquad f_2(x) = \cos(x), \tag{22.13} $$
which has the Wronskian $W(x) = -1$, and the two primitives (22.11) are
$$ c_1 = \int\cos(x)\sin(2x)\,dx = 2\int\cos^2(x)\sin(x)\,dx = -2\cos^3(x)/3 \tag{22.14} $$
$$ c_2 = -\int\sin(x)\sin(2x)\,dx = -2\int\sin^2(x)\cos(x)\,dx = -2\sin^3(x)/3 \tag{22.15} $$
which shows that a particular solution to (22.12) is given by
$$ y(x) = -(2/3)\cos^3(x)\sin(x) - (2/3)\sin^3(x)\cos(x), \tag{22.16} $$
and the general solution is given by
$$ y(x) = a_1\sin(x) + a_2\cos(x) - (2/3)\cos^3(x)\sin(x) - (2/3)\sin^3(x)\cos(x), \tag{22.17} $$
where $a_1$ and $a_2$ are constants.
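The particular solution (22.16) simplifies: factoring out $\sin(x)\cos(x)$ and using $\cos^2 + \sin^2 = 1$ gives $-\tfrac13\sin(2x)$. The following sketch checks this simplification and the differential equation (22.12) numerically; the function names are invented for the illustration and the second derivative is approximated by a central difference.

```python
import math


def y_particular(x):
    """Particular solution (22.16) as produced by variation of constants."""
    return (
        -(2.0 / 3.0) * math.cos(x) ** 3 * math.sin(x)
        - (2.0 / 3.0) * math.sin(x) ** 3 * math.cos(x)
    )


def y_simplified(x):
    """The same function written as -sin(2x)/3."""
    return -math.sin(2.0 * x) / 3.0


def residual(y, x, h=1e-3):
    """Finite-difference estimate of y'' + y - sin(2x)."""
    second = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return second + y(x) - math.sin(2.0 * x)


if __name__ == "__main__":
    for x in (-1.0, 0.4, 2.5):
        print(x, y_particular(x) - y_simplified(x), residual(y_particular, x))
```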

Problems
1. Find the general solution of the equation
$$ y'' + y = \tan(x) \tag{22.18} $$
on the interval $(-\pi/2, \pi/2)$.
2. Derive the formula (21.45) by the method of variation of constants.

Module VIII.

Taylor polynomials

Note 23.
Approximation by polynomial functions
In the previous lectures we have looked at differential equations with constant coefficients,
and we have found several special functions as solutions. These functions are often given
implicitly and “in practice” it would not be easy at all to compute their values.
Example 23.1. For example to compute values of the exponential function, we would first
have to approximate
$$ \log(x) = \int_1^x\frac{1}{t}\,dt \tag{23.1} $$
by some upper and lower sums, and finding $e^x = \log^{-1}(x)$ would involve computing $\log(a)$
for many values of $a$, until $\log(a)$ is close to $x$, and $a$ would then be an approximation
for $e^x$.
In this lecture we will learn a way to approximate functions by polynomials,
$$ p(x) = a_0 + a_1 x + \dots + a_n x^n, \tag{23.2} $$
in the sense that the derivatives of $p(x)$ at a point $a$ agree with those of the given function
up to a given order. Such polynomials are called Taylor polynomials.
Remark 23.1. There are of course other methods to approximate functions by polynomials.
One could for example try to find a polynomial of degree n that passes through n+1 given
points on the graph of the function. Or one could attempt to make the area between the
function and the polynomial as small as possible. Or one could approximate a function
uniformly by polynomials, as it is done in the “Weierstrass approximation theorem.”
First note that all coefficients in (23.2) can be expressed in terms of the values of $p$
and its derivatives at 0: $p(0) = a_0$, and $p'(0) = a_1$, and more generally
$$ p^{(k)}(0) = k!\,a_k \tag{23.3} $$
If we had begun with a polynomial in $(x - a)$, namely
$$ p(x) = a_0 + a_1(x - a) + \dots + a_n(x - a)^n \tag{23.4} $$
then the same argument shows that
$$ a_k = \frac{p^{(k)}(a)}{k!}. \tag{23.5} $$


Definition 23.1. Suppose $f$ is a function which is $n$-times differentiable at $a$. Then
$$ P_{n,a}[f](x) = a_0 + a_1(x - a) + a_2(x - a)^2 + \dots + a_n(x - a)^n, \tag{23.6} $$
$$ \text{where } a_k = \frac{f^{(k)}(a)}{k!} \quad (0 \le k \le n) \tag{23.7} $$
is the Taylor polynomial of degree $n$ for $f$ at $a$. (We usually drop $[f]$ from the
notation.)

The Taylor polynomials of the most important elementary functions are extremely
simple.
Example 23.2. The Taylor polynomial for $\sin(x)$ at 0 is, to order $2n + 1$,
$$ P_{2n+1,0}(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots + (-1)^n\frac{x^{2n+1}}{(2n+1)!} \tag{23.8} $$
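A quick numerical experiment shows how well these polynomials approximate the sine. The sketch below evaluates $P_{2n+1,0}$ directly from (23.8); the function name `sin_taylor` is an assumption of this example, and the comparison with `math.sin` only illustrates the quality of the approximation at the sample points chosen.

```python
import math


def sin_taylor(x, n):
    """Taylor polynomial P_{2n+1,0}(x) of sin at 0, cf. (23.8)."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        for k in range(n + 1)
    )


if __name__ == "__main__":
    for n in (0, 1, 3, 10):
        # error at x = 0.5 and at the larger value x = 3.0
        print(n, abs(sin_taylor(0.5, n) - math.sin(0.5)), abs(sin_taylor(3.0, n) - math.sin(3.0)))
```

Close to 0 already a low degree suffices, while further away from 0 a higher degree is needed for the same accuracy.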

Example 23.3. The Taylor polynomial of $\log(x)$ at $a = 1$ is
$$ P_{n,1}(x) = (x - 1) - \frac12(x - 1)^2 + \frac13(x - 1)^3 - \dots + \frac{(-1)^{n-1}}{n}(x - 1)^n \tag{23.9} $$
Exercise 23.1. For the example of the logarithm it is more convenient to consider the
function $f(x) = \log(1 + x)$. In this case compute the Taylor polynomial of $f$ at $a = 0$.
Example 23.4. Let us compute the Taylor polynomial for $\arctan(x)$ at $a = 0$. Since
$$ \arctan'(x) = \frac{1}{1 + x^2}, \tag{23.10} $$
we see that $\arctan'(0) = 1$ and $\arctan''(0) = 0$, so
$$ P_{2,0}(x) = x. \tag{23.11} $$
However to compute the Taylor polynomials to higher order appears to be a laborious
exercise. We will return to this example when we have learned more about the properties
of Taylor polynomials.
Let us return to the Taylor polynomial in general form, first of degree 1, given by
$$ P_{1,a}[f](x) = f(a) + f'(a)(x - a). \tag{23.12} $$
We see that by the very definition of the derivative,
$$ \frac{f(x) - P_{1,a}(x)}{x - a} = \frac{f(x) - f(a)}{x - a} - f'(a) \to 0 \quad (x \to a), \tag{23.13} $$
which tells us that $f(x) - P_{1,a}[f](x)$ not only becomes small as $x \to a$, but becomes small
compared to $(x - a)$ as $x \to a$.


Example 23.5. Consider the exponential function $f(x) = e^x$, and its Taylor polynomial of
degree 1,
$$ P_{1,0}(x) = 1 + x. \tag{23.14} $$
While this is a good approximation as $x$ approaches 0, it appears that
$$ P_{2,0}(x) = 1 + x + \frac12 x^2 \tag{23.15} $$
gives an even better approximation. Indeed by L’Hôpital’s rule,
$$ \lim_{x\to 0}\frac{e^x - 1 - x - \frac12 x^2}{x^2} = \lim_{x\to 0}\frac{e^x - 1 - x}{2x} = \lim_{x\to 0}\frac{e^x - 1}{2} = 0. \tag{23.16} $$
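The limit (23.16) and the corresponding statement (23.13) for the first degree polynomial can be observed numerically. In the sketch below the helper names are made up, and `math.expm1` (which computes $e^x - 1$ accurately for small $x$) is used to limit cancellation errors; the printed ratios behave roughly like $x/2$ and $x/6$ respectively.

```python
import math


def ratio_first_order(x):
    """(e^x - P_{1,0}(x)) / x with P_{1,0}(x) = 1 + x."""
    return (math.expm1(x) - x) / x


def ratio_second_order(x):
    """(e^x - P_{2,0}(x)) / x^2 with P_{2,0}(x) = 1 + x + x^2/2."""
    return (math.expm1(x) - x - 0.5 * x * x) / (x * x)


if __name__ == "__main__":
    for x in (1e-1, 1e-2, 1e-3):
        # both ratios tend to 0 as x tends to 0
        print(x, ratio_first_order(x), ratio_second_order(x))
```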
Theorem 23.1. Suppose $f$ is a function for which $f'(a), \dots, f^{(n)}(a)$ all exist. Then
$$ \lim_{x\to a}\frac{f(x) - P_{n,a}(x)}{(x - a)^n} = 0. \tag{23.17} $$
Proof. This is also a consequence of L’Hôpital’s rule.

We also say that “$f(x)$ equals $P_{n,a}(x)$ up to order $n$ at $a$.” We have designed $P_{n,a}$ to
have that property, but it turns out that the Taylor polynomial is the only polynomial
with this property.
Theorem 23.2. Let $p(x)$ and $q(x)$ be two polynomials in $(x - a)$, of degree at most $n$,
which are equal up to order $n$ at $a$ in the sense that
$$ \lim_{x\to a}\frac{p(x) - q(x)}{(x - a)^n} = 0. \tag{23.18} $$
Then $p = q$.
Proof. The difference $r = p - q$ is a polynomial of degree at most $n$,
$$ r(x) = b_0 + b_1(x - a) + \dots + b_n(x - a)^n \tag{23.19} $$
and by assumption, for all $0 \le i \le n$,
$$ \lim_{x\to a}\frac{r(x)}{(x - a)^i} = 0. \tag{23.20} $$
In the case $i = 0$ this implies $b_0 = 0$, so
$$ r(x) = b_1(x - a) + \dots + b_n(x - a)^n \tag{23.21} $$
and we can use the case $i = 1$ to infer that $b_1 = 0$. Continuing in this way gives
$$ b_0 = b_1 = b_2 = \dots = b_n = 0. $$


Corollary 23.3. Let $f$ be $n$-times differentiable at $a$, and suppose $P$ is a polynomial in
$(x - a)$ that equals $f$ up to order $n$ at $a$. Then $P = P_{n,a}[f]$ is the Taylor polynomial of $f$.

In some situations this insight gives an unexpected way to compute the Taylor
polynomial of a function.
Example 23.6. Let us return to the problem of computing the Taylor polynomial of
\[ \arctan(x) = \int_0^x \frac{1}{1 + t^2}\, dt . \tag{23.22} \]
It is not hard to verify (multiply both sides by $(1 + t^2)$!) that
\[ \frac{1}{1 + t^2} = 1 - t^2 + t^4 - t^6 + \dots + (-1)^n t^{2n} + (-1)^{n+1} \frac{t^{2n+2}}{1 + t^2} . \tag{23.23} \]
Therefore
\[ \arctan(x) = x - \frac{1}{3} x^3 + \frac{1}{5} x^5 - \dots + (-1)^n \frac{x^{2n+1}}{2n + 1} + (-1)^{n+1} \int_0^x \frac{t^{2n+2}}{1 + t^2}\, dt , \tag{23.24} \]
and so the polynomial that appears here is the Taylor polynomial of degree $2n + 1$,
provided we can show that
\[ \lim_{x\to 0} \frac{1}{x^{2n+1}} \int_0^x \frac{t^{2n+2}}{1 + t^2}\, dt = 0 . \tag{23.25} \]
Since
\[ \left| \int_0^x \frac{t^{2n+2}}{1 + t^2}\, dt \right| \le \left| \int_0^x t^{2n+2}\, dt \right| = \frac{|x|^{2n+3}}{2n + 3} \tag{23.26} \]
this is satisfied, and we conclude
\[ P_{2n+1,0}(x) = x - \frac{1}{3} x^3 + \frac{1}{5} x^5 - \dots + (-1)^n \frac{x^{2n+1}}{2n + 1} . \tag{23.27} \]
This last example also shows that for $|x| \le 1$,
\[ |\arctan(x) - P_{2n+1,0}(x)| \le \frac{1}{2n + 3} , \tag{23.28} \]
which means that we can use Taylor polynomials to compute $\arctan(x)$ on this interval as
accurately as we like. We will now turn more generally to the question of how well a Taylor
polynomial $P_{n,a}[f](x)$ approximates $f(x)$, for fixed $x$, and different $n$.
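The estimate from Example 23.6 can be checked numerically. The sketch below evaluates the Taylor polynomial of arctan at 0 and compares the actual error with the bound |x|^(2n+3)/(2n+3) from (23.26). It uses only the Python standard library; the function names and the small tolerance for floating point rounding are assumptions made here for illustration.

```python
import math

def arctan_taylor(x, n):
    """Taylor polynomial of arctan at 0 of degree 2n+1:
    x - x^3/3 + x^5/5 - ... + (-1)^n x^(2n+1)/(2n+1)."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(n + 1))

def remainder_bound(x, n):
    """The bound |x|^(2n+3)/(2n+3) from (23.26)."""
    return abs(x) ** (2 * n + 3) / (2 * n + 3)

for x in (0.5, 1.0, -1.0):
    for n in (1, 10, 100):
        err = abs(math.atan(x) - arctan_taylor(x, n))
        # 1e-12 only absorbs floating point rounding, it is not part of the estimate
        assert err <= remainder_bound(x, n) + 1e-12
        print(x, n, err, remainder_bound(x, n))
```

For x = ±1 the error decays only like 1/n, so many terms are needed there, while for |x| < 1 the convergence is much faster.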

Problems
1. Find the Taylor polynomials $P_{n,a}[f]$ for the following functions $f$:
a) $f(x) = e^{e^x}$, $n = 3$, $a = 0$
b) $f(x) = e^{\sin(x)}$, $n = 3$, $a = 0$
c) $f(x) = \sin(x)$, degree $2n$, $a = \pi/2$
d) $f(x) = \cos(x)$, degree $2n$, $a = \pi$

2. Write each of the following polynomials in $x$ as a polynomial in $(x - 3)$.
a) $x^2 - 4x - 9$
b) $x^5$
Hint: It is only necessary to compute the Taylor polynomial at $a = 3$, of the same
degree as the given polynomial. Why?

3. Consider the equation $x^2 = \cos(x)$, which has precisely two solutions. Use the
Taylor polynomial of the cosine function of degree 3 to show that the solutions are
approximately $\pm\sqrt{2/3}$, and give a bound for the error. Use a fifth degree Taylor
polynomial to get a better approximation.

4. Suppose that $a_i$ and $b_i$ are the coefficients in the Taylor polynomials at $a$ of $f$ and
$g$. Find the coefficients of the Taylor polynomials at $a$ of the following functions in
terms of $a_i$ and $b_i$:
a) $f + g$
b) $f g$
c) $f'$
d) $h(x) = \int_a^x f(t)\, dt$

5. Prove that the Taylor polynomial of $f(x) = \sin(x^2)$ of degree $4n + 2$ at 0 is
\[ P_{4n+2,0}(x) = x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \dots + (-1)^n \frac{x^{4n+2}}{(2n + 1)!} . \tag{23.29} \]
Hint: If $P$ is the Taylor polynomial of degree $2n + 1$ of the sine function at 0, then
$\sin(x) = P(x) + R(x)$, where $\lim_{x\to 0} R(x)/x^{2n+1} = 0$. What does this imply for the
limit of $R(x^2)/x^{4n+2}$?

Note 24.
Taylor’s theorem
In the previous lecture we introduced Taylor polynomials $P_{n,a}$ as polynomial
functions whose derivatives at a point $a$ agree with those of a given function $f$ up to a
certain order $n$. The main theorem in this lecture makes precise in which sense a Taylor
polynomial approximates the values of a given function away from the point $a$.
If $f$ is a function for which $P_{n,a}[f]$ exists, we define the remainder term $R_{n,a}[f]$ by
\[ \begin{aligned} f(x) &= P_{n,a}[f](x) + R_{n,a}[f](x) \\ &= f(a) + f'(a)(x - a) + \dots + \frac{f^{(n)}(a)}{n!} (x - a)^n + R_{n,a}[f](x) . \end{aligned} \tag{24.1} \]
The aim is to derive a formula for the remainder. We can see this might be possible
from the case $n = 0$: By the fundamental theorem of calculus,
\[ R_{0,a}(x) = f(x) - f(a) = \int_a^x f'(t)\, dt . \tag{24.2} \]

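We can proceed by integration by parts to obtain a formula in the case $n = 1$
(here $(t - x)'$ denotes the derivative with respect to $t$):
\[ \begin{aligned} \int_a^x f'(t)\, dt &= \int_a^x f'(t)\, (t - x)'\, dt \\ &= (x - a) f'(a) - \int_a^x f''(t) (t - x)\, dt \end{aligned} \tag{24.3} \]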

which shows that
\[ f(x) = f(a) + f'(a)(x - a) + R_{1,a}(x) , \qquad R_{1,a}(x) = \int_a^x f''(t) (x - t)\, dt . \tag{24.4} \]

Exercise 24.1. Show that
\[ R_{2,a}(x) = \int_a^x \frac{f^{(3)}(t)}{2} (x - t)^2\, dt \tag{24.5} \]
by suitably integrating by parts:
\[ R_{1,a}(x) = \int_a^x f''(t) \Bigl[ -\frac{1}{2} (x - t)^2 \Bigr]'\, dt . \tag{24.6} \]


Proceeding in this way we can prove by induction the integral form of the remainder,
\[ R_{n,a}(x) = \int_a^x \frac{f^{(n+1)}(t)}{n!} (x - t)^n\, dt . \tag{24.7} \]
We get from this the first estimate for the remainder:
Proposition 24.1. Suppose $f$ is $n + 1$ times differentiable on an interval $I$, $a \in I$, and
$|f^{(n+1)}(x)| \le M$ for $x \in I$, with some $M > 0$. Then
\[ |R_{n,a}(x)| \le \frac{M}{(n + 1)!} |x - a|^{n+1} . \tag{24.8} \]

Example 24.1. As an example consider the trigonometric functions,
\[ \sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots + (-1)^n \frac{x^{2n+1}}{(2n + 1)!} + R_{2n+1,0}(x) \tag{24.9} \]
and the remainder term is easy to estimate:
\[ |R_{2n+1,0}(x)| \le \frac{|x|^{2n+2}}{(2n + 2)!} . \tag{24.10} \]
Since $x^n/n!$ can be made arbitrarily small by choosing $n$ large enough, for any $x$ (see
Note 25), this means that $\sin(x)$ can be computed to any degree of accuracy by
evaluating the Taylor polynomial $P_{n,0}(x)$. For example, suppose we wish to compute
$\sin(2)$ with an error of less than $10^{-4}$. Then we just need to choose $n$ so that
\[ \frac{2^{2n+2}}{(2n + 2)!} < 10^{-4} . \tag{24.11} \]
In this case $n = 5$ works, and so
\[ \sin(2) = 2 - \frac{2^3}{3!} + \frac{2^5}{5!} - \frac{2^7}{7!} + \frac{2^9}{9!} - \frac{2^{11}}{11!} + R \tag{24.12} \]
where $|R| < 10^{-4}$.
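The choice of n in Example 24.1 can be reproduced with a few lines of code. The sketch below searches for the smallest n satisfying (24.11) and then compares the resulting Taylor polynomial with the library value of sin(2). It uses only the Python standard library; the function names are hypothetical and chosen for this illustration.

```python
import math

def sin_taylor(x, n):
    """Taylor polynomial of sine at 0 of degree 2n+1, as in (24.9)."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

def remainder_bound(x, n):
    """The bound |x|^(2n+2)/(2n+2)! from (24.10)."""
    return abs(x) ** (2 * n + 2) / math.factorial(2 * n + 2)

def smallest_n(x, tol):
    """Smallest n for which the bound (24.10) is below tol."""
    n = 0
    while remainder_bound(x, n) >= tol:
        n += 1
    return n

n = smallest_n(2.0, 1e-4)
approx = sin_taylor(2.0, n)
print(n, approx, abs(math.sin(2.0) - approx), remainder_bound(2.0, n))
```

Since the bound for n = 4 is 2^10/10!, which is still larger than 10^-4, the search stops at n = 5, in agreement with the example.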
While we will be content with this estimate of the remainder, another version of Taylor’s
theorem gives the statement $f(x) = P_{n,a}(x) + R_{n,a}(x)$ even under the weaker assumption
that $f^{(k)}$ exists for all $0 \le k \le n + 1$, but $f^{(n+1)}$ is not necessarily continuous. Its
proof does not rely on the integral form of the remainder, but instead on a higher order
version of the mean value theorem (see the additional notes to Module VIII).
Theorem 24.2 (Taylor’s theorem with Lagrange remainder). Let $f$ be an $(n + 1)$-times
differentiable function on an interval $(c, d)$, $a \in (c, d)$, and $R_{n,a}(x)$ defined by
\[ f(x) = f(a) + f'(a)(x - a) + \dots + \frac{f^{(n)}(a)}{n!} (x - a)^n + R_{n,a}(x) . \tag{24.13} \]
Then for some $t$ in between $a$ and $x$,
\[ R_{n,a}(x) = \frac{f^{(n+1)}(t)}{(n + 1)!} (x - a)^{n+1} . \tag{24.14} \]


Problems
1. Prove that if $x \le 0$, then the remainder term $R_{n,0}$ for $e^x$ satisfies
\[ |R_{n,0}| \le \frac{|x|^{n+1}}{(n + 1)!} . \tag{24.15} \]

2. Prove that if $-1 < x \le 0$, then the remainder term $R_{n,0}$ for $\log(1 + x)$ satisfies
\[ |R_{n,0}| \le \frac{|x|^{n+1}}{(1 + x)(n + 1)} . \tag{24.16} \]

3. a) Prove that if $f''(a)$ exists, then
\[ f''(a) = \lim_{h\to 0} \frac{f(a + h) + f(a - h) - 2f(a)}{h^2} . \tag{24.17} \]
The limit on the right hand side is called the Schwarz second derivative of
$f$ at $a$. Hint: Use the Taylor polynomial $P_{2,a}(x)$ with $x = a + h$, and with
$x = a - h$.
b) Let $f(x) = x^2$ for $x \ge 0$, and $-x^2$ for $x \le 0$. Show that the Schwarz second
derivative of $f$ at 0 exists, even though $f''(0)$ does not.
c) Prove that if $f'''(a)$ exists, then
\[ \frac{f^{(3)}(a)}{3} = \lim_{h\to 0} \frac{f(a + h) - f(a - h) - 2h f'(a)}{h^3} . \tag{24.18} \]

4. Give another proof for the uniqueness of solutions to the differential equation
$y'' - y = 0$ using Taylor’s theorem. In other words, suppose $f'' - f = 0$ and
$f(0) = f'(0) = 0$, and show that then $f = 0$.

Additional: Taylor’s theorem
Proof of Taylor’s theorem with Lagrange remainder
Recommended Reading

(Spivak, Calculus, Chapter 20)

Real Analysis: Advanced (MAST20033)

As mentioned above the proof of Taylor’s theorem does not rely on the integral form
of the remainder, but instead on a higher order version of the mean value theorem:

Lemma 24.1. Suppose a function $g$ is $n + 1$ times differentiable on an interval $I$, and
for some $a \in I$,
\[ g(a) = g'(a) = \dots = g^{(n)}(a) = 0 . \tag{24.1} \]
Then for any $x \in I$, we have
\[ g(x) = \frac{g^{(n+1)}(t)}{(n + 1)!} (x - a)^{n+1} \tag{24.2} \]
for some $t$ in between $a$ and $x$.

Proof. We can prove this by induction on $n$.

In the case $n = 0$, by the mean value theorem
\[ g(x) = g(x) - g(a) = g'(t)(x - a) \tag{24.3} \]
for some $t$ in between $a$ and $x$.

Now let the statement be valid for some $n = k$. We want to show it holds for $n = k + 1$.
So assume $g$ is $k + 2$ times differentiable, and $g(a) = \dots = g^{(k+1)}(a) = 0$. Now by the
Cauchy mean value theorem (Lecture 9) we get with $h(x) = (x - a)^{k+2}$,
\[ \frac{g(x)}{(x - a)^{k+2}} = \frac{g(x) - g(a)}{h(x) - h(a)} = \frac{g'(t)}{h'(t)} = \frac{g'(t)}{(k + 2)(t - a)^{k+1}} \tag{24.4} \]
for some $t$ in between $a$ and $x$. Now $g'$ itself is $k + 1$ times differentiable, and satisfies
$g'(a) = \dots = (g')^{(k)}(a) = 0$, so by our inductive assumption,
\[ g'(t) = \frac{(g')^{(k+1)}(y)}{(k + 1)!} (t - a)^{k+1} \tag{24.5} \]
for some $y$ in between $a$ and $t$. Therefore
\[ \frac{g(x)}{(x - a)^{k+2}} = \frac{(g')^{(k+1)}(y)}{(k + 2)(k + 1)!} = \frac{g^{(k+2)}(y)}{(k + 2)!} \tag{24.6} \]
which shows the claim for $n = k + 1$.

Proof of Theorem 24.2. The remainder $R_{n,a}(x) = f(x) - P_{n,a}(x)$ satisfies by its very definition,
\[ R_{n,a}(a) = \dots = R_{n,a}^{(n)}(a) = 0 . \tag{24.7} \]
Thus by the above Lemma, for some $t$ in between $a$ and $x$,
\[ R_{n,a}(x) = \frac{(f - P_{n,a})^{(n+1)}(t)}{(n + 1)!} (x - a)^{n+1} = \frac{f^{(n+1)}(t)}{(n + 1)!} (x - a)^{n+1} , \tag{24.8} \]
where we used that $P_{n,a}(x)$ is a polynomial of degree $n$, and hence $P_{n,a}^{(n+1)} = 0$.

Local extrema
Recommended Reading

(Spivak, Calculus, Chapter 20)

As a consequence of Theorem 23.1 the question of local extrema can be answered even in
the indefinite case. Recall that if $a$ is a critical point of $f$, then $f$ has a local minimum
at $a$ if $f''(a) > 0$, and a local maximum if $f''(a) < 0$, but no immediate conclusion can
be drawn if $f''(a) = 0$. It is now clear that in this case $f^{(3)}(a)$ will give the relevant
information, and moreover if also $f^{(3)}(a) = 0$, then the sign of $f^{(4)}(a)$ is significant. More
generally, we can ask what happens when
\[ f'(a) = f''(a) = \dots = f^{(n-1)}(a) = 0 , \tag{24.9a} \]
\[ f^{(n)}(a) \ne 0 . \tag{24.9b} \]

Theorem 24.2. Suppose that a given function $f$ satisfies (24.9a) and (24.9b).

1. If $n$ is even and $f^{(n)}(a) > 0$, then $f$ has a local minimum at $a$.

2. If $n$ is even and $f^{(n)}(a) < 0$, then $f$ has a local maximum at $a$.

3. If $n$ is odd, then $f$ has neither a local maximum nor a local minimum at $a$.
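The case distinction in this theorem can be written out as a small routine. This is only an illustrative sketch: it assumes that the values f'(a), f''(a), ... are known exactly and passed in as a list, and the name `classify_critical_point` and the returned strings are hypothetical choices made here.

```python
def classify_critical_point(derivs):
    """Apply the higher order derivative test.

    derivs[k-1] is assumed to hold the exact value of the k-th derivative
    of f at a, for k = 1, 2, ...  The first nonzero entry determines n.
    """
    for n, value in enumerate(derivs, start=1):
        if value != 0:
            if n % 2 == 1:
                return "neither"          # n odd: no local extremum at a
            if value > 0:
                return "local minimum"    # n even, n-th derivative positive
            return "local maximum"        # n even, n-th derivative negative
    return "inconclusive"                 # all supplied derivatives vanish

# f(x) = x^4 at a = 0: derivatives 0, 0, 0, 24
print(classify_critical_point([0, 0, 0, 24]))
# f(x) = x^3 at a = 0: derivatives 0, 0, 6
print(classify_critical_point([0, 0, 6]))
# f(x) = -x^6 at a = 0: derivatives 0, 0, 0, 0, 0, -720
print(classify_critical_point([0, 0, 0, 0, 0, -720]))
```

The "inconclusive" branch reflects that the test says nothing when every derivative supplied vanishes.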


Additional Problems
1. a) Show that if $|g'(x)| \le M |x - a|^n$ for $|x - a| < \delta$, then $|g(x) - g(a)| \le
M |x - a|^{n+1}/(n + 1)$ for $|x - a| < \delta$.
b) Use this to show that if $\lim_{x\to a} g'(x)/(x - a)^n = 0$, then
\[ \lim_{x\to a} \frac{g(x) - g(a)}{(x - a)^{n+1}} = 0 . \tag{24.10} \]
c) Show that if $g(x) = f(x) - P_{n,a}[f](x)$, then $g'(x) = f'(x) - P_{n-1,a}[f'](x)$.
d) Give an inductive proof of Theorem 23.1, without using L’Hôpital’s rule.

2. Deduce Theorem 23.1 as a corollary of Taylor’s theorem, albeit under the assumption
of one more derivative.

Note 25.
Infinite Sequences
In the previous lecture we encountered the question whether the numbers
\[ \frac{a^n}{n!} \tag{25.1} \]
for a given choice of $a \in \mathbb{R}$, become “small for $n$ sufficiently large”. The numbers
$a_n = a^n/n!$ for $n \in \mathbb{N}$ are an example of an “infinite sequence” of real numbers
\[ a_1, a_2, a_3, \dots \tag{25.2} \]
which more generally is denoted by
\[ \{a_n\}_{n=1}^{\infty} . \tag{25.3} \]

Given that a sequence assigns to each natural number $n$ a real number $a_n$, we could
define the concept as follows:

Definition 25.1. An infinite sequence of real numbers is a function whose domain
is $\mathbb{N}$.

One could graph a sequence in the same way we graph a function, but it is usually
more convenient to simply label the points $a_n$ on the real number line $\mathbb{R}$.
Sequences are often defined explicitly by a formula for the $n$th term, or recursively in
the sense that $a_{k+1}$ is given in terms of $a_k$, or even $a_l$ for $1 \le l \le k$.
Example 25.1. $a_k = k^2$.
Example 25.2. The factorial function $n!$ can itself be thought of as a recursively defined
sequence. Setting $a_0 = 1$, and $a_{n+1} = a_n (n + 1)$, the number $n!$ is the $n$-th number in this
sequence, $n! = a_n$.
Example 25.3. Another example of a recursively defined sequence is given by the Fibonacci
numbers, where $x_1 = x_2 = 1$, and $x_k = x_{k-1} + x_{k-2}$ for $k \ge 3$.
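The two recursive definitions in Examples 25.2 and 25.3 translate directly into code. The following sketch uses only built-in Python; the function names are chosen here for illustration, and the Fibonacci recursion is applied from the third term on.

```python
def factorial_sequence(n):
    """Return [a_0, ..., a_n] with a_0 = 1 and a_{k+1} = a_k * (k + 1)."""
    terms = [1]
    for k in range(n):
        terms.append(terms[-1] * (k + 1))
    return terms

def fibonacci_sequence(n):
    """Return [x_1, ..., x_n] with x_1 = x_2 = 1 and x_k = x_{k-1} + x_{k-2}."""
    terms = [1, 1]
    while len(terms) < n:
        terms.append(terms[-1] + terms[-2])
    return terms[:n]

print(factorial_sequence(5))   # [1, 1, 2, 6, 24, 120]
print(fibonacci_sequence(8))   # [1, 1, 2, 3, 5, 8, 13, 21]
```

In both cases each term is computed from earlier terms only, which is exactly what "recursively defined" means here.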
The statement we would like to make about the sequence (25.1) is that it “converges
to zero”, or
\[ \frac{a^n}{n!} \to 0 \quad (n \to \infty) . \tag{25.4} \]
The next definition makes this notion precise.


Definition 25.2 (Convergence). A sequence $\{a_n\}_{n=1}^{\infty}$ converges to a limit $l$, written
\[ a_n \to l \quad (n \to \infty) \qquad \text{or} \qquad \lim_{n\to\infty} a_n = l \tag{25.5} \]
if for every $\varepsilon > 0$, there is a natural number $N$ such that
\[ |a_n - l| < \varepsilon \quad \text{whenever} \quad n > N . \tag{25.6} \]
We say a sequence converges if it converges to a limit $l$ for some $l \in \mathbb{R}$. Otherwise, the
sequence is said to diverge.

Exercise 25.1. Verify that $a_k = 1/k$ converges in this sense.

Example 25.4. Examples of sequences that do not converge are $a_k = k$, which is “going to
infinity”, and the sequence $a_k = (-1)^k$, which “jumps back and forth between $-1$ and $1$”.
Example 25.5. Let us show that
\[ \lim_{n\to\infty} \bigl( \sqrt{n + 1} - \sqrt{n} \bigr) = 0 . \tag{25.7} \]
Here we use a simple algebraic trick to rewrite this difference:
\[ \sqrt{n + 1} - \sqrt{n} = \frac{1}{\sqrt{n + 1} + \sqrt{n}} \tag{25.8} \]
Given $\varepsilon > 0$, we have for $n \ge 1/\varepsilon^2$ that
\[ 0 \le \sqrt{n + 1} - \sqrt{n} \le \frac{1}{2\sqrt{n}} \le \varepsilon/2 < \varepsilon . \tag{25.9} \]
Alternatively, we could have taken the point of view of Definition 25.1, and viewed
$a_n = \sqrt{n} = f(n)$ as the values of a function $f$, first with domain $\mathbb{N}$, but then as the
values of the function $f(x) = \sqrt{x}$ with domain $x > 0$, evaluated on the natural numbers
$x = n$. This would allow us to apply the mean value theorem, to get immediately
\[ \sqrt{n + 1} - \sqrt{n} = f(n + 1) - f(n) = f'(x) = \frac{1}{2\sqrt{x}} \le \frac{1}{2\sqrt{n}} \to 0 \quad (n \to \infty) \tag{25.10} \]
for some $x \in (n, n + 1)$.
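The choice of threshold in Example 25.5 can be tested numerically. The sketch below takes N to be the smallest integer with N ≥ 1/ε² and checks the estimate (25.9) for a range of n ≥ N. It can of course only test finitely many n, so it illustrates the definition rather than proving anything; the function names and the number of tested terms are assumptions made for this illustration.

```python
import math

def threshold(eps):
    """Smallest integer N with N >= 1/eps^2, as used in Example 25.5."""
    return math.ceil(1.0 / eps ** 2)

def term(n):
    """The n-th term sqrt(n+1) - sqrt(n) of the sequence."""
    return math.sqrt(n + 1) - math.sqrt(n)

for eps in (0.5, 0.1, 0.01):
    N = threshold(eps)
    # check 0 <= term(n) <= 1/(2 sqrt(n)) < eps for the first 1000 indices n >= N
    ok = all(0 <= term(n) <= 1 / (2 * math.sqrt(n)) + 1e-15 and term(n) < eps
             for n in range(N, N + 1000))
    print(eps, N, term(N), ok)
```

The small constant 1e-15 only absorbs floating point rounding in the comparison with 1/(2√n).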


Example 25.6. Another example of a diverging sequence is $x_n = n^2$. Yet we write
\[ \lim_{n\to\infty} x_n = \infty \]
if for every $C > 0$ there is an integer $N$ such that $x_n > C$ whenever $n > N$.
Example 25.7. Another typical example is
\[ \lim_{n\to\infty} \frac{3n^3 + 7n^2 + 1}{4n^3 - 8n + 63} = \frac{3}{4} . \tag{25.11} \]
This is clear because the $n^3$ terms are the “leading order” terms for large $n$, and this can
be turned into a proof by “dividing through” by $n^3$:
\[ a_n = \frac{3n^3 + 7n^2 + 1}{4n^3 - 8n + 63} = \frac{3 + 7/n + 1/n^3}{4 - 8/n^2 + 63/n^3} . \tag{25.12} \]
We could now proceed by first finding an explicit expression for $a_n - 3/4$, and then
estimate $|a_n - 3/4|$, with the aim of verifying $a_n \to 3/4$ directly using Definition 25.2. It
is easier, however, to apply at this stage the following limit laws.
For the evaluation of limits as in the last example the following facts are useful:

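Theorem 25.1 (Limit Laws). Suppose the infinite sequences $\{a_n\}$ and $\{b_n\}$ both have
limits as $n \to \infty$, then
\[ \lim_{n\to\infty} (a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n \tag{25.13} \]
\[ \lim_{n\to\infty} (a_n \cdot b_n) = \lim_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n . \tag{25.14} \]
Moreover, if $\lim_{n\to\infty} b_n \ne 0$, then
\[ \lim_{n\to\infty} (a_n / b_n) = \lim_{n\to\infty} a_n \Big/ \lim_{n\to\infty} b_n . \tag{25.15} \]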

Remark 25.1. The formulation of the last equality of limits actually requires some more
care. As it stands we are considering the sequence $c_n = a_n/b_n$ which may not even be
defined for all $n \in \mathbb{N}$. However, since $\lim_{n\to\infty} b_n \ne 0$, we know that for some $N$ sufficiently
large, $b_n \ne 0$ whenever $n \ge N$. Moreover, redefining those $b_n$ for which $b_n = 0$ and $n \le N$
obviously does not affect the statement about the limits.
The proof of these limit laws is so similar to that of the corresponding statements for limits of
functions that it will not be repeated here. Nonetheless, let us explore the similarity
between the definition of limits of functions and sequences a little further.
Note for example that if $f$ is a function that satisfies
\[ \lim_{x\to\infty} f(x) = l \tag{25.16} \]
and we set $a_n = f(n)$, then
\[ \lim_{n\to\infty} a_n = l . \tag{25.17} \]
This observation is often very useful:
Example 25.8. Let $0 < a < 1$, then
\[ \lim_{n\to\infty} a^n = 0 . \tag{25.18} \]
To prove this note that $a^x = e^{x \log a}$, and $\log a$ is negative, hence $\lim_{x\to\infty} e^{x \log a} = 0$.
Exercise 25.2. Show that for any $|a| < 1$, $\lim_{n\to\infty} a^n = 0$. Also show that, if $a > 1$ then
$\lim_{n\to\infty} a^n = \infty$.


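We can now finally return to (25.4). In other words, let us show that for any $a \in \mathbb{R}$,
$\lim_{n\to\infty} a^n/n! = 0$. We can write for $n > N > 2|a|$,
\[ \left| \frac{a^n}{n!} \right| \le \frac{|a|^N}{N!} \cdot \frac{|a| \cdots |a|}{(N + 1) \cdots n} \le \frac{|a|^N}{N!} \Bigl( \frac{1}{2} \Bigr)^{n-N} \to 0 \quad (n \to \infty) \tag{25.19} \]
where in the last step we are using that $2^{-n} \to 0$ as $n \to \infty$, as we have shown in (25.18).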
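The estimate (25.19) can be observed numerically. The sketch below computes the terms a^n/n! iteratively and checks that, once n exceeds an index N > 2|a|, the terms are dominated by the geometric bound from (25.19). The function name and the sample value a = 10 are assumptions made for this illustration.

```python
def power_over_factorial(a, n_max):
    """Return [a^n / n! for n = 0, ..., n_max], using t_n = t_{n-1} * a / n."""
    terms = [1.0]
    for n in range(1, n_max + 1):
        terms.append(terms[-1] * a / n)
    return terms

a = 10.0
N = 21                       # any integer N > 2|a| works in (25.19)
terms = power_over_factorial(a, 60)

for n in (N + 1, 30, 40, 60):
    bound = abs(terms[N]) * 0.5 ** (n - N)   # right hand side of (25.19)
    assert abs(terms[n]) <= bound
    print(n, terms[n], bound)
```

Note that the terms first grow (up to about n = |a|) before the factorial takes over, which is why the estimate only starts at an index N depending on a.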

Finally the observation made in (25.16) and (25.17) still does not give us convergence
of sequences like
\[ a_n = \sin\Bigl( 13 + \frac{1}{n^2} \Bigr) \tag{25.20} \]
\[ b_n = \cos\Bigl( \sin\Bigl( 1 + (-1)^n \frac{1}{n} \Bigr) \Bigr) \tag{25.21} \]
which clearly should converge to $\sin(13)$, and $\cos(\sin(1))$, respectively. (If this is not “clear”
draw a few points on the line.) The theorem that allows us to conclude that is the
following:
Theorem 25.2. Let $c \in \mathbb{R}$, and $f$ be a function defined on an interval $I$ that contains $c$,
except perhaps at $c$ itself, and suppose
\[ \lim_{x\to c} f(x) = l . \tag{25.22} \]
Suppose $\{a_n\}$ is a sequence such that each $a_n \in I$, $a_n \ne c$, and $\lim_{n\to\infty} a_n = c$. Then the
sequence $\{f(a_n)\}$ converges, and
\[ \lim_{n\to\infty} f(a_n) = l . \tag{25.23} \]
Conversely, if this is true for every sequence $\{a_n\}$ satisfying these conditions, then (25.22)
holds.
Proof. If $\lim_{x\to c} f(x) = l$, then for every $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ |f(x) - l| < \varepsilon \tag{25.24} \]
whenever $0 < |x - c| < \delta$. Now by assumption we can choose $N > 0$ such that
\[ n > N \implies 0 < |a_n - c| < \delta , \tag{25.25} \]
which then implies that $|f(a_n) - l| < \varepsilon$, showing that $f(a_n) \to l$.
Conversely, if (25.22) were not true, then there exists $\varepsilon > 0$, so that for every $\delta > 0$,
there exists $x$ with $0 < |x - c| < \delta$, and $|f(x) - l| \ge \varepsilon$. However, this can be used to define
a sequence $a_n$ of points with the property that, say, $0 < |a_n - c| < 1/n$, but $|f(a_n) - l| \ge \varepsilon$.
Then $a_n \to c$, but $f(a_n)$ does not converge to $l$, in contradiction to (25.23).

Example 25.9.
\[ \lim_{n\to\infty} \Bigl( 1 + \frac{a}{n} \Bigr)^n = e^a . \tag{25.26} \]
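The limit (25.26) is easy to observe numerically. The following sketch evaluates (1 + a/n)^n for increasing n and compares it with the library value of e^a. It is an illustration only, with a hypothetical function name, and says nothing about the proof.

```python
import math

def compound(a, n):
    """The n-th term (1 + a/n)^n of the sequence in (25.26)."""
    return (1.0 + a / n) ** n

for a in (1.0, -2.0):
    for n in (10, 1000, 100000):
        value = compound(a, n)
        print(a, n, value, abs(value - math.exp(a)))
```

The printed differences shrink roughly in proportion to 1/n, so the convergence is rather slow compared with the Taylor polynomials of the exponential function.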


Problems
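1. Verify each of the following limits.
a) $\lim_{n\to\infty} \frac{n}{n + 1} = 1$
b) $\lim_{n\to\infty} \frac{n + 3}{n^3 + 4} = 0$
c) $\lim_{n\to\infty} \frac{n!}{n^n} = 0$
d) $\lim_{n\to\infty} \sqrt[n]{a} = 1$, $a > 0$
e) $\lim_{n\to\infty} \sqrt[n]{n} = 1$
f) $\lim_{n\to\infty} \sqrt[n]{n^2 + n} = 1$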

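2. Find the following limits.
a) $\lim_{n\to\infty} \Bigl( \frac{n}{n + 1} - \frac{n + 1}{n} \Bigr)$
b) $\lim_{n\to\infty} \bigl( n - \sqrt{n + a}\, \sqrt{n + b} \bigr)$
c) $\lim_{n\to\infty} \frac{2^n + (-1)^n}{2^{n+1} + (-1)^{n+1}}$
d) $\lim_{n\to\infty} n c^n$, $|c| < 1$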

3. a) Prove that if $0 < a < 2$, then $a < \sqrt{2a} < 2$.
b) Prove that the sequence
\[ \sqrt{2}, \ \sqrt{2\sqrt{2}}, \ \sqrt{2\sqrt{2\sqrt{2}}}, \ \dots \tag{25.27} \]
converges.
c) Find the limit.

4. Let $0 < a_1 < b_1$ and define
\[ a_{n+1} = \sqrt{a_n b_n} , \qquad b_{n+1} = \frac{a_n + b_n}{2} . \tag{25.28} \]
a) Prove that the sequences $\{a_n\}$ and $\{b_n\}$ each converge.
b) Prove that they have the same limit.

5. Find a sequence $\{a_n\}$ of points in $(0, 1)$ such that $\lim_{n\to\infty} a_n$ is not in $(0, 1)$.

6. Suppose that $f$ is continuous and that the sequence
\[ x, \ f(x), \ f(f(x)), \ f(f(f(x))), \ \dots \tag{25.29} \]
converges to $l$. Prove that $l$ is a fixed point for $f$, i.e. $f(l) = l$.

Bibliography
Apostol, Tom M. Calculus. Second Edition. Vol. I. One-Variable Calculus, with an
Introduction to Linear Algebra. John Wiley & Sons, 1967.
Folland, Gerald B. Advanced Calculus. Prentice Hall, 2002.
Osserman, Robert. Two-Dimensional Calculus. Ed. by Salomon Bochner and W.C. Lister.
Harbrace College Mathematics Series. Harcourt, Brace and World, Inc., 1968.
Spivak, Michael. Calculus. Fourth Edition. Publish or Perish, Inc., 2008.
Walter, Wolfgang. Gewöhnliche Differentialgleichungen. Heidelberger Taschenbücher.
Springer, 1972.
