Advanced Calculus MAST10021 Guide
MAST10021
Volker Schlue
University of Melbourne
Semester 2, 2023
(version: November 21, 2023)
Contents
I. Numbers, Functions, and Graphs
1. Numbers
1.1. Natural numbers and induction
1.2. Rational and real numbers
1.3. Inequalities
1.4. Absolute value
1.5. The real number line and euclidean space
Problems
3. Limits
Problems
4. Continuous functions
4.1. Definition of continuity
4.2. Consequences of continuity
Problems
Problems
III. Differentiation
9. L’Hôpital’s rule
9.1. Mean Value Theorem
9.2. L’Hôpital’s rule
Problems
[Link] functions
Problems
V. Integration
Preface
I have written these lecture notes for the subject Calculus 2: Advanced which was
introduced at the University of Melbourne in 2020. Its ambition is to teach all the topics
covered in the “standard” Calculus 2 subject, but with more emphasis on concepts, and
proofs. Even though the name would suggest it, there is no Calculus 1: Advanced
subject. For most students, this is the first encounter with any proofs in Calculus. Thus
to get from the basics (say the definition of a limit) to the more advanced topics (say
discussions of solutions to differential equations), the pace is quite high, and a selection
of topics had to be made: I am mostly following (Spivak, Calculus), but I have tried to
make a division into “core material” (these are the Notes, that I expect every student to
study), and “additional topics” (that I usually do not have time to cover in class, but
give an opportunity for further study, along with some references to the literature). The
result is a class that can be taught over 12 weeks, accompanied by tutorials (which are
not contained in these lecture notes). I would like to thank the students who took this
subject in the years 2020-2023 for their feedback and I hope that taking this class made
them curious about mathematics. – VS, Melbourne, November 2023.
Literature. The following text books were useful when preparing these lecture notes.
(Folland, Advanced Calculus) This is a good text, but with a more advanced starting
point. About 1/4 of the topics covered in this course can be found in this text,
especially related to multivariable calculus.
Module I.
Note 1.
Numbers
1.1. Natural numbers and induction
The simplest numbers are the natural numbers 1, 2, 3, . . ., which are collectively referred
to as N.
The natural numbers are very important in particular in relation to the principle
of mathematical induction: Suppose P (x) means that the property P holds for the
number x. Then the principle of mathematical induction states that P (x) is true for all
natural numbers x provided that
• P (1) is true,
• If P (k) is true, then P (k + 1) is true.
Example 1.1. Suppose we want to prove that for any natural number n,
\[
  1 + 2 + \dots + n = \frac{n(n+1)}{2}. \tag{1.1}
\]
Then it suffices to demonstrate that this is valid in the case n = 1, and to show that its
validity for n = k implies that the formula holds for n = k + 1. The case n = 1 is
immediate, since both sides equal 1. Now assuming that the
formula holds for n = k, we have that
\[
  1 + \dots + k + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{k^2 + 3k + 2}{2} = \frac{(k+2)(k+1)}{2}, \tag{1.2}
\]
which shows that the formula holds for n = k + 1.
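A numerical check is no substitute for the induction proof, but it is a quick way to catch a mistyped formula. The following minimal sketch compares both sides of (1.1) for small n; the function names `sum_first` and `closed_form` are chosen here for illustration and do not come from the notes.

```python
def sum_first(n: int) -> int:
    """Add 1 + 2 + ... + n term by term."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def closed_form(n: int) -> int:
    """Right-hand side of (1.1); n(n + 1) is always even, so // is exact."""
    return n * (n + 1) // 2


# Compare both sides for the first few hundred natural numbers.
assert all(sum_first(n) == closed_form(n) for n in range(1, 301))
```

Agreement for finitely many n proves nothing about all n; that is exactly the gap the induction step closes.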
Closely related to proofs by induction are recursive definitions.
Example 1.2. The number n! (“n factorial”) is defined as follows:
• 1! = 1
• (n + 1)! = (n + 1)n!
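The recursive definition translates almost word for word into a recursive function. This is only a sketch (the name `factorial` and the error handling are choices made here): the first branch is the rule 1! = 1, and the second is the rule (n + 1)! = (n + 1) n!, written with n in place of n + 1.

```python
def factorial(n: int) -> int:
    """n! for a natural number n, following the two defining rules."""
    if n < 1:
        raise ValueError("n must be a natural number (n >= 1)")
    if n == 1:
        return 1                      # base rule: 1! = 1
    return n * factorial(n - 1)       # recursive rule: n! = n * (n - 1)!
```

Each call reduces n by one, so the recursion reaches the base rule after finitely many steps, mirroring how induction reaches every natural number from 1.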
Example 1.3. We write $\sum_{i=1}^{n} a_i$ for the sum $a_1 + \dots + a_n$. For example, the formula above
could be expressed as
\[
  \sum_{i=1}^{n} i = \frac{n(n+1)}{2}. \tag{1.3}
\]
A careful definition of this symbol would be a recursive one:
• $\sum_{i=1}^{1} a_i = a_1$
• $\sum_{i=1}^{n+1} a_i = \sum_{i=1}^{n} a_i + a_{n+1}$.
Suppose $\sqrt{2}$ were rational, equal to p/q for some integers p and q ≠ 0, where we can assume that
p and q have no common divisor. Then
\[
  p^2 = 2q^2 ,
\]
which shows that $p^2$ is even, and hence that p is even, say p = 2k. Substituting gives $4k^2 = 2q^2$, or
\[
  2k^2 = q^2 ,
\]
which now shows that also q is even. But if both p and q are even, this contradicts the
fact that p and q have no common divisor.
The argument in this example shows that there is no rational number x such that
$x^2 = 2$. We have not shown that there exists a number $\sqrt{2}$ whose square is 2.
1.3. Inequalities
Although inequalities are often not discussed in elementary mathematics, they play a
prominent role in Calculus.
We write a < b to say “a is less than b”, and a > b to say “a is greater than b”. The
statements a < b and b > a mean the same thing; they are merely two ways of writing the
same assertion. The main properties of numbers pertaining to inequalities are:
(Trichotomy) For any real numbers a and b, one, and only one, of the following holds:
1. a = b
2. a < b
3. b < a
(Ordering) For any numbers a, b, and c, if a < b and b < c, then a < c.
(Closure under addition) For any numbers a, b, and c, if a < b, then a + c < b + c.
(Closure under multiplication) For any numbers a, b, and c, if a < b, and 0 < c, then
ac < bc.
The numbers satisfying a > 0 are called positive, while the numbers a < 0 are
negative. We also write a ≤ b to mean a < b or a = b, and a ≥ b to mean a > b or a = b.
All familiar facts about inequalities, however elementary they may seem, can be
derived from these basic properties.
Example 1.5. If a < 0 is negative, then −a > 0 is positive. Since a < 0 means the same as
0 > a it follows (from closure under addition) that 0 − a > a − a = 0, or simply −a > 0.
Example 1.6. More generally, if a < 0 and b < 0 are negative numbers, then ab > 0 is
positive: We have already shown that −a > 0 and −b > 0 are positive numbers, hence
(using closure under multiplication) it follows that (−a)(−b) > 0.
Exercise 1.1. If a > 0 and b > 0 are positive, then ab > 0 is positive.
The fact that ab > 0 if a > 0, b > 0 and also if a < 0, b < 0 has one special consequence:
$a^2 > 0$ whenever $a \neq 0$. In particular, we have proven that 1 = 1 · 1 > 0.
We will encounter many more examples of inequalities in Problem 7 below.
Example 1.7. If a < b and c < 0 then ac > bc. Since −c > 0 we have a(−c) = −ac <
b(−c) = −bc which is the same as bc < ac.
\[
  |a| = \begin{cases} a, & a \geq 0 \\ -a, & a \leq 0. \end{cases}
\]
An important inequality is the triangle inequality, which is the statement that for
all numbers a and b, we have
|a + b| ≤ |a| + |b| . (1.4)
Proof. Since the absolute value is defined by cases, the proof amounts to verifying this
inequality in the following four cases:
(A) a ≥ 0, b ≥ 0
(B) a ≥ 0, b ≤ 0
(C) a ≤ 0, b ≥ 0
(D) a ≤ 0, b ≤ 0
In the case (A), the stated inequality is certainly true, because |a + b| = a + b = |a| + |b|.
Similarly in the case (D), we have |a + b| = −(a + b) = −a − b = |a| + |b|.
Now let us look at the case (B). Here we need to prove that
|a + b| ≤ a − b . (1.5)
From the assumption we do not know if a + b is positive, negative, or zero, but we can
consider the case a + b ≥ 0 first. Then we need to show that a + b ≤ a − b, which is the
same as b ≤ −b, which is certainly true since b ≤ 0. In the other case, when a + b ≤ 0,
we have to show that −(a + b) ≤ a − b which is the same as −a ≤ a, which is certainly
true for a ≥ 0.
Exercise 1.2. Verify the inequality in the case (C).
We frequently encounter the set of numbers x which satisfy |x − a| < ε. This is the
collection of points whose distance from a is less than ε > 0. This is an example of an
open interval
(a − ε, a + ε) = {x : a − ε < x < a + ε} (1.8)
[a, b] = {x : a ≤ x ≤ b} (1.9)
but we emphasize that the symbol ∞ is purely suggestive, because there is no number
“∞” such that a < ∞ for all a ∈ R. Nonetheless with this notation the real line can be
viewed as an interval:
R = (−∞, ∞) (1.11)
We can intersect two real number lines at a right angle to form the plane $\mathbb{R}^2$ consisting
of pairs of numbers (a, b), and we also refer to a and b as the coordinates of the point
(a, b). Alternatively, we can view $\vec{v} = (a, b)$ as a vector, namely an arrow from the origin
to the point (a, b). This allows us to add two vectors $\vec{v} = (a, b)$, $\vec{w} = (c, d)$, and interpret
$\vec{v} + \vec{w} = (a + c, b + d)$ geometrically. Similarly we can define $\lambda\vec{v} = (\lambda a, \lambda b)$ for any real
number λ, and interpret the operation of scalar multiplication geometrically as scaling.
The distance between two points (a, b) and (c, d) in the plane is
\[
  \sqrt{(a - c)^2 + (b - d)^2} . \tag{1.12}
\]
which allows us to speak of vectors which are orthogonal, and to define the length of
a vector. We say two vectors $\vec{a}$ and $\vec{b}$ are orthogonal if $\vec{a} \cdot \vec{b} = 0$. Moreover, we define the
length (or norm) of a vector $\vec{a}$ by
\[
  |\vec{a}| = \sqrt{\vec{a} \cdot \vec{a}} = \sqrt{a_1^2 + a_2^2 + \dots + a_n^2} . \tag{1.14}
\]
With this notion of length the triangle inequality that we have seen above for
numbers extends to vectors:
\[
  |\vec{a} + \vec{b}| \leq |\vec{a}| + |\vec{b}| . \tag{1.15}
\]
and so the stated inequality follows, provided we can show that $|\vec{a} \cdot \vec{b}| \leq |\vec{a}||\vec{b}|$. This is
Cauchy’s inequality (see the additional notes to Module I).
The distance between two points $(a_1, a_2, \dots, a_n)$ and $(b_1, b_2, \dots, b_n)$ in $\mathbb{R}^n$ is defined
by $|\vec{a} - \vec{b}|$; note that this agrees with (1.12) in $\mathbb{R}^2$. The reason (1.15) is called the triangle
inequality is that it implies that for any vectors $\vec{a}$, $\vec{b}$, $\vec{c}$,
\[
  |\vec{a} - \vec{c}| \leq |\vec{a} - \vec{b}| + |\vec{b} - \vec{c}| .
\]
In other words, the distance from $\vec{a}$ to $\vec{c}$ is at most the sum of the distances from $\vec{a}$ to $\vec{b}$,
and from $\vec{b}$ to $\vec{c}$, for any intermediate point $\vec{b}$.
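To make the dot product, the norm, and the two inequalities concrete, here is a small sketch that evaluates them for sample vectors. The helper names `dot`, `norm`, `distance` and the vectors `a`, `b`, `c` are illustrative choices, not part of the notes.

```python
import math


def dot(a, b):
    """Dot product a . b = a_1 b_1 + ... + a_n b_n."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Length |a| = sqrt(a . a), as in (1.14)."""
    return math.sqrt(dot(a, a))


def distance(a, b):
    """Distance |a - b| between two points of R^n."""
    return norm([x - y for x, y in zip(a, b)])


a = (1.0, 2.0, 2.0)
b = (4.0, 0.0, -3.0)
c = (0.0, 0.0, 0.0)

# Cauchy's inequality |a . b| <= |a||b|
assert abs(dot(a, b)) <= norm(a) * norm(b)
# Triangle inequality (1.15) for the sum a + b
assert norm([x + y for x, y in zip(a, b)]) <= norm(a) + norm(b)
# Distance form: going from a to c is no longer than the detour through b
assert distance(a, c) <= distance(a, b) + distance(b, c)
```

A check on one triple of vectors only illustrates the statements; the general case is what Cauchy's inequality is needed for.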
Problems
1. Prove the following formula by induction:
\[
  1^2 + \dots + n^2 = \frac{n(n+1)(2n+1)}{6}
\]
\[
  1 + r + r^2 + \dots + r^n = \frac{1 - r^{n+1}}{1 - r}
\]
9. Let $\vec{a} = (3, -1, 2)$ and $\vec{b} = (2, 1, 0)$. Compute the norms of $\vec{a}$ and $\vec{b}$.
11. Show that $|\vec{a}| - |\vec{b}| \leq |\vec{a} - \vec{b}|$ for every $\vec{a}$, $\vec{b}$ in $\mathbb{R}^n$.
Note 2.
Functions and Graphs
2.1. Functions
A function f is a rule which assigns to each real number x (or only certain real numbers
x) another real number f (x), the value of f at x.
Example 2.1. The function which assigns to each number the square of that number,
\[
  f(x) = x^2 \tag{2.1}
\]
Example 2.2. The rule which assigns to each number $c \neq 1, -1$ the number $c^3/(c^2 - 1)$,
\[
  f(c) = \frac{c^3}{c^2 - 1} \qquad (c \neq 1, -1) \tag{2.2}
\]
Example 2.3. The rule that assigns to each number t the number $t^3 + x$. This rule
obviously depends on x, and defines a family of functions $f_x$,
\[
  f_x(t) = t^3 + x \tag{2.3}
\]
Remark 2.1. A function need not be expressed by an algebraic formula, it can be any
rule that assigns numbers to certain other numbers.
Example 2.1 is a special example of an extremely important class of functions, namely
the polynomial functions. If $a_0, a_1, \dots, a_n$ are real numbers, and $a_n \neq 0$, then we say
\[
  f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0
\]
is a polynomial of degree n.
Example 2.2 is an example of a rational function, namely a function of the form p/q
where p, and q are polynomials.
Given two functions f , g, they can be combined to form a new function in various
ways: f + g is called the sum of f and g, f · g is the product, and f /g the quotient of
f , and g. All these are defined in the obvious ways, but we already see that some thought
has to be given to the domain: For example (f + g)(x) = f (x) + g(x) only makes sense
for numbers x for which both f and g are defined. The domain of a function is the set of
numbers to which the rule can be applied. So in the above example if A is the domain of
f , and B the domain of g, then f + g is only defined on the intersection of A, and B,
denoted by A ∩ B. Similarly for f · g.
f : A −→ R . (2.5)
This is often also indicated implicitly, when we give a formula for f (x) followed by
(x ∈ A) .
2.2. Graphs
We call the graph of a function f the set of points (x, f (x)) in the plane, where x ranges over the domain of f .
Exercise 2.2. Given two points (a, b) and (c, d) find the linear function f whose graph
goes through both points.
Example 2.5. The graph of the function $f(x) = x^2$ is a parabola.
Exercise 2.3. Sketch the graphs of
1. $f(x) = x^n$ for n = 2, 3, 4, . . .
2. $f(x) = x^2 + x$
3. $f(x) = x^3 - 3x$
4. $f(x) = 1/x$ $(x \neq 0)$
5. $f(x) = \frac{1}{1 + x^2}$
Example 2.6. Consider the function f (x) = sin(1/x) on its domain R \ {0}. To draw the
graph it helps to observe that
Moreover, when x is large, 1/x is small, so also f (x) is small; similarly when |x| is large,
for x < 0. We take away from this that while the graph of f approaches the horizontal
axis as |x| → ∞ (from above on the right, and from below on the left), it oscillates
infinitely many times between −1, and 1 near 0.
Figure 2.2.: The graph of the function f (x) = sin(1/x) for x > 0.
which y = f (x). Similarly when visualizing a function f (x, y) of two variables, it is useful
to think about the graph of f as a surface in $\mathbb{R}^3$ consisting of all points (x, y, z) with z = f (x, y).
Example 2.10. The graph of the function in Example 2.7 is a plane. It intersects the
z-axis at z = 1, because f (0, 0) = 1. Similarly we obtain the intersections with the x and
y axes, and these three points, (0, 0, 1), (1/2, 0, 0), and (0, 1, 0) determine uniquely the
plane.
Example 2.11. The graph of the function in Example 2.8 is a surface of revolution, since
the value of f depends only on the distance |(x, y)| from the z-axis. The surface is a
paraboloid.
Example 2.12. An important difference between Example 2.9 and the previous ones, is
that here f is not defined for all values of x, and y. More generally the quotient of two
polynomials
\[
  f(x, y) = \frac{P(x, y)}{Q(x, y)} \tag{2.12}
\]
is not defined at points where the denominator vanishes. In Example 2.9 this is a single
point, the origin. However, since the numerator also vanishes, it is not immediately clear
how this function behaves near the origin.
Exercise 2.5. Introduce polar coordinates in the plane to study the behaviour of the
function $f(x, y) = 2xy/(x^2 + y^2)$ near the origin. Set x = r cos(θ), and y = r sin(θ), to
express f (x, y) as a function of r, and θ. Sketch the graph of f .
Figure 2.3.: The graph of the function from Example 2.9 is generated by rotating a ray
from center, while moving it up and down.
In other words, the value only depends on θ, not on r. Therefore the graph of f is
constant over the rays (r cos θ, r sin(θ)), r > 0, for constant θ, and the graph z = f (x, y)
lies entirely between the planes z = 1, and z = −1. We can visualize the surface as a
kind of “spiral ramp”, see Fig. 2.3.
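The claim that the value depends only on θ can be checked numerically. Substituting x = r cos θ and y = r sin θ into 2xy/(x² + y²) gives 2r² cos θ sin θ / r² = sin(2θ); the sketch below compares the two expressions for several radii. The names `f` and `f_polar` are chosen here for illustration.

```python
import math


def f(x: float, y: float) -> float:
    """f(x, y) = 2xy / (x^2 + y^2), defined away from the origin."""
    return 2.0 * x * y / (x * x + y * y)


def f_polar(r: float, theta: float) -> float:
    """The same function evaluated at the point (r cos(theta), r sin(theta))."""
    return f(r * math.cos(theta), r * math.sin(theta))


# Along each ray (fixed theta) the value is sin(2 theta), whatever r > 0 is.
for theta in (0.0, 0.3, 1.0, 2.5, 4.0):
    for r in (1e-3, 1.0, 50.0):
        assert abs(f_polar(r, theta) - math.sin(2.0 * theta)) < 1e-9
```

Since sin(2θ) takes every value between −1 and 1 on rays arbitrarily close to the origin, this also suggests why f has no single limiting value there.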
A common way to sketch a function of two variables is to consider the intersections of
the graph of f in $\mathbb{R}^3$ with a plane ax + by = 0. In other words, by restricting ourselves
to points in the xy plane with ax + by = 0 for some fixed a, and b, the values of f are a
function of one variable, and its graph is visualized as a curve in the plane.
Example 2.13.
\[
  f(x, y) = \cos(x e^y) \tag{2.14}
\]
Note that the graph of f lies between the horizontal planes z = 1 and z = −1. On each
straight line y = c in the xy plane, the function f (x, c) = cos(kx), with $k = e^c$, is an
oscillating function in x, which oscillates more rapidly the larger the value of c. The
graph of x → f (x, c) is the cross section of the graph of f , as a surface in $\mathbb{R}^3$, with the
plane y = c.
Exercise 2.7. Sketch the graph of the function of the previous example.
Another approach is to intersect the graph of a function of two variables with the
planes z = c. Here we are interested in the set of points (x, y) in the plane for which
f takes a given value c. The set of points (x, y) in the xy plane for which f (x, y) = c,
where c is a given constant, is a level curve of f . By choosing various values of c, and
constructing the corresponding level curves, we can often obtain a picture of the graph
of f .
Exercise 2.8. Sketch the level curves of the function
f (x, y) = xy . (2.15)
Exercise 2.10. Ask yourself what is the relevance of level curves to reading a map of a
mountainous terrain?
Problems
1. Let
\[
  f(x) = \frac{1}{1 + x} \tag{2.16}
\]
What is
a) f (1/x)
b) f (x + y)
c) f (x) + f (y)
3. A function is even if f (x) = f (−x) and odd if f (x) = −f (−x). For example, the
functions $f(x) = x^2$ and f (x) = |x| are even, while the functions f (x) = x and
f (x) = sin(x) are odd.
a) Determine whether f + g is even, odd, or not necessarily either, in the four
cases obtained by choosing f even or odd, and g even or odd.
b) Do the same for f · g, and f ◦ g.
c) Prove that every even function f can be written as f (x) = g(|x|), with a
function g that is not uniquely determined.
5. Indicate on the real line the set of x satisfying the following relations, and write
these sets using the notation of intervals.
a) |x − 3| ≤ 1
b) $\frac{1}{1 + x^2} \leq a$
c) |x − 1| < 21
2
6. Draw the set of all points (x, y) satisfying the following conditions.
a) x > y
b) |x − y| < 1
c) 1/(x + y) is an integer.
d) $x = y^2$
e) x = |y|
7. Sketch the graphs of the following functions, by plotting enough points to get a
good idea of its shape.
a) f (x) = x − 1/x
b) $f(x) = x^2 + 1/x^2$
11. Let h(t) be a strictly increasing function of t, and let g(x, y) = h(f (x, y)).
a) How are the level curves of f and g related?
b) How are the graphs of f and g related?
Additional: Vectors, functions, and graphs
Further Reading
\[
  \vec{x} + \vec{y} = (x_1 + y_1, \dots, x_n + y_n) \tag{2.1}
\]
\[
  \lambda\vec{x} = (\lambda x_1, \dots, \lambda x_n), \qquad \lambda \in \mathbb{R} \tag{2.2}
\]
We have already introduced the dot product between two vectors and the norm of a
vector. The fundamental inequality relating the two is Cauchy’s inequality:
\[
  |\vec{a} \cdot \vec{b}| \leq |\vec{a}| |\vec{b}| .
\]
Exercise 2.1. This inequality has a geometrically instructive proof. Draw pictures while
you are reading it!
We know, on one hand, that f (t) ≥ 0 is non-negative. On the other hand, this is a
quadratic in t, and its minimum occurs at $t = \vec{a} \cdot \vec{b} / |\vec{b}|^2$, where f takes the value
\[
  f\bigl(\vec{a} \cdot \vec{b} / |\vec{b}|^2\bigr) = |\vec{a}|^2 - \frac{(\vec{a} \cdot \vec{b})^2}{|\vec{b}|^2} . \tag{2.5}
\]
Since f ≥ 0, in particular at the minimum, we obtain the inequality after multiplying
through by $|\vec{b}|^2$.
Proof. We have $|\vec{x} + \vec{y}|^2 = |\vec{x}|^2 + 2\vec{x} \cdot \vec{y} + |\vec{y}|^2$, and by Cauchy’s inequality $|\vec{x} \cdot \vec{y}| \leq |\vec{x}||\vec{y}|$.
Therefore,
\[
  |\vec{x} + \vec{y}|^2 \leq (|\vec{x}| + |\vec{y}|)^2 , \tag{2.7}
\]
Further Reading
and
\[
  f(x) = x^2 + 3x + 3 - 3(x + 1) \tag{2.9}
\]
Definition 2.2. The domain of f is the set of all numbers a for which there is some
b such that (a, b) is in f .
If a is in the domain of f , it follows from the first definition that there is a unique
number b such that (a, b) is in f . This unique number is denoted by f (a).
Mappings
More generally, a map (or mapping) is a rule f that assigns to each element of some
set A an element of some other set B (possibly equal to A). We write f : A → B. If
x ∈ A, the element in B assigned to x by f is called the value f (x). Thus functions are
maps, but the term “function” is typically reserved for mappings whose values are real
numbers (or complex numbers).
Given f : A → B we refer to A as the domain of f . If S is a subset of A, we denote by
Definition 2.3. The distance d between two points (a, b) and (c, d) in the plane $\mathbb{R}^2$ is
defined by
\[
  d = \sqrt{(a - c)^2 + (b - d)^2} \tag{2.11}
\]
Example 2.1. The circle with centre (a, b) of radius r is the set of points (x, y) whose
distance from (a, b) is equal to r. Since for example both (a, b + r), and (a, b − r) are in
this collection of points, it is not a graph of a function.
Example 2.2. Given two points in the plane, an ellipse is the set of points, for which the
sum of the distances to the two focal points is constant. If we take for simplicity the
two focal points to be (−c, 0), and (c, 0) on the horizontal axis, and the distance to be
2a > 0, then these are all points (x, y) for which
\[
  \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a , \tag{2.12}
\]
which simplifies to
\[
  \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1, \tag{2.13}
\]
where we take a > c.
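The passage from (2.12) to (2.13) can be sanity-checked numerically: points that satisfy (2.13) should have distances to the two focal points adding up to 2a. The sketch below does this for the sample values a = 5, c = 3; the function names and the parametrisation x = a cos t, y = √(a² − c²) sin t of the curve (2.13) are choices made here for illustration.

```python
import math


def focal_distance_sum(x: float, y: float, c: float) -> float:
    """Left-hand side of (2.12): sum of distances to (-c, 0) and (c, 0)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)


def ellipse_point(a: float, c: float, t: float):
    """A point satisfying (2.13), parametrised by the angle t (needs a > c)."""
    b = math.sqrt(a * a - c * c)
    return a * math.cos(t), b * math.sin(t)


a, c = 5.0, 3.0
for k in range(12):
    x, y = ellipse_point(a, c, 2.0 * math.pi * k / 12)
    # Every sampled point of (2.13) satisfies (2.12) up to rounding.
    assert abs(focal_distance_sum(x, y, c) - 2.0 * a) < 1e-9
```

This checks only one direction (points of (2.13) satisfy (2.12)) and only at sample points; the algebraic derivation covers the rest.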
Example 2.3. The hyperbola is defined analogously, except that we require the difference
of the two distances to be a constant. This leads to the same equation, but we now
take c > a. The hyperbola has two branches because we can take the difference in
two different ways. Note that while the hyperbola is also not a graph, we can write for
example in the case $a^2 = 2$, and $a^2 - c^2 = -2$, that
\[
  (x + y)(x - y) = 2 \tag{2.14}
\]
and so the hyperbola coincides with the graph of the function f (x) = 1/x after a rotation of
the axes by an angle of π/4.
Additional problems
1. a) If f is any function, define a new function |f | by |f |(x) = |f (x)|. If f and g
are functions, define two new functions max(f, g), min(f, g) by
max(f, g)(x) = max(f (x), g(x)) (2.15)
min(f, g)(x) = min(f (x), g(x)) (2.16)
Find an expression for max(f, g), and min(f, g) in terms of | · |.
b) Show that $f = f_+ + f_-$ where $f_+ = \max(f, 0)$ is the positive part, and
$f_- = \min(f, 0)$ is the negative part of f .
c) A function f is called nonnegative if f (x) ≥ 0 for all x. Prove that any
function f can be written as f = g − h, where g and h are nonnegative, (and
not uniquely determined).
2. Prove that the graphs of the linear functions
\[
  f(x) = mx + b , \qquad f(x) = nx + c \tag{2.17}
\]
are perpendicular if mn = −1.
Hint: Consider the triangle with vertices at (0, 0), (1, m), and (1, n), use the
Pythagorean theorem.
3. a) If $x_1, \dots, x_n$ are distinct numbers, find a polynomial function $f_i$ of degree n − 1
which is 1 at $x_i$ and 0 at $x_j$ for $j \neq i$.
b) Find a polynomial function f of degree n − 1 such that $f(x_i) = a_i$, where
$a_1, \dots, a_n$ are given numbers.
4. For which numbers a, b, c, d does the function
\[
  f(x) = \frac{ax + b}{cx + d} \tag{2.18}
\]
satisfy f (f (x)) = x (for all x in the domain of f ◦ f )?
5. Convince yourself that the set of points (x, y) satisfying
\[
  ax^2 + bx + cy^2 + dy + e = 0 \tag{2.19}
\]
is either a parabola, an ellipse, a hyperbola, or in degenerate cases two lines, one
line, a point, or the empty set.
Module II.
Note 3.
Limits
In this lecture we shall make precise one of the most important notions in Calculus,
namely the limit of a function. We would like to say that “a function f approaches
the limit l near a, if we can make f (x) as close as we like to l by requiring that x be
sufficiently close to, but not equal to, a.”
Here it is irrelevant how or even if f is defined at the point a. For example the functions
\[
  f(x) = x^2 , \qquad g(x) = x^2 \quad (x \neq a) , \qquad
  h(x) = \begin{cases} x^2, & x \neq a \\ b, & x = a \end{cases} \tag{3.1}
\]
Definition 3.1 (Limit). A function f approaches the limit l near a if for every
ε > 0, there is some δ > 0 such that, for all x, if 0 < |x − a| < δ, then |f (x) − l| < ε.
This is a very important definition and you need to know it by heart!
Let us also make sure to get the logical negation of this statement right, namely to
understand what it means for a function not to approach a limit l at a:
A function does not approach the limit l at a, if there is some ε > 0 such
that for every δ > 0 there is some x which satisfies 0 < |x − a| < δ, but not
|f (x) − l| < ε.
Example 3.3. The function f (x) = sin(1/x) does not approach 0 near 0, because for
ε = 1/2 and any δ > 0, there is some x with 0 < |x| < δ such that sin(1/x) ≥ 1/2. Indeed,
we only need to choose x = 1/(π/2 + 2πn) for some n ∈ N, which becomes arbitrarily
small for n large.
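The choice of x in Example 3.3 can be made explicit. The sketch below, with the hypothetical helper name `witness`, picks for a given δ a natural number n large enough that x = 1/(π/2 + 2πn) lies in (0, δ); at such a point sin(1/x) equals 1 up to rounding, so |f(x) − 0| < 1/2 fails.

```python
import math


def witness(delta: float) -> float:
    """Return some x with 0 < x < delta at which sin(1/x) is (numerically) 1."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    # n > 1 / (2 pi delta) guarantees pi/2 + 2 pi n > 1/delta, i.e. x < delta.
    n = math.floor(1.0 / (2.0 * math.pi * delta)) + 1
    return 1.0 / (math.pi / 2.0 + 2.0 * math.pi * n)


for delta in (1.0, 1e-3, 1e-6):
    x = witness(delta)
    assert 0.0 < x < delta
    assert math.sin(1.0 / x) >= 0.5   # so |sin(1/x) - 0| < 1/2 is violated
```

The function produces one witness per δ, which is all the negated definition asks for: a single bad x inside every punctured interval around 0.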
Exercise 3.2. In fact, more is true: The function f (x) = sin(1/x) does not approach any
limit l near 0.
Example 3.4. The function f (x) = x sin(1/x) approaches the limit 0 near 0. Since for all
$x \neq 0$,
\[
  |x \sin(1/x)| \leq |x| \tag{3.8}
\]
we can make |f (x)| < ε simply by requiring that 0 < |x| < δ with δ = ε.
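The choice δ = ε can be tested on sample points, keeping in mind that sampling can only fail to find a counterexample, never prove the limit. In this sketch the names `f` and `delta_works` and the evenly spaced sample grid are assumptions made for illustration.

```python
import math


def f(x: float) -> float:
    """f(x) = x sin(1/x), defined for x != 0."""
    return x * math.sin(1.0 / x)


def delta_works(eps: float, samples: int = 1000) -> bool:
    """Check |f(x)| < eps at sample points x with 0 < |x| < delta, where delta = eps."""
    delta = eps
    for k in range(1, samples + 1):
        x = delta * k / (samples + 1)          # 0 < x < delta
        if not (abs(f(x)) < eps and abs(f(-x)) < eps):
            return False
    return True


assert all(delta_works(eps) for eps in (1.0, 0.1, 1e-5))
```

The proof in the example is what guarantees the bound at every x, including the infinitely many points the grid misses.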
Exercise 3.3. Show that the function $f(x) = x^2 \sin(1/x)$ approaches 0 near 0. What
about $f(x) = \sqrt{x} \sin(1/x)$?
Since a function f cannot approach two different limits, we can talk about the limit l
that f approaches near a, which is denoted by
\[
  \lim_{x \to a} f(x) .
\]
The statement $\lim_{x \to a} f(x) = l$ has exactly the same meaning as the phrase “f approaches
l near a.” The possibility remains that f does not approach l near a for any l, and that
is expressed by saying “$\lim_{x \to a} f(x)$ does not exist.”
While the examples at the beginning of the lecture may give the impression that every
function in question has to be dealt with separately, the idea is of course to establish
general theorems which will make it easy to find limits.
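Theorem 3.1 (Limit laws). If $\lim_{x \to a} f(x) = l$ and $\lim_{x \to a} g(x) = m$, then
\[
  \lim_{x \to a} (f + g)(x) = l + m , \qquad \lim_{x \to a} (f \cdot g)(x) = l \cdot m .
\]
Moreover, if $m \neq 0$, then
\[
  \lim_{x \to a} \Bigl( \frac{f}{g} \Bigr)(x) = \frac{l}{m} \tag{3.12}
\]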
Example 3.5. Using the Theorem we can prove, trivially, such statements as
\[
  \lim_{x \to a} \frac{x^3 + 7x^5}{x^2 + 1} = \frac{a^3 + 7a^5}{a^2 + 1} \tag{3.13}
\]
without going through the laborious process of finding a δ, for a given ε.
We only give a proof of the first “limit law”.
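Proof. Let ε > 0. By assumption we know that there are $\delta_1, \delta_2 > 0$ such that, for all x,
if $0 < |x - a| < \delta_1$, then $|f(x) - l| < \varepsilon/2$, and if $0 < |x - a| < \delta_2$, then $|g(x) - m| < \varepsilon/2$.
So if we choose $\delta = \min(\delta_1, \delta_2)$ to be the smaller of the two, then both statements are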
true for all 0 < |x − a| < δ, and moreover
|(f + g)(x) − (l + m)| = |f (x) − l + g(x) − m| ≤ |f (x) − l| + |g(x) − m| < ε/2 + ε/2 = ε
we would write
|f (x)g(x) − lm| = |(f (x) − l)g(x) + l(g(x) − m)| ≤ |f (x) − l||g(x)| + |l||g(x) − m|
yields the desired inequality for all 0 < |x − a| < δ, where again $\delta = \min(\delta_1, \delta_2)$.
Problems
1. Find the following limits
a) $\lim_{x \to 1} \frac{1 - \sqrt{x}}{1 - x}$
b) $\lim_{x \to 0} \frac{1 - \sqrt{1 - x^2}}{x}$
2. In each of the following cases, determine the limit l for the given a, and prove that
it is the limit by showing how to find a δ such that |f (x) − l| < ε for all x satisfying
0 < |x − a| < δ.
a) $f(x) = x^2 + 5x - 2$, a = 2
b) $f(x) = x^4$, for any a > 0.
c) f (x) = |x|, a = 0.
p
4. a) If $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ do not exist, can $\lim_{x \to a} (f(x) + g(x))$ exist?
What about $\lim_{x \to a} f(x)g(x)$?
b) If $\lim_{x \to a} f(x)$ exists and $\lim_{x \to a} (f(x) + g(x))$ exists, must $\lim_{x \to a} g(x)$ exist?
c) If $\lim_{x \to a} f(x)$ exists and $\lim_{x \to a} g(x)$ does not exist, can $\lim_{x \to a} (f(x) + g(x))$
exist?
6. Prove that if $\lim_{x \to 0} g(x) = 0$ and |h(x)| ≤ M for all x, then $\lim_{x \to 0} g(x)h(x) = 0$.
Note 4.
Continuous functions
4.1. Definition of continuity
Intuitively, a function f is continuous if the graph contains no breaks, jumps, or wild
oscillations.
Remark 4.1. There are several ways this can fail. For example, f might not be defined at
a, or the limit may not exist, in which cases this identity makes no sense. It could also
be that f is defined at a and the limit of f (x) at a exists, but these two numbers are not
the same.
Example 4.1. The function f (x) = sin(1/x) is not continuous at 0, because it is not even
defined at 0.
Example 4.2. The function f (x) = x sin(1/x) is not defined at 0 either, but the limit
$\lim_{x \to 0} x \sin(1/x)$ exists and is 0, so while f is not continuous at 0, we can define an
extension of f , namely the function
\[
  F(x) = \begin{cases} f(x), & x \neq 0 \\ 0, & x = 0 \end{cases} \tag{4.2}
\]
which is continuous at 0.
Example 4.3. Any monomial $f(x) = x^n$ is obviously continuous at any a because
$\lim_{x \to a} x^n = a^n = f(a)$.
Proof. This follows directly from Theorem 3.1. Indeed, if $\lim_{x \to a} f(x) = f(a)$ and
$\lim_{x \to a} g(x) = g(a)$, then
\[
  \lim_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) = f(a) + g(a) = (f + g)(a) . \tag{4.3}
\]
The theorem allows us to infer that rational functions are continuous at every point
in their domain. We defer the proof that the trigonometric functions are continuous,
but even once we know that, we are still unable to prove the continuity of functions like
$f(x) = \sin(x^2)$ before making a statement about compositions:
Theorem 4.2 (Composition of continuous functions). If g is continuous at a, and f is
continuous at g(a), then f ◦ g is continuous at a.
Proof. Let ε > 0. Since f is continuous at g(a) we can find a δ > 0 such that, if
|y − g(a)| < δ, then |f (y) − f (g(a))| < ε. So now choose η > 0, so that for all x, if
|x − a| < η, we have |g(x) − g(a)| < δ, which is possible because g is continuous at a.
Then, for all x with |x − a| < η, taking y = g(x) gives |f (g(x)) − f (g(a))| < ε.
Example 4.4. With this theorem we can now infer that F (x) from (4.2) is continuous at
every point. Similarly for functions like $f(x) = \sin(x^2 + \sin(x))$, etc.
Exercise 4.1. Give another proof of the quotient case in the limit laws using Theorem 4.2.
In other words, use the theorem about compositions of continuous functions to show that
if g is continuous at a, and $g(a) \neq 0$, then
\[
  \lim_{x \to a} \frac{1}{g(x)} = \frac{1}{g(a)} . \tag{4.4}
\]
So far we have talked about continuity at a point. The consequences of continuity are
more powerful when it refers to continuity on an interval: We say f is continuous on
(a, b) if f is continuous at x for all x ∈ (a, b). A special case is that of a function being
continuous on R = (−∞, ∞).
Exercise 4.2. Formulate and prove an analogous statement under the assumption f (a) < 0.
The next theorem says, geometrically, that the graph of a continuous function which
starts below the horizontal axis and ends above the horizontal axis must cross this axis
at some point.
Theorem 4.4. If f is continuous on an interval (c, d), c < a < b < d, and
f (a) < 0 < f (b), then there is some x in (a, b) such that f (x) = 0.
Proof. Consider the function $f(x) = x^2$. We want to show that if α > 0, there exists a
number x such that $f(x) = x^2 = \alpha$. There is obviously a number b such that f (b) > α.
(In fact, if α > 1, take for example b = α, and if α < 1, take for example b = 1; if α = 1,
then x = 1 already works.) Since
f is continuous, and f (0) = 0 < α < f (b), there exists a number x in the interval [0, b]
such that f (x) = α, so $x^2 = \alpha$.
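The proof above shows that a square root exists but does not say how to find it. One standard way to approximate it, which is not the method of the proof, is bisection: keep an interval [lo, hi] with lo² ≤ α ≤ hi² and halve it repeatedly. The sketch below assumes the hypothetical name `sqrt_by_bisection` and a fixed number of halving steps.

```python
def sqrt_by_bisection(alpha: float, steps: int = 100) -> float:
    """Approximate the x >= 0 with x*x = alpha by repeated halving."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    lo = 0.0
    hi = max(alpha, 1.0)          # as in the proof: hi*hi >= alpha
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if mid * mid < alpha:
            lo = mid              # the root lies in [mid, hi]
        else:
            hi = mid              # the root lies in [lo, mid]
    return (lo + hi) / 2.0
```

For instance, `sqrt_by_bisection(2.0)` returns a floating-point number whose square is 2 up to rounding error. The continuity of x² is what ensures the shrinking intervals actually close in on a solution.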
Problems
1. For which of the following functions f is there a continuous extension F of f ? In
other words, for which of the following functions is there a continuous function F
on the real line such that F (x) = f (x) for all x in the domain of f .
a) $f(x) = \frac{x^2 - 4}{x - 2}$
b) $f(x) = \frac{|x|}{x}$
c) $f(x) = x^2 \sin(1/x^2)$
2. a) Suppose that f is a function satisfying |f (x)| ≤ |x| for all x. Show that f is
continuous at 0.
b) Suppose that g is continuous at 0 and g(0) = 0, and |f (x)| ≤ |g(x)|. Prove
that f is continuous at 0.
3. Prove that if f is continuous at a then for any ε > 0 there is a δ > 0 so that
whenever |x − a| < δ and |y − a| < δ we have |f (x) − f (y)| < ε.
4. Find an integer n such that f (x) = 0 for some x between n and n + 1, where
\[
  f(x) = x^3 - x + 3 \tag{4.6}
\]
6. Suppose that f is a continuous function on [0, 1] and that f (x) is in [0, 1] for each
x. (Draw a picture!) Prove that f (x) = x for some number x in the unit interval.
Note 5.
Theorems about continuity
5.1. Global properties of continuous functions
We have already seen one important theorem about continuity, namely the Intermediate
Value Theorem:
Theorem 5.1 (Intermediate value theorem). If f is continuous on [a, b] and
f (a) < t < f (b),
then there is some x in (a, b) such that f (x) = t.
Let us now state two more theorems about continuity and explore some of their
consequences:
Theorem 5.2 (Bounded value theorem). If f is continuous on [a, b], then f is bounded
above on [a, b], that is there is some number N such that f (x) ≤ N for all x in [a, b].
Geometrically, this means that the graph of f lies below some horizontal line.
The third theorem states that a continuous function on a closed interval always achieves
a maximum:
Theorem 5.3 (Extreme value Theorem). If f is continuous on [a, b], then there is some
number y in [a, b] such that f (y) ≥ f (x) for all x in [a, b].
These theorems all rely on the continuity of f on the interval [a, b]. Indeed, the
conclusions can fail if continuity fails at even a single point:
Example 5.1. For Theorem 5.2, take the function
\[
  f(x) = \begin{cases} 1/x, & x \neq 0 \\ 0, & x = 0 \end{cases} \tag{5.1}
\]
which is continuous at every point except 0, but f is not bounded above.
Example 5.2. For Theorem 5.3, consider the function
\[
  f(x) = \begin{cases} x^2, & x < 1 \\ 0, & x \geq 1 \end{cases} \tag{5.2}
\]
On the interval [0, 1] the function is bounded above, but there is no y in [0, 1] such that
f (y) ≥ f (x) for all x in the interval.
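As a rough numerical illustration of Example 5.2 (a sketch only; the sample points and the helper name `f` are our own choices), we can evaluate the function (5.2) just below 1 and watch the values approach 1 without ever reaching it:

```python
def f(x):
    """Function (5.2): x^2 for x < 1, and 0 for x >= 1."""
    return x * x if x < 1 else 0.0


# Sample points 0.9, 0.99, 0.999, ... approaching 1 from below.
samples = [1 - 10.0 ** (-k) for k in range(1, 8)]
values = [f(x) for x in samples]

# On a grid of [0, 1] every value stays strictly below 1, and f(1) = 0.
grid = [i / 1000 for i in range(1001)]
largest_on_grid = max(f(x) for x in grid)

print(values)
print(largest_on_grid, f(1.0))
```

A finite sample cannot prove that no maximum exists, of course; it only makes the failure of Theorem 5.3 plausible for this discontinuous function.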
These important theorems are stated in the simplest setting and are easily generalised.
For example, a continuous function on a closed interval always achieves a minimum, too:
Exercise 5.1. Use Theorem 5.3 to show that if f is continuous on [a, b] then there is some
y in [a, b] such that f (y) ≤ f (x) for all x in [a, b].
Exercise 5.3. Why can we find a number b > y with f (b) > c?
It remains to show:
Lemma 5.6. If n is even and f (x) = xⁿ + a_{n−1} xⁿ⁻¹ + . . . + a_0 , then there is a number
y such that f (y) ≤ f (x) for all x.
Hence we can find an interval [−b, b] so large that for any point x outside that interval,
f (x) > |f (0)|.
Now we can use Theorem 5.3 to infer that f has a minimum on the closed interval
[−b, b]:
f (x) ≥ f (y) (5.9)
for some y ∈ (−b, b). The point y is a minimum not just on the interval [−b, b] but on
all of R, because for |x| > b we have
f (x) > |f (0)| ≥ f (0) ≥ f (y) . (5.10)
Solution 5.4. In the proofs of the theorems above we have used that for any polynomial
of degree n ∈ N,
\[ f(x) = x^n + a_{n-1} x^{n-1} + \ldots + a_0 \tag{5.11} \]
we have
\[ \lim_{x \to \infty} f(x) = \infty . \tag{5.12} \]
We can see this as follows:
\[ f(x) = x^n \Bigl( 1 + \frac{a_{n-1}}{x} + \ldots + \frac{a_0}{x^n} \Bigr) \qquad (x > 0) \tag{5.13} \]
Now by choosing M = max(1, 2n|a_{n−1}|, . . . , 2n|a_0|) we get that for |x| ≥ M ,
\[ \Bigl| \frac{a_{n-1}}{x} + \ldots + \frac{a_0}{x^n} \Bigr| \leq \frac{1}{M} \bigl( |a_{n-1}| + \ldots + |a_0| \bigr) \leq \frac{1}{2} . \tag{5.14} \]
Therefore, for all x ≥ M (and, if n is even, for all x with |x| ≥ M ),
\[ \frac{1}{2} x^n \leq x^n \Bigl( 1 + \frac{a_{n-1}}{x} + \ldots + \frac{a_0}{x^n} \Bigr) = f(x) \tag{5.15} \]
This shows that for any N > 0, we can find M > 0, so that, if x > M , then f (x) > N .
Indeed, for a given N > 1, let us choose M = max{1, 2n|a_{n−1}|, . . . , 2n|a_0|, 2N }; then,
whenever x > M , we have
\[ f(x) \geq \frac{1}{2} x^n > 2^{n-1} N^n \geq N . \tag{5.16} \]
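The choice of M in Solution 5.4 can be tried out numerically. The following sketch is only an illustration: the example polynomial and the helper names `poly`, `bound_M` and `lower_bound_holds` are our own assumptions, not part of the notes. It checks the lower bound (5.15) at a range of points x ≥ M.

```python
def poly(coeffs, x):
    """Evaluate x^n + a_{n-1} x^{n-1} + ... + a_0 by Horner's scheme.

    coeffs lists a_{n-1}, ..., a_0; the leading coefficient is 1.
    """
    result = 1.0
    for a in coeffs:
        result = result * x + a
    return result


def bound_M(coeffs):
    """M = max(1, 2n|a_{n-1}|, ..., 2n|a_0|), as chosen before (5.14)."""
    n = len(coeffs)
    return max([1.0] + [2 * n * abs(a) for a in coeffs])


def lower_bound_holds(coeffs, x):
    """Check inequality (5.15): x^n / 2 <= f(x)."""
    n = len(coeffs)
    return 0.5 * x ** n <= poly(coeffs, x)


# Hypothetical example: f(x) = x^3 - 3x^2 + 0.5x - 7.
coeffs = [-3.0, 0.5, -7.0]
M = bound_M(coeffs)
print(M, all(lower_bound_holds(coeffs, M + k) for k in range(50)))
```

The bound is not claimed for small x: for this polynomial it already fails at x = 1, where f (1) is negative.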
Problems
1. For each of the following functions, decide which are bounded above and below on
the indicated interval, and which take on their maximum or minimum value.
a) f (x) = x² on (−1, 1)
b) f (x) = x³ on (−1, 1)
c) f (x) = x² on R
d) f (x) = x² on [0, ∞)
e) \[ f(x) = \begin{cases} x^2, & x \leq a \\ a + 2, & x > a \end{cases} \] on (−a − 1, a + 1). (Assume here that a > −1.)
f) \[ f(x) = \begin{cases} x^2, & x < a \\ a + 2, & x \geq a \end{cases} \] on [−a − 1, a + 1]. (Again assume here that a > −1.)
2. Suppose f and g are continuous on [a, b] and that f (a) < g(a), but f (b) > g(b).
Prove that f (x) = g(x) for some x in [a, b].
3. Suppose that f is a continuous function with f (x) > 0 for all x, and
Prove that there is some number y such that f (y) ≥ f (x) for all x.
Additional: Limits and Continuity
Recommended Reading
Proof. We want to show that if f approaches l near a, and f approaches m near a, then
l = m.
By definition, for a given ε > 0, there exist δ1 > 0, and δ2 > 0 such that
which is a contradiction.
Exercise 5.1. First interpret separately, precisely in the sense of Definition 3.1, and then
prove equality of the expressions
There are times we would like to talk about the limit of f as x approaches a “from
above”, or “from below”. These are situations when f (x) may not be defined for all x in
|x − a| < δ, but only say for x > a, or x < a, but the “one-sided” limits still exist.
\[ f(x) = \begin{cases} -1, & x < 0 \\ 1, & x > 0 \end{cases} \tag{5.6} \]
This function does not approach any number near 0, but the limits from above and below
do exist:
\[ \lim_{x \to 0+} f(x) = 1 , \qquad \lim_{x \to 0-} f(x) = -1 . \tag{5.7} \]
Additional Problems
1. Prove that lim_{x→a} f (x) exists if lim_{x→a+} f (x) = lim_{x→a−} f (x).
2. The function f (x) = 1/x² does not approach a limit near 0. Nonetheless it is
common to write lim_{x→0} f (x) = ∞. In general we define “lim_{x→a} f (x) = ∞” to
mean that for all N there is a δ > 0 such that, for all x, if 0 < |x − a| < δ, then
f (x) > N .
a) Show that lim_{x→3} 1/(x − 3)² = ∞.
b) Prove that if f (x) > ε > 0 for all x, and lim_{x→a} g(x) = 0, then
\[ \lim_{x \to a} \frac{f(x)}{|g(x)|} = \infty . \]
Recommended Reading
Exercise 5.2. You might have noticed that in the proof of Theorem 4.2 we have taken
lim_{x→a} f (x) = f (a) to mean that for every ε > 0, there exists a δ > 0, so that for all x,
if |x − a| < δ, then |f (x) − f (a)| < ε. So for a function f which is continuous at a, we
have dropped the condition that 0 < |x − a|. Why?
For the theorems about global properties of continuous functions, the notion of
continuity on a closed interval is important. We say a function f is continuous on [a, b]
if f is continuous at all x in (a, b), and
\[ \lim_{x \to a+} f(x) = f(a) , \qquad \lim_{x \to b-} f(x) = f(b) . \]
Recall here from Definition 5.1 (Additional: Limits) what it means for a function to
approach a limit from above or from below.
Another variation of the “one-sided” limit occurs when we talk about the limit of a
function f (x) “as x approaches ∞.”
Example 5.2. We have
\[ \lim_{x \to \infty} \sin(1/x) = 0 . \tag{5.9} \]
\[ \lim_{x \to \infty} f(x) = l \]
Additional Problems
1. a) Prove the following version of Theorem 4.3 for “right-hand continuity”: Suppose
that lim_{x→a+} f (x) = f (a), and f (a) > 0. Then there is a number δ > 0 such
that f (x) > 0 for all x satisfying 0 ≤ x − a < δ. Similarly if f (a) < 0, then
there is a number δ > 0 such that f (x) < 0 for all x satisfying 0 ≤ x − a < δ.
b) Prove a version of Theorem 4.3 when lim_{x→b−} f (x) = f (b).
Note 6.
Functions of two variables: Limits and
Continuity
In Note 3 we have arrived at a definition of what it means for a function of one variable
to approach a limit at a point a. In this note we want to extend this notion to functions
f (x, y) of two variables.
Definition 6.1 (Limit). A function f (x, y) of two variables has the limit l as (x, y)
approaches (a, b) if for every ε > 0, there is some δ > 0 so that |f (x, y) − l| < ε for all
(x, y) satisfying 0 < |(x, y) − (a, b)| < δ.
In other words, the definition is conceptually exactly the same as for one variable, just
that | · | now refers to the distance in R²: The set of points (x, y) satisfying
\[ 0 < |(x, y) - (a, b)| = \sqrt{(x - a)^2 + (y - b)^2} < \delta \tag{6.1} \]
is a (punctured, and open) disc of radius δ centered at the point (a, b).
Remark 6.1. It is often more convenient to use the following equivalent formulation:
A function f (x, y) has a limit l as (x, y) approaches (a, b) if for every ε > 0 there is a
δ > 0 so that |f (x, y) − l| < ε whenever (x, y) ≠ (a, b) and
\[ |x - a| < \delta , \qquad |y - b| < \delta . \]
Indeed, if |f (x, y) − l| < ε on a punctured disc of radius δ, then this will also be
the case on a punctured square with side length 2δ′, provided √2 δ′ ≤ δ. Conversely, if
|f (x, y) − l| < ε on a punctured square of side length 2δ, then this will be true for all
points (x, y) on a punctured disc that fits into this square, namely a disc of radius δ.
The definition absolves us from the following dilemma. For a function of one variable
f (x) it is clear what we mean by “x approaches a”: we can approach the point a either
from the left, or the right. However, for functions of two variables there are infinitely many
ways to approach the point (a, b) in the plane, because any curve in the plane through
(a, b) gives a way to approach (a, b). Moreover, the value that a function approaches at a
point may depend on the direction in which this point is approached.
First observe that g(x, 0) = 0 and g(0, y) = 0. In particular, along the coordinate axes g
goes to 0. Now let c ≠ 0, and consider g evaluated on the points (x, cx):
\[ g(x, cx) = \frac{c x^3}{x^4 + c^2 x^2} = \frac{c x}{c^2 + x^2} \to 0 \qquad (x \to 0) . \tag{6.3} \]
This shows that along any straight line through the origin the function g tends to 0.
However, if we approach the origin on a parabola y = cx², for any c ≠ 0, then
\[ g(x, c x^2) = \frac{c}{1 + c^2} \neq 0 . \tag{6.4} \]
Our definition does not make reference to the “way” in which (a, b) is approached. In
fact, according to our definition the function g(x, y) of the previous example does not
have a limit at (0, 0) at all!
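The path dependence can be made visible with a few evaluations. The sketch below assumes that g(x, y) = x²y/(x⁴ + y²), which is the form consistent with the computations (6.3) and (6.4); the helper names `along_line` and `along_parabola` are our own.

```python
def g(x, y):
    """Assumed form of g from Example 6.1: x^2 y / (x^4 + y^2)."""
    return x * x * y / (x ** 4 + y * y)


def along_line(c, x):
    """Value of g on the straight line y = c x."""
    return g(x, c * x)


def along_parabola(c, x):
    """Value of g on the parabola y = c x^2."""
    return g(x, c * x * x)


c = 2.0
for x in (1e-1, 1e-2, 1e-3):
    # The first value shrinks with x, the second stays at c / (1 + c^2).
    print(x, along_line(c, x), along_parabola(c, x))
```

Along the line the values tend to 0, while along the parabola they stay at c/(1 + c²) no matter how close we are to the origin.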
Exercise 6.1. Why does our working of Example 6.1 show that g(x, y) does not have a
limit at the origin?
Exercise 6.2. Let
\[ f(x, y) = \frac{xy}{x^2 + y^2} \qquad (x, y) \neq (0, 0) \tag{6.5} \]
and let f (0, 0) = 0. Show that f does not have a limit as (x, y) approaches (0, 0).
Solution 6.3. First note that f (0, y) = 0, and f (x, 0) = 0. Now consider the values of f
on the straight lines through the origin: For any c ≠ 0,
\[ f(x, cx) = \frac{c x^2}{x^2 + c^2 x^2} = \frac{c}{1 + c^2} \qquad (x \neq 0) . \tag{6.6} \]
We need to show that for any l, we can find ε > 0, so that for all δ > 0, there exists a
point (x, y) in the punctured disc of radius δ, where |f (x, y) − l| ≥ ε.
For l = 0, we can always choose a point (x, x) on the straight line y = x arbitrarily
close to the origin, where f (x, x) = 1/2, so we can arrange this for any ε < 1/2.
For l ≠ 0, simply choose ε = |l|/2; then for any point (x, 0) with x ≠ 0 on the x-axis we
have |f (x, 0) − l| = |l| > ε.
Notice that the definition of a limit, which if it does exist for a function f (x, y) at the
point (a, b) we denote by
lim f (x, y) , (6.7)
(x,y)→(a,b)
does not involve the value f (a, b) at all; only the values of the function near (a, b) are
relevant here. Indeed the function need not even be defined at (a, b). However, if f is
defined at (a, b), and its value at the point agrees with its limit as we approach (a, b),
then the function is said to be continuous at (a, b):
While the examples above may have given the impression that functions of two variables,
in general, do not have a limit, this is rather the exception than the rule:
The functions
f (x, y) = x + y (6.9)
f (x, y) = xy (6.10)
f (x, y) = x − y (6.11)
f (x, y) = x/y (y ≠ 0) (6.12)
are continuous in their domain. Furthermore (see the additional notes to Module II):
(Continuity laws) The sum and the product of two continuous functions are
continuous. Moreover, the quotient of two continuous functions is continuous
(on the set where the denominator is not zero).
Since the “elementary functions” of one variable, in particular polynomials and trigono-
metric functions, are all continuous (on their domains), it is almost immediate that all
the “elementary functions” of two variables, namely those built up of these functions of
one variable, by arithmetic operations and compositions, are also continuous, where they
are defined.
Example 6.2. The function
\[ f(x, y) = \frac{\sin(3x + 2y)}{x^2 - y} \tag{6.13} \]
is continuous everywhere, except along the parabola y = x².
Exercise 6.4. Let f be defined by
\[ f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2} & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0) . \end{cases} \tag{6.14} \]
which, if true, then shows that f is continuous at the origin. Since |x² − y²| ≤ x² + y²,
we have
\[ |f(x, y)| \leq |xy| \tag{6.16} \]
for all (x, y) ≠ (0, 0). Since the function h(x, y) = xy is continuous at the origin, it follows
that f is continuous at the origin. Indeed, let ε > 0. Then for all (x, y) ≠ (0, 0), with
|(x, y)| < δ, we have
\[ |f(x, y)| \leq |xy| \leq \frac{1}{2} (x^2 + y^2) < \varepsilon , \tag{6.17} \]
provided δ is chosen so that δ² < 2ε, say δ = √ε.
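The estimate (6.17) and the choice δ = √ε can be spot-checked on random points. This is merely a sanity check under our own assumptions (the sample size, the value of ε, and the helper name `delta_for` are arbitrary), not a substitute for the proof.

```python
import random


def f(x, y):
    """The function (6.14), with f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def delta_for(eps):
    """The choice delta = sqrt(eps) suggested after (6.17)."""
    return eps ** 0.5


random.seed(0)
eps = 1e-4
delta = delta_for(eps)
worst = 0.0
for _ in range(10000):
    x = random.uniform(-delta, delta)
    y = random.uniform(-delta, delta)
    # Only keep points inside the disc of radius delta.
    if x * x + y * y < delta * delta:
        worst = max(worst, abs(f(x, y)))

print(worst, eps)
```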
Problems
1. Show that the following functions do not have a limit at the origin:
2
a) f (x, y) = √x 2+y 2 (x, y) 6= (0, 0)
x +y
b) f (x, y) = x
x4 +y 4
(x, y) 6= (0, 0)
3. Let
\[ f(x, y) = \frac{1}{x} \sin(xy) \qquad (x \neq 0) \tag{6.18} \]
How should we define f (0, y) for any number y so as to make f a continuous
function on the plane?
In Exercise 6.2 we have shown that f is not continuous at (0, 0). Nonetheless, prove
that f (x, b) and f (a, y) are continuous functions of x and y, respectively, for any
numbers a and b (including a = 0, b = 0).
We say f is separately continuous.
Additional: Limits and Continuity
Further Reading
Proposition 6.3. The function f₃(x, y) = x − y is continuous on the plane, and the
function f₄(x, y) = x/y is continuous on {(x, y) : y ≠ 0}.
Proof. We have f₃(x, y) = f₁(x, f₂(−1, y)), hence f₃ is a composition of continuous functions.
Moreover, f₄(x, y) = f₂(x, g(y)), where g(y) = 1/y is continuous away from 0, hence f₄
is a composition of continuous functions on the set where it is defined.
Module III.
Differentiation
Note 7.
Differentiation in one variable
7.1. Differentiability in one variable
Following our intuition, continuous functions are those “whose graphs can be drawn
without lifting the pen off the paper”; such graphs are still allowed to have “sharp
corners”. The graph of a differentiable function has no such corners and admits a
well-defined “tangent line” at each point.
Remark 7.1. Note that the difference quotient (f (x + h) − f (x))/h is the slope of the line
through the points (x, f (x)) and (x + h, f (x + h)). We therefore define the tangent line
to the graph of f at (a, f (a)) to be the line through the point (a, f (a)) with slope f′(a).
We say f is differentiable if f is differentiable at every point of its domain. More
generally, we say f is differentiable on an interval A = (a, b) (or some set of points
A) if f is differentiable at every point a ∈ A, and we call the function f′ the derivative
of f on the domain A.
Example 7.1. The constant function f (x) = c is differentiable and f′(x) = 0.
Exercise 7.1. The linear functions f (x) = cx + d are differentiable, and f′(x) = c.
Example 7.2. Let us compute the derivative of the function f (x) = x² at x = a:
\[ f'(a) = \lim_{h \to 0} \frac{(a + h)^2 - a^2}{h} = \lim_{h \to 0} (2a + h) = 2a \tag{7.2} \]
Exercise 7.2. Show that the function f (x) = x³ is differentiable and f′(a) = 3a².
Example 7.3. The function
f (x) = |x| (7.3)
is not differentiable at 0. Indeed, the difference quotient at 0 is simply |h|/h, which is 1
for h > 0, and −1 for h < 0, so the limit as h approaches 0 does not exist.
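The two behaviours of the difference quotient in Examples 7.2 and 7.3 can be observed numerically. In the sketch below the helper name `difference_quotient` and the step sizes are our own choices.

```python
def difference_quotient(f, a, h):
    """Slope of the line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def square(x):
    return x * x


# For f(x) = x^2 at a = 3 the quotient equals 2a + h, so it approaches 6.
for h in (1e-2, 1e-4, 1e-6):
    print(h, difference_quotient(square, 3.0, h))

# For f(x) = |x| at a = 0 the quotient is 1 from the right and -1 from the left.
print(difference_quotient(abs, 0.0, 1e-6), difference_quotient(abs, 0.0, -1e-6))
```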
Exercise 7.3. Show that the function f (x) = |x| is differentiable at every point a ≠ 0.
Example 7.4. The function f (x) = √|x| is also not differentiable at 0. In fact, the slopes
of the tangent lines at (x, f (x)) tend to +∞ as we approach 0 from the right, and
to −∞ as we approach from the left.
These are examples of functions which are continuous, but not differentiable. Conversely,
we have:
we have:
Theorem 7.1. If f is differentiable at a, then f is continuous at a.
Proof.
\[ \lim_{h \to 0} \bigl( f(a + h) - f(a) \bigr) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \cdot h = f'(a) \lim_{h \to 0} h = 0 . \tag{7.4} \]
approximates the graph of the function f near a. More precisely, we know that the
difference in height
h(x) = f (x) − l(x) (7.6)
tends to zero at a faster rate than x − a, as x approaches a:
\[ \lim_{x \to a} \frac{h(x)}{x - a} = \lim_{x \to a} \frac{f(x) - l(x)}{x - a} = \lim_{x \to a} \Bigl( \frac{f(x) - f(a)}{x - a} - f'(a) \Bigr) = 0 . \tag{7.7} \]
In conclusion, we have
and we can view this as a linear approximation of the function f near a, and h as an
error which goes to zero faster than the distance to a.
Remark 7.2. The statement that
\[ \lim_{x \to a} h(x) = 0 \tag{7.9} \]
is the statement that f (x) is continuous at a. The fact that (7.9) is implied by (7.7) is
another proof of Theorem 7.1.
is continuous. However, once again this function is not differentiable at 0: For any h ≠ 0,
\[ \frac{f(h) - f(0)}{h} = \sin(1/h) \tag{7.11} \]
and this function does not have a limit as h → 0. A very similar function, which is
differentiable at 0, is
\[ f(x) = \begin{cases} x^2 \sin(1/x) & x \neq 0 \\ 0 & x = 0 . \end{cases} \tag{7.12} \]
However, we will see that for this function the second derivative fails to exist at 0.
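To see the difference between the two functions numerically, we can evaluate their difference quotients at 0 along a sequence of points h tending to 0. The sketch assumes that the first function is x sin(1/x), extended by 0 at the origin, which is what the quotient sin(1/h) in (7.11) indicates; the sequence `peak_point` is our own choice.

```python
import math


def quotient_x_sin(h):
    """Difference quotient at 0 of x sin(1/x): equals sin(1/h)."""
    return math.sin(1.0 / h)


def quotient_x2_sin(h):
    """Difference quotient at 0 of x^2 sin(1/x): equals h sin(1/h)."""
    return h * math.sin(1.0 / h)


def peak_point(k):
    """h_k = 1 / ((k + 1/2) pi), where sin(1/h_k) = (-1)^k."""
    return 1.0 / ((k + 0.5) * math.pi)


for k in (10, 11, 1000, 1001):
    h = peak_point(k)
    # First quotient keeps jumping between +1 and -1, second one shrinks like h.
    print(h, quotient_x_sin(h), quotient_x2_sin(h))
```

The first quotient keeps oscillating between 1 and −1, so it has no limit, while the second is bounded by |h| and tends to 0.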
For any function f , we obtain by taking the derivative another function f′ (whose
domain may be smaller than the domain of f ). Starting now with the function
f′, we obtain another function (f′)′ whose domain we take to be all points where f′ is
differentiable. This is the second derivative f″ of f . In general, we also write
\begin{align}
f^{(0)} &= f \tag{7.13} \\
f^{(1)} &= f' \tag{7.14} \\
f^{(2)} &= f'' \tag{7.15} \\
f^{(k+1)} &= (f^{(k)})' , \tag{7.16}
\end{align}
and we also call f^{(k)}, for k ≥ 2, the higher order derivatives of f . The idea is that
the more derivatives of a function exist, the more regular it is.
Example 7.5. Consider the function
\[ f(x) = \begin{cases} x^2 & x \geq 0 \\ -x^2 & x \leq 0 . \end{cases} \tag{7.17} \]
We know that f′(a) = 2a for a > 0, and f′(a) = −2a for a < 0. Moreover,
\[ \frac{f(h) - f(0)}{h} = |h| \tag{7.18} \]
and so f is differentiable at 0, and f′(0) = 0. We can summarize this conveniently by
\[ f'(x) = 2|x| . \tag{7.19} \]
As we have seen, the function 2|x| is continuous but not differentiable at 0, hence f″(0)
does not exist.
7.2. Differentiation
The aim is now to prove a few theorems that will allow us to differentiate a large number
of functions without invoking the definition, and investigating a limit, every time.
Theorem 7.2 (Sum and product rule). If f and g are differentiable at a, then f + g
and f · g are also differentiable at a, and
\begin{align}
(f + g)'(a) &= f'(a) + g'(a) \tag{7.20} \\
(f \cdot g)'(a) &= f'(a) g(a) + f(a) g'(a) \tag{7.21}
\end{align}
The second formula is also called the product rule.
Proof of (7.21).
\begin{align}
(fg)'(a) &= \lim_{h \to 0} \frac{(fg)(a + h) - (fg)(a)}{h} \\
&= \lim_{h \to 0} \frac{(f(a + h) - f(a)) g(a + h) + f(a) (g(a + h) - g(a))}{h} \tag{7.22} \\
&= f'(a) g(a) + f(a) g'(a)
\end{align}
where in the last step we have used Theorem 3.1.
Another proof of (7.21). We can also use the ideas from Section 7.1.1 to give another
proof of the product rule: If f and g are differentiable at a, then they have a linear
approximation near a,
\[ f(x) = f(a) + f'(a)(x - a) + h(x) , \qquad g(x) = g(a) + g'(a)(x - a) + k(x) , \tag{7.23} \]
where h(x) and k(x) are functions that tend to zero faster than x − a. Therefore
\[ f(x) g(x) = f(a) g(a) + \bigl[ f(a) g'(a) + f'(a) g(a) \bigr] (x - a) + E(x) \tag{7.24} \]
Hence the graph of the function f g is approximated by a linear function with slope
f (a)g′(a) + f′(a)g(a).
Example 7.6. If f (x) = xⁿ for some natural number n ∈ N, then we can now prove by
induction that
\[ f'(a) = n a^{n-1} . \tag{7.26} \]
Exercise 7.4. The above allows us to compute easily the derivatives of polynomials:
Note that the product rule can also be used to differentiate any product of functions.
For example to compute (f · g · h)′, we could either write f · g · h = (f g) · h, and apply
the product rule to the functions f g and h, or we could write f · g · h = f · (gh), and
apply the product rule to f and gh; both give the same result:
Theorem 7.3 (Quotient rule). If f and g are differentiable at a and g(a) ≠ 0, then f /g
is differentiable at a, and
\[ \Bigl( \frac{f}{g} \Bigr)'(a) = \frac{f'(a) g(a) - f(a) g'(a)}{(g(a))^2} \tag{7.30} \]
We have
\begin{align}
\Bigl( \frac{1}{g} \Bigr)'(a) &= \lim_{h \to 0} \frac{1}{h} \, \frac{g(a) - g(a + h)}{g(a + h) g(a)} \\
&= - \lim_{h \to 0} \frac{g(a + h) - g(a)}{h} \, \lim_{h \to 0} \frac{1}{g(a + h) g(a)} = - \frac{g'(a)}{(g(a))^2} . \tag{7.32}
\end{align}
Note that for the difference quotient to make sense we needed g(a + h) ≠ 0, at least
for |h| sufficiently small. However, we know that g is differentiable, hence continuous at
a, and since g(a) ≠ 0 it follows that g(a + h) ≠ 0, as long as |h| < δ, for some δ > 0;
cf. Theorem 7.1 and Theorem 4.3.
Since
\[ \frac{f}{g} = f \cdot \frac{1}{g} , \tag{7.33} \]
the formula for the derivative of the quotient then follows from the product rule.
Example 7.7.
\[ f(x) = \frac{x^2 - 1}{x^2 + 1} , \qquad f'(x) = \frac{4x}{(x^2 + 1)^2} \tag{7.34} \]
Example 7.8. If f (x) = x⁻ⁿ for some natural number n ∈ N, then
\[ f(x) = \frac{1}{x^n} , \qquad f'(x) = \frac{-n x^{n-1}}{x^{2n}} = (-n) x^{-n-1} . \tag{7.35} \]
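As a small check of Example 7.7, we can feed the numerator and denominator, together with their derivatives, into the quotient rule and compare with the closed form in (7.34). The helper names below are our own; this is a sketch, not part of the notes.

```python
def quotient_rule(f, df, g, dg, a):
    """Value of (f/g)'(a) according to the quotient rule (7.30)."""
    return (df(a) * g(a) - f(a) * dg(a)) / g(a) ** 2


# Numerator and denominator of Example 7.7, with their derivatives.
def numerator(x):
    return x * x - 1


def d_numerator(x):
    return 2 * x


def denominator(x):
    return x * x + 1


def d_denominator(x):
    return 2 * x


def closed_form(x):
    """The derivative stated in (7.34)."""
    return 4 * x / (x * x + 1) ** 2


for a in (-2.0, 0.0, 0.5, 3.0):
    rule_value = quotient_rule(numerator, d_numerator, denominator, d_denominator, a)
    print(a, rule_value, closed_form(a))
```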
To investigate our favourite functions from above, say
Example 7.9. Consider again the function f (x) from (7.36), or more precisely the extension
\[ f(x) = \begin{cases} x^2 \sin(1/x) & x \neq 0 \\ 0 & x = 0 . \end{cases} \tag{7.39} \]
We have already seen, directly from the definition, that f′(0) = 0. With the chain rule,
we can also compute that for x ≠ 0,
\[ f'(x) = 2x \sin(1/x) + x^2 \cos(1/x) \cdot \Bigl( -\frac{1}{x^2} \Bigr) = 2x \sin(1/x) - \cos(1/x) \tag{7.40} \]
In particular, f′(x) is not continuous at 0.
We illustrate how the chain rule is applied in practice with some more examples.
Example 7.10.
\[ f(x) = \sin(x^2) \tag{7.41} \]
When we apply the chain rule we view this as a composition
\[ f = \sin \circ S \tag{7.42} \]
where S(x) = x² (say S for “taking the square”). Then it is clear that
Example 7.11.
\[ f(x) = \sin^2(x^2) \tag{7.44} \]
We could view f as the composition
The above notation is useful to clarify the compositions that make up a function, but
in practice one does not usually introduce additional notation.
Example 7.12.
\[ f(x) = \sin(\sin(x^2)) \tag{7.48} \]
We compute directly:
\[ f'(x) = \cos(\sin(x^2)) \cdot \cos(x^2) \cdot 2x \tag{7.49} \]
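A formula obtained from the chain rule is easy to sanity-check against a symmetric difference quotient. The sketch below does this for (7.49); the helper `central_difference`, the step size and the sample points are our own assumptions.

```python
import math


def f(x):
    """The function (7.48)."""
    return math.sin(math.sin(x * x))


def f_prime(x):
    """The derivative (7.49), obtained from the chain rule."""
    return math.cos(math.sin(x * x)) * math.cos(x * x) * 2 * x


def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)


for x in (-1.5, 0.3, 2.0):
    print(x, f_prime(x), central_difference(f, x))
```

Agreement to several digits does not prove the formula, but a sign error or a forgotten inner derivative would show up immediately.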
Exercise 7.5. Compute the derivatives of
\[ f(x) = \sin\bigl( (\sin(x))^2 \bigr) \tag{7.50} \]
\[ f(x) = \sin^2(x \sin x) \tag{7.51} \]
Problems
1. Prove directly, using the definition, that if f (x) = √x, then f′(a) = 1/(2√a) for any
a > 0.
2. Prove that if g(x) = f (x) + c, then g′(x) = f′(x). Also show that if g(x) = cf (x)
then g′(x) = cf′(x).
3. Let f be a function such that |f (x)| ≤ x² for all x. Prove that f is differentiable at
0.
b)
\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a - h)}{2h} \tag{7.53} \]
Draw a picture!
b) Prove that if f is odd, then f′ is even: f′(x) = f′(−x).
Note 8.
Differentiation in two variables
Definition 8.1 (Partial derivatives). For a given function f (x, y), the limits of the
following difference quotients, if they exist, are called the partial derivatives ∂_x f and
∂_y f of f at (a, b):
\[ \frac{\partial f}{\partial x}(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} \tag{8.1} \]
\[ \frac{\partial f}{\partial y}(a, b) = \lim_{h \to 0} \frac{f(a, b + h) - f(a, b)}{h} . \tag{8.2} \]
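\[ f(x, y) = \frac{e^{2x} \sin(y)}{1 + y^2} \tag{8.3} \]
are
\[ \partial_x f(x, y) = \frac{2 e^{2x} \sin(y)}{1 + y^2} = 2 f(x, y) \tag{8.4} \]
\[ \partial_y f(x, y) = \frac{e^{2x} \cos(y)}{1 + y^2} - \frac{e^{2x} \, 2y \sin(y)}{(1 + y^2)^2} = \Bigl( \cos(y) - \frac{2y \sin(y)}{1 + y^2} \Bigr) \frac{e^{2x}}{1 + y^2} . \tag{8.5} \]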
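The formulas (8.4) and (8.5) can be compared with the difference quotients of Definition 8.1. In this sketch, (8.5) is read with a minus sign between its two terms, as the quotient rule requires; the helpers `partial_x` and `partial_y`, the step size and the sample point are our own assumptions.

```python
import math


def f(x, y):
    """The function (8.3)."""
    return math.exp(2 * x) * math.sin(y) / (1 + y * y)


def df_dx(x, y):
    """Formula (8.4)."""
    return 2 * f(x, y)


def df_dy(x, y):
    """Formula (8.5), with a minus sign between the two terms."""
    return math.exp(2 * x) / (1 + y * y) * (math.cos(y) - 2 * y * math.sin(y) / (1 + y * y))


def partial_x(func, a, b, h=1e-6):
    """Symmetric difference quotient in the first variable."""
    return (func(a + h, b) - func(a - h, b)) / (2 * h)


def partial_y(func, a, b, h=1e-6):
    """Symmetric difference quotient in the second variable."""
    return (func(a, b + h) - func(a, b - h)) / (2 * h)


a, b = 0.3, 1.2
print(df_dx(a, b), partial_x(f, a, b))
print(df_dy(a, b), partial_y(f, a, b))
```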
The partial derivatives of a function tell us how the values of a function change along
the coordinate axes. However, even if they exist at a point (a, b), they do not necessarily
give us information about the behaviour of the function near (a, b).
Example 8.2. Let us take another look at the example
\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0) \end{cases} \tag{8.6} \]
However, this does not tell us anything about the behaviour of the function f (x, y) near
(a, b). The reason is, as we shall see, that this function is not differentiable at (0, 0).
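\[ l(\vec{x}) = b + \vec{c} \cdot \vec{x} , \tag{8.9} \]
for some b ∈ R, $\vec{c} = (c_1, c_2)$ ∈ R², and the condition $l(\vec{a}) = f(\vec{a})$ implies that
$b = f(\vec{a}) - \vec{c} \cdot \vec{a}$, so
\[ l(\vec{x}) = f(\vec{a}) + \vec{c} \cdot (\vec{x} - \vec{a}) . \tag{8.10} \]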
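Definition 8.2 (Differentiability). A function f (x₁, x₂) is differentiable at a point
$\vec{a} = (a_1, a_2)$, if there is a vector $\vec{c}$ ∈ R² such that
\[ f(\vec{a} + \vec{h}) = f(\vec{a}) + \vec{c} \cdot \vec{h} + E(\vec{h}) , \]
where
\[ \lim_{\vec{h} \to 0} \frac{E(\vec{h})}{|\vec{h}|} = 0 . \tag{8.12} \]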
As for functions of one variable, all differentiable functions are continuous.
Theorem 8.1. If f is differentiable at $\vec{a}$, then f is continuous at $\vec{a}$.
The proof follows immediately from the linear approximation of a differentiable function.
Indeed if f is differentiable at $\vec{a}$, then
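where $E(\vec{x})/|\vec{x} - \vec{a}| \to 0$ as $\vec{x} \to \vec{a}$. Let us now determine the components of this
vector $\vec{c} = (c_1, c_2)$. In (8.15) simply choose $\vec{x} = (a_1 + h, a_2)$, then
\[ \frac{\partial f}{\partial x}(\vec{a}) = c_1 . \tag{8.17} \]
Similarly for the other component of $\vec{c}$, and we conclude
\[ \nabla f(\vec{a}) = \begin{pmatrix} \dfrac{\partial f}{\partial x}(\vec{a}) \\[1ex] \dfrac{\partial f}{\partial y}(\vec{a}) \end{pmatrix} . \tag{8.18} \]
Theorem 8.2. If f (x, y) is differentiable at $\vec{a}$ = (a, b), then the partial derivatives ∂_x f and
∂_y f of f exist at $\vec{a}$ and are the components of the gradient vector $\nabla f(\vec{a})$.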
is not continuous at the origin, hence in particular not differentiable at (0, 0), by
Theorem 8.1. Nonetheless the partial derivatives exist:
Theorem 8.3. Suppose the partial derivatives ∂_x f and ∂_y f of a function f (x, y) exist
at every point and that, viewed as functions ∂_x f (x, y) and ∂_y f (x, y), they are continuous
at (a, b). Then f (x, y) is differentiable at (a, b).
Example 8.4. For the example above we compute away from the origin,
\[ \partial_x f(x, y) = \frac{y^3 - x^2 y}{(x^2 + y^2)^2} , \qquad \partial_y f(x, y) = \frac{x^3 - x y^2}{(x^2 + y^2)^2} \qquad (x, y) \neq (0, 0) , \tag{8.21} \]
\[ \vec{u} = (1, 0) \tag{8.24} \]
then
\[ \frac{\partial f}{\partial \vec{u}} = \frac{\partial f}{\partial x} \tag{8.25} \]
is the partial derivative of f in x. Similarly ∂_y f is the directional derivative in the
direction $\vec{u} = (0, 1)$.
Theorem 8.4 (Formula for the directional derivative). Suppose f is differentiable at $\vec{a}$.
Then the directional derivative at $\vec{a}$ in any direction $\vec{u}$ exists, and is given by
\[ \frac{\partial f}{\partial \vec{u}}(\vec{a}) = \nabla f(\vec{a}) \cdot \vec{u} . \tag{8.26} \]
Proof. Since f is differentiable at $\vec{a}$, we know that (8.13) holds for any $\vec{h}$, in particular
for $\vec{h} = t\vec{u}$:
\[ f(\vec{a} + t\vec{u}) = f(\vec{a}) + \nabla f(\vec{a}) \cdot t\vec{u} + E(\vec{a} + t\vec{u}) . \tag{8.27} \]
Hence
\[ \frac{f(\vec{a} + t\vec{u}) - f(\vec{a})}{t} = \nabla f(\vec{a}) \cdot \vec{u} + \frac{E(\vec{a} + t\vec{u})}{t} , \tag{8.28} \]
and the formula follows by taking the limit t → 0.
The above formula also provides a geometric interpretation of the gradient of a function:
Since for any vectors $\vec{a}$ and $\vec{b}$,
\[ |\vec{a} \cdot \vec{b}| \leq |\vec{a}| |\vec{b}| \tag{8.29} \]
with equality when $\vec{a}$ and $\vec{b}$ are collinear, we have in particular with $|\vec{u}| = 1$ that
\[ \Bigl| \frac{\partial f}{\partial \vec{u}}(\vec{a}) \Bigr| \leq |\nabla f(\vec{a})| \tag{8.30} \]
with equality when $\nabla f(\vec{a})$ and $\vec{u}$ are collinear. This means that $\nabla f(\vec{a})$ points in the
direction of the steepest increase of f at $\vec{a}$, and its magnitude is the rate of increase of
f in that direction.
Exercise 8.1. Let f (x, y) = x² + 5xy², and $\vec{a} = (-2, 1)$.
1. Find the directional derivative of f at $\vec{a}$ in the direction of the vector $\vec{v} = (3, 4)$.
2. What is the largest directional derivative of f at $\vec{a}$, and in what direction does it
occur?
Solution 8.2. We have ∇f (x, y) = (2x + 5y², 10xy), so that ∇f (−2, 1) = (1, −20).
Note that $\vec{v}$ is not normalised, so let us first determine $\vec{u}$ collinear to $\vec{v}$ of unit length:
\[ \vec{u} = \frac{\vec{v}}{|\vec{v}|} = \frac{1}{5} (3, 4) . \tag{8.31} \]
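The remaining arithmetic for Exercise 8.1 can be carried out with formula (8.26) and the bound (8.30). The following lines are only a sketch of that computation; the helper names are our own.

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + 5 x y^2, as computed in Solution 8.2."""
    return (2 * x + 5 * y * y, 10 * x * y)


def directional_derivative(grad, v):
    """Formula (8.26), after normalising v to a unit vector u."""
    norm = math.hypot(v[0], v[1])
    u = (v[0] / norm, v[1] / norm)
    return grad[0] * u[0] + grad[1] * u[1]


grad = grad_f(-2.0, 1.0)
rate = directional_derivative(grad, (3.0, 4.0))
# By (8.30) the largest directional derivative is |grad f|, attained along grad f.
largest = math.hypot(grad[0], grad[1])

print(grad, rate, largest)
```

So the directional derivative in the direction of (3, 4) is (3 − 80)/5 = −77/5, and the largest directional derivative is √401, in the direction of the gradient (1, −20).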
Problems
1. For each of the following functions f (x, y) find the linear function l(x, y) whose
graph is the tangent plane to the graph of f at the point (1, −2, f (1, −2)).
a) f (x, y) = x²y + sin(πxy)
b) f (x, y) = xy/(x² + y²)
3. For each of the following functions f (x, y) compute the directional derivative of f
at the point (−1, 2) in the direction (3/5, 4/5).
a) f (x, y) = x²y + sin(πxy)
b) f (x, y) = xy/(x² + y²)
4. Suppose $f(\vec{x})$ and $g(\vec{x})$ are differentiable at $\vec{a}$. Does that imply that f + g and f g
are differentiable at $\vec{a}$? Find a formula for the gradient of f g and f + g at $\vec{a}$.
Additional: Chain rule
Further Reading
Theorem 8.1 (Chain rule). Suppose that $\vec{g}(t) = (g_1(t), \ldots, g_n(t))$, where the functions
g_i , i = 1, 2, . . . , n, are differentiable at t = a. Suppose moreover that f (x₁, . . . , xₙ) is
differentiable at $\vec{b} = \vec{g}(a)$. Then the function (f ◦ g)(t) is differentiable at t = a, and its
derivative is given by
\[ (f \circ g)'(a) = \nabla f(\vec{b}) \cdot \vec{g}\,'(a) . \tag{8.10} \]
where
\[ \vec{h} = \vec{g}(a + u) - \vec{g}(a) = \vec{g}\,'(a) u + \vec{E}_2(u) , \qquad \vec{g}\,'(a) = (g_1'(a), \ldots, g_n'(a)) , \tag{8.12} \]
\[ f(\vec{g}(a + u)) = f(\vec{g}(a)) + \nabla f(\vec{g}(a)) \cdot \vec{g}\,'(a) u + \nabla f(\vec{g}(a)) \cdot \vec{E}_2(u) + E(\vec{h}) , \tag{8.13} \]
and the chain rule follows, because the last two terms go to zero faster than u as u
tends to zero.
Exercise 8.1. Use this approach to give another proof of the chain rule for functions of
one variable.
Module IV.
Note 9.
L’Hôpital’s rule
The aim of this lecture is to prove:
and suppose that lim_{x→a} f′(x)/g′(x) exists. Then lim_{x→a} f (x)/g(x) exists, and
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} . \tag{9.2} \]
There are many variations of L’Hôpital’s rule; see for example Problem 6 below.
The proof in turn relies on the mean value theorem, which says that given a
continuous function f on [a, b], which is differentiable on (a, b), there is some x ∈ (a, b)
such that
\[ f'(x) = \frac{f(b) - f(a)}{b - a} . \tag{9.3} \]
Geometrically, this means that there is some tangent line to the graph of f , which is
parallel to the line between (a, f (a)) and (b, f (b)); see Figure 9.1.
Before we discuss these concepts, here is an example of how L’Hôpital’s rule is applied:
Example 9.1. The theorem allows us to determine the limit of the function
\[ f(x) = \frac{\sin(x)}{x} \qquad (x \neq 0) \tag{9.4} \]
\[ \lim_{x \to 0} \frac{\cos(x)}{1} = 1 , \tag{9.5} \]
\[ \lim_{x \to 0} \frac{\sin(x)}{x} = \lim_{x \to 0} \frac{\cos(x)}{1} = 1 . \tag{9.6} \]
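The limit in Example 9.1 is easy to observe numerically: the original quotient and the quotient of the derivatives approach the same value. The helper names `ratio` and `derivative_ratio` and the sample points in this sketch are our own.

```python
import math


def ratio(x):
    """The quotient sin(x) / x from (9.4), defined for x != 0."""
    return math.sin(x) / x


def derivative_ratio(x):
    """The quotient of the derivatives, cos(x) / 1, from (9.5)."""
    return math.cos(x) / 1.0


for x in (1e-1, 1e-2, 1e-3, 1e-4):
    print(x, ratio(x), derivative_ratio(x))
```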
Theorem 9.2 (Rolle’s Theorem). If f is continuous on [a, b] and differentiable on (a, b),
and f (a) = f (b), then there is a number x in (a, b) such that f′(x) = 0.
Proof. Since f is continuous on [a, b] it has a maximum point, and a minimum point in
[a, b].
If the maximum, or minimum, occurs at x ∈ (a, b), then f′(x) = 0 at this point. More
precisely, we are using here that if x ∈ (a, b) is a local maximum (or minimum), then
necessarily f′(x) = 0. (See Additional for a proof of this statement.)
If both the maximum and minimum points lie on the boundary, then since f (a) = f (b)
they must be equal, and the function f is a constant, hence f′(x) = 0 for any x ∈ (a, b).
Exercise 9.1. Draw the graph of a function that is continuous on [a, b], but not
differentiable on (a, b), for which the conclusion of Rolle’s theorem is false.
\[ f'(x) = \frac{f(b) - f(a)}{b - a} . \tag{9.7} \]
Proof. Recall the geometric interpretation of this statement: At the point x the slope of
the tangent equals that of the line from (a, f (a)) to (b, f (b)).
Now the line from (a, f (a)) to (b, f (b)) is the graph of the linear function
\[ l(x) = \frac{f(b) - f(a)}{b - a} (x - a) + f(a) . \tag{9.8} \]
In particular l(a) = f (a), and l(b) = f (b) and we set
\[ h(x) = f(x) - l(x) , \]
which is the height of the graph of f over the line from (a, f (a)) to (b, f (b)). We have
\[ h'(x) = f'(x) - l'(x) = f'(x) - \frac{f(b) - f(a)}{b - a} = 0 . \tag{9.11} \]
Corollary 9.4. If f is defined on an interval and f′(x) = 0 for all x in the interval,
then f is constant on the interval.
Proof. Take any points a < b in that interval, then by the mean value theorem there is
an x ∈ (a, b) with f′(x)(b − a) = f (b) − f (a), but f′(x) = 0 so f (a) = f (b).
Exercise 9.2. If f and g are defined on the same interval and f′(x) = g′(x) for all x, show
that then there is a constant c such that f (x) = g(x) + c.
Theorem 9.5 (Cauchy mean value theorem). Let f and g be continuous on [a, b] and
differentiable on (a, b), then there is a number x ∈ (a, b) such that
\[ \bigl( f(b) - f(a) \bigr) g'(x) = \bigl( g(b) - g(a) \bigr) f'(x) . \]
Remark 9.1. Note that the special case g(x) = x is the mean value theorem, but the
Cauchy mean value theorem is not a direct consequence of the mean value theorem,
because while f (b) − f (a) = f′(x)(b − a) for some x, and g(b) − g(a) = g′(y)(b − a) for
some y, x and y are not necessarily the same.
Proof. Let
\[ h(x) = \bigl( f(b) - f(a) \bigr) g(x) - f(x) \bigl( g(b) - g(a) \bigr) \tag{9.13} \]
This theorem is the main statement we need to evaluate limits of the form
\[ \lim_{x \to a} \frac{f(x)}{g(x)} \tag{9.15} \]
Proof of Theorem 9.1. Recall that we assume that f and g approach the limit 0 near a,
so let us define (possibly redefine)
then f and g are continuous at a. Then by the Cauchy mean value theorem applied to f
and g on the interval [a, x], we get that there exists a < αx < x, such that
Note that the assumptions of Theorem 9.5 are indeed satisfied: Since f′(x)/g′(x)
approaches a limit, it is in particular defined near a, and so g′(x) ≠ 0 near a. This also
shows that g(x) ≠ 0 near a, because if g(x) = 0 for some x > a, then by the mean value
theorem there would exist a y ∈ (a, x) with g′(y) = 0, again contradicting that g′(x) ≠ 0
near a.
Furthermore it follows that
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(\alpha_x)}{g'(\alpha_x)} , \tag{9.18} \]
\[ l = \lim_{x \to a} \frac{f'(x)}{g'(x)} \tag{9.19} \]
exists. This means that for any ε > 0, we can find δ > 0, so that, if |y − a| < δ, then
\[ \Bigl| \frac{f'(y)}{g'(y)} - l \Bigr| < \varepsilon . \tag{9.20} \]
\[ \Bigl| \frac{f(x)}{g(x)} - l \Bigr| = \Bigl| \frac{f'(\alpha_x)}{g'(\alpha_x)} - l \Bigr| < \varepsilon \tag{9.21} \]
Problems
1. A function is increasing on an interval if f (a) < f (b) whenever a and b are two
numbers in the interval with a < b. Similarly for a decreasing function.
Show that if f′(x) > 0 for all x in an interval, then f is increasing on the interval.
3. a) Suppose that f′(x) > g′(x) for all x and that f (a) = g(a). Show that f (x) >
g(x) for x > a and f (x) < g(x) for x < a.
b) Show by an example that these conclusions do not follow without the hypothesis
f (a) = g(a).
\[ \lim_{x \to 1} \frac{x^3 + x - 2}{x^2 - 3x + 2} = \lim_{x \to 1} \frac{3x^2 + 1}{2x - 3} = \lim_{x \to 1} \frac{6x}{2} = 3 \tag{9.22} \]
6. Prove the following variations of L’Hôpital’s rule (with much the same reasoning
as in the proof of Theorem 9.1).
a) If lim_{x→a} f (x) = lim_{x→a} g(x) = 0, and lim_{x→a} f′(x)/g′(x) = ∞, then
\[ \lim_{x \to a} f(x)/g(x) = \infty . \]
Note 10.
Inverse functions
The graph of f⁻¹ is the graph of f reflected across the diagonal line consisting of all
points (x, x). For f⁻¹ to be a function it is necessary that, geometrically, no horizontal
line intersects the graph of f twice, and this property has a name:
Example 10.1. Say f (x) = x³. Then f⁻¹ is the function that assigns to y = x³ the
unique number x, that is
\[ f^{-1}(y) = \sqrt[3]{y} . \tag{10.1} \]
More generally, the fact that f⁻¹(x) is the number y such that f (y) = x can be restated
as: f (f⁻¹(x)) = x for every point x in the domain of f⁻¹, or alternatively f⁻¹(f (x)) = x
for every point x in the domain of f .
We know that all increasing functions and all decreasing functions are one-to-one.
Exercise 10.1. Show that if $f$ is increasing, then $f^{-1}$ is also increasing.
Exercise 10.2. A function $f$ is increasing if and only if $-f$ is decreasing.
However, it is not true that every one-to-one function is either increasing or decreasing.
Example 10.2. The function
\[ f(x) = \begin{cases} x^2 & 0 < x < 1 \\ \dfrac{1}{x-1} + 1 & x > 1 \end{cases} \tag{10.2} \]
and so the slope of $L'$ is the reciprocal of the slope of $L$. In other words, this suggests
that:
\[ (f^{-1})'(f(a)) = \frac{1}{f'(a)} , \tag{10.3} \]
or alternatively,
\[ (f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))} . \tag{10.4} \]
There is another reason this formula should be true: we know that
\[ f(f^{-1}(y)) = y \tag{10.5} \]
This argument is not a proof because it presupposes that we know that $f^{-1}$ is differentiable,
but it does tell us that if $f$ and $f^{-1}$ are differentiable, then $(f^{-1})'$ must be given by this
formula.
This argument also tells us:
Corollary 10.1. If $f$ is a continuous one-to-one function defined on an interval and
$f'(f^{-1}(a)) = 0$, then $f^{-1}$ is not differentiable at $a$.
Example 10.3. The function $f(x) = x^3$ is continuous and one-to-one, and satisfies
$f'(0) = 0$, and indeed $f^{-1}$ is not differentiable at $0 = f^{-1}(0)$. (Draw a picture!)
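A small numerical experiment illustrates both the formula (10.4) and the failure at 0 for $f(x) = x^3$. This is a sketch under assumptions: the evaluation point $b = 8$, the step sizes, and the helper names are illustrative and not part of the notes.

```python
import math

def f_prime(x: float) -> float:
    """Derivative of f(x) = x^3."""
    return 3.0 * x ** 2

def f_inv(y: float) -> float:
    """Inverse of f(x) = x^3, i.e. the real cube root."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def central_difference(g, b: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient approximating g'(b)."""
    return (g(b + h) - g(b - h)) / (2.0 * h)

# Away from 0 the difference quotient agrees with 1 / f'(f_inv(b)).
b = 8.0
numeric = central_difference(f_inv, b)
formula = 1.0 / f_prime(f_inv(b))
print(numeric, formula)

# At 0 the one-sided quotients (f_inv(h) - f_inv(0)) / h grow without bound.
quotients = [f_inv(h) / h for h in (1e-2, 1e-4, 1e-6)]
print(quotients)
```

The growing quotients at 0 reflect the vertical tangent of the cube root there.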
Now finally, the positive results:
Theorem 10.2 (Continuity of the inverse). If $f$ is continuous and one-to-one on an
interval, then $f^{-1}$ is also continuous.
This is surprisingly cumbersome to show and we will not go into the proof here (see
the additional notes to Module IV).
Theorem 10.3 (Differentiability of the inverse). Let $f$ be a continuous one-to-one
function defined on an interval, and suppose that $f$ is differentiable at $f^{-1}(b)$, with
$f'(f^{-1}(b)) \neq 0$. Then $f^{-1}$ is differentiable at $b$, and
\[ (f^{-1})'(b) = \frac{1}{f'(f^{-1}(b))} . \tag{10.7} \]
Proof. In order to prove the theorem, we need to look at the difference quotient
\[ \frac{f^{-1}(b+h) - f^{-1}(b)}{h} \tag{10.8} \]
where $b = f(a)$. For given $h$, let us choose $k$, depending on $h$, so that $b + h = f(a + k)$.
Then we can also write this as
\[ \frac{a + k - a}{f(a+k) - f(a)} = \frac{k}{f(a+k) - f(a)} . \tag{10.9} \]
Moreover, since $f$ is differentiable at $a$, we have
\[ f_n(x) = x^n \tag{10.12} \]
For $n$ odd, this function is continuous and one-to-one, and for $n$ even it is so if we take
the domain to be $[0, \infty)$. We have
\[ f_n^{-1}(x) = \sqrt[n]{x} = x^{1/n} \tag{10.13} \]
whose domain is $\mathbb{R}$ when $n$ is odd, and $[0, \infty)$ if $n$ is even. By Theorem 10.3 we have, for
$x \neq 0$,
\[ (f_n^{-1})'(x) = \frac{1}{n x^{(n-1)/n}} = \frac{1}{n} x^{1/n - 1} \tag{10.14} \]
Hence if $f(x) = x^a$ for any integer $a$, or $a$ the reciprocal of a natural number, then
$f'(x) = a x^{a-1}$. In fact, for any rational number $a = m/n$, we can write
Problems
1. Find $f^{-1}$ for each of the following functions $f$.
a) $f(x) = x^3 + 1$
b) $f(x) = (x - 1)^3$
3. Prove that if $f$ and $g$ are one-to-one, then $f \circ g$ is also one-to-one. Find a formula
for $(f \circ g)^{-1}$ in terms of $f^{-1}$ and $g^{-1}$.
Additional: Critical points, and continuity of
the inverse
10.1. Critical points
Further Reading
Theorem 10.1. Let $f$ be any function defined on $(a, b)$. If $x$ is a local maximum (or
minimum) point for $f$ on $(a, b)$, and $f$ is differentiable at $x$, then $f'(x) = 0$.
\[ \frac{f(x+h) - f(x)}{h} \tag{10.1} \]
is $\le 0$ for $h > 0$, and $\ge 0$ for $h < 0$, because $x$ is a local maximum point. Consequently,
\[ \lim_{h \to 0^+} \frac{f(x+h) - f(x)}{h} \le 0 , \qquad \lim_{h \to 0^-} \frac{f(x+h) - f(x)}{h} \ge 0 \tag{10.2} \]
and since $f$ is differentiable both limits exist and are equal. Hence $f'(x) = 0$.
However, the converse is not true: a function $f$ whose derivative is zero at
a point $x$ does not necessarily have a minimum or maximum at that point. A simple
example is the function $f(x) = x^3$, for which $f'(0) = 0$, yet it does not have a minimum or
maximum anywhere.
\[ \nabla f(\vec{a}) = 0 . \tag{10.3} \]
Similarly, we can prove that if $f(x_1, x_2)$ has a local maximum at a point $\vec{a} = (a_1, a_2)$,
in the sense that for some $\delta > 0$,
We have seen that functions can be defined as pairs of numbers. The pairs of numbers
(x, y) consist of points x in the domain of f and the values y = f (x).
Definition 10.4. For any function $f$, the inverse of $f$, denoted by $f^{-1}$, is the set of
pairs $(y, x)$ for which the pair $(x, y)$ is in $f$.
However, we have seen that what makes a collection of pairs $(a, b)$ a function $f$, is that
for each point $a$ in the domain, there is a unique number $b$ such that $(a, b)$ is in $f$. So for
$f^{-1}$ to be a function, we need that for each $y = f(x)$ there is a unique number $x$ such
that $f(x) = y$. In other words, $f$ needs to be one-to-one.
A function $f$ which is one-to-one has an inverse function $f^{-1}$. The inverse function $f^{-1}$
is itself one-to-one, and $(f^{-1})^{-1} = f$. If the pair $(a, b)$ is in $f$, then $b = f(a)$; moreover if
$f$ is one-to-one, then $(b, a)$ is in $f^{-1}$, and $a = f^{-1}(b)$.
We have seen that increasing, or decreasing functions are one-to-one.
Let us prove Theorem 10.2 in the case that the domain of f is an open interval, and f
is increasing on that interval.
Additional Problems
1. In Note 8, we have seen in Theorem 8.3 a sufficient criterion for differentiability of
a function of two variables, but we deferred the proof because it relies on the Mean
Value Theorem. This exercise guides you through the proof of Theorem 8.3:
Suppose the partial derivatives of a function f (x, y) exist at every point. We want
to show that at every point (a, b),
where
\[ \nabla f(a, b) = (\partial_x f(a, b), \partial_y f(a, b)) \quad \text{and} \quad \lim_{(h,k) \to 0} \frac{E(h, k)}{|(h, k)|} = 0 . \tag{10.15} \]
and apply the mean value theorem twice. Finally use that the partial derivatives
are continuous functions to evaluate the limit.
Module V.
Integration
Note 11.
The fundamental theorem of Calculus
Further Reading
Given a continuous function $f \ge 0$ on $[a, b]$ we can talk about the “area under the
graph of $f$”. In fact, this concept can be made precise for a larger class of bounded
functions which are called integrable (in particular they need not be nonnegative), and
the number $\int_a^b f$ which formalizes the concept of “area” is called the integral.
In this subject, we will not define the integral, or attempt to discuss this notion with
the same level of care as we have treated the topics of continuity, and differentiability,
for instance. Consequently, we cannot hope to prove any of the fundamental theorems
about integration, most importantly the fundamental theorems of Calculus stated
below, which relate integration and differentiation.
In this note, we will only state these theorems as facts.
Basic properties of the integral. The basic properties of the integral, which we denote
interchangeably by
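\[ \int_a^b f(x)\,dx \quad \text{or} \quad \int_a^b f \tag{11.1} \]
are that
\[ \int_a^b f = \int_a^c f + \int_c^b f \qquad (\text{for any } a < c < b) \tag{11.2} \]
\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g \tag{11.3} \]
\[ \int_a^b c f = c \int_a^b f \qquad (\text{for any } c \in \mathbb{R}) \tag{11.4} \]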
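Since the integral itself is not defined in this subject, the properties (11.2) to (11.4) can at least be observed numerically. The sketch below uses a midpoint Riemann sum as a stand-in for the integral; the helper `midpoint_integral`, the test functions, and the interval are illustrative assumptions.

```python
import math

def midpoint_integral(f, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = math.cos
g = lambda x: x * x
a, c, b, scalar = 0.0, 0.7, 2.0, 3.0

whole = midpoint_integral(f, a, b)
split = midpoint_integral(f, a, c) + midpoint_integral(f, c, b)      # (11.2)
sum_lhs = midpoint_integral(lambda x: f(x) + g(x), a, b)             # (11.3)
sum_rhs = midpoint_integral(f, a, b) + midpoint_integral(g, a, b)
scaled_lhs = midpoint_integral(lambda x: scalar * f(x), a, b)        # (11.4)
scaled_rhs = scalar * midpoint_integral(f, a, b)

print(abs(whole - split), abs(sum_lhs - sum_rhs), abs(scaled_lhs - scaled_rhs))
```

All three printed differences are at the level of the discretisation and rounding error.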
It turns out this function is itself continuous in $x$, and more importantly always differentiable:
Theorem 11.1 (First fundamental theorem of Calculus). If $f$ is continuous on $[a, b]$,
then $F$ is differentiable on $(a, b)$, and
\[ F'(x) = f(x) . \tag{11.6} \]
Remark 11.1. The theorem says that given any continuous function on [a, b] (in fact,
more generally any merely integrable function), there always exists a function F , whose
derivative is f ; namely (11.5).
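The statement $F' = f$ can be tested numerically. The sketch below assumes that $F$ is the function $F(x) = \int_a^x f$, approximates it by a midpoint sum, and differentiates it with a symmetric difference quotient; the integrand, the base point $a = 0$, and the helper names are illustrative choices.

```python
def midpoint_integral(f, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(t: float) -> float:
    """A continuous integrand chosen for illustration."""
    return 1.0 / (1.0 + t * t)

def F(x: float, a: float = 0.0) -> float:
    """Assumed form of F: the integral of f from a to x."""
    return midpoint_integral(f, a, x)

def central_difference(func, x: float, h: float = 1e-3) -> float:
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

x = 1.3
approx_derivative = central_difference(F, x)
print(approx_derivative, f(x))
```

The two printed numbers agree to several digits, as Theorem 11.1 predicts.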
Also note that if G is defined by
\[ G(x) = \int_x^b f \tag{11.7} \]
Exercise 11.1. Compute the integral of $f(x) = x^n$ on the interval $[a, b]$ for any natural
number $n \in \mathbb{N}$.
Example 11.2. Also for $f(x) = x^{-n}$, where $n \in \mathbb{N}$, $n \neq 1$, we know that we can find $g$
with $g'(x) = f(x)$, at least for $x \neq 0$: $g(x) = (-n + 1)^{-1} x^{-n+1}$. Thus for $0 < a < b$
\[ \int_a^b x^{-n}\,dx = \frac{1}{n-1} \left( \frac{1}{a^{n-1}} - \frac{1}{b^{n-1}} \right) . \tag{11.10} \]
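As a sanity check of (11.10), one can compare the closed form with a numerical approximation of the integral. This is a sketch with illustrative values $n = 3$, $a = 1$, $b = 2$; the midpoint rule is only a stand-in for the integral.

```python
def midpoint_integral(f, a: float, b: float, steps: int = 20_000) -> float:
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def power_integral_formula(n: int, a: float, b: float) -> float:
    """Right hand side of (11.10), valid for n != 1 and 0 < a < b."""
    return (1.0 / (n - 1)) * (1.0 / a ** (n - 1) - 1.0 / b ** (n - 1))

n, a, b = 3, 1.0, 2.0
numeric = midpoint_integral(lambda x: x ** (-n), a, b)
closed_form = power_integral_formula(n, a, b)
print(numeric, closed_form)
```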
The exception n = 1 in the above example is significant. While there is no monomial
whose derivative is the function 1/x, we do know, by the fundamental theorem of Calculus,
that there exists a function g(x) whose derivative is f (x) = 1/x for x > 0, namely
\[ g(x) = \int_a^x \frac{1}{t}\,dt \tag{11.11} \]
For $a = 1$ this serves as the definition of the logarithm (see the additional notes to
Module V).
Problems
1. The fundamental theorem of Calculus, together with the chain rule, allows us to
compute derivatives of a variety of functions defined in terms of integrals.
Example 11.3. Let us compute the derivative of the function
\[ f(x) = \int_a^{\sin x} \frac{1}{1 + \sin^2(t)}\,dt \tag{11.12} \]
\[ f'(x) = F'(\sin(x)) \cos(x) = \frac{\cos(x)}{1 + \sin^2(\sin(x))} . \tag{11.14} \]
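The result (11.14) can be compared against a direct numerical differentiation of the integral in (11.12). The sketch assumes the lower limit $a = 0$ (the formula does not depend on it) and uses a midpoint sum for the integral; all helper names are illustrative.

```python
import math

def midpoint_integral(g, a: float, b: float, n: int = 20_000) -> float:
    """Midpoint Riemann sum; also works when b < a (the step is then negative)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def integrand(t: float) -> float:
    return 1.0 / (1.0 + math.sin(t) ** 2)

def f(x: float, a: float = 0.0) -> float:
    """The function of (11.12) with the assumed lower limit a = 0."""
    return midpoint_integral(integrand, a, math.sin(x))

def f_prime_formula(x: float) -> float:
    """Right hand side of (11.14)."""
    return math.cos(x) / (1.0 + math.sin(math.sin(x)) ** 2)

x, h = 0.8, 1e-3
numeric = (f(x + h) - f(x - h)) / (2.0 * h)
print(numeric, f_prime_formula(x))
```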
2. Find $(f^{-1})'(0)$ if
\[ f(x) = \int_0^x \bigl( 1 + \sin(\sin(t)) \bigr)\,dt \tag{11.15} \]
3. a) Find $F'$ if
\[ F(x) = \int_0^x x f(t)\,dt \tag{11.16} \]
Hint: The answer is not $x f(x)$.
b) Prove that if $f$ is continuous, then
\[ \int_0^x f(u)(x - u)\,du = \int_0^x \left( \int_0^u f(t)\,dt \right) du \tag{11.17} \]
Note 12.
The “simplest” differential equations
The simplest example of a differential equation is
\[ y' = f(x) . \tag{12.1} \]
Here we are looking for a function $y$ whose derivative is $f$. The fundamental theorem
of Calculus says that for any continuous function $f(x)$ this differential equation has a
solution, namely
\[ y(x) = \int_a^x f(t)\,dt . \tag{12.2} \]
Figure 12.1.: The function $f$ in (12.1) geometrically prescribes a slope, and gives rise
to the simplest example of a direction field. Solutions are functions whose
graphs are tangent to the indicated slopes at every point.
\[ \exp(x) = e^x . \tag{12.4} \]
The exponential can be defined as the inverse of the logarithm; see Note 13 below.
\[ \log(x) = \int_1^x \frac{1}{t}\,dt \quad (x > 0) , \qquad \exp(x) = \log^{-1}(x) . \tag{12.5} \]
The statement that the exponential function solves the differential equation (12.3) then
follows from the formula for the derivative of the inverse given in Note 10:
\[ \exp'(x) = (\log^{-1})'(x) = \frac{1}{\log'(\exp(x))} = \exp(x) . \tag{12.6} \]
Of course $e^x$ is not the only function with that property; for example, $f(x) = c \exp(x)$ also
satisfies the relation $f'(x) = f(x)$ for any constant $c \in \mathbb{R}$. However, these are all:
Remark 12.1. In other words, the solutions of the differential equation $y' = y$ are all of
the form $y(x) = c e^x$ for some constant $c \in \mathbb{R}$.
\[ g'(x) = \frac{e^x f'(x) - f(x) e^x}{(e^x)^2} = 0 \tag{12.9} \]
\[ g(x) = \frac{f(x)}{e^x} = c \tag{12.10} \]
\[ y'' + y = 0 . \tag{12.11} \]
It is easy to verify that the trigonometric functions are solutions, namely for both
It may at first be surprising that $\cos(x)$ and $\sin(x)$ are the only solutions to (12.11) with
these values at 0.
\[ f'' + f = 0 \tag{12.14} \]
\[ f(0) = 0 , \qquad f'(0) = 0 . \tag{12.15} \]
Then $f = 0$.
for all $x$.
\[ f'' + f = 0 \tag{12.18} \]
\[ f(0) = a , \qquad f'(0) = b \tag{12.19} \]
then
\[ f(x) = a \cos(x) + b \sin(x) . \tag{12.20} \]
Indeed, if we define
\[ g(x) = f(x) - a \cos(x) - b \sin(x) \tag{12.21} \]
then $g$ also satisfies $g'' + g = 0$, $g(0) = 0$, and $g'(0) = 0$, from which we conclude
with the Lemma that $g(x) = 0$.
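The uniqueness statement (12.20) can be illustrated by integrating $f'' + f = 0$ numerically and comparing with $a \cos(x) + b \sin(x)$. The sketch below uses a classical fourth order Runge-Kutta scheme, which is not part of these notes; the initial values, the end point, and the function name are illustrative.

```python
import math

def rk4_oscillator(a: float, b: float, x_end: float, steps: int = 2000) -> float:
    """Integrate y' = v, v' = -y from x = 0 with y(0) = a, v(0) = b; return y(x_end)."""
    h = x_end / steps
    y, v = a, b
    for _ in range(steps):
        k1y, k1v = v, -y
        k2y, k2v = v + 0.5 * h * k1v, -(y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -(y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -(y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return y

a, b, x_end = 1.5, -0.4, 2.0
numeric = rk4_oscillator(a, b, x_end)
exact = a * math.cos(x_end) + b * math.sin(x_end)
print(numeric, exact)
```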
Unexpected consequences are the addition theorems for trigonometric functions.
Proposition 12.3.
In particular we note that $\cos(2x) = \cos^2(x) - \sin^2(x)$, and $\sin(2x) = 2 \cos(x) \sin(x)$.
Note 13.
Logarithm and exponential function
Further Reading
(Spivak, Calculus, Chapter 18), with more motivation for the definition of the
logarithm. (Apostol, Calculus I, Chapter 8.3), with an emphasis on the exponential
function as the solution to a differential equation.
13.1. Logarithm
The logarithm is an example of a function that is defined by an integral.
Definition 13.1 (Logarithm). For $x > 0$, we set
\[ \log(x) = \int_1^x \frac{1}{t}\,dt \tag{13.1} \]
Exercise 13.1. Sketch the graph of the logarithm.
Proposition 13.1. If x, y > 0, then
log(xy) = log(x) + log(y) (13.2)
Proof. Note that by the fundamental theorem of calculus $\log'(x) = 1/x$. Now choose a
number $y > 0$, and let $f(x) = \log(xy)$. Then
\[ f'(x) = \log'(xy)\, y = \frac{y}{xy} = \frac{1}{x} \tag{13.3} \]
which says that $f' = \log'$. This implies that there is a number $c$ such that
\[ f(x) = \log(x) + c \tag{13.4} \]
for all $x > 0$, and we can find $c$ by evaluating
\[ f(1) = \log(y) = \log(1) + c = c \tag{13.5} \]
and therefore
\[ \log(xy) = f(x) = \log(x) + c = \log(x) + \log(y) . \tag{13.6} \]
Since this is true for all $y > 0$, the proposition is proved.
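Definition 13.1 and Proposition 13.1 can be explored numerically by evaluating the defining integral directly. The following sketch approximates $\int_1^x dt/t$ by a midpoint sum and compares with the library logarithm; the function name `log_by_integral` and the sample values are illustrative.

```python
import math

def log_by_integral(x: float, n: int = 20_000) -> float:
    """Approximate the integral of 1/t from 1 to x (x > 0) by a midpoint sum."""
    if x <= 0:
        raise ValueError("the logarithm is only defined for x > 0")
    h = (x - 1.0) / n          # negative for x < 1, which flips the sign as needed
    return h * sum(1.0 / (1.0 + (i + 0.5) * h) for i in range(n))

x, y = 2.0, 3.0
lhs = log_by_integral(x * y)
rhs = log_by_integral(x) + log_by_integral(y)
print(lhs, rhs, math.log(x * y))
print(log_by_integral(0.5), math.log(0.5))
```

The first line shows the additivity (13.2) up to discretisation error; the second shows that the integral is negative for $0 < x < 1$.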
Exercise 13.2. Show by induction that if $n$ is a natural number and $x > 0$, then
\[ \log(x^n) = n \log(x) \tag{13.7} \]
Corollary 13.2. If $x, y > 0$, then
\[ \log\left( \frac{x}{y} \right) = \log(x) - \log(y) . \tag{13.8} \]
Proof. This is true because
\[ \log(x) = \log\left( \frac{x}{y}\, y \right) = \log(x/y) + \log(y) . \tag{13.9} \]
The function $\log(x)$ is clearly increasing, but since $\log'(x) = 1/x$ the slope gets very
small when $x$ is large, and consequently $\log(x)$ grows more and more slowly. It is not
immediately clear if the function is bounded or unbounded. However, for any $n \in \mathbb{N}$,
\[ \log(2^n) = n \log(2) \tag{13.10} \]
and $\log 2 > 0$; similarly
\[ \log(2^{-n}) = -n \log(2) . \tag{13.11} \]
Thus by the intermediate value theorem the logarithm takes on any value $t \in \mathbb{R}$.
The number $e = \exp(1)$ is called Euler’s number. For any number $x$, we define
$e^x = \exp(x)$; see Additional.
Indeed, for $0 < x < 1$ we have $\log(x) < 0$; moreover $\log(1) = 0 < 1$, and for $x > 1$,
\[ \log(x) = \int_1^x \frac{dt}{t} \le (x - 1) < x . \tag{13.20} \]
Therefore
\[ x = \exp(\log(x)) < e^x , \tag{13.21} \]
which in particular implies (13.18).
Next we prove that
\[ \lim_{x \to \infty} \frac{e^x}{x} = \infty . \tag{13.22} \]
Since
\[ \frac{e^x}{x} = \frac{1}{2} \left( \frac{e^{x/2}}{x/2} \right) e^{x/2} \tag{13.23} \]
and in view of (13.21) the factor in parentheses is bounded from below by 1, this statement
also follows from (13.18).
Similarly we can now write
\[ \frac{e^x}{x^n} = \left( \frac{e^{x/n}}{x} \right)^n = \frac{1}{n^n} \left( \frac{e^{x/n}}{x/n} \right)^n \tag{13.24} \]
Problems
1. a) Check that the derivative of $\log \circ f$ is $f'/f$.
Note: The derivative of $\log \circ f$ is called the logarithmic derivative and is sometimes
easier to compute than $f'$, because taking the logarithm turns products
into sums. The formula says that multiplying $(\log \circ f)'$ by $f$ recovers $f'$, and
this process of finding the derivative of $f$ is called logarithmic differentiation.
Note 14.
Methods of integration
Definition 14.1 (Primitive). A function $F$ satisfying $F' = f$ is called a primitive of $f$.
However, in this lecture we will try to find a primitive which can be written in terms
of elementary functions, namely the trigonometric functions and their inverses, and the
logarithmic and exponential functions, and rational functions formed thereof.
Remark 14.1. Elementary primitives usually cannot be found. For example, there is no
elementary function $F$ such that
\[ F'(x) = e^{-x^2} . \tag{14.3} \]
The basic methods for finding elementary primitives are actually theorems which allow
us to express primitives of one function in terms of primitives of other functions. To
integrate we will therefore need a list of primitives for some functions, and such a list
can be obtained simply by differentiating various well-known functions.
Definition 14.2. For the primitive of a function $f$ we often use the notation
\[ \int f(x)\,dx \quad \text{or} \quad \int f . \tag{14.4} \]
These are also called indefinite integrals, in contrast to definite integrals of a function $f$
with primitive $F$ for which we adopt the notation
\[ \int_a^b f(x)\,dx = F(b) - F(a) = F(x) \Big|_a^b . \]
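Example 14.2. We can verify the following formulas by differentiating the right hand
sides:
\[ \int a\,dx = ax \tag{14.5} \]
\[ \int x^n\,dx = \frac{x^{n+1}}{n+1} \qquad (n \neq -1) \tag{14.6} \]
\[ \int \frac{1}{x}\,dx = \log x \tag{14.7} \]
\[ \int e^x\,dx = e^x \tag{14.8} \]
\[ \int \sin x\,dx = -\cos x \tag{14.9} \]
\[ \int \cos x\,dx = \sin x \tag{14.10} \]
\[ \int \sec^2 x\,dx = \tan x \tag{14.11} \]
\[ \int \frac{dx}{1 + x^2} = \arctan x \tag{14.12} \]
\[ \int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x \tag{14.13} \]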
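The verification "by differentiating the right hand sides" can be mimicked numerically. The sketch below differentiates each right hand side of (14.5) to (14.13) with a symmetric difference quotient at one sample point and compares with the integrand; the constants $a = 2$, $n = 3$, the point $x_0 = 0.4$, and the helper names are illustrative assumptions.

```python
import math

def central_difference(F, x: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

a, n = 2.0, 3
# Each entry: (equation label, primitive F, integrand f).
table = [
    ("(14.5)", lambda x: a * x, lambda x: a),
    ("(14.6)", lambda x: x ** (n + 1) / (n + 1), lambda x: x ** n),
    ("(14.7)", math.log, lambda x: 1.0 / x),
    ("(14.8)", math.exp, math.exp),
    ("(14.9)", lambda x: -math.cos(x), math.sin),
    ("(14.10)", math.sin, math.cos),
    ("(14.11)", math.tan, lambda x: 1.0 / math.cos(x) ** 2),
    ("(14.12)", math.atan, lambda x: 1.0 / (1.0 + x * x)),
    ("(14.13)", math.asin, lambda x: 1.0 / math.sqrt(1.0 - x * x)),
]

x0 = 0.4  # inside the domain of every function in the table
errors = {label: abs(central_difference(F, x0) - f(x0)) for label, F, f in table}
for label, err in errors.items():
    print(label, err)
```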
The formula for the definite integral follows if we integrate both sides of this equation on
the interval $[a, b]$.
Example 14.3.
\[ \int x e^x\,dx = x e^x - e^x \tag{14.19} \]
Example 14.4.
\[ \int x \sin x\,dx = -x \cos(x) + \int \cos x\,dx = -x \cos(x) + \sin(x) \tag{14.20} \]
Example 14.5.
\[ \int \log x\,dx = x \log x - \int x \cdot (1/x)\,dx = x \log x - x \tag{14.21} \]
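The primitives found by integration by parts in Examples 14.3 to 14.5 can be cross-checked against definite integrals: for each pair, $\int_a^b f$ should equal $F(b) - F(a)$. The sketch uses a midpoint sum on the illustrative interval $[1, 2]$; helper names are assumptions.

```python
import math

def midpoint_integral(f, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Each entry: (equation label, integrand f, primitive F from the example).
examples = [
    ("(14.19)", lambda x: x * math.exp(x), lambda x: x * math.exp(x) - math.exp(x)),
    ("(14.20)", lambda x: x * math.sin(x), lambda x: -x * math.cos(x) + math.sin(x)),
    ("(14.21)", math.log, lambda x: x * math.log(x) - x),
]

a, b = 1.0, 2.0
gaps = {
    label: abs(midpoint_integral(f, a, b) - (F(b) - F(a)))
    for label, f, F in examples
}
for label, gap in gaps.items():
    print(label, gap)
```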
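Example 14.6. The computation of the primitive of $\log(x)/x$ is an example where the
result is obtained in two steps:
\[ \int \frac{1}{x} \log(x)\,dx = \log(x)^2 - \int \log(x) \frac{1}{x}\,dx \tag{14.22} \]
\[ \int \frac{\log x}{x}\,dx = \frac{1}{2} (\log x)^2 \tag{14.23} \]
Example 14.7. Any previously computed primitive can be used for integration by parts:
\[ \begin{aligned} \int (\log(x))^2\,dx &= \int (\log x)(\log x)\,dx \\ &= \log(x) \bigl( x \log x - x \bigr) - \int \frac{1}{x} \bigl( x \log x - x \bigr)\,dx \\ &= \log(x) \bigl( x \log x - x \bigr) - \int \bigl( \log(x) - 1 \bigr)\,dx \end{aligned} \tag{14.24} \]
then $F' = f$, and
\[ (F \circ g)' = (F' \circ g) \cdot g' = (f \circ g) \cdot g' , \tag{14.28} \]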
Example 14.8.
\[ \int_a^b \sin^5(x) \cos(x)\,dx = \frac{1}{6} \sin^6(x) \Big|_a^b \tag{14.30} \]
because with $f(x) = x^5$, and $g(x) = \sin(x)$, this integral is of the form
\[ \int_a^b f(g(x)) g'(x)\,dx = \int_{g(a)}^{g(b)} f(y)\,dy = (F \circ g) \big|_a^b \tag{14.31} \]
follows that
\[ \int \sin^5(x) \cos(x)\,dx = \frac{\sin^6 x}{6} . \tag{14.34} \]
It is quite uneconomical to find the primitive by first evaluating a definite integral. Instead
we have the following procedure:
only the variable u appears. Then find a primitive in terms of u, and substitute
g(x) back in for u.
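so
\[ \int \sin^5(x) \cos(x)\,dx = \int u^5\,du = \frac{1}{6} u^6 = \frac{1}{6} \sin(x)^6 \tag{14.39} \]
Example 14.13. To evaluate
\[ \int \frac{x}{1 + x^2}\,dx \tag{14.40} \]
set
\[ u = 1 + x^2 , \qquad du = 2x\,dx \tag{14.41} \]
so this integral equals
\[ \frac{1}{2} \int \frac{du}{u} = \frac{1}{2} \log u = \frac{1}{2} \log(1 + x^2) \tag{14.42} \]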
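The substitution in Example 14.13 can be checked on a definite integral: with $u = 1 + x^2$ the limits $x = 0, 2$ become $u = 1, 5$. The sketch below compares both sides numerically and with the closed form; the interval $[0, 2]$ and the helper name are illustrative choices.

```python
import math

def midpoint_integral(f, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Left: integral in x. Right: the same integral after substituting u = 1 + x^2.
left = midpoint_integral(lambda x: x / (1.0 + x * x), 0.0, 2.0)
right = 0.5 * midpoint_integral(lambda u: 1.0 / u, 1.0, 5.0)
closed_form = 0.5 * math.log(5.0)   # (1/2) log(1 + x^2) evaluated between 0 and 2
print(left, right, closed_form)
```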
Example 14.14.
\[ \int e^{3x}\,dx = \frac{e^{3x}}{3} \tag{14.45} \]
\[ \int \cos(4x)\,dx = \frac{\sin(4x)}{4} \tag{14.46} \]
More interesting uses of the substitution formula appear when the factor $g'(x)$ does
not appear.
Example 14.15. Consider
\[ \int \frac{1 + e^x}{1 - e^x}\,dx . \tag{14.47} \]
The obvious substitution to try is
\[ u = e^x , \qquad du = e^x\,dx \tag{14.48} \]
and even though this factor does not appear in the integral we are led to
\[ \int \frac{1 + e^x}{1 - e^x}\,dx = \int \frac{1 + u}{1 - u} \frac{1}{u}\,du . \tag{14.49} \]
This can be integrated easily once we recognise that
\[ \frac{1 + u}{1 - u} \frac{1}{u} = \frac{2}{1 - u} + \frac{1}{u} , \tag{14.50} \]
hence
\[ \int \frac{1 + e^x}{1 - e^x}\,dx = -2 \log(1 - e^x) + \log(e^x) \tag{14.51} \]
Alternatively we could have set
\[ u = e^x , \qquad x = \log u , \qquad dx = \frac{1}{u}\,du \tag{14.52} \]
then immediately
\[ \int \frac{1 + e^x}{1 - e^x}\,dx = \int \frac{1 + u}{1 - u} \frac{1}{u}\,du . \tag{14.53} \]
\[ u = g(x) \tag{14.54} \]
and say we are in the situation that $g$ is one-to-one, at least for all $x$ under consideration,
in particular $g' \neq 0$, then we can solve
\[ x = g^{-1}(u) . \tag{14.55} \]
In order to find
\[ \int f(g(x))\,dx \tag{14.56} \]
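\[ \int \frac{e^{2x}}{\sqrt{e^x + 1}}\,dx . \tag{14.62} \]
Set
\[ u = \sqrt{e^x + 1} , \tag{14.63} \]
then
\[ u^2 = e^x + 1 \tag{14.64} \]
\[ x = \log(u^2 - 1) , \qquad dx = \frac{2u}{u^2 - 1}\,du \tag{14.65} \]
hence
\[ \begin{aligned} \int \frac{e^{2x}}{\sqrt{e^x + 1}}\,dx &= \int \frac{(u^2 - 1)^2}{u} \frac{2u}{u^2 - 1}\,du = 2 \int (u^2 - 1)\,du \\ &= \frac{2}{3} u^3 - 2u = \frac{2}{3} (e^x + 1)^{3/2} - 2 (e^x + 1)^{1/2} \end{aligned} \tag{14.66} \]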
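The primitive obtained in (14.66) can be double-checked by differentiating it numerically and comparing with the integrand of (14.62). This is a sketch; the sample points and helper names are illustrative.

```python
import math

def integrand(x: float) -> float:
    """Integrand of (14.62)."""
    return math.exp(2.0 * x) / math.sqrt(math.exp(x) + 1.0)

def primitive(x: float) -> float:
    """Right hand side of (14.66)."""
    s = math.exp(x) + 1.0
    return (2.0 / 3.0) * s ** 1.5 - 2.0 * s ** 0.5

def central_difference(F, x: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

sample_points = (-1.0, 0.0, 1.2)
errors = [abs(central_difference(primitive, x) - integrand(x)) for x in sample_points]
print(errors)
```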
Finally let us look at some examples for the integration of trigonometric functions
by substitution. When integrating monomials in trigonometric functions it is useful
to remember the formulas from Prop. 12.3, in particular
If we set
\[ x = \sin(u) \tag{14.70} \]
then $\sqrt{1 - x^2} = \cos(u)$ simplifies, so we are led to the substitution
\[ u = \arcsin(x) . \tag{14.71} \]
Then
\[ \int \sqrt{1 - x^2}\,dx = \int \sqrt{1 - \sin^2(u)} \cos(u)\,du = \int \cos^2(u)\,du . \tag{14.72} \]
This integral can be evaluated using that $\cos^2(u) = (1 + \cos(2u))/2$ and we find that
\[ \int \cos^2(u)\,du = \frac{u}{2} + \frac{\sin(2u)}{4} \tag{14.73} \]
and substituting back in u = arcsin(x) we have an expression for the primitive of (14.69):
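\[ \begin{aligned} \int_0^{2\pi} \cos^4(x)\,dx &= \int_0^{2\pi} \left( \frac{\cos(2x) + 1}{2} \right)^2 dx \\ &= \int_0^{2\pi} \left( \frac{1}{4} \cos^2(2x) + \frac{1}{2} \cos(2x) + \frac{1}{4} \right) dx \\ &= \frac{1}{4} \int_0^{2\pi} \left( \frac{\cos(4x) + 1}{2} + 1 \right) dx = \frac{3\pi}{4} \end{aligned} \tag{14.78} \]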
Problems
1. Find elementary expressions for the following primitives.
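a)
\[ \int \frac{\sqrt[5]{x^3} + \sqrt[6]{x}}{\sqrt{x}}\,dx \]
b)
\[ \int \frac{dx}{\sqrt{x - 1} + \sqrt{x + 1}} \]
c)
\[ \int \frac{dx}{a^2 + x^2} \]
d)
\[ \int \frac{dx}{\sqrt{a^2 - x^2}} \]
2. Solve by substitution.
a)
\[ \int x e^{-x^2}\,dx \]
b)
\[ \int \frac{e^x\,dx}{e^{2x} + 2e^x + 1} \]
c)
\[ \int \log(\cos(x)) \tan x\,dx \]
b)
\[ \int (\log x)^3\,dx \]
c)
\[ \int \frac{\log(\log x)}{x}\,dx \]
4. Find the following primitives in elementary terms using substitution.
a)
\[ \int e^x \sin e^x\,dx \]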
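b)
\[ \int x \sqrt{1 - x^2}\,dx \]
c)
\[ \int \frac{\log(\log(x))}{x \log(x)}\,dx \]
b)
\[ \int \frac{dx}{1 + e^x} \]
c)
\[ \int \frac{dx}{\sqrt{x} + \sqrt[3]{x}} \]
d)
\[ \int \frac{1}{\sqrt{1 + e^x}}\,dx \]
e)
\[ \int \frac{1}{2 + \tan(x)}\,dx \]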
Module VI.
Note 15.
First order linear differential equations
In the previous lectures we have already encountered the notion of a solution to a
differential equation, and the initial value problem.
The notion of a solution is more familiar for algebraic equations. For example the
equation $x^2 - 1 = 0$ has the solutions $x = \pm 1$, which means inserting the values $x = \pm 1$
turns the equation into a true statement.
For a differential equation the solutions are functions, which upon inserting turn the
equation into a true statement.
$f : U \to \mathbb{R}$
(iii) Let $(x_0, y_0) \in U$. We say $\varphi \in C^1(I)$ is a solution to the initial value problem
Remark 15.1. The general theory of differential equations addresses the question under
which conditions on f there exist solutions to (15.1), and when they are unique. It is also
of interest when the solutions can be expressed in explicit terms.
Remark 15.2. In practice we often write $y(x)$ for the solution, but conceptually it is
important to distinguish between the unknown $y$, and the solution $\varphi(x)$.
Remark 15.3. We use the notation $C^0(I)$ to denote “the space of continuous functions on
the interval $I$”. For example, $\varphi \in C^0(I)$, where $I = (a, b)$, means that $\varphi$ is continuous on
$(a, b)$. Similarly, $C^1(I)$ denotes the “space of continuously differentiable functions on $I$”.
So $\varphi \in C^1(I)$, where $I = (a, b)$, means that $\varphi$ is differentiable on $(a, b)$, and $\varphi' \in C^0(I)$.
In Module V we have seen the exponential as the solution to an initial value problem.
Let us prove this characterisation of the exponential in yet another way, and thereby give
an example of an existence and uniqueness theorem for a differential equation.
Proposition 15.1. Let $a, y_0 \in \mathbb{R}$. Then $f : I \to \mathbb{R}$ is a solution to the initial value
problem
\[ y' = ay , \qquad y(0) = y_0 \tag{15.4} \]
if and only if
\[ f(x) = y_0 e^{ax} . \tag{15.5} \]
Proof. Clearly, with $f(x)$ given by (15.5) we have
\[ f'(x) = y_0 a e^{ax} = a f(x) \tag{15.6} \]
and $f(0) = y_0$, so $f(x)$ solves the initial value problem. Conversely, let $f(x)$ be a solution
to the initial value problem (15.4), and set $g(x) = f(x) e^{-ax}$ on the interval $I$ where $f$ is
defined. Then
\[ g'(x) = f'(x) e^{-ax} - a f(x) e^{-ax} = 0 . \tag{15.7} \]
Therefore (by the Mean Value Theorem) $g(x) = g(0)$, hence
\[ f(x) e^{-ax} = g(x) = g(0) = f(0) = y_0 . \tag{15.8} \]
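The key step of the proof, that $g(x) = f(x) e^{-ax}$ is constant along any solution, can be observed on a numerically computed solution of (15.4). The sketch uses a classical Runge-Kutta integrator, which is not part of the notes; the values of $a$ and $y_0$ and the function name `rk4` are illustrative.

```python
import math

def rk4(rhs, x0: float, y0: float, x_end: float, steps: int = 1000) -> float:
    """Classical fourth order Runge-Kutta scheme for y' = rhs(x, y); returns y(x_end)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2, y + h * k1 / 2)
        k3 = rhs(x + h / 2, y + h * k2 / 2)
        k4 = rhs(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        x += h
    return y

a, y0 = -0.7, 2.5
g_values = []
for x_end in (0.5, 1.0, 2.0):
    f_x = rk4(lambda x, y: a * y, 0.0, y0, x_end)   # numerical solution of y' = a y
    g_values.append(f_x * math.exp(-a * x_end))      # g(x) = f(x) e^{-a x}
print(g_values)
```

Each entry of `g_values` stays close to $y_0$, in line with (15.8).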
\[ y' + P y = 0 , \qquad y(x_0) = y_0 \tag{15.12} \]
if and only if
\[ \varphi(x) = y_0 \exp\left[ -\int_{x_0}^x P(t)\,dt \right] \tag{15.13} \]
One can find the solutions to the inhomogeneous equation by the method of variation
of constants: Consider the function
\[ \varphi(x) = \varphi_0(x) e^{-G(x)} , \qquad G(x) = \int_{x_0}^x P(t)\,dt \tag{15.15} \]
which is obtained from the solution to the homogeneous equation by replacing the constant
$y_0$ by a function $\varphi_0 \in C^1(I)$. We will now derive a condition for the function $\varphi_0(x)$ for
$\varphi(x)$ to be a solution to (15.9). We compute
Integrating gives a formula for $\varphi_0(x)$ in terms of the known functions $P(x)$, and $Q(x)$.
if and only if
\[ \varphi(x) = e^{-\int_{x_0}^x P(t)\,dt} \left( y_0 + \int_{x_0}^x e^{\int_{x_0}^t P(\tau)\,d\tau} Q(t)\,dt \right) . \tag{15.20} \]
Exercise 15.1. Prove the first part of the theorem. Also verify that (15.20) solves the
initial value problem (15.19).
Example 15.1. Let us find all solutions of the equation
\[ \varphi(x; y_0) = \frac{e^{x-1}}{x} \left( y_0 + \int_1^x e^{-(t-1)} e^{2t}\,dt \right) = \frac{e^x}{x} \left( y_0 e^{-1} + e^x - e^1 \right) \tag{15.23} \]
Note that these solutions are unbounded as $x$ tends to 0, unless $y_0/e + 1 - e = 0$ in
which case $\lim_{x \to 0} \varphi(x; y_0) = 1$. Here we used that $e^x$ is well approximated by the linear
function $1 + x$ near $x = 0$.
Problems
1. Solve the following initial value problems:
a) $y' - 3y = e^{2x}$ with $y(0) = 0$
b) $y' + y = e^{2x}$ with $y(0) = 1$
2. Find all solutions of $y' \sin(x) + y \cos(x) = 1$ on the interval $(0, \pi)$. Prove that exactly
one of these solutions has a finite limit as $x \to 0$, and another has a finite limit as
$x \to \pi$.
\[ y' = x + y \tag{15.24} \]
Note 16.
Separable differential equations
In this lecture we consider a somewhat more challenging class of differential equations:
These are equations as in Definition 15.1 where the function f on the right hand side is
in fact a product of a function of x and a function of y.
More precisely, a separable differential equation is an equation of the form
\[ y' = f(x) g(y) \tag{16.1} \]
where f , and g are continuous functions on intervals I, and J respectively, and the
corresponding initial value problem is
and illustrate the behaviour that may occur if g has a zero with an example later.
We have not proven yet that a solution to (16.2) exists, but we will obtain such a proof
by first assuming there is a solution, and deriving an explicit formula for the solution.
Indeed, suppose ϕ is a solution to the initial value problem (16.2), then in view of the
assumption (16.3), we obtain
\[ \int_{x_0}^x \frac{\varphi'(t)}{g(\varphi(t))}\,dt = \int_{x_0}^x f(t)\,dt \tag{16.4} \]
We can apply the substitution rule of Theorem 14.2 to write the left hand side as
\[ \int_{x_0}^x \frac{\varphi'(t)}{g(\varphi(t))}\,dt = \int_{y_0}^{\varphi(x)} \frac{du}{g(u)} \tag{16.5} \]
where we have used that $\varphi(x_0) = y_0$. Introducing the notation
\[ G(y) = \int_{y_0}^y \frac{du}{g(u)} , \qquad F(x) = \int_{x_0}^x f(t)\,dt , \tag{16.6} \]
We have derived here a necessary condition for the solution: If there is a solution $\varphi$
to the initial value problem then it must be of this form. Conversely, we can now argue as
promised that given $f$, and $g$, we can define the functions $F$, and $G$ as in (16.6) and
verify that with $\varphi$ defined by (16.8),
\[ \varphi'(x) = \frac{1}{G'((G^{-1} \circ F)(x))} F'(x) = g(\varphi(x)) f(x) \tag{16.9} \]
and $\varphi(x_0) = G^{-1}(0) = y_0$.
In summary, we have proven that the solution to the initial value problem (16.2) is
precisely given by (16.8). In the same way one proves:
Theorem 16.1. Suppose $f$, and $g$ are continuous functions on intervals $I$, and $J$,
respectively and $g \neq 0$ on $J$. Then $\varphi$ is a solution to (16.1) if and only if
Problems
1. Determine the solutions to the differential equation
\[ y' = x + y . \tag{16.15} \]
Hint: This is not a separable equation, but it can be reduced to this case, by
considering the equation satisfied by
\[ y' = f(y/x) \tag{16.17} \]
Note 17.
Examples of first order equations
17.1. Examples of separable equations
We have discussed separable first order equations in general. Let us look at one more
example, and others will be discussed in the tutorials.
Example 17.1. Consider the non-linear equation $x y' + y = y^2$. By inspection we see that
$y = 0$, and $y = 1$ are solutions. The remaining solutions, when $y(y - 1) \neq 0$, and $x \neq 0$,
satisfy
\[ \frac{y'(x)}{y(y - 1)} = \frac{1}{x} \tag{17.1} \]
That means that a solution $y = \varphi(x)$ satisfies
\[ \int \frac{\varphi'(x)}{\varphi(x)(\varphi(x) - 1)}\,dx = \int \frac{dx}{x} . \tag{17.2} \]
We can use partial fractions to rewrite the integrand
\[ \frac{1}{y(y - 1)} = \frac{1}{y - 1} - \frac{1}{y} \tag{17.3} \]
so by the substitution rule,
\[ \int \frac{\varphi'(x)}{\varphi(x)(\varphi(x) - 1)}\,dx = \int \frac{dy}{y(y - 1)} = \ln|y - 1| - \ln|y| \tag{17.4} \]
with $y = \varphi(x)$, and on the right hand side $\int dx/x = \ln|x|$. Thus for some constant $C$,
\[ \ln\left| \frac{y - 1}{y} \right| = \ln|x| + C , \qquad y = \varphi(x) \tag{17.5} \]
which gives $|(y - 1)/y| = e^C |x|$, or $(y - 1)/y = Kx$ for some constant $K$. We have
\[ \frac{y - 1}{y} - 1 = -\frac{1}{y} = Kx - 1 \tag{17.6} \]
which finally gives the formula for the solutions
\[ \varphi_K(x) = \frac{1}{1 - Kx} \qquad (x \in I_K) \tag{17.7} \]
where $I_K = (-\infty, 1/K)$, or $I_K = (1/K, \infty)$ depending on the parameter $K \in \mathbb{R}$, $K \neq 0$.
Note that $K = 0$ corresponds to the solution $y = 1$.
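It is reassuring to substitute the formula (17.7) back into the equation $x y' + y = y^2$. The sketch below does this numerically, approximating $\varphi_K'$ by a symmetric difference quotient; the value $K = 2$ and the sample points (taken from both intervals $I_K$) are illustrative.

```python
def phi(x: float, K: float) -> float:
    """The solution family (17.7)."""
    return 1.0 / (1.0 - K * x)

def residual(x: float, K: float, h: float = 1e-5) -> float:
    """x*phi'(x) + phi(x) - phi(x)^2, with phi' from a symmetric difference quotient."""
    dphi = (phi(x + h, K) - phi(x - h, K)) / (2.0 * h)
    return x * dphi + phi(x, K) - phi(x, K) ** 2

K = 2.0
# -1.0 and 0.2 lie in (-inf, 1/K); 0.8 and 3.0 lie in (1/K, inf).
sample_points = (-1.0, 0.2, 0.8, 3.0)
residuals = [residual(x, K) for x in sample_points]
print(residuals)
```

The residuals vanish up to the error of the difference quotient, on either side of the singularity at $x = 1/K$.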
Exercise 17.1. Find the solution passing through any given point (x0 , y0 ). Sketch all
solutions to the differential equation (17.1). Are there points with several solutions passing
through them?
\[ y' = f(ax + by + c) \tag{17.8} \]
Here the direction field is constant on straight lines, and we can pass from $y(x)$ to the
new unknown
\[ u(x) = ax + by(x) + c . \tag{17.9} \]
Then $u$ satisfies
\[ u' = a + b y' = a + b f(u) \tag{17.10} \]
which is an equation of the form $u' = g(u)$. Conversely, any solution to (17.10) gives rise
to a solution of the original equation (17.8) using the relation (17.9).
Exercise 17.2. Verify this!
Example 17.2.
\[ y' = (x + y)^2 \tag{17.11} \]
We find that $u(x) = x + y(x)$ satisfies $u' = 1 + u^2$, hence
\[ \arctan(u(x)) = x + C \tag{17.12} \]
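Solving (17.12) for $u$ and using $y = u - x$ suggests the candidate $y(x) = \tan(x + C) - x$, valid while $x + C$ stays in $(-\pi/2, \pi/2)$. This closed form is an inference from (17.12) rather than a formula quoted from the notes, so the sketch below checks it numerically against (17.11); the constant $C = 0.1$ and the sample points are illustrative.

```python
import math

def y(x: float, C: float) -> float:
    """Candidate solution y = tan(x + C) - x, assumed from solving (17.12) for u."""
    return math.tan(x + C) - x

def residual(x: float, C: float, h: float = 1e-5) -> float:
    """y'(x) - (x + y(x))^2, with y' from a symmetric difference quotient."""
    dy = (y(x + h, C) - y(x - h, C)) / (2.0 * h)
    return dy - (x + y(x, C)) ** 2

C = 0.1
sample_points = (-0.5, 0.0, 0.3, 1.0)   # x + C stays inside (-pi/2, pi/2)
residuals = [residual(x, C) for x in sample_points]
print(residuals)
```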
Problems
1. Find formulas for the solutions of the following differential equations.
a) $(x + 1) y' + y^2 = 0$
b) $y' = (y - 1)(y - 2)$
c) $(x - 1) y' = xy$
\[ y' = (x - y + 3)^3 \tag{17.22} \]
Additional: Isoclines and Homogeneity
Further Reading
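\[ y' = \frac{y - x}{y + x} \tag{17.6} \]
\[ y' = \left( \frac{x^2 + y^2}{xy} \right)^3 . \tag{17.7} \]
All of these equations are separable: Indeed, any first-order equation with the property
(17.5) can be written as
\[ y' = f(1, y/x) . \tag{17.8} \]
\[ x v' = \frac{v - 1}{v + 1} - v = -\frac{v^2 + 1}{v + 1} , \qquad v = y/x , \tag{17.10} \]
\[ \int \frac{v\,dv}{v^2 + 1} + \int \frac{dv}{v^2 + 1} = -\int \frac{dx}{x} \tag{17.11} \]
\[ \frac{1}{2} \ln(v^2 + 1) + \arctan(v) = -\ln|x| + C \tag{17.12} \]
which shows that for every solution $y = \varphi(x)$ there is a constant $C$ such that
\[ \frac{1}{2} \ln\bigl( \varphi(x)^2 + x^2 \bigr) + \arctan\left( \frac{\varphi(x)}{x} \right) = C . \tag{17.13} \]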
We have seen examples of differential equations (17.8) which are homogeneous of degree
zero in the sense of (17.5). Let us explore some of their properties in greater generality.
Exercise 17.5. Show that straight lines through the origin are isoclines of the differential
equations which are homogeneous of degree zero.
Exercise 17.6. Demonstrate this property for the equation y 0 = −2y/x. Sketch the
isoclines and the direction field.
Exercise 17.7. Given that all straight lines are isoclines, and the slope of the direction
field is unchanged along an isocline, we may guess that the integral curves are similar, in
the sense that if
G = {(x, ϕ(x)) : x ∈ I} (17.14)
is the graph of a solution, then so is
\[ y' = -x/y \tag{17.16} \]
whose integral curves are concentric circles given by $x^2 + y^2 = C$ for some constant $C > 0$.
Note 18.
Linear differential equations of second order
with constant coefficients
The equation
\[ y'' = -k^2 y \tag{18.1} \]
describes the oscillations of a particle on a line around the origin $y = 0$, when a force
proportional to its displacement pulls it back to its equilibrium position at $y = 0$.[1]
This is an example of a second order equation, and could be viewed as a system of
first order equations: Introducing the unknowns $y_1 = y$ and $y_2 = y'$ we could rewrite this
equation as
\[ y_1' = y_2 , \qquad y_2' = -k^2 y_1 . \tag{18.2} \]
We can think of $\vec{y} = (y_1, y_2)$ as a point in the plane, and view the solution $\vec{\varphi}(t)$ as a point
moving through 2-dimensional space $\mathbb{R}^2$ with velocity $\vec{v}$:
\[ \vec{\varphi}'(t) = \vec{v}(\vec{\varphi}(t)) \tag{18.3} \]
Here $\vec{v}(y_1, y_2) = (y_2, -k^2 y_1)$ is a vectorfield.
Exercise 18.1. A vectorfield in $\mathbb{R}^2$, in comparison to a direction field which has a line
attached to each point, has in addition to the slope also a magnitude at each point.
In fact, with $\vec{v}(y_1, y_2) = (v_1(y_1, y_2), v_2(y_1, y_2))$ the magnitude at the point $(y_1, y_2)$ is
$|\vec{v}| = \sqrt{v_1^2 + v_2^2}$. Sketch the vectorfield $\vec{v}$ of (18.2).
Exercise 18.2. Show that in the case $k = 1$ concentric circles are solutions to (18.2). The
fact that $y_1^2/2 + y_2^2/2$ is constant along the solution curve can be interpreted as the law
of conservation of energy. What is the situation if $k \neq 1$?
While recasting a second order equation as a first order system is a very fruitful point
of view, we will not adopt this approach here, and study directly the linear differential
equation of second order with constant coefficients:
\[ y'' + a y' + b y = 0 \tag{18.4} \]
More generally, a linear differential equation of second order is an expression
of the form
\[ y'' + g(x) y' + h(x) y = r(x) \tag{18.5} \]
In the case that $r(x)$ vanishes identically we say the equation is homogeneous, otherwise
inhomogeneous.
[1] For the given equation, this relationship is linear, and is then also referred to as Hooke’s law in physics.
y(x) = c1 x + c2 (18.6)
with constant c2 . Conversely, for any numbers c1 , c2 ∈ R, the function given by (18.6) is
a solution, so we have found all solutions in this case.
The equation $y'' + by = 0$, when $b < 0$. Since $b < 0$, we can write $b = -k^2$ for some
$k > 0$, and the differential equation takes the form
\[ y'' = k^2 y . \tag{18.7} \]
We immediately verify that $y(x) = e^{kx}$ is a solution, and another is $y(x) = e^{-kx}$.
Therefore also linear combinations of these are solutions, and we conclude: For any
constants $c_1, c_2 \in \mathbb{R}$
\[ y(x) = c_1 e^{kx} + c_2 e^{-kx} \tag{18.8} \]
is a solution. We will prove below that these are in fact all solutions in this case.
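That linear combinations of $e^{kx}$ and $e^{-kx}$ solve $y'' = k^2 y$ can be confirmed numerically with a second difference quotient. This is only a sketch; the constants $k$, $c_1$, $c_2$, the sample points, and the helper names are illustrative.

```python
import math

def y(x: float, c1: float, c2: float, k: float) -> float:
    """The linear combination (18.8)."""
    return c1 * math.exp(k * x) + c2 * math.exp(-k * x)

def second_difference(func, x: float, h: float = 1e-3) -> float:
    """Symmetric second difference quotient approximating func''(x)."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

k, c1, c2 = 1.5, 0.3, -1.2
solution = lambda x: y(x, c1, c2, k)
residuals = [
    second_difference(solution, x) - k * k * solution(x)
    for x in (-1.0, 0.0, 0.7)
]
print(residuals)
```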
The equation $y'' + by = 0$, when $b > 0$. In this case let us again write $b = k^2$, then the
equation (18.4) takes the form
\[ y''(x) = -k^2 y . \tag{18.9} \]
One may recognize that this relation is satisfied by the function $y(x) = \cos(kx)$, and
also $y(x) = \sin(kx)$. We find again a general solution by forming a linear combination:
This does not show that any solution of (18.9) is of this form, but we have actually
already given a proof of that in Note 12 in the case $k = 1$.
Exercise 18.3. Can you generalise the argument we have given in Lemma 12.2 to show
that any solution to (18.9) is of the form (18.10)?
The case $a \neq 0$. The cases considered above actually cover the case $a \neq 0$ as well, in
the sense that we can reduce the problem (18.4) to the problem with $a = 0$.
The idea is to consider solutions of (18.4) of the form
y 00 + ay 0 + by = v 00 + av 0 + bv u + 2v 0 + av u0 + vu00 (18.12)
Proposition 18.1. Let $y(x) = u(x)\exp(-ax/2)$. Then the function $y$ satisfies (18.4) if and only if the function $u$ satisfies the differential equation
\[ u'' + \frac{4b - a^2}{4}\,u = 0 \tag{18.14} \]
\[ y'' + by = 0 \tag{18.15} \]
• when $b > 0$: set $b = k^2$ and set $f_1(x) = \cos(kx)$, and $f_2(x) = \sin(kx)$.
In other words, for every solution ϕ(x) of the equation (18.15) there are constants
c1 , c2 ∈ R so that
ϕ(x) = c1 f1 (x) + c2 f2 (x) . (18.16)
Proof. Let $\varphi(x)$ be a solution to (18.15), and let $f_1(x)$ and $f_2(x)$ be chosen as in the statement of the theorem, depending on the value of $b$. Then for any constants $c_1, c_2$ also
\[ \psi(x) = \varphi(x) - c_1 f_1(x) - c_2 f_2(x) \]
is a solution to (18.15).
Let us choose the constants $c_1$ and $c_2$ so that $\psi(0) = 0$ and $\psi'(0) = 0$. This amounts to solving the equations:
\[ \varphi(0) = c_1 + c_2 , \qquad \varphi'(0) = c_1 k - c_2 k \]
\[ c_1 = \frac{\varphi(0) + \varphi'(0)/k}{2} , \qquad c_2 = \frac{\varphi(0) - \varphi'(0)/k}{2} \]
Exercise 18.7. In view of Proposition 18.1 it is now also possible to state a theorem that
gives a characterisation of all solutions to (18.4), for any values of a, b ∈ R, depending
on the sign of the so-called discriminant of the differential equation (18.4):
\[ \Delta = a^2 - 4b . \tag{18.21} \]
State it!
Problems
1. Find explicitly the solution to the following initial value problem:
a) $y'' + ky = 0$, $y(0) = 0$, $y'(0) = y_1$, where $k > 0$, and $y_1 \in \mathbb{R}$ are fixed constants.
b) $y'' + ay' = 0$, $y(0) = 1$, $y'(0) = 0$, where $a \in \mathbb{R}$ is fixed.
2. Show that the solution (18.16) in the case b > 0 can also be written as
Additional: Uniqueness of solutions to the
initial value problem
We will give a proof of Theorem 18.2 above, in a way that applies to all cases, and
already outlines an approach to the general second order differential equation (18.5) (not
necessarily with constant coefficients).
In this section we will prove that there exists one and only one solution to the following
Initial value problem. Let $a, b \in \mathbb{R}$, and $x_0, y_0, y_1 \in \mathbb{R}$. The initial value problem for (18.4) is to find a solution $\varphi \in C^2(\mathbb{R})$ to the problem
\[ y'' + ay' + by = 0 , \tag{18.1a} \]
\[ y(x_0) = y_0 , \qquad y'(x_0) = y_1 . \tag{18.1b} \]
In particular, we want to prove that the only solution to this initial value problem
with x0 = y0 = y1 = 0 is y = 0.
We begin with the following observation:
Note this means in particular that either W (ϕ1 , ϕ2 )(x) = 0 for all x ∈ R or none.
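Proof. Let us denote for short by $W(x) = W(\varphi_1, \varphi_2)(x)$ and note that $W \in C^1(\mathbb{R})$ because $\varphi_1, \varphi_2 \in C^2(\mathbb{R})$. We have
\[ W'(x) = \varphi_1 \varphi_2'' - \varphi_1'' \varphi_2 = \varphi_1(-a\varphi_2' - b\varphi_2) - (-a\varphi_1' - b\varphi_1)\varphi_2 = -aW(x) \tag{18.4} \]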
Exercise 18.1. With $f_1$, and $f_2$ as defined in Theorem 18.2, compute $W(f_1, f_2)$ in all cases $b > 0$, $b = 0$, $b < 0$. In particular, verify that $W(f_1, f_2) \neq 0$ in all cases!
Then
ϕ1 (x) = ϕ2 (x) x ∈ R. (18.7)
Proof. The difference $\varphi(x) = \varphi_1(x) - \varphi_2(x)$ is a solution to the same equation with trivial initial values, namely $\varphi(x_0) = \varphi'(x_0) = 0$. Assume that $\varphi(x_1) \neq 0$ for some $x_1 \in \mathbb{R}$.
The idea is now to write down a solution to the initial value problem
In fact, we obtain
\[ c_1 = -\frac{\varphi(x_1) f_2(x_1)}{W(x_1)} , \qquad c_2 = \frac{\varphi(x_1) f_1(x_1)}{W(x_1)} . \tag{18.13} \]
Since $\varphi$ and $\psi$ are solutions to $y'' + by = 0$ we know from Proposition 18.1 that the Wronskian determinant either vanishes for all $x \in \mathbb{R}$ or none. However,
Remark 18.1. In the proof we have made the following important observation: In order
to find the solution to the initial value problem (18.1a,18.1b) it suffices to find any two
solutions of (18.1a) whose Wronski determinant does not vanish at a point. Indeed, if
$\varphi_1, \varphi_2 \in C^2(\mathbb{R})$ are two solutions to (18.1a) such that $W(x_1) \neq 0$ for some $x_1 \in \mathbb{R}$ where $W(x) = W(\varphi_1, \varphi_2)(x)$, then $W(x) \neq 0$ for all $x \in \mathbb{R}$ by virtue of Proposition 18.1, and we can set for any $x_0, y_0, y_1 \in \mathbb{R}$,
\[ c_1 = \frac{1}{W(x_0)} \big( y_0 \varphi_2'(x_0) - y_1 \varphi_2(x_0) \big) \tag{18.14a} \]
\[ c_2 = \frac{1}{W(x_0)} \big( -y_0 \varphi_1'(x_0) + y_1 \varphi_1(x_0) \big) . \tag{18.14b} \]
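The formulas (18.14a) and (18.14b) translate directly into a small routine. The sketch below assumes that the solution of the initial value problem is the combination $c_1\varphi_1 + c_2\varphi_2$ and that the Wronskian is $W = \varphi_1\varphi_2' - \varphi_1'\varphi_2$; the function name and the test equation $y'' + y = 0$ are illustrative choices.

```python
import math

def ivp_coefficients(phi1, dphi1, phi2, dphi2, x0, y0, y1):
    """Return (c1, c2) from (18.14a), (18.14b) for given solutions and their derivatives."""
    W = phi1(x0) * dphi2(x0) - dphi1(x0) * phi2(x0)  # Wronskian at x0
    if W == 0:
        raise ValueError("Wronskian vanishes at x0; not a fundamental system")
    c1 = (y0 * dphi2(x0) - y1 * phi2(x0)) / W
    c2 = (-y0 * dphi1(x0) + y1 * phi1(x0)) / W
    return c1, c2

# Example: y'' + y = 0 with phi1 = cos, phi2 = sin, and initial data at x0 = 0.3.
x0, y0, y1 = 0.3, 2.0, -1.0
c1, c2 = ivp_coefficients(math.cos, lambda x: -math.sin(x), math.sin, math.cos, x0, y0, y1)

# Value and slope of c1*cos + c2*sin at x0; these should reproduce y0 and y1.
value = c1 * math.cos(x0) + c2 * math.sin(x0)
slope = -c1 * math.sin(x0) + c2 * math.cos(x0)
```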
Problems
1. Let ϕ1 , and ϕ2 be two solutions to differential equation (18.1a), and assume ϕ1 is
not identically zero.
a) Prove that W (ϕ1 , ϕ2 )(0) = 0 if and only if ϕ2 /ϕ1 is constant.
b) Suppose $\varphi_2/\varphi_1$ is not constant. Let $\varphi$ be any solution to (18.1a). Show that there exist constants $c_1, c_2$ such that
Additional: The space of solutions
Further Reading
\[ T(y) = y' + P y . \tag{18.1} \]
where $\varphi_{\mathrm{in}}$ is a solution to the inhomogeneous problem (15.9). The set of solutions to the inhomogeneous equation is thus an affine subspace of $C^1(I)$ of dimension 1.
(ii) For f ∈ C 2 (R), f is a solution to (18.1a) if and only if f ∈ ker(T ). In particular the
set of solutions to the homogeneous second order differential equation with constant
coefficients is a linear subspace of C 2 (R).
(iii) If $f_1, f_2 \in \ker T$ then $f_1, f_2$ are linearly independent if and only if $W(f_1, f_2)(x) \neq 0$ for some $x \in \mathbb{R}$ (or equivalently $W(f_1, f_2)(x) \neq 0$ for all $x \in \mathbb{R}$).
(v) If f1 , f2 ∈ ker(T ) are linearly independent, then we say {f1 , f2 } are a fundamental
system for the differential equation (18.1a), and
(18.5)
is the space of solutions to (18.1a) defined on R.
We omit most of the proof except for (iii,iv) which are instructive.
Proof of (iii). If f1 , f2 are linearly dependent, say f1 = λf2 , then clearly W (f1 , f2 ) = 0.
Conversely suppose that $W(f_1, f_2) = 0$. If either $f_1$ or $f_2$ vanishes identically then they are linearly dependent, so we can assume that $f_1 \neq 0$ and $f_2 \neq 0$. (Meaning $f_1, f_2$ are not the “zero function” $0(x) = 0$, $x \in \mathbb{R}$.) Then by Theorem 18.2 $f_1$ and $f_2$ cannot have trivial initial values at $x = 0$. Consider the case that $f_1'(0) = f_2'(0) = 0$. Then we must have $f_1(0) \neq 0$, and $f_2(0) \neq 0$, and we can define
\[ f = f_1(0) f_2 - f_2(0) f_1 \in \ker(T) \tag{18.6} \]
which satisfies $f(0) = 0$ and $f'(0) = W(f_1, f_2)(0) = 0$, so $f = 0$ again by Theorem 18.2, which shows that $f_1, f_2$ are linearly dependent. In the case that either $f_1'(0) \neq 0$ or $f_2'(0) \neq 0$ we can define
\[ f = f_2'(0) f_1 - f_1'(0) f_2 \in \ker(T) \tag{18.7} \]
and find $f(0) = W(0) = 0$ and $f'(0) = 0$ which implies that $f = 0$ by Theorem 18.2, hence $f_1, f_2$ are also linearly dependent in this case.
Proof of (iv). Let $f_1, f_2$ be the solutions to (18.1b) with $(y_0, y_1) = (1, 0)$, and $(y_0, y_1) = (0, 1)$ respectively (and $x_0 = 0$). Then $W(f_1, f_2)(0) = 1$ and so $f_1, f_2$ are linearly independent by (iii). Hence $\dim\ker(T) \geq 2$. Given $f \in \ker(T)$, we claim that $f = f(0) f_1 + f'(0) f_2$. Indeed $g = f - f(0) f_1 - f'(0) f_2 \in \ker(T)$, and $g(0) = f(0) - f(0) = 0$, and $g'(0) = f'(0) - f'(0) = 0$, hence $g = 0$ identically by Theorem 18.2. Thus we have shown that $f_1, f_2$ is a basis for $\ker(T)$, so $\dim\ker(T) = 2$.
Module VII.
Note 19.
Complex numbers
In Module VI we have found the solutions to the differential equation $y'' + by = 0$ “by inspection.” What that means is that we have taken a guess that a certain function is a solution, or as one also says we make the ansatz:
\[ \lambda^2 + b = 0 \tag{19.2} \]
have a solution? There is no real number that solves this equation, but as we will see in
this lecture it does have the complex solutions $\lambda = i$ and $\lambda = -i$. The corresponding fundamental solutions $e^{\pm ix}$ (which we still have to make sense of!) are in fact related to the trigonometric functions that we have encountered in Section 18.1 as the fundamental solutions to $y'' + by = 0$ when $b > 0$.
More generally, we verify that the ansatz (19.1) gives a solution to (18.4) if $\lambda$ is a zero of the characteristic polynomial:
\[ \lambda^2 + a\lambda + b = 0 \tag{19.4} \]
Depending on the sign of the discriminant $\Delta$, see (18.21), this equation has either real solutions ($\Delta > 0$, or $\Delta = 0$) or complex roots ($\Delta < 0$), corresponding to the three systems of fundamental solutions that we have encountered in Theorem 18.2.
1
This of course does not explain how one arrives at this particular ansatz. But it is clear that polynomials cannot provide a solution, because differentiation reduces their degree. The exponential function already led to success for the equation $y' = y$, so why not try again! This procedure of trial and error is also called heuristics, which comes from Greek for “searching.”
Module VII Note 19
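2. $\bar z = z$ if $z$ is real
3. $\overline{z + w} = \bar z + \bar w$
4. $\overline{-z} = -\bar z$
5. $\overline{z \cdot w} = \bar z \cdot \bar w$
6. $\overline{z^{-1}} = \bar z^{-1}$
7. $|z|^2 = z \bar z$
8. $|z \cdot w| = |z| \cdot |w|$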
A less straight-forward but very important statement is the following triangle inequality;
(see the additional notes to Module VII): For any complex numbers z, and w it holds
Addition and multiplication both have geometric interpretations in the complex plane.
For the interpretation of multiplication note that for any complex number $z \neq 0$ we can write
\[ z = |z| \frac{z}{|z|} \tag{19.13} \]
where $|z|$ is a positive factor, and $z/|z|$ is a complex number with unit absolute value. Since any complex number with unit modulus can be written in the form $\cos\theta + i\sin\theta$, we see that
\[ z = r(\cos\theta + i\sin\theta) \tag{19.14} \]
where $r = |z| > 0$, and $\theta \in \mathbb{R}$ (which is not unique, because if $\theta_0$ is one possibility, then so are $\theta_0 + 2\pi k$, for any $k \in \mathbb{Z}$); $\theta$ is called the argument of $z$.
Exercise 19.4. Show that the product of two nonzero complex numbers $z = r(\cos\theta + i\sin\theta)$, and $w = s(\cos\phi + i\sin\phi)$ is
This formula is also known as de Moivre’s theorem, and can be used to compute the nth
roots of a complex number; (see the additional notes to Module VII).
Problems
1. Find the absolute value and argument(s) of each of the following complex numbers:
a) 3 + 4i
b) $(3 + 4i)^{-1}$
c) $(1 + i)^5$
4. Prove that $|z| = |\bar z|$ and that the real part of $z$ can be written as $(z + \bar z)/2$, while the imaginary part is $(z - \bar z)/(2i)$.
Additional: Complex numbers
Further Reading
The discussion suggests that we can arrive at a sensible definition of complex numbers
if we view them as pairs of real numbers:
Moreover, we define
i = (0, 1) (19.2)
Remark 19.1. We can identify complex numbers (a, 0) with the real number a ∈ R.
Moreover $i^2 = (0, 1) \cdot (0, 1) = (-1, 0)$, and so we have
We will not verify explicitly that $\mathbb{C}$ satisfies all the properties of a number system, or more precisely the axioms of a field; see for example (Spivak, Calculus, Chapter 1, and Chapter 25). But let us figure out how to compute the multiplicative inverse. For $(a, b) \neq (0, 0)$ let us find $(x, y)$ such that
\[ (a, b) \cdot (x, y) = (1, 0) \tag{19.4} \]
For this to be true we need $ax - by = 1$, and $ay + bx = 0$, which has the solutions $x = a/(a^2 + b^2)$, and $y = -b/(a^2 + b^2)$. This proves (19.9).
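The pair arithmetic can be tried out directly. The sketch below uses the product rule $(a, b) \cdot (x, y) = (ax - by, ay + bx)$ that is implicit in the two conditions above, together with the inverse just derived; the function names are illustrative.

```python
def mul(p, q):
    """Product of complex numbers represented as pairs of reals."""
    a, b = p
    x, y = q
    return (a * x - b * y, a * y + b * x)

def inverse(p):
    """Multiplicative inverse (a/(a^2+b^2), -b/(a^2+b^2)) of a nonzero pair (a, b)."""
    a, b = p
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (a / n, -b / n)

i = (0.0, 1.0)
i_squared = mul(i, i)        # the pair representing i^2

z = (3.0, 4.0)
check = mul(z, inverse(z))   # should be (1, 0) up to rounding
```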
Proof. It is easy to see that this is true when $z = \lambda w$ for some real number $\lambda$. So let us assume that $z \neq \lambda w$ for any $\lambda \in \mathbb{R}$, and that $w \neq 0$. Then for all $\lambda \in \mathbb{R}$,
Since z w̄ + wz̄ is real (verify this!), the right hand side is a quadratic in λ with real
coefficients, which by the inequality cannot have a zero. Therefore its discriminant must
be negative:
\[ (z\bar w + w\bar z)^2 - 4|w|^2|z|^2 < 0 \tag{19.7} \]
From this inequality it follows that
This formula is also known as de Moivre’s theorem, and has the important consequence:
Proposition 19.2. Every nonzero complex number has exactly n complex nth roots.
Proof. The statement is that for any complex number $w = s(\cos\phi + i\sin\phi) \neq 0$, and any natural number $n$ there are precisely $n$ different complex numbers $z = r(\cos\theta + i\sin\theta)$ satisfying $z^n = w$. By de Moivre’s theorem this happens if and only if
\[ r^n = s \tag{19.10} \]
\[ \cos(n\theta) + i\sin(n\theta) = \cos(\phi) + i\sin(\phi) \tag{19.11} \]
So from the first equation $r = \sqrt[n]{s}$, and from the second for some integer $k$,
\[ n\theta = \phi + 2\pi k \tag{19.12} \]
which has the solutions $\theta_k = (\phi + 2\pi k)/n$. However, it remains to find out how many of these are distinct. Since any integer $k$ can be written as $k = nq + k'$ for some integer $q$, and some integer $k'$ between $0$ and $n - 1$, we see that
\[ \theta_k = \frac{\phi}{n} + 2\pi q + \frac{2\pi k'}{n} = \theta_{k'} + 2\pi q \tag{19.13} \]
and so $\theta_k$ and $\theta_{k'}$ are the arguments of the same root $z$ in the complex plane. Therefore there are $n$ distinct $n$th roots
\[ z = \sqrt[n]{s}\,(\cos\theta_k + i\sin\theta_k) , \qquad k = 0, 1, 2, \dots, n - 1 . \tag{19.14} \]
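The recipe in the proof is short enough to write out. The following sketch computes the $n$th roots from the modulus $s$ and argument $\phi$ of $w$, using the arguments $\theta_k = (\phi + 2\pi k)/n$; the function name and the sample value $w = -8$ are illustrative.

```python
import cmath
import math

def nth_roots(w, n):
    """Return the n distinct nth roots of a nonzero complex number w."""
    s, phi = cmath.polar(w)           # w = s (cos phi + i sin phi)
    if s == 0:
        raise ValueError("w must be nonzero")
    r = s ** (1.0 / n)                # real nth root of the modulus
    return [cmath.rect(r, (phi + 2.0 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-8, 3)              # the three cube roots of -8
```

Each entry of `roots` should satisfy $z^3 = -8$ up to rounding, and one of them is (numerically) the real root $-2$.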
Exercise 19.1. The proof actually shows us a method to compute the nth root of a
complex number. Use it to compute the three cube roots of i.
Note 20.
Hyperbolic functions
Let us return to the differential equation
\[ y'' - y = 0 . \tag{20.1} \]
for some constants $c_1, c_2 \in \mathbb{R}$. We now define the following special solutions, corresponding to the choice of constants $c_1 = \tfrac12$, and $c_2 = \pm\tfrac12$:
\[ \cosh(x) := \frac12 \big( e^x + e^{-x} \big) \tag{20.3} \]
\[ \sinh(x) := \frac12 \big( e^x - e^{-x} \big) . \tag{20.4} \]
Remark 20.1. We may view these functions as the unique solutions to the initial value problem
\[ y'' - y = 0 , \qquad y(0) = y_0 , \qquad y'(0) = y_1 \tag{20.5} \]
with $(y_0, y_1) = (1, 0)$, and $(y_0, y_1) = (0, 1)$, respectively. Indeed,
which tells us that these two solutions are linearly independent, namely every solution
to (20.1) can be written as a linear combination of these; cf. Theorem 18.1 in Note 18,
additional material to Module VI.
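The initial values in Remark 20.1 and the linear independence can be checked numerically. The sketch below implements the definitions (20.3) and (20.4) and evaluates the Wronskian, using that $\cosh' = \sinh$ and $\sinh' = \cosh$ (which follows by differentiating the definitions); the sample points are arbitrary.

```python
import math

def cosh(x):
    """Definition (20.3)."""
    return 0.5 * (math.exp(x) + math.exp(-x))

def sinh(x):
    """Definition (20.4)."""
    return 0.5 * (math.exp(x) - math.exp(-x))

def wronskian(x):
    """W(cosh, sinh) = cosh * sinh' - cosh' * sinh = cosh^2 - sinh^2."""
    return cosh(x) * cosh(x) - sinh(x) * sinh(x)

initial_values = (cosh(0.0), sinh(0.0))                   # (y(0) for cosh, y(0) for sinh)
wronskians = [wronskian(x) for x in (-2.0, 0.0, 1.5)]     # expected to be close to 1
```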
Module VII Note 20
Recall that
Definition 20.1. A function f is called even if f (−x) = f (x) for all x ∈ R, and it is
called odd if f (−x) = −f (x) for all x ∈ R.
Also note that the hyperbolic functions are each other’s derivatives:
Therefore the difference $g(x) = \operatorname{arcosh}(x) - \log(x + \sqrt{x^2 - 1})$ is a continuous function on $[1, \infty)$ whose derivative exists on $(1, \infty)$ and is $g'(x) = 0$. Therefore $g(x)$ is constant, and
Catenary problem What shape does a chain of uniform density take under its own weight
when suspended between two points?
The solution to this problem invokes the theory of separable differential equations
from Note 16, and involves hyperbolic functions as discussed above; (see the additional
notes to Module VII).
Problems
1. Compute the following integrals using hyperbolic substitutions:
a) \[ \int \frac{dx}{x\sqrt{1 + x^2}} . \tag{20.31} \]
b) \[ \int \frac{dx}{x^2\sqrt{x^2 - 1}} \tag{20.32} \]
c) \[ \int \sqrt{x^2 + 1}\,dx \tag{20.33} \]
d) \[ \int \sqrt{x^2 - 1}\,dx \tag{20.34} \]
Note 21.
Second order differential equations
\[ y'' - y = 0 \tag{21.1} \]
of a linear differential equation, which we solved by making the ansatz $f(x) = e^{\lambda x}$. This led to the equation
\[ \lambda^2 - 1 = 0 \tag{21.2} \]
which has the solutions $\lambda = \pm 1$, corresponding to the linearly independent solutions $e^{\pm x}$. Linear combinations of these define the hyperbolic functions
\[ \cosh(x) = \frac12 \big( e^x + e^{-x} \big) , \qquad \sinh(x) = \frac12 \big( e^x - e^{-x} \big) \tag{21.3} \]
which we have studied in Note 20.
Recall from Note 18 that the special case
\[ y'' + y = 0 \tag{21.4} \]
has the trigonometric functions as its solutions. The same ansatz $f(x) = e^{\lambda x}$ leads, as we have already seen in Lecture 19, to the equation
\[ \lambda^2 + 1 = 0 \tag{21.5} \]
which now has the complex solutions $\lambda = \pm i$, corresponding to the complex valued solutions $e^{\pm ix}$.
These complex valued functions need to be discussed separately; (see the additional notes to Module VII). However, given that $e^{\pm ix}$ is a complex valued solution, we can already infer that their real and imaginary parts are also solutions. Since for any complex number $w$ the real part is given by $\mathrm{Re}(w) = (w + \bar w)/2$, and the imaginary part is given by $\mathrm{Im}(w) = (w - \bar w)/(2i)$, and given that moreover $\overline{e^z} = e^{\bar z}$ for any complex number $z$, we obtain that the real and imaginary parts of $e^{ix}$ are given, respectively, by
\[ \mathrm{Re}(e^{ix}) = \frac12 \big( e^{ix} + e^{-ix} \big) , \qquad \mathrm{Im}(e^{ix}) = \frac{1}{2i} \big( e^{ix} - e^{-ix} \big) . \tag{21.6} \]
Module VII Note 21
Therefore they are, respectively, the solutions of the initial value problems,
In view of the uniqueness theorem for solutions to the initial value problem (see Theorem 18.2) and the discussion in Section 18.1, we can infer that they must be precisely the trigonometric functions:
\[ \cos(x) = \frac12 \big( e^{ix} + e^{-ix} \big) , \qquad \sin(x) = \frac{1}{2i} \big( e^{ix} - e^{-ix} \big) . \tag{21.9} \]
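The identities (21.9) can be compared against the library trigonometric functions. This is a numerical illustration only; it relies on Python's `cmath.exp` as the complex exponential, and the sample points are arbitrary.

```python
import cmath
import math

def cos_via_exp(x):
    """Right-hand side of (21.9) for cos."""
    return (cmath.exp(1j * x) + cmath.exp(-1j * x)) / 2

def sin_via_exp(x):
    """Right-hand side of (21.9) for sin."""
    return (cmath.exp(1j * x) - cmath.exp(-1j * x)) / (2j)

samples = (-3.0, 0.0, 0.7, 2.5)
cos_errors = [abs(cos_via_exp(x) - math.cos(x)) for x in samples]
sin_errors = [abs(sin_via_exp(x) - math.sin(x)) for x in samples]
```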
Moreover, by the very definition of the trigonometric functions as the real and imaginary
part we obtain Euler’s identity:
Let us now return to the general case (18.4) of a homogeneous linear second order
differential equation with constant coefficients:
\[ y'' + ay' + by = 0 \tag{21.12} \]
As we have seen in Note 19, the corresponding characteristic polynomial is (19.4), namely the algebraic condition for $f(x) = e^{\lambda x}$ to be a solution is that $\lambda$ is a root of
\[ Q(\lambda) = \lambda^2 + a\lambda + b . \tag{21.13} \]
This polynomial may not have any real zeros, depending on the sign of the discriminant $\Delta = a^2 - 4b$. Indeed $\lambda$ is a solution to $Q(\lambda) = 0$ if and only if
\[ \lambda = -\frac{a}{2} \pm \frac12 \sqrt{a^2 - 4b} \tag{21.14} \]
Case ∆ > 0.
Exercise 21.1. Let $\lambda_1, \lambda_2$ be the two real solutions of (21.13). Show that $f_1(x) = e^{\lambda_1 x}$, and $f_2(x) = e^{\lambda_2 x}$ are two linearly independent solutions of (21.12).
Case $\Delta = 0$. In this case the two roots coincide $\lambda_1 = \lambda_2 = -a/2$. Let us prove that
\[ f_2''(x) + a f_2'(x) + b f_2(x) = -a e^{-ax/2} + x\,Q(-a/2)\,e^{-ax/2} + a e^{-ax/2} = 0 . \tag{21.16} \]
and thus $f_1(x)$ and $f_2(x)$ are linearly independent by Theorem 18.1.
are zeros of the characteristic polynomial Q(λ). Similarly to the special case (21.4), we
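expect that the real and imaginary parts of $e^{\lambda x}$ are solutions to (21.12). Since
we obtain
\[ \mathrm{Re}\, e^{(\alpha + i\beta)x} = e^{\alpha x}\, \frac12 \big( e^{i\beta x} + e^{-i\beta x} \big) = e^{\alpha x} \cos(\beta x) \tag{21.20} \]
\[ \mathrm{Im}\, e^{(\alpha + i\beta)x} = e^{\alpha x}\, \frac{1}{2i} \big( e^{i\beta x} - e^{-i\beta x} \big) = e^{\alpha x} \sin(\beta x) \tag{21.21} \]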
Exercise 21.2. Show that the functions $f_1(x) = e^{\alpha x}\cos(\beta x)$, and $f_2(x) = e^{\alpha x}\sin(\beta x)$ are solutions to (21.12) in the case $\Delta < 0$.
Exercise 21.3. Show that the solutions $f_1(x)$, and $f_2(x)$ are linearly independent, i.e. that for all $x \in \mathbb{R}$,
\[ W(f_1, f_2)(x) \neq 0 . \tag{21.22} \]
Summary. In summary, to find the solutions to the homogeneous equation (21.12), one
may determine the zeros of the characteristic polynomial,
2. $\lambda_1 = \lambda_2 \in \mathbb{R}$
3. $\lambda_1 \in \mathbb{C}$, $\mathrm{Im}(\lambda_1) \neq 0$, $\lambda_2 = \bar\lambda_1$,
the functions $f_1$ and $f_2$ defined above form a fundamental system, in the sense that a twice differentiable function $f$ is a solution to (21.12) if and only if there are constants $c_1, c_2 \in \mathbb{R}$ (uniquely determined) such that $f = c_1 f_1 + c_2 f_2$.
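The case distinction in the summary can be sketched as a small routine that returns a pair $(f_1, f_2)$ for given $a, b$. It assumes the fundamental systems $e^{\lambda_1 x}, e^{\lambda_2 x}$ for $\Delta > 0$, $e^{-ax/2}, x e^{-ax/2}$ for $\Delta = 0$, and $e^{\alpha x}\cos(\beta x), e^{\alpha x}\sin(\beta x)$ with $\alpha = -a/2$, $\beta = \sqrt{-\Delta}/2$ for $\Delta < 0$; the function names and the finite-difference check are illustrative.

```python
import math

def fundamental_system(a, b):
    """Return two solutions (f1, f2) of y'' + a y' + b y = 0, chosen by the sign of the discriminant."""
    disc = a * a - 4.0 * b
    if disc > 0:
        r = math.sqrt(disc)
        l1, l2 = (-a + r) / 2.0, (-a - r) / 2.0      # two distinct real roots
        return (lambda x: math.exp(l1 * x), lambda x: math.exp(l2 * x))
    if disc == 0:
        l = -a / 2.0                                  # double real root
        return (lambda x: math.exp(l * x), lambda x: x * math.exp(l * x))
    alpha, beta = -a / 2.0, math.sqrt(-disc) / 2.0    # complex roots alpha +- i beta
    return (lambda x: math.exp(alpha * x) * math.cos(beta * x),
            lambda x: math.exp(alpha * x) * math.sin(beta * x))

def residual(f, a, b, x, h=1e-3):
    """Finite-difference approximation of f'' + a f' + b f at x."""
    d1 = (f(x + h) - f(x - h)) / (2.0 * h)
    d2 = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return d2 + a * d1 + b * f(x)
```

For instance, `fundamental_system(2.0, 5.0)` falls into the third case, and `residual` evaluated on either returned function should be close to zero.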
Exercise 21.4. Convince yourself that the fundamental system of solutions found in this way is exactly the same as we have found initially in Lecture 18. In particular from Proposition 18.1 we know that any solution to (21.12) can be written as a linear combination of
\[ f_1(x) = e^{-ax/2} u_1(x) , \qquad f_2(x) = e^{-ax/2} u_2(x) \tag{21.24} \]
where $u_1(x)$, and $u_2(x)$ are as in Theorem 18.2 depending on the sign of the discriminant $\Delta = a^2 - 4b$.
\[ y'' + ay' + by = r(x) \tag{21.25} \]
where a and b remain constants, but r(x) is a continuous function on (−∞, ∞).
Let us first observe that if y1 and y2 are solutions to (21.25), then y1 − y2 is a solution
to the homogeneous equation (21.12), and thus can be written as a linear combination of
the fundamental solutions f1 and f2 . In other words, there are constants c1 , and c2 such
that
y1 − y2 = c1 f1 + c2 f2 . (21.26)
This means in particular if one particular solution y1 to (21.25) is known, then any
solution y2 to (21.25) can be written as y2 = c1 f1 + c2 f2 + y1 .
Proposition 21.1. If y1 is a particular solution to (21.25), then the general solution to
the inhomogeneous equation (21.25) is obtained by adding to y1 the general solution to
the homogeneous equation (21.12).
It thus suffices to find one particular solution to the inhomogeneous problem.
In general, for any continuous function r(x), a particular solution can be found by the
method of variation of constants; (see the additional notes to Module VII).
\[ y'' + ay' + by = r(x) , \tag{21.27} \]
there are special cases, namely when the function $r(x)$ takes a special form, for which other methods are available. We illustrate this in the cases when $r(x)$ is a polynomial, or a polynomial times an exponential.
\[ y'' + y = x^3 . \tag{21.30} \]
The ansatz
\[ y_1(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0 \tag{21.31} \]
leads to the equation
Therefore a particular solution is $g(x) = x^3 - 6x$, and the general solution is given by
\[ y(x) = x^3 - 6x + c_1 \cos(x) + c_2 \sin(x) . \]
Exercise 21.5. Derive the particular solution to (21.30) by using Theorem 22.1 and verify
that the two methods give the same result.
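The polynomial ansatz (21.31) for (21.30) can also be carried out mechanically. The sketch below treats the special equation $y'' + y = r(x)$ only: comparing coefficients of $x^k$ in $p'' + p = r$ gives $a_k + (k+2)(k+1)a_{k+2} = r_k$, which is solved from the top degree downwards. The function name and the coefficient-list representation are assumptions made for illustration.

```python
def particular_polynomial(r):
    """Coefficients [a0, ..., an] of a polynomial p with p'' + p = r.

    r is given as a coefficient list [r0, ..., rn], lowest degree first.
    """
    n = len(r) - 1
    a = [0.0] * (n + 3)                 # padded so that a[k + 2] always exists
    for k in range(n, -1, -1):
        # coefficient of x^k in p'' + p is a[k] + (k+2)(k+1) a[k+2]
        a[k] = r[k] - (k + 2) * (k + 1) * a[k + 2]
    return a[:n + 1]

coeffs = particular_polynomial([0, 0, 0, 1])   # right-hand side x^3
```

Here `coeffs` encodes the polynomial $x^3 - 6x$, in agreement with the particular solution found above.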
which is precisely of the form discussed above, and can be solved with a particular solution of the form
\[ q(x) = \sum_{k=1}^{n} a_k x^k , \tag{21.37} \]
Module VII Note 22
\[ y'' + y = x e^{3x} . \tag{21.38} \]
\[ q(x) = a_1 x + a_0 , \tag{21.40} \]
Problems
1. Find the general solution of each of the following differential equations. Unless
defined on the whole real line, indicate the interval on which the solution is defined.
a) $y'' - y = x$
b) $y'' + y' = x^2 + x$
c) $y'' - 4y = e^{2x}$
d) $y'' - 2y' + y = x + 2xe^{2x}$
\[ y'' - k^2 y = r(x) \tag{21.44} \]
Additional: Complex functions,
and power series
Further Reading
\[ f : \mathbb{C} \to \mathbb{C} , \qquad z \mapsto f(z) \tag{22.2} \]
where u(z) and v(z) are real numbers, and say that f is real-valued if v(z) = 0.
1
In much the same way that we have talked about functions in Note 2 it makes sense to give a formal definition of a complex function as a collection of pairs of complex numbers, which does not contain two distinct pairs with the same first element; cf. Additional material to Module I.
where a0 , . . . , an ∈ C.
Example 22.2. The conjugate function is defined by
f (z) = z̄ . (22.5)
Example 22.3. Since any complex number $z$ can be written in the form $z = x + iy$, many explicit examples can be written down using (22.3), such as
In a similar fashion to Note 3 and 4 we can talk about the existence of a limit, and
continuity:
The statement
\[ \lim_{z \to a} f(z) = l \tag{22.7} \]
means that for every (real) number ε > 0, there is a (real) number δ > 0,
such that, for all z, if 0 < |z − a| < δ, then |f (z) − l| < ε.
While formally the same, the geometric interpretation is different; see Fig. 22.1. The
function f (z) has the limit l as z approaches a, if f (z) can be made to lie in a circle of
radius ε in the complex plane C, by restricting z to lie within a circle of radius δ around
a in the complex plane C.
Moreover a function f (z) is continuous at a if limz→a f (z) = f (a), and continuous if
f is continuous at every a in the domain.
Remark 22.1. Compare this to the discussion of limits and continuity for functions of
two variables in Note 6!
In much the same way as in Notes 3 and 4, one can prove that
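\[ \lim_{z \to a} z = a \tag{22.8} \]
\[ \lim_{z \to a} \big( f(z) + g(z) \big) = \lim_{z \to a} f(z) + \lim_{z \to a} g(z) \tag{22.9} \]
\[ \lim_{z \to a} f(z) \cdot g(z) = \lim_{z \to a} f(z) \cdot \lim_{z \to a} g(z) \tag{22.10} \]
\[ \lim_{z \to a} \frac{f(z)}{g(z)} = \frac{\lim_{z \to a} f(z)}{\lim_{z \to a} g(z)} , \quad \text{if } \lim_{z \to a} g(z) \neq 0 . \tag{22.11} \]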
The story is however more delicate (but eventually more beautiful!) for differentiability. We can define that a function $f(z)$ is differentiable at $a$, if
\[ f'(a) = \lim_{z \to a} \frac{f(z) - f(a)}{z - a} \tag{22.12} \]
exists, in which case the limit on the right is denoted by $f'(a)$ on the left. While the familiar rules of differentiation from Note 7 can be verified for rational functions, in particular for
\[ f(z) = z^n , \qquad f'(z) = n z^{n-1} \tag{22.13} \]
there are many perplexing examples of functions which are simply not differentiable.
Example 22.4. Consider the conjugate function f (z) = z̄ which we can also write in the
form
f (x + iy) = x − iy . (22.14)
\[ \frac{f(z) - f(0)}{z - 0} = \frac{x - iy}{x + iy} = \frac{x}{x} = 1 \tag{22.15} \]
Thus no matter how small we take 0 < |z|, f (z) does not approach any limit l.
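The failure of differentiability in Example 22.4 is easy to observe numerically. The sketch below evaluates the difference quotient of $f(z) = \bar z$ at $0$ for small real $z$ and for small purely imaginary $z$; the step sizes are arbitrary. Along the real axis the quotient equals $1$, along the imaginary axis it equals $-1$, so no single limit can exist.

```python
def conj_quotient(h):
    """Difference quotient (f(h) - f(0)) / (h - 0) for f(z) = conjugate(z)."""
    return h.conjugate() / h

steps = (1e-1, 1e-3, 1e-6)
along_real = [conj_quotient(complex(t, 0.0)) for t in steps]   # z = t real
along_imag = [conj_quotient(complex(0.0, t)) for t in steps]   # z = i t purely imaginary
```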
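\[ s_n(z) = \sum_{j=0}^{n} \frac{z^j}{j!} = 1 + z + \dots + \frac{z^n}{n!} \tag{22.19} \]
\[ \lim_{n \to \infty} a_n = l . \tag{22.20} \]
It turns out that the sequence of numbers $\{s_n(z)\}$ of (22.19) converges for any complex number, and we denote the limit by
\[ \sum_{j=0}^{\infty} \frac{z^j}{j!} = \lim_{n \to \infty} s_n(z) . \tag{22.21} \]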
In Note ?? we will give a formal definition of convergent series along these lines.
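The convergence of the partial sums can be watched numerically. The sketch below assumes that $s_n(z)$ in (22.19) denotes $\sum_{j=0}^{n} z^j/j!$ and compares it with the library function `cmath.exp`; the sample point $z = 1 + 2i$ and the values of $n$ are arbitrary.

```python
import cmath

def partial_sum(z, n):
    """s_n(z) = sum_{j=0}^{n} z^j / j!, built up term by term."""
    term = 1 + 0j          # the j = 0 term
    total = term
    for j in range(1, n + 1):
        term *= z / j      # z^j / j! from z^(j-1) / (j-1)!
        total += term
    return total

z = 1 + 2j
errors = [abs(partial_sum(z, n) - cmath.exp(z)) for n in (5, 10, 20, 30)]
```

The entries of `errors` shrink rapidly as $n$ grows, which is consistent with (but of course does not prove) convergence for this $z$.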
More generally, a complex power series is a series of the form
\[ f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n = a_0 + a_1(z - a) + a_2(z - a)^2 + \dots \tag{22.22} \]
\[ f(z) = \sum_{n=0}^{\infty} a_n z^n \tag{22.23} \]
An important theorem about power series is that if f (z0 ) converges, then f (z) converges
for any |z| < |z0 |. This means, geometrically, that if the power series converges for some
z0 , then it does so inside the entire circle of radius |z0 |. In fact, for power series of the
form (22.23) one of the following three possibilities must be true:
1. $\sum_{n=0}^{\infty} a_n z^n$ converges only for $z = 0$
2. $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely for all $z \in \mathbb{C}$
3. There is a number $R > 0$ such that $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely if $|z| < R$ and diverges for $|z| > R$.
The number $R$ is called the radius of convergence of the power series.² Inside the circle of convergence a power series defines a differentiable function. In view of the examples we have given in the previous section, this shows that power series provide a large class of differentiable functions.
In fact, if the power series (22.23) has radius of convergence $R > 0$, then $f$ is differentiable inside the circle of convergence $|z| < R$, and
\[ f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1} . \tag{22.24} \]
It follows that a power series is infinitely differentiable and continuous inside the circle of
convergence.
As we have seen power series provide a way to define the complex exponential $\exp(z)$ as in (22.17). Then by the above mentioned results for power series $\exp'(z) = \exp(z)$. We compute in particular
where we have defined the complex functions $\cos(z)$ and $\sin(z)$ by:
\[ \sin(z) = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots \tag{22.26} \]
\[ \cos(z) = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots \tag{22.27} \]
then also $\sin'(z) = \cos(z)$, and $\cos'(z) = -\sin(z)$.
As for the exponential function in (22.18), the formulas (22.26) and (22.27) are identities
for real numbers z = x. This will be proven in Modules VIII and ??. In other words, the
2
Inside the circle of convergence the power series converges absolutely, but outside it diverges. What
happens on the circle is a more difficult question, and there are examples of series which converge on
this circle, and others which do not.
definitions (22.26) and (22.27) extend statements for trigonometric functions and their
Taylor series to the domain of complex functions.
It is also clear from the definitions, namely the power series, that
\[ e^{i\pi} = -1 , \tag{22.33} \]
from (22.25) with $z = \pi$, and more generally that $e^{2\pi i/n}$ is an $n$th root of 1.
Recall that for real numbers z, the values of sin(z) always lie between −1 and 1.
However for complex z this is not true at all: Take z = iy, for any real y, then
It may seem that the complex functions we have considered are very special, but this
is not quite true. The basic theorems of complex analysis show:
The power series that are relevant for this statement are Taylor series which are the
topic of Module VIII and ??.
Adapt the definition of continuity given in (22.7) to the case of a complex-valued function of a real variable, and show that $f$ is continuous in that sense, if and only if $u$ and $v$ are continuous.
Exercise 22.2. Similarly, show that $f$ is differentiable if and only if $u$ and $v$ are differentiable, and
\[ f'(x) = u'(x) + i v'(x) . \tag{22.40} \]
Use this to give an alternative proof of (22.38) using Euler’s identity (21.10).
As an application recall the formula (21.20). We obtain
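\[ \int e^{\alpha x}\cos(\beta x)\,dx = \mathrm{Re} \int e^{(\alpha + i\beta)x}\,dx = \mathrm{Re}\Big( \frac{1}{\alpha + i\beta}\, e^{(\alpha + i\beta)x} \Big) = \frac{1}{\alpha^2 + \beta^2} \big( \alpha e^{\alpha x}\cos(\beta x) + \beta e^{\alpha x}\sin(\beta x) \big) \tag{22.41} \]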
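The antiderivative in (22.41) can be checked against a numerical integral. The sketch below uses a composite Simpson rule (a standard quadrature scheme, not taken from these notes) on the interval $[0, 2]$ with the arbitrary values $\alpha = 0.5$, $\beta = 3$.

```python
import math

def antiderivative(x, alpha, beta):
    """Right-hand side of (22.41)."""
    return (math.exp(alpha * x)
            * (alpha * math.cos(beta * x) + beta * math.sin(beta * x))
            / (alpha ** 2 + beta ** 2))

def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

alpha, beta = 0.5, 3.0
numeric = simpson(lambda x: math.exp(alpha * x) * math.cos(beta * x), 0.0, 2.0)
exact = antiderivative(2.0, alpha, beta) - antiderivative(0.0, alpha, beta)
```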
Additional Problems
1. Show that every complex number of absolute value 1 can be written as $e^{iy}$ for some real number $y$.
2. For all complex numbers $z$ and $w$ it holds that $e^{z+w} = e^z e^w$. It also holds that $\sin(z + w) = \sin(z)\cos(w) + \cos(z)\sin(w)$. Interpret these statements in terms of series. How could we prove these?
6. Prove that
\[ \sum_{k=1}^{n} e^{ikx} = e^{ix}\, \frac{1 - e^{inx}}{1 - e^{ix}} = \frac{\sin(nx/2)}{\sin(x/2)}\, e^{i(n+1)x/2} . \]
\[ \int_0^{2\pi} e^{inx} e^{-imx}\,dx = \begin{cases} 0 & m \neq n \\ 2\pi & m = n . \end{cases} \tag{22.45} \]
b) Use part (a) to deduce the following orthogonality relation: If $m$ and $n$ are integers with $m^2 \neq n^2$, then
\[ \int_0^{2\pi} \sin(nx)\cos(mx)\,dx = 0 . \tag{22.46} \]
Additional: Catenary problem
In this note we want to solve the
Catenary problem: What shape does a flexible rope1 of uniform density take under its
own weight when suspended between two points?
The rope is in a static equilibrium meaning that at every point on the rope the total
force acting on it vanishes. The rope lies in a plane and we can choose coordinates (x, y)
so that the y-axis aligns with the gravitational force and the lowest point on the rope
falls in the origin (0, 0). At every point (x0 , y0 ) on the rope there is a force F (x0 , y0 ) ∈ R2
corresponding to the pull of the rope segment with x > x0 . This force is not yet known
to us, but we know it is always tangential to the rope at (x0 , y0 ). Now consider the lowest
point where the “pull from the rope segment on the left” is balanced by the “pull from
the rope segment on the right”:
for some F0 > 0, representing the force exerted by the rope segment with x < 0.
Now consider some point (x0 , y0 ) on the rope on the right with x0 > 0; cf. Figure 22.1.
The horizontal component of the force exerted by the rope segment $x < x_0$ is still $F_0$, but in addition there is a vertical component related to the weight of the rope segment between $0 < x < x_0$, which is proportional to the total mass of this segment given by $\rho s$, where $\rho$ is the density (mass per unit length) and $s$ is the arc length of the rope segment from the lowest point. Therefore
where $g$ is the gravitational constant, and by writing $F$ in the form $F = |F|(\cos\theta, \sin\theta)$, where $|F|$ and $\theta$ are functions of $(x_0, y_0)$ we obtain two equations
Let us now view the rope as a graph y = y(x) over the x-axis. The arc length s(x0 , y0 )
of the rope segment from (0, 0) to (x0 , y0 ) is then given by the length s(x0 ) of the curve
x → (x, y(x)), with x ∈ [0, x0 ]. The length of this curve is given by
\[ s(x_0) = \int_0^{x_0} \sqrt{1 + \Big( \frac{dy(x)}{dx} \Big)^2}\,dx \tag{22.4} \]
1
The word catenary comes from the Latin word for chain, but I find it more pleasant to think about
ropes. The problem was first solved in the 1690s, independently by Leibniz, Huygens, and Bernoulli.
where $m(x) = y'(x) = \frac{dy}{dx}(x)$ is the slope of the graph at $(x, y(x))$. Since $(\cos(\theta(x)), \sin(\theta(x)))$ is the unit tangent vector to the curve at $(x, y(x))$, with the angle $\theta$ as introduced above, it follows from (22.3) that
\[ m(x) = \tan(\theta(x)) = \frac{\rho s(x) g}{F_0} \tag{22.5} \]
In view of (22.4) we thus obtain the following ODE for the slope $m(x)$:
\[ m'(x) = \frac{\rho g}{F_0}\, s'(x) = \frac{\rho g}{F_0} \sqrt{1 + m(x)^2} \tag{22.6} \]
In Note 16 we have learned that this is an example of a separable differential equation,
which can be solved by writing
\[ \int \frac{dm}{\sqrt{1 + m^2}} = a \int dx \tag{22.7} \]
where for short a = ρg/F0 . It follows from (20.26) that
arsinh(m(x)) = ax (22.8)
\[ y(x) = \int_0^{x} \sinh(a x')\,dx' = \frac{\cosh(ax) - 1}{a} \tag{22.10} \]
where we used y(0) = 0.
This is the solution to the catenary problem, and obviously finds applications in civil
engineering, although the assumption of uniform density is rather idealized.
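As a plausibility check, the catenary (22.10) can be inserted into the slope equation (22.6) numerically. The sketch below approximates the slope $m = y'$ and its derivative by central differences and evaluates the residual $m' - a\sqrt{1 + m^2}$; the value $a = 0.8$ and the step sizes are arbitrary choices.

```python
import math

a = 0.8   # illustrative value of rho * g / F0

def y(x):
    """Catenary (22.10)."""
    return (math.cosh(a * x) - 1.0) / a

def m(x, h=1e-5):
    """Slope y'(x) by a central difference."""
    return (y(x + h) - y(x - h)) / (2.0 * h)

def m_prime(x, h=1e-3):
    """m'(x) = y''(x) by a central second difference."""
    return (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)

# Residual of m' = a * sqrt(1 + m^2) at a few points; should be close to zero.
residuals = [m_prime(x) - a * math.sqrt(1.0 + m(x) ** 2) for x in (0.0, 0.5, 1.0, 2.0)]
```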
Exercise 22.1. Find the area under the catenary curve from x = 0 to x = 1/a.
Additional: Variation of constants
In this note we show how to obtain a particular solution to (21.25) by variation of
constants.1
More precisely, let f1 and f2 be two linearly independent solutions of the homogeneous
equation as discussed in Section 21.1. We are then looking for a particular solution to
the inhomogeneous problem of the form
We compute
\[ g'(x) = c_1(x) f_1'(x) + c_2(x) f_2'(x) + c_1'(x) f_1(x) + c_2'(x) f_2(x) \tag{22.2} \]
\[ g''(x) = c_1(x) f_1''(x) + c_2(x) f_2''(x) + c_1'(x) f_1'(x) + c_2'(x) f_2'(x) + \big( c_1'(x) f_1(x) + c_2'(x) f_2(x) \big)' \tag{22.3} \]
Therefore, using the fact that f1 (x) and f2 (x) are solutions to the homogeneous equation,
The point is that (22.5) and (22.6) are a linear system of equations for c01 (x) and c02 (x)
which is always solvable:
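\[ \begin{pmatrix} f_1(x) & f_2(x) \\ f_1'(x) & f_2'(x) \end{pmatrix} \begin{pmatrix} c_1'(x) \\ c_2'(x) \end{pmatrix} = \begin{pmatrix} 0 \\ r(x) \end{pmatrix} \tag{22.7} \]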
1
This idea was first used by Bernoulli in 1697 to solve linear equations of first order, and then by
Lagrange in 1774 to solve linear equations of second order.
The determinant of this matrix is precisely the Wronskian W (f1 , f2 )(x), and we have
shown in Theorem 18.1 that W (f1 , f2 )(x) never vanishes. Therefore we can always solve
this linear system and obtain
\[ c_1'(x) = -\frac{f_2(x) r(x)}{W(f_1, f_2)(x)} \tag{22.8} \]
\[ c_2'(x) = \frac{f_1(x) r(x)}{W(f_1, f_2)(x)} . \tag{22.9} \]
Theorem 22.1. Let (f1 , f2 ) be a fundamental system for the homogeneous equation
(21.12). Then a particular solution to the inhomogeneous equation (21.25) is given by the
formula (22.1),
y1 (x) = c1 (x)f1 (x) + c2 (x)f2 (x) (22.10)
where c1 and c2 are primitives of the following functions:
\[ c_1 = -\int \frac{f_2(x) r(x)}{W(f_1, f_2)(x)}\,dx , \qquad c_2 = \int \frac{f_1(x) r(x)}{W(f_1, f_2)(x)}\,dx . \tag{22.11} \]
Example 22.1. Let us determine the general solution of the equation
\[ y'' + y = \sin(2x) \tag{22.12} \]
A fundamental system of solutions for the homogeneous equation is given by
f1 (x) = sin(x) , f2 (x) = cos(x) , (22.13)
which has the Wronskian W (x) = −1, and the two primitives (22.11) are
Z Z
c1 = cos(x) sin(2x)dx = 2 cos2 (x) sin(x)dx = −2 cos3 (x)/3 (22.14)
Z Z
c2 = − sin(x) sin(2x)dx = −2 sin2 (x) cos(x)dx = −2 sin3 (x)/3 (22.15)
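The primitives (22.14) and (22.15) can be sanity-checked numerically. The following is only a sketch: it builds $y_1 = c_1 f_1 + c_2 f_2$ from the formulas above and tests the equation (22.12) with a central difference quotient. The function names, the step size `h` and the sample points are our own choices, not part of the notes.

```python
import math

def particular_solution(x):
    """y1 = c1*f1 + c2*f2 with f1 = sin, f2 = cos and c1, c2 from (22.14), (22.15)."""
    c1 = -2.0 * math.cos(x) ** 3 / 3.0
    c2 = -2.0 * math.sin(x) ** 3 / 3.0
    return c1 * math.sin(x) + c2 * math.cos(x)

def second_derivative(f, x, h=1e-4):
    """Central difference approximation of f''(x); h is an ad hoc step size."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def residual(x):
    """Left hand side minus right hand side of (22.12), evaluated on y1."""
    y = particular_solution(x)
    return second_derivative(particular_solution, x) + y - math.sin(2.0 * x)

sample_points = [-2.0, -0.5, 0.3, 1.0, 2.5]
max_residual = max(abs(residual(x)) for x in sample_points)

# Using sin^2 + cos^2 = 1, y1 simplifies to -sin(2x)/3; measure the gap to that form.
max_gap = max(
    abs(particular_solution(x) + math.sin(2.0 * x) / 3.0) for x in sample_points
)
print(max_residual, max_gap)
```

Both printed numbers should be close to zero, up to the discretisation error of the difference quotient. This is consistent with the particular solution $y_1(x) = -\frac{1}{3} \sin(2x)$, so that the general solution would be $A \sin(x) + B \cos(x) - \frac{1}{3} \sin(2x)$.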
Problems
1. Find the general solution of the equation
\[ y'' + y = \tan(x) \tag{22.18} \]
on the interval $(-\pi, \pi)$.
2. Derive the formula (21.45) by the method of variation of constants.
Module VIII.
Taylor polynomials
Note 23.
Approximation by polynomial functions
In the previous lectures we have looked at differential equations with constant coefficients,
and we have found several special functions as solutions. These functions are often given
implicitly and “in practice” it would not be easy at all to compute their values.
Example 23.1. For example to compute values of the exponential function, we would first
have to approximate
\[ \log(x) = \int_1^x \frac{1}{t} \, dt \tag{23.1} \]
by some upper and lower sums, and finding $e^x = \log^{-1}(x)$ would involve computing $\log(a)$
for many values of $a$, until $\log(a)$ is close to $x$, and $a$ would then be an approximation
for $e^x$.
In this lecture we will learn a way to approximate functions by polynomials,
\[ p(x) = a_0 + a_1 x + \dots + a_n x^n , \tag{23.2} \]
in the sense that the derivatives of p(x) at a point a agree with those of the given function
up to a given order. Such polynomials are called Taylor polynomials.
Remark 23.1. There are of course other methods to approximate functions by polynomials.
One could for example try to find a polynomial of degree n that passes through n+1 given
points on the graph of the function. Or one could attempt to make the area between the
function and the polynomial as small as possible. Or one could approximate a function
uniformly by polynomials, as it is done in the “Weierstrass approximation theorem.”
First note that all coefficients in (23.2) can be expressed in terms of the values of $p$
and its derivatives at 0: $p(0) = a_0$, and $p'(0) = a_1$, and more generally, for a polynomial written in powers of $(x - a)$,
\[ a_k = \frac{p^{(k)}(a)}{k!} . \tag{23.5} \]
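The coefficient formula (23.5) can be tried out on a concrete polynomial. The sketch below stores a polynomial as a list of coefficients in powers of $x$ (lowest degree first), differentiates it repeatedly, and recovers the coefficients of its expansion in powers of $(x - a)$ as $p^{(k)}(a)/k!$. The helper names and the sample polynomial are our own illustrative choices.

```python
from math import factorial

def differentiate(coeffs):
    """Coefficients (lowest degree first) of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def coefficients_about(coeffs, a):
    """Return [p(a)/0!, p'(a)/1!, ..., p^(n)(a)/n!] as in (23.5)."""
    out = []
    current = list(coeffs)
    for k in range(len(coeffs)):
        out.append(evaluate(current, a) / factorial(k))
        current = differentiate(current)
    return out

p = [2, -3, 0, 5]                    # p(x) = 2 - 3x + 5x^3
at_zero = coefficients_about(p, 0)   # recovers the original coefficients
at_one = coefficients_about(p, 1)    # coefficients in powers of (x - 1)
print(at_zero, at_one)
```

For this sample polynomial the expansion about $a = 1$ reads $4 + 12(x-1) + 15(x-1)^2 + 5(x-1)^3$, which can be confirmed by multiplying out.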
The Taylor polynomials of the most important elementary functions are extremely
simple.
Example 23.2. The Taylor polynomial for $\sin(x)$ at 0 is, to order $2n + 1$,
\[ P_{2n+1,0}(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} \tag{23.8} \]
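To get a feeling for how quickly these polynomials approach the sine function, one can evaluate (23.8) directly. The following sketch compares $P_{2n+1,0}(1)$ with the library value of $\sin(1)$ for the first few $n$. The function name `sin_taylor` and the evaluation point are our own choices.

```python
import math

def sin_taylor(x, n):
    """P_{2n+1,0}(x) = sum over k = 0..n of (-1)^k x^(2k+1) / (2k+1)!, see (23.8)."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        for k in range(n + 1)
    )

# Absolute error at x = 1 for n = 0, 1, 2, 3, 4.
errors = [abs(math.sin(1.0) - sin_taylor(1.0, n)) for n in range(5)]
for n, err in enumerate(errors):
    print(n, err)
```

At this point the error shrinks by roughly two orders of magnitude with each additional term.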
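Example 23.3. The Taylor polynomial for $\log(x)$ at $a = 1$ is, to order $n$,
\[ P_{n,1}(x) = (x-1) - \frac{1}{2}(x-1)^2 + \frac{1}{3}(x-1)^3 - \dots + \frac{(-1)^{n-1}}{n} (x-1)^n \tag{23.9} \]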
Exercise 23.1. For the example of the logarithm it is more convenient to consider the
function f (x) = log(1 + x). In this case compute the Taylor polynomial of f at a = 0.
Example 23.4. Let us compute the Taylor polynomial for $\arctan(x)$ at $a = 0$. Since
\[ \arctan'(x) = \frac{1}{1 + x^2} , \tag{23.10} \]
we see that $\arctan'(0) = 1$, and $\arctan''(0) = 0$, so
\[ P_{2,0}(x) = x . \tag{23.11} \]
Example 23.5. Consider the exponential function $f(x) = e^x$, and its Taylor polynomial of
degree 1,
\[ P_{1,0}(x) = 1 + x . \tag{23.14} \]
While this is a good approximation as $x$ approaches 0, it appears that
\[ P_{2,0}(x) = 1 + x + \frac{1}{2} x^2 \tag{23.15} \]
gives an even better approximation. Indeed by L’Hôpital’s rule,
\[ \lim_{x \to 0} \frac{e^x - 1 - x - \frac{1}{2} x^2}{x^2} = \lim_{x \to 0} \frac{e^x - 1 - x}{2x} = \lim_{x \to 0} \frac{e^x - 1}{2} = 0 . \tag{23.16} \]
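The limit (23.16) can also be observed numerically. The sketch below evaluates the quotient $(e^x - P_{n,0}(x))/x^n$ for $n = 1, 2$ at a few small values of $x$. The function names and the sample values are our own choices, and for much smaller $x$ floating point cancellation would start to dominate.

```python
import math

def exp_taylor(x, n):
    """Taylor polynomial P_{n,0} of the exponential function."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def scaled_error(x, n):
    """The quotient (e^x - P_{n,0}(x)) / x^n, which should tend to 0 as x -> 0."""
    return (math.exp(x) - exp_taylor(x, n)) / x ** n

xs = [0.1, 0.01, 0.001]
order1 = [scaled_error(x, 1) for x in xs]
order2 = [scaled_error(x, 2) for x in xs]
print(order1)
print(order2)
```

Both lists decrease towards zero. The second behaves like $x/6$, the first like $x/2$, which reflects the next term of the respective Taylor polynomial.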
Theorem 23.1. Suppose $f$ is a function for which $f'(a), \dots, f^{(n)}(a)$ all exist. Then
\[ \lim_{x \to a} \frac{f(x) - P_{n,a}(x)}{(x - a)^n} = 0 . \tag{23.17} \]
We also say that “$f(x)$ equals $P_{n,a}(x)$ up to order $n$ at $a$.” We have designed $P_{n,a}$ to
have that property, but it turns out that the Taylor polynomial is the only polynomial
with this property.
Theorem 23.2. Let $p(x)$ and $q(x)$ be two polynomials in $(x - a)$, of degree at most $n$,
which are equal up to order $n$ at $a$ in the sense that
\[ \lim_{x \to a} \frac{p(x) - q(x)}{(x - a)^n} = 0 . \tag{23.18} \]
Then $p = q$.
Proof. The difference $r = p - q$ is a polynomial of degree at most $n$,
and we can use the case $i = 1$ to infer that $b_1 = 0$. Continuing in this way gives
$b_0 = b_1 = b_2 = \dots = b_n = 0$.
In some situations this insight gives an unexpected way to compute the Taylor polynomial
of a function.
Example 23.6. Let us return to the problem of computing the Taylor polynomial of
\[ \arctan(x) = \int_0^x \frac{1}{1 + t^2} \, dt . \tag{23.22} \]
We have
\[ \frac{1}{1 + t^2} = 1 - t^2 + t^4 - t^6 + \dots + (-1)^n t^{2n} + (-1)^{n+1} \frac{t^{2n+2}}{1 + t^2} . \tag{23.23} \]
Therefore
\[ \arctan(x) = x - \frac{1}{3} x^3 + \frac{1}{5} x^5 - \dots + (-1)^n \frac{x^{2n+1}}{2n+1} + (-1)^{n+1} \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt , \tag{23.24} \]
and so the polynomial that appears here is the Taylor polynomial of degree $2n + 1$,
provided we can show that
\[ \lim_{x \to 0} \frac{1}{x^{2n+1}} \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt = 0 . \tag{23.25} \]
Since
\[ 0 \le \left| \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt \right| \le \frac{|x|^{2n+3}}{2n+3} \tag{23.26} \]
this is satisfied, and we conclude
\[ P_{2n+1,0}(x) = x - \frac{1}{3} x^3 + \frac{1}{5} x^5 - \dots + (-1)^n \frac{x^{2n+1}}{2n+1} . \tag{23.27} \]
This last example also shows that for $|x| \le 1$,
\[ |\arctan(x) - P_{2n+1,0}(x)| \le \frac{1}{2n+3} , \tag{23.28} \]
which means that we can use Taylor polynomials to compute $\arctan(x)$ on this interval as
accurately as we like. We will now turn more generally to the question of how well a Taylor
polynomial $P_{n,a}[f](x)$ approximates $f(x)$, for fixed $x$, and different $n$.
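The error estimate coming from (23.26), namely $|\arctan(x) - P_{2n+1,0}(x)| \le |x|^{2n+3}/(2n+3)$, is easy to test. The sketch below compares the Taylor polynomial with the library function `math.atan` for several $x$ in $[-1, 1]$ and several $n$. The function names are ours, and the small tolerance only absorbs floating point rounding.

```python
import math

def arctan_taylor(x, n):
    """P_{2n+1,0}(x) = x - x^3/3 + x^5/5 - ... + (-1)^n x^(2n+1)/(2n+1)."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(n + 1))

def remainder_bound(x, n):
    """The bound |x|^(2n+3) / (2n+3) from (23.26)."""
    return abs(x) ** (2 * n + 3) / (2 * n + 3)

checks = []
for x in (-1.0, -0.5, 0.25, 1.0):
    for n in (0, 1, 5, 20):
        error = abs(math.atan(x) - arctan_taylor(x, n))
        checks.append(error <= remainder_bound(x, n) + 1e-12)

print(all(checks))
```

Note how slowly the bound decays at $x = \pm 1$, where it is only $1/(2n+3)$, compared with points inside the interval, where it decays geometrically.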
Problems
1. Find the Taylor polynomials $P_{n,a}[f]$ for the following functions $f$:
a) $f(x) = e^{e^x}$, $n = 3$, $a = 0$
b) $f(x) = e^{\sin(x)}$, $n = 3$, $a = 0$
c) $f(x) = \sin(x)$, degree $2n$, $a = \pi/2$
d) $f(x) = \cos(x)$, degree $2n$, $a = \pi$
3. Consider the equation $x^2 = \cos(x)$, which has precisely two solutions. Use the
Taylor polynomial of the cosine function of degree 3, to show that the solutions are
approximately $\pm\sqrt{2/3}$, and give a bound for the error. Use a fifth degree Taylor
polynomial to get a better approximation.
4. Suppose that $a_i$ and $b_i$ are the coefficients in the Taylor polynomials at $a$ of $f$ and
$g$. Find the coefficients of the Taylor polynomials at $a$ of the following functions in
terms of $a_i$ and $b_i$:
a) $f + g$
b) $f g$
c) $f'$
d) $h(x) = \int_a^x f(t) \, dt$
\[ P_{4n+2,0}(x) = x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \dots + (-1)^n \frac{x^{4n+2}}{(2n+1)!} . \tag{23.29} \]
Note 24.
Taylor’s theorem
In the previous lecture we have introduced Taylor polynomials $P_{n,a}$ as polynomial
functions whose derivatives at a point $a$ agree with those of a given function $f$ up to a
certain order $n$. The main theorem in this lecture will answer precisely the question in
which sense a Taylor polynomial approximates the values of a given function away from
the point $a$.
If $f$ is a function for which $P_{n,a}[f]$ exists, we define the remainder term $R_{n,a}[f]$ by
\[ R_{n,a}[f](x) = f(x) - P_{n,a}[f](x) . \]
\[ R_{2,a}(x) = \int_a^x \frac{f^{(3)}(t)}{2} (x - t)^2 \, dt \tag{24.5} \]
by suitably integrating by parts:
\[ R_{1,a}(x) = \int_a^x f''(t) \Bigl[ -\frac{1}{2} (x - t)^2 \Bigr]' \, dt . \tag{24.6} \]
Proceeding in this way we can prove by induction the integral form of the remainder,
\[ R_{n,a}(x) = \int_a^x \frac{f^{(n+1)}(t)}{n!} (x - t)^n \, dt . \tag{24.7} \]
We get from this the first estimate for the remainder:
Proposition 24.1. Suppose $f$ is $n + 1$ times differentiable on an interval $I$, $a \in I$, and
$|f^{(n+1)}(x)| \le M$ for $x \in I$, with some $M > 0$. Then
\[ |R_{n,a}(x)| \le \frac{M}{(n+1)!} |x - a|^{n+1} \tag{24.8} \]
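As an illustration of the estimate (24.8), consider $f = \sin$ with $a = 0$. All derivatives of the sine are bounded by $M = 1$ on the whole real line, so the bound applies on any interval. The sketch below compares the actual remainder with the right hand side of (24.8); the function names, sample points and the rounding tolerance are our own choices.

```python
import math

def taylor_sin(x, n):
    """Taylor polynomial P_{n,0} of sin; its derivatives at 0 cycle through 0, 1, 0, -1."""
    cycle = (0.0, 1.0, 0.0, -1.0)
    return sum(cycle[k % 4] * x ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(x, n, a=0.0, M=1.0):
    """Right hand side of (24.8): M |x - a|^(n+1) / (n+1)!."""
    return M * abs(x - a) ** (n + 1) / math.factorial(n + 1)

rows = []
for x in (-3.0, -1.0, 0.5, 2.0, 3.0):
    for n in (1, 2, 3, 6, 10):
        remainder = abs(math.sin(x) - taylor_sin(x, n))
        rows.append((x, n, remainder, remainder_bound(x, n)))

bound_holds = all(r <= b + 1e-12 for _, _, r, b in rows)
print(bound_holds)
```

For fixed $x$ the bound tends to zero as $n$ grows, because the factorial in the denominator eventually outgrows the power $|x|^{n+1}$.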
Problems
1. Prove that if $x \le 0$, then the remainder term $R_{n,0}$ for $e^x$ satisfies
\[ |R_{n,0}| \le \frac{|x|^{n+1}}{(n+1)!} \tag{24.15} \]
2. Prove that if $-1 < x \le 0$, then the remainder term $R_{n,0}$ for $\log(1 + x)$ satisfies
\[ |R_{n,0}| \le \frac{|x|^{n+1}}{(1 + x)(n + 1)} . \tag{24.16} \]
3. a) Suppose that $f''(a)$ exists. Show that
\[ f''(a) = \lim_{h \to 0} \frac{f(a + h) + f(a - h) - 2 f(a)}{h^2} . \tag{24.17} \]
The limit on the right hand side is called the Schwarz second derivative of
$f$ at $a$. Hint: Use the Taylor polynomial $P_{2,a}(x)$ with $x = a + h$, and with
$x = a - h$.
b) Let $f(x) = x^2$ for $x \ge 0$, and $-x^2$ for $x \le 0$. Show that the Schwarz second
derivative of $f$ at 0 exists, even though $f''(0)$ does not.
c) Prove that if $f'''(a)$ exists, then
4. Give another proof for the uniqueness of solutions to the differential equation
$y'' - y = 0$ using Taylor’s theorem. In other words, suppose $f'' - f = 0$ and
$f(0) = f'(0) = 0$, and show that then $f = 0$.
Additional: Taylor’s theorem
Proof of Taylor’s theorem with Lagrange remainder
Recommended Reading
As mentioned above the proof of Taylor’s theorem does not rely on the integral form
of the remainder, but instead on a higher order version of the mean value theorem:
\[ g(x) = \frac{g^{(n+1)}(t)}{(n+1)!} (x - a)^{n+1} . \tag{24.2} \]
for some $t$ in between $a$ and $x$.
for some $t \in (a, x)$. Now $g'$ itself is $k + 1$ times differentiable, and satisfies $g'(a) = \dots =
(g')^{(k)}(a) = 0$, so by our inductive assumption,
\[ g'(t) = \frac{(g')^{(k+1)}(y)}{(k+1)!} (t - a)^{k+1} \tag{24.5} \]
Proof of Theorem 24.2. The remainder $R_{n,a}(x) = f(x) - P_{n,a}(x)$ satisfies by its very definition,
\[ R_{n,a}(a) = \dots = R_{n,a}^{(n)}(a) = 0 \tag{24.7} \]
where we used that $P_{n,a}(x)$ is a polynomial of degree $n$, and hence $P_{n,a}^{(n+1)} = 0$.
Local extrema
Recommended Reading
As a consequence of Theorem 23.1 the question of local extrema can be settled even in
the indefinite case. Recall that if $a$ is a critical point of $f$, then $f$ has a local minimum
at $a$ if $f''(a) > 0$, and a local maximum if $f''(a) < 0$, but no immediate conclusion can
be drawn if $f''(a) = 0$. It is now clear that in this case $f^{(3)}(a)$ will give the relevant
information, and moreover if also $f^{(3)}(a) = 0$, then the sign of $f^{(4)}(a)$ is significant. More
generally, we can ask what happens when
Additional Problems
1. a) Show that if $|g'(x)| \le M |x - a|^n$ for $|x - a| < \delta$, then $|g(x) - g(a)| \le
M |x - a|^{n+1} / (n + 1)$ for $|x - a| < \delta$.
b) Use this to show that if $\lim_{x \to a} g'(x) / (x - a)^n = 0$, then
\[ \lim_{x \to a} \frac{g(x) - g(a)}{(x - a)^{n+1}} = 0 . \tag{24.10} \]
c) Show that if $g(x) = f(x) - P_{n,a}[f](x)$, then $g'(x) = f'(x) - P_{n-1,a}[f'](x)$.
d) Give an inductive proof of Theorem 23.1, without using L’Hôpital’s rule.
2. Deduce Theorem 23.1 as a corollary of Taylor’s theorem, albeit under the assumption
of one more derivative.
Note 25.
Infinite Sequences
In the previous lecture we have encountered the question of whether the numbers
\[ \frac{a^n}{n!} \tag{25.1} \]
for a given choice of $a \in \mathbb{R}$, become “small for $n$ sufficiently large”. The numbers
$a_n = a^n / n!$ for $n \in \mathbb{N}$ are an example of an “infinite sequence” of real numbers
\[ a_1, a_2, a_3, \dots \tag{25.2} \]
which we also write as
\[ \{a_n\}_{n=1}^{\infty} . \tag{25.3} \]
Given that a sequence assigns to each natural number $n$ a real number $a_n$ we could
define the concept as follows:
One could graph a sequence in the same way we graph a function, but it is usually
more convenient to simply label the points $a_n$ on the real number line $\mathbb{R}$.
Sequences are often defined explicitly by a formula for the $n$th term, or recursively in
the sense that $a_{k+1}$ is given in terms of $a_k$, or even $a_l$ for $1 \le l \le k$.
Example 25.1. $a_k = k^2$.
Example 25.2. The factorial function $n!$ can itself be thought of as a recursively defined
sequence. Setting $a_0 = 1$, and $a_{n+1} = a_n (n + 1)$, the number $n!$ is the $n$-th number in this
sequence, $n! = a_n$.
Example 25.3. Another example of a recursively defined sequence are the Fibonacci
numbers, where $x_1 = x_2 = 1$, and $x_k = x_{k-1} + x_{k-2}$ for $k \ge 3$.
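Recursive definitions of this kind translate directly into short loops. The following sketch implements the recursions of Examples 25.2 and 25.3; the function names are our own, and the Fibonacci recursion is applied from the third term onwards.

```python
def factorial_sequence(n):
    """Return a_n for the recursion a_0 = 1, a_{k+1} = a_k (k + 1), so that a_n = n!."""
    a = 1
    for k in range(n):
        a = a * (k + 1)
    return a

def fibonacci(count):
    """Return the first `count` Fibonacci numbers x_1, x_2, ..., with x_1 = x_2 = 1."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

print(factorial_sequence(5))
print(fibonacci(10))
```

An explicitly defined sequence such as $a_k = k^2$ needs no loop at all, since each term can be computed independently of the previous ones.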
The statement we would like to make about the sequence (25.1) is that it “converges
to zero”, or
\[ \frac{a^n}{n!} \to 0 \quad (n \to \infty) . \tag{25.4} \]
The next definition makes this notion precise.
\[ a_n \to l \quad (n \to \infty) \qquad \text{or} \qquad \lim_{n \to \infty} a_n = l \tag{25.5} \]
Alternatively, we could have taken the point of view of Definition 25.1, and view
$a_n = \sqrt{n} = f(n)$ as the values of a function $f$, first with domain $\mathbb{N}$, but then as the
values of the function $f(x) = \sqrt{x}$ with domain $x > 0$, evaluated on the natural numbers
$x = n$. This would allow us to apply the mean value theorem, to get immediately, for some $x \in (n, n + 1)$,
\[ \sqrt{n + 1} - \sqrt{n} = f(n + 1) - f(n) = f'(x) = \frac{1}{2 \sqrt{x}} \le \frac{1}{2 \sqrt{n}} \to 0 \quad (n \to \infty) \tag{25.10} \]
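The inequality obtained from the mean value theorem in (25.10) can be checked on a few values of $n$. This is only a numerical illustration; the list of sample indices is an arbitrary choice of ours.

```python
import math

ns = [1, 10, 100, 10_000, 1_000_000]

# Differences sqrt(n + 1) - sqrt(n) and the bound 1 / (2 sqrt(n)) from (25.10).
gaps = [math.sqrt(n + 1) - math.sqrt(n) for n in ns]
bounds = [1.0 / (2.0 * math.sqrt(n)) for n in ns]

for n, gap, bound in zip(ns, gaps, bounds):
    print(n, gap, bound)
```

For large $n$ the difference and the bound are nearly equal, since the intermediate point $x$ lies between $n$ and $n + 1$.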
\[ \lim_{n \to \infty} x_n = \infty \]
if for every $C > 0$ there is an integer $N$ such that $x_n > C$ whenever $n > N$.
Example 25.7. Another typical example is
\[ \lim_{n \to \infty} \frac{3 n^3 + 7 n^2 + 1}{4 n^3 - 8 n + 63} = \frac{3}{4} \tag{25.11} \]
This is clear because the $n^3$ terms are the “leading order” terms for large $n$, and this can
be turned into a proof by “dividing through” by $n^3$:
\[ a_n = \frac{3 n^3 + 7 n^2 + 1}{4 n^3 - 8 n + 63} = \frac{3 + 7/n + 1/n^3}{4 - 8/n^2 + 63/n^3} . \]
We could now proceed by first finding an explicit expression for $a_n - 3/4$, and then
estimate $|a_n - 3/4|$, with the aim of verifying $a_n \to 3/4$ directly using Definition 25.2. It
is easier, however, to apply at this stage the following limit laws.
For the evaluation of limits as in the last example the following facts are useful:
Theorem 25.1 (Limit Laws). Suppose the infinite sequences $\{a_n\}$ and $\{b_n\}$ both have
limits as $n \to \infty$, then
Remark 25.1. The formulation of the last equality of limits actually requires some more
care. As it stands we are considering the sequence $c_n = a_n / b_n$ which may not even be
defined for all $n \in \mathbb{N}$. However, since $\lim_{n \to \infty} b_n \ne 0$, we know that for some $N$ sufficiently
large, $b_n \ne 0$ whenever $n \ge N$. Moreover, redefining all $b_n$, whenever $b_n = 0$ and $n \le N$,
obviously does not affect the statement about the limits.
The proof of these limit laws is so similar to the corresponding statements for limits of
functions, that it will not be repeated here. Nonetheless, let us explore the similarity
between the definition of limits of functions and sequences a little further.
Note for example that if f is a function that satisfies
\[ \lim_{n \to \infty} a^n = 0 . \tag{25.18} \]
To prove this note that $a^x = e^{x \log a}$, and $\log a$ is negative, hence $\lim_{x \to \infty} e^{x \log a} = 0$.
Exercise 25.2. Show that for any $|a| < 1$, $\lim_{n \to \infty} a^n = 0$. Also show that, if $a > 1$ then
$\lim_{n \to \infty} a^n = \infty$.
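We can now finally return to (25.4). In other words, let us show that for any $a \in \mathbb{R}$,
$\lim_{n \to \infty} a^n / n! = 0$. We can write for $n > N > 2|a|$,
\[ \Bigl| \frac{a^n}{n!} \Bigr| \le \frac{|a|^N}{N!} \, \frac{|a| \cdots |a|}{(N + 1) \cdots n} \le \frac{|a|^N}{N!} \Bigl( \frac{1}{2} \Bigr)^{n - N} \to 0 \quad (n \to \infty) \tag{25.19} \]
where in the last step we are using that $2^{-n} \to 0$ as $n \to \infty$ as we have shown in (25.18).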
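The estimate (25.19) can be watched at work for a concrete value of $a$. The sketch below takes $a = -10$ and $N = 21$, so that $N > 2|a|$, and checks the geometric bound for a range of $n > N$. The values of `a` and `N`, the range of `n` and the small relative tolerance for rounding are our own choices.

```python
import math

def term(a, n):
    """n-th term a^n / n! of the sequence (25.1)."""
    return a ** n / math.factorial(n)

a = -10.0
N = 21                                   # any integer N > 2|a| would do
head = abs(a) ** N / math.factorial(N)   # the constant |a|^N / N! in (25.19)

bound_holds = all(
    abs(term(a, n)) <= head * 0.5 ** (n - N) * (1 + 1e-12)
    for n in range(N + 1, 80)
)
tail = [abs(term(a, n)) for n in (10, 20, 40, 80)]
print(bound_holds, tail)
```

The terms first grow to several thousand before the factorial takes over, which shows why the estimate only starts from a sufficiently large index $N$.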
Finally the observation made in (25.16) and (25.17) still does not give us convergence
of sequences like
\[ a_n = \sin\Bigl( 13 + \frac{1}{n^2} \Bigr) \tag{25.20} \]
\[ b_n = \cos\Bigl( \sin\Bigl( 1 + (-1)^n \frac{1}{n} \Bigr) \Bigr) \tag{25.21} \]
which clearly should converge to $\sin(13)$, and $\cos(\sin(1))$, respectively. (If this is not “clear”
draw a few points on the line.) The theorem that allows us to conclude that is the
following:
Theorem 25.2. Let $c \in \mathbb{R}$, and $f$ be a function defined on an interval $I$ that contains $c$,
except perhaps at $c$ itself, and suppose
\[ \lim_{x \to c} f(x) = l . \tag{25.22} \]
Suppose $\{a_n\}$ is a sequence such that each $a_n \in I$, $a_n \ne c$, and $\lim_{n \to \infty} a_n = c$. Then the
sequence $\{f(a_n)\}$ converges, and
\[ \lim_{n \to \infty} f(a_n) = l . \tag{25.23} \]
Conversely, if this is true for every sequence $\{a_n\}$ satisfying these conditions, then (25.22)
holds.
Proof. If $\lim_{x \to c} f(x) = l$, then for every $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ |f(x) - l| < \varepsilon \tag{25.24} \]
whenever $0 < |x - c| < \delta$. Now by assumption we can choose $N > 0$ such that
\[ n > N \implies |a_n - c| < \delta , \tag{25.25} \]
which then implies that $|f(a_n) - l| < \varepsilon$, showing that $f(a_n) \to l$.
Conversely, if (25.22) were not true, then there exists $\varepsilon > 0$, so that for every $\delta > 0$,
there exists $x$ with $0 < |x - c| < \delta$, and $|f(x) - l| \ge \varepsilon$. However, this can be used to define
a sequence $a_n$ of points with the property that say $0 < |a_n - c| < 1/n$, but $|f(a_n) - l| \ge \varepsilon$.
Then $a_n \to c$, but $f(a_n)$ does not converge to $l$, in contradiction to (25.23).
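Theorem 25.2 applies to the sequences (25.20) and (25.21), since the sine and cosine are continuous. The following sketch evaluates both sequences for growing $n$ and prints the distance to the expected limits $\sin(13)$ and $\cos(\sin(1))$. The function names and sample indices are our own, and a numerical experiment of this kind illustrates the convergence but does not prove it.

```python
import math

def a(n):
    """The sequence (25.20): sin(13 + 1/n^2)."""
    return math.sin(13 + 1 / n ** 2)

def b(n):
    """The sequence (25.21): cos(sin(1 + (-1)^n / n))."""
    return math.cos(math.sin(1 + (-1) ** n / n))

limit_a = math.sin(13)
limit_b = math.cos(math.sin(1))

ns = [10, 100, 1000, 10_000]
distances_a = [abs(a(n) - limit_a) for n in ns]
distances_b = [abs(b(n) - limit_b) for n in ns]
print(distances_a)
print(distances_b)
```

The first sequence approaches its limit at least as fast as $1/n^2$ and the second at least as fast as $1/n$, because both functions involved have derivatives bounded by 1.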
Example 25.9.
\[ \lim_{n \to \infty} \Bigl( 1 + \frac{a}{n} \Bigr)^n = e^a . \tag{25.26} \]
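The limit (25.26) converges rather slowly, which a quick experiment makes visible. The sketch below evaluates $(1 + a/n)^n$ for two values of $a$ and increasing $n$; the function name `compound` and the chosen values are ours.

```python
import math

def compound(a, n):
    """The n-th term (1 + a/n)^n of the sequence in (25.26)."""
    return (1 + a / n) ** n

ns = [10, 1000, 100_000]
errors_pos = [abs(compound(2.0, n) - math.exp(2.0)) for n in ns]
errors_neg = [abs(compound(-1.0, n) - math.exp(-1.0)) for n in ns]
print(errors_pos)
print(errors_neg)
```

In both cases the error decreases roughly in proportion to $1/n$, so each additional correct digit requires about ten times as many steps.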
Problems
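1. Verify each of the following limits.
a) $\lim_{n \to \infty} \frac{n}{n+1} = 1$
b) $\lim_{n \to \infty} \frac{n+3}{n^3+4} = 0$
c) $\lim_{n \to \infty} \frac{n!}{n^n} = 0$
d) $\lim_{n \to \infty} \sqrt[n]{a} = 1$, $a > 0$
e) $\lim_{n \to \infty} \sqrt[n]{n} = 1$
f) $\lim_{n \to \infty} \sqrt[n]{n^2 + n} = 1$
c) $\lim_{n \to \infty} \frac{2^n + (-1)^n}{2^{n+1} + (-1)^{n+1}}$
d) $\lim_{n \to \infty} n c^n$, $|c| < 1$
3. a) Prove that if $0 < a < 2$, then $a < \sqrt{2a} < 2$.
b) Prove that the sequence
\[ \sqrt{2}, \ \sqrt{2 \sqrt{2}}, \ \sqrt{2 \sqrt{2 \sqrt{2}}}, \ \dots \tag{25.27} \]
converges.
c) Find the limit.
5. Find a sequence $\{a_n\}$ of points in $(0, 1)$ such that $\lim_{n \to \infty} a_n$ is not in $(0, 1)$.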