Integration
2024, 16 LECTURES
Acknowledgement. These notes are a very small edit of the notes produced by
Charles Batty who lectured this course from 2018-21. I’m grateful to him for allowing
me to use his notes in this way. I am responsible for any typos / inaccuracies in the
notes (please let me know of any you find).
Stuart White
stuart.white@maths.ox.ac.uk
Reading
Qian’s notes were written for the course as he gave it in 2014-17, based on previous
versions of the course given by Alison Etheridge and Charles Batty. We will cover more
or less the same material, but not follow his notes exactly.
Capinski and Kopp is the most basic of the books, giving the theory in a basic style,
but with not many worked examples; we shall follow rather closely their approach to the
theory. Priestley adopts a very different approach to the construction of the integral,
so early parts of her book look quite different from what we will do, but about the 8th
lecture onward everything comes together; she has lots of worked examples.
Stein and Shakarchi, and Garling, are a little more sophisticated in the theory.
Garling’s book is based on lectures given in Cambridge, and it has a good number of
worked examples.
Numerous other useful books may be found in libraries. Some may adopt different
approaches to the construction of the integral, but when they talk about Lebesgue
integration they all mean the same class of integrable functions and the same theorems.
Introduction
In Prelims, you saw how to define $\int_a^b f(x)\,dx$ for a continuous function $f : [a, b] \to \mathbb{R}$
or more generally for Riemann integrable f . It had some good properties: the Funda-
mental Theorem of Calculus shows that it is more or less an inverse of differentiation,
leading to rigorous statements concerning A level calculus. Moreover you saw that
(*) $\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$
if $(f_n)$ converges to $f$ uniformly on $[a, b]$. This was useful (a) for integrating power series
term-by-term, (b) for finding $\lim_{n\to\infty} \int_\gamma f_n(z)\,dz$, where $\gamma$ is a contour of finite length,
in complex analysis last term. However, the Riemann integral has various deficiencies:
(a) There are still functions which one feels one should be able to integrate, for which
the Prelims definition fails to work. For example, let $f = \chi_{\mathbb{Q}\cap[0,1]}$ be the characteristic
function of $\mathbb{Q} \cap [0, 1]$. Then the lower and upper Riemann integrals are
$\underline{\int_0^1} f(x)\,dx = 0, \qquad \overline{\int_0^1} f(x)\,dx = 1,$
so $f$ is not Riemann integrable; we need to extend the definition of integrals in some way beyond Riemann integration.
(b) There is a lack of theorems saying that
$f_n \to f \implies \int f_n(x)\,dx \to \int f(x)\,dx$
(i) Instead of using integrals to define lengths of sets, define the length of a set
directly; then define integrals.
(ii) Instead of partitioning the x-axis into intervals and using step functions, partition
the y-axis into intervals and consider the corresponding “simple” functions.
There are other ways of constructing Lebesgue’s integral on R, including ways which
use step functions (see Priestley), but they don’t generalise so easily to probability (for
example). Once one gets the Monotone Convergence Theorem, then everything is the
same, however you got there. We then get a whole host of theorems about:
Note that these processes do not always work—there are simple counterexamples
for the first 4! So all these theorems have conditions which must be checked before
using in applications. In this course, we do not take the position that you can just
assume all these processes work. On the other hand, we shall not go pedantically
through all details of the construction of the integral and the proofs of the theorems.
We’ll approach the construction in a way which generalises easily, but the proofs of
these generalisations are often not interesting. The construction up to the MCT will
take some time - around 8 lectures - and then useful theorems and applications will
come thick and fast.
Please be aware that all the Prelims theory remains valid in this context. Lebesgue
integration theory extends Riemann’s theory by enabling you to integrate more func-
tions. In particular, the Fundamental Theorem of Calculus (both versions), Integration
by Parts and Substitution remain valid under the assumptions given in Prelims.
In this course, we shall often take infinite series of non-negative terms and limits of
(monotone) sequences. In order to avoid complications concerning divergence, it will
be convenient to work in the extended real numbers including −∞ and ∞, and to use
the notions of lim sup and lim inf.
4 INTEGRATION, H.T. 2024
Thus we consider the set [−∞, ∞] = R ∪ {−∞, ∞}. Addition and multiplication
by ∞ are defined as follows (for x ∈ R):
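$x + \infty = \infty + x = \infty,$
$x - \infty = -\infty + x = -\infty,$
$x \cdot \infty = \infty \cdot x = (-x) \cdot (-\infty) = \begin{cases} \infty & (x > 0), \\ -\infty & (x < 0), \\ 0 & (x = 0). \end{cases}$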
Note that
• ∞ − ∞ is undefined;
• the usual laws (commutativity, associativity and distributivity) apply, provided
that the relevant expressions are defined;
• the above are uncontroversial, except for 0.∞ = 0 which is convenient for our
particular context but might be inappropriate in other mathematical contexts.
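The convention $0 \cdot \infty = 0$ differs from IEEE floating-point arithmetic, where `0 * inf` evaluates to `nan`. A minimal Python sketch of the multiplication rule above follows; the helper name `ext_mul` is our own and is not part of the notes.

```python
import math

def ext_mul(x, y):
    """Multiply in [-inf, inf] using the convention 0 * (+/-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    # For non-zero operands, float arithmetic already matches the rules above.
    return x * y

print(ext_mul(0, math.inf))    # convention used in these notes
print(0 * math.inf)            # IEEE arithmetic gives nan instead
print(ext_mul(-3, math.inf))
```

The only special case is a zero factor; this matters later when a function takes the value $\infty$ on a null set.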
The ordering on [−∞, ∞] is the obvious one, and limn→∞ an = ∞ has the same meaning
as in Prelims Analysis.
In this system, any subset E has a supremum and an infimum in [−∞, ∞]. Note
that $\sup \emptyset = -\infty$. If $E \subseteq \mathbb{R}$, $\sup E = \infty$ if and only if $E$ is not bounded above. For an
increasing sequence $(a_n)$, $\lim_{n\to\infty} a_n = \sup\{a_n\}$. If $a_n \ge 0$ for all $n$, then $\sum a_n = \infty$ if
and only if the series diverges.
Proposition 1.1. 1. Let $(a_n)$ be a sequence of non-negative terms. Then
$\sum_{n=1}^{\infty} a_n = \sup\left\{ \sum_{n\in J} a_n : J \text{ finite subset of } \mathbb{N} \right\}.$
2. Let $(b_{mn})_{m,n\ge1}$ be a double sequence of non-negative terms, and $\{(m_k, n_k) : k \ge 1\}$
be any enumeration of $\mathbb{N} \times \mathbb{N}$. Then
$\sum_{m=1}^{\infty}\sum_{n=1}^{\infty} b_{mn} = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} b_{mn} = \sum_{k=1}^{\infty} b_{m_k,n_k} = \sup\left\{ \sum_{(m,n)\in J} b_{mn} : J \text{ finite subset of } \mathbb{N}\times\mathbb{N} \right\}.$
In particular, Proposition 1.1 implies that $\sum a_n$ is independent of the order of the
terms, and similarly $\sum\sum b_{mn}$ can be arbitrarily rearranged.
A bounded sequence (an ) in R may not have a limit. It has a supremum and
infimum, but for some large values of n, an may not be close to them. Think for
example about an = (1 + 1/n) sin n. Asymptotically the values oscillate between −1
and 1, but there are infinitely many values bigger than 1 and infinitely many smaller
than −1.
For a sequence $(a_n)$ in $[-\infty, \infty]$, define
$\limsup_{n\to\infty} a_n = \lim_{m\to\infty} \sup_{n\ge m} a_n,$
$\liminf_{n\to\infty} a_n = \lim_{m\to\infty} \inf_{n\ge m} a_n.$
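The limits exist, because $\left(\sup_{n\ge m} a_n\right)_{m\ge1}$ is a decreasing sequence
and $\left(\inf_{n\ge m} a_n\right)_{m\ge1}$ is an increasing sequence.
So, $\limsup_{n\to\infty} a_n$ is the largest number $\ell$ such that there is a subsequence of $(a_n)$
converging to $\ell$.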
Examples 1.2. 1. Let $a_n = (1 + 1/n) \sin n$. Then
$\limsup_{n\to\infty} a_n = 1, \qquad \liminf_{n\to\infty} a_n = -1.$
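As a numerical illustration only (a finite computation cannot prove a statement about limits), the tail suprema and infima of $a_n = (1 + 1/n)\sin n$ can be tabulated. The helper names and the cut-off `N` below are our own choices; the true tail supremum is over all $n \ge m$, which we truncate at `N`.

```python
import math

def a(n):
    return (1 + 1 / n) * math.sin(n)

def tail_sup(m, N):
    """Supremum of a_n over m <= n <= N (a finite stand-in for sup over n >= m)."""
    return max(a(n) for n in range(m, N + 1))

def tail_inf(m, N):
    """Infimum of a_n over m <= n <= N."""
    return min(a(n) for n in range(m, N + 1))

N = 20000
for m in (1, 100, 1000):
    print(m, tail_sup(m, N), tail_inf(m, N))
```

The tail suprema decrease towards 1 and the tail infima increase towards -1 as `m` grows, matching the definition of lim sup and lim inf as limits of monotone sequences.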
lim sup and lim inf are useful for avoiding epsilontics. For example, consider the
Sandwich Rule, i.e., suppose that an ≤ bn ≤ cn for all n and lim an = lim cn . Then
lim sup bn ≤ lim sup cn (Proposition 1.3(4))
= lim cn (Proposition 1.3(3))
= lim an (assumption)
= lim inf an (Proposition 1.3(3))
≤ lim inf bn (Proposition 1.3(4))
≤ lim sup bn (Proposition 1.3(2)).
Hence equality holds throughout, so lim bn = lim an , by Proposition 1.3(3).
2. Lebesgue measure
(vi)′ $m\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} m(A_n)$ if $A_n \cap A_k = \emptyset$ for $k \ne n$ ($m$ is countably additive);
(vii) $m\left(\bigcup_{n=1}^{\infty} A_n\right) = \lim_{n\to\infty} m(A_n)$ if $(A_n)$ is an increasing sequence of sets.
In fact, there is very considerable redundancy here. For example, (v), (vi) and (vii)
follow from (i) and (vi)′.
The status of (vi)′ is perhaps debatable, but it is usually assumed. It is equivalent to
(vi) and (vii) together, and (vii) is essential to have a Monotone Convergence Theorem.
Let us attempt to construct such an $m$. For $A \subseteq \mathbb{R}$, suppose that $A \subseteq \bigcup_{n=1}^{\infty} I_n$ for
intervals $I_n$. Letting $I_n' = I_n \setminus (I_1 \cup \dots \cup I_{n-1})$, we have
$m(A) \le m\left(\bigcup I_n'\right) = \sum m(I_n') \le \sum m(I_n).$
So we attempt to define m as follows. First, for any interval I with endpoints a and b,
define
|I| = b − a.
For $A \subseteq \mathbb{R}$, we define the outer measure of $A$ to be
$m^*(A) = \inf\left\{ \sum_{n=1}^{\infty} |I_n| : I_n \text{ intervals}, \ A \subseteq \bigcup_{n=1}^{\infty} I_n \right\}.$
We can always take In = [−n, n], so the infimum is not over the empty set (but m∗ (A)
may be infinite). It makes no difference if we restrict In to being closed intervals, or
open intervals.
Proposition 2.1. 1. $m^*(\emptyset) = 0$, $m^*(\{x\}) = 0$;
2. $m^*(I) = |I| = b - a$ if $I$ is any interval with endpoints $a$, $b$;
3. $m^*(A + x) = m^*(A)$;
4. $m^*(\alpha A) = |\alpha| m^*(A)$;
5. $m^*(A) \le m^*(B)$ if $A \subseteq B$;
6. $m^*(A \cup B) \le m^*(A) + m^*(B)$;
6′. $m^*\left(\bigcup_{n=1}^{\infty} A_n\right) \le \sum_{n=1}^{\infty} m^*(A_n)$.
Proof. (1), (3), (4), (5) are easy; (6) and (6)’ are moderately tricky exercises. See Q8
Sheet 1. Let us prove (2); we will do it for I = [a, b]; then the other cases follow using
(1), (5) and (6).
Firstly, m∗ [a, b] ≤ b − a, because we may take I1 = [a, b] and In = {0} for n ≥ 2.
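Now suppose that $[a, b] \subseteq \bigcup_{n=1}^{\infty} I_n$ where $I_n$ is an interval with endpoints $a_n$, $b_n$
(which we can assume intersects $[a, b]$). Take $\varepsilon > 0$. Let
$J_n = (a_n - \varepsilon 2^{-n}, \ b_n + \varepsilon 2^{-n}) =: (c_n, d_n).$
The open intervals $J_n$ cover the compact set $[a, b]$, so by the Heine-Borel theorem there is some $N$ such that $J_1, \dots, J_N$ already cover $[a, b]$. List the distinct endpoints of $J_1, \dots, J_N$ in increasing order as $x_1 < x_2 < \dots < x_k$.
Then $x_1 < a < b < x_k$, each interval $(x_i, x_{i+1})$ is contained in some $J_n$, and $J_n$ has
endpoints $c_n = x_{k_n}$, $d_n = x_{\ell_n}$, say. Hence
$b - a < x_k - x_1 = \sum_{i=1}^{k-1} (x_{i+1} - x_i) \le \sum_{n=1}^{N} \sum_{i=k_n}^{\ell_n - 1} (x_{i+1} - x_i) = \sum_{n=1}^{N} |J_n|.$
Now $\sum_{n=1}^{\infty} |I_n| \ge \sum_{n=1}^{N} |I_n| = \sum_{n=1}^{N} \left(|J_n| - 2^{-(n-1)}\varepsilon\right) > b - a - 2\varepsilon$. This holds for
every $\varepsilon > 0$, so $\sum_{n=1}^{\infty} |I_n| \ge b - a$. Hence $m^*[a, b] \ge b - a$.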
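Proof. [Direct proof of (2)] Let $\varepsilon > 0$. There exist intervals $I_{r,n}$ such that $E_n \subseteq \bigcup_{r=1}^{\infty} I_{r,n}$
and $\sum_r |I_{r,n}| < \varepsilon 2^{-n}$. Now $\{I_{r,n} : r, n = 1, 2, \dots\}$ is a countable family of
intervals covering $\bigcup_n E_n$, and $\sum_n \sum_r |I_{r,n}| < \sum_n \varepsilon 2^{-n} = \varepsilon$. Hence $m^*\left(\bigcup_n E_n\right) = 0$.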
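The $\varepsilon 2^{-n}$ trick in this proof can be seen concretely for a countable set. The sketch below is an illustration under our own choices: it enumerates the first few rationals in $[0, 1]$, covers the $n$-th one by an open interval of length $\varepsilon 2^{-n}$, and checks that the total length stays below $\varepsilon$. The function names and the particular enumeration are hypothetical.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals in [0, 1], enumerated by increasing denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                out.append(x)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps * 2**-n."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
pts = rationals_in_unit_interval(50)
ivs = cover(pts, eps)
total = sum(b - a for a, b in ivs)
print(float(total), total < eps)
```

However many points are covered, the total length is a partial sum of $\sum_n \varepsilon 2^{-n}$ and so never reaches $\varepsilon$.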
Example 2.3. Let $C_0 = [0, 1]$, $C_1 = [0, \tfrac13] \cup [\tfrac23, 1]$, $C_2 = [0, \tfrac19] \cup [\tfrac29, \tfrac13] \cup [\tfrac23, \tfrac79] \cup [\tfrac89, 1]$,
etc. In general, $C_n$ is the union of $2^n$ disjoint closed intervals, each of length $3^{-n}$, and
$C_{n+1}$ is obtained from $C_n$ by deleting the open middle third of each of those intervals.
Let $C = \bigcap_{n=1}^{\infty} C_n$. Then $C$ is a closed subset of $\mathbb{R}$, known as the Cantor set.
Clearly, $m^*(C) \le 2^n 3^{-n}$ for each $n$. Letting $n \to \infty$ shows that $C$ is null.
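The construction of the sets $C_n$ is easy to carry out exactly. The following sketch (with our own helper names) builds the intervals of $C_n$ using rational arithmetic and confirms that their total length is $(2/3)^n$.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Delete the open middle third of each closed interval (a, b)."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

def cantor_level(n):
    """List of the 2**n closed intervals making up C_n, as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

for n in range(6):
    print(n, len(cantor_level(n)), total_length(cantor_level(n)))
```

Since $C \subseteq C_n$ for every $n$, these total lengths are upper bounds for $m^*(C)$, and they tend to 0.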
A property Q of real numbers is said to hold almost everywhere (a.e.) if the set of
real numbers for which Q does not hold is a null set. For example, χC = 0 a.e., i.e.,
χC (x) = 0 for almost all x, because C is null.
Now let us consider the question whether m∗ is countably additive.
Example 2.4. Let $A$ be a subset of $[0, 1]$ with the following properties:
(i) $x, y \in A$, $x \ne y \implies x - y \notin \mathbb{Q}$;
(ii) For any $x \in [0, 1]$, there exists $q \in \mathbb{Q}$ such that $x + q \in A$.
Then
$[0, 1] \subseteq \bigcup_{q \in \mathbb{Q}\cap[-1,1]} (A - q) \subseteq [-1, 2].$
Moreover, the sets $A - q$ are disjoint (as $q$ varies), and there are countably many of
them. If $m^*$ is countably additive, then
$1 = m^*[0, 1] \le \sum_{q \in \mathbb{Q}\cap[-1,1]} m^*(A - q) = \sum_{q \in \mathbb{Q}\cap[-1,1]} m^*(A) \le 3.$
This is impossible.
Thus m∗ is not countably additive, provided that such a set A exists. The additive
group R is partitioned into the cosets of its additive subgroup Q, and (i) and (ii) say
that A contains exactly one member of each coset of Q. The existence of such a set
follows from the Axiom of Choice, an axiom of set theory beyond the basic axioms.
This shows that it is impossible to prove that m∗ is countably additive without using
some weird axiom which contradicts the Axiom of Choice. On the other hand, it can
be proved that it is impossible to show that m∗ is not countably additive, using only
the basic axioms of set theory.
This is bad news, but it is not so very bad because the badness occurs only with sets
which cannot be explicitly described. So we can rescue things by restricting attention
to a class of sets with good behaviour.
A subset E of R is said to be (Lebesgue) measurable if
m∗ (A) = m∗ (A ∩ E) + m∗ (A \ E)
for all subsets A of R. Here, A \ E = A ∩ (R \ E)—it is not assumed that E ⊆ A.1
Let MLeb be the set of all Lebesgue measurable subsets of R.
Proposition 2.5. 1. If E is null then E ∈ MLeb .
2. If I is any interval, then I ∈ MLeb .
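3. If $E \in \mathcal{M}_{\mathrm{Leb}}$, then $\mathbb{R} \setminus E \in \mathcal{M}_{\mathrm{Leb}}$.
4. If $E_n \in \mathcal{M}_{\mathrm{Leb}}$ for $n = 1, 2, \dots$, then $\bigcup_{n=1}^{\infty} E_n \in \mathcal{M}_{\mathrm{Leb}}$.
5. If $E_n \in \mathcal{M}_{\mathrm{Leb}}$ for $n = 1, 2, \dots$ and $E_n \cap E_k = \emptyset$ whenever $n \ne k$, then $m^*\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} m^*(E_n)$.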
The proofs are exercises (Q9, Sheet 1 for 1,2,4 and 5), or can be found in books
such as Capinski & Kopp. (3) is almost trivial.
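Since $\bigcap_{n=1}^{\infty} E_n = \mathbb{R} \setminus \left(\bigcup_{n=1}^{\infty} (\mathbb{R} \setminus E_n)\right)$, $\mathcal{M}_{\mathrm{Leb}}$ is also closed under (finite or countable)
intersections. The set $A$ of Example 2.4 is not Lebesgue measurable.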
Corollary 2.6. All open subsets, and all closed subsets of R, are Lebesgue measurable.
Proof. Any open subset of R is a countable union of intervals (See the optional exercise:
sheet 1 Q8).
For E ∈ MLeb , we shall write m(E) for m∗ (E). Then m : MLeb → [0, ∞] is
countably additive.
The definition of Lebesgue measurability we have chosen to use is designed for
use in the proof that the Lebesgue measurable sets are closed under countable unions.
Also the Carathéodory condition generalises very nicely, and is an essential part of the
Carathéodory extension theorem which is a fundamental tool for producing measures (see
B8.1: Probability Measure and Martingales, or Chapter 6.1 of Stein and Shakarchi).
1The definition we use is the same definition as Capinski & Kopp and Zhongming Qian’s lecture notes (2017);
this is known as the Carathéodory criterion for measurability. Etheridge had a different definition, Stein &
Shakarchi have another, Garling has another; and Priestley has yet another. All these definitions are equivalent,
but this requires some work; we will see the equivalence of the definition above with that used by Stein and
Shakarchi after Corollary 2.7 (but relying on your work proving the Lebesgue measurable sets form a σ-algebra in
Proposition 2.5).
While the Carathéodory condition is designed for use in proofs, it’s hard to visualise,
being a condition quantified over all sets $A$. So we end with an alternative description
of Lebesgue measurable sets:
Corollary 2.7. Let E ⊂ R be a Lebesgue measurable set. Then for ε > 0, there exists
an open set U ⊇ E with m(U \ E) < ε.
of open sets and you can check that $(G_1 \times G_2) \setminus (E_1 \times E_2)$ is null in $\mathbb{R}^2$ (see Stein and
Shakarchi Proposition 3.3.6).5
Let Ω be any set, and F ⊆ P(Ω). We say that F is a σ-algebra (or σ-field) on Ω if:
(i) ∅ ∈ F,
(ii) If E ∈ F, then Ω \ E ∈ F,
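(iii) If $E_n \in \mathcal{F}$ for $n = 1, 2, \dots$, then $\bigcup_{n=1}^{\infty} E_n \in \mathcal{F}$.
Then $(\Omega, \mathcal{F})$ is a measurable space, and sets in $\mathcal{F}$ are $\mathcal{F}$-measurable. As before, $\bigcap E_n \in \mathcal{F}$
if $E_n \in \mathcal{F}$ for $n = 1, 2, \dots$.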
A measure on (Ω, F) is a function µ : F → [0, ∞] such that
(i) $\mu(\emptyset) = 0$,
(ii) $\mu\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} \mu(E_n)$ whenever $E_n$ are disjoint sets in $\mathcal{F}$.
We will be interested
Proposition 3.3. Let Ω be a set, and B ⊆ P(Ω). Then there is a unique σ-algebra FB
on Ω satisfying:
(i) FB is a σ-algebra and B ⊆ FB ,
(ii) If F is σ-algebra on Ω and B ⊆ F then FB ⊆ F.
Proof. We let $\mathcal{F}_B$ be the intersection of all σ-algebras on Ω which contain $B$ (which you
should check is a σ-algebra, so that (i) holds). By definition (ii) holds. Notice that if
$\mathcal{F}_B'$ is another such σ-algebra, then applying (ii) for $\mathcal{F}_B$ we have $\mathcal{F}_B \subseteq \mathcal{F}_B'$. But reversing
the roles, we can apply (ii) for $\mathcal{F}_B'$ giving $\mathcal{F}_B' \subseteq \mathcal{F}_B$.
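On a finite set Ω the generated σ-algebra can be computed by brute force, which may help build intuition. The sketch below is not the intersection construction used in the proof: it instead closes the generating family under complements and pairwise unions until nothing new appears, which on a finite Ω yields the same smallest σ-algebra. The function name is our own.

```python
from itertools import combinations

def generated_sigma_algebra(omega, generators):
    """Smallest family containing the generators that is closed under
    complement and union, for a FINITE set omega."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for e in current:
            c = omega - e  # complement in omega
            if c not in family:
                family.add(c)
                changed = True
        for e, f in combinations(current, 2):
            u = e | f  # pairwise union
            if u not in family:
                family.add(u)
                changed = True
    return family

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(s) for s in sigma))
```

Here the atoms are {1}, {2} and {3, 4}, so the generated σ-algebra has $2^3 = 8$ members. For infinite Ω no such finite procedure is available, which is why the proof works with intersections.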
The σ-algebra MBor generated by the intervals is the Borel σ-algebra on R. It can
be described as the class of all subsets of R which can be obtained from intervals in
a countable number of steps, each of which is one of taking the complement of a set,
taking a countable union of sets, or a countable intersection of sets. However this has
to be treated with caution, because it is not necessarily possible to obtain a given Borel
set by performing the countable number of steps in a single sequence.
Proposition 3.4. 1. Let B be any one of the following classes of subsets of R.
(i) All intervals
(ii) All intervals of the form (a, ∞)
(iii) All intervals of the form [a, b]
(iv) All open sets.
Then MBor is the smallest σ-algebra on R containing B.
2. $\mathcal{M}_{\mathrm{Bor}} \ne \mathcal{M}_{\mathrm{Leb}}$.
3. If E ∈ MLeb there exist A, B ∈ MBor such that A ⊆ E ⊆ B and B \ A is null (so
E \ A and B \ E are null).
Proof. (1) is an exercise involving showing each interval can be obtained from members
of B, and each member of B can be obtained from intervals (see Sheet 2, Q3 for (ii)).
(2) and (3) are quite deep results; (2) will be discussed in the appendix; (3) is Theorem
2.28 in Capinski & Kopp.
In this course, we shall usually take (Ω, F) to be (R, MLeb ) or minor variants, but
much of this section will apply to the general case as well. We may refer to MLeb -
measurable functions simply as measurable functions, for simplicity; or as Lebesgue
measurable functions. We shall also be interested in cases where Ω is an interval (or
a Lebesgue measurable subset) and F = MLeb |Ω = {E ∈ MLeb : E ⊆ Ω}. However,
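$f : \Omega \to \mathbb{R}$ is $\mathcal{M}_{\mathrm{Leb}}|_\Omega$-measurable if and only if $\tilde f : \mathbb{R} \to \mathbb{R}$ is measurable, where
$\tilde f(x) = f(x)$ for $x \in \Omega$, and $\tilde f(x) = 0$ otherwise. So we may state results just for
functions defined on $\mathbb{R}$.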
Recall from the Analysis courses that $f : \mathbb{R} \to \mathbb{R}$ is continuous if and only if $f^{-1}(G)$
is open for every open set (or open interval) $G$. By Proposition 3.5, we have that $f$ is
(Lebesgue) measurable if and only if $f^{-1}(G)$ is (Lebesgue) measurable for every open
set (or open interval) $G$.
It follows from the definition of measurable functions and Example 3.6(2) that
the existence of a non-measurable function is equivalent to the existence of a non-
measurable set. So their existence depends on the Axiom of Choice. Thus, we have the
following:
Fact of Life. ALL FUNCTIONS f : R → R THAT CAN BE EXPLICITLY DEFINED
ARE LEBESGUE MEASURABLE.
This is not exactly a mathematical theorem—it becomes one if one interprets “ex-
plicitly defined” in the right technical way. It is a true statement about the real world: a
non-measurable function involves some non-explicit choice process. Priestley compares
the existence of non-measurable functions to the existence of yetis.
Nevertheless, measurability is a real issue in some more advanced mathematics,
because:
(b) One may be interested in functions f which are not real-valued, but take values in
an infinite-dimensional space. Then measurability is a real issue in many areas of
analysis, although you probably won’t see this in your undergraduate course.
So it is useful to accumulate general results about measurable functions, even if we only
state them for functions f : (R, MLeb ) → R.
Proposition 3.7. Let f and g be measurable functions from R to R. The following
functions are measurable:
f + g, f g, max(f, g), h ◦ f for any continuous function h.
For example, αf is measurable, where α ∈ R.
If $G$ is open in $\mathbb{R}$, then $h^{-1}(G)$ is open. Since $f$ is measurable, $f^{-1}(h^{-1}(G))$ is measurable,
i.e., $(h \circ f)^{-1}(G)$ is measurable.
Proof. First,
$(\sup f_n)^{-1}(a, \infty] = \bigcup_n f_n^{-1}(a, \infty] \in \mathcal{M}_{\mathrm{Leb}}.$
Then
$\inf f_n = -\sup(-f_n),$
$\limsup f_n = \inf_m g_m, \quad \text{where } g_m = \sup_{n\ge m} f_n.$
Any function of the form $\sum_{j=1}^{n} \beta_j \chi_{E_j}$, where $\beta_j \in \mathbb{R}$ and $E_j \in \mathcal{M}_{\mathrm{Leb}}$, is simple. On
the other hand, if $\varphi$ is simple with non-zero values $\alpha_1, \dots, \alpha_k$, and $B_i = \varphi^{-1}(\{\alpha_i\})$,
then $B_i$ is measurable, and
(*) $\varphi = \sum_{i=1}^{k} \alpha_i \chi_{B_i}.$
If these additional properties hold, then (*) is unique (up to reordering of the terms).
We shall then say that $\varphi$ is in standard, or canonical, form. For example, the standard
form of $\chi_{(0,2)} + \chi_{[1,3]}$ is $1\chi_{(0,1)\cup[2,3]} + 2\chi_{[1,2)}$.
In defining simple functions, some authors insist that the sets Bi , corresponding
to non-zero αi , must be bounded [Etheridge] or of finite measure [Stein & Shakarchi].
[Garling and Priestley avoid introducing simple functions.]
Examples 3.9. 1. Any step function is a simple function—for a step function, the sets
Bi in the standard representation must be finite unions of bounded intervals (or single
points).
2. The function χQ∩[0,1] is a simple function but it is not a step function.
Proposition 3.10. Let $f : \mathbb{R} \to [0, \infty]$ be measurable. There is an increasing sequence
$(\varphi_n)$ of non-negative simple functions $\varphi_n$ such that
$f(x) = \lim_{n\to\infty} \varphi_n(x)$
for all $x \in \mathbb{R}$.
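Let
$\varphi_n(x) = \begin{cases} k2^{-n} & \text{if } x \in B_{k,n} \text{ for some (unique) } k, \\ 2^n & \text{if } f(x) \ge 2^n. \end{cases}$
Then $\varphi_n \le \varphi_{n+1}$, $\varphi_n \le f$, $\varphi_n(x) > f(x) - 2^{-n}$ for all sufficiently large $n$ if $f(x) < \infty$,
and $\varphi_n(x) = 2^n$ for all $n$ if $f(x) = \infty$.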
Notice here that the approximating simple functions are constructed by taking
horizontal strips, unlike Prelims where vertical strips were used.
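The dyadic approximation can be written as a function of the value $y = f(x)$ alone. The sketch below assumes the usual choice $B_{k,n} = \{x : k2^{-n} \le f(x) < (k+1)2^{-n}\}$ for $0 \le k < 4^n$, which is consistent with the properties listed above but is our assumption here; the name `phi` is also ours.

```python
import math

def phi(n, y):
    """Value of the n-th dyadic approximation at a point where f equals y >= 0.
    Assumes B_{k,n} = {x : k 2^-n <= f(x) < (k+1) 2^-n} for 0 <= k < 4^n."""
    if y >= 2 ** n:
        return float(2 ** n)  # cap at height 2^n (this also handles y = inf)
    return math.floor(y * 2 ** n) / 2 ** n  # round y down to a multiple of 2^-n

for n in range(6):
    print(n, phi(n, math.pi), phi(n, math.inf))
```

For a fixed finite $y$ the values increase with $n$ and are within $2^{-n}$ of $y$ once $2^n > y$; for $y = \infty$ they are $2^n$, which increases to $\infty$.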
Theorem 3.11. A function f : R → R is measurable if and only if there is a sequence
of step functions ψn such that f = lim ψn a.e.
We now start to define our notion of the integral. In contrast to Riemann’s theory
which simultaneously considers upper and lower approximations to the area under the
curve, in Lebesgue’s theory we approximate area that lies above the x-axis from below,
and area below the x-axis from above. This leads us to split any function into its positive
and negative parts, and integrate these separately. In this section, we’ll develop the
theory of integration for non-negative functions, and turn to the general case in the
following section.
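For a non-negative simple function $\varphi$ with standard form $\sum_{i=1}^{k} \alpha_i \chi_{B_i}$ (so $\alpha_i > 0$),
the integral of $\varphi$ is defined to be:
$\int_{\mathbb{R}} \varphi = \int_{-\infty}^{\infty} \varphi(x)\,dx = \sum_{i=1}^{k} \alpha_i m(B_i).$
Note that $\int \varphi < \infty$ if and only if $m(B_i) < \infty$ for each $i$.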
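Proposition 4.1. Let $\varphi$, $\psi$ be non-negative simple functions, $\alpha \in [0, \infty)$.
1. If $\varphi = \sum_{j=1}^{n} \beta_j \chi_{E_j}$ where $\beta_j \ge 0$ and $E_j$ are measurable (but not necessarily in
standard form), then $\int \varphi = \sum_j \beta_j m(E_j)$.
2. $\int (\varphi + \psi) = \int \varphi + \int \psi$, $\int \alpha\varphi = \alpha \int \varphi$.
3. If $\varphi \le \psi$ then $\int \varphi \le \int \psi$.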
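Part 1 can be checked by hand on the earlier example $\chi_{(0,2)} + \chi_{[1,3]}$, whose standard form is $1\chi_{(0,1)\cup[2,3]} + 2\chi_{[1,2)}$. The sketch below handles only simple functions built from intervals, represented as hypothetical triples `(beta, a, b)`; endpoints are irrelevant since single points are null.

```python
def integral_of_simple(terms):
    """Sum of beta * m(E) over terms (beta, a, b), where E is an interval
    with endpoints a <= b, so that m(E) = b - a."""
    return sum(beta * (b - a) for beta, a, b in terms)

# chi_(0,2) + chi_[1,3], as written (not in standard form)
non_standard = [(1, 0, 2), (1, 1, 3)]
# standard form: value 1 on (0,1) and on [2,3], value 2 on [1,2)
standard = [(1, 0, 1), (1, 2, 3), (2, 1, 2)]

print(integral_of_simple(non_standard), integral_of_simple(standard))
```

Both representations give the same integral, as Proposition 4.1(1) asserts.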
This is the first of our three big convergence theorems. We’ll give a slight strength-
ening of the theorem as Theorem 6.1.
Proof. Since $f_n \le f$, it is immediate that $\sup_n \int f_n \le \int f$.
For the reverse inequality, we consider a simple function $\varphi$ such that $0 \le \varphi \le f$.
We have to show that $\int \varphi \le \lim_{n\to\infty} \int f_n$. It then follows from the definition of $\int f$
that $\int f \le \lim_{n\to\infty} \int f_n$.
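Take $\alpha \in (0, 1)$, and let
$B_n = \{x : f_n(x) \ge \alpha\varphi(x)\}.$
Then $B_n$ is measurable (since $f_n - \alpha\varphi$ is measurable), $B_n \subseteq B_{n+1}$ and $\bigcup_{n=1}^{\infty} B_n = \mathbb{R}$
(for each $x$, either $\varphi(x) = 0$ or $f(x) > \alpha\varphi(x)$). Since $\alpha\varphi\chi_{B_n} \le f_n\chi_{B_n} \le f_n$,
(*) $\alpha \int_{B_n} \varphi \le \int_{\mathbb{R}} f_n.$
If $\varphi = \sum_{i=1}^{k} \beta_i \chi_{E_i}$, then
$\int_{B_n} \varphi = \sum_{i=1}^{k} \beta_i m(E_i \cap B_n) \to \sum_{i=1}^{k} \beta_i m(E_i) = \int_{\mathbb{R}} \varphi$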
Proof. Apply Theorem 4.2 with fn = f χEn , noting that χEn ≤ χEn+1 and f ≥ 0, so
fn ≤ fn+1 and χE (x) = limn→∞ χEn (x).
The baby version of the MCT will be used a lot in order to use the Fundamental
Theorem of Calculus to compute integrals on closed and bounded sets using the theory
developed in Prelims.
Corollary 4.4. For non-negative measurable functions $f$ and $g$,
$\int (f + g) = \int f + \int g.$
Proof. Let $(\varphi_n)$ and $(\psi_n)$ be increasing sequences of non-negative simple functions,
converging pointwise to $f$ and $g$ respectively (Proposition 3.10). Then $(\varphi_n + \psi_n)$ is an
increasing sequence, converging to $f + g$. By MCT and Proposition 4.1(2),
$\int (f+g) = \lim_{n\to\infty} \int (\varphi_n + \psi_n) = \lim_{n\to\infty} \left( \int \varphi_n + \int \psi_n \right) = \lim_{n\to\infty} \int \varphi_n + \lim_{n\to\infty} \int \psi_n = \int f + \int g.$
Corollary 4.5. [MCT for Series] Let $f_n$ be non-negative measurable functions and
$f = \sum_{n=1}^{\infty} f_n$. Then $\int f = \sum_{n=1}^{\infty} \int f_n$. In particular, $f$ is integrable if and only if
$\sum_n \int f_n < \infty$.
Proof. Let $g_n = \sum_{r=1}^{n} f_r$, and apply MCT.
In order to give any interesting examples, we need to show that the integrals just
defined agree with the Riemann integral initially for continuous functions on closed
bounded intervals. We will come back to this in Section 5, but record the result here
for continuous functions for use in the next few examples.7
Corollary 4.6. Let $f : [a, b] \to [0, \infty)$ be continuous. Then the Lebesgue integral $\int^{L}_{[a,b]} f$
as defined above equals the Riemann integral $\int^{R}_{[a,b]} f$ as defined in first-year Integration.
Example 4.7. Consider $f(x) = (1 - x)^{-1/2}$ on $(0, 1)$. By Baby MCT (Corollary 4.3),
Corollary 4.6 and FTC (from Prelims),
$\int_0^1 (1 - x)^{-1/2}\,dx = \lim_{n\to\infty} \int_0^{1 - \frac1n} (1 - x)^{-1/2}\,dx = \lim_{n\to\infty} 2(1 - n^{-1/2}) = 2.$
For $0 \le x < 1$, the Binomial Theorem with exponent $-1/2$ or Taylor’s Theorem in
complex analysis gives
$(1 - x)^{-1/2} = \sum_{n=0}^{\infty} \frac{(2n)!}{4^n (n!)^2} x^n.$
By Corollary 4.5 and FTC,
$\int_0^1 (1 - x)^{-1/2}\,dx = \sum_{n=0}^{\infty} \frac{(2n)!}{4^n (n!)^2} \int_0^1 x^n\,dx = \sum_{n=0}^{\infty} \frac{(2n)!}{4^n \, n!\,(n + 1)!}.$
7This is easier than the result in Section 5, as continuous functions are automatically measurable. One
can prove this by choosing a sequence of partitions given by repeatedly bisecting $[a, b]$, and taking the step
functions associated to the lower Riemann sums for these partitions. We obtain an increasing sequence $(\varphi_n)$ of
step functions such that $\lim_{n\to\infty} \varphi_n(x) = f(x)$ for all $x \in [a, b]$ (continuity ensures that one has convergence
everywhere) and $\lim_{n\to\infty} \int_a^b \varphi_n = \int^{R}_{[a,b]} f$. By the MCT (Theorem 4.2), $\lim_{n\to\infty} \int_a^b \varphi_n = \int^{L}_{[a,b]} f$. As you’ll see
in Section 5, for a general Riemann integrable $f$, we can arrange for the $\varphi_n$ to converge to $f$ almost everywhere,
and then the same result holds.
The fact that the series above converges to 2 can be obtained directly from the Binomial
Expansion of $(1 - x)^{1/2}$, via Abel’s continuity theorem (A2 lecture notes MT 2019,
Theorem 13.24).
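A quick numerical sanity check of the series in Example 4.7 is possible, although convergence is slow (the gap to 2 shrinks roughly like $N^{-1/2}$). The sketch below avoids large factorials by using the ratio of consecutive terms, $t_{n+1}/t_n = (2n+1)/(2(n+2))$, which follows from the formula for $t_n = (2n)!/(4^n\,n!\,(n+1)!)$; the function name is ours.

```python
def partial_sum(N):
    """Sum of (2n)! / (4**n * n! * (n+1)!) for n = 0, ..., N, via the term ratio."""
    term, total = 1.0, 0.0  # the n = 0 term equals 1
    for n in range(N + 1):
        total += term
        term *= (2 * n + 1) / (2 * (n + 2))  # t_{n+1} = t_n * (2n+1) / (2(n+2))
    return total

for N in (10, 100, 1000, 10000):
    print(N, partial_sum(N))
```

The partial sums increase towards 2, in agreement with the value of the integral found by the Baby MCT.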
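Example 4.8. Consider $\int_0^{n\pi} \cos\left(\frac{x}{2n}\right) x^2 e^{-x^3}\,dx$. It is not obvious how to evaluate
the integral for a given value of $n$, but we can use the MCT to find the limit of the
integrals, as $n \to \infty$, as follows.
Let
$f_n(x) = \cos\left(\frac{x}{2n}\right) x^2 e^{-x^3} \chi_{[0,n\pi]}(x) = \begin{cases} \cos\left(\frac{x}{2n}\right) x^2 e^{-x^3} & \text{if } 0 \le x \le n\pi, \\ 0 & \text{otherwise.} \end{cases}$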
Fix $n$ for a moment. We wish to show that $f_n(x) \le f_{n+1}(x)$ for all $x$. If $0 \le x \le n\pi$,
then $0 \le \cos\left(\frac{x}{2n}\right) \le \cos\left(\frac{x}{2(n+1)}\right)$, so $f_n(x) \le f_{n+1}(x)$. If $n\pi < x \le (n + 1)\pi$, then
$f_n(x) = 0 \le f_{n+1}(x)$. If $x > (n + 1)\pi$ (or if $x < 0$), then $f_n(x) = 0 = f_{n+1}(x)$. Thus
we have established our claim that $f_n(x) \le f_{n+1}(x)$ for all $x$.
Noting that $f_n(x) \to f(x) = x^2 e^{-x^3}$ for all $x \ge 0$, the MCT gives
$\lim_{n\to\infty} \int_0^{n\pi} \cos\left(\frac{x}{2n}\right) x^2 e^{-x^3}\,dx = \lim_{n\to\infty} \int_0^{\infty} f_n(x)\,dx = \int_0^{\infty} x^2 e^{-x^3}\,dx.$
This can be computed using the Baby MCT at the first step below, and the FTC at
the second:
$\int_0^{\infty} x^2 e^{-x^3}\,dx = \lim_{n\to\infty} \int_0^{n} x^2 e^{-x^3}\,dx = \lim_{n\to\infty} \frac{1 - e^{-n^3}}{3} = \frac13.$
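The limit in Example 4.8 can be checked numerically. The sketch below uses a composite Simpson rule, which is our choice of quadrature and not part of the notes. As a labelled assumption, the range of integration is cut off at $x = 6$, since $\int_6^\infty x^2 e^{-x^3}\,dx = e^{-216}/3$ is far below floating-point resolution.

```python
import math

def simpson(g, a, b, m=2000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    s = g(a) + g(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

def I(n):
    """Approximate the integral of cos(x/2n) x^2 exp(-x^3) over [0, n*pi]."""
    upper = min(n * math.pi, 6.0)  # assumption: the tail beyond x = 6 is negligible
    return simpson(lambda x: math.cos(x / (2 * n)) * x**2 * math.exp(-x**3), 0.0, upper)

for n in (1, 2, 5, 50, 200):
    print(n, I(n))
```

The computed values increase with $n$, as the monotonicity argument predicts, and approach $1/3$.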
Notice that this requirement prevents any problems with $\infty - \infty$.8 Then the integral
of $f$ is
$\int f = \int f^+ - \int f^-.$
Moreover, $f$ is integrable over a measurable subset $E$ if $f\chi_E$ is integrable. If $f : E \to
[-\infty, \infty]$, then $f$ is integrable over $E$ if $\tilde f$ is integrable over $\mathbb{R}$. We write $f \in L^1(E)$ to
mean that $f$ is integrable over $E$.
8It is possible to make sense of the quantity $\int f$ if one only has that one of $\int f^+$ or $\int f^-$ is finite, but this
notion would not be well behaved; for example we definitely want the sum of two integrable functions to be
integrable.
Apart from Corollary 4.6, almost all the theory in Section 4 up to this point applies
to general measure spaces. Now we make some comments which are specific to the case
of Lebesgue measure.
Firstly, as promised in Section 4, the Lebesgue integral is more general than the
Riemann (Prelims) integral. In fact, f : [a, b] → R is Riemann integrable if and
only if f is bounded and continuous a.e.9 Any such f is measurable and bounded,
9This is a very nice exercise, but off topic for us, so omitted. See Stein and Shakarchi Problem 1.6.4. The
essential idea, which is useful for many Prelims exercises relating to continuity, is to consider the oscillation of a
function $f$, $\omega_f(x) = \lim_{\delta\to0} \left(\sup_{y\in(x-\delta,x+\delta)} f(y) - \inf_{y\in(x-\delta,x+\delta)} f(y)\right)$. You can check that $f$ is continuous at
$x$ if and only if $\omega_f(x) = 0$. So if $f$ is continuous a.e. then for any $\varepsilon > 0$ the set $A_\varepsilon = \{x \in [a, b] : \omega_f(x) \ge \varepsilon\}$
is null, so (using compactness, and you’ll need to check it is compact) can be covered by finitely many open
intervals of total length $\varepsilon$. This should help you access analysis 3, sheet 2, Q4 to get Riemann integrability.
Conversely if $f$ is Riemann integrable, $n \in \mathbb{N}$ and $\varepsilon > 0$, take a partition $P$ such that $U(f; P) - L(f; P) < \varepsilon/n$
and consider the total length of the intervals in $P$ whose interior intersects $A_{1/n}$.
By Baby MCT, $x^\alpha$ is integrable over $(1, \infty)$ if and only if $\alpha < -1$, and then
$\int_1^{\infty} x^\alpha\,dx = -(\alpha + 1)^{-1}$.
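The Baby MCT argument here reduces to looking at the truncated integrals $\int_1^n x^\alpha\,dx$, which the FTC gives in closed form. A small sketch (with our own function name) shows the dichotomy at $\alpha = -1$:

```python
import math

def truncated_integral(alpha, n):
    """Integral of x**alpha over [1, n], from the FTC."""
    if alpha == -1:
        return math.log(n)
    return (n ** (alpha + 1) - 1) / (alpha + 1)

for alpha in (-2.0, -1.5, -1.0, -0.5):
    print(alpha, [truncated_integral(alpha, n) for n in (1e2, 1e4, 1e6)])
```

For $\alpha < -1$ the truncated integrals increase to $-1/(\alpha + 1)$; for $\alpha \ge -1$ they increase without bound, so $x^\alpha$ is not integrable over $(1, \infty)$.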
3. Consider $f(x) = x^\alpha/(1 + x^\beta)$ over $(0, \infty)$, where $\alpha \in \mathbb{R}$ and $\beta \ge 0$. For $0 < x \le 1$,
$x^\alpha/2 \le f(x) \le x^\alpha$. By comparison, $f$ is integrable over $(0, 1)$ if and only if $x^\alpha$ is,
i.e., $\alpha > -1$. For $x > 1$, $x^{\alpha-\beta}/2 < f(x) < x^{\alpha-\beta}$, so, by comparison, $f$ is integrable
over $(1, \infty)$ if and only if $x^{\alpha-\beta}$ is, i.e., $\alpha - \beta < -1$. Hence $f$ is integrable over $(0, \infty)$
if and only if $-1 < \alpha < \beta - 1$. [The case when $\beta < 0$ can be reduced to the previous
case because $f(x) = x^{\alpha-\beta}/(1 + x^{-\beta})$.]
4. Consider f (x) = (sin x)/x over (0, 2π). This function is continuous on (0, 2π], hence
measurable. If we define f (0) = 1, it becomes continuous, hence bounded on [0, 2π]—
in fact it is bounded above by 1 and below by −1/π. So it is integrable over (0, 2π).
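5. Consider $f(x) = (\sin x)/x$ over $(0, \infty)$. Now
$\int_{r\pi}^{(r+1)\pi} \left|\frac{\sin x}{x}\right| dx \ge \int_{r\pi}^{(r+1)\pi} \frac{|\sin x|}{(r + 1)\pi}\,dx = \frac{2}{(r + 1)\pi}.$
Hence,
$\lim_{n\to\infty} \int_0^{n\pi} |f(x)|\,dx \ge \lim_{n\to\infty} \sum_{r=0}^{n-1} \frac{2}{(r + 1)\pi} = \infty.$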
So |f | is not integrable, and hence f is not integrable, over (0, ∞).
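The lower bound used in this example is a multiple of the harmonic series, so it grows like $(2/\pi)\log n$. The sketch below compares it with a Simpson-rule approximation of $\int_0^{n\pi} |\sin x / x|\,dx$, computed one half-period at a time so that the integrand is smooth on each piece. The quadrature and the function names are our own choices.

```python
import math

def lower_bound(n):
    """Sum over r = 0, ..., n-1 of 2 / ((r+1) * pi)."""
    return sum(2 / ((r + 1) * math.pi) for r in range(n))

def abs_integral(n, m=200):
    """Simpson approximation of the integral of |sin x / x| over [0, n*pi]."""
    def g(x):
        return 1.0 if x == 0 else abs(math.sin(x) / x)  # value 1 at x = 0 by continuity
    total = 0.0
    for r in range(n):
        a, b = r * math.pi, (r + 1) * math.pi
        h = (b - a) / m
        s = g(a) + g(b)
        for i in range(1, m):
            s += (4 if i % 2 else 2) * g(a + i * h)
        total += s * h / 3
    return total

for n in (10, 50, 200):
    print(n, lower_bound(n), abs_integral(n))
```

Both columns keep growing as $n$ increases, which is the numerical shadow of the divergence proved above.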
The FTC should be treated with care, if the range of integration is unbounded (as
already discussed), or if the derivative does not exist at some points as the following
examples show.
Examples 5.5. 1. Let $f(x) = x \sin\frac1x$ ($x \in (0, 1]$); $f(0) = 0$. Then $f$ is continuous
on $[0, 1]$ and differentiable on $(0, 1]$ but $f'(x) = \sin\frac1x - \frac1x \cos\frac1x \notin L^1(0, 1)$.
Integration by parts must be treated with great care if the interval of integration is
an unbounded interval or the integrand has a singularity and you do not know whether
the integrals exist. In those circumstances you cannot infer the existence of one integral
from the existence of the other.
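Example 5.7. Consider $\int_0^a \frac{\sin x}{x}\,dx$. Integration by parts gives
$\int_1^a \frac{\sin x}{x}\,dx = \cos 1 - \frac{\cos a}{a} - \int_1^a \frac{\cos x}{x^2}\,dx.$
But $\left|\frac{\cos x}{x^2}\right| \le \frac{1}{x^2}$, so $\frac{\cos x}{x^2}$ is integrable over $[1, \infty)$, by Example 5.3(2) and the Comparison
Test. It follows from Proposition 5.1(8) that
$\lim_{a\to\infty} \int_0^a \frac{\sin x}{x}\,dx = \int_0^1 \frac{\sin x}{x}\,dx + \cos 1 - \int_1^{\infty} \frac{\cos x}{x^2}\,dx.$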
Nevertheless, sin x/x is not integrable over (0, ∞), by Example 5.3(5).
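The integration by parts identity in Example 5.7 can be checked numerically for finite $a$. The sketch below uses a composite Simpson rule, which is our choice and not part of the notes, and compares the two sides for a few values of $a$.

```python
import math

def simpson(g, a, b, m=10000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    s = g(a) + g(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

def lhs(a):
    """Integral of sin(x)/x over [1, a]."""
    return simpson(lambda x: math.sin(x) / x, 1.0, a)

def rhs(a):
    """cos(1) - cos(a)/a - integral of cos(x)/x^2 over [1, a]."""
    return math.cos(1.0) - math.cos(a) / a - simpson(lambda x: math.cos(x) / x**2, 1.0, a)

for a in (5.0, 20.0, 50.0):
    print(a, lhs(a), rhs(a))
```

The two sides agree to within quadrature error. The point of the example is that the right-hand side has a limit as $a \to \infty$ even though $\sin x / x$ is not integrable over $(0, \infty)$.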
In the case of substitution, one can infer the existence of one integral from the
other. [Note: Priestley’s comment near the bottom of p.133 is misleading.]
Theorem 5.8. (Substitution) Let $g : I \to \mathbb{R}$ be a monotonic function with a continuous
derivative on an interval $I$, and let $J$ be the interval $g(I)$. A (measurable) function
$f : J \to \mathbb{R}$ is integrable over $J$ if and only if $(f \circ g) \cdot g'$ is integrable over $I$. Then
$\int_J f(x)\,dx = \int_I f(g(y)) |g'(y)|\,dy.$
This theorem is not contained in the one in the first-year course, because $f$ is not
required to be continuous or Riemann integrable. FTC gives the result when $f = \chi_{J'}$ for
a bounded interval $J' \subseteq J$. One has to extend this to $f = \chi_E$ when $E \in \mathcal{M}_{\mathrm{Leb}}$, $E \subseteq J$,
i.e., one needs $m(E) = \int_{g^{-1}(E)} |g'|$. After that, the rest follows fairly easily. See Theorem
7.4 in Qian’s notes.
Example 5.9. Let $I = (0, 1)$, $g(y) = 1/y$, so $J = (1, \infty)$. Let $f(x) = x^\alpha$. Then
$x^\alpha \in L^1(1, \infty)$ if and only if $y^{-\alpha-2} \in L^1(0, 1)$. This provides a passage between
Example 5.3, (1) and (2).
Other measures. We make some comments about integration with respect to mea-
sures other than Lebesgue.
expectation E(X); X is integrable if and only if |X| has finite expectation. The the-
ory that follows applies to all random variables simultaneously—discrete, continuous,
hybrid, singular.
The feature of Lebesgue integration theory which distinguishes it from other theo-
ries, and makes it much more manageable, is the group of theorems known as conver-
gence theorems. These are the theorems, mentioned in the introduction, which enable
one to pass limits or infinite sums through integrals, under certain conditions.
We have already seen the MCT, but we give a different form below to allow for
increasing sequences of functions which are not necessarily non-negative. Notice that
in this case, we must work with integrable functions so that we can add $\int f_1$ to both
sides at the end of the argument.10
Theorem 6.1. Let $(f_n)$ be a sequence of integrable functions such that:
(1) for each n, $f_n \le f_{n+1}$ a.e.,
(2) $\sup_n \int f_n < \infty$.
Then $(f_n)$ converges a.e. to an integrable function f, and $\int f = \lim_{n\to\infty} \int f_n$.
Proof. By Proposition 5.1(6), $f_n(x) \in \mathbb{R}$ a.e. From this and assumption (1) we may
redefine $f_n$ on the union of countably many null sets without changing any integrals, so
we may assume that $f_n(x) \le f_{n+1}(x)$ and $f_n(x) \in \mathbb{R}$ for all x and all n. Apply Theorem
4.2 to $f_n - f_1$. One obtains that $\int (f - f_1) = \lim_{n\to\infty} \int f_n - \int f_1$, which is finite by (2). Thus $f - f_1$
is integrable, so f is integrable, which implies that f is finite a.e. Adding $\int f_1$ to both
sides we obtain that $\int f = \lim_{n\to\infty} \int f_n$.
Theorem 6.2. [Fatou’s Lemma] Let $(f_n)$ be a sequence of non-negative measurable
functions. Then
\[ \int \liminf_{n\to\infty} f_n \le \liminf_{n\to\infty} \int f_n. \]
Proof. Let $g_r := \inf_{n \ge r} f_n$. Then $(g_r)$ increases to $\liminf_{n\to\infty} f_n$, and $g_r \le f_r$, so $\int g_r \le \int f_r$.
By MCT, $\int \liminf_{n\to\infty} f_n = \lim_{r\to\infty} \int g_r = \liminf_{r\to\infty} \int g_r \le \liminf_{r\to\infty} \int f_r$.
Note that in Example 0.1 with $f_n(x) = n^2 x^n (1 - x)$ on (0, 1), $f_n \ge 0$, $\lim_{n\to\infty} f_n = 0$,
$\lim_{n\to\infty} \int f_n = 1$. So one can have $\int \limsup_{n\to\infty} f_n < \liminf_{n\to\infty} \int f_n$. However if
$f_n \le g$ for all n where g is integrable, then $\int \limsup_{n\to\infty} f_n \ge \limsup_{n\to\infty} \int f_n$ (apply
Fatou to $g - f_n$).
One can also have $\int \limsup_{n\to\infty} f_n > \limsup_{n\to\infty} \int f_n$; for example, $f_n(x) = \sin^2(x + n)$ on (0, π).
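The strict inequality for $f_n(x) = n^2 x^n (1 - x)$ can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names are illustrative, and the closed form $n^2/((n+1)(n+2))$ for the integral is computed here by elementary integration and cross-checked with a midpoint rule.

```python
def f(n: int, x: float) -> float:
    """f_n(x) = n^2 x^n (1 - x) on (0, 1)."""
    return n * n * x**n * (1.0 - x)


def exact_integral(n: int) -> float:
    """Integral of f_n over (0, 1): n^2 (1/(n+1) - 1/(n+2)) = n^2 / ((n+1)(n+2))."""
    return n * n / ((n + 1) * (n + 2))


def midpoint(func, a: float, b: float, steps: int) -> float:
    """Composite midpoint rule, used only as an independent numerical check."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    for n in (5, 50, 500, 5000):
        # The pointwise value at x = 0.9 shrinks while the integrals approach 1.
        print(n, f(n, 0.9), exact_integral(n))
```

The pointwise limit is 0 at every x in (0, 1), yet the integrals tend to 1, so no integrable dominating function can exist for this sequence.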
[10] Whereas the MCT, which has non-negative $f_n$, doesn’t require the $f_n$ to have finite integrals, but then it
does not conclude that f is integrable.
24 INTEGRATION, H.T. 2024
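Proof. Since f is measurable (Proposition 3.8) and $|f(x)| \le g(x)$ a.e., f is integrable
by comparison. Apply Fatou’s Lemma to $g + f_n$ and $g - f_n$, to obtain
$\int (g + f) \le \int g + \liminf_{n\to\infty} \int f_n$ and $\int (g - f) \le \int g - \limsup_{n\to\infty} \int f_n$.
Subtracting $\int g$ gives $\limsup_{n\to\infty} \int f_n \le \int f \le \liminf_{n\to\infty} \int f_n$, so $\int f_n \to \int f$.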
Example 6.4. Consider $\int_0^1 \frac{n^{3/2} x e^x}{1 + n^2 x^2}\,dx$. It is difficult (impossible?) to evaluate the
integrals themselves, but we can find the limit of the integrals, with the help of the
DCT (Theorem 6.3). Let
\[ f_n(x) = \frac{n^{3/2} x e^x}{1 + n^2 x^2} = \frac{(nx)^{3/2} e^x}{(1 + n^2 x^2)\, x^{1/2}}. \]
The function $\frac{y^{3/2}}{1 + y^2}$ tends to 0 as y → ∞, so it is bounded for y > 0. It follows that
$f_n(x) \to 0$ as n → ∞, and there is a constant c such that
\[ 0 \le f_n(x) \le \frac{c e^x}{x^{1/2}} \le \frac{c e}{x^{1/2}} \qquad (0 < x < 1). \]
Now let $g(x) = \frac{ce}{x^{1/2}}$. Then g is integrable over (0, 1) (Example 5.3(1)), so we have
verified the conditions of the DCT (with f = 0). We can therefore conclude that
\[ \lim_{n\to\infty} \int_0^1 \frac{n^{3/2} x e^x}{1 + n^2 x^2}\,dx = 0. \]
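The conclusion of Example 6.4 can be illustrated numerically. This is a rough sketch under stated assumptions, not part of the notes: the midpoint rule and step count are arbitrary choices, and the value $c = 3^{3/4}/4$ (the maximum of $y^{3/2}/(1+y^2)$, attained at $y = \sqrt{3}$) is computed here rather than taken from the text.

```python
import math


def f(n: float, x: float) -> float:
    """f_n(x) = n^{3/2} x e^x / (1 + n^2 x^2)."""
    return n**1.5 * x * math.exp(x) / (1.0 + (n * x) ** 2)


def midpoint(func, a: float, b: float, steps: int) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def integral(n: float, steps: int = 100_000) -> float:
    """Approximate the integral of f_n over (0, 1)."""
    return midpoint(lambda x: f(n, x), 0.0, 1.0, steps)


# Assumed constant: sup of y^{3/2} / (1 + y^2) over y > 0, attained at y = sqrt(3).
C = 3**0.75 / 4.0


def dominating(x: float) -> float:
    """g(x) = c e / x^{1/2}, the dominating function from the example."""
    return C * math.e / math.sqrt(x)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, integral(n))
```

The printed integrals decrease slowly (roughly like $n^{-1/2} \log n$), which is consistent with the limit 0 but shows that the convergence is far from fast.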
The next example involves, for the first time in this course, integration of a complex-
valued function. A function f : R → C is integrable if Re f and Im f are both inte-
grable. Results which hold for real-valued integrable functions and which make sense
for complex-valued functions are almost invariably true in the complex case, and can
easily be deduced by applying the result to the real and imaginary parts separately.
This is the case, for example, with the Comparison Test, FTC, Integration by Parts
and the DCT. Note, however, that in Theorem 5.8 (Substitution), the function f may
be complex-valued, but the substitution g(t) is assumed to be real-valued.
Example 6.6. Let $\gamma_r$ be the semi-circular contour $\{re^{i\theta} : 0 \le \theta \le \pi\}$, and consider
\[ \int_{\gamma_r} \frac{e^{iz}}{z}\,dz = i \int_0^{\pi} e^{ir\cos\theta} e^{-r\sin\theta}\,d\theta. \]
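Since
\[ \left| e^{ir\cos\theta} e^{-r\sin\theta} \right| \le 1 \quad \text{for all } r > 0,\ 0 \le \theta \le \pi, \]
\[ e^{ir\cos\theta} e^{-r\sin\theta} \to \begin{cases} 0 & \text{as } r \to \infty, \text{ if } 0 < \theta < \pi, \\ 1 & \text{as } r \to 0, \end{cases} \]
the Bounded Convergence Theorem gives
\[ \int_{\gamma_{R_n}} \frac{e^{iz}}{z}\,dz \to 0 \quad (R_n \to \infty), \qquad \int_{\gamma_{\varepsilon_n}} \frac{e^{iz}}{z}\,dz \to \pi i \quad (\varepsilon_n \to 0). \]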
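By Cauchy’s Theorem,
\[ 0 = \int_{\gamma_{R_n}} \frac{e^{iz}}{z}\,dz - \int_{\gamma_{\varepsilon_n}} \frac{e^{iz}}{z}\,dz + \int_{\varepsilon_n}^{R_n} \frac{e^{ix} - e^{-ix}}{x}\,dx. \]
Letting n → ∞, we obtain
\[ \lim_{n\to\infty} \int_{\varepsilon_n}^{R_n} \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]
Hence $\lim_{a\to\infty} \int_0^a \frac{\sin x}{x}\,dx = \pi/2$ (see Example 5.7, and Part A Complex Analysis,
Example 11.9 in MT2020 notes).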
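As a sanity check on the value $\pi/2$, one can approximate $\int_0^a \frac{\sin x}{x}\,dx$ for moderately large a. The sketch below is illustrative only: the function names and the midpoint rule are assumptions of this example. The error bound $2/a$ used in the comment follows from one integration by parts of the tail $\int_a^\infty \frac{\sin x}{x}\,dx$.

```python
import math


def sinc(x: float) -> float:
    """sin(x)/x, extended by continuity to x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def partial_integral(a: float, steps_per_unit: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of sin(x)/x over [0, a]."""
    steps = max(1, int(a * steps_per_unit))
    h = a / steps
    return h * sum(sinc((i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    for a in (50.0, 100.0, 200.0):
        # The tail integral is bounded by 2/a, so the gap to pi/2 shrinks like 1/a.
        print(a, partial_integral(a), math.pi / 2)
```

The slow $1/a$ decay of the gap reflects the fact that $\frac{\sin x}{x}$ is not Lebesgue integrable over (0, ∞); only the limit of the partial integrals exists.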
Theorem 6.8. [Lebesgue’s Series Theorem; Beppo Levi Theorem, ....] Let
$(g_n)$ be a sequence of integrable functions such that $\sum_n \int |g_n| < \infty$. Then $\sum_{n=1}^{\infty} g_n$
converges a.e. to an integrable function, and $\int \sum_{n=1}^{\infty} g_n = \sum_{n=1}^{\infty} \int g_n$.
Proof. Apply MCT for Series to $g_n^+$ and $g_n^-$. Alternatively, apply MCT for Series to
$|g_n|$ and use the fact that absolute convergence implies convergence.
Theorem 6.9. Let $(g_n)$ be a sequence of integrable functions such that $\sum_n |g_n|$ is
integrable. Then $\sum_{n=1}^{\infty} g_n$ converges a.e. to an integrable function, and
$\int \sum_{n=1}^{\infty} g_n = \sum_{n=1}^{\infty} \int g_n$.
Proof. Clearly $\sum_{n=1}^{k} \int |g_n| \le \int \sum_{n=1}^{\infty} |g_n|$ for all k, so $\sum_{n=1}^{\infty} \int |g_n| \le \int \sum_{n=1}^{\infty} |g_n|$.
Apply Theorem 6.8.
Example 6.10. Let α > 0, and consider $\int_0^1 x^{\alpha-1} e^{-x}\,dx$. Let $g_n(x) = (-1)^n x^{\alpha+n-1}/n!$,
so that $\sum_{n=0}^{\infty} g_n(x) = x^{\alpha-1} e^{-x}$. Now
\[ \int_0^1 |g_n(x)|\,dx = \frac{1}{(\alpha + n)\,n!}, \]
so $\sum_n \int_0^1 |g_n(x)|\,dx < \infty$. Thus Lebesgue’s Series Theorem tells us that our integral
exists (we could have established this directly, by comparing the integrand with $x^{\alpha-1}$),
and that
\[ \int_0^1 x^{\alpha-1} e^{-x}\,dx = \sum_{n=0}^{\infty} \int_0^1 (-1)^n x^{\alpha+n-1}/n!\,dx = \sum_{n=0}^{\infty} \frac{(-1)^n}{(\alpha + n)\,n!}. \]
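The series formula of Example 6.10 is easy to test against cases where the integral has a closed form. The following sketch is an illustration only: the function names, the truncation at 40 terms and the midpoint rule are choices made here. For α = 1 and α = 2 the integral equals $1 - e^{-1}$ and $1 - 2e^{-1}$ respectively, by direct integration.

```python
import math


def series(alpha: float, terms: int = 40) -> float:
    """Partial sum of sum_{n>=0} (-1)^n / ((alpha + n) n!)."""
    return sum((-1) ** n / ((alpha + n) * math.factorial(n)) for n in range(terms))


def midpoint(func, a: float, b: float, steps: int) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def integral(alpha: float, steps: int = 20_000) -> float:
    """Numerical value of the integral of x^(alpha-1) e^(-x) over (0, 1)."""
    return midpoint(lambda x: x ** (alpha - 1.0) * math.exp(-x), 0.0, 1.0, steps)


if __name__ == "__main__":
    for alpha in (1.0, 2.0, 2.5):
        print(alpha, series(alpha), integral(alpha))
```

The numerical integral is only used for α ≥ 1 here, since for 0 < α < 1 the integrand is unbounded near 0 and a plain midpoint rule converges slowly.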
Example 6.11. Let s ∈ R, and consider $\int_{-\infty}^{\infty} e^{-isx} e^{-x^2}\,dx$. The integrand is continuous,
$|e^{-isx} e^{-x^2}| = e^{-x^2} \le e\,e^{-|x|} \in L^1$ (exercise). If $g_n(x) = \frac{(-isx)^n}{n!} e^{-x^2}$, then
$\sum_{n=0}^{\infty} g_n(x) = e^{-isx} e^{-x^2}$, and
\[ \sum_{n=0}^{\infty} |g_n(x)| = e^{|sx| - x^2} \le e^{s^2/2} e^{-x^2/2} \in L^1. \]
All theorems in this Section hold in general measure spaces. Corollary 6.5 holds in
finite measure spaces.
Remark. In condition (3) of Theorem 7.2, the function g does not depend on y.
Proof. Let (yn ) be any sequence in J converging to y ∈ J. Let fn (x) = f (x, yn ). Then
|fn (x)| ≤ g(x) a.e., for all n, and limn→∞ fn (x) = f (x, y) a.e., so the conditions of the
DCT are satisfied. The DCT implies that:
\[ F(y_n) = \int_I f(x, y_n)\,dx \to \int_I f(x, y)\,dx = F(y). \]
Thus F is continuous.
Example 7.3. The Gamma function Γ is defined by:
\[ \Gamma(y) = \int_0^{\infty} e^{-x} x^{y-1}\,dx \qquad (y > 0). \]
We wish to show that Γ is continuous, firstly for y ∈ [1, 2]. In order to apply Theorem
7.2, we take I = (0, ∞), J = [1, 2], and $f(x, y) = e^{-x} x^{y-1}$. Condition (1) of Theorem
7.2 is an exercise, and (2) is more or less trivial. For condition (3), we need to ensure
that
\[ g(x) \ge \sup_{1 \le y \le 2} f(x, y) = \begin{cases} e^{-x} & (0 < x \le 1), \\ x e^{-x} & (x > 1). \end{cases} \tag{7.1} \]
We choose to take g equal to the right-hand side of (7.1). Then g is integrable over
(0, ∞) (exercise), so condition (3) of Theorem 7.2 is satisfied. Thus, Theorem 7.2 shows
that Γ is continuous on [1, 2].
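The domination in (7.1) and the integrability of g can be spot-checked numerically. This is a minimal sketch with illustrative names; the value $1 + e^{-1}$ for $\int_0^\infty g$ is computed here by elementary integration and is not stated in the notes.

```python
import math


def f(x: float, y: float) -> float:
    """Integrand of the Gamma function: e^{-x} x^{y-1}."""
    return math.exp(-x) * x ** (y - 1.0)


def g(x: float) -> float:
    """Right-hand side of (7.1): e^{-x} on (0, 1], x e^{-x} for x > 1."""
    return math.exp(-x) if x <= 1.0 else x * math.exp(-x)


def midpoint(func, a: float, b: float, steps: int) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    # Integral of g over (0, 50); the tail beyond 50 is negligible.
    print(midpoint(g, 0.0, 50.0, 100_000), 1.0 + math.exp(-1.0))
```

A grid check of this kind cannot replace the exercise, but it quickly catches a wrong guess for the dominating function.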
In fact, Γ is continuous on (0, ∞). However, it is impossible to establish this by
applying Theorem 7.2 with J = (0, ∞), for in condition (3), it would be necessary that
\[ g(x) \ge \sup_{y > 0} f(x, y) = \begin{cases} x^{-1} e^{-x} & (0 < x \le 1), \\ \infty & (x > 1). \end{cases} \]
Remark. The method of Theorem 7.2 can also be used to cover cases where y → y0
for a single point y0 or y → ∞. For example, suppose that there exists a in R and a
function h : I → R such that
Note that in condition (3) (and (2)) above we require a single null set N such that
$\left| \frac{\partial f}{\partial y}(x, y) \right| \le g(x)$ holds for all y ∈ J and x ∈ I \ N. This is not a priori the same as
requiring that for all y ∈ J, $\left| \frac{\partial f}{\partial y}(x, y) \right| \le g(x)$ holds for almost all x ∈ I.[11] In practice,
when you want to apply Theorem 7.5 it’s quite likely that (3) will hold for all x and y
(or perhaps all but finitely many values of x).
Proof. Fix y in J, and let $(y_n)$ be any sequence in J converging to y (with $y_n \ne y$). Let
\[ g_n(x) = \frac{f(x, y_n) - f(x, y)}{y_n - y}. \]
Then $g_n$ is integrable over I, and $g_n(x) \to \frac{\partial f}{\partial y}(x, y)$ for almost all x as n → ∞. Moreover,
the Mean Value Theorem says that there exists a point $\xi_{x,n}$ (depending on x and n)
between $y_n$ and y such that $g_n(x) = \frac{\partial f}{\partial y}(x, \xi_{x,n})$. It follows from (3) that $|g_n(x)| \le g(x)$
a.e.(x).[12] This shows that the Dominated Convergence Theorem is applicable, so
\[ \frac{F(y_n) - F(y)}{y_n - y} = \int_I g_n(x)\,dx \to \int_I \frac{\partial f}{\partial y}(x, y)\,dx \quad \text{as } n \to \infty. \]
Since $(y_n)$ is an arbitrary sequence tending to y, and the right-hand side is independent
of the choice of sequence, it follows that
\[ \frac{F(y') - F(y)}{y' - y} \to \int_I \frac{\partial f}{\partial y}(x, y)\,dx \quad \text{as } y' \to y, \]
which completes the proof.
Example 7.6. Let $f(x, s) = e^{-isx} e^{-x^2}$, and $F(s) = \int_{-\infty}^{\infty} f(x, s)\,dx$ (compare Example
[11] as that would allow the null set $N_y$ of those x for which this fails to depend on y.
[12] This is where it matters that we have a single null set N for which $\left| \frac{\partial f}{\partial y}(x, y) \right| \le g(x)$ holds for all x ∈ I \ N
and y ∈ J. If we had that for each y ∈ J, there was a null set $N_y$ depending on y such that the estimate holds
for y ∈ J and x ∈ I \ $N_y$, then as $\xi_{x,n}$ depends on x, we would have no way of deducing that $|g_n(x)| \le g(x)$
a.e. I thank Oliver Riordan for bringing this issue to my attention.
as n → ∞, $|x| e^{-x^2} \in L^1(\mathbb{R})$ (Baby MCT). Thus Theorem 7.5 is applicable, with
I = J = R and $g(x) = |x| e^{-x^2}$. It follows that F is differentiable on R, and
\[ F'(s) = -i \int_{-\infty}^{\infty} x e^{-isx} e^{-x^2}\,dx. \]
By integration by parts,
\[ F'(s) = -\frac{s}{2} F(s). \]
Hence $F(s) = A e^{-s^2/4}$ for some constant A. But $F(0) = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$, so
$A = \sqrt{\pi}$.
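The formula $F(s) = \sqrt{\pi}\,e^{-s^2/4}$ can be checked numerically. In this sketch (names, truncation to [-10, 10] and step count are assumptions of the example) only the real part $\int \cos(sx) e^{-x^2}\,dx$ is computed, since the imaginary part vanishes because $\sin(sx) e^{-x^2}$ is odd in x.

```python
import math


def fourier_gaussian(s: float, half_width: float = 10.0, steps: int = 20_000) -> float:
    """Midpoint-rule value of the integral of cos(s x) exp(-x^2) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += math.cos(s * x) * math.exp(-x * x)
    return h * total


def closed_form(s: float) -> float:
    """sqrt(pi) * exp(-s^2 / 4), the value derived in the example."""
    return math.sqrt(math.pi) * math.exp(-s * s / 4.0)


if __name__ == "__main__":
    for s in (0.0, 1.0, 2.5):
        print(s, fourier_gaussian(s), closed_form(s))
```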
Corollary 7.7. Let I and J be intervals in R, and f : I × J → R be a function such
that (1) and (2) of Theorem 7.5 hold, and
(3′) for each b in J, there is an open subinterval $J_b$ of J containing b and an integrable
function $g_b : I \to \mathbb{R}$ such that, for almost all x, $\left| \frac{\partial f}{\partial y}(x, y) \right| \le g_b(x)$ for all
$y \in J_b$.
Then the conclusions of Theorem 7.5 hold.
Example 7.8. Let $f(x, y) = e^{-xy} (1 + x^3)^{-1}$ (x ≥ 0, y ≥ 0). Since $0 \le f(x, y) \le (1 + x^3)^{-1}$,
$x \mapsto f(x, y)$ is integrable over [0, ∞) for each y ≥ 0. Moreover,
\[ \frac{\partial f}{\partial y}(x, y) = -\frac{x e^{-xy}}{1 + x^3}, \]
so
\[ \left| \frac{\partial f}{\partial y}(x, y) \right| \le \frac{x}{1 + x^3} \qquad (x \ge 0,\ y \ge 0). \]
Since $x(1 + x^3)^{-1}$ is integrable over [0, ∞) (by comparison with $x^{-2}$ for x ≥ 1), Theorem
7.5 is applicable, and shows that F is differentiable on [0, ∞) and
\[ F'(y) = -\int_0^{\infty} \frac{x e^{-xy}}{1 + x^3}\,dx. \]
We would like to repeat this argument to show that $F''(y)$ exists (at least for y > 0),
but this is more complicated. Indeed,
\[ \frac{\partial^2 f}{\partial y^2}(x, y) = \frac{x^2 e^{-xy}}{1 + x^3}. \]
For y = 0, this function is not integrable (by comparison with $x^{-1}$), so we should only
consider y > 0. However, it is not possible to apply Theorem 7.5 with f replaced by
$\frac{\partial f}{\partial y}$ and with J = (0, ∞), because
\[ \sup_{y > 0} \left| \frac{\partial^2 f}{\partial y^2}(x, y) \right| = \frac{x^2}{1 + x^3}, \]
which is not integrable over [0, ∞). Instead, we must apply Corollary 7.7. Thus we
take b > 0, let $J_b = (b/2, \infty)$, and
\[ g_b(x) = \sup_{y > b/2} \left| \frac{\partial^2 f}{\partial y^2}(x, y) \right| = \frac{x^2 e^{-xb/2}}{1 + x^3} \le x^2 e^{-xb/2}. \]
This function is integrable on [0, ∞), and we conclude from Corollary 7.7, with f
replaced by $\frac{\partial f}{\partial y}$ and J = (0, ∞), that $F''(y)$ exists for y > 0 and
\[ F''(y) = \int_0^{\infty} \frac{x^2 e^{-xy}}{1 + x^3}\,dx. \]
Repeating this argument, it is possible to show that F is infinitely differentiable on
(0, ∞) and to obtain integrals for all the derivatives.
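Differentiation under the integral sign in Example 7.8 can be illustrated by comparing a finite-difference quotient of F with the integral formula for $F'$. The sketch below assumes $F(y) = \int_0^\infty e^{-xy}(1+x^3)^{-1}\,dx$, truncates the range at x = 60 (harmless for y near 1) and uses a midpoint rule; these choices and all names are illustrative.

```python
import math


def midpoint(func, a: float, b: float, steps: int) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def F(y: float, upper: float = 60.0, steps: int = 60_000) -> float:
    """Truncated approximation of the integral of exp(-x y) / (1 + x^3) over (0, infinity)."""
    return midpoint(lambda x: math.exp(-x * y) / (1.0 + x**3), 0.0, upper, steps)


def F_prime_formula(y: float, upper: float = 60.0, steps: int = 60_000) -> float:
    """Truncated approximation of -integral of x exp(-x y) / (1 + x^3), the formula for F'(y)."""
    return -midpoint(lambda x: x * math.exp(-x * y) / (1.0 + x**3), 0.0, upper, steps)


def central_difference(y: float, h: float = 1e-3) -> float:
    """Symmetric difference quotient (F(y + h) - F(y - h)) / (2 h)."""
    return (F(y + h) - F(y - h)) / (2.0 * h)


if __name__ == "__main__":
    print(central_difference(1.0), F_prime_formula(1.0))
```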
Remark. There are versions of Theorem 7.5 and Corollary 7.7 where the real variable
y ∈ J is replaced by a complex variable z ∈ Ω, a domain in C, the function f is
complex-valued, $z \mapsto f(x, z)$ is holomorphic for each x, and the conclusion is that F is
holomorphic. The proofs are almost the same, except that the use of the Mean Value
Theorem should be replaced by the formula $g_n(x) = (z_n - z_0)^{-1} \int_{[z_0, z_n]} \frac{\partial f}{\partial w}(x, w)\,dw$,
which leads to the estimate $|g_n(x)| \le g(x)$.
8. Double Integrals
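(3)
\[ \int_{\mathbb{R}^2} f(x, y)\,d(x, y) = \int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dx\,dy. \]
Similarly,
\[ \int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dx\,dy = \int_{\mathbb{R}^2} f(x, y)\,d(x, y) = \int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dy\,dx, \]
where the first repeated integral exists in the sense described above.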
Proof. Apply Theorem 8.1 to $f^+$ and $f^-$, using Proposition 5.1(6) to get that
$\int_{\mathbb{R}} f^{\pm}(x, y)\,dx < \infty$ a.e.(y).
Remark. Note that, when applying Tonelli’s Theorem, one must verify that a repeated
integral of |f | is finite. It is not sufficient that the repeated integrals of f exist (see
Example 8.4), nor is it sufficient that the repeated integrals of f both exist and are
equal (see Example 8.7).
If E is a measurable subset of $\mathbb{R}^2$ and f : E → R is any function, then f is said to
be integrable over E if $\tilde{f}$ is integrable over $\mathbb{R}^2$, where $\tilde{f}(x, y) = f(x, y)$ if (x, y) ∈ E,
$\tilde{f}(x, y) = 0$ otherwise. Then $\int_E f$ is defined to be $\int_{\mathbb{R}^2} \tilde{f}$.
Fubini’s Theorem and Tonelli’s Theorem can be applied in this situation. However,
when E is not a rectangle, great care must be taken to choose the correct limits of
integration in the repeated integrals. If in any doubt draw a sketch of the region. See
Example 8.5.
In repeated integrals, one often omits the brackets around the inner integral and
writes $\iint f(x, y)\,dy\,dx$, etc., with appropriate limits of integration. This means that
one is integrating first with respect to y between the limits on the right-hand integral
sign, which may be functions of x. Thus
\[ \int_a^b \int_{\varphi(x)}^{\psi(x)} f(x, y)\,dy\,dx \]
denotes the repeated integral over the region E bounded by curves y = φ(x) and
y = ψ(x) and by vertical lines x = a, x = b.
Example 8.4. Let $f(x, y) = \frac{x - y}{(x + y)^3}$ (0 < x < 1, 0 < y < 1). It was an exercise in
Problem Sheet 1 that the repeated integrals of f exist, but are not equal. It follows from
the final part of Fubini’s Theorem that f is not integrable over the square (0, 1) × (0, 1).
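The two repeated integrals in Example 8.4 can be evaluated with a short computation. The sketch below is not from the notes: it uses the antiderivative $y/(x+y)^2$ of f in y (checked by a finite difference in the code), which gives the inner integral $\int_0^1 f(x,y)\,dy = (1+x)^{-2}$, and the antisymmetry $f(x,y) = -f(y,x)$ for the other order. All names are illustrative.

```python
def f(x: float, y: float) -> float:
    """f(x, y) = (x - y) / (x + y)^3."""
    return (x - y) / (x + y) ** 3


def antiderivative_in_y(x: float, y: float) -> float:
    """A(x, y) = y / (x + y)^2, whose y-derivative is f(x, y)."""
    return y / (x + y) ** 2


def inner_dy(x: float) -> float:
    """Integral of f(x, y) over 0 < y < 1, namely A(x, 1) - A(x, 0) = 1 / (1 + x)^2."""
    return antiderivative_in_y(x, 1.0) - antiderivative_in_y(x, 0.0)


def inner_dx(y: float) -> float:
    """Integral of f(x, y) over 0 < x < 1; equals -inner_dy(y) since f(x, y) = -f(y, x)."""
    return -inner_dy(y)


def midpoint(func, a: float, b: float, steps: int) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


dy_then_dx = midpoint(inner_dy, 0.0, 1.0, 10_000)  # integrate in y first, then x
dx_then_dy = midpoint(inner_dx, 0.0, 1.0, 10_000)  # integrate in x first, then y

if __name__ == "__main__":
    print(dy_then_dx, dx_then_dy)
```

The two orders give values of opposite sign, which is only possible because f is not integrable over the square.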
Example 8.5. Consider
\[ \int_0^1 \left( \int_0^x \left( \frac{1 - y}{x - y} \right)^{1/2} dy \right) dx. \]
As it stands, the inner integral
is difficult. However, it turns out that when the order of integration is reversed, the
other repeated integral is easily evaluated. To justify the equality of the repeated integrals,
we apply Tonelli’s Theorem; this is contained in the following discussion.
First, note that the integrand is continuous except on the line y = x which is null;
it is non-negative throughout the range of integration, so that in applying Tonelli’s
Theorem, it is unnecessary to replace f by |f|. The next problem is to work out the limits
of integration when we reverse the order. For this, we have to identify the region in $\mathbb{R}^2$
over which the double integral is taken. For each x, between 0 and
1, we are integrating along the (vertical) line-segment from y = 0 to y = x. As x runs
from 0 to 1, this sweeps out the triangle shown. The integrand is continuous on the
interior of the triangle (and we take it to be 0 outside the triangle), so it is measurable.
If we fix a value of y, the values of x which give us points within the triangle are those
between x = y and x = 1. This applies for y between 0 and 1; otherwise there are no
points within the triangle. Thus the limits of the reversed repeated integral are x = y
and x = 1 in the inner integral, and y = 0 and y = 1 in the outer. This is confirmed
by the following equalities of sets:
{(x, y) ∈ R2 : 0 < y < x, 0 < x < 1} = {(x, y) ∈ R2 : 0 < y < x < 1}
= {(x, y) ∈ R2 : y < x < 1, 0 < y < 1},
but the picture was more informative!
Now the reversed repeated integral is:
\[ \int_0^1 \left( \int_y^1 \left( \frac{1 - y}{x - y} \right)^{1/2} dx \right) dy = \int_0^1 \left[ 2 (1 - y)^{1/2} (x - y)^{1/2} \right]_{x=y}^{x=1} dy = \int_0^1 2(1 - y)\,dy = 1. \]
Since the integrand is non-negative, and since this repeated integral is finite, it follows
from Tonelli’s Theorem that f is integrable over the triangle, and from Fubini’s
Theorem that
\[ \int_0^1 \left( \int_0^x \left( \frac{1 - y}{x - y} \right)^{1/2} dy \right) dx = 1. \]
The next example shows how it is both possible and useful to make changes of
variable within the inner integral of a repeated integral. The same technique will be
used in several subsequent examples.
Example 8.6. Let $f(x, y) = y e^{-y^2 (1 + x^2)}$. Since f is continuous, it is certainly measurable.
We shall consider the integral of f over the positive quadrant (0, ∞) × (0, ∞). First
we consider $\int_0^{\infty} f(x, y)\,dy$ for a fixed x. Making the change of variable $t = y(1 + x^2)^{1/2}$
(x is a constant at this point),
\[ \int_0^{\infty} f(x, y)\,dy = \int_0^{\infty} \frac{t e^{-t^2}}{1 + x^2}\,dt = \lim_{k\to\infty} \left[ -\frac{e^{-t^2}}{2(1 + x^2)} \right]_{t=0}^{t=k} = \frac{1}{2(1 + x^2)}. \]
It follows that
\[ \int_0^{\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}. \]
If f takes both positive and negative values, then to apply Tonelli’s Theorem, it is
necessary to consider |f |, or alternatively to consider separately the regions where f is
positive and where it is negative.
Example 8.7. Let $f(x, y) = \frac{xy}{x^4 + y^4}$. Since f is odd both as a function of x, and also
as a function of y, $\int_{-\infty}^{\infty} f(x, y)\,dy = 0$ for all x, and $\int_{-\infty}^{\infty} f(x, y)\,dx = 0$ for all y. Hence
both repeated integrals exist and equal 0. However, if we consider f over the quadrant
x > 0, y > 0, part of the region where f(x, y) > 0, then, putting y = xt (x > 0 fixed),
\[ \int_0^{\infty} f(x, y)\,dy = \int_0^{\infty} \frac{x^3 t}{x^4 (1 + t^4)}\,dt = \frac{c}{x}, \]
where c is the constant $\int_0^{\infty} \frac{t}{1 + t^4}\,dt$. Since $c x^{-1}$ is not integrable with respect to x
over (0, ∞), it follows that f is not integrable over the quadrant, and therefore not
integrable over the plane.
In practice, it often happens that one has no means of evaluating the repeated
integrals of f or |f |, but can nevertheless decide whether f is integrable. One technique
for this is to show that f is dominated by a simpler function which one can show to be
integrable (or that f dominates a function which one can show not to be integrable).
Example 8.8. Let $f(x, y) = \sin\left( \frac{1}{x^2 + y^4} \right) \cos(x^2 + y^3)$. We wish to show that f is
integrable over the positive quadrant (0, ∞) × (0, ∞). Since f is continuous in this region
(although not continuous at (0, 0)), it is measurable. Moreover, f is bounded, and hence
integrable over any bounded region, in particular over the square (0, 1) × (0, 1). Thus
it suffices to show that f is integrable over the regions [1, ∞) × [0, ∞) and (0, 1) × (1, ∞).
Using the inequalities |sin t| ≤ |t| and |cos t| ≤ 1, it follows that $|f(x, y)| \le (x^2 + y^4)^{-1}$,
so it suffices to show that $(x^2 + y^4)^{-1}$ is integrable over these two regions. Now
\[ \int_1^{\infty} \int_0^{\infty} \frac{dy}{x^2 + y^4}\,dx = \int_1^{\infty} \int_0^{\infty} \frac{dz}{x^{3/2} (1 + z^4)}\,dx < \infty, \]
where we made the substitution $y = x^{1/2} z$ and used the integrability of $x^{-3/2}$ over
[1, ∞) and of $(1 + z^4)^{-1}$ over (0, ∞). Also,
\[ \int_1^{\infty} \int_0^1 \frac{dx}{x^2 + y^4}\,dy \le \int_1^{\infty} \int_0^1 \frac{dx}{y^4}\,dy = \int_1^{\infty} \frac{dy}{y^4} = \frac{1}{3}. \]
It follows from Tonelli’s Theorem that $(x^2 + y^4)^{-1}$ is integrable over these two regions,
so f is integrable over the quadrant.
Another useful technique for testing functions for integrability, and for evaluating
integrals, is to change variables. The reader will be familiar with this idea from courses
in applied mathematics and in A3 Probability, and will know that one has to take
account of the Jacobian of the transformation. The method is the extension to two
variables of Theorem 5.8. We shall state the result and give examples for polar coordi-
nates x = r cos θ, y = r sin θ, when the Jacobian is r. This corresponds to the fact that
a small rectangle with sides δr, δθ (area δrδθ) in the (r, θ)-space is transformed into an
approximate rectangle of sides δr, rδθ (area rδrδθ) in the (x, y)-space.
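Now we state a version of Theorem 8.9 for general changes of coordinates. Let
$T : (u, v) \mapsto (x, y)$ be a change of variables, and suppose that x, y are differentiable
functions of u, v. Let $J_T$ be the Jacobian matrix:
\[ J_T = \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix}. \]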
Observe that JS◦T = JS JT (Chain Rule).
Writing $\frac{\partial(x, y)}{\partial(u, v)}$ for $\det J_T$, this formula becomes
\[ \int_E f(x, y)\,d(x, y) = \int_{E'} f(u, v) \left| \frac{\partial(x, y)}{\partial(u, v)} \right| d(u, v). \]
To recover Theorem 8.9 from Theorem 8.13, take T(r, θ) = (r cos θ, r sin θ), so $\frac{\partial(x, y)}{\partial(r, \theta)} = r$.
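The Jacobian determinant of the polar map can be confirmed by finite differences. This is a small sketch with illustrative names; the step size and the use of central differences are arbitrary choices for the example.

```python
import math


def polar_map(r: float, theta: float) -> tuple:
    """T(r, theta) = (r cos(theta), r sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det(r: float, theta: float, h: float = 1e-6) -> float:
    """Central-difference approximation of det J_T at (r, theta)."""
    x_plus_r, y_plus_r = polar_map(r + h, theta)
    x_minus_r, y_minus_r = polar_map(r - h, theta)
    x_plus_t, y_plus_t = polar_map(r, theta + h)
    x_minus_t, y_minus_t = polar_map(r, theta - h)
    dx_dr = (x_plus_r - x_minus_r) / (2.0 * h)
    dy_dr = (y_plus_r - y_minus_r) / (2.0 * h)
    dx_dt = (x_plus_t - x_minus_t) / (2.0 * h)
    dy_dt = (y_plus_t - y_minus_t) / (2.0 * h)
    # Determinant of the 2x2 matrix of partial derivatives.
    return dx_dr * dy_dt - dx_dt * dy_dr


if __name__ == "__main__":
    print(jacobian_det(2.0, 0.7))  # should be close to r = 2
```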
In the situation of Theorem 8.13, E is always measurable (continuous image of a
Borel set) although this is not obvious.
One can extend Section 8 to Rn instead of R2 . Moreover, for any (σ-finite) measure
spaces (Ω1 , F1 , µ1 ) and (Ω2 , F2 , µ2 ), one can define a product (Ω1 ×Ω2 , F1 ⊗F2 , µ1 ×µ2 )
such that Fubini’s and Tonelli’s theorems hold.
9. Lp -spaces
Then
(i) $\|f\|_1 = 0$ if and only if f = 0 a.e. (Proposition 5.1(5),(7));
(ii) $\|\alpha f\|_1 = |\alpha| \|f\|_1$;
(iii) $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$.
Consequently,
(i)′ $d_1(f, g) = 0$ if and only if f = g a.e.
(ii)′ $d_1(g, f) = d_1(f, g)$;
(iii)′ $d_1(f, h) \le d_1(f, g) + d_1(g, h)$.
So $\|\cdot\|_1$ is almost a norm and $d_1$ is almost a metric (cf., Metric Spaces). The problems
are that we have not yet defined a suitable vector space, and $\|f\|_1 = 0$ does not imply
that f is the zero function.
If we allow our integrable functions to take the values ∞ and −∞, then f + g
may not be everywhere defined (but it is almost everywhere defined). Any integrable
function is real-valued almost everywhere, so we will now take L1 to be the space of all
integrable functions with real (or complex) values. Then we identify functions which
are almost everywhere equal (actually, we have effectively been doing this for some
time). Define an equivalence relation on L1 by
f ∼ g ⇐⇒ f = g a.e.
Then $\langle \cdot, \cdot \rangle_2$ is positive-semidefinite, linear in the first variable, and conjugate-symmetric,
so it is almost an inner product. Again there is a small problem that $\langle f, f \rangle_2 = 0$ implies
only that $f \in \mathcal{N}$. So we form $L^2 = \mathcal{L}^2/\mathcal{N}$, and we obtain an inner product on $L^2$. Hence,
we get a well-defined norm on $L^2$ given by
\[ \|[f]\|_2 = \|f\|_2 = \langle f, f \rangle_2^{1/2} = \left( \int |f|^2 \right)^{1/2}. \]
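The function $t \mapsto t^p$ is continuous on [0, ∞) and its second derivative $p(p - 1)t^{p-2}$
is positive on (0, ∞). This implies that it is convex, i.e.
\[ (\lambda s + (1 - \lambda) t)^p \le \lambda s^p + (1 - \lambda) t^p \]
for 0 ≤ λ ≤ 1, s, t ≥ 0. Apply this with
\[ \lambda = \frac{\alpha}{\alpha + \beta}, \qquad s = \frac{|f(x)|}{\alpha}, \qquad t = \frac{|g(x)|}{\beta}. \]
This gives
\[ \left( \frac{|f| + |g|}{\alpha + \beta} \right)^p \le \frac{1}{\alpha + \beta} \left( \frac{|f|^p}{\alpha^{p-1}} + \frac{|g|^p}{\beta^{p-1}} \right). \]
Using |f + g| ≤ |f| + |g|, integrating, and taking pth roots gives the required inequality.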
The pair (p, q) are sometimes called Hölder conjugates. For p = q = 2, Hölder’s
Inequality is the Cauchy-Schwarz Inequality. Notice also that Hölder’s inequality holds
for the pair p = 1 and q = ∞.
Proof. Note first that the function $t \mapsto \log t$ is concave on (0, ∞), because its second
derivative $-t^{-2}$ is negative. Hence
\[ \frac{1}{p} \log s + \frac{1}{q} \log t \le \log\left( \frac{s}{p} + \frac{t}{q} \right). \]
Exponentiate to obtain $s^{1/p} t^{1/q} \le \frac{s}{p} + \frac{t}{q}$. Let $s = (|f(x)|/\|f\|_p)^p$ and $t = (|g(x)|/\|g\|_q)^q$.
This gives
\[ \frac{|fg|}{\|f\|_p \|g\|_q} \le \frac{|f|^p}{p \|f\|_p^p} + \frac{|g|^q}{q \|g\|_q^q}. \]
Integrate.
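Both steps of this proof can be tested numerically. The sketch below checks the inequality $s^{1/p} t^{1/q} \le s/p + t/q$ on a small grid and Hölder’s inequality for finite sequences; the finite-sequence (counting measure) setting is an assumption of this example rather than the Lebesgue setting of the notes, and all names are illustrative.

```python
def conjugate(p: float) -> float:
    """Hölder conjugate q of p > 1, defined by 1/p + 1/q = 1."""
    return p / (p - 1.0)


def young_gap(s: float, t: float, p: float) -> float:
    """s/p + t/q - s^(1/p) t^(1/q); non-negative by concavity of log."""
    q = conjugate(p)
    return s / p + t / q - s ** (1.0 / p) * t ** (1.0 / q)


def p_norm(values, p: float) -> float:
    """(sum |v|^p)^(1/p) for a finite sequence."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def holder_gap(f_values, g_values, p: float) -> float:
    """||f||_p ||g||_q - sum |f g|; non-negative by Hölder's inequality."""
    q = conjugate(p)
    product_sum = sum(abs(a * b) for a, b in zip(f_values, g_values))
    return p_norm(f_values, p) * p_norm(g_values, q) - product_sum


if __name__ == "__main__":
    print(young_gap(0.5, 7.0, 4.0))
    print(holder_gap([1.0, -2.0, 3.0, 0.5], [0.3, 4.0, -1.0, 2.0], 3.0))
```

For p = 2 and g = f the Hölder gap is zero up to rounding, which matches the equality case of Cauchy-Schwarz.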
Corollary 9.3. If $1 \le p_1 < p_2 < \infty$ and $f \in L^{p_2}(a, b)$, then $f \in L^{p_1}(a, b)$ and
\[ \|f\|_{p_1} \le (b - a)^{\frac{1}{p_1} - \frac{1}{p_2}} \|f\|_{p_2}. \]
Hence if $f_n \in L^{p_2}(a, b)$ and $\|f_n\|_{p_2} \to 0$, then $\|f_n\|_{p_1} \to 0$.
Proof. Apply Proposition 9.2 to the functions $|f|^{p_1}$ and $\chi_{(a,b)}$, with $p = p_2/p_1$. Then
raise both sides to the power $(1/p_1)$.
The inclusion $L^{p_2}(a, b) \subset L^{p_1}(a, b)$ in Corollary 9.3 is strict: consider $x^{\alpha}$ on (0, 1).
Corollary 9.3 holds if (a, b) is replaced by any finite measure space. However,
$L^{p_1}(1, \infty)$ is not contained in $L^{p_2}(1, \infty)$ (exercise).
For p ≥ 1, $L^p$ is a normed space and hence a metric space for $d_p(f, g) = \|f - g\|_p$.
How does convergence in $L^p$-norm compare with pointwise a.e. convergence?
Examples 9.4. 1. Convergence a.e. does not imply convergence in $L^p$-norm: If $f_n(x) = n^2 x^n (1 - x)$
(0 ≤ x ≤ 1), then $f_n(x) \to 0$ a.e., but $\|f_n\|_1 \to 1$.
2. Convergence in $L^p$-norm does not imply convergence a.e.: For $n = 2^r + k$, where
$0 \le k < 2^r$, let $f_n$ be the characteristic function of $[k 2^{-r}, (k + 1) 2^{-r}]$. Then
$\|f_n\|_1 = 2^{-r} \le 2/n \to 0$, but for each x ∈ [0, 1], $f_n(x)$ takes the values 0 and 1 infinitely
often.
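The second example, with $f_n$ the characteristic function of $[k 2^{-r}, (k+1) 2^{-r}]$ for $n = 2^r + k$, can be explored directly. This is a minimal sketch with illustrative names; the sample point x = 0.3 is an arbitrary choice.

```python
def level_and_offset(n: int) -> tuple:
    """Write n = 2**r + k with 0 <= k < 2**r and return (r, k)."""
    r = n.bit_length() - 1
    return r, n - (1 << r)


def f(n: int, x: float) -> int:
    """Characteristic function of [k 2^-r, (k + 1) 2^-r], where n = 2^r + k."""
    r, k = level_and_offset(n)
    width = 2.0 ** (-r)
    return 1 if k * width <= x <= (k + 1) * width else 0


def l1_norm(n: int) -> float:
    """||f_n||_1 is the length 2^-r of the interval."""
    r, _ = level_and_offset(n)
    return 2.0 ** (-r)


if __name__ == "__main__":
    x = 0.3
    # Within each level r the moving interval passes over x once.
    hits = [n for n in range(1, 64) if f(n, x) == 1]
    print(hits, [l1_norm(n) for n in hits])
```

At every level the interval sweeps across [0, 1], so $f_n(x)$ returns to 1 once per level while the norms halve, which is exactly why norm convergence gives no pointwise convergence here.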
Theorem 9.5. Let p ∈ [1, ∞), and let $(f_n)$ be a sequence in $L^p$ which is Cauchy, i.e.,
for each ε > 0, there exists N such that $\|f_n - f_m\|_p < \varepsilon$ whenever m, n ≥ N. Then
there exists $f \in L^p$ such that
1. There is a subsequence $(f_{n_k})$ such that $\lim_{k\to\infty} f_{n_k}(x) = f(x)$ a.e.
2. $\lim_{n\to\infty} \|f_n - f\|_p = 0$.
Thus $L^p$ is a complete metric space.
The Convergence Theorems provide situations when a.e. convergence implies convergence
in $L^p$-norm. Here is a general result in that direction with a weaker conclusion
(see the bonus sheet for a proof).
Theorem 9.7. [Egorov’s Theorem] Suppose that $f_n \to f$ a.e. Let E be a measurable
set with m(E) < ∞ and let ε > 0. Then there is a measurable subset F of E with
m(E \ F) < ε such that $f_n \to f$ uniformly on F. In particular, $\|f_n - f\|_{L^p(F)} \to 0$ for
all p ≥ 1.
It is very useful to identify natural dense subsets of the Lp spaces. We can often
establish results on dense subsets and extend them to all of Lp by density (just as you
often use density of Q and density of R \ Q in prelims analysis arguments).
Theorem 9.8. Let 1 ≤ p < ∞ and $f \in L^p(\mathbb{R})$.
1. There is a sequence of step functions $\psi_n$ such that $\lim_{n\to\infty} \|f - \psi_n\|_p = 0$.
2. There is a sequence $(g_n)$ of continuous functions with compact support[13] such that
$\lim_{n\to\infty} \|f - g_n\|_p = 0$.
Part 1 of this result is closely related to Theorem 3.11, that measurable functions
are pointwise (a.e.) limits of sequences of step functions. For a proof when p = 1, see
Stein & Shakarchi, Theorem 2.4, p.71.
As an example of density in action, we will show that translation of a function is
continuous in the Lp norm. For f : R → R, and h ∈ R consider the translation fh (x) =
f (x − h). As Lebesgue measure is translation invariant, it follows that f ∈ Lp (R) if
and only if fh ∈ Lp (R) for all h ∈ R.
Proposition 9.9. For 1 ≤ p < ∞ and $f \in L^p(\mathbb{R})$, $\lim_{h\to 0} \|f_h - f\|_p = 0$.
Proof. Given ε > 0, use Theorem 9.8(2) to find g which is continuous and of compact
support such that $\|f - g\|_p < \varepsilon/3$. As g is continuous and of compact support it is
uniformly continuous. From this one obtains that $\lim_{h\to 0} \|g_h - g\|_p^p = 0$. So one can find
δ > 0 such that $\|g_h - g\|_p < \varepsilon/3$ whenever 0 < |h| < δ. Using Minkowski’s inequality
and invariance of the Lebesgue measure under translation, for 0 < |h| < δ one has
\[ \|f - f_h\|_p \le \|f - g\|_p + \|g - g_h\|_p + \|g_h - f_h\|_p < 3\varepsilon/3 = \varepsilon. \]
[You could equally use part (1) of Theorem 9.8 by showing that $\lim_{h\to 0} \|\psi_h - \psi\|_p = 0$
for a step function ψ (an easy calculation obtained by doing the integral when ψ is the
indicator function of an interval, and then use a triangle inequality argument).]
We end this section by revisiting the Fourier transform of an integrable function from
the ASO Integral transforms course, giving rigorous proofs of some properties from that
course. Let $f \in L^1(\mathbb{R})$. The Fourier transform of f is the function $\hat{f} : \mathbb{R} \to \mathbb{C}$ defined
by
\[ \hat{f}(s) = \int_{\mathbb{R}} f(x) e^{-isx}\,dx. \]
Proof. (1) follows from $|f(x) e^{-isx}| = |f(x)|$. (2) follows from the continuous-parameter
DCT (Theorem 7.2) with g(x) = |f(x)|.[14]
For (3) when $f = \chi_{(a,b)}$, $\hat{f}(s) = \frac{i(e^{-isb} - e^{-isa})}{s} \to 0$ as |s| → ∞. This extends to
step functions, by linearity. For general $f \in L^1(\mathbb{R})$ and ε > 0, there is a step function
φ such that $\|f - \varphi\|_1 < \varepsilon$ by Theorem 9.8, and there exists K such that $|\hat{\varphi}(s)| < \varepsilon$
whenever |s| > K. Then
\[ |\hat{f}(s)| \le |\hat{f}(s) - \hat{\varphi}(s)| + |\hat{\varphi}(s)| \le \|f - \varphi\|_1 + |\hat{\varphi}(s)| < 2\varepsilon. \]
(4) can be proved by applying Theorem 7.5 with |g| as dominating function. (5)
can be proved by using integration by parts over intervals $[a_n, b_n]$ where $a_n \to -\infty$,
$f(a_n) \to 0$, $b_n \to \infty$ and $f(b_n) \to 0$.
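The closed form used in step (3) for $f = \chi_{(a,b)}$, and its decay as |s| → ∞, can be checked with complex arithmetic. The sketch below uses illustrative names and a midpoint rule as an independent comparison; the bound $2/|s|$ follows from $|e^{-isb} - e^{-isa}| \le 2$.

```python
import cmath


def hat_indicator(a: float, b: float, s: float) -> complex:
    """Closed form i (e^{-isb} - e^{-isa}) / s for the transform of the indicator of (a, b), s != 0."""
    return 1j * (cmath.exp(-1j * s * b) - cmath.exp(-1j * s * a)) / s


def hat_numeric(a: float, b: float, s: float, steps: int = 20_000) -> complex:
    """Midpoint-rule approximation of the integral of e^{-isx} over (a, b)."""
    h = (b - a) / steps
    return h * sum(cmath.exp(-1j * s * (a + (i + 0.5) * h)) for i in range(steps))


if __name__ == "__main__":
    for s in (3.0, 30.0, 300.0):
        # The modulus is at most 2/|s|, so it tends to 0 as |s| grows.
        print(s, abs(hat_indicator(-1.0, 2.0, s)), 2.0 / s)
```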
One can alternatively prove (3) using the $L^1$-continuity of translations of Proposition 9.9
as follows. For $f \in L^1(\mathbb{R})$, making the change of variables y = x + π/s, we
have
\[ \hat{f}(s) = \int_{\mathbb{R}} f(x) e^{-isx}\,dx = \int_{\mathbb{R}} -f(x) e^{-isx - i\pi}\,dx = \int_{\mathbb{R}} -f(y - \pi/s) e^{-isy}\,dy. \]
Therefore Proposition 9.9 gives
\[ |\hat{f}(s)| = \frac{1}{2} \left| \int_{\mathbb{R}} (f(x) - f(x - \pi/s)) e^{-isx}\,dx \right| \le \frac{1}{2} \|f - f_{\pi/s}\|_1 \to 0 \]
as s → ∞.
The theorem about the Fourier transform of the convolution of two integrable functions
(Theorem 81) is an application of Fubini/Tonelli. One can also formulate a Fourier
inversion theorem (normalising appropriately) when both f and $\hat{f}$ are integrable. See
Stein and Shakarchi section 2.4. In fact Fourier inversion works particularly well in
the $L^2$-setting; this will be further developed in the Fourier analysis course (and for
Fourier series in the functional analysis course B4.2).
it is false for the Cantor-Lebesgue function Φ whose derivative exists and equals 0 a.e.
on [0, 1] (Example 4.12).
The ideal Fundamental Theorem of Calculus would identify a class A of functions
F on [a, b] with both the following properties:
(i) If F ∈ A, then F is differentiable a.e., $F' \in L^1(a, b)$, and $\int_a^x F'(y)\,dy = F(x) - F(a)$
for all x ∈ [a, b].
(ii) If $f \in L^1(a, b)$ and $F(x) = \int_a^x f(y)\,dy$ for x ∈ [a, b], then F ∈ A and $F' = f$ a.e.
It is not obvious that such a class exists: its existence implies that the indefinite integral
F of an integrable function f is differentiable a.e. and $F' = f$ a.e.
In fact, this is true. Then A is the class of all functions of the form $F(x) := c + \int_a^x f$
for some c ∈ R and some $f \in L^1(a, b)$. Remarkably there is an intrinsic characterisation
of such functions.
Let I be an interval. A function F : I → R is said to be absolutely continuous on
I if, for each ε > 0, there exists δ > 0 such that
\[ \sum_{r=1}^{n} |F(b_r) - F(a_r)| < \varepsilon \]
whenever n ∈ N, $(a_r, b_r)$ (r = 1, . . . , n) are disjoint subintervals of I and $\sum_{r=1}^{n} (b_r - a_r) < \delta$.
If we only allowed n = 1 in this definition, we would have the definition of uniform
continuity on I. Recall from Prelims that any continuous function on [a, b] is uniformly
continuous.
Examples 10.1. 1. Recall that F is Lipschitz if there exists c such that |F(y) − F(x)| ≤
c|y − x| for all x, y. Any Lipschitz function is absolutely continuous (take δ = ε/c).
2. If f is a bounded measurable function and $F(x) = \int_a^x f(y)\,dy$, then F is Lipschitz.
3. The Cantor-Lebesgue function is not absolutely continuous on [0, 1].
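The failure of absolute continuity for the Cantor-Lebesgue function can be seen concretely: the $2^k$ intervals of the k-th stage of the Cantor set construction have total length $(2/3)^k$, yet the function increases by a total of 1 across them. The sketch below is an illustration with exact rational arithmetic; the ternary-digit evaluation of the function and all names are choices made for this example.

```python
from fractions import Fraction


def cantor(x: Fraction, max_digits: int = 64) -> Fraction:
    """Cantor-Lebesgue function at a rational x in [0, 1], read off from ternary digits."""
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        x *= 3
        digit = int(x)  # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:  # x lies in (or at the left end of) a removed middle third
            return value + weight
        value += weight * (digit // 2)
        weight /= 2
        if x == 0:
            break
    return value


def cantor_intervals(level: int) -> list:
    """The 2**level closed intervals left after `level` steps of removing middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(level):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals


if __name__ == "__main__":
    for level in (2, 4, 8):
        pieces = cantor_intervals(level)
        total_length = sum(b - a for a, b in pieces)
        total_increase = sum(cantor(b) - cantor(a) for a, b in pieces)
        print(level, float(total_length), total_increase)
```

So for ε = 1 no δ works in the definition: the total length can be made smaller than any δ while the sum of increments stays equal to 1.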
Theorem 10.2. Let $f \in L^1(I)$ and $F(x) = \int_a^x f(y)\,dy$. Then F is absolutely continuous
on I.
Theorem 10.3. Let F be an absolutely continuous function on [a, b]. Then F is differentiable
a.e., $F' \in L^1(a, b)$ and $F(x) - F(a) = \int_a^x F'(y)\,dy$ for all x ∈ [a, b].
courses; you can use these ideas to identify the Lipschitz functions f : [−1, 1] → R with
f (0) = 0 and an appropriate norm with L∞ [−1, 1]).