Math Analysis
Kuttler
Contents
5 Multi-variable Calculus
5.1 Continuous Functions
5.1.1 Distance In Fn
5.2 Open And Closed Sets
5.3 Continuous Functions
5.3.1 Sufficient Conditions For Continuity
5.4 Exercises
5.5 Limits Of A Function
5.6 Exercises
5.7 The Limit Of A Sequence
5.7.1 Sequences And Completeness
5.7.2 Continuity And The Limit Of A Sequence
5.8 Properties Of Continuous Functions
5.9 Exercises
5.10 Proofs Of Theorems
5.11 The Space L (Fn , Fm )
5.11.1 The Operator Norm
5.12 The Frechet Derivative
5.13 C 1 Functions
5.14 C k Functions
5.15 Mixed Partial Derivatives
5.16 Implicit Function Theorem
5.16.1 More Continuous Partial Derivatives
5.17 The Method Of Lagrange Multipliers
26 Residues
26.1 Rouche’s Theorem And The Argument Principle
26.1.1 Argument Principle
26.1.2 Rouche’s Theorem
26.1.3 A Different Formulation
26.2 Singularities And The Laurent Series
26.2.1 What Is An Annulus?
26.2.2 The Laurent Series
26.2.3 Contour Integrals And Evaluation Of Integrals
26.3 The Spectral Radius Of A Bounded Linear Transformation
26.4 Exercises
Review Of Advanced
Calculus
Set Theory
In this notation, the colon is read as “such that” and in this case the condition is
being a multiple of 2.
Another example, of political interest, could be the set of all judges who are not
judicial activists. I think you can see this last is not a very precise condition since
there is no way to determine to everyone’s satisfaction whether a given judge is an
activist. Also, just because something is grammatically correct does not mean it
makes any sense. For example consider the following nonsense.
So what is a condition?
We will leave these sorts of considerations and assume our conditions make sense.
The axiom of unions states that for any collection of sets, there is a set consisting
of all the elements in each of the sets in the collection. Of course this is also open to
further consideration. What is a collection? Maybe it would be better to say “set
of sets” or, given a set whose elements are sets there exists a set whose elements
consist of exactly those things which are elements of at least one of these sets. If S
is such a set whose elements are sets,
∪ {A : A ∈ S} or ∪ S
The complement of a set, (the set of things which are not in the given set ) must
be taken with respect to a given set called the universal set which is a set which
contains the one whose complement is being taken. Thus, the complement of A,
denoted as $A^C$ (or more precisely as $X \setminus A$), is a set obtained from using the axiom
of specification to write
\[ A^C \equiv \{x \in X : x \notin A\}. \]
The symbol $\notin$ means: “is not an element of”. Note the axiom of specification takes
place relative to a given set. Without this universal set it makes no sense to use
the axiom of specification to obtain the complement.
Words such as “all” or “there exists” are called quantifiers and they must be
understood relative to some given set. For example, the set of all integers larger
than 3. Or there exists an integer larger than 7. Such statements have to do with a
given set, in this case the integers. Failure to have a reference set when quantifiers
are used turns out to be illogical even though such usage may be grammatically
correct. Quantifiers are used often enough that there are symbols for them. The
symbol ∀ is read as “for all” or “for every” and the symbol ∃ is read as “there
exists”. Thus ∀∀∃∃ could mean for every upside down A there exists a backwards
E.
DeMorgan’s laws are very useful in mathematics. Let S be a set of sets each of
which is contained in some universal set, U . Then
\[ \cup \{A^C : A \in S\} = (\cap \{A : A \in S\})^C \]
and
\[ \cap \{A^C : A \in S\} = (\cup \{A : A \in S\})^C . \]
These laws follow directly from the definitions. Also following directly from the
definitions are:
Let S be a set of sets then
B ∪ ∪ {A : A ∈ S} = ∪ {B ∪ A : A ∈ S} .
B ∩ ∪ {A : A ∈ S} = ∪ {B ∩ A : A ∈ S} .
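The identities above can be checked on a small finite example. The following is only a sketch: the universal set U, the family S, and the set B are arbitrary choices made for illustration, and the helper names are hypothetical. A finite check is of course not a proof.

```python
def union(family):
    """Union of a family of sets."""
    return set().union(*family)

def intersection(family):
    """Intersection of a nonempty family of sets."""
    family = list(family)
    return set(family[0]).intersection(*family[1:])

def complement(a, universe):
    """Complement of a taken relative to the given universal set."""
    return universe - a

# Illustrative data (assumed, not from the text).
U = set(range(10))
S = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7}]
B = {3, 8, 9}

# DeMorgan's laws.
demorgan_1 = union(complement(A, U) for A in S) == complement(intersection(S), U)
demorgan_2 = intersection(complement(A, U) for A in S) == complement(union(S), U)

# The two distributive identities.
dist_union = (B | union(S)) == union(B | A for A in S)
dist_inter = (B & union(S)) == union(B & A for A in S)

print(demorgan_1, demorgan_2, dist_union, dist_inter)
```

Note that the complement is always taken relative to U, in keeping with the remark that a complement makes no sense without a universal set.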
Unfortunately, there is no single universal set which can be used for all sets.
Here is why: Suppose there were. Call it S. Then you could consider A the set
of all elements of S which are not elements of themselves, this from the axiom of
specification. If A is an element of itself, then it fails to qualify for inclusion in A.
Therefore, it must not be an element of itself. However, if this is so, it qualifies for
inclusion in A so it is an element of itself and so this can’t be true either. Thus
the most basic of conditions you could imagine, that of being an element of, is
meaningless and so allowing such a set causes the whole theory to be meaningless.
The solution is to not allow a universal set. As mentioned by Halmos in Naive
set theory, “Nothing contains everything”. Always beware of statements involving
quantifiers wherever they occur, even this one.
1.2 The Schroder Bernstein Theorem
[Figure: the sets X and Y, with f mapping A ⊆ X onto C = f (A) ⊆ Y and g mapping D ⊆ Y onto B = g(D) ⊆ X.]
C ≡ f (A) , D ≡ Y \ C, B ≡ X \ A.
Definition 1.4 Let I be a set and let $X_i$ be a set for each i ∈ I. f is a choice
function, written as
\[ f \in \prod_{i \in I} X_i, \]
if $f(i) \in X_i$ for each i ∈ I.
The axiom of choice says that if $X_i \neq \emptyset$ for each i ∈ I, for I a set, then
\[ \prod_{i \in I} X_i \neq \emptyset. \]
Sometimes the two functions, f and g are onto but not one to one. It turns out
that with the axiom of choice, a similar conclusion to the above may be obtained.
Similarly $g_0^{-1}$ is one to one. Therefore, by the Schroder Bernstein theorem, there
exists h : X → Y which is one to one and onto.
Definition 1.6 A set S, is finite if there exists a natural number n and a map θ
which maps {1, · · ·, n} one to one and onto S. S is infinite if it is not finite. A
set S, is called countable if there exists a map θ mapping N one to one and onto
S. (When θ maps a set A to a set B, this will be written as θ : A → B in the future.)
Here N ≡ {1, 2, · · ·}, the natural numbers. S is at most countable if there exists a
map θ : N → S which is onto.
The property of being at most countable is often referred to as being countable
because the question of interest is normally whether one can list all elements of the
set, designating a first, second, third etc. in such a way as to give each element of
the set a natural number. The possibility that a single element of the set may be
counted more than once is often not important.
Theorem 1.7 If X and Y are both at most countable, then X × Y is also at most
countable. If either X or Y is countable, then X × Y is also countable.
Proof: It is given that there exists a mapping η : N → X which is onto. Define
η (i) ≡ xi and consider X as the set {x1 , x2 , x3 , · · ·}. Similarly, consider Y as the
set {y1 , y2 , y3 , · · ·}. It follows the elements of X × Y are included in the following
rectangular array.
(x1 , y1 ) (x1 , y2 ) (x1 , y3 ) · · · ← Those which have x1 in first slot.
(x2 , y1 ) (x2 , y2 ) (x2 , y3 ) · · · ← Those which have x2 in first slot.
(x3 , y1 ) (x3 , y2 ) (x3 , y3 ) · · · ← Those which have x3 in first slot.
    ⋮          ⋮          ⋮
Follow a path through this array as follows.
(x1 , y1 ) → (x1 , y2 )    (x1 , y3 ) →
           ↙            ↗
(x2 , y1 )    (x2 , y2 )
    ↓      ↗
(x3 , y1 )
It remains to show the last claim. Suppose without loss of generality that X
is countable. Then there exists α : N → X which is one to one and onto. Let
β : X × Y → N be defined by β ((x, y)) ≡ $\alpha^{-1}(x)$. Thus β is onto N. By the first
part there exists a function from N onto X × Y . Therefore, by Corollary 1.5, there
exists a one to one and onto mapping from X × Y to N. This proves the theorem.
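The idea of listing a rectangular array along its diagonals can be sketched in code. This is an illustration only: it lists index pairs (i, j) standing for $(x_i, y_j)$, and it sweeps every anti-diagonal in the same direction rather than zigzagging as in the picture. The function name is hypothetical.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield each pair (i, j) of positive integers exactly once,
    sweeping the anti-diagonals i + j = 2, 3, 4, ... in turn."""
    for s in count(2):
        for i in range(1, s):
            yield (i, s - i)

# The first six index pairs, covering the diagonals i + j = 2, 3, 4.
first_six = list(islice(diagonal_pairs(), 6))
print(first_six)  # [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
```

Every pair (i, j) is reached after finitely many steps, which is exactly what is needed for a map from N onto the array.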
X = {x1 , x2 , x3 , · · ·}
and
Y = {y1 , y2 , y3 , · · ·} .
Consider the following array consisting of X ∪ Y and path through it.
x1 → x2    x3 →
   ↙     ↗
y1 → y2
Thus the first element of X ∪ Y is x1 , the second is x2 the third is y1 the fourth is
y2 etc.
Consider the second claim. By the first part, there is a map from N onto X ∪ Y .
Suppose without loss of generality that X is countable and α : N → X is one to one
and onto. Then define β (y) ≡ 1 for all y ∈ Y \ X, and β (x) ≡ $\alpha^{-1}(x)$ for all x ∈ X. Thus, β maps
X ∪ Y onto N and this shows there exist two onto maps, one mapping X ∪ Y onto
N and the other mapping N onto X ∪ Y . Then Corollary 1.5 yields the conclusion.
This proves the theorem.
Definition 1.10 [x] denotes the set of all elements of S which are equivalent to x
and [x] is called the equivalence class determined by x or just the equivalence class
of x.
With the above definition one can prove the following simple theorem.
Theorem 1.11 Let ∼ be an equivalence relation defined on a set, S and let H denote
the set of equivalence classes. Then if [x] and [y] are two of these equivalence classes,
either x ∼ y and [x] = [y] or it is not true that x ∼ y and [x] ∩ [y] = ∅.
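A small computation may make the theorem concrete. The sketch below partitions a finite set into equivalence classes; the relation used (congruence modulo 3 on the integers 0 through 9) and the function name are assumptions chosen purely for illustration.

```python
def equivalence_classes(elements, related):
    """Group elements into classes of the equivalence relation `related`.

    Because the relation is symmetric and transitive, it suffices to
    compare a new element with one representative of each class.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            # x is related to no existing class, so it starts a new one.
            classes.append({x})
    return classes

# Illustrative relation: x ~ y when x - y is a multiple of 3.
S = range(10)
H = equivalence_classes(S, lambda x, y: (x - y) % 3 == 0)
print(H)  # [{0, 3, 6, 9}, {1, 4, 7}, {2, 5, 8}]
```

As the theorem asserts, any two of the resulting classes are either equal or disjoint.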
x ≤ x for all x ∈ F.
If x ≤ y and y ≤ z then x ≤ z.
C ⊆ F is said to be a chain if every two elements of C are related. This means that
if x, y ∈ C, then either x ≤ y or y ≤ x. Sometimes a chain is called a totally ordered
set. C is said to be a maximal chain if whenever D is a chain containing C, D = C.
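For finite families of sets ordered by ⊆, the chain condition can be tested directly. This is a minimal sketch with a hypothetical helper name; Python's `<=` on sets is the subset relation.

```python
from itertools import combinations

def is_chain(family):
    """True if every two members of the family are related by inclusion."""
    return all(a <= b or b <= a for a, b in combinations(family, 2))

chain = [frozenset(), frozenset({1}), frozenset({1, 2})]
not_chain = [frozenset({1}), frozenset({2})]

print(is_chain(chain), is_chain(not_chain))  # True False
```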
The most common example of a partially ordered set is the power set of a given
set with ⊆ being the relation. It is also helpful to visualize partially ordered sets
as trees. Two points on the tree are related if they are on the same branch of
the tree and one is higher than the other. Thus two points on different branches
would not be related although they might both be larger than some point on the
trunk. You might think of many other things which are best considered as partially
ordered sets. Think of food for example. You might find it difficult to determine
which of two favorite pies you like better although you may be able to say very
easily that you would prefer either pie to a dish of lard topped with whipped cream
and mustard. The following theorem is equivalent to the axiom of choice. For a
discussion of this, see the appendix on the subject.
There is a theorem about the integral of a continuous function which requires the
notion of uniform continuity. This is discussed in this section. Consider the function
f (x) = 1/x for x ∈ (0, 1) . This is a continuous function because it is continuous at
every point of (0, 1) . However, for a given ε > 0, the δ needed in the ε, δ definition of
continuity becomes very small as x gets close to 0. The notion of uniform continuity
involves being able to choose a single δ which works on the whole domain of f. Here
is the definition.
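To see numerically how δ shrinks for f (x) = 1/x, note that a short calculation (not carried out in the text, so treat it as an assumption to verify) gives the largest admissible δ at the point x as εx²/(1 + εx). The sketch below evaluates this for a few points approaching 0.

```python
def largest_delta(x, eps):
    """Largest delta such that |1/y - 1/x| < eps whenever |y - x| < delta,
    for x > 0. The binding constraint comes from points y to the left of x."""
    return eps * x * x / (1 + eps * x)

eps = 0.1
xs = [0.5, 0.1, 0.01]
deltas = [largest_delta(x, eps) for x in xs]

for x, d in zip(xs, deltas):
    print(f"x = {x}: delta = {d:.3e}")
```

The printed values of δ decrease toward 0 as x does, so no single δ works on all of (0, 1), which is the failure of uniform continuity described above.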
Continuous Functions Of One Variable
the nested interval lemma there exists a point, c contained in all these intervals.
Furthermore,
\[ |x_{n_k} - c| < (b - a) 2^{-k} \]
and so $\lim_{k \to \infty} x_{n_k} = c \in [a, b]$. This proves the theorem.
Proof: If this is not true, there exists ε > 0 such that for every δ > 0 there exists
a pair of points, $x_\delta$ and $y_\delta$, such that even though $|x_\delta - y_\delta| < \delta$, $|f(x_\delta) - f(y_\delta)| \ge \varepsilon$.
Taking a succession of values for δ equal to 1, 1/2, 1/3, · · ·, and letting the exceptional
pair of points for δ = 1/n be denoted by $x_n$ and $y_n$,
\[ |x_n - y_n| < \frac{1}{n}, \quad |f(x_n) - f(y_n)| \ge \varepsilon. \]
Now since K is sequentially compact, there exists a subsequence, $\{x_{n_k}\}$, such that
$x_{n_k} \to z \in K$. Now $n_k \ge k$ and so
\[ |x_{n_k} - y_{n_k}| < \frac{1}{k}. \]
Consequently, $y_{n_k} \to z$ also. ($x_{n_k}$ is like a person walking toward a certain point
and $y_{n_k}$ is like a dog on a leash which is constantly getting shorter. Obviously $y_{n_k}$
must move toward the point also. You should give a precise proof of what is
needed here.) By continuity of f
2.1 Exercises
1. A function, f : D ⊆ R → R is Lipschitz continuous or just Lipschitz for short
if there exists a constant, K such that
|f (x) − f (y)| ≤ K |x − y|
Proof: First consider 1.) Let ε > 0 be given. By assumption, there exists
$\delta_1 > 0$ such that whenever $|x - y| < \delta_1$, it follows $|f(x) - f(y)| < \frac{\varepsilon}{2(|a|+|b|+1)}$ and
there exists $\delta_2 > 0$ such that whenever $|x - y| < \delta_2$, it follows that
$|g(x) - g(y)| < \frac{\varepsilon}{2(|a|+|b|+1)}$. Then let $0 < \delta \le \min(\delta_1, \delta_2)$. If |x − y| < δ, then everything happens
at once. Therefore, using the triangle inequality
Now consider 2.) There exists δ 1 > 0 such that if |y − x| < δ 1 , then
|f g (x) − f g (y)| ≤ |f (x) g (x) − g (x) f (y)| + |g (x) f (y) − f (y) g (y)|
where M is defined by
\[ M \equiv \frac{2}{|g(x)|^2} \left( 1 + 2|f(x)| + 2|g(x)| \right) \]
This proves part 4.) and completes the proof of the theorem.
Next here is a proof of the intermediate value theorem.
Theorem 2.7 Suppose f : [a, b] → R is continuous and suppose f (a) < c < f (b) .
Then there exists x ∈ (a, b) such that f (x) = c.
Proof: Let d = (a + b)/2 and consider the intervals [a, d] and [d, b] . If f (d) ≥ c,
then on [a, d] , the function is ≤ c at one end point and ≥ c at the other. On the
other hand, if f (d) ≤ c, then on [d, b] , f is ≥ c at one end point and ≤ c at the
other. Pick the interval on which f has values which are at least as large as c and
values no larger than c. Now consider that interval, divide it in half as was done for
the original interval and argue that on one of these smaller intervals, the function
has values at least as large as c and values no larger than c. Continue in this way.
Next apply the nested interval lemma to get x in all these intervals. In the nth
interval, let $x_n, y_n$ be elements of this interval such that $f(x_n) \le c$, $f(y_n) \ge c$.
Now $|x_n - x| \le (b - a) 2^{-n}$ and $|y_n - x| \le (b - a) 2^{-n}$ and so $x_n \to x$ and $y_n \to x$.
Therefore,
\[ f(x) - c = \lim_{n \to \infty} (f(x_n) - c) \le 0 \]
while
\[ f(x) - c = \lim_{n \to \infty} (f(y_n) - c) \ge 0. \]
Consequently f (x) = c and this proves the theorem.
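The interval halving in this proof is also a practical algorithm, usually called bisection. The following is a sketch under the hypotheses of the theorem (f continuous, f (a) < c < f (b)); the function name, the tolerance, and the sample function x³ + x are assumptions made for illustration.

```python
def bisect_ivt(f, a, b, c, tol=1e-12):
    """Approximate a point x in (a, b) with f(x) = c by repeated halving.

    Assumes f is continuous on [a, b] and f(a) < c < f(b).
    Invariant maintained throughout: f(lo) <= c <= f(hi).
    """
    lo, hi = a, b
    while hi - lo > tol:
        d = (lo + hi) / 2
        if f(d) >= c:
            hi = d  # keep [lo, d], where f is <= c at lo and >= c at d
        else:
            lo = d  # keep [d, hi], where f is <= c at d and >= c at hi
    return (lo + hi) / 2

# Illustrative use: solve x^3 + x = 1 on [0, 2].
root = bisect_ivt(lambda x: x**3 + x, 0.0, 2.0, 1.0)
print(root, root**3 + root)
```

After n halvings the interval has length (b − a) 2⁻ⁿ, matching the estimate used in the proof.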
Lemma 2.8 Let φ : [a, b] → R be a continuous function and suppose φ is 1 − 1 on
(a, b). Then φ is either strictly increasing or strictly decreasing on [a, b] .
Proof: First it is shown that φ is either strictly increasing or strictly decreasing
on (a, b) .
If φ is not strictly decreasing on (a, b), then there exists x1 < y1 , x1 , y1 ∈ (a, b)
such that
(φ (y1 ) − φ (x1 )) (y1 − x1 ) > 0.
If for some other pair of points, x2 < y2 with x2 , y2 ∈ (a, b) , the above inequality
does not hold, then since φ is 1 − 1,
(φ (y2 ) − φ (x2 )) (y2 − x2 ) < 0.
Let xt ≡ tx1 + (1 − t) x2 and yt ≡ ty1 + (1 − t) y2 . Then xt < yt for all t ∈ [0, 1]
because
tx1 ≤ ty1 and (1 − t) x2 ≤ (1 − t) y2
with strict inequality holding for at least one of these inequalities since not both t
and (1 − t) can equal zero. Now define
h (t) ≡ (φ (yt ) − φ (xt )) (yt − xt ) .
Since h is continuous and h (0) < 0, while h (1) > 0, there exists t ∈ (0, 1) such
that h (t) = 0. Therefore, both xt and yt are points of (a, b) and φ (yt ) − φ (xt ) = 0
contradicting the assumption that φ is one to one. It follows φ is either strictly
increasing or strictly decreasing on (a, b) .
This property of being either strictly increasing or strictly decreasing on (a, b)
carries over to [a, b] by the continuity of φ. Suppose φ is strictly increasing on (a, b) ,
a similar argument holding for φ strictly decreasing on (a, b) . If x > a, then pick
y ∈ (a, x) and from the above, φ (y) < φ (x) . Now by continuity of φ at a,
\[ \phi(a) = \lim_{z \to a+} \phi(z) \le \phi(y) < \phi(x). \]
Therefore, φ (a) < φ (x) whenever x ∈ (a, b) . Similarly φ (b) > φ (x) for all x ∈ (a, b).
This proves the lemma.
2.2 Theorems About Continuous Functions
Corollary 2.9 Let f : (a, b) → R be one to one and continuous. Then f (a, b) is
an open interval, (c, d) , and $f^{-1}$ : (c, d) → (a, b) is continuous.
it follows
\[ z \equiv f^{-1}(f(z)) \in (x - \eta, x + \eta) \subseteq (x - \varepsilon, x + \varepsilon) \]
so
\[ |f^{-1}(f(z)) - x| = |f^{-1}(f(z)) - f^{-1}(f(x))| < \varepsilon. \]
This proves the theorem in the case where f is strictly decreasing. The case where
f is increasing is similar.
The Riemann Stieltjes
Integral
The integral originated in attempts to find areas of various shapes and the ideas
involved in finding integrals are much older than the ideas related to finding derivatives.
In fact, Archimedes¹ was finding areas of various curved shapes about 250
B.C. using the main ideas of the integral. What is presented here is a generalization
of these ideas. The main interest is in the Riemann integral but if it is easy to
generalize to the so called Stieltjes integral in which the length of an interval, [x, y]
is replaced with an expression of the form F (y) − F (x) where F is an increasing
function, then the generalization is given. However, there is much more that can
be written about Stieltjes integrals than what is presented here. A good source for
this is the book by Apostol, [3].
which he knew the area of and taking a limit. He also made fundamental contributions to physics.
The story is told about how he determined that a gold smith had cheated the king by giving him
a crown which was not solid gold as had been claimed. He did this by finding the amount of water
displaced by the crown and comparing with the amount of water it should have displaced if it had
been solid gold.
Definition 3.1 Let F be an increasing function defined on [a, b] and let $\Delta F_i \equiv F(x_i) - F(x_{i-1})$. Then define upper and lower sums as
\[ U(f, P) \equiv \sum_{i=1}^{n} M_i(f) \Delta F_i \quad \text{and} \quad L(f, P) \equiv \sum_{i=1}^{n} m_i(f) \Delta F_i \]
respectively. The numbers, Mi (f ) and mi (f ) , are well defined real numbers because
f is assumed to be bounded and R is complete. Thus the set S = {f (x) : x ∈
[xi−1 , xi ]} is bounded above and below.
In the following picture, the sum of the areas of the rectangles in the picture on
the left is a lower sum for the function in the picture and the sum of the areas of the
rectangles in the picture on the right is an upper sum for the same function which
uses the same partition. In these pictures the function, F is given by F (x) = x and
these are the ordinary upper and lower sums from calculus.
[Figure: a lower sum (left) and an upper sum (right) for y = f (x) over the partition x0 , x1 , x2 , x3 .]
What happens when you add in more points in a partition? The following
pictures illustrate in the context of the above example. In this example a single
additional point, labeled z has been added in.
[Figure: the same lower and upper sums for y = f (x) after the point z is added between x2 and x3 .]
Note how the lower sum got larger by the amount of the area in the shaded
rectangle and the upper sum got smaller by the amount in the rectangle shaded by
dots. In general this is the way it works and this is shown in the following lemma.
Lemma 3.2 If P ⊆ Q then
U (f, Q) ≤ U (f, P ) , and L (f, P ) ≤ L (f, Q) .
3.1 Upper And Lower Riemann Stieltjes Sums
P = {x0 , · · ·, xn }
and let
Q = {x0 , · · ·, xk , y, xk+1 , · · ·, xn }.
Thus exactly one point, y, is added between xk and xk+1 . Now the term in the
upper sum which corresponds to the interval [xk , xk+1 ] in U (f, P ) is
and the term which corresponds to the interval [xk , xk+1 ] in U (f, Q) is
All the other terms in the two sums coincide. Now sup {f (x) : x ∈ [xk , xk+1 ]} ≥
max (M1 , M2 ) and so the expression in 3.2 is no larger than
the term corresponding to the interval, [xk , xk+1 ] and U (f, P ) . This proves the
first part of the lemma pertaining to upper sums because if Q ⊇ P, one can obtain
Q from P by adding in one point at a time and each time a point is added, the
corresponding upper sum either gets smaller or stays the same. The second part
about lower sums is similar and is left as an exercise.
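The effect of refining a partition can be observed numerically. The sketch below is restricted to an increasing function f, an assumption made so that the supremum and infimum on each subinterval are attained at the right and left end points; the function name and the sample data are likewise hypothetical.

```python
def upper_lower_sums(f, F, partition):
    """Return (U(f, P), L(f, P)) for an increasing function f.

    For increasing f, M_i(f) = f(x_i) and m_i(f) = f(x_{i-1}).
    F is the increasing integrator; Delta F_i = F(x_i) - F(x_{i-1}).
    """
    upper = lower = 0.0
    for left, right in zip(partition, partition[1:]):
        dF = F(right) - F(left)
        upper += f(right) * dF
        lower += f(left) * dF
    return upper, lower

f = lambda x: x * x   # increasing on [0, 1]
F = lambda x: x       # ordinary Riemann case
P = [0.0, 0.5, 1.0]
Q = [0.0, 0.25, 0.5, 1.0]   # Q contains P with one extra point

U_P, L_P = upper_lower_sums(f, F, P)
U_Q, L_Q = upper_lower_sums(f, F, Q)
print(U_P, L_P)  # 0.625 0.125
print(U_Q, L_Q)  # 0.578125 0.140625
```

Adding the point 0.25 lowers the upper sum and raises the lower sum, as the lemma predicts, and both pairs bracket the value 1/3 of the integral of x² over [0, 1].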
L (f, P ) ≤ U (f, Q) .
Definition 3.4
Theorem 3.5 $\underline{I} \le \overline{I}$.
because U (f, Q) is an upper bound to the set of all lower sums and so it is no
smaller than the least upper bound. Therefore, since Q is arbitrary,
where the inequality holds because it was just shown that $\underline{I}$ is a lower bound to the
set of all upper sums and so it is no larger than the greatest lower bound of this
set. This proves the theorem.
f ∈ R ([a, b])
if
\[ \overline{I} = \underline{I} \]
and in this case,
\[ \int_a^b f(x) \, dF \equiv \underline{I} = \overline{I}. \]
When F (x) = x, the integral is called the Riemann integral and is written as
\[ \int_a^b f(x) \, dx. \]
Thus, in words, the Riemann integral is the unique number which lies between
all upper sums and all lower sums if there is such a unique number.
Recall the following Proposition which comes from the definitions.
Proposition 3.7 Let S be a nonempty set and suppose sup (S) exists. Then for
every δ > 0,
\[ S \cap (\sup(S) - \delta, \sup(S)] \neq \emptyset. \]
If inf (S) exists, then for every δ > 0,
This proposition implies the following theorem which is used to determine the
question of Riemann Stieltjes integrability.
Theorem 3.8 A bounded function f is Riemann integrable if and only if for all
ε > 0, there exists a partition P such that
\[ U(f, P) - L(f, P) < \varepsilon. \quad (3.3) \]
Proof: First assume f is Riemann integrable. Then let P and Q be two parti-
tions such that
\[ U(f, Q) < \overline{I} + \varepsilon/2, \quad L(f, P) > \underline{I} - \varepsilon/2. \]
Then since $\overline{I} = \underline{I}$,
Now suppose that for all ε > 0 there exists a partition such that 3.3 holds. Then
for given ε and partition P corresponding to ε
\[ \overline{I} - \underline{I} \le U(f, P) - L(f, P) \le \varepsilon. \]
3.2 Exercises
1. Prove the second half of Lemma 3.2 about lower sums.
2. Verify that for f given in 3.4, the lower sums on the interval [0, 1] are all equal
to zero while the upper sums are all equal to one.
3. Let $f(x) = 1 + x^2$ for x ∈ [−1, 3] and let $P = \{-1, -\tfrac{1}{3}, 0, \tfrac{1}{2}, 1, 2\}$. Find
U (f, P ) and L (f, P ) for F (x) = x and for $F(x) = x^3$.
4. Show that if f ∈ R ([a, b]) for F (x) = x, then for every ε > 0 there exists a partition, {x0 , · · ·, xn },
such that for any $z_k \in [x_{k-1}, x_k]$,
\[ \left| \int_a^b f(x) \, dx - \sum_{k=1}^{n} f(z_k)(x_k - x_{k-1}) \right| < \varepsilon. \]
This sum, $\sum_{k=1}^{n} f(z_k)(x_k - x_{k-1})$, is called a Riemann sum and this exercise
shows that the Riemann integral can always be approximated by a Riemann
sum. For the general Riemann Stieltjes case, does anything change?
5. Let $P = \{1, 1\tfrac{1}{4}, 1\tfrac{1}{2}, 1\tfrac{3}{4}, 2\}$ and F (x) = x. Find upper and lower sums for the
function, f (x) = 1/x, using this partition. What does this tell you about ln (2)?
6. If f ∈ R ([a, b]) with F (x) = x and f is changed at finitely many points,
show the new function is also in R ([a, b]) . Is this still true for the general case
where F is only assumed to be an increasing function? Explain.
Also suppose that all partitions have the property that xk − xk−1 equals a
constant, (b − a) /n so the points in the partition are equally spaced, and
define the integral to be the number these right and left sums get close to as
n gets larger and larger. Show that for f given in 3.4, $\int_0^x f(t) \, dt = 1$ if x is
rational and $\int_0^x f(t) \, dt = 0$ if x is irrational. It turns out that the correct
answer should always equal zero for that function, regardless of whether x is
rational. This is shown when the Lebesgue integral is studied. This illustrates
why this method of defining the integral in terms of left and right sums is total
nonsense. Show that even though this is the case, it makes no difference if f
is continuous.
for some constant K. Then if f, g ∈ R ([a, b]) it follows that H ◦ (f, g) ∈ R ([a, b]) .
Proof: In the following claim, Mi (h) and mi (h) have the meanings assigned
above with respect to some partition of [a, b] for the function, h.
Claim: The following inequality holds.
and
H (f (x2 ) , g (x2 )) − η < mi (H ◦ (f, g)) .
Then
|Mi (H ◦ (f, g)) − mi (H ◦ (f, g))|
\[ < \sum_{i=1}^{n} K \left[ |M_i(f) - m_i(f)| + |M_i(g) - m_i(g)| \right] \Delta F_i < \varepsilon. \]
Since ε > 0 is arbitrary, this shows H ◦ (f, g) satisfies the Riemann criterion and
hence H ◦ (f, g) is Riemann integrable as claimed. This proves the theorem.
This theorem implies that if f, g are Riemann Stieltjes integrable, then so is
af + bg, |f | , f 2 , along with infinitely many other such continuous combinations of
Riemann Stieltjes integrable functions. For example, to see that |f | is Riemann
integrable, let H (a, b) = |a| . Clearly this function satisfies the conditions of the
above theorem and so |f | = H (f, f ) ∈ R ([a, b]) as claimed. The following theorem
gives an example of many functions which are Riemann integrable.
Thus the Riemann criterion is satisfied and so the function is Riemann Stieltjes
integrable. The proof for decreasing f is similar.
Corollary 3.11 Let [a, b] be a bounded closed interval and let φ : [a, b] → R be
Lipschitz continuous and suppose F is continuous. Then φ ∈ R ([a, b]) . Recall that
a function, φ, is Lipschitz continuous if there is a constant, K, such that for all
x, y,
|φ (x) − φ (y)| ≤ K |x − y| .
Proof: The first part of the conclusion of this lemma follows from Theorem 3.10
since the function φ (y) ≡ −y is Lipschitz continuous. Now choose P such that
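\[ \int_a^b -f(x) \, dF - L(-f, P) < \varepsilon. \]
Since $m_i(-f) = -M_i(f)$, it follows that $L(-f, P) = -\sum_{i=1}^{n} M_i(f) \Delta F_i$,
which implies
\[ \varepsilon > \int_a^b -f(x) \, dF + \sum_{i=1}^{n} M_i(f) \Delta F_i \ge \int_a^b -f(x) \, dF + \int_a^b f(x) \, dF. \]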
Proof: First note that by Theorem 3.9, αf + βg ∈ R ([a, b]) . To begin with,
consider the claim that if f, g ∈ R ([a, b]) then
\[ \int_a^b (f + g)(x) \, dF = \int_a^b f(x) \, dF + \int_a^b g(x) \, dF. \quad (3.7) \]
mi (f + g) ≥ mi (f ) + mi (g) , Mi (f + g) ≤ Mi (f ) + Mi (g) .
Therefore,
and
\[ \int_a^b f(x) \, dF + \int_a^b g(x) \, dF \in [L(f, P) + L(g, P), U(f, P) + U(g, P)]. \]
Therefore,
\[ \left| \int_a^b (f + g)(x) \, dF - \left( \int_a^b f(x) \, dF + \int_a^b g(x) \, dF \right) \right| \le \]
which shows that since ε is arbitrary, 3.8 holds. This proves the theorem.
Corollary 3.17 Let F be continuous and let [a, b] be a closed and bounded interval
and suppose that
a = y1 < y2 < · · · < yl = b
and that f is a bounded function defined on [a, b] which has the property that f is
either increasing on [yj , yj+1 ] or decreasing on [yj , yj+1 ] for j = 1, · · ·, l − 1. Then
f ∈ R ([a, b]) .
Definition 3.18 Let [a, b] be an interval and let f ∈ R ([a, b]) . Then
\[ \int_b^a f(x) \, dF \equiv -\int_a^b f(x) \, dF. \]
and so
\[ \int_a^a f(x) \, dF = 0. \]
Proof: This follows from Theorem 3.16 and Definition 3.18. For example, as-
sume
c ∈ (a, b) .
Then from Theorem 3.16,
\[ \int_a^c f(x) \, dF + \int_c^b f(x) \, dF = \int_a^b f(x) \, dF \]
The following properties of the integral have either been established or they
follow quickly from what has been shown so far.
and
\[ \int_a^b |f(x)| \, dF \ge -\int_a^b f(x) \, dF. \]
Therefore,
\[ \int_a^b |f(x)| \, dF \ge \left| \int_a^b f(x) \, dF \right|. \]
If b < a then the above inequality holds with a and b switched. This implies 3.15.
had been around for thousands of years and the derivative was by their time well known. However
the connection between these two ideas had not been fully made although Newton’s predecessor,
Isaac Barrow had made some progress in this direction.
Let f ∈ R ([a, b]) . Then by 3.10 f ∈ R ([a, x]) for each x ∈ [a, b] . The first version
of the fundamental theorem of calculus is a statement about the derivative of the
function
\[ x \to \int_a^x f(t) \, dt. \]
\[ F'(x) = f(x). \]
Let ε > 0 and let δ > 0 be small enough that if |t − x| < δ, then
Therefore, if |h| < δ, the above inequality and 3.11 shows that
\[ \left| h^{-1} (F(x + h) - F(x)) - f(x) \right| \le |h|^{-1} \varepsilon |h| = \varepsilon. \]
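The estimate above says the difference quotient of x → ∫ f (t) dt is close to f (x) for small h. A numerical sketch follows; the midpoint rule, the integrand cos, and the names `integral` and `area` are all assumptions chosen for illustration and are not part of the text.

```python
import math

def integral(f, a, x, n=20000):
    """Midpoint-rule approximation of the integral of f from a to x."""
    h = (x - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def area(x):
    """The function x -> integral of cos(t) dt from 0 to x."""
    return integral(math.cos, 0.0, x)

x, h = 1.0, 1e-3
diff_quot = (area(x + h) - area(x)) / h
print(diff_quot, math.cos(x))
```

The two printed numbers agree to about three decimal places, and the agreement improves as h is made smaller (until rounding and quadrature error take over).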
Theorem 3.21 Let f ∈ R ([a, b]) and suppose there exists an antiderivative for
f, G, such that
\[ G'(x) = f(x) \]
for every point of (a, b) and G is continuous on [a, b] . Then
\[ \int_a^b f(x) \, dx = G(b) - G(a). \quad (3.16) \]
Then
where zi is some point in [xi−1 , xi ] . It follows, since the above sum lies between the
upper and lower sums, that
and also
\[ \int_a^b f(x) \, dx \in [L(f, P), U(f, P)]. \]
Therefore,
\[ \left| G(b) - G(a) - \int_a^b f(x) \, dx \right| < U(f, P) - L(f, P) < \varepsilon. \]
but this is a hard theorem using the difficult result about uniform continuity.
Definition 3.22 Let f be a bounded function defined on a closed interval [a, b] and
let P ≡ {x0 , · · ·, xn } be a partition of the interval. Suppose zi ∈ [xi−1 , xi ] is chosen.
Then the sum
\[ \sum_{i=1}^{n} f(z_i)(x_i - x_{i-1}) \]
Proof: Choose P such that U (f, P ) − L (f, P ) < ε and then both $\int_a^b f(x) \, dx$
and $\sum_{k=1}^{n} f(z_k)(x_k - x_{k-1})$ are contained in [L (f, P ) , U (f, P )] and so the claimed
inequality must hold. This proves the proposition.
It is significant because it gives a way of approximating the integral.
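As an illustration of this approximation, the sketch below forms a Riemann sum for f (x) = 1/x on [1, 2] with randomly chosen points $z_k$ in each subinterval. The equally spaced partition, the random choice of tags, and the function name are assumptions for the example; the exact value of the integral is ln (2).

```python
import math
import random

def riemann_sum(f, partition, tags):
    """Sum of f(z_k) (x_k - x_{k-1}) with z_k taken from tags."""
    return sum(
        f(z) * (right - left)
        for left, right, z in zip(partition, partition[1:], tags)
    )

n = 1000
P = [1 + k / n for k in range(n + 1)]
random.seed(0)
# One arbitrary point z_k in each subinterval [x_{k-1}, x_k].
tags = [random.uniform(left, right) for left, right in zip(P, P[1:])]

approx = riemann_sum(lambda x: 1 / x, P, tags)
print(approx, math.log(2))
```

For this partition U (f, P ) − L (f, P ) = 1/(2n) = 0.0005, so whatever tags are chosen the Riemann sum is within 0.0005 of ln (2).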
The definition of Riemann integrability given in this chapter is also called Darboux
integrability and the integral defined as the unique number which lies between
all upper sums and all lower sums which is given in this chapter is called the Darboux
integral. The definition of the Riemann integral in terms of Riemann sums
is given next.
The number $\int_a^b f(x) \, dx$ is defined as I.
Thus, there are two definitions of the Riemann integral. It turns out they are
equivalent, which is the content of the following theorem of Darboux.
The proof of this theorem is left for the exercises in Problems 10 - 12. It isn’t
essential that you understand this theorem so if it does not interest you, leave it
out. Note that it implies that given a Riemann integrable function f in either sense,
it can be approximated by Riemann sums whenever ||P || is sufficiently small. Both
versions of the integral are obsolete but entirely adequate for most applications and
as a point of departure for a more up to date and satisfactory integral. The reason
for using the Darboux approach to the integral is that all the existence theorems
are easier to prove in this context.
3.6 Exercises
1. Let $F(x) = \int_{x^2}^{x^3} \frac{t^5 + 7}{t^7 + 87 t^6 + 1} \, dt$. Find $F'(x)$.
2. Let $F(x) = \int_2^x \frac{1}{1 + t^4} \, dt$. Sketch a graph of F and explain why it looks the way
it does.
4. Solve the following initial value problem from ordinary differential equations
which is to find a function y such that
\[ y'(x) = \frac{x^7 + 1}{x^6 + 97 x^5 + 7}, \quad y(10) = 5. \]
5. If $F, G \in \int f(x) \, dx$ for all x ∈ R, show F (x) = G (x) + C for some constant,
C. Use this to give a different proof of the fundamental theorem of calculus
which has for its conclusion $\int_a^b f(t) \, dt = G(b) - G(a)$ where $G'(x) = f(x)$.
7. Suppose f and g are continuous functions on [a, b] and that g (x) ≠ 0 on (a, b) .
Show there exists c ∈ (a, b) such that
\[ f(c) \int_a^b g(x) \, dx = \int_a^b f(x) g(x) \, dx. \]
Hint: Define $F(x) \equiv \int_a^x f(t) g(t) \, dt$ and let $G(x) \equiv \int_a^x g(t) \, dt$. Then use
the Cauchy mean value theorem on these two functions.
8. Consider the function
\[ f(x) \equiv \begin{cases} \sin\left(\frac{1}{x}\right) & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}. \]
Hint: Write the sum for U (f, P ) − L (f, P ) and split this sum into two sums,
the sum of terms for which [xi−1 , xi ] contains at least one point of Q, and
terms for which $[x_{i-1}, x_i]$ does not contain any points of Q. In the latter case,
$[x_{i-1}, x_i]$ must be contained in some interval, $[x^*_{k-1}, x^*_k]$. Therefore, the sum
of these terms should be no larger than |U (f, Q) − L (f, Q)| .
11. ↑ If ε > 0 is given and f is a Darboux integrable function defined on [a, b],
show there exists δ > 0 such that whenever ||P || < δ, then
This chapter contains some important linear algebra as distinguished from that
which is normally presented in undergraduate courses consisting mainly of uninteresting
things you can do with row operations.
The notation, Cn refers to the collection of ordered lists of n complex numbers.
Since every real number is also a complex number, this simply generalizes the usual
notion of Rn , the collection of all ordered lists of n real numbers. In order to avoid
worrying about whether it is real or complex numbers which are being referred to,
the symbol F will be used. If it is not clear, always pick C.
Fn ≡ {(x1 , · · ·, xn ) : xj ∈ F for j = 1, · · ·, n} .
(x1 , · · ·, xn ) ∈ Fn ,
it is conventional to denote (x1 , · · ·, xn ) by the single bold face letter, x. The num-
bers, xj are called the coordinates. The set
{(0, · · ·, 0, t, 0, · · ·, 0) : t ∈ F}
for t in the ith slot is called the ith coordinate axis. The point 0 ≡ (0, · · ·, 0) is
called the origin.
Thus $(1,2,4i) \in \mathbb{F}^3$ and $(2,1,4i) \in \mathbb{F}^3$ but $(1,2,4i) \neq (2,1,4i)$ because, even
though the same numbers are involved, they don't match up. In particular, the
first entries are not equal.
The geometric significance of Rn for n ≤ 3 has been encountered already in
calculus or in precalculus. Here is a short review. First consider the case when
n = 1. Then from the definition, R1 = R. Recall that R is identified with the
points of a line. Look at the number line again. Observe that this amounts to
52 SOME IMPORTANT LINEAR ALGEBRA
identifying a point on this line with a real number. In other words a real number
determines where you are on this line. Now suppose n = 2 and consider two lines
which intersect each other at right angles as shown in the following picture.
[Figure: a horizontal and a vertical axis meeting at right angles, with the points (2, 6) and (−8, 3) marked.]
Notice how you can identify a point shown in the plane with the ordered pair,
(2, 6) . You go to the right a distance of 2 and then up a distance of 6. Similarly,
you can identify another point in the plane with the ordered pair (−8, 3) . Go to
the left a distance of 8 and then up a distance of 3. The reason you go to the left
is that there is a − sign on the eight. From this reasoning, every ordered pair
determines a unique point in the plane. Conversely, taking a point in the plane,
you could draw two lines through the point, one vertical and the other horizontal
and determine unique points, x1 on the horizontal line in the above picture and x2
on the vertical line in the above picture, such that the point of interest is identified
with the ordered pair, (x1 , x2 ) . In short, points in the plane can be identified with
ordered pairs similar to the way that points on the real line are identified with
real numbers. Now suppose n = 3. As just explained, the first two coordinates
determine a point in a plane. Letting the third component determine how far up
or down you go, depending on whether this number is positive or negative, this
determines a point in space. Thus, (1, 4, −5) would mean to determine the point
in the plane that goes with (1, 4) and then to go below this plane a distance of 5
to obtain a unique point in space. You see that the ordered triples correspond to
points in space just as the ordered pairs correspond to points in a plane and single
real numbers correspond to points on a line.
You can’t stop here and say that you are only interested in n ≤ 3. What if you
were interested in the motion of two objects? You would need three coordinates
to describe where the first object is and you would need another three coordinates
to describe where the other object is located. Therefore, you would need to be
considering R6 . If the two objects moved around, you would need a time coordinate
as well. As another example, consider a hot object which is cooling and suppose
you want the temperature of this object. How many coordinates would be needed?
You would need one for the temperature, three for the position of the point in the
object and one more for the time. Thus you would need to be considering R5 .
Many other examples can be given. Sometimes n is very large. This is often the
case in applications to business when they are trying to maximize profit subject
to constraints. It also occurs in numerical analysis when people try to solve hard
problems on a computer.
There are other ways to identify points in space with three numbers but the one
presented is the most basic. In this case, the coordinates are known as Cartesian
coordinates after Descartes1 who invented this idea in the first half of the seven-
teenth century. I will often not bother to draw a distinction between the point in n
dimensional space and its Cartesian coordinates.
The geometric significance of Cn for n > 1 is not available because each copy of
C corresponds to the plane or R2 .
4.1 Algebra in Fn
There are two algebraic operations done with elements of Fn . One is addition and
the other is multiplication by numbers, called scalars. In the case of Cn the scalars
are complex numbers while in the case of Rn the only allowed scalars are real
numbers. Thus, the scalars always come from F in either case.
x + y = (x1 , · · ·, xn ) + (y1 , · · ·, yn )
≡ (x1 + y1 , · · ·, xn + yn ) (4.2)
With this definition, the algebraic properties satisfy the conclusions of the fol-
lowing theorem.
Theorem 4.3 For $v, w, z \in \mathbb{F}^n$ and $\alpha, \beta$ scalars (elements of $\mathbb{F}$), the following hold.
$$v + w = w + v, \tag{4.3}$$
$$(v + w) + z = v + (w + z), \tag{4.4}$$
$$v + 0 = v, \tag{4.5}$$
$$v + (-v) = 0, \tag{4.6}$$
$$\alpha(v + w) = \alpha v + \alpha w, \tag{4.7}$$
$$(\alpha + \beta)v = \alpha v + \beta v, \tag{4.8}$$
$$\alpha(\beta v) = \alpha\beta(v), \tag{4.9}$$
$$1v = v. \tag{4.10}$$
In the above $0 = (0,\cdots,0)$.
1 René Descartes 1596-1650 is often credited with inventing analytic geometry although it seems
the ideas were actually known much earlier. He was interested in many different subjects, physi-
ology, chemistry, and physics being some of them. He also wrote a large book in which he tried to
explain the book of Genesis scientifically. Descartes ended up dying in Sweden.
You should verify these properties all hold. For example, consider 4.7
α (v + w) = α (v1 + w1 , · · ·, vn + wn )
= (α (v1 + w1 ) , · · ·, α (vn + wn ))
= (αv1 + αw1 , · · ·, αvn + αwn )
= (αv1 , · · ·, αvn ) + (αw1 , · · ·, αwn )
= αv + αw.
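The componentwise verification above can also be spot checked numerically. The following is a minimal sketch, not part of the original text, which models elements of $\mathbb{F}^n$ as Python tuples of complex numbers. The helper names `add` and `scale` are hypothetical, and a check on one example is of course not a proof.

```python
# Elements of F^n modeled as tuples of (complex) numbers; illustrative only.

def add(x, y):
    """Componentwise sum of two vectors, as in (4.2)."""
    assert len(x) == len(y)
    return tuple(a + b for a, b in zip(x, y))

def scale(alpha, x):
    """Multiply every coordinate of x by the scalar alpha."""
    return tuple(alpha * a for a in x)

v = (1, 2, 4j)
w = (2, 1, 4j)
alpha = 3

# Property 4.7: alpha (v + w) = alpha v + alpha w
lhs = scale(alpha, add(v, w))
rhs = add(scale(alpha, v), scale(alpha, w))
distributive_ok = (lhs == rhs)

# Property 4.3: v + w = w + v
commutative_ok = (add(v, w) == add(w, v))
```

Integer-valued entries are used so that the comparisons are exact rather than subject to floating point rounding.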
4.2 Exercises
1. Verify all the properties 4.3-4.10.
(a) (1, 2)
(b) (−2, −2)
(c) (−2, 3)
(d) (2, −5)
(a) (1, 2, 0)
(b) (−2, −2, 1)
(c) (−2, 3, −2)
4.3. SUBSPACES SPANS AND BASES 55
where the ci are scalars. The set of all linear combinations of these vectors is
called span (x1 , · · ·, xn ) . If V ⊆ Fn , then V is called a subspace if whenever α, β
are scalars and u and v are vectors of V, it follows αu + βv ∈ V . That is, it is
“closed under the algebraic operations of vector addition and scalar multiplication”.
A linear combination of vectors is said to be trivial if all the scalars in the linear
combination equal zero. A set of vectors is said to be linearly independent if the
only linear combination of these vectors which equals the zero vector is the trivial
linear combination. Thus $\{x_1,\cdots,x_p\}$ is called linearly independent if whenever
$$\sum_{k=1}^{p} c_k x_k = 0$$
it follows that all the scalars, $c_k$ equal zero. A set of vectors, $\{x_1,\cdots,x_p\}$, is called
linearly dependent if it is not linearly independent. Thus the set of vectors is linearly
dependent if there exist scalars, $c_i$, $i = 1,\cdots,p$, not all zero such that $\sum_{k=1}^{p} c_k x_k = 0$.
then
$$0 = 1x_k + \sum_{j \neq k} (-c_j)x_j,$$
Not all of these scalars can equal zero because if this were the case, it would follow
that $x_1 = 0$ and so $\{x_1,\cdots,x_r\}$ would not be linearly independent. Indeed, if
$x_1 = 0$, $1x_1 + \sum_{i=2}^{r} 0x_i = x_1 = 0$ and so there would exist a nontrivial linear
combination of the vectors $\{x_1,\cdots,x_r\}$ which equals zero.
Say $c_k \neq 0$. Then solve (4.11) for $y_k$ and obtain
$$y_k \in \operatorname{span}\Bigl(x_1, \overbrace{y_1,\cdots,y_{k-1},y_{k+1},\cdots,y_s}^{s-1\text{ vectors here}}\Bigr).$$
Now replace the yk in the above with a linear combination of the vectors,
{x1 , z1 , · · ·, zs−1 }
to obtain
v ∈ span {x1 , z1 , · · ·, zs−1 } .
The vector yk , in the list {y1 , · · ·, ys } , has now been replaced with the vector x1
and the resulting modified list of vectors has the same span as the original list of
vectors, {y1 , · · ·, ys } .
Now suppose that r > s and that
span (x1 , · · ·, xl , z1 , · · ·, zp ) = V
where the vectors, z1 , · · ·, zp are each taken from the set, {y1 , · · ·, ys } and l + p = s.
This has now been done for l = 1 above. Then since r > s, it follows that l ≤ s < r
and so l + 1 ≤ r. Therefore, xl+1 is a vector not in the list, {x1 , · · ·, xl } and since
span {x1 , · · ·, xl , z1 , · · ·, zp } = V, there exist scalars, ci and dj such that
$$x_{l+1} = \sum_{i=1}^{l} c_i x_i + \sum_{j=1}^{p} d_j z_j. \tag{4.12}$$
Now not all the dj can equal zero because if this were so, it would follow that
{x1 , · · ·, xr } would be a linearly dependent set because one of the vectors would
equal a linear combination of the others. Therefore, (4.12) can be solved for one of
the zi , say zk , in terms of xl+1 and the other zi and just as in the above argument,
replace that zi with xl+1 to obtain
$$\operatorname{span}\Bigl(x_1,\cdots,x_l,x_{l+1},\overbrace{z_1,\cdots,z_{k-1},z_{k+1},\cdots,z_p}^{p-1\text{ vectors here}}\Bigr) = V.$$
span (x1 , · · ·, xs ) = V.
span (x1 , · · ·, xr ) = Fn
Proof: From the exchange theorem, r ≤ s and s ≤ r. Now note the vectors,
of hissing as in “The sixth shiek’s sixth sheep is sick”. This is the reason that bases is used instead
of basiss.
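Proof: Suppose $\alpha, \beta$ are two scalars and $\sum_{k=1}^{r} c_k v_k$ and $\sum_{k=1}^{r} d_k v_k$ are two
elements of $V$. What about
$$\alpha\sum_{k=1}^{r} c_k v_k + \beta\sum_{k=1}^{r} d_k v_k?$$
Is it also in $V$?
$$\alpha\sum_{k=1}^{r} c_k v_k + \beta\sum_{k=1}^{r} d_k v_k = \sum_{k=1}^{r} (\alpha c_k + \beta d_k)v_k \in V$$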
Proof: This follows immediately from the proof of Theorem 31.23. You do
exactly the same argument except you start with {v1 , · · ·, vr } rather than {v1 }.
It is also true that any spanning set of vectors can be restricted to obtain a
basis.
Proof: Let r be the smallest positive integer with the property that for some
set, {v1 · ··, vr } ⊆ {u1 · ··, up } ,
Then r ≤ p and it must be the case that {v1 · ··, vr } is linearly independent because
if it were not so, one of the vectors, say vk would be a linear combination of the
others. But then you could delete this vector from {v1 · ··, vr } and the resulting list
of r − 1 vectors would still span V contrary to the definition of r. This proves the
theorem.
Proof: First suppose A is one to one. Consider the vectors, {Ae1 , · · ·, Aen }
where ek is the column vector which is all zeros except for a 1 in the k th position.
This set of vectors is linearly independent because if
$$\sum_{k=1}^{n} c_k Ae_k = 0,$$
then $A\left(\sum_{k=1}^{n} c_k e_k\right) = 0$ and, since $A$ is one to one, $\sum_{k=1}^{n} c_k e_k = 0$,
which implies each $c_k = 0$. Therefore, $\{Ae_1,\cdots,Ae_n\}$ must be a basis for $\mathbb{F}^n$ because
if not there would exist a vector, $y \notin \operatorname{span}(Ae_1,\cdots,Ae_n)$ and then by Lemma 4.13,
$\{Ae_1,\cdots,Ae_n,y\}$ would be an independent set of vectors having $n+1$ vectors in it,
contrary to the exchange theorem. It follows that for $y \in \mathbb{F}^n$ there exist constants,
$c_i$ such that
$$y = \sum_{k=1}^{n} c_k Ae_k = A\left(\sum_{k=1}^{n} c_k e_k\right)$$
A (BA) − A = A (BA − I) = 0.
But this means (BA − I) x = 0 for all x since otherwise, A would not be one to
one. Hence BA = I as claimed. This proves the theorem.
This theorem shows that if an n×n matrix, B acts like an inverse when multiplied
on one side of A it follows that B = A−1 and it will act like an inverse on both sides
of A.
The conclusion of this theorem pertains to square matrices only. For example,
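let
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & -1 \end{pmatrix} \tag{4.13}$$
Then
$$BA = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
but
$$AB = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & -1 \\ 1 & 0 & 0 \end{pmatrix}.$$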
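The two products in this example are easy to recompute. The sketch below is illustrative only; it stores matrices as lists of rows and uses a hypothetical helper `matmul` to show that $BA$ is the $2 \times 2$ identity while $AB$ is not the $3 \times 3$ identity.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    assert len(X[0]) == len(Y)
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, 0],
     [0, 1],
     [1, 0]]           # 3 x 2
B = [[1, 0, 0],
     [1, 1, -1]]       # 2 x 3

BA = matmul(B, A)      # 2 x 2 product
AB = matmul(A, B)      # 3 x 3 product

identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
left_inverse_only = (BA == [[1, 0], [0, 1]]) and (AB != identity3)
```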
4.5. THE MATHEMATICAL THEORY OF DETERMINANTS 61
Lemma 4.18 There exists a unique function, sgnn which maps each list of n num-
bers from {1, · · ·, n} to one of the three numbers, 0, 1, or −1 which also has the
following properties.
sgnn (1, · · ·, n) = 1 (4.14)
$$(-1)^{n+1-\theta}\operatorname{sgn}_n(i_1,\cdots,i_{\theta-1},i_{\theta+1},\cdots,i_{n+1}).$$
It is necessary to verify this satisfies 4.14 and 4.15 with n replaced with n + 1. The
first of these is obviously true because
$$\operatorname{sgn}_{n+1}(1,\cdots,n,n+1) \equiv (-1)^{n+1-(n+1)}\operatorname{sgn}_n(1,\cdots,n) = 1.$$
If there are repeated numbers in (i1 , · · ·, in+1 ) , then it is obvious 4.15 holds because
both sides would equal zero from the above definition. It remains to verify 4.15 in
the case where there are no numbers repeated in (i1 , · · ·, in+1 ) . Consider
$$\operatorname{sgn}_{n+1}\Bigl(i_1,\cdots,\overset{r}{p},\cdots,\overset{s}{q},\cdots,i_{n+1}\Bigr),$$
where the r above the p indicates the number, p is in the rth position and the s
above the q indicates that the number, q is in the sth position. Suppose first that
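$r < \theta < s$. Then
$$\operatorname{sgn}_{n+1}\Bigl(i_1,\cdots,\overset{r}{p},\cdots,\overset{\theta}{n+1},\cdots,\overset{s}{q},\cdots,i_{n+1}\Bigr) \equiv (-1)^{n+1-\theta}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{r}{p},\cdots,\overset{s-1}{q},\cdots,i_{n+1}\Bigr)$$
while
$$\operatorname{sgn}_{n+1}\Bigl(i_1,\cdots,\overset{r}{q},\cdots,\overset{\theta}{n+1},\cdots,\overset{s}{p},\cdots,i_{n+1}\Bigr) = (-1)^{n+1-\theta}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{r}{q},\cdots,\overset{s-1}{p},\cdots,i_{n+1}\Bigr)$$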
and so, by induction, a switch of p and q introduces a minus sign in the result.
Similarly, if θ > s or if θ < r it also follows that 4.15 holds. The interesting case
is when θ = r or θ = s. Consider the case where θ = r and note the other case is
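entirely similar.
$$\operatorname{sgn}_{n+1}\Bigl(i_1,\cdots,\overset{r}{n+1},\cdots,\overset{s}{q},\cdots,i_{n+1}\Bigr) = (-1)^{n+1-r}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{s-1}{q},\cdots,i_{n+1}\Bigr) \tag{4.17}$$
while
$$\operatorname{sgn}_{n+1}\Bigl(i_1,\cdots,\overset{r}{q},\cdots,\overset{s}{n+1},\cdots,i_{n+1}\Bigr) = (-1)^{n+1-s}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{r}{q},\cdots,i_{n+1}\Bigr). \tag{4.18}$$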
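Therefore,
$$\begin{aligned}
\operatorname{sgn}_{n+1}\Bigl(i_1,\cdots,\overset{r}{n+1},\cdots,\overset{s}{q},\cdots,i_{n+1}\Bigr)
&= (-1)^{n+1-r}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{s-1}{q},\cdots,i_{n+1}\Bigr) \\
&= (-1)^{n+1-r}(-1)^{s-1-r}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{r}{q},\cdots,i_{n+1}\Bigr) \\
&= (-1)^{n+s}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{r}{q},\cdots,i_{n+1}\Bigr) \\
&= (-1)^{2s-1}(-1)^{n+1-s}\operatorname{sgn}_n\Bigl(i_1,\cdots,\overset{r}{q},\cdots,i_{n+1}\Bigr) \\
&= -\operatorname{sgn}_{n+1}\Bigl(i_1,\cdots,\overset{r}{q},\cdots,\overset{s}{n+1},\cdots,i_{n+1}\Bigr).
\end{aligned}$$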
This proves the existence of the desired function.
To see this function is unique, note that you can obtain any ordered list of
distinct numbers from a sequence of switches. If there exist two functions, f and
g both satisfying 4.14 and 4.15, you could start with f (1, · · ·, n) = g (1, · · ·, n)
and applying the same sequence of switches, eventually arrive at f (i1 , · · ·, in ) =
g (i1 , · · ·, in ) . If any numbers are repeated, then 4.15 gives both functions are equal
to zero for that ordered list. This proves the lemma.
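The properties singled out in the lemma can be tested on a computer for small $n$. The sketch below is an illustration under assumptions: it computes the sign by counting inversions, which is a swapped-in technique and not the inductive construction used in the proof, and it takes 4.15 to be the property discussed above that switching two entries of the list changes the sign. The names `sgn` and `swap` are hypothetical.

```python
from itertools import product

def sgn(lst):
    """Sign of an ordered list of numbers from {1, ..., n}.

    Returns 0 when some number is repeated, and otherwise
    (-1) ** (number of inversions). Inversion counting is an
    assumption of this sketch, not the construction in the proof.
    """
    if len(set(lst)) < len(lst):
        return 0
    inversions = sum(1
                     for i in range(len(lst))
                     for j in range(i + 1, len(lst))
                     if lst[i] > lst[j])
    return -1 if inversions % 2 else 1

def swap(lst, r, s):
    """Copy of lst with the entries in positions r and s switched."""
    out = list(lst)
    out[r], out[s] = out[s], out[r]
    return tuple(out)

n = 4

# Normalization 4.14: the list (1, ..., n) has sign 1.
normalized = (sgn(tuple(range(1, n + 1))) == 1)

# Switching two positions flips the sign, for every list of length n
# (lists with a repeat give 0 on both sides).
switch_ok = all(sgn(swap(lst, r, s)) == -sgn(lst)
                for lst in product(range(1, n + 1), repeat=n)
                for r in range(n)
                for s in range(r + 1, n))
```

By the uniqueness part of the lemma, any function passing these checks for all lists of length $n$ must agree with $\operatorname{sgn}_n$.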
In what follows sgn will often be used rather than sgnn because the context
supplies the appropriate n.
Definition 4.19 Let f be a real valued function which has the set of ordered lists
of numbers from {1, · · ·, n} as its domain. Define
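$$\sum_{(k_1,\cdots,k_n)} f(k_1 \cdots k_n)$$
to be the sum of all the $f(k_1 \cdots k_n)$ for all possible choices of ordered lists
$(k_1,\cdots,k_n)$ of numbers of $\{1,\cdots,n\}$.
For example,
$$\sum_{(k_1,k_2)} f(k_1,k_2) = f(1,2) + f(2,1) + f(1,1) + f(2,2).$$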
where the sum is taken over all ordered lists of numbers from {1, · · ·, n}. Note it
suffices to take the sum over only those ordered lists in which there are no repeats
because if there are, sgn (k1 , · · ·, kn ) = 0 and so that term contributes 0 to the sum.
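The remark about repeated entries can be seen directly in code. The following sketch, which is illustrative and not from the original text, evaluates $\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, a_{1k_1}\cdots a_{nk_n}$ over all ordered lists and skips those with a repeat, since they contribute 0. The function names are hypothetical, and `sgn` is implemented by inversion counting as an assumption.

```python
from itertools import product

def sgn(lst):
    """0 if an entry repeats, otherwise (-1) ** (number of inversions)."""
    if len(set(lst)) < len(lst):
        return 0
    inv = sum(1 for i in range(len(lst))
                for j in range(i + 1, len(lst)) if lst[i] > lst[j])
    return -1 if inv % 2 else 1

def det(A):
    """Sum over ordered lists (k1..kn) of sgn(k) * a_{1 k1} ... a_{n kn}."""
    n = len(A)
    total = 0
    for ks in product(range(1, n + 1), repeat=n):
        s = sgn(ks)
        if s == 0:
            continue                   # a repeated index contributes nothing
        term = s
        for row, k in enumerate(ks):
            term *= A[row][k - 1]      # lists are 1-based, Python is 0-based
        total += term
    return total

A = [[1, 2],
     [3, 4]]
B = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

det_A = det(A)
det_B = det(B)
```

The loop visits $n^n$ lists, of which only $n!$ contribute, so this is useful only for very small matrices.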
and
A (1, · · ·, n) = A.
$$= \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, a_{r_1k_1}\cdots a_{r_nk_n} \tag{4.20}$$
$$\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_r,\cdots,k_s,\cdots,k_n)\, a_{1k_1}\cdots a_{rk_r}\cdots a_{sk_s}\cdots a_{nk_n},$$
where it took $p$ switches to obtain $(r_1,\cdots,r_n)$ from $(1,\cdots,n)$. By Lemma 4.18, this
implies
$$\det(A(r_1,\cdots,r_n)) = (-1)^p\det(A) = \operatorname{sgn}(r_1,\cdots,r_n)\det(A)$$
and proves the proposition in the case when there are no repeated numbers in the
ordered list, $(r_1,\cdots,r_n)$. However, if there is a repeat, say the $r$th row equals the
$s$th row, then the reasoning of 4.22 - 4.23 shows that $\det(A(r_1,\cdots,r_n)) = 0$ and also
$\operatorname{sgn}(r_1,\cdots,r_n) = 0$ so the formula holds in this case also.
Observation 4.22 There are n! ordered lists of distinct numbers from {1, · · ·, n} .
To see this, consider n slots placed in order. There are n choices for the first
slot. For each of these choices, there are n − 1 choices for the second. Thus there
are n (n − 1) ways to fill the first two slots. Then for each of these ways there are
n − 2 choices left for the third slot. Continuing this way, there are n! ordered lists
of distinct numbers from {1, · · ·, n} as stated in the observation.
With the above, it is possible to give a more symmetric description of the de-
terminant from which it will follow that $\det(A) = \det(A^T)$.
$$\det(A) = \frac{1}{n!}\sum_{(r_1,\cdots,r_n)}\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(r_1,\cdots,r_n)\operatorname{sgn}(k_1,\cdots,k_n)\, a_{r_1k_1}\cdots a_{r_nk_n}. \tag{4.24}$$
And also $\det(A^T) = \det(A)$ where $A^T$ is the transpose of $A$. (Recall that for
$A^T = (a^T_{ij})$, $a^T_{ij} = a_{ji}$.)
Summing over all ordered lists, (r1 , · · ·, rn ) where the ri are distinct, (If the ri are
not distinct, sgn (r1 , · · ·, rn ) = 0 and so there is no contribution to the sum.)
$$n!\det(A) = \sum_{(r_1,\cdots,r_n)}\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(r_1,\cdots,r_n)\operatorname{sgn}(k_1,\cdots,k_n)\, a_{r_1k_1}\cdots a_{r_nk_n}.$$
This proves the corollary since the formula gives the same number for A as it does
for AT .
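As a sanity check on the symmetric formula 4.24, the sketch below compares it with the one-sided sum and with the determinant of the transpose for a small integer matrix. It is illustrative only: the function names are hypothetical, indices are 0-based, and only lists without repeats are summed since the others contribute 0.

```python
from itertools import permutations
from math import factorial

def sgn(p):
    """Sign of a list of distinct numbers, by counting inversions."""
    inv = sum(1 for i in range(len(p))
                for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def entry_product(A, rows, cols):
    """a_{r1 k1} * ... * a_{rn kn} for index lists rows and cols."""
    out = 1
    for r, k in zip(rows, cols):
        out *= A[r][k]
    return out

def det(A):
    """One-sided sum: rows in natural order, columns permuted."""
    n = len(A)
    return sum(sgn(k) * entry_product(A, range(n), k)
               for k in permutations(range(n)))

def det_symmetric(A):
    """Formula 4.24: double sum over row lists and column lists, over n!."""
    n = len(A)
    total = sum(sgn(r) * sgn(k) * entry_product(A, r, k)
                for r in permutations(range(n))
                for k in permutations(range(n)))
    return total // factorial(n)   # exact: total equals n! det(A)

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

same_value = (det(A) == det_symmetric(A) == det(transpose(A)))
```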
Corollary 4.24 If two rows or two columns in an n×n matrix, A, are switched, the
determinant of the resulting matrix equals (−1) times the determinant of the original
matrix. If A is an n × n matrix in which two rows are equal or two columns are
equal then det (A) = 0. Suppose the ith row of A equals (xa1 + yb1 , · · ·, xan + ybn ).
Then
det (A) = x det (A1 ) + y det (A2 )
where the ith row of A1 is (a1 , · · ·, an ) and the ith row of A2 is (b1 , · · ·, bn ) , all
other rows of A1 and A2 coinciding with those of A. In other words, det is a linear
function of each row A. The same is true with the word “row” replaced with the
word “column”.
Proof: By Proposition 4.21 when two rows are switched, the determinant of the
resulting matrix is (−1) times the determinant of the original matrix. By Corollary
4.23 the same holds for columns because the columns of the matrix equal the rows
of the transposed matrix. Thus if A1 is the matrix obtained from A by switching
two columns,
$$\det(A) = \det(A^T) = -\det(A_1^T) = -\det(A_1).$$
If A has two equal columns or two equal rows, then switching them results in the
same matrix. Therefore, det (A) = − det (A) and so det (A) = 0.
It remains to verify the last assertion.
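$$\begin{aligned}
\det(A) &\equiv \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, a_{1k_1}\cdots(xa_{k_i} + yb_{k_i})\cdots a_{nk_n} \\
&= x\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, a_{1k_1}\cdots a_{k_i}\cdots a_{nk_n} \\
&\quad + y\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, a_{1k_1}\cdots b_{k_i}\cdots a_{nk_n}
\end{aligned}$$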
By Corollary 4.24
$$\det(A) = \sum_{k=1}^{r} c_k\det\begin{pmatrix} a_1 & \cdots & a_r & \cdots & a_{n-1} & a_k \end{pmatrix} = 0.$$
The case for rows follows from the fact that $\det(A) = \det(A^T)$. This proves the
corollary.
Recall the following definition of matrix multiplication.
One of the most important rules about determinants is that the determinant of
a product equals the product of the determinants.
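$$\begin{aligned}
\det(AB) &= \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, c_{1k_1}\cdots c_{nk_n} \\
&= \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\left(\sum_{r_1} a_{1r_1}b_{r_1k_1}\right)\cdots\left(\sum_{r_n} a_{nr_n}b_{r_nk_n}\right) \\
&= \sum_{(r_1,\cdots,r_n)}\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, b_{r_1k_1}\cdots b_{r_nk_n}\,(a_{1r_1}\cdots a_{nr_n}) \\
&= \sum_{(r_1,\cdots,r_n)} \operatorname{sgn}(r_1\cdots r_n)\, a_{1r_1}\cdots a_{nr_n}\det(B) = \det(A)\det(B).
\end{aligned}$$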
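The product rule can be spot checked on small integer matrices. The sketch below is illustrative, with hypothetical helper names; it evaluates determinants from the permutation sum used in this section and confirms $\det(AB) = \det(A)\det(B)$ on two examples. A numerical check of this kind is not a substitute for the computation above.

```python
from itertools import permutations

def sgn(p):
    """Sign of a list of distinct numbers, by counting inversions."""
    inv = sum(1 for i in range(len(p))
                for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    """Sum over lists without repeats of sgn(k) a_{1 k1} ... a_{n kn}."""
    n = len(A)
    total = 0
    for k in permutations(range(n)):
        term = sgn(k)
        for row in range(n):
            term *= A[row][k[row]]
        total += term
    return total

def matmul(X, Y):
    """Entry (i, j) of the product is sum_r x_{ir} y_{rj}."""
    return [[sum(X[i][r] * Y[r][j] for r in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
D = [[1, 1, 0], [0, 2, 1], [3, 0, 1]]

rule_2x2 = (det(matmul(A, B)) == det(A) * det(B))
rule_3x3 = (det(matmul(C, D)) == det(C) * det(D))
```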
Letting $\theta$ denote the position of $n$ in the ordered list, $(k_1,\cdots,k_n)$ then using the
earlier conventions used to prove Lemma 4.18, $\det(M)$ equals
$$\sum_{(k_1,\cdots,k_n)} (-1)^{n-\theta}\operatorname{sgn}_{n-1}\Bigl(k_1,\cdots,k_{\theta-1},\overset{\theta}{k_{\theta+1}},\cdots,\overset{n-1}{k_n}\Bigr)\, m_{1k_1}\cdots m_{nk_n}$$
Now suppose 4.26. Then if $k_n \neq n$, the term involving $m_{nk_n}$ in the above expression
equals zero. Therefore, the only terms which survive are those for which $\theta = n$ or
in other words, those for which $k_n = n$. Therefore, the above expression reduces to
$$a\sum_{(k_1,\cdots,k_{n-1})} \operatorname{sgn}_{n-1}(k_1,\cdots,k_{n-1})\, m_{1k_1}\cdots m_{(n-1)k_{n-1}} = a\det(A).$$
To get the assertion in the situation of 4.25 use Corollary 4.23 and 4.26 to write
$$\det(M) = \det(M^T) = \det\begin{pmatrix} A^T & 0 \\ * & a \end{pmatrix} = a\det(A^T) = a\det(A).$$
This proves the lemma.
In terms of the theory of determinants, arguably the most important idea is
that of Laplace expansion along a row or a column. This will follow from the above
definition of a determinant.
Definition 4.30 Let $A = (a_{ij})$ be an $n \times n$ matrix. Then a new matrix called
the cofactor matrix, $\operatorname{cof}(A)$ is defined by $\operatorname{cof}(A) = (c_{ij})$ where to obtain $c_{ij}$ delete
the $i$th row and the $j$th column of $A$, take the determinant of the $(n-1) \times (n-1)$
matrix which results, (This is called the $ij$th minor of $A$.) and then multiply this
number by $(-1)^{i+j}$. To make the formulas easier to remember, $\operatorname{cof}(A)_{ij}$ will denote
the $ij$th entry of the cofactor matrix.
The following is the main result. Earlier this was given as a definition and the
outrageous totally unjustified assertion was made that the same number would be
obtained by expanding the determinant along any row or column. The following
theorem proves this assertion.
Theorem 4.31 Let $A$ be an $n \times n$ matrix where $n \geq 2$. Then
$$\det(A) = \sum_{j=1}^{n} a_{ij}\operatorname{cof}(A)_{ij} = \sum_{i=1}^{n} a_{ij}\operatorname{cof}(A)_{ij}. \tag{4.27}$$
The first formula consists of expanding the determinant along the ith row and the
second expands the determinant along the j th column.
Proof: Let (ai1 , · · ·, ain ) be the ith row of A. Let Bj be the matrix obtained
from A by leaving every row the same except the ith row which in Bj equals
(0, · · ·, 0, aij , 0, · · ·, 0) . Then by Corollary 4.24,
$$\det(A) = \sum_{j=1}^{n} \det(B_j)$$
Denote by $A^{ij}$ the $(n-1) \times (n-1)$ matrix obtained by deleting the $i$th row and
the $j$th column of $A$. Thus $\operatorname{cof}(A)_{ij} \equiv (-1)^{i+j}\det(A^{ij})$. At this point, recall that
from Proposition 4.21, when two rows or two columns in a matrix, $M$, are switched,
this results in multiplying the determinant of the old matrix by $-1$ to get the
determinant of the new matrix. Therefore, by Lemma 4.29,
$$\begin{aligned}
\det(B_j) &= (-1)^{n-j}(-1)^{n-i}\det\begin{pmatrix} A^{ij} & * \\ 0 & a_{ij} \end{pmatrix} \\
&= (-1)^{i+j}\det\begin{pmatrix} A^{ij} & * \\ 0 & a_{ij} \end{pmatrix} = a_{ij}\operatorname{cof}(A)_{ij}.
\end{aligned}$$
Therefore,
$$\det(A) = \sum_{j=1}^{n} a_{ij}\operatorname{cof}(A)_{ij}$$
which is the formula for expanding $\det(A)$ along the $i$th row. Also,
$$\det(A) = \det(A^T) = \sum_{j=1}^{n} a^T_{ij}\operatorname{cof}(A^T)_{ij} = \sum_{j=1}^{n} a_{ji}\operatorname{cof}(A)_{ji}$$
which is the formula for expanding det (A) along the ith column. This proves the
theorem.
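The statement that every row and every column gives the same number can be tried out on an example. In the sketch below, which is illustrative and uses hypothetical function names, `det` is computed recursively by expanding along the first row, and the expansions along each other row and column are then compared with it. Indices are 0-based, which does not change the parity of $i + j$.

```python
def minor(A, i, j):
    """A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[0][j] * cofactor(A, 0, j) for j in range(len(A)))

def cofactor(A, i, j):
    """(-1) ** (i + j) times the determinant of the ij-th minor."""
    return (-1) ** (i + j) * det(minor(A, i, j))

def expand_row(A, i):
    """Sum over j of a_ij cof(A)_ij."""
    return sum(A[i][j] * cofactor(A, i, j) for j in range(len(A)))

def expand_col(A, j):
    """Sum over i of a_ij cof(A)_ij."""
    return sum(A[i][j] * cofactor(A, i, j) for i in range(len(A)))

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

rows_agree = all(expand_row(A, i) == det(A) for i in range(3))
cols_agree = all(expand_col(A, j) == det(A) for j in range(3))
```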
Note that this gives an easy way to write a formula for the inverse of an n × n
matrix.
Theorem 4.32 $A^{-1}$ exists if and only if $\det(A) \neq 0$. If $\det(A) \neq 0$, then $A^{-1} = \left(a^{-1}_{ij}\right)$ where
$$a^{-1}_{ij} = \det(A)^{-1}\operatorname{cof}(A)_{ji}$$
for $\operatorname{cof}(A)_{ij}$ the $ij$th cofactor of $A$.
Now consider
$$\sum_{i=1}^{n} a_{ir}\operatorname{cof}(A)_{ik}\det(A)^{-1}$$
when $k \neq r$. Replace the $k$th column with the $r$th column to obtain a matrix, $B_k$
whose determinant equals zero by Corollary 4.24. However, expanding this matrix
along the $k$th column yields
$$0 = \det(B_k)\det(A)^{-1} = \sum_{i=1}^{n} a_{ir}\operatorname{cof}(A)_{ik}\det(A)^{-1}$$
Summarizing,
$$\sum_{i=1}^{n} a_{ir}\operatorname{cof}(A)_{ik}\det(A)^{-1} = \delta_{rk}.$$
This proves that if $\det(A) \neq 0$, then $A^{-1}$ exists with $A^{-1} = \left(a^{-1}_{ij}\right)$, where
$$a^{-1}_{ij} = \operatorname{cof}(A)_{ji}\det(A)^{-1}.$$
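The cofactor formula for the inverse translates directly into code. The following sketch is illustrative and its function names are hypothetical. It uses exact rational arithmetic from Python's `fractions` module so that the products with $A$ can be compared with the identity without rounding error. The formula is far too slow for large matrices; it is shown only to mirror the theorem.

```python
from fractions import Fraction

def minor(A, i, j):
    """A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def inverse(A):
    """Entry (i, j) of the inverse is cof(A)_{ji} / det(A); needs n >= 2."""
    n = len(A)
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0, so A has no inverse")
    return [[Fraction((-1) ** (i + j) * det(minor(A, j, i)), d)
             for j in range(n)]
            for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
A_inv = inverse(A)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

left_ok = (matmul(A_inv, A) == identity)
right_ok = (matmul(A, A_inv) == identity)
```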
Corollary 4.33 Let A be an n×n matrix and suppose there exists an n×n matrix,
B such that BA = I. Then A−1 exists and A−1 = B. Also, if there exists C an
n × n matrix such that AC = I, then A−1 exists and A−1 = C.
det B det A = 1
Theorem 4.37 If A has determinant rank, r, then there exist r rows of the matrix
such that every other row is a linear combination of these r rows.
Proof: Suppose the determinant rank of A = (aij ) equals r. If rows and columns
are interchanged, the determinant rank of the modified matrix is unchanged. Thus
rows and columns can be interchanged to produce an r × r matrix in the upper left
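corner of the matrix which has non zero determinant. Now consider the $(r+1) \times (r+1)$
matrix, $M$,
$$\begin{pmatrix}
a_{11} & \cdots & a_{1r} & a_{1p} \\
\vdots & & \vdots & \vdots \\
a_{r1} & \cdots & a_{rr} & a_{rp} \\
a_{l1} & \cdots & a_{lr} & a_{lp}
\end{pmatrix}$$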
where C will denote the r × r matrix in the upper left corner which has non zero
determinant. I claim det (M ) = 0.
There are two cases to consider in verifying this claim. First, suppose $p > r$.
Then the claim follows from the assumption that $A$ has determinant rank $r$. On the
other hand, if $p \leq r$, then the determinant is zero because there are two identical
columns. Expand the determinant along the last column and divide by $\det(C)$ to
obtain
$$a_{lp} = -\sum_{i=1}^{r} \frac{\operatorname{cof}(M)_{ip}}{\det(C)}\, a_{ip}.$$
Now note that $\operatorname{cof}(M)_{ip}$ does not depend on $p$. Therefore the above sum is of the
form
$$a_{lp} = \sum_{i=1}^{r} m_i a_{ip}$$
which shows the lth row is a linear combination of the first r rows of A. Since l is
arbitrary, this proves the theorem.
Proof: From Theorem 4.37, the row rank is no larger than the determinant
rank. Could the row rank be smaller than the determinant rank? If so, there exist
p rows for p < r such that the span of these p rows equals the row space. But this
implies that the r × r submatrix whose determinant is nonzero also has row rank
no larger than p which is impossible if its determinant is to be nonzero because at
least one row is a linear combination of the others.
Corollary 4.39 If A has determinant rank, r, then there exist r columns of the
matrix such that every other column is a linear combination of these r columns.
Also the column rank equals the determinant rank.
Proof: This follows from the above by considering AT . The rows of AT are the
columns of A and the determinant rank of AT and A are the same. Therefore, from
Corollary 4.38, column rank of A = row rank of AT = determinant rank of AT =
determinant rank of A.
The following theorem is of fundamental importance and ties together many of
the ideas presented above.
1. det (A) = 0.
2. A, AT are not one to one.
3. A is not onto.
Since also A0 = 0, it follows A is not one to one. Similarly, AT is not one to one
by the same argument applied to AT . This verifies that 1.) implies 2.).
Now suppose 2.). Then since $A^T$ is not one to one, it follows there exists $x \neq 0$
such that
$$A^T x = 0.$$
Taking the transpose of both sides yields
$$x^T A = 0$$
1. $\det(A) \neq 0$.
2. A and AT are one to one.
3. A is onto.
4.6 Exercises
1. Let m < n and let A be an m × n matrix. Show that A is not one to one.
Hint: Consider the n × n matrix, A1 which is of the form
$$A_1 \equiv \begin{pmatrix} A \\ 0 \end{pmatrix}$$
The explanation for the last term is that $A^0$ is interpreted as $I$, the identity matrix.
4.7. THE CAYLEY HAMILTON THEOREM 75
The Cayley Hamilton theorem states that every matrix satisfies its characteristic
equation, that equation defined by PA (t) = 0. It is one of the most important
theorems in linear algebra. The following lemma will help with its proof.
$$A_0 + A_1\lambda + \cdots + A_m\lambda^m = 0,$$
$$A_0 + A_1\lambda + \cdots + A_m\lambda^m = B_0 + B_1\lambda + \cdots + B_m\lambda^m$$
for all $|\lambda|$ large enough. Then $A_i = B_i$ for all $i$. Consequently if $\lambda$ is replaced by
any $n \times n$ matrix, the two sides will be equal. That is, for $C$ any $n \times n$ matrix,
$$A_0 + A_1 C + \cdots + A_m C^m = B_0 + B_1 C + \cdots + B_m C^m.$$
Theorem 4.45 Let A be an n × n matrix and let p (λ) ≡ det (λI − A) be the
characteristic polynomial. Then p (A) = 0.
Proof: Let $C(\lambda)$ equal the transpose of the cofactor matrix of $(\lambda I - A)$ for $|\lambda|$
large. (If $|\lambda|$ is large enough, then $\lambda$ cannot be in the finite list of eigenvalues of $A$
and so for such $\lambda$, $(\lambda I - A)^{-1}$ exists.) Therefore, by Theorem 4.32
$$C(\lambda) = p(\lambda)(\lambda I - A)^{-1}.$$
Note that each entry in $C(\lambda)$ is a polynomial in $\lambda$ having degree no more than $n-1$.
Therefore, collecting the terms,
$$C(\lambda) = C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}$$
for $C_j$ some $n \times n$ matrix. It follows that for all $|\lambda|$ large enough,
$$(\lambda I - A)\left(C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}\right) = p(\lambda) I$$
and so Corollary 4.44 may be used. It follows the matrix coefficients corresponding
to equal powers of $\lambda$ are equal on both sides of this equation. Therefore, if $\lambda$ is
replaced with $A$, the two sides will be equal. Thus
$$0 = (A - A)\left(C_0 + C_1 A + \cdots + C_{n-1}A^{n-1}\right) = p(A) I = p(A).$$
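The conclusion $p(A) = 0$ can be checked on small examples. The sketch below is illustrative and is not the argument of the proof: it obtains the coefficients of $\det(\lambda I - A)$ with the Faddeev-LeVerrier recursion, a swapped-in technique not mentioned in the text, and then evaluates the polynomial at the matrix $A$ using exact rational arithmetic. All function names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def char_poly(A):
    """Coefficients c[0..n] with det(tI - A) = sum_k c[k] t**k.

    Faddeev-LeVerrier recursion (an assumption of this sketch):
    M_k = A M_{k-1} + c[n-k+1] I and c[n-k] = -trace(A M_k) / k.
    """
    n = len(A)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        AMk = matmul(A, M)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return c

def poly_at_matrix(c, A):
    """Evaluate c[0] I + c[1] A + ... + c[n] A**n by Horner's rule."""
    n = len(A)
    P = [[c[-1] if i == j else 0 for j in range(n)] for i in range(n)]
    for coeff in reversed(c[:-1]):
        P = matmul(P, A)
        P = [[P[i][j] + (coeff if i == j else 0)
              for j in range(n)] for i in range(n)]
    return P

A = [[1, 2], [3, 4]]
B = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]

p_A = char_poly(A)                       # t**2 - 5 t - 2
zero_A = poly_at_matrix(p_A, A)
zero_B = poly_at_matrix(char_poly(B), B)
```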
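$$\frac{1}{a_{i_1} + b_{j_1}}\,\frac{1}{a_{i_2} + b_{j_2}}\cdots\frac{1}{a_{i_n} + b_{j_n}}.$$
This equals
$$\frac{1}{n!}\sum_{i_1\cdots i_n,\,j_1\cdots j_n} \operatorname{sgn}(i_1\cdots i_n)\operatorname{sgn}(j_1\cdots j_n)\prod_{(i,j)\notin\{(i_1,j_1),(i_2,j_2)\cdots,(i_n,j_n)\}} (a_i + b_j)$$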
4.9. BLOCK MULTIPLICATION OF MATRICES 77
where you can assume the $i_k$ are all distinct and the $j_k$ are also all distinct because
otherwise sgn will produce a 0. Therefore, in
$$\prod_{(i,j)\notin\{(i_1,j_1),(i_2,j_2)\cdots,(i_n,j_n)\}} (a_i + b_j),$$
there are exactly $n-1$ factors which contain $a_k$ for each $k$ and similarly, there are
exactly $n-1$ factors which contain $b_k$ for each $k$. Therefore, the left side of 4.28 is
of the form
$$d\,a_1^{n-1}a_2^{n-1}\cdots a_n^{n-1}b_1^{n-1}\cdots b_n^{n-1}$$
and it remains to verify that $c = d$. Using the properties of determinants, the left
side of 4.28 is of the form
$$\prod_{i\neq j}(a_i + b_j)\begin{vmatrix}
1 & \frac{a_1+b_1}{a_1+b_2} & \cdots & \frac{a_1+b_1}{a_1+b_n} \\
\frac{a_2+b_2}{a_2+b_1} & 1 & \cdots & \frac{a_2+b_2}{a_2+b_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{a_n+b_n}{a_n+b_1} & \frac{a_n+b_n}{a_n+b_2} & \cdots & 1
\end{vmatrix}$$
Let $a_k \to -b_k$. Then this converges to $\prod_{i\neq j}(-b_i + b_j)$. The right side of 4.28
converges to
$$\prod_{j<i}(-b_i + b_j)(b_i - b_j) = \prod_{i\neq j}(-b_i + b_j).$$
where $A_{ij}$ is a $s_i \times p_j$ matrix where $s_i$ does not depend on $j$ and $p_j$ does not depend
on $i$. Such a matrix is called a block matrix. Let $n = \sum_j p_j$ and $k = \sum_i s_i$ so $A$
is a $k \times n$ matrix. What is $Ax$ where $x \in \mathbb{F}^n$? From the process of multiplying a
matrix times a vector, the following lemma follows.
By Lemma 4.47, this shows that (BA) x equals the block matrix whose ij th entry is
given by 4.31 times x. Since x is an arbitrary vector in Fn , this proves the theorem.
The message of this theorem is that you can formally multiply block matrices as
though the blocks were numbers. You just have to pay attention to the preservation
of order.
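The message of the theorem can be illustrated on a small example. The sketch below is an illustration under assumptions: it takes the $ij$th block of the product $BA$ to be $\sum_s B_{is}A_{sj}$, with each block product written in that order, and compares the reassembled result with the ordinary product. The helper names `split`, `join` and `block_matmul` are hypothetical, and the column partition of $B$ must match the row partition of $A$.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M, row_sizes, col_sizes):
    """Partition M into blocks; returns a list of lists of submatrices."""
    out, r0 = [], 0
    for rs in row_sizes:
        row_blocks, c0 = [], 0
        for cs in col_sizes:
            row_blocks.append([row[c0:c0 + cs] for row in M[r0:r0 + rs]])
            c0 += cs
        out.append(row_blocks)
        r0 += rs
    return out

def join(blocks):
    """Reassemble a block matrix into an ordinary list of rows."""
    rows = []
    for row_blocks in blocks:
        for i in range(len(row_blocks[0])):
            rows.append([x for blk in row_blocks for x in blk[i]])
    return rows

def block_matmul(Bb, Ab):
    """Block (i, j) of the product is sum_s B_is A_sj, order preserved."""
    result = []
    for i in range(len(Bb)):
        row = []
        for j in range(len(Ab[0])):
            acc = matmul(Bb[i][0], Ab[0][j])
            for s in range(1, len(Ab)):
                acc = matadd(acc, matmul(Bb[i][s], Ab[s][j]))
            row.append(acc)
        result.append(row)
    return result

B = [[1, 2, 0, 1],
     [0, 1, 3, -1],
     [2, 0, 1, 4]]                 # 3 x 4
A = [[1, 0, 2],
     [3, 1, 0],
     [0, -1, 1],
     [2, 2, 5]]                    # 4 x 3

Bb = split(B, row_sizes=(1, 2), col_sizes=(2, 2))
Ab = split(A, row_sizes=(2, 2), col_sizes=(1, 2))
blockwise = join(block_matmul(Bb, Ab))
direct = matmul(B, A)
```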
This simple idea of block multiplication turns out to be very useful later. For now
here is an interesting and significant application. In this theorem, pM (t) denotes
the polynomial, det (tI − M ) . Thus the zeros of this polynomial are the eigenvalues
of the matrix, M .
Theorem 4.49 Let A be an m × n matrix and let B be an n × m matrix for m ≤ n.
Then
$$p_{BA}(t) = t^{n-m}p_{AB}(t),$$
so the eigenvalues of BA and AB are the same including multiplicities except that
BA has n − m extra zero eigenvalues.
and so $\det(tI - BA) = p_{BA}(t) = t^{n-m}\det(tI - AB) = t^{n-m}p_{AB}(t)$. This proves
the theorem.
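The identity $p_{BA}(t) = t^{n-m}p_{AB}(t)$ can be tested numerically. In the sketch below, which is illustrative and uses hypothetical names, both sides are polynomials in $t$ of degree $n = 3$, so agreement at more than three values of $t$ forces them to be the same polynomial. Determinants are computed by expansion along the first row.

```python
def minor(M, i, j):
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j))
               for j in range(len(M)))

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def char_poly_at(M, t):
    """The number det(tI - M) for a given number t."""
    n = len(M)
    return det([[(t if i == j else 0) - M[i][j] for j in range(n)]
                for i in range(n)])

m, n = 2, 3
A = [[1, 2, 0],
     [0, 1, -1]]        # m x n
B = [[1, 0],
     [2, 1],
     [0, 3]]            # n x m

AB = matmul(A, B)       # m x m
BA = matmul(B, A)       # n x n

agree = all(char_poly_at(BA, t) == t ** (n - m) * char_poly_at(AB, t)
            for t in range(-3, 4))
```

Here $BA$ is $3 \times 3$ of rank at most 2, so it has the extra zero eigenvalue predicted by the theorem.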
4.10 Exercises
1. Show that matrix multiplication is associative. That is, (AB) C = A (BC) .
2. Show the inverse of a matrix, if it exists, is unique. Thus if AB = BA = I,
then B = A−1 .
3. In the proof of Theorem 4.32 it was claimed that det (I) = 1. Here I = (δ ij ) .
Prove this assertion. Also prove Corollary 4.35.
4. Let v1 , ···, vn be vectors in Fn and let M (v1 , · · ·, vn ) denote the matrix whose
ith column equals vi . Define
and
$$d(e_1,\cdots,e_n) = 1 \tag{4.33}$$
where here $e_j$ is the vector in $\mathbb{F}^n$ which has a zero in every position except
the $j$th position in which it has a one.
5. Suppose f : Fn × · · · × Fn → F satisfies 4.32 and 4.33 and is linear in each
variable. Show that f = d.
6. Show that if you replace a row (column) of an n×n matrix A with itself added
to some multiple of another row (column) then the new matrix has the same
determinant as the original one.
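7. If $A = (a_{ij})$, show $\det(A) = \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\, a_{k_1 1}\cdots a_{k_n n}$.
8. Use the result of Problem 6 to evaluate by hand the determinant
$$\det\begin{pmatrix}
1 & 2 & 3 & 2 \\
-6 & 3 & 2 & 3 \\
5 & 2 & 2 & 3 \\
3 & 4 & 6 & 4
\end{pmatrix}.$$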
10. Let $Ly = y^{(n)} + a_{n-1}(x)y^{(n-1)} + \cdots + a_1(x)y' + a_0(x)y$ where the $a_i$ are given
continuous functions defined on a closed interval, $(a,b)$ and $y$ is some function
which has $n$ derivatives so it makes sense to write $Ly$. Suppose $Ly_k = 0$ for
$k = 1, 2, \cdots, n$. The Wronskian of these functions, $y_i$ is defined as
$$W(y_1,\cdots,y_n)(x) \equiv \det\begin{pmatrix}
y_1(x) & \cdots & y_n(x) \\
y_1'(x) & \cdots & y_n'(x) \\
\vdots & & \vdots \\
y_1^{(n-1)}(x) & \cdots & y_n^{(n-1)}(x)
\end{pmatrix}$$
$$t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0$$
{u1 , · · ·, un }
where the denominator is not equal to zero because the xj form a basis and so
$$x_{k+1} \notin \operatorname{span}(x_1,\cdots,x_k) = \operatorname{span}(u_1,\cdots,u_k)$$
Thus by induction,
Also, xk+1 ∈ span (u1 , · · ·, uk , uk+1 ) which is seen easily by solving 4.34 for xk+1
and it follows
If l ≤ k,
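$$\begin{aligned}
(u_{k+1}\cdot u_l) &= C\Bigl((x_{k+1}\cdot u_l) - \sum_{j=1}^{k}(x_{k+1}\cdot u_j)(u_j\cdot u_l)\Bigr) \\
&= C\Bigl((x_{k+1}\cdot u_l) - \sum_{j=1}^{k}(x_{k+1}\cdot u_j)\delta_{lj}\Bigr) \\
&= C\bigl((x_{k+1}\cdot u_l) - (x_{k+1}\cdot u_l)\bigr) = 0.
\end{aligned}$$
The vectors, $\{u_j\}_{j=1}^{n}$, generated in this way are therefore an orthonormal basis
because each vector has unit length.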
The process by which these vectors were generated is called the Gram Schmidt
process. Recall the following definition.
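A small numerical version of the Gram Schmidt process may help fix the idea. The sketch below is illustrative only and makes assumptions: it treats real vectors with the usual dot product, and it takes each new vector to be $u_{k+1} = C\bigl(x_{k+1} - \sum_{j=1}^{k}(x_{k+1}\cdot u_j)u_j\bigr)$ with $C$ chosen to give unit length, which is the form suggested by the computation above. The names `dot` and `gram_schmidt` are hypothetical.

```python
import math

def dot(x, y):
    """Dot product of two real vectors."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(xs):
    """Orthonormal vectors u_1, ..., u_n spanning the same subspaces as xs."""
    us = []
    for x in xs:
        # Subtract the components of x along the u_j found so far.
        w = list(x)
        for u in us:
            c = dot(x, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("input vectors are not linearly independent")
        us.append([wi / norm for wi in w])   # C = 1 / norm
    return us

xs = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
us = gram_schmidt(xs)

# (u_i . u_j) should be 1 when i = j and 0 otherwise, up to rounding.
orthonormal = all(abs(dot(us[i], us[j]) - (1.0 if i == j else 0.0)) < 1e-9
                  for i in range(3) for j in range(3))
```

Floating point arithmetic is used here, so orthonormality holds only up to rounding error.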
Proof: Let v1 be a unit eigenvector for A . Then there exists λ1 such that
Av1 = λ1 v1 , |v1 | = 1.
Extend {v1 } to a basis and then use Lemma 4.50 to obtain {v1 , · · ·, vn }, an or-
thonormal basis in Fn . Let U0 be a matrix whose ith column is vi . Then from the
above, it follows U0 is unitary. Then U0∗ AU0 is of the form
\[
\begin{pmatrix} \lambda_1 & * & \cdots & * \\ 0 & & & \\ \vdots & & A_1 & \\ 0 & & & \end{pmatrix}
\]
where A1 is a real n − 1 × n − 1 matrix. This is just like the proof of Theorem 4.52
up to this point.
Now in case $\lambda_1 = \alpha + i\beta$, it follows since A is real that $v_1 = z_1 + i w_1$ and
that $\overline{v_1} = z_1 - i w_1$ is an eigenvector for the eigenvalue, $\alpha - i\beta$. Here $z_1$ and $w_1$
are real vectors. It is clear that $\{z_1, w_1\}$ is an independent set of vectors in $R^n$.
Indeed, $\{v_1, \overline{v_1}\}$ is an independent set and it follows $\operatorname{span}(v_1, \overline{v_1}) = \operatorname{span}(z_1, w_1)$.
Now using the Gram Schmidt theorem in $R^n$, there exists $\{u_1, u_2\}$, an orthonormal
set of real vectors such that $\operatorname{span}(u_1, u_2) = \operatorname{span}(v_1, \overline{v_1})$. Now let $\{u_1, u_2, \cdots, u_n\}$
be an orthonormal basis in $R^n$ and let $Q_0$ be a unitary matrix whose ith column
is $u_i$. Then $A u_j$ are both in $\operatorname{span}(u_1, u_2)$ for j = 1, 2 and so $u_k^T A u_j = 0$ whenever
$k \ge 3$. It follows that $Q_0^* A Q_0$ is of the form
\[
\begin{pmatrix} * & * & \cdots & * \\ * & * & & \\ 0 & & & \\ \vdots & & A_1 & \\ 0 & & & \end{pmatrix}
\]
where $A_1$ is now an n − 2 × n − 2 matrix. In this case, find $\widetilde{Q}_1$ an n − 2 × n − 2
matrix to put $A_1$ in an appropriate form as above and come up with $A_2$ either an
n − 4 × n − 4 matrix or an n − 3 × n − 3 matrix. Then the only other difference is
to let
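\[
Q_1 = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & & & \\ \vdots & \vdots & & \widetilde{Q}_1 & \\ 0 & 0 & & & \end{pmatrix}
\]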
thus putting a 2 × 2 identity matrix in the upper left corner rather than a one.
Repeating this process with the above modification for the case of a complex eigen-
value leads eventually to 4.36 where Q is the product of real unitary matrices Qi
above. Finally,
\[
\lambda I - T = \begin{pmatrix} \lambda I_1 - P_1 & \cdots & * \\ & \ddots & \vdots \\ 0 & & \lambda I_r - P_r \end{pmatrix}
\]
where $I_k$ is the 2 × 2 identity matrix in the case that $P_k$ is 2 × 2 and is the number
1 in the case where $P_k$ is a 1 × 1 matrix. Now, it follows that
$\det(\lambda I - T) = \prod_{k=1}^{r} \det(\lambda I_k - P_k)$. Therefore, λ is an eigenvalue of T if and only if it is an
eigenvalue of some $P_k$. This proves the theorem since the eigenvalues of T are the same
as those of A because they have the same characteristic polynomial due to the
similarity of A and T.
The next lemma is the basis for concluding that every normal matrix is unitarily
similar to a diagonal matrix.
Now use the fact that T is upper triangular and let i = k = 1 to obtain the following
from the above.
\[ \sum_j |t_{1j}|^2 = \sum_j |t_{j1}|^2 = |t_{11}|^2 \]
You see, tj1 = 0 unless j = 1 due to the assumption that T is upper triangular.
This shows T is of the form
\[
\begin{pmatrix} * & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & * \end{pmatrix}
\]
Now do the same thing only this time take i = k = 2 and use the result just
established. Thus, from the above,
\[ \sum_j |t_{2j}|^2 = \sum_j |t_{j2}|^2 = |t_{22}|^2 , \]
Theorem 4.56 Let A be a normal matrix. Then there exists a unitary matrix, U
such that U ∗ AU is a diagonal matrix.
Proof: From Theorem 4.52 there exists a unitary matrix, U such that U ∗ AU
equals an upper triangular matrix. The theorem is now proved if it is shown that
the property of being normal is preserved under unitary similarity transformations.
That is, verify that if A is normal and if B = U ∗ AU, then B is also normal. But
this is easy.
\[
\begin{aligned}
B^* B &= U^* A^* U U^* A U = U^* A^* A U \\
&= U^* A A^* U = U^* A U U^* A^* U = B B^* .
\end{aligned}
\]
Corollary 4.57 If A is Hermitian, then all the eigenvalues of A are real and there
exists an orthonormal basis of eigenvectors.
where the entries denote the columns of AU and U D respectively. Therefore, Aui =
λi ui and since the matrix is unitary, the ij th entry of U ∗ U equals δ ij and so
\[ \delta_{ij} = \overline{u_i}^{T} u_j = \overline{u_i^{T}\, \overline{u_j}} = \overline{u_i \cdot u_j} . \]
This proves the corollary because it shows the vectors {ui } form an orthonormal
basis.
F = RU, U = U ∗ ,
U 2 = F ∗ F, R∗ R = I,
(F ∗ F x, x) = (F x, F x) ≥ 0.
{U x1 , · · ·, U xr }
Let
{U x1 , · · ·, U xr , yr+1 , · · ·, yn }
be an orthonormal basis for Fn and let
$\{F x_1, \cdots, F x_r, z_{r+1}, \cdots, z_n, \cdots, z_m\}$
Define
\[
R\Big( \sum_{k=1}^{r} a_k U x_k + \sum_{j=r+1}^{n} b_j y_j \Big) \equiv \sum_{k=1}^{r} a_k F x_k + \sum_{j=r+1}^{n} b_j z_j
\]
Then since
$\{U x_1, \cdots, U x_r, y_{r+1}, \cdots, y_n\}$
and
$\{F x_1, \cdots, F x_r, z_{r+1}, \cdots, z_n, \cdots, z_m\}$
are orthonormal,
\[
\begin{aligned}
\Big| R\Big( \sum_{k=1}^{r} a_k U x_k + \sum_{j=r+1}^{n} b_j y_j \Big) \Big|^2
&= \Big| \sum_{k=1}^{r} a_k F x_k + \sum_{j=r+1}^{n} b_j z_j \Big|^2 \\
&= \sum_{k=1}^{r} |a_k|^2 + \sum_{j=r+1}^{n} |b_j|^2 \\
&= \Big| \sum_{k=1}^{r} a_k U x_k + \sum_{j=r+1}^{n} b_j y_j \Big|^2 .
\end{aligned}
\]
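Letting $x \in F^n$,
\[ U x = \sum_{k=1}^{r} a_k U x_k \tag{4.38} \]
Therefore,
\[
\begin{aligned}
\Big| F\Big( \sum_{k=1}^{r} a_k x_k - x \Big) \Big|^2
&= \Big( F\Big( \sum_{k=1}^{r} a_k x_k - x \Big),\ F\Big( \sum_{k=1}^{r} a_k x_k - x \Big) \Big) \\
&= \Big( F^* F\Big( \sum_{k=1}^{r} a_k x_k - x \Big),\ \Big( \sum_{k=1}^{r} a_k x_k - x \Big) \Big) = 0
\end{aligned}
\]
and so $F\big( \sum_{k=1}^{r} a_k x_k \big) = F(x)$ as hoped. Thus RU = F on $F^n$.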
5.1.1 Distance In Fn
It is necessary to give a generalization of the dot product for vectors in Cn . This
definition reduces to the usual one in the case the components of the vector are real.
Definition 5.1 Let $x, y \in C^n$. Thus $x = (x_1, \cdots, x_n)$ where each $x_k \in C$ and a
similar formula holding for y. Then the dot product of these two vectors is defined
to be
\[ x \cdot y \equiv \sum_j x_j \overline{y_j} \equiv x_1 \overline{y_1} + \cdots + x_n \overline{y_n} . \]
Notice how you put the conjugate on the entries of the vector, y. It makes no
difference if the vectors happen to be real vectors but with complex vectors you
must do it this way. The reason for this is that when you take the dot product of a
vector with itself, you want to get the square of the length of the vector, a positive
number. Placing the conjugate on the components of y in the above definition
assures this will take place. Thus
\[ x \cdot x = \sum_j x_j \overline{x_j} = \sum_j |x_j|^2 \ge 0. \]
If you didn’t place a conjugate as in the above definition, things wouldn’t work out
correctly. For example,
\[ (1 + i)^2 + 2^2 = 4 + 2i \]
and this is not a positive number.
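The effect of the conjugate can be checked directly on the vector (1 + i, 2) used above. This is a small sketch; the names `dot`, `with_conjugate` and `without_conjugate` are illustrative only.

```python
def dot(x, y):
    """Dot product on C^n: conjugate the entries of the second vector."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


x = [1 + 1j, 2]

# With the conjugate: |1 + i|^2 + |2|^2, a nonnegative real number.
with_conjugate = dot(x, x)

# Without the conjugate: (1 + i)^2 + 2^2, which is not real.
without_conjugate = sum(a * a for a in x)
```

Here `with_conjugate` is 6, the square of the length of the vector, while `without_conjugate` is 4 + 2i.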
The following properties of the dot product follow immediately from the defini-
tion and you should verify each of them.
Properties of the dot product:
92 MULTI-VARIABLE CALCULUS
1. $u \cdot v = \overline{v \cdot u}$.
Proof: Let θ = 1 if z = 0 and otherwise, let $\theta = \dfrac{\overline{z}}{|z|}$. Recall that for
$z = x + iy$, $\overline{z} = x - iy$ and $z \overline{z} = |z|^2$.
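Thus
\[
\theta \sum_{i=1}^{n} x_i \overline{y_i} = \sum_{i=1}^{n} x_i \overline{\big( \overline{\theta}\, y_i \big)} = \Big| \sum_{i=1}^{n} x_i \overline{y_i} \Big| .
\]
Consider $p(t) \equiv \sum_{i=1}^{n} \big( x_i + t \overline{\theta} y_i \big) \overline{\big( x_i + t \overline{\theta} y_i \big)}$ where $t \in R$.
\[
\begin{aligned}
0 \le p(t) &= \sum_{i=1}^{n} |x_i|^2 + 2t \operatorname{Re}\Big( \theta \sum_{i=1}^{n} x_i \overline{y_i} \Big) + t^2 \sum_{i=1}^{n} |y_i|^2 \\
&= |x|^2 + 2t \Big| \sum_{i=1}^{n} x_i \overline{y_i} \Big| + t^2 |y|^2
\end{aligned}
\]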
5.1. CONTINUOUS FUNCTIONS 93
If |y| = 0 then 5.1 is obviously true because both sides equal zero. Therefore,
assume $|y| \ne 0$ and then p (t) is a polynomial of degree two whose graph opens
up. Therefore, it either has no zeros, two zeros or one repeated zero. If it has two
zeros, the above inequality must be violated because in this case the graph must
dip below the t axis. Therefore, it either has no zeros or exactly one. From the
quadratic formula this happens exactly when
\[ 4 \Big| \sum_{i=1}^{n} x_i \overline{y_i} \Big|^2 - 4 |x|^2 |y|^2 \le 0 \]
and so
\[ \Big| \sum_{i=1}^{n} x_i \overline{y_i} \Big| \le |x|\, |y| \]
as claimed. This proves the inequality.
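A numerical spot check is no substitute for the proof, but it is a quick way to see the inequality in action. The sketch below tests the Cauchy Schwarz inequality, and the triangle inequality that follows from it, on random complex vectors; the helper names, the seed, the dimension and the tolerance are all arbitrary illustrative choices.

```python
import random


def dot(x, y):
    """Dot product on C^n with the conjugate on the second vector."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm(x):
    """Length of x, the square root of x . x."""
    return abs(dot(x, x)) ** 0.5


def random_vector(rng, n):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


def inequalities_hold(trials=1000, n=4, tol=1e-12):
    """Check |x . y| <= |x||y| and |x + y| <= |x| + |y| on random samples."""
    rng = random.Random(0)
    for _ in range(trials):
        x, y = random_vector(rng, n), random_vector(rng, n)
        if abs(dot(x, y)) > norm(x) * norm(y) + tol:
            return False
        total = [a + b for a, b in zip(x, y)]
        if norm(total) > norm(x) + norm(y) + tol:
            return False
    return True


all_hold = inequalities_hold()
```

The small tolerance only absorbs floating point rounding; the inequalities themselves are exact.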
By analogy to the case of Rn , length or magnitude of vectors in Cn can be
defined.
Definition 5.5 Let $z \in C^n$. Then $|z| \equiv (z \cdot z)^{1/2}$. Also numbers in F will often be
referred to as scalars.
Theorem 5.6 For length defined in Definition 5.5, the following hold.
|z| ≥ 0 and |z| = 0 if and only if z = 0 (5.2)
If α is a scalar, |αz| = |α| |z| (5.3)
|z + w| ≤ |z| + |w| . (5.4)
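Proof: The first two claims are left as exercises. To establish the third,
\[
\begin{aligned}
|z + w|^2 &= (z + w) \cdot (z + w) \\
&= z \cdot z + w \cdot w + w \cdot z + z \cdot w \\
&= |z|^2 + |w|^2 + 2 \operatorname{Re} (w \cdot z) \\
&\le |z|^2 + |w|^2 + 2 |w \cdot z| \\
&\le |z|^2 + |w|^2 + 2 |w|\, |z| = (|z| + |w|)^2 .
\end{aligned}
\]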
The main difference between Cn and Rn is that the scalars are complex numbers.
Definition 5.7 Suppose you have a vector space, V and for z, w ∈ V and α a scalar
a norm is a way of measuring distance or magnitude which satisfies the properties
5.2 - 5.4. Thus a norm is something which does the following.
||z|| ≥ 0 and ||z|| = 0 if and only if z = 0 (5.5)
If α is a scalar, ||αz|| = |α| ||z|| (5.6)
||z + w|| ≤ ||z|| + ||w|| . (5.7)
Here it is understood that for all z ∈ V, ||z|| ∈ [0, ∞).
Note that |·| provides a norm on Fn from the above.
If there is something called an open set, surely there should be something called
a closed set and here is the definition of one.
[Figure: an open set U drawn with a dotted edge, containing a point x and a ball B(x, r).]
You see in this picture how the edges are dotted. This is because an open set
cannot include the edges or the set would fail to be open. For example, consider
what would happen if you picked a point out on the edge of U in the above picture.
Every open ball centered at that point would have in it some points which are
outside U . Therefore, such a point would violate the above definition. You also see
the edges of B (x, r) dotted suggesting that B (x, r) ought to be an open set. This
is intuitively clear but does require a proof. This will be done in the next theorem
and will give examples of open sets. Also, you can see that if x is close to the edge
of U, you might have to take r to be very small.
It is roughly the case that open sets don’t have their skins while closed sets do.
Here is a picture of a closed set, C.
5.2. OPEN AND CLOSED SETS 95
[Figure: a closed set C, with a point x outside C and a ball B(x, r).]
Note that $x \notin C$ and since $F^n \setminus C$ is open, there exists a ball, B (x, r) contained
entirely in $F^n \setminus C$. If you look at $F^n \setminus C$, what would be its skin? It can’t be in
Theorem 5.10 Let x ∈ Fn and let r ≥ 0. Then B (x, r) is an open set. Also,
D (x, r) ≡ {y ∈ Fn : |y − x| ≤ r}
is a closed set.
\[
\begin{aligned}
|z - x| &= |z - y + y - x| \\
&\le |z - y| + |y - x| \\
&< r_1 + |y - x| = r - |x - y| + |y - x| = r.
\end{aligned}
\]
every point in it, (There are none.) satisfies the desired property of being an interior
point.
Now suppose $y \notin D(x, r)$. Then |x − y| > r and defining δ ≡ |x − y| − r, it
follows that if z ∈ B (y, δ) , then by the triangle inequality,
\[
\begin{aligned}
|x - z| &\ge |x - y| - |y - z| > |x - y| - \delta \\
&= |x - y| - (|x - y| - r) = r
\end{aligned}
\]
and this shows that B (y, δ) ⊆ Fn \ D (x, r) . Since y was an arbitrary point in
Fn \ D (x, r) , it follows Fn \ D (x, r) is an open set which shows from the definition
that D (x, r) is a closed set as claimed.
A picture which is descriptive of the conclusion of the above theorem which also
implies the manner of proof is the following.
[Figure: two panels, B(x, r) and D(x, r), each showing the center x with radius r and a second point y with a smaller radius $r_1$.]
|y − x| < δ
it follows that
|f (x) − f (y)| < ε.
f is continuous if it is continuous at every point of D (f ) .
The proof of this theorem is in the last section of this chapter. Its conclusions
are not surprising. For example the first claim says that (af + bg) (y) is close to
(af + bg) (x) when y is close to x provided the same can be said about f and g.
For the second claim, if y is close to x, f (x) is close to f (y) and so by continuity
of g at f (x), g (f (y)) is close to g (f (x)) . To see the third claim is likely, note that
closeness in Fp is the same as closeness in each coordinate. The fourth claim is
immediate from the triangle inequality.
For functions defined on Fn , there is a notion of polynomial just as there is for
functions defined on R.
α = (α1 , · · ·, αn )
5.4 Exercises
1. Let f (t) = (t, sin t) . Show f is continuous at every point t.
2. Suppose |f (x) − f (y)| ≤ K |x − y| where K is a constant. Show that f is
everywhere continuous. Functions satisfying such an inequality are called
Lipschitz functions.
3. Suppose $|f(x) - f(y)| \le K |x - y|^{\alpha}$ where K is a constant and α ∈ (0, 1).
Show that f is everywhere continuous.
if and only if the following condition holds. For all ε > 0 there exists δ > 0 such
that if
0 < |y − x| < δ, and y ∈ D (f )
then,
|L − f (y)| < ε.
Proof: Let ε > 0 be given. There exists δ > 0 such that if 0 < |y − x| < δ and
y ∈ D (f ) , then
|f (y) − L| < ε, |f (y) − L1 | < ε.
Pick such a y. There exists one because x is a limit point of D (f ) . Then
Definition 5.17 If f (x) ∈ F, $\lim_{y \to x} f(y) = \infty$ if for every number l, there exists
δ > 0 such that whenever |y − x| < δ and y ∈ D (f ) , then f (y) > l.
The following theorem is just like the one variable version presented earlier.
5.5. LIMITS OF A FUNCTION 99
Proof: The proof of 5.8 is left for you. It is like a corresponding theorem for
continuous functions. Now 5.9 is to be verified. Let ε > 0 be given. Then by the
triangle inequality,
|f (y) − L| < 1,
and so for such y, the triangle inequality implies, |f (y)| < 1 + |L| . Therefore, for
0 < |y − x| < δ 1 ,
|f · g (y) − L · K| < ε
Now since limy→x f (y) = L, there exists δ > 0 such that if 0 < |y − x| < δ, then
It only remains to verify the last assertion. Assume |f (y) − b| ≤ r for all y
close enough to x. It is required to show that |L − b| ≤ r. If this is not true, then
|L − b| > r. Consider B (L, |L − b| − r) . Since L is the limit of f , it follows f (y) ∈
B (L, |L − b| − r) whenever y ∈ D (f ) is close enough to x. Thus, by the triangle
inequality,
|f (y) − L| < |L − b| − r
and so
if and only if
\[ \lim_{y \to x} f_k(y) = L_k \tag{5.14} \]
Proof: Suppose 5.13. Then letting ε > 0 be given there exists δ > 0 such that
if 0 < |y − x| < δ, it follows
It is clear that $\lim_{(x,y) \to (3,1)} \frac{x^2 - 9}{x - 3} = 6$ and $\lim_{(x,y) \to (3,1)} y = 1$. Therefore, this
limit equals (6, 1) .
Example 5.22 Find $\lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + y^2}$.
First of all observe the domain of the function is F2 \ {(0, 0)} , every point in F2
except the origin. Therefore, (0, 0) is a limit point of the domain of the function
so it might make sense to take a limit. However, just as in the case of a function
of one variable, the limit may not exist. In fact, this is the case here. To see this,
take points on the line y = 0. At these points, the value of the function equals 0.
Now consider points on the line y = x where the value of the function equals 1/2.
Since arbitrarily close to (0, 0) there are points where the function equals 1/2 and
points where the function has the value 0, it follows there can be no limit. Just
take ε = 1/10 for example. You can’t be within 1/10 of 1/2 and also within 1/10
of 0 at the same time.
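The same observation can be made along any line through the origin: on y = mx the function is constant, with a value that depends on the slope m. The sketch below evaluates this; `value_along_line` is a hypothetical helper introduced only for illustration.

```python
def f(x, y):
    """The function xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x * x + y * y)


def value_along_line(slope, t):
    """Value of f at the point (t, slope * t) on the line y = slope * x."""
    return f(t, slope * t)


# Approach (0, 0) along three different lines.
ts = (1e-1, 1e-3, 1e-6)
along_y_equals_0 = [value_along_line(0.0, t) for t in ts]
along_y_equals_x = [value_along_line(1.0, t) for t in ts]
along_y_equals_2x = [value_along_line(2.0, t) for t in ts]
```

Along y = mx the value is m / (1 + m^2) no matter how close the point is to the origin, so different directions of approach give different candidate limits and no single limit can exist.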
Note it is necessary to rely on the definition of the limit much more than in the
case of a function of one variable and it is the case there are no easy ways to do
limit problems for functions of more than one variable. It is what it is and you will
not deal with these concepts without agony.
5.6 Exercises
1. Find the following limits if possible
(a) $\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x^2 + y^2}$
(b) $\lim_{(x,y) \to (0,0)} \frac{x (x^2 - y^2)}{(x^2 + y^2)}$
(c) $\lim_{(x,y) \to (0,0)} \frac{(x^2 - y^4)^2}{(x^2 + y^4)^2}$
Hint: Consider along y = 0 and along $x = y^2$.
(d) $\lim_{(x,y) \to (0,0)} x \sin\left( \frac{1}{x^2 + y^2} \right)$
\[ \lim_{n \to \infty} a_n = a \quad \text{or} \quad a_n \to a \]
if and only if for every ε > 0 there exists nε such that whenever n ≥ nε ,
In words the definition says that given any measure of closeness, ε, the terms
of the sequence are eventually all this close to a. There is absolutely no difference
between this and the definition for sequences of numbers other than here bold face
is used to indicate an and a are points in Fp .
Proof: Suppose $a_1 \ne a$. Then let $0 < \varepsilon < |a_1 - a| / 2$ in the definition of the limit.
It follows there exists $n_\varepsilon$ such that if $n \ge n_\varepsilon$, then $|a_n - a| < \varepsilon$ and $|a_n - a_1| < \varepsilon$.
Therefore, for such n,
a contradiction.
As in the case of a vector valued function, it suffices to consider the components.
This is the content of the next theorem.
Theorem 5.25 Let $a_n = \big( a_1^n, \cdots, a_p^n \big) \in F^p$. Then $\lim_{n \to \infty} a_n = a \equiv (a_1, \cdots, a_p)$
if and only if for each k = 1, · · ·, p,
Proof: First suppose $\lim_{n \to \infty} a_n = a$. Then given ε > 0 there exists $n_\varepsilon$ such
that if $n > n_\varepsilon$, then
\[ |a_k^n - a_k| \le |a_n - a| < \varepsilon \]
which establishes 5.15.
Now suppose 5.15 holds for each k. Then letting ε > 0 be given there exist $n_k$
such that if $n > n_k$,
\[ |a_k^n - a_k| < \varepsilon / \sqrt{p} . \]
Therefore, letting $n_\varepsilon > \max(n_1, \cdots, n_p)$, it follows that for $n > n_\varepsilon$,
\[
|a_n - a| = \Big( \sum_{k=1}^{p} |a_k^n - a_k|^2 \Big)^{1/2} < \Big( \sum_{k=1}^{p} \frac{\varepsilon^2}{p} \Big)^{1/2} = \varepsilon ,
\]
Theorem 5.27 Suppose {an } and {bn } are sequences and that
\[ \lim_{n \to \infty} a_n \cdot b_n = a \cdot b \tag{5.17} \]
If bn ∈ F, then
an bn → ab.
Proof: The first of these claims is left for you to do. To do the second, let ε > 0
be given and choose n1 such that if n ≥ n1 then
Then for such n, the triangle inequality and Cauchy Schwarz inequality imply
For n ≥ nε ,
Definition 5.28 {an } is a Cauchy sequence if for all ε > 0, there exists nε such
that whenever n, m ≥ nε ,
|an −am | < ε.
\[ |a_k^n - a_k^m| \le |a_n - a_m| \]
which shows for each k = 1, · · ·, p, it follows $\{a_k^n\}_{n=1}^{\infty}$ is a Cauchy sequence in F.
This requires that both the real and imaginary parts of $a_k^n$ are Cauchy sequences
in R which means the real and imaginary parts converge in R. This shows $\{a_k^n\}_{n=1}^{\infty}$
5.7. THE LIMIT OF A SEQUENCE 105
\[ \lim_{n \to \infty} a_n = a . \]
Proof: Let ε = 1 in the definition of a Cauchy sequence and let n > n1 . Then
from the definition,
|an −an1 | < 1.
It follows that for all n > n1 ,
Proof: Let ε > 0 be given and suppose $a_n \to a$. Then from the definition of
convergence, there exists $n_\varepsilon$ such that if $n > n_\varepsilon$, it follows that
\[ |a_n - a| < \frac{\varepsilon}{2} \]
Therefore, if $m, n \ge n_\varepsilon + 1$, it follows that
\[ |a_n - a_m| \le |a_n - a| + |a - a_m| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
showing that, since ε > 0 is arbitrary, $\{a_n\}$ is a Cauchy sequence.
Proof: Suppose first that f is continuous at x and let xn → x. Let ε > 0 be given.
By continuity, there exists δ > 0 such that if |y − x| < δ, then |f (x) − f (y)| < ε.
However, there exists nδ such that if n ≥ nδ , then |xn −x| < δ and so for all n this
large,
|f (x) −f (xn )| < ε
which shows f (xn ) → f (x) .
Now suppose the condition about taking convergent sequences to convergent
sequences holds at x. Suppose f fails to be continuous at x. Then there exists ε > 0
and $x_n \in D(f)$ such that $|x - x_n| < \frac{1}{n}$, yet
|f (x) −f (xn )| ≥ ε.
There is also the long technical theorem about sums and products of continuous
functions. These theorems are proved in the next section.
5.9 Exercises
1. f : D ⊆ Fp → Fq is Lipschitz continuous or just Lipschitz for short if there
exists a constant, K such that
|f (x) − f (y)| ≤ K |x − y|
Proof: Begin with 1.) Let ε > 0 be given. By assumption, there exist $\delta_1 > 0$
such that whenever $|x - y| < \delta_1$, it follows $|f(x) - f(y)| < \frac{\varepsilon}{2(|a| + |b| + 1)}$ and there
exists $\delta_2 > 0$ such that whenever $|x - y| < \delta_2$, it follows that
$|g(x) - g(y)| < \frac{\varepsilon}{2(|a| + |b| + 1)}$. Then let $0 < \delta \le \min(\delta_1, \delta_2)$. If |x − y| < δ, then everything happens
at once. Therefore, using the triangle inequality
Now begin on 2.) There exists δ 1 > 0 such that if |y − x| < δ 1 , then
|f g (x) − f g (y)| ≤ |f (x) g (x) − g (x) f (y)| + |g (x) f (y) − f (y) g (y)|
Now let ε > 0 be given. There exists $\delta_2$ such that if $|x - y| < \delta_2$, then
\[ |g(x) - g(y)| < \frac{\varepsilon}{2 (1 + |g(x)| + |f(y)|)} , \]
Now let 0 < δ ≤ min (δ 1 , δ 2 , δ 3 ) . Then if |x − y| < δ, all the above hold at once and
|f g (x) − f g (y)| ≤
This proves the first part of 2.) To obtain the second part, let δ 1 be as described
above and let δ 0 > 0 be such that for |x − y| < δ 0 ,
which implies |g (y)| ≥ |g (x)| /2, and |g (y)| < 3 |g (x)| /2.
5.10. PROOFS OF THEOREMS 109
\[
\begin{aligned}
&\le \frac{2}{|g(x)|^2} \big[ |f(x) g(y) - f(y) g(y) + f(y) g(y) - f(y) g(x)| \big] \\
&\le \frac{2}{|g(x)|^2} \big[ |g(y)|\, |f(x) - f(y)| + |f(y)|\, |g(y) - g(x)| \big] \\
&\le \frac{2}{|g(x)|^2} \Big[ \frac{3}{2} |g(x)|\, |f(x) - f(y)| + (1 + |f(x)|)\, |g(y) - g(x)| \Big] \\
&\le \frac{2}{|g(x)|^2} (1 + 2 |f(x)| + 2 |g(x)|) \big[ |f(x) - f(y)| + |g(y) - g(x)| \big] \\
&\equiv M \big[ |f(x) - f(y)| + |g(y) - g(x)| \big]
\end{aligned}
\]
where
\[ M \equiv \frac{2}{|g(x)|^2} (1 + 2 |f(x)| + 2 |g(x)|) \]
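Now let $\delta_2$ be such that if $|x - y| < \delta_2$, then
\[ |f(x) - f(y)| < \frac{\varepsilon}{2} M^{-1} \]
and let $\delta_3$ be such that if $|x - y| < \delta_3$, then
\[ |g(y) - g(x)| < \frac{\varepsilon}{2} M^{-1} . \]
Then if $0 < \delta \le \min(\delta_0, \delta_1, \delta_2, \delta_3)$, and |x − y| < δ, everything holds and
\[
\begin{aligned}
\Big| \frac{f(x)}{g(x)} - \frac{f(y)}{g(y)} \Big| &\le M \big[ |f(x) - f(y)| + |g(y) - g(x)| \big] \\
&< M \Big[ \frac{\varepsilon}{2} M^{-1} + \frac{\varepsilon}{2} M^{-1} \Big] = \varepsilon .
\end{aligned}
\]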
This completes the proof of the second part of 2.) Note that in these proofs no
effort is made to find some sort of “best” δ. The problem is one which has a yes or
a no answer. Either it is or it is not continuous.
Now begin on 3.). If f is continuous at x, f (x) ∈ D (g) ⊆ Fp , and g is continuous
at f (x) ,then g ◦ f is continuous at x. Let ε > 0 be given. Then there exists η > 0
such that if |y − f (x)| < η and y ∈ D (g) , it follows that |g (y) − g (f (x))| < ε. It
follows from continuity of f at x that there exists δ > 0 such that if |x − z| < δ and
z ∈ D (f ) , then |f (z) − f (x)| < η. Then if |x − z| < δ and z ∈ D (g ◦ f ) ⊆ D (f ) ,
all the above hold and so
Suppose first that f is continuous at x. Then there exists δ > 0 such that if |x − y| <
δ, then |f (x) − f (y)| < ε. The first part of the above inequality then shows that for
each k = 1, · · ·, q, |fk (x) − fk (y)| < ε. This shows the only if part. Now suppose
each function, fk is continuous. Then if ε > 0 is given, there exists δ k > 0 such
that whenever |x − y| < δ k
Now let 0 < δ ≤ min (δ 1 , · · ·, δ q ) . For |x − y| < δ, the above inequality holds for all
k and so the last part of 5.18 implies
\[
\begin{aligned}
|f(x) - f(y)| &\le \sum_{i=1}^{q} |f_i(x) - f_i(y)| \\
&< \sum_{i=1}^{q} \frac{\varepsilon}{q} = \varepsilon .
\end{aligned}
\]
This proves part 5.) and completes the proof of the theorem.
Here is a multidimensional version of the nested interval lemma.
The following definition is similar to that given earlier. It defines what is meant
by a sequentially compact set in Fp .
It turns out the sequentially compact sets in Fp are exactly those which are closed
and bounded. Only half of this result will be needed in this book and this is proved
next. First note that C can be considered as R2 . Therefore, Cp may be considered
as R2p .
Proof: If this is not so, there exists ε > 0 and pairs of points, xn and yn satisfy-
ing |xn − yn | < 1/n but |f (xn ) − f (yn )| ≥ ε. Since C is sequentially compact, there
exists x ∈ C and a subsequence, {xnk } satisfying xnk → x. But |xnk − ynk | < 1/k
and so ynk → x also. Therefore, from Theorem 5.32 on Page 105,
It is convenient to give a norm for the elements of L (Fn , Fm ) . This will allow
the consideration of questions such as whether a function having values in this space
of linear transformations is continuous.
Theorem 5.43 Denote by |·| the norm on either Fn or Fm . Then L (Fn , Fm ) with
this operator norm is a complete normed linear space of dimension nm with
For α a scalar,
||αA|| = |α| ||A|| ,
and for A, B ∈ L (Fn , Fm ) ,
The first two properties are obvious but you should verify them. It remains to verify
the norm is well defined and also to verify the triangle inequality above. First if
|x| ≤ 1, and (Aij ) is the matrix of the linear transformation with respect to the
usual basis vectors, then
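\[
\begin{aligned}
||A|| &= \max \Big\{ \Big( \sum_i |(Ax)_i|^2 \Big)^{1/2} : |x| \le 1 \Big\} \\
&= \max \Big\{ \Big( \sum_i \Big| \sum_j A_{ij} x_j \Big|^2 \Big)^{1/2} : |x| \le 1 \Big\}
\end{aligned}
\]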
Proof: Suppose first the second condition holds. Then from the material on
linear transformations,
By continuity of each Aij , there exists a δ > 0 such that for each i, j
\[ |A_{ij}(x) - A_{ij}(y)| < \frac{\varepsilon}{n \sqrt{m}} \]
\[ \lim_{|v| \to 0} \frac{g(v)}{|v|} = 0 \tag{5.21} \]
5.12. THE FRECHET DERIVATIVE 115
f (x + v) = f (x) + Lv + o (v)
f (x + v) − f (x) − Lv
converges to 0 faster than |v|. Thus the above definition is equivalent to saying
\[ \lim_{|v| \to 0} \frac{|f(x + v) - f(x) - Lv|}{|v|} = 0 \tag{5.22} \]
or equivalently,
\[ \lim_{y \to x} \frac{|f(y) - f(x) - Df(x)(y - x)|}{|y - x|} = 0. \tag{5.23} \]
Now it is clear this is just a generalization of the notion of the derivative of a
function of one variable because in this more specialized situation,
\[ \lim_{|v| \to 0} \frac{|f(x + v) - f(x) - f'(x) v|}{|v|} = 0, \]
\[ f'(x) = \lim_{v \to 0} \frac{f(x + v) - f(x)}{v} . \]
For functions of n variables, you can’t define the derivative as the limit of a difference
quotient like you can for a function of one variable because you can’t divide by a
vector. That is why there is a need for a more general definition.
The term o (v) is notation that is descriptive of the behavior in 5.21 and it is
only this behavior that is of interest. Thus, if t and k are constants,
and other similar observations hold. The sloppiness built in to this notation is
useful because it ignores details which are not important. It may help to think of
o (v) as an adjective describing what is left over after approximating f (x + v) by
f (x) + Df (x) v.
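To see what the o (v) remainder means in practice, one can pick a concrete map and watch the quotient in 5.22 shrink as v does. The sketch below uses a hypothetical example, f (x, y) = (x^2 y, sin x + y) with its Jacobian computed by hand; neither the function nor the helper names come from the text.

```python
import math


def f(x, y):
    """A sample map from R^2 to R^2 (an illustrative choice)."""
    return (x * x * y, math.sin(x) + y)


def Df(x, y):
    """Matrix of the derivative of f at (x, y), computed by hand."""
    return ((2 * x * y, x * x), (math.cos(x), 1.0))


def remainder_ratio(p, v):
    """|f(p + v) - f(p) - Df(p) v| / |v| in the Euclidean norm."""
    fp = f(p[0], p[1])
    fpv = f(p[0] + v[0], p[1] + v[1])
    J = Df(p[0], p[1])
    lin = (J[0][0] * v[0] + J[0][1] * v[1], J[1][0] * v[0] + J[1][1] * v[1])
    rem = math.hypot(fpv[0] - fp[0] - lin[0], fpv[1] - fp[1] - lin[1])
    return rem / math.hypot(v[0], v[1])


p = (1.0, 2.0)
# Shrink v along a fixed direction and record the quotient each time.
ratios = [remainder_ratio(p, (h, 2 * h)) for h in (1e-1, 1e-2, 1e-3)]
```

For a smooth map like this one the quotient decreases roughly in proportion to |v|, which is the behavior the notation o (v) summarizes.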
Proof: First note that for a fixed vector, v, o (tv) = o (t). Now suppose both
L1 and L2 work in the above definition. Then let v be any vector and let t be a
real scalar which is chosen small enough that tv + x ∈ U . Then
Therefore, subtracting these two yields $(L_2 - L_1)(tv) = o(tv) = o(t)$. Therefore,
dividing by t yields $(L_2 - L_1)(v) = \frac{o(t)}{t}$. Now let t → 0 to conclude that
$(L_2 - L_1)(v) = 0$. Since this is true for all v, it follows $L_2 = L_1$. This proves the
theorem.
|f (x + v) − f (x)| ≤ K |v|
Proof: From the definition of the derivative, f (x + v) − f (x) = Df (x) v + o (v).
Let |v| be small enough that $\frac{|o(v)|}{|v|} < 1$ so that |o (v)| ≤ |v|. Then for such v,
Theorem 5.48 (The chain rule) Let U and V be open sets, U ⊆ Fn and V ⊆
Fm . Suppose f : U → V is differentiable at x ∈ U and suppose g : V → Fq is
differentiable at f (x) ∈ V . Then g ◦ f is differentiable at x and
Proof: This follows from a computation. Let B (x,r) ⊆ U and let r also be small
enough that for |v| ≤ r, it follows that f (x + v) ∈ V . Such an r exists because f is
continuous at x. For |v| < r, the definition of differentiability of g and f implies
g (f (x + v)) − g (f (x)) =
What if all the partial derivatives of f exist? Does it follow that f is differen-
tiable? Consider the following function.
\[
f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}
\]
Then from the definition of partial derivatives,
\[ \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0 - 0}{h} = 0 \]
and
\[ \lim_{h \to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0 - 0}{h} = 0 \]
However f is not even continuous at (0, 0) which may be seen by considering the
behavior of the function along the line y = x and along the line x = 0. By Lemma
5.47 this implies f is not differentiable. Therefore, it is necessary to consider the
correct definition of the derivative given above if you want to get a notion which
generalizes the concept of the derivative of a function of one variable in such a way
as to preserve continuity whenever the function is differentiable.
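The two halves of this example can be reproduced numerically. In the sketch below the difference quotients along the axes are all zero, while the values along the line y = x stay at 1/2 however close the point is to the origin; the variable names are illustrative only.

```python
def f(x, y):
    """xy / (x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)


hs = (1e-1, 1e-3, 1e-6)

# Difference quotients defining the two partial derivatives at (0, 0).
quotients_in_x = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in hs]
quotients_in_y = [(f(0.0, h) - f(0.0, 0.0)) / h for h in hs]

# Values of f along y = x, approaching the origin.
values_on_diagonal = [f(h, h) for h in hs]
```

Both partial derivatives at the origin are therefore 0, yet the values on the diagonal do not approach f (0, 0) = 0, so f is not continuous there.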
5.13 C 1 Functions
However, there are theorems which can be used to get differentiability of a function
based on existence of the partial derivatives.
Definition 5.50 When all the partial derivatives exist and are continuous the func-
tion is called a C 1 function.
Because of Proposition 5.44 on Page 113 and Theorem 5.49 which identifies the
entries of Jf with the partial derivatives, the following definition is equivalent to
the above.
{x : (x, y) ∈ U }
Thus,
g (x + v, y) − g (x, y) = D1 g (x, y) v + o (v) .
A similar definition holds for the symbol Dy g or D2 g. The special case seen in
beginning calculus courses is where g : U → Fq and
\[ g_{x_i}(x) \equiv \frac{\partial g(x)}{\partial x_i} \equiv \lim_{h \to 0} \frac{g(x + h e_i) - g(x)}{h} . \]
The following theorem will be very useful in much of what follows. It is a version
of the mean value theorem.
Proof: Let
S ≡ {t ∈ [0, 1] : for all s ∈ [0, t] ,
|f (x + s (y − x)) − f (x)| ≤ (M + ε) s |y − x|} .
Then 0 ∈ S and by continuity of f , it follows that if t ≡ sup S, then t ∈ S and if
t < 1,
|f (x + t (y − x)) − f (x)| = (M + ε) t |y − x| . (5.25)
If t < 1, then there exists a sequence of positive numbers, $\{h_k\}_{k=1}^{\infty}$ converging to 0
such that
|f (x + (t + hk ) (y − x)) − f (x + t (y − x))|
M |y − x| ≥ ||Df (x + t (y − x))|| |y − x|
≥ |Df (x + t (y − x)) (y − x)| ≥ (M + ε) |y − x| ,
|f (x + (y − x)) − f (x)| ≤ (M + ε) |y − x| .
A similar argument applies for D2 g and this proves the continuity of the function,
(x, y) → Di g (x, y) for i = 1, 2. The formula follows from
g (x + u, y + v) − g (x, y) = g (x + u, y + v) − g (x, y + v)
+g (x, y + v) − g (x, y)
= g (x + u, y) − g (x, y) + g (x, y + v) − g (x, y) +
[g (x + u, y + v) − g (x + u, y) − (g (x, y + v) − g (x, y))]
= D1 g (x, y) u + D2 g (x, y) v + o (v) + o (u) +
[g (x + u, y + v) − g (x + u, y) − (g (x, y + v) − g (x, y))] . (5.26)
Let h (x, u) ≡ g (x + u, y + v) − g (x + u, y). Then the expression in [ ] is of the
form,
h (x, u) − h (x, 0) .
Also
D2 h (x, u) = D1 g (x + u, y + v) − D1 g (x + u, y)
and so, by continuity of (x, y) → D1 g (x, y),
whenever ||(u, v)|| is small enough. By Theorem 5.53 on Page 119, there exists
δ > 0 such that if ||(u, v)|| < δ, the norm of the last term in 5.26 satisfies the
inequality,
Therefore, this term is o ((u, v)). It follows from 5.27 and 5.26 that
g (x + u, y + v) =
{xi : x ∈ U }
Proof: The only if part of the proof is left for you. Suppose then that $D_i g$
exists and is continuous for each i. Note that $\sum_{j=1}^{k} \theta_j v_j = (v_1, \cdots, v_k, 0, \cdots, 0)$.
Thus $\sum_{j=1}^{n} \theta_j v_j = v$ and define $\sum_{j=1}^{0} \theta_j v_j \equiv 0$. Therefore,
\[
g(x + v) - g(x) = \sum_{k=1}^{n} \Big[ g\Big( x + \sum_{j=1}^{k} \theta_j v_j \Big) - g\Big( x + \sum_{j=1}^{k-1} \theta_j v_j \Big) \Big] \tag{5.29}
\]
\[
g\Big( x + \sum_{j=1}^{k} \theta_j v_j \Big) - g(x + \theta_k v_k) - \Big( g\Big( x + \sum_{j=1}^{k-1} \theta_j v_j \Big) - g(x) \Big) \tag{5.31}
\]
and the expression in 5.31 is of the form $h(v_k) - h(0)$ where for small $w \in F^{r_k}$,
\[ h(w) \equiv g\Big( x + \sum_{j=1}^{k-1} \theta_j v_j + \theta_k w \Big) - g(x + \theta_k w) . \]
Therefore,
\[ Dh(w) = D_k g\Big( x + \sum_{j=1}^{k-1} \theta_j v_j + \theta_k w \Big) - D_k g(x + \theta_k w) \]
and by continuity, ||Dh (w)|| < ε provided |v| is small enough. Therefore, by
Theorem 5.53, whenever |v| is small enough, |h (θk vk ) − h (0)| ≤ ε |θk vk | ≤ ε |v|
which shows that since ε is arbitrary, the expression in 5.31 is o (v). Now in 5.30
g (x+θk vk ) − g (x) = Dk g (x) vk + o (vk ) = Dk g (x) vk + o (v). Therefore, referring
to 5.29,
\[ g(x + v) - g(x) = \sum_{k=1}^{n} D_k g(x)\, v_k + o(v) \]
5.14 C k Functions
Recall the notation for partial derivatives in the following definition.
\[ g_{x_k x_l}(x) \equiv \frac{\partial^2 g}{\partial x_l\, \partial x_k}(x) \]
and so forth.
To deal with higher order partial derivatives in a systematic way, here is a useful
definition.
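$x = (x_1, \cdots, x_n)$,
\[
x^{\alpha} \equiv x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}, \qquad D^{\alpha} f(x) \equiv \frac{\partial^{|\alpha|} f(x)}{\partial x_1^{\alpha_1}\, \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}} .
\]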
Proof: Since U is open, there exists r > 0 such that B ((x, y) , r) ⊆ U. Now let
|t| , |s| < r/2, t, s real numbers and consider
\[
\Delta(s, t) \equiv \frac{1}{st} \Big\{ \overbrace{f(x + t, y + s) - f(x + t, y)}^{h(t)} - \overbrace{\big( f(x, y + s) - f(x, y) \big)}^{h(0)} \Big\} . \tag{5.32}
\]
Note that (x + t, y + s) ∈ U because
\[
|(x + t, y + s) - (x, y)| = |(t, s)| = \big( t^2 + s^2 \big)^{1/2} \le \Big( \frac{r^2}{4} + \frac{r^2}{4} \Big)^{1/2} = \frac{r}{\sqrt{2}} < r.
\]
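\[
f_x = y\, \frac{x^4 - y^4 + 4 x^2 y^2}{(x^2 + y^2)^2}, \qquad f_y = x\, \frac{x^4 - y^4 - 4 x^2 y^2}{(x^2 + y^2)^2}
\]
Now
\[
f_{xy}(0, 0) \equiv \lim_{y \to 0} \frac{f_x(0, y) - f_x(0, 0)}{y} = \lim_{y \to 0} \frac{-y^4}{(y^2)^2} = -1
\]
while
\[
f_{yx}(0, 0) \equiv \lim_{x \to 0} \frac{f_y(x, 0) - f_y(0, 0)}{x} = \lim_{x \to 0} \frac{x^4}{(x^2)^2} = 1
\]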
showing that although the mixed partial derivatives do exist at (0, 0) , they are not
equal there.
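The two limits can be evaluated numerically from the formulas for the first partial derivatives quoted above. The sketch below assumes both first partials equal 0 at the origin, which is what the limit computations in the text use; the helper names are illustrative.

```python
def fx(x, y):
    """First partial in x, from the formula in the text; taken as 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return y * (x**4 - y**4 + 4 * x**2 * y**2) / (x**2 + y**2) ** 2


def fy(x, y):
    """First partial in y, from the formula in the text; taken as 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * (x**4 - y**4 - 4 * x**2 * y**2) / (x**2 + y**2) ** 2


def fxy_quotient(h):
    """Difference quotient of f_x in the y direction at (0, 0)."""
    return (fx(0.0, h) - fx(0.0, 0.0)) / h


def fyx_quotient(h):
    """Difference quotient of f_y in the x direction at (0, 0)."""
    return (fy(h, 0.0) - fy(0.0, 0.0)) / h


hs = (1e-1, 1e-2, 1e-3)
fxy_values = [fxy_quotient(h) for h in hs]
fyx_values = [fyx_quotient(h) for h in hs]
```

The first list stays at -1 and the second at 1 for every step size, matching the two limits computed by hand.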
Then there exist positive constants, δ, η, such that for every y ∈ B (y0 , η) there
exists a unique x (y) ∈ B (x0 , δ) such that
f (x (y) , y) = 0. (5.34)
Proof: Let
\[
f(x, y) = \begin{pmatrix} f_1(x, y) \\ f_2(x, y) \\ \vdots \\ f_n(x, y) \end{pmatrix} .
\]
Define for $\big( x^1, \cdots, x^n \big) \in B(x_0, \delta)^n$ and $y \in B(y_0, \eta)$ the following matrix.
\[
J\big( x^1, \cdots, x^n, y \big) \equiv \begin{pmatrix} f_{1,x_1}\big( x^1, y \big) & \cdots & f_{1,x_n}\big( x^1, y \big) \\ \vdots & & \vdots \\ f_{n,x_1}(x^n, y) & \cdots & f_{n,x_n}(x^n, y) \end{pmatrix} .
\]
Then by the assumption of continuity of all the partial derivatives, there exists
$\delta_0 > 0$ and $\eta_0 > 0$ such that if $\delta < \delta_0$ and $\eta < \eta_0$, it follows that for all
$\big( x^1, \cdots, x^n \big) \in B(x_0, \delta)^n$ and $y \in B(y_0, \eta)$,
\[ \det\big( J\big( x^1, \cdots, x^n, y \big) \big) > r > 0. \tag{5.35} \]
h (t) ≡ fi (x + t (z − x) , y) .
Then h (1) = h (0) and so by the mean value theorem, $h'(t_i) = 0$ for some $t_i \in (0, 1)$.
Therefore, from the chain rule and for this value of ti ,
for every vector v. But from 5.35 and the fact that v is arbitrary, it follows
f (x (y) , y) = 0. This proves the existence of the function y → x (y) such that
f (x (y) , y) = 0 for all y ∈ B (y0 , η) .
It remains to verify this function is a C 1 function. To do this, let y1 and y2 be
points of B (y0 , η) . Then as before, consider the ith component of f and consider
the same argument using the mean value theorem to write
\[
\begin{aligned}
0 &= f_i(x(y_1), y_1) - f_i(x(y_2), y_2) \\
&= f_i(x(y_1), y_1) - f_i(x(y_2), y_1) + f_i(x(y_2), y_1) - f_i(x(y_2), y_2) \\
&= D_1 f_i\big( x^i, y_1 \big)(x(y_1) - x(y_2)) + D_2 f_i\big( x(y_2), y^i \big)(y_1 - y_2) .
\end{aligned}
\]
Therefore,
\[ J\big( x^1, \cdots, x^n, y_1 \big)(x(y_1) - x(y_2)) = -M (y_1 - y_2) \tag{5.37} \]
where M is the matrix whose ith row is $D_2 f_i\big( x(y_2), y^i \big)$. Then from 5.35 there
exists a constant, C independent of the choice of $y \in B(y_0, \eta)$ such that
\[ \Big|\Big| J\big( x^1, \cdots, x^n, y \big)^{-1} \Big|\Big| < C \]
whenever $\big( x^1, \cdots, x^n \big) \in B(x_0, \delta)^n$. By continuity of the partial derivatives of f it
also follows there exists a constant, C1 such that ||D2 fi (x, y)|| < C1 whenever,
(x, y) ∈ B (x0 , δ) × B (y0 , η) . Hence ||M || must also be bounded independent of
the choice of y1 and y2 in B (y0 , η) . From 5.37, it follows there exists a constant,
C such that for all y1 , y2 in B (y0 , η) ,
Now let y ∈ B (y0 , η) and let |v| be sufficiently small that y + v ∈ B (y0 , η) .
Then
0 = f (x (y + v) , y + v) − f (x (y) , y)
= f (x (y + v) , y + v) − f (x (y + v) , y) + f (x (y + v) , y) − f (x (y) , y)
Therefore,
\[
x(y+v) - x(y) = -D_1 f(x(y),y)^{-1} D_2 f(x(y),y)\,v + o(v)
\]
which shows that $Dx(y) = -D_1 f(x(y),y)^{-1} D_2 f(x(y),y)$ and $y \to Dx(y)$ is continuous. This proves the theorem.
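The derivative formula can be illustrated in the simplest scalar case. The sketch below is only an illustration under assumptions not taken from the text: it uses the hypothetical example f (x, y) = x² + y² − 1 near (x0 , y0 ) = (0.8, 0.6), for which x (y) = √(1 − y²) is known in closed form, and compares −D1 f⁻¹ D2 f with a central difference quotient of x (y).

```python
import math


def x_of_y(y):
    """Explicit solution of x^2 + y^2 - 1 = 0 with x > 0 (illustrative example)."""
    return math.sqrt(1.0 - y * y)


def formula_derivative(y):
    """Dx(y) = -D1 f(x(y), y)^{-1} D2 f(x(y), y) for f(x, y) = x^2 + y^2 - 1."""
    x = x_of_y(y)
    d1f = 2.0 * x  # partial derivative of f with respect to x
    d2f = 2.0 * y  # partial derivative of f with respect to y
    return -(1.0 / d1f) * d2f


def numeric_derivative(y, h=1e-6):
    """Central difference approximation of x'(y)."""
    return (x_of_y(y + h) - x_of_y(y - h)) / (2.0 * h)


print(formula_derivative(0.6), numeric_derivative(0.6))
```

Both values agree closely with −0.6/0.8, as the formula predicts for this example.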
In practice, how do you verify the condition, $D_1 f(x_0,y_0)^{-1} \in \mathcal{L}(\mathbb{F}^n,\mathbb{F}^n)$?
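\[
f(x,y) = \begin{pmatrix} f_1(x_1,\cdots,x_n,y_1,\cdots,y_n) \\ \vdots \\ f_n(x_1,\cdots,x_n,y_1,\cdots,y_n) \end{pmatrix}.
\]
\[
\begin{pmatrix}
\frac{\partial f_1(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_1} & \cdots & \frac{\partial f_1(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_n} \\
\vdots & & \vdots \\
\frac{\partial f_n(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_1} & \cdots & \frac{\partial f_n(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_n}
\end{pmatrix}
\]
and from linear algebra, $D_1 f(x_0,y_0)^{-1} \in \mathcal{L}(\mathbb{F}^n,\mathbb{F}^n)$ exactly when the above matrix has an inverse. In other words when
\[
\det \begin{pmatrix}
\frac{\partial f_1(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_1} & \cdots & \frac{\partial f_1(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_n} \\
\vdots & & \vdots \\
\frac{\partial f_n(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_1} & \cdots & \frac{\partial f_n(x_1,\cdots,x_n,y_1,\cdots,y_n)}{\partial x_n}
\end{pmatrix} \neq 0
\]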
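The determinant condition is easy to test numerically. The following is a rough sketch, assuming a hypothetical system with n = 2 (the function `f` below is an illustration, not an example from the text); the partial derivatives in x are approximated by central differences.

```python
def f(x, y):
    """Hypothetical system: f1 = x1^2 + x2^2 - y1, f2 = x1*x2 - y2."""
    x1, x2 = x
    y1, y2 = y
    return (x1 * x1 + x2 * x2 - y1, x1 * x2 - y2)


def jacobian_in_x(func, x, y, h=1e-6):
    """Matrix of partial derivatives of func with respect to the x variables."""
    n = len(x)
    columns = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp, y), func(xm, y)
        columns.append([(fp[i] - fm[i]) / (2.0 * h) for i in range(n)])
    # columns[j][i] holds d f_i / d x_j, so transpose to get rows indexed by i.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def det2(a):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


print(det2(jacobian_in_x(f, (1.0, 0.5), (0.0, 0.0))))  # nonzero: condition holds
print(det2(jacobian_in_x(f, (1.0, 1.0), (0.0, 0.0))))  # near zero: condition fails
```

For this example the determinant equals 2x1² − 2x2², so the condition holds at (1, 0.5) and fails at (1, 1).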
\[
\frac{\partial (z_1,\cdots,z_n)}{\partial (x_1,\cdots,x_n)}.
\]
Of course you can replace R with F in the above by applying the above to the
situation in which each F is replaced with R2 .
Then there exist positive constants, δ, η, such that for every y ∈ B (y0 , η) there
exists a unique x (y) ∈ B (x0 , δ) such that
f (x (y) , y) = 0. (5.41)
The next theorem is a very important special case of the implicit function the-
orem known as the inverse function theorem. Actually one can also obtain the
implicit function theorem from the inverse function theorem. It is done this way in
[36] and in [3].
5.16. IMPLICIT FUNCTION THEOREM 129
x0 ∈ W ⊆ U, (5.43)
F (x, y) ≡ f (x) − y
Then there exist positive constants, δ, η, such that for every y ∈ B (y0 , η) there
exists a unique x (y) ∈ B (x0 , δ) such that
f (x (y) , y) = 0. (5.47)
\[
\frac{\partial x}{\partial y^l} = -D_1(x,y)^{-1}\frac{\partial f}{\partial y^l}.
\]
where $M_\beta$ is a matrix whose entries are differentiable functions of $D^\gamma(x)$ for $|\gamma| < q$ and $D^\tau f(x,y)$ for $|\tau| \le q$. This follows easily from the description of $D_1(x,y)^{-1}$ in terms of the cofactor matrix and the determinant of $D_1(x,y)$. Suppose 5.48 holds for $|\alpha| = q < k$. Then by induction, this yields x is $C^q$. Then
By the chain rule $\frac{\partial M_\beta(x,y)}{\partial y^p}$ is a matrix whose entries are differentiable functions of $D^\tau f(x,y)$ for $|\tau| \le q+1$ and $D^\gamma(x)$ for $|\gamma| < q+1$. It follows since $y^p$ was arbitrary that for any $|\alpha| = q+1$, a formula like 5.48 holds with q being replaced by q + 1. By induction, x is $C^k$. This proves the theorem.
As a simple corollary this yields an improved version of the inverse function
theorem.
x0 ∈ W ⊆ U, (5.50)
f (x) = a
gi (x) = 0, i = 1, · · ·, m.
5.17. THE METHOD OF LAGRANGE MULTIPLIERS 131
x0 is a local maximum if f (x0 ) ≥ f (x) for all x near x0 which also satisfies the
constraints 5.53. A local minimum is defined similarly. Let $F : U \times \mathbb{R} \to \mathbb{R}^{m+1}$ be defined by
\[
F(x,a) \equiv \begin{pmatrix} f(x) - a \\ g_1(x) \\ \vdots \\ g_m(x) \end{pmatrix}. \tag{5.54}
\]
Now consider the (m + 1) × n Jacobian matrix,
\[
\begin{pmatrix}
f_{x_1}(x_0) & \cdots & f_{x_n}(x_0) \\
g_{1x_1}(x_0) & \cdots & g_{1x_n}(x_0) \\
\vdots & & \vdots \\
g_{mx_1}(x_0) & \cdots & g_{mx_n}(x_0)
\end{pmatrix}.
\]
If this matrix has rank m + 1 then some (m + 1) × (m + 1) submatrix has nonzero determinant. It follows from the implicit function theorem that there exist m + 1 variables, $x_{i_1},\cdots,x_{i_{m+1}}$ such that the system
\[
F(x,a) = 0 \tag{5.55}
\]
specifies these m + 1 variables as a function of the remaining n − (m + 1) variables and a in an open set of $\mathbb{R}^{n-m}$. Thus there is a solution (x,a) to 5.55 for some x
close to x0 whenever a is in some open interval. Therefore, x0 cannot be either a
local minimum or a local maximum. It follows that if x0 is either a local maximum
or a local minimum, then the above matrix must have rank less than m + 1 which
requires the rows to be linearly dependent. Thus, there exist m scalars,
λ1 , · · ·, λm ,
and a scalar µ, not all zero such that
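\[
\mu \begin{pmatrix} f_{x_1}(x_0) \\ \vdots \\ f_{x_n}(x_0) \end{pmatrix} = \lambda_1 \begin{pmatrix} g_{1x_1}(x_0) \\ \vdots \\ g_{1x_n}(x_0) \end{pmatrix} + \cdots + \lambda_m \begin{pmatrix} g_{mx_1}(x_0) \\ \vdots \\ g_{mx_n}(x_0) \end{pmatrix}. \tag{5.56}
\]
If the column vectors
\[
\begin{pmatrix} g_{1x_1}(x_0) \\ \vdots \\ g_{1x_n}(x_0) \end{pmatrix}, \cdots, \begin{pmatrix} g_{mx_1}(x_0) \\ \vdots \\ g_{mx_n}(x_0) \end{pmatrix} \tag{5.57}
\]
are linearly independent, then, $\mu \neq 0$ and dividing by $\mu$ yields an expression of the form
\[
\begin{pmatrix} f_{x_1}(x_0) \\ \vdots \\ f_{x_n}(x_0) \end{pmatrix} = \lambda_1 \begin{pmatrix} g_{1x_1}(x_0) \\ \vdots \\ g_{1x_n}(x_0) \end{pmatrix} + \cdots + \lambda_m \begin{pmatrix} g_{mx_1}(x_0) \\ \vdots \\ g_{mx_n}(x_0) \end{pmatrix} \tag{5.58}
\]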
at every point x0 which is either a local maximum or a local minimum. This proves
the following theorem.
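The multiplier condition 5.58 can be checked on a small example. The sketch below assumes a hypothetical problem that is not from the text: maximize f (x) = x1 + x2 subject to the single constraint g (x) = x1² + x2² − 1 = 0, whose maximizer is x0 = (1/√2, 1/√2).

```python
import math


def grad_f(x):
    """Gradient of the illustrative objective f(x) = x1 + x2."""
    return (1.0, 1.0)


def grad_g(x):
    """Gradient of the illustrative constraint g(x) = x1^2 + x2^2 - 1."""
    return (2.0 * x[0], 2.0 * x[1])


x0 = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
gf, gg = grad_f(x0), grad_g(x0)

# The 2 x 2 matrix with rows grad f and grad g has rank less than 2 at x0.
rank_det = gf[0] * gg[1] - gf[1] * gg[0]

# Solve grad f = lam * grad g using the first component, then check the second.
lam = gf[0] / gg[0]
residual = max(abs(gf[i] - lam * gg[i]) for i in range(2))

# Compare f(x0) with f along the constraint circle.
best_on_circle = max(
    math.cos(2.0 * math.pi * i / 1000) + math.sin(2.0 * math.pi * i / 1000)
    for i in range(1000)
)

print(rank_det, lam, residual, best_on_circle)
```

Here the rows are dependent at x0 and the multiplier is λ = 1/√2, while no sampled point of the circle gives a larger value of f than x0 does.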
d (x, y) = d (y, x)
d (x, y) ≥ 0 and d (x, y) = 0 if and only if x = y
d (x, y) ≤ d (x, z) + d (z, y) .
You can check that Rn and Cn are metric spaces with d (x, y) = |x − y| . How-
ever, there are many others. The definitions of open and closed sets are the same
for a metric space as they are for Rn .
Lemma 6.3 In a metric space, X every ball, B (x, r) is open. A set is closed if
and only if it contains all its limit points. If p is a limit point of S, then there exists
a sequence of distinct points of S, {xn } such that limn→∞ xn = p.
134 METRIC SPACES AND GENERAL TOPOLOGICAL SPACES
Theorem 6.4 Suppose (X, d) is a metric space. Then the sets {B(x, r) : r >
0, x ∈ X} satisfy
∪ {B(x, r) : r > 0, x ∈ X} = X (6.1)
If p ∈ B (x, r1 ) ∩ B (z, r2 ), there exists r > 0 such that
B (p, r) ⊆ B (x, r1 ) ∩ B (z, r2 ) . (6.2)
Proof: Observe that the union of these balls includes the whole space, X so
6.1 is obvious. Consider 6.2. Let p ∈ B (x, r1 ) ∩ B (z, r2 ). Consider
r ≡ min (r1 − d (x, p) , r2 − d (z, p))
and suppose y ∈ B (p, r). Then
d (y, x) ≤ d (y, p) + d (p, x) < r1 − d (x, p) + d (x, p) = r1
and so B (p, r) ⊆ B (x, r1 ). By similar reasoning, B (p, r) ⊆ B (z, r2 ). This proves
the theorem.
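The choice of r in this proof can be tested by sampling. The following is a small sketch in R² with the usual distance; the particular centers, radii, and point p are assumptions chosen only for illustration.

```python
import math
import random


def d(a, b):
    """Euclidean distance in the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def inner_radius(p, x, r1, z, r2):
    """The radius r = min(r1 - d(x, p), r2 - d(z, p)) used in the proof."""
    return min(r1 - d(x, p), r2 - d(z, p))


def ball_is_contained(p, x, r1, z, r2, samples=10000, seed=0):
    """Sample points of B(p, r) and check they lie in both B(x, r1) and B(z, r2)."""
    rng = random.Random(seed)
    r = inner_radius(p, x, r1, z, r2)
    for _ in range(samples):
        rho = 0.999 * r * rng.random()  # stay strictly inside B(p, r)
        theta = 2.0 * math.pi * rng.random()
        y = (p[0] + rho * math.cos(theta), p[1] + rho * math.sin(theta))
        if not (d(y, x) < r1 and d(y, z) < r2):
            return False
    return True


x, r1 = (0.0, 0.0), 1.0
z, r2 = (1.0, 0.0), 0.8
p = (0.6, 0.1)
print(inner_radius(p, x, r1, z, r2), ball_is_contained(p, x, r1, z, r2))
```

Sampling cannot prove the inclusion, but a failure here would indicate an error in the formula for r.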
Let K be a closed set. This means $K^C \equiv X \setminus K$ is an open set. Let p be a limit point of K. If $p \in K^C$, then since $K^C$ is open, there exists $B(p,r) \subseteq K^C$. But this contradicts p being a limit point because there are no points of K in this ball. Hence all limit points of K must be in K.
Suppose next that K contains its limit points. Is $K^C$ open? Let $p \in K^C$. Then p is not a limit point of K. Therefore, there exists B (p, r) which contains at most finitely many points of K. Since $p \notin K$, it follows that by making r smaller if necessary, B (p, r) contains no points of K. That is $B(p,r) \subseteq K^C$ showing $K^C$ is open. Therefore, K is closed.
Suppose now that p is a limit point of S. Let $x_1 \in (S \setminus \{p\}) \cap B(p,1)$. If $x_1,\cdots,x_k$ have been chosen, let
\[
r_{k+1} \equiv \min\left\{ d(p,x_i),\ i = 1,\cdots,k,\ \frac{1}{k+1} \right\}.
\]
Let $x_{k+1} \in (S \setminus \{p\}) \cap B(p,r_{k+1})$. This proves the lemma.
Lemma 6.5 If {xn } is a Cauchy sequence in a metric space, X and if some subse-
quence, {xnk } converges to x, then {xn } converges to x. Also if a sequence converges,
then it is a Cauchy sequence.
Proof: Note first that $n_k \ge k$ because in a subsequence, the indices, $n_1, n_2, \cdots$ are strictly increasing. Let ε > 0 be given and let N be such that for k > N, $d(x,x_{n_k}) < \varepsilon/2$ and for m, n ≥ N, $d(x_m,x_n) < \varepsilon/2$. Pick k > n. Then if n > N,
\[
d(x_n,x) \le d(x_n,x_{n_k}) + d(x_{n_k},x) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\]
Finally, suppose $\lim_{n\to\infty} x_n = x$. Then there exists N such that if n > N, then $d(x_n,x) < \varepsilon/2$. It follows that for m, n > N,
\[
d(x_n,x_m) \le d(x_n,x) + d(x,x_m) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\]
6.2. COMPACTNESS IN METRIC SPACE 135
Definition 6.6 Let (X, d) be a metric space and let S be a nonempty set in X.
Then
dist (x, S) ≡ inf {d (x, y) : y ∈ S} .
Lemma 6.7 The function, x → dist (x, S) is continuous and in fact satisfies
Proof: Suppose dist (x, S) is at least as large as dist (y, S). Then pick z ∈ S
such that d (y, z) ≤ dist (y, S) + ε. Then
Example 6.9 Let X be any infinite set and define d (x, y) = 1 if x ≠ y while d (x, y) = 0 if x = y.
You should verify the details that this is a metric space because it satisfies the axioms of a metric. The set X is closed and bounded because its complement is ∅ which is clearly open because every point of ∅ is an interior point. (There are none.) Also X is bounded because X = B (x, 2). However, X is clearly not compact because $\{B(x,\tfrac{1}{2}) : x \in X\}$ is a collection of open sets whose union contains X but
Definition 6.10 In any metric space, a set E is totally bounded if for every ε > 0
there exists a finite set of points {x1 , · · ·, xn } such that
The following proposition tells which sets in a metric space are compact. First
here is an interesting lemma.
Lemma 6.11 Let X be a metric space and suppose D is a countable dense subset
of X. In other words, it is being assumed X is a separable metric space. Consider
the open sets of the form B (d, r) where r is a positive rational number and d ∈ D.
Denote this countable collection of open sets by B. Then every open set is the union
of sets of B. Furthermore, if C is any collection of open sets, there exists a countable
subset, {Un } ⊆ C such that ∪n Un = ∪C.
Proof: Let U be an open set and let x ∈ U. Let B (x, δ) ⊆ U. Then by density of D, there exists d ∈ D ∩ B (x, δ/4) . Now pick r ∈ Q ∩ (δ/4, 3δ/4) and consider B (d, r) . Clearly, B (d, r) contains the point x because r > δ/4. Is B (d, r) ⊆ B (x, δ)? If so, this proves the lemma because x was an arbitrary point of U . Suppose z ∈ B (d, r) . Then
\[
d(z,x) \le d(z,d) + d(d,x) < r + \frac{\delta}{4} < \frac{3\delta}{4} + \frac{\delta}{4} = \delta
\]
Now let $\mathcal{C}$ be any collection of open sets. Each set in this collection is the union of countably many sets of $\mathcal{B}$. Let $\mathcal{B}'$ denote the sets of $\mathcal{B}$ which are contained in some set of $\mathcal{C}$. Thus $\cup \mathcal{B}' = \cup \mathcal{C}$. Then for each $B \in \mathcal{B}'$, pick $U_B \in \mathcal{C}$ such that $B \subseteq U_B$. Then $\{U_B : B \in \mathcal{B}'\}$ is a countable collection of sets of $\mathcal{C}$ whose union equals $\cup \mathcal{C}$. Therefore, this proves the lemma.
Proposition 6.12 Let (X, d) be a metric space. Then the following are equivalent.
Proof: Suppose 6.3 and let {xk } be a sequence. Suppose {xk } has no convergent
subsequence. If this is so, then by Lemma 6.3, {xk } has no limit point and no value
of the sequence is repeated more than finitely many times. Thus the set
Cn = ∪{xk : k ≥ n}
\[
U_n = C_n^C,
\]
then
\[
X = \cup_{n=1}^{\infty} U_n
\]
but there is no finite subcovering, because no value of the sequence is repeated more
than finitely many times. This contradicts compactness of (X, d). This shows 6.3
implies 6.4.
Now suppose 6.4 and let {xn } be a Cauchy sequence. Is {xn } convergent? By
sequential compactness xnk → x for some subsequence. By Lemma 6.5 it follows
that {xn } also converges to x showing that (X, d) is complete. If (X, d) is not
totally bounded, then there exists ε > 0 for which there is no ε net. Hence there
exists a sequence {xk } with d (xk , xl ) ≥ ε for all l ≠ k. By Lemma 6.5 again,
this contradicts 6.4 because no subsequence can be a Cauchy sequence and so no
subsequence can converge. This shows 6.4 implies 6.5.
Now suppose 6.5. What about 6.4? Let $\{p_n\}$ be a sequence and let $\{x_i^n\}_{i=1}^{m_n}$ be a $2^{-n}$ net for n = 1, 2, · · ·. Let
\[
B_n \equiv B\left(x^n_{i_n}, 2^{-n}\right)
\]
\[
p_{n_k} \in B_k.
\]
Then if k ≥ l,
\[
d(p_{n_k},p_{n_l}) \le \sum_{i=l}^{k-1} d\left(p_{n_{i+1}},p_{n_i}\right) < \sum_{i=l}^{k-1} 2^{-(i-1)} < 2^{-(l-2)}.
\]
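\[
D = \cup_{n=1}^{\infty} D_n.
\]
Now let $\mathcal{C}$ be any set of open sets such that $\cup \mathcal{C} \supseteq X$. By Lemma 6.11, there exists a countable subset of $\mathcal{C}$,
\[
\widetilde{\mathcal{C}} = \{U_n\}_{n=1}^{\infty}
\]
such that $\cup \widetilde{\mathcal{C}} = \cup \mathcal{C}$. If $\mathcal{C}$ admits no finite subcover, then neither does $\widetilde{\mathcal{C}}$ and there exists $p_n \in X \setminus \cup_{k=1}^{n} U_k$. Then since X is sequentially compact, there is a subsequence $\{p_{n_k}\}$ such that $\{p_{n_k}\}$ converges. Say
\[
p = \lim_{k\to\infty} p_{n_k}.
\]
All but finitely many points of $\{p_{n_k}\}$ are in $X \setminus \cup_{k=1}^{n} U_k$. Therefore $p \in X \setminus \cup_{k=1}^{n} U_k$ for each n. Hence
\[
p \notin \cup_{k=1}^{\infty} U_k
\]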
|z − 0| ≤ |z − xj | + |xj | < 1 + r.
where
\[
a_k = -r + i2^{-p}r, \qquad b_k = -r + (i+1)2^{-p}r,
\]
for $i \in \{0,1,\cdots,2^{p+1}-1\}$. Thus S is a collection of $\left(2^{p+1}\right)^n$ non overlapping cubes whose union equals $[-r,r)^n$ and whose diameters are all equal to $2^{-p}r\sqrt{n}$. Now choose p large enough that the diameter of these cubes is less than ε. This yields a contradiction because one of the cubes must contain infinitely many points of $\{a_i\}$. This proves the lemma.
The next theorem is called the Heine Borel theorem and it characterizes the
compact sets in Rn .
6.3. SOME APPLICATIONS OF COMPACTNESS 139
Proof: First it is shown f (X) is compact. Suppose $\mathcal{C}$ is a set of open sets whose union contains f (X). Then since f is continuous $f^{-1}(U)$ is open for all $U \in \mathcal{C}$. Therefore, $\{f^{-1}(U) : U \in \mathcal{C}\}$ is a collection of open sets whose union contains X. Since X is compact, it follows finitely many of these, $\{f^{-1}(U_1),\cdots,f^{-1}(U_p)\}$ contains X in their union. Therefore, $f(X) \subseteq \cup_{k=1}^{p} U_k$ showing f (X) is compact as claimed.
Now since f (X) is compact, Theorem 6.14 implies f (X) is closed and bounded.
Therefore, it contains its inf and its sup. Thus f achieves both a maximum and a
minimum.
Proof: Suppose this is not true and that f is continuous but not uniformly
continuous. Then there exists ε > 0 such that for all δ > 0 there exist points,
pδ and qδ such that d (pδ , qδ ) < δ and yet d (f (pδ ) , f (qδ )) ≥ ε. Let pn and qn
be the points which go with δ = 1/n. By Proposition 6.12 {pn } has a convergent
subsequence, {pnk } converging to a point, x ∈ X. Since d (pn , qn ) < 1/n, it follows
that qnk → x also. Therefore,
Definition 6.18 If every finite subset of a collection of sets has nonempty inter-
section, the collection has the finite intersection property.
Thus if you have a locally compact metric space, then if {an } is a bounded
sequence, it must have a convergent subsequence.
Let K be a compact subset of Rn and consider the continuous functions which
have values in a locally compact metric space, (X, d) where d denotes the metric on
X. Denote this space as C (K, X) .
The Ascoli Arzela theorem is a major result which tells which subsets of C (K, X)
are sequentially compact.
6.4. ASCOLI ARZELA THEOREM 141
\[
x_3^m \in K \setminus \left(B(x_1^m,1/m) \cup B(x_2^m,1/m)\right).
\]
Continue this way until the process stops, say at N (m). It must stop because if it didn’t, there would be a convergent subsequence due to the compactness of K. Ultimately all terms of this convergent subsequence would be closer than 1/m, violating the manner in which they are chosen. Then $D = \cup_{m=1}^{\infty} \cup_{k=1}^{N(m)} \{x_k^m\}$. This is countable because it is a countable union of countable sets. If y ∈ K and ε > 0, then for some m, 2/m < ε and so B (y, ε) must contain some point of $\{x_k^m\}$ since otherwise, the process stopped too soon. You could have picked y. This proves the lemma.
\[
\lim_{m\to\infty} \rho(g_m,g) = 0.
\]
Next I show that {gm } converges at every point of K. Let x ∈ K and let ε > 0 be
given. Choose $x_k$ such that for all f ∈ A,
\[
d(f(x_k),f(x)) < \frac{\varepsilon}{3}.
\]
I can do this by the equicontinuity. Now if p, q are large enough, say p, q ≥ M,
\[
d(g_p(x_k),g_q(x_k)) < \frac{\varepsilon}{3}.
\]
Therefore, for p, q ≥ M,
\[
d(g_p(x),g_q(x)) \le d(g_p(x),g_p(x_k)) + d(g_p(x_k),g_q(x_k)) + d(g_q(x_k),g_q(x)) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon
\]
It follows that $\{g_m(x)\}$ is a Cauchy sequence having values in X. Therefore, it converges. Let g (x) be the name of the thing it converges to.
Let ε > 0 be given and pick δ > 0 such that whenever x, y ∈ K and |x − y| < δ, it follows $d(f(x),f(y)) < \frac{\varepsilon}{3}$ for all f ∈ A. Now let $\{x_1,\cdots,x_m\}$ be a δ net for K as in Lemma 6.25. Since there are only finitely many points in this δ net, it follows that there exists N such that for all p, q ≥ N,
\[
d(g_q(x_i),g_p(x_i)) < \frac{\varepsilon}{3}
\]
for all $\{x_1,\cdots,x_m\}$. Therefore, for arbitrary x ∈ K, pick $x_i \in \{x_1,\cdots,x_m\}$ such that $|x_i - x| < \delta$. Then
\[
d(g_q(x),g_p(x)) \le d(g_q(x),g_q(x_i)) + d(g_q(x_i),g_p(x_i)) + d(g_p(x_i),g_p(x)) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\]
Since N does not depend on the choice of x, it follows this sequence $\{g_m\}$ is uniformly Cauchy. That is, for every ε > 0, there exists N such that if p, q ≥ N, then
\[
\rho(g_p,g_q) < \varepsilon.
\]
Now let p satisfy 6.6 for all x whenever p > N. Also pick δ > 0 such that if |x − y| < δ, then
\[
d(g_p(x),g_p(y)) < \frac{\varepsilon}{3}.
\]
Then if |x − y| < δ,
\[
d(g(x),g(y)) \le d(g(x),g_p(x)) + d(g_p(x),g_p(y)) + d(g_p(y),g(y)) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\]
Since ε was arbitrary, this shows that g is continuous.
It only remains to verify that ρ (g, gk ) → 0. But this follows from 6.6. This
proves the lemma.
With these lemmas, it is time to prove Theorem 6.24.
Proof of Theorem 6.24: Let D = {xk } be the countable dense set of K
guaranteed by Lemma 6.25 and let
be a subsequence of
\[
\lim_{k\to\infty} \rho(g,g_k) = 0.
\]
Lemma 6.28 Let (X, d) be a metric space and let S ⊆ X be a nonempty subset.
Define
dist (x, S) ≡ inf {d (x, y) : y ∈ S} .
6.5. THE TIETZE EXTENSION THEOREM 145
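Proof: The continuity of x → dist (x, S) is obvious if the inequality 6.7 is established. So let x, y ∈ X. Without loss of generality, assume dist (x, S) ≥ dist (y, S) and pick z ∈ S such that d (y, z) − ε < dist (y, S) . Then
\[
\begin{aligned}
|\operatorname{dist}(x,S) - \operatorname{dist}(y,S)| &= \operatorname{dist}(x,S) - \operatorname{dist}(y,S) \le d(x,z) - (d(y,z) - \varepsilon) \\
&\le d(z,y) + d(x,y) - d(y,z) + \varepsilon = d(x,y) + \varepsilon.
\end{aligned}
\]
Since ε > 0 is arbitrary, it follows that |dist (x, S) − dist (y, S)| ≤ d (x, y).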
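The estimate just derived says that x → dist (x, S) never changes faster than the distance between the points. A quick numerical sketch, assuming X = R with d (x, y) = |x − y| and a hypothetical finite set S chosen only for illustration:

```python
def dist(x, S):
    """dist(x, S) = inf{|x - y| : y in S}; the infimum is a minimum for finite S."""
    return min(abs(x - s) for s in S)


S = [0.0, 1.0, 4.0]  # illustrative nonempty set
points = [i / 10 for i in range(-30, 71)]

# Largest value of |dist(x, S) - dist(y, S)| - |x - y| over the sample grid.
worst = max(
    abs(dist(x, S) - dist(y, S)) - abs(x - y)
    for x in points
    for y in points
)
print(worst)
```

The printed value is never meaningfully positive, consistent with |dist (x, S) − dist (y, S)| ≤ d (x, y) on the sample.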
Lemma 6.29 Let H, K be two nonempty disjoint closed subsets of a metric space,
(X, d) . Then there exists a continuous function, g : X → [−1, 1] such that g (H) =
−1/3, g (K) = 1/3, g (X) ⊆ [−1/3, 1/3] .
Proof: Let
\[
f(x) \equiv \frac{\operatorname{dist}(x,H)}{\operatorname{dist}(x,H) + \operatorname{dist}(x,K)}.
\]
The denominator is never equal to zero because if dist (x, H) = 0, then x ∈ H because H is closed. (To see this, pick $h_k \in B(x,1/k) \cap H$. Then $h_k \to x$ and since H is closed, x ∈ H.) Similarly, if dist (x, K) = 0, then x ∈ K and so, since H and K are disjoint, the denominator is never zero as claimed. Hence, by Lemma 6.28, f is continuous and from its definition, f = 0 on H and f = 1 on K. Now let $g(x) \equiv \frac{2}{3}\left(f(x) - \frac{1}{2}\right)$. Then g has the desired properties.
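The construction in this proof is concrete enough to compute. The sketch below assumes X = R and the hypothetical closed sets H = [−2, −1] and K = [1, 2] (both chosen only for illustration), and evaluates f and g (x) = (2/3)(f (x) − 1/2).

```python
def dist_to_interval(x, a, b):
    """Distance from x to the closed interval [a, b] in R."""
    return max(a - x, 0.0, x - b)


def f(x):
    """dist(x, H) / (dist(x, H) + dist(x, K)) for H = [-2, -1], K = [1, 2]."""
    d_h = dist_to_interval(x, -2.0, -1.0)
    d_k = dist_to_interval(x, 1.0, 2.0)
    return d_h / (d_h + d_k)


def g(x):
    """Rescaling of f so that g = -1/3 on H and g = 1/3 on K."""
    return (2.0 / 3.0) * (f(x) - 0.5)


grid = [i / 100 for i in range(-300, 301)]
print(g(-1.5), g(0.0), g(1.5), max(abs(g(x)) for x in grid))
```

On the sample grid g takes the value −1/3 on H, 1/3 on K, and stays inside [−1/3, 1/3] everywhere else.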
Definition 6.30 For f a real or complex valued bounded continuous function de-
fined on a metric space, M
Lemma 6.31 Suppose M is a closed set in X where (X, d) is a metric space and
suppose f : M → [−1, 1] is continuous at every point of M. Then there exists a
function, g which is defined and continuous on all of X such that $||f - g||_M < \frac{2}{3}$.
Lemma 6.32 Suppose M is a closed set in X where (X, d) is a metric space and
suppose f : M → [−1, 1] is continuous at every point of M. Then there exists a
function, g which is defined and continuous on all of X such that g = f on M and
g has its values in [−1, 1] .
Proof: Let $g_1$ be such that $g_1(X) \subseteq [-1/3,1/3]$ and $||f - g_1||_M \le \frac{2}{3}$. Suppose $g_1,\cdots,g_m$ have been chosen such that $g_j(X) \subseteq [-1/3,1/3]$ and
\[
\left\| f - \sum_{i=1}^{m} \left(\frac{2}{3}\right)^{i-1} g_i \right\|_M < \left(\frac{2}{3}\right)^{m}. \tag{6.8}
\]
Hence
\[
\left\| \left( f - \sum_{i=1}^{m} \left(\frac{2}{3}\right)^{i-1} g_i \right) - \left(\frac{2}{3}\right)^{m} g_{m+1} \right\|_M \le \left(\frac{2}{3}\right)^{m+1}.
\]
It follows there exists a sequence, $\{g_i\}$ such that each has its values in [−1/3, 1/3] and for every m 6.8 holds. Then let
\[
g(x) \equiv \sum_{i=1}^{\infty} \left(\frac{2}{3}\right)^{i-1} g_i(x).
\]
It follows
\[
|g(x)| \le \left| \sum_{i=1}^{\infty} \left(\frac{2}{3}\right)^{i-1} g_i(x) \right| \le \sum_{i=1}^{\infty} \left(\frac{2}{3}\right)^{i-1} \frac{1}{3} \le 1
\]
and since convergence is uniform, g must be continuous. The estimate 6.8 implies f = g on M .
The following is the Tietze extension theorem.
Theorem 6.33 Let M be a closed nonempty subset of a metric space (X, d) and let f : M → [a, b] be continuous at every point of M. Then there exists a function, g continuous on all of X which coincides with f on M such that g (X) ⊆ [a, b] .
Proof: Let $f_1(x) = 1 + \frac{2}{b-a}(f(x) - b)$. Then $f_1$ satisfies the conditions of Lemma 6.32 and so there exists $g_1 : X \to [-1,1]$ such that $g_1$ is continuous on X and equals $f_1$ on M . Let $g(x) = (g_1(x) - 1)\left(\frac{b-a}{2}\right) + b$. This works.
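The two affine maps in this proof are inverse to each other, which is why the extension of f1 pulls back to an extension of f. A minimal sketch, with the hypothetical helper names `to_unit` and `from_unit` and the interval [2, 5] chosen only for illustration:

```python
def to_unit(t, a, b):
    """The map t -> 1 + 2/(b - a) * (t - b), sending [a, b] onto [-1, 1]."""
    return 1.0 + 2.0 / (b - a) * (t - b)


def from_unit(s, a, b):
    """The map s -> (s - 1) * (b - a)/2 + b, sending [-1, 1] back onto [a, b]."""
    return (s - 1.0) * (b - a) / 2.0 + b


a, b = 2.0, 5.0
samples = [a + (b - a) * i / 10 for i in range(11)]
round_trip_error = max(abs(from_unit(to_unit(t, a, b), a, b) - t) for t in samples)
print(to_unit(a, a, b), to_unit(b, a, b), round_trip_error)
```

The endpoints a and b map to −1 and 1, and composing the two maps returns each sample point up to rounding.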
6.6. GENERAL TOPOLOGICAL SPACES 147
Note that this is simply the analog of saying a set is open exactly when every
point is an interior point.
Proposition 6.35 Let X be a set and let B be a basis for a topology as defined
above and let τ be the set of open sets determined by B. Then
∅ ∈ τ, X ∈ τ, (6.9)
If C ⊆ τ , then ∪ C ∈ τ (6.10)
If A, B ∈ τ , then A ∩ B ∈ τ . (6.11)
Definition 6.36 A set X together with such a collection of its subsets satisfying
6.9-6.11 is called a topological space. τ is called the topology or set of open sets of
X.
[Figure: Hausdorff. Points p and q with open sets U and V.]
Note that if the topological space is Hausdorff, then this definition is equivalent
to requiring that every open set containing p contains infinitely many points from
E. Why?
Theorem 6.39 A subset, E, of X is closed if and only if it contains all its limit
points.
[Figures: Regular; Normal, with sets C and K and open sets U and V.]
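Proof: Let $\mathcal{C}$ denote all the closed sets which contain E. Then $\mathcal{C}$ is nonempty because $X \in \mathcal{C}$.
\[
\left( \cap \{A : A \in \mathcal{C}\} \right)^C = \cup \left\{ A^C : A \in \mathcal{C} \right\},
\]
an open set which shows that $\cap \mathcal{C}$ is a closed set and is the smallest closed set which contains E.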
Proof: Let $x \in \overline{E}$ and suppose that $x \notin E$. If x is not a limit point either, then there exists an open set, U , containing x which does not intersect E. But then $U^C$ is a closed set which contains E which does not contain x, contrary to the definition that $\overline{E}$ is the intersection of all closed sets containing E. Therefore, x must be a limit point of E after all.
Now $E \subseteq \overline{E}$ so suppose x is a limit point of E. Is $x \in \overline{E}$? If H is a closed set containing E, which does not contain x, then $H^C$ is an open set containing x which contains no points of E other than x negating the assumption that x is a limit point of E.
The following is the definition of continuity in terms of general topological spaces.
It is really just a generalization of the ε - δ definition of continuity given in calculus.
Definition 6.46 Let (X, τ ) and (Y, η) be two topological spaces and let f : X → Y .
f is continuous at x ∈ X if whenever V is an open set of Y containing f (x), there
exists an open set U ∈ τ such that x ∈ U and f (U ) ⊆ V . f is continuous if
f −1 (V ) ∈ τ whenever V ∈ η.
x = (x1 , · · ·, xn ) .
Then $x_i \in A_i \cap B_i$ for each i. Therefore, $x \in \prod_{i=1}^{n} A_i \cap B_i \in \mathcal{B}$ and $\prod_{i=1}^{n} A_i \cap B_i \subseteq \prod_{i=1}^{n} A_i$.
The definition of compactness is also considered for a general topological space.
This is given next.
A useful construction when dealing with locally compact Hausdorff spaces is the
notion of the one point compactification of the space.
Definition 6.51 Suppose (X, τ ) is a locally compact Hausdorff space. Then let $\widetilde{X} \equiv X \cup \{\infty\}$ where ∞ is just the name of some point which is not in X which is called the point at infinity. A basis for the topology $\widetilde{\tau}$ for $\widetilde{X}$ is
\[
\tau \cup \left\{ K^C \text{ where } K \text{ is a compact subset of } X \right\}.
\]
disjoint open sets containing the points, p and ∞ respectively. Now let $\mathcal{C}$ be an open cover of $\widetilde{X}$ with sets from $\widetilde{\tau}$. Then ∞ must be in some set, $U_\infty$ from $\mathcal{C}$, which must contain a set of the form $K^C$ where K is a compact subset of X. Then there exist sets from $\mathcal{C}$, $U_1,\cdots,U_r$ which cover K. Therefore, a finite subcover of $\widetilde{X}$ is $U_1,\cdots,U_r,U_\infty$.
In general topological spaces there may be no concept of “bounded”. Even if
there is, closed and bounded is not necessarily the same as compactness. However,
in any Hausdorff space every compact set must be a closed set.
Theorem 6.53 If (X, τ ) is a Hausdorff space, then every compact subset must also
be a closed set.
Proof: Suppose $p \notin K$. For each x ∈ K, there exist open sets, $U_x$ and $V_x$ such that
\[
x \in U_x, \quad p \in V_x,
\]
and
\[
U_x \cap V_x = \emptyset.
\]
If K is assumed to be compact, there are finitely many of these sets, $U_{x_1},\cdots,U_{x_m}$ which cover K. Then let $V \equiv \cap_{i=1}^{m} V_{x_i}$. It follows that V is an open set containing p which has empty intersection with each of the $U_{x_i}$. Consequently, V contains no points of K and p is therefore not a limit point of K. This proves the theorem.
Definition 6.54 If every finite subset of a collection of sets has nonempty inter-
section, the collection has the finite intersection property.
Theorem 6.55 Let K be a set whose elements are compact subsets of a Hausdorff
topological space, (X, τ ). Suppose K has the finite intersection property. Then
∅ ≠ ∩K.
Lemma 6.56 Let (X, τ ) be a topological space and let B be a basis for τ . Then K
is compact if and only if every open cover of basic open sets admits a finite subcover.
\[
S = A \cup B, \quad A, B \neq \emptyset, \quad \text{and} \quad \overline{A} \cap B = \overline{B} \cap A = \emptyset.
\]
In this case, the sets A and B are said to separate S. A set is connected if it is not
separated.
One of the most important theorems about connected sets is the following.
Theorem 6.58 Suppose U and V are connected sets having nonempty intersection.
Then U ∪ V is also connected.
It follows one of these sets must be empty since otherwise, U would be separated.
It follows that U is contained in either A or B. Similarly, V must be contained in
either A or B. Since U and V have nonempty intersection, it follows that both V
and U are contained in one of the sets, A, B. Therefore, the other must be empty
and this shows U ∪ V cannot be separated and is therefore, connected.
The intersection of connected sets is not necessarily connected as is shown by
the following picture.
[Figure: connected sets whose intersection is not connected; one set is labeled V.]
6.7. CONNECTED SETS 153
Proof: To do this you show f (X) is not separated. Suppose to the contrary
that f (X) = A ∪ B where A and B separate f (X) . Then consider the sets, f −1 (A)
and f −1 (B) . If z ∈ f −1 (B) , then f (z) ∈ B and so f (z) is not a limit point of
A. Therefore, there exists an open set, U containing f (z) such that U ∩ A = ∅.
But then, the continuity of f implies that f −1 (U ) is an open set containing z such
that f −1 (U ) ∩ f −1 (A) = ∅. Therefore, f −1 (B) contains no limit points of f −1 (A) .
Similar reasoning implies f −1 (A) contains no limit points of f −1 (B). It follows
that X is separated by f −1 (A) and f −1 (B) , contradicting the assumption that X
was connected.
An arbitrary set can be written as a union of maximal connected sets called
connected components. This is the concept of the next definition.
Definition 6.60 Let S be a set and let p ∈ S. Denote by Cp the union of all
connected subsets of S which contain p. This is called the connected component
determined by p.
S ≡ {t ∈ [x, y] : [x, t] ⊆ A}
(l, l + δ) ∩ B = ∅
Theorem 6.63 Let U be an open set in R. Then there exist countably many disjoint open sets, $\{(a_i,b_i)\}_{i=1}^{\infty}$ such that $U = \cup_{i=1}^{\infty} (a_i,b_i)$.
Definition 6.64 A topological space, E is arcwise connected if for any two points,
p, q ∈ E, there exists a closed interval, [a, b] and a continuous function, γ : [a, b] → E
such that γ (a) = p and γ (b) = q. E is locally connected if it has a basis of connected
open sets. E is locally arcwise connected if it has a basis of arcwise connected open
sets.
You can verify that this set of points considered as a metric space with the metric
from R2 is not locally connected or arcwise connected but is connected.
Proof: Suppose not. Then it achieves two different values, k and l ≠ k. Then $\Omega = f^{-1}(l) \cup f^{-1}(\{m \in \mathbb{Z} : m \neq l\})$ and these are disjoint nonempty open sets which separate Ω. To see they are open, note
\[
f^{-1}\left(\{m \in \mathbb{Z} : m \neq l\}\right) = f^{-1}\left( \cup_{m \neq l} \left( m - \frac{1}{6}, m + \frac{1}{6} \right) \right)
\]
Definition 7.1 α = (α1 , ···, αn ) for α1 ···αn positive integers is called a multi-index.
For α a multi-index, |α| ≡ α1 + · · · + αn and if x ∈ Rn ,
x = (x1 , · · ·, xn ) ,
The following estimate will be the basis for the Weierstrass approximation the-
orem. It is actually a statement about the variance of a binomial random variable.
158 WEIERSTRASS APPROXIMATION THEOREM
Therefore,
\[
\begin{aligned}
\sum_{k=0}^{m} \binom{m}{k} (k - mx)^2 x^k (1-x)^{m-k} &= m(m-1)x^2 + mx - 2m^2x^2 + m^2x^2 \\
&= m\left(x - x^2\right) \le \frac{1}{4}m.
\end{aligned}
\]
This proves the lemma.
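The identity and the bound m/4 can be confirmed by direct summation. The sketch below uses only the formula displayed above; the function name `second_moment` is illustrative.

```python
from math import comb


def second_moment(m, x):
    """Sum over k of C(m, k) (k - m x)^2 x^k (1 - x)^(m - k)."""
    return sum(
        comb(m, k) * (k - m * x) ** 2 * x**k * (1 - x) ** (m - k)
        for k in range(m + 1)
    )


for m, x in ((10, 0.3), (10, 0.5), (25, 0.9)):
    print(m, x, second_moment(m, x), m * (x - x * x), m / 4)
```

In each row the computed sum matches m (x − x²) and does not exceed m/4, with equality at x = 1/2.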
Now for $x = (x_1,\cdots,x_n) \in [0,1]^n$ consider the polynomial,
\[
p_m(x) \equiv \sum_{k_1=0}^{m} \cdots \sum_{k_n=0}^{m} \binom{m}{k_1}\binom{m}{k_2} \cdots \binom{m}{k_n} x_1^{k_1}(1-x_1)^{m-k_1} x_2^{k_2}(1-x_2)^{m-k_2} \cdots x_n^{k_n}(1-x_n)^{m-k_n} f\left(\frac{k_1}{m},\cdots,\frac{k_n}{m}\right). \tag{7.2}
\]
Also define if I is a set in Rn
Also to simplify the notation, let $k = (k_1,\cdots,k_n)$ where each $k_i \in [0,m]$, $\frac{k}{m} \equiv \left(\frac{k_1}{m},\cdots,\frac{k_n}{m}\right)$, and let
\[
\binom{m}{k} \equiv \binom{m}{k_1}\binom{m}{k_2} \cdots \binom{m}{k_n}.
\]
7.1. THE BERNSTEIN POLYNOMIALS 159
Also define
\[
||k||_\infty \equiv \max\{k_i,\ i = 1,2,\cdots,n\}
\]
\[
x^k(1-x)^{m-k} \equiv x_1^{k_1}(1-x_1)^{m-k_1} x_2^{k_2}(1-x_2)^{m-k_2} \cdots x_n^{k_n}(1-x_n)^{m-k_n}.
\]
Thus in terms of this notation,
\[
p_m(x) = \sum_{||k||_\infty \le m} \binom{m}{k} x^k(1-x)^{m-k} f\left(\frac{k}{m}\right)
\]
Lemma 7.3 For $x \in [0,1]^n$, f a continuous function defined on $[0,1]^n$, and $p_m$ given in 7.2, $p_m$ converges uniformly to f on $[0,1]^n$ as m → ∞.
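and so for $x \in [0,1]^n$,
\[
\begin{aligned}
|p_m(x) - f(x)| &\le \sum_{||k||_\infty \le m} \binom{m}{k} x^k(1-x)^{m-k} \left| f\left(\frac{k}{m}\right) - f(x) \right| \\
&\le \sum_{k \in G} \binom{m}{k} x^k(1-x)^{m-k} \left| f\left(\frac{k}{m}\right) - f(x) \right| \\
&\quad + \sum_{k \in G^C} \binom{m}{k} x^k(1-x)^{m-k} \left| f\left(\frac{k}{m}\right) - f(x) \right|
\end{aligned} \tag{7.3}
\]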
Letting $M \ge \max\{|f(x)| : x \in [0,1]^n\}$ it follows
because on $G^C$,
\[
\frac{(k_j - mx_j)^2}{\eta^2 m^2} < 1, \quad j = 1,\cdots,n.
\]
Now by Lemma 7.2,
\[
|p_m(x) - f(x)| \le \varepsilon + 2M \left(\frac{1}{\eta^2 m^2}\right)^n \left(\frac{m}{4}\right)^n.
\]
Therefore, since the right side does not depend on x, it follows
and since ε is arbitrary, this shows $\lim_{m\to\infty} ||p_m - f||_{[0,1]^n} = 0$. This proves the lemma.
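Uniform convergence of the polynomials in 7.2 can be observed numerically in the case n = 1. The sketch below is only an illustration: the test function |t − 1/2| and the grid used to estimate the sup norm are assumptions made for the example, and a grid maximum merely approximates the true supremum.

```python
from math import comb


def bernstein(f, m, x):
    """One-variable case of 7.2: sum of C(m, k) x^k (1 - x)^(m - k) f(k / m)."""
    return sum(
        comb(m, k) * x**k * (1 - x) ** (m - k) * f(k / m) for k in range(m + 1)
    )


def sup_error(f, m, grid_points=101):
    """Maximum of |p_m(x) - f(x)| over an evenly spaced grid in [0, 1]."""
    grid = [i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(bernstein(f, m, x) - f(x)) for x in grid)


def kink(t):
    """Continuous but not differentiable at 1/2 (illustrative test function)."""
    return abs(t - 0.5)


errors = [sup_error(kink, m) for m in (10, 40, 160)]
print(errors)
```

The estimated sup-norm errors shrink as m grows, though slowly, which is typical for these polynomials near a corner of f.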
The following is not surprising.
Lemma 7.4 Let f be a continuous function defined on $[-M,M]^n$. Then there exists a sequence of polynomials, $\{p_m\}$ converging uniformly to f on $[-M,M]^n$.
Theorem 7.5 Let M be a closed nonempty subset of a metric space (X, d) and let
f : M → [a, b] be continuous at every point of M. Then there exists a function, g
continuous on all of X which coincides with f on M such that g (X) ⊆ [a, b] .
Theorem 7.6 Let K be a compact set in Rn and let f be a continuous function de-
fined on K. Then there exists a sequence of polynomials {pm } converging uniformly
to f on K.
Proof: Choose M large enough that $K \subseteq [-M,M]^n$ and let $\widetilde{f}$ denote a continuous function defined on all of $[-M,M]^n$ such that $\widetilde{f} = f$ on K. Such an extension exists by the Tietze extension theorem, Theorem 7.5 applied to the real and imaginary parts of f . By Lemma 7.4 there exists a sequence of polynomials, $\{p_m\}$ defined on $[-M,M]^n$ such that $||\widetilde{f} - p_m||_{[-M,M]^n} \to 0$. Therefore, $||\widetilde{f} - p_m||_K \to 0$ also. This proves the theorem.
To begin with assume that the field of scalars is R. This will be generalized
later. Theorem 7.6 implies the following very special case.
The next result is the key to the profound generalization of the Weierstrass
theorem due to Stone in which an interval will be replaced by a compact or locally
compact set and polynomials will be replaced with elements of an algebra satisfying
certain axioms.
Corollary 7.9 On the interval [−M, M ], there exist polynomials pn such that
pn (0) = 0
and
\[
\lim_{n\to\infty} \left\| p_n - |\cdot| \right\|_\infty = 0.
\]
Proof: By Corollary 7.8 there exists a sequence of polynomials, {p̃n } such that
p̃n → |·| uniformly. Then let pn (t) ≡ p̃n (t) − p̃n (0) . This proves the corollary.
Lemma 7.12 Let $c_1$ and $c_2$ be two real numbers and let $x_1 \neq x_2$ be two points of A. Then there exists a function $f_{x_1x_2}$ such that
\[
g(x_1) \neq g(x_2).
\]
Such a g exists because the algebra separates points. Since the algebra annihilates no point, there exist functions h and k such that
\[
h(x_1) \neq 0, \quad k(x_2) \neq 0.
\]
Then let
\[
u \equiv gh - g(x_2)h, \quad v \equiv gk - g(x_1)k.
\]
It follows that $u(x_1) \neq 0$ and $u(x_2) = 0$ while $v(x_2) \neq 0$ and $v(x_1) = 0$. Let
\[
f_{x_1x_2} \equiv \frac{c_1 u}{u(x_1)} + \frac{c_2 v}{v(x_2)}.
\]
This proves the lemma. Now continue the proof of Theorem 7.11.
First note that $\overline{\mathcal{A}}$ satisfies the same axioms as $\mathcal{A}$ but in addition to these axioms, $\overline{\mathcal{A}}$ is closed. The closure of $\mathcal{A}$ is taken with respect to the usual norm on C (A),
\[
\max(f,g) = \frac{|f - g| + (f + g)}{2}
\]
\[
\min(f,g) = \frac{(f + g) - |f - g|}{2}.
\]
Therefore, this shows that if f, g ∈ A then
Now let h ∈ C (A; R) and let x ∈ A. Use Lemma 7.12 to obtain fxy , a function
of A which agrees with h at x and y. Letting ε > 0, there exists an open set U (y)
containing y such that
Then fx ∈ A and
fx (z) > h (z) − ε
for all z ∈ A and fx (x) = h (x). This implies that for each x ∈ A there exists an
open set V (x) containing x such that for z ∈ V (x),
Therefore,
f (z) < h (z) + ε
for all z ∈ A and since fx (z) > h (z) − ε for all z ∈ A, it follows
also and so
|f (z) − h (z)| < ε
for all z. Since ε is arbitrary, this shows $h \in \overline{\mathcal A}$ and proves $\overline{\mathcal A} = C(A; \mathbb{R})$. This
proves the theorem.
Lemma 7.14 For (X, τ ) a locally compact Hausdorff space with the above norm,
C0 (X) is a complete space.
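Proof: Let $(\tilde X, \tilde\tau)$ be the one point compactification described in Lemma 6.52.
$$D \equiv \left\{ f \in C(\tilde X) : f(\infty) = 0 \right\}.$$
Then D is a closed subspace of $C(\tilde X)$. For f ∈ C0 (X),
$$\tilde f(x) \equiv \begin{cases} f(x) & \text{if } x \in X \\ 0 & \text{if } x = \infty \end{cases}$$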
and let θ : C0 (X) → D be given by θf = fe. Then θ is one to one and onto and also
satisfies ||f ||∞ = ||θf ||∞ . Now D is complete because it is a closed subspace of a
complete space and so C0 (X) with ||·||∞ is also complete. This proves the lemma.
The above refers to functions which have values in C but the same proof works
for functions which have values in any complete normed linear space.
In the case where the functions in C0 (X) all have real values, I will denote the
resulting space by C0 (X; R) with similar meanings in other cases.
With this lemma, the generalization of the Stone Weierstrass theorem to locally
compact sets is as follows.
7.3 Exercises
1. Let (X, τ ) , (Y, η) be topological spaces and let A ⊆ X be compact. Then if
f : X → Y is continuous, show that f (A) is also compact.
2. ↑ In the context of Problem 1, suppose R = Y where the usual topology is
placed on R. Show f achieves its maximum and minimum on A.
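3. Let V be an open set in $\mathbb{R}^n$. Show there is an increasing sequence of compact
sets, $K_m$, such that $V = \cup_{m=1}^\infty K_m$. Hint: Let
$$C_m \equiv \left\{ x \in \mathbb{R}^n : \operatorname{dist}(x, V^C) \ge \frac{1}{m} \right\}$$
where
$$\operatorname{dist}(x, S) \equiv \inf\{ |y - x| \text{ such that } y \in S \}.$$
Consider $K_m \equiv C_m \cap \overline{B(0, m)}$.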
4. Let B (X; Rn ) be the space of functions f , mapping X to Rn such that
where
||f || ≡ sup{|f (x)| : x ∈ X}
and
$$\rho_\alpha(f) \equiv \sup\left\{ \frac{|f(x) - f(y)|}{|x - y|^\alpha} : x, y \in X,\ x \neq y \right\}.$$
Show that $(C^\alpha(X; \mathbb{R}^n), \|\cdot\|_\alpha)$ is a complete normed linear space.
6. Let $\{f_n\}_{n=1}^\infty \subseteq C^\alpha(X; \mathbb{R}^n)$ where X is a compact subset of $\mathbb{R}^p$ and suppose
$$\|f_n\|_\alpha \le M$$
for all n. Show there exists a subsequence, $n_k$, such that $f_{n_k}$ converges in
$C(X; \mathbb{R}^n)$. The given sequence is called precompact when this happens. (This
also shows the embedding of $C^\alpha(X; \mathbb{R}^n)$ into $C(X; \mathbb{R}^n)$ is a compact embedding.)
7. Use the general Stone Weierstrass approximation theorem to prove Theorem
7.6.
8. Let (X, d) be a metric space where d is a bounded metric. Let C denote the
collection of closed subsets of X. For A, B ∈ C, define
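Let $A \equiv \cap_{n=1}^\infty \cup_{k=n}^\infty A_k$. By the first part, there exists N1 > N such that for
n ≥ N1,
$$\rho\left( \cup_{k=n}^\infty A_k, A \right) < \varepsilon, \text{ and } (A_n)_\varepsilon \supseteq \cup_{k=n}^\infty A_k.$$
$$(A_n)_\varepsilon \supseteq \cup_{k=n}^\infty A_k \supseteq A.$$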
10. ↑ Let X be a compact metric space. Show (C, ρ) is compact. Hint: Let $D_n$
be a $2^{-n}$ net for X. Let $K_n$ denote finite unions of sets of the form $B(p, 2^{-n})$
where $p \in D_n$. Show $K_n$ is a $2^{-(n-1)}$ net for (C, ρ).
Part II
Abstract Measure And
Integration
8.1 σ Algebras
This chapter is on the basics of measure theory and integration. A measure is a real
valued mapping from some subset of the power set of a given set which has values
in [0, ∞]. Many apparently different things can be considered as measures and also
there is an integral defined. By discussing this in terms of axioms and in a very
abstract setting, many different topics can be considered in terms of one general
theory. For example, it will turn out that sums are included as an integral of this
sort. So is the usual integral as well as things which are often thought of as being
in between sums and integrals.
Let Ω be a set and let F be a collection of subsets of Ω satisfying
∅ ∈ F, Ω ∈ F , (8.1)
E ∈ F implies $E^C \equiv \Omega \setminus E \in \mathcal F$,
If $\{E_n\}_{n=1}^\infty \subseteq \mathcal F$, then $\cup_{n=1}^\infty E_n \in \mathcal F$. (8.2)
Definition 8.1 A collection of subsets of a set, Ω, satisfying Formulas 8.1-8.2 is
called a σ algebra.
As an example, let Ω be any set and let F = P(Ω), the set of all subsets of Ω
(power set). This obviously satisfies Formulas 8.1-8.2.
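For a finite set Ω the axioms 8.1-8.2 can be checked by brute force. The sketch below does this for the power set and for two smaller families; the helper names `power_set` and `is_sigma_algebra` are introduced here for illustration only, and the reduction of countable unions to pairwise unions is valid only because the family is finite.

```python
from itertools import chain, combinations


def power_set(omega):
    """All subsets of a finite set, each stored as a frozenset."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def is_sigma_algebra(omega, family):
    """Check Formulas 8.1-8.2 for a finite family of subsets of omega."""
    omega = frozenset(omega)
    if frozenset() not in family or omega not in family:
        return False
    if any(omega - e not in family for e in family):
        return False
    # In a finite family, closure under pairwise unions already gives
    # closure under countable unions.
    return all(a | b in family for a in family for b in family)


omega = {1, 2, 3}
trivial = {frozenset(), frozenset(omega)}
not_closed = {frozenset(), frozenset({1}), frozenset(omega)}

print(is_sigma_algebra(omega, power_set(omega)))
print(is_sigma_algebra(omega, trivial))
print(is_sigma_algebra(omega, not_closed))
```

The last family fails because it does not contain the complement of {1}.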
Lemma 8.2 Let C be a set whose elements are σ algebras of subsets of Ω. Then
∩C is a σ algebra also.
Be sure to verify this lemma. It follows immediately from the above definitions
but it is important for you to check the details.
Example 8.3 Let τ denote the collection of all open sets in Rn and let σ (τ ) ≡
intersection of all σ algebras that contain τ . σ (τ ) is called the σ algebra of Borel
sets . In general, for a collection of sets, Σ, σ (Σ) is the smallest σ algebra which
contains Σ.
whenever the Ei are disjoint sets of F. The triple, (Ω, F, µ) is called a measure
space and the elements of F are called the measurable sets. (Ω, F, µ) is a finite
measure space when µ (Ω) < ∞.
The following theorem is the basis for most of what is done in the theory of
measure and integration. It is a very simple result which follows directly from the
above definition.
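Theorem 8.5 Let $\{E_m\}_{m=1}^\infty$ be a sequence of measurable sets in a measure space
(Ω, F, µ). Then if $\cdots E_n \subseteq E_{n+1} \subseteq E_{n+2} \subseteq \cdots$,
$$\mu(\cup_{i=1}^\infty E_i) = \lim_{n\to\infty} \mu(E_n) \quad (8.4)$$
and if $\cdots E_n \supseteq E_{n+1} \supseteq E_{n+2} \supseteq \cdots$ and $\mu(E_1) < \infty$, then
$$\mu(\cap_{i=1}^\infty E_i) = \lim_{n\to\infty} \mu(E_n). \quad (8.5)$$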
Stated more succinctly, Ek ↑ E implies µ (Ek ) ↑ µ (E) and Ek ↓ E with µ (E1 ) < ∞
implies µ (Ek ) ↓ µ (E).
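A small numerical illustration of both statements may help. It assumes a specific measure that is not in the text: on {1, 2, ...} the point k is given mass 2^(-k), so the whole space has measure 1. The function names are illustrative. The final comment records why the hypothesis µ(E1) < ∞ cannot be dropped.

```python
from fractions import Fraction
import math


def mu_initial(n):
    """mu({1, ..., n}) for the measure with mu({k}) = 2**(-k)."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))


def mu_tail(n):
    """mu({n, n+1, ...}) = 1 - mu({1, ..., n-1}), using mu(whole space) = 1."""
    return 1 - mu_initial(n - 1)


for n in (1, 5, 10, 20):
    # Increasing sets {1..n} have measures rising to 1;
    # decreasing tails {n, n+1, ...} have measures falling to 0.
    print(n, float(mu_initial(n)), float(mu_tail(n)))

# Counting measure instead: every tail {n, n+1, ...} has measure infinity,
# but the tails decrease to the empty set, which has measure 0.
counting_measure_of_each_tail = math.inf
counting_measure_of_intersection = 0
```

With the weighted measure the tails have measure 2^(-(n-1)), which tends to the measure of the empty intersection, in agreement with 8.5.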
and the sets in the above union are disjoint. Hence by 8.3,
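$$\mu(\cup_{i=1}^\infty E_i) = \mu(E_1) + \sum_{k=1}^\infty \mu(E_{k+1} \setminus E_k) = \mu(E_1) + \sum_{k=1}^\infty \left( \mu(E_{k+1}) - \mu(E_k) \right)$$
$$= \mu(E_1) + \lim_{n\to\infty} \sum_{k=1}^n \left( \mu(E_{k+1}) - \mu(E_k) \right) = \lim_{n\to\infty} \mu(E_{n+1}).$$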
$$\mu(E_1) - \mu(\cap_{i=1}^\infty E_i) = \mu(E_1 \setminus \cap_{i=1}^\infty E_i) = \lim_{n\to\infty} \mu(E_1 \setminus \cap_{i=1}^n E_i)$$
xn > l.
Proof: First note that the first and the third are equivalent. To see this, observe
$$f^{-1}([d, \infty]) = \cap_{n=1}^\infty f^{-1}((d - 1/n, \infty]),$$
$$f^{-1}((d, \infty]) = \cup_{n=1}^\infty f^{-1}([d + 1/n, \infty]),$$
so the first and fourth conditions are equivalent. Thus the first four conditions are
equivalent and if any of them hold, then for −∞ < a < b < ∞,
and so the third condition holds. Therefore, all five conditions are equivalent. This
proves the lemma.
This lemma allows for the following definition of a measurable function having
values in (−∞, ∞].
Definition 8.7 Let (Ω, F, µ) be a measure space and let f : Ω → (−∞, ∞]. Then
f is said to be measurable if any of the equivalent conditions of Lemma 8.6 hold.
When the σ algebra, F equals the Borel σ algebra, B, the function is called Borel
measurable. More generally, if f : Ω → X where X is a topological space, f is said
to be measurable if f −1 (U ) ∈ F whenever U is open.
$$(a, b) = \cup_{m=1}^\infty V_m = \cup_{m=1}^\infty \overline{V_m}.$$
Note that $V_m \neq \emptyset$ for all m large enough. Since f is the pointwise limit of $f_n$,
You should note that the expression in the middle is of the form
$$\cup_{n=1}^\infty \cap_{k=n}^\infty f_k^{-1}(V_m).$$
Therefore,
$$f^{-1}((a, b)) = \cup_{m=1}^\infty f^{-1}(V_m) \subseteq \cup_{m=1}^\infty \cup_{n=1}^\infty \cap_{k=n}^\infty f_k^{-1}(V_m) \subseteq \cup_{m=1}^\infty f^{-1}(\overline{V_m}) = f^{-1}((a, b)).$$
It follows f −1 ((a, b)) ∈ F because it equals the expression in the middle which is
measurable. This shows f is measurable.
The following theorem considers the case of functions which have values in a
metric space. Its proof is similar to the proof of the above.
and so
$$f^{-1}(U) = \cup_{m=1}^\infty f^{-1}(V_m) \subseteq \cup_{m=1}^\infty \cup_{n=1}^\infty \cap_{k=n}^\infty f_k^{-1}(V_m) \subseteq \cup_{m=1}^\infty f^{-1}(\overline{V_m}) = f^{-1}(U)$$
where each xk ∈ X and the Ak are disjoint measurable sets. (Such functions are
often referred to as simple functions.) Then f is measurable.
Theorem 8.11 Let E be a compact metric space and let (Ω, F) be a measure space.
Suppose ψ : E × Ω → R has the property that x → ψ (x, ω) is continuous and
ω → ψ (x, ω) is measurable. Then there exists a measurable function, f having
values in E such that
$$\psi(f(\omega), \omega) = \sup_{x \in E} \psi(x, \omega).$$
and let s1 (ω) ≡ x12 on A12 . Continue in this way to obtain a simple function, s1
such that
ψ (s1 (ω) , ω) = max {ψ (x, ω) : x ∈ C1 }
and s1 has values in C1 .
Suppose s1 (ω) , s2 (ω) , · · ·, sm (ω) are simple functions with the property that if
m > 1,
for each k + 1 ≤ m, only the second and third assertions holding if m = 1. Letting
$C_m = \{x_k\}_{k=1}^N$, it follows $s_m(\omega)$ is of the form
$$s_m(\omega) = \sum_{k=1}^N x_k \mathcal{X}_{A_k}(\omega), \quad A_i \cap A_j = \emptyset. \quad (8.6)$$
Denote by $\{y_{1i}\}_{i=1}^{n_1}$ those points of $C_{m+1}$ which are contained in $B(x_1, 2^{-m})$. Letting
$A_k$ play the role of Ω in the first step in which $s_1$ was constructed, for each
$\omega \in A_1$ let $s_{m+1}(\omega)$ be a simple function which has one of the values $y_{1i}$ and satisfies
for each $\omega \in A_1$. Next let $\{y_{2i}\}_{i=1}^{n_2}$ be those points of $C_{m+1}$ different than $\{y_{1i}\}_{i=1}^{n_1}$
which are contained in $B(x_2, 2^{-m})$. Then define $s_{m+1}(\omega)$ on $A_2$ to have values
taken from $\{y_{2i}\}_{i=1}^{n_2}$ and
for each ω ∈ A2 . Continuing this way defines sm+1 on all of Ω and it satisfies
It remains to verify
where $y_j \in C_{m+1}$ and out of all the balls $B(x_l, 2^{-m})$, the first one which contains
$y_j$ is $B(x_k, 2^{-m})$. Then by the construction, $s_{m+1}(\omega) = y_j$. This and 8.9 verifies
8.8.
From 8.7 it follows sm (ω) converges uniformly on Ω to a measurable function,
f (ω) . Then from the construction, ψ (f (ω) , ω) ≥ ψ (sm (ω) , ω) for all m and ω.
Now pick ω ∈ Ω and let z be such that $\psi(z, \omega) = \max_{x \in E} \psi(x, \omega)$. Letting $y_k \to z$
where yk ∈ Ck , it follows from continuity of ψ in the first argument that
where δ is a positive rational number and x ∈ Qn . Then every open set in Rn can be
written as a countable union of open cubes from B. Furthermore, B is a countable
set.
[Figure: the ball B(y, r), the cube $Q_x$, and the points x and y.]
$$|z - y| \le |z - x| + |x - y| < \frac{r}{2} + \frac{r}{2} = r.$$
Consequently, Qx ⊆ U. Now also,
$$\left( \sum_{i=1}^n (x_i - y_i)^2 \right)^{1/2} < \frac{r}{10\sqrt{n}}$$
since otherwise the above inequality would not hold. Therefore, y ∈ Qx ⊆ U . Now
let BU denote those sets of B which are contained in U. Then ∪BU = U.
To see B is countable, note there are countably many choices for x and countably
many choices for δ. This proves the theorem.
Recall that g : Rn → R is continuous means g −1 (open set) = an open set. In
particular g −1 ((a, b)) must be an open set.
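$$U = \cup_{k=1}^\infty Q_k$$
Now
$$f^{-1}\left( \prod_{i=1}^n (x_i - \delta, x_i + \delta) \right) = \cap_{i=1}^n f_i^{-1}((x_i - \delta, x_i + \delta)) \in \mathcal F$$
and so
$$(g \circ f)^{-1}((a, b)) = f^{-1}\left( g^{-1}((a, b)) \right) = f^{-1}(U) = f^{-1}(\cup_{k=1}^\infty Q_k) = \cup_{k=1}^\infty f^{-1}(Q_k) \in \mathcal F.$$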
and so ω → f1 (ω) f2 (ω) is measurable. Similarly you can show the sum of two
measurable functions is measurable by considering g (x, y) = x + y and you can
(µ(Ω) < ∞)
and let fn , f be complex valued functions such that Re fn , Im fn are all measurable
and
$$\lim_{n\to\infty} f_n(\omega) = f(\omega)$$
for all ω ∉ E where µ(E) = 0. Then for every ε > 0, there exists a set,
F ⊇ E, µ(F) < ε,
$$0 = \mu(\cap_{k=1}^\infty E_{km}) = \lim_{k\to\infty} \mu(E_{km}).$$
Let k(m) be chosen such that $\mu(E_{k(m)m}) < \varepsilon 2^{-m}$ and let
$$F = \bigcup_{m=1}^\infty E_{k(m)m}.$$
$$\omega \in \bigcap_{m=1}^\infty E_{k(m)m}^C.$$
Hence $\omega \in E_{k(m_0)m_0}^C$ so
$$|f_n(\omega) - f(\omega)| < 1/m_0 < \eta$$
for all $n > k(m_0)$. This holds for all $\omega \in F^C$ and so $f_n$ converges uniformly to f on $F^C$.
Now if E ≠ ∅, consider $\{\mathcal{X}_{E^C} f_n\}_{n=1}^\infty$. Each $\mathcal{X}_{E^C} f_n$ has real and imaginary
parts measurable and the sequence converges pointwise to $\mathcal{X}_{E^C} f$ everywhere. Therefore,
from the first part, there exists a set of measure less than ε, F such that on
$F^C$, $\{\mathcal{X}_{E^C} f_n\}$ converges uniformly to $\mathcal{X}_{E^C} f$. Therefore, on $(E \cup F)^C$, $\{f_n\}$ converges
uniformly to f. This proves the theorem.
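To see what the theorem says in a concrete case, consider $f_n(x) = x^n$ on [0, 1), which converges pointwise to 0 but not uniformly. The sketch below assumes ordinary length (Lebesgue measure, which has not yet been constructed at this point of the text) as the measure, and approximates suprema on a grid; both are assumptions of the illustration. Removing the set (1 − ε, 1) of measure ε leaves a set on which the convergence is uniform.

```python
def sup_error(n, right_end, grid=10_000):
    """Approximate the sup of |x**n - 0| over [0, right_end] on a uniform grid."""
    return max((right_end * i / grid) ** n for i in range(grid + 1))


eps = 0.1
for n in (10, 50, 200):
    after_removing_small_set = sup_error(n, 1.0 - eps)    # sup over [0, 1 - eps]
    on_almost_everything = sup_error(n, 1.0 - 1e-9)       # sup over [0, 1 - 1e-9]
    print(n, after_removing_small_set, on_almost_everything)
```

The first column of errors tends to 0 while the second stays close to 1, so uniform convergence holds only off the small exceptional set.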
Finally here is a comment about notation.
8.2 Exercises
1. Let Ω = N ={1, 2, · · ·}. Let F = P(N) and let µ(S) = number of elements in
S. Thus µ({1}) = 1 = µ({2}), µ({1, 2}) = 2, etc. Show (Ω, F, µ) is a measure
space. It is called counting measure. What functions are measurable in this
case?
2. Let Ω be any uncountable set and let $\mathcal F = \{A \subseteq \Omega$ : either A or $A^C$ is
countable}. Let µ(A) = 1 if A is uncountable and µ(A) = 0 if A is countable.
Show (Ω, F, µ) is a measure space. This is a well known bad example.
3. Let F be a σ algebra of subsets of Ω and suppose F has infinitely many
elements. Show that F is uncountable. Hint: You might try to show there
exists a countable sequence of disjoint sets of F, {Ai }. It might be easiest to
verify this by contradiction if it doesn’t exist rather than a direct construction.
Once this has been done, you can define a map, θ, from P (N) into F which
is one to one by θ (S) = ∪i∈S Ai . Then argue P (N) is uncountable and so F
is also uncountable.
4. Prove Lemma 8.2.
5. g is Borel measurable if whenever U is open, g −1 (U ) is Borel. Let f : Ω → Rn
and let g : Rn → R and F is a σ algebra of sets of Ω. Suppose f is measurable
and g is Borel measurable. Show g ◦ f is measurable. To say g is Borel
measurable means g −1 (open set) = (Borel set) where a Borel set is one of
those sets in the smallest σ algebra containing the open sets of Rn . See Lemma
8.2. Hint: You should show, using Theorem 8.12 that f −1 (open set) ∈ F .
Now let
$$\mathcal S \equiv \left\{ E \subseteq \mathbb{R}^n : f^{-1}(E) \in \mathcal F \right\}$$
By what you just showed, S contains the open sets. Now verify S is a σ
algebra. Argue that from the definition of the Borel sets, it follows S contains
the Borel sets.
6. Let (Ω, F) be a measure space and suppose f : Ω → C. Then f is said to be
measurable if
$$f^{-1}(\text{open set}) \in \mathcal F.$$
Show f is measurable if and only if Re f and Im f are measurable real-valued
functions. Thus it suffices to define a complex valued function to be measurable
if the real and imaginary parts are measurable. Hint: Argue that
$f^{-1}((a, b) + i(c, d)) = (\operatorname{Re} f)^{-1}((a, b)) \cap (\operatorname{Im} f)^{-1}((c, d))$. Then use Theorem 8.12
to verify that if Re f and Im f are measurable, it follows f is. Conversely,
argue that $(\operatorname{Re} f)^{-1}((a, b)) = f^{-1}((a, b) + i\mathbb{R})$ with a similar formula
holding for Im f.
7. Let (Ω, F, µ) be a measure space. Define µ : P(Ω) → [0, ∞] by
Show µ satisfies
etc. Now consider what it means for fnk (x) to fail to converge to f (x). Then
use Problem 8.
Lemma 8.19 Let f(a, b) ∈ [−∞, ∞] for a ∈ A and b ∈ B where A, B are sets.
Then
$$\sup_{a \in A} \sup_{b \in B} f(a, b) = \sup_{b \in B} \sup_{a \in A} f(a, b).$$
Proof: Note that for all a, b, $f(a, b) \le \sup_{b \in B} \sup_{a \in A} f(a, b)$ and therefore, for
all a,
$$\sup_{b \in B} f(a, b) \le \sup_{b \in B} \sup_{a \in A} f(a, b).$$
Therefore,
$$\sup_{a \in A} \sup_{b \in B} f(a, b) \le \sup_{b \in B} \sup_{a \in A} f(a, b).$$
Repeating the same argument interchanging a and b, gives the conclusion of the
lemma.
Lemma 8.20 If {An } is an increasing sequence in [−∞, ∞], then sup {An } =
limn→∞ An .
The following lemma is useful also and this is a good place to put it. First
$\{b_j\}_{j=1}^\infty$ is an enumeration of the $a_{ij}$ if
$$\cup_{j=1}^\infty \{b_j\} = \cup_{i,j} \{a_{ij}\}.$$
In other words, the countable set, $\{a_{ij}\}_{i,j=1}^\infty$ is listed as $b_1, b_2, \cdots$.
Lemma 8.21 Let $a_{ij} \ge 0$. Then $\sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij} = \sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij}$. Also if $\{b_j\}_{j=1}^\infty$
is any enumeration of the $a_{ij}$, then $\sum_{j=1}^\infty b_j = \sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij}$.
Proof: First note there is no trouble in defining these sums because the aij are
all nonnegative. If a sum diverges, it only diverges to ∞ and so ∞ is written as the
answer.
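$$\sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij} \ge \sup_n \sum_{j=1}^\infty \sum_{i=1}^n a_{ij} = \sup_n \lim_{m\to\infty} \sum_{j=1}^m \sum_{i=1}^n a_{ij}$$
$$= \sup_n \lim_{m\to\infty} \sum_{i=1}^n \sum_{j=1}^m a_{ij} = \sup_n \sum_{i=1}^n \sum_{j=1}^\infty a_{ij} = \sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij}. \quad (8.10)$$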
Interchanging the i and j in the above argument the first part of the lemma is
proved.
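The hypothesis $a_{ij} \ge 0$ matters. The following sketch, which is an illustration and not part of the proof, uses a standard signed array (1 on the diagonal, −1 immediately to its right) for which the two iterated sums differ. The function names are chosen here for the example. Because each row and column of this array has at most two nonzero entries, a finite inner bound evaluates every inner infinite sum exactly.

```python
from fractions import Fraction


def a_signed(i, j):
    """1 on the diagonal, -1 just to the right of it, 0 elsewhere."""
    if j == i:
        return 1
    if j == i + 1:
        return -1
    return 0


def a_nonneg(i, j):
    """A nonnegative array, 2**-(i + j)."""
    return Fraction(1, 2 ** (i + j))


def rows_then_cols(a, outer, inner):
    """sum over i <= outer of (sum over j <= inner of a(i, j))."""
    return sum(sum(a(i, j) for j in range(1, inner + 1)) for i in range(1, outer + 1))


def cols_then_rows(a, outer, inner):
    """sum over j <= outer of (sum over i <= inner of a(i, j))."""
    return sum(sum(a(i, j) for i in range(1, inner + 1)) for j in range(1, outer + 1))


N = 50
# Inner bound N + 1 captures every nonzero entry of the first N rows/columns.
print(rows_then_cols(a_signed, N, N + 1))   # every row sums to 0
print(cols_then_rows(a_signed, N, N + 1))   # only the first column is nonzero

# For the nonnegative array the truncated sums agree and increase toward 1.
print(float(rows_then_cols(a_nonneg, N, N)), float(cols_then_rows(a_nonneg, N, N)))
```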
8.3. THE ABSTRACT LEBESGUE INTEGRAL 185
and so $\sum_{j=1}^\infty b_j \le \sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij}$. Now let m, n > 1 be given. Then
$$\sum_{i=1}^m \sum_{j=1}^n a_{ij} \le \sum_{j=1}^p b_j$$
[Figure: the region under the graph of f cut into horizontal strips at heights h, 2h, 3h, with areas hµ([h < f]), hµ([2h < f]), hµ([3h < f]).]
You can see that by following the procedure illustrated in the picture and letting
h get smaller, you would expect to obtain better approximations to the area under
the curve¹ although all these approximations would likely be too small. Therefore,
define
$$\int f \, d\mu \equiv \sup_{h > 0} \sum_{i=1}^\infty h\mu([ih < f])$$
Also, it suffices to consider only h smaller than a given positive number in the above
definition of the integral.
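On a finite set the sums in this definition can be computed exactly. The sketch below assumes a small hypothetical measure space (four points with the listed weights) and a hypothetical function f, neither of which comes from the text. It computes $\sum_i h\mu([ih < f])$ for h = 1, 1/2, 1/4, ... and compares the results with the weighted sum of the values of f.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..3 with these weights.
weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1), 3: Fraction(3)}
f = {0: Fraction(5, 2), 1: Fraction(1, 3), 2: Fraction(0), 3: Fraction(7, 4)}


def mu_superlevel(level):
    """mu([level < f]): total weight of the points where f exceeds level."""
    return sum(w for p, w in weights.items() if f[p] > level)


def slice_sum(h):
    """sum over i >= 1 of h * mu([i*h < f]); only finitely many terms are nonzero."""
    total, i = Fraction(0), 1
    top = max(f.values())
    while i * h < top:
        total += h * mu_superlevel(i * h)
        i += 1
    return total


exact = sum(f[p] * weights[p] for p in weights)   # weighted sum of the values of f
approximations = [slice_sum(Fraction(1, 2 ** n)) for n in range(8)]
for n, value in enumerate(approximations):
    print(n, float(value), float(exact))
```

Halving h never decreases the sum, every sum is at most the exact value, and the gap shrinks with h, which matches the picture of approximating from below.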
Proof:
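Let N ∈ N.
$$\sum_{i=1}^{2N} \frac{h}{2} \mu\left( \left[ i\frac{h}{2} < f \right] \right) = \sum_{i=1}^{2N} \frac{h}{2} \mu([ih < 2f])$$
$$= \sum_{i=1}^{N} \frac{h}{2} \mu([(2i - 1)h < 2f]) + \sum_{i=1}^{N} \frac{h}{2} \mu([(2i)h < 2f])$$
$$= \sum_{i=1}^{N} \frac{h}{2} \mu\left( \left[ \frac{(2i - 1)}{2} h < f \right] \right) + \sum_{i=1}^{N} \frac{h}{2} \mu([ih < f])$$
$$\ge \sum_{i=1}^{N} \frac{h}{2} \mu([ih < f]) + \sum_{i=1}^{N} \frac{h}{2} \mu([ih < f]) = \sum_{i=1}^{N} h\mu([ih < f]).$$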
$$M < \sum_{i=1}^\infty \frac{h}{2} \mu\left( \left[ i\frac{h}{2} < f \right] \right) \le \int f \, d\mu$$
1 Note the difference between this picture and the one usually drawn in calculus courses where
the little rectangles are upright rather than on their sides. This illustrates a fundamental philo-
sophical difference between the Riemann and the Lebesgue integrals. With the Riemann integral
intervals are measured. With the Lebesgue integral, it is inverse images of intervals which are
measured.
$$M < \sum_{i=1}^\infty \frac{h}{2^n} \mu\left( \left[ i\frac{h}{2^n} < f \right] \right) \le \int f \, d\mu.$$
where the sets, Ei are disjoint and measurable. s takes the value ci at Ei .
Note that by taking the union of some of the Ei in the above definition, you
can assume that the numbers, ci are the distinct values of s. Simple functions are
important because it will turn out to be very easy to take their integrals as shown
in the following lemma.
Lemma 8.24 Let $s(\omega) = \sum_{i=1}^p a_i \mathcal{X}_{E_i}(\omega)$ be a nonnegative simple function with
the $a_i$ the distinct non zero values of s. Then
$$\int s \, d\mu = \sum_{i=1}^p a_i \mu(E_i). \quad (8.11)$$
Proof: Consider 8.11 first. Without loss of generality, you can assume 0 < a1 <
a2 < · · · < ap and that µ (Ei ) < ∞. Let ε > 0 be given and let
$$\delta_1 \sum_{i=1}^p \mu(E_i) < \varepsilon.$$
Because of the choice of h there exist positive integers, ik such that i1 < i2 < ···, < ip
and
Then in the sum of 8.13 the only terms which are nonzero are those for which
i ∈ {i1 , i2 · ··, ip }. From the above, you see that
Therefore,
$$\sum_{k=1}^\infty h\mu([s > kh]) = \sum_{k=1}^p i_k h \mu(E_k).$$
Taking the inf for h this small and using Lemma 8.22,
$$0 \le \sum_{k=1}^p a_k \mu(E_k) - \sup_{\delta > h > 0} \sum_{k=1}^\infty h\mu([s > kh]) = \sum_{k=1}^p a_k \mu(E_k) - \int s \, d\mu \le \varepsilon.$$
To verify 8.12 Note the formula is obvious if λ = 0 because then [ih < λf ] = ∅
for all i > 0. Assume λ > 0. Then
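$$\int \lambda f \, d\mu \equiv \sup_{h > 0} \sum_{i=1}^\infty h\mu([ih < \lambda f]) = \sup_{h > 0} \sum_{i=1}^\infty h\mu([ih/\lambda < f])$$
$$= \sup_{h > 0} \lambda \sum_{i=1}^\infty (h/\lambda) \mu([i(h/\lambda) < f]) = \lambda \int f \, d\mu.$$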
where the ci are not necessarily distinct but the Ei are disjoint. It follows that
$$\int s = \sum_{i=1}^n c_i \mu(E_i).$$
Proof: Let the values of s be $\{a_1, \cdots, a_m\}$. Therefore, since the $E_i$ are disjoint,
each $a_i$ is equal to one of the $c_j$. Let $A_i \equiv \cup\{E_j : c_j = a_i\}$. Then from Lemma 8.24
it follows that
$$\int s = \sum_{i=1}^m a_i \mu(A_i) = \sum_{i=1}^m a_i \sum_{\{j : c_j = a_i\}} \mu(E_j)$$
$$= \sum_{i=1}^m \sum_{\{j : c_j = a_i\}} c_j \mu(E_j) = \sum_{i=1}^n c_i \mu(E_i).$$
Proof: Let
$$s(\omega) = \sum_{i=1}^n \alpha_i \mathcal{X}_{A_i}(\omega), \quad t(\omega) = \sum_{j=1}^m \beta_j \mathcal{X}_{B_j}(\omega)$$
where $\alpha_i$ are the distinct values of s and the $\beta_j$ are the distinct values of t. Clearly
as + bt is a nonnegative simple function because it is measurable and has finitely
many values. Also,
$$(as + bt)(\omega) = \sum_{j=1}^m \sum_{i=1}^n (a\alpha_i + b\beta_j) \mathcal{X}_{A_i \cap B_j}(\omega)$$
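The refinement $A_i \cap B_j$ used here is easy to carry out on a finite set. The sketch below assumes a hypothetical measure on {0, ..., 5} and two hypothetical simple functions stored as lists of (value, set) pairs; the helpers `pad` and `combine` are names introduced for this illustration. It checks that integrating as + bt over the common refinement agrees with a∫s + b∫t.

```python
from fractions import Fraction

# Hypothetical measure on {0, ..., 5}: the point p has weight 1 / (p + 1).
weights = {p: Fraction(1, p + 1) for p in range(6)}


def mu(points):
    return sum(weights[p] for p in points)


# A simple function is a list of (value, set) pairs with disjoint sets.
s = [(Fraction(2), {0, 1}), (Fraction(5), {2, 3, 4})]
t = [(Fraction(1), {1, 2}), (Fraction(3), {3}), (Fraction(4), {5})]


def integral(simple):
    """Sum of c * mu(E) over the pairs (c, E)."""
    return sum(c * mu(e) for c, e in simple)


def pad(simple, omega):
    """Add the value 0 on the uncovered part so the sets partition omega."""
    covered = set().union(*(e for _, e in simple))
    return simple + [(Fraction(0), omega - covered)]


def combine(a, s, b, t, omega):
    """Represent a*s + b*t on the common refinement of the two partitions."""
    return [
        (a * alpha + b * beta, A & B)
        for alpha, A in pad(s, omega)
        for beta, B in pad(t, omega)
        if A & B
    ]


omega = set(weights)
a, b = Fraction(3), Fraction(1, 2)
lhs = integral(combine(a, s, b, t, omega))
rhs = a * integral(s) + b * integral(t)
print(lhs, rhs, lhs == rhs)
```

The values on the refinement need not be distinct, which is why the preceding result about representations with repeated values is needed.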
0 ≤ sn (ω) (8.14)
Then tn (ω) ≤ f (ω) for all ω and limn→∞ tn (ω) = f (ω) for all ω. This is because
$t_n(\omega) = n$ for ω ∈ I and if $f(\omega) \in [0, \frac{2^n + 1}{n})$, then
$$0 \le f(\omega) - t_n(\omega) \le \frac{1}{n}. \quad (8.16)$$
Thus whenever ω ∉ I, the above inequality will hold for all n large enough. Let
Theorem 8.28 Let (Ω, F) be a measure space and let f : Ω → X where (X, d) is a
separable metric space. Then f is a measurable function if and only if there exists
a sequence of simple functions, $\{f_n\}$ such that for each ω ∈ Ω and n ∈ N,
and
$$\lim_{n\to\infty} d(f_n(\omega), f(\omega)) = 0. \quad (8.18)$$
Proof: Let $D = \{x_k\}_{k=1}^\infty$ be a countable dense subset of X. First suppose f is
measurable. Then since in a metric space every closed set is the countable intersection
of open sets, it follows $f^{-1}(\text{closed set}) \in \mathcal F$. Now let $D_n = \{x_k\}_{k=1}^n$. Let
$$A_1 \equiv \left\{ \omega : d(x_1, f(\omega)) = \min_{k \le n} d(x_k, f(\omega)) \right\}$$
That is, $A_1$ are those ω such that f(ω) is approximated best out of $D_n$ by $x_1$.
Why is this a measurable set? It is because ω → d(x, f(ω)) is a real valued
measurable function, being the composition of a continuous function, y → d(x, y)
and a measurable function, ω → f(ω). Next let
$$A_2 \equiv \left\{ \omega \notin A_1 : d(x_2, f(\omega)) = \min_{k \le n} d(x_k, f(\omega)) \right\}$$
and continue in this manner obtaining disjoint measurable sets, $\{A_k\}_{k=1}^n$ such that
for $\omega \in A_k$ the best approximation to f(ω) from $D_n$ is $x_k$. Then
$$f_n(\omega) \equiv \sum_{k=1}^n x_k \mathcal{X}_{A_k}(\omega).$$
Note
$$\min_{k \le n+1} d(x_k, f(\omega)) \le \min_{k \le n} d(x_k, f(\omega))$$
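The nearest-point construction can be tried out directly. The sketch below is an illustration under assumptions not in the text: X = [0, 1] with the usual distance, the dense set D is an enumeration of the dyadic rationals, and f is a sample function. Ties are resolved in favor of the smallest index, which mirrors how $A_2$ excludes $A_1$.

```python
import math


def dense_points(count):
    """First `count` terms of an enumeration of the dyadic rationals in [0, 1]."""
    points, level = [0.0, 1.0], 1
    while len(points) < count:
        points += [k / 2 ** level for k in range(1, 2 ** level, 2)]
        level += 1
    return points[:count]


def f(omega):
    return math.sin(omega) ** 2      # a sample measurable map into [0, 1]


def f_n(omega, n, points):
    """The point x_k with k <= n closest to f(omega); ties go to the smallest k."""
    return min(points[:n], key=lambda x: abs(x - f(omega)))


points = dense_points(200)
omega = 0.7
errors = [abs(f_n(omega, n, points) - f(omega)) for n in (1, 2, 5, 20, 200)]
print(errors)
```

The distances are nonincreasing in n, as the displayed inequality predicts, and they become small because the enumerated points are dense.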
Theorem 8.29 (Monotone Convergence theorem) Let f have values in [0, ∞] and
suppose {fn } is a sequence of nonnegative measurable functions having values in
[0, ∞] and satisfying
$$\lim_{n\to\infty} f_n(\omega) = f(\omega) \text{ for each } \omega.$$
$$\begin{aligned}
&= \sup_{h > 0} \sup_k \sum_{i=1}^k h\mu([ih < f]) \\
&= \sup_{h > 0} \sup_k \sup_m \sum_{i=1}^k h\mu([ih < f_m]) \\
&= \sup_m \sup_{h > 0} \sum_{i=1}^\infty h\mu([ih < f_m]) \\
&\equiv \sup_m \int f_m \, d\mu \\
&= \lim_{m\to\infty} \int f_m \, d\mu.
\end{aligned}$$
which follows from Theorem 8.5 since the sets, [ih < fm ] are increasing in m and
their union equals [ih < f ]. This proves the theorem.
To illustrate what goes wrong without the Lebesgue integral, consider the fol-
lowing example.
Example 8.30 Let $\{r_n\}$ denote the rational numbers in [0, 1] and let
$$f_n(t) \equiv \begin{cases} 1 & \text{if } t \in \{r_1, \cdots, r_n\} \\ 0 & \text{otherwise} \end{cases}$$
Then $f_n(t) \uparrow f(t)$ where f is the function which is one on the rationals and zero
on the irrationals. Each $f_n$ is Riemann integrable (why?) but f is not Riemann
integrable. Therefore, you can't write $\int f \, dx = \lim_{n\to\infty} \int f_n \, dx$.
another way to get the same thing for $\int f \, d\mu$ is to take an increasing sequence of
nonnegative simple functions, $\{s_n\}$ with $s_n(\omega) \to f(\omega)$ and then by monotone
convergence theorem,
$$\int f \, d\mu = \lim_{n\to\infty} \int s_n$$
where if $s_n(\omega) = \sum_{i=1}^m c_i \mathcal{X}_{E_i}(\omega)$,
$$\int s_n \, d\mu = \sum_{i=1}^m c_i \mu(E_i).$$
Similarly this also shows that for such nonnegative measurable function,
$$\int f \, d\mu = \sup\left\{ \int s : 0 \le s \le f, \ s \text{ simple} \right\}$$
which is the usual way of defining the Lebesgue integral for nonnegative measurable
functions in most books. I have done it differently because this approach led to an
easier proof of the Monotone convergence theorem. Here is an equivalent definition
of the integral. The fact it is well defined has been discussed above.
Definition 8.31 For s a nonnegative simple function,
$$s(\omega) = \sum_{k=1}^n c_k \mathcal{X}_{E_k}(\omega), \quad \int s = \sum_{k=1}^n c_k \mu(E_k).$$
and
$$\limsup_{n\to\infty} a_n \equiv \lim_{n\to\infty} \sup\{a_k : k \ge n\}$$
Lemma 8.34 Let $\{a_n\}$ be a sequence in [−∞, ∞]. Then $\lim_{n\to\infty} a_n$ exists if and
only if
$$\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n$$
and in this case, the limit equals the common value of these two numbers.
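A short computation shows how the tail suprema and infima behave when the limit fails to exist. The sequence $a_n = (-1)^n (1 + 1/n)$ and the finite horizon used in place of "all k ≥ n" are choices made for this sketch; for this particular sequence the extreme values of each tail occur within the first two terms, so the truncation does not change the results.

```python
def a(n):
    return (-1) ** n * (1 + 1 / n)


HORIZON = 10_000   # finite stand-in for "all k >= n"


def tail_sup(n):
    """sup{a_k : k >= n}, computed over n <= k <= HORIZON."""
    return max(a(k) for k in range(n, HORIZON + 1))


def tail_inf(n):
    """inf{a_k : k >= n}, computed over n <= k <= HORIZON."""
    return min(a(k) for k in range(n, HORIZON + 1))


for n in (1, 10, 100, 1000):
    print(n, tail_inf(n), tail_sup(n))
```

The tail suprema decrease toward 1 and the tail infima increase toward −1, so lim sup and lim inf differ and, by Lemma 8.34, the sequence has no limit.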
Since ε is arbitrary, the two must be equal and they both must equal a. Next suppose
limn→∞ an = ∞. Then if l ∈ R, there exists N such that for n ≥ N,
l ≤ an
inf an > l.
n>N
Therefore, limn→∞ an = ∞. The case for −∞ is similar. This proves the lemma.
The next theorem, known as Fatou’s lemma is another important theorem which
justifies the use of the Lebesgue integral.
In other words,
$$\int \left( \liminf_{n\to\infty} f_n \right) d\mu \le \liminf_{n\to\infty} \int f_n \, d\mu$$
Thus gn is measurable by Lemma 8.6 on Page 173. Also g(ω) = limn→∞ gn (ω) so
g is measurable because it is the pointwise limit of measurable functions. Now the
functions gn form an increasing sequence of nonnegative measurable functions so
the monotone convergence theorem applies. This yields
$$\int g \, d\mu = \lim_{n\to\infty} \int g_n \, d\mu \le \liminf_{n\to\infty} \int f_n \, d\mu.$$
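The inequality in Fatou's lemma can be strict. The sketch below uses a deliberately tiny measure space chosen for illustration (two points, each of measure 1) and functions whose mass alternates between the two points. Since the sequence is periodic, a finite window of indices gives the true lim inf.

```python
omega = (0, 1)   # two points, each of measure 1 (an illustrative choice)


def f(n, point):
    """Indicator of the single point n mod 2: the mass hops back and forth."""
    return 1 if point == n % 2 else 0


def integral(g):
    """Integral with respect to counting measure on the two-point space."""
    return sum(g(p) for p in omega)


N = 100   # finite horizon; enough because the sequence has period 2
liminf_f = {p: min(f(n, p) for n in range(N // 2, N)) for p in omega}
integral_of_liminf = sum(liminf_f.values())
liminf_of_integrals = min(integral(lambda p, n=n: f(n, p)) for n in range(N // 2, N))
print(integral_of_liminf, liminf_of_integrals)
```

Here the pointwise lim inf is the zero function while every integral equals 1, so the left side is 0 and the right side is 1.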
As long as you are allowing functions to take the value +∞, you cannot consider
something like f + (−g) and so you can’t very well expect a satisfactory statement
about the integral being linear until you restrict yourself to functions which have
values in a vector space. This is discussed next.
Definition 8.38 A complex simple function will be a function which is of the form
$$s(\omega) = \sum_{k=1}^n c_k \mathcal{X}_{E_k}(\omega)$$
where $c_k \in \mathbb{C}$ and $\mu(E_k) < \infty$. For s a complex simple function as above, define
$$I(s) \equiv \sum_{k=1}^n c_k \mu(E_k).$$
Lemma 8.39 The definition, 8.38 is well defined. Furthermore, I is linear on the
vector space of complex simple functions. Also the triangle inequality holds,
|I (s)| ≤ I (|s|) .
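Proof: Suppose $\sum_{k=1}^n c_k \mathcal{X}_{E_k}(\omega) = 0$. Does it follow that $\sum_k c_k \mu(E_k) = 0$?
The supposition implies
$$\sum_{k=1}^n \operatorname{Re} c_k \, \mathcal{X}_{E_k}(\omega) = 0, \quad \sum_{k=1}^n \operatorname{Im} c_k \, \mathcal{X}_{E_k}(\omega) = 0. \quad (8.21)$$
Choose λ large and positive so that $\lambda + \operatorname{Re} c_k \ge 0$. Then adding $\sum_k \lambda \mathcal{X}_{E_k}$ to both
sides of the first equation above,
$$\sum_{k=1}^n (\lambda + \operatorname{Re} c_k) \mathcal{X}_{E_k}(\omega) = \sum_{k=1}^n \lambda \mathcal{X}_{E_k}$$
and by Lemma 8.26 on Page 189, it follows upon taking $\int$ of both sides that
$$\sum_{k=1}^n (\lambda + \operatorname{Re} c_k) \mu(E_k) = \sum_{k=1}^n \lambda \mu(E_k)$$
which implies $\sum_{k=1}^n \operatorname{Re} c_k \, \mu(E_k) = 0$. Similarly, $\sum_{k=1}^n \operatorname{Im} c_k \, \mu(E_k) = 0$ and so
$\sum_{k=1}^n c_k \mu(E_k) = 0$. Thus if
$$\sum_j c_j \mathcal{X}_{E_j} = \sum_k d_k \mathcal{X}_{F_k}$$
then $\sum_j c_j \mathcal{X}_{E_j} + \sum_k (-d_k) \mathcal{X}_{F_k} = 0$ and so the result just established verifies
$\sum_j c_j \mu(E_j) - \sum_k d_k \mu(F_k) = 0$ which proves I is well defined.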
That I is linear is now obvious. It only remains to verify the triangle inequality.
Let s be a simple function,
$$s = \sum_j c_j \mathcal{X}_{E_j}$$
Then pick θ ∈ C such that θI(s) = |I(s)| and |θ| = 1. Then from the triangle
inequality for sums of complex numbers,
$$|I(s)| = \theta I(s) = I(\theta s) = \sum_j \theta c_j \mu(E_j) = \left| \sum_j \theta c_j \mu(E_j) \right| \le \sum_j |\theta c_j| \mu(E_j) = I(|s|).$$
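Both claims of Lemma 8.39 can be checked on a small example. The sketch assumes a hypothetical finite measure (the point p has weight 2^(-p)) and one complex simple function written in two different ways, once with disjoint sets and once with overlapping sets; all names are illustrative.

```python
# Hypothetical finite measure on {1, ..., 6}: the point p has weight 2**(-p).
weights = {p: 2.0 ** -p for p in range(1, 7)}


def mu(points):
    return sum(weights[p] for p in points)


def I(simple):
    """I(s) = sum of c_k * mu(E_k) for s given as a list of (c_k, E_k) pairs."""
    return sum(c * mu(e) for c, e in simple)


def value(simple, point):
    """Value of the simple function at a point (sets are allowed to overlap)."""
    return sum(c for c, e in simple if point in e)


# One function, two representations. In s_two the set {1, 2, 3} carries 1+2j
# and {3} carries the correction (-3j) - (1+2j) = -1-5j.
s_one = [(1 + 2j, {1, 2}), (-3j, {3}), (2 - 1j, {4, 5})]
s_two = [(1 + 2j, {1, 2, 3}), (-1 - 5j, {3}), (2 - 1j, {4}), (2 - 1j, {5})]

abs_s = [(abs(c), e) for c, e in s_one]   # |s|, valid because the sets of s_one are disjoint

print(I(s_one), I(s_two))
print(abs(I(s_one)), I(abs_s))
```

The two representations give the same value of I, and |I(s)| does not exceed I(|s|).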
Definition 8.40 $f \in L^1(\Omega)$ means there exists a sequence of complex simple functions,
$\{s_n\}$ such that
Then
$$I(f) \equiv \lim_{n\to\infty} I(s_n). \quad (8.23)$$
Proof: There are several things which need to be verified. First suppose 8.22.
Then by Lemma 8.39
and for m, n large enough this last is given to be small so {I (sn )} is a Cauchy
sequence in C and so it converges. This verifies the limit in 8.23 at least exists. It
remains to consider another sequence {tn } having the same properties as {sn } and
8.4. THE SPACE L1 199
verifying I (f ) determined by this other sequence is the same. By Lemma 8.39 and
Fatou’s lemma, Theorem 8.35 on Page 196,
$$|I(s_n) - I(t_n)| \le I(|s_n - t_n|) = \int |s_n - t_n| \, d\mu \le \int \left( |s_n - f| + |f - t_n| \right) d\mu$$
$$\le \liminf_{k\to\infty} \int |s_n - s_k| \, d\mu + \liminf_{k\to\infty} \int |t_n - t_k| \, d\mu < \varepsilon$$
whenever n is large enough. Since ε is arbitrary, this shows the limit from using
the $t_n$ is the same as the limit from using $s_n$. This proves the lemma.
What if f has values in [0, ∞)? Earlier $\int f \, d\mu$ was defined for such functions and
now I(f) has been defined. Are they the same? If so, I can be regarded as an
extension of $\int d\mu$ to a larger class of functions.
Lemma 8.42 Suppose f has values in [0, ∞) and $f \in L^1(\Omega)$. Then f is measurable
and
$$I(f) = \int f \, d\mu.$$
where $x^+ \equiv \frac{1}{2}(|x| + x)$, the positive part of the real number, x.² Thus there is no
loss of generality in assuming $\{s_n\}$ is a sequence of complex simple functions having
values in [0, ∞). Then since for such complex simple functions, $I(s) = \int s \, d\mu$,
$$\left| I(f) - \int f \, d\mu \right| \le |I(f) - I(s_n)| + \left| \int s_n \, d\mu - \int f \, d\mu \right| < \varepsilon + \int |s_n - f| \, d\mu$$
whenever n is large enough. But by Fatou's lemma, Theorem 8.35 on Page 196, the
last term is no larger than
$$\liminf_{k\to\infty} \int |s_n - s_k| \, d\mu < \varepsilon$$
whenever n is large enough. Since ε is arbitrary, this shows $I(f) = \int f \, d\mu$ as
claimed.
As explained above, I can be regarded as an extension of $\int d\mu$ so from now on,
the usual symbol, $\int d\mu$ will be used. It is now easy to verify $\int d\mu$ is linear on $L^1(\Omega)$.
2 The negative part of the real number x is defined to be $x^- \equiv \frac{1}{2}(|x| - x)$. Thus $|x| = x^+ + x^-$
and $x = x^+ - x^-$.
Theorem 8.43 $\int d\mu$ is linear on $L^1(\Omega)$ and $L^1(\Omega)$ is a complex vector space. If
$f \in L^1(\Omega)$, then Re f, Im f, and |f| are all in $L^1(\Omega)$. Furthermore, for $f \in L^1(\Omega)$,
$$\int f \, d\mu = \int (\operatorname{Re} f)^+ d\mu - \int (\operatorname{Re} f)^- d\mu + i\left( \int (\operatorname{Im} f)^+ d\mu - \int (\operatorname{Im} f)^- d\mu \right)$$
Proof: First it is necessary to verify that L1 (Ω) is really a vector space because
it makes no sense to speak of linear maps without having these maps defined on
a vector space. Let f, g be in L1 (Ω) and let a, b ∈ C. Then let {sn } and {tn }
be sequences of complex simple functions associated with f and g respectively as
described in Definition 8.40. Consider {asn + btn } , another sequence of complex
simple functions. Then asn (ω) + btn (ω) → af (ω) + bg (ω) for each ω. Also, from
Lemma 8.39
$$\int |a s_n + b t_n - (a s_m + b t_m)| \, d\mu \le |a| \int |s_n - s_m| \, d\mu + |b| \int |t_n - t_m| \, d\mu$$
and the sum of the two terms on the right converges to zero as m, n → ∞. Thus
$af + bg \in L^1(\Omega)$. Also
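$$\begin{aligned}
\int (af + bg) \, d\mu &= \lim_{n\to\infty} \int (a s_n + b t_n) \, d\mu \\
&= \lim_{n\to\infty} \left( a \int s_n \, d\mu + b \int t_n \, d\mu \right) \\
&= a \lim_{n\to\infty} \int s_n \, d\mu + b \lim_{n\to\infty} \int t_n \, d\mu \\
&= a \int f \, d\mu + b \int g \, d\mu.
\end{aligned}$$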
Proof: Suppose f ∈ L1 (Ω) . Then from Definition 8.40, it follows both real and
imaginary parts of f are measurable. Just take real and imaginary parts of sn and
observe the real and imaginary parts of f are limits of the real and imaginary parts
of sn respectively. By Theorem 8.43 this shows the only if part.
The more interesting part is the if part. Suppose then that f is measurable and
$\int |f| \, d\mu < \infty$. Suppose first that f has values in [0, ∞). It is necessary to obtain the
sequence of complex simple functions. By Theorem 8.27, there exists a sequence of
nonnegative simple functions, $\{s_n\}$ such that $s_n(\omega) \uparrow f(\omega)$. Then by the monotone
convergence theorem,
$$\lim_{n\to\infty} \int (2f - (f - s_n)) \, d\mu = \int 2f \, d\mu$$
and so
$$\lim_{n\to\infty} \int (f - s_n) \, d\mu = 0.$$
Letting m be large enough, it follows $\int (f - s_m) \, d\mu < \varepsilon$ and so if n > m
$$\int |s_m - s_n| \, d\mu \le \int |f - s_m| \, d\mu < \varepsilon.$$
and there exists a measurable function g, with values in [0, ∞],³ such that
$$|f_n(\omega)| \le g(\omega) \text{ and } \int g(\omega) \, d\mu < \infty.$$
Hence
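$$0 \ge \limsup_{n\to\infty} \left( \int |f - f_n| \, d\mu \right) \ge \limsup_{n\to\infty} \left| \int f \, d\mu - \int f_n \, d\mu \right|$$
$$\ge \liminf_{n\to\infty} \left| \int f \, d\mu - \int f_n \, d\mu \right| \ge 0.$$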
This proves the theorem by Lemma 8.34 on Page 195 because the lim sup and lim inf
are equal.
Corollary 8.46 Suppose fn ∈ L1 (Ω) and f (ω) = limn→∞ fn (ω) . Suppose also
there exist measurable functions, gn , g with values in [0, ∞] such that
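$$\lim_{n\to\infty} \int g_n \, d\mu = \int g \, d\mu,$$
$g_n(\omega) \to g(\omega)$ µ a.e. and both $\int g_n \, d\mu$ and $\int g \, d\mu$ are finite. Also suppose $|f_n(\omega)| \le g_n(\omega)$. Then
$$\lim_{n\to\infty} \int |f - f_n| \, d\mu = 0.$$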
3 Note that, since g is allowed to have the value ∞, it is not known that g ∈ L1 (Ω) .
8.5. VITALI CONVERGENCE THEOREM 203
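$$\liminf_{n\to\infty} \int (g_n + g) \, d\mu - \limsup_{n\to\infty} \int |f - f_n| \, d\mu$$
$$= \liminf_{n\to\infty} \int \left( (g_n + g) - |f - f_n| \right) d\mu \ge \int 2g \, d\mu$$
and so $-\limsup_{n\to\infty} \int |f - f_n| \, d\mu \ge 0$.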
{E ∩ A : A ∈ F}
Definition 8.48 Let (Ω, F, µ) be a measure space and let S ⊆ L1 (Ω). S is uni-
formly integrable if for every ε > 0 there exists δ > 0 such that for all f ∈ S
$$\left| \int_E f \, d\mu \right| < \varepsilon \text{ whenever } \mu(E) < \delta.$$
Proof: Let ε > 0 be given and suppose S is uniformly integrable. First suppose
the functions are real valued. Let δ be such that if µ (E) < δ, then
$$\left| \int_E f \, d\mu \right| < \frac{\varepsilon}{2}$$
Theorem 8.50 Let {fn } be a uniformly integrable set of complex valued functions,
µ(Ω) < ∞, and fn (x) → f (x) a.e. where f is a measurable complex valued
function. Then f ∈ L1 (Ω) and
$$\lim_{n\to\infty} \int_\Omega |f_n - f| \, d\mu = 0. \quad (8.24)$$
for all n. By Egoroff’s theorem, there exists a set, E of measure less than δ such
that on E C , {fn } converges uniformly. Therefore, for p large enough, and n > p,
$$\int_{E^C} |f_p - f_n| \, d\mu < 1$$
which implies
$$\int_{E^C} |f_n| \, d\mu < 1 + \int_\Omega |f_p| \, d\mu.$$
Then since there are only finitely many functions, $f_n$ with n ≤ p, there exists a
constant, $M_1$ such that for all n,
$$\int_{E^C} |f_n| \, d\mu < M_1.$$
But also,
$$\int_\Omega |f_m| \, d\mu = \int_{E^C} |f_m| \, d\mu + \int_E |f_m| \, d\mu \le M_1 + 1 \equiv M.$$
Therefore, by Fatou's lemma,
$$\int_\Omega |f| \, d\mu \le \liminf_{n\to\infty} \int |f_n| \, d\mu \le M,$$
8.6 Exercises
1. Let Ω = N = {1, 2, ···} and µ(S) = number of elements in S. If
f : Ω → C
what is meant by $\int f \, d\mu$? Which functions are in $L^1(\Omega)$? Which functions are
measurable?
2. Show that for f ≥ 0 and measurable, $\int f \, d\mu \equiv \lim_{h\to 0+} \sum_{i=1}^\infty h\mu([ih < f])$.
3. For the measure space of Problem 1, give an example of a sequence of nonneg-
ative measurable functions {fn } converging pointwise to a function f , such
that inequality is obtained in Fatou’s lemma.
4. Fill in all the details of the proof of Lemma 8.49.
5. Let $\sum_{i=1}^n c_i \mathcal{X}_{E_i}(\omega) = s(\omega)$ be a nonnegative simple function for which the $c_i$
are the distinct nonzero values. Show with the aid of the monotone convergence
theorem that the two definitions of the Lebesgue integral given in the
chapter are equivalent.
6. Suppose (Ω, µ) is a finite measure space and S ⊆ L1 (Ω). Show S is uniformly
integrable and bounded in $L^1(\Omega)$ if there exists an increasing function h which
satisfies
$$\lim_{t\to\infty} \frac{h(t)}{t} = \infty, \quad \sup\left\{ \int_\Omega h(|f|) \, d\mu : f \in S \right\} < \infty.$$
S is bounded if there is some number, M such that
$$\int |f| \, d\mu \le M$$
for all f ∈ S.
7. Let {an }, {bn } be sequences in [−∞, ∞] and a ∈ R. Show
This was used in the proof of the Dominated convergence theorem. Also show
provided no sum is of the form ∞ − ∞. Also show strict inequality can hold
in the inequality. State and prove corresponding statements for lim inf.
8. Let (Ω, F, µ) be a measure space and suppose f, g : Ω → (−∞, ∞] are measurable.
Prove the sets
$$= \sup_{h>0}\sum_{i=1}^{[M/h]} h\,\mu\left(\left[\phi^{-1}(ih) < f\right]\right) = \sup_{h>0}\sum_{i=1}^{[M/h]} \frac{h\Delta_i}{\Delta_i}\,\mu\left(\left[\phi^{-1}(ih) < f\right]\right)$$
where $\Delta_i = \left(\phi^{-1}(ih) - \phi^{-1}((i-1)h)\right)$. Now use the mean value theorem to
write
$$\Delta_i = \left(\phi^{-1}\right)'(t_i)\,h = \frac{1}{\phi'\left(\phi^{-1}(t_i)\right)}\,h$$
for some $t_i$ between (i − 1)h and ih. Therefore, the right side is of the form
$$\sup_{h}\sum_{i=1}^{[M/h]} \phi'\left(\phi^{-1}(t_i)\right)\Delta_i\,\mu\left(\left[\phi^{-1}(ih) < f\right]\right)$$
where $\phi^{-1}(t_i) \in \left(\phi^{-1}((i-1)h), \phi^{-1}(ih)\right)$. Argue that if $t_i$ were replaced
with ih, this would be a Riemann sum for the Riemann integral
$$\int_0^{\phi^{-1}(M)} \phi'(t)\,\mu([t < f])\,dt = \int_0^{\infty} \phi'(t)\,\mu([t < f])\,dt.$$
12. Suppose un (t) is a differentiable function for t ∈ (a, b) and suppose that for
t ∈ (a, b),
$$|u_n(t)|,\ |u_n'(t)| < K_n$$
where $\sum_{n=1}^{\infty} K_n < \infty$. Show
$$\left(\sum_{n=1}^{\infty} u_n(t)\right)' = \sum_{n=1}^{\infty} u_n'(t).$$
Hint: Use the monotone convergence theorem along with the fact the integral
is linear.
The Construction Of
Measures
Definition 9.1 Let Ω be a nonempty set and let µ : P(Ω) → [0, ∞] satisfy
µ(∅) = 0,
involved was either oil, bread, flour or fish. In mathematics such things have also been done with
sets. In the book by Bruckner, Bruckner and Thomson there is an interesting discussion of the
Banach Tarski paradox which says it is possible to divide a ball in $\mathbb{R}^3$ into five disjoint pieces and
assemble the pieces to form two disjoint balls of the same size as the first. The details can be
found in: The Banach Tarski Paradox by Wagon, Cambridge University Press, 1985. It is known
that all such examples must involve the axiom of choice.
those for which no such miracle occurs. You might think of the measurable sets as
the nonmiraculous sets. The idea is to show that they form a σ algebra on which
the outer measure, µ is a measure.
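The idea can be tried out on a finite set. The sketch below is illustrative only: the set `OMEGA` and the toy outer measure `outer` are assumptions made for this example, not objects from the text. It tests every subset A against the Caratheodory criterion µ(S) = µ(S ∩ A) + µ(S \ A) for all S and collects the sets that pass.

```python
from itertools import combinations

OMEGA = frozenset({0, 1, 2})  # hypothetical three point space

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(a):
    """Toy outer measure: 0 on the empty set, 2 on OMEGA, 1 otherwise.

    It is monotone and subadditive, but not additive on disjoint sets.
    """
    if not a:
        return 0
    return 2 if a == OMEGA else 1

def is_measurable(a):
    """Caratheodory criterion: outer(S) == outer(S & a) + outer(S - a) for all S."""
    return all(outer(s) == outer(s & a) + outer(s - a) for s in subsets(OMEGA))

measurable = {a for a in subsets(OMEGA) if is_measurable(a)}
```

For this particular outer measure only the empty set and `OMEGA` pass the test, so the measurable sets form the trivial σ algebra; every other subset splits some test set S "miraculously".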
First here is a definition and a lemma.
Definition 9.2 (µ⌊S)(A) ≡ µ(S ∩ A) for all A ⊆ Ω. Thus µ⌊S is the name of a
new outer measure, called µ restricted to S.
The next lemma indicates that the property of measurability is not lost by
considering this restricted measure.
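[Figure: Venn diagram of S cut by A and B into the four pieces $S \cap A^C \cap B^C$, $S \cap A^C \cap B$, $S \cap A \cap B^C$, and $S \cap A \cap B$.]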
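Since µ is subadditive,
$$\mu(S) \le \mu\left(S \cap A \cap B^C\right) + \mu(A \cap B \cap S) + \mu\left(S \cap B \cap A^C\right) + \mu\left(S \cap A^C \cap B^C\right).$$
Now using A, B ∈ S,
$$\begin{aligned}
\mu(S) &\le \mu\left(S \cap A \cap B^C\right) + \mu(S \cap A \cap B) + \mu\left(S \cap B \cap A^C\right) + \mu\left(S \cap A^C \cap B^C\right) \\
&= \mu(S \cap A) + \mu\left(S \cap A^C\right) = \mu(S).
\end{aligned}$$
It follows equality holds in the above. Now observe using the picture if you like that
$$(A \cap B \cap S) \cup \left(S \cap B \cap A^C\right) \cup \left(S \cap A^C \cap B^C\right) = S \setminus (A \setminus B)$$
and therefore,
$$\begin{aligned}
\mu(S) &= \mu\left(S \cap A \cap B^C\right) + \mu(A \cap B \cap S) + \mu\left(S \cap B \cap A^C\right) + \mu\left(S \cap A^C \cap B^C\right) \\
&\ge \mu(S \cap (A \setminus B)) + \mu(S \setminus (A \setminus B)).
\end{aligned}$$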
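By induction, if $A_i \cap A_j = \emptyset$ and $A_i \in S$, $\mu(\cup_{i=1}^{n} A_i) = \sum_{i=1}^{n} \mu(A_i)$.
Now let $A = \cup_{i=1}^{\infty} A_i$ where $A_i \cap A_j = \emptyset$ for $i \ne j$.
$$\sum_{i=1}^{\infty} \mu(A_i) \ge \mu(A) \ge \mu(\cup_{i=1}^{n} A_i) = \sum_{i=1}^{n} \mu(A_i).$$
Since this holds for all n, you can take the limit as n → ∞ and conclude,
$$\sum_{i=1}^{\infty} \mu(A_i) = \mu(A)$$
which establishes 9.3. Part 9.4 follows from part 9.3 just as in the proof of Theorem
8.5 on Page 172. That is, letting $F_0 \equiv \emptyset$, use part 9.3 to write
$$\begin{aligned}
\mu(F) &= \mu\left(\cup_{k=1}^{\infty} (F_k \setminus F_{k-1})\right) = \sum_{k=1}^{\infty} \mu(F_k \setminus F_{k-1}) \\
&= \lim_{n\to\infty} \sum_{k=1}^{n} \left(\mu(F_k) - \mu(F_{k-1})\right) = \lim_{n\to\infty} \mu(F_n).
\end{aligned}$$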
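The telescoping step can be checked exactly on a small example. The following is a sketch under assumptions chosen for illustration: a weighted counting measure on the positive integers giving the point k mass $2^{-k}$, and the increasing sets $F_n = \{1, \dots, n\}$. Neither comes from the text.

```python
from fractions import Fraction

def mu(indices):
    """Hypothetical measure on positive integers: the point k has mass 2**-k."""
    return sum((Fraction(1, 2**k) for k in indices), Fraction(0))

def F(n):
    """Increasing sets F_n = {1, ..., n}; F_0 is the empty set."""
    return set(range(1, n + 1))

def telescoped(n):
    """Sum of mu(F_k minus F_{k-1}) for k = 1..n."""
    return sum((mu(F(k) - F(k - 1)) for k in range(1, n + 1)), Fraction(0))

# The partial sums agree with mu(F_n) = 1 - 2**-n, which increases to mu(F) = 1.
for n in (1, 5, 20):
    assert telescoped(n) == mu(F(n)) == 1 - Fraction(1, 2**n)
```

Exact rational arithmetic is used so that the equality of the telescoped sum and µ(F_n) is checked without rounding error.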
In order to establish 9.5, let the Fn be as given there. Then, since (F1 \ Fn )
increases to (F1 \ F ), 9.4 implies
which implies
$$\lim_{n\to\infty} \mu(F_n) \le \mu(F).$$
But since $F \subseteq F_n$,
$$\mu(F) \le \lim_{n\to\infty} \mu(F_n)$$
and this establishes 9.5. Note that it was assumed µ (F1 ) < ∞ because µ (F1 ) was
subtracted from both sides.
It remains to show S is closed under countable unions. Recall that if A ∈ S, then
$A^C \in S$ and S is closed under finite unions. Let $A_i \in S$, $A = \cup_{i=1}^{\infty} A_i$, $B_n = \cup_{i=1}^{n} A_i$.
Then
Then apply Parts 9.5 and 9.4 to the outer measure, µ⌊S in 9.6 and let n → ∞.
Thus
$$B_n \uparrow A,\qquad B_n^C \downarrow A^C$$
and this yields $\mu(S) = (\mu⌊S)(A) + (\mu⌊S)(A^C) = \mu(S \cap A) + \mu(S \setminus A)$.
Therefore A ∈ S and this proves Parts 9.3, 9.4, and 9.5. It remains to prove the
last assertion about the measure being complete.
Let F ∈ S and let µ(E \ F ) + µ(F \ E) = 0. Consider the following picture.
[Figure: the sets E and F.]
The following lemma deals with the outer measure generated by a measure
which is σ finite. It says that if the given measure is σ finite and complete then no
new measurable sets are gained by going to the induced outer measure and then
considering the measurable sets in the sense of Caratheodory.
Lemma 9.7 Let (Ω, S, µ) be any measure space and let $\overline{\mu} : P(\Omega) \to [0, \infty]$ be the
outer measure induced by µ. Then $\overline{\mu}$ is an outer measure as claimed and if $\overline{S}$ is the
set of $\overline{\mu}$ measurable sets in the sense of Caratheodory, then $\overline{S} \supseteq S$ and $\overline{\mu} = \mu$ on S.
Furthermore, if µ is σ finite and (Ω, S, µ) is complete, then $\overline{S} = S$.
Next consider the claim about not getting any new sets from the outer measure
in the case the measure space is σ finite and complete.
Claim 1: If E, D ∈ S, and µ(E \ D) = 0, then if D ⊆ F ⊆ E, it follows F ∈ S.
Proof of claim 1:
F \ D ⊆ E \ D ∈ S,
and E \D is a set of measure zero. Therefore, since (Ω, S, µ) is complete, F \D ∈ S
and so
F = D ∪ (F \ D) ∈ S.
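Claim 2: Suppose $F \in \overline{S}$ and $\overline{\mu}(F) < \infty$. Then F ∈ S.
Proof of claim 2: From the definition of $\overline{\mu}$, it follows there exists E ∈ S
such that E ⊇ F and $\mu(E) = \overline{\mu}(F)$. Therefore,
$$\overline{\mu}(E) = \overline{\mu}(E \setminus F) + \overline{\mu}(F)$$
so
$$\overline{\mu}(E \setminus F) = 0. \tag{9.8}$$
Similarly, there exists $D_1 \in S$ such that
$$D_1 \subseteq E,\quad D_1 \supseteq (E \setminus F),\quad \mu(D_1) = \overline{\mu}(E \setminus F),$$
and
$$\overline{\mu}(D_1 \setminus (E \setminus F)) = 0. \tag{9.9}$$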
9.2. REGULAR MEASURES 215
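[Figure: the sets F, D, and $D_1$.]
$$\overline{\mu}(E \setminus D) \le \overline{\mu}(E \setminus F) + \overline{\mu}(F \setminus D) = \overline{\mu}(E \setminus F) + \overline{\mu}(D_1 \setminus (E \setminus F)) = 0.$$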
Theorem 9.9 (Urysohn) Let (X, τ ) be normal and let H ⊆ U where H is closed
and U is open. Then there exists g : X → [0, 1] such that g is continuous, g (x) = 1
on H and g (x) = 0 if x ∉ U.
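$$H \subseteq V,\quad U^C \subseteq W,\quad \text{and } V \cap W = \emptyset.$$
Then
$$H \subseteq V \subseteq \overline{V},\quad \overline{V} \cap U^C = \emptyset$$
and so let $V_{r_1} = V$.
Suppose $V_{r_1}, \dots, V_{r_k}$ have been chosen and list the rational numbers $r_1, \dots, r_k$
in order,
$$r_{l_1} < r_{l_2} < \dots < r_{l_k} \text{ for } \{l_1, \dots, l_k\} = \{1, \dots, k\}.$$
If $r_{k+1} > r_{l_k}$ then letting $p = r_{l_k}$, let $V_{r_{k+1}}$ satisfy
$$\overline{V_p} \subseteq V_{r_{k+1}} \subseteq \overline{V_{r_{k+1}}} \subseteq U.$$
If $r_{k+1} \in (r_{l_i}, r_{l_{i+1}})$, let $p = r_{l_i}$ and let $q = r_{l_{i+1}}$. Then let $V_{r_{k+1}}$ satisfy
$$\overline{V_p} \subseteq V_{r_{k+1}} \subseteq \overline{V_{r_{k+1}}} \subseteq V_q.$$
$$H \subseteq V_{r_{k+1}} \subseteq \overline{V_{r_{k+1}}} \subseteq V_p.$$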
9.3. URYSOHN’S LEMMA 217
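Thus there exist open sets $V_r$ for each r ∈ Q ∩ (0, 1) with the property that if
r < s,
$$H \subseteq V_r \subseteq \overline{V_r} \subseteq V_s \subseteq \overline{V_s} \subseteq U.$$
Now let
$$f(x) = \inf\{t \in D : x \in V_t\},\qquad f(x) \equiv 1 \text{ if } x \notin \bigcup_{t \in D} V_t.$$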
I claim f is continuous.
an open set.
Next consider $x \in f^{-1}([0, a])$ so f (x) ≤ a. If t > a, then $x \in V_t$ because if not,
then
$$\inf\{t \in D : x \in V_t\} > a.$$
Thus
$$f^{-1}([0, a]) = \cap\{V_t : t > a\} = \cap\{\overline{V_t} : t > a\}$$
which is a closed set. If a = 1, $f^{-1}([0, 1]) = f^{-1}([0, a]) = X$. Therefore,
Then f is continuous.
Proof: Consider $|f(x) - f(x_1)|$ and suppose without loss of generality that
f (x1 ) ≥ f (x) . Then choose y ∈ S such that f (x) + ε > d (x, y) . Then
Since ε is arbitrary, it follows that |f (x1 ) − f (x)| ≤ d (x1 , x) and this proves the
lemma.
Theorem 9.11 (Urysohn’s lemma for metric space) Let H be a closed subset of
an open set, U in a metric space, (X, d) . Then there exists a continuous function,
g : X → [0, 1] such that g (x) = 1 for all x ∈ H and g (x) = 0 for all x ∉ U.
Proof: If x ∉ C, a closed set, then dist (x, C) > 0 because if not, there would
exist a sequence of points of C converging to x and it would follow that x ∈ C.
Therefore, $\operatorname{dist}(x, H) + \operatorname{dist}(x, U^C) > 0$ for all x ∈ X. Now define a continuous
function, g as
$$g(x) \equiv \frac{\operatorname{dist}\left(x, U^C\right)}{\operatorname{dist}(x, H) + \operatorname{dist}\left(x, U^C\right)}.$$
It is easy to see this verifies the conclusions of the theorem and this proves the
theorem.
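The formula for g is easy to evaluate in a concrete case. The sketch below assumes, purely for illustration, the metric space X = R with H = [0, 1] and U = (−1, 2); the helper names are hypothetical.

```python
def dist_to_H(x):
    """Distance from x to the closed set H = [0, 1]."""
    return max(0.0, -x, x - 1.0)

def dist_to_U_complement(x):
    """Distance from x to the complement of U = (-1, 2)."""
    return max(0.0, min(x + 1.0, 2.0 - x))

def g(x):
    """The function of Theorem 9.11: dist(x, U^C) / (dist(x, H) + dist(x, U^C))."""
    d_h = dist_to_H(x)
    d_uc = dist_to_U_complement(x)
    # The denominator is positive because H is a closed subset of the open set U.
    return d_uc / (d_h + d_uc)
```

Here g equals 1 on H, equals 0 outside U, and interpolates in between; for instance the point 1.5 is at distance 0.5 from both H and the complement of U, so g(1.5) = 0.5.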
Proof: First it is shown that X is regular. Let H be a closed set and let p ∉ H.
Then for each h ∈ H, there exists an open set $U_h$ containing p and an open set $V_h$
containing h such that $U_h \cap V_h = \emptyset$. Since H must be compact, it follows there
are finitely many of the sets $V_h$, $V_{h_1}, \dots, V_{h_n}$ such that $H \subseteq \cup_{i=1}^{n} V_{h_i}$. Then letting
$U = \cap_{i=1}^{n} U_{h_i}$ and $V = \cup_{i=1}^{n} V_{h_i}$, it follows that p ∈ U, H ⊆ V and U ∩ V = ∅. Thus
X is regular as claimed.
Next let K and H be disjoint nonempty closed sets. Using regularity of X, for
every k ∈ K, there exists an open set $U_k$ containing k and an open set $V_k$ containing
H such that these two open sets have empty intersection. Thus $H \cap \overline{U_k} = \emptyset$. Finitely
many of the $U_k$, $U_{k_1}, \dots, U_{k_p}$ cover K and so $\cup_{i=1}^{p} \overline{U_{k_i}}$ is a closed set which has empty
intersection with H. Therefore, $K \subseteq \cup_{i=1}^{p} U_{k_i}$ and $H \subseteq \left(\cup_{i=1}^{p} \overline{U_{k_i}}\right)^C$. This proves
the theorem.
A useful construction when dealing with locally compact Hausdorff spaces is the
notion of the one point compactification of the space discussed earlier. However, it
is reviewed here for the sake of convenience or in case you have not read the earlier
treatment.
Definition 9.13 Suppose (X, τ ) is a locally compact Hausdorff space. Then let
$\widetilde{X} \equiv X \cup \{\infty\}$ where ∞ is just the name of some point which is not in X which is
called the point at infinity. A basis for the topology $\widetilde{\tau}$ for $\widetilde{X}$ is
$$\tau \cup \left\{K^C \text{ where } K \text{ is a compact subset of } X\right\}.$$
having compact closure which contains p. Then p ∈ U and $\infty \in \overline{U}^C$ and these are
disjoint open sets containing the points, p and ∞ respectively. Now let C be an
open cover of $\widetilde{X}$ with sets from $\widetilde{\tau}$. Then ∞ must be in some set, $U_\infty$ from C, which
must contain a set of the form $K^C$ where K is a compact subset of X. Then there
exist sets from C, $U_1, \dots, U_r$ which cover K. Therefore, a finite subcover of $\widetilde{X}$ is
$U_1, \dots, U_r, U_\infty$.
Theorem 9.15 Let X be a locally compact Hausdorff space, and let K be a compact
subset of the open set V . Then there exists a continuous function, f : X → [0, 1],
such that f equals 1 on K and $\overline{\{x : f(x) \ne 0\}} \equiv \operatorname{spt}(f)$ is a compact subset of V .
Proof: Let $\widetilde{X}$ be the space just described. Then K and V are respectively
closed and open in $\widetilde{\tau}$. By Theorem 9.12 there exist open sets in $\widetilde{\tau}$, U, and W such
that K ⊆ U, $\infty \in V^C \subseteq W$, and U ∩ W = U ∩ (W \ {∞}) = ∅. Thus W \ {∞} is
an open set in the original topological space which contains $V^C$, U is an open set in
the original topological space which contains K, and W \ {∞} and U are disjoint.
Now for each x ∈ K, let $U_x$ be a basic open set whose closure is compact and
such that
$$x \in U_x \subseteq U.$$
Thus $\overline{U_x}$ must have empty intersection with $V^C$ because the open set, W \ {∞}
contains no points of $U_x$. Since K is compact, there are finitely many of these sets,
$U_{x_1}, U_{x_2}, \dots, U_{x_n}$ which cover K. Now let $H \equiv \cup_{i=1}^{n} U_{x_i}$.
Claim: $\overline{H} = \cup_{i=1}^{n} \overline{U_{x_i}}$
Proof of claim: Suppose $p \in \overline{H}$. If $p \notin \cup_{i=1}^{n} \overline{U_{x_i}}$ then it follows $p \notin \overline{U_{x_i}}$ for
each i. Therefore, there exists an open set, $R_i$ containing p such that $R_i$ contains
no other points of $U_{x_i}$. Therefore, $R \equiv \cap_{i=1}^{n} R_i$ is an open set containing p which
contains no other points of $\cup_{i=1}^{n} U_{x_i} = H$, a contradiction. Therefore, $\overline{H} \subseteq \cup_{i=1}^{n} \overline{U_{x_i}}$.
On the other hand, if $p \in \overline{U_{x_i}}$ then p is obviously in $\overline{H}$ so this proves the claim.
From the claim, $K \subseteq H \subseteq \overline{H} \subseteq V$ and $\overline{H}$ is compact because it is the finite
union of compact sets. Repeating the same argument, there exists an open set, I
such that $\overline{H} \subseteq I \subseteq \overline{I} \subseteq V$ with $\overline{I}$ compact. Now $\left(\overline{I}, \tau_{\overline{I}}\right)$ is a compact topological
space where $\tau_{\overline{I}}$ is the topology which is obtained by taking intersections of open
sets in X with $\overline{I}$. Therefore, by Urysohn's lemma, there exists $f : \overline{I} \to [0, 1]$ such
that f is continuous at every point of $\overline{I}$ and also f (K) = 1 while $f\left(\overline{I} \setminus H\right) = 0$.
Extending f to equal 0 on $\overline{I}^C$, it follows that f is continuous on X, has values in
[0, 1] , and satisfies f (K) = 1 and spt (f ) is a compact subset contained in $\overline{I} \subseteq V$.
This proves the theorem.
In fact, the conclusion of the above theorem could be used to prove that the
topological space is locally compact. However, this is not needed here.
where Ω denotes the whole topological space considered. Also for φ ∈ Cc (Ω), K ≺ φ
if
φ(Ω) ⊆ [0, 1] and φ(K) = 1.
and φ ≺ V if
φ(Ω) ⊆ [0, 1] and spt(φ) ⊆ V.
$K \subseteq V = \cup_{i=1}^{n} V_i$, $V_i$ open.
for all x ∈ K.
[Figure: the sets $W_i$, $U_i$, and $V_i$.]
Proof: Keep $V_i$ the same but replace $V_j$ with $\widetilde{V_j} \equiv V_j \setminus H$. Now in the proof
above, applied to this modified collection of open sets, if $j \ne i$, $\phi_j(x) = 0$ whenever
x ∈ H. Therefore, $\psi_i(x) = 1$ on H.
and if Lf ≥ 0 whenever f ≥ 0.
Theorem 9.21 (Riesz representation theorem) Let (Ω, τ ) be a locally compact Hausdorff
space and let L be a positive linear functional on Cc (Ω). Then there exists a
σ algebra S containing the Borel sets and a unique measure µ, defined on S, such
that
µ is complete, (9.10)
µ(K) < ∞ for all K compact, (9.11)
The plan is to define an outer measure and then to show that it, together with the
σ algebra of sets measurable in the sense of Caratheodory, satisfies the conclusions
of the theorem. Always, K will be a compact set and V will be an open set.
Proof: First it is necessary to verify µ is well defined because there are two
descriptions of it on open sets. Suppose then that µ1 (V ) ≡ inf{µ(U ) : U ⊇ V
and U is open}. It is required to verify that µ1 (V ) = µ (V ) where µ is given as
sup{Lf : f ≺ V }. If U ⊇ V, then µ (U ) ≥ µ (V ) directly from the definition. Hence
Hence
$$\mu(V) \le \sum_{i=1}^{\infty} \mu(V_i)$$
since f ≺ V is arbitrary. Now let $E = \cup_{i=1}^{\infty} E_i$. Is $\mu(E) \le \sum_{i=1}^{\infty} \mu(E_i)$? Without
loss of generality, it can be assumed $\mu(E_i) < \infty$ for each i since if not so, there is
nothing to prove. Let $V_i \supseteq E_i$ with $\mu(E_i) + \varepsilon 2^{-i} > \mu(V_i)$.
$$\mu(E) \le \mu(\cup_{i=1}^{\infty} V_i) \le \sum_{i=1}^{\infty} \mu(V_i) \le \varepsilon + \sum_{i=1}^{\infty} \mu(E_i).$$
Since ε was arbitrary, $\mu(E) \le \sum_{i=1}^{\infty} \mu(E_i)$ which proves the lemma.
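[Figure: the compact set K and the open set $V_\alpha$, on which g > α.]
Then h ≤ 1 on $V_\alpha$ while $g\alpha^{-1} \ge 1$ on $V_\alpha$ and so $g\alpha^{-1} \ge h$ which implies
$L(g\alpha^{-1}) \ge Lh$ and that therefore, since L is linear,
$$Lg \ge \alpha Lh.$$
$$Lg \ge \alpha\mu(V_\alpha) \ge \alpha\mu(K).$$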
Letting α ↑ 1 yields Lg ≥ µ(K). This proves the first part of the lemma. The
second assertion follows from this and Theorem 9.15. If K is given, let
$$K \prec g \prec \Omega$$
and so from what was just shown, µ (K) ≤ Lg < ∞. This proves the lemma.
9.4. POSITIVE LINEAR FUNCTIONALS 223
[Figure: the sets A and B together with open sets $U_1$ and $V_1$.]
From Lemma 9.24 µ(A ∪ B) < ∞ and so there exists an open set, W such that
W ⊇ A ∪ B, µ (A ∪ B) + ε > µ (W ) .
Lemma 9.26 Let f ∈ Cc (Ω), f (Ω) ⊆ [0, 1]. Then µ(spt(f )) ≥ Lf . Also, every
open set, V satisfies
µ (V ) = sup {µ (K) : K ⊆ V } .
[Figure: spt(f) and the open set V.]
Finally, let V be open and let l < µ (V ) . Then from the definition of µ, there
exists f ≺ V such that L (f ) > l. Therefore, l < µ (spt (f )) ≤ µ (V ) and so this
shows the claim about inner regularity of the measure on an open set.
Proof: Let K be compact. Then from the definition of µ, there exists an open
set U , with µ(U ) < ∞ and U ⊇ K. Suppose for every open set, V , containing
K, µ(V \ K) > ε. Then there exists f ≺ U \ K with Lf > ε. Consequently,
µ(spt(f )) ≥ Lf > ε. Let $K_1$ = spt(f ) and repeat the construction with $U \setminus K_1$ in
place of U.
[Figure: the compact set K and the compact sets $K_1$, $K_2$, $K_3$.]
for all r, contradicting µ(U ) < ∞. This demonstrates the first part of the lemma.
To show the second part, employ a similar construction. Suppose µ(V \ K) > ε
for all K ⊆ V . Then µ(V ) > ε so there exists f ≺ V with Lf > ε. Let $K_1$ = spt(f )
so µ(spt(f )) > ε. If $K_1, \dots, K_n$, disjoint, compact subsets of V have been chosen,
there must exist $g \prec (V \setminus \cup_{i=1}^{n} K_i)$ such that Lg > ε. Hence µ(spt(g)) > ε. Let
$K_{n+1}$ = spt(g). In this way there exists a sequence of disjoint compact subsets of
V , $\{K_i\}$ with $\mu(K_i) > \varepsilon$. Thus for any m, $K_1, \dots, K_m$ are all contained in V and
are disjoint and compact. By Lemma 9.25
$$\mu(V) \ge \mu(\cup_{i=1}^{m} K_i) = \sum_{i=1}^{m} \mu(K_i) > m\varepsilon$$
for all m, a contradiction to µ(V ) < ∞. This proves the second part.
Lemma 9.28 Let S be the σ algebra of µ measurable sets in the sense of Caratheodory.
Then S ⊇ Borel sets and µ is inner regular on every open set and for every
E ∈ S with µ(E) < ∞.
Proof: Define
S1 = {E ⊆ Ω : E ∩ K ∈ S}
for all compact K.
Let C be a compact set. The idea is to show that C ∈ S. From this it will follow
that the closed sets are in S1 because if C is only closed, C ∩ K is compact. Hence
C ∩ K = (C ∩ K) ∩ K ∈ S. The steps are to first show the compact sets are in
S and this implies the closed sets are in S1 . Then you show S1 is a σ algebra and
so it contains the Borel sets. Finally, it is shown that S1 = S and then the inner
regularity conclusion is established.
Let V be an open set with µ (V ) < ∞. I will show that
By Lemma 9.27, there exists an open set U containing C and a compact subset of
V , K, such that µ(V \ K) < ε and µ (U \ C) < ε.
[Figure: the sets U, V, C, and K.]
Since ε is arbitrary,
µ(V ) = µ(V \ C) + µ(V ∩ C) (9.13)
whenever C is compact and V is open. (If µ (V ) = ∞, it is obvious that µ (V ) ≥
µ(V \ C) + µ(V ∩ C) and it is always the case that µ (V ) ≤ µ (V \ C) + µ (V ∩ C) .)
Of course 9.13 is exactly what needs to be shown for arbitrary S in place of V .
It suffices to consider only S having µ (S) < ∞. If S ⊆ Ω, with µ(S) < ∞, let
V ⊇ S, µ(S) + ε > µ(V ). Then from what was just shown, if C is compact,
Since ε is arbitrary, this shows the compact sets are in S. As discussed above, this
verifies the closed sets are in S1 .
Therefore, S1 contains the closed sets and S contains the compact sets. Therefore,
if E ∈ S and K is a compact set, it follows K ∩ E ∈ S and so S1 ⊇ S.
To see that S1 is closed with respect to taking complements, let E ∈ S1 .
K = (E C ∩ K) ∪ (E ∩ K).
Then from the fact, just established, that the compact sets are in S,
E C ∩ K = K \ (E ∩ K) ∈ S.
because
$$\mu(V \cap (K \cap E)) + \overbrace{\mu(V \setminus K)}^{<\varepsilon} \ge \mu(V \cap E)$$
Since ε is arbitrary,
µ(V ) = µ(V \ E) + µ(V ∩ E).
Now let S ⊆ Ω. If µ(S) = ∞, then µ(S) = µ(S ∩ E) + µ(S \ E). If µ(S) < ∞, let
V ⊇ S, µ(S) + ε ≥ µ(V ).
Then
[Figure: the sets K, $K \cap V^C$, F, and U.]
µ (V \ (U \ F )) + µ (U \ F ) = µ (V )
and so
µ (V \ (U \ F )) = µ (V ) − µ (U \ F ) < ε.
Also,
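$$\begin{aligned}
V \setminus (U \setminus F) &= V \cap \left(U \cap F^C\right)^C \\
&= V \cap \left[U^C \cup F\right] \\
&= (V \cap F) \cup \left(V \cap U^C\right) \\
&\supseteq V \cap F
\end{aligned}$$
and so
$$\mu(V \cap F) \le \mu(V \setminus (U \setminus F)) < \varepsilon.$$
Since $V \supseteq U \cap F^C$, $V^C \subseteq U^C \cup F$ so $U \cap V^C \subseteq U \cap F = F$. Hence $U \cap V^C$ is a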
Since ε is arbitrary, this proves the second part of the lemma. Formula 9.11 of this
theorem was established earlier.
It remains to show µ satisfies 9.12.
Lemma 9.29 $\int f\,d\mu = Lf$ for all f ∈ Cc (Ω).
Proof: Let f ∈ Cc (Ω), f real-valued, and suppose f (Ω) ⊆ [a, b]. Choose t0 < a
and let t0 < t1 < · · · < tn = b, ti − ti−1 < ε. Let
Now note that |t0 | + ti + ε ≥ 0 and so from the definition of µ and Lemma 9.24,
this is no larger than
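$$\begin{aligned}
&\sum_{i=1}^{n} (|t_0| + t_i + \varepsilon)\,\mu(V_i) - |t_0|\,\mu(\operatorname{spt}(f)) \\
&\le \sum_{i=1}^{n} (|t_0| + t_i + \varepsilon)\left(\mu(E_i) + \varepsilon/n\right) - |t_0|\,\mu(\operatorname{spt}(f)) \\
&\le |t_0| \sum_{i=1}^{n} \mu(E_i) + |t_0|\varepsilon + \sum_{i=1}^{n} t_i\,\mu(E_i) + \varepsilon(|t_0| + |b|) \\
&\quad + \varepsilon \sum_{i=1}^{n} \mu(E_i) + \varepsilon^2 - |t_0|\,\mu(\operatorname{spt}(f)).
\end{aligned}$$
From 9.15 and 9.14, the first and last terms cancel. Therefore this is no larger than
$$\begin{aligned}
&(2|t_0| + |b| + \mu(\operatorname{spt}(f)) + \varepsilon)\varepsilon + \sum_{i=1}^{n} t_{i-1}\,\mu(E_i) + \varepsilon\,\mu(\operatorname{spt}(f)) \\
&\le \int f\,d\mu + (2|t_0| + |b| + 2\mu(\operatorname{spt}(f)) + \varepsilon)\varepsilon.
\end{aligned}$$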
Thus µ1 (K) ≤ µ2 (K) for all K. Similarly, the inequality can be reversed and so it
follows the two measures are equal on compact sets. By the assumption of inner
regularity on open sets, the two measures are also equal on all open sets. By outer
regularity, they are equal on all sets of S. This proves the theorem.
An important example of a locally compact Hausdorff space is any metric space
in which the closures of balls are compact. For example, Rn with the usual metric
is an example of this. Not surprisingly, more can be said in this important special
case.
Theorem 9.30 Let (Ω, τ ) be a metric space in which the closures of the balls are
compact and let L be a positive linear functional defined on Cc (Ω) . Then there
exists a measure representing the positive linear functional which satisfies all the
conclusions of Theorem 9.15 and in addition the property that µ is regular. The
same conclusion follows if (Ω, τ ) is a compact Hausdorff space.
Theorem 9.31 Let (Ω, τ ) be a metric space in which the closures of the balls are
compact and let L be a positive linear functional defined on Cc (Ω) . Then there
exists a measure representing the positive linear functional which satisfies all the
conclusions of Theorem 9.15 and in addition the property that µ is regular. The
same conclusion follows if (Ω, τ ) is a compact Hausdorff space.
k < µ (F ∩ Ωn ) ≤ µ (F ) .
Corollary 9.32 Let (Ω, τ ) be a locally compact Hausdorff space and suppose µ
defined on a σ algebra, S represents the positive linear functional L where L is
defined on Cc (Ω) in the sense of Theorem 9.15. Suppose also that there exist Ωn ∈ S
such that $\Omega = \cup_{n=1}^{\infty} \Omega_n$ and $\mu(\Omega_n) < \infty$. Then µ is regular.
Definition 9.33 Let (Ω, τ ) be a locally compact Hausdorff space and let L be a
positive linear functional defined on Cc (Ω) such that the complete measure defined
by the Riesz representation theorem for positive linear functionals is inner regular.
Then this is called a Radon measure. Thus a Radon measure is complete, and
regular.
Corollary 9.34 Let (Ω, τ ) be a locally compact Hausdorff space which is also σ
compact meaning
$$\Omega = \cup_{n=1}^{\infty} \Omega_n,\quad \Omega_n \text{ is compact},$$
and let L be a positive linear functional defined on Cc (Ω) . Then if (µ1 , S1 ) , and
(µ2 , S2 ) are two Radon measures, together with their σ algebras which represent L
then the two σ algebras are equal and the two measures are equal.
Proof: Suppose (µ1 , S1 ) and (µ2 , S2 ) both work. It will be shown the two
measures are equal on every compact set. Let K be compact and let V be an open
set containing K. Then let K ≺ f ≺ V. Then
$$\mu_1(K) = \int_K d\mu_1 \le \int f\,d\mu_1 = L(f) = \int f\,d\mu_2 \le \mu_2(V).$$
Therefore, taking the infimum over all V containing K implies µ1 (K) ≤ µ2 (K) .
Reversing the argument shows µ1 (K) = µ2 (K) . This also implies the two measures
are equal on all open sets because they are both inner regular on open sets. It is
being assumed the two measures are regular. Now let F ∈ S1 with µ1 (F ) < ∞.
Then there exist sets, H, G such that H ⊆ F ⊆ G such that H is the countable
union of compact sets and G is a countable intersection of open sets such that
µ1 (G) = µ1 (H) which implies µ1 (G \ H) = 0. Now G \ H can be written as the
countable intersection of sets of the form Vk \Kk where Vk is open, µ1 (Vk ) < ∞ and
Kk is compact. From what was just shown, µ2 (Vk \ Kk ) = µ1 (Vk \ Kk ) so it follows
µ2 (G \ H) = 0 also. Since µ2 is complete, and G and H are in S2 , it follows F ∈ S2
and µ2 (F ) = µ1 (F ) . Now for arbitrary F possibly having µ1 (F ) = ∞, consider
F ∩ Ωn . From what was just shown, this set is in S2 and µ2 (F ∩ Ωn ) = µ1 (F ∩ Ωn ).
Taking the union of these F ∩Ωn gives F ∈ S2 and also µ1 (F ) = µ2 (F ) . This shows
S1 ⊆ S2 . Similarly, S2 ⊆ S1 .
The following lemma is often useful.
Lemma 9.35 Let (Ω, F, µ) be a measure space where Ω is a metric space having
closed balls compact or more generally a topological space. Suppose µ is a Radon
measure and f is measurable with respect to F. Then there exists a Borel measurable
function, g, such that g = f a.e.
where $E_k^n \in F$. By the outer regularity of µ, there exists a Borel set, $F_k^n \supseteq E_k^n$ such
that $\mu(F_k^n) = \mu(E_k^n)$. In fact $F_k^n$ can be assumed to be a $G_\delta$ set. Let
$$t_n(\omega) \equiv \sum_{k=1}^{P_n} c_k^n X_{F_k^n}(\omega).$$
9.5. ONE DIMENSIONAL LEBESGUE MEASURE 231
This will be done in general a little later but for now, consider the following picture
of functions, f k and g k converging pointwise as k → ∞ to X[a,b] .
[Figure: graphs of $f^k$ and $g^k$, each with maximum value 1. The graph of $f^k$ is labeled with the points a, a + 1/k, b − 1/k, b and the graph of $g^k$ with the points a − 1/k, a, b, b + 1/k.]
Then
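$$\begin{aligned}
\left(b - a - \frac{2}{k}\right) &\le \int f^k\,dx = \int f^k\,dm \le m((a, b)) \le m([a, b]) \\
&= \int X_{[a,b]}\,dm \le \int g^k\,dm = \int g^k\,dx \le \left(b - a + \frac{2}{k}\right).
\end{aligned}$$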
From this the claim in 9.18 follows.
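The chain of inequalities can be checked numerically. The sketch below assumes that $f^k$ and $g^k$ are the piecewise linear functions suggested by the picture (this shape is an assumption, as are the function names and the sample values a = 0, b = 1, k = 10), and approximates their Riemann integrals with a midpoint rule.

```python
def f_k(x, a, b, k):
    """Assumed shape: 1 on [a + 1/k, b - 1/k], 0 outside (a, b), linear between."""
    return max(0.0, min(1.0, k * (x - a), k * (b - x)))

def g_k(x, a, b, k):
    """Assumed shape: 1 on [a, b], 0 outside (a - 1/k, b + 1/k), linear between."""
    return max(0.0, min(1.0, k * (x - a) + 1.0, k * (b - x) + 1.0))

def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint rule approximation of the Riemann integral of func on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

a, b, k = 0.0, 1.0, 10
int_f = midpoint_integral(lambda x: f_k(x, a, b, k), a - 1.0, b + 1.0)
int_g = midpoint_integral(lambda x: g_k(x, a, b, k), a - 1.0, b + 1.0)

# b - a - 2/k <= integral of f_k <= b - a <= integral of g_k <= b - a + 2/k
assert b - a - 2.0 / k <= int_f <= b - a <= int_g <= b - a + 2.0 / k
```

With these trapezoidal shapes the exact integrals are b − a − 1/k and b − a + 1/k, so both squeeze to b − a as k → ∞, which is the content of the claim.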
Proof: The sets, [fn > t] are increasing and their union is [f > t] because if
f (ω) > t, then for all n large enough, fn (ω) > t also. Therefore, from Theorem 8.5
on Page 172 the desired conclusion follows.
Lemma 9.38 Suppose s ≥ 0 is a measurable simple function,
$$s(\omega) \equiv \sum_{k=1}^{n} a_k X_{E_k}(\omega)$$
where the $a_k$ are the distinct nonzero values of s, $a_1 < a_2 < \dots < a_n$. Suppose φ is
a $C^1$ function defined on [0, ∞) which has the property that φ (0) = 0, φ′(t) > 0 for
all t. Then
$$\int_0^{\infty} \phi'(t)\,\mu([s > t])\,dm = \int \phi(s)\,d\mu.$$
Proof: First note that if µ (Ek ) = ∞ for any k then both sides equal ∞ and
so without loss of generality, assume µ (Ek ) < ∞ for all k. Letting a0 ≡ 0, the left
side equals
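$$\begin{aligned}
\sum_{k=1}^{n} \int_{a_{k-1}}^{a_k} \phi'(t)\,\mu([s > t])\,dm &= \sum_{k=1}^{n} \int_{a_{k-1}}^{a_k} \phi'(t) \sum_{i=k}^{n} \mu(E_i)\,dm \\
&= \sum_{k=1}^{n} \sum_{i=k}^{n} \mu(E_i) \int_{a_{k-1}}^{a_k} \phi'(t)\,dm \\
&= \sum_{k=1}^{n} \sum_{i=k}^{n} \mu(E_i)\left(\phi(a_k) - \phi(a_{k-1})\right) \\
&= \sum_{i=1}^{n} \mu(E_i) \sum_{k=1}^{i} \left(\phi(a_k) - \phi(a_{k-1})\right) \\
&= \sum_{i=1}^{n} \mu(E_i)\,\phi(a_i) = \int \phi(s)\,d\mu.
\end{aligned}$$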
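The computation above can be replayed with exact arithmetic. In the sketch below the values $a_k$, the masses $\mu(E_k)$, and the choice φ(t) = t² + t (so that φ(0) = 0 and φ′ > 0) are all made up for illustration.

```python
from fractions import Fraction

def phi(t):
    """Sample C^1 function with phi(0) = 0 and phi'(t) = 2t + 1 > 0 on [0, inf)."""
    return t * t + t

values = [Fraction(1), Fraction(2), Fraction(5)]  # a_1 < a_2 < a_3 (hypothetical)
masses = [Fraction(3), Fraction(1), Fraction(2)]  # mu(E_1), mu(E_2), mu(E_3)

def layer_cake(values, masses):
    """Left side: sum over k of (phi(a_k) - phi(a_{k-1})) * mu([s > t]) on [a_{k-1}, a_k)."""
    total = Fraction(0)
    previous = Fraction(0)  # a_0 = 0
    for k, a_k in enumerate(values):
        tail = sum(masses[k:], Fraction(0))  # mu(E_k) + ... + mu(E_n)
        total += (phi(a_k) - phi(previous)) * tail
        previous = a_k
    return total

def direct(values, masses):
    """Right side: the integral of phi(s), i.e. sum of phi(a_i) * mu(E_i)."""
    return sum((phi(a) * m for a, m in zip(values, masses)), Fraction(0))

assert layer_cake(values, masses) == direct(values, masses)
```

The function `layer_cake` uses the fact, from the second line of the display, that µ([s > t]) is constant on each interval $[a_{k-1}, a_k)$, so the integral of φ′ there reduces to a difference of values of φ.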
Definition 9.40 Let R denote the set of countable unions of sets of the form A×B,
where A ∈ S and B ∈ T (Sets of the form A × B are referred to as measurable
rectangles) and also let
ρ (A × B) = µ (A) ν (B) (9.19)
More generally, define
$$\rho(E) \equiv \int\int X_E(x, y)\,d\mu\,d\nu \tag{9.20}$$
and
$$y \to \int X_E(x, y)\,d\mu \text{ is } \nu \text{ measurable.} \tag{9.22}$$
Lemma 9.41 Given C × D and $\{A_i \times B_i\}_{i=1}^{n}$, there exist finitely many disjoint
rectangles, $\{C_i' \times D_i'\}_{i=1}^{p}$ such that none of these sets intersect any of the $A_i \times B_i$,
each set is contained in C × D and
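$$(C \times D) \setminus (A_1 \times B_1) = C \times (D \setminus B_1) \cup (C \setminus A_1) \times (D \cap B_1)$$
and these last two sets are disjoint, have empty intersection with $A_1 \times B_1$, and
$$\left(C \times (D \setminus B_1) \cup (C \setminus A_1) \times (D \cap B_1)\right) \cup \left(\cup_{i=1}^{n} A_i \times B_i\right) = (C \times D) \cup \left(\cup_{i=1}^{n} A_i \times B_i\right)$$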
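The decomposition of the difference of two rectangles can be verified directly on finite sets. The sets below are arbitrary choices made for illustration.

```python
from itertools import product

def rect(a, b):
    """The rectangle a x b as a set of ordered pairs."""
    return set(product(a, b))

# Hypothetical finite sets; A1 and B1 deliberately stick out of C and D.
C, D = {1, 2, 3, 4}, {1, 2, 3}
A1, B1 = {2, 3, 9}, {3, 7}

left = rect(C, D) - rect(A1, B1)
piece1 = rect(C, D - B1)        # C x (D \ B1)
piece2 = rect(C - A1, D & B1)   # (C \ A1) x (D ∩ B1)

assert left == piece1 | piece2
assert not (piece1 & piece2)                 # the two pieces are disjoint
assert not ((piece1 | piece2) & rect(A1, B1))  # and miss A1 x B1
```

The first piece handles the points whose second coordinate avoids $B_1$; the second handles those whose second coordinate lies in $B_1$, which forces the first coordinate out of $A_1$.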
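Now suppose disjoint sets, $\left\{\widetilde{C_i} \times \widetilde{D_i}\right\}_{i=1}^{m}$ have been obtained, each being a subset
of C × D such that
$$\left(\cup_{i=1}^{n} A_i \times B_i\right) \cup \left(\cup_{k=1}^{m} \widetilde{C_k} \times \widetilde{D_k}\right) = \left(\cup_{i=1}^{n} A_i \times B_i\right) \cup (C \times D)$$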
Lemma 9.42 If $Q = \cup_{i=1}^{\infty} A_i \times B_i \in R$, then there exist disjoint sets, of the form
$A_i' \times B_i'$ such that $Q = \cup_{i=1}^{\infty} A_i' \times B_i'$, each $A_i' \times B_i'$ is a subset of some $A_i \times B_i$, and
$A_i' \in S$ while $B_i' \in T$. Also, the intersection of finitely many sets of R is a set of
R. For ρ defined in 9.20, it follows that 9.21 and 9.22 hold for any element of R.
Furthermore,
$$\rho(Q) = \sum_i \mu(A_i')\,\nu(B_i') = \sum_i \rho(A_i' \times B_i').$$
Now suppose disjoint rectangles, $\{A_i' \times B_i'\}_{i=1}^{m_p}$ have been obtained such that each
rectangle is a subset of $A_k \times B_k$ for some k ≤ p and
$$\cup_{i=1}^{\infty} A_i \times B_i = \left(\cup_{i=1}^{m_p} A_i' \times B_i'\right) \cup \left(\cup_{k=p+1}^{\infty} A_k \times B_k\right).$$
By Lemma 9.41 again, there exist disjoint rectangles $\{A_i' \times B_i'\}_{i=m_p+1}^{m_{p+1}}$ such that
each is contained in $A_{p+1} \times B_{p+1}$, none have intersection with any of $\{A_i' \times B_i'\}_{i=1}^{m_p}$
and
$$\cup_{i=1}^{\infty} A_i \times B_i = \left(\cup_{i=1}^{m_{p+1}} A_i' \times B_i'\right) \cup \left(\cup_{k=p+2}^{\infty} A_k \times B_k\right).$$
Note that no change is made in $\{A_i' \times B_i'\}_{i=1}^{m_p}$. Continuing this way proves the
existence of the desired sequence of disjoint rectangles, each of which is a subset of
at least one of the original rectangles and such that
$$Q = \cup_{i=1}^{\infty} A_i' \times B_i'.$$
This shows the measurability conditions, 9.21 and 9.22 hold for Q ∈ R and also
establishes the formula for ρ (Q) , 9.23.
If ∪i Ai × Bi and ∪j Cj × Dj are two sets of R, then their intersection is
∪i ∪j (Ai ∩ Cj ) × (Bi ∩ Dj )
Proof: Let $R_i = \cup_{j=1}^{\infty} A_j^i \times B_j^i$. Using Lemma 9.42, let $\{A_m' \times B_m'\}_{m=1}^{\infty}$ be a
sequence of disjoint rectangles each of which is contained in some $A_j^i \times B_j^i$ for some
i, j such that
$$\cup_{i=1}^{\infty} R_i = \cup_{m=1}^{\infty} A_m' \times B_m'.$$
Now define
$$S_i \equiv \left\{m : A_m' \times B_m' \subseteq A_j^i \times B_j^i \text{ for some } j\right\}.$$
It is not important to consider whether some m might be in more than one $S_i$. The
important thing to notice is that
$$\cup_{m \in S_i} A_m' \times B_m' \subseteq \cup_{j=1}^{\infty} A_j^i \times B_j^i = R_i.$$
Proposition 9.45 $(\mu \times \nu)(S) = \inf\left\{\sum_{i=1}^{\infty} \mu(A_i)\,\nu(B_i) : S \subseteq \cup_{i=1}^{\infty} A_i \times B_i\right\}$
Proof: Let $\lambda(S) \equiv \inf\left\{\sum_{i=1}^{\infty} \mu(A_i)\,\nu(B_i) : S \subseteq \cup_{i=1}^{\infty} A_i \times B_i\right\}$. Suppose
$S \subseteq \cup_{i=1}^{\infty} A_i \times B_i \equiv Q \in R$. Then by Lemma 9.42, $Q = \cup_i A_i' \times B_i'$ where these rectangles
are disjoint. Thus by this lemma, $\rho(Q) = \sum_{i=1}^{\infty} \mu(A_i')\,\nu(B_i') \ge \lambda(S)$ and so
λ (S) ≤ (µ × ν) (S) . If λ (S) = ∞, this shows λ (S) = (µ × ν) (S) . Suppose then
that λ (S) < ∞ and $\lambda(S) + \varepsilon > \sum_{i=1}^{\infty} \mu(A_i)\,\nu(B_i)$ where $Q = \cup_{i=1}^{\infty} A_i \times B_i \supseteq S$.
Then by Lemma 9.42 again, $\cup_{i=1}^{\infty} A_i \times B_i = \cup_{i=1}^{\infty} A_i' \times B_i'$ where the primed rectangles
are disjoint, each is a subset of some $A_i \times B_i$ and so
$$\lambda(S) + \varepsilon \ge \sum_{i=1}^{\infty} \mu(A_i)\,\nu(B_i) \ge \sum_{i=1}^{\infty} \mu(A_i')\,\nu(B_i') = \rho(Q) \ge (\mu \times \nu)(S).$$
Since ε is arbitrary, this shows λ (S) ≥ (µ × ν) (S) and this proves the proposition.
$$(\mu \times \nu)(S) \le (\mu \times \nu)(T), \tag{9.26}$$
$$(\mu \times \nu)(\cup_{i=1}^{\infty} S_i) \le \sum_{i=1}^{\infty} (\mu \times \nu)(S_i). \tag{9.27}$$
To do this, note that 9.26 is obvious. To verify 9.27, note that it is obvious if
$(\mu \times \nu)(S_i) = \infty$ for any i. Therefore, assume $(\mu \times \nu)(S_i) < \infty$. Then letting ε > 0
be given, there exist $R_i \in R$ such that
$$(\mu \times \nu)(S_i) + \frac{\varepsilon}{2^i} > \rho(R_i),\quad R_i \supseteq S_i.$$
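$$\begin{aligned}
(\mu \times \nu)(\cup_{i=1}^{\infty} S_i) &\le (\mu \times \nu)(\cup_{i=1}^{\infty} R_i) \\
&= \rho(\cup_{i=1}^{\infty} R_i) \le \sum_{i=1}^{\infty} \rho(R_i) \\
&\le \sum_{i=1}^{\infty} \left((\mu \times \nu)(S_i) + \frac{\varepsilon}{2^i}\right) \\
&= \left(\sum_{i=1}^{\infty} (\mu \times \nu)(S_i)\right) + \varepsilon.
\end{aligned}$$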
ρ (P ∩ (A × B)) + ρ (P \ (A × B)) = ρ (P ) .
while
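$$P \setminus (A \times B) = \bigcup_{i=1}^{\infty} (A_i' \setminus A) \times B_i' \cup \bigcup_{i=1}^{\infty} (A \cap A_i') \times (B_i' \setminus B).$$
$$\begin{aligned}
&\rho(P \cap (A \times B)) + \rho(P \setminus (A \times B)) = \\
&\int\int \sum_{i=1}^{\infty} X_{A \cap A_i'}(x)\,X_{B \cap B_i'}(y)\,d\mu\,d\nu + \int\int \sum_{i=1}^{\infty} X_{A_i' \setminus A}(x)\,X_{B_i'}(y)\,d\mu\,d\nu \\
&+ \int\int \sum_{i=1}^{\infty} X_{A \cap A_i'}(x)\,X_{B_i' \setminus B}(y)\,d\mu\,d\nu
\end{aligned}$$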
9.7. PRODUCT MEASURES 239
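$$\begin{aligned}
&= \sum_{i=1}^{\infty} \left[\mu(A \cap A_i')\,\nu(B \cap B_i') + \mu(A_i' \setminus A)\,\nu(B_i') + \mu(A \cap A_i')\,\nu(B_i' \setminus B)\right] \\
&= \sum_{i=1}^{\infty} \left[\mu(A \cap A_i')\,\nu(B_i') + \mu(A_i' \setminus A)\,\nu(B_i')\right] = \sum_{i=1}^{\infty} \mu(A_i')\,\nu(B_i') = \rho(P).
\end{aligned}$$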
(µ × ν) (S) + ε > ρ (P ) .
Lemma 9.48 Let R1 be defined as the set of all countable intersections of sets of
R. Then if S ⊆ X × Y, there exists R ∈ R1 for which it makes sense to write ρ (R)
because 9.21 and 9.22 hold such that
$$P \equiv \cap_{i=1}^{\infty} Q_i \supseteq S.$$
x → XP (x, y) is µ measurable
because this function is the pointwise limit of functions for which this is so. It
remains to consider whether $y \to \int X_P(x, y)\,d\mu$ is ν measurable. First observe
$Q_n \supseteq Q_{n+1}$, $X_{Q_i} \le X_{P_i}$, and
$$\rho(Q_1) = \rho(P_1) = \int\int X_{P_1}(x, y)\,d\mu\,d\nu < \infty. \tag{9.31}$$
and so
$$y \to X_{N^C}(y) \int X_P(x, y)\,d\mu$$
The sets of R1 are µ × ν measurable because these sets are countable intersections
of countable unions of rectangles and Lemma 9.47 verifies the rectangles are
µ × ν measurable. This proves the Lemma.
The following theorem is the main result.
The function,
$$y \to \int X_E(x, y)\,d\mu$$
is ν measurable and
$$(\mu \times \nu)(E) = \int\int X_E(x, y)\,d\mu\,d\nu.$$
Similarly,
$$(\mu \times \nu)(E) = \int\int X_E(x, y)\,d\nu\,d\mu.$$
ρ (R) = (µ × ν) (E) , R ⊇ E.
(µ × ν) (R \ E) = 0.
ρ (P ) = (µ × ν) (R \ E) = 0.
Thus
$$\int\int X_P(x, y)\,d\mu\,d\nu = 0. \tag{9.32}$$
is µ measurable and
$$\int X_{N^C}(y)\,X_{R \setminus E}(x, y)\,d\mu = 0. \tag{9.33}$$
Now also
The right side of this equation equals a ν measurable function and so the left side
which equals it is also a ν measurable function. It follows from completeness of ν
that $y \to \int X_E(x, y)\,d\mu$ is ν measurable because for y outside of a set of ν measure
zero, N it equals $\int X_R(x, y)\,d\mu$. Therefore,
$$\begin{aligned}
\int\int X_E(x, y)\,d\mu\,d\nu &= \int\int X_{N^C}(y)\,X_E(x, y)\,d\mu\,d\nu \\
&= \int\int X_{N^C}(y)\,X_R(x, y)\,d\mu\,d\nu \\
&= \int\int X_R(x, y)\,d\mu\,d\nu \\
&= \rho(R) = (\mu \times \nu)(E).
\end{aligned}$$
In all the above there would be no change in writing dνdµ instead of dµdν. The
same result would be obtained. This proves the theorem.
Now let f : X × Y → [0, ∞] be µ × ν measurable and
$$\int f\,d(\mu \times \nu) < \infty. \tag{9.35}$$
Let $s(x, y) \equiv \sum_{i=1}^{m} c_i X_{E_i}(x, y)$ be a nonnegative simple function with $c_i$ being the
nonzero values of s and suppose
$$0 \le s \le f.$$
In which
$$\int s\,d\mu = \int X_{N^C}(y)\,s\,d\mu$$
for N a set of ν measure zero such that $y \to \int X_{N^C}(y)\,s\,d\mu$ is ν measurable. This
follows because 9.35 implies $(\mu \times \nu)(E_i) < \infty$. Now let $s_n \uparrow f$ where $s_n$ is a nonnegative
simple function and
$$\int s_n\,d(\mu \times \nu) = \int\int X_{N_n^C}(y)\,s_n(x, y)\,d\mu\,d\nu$$
where
$$y \to \int X_{N_n^C}(y)\,s_n(x, y)\,d\mu$$
Theorem 9.50 (Fubini) Let (X, S, µ) and (Y, T , ν) be complete measure spaces
and let
$$(\mu \times \nu)(E) \equiv \inf\left\{\int\int X_R(x, y)\,d\mu\,d\nu : E \subseteq R \in \mathcal{R}\right\}$$
then
$$\int_{X \times Y} f\,d(\mu \times \nu) = \int_Y \int_X f\,d\mu\,d\nu,$$
where the iterated integral on the right makes sense because for ν a.e. y, x → f (x, y)
is µ measurable and $y \to \int f(x, y)\,d\mu$ is ν measurable. Similarly,
$$\int_{X \times Y} f\,d(\mu \times \nu) = \int_X \int_Y f\,d\nu\,d\mu.$$
In the case where (X, S, µ) and (Y, T , ν) are both σ finite, it is not necessary to
assume 9.36.
Corollary 9.51 (Fubini) Let (X, S, µ) and (Y, T , ν) be complete measure spaces
such that (X, S, µ) and (Y, T , ν) are both σ finite and let
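\[ (\mu\times\nu)(E) \equiv \inf\left\{ \int\!\!\int \mathcal{X}_R(x,y)\,d\mu\,d\nu : E \subseteq R \in \mathcal{R} \right\} \]
where the iterated integral on the right makes sense because for ν a.e. y, x → f (x, y)
is µ measurable and $y \to \int f(x,y)\,d\mu$ is ν measurable. Similarly,
\[ \int_{X\times Y} f\,d(\mu\times\nu) = \int_X\int_Y f\,d\nu\,d\mu. \]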
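Proof: Let $\cup_{n=1}^\infty X_n = X$ and $\cup_{n=1}^\infty Y_n = Y$ where $X_n \in \mathcal{S}$, $Y_n \in \mathcal{T}$, $X_n \subseteq X_{n+1}$,
$Y_n \subseteq Y_{n+1}$ for all n and $\mu(X_n) < \infty$, $\nu(Y_n) < \infty$. From Theorem 9.50
applied to $X_n$, $Y_n$ and $f_m \equiv \min(f, m)$,
\[ \int_{X_n\times Y_n} f_m\,d(\mu\times\nu) = \int_{Y_n}\int_{X_n} f_m\,d\mu\,d\nu \]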
Then use the monotone convergence theorem again letting n → ∞ to obtain the
desired conclusion. The argument for the other order of integration is similar.
If µ and ν are σ finite, then if f is µ × ν measurable having complex values and
either $\int\!\int |f|\,d\mu\,d\nu < \infty$ or $\int\!\int |f|\,d\nu\,d\mu < \infty$, then $\int |f|\,d(\mu\times\nu) < \infty$ so
$f \in L^1(X\times Y)$.
Proof: Without loss of generality, it can be assumed that f has real values.
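Then
\[ f = \frac{|f| + f - (|f| - f)}{2} \]
and both $f^+ \equiv \frac{|f|+f}{2}$ and $f^- \equiv \frac{|f|-f}{2}$ are nonnegative and are no larger than |f|.
Therefore, $\int g\,d(\mu\times\nu) < \infty$ for $g = f^+$ and $g = f^-$ so the above theorem applies and
\begin{align*}
\int f\,d(\mu\times\nu) &\equiv \int f^+\,d(\mu\times\nu) - \int f^-\,d(\mu\times\nu) \\
&= \int\!\!\int f^+\,d\mu\,d\nu - \int\!\!\int f^-\,d\mu\,d\nu \\
&= \int\!\!\int f\,d\mu\,d\nu.
\end{align*}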
sXRn ≤ |f | XRn
Lemma 9.54 Suppose R and E are subsets of P(Z)3 such that E is defined as the
set of all finite disjoint unions of sets of R. Suppose also
∅, Z ∈ R
A ∩ B ∈ R whenever A, B ∈ R,
A \ B ∈ E whenever A, B ∈ R.
Then E is an algebra of sets of Z.
\[ E_1 = \cup_{i=1}^m R_i, \quad E_2 = \cup_{j=1}^n R_j \]
where the $R_i$ are disjoint sets in R and the $R_j$ are disjoint sets in R. Then
\[ E_1 \cap E_2 = \cup_{i=1}^m \cup_{j=1}^n R_i \cap R_j \]
which is clearly an element of E because no two of the sets in the union can intersect
and by assumption they are all in R. Thus by induction, finite intersections of sets
of E are in E. Consider the difference of two elements of E next.
If $E = \cup_{i=1}^n R_i \in \mathcal{E}$,
E1 \ E2 = E1 ∩ E2C ∈ E
E1 ∪ E2 = (E1 \ E2 ) ∪ E2 ∈ E
because E1 \ E2 consists of a finite disjoint union of sets of R and these sets must
be disjoint from the sets of R whose union yields E2 because (E1 \ E2 ) ∩ E2 = ∅.
This proves the lemma.
The following corollary is particularly helpful in verifying the conditions of the
above lemma.
R ≡ {R1 × R2 : Ri ∈ Ri }
and
E ≡ { finite disjoint unions of sets of R}.
Consequently, E is an algebra of sets.
3 Set of all subsets of Z
9.8. ALTERNATIVE TREATMENT OF PRODUCT MEASURE 247
by assumption.
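\[ A\times B \setminus (C\times D) = A\times \overbrace{(B\setminus D)}^{\in \mathcal{E}_2} \;\cup\; \overbrace{(A\setminus C)}^{\in \mathcal{E}_1} \times \overbrace{(D\cap B)}^{\in \mathcal{R}_2} \]
\[ = (A\times Q) \cup (P\times R) \]
where $Q \in \mathcal{E}_2$, $P \in \mathcal{E}_1$, and $R \in \mathcal{R}_2$.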
Proof: Consider all monotone classes which contain A, and take their inter-
section. The result is still a monotone class which contains A and is therefore the
smallest monotone class containing A. Therefore, assume without loss of general-
ity that M is the smallest monotone class containing A because if it is shown the
smallest monotone class containing A contains σ (A), then the given monotone class
does also. To avoid more notation, let M denote this smallest monotone class.
The plan is to show M is a σ-algebra. It will then follow M ⊇ σ(A) because
σ (A) is defined as the intersection of all σ algebras which contain A. For A ∈ A,
define
MA ≡ {B ∈ M such that A ∪ B ∈ M}.
Clearly MA is a monotone class containing A. Hence MA ⊇ M because M is
the smallest such monotone class. But by construction, MA ⊆ M. Therefore,
248 THE CONSTRUCTION OF MEASURES
Example 9.59 It follows from Lemma 9.54 or more easily from Corollary 9.55
that the elementary sets form an algebra.
Ex = {y ∈ Y : (x, y) ∈ E},
E y = {x ∈ X : (x, y) ∈ E}.
Proof: Let
\[ (\cup_{i=1}^\infty E_i)_x = \cup_{i=1}^\infty (E_i)_x \in \mathcal{F}. \]
Similarly,
\[ (\cup_{i=1}^\infty E_i)^y = \cup_{i=1}^\infty E_i^y \in \mathcal{S}. \]
(A × B) \ (A0 × B0 ) = (A \ A0 ) × B ∪ (A ∩ A0 ) × (B \ B0 ),
an elementary set.
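The set identity above can be checked mechanically on small finite sets. The following is a minimal sketch, assuming hypothetical finite sets `A`, `B`, `A0`, `B0` chosen only for illustration; it confirms the decomposition of a difference of rectangles into two disjoint rectangles.

```python
from itertools import product

def rect(A, B):
    """Return the rectangle A x B as a set of ordered pairs."""
    return set(product(A, B))

# Hypothetical finite sets, chosen only to exercise the identity.
A, B = {1, 2, 3}, {"a", "b", "c"}
A0, B0 = {2, 3, 4}, {"b", "d"}

left = rect(A, B) - rect(A0, B0)          # (A x B) \ (A0 x B0)
piece1 = rect(A - A0, B)                  # (A \ A0) x B
piece2 = rect(A & A0, B - B0)             # (A n A0) x (B \ B0)

identity_holds = left == (piece1 | piece2)
pieces_disjoint = piece1.isdisjoint(piece2)
print(identity_holds, pieces_disjoint)
```

The two pieces are disjoint because their first coordinates lie in A \ A0 and A ∩ A0 respectively, which is why the difference is an elementary set.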
Theorem 9.62 If (X, S, µ) and (Y, F, λ) are both finite measure spaces (µ(X),
λ(Y ) < ∞), then for every E ∈ S × F,
a.) $x \to \lambda(E_x)$ is µ measurable, $y \to \mu(E^y)$ is λ measurable
b.) $\int_X \lambda(E_x)\,d\mu = \int_Y \mu(E^y)\,d\lambda$.
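For finite sets carrying point weights, the sections $E_x$, $E^y$ and the equality in b.) can be computed directly. The sketch below assumes hypothetical weights `mu`, `lam` and a hypothetical set `E`; it only illustrates the statement in the simplest finite setting and is not part of the proof.

```python
from fractions import Fraction

# Hypothetical finite measure spaces given by point weights.
X = [0, 1, 2]
Y = ["a", "b"]
mu = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
lam = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
E = {(0, "a"), (1, "a"), (1, "b"), (2, "b")}

def section_x(E, x):
    """E_x = {y : (x, y) in E}."""
    return {y for (u, y) in E if u == x}

def section_y(E, y):
    """E^y = {x : (x, y) in E}."""
    return {x for (x, v) in E if v == y}

def measure(weights, S):
    """Measure of a finite set S under the given point weights."""
    return sum((weights[p] for p in S), Fraction(0))

# Integral over X of lambda(E_x) d mu, and integral over Y of mu(E^y) d lambda.
int_dmu = sum((measure(lam, section_x(E, x)) * mu[x] for x in X), Fraction(0))
int_dlam = sum((measure(mu, section_y(E, y)) * lam[y] for y in Y), Fraction(0))
print(int_dmu, int_dlam)
```

Both quantities are the total weight $\sum_{(x,y)\in E} \mu(\{x\})\lambda(\{y\})$, grouped first by x and then by y.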
Proof: Let
Since µ and λ are both finite, the monotone convergence and dominated convergence
theorems imply M is a monotone class.
Next I will argue M contains the elementary sets. Let
\[ E = \cup_{i=1}^n A_i \times B_i \]
Similarly,
\[ \int_Y \mu(E^y)\,d\lambda = \sum_{i=1}^n \mu(A_i)\lambda(B_i) \]
Theorem 9.63 If (X, S, µ) and (Y, F, λ) are both σ finite measure spaces, then for
every E ∈ S × F,
a.) $x \to \lambda(E_x)$ is µ measurable, $y \to \mu(E^y)$ is λ measurable.
b.) $\int_X \lambda(E_x)\,d\mu = \int_Y \mu(E^y)\,d\lambda$.
Proof: Let $X = \cup_{n=1}^\infty X_n$, $Y = \cup_{n=1}^\infty Y_n$ where,
Let
Sn = {A ∩ Xn : A ∈ S}, Fn = {B ∩ Yn : B ∈ F}.
Thus (Xn , Sn , µ) and (Yn , Fn , λ) are both finite measure spaces.
Claim: If E ∈ S × F, then E ∩ (Xn × Yn ) ∈ Sn × Fn .
Proof: Let
Mn = {E ∈ S × F : E ∩ (Xn × Yn ) ∈ Sn × Fn } .
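\[ (E \cap (X_n\times Y_n))_x = \emptyset \]
if $x \notin X_n$ and a similar observation holds for the second integrand in 9.38 if $y \notin Y_n$.
Therefore,
\begin{align*}
\int_X \lambda((E\cap(X_n\times Y_n))_x)\,d\mu &= \int_{X_n} \lambda((E\cap(X_n\times Y_n))_x)\,d\mu \\
&= \int_{Y_n} \mu((E\cap(X_n\times Y_n))^y)\,d\lambda \\
&= \int_Y \mu((E\cap(X_n\times Y_n))^y)\,d\lambda.
\end{align*}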
Then letting n → ∞, the monotone convergence theorem implies b.) and the
measurability assertions of a.) are valid because
Proof: The first assertion about the measure of a measurable rectangle was
established above. Now suppose $\{E_i\}_{i=1}^\infty$ is a disjoint collection of sets of S × F.
Then using the monotone convergence theorem along with the observation that
$(E_i)_x \cap (E_j)_x = \emptyset$,
\begin{align*}
(\mu\times\lambda)(\cup_{i=1}^\infty E_i) &= \int_X \lambda((\cup_{i=1}^\infty E_i)_x)\,d\mu \\
&= \int_X \lambda(\cup_{i=1}^\infty (E_i)_x)\,d\mu = \int_X \sum_{i=1}^\infty \lambda((E_i)_x)\,d\mu \\
&= \sum_{i=1}^\infty \int_X \lambda((E_i)_x)\,d\mu \\
&= \sum_{i=1}^\infty (\mu\times\lambda)(E_i)
\end{align*}
Thus from Definition 9.64, 9.39 holds if f = XE . It follows that 9.39 holds for
every nonnegative simple function. By Theorem 8.27 on Page 190, there exists an
increasing sequence, {fn }, of simple functions converging pointwise to f . Then
\[ \int_Y f(x,y)\,d\lambda = \lim_{n\to\infty} \int_Y f_n(x,y)\,d\lambda, \]
\[ \int_X f(x,y)\,d\mu = \lim_{n\to\infty} \int_X f_n(x,y)\,d\mu. \]
This follows from the monotone convergence theorem. Since
\[ x \to \int_Y f_n(x,y)\,d\lambda \]
is measurable with respect to S, it follows that $x \to \int_Y f(x,y)\,d\lambda$ is also measurable
with respect to S. A similar conclusion can be drawn about $y \to \int_X f(x,y)\,d\mu$. Thus
the two iterated integrals make sense. Since 9.39 holds for fn, another application
of the Monotone Convergence theorem shows 9.39 holds for f . This proves the
theorem.
9.9. COMPLETION OF MEASURES 253
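Corollary 9.67 Let f : X × Y → C be S × F measurable. Suppose either $\int_X\int_Y |f|\,d\lambda\,d\mu$
or $\int_Y\int_X |f|\,d\mu\,d\lambda < \infty$. Then $f \in L^1(X\times Y, \mu\times\lambda)$ and
\[ \int_{X\times Y} f\,d(\mu\times\lambda) = \int_X\int_Y f\,d\lambda\,d\mu = \int_Y\int_X f\,d\mu\,d\lambda \tag{9.40} \]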
Proof: Suppose first that f is real valued. Apply Theorem 9.66 to $f^+$ and
$f^-$. 9.40 follows from observing that $f = f^+ - f^-$; and that all integrals are finite.
If f is complex valued, consider real and imaginary parts. This proves the corollary.
Suppose f is product measurable. From the above discussion, and breaking f
down into a sum of positive and negative parts of real and imaginary parts and then
using Theorem 8.27 on Page 190 on approximation by simple functions, it follows
that whenever f is S × F measurable, x → f (x, y) is µ measurable, y → f (x, y) is
λ measurable.
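Theorem 9.68 Let (Ω, F, µ) be a σ finite measure space. Then there exists a
unique measure space, $(\Omega, \overline{\mathcal{F}}, \overline{\mu})$ satisfying
1. $(\Omega, \overline{\mathcal{F}}, \overline{\mu})$ is a complete measure space.
2. $\overline{\mu} = \mu$ on F
3. $\overline{\mathcal{F}} \supseteq \mathcal{F}$
\[ \mu(G\setminus F) = \overline{\mu}(G\setminus F) = 0 \tag{9.41} \]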
Proof: First consider the claim about uniqueness. Suppose (Ω, F1 , ν 1 ) and
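$(\Omega, \mathcal{F}_2, \nu_2)$ both work and let $E \in \mathcal{F}_1$. Also let $\mu(\Omega_n) < \infty$, $\cdots \Omega_n \subseteq \Omega_{n+1} \cdots$, and
$\cup_{n=1}^\infty \Omega_n = \Omega$. Define $E_n \equiv E \cap \Omega_n$. Then pick $G_n \supseteq E_n \supseteq F_n$ such that $\mu(G_n) =$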
This verifies 2.
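Next consider 3. Let E ∈ F and let S be a set. I must show
\[ \overline{\mu}(S) \ge \overline{\mu}(S\setminus E) + \overline{\mu}(S\cap E). \]
If $\overline{\mu}(S) = \infty$ there is nothing to show. Therefore, suppose $\overline{\mu}(S) < \infty$. Then from
the definition of $\overline{\mu}$ there exists G ⊇ S such that G ∈ F and $\mu(G) = \overline{\mu}(S)$. Then
from the definition of $\overline{\mu}$,
\begin{align*}
\overline{\mu}(S) &\le \overline{\mu}(S\setminus E) + \overline{\mu}(S\cap E) \\
&\le \mu(G\setminus E) + \mu(G\cap E) \\
&= \mu(G) = \overline{\mu}(S)
\end{align*}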
This verifies 3.
Claim 4 comes by the definition of µ as used above. The only other case is when
µ (S) = ∞. However, in this case, you can let G = Ω.
It only remains to verify 5. Let the Ωn be as described above and let E ∈ F
such that E ⊆ Ωn . By 4 there exists H ∈ F such that H ⊆ Ωn , H ⊇ Ωn \ E, and
be one of these simple functions, it follows from Theorem 9.68 there exist sets,
$F_k^n \in \mathcal{F}$ such that $F_k^n \subseteq E_k^n$ and $\mu(F_k^n) = \overline{\mu}(E_k^n)$. Then let
\[ t_n(\omega) \equiv \sum_{k=1}^{m_n} c_k^n \mathcal{X}_{F_k^n}(\omega). \]
µ × ν (A × B) = µ (A) λ (B)
Definition 9.70 Let (X, F, µ) and (Y, S, ν) be two measure spaces. A measurable
rectangle is a set of the form A × B where A ∈ F and B ∈ S.
A × B ∩ A0 × B 0 = (A ∩ A0 ) × (B ∩ B 0 ) .
The following is the fundamental lemma which shows these π systems are useful.
1. K ⊆ G
2. If A ∈ G, then AC ∈ G
3. If $\{A_i\}_{i=1}^\infty$ is a sequence of disjoint sets from G then $\cup_{i=1}^\infty A_i \in \mathcal{G}$.
H ≡ {G : 1 - 3 all hold}
then ∩H yields a collection of sets which also satisfies 1 - 3. Therefore, I will assume
in the argument that G is the smallest collection satisfying 1 - 3. Let A ∈ K and
define
GA ≡ {B ∈ G : A ∩ B ∈ G} .
I want to show GA satisfies 1 - 3 because then it must equal G since G is the smallest
collection of subsets of Ω which satisfies 1 - 3. This will give the conclusion that for
A ∈ K and B ∈ G, A ∩ B ∈ G. This information will then be used to show that if
\[ A \cap \cup_{i=1}^\infty B_i = \cup_{i=1}^\infty A \cap B_i \in \mathcal{G} \]
GB ≡ {A ∈ G : A ∩ B ∈ G} .
because finite intersections of sets of G are in G. Since the $A_i'$ are disjoint, it follows
\[ \cup_{i=1}^\infty A_i = \cup_{i=1}^\infty A_i' \in \mathcal{G} \]
where in the above, part of the requirement is for all integrals to make sense.
Then K ⊆ G. This is obvious.
9.10. ANOTHER VERSION OF PRODUCT MEASURES 259
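\begin{align*}
&= \sum_{i=1}^\infty \int_Y\int_X \mathcal{X}_{A_i}\,d\mu\,d\nu \\
&= \sum_{i=1}^\infty \int_X\int_Y \mathcal{X}_{A_i}\,d\nu\,d\mu \\
&= \int_X \sum_{i=1}^\infty \int_Y \mathcal{X}_{A_i}\,d\nu\,d\mu \\
&= \int_X\int_Y \sum_{i=1}^\infty \mathcal{X}_{A_i}\,d\nu\,d\mu \\
&= \int_X\int_Y \mathcal{X}_{\cup_{i=1}^\infty A_i}\,d\nu\,d\mu, \tag{9.45}
\end{align*}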
the interchanges between the summation and the integral depending on the mono-
tone convergence theorem. Thus G is closed with respect to countable disjoint
unions.
From Lemma 9.72, G ⊇ σ (K) . Also the computation in 9.45 implies that on
σ (K) one can define a measure, denoted by µ × ν and that for every E ∈ σ (K) ,
\[ (\mu\times\nu)(E) = \int_Y\int_X \mathcal{X}_E\,d\mu\,d\nu = \int_X\int_Y \mathcal{X}_E\,d\nu\,d\mu. \tag{9.46} \]
Now here is Fubini’s theorem.
Theorem 9.73 Let f : X × Y → [0, ∞] be measurable with respect to the σ algebra,
σ (K) just defined and let µ × ν be the product measure of 9.46 where µ and ν are
finite measures on (X, F) and (Y, S) respectively. Then
\[ \int_{X\times Y} f\,d(\mu\times\nu) = \int_Y\int_X f\,d\mu\,d\nu = \int_X\int_Y f\,d\nu\,d\mu. \]
Proof: Since the measures are σ finite, there exist increasing sequences of sets,
$\{X_n\}$ and $\{Y_n\}$ such that $\mu(X_n) < \infty$ and $\nu(Y_n) < \infty$. Then µ and ν restricted
to $X_n$ and $Y_n$ respectively are finite. Then from Theorem 9.73,
\[ \int_{Y_n}\int_{X_n} f\,d\mu\,d\nu = \int_{X_n}\int_{Y_n} f\,d\nu\,d\mu \]
Then just as in the proof of Theorem 9.73, the conclusion of this theorem is obtained.
This proves the theorem.
It is also useful to note that all the above holds for $\prod_{i=1}^n X_i$ in place of X × Y.
You would simply modify the definition of G in 9.44 including all permutations for
the iterated integrals and for K you would use sets of the form $\prod_{i=1}^n A_i$ where $A_i$
is measurable. Everything goes through exactly as above. Thus the following is
obtained.
Theorem 9.75 Let $\{(X_i, \mathcal{F}_i, \mu_i)\}_{i=1}^n$ be σ finite measure spaces and let $\prod_{i=1}^n \mathcal{F}_i$ denote
the smallest σ algebra which contains the measurable boxes of the form $\prod_{i=1}^n A_i$
where $A_i \in \mathcal{F}_i$. Then there exists a measure, λ defined on $\prod_{i=1}^n \mathcal{F}_i$ such that if
$f : \prod_{i=1}^n X_i \to [0, \infty]$ is $\prod_{i=1}^n \mathcal{F}_i$ measurable, and $(i_1, \cdots, i_n)$ is any permutation of
$(1, \cdots, n)$, then
\[ \int f\,d\lambda = \int_{X_{i_n}} \cdots \int_{X_{i_1}} f\,d\mu_{i_1} \cdots d\mu_{i_n} \]
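The permutation statement can be illustrated in the simplest possible setting, where each space is a finite set with point weights and every integral is a finite weighted sum. The following is a sketch under that assumption; the weights and the function `f` are hypothetical and serve only to make the example concrete.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical finite measure spaces: point weights on small index sets.
weights = [
    {0: Fraction(1, 2), 1: Fraction(3, 2)},
    {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)},
    {0: Fraction(2), 1: Fraction(1, 5)},
]

def f(x):
    """A nonnegative function of the point x = (x1, x2, x3)."""
    return Fraction(x[0] + 2 * x[1] * x[2] + 1)

def iterated(order, point=None):
    """Iterated integral: order[0] is integrated innermost, order[-1] outermost."""
    point = dict(point or {})
    if not order:
        return f(tuple(point[i] for i in range(len(weights))))
    *inner, outer = order
    total = Fraction(0)
    for value, w in weights[outer].items():
        point[outer] = value
        total += w * iterated(inner, point)
    return total

# Integral against the product measure: sum of f times the product of weights.
direct = sum(
    (f(x) * weights[0][x[0]] * weights[1][x[1]] * weights[2][x[2]]
     for x in product(*(w.keys() for w in weights))),
    Fraction(0),
)
all_orders_agree = all(iterated(p) == direct for p in permutations(range(3)))
print(direct, all_orders_agree)
```

In this finite case the equality is just the commutativity of finite sums; the content of the theorem is that it persists for σ finite measure spaces and nonnegative measurable f.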
Proof: This follows immediately from Theorem 9.75 and Theorem 9.69. By the
second theorem, there exists a function $f_1 \ge f$ such that $f_1 = f$ for all $(x_1, \cdots, x_n) \notin N$,
a set of $\prod_{i=1}^n \mathcal{F}_i$ having measure zero. Then by Theorem 9.68 and Theorem 9.75
\[ \int f\,d\lambda = \int f_1\,d\lambda = \int_{X_{i_n}} \cdots \int_{X_{i_1}} f_1\,d\mu_{i_1} \cdots d\mu_{i_n}. \]
This theorem is often referred to as Fubini’s theorem. The next theorem is also
called this.
Corollary 9.77 Suppose $f \in L^1\left(\prod_{i=1}^n X_i, \prod_{i=1}^n \mathcal{F}_i, \mu_1\times\cdots\times\mu_n\right)$ where each $X_i$
is a σ finite measure space. Then if $(i_1, \cdots, i_n)$ is any permutation of $(1, \cdots, n)$, it
follows
\[ \int f\,d(\mu_1\times\cdots\times\mu_n) = \int_{X_{i_n}} \cdots \int_{X_{i_1}} f\,d\mu_{i_1} \cdots d\mu_{i_n}. \]
Proof: Just apply Theorem 9.76 to the positive and negative parts of the real
and imaginary parts of f. This proves the theorem.
Here is another easy corollary.
Corollary 9.78 Suppose in the situation of Corollary 9.77, $f = f_1$ off N, a set of
$\prod_{i=1}^n \mathcal{F}_i$ having $\mu_1\times\cdots\times\mu_n$ measure zero and that $f_1$ is a complex valued function
measurable with respect to $\prod_{i=1}^n \mathcal{F}_i$. Suppose also that for some permutation of
$(1, 2, \cdots, n)$, $(j_1, \cdots, j_n)$
\[ \int_{X_{j_n}} \cdots \int_{X_{j_1}} |f_1|\,d\mu_{j_1} \cdots d\mu_{j_n} < \infty. \]
Then
\[ f \in L^1\left(\prod_{i=1}^n X_i, \prod_{i=1}^n \mathcal{F}_i, \mu_1\times\cdots\times\mu_n\right) \]
and the conclusion of Corollary 9.77 holds.
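Proof: Since $|f_1|$ is $\prod_{i=1}^n \mathcal{F}_i$ measurable, it follows from Theorem 9.75 that
\begin{align*}
\infty &> \int_{X_{j_n}} \cdots \int_{X_{j_1}} |f_1|\,d\mu_{j_1} \cdots d\mu_{j_n} \\
&= \int |f_1|\,d(\mu_1\times\cdots\times\mu_n) \\
&= \int |f_1|\,d(\mu_1\times\cdots\times\mu_n) \\
&= \int |f|\,d(\mu_1\times\cdots\times\mu_n).
\end{align*}
Thus $f \in L^1\left(\prod_{i=1}^n X_i, \prod_{i=1}^n \mathcal{F}_i, \mu_1\times\cdots\times\mu_n\right)$ as claimed and the rest follows from
Corollary 9.77. This proves the corollary.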
The following lemma is also useful.
Lemma 9.79 Let (X, F, µ) and (Y, S, ν) be σ finite complete measure spaces and
suppose f ≥ 0 is F × S measurable. Then for a.e. x,
y → f (x, y)
is S measurable. Similarly for a.e. y,
x → f (x, y)
is F measurable.
9.11. DISTURBING EXAMPLES 263
Note this is actually a finite sum for each such (x, y) . Therefore, this is a continuous
function on [0, 1) × [0, 1). Now for a fixed y,
\[ \int_0^1 f(x,y)\,dx = \sum_{n=1}^\infty g_n(y) \int_0^1 (g_n(x) - g_{n+1}(x))\,dx = 0 \]
showing that $\int_0^1\int_0^1 f(x,y)\,dx\,dy = \int_0^1 0\,dy = 0$. Next fix x.
\[ \int_0^1 f(x,y)\,dy = \sum_{n=1}^\infty (g_n(x) - g_{n+1}(x)) \int_0^1 g_n(y)\,dy = g_1(x). \]
Hence $\int_0^1\int_0^1 f(x,y)\,dy\,dx = \int_0^1 g_1(x)\,dx = 1$. The iterated integrals are not equal.
Note the function f is not nonnegative even though it is measurable. In addition,
neither $\int_0^1\int_0^1 |f(x,y)|\,dx\,dy$ nor $\int_0^1\int_0^1 |f(x,y)|\,dy\,dx$ is finite and so you can't apply
Corollary 9.52. The problem here is the function is not nonnegative and is not
absolutely integrable.
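A discrete analogue of this phenomenon is easy to compute. The sketch below is not the function f of the example; it assumes counting measure on the nonnegative integers in both variables and uses the hypothetical array `a(m, n)` with +1 on the diagonal and -1 immediately to its right. Each row and column has at most two nonzero entries, so every inner sum is computed exactly.

```python
def a(m, n):
    """Entry of the infinite array: +1 on the diagonal, -1 just right of it."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

def row_sum(m):
    # Row m is supported on columns m and m + 1 only.
    return a(m, m) + a(m, m + 1)

def col_sum(n):
    # Column n is supported on rows n - 1 and n only (row -1 does not exist).
    return a(n, n) + (a(n - 1, n) if n >= 1 else 0)

K = 50  # number of outer terms; all outer terms beyond the first are zero
rows_then_cols = sum(row_sum(m) for m in range(K))  # inner sum over n, outer over m
cols_then_rows = sum(col_sum(n) for n in range(K))  # inner sum over m, outer over n
print(rows_then_cols, cols_then_rows)
```

The two iterated sums are 0 and 1. As in the example, the array takes both signs and the sum of its absolute values is infinite, so neither form of Fubini's theorem applies.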
Example 9.81 This time let µ = m, Lebesgue measure on [0, 1] and let ν be count-
ing measure on [0, 1] , in this case, the σ algebra is P ([0, 1]) . Let l denote the line
segment in [0, 1] × [0, 1] which goes from (0, 0) to (1, 1). Thus l = (x, x) where
x ∈ [0, 1] . Consider the outer measure of l in m × ν. Let l ⊆ ∪k Ak × Bk where Ak
is Lebesgue measurable and $B_k$ is a subset of [0, 1]. Let $B \equiv \{k \in \mathbb{N} : \nu(B_k) = \infty\}$.
If $\cup_{k\in B} A_k$ has measure zero, then there are uncountably many points of [0, 1]
outside of $\cup_{k\in B} A_k$. For p one of these points, $(p, p) \in A_i \times B_i$ and $i \notin B$. Thus each
of these points is in $\cup_{i\notin B} B_i$, a countable set because these $B_i$ are each finite. But
this is a contradiction because there need to be uncountably many of these points as
just indicated. Thus $m(A_k) > 0$ for some $k \in B$ and so $m\times\nu(A_k\times B_k) = \infty$. It
follows $m\times\nu(l) = \infty$ and so l is $m\times\nu$ measurable. Thus $\int \mathcal{X}_l(x,y)\,d\,m\times\nu = \infty$
and so you cannot apply Fubini's theorem, Theorem 9.50. Since ν is not σ finite,
you cannot apply the corollary to this theorem either. Thus there is no contradiction
to the above theorems in the following observation.
\[ \int\!\!\int \mathcal{X}_l(x,y)\,d\nu\,dm = \int 1\,dm = 1, \quad \int\!\!\int \mathcal{X}_l(x,y)\,dm\,d\nu = \int 0\,d\nu = 0. \]
The problem here is that you have neither $\int f\,d\,m\times\nu < \infty$ nor σ finite measure
spaces.
The next example is far more exotic. It concerns the case where both iterated
integrals make perfect sense but are unequal. In 1877 Cantor conjectured that the
cardinality of the real numbers is the next size of infinity after countable infinity.
This hypothesis is called the continuum hypothesis and it has never been proved
or disproved4 . Assuming this continuum hypothesis will provide the basis for the
following example. It is due to Sierpinski.
Example 9.82 Let X be an uncountable set. It follows from the well ordering
theorem which says every set can be well ordered which is presented in the ap-
pendix that X can be well ordered. Let ω ∈ X be the first element of X which is
preceded by uncountably many points of X. Let Ω denote {x ∈ X : x < ω} . Then
Ω is uncountable but there is no smaller uncountable set. Thus by the continuum
hypothesis, there exists a one to one and onto mapping, j which maps [0, 1]
onto Ω. Thus, for x ∈ [0, 1], j (x) is preceded by countably many points. Let
$Q \equiv \{(x,y) \in [0,1]^2 : j(x) < j(y)\}$ and let $f(x,y) = \mathcal{X}_Q(x,y)$. Then
\[ \int_0^1 f(x,y)\,dy = 1, \quad \int_0^1 f(x,y)\,dx = 0 \]
In each case, the integrals make sense. In the first, for fixed x, f (x, y) = 1 for all
but countably many y so the function of y is Borel measurable. In the second where
4 In 1940 it was shown by Gödel that the continuum hypothesis cannot be disproved. In 1963 it
was shown by Cohen that the continuum hypothesis cannot be proved. These assertions are based
on the axiom of choice and the Zermelo-Fraenkel axioms of set theory. This topic is far outside the
scope of this book and this is only a hopefully interesting historical observation.
9.12 Exercises
1. Let Ω = N, the natural numbers and let d (p, q) = |p − q|, the usual distance
in R. Show that in (Ω, d) the closures of the balls are compact. Now let
$\Lambda f \equiv \sum_{k=1}^\infty f(k)$ whenever $f \in C_c(\Omega)$. Show this is a well defined positive
linear functional on the space $C_c(\Omega)$. Describe the measure of the Riesz
representation theorem which results from this positive linear functional. What
if Λ (f ) = f (1)? What measure would result from this functional? Which
functions are measurable?
2. Verify that µ defined in Lemma 9.7 is an outer measure.
3. Let F : R → R be increasing and right continuous. Let $\Lambda f \equiv \int f\,dF$ where
the integral is the Riemann Stieltjes integral of f . Show the measure µ from
the Riesz representation theorem satisfies
4. Let Ω be a metric space with the closed balls compact and suppose µ is a
measure defined on the Borel sets of Ω which is finite on compact sets. Show
there exists a unique Radon measure, $\overline{\mu}$, which equals µ on the Borel sets.
5. ↑ Random vectors are measurable functions, X, mapping a probability space,
(Ω, P, F) to Rn . Thus X (ω) ∈ Rn for each ω ∈ Ω and P is a probability
measure defined on the sets of F, a σ algebra of subsets of Ω. For E a Borel
set in Rn , define
\[ \mu(E) \equiv P\left(X^{-1}(E)\right) \equiv \text{probability that } X \in E. \]
Show this is a well defined measure on the Borel sets of $\mathbb{R}^n$ and use Problem 4
to obtain a Radon measure, $\lambda_X$ defined on a σ algebra of sets of $\mathbb{R}^n$ including
the Borel sets such that for E a Borel set, $\lambda_X(E)$ = Probability that (X ∈ E).
6. Suppose X and Y are metric spaces having compact closed balls. Show
(X × Y, dX×Y )
is also a metric space which has the closures of balls compact. Here
Let
A ≡ {E × F : E is a Borel set in X, F is a Borel set in Y } .
Show σ (A), the smallest σ algebra containing A contains the Borel sets. Hint:
Show every open set in a metric space which has closed balls compact can be
obtained as a countable union of compact sets. Next show this implies every
open set can be obtained as a countable union of open sets of the form U × V
where U is open in X and V is open in Y .
Lemma 10.2 Every open set in $\mathbb{R}^n$ is the countable disjoint union of half open
boxes of the form
\[ \prod_{i=1}^n (a_i, a_i + 2^{-k}] \]
where $a_i = l2^{-k}$ for some integers, l, k. The sides of these boxes are of equal length.
One could also have half open boxes of the form
\[ \prod_{i=1}^n [a_i, a_i + 2^{-k}) \]
Proof: Let
\[ \mathcal{C}_k = \{\text{All half open boxes } \prod_{i=1}^n (a_i, a_i + 2^{-k}] \text{ where} \]
268 LEBESGUE MEASURE
you will get the idea. Note that each box has diameter no larger than $2^{-k}\sqrt{n}$. This
is because if
\[ x, y \in \prod_{i=1}^n (a_i, a_i + 2^{-k}], \]
then $|x_i - y_i| \le 2^{-k}$. Therefore,
\[ |x - y| \le \left( \sum_{i=1}^n \left(2^{-k}\right)^2 \right)^{1/2} = 2^{-k}\sqrt{n}. \]
Let $\mathcal{B}_\infty = \cup_{i=1}^\infty \mathcal{B}_i$. In fact $\cup\mathcal{B}_\infty = U$. Clearly $\cup\mathcal{B}_\infty \subseteq U$ because every box of every
$\mathcal{B}_i$ is contained in U . If p ∈ U , let k be the smallest integer such that p is contained
in a box from $\mathcal{C}_k$ which is also a subset of U . Thus
\[ p \in \cup\mathcal{B}_k \subseteq \cup\mathcal{B}_\infty. \]
Hence $\mathcal{B}_\infty$ is the desired countable disjoint collection of half open boxes whose union
is U . The last assertion about the other type of half open rectangle is obvious. This
proves the lemma.
Now what does Lebesgue measure do to a rectangle, $\prod_{i=1}^n (a_i, b_i]$?
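Before turning to that question, the decomposition of Lemma 10.2 can be tried numerically. The sketch below takes U to be the open unit disk in the plane and collects half open dyadic squares down to a fixed level. It is a simplified variant of the construction in the proof: a square is accepted when its closure lies in U, which still exhausts U because U is open. The names `dyadic_boxes`, `contains` and the cutoff `max_level` are assumptions made for illustration.

```python
import math

def max_abs_sq(i):
    # Largest value of t^2 for t in the closed interval [i, i + 1] (integer i).
    return max(i * i, (i + 1) * (i + 1))

def min_abs_sq(i):
    # Smallest value of t^2 for t in [i, i + 1]; zero when the interval contains 0.
    return 0 if i in (-1, 0) else min(i * i, (i + 1) * (i + 1))

def dyadic_boxes(max_level):
    """Boxes (k, i, j) standing for (i/2^k, (i+1)/2^k] x (j/2^k, (j+1)/2^k]."""
    accepted = []
    stack = [(0, i, j) for i in (-1, 0) for j in (-1, 0)]
    while stack:
        k, i, j = stack.pop()
        scale = 4 ** k  # squared norms are compared after multiplying by 4^k
        if max_abs_sq(i) + max_abs_sq(j) < scale:
            accepted.append((k, i, j))  # closed box lies inside the open disk
        elif min_abs_sq(i) + min_abs_sq(j) < scale and k < max_level:
            # The box may meet the disk: split it into its four dyadic children.
            stack.extend((k + 1, 2 * i + di, 2 * j + dj)
                         for di in (0, 1) for dj in (0, 1))
    return accepted

def contains(box, x, y):
    """True when (x, y) lies in the half open box."""
    k, i, j = box
    s = 2.0 ** -k
    return i * s < x <= (i + 1) * s and j * s < y <= (j + 1) * s

boxes = dyadic_boxes(7)
area = sum(4.0 ** -k for k, _, _ in boxes)
print(len(boxes), area, math.pi)
```

The accepted boxes are disjoint because a box is never split after it is accepted, and their total area increases toward π as `max_level` grows.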
Lemma 10.3 Let $R = \prod_{i=1}^n [a_i, b_i]$, $R_0 = \prod_{i=1}^n (a_i, b_i)$. Then
\[ m_n(R_0) = m_n(R) = \prod_{i=1}^n (b_i - a_i). \]
for i = 1, · · ·, n and consider functions gik and fik having the following graphs.
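[Figure: graphs of $f_i^k$ and $g_i^k$, each rising to height 1. The graph of $f_i^k$ is labeled at $a_i$, $a_i + \frac{1}{k}$, $b_i - \frac{1}{k}$, $b_i$; the graph of $g_i^k$ is labeled at $a_i - \frac{1}{k}$, $a_i$, $b_i$, $b_i + \frac{1}{k}$.]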
Let
\[ g^k(x) = \prod_{i=1}^n g_i^k(x_i), \quad f^k(x) = \prod_{i=1}^n f_i^k(x_i). \]
10.1. BASIC PROPERTIES 269
\[ \ge \int f^k\,dm_n = \Lambda f^k \ge \prod_{i=1}^n (b_i - a_i - 2/k). \]
the last equality because of the first part of the lemma which implies mn (B (x, R)) =
mn (B (0, R)) . Therefore, mn (x + H) = mn (H) as claimed. If H is not bounded,
consider Hm ≡ B (0, m) ∩ H. Then mn (x + Hm ) = mn (Hm ) . Passing to the limit
as m → ∞ yields the result in general.
mn (E) = mn (x + E)
Proof: Suppose mn (E) < ∞. By regularity of the measure, there exist sets
G, H such that G is a countable intersection of open sets, H is a countable union
of compact sets, mn (G \ H) = 0, and G ⊇ E ⊇ H. Now mn (G) = mn (G + x) and
mn (H) = mn (H + x) which follows from Lemma 10.4 applied to the sets which
are either intersected to form G or unioned to form H. Now
and both x + H and x + G are measurable because they are either countable unions
or countable intersections of measurable sets. Furthermore,
mn (x + G \ x + H) = mn (x + G) − mn (x + H) = mn (G) − mn (H) = 0
mn (E) = mn (H) = mn (x + H) ≤ mn (x + E)
≤ mn (x + G) = mn (G) = mn (E) .
Corollary 10.6 Let D be an n × n diagonal matrix and let U be an open set. Then
Therefore,
\[ m_n(DU) = \sum_{i=1}^\infty m_n(DR_i) = \sum_{i=1}^\infty |\det(D)|\,m_n(R_i) = |\det(D)|\,m_n(U). \]
Lemma 10.10 Let ||·|| be a norm on Rn and let F be a collection of balls deter-
mined by this norm. Suppose
if B1 , B2 ∈ G then B1 ∩ B2 = ∅, (10.2)
G is maximal with respect to 10.1 and 10.2.
Note that if there is no ball of F which has radius larger than k then G = ∅.
Proof: Let H = {B ⊆ F such that 10.1 and 10.2 hold}. If there are no balls
with radius larger than k then H = ∅ and you let G = ∅. In the other case, H ≠ ∅
because there exists B(p, r) ∈ F with r > k. In this case, partially order H by set
inclusion and use the Hausdorff maximal principle (see the appendix on set theory)
to let C be a maximal chain in H. Clearly ∪C satisfies 10.1 and 10.2 because if B1
and B2 are two balls from ∪C then since C is a chain, it follows there is some
element of C, B such that both B1 and B2 are elements of B and B satisfies 10.1 and
10.2. If ∪C is not maximal with respect to these two properties, then C was not a
maximal chain because then there would exist B ⊋ ∪C, that is, B contains ∪C as a
proper subset and {C, B} would be a strictly larger chain in H. Let G = ∪C.
A ≡ ∪{B : B ∈ F}.
Suppose
∞ > M ≡ sup{r : B(p, r) ∈ F } > 0.
Then there exists G ⊆ F such that G consists of disjoint balls and
\[ A \subseteq \cup\{\widehat{B} : B \in \mathcal{G}\}. \]
\[ B(p,r) \in \mathcal{G}_1 \text{ implies } r > \frac{M}{2}, \tag{10.3} \]
\[ B_1, B_2 \in \mathcal{G}_1 \text{ implies } B_1 \cap B_2 = \emptyset, \tag{10.4} \]
$\mathcal{G}_1$ is maximal with respect to 10.3, and 10.4.
Suppose G1 , · · ·, Gm have been chosen, m ≥ 1. Let
Fm ≡ {B ∈ F : B ⊆ Rn \ ∪{G1 ∪ · · · ∪ Gm }}.
10.3. THE VITALI COVERING THEOREM (ELEMENTARY VERSION) 273
Thus G is a collection of disjoint balls in F. I must show $\{\widehat{B} : B \in \mathcal{G}\}$ covers A.
Let x ∈ B(p, r) ∈ F and let
\[ \frac{M}{2^m} < r \le \frac{M}{2^{m-1}}. \]
Then B (p, r) must intersect some set, $B(p_0, r_0) \in \mathcal{G}_1 \cup \cdots \cup \mathcal{G}_m$ since otherwise,
$\mathcal{G}_m$ would fail to be maximal. Then $r_0 > \frac{M}{2^m}$ because all balls in $\mathcal{G}_1 \cup \cdots \cup \mathcal{G}_m$ satisfy
this inequality.
[Figure: the ball $B(p_0, r_0)$ meeting the ball $B(p, r)$, with a point x of $B(p, r)$ marked.]
Then for x ∈ B (p, r) , the following chain of inequalities holds because r ≤ 2m−1
and r0 > 2Mm
|x − p0 | ≤ |x − p| + |p − p0 | ≤ r + r0 + r
2M 4M
≤ + r0 = m + r0 < 5r0 .
2m−1 2
b (p0 , r0 ) and this proves the theorem.
Thus B (p, r) ⊆ B
Proof: If no ball of F has radius larger than k, let G = ∅. Assume therefore, that
some balls have radius larger than k. Let $\mathcal{F} \equiv \{B_i\}_{i=1}^\infty$. Now let $B_{n_1}$ be the first ball
in the list which has radius greater than k. If every ball having radius larger than k
intersects this one, then stop. The maximal set is just $B_{n_1}$. Otherwise, let $B_{n_2}$ be
the next ball having radius larger than k which is disjoint from $B_{n_1}$. Continue this
way obtaining $\{B_{n_i}\}_{i=1}^\infty$, a finite or infinite sequence of disjoint balls having radius
larger than k. Then let $\mathcal{G} \equiv \{B_{n_i}\}$. To see that G is maximal with respect to 10.7
and 10.8, suppose B ∈ F , B has radius larger than k, and G ∪ {B} satisfies 10.7
and 10.8. Then at some point in the process, B would have been chosen because it
would be the ball of radius larger than k which has the smallest index. Therefore,
B ∈ G and this shows G is maximal with respect to 10.7 and 10.8.
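The sequential selection in this proof is an algorithm and can be run on a finite list. The following is a minimal sketch, assuming open intervals on the real line stored as hypothetical `(center, radius)` pairs; the function name `select_maximal_disjoint` is introduced here only for illustration.

```python
def select_maximal_disjoint(balls, k):
    """Walk the list in order; keep each ball of radius > k that is disjoint
    from all balls kept so far. Balls are open intervals (center, radius)."""
    chosen = []
    for center, radius in balls:
        if radius <= k:
            continue  # only balls with radius larger than k are candidates
        # Open intervals are disjoint exactly when the centers are at least
        # the sum of the radii apart.
        if all(abs(center - c) >= radius + r for c, r in chosen):
            chosen.append((center, radius))
    return chosen

# Hypothetical list of balls, in the fixed order B_1, B_2, ...
balls = [(0, 1), (1.5, 1), (3, 1), (10, 0.2), (5, 0.6)]
k = 0.5
chosen = select_maximal_disjoint(balls, k)

# Maximality: every unchosen ball of radius > k meets some chosen ball.
maximal = all(
    any(abs(c - c0) < r + r0 for c0, r0 in chosen)
    for c, r in balls
    if r > k and (c, r) not in chosen
)
print(chosen, maximal)
```

A ball is skipped only because it meets an earlier chosen ball, which is the maximality argument of the proof.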
For the next lemma, for an open ball, B = B (x, r) , denote by $\widetilde{B}$ the open ball,
B (x, 4r) .
A ≡ ∪ {B : B ∈ F} .
Suppose
∞ > M ≡ sup {r : B(p, r) ∈ F } > 0.
Then there exists G ⊆ F such that G consists of disjoint balls and
\[ A \subseteq \cup\{\widetilde{B} : B \in \mathcal{G}\}. \]
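and using Lemma 10.12, let $\mathcal{G}_m$ be a maximal collection of disjoint balls from $\mathcal{F}_m$
with the property that each ball has radius larger than $\left(\frac{2}{3}\right)^m M$. Let $\mathcal{G} \equiv \cup_{k=1}^\infty \mathcal{G}_k$.
Let x ∈ B (p, r) ∈ F. Choose m such that
\[ \left(\frac{2}{3}\right)^m M < r \le \left(\frac{2}{3}\right)^{m-1} M \]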
Then B (p, r) must have nonempty intersection with some ball from G1 ∪ · · · ∪ Gm
because if it didn’t, then Gm would fail to be maximal. Denote by B (p0 , r0 ) a ball
in G1 ∪ · · · ∪ Gm which has nonempty intersection with B (p, r) . Thus
\[ r_0 > \left(\frac{2}{3}\right)^m M. \]
Consider the picture, in which w ∈ B (p0 , r0 ) ∩ B (p, r) .
[Figure: the balls $B(p_0, r_0)$ and $B(p, r)$, with w in their intersection and a point x of $B(p, r)$ marked.]
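Then
\begin{align*}
|x - p_0| &\le |x - p| + |p - w| + \overbrace{|w - p_0|}^{< r_0} \\
&< r + r + r_0 \le 2\overbrace{\left(\frac{2}{3}\right)^{m-1} M}^{< \frac{3}{2} r_0} + r_0 \\
&< 2\left(\frac{3}{2}\right) r_0 + r_0 = 4r_0.
\end{align*}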
This proves the lemma since it shows B (p, r) ⊆ B (p0 , 4r0 ) .
With this Lemma consider a version of the Vitali covering theorem in which
the balls do not have to be open. A ball centered at x of radius r will denote
something which contains the open ball, B (x, r) and is contained in the closed ball,
$\overline{B(x, r)}$. Thus the balls could be open or they could contain some but not all of
their boundary points.
Definition 10.14 Let B be a ball centered at x having radius r. Denote by $\widehat{B}$ the
open ball, B (x, 5r).
Theorem 10.15 (Vitali) Let F be a collection of balls, and let
A ≡ ∪ {B : B ∈ F} .
Suppose
∞ > M ≡ sup {r : B(p, r) ∈ F } > 0.
Then there exists G ⊆ F such that G consists of disjoint balls and
\[ A \subseteq \cup\{\widehat{B} : B \in \mathcal{G}\}. \]
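Proof: For B one of these balls, say $\overline{B(x, r)} \supseteq B \supseteq B(x, r)$, denote by $B_1$, the
ball $B\left(x, \frac{5r}{4}\right)$. Let $\mathcal{F}_1 \equiv \{B_1 : B \in \mathcal{F}\}$ and let $A_1$ denote the union of the balls in
$\mathcal{F}_1$. Apply Lemma 10.13 to $\mathcal{F}_1$ to obtain
\[ A_1 \subseteq \cup\{\widetilde{B_1} : B_1 \in \mathcal{G}_1\} \]
\[ A \subseteq A_1 \subseteq \cup\{\widetilde{B_1} : B_1 \in \mathcal{G}_1\} = \cup\{\widehat{B} : B \in \mathcal{G}\} \]
because for $B_1 = B\left(x, \frac{5r}{4}\right)$, it follows $\widetilde{B_1} = B(x, 5r) = \widehat{B}$. This proves the theorem.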
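For a finite family the conclusion of the theorem can be checked directly. The sketch below follows the layered selection with radius thresholds $M/2^m$ described earlier in this section rather than the $(2/3)^m M$ layers of Lemma 10.13, and it works with open intervals on the line. The family `balls` and the helper names are hypothetical.

```python
def disjoint(b1, b2):
    """Open intervals (center, radius) are disjoint when centers are far apart."""
    (c1, r1), (c2, r2) = b1, b2
    return abs(c1 - c2) >= r1 + r2

def vitali_select(balls):
    """Layered greedy selection: layer m keeps balls of radius > M / 2^m that
    are disjoint from everything chosen so far."""
    M = max(r for _, r in balls)
    chosen = []
    remaining = list(balls)
    m = 1
    while remaining:
        threshold = M / 2 ** m
        # F_m: balls disjoint from all balls chosen in earlier layers.
        remaining = [b for b in remaining if all(disjoint(b, g) for g in chosen)]
        for b in remaining:
            if b[1] > threshold and all(disjoint(b, g) for g in chosen):
                chosen.append(b)
        # Balls above the threshold are now either chosen or meet a chosen ball.
        remaining = [b for b in remaining if b[1] <= threshold]
        m += 1
    return chosen

def covered_by_enlargement(ball, chosen, factor=5):
    """True when ball lies inside B(p0, factor * r0) for some chosen ball."""
    p, r = ball
    return any(abs(p - p0) + r <= factor * r0 for p0, r0 in chosen)

# Hypothetical deterministic family of 40 intervals with positive radii.
balls = [((i * 0.37) % 5, 0.1 + (i * 0.13) % 0.9) for i in range(40)]
chosen = vitali_select(balls)
pairwise_disjoint = all(
    disjoint(a, b) for idx, a in enumerate(chosen) for b in chosen[idx + 1:]
)
all_covered = all(covered_by_enlargement(b, chosen) for b in balls)
print(len(chosen), pairwise_disjoint, all_covered)
```

Each discarded ball meets a chosen ball of more than half its radius, which is what makes the fivefold enlargement sufficient.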
Definition 10.16 Let F be a collection of balls that cover a set, E, which have
the property that if x ∈ E and ε > 0, then there exists B ∈ F, diameter of B < ε
and x ∈ B. Such a collection covers E in the sense of Vitali.
Theorem 10.17 Let $E \subseteq \mathbb{R}^n$ and suppose $0 < \overline{m_n}(E) < \infty$ where $\overline{m_n}$ is the
outer measure determined by $m_n$, n dimensional Lebesgue measure, and let F be
a collection of closed balls of bounded radii such that F covers E in the sense of
Vitali. Then there exists a countable collection of disjoint balls from F, $\{B_j\}_{j=1}^\infty$,
such that $\overline{m_n}(E \setminus \cup_{j=1}^\infty B_j) = 0$.
Proof: From the definition of outer measure there exists a Lebesgue measurable
set, $E_1 \supseteq E$ such that $m_n(E_1) = \overline{m_n}(E)$. Now by outer regularity of Lebesgue
measure, there exists U , an open set which satisfies
E1
10.4. VITALI COVERINGS 277
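Therefore,
\begin{align*}
m_n(E_1) = \overline{m_n}(E) &\le m_n\left(\cup_{j=1}^\infty \widehat{B}_j\right) \le \sum_j m_n\left(\widehat{B}_j\right) \\
&= 5^n \sum_j m_n(B_j) = 5^n m_n\left(\cup_{j=1}^\infty B_j\right)
\end{align*}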
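Then
\begin{align*}
m_n(E_1) &> (1 - 10^{-n})\,m_n(U) \\
&\ge (1 - 10^{-n})\left[m_n(E_1 \setminus \cup_{j=1}^\infty B_j) + m_n(\cup_{j=1}^\infty B_j)\right] \\
&\ge (1 - 10^{-n})\Big[m_n(E_1 \setminus \cup_{j=1}^\infty B_j) + 5^{-n}\overbrace{\overline{m_n}(E)}^{= m_n(E_1)}\Big].
\end{align*}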
¡ ¡ ¢ ¢
1 − 1 − 10−n 5−n mn (E1 ) ≥ (1 − 10−n )mn (E1 \ ∪∞
j=1 Bj )
which implies
(1 − (1 − 10−n ) 5−n )
mn (E1 \ ∪∞
j=1 Bj ) ≤ mn (E1 )
(1 − 10−n )
\[ 0 < \frac{\left(1 - (1 - 10^{-n})5^{-n}\right)}{(1 - 10^{-n})} < 1 \]
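The bound $0 < \frac{1 - (1 - 10^{-n})5^{-n}}{1 - 10^{-n}} < 1$ is equivalent to $2^{-n} < 1 - 10^{-n}$, which holds for every $n \ge 1$. As a quick sanity check, the ratio can be evaluated in exact rational arithmetic for small n; the function name `vitali_ratio` is introduced here only for illustration.

```python
from fractions import Fraction

def vitali_ratio(n):
    """Exact value of (1 - (1 - 10^-n) 5^-n) / (1 - 10^-n)."""
    a = Fraction(1, 10 ** n)  # 10^{-n}
    b = Fraction(1, 5 ** n)   # 5^{-n}
    return (1 - (1 - a) * b) / (1 - a)

ratios = {n: vitali_ratio(n) for n in range(1, 8)}
for n, r in ratios.items():
    print(n, float(r))
```

The ratio is strictly less than 1 but approaches 1 quickly as n grows, so the proportion of measure captured at each stage shrinks in high dimensions while remaining positive.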
\[ \frac{\left(1 - (1 - 10^{-n})5^{-n}\right)}{(1 - 10^{-n})} < \theta_n < 1, \]
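\[ \overline{m_n}\left(E \setminus \cup_{j=1}^\infty B_j\right) \le m_n(E_1 \setminus \cup_{j=1}^\infty B_j) < \theta_n m_n(E_1) = \theta_n \overline{m_n}(E) \]
\[ \theta_n \overline{m_n}(E) \ge m_n(E_1 \setminus \cup_{j=1}^{N_1} B_j) \ge \overline{m_n}(E \setminus \cup_{j=1}^{N_1} B_j) \tag{10.10} \]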
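Let $\mathcal{F}_1 = \{B \in \mathcal{F} : B_j \cap B = \emptyset,\ j = 1, \cdots, N_1\}$. If $E \setminus \cup_{j=1}^{N_1} B_j = \emptyset$, then $\mathcal{F}_1 = \emptyset$
and
\[ \overline{m_n}\left(E \setminus \cup_{j=1}^{N_1} B_j\right) = 0 \]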
Therefore, in this case let $B_k = \emptyset$ for all $k > N_1$. Consider the case where
\[ E \setminus \cup_{j=1}^{N_1} B_j \ne \emptyset. \]
role of U . (You pick a different $E_1$ whose measure equals the outer measure of
$E \setminus \cup_{j=1}^{N_1} B_j$.) Then choosing $B_j$ for $j = N_1 + 1, \cdots, N_2$ as in the above argument,
\[ \theta_n \overline{m_n}(E \setminus \cup_{j=1}^{N_1} B_j) \ge \overline{m_n}(E \setminus \cup_{j=1}^{N_2} B_j) \]
\[ \overline{m_n}\left(E \setminus \cup_{j=1}^{N_k} B_j\right) = 0. \]
for every k ∈ N. Therefore, the conclusion holds in this case also. This proves the
Theorem.
There is an obvious corollary which removes the assumption that 0 < mn (E).
Corollary 10.18 Let $E \subseteq \mathbb{R}^n$ and suppose $\overline{m_n}(E) < \infty$ where $\overline{m_n}$ is the outer
measure determined by $m_n$, n dimensional Lebesgue measure, and let F be a
collection of closed balls of bounded radii such that F covers E in the sense of Vitali.
Then there exists a countable collection of disjoint balls from F, $\{B_j\}_{j=1}^\infty$, such that
$\overline{m_n}(E \setminus \cup_{j=1}^\infty B_j) = 0$.
Proof: If 0 = mn (E) you simply pick any ball from F for your collection of
disjoint balls.
It is also not hard to remove the assumption that mn (E) < ∞.
Proof: Let $R_m \equiv (-m, m)^n$ be the open rectangle having sides of length 2m
which is centered at 0 and let $R_0 = \emptyset$. Let $H_m \equiv \overline{R_m} \setminus R_m$. Since both $R_m$
and $\overline{R_m}$ have the same measure, $(2m)^n$, it follows $m_n(H_m) = 0$. Now for all
$k \in \mathbb{N}$, $R_k \subseteq \overline{R_k} \subseteq R_{k+1}$. Consider the disjoint open sets, $U_k \equiv R_{k+1} \setminus \overline{R_k}$. Thus
10.5. CHANGE OF VARIABLES FOR LINEAR MAPS 279
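and so
\begin{align*}
\overline{m_n}(E \setminus \cup_{i=1}^\infty B_i) &\le \overline{m_n}\left(E \setminus \cup_{i=1}^\infty \overline{B_i}\right) + m_n\left(\cup_{i=1}^\infty \overline{B_i} \setminus B_i\right) \\
&= \overline{m_n}\left(E \setminus \cup_{i=1}^\infty \overline{B_i}\right) = 0.
\end{align*}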
This implies you can fill up an open set with balls which cover the open set in
the sense of Vitali.
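\begin{align*}
&\le \sum_{i=1}^\infty m_n\left(h\left(\widehat{B}_i\right)\right) \le \sum_{i=1}^\infty m_n(B(h(x_i), 2kr_{x_i})) \\
&= \sum_{i=1}^\infty m_n(B(x_i, 2kr_{x_i})) = (2k)^n \sum_{i=1}^\infty m_n(B(x_i, r_{x_i})) \\
&\le (2k)^n m_n(V) \le (2k)^n \varepsilon.
\end{align*}
Since ε > 0 is arbitrary, this shows $m_n(h(T_k)) = 0$. Now
\[ m_n(h(T)) = \lim_{k\to\infty} m_n(h(T_k)) = 0. \]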
This proves the lemma in the case that V is bounded. Suppose now that V is just
an open set. Let Vk = V ∩ B (0, k) . Then mn (RVk ) = mn (Vk ) . Letting k → ∞,
this yields the desired conclusion. This proves the lemma in the case that V is open.
Suppose now that H is a closed and bounded set. Let B (0,R) ⊇ H. Then letting
B = B (0, R) for short,
In general, let Hm = H ∩ B (0,m). Then from what was just shown, mn (RHm ) =
mn (Hm ) . Now let m → ∞ to get the conclusion of the lemma in general. This
proves the lemma.
Lemma 10.26 Let E be Lebesgue measurable set in Rn and let R be unitary. Then
mn (RE) = mn (E) .
Proof: First suppose E is bounded. Then there exist sets, G and H such that
H ⊆ E ⊆ G and H is the countable union of closed sets while G is the countable
intersection of open sets such that mn (G \ H) = 0. By Lemma 10.25 applied to
these sets whose union or intersection equals H or G respectively, it follows
Therefore,
In the general case, let Em = E ∩ B (0, m) and apply what was just shown and let
m → ∞.
Proof: Let RU be the right polar decomposition (Theorem 4.59 on Page 87) of
A and let V be an open set. Then from Lemma 10.26,
mn (AV ) = mn (RU V ) = mn (U V ) .
Now U = Q∗ DQ where D is a diagonal matrix such that |det (D)| = |det (A)| and
Q is unitary. Therefore,
Now QV is an open set and so by Corollary 10.6 on Page 270 and Lemma 10.25,
which shows the desired equation is obvious in the case where det (A) = 0. Therefore,
assume A is one to one. Since H is bounded, H ⊆ B (0, R) for some R > 0. Then
letting B = B (0, R) for short,
If H is not bounded, apply the result just obtained to Hm ≡ H ∩ B (0, m) and then
let m → ∞.
With this preparation, the main result is the following theorem.
Proof: First suppose E is bounded. Then there exist sets, G and H such that
H ⊆ E ⊆ G and H is the countable union of closed sets while G is the countable
intersection of open sets such that mn (G \ H) = 0. By Lemma 10.27 applied to
these sets whose union or intersection equals H or G respectively, it follows
Therefore,
In the general case, let Em = E ∩ B (0, m) and apply what was just shown and let
m → ∞.
10.6. CHANGE OF VARIABLES FOR C 1 FUNCTIONS 283
Lemma 10.29 Let U and V be bounded open sets in Rn and let h, h−1 be C 1
functions such that h (U ) = V. Also let f ∈ Cc (V ) . Then
\[
\int_V f(y)\,dy = \int_U f(h(x))\,|\det(Dh(x))|\,dx
\]
h (B (x, r)) − h (x) = h (x+B (0,r)) − h (x) ⊆ Dh (x) (B (0, (1 + ε) r)) . (10.12)
|f (h (x1 )) |det (Dh (x1 ))| − f (h (x)) |det (Dh (x))|| < ε (10.14)
Therefore,
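\[
\begin{aligned}
\int_V \mathcal{X}_{E_m}(y)\,dy &= \int_{V_m} \mathcal{X}_{E_m}(y)\,dy = \int_{U_m} \mathcal{X}_{E_m}(h(x))\,|\det(Dh(x))|\,dx \\
&= \int_U \mathcal{X}_{E_m}(h(x))\,|\det(Dh(x))|\,dx
\end{aligned}
\]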
Let m → ∞ and use the monotone convergence theorem to obtain the conclusion
of the corollary.
With this corollary, the main theorem follows.
Theorem 10.32 Let U and V be open sets in Rn and let h, h−1 be C 1 functions
such that h (U ) = V. Then if g is a nonnegative Lebesgue measurable function,
\[
\int_V g(y)\,dy = \int_U g(h(x))\,|\det(Dh(x))|\,dx. \tag{10.16}
\]
Proof: From Corollary 10.31, 10.16 holds for any nonnegative simple function
in place of g. In general, let {sk } be an increasing sequence of simple functions
which converges to g pointwise. Then from the monotone convergence theorem
\[
\begin{aligned}
\int_V g(y)\,dy &= \lim_{k\to\infty} \int_V s_k\,dy = \lim_{k\to\infty} \int_U s_k(h(x))\,|\det(Dh(x))|\,dx \\
&= \int_U g(h(x))\,|\det(Dh(x))|\,dx.
\end{aligned}
\]
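As a quick numerical sanity check of 10.16, the following Python sketch compares both sides in one dimension. The choices h(x) = x^2 on U = (1, 2), so that V = (1, 4), and g(y) = 1/(1 + y) are illustrative assumptions and are not taken from the text. The midpoint rule is only a crude stand-in for the Lebesgue integral.

```python
import math


def midpoint(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))


def h(x):
    # Hypothetical C^1 map with C^1 inverse on U = (1, 2); h(U) = (1, 4).
    return x * x


def abs_det_dh(x):
    # |det Dh(x)| reduces to |h'(x)| in one dimension.
    return abs(2.0 * x)


def g(y):
    # Hypothetical nonnegative integrand.
    return 1.0 / (1.0 + y)


lhs = midpoint(g, 1.0, 4.0)                                   # integral over V
rhs = midpoint(lambda x: g(h(x)) * abs_det_dh(x), 1.0, 2.0)   # integral over U
print(lhs, rhs, math.log(2.5))
```

Both approximations should agree with the exact value ln(5/2) up to the quadrature error.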
it follows that
\[
m_n(K_\varepsilon) \le 2^n \varepsilon\,(\operatorname{diam}(K) + \varepsilon)^{n-1}.
\]
{v1 , · · ·, vn−1 , vn }
286 LEBESGUE MEASURE
and let r1 > 0 be a constant as in Lemma 10.36 such that whenever x ∈ Uk and
0 < |v| ≤ r1 ,
|h (x + v) − h (x) − Dh (x) v| < ε |v| . (10.17)
Now the closures of balls which are contained in W and which have the property
that their diameters are less than r1 yield a Vitali covering of W. Therefore, by
Corollary 10.21 there is a disjoint sequence of these closed balls, $\{\widetilde{B}_i\}$, such that
\[
W = \cup_{i=1}^{\infty} \widetilde{B}_i \cup N
\]
where N is a set of measure zero. Denote by {Bi } those closed balls in this sequence
which have nonempty intersection with Zk , let di be the diameter of Bi , and let zi
be a point in Bi ∩ Zk . Since zi ∈ Zk , it follows Dh (zi ) B (0,di ) = Di where Di is
contained in a subspace, V which has dimension n − 1 and the diameter of Di is no
larger than 2Ck di where
Ck ≥ max {||Dh (x)|| : x ∈ Zk }
Then by 10.17, if z ∈ Bi ,
\[
h(z) - h(z_i) \in D_i + B(0, \varepsilon d_i) \subseteq D_i + \overline{B(0, \varepsilon d_i)}.
\]
Thus
\[
h(B_i) \subseteq h(z_i) + D_i + \overline{B(0, \varepsilon d_i)}
\]
By Lemma 10.33
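\[
\begin{aligned}
m_n(h(B_i)) &\le 2^n (2C_k d_i + \varepsilon d_i)^{n-1} \varepsilon d_i \\
&\le d_i^n \left(2^n [2C_k + \varepsilon]^{n-1}\right) \varepsilon \\
&\le C_{n,k}\, m_n(B_i)\, \varepsilon.
\end{aligned}
\]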
Therefore, by Lemma 10.22
\[
\begin{aligned}
m_n(h(Z_k)) &\le \sum_i m_n(h(B_i)) \le C_{n,k}\, \varepsilon \sum_i m_n(B_i) \\
&\le \varepsilon C_{n,k}\, m_n(W) \le \varepsilon C_{n,k} (m_n(Z_k) + \varepsilon)
\end{aligned}
\]
Since ε is arbitrary, this shows mn (h (Zk )) = 0 and so 0 = limk→∞ mn (h (Zk )) =
mn (h (Z)).
With this important lemma, here is a generalization of Theorem 10.32.
Theorem 10.38 Let U be an open set and let h be a 1 − 1, C 1 function with values
in Rn . Then if g is a nonnegative Lebesgue measurable function,
\[
\int_{h(U)} g(y)\,dy = \int_U g(h(x))\,|\det(Dh(x))|\,dx. \tag{10.18}
\]
Proof: Let Z = {x : det (Dh (x)) = 0} . Then by the inverse function theorem,
h−1 is C 1 on h (U \ Z) and h (U \ Z) is an open set. Therefore, from Lemma 10.37
and Theorem 10.32,
\[
\begin{aligned}
\int_{h(U)} g(y)\,dy &= \int_{h(U \setminus Z)} g(y)\,dy = \int_{U \setminus Z} g(h(x))\,|\det(Dh(x))|\,dx \\
&= \int_U g(h(x))\,|\det(Dh(x))|\,dx.
\end{aligned}
\]
and Z the set where |det Dh (x)| = 0, Lemma 10.37 implies mn (h(Z)) = 0. For
x ∈ U+ , the inverse function theorem implies there exists an open set Bx such that
x ∈ Bx ⊆ U+ , h is one to one on Bx .
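Let $\{B_i\}$ be a countable subset of $\{B_x\}_{x \in U_+}$ such that $U_+ = \cup_{i=1}^{\infty} B_i$. Let
$E_1 = B_1$. If $E_1, \cdots, E_k$ have been chosen, $E_{k+1} = B_{k+1} \setminus \cup_{i=1}^{k} E_i$. Thus
\[
\cup_{i=1}^{\infty} E_i = U_+, \quad h \text{ is one to one on } E_i, \quad E_i \cap E_j = \emptyset,
\]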
and each Ei is a Borel set contained in the open set Bi . Now define
\[
n(y) \equiv \sum_{i=1}^{\infty} \mathcal{X}_{h(E_i)}(y) + \mathcal{X}_{h(Z)}(y).
\]
The sets $h(E_i)$, $h(Z)$ are measurable by Lemma 10.23. Thus $n(\cdot)$ is measurable.
Proof: Using Lemma 10.37 and the Monotone Convergence Theorem or Fubini’s
Theorem,
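\[
\begin{aligned}
\int_{h(U)} n(y)\,\mathcal{X}_F(y)\,dy &= \int_{h(U)} \Big( \sum_{i=1}^{\infty} \mathcal{X}_{h(E_i)}(y) + \overbrace{\mathcal{X}_{h(Z)}(y)}^{m_n(h(Z)) = 0} \Big) \mathcal{X}_F(y)\,dy \\
&= \sum_{i=1}^{\infty} \int_{h(U)} \mathcal{X}_{h(E_i)}(y)\,\mathcal{X}_F(y)\,dy \\
&= \sum_{i=1}^{\infty} \int_{h(B_i)} \mathcal{X}_{h(E_i)}(y)\,\mathcal{X}_F(y)\,dy \\
&= \sum_{i=1}^{\infty} \int_{B_i} \mathcal{X}_{E_i}(x)\,\mathcal{X}_F(h(x))\,|\det Dh(x)|\,dx \\
&= \sum_{i=1}^{\infty} \int_{U} \mathcal{X}_{E_i}(x)\,\mathcal{X}_F(h(x))\,|\det Dh(x)|\,dx \\
&= \int_{U} \sum_{i=1}^{\infty} \mathcal{X}_{E_i}(x)\,\mathcal{X}_F(h(x))\,|\det Dh(x)|\,dx \\
&= \int_{U_+} \mathcal{X}_F(h(x))\,|\det Dh(x)|\,dx = \int_{U} \mathcal{X}_F(h(x))\,|\det Dh(x)|\,dx.
\end{aligned}
\]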
This proves the lemma.
Definition 10.40 For y ∈ h(U ), define a function, #, according to the formula
#(y) ≡ number of elements in h−1 (y).
Observe that
#(y) = n(y) a.e. (10.19)
because n(y) = #(y) if y ∉ h(Z), a set of measure 0. Therefore, # is a measurable
function.
Theorem 10.41 Let g ≥ 0, g measurable, and let h be C 1 (U ). Then
\[
\int_{h(U)} \#(y)\,g(y)\,dy = \int_U g(h(x))\,|\det Dh(x)|\,dx. \tag{10.20}
\]
Proof: From 10.19 and Lemma 10.39, 10.20 holds for all g, a nonnegative simple
function. Approximating an arbitrary measurable nonnegative function, g, with an
increasing pointwise convergent sequence of simple functions and using the mono-
tone convergence theorem, yields 10.20 for an arbitrary nonnegative measurable
function, g. This proves the theorem.
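To see the counting function # at work, here is a small Python sketch of 10.20 for a map that is not one to one. The map h(x) = x^2 on U = (-1, 1) and the integrand g(y) = y are illustrative assumptions, not examples from the text; for this h, every y in (0, 1) has exactly two preimages.

```python
def midpoint(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))


def count_preimages(y):
    """#(y) for the hypothetical map h(x) = x^2 on U = (-1, 1)."""
    if y < 0.0 or y >= 1.0:
        return 0
    return 1 if y == 0.0 else 2


def g(y):
    # Hypothetical nonnegative integrand.
    return y


# Left side of 10.20: integrate #(y) g(y) over h(U) = [0, 1).
lhs = midpoint(lambda y: count_preimages(y) * g(y), 0.0, 1.0)
# Right side of 10.20: integrate g(h(x)) |det Dh(x)| over U = (-1, 1).
rhs = midpoint(lambda x: g(x * x) * abs(2.0 * x), -1.0, 1.0)
print(lhs, rhs)
```

Dropping the factor #(y) on the left would give 1/2 instead of 1, which is why the multiplicity is needed when h fails to be one to one.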
This will be accomplished by Fubini’s theorem, Theorem 9.50 and the following
lemma.
Lemma 10.43 mk × mn−k = mn on the mn measurable sets.
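Proof: First of all, let $R = \prod_{i=1}^{n} (a_i, b_i]$ be a measurable rectangle and let
$R_k = \prod_{i=1}^{k} (a_i, b_i]$, $R_{n-k} = \prod_{i=k+1}^{n} (a_i, b_i]$. Then by Fubini's theorem,
\[
\begin{aligned}
\int \mathcal{X}_R\, d(m_k \times m_{n-k}) &= \int_{\mathbb{R}^k} \int_{\mathbb{R}^{n-k}} \mathcal{X}_{R_k}\, \mathcal{X}_{R_{n-k}}\, dm_k\, dm_{n-k} \\
&= \int_{\mathbb{R}^k} \mathcal{X}_{R_k}\, dm_k \int_{\mathbb{R}^{n-k}} \mathcal{X}_{R_{n-k}}\, dm_{n-k} \\
&= \int \mathcal{X}_R\, dm_n
\end{aligned}
\]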
and so mk × mn−k and mn agree on every half open rectangle. By Lemma 10.2
these two measures agree on every open set. Now if K is a compact set, then
$K = \cap_{k=1}^{\infty} U_k$ where $U_k$ is the open set, $K + B\left(0, \frac{1}{k}\right)$. Another way of saying this
is $U_k \equiv \left\{x : \operatorname{dist}(x, K) < \frac{1}{k}\right\}$ which is obviously open because $x \to \operatorname{dist}(x, K)$ is a
continuous function. Since K is the countable intersection of these decreasing open
sets, each of which has finite measure with respect to either of the two measures,
it follows that mk × mn−k and mn agree on all the compact sets. Now let E be
a bounded Lebesgue measurable set. Then there are sets, H and G such that
H is a countable union of compact sets, G a countable intersection of open sets,
H ⊆ E ⊆ G, and mn (G \ H) = 0. Then from what was just shown about compact
and open sets, the two measures agree on G and on H. Therefore,
Now let p → ∞ and use the Monotone convergence theorem and the Fubini Theorem
9.50 on Page 243.
Not surprisingly, the following corollary follows from this.
Proof: Apply Corollary 10.44 to the positive and negative parts of the real and
imaginary parts of f .
y1 = ρ cos θ
y2 = ρ sin θ
where ρ > 0 and θ ∈ [0, 2π). Here I am writing ρ in place of r to emphasize a pattern
which is about to emerge. I will consider polar coordinates as spherical coordinates
in two dimensions. I will also simply refer to such coordinate systems as polar
coordinates regardless of the dimension. This is also the reason I am writing y1 and
y2 instead of the more usual x and y. Now consider what happens when you go to
three dimensions. The situation is depicted in the following picture.
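[Figure: a point (x1, x2, x3) at distance ρ from the origin; φ1 is the angle between the segment of length ρ and the vertical axis R, which stands over the plane R2.]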
From this picture, you see that y3 = ρ cos φ1 . Also the distance between (y1 , y2 )
and (0, 0) is ρ sin (φ1 ) . Therefore, using polar coordinates to write (y1 , y2 ) in terms
of θ and this distance,
y1 = ρ sin φ1 cos θ,
y2 = ρ sin φ1 sin θ,
y3 = ρ cos φ1 .
where φ1 ∈ [0, π] . What was done is to replace ρ with ρ sin φ1 and then to add in
y3 = ρ cos φ1 . Having done this, there is no reason to stop with three dimensions.
Consider the following picture:
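[Figure: a point (x1, x2, x3, x4) at distance ρ from the origin; φ2 is the angle between the segment of length ρ and the axis R, which stands over R3.]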
From this picture, you see that y4 = ρ cos φ2 . Also the distance between (y1 , y2 , y3 )
and (0, 0, 0) is ρ sin (φ2 ) . Therefore, using polar coordinates to write (y1 , y2 , y3 ) in
where φ2 ∈ [0, π] .
Continuing this way, given spherical coordinates in $\mathbb{R}^n$, to get the spherical
coordinates in $\mathbb{R}^{n+1}$, you let $y_{n+1} = \rho \cos \phi_{n-1}$ and then replace every occurrence of
$\rho$ with $\rho \sin \phi_{n-1}$ to obtain $y_1 \cdots y_n$ in terms of $\phi_1, \phi_2, \cdots, \phi_{n-1}, \theta$, and $\rho$.
It is always the case that $\rho$ measures the distance from the point in $\mathbb{R}^n$ to the
origin in $\mathbb{R}^n$, 0. Each $\phi_i \in [0, \pi]$, and $\theta \in [0, 2\pi)$. It can be shown using math
induction that these coordinates map $\prod_{i=1}^{n-2} [0, \pi] \times [0, 2\pi) \times (0, \infty)$ one to one onto
$\mathbb{R}^n \setminus \{0\}$.
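The recursive construction just described can be sketched in a few lines of Python. The function name and argument order below are illustrative choices, not notation from the text: the list phis holds the angles φ1, φ2, ... in the order they are introduced, and each step multiplies the existing coordinates by sin φ and appends ρ cos φ as the new last coordinate.

```python
import math


def spherical_to_cartesian(rho, theta, phis):
    """Return (y_1, ..., y_n) for n = len(phis) + 2.

    Starts from polar coordinates in the plane and adds one dimension
    per angle in phis, as in the construction above.
    """
    # Two-dimensional case on the unit circle; rho is applied at the end.
    y = [math.cos(theta), math.sin(theta)]
    for phi in phis:
        # Replacing rho by rho*sin(phi) scales every existing coordinate;
        # the new last coordinate is rho*cos(phi).
        y = [c * math.sin(phi) for c in y] + [math.cos(phi)]
    return [rho * c for c in y]


point = spherical_to_cartesian(2.0, 0.7, [1.1, 0.4])   # a point of R^4
norm = math.sqrt(sum(c * c for c in point))
print(point, norm)
```

The printed norm should equal ρ up to rounding, which matches the remark that ρ always measures the distance to the origin.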
Proof: Formula 10.21 is obvious from the definition of the spherical coordinates.
The first claim is also clear from the definition and math induction. It remains to
verify 10.22. Let $A_0 \equiv \prod_{i=1}^{n-2} (0, \pi) \times (0, 2\pi)$. Then it is clear that $(A \setminus A_0) \times (0, \infty) \equiv N$
is a set of measure zero in $\mathbb{R}^n$. Therefore, from Lemma 10.22 it follows $h(N)$
is also a set of measure zero. Therefore, using the change of variables theorem,
Corollary 10.44, and Sard’s lemma,
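\[
\begin{aligned}
\int_{\mathbb{R}^n} f(y)\,dy &= \int_{\mathbb{R}^n \setminus \{0\}} f(y)\,dy = \int_{\mathbb{R}^n \setminus (\{0\} \cup h(N))} f(y)\,dy \\
&= \int_{A_0 \times (0, \infty)} f(h(\phi, \theta, \rho))\, \rho^{n-1}\, \Phi(\phi, \theta)\, dm_n \\
&= \int \mathcal{X}_{A \times (0, \infty)}(\phi, \theta, \rho)\, f(h(\phi, \theta, \rho))\, \rho^{n-1}\, \Phi(\phi, \theta)\, dm_n \\
&= \int_0^{\infty} \rho^{n-1} \left( \int_A f(h(\phi, \theta, \rho))\, \Phi(\phi, \theta)\, d\phi\, d\theta \right) d\rho.
\end{aligned}
\]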
1 Actually it is only a function of the first but this is not important in what follows.
Now the claim about f ∈ L1 follows routinely from considering the positive and
negative parts of the real and imaginary parts of f in the usual way. This proves
the theorem.
Notation 10.47 Often this is written differently. Note that from the spherical
coordinate formulas, $f(h(\phi, \theta, \rho)) = f(\rho\omega)$ where $|\omega| = 1$. Letting $S^{n-1}$ denote the
unit sphere, $\{\omega \in \mathbb{R}^n : |\omega| = 1\}$, the inside integral in the above formula is sometimes
written as
\[
\int_{S^{n-1}} f(\rho\omega)\, d\sigma
\]
where $\sigma$ is a measure on $S^{n-1}$. See [35] for another description of this measure.
It isn’t an important issue here. Later in the book when integration on manifolds
is discussed, more general considerations will be dealt with. Either 10.22 or the
formula
\[
\int_0^{\infty} \rho^{n-1} \left( \int_{S^{n-1}} f(\rho\omega)\, d\sigma \right) d\rho
\]
will be referred to as polar coordinates and is very useful in establishing estimates.
Here $\sigma\left(S^{n-1}\right) \equiv \int_A \Phi(\phi, \theta)\, d\phi\, d\theta$.
Example 10.48 For what values of s is the integral $\int_{B(0,R)} \left(1 + |x|^2\right)^s dx$ bounded
independent of R? Here B (0, R) is the ball, $\{x \in \mathbb{R}^n : |x| \le R\}$.
I think you can see immediately that s must be negative but exactly how negative?
It turns out it depends on n and using polar coordinates, you can find just
exactly what is needed. From the polar coordinates formula above,
\[
\begin{aligned}
\int_{B(0,R)} \left(1 + |x|^2\right)^s dx &= \int_0^R \int_{S^{n-1}} \left(1 + \rho^2\right)^s \rho^{n-1}\, d\sigma\, d\rho \\
&= C_n \int_0^R \left(1 + \rho^2\right)^s \rho^{n-1}\, d\rho
\end{aligned}
\]
Now the very hard problem has been reduced to considering an easy one variable
problem of finding when
\[
\int_0^R \rho^{n-1} \left(1 + \rho^2\right)^s d\rho
\]
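The one variable integral can be explored numerically. The Python sketch below is an illustration only: the helper name and the composite Simpson rule are choices made here, not part of the text. For large ρ the integrand behaves like ρ^(n-1+2s), so a comparison argument suggests the integral stays bounded as R grows exactly when s < -n/2. The two sample values of s for n = 3 sit on either side of that threshold.

```python
import math


def radial_integral(n, s, R, steps=200_000):
    """Composite Simpson approximation of the integral over (0, R) of
    rho^(n-1) * (1 + rho^2)^s."""
    if steps % 2:
        steps += 1          # Simpson's rule needs an even number of panels
    width = R / steps

    def integrand(rho):
        return rho ** (n - 1) * (1.0 + rho * rho) ** s

    total = integrand(0.0) + integrand(R)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * width)
    return total * width / 3.0


# n = 3, s = -2 < -3/2: the values level off as R grows.
bounded = [radial_integral(3, -2.0, R) for R in (10.0, 100.0, 1000.0)]
# n = 3, s = -1 > -3/2: the values keep growing roughly like R.
growing = [radial_integral(3, -1.0, R) for R in (10.0, 100.0)]
print(bounded, growing)
```

For n = 3 and s = -2 the values approach π/4 from below, while for s = -1 the exact value is R - arctan R, which is unbounded.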
where here $(Dg)_{ij} \equiv g_{i,j} \equiv \frac{\partial g_i}{\partial x_j}$. Also, $\operatorname{cof}(Dg)_{ij} = \frac{\partial \det(Dg)}{\partial g_{i,j}}$.
and so
\[
\frac{\partial \det(Dg)}{\partial g_{i,j}} = \operatorname{cof}(Dg)_{ij} \tag{10.23}
\]
which shows the last claim of the lemma. Also
\[
\delta_{kj} \det(Dg) = \sum_i g_{i,k} \left(\operatorname{cof}(Dg)\right)_{ij} \tag{10.24}
\]
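Formulas 10.23 and 10.24 are statements about a single matrix, so they can be checked directly. The Python sketch below uses a fixed 3 by 3 matrix A as a stand-in for Dg at one point. The matrix entries and helper names are assumptions made for illustration.

```python
def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    size = len(m)
    return [[m[r][c] for c in range(size) if c != j]
            for r in range(size) if r != i]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j))
               for j in range(len(m)))


def cof(m, i, j):
    """The ij cofactor of m."""
    return (-1) ** (i + j) * det(minor(m, i, j))


A = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 4.0],
     [5.0, 0.0, 6.0]]
n = len(A)

# 10.24: sum over i of A[i][k] * cof(A)_{ij} should equal delta_{kj} det(A).
lhs = [[sum(A[i][k] * cof(A, i, j) for i in range(n)) for j in range(n)]
       for k in range(n)]

# 10.23: det is affine in each entry, so a difference quotient in the
# (0, 1) entry recovers cof(A)_{01} up to rounding.
eps = 0.5
bumped = [row[:] for row in A]
bumped[0][1] += eps
difference_quotient = (det(bumped) - det(A)) / eps
print(det(A), lhs, difference_quotient, cof(A, 0, 1))
```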
Subtracting the first sum on the right from both sides and using the equality of
mixed partials,
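\[
\sum_i g_{i,k} \sum_j \left(\operatorname{cof}(Dg)\right)_{ij,j} = 0.
\]
If $\det(g_{i,k}) \ne 0$ so that $(g_{i,k})$ is invertible, this shows $\sum_j \left(\operatorname{cof}(Dg)\right)_{ij,j} = 0$. If
$\det(Dg) = 0$, let
\[
g_k = g + \varepsilon_k I
\]
where $\varepsilon_k \to 0$ and $\det(Dg + \varepsilon_k I) \equiv \det(Dg_k) \ne 0$. Then
\[
\sum_j \left(\operatorname{cof}(Dg)\right)_{ij,j} = \lim_{k\to\infty} \sum_j \left(\operatorname{cof}(Dg_k)\right)_{ij,j} = 0
\]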
Proof: Suppose not. Then |g (x) − x| must be bounded away from zero on Br .
Let a (x) be the larger of the two roots of the equation,
\[
|x + a(x)(x - g(x))|^2 = r^2. \tag{10.25}
\]
Thus
\[
a(x) = \frac{-(x, (x - g(x))) + \sqrt{(x, (x - g(x)))^2 + \left(r^2 - |x|^2\right) |x - g(x)|^2}}{|x - g(x)|^2} \tag{10.26}
\]
The expression under the square root sign is always nonnegative and it follows
from the formula that a (x) ≥ 0. Therefore, (x, (x − g (x))) ≥ 0 for all x ∈ Br .
The reason for this is that a (x) is the larger zero of a polynomial of the form
$p(z) = |x|^2 + z^2 |x - g(x)|^2 - 2z (x, x - g(x))$ and from the formula above, it is
nonnegative. $-2(x, x - g(x))$ is the slope of the tangent line to p (z) at z = 0. If
$x \ne 0$, then $|x|^2 > 0$ and so this slope needs to be negative for the larger of the two
zeros to be positive. If x = 0, then (x, x − g (x)) = 0.
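Formula 10.26 is easy to test numerically. In the Python sketch below, the point x, the value gx standing in for g(x), and the radius r are made-up sample data. The check is that the point x + a(x)(x - g(x)) lands on the sphere of radius r, as required by 10.25.

```python
import math


def dot(u, v):
    """Euclidean inner product (u, v)."""
    return sum(a * b for a, b in zip(u, v))


def a_of_x(x, gx, r):
    """Larger root a(x) of |x + a (x - g(x))|^2 = r^2, following 10.26.

    Assumes g(x) != x, so the denominator is nonzero.
    """
    d = [xi - gi for xi, gi in zip(x, gx)]        # x - g(x)
    xd = dot(x, d)
    dd = dot(d, d)
    under_root = xd * xd + (r * r - dot(x, x)) * dd
    return (-xd + math.sqrt(under_root)) / dd


r = 1.0
x = [0.3, -0.2]       # hypothetical point of the closed ball B_r
gx = [-0.1, 0.5]      # hypothetical value of g(x), different from x
a = a_of_x(x, gx, r)
p = [xi + a * (xi - gi) for xi, gi in zip(x, gx)]
print(a, math.sqrt(dot(p, p)))
```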
Now define for t ∈ [0, 1],
and
|f (t, x)| = r for all |x| = r (10.28)
These properties follow immediately from 10.26 and the above observation that for
x ∈ Br , it follows (x, (x − g (x))) ≥ 0.
Also from 10.26, a is a C 2 function near Br . This is obvious from 10.26 as long
as |x| < r. However, even if |x| = r it is still true. To show this, it suffices to verify
the expression under the square root sign is positive. If this expression were not
positive for some |x| = r, then (x, (x − g (x))) = 0. Then also, since $g(x) \ne x$,
\[
\left| \frac{g(x) + x}{2} \right| < r
\]
and so
\[
r^2 > \left( x, \frac{g(x) + x}{2} \right) = \frac{1}{2}(x, g(x)) + \frac{|x|^2}{2} = \frac{r^2}{2} + \frac{r^2}{2} = r^2,
\]
a contradiction. Therefore, the expression under the square root in 10.26 is always
positive near Br and so a is a C 2 function near Br as claimed because the square
root function is C 2 away from zero.
Now define
\[
I(t) \equiv \int_{B_r} \det(D_2 f(t, x))\,dx.
\]
Then
\[
I(0) = \int_{B_r} dx = m_n(B_r) > 0. \tag{10.29}
\]
Using the dominated convergence theorem one can differentiate I (t) as follows.
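\[
\begin{aligned}
I'(t) &= \int_{B_r} \sum_{ij} \frac{\partial \det(D_2 f(t, x))}{\partial f_{i,j}} \frac{\partial f_{i,j}}{\partial t}\,dx \\
&= \int_{B_r} \sum_{ij} \operatorname{cof}(D_2 f)_{ij} \frac{\partial \left( a(x)(x_i - g_i(x)) \right)}{\partial x_j}\,dx.
\end{aligned}
\]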
Now from 10.27 a (x) = 0 when |x| = r and so integration by parts and Lemma
10.49 yields
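\[
\begin{aligned}
I'(t) &= \int_{B_r} \sum_{ij} \operatorname{cof}(D_2 f)_{ij} \frac{\partial \left( a(x)(x_i - g_i(x)) \right)}{\partial x_j}\,dx \\
&= -\int_{B_r} \sum_{ij} \operatorname{cof}(D_2 f)_{ij,j}\, a(x)(x_i - g_i(x))\,dx = 0.
\end{aligned}
\]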
Theorem 10.51 Let Br be the above closed ball and let f : Br → Br be continuous.
Then there exists x ∈ Br such that f (x) = x.
Proof: Let $f_k(x) \equiv \frac{f(x)}{1 + k^{-1}}$. Thus $\|f_k - f\| < \frac{r}{1 + k}$ where
where here $(Dg)_{ij} \equiv g_{i,j} \equiv \frac{\partial g_i}{\partial x_j}$. Also, $\operatorname{cof}(Dg)_{ij} = \frac{\partial \det(Dg)}{\partial g_{i,j}}$.
10.11. THE BROUWER FIXED POINT THEOREM ANOTHER PROOF 299
Definition 10.54 Let h be a function defined on an open set, $U \subseteq \mathbb{R}^n$. Then
$h \in C^k\left(\overline{U}\right)$ if there exists a function g defined on an open set, W containing $\overline{U}$ such
that g = h on U and g is $C^k(W)$.
Lemma 10.55 There does not exist $h \in C^2\left(\overline{B(0, R)}\right)$ such that $h : \overline{B(0, R)} \to \partial B(0, R)$
which also has the property that h (x) = x for all x ∈ ∂B (0, R) . Such a
function is called a retraction.
Proof: Suppose such an h exists. Let λ ∈ [0, 1] and let pλ (x) ≡ x+λ (h (x) − x) .
This function, pλ is a homotopy of the identity map and the retraction, h. Let
\[
I(\lambda) \equiv \int_{B(0,R)} \det(Dp_\lambda(x))\,dx.
\]
Now by assumption, hi (x) = xi on ∂B (0, R) and so one can integrate by parts and
write
\[
I'(\lambda) = -\sum_i \int_{B(0,R)} \sum_j \operatorname{cof}(Dp_\lambda(x))_{ij,j}\, (h_i(x) - x_i)\,dx = 0.
\]
but
\[
I(1) = \int_{B(0,1)} \det(Dh(x))\,dm_n = \int_{\partial B(0,1)} \#(y)\,dm_n = 0
\]
because from polar coordinates or other elementary reasoning, mn (∂B (0, 1)) = 0.
This proves the lemma.
The following is the Brouwer fixed point theorem for C 2 maps.
Lemma 10.56 If $h \in C^2\left(\overline{B(0, R)}\right)$ and $h : \overline{B(0, R)} \to \overline{B(0, R)}$, then h has a
fixed point, x such that h (x) = x.
Proof: Suppose the lemma is not true. Then for all x, $|x - h(x)| \ne 0$. Then
define
\[
g(x) = h(x) + t(x) \frac{x - h(x)}{|x - h(x)|}
\]
where t (x) is nonnegative and is chosen such that g (x) ∈ ∂B (0, R) . This mapping
is illustrated in the following picture.
[Figure: the points f(x), x, and g(x) lie on one line in that order, with g(x) on the boundary of the ball.]
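The point g(x) can be computed explicitly by solving a quadratic for t(x). The Python sketch below does this for given sample values; the inputs x and hx (standing in for h(x)) and the radius R are hypothetical, and the closed form for t is simply the nonnegative root of |h(x) + t u|^2 = R^2 with u the unit vector from h(x) toward x.

```python
import math


def boundary_point(x, hx, R):
    """Return (g(x), t(x)) with g(x) = h(x) + t (x - h(x))/|x - h(x)|,
    t >= 0 and |g(x)| = R.  Assumes h(x) != x and |h(x)| <= R."""
    d = [xi - hi for xi, hi in zip(x, hx)]
    length = math.sqrt(sum(c * c for c in d))
    u = [c / length for c in d]                       # unit direction
    hu = sum(hi * ui for hi, ui in zip(hx, u))        # (h(x), u)
    hh = sum(hi * hi for hi in hx)                    # |h(x)|^2
    t = -hu + math.sqrt(hu * hu + R * R - hh)         # nonnegative root
    return [hi + t * ui for hi, ui in zip(hx, u)], t


# An interior point: the ray from h(x) through x is pushed out to the sphere.
g_inner, t_inner = boundary_point([0.2, 0.1], [-0.3, 0.4], 1.0)
# A boundary point: the construction returns x itself.
g_edge, t_edge = boundary_point([0.0, 1.0], [0.5, 0.0], 1.0)
print(g_inner, t_inner, g_edge, t_edge)
```

The second call illustrates why g restricted to the boundary is the identity, which is the property used to contradict the nonexistence of a retraction.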
Proof: If this is not so, there exists ε > 0 such that for all x ∈ B (0, R),
|x − f (x)| > ε.
then
\[
\mu_0(E) = \sum_{i=1}^{\infty} \mu_0(E_i)
\]
In this definition, µ0 is trying to be a measure and acts like one whenever pos-
sible. Under these conditions, µ0 can be extended uniquely to a complete measure,
µ, defined on a σ algebra of sets containing E such that µ agrees with µ0 on E. The
following is the main result.
µ (E) = µ0 (E)
for all E ∈ E. Also if ν is any such measure which agrees with µ0 on E, then ν = µ
on σ (E), the σ algebra generated by E.
304 SOME EXTENSION THEOREMS
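\[
\mu(S_i) + \frac{\varepsilon}{2^i} \ge \sum_{j=1}^{\infty} \mu(E_{ij}).
\]
Then
\[
\mu(S) \le \sum_i \sum_j \mu(E_{ij}) \le \sum_i \left( \mu(S_i) + \frac{\varepsilon}{2^i} \right) = \sum_i \mu(S_i) + \varepsilon.
\]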
it follows
\[
\mu(A) + \varepsilon > \sum_{i=1}^{\infty} \mu_0(E_i \cap A) \ge \mu_0(A)
\]
since $A = \cup_{i=1}^{\infty} E_i \cap A$. Therefore, µ = µ0 on E.
Consider the assertion that E ⊆ S. Let A ∈ E and let S ⊆ Ω be any set. There
exist sets $\{E_i\} \subseteq E$ such that $\cup_{i=1}^{\infty} E_i \supseteq S$ but
\[
\mu(S) + \varepsilon > \sum_{i=1}^{\infty} \mu(E_i).
\]
Then
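\[
\begin{aligned}
\mu(S) &\le \mu(S \cap A) + \mu(S \setminus A) \\
&\le \mu\left(\cup_{i=1}^{\infty} E_i \setminus A\right) + \mu\left(\cup_{i=1}^{\infty} (E_i \cap A)\right) \\
&\le \sum_{i=1}^{\infty} \mu(E_i \setminus A) + \sum_{i=1}^{\infty} \mu(E_i \cap A) = \sum_{i=1}^{\infty} \mu(E_i) < \mu(S) + \varepsilon.
\end{aligned}
\]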
Since ε is arbitrary, this shows A ∈ S.
This has proved the existence part of the theorem. To verify uniqueness, Let
M ≡ {E ∈ σ (E) : µ (E) = ν (E)} .
Then M is given to contain E and is obviously a monotone class. Therefore by
Theorem 9.57 on monotone classes, M = σ (E) and this proves the lemma.
The following lemma is also very significant.
11.2. THE TYCHONOFF THEOREM 305
Lemma 11.3 Let M be a metric space with the closed balls compact and suppose
µ is a measure defined on the Borel sets of M which is finite on compact sets.
Then there exists a unique Radon measure, $\overline{\mu}$ which equals µ on the Borel sets. In
particular µ must be both inner and outer regular on all Borel sets.
Proof: Define a positive linear functional, $\Lambda(f) = \int f\,d\mu$. Let $\overline{\mu}$ be the Radon
measure which comes from the Riesz representation theorem for positive linear
functionals. Thus for all f continuous,
\[
\int f\,d\mu = \int f\,d\overline{\mu}.
\]
and so the two measures coincide on all open sets. Every compact set is a countable
intersection of open sets and so the two measures coincide on all compact sets. Now
let B (a, n) be a ball of radius n and let E be a Borel set contained in this ball.
Then by regularity of $\overline{\mu}$ there exist sets F, G such that G is a countable intersection
of open sets and F is a countable union of compact sets such that F ⊆ E ⊆ G and
$\overline{\mu}(G \setminus F) = 0$. Now $\mu(G) = \overline{\mu}(G)$ and $\mu(F) = \overline{\mu}(F)$. Thus
\[
\mu(G \setminus F) + \mu(F) = \mu(G) = \overline{\mu}(G) = \overline{\mu}(G \setminus F) + \overline{\mu}(F)
\]
and so $\mu(G \setminus F) = \overline{\mu}(G \setminus F)$. It follows
The main tool in the study of products of compact topological spaces is the
Alexander subbasis theorem which is presented next. Recall a set is compact if
every basic open cover admits a finite subcover. This was pretty easy to prove.
However, there is a much smaller set of open sets called a subbasis which has this
property. The proof of this result is much harder.
Definition 11.5 S ⊆ τ is called a subbasis for the topology τ if the set B of finite
intersections of sets of S is a basis for the topology, τ .
Theorem 11.6 Let (X, τ ) be a topological space and let S ⊆ τ be a subbasis for
τ . Then if H ⊆ X, H is compact if and only if every open cover of H consisting
entirely of sets of S admits a finite subcover.
Proof: The only if part is obvious because the subbasic sets are themselves open.
By Lemma 6.56 on Page 6.56, if every basic open cover admits a finite subcover
then the set in question is compact. Suppose then that H is a subset of X having
the property that subbasic open covers admit finite subcovers. Is H compact?
Assume this is not so. Then what was just observed about basic covers implies
there exists a basic open cover of H, O, which admits no finite subcover. Let F be
defined as
The assumption is that F is nonempty. Partially order F by set inclusion and use
the Hausdorff maximal principle to obtain a maximal chain, C, of such open covers
and let
D = ∪C.
If D admits a finite subcover, then since C is a chain and the finite subcover has only
finitely many sets, some element of C would also admit a finite subcover, contrary
to the definition of F. Therefore, D admits no finite subcover. If D′ ⊋ D and D′
is a basic open cover of H, then D′ has a finite subcover of H since otherwise, C
would fail to be a maximal chain, being properly contained in C ∪ {D′}. Every set
of D is of the form
\[
U = \cap_{i=1}^{m} B_i, \quad B_i \in S
\]
because they are all basic open sets. If it is the case that for all U ∈ D one of the
Bi is found in D, then replace each such U with the subbasic set from D containing
it. But then this would be a subbasic open cover of H which by assumption would
admit a finite subcover contrary to the properties of D. Therefore, one of the sets
of D, denoted by U , has the property that
\[
U = \cap_{i=1}^{m} B_i, \quad B_i \in S
\]
and no Bi is in D. Thus D ∪ {Bi } admits a finite subcover, for each of the above
Bi because it is strictly larger than D. Let this finite subcover corresponding to Bi
be denoted by
\[
V_1^i, \cdots, V_{m_i}^i, B_i
\]
Consider
\[
\{U, V_j^i,\ j = 1, \cdots, m_i,\ i = 1, \cdots, m\}.
\]
If $p \in H \setminus \cup\{V_j^i\}$, then $p \in B_i$ for each i and so p ∈ U . This is therefore a finite
subcover of D contradicting the properties of D. Therefore, F must be empty and
by Lemma 6.56, this proves the theorem.
Let I be a set and suppose for each i ∈ I, $(X_i, \tau_i)$ is a nonempty topological
space. The Cartesian product of the $X_i$, denoted by $\prod_{i \in I} X_i$, consists of the set
of all choice functions defined on I which select a single element of each $X_i$. Thus
$f \in \prod_{i \in I} X_i$ means for every i ∈ I, $f(i) \in X_i$. The axiom of choice says $\prod_{i \in I} X_i$
is nonempty. Let
\[
P_j(A) = \prod_{i \in I} B_i
\]
Proof: By the Alexander subbasis theorem, the theorem will be proved if every
subbasic open cover admits a finite subcover. Therefore, let O be a subbasic open
cover of $\prod_{i \in I} X_i$. Let
Thus Oj consists of those sets of O which have a possibly proper subset of Xi only
in the slot i = j. Let
π j Oj = {A : Pj (A) ∈ Oj }.
Thus π j Oj picks out those proper open subsets of Xj which occur in Oj .
Xj = ∪π j Oj
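where
\[
F_t = \begin{cases} E_t & \text{if } t \in J \\ M_t' & \text{if } t \notin J \end{cases}
\]
Thus $\gamma_J E$ leaves alone $E_t$ for t ∈ J and changes the other $E_t$ into $M_t'$. If $\gamma_J E = E$,
then this means $E_t = M_t'$ for all $t \notin J$. Also define for J a subset of I,
\[
\pi_J x \equiv \prod_{t \in J} x_t
\]
so $\pi_J$ is a continuous mapping from $\prod_{t \in I} M_t'$ to $\prod_{t \in J} M_t'$.
\[
\pi_J E \equiv \prod_{t \in J} E_t.
\]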
11.3. KOLMOGOROV EXTENSION THEOREM 309
then from what was just shown G contains the open sets. It is also clearly a σ
algebra. Hence G equals the Borel sets of $\prod_{t \in J} M_t'$.
It only remains to verify that any Borel set in $\prod_{t \in J} M_t$ is the intersection of a
Borel set of $\prod_{t \in J} M_t'$ with $\prod_{t \in J} M_t$. Let
\[
H \equiv \left\{ F \text{ Borel in } \prod_{t \in J} M_t \text{ such that } F = F' \cap \prod_{t \in J} M_t,\ F' \text{ Borel in } \prod_{t \in J} M_t' \right\}
\]
From the first part of the argument, H contains the open sets. Now let $\{F_n\}$
be a sequence in H. Thus $F_n = F_n' \cap \prod_{t \in J} M_t$ where $F_n'$ is Borel in $\prod_{t \in J} M_t'$.
Then $\cup_n F_n = \cup_n F_n' \cap \prod_{t \in J} M_t$ and $\cup_n F_n'$ is Borel in $\prod_{t \in J} M_t'$. Thus H is closed
under countable unions. Next let F ∈ H and $F = F' \cap \prod_{t \in J} M_t$. Then
$F'^C \equiv \prod_{t \in J} M_t' \setminus F'$ is Borel in $\prod_{t \in J} M_t'$ and $\prod_{t \in J} M_t \setminus F = F'^C \cap \prod_{t \in J} M_t$.
Thus H is a σ algebra containing the open sets and so H equals the Borel sets in
$\prod_{t \in J} M_t$. This proves this wretched little lemma.
With this preparation here is the Kolmogorov extension theorem. In the state-
ment and proof of the theorem, Fi , Gi , and Ei will denote Borel sets. Any list of
indices from I will always be assumed to be taken in order. Thus, if J ⊆ I and
J = (t1 , · · ·, tn ) , it will always be assumed t1 < t2 < · · · < tn .
J = (t1 , · · ·, tn ) ⊆ I,
(t1 , · · ·, tn ) ⊆ (s1 , · · ·, sp ) ,
then
\[
\nu_{t_1 \cdots t_n}(F_{t_1} \times \cdots \times F_{t_n}) = \nu_{s_1 \cdots s_p}\left(G_{s_1} \times \cdots \times G_{s_p}\right) \tag{11.1}
\]
where if si = tj , then Gsi = Ftj and if si is not equal to any of the indices, tk ,
then Gsi = Msi . Then there exists a probability space, (Ω, P, F) and measurable
functions, Xt : Ω → Mt for each t ∈ I such that for each (t1 · · · tn ) ⊆ I,
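\[
J = (s_1, \cdots, s_p) \supseteq \cup_{k=1}^{m} J_k
\]
define
\[
P_0(E) \equiv \sum_{k=1}^{m} \nu_{s_1 \cdots s_p}\left(G_{s_1}^k \times \cdots \times G_{s_p}^k\right)
\]
where $G_{s_i}^k = E_{t_j^k}^k$ in case $s_i = t_j^k$ and $M_{s_i}'$ otherwise. By 11.1 this is well defined
and equals
\[
\sum_{k=1}^{m} \nu_{t_1^k \cdots t_{m_k}^k}\left(E_{t_1^k}^k \times \cdots \times E_{t_{m_k}^k}^k\right).
\]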
P0 is clearly finitely additive because the ν J are measures and one can pick J as
large as desired. Also, from the definition,
à !
Y ¡ ¢
P0 Mt = ν t1 Mt01 = 1.
0
t∈I
Next I will show P0 is a finite measure on E. From this it is only a matter of using
the Caratheodory extension theorem.
Claim: If En ↓ ∅, then P0 (En ) ↓ 0.
Proof of the claim: If not, there exists a sequence such that although $E_n \downarrow \emptyset$,
$P_0(E_n) \downarrow \varepsilon > 0$. Since each of the $\nu_{s_1 \cdots s_m}$ is inner regular, there exists a compact
set, $K_n \subseteq \pi_J(E_n)$ for suitably large J such that, if $K_n' \subseteq \prod_{t \in I} M_t'$ is defined by
$\gamma_J(K_n') = K_n'$ and $\pi_J(K_n') = K_n$, then $P_0(E_n \setminus K_n') < \varepsilon / 2^{n+1}$. (Less precisely,
you get $K_n'$ by filling in all the slots other than in J with the appropriate $M_t'$.)
Thus by Tychonoff’s theorem, $K_n'$ is compact. The interesting thing about these
$K_n'$ is they have the finite intersection property. Here is why.
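\[
\begin{aligned}
\varepsilon &\le P_0\left(\cap_{k=1}^{m} K_k'\right) + P_0\left(E_m \setminus \cap_{k=1}^{m} K_k'\right) \\
&\le P_0\left(\cap_{k=1}^{m} K_k'\right) + P_0\left(\cup_{k=1}^{m} E_k \setminus K_k'\right) \\
&< P_0\left(\cap_{k=1}^{m} K_k'\right) + \sum_{k=1}^{\infty} \frac{\varepsilon}{2^{k+1}} < P_0\left(\cap_{k=1}^{m} K_k'\right) + \varepsilon
\end{aligned}
\]
and so $P_0\left(\cap_{k=1}^{m} K_k'\right) > 0$. Now this yields a contradiction because this finite
intersection property implies the intersection of all the $K_n'$ is nonempty contradicting
$E_n \downarrow \emptyset$ since each $K_n'$ is contained in $E_n$.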
With the claim, it follows $P_0$ is a measure on E. Here is why: If $E = \cup_{k=1}^{\infty} E_k$
where $E, E_k \in E$, then $\left(E \setminus \cup_{k=1}^{n} E_k\right) \downarrow \emptyset$ and so
\[
P_0\left(\cup_{k=1}^{n} E_k\right) \to P_0(E).
\]
Hence if the $E_k$ are disjoint, $P_0\left(\cup_{k=1}^{n} E_k\right) = \sum_{k=1}^{n} P_0(E_k) \to P_0(E)$.
Now to conclude the proof, apply the Caratheodory extension theorem to obtain
P a probability measure which extends $P_0$ to σ (E) the sigma algebra generated by
E. Let $S \equiv \left\{ E \cap \prod_{t \in I} M_t : E \in \sigma(E) \right\}$. It follows $\left(\prod_{t \in I} M_t, S, P\right)$ is a probability
measure space with the property that when $\gamma_J(E) = E$ for $J = (t_1 \cdots t_n)$ a finite
subset of I, $P(E) = P_0(E) = \nu_{t_1 \cdots t_n}(E_{t_1} \times \cdots \times E_{t_n})$.
For the last part, let $\left(\prod_{t \in I} M_t, S, P\right)$ be the probability space and for
$x \in \prod_{t \in I} M_t$ let $X_t(x) = x_t$, the $t^{th}$ entry of x. ($x_t = \pi_t x$). It follows $X_t$ is measurable
because if U is open in $M_t$, then $X_t^{-1}(U)$ has a U in the $t^{th}$ slot and $M_s$ everywhere
else for $s \ne t$ so this is actually in E. Also, letting $(t_1 \cdots t_n)$ be a finite subset of I
and $F_{t_1}, \cdots, F_{t_n}$ be Borel sets in $M_{t_1} \cdots M_{t_n}$ respectively,
Then these measures satisfy the necessary consistency condition and so the Kolmogorov
extension theorem given above can be applied to obtain a measure, P
defined on a $\left(\prod_{t \in I} M_t, F\right)$ and measurable functions $X_s : \prod_{t \in I} M_t \to M_s$ such that
for $F_{t_i}$ a Borel set in $M_{t_i}$,
\[
P\left( (X_{t_1}, \cdots, X_{t_n}) \in \prod_{i=1}^{n} F_{t_i} \right) = \nu_{t_1 \cdots t_n}(F_{t_1} \times \cdots \times F_{t_n})
\]
11.4 Exercises
1. Let (X, S, µ) and (Y, F, λ) be two finite measure spaces. A subset of X × Y
is called a measurable rectangle if it is of the form A × B where A ∈ S and
B ∈ F. A subset of X × Y is called an elementary set if it is a finite disjoint
union of measurable rectangles. Denote the collection of elementary sets by E. Show that
E is an algebra of sets.
and that
\[
\int \int \mathcal{X}_A(x, y)\,d\mu\,d\lambda = \int \int \mathcal{X}_A(x, y)\,d\lambda\,d\mu. \tag{11.4}
\]
Hint: Let M ≡ {A ∈ σ (E) : 11.4 holds} along with all relevant measurability
assertions. Show M contains E and is a monotone class. Then apply the
Theorem 9.57.
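3. ↑For A ∈ σ (E) define $(\mu \times \lambda)(A) \equiv \int \int \mathcal{X}_A(x, y)\,d\mu\,d\lambda$. Show that (µ × λ) is
a measure on σ (E) and that whenever f ≥ 0 is measurable with respect to
σ (E) ,
\[
\int_{X \times Y} f\,d(\mu \times \lambda) = \int \int f(x, y)\,d\mu\,d\lambda = \int \int f(x, y)\,d\lambda\,d\mu.
\]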
4. ↑Generalize the above version of Fubini’s theorem to the case where the mea-
sure spaces are only σ finite.
5. ↑Suppose now that µ and λ are both complete σ finite measures. Let $\overline{(\mu \times \lambda)}$
denote the completion of this measure. Let the larger measure space be
$\left(X \times Y, \overline{\sigma(E)}, \overline{(\mu \times \lambda)}\right)$. Thus if $E \in \overline{\sigma(E)}$, it follows there exists a set
A ∈ σ (E) such that E ∪ N = A where $\overline{(\mu \times \lambda)}(N) = 0$. Now argue that for λ
a.e. y, $x \to \mathcal{X}_N(x, y)$ is measurable because it is equal to zero µ a.e. and µ is
complete. Therefore,
\[
\int \int \mathcal{X}_N(x, y)\,d\mu\,d\lambda
\]
makes sense and equals zero. Use to argue that for λ a.e. y, $x \to \mathcal{X}_E(x, y)$
is µ measurable and equals $\int \mathcal{X}_A(x, y)\,d\mu$. Then by completeness of λ,
$y \to \int \mathcal{X}_E(x, y)\,d\mu$ is λ measurable and
\[
\int \int \mathcal{X}_A(x, y)\,d\mu\,d\lambda = \int \int \mathcal{X}_E(x, y)\,d\mu\,d\lambda = \overline{(\mu \times \lambda)}(E).
\]
Similarly
\[
\int \int \mathcal{X}_E(x, y)\,d\lambda\,d\mu = \overline{(\mu \times \lambda)}(E).
\]
Use this to give a generalization of the above Fubini theorem. Prove that if f
is measurable with respect to the σ algebra, $\overline{\sigma(E)}$ and nonnegative, then
\[
\int_{X \times Y} f\,d\overline{(\mu \times \lambda)} = \int \int f(x, y)\,d\mu\,d\lambda = \int \int f(x, y)\,d\lambda\,d\mu
\]
316 THE LP SPACES
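[Figure: the graph of x = t^(p−1), equivalently t = x^(q−1), in the (t, x) plane, with a marked on the t axis and b marked on the x axis.]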
From this picture, the sum of the area between the x axis and the curve added to
the area between the t axis and the curve is at least as large as ab. Using beginning
calculus, this is equivalent to the following inequality.
\[
ab \le \int_0^a t^{p-1}\,dt + \int_0^b x^{q-1}\,dx = \frac{a^p}{p} + \frac{b^q}{q}.
\]
The above picture represents the situation which occurs when p > 2 because the
graph of the function is concave up. If 2 ≥ p > 1 the graph would be concave down
or a straight line. You should verify that the same argument holds in these cases
just as well. In fact, the only thing which matters in the above inequality is that
the function $x = t^{p-1}$ be strictly increasing.
Note equality occurs when $a^p = b^q$.
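The inequality ab ≤ a^p/p + b^q/q and its equality case are simple to probe numerically. The following Python sketch evaluates the gap between the two sides; the function name and the sample values are illustrative choices, with q computed from 1/p + 1/q = 1.

```python
def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b for a, b >= 0 and p > 1,
    where q is the conjugate exponent p/(p - 1)."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b


# The gap is nonnegative on a small grid of sample values.
grid = [0.1 * k for k in range(1, 40)]
smallest = min(young_gap(a, b, 3.0) for a in grid for b in grid)

# Equality case: with p = 3, q = 3/2 and a = 2, the choice b = 4
# satisfies a^p = b^q = 8, so the gap vanishes.
equality_gap = young_gap(2.0, 4.0, 3.0)
print(smallest, equality_gap)
```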
Here is an alternate proof.
Then $f'(a) = a^{p-1} - b$. This is negative when $a < b^{1/(p-1)}$ and is positive when
$a > b^{1/(p-1)}$. Therefore, f has a minimum when $a = b^{1/(p-1)}$. In other words, when
$a^p = b^{p/(p-1)} = b^q$ since 1/p + 1/q = 1. Thus the minimum value of f is
\[
\frac{b^q}{p} + \frac{b^q}{q} - b^{1/(p-1)} b = b^q - b^q = 0.
\]
It follows f ≥ 0 and this yields the desired inequality.
Proof of Holder’s inequality: If either $\int |f|^p\,d\mu$ or $\int |g|^q\,d\mu$ equals ∞, the
inequality 12.1 is obviously valid because ∞ ≥ anything. If either $\int |f|^p\,d\mu$ or
$\int |g|^q\,d\mu$ equals 0, then f = 0 a.e. or g = 0 a.e. and so in this case the left side
of the inequality equals 0 and so the inequality is therefore true. Therefore assume
both $\int |f|^p\,d\mu$ and $\int |g|^q\,d\mu$ are less than ∞ and not equal to 0. Let
\[
\left( \int |f|^p\,d\mu \right)^{1/p} = I(f)
\]
and let $\left( \int |g|^q\,d\mu \right)^{1/q} = I(g)$. Then using the lemma,
\[
\int \frac{|f|}{I(f)} \frac{|g|}{I(g)}\,d\mu \le \frac{1}{p} \int \frac{|f|^p}{I(f)^p}\,d\mu + \frac{1}{q} \int \frac{|g|^q}{I(g)^q}\,d\mu = 1.
\]
Hence,
\[
\int |f|\,|g|\,d\mu \le I(f)\,I(g) = \left( \int |f|^p\,d\mu \right)^{1/p} \left( \int |g|^q\,d\mu \right)^{1/q}.
\]
[Figure: the midpoint m = (|x| + |y|)/2 marked between |x| and |y| on the horizontal axis.]
Now as shown above,
\[
\left( \frac{|x| + |y|}{2} \right)^p \le \frac{|x|^p + |y|^p}{2}
\]
which implies
\[
|x + y|^p \le (|x| + |y|)^p \le 2^{p-1} \left( |x|^p + |y|^p \right)
\]
and this proves the lemma.
Note that if y = φ (x) is any function for which the graph of φ is concave up,
you could get a similar inequality by the same argument.
and $\left( \int |f + g|^p\,d\mu \right)^{1/p} \ne 0$ or there is nothing to prove. Therefore, using the above
lemma,
\[
\int |f + g|^p\,d\mu \le 2^{p-1} \left( \int |f|^p + |g|^p\,d\mu \right) < \infty.
\]
Now $|f(\omega) + g(\omega)|^p \le |f(\omega) + g(\omega)|^{p-1} (|f(\omega)| + |g(\omega)|)$. Also, it follows from the
definition of p and q that $p - 1 = \frac{p}{q}$. Therefore, using this and Holder’s inequality,
\[
\begin{aligned}
\int |f + g|^p\,d\mu &\le \int |f + g|^{p-1} |f|\,d\mu + \int |f + g|^{p-1} |g|\,d\mu \\
&= \int |f + g|^{p/q} |f|\,d\mu + \int |f + g|^{p/q} |g|\,d\mu \\
&\le \left( \int |f + g|^p\,d\mu \right)^{1/q} \left( \int |f|^p\,d\mu \right)^{1/p} + \left( \int |f + g|^p\,d\mu \right)^{1/q} \left( \int |g|^p\,d\mu \right)^{1/p}.
\end{aligned}
\]
Dividing both sides by $\left( \int |f + g|^p\,d\mu \right)^{1/q}$ yields 12.2. This proves the corollary.
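The resulting triangle inequality for the p norm, together with the exponent identity p − 1 = p/q used in the proof, can be checked on finite sums. As before, counting measure on a finite set is assumed here purely for illustration, and the sample vectors are made up.

```python
def lp_norm(values, p):
    """(sum |v|^p)^(1/p), the L^p norm for counting measure."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


p = 3.0
q = p / (p - 1.0)                 # conjugate exponent, here 3/2
f = [1.0, -2.0, 3.0]
g = [0.5, 4.0, -1.0]
f_plus_g = [a + b for a, b in zip(f, g)]

norm_of_sum = lp_norm(f_plus_g, p)
sum_of_norms = lp_norm(f, p) + lp_norm(g, p)
print(norm_of_sum, sum_of_norms, p - 1.0, p / q)
```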
The following follows immediately from the above.
b.) $\left( \int |af|^p\,d\mu \right)^{1/p} = |a| \left( \int |f|^p\,d\mu \right)^{1/p}$ if a is a scalar.
c.) $\left( \int |f + g|^p\,d\mu \right)^{1/p} \le \left( \int |f|^p\,d\mu \right)^{1/p} + \left( \int |g|^p\,d\mu \right)^{1/p}$.
$f \to \left( \int |f|^p\,d\mu \right)^{1/p}$ would define a norm if $\left( \int |f|^p\,d\mu \right)^{1/p} = 0$ implied f = 0.
Unfortunately, this is not so because if f = 0 a.e. but is nonzero on a set of
measure zero, $\left( \int |f|^p\,d\mu \right)^{1/p} = 0$ and this is not allowed. However, all the other
properties of a norm are available and so a little thing like a set of measure zero
will not prevent the consideration of Lp as a normed vector space if two functions
in Lp which differ only on a set of measure zero are considered the same. That is,
an element of Lp is really an equivalence class of functions where two functions are
equivalent if they are equal a.e. With this convention, here is a definition.
Then with this definition and using the convention that elements in Lp are
considered to be the same if they differ only on a set of measure zero, || ||p is a
norm on Lp (Ω) because if ||f ||p = 0 then f = 0 a.e. and so f is considered to be
the zero function because it differs from 0 only on a set of measure zero.
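For a concrete feel for these properties, take Ω to be a finite set with counting measure, so that integrals become finite sums. The sketch below is an illustration under that assumption only; the vectors `f`, `g` and the helper `p_norm` are made up here and are not part of the text.

```python
def p_norm(v, p):
    """(sum |v_i|^p)^(1/p): the L^p norm for counting measure on a finite set."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

f = [1.0, -2.0, 3.0, 0.5]
g = [0.5, 4.0, -1.0, 2.0]
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1

# Holder: sum |f g| <= ||f||_p ||g||_q
holder_lhs = sum(abs(a * b) for a, b in zip(f, g))
holder_rhs = p_norm(f, p) * p_norm(g, q)

# Minkowski: ||f + g||_p <= ||f||_p + ||g||_p
mink_lhs = p_norm([a + b for a, b in zip(f, g)], p)
mink_rhs = p_norm(f, p) + p_norm(g, p)

print(holder_lhs <= holder_rhs, mink_lhs <= mink_rhs)
```

With counting measure a set of measure zero is empty, so in this special case the norm vanishes only for the zero vector and no equivalence classes are needed.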
The following is an important definition.
Proof: Let $\{f_n\}$ be a Cauchy sequence in Lp (Ω). This means that for every ε > 0 there exists N such that if n, m ≥ N, then $||f_n - f_m||_p < \varepsilon$. Now select a subsequence as follows. Let $n_1$ be such that $||f_n - f_m||_p < 2^{-1}$ whenever $n, m \ge n_1$.
1 These spaces are named after Stefan Banach, 1892-1945. Banach spaces are the basic item of
study in the subject of functional analysis and will be considered later in this book.
There is a recent biography of Banach, R. Kałuża, The Life of Stefan Banach, (A. Kostant and
W. Woyczyński, translators and editors) Birkhäuser, Boston (1996). More information on Banach
can also be found in a recent short article written by Douglas Henderson who is in the department
of chemistry and biochemistry at BYU.
Banach was born in Austria, worked in Poland and died in the Ukraine but never moved. This
is because borders kept changing. There is a rumor that he died in a German concentration camp
which is apparently not true. It seems he died after the war of lung cancer.
He was an interesting character. He hated taking examinations so much that he did not receive
his undergraduate university degree. Nevertheless, he did become a professor of mathematics due
to his important research. He and some friends would meet in a cafe called the Scottish cafe where
they wrote on the marble table tops until Banach’s wife supplied them with a notebook which
became the "Scottish notebook" and was eventually published.
Let $n_2$ be such that $n_2 > n_1$ and $||f_n - f_m||_p < 2^{-2}$ whenever $n, m \ge n_2$. If $n_1, \cdots, n_k$ have been chosen, let $n_{k+1} > n_k$ and whenever $n, m \ge n_{k+1}$, $||f_n - f_m||_p < 2^{-(k+1)}$. The subsequence just mentioned is $\{f_{n_k}\}$. Thus, $||f_{n_k} - f_{n_{k+1}}||_p < 2^{-k}$. Let
for all m and so the monotone convergence theorem implies that the sum up to m
in 12.3 can be replaced by a sum up to ∞. Thus,
\[ \int \left( \sum_{k=1}^{\infty} |g_{k+1}| \right)^p d\mu < \infty \]
which requires
\[ \sum_{k=1}^{\infty} |g_{k+1}(x)| < \infty \quad \text{a.e. } x. \]
Therefore, $\sum_{k=1}^{\infty} g_{k+1}(x)$ converges for a.e. x because the functions have values in a complete space, C, and this shows the partial sums form a Cauchy sequence. Now let x be such that this sum is finite. Then define
\[ f(x) \equiv f_{n_1}(x) + \sum_{k=1}^{\infty} g_{k+1}(x) = \lim_{m \to \infty} f_{n_m}(x) \]
since $\sum_{k=1}^{m} g_{k+1}(x) = f_{n_{m+1}}(x) - f_{n_1}(x)$. Therefore there exists a set, E having measure zero such that
\[ \lim_{k \to \infty} f_{n_k}(x) = f(x) \]
for all $x \notin E$. Redefine $f_{n_k}$ to equal 0 on E and let f (x) = 0 for x ∈ E. It then follows that $\lim_{k \to \infty} f_{n_k}(x) = f(x)$ for all x. By Fatou's lemma, and the Minkowski inequality,
\[ ||f - f_{n_k}||_p = \left( \int |f - f_{n_k}|^p \, d\mu \right)^{1/p} \le \liminf_{m \to \infty} \left( \int |f_{n_m} - f_{n_k}|^p \, d\mu \right)^{1/p} = \liminf_{m \to \infty} ||f_{n_m} - f_{n_k}||_p \le \]
12.1. BASIC INEQUALITIES AND PROPERTIES 321
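\[ \liminf_{m \to \infty} \sum_{j=k}^{m-1} \left|\left| f_{n_{j+1}} - f_{n_j} \right|\right|_p \le \sum_{i=k}^{\infty} \left|\left| f_{n_{i+1}} - f_{n_i} \right|\right|_p \le 2^{-(k-1)}. \tag{12.4} \]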
Lemma 12.11 Let (X, S, µ) and (Y, F, λ) be finite complete measure spaces and
let f be µ × λ measurable and uniformly bounded. Then the following inequality is
valid for p ≥ 1.
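\[ \int_X \left( \int_Y |f(x,y)|^p \, d\lambda \right)^{1/p} d\mu \ge \left( \int_Y \left( \int_X |f(x,y)| \, d\mu \right)^p d\lambda \right)^{1/p}. \tag{12.5} \]
\[ \left( \int_Y \left( \int_X |f(x,y)| \, d\mu \right)^p d\lambda \right)^{1/p} < \infty. \]
Let
\[ J(y) = \int_X |f(x,y)| \, d\mu. \]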
Note there is no problem in writing this for a.e. y because f is product measurable.
Then by Fubini’s theorem,
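\[ \int_Y \left( \int_X |f(x,y)| \, d\mu \right)^p d\lambda = \int_Y J(y)^{p-1} \int_X |f(x,y)| \, d\mu \, d\lambda = \int_X \int_Y J(y)^{p-1} |f(x,y)| \, d\lambda \, d\mu \]
Now apply Holder's inequality in the last integral above and recall $p - 1 = \frac{p}{q}$. This yields
\[ \int_Y \left( \int_X |f(x,y)| \, d\mu \right)^p d\lambda \le \int_X \left( \int_Y J(y)^p \, d\lambda \right)^{1/q} \left( \int_Y |f(x,y)|^p \, d\lambda \right)^{1/p} d\mu \]
\[ = \left( \int_Y J(y)^p \, d\lambda \right)^{1/q} \int_X \left( \int_Y |f(x,y)|^p \, d\lambda \right)^{1/p} d\mu \]
\[ = \left( \int_Y \left( \int_X |f(x,y)| \, d\mu \right)^p d\lambda \right)^{1/q} \int_X \left( \int_Y |f(x,y)|^p \, d\lambda \right)^{1/p} d\mu. \tag{12.6} \]
Therefore, dividing both sides by the first factor in the above expression,
\[ \left( \int_Y \left( \int_X |f(x,y)| \, d\mu \right)^p d\lambda \right)^{1/p} \le \int_X \left( \int_Y |f(x,y)|^p \, d\lambda \right)^{1/p} d\mu. \tag{12.7} \]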
Note that 12.7 holds even if the first factor of 12.6 equals zero. This proves the
lemma.
Now consider the case where f is not assumed to be bounded and where the
measure spaces are σ finite.
Theorem 12.12 Let (X, S, µ) and (Y, F, λ) be σ-finite measure spaces and let f
be product measurable. Then the following inequality is valid for p ≥ 1.
\[ \int_X \left( \int_Y |f(x,y)|^p \, d\lambda \right)^{1/p} d\mu \ge \left( \int_Y \left( \int_X |f(x,y)| \, d\mu \right)^p d\lambda \right)^{1/p}. \tag{12.8} \]
Proof: Since the two measure spaces are σ finite, there exist measurable sets,
Xm and Yk such that Xm ⊆ Xm+1 for all m, Yk ⊆ Yk+1 for all k, and µ (Xm ) , λ (Yk ) <
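∞. Now define
\[ f_n(x,y) \equiv \begin{cases} f(x,y) & \text{if } |f(x,y)| \le n \\ n & \text{if } |f(x,y)| > n. \end{cases} \]
Thus $f_n$ is uniformly bounded and product measurable. By the above lemma,
\[ \int_{X_m} \left( \int_{Y_k} |f_n(x,y)|^p \, d\lambda \right)^{1/p} d\mu \ge \left( \int_{Y_k} \left( \int_{X_m} |f_n(x,y)| \, d\mu \right)^p d\lambda \right)^{1/p}. \tag{12.9} \]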
Now observe that |fn (x, y)| increases in n and the pointwise limit is |f (x, y)|. There-
fore, using the monotone convergence theorem in 12.9 yields the same inequality
with f replacing fn . Next let k → ∞ and use the monotone convergence theorem
again to replace Yk with Y . Finally let m → ∞ in what is left to obtain 12.8. This
proves the theorem.
12.2. DENSITY CONSIDERATIONS 323
Note that the proof of this theorem depends on two manipulations, the inter-
change of the order of integration and Holder’s inequality. Note that there is nothing
to check in the case of double sums. Thus if aij ≥ 0, it is always the case that
\[ \left( \sum_j \left( \sum_i a_{ij} \right)^p \right)^{1/p} \le \sum_i \left( \sum_j a_{ij}^p \right)^{1/p} \]
because the integrals in this case are just sums and (i, j) → aij is measurable.
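The double-sum form is easy to test on a small array. This is a minimal sketch, assuming a nonnegative matrix stored as a list of rows with `a[i][j]` playing the role of $a_{ij}$; the function names and the sample matrix are illustrative only.

```python
def norm_of_sums(a, p):
    """Left side: (sum_j (sum_i a_ij)^p)^(1/p)."""
    cols = range(len(a[0]))
    return sum(sum(row[j] for row in a) ** p for j in cols) ** (1.0 / p)

def sum_of_norms(a, p):
    """Right side: sum_i (sum_j a_ij^p)^(1/p)."""
    return sum(sum(t ** p for t in row) ** (1.0 / p) for row in a)

# Rows are indexed by i and columns by j; all entries are nonnegative.
a = [[1.0, 2.0, 0.0],
     [3.0, 0.5, 4.0]]

for p in (1.0, 2.0, 3.5):
    print(p, norm_of_sums(a, p), sum_of_norms(a, p))
```

For p = 1 both sides reduce to the sum of all the entries, so the inequality is an equality in that case.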
The Lp spaces have many important properties.
Proof: Recall that a function, f , having values in R can be written in the form
$f = f^+ - f^-$ where
Definition 12.14 Let (Ω, S, µ) be a measure space and suppose (Ω, τ ) is also a
topological space. Then (Ω, S, µ) is called a regular measure space if the σ algebra
of Borel sets is contained in S and for all E ∈ S,
Lemma 12.15 Let Ω be a metric space in which the closed balls are compact and
let K be a compact subset of V , an open set. Then there exists a continuous function
f : Ω → [0, 1] such that f (x) = 1 for all x ∈ K and spt(f ) is a compact subset of
V . That is, K ≺ f ≺ V.
and so W , being a finite union of compact sets is itself a compact set. Also, from
the construction
\[ W \subseteq \cup_{i=1}^{m} B(x_i, r_{x_i}). \]
Define f by
\[ f(x) = \frac{\mathrm{dist}(x, W^C)}{\mathrm{dist}(x, K) + \mathrm{dist}(x, W^C)}. \]
It is clear that f is continuous if the denominator is always nonzero. But this is clear because if $x \in W^C$ there must be a ball B (x, r) such that this ball does not intersect K. Otherwise, x would be a limit point of K and since K is closed, x ∈ K. However, $x \notin K$ because K ⊆ W.
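To see the formula for f in action, here is a one-dimensional sketch. The choices Ω = R, K = [0, 1] and W = (−1, 2) are assumptions made purely for illustration, as are the helper names.

```python
def dist_to_K(x):
    """Distance from x to K = [0, 1]."""
    return max(0.0, -x, x - 1.0)

def dist_to_W_complement(x):
    """Distance from x to the complement of W = (-1, 2)."""
    return max(0.0, min(x + 1.0, 2.0 - x))

def f(x):
    """f = dist(x, W^C) / (dist(x, K) + dist(x, W^C)); the denominator never vanishes."""
    d_wc = dist_to_W_complement(x)
    return d_wc / (dist_to_K(x) + d_wc)

for x in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0):
    print(x, f(x))
```

The values equal 1 on K, equal 0 off W, and lie strictly between 0 and 1 on the rest of W, as the construction requires.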
It is not necessary to be in a metric space to do this. You can accomplish the
same thing using Urysohn’s lemma.
It follows that for each s a simple function in Lp (Ω) , there exists h ∈ Cc (Ω) such
that ||s − h||p < ε. This is because if
\[ s(x) = \sum_{i=1}^{m} c_i \mathcal{X}_{E_i}(x) \]
is a simple function in Lp where the $c_i$ are the distinct nonzero values of s each $\mu(E_i) < \infty$ since otherwise $s \notin L^p$ due to the inequality
\[ \int |s|^p \, d\mu \ge |c_i|^p \mu(E_i). \]
By Theorem 12.13, simple functions are dense in Lp (Ω), and so this proves the
Theorem.
12.3 Separability
Theorem 12.17 For p ≥ 1 and µ a Radon measure, Lp (Rn , µ) is separable. Recall
this means there exists a countable set, D, such that if f ∈ Lp (Rn , µ) and ε > 0,
there exists g ∈ D such that ||f − g||p < ε.
and both ai , bi are rational, while c has rational real and imaginary parts. Let D be
the set of all finite sums of functions in Q. Thus, D is countable. In fact D is dense in
Lp (Rn , µ). To prove this it is necessary to show that for every f ∈ Lp (Rn , µ), there
exists an element of D, s such that ||s − f ||p < ε. If it can be shown that for every
g ∈ Cc (Rn ) there exists h ∈ D such that ||g − h||p < ε, then this will suffice because
if f ∈ Lp (Rn ) is arbitrary, Theorem 12.16 implies there exists g ∈ Cc (Rn ) such
that $||f - g||_p \le \frac{\varepsilon}{2}$ and then there would exist h ∈ D such that $||h - g||_p < \frac{\varepsilon}{2}$.
By the triangle inequality,
\[ |f(a_i) - c_i^m| < 2^{-m}, \qquad |c_i^m| \le |f(a_i)|. \tag{12.10} \]
Let
\[ s_m(x) = \sum_{i=1}^{\infty} c_i^m \mathcal{X}_{[a_i, b_i)}(x). \]
Since f (ai ) = 0 except for finitely many values of i, the above is a finite sum. Then
12.10 implies sm ∈ D. If sm converges uniformly to f then it follows ||sm − f ||p → 0
because |sm | ≤ |f | and so
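\[ ||s_m - f||_p = \left( \int |s_m - f|^p \, d\mu \right)^{1/p} = \left( \int_{\mathrm{spt}(f)} |s_m - f|^p \, d\mu \right)^{1/p} \le \left[ \varepsilon \, m_n(\mathrm{spt}(f)) \right]^{1/p} \]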
whenever m is large enough.
Since f ∈ Cc (Rn ) it follows that f is uniformly continuous and so given ε > 0
there exists δ > 0 such that if |x − y| < δ, |f (x) − f (y)| < ε/2. Now let m be large
enough that every box in Pm has diameter less than δ and also that 2−m < ε/2.
Then if [ai , bi ) is one of these boxes of Pm , and x ∈ [ai , bi ),
\[ |f(x) - f(a_i)| < \varepsilon/2 \]
and
\[ |f(a_i) - c_i^m| < 2^{-m} < \varepsilon/2. \]
Therefore, using the triangle inequality, it follows that
\[ |f(x) - c_i^m| = |s_m(x) - f(x)| < \varepsilon \]
and since x is arbitrary, this establishes uniform convergence. This proves the
theorem.
Here is an easier proof if you know the Weierstrass approximation theorem.
Theorem 12.18 For p ≥ 1 and µ a Radon measure, Lp (Rn , µ) is separable. Recall
this means there exists a countable set, D, such that if f ∈ Lp (Rn , µ) and ε > 0,
there exists g ∈ D such that ||f − g||p < ε.
Proof: Let P denote the set of all polynomials which have rational coefficients.
Then P is countable. Let $\tau_k \in C_c\left( (-(k+1), (k+1))^n \right)$ such that $[-k, k]^n \prec \tau_k \prec (-(k+1), (k+1))^n$. Let $D_k$ denote the functions which are of the form $p \tau_k$ where p ∈ P. Thus $D_k$ is also countable. Let $D \equiv \cup_{k=1}^{\infty} D_k$. It follows each function in D is in Cc (Rn) and so is in Lp (Rn, µ). Let f ∈ Lp (Rn, µ). By regularity of µ there exists g ∈ Cc (Rn) such that $||f - g||_{L^p(R^n, \mu)} < \frac{\varepsilon}{3}$. Let k be such that $\mathrm{spt}(g) \subseteq (-k, k)^n$. Now by the Weierstrass approximation theorem there exists a polynomial q such that
\[ ||g - q||_{[-(k+1), k+1]^n} \equiv \sup \left\{ |g(x) - q(x)| : x \in [-(k+1), (k+1)]^n \right\} < \frac{\varepsilon}{3 \mu\left( (-(k+1), k+1)^n \right)}. \]
12.4. CONTINUITY OF TRANSLATION 327
It follows
and so the countable set $\widetilde{D}$ is dense in Lp (Ω).
\[ f_w(x) = f(x - w). \]
Proof: Let ε > 0 be given and let g ∈ Cc (Rn) with $||g - f||_p < \frac{\varepsilon}{3}$. Since Lebesgue measure is translation invariant ($m_n(w + E) = m_n(E)$),
\[ ||g_w - f_w||_p = ||g - f||_p < \frac{\varepsilon}{3}. \]
You can see this from looking at simple functions and passing to the limit or you
could use the change of variables formula to verify it.
Therefore
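\[ ||g - g_w||_p = \left( \int_B |g(x) - g(x - w)|^p \, dm_n \right)^{1/p} \le \varepsilon \frac{m_n(B)^{1/p}}{3 \left( 1 + m_n(B)^{1/p} \right)} < \frac{\varepsilon}{3}. \]
Therefore, whenever |w| < δ, it follows $||g - g_w||_p < \frac{\varepsilon}{3}$ and so from 12.11 $||f - f_w||_p < \varepsilon$. This proves the theorem.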
Part of the argument of this theorem is significant enough to be stated as a
corollary.
Proof: The proof of this follows from the last part of the above argument simply
replacing mn with µ. Translation invariance of the measure is not needed to draw
this conclusion because of uniform continuity of g.
Then a little work shows ψ ∈ Cc∞ (U ). The following also is easily obtained.
Proof: Pick z ∈ U and let r be small enough that B (z, 2r) ⊆ U . Then let
ψ ∈ Cc∞ (B (z, 2r)) ⊆ Cc∞ (U ) be the function of the above example.
The following lemma will be useful in what follows. It says that one of these very irregular functions in $L^1_{loc}(R^n, \mu)$ is smoothed out by convolving with a mollifier.
Lemma 12.29 Let $f \in L^1_{loc}(R^n, \mu)$, and $g \in C_c^{\infty}(R^n)$. Then f ∗ g is an infinitely differentiable function. Here µ is a Radon measure on Rn.
[Figure: the nested sets K, $K_r$ and U.]
Consider $\mathcal{X}_{K_r} * \psi_m$ where $\psi_m$ is a mollifier. Let m be so large that $\frac{1}{m} < r$.
Then from the definition of what is meant by a convolution, and using that $\psi_m$ has
12.5. MOLLIFIERS AND DENSITY OF SMOOTH FUNCTIONS 331
support in $B\left(0, \frac{1}{m}\right)$, $\mathcal{X}_{K_r} * \psi_m = 1$ on K and its support is in K + B (0, 3r).
Now using Lemma 12.29, $\mathcal{X}_{K_r} * \psi_m$ is also infinitely differentiable. Therefore, let
$h = \mathcal{X}_{K_r} * \psi_m$.
The following corollary will be used later.
Corollary 12.31 Let K be a compact set in Rn and let $\{U_i\}_{i=1}^{\infty}$ be an open cover of K. Then there exist functions, $\psi_i \in C_c^{\infty}(U_i)$ such that $\psi_i \prec U_i$ and
\[ \sum_{i=1}^{\infty} \psi_i(x) = 1. \]
If K1 is a compact subset of U1 there exist such functions such that also ψ 1 (x) = 1
for all x ∈ K1 .
Proof: This follows from a repeat of the proof of Theorem 9.18 on Page 220,
replacing the lemma used in that proof with Theorem 12.30.
Theorem 12.32 For each p ≥ 1, Cc∞ (Rn ) is dense in Lp (Rn ). Here the measure
is Lebesgue measure.
Proof: Let f ∈ Lp (Rn ) and let ε > 0 be given. Choose g ∈ Cc (Rn ) such that
$||f - g||_p < \frac{\varepsilon}{2}$. This can be done by using Theorem 12.16. Now let
\[ g_m(x) = g * \psi_m(x) \equiv \int g(x - y) \psi_m(y) \, dm_n(y) = \int g(y) \psi_m(x - y) \, dm_n(y) \]
whenever m is large enough. This follows from Corollary 12.22. Theorem 12.12 was
used to obtain the third inequality. There is no measurability problem because the
function
(x, y) → |g(x) − g(x − y)|ψ m (y)
is continuous. Thus when m is large enough,
\[ ||f - g_m||_p \le ||f - g||_p + ||g - g_m||_p < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
12.6 Exercises
1. Let E be a Lebesgue measurable set in R. Suppose m(E) > 0. Consider the
set
E − E = {x − y : x ∈ E, y ∈ E}.
Show that E − E contains an interval. Hint: Let
\[ f(x) = \int \mathcal{X}_E(t) \mathcal{X}_E(x + t) \, dt. \]
and for all ε > 0, there exists a δ > 0 such that if |h| < δ, then
\[ \int |u(x + h) - u(x)|^p \, dx < \varepsilon^p \]
12. $B(p, q) = \int_0^1 x^{p-1} (1 - x)^{q-1} \, dx$, $\Gamma(p) = \int_0^{\infty} e^{-t} t^{p-1} \, dt$ for p, q > 0. The first
of these is called the beta function, while the second is the gamma function.
Show a.) Γ(p + 1) = pΓ(p); b.) Γ(p)Γ(q) = B(p, q)Γ(p + q).
13. Let f ∈ Cc (0, ∞) and define $F(x) = \frac{1}{x} \int_0^x f(t) \, dt$. Show
\[ ||F||_{L^p(0,\infty)} \le \frac{p}{p-1} ||f||_{L^p(0,\infty)} \quad \text{whenever } p > 1. \]
Hint: Argue there is no loss of generality in assuming f ≥ 0 and then assume
this is so. Integrate $\int_0^{\infty} |F(x)|^p \, dx$ by parts as follows:
\[ \int_0^{\infty} F^p \, dx = \overbrace{x F^p \big|_0^{\infty}}^{\text{show } = 0} - p \int_0^{\infty} x F^{p-1} F' \, dx. \]
Now show $xF' = f - F$ and use this in the last integral. Complete the
argument by using Holder's inequality and p − 1 = p/q.
14. ↑ Now suppose f ∈ Lp (0, ∞), p > 1, and f not necessarily in Cc (0, ∞). Show
that $F(x) = \frac{1}{x} \int_0^x f(t) \, dt$ still makes sense for each x > 0. Show the inequality
of Problem 13 is still valid. This inequality is called Hardy’s inequality. Hint:
To show this, use the above inequality along with the density of Cc (0, ∞) in
Lp (0, ∞).
16. Prove Vitali’s Convergence theorem: Let {fn } be uniformly integrable and
complex valued, µ(Ω) < ∞, fn (x) → f (x) a.e. where f is measurable. Then
f ∈ L1 and $\lim_{n \to \infty} \int_{\Omega} |f_n - f| \, d\mu = 0$. Hint: Use Egoroff's theorem to show
{fn } is a Cauchy sequence in L1 (Ω). This yields a different and easier proof
than what was done earlier. See Theorem 8.50 on Page 204.
17. ↑ Show the Vitali Convergence theorem implies the Dominated Convergence
theorem for finite measure spaces but there exist examples where the Vitali
convergence theorem works and the dominated convergence theorem does not.
\[ \lim_{t \to \infty} \frac{h(t)}{t} = \infty. \]
Show {fn } is uniformly integrable. In applications, this often occurs in the
form of a bound on ||fn ||p .
for all f ∈ S.
20. f ∈ L∞ (Ω, µ) if there exists a set of measure zero, E, and a constant C < ∞
such that |f (x)| ≤ C for all $x \notin E$.
Now raise both ends to the 1/p power and take lim inf and lim sup as p → ∞.
You should get ||f ||∞ − ε ≤ lim inf ||f ||p ≤ lim sup ||f ||p ≤ ||f ||∞
22. Suppose µ(Ω) < ∞. Show that if 1 ≤ p < q, then Lq (Ω) ⊆ Lp (Ω). Hint Use
Holder’s inequality.
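23. Show $L^1(R) \not\subseteq L^2(R)$ and $L^2(R) \not\subseteq L^1(R)$ if Lebesgue measure is used. Hint:
Consider $1/\sqrt{x}$ and $1/x$.
\[ \frac{1}{q} = \frac{\theta}{r} + \frac{1 - \theta}{s}. \]
show that
\[ \left( \int |f|^q \, d\mu \right)^{1/q} \le \left( \left( \int |f|^r \, d\mu \right)^{1/r} \right)^{\theta} \left( \left( \int |f|^s \, d\mu \right)^{1/s} \right)^{1 - \theta}. \]
Hint:
\[ \int |f|^q \, d\mu = \int |f|^{q\theta} |f|^{q(1 - \theta)} \, d\mu. \]
Now note that $1 = \frac{\theta q}{r} + \frac{q(1 - \theta)}{s}$ and use Holder's inequality.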
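Before attempting the proof of the interpolation inequality, it may help to test it on a small example. The sketch below assumes counting measure on a three point set; the exponents r = 1, s = 4, θ = 1/2 and the vector `f` are arbitrary illustrative choices.

```python
def p_norm(v, p):
    """(sum |v_i|^p)^(1/p): the L^p norm for counting measure on a finite set."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

r, s, theta = 1.0, 4.0, 0.5
# q is determined by 1/q = theta/r + (1 - theta)/s.
q = 1.0 / (theta / r + (1.0 - theta) / s)

f = [1.0, 2.0, 3.0]
lhs = p_norm(f, q)
rhs = p_norm(f, r) ** theta * p_norm(f, s) ** (1.0 - theta)
print(q, lhs, rhs, lhs <= rhs)
```

Here q = 1.6, which lies between r and s as it must.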
The following remarkable result is called the Baire category theorem. To get an
idea of its meaning, imagine you draw a line in the plane. The complement of this
line is an open set and is dense because every point, even those on the line, are limit
points of this open set. Now draw another line. The complement of the two lines
is still open and dense. Keep drawing lines and looking at the complements of the
union of these lines. You always have an open set which is dense. Now what if there
were countably many lines? The Baire category theorem implies the complement
of the union of these lines is dense. In particular it is nonempty. Thus you cannot
write the plane as a countable union of lines. This is a rather rough description of
this very important theorem. The precise statement and proof follow.
Theorem 13.2 Let (X, d) be a complete metric space and let $\{U_n\}_{n=1}^{\infty}$ be a sequence of open subsets of X satisfying $\overline{U_n} = X$ ($U_n$ is dense). Then $D \equiv \cap_{n=1}^{\infty} U_n$ is a dense subset of X.
338 BANACH SPACES
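[Figure: the ball B(p, r_0) and the point p_1.]
\[ r_n < 2^{-n}, \]
\[ \lim_{n \to \infty} p_n = p_{\infty}. \]
Since all but finitely many terms of $\{p_n\}$ are in $\overline{B(p_m, r_m)}$, it follows that $p_{\infty} \in \overline{B(p_m, r_m)}$ for each m. Therefore,
\[ p_{\infty} \in \cap_{m=1}^{\infty} \overline{B(p_m, r_m)} \subseteq \cap_{i=1}^{\infty} U_i \cap B(p, r_0). \]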
Proof: If every $F_i$ has empty interior, then $F_i^C$ would be a dense open set. Therefore, from Theorem 13.2, it would follow that
\[ \emptyset = \left( \cup_{i=1}^{\infty} F_i \right)^C = \cap_{i=1}^{\infty} F_i^C \ne \emptyset. \]
13.1. THEOREMS BASED ON BAIRE CATEGORY 339
The set D of Theorem 13.2 is called a Gδ set because it is the countable inter-
section of open sets. Thus D is a dense Gδ set.
Recall that a norm satisfies:
a.) ||x|| ≥ 0, ||x|| = 0 if and only if x = 0.
b.) ||x + y|| ≤ ||x|| + ||y||.
c.) ||cx|| = |c| ||x|| if c is a scalar and x ∈ X.
From the definition of continuity, it follows easily that a function is continuous
if
\[ \lim_{n \to \infty} x_n = x \]
implies
\[ \lim_{n \to \infty} f(x_n) = f(x). \]
Theorem 13.4 Let X and Y be two normed linear spaces and let L : X → Y be
linear (L(ax + by) = aL(x) + bL(y) for a, b scalars and x, y ∈ X). The following
are equivalent
a.) L is continuous at 0
b.) L is continuous
c.) There exists K > 0 such that ||Lx||Y ≤ K ||x||X for all x ∈ X (L is
bounded).
L (xn − x) = Lxn − Lx → 0
Hence
\[ ||Lx|| \le \frac{1}{\delta} ||x||. \]
Note that from Theorem 13.4 ||L|| is well defined because of part c.) of that
Theorem.
The next lemma follows immediately from the definition of the norm and the
assumption that L is linear.
Lemma 13.6 With ||L|| defined in 13.1, L(X, Y ) is a normed linear space. Also
||Lx|| ≤ ||L|| ||x||.
Therefore, multiplying both sides by ||x||, ||Lx|| ≤ ||L|| ||x||. This is obviously a
linear space. It remains to verify the operator norm really is a norm. First of all,
if ||L|| = 0, then Lx = 0 for all ||x|| ≤ 1. It follows that for any x ≠ 0, $0 = L\left( \frac{x}{||x||} \right)$
and so Lx = 0. Therefore, L = 0. Also, if c is a scalar,
This shows the operator norm is really a norm as hoped. This proves the lemma.
For example, consider the space of linear transformations defined on Rn having
values in Rm . The fact the transformation is linear automatically imparts conti-
nuity to it. You should give a proof of this fact. Recall that every such linear
transformation can be realized in terms of matrix multiplication.
Thus, in finite dimensions the algebraic condition that an operator is linear is
sufficient to imply the topological condition that the operator is continuous. The
situation is not so simple in infinite dimensional spaces such as C (X; Rn ). This
explains the imposition of the topological condition of continuity as a criterion for
membership in L (X, Y ) in addition to the algebraic condition of linearity.
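In finite dimensions the bound ||Lx|| ≤ K ||x|| of Theorem 13.4 can be exhibited explicitly. One possible choice, by the Cauchy Schwarz inequality applied to each row, is K equal to the square root of the sum of the squares of the matrix entries. The sketch below checks this on random vectors; the matrix `A` and all helper names are illustrative assumptions, and K is in general larger than the operator norm.

```python
import math
import random

def apply(a, x):
    """Matrix-vector product; a is a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in x))

def frobenius(a):
    """Square root of the sum of squared entries; an upper bound for ||L||."""
    return math.sqrt(sum(t * t for row in a for t in row))

A = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]
K = frobenius(A)

random.seed(1)
xs = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(500)]
# Small additive tolerance for floating point rounding.
bounded = all(norm(apply(A, x)) <= K * norm(x) + 1e-12 for x in xs)
print(K, bounded)
```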
\[ Lx = \lim_{n \to \infty} L_n x. \]
Also L is continuous. To see this, note that {||Ln ||} is a Cauchy sequence of real
numbers because |||Ln || − ||Lm ||| ≤ ||Ln −Lm ||. Hence there exists K > sup{||Ln || :
n ∈ N}. Thus, if x ∈ X,
Theorem 13.8 Let X be a Banach space and let Y be a normed linear space. Let
{Lα }α∈Λ be a collection of elements of L(X, Y ). Then one of the following happens.
a.) sup{||Lα || : α ∈ Λ} < ∞
b.) There exists a dense Gδ set, D, such that for all x ∈ D,
sup{||Lα x|| : α ∈ Λ} = ∞.
But then, since Lα is continuous, this situation persists for all y sufficiently close
to x, say for all y ∈ B (x, δ). Then B (x, δ) ⊆ Un which shows Un is open.
Case b.) is obtained from Theorem 13.2 if each Un is dense.
The other case is that for some n, Un is not dense. If this occurs, there exists
x0 and r > 0 such that for all x ∈ B(x0 , r), ||Lα x|| ≤ n for all α. Now if y ∈
B(0, r), x0 + y ∈ B(x0 , r). Consequently, for all such y, ||Lα (x0 + y)|| ≤ n. This
implies that for all α ∈ Λ and ||y|| < r,
Theorem 13.9 Let X and Y be Banach spaces, let L ∈ L(X, Y ), and suppose L
is onto. Then L maps open sets onto open sets.
Then
\[ \overline{L(B(0, b))} \subseteq L(B(0, 2b)). \]
Proof of Lemma 13.10: Let $y \in \overline{L(B(0, b))}$. There exists $x_1 \in B(0, b)$ such
that $||y - Lx_1|| < \frac{a}{2}$. Now this implies
Thus $2y - 2Lx_1 \in \overline{L(B(0, b))}$ just like y was. Therefore, there exists $x_2 \in B(0, b)$
such that ||2y − 2Lx1 − Lx2 || < a/2. Hence ||4y − 4Lx1 − 2Lx2 || < a, and there
exists x3 ∈ B (0, b) such that ||4y − 4Lx1 − 2Lx2 − Lx3 || < a/2. Continuing in this
way, there exist x1 , x2 , x3 , x4 , ... in B(0, b) such that
\[ \left|\left| 2^n y - \sum_{i=1}^{n} 2^{n-(i-1)} L(x_i) \right|\right| < a \]
which implies
\[ \left|\left| y - \sum_{i=1}^{n} 2^{-(i-1)} L(x_i) \right|\right| = \left|\left| y - L\left( \sum_{i=1}^{n} 2^{-(i-1)} x_i \right) \right|\right| < 2^{-n} a \tag{13.2} \]
Now consider the partial sums of the series, $\sum_{i=1}^{\infty} 2^{-(i-1)} x_i$.
\[ \left|\left| \sum_{i=m}^{n} 2^{-(i-1)} x_i \right|\right| \le b \sum_{i=m}^{\infty} 2^{-(i-1)} = b \, 2^{-m+2}. \]
Therefore, these partial sums form a Cauchy sequence and so since X is complete, there exists $x = \sum_{i=1}^{\infty} 2^{-(i-1)} x_i$. Letting n → ∞ in 13.2 yields ||y − Lx|| = 0. Now
\[ ||x|| = \lim_{n \to \infty} \left|\left| \sum_{i=1}^{n} 2^{-(i-1)} x_i \right|\right| \le \lim_{n \to \infty} \sum_{i=1}^{n} 2^{-(i-1)} ||x_i|| < \lim_{n \to \infty} \sum_{i=1}^{n} 2^{-(i-1)} b = 2b. \]
The reason for the last inclusion is that from the above, if y1 ∈ B (y, r) and y2 ∈
B (−y, r), there exists xn , zn ∈ B (0, n0 ) such that
Lxn → y1 , Lzn → y2 .
Therefore,
||xn + zn || ≤ 2n0
and so $(y_1 + y_2) \in \overline{L(B(0, 2n_0))}$.
By Lemma 13.10, $\overline{L(B(0, 2n_0))} \subseteq L(B(0, 4n_0))$ which shows
Letting $a = r(4n_0)^{-1}$, it follows, since L is linear, that B(0, a) ⊆ L(B(0, 1)). It
follows since L is linear,
L(B(0, r)) ⊇ B(0, ar). (13.3)
Now let U be open in X and let x + B(0, r) = B(x, r) ⊆ U . Using 13.3,
Hence
Lx ∈ B(Lx, ar) ⊆ L(U ).
which shows that every point, Lx ∈ LU , is an interior point of LU and so LU is
open. This proves the theorem.
This theorem is surprising because it implies that if |·| and ||·|| are two norms
with respect to which a vector space X is a Banach space such that |·| ≤ K ||·||,
then there exists a constant k, such that ||·|| ≤ k |·| . This can be useful because
sometimes it is not clear how to compute k when all that is needed is its existence.
To see the open mapping theorem implies this, consider the identity map id x = x.
Then id : (X, ||·||) → (X, |·|) is continuous and onto. Hence id is an open map which
implies $\mathrm{id}^{-1}$ is continuous. Theorem 13.4 gives the existence of the constant k.
Definition 13.12 If X and Y are normed linear spaces, make X×Y into a normed
linear space by using the norm ||(x, y)|| = max (||x||, ||y||) along with component-
wise addition and scalar multiplication. Thus a(x, y) + b(z, w) ≡ (ax + bz, ay + bw).
There are other ways to give a norm for X × Y . For example, you could define
||(x, y)|| = ||x|| + ||y||
Lemma 13.13 The norm defined in Definition 13.12 on X × Y along with the
definition of addition and scalar multiplication given there make X × Y into a
normed linear space.
Proof: The only axiom for a norm which is not obvious is the triangle inequality.
Therefore, consider
It is obvious X × Y is a vector space from the above definition. This proves the
lemma.
Lemma 13.14 If X and Y are Banach spaces, then X × Y with the norm and
vector space operations defined in Definition 13.12 is also a Banach space.
Proof: The only thing left to check is that the space is complete. But this
follows from the simple observation that {(xn , yn )} is a Cauchy sequence in X × Y
if and only if {xn } and {yn } are Cauchy sequences in X and Y respectively. Thus
if {(xn , yn )} is a Cauchy sequence in X × Y , it follows there exist x and y such that
xn → x and yn → y. But then from the definition of the norm, (xn , yn ) → (x, y).
Note the distinction between closed and continuous. If the operator is closed
the assertion that y = Lx only follows if it is known that the sequence {Lxn }
converges. In the case of a continuous operator, the convergence of {Lxn } follows
from the assumption that xn → x. It is not always the case that a mapping which
is closed is necessarily continuous. Consider the function f (x) = tan (x) if x is not
an odd multiple of $\frac{\pi}{2}$ and f (x) ≡ 0 at every odd multiple of $\frac{\pi}{2}$. Then the graph
is closed and the function is defined on R but it clearly fails to be continuous. Of
course this function is not linear. You could also consider the map,
\[ \frac{d}{dx} : \left\{ y \in C^1([0, 1]) : y(0) = 0 \right\} \equiv D \to C([0, 1]), \]
where the norm is the uniform norm on C ([0, 1]), $||y||_{\infty}$. If y ∈ D, then
\[ y(x) = \int_0^x y'(t) \, dt. \]
Therefore, if $\frac{dy_n}{dx} \to f \in C([0, 1])$ and if $y_n \to y$ in C ([0, 1]) it follows that
\[ \begin{array}{ccc} y_n(x) & = & \int_0^x \frac{dy_n(t)}{dx} \, dt \\ \downarrow & & \downarrow \\ y(x) & = & \int_0^x f(t) \, dt \end{array} \]
and so by the fundamental theorem of calculus $f(x) = y'(x)$ and so the mapping
is closed. It is obviously not continuous because it takes y (x) and $y(x) + \frac{1}{n} \sin(nx)$
to two functions which are far from each other even though these two functions are
very close in C ([0, 1]). Furthermore, it is not defined on the whole space, C ([0, 1]).
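The failure of continuity can be seen numerically. Taking y = 0, the perturbation $y_n(x) = \frac{1}{n} \sin(nx)$ lies in D and has small uniform norm, while its derivative cos (nx) has uniform norm 1. The sketch below approximates the uniform norm by sampling a grid on [0, 1]; the grid size and helper names are assumptions made for illustration.

```python
import math

def sup_norm(func, samples=2001):
    """Approximate the uniform norm on [0, 1] by sampling an even grid."""
    return max(abs(func(k / (samples - 1))) for k in range(samples))

def norms(n):
    """Return (sup |y_n|, sup |y_n'|) for y_n(x) = sin(n x) / n."""
    y = lambda x: math.sin(n * x) / n
    dy = lambda x: math.cos(n * x)
    return sup_norm(y), sup_norm(dy)

for n in (1, 10, 100, 1000):
    size, slope = norms(n)
    print(n, size, slope)
```

The first column of norms tends to 0 while the second stays equal to 1, so no constant K can satisfy $||y'||_{\infty} \le K ||y||_{\infty}$ on D.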
The next theorem, the closed graph theorem, gives conditions under which closed
implies continuous.
By Theorem 13.4 on Page 339, this shows L is continuous and proves the theorem.
The following corollary is quite useful. It shows how to obtain a new norm on
the domain of a closed operator such that the domain with this new norm becomes
a Banach space.
Proof: If {xn } is a Cauchy sequence in D with this new norm, it follows both
{xn } and {Lxn } are Cauchy sequences and therefore, they converge. Since L is
closed, xn → x and Lxn → Lx for some x ∈ D. Thus ||xn − x||D → 0.
x ≤ x for all x ∈ F.
If x ≤ y and y ≤ z then x ≤ z.
C ⊆ F is said to be a chain if every two elements of C are related. This means that
if x, y ∈ C, then either x ≤ y or y ≤ x. Sometimes a chain is called a totally ordered
set. C is said to be a maximal chain if whenever D is a chain containing C, D = C.
The most common example of a partially ordered set is the power set of a given
set with ⊆ being the relation. It is also helpful to visualize partially ordered sets
as trees. Two points on the tree are related if they are on the same branch of
the tree and one is higher than the other. Thus two points on different branches
would not be related although they might both be larger than some point on the
13.2. HAHN BANACH THEOREM 347
trunk. You might think of many other things which are best considered as partially
ordered sets. Think of food for example. You might find it difficult to determine
which of two favorite pies you like better although you may be able to say very
easily that you would prefer either pie to a dish of lard topped with whipped cream
and mustard. The following theorem is equivalent to the axiom of choice. For a
discussion of this, see the appendix on the subject.
and that if F (z) can be chosen in this way, this will satisfy 13.5 for all x, y and the
problem of extending f will be solved. Hence it is necessary to choose F (z) such
that for all x, y ∈ M
Is there any such number between f (y) − ρ(y − z) and ρ(x + z) − f (x) for every
pair x, y ∈ M ? This is where f (x) ≤ ρ(x) on M and that f is linear is used.
For x, y ∈ M ,
ρ(x + z) − f (x) − [f (y) − ρ(y − z)]
= ρ(x + z) + ρ(y − z) − (f (x) + f (y))
≥ ρ(x + y) − f (x + y) ≥ 0.
and
inf {ρ(x + z) − f (x) : x ∈ M }
Choose F (z) to satisfy 13.6. This has proved the following lemma.
Lemma 13.22 Let M be a subspace of X, a real linear space, and let ρ be a gauge
function on X. Suppose f : M → R is linear, $z \notin M$, and f (x) ≤ ρ (x) for all
x ∈ M . Then f can be extended to M ⊕ Rz such that, if F is the extended function,
F is linear and F (x) ≤ ρ(x) for all x ∈ M ⊕ Rz.
Theorem 13.23 (Hahn Banach theorem) Let X be a real vector space, let M be a
subspace of X, let f : M → R be linear, let ρ be a gauge function on X, and suppose
f (x) ≤ ρ(x) for all x ∈ M . Then there exists a linear function, F : X → R, such
that
a.) F (x) = f (x) for all x ∈ M
b.) F (x) ≤ ρ(x) for all x ∈ X.
(V, g) ≤ (W, h)
means
V ⊆ W and h(x) = g(x) if x ∈ V.
By Theorem 13.20, there exists a maximal chain, C ⊆ F. Let Y = ∪{V : (V, g) ∈ C}
and let h : Y → R be defined by h(x) = g(x) where x ∈ V and (V, g) ∈ C. This
is well defined because if x ∈ V1 and V2 where (V1 , g1 ) and (V2 , g2 ) are both in the
chain, then since C is a chain, the two elements are related. Therefore, g1 (x) = g2 (x).
Also h is linear because if ax + by ∈ Y , then x ∈ V1 and y ∈ V2 where (V1 , g1 )
and (V2 , g2 ) are elements of C. Therefore, letting V denote the larger of the two Vi ,
and g be the function that goes with V , it follows ax + by ∈ V where (V, g) ∈ C.
Therefore,
Also, h(x) = g (x) ≤ ρ(x) for any x ∈ Y because for such x, x ∈ V where (V, g) ∈ C.
Is Y = X? If not, there exists z ∈ X \ Y and there exists an extension of h to
Y ⊕ Rz using Lemma 13.22. Letting $\overline{h}$ denote this extended function, contradicts
the maximality of C. Indeed, $C \cup \{ (Y \oplus Rz, \overline{h}) \}$ would be a longer chain. This
proves the Hahn Banach theorem.
This is the original version of the theorem. There is also a version of this theorem
for complex vector spaces which is based on a trick.
Re f (x + y) − i Re f (i (x + y)) = f (x + y)
= f (x) + f (y)
Actually, |h (x)| ≤ K ||x|| . The reason for this is that h (−x) = −h (x) ≤ K ||−x|| =
K ||x|| and therefore, h (x) ≥ −K ||x||. Let
If c is a real scalar,
Now
Thus
Definition 13.26 Let X and Y be Banach spaces and suppose L ∈ L(X, Y ). Then
define the adjoint map in $L(Y', X')$, denoted by $L^*$, by
\[ L^* y^*(x) \equiv y^*(Lx) \]
for all $y^* \in Y'$.
Proof:
||x∗ || = sup {|x∗ (x)| : ||x|| ≤ 1} = sup {|J (x) (x∗ )| : ||Jx|| ≤ 1}
It happens the Lp spaces are reflexive whenever p > 1. This is shown later.
\[ \rho_A(x^* + y^*) \le \rho_A(x^*) + \rho_A(y^*), \]
\[ \rho_A(a x^*) = |a| \, \rho_A(x^*). \]
Lemma 13.33 The sets, $B_{A'}(x, r)$ where $A'$ is a finite subset of $X'$ and x ∈ X
form a basis for a topology on X known as the weak topology. The sets $B_A(x^*, r)$
where A is a finite subset of X and $x^* \in X'$ form a basis for a topology on $X'$
known as the weak ∗ topology.
Proof: The two assertions are very similar. I will verify the one for the weak
topology. The union of these sets, $B_{A'}(x, r)$ for x ∈ X and r > 0 is all of X. Now
suppose z is contained in the intersection of two of these sets. Say
and so
\[ r > \rho_{A'}(y - z) + \rho_{A'}(z - x) \ge \rho_{A'}(y - x) \]
which shows $y \in B_{A'}(x, r)$. Similar reasoning shows $y \in B_{A_1'}(x_1, r_1)$ and so
Therefore, the weak topology consists of the union of all sets of the form BA (x, r).
Definition 13.34 Let I be a set and suppose for each i ∈ I, $(X_i, \tau_i)$ is a nonempty
topological space. The Cartesian product of the $X_i$, denoted by $\prod_{i \in I} X_i$, consists
of the set of all choice functions defined on I which select a single element of each
$X_i$. Thus $f \in \prod_{i \in I} X_i$ means for every i ∈ I, $f(i) \in X_i$. The axiom of choice says
$\prod_{i \in I} X_i$ is nonempty. Let
\[ P_j(A) = \prod_{i \in I} B_i \]
Theorem 13.36 Let $B'$ be the closed unit ball in $X'$. Then $B'$ is compact in the
weak ∗ topology.
is compact in the product topology where the topology on B (0, ||x||) is the usual
topology of F. Recall P is the set of functions which map a point, x ∈ X to a point
in B (0, ||x||). Therefore, $B' \subseteq P$. Also the basic open sets in the weak ∗ topology
on $B'$ are obtained as the intersection of basic open sets in the product topology
of P to $B'$ and so it suffices to show $B'$ is a closed subset of P. Suppose then that
$f \in P \setminus B'$. It follows f cannot be linear. There are two ways this can happen. One
way is that for some x, y
\[ f(x + y) \ne f(x) + f(y) \]
for some x, y ∈ X. However, if g is close enough to f at the three points, x + y, x,
and y, the above inequality will hold for g in place of f. In other words there is a
basic open set containing f such that for all g in this basic open set, $g \notin B'$. A
similar consideration applies in case $f(\lambda x) \ne \lambda f(x)$ for some scalar, λ and x. Since
$P \setminus B'$ is open, it follows $B'$ is a closed subset of P and is therefore, compact. This
proves the theorem.
Sometimes one can consider the weak ∗ topology in terms of a metric space.
where ρxn (f ) = |f (xn )|. Clearly d (f, g) = d (g, f ) ≥ 0. If d (f, g) = 0, then this
requires f (xn ) = g (xn ) for all xn ∈ D. Since f and g are continuous and D is
dense, this requires that f (x) = g (x) for all x. It is routine to verify the triangle
inequality from the easy to establish inequality,
\[ \frac{x}{1+x} + \frac{y}{1+y} \ge \frac{x+y}{1+x+y}, \]
\[ g \to \frac{\rho_{x_n}(f-g)}{1+\rho_{x_n}(f-g)} \]
is a continuous function from (K, τ ) to [0, ∞) and also the above sum defining d
converges uniformly. It follows
g → d (f, g)
is continuous. Therefore, the ball with respect to d,
Proof: By Theorem 13.37, K is a metric space for the metric described there
and it is compact. Therefore by the characterization of compact metric spaces,
Proposition 6.12 on Page 136, K is sequentially compact. This proves the corollary.
Jx (f ) ≡ f (x)
and let X be reflexive so that J is onto. Then J is a homeomorphism of (X, weak topology)
and (X′′, weak ∗ topology). This means J is one to one, onto, and both J and $J^{-1}$
are continuous.
Proof: Let f ∈ X 0 and let
Corollary 13.40 If X is a reflexive Banach space, then the closed unit ball is
weakly compact.
Proof: This follows from Theorem 13.37 and Lemma 13.39. Lemma 13.39
implies J (K) is compact in X 00 . Then since X 0 is separable, there is a metric, d00
on J (K) which delivers the weak ∗ topology. Let d (x, y) ≡ d00 (Jx, Jy) . Then
\[ (K, \tau_d) \xrightarrow{\;J\;} (J(K), \tau_{d''}) \xrightarrow{\;\mathrm{id}\;} (J(K), \tau_{\text{weak}\,*}) \xrightarrow{\;J^{-1}\;} (K, \tau_{\text{weak}}) \]
Proof: Let Y be the closed subspace of the reflexive space, X. Consider the
following diagram
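\[ \begin{array}{ccc}
Y'' & \xrightarrow{\; i^{**},\ \text{1-1} \;} & X'' \\
Y' & \xleftarrow{\; i^{*},\ \text{onto} \;} & X' \\
Y & \xrightarrow{\; i \;} & X
\end{array} \]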
This diagram follows from Theorem 13.28 on Page 351, the theorem on adjoints.
Now let y ∗∗ ∈ Y 00 . Then i∗∗ y ∗∗ = JX (y) because X is reflexive. I want to show
that y ∈ Y . If it is not in Y then since Y is closed, there exists x∗ ∈ X 0 such that
x∗ (y) ≠ 0 but x∗ (Y ) = 0. Then i∗ x∗ = 0. Hence
Since i∗ is onto, this shows y ∗∗ = JY (y) and this proves the lemma.
13.3. WEAK AND WEAK ∗ TOPOLOGIES 359
Theorem 13.43 (Eberlein Smulian) The closed unit ball in a reflexive Banach
space X, is weakly sequentially compact. By this is meant that if $\{x_n\}$ is contained
in the closed unit ball, there exists a subsequence, $\{x_{n_k}\}$ and x ∈ X such that
for all x∗ ∈ X′,
\[ x^*(x_{n_k}) \to x^*(x). \]
Proof: Let {xn } ⊆ B ≡ B (0, 1). Let Y be the closure of the linear span of
{xn }. Thus Y is separable. It is reflexive because it is a closed subspace of a
reflexive space so the above lemma applies. By the Banach Alaoglu theorem, the
closed unit ball B ∗ in Y 0 is weak ∗ compact. Also by Theorem 13.37, B ∗ is a metric
space with a suitable metric. Thus B ∗ is complete and totally bounded with respect
to this metric and it follows that B ∗ with the weak ∗ topology is separable. This
implies Y 0 is also separable in the weak ∗ topology. To see this, let {yn∗ } ≡ D be a
weak ∗ dense set in B ∗ and let y ∗ ∈ Y 0 . Let p be a large enough positive rational
number that y ∗ /p ∈ B ∗ . Then if A is any finite set from Y, there exists yn∗ ∈ D such
that ρA (y ∗ /p − yn∗ ) < ε/p. It follows pyn∗ ∈ BA (y ∗ , ε) showing that rational multiples
of D are weak ∗ dense in Y 0 . Since Y is reflexive, the weak and weak ∗ topologies on
Y 0 coincide and so Y 0 is weakly separable. Since Y 0 is separable, Corollary 13.38
implies B ∗∗ , the closed unit ball in Y 00 is weak ∗ sequentially compact. Then by
Lemma 13.39 B, the unit ball in Y , is weakly sequentially compact. It follows there
exists a subsequence xnk , of the sequence {xn } and a point x ∈ Y , such that for all
f ∈ Y 0,
f (xnk ) → f (x).
which shows xnk converges weakly and this shows the unit ball in X is weakly
sequentially compact.
Corollary 13.44 Let {xn } be any bounded sequence in a reflexive Banach space,
X. Then there exists x ∈ X and a subsequence, {xnk } such that for all x∗ ∈ X 0 ,
13.4 Exercises
1. Is N a Gδ set? What about Q? What about a countable dense subset of a
complete metric space?
2. ↑ Let f : R → C be a function. Define the oscillation of a function in B (x, r)
by ω r f (x) = sup{|f (z) − f (y)| : y, z ∈ B(x, r)}. Define the oscillation of the
function at the point, x by ωf (x) = limr→0 ω r f (x). Show f is continuous
at x if and only if ωf (x) = 0. Then show the set of points where f is
continuous is a Gδ set (try $U_n = \{x : \omega f(x) < 1/n\}$). Does there exist a
function continuous at only the rational numbers? Does there exist a function
continuous at every irrational and discontinuous elsewhere? Hint: Suppose
D is any countable set, $D = \{d_i\}_{i=1}^{\infty}$, and define the function, $f_n(x)$ to equal
zero for every $x \notin \{d_1, \cdots, d_n\}$ and $2^{-n}$ for x in this finite set. Then consider
$g(x) \equiv \sum_{n=1}^{\infty} f_n(x)$. Show that this series converges uniformly.
3. Let f ∈ C([0, 1]) and suppose f 0 (x) exists. Show there exists a constant, K,
such that |f (x) − f (y)| ≤ K|x − y| for all y ∈ [0, 1]. Let Un = {f ∈ C([0, 1])
such that for each x ∈ [0, 1] there exists y ∈ [0, 1] such that |f (x) − f (y)| >
n|x − y|}. Show that Un is open and dense in C([0, 1]) where for f ∈ C ([0, 1]),
||f || ≡ sup {|f (x)| : x ∈ [0, 1]} .
Show that ∩n Un is a dense Gδ set of nowhere differentiable continuous functions.
Thus every continuous function is uniformly close to one which is
nowhere differentiable.
4. ↑Suppose $f(x) = \sum_{k=1}^{\infty} u_k(x)$ where the convergence is uniform and each $u_k$
is a polynomial. Is it reasonable to conclude that $f'(x) = \sum_{k=1}^{\infty} u_k'(x)$? The
answer is no. Use Problem 3 and the Weierstrass approximation theorem to
show this.
5. Let X be a normed linear space. We say A ⊆ X is “weakly bounded” if for
each x∗ ∈ X 0 , sup{|x∗ (x)| : x ∈ A} < ∞, while A is bounded if sup{||x|| : x ∈
A} < ∞. Show A is weakly bounded if and only if it is bounded.
6. Let X and Y be two Banach spaces. Define the norm
|||(x, y)||| ≡ ||x||X + ||y||Y .
Show this is a norm on X × Y which is equivalent to the norm given in the
chapter for X × Y . Can you do the same for the norm defined for p > 1 by
\[ |(x, y)| \equiv \left( \|x\|_X^p + \|y\|_Y^p \right)^{1/p} ? \]
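where
\[ a_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ikx} f(x)\, dx. \]
Show
\[ S_n f(x) = \int_{-\pi}^{\pi} D_n(x - y) f(y)\, dy \]
where
\[ D_n(t) = \frac{\sin\left( (n + \frac{1}{2}) t \right)}{2\pi \sin\left( \frac{t}{2} \right)}. \]
Verify that $\int_{-\pi}^{\pi} D_n(t)\, dt = 1$. Also show that if $g \in L^1(\mathbb{R})$, then
\[ \lim_{a \to \infty} \int_{\mathbb{R}} g(x) \sin(ax)\, dx = 0. \]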
This last is called the Riemann Lebesgue lemma. Hint: For the last part,
assume first that g ∈ Cc∞ (R) and integrate by parts. Then exploit density of
the set of functions in L1 (R).
8. ↑It turns out that the Fourier series sometimes converges to the function pointwise.
Suppose f is 2π periodic and Holder continuous. That is $|f(x) - f(y)| \le K|x - y|^{\theta}$
where θ ∈ (0, 1]. Show that if f is like this, then the Fourier series
converges to f at every point. Next modify your argument to show that if
at every point, x, $|f(x+) - f(y)| \le K|x - y|^{\theta}$ for y close enough to x and
larger than x and $|f(x-) - f(y)| \le K|x - y|^{\theta}$ for every y close enough to x
and smaller than x, then $S_n f(x) \to \frac{f(x+) + f(x-)}{2}$, the midpoint of the jump
of the function. Hint: Use Problem 7.
where
\[ \|f\| \equiv \sup\{ |f(x)| : x \in X \} \]
and
\[ \rho_{\alpha}(f) \equiv \sup\left\{ \frac{|f(x) - f(y)|}{|x - y|^{\alpha}} : x, y \in X,\ x \ne y \right\}. \]
Show that (C α (X; Rn ) , ||·||α ) is a complete normed linear space. This is
called a Holder space. What would this space consist of if α > 1?
11. ↑Now recall Problem 10 about the Holder spaces. Let X be the Holder
functions which are periodic of period 2π. Define Ln f (x) = Sn f (x) where
Ln : X → Y for Y given in Problem 9. Show ||Ln || is bounded independent
of n. Conclude that Ln f → f in Y for all f ∈ X. In other words, for the
Holder continuous and 2π periodic functions, the Fourier series converges to
the function uniformly. Hint: Ln f (x) is given by
\[ L_n f(x) = \int_{-\pi}^{\pi} D_n(y) f(x - y)\, dy \]
where f (x − y) = f (x) + g (x, y) where $|g(x, y)| \le C|y|^{\alpha}$. Use the fact the
Dirichlet kernel integrates to one to write
\[ \left| \int_{-\pi}^{\pi} D_n(y) f(x - y)\, dy \right| \le \overbrace{\left| \int_{-\pi}^{\pi} D_n(y) f(x)\, dy \right|}^{= |f(x)|} + C \left| \int_{-\pi}^{\pi} \sin\left( \left( n + \frac{1}{2} \right) y \right) \left( g(x, y) / \sin(y/2) \right) dy \right| \]
Show the functions, y → g (x, y) / sin (y/2) are bounded in L1 independent of
x and get a uniform bound on ||Ln ||. Now use a similar argument to show
{Ln f } is equicontinuous in addition to being uniformly bounded. If Ln f
fails to converge to f uniformly, then there exists ε > 0 and a subsequence,
nk such that ||Lnk f − f ||∞ ≥ ε where this is the norm in Y or equivalently
the sup norm on [−π, π]. By the Arzela Ascoli theorem, there is a further
subsequence, Lnkl f which converges uniformly on [−π, π]. But by Problem 8
Ln f (x) → f (x).
12. Let X be a normed linear space and let M be a convex open set containing
0. Define
\[ \rho(x) = \inf\left\{ t > 0 : \frac{x}{t} \in M \right\}. \]
Show ρ is a gauge function defined on X. This particular example is called a
Minkowski functional. It is of fundamental importance in the study of locally
convex topological vector spaces. A set, M , is convex if λx + (1 − λ)y ∈ M
whenever λ ∈ [0, 1] and x, y ∈ M .
13. ↑ The Hahn Banach theorem can be used to establish separation theorems. Let
M be an open convex set containing 0. Let x ∉ M. Show there exists x∗ ∈ X′
such that Re x∗(x) ≥ 1 > Re x∗(y) for all y ∈ M. Hint: If y ∈ M, ρ(y) < 1.
Show this. If x ∉ M, ρ(x) ≥ 1. Try f (αx) = αρ(x) for α ∈ R. Then extend
f to the whole space using the Hahn Banach theorem and call the result F ,
show F is continuous, then fix it so F is the real part of x∗ ∈ X′.
14. A Banach space is said to be strictly convex if whenever ||x|| = ||y|| and x ≠ y,
then
\[ \left\| \frac{x + y}{2} \right\| < \|x\|. \]
18. Show that a closed subspace of a reflexive Banach space is reflexive. Hint:
The proof of this is an exercise in the use of the Hahn Banach theorem. Let
Y be the closed subspace of the reflexive space X and let y ∗∗ ∈ Y 00 . Then
i∗∗ y ∗∗ ∈ X 00 and so i∗∗ y ∗∗ = Jx for some x ∈ X because X is reflexive.
Now argue that x ∈ Y as follows. If x ∉ Y , then there exists x∗ such that
x∗ (Y ) = 0 but x∗ (x) ≠ 0. Thus, i∗ x∗ = 0. Use this to get a contradiction.
When you know that x = y ∈ Y , the Hahn Banach theorem implies i∗ is onto
Y 0 and for all x∗ ∈ X 0 ,
21. Suppose L ∈ L (X, Y ) and M ∈ L (Y, Z). Show M L ∈ L (X, Z) and that
\[ (ML)^* = L^* M^*. \]
22. Let X and Y be Banach spaces and suppose f ∈ L (X, Y ) is compact. Recall
this means that if B is a bounded set in X, then f (B) has compact closure
in Y. Show that f ∗ is also a compact map. Hint: Take a bounded subset of
Y 0 , S. You need to show f ∗ (S) is totally bounded. You might consider using
the Ascoli Arzela theorem on the functions of S applied to f (B) where B is
the closed unit ball in X.
Hilbert Spaces
Note that 14.2 and 14.3 imply $(x, ay + bz) = \overline{a}(x, y) + \overline{b}(x, z)$. Such a vector space
is called an inner product space.
The Cauchy Schwarz inequality is fundamental for the study of inner product
spaces.
Proof: Let ω ∈ C, |ω| = 1, and $\overline{\omega}(x, y) = |(x, y)| = \mathrm{Re}(x, y\omega)$. Let
and so (x, 0) = 0. Thus, it can be assumed y ≠ 0. Then from the axioms of the
inner product,
F (t) = ||x||2 + 2t Re(x, ωy) + t2 ||y||2 ≥ 0.
This yields
||x||2 + 2t|(x, y)| + t2 ||y||2 ≥ 0.
Since this inequality holds for all t ∈ R, it follows from the quadratic formula that
Proof: All the axioms are obvious except the triangle inequality. To verify this,
\[ \begin{aligned}
\|x + y\|^2 &\equiv (x + y, x + y) \equiv \|x\|^2 + \|y\|^2 + 2\,\mathrm{Re}(x, y) \\
&\le \|x\|^2 + \|y\|^2 + 2\,|(x, y)| \\
&\le \|x\|^2 + \|y\|^2 + 2\,\|x\|\,\|y\| = (\|x\| + \|y\|)^2 .
\end{aligned} \]
Definition 14.6 A Hilbert space is an inner product space which is complete. Thus
a Hilbert space is a Banach space in which the norm comes from an inner product
as described above.
In Hilbert space, one can define a projection map onto closed convex nonempty
sets.
\[ \|y_n - x + y_m - x\|^2 = 4\left( \left\| \frac{y_n + y_m}{2} - x \right\|^2 \right) \]
Then by the parallelogram identity, and convexity of K, $\frac{y_n + y_m}{2} \in K$, and so
Since ||x − yn || → λ, this shows {yn − x} is a Cauchy sequence. Thus also {yn } is
a Cauchy sequence. Since H is complete, yn → y for some y ∈ H which must be in
K because K is closed. Therefore
Let P x = y.
Re(x − z, y − z) ≤ 0 (14.6)
for all y ∈ K.
Before proving this, consider what it says in the case where the Hilbert space is
Rn.
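[Figure: the convex set K, a point x outside K, the point z ∈ K, a point y ∈ K, and the angle θ at z between x − z and y − z.]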
Condition 14.6 says the angle, θ, shown in the diagram is always obtuse. Remember
from calculus, the sign of x · y is the same as the sign of the cosine of the
included angle between x and y. Thus, in finite dimensions, the conclusion of this
corollary says that z = P x exactly when the indicated angle is obtuse.
Surely the picture suggests this is reasonable.
The inequality 14.6 is an example of a variational inequality and this corollary
characterizes the projection of x onto K as the solution of this variational inequality.
Proof of Corollary: Let z ∈ K and let y ∈ K also. Since K is convex, it
follows that if t ∈ [0, 1],
z + t(y − z) = (1 − t) z + ty ∈ K.
Furthermore, every point of K can be written in this way. (Let t = 1 and y ∈ K.)
Therefore, z = P x if and only if for all y ∈ K and t ∈ [0, 1],
for all t ∈ [0, 1] and y ∈ K if and only if for all t ∈ [0, 1] and y ∈ K
\[ \|x - z\|^2 + t^2 \|y - z\|^2 - 2t\,\mathrm{Re}(x - z, y - z) \ge \|x - z\|^2 \]
Now this is equivalent to 14.7 holding for all t ∈ (0, 1). Therefore, dividing by
t ∈ (0, 1) , 14.7 is equivalent to
\[ t\,\|y - z\|^2 - 2\,\mathrm{Re}(x - z, y - z) \ge 0 \]
for all t ∈ (0, 1) which is equivalent to 14.6. This proves the corollary.
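The variational inequality 14.6 can be checked numerically in a simple case. The following is only an illustrative sketch, not part of the proof: it assumes H = R^3 with the usual dot product and takes K to be the cube [−1, 1]^3, for which the nearest point is obtained by clipping each coordinate. The names `project_box` and `worst_case` are made up for this example.

```python
import random

def project_box(x, lo=-1.0, hi=1.0):
    """Nearest point of the cube [lo, hi]^n to x: clip each coordinate."""
    return [min(max(t, lo), hi) for t in x]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def worst_case(x, trials=1000, seed=0):
    """Return z = P x and the largest value of (x - z, y - z) over random y in K."""
    rng = random.Random(seed)
    z = project_box(x)
    xz = [a - b for a, b in zip(x, z)]
    worst = float("-inf")
    for _ in range(trials):
        y = [rng.uniform(-1.0, 1.0) for _ in x]
        yz = [a - b for a, b in zip(y, z)]
        worst = max(worst, dot(xz, yz))
    return z, worst

z, worst = worst_case([2.0, 0.5, -3.0])
print(z, worst)  # the second number should never be positive
```

In this real case the inner product is already real, so Re can be dropped.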
14.1. BASIC THEORY 369
|P x − P y| ≤ |x − y| .
\[ \mathrm{Re}(x' - Px', Px - Px') \le 0, \qquad \mathrm{Re}(x - Px, Px' - Px) \le 0 \]
Hence
\[ \begin{aligned}
0 &\le \mathrm{Re}(x - Px, Px - Px') - \mathrm{Re}(x' - Px', Px - Px') \\
&= \mathrm{Re}(x - x', Px - Px') - |Px - Px'|^2
\end{aligned} \]
and so
\[ |Px - Px'|^2 \le |x - x'|\,|Px - Px'| . \]
This proves the corollary.
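As a quick sanity check of the estimate |P x − P y| ≤ |x − y|, here is a small sketch under the assumption that H = R^3 and K is the closed unit ball, where the projection is x itself when ||x|| ≤ 1 and x/||x|| otherwise. The helper names are hypothetical.

```python
import math
import random

def project_ball(x):
    """Projection onto the closed unit ball of R^n."""
    n = math.sqrt(sum(t * t for t in x))
    return list(x) if n <= 1.0 else [t / n for t in x]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(1)
max_ratio = 0.0
for _ in range(2000):
    x = [rng.uniform(-3.0, 3.0) for _ in range(3)]
    y = [rng.uniform(-3.0, 3.0) for _ in range(3)]
    d = dist(x, y)
    if d > 0.0:
        # ratio |Px - Py| / |x - y| should not exceed 1
        max_ratio = max(max_ratio, dist(project_ball(x), project_ball(y)) / d)

print(max_ratio)
```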
The next corollary is a more general form for the Brouwer fixed point theorem.
Proof: Let K ⊆ B (0, R) and let P be the projection map onto K. Then
consider the map f ◦ P which maps B (0, R) to B (0, R) and is continuous. By the
Brouwer fixed point theorem for balls, this map has a fixed point. Thus there exists
x such that
f ◦ P (x) = x
Now the equation also requires x ∈ K and so P (x) = x. Hence f (x) = x.
The case where the closed convex set is a closed subspace is of special importance
and in this case the above corollary implies the following.
(x − z, y) = 0 (14.8)
and
\[ \|x\|^2 = \|x - Px\|^2 + \|Px\|^2 . \tag{14.9} \]
Theorem 14.14 Let H be a Hilbert space and let f ∈ H 0 . Then there exists a
unique z ∈ H such that
f (x) = (x, z) (14.10)
for all x ∈ H.
which shows that yf (w) − f (y)w ∈ $f^{-1}(0)$, which is a closed subspace of H since f is
continuous. If $f^{-1}(0) = H$, then f is the zero map and z = 0 is the unique element
of H which satisfies 14.10. If $f^{-1}(0) \ne H$, pick $u \notin f^{-1}(0)$ and let w ≡ u − P u ≠ 0.
Thus Corollary 14.13 implies (y, w) = 0 for all y ∈ $f^{-1}(0)$. In particular, let y =
xf (w) − f (x)w where x ∈ H is arbitrary. Therefore,
\[ 0 = (f(w)x - f(x)w, w) = f(w)(x, w) - f(x)\|w\|^2 . \]
Thus, solving for f (x) and using the properties of the inner product,
\[ f(x) = \left( x, \frac{\overline{f(w)}\, w}{\|w\|^2} \right) \]
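The construction in this proof can be followed step by step in a finite dimensional example. The sketch below assumes H = C^3 with (x, y) = Σ x_i conj(y_i), and a functional f with made-up coefficients `a`; none of these names come from the text. It builds w orthogonal to the kernel of f and then forms z = conj(f(w)) w / ||w||^2.

```python
def inner(x, y):
    """(x, y) = sum x_i * conj(y_i): linear in x, conjugate linear in y."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

a = [2 + 1j, -1j, 3 + 0j]  # hypothetical coefficients of the functional

def f(x):
    return sum(c * t for c, t in zip(a, x))

def remove_component(v, e):
    """Subtract from v its component along e."""
    c = inner(v, e) / inner(e, e)
    return [s - c * t for s, t in zip(v, e)]

# Two vectors spanning the kernel of f (valid because a[0] != 0).
k1 = [a[1], -a[0], 0j]
k2 = remove_component([a[2], 0j, -a[0]], k1)  # made orthogonal to k1

# u is not in the kernel; w = u - P u is orthogonal to the kernel.
u = [1 + 0j, 0j, 0j]
w = remove_component(remove_component(u, k1), k2)

scale = f(w).conjugate() / inner(w, w).real
z = [scale * t for t in w]
print(z)
```

For this f one expects z to be the vector of conjugated coefficients, since then (x, z) = Σ x_i a_i = f(x).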
14.2. APPROXIMATIONS IN HILBERT SPACE 371
where the denominator is not equal to zero because the xj form a basis and so
\[ x_{k+1} \notin \mathrm{span}(x_1, \cdots, x_k) = \mathrm{span}(u_1, \cdots, u_k) \]
Thus by induction,
uk+1 ∈ span (u1 , · · ·, uk , xk+1 ) = span (x1 , · · ·, xk , xk+1 ) .
Also, xk+1 ∈ span (u1 , · · ·, uk , uk+1 ) which is seen easily by solving 14.11 for xk+1
and it follows
span (x1 , · · ·, xk , xk+1 ) = span (u1 , · · ·, uk , uk+1 ) .
If l ≤ k,
\[ \begin{aligned}
(u_{k+1} \cdot u_l) &= C\Big( (x_{k+1} \cdot u_l) - \sum_{j=1}^{k} (x_{k+1} \cdot u_j)(u_j \cdot u_l) \Big) \\
&= C\Big( (x_{k+1} \cdot u_l) - \sum_{j=1}^{k} (x_{k+1} \cdot u_j)\, \delta_{lj} \Big) \\
&= C\big( (x_{k+1} \cdot u_l) - (x_{k+1} \cdot u_l) \big) = 0 .
\end{aligned} \]
The vectors, $\{u_j\}_{j=1}^{n}$, generated in this way are therefore an orthonormal basis
because each vector has unit length.
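The Gram Schmidt procedure just described is easy to run by hand or by machine. Here is a minimal sketch for real vectors in R^n, assuming the usual dot product; the function name `gram_schmidt` and the tolerance are choices made for this illustration only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    """Orthonormalize a linearly independent list of real vectors."""
    us = []
    for x in xs:
        r = list(x)
        for u in us:
            c = dot(x, u)  # coefficient (x_{k+1} . u_j)
            r = [ri - c * ui for ri, ui in zip(r, u)]
        norm = math.sqrt(dot(r, r))
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        us.append([ri / norm for ri in r])  # dividing by the norm plays the role of C
    return us

us = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for u in us:
    print(u)
```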
Consider the second claim about finite dimensional subspaces. Without loss of
generality, assume {x1 , · · ·, xn } is linearly independent. If it is not, delete vectors
until a linearly independent set is obtained. Then by the first part, span (x1 , · · ·, xn ) =
span (u1 , · · ·, un ) ≡ M where the ui are an orthonormal set of vectors. Suppose
{yk } ⊆ M and yk → y ∈ H. Is y ∈ M ? Let
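\[ y_k \equiv \sum_{j=1}^{n} c_j^k u_j \]
Then let $c^k \equiv (c_1^k, \cdots, c_n^k)^T$. Then
\[ \begin{aligned}
|c^k - c^l|^2 &\equiv \sum_{j=1}^{n} |c_j^k - c_j^l|^2 = \Big( \sum_{j=1}^{n} (c_j^k - c_j^l) u_j,\ \sum_{j=1}^{n} (c_j^k - c_j^l) u_j \Big) \\
&= \|y_k - y_l\|^2
\end{aligned} \]
which shows $\{c^k\}$ is a Cauchy sequence in $F^n$ and so it converges to $c \in F^n$. Thus
\[ y = \lim_{k \to \infty} y_k = \lim_{k \to \infty} \sum_{j=1}^{n} c_j^k u_j = \sum_{j=1}^{n} c_j u_j \in M . \]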
Theorem 14.16 Let M be the span of {u1 , · · ·, un } in a Hilbert space, H and let
y ∈ H. Then P y is given by
\[ Py = \sum_{k=1}^{n} (y, u_k)\, u_k \tag{14.12} \]
Proof:
\[ \begin{aligned}
\Big( y - \sum_{k=1}^{n} (y, u_k) u_k,\ u_p \Big) &= (y, u_p) - \sum_{k=1}^{n} (y, u_k)(u_k, u_p) \\
&= (y, u_p) - (y, u_p) = 0
\end{aligned} \]
It follows that
\[ \Big( y - \sum_{k=1}^{n} (y, u_k) u_k,\ u \Big) = 0 \]
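Formula 14.12 can be tried on a small example. The sketch below assumes H = R^3 and an orthonormal pair u1, u2 chosen for illustration; it computes P y from the formula and checks that y − P y is orthogonal to each u_k.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(y, us):
    """P y = sum_k (y, u_k) u_k for an orthonormal list us."""
    p = [0.0] * len(y)
    for u in us:
        c = dot(y, u)
        p = [pi + c * ui for pi, ui in zip(p, u)]
    return p

s = 1.0 / math.sqrt(2.0)
us = [[1.0, 0.0, 0.0], [0.0, s, s]]  # an orthonormal pair (assumed for the example)
y = [1.0, 2.0, 4.0]
Py = project(y, us)
residual = [a - b for a, b in zip(y, Py)]
print(Py, residual)
```

Here P y should come out close to (1, 3, 3), with residual close to (0, −1, 1).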
Thus the ij th entry of this matrix is (xi , xj ). This is sometimes called the Gram
matrix. Also define G (x1 , · · ·, xn ) as the determinant of this matrix, also called the
Gram determinant.
\[ G(x_1, \cdots, x_n) \equiv \begin{vmatrix}
(x_1, x_1) & \cdots & (x_1, x_n) \\
\vdots & & \vdots \\
(x_n, x_1) & \cdots & (x_n, x_n)
\end{vmatrix} \tag{14.15} \]
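\[ \equiv d^2 + y_x^T \alpha \tag{14.19} \]
in which
\[ y_x^T \equiv ((y, x_1), \cdots, (y, x_n)), \qquad \alpha^T \equiv (\alpha_1, \cdots, \alpha_n) . \]
Then 14.17 and 14.18 imply the following system, in which $\mathcal{G}(x_1, \cdots, x_n)$ denotes the Gram matrix
\[ \begin{pmatrix} \mathcal{G}(x_1, \cdots, x_n) & 0 \\ y_x^T & 1 \end{pmatrix}
\begin{pmatrix} \alpha \\ d^2 \end{pmatrix} = \begin{pmatrix} y_x \\ \|y\|^2 \end{pmatrix} \]
By Cramer’s rule,
\[ \begin{aligned}
d^2 &= \frac{\det \begin{pmatrix} \mathcal{G}(x_1, \cdots, x_n) & y_x \\ y_x^T & \|y\|^2 \end{pmatrix}}{\det \begin{pmatrix} \mathcal{G}(x_1, \cdots, x_n) & 0 \\ y_x^T & 1 \end{pmatrix}}
= \frac{\det \begin{pmatrix} \mathcal{G}(x_1, \cdots, x_n) & y_x \\ y_x^T & \|y\|^2 \end{pmatrix}}{\det(\mathcal{G}(x_1, \cdots, x_n))} \\
&= \frac{\det(\mathcal{G}(x_1, \cdots, x_n, y))}{\det(\mathcal{G}(x_1, \cdots, x_n))} = \frac{G(x_1, \cdots, x_n, y)}{G(x_1, \cdots, x_n)}
\end{aligned} \]
and this proves the theorem.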
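The ratio of Gram determinants is easy to test in R^3, where the distance can be read off directly. This is a sketch under assumptions made only for the example: x1 = (1, 0, 0), x2 = (1, 1, 0) span the plane of the first two coordinates, so the distance from y = (3, 4, 5) to that plane is 5. The helper names are hypothetical.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def gram_matrix(vectors):
    """Matrix whose ij entry is the dot product (x_i, x_j)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

xs = [[1, 0, 0], [1, 1, 0]]
y = [3, 4, 5]

# d^2 = G(x1, x2, y) / G(x1, x2), computed exactly
d_squared = Fraction(det(gram_matrix(xs + [y])), det(gram_matrix(xs)))
print(d_squared)
```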
Recall the Cauchy identity presented earlier, Theorem 4.46 on Page 76 which is
stated here for convenience.
Theorem 14.19 The following identity holds.
\[ \prod_{i,j} (a_i + b_j) \begin{vmatrix}
\frac{1}{a_1 + b_1} & \cdots & \frac{1}{a_1 + b_n} \\
\vdots & & \vdots \\
\frac{1}{a_n + b_1} & \cdots & \frac{1}{a_n + b_n}
\end{vmatrix} = \prod_{j < i} (a_i - a_j)(b_i - b_j) . \tag{14.20} \]
Lemma 14.20 Let $m, p_1, \cdots, p_n$ be distinct real numbers larger than −1/2. Thus the
functions, $f_m(x) \equiv x^m$, $f_{p_j}(x) \equiv x^{p_j}$ are all in $L^2(0, 1)$. Let $M = \mathrm{span}(f_{p_1}, \cdots, f_{p_n})$.
Then the $L^2$ distance, d between $f_m$ and M is
\[ d = \frac{1}{\sqrt{2m + 1}} \prod_{j=1}^{n} \frac{|m - p_j|}{m + p_j + 1} \]
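\[ \begin{aligned}
&= \frac{\prod_{k=1}^{n} (m - p_k)^2 \prod_{j < i \le n} (p_i - p_j)^2}{\prod_{i=1}^{n} (m + p_i + 1) \prod_{j=1}^{n} (m + p_j + 1) \prod_{i,j \le n} (p_i + p_j + 1)\,(2m + 1)} \\
&= \frac{\prod_{k=1}^{n} (m - p_k)^2 \prod_{j < i \le n} (p_i - p_j)^2}{\prod_{i=1}^{n} (m + p_i + 1)^2 \prod_{i,j \le n} (p_i + p_j + 1)\,(2m + 1)}
\end{aligned} \]
Therefore,
\[ \begin{aligned}
d^2 &= \frac{\left( \dfrac{\prod_{k=1}^{n} (m - p_k)^2 \prod_{j < i \le n} (p_i - p_j)^2}{\prod_{i=1}^{n} (m + p_i + 1)^2 \prod_{i,j \le n} (p_i + p_j + 1)\,(2m + 1)} \right)}{\left( \dfrac{\prod_{j < i \le n} (p_i - p_j)^2}{\prod_{i,j \le n} (p_i + p_j + 1)} \right)} \\
&= \frac{\prod_{k=1}^{n} (m - p_k)^2}{\prod_{i=1}^{n} (m + p_i + 1)^2\,(2m + 1)}
\end{aligned} \]
which shows
\[ d = \frac{1}{\sqrt{2m + 1}} \prod_{k=1}^{n} \frac{|m - p_k|}{m + p_k + 1} . \]
and this proves the lemma.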
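The formula of the lemma can be compared with the Gram determinant ratio in exact arithmetic. The sketch below assumes integer exponents so that everything stays rational: it uses the fact that the inner product of x^p and x^q in L^2(0, 1) is 1/(p + q + 1), and takes m = 3 with exponents 0, 1, 2 purely as an example. Function names are made up for this illustration.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def gram(exponents):
    """Gram matrix of the functions x^p in L^2(0,1): (x^p, x^q) = 1 / (p + q + 1)."""
    return [[Fraction(1, p + q + 1) for q in exponents] for p in exponents]

def dist_sq_gram(m, ps):
    """d^2 as the ratio of Gram determinants."""
    return det(gram(list(ps) + [m])) / det(gram(list(ps)))

def dist_sq_formula(m, ps):
    """d^2 from the product formula of the lemma."""
    prod = Fraction(1)
    for p in ps:
        prod *= Fraction(m - p, m + p + 1)
    return prod * prod / (2 * m + 1)

print(dist_sq_gram(3, [0, 1, 2]), dist_sq_formula(3, [0, 1, 2]))
```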
The following lemma relates an infinite sum to a product. First consider the
graph of ln (1 − x) for $x \in [0, \frac{1}{2}]$. Here is a rough sketch with two lines, y = −x
which lies above the graph of ln (1 − x) and y = −2x which lies below.
[Figure: rough sketch on [0, 1/2] of the graph of ln (1 − x) lying between the lines y = −x and y = −2x.]
Lemma 14.21 Let $a_n \ne 1$, $a_n > 0$, and $\lim_{n \to \infty} a_n = 0$. Then
\[ \prod_{k=1}^{\infty} (1 - a_k) \equiv \lim_{n \to \infty} \prod_{k=1}^{n} (1 - a_k) = 0 \]
if and only if
\[ \sum_{n=1}^{\infty} a_n = +\infty . \]
Proof:Without loss of generality, you can assume an < 1/2 because the two
conditions are determined by the values of an for n large. By the above sketch the
14.3. THE MÜNTZ THEOREM 377
following is obtained.
\[ \ln \prod_{k=1}^{n} (1 - a_k) = \sum_{k=1}^{n} \ln(1 - a_k) \in \left[ -2 \sum_{k=1}^{n} a_k,\ - \sum_{k=1}^{n} a_k \right] . \]
Therefore,
\[ e^{-2 \sum_{k=1}^{n} a_k} \le \prod_{k=1}^{n} (1 - a_k) \le e^{- \sum_{k=1}^{n} a_k} \]
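These two sided bounds are simple to observe numerically. As an illustration only, take a_k = 1/(k + 2), which satisfies 0 < a_k < 1/2; then the product telescopes to 2/(n + 2), so its value is known exactly. The choice of sequence and of n = 50 is an assumption of this sketch.

```python
import math

n = 50
a = [1.0 / (k + 2) for k in range(1, n + 1)]  # 0 < a_k < 1/2

partial_sum = sum(a)
product = math.prod(1.0 - t for t in a)       # telescopes to 2 / (n + 2)

lower = math.exp(-2.0 * partial_sum)
upper = math.exp(-partial_sum)
print(lower, product, upper)
```

Since the sum of the a_k diverges, both bounds and the product tend to 0 as n grows, as the lemma asserts.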
Theorem 14.22 Let $\{p_n\}$ be a sequence of real numbers larger than −1/2 such that
$\lim_{n \to \infty} p_n = \infty$. Let S denote the set of finite linear combinations of the functions,
$\{x^{p_1}, x^{p_2}, \cdots\}$. Then S is dense in $L^2(0, 1)$ if and only if
\[ \sum_{i=1}^{\infty} \frac{1}{p_i} = \infty . \]
which has the same convergence properties as $\sum \frac{1}{p_k}$ by the limit comparison test.
This proves the theorem.
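To see the dichotomy of the theorem in numbers, one can evaluate the distance formula of Lemma 14.20 for two choices of exponents. The sketch below is only an illustration: it assumes m = 1/2, and compares p_j = j, for which the sum of 1/p_j diverges, with p_j = j^2, for which it converges. The function name `muntz_distance` is hypothetical.

```python
import math

def muntz_distance(m, ps):
    """Distance in L^2(0,1) from x^m to span{x^p : p in ps}, by Lemma 14.20."""
    d = 1.0 / math.sqrt(2.0 * m + 1.0)
    for p in ps:
        d *= abs(m - p) / (m + p + 1.0)
    return d

m = 0.5
d_linear = muntz_distance(m, [j for j in range(1, 2001)])      # sum 1/p_j diverges
d_square = muntz_distance(m, [j * j for j in range(1, 2001)])  # sum 1/p_j converges
print(d_linear, d_square)
```

The first distance is already tiny, while the second stays bounded away from 0, so x^{1/2} is not in the closure of the span in the second case.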
The following is Müntz’s second theorem.
Theorem 14.23 Let S be finite linear combinations of $\{1, x^{p_1}, x^{p_2}, \cdots\}$ where $p_j \ge 1$
and $\lim_{n \to \infty} p_n = \infty$. Then S is dense in C ([0, 1]) if and only if $\sum_{k=1}^{\infty} \frac{1}{p_k} = \infty$.
Proof: If S is dense in C ([0, 1]) then S must also be dense in $L^2(0, 1)$ and so
by Theorem 14.22 $\sum_{k=1}^{\infty} \frac{1}{p_k} = \infty$.
Suppose then that $\sum_{k=1}^{\infty} \frac{1}{p_k} = \infty$ so that by Theorem 14.22, S is dense in
L2 (0, 1) . The theorem will be proved if it is shown that for all m a nonnegative
integer,
max {|xm − f (x)| : x ∈ [0, 1]} < ε
for some f ∈ S. This is true if m = 0 because 1 ∈ S. Suppose then that m > 0. Let
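S′ denote finite linear combinations of the functions
\[ \{ x^{p_1 - 1}, x^{p_2 - 1}, \cdots \} . \]
These functions are also dense in $L^2(0, 1)$ because $\sum \frac{1}{p_k - 1} = \infty$ by the limit comparison
test. Then by Theorem 14.22 there exists f ∈ S′ such that
\[ \left( \int_0^1 \left| f(x) - m x^{m-1} \right|^2 dx \right)^{1/2} < \varepsilon . \]
Thus $F(x) \equiv \int_0^x f(t)\, dt \in S$ and
\[ \begin{aligned}
|F(x) - x^m| &= \left| \int_0^x \left( f(t) - m t^{m-1} \right) dt \right| \\
&\le \int_0^x \left| f(t) - m t^{m-1} \right| dt \\
&\le \left( \int_0^1 \left| f(t) - m t^{m-1} \right|^2 dt \right)^{1/2} \left( \int_0^1 dx \right)^{1/2} \\
&< \varepsilon
\end{aligned} \]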
and this proves the theorem.
Theorem 14.25 In any separable Hilbert space, H, there exists a countable orthonormal
set, S = {xi } such that the span of these vectors is dense in H. Furthermore,
if span (S) is dense, then for x ∈ H,
\[ x = \sum_{i=1}^{\infty} (x, x_i)\, x_i \equiv \lim_{n \to \infty} \sum_{i=1}^{n} (x, x_i)\, x_i . \tag{14.21} \]
\[ \|x\|^2 + \sum_{k=1}^{n} |c_k|^2 - \sum_{k=1}^{n} c_k \overline{(x, x_k)} - \sum_{k=1}^{n} \overline{c_k}\,(x, x_k) . \]
This equals
\[ \|x\|^2 + \sum_{k=1}^{n} |c_k - (x, x_k)|^2 - \sum_{k=1}^{n} |(x, x_k)|^2 \]
Now since span (S) is dense, there exists n large enough that for some choice of
constants, ck ,
\[ \left\| x - \sum_{k=1}^{n} c_k x_k \right\|^2 < \varepsilon . \]
{x1 , · · ·, xn } ⊆ S.
Then if x ∈ H
\[ \left\| x - \sum_{k=1}^{n} c_k x_k \right\|^2 \ge \left\| x - \sum_{i=1}^{n} (x, x_i)\, x_i \right\|^2 \]
If S is countable and span (S) is dense, then letting $\{x_i\}_{i=1}^{\infty} = S$, 14.21 follows.
This is a Hilbert space because of the theorem which states the $L^p$ spaces are complete,
Theorem 12.10 on Page 319. An example of an orthonormal set of functions
in $L^2(0, 2\pi)$ is
\[ \phi_n(x) \equiv \frac{1}{\sqrt{2\pi}}\, e^{inx} \]
for n an integer. Is it true that the span of these functions is dense in L2 (0, 2π)?
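Orthonormality of the φ_n is easy to confirm numerically before worrying about density. The sketch below assumes the inner product (f, g) = ∫ f conj(g) dx on (0, 2π) and approximates it by an equally spaced Riemann sum with N points, which happens to be exact up to rounding for these exponentials when |n − m| < N. The names `phi` and `inner` are chosen for the example.

```python
import cmath
import math

def phi(n, x):
    """phi_n(x) = exp(i n x) / sqrt(2 pi)."""
    return cmath.exp(1j * n * x) / math.sqrt(2.0 * math.pi)

def inner(n, m, N=64):
    """Riemann sum approximation of the L^2(0, 2 pi) inner product (phi_n, phi_m)."""
    h = 2.0 * math.pi / N
    return sum(phi(n, k * h) * phi(m, k * h).conjugate() for k in range(N)) * h

print(abs(inner(3, 3)), abs(inner(3, -2)))
```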
Theorem 14.27 Let S = {φn }n∈Z . Then span (S) is dense in L2 (0, 2π).
14.5. FOURIER SERIES, AN EXAMPLE 381
for m chosen large enough. This algebra separates the points of T because it
contains the function, p (z) = z. It annihilates no point of T because it contains
the constant function 1. Furthermore, it has the property that for f ∈ A, $\overline{f}$ ∈ A.
By the Stone Weierstrass approximation theorem, Theorem 7.16 on Page 165, A is
dense in C (T ) . Now for g ∈ Cc (0, 2π) , extend g to all of R to be 2π periodic. Then
letting $G(e^{it}) \equiv g(t)$, it follows G is well defined and continuous on T. Therefore,
there exists H ∈ A such that for all t ∈ R,
\[ \left| H(e^{it}) - G(e^{it}) \right| < \varepsilon^2 / 2\pi . \]
Thus $H(e^{it})$ is of the form
\[ H(e^{it}) = \sum_{k=-m}^{m} c_k \left( e^{it} \right)^k = \sum_{k=-m}^{m} c_k e^{ikt} \in \mathrm{span}(S) . \]
Let $h(t) = \sum_{k=-m}^{m} c_k e^{ikt}$. Then
\[ \begin{aligned}
\left( \int_0^{2\pi} |g - h|^2\, dx \right)^{1/2} &\le \left( \int_0^{2\pi} \max\{ |g(t) - h(t)| : t \in [0, 2\pi] \}\, dx \right)^{1/2} \\
&= \left( \int_0^{2\pi} \max\left\{ \left| G(e^{it}) - H(e^{it}) \right| : t \in [0, 2\pi] \right\} dx \right)^{1/2} \\
&< \left( \int_0^{2\pi} \frac{\varepsilon^2}{2\pi} \right)^{1/2} = \varepsilon .
\end{aligned} \]
and either
\[ \lim_{n \to \infty} \lambda_n = 0, \tag{14.23} \]
or for some n,
span (u1 , · · ·, un ) = H. (14.24)
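In any case,
\[ \mathrm{span}\left( \{u_i\}_{i=1}^{\infty} \right) \text{ is dense in } A(H) . \tag{14.25} \]
and for all x ∈ H,
\[ Ax = \sum_{k=1}^{\infty} \lambda_k (x, u_k)\, u_k . \tag{14.26} \]
and
\[ A_n : H_n \to H_n . \tag{14.28} \]
where $H \equiv H_1$ and $H_n \equiv \{u_1, \cdots, u_{n-1}\}^{\perp}$ and $A_n$ is the restriction of A to $H_n$.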
14.6. COMPACT OPERATORS 383
Since A is compact, there exists a subsequence of {xn } still denoted by {xn } such
that Axn converges to some element of H. Thus since $\lambda_1^2 - A^2$ satisfies
\[ \left( (\lambda_1^2 - A^2) y, y \right) \ge 0 \]
in addition to being self adjoint, it follows $x, y \to \left( (\lambda_1^2 - A^2) x, y \right)$ satisfies all the
axioms for an inner product except for the one which says that (z, z) = 0 only if
z = 0. Therefore, the Cauchy Schwarz inequality may be used to write
\[ \begin{aligned}
\left| \left( (\lambda_1^2 - A^2) x_n, y \right) \right| &\le \left( (\lambda_1^2 - A^2) y, y \right)^{1/2} \left( (\lambda_1^2 - A^2) x_n, x_n \right)^{1/2} \\
&\le e_n \|y\| .
\end{aligned} \]
Now
(λ1 I − A) (λ1 I + A) x = (λ1 I + A) (λ1 I − A) x = 0.
If (λ1 I − A) x = 0, let u1 ≡ x. If (λ1 I − A) x = y ≠ 0, let $u_1 \equiv \frac{y}{\|y\|}$.
Suppose {u1 , · · ·, un } is such that Auk = λk uk and |λk | ≥ |λk+1 |, |λk | = ||Ak ||
and Ak : Hk → Hk for k ≤ n. If
span (u1 , · · ·, un ) = H
this yields the conclusion of the theorem in the situation of 14.24. Therefore, assume
the span of these vectors is always a proper subspace of H. It is shown next that
$A_{n+1} : H_{n+1} \to H_{n+1}$. Let
\[ y \in H_{n+1} \equiv \{u_1, \cdots, u_n\}^{\perp} \]
Then for k ≤ n
\[ (Ay, u_k) = (y, Au_k) = \lambda_k (y, u_k) = 0, \]
showing $A_{n+1} : H_{n+1} \to H_{n+1}$ as claimed. There are two cases. Either λn = 0 or it
is not. In the case where λn = 0 it follows An = 0. Every element of H is the sum
of one in span (u1 , · · ·, un ) and one in $\mathrm{span}(u_1, \cdots, u_n)^{\perp}$. (Note span (u1 , · · ·, un )
is a closed subspace.) Thus, if x ∈ H, x = y + z where y ∈ span (u1 , · · ·, un ) and
$z \in \mathrm{span}(u_1, \cdots, u_n)^{\perp}$ and Az = 0. Say $y = \sum_{j=1}^{n} c_j u_j$. Then
\[ \begin{aligned}
Ax = Ay &= \sum_{j=1}^{n} c_j A u_j \\
&= \sum_{j=1}^{n} c_j \lambda_j u_j \in \mathrm{span}(u_1, \cdots, u_n) .
\end{aligned} \]
The conclusion of the theorem holds in this case because the above equation holds
with ci = (x, ui ).
Now consider the case where λn ≠ 0. In this case repeat the above argument
used to find $u_{n+1}$ and $\lambda_{n+1}$ for the operator, $A_{n+1}$. This yields $u_{n+1} \in H_{n+1} \equiv \{u_1, \cdots, u_n\}^{\perp}$
such that
and if it is ever the case that λn = 0, it follows from the above argument that the
conclusion of the theorem is obtained.
I claim limn→∞ λn = 0. If this were not so, then for some ε > 0, 0 < ε =
limn→∞ |λn | but then
\[ \begin{aligned}
\|Au_n - Au_m\|^2 &= \|\lambda_n u_n - \lambda_m u_m\|^2 \\
&= |\lambda_n|^2 + |\lambda_m|^2 \ge 2\varepsilon^2
\end{aligned} \]
and so there would not exist a convergent subsequence of $\{Au_k\}_{k=1}^{\infty}$ contrary to the
assumption that A is compact. This verifies the claim that limn→∞ λn = 0.
It remains to verify that span ({ui }) is dense in A (H). If $w \in \mathrm{span}(\{u_i\})^{\perp}$ then
w ∈ Hn for all n and so for all n,
Therefore, Aw = 0. Now every vector from H can be written as a sum of one from
\[ \overline{\mathrm{span}(\{u_i\})}^{\perp} = \mathrm{span}(\{u_i\})^{\perp} \]
and one from $\overline{\mathrm{span}(\{u_i\})}$. Therefore, if x ∈ H, x = y + w where $y \in \overline{\mathrm{span}(\{u_i\})}$
and $w \in \mathrm{span}(\{u_i\})^{\perp}$. It follows Aw = 0. Also, since $y \in \overline{\mathrm{span}(\{u_i\})}$, there exist
constants, ck and n such that
\[ \left\| y - \sum_{k=1}^{n} c_k u_k \right\| < \varepsilon . \]
Therefore,
\[ \begin{aligned}
\|A\|\, \varepsilon &> \left\| A \left( y - \sum_{k=1}^{n} (x, u_k)\, u_k \right) \right\| \\
&= \left\| Ax - \sum_{k=1}^{n} (x, u_k)\, \lambda_k u_k \right\| .
\end{aligned} \]
Since ε is arbitrary, this shows span ({ui }) is dense in A (H) and also implies 14.26.
This proves the theorem.
Define v ⊗ u ∈ L (H, H) by
v ⊗ u (x) = (x, u) v,
then 14.26 is of the form
\[ A = \sum_{k=1}^{\infty} \lambda_k\, u_k \otimes u_k \]
This is the content of the following corollary.
Corollary 14.31 The main conclusion of the above theorem can be written as
\[ A = \sum_{k=1}^{\infty} \lambda_k\, u_k \otimes u_k \]
where the convergence of the partial sums takes place in the operator norm.
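Proof: Using 14.26
\[ \begin{aligned}
\left| \left( \left( A - \sum_{k=1}^{n} \lambda_k\, u_k \otimes u_k \right) x, y \right) \right|
&= \left| \left( Ax - \sum_{k=1}^{n} \lambda_k (x, u_k)\, u_k,\ y \right) \right| \\
&= \left| \left( \sum_{k=n}^{\infty} \lambda_k (x, u_k)\, u_k,\ y \right) \right| \\
&= \left| \sum_{k=n}^{\infty} \lambda_k (x, u_k)(u_k, y) \right| \\
&\le |\lambda_n| \left( \sum_{k=n}^{\infty} |(x, u_k)|^2 \right)^{1/2} \left( \sum_{k=n}^{\infty} |(y, u_k)|^2 \right)^{1/2} \\
&\le |\lambda_n|\, \|x\|\, \|y\|
\end{aligned} \]
It follows
\[ \left\| \left( A - \sum_{k=1}^{n} \lambda_k\, u_k \otimes u_k \right)(x) \right\| \le |\lambda_n|\, \|x\| \]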
and this proves the corollary.
Corollary 14.32 Let A be a compact self adjoint operator defined on a separable
Hilbert space, H. Then there exists a countable set of eigenvalues, {λi } and an
orthonormal set of eigenvectors, vi satisfying
\[ Av_i = \lambda_i v_i, \qquad \|v_i\| = 1, \tag{14.29} \]
\[ \mathrm{span}\left( \{v_i\}_{i=1}^{\infty} \right) \text{ is dense in } H . \tag{14.30} \]
Furthermore, if λi ≠ 0, the space, $V_{\lambda_i} \equiv \{x \in H : Ax = \lambda_i x\}$ is finite dimensional.
Proof: In the proof of the above theorem, let $W \equiv \mathrm{span}(\{u_i\})^{\perp}$. By Theorem
14.25, there is an orthonormal set of vectors, $\{w_i\}_{i=1}^{\infty}$ whose span is dense in W .
As shown in the proof of the above theorem, Aw = 0 for all w ∈ W . Let
$\{v_i\}_{i=1}^{\infty} = \{u_i\}_{i=1}^{\infty} \cup \{w_i\}_{i=1}^{\infty}$.
It remains to verify the space, $V_{\lambda_i}$, is finite dimensional. First observe that
$A : V_{\lambda_i} \to V_{\lambda_i}$. Since A is continuous, it follows that $A : \overline{V_{\lambda_i}} \to \overline{V_{\lambda_i}}$. Thus A is a
compact self adjoint operator on $\overline{V_{\lambda_i}}$ and by Theorem 14.30, 14.24 holds because
the only eigenvalue is λi . This proves the corollary.
Note the last claim of this corollary holds independent of the separability of H.
Suppose $\lambda \notin \{\lambda_n\}$ and λ ≠ 0. Then the above formula for A, 14.26, yields an
interesting formula for $(A - \lambda I)^{-1}$. Note first that since limn→∞ λn = 0, it follows
that $\lambda_n^2 / (\lambda_n - \lambda)^2$ must be bounded, say by a positive constant, M .
Corollary 14.33 Let A be a compact self adjoint operator and let $\lambda \notin \{\lambda_n\}_{n=1}^{\infty}$
and λ ≠ 0 where the λn are the eigenvalues of A. Then
\[ (A - \lambda I)^{-1} x = -\frac{1}{\lambda}\, x + \frac{1}{\lambda} \sum_{k=1}^{\infty} \frac{\lambda_k}{\lambda_k - \lambda}\, (x, u_k)\, u_k . \tag{14.31} \]
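Proof: Let m < n. Then since the {uk } form an orthonormal set,
\[ \begin{aligned}
\left| \sum_{k=m}^{n} \frac{\lambda_k}{\lambda_k - \lambda}\, (x, u_k)\, u_k \right|
&= \left( \sum_{k=m}^{n} \left( \frac{\lambda_k}{\lambda_k - \lambda} \right)^2 |(x, u_k)|^2 \right)^{1/2} \qquad (14.32) \\
&\le M \left( \sum_{k=m}^{n} |(x, u_k)|^2 \right)^{1/2} .
\end{aligned} \]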
and so for m large enough, the first term in 14.32 is smaller than ε. This shows
the infinite series in 14.31 converges. It is now routine to verify that the formula in
14.31 is the inverse.
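The routine verification can be tried out on a small symmetric matrix, which is a compact self adjoint operator on R^2. The sketch below makes assumptions purely for illustration: A = [[2, 1], [1, 2]], whose eigenvalues are 3 and 1 with eigenvectors (1, 1)/√2 and (1, −1)/√2, and λ = 2. It evaluates the right side of 14.31 and then applies A − λI to the result.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigen = [(3.0, [s, s]), (1.0, [s, -s])]  # (eigenvalue, orthonormal eigenvector)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def resolvent(x, lam):
    """Right side of 14.31: -x/lam + (1/lam) sum_k lam_k/(lam_k - lam) (x, u_k) u_k."""
    y = [-t / lam for t in x]
    for lam_k, u in eigen:
        c = lam_k / (lam_k - lam) * dot(x, u) / lam
        y = [yi + c * ui for yi, ui in zip(y, u)]
    return y

def apply_shifted(y, lam):
    """Compute (A - lam I) y."""
    return [sum(A[i][j] * y[j] for j in range(2)) - lam * y[i] for i in range(2)]

x = [1.0, 0.0]
y = resolvent(x, 2.0)
print(y, apply_shifted(y, 2.0))  # the second vector should reproduce x
```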
\[ \lim_{n \to \infty} \|A - A_n\| = 0 . \]
Proof: Let B be a bounded set in X such that ||b|| ≤ C for all b ∈ B. I need to
verify AB is totally bounded. Suppose then it is not. Then there exists ε > 0 and
a sequence, {Abi } where bi ∈ B and
||Abi − Abj || ≥ ε
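\[ x = \sum_{j=1}^{\infty} (x, e_j)\, e_j \]
\[ \begin{aligned}
\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} (Ae_i, e_j)\, e_i \otimes e_j\, (x) &\equiv \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} (Ae_i, e_j)(x, e_j)\, e_i \\
&= \sum_{j=1}^{\infty} (x, e_j) \sum_{i=1}^{\infty} (Ae_i, e_j)\, e_i \\
&= \sum_{j=1}^{\infty} (x, e_j) \sum_{i=1}^{\infty} (Ae_j, e_i)\, e_i \\
&= \sum_{j=1}^{\infty} (x, e_j)\, Ae_j
\end{aligned} \]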
Next consider the claim that A is compact. Let $C_A \equiv \left( \sum_{j=1}^{\infty} |(Ae_j, e_j)| \right)^{1/2}$.
Let An be defined by
\[ A_n \equiv \sum_{j=1}^{\infty} \sum_{i=1}^{n} (Ae_i, e_j)\, (e_i \otimes e_j) . \]
Therefore,
lim ||A − An || = 0
n→∞
and so A is the limit in operator norm of finite rank bounded linear operators, each
of which is compact. Therefore, A is also compact.
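where the uk are the orthonormal eigenvectors of A which form a complete orthonormal
set. Then
\[ \begin{aligned}
\sum_{k=1}^{\infty} (Ae_k, e_k) &= \sum_{k=1}^{\infty} \left( A \sum_{j=1}^{\infty} u_j (e_k, u_j),\ \sum_{j=1}^{\infty} u_j (e_k, u_j) \right) \\
&= \sum_{k=1}^{\infty} \sum_{ij} (Au_j, u_i)(e_k, u_j)(u_i, e_k) \\
&= \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} (Au_j, u_j)\, |(e_k, u_j)|^2 \\
&= \sum_{j=1}^{\infty} (Au_j, u_j) \sum_{k=1}^{\infty} |(e_k, u_j)|^2 = \sum_{j=1}^{\infty} (Au_j, u_j)\, |u_j|^2 \\
&= \sum_{j=1}^{\infty} (Au_j, u_j) = \sum_{j=1}^{\infty} \lambda_j
\end{aligned} \]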
where {ek } is some orthonormal basis for H. Also L2 (H, G) ⊆ L (H, G) and
Proof: First consider the norm. I need to verify the norm does not depend on
the choice of orthonormal basis. Let {fk } be an orthonormal basis for G. Then for
{ek } an orthonormal basis for H,
\[ \begin{aligned}
\sum_k \|Te_k\|^2 &= \sum_k \sum_j |(Te_k, f_j)|^2 = \sum_k \sum_j |(e_k, T^* f_j)|^2 \\
&= \sum_j \sum_k |(e_k, T^* f_j)|^2 = \sum_j \|T^* f_j\|^2 .
\end{aligned} \]
The same result would be obtained for any other orthonormal basis $\{e_j'\}$ and this
shows the norm is at least well defined. It is clear this does indeed satisfy the axioms
of a norm.
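The independence of the basis is easy to observe for a matrix. As an illustration only, assume H = G = R^2, take T = [[1, 2], [3, 4]], and compare the sum of ||T e_k||^2 over the standard basis with the same sum over a rotated orthonormal basis; the angle 0.7 is an arbitrary choice for this sketch.

```python
import math

T = [[1.0, 2.0], [3.0, 4.0]]

def apply(T, v):
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def hs_norm_sq(T, basis):
    """Sum over the basis of ||T e_k||^2."""
    return sum(sum(c * c for c in apply(T, e)) for e in basis)

standard = [[1.0, 0.0], [0.0, 1.0]]
th = 0.7  # arbitrary rotation angle
rotated = [[math.cos(th), math.sin(th)], [-math.sin(th), math.cos(th)]]

print(hs_norm_sq(T, standard), hs_norm_sq(T, rotated))
```

Both sums should agree with the sum of the squares of the entries of T, which is 30 here.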
Next I want to show L2 (H, G) ⊆ L (H, G) and $\|T\| \le \|T\|_{L_2}$. Pick an orthonormal
basis for H, {ek } and an orthonormal basis for G, {fk }. Then letting
\[ x = \sum_{k=1}^{n} x_k e_k, \]
\[ Tx = T\left( \sum_{k=1}^{n} x_k e_k \right) = \sum_{k=1}^{n} x_k T(e_k) \]
\[ \begin{aligned}
&\le \sum_{j=1}^{n} \left( \sum_{k=1}^{\infty} |(x_j Te_j, e_k)|^2 \right)^{1/2} \\
&\le \sum_{j} |x_j| \left( \sum_{k} |(Te_j, e_k)|^2 \right)^{1/2} \\
&= \sum_{j} |x_j|\, \|Te_j\| \le \left( \sum_{j=1}^{n} |x_j|^2 \right)^{1/2} \|T\|_{L_2}
\end{aligned} \]
The only remaining issue is the completeness. Suppose then that {Tn } is a Cauchy
sequence in L2 (H, G) . Then from 14.33 {Tn } is a Cauchy sequence in L (H, G) and
so there exists a unique T such that limn→∞ ||Tn − T || = 0. Then it only remains
to verify T ∈ L2 (H, G) . But by Fatou’s lemma,
\[
\begin{aligned}
\sum_k \|Te_k\|^2 &\le \liminf_{n\to\infty} \sum_k \|T_n e_k\|^2 \\
&= \liminf_{n\to\infty} \|T_n\|_{L_2}^2 < \infty .
\end{aligned}
\]
All that remains is to verify L2 (H, G) is separable and these Hilbert Schmidt
operators are compact. I will show an orthonormal basis for L2 (H, G) is {fj ⊗ ek }
14.6. COMPACT OPERATORS 393
where {fk } is an orthonormal basis for G and {ek } is an orthonormal basis for H.
Here, for f ∈ G and e ∈ H,
f ⊗ e (x) ≡ (x, e) f.
so each of these operators is in L2 (H, G). Next I show they are orthonormal.
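\[
\begin{aligned}
(f_j \otimes e_k, f_s \otimes e_r) &= \sum_p \big( f_j \otimes e_k (e_p),\ f_s \otimes e_r (e_p) \big) \\
&= \sum_p \delta_{rp} \delta_{kp} (f_j, f_s) = \sum_p \delta_{rp} \delta_{kp} \delta_{js}
\end{aligned}
\]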
Then
\[
\begin{aligned}
T_n e_k &= \sum_{i=1}^{n} \sum_{j=1}^{n} (Te_i, f_j)(e_k, e_i)\, f_j \\
&= \sum_{j=1}^{n} (Te_k, f_j)\, f_j
\end{aligned}
\]
It follows
\[ \|T_n e_k\| \le \|Te_k\| \]
and
\[ \lim_{n\to\infty} T_n e_k = Te_k . \]
Therefore, the linear combinations of the fj ⊗ ei are dense in L2 (H, G) and this
proves completeness.
This also shows L2 (H, G) is separable. From 14.33 it also shows that every
T ∈ L2 (H, G) is the limit in the operator norm of a sequence of compact operators.
This follows because each of the fj ⊗ ei is easily seen to be a compact operator.
and since if {xm } is any bounded sequence, there exists a subsequence, {xnk } which
converges weakly and by the above, fj ⊗ ei (xnk ) → fj ⊗ ei (x) showing bounded
sets are mapped to precompact sets.) Therefore, each T ∈ L2 (H, G) must also be
a compact operator. Here is why.
Let B be a bounded set in which ||x|| < M for all x ∈ B and consider T B. I
need to show T B is totally bounded. Let ε > 0 be given. Then let $\|T_m - T\| < \frac{\varepsilon}{3M}$
where $T_m$ is a compact operator like those described above and let $\{T_m x_j\}_{j=1}^{N}$ be
an ε/3 net for $T_m(B)$. Then
\[ \|Tx_j - T_m x_j\| < \frac{\varepsilon}{3} \]
and so letting x ∈ B, pick $x_j$ such that $\|T_m x - T_m x_j\| < \varepsilon/3$. Then
and so by the Hahn Banach theorem, there exists y ∗ extending g to all of Y having
the same operator norm.
\[ y^*(Ax) = \lim_{k\to\infty} \|x\|\, y^*_{n_k}\Big( A \Big( \frac{x}{\|x\|} \Big) \Big) = \lim_{k\to\infty} y^*_{n_k}(Ax) \]
and this is uniformly small for large k due to the uniform convergence of $y^*_{n_k}$ to f
on A (B). Therefore, $\|A^* y^* - A^* y^*_{n_k}\| \to 0$.
Define A : H → H by
Aa ≡ b ≡ {0, a1 , a2 , · · ·} .
Thus A slides the sequence to the right and puts a zero in the first slot. Clearly A is
one to one and linear but it cannot be onto because it fails to yield e1 ≡ {1, 0, 0, · · ·}.
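The behaviour of this shift can be made concrete with a small sketch. It models only finitely supported sequences as Python lists, which is an assumption made for illustration; the function names are hypothetical.

```python
def right_shift(a):
    """Model of Aa = {0, a1, a2, ...} acting on a finitely supported sequence."""
    return [0] + list(a)

def left_shift(b):
    """Drop the first slot. This undoes right_shift, so right_shift is one to one."""
    return list(b[1:])

a = [1, 2, 3]
print(right_shift(a))               # the first slot of every output is 0
print(left_shift(right_shift(a)))   # recovers a
```

Since every output of `right_shift` starts with 0, the sequence e1 = {1, 0, 0, ...} is never produced, which is the failure of surjectivity noted above.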
Notwithstanding the above example, there are theorems which are like the linear
algebra theorem mentioned above which hold in arbitrary Banach spaces in the
case where the operator is compact. To begin with here is an interesting lemma.
Awn − Awm + en − em = wn − wm
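\[
\begin{aligned}
\|w_n - z\| &= \frac{1}{\|x_n - z_n\|}\, \big\| x_n - z_n - \|x_n - z_n\|\, z \big\| \\
&\ge \frac{1}{\|x_n - z_n\|}\, \alpha_n \ge \frac{\alpha_n}{\left(1 + \frac{1}{n}\right) \alpha_n} = \frac{n}{n+1} .
\end{aligned}
\]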
Taking the limit, ||w∞ − z|| ≥ 1. Since z ∈ ker (I − A) is arbitrary, this shows
dist (w∞ , ker (I − A)) ≥ 1.
Since Case 2 does not occur, this proves the lemma.
if and only if
x∗ (f ) = 0 (14.37)
exists a solution to (I − A) x = f .
Now suppose there exists a solution, x, to (I − A) x = f for every f ∈ X. If
(I − A∗ ) x∗ = 0, then for every x ∈ X,
Corollary 14.47 In the case where X is a Hilbert space, the conclusions of Corollary
14.46, Theorem 14.45, and Lemma 14.44 remain true if H′ is replaced by H
and the adjoint is understood in the usual manner for Hilbert space. That is
400 REPRESENTATION THEOREMS
By Holder’s inequality,
\[ |\Lambda g| \le \Big( \int_\Omega 1^2 \, d\lambda \Big)^{1/2} \Big( \int_\Omega |g|^2 \, d(\lambda + \mu) \Big)^{1/2} = \lambda(\Omega)^{1/2}\, \|g\|_2 \]
The plan is to show h is real and nonnegative at least a.e. Therefore, consider the
set where Im h is positive.
E = {x ∈ Ω : Im h(x) > 0} ,
Let $f(x) = \sum_{i=1}^{\infty} h^i(x)$ and use the Monotone Convergence theorem in 15.4 to let
n → ∞ and conclude
\[ \lambda(E) = \int_E f \, d\mu . \]
$f \in L^1(\Omega, \mu)$ because λ is finite.
The function, f is unique µ a.e. because, if g is another function which also
serves to represent λ, consider for each n ∈ N the set,
\[ E_n \equiv \Big[ f - g > \frac{1}{n} \Big] \]
Similarly, the set where g is larger than f has measure zero. This proves the
theorem.
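Case where it is not necessarily true that λ ≪ µ.
In this case, let N = [h ≥ 1] and let $g = X_N$. Then
\[ \lambda(N) = \int_N h \, d(\mu + \lambda) \ge \mu(N) + \lambda(N) \]
and so µ (N ) = 0 and so $\mu(E) = \mu(E \cap N^C)$. Now define a measure, $\lambda_\perp$ by
\[ \lambda_\perp(E) \equiv \lambda(E \cap N) \]
so
\[ \lambda_\perp(E \cap N) \equiv \lambda(E \cap N \cap N) = \lambda(E \cap N) \equiv \lambda_\perp(E) \]
and let $\lambda_{||} \equiv \lambda - \lambda_\perp$. Thus,
\[ \lambda_{||}(E) = \lambda(E) - \lambda_\perp(E) \equiv \lambda(E) - \lambda(E \cap N) = \lambda(E \cap N^C) . \]
Suppose now that $\lambda_{||}(E) > 0$. It follows from the first part of the proof that since
h < 1 on $N^C$
\[
\begin{aligned}
0 < \lambda_{||}(E) = \lambda(E \cap N^C) &= \int_{E \cap N^C} h \, d(\mu + \lambda) \\
&< \mu(E \cap N^C) + \lambda(E \cap N^C) = \mu(E) + \lambda_{||}(E)
\end{aligned}
\]
which shows that µ (E) > 0. Thus if µ (E) = 0 it follows $\lambda_{||}(E) = 0$ and so $\lambda_{||} \ll \mu$.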
It only remains to verify the two measures λ⊥ and λ|| are unique. Suppose then
that ν 1 and ν 2 play the roles of λ⊥ and λ|| respectively. Let N1 play the role of N
in the definition of ν 1 and let g1 play the role of g for ν 2 . I will show that g = g1 µ
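a.e. Let $E_k \equiv [g_1 - g > 1/k]$ for k ∈ N. Then on observing that $\lambda_\perp - \nu_1 = \nu_2 - \lambda_{||}$
\[
\begin{aligned}
0 = (\lambda_\perp - \nu_1)\big( E_k \cap (N_1 \cup N)^C \big) &= \int_{E_k \cap (N_1 \cup N)^C} (g_1 - g) \, d\mu \\
&\ge \frac{1}{k}\, \mu\big( E_k \cap (N_1 \cup N)^C \big) = \frac{1}{k}\, \mu(E_k) ,
\end{aligned}
\]
and so $\mu(E_k) = 0$. Therefore, $\mu([g_1 - g > 0]) = 0$ because $[g_1 - g > 0] = \cup_{k=1}^{\infty} E_k$.
It follows $g_1 \le g$ µ a.e. Similarly, $g \le g_1$ µ a.e. Therefore, $\nu_2 = \lambda_{||}$ and so $\lambda_\perp = \nu_1$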
also. This proves the theorem.
The f in the theorem for the absolutely continuous case is sometimes denoted
by $\frac{d\lambda}{d\mu}$ and is called the Radon Nikodym derivative.
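On a finite set, where a measure is just a table of point masses, the decomposition and the Radon Nikodym derivative can be computed directly. The sketch below is a discrete analogue only, not the construction used in the proof: it assumes N may be taken to be the set of points where µ has zero mass, and all names and data are hypothetical.

```python
from fractions import Fraction

def lebesgue_decomposition(lam, mu):
    """Split lam into (lam_perp, lam_par, f) on a finite set.

    lam and mu map each point to its nonnegative point mass.
    N is the set of points of mu-mass zero; lam_perp(E) = lam(E & N),
    lam_par = lam - lam_perp, and f is the density of lam_par w.r.t. mu.
    """
    N = {x for x in mu if mu[x] == 0}
    lam_perp = {x: (Fraction(lam[x]) if x in N else Fraction(0)) for x in lam}
    lam_par = {x: Fraction(lam[x]) - lam_perp[x] for x in lam}
    f = {x: (Fraction(0) if x in N else Fraction(lam[x]) / mu[x]) for x in lam}
    return lam_perp, lam_par, f

def measure(weights, E):
    """The measure of the set E for a measure given by point masses."""
    return sum((weights[x] for x in E), Fraction(0))

def integral(f, mu, E):
    """Integral of f over E with respect to mu."""
    return sum((f[x] * mu[x] for x in E), Fraction(0))

mu = {"a": 1, "b": 2, "c": 0}     # hypothetical example measures
lam = {"a": 3, "b": 0, "c": 5}
lam_perp, lam_par, f = lebesgue_decomposition(lam, mu)
print(f)          # density of the absolutely continuous part
print(lam_perp)   # the part carried by the mu-null set N
```

Here the absolutely continuous part satisfies `measure(lam_par, E) == integral(f, mu, E)` for every subset E, which is the defining property of the derivative in this toy setting.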
The next corollary is a useful generalization to σ finite measure spaces.
\[ S_n \cap S_m = \emptyset, \quad \cup_{n=1}^{\infty} S_n = \Omega, \]
15.1. RADON NIKODYM THEOREM 403
and λ(Sn ), µ(Sn ) < ∞. Then there exists f ≥ 0, where f is µ measurable, and
\[ \lambda(E) = \int_E f \, d\mu \]
Sn ≡ {E ∩ Sn : E ∈ S}.
Then both λ, and µ are finite measures on $S_n$, and λ ≪ µ. Thus, by Theorem 15.2,
there exists a nonnegative $S_n$ measurable function $f_n$, with $\lambda(E) = \int_E f_n \, d\mu$ for all
$E \in S_n$. Define $f(x) = f_n(x)$ for $x \in S_n$. Since the $S_n$ are disjoint and their union
is all of Ω, this defines f on all of Ω. The function, f is measurable because
\[ f^{-1}((a, \infty]) = \cup_{n=1}^{\infty} f_n^{-1}((a, \infty]) \in S . \]
Also, for E ∈ S,
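\[
\begin{aligned}
\lambda(E) = \sum_{n=1}^{\infty} \lambda(E \cap S_n) &= \sum_{n=1}^{\infty} \int X_{E \cap S_n}(x) f_n(x) \, d\mu \\
&= \sum_{n=1}^{\infty} \int X_{E \cap S_n}(x) f(x) \, d\mu
\end{aligned}
\]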
Hence $\mu([f_1 - f_2 > 0]) \le \sum_{k=1}^{\infty} \mu(E_k) = 0$. Therefore, $\lambda([f_1 - f_2 > 0]) = 0$ also.
Similarly
\[ (\mu + \lambda)([f_1 - f_2 < 0]) = 0 . \]
This version of the Radon Nikodym theorem will suffice for most applications,
but more general versions are available. To see one of these, one can read the
treatment in Hewitt and Stromberg [26]. This involves the notion of decomposable
measure spaces, a generalization of σ finite.
Not surprisingly, there is a simple generalization of the Lebesgue decomposition
part of Theorem 15.2.
Corollary 15.4 Let (Ω, S) be a set with a σ algebra of sets. Suppose λ and µ are
two measures defined on the sets of S and suppose there exists a sequence of disjoint
sets of S, $\{\Omega_i\}_{i=1}^{\infty}$ such that $\lambda(\Omega_i), \mu(\Omega_i) < \infty$. Then there is a set of µ measure
zero, N and measures $\lambda_\perp$ and $\lambda_{||}$ such that
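such that $\lambda^i = \lambda^i_\perp + \lambda^i_{||}$, a set of $\mu^i$ measure zero, $N_i \in S_i$ such that for all $E \in S_i$,
$\lambda^i_\perp(E) = \lambda^i(E \cap N_i)$ and $\lambda^i_{||} \ll \mu^i$. Define for E ∈ S
\[ \lambda_\perp(E) \equiv \sum_i \lambda^i_\perp(E \cap \Omega_i), \quad \lambda_{||}(E) \equiv \sum_i \lambda^i_{||}(E \cap \Omega_i), \quad N \equiv \cup_i N_i . \]
and
\[
\begin{aligned}
\lambda_\perp(E) \equiv \sum_i \lambda^i_\perp(E \cap \Omega_i) &= \sum_i \lambda^i(E \cap \Omega_i \cap N_i) \\
&= \sum_i \lambda(E \cap \Omega_i \cap N) = \lambda(E \cap N) .
\end{aligned}
\]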
15.2. VECTOR MEASURES 405
The decomposition is unique because of the uniqueness of the λi|| and λi⊥ and the
observation that some other decomposition must coincide with the given one on the
Ωi .
Definition 15.5 Let (V, || · ||) be a normed linear space and let (Ω, S) be a measure
space. A function µ : S → V is a vector measure if µ is countably additive. That
is, if $\{E_i\}_{i=1}^{\infty}$ is a sequence of disjoint sets of S,
\[ \mu(\cup_{i=1}^{\infty} E_i) = \sum_{i=1}^{\infty} \mu(E_i) . \]
Note that it makes sense to take finite sums because it is given that µ has
values in a vector space in which vectors can be summed. In the above, µ (Ei ) is a
vector. It might be a point in Rn or in any other vector space. In many of the most
important applications, it is a vector in some sort of function space which may be
infinite dimensional. The infinite sum has the usual meaning. That is
\[ \sum_{i=1}^{\infty} \mu(E_i) = \lim_{n\to\infty} \sum_{i=1}^{n} \mu(E_i) \]
Definition 15.6 Let (Ω, S) be a measure space and let µ be a vector measure defined
on S. A subset, π(E), of S is called a partition of E if π(E) consists of finitely
many disjoint sets of S and ∪π(E) = E. Let
\[ |\mu|(E) = \sup \Big\{ \sum_{F \in \pi(E)} \|\mu(F)\| : \pi(E) \text{ is a partition of } E \Big\} . \]
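For a complex measure on a small finite set, the supremum in this definition can be evaluated by brute force over all partitions. The sketch below assumes the measure is given by hypothetical point masses stored in a dictionary; it is meant only to illustrate the definition.

```python
def partitions(points):
    """Yield every partition of the list `points` into disjoint nonempty blocks."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for smaller in partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller

def total_variation(mass, E):
    """sup over partitions pi(E) of the sum of |mu(F)| for F in pi(E)."""
    return max(sum(abs(sum(mass[x] for x in block)) for block in pi)
               for pi in partitions(list(E)))

mass = {"a": 3 + 4j, "b": -1, "c": -3 - 4j}   # hypothetical point masses
E = ["a", "b", "c"]
print(abs(sum(mass[x] for x in E)))   # |mu(E)|
print(total_variation(mass, E))       # |mu|(E), which is larger here
```

In this finite setting the supremum is attained by the partition into single points, so |µ|(E) is the sum of the moduli of the point masses, while |µ(E)| can be much smaller because of cancellation.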
The next theorem may seem a little surprising. It states that, if finite, the total
variation is a nonnegative measure.
Theorem 15.7 If |µ|(Ω) < ∞, then |µ| is a measure on S. Even if |µ| (Ω) =
∞, $|\mu|(\cup_{i=1}^{\infty} E_i) \le \sum_{i=1}^{\infty} |\mu|(E_i)$. That is |µ| is subadditive and |µ| (A) ≤ |µ| (B)
whenever A, B ∈ S with A ⊆ B.
Proof: Consider the last claim. Let a < |µ| (A) and let π (A) be a partition of
A such that
\[ a < \sum_{F \in \pi(A)} \|\mu(F)\| . \]
Since this is true for all such a, it follows |µ| (B) ≥ |µ| (A) as claimed.
Let $\{E_j\}_{j=1}^{\infty}$ be a sequence of disjoint sets of S and let $E_\infty = \cup_{j=1}^{\infty} E_j$. Then
letting $a < |\mu|(E_\infty)$, it follows from the definition of total variation there exists a
partition of $E_\infty$, $\pi(E_\infty) = \{A_1, \cdots, A_n\}$ such that
\[ a < \sum_{i=1}^{n} \|\mu(A_i)\| . \]
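Also,
\[ A_i = \cup_{j=1}^{\infty} A_i \cap E_j \]
and so by the triangle inequality, $\|\mu(A_i)\| \le \sum_{j=1}^{\infty} \|\mu(A_i \cap E_j)\|$. Therefore, by the
above, and either Fubini's theorem or Lemma 8.21 on Page 184
\[
\begin{aligned}
a &< \sum_{i=1}^{n} \overbrace{\sum_{j=1}^{\infty} \|\mu(A_i \cap E_j)\|}^{\ge \|\mu(A_i)\|} \\
&= \sum_{j=1}^{\infty} \sum_{i=1}^{n} \|\mu(A_i \cap E_j)\| \\
&\le \sum_{j=1}^{\infty} |\mu|(E_j)
\end{aligned}
\]
because $\{A_i \cap E_j\}_{i=1}^{n}$ is a partition of $E_j$.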
Since a is arbitrary, this shows
\[ |\mu|(\cup_{j=1}^{\infty} E_j) \le \sum_{j=1}^{\infty} |\mu|(E_j) . \]
If the sets, $E_j$ are not disjoint, let $F_1 = E_1$ and if $F_n$ has been chosen, let $F_{n+1} \equiv
E_{n+1} \setminus \cup_{i=1}^{n} E_i$. Thus the sets, $F_i$ are disjoint and $\cup_{i=1}^{\infty} F_i = \cup_{i=1}^{\infty} E_i$. Therefore,
\[ |\mu|\big( \cup_{j=1}^{\infty} E_j \big) = |\mu|\big( \cup_{j=1}^{\infty} F_j \big) \le \sum_{j=1}^{\infty} |\mu|(F_j) \le \sum_{j=1}^{\infty} |\mu|(E_j) \]
and proves |µ| is always subadditive as claimed regardless of whether |µ| (Ω) < ∞.
Now suppose |µ| (Ω) < ∞ and let E1 and E2 be sets of S such that E1 ∩ E2 = ∅
and let $\{A^i_1, \cdots, A^i_{n_i}\} = \pi(E_i)$, a partition of $E_i$ which is chosen such that
\[ |\mu|(E_i) - \varepsilon < \sum_{j=1}^{n_i} \|\mu(A^i_j)\| , \quad i = 1, 2. \]
Such a partition exists because of the definition of the total variation. Consider the
sets which are contained in either of π (E1 ) or π (E2 ). This collection of
sets is a partition of E1 ∪ E2 denoted by π(E1 ∪ E2 ). Then by the above inequality
and the definition of total variation,
\[ |\mu|(E_1 \cup E_2) \ge \sum_{F \in \pi(E_1 \cup E_2)} \|\mu(F)\| > |\mu|(E_1) + |\mu|(E_2) - 2\varepsilon , \]
Since n is arbitrary,
\[ |\mu|(\cup_{j=1}^{\infty} E_j) = \sum_{j=1}^{\infty} |\mu|(E_j) \]
which shows that |µ| is a measure as claimed. This proves the theorem.
In the case that µ is a complex measure, it is always the case that |µ| (Ω) < ∞.
Here 20 is just a nice sized number. No effort is made to be delicate in this argument.
Also note that µ (E) ∈ C because it is given that µ is a complex measure. Consider
the following picture consisting of two lines in the complex plane having slopes 1
and -1 which intersect at the origin, dividing the complex plane into four closed
sets, R1 , R2 , R3 , and R4 as shown.
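[Figure: the lines of slope 1 and −1 through the origin divide the complex plane into four closed sectors: R1 on the right, R2 on top, R3 on the left, and R4 on the bottom.]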
Let $\pi_i$ consist of those sets, A of π (E) for which $\mu(A) \in R_i$. Thus, some sets,
A of π (E) could be in two of the $\pi_i$ if µ (A) is on one of the intersecting lines. This
is not important. The thing which is important is that if µ (A) ∈ R1 or R3 , then
$\frac{\sqrt{2}}{2} |\mu(A)| \le |\mathrm{Re}(\mu(A))|$ and if µ (A) ∈ R2 or R4 then $\frac{\sqrt{2}}{2} |\mu(A)| \le |\mathrm{Im}(\mu(A))|$ and
Re (z) has the same sign for z in R1 and R3 while Im (z) has the same sign for z in
R2 or R4 . Then by 15.6, it follows that for some i,
\[ \sum_{F \in \pi_i} |\mu(F)| > 5\,(1 + |\mu(E)|) . \tag{15.7} \]
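\[
\begin{aligned}
\Big| \sum_{F \in \pi_i} \mu(F) \Big| &\ge \Big| \sum_{F \in \pi_i} \mathrm{Re}(\mu(F)) \Big| = \sum_{F \in \pi_i} |\mathrm{Re}(\mu(F))| \\
&\ge \frac{\sqrt{2}}{2} \sum_{F \in \pi_i} |\mu(F)| > 5\, \frac{\sqrt{2}}{2}\, (1 + |\mu(E)|) .
\end{aligned}
\]
\[ |\mu(C)| = \Big| \sum_{F \in \pi_i} \mu(F) \Big| > \frac{5}{2}\, (1 + |\mu(E)|) > 1 . \tag{15.8} \]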
Define D ≡ E \ C.
\[
\begin{aligned}
\frac{5}{2}\, (1 + |\mu(E)|) < |\mu(C)| &= |\mu(E) - \mu(E \setminus C)| \\
&= |\mu(E) - \mu(D)| \le |\mu(E)| + |\mu(D)|
\end{aligned}
\]
and so
\[ 1 < \frac{5}{2} + \frac{3}{2}\, |\mu(E)| < |\mu(D)| . \]
Now since |µ| (E) = ∞, it follows from Theorem 15.8 that ∞ = |µ| (E) ≤ |µ| (C) +
|µ| (D) and so either |µ| (C) = ∞ or |µ| (D) = ∞. If |µ| (C) = ∞, let B = C and
A = D. Otherwise, let B = D and A = C. This proves the claim.
Now suppose |µ| (Ω) = ∞. Then from the claim, there exist A1 and B1 such that
$|\mu|(B_1) = \infty$, $|\mu(B_1)|, |\mu(A_1)| > 1$, and $A_1 \cup B_1 = \Omega$. Let $B_1 \equiv \Omega \setminus A_1$ play the same
role as Ω and obtain $A_2, B_2 \subseteq B_1$ such that $|\mu|(B_2) = \infty$, $|\mu(B_2)|, |\mu(A_2)| > 1$,
and $A_2 \cup B_2 = B_1$. Continue in this way to obtain a sequence of disjoint sets, $\{A_i\}$
such that $|\mu(A_i)| > 1$. Then since µ is a measure,
\[ \mu(\cup_{i=1}^{\infty} A_i) = \sum_{i=1}^{\infty} \mu(A_i) \]
but this is impossible because $\lim_{i\to\infty} \mu(A_i) \ne 0$. This proves the theorem.
Therefore, each of
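\[ \frac{|\mathrm{Re}\,\lambda| + \mathrm{Re}\,\lambda}{2}, \quad \frac{|\mathrm{Re}\,\lambda| - \mathrm{Re}\,\lambda}{2}, \quad \frac{|\mathrm{Im}\,\lambda| + \mathrm{Im}\,\lambda}{2}, \quad \text{and} \quad \frac{|\mathrm{Im}\,\lambda| - \mathrm{Im}\,\lambda}{2} \]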
are finite measures on S. It is also clear that each of these finite measures are abso-
lutely continuous with respect to µ and so there exist unique nonnegative functions
in L1 (Ω), f1, f2 , g1 , g2 such that for all E ∈ S,
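\[
\begin{aligned}
\frac{1}{2}\,(|\mathrm{Re}\,\lambda| + \mathrm{Re}\,\lambda)(E) &= \int_E f_1 \, d\mu, \\
\frac{1}{2}\,(|\mathrm{Re}\,\lambda| - \mathrm{Re}\,\lambda)(E) &= \int_E f_2 \, d\mu, \\
\frac{1}{2}\,(|\mathrm{Im}\,\lambda| + \mathrm{Im}\,\lambda)(E) &= \int_E g_1 \, d\mu, \\
\frac{1}{2}\,(|\mathrm{Im}\,\lambda| - \mathrm{Im}\,\lambda)(E) &= \int_E g_2 \, d\mu.
\end{aligned}
\]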
[Figure: the ball B(p, r) in the complex plane, with the points (0, 0), 1, and p marked.]
because it is closer to p than r. (Refer to the picture.) However, this contradicts the
assumption of the lemma. It follows µ(E) = 0. Since the set of complex numbers,
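z such that |z| > 1 is an open set, it equals the union of countably many balls,
$\{B_i\}_{i=1}^{\infty}$. Therefore,
\[
\begin{aligned}
\mu\big( f^{-1}(\{z \in \mathbb{C} : |z| > 1\}) \big) &= \mu\big( \cup_{k=1}^{\infty} f^{-1}(B_k) \big) \\
&\le \sum_{k=1}^{\infty} \mu\big( f^{-1}(B_k) \big) = 0 .
\end{aligned}
\]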
Corollary 15.11 Let λ be a complex vector measure with |λ|(Ω) < ∞.¹ Then there
exists a unique $f \in L^1(\Omega)$ such that $\lambda(E) = \int_E f \, d|\lambda|$. Furthermore, |f | = 1 for |λ|
a.e. This is called the polar decomposition of λ.
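For a complex measure given by point masses on a finite set, the polar decomposition can be written down explicitly. This is a sketch with hypothetical data and names; the value chosen for f at points of zero mass is arbitrary, since such points form a |λ|-null set.

```python
def polar_decomposition(mass):
    """Return (tv, f) with tv[x] = |lambda({x})| and f[x] = lambda({x}) / |lambda({x})|."""
    tv = {x: abs(z) for x, z in mass.items()}
    # Where the mass vanishes, f is not determined; 1 is an arbitrary choice.
    f = {x: (mass[x] / tv[x] if tv[x] != 0 else 1) for x in mass}
    return tv, f

mass = {"a": 3 + 4j, "b": -2, "c": 0}   # hypothetical point masses
tv, f = polar_decomposition(mass)
for x in mass:
    # lambda({x}) is recovered as f(x) * |lambda|({x}), and |f(x)| = 1.
    print(x, f[x], f[x] * tv[x])
```

Summing `f[x] * tv[x]` over the points of a set E gives λ(E), which is the discrete form of the integral formula in the corollary.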
Proof: First note that λ ≪ |λ| and so such an $L^1$ function exists and is unique.
It is required to show |f | = 1 a.e. If $|\lambda|(E) \ne 0$,
\[ \Big| \frac{\lambda(E)}{|\lambda|(E)} \Big| = \Big| \frac{1}{|\lambda|(E)} \int_E f \, d|\lambda| \Big| \le 1 . \]
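\[
\begin{aligned}
&\le \sum_{i=1}^{m} \int_{F_i} \Big( 1 - \frac{1}{n} \Big) \, d|\lambda| = \sum_{i=1}^{m} \Big( 1 - \frac{1}{n} \Big) |\lambda|(F_i) \\
&= |\lambda|(E_n) \Big( 1 - \frac{1}{n} \Big) .
\end{aligned}
\]
which shows $|\lambda|(E_n) = 0$. Hence |λ| ([|f | < 1]) = 0 because $[|f| < 1] = \cup_{n=1}^{\infty} E_n$. This
proves Corollary 15.11.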
1 As proved above, the assumption that |λ| (Ω) < ∞ is redundant.
Then
\[ |\lambda|(E) = \int_E |h| \, d\mu . \]
Furthermore, $|h| = \bar{g} h$ where $g \, d|\lambda|$ is the polar decomposition of λ,
\[ \lambda(E) = \int_E g \, d|\lambda| \]
Proof: From Corollary 15.11 there exists g such that |g| = 1, |λ| a.e. and for
all E ∈ S
\[ \lambda(E) = \int_E g \, d|\lambda| = \int_E h \, d\mu . \]
Let $s_n$ be a sequence of simple functions converging pointwise to $\bar{g}$. Then from the
above,
\[ \int_E g s_n \, d|\lambda| = \int_E s_n h \, d\mu . \]
Passing to the limit using the dominated convergence theorem,
\[ \int_E d|\lambda| = \int_E \bar{g} h \, d\mu . \]
It follows $\bar{g} h \ge 0$ a.e. and $|\bar{g}| = 1$. Therefore, $|h| = |\bar{g} h| = \bar{g} h$. It follows from the
above, that
\[ |\lambda|(E) = \int_E d|\lambda| = \int_E \bar{g} h \, d\mu = \int_E d|\lambda| = \int_E |h| \, d\mu \]
and this proves the corollary.
Theorem 15.13 (Riesz representation theorem) Let p > 1 and let (Ω, S, µ) be a
finite measure space. If $\Lambda \in (L^p(\Omega))'$, then there exists a unique $h \in L^q(\Omega)$ ($\frac{1}{p} + \frac{1}{q} = 1$)
such that
\[ \Lambda f = \int_\Omega h f \, d\mu . \]
This function satisfies $\|h\|_q = \|\Lambda\|$ where ||Λ|| is the operator norm of Λ.
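On a finite measure space with finitely many atoms, the identity between the operator norm and the q-norm of h can be checked numerically. The sketch below assumes real-valued h with nonzero norm and hypothetical weights; the test function used to attain the norm is the standard extremal choice, not something taken from the proof.

```python
def lp_norm(f, mu, p):
    """(sum of |f(x)|^p mu({x}))^(1/p) on a finite measure space."""
    return sum(abs(f[x]) ** p * mu[x] for x in mu) ** (1.0 / p)

def functional(h, mu):
    """The functional f -> integral of h*f d(mu) represented by h."""
    return lambda f: sum(h[x] * f[x] * mu[x] for x in mu)

def extremal(h, mu, q):
    """f = sign(h)|h|^(q-1) / ||h||_q^(q-1); it has p-norm 1 and Lambda f = ||h||_q."""
    norm = lp_norm(h, mu, q)
    sign = lambda t: (t > 0) - (t < 0)
    return {x: sign(h[x]) * abs(h[x]) ** (q - 1) / norm ** (q - 1) for x in mu}

p, q = 3.0, 1.5                      # conjugate exponents: 1/p + 1/q = 1
mu = {0: 0.5, 1: 2.0, 2: 1.0}        # hypothetical point masses
h = {0: 1.0, 1: -2.0, 2: 0.5}        # hypothetical representing function
Lam = functional(h, mu)
f = extremal(h, mu, q)
print(lp_norm(f, mu, p))                 # close to 1
print(Lam(f), lp_norm(h, mu, q))         # these two agree up to rounding
```

Holder's inequality gives |Λg| ≤ ||h||_q ||g||_p for every g, and the extremal f shows the bound is attained, so the operator norm equals ||h||_q in this example.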
15.3. REPRESENTATION THEOREMS FOR THE DUAL SPACE OF LP 413
i=1 Ω
||XFn − XF ||p → 0.
Therefore, by continuity of Λ,
\[ \lambda(F) = \Lambda(X_F) = \lim_{n\to\infty} \Lambda(X_{F_n}) = \lim_{n\to\infty} \sum_{k=1}^{n} \Lambda(X_{E_k}) = \sum_{k=1}^{\infty} \lambda(E_k) . \]
Actually $h \in L^q$ and satisfies the other conditions above. Let $s = \sum_{i=1}^{m} c_i X_{E_i}$ be a
simple function. Then since Λ is linear,
\[ \Lambda(s) = \sum_{i=1}^{m} c_i \Lambda(X_{E_i}) = \sum_{i=1}^{m} c_i \int_{E_i} h \, d\mu = \int h s \, d\mu . \tag{15.9} \]
the first equality holding because of continuity of Λ, the second following from 15.9
and the third holding by the dominated convergence theorem.
This is a very nice formula but it still has not been shown that h ∈ Lq (Ω).
Let En = {x : |h(x)| ≤ n}. Thus |hXEn | ≤ n. Then
\[ = \|h X_{E_n}\|_q^{q/p} \]
Therefore, since $q - \frac{q}{p} = 1$, it follows that
Now that h has been shown to be in Lq (Ω), it follows from 15.9 and the density
of the simple functions, Theorem 12.13 on Page 323, that
\[ \Lambda f = \int h f \, d\mu \]
Definition 15.14 Let (Ω, S, µ) be a measure space. L∞ (Ω) is the vector space of
measurable functions such that for some M > 0, |f (x)| ≤ M for all x outside of
some set of measure zero (|f (x)| ≤ M a.e.). Define f = g when f (x) = g(x) a.e.
and ||f ||∞ ≡ inf{M : |f (x)| ≤ M a.e.}.
a.e. and so ||f ||∞ + ||g||∞ serves as one of the constants, M in the definition of
||f + g||∞ . Therefore,
||f + g||∞ ≤ ||f ||∞ + ||g||∞ .
Next let c be a number. Then $|cf(x)| = |c|\,|f(x)| \le |c|\, \|f\|_\infty$ and so $\|cf\|_\infty \le
|c|\, \|f\|_\infty$. Therefore since c is arbitrary, $\|f\|_\infty = \|c\,(1/c) f\|_\infty \le \left| \frac{1}{c} \right| \|cf\|_\infty$ which
implies $|c|\, \|f\|_\infty \le \|cf\|_\infty$. Thus $\|\cdot\|_\infty$ is a norm as claimed.
To verify completeness, let {fn } be a Cauchy sequence in L∞ (Ω) and use the
above claim to get the existence of a set of measure zero, $E_{nm}$ such that for all
$x \notin E_{nm}$,
\[ |f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty \]
Let $E = \cup_{n,m} E_{nm}$. Thus µ(E) = 0 and for each $x \notin E$, $\{f_n(x)\}_{n=1}^{\infty}$ is a Cauchy
sequence in C. Let
\[ f(x) = \begin{cases} 0 & \text{if } x \in E \\ \lim_{n\to\infty} f_n(x) & \text{if } x \notin E \end{cases} \;=\; \lim_{n\to\infty} X_{E^C}(x) f_n(x) . \]
and $F = \cup_{n=1}^{\infty} F_n$, it follows µ(F ) = 0 and that for $x \notin F \cup E$,
\[ |f(x)| \le \liminf_{n\to\infty} |f_n(x)| \le \liminf_{n\to\infty} \|f_n\|_\infty < \infty \]
because {||fn ||∞ } is a Cauchy sequence. (|||fn ||∞ − ||fm ||∞ | ≤ ||fn − fm ||∞ by the
triangle inequality.) Thus f ∈ L∞ (Ω). Let n be large enough that whenever m > n,
Then, if $x \notin E$,
Hence ||f − fn ||∞ < ε for all n large enough. This proves the theorem.
The next theorem is the Riesz representation theorem for $(L^1(\Omega))'$.
for all $f \in L^1(\Omega)$. If h is the function in $L^\infty(\Omega)$ representing $\Lambda \in (L^1(\Omega))'$, then
$\|h\|_\infty = \|\Lambda\|$.
Proof: Just as in the proof of Theorem 15.13, there exists a unique h ∈ L1 (Ω)
such that for all simple functions, s,
\[ \Lambda(s) = \int h s \, d\mu . \tag{15.11} \]
Let |k| = 1 and hk = |h|. Since the measure space is finite, k ∈ L1 (Ω). As in
Theorem 15.13 let {sn } be a sequence of simple functions converging to k in L1 (Ω),
and pointwise. It follows from the construction in Theorem 8.27 on Page 190 that
it can be assumed |sn | ≤ 1. Therefore
\[ \Lambda(k X_E) = \lim_{n\to\infty} \Lambda(s_n X_E) = \lim_{n\to\infty} \int_E h s_n \, d\mu = \int_E h k \, d\mu \]
where the last equality holds by the Dominated Convergence theorem. Therefore,
\[
\begin{aligned}
\|\Lambda\|\, \mu(E) \ge |\Lambda(k X_E)| &= \Big| \int_\Omega h k X_E \, d\mu \Big| = \int_E |h| \, d\mu \\
&\ge (\|\Lambda\| + \varepsilon)\, \mu(E) .
\end{aligned}
\]
It follows that µ(E) = 0. Since ε > 0 was arbitrary, ||Λ|| ≥ ||h||∞ . It was shown
that h ∈ L∞ (Ω), the density of the simple functions in L1 (Ω) and 15.11 imply
\[ \Lambda f = \int_\Omega h f \, d\mu , \quad \|\Lambda\| \ge \|h\|_\infty . \tag{15.12} \]
This proves the existence part of the theorem. To verify uniqueness, suppose h1
and h2 both represent Λ and let f ∈ L1 (Ω) be such that |f | ≤ 1 and f (h1 − h2 ) =
|h1 − h2 |. Then
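\[ 0 = \Lambda f - \Lambda f = \int (h_1 - h_2) f \, d\mu = \int |h_1 - h_2| \, d\mu . \]
Thus $h_1 = h_2$. Finally,
\[ \|\Lambda\| = \sup \Big\{ \Big| \int h f \, d\mu \Big| : \|f\|_1 \le 1 \Big\} \le \|h\|_\infty \le \|\Lambda\| \]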
by 15.12.
Next these results are extended to the σ finite case.
Lemma 15.17 Let (Ω, S, µ) be a measure space and suppose there exists a measurable
function, r such that r (x) > 0 for all x, there exists M such that |r (x)| < M
for all x, and $\int r \, d\mu < \infty$. Then for
Then
\[ \|\eta f\|^p_{L^p(\tilde{\mu})} = \int \big| r^{-\frac{1}{p}} f \big|^p r \, d\mu = \|f\|^p_{L^p(\mu)} \]
and so η is one to one and in fact preserves norms. I claim that also η is onto. To
see this, let $g \in L^p(\Omega, \tilde{\mu})$ and consider the function, $r^{\frac{1}{p}} g$. Then
\[ \int \big| r^{\frac{1}{p}} g \big|^p \, d\mu = \int |g|^p r \, d\mu = \int |g|^p \, d\tilde{\mu} < \infty \]
Thus $r^{\frac{1}{p}} g \in L^p(\Omega, \mu)$ and $\eta\big( r^{\frac{1}{p}} g \big) = g$ showing that η is onto as claimed. Thus
η is one to one, onto, and preserves norms. Consider the diagram below which is
descriptive of the situation in which $\eta^*$ must be one to one and onto.
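\[
\begin{array}{ccc}
h,\ L^{p'}(\tilde{\mu}) \qquad L^p(\tilde{\mu})',\ \tilde{\Lambda} & \xrightarrow{\ \eta^*\ } & L^p(\mu)',\ \Lambda \\
L^p(\tilde{\mu}) & \xleftarrow{\ \eta\ } & L^p(\mu)
\end{array}
\]
Then for $\Lambda \in L^p(\mu)'$, there exists a unique $\tilde{\Lambda} \in L^p(\tilde{\mu})'$ such that $\eta^* \tilde{\Lambda} = \Lambda$, $\|\tilde{\Lambda}\| =
\|\Lambda\|$. By the Riesz representation theorem for finite measure spaces, there exists
a unique $h \in L^{p'}(\tilde{\mu})$ which represents $\tilde{\Lambda}$ in the manner described in the Riesz
representation theorem. Thus $\|h\|_{L^{p'}(\tilde{\mu})} = \|\tilde{\Lambda}\| = \|\Lambda\|$ and for all $f \in L^p(\mu)$,
\[
\begin{aligned}
\Lambda(f) = \eta^* \tilde{\Lambda}(f) \equiv \tilde{\Lambda}(\eta f) &= \int h\, (\eta f) \, d\tilde{\mu} = \int r h \big( r^{-\frac{1}{p}} f \big) \, d\mu \\
&= \int r^{\frac{1}{p'}} h f \, d\mu .
\end{aligned}
\]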
Now
\[ \int \big| r^{\frac{1}{p'}} h \big|^{p'} d\mu = \int |h|^{p'} r \, d\mu = \|h\|^{p'}_{L^{p'}(\tilde{\mu})} < \infty . \]
Thus $\big\| r^{\frac{1}{p'}} h \big\|_{L^{p'}(\mu)} = \|h\|_{L^{p'}(\tilde{\mu})} = \|\tilde{\Lambda}\| = \|\Lambda\|$ and represents Λ in the appropriate
way. If p = 1, then $1/p' \equiv 0$. This proves the Lemma.
A situation in which the conditions of the lemma are satisfied is the case where
the measure space is σ finite. In fact, you should show this is the only case in which
the conditions of the above lemma hold.
Theorem 15.18 (Riesz representation theorem) Let (Ω, S, µ) be σ finite and let
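Define
\[ r(x) = \sum_{n=1}^{\infty} \frac{1}{n^2}\, X_{\Omega_n}(x)\, \mu(\Omega_n)^{-1}, \quad \tilde{\mu}(E) = \int_E r \, d\mu . \]
Thus
\[ \int_\Omega r \, d\mu = \tilde{\mu}(\Omega) = \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty \]
so $\tilde{\mu}$ is a finite measure. The above lemma gives the existence part of the conclusion
of the theorem. Uniqueness is done as before.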
With the Riesz representation theorem, it is easy to show that
Lp (Ω), p > 1
is a reflexive Banach space. Recall Definition 13.32 on Page 353 for the definition.
Theorem 15.19 For (Ω, S, µ) a σ finite measure space and p > 1, Lp (Ω) is reflex-
ive.
Proof: Let $\delta_r : (L^r(\Omega))' \to L^{r'}(\Omega)$ be defined for $\frac{1}{r} + \frac{1}{r'} = 1$ by
\[ \int (\delta_r \Lambda)\, g \, d\mu = \Lambda g \]
for all $g \in L^r(\Omega)$. From Theorem 15.18 $\delta_r$ is one to one, onto, continuous and linear.
By the open map theorem, $\delta_r^{-1}$ is also one to one, onto, and continuous ($\delta_r \Lambda$ equals
the representor of Λ). Thus $\delta_r^*$ is also one to one, onto, and continuous by Corollary
13.29. Now observe that $J = \delta_p^* \circ \delta_q^{-1}$. To see this, let $z^* \in (L^q)'$, $y^* \in (L^p)'$,
\[
\begin{aligned}
\delta_p^* \circ \delta_q^{-1} (\delta_q z^*)(y^*) &= (\delta_p^* z^*)(y^*) \\
&= z^*(\delta_p y^*) \\
&= \int (\delta_q z^*)(\delta_p y^*) \, d\mu ,
\end{aligned}
\]
\[
\begin{aligned}
J(\delta_q z^*)(y^*) &= y^*(\delta_q z^*) \\
&= \int (\delta_p y^*)(\delta_q z^*) \, d\mu .
\end{aligned}
\]
Therefore $\delta_p^* \circ \delta_q^{-1} = J$ on $\delta_q((L^q)') = L^p$. But the two δ maps are onto and so J is
also onto.
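\[ \Lambda_R(f) = \lambda f^+ - \lambda f^- \]
\[ (cf)^{\pm} = c f^{\pm}, \]
if c ≥ 0 while
\[ (cf)^+ = -c\,(f)^-, \]
if c < 0 and
\[ (cf)^- = (-c)\, f^+, \]
if c < 0. Thus, if c < 0,
\[ \Lambda_R(cf) = \lambda (cf)^+ - \lambda (cf)^- = \lambda\big( (-c) f^- \big) - \lambda\big( (-c) f^+ \big) \]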
Note that λ(f ) < ∞ because |Lg| ≤ ||L|| ||g|| ≤ ||L|| ||f || for |g| ≤ f . Then the
following lemma is important.
Proof: The first two assertions are easy to see so consider the third. Let
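$|g_j| \le f_j$ and let $\tilde{g}_j = e^{i\theta_j} g_j$ where $\theta_j$ is chosen such that $e^{i\theta_j} L g_j = |L g_j|$. Thus
$L \tilde{g}_j = |L g_j|$. Then
\[ |\tilde{g}_1 + \tilde{g}_2| \le f_1 + f_2 . \]
Hence
\[ |L g_1| + |L g_2| = L \tilde{g}_1 + L \tilde{g}_2 = L(\tilde{g}_1 + \tilde{g}_2) = |L(\tilde{g}_1 + \tilde{g}_2)| \le \lambda(f_1 + f_2) . \tag{15.15} \]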
Choose g1 and g2 such that |Lgi | + ε > λ(fi ). Then 15.15 shows
for all f ∈ C(X). Thus Λ(1) = µ(X). What follows is the Riesz representation
theorem for C(X)0 .
Theorem 15.22 Let $L \in (C(X))'$. Then there exists a Radon measure µ and a
function $\sigma \in L^\infty(X, \mu)$ such that
\[ L(f) = \int_X f \sigma \, d\mu . \]
Proof: Let f ∈ C(X). Then there exists a unique Radon measure µ such that
\[ |Lf| \le \Lambda(|f|) = \int_X |f| \, d\mu = \|f\|_1 . \]
with all complements of compact sets which are defined as the open sets containing
∞. Also C0 (X) will denote the space of continuous functions, f , defined on X
such that in the topology of $\widetilde{X}$, $\lim_{x\to\infty} f(x) = 0$. For this space of functions,
||f ||0 ≡ sup {|f (x)| : x ∈ X} is a norm which makes this into a Banach space.
Then the generalization is the following corollary.
Corollary 15.23 Let $L \in (C_0(X))'$ where X is a locally compact Hausdorff space.
Then there exists $\sigma \in L^\infty(X, \mu)$ for µ a finite Radon measure such that for all
$f \in C_0(X)$,
\[ L(f) = \int_X f \sigma \, d\mu . \]
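Proof: Let
\[ \widetilde{D} \equiv \big\{ f \in C(\widetilde{X}) : f(\infty) = 0 \big\} . \]
Thus $\widetilde{D}$ is a closed subspace of the Banach space $C(\widetilde{X})$. Let $\theta : C_0(X) \to \widetilde{D}$ be
defined by
\[ \theta f(x) = \begin{cases} f(x) & \text{if } x \in X, \\ 0 & \text{if } x = \infty. \end{cases} \]
Then θ is an isometry of $C_0(X)$ and $\widetilde{D}$. (||θu|| = ||u|| .) The following diagram is
obtained.
\[
\begin{array}{ccccc}
C_0(X)' & \xleftarrow{\ \theta^*\ } & \widetilde{D}' & \xleftarrow{\ i^*\ } & C(\widetilde{X})' \\
C_0(X) & \xrightarrow{\ \theta\ } & \widetilde{D} & \xrightarrow{\ i\ } & C(\widetilde{X})
\end{array}
\]
By the Hahn Banach theorem, there exists $L_1 \in C(\widetilde{X})'$ such that $\theta^* i^* L_1 = L$.
Now apply Theorem 15.22 to get the existence of a finite Radon measure, $\mu_1$, on $\widetilde{X}$
and a function $\sigma \in L^\infty(\widetilde{X}, \mu_1)$, such that
\[ L_1 g = \int_{\widetilde{X}} g \sigma \, d\mu_1 . \]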
S ≡ {E \ {∞} : E ∈ S1 }
The following lemma says that the difference of regular complex measures is also
regular.
From 15.17 and the regularity of µ, it follows that |λ| is also regular.
What of the claim about ||L||? By the regularity of |λ| , it follows that $C_0(X)$ (in
fact, $C_c(X)$) is dense in $L^1(X, |\lambda|)$. Since |λ| is finite, $\bar{g} \in L^1(X, |\lambda|)$. Therefore,
there exists a sequence of functions in $C_0(X)$, $\{f_n\}$ such that $f_n \to \bar{g}$ in $L^1(X, |\lambda|)$.
Therefore, there exists a subsequence, still denoted by $\{f_n\}$ such that $f_n(x) \to \bar{g}(x)$
|λ| a.e. also. But since |g (x)| = 1 a.e. it follows that $h_n(x) \equiv \frac{f_n(x)}{|f_n(x)| + \frac{1}{n}}$ also
converges pointwise |λ| a.e. to $\bar{g}(x)$. Then from the dominated convergence theorem and
15.18
\[ \|L\| \ge \lim_{n\to\infty} \int_X h_n g \, d|\lambda| = |\lambda|(X) . \]
Also, if $\|f\|_{C_0(X)} \le 1$, then
\[ |L(f)| = \Big| \int_X f g \, d|\lambda| \Big| \le \int_X |f| \, d|\lambda| \le |\lambda|(X)\, \|f\|_{C_0(X)} \]
15.7 Exercises
1. Suppose µ is a vector measure having values in Rn or Cn . Can you show that
|µ| must be finite? Hint: You might define for each ei , one of the standard ba-
sis vectors, the real or complex measure, µei given by µei (E) ≡ ei ·µ (E) . Why
would this approach not yield anything for an infinite dimensional normed lin-
ear space in place of Rn ?
2. The Riesz representation theorem of the Lp spaces can be used to prove a
very interesting inequality. Let r, p, q ∈ (1, ∞) satisfy
\[ \frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1 . \]
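Then
\[ \frac{1}{q} = 1 + \frac{1}{r} - \frac{1}{p} > \frac{1}{r} \]
and so r > q. Let θ ∈ (0, 1) be chosen so that θr = q. Then also we have
\[ \frac{1}{r} = \overbrace{\Big( 1 - \frac{1}{p'} \Big)}^{1/p + 1/p' = 1} + \frac{1}{q} - 1 = \frac{1}{q} - \frac{1}{p'} \]
and so
\[ \frac{\theta}{q} = \frac{1}{q} - \frac{1}{p'} \]
which implies $p'(1 - \theta) = q$. Now let $f \in L^p(\mathbb{R}^n)$, $g \in L^q(\mathbb{R}^n)$, f, g ≥ 0.
Justify the steps in the following argument using what was just shown that
θr = q and $p'(1 - \theta) = q$. Let
\[ h \in L^{r'}(\mathbb{R}^n) . \quad \Big( \frac{1}{r} + \frac{1}{r'} = 1 \Big) \]
\[
\begin{aligned}
\int f * g(x)\, |h(x)| \, dx &= \int \int f(y)\, g(x - y)\, |h(x)| \, dx \, dy \\
&\le \int \int |f(y)|\, |g(x - y)|^{\theta}\, |g(x - y)|^{1 - \theta}\, |h(x)| \, dy \, dx \\
&\le \int \Big( \int \big( |g(x - y)|^{1 - \theta}\, |h(x)| \big)^{r'} dx \Big)^{1/r'} \cdot
\Big( \int \big( |f(y)|\, |g(x - y)|^{\theta} \big)^{r} dx \Big)^{1/r} dy \\
&= \|g\|_q^{q/r}\, \|g\|_q^{q/p'}\, \|f\|_p\, \|h\|_{r'} = \|g\|_q\, \|f\|_p\, \|h\|_{r'} .
\end{aligned}
\tag{15.19}
\]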
Young’s inequality says that
Therefore ||f ∗ g||r ≤ ||g||q ||f ||p . How does this inequality follow from the
above computation? Does 15.19 continue to hold if r, p, q are only assumed to
be in [1, ∞]? Explain. Does 15.20 hold even if r, p, and q are only assumed to
lie in [1, ∞]?
3. Show that in a reflexive Banach space, weak and weak ∗ convergence are the
same.
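The inequality of Exercise 2 has a discrete analogue for finitely supported sequences on the integers, with sums in place of integrals, and that version is easy to test numerically. The sketch below is only an illustration of that analogue, not of the argument on R^n; the sequences and exponents are hypothetical choices.

```python
def convolve(f, g):
    """Discrete convolution of two finitely supported sequences on Z."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def norm(f, p):
    """The l^p norm of a finitely supported sequence."""
    return sum(abs(t) ** p for t in f) ** (1.0 / p)

def young_gap(f, g, p, q):
    """Return ||f||_p ||g||_q - ||f*g||_r where 1/r = 1/p + 1/q - 1."""
    r = 1.0 / (1.0 / p + 1.0 / q - 1.0)
    return norm(f, p) * norm(g, q) - norm(convolve(f, g), r)

f = [1.0, 2.0, 3.0]
g = [4.0, 0.0, -1.0, 2.0]
print(convolve(f, g))
print(young_gap(f, g, 1.5, 1.5))   # nonnegative, as the inequality predicts (here r = 3)
```

A nonnegative gap for many random choices of f, g, p, and q is consistent with the inequality but of course proves nothing; the exercise asks for the justification.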
430 INTEGRALS AND DERIVATIVES
\[ \bar{m}([Mf > \alpha]) \le \frac{5^n}{\alpha}\, \|f\|_1 . \]
(Here and elsewhere, [M f > α] ≡ {x ∈ Rn : M f (x) > α} with other occurrences of
[ ] being defined similarly.)
By the Vitali covering theorem, there are disjoint balls B(xi , ri ) such that
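\[ S \subseteq \cup_{i=1}^{\infty} B(x_i, 5r_i) \]
and
\[ \frac{1}{m(B(x_i, r_i))} \int_{B(x_i, r_i)} |f| \, dm > \alpha . \]
Therefore
\[
\begin{aligned}
\bar{m}(S) &\le \sum_{i=1}^{\infty} m(B(x_i, 5r_i)) = 5^n \sum_{i=1}^{\infty} m(B(x_i, r_i)) \\
&\le \frac{5^n}{\alpha} \sum_{i=1}^{\infty} \int_{B(x_i, r_i)} |f| \, dm \\
&\le \frac{5^n}{\alpha} \int_{\mathbb{R}^n} |f| \, dm ,
\end{aligned}
\]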
the last inequality being valid because the balls B(xi , ri ) are disjoint. This proves
the theorem.
Note that at this point it is unknown whether S is measurable. This is why
the outer measure $\bar{m}(S)$ and not m (S) is written.
The following is the fundamental theorem of calculus from elementary calculus.
and so
\[
\begin{aligned}
\Big| g(x) - \frac{1}{m(B(x, r))} \int_{B(x,r)} g(y) \, dy \Big|
&= \Big| \frac{1}{m(B(x, r))} \int_{B(x,r)} (g(y) - g(x)) \, dy \Big| \\
&\le \frac{1}{m(B(x, r))} \int_{B(x,r)} |g(y) - g(x)| \, dy .
\end{aligned}
\]
Now by continuity of g at x, there exists r > 0 such that if |x − y| < r, |g (y) − g (x)| <
ε. For such r, the last expression is less than
\[ \frac{1}{m(B(x, r))} \int_{B(x,r)} \varepsilon \, dy = \varepsilon . \]
This proves the lemma.
Definition 16.4 Let $f \in L^1(\mathbb{R}^k, m)$. A point, $x \in \mathbb{R}^k$ is said to be a Lebesgue
point if
\[ \limsup_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dm = 0 . \]
à Z !
1
= lim sup |f (y) − f (x)| − |g (y) − g (x)| dm
r→0 m (B (x, r)) B(x,r)
à Z !
1
≤ lim sup ||f (y) − f (x)| − |g (y) − g (x)|| dm
r→0 m (B (x, r)) B(x,r)
à Z !
1
≤ lim sup |f (y) − g (y) − (f (x) − g (x))| dm
r→0 m (B (x, r)) B(x,r)
à Z !
1
≤ lim sup |f (y) − g (y)| dm + |f (x) − g (x)|
r→0 m (B (x, r)) B(x,r)
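Therefore,
\[
\begin{aligned}
&\Big[ x : \limsup_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dm > \lambda \Big] \\
&\qquad \subseteq \Big[ M(f - g) > \frac{\lambda}{2} \Big] \cup \Big[ |f - g| > \frac{\lambda}{2} \Big]
\end{aligned}
\]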
Now
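\[
\begin{aligned}
\varepsilon > \int |f - g| \, dm &\ge \int_{[|f - g| > \frac{\lambda}{2}]} |f - g| \, dm \\
&\ge \frac{\lambda}{2}\, m\Big( \Big[ |f - g| > \frac{\lambda}{2} \Big] \Big)
\end{aligned}
\]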
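This along with the weak estimate of Theorem 16.2 implies
\[
\begin{aligned}
&m\Big( \Big[ x : \limsup_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dm > \lambda \Big] \Big) \\
&\qquad < \Big( \frac{2}{\lambda}\, 5^k + \frac{2}{\lambda} \Big) \|f - g\|_{L^1(\mathbb{R}^k)} \\
&\qquad < \Big( \frac{2}{\lambda}\, 5^k + \frac{2}{\lambda} \Big) \varepsilon .
\end{aligned}
\]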
Since ε > 0 is arbitrary, it follows
\[ m\Big( \Big[ x : \limsup_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dm > \lambda \Big] \Big) = 0 . \]
Now let
\[ N = \Big[ x : \limsup_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dm > 0 \Big] \]
16.1. THE FUNDAMENTAL THEOREM OF CALCULUS 433
and
\[ N_n = \Big[ x : \limsup_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dm > \frac{1}{n} \Big] \]
It was just shown that $m(N_n) = 0$. Also, $N = \cup_{n=1}^{\infty} N_n$. Therefore, m (N ) = 0 also.
It follows that for $x \notin N$,
\[ \limsup_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dm = 0 \]
Corollary 16.6 (Fundamental Theorem of Calculus) Let f ∈ L1loc (Rk ). Then there
exists a set of measure 0, N , such that if $x \notin N$, then
\[ \lim_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dy = 0 . \]
Proof: Consider B (0, n) where n is a positive integer. Then $f_n \equiv f X_{B(0,n)} \in
L^1(\mathbb{R}^k)$ and so there exists a set of measure 0, $N_n$ such that if $x \in B(0, n) \setminus N_n$,
then
\[
\begin{aligned}
&\lim_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f_n(y) - f_n(x)| \, dy \\
&\qquad = \lim_{r\to 0} \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dy = 0 .
\end{aligned}
\]
Let $N = \cup_{n=1}^{\infty} N_n$. Then if $x \notin N$, the above equation holds.
Proof:
\[
\begin{aligned}
&\Big| \frac{1}{m(B(x, r))} \int_{B(x,r)} f(y) \, dy - f(x) \Big| \\
&\qquad \le \frac{1}{m(B(x, r))} \int_{B(x,r)} |f(y) - f(x)| \, dy
\end{aligned}
\]
Definition 16.8 For N the set of Theorem 16.5 or Corollary 16.6, N C is called
the Lebesgue set or the set of Lebesgue points.
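A one dimensional example shows what can go wrong at a point outside the Lebesgue set. The sketch below takes f to be the indicator function of [0, 1], a hypothetical choice, for which the averaged quantity in Definition 16.4 can be computed exactly from interval lengths; the helper names are not from the text.

```python
def overlap(a, b, c, d):
    """Length of the intersection of the intervals (a, b) and (c, d)."""
    return max(0.0, min(b, d) - max(a, c))

def mean_oscillation(x, r):
    """(1/m(B(x,r))) * integral over B(x,r) of |f(y) - f(x)| dy, f = indicator of [0, 1]."""
    inside = overlap(x - r, x + r, 0.0, 1.0)     # measure of the part of B(x, r) where f = 1
    fx = 1.0 if 0.0 <= x <= 1.0 else 0.0
    # |f(y) - f(x)| equals 1 exactly where f(y) differs from f(x), and 0 elsewhere.
    differing = (2 * r - inside) if fx == 1.0 else inside
    return differing / (2 * r)

for r in (0.1, 0.01, 0.001):
    # At x = 0.5 the average tends to 0; at the jump x = 0 it stays near 1/2.
    print(r, mean_oscillation(0.5, r), mean_oscillation(0.0, r))
```

So every point other than the two jump points 0 and 1 is a Lebesgue point of this f, and the exceptional set {0, 1} has measure zero, as the theorem requires.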
The next corollary is a one dimensional version of what was just presented.
For simplicity, V [a, x] will be denoted by V (x) . It is called the total variation of
the function, f.
There are some simple facts about the total variation of an absolutely continuous
function, f which are contained in the next lemma.
Also if y > x,
V (y) − V (x) ≥ |f (y) − f (x)| (16.8)
and the function, x → V (x) − f (x) is increasing. The total variation function, V
is absolutely continuous.
Proof: The claim that V is increasing is obvious as is the next claim about
P ⊆ Q leading to VP [x, y] ≤ VQ [x, y] . To verify this, simply add in one point
at a time and verify that from the triangle inequality, the sum involved gets no
smaller. The claim that V is increasing consistent with set inclusion of intervals is
also clearly true and follows directly from the definition.
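The monotonicity under refinement used here is easy to observe numerically. The following sketch uses a hypothetical function and partition; `refine` adds midpoints, which is one particular way of producing Q with P ⊆ Q.

```python
import math

def variation(f, partition):
    """V_P: the sum of |f(x_i) - f(x_{i-1})| over consecutive points of the partition."""
    return sum(abs(f(b) - f(a)) for a, b in zip(partition, partition[1:]))

def refine(partition):
    """Add the midpoint of every subinterval, giving Q with P contained in Q."""
    out = []
    for a, b in zip(partition, partition[1:]):
        out += [a, (a + b) / 2.0]
    return out + [partition[-1]]

P = [0.0, 2.0, 4.0, 6.0]      # hypothetical partition of [0, 6]
Q = refine(P)
print(variation(math.sin, P), variation(math.sin, Q))   # the second is no smaller
print(abs(math.sin(6.0) - math.sin(0.0)))               # a lower bound for both
```

Each added point can only increase the sum, by the triangle inequality, and the two point partition {x, y} gives the lower bound |f (y) − f (x)| that appears in 16.8.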
Now let t < V [x, y] where P0 = {x0 , x1 , · · ·, xn } is a partition of [x, y] . There
exists a partition, P of [x, y] such that t < VP [x, y] . Without loss of generality it
can be assumed that {x0 , x1 , · · ·, xn } ⊆ P since if not, you can simply add in the
points of P0 and the resulting sum for the total variation will get no smaller. Let
Pi be those points of P which are contained in [xi−1 , xi ] . Then
\[ t < V_P[x, y] = \sum_{i=1}^{n} V_{P_i}[x_{i-1}, x_i] \le \sum_{i=1}^{n} V[x_{i-1}, x_i] . \]
Note that 16.9 does not depend on f being absolutely continuous. Suppose now
that f is absolutely continuous. Let δ correspond to ε = 1. Then if [x, y] is an
interval of length no larger than δ, the definition of absolute continuity implies
V [x, y] < 1.
Thus V is bounded on [a, b]. Now let Pi be a partition of [xi−1 , xi ] such that
\[ V_{P_i}[x_{i-1}, x_i] > V[x_{i-1}, x_i] - \frac{\varepsilon}{n} \]
Then letting $P = \cup P_i$,
\[ -\varepsilon + \sum_{i=1}^{n} V[x_{i-1}, x_i] < \sum_{i=1}^{n} V_{P_i}[x_{i-1}, x_i] = V_P[x, y] \le V[x, y] . \]
where this integral is the Riemann Stieltjes integral with respect to the integrating
function, f. By the Riesz representation theorem for positive linear functionals,
there exists a unique Radon measure, µ such that $Lg = \int g\,d\mu$. Now consider the
following picture for gn ∈ C ([a, b]) in which gn equals 1 for x between x + 1/n and
y.
[Figure: graph of $g_n$, with the points x, x + 1/n, y, y + 1/n marked on the horizontal axis.]
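Then $g_n(t) \to \mathcal{X}_{(x,y]}(t)$ pointwise. Therefore, by the dominated convergence
theorem,
\[
\mu((x,y]) = \lim_{n\to\infty} \int g_n\,d\mu.
\]
However,
\[
f(y) - f\left(x + \frac{1}{n}\right) \le \int g_n\,d\mu = \int_a^b g_n\,df
\le \left(f\left(y + \frac{1}{n}\right) - f(y)\right) + \left(f(y) - f\left(x + \frac{1}{n}\right)\right) + \left(f\left(x + \frac{1}{n}\right) - f(x)\right).
\]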
Similarly, $\mu((x,y)) = f(y) - f(x)$ and $\mu([x,y]) = f(y) - f(x)$, the argument used
to establish this being very similar to the above. It follows in particular that
\[
f(x) - f(a) = \int_{[a,x]} d\mu.
\]
Note that up till now, no reference has been made to the absolute continuity of f.
Any increasing continuous function would be fine.
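The squeeze used above can be illustrated with a rough numerical experiment. This is only a sketch under stated assumptions: the integrating function f(t) = t^2 on [0, 2], the endpoints x = 0.5 and y = 1, and the helper names `g` and `stieltjes` are all hypothetical choices, and the Riemann Stieltjes integral is approximated by a left endpoint sum.

```python
def g(n, x, y, t):
    """Trapezoid g_n: 0 outside (x, y + 1/n), 1 on [x + 1/n, y], linear in between."""
    if t <= x or t >= y + 1.0 / n:
        return 0.0
    if t < x + 1.0 / n:
        return n * (t - x)
    if t <= y:
        return 1.0
    return 1.0 - n * (t - y)


def stieltjes(gfun, f, a, b, steps=20000):
    """Left endpoint Riemann Stieltjes sum of gfun with respect to f on [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        total += gfun(t0) * (f(t1) - f(t0))
    return total


def f(t):
    # Hypothetical increasing continuous integrator on [0, 2].
    return t * t


x, y = 0.5, 1.0
results = {}
for n in (10, 100):
    integral = stieltjes(lambda t: g(n, x, y, t), f, 0.0, 2.0)
    lower = f(y) - f(x + 1.0 / n)      # left side of the displayed inequality
    upper = f(y + 1.0 / n) - f(x)      # the three terms on the right telescope to this
    results[n] = (lower, integral, upper)
    print(n, lower, integral, upper)
```

As n grows, both bounds approach f(y) - f(x) = 0.75, which is the value assigned to µ((x, y]).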
Now if E is a Borel set such that m(E) = 0, then the outer regularity of m
implies there exists an open set, V containing E such that m(V) < δ where δ
corresponds to ε in the definition of absolute continuity of f. Then letting $\{I_k\}$ be
the connected components of V it follows $E \subseteq \cup_{k=1}^{\infty} I_k$ with $\sum_k m(I_k) = m(V) < \delta$.
Therefore, from absolute continuity of f, it follows that for $I_k = (a_k, b_k)$ and each
n,
\[
\mu\left(\cup_{k=1}^{n} I_k\right) = \sum_{k=1}^{n} \mu(I_k) = \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon
\]
and so letting n → ∞,
\[
\mu(E) \le \mu(V) = \sum_{k=1}^{\infty} |f(b_k) - f(a_k)| \le \varepsilon.
\]
In particular,
\[
\mu([a,x]) = f(x) - f(a) = \int_{[a,x]} h\,dm.
\]
From the fundamental theorem of calculus $f'(x) = h(x)$ at every Lebesgue point
of h. Therefore, writing in usual notation,
\[
f(x) = f(a) + \int_a^x f'(t)\,dt
\]
Proof: Suppose first that f is absolutely continuous. By Lemma 16.12 the total
variation function, V is absolutely continuous and f (x) = V (x) − (V (x) − f (x))
where both V and V −f are increasing and absolutely continuous. By Lemma 16.13
It turns out this limit exists for m a.e. x. To verify this here is another definition.
This is well defined because the function r → inf {f (t) : t ∈ [0, r]} is increasing and
r → sup {f (t) : t ∈ [0, r]} is decreasing. Also note that limr→0+ f (r) exists if and
only if
\[
\limsup_{r\to 0+} f(r) = \liminf_{r\to 0+} f(r).
\]
The claims made in the above definition follow immediately from the definition
of what is meant by a limit in [−∞, ∞] and are left for the reader.
Theorem 16.18 Let µ be a Borel measure on $\mathbb{R}^n$. Then $\frac{d\mu}{dm}(x)$ exists in [−∞, ∞]
m a.e.
Proof: Let p < q and let p, q be rational numbers. Define
\[
N_{pq}(M) \equiv \left\{ x \in \mathbb{R}^n \text{ such that } \limsup_{r\to 0+} \frac{\mu(B(x,r))}{m(B(x,r))} > q > p > \liminf_{r\to 0+} \frac{\mu(B(x,r))}{m(B(x,r))} \right\} \cap B(0,M),
\]
\[
N_{pq} \equiv \left\{ x \in \mathbb{R}^n \text{ such that } \limsup_{r\to 0+} \frac{\mu(B(x,r))}{m(B(x,r))} > q > p > \liminf_{r\to 0+} \frac{\mu(B(x,r))}{m(B(x,r))} \right\},
\]
\[
N \equiv \left\{ x \in \mathbb{R}^n \text{ such that } \limsup_{r\to 0+} \frac{\mu(B(x,r))}{m(B(x,r))} > \liminf_{r\to 0+} \frac{\mu(B(x,r))}{m(B(x,r))} \right\}.
\]
I will show $m(N_{pq}(M)) = 0$. Use outer regularity to obtain an open set, V
containing $N_{pq}(M)$ such that
\[
m(N_{pq}(M)) + \varepsilon > m(V).
\]
From the definition of $N_{pq}(M)$, it follows that for each $x \in N_{pq}(M)$ there exist
arbitrarily small r > 0 such that
\[
\frac{\mu(B(x,r))}{m(B(x,r))} < p.
\]
Only consider those r which are small enough that B(x, r) is contained in
$V \cap B(0,M)$, so that the collection of such balls has bounded radii and every ball
lies in V. This is a Vitali cover of $N_{pq}(M)$ and
so by Corollary 16.15 there exists a sequence of disjoint balls of this sort, $\{B_i\}_{i=1}^{\infty}$
such that
\[
\mu(B_i) < p\,m(B_i), \quad m\left(N_{pq}(M) \setminus \cup_{i=1}^{\infty} B_i\right) = 0. \tag{16.10}
\]
Now for $x \in N_{pq}(M) \cap (\cup_{i=1}^{\infty} B_i)$ (most of $N_{pq}(M)$), there exist arbitrarily small
balls, B(x, r), such that B(x, r) is contained in some set of $\{B_i\}_{i=1}^{\infty}$ and
\[
\frac{\mu(B(x,r))}{m(B(x,r))} > q.
\]
This is a Vitali cover of $N_{pq}(M) \cap (\cup_{i=1}^{\infty} B_i)$ and so there exists a sequence of disjoint
balls of this sort, $\{B_j'\}_{j=1}^{\infty}$ such that
\[
m\left(\left(N_{pq}(M) \cap (\cup_{i=1}^{\infty} B_i)\right) \setminus \cup_{j=1}^{\infty} B_j'\right) = 0, \quad \mu(B_j') > q\,m(B_j'). \tag{16.11}
\]
16.3. DIFFERENTIATION OF MEASURES WITH RESPECT TO LEBESGUE MEASURE441
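Therefore,
\[
\begin{aligned}
\sum_j \mu(B_j') &> q \sum_j m(B_j') \ge q\,m\left(N_{pq}(M) \cap (\cup_i B_i)\right) = q\,m(N_{pq}(M)) \\
&\ge p\,m(N_{pq}(M)) \ge p\,(m(V) - \varepsilon) \ge p \sum_i m(B_i) - p\varepsilon \\
&\ge \sum_i \mu(B_i) - p\varepsilon \ge \sum_j \mu(B_j') - p\varepsilon.
\end{aligned}
\]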
It follows
\[
p\varepsilon \ge (q - p)\,m(N_{pq}(M)).
\]
Since ε is arbitrary, $m(N_{pq}(M)) = 0$. Now $N_{pq} \subseteq \cup_{M=1}^{\infty} N_{pq}(M)$ and so $m(N_{pq}) = 0$. Now
\[
N = \cup_{p,q \in \mathbb{Q}} N_{pq}
\]
and since this is a countable union of sets of measure zero, m(N) = 0 also. This
proves the theorem.
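For a concrete feel for the quotient in Theorem 16.18, here is a small sketch in one dimension. It assumes a hypothetical measure µ with density h(t) = t^2 with respect to Lebesgue measure, so that µ of an interval has a closed form; the function names are illustrative only.

```python
def mu_interval(lo, hi):
    """mu((lo, hi)) for the hypothetical density h(t) = t**2 (integral of t**2)."""
    return (hi ** 3 - lo ** 3) / 3.0


def ratio(x, r):
    """mu(B(x, r)) / m(B(x, r)) in one dimension, where B(x, r) = (x - r, x + r)."""
    return mu_interval(x - r, x + r) / (2.0 * r)


x = 1.5
ratios = [ratio(x, 10.0 ** (-k)) for k in range(1, 6)]
print(ratios)   # the quotients approach h(x) = x**2 as r shrinks
```

For this density the quotient equals x^2 + r^2/3 exactly, so the lim sup and lim inf agree at every point and the exceptional set N is empty.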
From Theorem 15.8 on Page 407 it follows that if µ is a complex measure then
|µ| is a finite measure. This makes possible the following definition.
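Definition 16.19 Let µ be a real measure. Define the following measures. For E
a measurable set,
\[
\mu^+(E) \equiv \frac{1}{2}\left(|\mu| + \mu\right)(E), \qquad
\mu^-(E) \equiv \frac{1}{2}\left(|\mu| - \mu\right)(E).
\]
These are measures thanks to Theorem 15.7 on Page 406 and $\mu^+ - \mu^- = \mu$. These
measures have values in [0, ∞). They are called the positive and negative parts of µ
respectively. For µ a complex measure, define Re µ and Im µ by
\[
\operatorname{Re}\mu(E) \equiv \frac{1}{2}\left(\mu(E) + \overline{\mu(E)}\right), \qquad
\operatorname{Im}\mu(E) \equiv \frac{1}{2i}\left(\mu(E) - \overline{\mu(E)}\right).
\]
Then Re µ and Im µ are both real measures. Thus for µ a complex measure,
\[
\mu = \operatorname{Re}\mu^+ - \operatorname{Re}\mu^- + i\left(\operatorname{Im}\mu^+ - \operatorname{Im}\mu^-\right)
= \nu_1 - \nu_2 + i\,(\nu_3 - \nu_4)
\]
where each $\nu_i$ is a measure having values in [0, ∞).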
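The positive and negative parts are easy to compute for a measure carried by finitely many atoms. The sketch below assumes a hypothetical real measure on the three point set {a, b, c}; for such a measure the total variation of a set is the sum of the absolute values of the atom weights, which is the only fact used beyond Definition 16.19.

```python
# Hypothetical real measure on a three point set: mu({k}) = weights[k].
weights = {"a": 2.0, "b": -3.0, "c": 0.5}


def mu(E):
    return sum(weights[k] for k in E)


def total_variation(E):
    # For a measure concentrated on finitely many atoms, |mu|(E) sums |weights|.
    return sum(abs(weights[k]) for k in E)


def mu_plus(E):
    return 0.5 * (total_variation(E) + mu(E))


def mu_minus(E):
    return 0.5 * (total_variation(E) - mu(E))


E = {"a", "b", "c"}
print(mu(E), total_variation(E), mu_plus(E), mu_minus(E))
```

Here the positive part collects the atoms with positive weight and the negative part collects the absolute value of the negative atom, so their difference recovers µ.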
Theorem 15.2 on Page 399, the Radon Nikodym theorem, implies that if you have
two finite measures, µ and λ, you can write λ as the sum of a measure absolutely
continuous with respect to µ and one which is singular to µ in a unique way. The
next topic is related to this. It has to do with the differentiation of a measure which
is singular with respect to Lebesgue measure.
Let ε > 0. Since µ is regular, there exists H, a compact set such that H ⊆
N ∩ B (0, M ) and
µ (N ∩ B (0, M ) \ H) < ε.
[Figure: the ball B(0, M) containing N ∩ B(0, M) and the compact set H, together with balls $B_i$ and the set $B_k(M)$.]
For each $x \in B_k(M)$, there exist arbitrarily small r > 0 such that $B(x,r) \subseteq B(0,M) \setminus H$ and
\[
\frac{\mu(B(x,r))}{m(B(x,r))} > \frac{1}{k}. \tag{16.13}
\]
Two such balls are illustrated in the above picture. This is a Vitali cover of $B_k(M)$
and so there exists a sequence of disjoint balls of this sort, $\{B_i\}_{i=1}^{\infty}$ such that
$m(B_k(M) \setminus \cup_i B_i) = 0$. Therefore,
\[
\begin{aligned}
m(B_k(M)) &\le m\left(B_k(M) \cap (\cup_i B_i)\right) \le \sum_i m(B_i) \le k \sum_i \mu(B_i) \\
&= k \sum_i \mu(B_i \cap N) = k \sum_i \mu\left(B_i \cap N \cap B(0,M)\right) \\
&\le k\,\mu\left(N \cap B(0,M) \setminus H\right) < \varepsilon k.
\end{aligned}
\]
Lemma 16.22 Suppose µ is a Borel measure on Rn having values in [0, ∞). Then
there exists a Radon measure, µ1 such that µ1 = µ on all Borel sets.
By the Riesz representation theorem for positive linear functionals of this sort, there
exists a unique Radon measure, $\mu_1$ such that for all $f \in C_c(\mathbb{R}^n)$,
\[
\int f\,d\mu_1 = Lf = \int f\,d\mu.
\]
Now let V be an open set and let $K_k \equiv \{x \in V : \operatorname{dist}(x, V^C) \ge 1/k\} \cap \overline{B(0,k)}$.
Then $\{K_k\}$ is an increasing sequence of compact sets whose union is V. Let $K_k \prec f_k \prec V$.
Then $f_k(x) \to \mathcal{X}_V(x)$ for every x. Therefore,
\[
\mu_1(V) = \lim_{k\to\infty} \int f_k\,d\mu_1 = \lim_{k\to\infty} \int f_k\,d\mu = \mu(V)
\]
and so µ and µ1 are equal on all compact sets. It follows µ = µ1 on all countable
unions of compact sets and countable intersections of open sets.
Now let E be a Borel set. By regularity of µ1 , there exist sets, H and G such
that H is the countable union of an increasing sequence of compact sets, G is the
countable intersection of a decreasing sequence of open sets, H ⊆ E ⊆ G, and
µ1 (H) = µ1 (G) = µ1 (E) . Therefore,
16.4 Exercises
1. Suppose A and B are sets of positive Lebesgue measure in Rn . Show that
A − B must contain B (c, ε) for some c ∈ Rn and ε > 0.
A − B ≡ {a − b : a ∈ A and b ∈ B} .
Hint: First assume both sets are bounded. This creates no loss of generality.
Next there exist $a_0 \in A$, $b_0 \in B$ and δ > 0 such that
\[
\int_{B(a_0,\delta)} \mathcal{X}_A(t)\,dt > \frac{3}{4}\,m(B(a_0,\delta)), \quad
\int_{B(b_0,\delta)} \mathcal{X}_B(t)\,dt > \frac{3}{4}\,m(B(b_0,\delta)).
\]
Explain why
\[
m\left((A - a_0) \cap (B - b_0)\right) > \frac{1}{2}\,m(B(0,\delta)) > 0.
\]
Let
\[
f(x) \equiv \int \mathcal{X}_{A - a_0}(x + t)\,\mathcal{X}_{B - b_0}(t)\,dt.
\]
Explain why f(0) > 0. Next explain why f is continuous and why f(x) > 0
for all x ∈ B(0, ε) for some ε > 0. Thus if |x| < ε, there exists t such that
$x + t \in A - a_0$ and $t \in B - b_0$. Subtract these.
2. Show Mf is Borel measurable by verifying that $[Mf > \lambda] \equiv E_\lambda$ is actually
an open set. Hint: If $x \in E_\lambda$ then for some r, $\int_{B(x,r)} |f|\,dm > \lambda\,m(B(x,r))$.
Then for δ a small enough positive number, $\int_{B(x,r)} |f|\,dm > \lambda\,m(B(x, r + 2\delta))$.
Now pick y ∈ B(x, δ) and argue that B(y, δ + r) ⊇ B(x, r). Therefore show
that,
\[
\int_{B(y,\delta+r)} |f|\,dm \ge \int_{B(x,r)} |f|\,dm > \lambda\,m(B(x, r + 2\delta)) \ge \lambda\,m(B(y, r + \delta)).
\]
Thus $B(x,\delta) \subseteq E_\lambda$.
3. Consider the following nested sequence of compact sets, $\{P_n\}$. Let $P_1 = [0,1]$,
$P_2 = \left[0, \frac{1}{3}\right] \cup \left[\frac{2}{3}, 1\right]$, etc. To go from $P_n$ to $P_{n+1}$, delete the open interval
which is the middle third of each closed interval in $P_n$. Let $P = \cap_{n=1}^{\infty} P_n$.
By the finite intersection property of compact sets, $P \ne \emptyset$. Show m(P) = 0.
If you feel ambitious also show there is a one to one onto mapping of [0, 1]
to P . The set P is called the Cantor set. Thus, although P has measure
zero, it has the same number of points in it as [0, 1] in the sense that there
is a one to one and onto mapping from one to the other. Hint: There are
various ways of doing this last part but the most enlightenment is obtained
by exploiting the topological properties of the Cantor set rather than some
silly representation in terms of sums of powers of two and three. All you need
to do is use the Schroder Bernstein theorem and show there is an onto map
from the Cantor set to [0, 1]. If you do this right and remember the theorems
about characterizations of compact metric spaces, Proposition 6.12 on Page
136, you may get a pretty good idea why every compact metric space is the
continuous image of the Cantor set.
4. Consider the sequence of functions defined in the following way. Let $f_1(x) = x$
on [0, 1]. To get from $f_n$ to $f_{n+1}$, let $f_{n+1} = f_n$ on all intervals where $f_n$ is
constant. If $f_n$ is nonconstant on [a, b], let $f_{n+1}(a) = f_n(a)$, $f_{n+1}(b) = f_n(b)$,
$f_{n+1}$ is piecewise linear and equal to $\frac{1}{2}(f_n(a) + f_n(b))$ on the middle
third of [a, b]. Sketch a few of these and you will see the pattern. The process
of modifying a nonconstant section of the graph of this function is illustrated
in the following picture.
[Figure: a nonconstant section of the graph and its modification.]
Show $\{f_n\}$ converges uniformly on [0, 1]. If $f(x) = \lim_{n\to\infty} f_n(x)$, show that
f(0) = 0, f(1) = 1, f is continuous, and $f'(x) = 0$ for all $x \notin P$ where P is
the Cantor set of Problem 3. This function is called the Cantor function. It is
a very important example to remember. Note it has derivative equal to zero
a.e. and yet it succeeds in climbing from 0 to 1. Explain why this interesting
function is not absolutely continuous although it is continuous. Hint: This
isn't too hard if you focus on getting a careful estimate on the difference
between two successive functions in the list considering only a typical small
interval in which the change takes place. The above picture should be helpful.
5. A function, $f : [a,b] \to \mathbb{R}$ is Lipschitz if there is a constant K such that
$|f(x) - f(y)| \le K|x - y|$ for all x, y ∈ [a, b]. Show
that every Lipschitz function is absolutely continuous. Thus every Lipschitz
function is differentiable a.e., $f' \in L^1$, and $f(y) - f(x) = \int_x^y f'(t)\,dt$.
6. Suppose f, g are both absolutely continuous on [a, b]. Show the product of
these functions is also absolutely continuous. Explain why $(fg)' = f'g + g'f$
and show the usual integration by parts formula
\[
f(b)g(b) - f(a)g(a) - \int_a^b f g'\,dt = \int_a^b f' g\,dt.
\]
7. In Problem 4 $f'$ failed to give the expected result for $\int_a^b f'\,dx$ (see footnote 1) but at least
$f' \in L^1$. Suppose $f'$ exists for f a continuous function defined on [a, b]. Does
it follow that $f'$ is measurable? Can you conclude $f' \in L^1([a,b])$?
8. A sequence of sets, {Ei } containing the point x is said to shrink to x nicely
if there exists a sequence of positive numbers, {ri } and a positive constant, α
such that ri → 0 and
m (Ei ) ≥ αm (B (x, ri )) , Ei ⊆ B (x, ri ) .
Show the above theorems about differentiation of measures with respect to
Lebesgue measure all have a version valid for Ei replacing B (x, r) .
9. Suppose $F(x) = \int_a^x f(t)\,dt$. Using the concept of nicely shrinking sets in
Problem 8 show $F'(x) = f(x)$ a.e.
10. A random variable, X is a measurable real valued function defined on a mea-
sure space, (Ω, S, P ) where P is just a measure with P (Ω) = 1 called a
probability measure. The distribution function for X is the function, F (x) ≡
P ([X ≤ x]) in words, F (x) is the probability that X has values no larger than
x. Show that F is a right continuous increasing function with the property
that limx→−∞ F (x) = 0 and limx→∞ F (x) = 1.
1 In this example, you only know that $f'$ exists a.e.
12. Suppose now that G is just an increasing function defined on $\mathbb{R}$. Show that
$G'(x)$ exists a.e. Hint: You can mimic the proof of Theorem 16.18. The Dini
derivates are defined as
\[
D_+ G(x) \equiv \liminf_{h\to 0+} \frac{G(x+h) - G(x)}{h}, \qquad
D^+ G(x) \equiv \limsup_{h\to 0+} \frac{G(x+h) - G(x)}{h},
\]
\[
D_- G(x) \equiv \liminf_{h\to 0+} \frac{G(x) - G(x-h)}{h}, \qquad
D^- G(x) \equiv \limsup_{h\to 0+} \frac{G(x) - G(x-h)}{h}.
\]
When $D^+ G(x) = D_+ G(x)$ the derivative from the right exists and when
$D^- G(x) = D_- G(x)$, then the derivative from the left exists. Let (a, b) be an
open interval and let
\[
N_{pq} \equiv \left\{ x \in (a,b) : D^+ G(x) > q > p > D_+ G(x) \right\}.
\]
Let V ⊆ (a, b) be an open set containing $N_{pq}$ such that $m(V) < m(N_{pq}) + \varepsilon$.
Show using a Vitali covering theorem there is a disjoint sequence of intervals
contained in V, $\{(x_i, x_i + h_i)\}_{i=1}^{\infty}$ such that
\[
\frac{G(x_i + h_i) - G(x_i)}{h_i} < p.
\]
Next show there is a disjoint sequence of intervals $\{(x_j', x_j' + h_j')\}_{j=1}^{\infty}$ such
that each of these is contained in one of the former intervals and
\[
\frac{G(x_j' + h_j') - G(x_j')}{h_j'} > q, \qquad \sum_j h_j' \ge m(N_{pq}).
\]
Then
\[
\begin{aligned}
q\,m(N_{pq}) &\le q \sum_j h_j' \le \sum_j \left(G(x_j' + h_j') - G(x_j')\right) \le \sum_i \left(G(x_i + h_i) - G(x_i)\right) \\
&\le p \sum_i h_i \le p\,m(V) \le p\left(m(N_{pq}) + \varepsilon\right).
\end{aligned}
\]
Since ε was arbitrary, this shows $m(N_{pq}) = 0$. Taking a union of all $N_{pq}$
for p, q rational, shows the derivative from the right exists a.e. Do a similar
argument to show the derivative from the left exists a.e. and then show the
derivative from the left equals the derivative from the right a.e. using a similar
argument. Thus $G'(x)$ exists on (a, b) a.e. and so it exists a.e. on $\mathbb{R}$ because
(a, b) was arbitrary.
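The constructions in Problems 3 and 4 can be explored numerically. The sketch below is not a solution; it assumes that the step from $f_n$ to $f_{n+1}$ can be computed by the self-similar recursion written in `cantor_iterate` (a reformulation introduced here, not stated in the exercise), and it samples the successive differences on a finite grid only.

```python
def cantor_iterate(n, x):
    """f_n(x) from Problem 4 via an assumed self-similar recursion, with f_1(x) = x."""
    if n == 1:
        return x
    if x <= 1.0 / 3.0:
        return 0.5 * cantor_iterate(n - 1, 3.0 * x)
    if x < 2.0 / 3.0:
        return 0.5
    return 0.5 + 0.5 * cantor_iterate(n - 1, 3.0 * x - 2.0)


def measure_P(n):
    """Lebesgue measure of P_n: 2**(n-1) closed intervals, each of length 3**-(n-1)."""
    return (2.0 / 3.0) ** (n - 1)


grid = [i / 2000.0 for i in range(2001)]
# Largest sampled value of |f_{n+1} - f_n| for n = 1, ..., 7.
gaps = [max(abs(cantor_iterate(n + 1, x) - cantor_iterate(n, x)) for x in grid)
        for n in range(1, 8)]
print(gaps)
print([measure_P(n) for n in (1, 5, 10, 20)])
```

The sampled differences shrink roughly by half at each step, which is the kind of estimate the hint in Problem 4 asks for, while the measures of the sets $P_n$ decrease geometrically to zero.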
Hausdorff Measure
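Definition 17.1 For a set, E, denote by r(E) the number which is half the diameter
of E. Thus
\[
r(E) \equiv \frac{1}{2} \sup\{|x - y| : x, y \in E\} \equiv \frac{1}{2}\operatorname{diam}(E).
\]
Let $E \subseteq \mathbb{R}^n$.
\[
\mathcal{H}^s_\delta(E) \equiv \inf\Big\{ \sum_{j=1}^{\infty} \beta(s)\,(r(C_j))^s : E \subseteq \cup_{j=1}^{\infty} C_j,\ \operatorname{diam}(C_j) \le \delta \Big\}
\]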
\[
\sum_{j=1}^{\infty} \beta(s)\,(r(C_j^i))^s - \varepsilon/2^i < \mathcal{H}^s_\delta(E_i)
\]
which shows $\mathcal{H}^s_\delta$ is an outer measure. Now notice that $\mathcal{H}^s_\delta(E)$ is increasing as δ → 0.
Picking a sequence $\delta_k$ decreasing to 0, the monotone convergence theorem implies
\[
\mathcal{H}^s(E) \le \sum_{i=1}^{\infty} \mathcal{H}^s(E_i).
\]
If limn→∞ µ((Kn \ K) ∩ S) = 0 then the theorem will be proved because this limit
along with 17.2 implies limn→∞ µ (S \ Kn ) = µ (S \ K) and then taking a limit in
17.1, µ(S) ≥ µ(S ∩ K) + µ(S \ K) as desired. Therefore, it suffices to establish this
limit.
Since K is closed, a point, $x \notin K$ must be at a positive distance from K and so
\[
K_n \setminus K = \cup_{k=n}^{\infty} K_k \setminus K_{k+1}.
\]
Therefore
\[
\mu(S \cap (K_n \setminus K)) \le \sum_{k=n}^{\infty} \mu(S \cap (K_k \setminus K_{k+1})). \tag{17.3}
\]
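If
\[
\sum_{k=1}^{\infty} \mu(S \cap (K_k \setminus K_{k+1})) < \infty, \tag{17.4}
\]
then $\mu(S \cap (K_n \setminus K)) \to 0$ because it is dominated by the tail of a convergent series
so it suffices to show 17.4.
\[
\sum_{k=1}^{M} \mu(S \cap (K_k \setminus K_{k+1})) =
\sum_{k \text{ even},\, k \le M} \mu(S \cap (K_k \setminus K_{k+1})) + \sum_{k \text{ odd},\, k \le M} \mu(S \cap (K_k \setminus K_{k+1})). \tag{17.5}
\]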
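By the construction, the distance between any pair of sets, $S \cap (K_k \setminus K_{k+1})$ for
different even values of k is positive and the distance between any pair of sets,
$S \cap (K_k \setminus K_{k+1})$ for different odd values of k is positive. Therefore,
\[
\begin{aligned}
&\sum_{k \text{ even},\, k \le M} \mu(S \cap (K_k \setminus K_{k+1})) + \sum_{k \text{ odd},\, k \le M} \mu(S \cap (K_k \setminus K_{k+1})) \\
&\quad \le \mu\Big(\bigcup_{k \text{ even}} S \cap (K_k \setminus K_{k+1})\Big) + \mu\Big(\bigcup_{k \text{ odd}} S \cap (K_k \setminus K_{k+1})\Big) \le 2\mu(S) < \infty
\end{aligned}
\]
and so for all M, $\sum_{k=1}^{M} \mu(S \cap (K_k \setminus K_{k+1})) \le 2\mu(S)$ showing 17.4 and proving the
theorem.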
With the above theorem, the following theorem is easy to obtain. This property
is sometimes called Borel regularity.
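Theorem 17.5 The σ algebra of $\mathcal{H}^s$ measurable sets contains the Borel sets and
$\mathcal{H}^s$ has the property that for all $E \subseteq \mathbb{R}^n$, there exists a Borel set F ⊇ E such that
$\mathcal{H}^s(F) = \mathcal{H}^s(E)$.
\[
\mathcal{H}^s_\delta(A \cup B) + \varepsilon > \sum_{j=1}^{\infty} \beta(s)\,(r(C_j))^s.
\]
Thus
\[
\mathcal{H}^s_\delta(A \cup B) + \varepsilon > \sum_{j \in J_1} \beta(s)\,(r(C_j))^s + \sum_{j \in J_2} \beta(s)\,(r(C_j))^s
\]
where
\[
J_1 = \{j : C_j \cap A \ne \emptyset\}, \quad J_2 = \{j : C_j \cap B \ne \emptyset\}.
\]
Recall $\operatorname{dist}(A,B) = 2\delta_0$, $J_1 \cap J_2 = \emptyset$. It follows
\[
\mathcal{H}^s(A \cup B) \ge \mathcal{H}^s(A) + \mathcal{H}^s(B).
\]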
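and
\[
\mathcal{H}^s_\delta(E) + \delta > \sum_{j=1}^{\infty} \beta(s)\,(r(C_j))^s.
\]
Let
\[
F_\delta = \cup_{j=1}^{\infty} \overline{C_j}.
\]
Thus $F_\delta \supseteq E$ and
\[
\mathcal{H}^s_\delta(E) \le \mathcal{H}^s_\delta(F_\delta) \le \sum_{j=1}^{\infty} \beta(s)\,(r(\overline{C_j}))^s
= \sum_{j=1}^{\infty} \beta(s)\,(r(C_j))^s < \delta + \mathcal{H}^s_\delta(E).
\]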
17.1. DEFINITION OF HAUSDORFF MEASURES 453
Letting k → ∞,
\[
\mathcal{H}^s(E) \le \mathcal{H}^s(F) \le \mathcal{H}^s(E)
\]
and this proves the theorem.
A measure satisfying the conclusion of Theorem 17.5 is sometimes called a Borel
regular measure.
17.1.2 $\mathcal{H}^n$ And $m_n$
Next I will compare $\mathcal{H}^n$ and $m_n$. To do this, recall the following covering theorem
which is a summary of Corollaries 10.20 and 10.19 found on Page 279.
Lemma 17.7 There exists a constant, k such that $\mathcal{H}^n(E) \le k\,m_n(E)$ for all E
Borel. Also, if $Q_0 \equiv [0,1)^n$, the unit cube, then $\mathcal{H}^n([0,1)^n) > 0$.
Proof: First let U be an open set and letting δ > 0, consider all balls, B
contained in U which have diameters less than δ. This is a Vitali covering of U and
therefore by Theorem 17.6, there exists $\{B_i\}$, a sequence of disjoint balls of radii
less than δ contained in U such that $\cup_{i=1}^{\infty} B_i$ differs from U by a set of Lebesgue
measure zero. Let α(n) be the Lebesgue measure of the unit ball in $\mathbb{R}^n$. Then
\[
\begin{aligned}
\mathcal{H}^n_\delta(U) &\le \sum_{i=1}^{\infty} \beta(n)\,r(B_i)^n = \frac{\beta(n)}{\alpha(n)} \sum_{i=1}^{\infty} \alpha(n)\,r(B_i)^n \\
&= \frac{\beta(n)}{\alpha(n)} \sum_{i=1}^{\infty} m_n(B_i) = \frac{\beta(n)}{\alpha(n)}\,m_n(U) \equiv k\,m_n(U).
\end{aligned}
\]
Now letting E be Borel, it follows from the outer regularity of $m_n$ there exists
a decreasing sequence of open sets, $\{V_i\}$ containing E such that $m_n(V_i) \to m_n(E)$.
Then from the above,
\[
\mathcal{H}^n_\delta(E) \le \lim_{i\to\infty} \mathcal{H}^n_\delta(V_i) \le \lim_{i\to\infty} k\,m_n(V_i) = k\,m_n(E).
\]
Now let $B_i$ be a ball having radius equal to $\operatorname{diam}(C_i) = 2r(C_i)$ which contains $C_i$.
It follows
\[
m_n(B_i) = \alpha(n)\,2^n\,r(C_i)^n = \frac{\alpha(n)\,2^n}{\beta(n)}\,\beta(n)\,r(C_i)^n
\]
which implies
\[
1 > \sum_{i=1}^{\infty} \beta(n)\,r(C_i)^n = \sum_{i=1}^{\infty} \frac{\beta(n)}{\alpha(n)\,2^n}\,m_n(B_i) = \infty,
\]
I will show $k\mathcal{H}^n(E) = m_n(E)$. When this is done, it will follow that by adjusting
β(n) the multiple can be taken to be 1. I will only need to show that the right
value for β(n) is α(n). Recall Lemma 10.2 on Page 267 which is listed here for
convenience.
Lemma 17.9 Every open set in $\mathbb{R}^n$ is the countable disjoint union of half open
boxes of the form
\[
\prod_{i=1}^{n} (a_i, a_i + 2^{-k}]
\]
where $a_i = l2^{-k}$ for some integers, l, k. The sides of these boxes are of equal length.
One could also have half open boxes of the form
\[
\prod_{i=1}^{n} [a_i, a_i + 2^{-k})
\]
Therefore, $k\mathcal{H}^n(Q) = m_n(Q)$. It follows from Lemma 10.2 on Page 267 stated
above that $k\mathcal{H}^n(U) = m_n(U)$ for all open sets. It follows immediately, since
every compact set is the countable intersection of open sets that $k\mathcal{H}^n = m_n$ on
compact sets. Therefore, they are also equal on all closed sets because every closed
set is the countable union of compact sets. Now let F be an arbitrary Lebesgue
measurable set. I will show that F is $\mathcal{H}^n$ measurable and that $k\mathcal{H}^n(F) = m_n(F)$.
Let $F_l = B(0,l) \cap F$. Then there exists H a countable union of compact sets and
G a countable intersection of open sets such that
\[
H \subseteq F_l \subseteq G \tag{17.6}
\]
and
\[
m_n(G \setminus H) = k\mathcal{H}^n(G \setminus H) = 0. \tag{17.7}
\]
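\[
\mathcal{H}^n(B(0,1)) \le \frac{\beta(n)}{\alpha(n)}\,\mathcal{H}^n(B(0,1))
\]
Also
\[
\begin{aligned}
\mathcal{H}^n(B(0,1)) &\ge \mathcal{H}^n_\delta(B(0,1)) = \mathcal{H}^n_\delta(\cup_i B_i) = \sum_{i=1}^{\infty} \mathcal{H}^n_\delta(B_i) \\
&\ge \sum_{i=1}^{\infty} \beta(n)\,r(B_i)^n = \frac{\beta(n)}{\alpha(n)} \sum_{i=1}^{\infty} \alpha(n)\,r(B_i)^n \\
&= \frac{\beta(n)}{\alpha(n)} \sum_{i=1}^{\infty} m_n(B_i) = \frac{\beta(n)}{\alpha(n)}\,m_n(B(0,1)) \\
&= \frac{\beta(n)}{\alpha(n)}\,\mathcal{H}^n(B(0,1))
\end{aligned}
\]
which shows α(n) ≥ β(n) and so the two are equal. This proves the theorem.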
This gives another way to think of Lebesgue measure which is a particularly nice
way because it is coordinate free, depending only on the notion of distance.
For s < n, note that $\mathcal{H}^s$ is not a Radon measure because it will not generally be
finite on compact sets. For example, let n = 2 and consider $\mathcal{H}^1(L)$ where L is a line
segment joining (0, 0) to (1, 0). Then $\mathcal{H}^1(L)$ is no smaller than $\mathcal{H}^1(L)$ when L is
considered a subset of $\mathbb{R}^1$, n = 1. Thus by what was just shown, $\mathcal{H}^1(L) \ge 1$. Hence
$\mathcal{H}^1([0,1] \times [0,1]) = \infty$. The situation is this: L is a one-dimensional object inside
$\mathbb{R}^2$ and $\mathcal{H}^1$ is giving a one-dimensional measure of this object. In fact, Hausdorff
measures can make such heuristic remarks as these precise. Define the Hausdorff
dimension of a set, A, as
\[
\dim(A) = \inf\{s : \mathcal{H}^s(A) = 0\}
\]
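To see how the exponent s interacts with covering sums, consider the middle third Cantor set with its natural covers by $2^k$ intervals of length $3^{-k}$. The sketch below evaluates only these particular covering sums, which are upper bounds for the infimum defining $\mathcal{H}^s_\delta$; it omits the constant β(s), and the function name `covering_sum` and the choice of covers are assumptions made here for illustration. It suggests, but does not prove, that the critical exponent is log 2 / log 3.

```python
import math


def covering_sum(s, k):
    """Sum of r(C_j)**s over the 2**k intervals of length 3**-k covering the Cantor set.

    The normalizing constant beta(s) is omitted; it is a positive factor and does
    not affect whether these sums tend to 0 or to infinity.
    """
    radius = 0.5 * 3.0 ** (-k)        # r(C) is half the diameter
    return (2 ** k) * radius ** s


s_crit = math.log(2) / math.log(3)

for s in (0.5, s_crit, 1.0):
    print(s, [covering_sum(s, k) for k in (5, 10, 20, 30)])
```

Below the critical exponent the sums blow up, above it they tend to zero, and at the critical exponent they stay constant, which is the behavior the definition of dim(A) is designed to detect.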
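Next
\[
\begin{aligned}
\Gamma(p)\,\Gamma(q) &= \int_0^{\infty} e^{-t} t^{p-1}\,dt \int_0^{\infty} e^{-s} s^{q-1}\,ds \\
&= \int_0^{\infty}\!\!\int_0^{\infty} e^{-(t+s)}\,t^{p-1} s^{q-1}\,dt\,ds \\
&= \int_0^{\infty}\!\!\int_s^{\infty} e^{-u}\,(u-s)^{p-1} s^{q-1}\,du\,ds \\
&= \int_0^{\infty}\!\!\int_0^{u} e^{-u}\,(u-s)^{p-1} s^{q-1}\,ds\,du \\
&= \int_0^{\infty}\!\!\int_0^{1} e^{-u}\,(u-ux)^{p-1} (ux)^{q-1}\,u\,dx\,du \\
&= \int_0^{\infty}\!\!\int_0^{1} e^{-u}\,u^{p+q-1}\,(1-x)^{p-1} x^{q-1}\,dx\,du \\
&= \Gamma(p+q) \left( \int_0^1 x^{p-1} (1-x)^{q-1}\,dx \right).
\end{aligned}
\]
It remains to find $\Gamma\left(\frac{1}{2}\right)$.
\[
\Gamma\left(\frac{1}{2}\right) = \int_0^{\infty} e^{-t} t^{-1/2}\,dt = \int_0^{\infty} e^{-u^2}\,\frac{1}{u}\,2u\,du = 2\int_0^{\infty} e^{-u^2}\,du
\]
Now
\[
\begin{aligned}
\left( \int_0^{\infty} e^{-x^2}\,dx \right)^2 &= \int_0^{\infty} e^{-x^2}\,dx \int_0^{\infty} e^{-y^2}\,dy = \int_0^{\infty}\!\!\int_0^{\infty} e^{-(x^2+y^2)}\,dx\,dy \\
&= \int_0^{\infty}\!\!\int_0^{\pi/2} e^{-r^2}\,r\,d\theta\,dr = \frac{1}{4}\pi
\end{aligned}
\]
and so
\[
\Gamma\left(\frac{1}{2}\right) = 2\int_0^{\infty} e^{-u^2}\,du = \sqrt{\pi}
\]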
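Theorem 17.11 $\alpha(n) = \pi^{n/2}\,(\Gamma(n/2 + 1))^{-1}$ where Γ(s) is the gamma function
\[
\Gamma(s) = \int_0^{\infty} e^{-t} t^{s-1}\,dt.
\]
Thus
\[
\pi^{1/2}\,(\Gamma(1/2 + 1))^{-1} = \frac{2}{\sqrt{\pi}}\sqrt{\pi} = 2 = \alpha(1).
\]
and this shows the theorem is true if n = 1.
Assume the theorem is true for n and let $B_{n+1}$ be the unit ball in $\mathbb{R}^{n+1}$. Then
by the result in $\mathbb{R}^n$,
\[
m_{n+1}(B_{n+1}) = \int_{-1}^{1} \alpha(n)\,(1 - x_{n+1}^2)^{n/2}\,dx_{n+1} = 2\alpha(n) \int_0^1 (1 - t^2)^{n/2}\,dt.
\]
Doing an integration by parts and using Lemma 17.10
\[
\begin{aligned}
&= 2\alpha(n)\,n \int_0^1 t^2 (1 - t^2)^{(n-2)/2}\,dt \\
&= 2\alpha(n)\,n\,\frac{1}{2} \int_0^1 u^{1/2} (1 - u)^{n/2 - 1}\,du \\
&= n\,\alpha(n) \int_0^1 u^{3/2 - 1} (1 - u)^{n/2 - 1}\,du \\
&= n\,\alpha(n)\,\Gamma(3/2)\,\Gamma(n/2)\,(\Gamma((n+3)/2))^{-1} \\
&= n\,\pi^{n/2}\,(\Gamma(n/2 + 1))^{-1}\,(\Gamma((n+3)/2))^{-1}\,\Gamma(3/2)\,\Gamma(n/2) \\
&= n\,\pi^{n/2}\,(\Gamma(n/2)(n/2))^{-1}\,(\Gamma((n+1)/2 + 1))^{-1}\,\Gamma(3/2)\,\Gamma(n/2) \\
&= 2\pi^{n/2}\,\Gamma(3/2)\,(\Gamma((n+1)/2 + 1))^{-1} \\
&= \pi^{(n+1)/2}\,(\Gamma((n+1)/2 + 1))^{-1}.
\end{aligned}
\]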
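The formula of Theorem 17.11 and the slicing step of the induction can be checked numerically. This is a sketch, not part of the proof: `alpha` implements the stated formula, while `slice_integral` is a midpoint approximation introduced here, so the recursion is verified only up to quadrature error.

```python
import math


def alpha(n):
    """Volume of the unit ball in R^n according to the formula of Theorem 17.11."""
    return math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)


def slice_integral(n, steps=100000):
    """Midpoint approximation of the integral of (1 - t**2)**(n/2) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += (1.0 - t * t) ** (n / 2.0)
    return h * total


for n in range(1, 6):
    # Induction step: alpha(n + 1) = 2 * alpha(n) * integral of (1 - t^2)^(n/2).
    print(n, alpha(n + 1), 2.0 * alpha(n) * slice_integral(n))
```

The familiar values α(1) = 2, α(2) = π and α(3) = 4π/3 come out of the same formula.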
which distorts is the one which will have a nontrivial interaction with Hausdorff
measure while the one which preserves lengths does not change Hausdorff measure.
These ideas are behind the following theorems and lemmas.
Theorem 17.12 Let F be an m × n matrix where m ≥ n. Then there exists an
m × n matrix R and an n × n matrix U such that
\[
F = RU, \quad U = U^*,
\]
all eigenvalues of U are non negative,
\[
U^2 = F^*F, \quad R^*R = I,
\]
and |Rx| = |x|.
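A small worked instance may help make the factorization concrete. The sketch below treats only a real 3 × 2 matrix F of rank 2, which is an assumption made here so that U can be written in closed form: for a 2 × 2 symmetric positive definite matrix A, the square root is (A + sqrt(det A) I) / sqrt(trace A + 2 sqrt(det A)). The function name `polar_3x2` and the example matrix are hypothetical, and this is not the construction used in the proof of the theorem.

```python
import math


def polar_3x2(F):
    """Return (R, U) with F = R U for a real 3x2 matrix F of rank 2.

    U is the symmetric positive square root of F^T F and R = F U^{-1}.
    """
    # A = F^T F is 2x2 and symmetric: [[a, b], [b, d]].
    a = sum(F[i][0] * F[i][0] for i in range(3))
    b = sum(F[i][0] * F[i][1] for i in range(3))
    d = sum(F[i][1] * F[i][1] for i in range(3))
    s = math.sqrt(a * d - b * b)          # sqrt(det A), positive when rank F = 2
    t = math.sqrt(a + d + 2.0 * s)        # sqrt(trace A + 2 sqrt(det A))
    U = [[(a + s) / t, b / t], [b / t, (d + s) / t]]
    det_u = U[0][0] * U[1][1] - U[0][1] * U[1][0]
    U_inv = [[U[1][1] / det_u, -U[0][1] / det_u],
             [-U[1][0] / det_u, U[0][0] / det_u]]
    R = [[sum(F[i][k] * U_inv[k][j] for k in range(2)) for j in range(2)]
         for i in range(3)]
    return R, U


F = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]   # hypothetical example matrix
R, U = polar_3x2(F)
print(R)
print(U)
```

Here U carries all of the stretching and R maps the plane isometrically into three dimensional space, which is the splitting the surrounding discussion relies on.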
Lemma 17.13 Let $R \in L(\mathbb{R}^n, \mathbb{R}^m)$, n ≤ m, and $R^*R = I$. Then if $A \subseteq \mathbb{R}^n$,
\[
\mathcal{H}^n(RA) = \mathcal{H}^n(A).
\]
In fact, if $P : \mathbb{R}^n \to \mathbb{R}^m$ satisfies |Px − Py| = |x − y|, then
\[
\mathcal{H}^n(PA) = \mathcal{H}^n(A).
\]
Proof: Note that
\[
|R(x-y)|^2 = (R(x-y), R(x-y)) = (R^*R(x-y), x-y) = |x-y|^2
\]
Thus R preserves lengths.
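Now let P be an arbitrary mapping which preserves lengths and let A be
bounded, $P(A) \subseteq \cup_{j=1}^{\infty} C_j$, $\operatorname{diam}(C_j) \le \delta$, and
\[
\mathcal{H}^n_\delta(PA) + \varepsilon > \sum_{j=1}^{\infty} \alpha(n)\,(r(C_j))^n.
\]
Since P preserves lengths, it follows P is one to one on $P(\mathbb{R}^n)$ and $P^{-1}$ also preserves
lengths on $P(\mathbb{R}^n)$. Replacing each $C_j$ with $C_j \cap (PA)$,
\[
\begin{aligned}
\mathcal{H}^n_\delta(PA) + \varepsilon &> \sum_{j=1}^{\infty} \alpha(n)\,r(C_j \cap (PA))^n \\
&= \sum_{j=1}^{\infty} \alpha(n)\,r\left(P^{-1}(C_j \cap (PA))\right)^n \\
&\ge \mathcal{H}^n_\delta(A).
\end{aligned}
\]
Thus $\mathcal{H}^n_\delta(PA) \ge \mathcal{H}^n_\delta(A)$.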
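Now let $A \subseteq \cup_{j=1}^{\infty} C_j$, $\operatorname{diam}(C_j) \le \delta$, and
\[
\mathcal{H}^n_\delta(A) + \varepsilon \ge \sum_{j=1}^{\infty} \alpha(n)\,(r(C_j))^n
\]
Then
\[
\begin{aligned}
\mathcal{H}^n_\delta(A) + \varepsilon &\ge \sum_{j=1}^{\infty} \alpha(n)\,(r(C_j))^n \\
&= \sum_{j=1}^{\infty} \alpha(n)\,(r(PC_j))^n \\
&\ge \mathcal{H}^n_\delta(PA).
\end{aligned}
\]
Hence $\mathcal{H}^n_\delta(PA) = \mathcal{H}^n_\delta(A)$. Letting δ → 0 yields the desired conclusion in the
case where A is bounded. For the general case, let $A_r = A \cap B(0,r)$. Then
$\mathcal{H}^n(PA_r) = \mathcal{H}^n(A_r)$. Now let r → ∞. This proves the lemma.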
\[
\mathcal{H}^n(FA) = \mathcal{H}^n(RUA)
\]
\[
h : U \to \mathbb{R}^m \text{ is continuous,} \tag{17.8}
\]
17.2. THE AREA FORMULA 461
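\[
\begin{aligned}
&\le \sum_{i=1}^{\infty} \mathcal{H}^n_\delta\big(h(\widehat{B}_i)\big) \le \sum_{i=1}^{\infty} \mathcal{H}^n_\delta\left(B(h(x_i), 6kr_{x_i})\right) \\
&\le \sum_{i=1}^{\infty} \alpha(n)\,(6kr_{x_i})^n = (6k)^n \sum_{i=1}^{\infty} \alpha(n)\,r_{x_i}^n \\
&= (6k)^n \sum_{i=1}^{\infty} m_n(B(x_i, r_{x_i})) \\
&\le (6k)^n\,m_n(V) \le (6k)^n\,\frac{\varepsilon}{k^n 6^n} = \varepsilon.
\end{aligned}
\]
Since ε > 0 is arbitrary, this shows $\mathcal{H}^n_\delta(h(T_k)) = 0$. Since δ is arbitrary, this implies
$\mathcal{H}^n(h(T_k)) = 0$. Now
\[
\mathcal{H}^n(h(T)) = \lim_{k\to\infty} \mathcal{H}^n(h(T_k)) = 0.
\]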
a contradiction to 17.11.
Now letting G : B → B, be defined by
\[
G(v) \equiv \frac{r\,(a - F(v))}{|a - F(v)|},
\]
it follows G is continuous. Then by the Brouwer fixed point theorem, G(v) = v for
some v ∈ B. Using the formula for G, it follows |v| = r. Taking the inner product
with v,
\[
\begin{aligned}
(G(v), v) = |v|^2 = r^2 &= \frac{r}{|a - F(v)|}\,(a - F(v), v) \\
&= \frac{r}{|a - F(v)|}\,(a - v + v - F(v), v) \\
&= \frac{r}{|a - F(v)|}\,\left[(a - v, v) + (v - F(v), v)\right] \\
&= \frac{r}{|a - F(v)|}\,\left[(a, v) - |v|^2 + (v - F(v), v)\right] \\
&\le \frac{r}{|a - F(v)|}\,\left[r^2(1 - \varepsilon) - r^2 + r^2\varepsilon\right] = 0,
\end{aligned}
\]
and so
\[
\begin{aligned}
|u|^2 &= |u - RR^*u + RR^*u|^2 \\
&= |u - RR^*u|^2 + |RR^*u|^2 \\
&= |u - RR^*u|^2 + |R^*u|^2.
\end{aligned}
\]
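\[
\mathcal{H}^n(PE) \le L^n\,\mathcal{H}^n(E).
\]
Proof: Without loss of generality, assume $\mathcal{H}^n(E) < \infty$. Let δ > 0 and let
$\{C_i\}_{i=1}^{\infty}$ be a covering of E such that $\operatorname{diam}(C_i) \le \delta$ for each i and
\[
\sum_{i=1}^{\infty} \alpha(n)\,r(C_i)^n \le \mathcal{H}^n_\delta(E) + \varepsilon.
\]
Then $\{PC_i\}_{i=1}^{\infty}$ is a covering of PE such that $\operatorname{diam}(PC_i) \le L\delta$. Therefore,
\[
\begin{aligned}
\mathcal{H}^n_{L\delta}(PE) &\le \sum_{i=1}^{\infty} \alpha(n)\,r(PC_i)^n \\
&\le L^n \sum_{i=1}^{\infty} \alpha(n)\,r(C_i)^n \le L^n\,\mathcal{H}^n_\delta(E) + L^n\varepsilon \\
&\le L^n\,\mathcal{H}^n(E) + L^n\varepsilon.
\end{aligned}
\]
Letting δ → 0,
\[
\mathcal{H}^n(PE) \le L^n\,\mathcal{H}^n(E) + L^n\varepsilon
\]
and since ε > 0 is arbitrary, this proves the Lemma.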
Then the following corollary follows from Lemma 17.20.
\[
\mathcal{H}^n(T) \ge \mathcal{H}^n(RR^*T) = \mathcal{H}^n(R^*T).
\]
\[
\lim_{r\to 0} \frac{\mathcal{H}^n(h(B(x,r) \cap A))}{\mathcal{H}^n(h(B(x,r)))} = 1. \tag{17.16}
\]
It follows
\[
F(v) - v = o(|v|)
\]
and so Lemma 17.19 implies that for all r small enough,
\[
F(B(0,r)) \equiv U(x)^{-1}R^*(x)\,h(x + B(0,r)) - U(x)^{-1}R^*(x)\,h(x) \supseteq B(0, (1-\varepsilon)r).
\]
Therefore,
−1
Proof: Suppose first that U (x) exists. Using 17.15, 17.13 and the change
of variables formula for linear maps,
n
J (x) (1 − ε)
mn (U (x) B (0,r (1 − ε))) Hn (h(B (x, r)))
= ≤
mn (B (x, r)) mn (B (x, r))
mn (U (x) B (0,r (1 + ε))) n
≤ = J (x) (1 + ε)
mn (B (x, r))
whenever r is small enough. It follows that since ε > 0 is arbitrary, 17.20 holds.
Now suppose $U(x)^{-1}$ does not exist. The first part shows that the conclusion
of the theorem holds when $J(x) \ne 0$. I will apply this to a modified function. Let
\[
k : \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}^n
\]
be defined as
\[
k(x) \equiv \begin{pmatrix} h(x) \\ \varepsilon x \end{pmatrix}.
\]
Then
\[
Dk(x)^* Dk(x) = Dh(x)^* Dh(x) + \varepsilon^2 I_n
\]
and so
\[
Jk(x)^2 \equiv \det\left(Dh(x)^* Dh(x) + \varepsilon^2 I_n\right) = \det\left(Q^*DQ + \varepsilon^2 I_n\right)
\]
where D is a diagonal matrix having the nonnegative eigenvalues of $Dh(x)^* Dh(x)$
down the main diagonal. Thus, since one of these eigenvalues equals 0, letting $\lambda_i^2$
denote the ith eigenvalue, there exists a constant, C independent of ε such that
\[
0 < Jk(x)^2 = \prod_{i=1}^{n} \left(\lambda_i^2 + \varepsilon^2\right) \le C^2\varepsilon^2. \tag{17.21}
\]
\[
T_\varepsilon \equiv \left\{ (h(w), \varepsilon w)^T : w \in B(x,r) \right\} \equiv k(B(x,r)),
\]
then
\[
T = PT_\varepsilon
\]
where P is the projection map defined by
\[
P\begin{pmatrix} x \\ y \end{pmatrix} \equiv \begin{pmatrix} x \\ 0 \end{pmatrix}.
\]
What is f? I will show that f(x) = J(x) = det(U(x)) a.e. Define
\[
E \equiv \{x \in A : x \text{ is not a point of density of } A\} \cup \{x \in A : x \text{ is not a Lebesgue point of } f\}.
\]
Then E is a set of measure zero and if x ∈ (A \ E), Lemma 17.24 and Theorem
17.25 imply
\[
\begin{aligned}
f(x) &= \lim_{r\to 0} \frac{1}{m_n(B(x,r))} \int_{B(x,r)} f(y)\,dm \\
&= \lim_{r\to 0} \frac{\mathcal{H}^n(h(B(x,r) \cap A))}{m_n(B(x,r))} \\
&= \lim_{r\to 0} \frac{\mathcal{H}^n(h(B(x,r) \cap A))}{\mathcal{H}^n(h(B(x,r)))}\,\frac{\mathcal{H}^n(h(B(x,r)))}{m_n(B(x,r))} \\
&= J(x).
\end{aligned}
\]
Note there are no measurability questions in the above formula because $h^{-1}(F)$ is
a Borel set due to the continuity of h. The Borel measurability of J(x) also follows
from the observation that h is continuous and therefore, the partial derivatives are
Borel measurable, being the limit of continuous functions. Then J(x) is just a
continuous function of these partial derivatives. However, things are not so clear
if E is only assumed $\mathcal{H}^n$ measurable. Is there a similar formula for F only $\mathcal{H}^n$
measurable?
First consider the case where E is only $\mathcal{H}^n$ measurable but
\[
\mathcal{H}^n(E \cap h(A)) = 0.
\]
By Theorem 17.5 on Page 452, there exists a Borel set F ⊇ E ∩ h(A) such that
\[
\mathcal{H}^n(F) = \mathcal{H}^n(E \cap h(A)) = 0.
\]
\[
\mathcal{H}^n(E \cap h(A)) = 0.
\]
Then
\[
(E \cap h(A_R)) \cup (F \setminus (E \cap h(A_R)) \cap h(A_R)) = F \cap h(A_R)
\]
and so
\[
\mathcal{X}_{A_R \cap h^{-1}(F)}\,J = \mathcal{X}_{A_R \cap h^{-1}(E)}\,J + \mathcal{X}_{A_R \cap h^{-1}(F \setminus (E \cap h(A_R)))}\,J
\]
where from 17.26 and 17.25, the second function on the right of the equal sign is
Lebesgue measurable and equals zero a.e. Therefore, the first function on the right
of the equal sign is also Lebesgue measurable and equals the function on the left
a.e. Thus,
\[
\begin{aligned}
\int \mathcal{X}_{E \cap h(A_R)}(y)\,d\mathcal{H}^n &= \int \mathcal{X}_{F \cap h(A_R)}(y)\,d\mathcal{H}^n \\
&= \int \mathcal{X}_{A_R \cap h^{-1}(F)}(x)\,J(x)\,dm_n = \int \mathcal{X}_{A_R \cap h^{-1}(E)}(x)\,J(x)\,dm_n.
\end{aligned} \tag{17.27}
\]
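\[
h : U \to \mathbb{R}^m \text{ is continuous,} \tag{17.32}
\]
Proof: Let
\[
T_k \equiv \{x \in T : \|Dh(x)\| < k\}.
\]
Thus $T = \cup_k T_k$. I will show $h(T_k)$ has $\mathcal{H}^n$ measure zero and then it will follow
that
\[
h(T) = \cup_{k=1}^{\infty} h(T_k)
\]
and so
\[
h(B(x, 5r_x)) \subseteq B(h(x), 6kr_x).
\]
Letting δ > 0 be given, the Vitali covering theorem implies there exists a sequence
of disjoint balls $\{B_i\}$, $B_i = B(x_i, r_{x_i})$, which are contained in V such that the
sequence of enlarged balls, $\{\widehat{B}_i\}$, having the same center but 5 times the radius,
covers $T_k$ and $6kr_{x_i} < \delta$. Then
\[
\begin{aligned}
\mathcal{H}^n_\delta(h(T_k)) &\le \mathcal{H}^n_\delta\big(h(\cup_{i=1}^{\infty} \widehat{B}_i)\big) \\
&\le \sum_{i=1}^{\infty} \mathcal{H}^n_\delta\big(h(\widehat{B}_i)\big) \le \sum_{i=1}^{\infty} \mathcal{H}^n_\delta\left(B(h(x_i), 6kr_{x_i})\right) \\
&\le \sum_{i=1}^{\infty} \alpha(n)\,(6kr_{x_i})^n = (6k)^n \sum_{i=1}^{\infty} \alpha(n)\,r_{x_i}^n \\
&= (6k)^n \sum_{i=1}^{\infty} m_n(B(x_i, r_{x_i})) \\
&\le (6k)^n\,m_n(V) \le (6k)^n\,\frac{\varepsilon}{k^n 6^n} = \varepsilon.
\end{aligned}
\]
Since ε > 0 is arbitrary, this shows $\mathcal{H}^n_\delta(h(T_k)) = 0$. Since δ is arbitrary, this implies
$\mathcal{H}^n(h(T_k)) = 0$. Now
\[
\mathcal{H}^n(h(T)) = \lim_{k\to\infty} \mathcal{H}^n(h(T_k)) = 0.
\]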
a contradiction to 17.35.
Now letting G : B → B, be defined by
\[
G(v) \equiv \frac{r\,(a - F(v))}{|a - F(v)|},
\]
it follows G is continuous. Then by the Brouwer fixed point theorem, G(v) = v for
some v ∈ B. Using the formula for G, it follows |v| = r. Taking the inner product
with v,
\[
\begin{aligned}
(G(v), v) = |v|^2 = r^2 &= \frac{r}{|a - F(v)|}\,(a - F(v), v) \\
&= \frac{r}{|a - F(v)|}\,(a - v + v - F(v), v) \\
&= \frac{r}{|a - F(v)|}\,\left[(a - v, v) + (v - F(v), v)\right] \\
&= \frac{r}{|a - F(v)|}\,\left[(a, v) - |v|^2 + (v - F(v), v)\right] \\
&\le \frac{r}{|a - F(v)|}\,\left[r^2(1 - \varepsilon) - r^2 + r^2\varepsilon\right] = 0,
\end{aligned}
\]
and so
\[
\begin{aligned}
|u|^2 &= |u - RR^*u + RR^*u|^2 \\
&= |u - RR^*u|^2 + |RR^*u|^2 \\
&= |u - RR^*u|^2 + |R^*u|^2.
\end{aligned}
\]
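\[
\mathcal{H}^n(PE) \le L^n\,\mathcal{H}^n(E).
\]
Proof: Without loss of generality, assume $\mathcal{H}^n(E) < \infty$. Let δ > 0 and let
$\{C_i\}_{i=1}^{\infty}$ be a covering of E such that $\operatorname{diam}(C_i) \le \delta$ for each i and
\[
\sum_{i=1}^{\infty} \alpha(n)\,r(C_i)^n \le \mathcal{H}^n_\delta(E) + \varepsilon.
\]
Then $\{PC_i\}_{i=1}^{\infty}$ is a covering of PE such that $\operatorname{diam}(PC_i) \le L\delta$. Therefore,
\[
\begin{aligned}
\mathcal{H}^n_{L\delta}(PE) &\le \sum_{i=1}^{\infty} \alpha(n)\,r(PC_i)^n \\
&\le L^n \sum_{i=1}^{\infty} \alpha(n)\,r(C_i)^n \le L^n\,\mathcal{H}^n_\delta(E) + L^n\varepsilon \\
&\le L^n\,\mathcal{H}^n(E) + L^n\varepsilon.
\end{aligned}
\]
Letting δ → 0,
\[
\mathcal{H}^n(PE) \le L^n\,\mathcal{H}^n(E) + L^n\varepsilon
\]
and since ε > 0 is arbitrary, this proves the Lemma.
Then the following corollary follows from Lemma 17.30.
\[
\mathcal{H}^n(T) \ge \mathcal{H}^n(RR^*T) = \mathcal{H}^n(R^*T).
\]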
Lemma 17.34 Let x ∈ A be a point where $U(x)^{-1}$ exists. Then if ε ∈ (0, 1) the
following hold for all r small enough.
\[
\lim_{r\to 0} \frac{\mathcal{H}^n(h(B(x,r) \cap A))}{\mathcal{H}^n(h(B(x,r)))} = 1. \tag{17.40}
\]
\[
F(v) - v = o(|v|)
\]
Therefore,
and so
Now by assumption 17.34, $\mathcal{H}^n(h(B(x,r) \setminus A)) = 0$ and so for all r small enough,
whenever r is small enough. It follows that since ε > 0 is arbitrary, 17.43 holds.
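Now suppose $U(x)^{-1}$ does not exist. The first part shows that the conclusion
of the theorem holds when $J(x) \ne 0$. I will apply this to a modified function. Let
\[
k : \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}^n
\]
be defined as
\[
k(x) \equiv \begin{pmatrix} h(x) \\ \varepsilon x \end{pmatrix}.
\]
Then
\[
Dk(x)^* Dk(x) = Dh(x)^* Dh(x) + \varepsilon^2 I_n
\]
and so
\[
Jk(x)^2 \equiv \det\left(Dh(x)^* Dh(x) + \varepsilon^2 I_n\right) = \det\left(Q^*DQ + \varepsilon^2 I_n\right)
\]
where D is a diagonal matrix having the nonnegative eigenvalues of $Dh(x)^* Dh(x)$
down the main diagonal. Thus, since one of these eigenvalues equals 0, letting $\lambda_i^2$
denote the ith eigenvalue, there exists a constant, C independent of ε such that
\[
0 < Jk(x)^2 = \prod_{i=1}^{n} \left(\lambda_i^2 + \varepsilon^2\right) \le C^2\varepsilon^2. \tag{17.44}
\]
\[
T_\varepsilon \equiv \left\{ (h(w), \varepsilon w)^T : w \in B(x,r) \right\} \equiv k(B(x,r)),
\]
then
\[
T = PT_\varepsilon
\]
where P is the projection map defined by
\[
P\begin{pmatrix} x \\ y \end{pmatrix} \equiv \begin{pmatrix} x \\ 0 \end{pmatrix}.
\]
\[
\mathcal{H}^n(h(B(x,r))) = \mathcal{H}^n(T) = \mathcal{H}^n(PT_\varepsilon) \le \mathcal{H}^n(T_\varepsilon) = \mathcal{H}^n(k(B(x,r))).
\]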
It follows from 17.44 and the first part of the proof applied to k that
\[
C\varepsilon \ge Jk(x) = \lim_{r\to 0} \frac{\mathcal{H}^n(k(B(x,r)))}{m_n(B(x,r))} \ge \limsup_{r\to 0} \frac{\mathcal{H}^n(h(B(x,r)))}{m_n(B(x,r))}.
\]
Since ε is arbitrary, this establishes 17.43 in the case where $U(x)^{-1}$ does not exist
and completes the proof of the theorem.
\[
\nu(E) \equiv \mathcal{H}^n(h(E \cap A)).
\]
What is f? I will show that f(x) = J(x) = det(U(x)) a.e. Let x be a Lebesgue
point of f. Then by Lemma 17.34 and Theorem 17.35
\[
\begin{aligned}
f(x) &= \lim_{r\to 0} \frac{1}{m_n(B(x,r))} \int_{B(x,r)} f(y)\,dm \\
&= \lim_{r\to 0} \frac{\mathcal{H}^n(h(B(x,r) \cap A))}{m_n(B(x,r))} \\
&= \lim_{r\to 0} \frac{\mathcal{H}^n(h(B(x,r) \cap A))}{\mathcal{H}^n(h(B(x,r)))}\,\frac{\mathcal{H}^n(h(B(x,r)))}{m_n(B(x,r))} \\
&= J(x).
\end{aligned}
\]
Note there are no measurability questions in the above formula because h−1 (F ) is
a Borel set due to the continuity of h. The Borel measurability of J (x) also follows
from the observation that h is continuous and therefore, the partial derivatives are
Borel measurable, being the limit of continuous functions. Then J (x) is just a
continuous function of these partial derivatives. However, things are not so clear
if E is only assumed Hn measurable. Is there a similar formula for F only Hn
measurable?
First consider the case where E is only Hn measurable but
Hn (E ∩ h (A)) = 0.
By Theorem 17.5 on Page 452, there exists a Borel set F ⊇ E ∩ h (A) such that
Hn (F ) = Hn (E ∩ h (A)) = 0.
\[
= \int \mathcal{X}_{A \cap h^{-1}(F)}(x)\, J(x)\, dm_n = \int \mathcal{X}_{A \cap h^{-1}(E)}(x)\, J(x)\, dm_n, \tag{17.47}
\]
Hn (E ∩ h (A)) = 0.
\[
= \int \mathcal{X}_{A_R \cap h^{-1}(F)}(x)\, J(x)\, dm_n = \int \mathcal{X}_{A_R \cap h^{-1}(E)}(x)\, J(x)\, dm_n. \tag{17.49}
\]
Theorem 17.37 Let M be a closed nonempty subset of a metric space (X, d) and
let f : M → [a, b] be continuous at every point of M. Then there exists a function,
g continuous on all of X which coincides with f on M such that g (X) ⊆ [a, b] .
The next topic needed is the concept of an infinitely differentiable partition of
unity.
Definition 17.38 Let C be a set whose elements are subsets of $\mathbb{R}^n$. Then C is
said to be locally finite if for every $x \in \mathbb{R}^n$, there exists an open set, $U_x$ containing
x such that $U_x$ has nonempty intersection with only finitely many sets of C.
Lemma 17.39 Let C be a set whose elements are open subsets of $\mathbb{R}^n$ and suppose
∪C ⊇ H, a closed set. Then there exists a countable list of open sets, $\{U_i\}_{i=1}^{\infty}$ such
that each $U_i$ is bounded, each $U_i$ is a subset of some set of C, and $\bigcup_{i=1}^{\infty} U_i \supseteq H$.
Proof: Let $W_k \equiv B(0,k)$, $W_0 = W_{-1} = \emptyset$. For each $x \in H \cap \overline{W_k}$ there exists
an open set, $U_x$ such that $U_x$ is a subset of some set of C and $U_x \subseteq W_{k+1} \setminus \overline{W_{k-1}}$.
Then since $H \cap \overline{W_k}$ is compact, there exist finitely many of these sets, $\{U_i^k\}_{i=1}^{m(k)}$
whose union contains $H \cap \overline{W_k}$. If $H \cap \overline{W_k} = \emptyset$, let m(k) = 0 and there are no such
sets obtained. The desired countable list of open sets is $\bigcup_{k=1}^{\infty} \{U_i^k\}_{i=1}^{m(k)}$. Each open
set in this list is bounded. Furthermore, if $x \in \mathbb{R}^n$, then $x \in W_k$ where k is the
first positive integer with $x \in W_k$. Then $W_k \setminus \overline{W_{k-1}}$ is an open set containing x and
this open set can have nonempty intersection only with a set of
$\{U_i^k\}_{i=1}^{m(k)} \cup \{U_i^{k-1}\}_{i=1}^{m(k-1)}$, a finite list of sets. Therefore,
$\bigcup_{k=1}^{\infty} \{U_i^k\}_{i=1}^{m(k)}$ is locally finite.
The set, $\{U_i\}_{i=1}^{\infty}$ is said to be a locally finite cover of H. The following lemma
gives some important reasons why a locally finite list of sets is so significant. First
of all consider the rational numbers, $\{r_i\}_{i=1}^{\infty}$; each rational number is a closed set.
\[
\mathbb{Q} = \{r_i\}_{i=1}^{\infty} = \bigcup_{i=1}^{\infty} \overline{\{r_i\}} \neq \overline{\bigcup_{i=1}^{\infty} \{r_i\}} = \mathbb{R}
\]
Proof: Let p be a limit point of ∪C and let W be an open set which intersects
only finitely many sets of C. Then p must be a limit point of one of these sets. It
follows $p \in \bigcup \{ \overline{H} : H \in C \}$ and so $\overline{\bigcup C} \subseteq \bigcup \{ \overline{H} : H \in C \}$. The inclusion in the other
direction is obvious.
Now consider the second assertion. Letting x ∈ Rn , there exists an open set, W
intersecting only finitely many open sets of C, U1 , U2 , · · ·, Um . Then for all y ∈ W,
\[
f(y) = \sum_{i=1}^{m} \psi_{U_i}(y)
\]
and so the desired result is obvious. It merely says that a finite sum of differentiable
functions is differentiable. Recall the following definition.
Lemma 17.42 Let U be a bounded open set and let K be a closed subset of U. Then
there exist an open set, W, such that $W \subseteq \overline{W} \subseteq U$ and a function, $f \in C_c^{\infty}(U)$
such that $K \prec f \prec U$.
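Also let
\[
W_1 \equiv \left\{ x : \operatorname{dist}(x, K) < 2^{-1} \operatorname{dist}\left(K, U^C\right) \right\}
\]
Then it is clear
\[
K \subseteq W \subseteq \overline{W} \subseteq W_1 \subseteq \overline{W_1} \subseteq U
\]
it follows that for such k, the function, $h * \phi_k \in C_c^{\infty}(U)$, has values in [0, 1], and
equals 1 on K. Let $f = h * \phi_k$.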
The above lemma is used repeatedly in the following.
17.4. THE DIVERGENCE THEOREM 483
Lemma 17.43 Let K be a closed set and let $\{V_i\}_{i=1}^{\infty}$ be a locally finite list of
bounded open sets whose union contains K. Then there exist functions, $\psi_i \in C_c^{\infty}(V_i)$
such that for all x ∈ K,
\[
1 = \sum_{i=1}^{\infty} \psi_i(x)
\]
is in $C^{\infty}(\mathbb{R}^n)$.
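Proof: Let $K_1 = K \setminus \bigcup_{i=2}^{\infty} V_i$. Thus $K_1$ is compact because $K_1 \subseteq V_1$. Let
\[
K_1 \subseteq W_1 \subseteq \overline{W_1} \subseteq V_1
\]
\[
\overline{W_i} \subseteq V_i, \quad K \subseteq \bigcup_{i=1}^{\infty} W_i.
\]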
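Note $\{W_i\}_{i=1}^{\infty}$ is locally finite because the original list, $\{V_i\}_{i=1}^{\infty}$ was locally finite.
Now let $U_i$ be open sets which satisfy
\[
\overline{W_i} \subseteq U_i \subseteq \overline{U_i} \subseteq V_i.
\]
Similarly, $\{U_i\}_{i=1}^{\infty}$ is locally finite.
[Figure: the nested sets W_i, U_i, V_i.]
Since the set, $\{W_i\}_{i=1}^{\infty}$ is locally finite, it follows $\overline{\bigcup_{i=1}^{\infty} W_i} = \bigcup_{i=1}^{\infty} \overline{W_i}$ and so it
is possible to define $\phi_i$ and γ, infinitely differentiable functions having compact
support such that
\[
\overline{U_i} \prec \phi_i \prec V_i, \quad \bigcup_{i=1}^{\infty} \overline{W_i} \prec \gamma \prec \bigcup_{i=1}^{\infty} U_i.
\]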
Now define
\[
\psi_i(x) =
\begin{cases}
\gamma(x)\,\phi_i(x) \Big/ \sum_{j=1}^{\infty} \phi_j(x) & \text{if } \sum_{j=1}^{\infty} \phi_j(x) \neq 0, \\
0 & \text{if } \sum_{j=1}^{\infty} \phi_j(x) = 0.
\end{cases}
\]
If x is such that $\sum_{j=1}^{\infty} \phi_j(x) = 0$, then $x \notin \bigcup_{i=1}^{\infty} \overline{U_i}$ because $\phi_i$ equals one on $\overline{U_i}$.
Consequently γ(y) = 0 for all y near x thanks to the fact that $\bigcup_{i=1}^{\infty} \overline{U_i}$ is closed
and so $\psi_i(y) = 0$ for all y near x. Hence $\psi_i$ is infinitely differentiable at such x. If
$\sum_{j=1}^{\infty} \phi_j(x) \neq 0$, this situation persists near x because each $\phi_j$ is continuous and so
$\psi_i$ is infinitely differentiable at such points also thanks to Lemma 17.40. Therefore
$\psi_i$ is infinitely differentiable. If x ∈ K, then γ(x) = 1 and so $\sum_{j=1}^{\infty} \psi_j(x) = 1$.
Clearly $0 \le \psi_i(x) \le 1$ and $\operatorname{spt}(\psi_j) \subseteq V_j$. This proves the theorem.
The functions, {ψ i } are called a C ∞ partition of unity.
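The normalization step in the proof above can be illustrated in one dimension. The following is only a sketch under simplifying assumptions that are not in the text: the cover consists of two hypothetical intervals V_1, V_2 of K = [0, 2], the functions `bump` stand in for the φ_i (they are positive on V_i but are not required to equal 1 on a smaller set), and the cutoff γ is omitted, so the quotient is only smooth where the sum of the bumps is positive.

```python
import math

def bump(x, a, b):
    """Smooth function that is positive exactly on the open interval (a, b)."""
    if x <= a or x >= b:
        return 0.0
    return math.exp(-1.0 / ((x - a) * (b - x)))

# Hypothetical cover of K = [0, 2] by two bounded open intervals V_1, V_2.
V = [(-1.0, 1.2), (0.8, 3.0)]

def psi(i, x):
    """Normalized bump phi_i / sum_j phi_j, set to 0 where the sum vanishes."""
    total = sum(bump(x, a, b) for a, b in V)
    if total == 0.0:
        return 0.0
    a, b = V[i]
    return bump(x, a, b) / total

points = [k / 100.0 for k in range(0, 201)]  # sample points of K = [0, 2]
sums = [psi(0, x) + psi(1, x) for x in points]
```

On the sample points of K the two functions add up to 1 and each one vanishes outside its own interval, which is the behavior the lemma asserts for the ψ_i.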
The method of proof of this lemma easily implies the following useful corollary.
Corollary 17.44 If H is a compact subset of $V_i$ for some $V_i$ there exists a partition
of unity such that $\psi_i(x) = 1$ for all x ∈ H in addition to the conclusion of Lemma
17.43.
Proof: Keep $V_i$ the same but replace $V_j$ with $\widetilde{V_j} \equiv V_j \setminus H$. Now in the proof
above, applied to this modified collection of open sets, if $j \neq i$, $\phi_j(x) = 0$ whenever
x ∈ H. Therefore, $\psi_i(x) = 1$ on H.
Lemma 17.45 Let Ω be a metric space with the closed balls compact and suppose
µ is a measure defined on the Borel sets of Ω which is finite on compact sets.
Then there exists a unique Radon measure, $\overline{\mu}$ which equals µ on the Borel sets. In
particular µ must be both inner and outer regular on all Borel sets.
Proof: Define a positive linear functional, $\Lambda(f) = \int f \, d\mu$. Let $\overline{\mu}$ be the Radon
measure which comes from the Riesz representation theorem for positive linear
functionals. Thus for all f continuous,
\[
\int f \, d\mu = \int f \, d\overline{\mu}.
\]
and so the two measures coincide on all open sets. Every compact set is a countable
intersection of open sets and so the two measures coincide on all compact sets. Now
let B(a, n) be a ball of radius n and let E be a Borel set contained in this ball.
Then by regularity of $\overline{\mu}$ there exist sets F, G such that G is a countable intersection
of open sets and F is a countable union of compact sets such that F ⊆ E ⊆ G and
$\overline{\mu}(G \setminus F) = 0$. Now $\mu(G) = \overline{\mu}(G)$ and $\mu(F) = \overline{\mu}(F)$. Thus
\[
\mu(G \setminus F) + \mu(F) = \mu(G) = \overline{\mu}(G) = \overline{\mu}(G \setminus F) + \overline{\mu}(F)
\]
and so $\mu(G \setminus F) = \overline{\mu}(G \setminus F)$. Thus
Lemma 17.46 Let V be a bounded open set and let X be the closed subspace of
$C(\overline{V})$, the space of continuous functions defined on $\overline{V}$, which is given by the following.
\[
X = \{ u \in C(\overline{V}) : u(x) = 0 \text{ on } \partial V \}.
\]
Then $C_c^{\infty}(V)$ is dense in X with respect to the norm given by
\[
\|u\| = \max \left\{ |u(x)| : x \in \overline{V} \right\}
\]
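Proof: Let $O \subseteq \overline{O} \subseteq W \subseteq \overline{W} \subseteq V$ be such that $\operatorname{dist}(\overline{O}, V^C) < \eta$ and let
$\psi_\delta(\cdot)$ be a mollifier. Let u ∈ X and consider $\mathcal{X}_W u * \psi_\delta$. Let ε > 0 be given and
let η be small enough that |u(x)| < ε/2 whenever x ∈ V \ O. Then if δ is small
enough $|\mathcal{X}_W u * \psi_\delta(x) - u(x)| < \varepsilon$ for all x ∈ O and $\mathcal{X}_W u * \psi_\delta$ is in $C_c^{\infty}(V)$. For
x ∈ V \ O, $|\mathcal{X}_W u * \psi_\delta(x)| \le \varepsilon/2$ and so for such x,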
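such that $\partial U \equiv \overline{U} \setminus U$ is contained in their union. Also, for each $Q_i$, there exists k
and a Lipschitz function, $g_i$ such that $U \cap Q_i$ is of the form
\[
\left\{ x : (x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_n) \in \prod_{j=1}^{k-1} \left(a_j^i, b_j^i\right) \times \prod_{j=k+1}^{n} \left(a_j^i, b_j^i\right)
\text{ and } a_k^i < x_k < g_i(x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_n) \right\} \tag{17.54}
\]
The function, $g_i$ has a derivative on $A_i \subseteq \prod_{j=1}^{k-1} \left(a_j^i, b_j^i\right) \times \prod_{j=k+1}^{n} \left(a_j^i, b_j^i\right)$ where
\[
m_{n-1}\left( \prod_{j=1}^{k-1} \left(a_j^i, b_j^i\right) \times \prod_{j=k+1}^{n} \left(a_j^i, b_j^i\right) \setminus A_i \right) = 0.
\]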
Also, there exists an open set, $Q_0$ such that $Q_0 \subseteq \overline{Q_0} \subseteq U$ and $\overline{U} \subseteq Q_0 \cup Q_1 \cup \cdots \cup Q_N$.
Note that since there are only finitely many $Q_i$ and each $g_i$ is Lipschitz, it follows
from an application of Lemma 17.21 that $\mathcal{H}^{n-1}(\partial U) < \infty$. Also from Lemma 17.45
$\mathcal{H}^{n-1}$ is inner and outer regular on ∂U.
Lemma 17.48 Suppose U is a bounded open set as described above. Then there
exists a unique function in $L^{\infty}(\partial U, \mathcal{H}^{n-1})^n$, n(y) for y ∈ ∂U such that |n(y)| = 1,
n is $\mathcal{H}^{n-1}$ measurable, (meaning each component of n is $\mathcal{H}^{n-1}$ measurable) and
for every $w \in \mathbb{R}^n$ satisfying |w| = 1, and for every $f \in C_c^1(\mathbb{R}^n)$,
\[
\lim_{t \to 0} \int_U \frac{f(x + tw) - f(x)}{t} \, dx = \int_{\partial U} f \, (n \cdot w) \, d\mathcal{H}^{n-1}
\]
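Proof: Let $\overline{U} \subseteq V \subseteq \overline{V} \subseteq \bigcup_{i=0}^{N} Q_i$ and let $\{\psi_i\}_{i=0}^{N}$ be a $C^{\infty}$ partition of unity
on $\overline{V}$ such that $\operatorname{spt}(\psi_i) \subseteq Q_i$. Then for all t small enough and x ∈ U,
\[
\frac{f(x + tw) - f(x)}{t} = \frac{1}{t} \sum_{i=0}^{N} \left[ \psi_i f(x + tw) - \psi_i f(x) \right].
\]
\[
= \int_U \sum_{i=0}^{N} \sum_{j=1}^{n} D_j(\psi_i f(x))\, w_j \, dx
\]
\[
= \int_U \sum_{j=1}^{n} D_j(\psi_0 f(x))\, w_j \, dx + \sum_{i=1}^{N} \int_U \sum_{j=1}^{n} D_j(\psi_i f(x))\, w_j \, dx \tag{17.56}
\]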
Since spt (ψ 0 ) ⊆ Q0 , it follows the first term in the above equals zero. In the second
term, fix i. Without loss of generality, suppose the k in the above definition equals
n and 17.54 holds. This just makes things a little easier to write. Thus $g_i$ is a
function of
\[
(x_1, \dots, x_{n-1}) \in \prod_{j=1}^{n-1} \left(a_j^i, b_j^i\right) \equiv B_i
\]
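Then
\begin{align*}
&\int_U \sum_{j=1}^{n} D_j(\psi_i f(x))\, w_j \, dx \\
&= \int_{B_i} \int_{a_n^i}^{g_i(x_1, \dots, x_{n-1})} \sum_{j=1}^{n} D_j(\psi_i f(x))\, w_j \, dx_n \, dx_1 \cdots dx_{n-1} \\
&= \int_{B_i} \int_{-\infty}^{g_i(x_1, \dots, x_{n-1})} \sum_{j=1}^{n} D_j(\psi_i f(x))\, w_j \, dx_n \, dx_1 \cdots dx_{n-1} \\
&= \int_{A_i} \int_{-\infty}^{0} \sum_{j=1}^{n} D_j(\psi_i f(x_1, \dots, x_{n-1}, y + g_i(x_1, \dots, x_{n-1}))) \, w_j \, dy \, dx_1 \cdots dx_{n-1} \\
&= \int_{A_i} \int_{-\infty}^{0} \sum_{j=1}^{n-1} \Big[ \frac{\partial}{\partial x_j} \big( \psi_i f(x_1, \dots, x_{n-1}, y + g_i(x_1, \dots, x_{n-1})) \big) w_j \\
&\qquad - D_n(\psi_i f)(x_1, \dots, x_{n-1}, y + g_i(x_1, \dots, x_{n-1})) \, g_{i,j}(x_1, \dots, x_{n-1}) \, w_j \Big] dy \, dx_1 \cdots dx_{n-1} \\
&\quad + \int_{A_i} \int_{-\infty}^{0} D_n(\psi_i f(x_1, \dots, x_{n-1}, y + g_i(x_1, \dots, x_{n-1}))) \, w_n \, dy \, dx_1 \cdots dx_{n-1} \tag{17.57}
\end{align*}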
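Consider the term
\[
\int_{A_i} \int_{-\infty}^{0} \sum_{j=1}^{n-1} \frac{\partial}{\partial x_j} \big( \psi_i f(x_1, \dots, x_{n-1}, y + g_i(x_1, \dots, x_{n-1})) \big) \, w_j \, dy \, dx_1 \cdots dx_{n-1}.
\]
This equals
\[
\int_{B_i} \int_{-\infty}^{0} \sum_{j=1}^{n-1} \frac{\partial}{\partial x_j} \big( \psi_i f(x_1, \dots, x_{n-1}, y + g_i(x_1, \dots, x_{n-1})) \big) \, w_j \, dy \, dx_1 \cdots dx_{n-1},
\]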
and now interchanging the order of integration and using the fact that spt (ψ i ) ⊆ Qi ,
it follows this term equals zero. (The reason this is valid is that
(−gi,1 (x1 , · · ·, xn−1 ) , −gi,2 (x1 , · · ·, xn−1 ) , · · ·, −gi,n−1 (x1 , · · ·, xn−1 ) , 1) . (17.61)
At this point I need a technical lemma which will allow the use of the area formula.
The part of the boundary of U which is contained in Qi is the image of the map,
hi (x1 , · · ·, xn−1 ) given by (x1 , · · ·, xn−1 , gi (x1 , · · ·, xn−1 )) for (x1 , · · ·, xn−1 ) ∈ Ai .
I need a formula for
\[
\det\left( Dh_i(x_1, \dots, x_{n-1})^* \, Dh_i(x_1, \dots, x_{n-1}) \right)^{1/2}.
\]
To avoid interrupting the argument, I will state the lemma here and prove it later.
Lemma 17.49
\[
\det\left( Dh_i(x_1, \dots, x_{n-1})^* \, Dh_i(x_1, \dots, x_{n-1}) \right)^{1/2}
= \sqrt{1 + \sum_{j=1}^{n-1} g_{i,j}(x_1, \dots, x_{n-1})^2} \equiv J_i(x_1, \dots, x_{n-1}).
\]
For
\[
y = (x_1, \dots, x_{n-1}, g_i(x_1, \dots, x_{n-1})) \in \partial U \cap Q_i
\]
define It follows if n is defined by
\[
n_i(y) = \frac{1}{J_i(x_1, \dots, x_{n-1})} N_i(y)
\]
it follows from the description of $J_i(x_1, \dots, x_{n-1})$ given in the above lemma, that
$n_i$ is a unit vector. All components of $n_i$ are continuous functions of limits of continuous
functions. Therefore, $n_i$ is Borel measurable and so it is $\mathcal{H}^{n-1}$ measurable.
Now 17.60 reduces to
\[
\int_{A_i} (\psi_i f)(x_1, \dots, x_{n-1}, g_i(x_1, \dots, x_{n-1})) \,
n_i(x_1, \dots, x_{n-1}, g_i(x_1, \dots, x_{n-1})) \cdot w \, J_i(x_1, \dots, x_{n-1}) \, dm_{n-1}.
\]
Now by Lemma 17.21 and the equality of $m_{n-1}$ and $\mathcal{H}^{n-1}$ on $\mathbb{R}^{n-1}$, the above
integral equals
\[
\int_{\partial U \cap Q_i} \psi_i f(y) \, n_i(y) \cdot w \, d\mathcal{H}^{n-1} = \int_{\partial U} \psi_i f(y) \, n_i(y) \cdot w \, d\mathcal{H}^{n-1}.
\]
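Returning to 17.56 similar arguments apply to the other terms and therefore,
\begin{align*}
\lim_{t \to 0} \int_U \frac{f(x + tw) - f(x)}{t} \, dm_n
&= \sum_{i=1}^{N} \int_{\partial U} \psi_i f(y) \, n_i(y) \cdot w \, d\mathcal{H}^{n-1} \\
&= \int_{\partial U} f(y) \sum_{i=1}^{N} \psi_i(y) \, n_i(y) \cdot w \, d\mathcal{H}^{n-1} \\
&= \int_{\partial U} f(y) \, n(y) \cdot w \, d\mathcal{H}^{n-1} \tag{17.62}
\end{align*}
Then let $n(y) \equiv \sum_{i=1}^{N} \psi_i(y) \, n_i(y)$.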
I need to show first there is no other n which satisfies 17.62 and then I need
to show that |n(y)| = 1. Note that it is clear |n(y)| ≤ 1 because each $n_i$ is a
unit vector and this is just a convex combination of these. Suppose then that
$n_1 \in L^{\infty}(\partial U, \mathcal{H}^{n-1})$ also works in 17.62. Then for all $f \in C_c^1(\mathbb{R}^n)$,
\[
\int_{\partial U} f(y) \, n(y) \cdot w \, d\mathcal{H}^{n-1} = \int_{\partial U} f(y) \, n_1(y) \cdot w \, d\mathcal{H}^{n-1}.
\]
\[
= \int_U \nabla(\psi_i f) \cdot w \, dm_n
\]
Since $C_c^1(O)$ is dense in $C_c(O)$, the above equation is also true for all $f \in C_c(O)$.
Now letting $h \in C_c(W)$, the Tietze extension theorem implies there exists $f_1 \in C(\overline{O})$
whose restriction to W equals h. Let f be defined by
\[
f_1(x) \, \frac{\operatorname{dist}(x, O^C)}{\operatorname{dist}(x, \operatorname{spt}(h)) + \operatorname{dist}(x, O^C)} = f(x).
\]
Then f = h on W and so this has shown that for all $h \in C_c(W)$, 17.63 holds
for h in place of f. But as observed earlier, $\mathcal{H}^{n-1}$ is outer and inner regular on
∂U and so $C_c(W)$ is dense in $L^1(W, \mathcal{H}^{n-1})$ which implies $w \cdot n(y) = w \cdot n_i(y)$
for a.e. y. Considering a countable dense subset of the unit sphere as above, this
implies $n(y) = n_i(y)$ a.e. y. This proves |n(y)| = 1 a.e. and in fact n(y) can be
computed by using the formula for $n_i(y)$. This proves the lemma.
It remains to prove Lemma 17.49.
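Proof of Lemma 17.49: Let $h(x) = (x_1, \dots, x_{n-1}, g(x_1, \dots, x_{n-1}))^T$.
\[
Dh(x) = \begin{pmatrix}
1 & & 0 \\
 & \ddots & \\
0 & & 1 \\
g_{,x_1} & \cdots & g_{,x_{n-1}}
\end{pmatrix}
\]
Therefore,
\[
J(x) = \left( \det\left( Dh(x)^* Dh(x) \right) \right)^{1/2}.
\]
Therefore, J(x) is the square root of the determinant of the following (n − 1) × (n − 1) matrix.
\[
\begin{pmatrix}
1 + (g_{,x_1})^2 & g_{,x_1} g_{,x_2} & \cdots & g_{,x_1} g_{,x_{n-1}} \\
g_{,x_2} g_{,x_1} & 1 + (g_{,x_2})^2 & \cdots & g_{,x_2} g_{,x_{n-1}} \\
\vdots & \vdots & & \vdots \\
g_{,x_{n-1}} g_{,x_1} & g_{,x_{n-1}} g_{,x_2} & \cdots & 1 + (g_{,x_{n-1}})^2
\end{pmatrix}. \tag{17.64}
\]
\[
1 + \sum_{i=1}^{n-1} \left( g_{,x_i}(x) \right)^2.
\]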
This is implied by the following claim. To simplify the notation I will replace n − 1
with n.
Claim: Let $a_1, \dots, a_n$ be real numbers and let $A(a_1, \dots, a_n)$ be the matrix
which has $1 + a_i^2$ in the $ii^{th}$ slot and $a_i a_j$ in the $ij^{th}$ slot when $i \neq j$. Then
\[
\det A = 1 + \sum_{i=1}^{n} a_i^2.
\]
and therefore,
\[
\left( 1 + \sum_{i=1}^{n} a_i^2 \right) \det\left( A(a_1, \dots, a_n) \right) = \left( 1 + \sum_{i=1}^{n} a_i^2 \right)^2
\]
which shows $\det(A(a_1, \dots, a_n)) = 1 + \sum_{i=1}^{n} a_i^2$. This proves the claim.
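The claim can be spot-checked with exact rational arithmetic. The sketch below is not part of the proof; the helper names `det` and `claim_matrix` and the sample values of the a_i are illustrative choices only.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def claim_matrix(a):
    """Matrix with 1 + a_i^2 on the diagonal and a_i * a_j off the diagonal."""
    n = len(a)
    return [[(1 if i == j else 0) + a[i] * a[j] for j in range(n)] for i in range(n)]

a = [Fraction(1, 2), Fraction(-3), Fraction(2, 5), Fraction(7)]
lhs = det(claim_matrix(a))
rhs = 1 + sum(x * x for x in a)
```

Here `lhs` and `rhs` agree exactly, as the claim predicts for any choice of real numbers.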
Now the above lemma implies the divergence theorem.
Theorem 17.50 Let U be a bounded open set with a Lipschitz boundary which lies
on one side of its boundary. Then if $f \in C_c^1(\mathbb{R}^n)$,
\[
\int_U f_{,k}(x) \, dm_n = \int_{\partial U} f \, n_k \, d\mathcal{H}^{n-1} \tag{17.66}
\]
where $n = (n_1, \dots, n_n)$ is the $\mathcal{H}^{n-1}$ measurable unit vector of Lemma 17.48. Also,
if F is a vector field such that each component is in $C_c^1(\mathbb{R}^n)$, then
\[
\int_U \nabla \cdot F(x) \, dm_n = \int_{\partial U} F \cdot n \, d\mathcal{H}^{n-1}. \tag{17.67}
\]
given by
(−gi,1 (x1 , · · ·, xn−1 ) , −gi,2 (x1 , · · ·, xn−1 ) , · · ·, −gi,n−1 (x1 , · · ·, xn−1 ) , 1) (17.68)
xn − gi (x1 , · · ·, xn−1 )
xn − gi (x1 , · · ·, xn−1 ) = 0
in the case where gi is C 1 . It also points away from U so the vector n is the unit
outer normal. The other cases work similarly.
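As a sanity check of 17.67, the two sides can be compared numerically on the unit square, whose boundary is Lipschitz. This is a sketch under assumptions that are not in the text: the field F = (x²y, y³) is a hypothetical choice, it is not compactly supported (one would multiply it by a smooth cutoff equal to 1 near the closed square, which changes neither side), and the integrals are approximated with a midpoint rule.

```python
def midpoint_1d(func, lo, hi, steps=400):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def F(x, y):
    """Hypothetical smooth vector field F = (x^2 y, y^3)."""
    return (x * x * y, y ** 3)

def div_F(x, y):
    """Divergence of F computed by hand: 2xy + 3y^2."""
    return 2.0 * x * y + 3.0 * y * y

# Integral of div F over U = (0,1) x (0,1), as an iterated integral.
volume = midpoint_1d(lambda x: midpoint_1d(lambda y: div_F(x, y), 0.0, 1.0), 0.0, 1.0)

# Flux of F through the four edges, using the unit outer normal on each edge.
flux = (
    midpoint_1d(lambda y: F(1.0, y)[0], 0.0, 1.0)    # right edge, n = (1, 0)
    - midpoint_1d(lambda y: F(0.0, y)[0], 0.0, 1.0)  # left edge, n = (-1, 0)
    + midpoint_1d(lambda x: F(x, 1.0)[1], 0.0, 1.0)  # top edge, n = (0, 1)
    - midpoint_1d(lambda x: F(x, 0.0)[1], 0.0, 1.0)  # bottom edge, n = (0, -1)
)
```

Both quantities approximate 3/2, the value obtained by integrating 2xy + 3y² over the square by hand.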
The divergence theorem is valid in situations more general than for Lipschitz
boundaries. What you need is essentially the ability to say that the functions, gi
above can be differentiated a.e. and more importantly that these functions can be
recovered by integrating their partial derivatives. In other words, you need absolute
continuity in each variable. Later in the chapter on weak derivatives, examples of
such functions which are more general than Lipschitz functions will be discussed.
However, the Lipschitz functions are pretty general and will suffice for now.
Differentiation With Respect
To General Radon Measures
The fundamental theorem of calculus presented above for Lebesgue measures can
be generalized to arbitrary Radon measures. It turns out that the same approach
works if a different covering theorem is employed instead of the Vitali theorem. This
covering theorem is the Besicovitch covering theorem of this section. It is necessary
because for a general Radon measure µ, it is no longer the case that the measure
is translation invariant. This implies there is no way to estimate $\mu(\widehat{B})$ in terms of
µ (B) and thus the Vitali covering theorem is of no use. In the Besicovitch covering
theorem the balls in the covering are not enlarged as they are in the Vitali theorem.
In this theorem they can also be either open or closed or neither open nor closed.
The balls can also be taken with respect to any norm on Rn . The notation, B (x,r)
in the above argument will denote any set which satisfies
and the norm, ||·|| is just some norm on Rn . The following picture is a distorted
picture of the situation described in the following lemma.
[Figure: two balls with radii r_x and r_y near the unit ball centered at 0.]
Lemma 18.1 Let 10 ≤ rx ≤ ry and suppose B (x, rx ) and B (y, ry ) both have
nonempty intersection with B (0, 1) but neither of these balls contains 0. Suppose
also that
||x − y|| ≥ ry
Proof: By hypothesis,
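Then
\begin{align*}
\left\| \frac{x}{\|x\|} - \frac{y}{\|y\|} \right\|
&= \left\| \frac{x \|y\| - \|x\| y}{\|x\| \, \|y\|} \right\| \\
&= \left\| \frac{x \|y\| - y \|y\| + y \|y\| - \|x\| y}{\|x\| \, \|y\|} \right\| \\
&\ge \frac{r_y}{\|x\|} - \frac{\|y\|}{\|x\|} + 1 \ge \frac{r_y}{\|x\|} - \frac{(r_y + 1)}{\|x\|} + 1 \\
&= 1 - \frac{1}{\|x\|} \ge 1 - \frac{1}{r_x} \ge 1 - \frac{1}{10} = \frac{9}{10}.
\end{align*}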
18.1. BESICOVITCH COVERING THEOREM 497
Lemma 18.5 In the situation of Theorem 18.4, suppose the set of centers A is
bounded. Then there exists a sequence of balls from $\mathcal{F}$, $\{B_j\}_{j=1}^{J}$ where J ≤ ∞ such
that
\[
r(B_1) \ge \frac{3}{4} \sup \{ r(B) : B \in \mathcal{F} \} \tag{18.2}
\]
and if
\[
A_m \equiv A \setminus \left( \bigcup_{i=1}^{m} B_i \right) \neq \emptyset, \tag{18.3}
\]
then
\[
r(B_{m+1}) \ge \frac{3}{4} \sup \{ r : B(a, r) \in \mathcal{F},\ a \in A_m \}. \tag{18.4}
\]
Letting $B_j = B(a_j, r_j)$, this sequence satisfies
\[
A \subseteq \bigcup_{i=1}^{J} B_i, \quad r(B_k) \le \frac{4}{3} r(B_j) \text{ for } j < k, \quad \{B(a_j, r_j/3)\}_{j=1}^{J} \text{ are disjoint.} \tag{18.5}
\]
Proof: Pick $B_1$ satisfying 18.2. If $B_1, \dots, B_m$ have been chosen, and $A_m$ is
given in 18.3, then if it equals ∅, it follows $A \subseteq \bigcup_{i=1}^{m} B_i$. Set J = m. If $A_m \neq \emptyset$,
pick $B_{m+1}$ to satisfy 18.4. This defines the desired sequence. It remains to verify
the claims in 18.5. Consider the second claim. Letting $A_0 \equiv A$, $A_k \subseteq A_{j-1}$ and so
\[
\frac{3}{4} r_0 \le \frac{3}{4} \sup \{ r : B(a, r) \in \mathcal{F},\ a \in A_{i-1} \} \le r_i < \frac{1}{10} r_0,
\]
a contradiction. This proves the lemma.
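For a finite family of balls, the selection in Lemma 18.5 can be sketched as a greedy loop. The code below is an illustration under assumptions not made in the text: balls are open Euclidean balls in the plane stored as hypothetical (center, radius) pairs, each center carries exactly one ball, and the ball picked at each step has the largest available radius, which in particular exceeds 3/4 of the supremum.

```python
import math
import random

def select_balls(balls):
    """Greedy selection in the spirit of Lemma 18.5 for a finite family.

    balls: list of (center, radius) pairs, center a tuple of floats.
    Each step picks a ball of maximal radius among those whose center is
    not yet covered by the balls already chosen.
    """
    chosen = []
    remaining = list(balls)
    while remaining:
        best = max(remaining, key=lambda ball: ball[1])
        chosen.append(best)
        center, radius = best
        # Keep only balls whose centers lie outside the newly chosen open ball.
        remaining = [b for b in remaining if math.dist(b[0], center) >= radius]
    return chosen

random.seed(0)
family = [((random.uniform(0, 10), random.uniform(0, 10)), random.uniform(0.5, 2.0))
          for _ in range(200)]
selected = select_balls(family)
```

The three claims of 18.5 can then be checked directly on `selected`: every center of `family` lies in some selected ball, later radii are at most 4/3 of earlier ones, and the balls with the same centers and one third of the radii are pairwise disjoint.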
Lemma 18.6 There exists a constant Mn depending only on n such that for each
1 ≤ k ≤ J, Mn exceeds the number of sets Bj for j < k which have nonempty
intersection with Bk .
Proof: These sets Bj which intersect Bk are of two types. Either they have
large radius, rj > 10rk , or they have small radius, rj ≤ 10rk . In this argument let
card (S) denote the number of elements in the set S. Define for fixed k,
\[
I \equiv \{ j : 1 \le j < k,\ B_j \cap B_k \neq \emptyset,\ r_j \le 10 r_k \},
\]
Then rj /rk ≥ 10 because j ∈ K. None of the balls, f (Bj ) contain 0 but all these
balls intersect B (0, 1) and as just noted, each of these balls has radius ≥ 10 and
none of them contains two centers on its interior. By Corollary 18.3, it follows there
are no more than Ln of them. This proves the claim. A constant which will satisfy
the desired conditions is
Mn ≡ Ln + 60n + 1.
This completes the proof of Lemma 18.6.
Next subdivide the balls $\{B_i\}_{i=1}^{J}$ into $M_n$ subsets $\mathcal{G}_1, \dots, \mathcal{G}_{M_n}$ each of which
consists of disjoint balls. This is done in the following way. Let $B_1 \in \mathcal{G}_1$. If
$B_1, \dots, B_k$ have each been assigned to one of the sets $\mathcal{G}_1, \dots, \mathcal{G}_{M_n}$, let $B_{k+1} \in \mathcal{G}_r$
where r is the smallest index having the property that $B_{k+1}$ does not intersect any
of the balls already in $\mathcal{G}_r$. There must exist such an index $r \in \{1, \dots, M_n\}$ because
otherwise $B_{k+1} \cap B_j \neq \emptyset$ for at least $M_n$ values of j < k + 1 contradicting Lemma
18.6. By Lemma 18.5
\[
A \subseteq \bigcup_{i=1}^{M_n} \{ B : B \in \mathcal{G}_i \} = \bigcup_{j=1}^{J} B_j.
\]
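The subdivision just described is a first-fit assignment, and a small sketch may make it concrete. The assumptions here are not from the text: balls are open Euclidean balls given as hypothetical (center, radius) pairs, and new groups are opened on demand rather than capped at M_n (in the text, Lemma 18.6 is what guarantees M_n groups suffice).

```python
import math

def intersects(ball_a, ball_b):
    """Two open Euclidean balls meet exactly when the centers are closer than r_a + r_b."""
    return math.dist(ball_a[0], ball_b[0]) < ball_a[1] + ball_b[1]

def split_into_disjoint_groups(balls):
    """Put each ball, in order, into the first group none of whose members it meets."""
    groups = []
    for ball in balls:
        for group in groups:
            if not any(intersects(ball, other) for other in group):
                group.append(ball)
                break
        else:
            groups.append([ball])  # no existing group works, so open a new one
    return groups

# Hypothetical ordered list of balls (center, radius) in the plane.
balls = [((0.0, 0.0), 1.0), ((1.5, 0.0), 1.0), ((3.0, 0.0), 1.0),
         ((0.0, 1.5), 1.0), ((5.0, 5.0), 0.5)]
groups = split_into_disjoint_groups(balls)
```

For this sample list two groups result, each consisting of pairwise disjoint balls, and every ball lands in exactly one group.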
and
Fl = {B (a,r) : B (a,r) ∈ F and a ∈ Al }.
Then since D is an upper bound for all the diameters of these balls,
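whenever m ≥ l + 2. Therefore, applying what was just shown to the pair $(A_l, \mathcal{F}_l)$,
there exist subsets of $\mathcal{F}_l$, $\mathcal{G}_1^l, \dots, \mathcal{G}_{M_n}^l$ such that each $\mathcal{G}_i^l$ is a countable collection of
disjoint balls of $\mathcal{F}_l \subseteq \mathcal{F}$ and
\[
A_l \subseteq \bigcup_{i=1}^{M_n} \left\{ B : B \in \mathcal{G}_i^l \right\}.
\]
Now let $\mathcal{G}_j \equiv \bigcup_{l=1}^{\infty} \mathcal{G}_j^{2l-1}$ for $1 \le j \le M_n$ and for $1 \le j \le M_n$, let $\mathcal{G}_{j+M_n} \equiv \bigcup_{l=1}^{\infty} \mathcal{G}_j^{2l}$.
Thus, letting $N_n \equiv 2M_n$,
\[
A = \bigcup_{l=1}^{\infty} A_{2l} \cup \bigcup_{l=1}^{\infty} A_{2l-1} \subseteq \bigcup_{j=1}^{N_n} \{ B : B \in \mathcal{G}_j \}
\]
and by 18.6, each $\mathcal{G}_j$ is a countable set of disjoint balls of $\mathcal{F}$. This proves the
Besicovitch covering theorem.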
Proof: For each x ∈ Z, there exists a ball B (x,r) with µ (B (x,r)) = 0. Let C
be the collection of these balls. Since $\mathbb{R}^n$ has a countable basis, a countable subset,
$\widetilde{C}$, of C also covers Z. Let
\[
\widetilde{C} = \{B_i\}_{i=1}^{\infty}.
\]
Theorem 18.8 Let µ be a Radon measure and let $f \in L^1(\mathbb{R}^n, \mu)$. Then
\[
\lim_{r \to 0} \frac{1}{\mu(B(x,r))} \int_{B(x,r)} |f(y) - f(x)| \, d\mu(y) = 0
\]
for µ a.e. $x \in \mathbb{R}^n$.
Proof: First consider the following claim which is a weak type estimate of the
same sort used when differentiating with respect to Lebesgue measure.
Claim 1:
µ ([M f > ε]) ≤ Nn ε−1 ||f ||1
Proof: First note A ∩ Z = ∅. For each x ∈ A there exists a ball $B_x = B(x, r_x)$
with $r_x \le 1$ and
\[
\mu(B_x)^{-1} \int_{B(x, r_x)} |f| \, d\mu > \varepsilon.
\]
Let $\mathcal{F}$ be this collection of balls so that A is the set of centers of balls of $\mathcal{F}$. By the
Besicovitch covering theorem,
\[
A \subseteq \bigcup_{i=1}^{N_n} \{ B : B \in \mathcal{G}_i \}
\]
\[
\mu(A)/N_n \le \mu\left( \cup \{ B : B \in \mathcal{G}_i \} \right)
\]
and if $x \notin Z$,
\[
\lim_{r \to 0} \frac{1}{\mu(B(x,r))} \int_{B(x,r)} g(y) \, d\mu(y) = g(x). \tag{18.7}
\]
\[
= \frac{1}{\mu(B(x,r))} \int_{B(x,r)} |g(y) - g(x)| \, d\mu(y)
\le \frac{1}{\mu(B(x,r))} \int_{B(x,r)} \varepsilon \, d\mu(y) = \varepsilon.
\]
18.7 follows from the above and the triangle inequality. This proves the claim.
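Now let $g \in C_c(\mathbb{R}^n)$ and $x \notin Z$. Then from the above observations about
continuous functions,
\begin{align*}
&\mu\left( \left[ x : \limsup_{r \to 0} \frac{1}{\mu(B(x,r))} \int_{B(x,r)} |f(y) - f(x)| \, d\mu(y) > \varepsilon \right] \right) \tag{18.8} \\
&\le \mu\left( \left[ x : \limsup_{r \to 0} \frac{1}{\mu(B(x,r))} \int_{B(x,r)} |f(y) - g(y)| \, d\mu(y) > \frac{\varepsilon}{2} \right] \right)
+ \mu\left( \left[ x : |g(x) - f(x)| > \frac{\varepsilon}{2} \right] \right) \\
&\le \mu\left( \left[ M(f - g) > \frac{\varepsilon}{2} \right] \right) + \mu\left( \left[ |f - g| > \frac{\varepsilon}{2} \right] \right) \tag{18.9}
\end{align*}
Now
\[
\int_{[|f - g| > \frac{\varepsilon}{2}]} |f - g| \, d\mu \ge \frac{\varepsilon}{2} \, \mu\left( \left[ |f - g| > \frac{\varepsilon}{2} \right] \right)
\]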
18.2. FUNDAMENTAL THEOREM OF CALCULUS FOR RADON MEASURES 503
Proof: If f is replaced by $f \mathcal{X}_{B(0,k)}$ then the conclusion 18.10 holds for all $x \notin F_k$
where $F_k$ is a set of µ measure 0. Letting k = 1, 2, · · ·, and $F \equiv \bigcup_{k=1}^{\infty} F_k$, it follows
that F is a set of measure zero and for any $x \notin F$, and k ∈ {1, 2, · · ·}, 18.10 holds
if f is replaced by $f \mathcal{X}_{B(0,k)}$. Picking any such x, and letting k > |x| + 1, this shows
\begin{align*}
&\lim_{r \to 0} \frac{1}{\mu(B(x,r))} \int_{B(x,r)} |f(y) - f(x)| \, d\mu(y) \\
&= \lim_{r \to 0} \frac{1}{\mu(B(x,r))} \int_{B(x,r)} \left| f \mathcal{X}_{B(0,k)}(y) - f \mathcal{X}_{B(0,k)}(x) \right| \, d\mu(y) = 0.
\end{align*}
α (E) = µ (E × Rm ) (18.11)
for all E Borel. There also exists a Borel set of α measure zero, N, such that
for each $x \notin N$, there exists a Radon probability measure $\nu_x$ such that if f is a
nonnegative µ measurable function or a µ measurable function in $L^1(\mu)$,
and
\[
\int_{\mathbb{R}^{n+m}} f(x, y) \, d\mu = \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^m} f(x, y) \, d\nu_x(y) \right) d\alpha(x). \tag{18.13}
\]
If $\widehat{\nu}_x$ is any other collection of Radon measures satisfying 18.12 and 18.13, then
$\widehat{\nu}_x = \nu_x$ for α a.e. x.
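Formula 18.13 is easiest to see for a measure on a finite product set, where α is the marginal and ν_x is the conditional distribution. The following sketch works in that discrete setting only; the particular weights of `mu` and the test function `f` are hypothetical choices made for illustration.

```python
from fractions import Fraction

# Hypothetical probability measure mu on the finite product {0, 1, 2} x {0, 1}.
mu = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(0),
    (2, 0): Fraction(1, 10), (2, 1): Fraction(3, 10),
}

# alpha(E) = mu(E x Y): the marginal of mu on the first factor.
alpha = {}
for (x, y), mass in mu.items():
    alpha[x] = alpha.get(x, Fraction(0)) + mass

# nu_x: the conditional probabilities, defined wherever alpha({x}) > 0.
nu = {x: {y: mu[(x, y)] / alpha[x] for y in (0, 1)} for x in alpha if alpha[x] > 0}

def f(x, y):
    """Arbitrary nonnegative test function."""
    return x * x + 3 * y + 1

direct = sum(f(x, y) * mass for (x, y), mass in mu.items())
iterated = sum(alpha[x] * sum(f(x, y) * nu[x][y] for y in (0, 1)) for x in nu)
```

The integral of f against µ (`direct`) equals the iterated integral against ν_x and then α (`iterated`), and each ν_x has total mass 1.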
α0 (E) = µ (E × Rm ).
Thus α0 is a finite Borel measure and so it is finite on compact sets. Lemma 11.3
on Page 11.3 implies the existence of the Radon measure α extending α0 .
Next consider the uniqueness of $\nu_x$. Suppose $\nu_x$ and $\widehat{\nu}_x$ satisfy all conclusions
of the theorem with exceptional sets denoted by N and $\widehat{N}$ respectively. Then,
enlarging N and $\widehat{N}$, one may also assume, using Lemma 18.7, that for $x \notin N \cup \widehat{N}$,
α (B (x,r)) > 0 whenever r > 0. Now let
\[
A = \prod_{i=1}^{m} (a_i, b_i]
\]
where $a_i$ and $b_i$ are rational. Thus there are countably many such sets. Then from
the conclusion of the theorem, if $x_0 \notin N \cup \widehat{N}$,
\begin{align*}
&\frac{1}{\alpha(B(x_0, r))} \int_{B(x_0, r)} \int_{\mathbb{R}^m} \mathcal{X}_A(y) \, d\nu_x(y) \, d\alpha \\
&= \frac{1}{\alpha(B(x_0, r))} \int_{B(x_0, r)} \int_{\mathbb{R}^m} \mathcal{X}_A(y) \, d\widehat{\nu}_x(y) \, d\alpha,
\end{align*}
Then $\eta_f$ is a finite measure defined on the Borel sets with $\eta_f \ll \alpha$. By the Radon
Nikodym theorem, there exists a Borel measurable function $\widetilde{g}_f$ such that for all
Borel E,
\[
\eta_f(E) \equiv \int_{E \times \mathbb{R}^m} f \, d\mu = \int_E \widetilde{g}_f \, d\alpha. \tag{18.14}
\]
By the theory of differentiation for Radon measures, there exists a set of α measure
zero, $N_f$ such that if $x \notin N_f$, then α (B (x,r)) > 0 for all r > 0 and
\[
\lim_{r \to 0} \frac{1}{\alpha(B(x,r))} \int_{B(x,r) \times \mathbb{R}^m} f \, d\mu
= \lim_{r \to 0} \frac{1}{\alpha(B(x,r))} \int_{B(x,r)} \widetilde{g}_f \, d\alpha = \widetilde{g}_f(x).
\]
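This functional may appear to depend on the choice of ψ satisfying ψ (x) = 1 but
this is not the case because all such ψ's used in the definition of $L_x$ are continuous.
Let $\nu_x$ be the Radon measure representing $L_x$. Thus replacing an arbitrary
$\psi \in C_c(\mathbb{R}^n)$ with $\frac{\psi}{\psi(x)}$, in the case when $\psi(x) \neq 0$,
\begin{align*}
\psi(x) L_x(\phi) &= \psi(x) \int_{\mathbb{R}^m} \phi(y) \, d\nu_x(y) \\
&= \lim_{r \to 0} \frac{\psi(x)}{\alpha(B(x,r))} \int_{B(x,r) \times \mathbb{R}^m} \frac{\psi}{\psi(x)} \phi \, d\mu \\
&= \lim_{r \to 0} \frac{1}{\alpha(B(x,r))} \int_{B(x,r) \times \mathbb{R}^m} \psi \phi \, d\mu
\end{align*}
By Lemma 18.13,
\[
\int_{\mathbb{R}^n} \int_{\mathbb{R}^m} \psi \phi \, d\nu_x \, d\alpha = \int_{\mathbb{R}^n} g_{\psi\phi} \, d\alpha = \int_{\mathbb{R}^n \times \mathbb{R}^m} \psi \phi \, d\mu.
\]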
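Since the $\phi_k$ are uniformly bounded, 18.15 implies the existence of a dominating
function for the integrand in 18.18. Therefore, one can take the limit inside the
integrals and obtain
\[
\int_{\mathbb{R}^n \times \mathbb{R}^m} \mathcal{X}_{R_1}(x) \mathcal{X}_{R_2}(y) \, d\mu = \int_{\mathbb{R}^n} \int_{\mathbb{R}^m} \mathcal{X}_{R_1}(x) \mathcal{X}_{R_2}(y) \, d\nu_x \, d\alpha
\]
Every open set, V in $\mathbb{R}^{n+m}$ is a countable disjoint union of such half open rectangles
and so the monotone convergence theorem implies for all V open in $\mathbb{R}^{n+m}$,
\[
\int_{\mathbb{R}^n \times \mathbb{R}^m} \mathcal{X}_V \, d\mu = \int_{\mathbb{R}^n} \int_{\mathbb{R}^m} \mathcal{X}_V \, d\nu_x \, d\alpha. \tag{18.19}
\]
Since every compact set is the countable intersection of open sets, the above formula
holds for V replaced with K where K is compact. Then it follows from the
dominated convergence and monotone convergence theorems that whenever H is
either a $G_\delta$ (countable intersection of open sets) or a $F_\sigma$ (countable union of closed
sets)
\[
\int_{\mathbb{R}^n \times \mathbb{R}^m} \mathcal{X}_H \, d\mu = \int_{\mathbb{R}^n} \int_{\mathbb{R}^m} \mathcal{X}_H \, d\nu_x \, d\alpha.
\]
Now let E be µ measurable. Using the regularity of µ there exists F, G such that
F is $F_\sigma$, G is $G_\delta$, µ (G \ F ) = 0, and F ⊆ E ⊆ G. Also a routine application of the
dominated convergence theorem and 18.19 shows
\[
\int_{\mathbb{R}^n \times \mathbb{R}^m} \mathcal{X}_{(G \setminus F)} \, d\mu = \int_{\mathbb{R}^n} \int_{\mathbb{R}^m} \mathcal{X}_{(G \setminus F)} \, d\nu_x \, d\alpha
\]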
18.4. VITALI COVERINGS 509
It follows from 18.20 that one can replace $\mathcal{X}_E$ with an arbitrary nonnegative µ
measurable simple function, s. Letting f be a nonnegative µ measurable function,
it follows there is an increasing sequence of nonnegative simple functions converging
to f pointwise and so by the monotone convergence theorem,
\[
\int_{\mathbb{R}^n \times \mathbb{R}^m} f \, d\mu = \int_{\mathbb{R}^n} \int_{\mathbb{R}^m} f \, d\nu_x \, d\alpha
\]
where y → f (x, y) is $\nu_x$ measurable for α a.e. x and $x \to \int_{\mathbb{R}^m} f \, d\nu_x$ is α measurable
so the iterated integral makes sense.
To see $\nu_x$ is a probability measure for a.e. x,
\[
\frac{1}{\alpha(B(x,r))} \int_{B(x,r)} \int_{\mathbb{R}^m} d\nu_x \, d\alpha
= \frac{1}{\alpha(B(x,r))} \, \mu\left( B(x,r) \times \mathbb{R}^m \right) = 1
\]
and so, using the fundamental theorem of calculus and passing to
a limit as r → 0, it follows that for α a.e. x
\[
\nu_x(\mathbb{R}^m) = \int_{\mathbb{R}^m} d\nu_x = 1
\]
Due to the regularity of the measures, all sets of measure zero may be taken to be
Borel. In the case of $f \in L^1(\mu)$, one applies the above to the positive and negative
parts of the real and imaginary parts. This proves the theorem.
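It follows there exists a finite set of balls of $\mathcal{G}_i$, $\{B_1, \dots, B_{m_1}\}$ such that
\[
(N_n + 1) \sum_{i=1}^{m_1} \mu(B_i) > \mu(E) \tag{18.21}
\]
and so
\[
(2N_n + 2) \sum_{i=1}^{m_1} \mu(B_i) > 2\mu(E) > \mu(U_1).
\]
\begin{align*}
&\ge \mu(U_1) - \sum_{i=1}^{m_1} \mu(B_i) = \mu(U_1) - \mu\left( \bigcup_{j=1}^{m_1} B_j \right) \\
&= \mu\left( U_1 \setminus \bigcup_{j=1}^{m_1} B_j \right) \ge \mu\left( E \setminus \bigcup_{j=1}^{m_1} B_j \right).
\end{align*}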
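Since the balls are closed, you can consider the sets of $\mathcal{F}$ which have empty intersection
with $\bigcup_{j=1}^{m_1} B_j$ and this new collection of sets will be a Vitali cover of $E \setminus \bigcup_{j=1}^{m_1} B_j$.
Letting this collection of balls play the role of $\mathcal{F}$ in the above argument and letting
$E \setminus \bigcup_{j=1}^{m_1} B_j$ play the role of E, repeat the above argument and obtain disjoint sets
of $\mathcal{F}$,
\[
\{B_{m_1+1}, \dots, B_{m_2}\},
\]
such that
\[
\lambda \mu\left( E \setminus \bigcup_{j=1}^{m_1} B_j \right) > \mu\left( \left( E \setminus \bigcup_{j=1}^{m_1} B_j \right) \setminus \bigcup_{j=m_1+1}^{m_2} B_j \right) = \mu\left( E \setminus \bigcup_{j=1}^{m_2} B_j \right),
\]
and so
\[
\lambda^2 \mu(E) > \mu\left( E \setminus \bigcup_{j=1}^{m_2} B_j \right).
\]
Continuing in this way, yields a sequence of disjoint balls $\{B_i\}$ contained in $\mathcal{F}$ and
\[
\mu\left( E \setminus \bigcup_{j=1}^{\infty} B_j \right) \le \mu\left( E \setminus \bigcup_{j=1}^{m_k} B_j \right) < \lambda^k \mu(E)
\]
for all k. Therefore, $\mu\left( E \setminus \bigcup_{j=1}^{\infty} B_j \right) = 0$ and this proves the Theorem.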
It is not necessary to assume µ (E) < ∞.
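Let $\mathcal{F}_m$ denote those closed balls of $\mathcal{F}$ which are contained in $D_m$. Then letting
$E_m$ denote $E \cap D_m$, $\mathcal{F}_m$ is a Vitali cover of $E_m$, $\mu(E_m) < \infty$, and so by Theorem
18.15, there exists a countable sequence of balls from $\mathcal{F}_m$, $\{B_j^m\}_{j=1}^{\infty}$, such that
$\mu\left( E_m \setminus \bigcup_{j=1}^{\infty} B_j^m \right) = 0$. Then consider the countable collection of balls, $\{B_j^m\}_{j,m=1}^{\infty}$.
\[
\mu\left( E \setminus \bigcup_{m=1}^{\infty} \bigcup_{j=1}^{\infty} B_j^m \right)
\le \mu\left( \bigcup_{i=1}^{\infty} \partial B(0, r_i) \right) + \sum_{m=1}^{\infty} \mu\left( E_m \setminus \bigcup_{j=1}^{\infty} B_j^m \right) = 0
\]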
Proof: Let x ∈ E. Thus x is the center of arbitrarily small balls from $\mathcal{F}$. Since
µ is a Radon measure, at most countably many radii, r of these balls can have
the property that µ (∂B (x, r)) > 0. Let $\mathcal{F}'$ denote the closures of the balls of $\mathcal{F}$,
B (x, r) with the property that µ (∂B (x, r)) = 0. Since for each x ∈ E there are only
countably many exceptions, $\mathcal{F}'$ is still a Vitali cover of E. Therefore, by Corollary
18.16 there is a disjoint sequence of these balls of $\mathcal{F}'$, $\{\overline{B_i}\}_{i=1}^{\infty}$ for which
\[
\mu\left( E \setminus \bigcup_{j=1}^{\infty} \overline{B_j} \right) = 0
\]
in the case when both the upper and lower derivatives are equal.
18.5. DIFFERENTIATION OF RADON MEASURES 513
λ (A) ≤ aµ (A)
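Proof: Suppose first that A is a bounded subset of $\{ x \notin Z : \overline{D}_\mu \lambda(x) \ge a \}$, let
ε > 0, and let V be a bounded open set with V ⊇ A and λ (V ) − ε < λ (A) , µ (V ) −
ε < µ (A) . Then if x ∈ A,
\[
\frac{\lambda(B(x,r))}{\mu(B(x,r))} > a - \varepsilon, \quad B(x,r) \subseteq V,
\]
for infinitely many values of r which are arbitrarily small. Thus the collection of
such balls constitutes a Vitali cover for A. By Corollary 18.17 there is a disjoint
sequence of these balls $\{B_i\}$ such that
\[
\mu\left( A \setminus \bigcup_{i=1}^{\infty} B_i \right) = 0. \tag{18.22}
\]
Therefore,
\[
(a - \varepsilon) \sum_{i=1}^{\infty} \mu(B_i) < \sum_{i=1}^{\infty} \lambda(B_i) \le \lambda(V) < \varepsilon + \lambda(A)
\]
and so
\[
a \sum_{i=1}^{\infty} \mu(B_i) \le \varepsilon + \varepsilon \mu(V) + \lambda(A)
\le \varepsilon + \varepsilon (\mu(A) + \varepsilon) + \lambda(A) \tag{18.23}
\]
Now
\[
\mu\left( A \setminus \bigcup_{i=1}^{\infty} B_i \right) + \mu\left( \bigcup_{i=1}^{\infty} B_i \right) \ge \mu(A)
\]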
\[
\frac{\lambda(B(x,r))}{\mu(B(x,r))} < a + \varepsilon, \quad B(x,r) \subseteq V
\]
for values of r which are arbitrarily small. Therefore, by Corollary 18.17 again,
there exists a disjoint sequence of these balls, $\{B_i\}$ satisfying this time,
\[
\lambda\left( A \setminus \bigcup_{i=1}^{\infty} B_i \right) = 0.
\]
Theorem 18.20 There exists a set of measure zero, N containing Z such that
for $x \notin N$, $D_\mu \lambda(x)$ exists and also $\mathcal{X}_{N^C}(\cdot) D_\mu \lambda(\cdot)$ is a µ measurable function.
Furthermore, $D_\mu \lambda(x) < \infty$ µ a.e.
Proof: First I show $D_\mu \lambda(x)$ exists a.e. Let 0 ≤ a < b < ∞ and let A be any
bounded subset of
\[
N(a, b) \equiv \left\{ x \notin Z : \overline{D}_\mu \lambda(x) > b > a > \underline{D}_\mu \lambda(x) \right\}.
\]
By Lemma 18.19,
\[
a\mu(A) \ge \lambda(A) \ge b\mu(A)
\]
and so µ (A) = 0 and A is µ measurable. It follows µ (N (a, b)) = 0 because
\[
\mu(N(a, b)) \le \sum_{m=1}^{\infty} \mu(N(a, b) \cap B(0, m)) = 0.
\]
Define
\[
N_0 \equiv \left\{ x \notin Z : \overline{D}_\mu \lambda(x) > \underline{D}_\mu \lambda(x) \right\}.
\]
Thus $\mu(N_0) = 0$ because
for all a and since λ is finite on bounded sets, the above implies µ (I ∩ B (0, m)) = 0
for each m which implies that I is µ measurable and has µ measure zero since
\[
I = \bigcup_{m=1}^{\infty} I_m.
\]
18.6. THE RADON NIKODYM THEOREM FOR RADON MEASURES 515
Letting η be an arbitrary Radon measure, let r > 0, and suppose η (∂B (x, r)) = 0.
(Since η is finite on every ball, there are only countably many r such that
η (∂B (x, r)) > 0.) Let V be an open set containing $\overline{B(x, r)}$. Then whenever
y is close enough to x, it follows that B (y, r) is also a subset of V. Since V is an
arbitrary open set containing $\overline{B(x, r)}$, it follows
\[
\eta(B(x, r)) = \eta\left( \overline{B(x, r)} \right) \ge \limsup_{y \to x} \eta(B(y, r))
\]
\[
\mathcal{X}_{N^C}(x) D_\mu \lambda(x) = \lim_{r_i \to 0} \mathcal{X}_{N^C}(x) \frac{\lambda(B(x, r_i))}{\mu(B(x, r_i))}
\]
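Theorem 18.21 Let λ and µ be Radon measures and suppose $\lambda \ll \mu$. Then for all
E a µ measurable set,
\[
\lambda(E) = \int_E (D_\mu \lambda) \, d\mu.
\]
Proof: Let t > 1 and let E be a µ measurable set which is bounded and a
subset of $N^C$ where N is the exceptional set of µ measure zero in Theorem 18.20
off of which µ (B (x,r)) > 0 for all r > 0 and $D_\mu \lambda(x)$ exists. Consider
\[
E_m \equiv E \cap \left\{ x \in N^C : t^m \le D_\mu \lambda(x) < t^{m+1} \right\}
\]
for all a > 0 and µ (E ) is finite due to the assumption that E is bounded and µ is
a Radon measure. Therefore, by Lemma 18.19,
\begin{align*}
\lambda(E) &= \sum_{m \in \mathbb{Z}} \lambda(E_m) \le \sum_{m \in \mathbb{Z}} t^{m+1} \mu(E_m) = t \sum_{m \in \mathbb{Z}} t^m \mu(E_m) \\
&\le t \sum_{m \in \mathbb{Z}} \int_{E_m} D_\mu \lambda(x) \, d\mu = t \int_E D_\mu \lambda(x) \, d\mu.
\end{align*}
\[
\ge t^{-1} \sum_{m \in \mathbb{Z}} \int_{E_m} D_\mu \lambda(x) \, d\mu = t^{-1} \int_E D_\mu \lambda(x) \, d\mu.
\]
Thus,
\[
t \int_E D_\mu \lambda(x) \, d\mu \ge \lambda(E) \ge t^{-1} \int_E D_\mu \lambda(x) \, d\mu
\]
and letting t → 1, it follows
\[
\lambda(E) = \int_E D_\mu \lambda(x) \, d\mu. \tag{18.25}
\]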
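A one-dimensional example may help fix ideas about the derivative appearing in 18.25. In the sketch below, which rests on assumptions chosen purely for illustration, µ is Lebesgue measure on the line and λ is the hypothetical measure with density t², so the ratio λ(B(x,r))/µ(B(x,r)) can be computed in closed form and compared with the density at x.

```python
from fractions import Fraction

def lam(a, b):
    """lambda((a, b)) for the hypothetical measure with density t^2."""
    return (b ** 3 - a ** 3) / 3

def mu(a, b):
    """Lebesgue measure of the interval (a, b)."""
    return b - a

def ratio(x, r):
    """lambda(B(x, r)) / mu(B(x, r)) for the ball B(x, r) = (x - r, x + r)."""
    return lam(x - r, x + r) / mu(x - r, x + r)

x = Fraction(1, 2)
ratios = [ratio(x, Fraction(1, 10 ** k)) for k in range(1, 6)]
errors = [q - x * x for q in ratios]  # equals r^2 / 3 for each radius r
```

The ratios decrease to x² = 1/4 as r shrinks, with error exactly r²/3, so the limit of the ratios recovers the density, in line with the theorem.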
x = (x1 , · · ·, xn ) ,
Here α is a multi-index as just described and aα ∈ C. Also define for α = (α1 , ···, αn )
a multi-index
\[ D^\alpha f(x) \equiv \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}. \]
Definition 19.2 Define G1 to be the functions of the form $p(x) e^{-a|x|^2}$ where a > 0
and p (x) is a polynomial. Let G be all finite sums of functions in G1 . Thus G is an
algebra of functions which has the property that if f ∈ G then $\overline{f}$ ∈ G.
It is always assumed, unless stated otherwise that the measure will be Lebesgue
measure.
518 FOURIER TRANSFORMS
Proof: Let f ∈ Lp (Rn ) . Then there exists g ∈ Cc (Rn ) such that ||f − g||p < ε.
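Now let b > 0 be large enough that
\[ \int_{\mathbb{R}^n} \left( e^{-b|x|^2} \right)^p dx < \varepsilon^p. \]
Then $x \to g(x) e^{b|x|^2}$ is in $C_c(\mathbb{R}^n) \subseteq C_0(\mathbb{R}^n)$. Therefore, from Lemma 19.3 there
exists ψ ∈ G such that
\[ \left\| g e^{b|\cdot|^2} - \psi \right\|_\infty < 1. \]
Therefore, letting $\phi(x) \equiv e^{-b|x|^2} \psi(x)$ it follows that φ ∈ G and for all x ∈ Rn ,
\[ |g(x) - \phi(x)| < e^{-b|x|^2}. \]
Therefore,
\[ \left( \int_{\mathbb{R}^n} |g(x) - \phi(x)|^p dx \right)^{1/p} \le \left( \int_{\mathbb{R}^n} \left( e^{-b|x|^2} \right)^p dx \right)^{1/p} < \varepsilon. \]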
It follows
||f − φ||p ≤ ||f − g||p + ||g − φ||p < 2ε.
Since ε > 0 is arbitrary, this proves the theorem.
The following lemma is also interesting even if it is obvious.
an integrable function.
One reason for using the functions, G is that it is very easy to compute the
Fourier transform of these functions. The first thing to do is to verify F and F −1
map G to G and that F −1 ◦ F (ψ) = ψ.
Now using the dominated convergence theorem to justify passing derivatives inside
the integral where necessary and using integration by parts,
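\[ H'(t) = 2ct\, e^{ct^2} \int_{\mathbb{R}} e^{-cx^2} \cos(2cxt)\, dx - e^{ct^2} \int_{\mathbb{R}} e^{-cx^2} \sin(2cxt)\, 2xc\, dx \]
\[ = 2ct H(t) - e^{ct^2}\, 2ct \int_{\mathbb{R}} e^{-cx^2} \cos(2cxt)\, dx = 2ct \left( H(t) - H(t) \right) = 0 \]
and so $H(t) = H(0) = \int_{\mathbb{R}} e^{-cx^2} dx \equiv I$. Thus
\[ I^2 = \int_{\mathbb{R}^2} e^{-c(x^2+y^2)}\, dx\, dy = \int_0^\infty \int_0^{2\pi} e^{-cr^2} r\, d\theta\, dr = \frac{\pi}{c}. \]
Therefore, $I = \sqrt{\pi}/\sqrt{c}$. Since the sign of t is unimportant, this proves 19.1. This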
also proves 19.2 after writing as iterated integrals.
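The displayed derivative suggests that H (t) is $e^{ct^2} \int_{\mathbb{R}} e^{-cx^2} \cos(2cxt)\, dx$ (an assumption, since the definition of H is not repeated here), so H (t) = I says this integral equals $e^{-ct^2} \sqrt{\pi/c}$. The following minimal sketch checks that numerically with a plain trapezoid rule. The function names, the truncation of R to a bounded interval, and the sample values of c and t are illustrative choices, not part of the text.

```python
import math

def trapezoid(func, lo, hi, steps):
    """Composite trapezoid rule for func on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        total += func(lo + k * h)
    return total * h

def gaussian_cosine_integral(c, t, half_width=12.0, steps=20000):
    # Truncate R to [-half_width, half_width]; the Gaussian tail is negligible there.
    return trapezoid(lambda x: math.exp(-c * x * x) * math.cos(2 * c * x * t),
                     -half_width, half_width, steps)

def closed_form(c, t):
    # H(t) = I = sqrt(pi/c), rearranged for the integral itself.
    return math.exp(-c * t * t) * math.sqrt(math.pi / c)
```

Comparing `gaussian_cosine_integral(c, t)` with `closed_form(c, t)` for a few positive c and real t should show agreement to many digits, because the trapezoid rule is very accurate for rapidly decaying smooth integrands.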
Consider 19.3.
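\[ \int_{\mathbb{R}} e^{-ct^2} e^{ist}\, dt = \int_{\mathbb{R}} e^{-c\left( t^2 - \frac{ist}{c} + \left( \frac{is}{2c} \right)^2 \right)} e^{-\frac{s^2}{4c}}\, dt \]
\[ = e^{-\frac{s^2}{4c}} \int_{\mathbb{R}} e^{-c\left( t - \frac{is}{2c} \right)^2} dt = e^{-\frac{s^2}{4c}} \frac{\sqrt{\pi}}{\sqrt{c}}. \]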
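Proof: The first claim will be shown if it is shown that F ψ ∈ G for $\psi(x) = x^\alpha e^{-b|x|^2}$
because an arbitrary function of G is a finite sum of scalar multiples of
functions such as ψ. Using Lemma 19.7,
\[ F\psi(t) \equiv \left( \frac{1}{2\pi} \right)^{n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} x^\alpha e^{-b|x|^2} dx \]
\[ = \left( \frac{1}{2\pi} \right)^{n/2} (-i)^{-|\alpha|} D_t^\alpha \left( \int_{\mathbb{R}^n} e^{-it \cdot x} e^{-b|x|^2} dx \right) \]
\[ = \left( \frac{1}{2\pi} \right)^{n/2} (-i)^{-|\alpha|} D_t^\alpha \left( e^{-\frac{|t|^2}{4b}} \left( \frac{\sqrt{\pi}}{\sqrt{b}} \right)^n \right) \]
and this is clearly in G because it equals a polynomial times $e^{-\frac{|t|^2}{4b}}$. It remains
to verify the other assertion. As in the first case, it suffices to consider
$\psi(x) = x^\alpha e^{-b|x|^2}$. Using Lemma 19.7 and ordinary integration by parts on the iterated
integrals, $\int_{\mathbb{R}^n} e^{-c|t|^2} e^{is \cdot t} dt = e^{-\frac{|s|^2}{4c}} \left( \frac{\sqrt{\pi}}{\sqrt{c}} \right)^n$,
\[ F^{-1} \circ F(\psi)(s) \equiv \left( \frac{1}{2\pi} \right)^{n/2} \int_{\mathbb{R}^n} e^{is \cdot t} \left( \frac{1}{2\pi} \right)^{n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} x^\alpha e^{-b|x|^2} dx\, dt \]
\[ = \left( \frac{1}{2\pi} \right)^{n} \int_{\mathbb{R}^n} e^{is \cdot t} (-i)^{-|\alpha|} D_t^\alpha \left( \int_{\mathbb{R}^n} e^{-it \cdot x} e^{-b|x|^2} dx \right) dt \]
\[ = \left( \frac{1}{2\pi} \right)^{n/2} \int_{\mathbb{R}^n} e^{is \cdot t} \left( \frac{1}{2\pi} \right)^{n/2} (-i)^{-|\alpha|} D_t^\alpha \left( e^{-\frac{|t|^2}{4b}} \left( \frac{\sqrt{\pi}}{\sqrt{b}} \right)^n \right) dt \]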
19.3. FOURIER TRANSFORMS OF JUST ABOUT ANYTHING 521
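\[ = \left( \frac{1}{2\pi} \right)^{n} \left( \frac{\sqrt{\pi}}{\sqrt{b}} \right)^n (-i)^{-|\alpha|} \int_{\mathbb{R}^n} e^{is \cdot t} D_t^\alpha \left( e^{-\frac{|t|^2}{4b}} \right) dt \]
\[ = \left( \frac{1}{2\pi} \right)^{n} \left( \frac{\sqrt{\pi}}{\sqrt{b}} \right)^n (-i)^{-|\alpha|} (-1)^{|\alpha|} s^\alpha (i)^{|\alpha|} \int_{\mathbb{R}^n} e^{is \cdot t} e^{-\frac{|t|^2}{4b}} dt \]
\[ = \left( \frac{1}{2\pi} \right)^{n} \left( \frac{\sqrt{\pi}}{\sqrt{b}} \right)^n s^\alpha \int_{\mathbb{R}^n} e^{is \cdot t} e^{-\frac{|t|^2}{4b}} dt \]
\[ = \left( \frac{1}{2\pi} \right)^{n} \left( \frac{\sqrt{\pi}}{\sqrt{b}} \right)^n s^\alpha e^{-\frac{|s|^2}{4(1/(4b))}} \left( \frac{\sqrt{\pi}}{\sqrt{1/(4b)}} \right)^n \]
\[ = \left( \frac{1}{2\pi} \right)^{n} \left( \frac{\sqrt{\pi}}{\sqrt{b}} \right)^n s^\alpha e^{-b|s|^2} \left( \sqrt{\pi}\, 2 \sqrt{b} \right)^n = s^\alpha e^{-b|s|^2} = \psi(s). \]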
This little computation proves the theorem. The other case is entirely similar.
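As a sanity check of this computation in the simplest nontrivial case, the sketch below takes n = 1 and α = 1, so that $\psi(x) = x e^{-bx^2}$ and the formula above gives $F\psi(t) = -i \frac{t}{2b} (2b)^{-1/2} e^{-t^2/(4b)}$. The value b = 0.8, the truncation interval, and the helper names are hypothetical choices made only for illustration.

```python
import cmath
import math

def fourier(f, t, sign=-1, half_width=15.0, steps=6000):
    """(2*pi)^(-1/2) * integral of exp(sign*i*t*x) * f(x) dx by the trapezoid rule.
    sign=-1 gives F, sign=+1 gives F^{-1} (one-dimensional case, n = 1)."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(sign * 1j * t * x) * f(x)
    return total * h / math.sqrt(2.0 * math.pi)

b = 0.8  # hypothetical sample value of the Gaussian parameter

def psi(x):
    return x * math.exp(-b * x * x)

def psi_hat(t):
    # (2 pi)^(-1/2) * (-i)^(-1) * d/dt [ exp(-t^2/(4b)) * sqrt(pi/b) ]
    return -1j * t / (2.0 * b) * math.exp(-t * t / (4.0 * b)) / math.sqrt(2.0 * b)
```

Evaluating `fourier(psi, t)` against `psi_hat(t)`, and `fourier(psi_hat, s, sign=+1)` against `psi(s)`, illustrates both F ψ ∈ G and F −1 ◦ F (ψ) = ψ for this one function.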
Proof:
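\[ T_{F\psi}(\phi) \equiv \int_{\mathbb{R}^n} F\psi(t)\, \phi(t)\, dt \]
\[ = \int_{\mathbb{R}^n} \left( \frac{1}{2\pi} \right)^{n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} \psi(x)\, dx\, \phi(t)\, dt \]
\[ = \int_{\mathbb{R}^n} \psi(x) \left( \frac{1}{2\pi} \right)^{n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} \phi(t)\, dt\, dx \]
\[ = \int_{\mathbb{R}^n} \psi(x)\, F\phi(x)\, dx \equiv T_\psi(F\phi) \]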
for all φ ∈ G. Therefore, this is true for φ = ψ and so ψ = 0. This proves the lemma.
From now on regard G ⊆ G ∗ and for ψ ∈ G write ψ (φ) instead of Tψ (φ) . It was
just shown that with this interpretation1 ,
\[ F\psi(\phi) = \psi(F(\phi)), \quad F^{-1}\psi(\phi) = \psi\left( F^{-1}\phi \right). \]
Lemma 19.12 F and F −1 are both one to one, onto, and are inverses of each
other.
Proof: First note F and F −1 are both linear. This follows directly from the
definition. Suppose now F T = 0. Then F T (φ) = T (F φ) = 0 for all φ ∈ G. But F
and F −1 map G onto G because if ψ ∈ G, then $\psi = F\left( F^{-1}(\psi) \right)$. Therefore, T = 0
and so F is one to one. Similarly F −1 is one to one. Now
\[ F^{-1}(FT)(\phi) \equiv (FT)\left( F^{-1}\phi \right) \equiv T\left( F\left( F^{-1}(\phi) \right) \right) = T\phi. \]
consider the derivative of a function of one variable, in elementary courses you think of it as a
number but thinking of it as a linear transformation acting on R is better because this leads to
the concept of a derivative which generalizes to functions of many variables. So it is here. You
can think of ψ ∈ G as simply an element of G but it is better to think of it as an element of G ∗ as
just described.
Therefore,
mn (Vm \ Km ) ≤ 2−m .
Let
φm ∈ Cc (Vm ) , Km ≺ φm ≺ Vm .
Then φm (x) → XER (x) a.e. because the set where φm (x) fails to converge to this
set is contained in the set of all x which are in infinitely many of the sets Vm \ Km .
This set has measure zero because
\[ \sum_{m=1}^{\infty} m_n(V_m \setminus K_m) < \infty \]
Hence, letting R → ∞,
\[ \int f \psi\, \mathcal{X}_{E_+}\, dm_n = \int f_+ \psi\, dm_n = 0 \]
Since ψ is arbitrary, the first part of the argument applies to f+ and implies f+ = 0.
Similarly f− = 0. Finally, if f is complex valued, the assumptions mean
\[ \int \mathrm{Re}(f)\, \phi\, dm_n = 0, \quad \int \mathrm{Im}(f)\, \phi\, dm_n = 0 \]
for all φ ∈ Cc (Rn ) and so both Re (f ) , Im (f ) equal zero a.e. This proves the
lemma.
Corollary 19.14 Let f ∈ L1 (Rn ) and suppose
\[ \int_{\mathbb{R}^n} f(x)\, \phi(x)\, dx = 0 \]
By Lemma 19.13 f = 0.
The next theorem is the main result of this sort.
Proof: The case where f ∈ L1 (Rn ) was dealt with in Corollary 19.14. Suppose
$f \in L^p(\mathbb{R}^n)$ for p > 1. Then by Holder's inequality and the density of G in $L^{p'}(\mathbb{R}^n)$,
it follows that $\int f g\, dx = 0$ for all $g \in L^{p'}(\mathbb{R}^n)$. By the Riesz representation theorem,
f = 0.
It remains to consider the case where f has polynomial growth. Thus
$x \to f(x) e^{-|x|^2} \in L^1(\mathbb{R}^n)$. Therefore, for all ψ ∈ G,
\[ 0 = \int f(x) e^{-|x|^2} \psi(x)\, dx \]
because $e^{-|x|^2} \psi(x) \in G$. Therefore, by the first part, $f(x) e^{-|x|^2} = 0$ a.e.
The following theorem shows that you can consider most functions you are likely
to encounter as elements of G ∗ .
Proof: Let f have polynomial growth first. Then the above integral is clearly
well defined and so in this case, f ∈ G ∗ .
Next suppose $f \in L^p(\mathbb{R}^n)$ with ∞ > p ≥ 1. Then it is clear again that the
above integral is well defined because of the fact that φ is a sum of polynomials
times exponentials of the form $e^{-c|x|^2}$ and these are in $L^{p'}(\mathbb{R}^n)$. Also φ → f (φ) is
clearly linear in both cases. This proves the theorem.
This has shown that for nearly any reasonable function, you can define its Fourier
transform as described above. Also you should note that G ∗ includes $C_0(\mathbb{R}^n)'$, the
space of complex measures whose total variation are Radon measures. It is especially
interesting when the Fourier transform yields another function of some sort.
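and $F^{-1}f(\phi) = \int_{\mathbb{R}^n} g \phi\, dt$ where $g(t) = \left( \frac{1}{2\pi} \right)^{n/2} \int_{\mathbb{R}^n} e^{it \cdot x} f(x)\, dx$. In short,
\[ Ff(t) \equiv (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} f(x)\, dx, \]
\[ F^{-1}f(t) \equiv (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{it \cdot x} f(x)\, dx. \]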
Since φ ∈ G is arbitrary, it follows from Theorem 19.15 that F f (x) is given by the
claimed formula. The case of F −1 is identical.
Here are interesting properties of these Fourier transforms of functions in L1 .
Now integrating by parts, it follows that for $\|t\|_\infty \equiv \max\{ |t_j| : j = 1, \cdots, n \} > 0$
\[ |Ff(t)| \le \varepsilon/2 + (2\pi)^{-n/2} \left| \frac{1}{\|t\|_\infty} \int_{\mathbb{R}^n} \sum_{j=1}^{n} \left| \frac{\partial g(x)}{\partial x_j} \right| dx \right| \tag{19.6} \]
and this last expression converges to zero as $\|t\|_\infty \to \infty$. The reason for this is that
if $t_j \neq 0$, integration by parts with respect to $x_j$ gives
\[ (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} g(x)\, dx = (2\pi)^{-n/2} \frac{1}{i t_j} \int_{\mathbb{R}^n} e^{-it \cdot x} \frac{\partial g(x)}{\partial x_j}\, dx. \]
Therefore, choose the j for which $\|t\|_\infty = |t_j|$ and the result of 19.6 holds. Therefore,
from 19.6, if $\|t\|_\infty$ is large enough, $|Ff(t)| < \varepsilon$. Similarly, $\lim_{\|t\| \to \infty} F^{-1}f(t) = 0$.
Consider the claim about uniform continuity. Let ε > 0 be given. Then there
exists R such that if $\|t\|_\infty > R$, then $|Ff(t)| < \varepsilon/2$. Since F f is continuous, it is
uniformly continuous on the compact set, $[-R-1, R+1]^n$. Therefore, there exists
$\delta_1$ such that if $\|t - t'\|_\infty < \delta_1$ for $t', t \in [-R-1, R+1]^n$, then
\[ |Ff(t) - Ff(t')| < \varepsilon/2. \tag{19.7} \]
Now let $0 < \delta < \min(\delta_1, 1)$ and suppose $\|t - t'\|_\infty < \delta$. If both t, t' are contained
in $[-R, R]^n$, then 19.7 holds. If $t \in [-R, R]^n$ and $t' \notin [-R, R]^n$, then both are
contained in $[-R-1, R+1]^n$ and so this verifies 19.7 in this case. The other case
is that neither point is in $[-R, R]^n$ and in this case,
\[ |Ff(t) - Ff(t')| \le |Ff(t)| + |Ff(t')| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves the theorem.
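A concrete instance of the decay just proved: for f the indicator function of [−1, 1] on R (an example chosen here for illustration, not taken from the text), the transform can be computed by hand as $Ff(t) = (2\pi)^{-1/2}\, 2 \sin(t)/t$, which is continuous and tends to 0 as |t| → ∞. The sketch below samples |F f| on windows that move out to infinity; the window length and grid size are arbitrary.

```python
import math

def fourier_of_indicator(t):
    """F f(t) for f = indicator of [-1, 1] on R: (2*pi)^(-1/2) * 2*sin(t)/t."""
    if t == 0.0:
        return 2.0 / math.sqrt(2.0 * math.pi)
    return 2.0 * math.sin(t) / (t * math.sqrt(2.0 * math.pi))

def sup_beyond(radius, samples=2000, span=50.0):
    """Largest |F f(t)| seen on a sample grid of [radius, radius + span]."""
    return max(abs(fourier_of_indicator(radius + span * k / samples))
               for k in range(samples + 1))
```

The sampled maxima shrink roughly like 1/radius, consistent with the bound 19.6 even though this f is not smooth.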
There is a very interesting relation between the Fourier transform and convolu-
tions.
Theorem 19.19 Let f, g ∈ L1 (Rn ). Then f ∗ g ∈ L1 and $F(f * g) = (2\pi)^{n/2} Ff\, Fg$.
Proof: Consider
\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |f(x-y)\, g(y)|\, dy\, dx. \]
The function, (x, y) → |f (x − y) g (y)| is Lebesgue measurable and so by Fubini's
theorem,
\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |f(x-y)\, g(y)|\, dy\, dx = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |f(x-y)\, g(y)|\, dx\, dy = \|f\|_1 \|g\|_1 < \infty. \]
It follows that for a.e. x, $\int_{\mathbb{R}^n} |f(x-y)\, g(y)|\, dy < \infty$ and for each of these values
of x, it follows that $\int_{\mathbb{R}^n} f(x-y)\, g(y)\, dy$ exists and equals a function of x which is
in L1 (Rn ) , f ∗ g (x). Now
\[ F(f * g)(t) \equiv (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} f * g(x)\, dx \]
\[ = (2\pi)^{-n/2} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{-it \cdot x} f(x-y)\, g(y)\, dy\, dx \]
\[ = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-it \cdot y} g(y) \int_{\mathbb{R}^n} e^{-it \cdot (x-y)} f(x-y)\, dx\, dy \]
\[ = (2\pi)^{n/2} Ff(t)\, Fg(t). \]
There are many other considerations involving Fourier transforms of functions
in L1 (Rn ).
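Theorem 19.19 can be checked numerically in one dimension. The sketch below, under the assumption n = 1 and with two Gaussians standing in for f and g (hypothetical sample functions), computes f ∗ g and the three transforms by the trapezoid rule on a truncated grid and compares F (f ∗ g) (t) with $(2\pi)^{1/2} Ff(t)\, Fg(t)$. All names and grid parameters are illustrative.

```python
import cmath
import math

def make_grid(half_width, steps):
    h = 2.0 * half_width / steps
    return [-half_width + k * h for k in range(steps + 1)], h

def trapezoid(values, h):
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def fourier_at(values, points, h, t):
    """(2*pi)^(-1/2) * integral of exp(-i t x) f(x) dx, with f given by samples (n = 1)."""
    integrand = [cmath.exp(-1j * t * x) * v for x, v in zip(points, values)]
    return trapezoid(integrand, h) / math.sqrt(2.0 * math.pi)

def convolution_samples(f, g, points, h):
    """Samples of (f*g)(x) = integral of f(x - y) g(y) dy on the same grid."""
    return [trapezoid([f(x - y) * g(y) for y in points], h) for x in points]

def check(t, half_width=12.0, steps=600):
    f = lambda x: math.exp(-x * x)          # hypothetical sample functions in L^1(R)
    g = lambda x: math.exp(-2.0 * x * x)
    points, h = make_grid(half_width, steps)
    lhs = fourier_at(convolution_samples(f, g, points, h), points, h, t)
    rhs = (math.sqrt(2.0 * math.pi)
           * fourier_at([f(x) for x in points], points, h, t)
           * fourier_at([g(x) for x in points], points, h, t))
    return lhs, rhs
```

Calling `check(t)` returns the two sides of the identity at the frequency t; for these Gaussians both should also match the closed form $\sqrt{\pi/4}\, e^{-3t^2/8}$.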
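Similarly,
\[ \int_{\mathbb{R}^n} \phi\, (F^{-1}\psi)\, dx = \int_{\mathbb{R}^n} (F^{-1}\phi)\, \psi\, dt. \tag{19.9} \]
Now, 19.8 - 19.9 imply
\[ \int_{\mathbb{R}^n} |\phi|^2 dx = \int_{\mathbb{R}^n} \phi\, \overline{F^{-1}(F\phi)}\, dx = \int_{\mathbb{R}^n} \phi\, F(\overline{F\phi})\, dx = \int_{\mathbb{R}^n} F\phi\, (\overline{F\phi})\, dx = \int_{\mathbb{R}^n} |F\phi|^2 dx. \]
Similarly
\[ \|\phi\|_2 = \|F^{-1}\phi\|_2. \]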
This proves the theorem.
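The norm identity $\|\phi\|_2 = \|F\phi\|_2$ can be illustrated numerically for one element of G. The sketch below assumes n = 1 and the sample function $\phi(x) = x e^{-x^2/2}$ (a hypothetical choice); it computes F φ on a grid by the trapezoid rule and compares the two L2 norms. Grid sizes and helper names are illustrative only.

```python
import cmath
import math

def make_grid(half_width, steps):
    h = 2.0 * half_width / steps
    return [-half_width + k * h for k in range(steps + 1)], h

def trapezoid(values, h):
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def fourier_samples(values, points, h):
    """Samples of F phi on the same grid, F phi(t) = (2 pi)^(-1/2) int e^{-itx} phi(x) dx."""
    scale = 1.0 / math.sqrt(2.0 * math.pi)
    return [scale * trapezoid([cmath.exp(-1j * t * x) * v
                               for x, v in zip(points, values)], h)
            for t in points]

def l2_norm(values, h):
    return math.sqrt(trapezoid([abs(v) ** 2 for v in values], h))

def norms(half_width=12.0, steps=600):
    points, h = make_grid(half_width, steps)
    phi = [x * math.exp(-0.5 * x * x) for x in points]   # hypothetical element of G
    return l2_norm(phi, h), l2_norm(fourier_samples(phi, points, h), h)
```

Both numbers returned by `norms()` should agree with the exact value $(\sqrt{\pi}/2)^{1/2}$ of $\|\phi\|_2$ for this φ.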
Also by Theorem 19.20 $\{F\phi_k\}_{k=1}^{\infty}$ is Cauchy in L2 (Rn ) and so it converges to some
h ∈ L2 (Rn ). Therefore, from the above,
\[ Ff(\psi) = \int_{\mathbb{R}^n} h(x)\, \psi(x) \]
\[ \|Ff\|_2 = \lim_{k \to \infty} \|F\phi_k\|_2 = \lim_{k \to \infty} \|\phi_k\|_2 = \|f\|_2. \]
Similarly,
\[ \|f\|_2 = \|F^{-1}f\|_2. \]
This proves the theorem.
The following corollary is a simple generalization of this. To prove this corollary,
use the following simple lemma which comes as a consequence of the Cauchy Schwarz
inequality.
Proof:
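\[ \left| \int_{\mathbb{R}^n} f_k g_k\, dx - \int_{\mathbb{R}^n} f g\, dx \right| \le \left| \int_{\mathbb{R}^n} f_k g_k\, dx - \int_{\mathbb{R}^n} f_k g\, dx \right| + \left| \int_{\mathbb{R}^n} f_k g\, dx - \int_{\mathbb{R}^n} f g\, dx \right| \]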
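Proof: First note the above formula is obvious if f, g ∈ G. To see this, note
\[ \int_{\mathbb{R}^n} Ff\, \overline{Fg}\, dx = \int_{\mathbb{R}^n} Ff(x)\, \overline{\frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-ix \cdot t} g(t)\, dt}\, dx \]
\[ = \int_{\mathbb{R}^n} \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{ix \cdot t} Ff(x)\, dx\, \overline{g(t)}\, dt \]
\[ = \int_{\mathbb{R}^n} \left( F^{-1} \circ F \right) f(t)\, \overline{g(t)}\, dt \]
\[ = \int_{\mathbb{R}^n} f(t)\, \overline{g(t)}\, dt. \]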
\[ Ff = \lim_{r \to \infty} Ff_r, \quad F^{-1}f = \lim_{r \to \infty} F^{-1}f_r. \]
Since this holds for all φ ∈ G, a dense subset of L2 (Rn ), it follows that
\[ Ff_r(y) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} f_r(x)\, e^{-ix \cdot y}\, dx. \]
Similarly
\[ F^{-1}f_r(y) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} f_r(x)\, e^{ix \cdot y}\, dx. \]
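and letting φ ∈ G,
\[ \int F(h_r * f)(\phi)\, dx \equiv \int (h_r * f)(F\phi)\, dx \]
\[ = (2\pi)^{-n/2} \int \int \int h_r(x-y)\, f(y)\, e^{-ix \cdot t} \phi(t)\, dt\, dy\, dx \]
\[ = (2\pi)^{-n/2} \int \int \left( \int h_r(x-y)\, e^{-i(x-y) \cdot t}\, dx \right) f(y)\, e^{-iy \cdot t}\, dy\, \phi(t)\, dt \]
\[ = (2\pi)^{n/2} \int F h_r(t)\, Ff(t)\, \phi(t)\, dt. \]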
Definition 19.28 f ∈ S, the Schwartz class, if f ∈ C ∞ (Rn ) and for all positive
integers N ,
\[ \rho_N(f) < \infty \]
where
\[ \rho_N(f) = \sup\left\{ (1 + |x|^2)^N |D^\alpha f(x)| : x \in \mathbb{R}^n,\ |\alpha| \le N \right\}. \]
Thus f ∈ S if and only if f ∈ C ∞ (Rn ) and
Also note that if f ∈ S, then p(f ) ∈ S for any polynomial, p with p(0) = 0 and
that
S ⊆ Lp (Rn ) ∩ L∞ (Rn )
for any p ≥ 1. To see this assertion about the p (f ), it suffices to consider the case
of the product of two elements of the Schwartz class. If f, g ∈ S, then Dα (f g) is
a finite sum of derivatives of f times derivatives of g. Therefore, ρN (f g) < ∞ for
all N . You may wonder about examples of things in S. Clearly any function in
$C_c^\infty(\mathbb{R}^n)$ is in S. However there are other functions in S. For example $e^{-|x|^2}$ is in
S as you can verify for yourself and so is any function from G. Note also that the
density of Cc (Rn ) in Lp (Rn ) shows that S is dense in Lp (Rn ) for every p.
Recall the Fourier transform of a function in L1 (Rn ) is given by
\[ Ff(t) \equiv (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-it \cdot x} f(x)\, dx. \]
Rn
Therefore, this gives the Fourier transform for f ∈ S. The nice property which S
has in common with G is that the Fourier transform and its inverse map S one to
one onto S. This means I could have presented the whole of the above theory as
well as what follows in terms of S and its algebraic dual, S∗ rather than in terms
of G and G ∗ . However, it is more technical. Nevertheless, letting S play the role
of G in the above is convenient in certain applications because it is easier to reduce
to S than G. I will make use of this simple observation whenever it will simplify a
presentation. The fundamental result which makes it possible is the following.
Now $x^{e_j} f(x) \in S$ and so one can continue in this way and take derivatives indefinitely.
Thus $F^{-1}f \in C^\infty(\mathbb{R}^n)$ and from the above argument,
\[ D^\alpha F^{-1}f(t) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{it \cdot x} (ix)^\alpha f(x)\, dx. \]
To complete showing $F^{-1}f \in S$,
\[ t^\beta D^\alpha F^{-1}f(t) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{it \cdot x}\, t^\beta (ix)^\alpha f(x)\, dx. \]
where the boundary term vanishes because f ∈ S. Returning to 19.17, use the fact
that $|e^{ia}| = 1$ to conclude
\[ |t^\beta D^\alpha F^{-1}f(t)| \le C \int_{\mathbb{R}^n} |D^\beta((ix)^\alpha f(x))|\, dx < \infty. \]
Proof: The first claim follows from the fact that F and F −1 are inverses of each
other which was established above. For the second, let ψ ∈ S. Then $\psi = F\left( F^{-1}\psi \right)$.
Thus F maps S onto S. If F ψ = 0, then do F −1 to both sides to conclude ψ = 0.
Thus F is one to one and onto. Similarly, F −1 is one to one and onto.
Note the above equations involving F and F −1 hold pointwise everywhere be-
cause F ψ and F −1 ψ are continuous.
19.3.4 Convolution
To begin with it is necessary to discuss the meaning of φf where f ∈ G ∗ and φ ∈ G.
What should it mean? First suppose f ∈ Lp (Rn ) or measurable with polynomial
growth. Then φf also has these properties. Hence, it should be the case that
$\phi f(\psi) = \int_{\mathbb{R}^n} \phi f \psi\, dx = \int_{\mathbb{R}^n} f(\phi\psi)\, dx$. This motivates the following definition.
Definition 19.32 Let f ∈ G ∗ and let φ ∈ G. Then define the convolution of f with
an element of G as follows.
\[ f * \phi \equiv (2\pi)^{n/2} F^{-1}(F\phi\, Ff) \in G^* \]
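Proof: Note that 19.18 follows from Definition 19.32 and both assertions hold
for f ∈ G. Consider 19.19. Here is a simple formula involving a pair of functions in
G.
\[ \left( \psi * F^{-1}F^{-1}\phi \right)(x) = \left( \int \int \int \psi(x-y)\, e^{iy \cdot y_1} e^{iy_1 \cdot z} \phi(z)\, dz\, dy_1\, dy \right) (2\pi)^n \]
\[ = \left( \int \int \int \psi(x-y)\, e^{-iy \cdot \tilde{y}_1} e^{-i\tilde{y}_1 \cdot z} \phi(z)\, dz\, d\tilde{y}_1\, dy \right) (2\pi)^n = (\psi * FF\phi)(x). \]
Now for ψ ∈ G,
\[ (2\pi)^{n/2} F\left( F^{-1}\phi\, F^{-1}f \right)(\psi) \equiv (2\pi)^{n/2} \left( F^{-1}\phi\, F^{-1}f \right)(F\psi) \equiv (2\pi)^{n/2} F^{-1}f\left( F^{-1}\phi\, F\psi \right) \]
\[ \equiv (2\pi)^{n/2} f\left( F^{-1}\left( F^{-1}\phi\, F\psi \right) \right) = f\left( (2\pi)^{n/2} F^{-1}\left( \left( F F^{-1}F^{-1}\phi \right)(F\psi) \right) \right) \]
\[ \equiv f\left( \psi * F^{-1}F^{-1}\phi \right) = f(\psi * FF\phi) \tag{19.20} \]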
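Also
\[ (2\pi)^{n/2} F^{-1}(F\phi\, Ff)(\psi) \equiv (2\pi)^{n/2} (F\phi\, Ff)\left( F^{-1}\psi \right) \equiv (2\pi)^{n/2} Ff\left( F\phi\, F^{-1}\psi \right) \]
\[ \equiv (2\pi)^{n/2} f\left( F\left( F\phi\, F^{-1}\psi \right) \right) = f\left( F\left( (2\pi)^{n/2} \left( F\phi\, F^{-1}\psi \right) \right) \right) \]
\[ = f\left( F\left( (2\pi)^{n/2} \left( F^{-1}(FF\phi)\, F^{-1}\psi \right) \right) \right) = f\left( F\left( F^{-1}(FF\phi * \psi) \right) \right) \]
\[ = f(FF\phi * \psi) = f(\psi * FF\phi). \tag{19.21} \]
The last line follows from the following.
\[ \int FF\phi(x-y)\, \psi(y)\, dy = \int F\phi(x-y)\, F\psi(y)\, dy = \int F\psi(x-y)\, F\phi(y)\, dy = \int \psi(x-y)\, FF\phi(y)\, dy. \]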
19.4 Exercises
1. For f ∈ L1 (Rn ), show that if F −1 f ∈ L1 or F f ∈ L1 , then f equals a
continuous bounded function a.e.
2. Suppose f, g ∈ L1 (R) and F f = F g. Show f = g a.e.
3. Show that if f ∈ L1 (Rn ) , then lim|x|→∞ F f (x) = 0.
4. ↑ Suppose f ∗ f = f or f ∗ f = 0 and f ∈ L1 (R). Show f = 0.
5. For this problem define $\int_a^\infty f(t)\, dt \equiv \lim_{r \to \infty} \int_a^r f(t)\, dt$. Note this coincides
with the Lebesgue integral when f ∈ L1 (a, ∞). Show
(a) $\int_0^\infty \frac{\sin(u)}{u}\, du = \frac{\pi}{2}$
(b) $\lim_{r \to \infty} \int_\delta^\infty \frac{\sin(ru)}{u}\, du = 0$ whenever δ > 0.
(c) If f ∈ L1 (R), then $\lim_{r \to \infty} \int_{\mathbb{R}} \sin(ru)\, f(u)\, du = 0$.
Hint: For the first two, use $\frac{1}{u} = \int_0^\infty e^{-ut}\, dt$ and apply Fubini's theorem to
$\int_0^R \sin u \int_0^\infty e^{-ut}\, dt\, du$. For the last part, first establish it for $f \in C_c^\infty(\mathbb{R})$ and
then use the density of this set in L1 (R) to obtain the result. This is sometimes
called the Riemann Lebesgue lemma.
19.4. EXERCISES 537
6. ↑Suppose that g ∈ L1 (R) and that at some x > 0, g is locally Holder continuous
from the right and from the left. This means
\[ \lim_{r \to 0+} g(x+r) \equiv g(x+) \]
exists,
\[ \lim_{r \to 0+} g(x-r) \equiv g(x-) \]
exists and there exist constants K, δ > 0 and r ∈ (0, 1] such that for |x − y| < δ,
\[ |g(x+) - g(y)| < K |x-y|^r \]
for y > x and
\[ |g(x-) - g(y)| < K |x-y|^r \]
for y < x. Show that under these conditions,
\[ \lim_{r \to \infty} \frac{2}{\pi} \int_0^\infty \frac{\sin(ur)}{u} \left( \frac{g(x-u) + g(x+u)}{2} \right) du = \frac{g(x+) + g(x-)}{2}. \]
7. ↑ Let g ∈ L1 (R) and suppose g is locally Holder continuous from the right
and from the left at x. Show that then
\[ \lim_{R \to \infty} \frac{1}{2\pi} \int_{-R}^{R} e^{ixt} \int_{-\infty}^{\infty} e^{-ity} g(y)\, dy\, dt = \frac{g(x+) + g(x-)}{2}. \]
Assume that g has exponential growth as above and is Holder continuous from
the right and from the left at t. Pick γ > η. Show that
\[ \lim_{R \to \infty} \frac{1}{2\pi} \int_{-R}^{R} e^{\gamma t} e^{iyt} Lg(\gamma + iy)\, dy = \frac{g(t+) + g(t-)}{2}. \]
and is called the complex inversion integral for Laplace transforms. It can be
used to find inverse Laplace transforms. Hint:
\[ \frac{1}{2\pi} \int_{-R}^{R} e^{\gamma t} e^{iyt} Lg(\gamma + iy)\, dy = \frac{1}{2\pi} \int_{-R}^{R} e^{\gamma t} e^{iyt} \int_0^\infty e^{-(\gamma + iy)u} g(u)\, du\, dy. \]
Now use Fubini's theorem and do the integral from −R to R to get this equal
to
\[ \frac{e^{\gamma t}}{\pi} \int_{-\infty}^{\infty} e^{-\gamma u}\, \overline{g}(u)\, \frac{\sin(R(t-u))}{t-u}\, du \]
where $\overline{g}$ is the zero extension of g off [0, ∞). Then this equals
\[ \frac{e^{\gamma t}}{\pi} \int_{-\infty}^{\infty} e^{-\gamma(t-u)}\, \overline{g}(t-u)\, \frac{\sin(Ru)}{u}\, du \]
which equals
\[ \frac{2 e^{\gamma t}}{\pi} \int_0^\infty \frac{\overline{g}(t-u)\, e^{-\gamma(t-u)} + \overline{g}(t+u)\, e^{-\gamma(t+u)}}{2}\, \frac{\sin(Ru)}{u}\, du \]
and then apply the result of Problem 6.
9. Suppose f ∈ S. Show F (fxj )(t) = itj F f (t).
10. Let f ∈ S and let k be a positive integer.
\[ \|f\|_{k,2} \equiv \left( \|f\|_2^2 + \sum_{|\alpha| \le k} \|D^\alpha f\|_2^2 \right)^{1/2}. \]
Show both || ||k,2 and ||| |||k,2 are norms on S and that they are equivalent.
These are Sobolev space norms. For which values of k does the second norm
make sense? How about the first norm?
11. ↑ Define H k (Rn ), k ≥ 0 by f ∈ L2 (Rn ) such that
\[ \left( \int |Ff(x)|^2 (1 + |x|^2)^k dx \right)^{1/2} < \infty, \]
\[ |||f|||_{k,2} \equiv \left( \int |Ff(x)|^2 (1 + |x|^2)^k dx \right)^{1/2}. \]
Then show µ is a Radon measure and show there exists {gm } such that gm ∈ G
and gm → F f in L2 (µ). Thus gm = F fm , fm ∈ G because F maps G onto G.
Then by Problem 10, {fm } is Cauchy in the norm || ||k,2 .
So
\[ \int |Ff(x)|\, dx = \int |Ff(x)|\, (1 + |x|^2)^{k/2} (1 + |x|^2)^{-k/2}\, dx. \]
13. Let u ∈ G. Then F u ∈ G and so, in particular, it makes sense to form the
integral,
\[ \int_{\mathbb{R}} Fu(x', x_n)\, dx_n \]
constant such that F (γu) (x') equals this constant times the above integral.
Hint: By the dominated convergence theorem
\[ \int_{\mathbb{R}} Fu(x', x_n)\, dx_n = \lim_{\varepsilon \to 0} \int_{\mathbb{R}} e^{-(\varepsilon x_n)^2} Fu(x', x_n)\, dx_n. \]
Now use the definition of the Fourier transform and Fubini’s theorem as re-
quired in order to obtain the desired relationship.
14. Recall the Fourier series of a function in L2 (−π, π) converges to the func-
tion in L2 (−π, π). Prove a similar theorem with L2 (−π, π) replaced by
L2 (−mπ, mπ) and the functions
\[ \left\{ (2\pi)^{-(1/2)} e^{inx} \right\}_{n \in \mathbb{Z}} \]
Now suppose f is a function in L2 (R) satisfying F f (t) = 0 if |t| > mπ. Show
that if this is so, then
\[ f(x) = \frac{1}{\pi} \sum_{n \in \mathbb{Z}} f\left( \frac{-n}{m} \right) \frac{\sin(\pi(mx + n))}{mx + n}. \]
The purpose of this chapter is to present some of the most important theorems
on Fourier analysis in Rn . These theorems are the Marcinkiewicz interpolation
theorem, the Calderon Zygmund decomposition, and Mihlin’s theorem. They are
all fundamental results whose proofs depend on the methods of real analysis.
Definition 20.1 Lp (Ω) + L1 (Ω) will denote the space of measurable functions, f ,
such that f is the sum of a function in Lp (Ω) and L1 (Ω). Also, if T : Lp (Ω) +
L1 (Ω) → space of measurable functions, T is subadditive if
542 FOURIER ANALYSIS IN RN AN INTRODUCTION
Therefore, f1 ∈ Lr (Ω).
\[ \int |f_2(x)|\, d\mu = \int_{[|f| > \lambda]} |f(x)|\, d\mu \le \mu\left( [|f| > \lambda] \right)^{1/p'} \left( \int |f|^p d\mu \right)^{1/p} < \infty. \]
where ai > 0 and the ai are all distinct nonzero values of f , the sets, Ei being
disjoint. Thus,
\[ \int_\Omega (\phi \circ f)\, d\mu = \sum_{i=1}^{m} \phi(a_i)\, \mu(E_i). \]
is constant on the intervals [0, a1 ), [a1 , a2 ), · · ·. For example, on [ai , ai+1 ), this func-
tion has the value
\[ \sum_{j=i+1}^{m} \mu(E_j). \]
\[ = \sum_{j=1}^{m} \sum_{i=1}^{j} \mu(E_j) \left( \phi(a_i) - \phi(a_{i-1}) \right) = \sum_{j=1}^{m} \mu(E_j)\, \phi(a_j) = \int_\Omega (\phi \circ f)\, d\mu \]
and so this establishes 20.1 in the case when f is a nonnegative simple function.
Since every measurable nonnegative function may be written as the pointwise limit
of such simple functions, the desired result will follow by the Monotone convergence
theorem and the next claim.
Claim: If fn ↑ f , then for each α > 0,
Proof of the claim: [fn > α] ↑ [f > α] because if f (x) > α then for large
enough n, fn (x) > α and so
This proves the lemma. (Note the importance of the strict inequality in [f > α] in
proving the claim.)
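For a nonnegative simple function the argument above is a finite rearrangement of sums, and it can be replayed mechanically. The sketch below assumes that 20.1 is the distribution-function identity $\int_\Omega (\phi \circ f)\, d\mu = \int_0^\infty \phi'(\alpha)\, \mu([f > \alpha])\, d\alpha$ with φ increasing and φ (0) = 0 (an assumption, since 20.1 itself is not restated here). It evaluates the right side exactly by using the fact that µ ([f > α]) is constant on each interval between consecutive values of f. The function name and the sample data are illustrative.

```python
def layer_cake_sides(values, measures, phi):
    """values: distinct positive a_1 < ... < a_m; measures: mu(E_1), ..., mu(E_m).
    phi must be increasing with phi(0) = 0 (for example phi(t) = t**p)."""
    direct = sum(phi(a) * mu for a, mu in zip(values, measures))
    # On [a_{i-1}, a_i) the distribution function mu([f > alpha]) equals
    # sum_{j >= i} mu(E_j), so the integral splits into exact pieces.
    layered = 0.0
    previous = 0.0
    for i, a in enumerate(values):
        tail = sum(measures[i:])
        layered += (phi(a) - phi(previous)) * tail
        previous = a
    return direct, layered
```

For instance, with values 0.5, 1.0, 3.0, measures 2.0, 0.25, 1.5 and φ (t) = t³, both sides come out equal, which is the Abel summation step written out in the proof.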
The next theorem is the main result in this section. It is called the Marcinkiewicz
interpolation theorem.
Theorem 20.4 Let (Ω, µ, S) be a σ finite measure space, 1 < r < ∞, and let
be subadditive, weak (r, r), and weak (1, 1). Then T is of type (p, p) for every
p ∈ (1, r) and
||T f ||p ≤ Ap ||f ||p
where the constant Ap depends only on p and the constants in the definition of weak
(1, 1) and weak (r, r).
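\[ + p \int_0^\infty \alpha^{p-1} \mu\left( [|Tf_2| > \alpha/2] \right) d\alpha. \]
\[ p\, (2A_r)^r \int_\Omega \int_0^\infty \alpha^{p-1-r} |f_1|^r d\alpha\, d\mu + 2A_1 p \int_\Omega \int_0^\infty \alpha^{p-2} |f_2|\, d\alpha\, d\mu. \]
Now f1 (x) = 0 unless |f1 (x)| ≤ α and f2 (x) = 0 unless |f2 (x)| > α so this equals
\[ p\, (2A_r)^r \int_\Omega |f(x)|^r \int_{|f(x)|}^{\infty} \alpha^{p-1-r} d\alpha\, d\mu + 2A_1 p \int_\Omega |f(x)| \int_0^{|f(x)|} \alpha^{p-2} d\alpha\, d\mu \]
which equals
\[ \frac{2^r A_r^r\, p}{r-p} \int_\Omega |f(x)|^p d\mu + \frac{2pA_1}{p-1} \int_\Omega |f(x)|^p d\mu \le 2 \max\left( \frac{2^r A_r^r\, p}{r-p}, \frac{2pA_1}{p-1} \right) \|f\|_{L^p(\Omega)}^p \]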
and this proves the theorem.
Theorem 20.5 Let f ≥ 0, $\int f\, dx < \infty$, and let α be a positive constant. Then
there exist sets F and Ω such that
Rn = F ∪ Ω, F ∩ Ω = ∅ (20.3)
\[ \Omega \equiv \mathbb{R}^n \setminus F = \cup_{m=1}^{\infty} \cup \{ Q : Q \in T_m \} \]
Note that the cubes from Tm have pair wise disjoint interiors and also the interiors
of cubes from Tm have empty intersections with the interiors of cubes of Tk if k ≠ m.
Let x be a point of Ω and let x be in a cube of Tm such that m is the first index
for which this happens. Let Q be the cube in Sm−1 containing x and let Q∗ be the
cube in the bisection of Q which contains x. Therefore 20.6 does not hold for Q∗ .
Thus
\[ \alpha < \frac{1}{m(Q^*)} \int_{Q^*} f\, dx \le \frac{m(Q)}{m(Q^*)} \overbrace{\frac{1}{m(Q)} \int_Q f\, dx}^{\le \alpha} \le 2^n \alpha \]
which shows Ω is the union of cubes having disjoint interiors for which 20.5 holds.
Now a.e. point of F is a Lebesgue point of f . Let x be such a point of F and
suppose x ∈ Qk for Qk ∈ Sk . Let dk ≡ diameter of Qk . Thus dk → 0.
\[ \frac{1}{m(Q_k)} \int_{Q_k} |f(y) - f(x)|\, dy \le \frac{1}{m(Q_k)} \int_{B(x,d_k)} |f(y) - f(x)|\, dy \]
\[ = \frac{m(B(x,d_k))}{m(Q_k)} \frac{1}{m(B(x,d_k))} \int_{B(x,d_k)} |f(x) - f(y)|\, dy \]
\[ \le K_n \frac{1}{m(B(x,d_k))} \int_{B(x,d_k)} |f(x) - f(y)|\, dy \]
where Kn is a constant which depends on n and measures the ratio of the volume of
a ball with diameter 2d and a cube with diameter d. The last expression converges
to 0 because x is a Lebesgue point. Hence
\[ f(x) = \lim_{k \to \infty} \frac{1}{m(Q_k)} \int_{Q_k} f(y)\, dy \le \alpha \]
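The stopping rule behind this decomposition is easy to run by hand in one dimension. The sketch below is an illustration under stated assumptions, not the construction in the proof: it works on [0, 1) instead of Rn , takes f constant on 2^J equal cells (so `samples` lists its values), and assumes the average of f over the whole interval is at most α. An interval is bisected until one of the halves has average greater than α, at which point that half is kept. Each kept interval then has average in (α, 2α], the one-dimensional case of the $2^n \alpha$ bound, and f ≤ α on every cell outside the kept intervals.

```python
def calderon_zygmund_intervals(samples, alpha):
    """samples: values of f >= 0 on 2**J equal cells of [0, 1), with mean(samples) <= alpha.
    Returns maximal dyadic index ranges (lo, hi) on which the average of f exceeds alpha."""
    selected = []

    def bisect(lo, hi):
        # Invariant: the average of f over cells lo..hi-1 is <= alpha.
        if hi - lo == 1:
            return
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid, hi)):
            average = sum(samples[a:b]) / (b - a)
            if average > alpha:
                selected.append((a, b))      # stop: alpha < average <= 2 * alpha
            else:
                bisect(a, b)

    bisect(0, len(samples))
    return selected
```

The upper bound 2α holds because a kept half has half the length of its parent, whose average was at most α.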
Lemma 20.6 Suppose ρ ∈ L∞ (Rn ) ∩ L2 (Rn ) and suppose also there exists a con-
stant C1 such that
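\[ \int_{|x| \ge 2|y|} \left| F^{-1}\rho(x-y) - F^{-1}\rho(x) \right| dx \le C_1. \tag{20.8} \]
Then there exists a constant A depending only on C1 , ||ρ||∞ , and n such that
\[ m\left( \left[ x : \left| F^{-1}\rho * \phi(x) \right| > \alpha \right] \right) \le \frac{A}{\alpha} \|\phi\|_1 \]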
for all φ ∈ G.
20.3. MIHLIN’S THEOREM 547
Thus
\[ \int_{Q_i} b(x)\, dx = \int_{Q_i} (\phi(x) - g(x))\, dx = \int_{Q_i} \phi(x)\, dx - \int_{Q_i} \phi(x)\, dx = 0, \tag{20.11} \]
\[ b(x) = 0 \text{ if } x \notin \Omega. \tag{20.12} \]
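Claim:
\[ \|g\|_2^2 \le \alpha\, (1 + 4^n)\, \|\phi\|_1, \quad \|g\|_1 \le \|\phi\|_1. \tag{20.13} \]
Proof of claim:
\[ \|g\|_2^2 = \|g\|_{L^2(E)}^2 + \|g\|_{L^2(\Omega)}^2. \]
Thus
\[ \|g\|_{L^2(\Omega)}^2 = \sum_i \int_{Q_i} |g(x)|^2 dx \le \sum_i \int_{Q_i} \left( \frac{1}{m(Q_i)} \int_{Q_i} |\phi(y)|\, dy \right)^2 dx \]
\[ \le \sum_i \int_{Q_i} (2^n \alpha)^2 dx \le 4^n \alpha^2 \sum_i m(Q_i) \]
\[ \le 4^n \alpha^2 \frac{1}{\alpha} \sum_i \int_{Q_i} |\phi(x)|\, dx \le 4^n \alpha\, \|\phi\|_1. \]
\[ \|g\|_{L^2(E)}^2 = \int_E |\phi(x)|^2 dx \le \alpha \int_E |\phi(x)|\, dx = \alpha\, \|\phi\|_1. \]
Now consider the second of the inequalities in 20.13.
\[ \|g\|_1 = \int_E |g(x)|\, dx + \int_\Omega |g(x)|\, dx = \int_E |\phi(x)|\, dx + \sum_i \int_{Q_i} |g|\, dx \]
\[ \le \int_E |\phi(x)|\, dx + \sum_i \int_{Q_i} \frac{1}{m(Q_i)} \int_{Q_i} |\phi(x)|\, dm(x)\, dm \]
\[ = \int_E |\phi(x)|\, dx + \sum_i \int_{Q_i} |\phi(x)|\, dm(x) = \|\phi\|_1 \]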
This proves the claim. From the claim, it follows that b ∈ L2 (Rn ) ∩ L1 (Rn ) .
Because of 20.13, g ∈ L1 (Rn ) and so F −1 ρ ∗ g ∈ L2 (Rn ). (Since ρ ∈ L2 ,
it follows F −1 ρ ∈ L2 and so this convolution is indeed in L2 .) By Plancherel’s
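theorem,
\[ \left\| F^{-1}\rho * g \right\|_2 = \left\| F\left( F^{-1}\rho * g \right) \right\|_2. \]
By Corollary 19.27 on Page 531, the expression on the right equals
\[ (2\pi)^{n/2} \|\rho\, Fg\|_2 \]
and so
\[ \left\| F^{-1}\rho * g \right\|_2 = (2\pi)^{n/2} \|\rho\, Fg\|_2 \le C_n \|\rho\|_\infty \|g\|_2. \]
From this and 20.13
\[ m\left( \left[ \left| F^{-1}\rho * g \right| \ge \alpha/2 \right] \right) \le \frac{C_n \|\rho\|_\infty^2}{\alpha^2}\, \alpha\, (1 + 4^n)\, \|\phi\|_1 = C_n \alpha^{-1} \|\phi\|_1. \tag{20.14} \]
This is what is wanted so far as g is concerned. Next it is required to estimate
$m\left( \left[ \left| F^{-1}\rho * b \right| \ge \alpha/2 \right] \right)$.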
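If Q is one of the cubes whose union is Ω, let $Q^*$ be the cube with the same
center as Q but whose sides are $2\sqrt{n}$ times as long.
[Figure: a cube $Q_i$ with center $y_i$ inside the enlarged concentric cube $Q_i^*$.]
Let
\[ \Omega^* \equiv \cup_{i=1}^{\infty} Q_i^* \]
and let
\[ E^* \equiv \mathbb{R}^n \setminus \Omega^*. \]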
Thus E ∗ ⊆ E. Let x ∈ E ∗ . Then because of 20.11,
\[ \int_{Q_i} F^{-1}\rho(x-y)\, b(y)\, dy = \int_{Q_i} \left[ F^{-1}\rho(x-y) - F^{-1}\rho(x-y_i) \right] b(y)\, dy, \tag{20.15} \]
where $y_i$ is the center of $Q_i$. Consequently if the sides of $Q_i$ have length $2t/\sqrt{n}$,
20.15 implies
\[ \int_{E^*} \left| \int_{Q_i} F^{-1}\rho(x-y)\, b(y)\, dy \right| dx \le \tag{20.16} \]
\[ \int_{E^*} \int_{Q_i} \left| F^{-1}\rho(x-y) - F^{-1}\rho(x-y_i) \right| |b(y)|\, dy\, dx \]
\[ = \int_{Q_i} \int_{E^*} \left| F^{-1}\rho(x-y) - F^{-1}\rho(x-y_i) \right| dx\, |b(y)|\, dy \tag{20.17} \]
\[ \le \int_{Q_i} \int_{|x-y_i| \ge 2t} \left| F^{-1}\rho(x-y) - F^{-1}\rho(x-y_i) \right| dx\, |b(y)|\, dy \tag{20.18} \]
\[ \le \sum_{i=1}^{\infty} C_1 \int_{Q_i} |b(y)|\, dy = C_1 \|b\|_1. \]
Thus, by 20.13,
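\[ \int_{E^*} \left| F^{-1}\rho * b(x) \right| dx \le C_1 \|b\|_1 \le C_1 \left[ \|\phi\|_1 + \|g\|_1 \right] \le C_1 \left[ \|\phi\|_1 + \|\phi\|_1 \right] \le 2C_1 \|\phi\|_1. \]
Consequently,
\[ m\left( \left[ \left| F^{-1}\rho * b \right| \ge \frac{\alpha}{2} \right] \cap E^* \right) \le \frac{4C_1}{\alpha} \|\phi\|_1. \]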
Lemma 20.7 Suppose ρ ∈ L∞ (Rn ) ∩ L2 (Rn ) and suppose also that there exists a
constant C1 such that
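\[ \int_{|x| > 2|y|} \left| F^{-1}\rho(x-y) - F^{-1}\rho(x) \right| dx \le C_1. \tag{20.20} \]
\[ m\left( \left[ \left| F^{-1}\rho * f \right| > \alpha \right] \right) \le \frac{A}{\alpha} \|f\|_1 \text{ if } f \in L^1(\mathbb{R}^n), \tag{20.21} \]
\[ m\left( \left[ \left| F^{-1}\rho * f \right| > \alpha \right] \right) \le \left( \frac{A \|f\|_2}{\alpha} \right)^2 \text{ if } f \in L^2(\mathbb{R}^n). \tag{20.22} \]
Thus, F −1 ρ∗ is weak type (1, 1) and weak type (2, 2). Also
\[ \left\| F^{-1}\rho * f \right\|_2 \le A \|f\|_2 \text{ if } f \in L^2(\mathbb{R}^n). \tag{20.23} \]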
also,
\[ \left| F^{-1}\rho * g(x) - F^{-1}\rho * g(x') \right| \le \int \left| F^{-1}\rho(x-y) - F^{-1}\rho(x'-y) \right| |g(y)|\, dy \le \left\| F^{-1}\rho - F^{-1}\rho_{x'-x} \right\|_2 \|g\|_2 \]
and by continuity of translation in $L^2(\mathbb{R}^n)$, this shows $x \to F^{-1}\rho * g(x)$ is continuous.
ous. Therefore, F −1 ρ∗ maps L1 (Rn )+L2 (Rn ) to the space of measurable functions.
(Continuous functions are measurable.) It is clear that F −1 ρ∗ is subadditive.
If φ ∈ G, Plancherel’s theorem implies as before,
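\[ \left\| F^{-1}\rho * \phi \right\|_2 = \left\| F\left( F^{-1}\rho * \phi \right) \right\|_2 = (2\pi)^{n/2} \|\rho\, F\phi\|_2 \le (2\pi)^{n/2} \|\rho\|_\infty \|\phi\|_2. \tag{20.24} \]
Now let $f \in L^2(\mathbb{R}^n)$ and let $\phi_k \in G$, with
\[ \|\phi_k - f\|_2 \to 0. \]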
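\[ \le \liminf_{k \to \infty} \left( \int \left| \int F^{-1}\rho(x-y)\, \phi_k(y)\, dy \right|^2 dx \right)^{1/2} = \liminf_{k \to \infty} \left\| F^{-1}\rho * \phi_k \right\|_2 \]
\[ \le \|\rho\|_\infty (2\pi)^{n/2} \liminf_{k \to \infty} \|\phi_k\|_2 = \|\rho\|_\infty (2\pi)^{n/2} \|f\|_2. \]
Thus, 20.23 holds with $A = \|\rho\|_\infty (2\pi)^{n/2}$. Consequently,
\[ A \|f\|_2 \ge \left( \int_{[|F^{-1}\rho * f| > \alpha]} \left| F^{-1}\rho * f(x) \right|^2 dx \right)^{1/2} \ge \alpha\, m\left( \left[ \left| F^{-1}\rho * f \right| > \alpha \right] \right)^{1/2} \]
and so 20.22 follows.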
It remains to prove 20.21 which holds for all f ∈ G by Lemma 20.6. Let
f ∈ L1 (Rn ) and let φk → f in L1 (Rn ) , φk ∈ G. Without loss of generality, assume
that both f and F −1 ρ are Borel measurable. Therefore, by Minkowski’s inequality,
and Plancherel’s theorem,
\[ \left\| F^{-1}\rho * \phi_k - F^{-1}\rho * f \right\|_2 \le \left( \int \left| \int F^{-1}\rho(x-y) \left( \phi_k(y) - f(y) \right) dy \right|^2 dx \right)^{1/2} \]
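\[ \mathcal{X}_{[|F^{-1}\rho * f| > \alpha]}(x) \le \liminf_{k \to \infty} \mathcal{X}_{[|F^{-1}\rho * \phi_k| > \alpha]}(x) \text{ a.e. } x. \]
Thus by Lemma 20.6 and Fatou's lemma, there exists a constant, A, depending on
C1 , n, and ||ρ||∞ such that
\[ m\left( \left[ \left| F^{-1}\rho * f \right| > \alpha \right] \right) \le \liminf_{k \to \infty} m\left( \left[ \left| F^{-1}\rho * \phi_k \right| > \alpha \right] \right) \le \liminf_{k \to \infty} A \frac{\|\phi_k\|_1}{\alpha} = A \frac{\|f\|_1}{\alpha}. \]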
This shows 20.21 and proves the lemma.
Then for each p ∈ (1, ∞), there exists a constant, Ap , depending only on
p, n, ||ρ||∞ ,
Proof: From Lemma 20.7, F −1 ρ∗ is weak (1, 1), weak (2, 2), and maps
L1 (Rn ) + L2 (Rn )
Thus the theorem is proved for these values of p. Now suppose p > 2. Then p' < 2
where
\[ \frac{1}{p} + \frac{1}{p'} = 1. \]
\[ x^\alpha \equiv x_1^{\alpha_1} \cdots x_n^{\alpha_n} \]
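Lemma 20.9 Let 20.25 hold and suppose $\psi \in C_c^\infty(\mathbb{R}^n \setminus \{0\})$. Then for each
α, |α| ≤ L, there exists a constant C ≡ C (α, n, ψ) independent of k such that
\[ \sup_{x \in \mathbb{R}^n} |x|^{|\alpha|} \left| D^\alpha\left( \rho(x)\, \psi(2^k x) \right) \right| \le C C_0. \]
Proof:
\[ |x|^{|\alpha|} \left| D^\alpha\left( \rho(x)\, \psi(2^k x) \right) \right| \le |x|^{|\alpha|} \sum_{\beta + \gamma = \alpha} \left| D^\beta \rho(x) \right| 2^{k|\gamma|} \left| D^\gamma \psi(2^k x) \right| \]
\[ = \sum_{\beta + \gamma = \alpha} |x|^{|\beta|} \left| D^\beta \rho(x) \right| \left| 2^k x \right|^{|\gamma|} \left| D^\gamma \psi(2^k x) \right| \]
\[ \le C_0\, C(\alpha, n) \sum_{|\gamma| \le |\alpha|} \sup\left\{ |z|^{|\gamma|} |D^\gamma \psi(z)| : z \in \mathbb{R}^n \right\} = C_0\, C(\alpha, n, \psi) \]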
and
\[ \sum_{k=-\infty}^{\infty} \phi(2^k x) = 1 \]
for each x ≠ 0.
Proof: Let
\[ \psi \ge 0, \quad \psi = 1 \text{ on } \left[ 2^{-1} \le |x| \le 2 \right], \quad \mathrm{spt}(\psi) \subseteq \left[ 4^{-1} < |x| < 4 \right]. \]
Consider
\[ g(x) = \sum_{k=-\infty}^{\infty} \psi(2^k x). \]
Then for each x, only finitely many terms are not equal to 0. Also, g (x) > 0 for
all x ≠ 0. To verify this last claim, note that for some integer l, $|x| \in [2^l, 2^{l+2}]$.
Therefore, choose k an integer such that $2^k |x| \in [2^{-1}, 2]$. For example, let k =
−l − 1. This works because $2^k |x| \in [2^l 2^k, 2^{l+2} 2^k] = [2^{l-l-1}, 2^{l+2-l-1}] = [2^{-1}, 2]$.
Therefore, for this value of k, $\psi(2^k x) = 1$ so g (x) > 0.
Now notice that
\[ g(2^r x) = \sum_{k=-\infty}^{\infty} \psi(2^k 2^r x) = \sum_{k=-\infty}^{\infty} \psi(2^k x) = g(x). \]
Let $\phi(x) \equiv \psi(x)\, g(x)^{-1}$. Then
\[ \sum_{k=-\infty}^{\infty} \phi(2^k x) = \sum_{k=-\infty}^{\infty} \frac{\psi(2^k x)}{g(2^k x)} = g(x)^{-1} \sum_{k=-\infty}^{\infty} \psi(2^k x) = 1 \]
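The construction can be tried out numerically for radial functions, writing r for |x|. The sketch below builds one possible ψ from a standard smooth step (this particular ψ, the truncation of the sums to |k| ≤ 60, and all function names are assumptions made for illustration) and then forms g and φ exactly as in the proof.

```python
import math

def smooth_step(u):
    """C-infinity step: 0 for u <= 0, 1 for u >= 1, strictly between otherwise."""
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    a = math.exp(-1.0 / u)
    b = math.exp(-1.0 / (1.0 - u))
    return a / (a + b)

def psi(r):
    """Radial bump: equals 1 for 1/2 <= r <= 2, supported in 1/4 < r < 4."""
    return smooth_step((r - 0.25) / 0.25) * smooth_step((4.0 - r) / 2.0)

def g(r, k_max=60):
    return sum(psi(2.0 ** k * r) for k in range(-k_max, k_max + 1))

def phi(r):
    value = psi(r)
    return value / g(r) if value > 0.0 else 0.0

def dyadic_sum(r, k_max=60):
    return sum(phi(2.0 ** k * r) for k in range(-k_max, k_max + 1))
```

For r well inside the range covered by the truncated sums, `dyadic_sum(r)` should return 1 up to rounding, reflecting the invariance $g(2^k x) = g(x)$ used in the last display.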
\[ \le 2 \left( \int_{|x| \ge t} |x|^{-2L} dx \right)^{1/2} \left( \int_{|x| \ge t} |x|^{2L} \left| F^{-1}\gamma_k(x) \right|^2 dx \right)^{1/2} \]
where
\[ S_k \equiv \left[ x : 2^{-2-k} < |x| < 2^{2-k} \right], \tag{20.28} \]
a set containing the support of $\gamma_k$. Now from the definition of $\gamma_k$,
\[ \left| D_j^L \gamma_k(z) \right| = \left| D_j^L\left( \rho(z)\, \phi(2^k z) \right) \right|. \]
It follows, using polar coordinates, that the last expression in 20.27 is no larger than
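\[ C(n, L, \phi, C_0)\, t^{n/2-L} \left( \int_{S_k} |z|^{-2L} dz \right)^{1/2} \le C(n, L, \phi, C_0)\, t^{n/2-L} \cdot \tag{20.30} \]
\[ \left( \int_{2^{-2-k}}^{2^{2-k}} \rho^{n-1-2L} d\rho \right)^{1/2} \le C(n, L, \phi, C_0)\, t^{n/2-L}\, 2^{k(L-n/2)}. \]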
2−2−k
¯ ¯
Z ¯Z 1 X
n ¯
¯ ¯
¯ D F −1
γ (x−sy) y ds ¯ dx
¯ j k j ¯
|x|≥2t ¯ 0 j=1 ¯
Z Z n
1X ¯ ¯
≤ t ¯Dj F −1 γ k (x−sy)¯ dsdx
|x|≥2t 0 j=1
Z X
n
¯ ¯
≤ t ¯Dj F −1 γ k (x)¯ dx
j=1
n µZ ³ ¶1/2
X ¯ ¯2 ´−L
≤ t 1 + ¯2−k x¯ dx
j=1
µZ ³ ¶1/2
¯ −k ¯2 ´L ¯ ¯2
· ¯
1+ 2 x ¯ ¯ −1 ¯
Dj F γ k (x) dx
n µZ ³ ¶1/2
X ¯ ¯2 ´ L ¯ ¯
≤ C (n, L) t2kn/2 1 + ¯2−k x¯ ¯Dj F −1 γ k (x)¯2 dx . (20.31)
j=1
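\[ = C(n, L) \sum_{|\alpha| \le L} 2^{-2k|\alpha|} \int_{S_k} \left| D^\alpha(z_j \gamma_k(z)) \right|^2 dz \tag{20.33} \]
\[ 2^{-k}\, C(\alpha, n, \phi, j, C_0)\, |z|^{-|\alpha|}. \]
\[ C(L, n, \phi, C_0) \min\left( t 2^{-k}, \left( 2^{-k} t \right)^{n/2-L} \right). \]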
With this inequality, the next lemma which is the desired result can be obtained.
Lemma 20.11 There exists a constant depending only on the indicated objects,
C1 = C (L, n, φ, C0 ) such that when |y| ≤ t,
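\[ \int_{|x| \ge 2t} \left| F^{-1}\rho(x-y) - F^{-1}\rho(x) \right| dx \le C_1 \]
\[ \int_{|x| \ge 2t} \left| F^{-1}\rho_m(x-y) - F^{-1}\rho_m(x) \right| dx \le C_1. \tag{20.35} \]
\[ \le \liminf_{l \to \infty} \sum_{k=-m_l}^{m_l} \int_{|x| \ge 2t} \left| F^{-1}\gamma_k(x-y) - F^{-1}\gamma_k(x) \right| dx \]
\[ \le \sum_{k=-\infty}^{\infty} C(L, n, \phi, C_0) \min\left( t 2^{-k}, \left( 2^{-k} t \right)^{n/2-L} \right). \tag{20.36} \]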
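The point of 20.36 is that the series on the right is bounded independently of t, since L > n/2 makes the second entry of the minimum decay as $2^{-k} t$ grows. The sketch below illustrates this with a truncated sum; the parameter `decay` stands for L − n/2, and the truncation level and the explicit bound $2 + 1/(1 - 2^{-\mathrm{decay}})$ (obtained by comparing with two geometric series) are worked out here for illustration and are not taken from the text.

```python
def dyadic_min_sum(t, decay, k_max=200):
    """Sum over k in [-k_max, k_max] of min(u, u**(-decay)) with u = 2**(-k) * t.
    decay stands for L - n/2 > 0."""
    total = 0.0
    for k in range(-k_max, k_max + 1):
        u = 2.0 ** (-k) * t
        total += min(u, u ** (-decay))
    return total

def uniform_bound(decay):
    # Terms with u < 1 sum to at most 2; the rest sum to at most 1/(1 - 2**(-decay)).
    return 2.0 + 1.0 / (1.0 - 2.0 ** (-decay))
```

Replacing t by 2t only shifts the index k, which is why the sum depends on t in a bounded, scale-periodic way.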
where L is an integer greater than n/2 and ρ ∈ C L (Rn \ {0}). Then for every
p > 1, there exists a constant Ap depending only on p, C0 , φ, n, and L, such that
for all ψ ∈ G,
\[ \left\| F^{-1}\rho * \psi \right\|_p \le A_p \|\psi\|_p. \]
Proof: Since ρm satisfies 20.35, and is obviously in L2 (Rn )∩L∞ (Rn ), Theorem
20.8 implies there exists a constant Ap depending only on p, n, ||ρm ||∞ , and C1 such
that for all ψ ∈ G and p ∈ (1, ∞),
\[ \left\| F^{-1}\rho_m * \psi \right\|_p \le A_p \|\psi\|_p. \]
20.4. SINGULAR INTEGRALS 559
It turns out that some meaning can be assigned to K ∗ f for some functions K
which are not in L1 . This involves assuming a certain form for K and exploiting
cancellation. The resulting theory of singular integrals is very useful. To illustrate,
an application will be given to the Helmholtz decomposition of vector fields in the
next section. Like Mihlin’s theorem, the theory presented here rests on Theorem
20.8, restated here for convenience.
Then for each p ∈ (1, ∞), there exists a constant, Ap , depending only on
p, n, ||ρ||∞ ,
and Z
|K (x − y) − K (x)| dx ≤ B.
|x|>2|y|
Then for all p > 1, there exists a constant, A (p, n, B), depending only on the
indicated quantities such that
for all f ∈ G.
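Define
\[
K_\varepsilon(x)=
\begin{cases}
K(x) & \text{if } |x|\ge\varepsilon,\\
0 & \text{if } |x|<\varepsilon.
\end{cases}
\tag{20.43}
\]
Then there exists a constant $C(n)$ such that
\[
\int_{|x|>2|y|}\left|K_\varepsilon(x-y)-K_\varepsilon(x)\right|dx\le C(n)B
\tag{20.44}
\]
and
\[
\|FK_\varepsilon\|_\infty\le C(n)B.
\tag{20.45}
\]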
Proof: In the argument, C (n) will denote a generic constant depending only on
n. Consider 20.44 first. The integral is broken up according to whether |x| , |x − y| >
ε.
Now consider the terms in the above expression. The last integral in 20.46 equals 0
from the definition of Kε . The third integral on the right is no larger than B by the
definition of Kε and 20.42. Consider the second integral on the right. This integral
is no larger than
\[
\int_{|x|\ge 2|y|,\ |x|\ge\varepsilon,\ |x-y|<\varepsilon}B|x|^{-n}\,dx.
\]
Now |x| ≤ |y| + ε ≤ |x| /2 + ε and so |x| < 2ε. Thus this is no larger than
\[
\int_{\varepsilon\le|x|\le 2\varepsilon}B|x|^{-n}\,dx
=B\int_{S^{n-1}}\int_{\varepsilon}^{2\varepsilon}\rho^{n-1}\frac{1}{\rho^{n}}\,d\rho\,d\sigma
\le BC(n)\ln 2=C(n)B.
\]
It remains to estimate the first integral on the right in 20.46. This integral is
bounded by
\[
\int_{|x|\ge 2|y|,\ |x-y|>\varepsilon,\ |x|<\varepsilon}B|x-y|^{-n}\,dx
\]
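In the integral above, $|x|<\varepsilon$ and so $|x-y|-|y|<\varepsilon$. Therefore, $|x-y|<\varepsilon+|y|<
\varepsilon+|x|/2<\varepsilon+\varepsilon/2=(3/2)\varepsilon$. Hence $\varepsilon\le|x-y|\le(3/2)\varepsilon$. Therefore, the
above integral is no larger than
\[
\int_{\varepsilon\le|z|\le(3/2)\varepsilon}B|z|^{-n}\,dz
=B\int_{S^{n-1}}\int_{\varepsilon}^{(3/2)\varepsilon}\rho^{-1}\,d\rho\,d\sigma
=BC(n)\ln(3/2).
\]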
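The two logarithmic bounds in this part of the proof come from the same polar-coordinates computation. As a sketch, writing $\sigma(S^{n-1})$ for the surface measure of the unit sphere (a symbol assumed here; it plays the role of the text's $C(n)$), for any $0<a<b$:

```latex
\[
\int_{a\le|x|\le b}|x|^{-n}\,dx
=\int_{S^{n-1}}\int_a^b\rho^{-n}\rho^{n-1}\,d\rho\,d\sigma
=\sigma(S^{n-1})\ln\frac{b}{a}.
\]
% a = \varepsilon, b = 2\varepsilon gives \ln 2;
% a = \varepsilon, b = (3/2)\varepsilon gives \ln(3/2).
```

The value depends only on the ratio $b/a$, which is why both bounds are independent of $\varepsilon$.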
= A + B. (20.47)
Consider A. By 20.41
\[
\int_{\varepsilon<|x|<3\pi|y|^{-1}}K_\varepsilon(x)\,dx=0
\]
and so
\[
A=\left|\int_{\varepsilon<|x|<3\pi|y|^{-1}}K_\varepsilon(x)\left(e^{-ix\cdot y}-1\right)dx\right|.
\]
Now
\[
\left|e^{-ix\cdot y}-1\right|=\left|2-2\cos(x\cdot y)\right|^{1/2}\le 2|x\cdot y|\le 2|x||y|
\]
so, using polar coordinates, this expression is no larger than
\[
\int_{\varepsilon<|x|<3\pi|y|^{-1}}2B|x|^{-n}|x||y|\,dx
\le C(n)B|y|\int_{\varepsilon}^{3\pi/|y|}d\rho\le BC(n).
\]
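The pointwise bound on $|e^{-ix\cdot y}-1|$ used in the estimate of A can be checked directly. A short verification, with $\theta$ standing for $x\cdot y$ (a symbol introduced here only for brevity):

```latex
\[
\left|e^{-i\theta}-1\right|^2=(\cos\theta-1)^2+\sin^2\theta=2-2\cos\theta=4\sin^2(\theta/2),
\]
\[
\left|e^{-i\theta}-1\right|=2\left|\sin(\theta/2)\right|\le|\theta|\le 2|\theta|.
\]
```

So the inequality holds with room to spare; the extra factor of 2 is simply absorbed into $C(n)$.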
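Next, consider B. This estimate is based on the trick which follows. Let
\[
z\equiv y\pi/|y|^2
\]
so that
\[
|z|=\pi/|y|,\qquad z\cdot y=\pi.
\]
Then
\[
\int_{3\pi|y|^{-1}<|x|\le R}K_\varepsilon(x)e^{-ix\cdot y}\,dx
=\frac12\int_{3\pi|y|^{-1}<|x|\le R}K_\varepsilon(x)e^{-ix\cdot y}\,dx
-\frac12\int_{3\pi|y|^{-1}<|x|\le R}K_\varepsilon(x)e^{-i(x+z)\cdot y}\,dx.
\tag{20.48}
\]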
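Thus
\[
\begin{aligned}
\int_{3\pi|y|^{-1}<|x|\le R}K_\varepsilon(x)e^{-ix\cdot y}\,dx
&=\frac12\int_{|x|\le R}K_\varepsilon(x)e^{-ix\cdot y}\,dx
-\frac12\int_{|x-z|\le R}K_\varepsilon(x-z)e^{-ix\cdot y}\,dx\\
&\quad+\frac12\int_{|x-z|\le 3\pi|y|^{-1}}K_\varepsilon(x-z)e^{-ix\cdot y}\,dx
-\frac12\int_{|x|\le 3\pi|y|^{-1}}K_\varepsilon(x)e^{-ix\cdot y}\,dx.
\end{aligned}
\tag{20.49}
\]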
Since $|z|=\pi/|y|$, it follows $|z|=\frac{\pi}{|y|}<\frac{3\pi}{|y|}<R$ and so the following picture
describes the situation. In this picture, the radius of each ball equals either $R$ or
$3\pi|y|^{-1}$ and each integral above is taken over one of the two balls in the picture,
either the one centered at 0 or the one centered at z.
[Figure: two balls, one centered at 0 and one centered at z.]
\[
\int_{|x-z|<3\pi|y|^{-1},\ |x|>3\pi|y|^{-1}}K_\varepsilon(x-z)e^{-ix\cdot y}\,dx.
\]
illustrated in the following picture as the region between the small ball centered at
0 and the big ball which surrounds the two small balls
[Figure: two small balls centered at 0 and at z, surrounded by one large ball.]
\[
=B\,\frac{1}{\left(3\pi|y|^{-1}-\pi|y|^{-1}\right)^{n}}\,\alpha(n)\cdot
\left(\left(3\pi|y|^{-1}+\pi|y|^{-1}\right)^{n}-\left(3\pi|y|^{-1}\right)^{n}\right)
\]
\[
=\alpha(n)B\,\frac{|y|^{n}}{(2\pi)^{n}}\,\frac{1}{|y|^{n}}\left((4\pi)^{n}-(3\pi)^{n}\right)=C(n)B.
\]
Returning to 20.49, the terms involving x − y have now been estimated. Thus,
collecting the terms which have not yet been estimated along with those that have,
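\[
\begin{aligned}
B&=\left|\int_{3\pi|y|^{-1}<|x|\le R}K_\varepsilon(x)e^{-ix\cdot y}\,dx\right|\\
&\le\frac12\left|\int_{|x|<R}K_\varepsilon(x)e^{-ix\cdot y}\,dx-\int_{|x|<R}K_\varepsilon(x-z)e^{-ix\cdot y}\,dx\right.\\
&\qquad\left.+\int_{|x|<3\pi|y|^{-1}}K_\varepsilon(x-z)e^{-ix\cdot y}\,dx-\int_{|x|<3\pi|y|^{-1}}K_\varepsilon(x)e^{-ix\cdot y}\,dx\right|\\
&\quad+C(n)B+g(R)
\end{aligned}
\]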
where g (R) → 0 as R → ∞. Using |z| = π/ |y| again,
\[
B\le\frac12\int_{3|z|<|x|<R}\left|K_\varepsilon(x)-K_\varepsilon(x-z)\right|dx+C(n)B+g(R).
\]
But the integral in the above is dominated by C (n) B by 20.44 which was established
earlier. Therefore, from 20.47,
where g (R) → 0.
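Now $K_{\varepsilon R}\to K_\varepsilon$ in $L^2(\mathbb{R}^n)$ because
\[
\|K_{\varepsilon R}-K_\varepsilon\|_{L^2(\mathbb{R}^n)}\le B\int_{|x|>R}\frac{1}{|x|^{2n}}\,dx
=B\int_{S^{n-1}}\int_R^{\infty}\frac{1}{\rho^{n+1}}\,d\rho\,d\sigma,
\]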
Corollary 20.16 Suppose 20.40 - 20.42 hold. Then if $g\in C_c^1(\mathbb{R}^n)$, $K_\varepsilon*g$ converges
uniformly and in $L^p(\mathbb{R}^n)$ as $\varepsilon\to 0$.
566 FOURIER ANALYSIS IN RN AN INTRODUCTION
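Proof:
\[
K_\varepsilon*g(x)\equiv\int K_\varepsilon(y)g(x-y)\,dy.
\]
Let $0<\eta<\varepsilon$. Then since $g\in C_c^1(\mathbb{R}^n)$, there exists a constant $K$ such that
$K|u-v|\ge|g(u)-g(v)|$ for all $u,v\in\mathbb{R}^n$.
\[
\left|K_\varepsilon*g(x)-K_\eta*g(x)\right|\le BK\int_{\eta<|y|<\varepsilon}\frac{1}{|y|^{n}}|y|\,dy
=BK\int_{S^{n-1}}\int_\eta^{\varepsilon}d\rho\,d\sigma=C_n|\varepsilon-\eta|.
\]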
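The step from the definition of $K_\varepsilon * g$ to the bound above relies on cancellation. A sketch of the omitted reasoning, assuming (as the surrounding text suggests) that 20.40 is the size bound $|K(x)|\le B|x|^{-n}$ and 20.41 is the vanishing of the integral of $K$ over annuli; the Lipschitz constant that the text calls $K$ is written $M$ here to avoid a clash with the kernel:

```latex
\[
K_\eta*g(x)-K_\varepsilon*g(x)
=\int_{\eta\le|y|<\varepsilon}K(y)\,g(x-y)\,dy
=\int_{\eta\le|y|<\varepsilon}K(y)\bigl(g(x-y)-g(x)\bigr)\,dy ,
\]
% the second equality subtracts g(x)\int_{\eta\le|y|<\varepsilon}K(y)\,dy = 0
\[
\left|K_\eta*g(x)-K_\varepsilon*g(x)\right|
\le\int_{\eta\le|y|<\varepsilon}\frac{B}{|y|^{n}}\,M|y|\,dy .
\]
```

Without the subtraction of $g(x)$ the integrand would only be bounded by $B|y|^{-n}\sup|g|$, which is not integrable near the origin uniformly in $\eta$.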
Theorem 20.17 Suppose 20.40 - 20.42. Then for Kε given by 20.43 and p > 1,
there exists a constant A (p, n, B) such that for all f ∈ Lp (Rn ),
Thus T is a linear and continuous map defined on Lp (Rn ) for each p > 1.
Proof: From 20.40 it follows $K_\varepsilon\in L^{p'}(\mathbb{R}^n)\cap L^2(\mathbb{R}^n)$ where, as usual, $1/p+
1/p'=1$. By continuity of translation in $L^{p'}(\mathbb{R}^n)$, $x\to K_\varepsilon*f(x)$ is a continuous
function. By Lemma 20.15, $\|FK_\varepsilon\|_\infty\le C(n)B$ for all $\varepsilon$. Therefore, by Lemma
20.14,
\[
\|K_\varepsilon*g\|_p\le A(p,n,B)\|g\|_p
\]
for all g ∈ G. Now let f ∈ Lp (Rn ) and gk → f in Lp (Rn ) where gk ∈ G. Then
\[
\left|K_\varepsilon*f(x)-K_\varepsilon*g_k(x)\right|\le\int\left|K_\varepsilon(x-y)\right|\left|g_k(y)-f(y)\right|dy
\]
\[
\|K_\varepsilon*f\|_p\le\liminf_{k\to\infty}\|K_\varepsilon*g_k\|_p\le\liminf_{k\to\infty}A(p,n,B)\|g_k\|_p
=A(p,n,B)\|f\|_p.
\]
Theorem 20.18 For K given by 20.55 - 20.57, it follows there exists a constant
B such that
\[
|K(x)|\le B|x|^{-n},
\tag{20.58}
\]
\[
\int_{a<|x|<b}K(x)\,dx=0,
\tag{20.59}
\]
\[
\int_{|x|>2|y|}\left|K(x-y)-K(x)\right|dx\le B.
\tag{20.60}
\]
Consequently, the conclusions of Theorem 20.17 hold also.
Proof: 20.58 is obvious. To verify 20.59,
\[
\int_{a<|x|<b}K(x)\,dx=\int_a^b\int_{S^{n-1}}\frac{\Omega(\rho w)}{\rho^{n}}\rho^{n-1}\,d\sigma\,d\rho
=\int_a^b\frac{1}{\rho}\int_{S^{n-1}}\Omega(w)\,d\sigma\,d\rho=0.
\]
where 20.56 was used to write $\Omega\left(\frac{z}{|z|}\right)=\Omega(z)$. The first group of terms in 20.61 is
dominated by
\[
|x-y|^{-n}\,\mathrm{Lip}(\Omega)\left|\frac{x-y}{|x-y|}-\frac{x}{|x|}\right|
\]
and an estimate is required for $|x|>2|y|$. Since $|x|>2|y|$,
\[
|x-y|^{-n}\le\left(|x|-|y|\right)^{-n}\le\frac{2^{n}}{|x|^{n}}.
\]
Also
\[
\begin{aligned}
\left|\frac{x-y}{|x-y|}-\frac{x}{|x|}\right|
&=\left|\frac{(x-y)|x|-x|x-y|}{|x||x-y|}\right|\\
&\le\left|\frac{(x-y)|x|-x|x-y|}{|x|\left(|x|-|y|\right)}\right|
\le\left|\frac{(x-y)|x|-x|x-y|}{|x|\left(|x|/2\right)}\right|\\
&=\frac{2}{|x|^{2}}\Bigl|x|x|-y|x|-x|x-y|\Bigr|
=\frac{2}{|x|^{2}}\Bigl|x\left(|x|-|x-y|\right)-y|x|\Bigr|\\
&\le\frac{2}{|x|^{2}}\Bigl(|x|\bigl||x|-|x-y|\bigr|+|y||x|\Bigr)
\le\frac{2}{|x|^{2}}\Bigl(|x||x-(x-y)|+|y||x|\Bigr)\\
&\le\frac{4}{|x|^{2}}|x||y|=4\frac{|y|}{|x|}.
\end{aligned}
\]
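Therefore,
\[
\begin{aligned}
\int_{|x|>2|y|}|x-y|^{-n}\left|\Omega\left(\frac{x-y}{|x-y|}\right)-\Omega\left(\frac{x}{|x|}\right)\right|dx
&\le 4\left(2^{n}\right)\int_{|x|>2|y|}\frac{1}{|x|^{n}}\frac{|y|}{|x|}\,dx\ \mathrm{Lip}(\Omega)\\
&=C(n,\mathrm{Lip}\,\Omega)\int_{|x|>2|y|}\frac{|y|}{|x|^{n+1}}\,dx\\
&=C(n,\mathrm{Lip}\,\Omega)\int_{|u|>2}\frac{1}{|u|^{n+1}}\,du.
\end{aligned}
\tag{20.62}
\]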
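The last equality in 20.62 is a scaling substitution, and the remaining integral is a finite constant. A sketch of both steps for $y\ne 0$, writing $\sigma(S^{n-1})$ for the surface measure of the unit sphere (notation assumed here):

```latex
% Substitute x = |y| u, so dx = |y|^n du and |x| > 2|y| becomes |u| > 2.
\[
\int_{|x|>2|y|}\frac{|y|}{|x|^{n+1}}\,dx
=\int_{|u|>2}\frac{|y|}{|y|^{n+1}|u|^{n+1}}\,|y|^{n}\,du
=\int_{|u|>2}\frac{1}{|u|^{n+1}}\,du ,
\]
\[
\int_{|u|>2}\frac{1}{|u|^{n+1}}\,du
=\sigma(S^{n-1})\int_2^{\infty}\rho^{-n-1}\rho^{n-1}\,d\rho
=\sigma(S^{n-1})\int_2^{\infty}\rho^{-2}\,d\rho
=\frac{\sigma(S^{n-1})}{2}.
\]
```

All powers of $|y|$ cancel, so the resulting bound does not depend on $y$, which is exactly what 20.60 requires.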
It remains to consider the second group of terms in 20.61 when |x| > 2 |y|.
\[
\begin{aligned}
\left|\frac{1}{|x-y|^{n}}-\frac{1}{|x|^{n}}\right|
&=\left|\frac{|x|^{n}-|x-y|^{n}}{|x-y|^{n}|x|^{n}}\right|
\le\frac{2^{n}}{|x|^{2n}}\bigl||x|^{n}-|x-y|^{n}\bigr|\\
&\le\frac{2^{n}}{|x|^{2n}}|y|\Bigl[|x|^{n-1}+|x|^{n-2}|x-y|+\cdots+|x||x-y|^{n-2}+|x-y|^{n-1}\Bigr]
\end{aligned}
\]
20.5. HELMHOLTZ DECOMPOSITIONS 569
\[
\le\frac{2^{n}|y|\,C(n)\,|x|^{n-1}}{|x|^{2n}}=\frac{C(n)\,2^{n}|y|}{|x|^{n+1}}.
\]
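Thus
\[
\begin{aligned}
\int_{|x|>2|y|}\left|\Omega(x)\left(\frac{1}{|x-y|^{n}}-\frac{1}{|x|^{n}}\right)\right|dx
&\le C(n)\int_{|x|>2|y|}\frac{|y|}{|x|^{n+1}}\,dx\\
&\le C(n)\int_{|u|>2}\frac{1}{|u|^{n+1}}\,du.
\end{aligned}
\tag{20.63}
\]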
From 20.62 and 20.63,
\[
\int_{|x|>2|y|}\left|K(x-y)-K(x)\right|dx\le C(n,\mathrm{Lip}\,\Omega).
\]
Proof: The case n = 2 is left to the reader. 20.66 and 20.67 are obvious from
the above descriptions. It remains to verify 20.68. If $n\ge 3$ and $i\ne j$, then this
formula is also clear from 20.64. Thus consider the case when $n\ge 3$ and $i=j$. By
symmetry,
\[
I\equiv\int_{S^{n-1}}1-ny_i^2\,d\sigma=\int_{S^{n-1}}1-ny_j^2\,d\sigma.
\]
Hence
\[
nI=\sum_{i=1}^{n}\int_{S^{n-1}}1-ny_i^2\,d\sigma=\int_{S^{n-1}}\Bigl(n-n\sum_i y_i^2\Bigr)d\sigma
=\int_{S^{n-1}}(n-n)\,d\sigma=0.
\]
B ⊇ U − U ≡ {x − y : x ∈ U, y ∈ U }
where here and below, e (ε) → 0 as ε → 0. The first term in 20.69 converges to 0
as ε → 0 because
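\[
\left|\int_{\partial B(0,\varepsilon)}\Phi(y)\nabla h(y)\cdot n\,d\sigma\right|\le
\begin{cases}
C_{nh}\dfrac{1}{\varepsilon^{n-2}}\varepsilon^{n-1}=C_{nh}\varepsilon & \text{if } n>2,\\[4pt]
C_h(\ln\varepsilon)\varepsilon & \text{if } n=2.
\end{cases}
\]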
Consequently
\[
\Delta u(x)=-\int_{B\setminus B(0,\varepsilon)}\nabla\cdot\left(\nabla\Phi(y)h(y)\right)dy+e(\varepsilon).
\]
Letting ε → 0,
−∆u (x) = f (x).
This proves the following lemma.
Lemma 20.21 Let U be a bounded open set in $\mathbb{R}^n$ with Lipschitz boundary and let
$B\supseteq U-U$ where $B=B(0,R)$. Let $f\in C_c^\infty(U)$. Then for $x\in U$,
\[
\int_B\Phi(y)f(x-y)\,dy=\int_U\Phi(x-y)f(y)\,dy,
\]
and it follows that if u is given by one of the above formulas, then for all $x\in U$,
\[
-\Delta u=f.
\]
It is given by
\[
u(x)=\int_B\Phi(y)\tilde f(x-y)\,dy=\int_U\Phi(x-y)f(y)\,dy
\tag{20.70}
\]
and so $u_k\to u$ in $L^p(U)$. Also
\[
u_{k,i}(x)=\int_U\Phi_{,i}(x-y)f_k(y)\,dy=\int_B f_k(x-y)\Phi_{,i}(y)\,dy.
\]
Now let
\[
w_i\equiv\int_B\tilde f(x-y)\Phi_{,i}(y)\,dy.
\tag{20.71}
\]
\[
\begin{aligned}
\|u_{k,i}-w_i\|_{L^p(U)}
&\le\left(\int_U\left(\int_B\left|f_k(x-y)-\tilde f(x-y)\right||\Phi_{,i}(y)|\,dy\right)^{p}dx\right)^{1/p}\\
&\le\int_B|\Phi_{,i}(y)|\left(\int_U\left|f_k(x-y)-\tilde f(x-y)\right|^{p}dx\right)^{1/p}dy\\
&\le C(B)\|f_k-f\|_{L^p(U)}
\end{aligned}
\]
and so $u_{k,i}\to w_i$ in $L^p(U)$.
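The middle inequality in the estimate above is an instance of Minkowski's inequality for integrals. As a reminder, here is a sketch of the general statement for $1\le p<\infty$ and a measurable $F$ on $U\times B$, together with the choice of $F$ that appears to be used here (the identification of $C(B)$ is an assumption based on $\Phi_{,i}\in L^1_{loc}$):

```latex
\[
\left(\int_U\left(\int_B|F(x,y)|\,dy\right)^{p}dx\right)^{1/p}
\le\int_B\left(\int_U|F(x,y)|^{p}\,dx\right)^{1/p}dy ,
\]
% here F(x,y) = \bigl(f_k(x-y)-\tilde f(x-y)\bigr)\,\Phi_{,i}(y), and
\[
\left(\int_U\left|f_k(x-y)-\tilde f(x-y)\right|^{p}dx\right)^{1/p}\le\|f_k-f\|_{L^p(U)},
\qquad
C(B)=\int_B|\Phi_{,i}(y)|\,dy<\infty .
\]
```

The inner bound uses translation invariance of Lebesgue measure, with $f_k$ and $f$ extended by zero off $U$.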
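Now let $\phi\in C_c^\infty(U)$. Then
\[
\int_U w_i\phi\,dx=-\lim_{k\to\infty}\int_U u_k\phi_{,i}\,dx=-\int_U u\phi_{,i}\,dx.
\]
Thus $u_{,i}=w_i\in L^p(\mathbb{R}^n)$ and so if $\phi\in C_c^\infty(U)$,
\[
\int_U f\phi\,dx=\lim_{k\to\infty}\int_U f_k\phi\,dx=\lim_{k\to\infty}\int_U\nabla u_k\cdot\nabla\phi\,dx=\int_U\nabla u\cdot\nabla\phi\,dx
\]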
One could also ask whether the second weak partial derivatives of u are in
$L^p(U)$. This is where the theory of singular integrals is used. Recall from 20.70 and
20.71 along with the argument of the above lemma, that if u is given by 20.70, then
$u_{,i}$ is given by 20.71 which equals
\[
\int_U\Phi_{,i}(x-y)f(y)\,dy.
\]
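and
\[
\begin{aligned}
w_{i,j}(x)&=\int_B\Phi_{,i}(y)f_{,j}(x-y)\,dy\\
&=\int_{B\setminus B(0,\varepsilon)}\Phi_{,i}(y)f_{,j}(x-y)\,dy+\int_{B(0,\varepsilon)}\Phi_{,i}(y)f_{,j}(x-y)\,dy.
\end{aligned}
\]
The second term converges to 0 because $f_{,j}$ is bounded and by 20.65, $\Phi_{,i}\in L^1_{loc}$.
Thus
\[
\begin{aligned}
w_{i,j}(x)&=\int_{B\setminus B(0,\varepsilon)}\Phi_{,i}(y)f_{,j}(x-y)\,dy+e(\varepsilon)\\
&=\int_{B\setminus B(0,\varepsilon)}-\left(\Phi_{,i}(y)f(x-y)\right)_{,j}+\Phi_{,ij}(y)f(x-y)\,dy+e(\varepsilon)
\end{aligned}
\]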
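Consider the first term on the right. This term equals, after letting $y=\varepsilon z$,
\[
\begin{aligned}
\varepsilon^{n-1}\int_{\partial B(0,1)}\Phi_{,i}(\varepsilon z)f(x-\varepsilon z)n_j\,d\sigma
&=C_n\varepsilon^{n-1}\int_{\partial B(0,1)}\varepsilon^{1-n}z_iz_jf(x-\varepsilon z)\,d\sigma(z)\\
&=C_n\int_{\partial B(0,1)}z_iz_jf(x-\varepsilon z)\,d\sigma(z)
\end{aligned}
\]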
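if $i=j$. Thus
\[
w_{i,j}(x)=C_n\delta_{ij}f(x)+\int_{B\setminus B(0,\varepsilon)}\Phi_{,ij}(y)f(x-y)\,dy+e(\varepsilon).
\]
Letting
\[
\Phi^{\varepsilon}_{ij}\equiv
\begin{cases}
0 & \text{if } |y|<\varepsilon,\\
\Phi_{,ij}(y) & \text{if } |y|\ge\varepsilon,
\end{cases}
\]
it follows
\[
w_{i,j}(x)=C_n\delta_{ij}f(x)+\Phi^{\varepsilon}_{ij}*\tilde f(x)+e(\varepsilon).
\]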
By the theory of singular integrals, there exists a continuous linear map, Kij ∈
L (Lp (Rn ) , Lp (Rn )) such that
Therefore, letting ε → 0,
\[
w_{i,j}=C_n\delta_{ij}f+K_{ij}\tilde f
\]
whenever f ∈ Cc∞ (U ).
Now let $f\in L^p(U)$, let
\[
\|f_k-f\|_{L^p(U)}\to 0,
\]
where $f_k\in C_c^\infty(U)$, and let
\[
w_i^k(x)=\int_U\Phi_{,i}(x-y)f_k(y)\,dy.
\]
It follows
\[
w_{i,j}=C_n\delta_{ij}\tilde f+K_{ij}\tilde f
\]
and this proves the lemma.
Next define $\pi:L^p(U;\mathbb{R}^n)\to L^p(U;\mathbb{R}^n)$ by
\[
\pi F=-\nabla\phi,\qquad\phi(x)=\int_U\nabla\Phi(x-y)\cdot F(y)\,dy.
\]
π : Lp (U ; Rn ) → Lp (U ; Rn )
∇ · (F−πF) = 0
Note this theorem shows that any Lp vector field is the sum of a gradient and a
part which is divergence free. F = F−πF+πF.
The Bochner Integral
578 THE BOCHNER INTEGRAL
In words, $U_k^n$ is the set of points of X which are as close to $a_k$ as they are to any
of the $a_l$ for $l\le n$.
\[
B_k^n\equiv x^{-1}(U_k^n),\qquad D_k^n\equiv B_k^n\setminus\left(\cup_{i=1}^{k-1}B_i^n\right),\qquad D_1^n\equiv B_1^n,
\]
and
\[
x_n(s)\equiv\sum_{k=1}^{n}a_k\mathcal{X}_{D_k^n}(s).
\]
Thus $x_n(s)$ is a closest approximation to $x(s)$ from $\{a_k\}_{k=1}^{n}$ and so $x_n(s)\to x(s)$
because $\{a_n\}_{n=1}^{\infty}$ is dense in $x(\Omega)$. Furthermore, $x_n$ is measurable because each $D_k^n$
is measurable.
Since (Ω, S, µ) is σ finite, there exists Ωn ↑ Ω with µ (Ωn ) < ∞. Let
Then yn (s) → x (s) for each s because for any s, s ∈ Ωn if n is large enough. Also
yn is a simple function because it equals 0 off a set of finite measure.
Now suppose that x is strongly measurable. Then some sequence of simple
functions, $\{x_n\}$, converges pointwise to x. Then $x_n^{-1}(W)$ is measurable for every
open set W because it is just a finite union of measurable sets. Thus, $x_n^{-1}(W)$ is
measurable for every Borel set W. This follows by considering
\[
\left\{W:x_n^{-1}(W)\text{ is measurable}\right\}
\]
and observing this is a σ algebra which contains the open sets. Since X is a metric
space, it follows that if U is an open set in X , there exists a sequence of open sets,
{Vn } which satisfies
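\[
\overline{V_n}\subseteq U,\qquad\overline{V_n}\subseteq V_{n+1},\qquad U=\cup_{n=1}^{\infty}V_n.
\]
Then
\[
x^{-1}(V_m)\subseteq\bigcup_{n<\infty}\bigcap_{k\ge n}x_k^{-1}(V_m)\subseteq x^{-1}\left(\overline{V_m}\right).
\]
This implies
\[
x^{-1}(U)=\bigcup_{m<\infty}x^{-1}(V_m)
\subseteq\bigcup_{m<\infty}\bigcup_{n<\infty}\bigcap_{k\ge n}x_k^{-1}(V_m)
\subseteq\bigcup_{m<\infty}x^{-1}\left(\overline{V_m}\right)\subseteq x^{-1}(U).
\]
Since
\[
x^{-1}(U)=\bigcup_{m<\infty}\bigcup_{n<\infty}\bigcap_{k\ge n}x_k^{-1}(V_m),
\]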
21.1. STRONG AND WEAK MEASURABILITY 579
it follows that x−1 (U ) is measurable for every open U . It remains to show x (Ω) is
separable. Let
D ≡ all values of the simple functions xn
which converge to x pointwise. Then D is clearly countable and dense in $\overline{D}$, a set
which contains $x(\Omega)$.
Claim: $x(\Omega)$ is separable.
Proof of claim: For $n\in\mathbb{N}$, let $\mathcal{B}_n\equiv\left\{B(d,r):0<r<\frac1n,\ r\text{ rational},\ d\in D\right\}$.
Thus $\mathcal{B}_n$ is countable. Let $z\in\overline{D}$. Consider $B\left(z,\frac1n\right)$. Then there exists $d\in D\cap
B\left(z,\frac{1}{3n}\right)$. Now pick $r\in\mathbb{Q}\cap\left(\frac{1}{3n},\frac1n\right)$ so that $B(d,r)\in\mathcal{B}_n$. Now $z\in B(d,r)$ and so
this shows that $x(\Omega)\subseteq\overline{D}\subseteq\cup\mathcal{B}_n$ for each n. Now let $\mathcal{B}_n'$ denote those sets of $\mathcal{B}_n$
which have nonempty intersection with $x(\Omega)$. Say $\mathcal{B}_n'=\{B_k^n\}_{n,k=1}^{\infty}$. By the axiom
of choice, there exists $x_k^n\in B_k^n\cap x(\Omega)$. Then if $z\in x(\Omega)$, z is contained in some set
of $\mathcal{B}_n'$ which also contains a point of $\{x_k^n\}_{n,k=1}^{\infty}$. Therefore, z is at least as close as
$2/n$ to some point of $\{x_k^n\}_{n,k=1}^{\infty}$ which shows $\{x_k^n\}_{n,k=1}^{\infty}$ is a countable dense subset
of $x(\Omega)$. Therefore $x(\Omega)$ is separable. This proves the theorem.
The last part also shows that a subset of a separable metric space is also separable.
Therefore, the following simple corollary is obtained.
The next lemma is interesting for its own sake. Roughly it says that if a Banach
space is separable, then the unit ball in the dual space is weak ∗ separable. This
will be used to prove Pettis’s theorem, one of the major theorems in this subject
which relates weak measurability to strong measurability.
Lemma 21.4 If X is a separable Banach space with $B'$ the closed unit ball in $X'$,
then there exists a sequence $\{f_n\}_{n=1}^{\infty}\equiv D'\subseteq B'$ with the property that for every
$x\in X$,
\[
\|x\|=\sup_{f\in D'}|f(x)|
\]
Proof: Let $\{a_k\}$ be a countable dense set in X and consider the mapping
\[
\phi_n:B'\to\mathbb{F}^n
\]
given by
\[
\phi_n(f)\equiv\left(f(a_1),\cdots,f(a_n)\right).
\]
Then $\phi_n(B')$ is contained in a compact subset of $\mathbb{F}^n$ because $|f(a_k)|\le\|a_k\|$.
Therefore, there exists a countable dense subset of $\phi_n(B')$, $\{\phi_n(f_k^n)\}_{k=1}^{\infty}$. Let $D_n'\equiv
\{f_k^n\}_{k=1}^{\infty}$. Let
\[
D'\equiv\cup_{k=1}^{\infty}D_k'.
\]
It remains to show this works. Letting $x\in X$ and $\varepsilon>0$ be given, there exists $a_m$
such that $\|a_m-x\|<\varepsilon$. Then by the usual argument involving the Hahn Banach
theorem, there exists $f_x\in B'$ such that $\|x\|=f_x(x)$. Letting $n>m$, let $g\in B'$ be
one of the $f_k^n$ with $\{\phi_n(f_k^n)\}_{k=1}^{\infty}$ a dense subset of $\phi_n(B')$ such that
Then
Corollary 21.6 Let X be a separable Banach space and let B (X) denote the σ
algebra of Borel sets. Then B (X) = F where F is the smallest σ algebra of subsets
of X which has the property that every function, x∗ ∈ X 0 is F measurable.
Proof: First I need to show F contains open balls because then F will contain
the open sets, since every open set is a countable union of open balls, which will
imply F ⊇ B (X). As noted above, it suffices to verify F contains the closed balls
because every open ball is a countable union of closed balls. Let $D'$ be those
functionals in $B'$ defined in Lemma 21.4. Then
\[
\begin{aligned}
\{x:\|x-a\|\le r\}&=\left\{x:\sup_{x^*\in D'}|x^*(x-a)|\le r\right\}\\
&=\cap_{x^*\in D'}\{x:|x^*(x-a)|\le r\}\\
&=\cap_{x^*\in D'}\{x:|x^*(x)-x^*(a)|\le r\}\\
&=\cap_{x^*\in D'}\,x^{*-1}\left(\overline{B(x^*(a),r)}\right)
\end{aligned}
\]
Proof: Let B be the countable basis of 21.1 and let U ∈ B. Let {Vm } be the
sequence of 21.2. Since f is the pointwise limit of fn ,
Therefore,
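\[
f^{-1}(U)=\cup_{m=1}^{\infty}f^{-1}(V_m)\subseteq\cup_{m=1}^{\infty}\cup_{n=1}^{\infty}\cap_{k=n}^{\infty}f_k^{-1}(V_m)
\subseteq\cup_{m=1}^{\infty}f^{-1}\left(\overline{V_m}\right)=f^{-1}(U).
\]
It follows $f^{-1}(U)\in\mathcal{F}$ because it equals the expression in the middle which is
measurable. Now let $W\in\tau$. Since $\mathcal{B}$ is countable, $W=\cup_{n=1}^{\infty}U_n$ for some sets
$U_n\in\mathcal{B}$. Hence
\[
f^{-1}(W)=\cup_{n=1}^{\infty}f^{-1}(U_n)\in\mathcal{F}.
\]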
This proves the theorem.
Proof: By Corollary 13.41 on Page 358, there exists a metric d, such that the
metric space topology with respect to d coincides with the weak topology. Since
K is compact, it follows that K is also separable. Hence it is completely separable
and so there exists a countable basis of open sets, B for the weak topology on K. It
follows that if U is any weakly open set, covered by basic sets of the form BA (x, r)
where A is a finite subset of X 0 , there exists a countable collection of these sets of
the form BA (x, r) which covers U .
Suppose now that x is weakly measurable. To show $x^{-1}(U)\in\mathcal{F}$ whenever U
is weakly open, it suffices to verify $x^{-1}(B_A(z,r))\in\mathcal{F}$ for any set $B_A(z,r)$. Let
$A=\{x_1^*,\cdots,x_m^*\}$. Then
\[
=\cup_{i=1}^{m}\{s\in\Omega:|x_i^*(x(s)-z)|<r\}
\]
\[
=\cup_{i=1}^{m}\{s\in\Omega:|x_i^*(x(s))-x_i^*(z)|<r\}
\]
Lemma 21.10 Let B be the closed unit ball in X. If $X'$ is separable, there exists
a sequence $\{x_m\}_{m=1}^{\infty}\equiv D\subseteq B$ with the property that for all $y^*\in X'$,
Proof: Let
\[
\{x_k^*\}_{k=1}^{\infty}
\]
It remains to verify this works. Let $y^*\in X'$. Then there exists y such that
By density, there exists one of the $x_k^*$ from the countable dense subset of $X'$ such
that also
\[
|x_k^*(y)|>\|y^*\|-\varepsilon,\qquad\|x_k^*-y^*\|<\varepsilon.
\]
Now $x_k^*(y)\in\phi_k(B)$ and so there exists $x\in D_k\subseteq D$ such that
then
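\[
\sum_{k=1}^{n}a_k\mu(E_k)=0.
\]
Let $f\in X'$. Then
\[
f\left(\sum_{k=1}^{n}a_k\mathcal{X}_{E_k}(s)\right)=\sum_{k=1}^{n}f(a_k)\mathcal{X}_{E_k}(s)=0
\]
and, therefore,
\[
0=\int_\Omega\left(\sum_{k=1}^{n}f(a_k)\mathcal{X}_{E_k}(s)\right)d\mu=\sum_{k=1}^{n}f(a_k)\mu(E_k)=f\left(\sum_{k=1}^{n}a_k\mu(E_k)\right).
\]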
Theorem 21.18 The Bochner integral is well defined and if x is Bochner integrable
and $f\in X'$,
\[
f\left(\int_\Omega x(s)\,d\mu\right)=\int_\Omega f(x(s))\,d\mu
\tag{21.6}
\]
and
\[
\left\|\int_\Omega x(s)\,d\mu\right\|\le\int_\Omega\|x(s)\|\,d\mu.
\tag{21.7}
\]
Also, the Bochner integral is linear. That is, if a, b are scalars and x, y are two
Bochner integrable functions, then
\[
\int_\Omega\left(ax(s)+by(s)\right)d\mu=a\int_\Omega x(s)\,d\mu+b\int_\Omega y(s)\,d\mu
\tag{21.8}
\]
Proof: First it is shown that the triangle inequality holds on simple functions
and that the limit in 21.5 exists. Thus, if x is given by 21.3 with the $E_k$ disjoint,
\[
\begin{aligned}
\left\|\int_\Omega x(s)\,d\mu\right\|
&=\left\|\int_\Omega\sum_{k=1}^{n}a_k\mathcal{X}_{E_k}(s)\,d\mu\right\|=\left\|\sum_{k=1}^{n}a_k\mu(E_k)\right\|\\
&\le\sum_{k=1}^{n}\|a_k\|\mu(E_k)=\int_\Omega\sum_{k=1}^{n}\|a_k\|\mathcal{X}_{E_k}(s)\,d\mu=\int_\Omega\|x(s)\|\,d\mu
\end{aligned}
\]
which shows the triangle inequality holds on simple functions. This implies
\[
\left\|\int_\Omega x_n(s)\,d\mu-\int_\Omega x_m(s)\,d\mu\right\|=\left\|\int_\Omega\left(x_n(s)-x_m(s)\right)d\mu\right\|
\le\int_\Omega\|x_n(s)-x_m(s)\|\,d\mu
\]
which verifies the existence of the limit in 21.5. This completes the first part of the
argument.
21.2. THE BOCHNER INTEGRAL 587
Next it is shown the integral does not depend on the choice of the sequence
satisfying 21.4 so that the integral is well defined. Suppose yn , xn both satisfy 21.4
and converge to x pointwise. By Fatou’s lemma,
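\[
\begin{aligned}
\left\|\int_\Omega y_n\,d\mu-\int_\Omega x_m\,d\mu\right\|
&\le\int_\Omega\|y_n-x\|\,d\mu+\int_\Omega\|x-x_m\|\,d\mu\\
&\le\liminf_{k\to\infty}\int_\Omega\|y_n-y_k\|\,d\mu+\liminf_{k\to\infty}\int_\Omega\|x_k-x_m\|\,d\mu\\
&\le\varepsilon/2+\varepsilon/2
\end{aligned}
\]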
if m and n are chosen large enough. Since ε is arbitrary, this shows the limit is the
same for both sequences and demonstrates the Bochner integral is well defined.
It remains to verify the triangle inequality on Bochner integrable functions and
the claim about passing a continuous linear functional inside the integral. Let x be
Bochner integrable and let xn be a sequence which satisfies the conditions of the
definition. Define
\[
y_n(s)\equiv
\begin{cases}
x_n(s) & \text{if } \|x_n(s)\|\le 2\|x(s)\|,\\
0 & \text{if } \|x_n(s)\|>2\|x(s)\|.
\end{cases}
\tag{21.9}
\]
If x (s) = 0 then yn (s) = 0 for all n. If ||x (s)|| > 0 then for all n large enough,
yn (s) = xn (s).
Thus,
\[
f\left(\int_\Omega x\,d\mu\right)=\lim_{n\to\infty}f\left(\int_\Omega y_n\,d\mu\right)=\lim_{n\to\infty}\int_\Omega f(y_n)\,d\mu=\int_\Omega f(x)\,d\mu,
\]
the last equation holding from the dominated convergence theorem and 21.10 and
21.11. This shows 21.6. To verify 21.7,
\[
\left\|\int_\Omega x(s)\,d\mu\right\|=\lim_{n\to\infty}\left\|\int_\Omega y_n(s)\,d\mu\right\|
\le\lim_{n\to\infty}\int_\Omega\|y_n(s)\|\,d\mu=\int_\Omega\|x(s)\|\,d\mu
\]
where the last equation follows from the dominated convergence theorem and 21.10,
21.11.
It remains to verify 21.8. Let f ∈ X 0 . Then from 21.6
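\[
\begin{aligned}
f\left(\int_\Omega\left(ax(s)+by(s)\right)d\mu\right)
&=\int_\Omega\left(af(x(s))+bf(y(s))\right)d\mu\\
&=a\int_\Omega f(x(s))\,d\mu+b\int_\Omega f(y(s))\,d\mu\\
&=f\left(a\int_\Omega x(s)\,d\mu+b\int_\Omega y(s)\,d\mu\right).
\end{aligned}
\]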
In this case there exists a sequence of simple functions {yn } satisfying 21.4, yn (s)
converging pointwise to x (s),
and
\[
\lim_{n\to\infty}\int_\Omega\|x(s)-y_n(s)\|\,d\mu=0.
\tag{21.15}
\]
But then taking a limit as ε → 0 and using the dominated convergence theorem
and 21.14 and 21.13, this would imply 0 ≥ ε. Therefore, x is Bochner integrable.
21.15 follows from the dominated convergence theorem and 21.14.
Now suppose x is Bochner integrable. Then it is strongly measurable and there
exists a sequence of simple functions $\{x_n\}$ such that $x_n(s)$ converges pointwise to
x and
\[
\lim_{m,n\to\infty}\int_\Omega\|x_n(s)-x_m(s)\|\,d\mu=0.
\]
Therefore, as before, since $\left\{\int_\Omega x_n\,d\mu\right\}_{n=1}^{\infty}$ is a Cauchy sequence, it follows
\[
\left\{\int_\Omega\|x_n\|\,d\mu\right\}_{n=1}^{\infty}
\]
Thus
\[
\int_\Omega\|x\|\,d\mu\le\liminf_{n\to\infty}\int_\Omega\|x_n\|\,d\mu<\infty
\]
Using 21.16 it follows yn satisfies 21.14, converges pointwise to x and then from the
dominated convergence theorem 21.15 holds. This proves the theorem.
is a closed subset of X × Y with respect to the product topology obtained from the
norm
||(x, y)|| = max (||x|| , ||y||) .
Thus also G (A) is a separable Banach space with the above norm. You can also
consider D (A) as a separable Banach space having the norm
\[
i^*x^*(y)=y^{**}(i^*x^*)
\]
because this will imply $y^{**}=J_Y(y)$ where $J_Y$ is the James map from $Y$ to $Y''$.
However, the above is equivalent to the following holding for all $x^*\in X'$.
Lemma 21.21 Suppose V and W are reflexive Banach spaces and that V is a
dense subset of W in the topology of W. Then i∗ W 0 is a dense subset of V 0 where
here i is the inclusion map of V into W .
in which i is the inclusion map. Next suppose $i^*W'$ is not dense in $V'$. Then there
exists $v^{**}\in V''$ such that $v^{**}\ne 0$ but $v^{**}(i^*W')=0$. It follows from V being
reflexive, that $v^{**}=Jv_0$ where J is the James map from $V$ to $V''$ for some $v_0\in V$.
Thus for every $w^*\in W'$,
\[
\begin{aligned}
0=v^{**}(i^*w^*)&=i^{**}v^{**}(w^*)\\
&=i^{**}Jv_0(w^*)=Jv_0(i^*w^*)\\
&=i^*w^*(v_0)=w^*(v_0)
\end{aligned}
\]
and since $W'$ separates the points of W, it follows $v_0=0$ which contradicts $v^{**}\ne 0$.
This proves the lemma.
Note that in the proof, only V reflexive was used.
This lemma implies an easy corollary.
Corollary 21.22 Let E and F be reflexive Banach spaces and let A be a closed
operator, A : D (A) ⊆ E → F. Suppose also that D (A) is dense in E. Then making
$D(A)$ into a Banach space by using the above graph norm given in 21.17, it follows
that $D(A)$ is a Banach space and $i^*E'$ is a dense subspace of $D(A)'$.
Proof: First note that E × F is a reflexive Banach space and G (A) is a closed
subspace of E × F so it is also a reflexive Banach space. Now D (A) is isometric to
G (A) and so it follows D (A) is a dense subspace of E which is reflexive. Therefore,
from Lemma 21.21 the conclusion follows.
With this preparation, here is another interesting theorem. This one is about
taking outside the integral a closed linear operator as opposed to a continuous linear
operator.
Theorem 21.23 Let X, Y be separable Banach spaces and let $A:D(A)\subseteq X\to Y$
be a closed operator where $D(A)$ is a dense subset of X. Suppose also that $i^*X'$ is
a dense subspace of $D(A)'$ where $D(A)$ is a Banach space having the graph norm
described in 21.17. Suppose that $(\Omega,\mathcal{F},\mu)$ is a σ finite measure space and $x:\Omega\to X$
is strongly measurable and it happens that $x(s)\in D(A)$ for all $s\in\Omega$. Then x is
strongly measurable as a mapping into $D(A)$. Also $Ax$ is strongly measurable as a
map into Y and if
\[
\int_\Omega\|x(s)\|\,d\mu,\ \int_\Omega\|Ax(s)\|\,d\mu<\infty,
\tag{21.18}
\]
then
\[
\int_\Omega x(s)\,d\mu\in D(A)
\tag{21.19}
\]
and
\[
A\int_\Omega x(s)\,d\mu=\int_\Omega Ax(s)\,d\mu.
\tag{21.20}
\]
Proof: First of all, consider the assertion that x is strongly measurable into
$D(A)$. Letting $f\in D(A)'$ be given, there exists a sequence, $\{g_n\}\subseteq i^*X'$ such that
$g_n\to f$ in $D(A)'$. Therefore,
\[
s\to g_n(x(s))
\]
and
\[
A\int_\Omega x(s)\,d\mu=\int_\Omega Ax(s)\,d\mu.
\]
This proves the theorem.
Here is another version of this theorem which has different hypotheses.
Theorem 21.24 Let X and Y be separable Banach spaces and let A : D (A) ⊆
X → Y be a closed operator. Also let (Ω, F, µ) be a σ finite measure space and let
x : Ω → X be Bochner integrable such that x (s) ∈ D (A) for all s. Also suppose Ax
is Bochner integrable. Then
\[
\int Ax\,d\mu=A\int x\,d\mu
\]
and $\int x\,d\mu\in D(A)$.
Proof: Consider the graph of A,
G (A) ≡ {(x, Ax) : x ∈ D (A)} ⊆ X × Y.
Then since A is closed, $G(A)$ is a closed separable Banach space with the norm
$\|(x,y)\|\equiv\max(\|x\|,\|y\|)$. Therefore, for $g^*\in G(A)'$, one can apply the Hahn Banach
theorem and obtain $(x^*,y^*)\in(X\times Y)'$ such that $g^*(x,Ax)=(x^*(x),y^*(Ax))$.
Now it follows from the assumptions that $s\to(x^*(x(s)),y^*(Ax(s)))$ is measurable
with values in $G(A)$. It is also separably valued because this is true of
$G(A)$. By the Pettis theorem, $s\to(x(s),A(x(s)))$ must be strongly measurable.
Also $\int\|x(s)\|+\|A(x(s))\|\,d\mu<\infty$ by assumption and so there exists a sequence
of simple functions having values in $G(A)$, $\{(x_n(s),Ax_n(s))\}$, which converges to
$(x(s),Ax(s))$ pointwise such that $\int\|(x_n,Ax_n)-(x,Ax)\|\,d\mu\to 0$ in $G(A)$. Now
for simple functions it is routine to verify that
\[
\int(x_n,Ax_n)\,d\mu=\left(\int x_n\,d\mu,\int Ax_n\,d\mu\right)=\left(\int x_n\,d\mu,A\int x_n\,d\mu\right)
\]
Also
\[
\left\|\int x_n\,d\mu-\int x\,d\mu\right\|\le\int\|x_n-x\|\,d\mu\le\int\|(x_n,Ax_n)-(x,Ax)\|\,d\mu
\]
s → A (s) x
where Ank is in L (X, Y ). It follows An (s) x → A (s) x for each s and so, since s →
An (s) x is a simple Y valued function, s → A (s) x must be strongly measurable.
Definition 21.26 Suppose $A(s)\in\mathcal{L}(X,Y)$ for each $s\in\Omega$ where X, Y are separable
Banach spaces. Suppose also that for each $x\in X$,
Lemma 21.27 The above definition is well defined. Furthermore, if 21.21 holds
then $s\to\|A(s)\|$ is measurable and if 21.22 holds, then
\[
\left\|\int_\Omega A(s)\,d\mu\right\|\le\int_\Omega\|A(s)\|\,d\mu.
\]
This is because $x\to\int_\Omega A(s)x\,d\mu$ is linear and continuous. Thus $\Psi=\int_\Omega A(s)\,d\mu$
and the definition is well defined.
Now consider the assertion about $s\to\|A(s)\|$. Let $D'\subseteq B'$ the closed unit ball
in $Y'$ be such that $D'$ is countable and
and so
\[
\left\|\int_\Omega A(s)\,d\mu\right\|\le\int_\Omega\|A(s)\|\,d\mu.
\]
Theorem 21.28 Let E be a compact metric space and let (Ω, F) be a measure
space. Suppose ψ : E × Ω → R has the property that x → ψ (x, ω) is continuous
and ω → ψ (x, ω) is measurable. Then there exists a measurable function, f having
values in E such that
\[
\psi(f(\omega),\omega)=\sup_{x\in E}\psi(x,\omega).
\]
v ⊗ u (x) = (x, u) v.
x → (Ax, x)
x → (v ⊗ u (x) , x)
Proof: Since H is separable, it follows from Corollary 13.41 on Page 358 that
B can be considered as a metric space. Therefore, showing continuity reduces
to showing convergent sequences are taken to convergent sequences. Let xn → x
weakly in B. Suppose Axn does not converge to Ax. Then there exists a subsequence,
still denoted by {xn } such that
for all n. Then since A maps bounded sets to compact sets, there is a further sub-
sequence, still denoted by {xn } such that Axn converges to some y ∈ H. Therefore,
v ⊗ u (xn ) = (xn , u) v.
There exists a weakly convergent subsequence of {xn } say {xnk } converging weakly
to x ∈ H. Therefore,
which converges to 0.
and this converges to 0 by weak convergence. It follows from the definition that
u ⊗ u is self adjoint. This proves the lemma.
Lemma 21.32 Let A ∈ L (H, H) and suppose it is self adjoint and compact. Let
B denote the closed unit ball in H. Let e ∈ B be such that
Proof: From the above observation, $(Ax,x)$ is always real and since A is compact,
$|(Ax,x)|$ achieves a maximum at e. It remains to verify e is an eigenvector.
Note that $\|e\|=1$ whenever $\lambda\ne 0$ since otherwise $|(Ae,e)|$ could be made larger
by replacing e with $e/\|e\|$.
Suppose $\lambda=(Ae,e)>0$. Then it is easy to verify that $\lambda I-A$ is a nonnegative
($((\lambda I-A)x,x)\ge 0$ for all x) and self adjoint operator. Therefore, the Cauchy
Schwarz inequality can be applied to write
\[
((\lambda I-A)e,x)\le((\lambda I-A)e,e)^{1/2}((\lambda I-A)x,x)^{1/2}=0
\]
By that lemma again, $A_ne_{n+1}=\lambda_{n+1}e_{n+1}$ and $\|e_{n+1}\|=1$ if $\lambda_{n+1}\ne 0$ while
$e_{n+1}=0$ if $\lambda_{n+1}=0$. Then
\[
A_{n+1}\equiv A_n-\lambda_{n+1}e_{n+1}\otimes e_{n+1}
\]
Thus
\[
A_n=A-\sum_{k=1}^{n}\lambda_ke_k\otimes e_k.
\tag{21.26}
\]
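It may help to see why subtracting the rank one operator removes the eigenvector just found. A short check, using only the definition $v\otimes u(x)=(x,u)v$ and the two cases for $\|e_{n+1}\|$ stated in the text:

```latex
\[
A_{n+1}e_{n+1}
=A_ne_{n+1}-\lambda_{n+1}\,(e_{n+1},e_{n+1})\,e_{n+1}
=\lambda_{n+1}\bigl(1-\|e_{n+1}\|^{2}\bigr)e_{n+1}=0 ,
\]
% if \lambda_{n+1} \ne 0 then \|e_{n+1}\| = 1 and the bracket vanishes;
% if \lambda_{n+1} = 0 the prefactor vanishes.
```

So each step of the construction deflates the operator by the eigenpair just obtained, which is consistent with the orthogonality asserted in the claim that follows.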
Claim 1: If $k<n+1$ then $(e_{n+1},e_k)=0$. Also $Ae_k=\lambda_ke_k$ for all k.
Proof of claim: From the above,
\[
\lambda_{n+1}e_{n+1}=A_ne_{n+1}=Ae_{n+1}-\sum_{k=1}^{n}\lambda_k(e_{n+1},e_k)e_k.
\]
Theorem 21.34 Let A (s) ∈ L (H, H) be a compact self adjoint operator and H is
a separable Hilbert space such that s → A (s) x is strongly measurable. Then there
exist real numbers $\{\lambda_k(s)\}_{k=1}^{\infty}$ and vectors $\{e_k(s)\}_{k=1}^{\infty}$ such that
\[
\|e_k(s)\|=1\text{ if }\lambda_k\ne 0,\qquad\|e_k(s)\|=0\text{ if }\lambda_k=0,
\]
Proof: It is simply a repeat of the above proof of the Hilbert Schmidt theorem
except at every step when the $e_k$ and $\lambda_k$ are defined, you use the Kuratowski measurable
selection theorem, Theorem 21.28 on Page 595 to obtain $\lambda_k(s)$ is measurable
and that $s\to e_k(s)$ is also measurable.
When you consider $\max_{x\in B}|(A_n(s)x,x)|$, let $\psi(x,s)=|(A_n(s)x,x)|$. Then
ψ is continuous in x by Lemma 21.30 on Page 596 and it is measurable in s by
assumption. Therefore, by the Kuratowski theorem, ek (s) is measurable in the sense
that inverse images of weakly open sets in B are measurable. However, by Lemma
21.9 on Page 582 this is the same as weakly measurable. Since H is separable, this
implies s → ek (s) is also strongly measurable. The measurability of λk and ek is
the only new thing here and so this completes the proof.
is µ measurable and
s1 → f (s1 , s2 ) is µ measurable,
s2 → f (s1 , s2 ) is λ measurable,
$s_1\to\int_{\Omega_2}f(s_1,s_2)\,d\lambda$ is µ measurable,
$s_2\to\int_{\Omega_1}f(s_1,s_2)\,d\mu$ is λ measurable,
The following theorem is the version of Fubini’s theorem valid for Bochner integrable
functions.
Then there exist a set of µ measure zero, N and a set of λ measure zero, M such
that the following formula holds with all integrals making sense.
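\[
\begin{aligned}
\int_{\Omega_1\times\Omega_2}f(s_1,s_2)\,d(\mu\times\lambda)
&=\int_{\Omega_1}\int_{\Omega_2}f(s_1,s_2)\mathcal{X}_{N^C}(s_1)\,d\lambda\,d\mu\\
&=\int_{\Omega_2}\int_{\Omega_1}f(s_1,s_2)\mathcal{X}_{M^C}(s_2)\,d\mu\,d\lambda.
\end{aligned}
\]
Proof: First note that from 21.27 and the usual Fubini theorem for nonnegative
valued functions,
\[
\int_{\Omega_1\times\Omega_2}\|f(s_1,s_2)\|\,d(\mu\times\lambda)=\int_{\Omega_1}\int_{\Omega_2}\|f(s_1,s_2)\|\,d\lambda\,d\mu
\]
and so
\[
\int_{\Omega_2}\|f(s_1,s_2)\|\,d\lambda<\infty
\tag{21.28}
\]
and so from the usual Fubini theorem for complex valued functions,
\[
\int_{\Omega_1\times\Omega_2}\phi\circ f(s_1,s_2)\,d(\mu\times\lambda)=\int_{\Omega_1}\int_{\Omega_2}\phi\circ f(s_1,s_2)\,d\lambda\,d\mu.
\tag{21.29}
\]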
Now also if you fix s2 , it follows from the definition of strongly measurable and the
properties of product measure mentioned above that
s1 → f (s1 , s2 )
is µ measurable because
I want to show this is also Bochner integrable with respect to µ so I can factor
out φ once again. Its measurability follows from the Pettis theorem and the above
observation 21.31. Also,
\[
\begin{aligned}
\int_{\Omega_1}\left\|\int_{\Omega_2}f(s_1,s_2)\mathcal{X}_{N^C}(s_1)\,d\lambda\right\|d\mu
&\le\int_{\Omega_1}\int_{\Omega_2}\|f(s_1,s_2)\|\,d\lambda\,d\mu\\
&=\int_{\Omega_1\times\Omega_2}\|f(s_1,s_2)\|\,d(\mu\times\lambda)<\infty.
\end{aligned}
\]
Therefore, the function in 21.32 is indeed Bochner integrable and so in 21.30 the φ
can be taken outside the last integral. Thus,
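\[
\begin{aligned}
\phi\left(\int_{\Omega_1\times\Omega_2}f(s_1,s_2)\,d(\mu\times\lambda)\right)
&=\int_{\Omega_1\times\Omega_2}\phi\circ f(s_1,s_2)\,d(\mu\times\lambda)\\
&=\int_{\Omega_1}\int_{\Omega_2}\phi\circ f(s_1,s_2)\,d\lambda\,d\mu\\
&=\int_{\Omega_1}\phi\left(\int_{\Omega_2}f(s_1,s_2)\mathcal{X}_{N^C}(s_1)\,d\lambda\right)d\mu\\
&=\phi\left(\int_{\Omega_1}\int_{\Omega_2}f(s_1,s_2)\mathcal{X}_{N^C}(s_1)\,d\lambda\,d\mu\right).
\end{aligned}
\]
Since $X'$ separates the points,
\[
\int_{\Omega_1\times\Omega_2}f(s_1,s_2)\,d(\mu\times\lambda)=\int_{\Omega_1}\int_{\Omega_2}f(s_1,s_2)\mathcal{X}_{N^C}(s_1)\,d\lambda\,d\mu.
\]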
The other formula follows from similar reasoning. This proves the theorem.
Also
\[
\|x\|_{L^p(\Omega;X)}\equiv\|x\|_p\equiv\left(\int_\Omega\|x(s)\|^{p}\,d\mu\right)^{1/p}.
\tag{21.33}
\]
As in the case of scalar valued functions, two functions in $L^p(\Omega;X)$ are considered
equal if they are equal a.e. With this convention, and using the same arguments
found in the presentation of scalar valued functions, it is clear that $L^p(\Omega;X)$ is a
normed linear space with the norm given by 21.33. In fact, $L^p(\Omega;X)$ is a Banach
space. This is the main contribution of the next theorem.
then there exists x ∈ Lp (Ω; X) such that xn (s) → x (s) a.e. and
||x − xn ||p → 0.
Proof: Let
\[
g_N(s)\equiv\sum_{n=1}^{N}\|x_{n+1}(s)-x_n(s)\|_X
\]
\[
\le\sum_{n=1}^{\infty}\|x_{n+1}-x_n\|_p<\infty.
\]
Let
\[
g(s)=\lim_{N\to\infty}g_N(s)=\sum_{n=1}^{\infty}\|x_{n+1}(s)-x_n(s)\|_X.
\]
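The surviving fragments suggest the standard argument that $g$ is finite almost everywhere. A sketch, assuming the hypothesis of the lemma is that $\sum_n\|x_{n+1}-x_n\|_p<\infty$ (which is what the displayed bound indicates):

```latex
% Minkowski's inequality applied to the scalar functions s -> \|x_{n+1}(s)-x_n(s)\|_X :
\[
\|g_N\|_{L^p(\Omega)}\le\sum_{n=1}^{N}\|x_{n+1}-x_n\|_p\le\sum_{n=1}^{\infty}\|x_{n+1}-x_n\|_p<\infty .
\]
% g_N increases to g, so the monotone convergence theorem gives
\[
\int_\Omega g(s)^{p}\,d\mu=\lim_{N\to\infty}\int_\Omega g_N(s)^{p}\,d\mu
\le\Bigl(\sum_{n=1}^{\infty}\|x_{n+1}-x_n\|_p\Bigr)^{p}<\infty ,
\]
```

Hence $g(s)<\infty$ off a set $E$ of measure zero, and on the complement of $E$ the series of differences converges absolutely in $X$.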
exists because
\[
x_{N+1}(s)=x_{N+1}(s)-x_1(s)+x_1(s)=\sum_{n=1}^{N}\left(x_{n+1}(s)-x_n(s)\right)+x_1(s).
\]
21.5. THE SPACES LP (Ω; X) 605
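\[
\|x_{N+1}(s)-x_{M+1}(s)\|_X\le\sum_{n=M+1}^{N}\|x_{n+1}(s)-x_n(s)\|_X
\le\sum_{n=M+1}^{\infty}\|x_{n+1}(s)-x_n(s)\|_X
\]
which shows that $\{x_{N+1}(s)\}_{N=1}^{\infty}$ is a Cauchy sequence. Now let
\[
x(s)\equiv
\begin{cases}
\lim_{N\to\infty}x_N(s) & \text{if } s\notin E,\\
0 & \text{if } s\in E.
\end{cases}
\]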
By Theorem 21.2, xn (Ω) is separable for each n. Therefore, x (Ω) is also separable.
Also, if $f \in X'$, then
\[
f(x(s)) = \lim_{N\to\infty} f(x_N(s))
\]
if $s \notin E$ and $f(x(s)) = 0$ if $s \in E$. Therefore, $f \circ x$ is measurable because it is the
limit of the measurable functions,
\[
f \circ x_N\, \mathcal{X}_{E^C}.
\]
Since x is weakly measurable and x (Ω) is separable, Corollary 21.8 shows that x is
strongly measurable. By Fatou’s lemma,
\[
\int_\Omega \|x(s) - x_N(s)\|^p\, d\mu \le \liminf_{M\to\infty} \int_\Omega \|x_M(s) - x_N(s)\|^p\, d\mu.
\]
It remains to show x ∈ Lp (Ω; X). This follows from the above and the triangle
inequality. Thus, for N large enough,
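\[
\left( \int_\Omega \|x(s)\|^p\, d\mu \right)^{1/p}
\le \left( \int_\Omega \|x_N(s)\|^p\, d\mu \right)^{1/p} + \left( \int_\Omega \|x(s) - x_N(s)\|^p\, d\mu \right)^{1/p}
\le \left( \int_\Omega \|x_N(s)\|^p\, d\mu \right)^{1/p} + \varepsilon < \infty.
\]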
Theorem 21.38 Lp (Ω; X) is complete. Also every Cauchy sequence has a subse-
quence which converges pointwise.
and apply Lemma 21.37. The pointwise convergence of this subsequence was estab-
lished in the proof of this lemma. This proves the theorem because if a subsequence
of a Cauchy sequence converges, then the Cauchy sequence must also converge.
Observation 21.39 If the measure space is Lebesgue measure then you have conti-
nuity of translation in Lp (Rn ; X) in the usual way. More generally, for µ a Radon
measure on Ω a locally compact Hausdorff space, Cc (Ω; X) is dense in Lp (Ω; X) .
Here Cc (Ω; X) is the space of continuous X valued functions which have compact
support in Ω. The proof of this little observation follows immediately from approx-
imating with simple functions and then applying the appropriate considerations to
the simple functions.
Clearly Fatou’s lemma and the monotone convergence theorem make no sense
for functions with values in a Banach space but the dominated convergence theorem
holds in this setting.
Proof: $\|x_n(s) - x(s)\| \le 2g(s)$ a.e. so by the usual dominated convergence
theorem,
\[
0 = \lim_{n\to\infty} \int_\Omega \|x_n(s) - x(s)\|\, d\mu.
\]
Also,
\[
\int_\Omega \|x_n(s) - x_m(s)\|\, d\mu \le \int_\Omega \|x_n(s) - x(s)\|\, d\mu + \int_\Omega \|x_m(s) - x(s)\|\, d\mu,
\]
and so {xn } is a Cauchy sequence in L1 (Ω; X). Therefore, by Theorem 21.38, there
exists $y \in L^1(\Omega;X)$ and a subsequence $x_{n'}$ satisfying
\[
x_{n'}(s) \to y(s) \text{ a.e. and in } L^1(\Omega;X).
\]
But $x(s) = \lim_{n'\to\infty} x_{n'}(s)$ a.e. and so $x(s) = y(s)$ a.e. Hence
\[
\int_\Omega \|x(s)\|\, d\mu = \int_\Omega \|y(s)\|\, d\mu < \infty
\]
which shows that x is Bochner integrable. Finally, since the integral is linear,
\[
\left\| \int_\Omega x(s)\, d\mu - \int_\Omega x_n(s)\, d\mu \right\|
= \left\| \int_\Omega \left( x(s) - x_n(s) \right) d\mu \right\|
\le \int_\Omega \|x_n(s) - x(s)\|\, d\mu,
\]
Now $\cup_{M=1}^{\infty} B_M = L^r([0,T];X)$. Note this did not depend on the measure space
used. It would have been equally valid on any measure space.
Consider now C ([0, T ] ; X) . The norm on this space is the usual norm, ||·||∞ .
The argument above shows ||·||∞ is a Borel measurable function on Lp ([0, T ] ; X) .
This is because BM ≡ {x ∈ Lp ([0, T ] ; X) : ||x||∞ ≤ M } is a closed, hence Borel sub-
set of Lp ([0, T ] ; X). Now let θ ∈ L (Lp ([0, T ] ; X) , Lp (R; X)) such that θ (x (t)) =
x (t) for all t ∈ [0, T ] and also θ ∈ L (C ([0, T ] ; X) , BC (R; X)) where BC (R; X)
denotes the bounded continuous functions with a norm given by
\[
\|x\| \equiv \sup_{t\in\mathbb{R}} \|x(t)\| ,
\]
and let Φ ∈ Cc∞ (−T, 2T ) such that Φ (t) = 1 for t ∈ [0, T ]. Then you could let
\[
\theta x(t) \equiv \Phi(t)\, \tilde{x}(t).
\]
ψ n x (t) ≡ φn ∗ θx (t) .
provided n is large enough due to the compact support and consequent uniform
continuity of θx.
If ||ψ n x − x||L∞ ([0,T ];X) → 0, then {ψ n x} must be a Cauchy sequence in C ([0, T ] ; X)
and this requires that x equals a continuous function a.e. Thus C ([0, T ] ; X) con-
sists exactly of those functions, x of Lp ([0, T ] ; X) such that ||ψ n x − x||∞ → 0. It
follows
\[
C([0,T];X) = \bigcap_{n=1}^{\infty} \bigcup_{m=1}^{\infty} \bigcap_{k=m}^{\infty}
\left\{ x \in L^p([0,T];X) : \|\psi_k x - x\|_\infty \le \frac{1}{n} \right\}. \tag{21.34}
\]
It only remains to show
is a Borel set. Suppose then that xn ∈ S and xn → x in Lp ([0, T ] ; X). Then there
exists a subsequence, still denoted by n such that xn → x pointwise a.e. as well as
in Lp . There exists a set of measure 0 such that for all n, and t not in this set,
\[
\|\psi_k x_n(t) - x_n(t)\| \equiv \left\| \int_{-1/k}^{1/k} \phi_k(s)\, \theta x_n(t-s)\, ds - x_n(t) \right\| \le \alpha,
\]
\[
x_n(t) \to x(t).
\]
Then
Thus S is closed and so the set in 21.34 is a Borel set. This proves the theorem.
As in the scalar case, the following lemma holds in this more general context.
Lemma 21.42 Let (Ω, µ) be a regular measure space where Ω is a locally compact
Hausdorff space. Then Cc (Ω; X), the space of continuous functions having compact
support and values in X, is dense in $L^p(\Omega;X)$ for all p ∈ [1, ∞). For any σ finite
measure space, the simple functions are dense in $L^p(\Omega;X)$.
Proof: First it is shown the simple functions are dense in $L^p(\Omega;X)$. Let
$f \in L^p(\Omega;X)$ and let {xn } denote a sequence of simple functions which converge
to f pointwise which also have the property that
Then
\[
\int_\Omega \|x_n(s) - f(s)\|^p\, d\mu \to 0
\]
from the dominated convergence theorem. Therefore, the simple functions are indeed
dense in $L^p(\Omega;X)$.
Next suppose (Ω, µ) is a regular measure space. If $x(s) \equiv \sum_i a_i \mathcal{X}_{E_i}(s)$ is a
simple function, then by regularity, there exist compact sets, $K_i$ and open sets, $V_i$
such that $K_i \subseteq E_i \subseteq V_i$ and $\mu(V_i \setminus K_i)^{1/p} < \varepsilon / \sum_i \|a_i\|$. Let $K_i \prec h_i \prec V_i$. Then
consider
\[
\sum_i a_i h_i \in C_c(\Omega;X).
\]
\[
\le \sum_i \left( \int_\Omega \|a_i\|^p\, |h_i(s) - \mathcal{X}_{E_i}(s)|^p\, d\mu \right)^{1/p}
\le \sum_i \|a_i\| \left( \int_{V_i \setminus K_i} d\mu \right)^{1/p}
\le \sum_i \|a_i\|\, \mu(V_i \setminus K_i)^{1/p} < \varepsilon
\]
Since ε is arbitrary, this and the first part of the lemma shows Cc (Ω; X) is dense
in Lp (Ω; X) .
then xn is measurable.
\[
\int_\Omega \int_B |x_n(s,t) - x_m(s,t)|\, d\nu\, d\mu
\le \int_\Omega \int_B |x_n(s,t) - x(s)(t)|\, d\nu\, d\mu
+ \int_\Omega \int_B |x_m(s,t) - x(s)(t)|\, d\nu\, d\mu. \tag{21.37}
\]
21.6. MEASURABLE REPRESENTATIVES 611
It follows from 21.37 and 21.36 that {xn } is a Cauchy sequence in L1 (Ω × B).
Therefore, there exists y ∈ L1 (Ω × B) and a subsequence of {xn }, still denoted by
{xn }, such that
\[
\lim_{n\to\infty} x_n(s,t) = y(s,t) \text{ a.e.}
\]
and
\[
\lim_{n\to\infty} \|x_n - y\|_1 = 0.
\]
It follows that
\[
\int_\Omega \int_B |y(s,t) - x(s)(t)|\, d\nu\, d\mu
\le \int_\Omega \int_B |y(s,t) - x_n(s,t)|\, d\nu\, d\mu
+ \int_\Omega \int_B |x(s)(t) - x_n(s,t)|\, d\nu\, d\mu. \tag{21.38}
\]
Since limn→∞ ||xn − x||1 = 0, it follows from 21.38 that y = x in L1 (Ω; X) . Thus,
for a.e. s,
y (s, ·) = x (s) in X = L1 (B).
Now $\int_\Omega x(s)\, d\mu \in X = L^1(B,\nu)$ so it makes sense to ask for $\left( \int_\Omega x(s)\, d\mu \right)(t)$, at
least a.e. To find what this is, note
\[
\left\| \int_\Omega x_n(s)\, d\mu - \int_\Omega x(s)\, d\mu \right\|_X \le \int_\Omega \|x_n(s) - x(s)\|_X\, d\mu.
\]
Therefore
\[
\lim_{n\to\infty} \int_B \left| \int_\Omega x_n(s,t)\, d\mu - \left( \int_\Omega x(s)\, d\mu \right)(t) \right| d\nu = 0. \tag{21.39}
\]
Theorem 21.43 Let X = L1 (B) where (B, F, ν) is a σ finite measure space and
let x ∈ L1 (Ω; X). Then there exists a measurable representative, y ∈ L1 (Ω × B),
such that
x (s) = y (s, ·) a.e. s in Ω,
and
\[
\int_\Omega y(s,t)\, d\mu = \left( \int_\Omega x(s)\, d\mu \right)(t) \text{ a.e. } t.
\]
F :S→X
whenever $\{E_i\}_{i=1}^{\infty}$ is a sequence of disjoint elements of S. For F a vector measure,
\[
|F|(A) \equiv \sup\left\{ \sum_{F\in\pi(A)} \|F(F)\| : \pi(A) \text{ is a partition of } A \right\}.
\]
This is the same definition that was given in the case where F would have values
in C, the only difference being the fact that now F has values in a general Banach
space X as the vector space of values of the vector measure. Recall that a partition
of A is a finite set, $\{F_1, \cdots, F_m\} \subseteq S$ such that $\cup_{i=1}^{m} F_i = A$. The same theorem
about |F | proved in the case of complex valued measures holds in this context with
the same proof. For completeness, it is included here.
Proof: Let $E_1$ and $E_2$ be sets of S such that $E_1 \cap E_2 = \emptyset$ and let $\{A^i_1, \cdots, A^i_{n_i}\} = \pi(E_i)$,
a partition of $E_i$ which is chosen such that
\[
|F|(E_i) - \varepsilon < \sum_{j=1}^{n_i} \|F(A^i_j)\|, \quad i = 1, 2.
\]
21.7. VECTOR MEASURES 613
Consider the sets which are contained in either of π (E1 ) or π (E2 ) , it follows this
collection of sets is a partition of E1 ∪ E2 which is denoted here by π(E1 ∪ E2 ).
Then by the above inequality and the definition of total variation,
\[
|F|(E_1 \cup E_2) \ge \sum_{F\in\pi(E_1\cup E_2)} \|F(F)\| > |F|(E_1) + |F|(E_2) - 2\varepsilon,
\]
Since n is arbitrary,
\[
|F|\left( \cup_{j=1}^{\infty} E_j \right) = \sum_{j=1}^{\infty} |F|(E_j)
\]
Definition 21.46 A Banach space is said to have the Radon Nikodym property if
whenever
(Ω, S, µ) is a finite measure space
F : S → X is a vector measure with |F | (Ω) < ∞
$F \ll \mu$
then one may conclude there exists $g \in L^1(\Omega;X)$ such that
\[
F(E) = \int_E g(s)\, d\mu
\]
for all E ∈ S.
Some Banach spaces have the Radon Nikodym property and some don’t. No
attempt is made to give a complete answer to the question of which Banach spaces
have this property but the next theorem gives examples of many spaces which do.
Theorem 21.47 Suppose $X'$ is a separable dual space. Then $X'$ has the Radon
Nikodym property.
Proof: Let $F \ll \mu$ and let $|F|(\Omega) < \infty$ for $F : S \to X'$, a vector measure. Pick
x ∈ X and consider the map
E → F (E) (x)
for E ∈ S. This defines a complex measure which is absolutely continuous with
respect to |F |. Therefore, by the Radon Nikodym theorem, there exists $f_x \in L^1(\Omega, |F|)$ such that
\[
F(E)(x) = \int_E f_x(s)\, d|F|. \tag{21.42}
\]
Claim: |fx (s)| ≤ ||x|| for |F | a.e. s.
Proof of claim: Consider the closed ball in $\mathbb{F}$, $B(0, \|x\|)$, and let $B \equiv B(p, r)$
be an open ball contained in its complement. Let $f_x^{-1}(B) \equiv E \in S$. I want to
argue that $|F|(E) = 0$ so suppose $|F|(E) > 0$. Then
which contradicts 21.43 because $B(p, r)$ was given to have empty intersection with
$B(0, \|x\|)$. Therefore, $|F|(E) = 0$ as hoped. Now $\mathbb{F} \setminus B(0, \|x\|)$ can be covered by
countably many such balls and so $|F|\left( \mathbb{F} \setminus B(0, \|x\|) \right) = 0$.
N1 ≡ ∪x∈D Nx .
Thus
|F | (N1 ) = 0.
For any E ∈ S, x, y ∈ D, and $a, b \in \mathbb{F}$,
\[
\int_E f_{ax+by}(s)\, d|F| = F(E)(ax+by) = aF(E)(x) + bF(E)(y)
= \int_E \left( a f_x(s) + b f_y(s) \right) d|F|. \tag{21.44}
\]
E
for $|F|$ a.e. s and x, y ∈ D. Let $\tilde{D}$ consist of all finite linear combinations of the
form $\sum_{i=1}^{m} a_i x_i$ where $a_i$ is a rational point of $\mathbb{F}$ and $x_i \in D$. If
\[
\sum_{i=1}^{m} a_i x_i \in \tilde{D},
\]
|F | (N2 ) = 0
\[
h_x(s) \equiv \lim_{x' \to x} \{ \tilde{h}_{x'}(s) : x' \in \tilde{D} \}.
\]
This is well defined because if $x'$ and $y'$ are elements of $\tilde{D}$, the above claim and
21.45 imply
\[
\left| \tilde{h}_{x'}(s) - \tilde{h}_{y'}(s) \right| = \left| \tilde{h}_{(x'-y')}(s) \right| \le \|x' - y'\|.
\]
Using 21.45, the dominated convergence theorem may be applied to conclude that
for xn → x, with xn ∈ D̃,
\[
\int_E h_x(s)\, d|F| = \lim_{n\to\infty} \int_E \tilde{h}_{x_n}(s)\, d|F| = \lim_{n\to\infty} F(E)(x_n) = F(E)(x). \tag{21.46}
\]
|hx (s)| ≤ ||x|| , hax+by (s) = ahx (s) + bhy (s), (21.47)
Therefore,
\[
\int_\Omega \|\theta(s)\|\, d|F| < \infty
\]
so $\theta \in L^1(\Omega;X')$. By 21.6, if E ∈ S,
\[
\int_E h_x(s)\, d|F| = \int_E \theta(s)(x)\, d|F| = \left( \int_E \theta(s)\, d|F| \right)(x). \tag{21.48}
\]
Corollary 21.48 Any separable reflexive Banach space has the Radon Nikodym
property.
It is not necessary to assume separability in the above corollary. For the proof
of a more general result, consult Vector Measures by Diestel and Uhl, [16].
||θx||Y = ||x||X .
Theorem 21.50 Let X be any Banach space and let (Ω, S, µ) be a finite measure
space. Let p ≥ 1 and let $1/p + 1/p' = 1$. (If p = 1, $p' \equiv \infty$.) Then $L^{p'}(\Omega;X')$ is
isometric to a subspace of $(L^p(\Omega;X))'$. Also, for $g \in L^{p'}(\Omega;X')$,
\[
\sup_{\|f\|_p \le 1} \left| \int_\Omega g(s)(f(s))\, d\mu \right| = \|g\|_{p'}.
\]
Proof: First observe that for $f \in L^p(\Omega;X)$ and $g \in L^{p'}(\Omega;X')$,
\[
s \to g(s)(f(s))
\]
by
\[
\theta g(f) \equiv \int_\Omega g(s)(f(s))\, d\mu.
\]
||θg|| = ||g||.
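\[
\cup_{i=1}^{m} E_i = \Omega.
\]
Then $\|g\| \in L^{p'}(\Omega)$. Let ε > 0 be given. By the scalar Riesz representation
theorem, there exists $h \in L^p(\Omega)$ such that $\|h\|_p = 1$ and
\[
\int_\Omega \|g(s)\|_{X'}\, h(s)\, d\mu \ge \|g\|_{L^{p'}(\Omega;X')} - \varepsilon.
\]
\[
= \sum_{i=1}^{m} \left( \int_{E_i} |h(s)|^p\, d\mu \right) \|d_i\|_X^p \le \int_\Omega |h|^p\, d\mu = 1.
\]
Also
\[
\|\theta g\| \ge |\theta g(f)| = \left| \int_\Omega g(s)(f(s))\, d\mu \right|
\ge \left| \int_\Omega \sum_{i=1}^{m} \left( \|c_i\|_{X'} - \varepsilon / \|h\|_{L^1(\Omega)} \right) h(s)\, \mathcal{X}_{E_i}(s)\, d\mu \right|
\]
\[
\ge \left| \int_\Omega \|g(s)\|_{X'}\, h(s)\, d\mu \right| - \varepsilon \left| \int_\Omega h(s) / \|h\|_{L^1(\Omega)}\, d\mu \right|
\]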
Theorem 21.51 If X is a Banach space and X 0 has the Radon Nikodym property,
then if (Ω, S, µ) is a finite measure space,
\[
(L^p(\Omega;X))' \cong L^{p'}(\Omega;X')
\]
Lemma 21.52 F defined above is a vector measure with values in $X'$ and $|F|(\Omega) < \infty$.
\[
\le \|l\| \sup_{\|x\| \le 1} \|\mathcal{X}_E(\cdot)\, x\|_{L^p(\Omega;X)} \le \|l\|\, \mu(E)^{1/p}.
\]
Let $\{E_i\}_{i=1}^{\infty}$ be a sequence of disjoint elements of S and let $E = \cup_{n<\infty} E_n$.
\[
\left| F(E)(x) - \sum_{k=1}^{n} F(E_k)(x) \right| = \left| l(\mathcal{X}_E(\cdot)\, x) - \sum_{i=1}^{n} l(\mathcal{X}_{E_i}(\cdot)\, x) \right| \tag{21.51}
\]
\[
\le \|l\| \left\| \mathcal{X}_E(\cdot)\, x - \sum_{i=1}^{n} \mathcal{X}_{E_i}(\cdot)\, x \right\|_{L^p(\Omega;X)}
\le \|l\|\, \mu\left( \bigcup_{k>n} E_k \right)^{1/p} \|x\|.
\]
Thus
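\[
-\varepsilon + \sum_{i=1}^{n} \|F(H_i)\| < \sum_{i=1}^{n} l(\mathcal{X}_{H_i}(\cdot)\, x_i)
\le \|l\| \left\| \sum_{i=1}^{n} \mathcal{X}_{H_i}(\cdot)\, x_i \right\|_{L^p(\Omega;X)}
\le \|l\| \left( \int_\Omega \sum_{i=1}^{n} \mathcal{X}_{H_i}(s)\, d\mu \right)^{1/p} = \|l\|\, \mu(\Omega)^{1/p}.
\]
Since the partition was arbitrary, this shows $|F|(\Omega) \le \|l\|\, \mu(\Omega)^{1/p}$ and this proves
the lemma.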
Continuing with the proof of Theorem 21.51, note that
\[
F \ll \mu.
\]
Since $X'$ has the Radon Nikodym property, there exists $g \in L^1(\Omega;X')$ such that
\[
F(E) = \int_E g(s)\, d\mu.
\]
\[
= \sum_{i=1}^{n} F(E_i)(x_i) = \sum_{i=1}^{n} \int_{E_i} g(s)(x_i)\, d\mu. \tag{21.52}
\]
Let
Gn ≡ {s : ||g (s)||X 0 ≤ n}
and let
j : Lp (Gn ; X) → Lp (Ω; X)
be given by
\[
jh(s) = \begin{cases} h(s) & \text{if } s \in G_n, \\ 0 & \text{if } s \notin G_n. \end{cases}
\]
Letting h be a simple function in $L^p(G_n;X)$,
\[
j^* l(h) = l(jh) = \int_{G_n} g(s)(h(s))\, d\mu. \tag{21.54}
\]
Since the simple functions are dense in $L^p(G_n;X)$, and $g \in L^{p'}(G_n;X')$, it follows
21.54 holds for all $h \in L^p(G_n;X)$. By Theorem 21.50,
\[
\|g\|_{L^{p'}(G_n;X')} = \|j^* l\|_{(L^p(G_n;X))'} \le \|l\|_{(L^p(\Omega;X))'}.
\]
By the monotone convergence theorem,
\[
\|g\|_{L^{p'}(\Omega;X')} = \lim_{n\to\infty} \|g\|_{L^{p'}(G_n;X')} \le \|l\|_{(L^p(\Omega;X))'}.
\]
Therefore $g \in L^{p'}(\Omega;X')$ and since simple functions are dense in $L^p(\Omega;X)$, 21.53
holds for all $h \in L^p(\Omega;X)$. Thus l = θg and the theorem is proved because, by
Theorem 21.50, ||l|| = ||g|| and the mapping θ is onto because l was arbitrary.
Corollary 21.53 If $X'$ is separable, then
\[
(L^p(\Omega;X))' \cong L^{p'}(\Omega;X').
\]
Corollary 21.54 If X is separable and reflexive, then
\[
(L^p(\Omega;X))' \cong L^{p'}(\Omega;X').
\]
Corollary 21.55 If X is separable and reflexive, then if p ∈ (1, ∞) , then Lp (Ω; X)
is reflexive.
Proof: This is just like the scalar valued case.
21.9 Exercises
1. Show L1 (R) is not reflexive. Hint: L1 (R) is separable. What about L∞ (R)?
2. If $f \in L^1(\mathbb{R}^n;X)$ for X a Banach space, does the usual fundamental theorem
of calculus work? That is, can you say
$\lim_{r\to 0} \frac{1}{m(B(x,r))} \int_{B(x,r)} f(t)\, dm = f(x)$ a.e.?
3. Does the Vitali convergence theorem hold for Bochner integrable functions?
If so, give a statement of the appropriate theorem and a proof.
Part III
Complex Analysis
The Complex Numbers
The reader is presumed familiar with the algebraic properties of complex numbers,
including the operation of conjugation. Here a short review of the distance in C is
presented.
The length of a complex number, referred to as the modulus of z and denoted
by |z| is given by
\[
|z| \equiv \left( x^2 + y^2 \right)^{1/2} = (z \bar{z})^{1/2},
\]
Then C is a metric space with the distance between two complex numbers, z and
w defined as
d (z, w) ≡ |z − w| .
This metric on C is the same as the usual metric of R2 . A sequence, zn → z if and
only if $x_n \to x$ in R and $y_n \to y$ in R where $z = x + iy$ and $z_n = x_n + i y_n$. For
example if $z_n = \frac{n}{n+1} + i \frac{1}{n}$, then $z_n \to 1 + 0i = 1$.
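The example just given can be checked numerically. The following is only an illustrative sketch using Python's built in complex type; the helper name z is hypothetical. It lists the distances |z_n − 1| for a few values of n, and these shrink as the real parts approach 1 and the imaginary parts approach 0.

```python
def z(n):
    """The n-th term z_n = n/(n+1) + i/n of the example sequence."""
    return complex(n / (n + 1), 1 / n)

# distance d(z_n, 1) = |z_n - 1| for increasing n
distances = [abs(z(n) - 1) for n in (1, 10, 100, 1000)]

# componentwise view: real parts tend to 1, imaginary parts tend to 0
real_gap = abs(z(1000).real - 1)
imag_gap = abs(z(1000).imag - 0)
```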
This is the usual definition of Cauchy sequence. There are no new ideas here.
Proposition 22.2 The complex numbers with the norm just mentioned forms a
complete normed linear space.
|z + w| ≤ |z| + |w|
626 THE COMPLEX NUMBERS
The only one of these axioms of a norm which is not completely obvious is the first
one, the triangle inequality. Let z = x + iy and w = u + iv
\[
|z + w|^2 = (z + w)(\bar{z} + \bar{w}) = |z|^2 + |w|^2 + 2 \operatorname{Re}(z \bar{w})
\le |z|^2 + |w|^2 + 2 |z \bar{w}| = (|z| + |w|)^2
\]
Definition 22.3 An infinite sum of complex numbers is defined as the limit of the
sequence of partial sums. Thus,
\[
\sum_{k=1}^{\infty} a_k \equiv \lim_{n\to\infty} \sum_{k=1}^{n} a_k .
\]
Just as in the case of sums of real numbers, an infinite sum converges if and
only if the sequence of partial sums is a Cauchy sequence.
From now on, when f is a function of a complex variable, it will be assumed that
f has values in X, a complex Banach space. Usually in complex analysis courses,
f has values in C but there are many important theorems which don’t require this
so I will leave it fairly general for a while. Later the functions will have values in
C. If you are only interested in this case, think C whenever you see X.
Just as in the case of functions of a real variable, one of the important theorems
is the Weierstrass M test. Again, there is nothing new here. It is just a review of
earlier material.
\[
|x + iy| \equiv \left( (x + iy)(x - iy) \right)^{1/2} = \left( x^2 + y^2 \right)^{1/2}
\]
which is just the usual norm in R2 identifying (x, y) with x + iy. Therefore, C is
a complete metric space topologically like R2 and so the Heine Borel theorem that
compact sets are those which are closed and bounded is valid. Thus, as far as
topology is concerned, there is nothing new about C.
The extended complex plane, denoted by $\widehat{\mathbb{C}}$, consists of the complex plane, C
along with another point not in C known as ∞. For example, ∞ could be any point
in $\mathbb{R}^3$. A sequence of complex numbers, $z_n$, converges to ∞ if, whenever K is a
compact set in C, there exists a number, N such that for all n > N, $z_n \notin K$. Since
compact sets in C are closed and bounded, this is equivalent to saying that for all
R > 0, there exists N such that if n > N, then $z_n \notin B(0, R)$ which is the same as
saying $\lim_{n\to\infty} |z_n| = \infty$ where this last symbol has the same meaning as it does in
calculus.
A geometric way of understanding this in terms of more familiar objects involves
a concept known as the Riemann sphere.
Consider the unit sphere, $S^2$ given by $(z - 1)^2 + y^2 + x^2 = 1$. Define a map
from the complex plane to the surface of this sphere as follows. Extend a line
from the point, p in the complex plane to the point (0, 0, 2) on the top of this
sphere and let θ (p) denote the point of this sphere which the line intersects. Define
θ (∞) ≡ (0, 0, 2).
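[Figure: the sphere with top point (0, 0, 2) and center (0, 0, 1), a point p in the plane C, and the point θ(p) where the line from p to (0, 0, 2) meets the sphere.]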
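The map θ described above can be written down explicitly. The sketch below is an illustration, not part of the text: it parametrizes the line from (a, b, 0) to (0, 0, 2) as ((1 − t)a, (1 − t)b, 2t), substitutes into the equation of the sphere, and solves for the nonzero intersection, which gives t = |p|^2 / (|p|^2 + 4). The function names theta and on_sphere are hypothetical.

```python
def theta(p):
    """Image on the sphere x^2 + y^2 + (z - 1)^2 = 1 of the point p = a + ib in C.

    Derived (as an illustration) by intersecting the line through (a, b, 0)
    and (0, 0, 2) with the sphere; the intersection parameter is t = r2 / (r2 + 4).
    """
    a, b = p.real, p.imag
    r2 = a * a + b * b
    return (4 * a / (r2 + 4), 4 * b / (r2 + 4), 2 * r2 / (r2 + 4))

def on_sphere(q, tol=1e-12):
    """Check that q = (x, y, z) satisfies the equation of the sphere up to tol."""
    x, y, z = q
    return abs(x * x + y * y + (z - 1) ** 2 - 1) < tol

origin_image = theta(0j)        # the point of tangency with the plane
equator_image = theta(2 + 0j)   # |p| = 2 lands at height 1
far_image = theta(1e6 + 0j)     # large |p| lands near (0, 0, 2)
```

The last line reflects the definition θ(∞) ≡ (0, 0, 2): as |p| grows, θ(p) approaches the top of the sphere.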
22.2 Exercises
1. Prove the root test for series of complex numbers. If $a_k \in \mathbb{C}$ and
$r \equiv \limsup_{n\to\infty} |a_n|^{1/n}$ then
\[
\sum_{k=0}^{\infty} a_k \quad \begin{cases} \text{converges absolutely} & \text{if } r < 1 \\ \text{diverges} & \text{if } r > 1 \\ \text{test fails} & \text{if } r = 1. \end{cases}
\]
2. Does $\lim_{n\to\infty} n \left( \frac{2+i}{3} \right)^n$ exist? Tell why and find the limit if it does exist.
3. Let $A_0 = 0$ and let $A_n \equiv \sum_{k=1}^{n} a_k$ if n > 0. Prove the partial summation
formula,
\[
\sum_{k=p}^{q} a_k b_k = A_q b_q - A_{p-1} b_p + \sum_{k=p}^{q-1} A_k (b_k - b_{k+1}) .
\]
Now using this formula, suppose $\{b_n\}$ is a sequence of real numbers which
converges to 0 and is decreasing. Determine those values of ω such that
$|\omega| = 1$ and $\sum_{k=1}^{\infty} b_k \omega^k$ converges.
4. Let f : U ⊆ C → C be given by f (x + iy) = u (x, y) + iv (x, y) . Show f is
continuous on U if and only if u : U → R and v : U → R are both continuous.
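Before proving the partial summation formula of Problem 3 it may help to see it hold on a concrete example. The following sketch, which is not a proof, evaluates both sides in exact rational arithmetic; the function name both_sides and the sample sequences are assumptions chosen only for illustration.

```python
from fractions import Fraction

def both_sides(a, b, p, q):
    """Return (lhs, rhs) of the partial summation formula for 1 <= p <= q.

    a and b are lists whose entry 0 is an unused placeholder, so that a[k], b[k]
    match the indexing a_k, b_k of the text. A[0] = 0 and A[n] = a[1] + ... + a[n].
    """
    A = [Fraction(0)]
    for k in range(1, len(a)):
        A.append(A[-1] + a[k])
    lhs = sum(a[k] * b[k] for k in range(p, q + 1))
    rhs = A[q] * b[q] - A[p - 1] * b[p] + sum(A[k] * (b[k] - b[k + 1]) for k in range(p, q))
    return lhs, rhs

# sample data: a_k = (-1)^k / k and b_k = 1 / k^2 for k = 1, ..., 8
a = [None] + [Fraction((-1) ** k, k) for k in range(1, 9)]
b = [None] + [Fraction(1, k * k) for k in range(1, 9)]
lhs, rhs = both_sides(a, b, 3, 7)
lhs_full, rhs_full = both_sides(a, b, 1, 8)
```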
Riemann Stieltjes Integrals
In the theory of functions of a complex variable, the most important results are those
involving contour integration. I will base this on the notion of Riemann Stieltjes
integrals as in [13], [39], and [27]. The Riemann Stieltjes integral is a generalization
of the usual Riemann integral and requires the concept of a function of bounded
variation.
where the sums are taken over all possible lists, {a = t0 < · · · < tn = b} .
The idea is that it makes sense to talk of the length of the curve γ ([a, b]) , defined
as V (γ, [a, b]) . For this reason, in the case that γ is continuous, such an image of a
bounded variation function is called a rectifiable curve.
where $\tau_j \in [t_{j-1}, t_j]$. (Note this notation is a little sloppy because it does not identify
the specific point, $\tau_j$ used. It is understood that this point is arbitrary.) Define
$\int_\gamma f\, d\gamma$ as the unique number which satisfies the following condition. For all ε > 0
there exists a δ > 0 such that if ||P|| ≤ δ, then
\[
\left| \int_\gamma f\, d\gamma - S(P) \right| < \varepsilon.
\]
630 RIEMANN STIELTJES INTEGRALS
The set of points in the curve, γ ([a, b]) will be denoted sometimes by γ ∗ .
exists, so does
\[
\int_{\gamma\circ\phi} f\, d(\gamma\circ\phi)
\]
and
\[
\int_\gamma f\, d\gamma = \int_{\gamma\circ\phi} f\, d(\gamma\circ\phi) . \tag{23.1}
\]
Proof: There exists δ > 0 such that if P is a partition of [a, b] such that ||P|| < δ,
then
\[
\left| \int_\gamma f\, d\gamma - S(P) \right| < \varepsilon.
\]
exists.
This theorem shows that $\int_\gamma f\, d\gamma$ is independent of the particular γ used in its
computation to the extent that if φ is any nondecreasing function from another
interval, [c, d] , mapping to [a, b] , then the same value is obtained by replacing γ
with γ ◦ φ.
The fundamental result in this subject is the following theorem.
n
X
+f (γ (σ ∗ )) (γ (tp ) − γ (t∗ )) + f (γ (σ j )) (γ (tj ) − γ (tj−1 )) ,
j=p+1
p−1
X
S (P) ≡ f (γ (τ j )) (γ (tj ) − γ (tj−1 )) +
j=1
n
X
+ f (γ (τ j )) (γ (tj ) − γ (tj−1 )) .
j=p+1
Therefore,
\[
|S(P) - S(Q)| \le \sum_{j=1}^{p-1} \frac{1}{m} |\gamma(t_j) - \gamma(t_{j-1})| + \frac{1}{m} |\gamma(t^*) - \gamma(t_{p-1})|
+ \frac{1}{m} |\gamma(t_p) - \gamma(t^*)| + \sum_{j=p+1}^{n} \frac{1}{m} |\gamma(t_j) - \gamma(t_{j-1})| \le \frac{1}{m} V(\gamma, [a,b]) . \tag{23.4}
\]
Clearly the extreme inequalities would be valid in 23.4 if Q had more than one
extra point. You simply do the above trick more than one time. Let S (P) and
S (Q) be Riemann Stieltjes sums for which ||P|| and ||Q|| are less than $\delta_m$ and let
R ≡ P ∪ Q. Then from what was just observed,
\[
|S(P) - S(Q)| \le |S(P) - S(R)| + |S(R) - S(Q)| \le \frac{2}{m} V(\gamma, [a,b]) .
\]
and this shows 23.3 which proves 23.2. Therefore, there exists a unique complex
number, $I \in \cap_{m=1}^{\infty} F_m$ which satisfies the definition of $\int_\gamma f\, d\gamma$. This proves the
theorem.
The following theorem follows easily from the above definitions and theorem.
Proof: Let 23.5 hold. From the proof of the above theorem, when ||P|| < δ m ,
\[
\left\| \int_\gamma f\, d\gamma - S(P) \right\| \le \frac{2}{m} V(\gamma, [a,b])
\]
and so
\[
\left\| \int_\gamma f\, d\gamma \right\| \le \|S(P)\| + \frac{2}{m} V(\gamma, [a,b])
\le M \sum_{j=1}^{n} |\gamma(t_j) - \gamma(t_{j-1})| + \frac{2}{m} V(\gamma, [a,b])
\le M V(\gamma, [a,b]) + \frac{2}{m} V(\gamma, [a,b]) .
\]
This proves 23.6 since m is arbitrary. To verify 23.7 use the above inequality to
write
\[
\left\| \int_\gamma f\, d\gamma - \int_\gamma f_n\, d\gamma \right\| = \left\| \int_\gamma (f - f_n)\, d\gamma(t) \right\|
\]
Lemma 23.6 Let γ : [a, b] → C be in $C^1([a,b])$. Then V (γ, [a, b]) < ∞ so γ is of
bounded variation.
Therefore it follows $V(\gamma, [a,b]) \le \|\gamma'\|_\infty (b - a)$. Here $\|\gamma\|_\infty = \max\{ |\gamma(t)| : t \in [a,b] \}$.
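The estimate V (γ, [a, b]) ≤ ||γ′||∞ (b − a) is easy to see in an example. The sketch below assumes, for illustration only, the curve γ(t) = e^{it} on [0, 2π], for which |γ′(t)| = 1; it computes one of the sums whose supremum defines V (γ, [a, b]) and compares it with the bound. The names variation_sum and sup_dgamma are hypothetical.

```python
import cmath
import math

def variation_sum(gamma, partition):
    """Sum of |gamma(t_j) - gamma(t_{j-1})| over consecutive points of the partition."""
    return sum(abs(gamma(t1) - gamma(t0)) for t0, t1 in zip(partition, partition[1:]))

a, b = 0.0, 2 * math.pi

def gamma(t):
    return cmath.exp(1j * t)   # unit circle, so gamma'(t) = i e^{it}

sup_dgamma = 1.0               # ||gamma'||_infinity for this particular curve
n = 1000
partition = [a + (b - a) * j / n for j in range(n + 1)]

approx_V = variation_sum(gamma, partition)   # a lower estimate for V(gamma, [a, b])
bound = sup_dgamma * (b - a)                 # the upper bound from the lemma
```

Every such sum is at most the bound, and for this curve refining the partition pushes the sum up toward 2π, the length of the circle.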
Therefore,
\[
\gamma_h(b) = \frac{1}{2h} \int_b^{b+2h} \gamma(s)\, ds = \gamma(b) ,
\]
\[
\gamma_h(a) = \frac{1}{2h} \int_{a-2h}^{a} \gamma(s)\, ds = \gamma(a) .
\]
Proof: Let a = t0 < t1 < · · · < tn = b. Then using the definition of γ h and
changing the variables to make all integrals over [0, 2h] ,
\[
\sum_{j=1}^{n} |\gamma_h(t_j) - \gamma_h(t_{j-1})| =
\sum_{j=1}^{n} \left| \frac{1}{2h} \int_0^{2h} \left[ \gamma\left( s - 2h + t_j + \frac{2h}{b-a}(t_j - a) \right)
- \gamma\left( s - 2h + t_{j-1} + \frac{2h}{b-a}(t_{j-1} - a) \right) \right] ds \right|
\]
\[
\le \frac{1}{2h} \int_0^{2h} \sum_{j=1}^{n} \left| \gamma\left( s - 2h + t_j + \frac{2h}{b-a}(t_j - a) \right)
- \gamma\left( s - 2h + t_{j-1} + \frac{2h}{b-a}(t_{j-1} - a) \right) \right| ds.
\]
For a given s ∈ [0, 2h], the points, $s - 2h + t_j + \frac{2h}{b-a}(t_j - a)$ for j = 1, · · ·, n form
an increasing list of points in the interval [a − 2h, b + 2h] and so the integrand is
bounded above by V (γ, [a − 2h, b + 2h]) = V (γ, [a, b]) . It follows
\[
\sum_{j=1}^{n} |\gamma_h(t_j) - \gamma_h(t_{j-1})| \le V(\gamma, [a,b])
\]
and $S_h(P)$ is a similar Riemann Stieltjes sum taken with respect to $\gamma_h$ instead of
γ. Because of 23.11 $\gamma_h(t)$ has values in H ⊆ Ω. Therefore, fix the partition, P, and
choose h small enough that in addition to this, the following inequality is valid for
all z ∈ K.
\[
|S(P) - S_h(P)| < \frac{\varepsilon}{3}
\]
This is possible because of 23.11 and the uniform continuity of f on H × K. It
follows
\[
\left\| \int_\gamma f(\cdot, z)\, d\gamma(t) - \int_{\gamma_h} f(\cdot, z)\, d\gamma_h(t) \right\|
\le \left\| \int_\gamma f(\cdot, z)\, d\gamma(t) - S(P) \right\| + \|S(P) - S_h(P)\|
+ \left\| S_h(P) - \int_{\gamma_h} f(\cdot, z)\, d\gamma_h(t) \right\| < \varepsilon.
\]
Formula 23.10 follows from the lemma. This proves the theorem.
Of course the same result is obtained without the explicit dependence of f on z.
This is a very useful theorem because if γ is $C^1([a,b])$, it is easy to calculate
$\int_\gamma f\, d\gamma$ and the above theorem allows a reduction to the case where γ is $C^1$. The
next theorem shows how easy it is to compute these integrals in the case where γ is
$C^1$. First note that if f is continuous and $\gamma \in C^1([a,b])$, then by Lemma 23.6 and
the fundamental existence theorem, Theorem 23.4, $\int_\gamma f\, d\gamma$ exists.
Proof: Let P be a partition of [a, b], P = {t0 , · · ·, tn } and ||P|| is small enough
that whenever |t − s| < ||P|| ,
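and
\[
\left\| \int_\gamma f\, d\gamma - \sum_{j=1}^{n} f(\gamma(\tau_j)) \left( \gamma(t_j) - \gamma(t_{j-1}) \right) \right\| < \varepsilon.
\]
Now
\[
\sum_{j=1}^{n} f(\gamma(\tau_j)) \left( \gamma(t_j) - \gamma(t_{j-1}) \right) = \int_a^b \sum_{j=1}^{n} f(\gamma(\tau_j))\, \mathcal{X}_{[t_{j-1}, t_j]}(s)\, \gamma'(s)\, ds
\]
where here
\[
\mathcal{X}_{[a,b]}(s) \equiv \begin{cases} 1 & \text{if } s \in [a,b] \\ 0 & \text{if } s \notin [a,b] \end{cases} .
\]
Also,
\[
\int_a^b f(\gamma(s))\, \gamma'(s)\, ds = \int_a^b \sum_{j=1}^{n} f(\gamma(s))\, \mathcal{X}_{[t_{j-1}, t_j]}(s)\, \gamma'(s)\, ds
\]
\[
\le \sum_{j=1}^{n} \int_{t_{j-1}}^{t_j} \|f(\gamma(\tau_j)) - f(\gamma(s))\|\, |\gamma'(s)|\, ds \le \|\gamma'\|_\infty \sum_j \varepsilon (t_j - t_{j-1})
= \varepsilon \|\gamma'\|_\infty (b - a) .
\]
It follows that
\[
\left\| \int_\gamma f\, d\gamma - \int_a^b f(\gamma(s))\, \gamma'(s)\, ds \right\|
\le \left\| \int_\gamma f\, d\gamma - \sum_{j=1}^{n} f(\gamma(\tau_j)) \left( \gamma(t_j) - \gamma(t_{j-1}) \right) \right\|
\]
\[
+ \left\| \sum_{j=1}^{n} f(\gamma(\tau_j)) \left( \gamma(t_j) - \gamma(t_{j-1}) \right) - \int_a^b f(\gamma(s))\, \gamma'(s)\, ds \right\|
\le \varepsilon \|\gamma'\|_\infty (b - a) + \varepsilon.
\]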
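The estimate above says that for a C^1 curve the Riemann Stieltjes sums and the ordinary integral of f (γ (s)) γ′ (s) are close. Here is a small numerical sketch of this, under assumptions made only for illustration: f (z) = z^2, γ (t) = e^{it} on [0, π], midpoints for the points τ_j, and a midpoint rule for the ordinary integral. The function names are hypothetical. Since z^3/3 is a primitive of z^2, both numbers should be near (γ(π)^3 − γ(0)^3)/3 = −2/3.

```python
import cmath
import math

def f(z):
    return z * z

def gamma(t):
    return cmath.exp(1j * t)

def dgamma(t):
    return 1j * cmath.exp(1j * t)   # derivative of gamma

def stieltjes_sum(n, a=0.0, b=math.pi):
    """Riemann Stieltjes sum of f with respect to gamma, tau_j = midpoint of [t_{j-1}, t_j]."""
    ts = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(f(gamma((t0 + t1) / 2)) * (gamma(t1) - gamma(t0)) for t0, t1 in zip(ts, ts[1:]))

def parametrized_integral(n, a=0.0, b=math.pi):
    """Midpoint rule approximation of the integral of f(gamma(s)) gamma'(s) over [a, b]."""
    h = (b - a) / n
    return sum(f(gamma(a + (j + 0.5) * h)) * dgamma(a + (j + 0.5) * h) * h for j in range(n))

S = stieltjes_sum(2000)
I_param = parametrized_integral(2000)
exact = (gamma(math.pi) ** 3 - gamma(0.0) ** 3) / 3
```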
The following lemma is useful and follows quickly from Theorem 23.3.
Lemma 23.11 In the above definition, there exists a continuous bounded variation
function, γ defined on some closed interval, [c, d] , such that
$\gamma([c,d]) = \cup_{k=1}^{m} \gamma_k([a_k, b_k])$ and $\gamma(c) = \gamma_1(a_1)$ while $\gamma(d) = \gamma_m(b_m)$. Furthermore,
\[
\int_\gamma f(z)\, dz = \sum_{k=1}^{m} \int_{\gamma_k} f(z)\, dz.
\]
Restating Theorem 23.7 with the new notation in the above definition,
Proof: By Theorem 23.12 there exists $\eta \in C^1([a,b])$ such that γ (a) = η (a) ,
and γ (b) = η (b) such that
\[
\left\| \int_\gamma f(z)\, dz - \int_\eta f(z)\, dz \right\| < \varepsilon.
\]
Therefore,
\[
\left\| \left( F(\gamma(b)) - F(\gamma(a)) \right) - \int_\gamma f(z)\, dz \right\| < \varepsilon
\]
23.1 Exercises
1. Let γ : [a, b] → R be increasing. Show V (γ, [a, b]) = γ (b) − γ (a) .
2. Suppose γ : [a, b] → C satisfies a Lipschitz condition, |γ (t) − γ (s)| ≤ K |s − t| .
Show γ is of bounded variation and that V (γ, [a, b]) ≤ K |b − a| .
3. $\gamma : [c_0, c_m] \to \mathbb{C}$ is piecewise smooth if there exist numbers, $c_k$, k = 1, · · ·, m
such that $c_0 < c_1 < \cdots < c_{m-1} < c_m$ such that γ is continuous and
$\gamma : [c_k, c_{k+1}] \to \mathbb{C}$ is $C^1$. Show that such piecewise smooth functions are of
bounded variation and give an estimate for $V(\gamma, [c_0, c_m])$.
4. Let γ : [0, 2π] → C be given by γ (t) = r (cos mt + i sin mt) for m an integer.
Find $\int_\gamma \frac{dz}{z}$.
7. Let $f(z) \equiv |z|^{-(1/2)} e^{-i\theta/2}$ where $z = |z| e^{i\theta}$. This function is called the principal
branch of $z^{-(1/2)}$. Find $\int_\gamma f(z)\, dz$ where γ is the semicircle in the upper half
plane which goes from (1, 0) to (−1, 0) in the counter clockwise direction. Next
do the integral in which γ goes in the clockwise direction along the semicircle
in the lower half plane.
8. Prove an open set, U is connected if and only if for every two points in U,
there exists a C 1 curve having values in U which joins them.
9. Let P, Q be two partitions of [a, b] with P ⊆ Q. Each of these partitions can
be used to form an approximation to V (γ, [a, b]) as described above. Recall
the total variation was the supremum of sums of a certain form determined by
a partition. How is the sum associated with P related to the sum associated
with Q? Explain.
10. Consider the curve,
\[
\gamma(t) = \begin{cases} t + i t^2 \sin\left( \frac{1}{t} \right) & \text{if } t \in (0, 1] \\ 0 & \text{if } t = 0 \end{cases} .
\]
\[
\lim_{h\to 0} \frac{f(z+h) - f(z)}{h} \equiv f'(z)
\]
More generally, power series are analytic. This will be shown soon but first here is
an important definition and a convergence theorem called the root test.
Definition 24.2 Let $\{a_k\}$ be a sequence in X. Then $\sum_{k=1}^{\infty} a_k \equiv \lim_{n\to\infty} \sum_{k=1}^{n} a_k$
whenever this limit exists. When the limit exists, the series is said to converge.
642 FUNDAMENTALS OF COMPLEX ANALYSIS
Theorem 24.3 Consider $\sum_{k=1}^{\infty} a_k$ and let $\rho \equiv \limsup_{k\to\infty} \|a_k\|^{1/k}$. Then if ρ < 1,
the series converges absolutely and if ρ > 1 the series diverges spectacularly in the
sense that $\lim_{k\to\infty} a_k \ne 0$. If ρ = 1 the test fails. Also $\sum_{k=1}^{\infty} a_k (z-a)^k$ converges
on some disk B (a, R) . It converges absolutely if |z − a| < R and uniformly on
$B(a, r_1)$ whenever $r_1 < R$. The function $f(z) = \sum_{k=1}^{\infty} a_k (z-a)^k$ is continuous on
B (a, R) .
Proof: Suppose ρ < 1. Then there exists r ∈ (ρ, 1) . Therefore, $\|a_k\| \le r^k$ for
all k large enough and so by a comparison test, $\sum_k \|a_k\|$ converges because the
partial sums are bounded above. Therefore, the partial sums of the original series
form a Cauchy sequence in X and so they also converge due to completeness of X.
Now suppose ρ > 1. Then letting ρ > r > 1, it follows $\|a_k\|^{1/k} \ge r$ infinitely
often. Thus $\|a_k\| \ge r^k$ infinitely often. Thus there exists a subsequence for which
$\|a_{n_k}\|$ converges to ∞. Therefore, the series cannot converge.
Now consider $\sum_{k=1}^{\infty} a_k (z-a)^k$. This series converges absolutely if
\[
\limsup_{k\to\infty} \|a_k\|^{1/k}\, |z - a| < 1
\]
which is the same as saying |z − a| < 1/ρ where $\rho \equiv \limsup_{k\to\infty} \|a_k\|^{1/k}$. Let
R = 1/ρ.
Now suppose $r_1 < R$. Consider $|z - a| \le r_1$. Then for such z,
\[
\|a_k\|\, |z - a|^k \le \|a_k\|\, r_1^k
\]
and
\[
\limsup_{k\to\infty} \left( \|a_k\|\, r_1^k \right)^{1/k} = \limsup_{k\to\infty} \|a_k\|^{1/k}\, r_1 = \frac{r_1}{R} < 1
\]
so $\sum_k \|a_k\|\, r_1^k$ converges. By the Weierstrass M test, $\sum_{k=1}^{\infty} a_k (z-a)^k$ converges
uniformly for $|z - a| \le r_1$. Therefore, f is continuous on B (a, R) as claimed because
it is the uniform limit of continuous functions, the partial sums of the infinite series.
What if ρ = 0? In this case,
\[
\limsup_{k\to\infty} \|a_k\|^{1/k}\, |z - a| = 0 \cdot |z - a| = 0
\]
and so R = ∞ and the series, $\sum \|a_k\|\, |z - a|^k$ converges everywhere.
What if ρ = ∞? Then in this case, the series converges only at z = a because if
$z \ne a$,
\[
\limsup_{k\to\infty} \|a_k\|^{1/k}\, |z - a| = \infty.
\]
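The number ρ, and hence the radius R = 1/ρ, can be estimated numerically for a given coefficient sequence. The sketch below is only a heuristic, not a computation of an actual lim sup: it takes the largest value of |a_k|^{1/k} over a finite window of large k. The function name root_test_estimate, the window, and the two sample sequences are assumptions made for illustration.

```python
def root_test_estimate(coeff, k_min=200, k_max=400):
    """Heuristic estimate of limsup |a_k|^(1/k): the max of |a_k|^(1/k) for k in a tail window.

    This is only a finite stand-in for the lim sup and can be misleading for
    sequences whose large terms are sparse or occur beyond k_max.
    """
    return max(abs(coeff(k)) ** (1.0 / k) for k in range(k_min, k_max + 1))

# a_k = 2^k: |a_k|^(1/k) = 2 for every k, so rho = 2 and R = 1/2
rho_geom = root_test_estimate(lambda k: 2.0 ** k)

# a_k = 3^k for even k and 1 for odd k: the lim sup picks out the even terms, rho = 3
rho_alt = root_test_estimate(lambda k: 3.0 ** k if k % 2 == 0 else 1.0)

R_geom = 1.0 / rho_geom
R_alt = 1.0 / rho_alt
```

The second sequence shows why the lim sup rather than the limit appears in the theorem: the limit of |a_k|^{1/k} does not exist there, yet the radius of convergence is still 1/3.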
Theorem 24.4 Let $f(z) \equiv \sum_{k=1}^{\infty} a_k (z-a)^k$ be given in Theorem 24.3 where R >
0. Then f is analytic on B (a, R) . So are all its derivatives.
24.1. ANALYTIC FUNCTIONS 643
Proof: Consider $g(z) = \sum_{k=1}^{\infty} a_k k (z-a)^{k-1}$ on B (a, R) where $R = \rho^{-1}$ as
above. Let $r_1 < r < R$. Then letting $|z - a| < r_1$ and $|h| < r - r_1$,
\[
\begin{aligned}
\left\| \frac{f(z+h) - f(z)}{h} - g(z) \right\|
&\le \sum_{k=2}^{\infty} \|a_k\| \left| \frac{(z+h-a)^k - (z-a)^k}{h} - k (z-a)^{k-1} \right| \\
&\le \sum_{k=2}^{\infty} \|a_k\| \left| \frac{1}{h} \left( \sum_{i=0}^{k} \binom{k}{i} (z-a)^{k-i} h^i - (z-a)^k \right) - k (z-a)^{k-1} \right| \\
&= \sum_{k=2}^{\infty} \|a_k\| \left| \frac{1}{h} \left( \sum_{i=1}^{k} \binom{k}{i} (z-a)^{k-i} h^i \right) - k (z-a)^{k-1} \right| \\
&\le \sum_{k=2}^{\infty} \|a_k\| \left| \sum_{i=2}^{k} \binom{k}{i} (z-a)^{k-i} h^{i-1} \right| \\
&\le |h| \sum_{k=2}^{\infty} \|a_k\| \left( \sum_{i=0}^{k-2} \binom{k}{i+2} |z-a|^{k-2-i} |h|^i \right) \\
&= |h| \sum_{k=2}^{\infty} \|a_k\| \left( \sum_{i=0}^{k-2} \binom{k-2}{i} \frac{k(k-1)}{(i+2)(i+1)} |z-a|^{k-2-i} |h|^i \right) \\
&\le |h| \sum_{k=2}^{\infty} \|a_k\| \frac{k(k-1)}{2} \left( \sum_{i=0}^{k-2} \binom{k-2}{i} |z-a|^{k-2-i} |h|^i \right) \\
&= |h| \sum_{k=2}^{\infty} \|a_k\| \frac{k(k-1)}{2} \left( |z-a| + |h| \right)^{k-2}
< |h| \sum_{k=2}^{\infty} \|a_k\| \frac{k(k-1)}{2} r^{k-2} .
\end{aligned}
\]
Then
\[
\limsup_{k\to\infty} \left( \|a_k\| \frac{k(k-1)}{2} r^{k-2} \right)^{1/k} = \rho r < 1
\]
and so
\[
\left\| \frac{f(z+h) - f(z)}{h} - g(z) \right\| \le C |h| .
\]
Therefore, $g(z) = f'(z)$. Now by Theorem 24.3 it also follows that $f'$ is continuous.
Since $r_1 < R$ was arbitrary, this shows that $f'(z)$ is given by the differentiated
series above for |z − a| < R. Now a repeat of the argument shows all the derivatives
of f exist and are continuous on B (a, R).
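\[
\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} .
\]
Furthermore,
\[
f'(z) = \frac{\partial u}{\partial x}(x, y) + i \frac{\partial v}{\partial x}(x, y) .
\]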
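Proof: Suppose f is analytic first. Then letting t ∈ R,
\[
f'(z) = \lim_{t\to 0} \frac{f(z+t) - f(z)}{t}
= \lim_{t\to 0} \left( \frac{u(x+t, y) + i v(x+t, y)}{t} - \frac{u(x, y) + i v(x, y)}{t} \right)
= \frac{\partial u(x,y)}{\partial x} + i \frac{\partial v(x,y)}{\partial x} .
\]
But also
\[
f'(z) = \lim_{t\to 0} \frac{f(z+it) - f(z)}{it}
= \lim_{t\to 0} \left( \frac{u(x, y+t) + i v(x, y+t)}{it} - \frac{u(x, y) + i v(x, y)}{it} \right)
\]
\[
= \frac{1}{i} \left( \frac{\partial u(x,y)}{\partial y} + i \frac{\partial v(x,y)}{\partial y} \right)
= \frac{\partial v(x,y)}{\partial y} - i \frac{\partial u(x,y)}{\partial y} .
\]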
This verifies the Cauchy Riemann equations. We are assuming that z → f 0 (z) is
continuous. Therefore, the partial derivatives of u and v are also continuous. To see
this, note that from the formulas for f 0 (z) given above, and letting z1 = x1 + iy1
\[
\left| \frac{\partial v(x,y)}{\partial y} - \frac{\partial v(x_1,y_1)}{\partial y} \right| \le |f'(z) - f'(z_1)| ,
\]
f (z + h) − f (z) = u (x + h1 , y + h2 )
\[
f'(z) = \frac{\partial u}{\partial x}(x, y) + i \frac{\partial v}{\partial x}(x, y) .
\]
It follows from this formula and the assumption that u, v are C 1 (Ω) that f 0 is
continuous.
It is routine to verify that all the usual rules of derivatives hold for analytic
functions. In particular, the product rule, the chain rule, and quotient rule.
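The Cauchy Riemann equations and the formula f′ (z) = ∂u/∂x + i ∂v/∂x can be checked numerically for a particular analytic function. The sketch below assumes, for illustration only, f (z) = z^3 and approximates the partial derivatives by central differences; the function name partials, the step size h, and the sample point are hypothetical choices.

```python
def partials(f, x, y, h=1e-5):
    """Central difference approximations (u_x, v_x, u_y, v_y) at the point x + iy."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)   # u_x + i v_x
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)   # u_y + i v_y
    return fx.real, fx.imag, fy.real, fy.imag

def f(z):
    return z ** 3

x, y = 0.7, -0.4
ux, vx, uy, vy = partials(f, x, y)

cr_first = ux - vy            # should be near 0: u_x = v_y
cr_second = uy + vx           # should be near 0: u_y = -v_x
fprime = complex(ux, vx)      # f'(z) = u_x + i v_x
expected = 3 * complex(x, y) ** 2
```

Replacing f by a function which is not analytic, such as the conjugate of z, makes cr_first far from zero, which is a quick way to see that the equations really do single out the analytic functions.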
Lemma 24.6 Let γ denote the closed curve which is a circle of radius r centered
at $z_0$. Then a parameterization of this curve is $\gamma(t) = z_0 + r e^{it}$ where t ∈ [0, 2π] .
Proof: $|\gamma(t) - z_0|^2 = \left| r e^{it}\, r e^{-it} \right| = r^2$. Also, you can see from the definition of
the sine and cosine that the point described in this way moves counter clockwise
over this circle.
24.2 Exercises
1. Verify all the usual rules of differentiation including the product and chain
rules.
and has magnitude equal to the product of the sine of the included angle
times the product of the two norms of the vectors. In this case, the cross
product either points in the direction of the positive z axis or in the direction
of the negative z axis. Thus, either the vectors ⟨x, y, 0⟩, ⟨a, b, 0⟩, k form a right
handed system or the vectors ⟨a, b, 0⟩, ⟨x, y, 0⟩, k form a right handed system.
These are the two possible orientations. Show that in the situation of Problem
8 the orientation of $\gamma'(t_0), \eta'(s_0), k$ is the same as the orientation of the
vectors $(f \circ \gamma)'(t_0), (f \circ \eta)'(s_0), k$. Such mappings are called conformal. If f
is analytic and $f'(z) \ne 0$, then we know from this problem and the above that
f is a conformal map. Hint: You can do this by verifying that
$(f \circ \gamma)'(t_0) \times (f \circ \eta)'(s_0) = |f'(\gamma(t_0))|^2 \, \gamma'(t_0) \times \eta'(s_0)$. To make the verification easier,
you might first establish the following simple formula for the cross product
where here $x + iy = z$ and $a + ib = w$.
\[ (x, y, 0) \times (a, b, 0) = \operatorname{Re}(z i \overline{w}) \, k. \]
10. Write the Cauchy Riemann equations in terms of polar coordinates. Recall
the polar coordinates are given by
\[ x = r \cos\theta, \qquad y = r \sin\theta. \]
This means, letting $u(x, y) = u(r, \theta)$, $v(x, y) = v(r, \theta)$, write the Cauchy Riemann
equations in terms of r and θ. You should eventually show the Cauchy
Riemann equations are equivalent to
\[ \frac{\partial u}{\partial r} = \frac{1}{r} \frac{\partial v}{\partial \theta}, \qquad \frac{\partial v}{\partial r} = -\frac{1}{r} \frac{\partial u}{\partial \theta} \]
11. Show that a real valued analytic function must be constant.
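The cross product formula suggested in the hint to the conformal mapping problem, $(x, y, 0) \times (a, b, 0) = \operatorname{Re}(z i \overline{w}) k$ with $z = x + iy$ and $w = a + ib$, can be sanity checked on a few sample pairs before attempting a proof. The helper names and the sample values below are assumptions for illustration.

```python
def cross_k(x, y, a, b):
    """k-component of the cross product (x, y, 0) x (a, b, 0)."""
    return x * b - y * a

def formula(z, w):
    """Re(z * i * conj(w)) for complex numbers z and w."""
    return (z * 1j * w.conjugate()).real

pairs = [(1 + 2j, 3 - 1j), (-0.5 + 0.25j, 2 + 2j), (4j, -3 + 0j)]
max_gap = max(
    abs(cross_k(z.real, z.imag, w.real, w.imag) - formula(z, w)) for z, w in pairs
)
print(max_gap)
```

A check like this does not replace the algebraic verification, but it catches sign and conjugation mistakes early.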
Proof: From the above lemma, you can apply the mean value theorem to the
real and imaginary parts of g.
Applying the above lemma to the components yields the following lemma.
If you want to have X be a complex Banach space, the result is still true.
Proof: Let $\Lambda \in X'$. Then $\Lambda g : [a, b] \to \mathbb{C}$. Therefore, from Lemma 24.8, for each
$\Lambda \in X'$, $\Lambda g(s) = \Lambda g(t)$ and since $X'$ separates the points, it follows $g(s) = g(t)$ so
g is constant.
Then g is continuous. If $\frac{\partial \phi}{\partial t}$ exists and is continuous on $[a, b] \times [c, d]$, then
\[ g'(t) = \int_a^b \frac{\partial \phi(s, t)}{\partial t} \, ds. \tag{24.2} \]
Proof: The first claim follows from the uniform continuity of φ on $[a, b] \times [c, d]$,
which uniform continuity results from the set being compact. To establish 24.2, let
t and t + h be contained in [c, d] and form, using the mean value theorem,
\begin{align*}
\frac{g(t+h) - g(t)}{h} &= \frac{1}{h} \int_a^b \left[ \phi(s, t+h) - \phi(s, t) \right] ds \\
&= \frac{1}{h} \int_a^b \frac{\partial \phi(s, t + \theta h)}{\partial t} \, h \, ds \\
&= \int_a^b \frac{\partial \phi(s, t + \theta h)}{\partial t} \, ds,
\end{align*}
where θ may depend on s but is some number between 0 and 1. Then by the uniform
continuity of $\frac{\partial \phi}{\partial t}$, it follows that 24.2 holds.
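Formula 24.2 can be illustrated numerically. The sketch below takes $\phi(s, t) = \sin(st)$ on $[0, 1] \times [0, 2]$, an assumption made only for this example, and compares a difference quotient of g with the integral of $\partial\phi/\partial t$. The quadrature rule and helper names are also illustrative choices.

```python
import math

def integrate(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def phi(s, t):
    return math.sin(s * t)

def dphi_dt(s, t):
    return s * math.cos(s * t)

def g(t):
    """g(t) = integral over s in [0, 1] of phi(s, t)."""
    return integrate(lambda s: phi(s, t), 0.0, 1.0)

t, h = 1.3, 1e-5
difference_quotient = (g(t + h) - g(t - h)) / (2 * h)
integral_of_partial = integrate(lambda s: dphi_dt(s, t), 0.0, 1.0)
gap = abs(difference_quotient - integral_of_partial)
print(gap)
```

The two numbers agree to well within the accuracy of the difference quotient, as 24.2 asserts.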
24.3. CAUCHY’S FORMULA FOR A DISK 649
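Then g is continuous. If $\frac{\partial \phi}{\partial t}$ exists and is continuous on $[a, b] \times [c, d]$, then
\[ g'(t) = \int_a^b \frac{\partial \phi(s, t)}{\partial t} \, ds. \tag{24.4} \]
Then g is continuous. If $\frac{\partial \phi}{\partial t}$ exists and is continuous on $[a, b] \times [c, d]$, then
\[ g'(t) = \int_a^b \frac{\partial \phi(s, t)}{\partial t} \, ds. \tag{24.6} \]
Then g is continuous. If $\frac{\partial \phi}{\partial t}$ exists and is continuous on $[a, b] \times [c, d]$, then
\[ g'(t) = \int_a^b \frac{\partial \phi(s, t)}{\partial t} \, ds. \tag{24.8} \]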
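Proof: Let $\Lambda \in X'$. Then $\Lambda\phi : [a, b] \times [c, d] \to \mathbb{C}$ is continuous and $\frac{\partial \Lambda\phi}{\partial t}$ exists
and is continuous on $[a, b] \times [c, d]$. Therefore, from 24.8,
\[ \Lambda(g'(t)) = (\Lambda g)'(t) = \int_a^b \frac{\partial \Lambda\phi(s, t)}{\partial t} \, ds = \Lambda \int_a^b \frac{\partial \phi(s, t)}{\partial t} \, ds \]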
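\[ = \int_0^{2\pi} i f(z) \sum_{n=0}^{\infty} r^{-n} e^{-int} (z - z_0)^n \, dt \]
because $\left| \frac{z - z_0}{re^{it}} \right| < 1$. Since this sum converges uniformly you can interchange the
sum and the integral to obtain
\begin{align*}
g(0) &= \sum_{n=0}^{\infty} i f(z) r^{-n} (z - z_0)^n \int_0^{2\pi} e^{-int} \, dt \\
&= 2\pi i f(z)
\end{align*}
because $\int_0^{2\pi} e^{-int} \, dt = 0$ if $n > 0$.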
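Next consider the claim that g is constant. By Corollary 24.13, for $\alpha \in (0, 1)$,
\begin{align*}
g'(\alpha) &= \int_0^{2\pi} \frac{f'\left( z + \alpha (z_0 + re^{it} - z) \right) \left( re^{it} + z_0 - z \right)}{re^{it} + z_0 - z} \, rie^{it} \, dt \\
&= \int_0^{2\pi} f'\left( z + \alpha (z_0 + re^{it} - z) \right) rie^{it} \, dt \\
&= \int_0^{2\pi} \frac{d}{dt} \left( f\left( z + \alpha (z_0 + re^{it} - z) \right) \frac{1}{\alpha} \right) dt \\
&= f\left( z + \alpha (z_0 + re^{i2\pi} - z) \right) \frac{1}{\alpha} - f\left( z + \alpha (z_0 + re^{0} - z) \right) \frac{1}{\alpha} = 0.
\end{align*}
Now g is continuous on [0, 1] and $g'(t) = 0$ on (0, 1) so by Lemma 24.9, g equals a
constant. This constant can only be $g(0) = 2\pi i f(z)$. Thus,
\[ g(1) = \int_\gamma \frac{f(w)}{w - z} \, dw = g(0) = 2\pi i f(z). \]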
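The conclusion $\int_\gamma \frac{f(w)}{w - z} \, dw = 2\pi i f(z)$ can be checked numerically. The following is a minimal sketch, assuming $f = \exp$ and a particular disk and point; the helper `circle_integral` is a hypothetical name for a trapezoid rule in the parameter t.

```python
import cmath
import math

def circle_integral(F, z0, r, n=400):
    """Trapezoid approximation of the integral of F over gamma(t) = z0 + r e^{it}, t in [0, 2 pi]."""
    total = 0j
    for j in range(n):
        t = 2 * math.pi * j / n
        w = z0 + r * cmath.exp(1j * t)
        dw = 1j * r * cmath.exp(1j * t)      # gamma'(t)
        total += F(w) * dw
    return total * (2 * math.pi / n)

f = cmath.exp
z0, r = 0.5 + 0.5j, 2.0
z = 1.0 + 0.2j                               # a point with |z - z0| < r
cauchy_value = circle_integral(lambda w: f(w) / (w - z), z0, r) / (2j * math.pi)
error = abs(cauchy_value - f(z))
print(error)
```

For smooth periodic integrands the trapezoid rule converges very quickly, so a few hundred nodes already reproduce $f(z)$ to near machine precision.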
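where $\gamma(t) \equiv z_0 + re^{it}$, $t \in [0, 2\pi]$ for r small enough that $\overline{B(z_0, r)} \subseteq \Omega$.
Proof: Let $z \in B(z_0, r) \subseteq \Omega$ and let $\overline{B(z_0, r)} \subseteq \Omega$. Then, letting $\gamma(t) \equiv z_0 + re^{it}$, $t \in [0, 2\pi]$, and h small enough,
\[ f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z} \, dw, \qquad f(z + h) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z - h} \, dw \]
Now
\[ \frac{1}{w - z - h} - \frac{1}{w - z} = \frac{h}{(-w + z + h)(-w + z)} \]
and so
\begin{align*}
\frac{f(z + h) - f(z)}{h} &= \frac{1}{2\pi h i} \int_\gamma \frac{h f(w)}{(-w + z + h)(-w + z)} \, dw \\
&= \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{(-w + z + h)(-w + z)} \, dw.
\end{align*}
Now for all h sufficiently small, there exists a constant C independent of such h
such that
\begin{align*}
&\left| \frac{1}{(-w + z + h)(-w + z)} - \frac{1}{(-w + z)(-w + z)} \right| \\
&\qquad = \left| \frac{h}{(w - z - h)(w - z)^2} \right| \le C |h|
\end{align*}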
Corollary 24.17 Suppose f is continuous on $\partial B(z_0, r)$ and suppose that for all
$z \in B(z_0, r)$,
\[ f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z} \, dw, \]
where $\gamma(t) \equiv z_0 + re^{it}$, $t \in [0, 2\pi]$. Then f is analytic on $B(z_0, r)$ and in fact has
infinitely many derivatives on $B(z_0, r)$.
Lemma 24.18 Let $\gamma(t) = z_0 + re^{it}$, for $t \in [0, 2\pi]$, suppose $f_n \to f$ uniformly on
$\overline{B(z_0, r)}$, and suppose
\[ f_n(z) = \frac{1}{2\pi i} \int_\gamma \frac{f_n(w)}{w - z} \, dw \tag{24.11} \]
for $z \in B(z_0, r)$. Then
\[ f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z} \, dw, \tag{24.12} \]
implying that f is analytic on $B(z_0, r)$.
Proof: From 24.11 and the uniform convergence of $f_n$ to f on $\gamma([0, 2\pi])$, the
integrals in 24.11 converge to
\[ \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z} \, dw. \]
Proposition 24.19 Let $\{a_n\}$ denote a sequence in X. Then there exists $R \in [0, \infty]$
such that
\[ \sum_{k=0}^{\infty} a_k (z - z_0)^k \]
is analytic on $B(z_0, R)$.
Proof: The assertions about absolute convergence are routine from the root test
if
\[ R \equiv \left( \limsup_{n\to\infty} |a_n|^{1/n} \right)^{-1} \]
with $R = \infty$ if the quantity in parentheses equals zero. The root test can be used
to verify absolute convergence which then implies convergence by completeness of
X.
The assertion about uniform convergence follows from the Weierstrass M test
and $M_n \equiv |a_n| r^n$. ($\sum_{n=0}^{\infty} |a_n| r^n < \infty$ by the root test.) It only remains to verify
the assertion about f(z) being analytic in the case where $R > 0$.
Let $0 < r < R$ and define $f_n(z) \equiv \sum_{k=0}^{n} a_k (z - z_0)^k$. Then $f_n$ is a polynomial
and so it is analytic. Thus, by the Cauchy integral formula above,
\[ f_n(z) = \frac{1}{2\pi i} \int_\gamma \frac{f_n(w)}{w - z} \, dw \]
where $\gamma(t) = z_0 + re^{it}$, for $t \in [0, 2\pi]$. By Lemma 24.18 and the first part of this
proposition involving uniform convergence,
\[ f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z} \, dw. \]
\[ a_n = \frac{f^{(n)}(z_0)}{n!}. \tag{24.14} \]
Proof: Consider $|z - z_0| < r$ and let $\gamma(t) = z_0 + re^{it}$, $t \in [0, 2\pi]$. Then for
$w \in \gamma([0, 2\pi])$,
\[ \left| \frac{z - z_0}{w - z_0} \right| < 1 \]
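Since the series converges uniformly, you can interchange the integral and the sum
to obtain
\begin{align*}
f(z) &= \sum_{n=0}^{\infty} \left( \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{(w - z_0)^{n+1}} \, dw \right) (z - z_0)^n \\
&\equiv \sum_{n=0}^{\infty} a_n (z - z_0)^n
\end{align*}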
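The integral expression for the coefficients can be compared with formula 24.14 numerically. This sketch assumes $f = \exp$, for which $f^{(n)}(z_0)/n! = e^{z_0}/n!$; the helper names `circle_integral` and `taylor_coefficient` and the chosen center are hypothetical.

```python
import cmath
import math

def circle_integral(F, z0, r, n=400):
    """Trapezoid approximation of the integral of F over gamma(t) = z0 + r e^{it}, t in [0, 2 pi]."""
    total = 0j
    for j in range(n):
        t = 2 * math.pi * j / n
        w = z0 + r * cmath.exp(1j * t)
        total += F(w) * 1j * r * cmath.exp(1j * t)   # F(gamma(t)) * gamma'(t)
    return total * (2 * math.pi / n)

def taylor_coefficient(f, z0, n, r=1.0):
    """a_n = (1 / (2 pi i)) * integral over gamma of f(w) / (w - z0)^(n+1) dw."""
    return circle_integral(lambda w: f(w) / (w - z0) ** (n + 1), z0, r) / (2j * math.pi)

z0 = 0.5j
computed = [taylor_coefficient(cmath.exp, z0, n) for n in range(8)]
expected = [cmath.exp(z0) / math.factorial(n) for n in range(8)]
max_error = max(abs(c - e) for c, e in zip(computed, expected))
print(max_error)
```

The contour integrals reproduce the first eight Taylor coefficients of the exponential about $z_0$ up to rounding.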
24.4 Exercises
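1. Show that if $|e_k| \le \varepsilon$, then $\left| \sum_{k=m}^{\infty} e_k \left( r^k - r^{k+1} \right) \right| < \varepsilon$ if $0 \le r < 1$. Hint:
Let $|\theta| = 1$ and verify that
\[ \theta \sum_{k=m}^{\infty} e_k \left( r^k - r^{k+1} \right) = \left| \sum_{k=m}^{\infty} e_k \left( r^k - r^{k+1} \right) \right| = \sum_{k=m}^{\infty} \operatorname{Re}(\theta e_k) \left( r^k - r^{k+1} \right) \]
where $|A_k - A| < \varepsilon$ for all $k \ge m$. In the first sum, write $A_k = A + e_k$ and use
Problem 1. Use this theorem to verify that $\arctan(1) = \sum_{k=0}^{\infty} (-1)^k \frac{1}{2k+1}$.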
24.4. EXERCISES 655
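5. Suppose $f(z) = \sum_{n=0}^{\infty} a_n z^n$ for all $|z| < R$. Show that then
\[ \frac{1}{2\pi} \int_0^{2\pi} \left| f\left( re^{i\theta} \right) \right|^2 d\theta = \sum_{n=0}^{\infty} |a_n|^2 r^{2n} \]
show
\[ \frac{1}{2\pi} \int_0^{2\pi} \left| f_n\left( re^{i\theta} \right) \right|^2 d\theta = \sum_{k=0}^{n} |a_k|^2 r^{2k} \]
and so
\[ 2u(w) = \sum_{k=0}^{\infty} a_k w^k + \sum_{k=0}^{\infty} \overline{a_k} \, (\overline{w})^k . \tag{24.16} \]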
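Using these formulas for $a_n$ in 24.15, we can interchange the sum and the
integral (Why can we do this?) to write the following for $|z| < R$.
\begin{align*}
f(z) &= \frac{1}{\pi i} \int_\gamma u(w) \, \frac{1}{z} \sum_{k=0}^{\infty} \left( \frac{z}{w} \right)^{k+1} dw - \overline{a_0} \\
&= \frac{1}{\pi i} \int_\gamma \frac{u(w)}{w - z} \, dw - \overline{a_0},
\end{align*}
which is the Schwarz formula. Now $\operatorname{Re} a_0 = \frac{1}{2\pi i} \int_\gamma \frac{u(w)}{w} \, dw$ and $\overline{a_0} = \operatorname{Re} a_0 - i \operatorname{Im} a_0$. Therefore, we can also write the Schwarz formula as
\[ f(z) = \frac{1}{2\pi i} \int_\gamma \frac{u(w)(w + z)}{(w - z) w} \, dw + i \operatorname{Im} a_0 . \tag{24.17} \]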
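7. Take the real parts of the second form of the Schwarz formula to derive the
Poisson formula for a disk,
\[ u\left( re^{i\alpha} \right) = \frac{1}{2\pi} \int_0^{2\pi} \frac{u\left( Re^{i\theta} \right) \left( R^2 - r^2 \right)}{R^2 + r^2 - 2Rr \cos(\theta - \alpha)} \, d\theta . \tag{24.18} \]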
9. Suppose $f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k$ for all $|z - z_0| < R$. Show that $f'(z) = \sum_{k=0}^{\infty} a_k k (z - z_0)^{k-1}$ for all $|z - z_0| < R$. Hint: Let $f_n(z)$ be a partial sum
of f. Show that $f_n'$ converges uniformly to some function, g on $|z - z_0| \le r$
for any $r < R$. Now use the Cauchy integral formula for a function and its
derivative to identify g with $f'$.
10. Use Problem 9 to find the exact value of $\sum_{k=0}^{\infty} k^2 \left( \frac{1}{3} \right)^k$.
11. Prove the binomial formula,
\[ (1 + z)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n \]
where
\[ \binom{\alpha}{n} \equiv \frac{\alpha \cdots (\alpha - n + 1)}{n!} . \]
Can this be used to give a proof of the binomial formula,
\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k ? \]
Explain.
12. Suppose f is analytic on $B(z_0, r)$ and continuous on $\overline{B(z_0, r)}$ and $|f(z)| \le M$
on $\overline{B(z_0, r)}$. Show that then $\left| f^{(n)}(z_0) \right| \le \frac{M n!}{r^n}$.
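The Poisson formula 24.18 from Problem 7 can be tested numerically on a known harmonic function. The sketch below assumes $u = \operatorname{Re}(z^2 + 3z)$ and particular values of R, r, α; the function name `poisson` and the quadrature rule are illustrative choices.

```python
import cmath
import math

def u(z):
    """Real part of the analytic function z^2 + 3z, hence harmonic."""
    return (z * z + 3 * z).real

def poisson(u, R, r, alpha, n=2000):
    """Trapezoid approximation of the right side of the Poisson formula."""
    total = 0.0
    for j in range(n):
        theta = 2 * math.pi * j / n
        kernel = (R * R - r * r) / (R * R + r * r - 2 * R * r * math.cos(theta - alpha))
        total += u(R * cmath.exp(1j * theta)) * kernel
    return total / n          # equals (1 / (2 pi)) * (2 pi / n) * sum

R, r, alpha = 2.0, 1.2, 0.7
error = abs(poisson(u, R, r, alpha) - u(r * cmath.exp(1j * alpha)))
print(error)
```

The boundary values on the circle of radius R recover the interior value at $re^{i\alpha}$, as the formula states.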
It turns out the zeros of an analytic function which is not constant on some
region cannot have a limit point. This is also a good time to define the order of a
zero.
Z ≡ {z ∈ Ω : f (z) = 0} .
Proof: It is clear the first condition implies the second two. Suppose the third
holds. Then for z near $z_0$
\[ f(z) = \sum_{n=k}^{\infty} \frac{f^{(n)}(z_0)}{n!} (z - z_0)^n \]
Thus f is identically equal to zero near z ∈ S. Therefore, all points near z are
contained in S also, showing that S is an open set. Now Ω = S ∪ (Ω \ S) , the union
of two disjoint open sets, S being nonempty. It follows the other open set, Ω \ S,
must be empty because Ω is connected. Therefore, the first condition is verified.
This proves the theorem. (See the following diagram.)
1.) ⇒ 3.) ⇒ 2.) ⇒ 1.)
Note how radically different this is from the theory of functions of a real variable.
Consider, for example the function
\[ f(x) \equiv \begin{cases} x^2 \sin\left( \frac{1}{x} \right) & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases} \]
24.6. LIOUVILLE’S THEOREM 659
which has a derivative for all x ∈ R and for which 0 is a limit point of the set, Z,
even though f is not identically equal to zero.
Here is a very important application called Euler’s formula. Recall that
Proof: It was already observed that $e^z$ given by 24.19 is analytic. So is $\exp(z) \equiv \sum_{k=0}^{\infty} \frac{z^k}{k!}$. In fact the power series converges for all $z \in \mathbb{C}$. Furthermore the two
functions, $e^z$ and $\exp(z)$ agree on the real line which is a set which contains a limit
point. Therefore, they agree for all values of $z \in \mathbb{C}$.
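The agreement of the two functions can be observed numerically on the imaginary axis. The sketch below assumes that 24.19 is the definition $e^{x+iy} = e^x(\cos y + i \sin y)$, which is not shown in this excerpt, and compares it with a partial sum of the power series; `exp_series` is a hypothetical helper.

```python
import math

def exp_series(z, terms=60):
    """Partial sum of the power series sum_{k>=0} z^k / k!."""
    term, total = 1 + 0j, 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)     # turns z^k / k! into z^(k+1) / (k+1)!
    return total

y = 2.5
# Compare the series at z = iy with cos y + i sin y.
euler_gap = abs(exp_series(1j * y) - complex(math.cos(y), math.sin(y)))
print(euler_gap)
```

Sixty terms are far more than needed at this point; the gap is at the level of rounding error.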
This formula shows the famous two identities,
With Liouville’s theorem it becomes possible to give an easy proof of the fun-
damental theorem of algebra. It is ironic that all the best proofs of this theorem
in algebra come from the subjects of analysis or topology. Out of all the proofs
that have been given of this very important theorem, the following one based on
Liouville’s theorem is the easiest.
lowing picture.
[Figure: the triangle T with vertices z1, z2, z3, subdivided into four smaller triangles T^1_1, T^1_2, T^1_3, T^1_4; arrows indicate the orientation of each boundary.]
By Lemma 23.11
\[ \int_{\partial T} f(z) \, dz = \sum_{k=1}^{4} \int_{\partial T_k^1} f(z) \, dz . \tag{24.20} \]
On the “inside lines” the integrals cancel as claimed in Lemma 23.11 because there
are two integrals going in opposite directions for each of these inside lines.
Theorem 24.28 (Cauchy Goursat) Let $f : \Omega \to X$ have the property that $f'(z)$
exists for all $z \in \Omega$ and let T be a triangle contained in Ω. Then
\[ \int_{\partial T} f(w) \, dw = 0. \]
Now let $T_1$ play the same role as T, subdivide as in the above picture, and obtain
$T_2$ such that
\[ \left\| \int_{\partial T_2} f(w) \, dw \right\| \ge \frac{\alpha}{4^2} . \]
and
\[ \left\| \int_{\partial T_k} f(w) \, dw \right\| \ge \frac{\alpha}{4^k} . \]
Then let $z \in \cap_{k=1}^{\infty} T_k$ and note that by assumption, $f'(z)$ exists. Therefore, for all
k large enough,
\[ \int_{\partial T_k} f(w) \, dw = \int_{\partial T_k} f(z) + f'(z)(w - z) + g(w) \, dw \]
where $\|g(w)\| < \varepsilon |w - z|$. Now observe that $w \to f(z) + f'(z)(w - z)$ has a
primitive, namely,
\[ F(w) = f(z) w + f'(z)(w - z)^2 / 2 . \]
Therefore, by Corollary 23.14,
\[ \int_{\partial T_k} f(w) \, dw = \int_{\partial T_k} g(w) \, dw . \]
and so
\[ \alpha \le \varepsilon \, (\text{length of } T) \, \operatorname{diam}(T) . \]
Since ε is arbitrary, this shows $\alpha = 0$, a contradiction. Thus $\int_{\partial T} f(w) \, dw = 0$ as
claimed.
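The statement can be illustrated numerically by integrating around a concrete triangle. This sketch assumes the triangle with vertices 0, 1, i and the test functions $w^2 e^w$ (analytic) and $\overline{w}$ (not analytic); the helper names and the use of Simpson's rule are choices made for the example.

```python
import cmath

def segment_integral(f, p, q, n=200):
    """Composite Simpson approximation of the integral of f along the segment from p to q (n even)."""
    h = 1.0 / n
    total = f(p) + f(q)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(p + (q - p) * (i * h))
    return total * (h / 3) * (q - p)      # the factor (q - p) is gamma'(t)

def triangle_integral(f, z1, z2, z3):
    """Integral of f over the boundary of the triangle, oriented z1 -> z2 -> z3 -> z1."""
    return (segment_integral(f, z1, z2)
            + segment_integral(f, z2, z3)
            + segment_integral(f, z3, z1))

analytic = abs(triangle_integral(lambda w: w * w * cmath.exp(w), 0j, 1 + 0j, 1j))
not_analytic = triangle_integral(lambda w: w.conjugate(), 0j, 1 + 0j, 1j)
print(analytic, not_analytic)
```

The analytic integrand gives a value that is zero up to quadrature error, while the conjugate function, which has no complex derivative, gives i (twice the area of the triangle times i), so the hypothesis on $f'$ cannot be dropped.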
This fundamental result yields the following important theorem.
Theorem 24.29 (Morera¹) Let Ω be an open set and let $f'(z)$ exist for all $z \in \Omega$.
Let $D \equiv \overline{B(z_0, r)} \subseteq \Omega$. Then there exists $\varepsilon > 0$ such that f has a primitive on
$B(z_0, r + \varepsilon)$.
Then by the Cauchy Goursat theorem, and $w \in B(z_0, r + \varepsilon)$, it follows that for $|h|$
small enough,
\begin{align*}
\frac{F(w + h) - F(w)}{h} &= \frac{1}{h} \int_{\gamma(w, w+h)} f(u) \, du \\
&= \frac{1}{h} \int_0^1 f(w + th) \, h \, dt = \int_0^1 f(w + th) \, dt
\end{align*}
which converges to f(w) due to the continuity of f at w. This proves the theorem.
The following is a slight generalization of the above theorem which is also referred
to as Morera’s theorem.
¹ Giacinto Morera 1856-1909. This theorem or one like it dates from around 1886.
24.7. THE GENERAL CAUCHY INTEGRAL FORMULA 663
γ (z1 , z2 , z3 , z1 )
then f is analytic on Ω.
Proof: As in the proof of Morera’s theorem, let B (z0 , r) ⊆ Ω and use the given
condition to construct a primitive, F for f on B (z0 , r) . Then F is analytic and so
by Theorem 24.16, it follows that F and hence f have infinitely many derivatives,
implying that f is analytic on B (z0 , r) . Since z0 is arbitrary, this shows f is analytic
on Ω.
Theorem 24.31 Let Ω be an open set in $\mathbb{C}$ and suppose $f : \Omega \to X$ has the property
that $f'(z)$ exists for each $z \in \Omega$. Then f is analytic on Ω.
Corollary 24.32 Let Ω be a convex open set and suppose that $f'(z)$ exists for all
$z \in \Omega$. Then f has a primitive on Ω.
Note that this implies that if Ω is a convex open set on which $f'(z)$ exists and
if $\gamma : [a, b] \to \Omega$ is a closed, continuous curve having bounded variation, then letting
F be a primitive of f, Theorem 23.13 implies
\[ \int_\gamma f(z) \, dz = F(\gamma(b)) - F(\gamma(a)) = 0. \]
Notice how different this is from the situation of a function of a real variable! It
is possible for a function of a real variable to have a derivative everywhere and yet
the derivative can be discontinuous. A simple example is the following.
\[ f(x) \equiv \begin{cases} x^2 \sin\left( \frac{1}{x} \right) & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases} . \]
Then $f'(x)$ exists for all $x \in \mathbb{R}$. Indeed, if $x \ne 0$, the derivative equals $2x \sin\frac{1}{x} - \cos\frac{1}{x}$
which has no limit as $x \to 0$. However, from the definition of the derivative of a
function of one variable, $f'(0) = 0$.
Proof: Suppose $B(z_0, \delta)$ has no points of $f(B'(a, r))$. Such a ball must exist if
$f(B'(a, r))$ is not dense. Then for $z \in B'(a, r)$, $|f(z) - z_0| \ge \delta > 0$. It follows from
Theorem 24.35 that $\frac{1}{f(z) - z_0}$ has a removable singularity at a. Hence, there exists h
an analytic function such that for z near a,
\[ h(z) = \frac{1}{f(z) - z_0} . \tag{24.22} \]
There are two cases. First suppose $h(a) = 0$. Then $\sum_{k=1}^{\infty} a_k (z - a)^k = \frac{1}{f(z) - z_0}$
for z near a. If all the $a_k = 0$, this would be a contradiction because then the left
side would equal zero for z near a but the right side could not equal zero. Therefore,
there is a first m such that $a_m \ne 0$. Hence there exists an analytic function, k(z)
which is not equal to zero in some ball, $B(a, \varepsilon)$ such that
\[ k(z)(z - a)^m = \frac{1}{f(z) - z_0} . \]
Hence, taking both sides to the −1 power,
\[ f(z) - z_0 = \frac{1}{(z - a)^m} \sum_{k=0}^{\infty} b_k (z - a)^k \]
in $\widehat{\mathbb{C}}$.
\begin{align*}
|f(z)| &\ge \frac{|b_M|}{|z - a|^M} - |g(z)| - \sum_{k=1}^{M-1} \frac{|b_k|}{|z - a|^k} \\
&= \frac{1}{|z - a|^M} \left( |b_M| - \left( |g(z)| \, |z - a|^M + \sum_{k=1}^{M-1} |b_k| \, |z - a|^{M-k} \right) \right) .
\end{align*}
Now $\lim_{z\to a} \left( |g(z)| \, |z - a|^M + \sum_{k=1}^{M-1} |b_k| \, |z - a|^{M-k} \right) = 0$ and so the above inequality
proves $\lim_{z\to a} |f(z)| = \infty$. Referring to the diagram on Page 628, you see
this is the same as saying
\[ \lim_{z\to a} |\theta f(z) - (0, 0, 2)| = \lim_{z\to a} |\theta f(z) - \theta(\infty)| = \lim_{z\to a} d(f(z), \infty) = 0 \]
The usefulness of the above convention about f (a) ≡ ∞ at a pole is made clear
in the following theorem.
Theorem 24.40 Let Ω be an open subset of $\mathbb{C}$ and let $f : \Omega \to \widehat{\mathbb{C}}$ be meromorphic.
Then f is continuous with respect to the metric, d on $\widehat{\mathbb{C}}$.
whenever z1 is close enough to z. This proves the continuity assertion. Note this
did not depend on γ being closed.
Next it is shown that for a closed curve the winding number equals an integer.
To do so, use Theorem 23.12 to obtain $\eta_k$, a function in $C^1([a, b])$ such that $z \notin \eta_k([a, b])$ for all k large enough, $\eta_k(x) = \gamma(x)$ for $x = a, b$, and
\[ \left| \frac{1}{2\pi i} \int_\gamma \frac{dw}{w - z} - \frac{1}{2\pi i} \int_{\eta_k} \frac{dw}{w - z} \right| < \frac{1}{k}, \qquad \|\eta_k - \gamma\| < \frac{1}{k} . \]
It is shown that each of $\frac{1}{2\pi i} \int_{\eta_k} \frac{dw}{w - z}$ is an integer. To simplify the notation, write η
instead of $\eta_k$.
\[ \int_\eta \frac{dw}{w - z} = \int_a^b \frac{\eta'(s) \, ds}{\eta(s) - z} . \]
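Define
\[ g(t) \equiv \int_a^t \frac{\eta'(s) \, ds}{\eta(s) - z} . \tag{24.24} \]
Then
\begin{align*}
\left( e^{-g(t)} (\eta(t) - z) \right)' &= e^{-g(t)} \eta'(t) - e^{-g(t)} g'(t) (\eta(t) - z) \\
&= e^{-g(t)} \eta'(t) - e^{-g(t)} \eta'(t) = 0.
\end{align*}
It follows that $e^{-g(t)} (\eta(t) - z)$ equals a constant. In particular, using the fact that
$\eta(a) = \eta(b)$ and $g(a) = 0$,
\[ e^{-g(b)} (\eta(b) - z) = e^{-g(a)} (\eta(a) - z) = \eta(a) - z = \eta(b) - z \]
and so $e^{-g(b)} = 1$. This happens if and only if $-g(b) = 2m\pi i$ for some integer m.
Therefore, 24.24 implies
\[ 2m\pi i = \int_a^b \frac{\eta'(s) \, ds}{\eta(s) - z} = \int_\eta \frac{dw}{w - z} . \]
Therefore, $\frac{1}{2\pi i} \int_{\eta_k} \frac{dw}{w - z}$ is a sequence of integers converging to $\frac{1}{2\pi i} \int_\gamma \frac{dw}{w - z} \equiv n(\gamma, z)$
and so $n(\gamma, z)$ must also be an integer and $n(\eta_k, z) = n(\gamma, z)$ for all k large enough.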
Since $n(\gamma, \cdot)$ is continuous and integer valued, it follows from Corollary 6.67 on
Page 155 that it must be constant on every connected component of $\mathbb{C} \setminus \gamma^*$. It is clear
that $n(\gamma, z)$ equals zero on the unbounded component because from the formula,
\[ \lim_{z\to\infty} |n(\gamma, z)| \le \lim_{z\to\infty} V(\gamma, [a, b]) \left( \frac{1}{|z| - c} \right) \]
where $c \ge \max\{|w| : w \in \gamma^*\}$. This proves the theorem.
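The properties just proved can be observed numerically for a smooth closed curve. The sketch below assumes the curve $\gamma(t) = e^{2it}$, $t \in [0, 2\pi]$, which traverses the unit circle twice; the function name `winding_number` and the sample points are illustrative.

```python
import cmath
import math

def winding_number(gamma, dgamma, z, n=2000):
    """Trapezoid approximation of (1 / (2 pi i)) * integral of gamma'(t) / (gamma(t) - z) over [0, 2 pi]."""
    total = 0j
    for j in range(n):
        t = 2 * math.pi * j / n
        total += dgamma(t) / (gamma(t) - z)
    return total * (2 * math.pi / n) / (2j * math.pi)

def gamma(t):
    return cmath.exp(2j * t)

def dgamma(t):
    return 2j * cmath.exp(2j * t)

inside = winding_number(gamma, dgamma, 0.3 + 0.1j)    # a point in the bounded component
outside = winding_number(gamma, dgamma, 3.0 + 0j)     # a point in the unbounded component
print(inside, outside)
```

The computed values are an integer (here 2) at the interior point and 0 at the exterior point, up to rounding, in line with the theorem.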
Proof: Letting η be a $C^1$ curve for which $\eta(a) = \gamma(a)$ and $\eta(b) = \gamma(b)$ and
which is close enough to γ that $n(\eta, z) = n(\gamma, z)$, the argument is similar to the
above. Let
\[ g(t) \equiv \int_a^t \frac{\eta'(s) \, ds}{\eta(s) - z} . \tag{24.25} \]
Then
\begin{align*}
\left( e^{-g(t)} (\eta(t) - z) \right)' &= e^{-g(t)} \eta'(t) - e^{-g(t)} g'(t) (\eta(t) - z) \\
&= e^{-g(t)} \eta'(t) - e^{-g(t)} \eta'(t) = 0.
\end{align*}
Hence
\[ e^{-g(t)} (\eta(t) - z) = c \ne 0. \tag{24.26} \]
By assumption
\[ g(b) = \int_\eta \frac{1}{w - z} \, dw = 2\pi i m \]
for some integer, m. Therefore, from 24.26
\[ 1 = e^{2\pi m i} = \frac{\eta(b) - z}{c} . \]
Thus $c = \eta(b) - z$ and letting $t = a$ in 24.26,
\[ 1 = \frac{\eta(a) - z}{\eta(b) - z} \]
which shows $\eta(a) = \eta(b)$. This proves the corollary since the assertion about continuity
was already observed.
It is a good idea to consider a simple case to get an idea of what the winding
number is measuring. To do so, consider γ : [a, b] → C such that γ is continuous,
closed and bounded variation. Suppose also that γ is one to one on (a, b) . Such a
curve is called a simple closed curve. It can be shown that such a simple closed curve
divides the plane into exactly two components, an “inside” bounded component and
an “outside” unbounded component. This is called the Jordan Curve theorem or
the Jordan separation theorem. This is a difficult theorem which requires some
very hard topology such as homology theory or degree theory. It won’t be used
here beyond making reference to it. For now, it suffices to simply assume that γ
is such that this result holds. This will usually be obvious anyway. Also suppose
that it is possible to change the parameter to be in $[0, 2\pi]$, in such a way that
$\gamma(t) + \lambda \left( z + re^{it} - \gamma(t) \right) - z \ne 0$ for all $t \in [0, 2\pi]$ and $\lambda \in [0, 1]$. (As t goes
from 0 to 2π the point γ(t) traces the curve $\gamma([0, 2\pi])$ in the counter clockwise
direction.) Suppose $z \in D$, the inside of the simple closed curve and consider the
curve $\delta(t) = z + re^{it}$ for $t \in [0, 2\pi]$ where r is chosen small enough that $B(z, r) \subseteq D$.
Then it happens that $n(\delta, z) = n(\gamma, z)$.
n (δ, z) = n (γ, z)
and n (δ, z) = 1.
Proof: By changing the parameter, assume that $[a, b] = [0, 2\pi]$. From Theorem
24.42 it suffices to assume also that γ is $C^1$. Define $h_\lambda(t) \equiv \gamma(t) + \lambda \left( z + re^{it} - \gamma(t) \right)$
for $\lambda \in [0, 1]$. (This function is called a homotopy of the curves γ and δ.) Note that
for each $\lambda \in [0, 1]$, $t \to h_\lambda(t)$ is a closed $C^1$ curve. Also,
\[ \frac{1}{2\pi i} \int_{h_\lambda} \frac{1}{w - z} \, dw = \frac{1}{2\pi i} \int_0^{2\pi} \frac{\gamma'(t) + \lambda \left( rie^{it} - \gamma'(t) \right)}{\gamma(t) + \lambda \left( z + re^{it} - \gamma(t) \right) - z} \, dt . \]
[Figure: oriented closed curves γ1, γ2, γ3.]
\[ \sum_{k=1}^{m} n(\gamma_k, z) \text{ equals an integer} \]
\[ f(z) \sum_{k=1}^{m} n(\gamma_k, z) = \sum_{k=1}^{m} \frac{1}{2\pi i} \int_{\gamma_k} \frac{f(w)}{w - z} \, dw . \]
for every triangle, T, contained in Ω and apply Corollary 24.30. To do this, use
Theorem 23.12 to obtain for each k, a sequence of functions, $\eta_{kn} \in C^1([a_k, b_k])$
such that
\[ \eta_{kn}(x) = \gamma_k(x) \text{ for } x \in \{a_k, b_k\} \]
and
\[ \eta_{kn}([a_k, b_k]) \subseteq \Omega, \qquad \|\eta_{kn} - \gamma_k\| < \frac{1}{n}, \]
\[ \left\| \int_{\eta_{kn}} \phi(z, w) \, dw - \int_{\gamma_k} \phi(z, w) \, dw \right\| < \frac{1}{n}, \tag{24.27} \]
for all $z \in T$. Then applying Fubini’s theorem,
\[ \int_{\partial T} \int_{\eta_{kn}} \phi(z, w) \, dw \, dz = \int_{\eta_{kn}} \int_{\partial T} \phi(z, w) \, dz \, dw = 0 \]
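Why is g(z) well defined? For $z \in \Omega \cap H$, $z \notin \cup_{k=1}^{m} \gamma_k([a_k, b_k])$ and so
\begin{align*}
g(z) &= \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \phi(z, w) \, dw = \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \frac{f(w) - f(z)}{w - z} \, dw \\
&= \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \frac{f(w)}{w - z} \, dw - \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \frac{f(z)}{w - z} \, dw \\
&= \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \frac{f(w)}{w - z} \, dw
\end{align*}
\begin{align*}
0 = h(z) &= \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \phi(z, w) \, dw = \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \frac{f(w) - f(z)}{w - z} \, dw \\
&= \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} \frac{f(w)}{w - z} \, dw - f(z) \sum_{k=1}^{m} n(\gamma_k, z) .
\end{align*}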
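for all $z \notin \Omega$. Then if $f : \Omega \to \mathbb{C}$ is analytic,
\[ \sum_{k=1}^{m} \int_{\gamma_k} f(w) \, dw = 0. \]
\[ g(w) = f(w)(w - z) \]
where $z \in \Omega \setminus \cup_{k=1}^{m} \gamma_k([a_k, b_k])$. Then by this theorem,
\begin{align*}
0 = 0 \sum_{k=1}^{m} n(\gamma_k, z) &= g(z) \sum_{k=1}^{m} n(\gamma_k, z) \\
&= \sum_{k=1}^{m} \frac{1}{2\pi i} \int_{\gamma_k} \frac{g(w)}{w - z} \, dw = \frac{1}{2\pi i} \sum_{k=1}^{m} \int_{\gamma_k} f(w) \, dw .
\end{align*}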
Another simple corollary to the above theorem is Cauchy’s theorem for a simply
connected region.
g (w) ≡ f (w) (w − z) .
Proof: Pick a point, $z_0 \in \Omega$ and let V denote those points, z of Ω for which
there exists a curve, $\gamma : [a, b] \to \Omega$ such that γ is continuous, of bounded variation,
$\gamma(a) = z_0$, and $\gamma(b) = z$. Then it is easy to verify that V is both open and closed
in Ω and therefore, $V = \Omega$ because Ω is connected. Denote by $\gamma_{z_0, z}$ such a curve
from $z_0$ to z and define
\[ F(z) \equiv \int_{\gamma_{z_0, z}} f(w) \, dw . \]
Then F is well defined because if $\gamma_j$, $j = 1, 2$ are two such curves, it follows from
Corollary 24.49 that
\[ \int_{\gamma_1} f(w) \, dw + \int_{-\gamma_2} f(w) \, dw = 0, \]
implying that
\[ \int_{\gamma_1} f(w) \, dw = \int_{\gamma_2} f(w) \, dw . \]
Theorem 24.52 Let K be a compact subset of an open set, Ω. Then there exist
continuous, closed, bounded variation oriented curves $\{\Gamma_j\}_{j=1}^{m}$ for which $\Gamma_j^* \cap K = \emptyset$
for each j, $\Gamma_j^* \subseteq \Omega$, and for all $p \in K$,
\[ \sum_{k=1}^{m} n(\Gamma_k, p) = 1. \]
Let S denote the set of all the closed squares in this tiling which have nonempty
intersection with K. Thus, all the squares of S are contained in Ω. First suppose p is
a point of K which is in the interior of one of these squares in the tiling. Denote by
$\partial S_k$ the boundary of $S_k$ one of the squares in S, oriented in the counter clockwise
direction and $S_m$ denote the square of S which contains the point, p in its interior.
Let the edges of the square, $S_j$ be $\left\{ \gamma_k^j \right\}_{k=1}^{4}$. Thus a short computation shows
$n(\partial S_m, p) = 1$ but $n(\partial S_j, p) = 0$ for all $j \ne m$. The reason for this is that for
z in $S_j$, the values $\{z - p : z \in S_j\}$ lie in an open square, Q which is located at a
positive distance from 0. Then $\widehat{\mathbb{C}} \setminus Q$ is connected and $1/(z - p)$ is analytic on Q.
It follows from Corollary 24.50 that this function has a primitive on Q and so
\[ \int_{\partial S_j} \frac{1}{z - p} \, dz = 0. \]
Similarly, if $z \notin \Omega$, $n(\partial S_j, z) = 0$. On the other hand, a direct computation will
verify that $n(p, \partial S_m) = 1$. Thus $1 = \sum_{j,k} n\left( p, \gamma_k^j \right) = \sum_{S_j \in S} n(p, \partial S_j)$ and if
$z \notin \Omega$, $0 = \sum_{j,k} n\left( z, \gamma_k^j \right) = \sum_{S_j \in S} n(z, \partial S_j)$.
If $\gamma_k^{j*}$ coincides with $\gamma_l^{l*}$, then the contour integrals taken over this edge are
taken in opposite directions and so the edge the two squares have in common can
be deleted without changing $\sum_{j,k} n\left( z, \gamma_k^j \right)$ for any z not on any of the lines in the
tiling. For example, see the picture,
[Figure: the tiling of squares meeting K, with oriented edges; edges shared by two squares carry opposite orientations.]
Then as explained above, $\sum_{k=1}^{m} n(p, \gamma_k) = 1$. It remains to prove the claim
about the closed curves.
Each orientation on an edge corresponds to a direction of motion over that
edge. Call such a motion over the edge a route. Initially, every vertex, (corner of
a square in S) has the property there are the same number of routes to and from
that vertex. When an open edge whose closure contains a point of K is deleted,
every vertex either remains unchanged as to the number of routes to and from that
vertex or it loses both a route away and a route to. Thus the property of having the
same number of routes to and from each vertex is preserved by deleting these open
edges. The isolated points which result lose all routes to and from. It follows that
upon removing the isolated points you can begin at any of the remaining vertices
and follow the routes leading out from this and successive vertices according to
orientation and eventually return to that end. Otherwise, there would be a vertex
which would have only one route leading to it which does not happen. Now if you
have used all the routes out of this vertex, pick another vertex and do the same
process. Otherwise, pick an unused route out of the vertex and follow it to return.
Continue this way till all routes are used exactly once, resulting in closed oriented
curves, Γk . Then
\[ \sum_{k} n(\Gamma_k, p) = \sum_{j} n(\gamma_j, p) = 1. \]
24.8 Exercises
1. If U is simply connected, f is analytic on U and f has no zeros in U, show
there exists an analytic function, F, defined on U such that $e^F = f$.
2. Let f be defined and analytic near the point $a \in \mathbb{C}$. Show that then $f(z) = \sum_{k=0}^{\infty} b_k (z - a)^k$ whenever $|z - a| < R$ where R is the distance between a and
the nearest point where f fails to have a derivative. The number R, is called
the radius of convergence and the power series is said to be expanded about
a.
3. Find the radius of convergence of the function $\frac{1}{1 + z^2}$ expanded about $a = 2$.
Note there is nothing wrong with the function, $\frac{1}{1 + x^2}$ when considered as a
function of a real variable, x for any value of x. However, if you insist on using
power series, you find there is a limitation on the values of x for which the
power series converges due to the presence in the complex plane of a point, i,
where the function fails to have a derivative.
4. Suppose f is analytic on all of $\mathbb{C}$ and satisfies $|f(z)| < A + B |z|^{1/2}$. Show f
is constant.
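The claim in Problem 2 can be explored numerically for the function of Problem 3. The sketch below compares the distance from $a = 2$ to the singularities $\pm i$ with a root test estimate built from the Taylor coefficients; the partial fraction formula for the coefficients, the window of indices, and the names used are assumptions made for this illustration, and the root test value is only a rough finite-n estimate.

```python
import math

a = 2.0
# Distance from a to the points +i and -i where 1 / (1 + z^2) fails to have a derivative.
distance = min(abs(a - 1j), abs(a + 1j))

def coefficient(n):
    """Taylor coefficient b_n of 1 / (1 + z^2) about a, from the partial fractions
    1 / (1 + z^2) = (1 / (2i)) * (1 / (z - i) - 1 / (z + i))."""
    return ((-1) ** n / 2j) * ((a - 1j) ** (-(n + 1)) - (a + 1j) ** (-(n + 1)))

# Root test: 1 / R = limsup |b_n|^(1/n); approximate the limsup by a maximum over a late window.
root_estimate = 1 / max(abs(coefficient(n)) ** (1 / n) for n in range(200, 260))
print(distance, root_estimate)
```

The two numbers are close, and the estimate approaches the distance as the window is moved to larger indices.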
Proof: Suppose f(Ω) is not a point. Then if $z_0 \in \Omega$ it follows there exists $r > 0$
such that $f(z) \ne f(z_0)$ for all $z \in B(z_0, r) \setminus \{z_0\}$. Otherwise, $z_0$ would be a limit
point of the set,
\[ \{z \in \Omega : f(z) - f(z_0) = 0\} \]
which would imply from Theorem 24.23 that $f(z) = f(z_0)$ for all $z \in \Omega$. Therefore,
making r smaller if necessary and using the power series of f,
\[ f(z) = f(z_0) + (z - z_0)^m g(z) \quad \left( \overset{?}{=} \left( (z - z_0) \, g(z)^{1/m} \right)^m \right) \]
for all $z \in B(z_0, r)$, where $g(z) \ne 0$ on $B(z_0, r)$. As implied in the above formula,
one wonders if you can take the mth root of g(z).
$\frac{g'}{g}$ is an analytic function on $B(z_0, r)$ and so by Corollary 24.32 it has a primitive
on $B(z_0, r)$, h. Therefore by the product rule and the chain rule, $\left( g e^{-h} \right)' = 0$ and
so there exists a constant, $C = e^{a + ib}$ such that on $B(z_0, r)$,
\[ g e^{-h} = e^{a + ib} . \]
682 THE OPEN MAPPING THEOREM
Therefore,
\[ g(z) = e^{h(z) + a + ib} \]
and so, modifying h by adding in the constant, $a + ib$, $g(z) = e^{h(z)}$ where $h'(z) = \frac{g'(z)}{g(z)}$ on $B(z_0, r)$. Letting
\[ \phi(z) = (z - z_0) \, e^{\frac{h(z)}{m}} \]
Shrinking r if necessary you can assume $\phi'(z) \ne 0$ on $B(z_0, r)$. Is there an open
set, V contained in $B(z_0, r)$ such that φ maps V onto $B(0, \delta)$ for some $\delta > 0$?
Let $\phi(z) = u(x, y) + i v(x, y)$ where $z = x + iy$. Consider the mapping
\[ \begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} \]
Therefore, by the inverse function theorem there exists an open set, V, containing
$z_0$ and $\delta > 0$ such that $(u, v)^T$ maps V one to one onto $B(0, \delta)$. Thus φ is one to
one onto $B(0, \delta)$ as claimed. Applying the same argument to other points, z of V
and using the fact that $\phi'(z) \ne 0$ at these points, it follows φ maps open sets to
open sets. In other words, $\phi^{-1}$ is continuous.
It also follows that $\phi^m$ maps V onto $B(0, \delta^m)$. Therefore, the formula 25.1
implies that f maps the open set, V, containing $z_0$ to an open set. This shows f(Ω)
is an open set because $z_0$ was arbitrary. It is connected because f is continuous and
Ω is connected. Thus f(Ω) is a region. It remains to verify that $\phi^{-1}$ is analytic on
$B(0, \delta)$. Since $\phi^{-1}$ is continuous,
It only remains to verify the assertion about the case where f is one to one. If
$m > 1$, then $e^{\frac{2\pi i}{m}} \ne 1$ and so for $z_1 \in V$,
\[ e^{\frac{2\pi i}{m}} \phi(z_1) \ne \phi(z_1) . \tag{25.2} \]
But $e^{\frac{2\pi i}{m}} \phi(z_1) \in B(0, \delta)$ and so there exists $z_2 \ne z_1$ (since φ is one to one) such that
$\phi(z_2) = e^{\frac{2\pi i}{m}} \phi(z_1)$. But then
\[ \phi(z_2)^m = \left( e^{\frac{2\pi i}{m}} \phi(z_1) \right)^m = \phi(z_1)^m \]
implying $f(z_2) = f(z_1)$ contradicting the assumption that f is one to one. Thus
$m = 1$ and $f'(z) = \phi'(z) \ne 0$ on V. Since f maps open sets to open sets, it follows
that $f^{-1}$ is continuous and so
\begin{align*}
\left( f^{-1} \right)'(f(z)) &= \lim_{f(z_1) \to f(z)} \frac{f^{-1}(f(z_1)) - f^{-1}(f(z))}{f(z_1) - f(z)} \\
&= \lim_{z_1 \to z} \frac{z_1 - z}{f(z_1) - f(z)} = \frac{1}{f'(z)} .
\end{align*}
Corollary 25.2 Suppose in the situation of Theorem 25.1 m > 1 for the local
representation of f given in this theorem. Then there exists δ > 0 such that if
w ∈ B (f (z0 ) , δ) = f (V ) for V an open set containing z0 , then f −1 (w) consists of
m distinct points in V. (f is m to one on V )
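The simplest instance of this corollary is $f(z) = z^m$ near $z_0 = 0$, where each small nonzero w has exactly m preimages. The following sketch, with a hypothetical helper `preimages` and assumed sample values, lists them for m = 3.

```python
import cmath
import math

def preimages(w, m):
    """The m distinct solutions z of z^m = w, for w != 0."""
    radius, theta = abs(w), cmath.phase(w)
    return [radius ** (1 / m) * cmath.exp(1j * (theta + 2 * math.pi * k) / m)
            for k in range(m)]

m = 3
w = 0.001 + 0.002j
roots = preimages(w, m)
# Each root should map back to w, and the roots should be pairwise distinct.
residual = max(abs(z ** m - w) for z in roots)
min_separation = min(abs(roots[i] - roots[j]) for i in range(m) for j in range(i + 1, m))
print(roots, residual, min_separation)
```

The roots differ from one another by the factors $e^{2\pi i k/m}$, the same rotation used in the proof of the one to one case above.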
Theorem 25.5 Let ρ be a ray starting at 0. Then there exists an analytic function,
L(z) defined on $\mathbb{C} \setminus \rho$ such that
\[ e^{L(z)} = z. \]
This function, L is called a branch of the logarithm. This branch of the logarithm
satisfies the usual formula for logarithms, $L(zw) = L(z) + L(w)$ provided $zw \notin \rho$.
where argθ (z) is the angle in (θ, θ + 2π) associated with z. (You could of course
have considered this to be the angle in (θ − 2π, θ) associated with z or in infinitely
many other open intervals of length 2π. The description of the log is not unique.)
Then letting L (z) = a + ib
for this fixed w, the equation holds for all z real. Therefore, by similar reasoning,
it holds for all complex z.)
Now suppose $z, w \in \mathbb{C} \setminus \rho$ and $zw \notin \rho$. Then
Definition 25.6 Let log denote the branch of the logarithm which corresponds to
the ray for θ = π. That is, the ray is the negative real axis. Sometimes this is called
the principal branch of the logarithm.
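The principal branch can be experimented with in Python, assuming that `cmath.log` implements it (its documented branch cut runs along the negative real axis). The sample points below are illustrative assumptions, chosen in the right half plane so that their arguments add without wrapping past ±π.

```python
import cmath

z = 1 + 2j
w = 3 - 1j          # z * w = 5 + 5j

# Defining property of a branch of the logarithm: exp(log z) = z.
exp_gap = abs(cmath.exp(cmath.log(z)) - z)

# The addition formula for this particular pair of points.
sum_gap = abs(cmath.log(z * w) - (cmath.log(z) + cmath.log(w)))
print(exp_gap, sum_gap)
```

For pairs whose arguments add up to a value outside (−π, π), the two sides of the addition formula can be expected to differ by a multiple of 2πi, so the check above is specific to points like these.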
Theorem 25.7 (maximum modulus theorem) Let Ω be a bounded region and let
$f : \Omega \to \mathbb{C}$ be analytic and $f : \overline{\Omega} \to \mathbb{C}$ continuous. Then if $z \in \Omega$,
Equality occurs for some r > 0 and a ∈ Ω if and only if f is constant in Ω hence
equality occurs for all such a, r.
Proof: The claimed inequality holds by Theorem 25.7. Suppose equality in the
above is achieved for some B (a, r) ⊆ Ω. Then by Theorem 25.7 f is equal to a
constant, w on B (a, r) . Therefore, the function, f (·) − w has a zero set which has
a limit point in Ω and so by Theorem 24.23 f (z) = w for all z ∈ Ω.
Conversely, if f is constant, then the equality in the above inequality is achieved
for all B (a, r) ⊆ Ω.
Next is yet another version of the maximum modulus principle which is in Con-
way [13]. Let Ω be an open set.
Note that if lim supz→a |f (z)| ≤ M and δ > 0, then there exists r > 0 such that
if z ∈ B 0 (a, r) ∩ S, then |f (z)| < M + δ. If a = ∞, there exists r > 0 such that if
|z| > r and z ∈ S, then |f (z)| < M + δ.
Proof: By Theorem 25.3 there exists $\log(\phi(z))$ analytic on Ω. Now define
$g(z) \equiv \exp(\eta \log(\phi(z)))$ so that $g(z) = \phi(z)^{\eta}$. Now also
\[ |g(z)| = |\exp(\eta \log(\phi(z)))| = |\exp(\eta \ln |\phi(z)|)| = |\phi(z)|^{\eta} . \]
Let $m \ge |\phi(z)|$ for all $z \in \Omega$. Define $F(z) \equiv f(z) g(z) m^{-\eta}$. Thus F is analytic
and for $b \in B$,
\[ \limsup_{z\to b} |F(z)| = \limsup_{z\to b} |f(z)| \, |\phi(z)|^{\eta} m^{-\eta} \le M m^{-\eta} \]
while for $a \in A$,
\[ \limsup_{z\to a} |F(z)| \le M. \]
Therefore, for $\alpha \in \partial_\infty \Omega$, $\limsup_{z\to\alpha} |F(z)| \le \max\left( M, M m^{-\eta} \right)$ and so by Theorem
25.11, $|f(z)| \le \left( \frac{m^{\eta}}{|\phi(z)|^{\eta}} \right) \max\left( M, M m^{-\eta} \right)$. Now let $\eta \to 0$ to obtain $|f(z)| \le M$.
In applications, it is often the case that B = {∞}.
Now here is an interesting case of this theorem. It involves a particular form for
Ω, in this case $\Omega = \left\{ z \in \mathbb{C} : |\arg(z)| < \frac{\pi}{2a} \right\}$ where $a \ge \frac{1}{2}$.
Then ∂Ω equals the two slanted lines. Also on Ω you can define a logarithm,
log (z) = ln |z| + i arg (z) where arg (z) is the angle associated with z between −π
and π. Therefore, if c is a real number you can define z c for such z in the usual way:
Proof: Let b < c < a and let $\varphi(z) \equiv \exp(-(z^{c}))$. Then as discussed above, φ (z) ≠ 0 on Ω and |φ (z)| is bounded on Ω. Now
\[ |\varphi(z)|^{\eta} = |\exp(-|z|^{c}\,\eta\cos(c\arg(z)))| \]
\[ \limsup_{z\to\infty}|f(z)|\,|\varphi(z)|^{\eta} \le \limsup_{z\to\infty}\frac{P\exp(|z|^{b})}{|\exp(|z|^{c}\,\eta\cos(c\arg(z)))|} = 0 \le M \]
Corollary 25.14 Let Ω be the open set consisting of {z ∈ C : a < Re z < b} and
suppose f is analytic on Ω, continuous on $\overline{\Omega}$, and bounded on Ω. Suppose also that
|f (z)| ≤ 1 on the two lines Re z = a and Re z = b. Then |f (z)| ≤ 1 for all z ∈ Ω.
Proof: This time let $\varphi(z) = \frac{1}{1+z-a}$. Thus |φ (z)| ≤ 1 because Re (z − a) > 0 and φ (z) ≠ 0 for all z ∈ Ω. Also, $\limsup_{z\to\infty}|\varphi(z)|^{\eta} = 0$ for every η > 0. Therefore, if w is a point of the sides of Ω, $\limsup_{z\to w}|f(z)| \le 1$ while $\limsup_{z\to\infty}|f(z)|\,|\varphi(z)|^{\eta} = 0 \le 1$ and so by Theorem 25.12, |f (z)| ≤ 1 on Ω.
This corollary yields an interesting conclusion.
Corollary 25.15 Let Ω be the open set consisting of {z ∈ C : a < Re z < b} and
suppose f is analytic on Ω , continuous on Ω, and bounded on Ω. Define
Corollary 25.16 Let $\Omega = \{z \in \mathbb{C} : |\mathrm{Im}(z)| < \frac{\pi}{2}\}$. Suppose f is analytic on Ω,
continuous on $\overline{\Omega}$, and there exist constants, α < 1 and A < ∞ such that
and
\[ \left|f\left(x \pm i\frac{\pi}{2}\right)\right| \le 1 \]
for all x ∈ R. Then |f (z)| ≤ 1 on Ω.
Proof: This time let $\varphi(z) = [\exp(A\exp(\beta z))\exp(A\exp(-\beta z))]^{-1}$ where α < β < 1. Then φ (z) ≠ 0 on Ω and for η > 0
\[ |\varphi(z)|^{\eta} = \frac{1}{|\exp(\eta A\exp(\beta z))\exp(\eta A\exp(-\beta z))|} \]
Now
and so
\[ |\varphi(z)|^{\eta} = \frac{1}{\exp[\eta A(\cos(\beta y)(e^{\beta x}+e^{-\beta x}))]} \]
Now cos βy > 0 because β < 1 and $|y| < \frac{\pi}{2}$. Therefore,
\[ \limsup_{z\to\infty}|f(z)|\,|\varphi(z)|^{\eta} \le 0 \le 1 \]
this shows 25.4 and it also verifies 25.5 on taking the limit as z → 0. If equality
holds in 25.5, then |F (z) /z| achieves a maximum at an interior point so F (z) /z
equals a constant, λ by the maximum modulus theorem. Since F (z) = λz, it follows
F′(0) = λ and so |λ| = 1.
Rudin [45] gives a memorable description of what this lemma says. It says that
if an analytic function maps the unit ball to itself, keeping 0 fixed, then it must do
one of two things, either be a rotation or move all points closer to 0. (This second
part follows in case |F′(0)| < 1 because in this case, you must have |F (z)| ≠ |z|
and so by 25.4, |F (z)| < |z|)
\[ \varphi_\alpha'(0) = 1 - |\alpha|^{2}, \qquad \varphi_\alpha'(\alpha) = \frac{1}{1-|\alpha|^{2}} \]
after a few computations. If I show that φα maps B (0, 1) to B (0, 1) for all |α| < 1, this will have shown that φα is one to one and onto B (0, 1).
Consider $|\varphi_\alpha(e^{i\theta})|$. This yields
\[ \left|\frac{e^{i\theta}-\alpha}{1-\overline{\alpha}e^{i\theta}}\right| = \left|\frac{1-\alpha e^{-i\theta}}{1-\overline{\alpha}e^{i\theta}}\right| = 1 \]
where the first equality is obtained by multiplying by $|e^{-i\theta}| = 1$. Therefore, φα maps
∂B (0, 1) one to one and onto ∂B (0, 1) . Now notice that φα is analytic on $\overline{B(0,1)}$ because the only singularity, a pole, is at $z = 1/\overline{\alpha}$. By the maximum modulus
theorem, it follows
|φα (z)| < 1
whenever |z| < 1. The same is true of φ−α .
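It only remains to verify the assertions about the derivatives. Long division gives $\varphi_\alpha(z) = (-\overline{\alpha})^{-1} + \frac{-\alpha+(\overline{\alpha})^{-1}}{1-\overline{\alpha}z}$ and so
\[ \begin{aligned} \varphi_\alpha'(z) &= (-1)(1-\overline{\alpha}z)^{-2}\left(-\alpha+(\overline{\alpha})^{-1}\right)(-\overline{\alpha}) \\ &= \overline{\alpha}\,(1-\overline{\alpha}z)^{-2}\left(-\alpha+(\overline{\alpha})^{-1}\right) \\ &= (1-\overline{\alpha}z)^{-2}\left(-|\alpha|^{2}+1\right) \end{aligned} \]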
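The boundary and derivative claims for φα can be spot-checked numerically. The following is a minimal sketch, assuming the usual definition φα(z) = (z − ᾱ-free numerator z − α) / (1 − ᾱz) from Lemma 25.18; the helper names `phi` and `derivative`, the sample value of α, and the finite-difference step are illustrative choices, not part of the text.

```python
import cmath

def phi(alpha, z):
    """Assumed definition: phi_alpha(z) = (z - alpha) / (1 - conj(alpha) * z)."""
    return (z - alpha) / (1 - alpha.conjugate() * z)

def derivative(f, z, h=1e-6):
    """Central-difference estimate of f'(z) for an analytic function f."""
    return (f(z + h) - f(z - h)) / (2 * h)

alpha = 0.3 + 0.4j  # arbitrary sample point with |alpha| = 0.5 < 1

# |phi_alpha| should equal 1 at sample points of the unit circle.
boundary_moduli = [abs(phi(alpha, cmath.exp(1j * 0.1 * k))) for k in range(63)]

# Finite-difference estimates of the derivative at 0 and at alpha.
d_at_zero = derivative(lambda z: phi(alpha, z), 0j)
d_at_alpha = derivative(lambda z: phi(alpha, z), alpha)

print(max(abs(m - 1) for m in boundary_moduli))   # deviation from 1 on the circle
print(d_at_zero, 1 - abs(alpha) ** 2)             # compare with 1 - |alpha|^2
print(d_at_alpha, 1 / (1 - abs(alpha) ** 2))      # compare with 1 / (1 - |alpha|^2)
```

This only samples finitely many points, so it is a sanity check of the formulas rather than a proof.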
25.5 Exercises
1. Consider the function, $g(z) = \frac{z-i}{z+i}$. Show this is analytic on the upper half
plane, P + and maps the upper half plane one to one and onto B (0, 1). Hint:
First show g maps the real axis to ∂B (0, 1) . This is really easy because you
end up looking at a complex number divided by its conjugate. Thus |g (z)| = 1
for z on ∂ (P +) . Now show that $\limsup_{z\to\infty}|g(z)| = 1$. Then apply a version
of the maximum modulus theorem. You might note that $g(z) = 1 + \frac{-2i}{z+i}$. This
will show |g (z)| ≤ 1. Next pick w ∈ B (0, 1) and solve g (z) = w. You just
have to show there exists a unique solution and its imaginary part is positive.
2. Does there exist an entire function f which maps C onto the upper half plane?
3. Letting g be the function of Problem 1 show that $(g^{-1})'(0) = 2i$. Also note
that $g^{-1}(0) = i$. Now suppose f is an analytic function defined on the upper
half plane which has the property that |f (z)| ≤ 1 and f (i) = β where |β| < 1.
Find an upper bound to |f′(i)| . Also find all functions, f which satisfy the
condition, f (i) = β, |f (z)| ≤ 1, and achieve this maximum value. Hint: You
could consider the function, $h(z) \equiv \varphi_\beta \circ f \circ g^{-1}(z)$ and check the conditions
for the Schwarz lemma for this function, h.
4. This and the next two problems follow a presentation of an interesting topic
in Rudin [45]. Let φα be given in Lemma 25.18. Suppose f is an analytic
function defined on B (0, 1) which satisfies |f (z)| ≤ 1. Suppose also there are
α, β ∈ B (0, 1) and it is required f (α) = β. If f is such a function, show
that $|f'(\alpha)| \le \frac{1-|\beta|^{2}}{1-|\alpha|^{2}}$. Hint: To show this consider g = φβ ◦ f ◦ φ−α . Show
g in the above problem such that equality holds in Lemma 25.17. Thus you
need g (z) = λz where |λ| = 1 and solve g = φβ ◦ f ◦ φ−α for f .
6. Suppose that f : B (0, 1) → B (0, 1) and that f is analytic, one to one, and
onto with f (α) = 0. Show there exists λ, |λ| = 1 such that f (z) = λφα (z) .
This gives a different way to look at Theorem 25.19. Hint: Let $g = f^{-1}$.
Then g′(0) f′(α) = 1. However, f (α) = 0 and g (0) = α. From Problem
4 with β = 0, you can conclude an inequality for |f′(α)| and another one
for |g′(0)| . Then use the fact that the product of these two equals 1 which
comes from the chain rule to conclude that equality must take place. Now use
Problem 5 to obtain the form of f.
7. In Corollary 25.16 show that it is essential that α < 1. That is, show there
exists an example where the conclusion is not satisfied with a slightly weaker
growth condition. Hint: Consider exp (exp (z)) .
8. Suppose {fn } is a sequence of functions which are analytic on Ω, a bounded
region such that each fn is also continuous on Ω. Suppose that {fn } converges
uniformly on ∂Ω. Show that then {fn } converges uniformly on Ω and that the
function to which the sequence converges is analytic on Ω and continuous on
Ω.
9. Suppose Ω is a bounded region and there exists a point z0 ∈ Ω such that
$|f(z_0)| = \min\{|f(z)| : z \in \Omega\}$. Can you conclude f must equal a constant?
10. Suppose f is continuous on $\overline{B(a,r)}$ and analytic on B (a, r) and that f is not
constant. Suppose also |f (z)| = C ≠ 0 for all |z − a| = r. Show that there
exists α ∈ B (a, r) such that f (α) = 0. Hint: If not, consider f /C and C/f.
Both would be analytic on B (a, r) and are equal to 1 on the boundary.
11. Suppose f is analytic on B (0, 1) but for every a ∈ ∂B (0, 1) , limz→a |f (z)| =
∞. Show there exists a sequence, {zn } ⊆ B (0, 1) such that limn→∞ |zn | = 1
and f (zn ) = 0.
Theorem 25.20 Let Ω be an open set in C and let γ : [a, b] → Ω be closed, continuous, bounded variation, and n (γ, z) = 0 for all z ∉ Ω. Suppose also that f
is analytic on Ω having zeros a1 , · · ·, am where the zeros are repeated according to
multiplicity, and suppose that none of these zeros are on γ ∗ . Then
\[ \frac{1}{2\pi i}\int_\gamma \frac{f'(z)}{f(z)}\,dz = \sum_{k=1}^{m} n(\gamma, a_k). \]
Proof: Let $f(z) = \prod_{j=1}^{m}(z-a_j)\,g(z)$ where g (z) ≠ 0 on Ω. Hence
\[ \frac{f'(z)}{f(z)} = \sum_{j=1}^{m}\frac{1}{z-a_j} + \frac{g'(z)}{g(z)} \]
and so
\[ \frac{1}{2\pi i}\int_\gamma\frac{f'(z)}{f(z)}\,dz = \sum_{j=1}^{m} n(\gamma,a_j) + \frac{1}{2\pi i}\int_\gamma\frac{g'(z)}{g(z)}\,dz. \]
But the function, $z \to \frac{g'(z)}{g(z)}$ is analytic and so by Corollary 24.47, the last integral
in the above expression equals 0. Therefore, this proves the theorem.
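The counting formula of Theorem 25.20 can be illustrated numerically when γ is a circle traversed once counterclockwise, so that n(γ, a) is 1 for zeros inside and 0 for zeros outside. The sketch below is only an illustration under those assumptions: the function name `count_zeros`, the sample polynomial, and the number of quadrature points are hypothetical choices, and the integral is approximated by an equally spaced Riemann sum.

```python
import cmath
import math

def count_zeros(f, df, center, radius, n=4000):
    """Approximate (1/(2 pi i)) * integral of f'/f over |z - center| = radius."""
    total = 0j
    for k in range(n):
        t = 2 * math.pi * k / n
        offset = radius * cmath.exp(1j * t)
        z = center + offset
        dz = 1j * offset * (2 * math.pi / n)   # gamma'(t) dt
        total += df(z) / f(z) * dz
    return total / (2j * math.pi)

# Sample function with zeros at 0 (double), 1 and 3.
f = lambda z: z**2 * (z - 1) * (z - 3)
df = lambda z: 2 * z * (z - 1) * (z - 3) + z**2 * (z - 3) + z**2 * (z - 1)

small = count_zeros(f, df, 0, 0.5)   # encloses only the double zero at 0
middle = count_zeros(f, df, 0, 2.0)  # encloses 0 (twice) and 1
large = count_zeros(f, df, 0, 4.0)   # encloses all four zeros
print(small, middle, large)
```

The three values should be close to 2, 3 and 4, the zeros being counted according to multiplicity as in the theorem.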
The following picture is descriptive of the situation described in the next theo-
rem.
\[ |f(\gamma(t)) - f(\gamma(s))| = \left|\int_0^1 f'(\gamma(s)+\lambda(\gamma(t)-\gamma(s)))\,(\gamma(t)-\gamma(s))\,d\lambda\right| \le C\,|\gamma(t)-\gamma(s)| \]
where $C \ge \max\{|f'(z)| : z \in B\}$. Hence, in this case,
Now let ε denote the distance between γ ∗ and C \ Ω. Since γ ∗ is compact, ε > 0.
By uniform continuity there exists $\delta = \frac{b-a}{p}$ for p a positive integer such that if
|s − t| < δ, then $|\gamma(s)-\gamma(t)| < \frac{\varepsilon}{2}$. Then
\[ \gamma([t,t+\delta]) \subseteq B\left(\gamma(t),\frac{\varepsilon}{2}\right) \subseteq \Omega. \]
Let $C \ge \max\{|f'(z)| : z \in \cup_{j=1}^{p} B(\gamma(t_j),\frac{\varepsilon}{2})\}$ where $t_j \equiv \frac{j}{p}(b-a)+a$. Then from
what was just shown,
\[ V(f\circ\gamma,[a,b]) \le \sum_{j=0}^{p-1} V(f\circ\gamma,[t_j,t_{j+1}]) \le C\sum_{j=0}^{p-1} V(\gamma,[t_j,t_{j+1}]) < \infty \]
showing that f ◦ γ is bounded variation as claimed. Now from Theorem 24.42 there
exists $\eta \in C^{1}([a,b])$ such that
and
n (η, ak ) = n (γ, ak ) , n (f ◦ γ, α) = n (f ◦ η, α) (25.7)
for k = 1, · · ·, m. Then
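\[ \begin{aligned} n(f\circ\gamma,\alpha) &= n(f\circ\eta,\alpha) \\ &= \frac{1}{2\pi i}\int_{f\circ\eta}\frac{dw}{w-\alpha} \\ &= \frac{1}{2\pi i}\int_a^b\frac{f'(\eta(t))}{f(\eta(t))-\alpha}\,\eta'(t)\,dt \\ &= \frac{1}{2\pi i}\int_\eta\frac{f'(z)}{f(z)-\alpha}\,dz \\ &= \sum_{k=1}^{m} n(\eta,a_k) \end{aligned} \]
by Theorem 25.20. By 25.7, this equals $\sum_{k=1}^{m} n(\gamma,a_k)$ which proves the theorem.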
The next theorem is incredible and is very interesting for its own sake. The
following picture is descriptive of the situation of this theorem.
[Figure: the points a1, a2, a3, a4 near a in B(a, ε) are mapped by f to a point z near α in B(α, δ).]
where g (z) ≠ 0 in B (a, R) . (f (z) − α has a zero of order m at z = a.) Then there
exist ε, δ > 0 with the property that for each z satisfying 0 < |z − α| < δ, there exist
points,
\[ \{a_1,\cdots,a_m\} \subseteq B(a,\varepsilon), \]
such that
\[ f^{-1}(z)\cap B(a,\varepsilon) = \{a_1,\cdots,a_m\} \]
and each ak is a zero of order 1 for the function f (·) − z.
[Figure: a disc of radius 2ε on which f′ ≠ 0 and f − α ≠ 0 away from the center.]
Proof: If f is not constant, then for every α ∈ f (Ω) , it follows from Theorem
24.23 that f (·) − α has a zero of order m < ∞ and so from Theorem 25.22 for each
a ∈ Ω there exist ε, δ > 0 such that f (B (a, ε)) ⊇ B (α, δ) which clearly implies
that f maps open sets to open sets. Therefore, f (Ω) is open, connected because f
is continuous. If f is one to one, Theorem 25.22 implies that for every α ∈ f (Ω)
the zero of f (·) − α is of order 1. Otherwise, that theorem implies that for z near
α, there are m points which f maps to z contradicting the assumption that f is one
to one. Therefore, f′(z) ≠ 0 and since $f^{-1}$ is continuous, due to f being an open
map, it follows
\[ (f^{-1})'(f(z)) = \lim_{f(z_1)\to f(z)}\frac{f^{-1}(f(z_1)) - f^{-1}(f(z))}{f(z_1)-f(z)} = \lim_{z_1\to z}\frac{z_1-z}{f(z_1)-f(z)} = \frac{1}{f'(z)}. \]
This theorem says to add up the absolute values of the entries of the ith row
which are off the main diagonal and form the disc centered at aii having this radius.
The union of these discs contains σ (A) .
Proof: Suppose Ax = λx where x ≠ 0. Then for A = (aij )
\[ \sum_{j\ne i} a_{ij}x_j = (\lambda - a_{ii})\,x_i. \]
Therefore, if we pick k such that |xk | ≥ |xj | for all xj , it follows that |xk | ≠ 0 since
|x| ≠ 0 and
\[ |x_k|\sum_{j\ne k}|a_{kj}| \ge \sum_{j\ne k}|a_{kj}|\,|x_j| \ge |\lambda-a_{kk}|\,|x_k|. \]
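The disc construction described above (center aii, radius the sum of the absolute values of the off-diagonal entries of row i) is easy to compute directly. The following is a small sketch with a hypothetical 2 × 2 matrix, chosen so that the eigenvalues can be found from the quadratic formula without any linear algebra library; the function names are illustrative only.

```python
import cmath

def gerschgorin_discs(a):
    """Return (center, radius) per row: center a[i][i], radius sum of |a[i][j]| for j != i."""
    return [(row[i], sum(abs(x) for j, x in enumerate(row) if j != i))
            for i, row in enumerate(a)]

def eigenvalues_2x2(a):
    """Roots of the characteristic polynomial z^2 - tr(A) z + det(A) of a 2 x 2 matrix."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

A = [[4, 1], [2, -3]]   # hypothetical example matrix
discs = gerschgorin_discs(A)
eigs = eigenvalues_2x2(A)

# Every eigenvalue should lie in at least one disc.
covered = [any(abs(lam - c) <= r for c, r in discs) for lam in eigs]
print(discs, eigs, covered)
```

Here the two discs are disjoint, and each one ends up holding one of the two eigenvalues.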
More can be said using the theory about counting zeros. To begin with, define the
distance between two n × n matrices, A = (aij ) and B = (bij ) as follows.
\[ ||A-B||^{2} \equiv \sum_{ij}|a_{ij}-b_{ij}|^{2}. \]
Thus two matrices are close if and only if their corresponding entries are close.
Let A be an n × n matrix. Recall the eigenvalues of A are given by the zeros
of the polynomial, $p_A(z) = \det(zI - A)$ where I is the n × n identity. Then small
changes in A will produce small changes in $p_A(z)$ and $p_A'(z)$. Let γ k denote a very
small closed circle which winds around zk , one of the eigenvalues of A, in the counter
clockwise direction so that n (γ k , zk ) = 1. This circle is to enclose only zk and is
to have no other eigenvalue on it. Then apply Theorem 25.20. According to this
theorem
\[ \frac{1}{2\pi i}\int_{\gamma_k}\frac{p_A'(z)}{p_A(z)}\,dz \]
is always an integer equal to the multiplicity of zk as a root of pA (t) . Therefore,
small changes in A result in no change to the above contour integral because it
must be an integer and small changes in A result in small changes in the integral.
Therefore whenever every entry of the matrix B is close enough to the corresponding
entry of the matrix A, the two matrices have the same number of zeros inside γ k
under the usual convention that zeros are to be counted according to multiplicity. By
making the radius of the small circle equal to ε where ε is less than the minimum
distance between any two distinct eigenvalues of A, this shows that if B is close
enough to A, every eigenvalue of B is closer than ε to some eigenvalue of A. The
next theorem is about continuous dependence of eigenvalues.
Lemma 25.26 Let λ (t) ∈ σ (A (t)) for t < 1 and let Σt = ∪s≥t σ (A (s)) . Also let
Kt be the connected component of λ (t) in Σt . Then there exists η > 0 such that
Kt ∩ σ (A (s)) ≠ ∅ for all s ∈ [t, t + η] .
Proof: Denote by D (λ (t) , δ) the disc centered at λ (t) having radius δ > 0,
with other occurrences of this notation being defined similarly. Thus
D (λ (t) , δ) ≡ {z ∈ C : |λ (t) − z| ≤ δ} .
Suppose δ > 0 is small enough that λ (t) is the only element of σ (A (t)) contained
in D (λ (t) , δ) and that pA(t) has no zeroes on the boundary of this disc. Then by
continuity, and the above discussion and theorem, there exists η > 0, t + η < 1, such
that for s ∈ [t, t + η] , pA(s) also has no zeroes on the boundary of this disc and that
A (s) has the same number of eigenvalues, counted according to multiplicity, in the
disc as A (t) . Thus σ (A (s)) ∩ D (λ (t) , δ) ≠ ∅ for all s ∈ [t, t + η] . Now let
\[ H = \bigcup_{s\in[t,t+\eta]}\sigma(A(s))\cap D(\lambda(t),\delta). \]
and from the above discussion, for some choice of sn → s0 , λ (sn ) → λ (s0 ) which
contradicts P and Q separated and nonempty. Since P is nonempty, this shows
Q = ∅. Therefore, H is connected as claimed. But Kt ⊇ H and so Kt ∩ σ (A (s)) ≠ ∅
for all s ∈ [t, t + η] . This proves the lemma.
The following is the necessary theorem.
Corollary 25.28 Suppose one of the Gerschgorin discs, Di is disjoint from the
union of the others. Then Di contains an eigenvalue of A. Also, if there are n
disjoint Gerschgorin discs, then each one contains an eigenvalue of A.
Proof: Denote by A (t) the matrix $(a^{t}_{ij})$ where if i ≠ j, $a^{t}_{ij} = t\,a_{ij}$ and $a^{t}_{ii} = a_{ii}$.
Thus to get A (t) we multiply all non diagonal terms by t. Let t ∈ [0, 1] . Then
A (0) = diag (a11 , · · ·, ann ) and A (1) = A. Furthermore, the map, t → A (t) is
continuous. Denote by $D_j^{t}$ the Gerschgorin disc obtained from the j th row for the
matrix, A (t). Then it is clear that $D_j^{t} \subseteq D_j$, the j th Gerschgorin disc for A. Then
aii is the eigenvalue for A (0) which is contained in the disc, consisting of the single
point aii which is contained in Di . Letting K be the connected component in Σ for
Σ defined in Theorem 25.27 which is determined by aii , it follows by Gerschgorin’s
theorem that $K\cap\sigma(A(t)) \subseteq \cup_{j=1}^{n}D_j^{t} \subseteq \cup_{j=1}^{n}D_j = D_i\cup(\cup_{j\ne i}D_j)$ and also, since
K is connected, there are no points of K in both Di and $(\cup_{j\ne i}D_j)$ . Since at least
one point of K is in Di , (aii ) it follows all of K must be contained in Di . Now by
Theorem 25.27 this shows there are points of K ∩ σ (A) in Di . The last assertion
follows immediately.
Actually, this can be improved slightly. It involves the following lemma.
Proof: Let S ≡
Corollary 25.30 Suppose one of the Gerschgorin discs, Di is disjoint from the
union of the others. Then Di contains exactly one eigenvalue of A and this eigen-
value is a simple root to the characteristic polynomial of A.
Proof: In the proof of Corollary 25.28, first note that aii is a simple root of A (0)
since otherwise the ith Gerschgorin disc would not be disjoint from the others. Also,
K, the connected component determined by aii must be contained in Di because it
is connected and by Gerschgorin’s theorem above, K ∩ σ (A (t)) must be contained
in the union of the Gerschgorin discs. Since all the other eigenvalues of A (0) , the
ajj , are outside Di , it follows that K ∩ σ (A (0)) = aii . Therefore, by Lemma 25.29,
K ∩ σ (A (1)) = K ∩ σ (A) consists of a single simple eigenvalue. This proves the
corollary.
The Gerschgorin discs are D (5, 1) , D (1, 2) , and D (0, 1) . Then D (5, 1) is dis-
joint from the other discs. Therefore, there should be an eigenvalue in D (5, 1) .
The actual eigenvalues are not easy to find. They are the roots of the characteristic
equation, $t^{3} - 6t^{2} + 3t + 5 = 0$. The numerical values of these are −0.66966, 1.4231,
and 5.24655, verifying the predictions of Gerschgorin’s theorem.
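The quoted numbers can be checked against the characteristic equation and the three discs without knowing the matrix itself. The sketch below uses only the data stated in the text (the cubic, the three rounded roots, and the discs D(5, 1), D(1, 2), D(0, 1)); the variable names are illustrative.

```python
def p(t):
    """Characteristic polynomial quoted in the text."""
    return t**3 - 6 * t**2 + 3 * t + 5

roots = [-0.66966, 1.4231, 5.24655]   # rounded values quoted in the text
discs = [(5, 1), (1, 2), (0, 1)]      # (center, radius) of the Gerschgorin discs

# Residuals are small but not zero because the roots are rounded.
residuals = [abs(p(t)) for t in roots]

# The isolated disc D(5, 1) should contain exactly one of the roots.
in_isolated_disc = [t for t in roots if abs(t - 5) <= 1]

# Every root should lie in at least one disc.
all_covered = all(any(abs(t - c) <= r for c, r in discs) for t in roots)

print(residuals, in_isolated_disc, all_covered)
```

The single root found in D(5, 1) is consistent with Corollary 25.30, which predicts exactly one simple eigenvalue in a disc disjoint from the others.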
25.8 Exercises
1. Use Theorem 25.20 to give an alternate proof of the fundamental theorem
of algebra. Hint: Take a contour of the form $\gamma_r = re^{it}$ where t ∈ [0, 2π] .
Consider $\int_{\gamma_r}\frac{p'(z)}{p(z)}\,dz$ and consider the limit as r → ∞.
Argue that small changes will produce small changes in pM (z) . Then apply
Theorem 25.20 using γ k a very small circle surrounding zk , the k th eigenvalue.
3. Suppose that two analytic functions defined on a region are equal on some
set, S which contains a limit point. (Recall p is a limit point of S if every
open set which contains p, also contains infinitely many points of S. ) Show
the two functions coincide. We defined $e^{z} \equiv e^{x}(\cos y + i\sin y)$ earlier and we
showed that $e^{z}$, defined this way was analytic on C. Is there any other way
to define $e^{z}$ on all of C such that the function coincides with $e^{x}$ on the real
axis?
4. You know various identities for real valued functions. For example $\cosh^{2}x - \sinh^{2}x = 1$. If you define $\cosh z \equiv \frac{e^{z}+e^{-z}}{2}$ and $\sinh z \equiv \frac{e^{z}-e^{-z}}{2}$, does it follow
that
\[ \cosh^{2}z - \sinh^{2}z = 1 \]
for all z ∈ C? What about
Can you verify these sorts of identities just from your knowledge about what
happens for real arguments?
6. Let f : U → C be analytic and one to one. Show that f′(z) ≠ 0 for all z ∈ U.
Does this hold for a function of a real variable?
At this point, recall Corollary 24.47 which is stated here for convenience.
for all z ∉ Ω. Then if f : Ω → C is analytic,
\[ \sum_{k=1}^{m}\int_{\gamma_k} f(w)\,dw = 0. \]
The following theorem is called the residue theorem. Note the resemblance to
Corollary 24.47.
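for all z ∉ Ω. Then if $f : \Omega \to \widehat{\mathbb{C}}$ is meromorphic such that no $\gamma_k^{*}$ contains any poles
of f ,
\[ \frac{1}{2\pi i}\sum_{k=1}^{m}\int_{\gamma_k} f(w)\,dw = \sum_{\alpha\in A}\mathrm{res}(f,\alpha)\sum_{k=1}^{m} n(\gamma_k,\alpha) \tag{26.1} \]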
706 RESIDUES
where here A denotes the set of poles of f in Ω. The sum on the right is a finite
sum.
Proof: First note that there are at most finitely many α which are not in
the unbounded component of $\mathbb{C}\setminus\cup_{k=1}^{m}\gamma_k([a_k,b_k])$ . Thus there exists a finite set,
{α1 , · · ·, αN } ⊆ A such that these are the only possibilities for which $\sum_{k=1}^{m} n(\gamma_k,\alpha)$
might not equal zero. Therefore, 26.1 reduces to
\[ \frac{1}{2\pi i}\sum_{k=1}^{m}\int_{\gamma_k} f(w)\,dw = \sum_{j=1}^{N}\mathrm{res}(f,\alpha_j)\sum_{k=1}^{m} n(\gamma_k,\alpha_j) \]
Now
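\[ \begin{aligned} \sum_{k=1}^{m}\int_{\gamma_k} Q_j(z)\,dz &= \sum_{k=1}^{m}\int_{\gamma_k}\left(\frac{b_1^{j}}{(z-\alpha_j)} + \sum_{r=2}^{m_j}\frac{b_r^{j}}{(z-\alpha_j)^{r}}\right)dz \\ &= \sum_{k=1}^{m}\int_{\gamma_k}\frac{b_1^{j}}{(z-\alpha_j)}\,dz \equiv \sum_{k=1}^{m} n(\gamma_k,\alpha_j)\,\mathrm{res}(f,\alpha_j)\,(2\pi i). \end{aligned} \]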
Therefore,
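\[ \begin{aligned} \sum_{k=1}^{m}\int_{\gamma_k} f(z)\,dz &= \sum_{j=1}^{N}\sum_{k=1}^{m}\int_{\gamma_k} Q_j(z)\,dz \\ &= \sum_{j=1}^{N}\sum_{k=1}^{m} n(\gamma_k,\alpha_j)\,\mathrm{res}(f,\alpha_j)\,(2\pi i) \\ &= 2\pi i\sum_{j=1}^{N}\mathrm{res}(f,\alpha_j)\sum_{k=1}^{m} n(\gamma_k,\alpha_j) \\ &= (2\pi i)\sum_{\alpha\in A}\mathrm{res}(f,\alpha)\sum_{k=1}^{m} n(\gamma_k,\alpha) \end{aligned} \]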
Example 26.4 Find $\lim_{R\to\infty}\int_{-R}^{R}\frac{\sin(x)}{x}\,dx$
This gives the same answer because cos (x) /x is odd. Consider the following contour
in which the orientation involves counterclockwise motion exactly once around.
[Figure: the contour, with the points −R, −R⁻¹, R⁻¹ and R marked on the real axis.]
Denote by γ R−1 the little circle and γ R the big one. Then on the inside of this
contour there are no singularities of eiz /z and it is contained in an open set with
the property that the winding number with respect to this contour about any point
not in the open set equals zero. By Theorem 24.22
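\[ \frac{1}{i}\left(\int_{-R}^{-R^{-1}}\frac{e^{ix}}{x}\,dx + \int_{\gamma_{R^{-1}}}\frac{e^{iz}}{z}\,dz + \int_{R^{-1}}^{R}\frac{e^{ix}}{x}\,dx + \int_{\gamma_R}\frac{e^{iz}}{z}\,dz\right) = 0 \tag{26.2} \]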
Now
\[ \left|\int_{\gamma_R}\frac{e^{iz}}{z}\,dz\right| = \left|\int_0^{\pi} e^{R(i\cos\theta-\sin\theta)}\,i\,d\theta\right| \le \int_0^{\pi} e^{-R\sin\theta}\,d\theta \]
and this last integral converges to 0 by the dominated convergence theorem. Now
consider the other circle. By the dominated convergence theorem again,
\[ \int_{\gamma_{R^{-1}}}\frac{e^{iz}}{z}\,dz = \int_{\pi}^{0} e^{R^{-1}(i\cos\theta-\sin\theta)}\,i\,d\theta \to -i\pi \]
This equals
\[ \begin{aligned} &\lim_{R\to\infty}\int_{-R}^{R}\frac{\sin(x)}{x}\,(\cos(xt)+i\sin(xt))\,dx \\ &= \lim_{R\to\infty}\int_{-R}^{R}\frac{\sin(x)}{x}\cos(xt)\,dx \\ &= \lim_{R\to\infty}\frac{1}{2}\int_{-R}^{R}\frac{\sin(x(t+1))+\sin(x(1-t))}{x}\,dx \end{aligned} \]
Let t ≠ 1, −1. Then changing variables yields
\[ \lim_{R\to\infty}\left(\frac{1}{2}\int_{-R(1+t)}^{R(1+t)}\frac{\sin(u)}{u}\,du + \frac{1}{2}\int_{-R(1-t)}^{R(1-t)}\frac{\sin(u)}{u}\,du\right). \]
In case |t| < 1 Example 26.4 implies this limit is π. However, if t > 1 the limit
equals 0 and this is also the case if t < −1. Summarizing,
\[ \lim_{R\to\infty}\int_{-R}^{R} e^{ixt}\,\frac{\sin x}{x}\,dx = \begin{cases}\pi & \text{if } |t|<1 \\ 0 & \text{if } |t|>1\end{cases}. \]
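The two cases of this limit can be observed numerically for a large but finite R. The following is a rough sketch, not a proof: it applies a composite Simpson rule to the real part sin(x) cos(xt)/x (the imaginary part is odd and integrates to zero), with R = 200, the step count, and the sample values t = 0.5 and t = 2 all being arbitrary choices. The truncation at finite R leaves an error of order 1/R, so only loose agreement should be expected.

```python
import math

def integrand(x, t):
    """sin(x) cos(x t) / x, with the removable singularity at x = 0 filled in."""
    if x == 0:
        return 1.0
    return math.sin(x) * math.cos(x * t) / x

def simpson(f, a, b, n):
    """Composite Simpson rule with n subintervals (n must be even)."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

R = 200.0
inside = simpson(lambda x: integrand(x, 0.5), -R, R, 200000)   # |t| < 1
outside = simpson(lambda x: integrand(x, 2.0), -R, R, 200000)  # |t| > 1
print(inside, math.pi)
print(outside, 0.0)
```

Increasing R (with the number of subintervals scaled accordingly) should bring the two values closer to π and 0.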
conclusion is obvious. Nevertheless, to avoid using this big topological result and
to attain some extra generality, I will state the following theorem in terms of the
winding number to avoid using it. This theorem is called the argument principle.
First recall that f has a zero of order m at α if $f(z) = g(z)(z-\alpha)^{m}$ where g is
an analytic function which is not equal to zero at α. This is equivalent to having
$f(z) = \sum_{k=m}^{\infty} a_k(z-\alpha)^{k}$ for z near α where $a_m \ne 0$. Also recall that f has a pole
of order m at α if for z near α, f (z) is of the form
\[ f(z) = h(z) + \sum_{k=1}^{m}\frac{b_k}{(z-\alpha)^{k}} \tag{26.3} \]
Proof: This theorem follows from computing the residues of f′/f. It has residues
at poles and zeros. I will do this now. First suppose f has a pole of order p at α.
Then f has the form given in 26.3. Therefore,
\[ \frac{f'(z)}{f(z)} = \frac{h'(z) - \sum_{k=1}^{p}\frac{k\,b_k}{(z-\alpha)^{k+1}}}{h(z) + \sum_{k=1}^{p}\frac{b_k}{(z-\alpha)^{k}}} \]
and from this it is clear res (f′/f ) = p, the order of the zero. The conclusion of this
theorem now follows from Theorem 26.3.
One can also generalize the theorem to the case where there are many closed
curves involved. This is proved in the same way as the above.
\[ g(z)\,\frac{f'(z)}{f(z)} = \frac{g(z)\left(h'(z) - \sum_{k=1}^{m}\frac{k\,b_k}{(z-\alpha)^{k+1}}\right)}{h(z) + \sum_{k=1}^{m}\frac{b_k}{(z-\alpha)^{k}}} \]
From this, it is clear res (g (f′/f ) , α) = −mg (α) , where m is the order of the pole.
Thus α would have been listed m times in the list of poles. Hence the residue at
this point is equivalent to adding −g (α) m times.
26.1. ROUCHE’S THEOREM AND THE ARGUMENT PRINCIPLE 711
and from this it is clear res (g (f′/f )) = g (α) m, where m is the order of the zero.
The conclusion of this theorem now follows from the residue theorem, Theorem
26.3.
The way people usually apply these theorems is to suppose γ ∗ is a simple closed
bounded variation curve, often a circle. Thus it has an inside and an outside, the
outside being the unbounded component of C\γ ∗ . The orientation of the curve is
such that you go around it once in the counterclockwise direction. Then letting rk
and lk be as described, the conclusion of the theorem follows. In applications, this
is likely the way it will be.
Then
Zf − Pf = Zg − Pg .
Letting
³ ´ l denote a branch of the logarithm defined on C \ [0, ∞), it follows that
l fg(z)
(z)
is a primitive for the function,
0
(f /g) f0 g0
= − .
(f /g) f g
Therefore, by the argument principle,
Z 0 Z µ 0 ¶
1 (f /g) 1 f g0
0 = dz = − dz
2πi γ (f /g) 2πi γ f g
= Zf − Pf − (Zg − Pg ) .
Corollary 26.10 In the situation of Theorem 26.9 change 26.4 to the condition,
Theorem 26.11 Let Ω be a bounded open set and suppose f, g are continuous on $\overline{\Omega}$
and analytic on Ω. Also suppose |f (z)| < |g (z)| on ∂Ω. Then g and f + g have the
same number of zeros in Ω provided each zero is counted according to multiplicity.
Proof: Let $K = \{z \in \overline{\Omega} : |f(z)| \ge |g(z)|\}$. Then letting λ ∈ [0, 1] , if z ∉ K,
then |f (z)| < |g (z)| and so
which shows that all zeros of g + λf are contained in K which must be a compact
subset of Ω due to the assumption that |f (z)| < |g (z)| on ∂Ω. By Theorem 24.52 on
Page 675 there exists a cycle, $\{\gamma_k\}_{k=1}^{n}$ such that $\cup_{k=1}^{n}\gamma_k^{*} \subseteq \Omega\setminus K$, $\sum_{k=1}^{n} n(\gamma_k,z) = 1$
for every z ∈ K and $\sum_{k=1}^{n} n(\gamma_k,z) = 0$ for all z ∉ Ω. Then as above, it follows
from the residue theorem or more directly, Theorem 26.7,
\[ \sum_{k=1}^{n}\frac{1}{2\pi i}\int_{\gamma_k}\frac{\lambda f'(z)+g'(z)}{\lambda f(z)+g(z)}\,dz = \sum_{j=1}^{p} m_j \]
26.2. SINGULARITIES AND THE LAURENT SERIES 713
is integer valued and continuous so it gives the same value when λ = 0 as when
λ = 1. When λ = 0 this gives the number of zeros of g in Ω and when λ = 1 it is
the number of zeros of f + g. This proves the theorem.
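A concrete instance may make the statement easier to picture. In the sketch below, Ω is taken to be the unit disc, and the functions f(z) = z⁵ + 1 and g(z) = 3z are hypothetical choices for which |f| ≤ 2 < 3 = |g| on the unit circle. The zero counts are approximated with the integral of h′/h over the unit circle; the function name `count_zeros` and the sample counts are illustrative.

```python
import cmath
import math

def count_zeros(h, dh, n=4000):
    """Approximate (1/(2 pi i)) * integral of h'/h over the unit circle."""
    total = 0j
    for k in range(n):
        z = cmath.exp(2j * math.pi * k / n)
        total += dh(z) / h(z) * (1j * z) * (2 * math.pi / n)   # dz = i z dt
    return total / (2j * math.pi)

f = lambda z: z**5 + 1   # hypothetical "small" function on the boundary
g = lambda z: 3 * z      # hypothetical "large" function on the boundary

# Sampled check of the hypothesis |f| < |g| on the unit circle.
samples = [cmath.exp(0.01j * k) for k in range(629)]
boundary_ratio = max(abs(f(z)) / abs(g(z)) for z in samples)

zeros_g = count_zeros(g, lambda z: 3)
zeros_sum = count_zeros(lambda z: f(z) + g(z), lambda z: 5 * z**4 + 3)
print(boundary_ratio, zeros_g, zeros_sum)
```

Both counts should come out close to 1, so z⁵ + 3z + 1 has exactly one zero in the unit disc, matching the single zero of 3z.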
Here is another formulation of this theorem.
Corollary 26.12 Let Ω be a bounded open set and suppose f, g are continuous on
$\overline{\Omega}$ and analytic on Ω. Also suppose |f (z) − g (z)| < |g (z)| on ∂Ω. Then f and
g have the same number of zeros in Ω provided each zero is counted according to
multiplicity.
Thus ann (a, 0, R) would denote the punctured ball, B (a, R) \ {a} and when
R1 > 0, the annulus looks like the following.
[Figure: an annulus centered at a, with a point z marked.]
Lemma 26.14 Let $\gamma_r(t) \equiv a + re^{it}$ for t ∈ [0, 2π] and let |z − a| < r. Then
n (γ r , z) = 1. If |z − a| > r, then n (γ r , z) = 0.
f (t) ≡ n (γ r , a + t (z − a)) .
Then from properties of the winding number derived earlier, f (t) ∈ Z, f is continu-
ous, and f (0) = 1. Therefore, f (t) = 1 for all t ∈ [0, 1] . This proves the first claim
because f (1) = n (γ r , z) .
For the second claim,
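\[ \begin{aligned} n(\gamma_r,z) &= \frac{1}{2\pi i}\int_{\gamma_r}\frac{1}{w-z}\,dw \\ &= \frac{1}{2\pi i}\int_{\gamma_r}\frac{1}{w-a-(z-a)}\,dw \\ &= \frac{1}{2\pi i}\,\frac{-1}{z-a}\int_{\gamma_r}\frac{1}{1-\left(\frac{w-a}{z-a}\right)}\,dw \\ &= \frac{-1}{2\pi i\,(z-a)}\int_{\gamma_r}\sum_{k=0}^{\infty}\left(\frac{w-a}{z-a}\right)^{k}dw. \end{aligned} \]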
for some c > 0 due to the assumption that |z − a| > r. Therefore, the sum and the
integral can be interchanged to give
\[ n(\gamma_r,z) = \frac{-1}{2\pi i\,(z-a)}\sum_{k=0}^{\infty}\int_{\gamma_r}\left(\frac{w-a}{z-a}\right)^{k}dw = 0 \]
because $w \to \left(\frac{w-a}{z-a}\right)^{k}$ has an antiderivative. This proves the lemma.
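Both claims of Lemma 26.14 can be checked numerically by approximating the defining integral of the winding number. This is a minimal sketch; the function name `winding_number`, the center, radius, and test points are arbitrary illustrative choices, and the integral is approximated by an equally spaced sum over the circle.

```python
import cmath
import math

def winding_number(a, r, z, n=2000):
    """Approximate (1/(2 pi i)) * integral of dw / (w - z) over w = a + r e^{it}."""
    total = 0j
    for k in range(n):
        offset = r * cmath.exp(2j * math.pi * k / n)
        w = a + offset
        dw = 1j * offset * (2 * math.pi / n)   # gamma_r'(t) dt
        total += dw / (w - z)
    return total / (2j * math.pi)

a, r = 1 + 1j, 2.0
inside = winding_number(a, r, 1.5 + 0.5j)   # |z - a| < r
outside = winding_number(a, r, 4 + 1j)      # |z - a| = 3 > r
print(inside, outside)
```

The first value should be close to 1 and the second close to 0, in line with the lemma.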
Now consider the following picture which pertains to the next lemma.
[Figure: the circle γ r of radius r centered at a.]
Lemma 26.15 Let g be analytic on ann (a, R1 , R2 ) . Then if $\gamma_r(t) \equiv a + re^{it}$ for
t ∈ [0, 2π] and r ∈ (R1 , R2 ) , then $\int_{\gamma_r} g(z)\,dz$ is independent of r.
Proof: Let R1 < r1 < r2 < R2 and denote by −γ r (t) the curve, $-\gamma_r(t) \equiv a + re^{i(2\pi-t)}$ for t ∈ [0, 2π] . Then if $z \in \overline{B(a,R_1)}$, Lemma 26.14 implies both
$n(\gamma_{r_2},z)$ and $n(\gamma_{r_1},z) = 1$ and so
\[ n(-\gamma_{r_1},z) + n(\gamma_{r_2},z) = -1 + 1 = 0. \]
Also if z ∉ B (a, R2 ) , then Lemma 26.14 implies $n(\gamma_{r_j},z) = 0$ for j = 1, 2.
Therefore, whenever z ∉ ann (a, R1 , R2 ) , the sum of the winding numbers equals
zero. Therefore, by Theorem 24.46 applied to the function, f (w) = g (w) (w − z)
and $z \in \mathrm{ann}(a,R_1,R_2)\setminus\cup_{j=1}^{2}\gamma_{r_j}([0,2\pi])$,
\[ \begin{aligned} f(z)\left(n(\gamma_{r_2},z)+n(-\gamma_{r_1},z)\right) &= 0\left(n(\gamma_{r_2},z)+n(-\gamma_{r_1},z)\right) \\ &= \frac{1}{2\pi i}\int_{\gamma_{r_2}}\frac{g(w)(w-z)}{w-z}\,dw - \frac{1}{2\pi i}\int_{\gamma_{r_1}}\frac{g(w)(w-z)}{w-z}\,dw \\ &= \frac{1}{2\pi i}\int_{\gamma_{r_2}} g(w)\,dw - \frac{1}{2\pi i}\int_{\gamma_{r_1}} g(w)\,dw \end{aligned} \]
Now for |z − a| ≥ r1 ,
\[ |z-a|^{-n} \le \frac{1}{r_1^{n}} \]
and so for all sufficiently large n
\[ |a_{-n}|\,|z-a|^{-n} \le \frac{(r_1-\delta)^{n}}{r_1^{n}}. \]
Therefore, by the Weierstrass M test, the series, $\sum_{n=1}^{\infty} a_{-n}(z-a)^{-n}$ converges
absolutely and uniformly on the set
\[ \{z \in \mathbb{C} : |z-a| \ge r_1\}. \]
Similar reasoning shows the series, $\sum_{n=0}^{\infty} a_n(z-a)^{n}$ converges uniformly on the set
\[ \{z \in \mathbb{C} : |z-a| \le r_2\}. \]
Theorem 26.18 Let f be analytic on ann (a, R1 , R2 ) . Then there exist numbers,
an ∈ C such that for all z ∈ ann (a, R1 , R2 ) ,
\[ f(z) = \sum_{n=-\infty}^{\infty} a_n(z-a)^{n}, \tag{26.5} \]
where the series converges absolutely and uniformly on ann (a, r1 , r2 ) whenever R1 <
r1 < r2 < R2 . Also
\[ a_n = \frac{1}{2\pi i}\int_\gamma\frac{f(w)}{(w-a)^{n+1}}\,dw \tag{26.6} \]
where $\gamma(t) = a + re^{it}$, t ∈ [0, 2π] for any r ∈ (R1 , R2 ) . Furthermore the series is
unique in the sense that if 26.5 holds for z ∈ ann (a, R1 , R2 ) , then an is given in
26.6.
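Formula 26.6 can be evaluated numerically to recover Laurent coefficients. The sketch below is an illustration under stated assumptions: the sample function f(z) = 1/(z(1 − z)) on ann(0, 0, 1), whose expansion there is z⁻¹ + 1 + z + z² + · · ·, the radius 0.5, and the helper name `laurent_coefficient` are all hypothetical choices; the contour integral is approximated by an equally spaced sum over the circle.

```python
import cmath
import math

def laurent_coefficient(f, a, r, n, samples=2048):
    """Approximate a_n = (1/(2 pi i)) * integral of f(w)/(w-a)^(n+1) dw on |w - a| = r."""
    total = 0j
    for k in range(samples):
        offset = r * cmath.exp(2j * math.pi * k / samples)
        # With w = a + offset and dw = i * offset * dt, the integrand
        # f(w) dw / (2 pi i (w - a)^(n+1)) becomes f(w) * offset^(-n) dt / (2 pi).
        total += f(a + offset) * offset ** (-n)
    return total / samples

f = lambda z: 1 / (z * (1 - z))   # sample function, analytic on 0 < |z| < 1
coeffs = {n: laurent_coefficient(f, 0, 0.5, n) for n in range(-3, 4)}
for n in sorted(coeffs):
    print(n, coeffs[n])
```

The computed values should be close to 0 for n ≤ −2 and close to 1 for n ≥ −1. Repeating the computation with another radius in (0, 1) should give the same coefficients, as Lemma 26.15 predicts.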
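Proof: Let R1 < r1 < r2 < R2 and define $\gamma_1(t) \equiv a + (r_1-\varepsilon)e^{it}$ and $\gamma_2(t) \equiv a + (r_2+\varepsilon)e^{it}$ for t ∈ [0, 2π] and ε chosen small enough that R1 < r1 − ε < r2 + ε <
R2 .
[Figure: concentric circles γ 1 and γ 2 centered at a, with a point z marked.]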
n (−γ 1 , z) + n (γ 2 , z) = 0
n (−γ 1 , z) + n (γ 2 , z) = 1.
\[ = \frac{1}{2\pi i}\int_{\gamma_2}\frac{f(w)}{w-a}\sum_{n=0}^{\infty}\left(\frac{z-a}{w-a}\right)^{n}dw + \frac{1}{2\pi i}\int_{\gamma_1}\frac{f(w)}{(z-a)}\sum_{n=0}^{\infty}\left(\frac{w-a}{z-a}\right)^{n}dw. \tag{26.7} \]
From the formula 26.7, it follows that for z ∈ ann (a, r1 , r2 ), the terms in the first
sum are bounded by an expression of the form $C\left(\frac{r_2}{r_2+\varepsilon}\right)^{n}$ while those in the second
are bounded by one of the form $C\left(\frac{r_1-\varepsilon}{r_1}\right)^{n}$ and so by the Weierstrass M test, the
convergence is uniform and so the integrals and the sums in the above formula may
be interchanged and after renaming the variable of summation, this yields
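\[ f(z) = \sum_{n=0}^{\infty}\left(\frac{1}{2\pi i}\int_{\gamma_2}\frac{f(w)}{(w-a)^{n+1}}\,dw\right)(z-a)^{n} + \sum_{n=-\infty}^{-1}\left(\frac{1}{2\pi i}\int_{\gamma_1}\frac{f(w)}{(w-a)^{n+1}}\,dw\right)(z-a)^{n}. \tag{26.8} \]
Therefore, by Lemma 26.15, for any r ∈ (R1 , R2 ) ,
\[ f(z) = \sum_{n=0}^{\infty}\left(\frac{1}{2\pi i}\int_{\gamma_r}\frac{f(w)}{(w-a)^{n+1}}\,dw\right)(z-a)^{n} + \sum_{n=-\infty}^{-1}\left(\frac{1}{2\pi i}\int_{\gamma_r}\frac{f(w)}{(w-a)^{n+1}}\,dw\right)(z-a)^{n}. \tag{26.9} \]
and so
\[ f(z) = \sum_{n=-\infty}^{\infty}\left(\frac{1}{2\pi i}\int_{\gamma_r}\frac{f(w)}{(w-a)^{n+1}}\,dw\right)(z-a)^{n}. \]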
where r ∈ (R1 , R2 ) is arbitrary. This proves the existence part of the theorem. It
remains to characterize $a_n$.
If $f(z) = \sum_{n=-\infty}^{\infty} a_n(z-a)^{n}$ on ann (a, R1 , R2 ) let
\[ f_n(z) \equiv \sum_{k=-n}^{n} a_k(z-a)^{k}. \tag{26.10} \]
This function is analytic in ann (a, R1 , R2 ) and so from the above argument,
\[ f_n(z) = \sum_{k=-\infty}^{\infty}\left(\frac{1}{2\pi i}\int_{\gamma_r}\frac{f_n(w)}{(w-a)^{k+1}}\,dw\right)(z-a)^{k}. \tag{26.11} \]
and
\[ \sum_{n=1}^{\infty} a_{-n}(w-a)^{-n} \]
ensured by Lemma 26.17 which allows the interchange of sums and integrals, if
k ∈ [−n, n] ,
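\[ \begin{aligned} \frac{1}{2\pi i}\int_{\gamma_r}\frac{f(w)}{(w-a)^{k+1}}\,dw &= \frac{1}{2\pi i}\int_{\gamma_r}\frac{\sum_{m=0}^{\infty} a_m(w-a)^{m} + \sum_{m=1}^{\infty} a_{-m}(w-a)^{-m}}{(w-a)^{k+1}}\,dw \\ &= \sum_{m=0}^{\infty} a_m\,\frac{1}{2\pi i}\int_{\gamma_r}(w-a)^{m-(k+1)}\,dw + \sum_{m=1}^{\infty} a_{-m}\,\frac{1}{2\pi i}\int_{\gamma_r}(w-a)^{-m-(k+1)}\,dw \\ &= \sum_{m=0}^{n} a_m\,\frac{1}{2\pi i}\int_{\gamma_r}(w-a)^{m-(k+1)}\,dw + \sum_{m=1}^{n} a_{-m}\,\frac{1}{2\pi i}\int_{\gamma_r}(w-a)^{-m-(k+1)}\,dw \\ &= \frac{1}{2\pi i}\int_{\gamma_r}\frac{f_n(w)}{(w-a)^{k+1}}\,dw \end{aligned} \]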
One could imagine evaluating this integral by the method of partial fractions
and it should work out by that method. However, we will consider the evaluation
of this integral by the method of residues instead. To do so, consider the following
picture.
Let $\gamma_r(t) = re^{it}$, t ∈ [0, π] and let σ r (t) = t : t ∈ [−r, r] . Thus γ r parameterizes
the top curve and σ r parameterizes the straight line from −r to r along the x
axis. Denoting by Γr the closed curve traced out by these two, we see from simple
estimates that
\[ \lim_{r\to\infty}\int_{\gamma_r}\frac{1}{1+z^{4}}\,dz = 0. \]
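Therefore,
\[ \int_{-\infty}^{\infty}\frac{1}{1+x^{4}}\,dx = \lim_{r\to\infty}\int_{\Gamma_r}\frac{1}{1+z^{4}}\,dz. \]
We compute $\int_{\Gamma_r}\frac{1}{1+z^{4}}\,dz$ using the method of residues. The only residues of the
integrand are located at points, z where $1 + z^{4} = 0$. These points are
\[ z = -\tfrac{1}{2}\sqrt{2} - \tfrac{1}{2}i\sqrt{2}, \quad z = \tfrac{1}{2}\sqrt{2} - \tfrac{1}{2}i\sqrt{2}, \quad z = \tfrac{1}{2}\sqrt{2} + \tfrac{1}{2}i\sqrt{2}, \quad z = -\tfrac{1}{2}\sqrt{2} + \tfrac{1}{2}i\sqrt{2} \]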
and it is only the last two which are found in the inside of Γr . Therefore, we need
to calculate the residues at these points. Clearly this function has a pole of order
one at each of these points and so we may calculate the residue at α in this list by
evaluating
\[ \lim_{z\to\alpha}\,(z-\alpha)\,\frac{1}{1+z^{4}} \]
Thus
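\[ \mathrm{Res}\left(f,\tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2}\right) = \lim_{z\to\frac{1}{2}\sqrt{2}+\frac{1}{2}i\sqrt{2}}\left(z-\left(\tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2}\right)\right)\frac{1}{1+z^{4}} = -\frac{1}{8}\sqrt{2}-\frac{1}{8}i\sqrt{2} \]
Similarly we may find the other residue in the same way
\[ \mathrm{Res}\left(f,-\tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2}\right) = \lim_{z\to-\frac{1}{2}\sqrt{2}+\frac{1}{2}i\sqrt{2}}\left(z-\left(-\tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2}\right)\right)\frac{1}{1+z^{4}} = -\frac{1}{8}i\sqrt{2}+\frac{1}{8}\sqrt{2}. \]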
Therefore,
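\[ \int_{\Gamma_r}\frac{1}{1+z^{4}}\,dz = 2\pi i\left(-\frac{1}{8}i\sqrt{2}+\frac{1}{8}\sqrt{2}+\left(-\frac{1}{8}\sqrt{2}-\frac{1}{8}i\sqrt{2}\right)\right) = \frac{1}{2}\pi\sqrt{2}. \]
Thus, taking the limit we obtain $\frac{1}{2}\pi\sqrt{2} = \int_{-\infty}^{\infty}\frac{1}{1+x^{4}}\,dx$.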
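The residue arithmetic in this example can be double-checked with complex floating point. The sketch below swaps in a standard shortcut that the text does not use: at a simple pole α of 1/q(z), the residue equals 1/q′(α), which here is 1/(4α³). The variable names are illustrative.

```python
import cmath
import math

# The two poles of 1/(1 + z^4) in the upper half plane.
poles = [cmath.exp(1j * math.pi / 4), cmath.exp(3j * math.pi / 4)]

# Simple pole of 1/q with q(z) = 1 + z^4: residue = 1 / q'(alpha) = 1 / (4 alpha^3).
residues = [1 / (4 * p**3) for p in poles]

s = math.sqrt(2)
stated = [complex(-s / 8, -s / 8), complex(s / 8, -s / 8)]   # values given in the text

integral = 2j * math.pi * sum(residues)
print(residues, stated)
print(integral, math.pi * s / 2)
```

The computed residues should agree with the stated ones, and 2πi times their sum should agree with π√2/2 up to rounding.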
Obviously many different variations of this are possible. The main idea being
that the integral over the semicircle converges to zero as r → ∞.
Sometimes we don’t blow up the curves and take limits. Sometimes the problem
of interest reduces directly to a complex integral over a closed curve. Here is an
example of this.
Only the first two are inside the unit circle. It is also clear the function has simple
poles at these points. Therefore,
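\[ \mathrm{Res}(f,0) = \lim_{z\to 0}\, z\left(\frac{z^{2}+1}{z(4z+z^{2}+1)}\right) = 1. \]
\[ \mathrm{Res}\left(f,-2+\sqrt{3}\right) = \lim_{z\to-2+\sqrt{3}}\left(z-\left(-2+\sqrt{3}\right)\right)\frac{z^{2}+1}{z(4z+z^{2}+1)} = -\frac{2}{3}\sqrt{3}. \]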
It follows
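\[ \int_0^{\pi}\frac{\cos\theta}{2+\cos\theta}\,d\theta = \frac{1}{2i}\int_\gamma\frac{z^{2}+1}{z(4z+z^{2}+1)}\,dz = \frac{1}{2i}\,2\pi i\left(1-\frac{2}{3}\sqrt{3}\right) = \pi\left(1-\frac{2}{3}\sqrt{3}\right). \]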
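The closed form can be compared with a direct numerical quadrature of the real integral. The following is a small sketch using a composite Simpson rule; the function name `simpson` and the number of subintervals are arbitrary choices, and the residue at −2 + √3 is recomputed with the simple-pole shortcut N(z0)/D′(z0), which is a swapped-in technique rather than the limit used in the text.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with n subintervals (n must be even)."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

# Direct quadrature of the real integral.
numeric = simpson(lambda th: math.cos(th) / (2 + math.cos(th)), 0.0, math.pi, 2000)
closed_form = math.pi * (1 - 2 * math.sqrt(3) / 3)

# Residue of (z^2 + 1) / (z (z^2 + 4 z + 1)) at the simple pole z0 = -2 + sqrt(3):
# numerator over the derivative of the denominator factor, times 1/z0.
z0 = -2 + math.sqrt(3)
residue_z0 = (z0**2 + 1) / (z0 * (2 * z0 + 4))

print(numeric, closed_form)
print(residue_z0, -2 * math.sqrt(3) / 3)
```

Note that the value is negative, which is plausible since the integrand is negative and larger in magnitude on the half of the interval where cos θ < 0.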
Other rational functions of the trig functions will work out by this method also.
Sometimes you have to be clever about which version of an analytic function
that reduces to a real function you should use. The following is such an example.
The same curve used in the integral involving $\frac{\sin x}{x}$ earlier will create problems
with the log since the usual version of the log is not defined on the negative real
axis. This does not need to be of concern however. Simply use another branch of
the logarithm. Leave out the ray from 0 along the negative y axis and use Theorem
25.5 to define L (z) on this set. Thus $L(z) = \ln|z| + i\arg_1(z)$ where $\arg_1(z)$ will be
the angle, θ, between $-\frac{\pi}{2}$ and $\frac{3\pi}{2}$ such that $z = |z|e^{i\theta}$. Now the only singularities
contained in this curve are
\[ \tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2}, \quad -\tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2} \]
and the integrand, f has simple poles at these points. Thus using the same procedure as in the other examples,
\[ \mathrm{Res}\left(f,\tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2}\right) = \frac{1}{32}\sqrt{2}\,\pi - \frac{1}{32}i\sqrt{2}\,\pi \]
and
\[ \mathrm{Res}\left(f,-\tfrac{1}{2}\sqrt{2}+\tfrac{1}{2}i\sqrt{2}\right) = \frac{3}{32}\sqrt{2}\,\pi + \frac{3}{32}i\sqrt{2}\,\pi. \]
Consider the integral along the small semicircle of radius r. This reduces to
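\[
\int_{\pi}^{0} \frac{\ln|r| + it}{1 + (r e^{it})^4} \left(r i e^{it}\right) dt
\]
Observing that $\int_{\text{large semicircle}} \frac{L(z)}{1+z^4}\, dz \to 0$ as R → ∞,
\[
e(R) + 2 \lim_{r \to 0+} \int_r^R \frac{\ln t}{1 + t^4}\, dt + i\pi \int_{-\infty}^{0} \frac{1}{1 + t^4}\, dt
= \left(-\tfrac{1}{8} + \tfrac{1}{4} i\right) \pi^2 \sqrt{2}
\]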
\[
|f(z)| \le \frac{C}{|z|^{b}}, \quad b > \alpha
\]
\[
|f(z)| \le \frac{C'}{|z|^{b_1}}, \quad b_1 < \alpha
\]
for all |z| sufficiently small. It turns out there exists an explicit formula for this
Mellin transformation under these conditions. Consider the following contour.
In this contour the small semicircle in the center has radius ε which will converge
to 0. Denote by γ R the large circular path which starts at the upper edge of the
slot and continues to the lower edge. Denote by γ ε the small semicircular contour
and denote by γ εR+ the straight part of the contour from 0 to R which provides
the top edge of the slot. Finally denote by γ εR− the straight part of the contour
from R to 0 which provides the bottom edge of the slot. The interesting aspect of
this problem is the definition of f (z) z α−1 . Let
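\[
z^{\alpha - 1} \equiv e^{(\ln|z| + i \arg(z))(\alpha - 1)} = e^{(\alpha - 1)\log(z)}
\]
where arg (z) is the angle of z in (0, 2π) . Thus you use a branch of the logarithm
which is defined on $\mathbb{C} \setminus [0, \infty)$. Then it is routine to verify from the assumed estimates
that
\[
\lim_{R \to \infty} \int_{\gamma_R} f(z) z^{\alpha - 1}\, dz = 0
\]
and
\[
\lim_{\varepsilon \to 0+} \int_{\gamma_\varepsilon} f(z) z^{\alpha - 1}\, dz = 0
\]
and
\[
\lim_{\varepsilon \to 0+} \int_{\gamma_{\varepsilon R-}} f(z) z^{\alpha - 1}\, dz = -e^{i 2\pi(\alpha - 1)} \int_0^R f(x) x^{\alpha - 1}\, dx.
\]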
Therefore, letting $\Sigma_R$ denote the sum of the residues of $f(z) z^{\alpha-1}$ which are contained
in the disk of radius R except for the possible residue at 0,
\[
e(R) + \left(1 - e^{i 2\pi(\alpha - 1)}\right) \int_0^R f(x) x^{\alpha - 1}\, dx = 2\pi i\, \Sigma_R
\]
Calculating the residue of the integrand at −1, and simplifying the above expression,
\[
\left(1 - e^{2\pi i (p - 1)}\right) \int_0^\infty \frac{x^{p-1}}{1 + x}\, dx = 2\pi i\, e^{(p-1) i \pi}.
\]
Upon simplification
\[
\int_0^\infty \frac{x^{p-1}}{1 + x}\, dx = \frac{\pi}{\sin p\pi}.
\]
Thus the curve to integrate over is shaped like a slice of pie. Denote by γ r the
curved part. Since f is analytic,
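\[
0 = \int_{\gamma_r} e^{iz^2}\, dz + \int_0^r e^{ix^2}\, dx - \int_0^r e^{i\left(t\left(\frac{1+i}{\sqrt{2}}\right)\right)^2} \left(\frac{1+i}{\sqrt{2}}\right) dt
\]
\[
= \int_{\gamma_r} e^{iz^2}\, dz + \int_0^r e^{ix^2}\, dx - \int_0^r e^{-t^2} \left(\frac{1+i}{\sqrt{2}}\right) dt
\]
\[
= \int_{\gamma_r} e^{iz^2}\, dz + \int_0^r e^{ix^2}\, dx - \frac{\sqrt{\pi}}{2} \left(\frac{1+i}{\sqrt{2}}\right) + e(r)
\]
where e (r) → 0 as r → ∞. Here we used the fact that $\int_0^\infty e^{-t^2}\, dt = \frac{\sqrt{\pi}}{2}$. Now
consider the first of these integrals.
\[
\left|\int_{\gamma_r} e^{iz^2}\, dz\right| = \left|\int_0^{\pi/4} e^{i (r e^{it})^2}\, r i e^{it}\, dt\right|
\le r \int_0^{\pi/4} e^{-r^2 \sin 2t}\, dt
= \frac{r}{2} \int_0^1 \frac{e^{-r^2 u}}{\sqrt{1 - u^2}}\, du
\]
\[
\le \frac{r}{2} \int_0^{r^{-(3/2)}} \frac{1}{\sqrt{1 - u^2}}\, du + \frac{r}{2} \left(\int_0^1 \frac{1}{\sqrt{1 - u^2}}\, du\right) e^{-(r^{1/2})}
\]
and so
\[
\int_0^\infty \sin x^2\, dx = \frac{\sqrt{\pi}}{2\sqrt{2}} = \int_0^\infty \cos x^2\, dx.
\]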
The following example is one of the most interesting. By an auspicious choice
of the contour it is possible to obtain a very interesting formula for cot πz known
as the Mittag-Leffler expansion of cot πz.
Example 26.25 Let $\gamma_N$ be the contour which goes from $-N - \frac{1}{2} - Ni$ horizontally
to $N + \frac{1}{2} - Ni$ and from there, vertically to $N + \frac{1}{2} + Ni$ and then horizontally
to $-N - \frac{1}{2} + Ni$ and finally vertically to $-N - \frac{1}{2} - Ni$. Thus the contour is a
large rectangle and the direction of integration is in the counter clockwise direction.
Consider the following integral.
\[
I_N \equiv \int_{\gamma_N} \frac{\pi \cos \pi z}{\sin \pi z\, (\alpha^2 - z^2)}\, dz
\]
where α ∈ R is not an integer. This will be used to verify the formula of Mittag-Leffler,
\[
\frac{1}{\alpha^2} + \sum_{n=1}^{\infty} \frac{2}{\alpha^2 - n^2} = \frac{\pi \cot \pi\alpha}{\alpha}. \tag{26.12}
\]
You should verify that cot πz is bounded on this contour and that therefore,
$I_N \to 0$ as N → ∞. Now you compute the residues of the integrand at ±α and
at n where $|n| < N + \frac{1}{2}$ for n an integer. These are the only singularities of the
integrand in this contour and therefore, you can evaluate $I_N$ by using these. It is
left as an exercise to calculate these residues and find that the residue at ±α is
\[
\frac{-\pi \cos \pi\alpha}{2\alpha \sin \pi\alpha}
\]
while the residue at n is
\[
\frac{1}{\alpha^2 - n^2}.
\]
Therefore,
\[
0 = \lim_{N \to \infty} I_N = \lim_{N \to \infty} 2\pi i \left[\sum_{n=-N}^{N} \frac{1}{\alpha^2 - n^2} - \frac{\pi \cot \pi\alpha}{\alpha}\right]
\]
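It is easy to test formula 26.12 numerically by truncating the series. The sketch below is only an illustration; the name `mittag_leffler_partial`, the sample value α = 0.3 and the cutoff 200000 are arbitrary choices made here. Since the terms decay like $2/n^2$, the truncation error is roughly 2/N, so only modest agreement should be expected.

```python
import math

def mittag_leffler_partial(alpha, N):
    """Left side of 26.12 with the infinite series truncated after N terms."""
    return 1 / alpha**2 + sum(2 / (alpha**2 - n**2) for n in range(1, N + 1))

alpha = 0.3  # any real number which is not an integer
lhs = mittag_leffler_partial(alpha, 200_000)
# pi * cot(pi * alpha) / alpha, writing cot as 1/tan
rhs = math.pi / (math.tan(math.pi * alpha) * alpha)

print(lhs, rhs)
```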
Definition 26.26 Let X be a complex Banach space and let A ∈ L (X, X) . Then
\[
r(A) \equiv \left\{\lambda \in \mathbb{C} : (\lambda I - A)^{-1} \in \mathcal{L}(X, X)\right\}
\]
This is called the resolvent set. The spectrum of A, denoted by σ (A) is defined as
all the complex numbers which are not in the resolvent set. Thus
σ (A) ≡ C \ r (A)
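Lemma 26.27 λ ∈ r (A) if and only if λI − A is one to one and onto X. Also if
|λ| > ||A|| , then λ ∈ r (A). If the Neumann series,
\[
\frac{1}{\lambda} \sum_{k=0}^{\infty} \left(\frac{A}{\lambda}\right)^k
\]
converges, then
\[
\frac{1}{\lambda} \sum_{k=0}^{\infty} \left(\frac{A}{\lambda}\right)^k = (\lambda I - A)^{-1}.
\]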
Proof: Note that to be in r (A) , λI − A must be one to one and map X onto
X since otherwise, $(\lambda I - A)^{-1} \notin \mathcal{L}(X, X)$.
By the open mapping theorem, if these two algebraic conditions hold, then
$(\lambda I - A)^{-1}$ is continuous and so this proves the first part of the lemma. Now
suppose |λ| > ||A|| . Consider the Neumann series
\[
\frac{1}{\lambda} \sum_{k=0}^{\infty} \left(\frac{A}{\lambda}\right)^k.
\]
By the root test, Theorem 24.3 on Page 642 this series converges to an element
of L (X, X) denoted here by B. Now suppose the series converges. Letting
$B_n \equiv \frac{1}{\lambda} \sum_{k=0}^{n} \left(\frac{A}{\lambda}\right)^k$,
\[
(\lambda I - A) B_n = B_n (\lambda I - A) = \sum_{k=0}^{n} \left(\frac{A}{\lambda}\right)^k - \sum_{k=0}^{n} \left(\frac{A}{\lambda}\right)^{k+1}
= I - \left(\frac{A}{\lambda}\right)^{n+1} \to I
\]
as n → ∞ because the convergence of the series requires the nth term to converge
to 0. Therefore,
\[
(\lambda I - A) B = B (\lambda I - A) = I
\]
which shows λI − A is both one to one and onto and the Neumann series converges
to $(\lambda I - A)^{-1}$. This proves the lemma.
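To see the lemma at work in the simplest finite dimensional case, you can take X to be $\mathbb{R}^2$ and A a 2 × 2 matrix. The following sketch is an illustration only; the helper names (`mat_mul`, `mat_add`, `scale`, `neumann_partial`) and the particular matrix and value of λ are assumptions made here, not part of the text.

```python
def mat_mul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

def scale(c, P):
    return [[c * P[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def neumann_partial(A, lam, n):
    """B_n = (1/lam) * sum_{k=0}^{n} (A/lam)^k."""
    ratio = scale(1.0 / lam, A)
    term = IDENTITY                      # (A/lam)^0
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n + 1):
        total = mat_add(total, term)
        term = mat_mul(term, ratio)      # next power of A/lam
    return scale(1.0 / lam, total)

A = [[1.0, 2.0], [0.0, -1.0]]
lam = 4.0                                # larger than any reasonable norm of A
B = neumann_partial(A, lam, 60)
lam_I_minus_A = [[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]]
product = mat_mul(lam_I_minus_A, B)      # should be close to the identity
print(product)
```

By the computation in the proof, `product` equals $I - (A/\lambda)^{n+1}$ exactly, so the only discrepancy from the identity is that last power together with rounding error.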
This lemma also shows that σ (A) is bounded. In fact, σ (A) is closed.
Lemma 26.28 r (A) is open. In fact, if λ ∈ r (A) and $|\mu - \lambda| < \left\|(\lambda I - A)^{-1}\right\|^{-1}$,
then µ ∈ r (A).
Proof: Lemma 26.27 shows σ (A) is bounded and Lemma 26.28 shows it is
closed.
Since σ (A) is compact, this maximum exists. Note from Lemma 26.27, ρ (A) ≤
||A||.
converges.
Proof: This follows directly from Theorem 26.18 on Page 717 and the observation
above that $\frac{1}{\lambda} \sum_{k=0}^{\infty} \left(\frac{A}{\lambda}\right)^k = (\lambda I - A)^{-1}$ for all |λ| > ||A||. Thus the analytic
function, $\lambda \to (\lambda I - A)^{-1}$ has a Laurent expansion on |λ| > ρ (A) by Theorem 26.18
and it must coincide with $\frac{1}{\lambda} \sum_{k=0}^{\infty} \left(\frac{A}{\lambda}\right)^k$ on |λ| > ||A|| so the Laurent expansion of
$\lambda \to (\lambda I - A)^{-1}$ must equal $\frac{1}{\lambda} \sum_{k=0}^{\infty} \left(\frac{A}{\lambda}\right)^k$ on |λ| > ρ (A) . This proves the lemma.
The theorem on the spectral radius follows. It is due to Gelfand.
Theorem 26.32 $\rho(A) = \lim_{n \to \infty} \|A^n\|^{1/n}$.
Proof: If
\[
|\lambda| < \limsup_{n \to \infty} \|A^n\|^{1/n}
\]
then by the root test, the Neumann series does not converge and so by Lemma
26.31 |λ| ≤ ρ (A) . Thus
\[
\rho(A) \ge \limsup_{n \to \infty} \|A^n\|^{1/n}.
\]
It follows from Lemma 26.27 applied to $A^p$ that for λ ∈ σ (A) , $|\lambda^p| \le \|A^p\|$ and so
$|\lambda| \le \|A^p\|^{1/p}$. Therefore, $\rho(A) \le \|A^p\|^{1/p}$ and since p is arbitrary,
\[
\liminf_{p \to \infty} \|A^p\|^{1/p} \ge \rho(A) \ge \limsup_{n \to \infty} \|A^n\|^{1/n}.
\]
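The content of this theorem is easiest to appreciate for a matrix whose norm is much larger than its spectral radius. Here is a small sketch under stated assumptions: the matrix is an arbitrary upper triangular example with both eigenvalues equal to 0.5, and the Frobenius norm is used in place of the operator norm. In finite dimensions all norms are equivalent, so the limit of $\|A^n\|^{1/n}$ is the same, but that substitution is a choice made here for simplicity. The names `frobenius_norm` and `root_norms` are hypothetical.

```python
import math

def mat_mul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def frobenius_norm(P):
    return math.sqrt(sum(abs(x) ** 2 for row in P for x in row))

def root_norms(A, n_max):
    """Return the list of ||A^n||^(1/n) for n = 1, ..., n_max."""
    power = A
    out = []
    for n in range(1, n_max + 1):
        out.append(frobenius_norm(power) ** (1.0 / n))
        power = mat_mul(power, A)
    return out

A = [[0.5, 10.0], [0.0, 0.5]]   # eigenvalues 0.5, 0.5, so rho(A) = 0.5
values = root_norms(A, 400)
print(values[0], values[9], values[99], values[399])
```

The first entry is about 10 while the later ones drift slowly down toward 0.5, which shows how crude the bound ρ (A) ≤ ||A|| can be.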
26.4 Exercises
1. Example 26.19 found the integral of a rational function of a certain sort. The
technique used in this example typically works for rational functions of the
form $\frac{f(x)}{g(x)}$ where deg (g (x)) ≥ deg f (x) + 2 provided the rational function
has no poles on the real axis. State and prove a theorem based on these
observations.
2. Fill in the missing details of Example 26.25 about IN → 0. Note how important
it was that the contour was chosen just right for this to happen. Also verify
the claims about the residues.
Show
\[
\operatorname{Res}(f, a) = \frac{1}{(m - 1)!}\, g^{(m-1)}(a).
\]
Hint: Use the Laurent series.
4. Give a proof of Theorem 26.6. Hint: Let p be a pole. Show that near p, a
pole of order m,
\[
\frac{f'(z)}{f(z)} = \frac{-m + \sum_{k=1}^{\infty} b_k (z - p)^k}{(z - p) + \sum_{k=2}^{\infty} c_k (z - p)^k}
\]
Show that $\operatorname{Res}(f'/f, p) = -m$. Carry out a similar procedure for the zeros.
9. Use the contour described in Example 26.19 to compute the exact values of
the following improper integrals.
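(a) $\int_{-\infty}^{\infty} \frac{x}{(x^2 + 4x + 13)^2}\, dx$
(b) $\int_0^{\infty} \frac{x^2}{(x^2 + a^2)^2}\, dx$
(c) $\int_{-\infty}^{\infty} \frac{dx}{(x^2 + a^2)(x^2 + b^2)}$, a, b > 0
(b) $\int_0^{\infty} \frac{x \sin x}{(x^2 + a^2)^2}\, dx$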
defined as
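\[
\lim_{\varepsilon \to 0+} \left(\int_{-\infty}^{1 - \varepsilon} \frac{\sin x}{(x^2 + 1)(x - 1)}\, dx + \int_{1 + \varepsilon}^{\infty} \frac{\sin x}{(x^2 + 1)(x - 1)}\, dx\right).
\]
12. Find a formula for the integral $\int_{-\infty}^{\infty} \frac{dx}{(1 + x^2)^{n+1}}$
where n is a nonnegative integer.
13. Find $\int_{-\infty}^{\infty} \frac{\sin^2 x}{x^2}\, dx$.
15. Find $\int_{-\infty}^{\infty} \frac{1}{(1 + x^4)^2}\, dx$.
16. Show $\int_0^{\infty} \frac{\ln(x)}{1 + x^2}\, dx = 0$.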
\[
\cos\theta = \frac{(f \circ \gamma)'(t_0) \cdot (f \circ \eta)'(s_0)}{\left|(f \circ \eta)'(s_0)\right| \left|(f \circ \gamma)'(t_0)\right|}
\]
\[
= \frac{1}{2}\, \frac{f'(\gamma(t_0)) \gamma'(t_0) \overline{f'(\eta(s_0)) \eta'(s_0)} + \overline{f'(\gamma(t_0)) \gamma'(t_0)}\, f'(\eta(s_0)) \eta'(s_0)}{|f'(\gamma(t_0))|\, |f'(\eta(s_0))|}
\]
\[
= \frac{1}{2}\, \frac{f'(z) \overline{f'(z)}\, \gamma'(t_0) \overline{\eta'(s_0)} + \overline{f'(z)} f'(z)\, \overline{\gamma'(t_0)} \eta'(s_0)}{|f'(z)|\, |f'(z)|}
\]
\[
= \frac{1}{2}\, \frac{\gamma'(t_0) \overline{\eta'(s_0)} + \eta'(s_0) \overline{\gamma'(t_0)}}{1}
\]
which equals the cosine of the angle between the vectors, $\gamma'(t_0)$ and $\eta'(s_0)$. Thus analytic mappings
preserve angles at points where the derivative is nonzero. Such mappings are
called isogonal.
Actually, they also preserve orientations. If z = x + iy and w = a + ib are two
complex numbers, then (x, y, 0) and (a, b, 0) are two vectors in R3 . Recall that the
cross product, (x, y, 0) × (a, b, 0) , yields a vector normal to the two given vectors
such that the triple, (x, y, 0) , (a, b, 0) , and (x, y, 0) × (a, b, 0) satisfies the right hand
rule and has magnitude equal to the product of the sine of the included angle times
the product of the two norms of the vectors. In this case, the cross product will
produce a vector which is a multiple of k, the unit vector in the direction of the z
axis. In fact, you can verify by computing both sides that, letting z = x + iy and
w = a + ib,
\[
(x, y, 0) \times (a, b, 0) = \operatorname{Re}(z i \overline{w})\, \mathbf{k}.
\]
Therefore, in the above situation,
\[
(f \circ \gamma)'(t_0) \times (f \circ \eta)'(s_0)
= \operatorname{Re}\left(f'(\gamma(t_0)) \gamma'(t_0)\, i\, \overline{f'(\eta(s_0)) \eta'(s_0)}\right) \mathbf{k}
= |f'(z)|^2 \operatorname{Re}\left(\gamma'(t_0)\, i\, \overline{\eta'(s_0)}\right) \mathbf{k}
\]
which shows that the orientation of $\gamma'(t_0)$, $\eta'(s_0)$ is the same as the orientation of
$(f \circ \gamma)'(t_0)$, $(f \circ \eta)'(s_0)$. Mappings which preserve both orientation and angles are
called conformal mappings and this has shown that analytic functions are conformal
mappings if the derivative does not vanish.
Lemma 27.2 The fractional linear transformation, 27.1 can be written as a finite
composition of dilations, inversions, and translations.
Proof: Let
\[
S_1(z) = z + \frac{d}{c}, \quad S_2(z) = \frac{1}{z}, \quad S_3(z) = \frac{(bc - ad)}{c^2}\, z
\]
and
\[
S_4(z) = z + \frac{a}{c}
\]
in the case where c ≠ 0. Then f (z) given in 27.1 is of the form
\[
f(z) = S_4 \circ S_3 \circ S_2 \circ S_1.
\]
Here is why.
\[
S_2(S_1(z)) = S_2\left(z + \frac{d}{c}\right) \equiv \frac{1}{z + \frac{d}{c}} = \frac{c}{zc + d}.
\]
Now consider
\[
S_3\left(\frac{c}{zc + d}\right) \equiv \frac{(bc - ad)}{c^2} \left(\frac{c}{zc + d}\right) = \frac{bc - ad}{c (zc + d)}.
\]
Finally, consider
\[
S_4\left(\frac{bc - ad}{c (zc + d)}\right) \equiv \frac{bc - ad}{c (zc + d)} + \frac{a}{c} = \frac{b + az}{zc + d}.
\]
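The algebra in this proof can be spot checked with complex arithmetic. This is a minimal sketch; the function name `compose_fractional_linear` and the sample coefficients are made up for the check, and the only requirements are c ≠ 0 and zc + d ≠ 0 at the test point.

```python
def compose_fractional_linear(a, b, c, d, z):
    """Evaluate S4(S3(S2(S1(z)))) from the proof of Lemma 27.2, assuming c != 0."""
    s1 = z + d / c                       # translation
    s2 = 1 / s1                          # inversion
    s3 = (b * c - a * d) / c**2 * s2     # dilation (multiplication by a constant)
    s4 = s3 + a / c                      # translation
    return s4

a, b, c, d = 2 + 1j, -1.0, 3 - 2j, 0.5j
z = 0.7 - 0.4j
direct = (a * z + b) / (c * z + d)
composed = compose_fractional_linear(a, b, c, d, z)
print(direct, composed)
```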
Corollary 27.3 Fractional linear transformations map circles and lines to circles
or lines.
Proof: It is obvious that dilations and translations map circles to circles and
lines to lines. What of inversions? If inversions have this property, the above lemma
implies a general fractional linear transformation has this property as well.
Note that all circles and lines may be put in the form
\[
\alpha \left(x^2 + y^2\right) - 2ax - 2by = r^2 - \left(a^2 + b^2\right)
\]
where α = 1 gives a circle centered at (a, b) with radius r and α = 0 gives a line. In
terms of complex variables you may therefore consider all possible circles and lines
in the form
\[
\alpha z \overline{z} + \beta z + \overline{\beta}\, \overline{z} + \gamma = 0, \tag{27.2}
\]
To see this let $\beta = \beta_1 + i\beta_2$ where $\beta_1 \equiv -a$ and $\beta_2 \equiv b$. Note that even if α is not
0 or 1 the expression still corresponds to either a circle or a line because you can
divide by α if α ≠ 0. Now I verify that replacing z with $\frac{1}{z}$ results in an expression
of the form in 27.2. Thus, let $w = \frac{1}{z}$ where z satisfies 27.2. Then
\[
\left(\alpha + \beta \overline{w} + \overline{\beta} w + \gamma w \overline{w}\right) = \frac{1}{z \overline{z}} \left(\alpha z \overline{z} + \beta z + \overline{\beta}\, \overline{z} + \gamma\right) = 0
\]
and so w also satisfies a relation like 27.2. One simply switches α with γ and β
with β. Note the situation is slightly different than with dilations and translations.
In the case of an inversion, a circle becomes either a line or a circle and similarly, a
line becomes either a circle or a line. This proves the corollary.
The next example is quite important.
Example 27.4 Consider the fractional linear transformation, $w = \frac{z - i}{z + i}$.
First consider what this mapping does to the points of the form z = x + i0.
Substituting into the expression for w,
\[
w = \frac{x - i}{x + i} = \frac{x^2 - 1 - 2xi}{x^2 + 1},
\]
a point on the unit circle. Thus this transformation maps the real axis to the unit
circle.
The upper half plane is composed of points of the form x + iy where y > 0.
Substituting in to the transformation,
\[
w = \frac{x + i (y - 1)}{x + i (y + 1)},
\]
which is seen to be a point on the interior of the unit disk because |y − 1| < |y + 1|
which implies |x + i (y + 1)| > |x + i (y − 1)|. Therefore, this transformation maps
the upper half plane to the interior of the unit disk.
One might wonder whether the mapping is one to one and onto. The mapping
is clearly one to one because it has an inverse, $z = -i\, \frac{w + 1}{w - 1}$ for all w in the interior
of the unit disk. Also, a short computation verifies that z so defined is in the upper
half plane. Therefore, this transformation maps {z ∈ C such that Im z > 0} one to
one and onto the unit disk {z ∈ C such that |z| < 1} .
A fancy way to do part of this is to use Theorem 25.11. $\limsup_{z \to a} \left|\frac{z - i}{z + i}\right| \le 1$
whenever a is on the real axis or a = ∞. Therefore, $\left|\frac{z - i}{z + i}\right| \le 1$. This is a little shorter.
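The three claims of this example (real axis to unit circle, upper half plane into the disk, and the formula for the inverse) are easy to test at a few sample points. The sketch below is only an illustration; the names `cayley` and `cayley_inverse` and the sample points are choices made here.

```python
def cayley(z):
    """The transformation w = (z - i)/(z + i) of Example 27.4."""
    return (z - 1j) / (z + 1j)

def cayley_inverse(w):
    """The inverse z = -i (w + 1)/(w - 1)."""
    return -1j * (w + 1) / (w - 1)

# Points on the real axis should land on the unit circle.
real_axis_images = [abs(cayley(complex(x, 0.0))) for x in (-5.0, -1.0, 0.0, 0.3, 7.0)]

# Points with positive imaginary part should land inside the unit disk.
upper_half_points = [complex(-2.0, 0.1), complex(0.0, 1.0), complex(3.0, 4.0)]
upper_half_images = [cayley(z) for z in upper_half_points]

# Applying the inverse should return the original points.
round_trip = [cayley_inverse(w) for w in upper_half_images]

print(real_axis_images)
print([abs(w) for w in upper_half_images])
```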
The result will be a fractional linear transformation with the desired properties.
If any of the points equals ∞, then the quotient containing this point should be
adjusted.
Why should this procedure work? Here is a heuristic argument to indicate why
you would expect this to happen rather than a rigorous proof. The reader may
want to tighten the argument to give a proof. First suppose z = z1 . Then the right
side equals zero and so the left side also must equal zero. However, this requires
w = w1 . Next suppose z = z2 . Then the right side equals 1. To get a 1 on the left,
you need w = w2 . Finally suppose z = z3 . Then the right side involves division by
0. To get the same bad behavior, on the left, you need w = w3 .
Example 27.5 Let Im ξ > 0 and consider the fractional linear transformation
which takes ξ to 0, $\overline{\xi}$ to ∞ and 0 to $\xi / \overline{\xi}$.
theorem. In mathematics there are two sorts of questions, those related to whether
something exists and those involving methods for finding it. The real questions are
often related to questions of existence. There is a long and involved history for
proofs of this theorem. The first proofs were based on the Dirichlet principle and
turned out to be incorrect, thanks to Weierstrass who pointed out the errors. For
more on the history of this theorem, see Hille [27].
The following theorem is really wonderful. It is about the existence of a subse-
quence having certain salubrious properties. It is this wonderful result which will
give the existence of the mapping desired. The other parts of the argument are
technical details to set things up and use this theorem.
Proof: First note there exists a sequence of compact sets, $K_n$ such that
$K_n \subseteq \operatorname{int} K_{n+1} \subseteq \Omega$ for all n where here int K denotes the interior of the set K, the
union of all open sets contained in K and $\cup_{n=1}^{\infty} K_n = \Omega$. In fact, you can verify
that $\overline{B(0, n)} \cap \left\{z \in \Omega : \operatorname{dist}\left(z, \Omega^C\right) \ge \frac{1}{n}\right\}$ works for $K_n$. Then there exist positive
numbers, $\delta_n$ such that if $z \in K_n$, then $B(z, \delta_n) \subseteq \operatorname{int} K_{n+1}$. Now denote by $\mathcal{F}_n$
the set of restrictions of functions of $\mathcal{F}$ to $K_n$. Then let $z \in K_n$ and let
$\gamma(t) \equiv z + \delta_n e^{it}$, t ∈ [0, 2π] . It follows that for $z_1 \in B(z, \delta_n)$, and $f \in \mathcal{F}$,
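\[
|f(z) - f(z_1)| = \left|\frac{1}{2\pi i} \int_{\gamma} f(w) \left(\frac{1}{w - z} - \frac{1}{w - z_1}\right) dw\right|
\le \frac{1}{2\pi} \left|\int_{\gamma} f(w) \frac{z - z_1}{(w - z)(w - z_1)}\, dw\right|
\]
Letting $|z_1 - z| < \frac{\delta_n}{2}$,
\[
|f(z) - f(z_1)| \le \frac{M}{2\pi}\, 2\pi \delta_n\, \frac{|z - z_1|}{\delta_n^2 / 2} \le 2M\, \frac{|z - z_1|}{\delta_n}.
\]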
It follows that $\mathcal{F}_n$ is equicontinuous and uniformly bounded so by the Arzela Ascoli
theorem there exists a sequence, $\{f_{nk}\}_{k=1}^{\infty} \subseteq \mathcal{F}$ which converges uniformly on $K_n$.
Let $\{f_{1k}\}_{k=1}^{\infty}$ converge uniformly on $K_1$. Then use the Arzela Ascoli theorem applied
to this sequence to get a subsequence, denoted by $\{f_{2k}\}_{k=1}^{\infty}$ which also converges
uniformly on $K_2$. Continue in this way to obtain $\{f_{nk}\}_{k=1}^{\infty}$ which converges uniformly
on $K_1, \cdots, K_n$. Now the sequence $\{f_{nn}\}_{n=m}^{\infty}$ is a subsequence of $\{f_{mk}\}_{k=1}^{\infty}$
and so it converges uniformly on $K_m$ for all m. Denoting $f_{nn}$ by $f_n$ for short, this
is the sequence of functions promised by the theorem. It is clear $\{f_n\}_{n=1}^{\infty}$ converges
uniformly on every compact subset of Ω because every such set is contained in $K_m$
for all m large enough. Let f (z) be the point to which $f_n(z)$ converges. Then f
is a continuous function defined on Ω. Is f analytic? Yes it is by Lemma 24.18.
Alternatively, you could let T ⊆ Ω be a triangle. Then
\[
\int_{\partial T} f(z)\, dz = \lim_{n \to \infty} \int_{\partial T} f_n(z)\, dz = 0.
\]
$\operatorname{int}(K_n) \setminus K$
such that $\sum_{j=1}^{m} n(\gamma_j, z) = 1$ for every z ∈ K. Also let η denote the distance
between $\cup_j \gamma_j^*$ and K. Then for z ∈ K,
\[
\left|f^{(k)}(z) - f_n^{(k)}(z)\right| = \left|\frac{k!}{2\pi i} \sum_{j=1}^{m} \int_{\gamma_j} \frac{f(w) - f_n(w)}{(w - z)^{k+1}}\, dw\right|
\le \frac{k!}{2\pi}\, \|f_n - f\|_{K_n} \sum_{j=1}^{m} (\text{length of } \gamma_j)\, \frac{1}{\eta^{k+1}}.
\]
where here $\|f_n - f\|_{K_n} \equiv \sup\{|f_n(z) - f(z)| : z \in K_n\}$. Thus you get uniform
convergence of the derivatives.
Since the family, F satisfies the conclusion of Theorem 27.7 it is known as a
normal family of functions. More generally,
Then $\phi_\alpha$ maps B (0, 1) one to one and onto B (0, 1), $\phi_\alpha^{-1} = \phi_{-\alpha}$, and
\[
\phi_\alpha'(\alpha) = \frac{1}{1 - |\alpha|^2}.
\]
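These properties can be checked numerically. The sketch below assumes the usual formula $\phi_\alpha(z) = \frac{z - \alpha}{1 - \overline{\alpha} z}$, which is consistent with the expression for $\phi_\alpha \circ h$ used later in the proof of the Riemann mapping theorem; the sample values of α and z and the step size of the difference quotient are arbitrary choices.

```python
def phi(alpha, z):
    """phi_alpha(z) = (z - alpha) / (1 - conj(alpha) * z)."""
    return (z - alpha) / (1 - alpha.conjugate() * z)

alpha = 0.3 + 0.4j          # |alpha| = 0.5 < 1
z = -0.2 + 0.5j             # a point of B(0, 1)

# phi_{-alpha} should undo phi_alpha.
round_trip = phi(-alpha, phi(alpha, z))

# Central difference approximation to the complex derivative at alpha.
h = 1e-6
derivative_at_alpha = (phi(alpha, alpha + h) - phi(alpha, alpha - h)) / (2 * h)
expected = 1 / (1 - abs(alpha) ** 2)

print(round_trip, z)
print(derivative_at_alpha, expected)
```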
The next lemma, known as Schwarz’s lemma is interesting for its own sake but
will also be an important part of the proof of the Riemann mapping theorem. It
was stated and proved earlier but for convenience it is given again here.
this shows 27.3 and it also verifies 27.4 on taking the limit as z → 0. If equality
holds in 27.4, then |F (z) /z| achieves a maximum at an interior point so F (z) /z
equals a constant, λ by the maximum modulus theorem. Since F (z) = λz, it follows
F 0 (0) = λ and so |λ| = 1. This proves the lemma.
The next theorem will turn out to be equivalent to the Riemann mapping the-
orem.
for all ψ ∈ F. When this has been done it will be shown that h is actually onto.
This will prove the theorem.
Claim 1: F is nonempty.
Proof of Claim 1: Since Ω ≠ C it follows there exists ξ ∉ Ω. Then it follows
z − ξ and $\frac{1}{z - \xi}$ are both analytic on Ω. Since Ω has the square root property,
there exists an analytic function, φ : Ω → C such that $\phi^2(z) = z - \xi$ for all
z ∈ Ω, $\phi(z) = \sqrt{z - \xi}$. Since z − ξ is not constant, neither is φ and it follows
from the open mapping theorem that φ (Ω) is a region. Note also that φ is one
to one because if $\phi(z_1) = \phi(z_2)$, then you can square both sides and conclude
$z_1 - \xi = z_2 - \xi$ implying $z_1 = z_2$.
Now pick a ∈ φ (Ω) . Thus $\sqrt{z_a - \xi} = a$. I claim there exists a positive lower
bound to $\left|\sqrt{z - \xi} + a\right|$ for z ∈ Ω. If not, there exists a sequence, $\{z_n\} \subseteq \Omega$ such
that
\[
\sqrt{z_n - \xi} + a = \sqrt{z_n - \xi} + \sqrt{z_a - \xi} \equiv \varepsilon_n \to 0.
\]
Then
\[
\sqrt{z_n - \xi} = \left(\varepsilon_n - \sqrt{z_a - \xi}\right) \tag{27.6}
\]
and squaring both sides,
\[
z_n - \xi = \varepsilon_n^2 + z_a - \xi - 2\varepsilon_n \sqrt{z_a - \xi}.
\]
Consequently, $(z_n - z_a) = \varepsilon_n^2 - 2\varepsilon_n \sqrt{z_a - \xi}$ which converges to 0. Taking the limit
in 27.6, it follows $2\sqrt{z_a - \xi} = 0$ and so $\xi = z_a$, a contradiction to ξ ∉ Ω. Choose
r > 0 such that for all z ∈ Ω, $\left|\sqrt{z - \xi} + a\right| > r > 0$. Then consider
\[
\psi(z) \equiv \frac{r}{\sqrt{z - \xi} + a}. \tag{27.7}
\]
This is one to one, analytic, and maps Ω into B (0, 1) ($\left|\sqrt{z - \xi} + a\right| > r$). Thus F
is not empty and this proves the claim.
Claim 2: Let z0 ∈ Ω. There exists a finite positive real number, η, defined by
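\[
\eta \equiv \sup\left\{\left|\psi'(z_0)\right| : \psi \in \mathcal{F}\right\} \tag{27.8}
\]
and
\[
\psi_n \to h, \quad \psi_n' \to h', \tag{27.10}
\]
uniformly on all compact subsets of Ω. It follows
\[
|h'(z_0)| = \lim_{n \to \infty} \left|\psi_n'(z_0)\right| = \eta \tag{27.11}
\]
[Figure: a closed curve γ shown with the points $z_1$ and $z_2$.]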
Using the theorem on counting zeros, Theorem 25.20, and the fact that ψ n is
one to one,
\[
0 = \lim_{n \to \infty} \frac{1}{2\pi i} \int_{\gamma} \frac{\psi_n'(w)}{\psi_n(w) - \psi_n(z_1)}\, dw
= \frac{1}{2\pi i} \int_{\gamma} \frac{h'(w)}{h(w) - h(z_1)}\, dw,
\]
which shows that h − h (z1 ) has no zeros in B (z2 , r) . In particular z2 is not a zero of
$h - h(z_1)$. This shows that h is one to one since $z_2 \ne z_1$ was arbitrary. Therefore,
h ∈ F. It only remains to verify that $h(z_0) = 0$.
If $h(z_0) \ne 0$, consider $\phi_{h(z_0)} \circ h$ where $\phi_\alpha$ is the fractional linear transformation
defined in Lemma 27.9. By this lemma it follows $\phi_{h(z_0)} \circ h \in \mathcal{F}$. Now using the
chain rule,
\[
\left|\left(\phi_{h(z_0)} \circ h\right)'(z_0)\right| = \left|\phi_{h(z_0)}'(h(z_0))\right| |h'(z_0)|
= \left|\frac{1}{1 - |h(z_0)|^2}\right| |h'(z_0)|
= \left|\frac{1}{1 - |h(z_0)|^2}\right| \eta > \eta
\]
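\[
\phi_\alpha \circ h(z) = \frac{h(z) - \alpha}{1 - \overline{\alpha}\, h(z)}
\]
Thus
\[
\psi(z_0) = \phi_{\sqrt{\phi_\alpha \circ h(z_0)}} \circ \sqrt{\phi_\alpha \circ h}\,(z_0) = 0
\]
Define $s(w) \equiv w^2$. Then using Lemma 27.9, in particular, the description of $\phi_\alpha^{-1} = \phi_{-\alpha}$,
you can solve 27.13 for h to obtain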
= (F ◦ ψ) (z) (27.15)
Now
and F maps B (0, 1) into B (0, 1). Also, F is not one to one because it maps B (0, 1)
to B (0, 1) and has s in its definition. Thus there exists z1 ∈ B (0, 1) such that
$\phi_{-\sqrt{\phi_\alpha \circ h(z_0)}}(z_1) = -\frac{1}{2}$ and another point $z_2 \in B(0, 1)$ such that
$\phi_{-\sqrt{\phi_\alpha \circ h(z_0)}}(z_2) = \frac{1}{2}$. However, thanks to s, $F(z_1) = F(z_2)$.
Since F (0) = h (z0 ) = 0, you can apply the Schwarz lemma to F . Since F is
not one to one, it can’t be true that F (z) = λz for |λ| = 1 and so by the Schwarz
lemma it must be the case that |F 0 (0)| < 1. But this implies from 27.15 and 27.14
that
\[
\eta = |h'(z_0)| = |F'(\psi(z_0))| \left|\psi'(z_0)\right|
= |F'(0)| \left|\psi'(z_0)\right| < \left|\psi'(z_0)\right| \le \eta,
\]
Lemma 27.13 Let Ω be a simply connected region with Ω ≠ C. Then Ω has the
square root property.
Proof: Let f and $\frac{1}{f}$ both be analytic on Ω. Then $\frac{f'}{f}$ is analytic on Ω so by
Corollary 24.50, there exists $\widetilde{F}$, analytic on Ω such that $\widetilde{F}' = \frac{f'}{f}$ on Ω. Then
$\left(f e^{-\widetilde{F}}\right)' = 0$ and so $f(z) = C e^{\widetilde{F}} = e^{a + ib} e^{\widetilde{F}}$. Now let $F = \widetilde{F} + a + ib$. Then F is
still a primitive of $f'/f$ and $f(z) = e^{F(z)}$. Now let $\phi(z) \equiv e^{\frac{1}{2} F(z)}$. Then φ is the
desired square root and so Ω has the square root property.
Proof: From Theorem 27.12 and Lemma 27.13 there exists a function, f : Ω →
B (0, 1) which is one to one, onto, and analytic such that f (z0 ) = 0. The assertion
that f −1 is analytic follows from the open mapping theorem.
[2]. The approach given here is suggested by Rudin [45] and avoids many of the
standard technicalities.
has radius of convergence r. Then there exists a singular point on ∂B (a, r).
Proof: If not, then for every z ∈ ∂B (a, r) there exists $\delta_z > 0$ and $g_z$ analytic
on $B(z, \delta_z)$ such that $g_z = f$ on $B(z, \delta_z) \cap B(a, r)$. Since ∂B (a, r) is compact,
there exist $z_1, \cdots, z_n$, points in ∂B (a, r) such that $\{B(z_k, \delta_{z_k})\}_{k=1}^{n}$ covers ∂B (a, r) .
Now define
\[
g(z) \equiv \begin{cases} f(z) & \text{if } z \in B(a, r) \\ g_{z_k}(z) & \text{if } z \in B(z_k, \delta_{z_k}) \end{cases}
\]
Is this well defined? If $z \in B(z_i, \delta_{z_i}) \cap B(z_j, \delta_{z_j})$, is $g_{z_i}(z) = g_{z_j}(z)$? Consider the
following picture representing this situation.
You see that if $z \in B(z_i, \delta_{z_i}) \cap B(z_j, \delta_{z_j})$ then $I \equiv B(z_i, \delta_{z_i}) \cap B(z_j, \delta_{z_j}) \cap B(a, r)$
is a nonempty open set. Both $g_{z_i}$ and $g_{z_j}$ equal f on I. Therefore, they
must be equal on $B(z_i, \delta_{z_i}) \cap B(z_j, \delta_{z_j})$ because I has a limit point. Therefore,
g is well defined and analytic on an open set containing B (a, r). Since g agrees
with f on B (a, r) , the power series for g is the same as the power series for f and
converges on a ball which is larger than B (a, r) contrary to the assumption that the
radius of convergence of the above power series equals r. This proves the theorem.
Lemma 27.18 Suppose (f, B (0, r)) for r < 1 is a function element and (f, B (0, r))
can be analytically continued along every curve in B (0, 1) that starts at 0. Then
there exists an analytic function, g defined on B (0, 1) such that g = f on B (0, r) .
Proof: Let
Define gR (z) ≡ gr1 (z) where |z| < r1 . This is well defined because if you use r1
and r2 , both gr1 and gr2 agree with f on B (0, r), a set with a limit point and so
the two functions agree at every point in both B (0, r1 ) and B (0, r2 ). Thus gR is
analytic on B (0, R) . If R < 1, then by the assumption there are no singular points
on B (0, R) and so Theorem 27.16 implies the radius of convergence of the power
series for gR is larger than R contradicting the choice of R. Therefore, R = 1 and
this proves the lemma. Let g = gR .
The following theorem is the main result in this subject, the monodromy theo-
rem.
Corollary 27.20 Suppose (f, B (a, r)) is a function element with B (a, r) ⊆ C.
Suppose also that this function element can be analytically continued along every
curve through a. Then there exists G analytic on C such that G agrees with f on
B (a, r).
[Figure: the region $\Omega_1$ containing the ball B (a, r).]
A picture of Ω2 is similar except the line extends down from the boundary of
B (a, r).
Thus B (a, r) ⊆ Ωi and Ωi is simply connected and proper. By Theorem 27.19
there exist analytic functions, Gi analytic on Ωi such that Gi = f on B (a, r). Thus
G1 = G2 on B (a, r) , a set with a limit point. Therefore, G1 = G2 on Ω1 ∩ Ω2 . Now
let G (z) = Gi (z) where z ∈ Ωi . This is well defined and analytic on C. This proves
the corollary.
Picard theorem. The big Picard theorem is even more incredible. This one asserts
that to be non constant the entire function must take every value of C but two
infinitely many times! I will begin with the little Picard theorem. The method of
proof I will use is the one found in Saks and Zygmund [47], Conway [13] and Hille
[27]. This is not the way Picard did it in 1879. That approach is very different and
is presented at the end of the material on elliptic functions. This approach is much
more recent dating it appears from around 1924.
Lemma 27.21 Let f be analytic on a region containing B (0, r) and suppose
\[
|f'(0)| = b > 0, \quad f(0) = 0,
\]
and |f (z)| ≤ M for all z ∈ B (0, r). Then $f(B(0, r)) \supseteq B\left(0, \frac{r^2 b^2}{6M}\right)$.
Proof: By assumption,
\[
f(z) = \sum_{k=0}^{\infty} a_k z^k, \quad |z| \le r. \tag{27.16}
\]
Lemma 27.22 Let f be analytic on an open set containing B (0, R) and suppose
$|f'(0)| > 0$. Then there exists a ∈ B (0, R) such that
\[
f(B(0, R)) \supseteq B\left(f(a), \frac{|f'(0)|\, R}{24}\right).
\]
Proof: Let K (ρ) ≡ max {|f 0 (z)| : |z| = ρ} . For simplicity, let Cρ ≡ {z : |z| = ρ}.
Claim: K is continuous from the left.
Proof of claim: Let zρ ∈ Cρ such that |f 0 (zρ )| = K (ρ) . Then by the maximum
modulus theorem, if λ ∈ (0, 1) ,
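\[
|f'(a)|\, r = \frac{1}{2} |f'(0)|\, R, \quad B(a, r) \subseteq B(0, \rho_0 + r) \subseteq B(0, R). \tag{27.18}
\]
[Figure: the ball B (a, r) inside a ball centered at 0.]
which implies
\[
f(B(z_0, R)) \supseteq B\left(f(a), \frac{|f'(z_0)|\, R}{24}\right)
\]
as claimed. This proves the lemma.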
as claimed. This proves the lemma.
No attempt was made to find the best number to multiply by $R\, |f'(z_0)|$. A
discussion of this is given in Conway [13]. See also [27]. Much larger numbers than
1/24 are available and there is a conjecture due to Ahlfors about the best value. The
conjecture is that 1/24 can be replaced with
\[
\frac{\Gamma\left(\frac{1}{3}\right) \Gamma\left(\frac{11}{12}\right)}{\left(1 + \sqrt{3}\right)^{1/2} \Gamma\left(\frac{1}{4}\right)} \approx 0.47186
\]
You can see there is quite a gap between the constant for which this lemma is proved
above and what is thought to be the best constant.
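To get a feeling for the size of this gap, you can evaluate the conjectured constant directly with the gamma function from the standard library. This is just a numerical illustration; the variable names are made up here.

```python
import math

# The conjectured best constant, written exactly as in the displayed formula.
conjectured_constant = (math.gamma(1 / 3) * math.gamma(11 / 12)) / (
    math.sqrt(1 + math.sqrt(3)) * math.gamma(1 / 4)
)
# The constant actually obtained in the lemma.
proved_constant = 1 / 24

print(conjectured_constant)
print(conjectured_constant / proved_constant)
```

The ratio of the two constants is a little more than 11.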
Bloch’s lemma above gives the existence of a ball of a certain size inside the
image of a ball. By contrast the next lemma leads to conditions under which the
values of a function do not contain a ball of certain radius. It concerns analytic
functions which do not achieve the values 0 and 1.
Lemma 27.24 Let F denote the set of functions, f defined on Ω, a simply con-
nected region which do not achieve the values 0 and 1. Then for each such function,
it is possible to define a function analytic on Ω, H (z) by the formula
\[
H(z) \equiv \log\left[\sqrt{\frac{\log(f(z))}{2\pi i}} - \sqrt{\frac{\log(f(z))}{2\pi i} - 1}\right].
\]
There exists a constant C independent of f ∈ F such that H (Ω) does not contain
any ball of radius C.
Proof: Let f ∈ F . Then since f does not take the value 0, there exists g1 a
primitive of $f'/f$. Thus
\[
\frac{d}{dz}\left(e^{-g_1} f\right) = 0
\]
so there exists a, b such that $f(z) e^{-g_1(z)} = e^{a + bi}$. Letting $g(z) = g_1(z) + a + ib$, it
follows $e^{g(z)} = f(z)$. Let log (f (z)) = g (z). Then for n ∈ Z, the integers,
\[
\frac{\log(f(z))}{2\pi i},\ \frac{\log(f(z))}{2\pi i} - 1 \ne n
\]
because if equality held, then f (z) = 1 which does not happen. It follows $\frac{\log(f(z))}{2\pi i}$
and $\frac{\log(f(z))}{2\pi i} - 1$ are never equal to zero. Therefore, using the same reasoning, you
can define a logarithm of these two quantities and therefore, a square root. Hence
there exists a function analytic on Ω,
\[
\sqrt{\frac{\log(f(z))}{2\pi i}} - \sqrt{\frac{\log(f(z))}{2\pi i} - 1}. \tag{27.20}
\]
For n a positive integer, this function cannot equal $\sqrt{n} \pm \sqrt{n - 1}$ because if it did,
then
\[
\left(\sqrt{\frac{\log(f(z))}{2\pi i}} - \sqrt{\frac{\log(f(z))}{2\pi i} - 1}\right) = \sqrt{n} \pm \sqrt{n - 1} \tag{27.21}
\]
and you could take reciprocals of both sides to obtain
\[
\left(\sqrt{\frac{\log(f(z))}{2\pi i}} + \sqrt{\frac{\log(f(z))}{2\pi i} - 1}\right) = \sqrt{n} \mp \sqrt{n - 1}. \tag{27.22}
\]
Proof: Suppose the two values omitted are a and b and that h is not constant.
Let f (z) = (h (z) − a) / (b − a). Then f omits the two values 0 and 1. Let H be
defined in Lemma 27.24. Then H (z) is clearly not of the form az + b because then it
would have values equal to the vertices $\ln\left(\sqrt{n} \pm \sqrt{n - 1}\right) + 2m\pi i$ or else be constant
neither of which happen if h is not constant. Therefore, by Liouville's theorem, $H'$
must be unbounded. Pick ξ such that $|H'(\xi)| > 24C$ where C is such that H (C)
contains no balls of radius larger than C. But by Lemma 27.23 H (B (ξ, 1)) must
contain a ball of radius $\frac{|H'(\xi)|}{24} > \frac{24C}{24} = C$, a contradiction. This proves Picard's
theorem.
The following is another formulation of this theorem.
|f (z)| ≤ M (β, θ)
for all z ∈ B (0, θR) , where M (β, θ) is a function of only the two variables β, θ.
(In particular, there is no dependence on R.)
You notice there are two explicit uses of logarithms. Consider first the logarithm
inside the radicals. Choose this logarithm such that
log (f (0)) = ln |f (0)| + i arg (f (0)) , arg (f (0)) ∈ (−π, π]. (27.24)
and by replacing α with α + 2mπ for a suitable integer, m it follows the above
equation still holds. Therefore, you can assume 27.24. Similar reasoning applies to
the logarithm on the outside of the parenthesis. It can be assumed H (0) equals
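\[
\ln\left|\sqrt{\frac{\log(f(0))}{2\pi i}} - \sqrt{\frac{\log(f(0))}{2\pi i} - 1}\right| + i \arg\left(\sqrt{\frac{\log(f(0))}{2\pi i}} - \sqrt{\frac{\log(f(0))}{2\pi i} - 1}\right) \tag{27.25}
\]
Therefore,
\[
\left(\sqrt{\frac{\log(f(z))}{2\pi i}} + \sqrt{\frac{\log(f(z))}{2\pi i} - 1}\right)^2
+ \left(\sqrt{\frac{\log(f(z))}{2\pi i}} - \sqrt{\frac{\log(f(z))}{2\pi i} - 1}\right)^2
= \exp(2H(z)) + \exp(-2H(z))
\]
and
\[
\left(\frac{\log(f(z))}{\pi i} - 1\right) = \frac{1}{2} \left(\exp(2H(z)) + \exp(-2H(z))\right).
\]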
Thus
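\[
\log(f(z)) = \pi i + \frac{\pi i}{2} \left(\exp(2H(z)) + \exp(-2H(z))\right)
\]
which shows
\[
|f(z)| = \left|\exp\left[\frac{\pi i}{2} \left(\exp(2H(z)) + \exp(-2H(z))\right)\right]\right|
\]
\[
\le \exp\left|\frac{\pi i}{2} \left(\exp(2H(z)) + \exp(-2H(z))\right)\right|
\]
\[
\le \exp\left|\frac{\pi}{2} \left(|\exp(2H(z))| + |\exp(-2H(z))|\right)\right|
\]
\[
\le \exp\left|\frac{\pi}{2} \left(\exp(2|H(z)|) + \exp(|-2H(z)|)\right)\right|
\]
\[
= \exp\left(\pi \exp 2|H(z)|\right).
\]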
Consider exp (2 |H (0)|). I want to obtain an inequality for this which involves
β. This is where I will use the convention about the logarithms discussed above.
From 27.25,
\[
2|H(0)| = 2\left|\log\left(\sqrt{\frac{\log(f(0))}{2\pi i}} - \sqrt{\frac{\log(f(0))}{2\pi i} - 1}\right)\right|
\]
\[
\le 2\left(\left(\ln\left|\sqrt{\frac{\log(f(0))}{2\pi i}} - \sqrt{\frac{\log(f(0))}{2\pi i} - 1}\right|\right)^2 + \pi^2\right)^{1/2}
\]
\[
\le 2\left(\left|\ln\left(\left|\sqrt{\frac{\log(f(0))}{2\pi i}}\right| + \left|\sqrt{\frac{\log(f(0))}{2\pi i} - 1}\right|\right)\right|^2 + \pi^2\right)^{1/2}
\]
\[
\le 2\left|\ln\left(\left|\sqrt{\frac{\log(f(0))}{2\pi i}}\right| + \left|\sqrt{\frac{\log(f(0))}{2\pi i} - 1}\right|\right)\right| + 2\pi
\]
\[
\le \ln\left(2\left(\left|\frac{\log(f(0))}{2\pi i}\right| + \left|\frac{\log(f(0))}{2\pi i} - 1\right|\right)\right) + 2\pi
\]
\[
= \ln\left(\left(\left|\frac{\log(f(0))}{\pi i}\right| + \left|\frac{\log(f(0))}{\pi i} - 2\right|\right)\right) + 2\pi \tag{27.28}
\]
Consider $\left|\frac{\log(f(0))}{\pi i}\right|$
Similarly,
\[
\left|\frac{\log(f(0))}{\pi i} - 2\right| \le \left(\left|\frac{\ln \beta}{\pi}\right|^2 + (2 + 1)^2\right)^{1/2}
= \left(\left|\frac{\ln \beta}{\pi}\right|^2 + 9\right)^{1/2}
\]
and so, letting M (β, θ) be given by the above expression on the right, the lemma
is proved.
The following theorem will be referred to as Schottky’s theorem. It looks just
like the above lemma except it is only assumed that f is analytic on B (0, R) rather
than on an open set containing B (0, R). Also, the case of an arbitrary center is
included along with arbitrary points which are not attained as values of the function.
Theorem 27.28 Let f be analytic on B (z0 , R) and suppose that f does not take
on either of the two distinct values a or b. Also suppose |f (z0 )| ≤ β. Then letting
θ ∈ (0, 1) , it follows
|f (z)| ≤ M (a, b, β, θ)
for all z ∈ B (z0 , θR) , where M (a, b, β, θ) is a function of only the variables β, θ,a, b.
(In particular, there is no dependence on R.)
Proof: First you can reduce to the case where the two values are 0 and 1 by
considering
\[
h(z) \equiv \frac{f(z) - a}{b - a} .
\]
If there exists an estimate of the desired sort for h, then there exists such an estimate
for f. Of course here the function, M would depend on a and b. Therefore, there is
no loss of generality in assuming the points which are missed are 0 and 1.
Apply Lemma 27.27 to B (0, R1 ) for the function, g (z) ≡ f (z0 + z) and R1 < R.
Then if β ≥ |f (z0 )| = |g (0)| , it follows |g (z)| = |f (z0 + z)| ≤ M (β, θ) for every
z ∈ B (0, θR1 ) . Now let θ ∈ (0, 1) and choose R1 < R large enough that θR = θ1 R1
where θ1 ∈ (0, 1) . Then if |z − z0 | < θR, it follows
|f (z)| ≤ M (β, θ1 ) .
Now let R1 → R so θ1 → θ.
complex plane to the surface of this sphere as follows. Extend a line from the point,
p in the complex plane to the point (0, 0, 2) on the top of this sphere and let θ (p)
denote the point of this sphere which the line intersects. Define θ (∞) ≡ (0, 0, 2).
[Figure: stereographic projection. The line from the point p in C to the point (0, 0, 2) meets the sphere at θ(p); the point (0, 0, 1) is also marked.]
Then θ is sometimes called stereographic projection. The mapping θ is clearly
continuous because it takes converging sequences to converging sequences.
Furthermore, it is clear that $\theta^{-1}$ is also continuous. In terms of the extended complex
plane, $\widehat{\mathbb{C}}$, a sequence, $z_n$ converges to ∞ if and only if $\theta z_n$ converges to (0, 0, 2) and
a sequence, $z_n$ converges to $z \in \mathbb{C}$ if and only if $\theta (z_n) \to \theta (z)$.
In fact this makes it easy to define a metric on $\widehat{\mathbb{C}}$.
Definition 27.29 Let $z, w \in \widehat{\mathbb{C}}$. Then let $d(z, w) \equiv |\theta (z) - \theta (w)|$ where this last
distance is the usual distance measured in $\mathbb{R}^3$.
Theorem 27.30 $(\widehat{\mathbb{C}}, d)$ is a compact, hence complete metric space.
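Here is a small numerical sketch of Definition 27.29. It assumes the sphere has radius 1 and center (0, 0, 1), which is how the picture reads but is not spelled out in the text at this point. The names `theta` and `d`, and the use of `None` to stand for ∞, are illustrative choices only.

```python
import math

NORTH = (0.0, 0.0, 2.0)  # theta(infinity), the top of the sphere


def theta(p: complex):
    """Point where the line from (Re p, Im p, 0) to (0, 0, 2) meets the sphere
    x^2 + y^2 + (z - 1)^2 = 1 (assumed radius 1, center (0, 0, 1))."""
    x, y = p.real, p.imag
    # Points on the line are (t x, t y, 2 - 2 t); solving for the sphere gives t.
    t = 4.0 / (x * x + y * y + 4.0)
    return (t * x, t * y, 2.0 - 2.0 * t)


def d(z, w):
    """Distance in R^3 between theta(z) and theta(w); None stands for infinity."""
    a = NORTH if z is None else theta(z)
    b = NORTH if w is None else theta(w)
    return math.dist(a, b)


# Points far from the origin are close to infinity in this metric.
for radius in (1.0, 10.0, 100.0, 1000.0):
    print(radius, d(complex(radius, 0.0), None))
```

The printed distances shrink as the radius grows, which matches the statement that $z_n \to \infty$ exactly when $\theta (z_n) \to (0, 0, 2)$.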
The Ascoli Arzela theorem, Theorem 6.24 is a major result which tells which
subsets of C (K, X) are sequentially compact.
f (x) ∈ B (a, M )
\[
\lim_{l \to \infty} \rho_K (f_{k_l}, f) = 0 .
\]
or there exists a subsequence {fnk } such that for all compact subsets K,
Proof: Let $B(z_0, 2R) \subseteq \Omega$. There are two cases to consider. The first case is
that there exists a subsequence, $n_k$ such that $\{f_{n_k}(z_0)\}$ is bounded. The second
case is that $\lim_{n \to \infty} |f_n (z_0)| = \infty$.
Consider the first case. By Theorem 27.28, $\{f_{n_k}(z)\}$ is uniformly bounded on
$B(z_0, R)$ because by this theorem, applied to $B(z_0, 2R)$ with $\theta = 1/2$, it
follows $|f_{n_k}(z)| \le M\left(a, b, \beta, \frac{1}{2}\right)$ where β is an upper bound to the numbers $|f_{n_k}(z_0)|$.
The Cauchy integral formula implies the existence of a uniform bound on the $\{f_{n_k}'\}$
which implies the functions are equicontinuous and uniformly bounded. Therefore,
by the Ascoli Arzela theorem there exists a further subsequence which converges
uniformly on $B(z_0, R)$ to a function, f analytic on $B(z_0, R)$. Thus denoting this
subsequence by $\{f_{n_k}\}$ to save on notation,
Consider the second case. In this case, it follows $\{1/f_n (z_0)\}$ is bounded on
$B(z_0, R)$ and so by the same argument just given $\{1/f_n (z)\}$ is uniformly bounded
on $B(z_0, R)$. Therefore, a subsequence converges uniformly on $B(z_0, R)$. But
$\{1/f_n (z)\}$ converges to 0 and so this requires that $\{1/f_n (z)\}$ must converge
uniformly to 0. Therefore,
\[
\lim_{k \to \infty} \rho_{B(z_0, R)} (f_{n_k}, \infty) = 0 . \tag{27.32}
\]
Now let $\{D_k\}$ denote a countable set of closed balls, $D_k = \overline{B(z_k, R_k)}$ such that
$B(z_k, 2R_k) \subseteq \Omega$ and $\bigcup_{k=1}^{\infty} \operatorname{int}(D_k) = \Omega$. Using a Cantor diagonal process, there
exists a subsequence, $\{f_{n_k}\}$ of $\{f_n\}$ such that for each $D_j$, one of the above two
alternatives holds. That is, either
or,
\[
\lim_{k \to \infty} \rho_{D_j} (f_{n_k}, \infty) = 0 . \tag{27.34}
\]
Let A = {∪ int (Dj ) : 27.33 holds} , B = {∪ int (Dj ) : 27.34 holds} . Note that the
balls whose union is A cannot intersect any of the balls whose union is B. Therefore,
one of A or B must be empty since otherwise, Ω would not be connected.
If K is any compact subset of Ω, it follows K must be a subset of some finite
collection of the Dj . Therefore, one of the alternatives in the lemma must hold.
That the limit function, f must be analytic follows easily in the same way as the
proof in Theorem 27.7 on Page 740. You could also use Morera’s theorem. This
proves the lemma.
Proof: Suppose this is not true. Then there exists $R_1 > 0$ and two points, α
and β such that $f^{-1}(\beta) \cap B'(0, R_1)$ and $f^{-1}(\alpha) \cap B'(0, R_1)$ are both finite sets.
Then shrinking $R_1$ and calling the result R, there exists $B(0, R)$ such that
Corollary 27.38 Suppose f is entire and nonconstant and not a polynomial. Then
f assumes every complex value infinitely many times with the possible exception of
one.
Proof: Since f is entire, $f(z) = \sum_{n=0}^{\infty} a_n z^n$. Define for $z \neq 0$,
\[
g(z) \equiv f\left( \frac{1}{z} \right) = \sum_{n=0}^{\infty} a_n \left( \frac{1}{z} \right)^{n} .
\]
27.6 Exercises
1. Prove that in Theorem 27.7 it suffices to assume F is uniformly bounded on
each compact subset of Ω.
4. Verify the conclusion of Theorem 27.7 involving the higher order derivatives.
6. Verify that |φα (z)| = 1 if |z| = 1. Apply the maximum modulus theorem to
conclude that |φα (z)| ≤ 1 for all |z| < 1.
7. Suppose that $|f(z)| \le 1$ for $|z| = 1$ and $f(\alpha) = 0$ for $|\alpha| < 1$. Show that
$|f(z)| \le |\phi_\alpha (z)|$ for all $z \in B(0, 1)$. Hint: Consider $\frac{f(z)(1 - \overline{\alpha} z)}{z - \alpha}$ which has a
removable singularity at α. Show the modulus of this function is bounded by
1 on $|z| = 1$. Then apply the maximum modulus theorem.
10. Suppose Ω is a simply connected region and u is a real valued function defined
on Ω such that u is harmonic. Show there exists an analytic function, f such
that u = Re f . Show this is not true if Ω is not a simply connected region.
Hint: You might use the Riemann mapping theorem and Problems 8 and
9. For the second part it might be good to try something like $u(x, y) =
\ln \left( x^2 + y^2 \right)$ on the annulus $1 < |z| < 2$.
11. Show that $w = \frac{1+z}{1-z}$ maps $\{z \in \mathbb{C} : \operatorname{Im} z > 0 \text{ and } |z| < 1\}$ to the first quadrant,
$\{z = x + iy : x, y > 0\}$.
12. Let $f(z) = \frac{az+b}{cz+d}$ and let $g(z) = \frac{a_1 z + b_1}{c_1 z + d_1}$. Show that $f \circ g (z)$ equals the quotient
of two expressions, the numerator being the top entry in the vector
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix}
\begin{pmatrix} z \\ 1 \end{pmatrix}
\]
and the denominator being the bottom entry. Show that if you define
\[
\phi \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right) \equiv \frac{az+b}{cz+d},
\]
then $\phi (AB) = \phi (A) \circ \phi (B)$. Find an easy way to find the inverse of $f(z) =
\frac{az+b}{cz+d}$ and give a condition on the a, b, c, d which insures this function has an
inverse.
13. The modular group$^2$ is the set of fractional linear transformations, $\frac{az+b}{cz+d}$ such
that a, b, c, d are integers and $ad - bc = 1$. Using Problem 12 or brute force
show this modular group is really a group with the group operation being
composition. Also show the inverse of $\frac{az+b}{cz+d}$ is $\frac{dz-b}{-cz+a}$.
14. Let Ω be a region and suppose f is analytic on Ω and that the functions fn
are also analytic on Ω and converge to f uniformly on compact subsets of Ω.
Suppose f is one to one. Can it be concluded that for an arbitrary compact
set, K ⊆ Ω that fn is one to one for all n large enough?
15. The Vitali theorem says that if Ω is a region and {fn } is a uniformly bounded
sequence of functions which converges pointwise on a set, S ⊆ Ω which has a
limit point in Ω, then in fact, {fn } must converge uniformly on compact sub-
sets of Ω to an analytic function. Prove this theorem. Hint: If the sequence
fails to converge, show you can get two different subsequences converging uni-
formly on compact sets to different functions. Then argue these two functions
coincide on S.
16. Does there exist a function analytic on $B(0, 1)$ which maps $B(0, 1)$ onto
$B'(0, 1)$, the open unit ball in which 0 has been deleted?
2 This is the terminology used in Rudin’s book Real and Complex Analysis.
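The claims in Problems 12 and 13 can be checked numerically before proving them. The following is only a sketch and not part of the problems. Matrices are stored as tuples (a, b, c, d), and the helper names `mat_mul`, `mobius` and `inverse_matrix` are made up for illustration.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices stored as (a, b, c, d)."""
    a, b, c, d = A
    a1, b1, c1, d1 = B
    return (a * a1 + b * c1, a * b1 + b * d1, c * a1 + d * c1, c * b1 + d * d1)


def mobius(A):
    """The fractional linear transformation z -> (a z + b) / (c z + d)."""
    a, b, c, d = A
    return lambda z: (a * z + b) / (c * z + d)


def inverse_matrix(A):
    """Candidate from Problem 13: (a, b, c, d) -> (d, -b, -c, a)."""
    a, b, c, d = A
    return (d, -b, -c, a)


A = (2, 1, 1, 1)  # integer entries with ad - bc = 1
B = (1, 1, 0, 1)  # integer entries with ad - bc = 1
z = 0.3 + 0.7j

composed_via_matrix = mobius(mat_mul(A, B))(z)
composed_directly = mobius(A)(mobius(B)(z))
round_trip = mobius(inverse_matrix(A))(mobius(A)(z))

print(abs(composed_via_matrix - composed_directly))  # should be near 0
print(abs(round_trip - z))                           # should be near 0
```

Agreement at one test point proves nothing, of course, but a disagreement would point to an algebra mistake.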
Approximation By Rational
Functions
Definition 28.1 Approximation will be taken with respect to the following norm.
for every $z \in K$, $\sum_{k=1}^{n} n(\gamma_k, z) = 1$. One more ingredient is needed and this is a
theorem which lets you keep the approximation but move the poles.
To begin with, consider the part about the cycle of closed oriented curves. Recall
Theorem 24.52 which is stated for convenience.
Theorem 28.2 Let K be a compact subset of an open set, Ω. Then there exist
continuous, closed, bounded variation oriented curves $\{\gamma_j\}_{j=1}^{m}$ for which $\gamma_j^* \cap K = \emptyset$
for each j, $\gamma_j^* \subseteq \Omega$, and for all $p \in K$,
\[
\sum_{k=1}^{m} n(p, \gamma_k) = 1
\]
and
\[
\sum_{k=1}^{m} n(z, \gamma_k) = 0
\]
for all $z \notin \Omega$.
This theorem implies the following.
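Theorem 28.3 Let $K \subseteq \Omega$ where K is compact and Ω is open. Then there exist
oriented closed curves, $\gamma_k$ such that $\gamma_k^* \cap K = \emptyset$ but $\gamma_k^* \subseteq \Omega$, such that for all $z \in K$,
\[
f(z) = \frac{1}{2\pi i} \sum_{k=1}^{p} \int_{\gamma_k} \frac{f(w)}{w - z} \, dw . \tag{28.1}
\]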
Proof: This follows from Theorem 24.52 and the Cauchy integral formula. As
shown in the proof, you can assume the γ k are linear mappings but this is not
important.
Next I will show how the Cauchy integral formula leads to approximation by
rational functions, quotients of polynomials.
Lemma 28.4 Let K be a compact subset of an open set, Ω and let f be analytic
on Ω. Then there exists a rational function, Q whose poles are not in K such that
||Q − f ||K,∞ < ε.
Proof: By Theorem 28.3 there are oriented curves, $\gamma_k$ described there such that
for all $z \in K$,
\[
f(z) = \frac{1}{2\pi i} \sum_{k=1}^{p} \int_{\gamma_k} \frac{f(w)}{w - z} \, dw . \tag{28.2}
\]
\[
\|R - f\|_{K,\infty} < \frac{\varepsilon}{2} .
\]
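To see concretely how a Riemann sum for 28.2 produces a rational function, here is a minimal sketch under simplifying assumptions which are mine, not the text's: there is a single curve, the circle of radius 2 centered at 0, K is the closed unit disk, and f is the exponential function. The name `rational_from_cauchy` is hypothetical.

```python
import cmath
import math


def rational_from_cauchy(f, radius, n_nodes):
    """Return R(z) = sum_j c_j / (w_j - z), a Riemann sum for the Cauchy integral
    of f over the circle |w| = radius. R is rational with poles at the nodes w_j."""
    nodes = [radius * cmath.exp(2j * math.pi * j / n_nodes) for j in range(n_nodes)]
    # On the circle w = radius * e^{it}, dw = i w dt, so
    # (1 / (2 pi i)) f(w) dw / (w - z) becomes f(w) w / (w - z) * dt / (2 pi).
    coeffs = [f(w) * w / n_nodes for w in nodes]

    def R(z):
        return sum(c / (w - z) for c, w in zip(coeffs, nodes))

    return R


R = rational_from_cauchy(cmath.exp, 2.0, 64)
# Sample points of K, the closed unit disk.
samples = [r * cmath.exp(2j * math.pi * k / 24) for r in (0.0, 0.5, 1.0) for k in range(24)]
max_error = max(abs(R(z) - cmath.exp(z)) for z in samples)
print(max_error)
```

All poles of R lie on the circle of radius 2, hence off K, which is the situation described in Lemma 28.4.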
à ∞
! ∞ ∞
X X X
ai bj = cn
i=r j=r n=r
where
n
X
cn = ak bn−k+r .
k=r
∞
X
cn = pnk ak bn−k+r .
k=r
1 Actually, it is only necessary to assume one of the series converges and the other converges
absolutely. This is known as Mertens' theorem and may be read in the 1974 book by Apostol
listed in the bibliography.
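Also,
\begin{align*}
\sum_{k=r}^{\infty} \sum_{n=r}^{\infty} p_{nk} |a_k| |b_{n-k+r}| &= \sum_{k=r}^{\infty} |a_k| \sum_{n=r}^{\infty} p_{nk} |b_{n-k+r}| \\
&= \sum_{k=r}^{\infty} |a_k| \sum_{n=k}^{\infty} |b_{n-k+r}| \\
&= \sum_{k=r}^{\infty} |a_k| \sum_{n=k}^{\infty} \left| b_{n-(k-r)} \right| \\
&= \sum_{k=r}^{\infty} |a_k| \sum_{m=r}^{\infty} |b_m| < \infty .
\end{align*}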
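Therefore,
\begin{align*}
\sum_{n=r}^{\infty} c_n &= \sum_{n=r}^{\infty} \sum_{k=r}^{n} a_k b_{n-k+r} = \sum_{n=r}^{\infty} \sum_{k=r}^{\infty} p_{nk} a_k b_{n-k+r} \\
&= \sum_{k=r}^{\infty} a_k \sum_{n=r}^{\infty} p_{nk} b_{n-k+r} = \sum_{k=r}^{\infty} a_k \sum_{n=k}^{\infty} b_{n-k+r} \\
&= \sum_{k=r}^{\infty} a_k \sum_{m=r}^{\infty} b_m
\end{align*}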
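The formula $c_n = \sum_{k=r}^{n} a_k b_{n-k+r}$ is easy to try on two geometric series. The sketch below uses $a_k = 2^{-k}$, $b_k = 3^{-k}$ and $r = 2$, choices made only for illustration; the function name `cauchy_product_terms` is also hypothetical.

```python
def cauchy_product_terms(a, b, r, n_max):
    """Return [c_r, ..., c_{n_max}] where c_n = sum_{k=r}^{n} a(k) * b(n - k + r)."""
    return [
        sum(a(k) * b(n - k + r) for k in range(r, n + 1))
        for n in range(r, n_max + 1)
    ]


def a(k):
    return 0.5 ** k


def b(k):
    return (1.0 / 3.0) ** k


r = 2
c = cauchy_product_terms(a, b, r, 80)

# Closed forms for the two geometric tails starting at index r = 2.
sum_a = 0.5          # sum_{k>=2} 2^{-k}
sum_b = 1.0 / 6.0    # sum_{k>=2} 3^{-k}
print(sum(c), sum_a * sum_b)
```

Both printed numbers should agree to many digits, since the neglected terms with n > 80 are tiny.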
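Proof:
\[
|c_n (z)| \le \sum_{k=0}^{n} |a_{n-k}(z)| |b_k (z)| \le \sum_{k=0}^{n} A_{n-k} B_k .
\]
Also,
\begin{align*}
\sum_{n=0}^{\infty} \sum_{k=0}^{n} A_{n-k} B_k &= \sum_{k=0}^{\infty} \sum_{n=k}^{\infty} A_{n-k} B_k \\
&= \sum_{k=0}^{\infty} B_k \sum_{n=0}^{\infty} A_n < \infty .
\end{align*}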
The claim of 28.3 follows from Mertens' theorem. This proves the lemma.
Corollary 28.7 Let P be a polynomial and let $\sum_{n=0}^{\infty} a_n (z)$ converge uniformly and
absolutely on K such that the $a_n$ satisfy the conditions of the Weierstrass M test.
Then there exists a series for $P\left( \sum_{n=0}^{\infty} a_n (z) \right)$, $\sum_{n=0}^{\infty} c_n (z)$, which also converges
absolutely and uniformly for $z \in K$ because $c_n (z)$ also satisfies the conditions of the
Weierstrass M test.
The following picture is descriptive of the following lemma. This lemma says
that if you have a rational function with one pole off a compact set, then you can
approximate on the compact set with another rational function which has a different
pole.
[Figure: the compact set K, the set V, and the two points a and b.]
Proof: Say that b ∈ V satisfies P if for all ε > 0 there exists a rational function,
Qb , having a pole only at b such that
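for all $z \in K$ whenever $b \in B(b_1, \delta)$. In fact, it suffices to take $|b - b_1| < \operatorname{dist}(b_1, K)/4$
because then
\begin{align*}
\left| \frac{b_1 - b}{z - b} \right| &< \left| \frac{\operatorname{dist}(b_1, K)/4}{z - b} \right| \le \frac{\operatorname{dist}(b_1, K)/4}{|z - b_1| - |b_1 - b|} \\
&\le \frac{\operatorname{dist}(b_1, K)/4}{\operatorname{dist}(b_1, K) - \operatorname{dist}(b_1, K)/4} \le \frac{1}{3} < \frac{1}{2} .
\end{align*}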
Since b1 satisfies P, there exists a rational function Qb1 with the desired prop-
erties. It is shown next that you can approximate Qb1 with Qb thus yielding an
approximation to R by the use of the triangle inequality,
By Corollary 28.7 the same is true of the series for $\frac{1}{\left( 1 - \frac{b_1 - b}{z - b} \right)^{n}}$. Thus a suitable partial
sum can be made uniformly on K as close as desired to $\frac{1}{(z - b_1)^n}$. This shows that b
satisfies P whenever b is close enough to $b_1$ verifying that S is open.
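The idea of moving a pole from $b_1$ to a nearby point b can be tried numerically in the simplest case n = 1, using the geometric series $\frac{1}{z - b_1} = \frac{1}{z-b} \sum_{k=0}^{\infty} \left( \frac{b_1 - b}{z-b} \right)^k$. The sketch below assumes K is the closed unit disk, $b_1 = 2$ and $b = 2.2$, so that $|b - b_1| < \operatorname{dist}(b_1, K)/4$; these values and the name `pushed_pole_sum` are illustrative only.

```python
import cmath
import math


def pushed_pole_sum(z, b1, b, M):
    """Partial sum, up to k = M, of 1/(z - b1) written in powers of 1/(z - b).
    The result is a rational function of z whose only pole is at b."""
    q = (b1 - b) / (z - b)
    return sum(q ** k for k in range(M + 1)) / (z - b)


b1, b = 2.0, 2.2
# Sample points of K, the closed unit disk.
K = [r * cmath.exp(2j * math.pi * k / 36) for r in (0.0, 0.5, 1.0) for k in range(36)]
max_error = max(abs(pushed_pole_sum(z, b1, b, 20) - 1.0 / (z - b1)) for z in K)
print(max_error)
```

Here the ratio $\left| \frac{b_1 - b}{z - b} \right|$ is at most 1/6 on K, so twenty terms already give an error near machine precision.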
Next it is shown S is closed in V. Let $b_n \in S$ and suppose $b_n \to b \in V$. Then
since $b_n \in S$, there exists a rational function, $Q_{b_n}$ such that
\[
\|Q_{b_n} - R\|_{K,\infty} < \frac{\varepsilon}{2} .
\]
\[
\frac{1}{2} \operatorname{dist}(b, K) \ge |b_n - b|
\]
and so for all n large enough,
\[
\left| \frac{b - b_n}{z - b_n} \right| < \frac{1}{2},
\]
for all $z \in K$. Pick such a $b_n$. As before, it suffices to assume $Q_{b_n}$ is of the form
$\frac{1}{(z - b_n)^n}$. Then
\[
Q_{b_n}(z) = \frac{1}{(z - b_n)^n} = \frac{1}{(z - b)^n \left( 1 - \frac{b_n - b}{z - b} \right)^{n}}
\]
and because of the estimate, there exists M such that for all $z \in K$
\[
\left| \frac{1}{\left( 1 - \frac{b_n - b}{z - b} \right)^{n}} - \sum_{k=0}^{M} a_k \left( \frac{b_n - b}{z - b} \right)^{k} \right| < \frac{\varepsilon \left( \operatorname{dist}(b, K) \right)^{n}}{2} . \tag{28.7}
\]
and so, letting $Q_b (z) = \frac{1}{(z - b)^n} \sum_{k=0}^{M} a_k \left( \frac{b_n - b}{z - b} \right)^{k}$,
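form
\begin{align*}
Q_b (z) &= \frac{p(z)}{(z - b)^n} = p(z) (-1)^n \frac{1}{b^n} \frac{1}{\left( 1 - \frac{z}{b} \right)^{n}} \\
&= p(z) (-1)^n \frac{1}{b^n} \left( \sum_{k=0}^{\infty} \left( \frac{z}{b} \right)^{k} \right)^{n}
\end{align*}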
Then by an application of Corollary 28.7 there exists a partial sum of the power
series for Qb which is uniformly close to Qb on K. Therefore, you can approximate
Qb and therefore also R uniformly on K by a polynomial consisting of a partial sum
of the above infinite sum. This proves the theorem.
If f is a polynomial, then f has a pole at ∞. This will be discussed more later.
Theorem 28.9 Let K be a compact subset of an open set, Ω and let $\{b_j\}$ be a
set which consists of one point from each component of $\widehat{\mathbb{C}} \setminus K$. Let f be analytic
on Ω. Then for each $\varepsilon > 0$, there exists a rational function, Q whose poles are all
contained in the set, $\{b_j\}$ such that
It follows
||f − Q||K,∞ ≤ ||f − R||K,∞ + ||R − Q||K,∞ < ε.
In the case of only one component of C \ K, this component is the unbounded
component and so you can take Q to be a polynomial. This proves the theorem.
The next version of Runge’s theorem concerns the case where the given points
are contained in $\widehat{\mathbb{C}} \setminus \Omega$ for Ω an open set rather than a compact set. Note that here
there could be uncountably many components of $\widehat{\mathbb{C}} \setminus \Omega$ because the components are
no longer open sets. An easy example of this phenomenon in one dimension is where
$\Omega = [0, 1] \setminus P$ for P the Cantor set. Then you can show that $\mathbb{R} \setminus \Omega$ has uncountably
many components. Nevertheless, Runge’s theorem will follow from Theorem 28.9
with the aid of the following interesting lemma.
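Lemma 28.10 Let Ω be an open set in $\mathbb{C}$. Then there exists a sequence of compact
sets, $\{K_n\}$ such that
\[
\Omega = \bigcup_{n=1}^{\infty} K_n , \ \cdots, \ K_n \subseteq \operatorname{int} K_{n+1} \cdots, \tag{28.10}
\]
Proof: Let
\[
V_n \equiv \{z : |z| > n\} \cup \bigcup_{z \notin \Omega} B\left( z, \frac{1}{n} \right) .
\]
\[
K_n \equiv \widehat{\mathbb{C}} \setminus V_n = \mathbb{C} \setminus V_n \subseteq \Omega .
\]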
You should verify that 28.10 and 28.11 hold. It remains to show that every
component of $\widehat{\mathbb{C}} \setminus K_n$ contains a component of $\widehat{\mathbb{C}} \setminus \Omega$. Let D be a component of $\widehat{\mathbb{C}} \setminus K_n \equiv V_n$.
If $\infty \notin D$, then D contains no point of $\{z : |z| > n\}$ because this set is connected
and D is a component. (If it did contain a point of this set, it would have to
contain the whole set.) Therefore, $D \subseteq \bigcup_{z \notin \Omega} B\left( z, \frac{1}{n} \right)$ and so D contains some point
of $B\left( z, \frac{1}{n} \right)$ for some $z \notin \Omega$. Therefore, since this ball is connected, it follows D must
contain the whole ball and consequently D contains some point of $\Omega^C$. (The point
z at the center of the ball will do.) Since D contains $z \notin \Omega$, it must contain the
component, $H_z$, determined by this point. The reason for this is that
\[
H_z \subseteq \widehat{\mathbb{C}} \setminus \Omega \subseteq \widehat{\mathbb{C}} \setminus K_n
\]
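The sets $K_n = \mathbb{C} \setminus V_n$ have a concrete description: z is in $K_n$ exactly when $|z| \le n$ and the distance from z to the complement of Ω is at least 1/n. The following sketch tests membership for the right half plane $\Omega = \{z : \operatorname{Re} z > 0\}$, an example chosen here for illustration; the function names are hypothetical.

```python
def in_K(z, n, dist_to_complement):
    """True when z lies in K_n = C \\ V_n, that is, |z| <= n and z is at distance
    at least 1/n from every point outside Omega."""
    return abs(z) <= n and dist_to_complement(z) >= 1.0 / n


def dist_to_complement_half_plane(z):
    """Distance from z to {Re w <= 0}, the complement of the right half plane."""
    return max(z.real, 0.0)


point = 0.5 + 0.0j
print(in_K(point, 1, dist_to_complement_half_plane))  # not yet in K_1
print(in_K(point, 2, dist_to_complement_half_plane))  # in K_2

# Check K_n is contained in K_{n+1} on a grid of sample points.
grid = [complex(x / 4.0, y / 4.0) for x in range(-20, 21) for y in range(-20, 21)]
nested = all(
    in_K(z, n + 1, dist_to_complement_half_plane)
    for n in range(1, 6)
    for z in grid
    if in_K(z, n, dist_to_complement_half_plane)
)
print(nested)
```

Every point of Ω eventually lands in some $K_n$, which is the content of the first part of 28.10.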
[Figure: an open set Ω with two holes, marked by the points a1 and a2.]
However, there could be many more holes than two. In fact, there could be
infinitely many. Nor does it follow that the components of the complement of Ω need
to have any interior points. Therefore, the picture is certainly not representative.
Theorem 28.11 (Runge) Let Ω be an open set, and let A be a set which has one
point in each component of $\widehat{\mathbb{C}} \setminus \Omega$ and let f be analytic on Ω. Then there exists a
sequence of rational functions, $\{R_n\}$ having poles only in A such that $R_n$ converges
uniformly to f on compact subsets of Ω.
Proof: Let $K_n$ be the compact sets of Lemma 28.10 where each component of
$\widehat{\mathbb{C}} \setminus K_n$ contains a component of $\widehat{\mathbb{C}} \setminus \Omega$. It follows each component of $\widehat{\mathbb{C}} \setminus K_n$ contains
a point of A. Therefore, by Theorem 28.9 there exists $R_n$ a rational function with
poles only in A such that
\[
\|R_n - f\|_{K_n,\infty} < \frac{1}{n} .
\]
It follows, since a given compact set, K is a subset of $K_n$ for all n large enough,
that $R_n \to f$ uniformly on K. This proves the theorem.
Corollary 28.12 Let Ω be simply connected and f analytic on Ω. Then there exists
a sequence of polynomials, {pn } such that pn → f uniformly on compact sets of Ω.
Theorem 28.13 Let $P \equiv \{z_k\}_{k=1}^{\infty}$ be a set of points in an open subset of $\mathbb{C}$, Ω.
Suppose also that $P \subseteq \Omega \subseteq \mathbb{C}$. For each $z_k$, denote by $S_k (z)$ a function of the form
\[
S_k (z) = \sum_{j=1}^{m_k} \frac{a_{kj}}{(z - z_k)^j} .
\]
Then there exists a meromorphic function, Q defined on Ω such that the poles of
Q are the points, $\{z_k\}_{k=1}^{\infty}$ and the singular part of the Laurent expansion of Q at
$z_k$ equals $S_k (z)$. In other words, for z near $z_k$, $Q(z) = g_k (z) + S_k (z)$ for some
function, $g_k$ analytic near $z_k$.
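Proof: Let $\{K_n\}$ denote the sequence of compact sets described in Lemma
28.10. Thus $\bigcup_{n=1}^{\infty} K_n = \Omega$, $K_n \subseteq \operatorname{int}(K_{n+1}) \subseteq K_{n+1} \cdots$, and the components of
$\widehat{\mathbb{C}} \setminus K_n$ contain the components of $\widehat{\mathbb{C}} \setminus \Omega$. Renumbering if necessary, you can assume
each $K_n \neq \emptyset$. Also let $K_0 = \emptyset$. Let $P_m \equiv P \cap (K_m \setminus K_{m-1})$ and consider the
rational function, $R_m$ defined by
\[
R_m (z) \equiv \sum_{z_k \in K_m \setminus K_{m-1}} S_k (z) .
\]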
It remains to verify this function works. First consider K1 . Then on K1 , the above
sum converges uniformly. Furthermore, the terms of the sum are analytic in some
open set containing K1 . Therefore, the infinite sum is analytic on this open set and
so for z ∈ K1 The function, f is the sum of a rational function, R1 , having poles at
P1 with the specified singular terms and an analytic function. Therefore, Q works
on K1 . Now consider Km for m > 1. Then
\[
Q(z) = R_1 (z) + \sum_{k=2}^{m+1} \left( R_k (z) - Q_k (z) \right) + \sum_{k=m+2}^{\infty} \left( R_k (z) - Q_k (z) \right) .
\]
As before, the infinite sum converges uniformly on Km+1 and hence on some open
set, O containing Km . Therefore, this infinite sum equals a function which is
analytic on O. Also,
\[
R_1 (z) + \sum_{k=2}^{m+1} \left( R_k (z) - Q_k (z) \right)
\]
Then there exists a meromorphic function, Q defined on $\mathbb{C}$ such that the poles of Q
are the points, $\{z_k\}_{k=1}^{\infty}$ and the singular part of the Laurent expansion of Q at $z_k$
equals $S_k (z)$. In other words, for z near $z_k$,
on this set. Therefore, by Corollary 28.7, $S_k (z)$, being a polynomial in $\frac{1}{z - z_k}$, has
a power series which converges uniformly to $S_k (z)$ on $K_k$. Therefore, there exists a
polynomial, $P_k (z)$ such that
\[
\|P_k - S_k\|_{B(0, |z_k|/2),\infty} < \frac{1}{2^k} .
\]
Let
\[
Q(z) \equiv \sum_{k=1}^{\infty} \left( S_k (z) - P_k (z) \right) . \tag{28.12}
\]
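As a sanity check on 28.12, here is a minimal sketch for one specific choice made purely for illustration: poles at $z_k = k$ with $S_k (z) = \frac{1}{z - k}$, and $P_k$ the Taylor polynomial of $S_k$ at 0 of degree k + 1. With this choice $|S_k (z) - P_k (z)| \le \frac{1}{k} 2^{-(k+1)} < 2^{-k}$ for $|z| \le k/2$. The names `S`, `P` and `Q_partial` are hypothetical.

```python
import cmath
import math


def S(k, z):
    """Singular part at the pole z_k = k (illustrative choice)."""
    return 1.0 / (z - k)


def P(k, z):
    """Taylor polynomial of S(k, .) at 0 of degree k + 1, using
    1/(z - k) = -(1/k) * sum_j (z/k)^j for |z| < k."""
    return -sum((z / k) ** j for j in range(k + 2)) / k


def Q_partial(z, N):
    """Partial sum of Q(z) = sum_k (S_k(z) - P_k(z))."""
    return sum(S(k, z) - P(k, z) for k in range(1, N + 1))


# Check the estimate |P_k - S_k| < 2^{-k} on the circle |z| = k/2.
estimate_holds = all(
    abs(S(k, z) - P(k, z)) < 2.0 ** (-k)
    for k in range(1, 11)
    for z in [(k / 2.0) * cmath.exp(2j * math.pi * m / 16) for m in range(16)]
)
print(estimate_holds)

# The partial sums settle down quickly at a point which is not a pole.
z = 0.3 + 0.2j
print(abs(Q_partial(z, 40) - Q_partial(z, 60)))
```

The second printed number is essentially zero, reflecting the uniform convergence of the tail of 28.12 on compact sets.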
Consider $z \in K_m$ and let N be large enough that if $k > N$, then $|z_k| > 2|z|$
\[
Q(z) = \sum_{k=1}^{N} \left( S_k (z) - P_k (z) \right) + \sum_{k=N+1}^{\infty} \left( S_k (z) - P_k (z) \right) .
\]
The series converges uniformly on every compact set because of the assumption
that limn→∞ |zn | = ∞ which implies that any compact set is contained in Kk for
k large enough. Choose N such that $z \in \operatorname{int}(K_N)$ and $z_n \notin K_N$ for all $n \ge N + 1$.
Then
\[
Q(z) = S_0 (z) + \sum_{k=1}^{N} \left( S_k (z) - P_k (z) \right) + \sum_{k=N+1}^{\infty} \left( S_k (z) - P_k (z) \right) .
\]
The last sum is analytic on $\operatorname{int}(K_N)$ because each function in the sum is analytic due
to the fact that none of its poles are in $K_N$. Also, $S_0 (z) + \sum_{k=1}^{N} \left( S_k (z) - P_k (z) \right)$ is
a finite sum of rational functions so it is a rational function and $P_k$ is a polynomial
so $z_m$ is a pole of this function with the correct singularity whenever $z_m \in \operatorname{int}(K_N)$.
28.2.3 Functions Meromorphic On $\widehat{\mathbb{C}}$
Sometimes it is useful to think of isolated singular points at ∞.
So what is f like for these cases? First suppose f has a removable singularity
at ∞. Then $z g(z)$ converges to 0 as $z \to 0$. It follows $g(z)$ must be analytic near
0 and so can be given as a power series. Thus $f(z)$ is of the form $f(z) = g\left( \frac{1}{z} \right) =
\sum_{n=0}^{\infty} a_n \left( \frac{1}{z} \right)^{n}$. Next suppose f has a pole at ∞. This means $g(z)$ has a pole at 0 so
$g(z)$ is of the form $g(z) = \sum_{k=1}^{m} \frac{b_k}{z^k} + h(z)$ where $h(z)$ is analytic near 0. Thus in the
case of a pole at ∞, $f(z)$ is of the form $f(z) = g\left( \frac{1}{z} \right) = \sum_{k=1}^{m} b_k z^k + \sum_{n=0}^{\infty} a_n \left( \frac{1}{z} \right)^{n}$.
It turns out that the functions which are meromorphic on $\widehat{\mathbb{C}}$ are all rational
functions. To see this suppose f is meromorphic on $\widehat{\mathbb{C}}$ and note that there exists
$r > 0$ such that $f(z)$ is analytic for $|z| > r$. This is required if ∞ is to be isolated.
Therefore, there are only finitely many poles of f for $|z| \le r$, $\{a_1, \cdots, a_m\}$, because
by assumption, these poles are isolated and this is a compact set. Let the singular
part of f at $a_k$ be denoted by $S_k (z)$. Then $f(z) - \sum_{k=1}^{m} S_k (z)$ is analytic on all of
$\mathbb{C}$. Therefore, it is bounded on $|z| \le r$. In one case, f has a removable singularity at
∞. In this case, f is bounded as $z \to \infty$ and $\sum_k S_k$ also converges to 0 as $z \to \infty$.
Therefore, by Liouville’s theorem, $f(z) - \sum_{k=1}^{m} S_k (z)$ equals a constant and so
$f - \sum_k S_k$ is a constant. Thus f is a rational function. In the other case that f has
a pole at ∞,
\[
f(z) - \sum_{k=1}^{m} S_k (z) - \sum_{k=1}^{m} b_k z^k = \sum_{n=0}^{\infty} a_n \left( \frac{1}{z} \right)^{n} - \sum_{k=1}^{m} S_k (z) .
\]
Now $f(z) - \sum_{k=1}^{m} S_k (z) - \sum_{k=1}^{m} b_k z^k$ is analytic on $\mathbb{C}$ and so is bounded on $|z| \le r$. But
now $\sum_{n=0}^{\infty} a_n \left( \frac{1}{z} \right)^{n} - \sum_{k=1}^{m} S_k (z)$ converges to 0 as $z \to \infty$ and so by Liouville’s
theorem, $f(z) - \sum_{k=1}^{m} S_k (z) - \sum_{k=1}^{m} b_k z^k$ must equal a constant and again, $f(z)$
equals a rational function.
Proof: Let H be the homotopy described above. The problem with this is
that it is not known that H (α, ·) is of bounded variation. There is no reason it
should be. Therefore, it might not make sense to take the integral which defines
the winding number. There are various ways around this. Extend H as follows.
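$H(\alpha, t) = H(\alpha, a)$ for $t < a$, $H(\alpha, t) = H(\alpha, b)$ for $t > b$. Let $\varepsilon > 0$.
\[
H_\varepsilon (\alpha, t) \equiv \frac{1}{2\varepsilon} \int_{-2\varepsilon + t + \frac{2\varepsilon}{(b-a)}(t-a)}^{t + \frac{2\varepsilon}{(b-a)}(t-a)} H(\alpha, s) \, ds, \quad H_\varepsilon (0, t) = p .
\]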
Thus Hε (α, ·) is a closed curve which has bounded variation and when α = 1, this
converges to γ uniformly on [a, b]. Therefore, for ε small enough, n (a, Hε (1, ·)) =
n (a, γ) because they are both integers and as ε → 0, n (a, Hε (1, ·)) → n (a, γ) . Also,
Hε (α, t) → H (α, t) uniformly on [0, 1] × [a, b] because of uniform continuity of H.
Therefore, for small enough ε, you can also assume Hε (α, t) ∈ Ω for all α, t. Now
$\alpha \to n(a, H_\varepsilon (\alpha, \cdot))$ is continuous. Hence it must be constant because the winding
number is integer valued. But
\[
\lim_{\alpha \to 0} \frac{1}{2\pi i} \int_{H_\varepsilon (\alpha, \cdot)} \frac{1}{z - a} \, dz = 0
\]
because the length of $H_\varepsilon (\alpha, \cdot)$ converges to 0 and the integrand is bounded because
$a \notin \Omega$. Therefore, the constant can only equal 0. This proves the lemma.
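The fact that the winding number is an integer which does not change under small perturbations can be seen numerically. The sketch below is not the argument of the lemma; it computes the winding number of a closed polygon about a point by adding up the changes in argument along the edges, which agrees with $\frac{1}{2\pi i} \int_\gamma \frac{dz}{z - a}$ provided a is not on the curve and no single edge turns through an angle of π or more as seen from a. The name `winding_number` is hypothetical.

```python
import cmath
import math


def winding_number(points, a):
    """Winding number about a of the closed polygon through `points`
    (the last vertex is joined back to the first)."""
    total = 0.0
    n = len(points)
    for j in range(n):
        z0 = points[j] - a
        z1 = points[(j + 1) % n] - a
        # Change in argument along one edge, taken in (-pi, pi].
        total += cmath.phase(z1 / z0)
    return round(total / (2.0 * math.pi))


circle = [cmath.exp(2j * math.pi * k / 100) for k in range(100)]
print(winding_number(circle, 0.2 + 0.1j))        # point inside the circle
print(winding_number(circle, 3.0 + 0.0j))        # point outside the circle
print(winding_number(circle[::-1], 0.2 + 0.1j))  # reversed orientation
```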
Now it is time for the great and glorious theorem on simply connected regions.
The following equivalence of properties is taken from Rudin [45]. There is a slightly
different list in Conway [13] and a shorter list in Ash [6].
3. If $z \notin \Omega$, and if γ is a closed bounded variation continuous curve in Ω, then
$n(\gamma, z) = 0$.
9. If f, 1/f are both analytic on Ω, then there exists φ analytic on Ω such that
$f = \phi^2$.
Proof: 1⇒2. Assume 1 and let γ be a closed curve in Ω. Let h be the
homeomorphism, $h : B(0, 1) \to \Omega$. Let $H(\alpha, t) = h\left( \alpha \left( h^{-1} \gamma (t) \right) \right)$. This works.
2⇒3 This is Lemma 28.18.
3⇒4. Suppose 3 but 4 fails to hold. Then if $\widehat{\mathbb{C}} \setminus \Omega$ is not connected, there exist
disjoint nonempty sets, A and B such that $\overline{A} \cap B = A \cap \overline{B} = \emptyset$. It follows each
of these sets must be closed because neither can have a limit point in Ω nor in
the other. Also, one and only one of them contains ∞. Let this set be B. Thus
A is a closed set which must also be bounded. Otherwise, there would exist a
sequence of points in A, $\{a_n\}$ such that $\lim_{n \to \infty} a_n = \infty$ which would contradict
the requirement that no limit points of A can be in B. Therefore, A is a compact
set contained in the open set, $B^C \equiv \{z \in \mathbb{C} : z \notin B\}$. Pick $p \in A$. By Lemma 28.16
there exist continuous bounded variation closed curves $\{\Gamma_k\}_{k=1}^{m}$ which are contained
in $B^C$, do not intersect A and such that
\[
1 = \sum_{k=1}^{m} n(p, \Gamma_k)
\]
However, if these curves do not intersect A and they also do not intersect B then
they must be all contained in Ω. Since $p \notin \Omega$, it follows by 3 that for each k,
$n(p, \Gamma_k) = 0$, a contradiction.
4⇒5 This is Corollary 28.12 on Page 776.
5⇒6 Every polynomial has a primitive and so the integral over any closed
bounded variation curve of a polynomial equals 0. Let f be analytic on Ω. Then let
$\{f_n\}$ be a sequence of polynomials converging uniformly to f on $\gamma^*$. Then
\[
0 = \lim_{n \to \infty} \int_\gamma f_n (z) \, dz = \int_\gamma f(z) \, dz .
\]
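This is well defined by 6 and is easily seen to be a primitive. You just write the
difference quotient and take a limit using 6.
\begin{align*}
\lim_{w \to 0} \frac{F(z + w) - F(z)}{w} &= \lim_{w \to 0} \frac{1}{w} \left( \int_{\gamma (z_0, z+w)} f(u) \, du - \int_{\gamma (z_0, z)} f(u) \, du \right) \\
&= \lim_{w \to 0} \frac{1}{w} \int_{\gamma (z, z+w)} f(u) \, du \\
&= \lim_{w \to 0} \frac{1}{w} \int_0^1 f(z + tw) w \, dt = f(z) .
\end{align*}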
7⇒8 Suppose then that f, 1/f are both analytic. Then $f'/f$ is analytic and so
it has a primitive by 7. Let this primitive be $g_1$. Then
\begin{align*}
\left( e^{-g_1} f \right)' &= e^{-g_1} (-g_1') f + e^{-g_1} f' \\
&= -e^{-g_1} \left( \frac{f'}{f} \right) f + e^{-g_1} f' = 0 .
\end{align*}
Therefore, since Ω is connected, it follows $e^{-g_1} f$ must equal a constant. (Why?)
Let the constant be $e^{a+ib}$. Then $f(z) = e^{g_1 (z)} e^{a+ib}$. Therefore, you let $g(z) =
g_1 (z) + a + ib$.
8⇒9 Suppose then that f, 1/f are both analytic on Ω. Then by 8 $f(z) = e^{g(z)}$.
Let $\phi (z) \equiv e^{g(z)/2}$.
9⇒1 There are two cases. First suppose $\Omega = \mathbb{C}$. This satisfies condition 9
because if f, 1/f are both analytic, then the same argument involved in 8⇒9 gives
the existence of a square root. A homeomorphism is $h(z) \equiv \frac{z}{\sqrt{1 + |z|^2}}$. It obviously
maps onto $B(0, 1)$ and is continuous. To see it is 1 - 1 consider the case of $z_1$
and $z_2$ having different arguments. Then $h(z_1) \neq h(z_2)$. If $z_2 = t z_1$ for a positive
$t \neq 1$, then it is also clear $h(z_1) \neq h(z_2)$. To show $h^{-1}$ is continuous, note that if
you have an open set in $\mathbb{C}$ and a point in this open set, you can get a small open
set containing this point by allowing the modulus and the argument to lie in some
open interval. Reasoning this way, you can verify h maps open sets to open sets. In
the case where $\Omega \neq \mathbb{C}$, there exists a one to one analytic map which maps Ω onto
$B(0, 1)$ by the Riemann mapping theorem. This proves the theorem.
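The map $h(z) = \frac{z}{\sqrt{1 + |z|^2}}$ can be inverted explicitly: if $w = h(z)$ then $1 - |w|^2 = \frac{1}{1 + |z|^2}$, so $z = \frac{w}{\sqrt{1 - |w|^2}}$. This formula for the inverse is a short computation added here and is not stated in the text; the following sketch checks it numerically, with `h_inv` a name chosen for illustration.

```python
import math


def h(z: complex) -> complex:
    """Map from the plane into the open unit ball B(0, 1)."""
    return z / math.sqrt(1.0 + abs(z) ** 2)


def h_inv(w: complex) -> complex:
    """Inverse map, defined for |w| < 1."""
    return w / math.sqrt(1.0 - abs(w) ** 2)


for z in (0j, 1 + 1j, -3 + 4j, 0.001j):
    w = h(z)
    print(z, abs(w), abs(h_inv(w) - z))
```

Each printed modulus is less than 1 and each round trip error is at the level of rounding.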
28.3 Exercises
1. Let $a \in \mathbb{C}$. Show there exists a sequence of polynomials, $\{p_n\}$ such that
$p_n (a) = 1$ but $p_n (z) \to 0$ for all $z \neq a$.
2. Let l be a line in C. Show there exists a sequence of polynomials {pn } such
that pn (z) → 1 on one side of this line and pn (z) → −1 on the other side of
the line. Hint: The complement of this line is simply connected.
Show this is an analytic function which maps the unit ball onto an annulus.
Is it possible to find a one to one analytic map which does this?
Infinite Products
Definition 29.1 $\prod_{n=1}^{\infty} (1 + u_n) \equiv \lim_{n \to \infty} \prod_{k=1}^{n} (1 + u_k)$ whenever this limit
exists. If $u_n = u_n (z)$ for $z \in H$, we say the infinite product converges uniformly on
H if the partial products, $\prod_{k=1}^{n} (1 + u_k (z))$ converge uniformly on H.
Theorem 29.2 Let $H \subseteq \mathbb{C}$ and suppose that $\sum_{n=1}^{\infty} |u_n (z)|$ converges uniformly on
H where $u_n (z)$ bounded on H. Then
\[
P(z) \equiv \prod_{n=1}^{\infty} (1 + u_n (z))
\]
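\begin{align*}
\prod_{k=m}^{n} (1 + |u_k (z)|) &= \exp \left( \ln \left( \prod_{k=m}^{n} (1 + |u_k (z)|) \right) \right) = \exp \left( \sum_{k=m}^{n} \ln (1 + |u_k (z)|) \right) \\
&\le \exp \left( \sum_{k=m}^{\infty} |u_k (z)| \right) < e
\end{align*}
for all $z \in H$ provided m is large enough. Since $\sum_{k=1}^{\infty} |u_k (z)|$ converges uniformly
on H, $|u_k (z)| < \frac{1}{2}$ for all $z \in H$ provided k is large enough. Thus you can take
$\log (1 + u_k (z))$. Pick $N_0$ such that for $n > m \ge N_0$,
\[
|u_m (z)| < \frac{1}{2}, \quad \prod_{k=m}^{n} (1 + |u_k (z)|) < e . \tag{29.1}
\]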
Now having picked $N_0$, the assumption the $u_n$ are bounded on H implies there
exists a constant, C, independent of $z \in H$ such that for all $z \in H$,
\[
\prod_{k=1}^{N_0} (1 + |u_k (z)|) < C . \tag{29.2}
\]
\begin{align*}
& \left| \prod_{k=1}^{N} (1 + u_k (z)) - \prod_{k=1}^{M} (1 + u_k (z)) \right| \\
&\le \prod_{k=1}^{N_0} (1 + |u_k (z)|) \left| \prod_{k=N_0+1}^{N} (1 + u_k (z)) - \prod_{k=N_0+1}^{M} (1 + u_k (z)) \right| \\
&\le C \left| \prod_{k=N_0+1}^{N} (1 + u_k (z)) - \prod_{k=N_0+1}^{M} (1 + u_k (z)) \right| \\
&\le C \left( \prod_{k=N_0+1}^{M} (1 + |u_k (z)|) \right) \left| \prod_{k=M+1}^{N} (1 + u_k (z)) - 1 \right| \\
&\le C e \left| \prod_{k=M+1}^{N} (1 + |u_k (z)|) - 1 \right| .
\end{align*}
Since $1 \le \prod_{k=M+1}^{N} (1 + |u_k (z)|) \le e$, it follows the term on the far right is
dominated by
\begin{align*}
C e^2 \left| \ln \left( \prod_{k=M+1}^{N} (1 + |u_k (z)|) \right) - \ln 1 \right| &\le C e^2 \sum_{k=M+1}^{N} \ln (1 + |u_k (z)|) \\
&\le C e^2 \sum_{k=M+1}^{N} |u_k (z)| < \varepsilon
\end{align*}
uniformly in $z \in H$ provided M is large enough. This follows from the simple
observation that if $1 < x < e$, then $x - 1 \le e (\ln x - \ln 1)$. Therefore, $\left\{ \prod_{k=1}^{m} (1 + u_k (z)) \right\}_{m=1}^{\infty}$
is uniformly Cauchy on H and therefore, converges uniformly on H. Let $P(z)$ denote
the function it converges to.
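A classical instance of Theorem 29.2, not taken from the text, is Euler's product $\frac{\sin (\pi z)}{\pi z} = \prod_{n=1}^{\infty} \left( 1 - \frac{z^2}{n^2} \right)$, where $u_n (z) = -z^2 / n^2$ and $\sum |u_n (z)|$ converges uniformly on bounded sets. The sketch below compares a partial product with the closed form and also checks the inequality $\prod (1 + |u_k|) \le \exp \left( \sum |u_k| \right)$ used in the proof. The function name is hypothetical.

```python
import cmath
import math


def sine_partial_product(z, N):
    """Partial product prod_{n=1}^{N} (1 - z^2 / n^2)."""
    p = 1.0 + 0.0j
    for n in range(1, N + 1):
        p *= 1.0 - (z * z) / (n * n)
    return p


z = 0.5
approx = sine_partial_product(z, 100000)
exact = cmath.sin(cmath.pi * z) / (cmath.pi * z)
print(abs(approx - exact))

# The estimate from the proof: prod (1 + |u_k|) <= exp(sum |u_k|).
abs_terms = [abs(z * z) / (n * n) for n in range(1, 1001)]
prod_abs = math.prod(1.0 + t for t in abs_terms)
print(prod_abs <= math.exp(sum(abs_terms)))
```

The convergence is slow, with an error of roughly $|z|^2 / N$ after N factors, but it is uniform on bounded sets as the theorem predicts.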
What about the permutations? Let $\{n_1, n_2, \cdots\}$ be a permutation of the indices.
Let $\varepsilon > 0$ be given and let $N_0$ be such that if $n > N_0$,
\[
\left| \prod_{k=1}^{n} (1 + u_k (z)) - P(z) \right| < \varepsilon
\]
for all $z \in H$. Let $\{1, 2, \cdots, n\} \subseteq \left\{ n_1, n_2, \cdots, n_{p(n)} \right\}$ where $p(n)$ is an increasing
sequence. Then from 29.1 and 29.2,
\begin{align*}
& \left| P(z) - \prod_{k=1}^{p(n)} (1 + u_{n_k}(z)) \right| \\
&\le \left| P(z) - \prod_{k=1}^{n} (1 + u_k (z)) \right| + \left| \prod_{k=1}^{n} (1 + u_k (z)) - \prod_{k=1}^{p(n)} (1 + u_{n_k}(z)) \right| \\
&\le \varepsilon + \left| \prod_{k=1}^{n} (1 + u_k (z)) - \prod_{k=1}^{p(n)} (1 + u_{n_k}(z)) \right| \\
&\le \varepsilon + \left| \prod_{k=1}^{n} (1 + |u_k (z)|) \right| \left| 1 - \prod_{n_k > n} (1 + u_{n_k}(z)) \right| \\
&\le \varepsilon + \left| \prod_{k=1}^{N_0} (1 + |u_k (z)|) \right| \left| \prod_{k=N_0+1}^{n} (1 + |u_k (z)|) \right| \left| 1 - \prod_{n_k > n} (1 + u_{n_k}(z)) \right| \\
&\le \varepsilon + C e \left| \prod_{n_k > n} (1 + |u_{n_k}(z)|) - 1 \right| \le \varepsilon + C e \left| \prod_{k=n+1}^{M(p(n))} (1 + |u_{n_k}(z)|) - 1 \right|
\end{align*}
where $M(p(n))$ is the largest index in the permuted list, $\left\{ n_1, n_2, \cdots, n_{p(n)} \right\}$. Then
from 29.1, this last term is dominated by
\begin{align*}
& \varepsilon + C e^2 \left| \ln \prod_{k=n+1}^{M(p(n))} (1 + |u_{n_k}(z)|) - \ln 1 \right| \\
&\le \varepsilon + C e^2 \sum_{k=n+1}^{\infty} \ln (1 + |u_{n_k}|) \le \varepsilon + C e^2 \sum_{k=n+1}^{\infty} |u_{n_k}| < 2\varepsilon
\end{align*}
for all n large enough uniformly in $z \in H$. Therefore, $\left| P(z) - \prod_{k=1}^{p(n)} (1 + u_{n_k}(z)) \right| <
2\varepsilon$ whenever n is large enough. This proves the part about the permutation.
It remains to verify the assertion about the points, z0 , where P (z0 ) = 0. Obvi-
ously, if un (z0 ) = −1, then P (z0 ) = 0. Suppose then that P (z0 ) = 0 and M > N0 .
Then
\begin{align*}
\left| \prod_{k=1}^{M} (1 + u_k (z_0)) \right| &= \left| \prod_{k=1}^{M} (1 + u_k (z_0)) - \prod_{k=1}^{\infty} (1 + u_k (z_0)) \right| \\
&\le \left| \prod_{k=1}^{M} (1 + u_k (z_0)) \right| \left| 1 - \prod_{k=M+1}^{\infty} (1 + u_k (z_0)) \right| \\
&\le \left| \prod_{k=1}^{M} (1 + u_k (z_0)) \right| \left| \prod_{k=M+1}^{\infty} (1 + |u_k (z_0)|) - 1 \right| \\
&\le e \left| \prod_{k=1}^{M} (1 + u_k (z_0)) \right| \left| \ln \prod_{k=M+1}^{\infty} (1 + |u_k (z_0)|) - \ln 1 \right| \\
&\le e \left( \sum_{k=M+1}^{\infty} \ln (1 + |u_k (z_0)|) \right) \left| \prod_{k=1}^{M} (1 + u_k (z_0)) \right| \\
&\le e \sum_{k=M+1}^{\infty} |u_k (z_0)| \left| \prod_{k=1}^{M} (1 + u_k (z_0)) \right| \\
&\le \frac{1}{2} \left| \prod_{k=1}^{M} (1 + u_k (z_0)) \right|
\end{align*}
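the last inequality holding by the mean value theorem. Now consider 29.4.
\begin{align*}
\left| \sum_{k=m}^{\infty} \frac{z^k}{k} \right| &\le \sum_{k=m}^{\infty} \frac{|z|^k}{k} \le \frac{1}{m} \sum_{k=m}^{\infty} |z|^k \\
&= \frac{1}{m} \frac{|z|^m}{1 - |z|} \le \frac{2}{m} |z|^m \le \frac{1}{m} \frac{1}{2^{m-1}} .
\end{align*}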
zn ³ ´ X ∞
zn
−1
log (1 − z) = − , log (1 − z) = ,
n=1
n n=1
n
P∞ n
because the function log (1 − z) and the analytic function, − n=1 zn both are
equal to ln (1 − x) on the real line segment (−1, 1) , a set which has a limit point.
Therefore, using Lemma 29.3,
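\begin{align*}
|E_p (z) - 1| &= \left| (1 - z) \exp \left( z + \frac{z^2}{2} + \cdots + \frac{z^p}{p} \right) - 1 \right| \\
&= \left| (1 - z) \exp \left( \log \left( (1 - z)^{-1} \right) - \sum_{n=p+1}^{\infty} \frac{z^n}{n} \right) - 1 \right| \\
&= \left| \exp \left( - \sum_{n=p+1}^{\infty} \frac{z^n}{n} \right) - 1 \right| \\
&\le \left| - \sum_{n=p+1}^{\infty} \frac{z^n}{n} \right| e^{\left| - \sum_{n=p+1}^{\infty} \frac{z^n}{n} \right|} \\
&\le \frac{1}{p+1} \cdot 2 \cdot e^{1/(p+1)} |z|^{p+1} \le 3 |z|^{p+1}
\end{align*}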
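The estimate $|E_p (z) - 1| \le 3 |z|^{p+1}$ can be spot checked numerically. The sketch below assumes $|z| \le 1/2$, which is the range in which the bound $\frac{2}{m} |z|^m$ above is valid, and uses the formula for $E_p$ that appears in the first line of the computation. The name `E` is chosen for illustration.

```python
import cmath


def E(p, z):
    """Elementary factor E_p(z) = (1 - z) exp(z + z^2/2 + ... + z^p/p),
    with E_0(z) = 1 - z."""
    return (1 - z) * cmath.exp(sum(z ** k / k for k in range(1, p + 1)))


worst_ratio = 0.0
for p in range(0, 7):
    for radius in (0.25, 0.5):
        for k in range(12):
            z = radius * cmath.exp(2j * cmath.pi * k / 12)
            ratio = abs(E(p, z) - 1) / abs(z) ** (p + 1)
            worst_ratio = max(worst_ratio, ratio)

# Largest observed value of |E_p(z) - 1| / |z|^(p+1) on the sample points.
print(worst_ratio)
```

The observed ratio stays well below 3 on these sample points, so the constant in the estimate is not sharp.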
Theorem 29.6 Let $\{z_n\}$ be a sequence of nonzero complex numbers which have no
limit point in $\mathbb{C}$ and let $\{p_n\}$ be a sequence of nonnegative integers such that
\[
\sum_{n=1}^{\infty} \left( \frac{R}{|z_n|} \right)^{p_n + 1} < \infty \tag{29.5}
\]
Proof: Since $\{z_n\}$ has no limit point, it follows $\lim_{n \to \infty} |z_n| = \infty$. Therefore,
if $p_n = n - 1$ the condition, 29.5 holds for this choice of $p_n$. Now by Theorem 29.2,
the infinite product in this theorem will converge uniformly on $|z| \le R$ if the same
is true of the sum,
\[
\sum_{n=1}^{\infty} \left| E_{p_n} \left( \frac{z}{z_n} \right) - 1 \right| . \tag{29.6}
\]
Since $|z_n| \to \infty$, there exists N such that for $n > N$, $|z_n| > 2R$. Therefore, for
$|z| < R$ and letting $0 < a = \min \{|z_n| : n \le N\}$,
\[
\sum_{n=1}^{\infty} \left| E_{p_n} \left( \frac{z}{z_n} \right) - 1 \right| \le 3 \sum_{n=1}^{N} \left| \frac{R}{a} \right|^{p_n + 1} + 3 \sum_{n=N}^{\infty} \left( \frac{R}{2R} \right)^{p_n + 1} < \infty .
\]
By the Weierstrass M test, the series in 29.6 converges uniformly for |z| < R and so
the same is true of the infinite product. It follows from Lemma 24.18 on Page 652
that P (z) is analytic on |z| < R because it is a uniform limit of analytic functions.
Also by Theorem 29.2 the zeros of the analytic P (z) are exactly the points,
{zn } , listed according to multiplicity. That is, if zn is a zero of order m, then if it
is listed m times in the formula for P (z) , then it is a zero of order m for P. This
proves the theorem.
The following corollary is an easy consequence and includes the case where there
is a zero at 0.
Corollary 29.7 Let {zn } be a sequence of nonzero complex numbers which have
no limit point and let {pn } be a sequence of nonnegative integers such that
$$\sum_{n=1}^{\infty}\left(\frac{r}{|z_n|}\right)^{1+p_n} < \infty \tag{29.7}$$
is analytic Ω and has a zero at each point, zn and at no others along with a zero of
order m at 0. If w occurs m times in {zn } , then P has a zero of order m at w.
The above theory can be generalized to include the case of an arbitrary open
set. First, here is a lemma.
Lemma 29.8 Let Ω be an open set. Also let {zn } be a sequence of points in Ω
which is bounded and which has no point repeated more than finitely many times
such that {zn } has no limit point in Ω. Then there exist {wn } ⊆ ∂Ω such that
limn→∞ |zn − wn | = 0.
Proof: Since ∂Ω is closed, there exists wn ∈ ∂Ω such that dist (zn , ∂Ω) =
|zn − wn | . Now if there is a subsequence, {znk } such that |znk − wnk | ≥ ε for all k,
then {znk } must possess a limit point because it is a bounded infinite set of points.
However, this limit point can only be in Ω because {znk } is bounded away from ∂Ω.
This is a contradiction. Therefore, limn→∞ |zn − wn | = 0. This proves the lemma.
Proof: There is nothing to prove if {zn } is finite. You just let $f(z) = \prod_{j=1}^{m}(z-z_j)$ where $\{z_n\} = \{z_1,\cdots,z_m\}$.
Pick $w \in \Omega\setminus\{z_n\}_{n=1}^{\infty}$ and let $h(z) \equiv \frac{1}{z-w}$. Since w is not a limit point of {zn } ,
there exists r > 0 such that B (w, r) contains no points of {zn } . Let Ω1 ≡ Ω \ {w}.
Now h is not constant and so h (Ω1 ) is an open set by the open mapping theorem.
In fact, h maps each component of Ω to a region. |zn − w| > r for all zn and
so $|h(z_n)| < r^{-1}$. Thus the sequence, {h (zn )} is a bounded sequence in the open
set h (Ω1 ) . It has no limit point in h (Ω1 ) because this is true of {zn } and Ω1 .
By Lemma 29.8 there exist wn ∈ ∂ (h (Ω1 )) such that limn→∞ |wn − h (zn )| = 0.
Consider for z ∈ Ω1
$$f(z) \equiv \prod_{n=1}^{\infty}E_n\left(\frac{h(z_n)-w_n}{h(z)-w_n}\right). \tag{29.8}$$
Therefore,
$$\sum_{n=1}^{\infty}\left|E_n\left(\frac{h(z_n)-w_n}{h(z)-w_n}\right)-1\right|$$
converges uniformly for z ∈ K. This implies $\prod_{n=1}^{\infty}E_n\left(\frac{h(z_n)-w_n}{h(z)-w_n}\right)$ also converges
uniformly for z ∈ K by Theorem 29.2. Since K is arbitrary, this shows f defined
in 29.8 is analytic on Ω1 .
Also if zn is listed m times so it is a zero of multiplicity m and wn is the point
from ∂ (h (Ω1 )) closest to h (zn ) , then there are m factors in 29.8 which are of the
form
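$$\begin{aligned}
E_n\left(\frac{h(z_n)-w_n}{h(z)-w_n}\right) &= \left(1-\frac{h(z_n)-w_n}{h(z)-w_n}\right)e^{g_n(z)} \\
&= \left(\frac{h(z)-h(z_n)}{h(z)-w_n}\right)e^{g_n(z)} \\
&= \frac{z_n-z}{(z-w)(z_n-w)}\left(\frac{1}{h(z)-w_n}\right)e^{g_n(z)} \\
&= (z-z_n)\,G_n(z)
\end{aligned} \tag{29.10}$$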
where Gn is an analytic function which is not zero at and near zn . Therefore, f has
a zero of order m at zn . This proves the theorem except for the point, w which has
been left out of Ω1 . It is necessary to show f is analytic at this point also and right
now, f is not even defined at w.
The {wn } are bounded because {h (zn )} is bounded and limn→∞ |wn − h (zn )| =
0 which implies |wn − h (zn )| ≤ C for some constant, C. Therefore, there exists
δ > 0 such that if $z \in B'(w,\delta)$, then for all n,
$$\left|\frac{h(z_n)-w_n}{\left(\frac{1}{z-w}\right)-w_n}\right| = \left|\frac{h(z_n)-w_n}{h(z)-w_n}\right| < \frac{1}{2}.$$
Thus 29.9 holds for all $z \in B'(w,\delta)$ and n so by Theorem 29.2, the infinite product
in 29.8 converges uniformly on $B'(w,\delta)$. This implies f is bounded in $B'(w,\delta)$ and
so w is a removable singularity and f can be extended to w such that the result is
analytic. It only remains to verify $f(w) \neq 0$. After all, this would not do because
it would be another zero other than those in the given list. By 29.10, a partial
product is of the form
$$\prod_{n=1}^{N}\left(\frac{h(z)-h(z_n)}{h(z)-w_n}\right)e^{g_n(z)} \tag{29.11}$$
where
$$g_n(z) \equiv \frac{h(z_n)-w_n}{h(z)-w_n} + \frac{1}{2}\left(\frac{h(z_n)-w_n}{h(z)-w_n}\right)^2 + \cdots + \frac{1}{n}\left(\frac{h(z_n)-w_n}{h(z)-w_n}\right)^n$$
794 INFINITE PRODUCTS
Proof: Let Q have a pole of order m (z) at z. Then by Corollary 29.9 there
exists an analytic function, g which has a zero of order m (z) at every z ∈ Ω. It
follows gQ has a removable singularity at the poles of Q. Therefore, there is an
analytic function, f such that f (z) = g (z) Q (z) . This proves the theorem.
Proof: From Theorem 29.10 there are analytic functions, f, g such that $Q = \frac{f}{g}$.
Therefore, the zero set of the function, f (z) − cg (z) has a limit point in Ω and so
f (z) − cg (z) = 0 for all z ∈ Ω. This proves the corollary.
$$\sum_{n=1}^{\infty}\left(\frac{r}{|z_n|}\right)^{p_n+1} < \infty,$$
29.2. FACTORING A GIVEN ANALYTIC FUNCTION 795
Note that $e^{g(z)} \neq 0$ for any z and this is the interesting thing about this function.
Proof: {zn } cannot have a limit point because if there were a limit point of this
sequence, it would follow from Theorem 24.23 that f (z) = 0 for all z, contradicting
the hypothesis that $f(0) \neq 0$. Hence $\lim_{n\to\infty}|z_n| = \infty$ and so
$$\sum_{n=1}^{\infty}\left(\frac{r}{|z_n|}\right)^{1+n-1} = \sum_{n=1}^{\infty}\left(\frac{r}{|z_n|}\right)^{n} < \infty$$
and so
$$h(z) = e^{a+ib}e^{\tilde{g}(z)}$$
for some constants, a, b. Therefore, letting $g(z) = \tilde{g}(z) + a + ib$, $h(z) = e^{g(z)}$ and
thus 29.12 holds. This proves the theorem.
Corollary 29.13 Let f be analytic on C, f has a zero of order m at 0, and let the
other zeros of f be {zk } , listed according to order. (Thus if z is a zero of order l,
it will be listed l times in the list, {zk } .) Also let
$$\sum_{n=1}^{\infty}\left(\frac{r}{|z_n|}\right)^{1+p_n} < \infty \tag{29.13}$$
for any choice of r > 0. Then there exists an entire function, g such that
$$f(z) = z^m e^{g(z)}\prod_{n=1}^{\infty}E_{p_n}\left(\frac{z}{z_n}\right). \tag{29.14}$$
Proof: Since f has a zero of order m at 0, it follows from Theorem 24.23 that
{zk } cannot have a limit point in C and so you can apply Theorem 29.12 to the
function, $f(z)/z^m$ which has a removable singularity at 0. This proves the corollary.
where α ∈ R is not an integer. This will be used to verify the formula of Mittag-
Leffler,
$$\frac{1}{\alpha} + \sum_{n=1}^{\infty}\frac{2\alpha}{\alpha^2-n^2} = \pi\cot\pi\alpha. \tag{29.15}$$
First you show that cot πz is bounded on this contour. This is easy using the
formula $\cot(z) = i\,\frac{e^{iz}+e^{-iz}}{e^{iz}-e^{-iz}}$. Therefore, $I_N \to 0$ as N → ∞ because the integrand
is of order $1/N^2$ while the diameter of $\gamma_N$ is of order N. Next you compute the
residues of the integrand at ±α and at n where $|n| < N + \frac{1}{2}$ for n an integer. These
are the only singularities of the integrand in this contour and therefore, using the
residue theorem, you can evaluate $I_N$ by using these. You can calculate these
residues and find that the residue at ±α is
$$\frac{-\pi\cos\pi\alpha}{2\alpha\sin\pi\alpha}$$
while the residue at n is
$$\frac{1}{\alpha^2-n^2}.$$
Therefore
$$0 = \lim_{N\to\infty}I_N = \lim_{N\to\infty}2\pi i\left[\sum_{n=-N}^{N}\frac{1}{\alpha^2-n^2} - \frac{\pi\cot\pi\alpha}{\alpha}\right]$$
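Formula 29.15 can be checked numerically by truncating the series. This is a minimal sketch; the names `cot_partial_fractions` and `pi_cot` and the truncation length are choices made for this illustration. The tail of the series behaves like −2α/N, so the agreement is only as good as the number of terms allows.

```python
import math


def cot_partial_fractions(alpha, terms=200000):
    """Partial sum 1/alpha + sum_{n=1}^{terms} 2*alpha / (alpha^2 - n^2)."""
    total = 1.0 / alpha
    for n in range(1, terms + 1):
        total += 2.0 * alpha / (alpha * alpha - n * n)
    return total


def pi_cot(alpha):
    """Closed form pi * cot(pi * alpha); alpha must not be an integer."""
    return math.pi / math.tan(math.pi * alpha)


if __name__ == "__main__":
    for alpha in (0.3, 1.7, -2.4):
        print(alpha, cot_partial_fractions(alpha), pi_cot(alpha))
```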
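$$\sin(\pi z) = z e^{g(z)}\prod_{n=1}^{\infty}\left(1-\frac{z}{z_n}\right)e^{z/z_n} \tag{29.16}$$
where the zn are the nonzero integers. Remember you can permute the factors in
these products. Therefore, this can be written more conveniently as
$$\sin(\pi z) = z e^{g(z)}\prod_{n=1}^{\infty}\left(1-\left(\frac{z}{n}\right)^2\right)$$
Differentiating both sides,
$$\begin{aligned}
\pi\cos(\pi z) &= e^{g(z)}\prod_{n=1}^{\infty}\left(1-\left(\frac{z}{n}\right)^2\right) + z g'(z)e^{g(z)}\prod_{n=1}^{\infty}\left(1-\left(\frac{z}{n}\right)^2\right) \\
&\quad + z e^{g(z)}\sum_{n=1}^{\infty}\left(-\frac{2z}{n^2}\right)\prod_{k\neq n}\left(1-\left(\frac{z}{k}\right)^2\right)
\end{aligned}$$
Dividing by sin (πz),
$$\begin{aligned}
\pi\cot(\pi z) &= \frac{1}{z} + g'(z) - \sum_{n=1}^{\infty}\frac{2z/n^2}{1-z^2/n^2} \\
&= \frac{1}{z} + g'(z) + \sum_{n=1}^{\infty}\frac{2z}{z^2-n^2}.
\end{aligned}$$
By 29.15, this yields $g'(z) = 0$ for z not an integer and so g (z) = c, a constant. So
far this yields
$$\sin(\pi z) = z e^{c}\prod_{n=1}^{\infty}\left(1-\left(\frac{z}{n}\right)^2\right)$$
and it only remains to find c. Divide both sides by πz and take a limit as z → 0.
Using the power series of sin (πz) , this yields
$$1 = \frac{e^c}{\pi}$$
and so c = ln π. Therefore,
$$\sin(\pi z) = z\pi\prod_{n=1}^{\infty}\left(1-\left(\frac{z}{n}\right)^2\right). \tag{29.17}$$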
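As a sanity check on 29.17, a truncated product can be compared with the sine at a few complex points. This is a sketch under the assumption that a finite number of factors is good enough; the name `sine_product` and the number of factors are illustrative. Truncating after N factors leaves a relative error of roughly |z|²/N.

```python
import cmath
import math


def sine_product(z, factors=100000):
    """Truncated product pi * z * prod_{n=1}^{factors} (1 - (z/n)^2)."""
    value = math.pi * z
    for n in range(1, factors + 1):
        value *= 1 - (z / n) ** 2
    return value


if __name__ == "__main__":
    for z in (0.25, 0.7 + 0.4j, -1.5 + 0.2j):
        print(z, sine_product(z), cmath.sin(math.pi * z))
```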
Then there exists an analytic function defined on C such that the Taylor series of
f at zk has the first mk terms given by 29.18.¹
¹ This says you can specify the first mk derivatives of the function at the point zk .
29.3. THE EXISTENCE OF AN ANALYTIC FUNCTION WITH GIVEN VALUES799
It follows you need to solve the following system of equations for $b_1,\cdots,b_{m_k+1}$.
$$\begin{aligned}
c_{m_k+1}b_{m_k+1} &= a_{k0} \\
c_{m_k+2}b_{m_k+1} + c_{m_k+1}b_{m_k} &= a_{k1} \\
c_{m_k+3}b_{m_k+1} + c_{m_k+2}b_{m_k} + c_{m_k+1}b_{m_k-1} &= a_{k2} \\
&\;\;\vdots \\
c_{m_k+m_k+1}b_{m_k+1} + c_{m_k+m_k}b_{m_k} + \cdots + c_{m_k+1}b_1 &= a_{km_k}
\end{aligned}$$
Since $c_{m_k+1} \neq 0$, it follows there exists a unique solution to the above system.
You first solve for $b_{m_k+1}$ in the top. Then, having found it, you go to the next
and use $c_{m_k+1} \neq 0$ again to find $b_{m_k}$ and continue in this manner. Let Sk (z)
be determined in this manner for each zk . By the Mittag-Leffler theorem, there
exists a Meromorphic function, g such that g has exactly the singularities, Sk (z) .
Therefore, f (z) g (z) has removable singularities at each zk and for z near zk , the
first mk terms of f g are as prescribed. This proves the theorem.
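The triangular system in the proof can be solved mechanically by substitution from the top equation down. Here is a small sketch; the function name `solve_principal_part` and the list layout are assumptions made for this illustration, with `c_tail[i]` standing for c_{m+1+i} and `a[i]` for the prescribed coefficient a_{ki}.

```python
from fractions import Fraction


def solve_principal_part(c_tail, a):
    """Solve sum_{l=0}^{i} c_tail[i-l] * B[l] = a[i] for i = 0..m, where B[l] = b_{m+1-l}.

    Returns [b_1, ..., b_{m+1}]. Requires c_tail[0] != 0, as in the text.
    """
    if c_tail[0] == 0:
        raise ValueError("leading coefficient c_{m+1} must be nonzero")
    solved = []  # solved[l] holds b_{m+1-l}
    for i in range(len(a)):
        known = sum(Fraction(c_tail[i - l]) * solved[l] for l in range(i))
        solved.append((Fraction(a[i]) - known) / Fraction(c_tail[0]))
    return solved[::-1]


def check_solution(c_tail, a, b):
    """Verify every equation of the triangular system for the list b = [b_1, ...]."""
    top_down = b[::-1]
    return all(
        sum(Fraction(c_tail[i - l]) * top_down[l] for l in range(i + 1)) == a[i]
        for i in range(len(a))
    )


if __name__ == "__main__":
    b = solve_principal_part([2, 1, 3], [4, 5, 6])
    print(b, check_solution([2, 1, 3], [4, 5, 6], b))
```

Exact rational arithmetic is used so that the check is an equality rather than a tolerance test.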
Corollary 29.17 Let $P \equiv \{z_k\}_{k=1}^{\infty}$ be a set of points in Ω, an open set such that
P has no limit points in Ω. For each zk , consider
$$\sum_{j=0}^{m_k}a_{kj}(z-z_k)^j. \tag{29.19}$$
Then there exists an analytic function defined on Ω such that the Taylor series of
f at zk has the first mk terms given by 29.19.
Proof: The proof is identical to the above except you use the versions of the
Mittag-Leffler theorem and Weierstrass product which pertain to open sets.
Definition 29.18 Denote by H (Ω) the analytic functions defined on Ω, an open
subset of C. Then H (Ω) is a commutative ring² with the usual operations of addition
and multiplication. A set, I ⊆ H (Ω) is called a finitely generated ideal of the ring
if I is of the form
$$\left\{\sum_{k=1}^{n}g_kf_k : f_k \in H(\Omega)\text{ for }k = 1,2,\cdots,n\right\}$$
where g1 , ···, gn are given functions in H (Ω). This ideal is also denoted as [g1 , · · ·, gn ]
and is called the ideal generated by the functions, {g1 , · · ·, gn }. Since there are
finitely many of these functions it is called a finitely generated ideal. A principal
ideal is one which is generated by a single function. An example of such a thing is
[1] = H (Ω) .
² It is not a field because you can’t divide two analytic functions and get another one.
Theorem 29.19 Every finitely generated ideal in H (Ω) for Ω a connected open set
(region) is a principal ideal.
Let
$$h(z) = \sum_{k=0}^{\infty}c_k(z-\alpha)^k$$
and the $c_k$ must be determined. Using Mertens' theorem, the power series for $1 - hg_n$
is of the form
$$1 - b_0c_0 - \sum_{j=1}^{\infty}\left(\sum_{r=0}^{j}b_{j-r}c_r\right)(z-\alpha)^j.$$
$$b_1c_0 + b_0c_1 = 0.$$
$$b_2c_0 + b_1c_1 + b_0c_2 = 0$$
Again there is no problem in solving, this time for $c_2$ because $b_0 \neq 0$. Continuing this
way, you see that in every step, the $c_k$ which needs to be solved for is multiplied by
$b_0 \neq 0$. Therefore, by Corollary 29.9 there exists an analytic function, h satisfying
29.20. Therefore, (1 − hgn ) /φ has a removable singularity at every zero of φ and
so may be considered an analytic function. Therefore,
$$1 = \frac{1-hg_n}{\phi}\,\phi + hg_n \in [\phi, g_n] = [g_1\cdots g_n]$$
which shows [g1 · · · gn ] = H (Ω) = [1] . It follows the claim is established.
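The step-by-step determination of the coefficients $c_k$ described above is a simple recursion: $c_0 = 1/b_0$, and each later $c_j$ is obtained from the condition that the coefficient of $(z-\alpha)^j$ in $1 - hg_n$ vanishes. The sketch below assumes the $b_j$ are given as a finite list (missing entries are treated as 0); the name `cancel_to_order` is hypothetical.

```python
from fractions import Fraction


def cancel_to_order(b, order):
    """Return c_0..c_{order-1} with b_0*c_0 = 1 and sum_{r=0}^{j} b_{j-r}*c_r = 0 for j >= 1.

    Requires b[0] != 0, which is the condition b_0 != 0 used in the text.
    """
    if b[0] == 0:
        raise ValueError("b_0 must be nonzero")

    def coeff(j):
        return Fraction(b[j]) if j < len(b) else Fraction(0)

    c = [1 / coeff(0)]
    for j in range(1, order):
        known = sum(coeff(j - r) * c[r] for r in range(j))
        c.append(-known / coeff(0))
    return c


if __name__ == "__main__":
    # g(w) = 1 - w gives h(w) = 1 + w + w^2 + ... (the geometric series).
    print(cancel_to_order([1, -1], 4))
    print(cancel_to_order([2, 1], 3))
```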
Now suppose {g1 · · · gn } are just elements of H (Ω) . As explained above, it can
be assumed they all have zeros of finite order and the zeros have no limit point
in Ω since if these occur, you can delete the function from the list. By Corollary
29.9 there exists φ ∈ H (Ω) such that m (φ, z) ≤ min {m (gi , z) : i = 1, · · ·, n} . Then
gk /φ has a removable singularity at each zero of gk and so can be regarded as an
analytic function. Also, as before, there is no point which is a zero of each gk /φ and
so by the first part of this argument, [g1 /φ · · · gn /φ] = H (Ω) . As in the first part
of the argument, this implies [g1 · · · gn ] = [φ] which proves the theorem. [g1 · · · gn ]
is a principal ideal as claimed.
The following corollary follows from the above theorem. You don’t need to
assume Ω is connected.
Corollary 29.20 Every finitely generated ideal in H (Ω) for Ω an open set is a
principal ideal.
Proof: Let [g1 , · · ·, gn ] be a finitely generated ideal in H (Ω) . Let {Uk } be the
components of Ω. Then applying the above to each component, there exists hk ∈
H (Uk ) such that restricting each gi to Uk , [g1 , · · ·, gn ] = [hk ] . Then let h (z) = hk (z)
for z ∈ Uk . This is an analytic function which works.
Lemma 29.21
$$\int_{-\pi}^{\pi}\ln\left|1-e^{i\theta}\right|d\theta = 0.$$
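Lemma 29.21 can be made plausible numerically with a midpoint rule, which never evaluates the integrand at the singular point θ = 0 when the number of cells is even. This is only a sketch (the name `log_modulus_integral` is illustrative), and the approximation approaches 0 slowly because of the logarithmic singularity.

```python
import cmath
import math


def log_modulus_integral(cells=10000):
    """Midpoint-rule approximation of the integral of ln|1 - e^{i theta}| over [-pi, pi].

    With an even number of cells, theta = 0 is a cell boundary, so no midpoint
    hits the singularity of the integrand.
    """
    width = 2.0 * math.pi / cells
    total = 0.0
    for j in range(cells):
        theta = -math.pi + (j + 0.5) * width
        total += math.log(abs(1 - cmath.exp(1j * theta)))
    return total * width


if __name__ == "__main__":
    for cells in (100, 1000, 10000):
        print(cells, log_modulus_integral(cells))
```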
Proof: First note that the only problem with the integrand occurs when θ = 0.
However, this is an integrable singularity so the integral will end up making sense.
Letting z = eiθ , you could get the above integral as a limit as ε → 0 of the following
contour integral where γ ε is the contour shown in the following picture with the
radius of the big circle equal to 1 and the radius of the little circle equal to ε.
$$\int_{\gamma_\varepsilon}\frac{\ln|1-z|}{iz}\,dz.$$
[Figure: the contour $\gamma_\varepsilon$.]
On the indicated contour, 1−z lies in the half plane Re z > 0 and so $\log(1-z) = \ln|1-z| + i\arg(1-z)$. The above integral equals
$$\int_{\gamma_\varepsilon}\frac{\log(1-z)}{iz}\,dz - \int_{\gamma_\varepsilon}\frac{\arg(1-z)}{z}\,dz$$
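The first of these integrals equals zero because the integrand has a removable singularity at 0. The second equals
$$\int_{-\pi}^{-\eta_\varepsilon}i\arg\left(1-e^{i\theta}\right)d\theta + \int_{\eta_\varepsilon}^{\pi}i\arg\left(1-e^{i\theta}\right)d\theta + \varepsilon i\int_{-\frac{\pi}{2}-\lambda_\varepsilon}^{-\pi}\theta\,d\theta + \varepsilon i\int_{\pi}^{\frac{\pi}{2}-\lambda_\varepsilon}\theta\,d\theta$$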
to place. Now suppose f is analytic on B (0, r + ε) , and f has no zeros on B (0, r).
Then you can define a branch of the logarithm which makes sense for complex
numbers near f (z) . Thus z → log (f (z)) is analytic on B (0, r + ε). Therefore, its
real part, u (x, y) ≡ ln |f (x + iy)| must be harmonic. Consider the following lemma.
Corollary 29.23 Suppose f is analytic on B (0, r + ε) and has no zeros on B (0, r).
Then
$$\ln|f(0)| = \frac{1}{2\pi}\int_{-\pi}^{\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta \tag{29.21}$$
What if f has some zeros on |z| = r but none on B (0, r)? It turns out 29.21
is still valid. Suppose the zeros are at $\{re^{i\theta_k}\}_{k=1}^{m}$, listed according to multiplicity.
Then let
$$g(z) = \frac{f(z)}{\prod_{k=1}^{m}\left(z-re^{i\theta_k}\right)}.$$
It follows g is analytic on B (0, r + ε) but has no zeros in B (0, r). Then 29.21 holds
for g in place of f. Thus
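$$\begin{aligned}
\ln|f(0)| - \sum_{k=1}^{m}\ln|r|
&= \frac{1}{2\pi}\int_{-\pi}^{\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta - \frac{1}{2\pi}\int_{-\pi}^{\pi}\sum_{k=1}^{m}\ln\left|re^{i\theta}-re^{i\theta_k}\right|d\theta \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta - \frac{1}{2\pi}\int_{-\pi}^{\pi}\sum_{k=1}^{m}\ln\left|e^{i\theta}-e^{i\theta_k}\right|d\theta - \sum_{k=1}^{m}\ln|r| \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta - \frac{1}{2\pi}\int_{-\pi}^{\pi}\sum_{k=1}^{m}\ln\left|e^{i\theta}-1\right|d\theta - \sum_{k=1}^{m}\ln|r|
\end{aligned}$$
Therefore, 29.21 will continue to hold exactly when $\frac{1}{2\pi}\int_{-\pi}^{\pi}\sum_{k=1}^{m}\ln\left|e^{i\theta}-1\right|d\theta = 0$.
But this is the content of Lemma 29.21. This proves the following lemma.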
With this preparation, it is now not too hard to prove Jensen’s formula. Suppose
there are n zeros of f in B (0, r) , $\{a_k\}_{k=1}^{n}$, listed according to multiplicity, none equal
to zero. Let
$$F(z) \equiv f(z)\prod_{i=1}^{n}\frac{r^2-\overline{a_i}z}{r(z-a_i)}.$$
Then F is analytic on B (0, r + ε) and has no zeros in B (0, r) . The reason for this
is that $f(z)/\prod_{i=1}^{n}r(z-a_i)$ has no zeros there and $r^2-\overline{a_i}z$ cannot equal zero if
|z| < r because if this expression equals zero, then
$$|z| = \frac{r^2}{|a_i|} > r.$$
$$\ln|f(0)| = -\sum_{i=1}^{n}\ln\left|\frac{r}{a_i}\right| + \frac{1}{2\pi}\int_{0}^{2\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta$$
as claimed.
Written in terms of exponentials this is
$$|f(0)|\prod_{k=1}^{n}\left|\frac{r}{a_k}\right| = \exp\left(\frac{1}{2\pi}\int_{0}^{2\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta\right).$$
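Jensen's formula lends itself to a quick numerical check. The sketch below uses a hypothetical test polynomial with two zeros inside the unit circle and one outside; the names `circle_mean_log_modulus`, `jensen_sides` and `sample_f` are made up for this illustration. The circle average is computed with an equally spaced sum, which is very accurate for smooth periodic integrands.

```python
import cmath
import math


def circle_mean_log_modulus(f, r, points=4096):
    """Approximate (1/2pi) * integral_0^{2pi} ln|f(r e^{i theta})| d theta."""
    total = 0.0
    for k in range(points):
        z = r * cmath.exp(2j * math.pi * k / points)
        total += math.log(abs(f(z)))
    return total / points


def jensen_sides(f, zeros_inside, r):
    """Return (ln|f(0)|, -sum ln|r/a_i| + circle mean), the two sides of Jensen's formula."""
    left = math.log(abs(f(0)))
    right = -sum(math.log(r / abs(a)) for a in zeros_inside)
    right += circle_mean_log_modulus(f, r)
    return left, right


def sample_f(z):
    """Test function with zeros 0.5 and -0.3i inside |z| < 1 and the zero 2 outside."""
    return (z - 0.5) * (z + 0.3j) * (z - 2)


if __name__ == "__main__":
    print(jensen_sides(sample_f, [0.5, -0.3j], 1.0))
```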
Theorem 29.26 Let {αn } be a sequence of nonzero points in B (0, 1) with the
property that
∞
X
(1 − |αn |) < ∞.
n=1
is a bounded function which is analytic on B (0, 1) which has zeros only at 0 if k > 0
and at the αn .
3 Wilhelm Blaschke, 1915
Proof: From Theorem 29.2 the above product will converge uniformly on B (0, r)
for r < 1 to an analytic function if
$$\sum_{n=1}^{\infty}\left|\frac{\alpha_n-z}{1-\overline{\alpha_n}z}\,\frac{|\alpha_n|}{\alpha_n}-1\right|$$
and so the assumption on the sum gives uniform convergence of the product on
B (0, r) to an analytic function. Since r < 1 is arbitrary, this shows B (z) is analytic
on B (0, 1) and has the specified zeros because the only place the factors equal zero
are at the αn or 0.
Now consider the factors in the product. The claim is that they are all no larger
in absolute value than 1. This is very easy to see from the maximum modulus
theorem. Let |α| < 1 and $\phi(z) = \frac{\alpha-z}{1-\overline{\alpha}z}$. Then φ is analytic near $\overline{B(0,1)}$ because its
only pole is $1/\overline{\alpha}$. Consider $z = e^{i\theta}$. Then
$$\left|\phi\left(e^{i\theta}\right)\right| = \left|\frac{\alpha-e^{i\theta}}{1-\overline{\alpha}e^{i\theta}}\right| = \left|\frac{1-\alpha e^{-i\theta}}{1-\overline{\alpha}e^{i\theta}}\right| = 1.$$
Thus the modulus of φ (z) equals 1 on ∂B (0, 1) . Therefore, by the maximum modulus
theorem, |φ (z)| < 1 if |z| < 1. This proves the claim that the terms in the
product are no larger than 1 and shows the function determined by the Blaschke
product is bounded. This proves the theorem.
Note in the conditions for this theorem the one for the sum, $\sum_{n=1}^{\infty}(1-|\alpha_n|) < \infty$. The Blaschke product gives an analytic function, whose absolute value is bounded
by 1 and which has the αn as zeros. What if you had a bounded function, analytic
on B (0, 1) which had zeros at {αk }? Could you conclude the condition on the sum?
29.5. BLASCHKE PRODUCTS 807
The answer is yes. In fact, you can get by with less than the assumption that f is
bounded but this will not be presented here. See Rudin [45]. This theorem is an
exciting use of Jensen’s equation.
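The claim in the proof of Theorem 29.26 that each Blaschke factor has modulus 1 on the unit circle and modulus less than 1 inside can be checked at sample points. This is a sketch only; `blaschke_factor` is a name chosen here, and it includes the normalizing factor |α|/α that appears in the product.

```python
import cmath
import math


def blaschke_factor(alpha, z):
    """One factor (alpha - z)/(1 - conj(alpha) z) * |alpha|/alpha, for 0 < |alpha| < 1."""
    alpha = complex(alpha)
    return (alpha - z) / (1 - alpha.conjugate() * z) * abs(alpha) / alpha


def max_boundary_deviation(alpha, points=360):
    """Largest sampled value of | |factor| - 1 | on the unit circle."""
    return max(
        abs(abs(blaschke_factor(alpha, cmath.exp(2j * math.pi * k / points))) - 1)
        for k in range(points)
    )


if __name__ == "__main__":
    alpha = 0.6 + 0.3j
    print(max_boundary_deviation(alpha))
    print(abs(blaschke_factor(alpha, 0.2 - 0.5j)))
```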
Proof: If there are only finitely many zeros, there is nothing to prove so assume
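there are infinitely many. Also let the zeros be listed such that $|\alpha_n| \le |\alpha_{n+1}| \le \cdots$. Let
n (r) denote the number of zeros in B (0, r) . By Jensen’s formula,
$$\ln|f(0)| + \sum_{i=1}^{n(r)}\left(\ln r - \ln|\alpha_i|\right) = \frac{1}{2\pi}\int_{0}^{2\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta \le \ln(M).$$
$$\sum_{i=1}^{n(r)}\frac{1}{r}\left(r-|\alpha_i|\right) \le \sum_{i=1}^{n(r)}\left(\ln r - \ln|\alpha_i|\right) \le \ln(M) - \ln|f(0)|$$
$$\sum_{i=1}^{\infty}\left(1-|\alpha_i|\right) \le \liminf_{r\to 1-}\sum_{i=1}^{n(r)}\frac{1}{r}\left(r-|\alpha_i|\right) \le \ln(M) - \ln|f(0)|.$$
$$\begin{aligned}
\ln|g(0)| + \sum_{i=1}^{n(r)}\left(\ln r - \ln|\alpha_i|\right)
&= \frac{1}{2\pi}\int_{0}^{2\pi}\ln\left|g\left(re^{i\theta}\right)\right|d\theta \\
&= \frac{1}{2\pi}\int_{0}^{2\pi}\ln\left|f\left(re^{i\theta}\right)\right|d\theta - \frac{1}{2\pi}\int_{0}^{2\pi}m\ln(r)\,d\theta \\
&\le M + \frac{1}{2\pi}\int_{0}^{2\pi}m\ln\left(r^{-1}\right)d\theta \\
&\le M + m\ln\left(\frac{1}{r_0}\right).
\end{aligned}$$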
Corollary 29.29 Suppose f is analytic and bounded on B (0, 1) having zeros {αn } .
Then if $\sum_{n=1}^{\infty}(1-|\alpha_n|) = \infty$, it follows f is identically equal to zero.
Theorem 29.30 Let λ1 < λ2 < λ3 < · · · be an increasing list of positive real
numbers and let a > 0. If
$$\sum_{n=1}^{\infty}\frac{1}{\lambda_n} = \infty, \tag{29.23}$$
Now dist (f, X) > 0 because X is closed. Therefore, there exists a lower bound,
η > 0 to ||g + f || for g ∈ X. Therefore, the above is no larger than
$$\sup_{|\alpha|\le\frac{1}{\eta}}|\alpha|\,\|f\|_\infty = \left(\frac{1}{\eta}\right)\|f\|_\infty$$
which shows that $\|\Lambda_0\| \le \left(\frac{1}{\eta}\right)\|f\|_\infty$. By the Hahn Banach theorem $\Lambda_0$ can be
extended to $\Lambda \in C([0,b])'$ which has the property that Λ (X) = 0 but Λ (f ) =
||f || ≠ 0. By the Weierstrass approximation theorem, Theorem 7.6 or one of its
cases, there exists a polynomial, p such that Λ (p) ≠ 0. Therefore, if it can be
shown that whenever Λ (X) = 0, it is the case that Λ (p) = 0 for all polynomials, it
must be the case that X is dense in C ([0, b]).
By the Riesz representation theorem the elements of $C([0,b])'$ are complex measures.
Suppose then that for µ a complex measure it follows that for all $t^{\lambda_k}$,
$$\int_{[0,b]}t^{\lambda_k}\,d\mu = 0.$$
for all positive integers. It suffices to modify µ if necessary to have µ ({0}) = 0 since
this will not change any of the above integrals. Let $\mu_1(E) = \mu(E\cap(0,b])$ and use
$\mu_1$. I will continue using the symbol, µ.
For Re (z) > 0, define
$$F(z) \equiv \int_{[0,b]}t^z\,d\mu = \int_{(0,b]}t^z\,d\mu$$
The function $t^z = \exp(z\ln(t))$ is analytic. I claim that F (z) is also analytic for
Re z > 0. Apply Morera’s theorem. Let T be a triangle in Re z > 0. Then
$$\int_{\partial T}F(z)\,dz = \int_{\partial T}\int_{(0,b]}e^{z\ln(t)}\,\xi\,d|\mu|\,dz$$
Now $\int_{\partial T}$ can be split into three integrals over intervals of R and so this integral is essentially
a Lebesgue integral taken with respect to Lebesgue measure. Furthermore,
$e^{z\ln(t)}$ is a continuous function of the two variables and ξ is a function of only the
one variable, t. Thus the integrand is product measurable. The iterated integral is
also absolutely integrable because $\left|e^{z\ln(t)}\right| \le e^{x\ln t} \le e^{x\ln b}$ where x + iy = z and
x is given to be positive. Thus the integrand is actually bounded. Therefore, you
can apply Fubini’s theorem and write
$$\int_{\partial T}F(z)\,dz = \int_{\partial T}\int_{(0,b]}e^{z\ln(t)}\,\xi\,d|\mu|\,dz = \int_{(0,b]}\xi\int_{\partial T}e^{z\ln(t)}\,dz\,d|\mu| = 0.$$
$$\left|\phi^{-1}(z)\right|^2 = \frac{z-1}{z+1}\cdot\frac{\overline{z}-1}{\overline{z}+1} = \frac{|z|^2-2\operatorname{Re}z+1}{|z|^2+2\operatorname{Re}z+1} < 1.$$
Consider F ◦ φ, an analytic function defined on B (0, 1). This function is given to
have zeros at $z_n$ where $\phi(z_n) = \frac{1+z_n}{1-z_n} = \lambda_n$. This reduces to $z_n = \frac{-1+\lambda_n}{1+\lambda_n}$. Now
$$1-|z_n| \ge \frac{c}{1+\lambda_n}$$
for a positive constant, c. It is given that $\sum\frac{1}{\lambda_n} = \infty$, so it follows $\sum(1-|z_n|) = \infty$
also. Therefore, by Corollary 29.29, F ◦ φ = 0. It follows F = 0 also. In particular,
F (k) for k a positive integer equals zero. This has shown that if $\Lambda \in C([0,b])'$ and
Λ sends 1 and all the $t^{\lambda_n}$ to 0, then Λ sends 1 and all $t^k$ for k a positive integer to
zero. As explained above, X is dense in C ([0, b]) .
The converse of this theorem is also true and is proved in Rudin [45].
29.6 Exercises
1. Suppose f is an entire function with f (0) = 1. Let
M (r) = max {|f (z)| : |z| = r} .
Use Jensen’s equation to establish the following inequality.
$$M(2r) \ge 2^{n(r)}$$
where n (r) is the number of zeros of f in B (0, r).
2. The version of the Blaschke product presented above is that found in most
complex variable texts. However, there is another one in [37]. Instead of
$\frac{\alpha_n-z}{1-\overline{\alpha_n}z}\,\frac{|\alpha_n|}{\alpha_n}$ you use
$$\frac{\alpha_n-z}{\frac{1}{\overline{\alpha_n}}-z}$$
2 = k=1 (2k−1)(2k+1) .
If no such a exists, the function is said to be of infinite order. Show the order
of an entire function is also equal to $\limsup_{r\to\infty}\frac{\ln(\ln(M(r)))}{\ln(r)}$ where $M(r) \equiv \max\{|f(z)| : |z| = r\}$.
7. Suppose Ω is a simply connected region and let f be meromorphic on Ω.
Suppose also that the set, S ≡ {z ∈ Ω : f (z) = c} has a limit point in Ω. Can
you conclude f (z) = c for all z ∈ Ω?
8. This and the next collection of problems are dealing with the gamma function.
Show that
$$\left|\left(1+\frac{z}{n}\right)e^{-z/n}-1\right| \le \frac{C(z)}{n^2}$$
and therefore,
$$\sum_{n=1}^{\infty}\left|\left(1+\frac{z}{n}\right)e^{-z/n}-1\right| < \infty$$
with the convergence uniform on compact sets.
9. ↑ Show $\prod_{n=1}^{\infty}\left(1+\frac{z}{n}\right)e^{-z/n}$ converges to an analytic function on C which has
zeros only at the negative integers and that therefore,
$$\prod_{n=1}^{\infty}\left(1+\frac{z}{n}\right)^{-1}e^{z/n}$$
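$$\Gamma(z) \equiv \frac{e^{-\gamma z}}{z}\prod_{n=1}^{\infty}\left(1+\frac{z}{n}\right)^{-1}e^{z/n},$$
$$\begin{aligned}
&= \lim_{n\to\infty}\left(\frac{n!}{(1+z)(2+z)\cdots(n+z)}\,e^{z\left(\sum_{k=1}^{n}\frac{1}{k}\right)}\right)\frac{e^{-\gamma z}}{z} \\
&= \lim_{n\to\infty}\frac{n!}{z(1+z)(2+z)\cdots(n+z)}\,e^{z\left(\sum_{k=1}^{n}\frac{1}{k}\right)}e^{-z\left[\sum_{k=1}^{n}\frac{1}{k}-\ln n\right]} \\
&= \lim_{n\to\infty}\frac{n!\,n^z}{z(1+z)(2+z)\cdots(n+z)}.
\end{aligned}$$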
13. ↑ Verify from the Gauss formula above that Γ (z + 1) = Γ (z) z and that for n
a nonnegative integer, Γ (n + 1) = n!.
14. ↑ The usual definition of the gamma function for positive x is
$$\Gamma_1(x) \equiv \int_{0}^{\infty}e^{-t}t^{x-1}\,dt.$$
Show $\left(1-\frac{t}{n}\right)^n \le e^{-t}$ for t ∈ [0, n] . Then show
$$\int_{0}^{n}\left(1-\frac{t}{n}\right)^n t^{x-1}\,dt = \frac{n!\,n^x}{x(x+1)\cdots(x+n)}.$$
Use the first part to conclude that
$$\Gamma_1(x) = \lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)} = \Gamma(x).$$
Hint: To show $\left(1-\frac{t}{n}\right)^n \le e^{-t}$ for t ∈ [0, n] , verify this is equivalent to
showing $(1-u)^n \le e^{-nu}$ for u ∈ [0, 1].
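The limit in Problem 14 can be compared with a library gamma function for a few positive x. This is a sketch, not part of the exercise: `gauss_gamma` is a name chosen here, the computation is done with logarithms to avoid overflow in n!, and for fixed n the relative error is roughly x(x+1)/(2n), so convergence is slow.

```python
import math


def gauss_gamma(x, n=200000):
    """Approximate Gamma(x) for x > 0 by n! * n^x / (x (x+1) ... (x+n))."""
    log_numerator = math.lgamma(n + 1) + x * math.log(n)
    log_denominator = math.fsum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_numerator - log_denominator)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.5):
        print(x, gauss_gamma(x), math.gamma(x))
```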
15. ↑ Show $\Gamma(z) = \int_{0}^{\infty}e^{-t}t^{z-1}\,dt$ whenever Re z > 0. Hint: You have already
shown that this is true for positive real numbers. Verify this formula for
Re z > 0 yields an analytic function.
16. ↑ Show $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$. Then find $\Gamma\left(\frac{5}{2}\right)$.
17. Show that $\int_{-\infty}^{\infty}e^{-s^2/2}\,ds = \sqrt{2\pi}$. Hint: Denote this integral by I and observe
that $I^2 = \int_{\mathbb{R}^2}e^{-(x^2+y^2)/2}\,dx\,dy$. Then change variables to polar coordinates,
x = r cos (θ), y = r sin θ.
18. ↑ Now that you know what the gamma function is, consider in the formula
for Γ (α + 1) the following change of variables. $t = \alpha + \alpha^{1/2}s$. Then in terms
of the new variable, s, the formula for Γ (α + 1) is
$$e^{-\alpha}\alpha^{\alpha+\frac{1}{2}}\int_{-\sqrt{\alpha}}^{\infty}e^{-\sqrt{\alpha}s}\left(1+\frac{s}{\sqrt{\alpha}}\right)^{\alpha}ds = e^{-\alpha}\alpha^{\alpha+\frac{1}{2}}\int_{-\sqrt{\alpha}}^{\infty}e^{\alpha\left[\ln\left(1+\frac{s}{\sqrt{\alpha}}\right)-\frac{s}{\sqrt{\alpha}}\right]}ds$$
Show the integrand converges to $e^{-s^2/2}$. Show that then
$$\lim_{\alpha\to\infty}\frac{\Gamma(\alpha+1)}{e^{-\alpha}\alpha^{\alpha+(1/2)}} = \int_{-\infty}^{\infty}e^{-s^2/2}\,ds = \sqrt{2\pi}.$$
Hint: You will need to obtain a dominating function for the integral so that
you can use the dominated convergence theorem. You might try considering
$s \in (-\sqrt{\alpha},\sqrt{\alpha})$ first and consider something like $e^{1-(s^2/4)}$ on this interval.
Then look for another function for $s > \sqrt{\alpha}$. This formula is known as Stirling’s
formula.
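The limit in Problem 18 is easy to observe numerically. The sketch below (the name `stirling_ratio` is illustrative) evaluates the ratio through logarithms, since Γ(α + 1) itself overflows floating point for large α.

```python
import math


def stirling_ratio(alpha):
    """Gamma(alpha + 1) / (e^{-alpha} * alpha^{alpha + 1/2}), computed via logarithms."""
    log_ratio = math.lgamma(alpha + 1) + alpha - (alpha + 0.5) * math.log(alpha)
    return math.exp(log_ratio)


if __name__ == "__main__":
    for alpha in (10.0, 100.0, 1000.0, 10000.0):
        print(alpha, stirling_ratio(alpha), math.sqrt(2 * math.pi))
```

The printed ratios should decrease toward √(2π) as α grows.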
19. This and the next several problems develop the zeta function and give a
relation between the zeta and the gamma function. Define for 0 < r < 2π
$$I_r(z) \equiv \int_{0}^{2\pi}\frac{e^{(z-1)(\ln r+i\theta)}}{e^{re^{i\theta}}-1}\,ire^{i\theta}\,d\theta + \int_{r}^{\infty}\frac{e^{(z-1)(\ln t+2\pi i)}}{e^{t}-1}\,dt + \int_{\infty}^{r}\frac{e^{(z-1)\ln t}}{e^{t}-1}\,dt \tag{29.24}$$
Show that $I_r$ is an entire function. The reason 0 < r < 2π is that this prevents
$e^{re^{i\theta}}-1$ from equaling zero. The above is just a precise description of the
contour integral, $\int_{\gamma}\frac{w^{z-1}}{e^{w}-1}\,dw$ where γ is the contour shown below.
[Figure: the contour γ.]
in which on the integrals along the real line, the argument is different in going
from r to ∞ than it is in going from ∞ to r. Now I have not defined such
contour integrals over contours which have infinite length and so have chosen
to simply write out explicitly what is involved. You have to work with these
integrals given above anyway but the contour integral just mentioned is the
motivation for them. Hint: You may want to use convergence theorems from
real analysis if it makes this more convenient but you might not have to.
[Figure: a contour with labels r and 2δ.]
Show that limδ→0 Irδ (z) = Ir (z) . Hint: Use the dominated convergence
theorem if it makes this go easier. This is not a hard problem if you use these
theorems but you can probably do it without them with more work.
21. ↑ In the context of Problem 20 show that for r1 < r, Irδ (z) − Ir1 δ (z) is a
contour integral,
$$\int_{\gamma_{r,r_1,\delta}}\frac{w^{z-1}}{e^{w}-1}\,dw$$
[Figure: the contour $\gamma_{r,r_1,\delta}$.]
In this contour integral, $w^{z-1}$ denotes $e^{(z-1)\log(w)}$ where log (w) = ln |w| +
i arg (w) for arg (w) ∈ (0, 2π) . Explain why this integral equals zero. From
Problem 20 it follows that $I_r = I_{r_1}$. Therefore, you can define an entire function,
$I(z) \equiv I_r(z)$ for all r positive but sufficiently small. Hint: Remember
the Cauchy integral formula for analytic functions defined on simply connected
regions. You could argue there is a simply connected region containing γ r,r1 ,δ .
22. ↑ In case Re z > 1, you can get an interesting formula for I (z) by taking the
limit as r → 0. Recall that
$$I_r(z) \equiv \int_{0}^{2\pi}\frac{e^{(z-1)(\ln r+i\theta)}}{e^{re^{i\theta}}-1}\,ire^{i\theta}\,d\theta + \int_{r}^{\infty}\frac{e^{(z-1)(\ln t+2\pi i)}}{e^{t}-1}\,dt + \int_{\infty}^{r}\frac{e^{(z-1)\ln t}}{e^{t}-1}\,dt \tag{29.25}$$
and now it is desired to take a limit in the case where Re z > 1. Show the first
integral above converges to 0 as r → 0. Next argue the sum of the two last
integrals converges to
$$\left(e^{(z-1)2\pi i}-1\right)\int_{0}^{\infty}\frac{e^{(z-1)\ln(t)}}{e^{t}-1}\,dt.$$
Thus
$$I(z) = \left(e^{z2\pi i}-1\right)\int_{0}^{\infty}\frac{e^{(z-1)\ln(t)}}{e^{t}-1}\,dt \tag{29.26}$$
when Re z > 1.
23. ↑ So what does all this have to do with the zeta function and the gamma
function? The zeta function is defined for Re z > 1 by
$$\sum_{n=1}^{\infty}\frac{1}{n^z} \equiv \zeta(z).$$
Now show that you can interchange the order of the sum and the integral.
This is possibly most easily done by using Fubini’s theorem. Show that
$\sum_{n=1}^{\infty}\int_{0}^{\infty}\left|e^{-ns}s^{z-1}\right|ds < \infty$ and then use Fubini’s theorem. I think you
could do it other ways though. It is possible to do it without any reference to
Lebesgue integration. Thus
$$\zeta(z)\Gamma(z) = \int_{0}^{\infty}s^{z-1}\sum_{n=1}^{\infty}e^{-ns}\,ds = \int_{0}^{\infty}\frac{s^{z-1}e^{-s}}{1-e^{-s}}\,ds = \int_{0}^{\infty}\frac{s^{z-1}}{e^{s}-1}\,ds$$
By 29.26,
$$\begin{aligned}
I(z) &= \left(e^{z2\pi i}-1\right)\int_{0}^{\infty}\frac{e^{(z-1)\ln(t)}}{e^{t}-1}\,dt \\
&= \left(e^{z2\pi i}-1\right)\zeta(z)\Gamma(z) \\
&= \left(e^{2\pi iz}-1\right)\zeta(z)\Gamma(z)
\end{aligned}$$
whenever Re z > 1.
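The identity ζ(z)Γ(z) = ∫₀^∞ s^{z−1}/(e^s − 1) ds from Problem 23 can be checked for real z where ζ is known in closed form, namely ζ(2) = π²/6 and ζ(4) = π⁴/90. The sketch below uses a plain midpoint rule on a truncated interval; the name `zeta_gamma_integral`, the cutoff and the number of cells are assumptions for this illustration.

```python
import math


def zeta_gamma_integral(x, upper=60.0, cells=200000):
    """Midpoint-rule approximation of the integral of s^(x-1)/(e^s - 1) over (0, upper).

    Intended for real x >= 2, where the integrand is bounded near s = 0.
    The part of the integral beyond `upper` is negligible for moderate x.
    """
    width = upper / cells
    total = 0.0
    for j in range(cells):
        s = (j + 0.5) * width
        total += s ** (x - 1) / math.expm1(s)
    return total * width


if __name__ == "__main__":
    print(zeta_gamma_integral(2.0), math.pi ** 2 / 6 * math.gamma(2.0))
    print(zeta_gamma_integral(4.0), math.pi ** 4 / 90 * math.gamma(4.0))
```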
24. ↑ Now show there exists an entire function, h (z) such that
$$\zeta(z) = \frac{1}{z-1} + h(z)$$
for Re z > 1. Conclude ζ (z) extends to a meromorphic function defined on
all of C which has a simple pole at z = 1, namely, the right side of the above
formula. Hint: Use Problem 10 to observe that Γ (z) is never equal to zero
but has simple poles at every nonpositive integer. Then for Re z > 1,
$$\zeta(z) \equiv \frac{I(z)}{\left(e^{2\pi iz}-1\right)\Gamma(z)}.$$
By 29.26 ζ has no poles for Re z > 1. The right side of the above equation is
defined for all z. There are no poles except possibly when z is a nonpositive
integer. However, these points are not poles either because of Problem 10
which states that Γ has simple poles at these points thus cancelling the simple
zeros of $\left(e^{2\pi iz}-1\right)$. The only remaining possibility for a pole for ζ is at z = 1.
Show it has a simple pole at this point. You can use the formula for I (z)
$$I(z) \equiv \int_{0}^{2\pi}\frac{e^{(z-1)(\ln r+i\theta)}}{e^{re^{i\theta}}-1}\,ire^{i\theta}\,d\theta + \int_{r}^{\infty}\frac{e^{(z-1)(\ln t+2\pi i)}}{e^{t}-1}\,dt + \int_{\infty}^{r}\frac{e^{(z-1)\ln t}}{e^{t}-1}\,dt \tag{29.27}$$
$$\frac{I(z)}{\left(e^{2\pi iz}-1\right)\Gamma(z)} = \frac{1}{z-1} + h(z)$$
where h (z) is an entire function. People worry a lot about where the zeros of
ζ are located. In particular, the zeros for Re z ∈ (0, 1) are of special interest.
The Riemann hypothesis says they are all on the line Re z = 1/2. This is a
good problem for you to do next.
25. There is an important relation between prime numbers and the zeta function
due to Euler. Let $\{p_n\}_{n=1}^{\infty}$ be the prime numbers. Then for Re z > 1,
$$\prod_{n=1}^{\infty}\frac{1}{1-p_n^{-z}} = \zeta(z).$$
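Euler's product can be observed numerically at z = 2, where ζ(2) = π²/6. The following is a sketch only: `primes_up_to` is an ordinary sieve of Eratosthenes, `euler_product` is a name chosen here, and the product is truncated at a prime bound, so agreement is limited by the omitted primes.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p with 2 <= p <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]


def euler_product(x, limit=100000):
    """Truncated product of 1/(1 - p^(-x)) over primes p <= limit, for real x > 1."""
    value = 1.0
    for p in primes_up_to(limit):
        value /= 1.0 - p ** (-x)
    return value


if __name__ == "__main__":
    print(euler_product(2.0), math.pi ** 2 / 6)
```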
820 ELLIPTIC FUNCTIONS
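Therefore,
$$\left|\sum_{k=1}^{n_2}a_{\phi(k)} - \sum_{k=1}^{n_1}a_{\theta(k)}\right| = \left|\sum_{\phi(k)\notin\{\theta(1),\cdots,\theta(n_1)\},\,k\le n_2}a_{\phi(k)}\right|$$
Now all of these φ (k) in the last sum are contained in {θ (n1 + 1) , · · ·} and so the
last sum above is dominated by
$$\le \sum_{k=n_1+1}^{\infty}\left|a_{\theta(k)}\right| < \varepsilon.$$
Therefore,
$$\begin{aligned}
\left|\sum_{k=1}^{\infty}a_{\phi(k)} - \sum_{k=1}^{\infty}a_{\theta(k)}\right|
&\le \left|\sum_{k=1}^{\infty}a_{\phi(k)} - \sum_{k=1}^{n_2}a_{\phi(k)}\right| + \left|\sum_{k=1}^{n_2}a_{\phi(k)} - \sum_{k=1}^{n_1}a_{\theta(k)}\right| \\
&\quad + \left|\sum_{k=1}^{n_1}a_{\theta(k)} - \sum_{k=1}^{\infty}a_{\theta(k)}\right| < \varepsilon + \varepsilon + \varepsilon = 3\varepsilon
\end{aligned}$$
and since ε is arbitrary, it follows $\sum_{k=1}^{\infty}a_{\phi(k)} = \sum_{k=1}^{\infty}a_{\theta(k)}$ as claimed. This proves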
the theorem.
w = mw1 + nw2 + (x − m) w1 + (y − n) w2
and so
w − mw1 − nw2 = (x − m) w1 + (y − n) w2 (30.1)
Now since $w_2/w_1 \notin \mathbb{R}$,
which would contradict the choice of w1 as being the period having minimal absolute
value because the expression on the left in the above is a period and it equals
something which has absolute value less than |w1 |. Therefore, x = k and w is an
integer linear combination of w1 and w2 . It only remains to verify the claim about
τ.
From the construction, |w1 | ≤ |w2 | and |w2 | ≤ |w1 − w2 | , |w2 | ≤ |w1 + w2 | .
Therefore,
|τ | ≥ 1, |τ | ≤ |1 − τ | , |τ | ≤ |1 + τ | .
The last two of these inequalities imply −1/2 ≤ Re τ ≤ 1/2.
This proves the theorem.
Definition 30.4 For f a meromorphic function which has the last of the above
alternatives holding in which M = {aw1 + bw2 : a, b ∈ Z} , the function, f is called
elliptic. This is also called doubly periodic.
Theorem 30.7 The sum of the residues of any elliptic function, f equals zero on
every Pa if a is chosen so that there are no poles on ∂Pa .
because the integrals over opposite sides of the parallelogram cancel out because
the values of f are the same on these sides and the orientations are opposite. It
follows from the residue theorem that the sum of the residues in Pa equals 0.
Proof: Let c ∈ f (Pa ) and consider $P_a'$ such that $f^{-1}(c)\cap P_a' = f^{-1}(c)\cap P_a$
and $P_a'$ contains the same poles and zeros of f − c as Pa but $P_a'$ has no zeros of
f (z) − c or poles of f on its boundary. Thus $f'(z)/(f(z)-c)$ is also an elliptic
function and so Theorem 30.7 applies. Consider
$$\frac{1}{2\pi i}\int_{\partial P_a'}\frac{f'(z)}{f(z)-c}\,dz.$$
By the argument principle, this equals $N_z - N_p$ where $N_z$ equals the number of zeros
of f (z) − c and $N_p$ equals the number of the poles of f (z). From Theorem 30.7 this
must equal zero because it is the sum of the residues of $f'/(f-c)$ and so $N_z = N_p$.
Now Np equals the number of poles in Pa counted according to multiplicity.
There is an even better theorem than this one.
Theorem 30.9 Let f be a non constant elliptic function and suppose it has poles
p1 , · · ·, pm and zeros, z1 , · · ·, zm in Pα , listed according to multiplicity where ∂Pα
contains no poles or zeros of f . Then $\sum_{k=1}^{m}z_k - \sum_{k=1}^{m}p_k \in M$, the module of
periods.
Proof: You can assume ∂Pa contains no poles or zeros of f because if it did,
then you could consider a slightly shifted period parallelogram, Pa0 which contains
no new zeros and poles but which has all the old ones but no poles or zeros on its
boundary. By Theorem 26.8 on Page 710
$$\frac{1}{2\pi i}\int_{\partial P_a}z\,\frac{f'(z)}{f(z)}\,dz = \sum_{k=1}^{m}z_k - \sum_{k=1}^{m}p_k. \tag{30.2}$$
$$= \int_{\gamma(a,a+w_1)}\left(z-(z+w_2)\right)\frac{f'(z)}{f(z)}\,dz + \int_{\gamma(a,a+w_2)}\left(z-(z+w_1)\right)\frac{f'(z)}{f(z)}\,dz$$
Now near these line segments $\frac{f'(z)}{f(z)}$ is analytic and so there exists a primitive, $g_{w_i}(z)$
on γ (a, a + wi ) by Corollary 24.32 on Page 663 which satisfies $e^{g_{w_i}(z)} = f(z)$.
Therefore,
$$= -w_2\left(g_{w_1}(a+w_1)-g_{w_1}(a)\right) - w_1\left(g_{w_2}(a+w_2)-g_{w_2}(a)\right).$$
824 ELLIPTIC FUNCTIONS
Corollary 30.10 Let $f$ be a non constant elliptic function and suppose the function, $f(z) - c$ has poles $p_1, \cdots, p_m$ and zeros, $z_1, \cdots, z_m$ on $P_\alpha$, listed according to multiplicity where $\partial P_\alpha$ contains no poles or zeros of $f(z) - c$. Then $\sum_{k=1}^m z_k - \sum_{k=1}^m p_k \in M$, the module of periods.
ad − bc = ±1.
The following is an interesting lemma which ties matrices with the fractional
linear transformations.
30.1. PERIODIC FUNCTIONS 825
az + b = z (cz + d)
However, it was shown that this implies BA is a nonzero multiple of I which requires
that A−1 must exist. Hence the condition must hold.
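Proof: Since $\{w_1, w_2\}$ is a basis, there exist integers, $a, b, c, d$ such that 30.4 holds. It remains to show the transformation determined by the matrix is unimodular. Taking conjugates,
$$\begin{pmatrix} \overline{w_1'} \\ \overline{w_2'} \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} \overline{w_1} \\ \overline{w_2} \end{pmatrix}.$$
Therefore,
$$\begin{pmatrix} w_1' & \overline{w_1'} \\ w_2' & \overline{w_2'} \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} w_1 & \overline{w_1} \\ w_2 & \overline{w_2} \end{pmatrix}$$
Now since $\{w_1', w_2'\}$ is also given to be a basis, there exists another matrix having all integer entries, $\begin{pmatrix} e & f \\ g & h \end{pmatrix}$, such that
$$\begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} e & f \\ g & h \end{pmatrix}\begin{pmatrix} w_1' \\ w_2' \end{pmatrix}$$
and
$$\begin{pmatrix} \overline{w_1} \\ \overline{w_2} \end{pmatrix} = \begin{pmatrix} e & f \\ g & h \end{pmatrix}\begin{pmatrix} \overline{w_1'} \\ \overline{w_2'} \end{pmatrix}.$$
Therefore,
$$\begin{pmatrix} w_1' & \overline{w_1'} \\ w_2' & \overline{w_2'} \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} e & f \\ g & h \end{pmatrix}\begin{pmatrix} w_1' & \overline{w_1'} \\ w_2' & \overline{w_2'} \end{pmatrix}.$$
However, since $w_1'/w_2'$ is not real, it is routine to verify that
$$\det\begin{pmatrix} w_1' & \overline{w_1'} \\ w_2' & \overline{w_2'} \end{pmatrix} \neq 0.$$
Therefore,
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} e & f \\ g & h \end{pmatrix}$$
and so $\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}\det\begin{pmatrix} e & f \\ g & h \end{pmatrix} = 1$. But the two matrices have all integer entries and so both determinants must equal either 1 or −1.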
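Next suppose
$$\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} w_1 \\ w_2 \end{pmatrix} \tag{30.5}$$
where $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is unimodular. I need to verify that $\{w_1', w_2'\}$ is a basis. If $w \in M$, there exist integers, $m, n$ such that
$$w = mw_1 + nw_2 = \begin{pmatrix} m & n \end{pmatrix}\begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$$
From 30.5
$$\pm\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$$
and so
$$w = \pm\begin{pmatrix} m & n \end{pmatrix}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}\begin{pmatrix} w_1' \\ w_2' \end{pmatrix}$$
which is an integer linear combination of $\{w_1', w_2'\}$. It only remains to verify that $w_1'/w_2'$ is not real.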
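Claim: Let $w_1$ and $w_2$ be nonzero complex numbers. Then $w_2/w_1$ is not real if and only if
$$w_1\overline{w_2} - \overline{w_1}w_2 = \det\begin{pmatrix} w_1 & \overline{w_1} \\ w_2 & \overline{w_2} \end{pmatrix} \neq 0$$
Proof of the claim: Let $\lambda = w_2/w_1$. Then
$$w_1\overline{w_2} - \overline{w_1}w_2 = \overline{\lambda}\,w_1\overline{w_1} - \overline{w_1}\,\lambda w_1 = \left(\overline{\lambda} - \lambda\right)|w_1|^2$$
Thus the ratio is not real if and only if $\left(\overline{\lambda} - \lambda\right) \neq 0$ if and only if $w_1\overline{w_2} - \overline{w_1}w_2 \neq 0$.
Now to verify $w_2'/w_1'$ is not real,
$$\det\begin{pmatrix} w_1' & \overline{w_1'} \\ w_2' & \overline{w_2'} \end{pmatrix} = \det\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} w_1 & \overline{w_1} \\ w_2 & \overline{w_2} \end{pmatrix}\right) = \pm\det\begin{pmatrix} w_1 & \overline{w_1} \\ w_2 & \overline{w_2} \end{pmatrix} \neq 0$$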
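The role of the determinant condition can be illustrated numerically. The following is a minimal Python sketch, not taken from the text: the helper names `det2`, `int_inverse`, and `change_basis`, the sample periods, and the sample matrix are all assumptions chosen for illustration. It checks that an integer matrix with determinant ±1 has an integer inverse, so the old basis is recovered as an integer combination of the new one.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def int_inverse(m):
    """Integer inverse of a unimodular 2x2 integer matrix."""
    (a, b), (c, d) = m
    s = det2(m)
    if s not in (1, -1):
        raise ValueError("matrix is not unimodular")
    # Dividing by s is the same as multiplying by s when s = +1 or -1.
    return ((s * d, -s * b), (-s * c, s * a))

def change_basis(m, w1, w2):
    """Apply the matrix to the column (w1, w2)."""
    (a, b), (c, d) = m
    return a * w1 + b * w2, c * w1 + d * w2

w1, w2 = 1.0 + 0.0j, 0.3 + 1.1j          # hypothetical periods, w2/w1 not real
m = ((2, 1), (1, 1))                      # hypothetical unimodular matrix
w1p, w2p = change_basis(m, w1, w2)        # the new basis
back = change_basis(int_inverse(m), w1p, w2p)  # recovers (w1, w2)
```

Because the inverse again has integer entries, every integer combination of the old periods is an integer combination of the new ones, which is the step used in the proof.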
where w consists of all numbers of the form aw1 + bw2 for a, b integers. Sometimes
people write this as ℘ (z, w1 , w2 ) to emphasize its dependence on the periods, w1
and w2 but I won’t do so. It is understood there exist these periods, which are
given. This is a reasonable thing to try. Suppose you formally differentiate the
right side. Never mind whether this is justified for now. This yields
$$\wp'(z) = \frac{-2}{z^3} - \sum_{w \neq 0}\frac{2}{(z - w)^3} = \sum_{w}\frac{-2}{(z - w)^3}$$
and so
$$c_1 = \wp\left(-\frac{w_1}{2} + w_1\right) - \wp\left(-\frac{w_1}{2}\right) = \wp\left(\frac{w_1}{2}\right) - \wp\left(-\frac{w_1}{2}\right) = 0$$
which shows the constant for ℘ (z + w1 ) − ℘ (z) must equal zero. Similarly the
constant for ℘ (z + w2 ) − ℘ (z) also equals zero. Thus ℘ is periodic having the two
periods w1 , w2 .
Of course to justify this, you need to consider whether the series of 30.6 con-
verges. Consider the terms of the series.
$$\left|\frac{1}{(z - w)^2} - \frac{1}{w^2}\right| = |z|\left|\frac{2w - z}{(z - w)^2 w^2}\right|$$
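The identity just displayed, and the size of the terms it produces, can be spelled out as follows. This is a sketch added for illustration; the constant 10 and the restriction $|w| \geq 2R$ are assumptions of the sketch, not estimates quoted from the text.

```latex
\frac{1}{(z-w)^2}-\frac{1}{w^2}
  =\frac{w^2-(z-w)^2}{(z-w)^2w^2}
  =\frac{z\,(2w-z)}{(z-w)^2w^2}.
% If |z| \le R and |w| \ge 2R, then |z-w| \ge |w|/2 and |2w-z| \le 5|w|/2, so
\left|\frac{1}{(z-w)^2}-\frac{1}{w^2}\right|
  \le R\cdot\frac{5|w|/2}{\left(|w|^2/4\right)|w|^2}
  =\frac{10R}{|w|^3}.
```

So, apart from finitely many $w$, the terms are dominated by a multiple of $1/|w|^3$, which is why the convergence of $\sum_{w \neq 0} |w|^{-3}$ is the relevant question.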
Claim: There exists a positive number, k such that for all pairs of integers,
m, n, not both equal to zero,
$$\frac{|mw_1 + nw_2|}{|m| + |n|} \geq k > 0.$$
Proof of claim: If not, there exist $m_k$ and $n_k$ such that
$$\lim_{k \to \infty}\left(\frac{m_k}{|m_k| + |n_k|}w_1 + \frac{n_k}{|m_k| + |n_k|}w_2\right) = 0$$
However, $\left(\frac{m_k}{|m_k| + |n_k|}, \frac{n_k}{|m_k| + |n_k|}\right)$ is a bounded sequence in $\mathbb{R}^2$ and so, taking a subsequence, still denoted by $k$, you can have
$$\left(\frac{m_k}{|m_k| + |n_k|}, \frac{n_k}{|m_k| + |n_k|}\right) \to (x, y) \in \mathbb{R}^2$$
and so there are real numbers, $x, y$, with $|x| + |y| = 1$ and hence not both zero, such that $xw_1 + yw_2 = 0$ contrary to the assumption that $w_2/w_1$ is not equal to a real number. This proves the claim.
Now from the claim,
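$$\sum_{w \neq 0}\frac{1}{|w|^3} = \sum_{(m,n) \neq (0,0)}\frac{1}{|mw_1 + nw_2|^3} \leq \sum_{(m,n) \neq (0,0)}\frac{1}{k^3\left(|m| + |n|\right)^3}$$
$$= \frac{1}{k^3}\sum_{j=1}^{\infty}\sum_{|m| + |n| = j}\frac{1}{\left(|m| + |n|\right)^3} = \frac{1}{k^3}\sum_{j=1}^{\infty}\frac{4j}{j^3} < \infty.$$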
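The last step uses the fact that exactly $4j$ integer pairs satisfy $|m| + |n| = j$. A short Python sketch, with hypothetical helper names and a hypothetical sample lattice, checks this count by brute force, estimates the constant $k$ of the claim over a finite range, and evaluates partial sums of $\sum 4j/j^3$.

```python
def count_on_diamond(j):
    """Number of integer pairs (m, n) with |m| + |n| == j."""
    return sum(1 for m in range(-j, j + 1)
                 for n in range(-j, j + 1)
                 if abs(m) + abs(n) == j)

def ratio_lower_bound(w1, w2, J):
    """Minimum of |m w1 + n w2| / (|m| + |n|) over 0 < max(|m|, |n|) <= J.
    This is only a finite-range estimate of the constant k in the claim."""
    return min(abs(m * w1 + n * w2) / (abs(m) + abs(n))
               for m in range(-J, J + 1)
               for n in range(-J, J + 1)
               if (m, n) != (0, 0))

def partial_sum(J):
    """Partial sum of sum_{j>=1} 4j / j^3 = 4 * sum 1 / j^2."""
    return sum(4 * j / j ** 3 for j in range(1, J + 1))

counts = [count_on_diamond(j) for j in range(1, 30)]
k_estimate = ratio_lower_bound(1.0, 0.3 + 1.1j, 40)   # hypothetical periods
```

The partial sums stay below $4 \cdot \pi^2/6$, consistent with the convergence asserted above.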
and the last series converges uniformly on B (0, R) to an analytic function. Thus ℘ is
a meromorphic function and also the argument given above involving differentiation
of the series termwise is valid. Thus ℘ is an elliptic function as claimed. This is
called the Weierstrass ℘ function. This has proved the following theorem.
Also since there are no poles of order 1 you can obtain a primitive for ℘, $-\zeta$.² To do so, recall
$$\wp(z) \equiv \frac{1}{z^2} + \sum_{w \neq 0}\left(\frac{1}{(z - w)^2} - \frac{1}{w^2}\right)$$
where for $|z| < R$ this is the sum of a rational function with a uniformly convergent series. Therefore, you can take the integral along any path, $\gamma(0, z)$ from 0 to $z$ which misses the poles of ℘. By the uniform convergence of the above series, you can interchange the sum with the integral and obtain
$$\zeta(z) = \frac{1}{z} + \sum_{w \neq 0}\left(\frac{1}{z - w} + \frac{z}{w^2} + \frac{1}{w}\right) \tag{30.7}$$
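while
$$-\zeta(z) = \frac{1}{-z} + \sum_{w \neq 0}\left(\frac{-1}{z - w} - \frac{z}{w^2} - \frac{1}{w}\right) = \frac{1}{-z} + \sum_{w \neq 0}\left(\frac{-1}{z + w} - \frac{z}{w^2} + \frac{1}{w}\right).$$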
Now consider 30.7. It will be used to find the Laurent expansion about the origin
for ζ which will then be differentiated to obtain the Laurent expansion for ℘ at the
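origin. Since $w \neq 0$ and the interest is for $z$ near 0 so $|z| < |w|$,
$$\frac{1}{z - w} + \frac{z}{w^2} + \frac{1}{w} = \frac{z}{w^2} + \frac{1}{w} - \frac{1}{w}\,\frac{1}{1 - \frac{z}{w}} = \frac{z}{w^2} + \frac{1}{w} - \frac{1}{w}\sum_{k=0}^{\infty}\left(\frac{z}{w}\right)^k = -\frac{1}{w}\sum_{k=2}^{\infty}\left(\frac{z}{w}\right)^k$$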
2 I don’t know why it is traditional to refer to this antiderivative as −ζ rather than ζ but I
am following the convention. I think it is to minimize the number of minus signs in the next
expression.
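From 30.7
$$\zeta(z) = \frac{1}{z} + \sum_{w \neq 0}\left(-\sum_{k=2}^{\infty}\frac{z^k}{w^{k+1}}\right) = \frac{1}{z} - \sum_{k=2}^{\infty}\sum_{w \neq 0}\frac{z^k}{w^{k+1}} = \frac{1}{z} - \sum_{k=2}^{\infty}\sum_{w \neq 0}\frac{z^{2k-1}}{w^{2k}}$$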
because the sum over odd powers of $w$ must be zero: for each $w \neq 0$, there exists $-w \neq 0$ such that the two terms $\frac{z^{2k}}{w^{2k+1}}$ and $\frac{z^{2k}}{(-w)^{2k+1}}$ cancel each other. Hence
$$\zeta(z) = \frac{1}{z} - \sum_{k=2}^{\infty} G_k z^{2k-1}$$
where $G_k = \sum_{w \neq 0}\frac{1}{w^{2k}}$. Now with this,
$$-\zeta'(z) = \wp(z) = \frac{1}{z^2} + \sum_{k=2}^{\infty} G_k (2k - 1) z^{2k-2} = \frac{1}{z^2} + 3G_2 z^2 + 5G_3 z^4 + \cdots$$
Therefore,
$$\wp'(z) = \frac{-2}{z^3} + 6G_2 z + 20G_3 z^3 + \cdots$$
$$\wp'(z)^2 = \frac{4}{z^6} - \frac{24G_2}{z^2} - 80G_3 + \cdots$$
$$4\wp(z)^3 = 4\left(\frac{1}{z^2} + 3G_2 z^2 + 5G_3 z^4 + \cdots\right)^3 = \frac{4}{z^6} + \frac{36}{z^2}G_2 + 60G_3 + \cdots$$
and finally
$$60G_2\,\wp(z) = \frac{60G_2}{z^2} + 0 + \cdots$$
where in the above, the positive powers of $z$ are not listed explicitly. Therefore,
$$\wp'(z)^2 - 4\wp(z)^3 + 60G_2\,\wp(z) + 140G_3 = \sum_{n=1}^{\infty} a_n z^n$$
In deriving the equation it was assumed |z| < |w| for all w = aw1 +bw2 where a, b are
integers not both zero. The left side of the above equation is periodic with respect
to w1 and w2 where w2 /w1 is not a real number. The only possible poles of the
left side are at 0, w1 , w2 , and w1 + w2 , the vertices of the parallelogram determined
by w1 and w2 . This follows from the original formula for ℘ (z) . However, the above
equation shows the left side has no pole at 0. Since the left side is periodic with
periods w1 and w2 , it follows it has no pole at the other vertices of this parallelogram
either. Therefore, the left side is periodic and has no poles. Consequently, it equals
a constant by Theorem 30.5. But the right side of the above equation shows this
constant is 0 because this side equals zero when z = 0. Therefore, ℘ satisfies the
differential equation,
$$\wp'(z)^2 - 4\wp(z)^3 + 60G_2\,\wp(z) + 140G_3 = 0.$$
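The differential equation can be sanity-checked numerically with truncated lattice sums. The sketch below is illustrative only: the function names, the sample periods, the sample point, and the truncation to $|m|, |n| \leq N$ are assumptions, and the truncation means the residual is small rather than exactly zero.

```python
def lattice(w1, w2, N):
    """Nonzero periods m*w1 + n*w2 with |m|, |n| <= N (a symmetric truncation)."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1)
            for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def wp(z, ws):
    """Truncated series for the Weierstrass function."""
    return 1 / z ** 2 + sum(1 / (z - w) ** 2 - 1 / w ** 2 for w in ws)

def wp_prime(z, ws):
    """Truncated termwise derivative of the series."""
    return -2 / z ** 3 + sum(-2 / (z - w) ** 3 for w in ws)

def G(k, ws):
    """Truncated sum G_k = sum over w != 0 of w^(-2k)."""
    return sum(1 / w ** (2 * k) for w in ws)

def ode_relative_residual(z, w1, w2, N=30):
    """|p'^2 - 4p^3 + 60 G_2 p + 140 G_3| relative to the size of its terms."""
    ws = lattice(w1, w2, N)
    p, dp = wp(z, ws), wp_prime(z, ws)
    residual = dp ** 2 - 4 * p ** 3 + 60 * G(2, ws) * p + 140 * G(3, ws)
    scale = abs(dp) ** 2 + 4 * abs(p) ** 3
    return abs(residual) / scale

res = ode_relative_residual(0.3 + 0.2j, 1.0, 0.3 + 1.1j)
```

Increasing `N` should shrink the residual further, since the neglected tail of each sum decays like a power of $1/N$.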
Thus
$$\wp(-z) = \frac{1}{z^2} + \sum_{w \neq 0}\left(\frac{1}{(-z - w)^2} - \frac{1}{w^2}\right) = \frac{1}{z^2} + \sum_{w \neq 0}\left(\frac{1}{(-z + w)^2} - \frac{1}{w^2}\right) = \wp(z).$$
It follows one of the ei must equal ℘ (w1 /2) . Similarly, one of the ei must equal
℘ (w2 /2) and one must equal ℘ ((w1 + w2 ) /2).
Lemma 30.15 The numbers, $\wp(w_1/2)$, $\wp(w_2/2)$, and $\wp((w_1 + w_2)/2)$ are distinct.
Proof: Choose $P_a$, a period parallelogram which contains the pole 0, and the points $w_1/2$, $w_2/2$, and $(w_1 + w_2)/2$ but no other pole of $\wp(z)$. Also $\partial P_a$ does not contain any zeros of the elliptic function, $z \to \wp(z) - \wp(w_1/2)$. This can be done by shifting $P_0$ slightly because the poles are only at the points $aw_1 + bw_2$ for $a, b$ integers and the zeros of $\wp(z) - \wp(w_1/2)$ are discrete.
[Figure: a period parallelogram $P_a$ with corner $a$, containing the points $0$, $w_1$, $w_2$, and $w_1 + w_2$.]
If $\wp(w_2/2) = \wp(w_1/2)$, then $\wp(z) - \wp(w_1/2)$ has two zeros, $w_2/2$ and $w_1/2$, and since the pole at 0 is of order 2, this is the order of $\wp(z) - \wp(w_1/2)$ on $P_a$; hence by Theorem 30.8 on Page 822 these are the only zeros of this function on $P_a$. It follows by Corollary 30.10 on Page 824, which says the sum of the zeros minus the sum of the poles is in $M$, that $\frac{w_1}{2} + \frac{w_2}{2} \in M$. Thus there exist integers, $a, b$ such that
$$\frac{w_1 + w_2}{2} = aw_1 + bw_2$$
which implies $(2a - 1)w_1 + (2b - 1)w_2 = 0$ contradicting $w_2/w_1$ not being real. Similar reasoning applies to the other pairs of points in $\{w_1/2, w_2/2, (w_1 + w_2)/2\}$.
For example, consider $(w_1 + w_2)/2$ and choose $P_a$ such that its boundary contains no zeros of the elliptic function, $z \to \wp(z) - \wp((w_1 + w_2)/2)$ and $P_a$ contains no poles of ℘ on its interior other than 0. Then if $\wp(w_2/2) = \wp((w_1 + w_2)/2)$, it follows from Theorem 30.8 on Page 822 that $w_2/2$ and $(w_1 + w_2)/2$ are the only two zeros of $\wp(z) - \wp((w_1 + w_2)/2)$ on $P_a$ and by Corollary 30.10 on Page 824
$$\frac{w_2 + w_1 + w_2}{2} = aw_1 + bw_2 \in M$$
for some integers $a, b$ which leads to the same contradiction as before about $w_1/w_2$ not being real. The other cases are similar. This proves the lemma.
Lemma 30.15 proves the ei are distinct. Number the ei such that
e1 = ℘ (w1 /2) , e2 = ℘ (w2 /2)
and
e3 = ℘ ((w1 + w2 ) /2) .
To summarize, it has been shown that for complex numbers, w1 and w2 with
w2 /w1 not real, an elliptic function, ℘ has been defined. Denote this function as
you see that if the two periods $w_1$ and $w_2$ are replaced with $tw_1$ and $tw_2$ respectively, then
$$e_i(tw_1, tw_2) = t^{-2} e_i(w_1, w_2).$$
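One way to see this scaling is directly from the series for ℘. The following is a sketch, written with the notation $\wp(z, w_1, w_2)$ mentioned earlier and assuming the numbering $e_1 = \wp(w_1/2)$ fixed above; the other two values scale the same way.

```latex
\wp(tz,\,tw_1,\,tw_2)
  =\frac{1}{(tz)^2}+\sum_{w\neq 0}\left(\frac{1}{(tz-tw)^2}-\frac{1}{(tw)^2}\right)
  =t^{-2}\,\wp(z,\,w_1,\,w_2),
% evaluating at the half period z = w_1/2:
e_1(tw_1,tw_2)=\wp\!\left(\tfrac{tw_1}{2},\,tw_1,\,tw_2\right)
  =t^{-2}\,\wp\!\left(\tfrac{w_1}{2},\,w_1,\,w_2\right)=t^{-2}e_1(w_1,w_2).
```

In particular any ratio of differences of the $e_i$ is unchanged by the scaling, so such a ratio depends only on $w_2/w_1$.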
Let τ denote the complex number which equals the ratio, w2 /w1 which was assumed
in all this to not be real. Then
This function is meromorphic for $\mathrm{Im}\,\tau > 0$ or for $\mathrm{Im}\,\tau < 0$. However, since the denominator is never equal to zero the function must actually be analytic on both the upper half plane and the lower half plane. It never is equal to 0 because $e_3 \neq e_2$ and it never equals 1 because $e_3 \neq e_1$. This is stated as an observation.
Observation 30.16 The function, λ (τ ) is analytic for τ in the upper half plane
and never assumes the values 0 and 1.
and the matrix is unimodular. By Theorem 30.13 on Page 826 $\{w_1', w_2'\}$ is just another basis for the same module of periods. Therefore, $\wp(z, w_1, w_2) = \wp(z, w_1', w_2')$ because both are defined as sums over the same values of $w$, just in different order which does not matter because of the absolute convergence of the sums on compact subsets of $\mathbb{C}$. Since ℘ is unchanged, it follows $\wp'(z)$ is also unchanged and so the
numbers, ei are also the same. However, they might be permuted in which case
the function λ (τ ) defined above would change. What would it take for λ (τ ) to
not change? In other words, for which unimodular transformations will λ be left
$$\frac{w_1'}{2} - \frac{w_1}{2} \in M, \qquad \frac{w_2'}{2} - \frac{w_2}{2} \in M$$
then $\wp\left(\frac{w_1}{2}\right) = \wp\left(\frac{w_1'}{2}\right)$ and so $e_1$ will be unchanged and similarly for $e_2$ and $e_3$.
This occurs exactly when
$$\frac{1}{2}\left((a - 1)w_1 + bw_2\right) \in M, \qquad \frac{1}{2}\left(cw_1 + (d - 1)w_2\right) \in M.$$
This happens if a and d are odd and if b and c are even. Of course the stylish way
to say this is
This has shown that for unimodular transformations satisfying 30.11 λ is unchanged.
Letting τ be defined as above,
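$$\tau' = \frac{w_2'}{w_1'} \equiv \frac{cw_1 + dw_2}{aw_1 + bw_2} = \frac{c + d\tau}{a + b\tau}.$$
Thus for unimodular transformations, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ satisfying 30.11, or more succinctly,
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \sim \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \mod 2 \tag{30.12}$$
it follows that
$$\lambda\left(\frac{c + d\tau}{a + b\tau}\right) = \lambda(\tau). \tag{30.13}$$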
Furthermore, this is the only way this can happen.
Proof: It only remains to verify that if $\wp(w_1'/2) = \wp(w_1/2)$ then it is necessary that
$$\frac{w_1'}{2} - \frac{w_1}{2} \in M$$
with a similar requirement for $w_2$ and $w_2'$. If $\frac{w_1'}{2} - \frac{w_1}{2} \notin M$, then there exist integers, $m, n$ such that
$$\frac{-w_1'}{2} + mw_1 + nw_2$$
Now consider what happens for some other unimodular matrices which are not
congruent to the identity mod 2. This will yield other functional equations for λ
in addition to the fact that λ is periodic of period 2. As before, these functional
equations come about because ℘ is unchanged when you change the basis for M,
the module of periods. In particular, consider the unimodular matrices
$$\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{30.15}$$
Consider the first of these. Thus
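$$\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = \begin{pmatrix} w_1 \\ w_1 + w_2 \end{pmatrix}$$
$$\begin{aligned}
\lambda(\tau') &= \lambda(1 + \tau) \\
&= \frac{\wp\left(\frac{w_1' + w_2'}{2}\right) - \wp\left(\frac{w_2'}{2}\right)}{\wp\left(\frac{w_1'}{2}\right) - \wp\left(\frac{w_2'}{2}\right)} \\
&= \frac{\wp\left(\frac{w_1 + w_2 + w_1}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)} \\
&= \frac{\wp\left(\frac{w_2}{2} + w_1\right) - \wp\left(\frac{w_1 + w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)} \\
&= \frac{\wp\left(\frac{w_2}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)} \\
&= -\frac{\wp\left(\frac{w_1 + w_2}{2}\right) - \wp\left(\frac{w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)} \\
&= -\frac{\wp\left(\frac{w_1 + w_2}{2}\right) - \wp\left(\frac{w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_2}{2}\right) + \wp\left(\frac{w_2}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)} \\
&= -\frac{\left(\dfrac{\wp\left(\frac{w_1 + w_2}{2}\right) - \wp\left(\frac{w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_2}{2}\right)}\right)}{1 + \left(\dfrac{\wp\left(\frac{w_2}{2}\right) - \wp\left(\frac{w_1 + w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_2}{2}\right)}\right)} \\
&= \frac{\left(\dfrac{\wp\left(\frac{w_1 + w_2}{2}\right) - \wp\left(\frac{w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_2}{2}\right)}\right)}{\left(\dfrac{\wp\left(\frac{w_1 + w_2}{2}\right) - \wp\left(\frac{w_2}{2}\right)}{\wp\left(\frac{w_1}{2}\right) - \wp\left(\frac{w_2}{2}\right)}\right) - 1} \\
&= \frac{\lambda(\tau)}{\lambda(\tau) - 1}.
\end{aligned} \tag{30.16}$$
$$\lambda(1 + \tau) = \frac{\lambda(\tau)}{\lambda(\tau) - 1}. \tag{30.17}$$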
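Next consider the other unimodular matrix in 30.15. In this case $w_1' = w_2$ and $w_2' = w_1$. Therefore, $\tau' = w_2'/w_1' = w_1/w_2 = 1/\tau$. Then
$$\begin{aligned}
\lambda(\tau') &= \lambda(1/\tau) \\
&= \frac{\wp\left(\frac{w_1' + w_2'}{2}\right) - \wp\left(\frac{w_2'}{2}\right)}{\wp\left(\frac{w_1'}{2}\right) - \wp\left(\frac{w_2'}{2}\right)} \\
&= \frac{\wp\left(\frac{w_1 + w_2}{2}\right) - \wp\left(\frac{w_1}{2}\right)}{\wp\left(\frac{w_2}{2}\right) - \wp\left(\frac{w_1}{2}\right)} \\
&= \frac{e_3 - e_1}{e_2 - e_1} = -\frac{e_3 - e_2 + e_2 - e_1}{e_1 - e_2} \\
&= -(\lambda(\tau) - 1) = -\lambda(\tau) + 1.
\end{aligned} \tag{30.18}$$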
You could try other unimodular matrices and attempt to find other functional
equations if you like but this much will suffice here.
Now this formula can be used to obtain a formula for λ (τ ) . As pointed out
above, λ depends only on the ratio $w_2/w_1$ and so it suffices to take $w_1 = 1$ and $w_2 = \tau$. Thus
$$\lambda(\tau) = \frac{\wp\left(\frac{1 + \tau}{2}\right) - \wp\left(\frac{\tau}{2}\right)}{\wp\left(\frac{1}{2}\right) - \wp\left(\frac{\tau}{2}\right)}. \tag{30.20}$$
From the original formula for ℘,
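$$\begin{aligned}
\wp\left(\frac{1 + \tau}{2}\right) - \wp\left(\frac{\tau}{2}\right)
&= \frac{1}{\left(\frac{1 + \tau}{2}\right)^2} - \frac{1}{\left(\frac{\tau}{2}\right)^2} + \sum_{(k,m) \neq (0,0)}\left[\frac{1}{\left(k - \frac{1}{2} + \left(m - \frac{1}{2}\right)\tau\right)^2} - \frac{1}{\left(k + \left(m - \frac{1}{2}\right)\tau\right)^2}\right] \\
&= \sum_{(k,m) \in \mathbb{Z}^2}\left[\frac{1}{\left(k - \frac{1}{2} + \left(m - \frac{1}{2}\right)\tau\right)^2} - \frac{1}{\left(k + \left(m - \frac{1}{2}\right)\tau\right)^2}\right] \\
&= \sum_{(k,m) \in \mathbb{Z}^2}\left[\frac{1}{\left(k - \frac{1}{2} + \left(-m - \frac{1}{2}\right)\tau\right)^2} - \frac{1}{\left(k + \left(-m - \frac{1}{2}\right)\tau\right)^2}\right] \\
&= \sum_{(k,m) \in \mathbb{Z}^2}\left[\frac{1}{\left(\frac{1}{2} + \left(m + \frac{1}{2}\right)\tau - k\right)^2} - \frac{1}{\left(\left(m + \frac{1}{2}\right)\tau - k\right)^2}\right].
\end{aligned} \tag{30.21}$$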
Similarly,
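$$\begin{aligned}
\wp\left(\frac{1}{2}\right) - \wp\left(\frac{\tau}{2}\right)
&= \frac{1}{\left(\frac{1}{2}\right)^2} - \frac{1}{\left(\frac{\tau}{2}\right)^2} + \sum_{(k,m) \neq (0,0)}\left[\frac{1}{\left(k - \frac{1}{2} + m\tau\right)^2} - \frac{1}{\left(k + \left(m - \frac{1}{2}\right)\tau\right)^2}\right] \\
&= \sum_{(k,m) \in \mathbb{Z}^2}\left[\frac{1}{\left(k - \frac{1}{2} + m\tau\right)^2} - \frac{1}{\left(k + \left(m - \frac{1}{2}\right)\tau\right)^2}\right] \\
&= \sum_{(k,m) \in \mathbb{Z}^2}\left[\frac{1}{\left(k - \frac{1}{2} - m\tau\right)^2} - \frac{1}{\left(k + \left(-m - \frac{1}{2}\right)\tau\right)^2}\right] \\
&= \sum_{(k,m) \in \mathbb{Z}^2}\left[\frac{1}{\left(\frac{1}{2} + m\tau - k\right)^2} - \frac{1}{\left(\left(m + \frac{1}{2}\right)\tau - k\right)^2}\right].
\end{aligned} \tag{30.22}$$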
and
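$$\wp\left(\frac{1}{2}\right) - \wp\left(\frac{\tau}{2}\right) = \sum_m\left[\frac{\pi^2}{\sin^2\left(\pi\left(\frac{1}{2} + m\tau\right)\right)} - \frac{\pi^2}{\sin^2\left(\pi\left(m + \frac{1}{2}\right)\tau\right)}\right] = \sum_m\left[\frac{\pi^2}{\cos^2(\pi m\tau)} - \frac{\pi^2}{\sin^2\left(\pi\left(m + \frac{1}{2}\right)\tau\right)}\right].$$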
The following interesting formula for λ results.
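$$\lambda(\tau) = \frac{\sum_m\left[\dfrac{1}{\cos^2\left(\pi\left(m + \frac{1}{2}\right)\tau\right)} - \dfrac{1}{\sin^2\left(\pi\left(m + \frac{1}{2}\right)\tau\right)}\right]}{\sum_m\left[\dfrac{1}{\cos^2(\pi m\tau)} - \dfrac{1}{\sin^2\left(\pi\left(m + \frac{1}{2}\right)\tau\right)}\right]}. \tag{30.23}$$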
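Formula 30.23 converges quickly, so it lends itself to a numerical check of the functional equations 30.17 and 30.18. The following Python sketch is illustrative: the function name `modular_lambda`, the truncation parameter `M`, and the sample value of τ are assumptions, and the sums are cut off symmetrically rather than taken over all integers.

```python
import cmath

def modular_lambda(tau, M=20):
    """Evaluate formula 30.23 with both sums truncated symmetrically.
    Intended for moderate |Im tau|, so that no overflow occurs."""
    num = 0j
    den = 0j
    for m in range(-M, M):                 # m + 1/2 runs over -M+1/2, ..., M-1/2
        h = cmath.pi * (m + 0.5) * tau
        inv_sin2 = 1 / cmath.sin(h) ** 2
        num += 1 / cmath.cos(h) ** 2 - inv_sin2
        den -= inv_sin2
    for m in range(-M, M + 1):             # the 1/cos^2(pi m tau) terms
        den += 1 / cmath.cos(cmath.pi * m * tau) ** 2
    return num / den

tau = 0.2 + 1.0j                            # hypothetical sample point
lam = modular_lambda(tau)
shifted = modular_lambda(tau + 1)           # compare with lam / (lam - 1)
inverted = modular_lambda(1 / tau)          # compare with 1 - lam
on_l1 = modular_lambda(2.0j)                # a point on the y axis
```

The values `shifted` and `inverted` should agree with $\lambda/(\lambda - 1)$ and $1 - \lambda$ up to rounding, and `on_l1` should be real and between 0 and 1.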
[Figure: the lines $l_1$ and $l_2$ and the semicircle $C$, with the points $\frac{1}{2}$ and $1$ marked on the real axis.]
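In this picture, $l_1$ is the $y$ axis and $l_2$ is the line, $x = 1$ while $C$ is the top half of the circle centered at $\left(\frac{1}{2}, 0\right)$ which has radius 1/2. Note the above formula implies λ has real values on $l_1$ which are between 0 and 1. This is because 30.23 implies
$$\lambda(ib) = \frac{\sum_m\left[\dfrac{1}{\cos^2\left(\pi\left(m + \frac{1}{2}\right)ib\right)} - \dfrac{1}{\sin^2\left(\pi\left(m + \frac{1}{2}\right)ib\right)}\right]}{\sum_m\left[\dfrac{1}{\cos^2(\pi m ib)} - \dfrac{1}{\sin^2\left(\pi\left(m + \frac{1}{2}\right)ib\right)}\right]} = \frac{\sum_m\left[\dfrac{1}{\cosh^2\left(\pi\left(m + \frac{1}{2}\right)b\right)} + \dfrac{1}{\sinh^2\left(\pi\left(m + \frac{1}{2}\right)b\right)}\right]}{\sum_m\left[\dfrac{1}{\cosh^2(\pi m b)} + \dfrac{1}{\sinh^2\left(\pi\left(m + \frac{1}{2}\right)b\right)}\right]} \in (0, 1). \tag{30.28}$$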
it follows
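$$\lambda(\tau) = \frac{\dfrac{1}{\cos^2\left(\pi\left(-\frac{1}{2}\right)\tau\right)} - \dfrac{1}{\sin^2\left(\pi\left(-\frac{1}{2}\right)\tau\right)} + \dfrac{1}{\cos^2\left(\pi\frac{1}{2}\tau\right)} - \dfrac{1}{\sin^2\left(\pi\frac{1}{2}\tau\right)} + A(\tau)}{1 + B(\tau)} = \frac{\dfrac{2}{\cos^2\left(\pi\left(\frac{1}{2}\right)\tau\right)} - \dfrac{2}{\sin^2\left(\pi\left(\frac{1}{2}\right)\tau\right)} + A(\tau)}{1 + B(\tau)} \tag{30.30}$$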
where $A(\tau), B(\tau) \to 0$ as $\mathrm{Im}(\tau) \to \infty$. I took out the $m = 0$ term involving $1/\cos^2(\pi m\tau)$ in the denominator and the $m = -1$ and $m = 0$ terms in the numerator of 30.29. In fact, $e^{-i\pi(a+ib)}A(a + ib)$, $e^{-i\pi(a+ib)}B(a + ib)$ converge to zero uniformly in $a$ as $b \to \infty$.
$$\cos(\alpha_m a + i\alpha_m b) = \cos(\alpha_m a)\cosh(\alpha_m b) - i\sinh(\alpha_m b)\sin(\alpha_m a)$$
$$\sin(\alpha_m a + i\alpha_m b) = \sin(\alpha_m a)\cosh(\alpha_m b) + i\cos(\alpha_m a)\sinh(\alpha_m b)$$
Therefore,
$$\left|\cos^2(\alpha_m a + i\alpha_m b)\right| = \cos^2(\alpha_m a)\cosh^2(\alpha_m b) + \sinh^2(\alpha_m b)\sin^2(\alpha_m a) \geq \sinh^2(\alpha_m b).$$
Similarly,
$$\left|\sin^2(\alpha_m a + i\alpha_m b)\right| = \sin^2(\alpha_m a)\cosh^2(\alpha_m b) + \cos^2(\alpha_m a)\sinh^2(\alpha_m b) \geq \sinh^2(\alpha_m b).$$
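$$\left|e^{-i\pi\tau}A(\tau)\right| \leq 4\,\frac{e^{\pi b}}{\sinh\left(\frac{3\pi b}{2}\right)}\sum_{m=1}^{\infty}\left(\frac{1}{e^{3\pi b}}\right)^m \leq 4\,\frac{e^{\pi b}}{\sinh\left(\frac{3\pi b}{2}\right)}\,\frac{1/e^{3\pi b}}{1 - \left(1/e^{3\pi b}\right)}$$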
which converges to zero as b → ∞. Similar reasoning will establish the claim about
B (τ ) . This proves the lemma.
Proof: From 30.30 and Lemma 30.20, this lemma will be proved if it is shown
$$\lim_{b \to \infty}\left(\frac{2}{\cos^2\left(\pi\frac{1}{2}(a + ib)\right)} - \frac{2}{\sin^2\left(\pi\frac{1}{2}(a + ib)\right)}\right)e^{-i\pi(a+ib)} = 16$$
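Now
$$\left|\frac{1 + e^{2\pi i\tau}}{\left(1 - e^{2\pi i\tau}\right)^2} - 1\right| = \left|\frac{1 + e^{2\pi i\tau}}{\left(1 - e^{2\pi i\tau}\right)^2} - \frac{\left(1 - e^{2\pi i\tau}\right)^2}{\left(1 - e^{2\pi i\tau}\right)^2}\right| \leq \frac{\left|3e^{2\pi i\tau} - e^{4\pi i\tau}\right|}{\left(1 - e^{-2\pi b}\right)^2} \leq \frac{3e^{-2\pi b} + e^{-4\pi b}}{\left(1 - e^{-2\pi b}\right)^2}$$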
$$|\lambda(a + ib)| \leq 17e^{-\pi b}.$$
by Corollary 30.22.
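The limit 16 that drives these estimates can be observed numerically. The sketch below is an illustration only; the function name `leading_term` and the sample points are assumptions. It evaluates the leading part of the numerator in 30.30, multiplied by $e^{-i\pi\tau}$, for increasing imaginary part.

```python
import cmath

def leading_term(tau):
    """(2/cos^2(pi*tau/2) - 2/sin^2(pi*tau/2)) * exp(-i*pi*tau)."""
    h = cmath.pi * tau / 2
    bracket = 2 / cmath.cos(h) ** 2 - 2 / cmath.sin(h) ** 2
    return bracket * cmath.exp(-1j * cmath.pi * tau)

# Sample points a + ib with a = 0.3 fixed and b increasing.
values = [leading_term(complex(0.3, b)) for b in (1.0, 2.0, 5.0)]
errors = [abs(v - 16) for v in values]
```

The entries of `errors` should decrease rapidly, which is consistent with $\lambda(a + ib)$ behaving like $16e^{i\pi(a+ib)}$ for large $b$.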
Next consider the behavior of λ on line l2 in the above picture. From 30.17 and
30.28,
$$\lambda(1 + ib) = \frac{\lambda(ib)}{\lambda(ib) - 1} < 0$$
It follows λ is real on the boundary of Ω in the above picture. This proves the
corollary.
Now, following Ahlfors [2], cut off Ω by considering the horizontal line segment, $z = a + ib_0$ where $b_0$ is very large and positive and $a \in [0, 1]$. Also cut Ω off by the images of this horizontal line, under the transformations $z = \frac{1}{\tau}$ and $z = 1 - \frac{1}{\tau}$. These are arcs of circles because the two transformations are fractional linear transformations. It is left as an exercise for you to verify these arcs are situated as
shown in the following picture. The important thing to notice is that for b0 large the
points of these circles are close to the origin and (1, 0) respectively. The following
picture is a summary of what has been obtained so far on the mapping by λ.
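[Figure: the truncated region Ω, bounded by $l_1$, the segment $z = a + ib_0$, $l_2$, and the arcs $C_1$, $C$, $C_2$, with the points $\frac{1}{2}$ and $1$ marked on the real axis. Labels for the values of λ: small, real, positive near the top of $l_1$; small, real, negative near the top of $l_2$; near 1 and real at $C_1$; large, real, negative at $C_2$.]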
In the picture, the descriptions are of λ acting on points of the indicated bound-
ary of Ω. Consider the oriented contour which results from λ (z) as z moves first up
l2 as indicated, then along the line z = a + ib and then down l1 and then along C1
to C and along C till C2 and then along C2 to l2 . As indicated in the picture, this
involves going from a large negative real number to a small negative real number
and then over a smooth curve which stays small to a real positive number and from
there to a real number near 1. λ (z) stays fairly near 1 on C1 provided b0 is large
so that the circle, C1 has very small radius. Then along C, λ (z) is real until it hits
$C_2$. What about the behavior of λ on $C_2$? For $z \in C_2$, it follows from the definition of $C_2$ that $z = 1 - \frac{1}{\tau}$ where τ is on the line, $a + ib_0$. Therefore, by Lemma 30.21,
$$1 - \frac{1}{16e^{i\pi(a + ib_0)}} = 1 - \frac{e^{\pi b_0}e^{-ia\pi}}{16}.$$
These points are essentially on a large half circle in the upper half plane which has radius approximately $\frac{e^{\pi b_0}}{16}$.
Now let $w \in \mathbb{C}$ with $\mathrm{Im}(w) \neq 0$. Then for $b_0$ large enough, the motion over the
boundary of the truncated region indicated in the above picture results in λ tracing
out a large simple closed curve oriented in the counter clockwise direction which
includes w on its interior if Im (w) > 0 but which excludes w if Im (w) < 0.
Theorem 30.23 Let Ω be the domain described above. Then λ maps Ω one to one
and onto the upper half plane of C, {z ∈ C such that Im (z) > 0} . Also, the line
λ (l1 ) = (0, 1) , λ (l2 ) = (−∞, 0) , and λ (C) = (1, ∞).
Proof: Let Im (w) > 0 and denote by γ the oriented contour described above
and illustrated in the above picture. Then the winding number of λ ◦ γ about w
equals 1. Thus
$$\frac{1}{2\pi i}\int_{\lambda \circ \gamma}\frac{1}{z - w}\,dz = 1.$$
But, splitting the contour integrals into $l_2$, the top line, $l_1$, $C_1$, $C$, and $C_2$ and changing variables on each of these, yields
$$1 = \frac{1}{2\pi i}\int_{\gamma}\frac{\lambda'(z)}{\lambda(z) - w}\,dz$$
and by the theorem on counting zeros, Theorem 25.20 on Page 694, the function,
z → λ (z) − w has exactly one zero inside the truncated Ω. However, this shows
this function has exactly one zero inside Ω because b0 was arbitrary as long as it
is sufficiently large. Since w was an arbitrary element of the upper half plane, this
verifies the first assertion of the theorem. The remaining claims follow from the
above description of λ, in particular the estimate for λ on C2 . This proves the
theorem.
Note also that the argument in the above proof shows that if Im (w) < 0, then
w is not in λ (Ω) . However, if you consider the reflection of Ω about the y axis,
then it will follow that λ maps this set one to one onto the lower half plane. The
argument will make significant use of Theorem 25.22 on Page 696 which is stated
here for convenience.
where $g(z) \neq 0$ in $B(a, R)$. ($f(z) - \alpha$ has a zero of order $m$ at $z = a$.) Then there exist $\varepsilon, \delta > 0$ with the property that for each $z$ satisfying $0 < |z - \alpha| < \delta$, there exist points,
$$\{a_1, \cdots, a_m\} \subseteq B(a, \varepsilon),$$
such that
$$f^{-1}(z) \cap B(a, \varepsilon) = \{a_1, \cdots, a_m\}$$
and each $a_k$ is a zero of order 1 for the function $f(\cdot) - z$.
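Corollary 30.25 Let Ω be the region above. Consider the set of points, $Q = \Omega \cup \Omega' \setminus \{0, 1\}$ described by the following picture.
[Figure: the regions Ω′ and Ω on either side of $l_1$, with $C$ and $l_2$ bounding Ω and the points $-1$, $\frac{1}{2}$, and $1$ marked on the real axis.]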
Proof: By Theorem 30.23, this will be proved if it can be shown that $\lambda(\Omega') = \{z \in \mathbb{C} : \mathrm{Im}(z) < 0\}$. Consider $\lambda_1$ defined on $\Omega'$ by
Claim: $\lambda_1$ is analytic.
Proof of the claim: You just verify the Cauchy Riemann equations. Letting
$$\lambda(x + iy) = u(x, y) + iv(x, y),$$
Then $u_{1x}(x, y) = -u_x(-x, y)$ and $v_{1y}(x, y) = -v_y(-x, y) = -u_x(-x, y)$ since λ is analytic. Thus $u_{1x} = v_{1y}$. Next, $u_{1y}(x, y) = u_y(-x, y)$ and $v_{1x}(x, y) = v_x(-x, y) = -u_y(-x, y)$ and so $u_{1y} = -v_{1x}$.
Now recall that on $l_1$, λ takes real values. Therefore, $\lambda_1 = \lambda$ on $l_1$, a set with a limit point. It follows $\lambda = \lambda_1$ on $\Omega' \cup \Omega$. By Theorem 30.23 λ maps Ω one to one onto the upper half plane. Therefore, from the definition of $\lambda_1 = \lambda$, it follows λ maps $\Omega'$ one to one onto the lower half plane as claimed. This has shown that λ is one to one on $\Omega \cup \Omega'$. This also verifies from Theorem 25.22 on Page 696 that $\lambda' \neq 0$ on $\Omega \cup \Omega'$.
Now consider the lines $l_2$ and $C$. If $\lambda'(z) = 0$ for $z \in l_2$, a contradiction can be obtained. Pick such a point. If $\lambda'(z) = 0$, then $z$ is a zero of order $m \geq 2$ of the function, $\lambda - \lambda(z)$. Then by Theorem 25.22 there exist $\delta, \varepsilon > 0$ such that if $w \in B(\lambda(z), \delta)$, then $\lambda^{-1}(w) \cap B(z, \varepsilon)$ contains at least $m$ points.
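[Figure: the regions Ω′ and Ω with $l_1$, $C$, $l_2$ and the points $-1$, $\frac{1}{2}$, $1$ on the real axis; a point $z$ on $l_2$ with the ball $B(z, \varepsilon)$ and a nearby point $z_1$, together with their images $\lambda(z)$ and $\lambda(z_1)$ and the ball $B(\lambda(z), \delta)$.]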
Lemma 30.26 If $\mathrm{Im}(\tau) > 0$ then there exists a unimodular $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ such that
$$\frac{c + d\tau}{a + b\tau}$$
is contained in the interior of $Q$. In fact, $\left|\frac{c + d\tau}{a + b\tau}\right| \geq 1$ and
$$-1/2 \leq \mathrm{Re}\left(\frac{c + d\tau}{a + b\tau}\right) \leq 1/2.$$
Proof: Letting a basis for the module of periods of ℘ be $\{1, \tau\}$, it follows from Theorem 30.3 on Page 820 that there exists a basis for the same module of periods, $\{w_1', w_2'\}$ with the property that for $\tau' = w_2'/w_1'$
$$|\tau'| \geq 1, \qquad \frac{-1}{2} \leq \mathrm{Re}\,\tau' \leq \frac{1}{2}.$$
Since this is a basis for the same module of periods, there exists a unimodular matrix, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ such that
$$\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ \tau \end{pmatrix}.$$
Hence,
$$\tau' = \frac{w_2'}{w_1'} = \frac{c + d\tau}{a + b\tau}.$$
Thus $\tau'$ is in the interior of $H$. In fact, it is on the interior of $\Omega' \cup \Omega \equiv Q$.
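The proof above obtains the matrix from Theorem 30.3. For comparison, a concrete matrix can also be computed by the standard reduction that alternates integer shifts with the inversion $\tau \to -1/\tau$. This is a different, swapped-in technique rather than the argument of the text, and the function name, the step limit, and the sample point in the Python sketch below are assumptions.

```python
def reduce_to_fundamental_domain(tau, max_steps=1000):
    """Return (tau2, (p, q, r, s)) with tau2 = (p*tau + q) / (r*tau + s),
    p*s - q*r == 1, |Re tau2| <= 1/2 and |tau2| >= 1 (up to rounding)."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half plane")
    p, q, r, s = 1, 0, 0, 1
    for _ in range(max_steps):
        n = round(tau.real)            # shift so that |Re tau| <= 1/2
        tau -= n
        p, q = p - n * r, q - n * s
        if abs(tau) >= 1:
            break
        tau = -1 / tau                 # inversion raises Im tau when |tau| < 1
        p, q, r, s = -r, -s, p, q
    return tau, (p, q, r, s)

tau = 0.37 + 0.05j                     # hypothetical starting point
tau2, (p, q, r, s) = reduce_to_fundamental_domain(tau)
# In the notation of Lemma 30.26 this corresponds to a = s, b = r, c = q, d = p.
```

Each shift and each inversion has determinant 1, so the accumulated integer matrix is unimodular, and the returned point satisfies the two inequalities of the lemma.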
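[Figure: the points τ and τ′ in the upper half plane above the real axis, on which $-1$, $-1/2$, $0$, $1/2$, and $1$ are marked.]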
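$$\lambda(\tau') = \frac{e_2 - e_3}{e_1 - e_3} = \frac{e_3 - e_2}{e_3 - e_2 - (e_1 - e_2)} = \frac{1}{1 - \frac{1}{\lambda(\tau)}} = \frac{\lambda(\tau)}{\lambda(\tau) - 1} \tag{30.38}$$
$$\lambda(\tau') = \frac{e_1 - e_2}{e_3 - e_2} = \frac{1}{\lambda(\tau)} \tag{30.39}$$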
Then the main theorem is the monodromy theorem listed next, Theorem 27.19
and its corollary on Page 749.
Theorem 30.29 Let Ω be a simply connected subset of C and suppose (f, B (a, r))
is a function element with B (a, r) ⊆ Ω. Suppose also that this function element can
be analytically continued along every curve through a. Then there exists G analytic
on Ω such that G agrees with f on B (a, r).
Lemma 30.30 Let λ be the modular function defined on P+ the upper half plane.
Let V be a simply connected region in C and let f : V → C\ {0, 1} be analytic
and nonconstant. Then there exists an analytic function, g : V → P+ such that
λ ◦ g = f.
Proof: Let $a \in V$ and choose $r_0$ small enough that $f(B(a, r_0))$ contains neither 0 nor 1. You need only let $B(a, r_0) \subseteq V$. Now there exists a unique point in $Q$, $\tau_0$ such that $\lambda(\tau_0) = f(a)$. By Corollary 30.25, $\lambda'(\tau_0) \neq 0$ and so by the open mapping theorem, Theorem 25.22 on Page 696, there exists $B(\tau_0, R_0) \subseteq P_+$ such that λ is one to one on $B(\tau_0, R_0)$ and has a continuous inverse. Then picking $r_0$ still smaller, it can be assumed $f(B(a, r_0)) \subseteq \lambda(B(\tau_0, R_0))$. Thus there exists a local inverse for λ, $\lambda_0^{-1}$ defined on $f(B(a, r_0))$ having values in $B(\tau_0, R_0) \cap \lambda^{-1}(f(B(a, r_0)))$. Then defining $g_0 \equiv \lambda_0^{-1} \circ f$, $(g_0, B(a, r_0))$ is a function element. I need to show this can be continued along every curve starting at $a$ in such a way that each function in each function element has values in $P_+$.
Let γ : [α, β] → V be a continuous curve starting at a, (γ (α) = a) and sup-
pose that if t < T there exists a nonnegative integer m and a function element
(gm , B (γ (t) , rm )) which is an analytic continuation of (g0 , B (a, r0 )) along γ where
gm (γ (t)) ∈ P+ and each function in every function element for j ≤ m has values
in P+ . Thus for some small T > 0 this has been achieved.
Then consider $f(\gamma(T)) \in \mathbb{C} \setminus \{0, 1\}$. As in the first part of the argument, there exists a unique $\tau_T \in Q$ such that $\lambda(\tau_T) = f(\gamma(T))$ and for $r$ small enough there is an analytic local inverse, $\lambda_T^{-1}$ between $f(B(\gamma(T), r))$ and $\lambda^{-1}(f(B(\gamma(T), r))) \cap B(\tau_T, R_T) \subseteq P_+$ for some $R_T > 0$. By the assumption that the analytic continuation can be carried out for $t < T$, there exists $\{t_0, \cdots, t_m = t\}$ and function elements $(g_j, B(\gamma(t_j), r_j))$, $j = 0, \cdots, m$ as just described with $g_j(\gamma(t_j)) \in P_+$, $\lambda \circ g_j = f$ on $B(\gamma(t_j), r_j)$ such that for $t \in [t_m, T]$, $\gamma(t) \in B(\gamma(T), r)$. Let
$$I = B(\gamma(t_m), r_m) \cap B(\gamma(T), r).$$
Pick $z_0 \in I$. Then by Lemma 30.19 on Page 840 there exists a unimodular mapping of the form
$$\phi(z) = \frac{az + b}{cz + d}$$
where
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \sim \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \mod 2$$
such that
$$g_m(z_0) = \phi\left(\lambda_T^{-1} \circ f(z_0)\right).$$
Since both $g_m(z_0)$ and $\phi\left(\lambda_T^{-1} \circ f(z_0)\right)$ are in the upper half plane, it follows $ad - cb = 1$ and φ maps the upper half plane to the upper half plane. Note the pole of φ is real and all the sets being considered are contained in the upper half plane so φ is analytic where it needs to be.
Claim: For all $z \in I$,
$$g_m(z) = \phi \circ \lambda_T^{-1} \circ f(z). \tag{30.40}$$
and so $\phi \circ \lambda_T^{-1}$ is a local inverse for λ on $f(B(\gamma(T), r))$. Let the new function element be $\left(\overbrace{\phi \circ \lambda_T^{-1} \circ f}^{g_{m+1}}, B(\gamma(T), r)\right)$. This has shown the initial function element
Thus h is an entire function which misses the two values 0 and 1. If h is not constant,
then by Lemma 30.30 there exists a function, g analytic on C which has values in
the upper half plane, P+ such that λ ◦ g = h. However, g must be a constant
because there exists ψ an analytic map on the upper half plane which maps the
upper half plane to $B(0, 1)$. You can use the Riemann mapping theorem or more simply, $\psi(z) = \frac{z - i}{z + i}$. Thus $\psi \circ g$ equals a constant by Liouville's theorem. Hence
g is a constant and so h must also be a constant because λ (g (z)) = h (z) . This
proves f is a constant also. This proves the theorem.
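The map $\psi(z) = \frac{z - i}{z + i}$ used above sends the upper half plane into the unit disk because $|z - i| < |z + i|$ exactly when $\mathrm{Im}(z) > 0$. A small Python sketch, with an arbitrary and purely illustrative choice of sample points, spot-checks this.

```python
def cayley(z):
    """The map psi(z) = (z - i) / (z + i) from the proof above."""
    return (z - 1j) / (z + 1j)

# Sample points in the upper half plane (the choice of grid is arbitrary).
samples = [complex(x, y)
           for x in (-50.0, -1.0, 0.0, 0.5, 7.0)
           for y in (1e-3, 0.2, 1.0, 40.0)]
moduli = [abs(cayley(z)) for z in samples]
```

Every entry of `moduli` should be strictly less than 1, and $\psi(i) = 0$.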
30.3 Exercises
1. Show the set of modular transformations is a group. Also show those modular
transformations which are congruent mod 2 to the identity as described above
is a subgroup.
2. Suppose f is an elliptic function with period module M. If {w1 , w2 } and
{w10 , w20 } are two bases, show that the resulting period parallelograms resulting
from the two bases have the same area.
3. Given a module of periods with basis {w1 , w2 } and letting a typical element
of this module be denoted by w as described above, consider the product
$$\sigma(z) \equiv z\prod_{w \neq 0}\left(1 - \frac{z}{w}\right)e^{(z/w) + \frac{1}{2}(z/w)^2}.$$
Show that $\frac{1}{2\pi i}\int_{\partial P_a}\zeta(z)\,dz = 1$ where the contour is taken once around the
parallelogram in the counter clockwise direction. Next evaluate this contour
integral directly to obtain Legendre’s relation,
η 1 w2 − η 2 w1 = 2πi.
7. Show any even elliptic function, f with periods w1 and w2 for which 0 is
neither a pole nor a zero can be expressed in the form
$$f(0)\prod_{k=1}^{n}\frac{\wp(z) - \wp(a_k)}{\wp(z) - \wp(b_k)}$$
Hint: You might try something like this: By Theorem 30.9, it follows that if $\{\alpha_k\}$ are the zeros and $\{b_k\}$ the poles in an appropriate period parallelogram, $\sum \alpha_k - \sum b_k$ equals a period. Replace $\alpha_k$ with $a_k$ such that $\sum a_k - \sum b_k = 0$. Then use 30.41 to show that the given formula for $f$ is bi periodic. Anyway, you try to arrange things such that the given formula has the same poles as $f$. Remember an entire elliptic function equals a constant.
9. Show that the map $\tau \to 1 - \frac{1}{\tau}$ maps $l_2$ onto the curve, $C$ in the above picture
on the mapping properties of λ.
10. Modify the proof of Theorem 30.23 to show that λ (Ω)∩{z ∈ C : Im (z) < 0} =
∅.
Part IV
Stochastic Processes, An
Introduction
Random Variables And Basic
Probability
Caution: This material on probability and stochastic processes may be half baked
in places. I have not yet rewritten it several times. This is not to say that nothing
else is half baked. However, the probability is higher here.
Recall Lemma 11.3 on Page 305 which is stated here for convenience.
Lemma 31.1 Let M be a metric space with the closed balls compact and suppose
λ is a measure defined on the Borel sets of M which is finite on compact sets.
Then there exists a unique Radon measure, $\overline{\lambda}$, which equals λ on the Borel sets. In
particular λ must be both inner and outer regular on all Borel sets.
Also recall from earlier the following fundamental result which is called the Borel
Cantelli lemma.
Lemma 31.2 Let (Ω, F, λ) be a measure space and let {Ai } be a sequence of mea-
surable sets satisfying
$$\sum_{i=1}^{\infty}\lambda(A_i) < \infty.$$
Then letting S denote the set of ω ∈ Ω which are in infinitely many Ai , it follows
S is a measurable set and λ (S) = 0.
Proof: $S = \cap_{k=1}^{\infty}\cup_{m=k}^{\infty}A_m$. Therefore, $S$ is measurable and also
$$\lambda(S) \leq \lambda\left(\cup_{m=k}^{\infty}A_m\right) \leq \sum_{m=k}^{\infty}\lambda(A_m)$$
858 RANDOM VARIABLES AND BASIC PROBABILITY
Thus $\mathcal{H}_X \subseteq \mathcal{F}$. This is also often written as $\sigma(X)$. For $E$ a Borel set in $\mathbb{R}^p$ define
$$\lambda_X(E) \equiv P\left(X^{-1}(E)\right).$$
then define
$$E(X) \equiv \int_{\Omega} X\,dP$$
Lemma 31.4 For $X$ a random vector defined above, $\lambda_X$ is inner and outer regular, Borel, and its completion is a Radon measure and $\lambda_X(\mathbb{R}^p) = 1$. Furthermore, if $h$ is any bounded Borel measurable function,
$$\int_{\Omega} h(X(\omega))\,dP = \int_{\mathbb{R}^p} h(x)\,d\lambda_X.$$
Proof: The assertions about λX follow from Lemma 11.3 on Page 305 listed
above. It remains to verify the formula involving the integrals. Suppose first
Similarly, if $h$ is any Borel simple function, the same result will hold. For an arbitrary bounded Borel function, $h$, there exists a sequence of Borel simple functions, $\{s_n\}$ converging to $h$. Hence, by the dominated convergence theorem,
$$\int_{\Omega} h(X(\omega))\,dP = \lim_{n \to \infty}\int_{\Omega} s_n(X(\omega))\,dP = \lim_{n \to \infty}\int_{\mathbb{R}^p} s_n(x)\,d\lambda_X = \int_{\mathbb{R}^p} h(x)\,d\lambda_X.$$
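On a finite probability space the change of variables formula of Lemma 31.4 reduces to regrouping a finite sum, which can be checked exactly. The Python sketch below is illustrative only: the two-dice sample space, the random vector `X`, and the function `h` are hypothetical choices, not taken from the text.

```python
from collections import defaultdict
from fractions import Fraction

# Sample space: two fair dice, each outcome with probability 1/36.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}

def X(w):
    """A random vector with values in R^2 (sum and product of the dice)."""
    return (w[0] + w[1], w[0] * w[1])

def distribution(X, P):
    """The measure lambda_X(E) = P(X^{-1}(E)), stored by its point masses."""
    lam = defaultdict(Fraction)
    for w, p in P.items():
        lam[X(w)] += p
    return dict(lam)

def h(x):
    """A bounded Borel function on the range of X."""
    return x[0] ** 2 - x[1]

lam_X = distribution(X, P)
lhs = sum(h(X(w)) * p for w, p in P.items())    # integral of h(X) dP
rhs = sum(h(x) * q for x, q in lam_X.items())   # integral of h d(lambda_X)
```

Exact rational arithmetic is used so that the two sides can be compared with equality rather than a tolerance.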
Definition 31.5 A finite set of random vectors, $\{X_k\}_{k=1}^n$ is independent if whenever $F_k \in \mathcal{H}_{X_k}$ ($\sigma(X_k)$),
$$P\left(\cap_{k=1}^n F_k\right) = \prod_{k=1}^n P(F_k).$$
More generally, if $\{\mathcal{F}_i\}_{i \in I}$ is any set of σ algebras, they are said to be independent if whenever $A_{i_k} \in \mathcal{F}_{i_k}$ for $k = 1, 2, \cdots, m$, then
$$P\left(\cap_{k=1}^m A_{i_k}\right) = \prod_{k=1}^m P(A_{i_k}).$$
Lemma 31.6 If $\{X_k\}_{k=1}^r$ are independent and if $g_k$ is a Borel measurable function, then $\{g_k(X_k)\}_{k=1}^r$ is also independent. Furthermore, if the random variables have values in $\mathbb{R}$ and they are all bounded, then
$$E\left(\prod_{i=1}^r X_i\right) = \prod_{i=1}^r E(X_i).$$
Proof: First consider the claim about $\{g_k(X_k)\}_{k=1}^{r}$. Letting $O$ be an open set
in $\mathbb{R}$,
$$(g_k \circ X_k)^{-1}(O) = X_k^{-1}\left(g_k^{-1}(O)\right) = X_k^{-1}(\text{Borel set}) \in \mathcal{H}_{X_k}.$$
It follows $(g_k \circ X_k)^{-1}(E)$ is in $\mathcal{H}_{X_k}$ whenever $E$ is Borel. Thus $\mathcal{H}_{g_k \circ X_k} \subseteq \mathcal{H}_{X_k}$ and
this proves the first part of the lemma.
Now let $\{s_n^i\}_{n=1}^{\infty}$ be a bounded sequence of simple functions measurable in $\mathcal{H}_{X_i}$
which converges to $X_i$ uniformly. (Since $X_i$ is bounded, such a sequence exists by
breaking $X_i$ into positive and negative parts and using Theorem 8.27 on Page 190.)
Say
$$s_n^i(\omega) = \sum_{k=1}^{m_n} c_k^{n,i}\,\mathcal{X}_{E_k^{n,i}}(\omega)$$
where the Ek are disjoint elements of HXi and some might be empty. This is for
convenience in keeping the same index on the top of the sum. Then since all the
random variables are bounded, there is no problem about existence of any of the
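\begin{align*}
&= \lim_{n\to\infty}\int_{\Omega}\sum_{k_1,k_2,\cdots,k_r} c_{k_1}^{n,1}c_{k_2}^{n,2}\cdots c_{k_r}^{n,r}\,\mathcal{X}_{E_{k_1}^{n,1}}\mathcal{X}_{E_{k_2}^{n,2}}\cdots\mathcal{X}_{E_{k_r}^{n,r}}\,dP\\
&= \lim_{n\to\infty}\sum_{k_1,k_2,\cdots,k_r} c_{k_1}^{n,1}c_{k_2}^{n,2}\cdots c_{k_r}^{n,r}\int_{\Omega}\mathcal{X}_{E_{k_1}^{n,1}}\mathcal{X}_{E_{k_2}^{n,2}}\cdots\mathcal{X}_{E_{k_r}^{n,r}}\,dP\\
&= \lim_{n\to\infty}\sum_{k_1,k_2,\cdots,k_r} c_{k_1}^{n,1}c_{k_2}^{n,2}\cdots c_{k_r}^{n,r}\prod_{i=1}^{r}P\left(E_{k_i}^{n,i}\right)\\
&= \lim_{n\to\infty}\prod_{i=1}^{r}\int_{\Omega}s_n^i(\omega)\,dP = \prod_{i=1}^{r}E(X_i).
\end{align*}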
Recall the following fundamental lemma and definition, Lemma 19.12 on Page
522.
Lemma 31.9 F and F −1 are both one to one, onto, and are inverses of each other.
Theorem 31.10 Let $X$ and $Y$ be random vectors with values in $\mathbb{R}^p$ and suppose
$E\left(e^{it\cdot X}\right) = E\left(e^{it\cdot Y}\right)$ for all $t \in \mathbb{R}^p$. Then $\lambda_X = \lambda_Y$.
31.2. CONDITIONAL PROBABILITY 861
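Proof: For $\psi \in \mathcal{G}$, let $\lambda_X(\psi) \equiv \int_{\mathbb{R}^p}\psi\,d\lambda_X$ and $\lambda_Y(\psi) \equiv \int_{\mathbb{R}^p}\psi\,d\lambda_Y$. Thus both
$\lambda_X$ and $\lambda_Y$ are in $\mathcal{G}^*$. Then letting $\psi \in \mathcal{G}$ and using Fubini's theorem,
\begin{align*}
\int_{\mathbb{R}^p}\int_{\mathbb{R}^p} e^{it\cdot y}\psi(t)\,dt\,d\lambda_Y &= \int_{\mathbb{R}^p}\int_{\mathbb{R}^p} e^{it\cdot y}\,d\lambda_Y\,\psi(t)\,dt\\
&= \int_{\mathbb{R}^p} E\left(e^{it\cdot Y}\right)\psi(t)\,dt\\
&= \int_{\mathbb{R}^p} E\left(e^{it\cdot X}\right)\psi(t)\,dt\\
&= \int_{\mathbb{R}^p}\int_{\mathbb{R}^p} e^{it\cdot x}\,d\lambda_X\,\psi(t)\,dt\\
&= \int_{\mathbb{R}^p}\int_{\mathbb{R}^p} e^{it\cdot x}\psi(t)\,dt\,d\lambda_X.
\end{align*}
Thus $\lambda_Y\left(F^{-1}\psi\right) = \lambda_X\left(F^{-1}\psi\right)$. Since $\psi \in \mathcal{G}$ is arbitrary and $F^{-1}$ is onto, this
implies $\lambda_X = \lambda_Y$ in $\mathcal{G}^*$. But $\mathcal{G}$ is dense in $C_0(\mathbb{R}^p)$ and so $\lambda_X = \lambda_Y$ as measures.
This proves the theorem.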
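and
$$\int_{\mathbb{R}^{p_1}\times\mathbb{R}^{p_2}}\mathcal{X}_E\,d\lambda_{(X,Y)} = \int_{\mathbb{R}^{p_2}}\int_{\mathbb{R}^{p_1}}\mathcal{X}_E\,d\lambda_{X|y}\,d\lambda_Y.$$
\begin{align*}
&= \int_{\mathbb{R}^{p_1}}\int_{\mathbb{R}^{p_2}}\int_{\mathbb{R}^{p_3}\times\cdots\times\mathbb{R}^{p_n}}\mathcal{X}_E\,d\lambda_{(X_3,\cdots,X_n)|x_1x_2}\,d\lambda_{X_2|x_1}\,d\lambda_{X_1}\\
&\ \ \vdots\\
&= \int_{\mathbb{R}^{p_1}}\cdots\int_{\mathbb{R}^{p_n}}\mathcal{X}_E\,d\lambda_{X_n|x_1x_2\cdots x_{n-1}}\,d\lambda_{X_{n-1}|x_1\cdots x_{n-2}}\cdots d\lambda_{X_2|x_1}\,d\lambda_{X_1}. \tag{31.1}
\end{align*}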
Obviously, this could have been done in any order in the iterated integrals by
simply modifying the “given” variables, those occurring after the symbol |, to be
those which have been integrated in an outer level of the iterated integral.
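$$= \int_{\mathbb{R}^{p_1}}\cdots\int_{\mathbb{R}^{p_n}}\mathcal{X}_E\,d\lambda_{X_n}\,d\lambda_{X_{n-1}}\cdots d\lambda_{X_2}\,d\lambda_{X_1} \tag{31.2}$$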
and the iterated integration may be taken in any order. If A is any set of random
vectors defined on a probability space, A is independent if any finite set of random
vectors from A is independent.
Thus, the random vectors are independent exactly when the dependence on the
givens in 31.1 can be dropped.
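Does this amount to the same thing as discussed earlier? These two random
vectors, $X, Y$ were independent if whenever $A \in \mathcal{H}_X$ $(\sigma(X))$ and $B \in \mathcal{H}_Y$,
$P(A \cap B) = P(A)P(B)$. Suppose the above definition and $A$ and $B$ as described.
Let $A = X^{-1}(E)$ and $B = Y^{-1}(F)$. Then
\begin{align*}
P(A \cap B) &= P((X, Y) \in E \times F)\\
&= \int_{\mathbb{R}^{p_1}\times\mathbb{R}^{p_2}}\mathcal{X}_E(x)\mathcal{X}_F(y)\,d\lambda_{(X,Y)}\\
&= \int_{\mathbb{R}^{p_1}}\int_{\mathbb{R}^{p_2}}\mathcal{X}_E(x)\mathcal{X}_F(y)\,d\lambda_{Y|x}\,d\lambda_X\\
&= \int_{\mathbb{R}^{p_1}}\int_{\mathbb{R}^{p_2}}\mathcal{X}_E(x)\mathcal{X}_F(y)\,d\lambda_Y\,d\lambda_X\\
&= \lambda_X(E)\lambda_Y(F) = P(A)P(B)
\end{align*}
and so, by uniqueness of the slicing measures, $d\lambda_{Y|x} = d\lambda_Y$. A similar argument
shows $d\lambda_{X|y} = d\lambda_X$. Thus this amounts to the same thing discussed earlier.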
Proposition 31.13 Equations 31.2 and 31.1 hold with XE replaced by any non-
negative Borel measurable function and for any bounded continuous function.
Proof: The two equations hold for simple functions in place of XE and so an
application of the monotone convergence theorem applied to an increasing sequence
of simple functions converging pointwise to a given nonnegative Borel measurable
function yields the conclusion of the proposition in the case of the nonnegative
Borel function. For a bounded continuous function, one can apply the result just
established to the positive and negative parts of the real and imaginary parts of the
function.
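Lemma 31.14 Let $X_1, \cdots, X_n$ be random vectors with values in $\mathbb{R}^{p_1}, \cdots, \mathbb{R}^{p_n}$ respectively
and let $g : \mathbb{R}^{p_1} \times \cdots \times \mathbb{R}^{p_n} \to \mathbb{R}^k$ be Borel measurable. Then $g(X_1, \cdots, X_n)$
is a random vector with values in $\mathbb{R}^k$ and if $h : \mathbb{R}^k \to [0, \infty)$, then
$$\int_{\mathbb{R}^k} h(y)\,d\lambda_{g(X_1,\cdots,X_n)}(y) = \int_{\mathbb{R}^{p_1}\times\cdots\times\mathbb{R}^{p_n}} h(g(x_1, \cdots, x_n))\,d\lambda_{(X_1,\cdots,X_n)}. \tag{31.3}$$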
This proves 31.3 in the case when h is XE . To prove it in the general case, approx-
imate the nonnegative Borel measurable function with simple functions for which
the formula is true, and use the monotone convergence theorem.
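\begin{align*}
&= \int_{\mathbb{R}^{p_1}}\cdots\int_{\mathbb{R}^{p_n}}\mathcal{X}_E \circ (g_1 \circ \pi_1, \cdots, g_n \circ \pi_n)\,d\lambda_{X_n}\cdots d\lambda_{X_1}\\
&= \int_{\mathbb{R}^{k_1}}\cdots\int_{\mathbb{R}^{k_n}}\mathcal{X}_E\,d\lambda_{g_n(X_n)}\cdots d\lambda_{g_1(X_1)}
\end{align*}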
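Lemma 31.16 If $\{X_i\}_{i=1}^{n}$ are independent random variables having values in $\mathbb{R}$,
$$E\left(\prod_{i=1}^{n} X_i\right) = \prod_{i=1}^{n} E(X_i).$$
Proof: By Lemma 31.14 and denoting by $P$ the product, $\prod_{i=1}^{n} X_i$,
\begin{align*}
E\left(\prod_{i=1}^{n} X_i\right) &= \int_{\mathbb{R}} z\,d\lambda_P(z) = \int_{\mathbb{R}\times\cdots\times\mathbb{R}}\prod_{i=1}^{n} x_i\,d\lambda_{(X_1,\cdots,X_n)}\\
&= \int_{\mathbb{R}}\cdots\int_{\mathbb{R}}\prod_{i=1}^{n} x_i\,d\lambda_{X_1}\cdots d\lambda_{X_n} = \prod_{i=1}^{n} E(X_i).
\end{align*}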
There is a way to tell if random vectors are independent by using their charac-
teristic functions.
Proposition 31.17 If $X_1$ and $X_2$ are random vectors having values in $\mathbb{R}^{p_1}$ and
$\mathbb{R}^{p_2}$ respectively, then the random vectors are independent if and only if
$$E\left(e^{iP}\right) = \prod_{j=1}^{2} E\left(e^{it_j\cdot X_j}\right)$$
where $P \equiv \sum_{j=1}^{2} t_j \cdot X_j$ for $t_j \in \mathbb{R}^{p_j}$. More generally, if $X_i$ is a random vector
having values in $\mathbb{R}^{p_i}$ for $i = 1, 2, \cdots, n$, and if $P = \sum_{j=1}^{n} t_j \cdot X_j$, then the random
vectors are independent if and only if
$$E\left(e^{iP}\right) = \prod_{j=1}^{n} E\left(e^{it_j\cdot X_j}\right).$$
Lemma 31.18 Let $Y$ be a random vector with values in $\mathbb{R}^p$ and let $f$ be bounded
and measurable with respect to the Radon measure, $\lambda_Y$, and satisfy
$$\int f(y)e^{it\cdot y}\,d\lambda_Y = 0$$
Proof: The proof is just like the proof of Theorem 31.10 on Page 860 applied
to the measure, $f(y)\,d\lambda_Y$. Thus $\int_E f(y)\,d\lambda_Y = 0$ for all $E$ Borel. Hence $f(y) = 0$
a.e.
Proof of the proposition: If the Xj are independent, the formula follows
from Lemma 31.16 and Lemma 31.14.
Now suppose the formula holds. Then
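\begin{align*}
\int_{\mathbb{R}^{p_2}}\int_{\mathbb{R}^{p_1}} e^{it_1\cdot x_1}e^{it_2\cdot x_2}\,d\lambda_{X_1}\,d\lambda_{X_2} &= E\left(e^{iP}\right)\\
&= \int_{\mathbb{R}^{p_2}}\int_{\mathbb{R}^{p_1}} e^{it_1\cdot x_1}e^{it_2\cdot x_2}\,d\lambda_{X_1|x_2}\,d\lambda_{X_2}.
\end{align*}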
Now apply Lemma 31.18 to conclude that
$$\int_{\mathbb{R}^{p_1}} e^{it_1\cdot x_1}\,d\lambda_{X_1} = \int_{\mathbb{R}^{p_1}} e^{it_1\cdot x_1}\,d\lambda_{X_1|x_2} \tag{31.4}$$
for $\lambda_{X_2}$ a.e. $x_2$, the exceptional set depending on $t_1$. Therefore, taking the union of
all exceptional sets corresponding to $t_1 \in \mathbb{Q}^{p_1}$, it follows by continuity and the dominated
convergence theorem that 31.4 holds for all $t_1$ whenever $x_2$ is not an element of
this exceptional set of measure zero. Therefore, for such $x_2$, Theorem 31.10 applies
and it follows $\lambda_{X_1|x_2} = \lambda_{X_1}$ for $\lambda_{X_2}$ a.e. $x_2$. Hence, if $E$ is a Borel set in $\mathbb{R}^{p_1} \times \mathbb{R}^{p_2}$,
$$\int_{\mathbb{R}^{p_1+p_2}}\mathcal{X}_E\,d\lambda_{(X_1,X_2)} = \int_{\mathbb{R}^{p_2}}\int_{\mathbb{R}^{p_1}}\mathcal{X}_E\,d\lambda_{X_1|x_2}\,d\lambda_{X_2} = \int_{\mathbb{R}^{p_2}}\int_{\mathbb{R}^{p_1}}\mathcal{X}_E\,d\lambda_{X_1}\,d\lambda_{X_2}.$$
A repeat of the above argument will give the iterated integral in the reverse order or
else one could apply Fubini's theorem to obtain this. The proposition also holds if 2
is replaced with $n$ and the argument is a longer version of what was just presented.
This proves the proposition.
With this preparation, it is time to present the Doob Dynkin lemma. I am not
entirely sure what the Doob Dynkin lemma actually says. What follows is a generalization
of what is identified as a special case of this lemma in [42]. I am not sure
I have the right generalization. However, it is a very interesting lemma regardless
of its name.
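Lemma 31.19 Suppose $X, Y_1, Y_2, \cdots, Y_k$ are random vectors, $X$ having values in
$\mathbb{R}^n$ and $Y_j$ having values in $\mathbb{R}^{p_j}$ and
$$X, Y_j \in L^1(\Omega).$$
Suppose $X$ is $\mathcal{H}_{(Y_1,\cdots,Y_k)}$ measurable. Thus
$$\left\{X^{-1}(E) : E \text{ Borel}\right\} \subseteq \left\{(Y_1, \cdots, Y_k)^{-1}(F) : F \text{ is Borel in } \prod_{j=1}^{k}\mathbb{R}^{p_j}\right\}$$
Then there exists a Borel function, $g : \prod_{j=1}^{k}\mathbb{R}^{p_j} \to \mathbb{R}^n$ such that
$$X = g(Y_1, Y_2, \cdots, Y_k).$$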
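Proof: For the sake of brevity, denote by $Y$ the vector $(Y_1, \cdots, Y_k)$ and by $y$
the vector $(y_1, \cdots, y_k)$ and let $\prod_{j=1}^{k}\mathbb{R}^{p_j} \equiv \mathbb{R}^P$. For $E$ a Borel set of $\mathbb{R}^P$,
\begin{align*}
\int_{Y^{-1}(E)} X\,dP &= \int_{\mathbb{R}^n\times\mathbb{R}^P}\mathcal{X}_{\mathbb{R}^n\times E}(x, y)\,x\,d\lambda_{(X,Y)}\\
&= \int_E\int_{\mathbb{R}^n} x\,d\lambda_{X|y}\,d\lambda_Y. \tag{31.5}
\end{align*}
Since $d\lambda_Y$ is a Radon measure having inner and outer regularity, it follows the
above function, $y \to \int_{\mathbb{R}^n} x\,d\lambda_{X|y}$, is equal to a Borel function for $\lambda_Y$ a.e. $y$. This function will be
denoted by $g$. Then from 31.5
\begin{align*}
\int_{Y^{-1}(E)} X\,dP &= \int_E g(y)\,d\lambda_Y = \int_{\mathbb{R}^P}\mathcal{X}_E(y)g(y)\,d\lambda_Y\\
&= \int_{\Omega}\mathcal{X}_E(Y(\omega))g(Y(\omega))\,dP\\
&= \int_{Y^{-1}(E)} g(Y(\omega))\,dP
\end{align*}
and since $Y^{-1}(E)$ is an arbitrary element of $\mathcal{H}_Y$, this shows that since $X$ is $\mathcal{H}_Y$
measurable,
$$X = E(X|\mathcal{H}_Y) = g(Y).$$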
E (X|y1 , · · ·, yk )
Therefore,
$$(RMR^*)_{ij} = \left(\int_{\mathbb{R}^p} y_iy_j\,e^{-\frac{1}{2}y^*D^{-1}y}\,dy\right)\frac{1}{(2\pi)^{p/2}\prod_{i=1}^{p}\sigma_i} = 0,$$
X1 + X2 ∼ Np (m1 + m2 , Σ1 + Σ2 ). (31.6)
31.3. THE MULTIVARIATE NORMAL DISTRIBUTION 869
Therefore
$$E\left(e^{it\cdot X}\right) = \frac{1}{(2\pi)^{p/2}\prod_{i=1}^{p}\sigma_i}\int_{\mathbb{R}^p} e^{is\cdot(y+Rm)}e^{-\frac{1}{2}y^*D^{-1}y}\,dy$$
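Thus,
\begin{align*}
E\left(e^{it\cdot(X_1+X_2)}\right) &= E\left(e^{it\cdot X_1}\right)E\left(e^{it\cdot X_2}\right)\\
&= e^{it\cdot m_1}e^{-\frac{1}{2}t^*\Sigma_1t}\,e^{it\cdot m_2}e^{-\frac{1}{2}t^*\Sigma_2t}\\
&= e^{it\cdot(m_1+m_2)}e^{-\frac{1}{2}t^*(\Sigma_1+\Sigma_2)t}
\end{align*}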
Np (m1 + m2 , Σ1 + Σ2 ).
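Here is a small numerical sanity check of the last equality for the characteristic functions. It is only a sketch under the assumption that the characteristic function of $N_p(m, \Sigma)$ is $e^{it\cdot m}e^{-\frac{1}{2}t^*\Sigma t}$, as used above. The function name normal_cf and all of the numbers are illustrative choices, not anything taken from the text.

```python
import cmath

def normal_cf(t, m, sigma):
    """exp(i t.m - (1/2) t* Sigma t), assumed form of the characteristic function of N_p(m, Sigma)."""
    p = len(t)
    linear = sum(t[j] * m[j] for j in range(p))
    quad = sum(t[j] * sigma[j][k] * t[k] for j in range(p) for k in range(p))
    return cmath.exp(1j * linear - 0.5 * quad)

# Hypothetical parameters with p = 2, chosen only for illustration.
m1, m2 = [1.0, -2.0], [0.5, 3.0]
sigma1 = [[2.0, 0.3], [0.3, 1.0]]
sigma2 = [[1.0, -0.2], [-0.2, 0.5]]
t = [0.7, -1.1]

m_sum = [a + b for a, b in zip(m1, m2)]
sigma_sum = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(sigma1, sigma2)]

# Product of the two characteristic functions versus the one for N_p(m1 + m2, Sigma1 + Sigma2).
lhs = normal_cf(t, m1, sigma1) * normal_cf(t, m2, sigma2)
rhs = normal_cf(t, m_sum, sigma_sum)
gap = abs(lhs - rhs)
print(gap)
```

The two sides should agree up to rounding for any choice of $t$, which is all the computation above claims.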
if every linear combination, $\sum_{j=1}^{p} a_jX_j$ is normally distributed. In this case the
mean of $X$ is
$$m = (E(X_1), \cdots, E(X_p))$$
and the covariance matrix for $X$ is
$$\Sigma_{jk} = E\left((X_j - m_j)(X_k - m_k)^*\right).$$
Proof: In the Proof of Theorem 31.23 the proof implies that the characteristic
functions of $a \cdot X$ and $a \cdot Y$ are both of the form
$$e^{itm}e^{-\frac{1}{2}\sigma^2t^2}.$$
then $X_1$ and $(X_2, \cdots, X_p)$ are both normally distributed and the two random vectors
are independent. Here $m_j \equiv E(X_j)$. More generally, if the covariance matrix is a
diagonal matrix, the random variables, $\{X_1, \cdots, X_p\}$ are independent.
Then by assumption,
$$\Sigma = \begin{pmatrix}\sigma_1^2 & 0\\ 0 & \Sigma_{p-1}\end{pmatrix}. \tag{31.10}$$
I need to verify that if $E \in \mathcal{H}_{X_1}$ $(\sigma(X_1))$ and $F \in \mathcal{H}_{(X_2,\cdots,X_p)}$ $(\sigma(X_2, \cdots, X_p))$,
then
$$P(E \cap F) = P(E)P(F).$$
Let $E = X_1^{-1}(A)$ and
$$F = (X_2, \cdots, X_p)^{-1}(B)$$
where $A$ and $B$ are Borel sets in $\mathbb{R}$ and $\mathbb{R}^{p-1}$ respectively. Thus I need to verify
that
$$P([(X_1, (X_2, \cdots, X_p)) \in (A, B)]) = \lambda_{(X_1,(X_2,\cdots,X_p))}(A \times B) = \lambda_{X_1}(A)\,\lambda_{(X_2,\cdots,X_p)}(B). \tag{31.11}$$
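Using 31.10, Fubini's theorem, and definitions,
\begin{align*}
\lambda_{(X_1,(X_2,\cdots,X_p))}(A \times B) &= \int_{\mathbb{R}^p}\mathcal{X}_{A\times B}(x)\frac{1}{(2\pi)^{p/2}\det(\Sigma)^{1/2}}\,e^{-\frac{1}{2}(x-m)^*\Sigma^{-1}(x-m)}\,dx\\
&= \int_{\mathbb{R}}\int_{\mathbb{R}^{p-1}}\mathcal{X}_A(x_1)\mathcal{X}_B(x_2, \cdots, x_p)\cdot\frac{1}{(2\pi)^{(p-1)/2}\sqrt{2\pi}\,(\sigma_1^2)^{1/2}\det(\Sigma_{p-1})^{1/2}}\,e^{-\frac{(x_1-m_1)^2}{2\sigma_1^2}}\cdot\\
&\qquad e^{-\frac{1}{2}(x'-m')^*\Sigma_{p-1}^{-1}(x'-m')}\,dx'\,dx_1
\end{align*}
where $x' = (x_2, \cdots, x_p)$ and $m' = (m_2, \cdots, m_p)$. Now this equals
\begin{align*}
&\int_{\mathbb{R}}\mathcal{X}_A(x_1)\frac{1}{\sqrt{2\pi\sigma_1^2}}\,e^{-\frac{(x_1-m_1)^2}{2\sigma_1^2}}\int_B\frac{1}{(2\pi)^{(p-1)/2}\det(\Sigma_{p-1})^{1/2}}\,\cdot \tag{31.12}\\
&\qquad e^{-\frac{1}{2}(x'-m')^*\Sigma_{p-1}^{-1}(x'-m')}\,dx'\,dx_1. \tag{31.13}
\end{align*}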
In case $B = \mathbb{R}^{p-1}$, the inside integral equals 1 and
\begin{align*}
\lambda_{X_1}(A) &= \lambda_{(X_1,(X_2,\cdots,X_p))}\left(A \times \mathbb{R}^{p-1}\right)\\
&= \int_{\mathbb{R}}\mathcal{X}_A(x_1)\frac{1}{\sqrt{2\pi\sigma_1^2}}\,e^{-\frac{(x_1-m_1)^2}{2\sigma_1^2}}\,dx_1
\end{align*}
which shows $X_1$ is normally distributed as claimed. Similarly, letting $A = \mathbb{R}$,
\begin{align*}
\lambda_{(X_2,\cdots,X_p)}(B) &= \lambda_{(X_1,(X_2,\cdots,X_p))}(\mathbb{R} \times B)\\
&= \int_B\frac{1}{(2\pi)^{(p-1)/2}\det(\Sigma_{p-1})^{1/2}}\,e^{-\frac{1}{2}(x'-m')^*\Sigma_{p-1}^{-1}(x'-m')}\,dx'
\end{align*}
and $(X_2, \cdots, X_p)$ is also normally distributed with mean $m'$ and covariance $\Sigma_{p-1}$.
Now from 31.12, 31.11 follows. In case the covariance matrix is diagonal, the above
reasoning extends in an obvious way to prove the random variables, $\{X_1, \cdots, X_p\}$
are independent.
However, another way to prove this is to use Proposition 31.17 on Page 865 and
consider the characteristic function. Let E (Xj ) = mj and
$$P = \sum_{j=1}^{p} t_jX_j.$$
Also,
\begin{align*}
E\left(e^{it_jX_j}\right) &= E\left(\exp\left(it_jX_j + \sum_{k\ne j} i0X_k\right)\right)\\
&= \exp\left(it_jm_j - \frac{1}{2}t_j^2\sigma_j^2\right)
\end{align*}
{X1 , · · ·, Xp }
for all $j = 1, \cdots, p$ and so, by the dominated convergence theorem, the same is true
with $\phi_{X_n}$ in place of $\phi_X$ provided $n$ is large enough, say $n \ge N(u)$. Thus, if $u \le r$,
and $n \ge N(u)$,
$$\lambda_{X_n}\left(\left[x : |x_j| \ge \frac{2}{u}\right]\right) < \varepsilon/p$$
for all $j \in \{1, \cdots, p\}$. It follows that for $u \le r$ and $n \ge N(u)$,
$$\lambda_{X_n}\left(\left[x : ||x||_{\infty} \ge \frac{2}{u}\right]\right) < \varepsilon.$$
This proves the lemma because there are only finitely many measures, $\lambda_{X_n}$ for
$n < N(u)$ and the compact set can be enlarged finitely many times to obtain a
single compact set, $K_\varepsilon$ such that for all $n$, $\lambda_{X_n}([x \notin K_\varepsilon]) < \varepsilon$. This proves the
lemma.
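Lemma 31.28 If $\phi_{X_n}(t) \to \phi_X(t)$ for all $t$, then whenever $\psi \in \mathcal{S}$,
$$\lambda_{X_n}(\psi) \equiv \int_{\mathbb{R}^p}\psi(y)\,d\lambda_{X_n}(y) \to \int_{\mathbb{R}^p}\psi(y)\,d\lambda_X(y) \equiv \lambda_X(\psi)$$
as $n \to \infty$.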
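Proof: Recall that if $X$ is any random vector, its characteristic function is given
by
$$\phi_X(y) \equiv \int_{\mathbb{R}^p} e^{iy\cdot x}\,d\lambda_X(x).$$
Also remember the inverse Fourier transform. Letting $\psi \in \mathcal{S}$, the Schwartz class,
\begin{align*}
F^{-1}(\lambda_X)(\psi) &\equiv \lambda_X\left(F^{-1}\psi\right) \equiv \int_{\mathbb{R}^p} F^{-1}\psi\,d\lambda_X\\
&= \frac{1}{(2\pi)^{p/2}}\int_{\mathbb{R}^p}\int_{\mathbb{R}^p} e^{iy\cdot x}\psi(x)\,dx\,d\lambda_X(y)\\
&= \frac{1}{(2\pi)^{p/2}}\int_{\mathbb{R}^p}\psi(x)\int_{\mathbb{R}^p} e^{iy\cdot x}\,d\lambda_X(y)\,dx\\
&= \frac{1}{(2\pi)^{p/2}}\int_{\mathbb{R}^p}\psi(x)\phi_X(x)\,dx
\end{align*}
whenever $\psi \in \mathcal{S}$. Thus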
Lemma 31.29 If $\phi_{X_n}(t) \to \phi_X(t)$, then if $\psi$ is any bounded uniformly continuous
function,
$$\lim_{n\to\infty}\int_{\mathbb{R}^p}\psi\,d\lambda_{X_n} = \int_{\mathbb{R}^p}\psi\,d\lambda_X.$$
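Proof: Let $\varepsilon > 0$ be given, let $\psi$ be a bounded function in $C^{\infty}(\mathbb{R}^p)$. Now let
$\eta \in C_c^{\infty}(Q_r)$ where $Q_r \equiv [-r, r]^p$ satisfy the additional requirement that $\eta = 1$ on
$Q_{r/2}$ and $\eta(x) \in [0, 1]$ for all $x$. By Lemma 31.27 the set, $\{\lambda_{X_n}\}_{n=1}^{\infty}$, is tight and
so if $\varepsilon > 0$ is given, there exists $r$ sufficiently large such that for all $n$,
$$\int_{[x \notin Q_{r/2}]}|1 - \eta|\,|\psi|\,d\lambda_{X_n} < \frac{\varepsilon}{3},$$
and
$$\int_{[x \notin Q_{r/2}]}|1 - \eta|\,|\psi|\,d\lambda_X < \frac{\varepsilon}{3}.$$
Thus,
\begin{align*}
\left|\int_{\mathbb{R}^p}\psi\,d\lambda_{X_n} - \int_{\mathbb{R}^p}\psi\,d\lambda_X\right| &\le \left|\int_{\mathbb{R}^p}\psi\,d\lambda_{X_n} - \int_{\mathbb{R}^p}\psi\eta\,d\lambda_{X_n}\right| + \left|\int_{\mathbb{R}^p}\psi\eta\,d\lambda_{X_n} - \int_{\mathbb{R}^p}\psi\eta\,d\lambda_X\right|\\
&\qquad + \left|\int_{\mathbb{R}^p}\psi\eta\,d\lambda_X - \int_{\mathbb{R}^p}\psi\,d\lambda_X\right|\\
&\le \frac{2\varepsilon}{3} + \left|\int_{\mathbb{R}^p}\psi\eta\,d\lambda_{X_n} - \int_{\mathbb{R}^p}\psi\eta\,d\lambda_X\right| < \varepsilon
\end{align*}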
The main result is the following continuity theorem. More can be said about
the equivalence of various criteria [9].
Theorem 31.31 If φXn (t) → φX (t) then λXn (A) → λX (A) whenever A is a λX
continuity set.
Thus, since $K$ is closed, $\lim_{k\to\infty}\psi_k(x) = \mathcal{X}_K(x)$. Choose $k$ large enough that
$$\int_{\mathbb{R}^p}\psi_k\,d\lambda_X \le \lambda_X(K) + \varepsilon.$$
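\begin{align*}
\lambda_X(\text{interior}(A)) &\le \liminf_{n\to\infty}\lambda_{X_n}(\text{interior}(A)) \le \liminf_{n\to\infty}\lambda_{X_n}(A)\\
&\le \limsup_{n\to\infty}\lambda_{X_n}(A) \le \limsup_{n\to\infty}\lambda_{X_n}\left(\overline{A}\right) \le \lambda_X\left(\overline{A}\right).
\end{align*}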
31.4. THE CENTRAL LIMIT THEOREM 879
But $\lambda_X(\text{interior}(A)) = \lambda_X\left(\overline{A}\right)$ by assumption and so $\lim_{n\to\infty}\lambda_{X_n}(A) = \lambda_X(A)$
as claimed. This proves the theorem.
As an application of this theorem the following is a version of the central limit
theorem in the situation in which the limit distribution is multivariate normal. It
concerns a sequence of random vectors, $\{X_k\}_{k=1}^{\infty}$, which are identically distributed,
have finite mean $m$, and satisfy
$$\sup_k E\left(|X_k|^2\right) < \infty. \tag{31.15}$$
Theorem 31.32 Let $\{X_k\}_{k=1}^{\infty}$ be random vectors satisfying 31.15, which are independent
and identically distributed with mean $m$ and positive definite covariance
$\Sigma \equiv E\left((X - m)(X - m)^*\right)$. Let
$$Z_n \equiv \sum_{j=1}^{n}\frac{X_j - m}{\sqrt{n}}. \tag{31.16}$$
for all x.
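By Taylor's theorem,
$$e^{ix} = 1 + ix - \frac{e^{i\theta x}x^2}{2}$$
for some $\theta \in [0, 1]$ which depends on $x$. Denoting $X_j$ as $X$, this implies
$$e^{it\cdot\frac{X-m}{\sqrt{n}}} = 1 + it\cdot\frac{X-m}{\sqrt{n}} - e^{i\theta t\cdot\frac{X-m}{\sqrt{n}}}\frac{(t\cdot(X - m))^2}{2n}$$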
à !#
³ ´ (t· (X − m))2
iθt· X−m
√
+E 1−e n
2n
n · ´¸
Y 1 ∗ 1 ³³ iθt· X−m
√
´
2
= 1− t Σt+ E 1 − e n (t· (X − m)) . (31.18)
j=1
2n 2n
(Note $(t\cdot(X - m))^2 = t^*(X - m)(X - m)^*t$.) Now here is a simple inequality for
complex numbers whose moduli are no larger than one. I will give a proof of this
at the end. It follows easily by induction.
$$|z_1\cdots z_n - w_1\cdots w_n| \le \sum_{k=1}^{n}|z_k - w_k|. \tag{31.19}$$
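where
\begin{align*}
|e_n| &\le \sum_{j=1}^{n}\left|\frac{1}{2n}E\left(\left(1 - e^{i\theta t\cdot\frac{X-m}{\sqrt{n}}}\right)(t\cdot(X - m))^2\right)\right|\\
&= \frac{1}{2}\left|E\left(\left(1 - e^{i\theta t\cdot\frac{X-m}{\sqrt{n}}}\right)(t\cdot(X - m))^2\right)\right|
\end{align*}
which converges to 0 as $n \to \infty$ by the Dominated Convergence theorem. Therefore,
$$\lim_{n\to\infty}\left|\phi_{Z_n}(t) - \left(1 - \frac{t^*\Sigma t}{2n}\right)^n\right| = 0$$
and so
$$\lim_{n\to\infty}\phi_{Z_n}(t) = e^{-\frac{1}{2}t^*\Sigma t} = \phi_Z(t)$$
where $Z \sim N_p(0, \Sigma)$. Therefore, $F_{Z_n}(x) \to F_Z(x)$ for all $x$ because $R_x \equiv \prod_{k=1}^{p}(-\infty, x_k]$
is a set of $\lambda_Z$ continuity due to the assumption that $\lambda_Z \ll m_p$ which is implied by
$Z \sim N_p(0, \Sigma)$. This proves the theorem.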
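The last limit rests on the elementary fact that $\left(1 - \frac{q}{2n}\right)^n \to e^{-q/2}$ when $q$ stands for the number $t^*\Sigma t$. Here is a short numerical illustration of that fact only. The name approx_cf and the value of q are hypothetical choices made for the example.

```python
import math

def approx_cf(q, n):
    """(1 - q/(2n))^n, where q plays the role of the quadratic form t* Sigma t."""
    return (1.0 - q / (2.0 * n)) ** n

q = 2.5  # hypothetical value of t* Sigma t
limit = math.exp(-q / 2.0)

# Distance to the limiting value exp(-q/2) for increasing n.
errors = [abs(approx_cf(q, n) - limit) for n in (10, 100, 1000, 10000)]
print(errors)
```

The printed errors shrink roughly like $1/n$, which is consistent with the limit used in the proof but of course proves nothing by itself.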
Here is the proof of the little inequality used above. The inequality is obviously
true if $n = 1$. Assume it is true for $n$. Then since all the numbers have absolute
value no larger than one,
\begin{align*}
\left|\prod_{i=1}^{n+1}z_i - \prod_{i=1}^{n+1}w_i\right| &\le \left|\prod_{i=1}^{n+1}z_i - z_{n+1}\prod_{i=1}^{n}w_i\right| + \left|z_{n+1}\prod_{i=1}^{n}w_i - \prod_{i=1}^{n+1}w_i\right|\\
&\le \left|\prod_{i=1}^{n}z_i - \prod_{i=1}^{n}w_i\right| + |z_{n+1} - w_{n+1}|\\
&\le \sum_{k=1}^{n+1}|z_k - w_k|
\end{align*}
by induction.
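For readers who like to see such an inequality in action, here is a minimal numerical check of 31.19. The sample points are arbitrary numbers in the closed unit disk chosen for illustration, and the helper names product and both_sides are hypothetical.

```python
import cmath

def product(values):
    """Product of a list of complex numbers."""
    out = 1 + 0j
    for v in values:
        out *= v
    return out

def both_sides(zs, ws):
    """Return (|z_1...z_n - w_1...w_n|, sum of |z_k - w_k|)."""
    lhs = abs(product(zs) - product(ws))
    rhs = sum(abs(z - w) for z, w in zip(zs, ws))
    return lhs, rhs

# Six points of modulus 1 and six nearby points of modulus 0.9.
zs = [cmath.rect(1.0, 0.3 * k) for k in range(1, 7)]
ws = [cmath.rect(0.9, 0.3 * k + 0.05) for k in range(1, 7)]

lhs, rhs = both_sides(zs, ws)
print(lhs <= rhs)
```

The hypothesis that the moduli are no larger than one matters: with larger moduli the left side can exceed the right.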
Suppose $X$ is a random vector with covariance $\Sigma$ and mean $m$, and suppose also
that $\Sigma^{-1}$ exists. Consider $\Sigma^{-(1/2)}(X - m) \equiv Y$. Then $E(Y) = 0$ and
\begin{align*}
E(YY^*) &= E\left(\Sigma^{-(1/2)}(X - m)(X - m)^*\Sigma^{-(1/2)}\right)\\
&= \Sigma^{-(1/2)}E\left((X - m)(X - m)^*\right)\Sigma^{-(1/2)} = I.
\end{align*}
Thus Y has zero mean and covariance I. This implies the following corollary to
Theorem 31.32.
have mean $m$ and positive definite covariance $\Sigma$ where $\Sigma^{-1}$ exists. Then if
$$Z_n \equiv \Sigma^{-(1/2)}\sum_{j=1}^{n}\frac{(X_j - m)}{\sqrt{n}},$$
for all x.
J = (t1 , · · ·, tn ) ⊆ I,
(t1 , · · ·, tn ) ⊆ (s1 , · · ·, sp ) ,
then
$$\nu_{t_1\cdots t_n}(F_{t_1} \times \cdots \times F_{t_n}) = \nu_{s_1\cdots s_p}\left(G_{s_1} \times \cdots \times G_{s_p}\right) \tag{31.20}$$
where if $s_i = t_j$, then $G_{s_i} = F_{t_j}$ and if $s_i$ is not equal to any of the indices, $t_k$,
then $G_{s_i} = M'_{s_i}$. Then there exists a probability space, $(\Omega, P, \mathcal{F})$ and measurable
functions, $X_t : \Omega \to M_t$ for each $t \in I$ such that for each $(t_1 \cdots t_n) \subseteq I$,
$\delta_x(E) = 1$ if $x \in E$ and 0 if $x \notin E$.
Now define for each increasing list $(t_1, t_2, \cdots, t_k)$, a measure defined as follows.
For $F$ a Borel set in $\mathbb{R}^{nk}$,
$$\nu_{t_1t_2\cdots t_k}(F) \equiv \int_F p(t_1, x, y_1)\,p(t_2 - t_1, y_1, y_2)\cdots p(t_k - t_{k-1}, y_{k-1}, y_k)\,dy_1\,dy_2\cdots dy_k. \tag{31.22}$$
Since $\int_{\mathbb{R}^n} p(s, x, y)\,dy = 1$ whenever $s \ge 0$, this shows the conditions of the Kolmogorov
extension theorem are satisfied for these measures and therefore there
exists a probability space, $(\Omega, \mathcal{F}, P)$ and measurable functions, $B_t$ for each $t \ge 0$
such that whenever the $F_j$ are Borel sets,
Proof: To show this use Theorem 31.23. The components of Btj are indepen-
dent and normally distributed because Btj is distributed as y → p (tj , x, y) which
is defined above. The off diagonal terms of the correlation matrix are zero and so
by Theorem 31.25 the components are independent and all normally distributed.
Denote by $B_{t_jr}$ the $r^{th}$ component of $B_{t_j}$. Thus the mean of $B_{t_jr}$ is $x_r$ and the
variance of $B_{t_jr}$ is $t_j$. Also $a \cdot B_{t_j}$ is normally distributed with mean $\sum_r a_rx_r$. To
verify Z is normally distributed, it suffices to show that a · Z is normally distributed
for a = (a1 , · · ·, ak ). Consider the case where k = 2. Then Z has values in R2n . I
will directly calculate the characteristic function for Z in this case and then note
that a similar pattern will hold for larger k.
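\begin{align*}
E(\exp(iu \cdot Z)) &= \int\int p(t_1, x, y_1)\,p(t_2 - t_1, y_1, y_2)\,e^{iu_1\cdot y_1}e^{iu_2\cdot y_2}\,dy_2\,dy_1\\
&= \int p(t_1, x, y_1)\,e^{iu_1\cdot y_1}\int p(t_2 - t_1, y_1, y_2)\,e^{iu_2\cdot y_2}\,dy_2\,dy_1\\
&= \int p(t_1, x, y_1)\,e^{iu_1\cdot y_1}\exp\left(iu_2\cdot y_1 - \frac{1}{2}\left(u_2^*(t_2 - t_1)Iu_2\right)\right)dy_1\\
&= \exp\left(-\frac{1}{2}\left(u_2^*(t_2 - t_1)Iu_2\right)\right)\int p(t_1, x, y_1)\,e^{iu_1\cdot y_1}\exp(iu_2\cdot y_1)\,dy_1\\
&= \exp\left(-\frac{1}{2}\left(u_2^*(t_2 - t_1)Iu_2\right)\right)\int p(t_1, x, y_1)\,e^{i(u_1+u_2)\cdot y_1}\,dy_1\\
&= \exp\left(-\frac{1}{2}\left(u_2^*(t_2 - t_1)Iu_2\right)\right)\exp\left(-\frac{1}{2}(u_1 + u_2)^*t_1I(u_1 + u_2)\right)\cdot\exp(i(u_1 + u_2)\cdot x)\\
&= \exp\left(-\frac{1}{2}\left[\left(u_2^*(t_2 - t_1)Iu_2\right) + (u_1 + u_2)^*t_1I(u_1 + u_2)\right]\right)\cdot\exp(i(u_1 + u_2)\cdot x).
\end{align*}
The expression $\left(u_2^*(t_2 - t_1)Iu_2\right) + (u_1 + u_2)^*t_1I(u_1 + u_2)$ equals
$$\begin{pmatrix}u_1 & u_2\end{pmatrix}\begin{pmatrix}t_1I & t_1I\\ t_1I & t_2I\end{pmatrix}\begin{pmatrix}u_1\\ u_2\end{pmatrix}$$
and the expression $i(u_1 + u_2)\cdot x$ equals
$$i\begin{pmatrix}u_1 & u_2\end{pmatrix}\cdot\begin{pmatrix}x & x\end{pmatrix}$$
and so in the case that $k = 2$, this shows $Z$ is normally distributed with mean
$\begin{pmatrix}x & x\end{pmatrix}$ and covariance
$$\begin{pmatrix}t_1I & t_1I\\ t_1I & t_2I\end{pmatrix}.$$
The pattern continues in this way. In general the mean is
$$\begin{pmatrix}x & \cdots & x\end{pmatrix}$$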
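The algebra behind the block covariance matrix for $k = 2$ can be checked mechanically. The following sketch compares the exponent $(t_2 - t_1)|u_2|^2 + t_1|u_1 + u_2|^2$ with the block quadratic form written out as $t_1|u_1|^2 + 2t_1u_1\cdot u_2 + t_2|u_2|^2$. The function names and the sample vectors are hypothetical, chosen only for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def exponent_form(u1, u2, t1, t2):
    """(t2 - t1)|u2|^2 + t1 |u1 + u2|^2, the expression appearing in the exponent."""
    s = [a + b for a, b in zip(u1, u2)]
    return (t2 - t1) * dot(u2, u2) + t1 * dot(s, s)

def block_form(u1, u2, t1, t2):
    """(u1 u2) [[t1 I, t1 I], [t1 I, t2 I]] (u1 u2)^T, multiplied out by hand."""
    return t1 * dot(u1, u1) + 2 * t1 * dot(u1, u2) + t2 * dot(u2, u2)

# Integer data keep the comparison exact.
u1, u2 = [1, -2, 3], [4, 0, -1]
t1, t2 = 2, 5
print(exponent_form(u1, u2, t1, t2), block_form(u1, u2, t1, t2))
```

Both numbers agree, which is the identity used to read off the covariance.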
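Therefore,
\begin{align*}
E\left((B_t - x)^*(B_s - x)\right) &= \int p(s, x, y_1)\,|y_1 - x|^2\,dy_1\\
&= E\left(|B_s - x|^2\right) = ns
\end{align*}
Now for $t \ge s$,
\begin{align*}
E\left(|B_t - B_s|^2\right) &= E\left(|B_t - x|^2 + |B_s - x|^2 - 2(B_t - x)\cdot(B_s - x)\right)\\
&= nt + ns - 2ns = n(t - s).
\end{align*}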
Lemma 31.38 $B_t$ has independent increments. This means if $t_1 < t_2 < \cdots < t_k$,
the random variables,
$$B_{t_1},\ B_{t_2} - B_{t_1},\ \cdots,\ B_{t_k} - B_{t_{k-1}}$$
are independent. In addition, these random variables are normally distributed.
31.5. BROWNIAN MOTION 885
Proof: $B_{t_1}$ is normal and so is each of the $B_{t_j}$. Also I claim that $B_{t_j} - B_{t_{j-1}}$
is normal with mean 0. I will show this next.
\begin{align*}
E\left(\exp\left(iu\cdot\left(B_{t_j} - B_{t_{j-1}}\right)\right)\right) &= E\left(\exp\left(iu\cdot B_{t_j}\right)\exp\left(-iu\cdot B_{t_{j-1}}\right)\right)\\
&= \int\int p(t_{j-1}, x, y_1)\,p(t_j - t_{j-1}, y_1, y_2)\exp(-iu\cdot y_1)\exp(iu\cdot y_2)\,dy_2\,dy_1\\
&= \int p(t_{j-1}, x, y_1)\exp(-iu\cdot y_1)\int p(t_j - t_{j-1}, y_1, y_2)\exp(iu\cdot y_2)\,dy_2\,dy_1\\
&= \int p(t_{j-1}, x, y_1)\exp(-iu\cdot y_1)\exp(iu\cdot y_1)\exp\left(-\frac{1}{2}(t_j - t_{j-1})|u|^2\right)dy_1\\
&= \exp\left(-\frac{1}{2}(t_j - t_{j-1})|u|^2\right). \tag{31.23}
\end{align*}
Therefore, $B_{t_j} - B_{t_{j-1}}$ is normal with covariance $(t_j - t_{j-1})I$ and mean 0.
Next let $Z = \left(B_{t_1}, B_{t_2} - B_{t_1}, \cdots, B_{t_k} - B_{t_{k-1}}\right)$. I need to verify $Z$ is normally
distributed. Let $u = (u_1, \cdots, u_k)$.
\begin{align*}
E(\exp(iu\cdot Z)) &= E\left(\exp(iu_1\cdot B_{t_1})\prod_{r=2}^{k}\exp\left(iu_r\cdot\left(B_{t_r} - B_{t_{r-1}}\right)\right)\right)\\
&= \int_{\mathbb{R}^n}\cdots\int_{\mathbb{R}^n} p(t_1, x, y_1)\,p(t_2 - t_1, y_1, y_2)\cdots p(t_k - t_{k-1}, y_{k-1}, y_k)\,\cdot\\
&\qquad \exp(iu_1\cdot y_1)\prod_{r=2}^{k}\exp(iu_r\cdot(y_r - y_{r-1}))\,dy_k\,dy_{k-1}\cdots dy_1.
\end{align*}
which has no $y$ variables left so I can factor it out and then work on the next inside
integral which gives
$$\exp\left(-\frac{1}{2}(t_{k-1} - t_{k-2})|u_{k-1}|^2\right)$$
which also can be factored out. Continuing this way eventually obtains
\begin{align*}
&\prod_{j=2}^{k}\exp\left(-\frac{1}{2}(t_j - t_{j-1})|u_j|^2\right)\int p(t_1, x, y_1)\exp(iu_1\cdot y_1)\,dy_1\\
&= \prod_{j=2}^{k}\exp\left(-\frac{1}{2}(t_j - t_{j-1})|u_j|^2\right)\exp(iu_1\cdot x)\exp\left(-\frac{1}{2}t_1|u_1|^2\right).
\end{align*}
$$= (-(t - s))\exp\left(-\frac{1}{2}(t - s)|u|^2\right) + (-(t - s) - u_j(t - s))(-u_j(t - s))\cdot\exp\left(-\frac{1}{2}(t - s)|u|^2\right)$$
$$2u_j(t - s)^2\exp\left(-\frac{1}{2}(t - s)|u|^2\right) + (-u_j(t - s))^3\exp\left(-\frac{1}{2}(t - s)|u|^2\right)$$
Finally take yet another derivative with respect to $u_j$ and then let $u = 0$.
$$E\left((B_{tj} - B_{sj})^4\right) = (t - s)^2 + 2(t - s)^2 = 3(t - s)^2.$$
This shows
$$\sum_{j=1}^{n} E\left((B_{tj} - B_{sj})^4\right) = 3n(t - s)^2.$$
But also $\sum_{j=1}^{n}(B_{tj} - B_{sj})^4 \ge (1/n)|B_t - B_s|^4$ and so
$$E\left(|B_t - B_s|^4\right) \le 3n^2(t - s)^2. \tag{31.24}$$
With more work, you can show the $3n^2$ can be replaced with $n(n + 2)$ but it is the
inequality which is of interest here.
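The constant $n(n + 2)$ can be seen as follows. This is only a sketch, and it assumes that the components $B_{tj} - B_{sj}$ of the increment are independent, each with second moment $t - s$ and fourth moment $3(t - s)^2$, which is what the covariance $(t - s)I$ found in 31.23 and the fourth moment computed above suggest.

```latex
\begin{align*}
E\left(|B_t - B_s|^4\right)
&= E\left(\Big(\sum_{j=1}^{n}(B_{tj} - B_{sj})^2\Big)^2\right)\\
&= \sum_{j=1}^{n}E\left((B_{tj} - B_{sj})^4\right)
 + \sum_{j \ne k}E\left((B_{tj} - B_{sj})^2\right)E\left((B_{tk} - B_{sk})^2\right)\\
&= 3n(t - s)^2 + n(n - 1)(t - s)^2 = n(n + 2)(t - s)^2.
\end{align*}
```

Since $n(n + 2) \le 3n^2$ for every $n \ge 1$, this is consistent with 31.24.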
Before going further here is an interesting elementary lemma.
Since ε is arbitrary, this proves the existence part of the lemma. Uniqueness follows
from observing that Y (t) must equal limd→t X (d). This proves the lemma.
The following is a very interesting theorem called the Kolmogorov Čentsov continuity
theorem [33].
Theorem 31.40 Suppose $X_t$ is a random vector for each $t \in [0, \infty)$. Suppose also
that for all $T > 0$ there exists a constant, $C$ and positive numbers, $\alpha, \beta$ such that
$$E\left(|X_t - X_s|^{\alpha}\right) \le C|t - s|^{1+\beta} \tag{31.25}$$
Then there exist random vectors, $Y_t$ such that for a.e. $\omega$, $t \to Y_t(\omega)$ is continuous
and $P([|X_t - Y_t| > 0]) = 0$.
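Proof: Let $r_j^m$ denote $j\left(\frac{T}{2^m}\right)$ where $j \in \{0, 1, \cdots, 2^m\}$. Also let $D_m = \left\{r_j^m\right\}_{j=1}^{2^m}$
and $D = \cup_{m=1}^{\infty}D_m$. Consider the set,
$$[|X_t - X_s| > \delta]$$
for $k = 1, 2, \cdots$. By 31.25,
\begin{align*}
P([|X_t - X_s| > \delta])\,\delta^{\alpha} &\le \int_{[|X_t - X_s| > \delta]}|X_t - X_s|^{\alpha}\,dP\\
&\le C|t - s|^{1+\beta}. \tag{31.26}
\end{align*}
Letting $t = r_{j+1}^k$, $s = r_j^k$, and $\delta = 2^{-\gamma k}$ where
$$\gamma \in \left(0, \frac{\beta}{\alpha}\right),$$
this yields
$$P\left(\left[\left|X_{r_{j+1}^k} - X_{r_j^k}\right| > 2^{-\gamma k}\right]\right) \le C2^{\alpha\gamma k}\left(T2^{-k}\right)^{1+\beta}.$$
it follows
$$P(E_k) \le C2^{\alpha\gamma k}\left(T2^{-k}\right)^{1+\beta}2^k = C2^{k(\alpha\gamma - \beta)}T^{1+\beta}.$$
Since $\gamma < \beta/\alpha$,
$$\sum_{k=1}^{\infty}P(E_k) \le CT^{1+\beta}\sum_{k=1}^{\infty}2^{k(\alpha\gamma - \beta)} < \infty$$
and so by the Borel Cantelli lemma, Lemma 31.2, there exists a set of measure
zero, $E$, such that if $\omega \notin E$, then $\omega$ is in only finitely many $E_k$. In other words, for
$\omega \notin E$, there exists $N(\omega)$ such that if $k > N(\omega)$, then for each $j$,
$$\left|X_{r_{j+1}^k}(\omega) - X_{r_j^k}(\omega)\right| \le 2^{-\gamma k}. \tag{31.27}$$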
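The arithmetic in the bound for $P(E_k)$ and the convergence of the resulting series can be checked numerically. The sketch below uses $\alpha = 4$, $\beta = 1$, which are the exponents suggested by 31.24, with $C = 3$ (the case $n = 1$), $T = 1$, and $\gamma = 0.2$, an arbitrary choice in $(0, \beta/\alpha)$. The function names are hypothetical.

```python
def union_bound(C, T, alpha, beta, gamma, k):
    """C 2^(alpha gamma k) (T 2^-k)^(1+beta) 2^k: the bound on P(E_k) before simplifying."""
    per_pair = C * 2.0 ** (alpha * gamma * k) * (T * 2.0 ** (-k)) ** (1 + beta)
    return per_pair * 2 ** k

def simplified_bound(C, T, alpha, beta, gamma, k):
    """C 2^(k (alpha gamma - beta)) T^(1+beta): the simplified form of the same bound."""
    return C * 2.0 ** (k * (alpha * gamma - beta)) * T ** (1 + beta)

C, T, alpha, beta, gamma = 3.0, 1.0, 4.0, 1.0, 0.2  # illustrative values only

# The series is geometric with ratio 2^(alpha gamma - beta) < 1.
ratio = 2.0 ** (alpha * gamma - beta)
partial = sum(union_bound(C, T, alpha, beta, gamma, k) for k in range(1, 200))
geometric_limit = C * T ** (1 + beta) * ratio / (1 - ratio)
print(partial, geometric_limit)
```

If $\gamma$ were chosen at least as large as $\beta/\alpha$, the ratio would be at least one and the series would diverge, so the Borel Cantelli lemma could not be applied.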
Suppose the claim is true for some $m > n$. Then let $d, d' \in D_{m+1}$ with $|d - d'| < T2^{-n}$.
Let $d' \le d'_1 \le d_1 \le d$ where $d_1, d'_1$ are in $D_m$ and $d'_1$ is the smallest element
of $D_m$ which is at least as large as $d'$ and $d_1$ is the largest element of $D_m$ which is
no larger than $d$. Then $|d' - d'_1| \le T2^{-(m+1)}$ and $|d_1 - d| \le T2^{-(m+1)}$ while all of
these are still in $D_{m+1}$ which contains $D_m$. Therefore, from 31.27 and induction,
\begin{align*}
|X_{d'}(\omega) - X_d(\omega)| &\le \left|X_{d'}(\omega) - X_{d'_1}(\omega)\right| + \left|X_{d'_1}(\omega) - X_{d_1}(\omega)\right| + \left|X_{d_1}(\omega) - X_d(\omega)\right|\\
&\le 2 \times 2^{-\gamma(m+1)} + 2\sum_{j=n+1}^{m}2^{-\gamma j} = 2\sum_{j=n+1}^{m+1}2^{-\gamma j}
\end{align*}
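Therefore,
\begin{align*}
P([|Y_t - X_t| > 0]) &= P\left(\cup_{k=1}^{\infty}\left[|Y_t - X_t| > \frac{1}{k}\right]\right)\\
&\le \sum_{k=1}^{\infty}P\left(\left[|Y_t - X_t| > \frac{1}{k}\right]\right) = 0.
\end{align*}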
Definition 31.41 Let $X_t$ and $Y_t$ be random vectors for each $t \in [0, \infty)$. Then $Y_t$
is said to be a version of $X_t$ if there exists a set of measure zero, $E$ such that for
$\omega \notin E$, $X_t(\omega) = Y_t(\omega)$ a.e. $\omega$ for all $t \in [0, T)$.
Corollary 31.43 Let B be the Borel sets on [0, T ] and let F be the σ algebra for
the underlying probability space. Then there exists a set of measure zero, N ∈ F
such that (t, ω) → XN (ω) Bt (ω) is B × F measurable.
Proof: Let $N$ be the set of measure zero off which $t \to B_t(\omega)$ is continuous.
Letting $t_k^m = 2^{-m}Tk$ consider for $\omega \notin N$
$$B_t^m(\omega) \equiv \sum_{k=1}^{2^m}B_{t_k^m}(\omega)\,\mathcal{X}_{[t_{k-1}^m, t_k^m)}(t).$$
The above development has proved the following theorem on Brownian motion.
Theorem 31.44 There exists a probability space, $(\Omega, \mathcal{F}, P)$ and random vectors,
$B_t$ for $t \in [0, \infty)$ which satisfy the following properties.
2. $B_t$ has independent increments. This means if $t_1 < t_2 < \cdots < t_k$, the random
variables,
$$B_{t_1},\ B_{t_2} - B_{t_1},\ \cdots,\ B_{t_k} - B_{t_{k-1}}$$
are independent and normally distributed. Note this implies the $k^{th}$ components
must also be independent. Also $B_{t_j} - B_{t_{j-1}}$ is normal with covariance
$(t_j - t_{j-1})I$ and mean 0. In addition to this, the $k^{th}$ component of $B_t$ is
normally distributed with density function
$$p(t, x_k, y) \equiv \frac{1}{(2\pi t)^{1/2}}\exp\left(-\frac{|y - x_k|^2}{2t}\right)$$
3. $E\left(|B_t - B_s|^4\right) \le 3n^2(t - s)^2$. For $t > s$,
$$E\left(|B_t - B_s|^2\right) = n(t - s),\quad E\left((B_t - x)^*(B_s - x)\right) = ns,$$
$$E(B_t - B_s) = 0,$$
$E(f|\mathcal{S})$ is $\mathcal{S}$ measurable
For all $E \in \mathcal{S}$,
$$\int_E E(f|\mathcal{S})\,dP = \int_E f\,dP$$
for all $E \in \mathcal{S}$.
894 CONDITIONAL EXPECTATION AND MARTINGALES
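Let $F \in \mathcal{S}$. Then
\begin{align*}
\int_F E(E(X|\mathcal{F})|\mathcal{S})\,dP &\equiv \int_F E(X|\mathcal{F})\,dP\\
&\equiv \int_F X\,dP \equiv \int_F E(X|\mathcal{S})\,dP
\end{align*}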
which shows 32.2 in the case where Z is the characteristic function of a set in F.
It follows this also holds for simple functions. Let {sn } be a sequence of simple
functions which converges uniformly to $Z$ and let $F \in \mathcal{F}$. Then by what was just
shown,
$$\int_F s_nE(X|\mathcal{F})\,dP = \int_F s_nX\,dP.$$
Then passing to the limit using the dominated convergence theorem, yields
$$\int_F ZE(X|\mathcal{F})\,dP = \int_F ZX\,dP \equiv \int_F E(ZX|\mathcal{F})\,dP.$$
Lemma 32.3 Let $I$ be an open interval on $\mathbb{R}$ and let $\phi$ be a convex function defined
on $I$. Then there exists a sequence $\{(a_n, b_n)\}$ such that
$$= \frac{\phi(t) - \phi(x)}{t - x}.$$
Therefore $t \to \frac{\phi(t) - \phi(x)}{t - x}$ is increasing if $t > x$. If $t < x$
$$= \frac{\phi(t) - \phi(x)}{t - x}$$
32.1. CONDITIONAL EXPECTATION 895
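and so $t \to \frac{\phi(t) - \phi(x)}{t - x}$ is increasing for $t \ne x$. Let
$$a_x \equiv \inf\left\{\frac{\phi(t) - \phi(x)}{t - x} : t > x\right\}.$$
Thus $\psi(x) = \phi(x)$ and letting $\mathbb{Q} \cap I = \{r_n\}$, $a_n = a_{r_n}$ and $b_n = \phi(r_n) - a_{r_n}r_n$.
This proves the lemma.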
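To see the construction at work in one concrete case, take $\phi(x) = x^2$. For this particular $\phi$ the infimum defining $a_x$ is $\inf\{t + x : t > x\} = 2x$, and the line through $(r, \phi(r))$ with slope $a_r$ has intercept $\phi(r) - a_rr$. The sketch below uses a finite grid of rationals as a stand-in for $\mathbb{Q} \cap I$; the grid and the function names are illustrative assumptions.

```python
from fractions import Fraction

def phi(x):
    """The convex function used for this illustration."""
    return x * x

def slope(r):
    """a_r for phi(x) = x^2: inf over t > r of (phi(t) - phi(r)) / (t - r) = inf (t + r) = 2r."""
    return 2 * r

# A finite stand-in for the rationals r_n in the interval.
points = [Fraction(k, 4) for k in range(-8, 9)]

# Each pair is (a_n, b_n) with b_n chosen so the line passes through (r_n, phi(r_n)).
lines = [(slope(r), phi(r) - slope(r) * r) for r in points]

def sup_of_lines(t):
    """Largest value at t among the finitely many affine functions a_n t + b_n."""
    return max(a * t + b for a, b in lines)

print(sup_of_lines(Fraction(1, 2)), phi(Fraction(1, 2)))
```

At a grid point the supremum equals $\phi$, and elsewhere it stays below $\phi$, approaching it as the grid is refined.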
X → E (X|S)
is linear.
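Proof: Let $A \in \mathcal{S}$.
\begin{align*}
\int_A E(X|\mathcal{S})\,dP &\equiv \int_A X\,dP\\
&\le \int_A Y\,dP \equiv \int_A E(Y|\mathcal{S})\,dP.
\end{align*}
whenever $P(A) \ne 0$. Hence $E(X|\mathcal{S})(\omega) \in I$ a.e. and so it makes sense to consider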
φ (E (X|S)). Now
Thus
$$\sup\{a_nE(X|\mathcal{S}) + b_n\} = \phi(E(X|\mathcal{S})) \le E(\phi(X)|\mathcal{S}) \text{ a.e.}$$
which proves the theorem.
E (Xk+1 |Sk ) = Xk ,
a submartingale if
E (Xk+1 |Sk ) ≥ Xk ,
and a supermartingale if
E (Xk+1 |Sk ) ≤ Xk .
Definition 32.7 Let $\{x_i\}_{i=1}^{I}$ be any sequence of real numbers, $I \le \infty$. Define
an increasing sequence of integers $\{m_k\}$ as follows. $m_1$ is the first integer $\ge 1$
such that $x_{m_1} \le a$, $m_2$ is the first integer larger than $m_1$ such that $x_{m_2} \ge b$, $m_3$
is the first integer larger than $m_2$ such that $x_{m_3} \le a$, etc. Then each sequence,
$\left\{x_{m_{2k-1}}, \cdots, x_{m_{2k}}\right\}$, is called an upcrossing of $[a, b]$.
Proposition 32.8 Let $\{X_i\}_{i=1}^{n}$ be a finite sequence of real random variables defined
on $\Omega$ where $(\Omega, \mathcal{S}, P)$ is a probability space. Let $U_{[a,b]}(\omega)$ denote the number of
upcrossings of $X_i(\omega)$ of the interval $[a, b]$. Then $U_{[a,b]}$ is a random variable.
Proof: Let X0 (ω) ≡ a+1, let Y0 (ω) ≡ 0, and let Yk (ω) remain 0 for k = 0, ···, l
until Xl (ω) ≤ a. When this happens (if ever), Yl+1 (ω) ≡ 1. Then let Yi (ω) remain
1 for i = l + 1, · · ·, r until Xr (ω) ≥ b when Yr+1 (ω) ≡ 0. Let Yk (ω) remain 0 for
k ≥ r + 1 until Xk (ω) ≤ a when Yk (ω) ≡ 1 and continue in this way. Thus the
upcrossings of Xi (ω) are identified as unbroken strings of ones for Yk with a zero
at each end, with the possible exception of the last string of ones which may be
missing the zero at the upper end and may or may not be an upcrossing.
Note also that Y0 is measurable because it is identically equal to 0 and that if
Yk is measurable, then Yk+1 is measurable because the only change in going from
k to k + 1 is a change from 0 to 1 or from 1 to 0 on a measurable set determined
by Xk . Now let
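$$Z_k(\omega) = \begin{cases}1 & \text{if } Y_k(\omega) = 1 \text{ and } Y_{k+1}(\omega) = 0,\\ 0 & \text{otherwise,}\end{cases}$$
if $k < n$ and
$$Z_n(\omega) = \begin{cases}1 & \text{if } Y_n(\omega) = 1 \text{ and } X_n(\omega) \ge b,\\ 0 & \text{otherwise.}\end{cases}$$
Thus $Z_k(\omega) = 1$ exactly when an upcrossing has been completed and each $Z_i$ is a
random variable.
$$U_{[a,b]}(\omega) = \sum_{k=1}^{n}Z_k(\omega)$$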
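The two descriptions of the number of upcrossings, one through the integers $m_k$ and one through the indicator sequences $Y_k$ and $Z_k$, can be compared on a concrete list of numbers. The following is a minimal sketch assuming $a < b$; both function names and the sample data are hypothetical.

```python
def upcrossings_by_indices(xs, a, b):
    """Count upcrossings of [a, b] by locating m_1 < m_2 < ... directly."""
    count, i, n = 0, 0, len(xs)
    while True:
        while i < n and xs[i] > a:   # search for the next value <= a
            i += 1
        if i == n:
            return count
        i += 1                       # the next index must be strictly larger
        while i < n and xs[i] < b:   # search for the next value >= b
            i += 1
        if i == n:
            return count
        count += 1                   # one upcrossing has been completed
        i += 1

def upcrossings_by_flags(xs, a, b):
    """Count upcrossings with a 0/1 flag playing the role of Y_k."""
    y, count = 0, 0
    for x in xs:
        if y == 0 and x <= a:
            y = 1                    # the next Y becomes 1
        elif y == 1 and x >= b:
            count += 1               # this is where Z_k = 1
            y = 0
    return count

xs = [3, 0, 1, 5, 2, -1, 4, 6, 0, 3]
print(upcrossings_by_indices(xs, 1, 4), upcrossings_by_flags(xs, 1, 4))
```

In the sample list the final string of ones (starting at the value 0 near the end) never reaches $b$, so it produces no upcrossing, exactly the exceptional case mentioned in the proof.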
Corollary 32.9 $U_{[a,b]}(\omega) \le$ the number of unbroken strings of ones in the sequence,
$\{Y_k(\omega)\}$ there being at most one unbroken string of ones which produces no
upcrossing. Also
$$Y_i(\omega) = \psi_i\left(\{X_j(\omega)\}_{j=1}^{i-1}\right), \tag{32.6}$$
{(Xn , Sn )}
{(φ (Xn ) , Sn )}
is also a submartingale.
by Jensen’s inequality.
The following is called the upcrossing lemma.
Lemma 32.11 (upcrossing lemma) Let $\{(X_i, \mathcal{S}_i)\}_{i=1}^{n}$ be a submartingale and let
$U_{[a,b]}(\omega)$ be the number of upcrossings of $[a, b]$. Then
$$E\left(U_{[a,b]}\right) \le \frac{E(|X_n|) + |a|}{b - a}.$$
Proof: Let $\phi(x) \equiv a + (x - a)^+$ so that $\phi$ is an increasing convex function
always at least as large as $a$. By Lemma 32.10 it follows that $\{(\phi(X_k), \mathcal{S}_k)\}$ is also
a submartingale.
\begin{align*}
\phi(X_{k+r}) - \phi(X_k) &= \sum_{i=k+1}^{k+r}\phi(X_i) - \phi(X_{i-1})\\
&= \sum_{i=k+1}^{k+r}(\phi(X_i) - \phi(X_{i-1}))Y_i + \sum_{i=k+1}^{k+r}(\phi(X_i) - \phi(X_{i-1}))(1 - Y_i).
\end{align*}
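$$A_i \equiv \{\omega : Y_i(\omega) = 0\},$$
\begin{align*}
E\left(\sum_{i=k+1}^{k+r}(\phi(X_i) - \phi(X_{i-1}))(1 - Y_i)\right) &= \sum_{i=k+1}^{k+r}\int_{\Omega}(\phi(X_i) - \phi(X_{i-1}))(1 - Y_i)\,dP\\
&= \sum_{i=k+1}^{k+r}\int_{A_i}(\phi(X_i) - \phi(X_{i-1}))\,dP
\end{align*}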
32.2. DISCRETE MARTINGALES 899
$$\ge \sum_{i=k+1}^{k+r}\int_{A_i}\phi(X_{i-1})\,dP - \int_{A_i}\phi(X_{i-1})\,dP = 0. \tag{32.7}$$
\begin{align*}
&= \sum_{k=1}^{n}(\phi(X_k(\omega)) - \phi(X_{k-1}(\omega)))Y_k(\omega)\\
&\qquad + \sum_{k=1}^{n}(\phi(X_k(\omega)) - \phi(X_{k-1}(\omega)))(1 - Y_k(\omega)).
\end{align*}
The first sum in the above reduces to summing over the unbroken strings of ones
because the terms in which Yi (ω) = 0 contribute nothing. implies
\begin{align*}
\phi(X_n(\omega)) - \phi(X_1(\omega)) &\ge U_{[a,b]}(\omega)(b - a) + 0\,+\\
&\qquad \sum_{k=1}^{n}(\phi(X_k(\omega)) - \phi(X_{k-1}(\omega)))(1 - Y_k(\omega)) \tag{32.9}
\end{align*}
where the zero on the right side results from a string of ones which does not
produce an upcrossing. It is here that it is important that $\phi(x) \ge a$. Such
a string begins with $\phi(X_k(\omega)) = a$ and results in an expression of the form
$\phi(X_{k+m}(\omega)) - \phi(X_k(\omega)) \ge 0$ since $\phi(X_{k+m}(\omega)) \ge a$. If $X_k$ had not been replaced
with $\phi(X_k)$, it would have been possible for $\phi(X_{k+m}(\omega))$ to be less than $a$
and the zero in the above could have been a negative number. This would have been
inconvenient.
Next take the expected value of both sides in 32.9. Using 32.7, this results in
\begin{align*}
E(\phi(X_n) - \phi(X_1)) &\ge (b - a)E\left(U_{[a,b]}\right) + E\left(\sum_{k=1}^{n}(\phi(X_k) - \phi(X_{k-1}))(1 - Y_k)\right)\\
&\ge (b - a)E\left(U_{[a,b]}\right)
\end{align*}
Proof: Let $a, b \in \mathbb{Q}$ and let $a < b$. Let $U_{[a,b]}^n(\omega)$ be the number of upcrossings
of $\{X_i(\omega)\}_{i=1}^{n}$. Then let
$$U_{[a,b]}(\omega) \equiv \lim_{n\to\infty}U_{[a,b]}^n(\omega) = \text{number of upcrossings of } \{X_i\}.$$
Theorem 32.13 Let $\{(X_i, \mathcal{S}_i)\}_{i=1}^{\infty}$ be a submartingale. Then
$$P\left(\max_{1\le k\le n}X_k \ge \lambda\right) \le \frac{1}{\lambda}\int_{\Omega}X_n^+\,dP$$
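Proof: Let
$$A_1 \equiv [X_1 \ge \lambda],\ A_2 \equiv [X_2 \ge \lambda] \setminus A_1,\ \cdots,\ A_k \equiv [X_k \ge \lambda] \setminus \left(\cup_{i=1}^{k-1}A_i\right)\cdots$$
Thus each $A_k$ is $\mathcal{S}_k$ measurable, the $A_k$ are disjoint, and their union equals $[\max_{1\le k\le n}X_k \ge \lambda]$.
Therefore from the definition of a submartingale and Jensen's inequality,
\begin{align*}
P\left(\max_{1\le k\le n}X_k \ge \lambda\right) &= \sum_{k=1}^{n}P(A_k) \le \frac{1}{\lambda}\sum_{k=1}^{n}\int_{A_k}X_k\,dP\\
&\le \frac{1}{\lambda}\sum_{k=1}^{n}\int_{A_k}E(X_n|\mathcal{S}_k)\,dP\\
&\le \frac{1}{\lambda}\sum_{k=1}^{n}\int_{A_k}E(X_n|\mathcal{S}_k)^+\,dP\\
&\le \frac{1}{\lambda}\sum_{k=1}^{n}\int_{A_k}E\left(X_n^+|\mathcal{S}_k\right)dP\\
&= \frac{1}{\lambda}\sum_{k=1}^{n}\int_{A_k}X_n^+\,dP \le \frac{1}{\lambda}\int_{\Omega}X_n^+\,dP.
\end{align*}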
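The inequality of Theorem 32.13 can be checked exactly in a small case. The sketch below takes $X_k = |S_k|$ where $S_k$ is a simple symmetric random walk, which is a submartingale (an assumption of this example, since it is a convex function of a martingale), and enumerates all $2^n$ equally likely paths. The function name and the choice $n = 6$ are illustrative.

```python
from fractions import Fraction
from itertools import product

def maximal_inequality_sides(n, lam):
    """Exact values of P(max_{k<=n} |S_k| >= lam) and (1/lam) E(|S_n|^+)."""
    total = 2 ** n
    hits = 0        # number of paths whose running maximum of |S_k| reaches lam
    sum_pos = 0     # sum over paths of the positive part of X_n = |S_n|
    for steps in product((-1, 1), repeat=n):
        s = 0
        path = []
        for e in steps:
            s += e
            path.append(abs(s))
        if max(path) >= lam:
            hits += 1
        sum_pos += max(path[-1], 0)
    lhs = Fraction(hits, total)
    rhs = Fraction(sum_pos, total) / lam
    return lhs, rhs

for lam in (1, 2, 3, 4):
    print(lam, maximal_inequality_sides(6, lam))
```

Exact rational arithmetic is used so that the comparison of the two sides does not depend on rounding.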
where $e_j$ is $\mathcal{F}_{t_j}$ measurable and $\cup_{j=0}^{\infty}[t_j, t_{j+1}) = \mathbb{R}$. Of course you can replace $[0, \infty)$
in the above with [0, T ] and this is the case of most interest. Another convention
followed will be to assume that (Ω, Ft , P ) is a complete measure space. Thus all sets
of measure zero from F are in Ft . If it is not complete, you simply replace it with
its completion. This goes for F as well.
The act of replacing the measure spaces with their completions is completely harmless. The only important idea which needs consideration is that of independence. Suppose the σ algebras $\mathcal{G}$ and $\mathcal{H}$ are independent and you then consider their completions. Will the new σ algebras also be independent?
Lemma 33.2 Suppose the σ algebras $\mathcal{G}$ and $\mathcal{H}$ are independent and let $\mathcal{G}'$ and $\mathcal{H}'$ be the σ algebras of the completions. Then $\mathcal{G}'$ and $\mathcal{H}'$ are also independent.
904 FILTRATIONS AND MARTINGALES
Proof: Extend $f$ to equal 0 for $t \notin [0, T]$. Let $t_j = j2^{-n}$ and let $\phi_n(t)$ denote the step function which equals $j2^{-n}$ on the interval $[j2^{-n}, (j+1)2^{-n})$. Since $\phi_n(x) \le x < \phi_n(x) + 2^{-n}$, it follows easily that if $s \ge 0$, then $\phi_n(t-s) + s \in (t - 2^{-n}, t]$. Now let
\[
f_h(t) \equiv f * \psi_h(t)
\]
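The bound on the shifted dyadic step function can be checked with exact rational arithmetic. This is a small sketch, not part of the proof: the names `phi` and `shift_error` are introduced here for illustration, and `Fraction` inputs are assumed so that no floating point rounding enters.

```python
import math
from fractions import Fraction


def phi(n, t):
    """Dyadic step function: the largest multiple of 2**-n that is <= t."""
    return Fraction(math.floor(t * 2**n), 2**n)


def shift_error(n, t, s):
    """Return t - (phi_n(t - s) + s); this should lie in [0, 2**-n)."""
    return t - (phi(n, t - s) + s)


# Scan a grid of rational t in [0, 2] and s in [0, 1] for n = 4.
n = 4
worst = max(
    shift_error(n, Fraction(i, 37), Fraction(j, 23))
    for i in range(75) for j in range(24)
)
print(worst, Fraction(1, 2**n))  # worst error stays strictly below 2**-n
```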
From now on, $\omega \notin E$ where $E$ is the exceptional set of measure zero on which $\int_{-\infty}^{\infty} f(t,\omega)^2\,dt = \infty$. Consider
\[
\int_0^T\!\!\int_0^1 \left|f(\phi_n(t-s)+s,\omega) - f(t,\omega)\right|^2 ds\,dt.
\]
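\begin{align*}
\int_0^T\!\!\int_0^1 \left|f(\phi_n(t-s)+s,\omega) - f(t,\omega)\right|^2 ds\,dt \le{}
& 3\Big(\int_0^T\!\!\int_0^1 \left|f(\phi_n(t-s)+s,\omega) - f_h(\phi_n(t-s)+s,\omega)\right|^2 ds\,dt \tag{33.2}\\
& + \int_0^T\!\!\int_0^1 \left|f_h(\phi_n(t-s)+s,\omega) - f_h(t,\omega)\right|^2 ds\,dt \tag{33.3}\\
& + \int_0^T\!\!\int_0^1 \left|f_h(t,\omega) - f(t,\omega)\right|^2 ds\,dt\Big) \tag{33.4}
\end{align*}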
Consider the term in 33.2. There are disjoint intervals such that $\phi_n(t-s)$ is constant on these intervals. Therefore, the inside integral of this term must be of the form
\[
\sum_{k=1}^{m_n}\int_{I_k} \left|f(c_k+s,\omega) - f_h(c_k+s,\omega)\right|^2 ds
\]
where the intervals $I_k$ are disjoint, $c_k$ is of the form $j2^{-n}$, and the union of these intervals equals $[0,1]$. Therefore, there are other disjoint intervals $J_k$ such that this term equals
\[
\sum_{k=1}^{m_n}\int_{J_k} \left|f(s,\omega) - f_h(s,\omega)\right|^2 ds \le \int_{\mathbb{R}} \left|f(s,\omega) - f_h(s,\omega)\right|^2 ds
\]
Letting $M$ be the uniform bound on $f$, and using the fact that $|\phi_n(t-s)+s-t| < 2^{-n}$, it follows the above expression is dominated by
\[
\frac{\varepsilon}{2} + \frac{M}{h}2^{-n} < \varepsilon
\]
provided $n$ is chosen large enough. Therefore, this has shown
\begin{align*}
&\lim_{n\to\infty}\int_0^T\!\!\int_0^1 \left|f(\phi_n(t-s)+s,\omega) - f(t,\omega)\right|^2 ds\,dt \\
&= \lim_{n\to\infty}\int_0^1\!\!\int_0^T \left|f(\phi_n(t-s)+s,\omega) - f(t,\omega)\right|^2 dt\,ds = 0.
\end{align*}
Therefore, if $\varepsilon > 0$ is given, the above expression is less than $\varepsilon$ provided $n$ is large enough, depending on $\omega$. However, this requires that for some $s_n \in [0,1]$,
\[
\int_0^T \left|f(\phi_n(t-s_n)+s_n,\omega) - f(t,\omega)\right|^2 dt < \varepsilon.
\]
for all $k$ large enough. From the definition of these $f_{n_k}$ given above, they are uniformly bounded. Therefore, by the dominated convergence theorem,
\[
\lim_{k\to\infty}\int_{\Omega}\int_0^T \left(f_{n_k}(t,\omega) - f(t,\omega)\right)^2 dt\,dP = 0.
\]
Then there exists a sequence of uniformly bounded adapted step functions $\phi_n$ such that
\[
\lim_{n\to\infty} P\left(\int_0^T \left(f(t,\omega) - \phi_n(t,\omega)\right)^2 dt > \varepsilon\right) = 0. \tag{33.6}
\]
Thus
mX
n −1
Then $f_M$ satisfies all the conditions of Lemma 33.3. Letting $\varepsilon > 0$ be given, it follows there exists $\phi_M$, a uniformly bounded adapted step function, such that
\[
P\left(\int_0^T \left(f_M(t,\omega) - \phi_M(t,\omega)\right)^2 dt > \frac{\delta}{4}\right) < \varepsilon.
\]
This is because for $E$ the set of measure zero such that 33.5 does not hold,
\[
\Omega \setminus E = \cup_{M=1}^{\infty}\left[\omega : \int_0^T \left(f(t,\omega) - f_M(t,\omega)\right)^2 dt \le \frac{\delta}{2}\right]
\]
and so by the Borel Cantelli lemma there exists a set of measure zero $E$ such that for $\omega \notin E$ and all $k$ large enough,
\[
\int_0^T \left(f(t,\omega) - f_{M_k}(t,\omega)\right)^2 dt \le 2^{-k}.
\]
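Then by Lemma 33.3 there exists an adapted bounded step function $\phi_{M_k}$ such that
\[
\left(\int_{\Omega}\int_0^T \left(f_{M_k}(t,\omega) - \phi_{M_k}(t,\omega)\right)^2 dt\,dP\right)^{1/2} < 2^{-(k+1)}.
\]
It follows
\[
\left(\int_{\Omega}\int_0^T \left(f(t,\omega) - \phi_{M_k}(t,\omega)\right)^2 dt\,dP\right)^{1/2} < 2^{-k}.
\]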
Then there exists a sequence of uniformly bounded adapted step functions $\phi_n$ such that
\[
\lim_{n\to\infty} P\left(\int_0^T \left(f(t,\omega) - \phi_n(t,\omega)\right)^2 dt \le 2^{-n}\right) = 1.
\]
From the Borel Cantelli lemma $\omega$ which is in infinitely many of the sets
\[
\left[\omega : \int_0^T \left(f(t,\omega) - \phi_{n_k}(t,\omega)\right)^2 dt > 2^{-k}\right]
\]
for all finite increasing sequences $t_1, \cdots, t_k$ such that $0 \le t_1 < t_2 < \cdots < t_k \le t$ and $F_j$ Borel. Thus $\mathcal{F}_t$ is a filtration. Another way to say it is that $\mathcal{F}_t$ is the smallest σ algebra contained in $\mathcal{F}$ which is complete and such that for every increasing sequence $t_1, \cdots, t_k$ with $0 \le t_1 < t_2 < \cdots < t_k \le t$, it follows that $(B_{t_1}, \cdots, B_{t_k}) : \Omega \to \mathbb{R}^n$ is measurable with respect to $\mathcal{F}_t$.
Lemma 33.7 $\mathcal{F}_t$ also equals the smallest σ algebra which is complete and contains all sets of the form
\[
\left(B_{t_1}, B_{t_2} - B_{t_1}, \cdots, B_{t_k} - B_{t_{k-1}}\right)^{-1}(F_1 \times \cdots \times F_k).
\]
In other words, one can consider instead the independent increments when defining $\mathcal{F}_t$.
Proof: $\mathcal{F}_t'$ is the smallest σ algebra such that $(B_{t_1}, \cdots, B_{t_k})$ is measurable for every increasing sequence $t_1, \cdots, t_k$ such that $0 \le t_1 < t_2 < \cdots < t_k \le t$. The 0 function is clearly measurable because $0^{-1}(E) = \Omega$ if $0 \in E$ and $0^{-1}(E) = \emptyset$ if $0 \notin E$. Therefore, for every increasing sequence as just described, $\left(0, -B_{t_1}, \cdots, -B_{t_{k-1}}\right)$ is $\mathcal{F}_t$ measurable. Therefore, $\left(B_{t_1}, B_{t_2} - B_{t_1}, \cdots, B_{t_k} - B_{t_{k-1}}\right)$ is $\mathcal{F}_t'$ measurable.
Now suppose for all such increasing sequences, $\left(B_{t_1}, B_{t_2} - B_{t_1}, \cdots, B_{t_k} - B_{t_{k-1}}\right)$ is $\mathcal{F}_t'$ measurable. Then in particular,
must be $\mathcal{F}_t'$ measurable because $B_{t_i}$ is. Therefore, adding in $k$ of these, it follows $(B_{t_1}, \cdots, B_{t_k})$ is $\mathcal{F}_t'$ measurable. In other words, a σ algebra is measurable for all
Theorem 33.9 Brownian motion for S ≤ t with respect to the filtration described
above is a martingale.
Next it must be shown that E (Bs |Ft ) = Bt whenever s ≥ t. Let F ∈ Ft and let
s ≥ t. Then
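\begin{align*}
\int_F E(B_s - B_t|\mathcal{F}_t)\,dP &= \int_{\Omega} (B_s - B_t)\,\mathcal{X}_F\,dP \\
&= \int_{\Omega} (B_s - B_t)\,dP \int_{\Omega} \mathcal{X}_F\,dP = 0.
\end{align*}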
Hence
Lemma 33.10 Let $B_t$ be real valued Brownian motion. Then $B_t^2 - t$ is a martingale.
Proof: The idea is to exploit the fact the increments $(B_s - B_t)$ for $s > t$ are independent of $\mathcal{F}_t$. Thus you write things in terms of $(B_s - B_t)$. It is easy to see that $(B_s - B_t)^2 - 2B_t^2 + 2B_sB_t = B_s^2 - B_t^2$. Therefore, using the fact that Brownian motion is a martingale,
\begin{align*}
E\left(B_s^2 - B_t^2|\mathcal{F}_t\right) &= E\left((B_s - B_t)^2 - 2B_t^2 + 2B_sB_t|\mathcal{F}_t\right) \\
&= E\left((B_s - B_t)^2|\mathcal{F}_t\right) + E\left(-2B_t^2 + 2B_sB_t|\mathcal{F}_t\right) \\
&= E\left((B_s - B_t)^2|\mathcal{F}_t\right) - 2B_t^2 + 2B_tE(B_s|\mathcal{F}_t) \\
&= E\left((B_s - B_t)^2|\mathcal{F}_t\right) - 2B_t^2 + 2B_t^2 \\
&= E\left((B_s - B_t)^2|\mathcal{F}_t\right)
\end{align*}
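Now for $A \in \mathcal{F}_t$,
\begin{align*}
\int_A E\left((B_s - B_t)^2|\mathcal{F}_t\right)dP &\equiv \int_A (B_s - B_t)^2\,dP \\
&= \int_{\Omega} \mathcal{X}_A\,dP \int_{\Omega} (B_s - B_t)^2\,dP = \int_A (s - t)\,dP.
\end{align*}
Since this holds for all $A \in \mathcal{F}_t$, it follows $E\left((B_s - B_t)^2|\mathcal{F}_t\right) = (s - t)$ and so
\begin{align*}
E\left(B_s^2 - s|\mathcal{F}_t\right) &= E\left(B_s^2 - B_t^2|\mathcal{F}_t\right) + B_t^2 - s \\
&= (s - t) + B_t^2 - s = B_t^2 - t.
\end{align*}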
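A rough numerical sanity check of Lemma 33.10 is sketched below. It is a Monte Carlo illustration under stated assumptions, not a proof: Brownian motion is assumed to start at 0, the pair $(B_t, B_s)$ is sampled directly from its Gaussian law, and the function name, the event $[B_t > 0.5]$ and all parameter values are choices made here. If $B_t^2 - t$ is a martingale, the increment $(B_s^2 - s) - (B_t^2 - t)$ should average to about zero, both overall and when restricted to an event determined by $B_t$.

```python
import math
import random


def martingale_increment_stats(t=1.0, s=2.0, samples=100_000, seed=0):
    """Average of D = (B_s^2 - s) - (B_t^2 - t), overall and on the event B_t > 0.5."""
    rng = random.Random(seed)
    total, total_on_event = 0.0, 0.0
    for _ in range(samples):
        b_t = rng.gauss(0.0, math.sqrt(t))               # B_t ~ N(0, t), started at 0
        b_s = b_t + rng.gauss(0.0, math.sqrt(s - t))     # independent increment
        d = (b_s**2 - s) - (b_t**2 - t)
        total += d
        if b_t > 0.5:                                    # an F_t measurable event
            total_on_event += d
    return total / samples, total_on_event / samples


print(martingale_increment_stats())  # both numbers should be close to 0
```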
Therefore, from the earlier observations about the characteristic function of normally distributed random variables and using the fact the mean of $B_t$ is $x$ and the variance is $t$,
\[
\frac{e^{iux}e^{-\frac{1}{2}u^2 s}}{e^{iux}e^{-\frac{1}{2}u^2 t}} = E\left(e^{i(B_s - B_t)u}\right)
\]
and so
\[
E\left(e^{i(B_s - B_t)u}\right) = e^{-\frac{1}{2}u^2(s-t)}. \tag{33.8}
\]
Therefore, differentiating repeatedly with respect to $u$,
\begin{align*}
E\left(i(B_s - B_t)e^{i(B_s - B_t)u}\right) &= u(t - s)\,e^{\frac{1}{2}u^2(t-s)} \\
E\left(-(B_s - B_t)^2 e^{i(B_s - B_t)u}\right) &= e^{\frac{1}{2}u^2(t-s)}\left(t - s + u^2t^2 - 2u^2ts + u^2s^2\right) \tag{33.9}\\
E\left(-i(B_s - B_t)^3 e^{i(B_s - B_t)u}\right) &= e^{\frac{1}{2}u^2(t-s)}\left(3ut^2 - 6uts + 3us^2 + u^3t^3 - 3u^3t^2s + 3u^3ts^2 - u^3s^3\right) \tag{33.10}
\end{align*}
33.1. CONTINUOUS MARTINGALES 913
\begin{align*}
E\left((B_s - B_t)^4 e^{i(B_s - B_t)u}\right) = e^{\frac{1}{2}u^2(t-s)}\big(&3t^2 + 6u^2t^3 - 18u^2t^2s - 6ts + 18u^2ts^2 + 3s^2 \\
&- 6u^2s^3 + u^4t^4 - 4u^4t^3s + 6u^4t^2s^2 - 4u^4ts^3 + u^4s^4\big) \tag{33.11}
\end{align*}
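Clearly one could go on like this but this is enough for now. You might conjecture that $E\left((B_s - B_t)^m\right) = (m - 1)(s - t)^{m/2}$ for $m$ even and 0 for $m$ odd. This matches the cases $m = 2$ and $m = 4$, although for larger even $m$ the coefficient is in fact $(m-1)!! = (m-1)(m-3)\cdots 1$. Let's simply call it $g_m(s - t)$ for now.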
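The moments of a centered Gaussian increment can be checked by direct numerical integration. The sketch below is an illustration under stated assumptions: the function names are introduced here, the integral is approximated by the trapezoidal rule on a truncated interval, and the variance value 0.5 stands in for $s - t$. It compares the numerical moments with $(m-1)!!\,(s-t)^{m/2}$, which coincides with $(m-1)(s-t)^{m/2}$ for $m = 2$ and $m = 4$ and differs from it for $m \ge 6$.

```python
import math


def gaussian_moment(m, var, half_width=12.0, steps=20_000):
    """Approximate E[Z**m] for Z ~ N(0, var) with the trapezoidal rule."""
    sd = math.sqrt(var)
    a, b = -half_width * sd, half_width * sd
    h = (b - a) / steps

    def integrand(x):
        return x**m * math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

    total = 0.5 * (integrand(a) + integrand(b))
    for i in range(1, steps):
        total += integrand(a + i * h)
    return total * h


def double_factorial(k):
    """k!! = k (k-2) (k-4) ... down to 1, with the convention (-1)!! = 1."""
    return 1 if k <= 0 else k * double_factorial(k - 2)


var = 0.5  # plays the role of s - t
for m in range(1, 7):
    exact = 0.0 if m % 2 else double_factorial(m - 1) * var ** (m // 2)
    print(m, gaussian_moment(m, var), exact)
```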
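It follows
\[
B_s^3 = B_t^3 + 3B_t^2(B_s - B_t) + 3B_t(B_s - B_t)^2 + (B_s - B_t)^3.
\]
Taking conditional expectations with respect to $\mathcal{F}_t$ and using $E(B_s - B_t|\mathcal{F}_t) = 0$,
\begin{align*}
E\left(B_s^3|\mathcal{F}_t\right) &= B_t^3 + 3B_t^2E(B_s - B_t|\mathcal{F}_t) + 3B_tE\left((B_s - B_t)^2|\mathcal{F}_t\right) + E\left((B_s - B_t)^3|\mathcal{F}_t\right) \\
&= B_t^3 + 3B_tE\left((B_s - B_t)^2|\mathcal{F}_t\right) + E\left((B_s - B_t)^3|\mathcal{F}_t\right) \tag{33.15}
\end{align*}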
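Consider $E\left((B_s - B_t)^m|\mathcal{F}_t\right)$. If $A \in \mathcal{F}_t$, then by independence,
\begin{align*}
\int_A E\left((B_s - B_t)^m|\mathcal{F}_t\right)dP &= \int_A (B_s - B_t)^m\,dP \\
&= \int_{\Omega} \mathcal{X}_A\,dP \int_{\Omega} (B_s - B_t)^m\,dP \\
&= \int_A g_m(s - t)\,dP
\end{align*}
which shows
\[
g_m(s - t) = E\left((B_s - B_t)^m|\mathcal{F}_t\right). \tag{33.16}
\]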
Then considering 33.15 in light of 33.12 and 33.13, this leads to
\[
E\left(B_s^3|\mathcal{F}_t\right) = B_t^3 + 3(s - t)B_t
\]
It follows
\begin{align*}
E\left(B_s^3 - 3B_s s|\mathcal{F}_t\right) &= B_t^3 + 3(s - t)B_t - 3sE(B_s|\mathcal{F}_t) \\
&= B_t^3 + 3(s - t)B_t - 3sB_t \\
&= B_t^3 - 3tB_t
\end{align*}
Proof: The Taylor series for $y^4 - 6ty^2 + 3t^2$ considered a function of $y$ expanded about $x$ is
\[
\left(x^4 - 6tx^2 + 3t^2\right) + \left(-12tx + 4x^3\right)(y - x) + \left(-6t + 6x^2\right)(y - x)^2 + 4x(y - x)^3 + (y - x)^4
\]
Using 33.16 and taking conditional expectations using the formulas 33.12 - 33.14, $E\left(B_s^4 - 6sB_s^2 + 3s^2|\mathcal{F}_t\right)$ equals
\[
\left(B_t^4 - 6sB_t^2 + 3s^2\right) + \left(-6s + 6B_t^2\right)(s - t) + 3(s - t)^2
\]
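The algebra in this step is easy to get wrong by hand, so a quick exact check may be useful. The sketch below is only an illustration: the function names are introduced here, `x` stands for the value of $B_t$, and the increment moments $E(B_s - B_t) = 0$, $E((B_s - B_t)^2) = s - t$, $E((B_s - B_t)^3) = 0$, $E((B_s - B_t)^4) = 3(s-t)^2$ are assumed. It confirms with rational arithmetic that the Taylor expansion reproduces the polynomial and that the conditional expectation above simplifies to $B_t^4 - 6tB_t^2 + 3t^2$.

```python
from fractions import Fraction


def p(t, y):
    """The polynomial y^4 - 6 t y^2 + 3 t^2."""
    return y**4 - 6 * t * y**2 + 3 * t**2


def taylor_about_x(t, x, y):
    """Taylor expansion of p(t, .) about x, evaluated at y."""
    d = y - x
    return (p(t, x) + (-12 * t * x + 4 * x**3) * d
            + (-6 * t + 6 * x**2) * d**2 + 4 * x * d**3 + d**4)


def conditional_expectation(s, t, x):
    """E(B_s^4 - 6 s B_s^2 + 3 s^2 | F_t) on the event B_t = x, from the increment moments."""
    return p(s, x) + (-6 * s + 6 * x**2) * (s - t) + 3 * (s - t) ** 2


s, t, x, y = Fraction(7, 3), Fraction(2, 5), Fraction(-3, 4), Fraction(11, 6)
print(taylor_about_x(t, x, y) == p(t, y))
print(conditional_expectation(s, t, x) == p(t, x))
```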
Lemma 33.13 Let $\{\mathcal{M}_t\}$ be a filtration and let $\{M_t\}$ be a real valued martingale for $t \in [S, T]$. Then for $\lambda > 0$ and any $p \ge 1$, if $A_t$ is a $\mathcal{M}_t$ measurable subset of $[|M_t| \ge \lambda]$, then
\[
P(A_t) \le \frac{1}{\lambda^p}\int_{A_t} |M_T|^p\,dP.
\]
Theorem 33.14 Let $\{\mathcal{M}_t\}$ be a filtration and let $\{M_t\}$ be a real valued continuous martingale for $t \in [S, T]$. Then for all $\lambda > 0$ and $p \ge 1$,
\[
P\left(\left[\sup_{t\in[S,T]} |M_t| \ge \lambda\right]\right) \le \frac{1}{\lambda^p}\int_{\Omega} |M_T|^p\,dP
\]
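Proof: Let $S \le t^m_0 < t^m_1 < \cdots < t^m_{N_m} = T$ where $t^m_{j+1} - t^m_j = (T - S)2^{-m}$. First consider $m = 1$.
\[
A_{t^1_0} \equiv \left\{\omega \in \Omega : \left|M_{t^1_0}(\omega)\right| \ge \lambda\right\}, \quad A_{t^1_1} \equiv \left\{\omega \in \Omega : \left|M_{t^1_1}(\omega)\right| \ge \lambda\right\} \setminus A_{t^1_0},
\]
\[
A_{t^1_2} \equiv \left\{\omega \in \Omega : \left|M_{t^1_2}(\omega)\right| \ge \lambda\right\} \setminus \left(A_{t^1_0} \cup A_{t^1_1}\right).
\]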
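Do this type of construction for $m = 2, 3, 4, \cdots$ yielding disjoint sets $\left\{A_{t^m_j}\right\}_{j=0}^{2^m}$ whose union equals
\[
\cup_{t\in D_m}[|M_t| \ge \lambda]
\]
where $D_m = \left\{t^m_j\right\}_{j=0}^{2^m}$. Thus $D_m \subseteq D_{m+1}$. Then also, $D \equiv \cup_{m=1}^{\infty} D_m$ is dense and countable. From Lemma 33.13,
\begin{align*}
P\left(\cup_{t\in D_m}[|M_t| \ge \lambda]\right) &= \sum_{j=0}^{2^m} P\left(A_{t^m_j}\right) \\
&\le \frac{1}{\lambda^p}\sum_{j=0}^{2^m}\int_{A_{t^m_j}} |M_T|^p\,dP \\
&\le \frac{1}{\lambda^p}\int_{\Omega} |M_T|^p\,dP.
\end{align*}
Let $m \to \infty$ in the above to obtain
\[
P\left(\cup_{t\in D}[|M_t| \ge \lambda]\right) \le \frac{1}{\lambda^p}\int_{\Omega} |M_T|^p\,dP. \tag{33.17}
\]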
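From now on, assume that for a.e. $\omega \in \Omega$, $t \to M_t(\omega)$ is continuous. Then with this assumption, the following claim holds.
Claim: For $\lambda > \varepsilon > 0$,
\[
\cup_{t\in D}[|M_t| \ge \lambda - \varepsilon] \supseteq \left[\sup_{t\in[S,T]} |M_t| \ge \lambda\right]
\]
Proof of the claim: Suppose $\omega \in \left[\sup_{t\in[S,T]} |M_t| \ge \lambda\right]$. Then there exists $s$ such that $|M_s(\omega)| > \lambda - \varepsilon$. By continuity, this situation persists for all $t$ near to $s$. In particular, it is true for some $t \in D$. This proves the claim.
Letting $P'$ denote the outer measure determined by $P$, it follows from the claim and 33.17 that
\begin{align*}
P'\left(\left[\sup_{t\in[S,T]} |M_t| \ge \lambda\right]\right) &\le P\left(\cup_{t\in D}[|M_t| \ge \lambda - \varepsilon]\right) \\
&\le \frac{1}{(\lambda - \varepsilon)^p}\int_{\Omega} |M_T|^p\,dP.
\end{align*}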
In all this, $B_t$ will be a martingale for the filtration $\mathcal{H}_t$ and the increments $B_s - B_t$ will be independent of $\mathcal{H}_t$ for $s > t$. I will define the Itô integral on $[0, T]$ where $T$ is arbitrary. In doing so, I will also define it on $[0, t]$. First the integral is defined on uniformly bounded adapted step functions. Let $\phi$ be such a function. Thus
\[
\phi(t, \omega) = \sum_{j=0}^{n-1} \phi_j(\omega)\,\mathcal{X}_{[t_j, t_{j+1})}(t). \tag{34.1}
\]
For $t \in [t_k, t_{k+1})$,
\[
\int_0^t \phi\,dB(\omega) \equiv \sum_{j=0}^{k-1} \phi_j(\omega)\left(B_{t_{j+1}}(\omega) - B_{t_j}(\omega)\right) + \phi_k(\omega)\left(B_t(\omega) - B_{t_k}(\omega)\right). \tag{34.2}
\]
The verification that this is well defined and linear is essentially the same as it is in the context of the Riemann integral from calculus. To show linearity on such step functions, you simply take a common refinement and if $s$ is one of the new partition points in $[t_i, t_{i+1})$, you replace the term $\phi_i\,\mathcal{X}_{[t_i, t_{i+1})}(t)$ with the sum of the two terms, $\phi_i\,\mathcal{X}_{[t_i, s)}(t) + \phi_i\,\mathcal{X}_{[s, t_{i+1})}(t)$.
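The refinement argument can be made concrete for a single fixed $\omega$. The following is a minimal sketch under stated assumptions: the helper names `ito_integral_step` and `refine` are introduced here, the step function is given by a list of partition points and a list of values, and the path $r \to B_r(\omega)$ is replaced by an arbitrary deterministic stand-in function, since the identity being checked is purely algebraic and holds path by path.

```python
import math
from bisect import bisect_right


def ito_integral_step(times, values, path, t):
    """Evaluate the sum in 34.2 for one fixed omega.

    times  -- partition points t_0 < t_1 < ... < t_n
    values -- values[j] is phi_j(omega), the value of phi on [t_j, t_{j+1})
    path   -- function r -> B_r(omega) (any function of r works here)
    t      -- upper limit with t_0 <= t < t_n
    """
    k = bisect_right(times, t) - 1      # index with t_k <= t < t_{k+1}
    total = sum(values[j] * (path(times[j + 1]) - path(times[j])) for j in range(k))
    return total + values[k] * (path(t) - path(times[k]))


def refine(times, values, s):
    """Insert a new partition point s (not already present), duplicating the value on its interval."""
    i = bisect_right(times, s) - 1
    return times[:i + 1] + [s] + times[i + 1:], values[:i + 1] + [values[i]] + values[i + 1:]


times, values = [0.0, 0.5, 1.0, 2.0], [1.0, -2.0, 3.0]
path = lambda r: math.sin(3 * r) + r    # stand-in for one sample path, not Brownian
fine_times, fine_values = refine(times, values, 0.7)
print(ito_integral_step(times, values, path, 1.3),
      ito_integral_step(fine_times, fine_values, path, 1.3))
```

Both printed numbers agree up to rounding, which is the statement that the integral of a step function does not depend on the partition used to represent it.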
Lemma 34.1 Let $s > t$ and let $\phi$ be bounded and $\mathcal{H}_t$ measurable. Then
\[
\exp\left(\phi(B_s - B_t)\right)\exp\left(-\frac{1}{2}\phi^2(s - t)\right) \tag{34.3}
\]
is a function in $L^1(\Omega)$.
918 THE ITÔ INTEGRAL
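where $|\phi(\omega)| < M$. Then using the technique of the distribution function, Theorem 9.39 on Page 232,
\begin{align*}
\int_{\Omega} h\,dP &= \int_0^{\infty} P(h > \lambda)\,d\lambda \\
&= \int_0^{\infty} P\left(\exp(M|B_s - B_t|) > \lambda\right)d\lambda \\
&\le \int_0^{\infty} P\left(|B_s - B_t| > \frac{\ln(\lambda)}{M}\right)d\lambda \\
&= \int_1^{\infty} P\left(|B_s - B_t| > \frac{\ln(\lambda)}{M}\right)d\lambda + \int_0^1 P\left(|B_s - B_t| > \text{nonpositive}\right)d\lambda \\
&= 1 + \frac{2}{\sqrt{2\pi(s - t)}}\int_1^{\infty}\!\!\int_{\ln(\lambda)/M}^{\infty} e^{-\frac{x^2}{2(s-t)}}\,dx\,d\lambda \\
&= 1 + C\int_0^{\infty}\!\!\int_1^{e^{Mx}} e^{-\frac{x^2}{2(s-t)}}\,d\lambda\,dx \le 1 + C\int_0^{\infty} e^{-\frac{x^2}{2(s-t)}}e^{Mx}\,dx < \infty.
\end{align*}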
Lemma 34.2 Let $\phi$ be a uniformly bounded adapted step function on $[0, T]$. Let
\[
\xi(t) \equiv \exp\left(\int_0^t \phi\,dB - \frac{1}{2}\int_0^t \phi^2\,dr\right)
\]
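Proof: That $\xi(t)$ is continuous follows from the fact $\phi$ is bounded and the continuity of $B_t$. It remains to verify it is a martingale. Let $\phi$ be given by 34.1 and suppose $t_j \le t < s < t_{j+1}$. Then
\[
E(\xi(s)|\mathcal{H}_t) = E\left(\xi(t)\frac{\xi(s)}{\xi(t)}\Big|\mathcal{H}_t\right) = \xi(t)\,E\left(\frac{\xi(s)}{\xi(t)}\Big|\mathcal{H}_t\right). \tag{34.4}
\]
From the definition of the integral on step functions given above and using the assumption on $t$ and $s$ just mentioned, it follows this equals
\begin{align*}
&\xi(t)\,E\left(\frac{\exp\left(\int_0^s \phi\,dB - \frac{1}{2}\int_0^s \phi^2\,dr\right)}{\exp\left(\int_0^t \phi\,dB - \frac{1}{2}\int_0^t \phi^2\,dr\right)}\Big|\mathcal{H}_t\right) \\
&= \xi(t)\,E\left(\exp\left(\phi_j(B_s - B_t)\right)\exp\left(-\frac{1}{2}\phi_j^2(s - t)\right)\Big|\mathcal{H}_t\right).
\end{align*}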
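where each $E_i \in \mathcal{H}_t$ (in fact, $E_i \in \mathcal{H}_{t_j}$). Then by the dominated convergence theorem, or simply the boundedness of $\phi_j$ and the uniform convergence of $\alpha_n$ to $\phi_j$,
\begin{align*}
&\lim_{n\to\infty}\int_{\Omega} \exp\left(\alpha_n(B_s - B_t)\right)\exp\left(-\frac{1}{2}\alpha_n^2(s - t)\right)dP \\
&= \int_{\Omega} \exp\left(\phi_j(B_s - B_t)\right)\exp\left(-\frac{1}{2}\phi_j^2(s - t)\right)dP.
\end{align*}
However, by independence of the increments again,
\begin{align*}
&\int_{\Omega} \exp\left(\alpha_n(B_s - B_t)\right)\exp\left(-\frac{1}{2}\alpha_n^2(s - t)\right)dP \\
&= \sum_{i=1}^{m_n}\int_{E_i} \exp\left(c_i(B_s - B_t)\right)\exp\left(-\frac{1}{2}c_i^2(s - t)\right)dP \\
&= \sum_{i=1}^{m_n}\int_{E_i} dP\int_{\Omega} \exp\left(c_i(B_s - B_t)\right)\exp\left(-\frac{1}{2}c_i^2(s - t)\right)dP.
\end{align*}
However,
\begin{align*}
&\int_{\Omega} \exp\left(c_i(B_s - B_t)\right)\exp\left(-\frac{1}{2}c_i^2(s - t)\right)dP \\
&= \int_{\mathbb{R}} \frac{1}{\sqrt{2\pi(s - t)}}\,e^{-\frac{x^2}{2(s-t)}}\,e^{c_i x}\,e^{-\frac{1}{2}c_i^2(s-t)}\,dx = 1
\end{align*}
which follows from completing the square in the exponents and recognizing the integrand as a normal distribution. Therefore, 34.5 reduces to $\int_A dP$ and so $E\left(\frac{\xi(s)}{\xi(t)}\Big|\mathcal{H}_t\right) = 1$ which shows from 34.4 that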
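The Gaussian integral used above, namely that $e^{cx}e^{-\frac{1}{2}c^2\tau}$ integrates to 1 against the $N(0, \tau)$ density with $\tau = s - t$, can be confirmed numerically. The sketch below is an illustration only: the function name is introduced here, the trapezoidal rule on a truncated interval is an assumed numerical method, and the interval is centered at $c\tau$ because completing the square turns the integrand into a normal density with that mean.

```python
import math


def exponential_weight_mean(c, tau, half_width=14.0, steps=40_000):
    """Approximate the integral of exp(c x - c^2 tau / 2) against the N(0, tau) density."""
    sd = math.sqrt(tau)
    centre = c * tau                      # mean of the density after completing the square
    a, b = centre - half_width * sd, centre + half_width * sd
    h = (b - a) / steps

    def integrand(x):
        return math.exp(-x * x / (2 * tau) + c * x - 0.5 * c * c * tau) / math.sqrt(2 * math.pi * tau)

    total = 0.5 * (integrand(a) + integrand(b))
    for i in range(1, steps):
        total += integrand(a + i * h)
    return total * h


for c in (-2.0, 0.5, 3.0):
    print(c, exponential_weight_mean(c, tau=0.25))  # each value should be very close to 1
```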
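The next case is when $s = t_{j+1}$ and $t \in (t_j, t_{j+1})$. The argument goes the same way. In this case,
\begin{align*}
E\left(\frac{\xi(s)}{\xi(t)}\Big|\mathcal{H}_t\right) &= E\Big(\exp\left(\phi_j\left(B_{t_{j+1}} - B_{t_j}\right) - \phi_j\left(B_t - B_{t_j}\right)\right)\cdot \\
&\qquad \exp\left(-\frac{1}{2}\left(\phi_j^2(t_{j+1} - t_j) - \phi_j^2(t - t_j)\right)\right)\Big|\mathcal{H}_t\Big) \\
&= E\left(\exp\left(\phi_j\left(B_{t_{j+1}} - B_t\right)\right)\exp\left(-\frac{1}{2}\phi_j^2(t_{j+1} - t)\right)\Big|\mathcal{H}_t\right).
\end{align*}
Now it is just a repeat of the above argument to show this is 1.
All other cases follow easily from this. For example, suppose $t_{j-1} \le t < t_j < t_{j+1} \le s < t_{j+2}$. Then from the two cases considered above,
\begin{align*}
E(\xi(s)|\mathcal{H}_t) &= E\left(E\left(E\left(\xi(s)|\mathcal{H}_{t_{j+1}}\right)|\mathcal{H}_{t_j}\right)|\mathcal{H}_t\right) \\
&= E\left(E\left(\xi(t_{j+1})|\mathcal{H}_{t_j}\right)|\mathcal{H}_t\right) \\
&= E(\xi(t_j)|\mathcal{H}_t) = \xi(t).
\end{align*}
Continuing in this way shows $\xi(t)$ is a martingale. This proves the lemma.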
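Note also that this shows
\[
E(\xi(T)) = E(\xi(T)|\mathcal{H}_0) = \xi(0) = 1.
\]
Now from Doob's martingale estimate, Theorem 33.14,
\[
P\left(\max_{t\in[0,T]} \xi(t) \ge \lambda\right) \le \frac{1}{\lambda}E(\xi(T)) = \frac{1}{\lambda}.
\]
If $\phi$ were replaced by $\alpha\phi$ and $\xi_{\alpha}$ were obtained by replacing $\phi$ with $\alpha\phi$, the same estimate would follow. Thus
\begin{align*}
&P\left(\max_t\left(\int_0^t \phi\,dB - \frac{\alpha}{2}\int_0^t \phi^2\,ds\right) > \lambda\right) \\
&= P\left(\max_t\left(\int_0^t \alpha\phi\,dB - \frac{1}{2}\int_0^t (\alpha\phi)^2\,ds\right) > \alpha\lambda\right) \\
&= P\left(\max_t \exp\left(\int_0^t \alpha\phi\,dB - \frac{1}{2}\int_0^t (\alpha\phi)^2\,ds\right) > e^{\alpha\lambda}\right) \\
&= P\left(\max_t \xi_{\alpha}(t) > e^{\alpha\lambda}\right) \le \frac{1}{e^{\alpha\lambda}}E(\xi_{\alpha}(T)) = e^{-\alpha\lambda} \tag{34.6}
\end{align*}
Summarizing this gives the following very significant inequality in which $\alpha, \lambda$ are two arbitrary positive constants independent of $\phi$.
\[
P\left(\max_t\left(\int_0^t \phi\,dB - \frac{\alpha}{2}\int_0^t \phi^2\,ds\right) > \lambda\right) \le e^{-\alpha\lambda}. \tag{34.7}
\]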
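Inequality 34.7 can be illustrated by simulation in the simplest case. The sketch below makes several assumptions that are not in the text: it takes $\phi \equiv 1$ on $[0, T]$, so the quantity inside the maximum is $B_t - \frac{\alpha}{2}t$, it replaces the maximum over $[0, T]$ by a maximum over a finite time grid, and the function name and parameter values are choices made here. The observed exceedance frequency should fall below $e^{-\alpha\lambda}$.

```python
import math
import random


def exceedance_frequency(alpha, lam, T=1.0, steps=200, paths=2000, seed=0):
    """Estimate P(max_t (B_t - alpha t / 2) > lam) on a time grid, for phi = 1."""
    rng = random.Random(seed)
    dt = T / steps
    sd = math.sqrt(dt)
    count = 0
    for _ in range(paths):
        b = 0.0
        for j in range(1, steps + 1):
            b += rng.gauss(0.0, sd)               # Brownian increment over one grid step
            if b - 0.5 * alpha * j * dt > lam:    # level lam exceeded on this path
                count += 1
                break
    return count / paths


alpha, lam = 1.0, 1.0
print(exceedance_frequency(alpha, lam), math.exp(-alpha * lam))
```

Restricting to a grid can only lower the maximum, so the simulated frequency slightly underestimates the probability in 34.7; it is meant as a plausibility check rather than a sharp test.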
Now recall how adapted functions can be approximated by adapted step func-
tions. This was proved in Corollary 33.4 which is stated here for convenience.
Then there exists a sequence of uniformly bounded adapted step functions $\phi_n$ such that
\[
\lim_{n\to\infty} P\left(\int_0^T \left(f(t,\omega) - \phi_n(t,\omega)\right)^2 dt > \varepsilon\right) = 0. \tag{34.9}
\]
Thus
mX
n −1
From this corollary, the following fundamental lemma will make possible the
definition of the Itô integral. It pertains to the filtration, Ht with respect to which
Bt is a martingale and such that for s > t, Bs − Bt is independent of Ht .
Lemma 34.4 Suppose $f$ is $\mathcal{H}_t$ adapted and $\mathcal{B} \times \mathcal{F}$ measurable such that for a.e. $\omega$,
\[
\int_0^T f(t,\omega)^2\,dt < \infty. \tag{34.10}
\]
Then there exists a sequence of bounded adapted step functions $\{\phi_k\}$ and a set of measure zero $E$ such that for $\omega \notin E$,
\[
\int_0^T \left(f(t,\omega) - \phi_k(t,\omega)\right)^2 dt \le 2^{-k}
\]
Proof: By Corollary 33.4 stated above, there exists a subsequence $\{\phi_{n_k}\}$ of the $\{\phi_n\}$ mentioned in this corollary such that
\[
P\left(\int_0^T \left(f(t,\omega) - \phi_{n_k}(t,\omega)\right)^2 dt > 2^{-k}\right) < 2^{-k}.
\]
Now let $A_k \equiv \left[\omega : \int_0^T \left(f(t,\omega) - \phi_{n_k}(t,\omega)\right)^2 dt > 2^{-k}\right]$. Then from the above,
\[
\sum_{k=1}^{\infty} P(A_k) < \infty
\]
and so by the Borel Cantelli lemma, the set $E$ of points $\omega$ contained in infinitely many of the $A_k$ has measure zero. Therefore, for $\omega \notin E$, there exists $K(\omega)$ such that for $k > K(\omega)$, $\omega \notin A_k$ and so
\[
\int_0^T \left(f(t,\omega) - \phi_{n_k}(t,\omega)\right)^2 dt \le 2^{-k}
\]
Denote $\phi_k = \phi_{n_k}$.
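For $\omega \notin E$ and $l > k > K(\omega)$ described above, and $t \le T$,
\begin{align*}
\int_0^t \left(\phi_k(s,\omega) - \phi_l(s,\omega)\right)^2 ds &\le 2\int_0^t \left(\phi_k(s,\omega) - f(s,\omega)\right)^2 ds + 2\int_0^t \left(f(s,\omega) - \phi_l(s,\omega)\right)^2 ds \\
&\le 2^{-(k-1)} + 2^{-(l-1)} \le 2^{-(k-2)}.
\end{align*}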
0
Proof: By Lemma 34.4, for $\omega \notin E$, the exceptional set of measure zero, there exists $N(\omega)$ such that if $n > m \ge N(\omega)$,
\[
\int_0^T \left(\phi_n(s,\omega) - \phi_m(s,\omega)\right)^2 ds < 2^{-(m-2)}
\]
By the Borel Cantelli lemma again, there exists a set of measure zero $E$ containing the earlier exceptional set such that for $\omega \notin E$ there exists $N(\omega)$ large enough that for $n > m \ge N(\omega)$,
\[
\max_t\left(\int_0^t (\phi_n - \phi_m)\,dB(\omega) - \frac{1}{2}\left(\frac{3}{2}\right)^{m-2}\int_0^t (\phi_n - \phi_m)^2\,ds\right) \le \left(\frac{2}{3}\right)^{m-2} m\ln\theta
\]
and
\[
\int_0^T (\phi_n - \phi_m)^2\,ds < 2^{-(m-2)}.
\]
Therefore for such $\omega$,
\[
\max_t\left(\int_0^t (\phi_n - \phi_m)\,dB(\omega) - \frac{1}{2}\left(\frac{3}{2}\right)^{m-2}2^{-(m-2)}\right) \le \left(\frac{2}{3}\right)^{m-2} m\ln\theta
\]
Therefore,
\[
\max_t\left|\int_0^t (\phi_m - \phi_n)\,dB(\omega)\right| \le \left(\frac{2}{3}\right)^{m-2} m\ln\theta + \frac{1}{2}\left(\frac{3}{4}\right)^{m-2}.
\]
It follows that for $\omega$ off a set of measure zero, $\left\{\int_0^t \phi_n\,dB(\omega)\right\}$ is a Cauchy sequence. Adjusting the above constants, there exists a constant $r < 1$ and a positive constant $C$ such that
\[
\max_t\left|\int_0^t (\phi_m - \phi_n)\,dB(\omega)\right| \le Cr^m
\]
and so the same estimate yields, for all $\omega$ off a set of measure zero,
\[
\max_t\left|\int_0^t (\psi_n - \phi_n)\,dB(\omega)\right| \le Cr^n
\]
Then there exists a sequence of adapted bounded step functions $\{\phi_n\}$ satisfying
\[
\int_S^T \left(f(t,\omega) - \phi_n(t,\omega)\right)^2 dt \le 2^{-n}
\]
for $\omega \notin E$, a set of measure zero. Then for $t \in [S, T]$, the Itô integral is defined by
\[
\int_S^t f\,dB(\omega) = \lim_{n\to\infty}\int_S^t \phi_n\,dB(\omega).
\]
Furthermore, for these $\omega$, $t \to \int_S^t f\,dB(\omega)$ is continuous because by Theorem 34.5 the convergence of $\int_S^t \phi_n\,dB(\omega)$ is uniform on $[0, T]$.
Definition 34.7 Suppose $f$ is $\mathcal{H}_t$ adapted and $\mathcal{B} \times \mathcal{F}$ measurable such that for a.e. $\omega$,
\[
\int_S^T f(t,\omega)^2\,dt < \infty.
\]
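Proof: This holds essentially because it holds for any step function $\phi$. For example, if
\[
\phi(t,\omega) = \sum_{j=1}^{n} e_j(\omega)\,\mathcal{X}_{[t_j, t_{j+1})}(t),
\]
then without loss of generality it can be assumed $U$ is one of the $t_j$, say $t_k$. Therefore,
\begin{align*}
\int_S^T \phi\,dB &= \sum_{j=1}^{k-1} e_j(\omega)\left(B_{t_{j+1}}(\omega) - B_{t_j}(\omega)\right) + \sum_{j=k}^{n} e_j(\omega)\left(B_{t_{j+1}}(\omega) - B_{t_j}(\omega)\right) \\
&= \int_S^U \phi\,dB + \int_U^T \phi\,dB
\end{align*}
It follows 34.13 must hold in the limit. 34.14 is somewhat more obvious. Consider 34.16. Let $\phi$ be as above. Then recall that $e_j$ is $\mathcal{H}_{t_j}$ measurable and so
\begin{align*}
E\left(\sum_{j=1}^{n} e_j(\omega)\left(B_{t_{j+1}}(\omega) - B_{t_j}(\omega)\right)\right) &= \sum_{j=1}^{n} E(e_j)\,E\left(B_{t_{j+1}} - B_{t_j}\right) \\
&= \sum_{j=1}^{n} E(e_j)\cdot 0 = 0
\end{align*}
34.15 must also hold because if $\phi$ is as above, $\int_S^T \phi\,dB$ is $\mathcal{H}_T$ measurable and $\int_S^T f\,dB$ is a pointwise a.e. limit of these.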
The next theorem is called the Itô isometry. It pertains to the case where $f \in L^2([S, T] \times \Omega)$. Thus
\[
\int_{\Omega}\int_0^T f(t,\omega)^2\,dt\,dP < \infty
\]
Then there exists a sequence of uniformly bounded adapted step functions $\phi_n$ such that
\[
\lim_{n\to\infty} P\left(\int_0^T \left(f(t,\omega) - \phi_n(t,\omega)\right)^2 dt > \varepsilon\right) = 0. \tag{34.18}
\]
Thus
mX
n −1
where $t^n_0 = 0$ and $e^n_j$ is $\mathcal{F}_{t^n_j}$ measurable. Furthermore, if $f$ is in $L^2([0, T] \times \Omega)$, there exists a subsequence $\{\phi_{n_k}\}$ such that
\[
\lim_{k\to\infty}\int_{\Omega}\int_0^T \left(f(t,\omega) - \phi_{n_k}(t,\omega)\right)^2 dt\,dP = 0
\]
The following theorem is the Itô isometry. All of this is still in the context of the filtration $\mathcal{H}_t$ with respect to which $B_t$ is a martingale and the increments $B_s - B_t$ are independent of $\mathcal{H}_t$ whenever $s > t$.
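Proof: First let $\phi(t,\omega) = \sum_{j=0}^{n-1} e_j(\omega)\,\mathcal{X}_{[t_j, t_{j+1})}(t)$ be a uniformly bounded adapted step function. Then
\[
\int_S^T \phi\,dB = \sum_{j=0}^{n-1} e_j\left(B_{t_{j+1}} - B_{t_j}\right).
\]
Then
\[
\left(\int_S^T \phi\,dB\right)^2 = \sum_{i,j} e_j\left(B_{t_{j+1}} - B_{t_j}\right)e_i\left(B_{t_{i+1}} - B_{t_i}\right).
\]
The terms with $i \neq j$ have zero expectation by independence of the later increment from the other factors. Therefore,
\begin{align*}
\int_{\Omega}\left(\int_S^T \phi\,dB\right)^2 dP &= \int_{\Omega}\sum_{j=0}^{n-1} e_j^2\left(B_{t_{j+1}} - B_{t_j}\right)^2 dP \\
&= \sum_{j=0}^{n-1}\int_{\Omega} e_j^2\left(B_{t_{j+1}} - B_{t_j}\right)^2 dP \\
&= \sum_{j=0}^{n-1}\int_{\Omega} e_j^2\,dP\int_{\Omega}\left(B_{t_{j+1}} - B_{t_j}\right)^2 dP \\
&= \sum_{j=0}^{n-1}\int_{\Omega} e_j^2\,dP\,(t_{j+1} - t_j) = \int_{\Omega}\int_0^T \phi^2\,dt\,dP.
\end{align*}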
f ∈ L2 ([S, T ] × Ω)
in L2 (Ω) .
which converges to 0.
Letting $f \in W(\mathcal{H})$, one can consider the stochastic process $\int_S^t f(s,\omega)\,dB(\omega)$. From the construction of the Itô integral above, this is a continuous function of $t$ for a.e. $\omega$. It turns out that if $f$ is also in $L^2([S, T] \times \Omega)$, then this stochastic process is also an $\mathcal{H}_t$ martingale.
Theorem 34.12 $I(t,\omega) \equiv \int_S^t f(s,\omega)\,dB(\omega)$ is an $\mathcal{H}_t$ martingale if $f \in L^2([S, T] \times \Omega)$.
Proof: Let
\[
I_n(t,\omega) = \int_S^t \phi_n(s,\omega)\,dB(\omega)
\]
where $\phi_n$ is a bounded adapted step function such that
\[
\int_S^t f\,dB = \lim_{n\to\infty}\int_S^t \phi_n\,dB
\]
34.1. PROPERTIES OF THE ITÔ INTEGRAL 929
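in $L^2(\Omega)$ for each $t \in [S, T]$ and for $\omega$ not in a suitable set of measure zero,
\[
I(t,\omega) \equiv \int_S^t f\,dB(\omega) = \lim_{n\to\infty}\int_S^t \phi_n\,dB(\omega) \tag{34.19}
\]
Now $\int_S^t \phi_n\,dB$ is measurable in $\mathcal{F}_t$ and so this reduces to
\[
\int_S^t \phi_n\,dB + \sum_{t \le t^n_j < t^n_{j+1} \le s} E\left(e^n_j\left(B_{t_{j+1}} - B_{t_j}\right)|\mathcal{F}_t\right).
\]
By Lemma 32.2,
\begin{align*}
&= \int_S^t \phi_n\,dB + \sum_{t \le t^n_j < t^n_{j+1} \le s} E\left(E\left(e^n_j\left(B_{t_{j+1}} - B_{t_j}\right)|\mathcal{F}_{t_j}\right)|\mathcal{F}_t\right) \\
&= \int_S^t \phi_n\,dB + \sum_{t \le t^n_j < t^n_{j+1} \le s} E\left(e^n_j\,E\left(B_{t_{j+1}} - B_{t_j}|\mathcal{F}_{t_j}\right)|\mathcal{F}_t\right) \\
&= \int_S^t \phi_n\,dB + \sum_{t \le t^n_j < t^n_{j+1} \le s} E\left(e^n_j \cdot 0|\mathcal{F}_t\right) = \int_S^t \phi_n\,dB = I_n(t,\omega).
\end{align*}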
Thus $I_n(t,\omega)$ is a martingale. Let $s > t$. Since $I_n(r,\cdot) \to I(r,\cdot)$ in $L^2(\Omega)$ for each $r$, Jensen's inequality implies
\begin{align*}
\int_{\Omega} \left|E(I_n(s,\omega)|\mathcal{F}_t) - E(I(s,\omega)|\mathcal{F}_t)\right| dP &\le \int_{\Omega} E\left(|I_n(s,\omega) - I(s,\omega)|\,|\mathcal{F}_t\right) dP \\
&= \int_{\Omega} |I_n(s,\omega) - I(s,\omega)|\,dP
\end{align*}
What about the measurability of $I$? This follows from the pointwise convergence described in 34.19, the measurability of $I_n$, and completeness of the measure. This proves the theorem.
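Example 34.13 Find $\int_0^t B_s(\omega)\,dB(\omega)$ assuming $B_0 = 0$.
Let $\phi_n(s,\omega) = \sum_{j=0}^{n} B_{t_j}(\omega)\,\mathcal{X}_{[t_j, t_{j+1})}(s)$ where $|t_{j+1} - t_j|$ is constant in $j$ and equals $t/n$. Then $\phi_n \to B_s$ in $L^2([0, t] \times \Omega)$ and it is clear that $\phi_n$ is adapted. Therefore,
\[
\int_S^T \phi_n(t,\omega)\,dB(\omega) \to \int_S^T B_t(\omega)\,dB(\omega)
\]
in $L^2(\Omega)$. But by definition,
\[
\int_S^T \phi_n(t,\omega)\,dB(\omega) = \sum_{j=0}^{n} B_{t_j}(\omega)\left(B_{t_{j+1}}(\omega) - B_{t_j}(\omega)\right)
\]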
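Now, using independence of the increments for $i \neq j$ and $E\left(\left(B_{t_{j+1}} - B_{t_j}\right)^4\right) = 3(t_{j+1} - t_j)^2$ for the terms with $i = j$,
\begin{align*}
&\int_{\Omega}\left|\sum_{j=0}^{n}\left(B_{t_{j+1}} - B_{t_j}\right)^2 - t\right|^2 dP \\
&= \int_{\Omega}\left(\sum_{i,j}\left(B_{t_{j+1}} - B_{t_j}\right)^2\left(B_{t_{i+1}} - B_{t_i}\right)^2 - 2t\sum_j\left(B_{t_{j+1}} - B_{t_j}\right)^2 + t^2\right)dP \\
&= \sum_{i,j}(t_{j+1} - t_j)(t_{i+1} - t_i) + 2\sum_j (t_{j+1} - t_j)^2 - 2t\sum_j (t_{j+1} - t_j) + t^2 \\
&= t^2 - 2t^2 + t^2 + 2\sum_j (t_{j+1} - t_j)^2 = 2\sum_j (t_{j+1} - t_j)^2,
\end{align*}
which converges to 0 as $n \to \infty$ because the mesh $t/n$ converges to 0, and so
\[
\int_{\Omega}\left|\frac{1}{2}B_t(\omega)^2 - \frac{1}{2}\sum_{j=0}^{n}\left(B_{t_{j+1}} - B_{t_j}\right)^2 - \left(\frac{1}{2}B_t(\omega)^2 - \frac{1}{2}t\right)\right|^2 dP \to 0
\]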
which shows
\[
\int_S^T B_t(\omega)\,dB(\omega) = \frac{1}{2}B_t(\omega)^2 - \frac{1}{2}t.
\]
This is contrary to what any student would know; that from the symbols involved,
\[
\int_S^T B_t(\omega)\,dB_t(\omega) = \frac{1}{2}B_t(\omega)^2 - \frac{1}{2}B_0(\omega)^2 = \frac{1}{2}B_t(\omega)^2.
\]
Here you get the extra term, $-\frac{1}{2}t$.
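The extra term can be seen on a simulated path. The sketch below is an illustration under stated assumptions: Brownian increments are simulated with the standard library generator, $B_0 = 0$, and the function names and parameter values are choices made here. It uses the algebraic identity $\sum_j B_{t_j}\left(B_{t_{j+1}} - B_{t_j}\right) = \frac{1}{2}B_t^2 - \frac{1}{2}\sum_j\left(B_{t_{j+1}} - B_{t_j}\right)^2$, which holds exactly on every path, and then observes that the sum of squared increments is close to $t$.

```python
import math
import random


def left_endpoint_sum(increments):
    """Return (sum_j B_{t_j} (B_{t_{j+1}} - B_{t_j}), B_t) for a path with B_0 = 0."""
    b, total = 0.0, 0.0
    for db in increments:
        total += b * db     # integrand evaluated at the left endpoint
        b += db
    return total, b


def simulate(t=1.0, n=50_000, seed=1):
    """Simulate one path on n equal steps and return (Ito sum, B_t, sum of squared increments)."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n)
    increments = [rng.gauss(0.0, sd) for _ in range(n)]
    ito_sum, b_t = left_endpoint_sum(increments)
    quad_var = sum(db * db for db in increments)
    return ito_sum, b_t, quad_var


ito_sum, b_t, quad_var = simulate()
print(ito_sum, 0.5 * b_t**2 - 0.5 * 1.0)   # close to each other
print(quad_var)                            # close to t = 1
```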
Stochastic Processes
One other thing should be pointed out, although it was mentioned above, and that is the distribution of $B_s - B_t$ for $s > t$. The density for $B_t$ as described above is
\[
\frac{1}{(2\pi t)^{n/2}}\exp\left(-\frac{|y - x|^2}{2t}\right)
\]
and so by Theorem 31.22 on Page 868
\[
E\left(e^{iu\cdot B_t}\right) = e^{iu\cdot x}e^{-\frac{1}{2}u^{*}tIu}
\]
and so
\[
E\left(e^{iu\cdot(B_s - B_t)}\right) = \frac{e^{iu\cdot x}e^{-\frac{1}{2}u^{*}sIu}}{e^{iu\cdot x}e^{-\frac{1}{2}u^{*}tIu}} = e^{-\frac{1}{2}u^{*}(s-t)Iu}.
\]
Lemma 35.4 Letting $\mathcal{H}_t$ denote the completion of the smallest σ algebra containing
\[
(B_{s_1}, \cdots, B_{s_k})^{-1}(B)
\]
35.1. AN IMPORTANT FILTRATION 935
for all $B$ a Borel set in $\mathbb{R}^{nk}$ for all sequences $0 \le s_1 < s_2 < \cdots < s_k \le t$ as defined above, $\mathcal{H}_t$ is also equal to the completion of the smallest σ algebra containing
\[
(B_{s_1}, \cdots, B_{s_k})^{-1}(B)
\]
for all $B$ an open set in $\mathbb{R}^{nk}$ for all sequences $0 \le s_1 < s_2 < \cdots < s_k \le t$. In addition to this, $\mathcal{H}_t$ is equal to the completion of the smallest σ algebra containing
\[
(B_{s_1}, \cdots, B_{s_k})^{-1}(B)
\]
for all $B$ an open set in $\mathbb{R}^{nk}$ for all sequences $0 \le s_1 < s_2 < \cdots < s_k \le t$ such that the $s_j$ are rational numbers.
Proof: The first claim reducing to inverse images of open sets is not hard. Define $\mathcal{G}_t$ to be the smallest σ algebra such that $(B_{s_1}, \cdots, B_{s_k})^{-1}(U) \in \mathcal{G}_t$ for $U$ open and $0 \le s_1 < s_2 < \cdots < s_k \le t$. Now let
\[
\mathcal{S}_{(s_1,\cdots,s_k)} \equiv \left\{E \text{ Borel such that } (B_{s_1}, \cdots, B_{s_k})^{-1}(E) \in \mathcal{G}_t\right\}
\]
Then $\mathcal{S}_{(s_1,\cdots,s_k)}$ contains the open sets and so it also contains the Borel sets because it is a σ algebra. Hence $\mathcal{G}_t$ contains all sets of the form $(B_{s_1}, \cdots, B_{s_k})^{-1}(E)$ for all $E$ Borel and $0 \le s_1 < s_2 < \cdots < s_k \le t$. It follows $\mathcal{G}_t$ is the smallest σ algebra containing the sets of the form $(B_{s_1}, \cdots, B_{s_k})^{-1}(B)$ for $B$ Borel and so its completion equals $\mathcal{H}_t$.
The second claim is more interesting. In this claim, it suffices to consider finite increasing sequences of rational numbers, rather than just finite increasing sequences. Let $0 \le s_1 < s_2 < \cdots < s_k \le t$ and let $0 \le t^n_1 < t^n_2 < \cdots < t^n_k \le t$ be an increasing sequence of rational numbers such that $\lim_{n\to\infty} t^n_k = s_k$. It has been proven that off a set of measure zero, $t \to B_t(\omega)$ is continuous. To simplify the presentation, I will assume without loss of generality that this set of measure zero is empty. If not, you could simply delete it and consider a slightly modified $\Omega$. Another way to see this is not a loss of generality is that $\mathcal{H}_t$ is complete and so contains all subsets of sets of measure zero. Let $O$ be an open set and let
\[
O = \cup_{m=1}^{\infty} O_m, \quad \cdots O_m \subseteq \overline{O_m} \subseteq O_{m+1} \cdots
\]
Thus
\[
\cup_m \cup_l \cap_{p\ge l}\left(B_{t^p_1}, \cdots, B_{t^p_k}\right)^{-1}(O_m) = (B_{s_1}, \cdots, B_{s_k})^{-1}(O)
\]
and so the smallest σ algebra containing $(B_{s_1}, \cdots, B_{s_k})^{-1}(B)$ for $B$ open and $0 \le s_1 < s_2 < \cdots < s_k \le t$ an arbitrary sequence of numbers is the same as the smallest σ algebra containing $(B_{s_1}, \cdots, B_{s_k})^{-1}(B)$ for $B$ open and $0 \le s_1 < s_2 < \cdots < s_k \le t$ an increasing sequence of rational numbers.
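where $V(s)$ is an $m \times n$ matrix. For now, assume $u_k$ and $v_{kl}$ are all $\mathcal{H}_t$ adapted uniformly bounded step functions.
Also assume $g$ is a $C^2$ function defined on $\mathbb{R} \times \mathbb{R}^m$ for which all partial derivatives are uniformly bounded and let $\left\{t^r_j\right\}_{j=0}^{n_r}$ be partitions of $[0, T]$ such that for $\Delta(r) \equiv \sup_j\left\{t^r_{j+1} - t^r_j\right\}$, $\lim_{r\to\infty}\Delta(r) = 0$ and also all discontinuities of all the step functions $v_{kl}$ and $u_k$ are contained in $\left\{t^r_j\right\}_{j=0}^{n_r}$. Then suppressing the superscript on $t^r_j$ for the sake of simpler notation,
\begin{align*}
g(T, X_T) - g(0, X_0) &= \sum_{j=0}^{n_r-1} g\left(t_{j+1}, X_{t_{j+1}}\right) - g\left(t_j, X_{t_j}\right) \\
&= \sum_{j=0}^{n_r-1} \frac{\partial g}{\partial t}\left(t_j, X_{t_j}\right)\Delta t_j + D_2g\left(t_j, X_{t_j}\right)\Delta X_{t_j} \\
&\quad + \frac{1}{2}\Big(\frac{\partial^2 g}{\partial t^2}\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\Delta t_j^2 \tag{35.3}\\
&\quad + D_2\left(D_2g\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\left(\Delta X_{t_j}\right)\right)\left(\Delta X_{t_j}\right) \\
&\quad + 2D_2\left(\frac{\partial g}{\partial t}\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\right)\Delta t_j\,\Delta X_{t_j}\Big). \tag{35.4}
\end{align*}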
35.2. ITÔ PROCESSES 937
Now from 35.2 and the assumptions that all discontinuities of all step functions are in the partition, it follows the matrix $V$ and the vector $u$ must be of the form
\[
V(s, \omega) = \sum_{j=0}^{n_r-1} V^j(\omega)\,\mathcal{X}_{[t_j, t_{j+1})}(s), \quad u(s, \omega) = \sum_{j=0}^{n_r-1} u^j(\omega)\,\mathcal{X}_{[t_j, t_{j+1})}(s)
\]
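\begin{align*}
&+ 2D_2\left(\frac{\partial g}{\partial t}\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\right)\Delta t_j\left(u^j\Delta t_j + V^j\Delta B_{t_j}\right) \tag{35.8}\\
&+ D_2\left(D_2g\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\left(u^j\Delta t_j + V^j\Delta B_{t_j}\right)\right)\cdot \tag{35.9}\\
&\quad \left(u^j\Delta t_j + V^j\Delta B_{t_j}\right)\Big). \tag{35.10}
\end{align*}
\begin{align*}
&= e(r) + \sum_{j=0}^{n_r-1} \frac{\partial g}{\partial t}\left(t_j, X_{t_j}\right)\Delta t_j + D_2g\left(t_j, X_{t_j}\right)\left(u^j\Delta t_j + V^j\Delta B_{t_j}\right) \\
&\quad + \frac{1}{2}\Big(2D_2\left(\frac{\partial g}{\partial t}\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\right)V^j\Delta t_j\,\Delta B_{t_j} \\
&\quad + \left(u^j\Delta t_j + V^j\Delta B_{t_j}\right)^T H\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\left(u^j\Delta t_j + V^j\Delta B_{t_j}\right)\Big) \tag{35.11}
\end{align*}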
aj ∆tj ∆Bktj
j=0
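\begin{align*}
&\le C\sum_{j=0}^{n_r-1} \Delta t_j\int_{\Omega}\left|\Delta B_{kt_j}\right|dP \\
&\le C\sum_{j=0}^{n_r-1} \Delta t_j\left(\int_{\Omega}\left|\Delta B_{kt_j}\right|^2 dP\right)^{1/2} \\
&\le C\Delta(r)^{1/2}\,T
\end{align*}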
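which converges to 0. Therefore, there exists a set of measure zero off which terms of this form converge to 0 as $r \to \infty$ upon taking a further subsequence if necessary. Therefore, the above expression simplifies further and yields
\begin{align*}
g(T, X_T) - g(0, X_0) ={}& e(r) + \sum_{j=0}^{n_r-1} \frac{\partial g}{\partial t}\left(t_j, X_{t_j}\right)\Delta t_j + D_2g\left(t_j, X_{t_j}\right)\left(u^j\Delta t_j + V^j\Delta B_{t_j}\right) \\
& + \sum_{j=0}^{n_r-1} \frac{1}{2}\left(\left(V^j\Delta B_{t_j}\right)^T H\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right)\left(V^j\Delta B_{t_j}\right)\right) \tag{35.12}
\end{align*}
where $e(r) \to 0$ off a set of measure zero. Consider the last term. This term is of the form
\begin{align*}
&\sum_{j=0}^{n_r-1} \frac{1}{2}\left(\left(V^j\Delta B_{t_j}\right)^T H\left(t_j, X_{t_j}\right)\left(V^j\Delta B_{t_j}\right)\right) + \sum_{j=0}^{n_r-1} \frac{1}{2}\cdot \tag{35.13}\\
&\left(\left(V^j\Delta B_{t_j}\right)^T\left(H\left(t_j + \theta\Delta t_j, X_{t_j} + \theta\Delta X_{t_j}\right) - H\left(t_j, X_{t_j}\right)\right)\left(V^j\Delta B_{t_j}\right)\right). \tag{35.14}
\end{align*}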
∆BTtj Mr ∆Btj
j=0
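\begin{align*}
&= \sum_{i\neq j}\int_{\Omega}\left|\Delta B_{t_j}\right|^2 dP\int_{\Omega}\left|\Delta B_{t_i}\right|^2 dP + \sum_{j=1}^{n_r-1}\int_{\Omega}\left|\Delta B_{t_j}\right|^4 dP \\
&\le \sum_{i,j}(t_{j+1} - t_j)(t_{i+1} - t_i) + 3n^2\sum_{j=1}^{n_r-1}(t_{j+1} - t_j)^2 \\
&\le T^2 + 3n^2\Delta(r)\,T.
\end{align*}
Thus the family
\[
\left\{\left|\sum_{j=0}^{n_r-1} \Delta B_{t_j}^T M_r\,\Delta B_{t_j}\right|\right\}_r
\]
is uniformly integrable and so by the Vitali convergence theorem,
\[
\lim_{r\to\infty}\int_{\Omega}\left|\sum_{j=0}^{n_r-1} \Delta B_{t_j}^T M_r\,\Delta B_{t_j}\right|dP = 0.
\]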
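\begin{align*}
g(T, X_T) - g(0, X_0) ={}& e(r) + \sum_{j=0}^{n_r-1} \frac{\partial g}{\partial t}\left(t_j, X_{t_j}\right)\Delta t_j + D_2g\left(t_j, X_{t_j}\right)\left(u^j\Delta t_j + V^j\Delta B_{t_j}\right) \\
& + \sum_{j=0}^{n_r-1} \frac{1}{2}\left(\Delta B_{t_j}^T\left(V^j\right)^T H\left(t_j, X_{t_j}\right)V^j\Delta B_{t_j}\right) \tag{35.16}
\end{align*}
where for a.e. $\omega$, $e(r) \to 0$. It remains to consider the last term in the above as $r \to \infty$. Denote by $A^j$ the symmetric matrix
\[
\left(V^j\right)^T H\left(t_j, X_{t_j}\right)V^j
\]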
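\[
= \sum_{i,j}\int_{\Omega}\left(\Delta B_{t_j}^T A^j\Delta B_{t_j} - \operatorname{tr}\left(A^j\right)\Delta t_j\right)\left(\Delta B_{t_i}^T A^i\Delta B_{t_i} - \operatorname{tr}\left(A^i\right)\Delta t_i\right)dP \tag{35.17}
\]
\begin{align*}
&= \sum_{\alpha,\beta,\sigma,\tau}\int_{\Omega} \Delta B_{\alpha t_j}A^j_{\alpha\beta}\Delta B_{\beta t_j}\Delta B_{\sigma t_i}A^i_{\sigma\tau}\Delta B_{\tau t_i}\,dP \\
&= \sum_{\alpha,\beta,\sigma,\tau}\int_{\Omega} \Delta B_{\alpha t_j}\Delta B_{\beta t_j}\,dP\int_{\Omega} \Delta B_{\sigma t_i}A^j_{\alpha\beta}A^i_{\sigma\tau}\Delta B_{\tau t_i}\,dP
\end{align*}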
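which shows that this term in the case where $j > i$ cancels with the third term of 35.18. Now consider the second term of 35.18 again in the case where $j > i$. This yields
\begin{align*}
&-\int_{\Omega} \operatorname{tr}\left(A^i\right)\Delta t_i\,\Delta B_{t_j}^T A^j\Delta B_{t_j}\,dP \\
&= -\int_{\Omega} \operatorname{tr}\left(A^i\right)\Delta t_i\sum_{\alpha,\beta}\Delta B_{\alpha t_j}A^j_{\alpha\beta}\Delta B_{\beta t_j}\,dP \\
&= -\sum_{\alpha,\beta}\int_{\Omega} \Delta B_{\alpha t_j}\Delta B_{\beta t_j}\,dP\int_{\Omega} \operatorname{tr}\left(A^i\right)\Delta t_i A^j_{\alpha\beta}\,dP \\
&= -\sum_{\alpha}\int_{\Omega} \Delta B_{\alpha t_j}^2\,dP\int_{\Omega} \operatorname{tr}\left(A^i\right)\Delta t_i A^j_{\alpha\alpha}\,dP \\
&= -\int_{\Omega} (\Delta t_j)\operatorname{tr}\left(A^i\right)\Delta t_i\sum_{\alpha}A^j_{\alpha\alpha}\,dP \\
&= -\int_{\Omega} (\Delta t_j)\operatorname{tr}\left(A^i\right)\Delta t_i\operatorname{tr}\left(A^j\right)dP
\end{align*}
and so this second term cancels with the last term of 35.18. It follows the only terms to consider in 35.17 are those for which $j = i$. Thus 35.17 is of the form
\[
\sum_i\int_{\Omega}\left(\left(\Delta B_{t_i}^T A^i\Delta B_{t_i}\right)^2 - 2\Delta B_{t_i}^T A^i\Delta B_{t_i}\operatorname{tr}\left(A^i\right)\Delta t_i + \left(\operatorname{tr}\left(A^i\right)\Delta t_i\right)^2\right)dP. \tag{35.19}
\]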
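First consider the second term.
\begin{align*}
&\int_{\Omega} \Delta B_{t_i}^T A^i\Delta B_{t_i}\operatorname{tr}\left(A^i\right)\Delta t_i\,dP \\
&= \sum_{\alpha,\beta}\int_{\Omega} \Delta B_{\alpha t_i}A^i_{\alpha\beta}\Delta B_{\beta t_i}\operatorname{tr}\left(A^i\right)\Delta t_i\,dP \\
&= \sum_{\alpha,\beta}\int_{\Omega} \Delta B_{\alpha t_i}\Delta B_{\beta t_i}\,dP\int_{\Omega} A^i_{\alpha\beta}\operatorname{tr}\left(A^i\right)\Delta t_i\,dP \\
&= \sum_{\alpha}\int_{\Omega} \Delta B_{\alpha t_i}^2\,dP\int_{\Omega} A^i_{\alpha\alpha}\operatorname{tr}\left(A^i\right)\Delta t_i\,dP \\
&= \int_{\Omega} \Delta t_i\left(\sum_{\alpha}A^i_{\alpha\alpha}\right)\operatorname{tr}\left(A^i\right)\Delta t_i\,dP \\
&= \int_{\Omega} \operatorname{tr}\left(A^i\right)^2\Delta t_i^2\,dP \ge 0
\end{align*}
\begin{align*}
&= \sum_{\alpha,\beta,\sigma,\tau}\int_{\Omega} \Delta B_{\alpha t_i}A^i_{\alpha\beta}\Delta B_{\beta t_i}\Delta B_{\sigma t_i}A^i_{\sigma\tau}\Delta B_{\tau t_i}\,dP \\
&= \sum_{\alpha,\beta,\sigma,\tau}\int_{\Omega} A^i_{\alpha\beta}A^i_{\sigma\tau}\,dP\int_{\Omega} \Delta B_{\alpha t_i}\Delta B_{\beta t_i}\Delta B_{\sigma t_i}\Delta B_{\tau t_i}\,dP. \tag{35.20}
\end{align*}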
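There are two ways in which the term of this sum will not equal zero. One way is for $\alpha = \beta$ and $\sigma = \tau$. In this situation, the above is dominated by
\begin{align*}
&\sum_{\alpha,\sigma}\int_{\Omega} A^i_{\alpha\alpha}A^i_{\sigma\sigma}\,dP\int_{\Omega} \Delta B_{\alpha t_i}^2\Delta B_{\sigma t_i}^2\,dP \\
&= \sum_{\alpha\neq\sigma}\int_{\Omega} A^i_{\alpha\alpha}A^i_{\sigma\sigma}\,dP\,\Delta t_i^2 + \sum_{\alpha}\int_{\Omega}\left(A^i_{\alpha\alpha}\right)^2 dP\,3(\Delta t_i)^2 \\
&\le \sum_{\alpha,\sigma}\int_{\Omega} A^i_{\alpha\alpha}A^i_{\sigma\sigma}\,dP\,\Delta t_i^2 + \sum_{\alpha}\int_{\Omega}\left(A^i_{\alpha\alpha}\right)^2 dP\,3(\Delta t_i)^2 \\
&= \int_{\Omega} \operatorname{tr}\left(A^i\right)^2 dP\,\Delta t_i^2 + 3\int_{\Omega}\sum_{\alpha}\left(A^i_{\alpha\alpha}\right)^2 dP\,(\Delta t_i)^2
\end{align*}
The other way in which the expression in 35.20 is not zero is for $\alpha = \sigma$ and $\beta = \tau$. If this happens, the expression is of the form
\begin{align*}
&\sum_{\alpha,\beta}\int_{\Omega} A^i_{\alpha\beta}A^i_{\alpha\beta}\,dP\int_{\Omega} \Delta B_{\alpha t_i}\Delta B_{\beta t_i}\Delta B_{\alpha t_i}\Delta B_{\beta t_i}\,dP \\
&= \sum_{\alpha,\beta}\int_{\Omega} A^i_{\alpha\beta}A^i_{\alpha\beta}\,dP\int_{\Omega} \Delta B_{\alpha t_i}^2\Delta B_{\beta t_i}^2\,dP \\
&= \sum_{\alpha\neq\beta}\int_{\Omega} A^i_{\alpha\beta}A^i_{\alpha\beta}\,dP\int_{\Omega} \Delta B_{\alpha t_i}^2\,dP\int_{\Omega} \Delta B_{\beta t_i}^2\,dP + \sum_{\alpha}\int_{\Omega}\left(A^i_{\alpha\alpha}\right)^2 dP\,3\Delta t_i^2 \\
&= \sum_{\alpha\neq\beta}\int_{\Omega} A^i_{\alpha\beta}A^i_{\alpha\beta}\,dP\,\Delta t_i^2 + \sum_{\alpha}\int_{\Omega}\left(A^i_{\alpha\alpha}\right)^2 dP\,3\Delta t_i^2.
\end{align*}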
C ∆t2i
i=1
Therefore, there exists a subsequence, still denoted by $r$, and a set of measure zero off of which the last term of 35.16 converges to
\[
\frac{1}{2}\int_0^T \operatorname{tr}\left(V^T H(t, X_t)V\right)dt.
\]
It follows that off a set of measure zero, you can pass to the limit in 35.16 and conclude
\begin{align*}
g(T, X_T) - g(0, X_0) ={}& \int_0^T\left(\frac{\partial g}{\partial t}(t, X_t) + D_2g(t, X_t)u + \frac{1}{2}\operatorname{tr}\left(V^T H(t, X_t)V\right)\right)dt \\
& + \int_0^T D_2g(t, X_t)V\,dB
\end{align*}
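The limiting formula can be tried out numerically in one dimension. The sketch below is an illustration under assumptions that are not in the text: a scalar process $X_t = ut + vB_t$ with constant $u$ and $v$ and $X_0 = 0$, the test function $g(t, x) = tx^2$ (so $\partial g/\partial t = x^2$, $D_2g = 2tx$, $H = 2t$), left endpoint sums on a uniform grid in place of the integrals, and a function name and parameter values chosen here. On a fine grid the two sides should agree up to a small discretization error.

```python
import math
import random


def ito_formula_sides(u=0.5, v=1.0, T=1.0, n=20_000, seed=2):
    """Return (g(T, X_T) - g(0, X_0), discretized right side) for g(t, x) = t * x**2."""
    rng = random.Random(seed)
    dt = T / n
    sd = math.sqrt(dt)
    x, dt_part, dB_part = 0.0, 0.0, 0.0
    for j in range(n):
        t = j * dt
        dB = rng.gauss(0.0, sd)
        # (dg/dt + D_2 g * u + (1/2) v * H * v) dt, evaluated at the left endpoint
        dt_part += (x * x + 2 * t * x * u + 0.5 * v * (2 * t) * v) * dt
        # D_2 g * v dB, evaluated at the left endpoint
        dB_part += 2 * t * x * v * dB
        x += u * dt + v * dB
    left = T * x * x          # g(T, X_T) - g(0, X_0), since X_0 = 0
    return left, dt_part + dB_part


left, right = ito_formula_sides()
print(left, right)  # the two values should be close
```

Dropping the term with $\frac{1}{2}vHv$ from `dt_part` produces a visible gap of about $\frac{1}{2}v^2T^2$, which is one way to see that the second order term is needed.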
Lemma 35.5 Let $\mathcal{H}_t$ be the filtration defined above and let $B$ be $n$ dimensional Brownian motion. Suppose $X_t$ is a vector valued stochastic process for $t \in [0, T]$ defined by the following for a.e. $\omega$
\[
X_t - X_0 = \int_0^t u(s, \cdot)\,ds + \int_0^t V(s, \cdot)\,dB
\]
where all entries of $u$ and $V$ are $\mathcal{H}_t$ adapted uniformly bounded step functions. Then if $g$ is a $C^2$ function such that all partial derivatives are uniformly bounded, then for all $t \in [0, T]$,
\begin{align*}
g(t, X_t) - g(0, X_0) ={}& \int_0^t\left(\frac{\partial g}{\partial t}(s, X_s) + D_2g(s, X_s)u + \frac{1}{2}\operatorname{tr}\left(V^T H(s, X_s)V\right)\right)ds \\
& + \int_0^t D_2g(s, X_s)V\,dB
\end{align*}
Proof: Let $\{t_k\}$ be the rational numbers in $[0, T]$. The above computation shows that for each $t_k$, there exists a set of measure zero $E_k$ such that if $\omega \notin E_k$, then
\begin{align*}
g(t_k, X_{t_k}) - g(0, X_0) ={}& \int_0^{t_k}\left(\frac{\partial g}{\partial t}(s, X_s) + D_2g(s, X_s)u + \frac{1}{2}\operatorname{tr}\left(V^T H(s, X_s)V\right)\right)ds \\
& + \int_0^{t_k} D_2g(s, X_s)V\,dB
\end{align*}
Letting $E = \cup_{k=1}^{\infty} E_k$, it follows $E$ has measure zero and the above formula holds for all $t_k$. By continuity the above must hold for all $t \in [0, T]$. This proves the lemma.
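Now $V(t, \omega)$ and $u(t, \omega)$ will be $\mathcal{B} \times \mathcal{F}$ measurable, both $u$ and $V$ are $\mathcal{H}_t$ adapted, and the components of $V$ and $u$ satisfy
\[
P\left(\int_0^T v_{ij}^2\,ds < \infty\right) = 1, \quad P\left(\int_0^T |u_k|\,ds < \infty\right) = 1. \tag{35.21}
\]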
Lemma 35.7 Let $H_t$ be the filtration defined above and let B be n dimensional
Brownian motion. Suppose $X_t$ is a vector valued stochastic process for t ∈ [0, T ]
defined by the following for a.e. ω
\[ X_t - X_0 = \int_0^t u(s, \cdot)\, ds + \int_0^t V(s, \cdot)\, dB \tag{35.22} \]
where all entries of u and V are $H_t$ adapted and satisfy 35.21. Then if g is a $C^2$
function such that all second order partial derivatives are uniformly bounded, then
for all t ∈ [0, T ] ,
\begin{align}
g(t, X_t) - g(0, X_0) ={}& \int_0^t \left( \frac{\partial g}{\partial t}(s, X_s) + D_2 g(s, X_s) u + \frac{1}{2} \operatorname{tr}\left( V^T H(s, X_s) V \right) \right) ds \notag \\
&+ \int_0^t D_2 g(s, X_s) V \, dB \tag{35.23}
\end{align}
\[ \frac{\partial^2 g}{\partial x_i \partial x_j}(s, X_s(\omega)). \]
Proof: From 35.21 there exist sequences, $\{u^l\}_{l=1}^{\infty}$ and $\{V^l\}_{l=1}^{\infty}$, such that the
entries of $u^l$ and $V^l$ are adapted step functions and in addition there is a set of
measure zero, E, such that if $\omega \notin E$, then the components of $u^l$ and $V^l$ satisfy
\[ \int_0^T \left| u^l_k - u_k \right| dt < 2^{-l}, \qquad \int_0^T \left| v^l_{kj} - v_{kj} \right|^2 dt < 2^{-l} \tag{35.24} \]
for all l large enough, depending on ω of course. Then from 35.22 define
\[ X^l_t - X^l_0 = \int_0^t u^l(s, \cdot)\, ds + \int_0^t V^l(s, \cdot)\, dB \]
In addition to this, it follows from 35.24 that there is a subsequence such that
for $\omega \notin E$,
\[ V^l(t, \omega) \to V(t, \omega), \qquad u^l(t, \omega) \to u(t, \omega) \quad \text{a.e. } t. \]
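By Lemma 35.5, for a.e. ω,
\begin{align}
g\left(t, X^l_t\right) - g\left(0, X^l_0\right) ={}& \int_0^t \left( \frac{\partial g}{\partial t}\left(s, X^l_s\right) + D_2 g\left(s, X^l_s\right) u^l + \frac{1}{2} \operatorname{tr}\left( V^{lT} H\left(s, X^l_s\right) V^l \right) \right) ds \notag \\
&+ \int_0^t D_2 g\left(s, X^l_s\right) V^l \, dB \tag{35.26}
\end{align}
for all t ∈ [0, T ] . Now 35.24 implies that for each $\omega \notin E$, $\int_0^T \| V^l \|^2 dt$ is uniformly
bounded independent of l. Consider the third term.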
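\begin{align}
&\int_0^T \left| \operatorname{tr}\left( V^{lT} H\left(s, X^l_s\right) V^l \right) - \operatorname{tr}\left( V^T H(s, X_s) V \right) \right| ds \tag{35.27} \\
&\leq \int_0^T \left| \operatorname{tr}\left( V^{lT} H\left(s, X^l_s\right) V^l - V^T H\left(s, X^l_s\right) V \right) \right| ds \notag \\
&\quad + \int_0^T \left| \operatorname{tr}\left( V^T \left( H\left(s, X^l_s\right) - H(s, X_s) \right) V \right) \right| ds \notag
\end{align}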
which converges to 0 as l → ∞. This shows that for each t ∈ [0, T ] , one can pass
to the limit in the third term of 35.26 and eliminate the superscript, l. Passing to
the limit in the second term of 35.26 follows from 35.24 and the boundedness of
$D_2 g\left(s, X^l_s\right)$ which results from the assumption that all the partial derivatives of g
are uniformly bounded. Passing to the limit in the first term of 35.26 follows from
35.25. The last term involving dB is of the form
\[ \int_0^t D_2 g\left(s, X^l\right) V^l \, dB = \sum_j \int_0^t \left( D_2 g\left(s, X^l\right) V^l \right)_{ij} dB_j . \]
Then from the definition of the Itô integral given above, there is a subsequence, still
denoted by l, such that for a.e. ω, the above converges uniformly in t to
\[ \sum_j \int_0^t \left( D_2 g(s, X) V \right)_{ij} dB_j \]
Thus for a dense subset D of [0, T ] , there exists an exceptional set of measure zero
such that 35.23 holds for all t ∈ D. By continuity of the Itô integral, this continues
to hold for all t ∈ [0, T ] . This proves the lemma.
It remains to remove the assumption that the partial derivatives of g are bounded.
This results in the following theorem which is the main result.
Theorem 35.8 Let $H_t$ be the filtration defined above and let B be n dimensional
Brownian motion. Suppose $X_t$ is a vector valued stochastic process for t ∈ [0, T ]
defined by the following for a.e. ω
\[ X_t - X_0 = \int_0^t u(s, \cdot)\, ds + \int_0^t V(s, \cdot)\, dB \tag{35.28} \]
where all entries of u and V are $H_t$ adapted and satisfy 35.21. Then if g is a $C^2$
function with values in $\mathbb{R}^p$, it follows that for a.e. ω and for all t ∈ [0, T ] ,
\begin{align}
g_k(t, X_t) - g_k(0, X_0) ={}& \int_0^t \left( \frac{\partial g_k}{\partial t}(s, X_s) + D_2 g_k(s, X_s) u + \frac{1}{2} \operatorname{tr}\left( V^T H_k(s, X_s) V \right) \right) ds \notag \\
&+ \int_0^t D_2 g_k(s, X_s) V \, dB \tag{35.29}
\end{align}
\begin{align}
g_N(t, X_t) - g_N(0, X_0) ={}& \int_0^t \left( \frac{\partial g_N}{\partial t}(s, X_s) + D_2 g_N(s, X_s) u + \frac{1}{2} \operatorname{tr}\left( V^T H_N(s, X_s) V \right) \right) ds \notag \\
&+ \int_0^t D_2 g_N(s, X_s) V \, dB \tag{35.30}
\end{align}
Let $E = \cup_{N=1}^{\infty} E_N$. Then for $\omega \notin E$, the above formula holds for all N. By continuity
of X, it follows that for all N large enough, the values of X (t, ω) are in B (0,N )
and so you can delete the subscript of N in the above. This proves the theorem.
How do people remember this? Letting Y (t, ω) ≡ g (t, X (t, ω)) , 35.29 can be
considered formally as
\[ dY_k = \left( \frac{\partial g_k}{\partial t}(t, X_t) + D_2 g_k(t, X_t) u + \frac{1}{2} \operatorname{tr}\left( V^T H_k(t, X_t) V \right) \right) dt + D_2 g_k(t, X_t) V \, dB \]
I think this is not too bad but one can write an easier to remember formula which
reduces to this one,
\[ dY_k = \frac{\partial g_k}{\partial t}(t, X_t)\, dt + D_2 g_k(t, X_t)\, dX_t + \frac{1}{2}\, dX_t^T H_k(t, X_t)\, dX_t \]
under the convention that $dt\, dB_k = 0$, $dt^2 = 0$, and $dB_i\, dB_j = \delta_{ij}\, dt$.
In this case,
\[ Y_t = B_t^2 \]
and so
\[ dY_t = 2 B_t \, dB_t + \frac{1}{2} \cdot 2 \, dB_t^2 \]
and so
\[ B_t^2 - 0 = 2 \int_0^t B \, dB + \int_0^t dt = 2 \int_0^t B \, dB + t. \]
This yields
\[ \int_0^t B \, dB = \frac{1}{2} \left( B_t^2 - t \right) \]
which was encountered earlier.
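The identity just obtained can be checked on a discretized path. The following is a minimal sketch, assuming a uniform partition of [0, t] and Gaussian increments drawn with Python's standard `random` module; the names `brownian_increments` and `ito_sum_B_dB` are illustrative and do not come from the text. The left-endpoint sum of $B_{t_k}(B_{t_{k+1}} - B_{t_k})$ equals $\frac{1}{2}(B_t^2 - \sum_k (\Delta B_k)^2)$ exactly, and the sum of squared increments plays the role of t.

```python
import math
import random


def brownian_increments(n, t, rng):
    """Return n independent N(0, t/n) increments of a one dimensional Brownian motion."""
    dt = t / n
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]


def ito_sum_B_dB(increments):
    """Left-endpoint (Ito) sum of B dB along the path built from the increments.

    Returns the triple (ito_sum, B_t, quadratic_variation).
    """
    b = 0.0        # current value B_{t_k}, starting from B_0 = 0
    ito_sum = 0.0
    qv = 0.0       # running sum of squared increments
    for db in increments:
        ito_sum += b * db   # integrand evaluated at the left endpoint
        qv += db * db
        b += db
    return ito_sum, b, qv


if __name__ == "__main__":
    rng = random.Random(0)
    t = 1.0
    s, b_t, qv = ito_sum_B_dB(brownian_increments(100_000, t, rng))
    print("Ito sum          :", s)
    print("(B_t^2 - qv) / 2 :", 0.5 * (b_t * b_t - qv))
    print("(B_t^2 - t) / 2  :", 0.5 * (b_t * b_t - t))
```

The first two printed numbers agree up to rounding; the third differs from them only through the gap between the sum of squared increments and t, which shrinks as the partition is refined.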
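Then
\begin{align*}
dY_t &= 3 X_t^2 \, dX_t + \frac{1}{2}\, 6 X_t \, dX_t^2 \\
&= 3 X_t^2 \, dB_t + 3 B_t \left( dB_t^2 \right) \\
&= 3 B_t^2 \, dB_t + 3 B_t \, dt
\end{align*}
and so
\[ B_t^3 = 3 \left( \int_0^t B^2 \, dB + \int_0^t B \, ds \right) \]
and so
\[ Y_t = t B_t = \int_0^t B \, ds + \int_0^t s \, dB \]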
Lemma 35.12 Letting $H_t$ denote the completion of the smallest σ algebra containing
\[ \left( B_{s_1}, \cdots, B_{s_k} \right)^{-1}(B) \]
35.3. SOME REPRESENTATION THEOREMS 949
for all B a Borel set in $\mathbb{R}^{nk}$ for all sequences, $0 \leq s_1 < s_2 \cdots < s_k \leq t$ as defined
above, $H_t$ is also equal to the completion of the smallest σ algebra containing
\[ \left( B_{s_1}, \cdots, B_{s_k} \right)^{-1}(B) \]
for all B an open set in $\mathbb{R}^{nk}$ for all sequences, $0 \leq s_1 < s_2 \cdots < s_k \leq t$. In addition
to this, $H_t$ is equal to the completion of the smallest σ algebra containing
\[ \left( B_{s_1}, \cdots, B_{s_k} \right)^{-1}(B) \]
for all B an open set in $\mathbb{R}^{nk}$ for all sequences, $0 \leq s_1 < s_2 \cdots < s_k \leq t$ such that
the $s_j$ are rational numbers.
Also recall the Doob Dynkin theorem, Theorem 31.19 on Page 866 which is
listed here.
X, Yj ∈ L1 (Ω) .
Qk
Then there exists a Borel function, g : j=1 Rpj → Rn such that
X = g (Y) .
Also recall the submartingale convergence theorem, Theorem 32.12 on Page 900
reviewed below.
Lemma 35.15 Let f be Ht adapted in the sense that every component is Ht adapted
and f ∈ L2 (Ω; Rn ). Here Ht is the filtration defined in Lemma 35.12. Then
\[ \left\| \int_0^T f(s)^T dB \right\|_{L^2(\Omega)} = \| f \|_{L^2(\Omega \times [0,T]; \mathbb{R}^n)} . \]
Then
¯¯Z ¯¯2
¯¯ T ¯¯
¯¯ T ¯¯
¯¯ f (s) dB¯¯
¯¯ 0 ¯¯ 2
L (Ω)
Z X
¡ ¢ ¡ ¢
= ai Bti+1 − Bti aTj Btj+1 − Btj dP
T
Ω i,j
Lemma 35.16 Let Ht be the filtration alluded to in Lemma 35.12 defined in terms
of the n dimensional Brownian motion. Then random variables of the form
φ (Bt1 , · · ·, Btk )
where $t_1 < t_2 \cdots < t_k$ is a finite increasing sequence of rational points in [0, T ] and
$\phi \in C_c^{\infty}\left( \mathbb{R}^k \right)$ are dense in $L^2(\Omega, H_T, P)$. Here the set of random variables includes
all such finite increasing lists of rational points of [0, T ].
∞
Proof: Let g ∈ L2 (Ω, HT , P ) . Also let {tj }j=1 be the rational points of [0, T ] .
Now letting {s1 , · · ·, sn } = {t1 , · · ·, tn } such that s1 < s2 < · · · < sn , let Hn denote
the smallest σ algebra which contains
−1
(Bs1 , · · ·, Bsn ) (U )
where
g (ω) if g (ω) ∈ [−M, M ]
gM (ω) ≡ M if g (ω) > M
−M if g (ω) < −M
and M is chosen large enough that
for p > 2. Therefore, these functions are uniformly integrable and so by the Vitali
convergence theorem,
µZ ¶1/2
2
(gM − E (gM |Hn )) dP →0
Ω
Now by the Doob Dynkin lemma listed above, there exists a Borel measurable,
h : Rnm → R such that
Of course h is not in Cc∞ (Rnm ) . Let λ(Bt ) be the distribution of the random
1 ,···,Btm
vector (Bt1 , · · ·, Btm ) . Thus λ(Bt ,···,Bt ) is a Radon measure and so there exists
1 m
φ ∈ Cc (Rnm ) such that
µZ ¶1/2
2
|E (gM |Hm ) − φ (Bt1 , · · ·, Btm )| dP
Ω
µZ ¶1/2
2
= |h (Bt1 , · · ·, Btm ) − φ (Bt1 , · · ·, Btm )| dP
Ω
µZ ¶1/2
2
= |h (x1 , · · ·, xm ) − φ (x1 , · · ·, xm )| dλ(Bt ,···,Bt ) < ε/4.
nm 1 m
R
By convolving with a mollifier, one can assume that φ ∈ Cc∞ (Rnm ) also. It follows
from 35.31 and 35.32 that
for all increasing sequences, s1 < · · · < sk and B a Borel set of Rnk .
\[ \exp\left( \int_0^T h^T dB - \frac{1}{2} \int_0^T h \cdot h \, dt \right) \tag{35.33} \]
Proof: I will show in the process of the proof that functions of the form 35.33
are in L2 (Ω, P ). If the conclusion of the lemma is not true, there exists nonzero
g ∈ L2 (Ω, HT , P ) such that
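\begin{align*}
&\int_\Omega g(\omega) \exp\left( \int_0^T h^T dB - \frac{1}{2} \int_0^T h \cdot h \, dt \right) dP \\
&= \exp\left( -\frac{1}{2} \int_0^T h \cdot h \, dt \right) \int_\Omega g(\omega) \exp\left( \int_0^T h^T dB \right) dP = 0
\end{align*}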
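\begin{align}
\int_0^T h^T dB &= \sum_{i=0}^{m-1} a_i^T \left( B_{t_{i+1}} - B_{t_i} \right) \tag{35.34} \\
&= \sum_{i=1}^{m} a_{i-1}^T B_{t_i} - \sum_{i=0}^{m-1} a_i^T B_{t_i} \notag \\
&= \sum_{i=1}^{m-1} \left( a_{i-1}^T - a_i^T \right) B_{t_i} - a_0^T B_{t_0} + a_{m-1}^T B_{t_m} . \tag{35.35}
\end{align}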
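Also 35.34 shows $\exp\left( \int_0^T h^T dB \right)$ is in $L^2(\Omega, P)$. To see this recall the $B_{t_{i+1}} - B_{t_i}$
are independent and the density of $B_{t_{i+1}} - B_{t_i}$ is
\[ C(n, \Delta t_i) \exp\left( -\frac{1}{2} \frac{|x|^2}{t_{i+1} - t_i} \right) \]
so
\begin{align*}
\int_\Omega \left( \exp\left( \int_0^T h^T dB \right) \right)^2 dP
&= \int_\Omega \exp\left( 2 \int_0^T h^T dB \right) dP \\
&= \int_\Omega \exp\left( \sum_{i=0}^{m-1} 2 a_i^T \left( B_{t_{i+1}} - B_{t_i} \right) \right) dP \\
&= \int_\Omega \prod_{i=0}^{m-1} \exp\left( 2 a_i^T \left( B_{t_{i+1}} - B_{t_i} \right) \right) dP \\
&= \prod_{i=0}^{m-1} \int_\Omega \exp\left( 2 a_i^T \left( B_{t_{i+1}} - B_{t_i} \right) \right) dP \\
&= \prod_{i=0}^{m-1} \int_{\mathbb{R}^n} C(n, \Delta t_i) \exp\left( 2 a_i^T x \right) \exp\left( -\frac{1}{2} \frac{|x|^2}{\Delta t_i} \right) dx < \infty
\end{align*}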
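\[ \int_\Omega g(\omega) \exp\left( \sum_{j=0}^{m-1} y_j^T B_{t_j}(\omega) \right) dP = 0. \]
is analytic on $\mathbb{C}^{mn}$ and equals zero on $\mathbb{R}^{nm}$ so from standard complex variable theory,
this analytic function must equal zero on $\mathbb{C}^{nm}$, not just on $\mathbb{R}^{nm}$. In particular,
for all $y = (y_1, \cdots, y_m) \in \mathbb{R}^{nm}$,
\[ \int_\Omega g(\omega) \exp\left( \sum_{j=0}^{m-1} i y_j^T B_{t_j}(\omega) \right) dP = 0. \tag{35.36} \]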
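Now pick $\phi \in C_c^{\infty}(\mathbb{R}^{mn})$. Thus φ is in the Schwartz class and from the theory of
Fourier transforms,
\[ \phi(x) = \frac{1}{(2\pi)^{mn/2}} \int_{\mathbb{R}^{mn}} e^{i y \cdot x} F\phi(y)\, dy. \]
In particular,
\[ \phi\left( B_{t_0}(\omega), \cdots, B_{t_{m-1}}(\omega) \right) = \frac{1}{(2\pi)^{mn/2}} \int_{\mathbb{R}^{mn}} \exp\left( \sum_{j=0}^{m-1} i y_j^T B_{t_j}(\omega) \right) F\phi(y)\, dy. \]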
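Therefore,
\begin{align*}
&\int_\Omega g(\omega)\, \phi\left( B_{t_0}(\omega), \cdots, B_{t_{m-1}}(\omega) \right) dP \\
&= \frac{1}{(2\pi)^{mn/2}} \int_\Omega \int_{\mathbb{R}^{mn}} g(\omega) \exp\left( \sum_{j=0}^{m-1} i y_j^T B_{t_j}(\omega) \right) F\phi(y)\, dy\, dP \\
&= \frac{1}{(2\pi)^{mn/2}} \int_{\mathbb{R}^{mn}} \int_\Omega g(\omega) \exp\left( \sum_{j=0}^{m-1} i y_j^T B_{t_j}(\omega) \right) dP\; F\phi(y)\, dy \\
&= \frac{1}{(2\pi)^{mn/2}} \int_{\mathbb{R}^{mn}} 0 \cdot F\phi(y)\, dy = 0
\end{align*}
which shows by Lemma 35.16 that g = 0 after all, contrary to assumption. This
proves the lemma.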
Why such a funny lemma? It is because of the following computation which
depends on Itô's formula. First let's review Itô's formula. For
\[ dX_t = u \, dt + V \, dB \]
where u is a vector and V a matrix, and Y = g (t, X) ,
\[ dY_k = \frac{\partial g_k}{\partial t}(t, X_t)\, dt + D_2 g_k(t, X_t)\, dX_t + \frac{1}{2}\, dX_t^T H_k(t, X_t)\, dX_t \]
where $dt\, dB_k = 0$, $dt^2 = 0$, and $dB_i\, dB_j = \delta_{ij}\, dt$. In the above, $H_k$ is the Hessian
matrix of $g_k$. Thus this reduces to
\begin{align*}
dY_k &= \frac{\partial g_k}{\partial t}(t, X_t)\, dt + D_2 g_k(t, X_t) (u \, dt + V \, dB) \\
&\quad + \frac{1}{2} (u \, dt + V \, dB)^T H_k(t, X_t) (u \, dt + V \, dB) \\
&= \left( \frac{\partial g_k}{\partial t}(t, X_t) + D_2 g_k(t, X_t) u + \frac{1}{2} \operatorname{tr}\left( V^T H_k V \right) \right) dt + D_2 g_k(t, X_t) V \, dB
\end{align*}
In the above case, referring to 35.33, let
\[ X = \int_0^t h^T dB - \frac{1}{2} \int_0^t h \cdot h \, ds \]
and $g(x) = e^x$ so $Y = e^X$. Then in this case, g is a scalar valued function of one
variable and so the above formula reduces to
\begin{align*}
dY &= \left( -\frac{1}{2} e^X |h|^2 + \frac{1}{2} e^X \operatorname{tr}\left( h h^T \right) \right) dt + e^X h^T dB \\
&= \left( -\frac{1}{2} e^X |h|^2 + \frac{1}{2} e^X |h|^2 \right) dt + e^X h^T dB \\
&= Y h^T dB
\end{align*}
Hence
\[ Y = Y_0 + \int_0^t Y h^T dB = 1 + \int_0^t Y h^T dB. \]
because the integrand is an adapted step function and $E\left( B_{t_{j+1}} - B_{t_j} \right) = 0$. Therefore,
letting F = Y,
\[ F = E(F) + \int_0^T f(t, \omega)^T dB \tag{35.37} \]
It follows that for F ∈ L2 (Ω, HT , P ) of the special form described in Lemma 35.17,
there exists an adapted function in L2 (Ω; Rn ), f such that 35.37 holds. Does such a
function f exist for all F ∈ L2 (Ω, HT , P )? The answer is yes and this is the content
of the next theorem which is called the Itô representation theorem.
\[ F = E(F) + \int_0^T f(s, \omega)^T dB. \]
where h is a vector valued deterministic step function of the sort described in this
lemma, are dense in $L^2(\Omega, H_T, P)$. Given $F \in L^2(\Omega, H_T, P)$, let $\{G_k\}_{k=1}^{\infty}$ be functions
in the subspace of linear combinations of the above functions which converge to F in
$L^2(\Omega, H_T, P)$. For each of these functions there exists $f_k$, an adapted step function,
such that
\[ G_k = E(G_k) + \int_0^T f_k(s, \omega)^T dB. \]
Then from the Itô isometry, and the observation that $E\left( (G_k - G_l)^2 \right) \to 0$ as k, l → ∞
by the above definition of $G_k$ in which the $G_k$ converge to F in $L^2(\Omega)$,
\begin{align}
0 &= \lim_{k,l \to \infty} E\left( (G_k - G_l)^2 \right) \notag \\
&= \lim_{k,l \to \infty} E\left( \left( E(G_k) + \int_0^T f_k(s, \omega)^T dB - \left( E(G_l) + \int_0^T f_l(s, \omega)^T dB \right) \right)^2 \right) \notag \\
&= \lim_{k,l \to \infty} \left\{ E(G_k - G_l)^2 + 2 E(G_k - G_l) \int_\Omega \int_0^T (f_k - f_l)^T dB \, dP + \int_\Omega \left( \int_0^T (f_k - f_l)^T dB \right)^2 dP \right\} \notag \\
&= \lim_{k,l \to \infty} \left\{ E(G_k - G_l)^2 + \int_\Omega \left( \int_0^T (f_k - f_l)^T dB \right)^2 dP \right\} \notag \\
&= \lim_{k,l \to \infty} \int_\Omega \left( \int_0^T (f_k - f_l)^T dB \right)^2 dP = \lim_{k,l \to \infty} \| f_k - f_l \|^2_{L^2(\Omega \times [0,T]; \mathbb{R}^n)} \tag{35.38}
\end{align}
Going from the third to the fourth line is justified because $\int_\Omega \int_0^T (f_k - f_l)^T dB \, dP = 0$,
thanks to the independence of the increments and the fact that $f_k - f_l$ is an adapted step
function.
This shows $\{f_k\}_{k=1}^{\infty}$ is a Cauchy sequence in $L^2(\Omega \times [0, T]; \mathbb{R}^n)$. It follows there
exists a subsequence and $f \in L^2(\Omega \times [0, T]; \mathbb{R}^n)$ such that $f_k$ converges to f pointwise
and in $L^2(\Omega \times [0, T]; \mathbb{R}^n)$ with f being $B \times H_T$ measurable. Then by the Itô isometry
and the equation
\[ G_k = E(G_k) + \int_0^T f_k(s, \omega)^T dB \]
you can pass to the limit as k → ∞ and obtain
\[ F = E(F) + \int_0^T f(s, \omega)^T dB \]
Then
\[ \int_0^T f(t, \omega)^T dB = \int_0^T f_1(t, \omega)^T dB \]
and so
\[ \int_0^T \left( f(t, \omega)^T - f_1(t, \omega)^T \right) dB = 0 \]
and by the Itô isometry,
\[ 0 = \left\| \int_0^T \left( f(t, \omega)^T - f_1(t, \omega)^T \right) dB \right\|_{L^2(\Omega)} = \| f - f_1 \|_{L^2(\Omega \times [0,T]; \mathbb{R}^n)} \]
Proof: First suppose f is an adapted function of the sort that g is. Then the
following claim is the first step in the proof.
Claim: Let t1 < t2 . Then
µZ t2 ¶
T
E f dB|Ht1 = 0
t1
Proof of claim: First consider the claim in the case that f is an adapted step
function of the form
n−1
X
f (t) = ai X[ti ,ti+1 ) (t)
i=0
Then
Z t2 n−1
X ¡ ¢
f T dB (ω) = aTi (ω) Bti+1 (ω) − Bti (ω) .
t1 i=0
XZ
n−1
¡ ¢
= XA (ω) aTi (ω) Bti+1 (ω) − Bti (ω) dP
i=0 Ω
XZ
n−1 Z
¡ ¢
= XA (ω) aTi (ω) dP Bti+1 (ω) − Bti (ω) dP = 0.
i=0 Ω Ω
³R ´
t
Since A is arbitrary, E t12 f T dB|Ht1 = 0. This proves the claim in the case that
f is an adapted step function. The general case follows from this in the usual way.
If f is not a step function, there is a sequence of adapted step functions, {fk } such
that Z Z t2 t2
fkT dB → f T dB
t1 t1
in L2 (Ω, P ) and ||fk − f ||L2 (Ω×[t1 ,t2 ];Rn ) → 0. Then using the Itô isometry,
Z µ µZ t2 ¶ µZ t2 ¶¶2
E fkT dB|Ht1 −E T
f dB|Ht1 dP
Ω t1 t1
Z µZ t2 Z t2 ¶2
= E fkT dB − T
f dB|Ht1 dP
Ω t1 t1
Z µZ t2 Z t2 ¶2 Z µZ t2 ¶2
¡ ¢
= fkT dB − f dBT
dP = fkT −f T dB dP
Ω t1 t1 Ω t1
= ||fk − f ||L2 (Ω×[t1 ,t2 ];Rn ) → 0
\begin{align*}
&= E(M_0) + E\left( \int_0^{t_1} f^{t_2}(s, \cdot)^T dB + \int_{t_1}^{t_2} f^{t_2}(s, \cdot)^T dB \,\Big|\, H_{t_1} \right) \\
&= E(M_0) + E\left( \int_0^{t_1} f^{t_2}(s, \cdot)^T dB \,\Big|\, H_{t_1} \right) \\
&= E(M_0) + \int_0^{t_1} f^{t_2}(s, \cdot)^T dB
\end{align*}
because $\int_0^{t_1} f^{t_2}(s, \cdot)^T dB$ is $H_{t_1}$ measurable. Thus
\[ M_{t_1} = E(M_0) + \int_0^{t_1} f^{t_2}(s, \cdot)^T dB = E(M_0) + \int_0^{t_1} f^{t_1}(s, \cdot)^T dB \]
and so
\[ 0 = \int_0^{t_1} f^{t_1}(s, \cdot)^T dB - \int_0^{t_1} f^{t_2}(s, \cdot)^T dB \]
for all t ≤ N. Let $g = f^N$ for t ∈ [0, N ] . Then aside from a set of measure zero,
this is well defined and for all t ≥ 0
\[ M_t = E(M_0) + \int_0^t g(s, \cdot)^T dB \]
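Then
\[ u(t) \leq u_0 e^{kt}. \]
Proof: Let
\[ f(t) = u_0 e^{kt} - \left( u_0 + \int_0^t k u(s)\, ds \right). \]
Then f (0) = 0 and, using the assumed inequality on u,
\[ \frac{d}{dt} \left( e^{-kt} f(t) \right) \geq 0 \]
which implies f (t) ≥ 0. Hence
\[ u(t) \leq u_0 + \int_0^t k u(s)\, ds \leq u_0 e^{kt}. \]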
Then there exists a sequence of adapted bounded step functions, $\{\phi_n\}$ satisfying
\[ \int_S^T \left( f(t, \omega) - \phi_n(t, \omega) \right)^2 dt \leq 2^{-n} \]
for $\omega \notin E$, a set of measure zero. Then for t ∈ [S, T ] , the Itô integral is defined by
\[ \int_S^t f \, dB(\omega) = \lim_{n \to \infty} \int_S^t \phi_n \, dB(\omega). \]
Furthermore, for these ω, $t \to \int_S^t f \, dB(\omega)$ is continuous because by Theorem 34.5
the convergence of $\int_S^t \phi_n \, dB(\omega)$ is uniform on [0, T ].
whenever 0 ≤ t0 < · · · < tk ≤ t and U is a Borel set. Then the following lemma is
what is needed to consider certain Itô integrals.
It follows that for all D an inverse image of an open set and E of the above form
where V is open, P (D ∩ E) = P (D) P (E). It follows easily this holds for all D
−1
and E inverse images of Borel sets. If D ∈ Ht and E = (Bs − Bt ) (V ) then there
exists D1 an inverse image of a Borel set such that D1 ⊇ D and P (D1 \ D) = 0 so
P (D ∩ E) = P (D1 ∩ E)
= P (D1 ) P (E) = P (D) P (E) .
35.4. STOCHASTIC DIFFERENTIAL EQUATIONS 963
and so
¡ ¢ ¡ ¢
E Bs |HtZ = E Bs − Bt + Bt |HtZ
¡ ¢
= 0 + E Bt |HtZ = Bt .
Proof: The assertion about the norms and the Banach space are all obvious.
The main message is about the measurability assertions. Let P ≡ {t0 , · · ·, tn } be
a partition
¡ of [0, T ] of¢the usual sort where 0 = t0 < t1 < · · · < tn = T. Then for
p
X ∈ C [0, T ] ; L2 (Ω) and Gt adapted, consider
n
X
XP (t) ≡ X (tk−1 ) X[tk−1 ,tk ) (t) .
k=1
and so it follows for each t ∈ [0, T ] , Xn (t) (ω) → Y (t) (ω) a.e. ω. It follows since
(Ω, Gt , P ) is complete, Y (t) is Gt measurable for each t. This proves the lemma.
Rt Rt p
which shows 0 YT dB = 0 Y1T dB in L2 (Ω) . This proves the lemma.
p m
For x ∈ R , b (t, x) ∈ R and σ (t, x) will be an p × m matrix. It is assumed
that for given x, y ∈ Rp the following measurability and Lipschitz conditions hold.
Definition 35.26 Let Gt be a filtration for which Bt is a martingale and such that
for s > t, Bs − Bt is independent of Gt . For X product measurable in B × F and
Gt adapted define
µZ t ¶ Z t
σ (s, X) dB ≡ (σ (s, X))k dB
0 k 0
The integral defined in this way satisfies all the usual algebraic properties for inte-
grals and
Z t
t→ b (s, X (s)) ds
a
p
is continuous as a function with values in L2 (Ω) . Also if X is Gt adapted for Gt a
filtration, then
Z b
b (s, X (s)) ds
a
is Gb measurable.
Proof: Let XPn (t) be the sequence of step functions converging to X (t) which
is described in Lemma 35.23. Then on [tj , tj+1 ),
Z Z
b (t, XPn (t)) dP = b (t, X (tj )) dP
Ω Ω
p
where X (tj ) P ∈ L2 (Ω) . It follows X (tj ) is the pointwise limit of simple functions
m 2 p
of the form k=1 ck XEk (ω) which also converge to X (tj ) in L (Ω) . Thus for
t ∈ [tj , tj+1 ),
Z Ã m ! m Z
X X
b t, ck XEk (ω) dP = b (t, ck ) dP,
Ω k=1 k=1 Ek
is Lebesgue measurable.
¡ Also,¢ using the properties of b and Holder’s inequality, if
p
Xn → X in C [0, T ] ; L2 (Ω) ,
¯Z Z ¯
¯ ¯
¯ |b (t, X (t))|2 dt − |b (t, X (t))|
2
dt¯ ≤ C ||X (t) − Xn (t)||2 2 p .
¯ n ¯ L (Ω)
Ω Ω
Rb
Letting Λ (h) ≡ a (b (t, X (t)) , h)L2 (Ω)p dt, it follows Λ is a continuous linear func-
p
tional on L2 (Ω) and so by the Riesz representation theorem, there exists a unique
p
element of L2 (Ω) denoted by
Z b
b (t, X (t)) dt
a
such that
ÃZ ! Z
b b
b (t, X (t)) dt, h = (b (t, X (t)) , h)L2 (Ω)p dt.
a a
L2 (Ω)p
Lemma 35.28 Let b and σ satisfy 35.39 - 35.41. Let Z be a random vector which
is either independent of Ht for all t > 0 or else is measurable with respect to Ht for
all t and suppose Z
2
|Z| dP < ∞
Ω
Let Gt = HtZ in the first case and let Gt = Ht in the second. Then there exists a
unique solution, X ∈ VG to the integral equation,
Z t Z t
X (t) = Z + b (s, X (s)) ds + σ (s, X (s)) dB.
0 0
Proof: For X ∈ VG , supplied with the norm ||·||λ described above, let
Z t Z t
ΦX (t) ≡ Z + b (s, X (s)) ds + σ (s, X (s)) dB
0 0
It follows from Corollary 31.43 on Page 890 and Lemmas 34.8 on Page 925 and
Lemma 35.27 that ΦX is Gt adapted and B × F measurable. The deterministic
p
integral is a continuous function of t with values in L2 (Ω) by Lemma 35.27. The
same is true of the Itô integral. To see this, let Y be adapted and product measur-
able and equal to X for a.e. t. Then by the Itô isometry and the above definition
of this integral of functions in VG ,
¯¯Z t Z s ¯¯2
¯¯ ¯¯
¯¯ σ (r, X (r)) dB− σ (r, X (r)) dB¯¯¯¯
¯¯
0 0 2 L (Ω)p
¯¯Z t Z s ¯¯2
¯¯ ¯¯
= ¯¯ σ (r, Y (r)) dB− σ (r, Y (r)) dB¯¯¯¯
¯¯
0 0 L2 (Ω)p
¯¯Z t ¯¯2 Z tZ
¯¯ ¯¯
= ¯¯ σ (r, Y (r)) dB ¯¯ =
2
|σ (r, Y (r))| dP dr
¯¯ ¯¯ 2 p
s L (Ω) s Ω
Z tZ ³ ´ Z tZ ³ ´
2 2
≤ 2C 1 + |Y (r)| dP dr = 2C 1 + |X (r)| dP dr
s Ω s Ω
Z tZ
2
≤ e−λt |σ (s, X1 (s)) − σ (s, Y1 (s))| dP ds
0 Ω
Z tZ
−λt 2
≤ Ke |X1 (s) − Y1 (s)| dP ds
0 Ω
Z t µ Z ¶
−λt λs −λs 2
= Ke e e |X (s) − Y (s)| dP ds
0 Ω
Z t
1
≤ K eλ(s−t) ds ||X − Y||λ ≤ K ||X − Y||λ .
0 λ
Thus if λ is large enough, this term is a contraction. Similar but easier reasoning
applies to the deterministic integral in the definition of Φ. Therefore, by the usual
contraction mapping theorem, Φ has a unique fixed point in VG . This proves the
lemma.
Theorem 35.29 Let b and σ satisfy 35.39 - 35.41. Let Z be a random vector
which is either independent of Ht for all t > 0 or else is measurable with respect to
Ht for all t and suppose Z
2
|Z| dP < ∞
Ω
Let Gt = HtZ in the first case and let Gt = Ht in the second. Then there exists a
p
B × F measurable function, Y ∈ L2 ([0, T ] × Ω) and a set of measure zero, N such
that for ω ∈
/ N,
Z t Z t
Y (t) (ω) = Z + b (s, Y (s) (ω)) ds + σ (s, Y (s)) dB (ω)
0 0
\begin{align*}
&\leq \int_0^t \left( b(s, X(s)) - b\left(s, \widetilde{Y}(s)\right), h \right)_{L^2(\Omega)^p} ds \\
&\leq K \int_0^t \left\| X(s) - \widetilde{Y}(s) \right\|_{L^2(\Omega)^p} ds \; \| h \|_{L^2(\Omega)^p} = 0
\end{align*}
and so
\[ \int_0^t b(s, X(s))\, ds = \int_0^t b\left(s, \widetilde{Y}(s)\right) ds \quad \text{in } L^2(\Omega)^p. \]
It follows that in $L^2(\Omega)^p$,
\[ X(t) = Z + \int_0^t b\left(s, \widetilde{Y}(s)\right) ds + \int_0^t \sigma\left(s, \widetilde{Y}(s)\right) dB \tag{35.42} \]
where now the first integral on the right is the usual thing given by
\[ \left( \int_0^t b\left(s, \widetilde{Y}(s)\right) ds \right)(\omega) = \int_0^t b\left(s, \widetilde{Y}(s)(\omega)\right) ds \]
Also, for a.e. ω, $t \to \widetilde{Y}(t)(\omega)$ is a function in $L^2(0, T)$ and from the theory of
the Itô integral, $\int_0^t \sigma\left(s, \widetilde{Y}(s)\right) dB$ is a continuous function of t for ω not in a set of
measure zero. Therefore, for ω off a set of measure zero, the right side of 35.42 is
continuous in t. It also delivers an adapted product measurable function, Y. Thus
for a.e. ω, Y (t) (ω) is a continuous function of t and
\[ Y(t) = Z + \int_0^t b\left(s, \widetilde{Y}(s)\right) ds + \int_0^t \sigma\left(s, \widetilde{Y}(s)\right) dB \]
and so Y (t) = X (t) in $L^2(\Omega)^p$ for all t. Now this also shows $Y(t) = \widetilde{Y}(t)$ for a.e.
t. Hence by the Itô isometry, the right side of the above is unchanged in $L^2(\Omega)^p$ if $\widetilde{Y}$
is replaced by Y. By the argument just given, the resulting right side is continuous
in t for a.e. ω and so there exists a set of measure zero such that for ω not in this
set,
\[ Y(t)(\omega) = Z + \int_0^t b(s, Y(s)(\omega))\, ds + \int_0^t \sigma(s, Y(s))\, dB(\omega) \]
and both sides are continuous functions of t. This proves the theorem.
Note there were two cases given for the initial condition in the above theorem.
The second is not very interesting. If Z is H0 measurable, then since B0 = x, a
constant, it follows H0 = {∅, Ω} so Z is a constant. However, if Z is a constant,
then it satisfies the first condition.
Not surprisingly, the solution to the above theorem is unique. This is stated as
the following corollary which is the main result.
Corollary 35.30 Let b and σ satisfy 35.39 - 35.41. Let Z be a random vector
which is independent of $H_t$ for all t > 0 and suppose
\[ \int_\Omega |Z|^2 dP < \infty \]
Then there exists a unique $H_t^Z$ adapted solution, $X \in L^2([0, T] \times \Omega)^n$, to the integral
equation,
\[ X(t) = Z + \int_0^t b(s, X(s))\, ds + \int_0^t \sigma(s, X(s))\, dB \quad \text{a.e. } \omega \tag{35.43} \]
Proof: The existence part of this proof is already done. Let N denote the union
of the exceptional sets corresponding to X and $\widetilde{X}$. Then from 35.43 and the various
assumptions on b and σ, it follows that for $\omega \notin N$,
\begin{align*}
\left| X(t) - \widetilde{X}(t) \right|^2 \leq{}& 2 K^2 T \int_0^t \left| X(s) - \widetilde{X}(s) \right|^2 ds \\
&+ 2 \left| \int_0^t \left( \sigma(s, X(s)) - \sigma\left(s, \widetilde{X}(s)\right) \right) dB \right|^2 .
\end{align*}
is
dX = b (t, X) dt + σ (t, X (t)) dB, X (0) = Z.
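An equation of this form can be approximated numerically by replacing the two integrals with left-endpoint sums. The following is a minimal sketch of the Euler-Maruyama scheme for the scalar case, which is a standard discretization and not a construction from this text; the function name `euler_maruyama` and its arguments are illustrative assumptions, and the Lipschitz conditions on b and σ are taken for granted.

```python
import math
import random


def euler_maruyama(b, sigma, z, t_final, n, rng):
    """Approximate a scalar solution of dX = b(t, X) dt + sigma(t, X) dB, X(0) = z.

    Returns the list of approximate values at the grid points k * t_final / n.
    """
    dt = t_final / n
    x = z
    path = [x]
    for k in range(n):
        t = k * dt
        db = rng.gauss(0.0, math.sqrt(dt))   # increment B_{t+dt} - B_t
        # Both coefficients are evaluated at the left endpoint, as in the Ito sums.
        x = x + b(t, x) * dt + sigma(t, x) * db
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical coefficients chosen only for illustration.
    path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5, 1.0, 1.0, 1000, rng)
    print("X(1) is approximately", path[-1])
```

With σ = 0 the scheme reduces to the ordinary Euler method for the differential equation X' = b (t, X).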
Obviously, one would want to do something like $\frac{dX}{X} = h(t)\, dB$. However, you
have to follow the rules. Let g (x) = ln (x) and Y = g (X) . Then by the Itô formula,
\begin{align*}
dY &= \frac{1}{X}\, dX + \frac{1}{2} \left( \frac{-1}{X^2} \right) dX^2 \\
&= \frac{1}{X}\, h(t) X \, dB - \frac{1}{2} \frac{1}{X^2}\, h(t)^2 X^2 \, dB^2 \\
&= h(t)\, dB - \frac{1}{2} h(t)^2 \, dt
\end{align*}
and also Y (0) = 0. Therefore, $Y(t) = \ln(X(t)) = \int_0^t h(s)\, dB - \frac{1}{2} \int_0^t h(s)^2 \, ds$ and
so
\[ X(t) = \exp\left( \int_0^t h(s)\, dB - \frac{1}{2} \int_0^t h(s)^2 \, ds \right) \]
Note the extra term, $-\frac{1}{2} \int_0^t h(s)^2 \, ds$.
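In this case it is a lot like the above example but it has an extra f (t) Xdt. This
suggests something useful might be obtained by letting Y = ln (X) as was done
earlier. Thus
\begin{align*}
dY &= \frac{1}{X} \left( X f(t)\, dt + h(t) X \, dB \right) + \frac{1}{2} \left( \frac{-1}{X^2} \right) \left( X f(t)\, dt + h(t) X \, dB \right)^2 \\
&= \frac{1}{X} \left( X f(t)\, dt + h(t) X \, dB \right) + \frac{1}{2} \left( \frac{-1}{X^2} \right) h(t)^2 X^2 \, dB^2 \\
&= \frac{1}{X} \left( X f(t)\, dt - \frac{1}{2} X h(t)^2 \, dt + h(t) X \, dB \right) \\
&= \left( f(t) - \frac{1}{2} h(t)^2 \right) dt + h(t)\, dB
\end{align*}
and so $\ln(X) = \int_0^t \left( f(s) - \frac{1}{2} h(s)^2 \right) ds + \int_0^t h(s)\, dB$ and so
\[ X(t) = \exp\left( \int_0^t \left( f(s) - \frac{1}{2} h(s)^2 \right) ds + \int_0^t h(s)\, dB \right) \]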
The next example is a model for stock prices. Learn this model and get rich.
Example 35.33 For P (t) the price of stock,
\[ dP = \mu P \, dt + \sigma P \, dB \]
In this model, µ is called the drift and σ is called the volatility.
It is just a special case of the above model in which f (t) = µ and h (t) = σ.
Then from the above, taking P (0) = 1 as in the preceding examples,
\[ P(t) = \exp\left( t\mu - \frac{1}{2} t \sigma^2 + \sigma B_t \right) \]
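The closed form for the stock price model can be evaluated directly along a sampled Brownian path. The sketch below is only an illustration under stated assumptions: the function name `gbm_exact_path`, the uniform grid, and the initial price parameter `p0` (a multiplicative constant, equal to 1 when P (0) = 1) are not from the text.

```python
import math
import random


def gbm_exact_path(mu, sigma, t_final, n, rng, p0=1.0):
    """Sample P(t_k) = p0 * exp((mu - sigma**2 / 2) t_k + sigma B_{t_k}) on a uniform grid."""
    dt = t_final / n
    b = 0.0                      # Brownian motion value, starting at B_0 = 0
    prices = [p0]
    for k in range(1, n + 1):
        b += rng.gauss(0.0, math.sqrt(dt))
        t = k * dt
        prices.append(p0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * b))
    return prices


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical drift and volatility values chosen only for illustration.
    prices = gbm_exact_path(mu=0.05, sigma=0.2, t_final=1.0, n=250, rng=rng)
    print("P(1) =", prices[-1])
```

Because the formula is an exponential, every sampled price is positive, and with σ = 0 it reduces to deterministic growth at rate µ.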
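Example 35.34 This example is called the Brownian bridge.
\[ dX = \frac{-X}{1 - t}\, dt + dB, \quad X(0) = 0. \]
Here the noise term is dB rather than h (t) XdB, so this is not a special case of the
model above and taking the logarithm does not help. Instead let Y = X/ (1 − t) for
t < 1. Since 1/ (1 − t) is deterministic and $C^1$, the Itô formula gives
\[ dY = \frac{X}{(1 - t)^2}\, dt + \frac{1}{1 - t}\, dX = \frac{X}{(1 - t)^2}\, dt - \frac{X}{(1 - t)^2}\, dt + \frac{1}{1 - t}\, dB = \frac{1}{1 - t}\, dB. \]
Thus, for 0 ≤ t < 1, the solution is
\[ X(t) = (1 - t) \int_0^t \frac{1}{1 - s}\, dB. \]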
Before doing another example I will give a simple lemma on integration by parts.
In this lemma B will denote m dimensional Brownian motion.
Lemma 35.35 Let (t, ω) → g (t, ω) be a $G_t$ adapted measurable function such that
\[ P\left( \int_0^t |g(s, \omega)|^2 ds < \infty \right) = 1 \]
where $B_t$ is a martingale with respect to the filtration $G_t$ and the increments, $B_s - B_t$
for s > t are independent of $G_t$ so that the Itô integral, $\int_0^t g^T dB$ is defined. Suppose
also that t → g (t, ω) is $C^1$ and $B_0 = 0$. Then
\[ \int_0^t g^T(s, \omega)\, dB = g^T(t, \omega) B_t(\omega) - \int_0^t \frac{\partial g^T}{\partial t}(s, \omega) B(s)\, ds \quad \text{a.e.} \]
Proof: Let $g_n(t) \equiv \sum_{k=0}^{n-1} g(t_k) \mathcal{X}_{[t_k, t_{k+1})}(t)$ where $t_k = k (t/n)$. Then by the
definition of the Itô integral,
\begin{align*}
\int_0^t g^T dB &= \lim_{n \to \infty} \int_0^t g_n^T dB = \lim_{n \to \infty} \sum_{k=0}^{n-1} g(t_k)^T \left( B_{t_{k+1}} - B_{t_k} \right) \\
&= \lim_{n \to \infty} \left( \sum_{k=1}^{n} g(t_{k-1})^T B_{t_k} - \sum_{k=0}^{n-1} g(t_k)^T B_{t_k} \right) \\
&= \lim_{n \to \infty} \left[ g(t_{n-1})^T B_t - \sum_{k=1}^{n-1} \left( g(t_k)^T - g(t_{k-1})^T \right) B_{t_k} \right] \\
&= g^T(t, \omega) B_t(\omega) - \int_0^t \frac{\partial g^T}{\partial t}(s, \omega) B(s, \omega)\, ds \quad \text{a.e. } \omega.
\end{align*}
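The middle step of this proof is a purely algebraic summation by parts, which holds exactly on every partition. The following minimal sketch checks it for a scalar integrand on a sampled path; the function names and the choice g (t) = cos (t) are illustrative assumptions, not part of the text.

```python
import math
import random


def left_sum(g_vals, b_vals):
    """Sum of g(t_k) (B_{t_{k+1}} - B_{t_k}) for k = 0, ..., n-1."""
    n = len(b_vals) - 1
    return sum(g_vals[k] * (b_vals[k + 1] - b_vals[k]) for k in range(n))


def by_parts_sum(g_vals, b_vals):
    """g(t_{n-1}) B_{t_n} - sum_{k=1}^{n-1} (g(t_k) - g(t_{k-1})) B_{t_k}, assuming B_{t_0} = 0."""
    n = len(b_vals) - 1
    boundary = g_vals[n - 1] * b_vals[n]
    correction = sum((g_vals[k] - g_vals[k - 1]) * b_vals[k] for k in range(1, n))
    return boundary - correction


def sample_path(n, t_final, rng):
    """Values B_{t_0}, ..., B_{t_n} of a Brownian path on a uniform grid, with B_{t_0} = 0."""
    dt = t_final / n
    b = [0.0]
    for _ in range(n):
        b.append(b[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return b


if __name__ == "__main__":
    n, t_final = 1000, 1.0
    dt = t_final / n
    b = sample_path(n, t_final, random.Random(0))
    g = [math.cos(k * dt) for k in range(n)]   # a smooth deterministic integrand
    print(left_sum(g, b), by_parts_sum(g, b))
```

The two printed values agree up to floating point rounding. As n grows, the correction sum approaches the integral of g'(s) B (s) ds, which is the term appearing in the lemma.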
Example 35.36 Linear systems of equations. Here B is m dimensional Brownian
motion. The equation of interest is
\[ dX = (AX + h(t))\, dt + K \, dB, \quad X(0) = X_0 \]
where $X_0$ is a random vector in $\mathbb{R}^m$ which is independent of $H_t$ for all t ≥ 0 and
A, K are constant m × m matrices. Then I will show
\[ X(t) = e^{At} \left( X_0 + \int_0^t \left( e^{-As} h(s) + e^{-As} A K B(s) \right) ds + e^{-At} K B(t) \right) \]
Let $Y(t) = e^{-At} X(t)$. Then from the above,
\begin{align*}
dY &= -A e^{-At} X \, dt + e^{-At} I \, dX + \frac{1}{2} e^{-At}\, dX^T 0 \, dX \\
&= -A e^{-At} X \, dt + e^{-At} I \left( (AX + h(t))\, dt + K \, dB \right) \\
&= e^{-At} h(t)\, dt + e^{-At} K \, dB.
\end{align*}
Hence using integration by parts,
\begin{align*}
e^{-At} X(t) - X_0 &= \int_0^t e^{-As} h(s)\, ds + \int_0^t e^{-As} K \, dB \\
&= \int_0^t e^{-As} h(s)\, ds + e^{-At} K B(t) + A \int_0^t e^{-As} K B(s)\, ds
\end{align*}
35.5. A DIFFERENT PROOF OF EXISTENCE AND UNIQUENESS 975
and so
\begin{align}
X(t) &= e^{At} \left( X_0 + \int_0^t e^{-As} h(s)\, ds + e^{-At} K B(t) + A \int_0^t e^{-As} K B(s)\, ds \right) \notag \\
&= e^{At} \left( X_0 + \int_0^t \left( e^{-As} h(s) + e^{-As} A K B(s) \right) ds + e^{-At} K B(t) \right). \tag{35.45}
\end{align}
In this formula $e^{At}$ is the matrix, M (t) which solves M' = AM, M (0) = I, and
$e^{-As}$ is its inverse at time s.
Note that formally differentiating the above equation gives
\begin{align*}
X' ={}& A e^{At} \left( X_0 + \int_0^t \left( e^{-As} h(s) + e^{-As} A K B(s) \right) ds + e^{-At} K B(t) \right) \\
&+ e^{At} \left( e^{-At} h(t) + e^{-At} A K B(t) - A e^{-At} K B(t) + e^{-At} K \frac{dB}{dt} \right)
\end{align*}
and so
\begin{align*}
X' &= AX + h(t) + A K B(t) - A K B(t) + K \frac{dB}{dt} \\
&= AX + h(t) + K \frac{dB}{dt}.
\end{align*}
Of course this is total nonsense because B is known to not be differentiable. However,
multiplying by dt gives
\[ dX = (AX + h(t))\, dt + K \, dB \]
and the formula 35.45 shows X (0) = $X_0$. This was the original differential equation.
Note that it was not necessary to assume very much about $X_0$ to write 35.45.
Then
u (t) ≤ u0 ekt .
Proof: Let µ Z t ¶
kt
f (t) = u0 e − u0 + ku (s) ds .
0
Then there exists a sequence of adapted bounded step functions, {φn } satisfying
Z T
2
(f (t, ω) − φn (t, ω)) dt ≤ 2−n
S
for ω ∈
/ E, a set of measure zero. Then for t ∈ [S, T ] , the Itô integral is defined by
Z t Z t
f dB (ω) = lim φn dB (ω) .
S n→∞ S
Rt
Furthermore, for these ω, t → S f dB (ω) is continuous because by Theorem 34.5
Rt
the convergence of S φn dB (ω) is uniform on [0, T ].
whenever 0 ≤ t0 < · · · < tk ≤ t and U is a Borel set. Then the following lemma is
what is needed to consider certain Itô integrals.
It follows that for all D an inverse image of an open set and E of the above form
where V is open, P (D ∩ E) = P (D) P (E). It follows easily this holds for all D
−1
and E inverse images of Borel sets. If D ∈ Ht and E = (Bs − Bt ) (V ) then there
exists D1 an inverse image of a Borel set such that D1 ⊇ D and P (D1 \ D) = 0 so
P (D ∩ E) = P (D1 ∩ E)
= P (D1 ) P (E) = P (D) P (E) .
and so
¡ ¢ ¡ ¢
E Bs |HtZ = E Bs − Bt + Bt |HtZ
¡ ¢
= 0 + E Bt |HtZ = Bt .
Definition 35.40 Let Gt be a filtration for which Bt is a martingale and such that
for s > t, Bs − Bt is independent of Gt . For X product measurable in B × F and
Gt adapted define
µZ t ¶ Z t
σ (s, X) dB ≡ (σ (s, X))k dB
0 k 0
Let Gt = HtZ in the first case and let Gt = Ht in the second. Then there exists a
solution, X to the integral equation,
Z t Z t
X (t) = Z + b (s, X (s)) ds + σ (s, X (s)) dB a.e. ω
0 0
n
This solution satisfies X ∈ L2 ([0, T ] × Ω) .
The Itô integral on the right is well defined for all ω not in some set of measure
zero because Z is Gt adapted. Now also X1 is Gt adapted because both integrals in
the above yield Gt adapted functions of t by Theorem 34.8 on Page 925 and the Itô
integral yields B × F measurable function by Corollary 31.43 on Page 890 and the
convention mentioned after this corollary. Then
Z t Z t
¡ ¢ ¡ ¢
X2 (t) ≡ Z + b s, X1 (s) ds + σ s, X1 (s) dB.
0 0
Continue this way. Each iteration involves a set of measure zero. Take the union of
all these sets, N . Then for $\omega \notin N$
\[ X^{k+1}(t) \equiv Z + \int_0^t b\left(s, X^k(s)\right) ds + \int_0^t \sigma\left(s, X^k(s)\right) dB \tag{35.49} \]
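To see the factorial decay that drives the convergence of this iteration, it helps to run it in the simplest deterministic case. The sketch below assumes the hypothetical choices b (s, x) = x and σ = 0 with a constant initial value, so each iterate is a polynomial in t stored as a list of exact coefficients; the function names are illustrative and the starting iterate $X^0 = Z$ is an assumption.

```python
from fractions import Fraction


def integrate_poly(coeffs):
    """Coefficients of the antiderivative vanishing at 0 of sum_j coeffs[j] * t**j."""
    return [Fraction(0)] + [c / (j + 1) for j, c in enumerate(coeffs)]


def picard_step(coeffs, z):
    """One step X -> z + int_0^t X(s) ds for the drift b(s, x) = x and sigma = 0."""
    out = integrate_poly(coeffs)
    out[0] += z
    return out


def picard_iterates(z, k):
    """Return the k-th iterate, starting from the constant X^0 = z, as a coefficient list."""
    x = [Fraction(z)]
    for _ in range(k):
        x = picard_step(x, Fraction(z))
    return x


if __name__ == "__main__":
    for k in range(5):
        print(k, [str(c) for c in picard_iterates(1, k)])
```

In this special case the k-th iterate is the Taylor polynomial of degree k of $e^t$, so consecutive iterates differ by a single term with a factorial in the denominator, the same type of bound that makes the series of differences summable.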
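Now
\[ \int_\Omega \left| X^1(t_k) - Z \right|^2 dP \leq C_T \left( \int_0^T \int_\Omega \left( 1 + |Z|^2 \right) dP \, dt \right) \equiv C_Z < \infty. \]
Then it follows
\begin{align*}
\left( \int_\Omega \left| X^{k+1}(t) - X^k(t) \right|^2 dP \right)^{1/2}
&\leq \left( C_T^k C_Z \int_0^t \int_0^{t_1} \cdots \int_0^{t_{k-1}} dt_k \cdots dt_1 \right)^{1/2} \\
&\leq \left( C_T^k C_Z \frac{t^k}{k!} \right)^{1/2} \leq \left( C_T^k C_Z \frac{T^k}{k!} \right)^{1/2}.
\end{align*}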
Since $\sum_{k=0}^{\infty} \left( C_T^k C_Z \frac{T^k}{k!} \right)^{1/2} < \infty$, it follows $\{X^k\}$ converges in $C\left([0, T]; L^2(\Omega)^n\right)$
to a function, $\widetilde{X}$. It follows $\{X^k\}$ is also Cauchy in $L^2([0, T] \times \Omega)^n$. Therefore,
there exists X, B × F measurable and in $L^2([0, T] \times \Omega)^n$ such that upon taking
a suitable subsequence still denoted by k, $\{X^k\}$ converges to X pointwise and in
$L^2([0, T] \times \Omega)^n$. The function, $t \to \int_\Omega \left| X - X^k \right|^2 dP$ is Lebesgue measurable and
$t \to \int_\Omega \left| X - \widetilde{X} \right|^2 dP$ is the limit so it is also Lebesgue measurable. Also,
\[ \int_0^T \int_\Omega \left| X - \widetilde{X} \right|^2 dP \, dt \leq 2 \left( \int_0^T \int_\Omega \left| X - X^k \right|^2 dP \, dt + \int_0^T \int_\Omega \left| X^k - \widetilde{X} \right|^2 dP \, dt \right) \]
For a.e. ω, the right side of the above is a continuous function of t. This is true of
the Itô integral and it also follows for the deterministic integral because of the ob-
servation that for a.e. ω, s → X (s) (ω) is in L2 (0, T ) . For ω not in this exceptional
set of measure zero, define
Z t Z t
Y (t) (ω) ≡ Z + b (s, X (s) (ω)) ds + σ (s, X (s)) dB (ω) (35.51)
0 0
Thus Y (t) is adapted by Theorem 34.8 on Page 925 and Y is product measurable.
Also it follows from 35.50 that Y (t) = X e (t) in L2 (Ω)n and X (t) = Xe (t) for a.e. t
so Y (t) = X (t) a.e. t. It follows that
Z t Z t
Y (t) ≡ Z + b (s, Y (s)) ds + σ (s, Y (s)) dB (35.52)
0 0
n n
holds in L (Ω) for each t and so equality is also true in L2 ([0, T ] × Ω) . As before,
2
the right side is a continuous function of t for ω off a set of measure zero. Off a set
of measure zero, t → Y (t) (ω) is also continuous. This follows from the definition
of Y (t) in 35.51. Since both sides are product measurable, there exists a set of
measure zero, N such that for ω ∈ / N,
Z T¯ µ Z t Z t ¶¯2
¯ ¯
¯Y (t) − Z + b (s, Y (s)) ds + σ (s, Y (s)) dB ¯ dt = 0
¯ ¯
0 0 0
and the integrand is a continuous function. Therefore, 35.52 holds a.e. and both
sides are continuous for ω not in a suitable set of measure zero. This proves the
theorem.
Note there were two cases given for the initial condition in the above theorem.
The second is not very interesting. If Z is H0 measurable, then since B0 = x, a
constant, it follows H0 = {∅, Ω} so Z is a constant. However, if Z is a constant,
then it satisfies the first condition.
Not surprisingly, the solution to the above theorem is unique. This is stated as
the following corollary which is the main result.
Corollary 35.42 Let b and σ satisfy 35.46 - 35.48. Let Z be a random vector
which is independent of Ht for all t > 0 and suppose
Z
2
|Z| dP < ∞
Ω
Then there exists a unique $H_t^Z$ adapted solution, $X \in L^2([0, T] \times \Omega)^n$, to the integral
equation,
\[ X(t) = Z + \int_0^t b(s, X(s))\, ds + \int_0^t \sigma(s, X(s))\, dB \quad \text{a.e. } \omega \tag{35.53} \]
and by assumption both X and $\widetilde{X}$ are in $L^2([0, T] \times \Omega)^n$ so $t \to \left\| X(t) - \widetilde{X}(t) \right\|^2_{L^2(\Omega)^n}$
is in $L^1([0, T])$. By Gronwall's inequality, $\widetilde{X}(t) = X(t)$ in $L^2(\Omega)^n$ for all t. It follows
there exists a set of measure zero, $N_1$ such that for $\omega \notin N_1$,
\[ \int_0^T \left| \widetilde{X}(t) - X(t) \right|^2 dt = 0 \]
is
dX = b (t, X) dt + σ (t, X (t)) dB, X (0) = Z.
Obviously, one would want to do something like $\frac{dX}{X} = h(t)\,dB$. However, you
have to follow the rules. Let $g(x) = \ln(x)$ and $Y = g(X)$. Then by the Itô formula,
$$\begin{aligned}
dY &= \frac{1}{X}\,dX + \frac{1}{2}\left(\frac{-1}{X^2}\right) dX^2 \\
&= \frac{1}{X}\,h(t) X\,dB - \frac{1}{2}\,\frac{1}{X^2}\,h(t)^2 X^2\,dB^2 \\
&= h(t)\,dB - \frac{1}{2} h(t)^2\,dt
\end{aligned}$$
and also $Y(0) = 0$. Therefore, $Y(t) = \ln(X(t)) = \int_0^t h(s)\,dB - \frac{1}{2}\int_0^t h(s)^2\,ds$ and
so
$$X(t) = \exp\left( \int_0^t h(s)\,dB - \frac{1}{2}\int_0^t h(s)^2\,ds \right)$$
Note the extra term, $-\frac{1}{2}\int_0^t h(s)^2\,ds$.
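The role of the extra term can be seen numerically. The following is only a rough sketch, not part of the text's argument: it discretizes $dX = h(t)X\,dB$, $X(0) = 1$, with the Euler-Maruyama scheme (a technique swapped in here for illustration) and compares the result along the same simulated path with the closed form above. The function name `simulate`, the choice of $h$, the step count, and the seed are all assumptions made for the example.

```python
import math
import random


def simulate(h, T=1.0, n=10_000, seed=0):
    """Return (euler, closed_form, naive) values of X(T) for dX = h(t) X dB, X(0) = 1.

    euler       : Euler-Maruyama approximation along one simulated Brownian path
    closed_form : exp(int h dB - 1/2 int h^2 ds) evaluated on the same path
    naive       : exp(int h dB), i.e. the formula without the Ito correction
    """
    rng = random.Random(seed)
    dt = T / n
    x_euler = 1.0
    ito_sum = 0.0   # left-endpoint sum approximating the Ito integral of h dB
    h2_sum = 0.0    # Riemann sum approximating the integral of h^2 ds
    for k in range(n):
        t = k * dt
        dB = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over [t, t + dt]
        x_euler += h(t) * x_euler * dB
        ito_sum += h(t) * dB
        h2_sum += h(t) ** 2 * dt
    closed_form = math.exp(ito_sum - 0.5 * h2_sum)
    naive = math.exp(ito_sum)
    return x_euler, closed_form, naive


if __name__ == "__main__":
    euler, closed_form, naive = simulate(lambda t: 0.5 + 0.5 * t)
    print("Euler-Maruyama:", euler)
    print("closed form   :", closed_form)
    print("without -1/2 int h^2 ds:", naive)
```

On a fine grid the first two numbers should be close, while the third differs from them by the factor $\exp\left(\frac{1}{2}\int_0^t h(s)^2\,ds\right)$.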
In this case it is a lot like the above example but it has an extra f (t) Xdt. This
suggests something useful might be obtained by letting Y = ln (X) as was done
earlier. Thus
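$$\begin{aligned}
dY &= \frac{1}{X}\left( X f(t)\,dt + h(t) X\,dB \right) + \frac{1}{2}\left(\frac{-1}{X^2}\right)\left( X f(t)\,dt + h(t) X\,dB \right)^2 \\
&= \frac{1}{X}\left( X f(t)\,dt + h(t) X\,dB \right) + \frac{1}{2}\left(\frac{-1}{X^2}\right) h(t)^2 X^2\,dB^2 \\
&= \frac{1}{X}\left( X f(t)\,dt - \frac{1}{2} X h(t)^2\,dt + h(t) X\,dB \right) \\
&= \left( f(t) - \frac{1}{2} h(t)^2 \right) dt + h(t)\,dB
\end{aligned}$$
and so $\ln(X) = \int_0^t \left( f(s) - \frac{1}{2} h(s)^2 \right) ds + \int_0^t h(s)\,dB$ and so
$$X(t) = \exp\left( \int_0^t \left( f(s) - \frac{1}{2} h(s)^2 \right) ds + \int_0^t h(s)\,dB \right)$$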
The next example is a model for stock prices. Learn this model and get rich.
984 STOCHASTIC PROCESSES
dP = µP dt + σP dB
It is just a special case of the above model in which f (t) = µ and h (t) = σ.
Then from the above,
$$P(t) = \exp\left( t\mu - \frac{1}{2} t\sigma^2 + \sigma B_t \right)$$
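As a check, here is a short worked verification that this $P$ does satisfy $dP = \mu P\,dt + \sigma P\,dB$. It is a sketch which assumes the normalization $P(0) = 1$ implicit in the formula and uses the same formal rule $dB^2 = dt$ as in the computations above; the auxiliary symbol $Y$ is introduced only for this check.

```latex
% Let Y(t) = (\mu - \sigma^2/2) t + \sigma B_t, so that P = e^{Y} and
% dY = (\mu - \sigma^2/2)\,dt + \sigma\,dB, \qquad dY^2 = \sigma^2\,dt.
\begin{aligned}
dP &= e^{Y}\,dY + \tfrac{1}{2} e^{Y}\,dY^{2} \\
   &= P\left( \left(\mu - \tfrac{1}{2}\sigma^{2}\right) dt + \sigma\,dB \right)
      + \tfrac{1}{2} P \sigma^{2}\,dt \\
   &= \mu P\,dt + \sigma P\,dB .
\end{aligned}
```

The term $-\frac{1}{2}\sigma^2 t$ in the exponent is exactly what cancels the second order term in the Itô formula.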
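$$dX = \frac{-X}{1-t}\,dt + dB, \quad X(0) = 0.$$
Here the term $dB$ is not multiplied by $X$, so this is not a special case of the above
model and the substitution $Y = \ln(X)$ does not apply. Instead, let $Y = X/(1-t)$ for
$t < 1$. Since this is linear in $X$, there is no second order term in the Itô formula and
$$dY = \frac{X}{(1-t)^2}\,dt + \frac{1}{1-t}\,dX = \frac{1}{1-t}\,dB.$$
Since $Y(0) = 0$, the solution is
$$X(t) = (1-t)\int_0^t \frac{1}{1-s}\,dB, \quad 0 \le t < 1.$$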
Before doing another example I will give a simple lemma on integration by parts.
In this lemma B will denote m dimensional Brownian motion.
Lemma 35.47 Let $(t,\omega) \to g(t,\omega)$ be a $\mathcal{G}_t$ adapted measurable function such that
$$P\left( \int_0^t |g(s,\omega)|^2\,ds < \infty \right) = 1$$
where $B_t$ is a martingale with respect to the filtration $\mathcal{G}_t$ and the increments, $B_s - B_t$
for $s > t$ are independent of $\mathcal{G}_t$ so that the Itô integral, $\int_0^t g^T dB$ is defined. Suppose
also that $t \to g(t,\omega)$ is $C^1$ and $B_0 = 0$. Then
$$\int_0^t g^T(s,\omega)\,dB = g^T(t,\omega) B_t(\omega) - \int_0^t \frac{\partial g^T}{\partial t}(s,\omega) B(s)\,ds \quad \text{a.e.}$$
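Proof: Let $g_n(t) \equiv \sum_{k=0}^{n-1} g(t_k)\,\mathcal{X}_{[t_k,t_{k+1})}(t)$ where $t_k = k(t/n)$. Then by the
definition of the Itô integral,
$$\begin{aligned}
\int_0^t g^T dB &= \lim_{n\to\infty} \int_0^t g_n^T\,dB = \lim_{n\to\infty} \sum_{k=0}^{n-1} g^T(t_k)\left( B_{t_{k+1}} - B_{t_k} \right) \\
&= \lim_{n\to\infty} \left( \sum_{k=1}^{n} g^T(t_{k-1}) B_{t_k} - \sum_{k=0}^{n-1} g^T(t_k) B_{t_k} \right) \\
&= \lim_{n\to\infty} \left[ g^T(t_{n-1}) B_t - \sum_{k=1}^{n-1} \left( g^T(t_k) - g^T(t_{k-1}) \right) B_{t_k} \right] \\
&= g^T(t,\omega) B_t(\omega) - \int_0^t \frac{\partial g^T}{\partial t}(s,\omega) B(s,\omega)\,ds \quad \text{a.e. } \omega.
\end{aligned}$$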
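The middle steps of this proof are a purely algebraic summation by parts, valid for any path with $B_{t_0} = 0$. The following sketch checks that identity on one simulated scalar path; the helper name `both_sides`, the choice $g = \cos$, the grid size, and the seed are assumptions made for illustration only.

```python
import math
import random


def both_sides(g, T=1.0, n=1000, seed=1):
    """Evaluate both sides of the summation-by-parts identity on one discrete path.

    lhs = sum_{k=0}^{n-1} g(t_k) (B_{k+1} - B_k)
    rhs = g(t_{n-1}) B_n - sum_{k=1}^{n-1} (g(t_k) - g(t_{k-1})) B_k
    The two agree exactly (up to rounding) because B_0 = 0.
    """
    rng = random.Random(seed)
    dt = T / n
    t = [k * dt for k in range(n + 1)]
    B = [0.0]  # B_0 = 0, as assumed in the lemma
    for _ in range(n):
        B.append(B[-1] + rng.gauss(0.0, math.sqrt(dt)))
    lhs = sum(g(t[k]) * (B[k + 1] - B[k]) for k in range(n))
    rhs = g(t[n - 1]) * B[n] - sum(
        (g(t[k]) - g(t[k - 1])) * B[k] for k in range(1, n)
    )
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = both_sides(math.cos)
    print(lhs, rhs)
```

The passage to the limit, where the second sum becomes $\int_0^t \frac{\partial g^T}{\partial t} B\,ds$, is where the assumption that $g$ is $C^1$ is used; the code does not address that step.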
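$$\begin{aligned}
dY &= -Ae^{-At}X\,dt + e^{-At}I\,dX + \frac{1}{2} e^{-At}\,dX^T\,0\,dX \\
&= -Ae^{-At}X\,dt + e^{-At}I\left( (AX + h(t))\,dt + K\,dB \right) \\
&= e^{-At}h(t)\,dt + e^{-At}K\,dB.
\end{aligned}$$
and so
$$\begin{aligned}
X(t) &= e^{At}\left( X_0 + \int_0^t e^{-As}h(s)\,ds + e^{-At}KB(t) + A\int_0^t e^{-As}KB(s)\,ds \right) \\
&= e^{At}\left( X_0 + \int_0^t \left( e^{-As}h(s) + e^{-As}AKB(s) \right) ds + e^{-At}KB(t) \right).
\end{aligned} \tag{35.55}$$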
In this formula $e^{At}$ is the matrix, $M(t)$ which solves $M' = AM$, $M(0) = I$.
Note that formally differentiating the above equation gives
$$\begin{aligned}
X' &= Ae^{At}\left( X_0 + \int_0^t \left( e^{-As}h(s) + e^{-As}AKB(s) \right) ds + e^{-At}KB(t) \right) \\
&\quad + e^{At}\left( e^{-At}h(t) + e^{-At}AKB(t) - Ae^{-At}KB(t) + e^{-At}K\frac{dB}{dt} \right)
\end{aligned}$$
and so
$$\begin{aligned}
X' &= AX + h(t) + AKB(t) - AKB(t) + K\frac{dB}{dt} \\
&= AX + h(t) + K\frac{dB}{dt}.
\end{aligned}$$
Of course this is total nonsense because $B$ is known to not be differentiable. However,
multiplying by $dt$ gives
$$dX = (AX + h(t))\,dt + K\,dB$$
and the formula 35.55 shows $X(0) = X_0$. This was the original differential equation.
Note that it was not necessary to assume very much about X0 to write 35.55.
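Formula 35.55 involves only ordinary integrals of the Brownian path, so it can be evaluated directly on a simulated path. The sketch below does this in the scalar case and compares with an Euler-Maruyama discretization of $dX = (AX + h(t))\,dt + K\,dB$ driven by the same increments. Everything specific here (the name `compare`, the scalar values of $A$, $K$, $X_0$, the choice $h = \sin$, the grid, the seed, and the use of Euler-Maruyama) is an assumption made for illustration.

```python
import math
import random


def compare(a=-1.0, K=0.5, x0=1.0, h=math.sin, T=1.0, n=20_000, seed=2):
    """Return (euler, formula) for the scalar equation dX = (aX + h(t)) dt + K dB.

    euler   : Euler-Maruyama value of X(T)
    formula : e^{aT} (x0 + int_0^T e^{-as} (h(s) + a K B(s)) ds) + K B(T),
              which is 35.55 in the scalar case, with a left-endpoint Riemann sum
    """
    rng = random.Random(seed)
    dt = T / n
    x = x0          # Euler-Maruyama state
    B = 0.0         # current value of the Brownian path, B(0) = 0
    integral = 0.0  # running Riemann sum of e^{-as} (h(s) + a K B(s)) ds
    for k in range(n):
        s = k * dt
        dB = rng.gauss(0.0, math.sqrt(dt))
        integral += math.exp(-a * s) * (h(s) + a * K * B) * dt
        x += (a * x + h(s)) * dt + K * dB
        B += dB
    formula = math.exp(a * T) * (x0 + integral) + K * B
    return x, formula


if __name__ == "__main__":
    euler, formula = compare()
    print(euler, formula)
```

The two values should agree to within discretization error, which illustrates that 35.55 makes sense pathwise without any stochastic integral.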
Probability In Infinite Dimensions
I am following the book by Da Prato and Zabczyk for much of this material. [15].
λX (G) ≡ P (X (ω) ∈ G) .
To speak of the expected value, it is necessary that $X \in L^1(\Omega;\mathbb{R})$. Now the variance
is defined as
$$E\left( (X - E(X))^2 \right) \equiv \int_\Omega (X(\omega) - E(X))^2\,dP$$
and it is necessary that $X \in L^2(\Omega;\mathbb{R})$.
What about random vectors where $X$ has values in $\mathbb{R}^p$? In this case, the expected
value would be a vector in $\mathbb{R}^p$ given by
$$E(X) \equiv \int_\Omega X(\omega)\,dP$$
and the thing which takes the place of the variance is the covariance. This is a
linear transformation mapping $\mathbb{R}^p$ to $\mathbb{R}^p$ just as the variance could be considered a
linear transformation mapping $\mathbb{R}$ to $\mathbb{R}$. The covariance is defined as
$$E\left( (X - E(X))(X - E(X))^* \right)$$
988 PROBABILITY IN INFINITE DIMENSIONS
u ⊗ v (w) ≡ (w, v) u.
If there are two random vectors, $X$ and $Y$, $X$ having values in $\mathbb{R}^p$ and $Y$ having
values in $\mathbb{R}^q$, the correlation is the linear transformation defined by
or in terms of matrices,
$$E\left( (X - E(X))(Y - E(Y))^* \right)$$
This all makes sense provided $X$ and $Y$ are in $L^2(\Omega;\mathbb{R}^r)$ where $r = p$ or $q$ because
you can simply integrate the entries of the matrix which results when you write
$(X - E(X))(Y - E(Y))^*$.
What does it all mean in the case where X ∈ L2 (Ω; H) and Y ∈ L2 (Ω; G) for
H, G separable Hilbert spaces? In this case there is no “matrix”. This involves the
notion of a Hilbert Schmidt operator.
Definition 36.1 Let $H$ and $G$ be two separable Hilbert spaces and let $T$ map $H$ to
$G$ be continuous and linear. Then $T$ is called a Hilbert Schmidt operator if there
exists some orthonormal basis for $H$, $\{e_j\}$ such that
$$\sum_j \|T e_j\|^2 < \infty.$$
Y ⊗ X ∈ L2 (H, G) ,
and
||Y ⊗ X||L2 = ||X||H ||Y ||G . (36.2)
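In finite dimensions these objects are concrete. The sketch below takes $H = \mathbb{R}^2$ and $G = \mathbb{R}^3$ as stand-ins for the separable Hilbert spaces, represents $Y \otimes X$ through the rule $u \otimes v\,(w) \equiv (w,v)u$, and computes the Hilbert Schmidt norm from the defining sum over an orthonormal basis. The function names and the specific vectors are assumptions chosen for the example.

```python
import math


def tensor_apply(u, v, w):
    """Apply the rank one operator u (x) v to w, i.e. return (w, v) u."""
    inner = sum(wi * vi for wi, vi in zip(w, v))
    return [inner * ui for ui in u]


def hilbert_schmidt_norm(T, dim):
    """Return sqrt(sum_j ||T e_j||^2) over the standard basis e_j of R^dim."""
    total = 0.0
    for j in range(dim):
        e_j = [1.0 if i == j else 0.0 for i in range(dim)]
        total += sum(c * c for c in T(e_j))
    return math.sqrt(total)


def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(c * c for c in v))


if __name__ == "__main__":
    X = [3.0, 4.0]        # element of H = R^2, ||X|| = 5
    Y = [1.0, 2.0, 2.0]   # element of G = R^3, ||Y|| = 3
    hs = hilbert_schmidt_norm(lambda w: tensor_apply(Y, X, w), dim=2)
    print(hs, norm(X) * norm(Y))
```

Here $\sum_j \|(e_j, X) Y\|^2 = \|X\|^2 \|Y\|^2$, which is the finite dimensional case of 36.2.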
36.1. EXPECTED VALUE COVARIANCE AND CORRELATION 989
Now if $X \in L^1(\Omega, H)$,
$$E(X) \equiv \int_\Omega X\,dP.$$
Proof: It suffices to verify the claim about cor (Y, X). First consider the issue
of measurability. I need to verify that ω → (Y (ω) − E (Y )) ⊗ (X (ω) − E (X)) is
measurable. Since L2 (H, G) is a separable Hilbert space, it suffices to use the Pettis
theorem and verify
36.2 Independence
Recall that for $X$ a random variable, $\sigma(X)$ is the smallest $\sigma$ algebra containing all
the sets of the form $X^{-1}(F)$ where $F$ is Borel. Since such sets, $X^{-1}(F)$ for $F$ Borel
form a $\sigma$ algebra it follows $\sigma(X) = \left\{ X^{-1}(F) : F \text{ is Borel} \right\}$.
Definition 36.4 Let $(\Omega, \mathcal{F}, P)$ be a probability space. A finite set of random vectors,
$\{X_k\}_{k=1}^n$ is independent if whenever $F_k \in \sigma(X_k)$,
$$P\left( \cap_{k=1}^n F_k \right) = \prod_{k=1}^n P(F_k).$$
More generally, if $\{\mathcal{F}_j\}_{j\in J}$ are $\sigma$ algebras, they are said to be independent if whenever
$I \subseteq J$ is a finite set of indices and $A_i \in \mathcal{F}_i$,
$$P\left( \cap_{i\in I} A_i \right) = \prod_{i\in I} P(A_i).$$
Lemma 36.5 If $\{X_k\}_{k=1}^r$ are independent and if $g_k$ is a Borel measurable function,
then $\{g_k(X_k)\}_{k=1}^r$ is also independent. Furthermore, if the random variables have
values in $\mathbb{R}$ and they are all bounded, then
$$E\left( \prod_{i=1}^r X_i \right) = \prod_{i=1}^r E(X_i).$$
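Both conclusions of the lemma can be checked exactly on a small finite probability space. In the sketch below the space is two fair dice with the uniform measure, $X_1$ is the first die, $X_2$ is the parity of the second, and $g(x) = x^2$ plays the role of a Borel function. These choices, and the helper name `expect`, are assumptions made for illustration; exact rational arithmetic is used so the equality is not obscured by rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of die rolls, each with probability 1/36.
omega = list(product(range(1, 7), repeat=2))
prob = Fraction(1, 36)


def expect(f):
    """Expected value of a function f defined on the sample space."""
    return sum(f(w) * prob for w in omega)


def X1(w):
    """First die; depends only on the first coordinate."""
    return w[0]


def X2(w):
    """Parity of the second die; depends only on the second coordinate."""
    return w[1] % 2


def g(x):
    """A Borel function applied to X1."""
    return x * x


if __name__ == "__main__":
    lhs = expect(lambda w: g(X1(w)) * X2(w))
    rhs = expect(lambda w: g(X1(w))) * expect(X2)
    print(lhs, rhs)
```

Since $X_1$ and $X_2$ depend on different coordinates of a product measure they are independent, so $g(X_1)$ and $X_2$ are too, and the two printed values coincide.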
Proof: First consider the claim about $\{g_k(X_k)\}_{k=1}^r$. Letting $O$ be an open set
in $\mathbb{R}$,
$$(g_k \circ X_k)^{-1}(O) = X_k^{-1}\left( g_k^{-1}(O) \right) = X_k^{-1}(\text{Borel set}) \in \sigma(X_k).$$
It follows $(g_k \circ X_k)^{-1}(E)$ is in $\sigma(X_k)$ whenever $E$ is Borel. Thus $\sigma(g_k \circ X_k) \subseteq
\sigma(X_k)$ and this proves the first part of the lemma.
Now let $\{s_n^i\}_{n=1}^\infty$ be a bounded sequence of simple functions measurable in $\sigma(X_i)$
which converges to $X_i$ uniformly. (Since $X_i$ is bounded, such a sequence exists by
breaking $X_i$ into positive and negative parts and using Theorem 8.27 on Page 190.)
Say
$$s_n^i(\omega) = \sum_{k=1}^{m_n} c_k^{n,i}\,\mathcal{X}_{E_k^{n,i}}(\omega)$$
where the $E_k$ are disjoint elements of $\sigma(X_i)$ and some might be empty. This is for
convenience in keeping the same index on the top of the sum. Then since all the
random variables are bounded, there is no problem about existence of any of the
36.2. INDEPENDENCE 991
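$$\begin{aligned}
&= \lim_{n\to\infty} \int_\Omega \sum_{k_1,k_2,\cdots,k_r} c_{k_1}^{n,1} c_{k_2}^{n,2} \cdots c_{k_r}^{n,r}\,\mathcal{X}_{E_{k_1}^{n,1}} \mathcal{X}_{E_{k_2}^{n,2}} \cdots \mathcal{X}_{E_{k_r}^{n,r}}\,dP \\
&= \lim_{n\to\infty} \sum_{k_1,k_2,\cdots,k_r} \int_\Omega c_{k_1}^{n,1} c_{k_2}^{n,2} \cdots c_{k_r}^{n,r}\,\mathcal{X}_{E_{k_1}^{n,1}} \mathcal{X}_{E_{k_2}^{n,2}} \cdots \mathcal{X}_{E_{k_r}^{n,r}}\,dP \\
&= \lim_{n\to\infty} \sum_{k_1,k_2,\cdots,k_r} c_{k_1}^{n,1} c_{k_2}^{n,2} \cdots c_{k_r}^{n,r} \prod_{i=1}^r P\left( E_{k_i}^{n,i} \right) \\
&= \lim_{n\to\infty} \prod_{i=1}^r \int_\Omega s_n^i(\omega)\,dP = \prod_{i=1}^r E(X_i).
\end{aligned}$$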
Proof: I need to verify that for all $n \in \mathbb{N}$, if $\{j_1, j_2, \cdots, j_n\} \subseteq I$ and $A_{j_k} \in \mathcal{F}_{j_k}$,
then
$$P\left( \cap_{k=1}^n A_{j_k} \right) = \prod_{k=1}^n P(A_{j_k}).$$
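Pick $\left( A_{j_1}, \cdots, A_{j_{n-1}} \right) \in \mathcal{K}_{j_1} \times \cdots \times \mathcal{K}_{j_{n-1}}$ and let
$$\mathcal{G}_{(A_{j_1}\cdots,A_{j_{n-1}})} \equiv \left\{ A_{j_n} \in \mathcal{F}_{j_n} : P\left( \cap_{k=1}^n A_{j_k} \right) = \prod_{k=1}^n P(A_{j_k}) \right\}$$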
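$$\begin{aligned}
\prod_{k=1}^{n-1} P(A_{j_k}) &= P\left( \cap_{k=1}^{n-1} A_{j_k} \right) \\
&= P\left( \left( \cap_{k=1}^{n-1} A_{j_k} \cap A_{j_n}^C \right) \cup \left( \cap_{k=1}^{n-1} A_{j_k} \cap A_{j_n} \right) \right) \\
&= P\left( \cap_{k=1}^{n-1} A_{j_k} \cap A_{j_n}^C \right) + P\left( \cap_{k=1}^{n-1} A_{j_k} \cap A_{j_n} \right) \\
&= P\left( \cap_{k=1}^{n-1} A_{j_k} \cap A_{j_n}^C \right) + \prod_{k=1}^{n} P(A_{j_k})
\end{aligned}$$
and so
$$\begin{aligned}
P\left( \cap_{k=1}^{n-1} A_{j_k} \cap A_{j_n}^C \right) &= \prod_{k=1}^{n-1} P(A_{j_k}) \left( 1 - P(A_{j_n}) \right) \\
&= \prod_{k=1}^{n-1} P(A_{j_k})\,P\left( A_{j_n}^C \right)
\end{aligned}$$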
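It was just shown $\mathcal{G}_{(A_{j_1}\cdots,A_{j_{n-2}})} \supseteq \mathcal{K}_{j_{n-1}}$. Also by similar reasoning to the above, it
follows $\mathcal{G}_{(A_{j_1}\cdots,A_{j_{n-2}})}$ satisfies the conditions needed to apply Lemma 9.72 on Page
257 and so whenever $A_{j_n}, A_{j_{n-1}}$ are in $\mathcal{F}_{j_n}$ and $\mathcal{F}_{j_{n-1}}$ respectively and
$$\left( A_{j_1}, \cdots, A_{j_{n-2}} \right) \in \mathcal{K}_{j_1} \times \cdots \times \mathcal{K}_{j_{n-2}},$$
it follows $P\left( \cap_{k=1}^n A_{j_k} \right) = \prod_{k=1}^n P(A_{j_k})$. Continue this way to obtain the desired
result. This proves the lemma.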
What is a useful π system for B (E) where E is a Banach space?
Recall the fundamental lemma used to prove the Pettis theorem. It was proved
on Page 579 but here I want to show that in addition, the set D0 can be taken as a
subset of a given dense subspace, M of E 0 . Thus I will present next a generalization
of that important lemma. You might consider whether the following lemma can be
generalized even more.
Lemma 36.8 If $E$ is a separable Banach space with $B'$ the closed unit ball in $E'$,
and if $M$ is a dense subspace of $E'$, then there exists a sequence $\{f_n\}_{n=1}^\infty \equiv D' \subseteq
B' \cap M$ with the property that for every $x \in E$,
$$\|x\| = \sup_{f \in D'} |f(x)|$$
that $\|x\| < |f_{k_n}(x)| + \varepsilon$. By a standard exercise in the Hahn Banach theorem, there
exists $f \in B'$ such that $f(x) = \|x\|$. Next choose $a_n \in D$ such that $\|x - a_n\|_E <
\varepsilon/4$. Since $M$ is dense, there exists $g \in M \cap B'$ such that $|g(a_n) - f(a_n)| < \varepsilon/4$.
Finally, there exists $f_{k_n} \in D'$ such that $|f_{k_n}(a_n) - g(a_n)| < \varepsilon/4$. Then
It follows $\|x\| < |f_{k_n}(x)| + \varepsilon$ and this proves the lemma because for every $f \in D'$
you can simply include $-f$.
Lemma 36.9 Let E be a separable real Banach space. Sets of the form
$$\{x \in E : x_i^*(x) \le \alpha_i,\ i = 1, 2, \cdots, m\}$$
Proof: The sets described are obviously a $\pi$ system. I want to show $\sigma(\mathcal{K})$
contains the closed balls because then $\sigma(\mathcal{K})$ contains the open balls and hence the
open sets and the result will follow. Let $D' \subseteq B' \cap M$ be described in Lemma 36.8.
Then
$$\begin{aligned}
\{x \in E : \|x - a\| \le r\} &= \left\{ x \in E : \sup_{f \in D'} |f(x - a)| \le r \right\} \\
&= \left\{ x \in E : \sup_{f \in D'} |f(x) - f(a)| \le r \right\}
\end{aligned}$$
Since the Banach space is separable, it is completely separable and so every open
set is the countable union of balls. This shows the open sets are in σ (K) and
so σ (K) ⊇ B (E) . However, all the sets in the π system are closed hence Borel
because they are inverse images of closed sets. Therefore, σ (K) ⊆ B (E) and so
σ (K) = B (E). This proves the lemma.
Next suppose you have some random variables having values in a separable
Banach space, E, {Xi }i∈I . How can you tell if they are independent? To show they
are independent, you need to verify that
$$P\left( \cap_{k=1}^n X_{i_k}^{-1}(F_{i_k}) \right) = \prod_{k=1}^n P\left( X_{i_k}^{-1}(F_{i_k}) \right)$$
whenever the $F_{i_k}$ are Borel sets in $E$. It is desirable to find a way to do this easily.
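Then $\mathcal{G} \supseteq \mathcal{K}$. If $A \in \mathcal{G}$, then
$$X^{-1}(A) \in \sigma\left( X^{-1}(\mathcal{K}) \right)$$
and so
$$\left( X^{-1}(A) \right)^C = X^{-1}\left( A^C \right) \in \sigma\left( X^{-1}(\mathcal{K}) \right)$$
because $\sigma\left( X^{-1}(\mathcal{K}) \right)$ is a $\sigma$ algebra. Hence $A^C \in \mathcal{G}$. Finally suppose $\{A_i\}$ is a
sequence of disjoint sets of $\mathcal{G}$. Then
$$X^{-1}\left( \cup_{i=1}^\infty A_i \right) = \cup_{i=1}^\infty X^{-1}(A_i) \in \sigma\left( X^{-1}(\mathcal{K}) \right)$$
again because $\sigma\left( X^{-1}(\mathcal{K}) \right)$ is a $\sigma$ algebra. It follows from Lemma 9.72 on Page 257
that $\mathcal{G} \supseteq \sigma(\mathcal{K})$ and this shows that whenever $A \in \sigma(\mathcal{K})$, $X^{-1}(A) \in \sigma\left( X^{-1}(\mathcal{K}) \right)$.
Thus $X^{-1}(\sigma(\mathcal{K})) \subseteq \sigma\left( X^{-1}(\mathcal{K}) \right)$ and this proves the lemma.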
With this lemma, here is the desired result about independent random variables.
Essentially, you can reduce to the case of random vectors having values in Rn .
Theorem 36.11 The random variables, $\{X_i\}_{i\in I}$ are independent if whenever
$$\{i_1, \cdots, i_n\} \subseteq I,$$
$m_{i_1}, \cdots, m_{i_n}$ are positive integers, and $g_{m_{i_1}}, \cdots, g_{m_{i_n}}$ are respectively in
$(M)^{m_{i_1}}, \cdots, (M)^{m_{i_n}}$ for $M$ a dense subspace of $E'$, $\left\{ g_{m_{i_j}} \circ X_{i_j} \right\}_{j=1}^n$ are independent random
vectors having values in $\mathbb{R}^{m_{i_1}}, \cdots, \mathbb{R}^{m_{i_n}}$ respectively.
as described in Lemma 36.9. Then as proved in this lemma, σ (K) = B (E). Then
the random vectors are independent if whenever
{i1 , · · ·, in } ⊆ I
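By Lemma 9.72 on Page 257 if $\mathcal{K}_{i_j}$ is a $\pi$ system contained in $\sigma\left( X_{i_j} \right)$ such that
$\sigma\left( \mathcal{K}_{i_j} \right) = \sigma\left( X_{i_j} \right)$, then it suffices to check only the case where the $A_{i_j}$ is in $\mathcal{K}_{i_j}$.
So what will serve for such a collection of $\pi$ systems? Let
$$\mathcal{K}_{i_j} \equiv X_{i_j}^{-1}(\mathcal{K}) \equiv \left\{ X_{i_j}^{-1}(A) : A \in \mathcal{K} \right\}.$$
This is clearly a $\pi$ system contained in $\sigma\left( X_{i_j} \right)$ and by Lemma 36.10
$$\sigma\left( \mathcal{K}_{i_j} \right) = \sigma\left( X_{i_j}^{-1}(\mathcal{K}) \right) = X_{i_j}^{-1}(\sigma(\mathcal{K})) \equiv \sigma\left( X_{i_j} \right).$$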
Theorem 36.14 Let $E$ be a separable Banach space and let $X \in L^1(\Omega; E, \mathcal{F})$ where
$X$ is measurable with respect to $\mathcal{F}$. Then there exists a unique $Z \in L^1(\Omega; E, \mathcal{G})$ such
that for all $A \in \mathcal{G}$,
$$\int_A X\,dP = \int_A Z\,dP$$
If $P(A) > 0$, then for some $k$, $P(A_k) > 0$ because the balls $B_k$ cover the set
$\{x : \|x\| > \delta\}$. Then
$$\begin{aligned}
\frac{1}{2}\|a_k\| P(A_k) &\ge \int_{A_k} \|Z' - Z + a_k\|\,dP \\
&\ge \left\| \int_{A_k} (Z' - Z + a_k)\,dP \right\| \\
&= \left\| \int_{A_k} a_k\,dP \right\| = \|a_k\| P(A_k)
\end{aligned}$$
and
$$\lim_{n\to\infty} \int_\Omega \|X(\omega) - X_n(\omega)\|\,dP = 0. \tag{36.6}$$
Now let $\{X_n\}$ be the simple functions just defined and let
$$X_n(\omega) = \sum_{k=1}^m x_k\,\mathcal{X}_{F_k}(\omega)$$
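Thus, if $A \in \mathcal{G}$,
$$\begin{aligned}
\int_A Z_n\,dP &= \sum_{k=1}^m x_k \int_A E\left( \mathcal{X}_{F_k} | \mathcal{G} \right) dP \\
&= \sum_{k=1}^m x_k \int_A \mathcal{X}_{F_k}\,dP \\
&= \sum_{k=1}^m x_k P(F_k \cap A) = \int_A X_n\,dP
\end{aligned} \tag{36.7}$$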
Similarly,
E (||Zn − Zm ||) ≤ E (||Xn − Xm ||)
and this last term converges to 0 as $n, m \to \infty$ by the properties of the $X_n$. Therefore,
$\{Z_n\}$ is a Cauchy sequence in $L^1(\Omega; E; \mathcal{G})$. It follows it converges to $Z$ in
36.3. CONDITIONAL EXPECTATION 999
It remains to verify ||E (X|G)|| ≤ E (||X|| |G) . This follows similar to the above.
Letting Zn and Z have the same meaning as above and A ∈ G,
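$$\begin{aligned}
E\left( \mathcal{X}_A \|Z_n\| \right) &= \int_A \left\| \sum_{k=1}^m x_k E\left( \mathcal{X}_{F_k} | \mathcal{G} \right) \right\| dP \\
&\le \sum_{k=1}^m \int_A \|x_k\|\,E\left( \mathcal{X}_{F_k} | \mathcal{G} \right) dP \\
&= \sum_{k=1}^m \int_A \|x_k\|\,\mathcal{X}_{F_k}\,dP \\
&= \int_A \sum_{k=1}^m \|x_k\|\,\mathcal{X}_{F_k}\,dP
\end{aligned}$$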
Therefore,
$$\begin{aligned}
\int_A \|Z\|\,dP &= \int \mathcal{X}_A \|Z\|\,dP \\
&= \lim_{n\to\infty} \int \mathcal{X}_A \|Z_n\|\,dP
\end{aligned}$$
which shows
||E (X|G)|| ≤ E (||X|| |G)
and then you would use the fact that reflexive separable Banach spaces have the
Radon Nikodym property to obtain Z ∈ L1 (Ω; E, G) such that
$$\nu(F) = \int_F X\,dP = \int_F Z\,dP.$$
Definition 36.16 A measure, µ defined on B (E) will be called inner regular if for
all F ∈ B (E) ,
A measure, µ defined on B (E) will be called outer regular if for all F ∈ B (E) ,
Proof: First note every open set is the countable union of closed sets and every
closed set is the countable intersection of open sets. Here is why. Let $V$ be an open
set and let
$$K_k \equiv \left\{ x \in V : \text{dist}\left( x, V^C \right) \ge 1/k \right\}.$$
36.4. PROBABILITY MEASURES AND TIGHTNESS 1001
Then clearly the union of the Kk equals V. Next, for K closed let
Clearly the intersection of the Vk equals K. Therefore, letting V denote an open set
and K a closed set,
Then F contains the open sets. I want to show F is a σ algebra and then it will
follow F = B (E).
First I will show $\mathcal{F}$ is closed with respect to complements. Let $F \in \mathcal{F}$. Then
since $\mu$ is finite and $F$ is inner regular, there exists $K \subseteq F$ such that $\mu(F \setminus K) < \varepsilon$.
But $K^C \setminus F^C = F \setminus K$ and so $\mu\left( K^C \setminus F^C \right) < \varepsilon$ showing that $F^C$ is outer regular.
I have just approximated the measure of $F^C$ with the measure of $K^C$, an open set
containing $F^C$. A similar argument works to show $F^C$ is inner regular. You start
with $V \supseteq F$ such that $\mu(V \setminus F) < \varepsilon$, note $F^C \setminus V^C = V \setminus F$, and then conclude
$\mu\left( F^C \setminus V^C \right) < \varepsilon$, thus approximating $F^C$ with the closed subset, $V^C$.
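Next I will show $\mathcal{F}$ is closed with respect to taking countable unions. Let $\{F_k\}$
be a sequence of sets in $\mathcal{F}$. Then $\mu$ is inner regular on each of these so there exist
$\{K_k\}$ such that $K_k \subseteq F_k$ and $\mu(F_k \setminus K_k) < \varepsilon/2^{k+1}$. First choose $m$ large enough
that
$$\mu\left( \left( \cup_{k=1}^\infty F_k \right) \setminus \left( \cup_{k=1}^m F_k \right) \right) < \frac{\varepsilon}{2}.$$
Then
$$\mu\left( \left( \cup_{k=1}^m F_k \right) \setminus \left( \cup_{k=1}^m K_k \right) \right) \le \sum_{k=1}^m \frac{\varepsilon}{2^{k+1}} < \frac{\varepsilon}{2}$$
and so
$$\begin{aligned}
\mu\left( \left( \cup_{k=1}^\infty F_k \right) \setminus \left( \cup_{k=1}^m K_k \right) \right) &\le \mu\left( \left( \cup_{k=1}^\infty F_k \right) \setminus \left( \cup_{k=1}^m F_k \right) \right) + \mu\left( \left( \cup_{k=1}^m F_k \right) \setminus \left( \cup_{k=1}^m K_k \right) \right) \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\end{aligned}$$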
Lemma 36.18 Let µ be a finite measure on B (E) , the Borel sets of E, a separable
complete metric space. Then if C is a closed set,
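Now let $K = C \cap \left( \cap_{n=1}^\infty C_n \right)$. Then $K$ is a subset of $C_n$ for each $n$ and so for each
$\varepsilon > 0$ there exists an $\varepsilon$ net for $K$ since $C_n$ has a $1/n$ net, namely $a_1, \cdots, a_{m_n}$. Since
$K$ is closed, it is complete and so it is also compact. Now
$$\mu(C \setminus K) = \mu\left( \cup_{n=1}^\infty (C \setminus C_n) \right) < \sum_{n=1}^\infty \frac{\varepsilon}{2^n} = \varepsilon.$$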
Thus µ (C) can be approximated by µ (K) for K a compact subset of C. This proves
the lemma.
This shows that for a finite measure on the Borel sets of a separable metric
space, the above definition of regular coincides with the earlier one.
Now here is a definition of what it means for a set of measures to be tight.
Definition 36.19 Let Λ be a set of probability measures defined on the Borel sets
of a topological space. Then Λ is “tight” if for all ε > 0 there exists a compact set,
Kε such that
$$\mu\left( [x \notin K_\varepsilon] \right) < \varepsilon$$
for all µ ∈ Λ.
Lemma 36.18 implies a single probability measure on the Borel sets of a separable
metric space is tight. The proof of that lemma generalizes slightly to give a simple
criterion for a set of measures to be tight.
Lemma 36.20 Let $E$ be a separable complete metric space and let $\Lambda$ be a set of
Borel probability measures. Then $\Lambda$ is tight if and only if for every $\varepsilon > 0$ and $r > 0$
there exists a finite collection of balls, $\{B(a_i, r)\}_{i=1}^m$ such that
$$\mu\left( \overline{\cup_{i=1}^m B(a_i, r)} \right) > 1 - \varepsilon$$
for every $\mu \in \Lambda$.
µ (Kε ) > 1 − ε
for all µ ∈ Λ. Then consider the open cover, {B (x, r) : x ∈ Kε } . Finitely many of
these cover Kε and this yields the above condition.
Now suppose the above condition and let
$$C_n \equiv \overline{\cup_{i=1}^{m_n} B\left( a_i^n, 1/n \right)}$$
Theorem 36.21 Let H be a compact metric space. Then there exists a compact
subset of [0, 1] , K and a continuous function, θ which maps K onto H.
disjoint closed intervals in $[0,1]$ each of length no longer than $2^{-m_i}$ which have the
property that $I_j^i$ is contained in $I_k^{i-1}$ for some $k$. Letting $K_i \equiv \cup_{j=1}^{m_i} I_j^i$, it follows
$K_i$ is a sequence of nested compact sets. Let $K = \cap_{i=1}^\infty K_i$. Then each $x \in K$ is
the intersection of a unique sequence of these closed intervals, $\left\{ I_{j_k}^k \right\}_{k=1}^\infty$. Define
$\theta x \equiv \cap_{k=1}^\infty H_{j_k}^k$. Since the diameters of the $H_j^i$ converge to 0 as $i \to \infty$, this function
is well defined. It is continuous because if $x_n \to x$, then ultimately $x_n$ and $x$ are
both in $I_{j_k}^k$, the $k^{th}$ closed interval in the sequence whose intersection is $x$. Hence,
$d(\theta x_n, \theta x) \le \text{diameter}\left( H_{j_k}^k \right) \le 1/k$. To see the map is onto, let $h \in H$. Then
from the construction, there exists a sequence $\left\{ H_{j_k}^k \right\}_{k=1}^\infty$ of the above sets whose
intersection equals $h$. Then $\theta\left( \cap_{k=1}^\infty I_{j_k}^k \right) = h$. This proves the theorem.
Note θ is maybe not one to one.
As an important corollary, it follows that the space of continuous functions defined on
any compact metric space is separable.
Corollary 36.22 Let H be a compact metric space and let C (H) denote the con-
tinuous functions defined on H with the usual norm,
||f ||∞ ≡ max {|f (x)| : x ∈ H}
Then C (H) is separable.
Proof: The proof is by contradiction. Suppose $C(H)$ is not separable. Let
$\mathcal{H}_k$ denote a maximal collection of functions of $C(H)$ with the property that if
$f, g \in \mathcal{H}_k$ are distinct, then $\|f - g\|_\infty \ge 1/k$. The existence of such a maximal collection of
functions is a consequence of a simple use of the Hausdorff maximality theorem.
Then $\cup_{k=1}^\infty \mathcal{H}_k$ is dense. Therefore, it cannot be countable by the assumption that
$C(H)$ is not separable. It follows that for some $k$, $\mathcal{H}_k$ is uncountable. Now by
Theorem 36.21 there exists a continuous function, $\theta$ defined on a compact subset,
$K$ of $[0,1]$ which maps $K$ onto $H$. Now consider the functions defined on $K$
$$\mathcal{G}_k \equiv \{f \circ \theta : f \in \mathcal{H}_k\}.$$
Then $\mathcal{G}_k$ is an uncountable set of continuous functions defined on $K$ with the property
that the distance between any two of them is at least as large as $1/k$. This
contradicts separability of $C(K)$ which follows from the Weierstrass approximation
theorem in which the countable dense set of functions is the restrictions of polynomials
that involve only rational coefficients. This proves the corollary.
Now here is Prokhorov's theorem.
Theorem 36.23 Let $\Lambda = \{\mu_n\}_{n=1}^\infty$ be a sequence of probability measures defined
on $\mathcal{B}(E)$ where $E$ is a separable Banach space. If $\Lambda$ is tight then there exists a
probability measure, $\lambda$ and a subsequence of $\{\mu_n\}_{n=1}^\infty$, still denoted by $\{\mu_n\}_{n=1}^\infty$
such that whenever $\phi$ is a continuous bounded complex valued function defined on
$E$,
$$\lim_{n\to\infty} \int \phi\,d\mu_n = \int \phi\,d\lambda.$$
and so the restrictions of the measures of $\Lambda$ to $K_n$ are contained in the unit ball of
$C(K_n)'$. Recall from the Riesz representation theorem, the dual space of $C(K_n)$
is a space of complex Borel measures. Theorem 13.37 on Page 356 implies the unit
ball of $C(K_n)'$ is weak $*$ sequentially compact. This follows from the observation
that $C(K_n)$ is separable which is proved in Corollary 36.22 and leads to the fact
that the unit ball in $C(K_n)'$ is actually metrizable by Theorem 13.37 on Page 356.
Therefore, there exists a subsequence of $\Lambda$, $\{\mu_{1k}\}$ such that their restrictions to $K_1$
converge weak $*$ to a measure, $\lambda_1 \in C(K_1)'$. That is, for every $\phi \in C(K_1)$,
$$\lim_{k\to\infty} \int_{K_1} \phi\,d\mu_{1k} = \int_{K_1} \phi\,d\lambda_1$$
By the same reasoning, there exists a further subsequence $\{\mu_{2k}\}$ such that the
restrictions of these measures to $K_2$ converge weak $*$ to a measure $\lambda_2 \in C(K_2)'$
etc. Continuing this way,
$$\begin{aligned}
\mu_{11}, \mu_{12}, \mu_{13}, \cdots &\to \text{weak } * \text{ in } C(K_1)' \\
\mu_{21}, \mu_{22}, \mu_{23}, \cdots &\to \text{weak } * \text{ in } C(K_2)' \\
\mu_{31}, \mu_{32}, \mu_{33}, \cdots &\to \text{weak } * \text{ in } C(K_3)' \\
&\ \ \vdots
\end{aligned}$$
Here the $j^{th}$ sequence is a subsequence of the $(j-1)^{th}$. Let $\lambda_n$ denote the measure
in $C(K_n)'$ to which the sequence $\{\mu_{nk}\}_{k=1}^\infty$ converges weak $*$. Let $\{\mu_n\} \equiv \{\mu_{nn}\}$,
the diagonal sequence. Thus this sequence is ultimately a subsequence of every one
of the above sequences and so $\mu_n$ converges weak $*$ in $C(K_m)'$ to $\lambda_m$ for each $m$.
Claim: For p > n, the restriction of λp to the Borel sets of Kn equals λn .
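Proof of claim: Let $H$ be a compact subset of $K_n$. Then there are sets, $V_l$ open
in $K_n$ which are decreasing and whose intersection equals $H$. This follows because
this is a metric space. Then let $H \prec \phi_l \prec V_l$. It follows
$$\begin{aligned}
\lambda_n(V_l) &\ge \int_{K_n} \phi_l\,d\lambda_n = \lim_{k\to\infty} \int_{K_n} \phi_l\,d\mu_k \\
&= \lim_{k\to\infty} \int_{K_p} \phi_l\,d\mu_k = \int_{K_p} \phi_l\,d\lambda_p \ge \lambda_p(H).
\end{aligned}$$
Now considering the ends of this inequality, let $l \to \infty$ and pass to the limit to
conclude
$$\lambda_n(H) \ge \lambda_p(H).$$
Similarly,
$$\begin{aligned}
\lambda_n(H) &\le \int_{K_n} \phi_l\,d\lambda_n = \lim_{k\to\infty} \int_{K_n} \phi_l\,d\mu_k \\
&= \lim_{k\to\infty} \int_{K_p} \phi_l\,d\mu_k = \int_{K_p} \phi_l\,d\lambda_p \le \lambda_p(V_l).
\end{aligned}$$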
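The limit exists because the sequence on the right is increasing due to the above
observation that $\lambda_n = \lambda_m$ on the Borel subsets of $K_m$ whenever $n > m$. Thus for
$n > m$
$$\lambda_n(F \cap K_n) \ge \lambda_n(F \cap K_m) = \lambda_m(F \cap K_m).$$
Now let $\{F_k\}$ be a sequence of disjoint Borel sets. Then
$$\begin{aligned}
\lambda\left( \cup_{k=1}^\infty F_k \right) &\equiv \lim_{n\to\infty} \lambda_n\left( \cup_{k=1}^\infty F_k \cap K_n \right) = \lim_{n\to\infty} \lambda_n\left( \cup_{k=1}^\infty (F_k \cap K_n) \right) \\
&= \lim_{n\to\infty} \sum_{k=1}^\infty \lambda_n(F_k \cap K_n) = \sum_{k=1}^\infty \lambda(F_k)
\end{aligned}$$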
Consequently,
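$$\begin{aligned}
\left| \int \phi\,d\mu_k - \int \phi\,d\lambda \right| &\le \left| \int_{K_n^C} \phi\,d\mu_k + \int_{K_n} \phi\,d\mu_k - \left( \int_{K_n} \phi\,d\lambda + \int_{K_n^C} \phi\,d\lambda \right) \right| \\
&\le \left| \int_{K_n} \phi\,d\mu_k - \int_{K_n} \phi\,d\lambda_n \right| + \left| \int_{K_n^C} \phi\,d\mu_k - \int_{K_n^C} \phi\,d\lambda \right| \\
&\le \left| \int_{K_n} \phi\,d\mu_k - \int_{K_n} \phi\,d\lambda_n \right| + \left| \int_{K_n^C} \phi\,d\mu_k \right| + \left| \int_{K_n^C} \phi\,d\lambda \right| \\
&\le \left| \int_{K_n} \phi\,d\mu_k - \int_{K_n} \phi\,d\lambda_n \right| + \frac{M}{n} + \frac{M}{n}
\end{aligned}$$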
First let n be so large that 2M/n < ε/2 and then pick k large enough that the
above expression is less than ε. This proves the theorem.
Definition 36.24 Let $E$ be a Banach space and let $\mu$ and the sequence of probability
measures, $\{\mu_n\}$ defined on $\mathcal{B}(E)$ satisfy
$$\lim_{n\to\infty} \int \phi\,d\mu_n = \int \phi\,d\mu.$$
and so
|µ (U ) − µn (U )| < 3ε.
Since $\varepsilon$ is arbitrary, this proves the lemma.
λX (F ) ≡ P ([X ∈ F ])
µ (F ) ≡ P ([X ∈ F ]) .
Theorem 36.27 Let E be a separable Banach space and let {µn } be a sequence
of Borel probability measures defined on B (E) such that µn converges weakly to µ
another probability measure on B (E). Then there exist random variables, Xn , X
defined on the probability space, ([0, 1), B ([0, 1)) , m) where m is one dimensional
Lebesgue measure such that
Construction of sets in E
Thus the sets, $C_k^r$ for $k = 1, 2, \cdots$ are disjoint Borel sets whose union is all of $C$.
Now let $C = E$, the whole Banach space. Also let $\{r_k\}$ be a decreasing sequence of
positive numbers which converges to 0. Let
$$A_k \equiv E_k^{r_1}, \quad k = 1, 2, \cdots$$
Thus $\{A_k\}$ is a sequence of Borel sets, $A_k \subseteq B(a_k, r_1)$, and the union of the $A_k$
equals $E$. For $(i_1, \cdots, i_m) \in \mathbb{N}^m$, suppose $A_{i_1,\cdots,i_m}$ has been defined. Then for
$k \in \mathbb{N}$,
$$A_{i_1,\cdots,i_m k} \equiv \left( A_{i_1,\cdots,i_m} \right)_k^{r_{m+1}}$$
Thus $A_{i_1,\cdots,i_m k} \subseteq B(a_k, r_{m+1})$, is a Borel set, and
$$\cup_{k=1}^\infty A_{i_1,\cdots,i_m k} = A_{i_1,\cdots,i_m}. \tag{36.10}$$
Also note that $A_{i_1,\cdots,i_m}$ could be empty. This is because $A_{i_1,\cdots,i_m k} \subseteq B(a_k, r_{m+1})$
but $A_{i_1,\cdots,i_m} \subseteq B(a_{i_m}, r_m)$ which might have empty intersection with $B(a_k, r_{m+1})$.
However, applying 36.10 repeatedly,
$$E = \cup_{i_1} \cdots \cup_{i_m} A_{i_1,\cdots,i_m}$$
and also, the construction shows the Borel sets, $A_{i_1,\cdots,i_m}$ are disjoint.
Construction of intervals depending on the measure
Next I will construct intervals, $I^\nu_{i_1,\cdots,i_n}$ in $[0,1)$ corresponding to these $A_{i_1,\cdots,i_n}$.
In what follows, $\nu = \mu_n$ or $\mu$. These intervals will depend on the measure chosen
as indicated in the notation.
$$I_1^\nu \equiv [0, \nu(A_1)), \cdots, I_j^\nu \equiv \left[ \sum_{k=1}^{j-1} \nu(A_k), \sum_{k=1}^{j} \nu(A_k) \right)$$
for $j = 1, 2, \cdots$. Note these are disjoint intervals whose union is $[0,1)$. Also note
$$m\left( I_j^\nu \right) = \nu(A_j).$$
The endpoints of these intervals as well as their lengths depend on the measures of
the sets $A_k$. Now supposing $I^\nu_{i_1,\cdots,i_m} = [\alpha, \beta)$ where $\beta - \alpha = \nu\left( A_{i_1\cdots,i_m} \right)$, define
$$I^\nu_{i_1\cdots,i_m,j} \equiv \left[ \alpha + \sum_{k=1}^{j-1} \nu\left( A_{i_1\cdots,i_m,k} \right), \alpha + \sum_{k=1}^{j} \nu\left( A_{i_1\cdots,i_m,k} \right) \right)$$
Thus $m\left( I^\nu_{i_1\cdots,i_m,j} \right) = \nu\left( A_{i_1\cdots,i_m,j} \right)$ and
$$\nu\left( A_{i_1\cdots,i_m} \right) = \sum_{k=1}^\infty \nu\left( A_{i_1\cdots,i_m,k} \right) = \sum_{k=1}^\infty m\left( I^\nu_{i_1\cdots,i_m,k} \right) = \beta - \alpha,$$
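The interval construction is just a running sum of masses, repeated inside each parent interval. The following sketch carries it out for finitely many sets at each level, with exact rational masses; the text's construction is over countably many sets, so this is a truncated illustration only. The helper name `child_intervals` and the specific masses are assumptions.

```python
from fractions import Fraction


def child_intervals(alpha, masses):
    """Split [alpha, alpha + sum(masses)) into consecutive half-open intervals.

    The j-th interval is [alpha + sum_{k<j} masses[k], alpha + sum_{k<=j} masses[k]),
    returned as a (left, right) pair, so its length equals masses[j].
    """
    intervals = []
    left = alpha
    for mass in masses:
        intervals.append((left, left + mass))
        left += mass
    return intervals


if __name__ == "__main__":
    # First level: nu(A_1), nu(A_2), nu(A_3) summing to 1.
    level1 = child_intervals(Fraction(0), [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)])
    print(level1)
    # Second level inside I_2 = [1/2, 5/6): nu(A_{2,1}) + nu(A_{2,2}) = nu(A_2) = 1/3.
    alpha, beta = level1[1]
    level2 = child_intervals(alpha, [Fraction(1, 4), Fraction(1, 12)])
    print(level2)
```

Each child interval has length equal to the mass of the corresponding set, and the children exactly fill the parent $[\alpha, \beta)$, which is the content of the last displayed equation.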
There are at most countably many positive numbers, $r$ such that for $\nu = \mu_n$
or $\mu$, $\nu(\partial B(a_i, r)) > 0$. This is because $\nu$ is a finite measure. Taking the countable
union of these countable sets, there are only countably many $r$ such that
$\nu(\partial B(a_i, r)) > 0$ for some $a_i$. Let the sequence avoid all these bad values of $r$.
Thus for
$$F \equiv \cup_{m=1}^\infty \cup_{k=1}^\infty \partial B(a_k, r_m)$$
and $\nu = \mu$ or $\mu_n$, $\nu(F) = 0$.
Claim 1: $\partial A_{i_1,\cdots,i_k} \subseteq F$.
Proof of claim: Suppose $C$ is a Borel set for which $\partial C \subseteq F$. I need to show
$\partial C_k^{r_i} \subseteq F$. First consider $k = 1$. Then $C_1^{r_i} \equiv B(a_1, r_i) \cap C$. If $x \in \partial C_1^{r_i}$, then
$B(x, \delta)$ contains points of $B(a_1, r_i) \cap C$ and points of $B(a_1, r_i)^C \cup C^C$ for every
$\delta > 0$. First suppose $x \in B(a_1, r_i)$. Then a small enough neighborhood of $x$ has no
points of $B(a_1, r_i)^C$ and so every $B(x, \delta)$ has points of $C$ and points of $C^C$ so that
$x \in \partial C \subseteq F$ by assumption. If $x \in \partial C_1^{r_i}$, then it can't happen that $\|x - a_1\| > r_i$
because then there would be a neighborhood of $x$ having no points of $C_1^{r_i}$. The
only other case to consider is that $\|x - a_1\| = r_i$ but this says $x \in F$. Now assume
$\partial C_j^{r_i} \subseteq F$ for $j \le k - 1$ and consider $\partial C_k^{r_i}$.
Since there are only finitely many sets in the union, there exists $s \le k - 1$ such that
every ball about $x$ contains points of $C_s^{r_i}$ but from 36.11, every ball about $x$ contains
points of $\left( C_s^{r_i} \right)^C$ which implies $x \in \partial C_s^{r_i} \subseteq F$ by induction. It is not possible that
$\|x - a_k\| > r_i$ and yet have $x$ in $\partial C_k^{r_i}$. This follows from the description in 36.11.
If $\|x - a_k\| = r_i$ then by definition, $x \in F$. The only other case to consider is
that $x \notin \text{int}(B(a_k, r_i) \cap C)$ but $x \in B(a_k, r_i)$. From 36.11, every ball about $x$
contains points of $C$. However, since $x \in B(a_k, r_i)$, a small enough ball is contained
in $B(a_k, r_i)$. Therefore, every ball about $x$ must also contain points of $C^C$ since
otherwise, $x \in \text{int}(B(a_k, r_i) \cap C)$. Thus $x \in \partial C \subseteq F$ by assumption. Now apply
what was just shown to the case where $C = E$, the whole space. In this case,
$\partial E \subseteq F$ because $\partial E = \emptyset$. Then keep applying what was just shown to the $A_{i_1,\cdots,i_n}$.
This proves the claim.
From the claim, $\nu\left( \text{int}\left( A_{i_1,\cdots,i_n} \right) \right) = \nu\left( A_{i_1,\cdots,i_n} \right)$ whenever $\nu = \mu$ or $\mu_n$.
By the axiom of choice, there exists $x_{i_1,\cdots,i_m} \in \text{int}\left( A_{i_1,\cdots,i_m} \right)$ whenever $\text{int}\left( A_{i_1,\cdots,i_m} \right) \ne
\emptyset$. For $\nu = \mu_n$ or $\mu$, define the following functions. For $\omega \in I^\nu_{i_1,\cdots,i_m}$
$$Z_m^\nu(\omega) \equiv x_{i_1,\cdots,i_m}.$$
This defines the functions, $Z_m^{\mu_n}$ and $Z_m^\mu$. Note these functions have the same values
but on slightly different intervals. Here is an important claim.
Claim 2: For a.e. $\omega \in [0,1)$, $\lim_{n\to\infty} Z_m^{\mu_n}(\omega) = Z_m^\mu(\omega)$.
Proof of the claim: This follows from the weak convergence of $\mu_n$ to $\mu$ and
Lemma 36.25. This lemma implies $\mu_n\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \to \mu\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right)$. Thus
by the construction described above, $\mu_n\left( A_{i_1,\cdots,i_m} \right) \to \mu\left( A_{i_1,\cdots,i_m} \right)$ because of claim
1 and the construction of $F$ in which it is always a set of measure zero. It follows
that if $\omega \in \text{int}\left( I^\mu_{i_1,\cdots,i_m} \right)$, then for all $n$ large enough, $\omega \in \text{int}\left( I^{\mu_n}_{i_1,\cdots,i_m} \right)$ and so
$Z_m^{\mu_n}(\omega) = Z_m^\mu(\omega)$. Note this convergence is very far from being uniform.
Claim 3: For $\nu = \mu_n$ or $\mu$, $\{Z_m^\nu\}_{m=1}^\infty$ is uniformly Cauchy independent of $n$.
Proof of the claim: For $\omega \in I^\nu_{i_1,\cdots,i_m}$, then by the construction, $\omega \in I^\nu_{i_1,\cdots,i_m,i_{m+1}\cdots,i_n}$
for some $i_{m+1}\cdots,i_n$. Therefore, $Z_m^\nu(\omega)$ and $Z_n^\nu(\omega)$ are both contained in $A_{i_1,\cdots,i_m}$
which is contained in $B(a_{i_m}, r_m)$. Since $\omega \in [0,1)$ was arbitrary, and $r_m \to 0$, it
follows these functions are uniformly Cauchy as claimed.
Let $X^\nu(\omega) = \lim_{m\to\infty} Z_m^\nu(\omega)$. Since each $Z_m^\nu$ is continuous off a set of measure
zero, it follows from the uniform convergence that $X^\nu$ is also continuous off a set of
measure zero.
Claim 4: For a.e. ω,
Proof of the claim: From Claim 3 and letting $\varepsilon > 0$ be given, there exists $m$
large enough that for all $n$,
$$\left\| Z_m^{\mu_n} - X^{\mu_n} \right\|_\infty < \varepsilon/3, \quad \left\| Z_m^\mu - X^\mu \right\|_\infty < \varepsilon/3.$$
Now pick $\omega \in [0,1)$ such that $\omega$ is not equal to any of the end points of any of the
intervals, $\left\{ I^\nu_{i_1,\cdots,i_m} \right\}$, a set of measure zero. Then by Claim 2, there exists $N$ such
that if $n \ge N$, then $\left\| Z_m^{\mu_n}(\omega) - Z_m^\mu(\omega) \right\|_E < \varepsilon/3$. Therefore, for such $n$ and this $\omega$,
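Showing $\mathcal{L}(X^\nu) = \nu$.
This has mostly proved the theorem except for the claim that $\mathcal{L}(X^\nu) = \nu$ for
$\nu = \mu_n$ and $\mu$. To do this, I will first show $m\left( (X^\nu)^{-1}\left( \partial A_{i_1,\cdots,i_m} \right) \right) = 0$. By the
construction, $\nu\left( \partial A_{i_1,\cdots,i_m} \right) = 0$. Let $\varepsilon > 0$ be given and let $\delta > 0$ be small enough
that
$$H_\delta \equiv \left\{ x \in E : \text{dist}\left( x, \partial A_{i_1,\cdots,i_m} \right) \le \delta \right\}$$
is a set of measure less than $\varepsilon/2$. Denote by $\mathcal{G}_k$ the sets of the form $A_{i_1,\cdots,i_k}$
where $(i_1, \cdots, i_k) \in \mathbb{N}^k$. Recall also that corresponding to $A_{i_1,\cdots,i_k}$ is an interval,
$I^\nu_{i_1,\cdots,i_k}$ having length equal to $\nu\left( A_{i_1,\cdots,i_k} \right)$. Denote by $\mathcal{B}_k$ those sets of $\mathcal{G}_k$ which
have nonempty intersection with $H_\delta$ and let the corresponding intervals be denoted
by $\mathcal{I}_k^\nu$. If $\omega \notin \cup \mathcal{I}_k^\nu$, then from the construction, $Z_p^\nu(\omega)$ is at a distance of at least
$\delta$ from $\partial A_{i_1,\cdots,i_m}$ for all $p \ge k$ and so, passing to the limit as $p \to \infty$, it follows
$X^\nu(\omega) \notin \partial A_{i_1,\cdots,i_m}$. Therefore,
$$(X^\nu)^{-1}\left( \partial A_{i_1,\cdots,i_m} \right) \subseteq \cup \mathcal{I}_k^\nu$$
Recall that $A_{i_1,\cdots,i_k} \subseteq B(a_{i_k}, r_k)$ and the $r_k \to 0$. Therefore, if $k$ is large enough,
$$\nu\left( \cup \mathcal{B}_k \right) < \varepsilon$$
$$= \nu\left( \cup \mathcal{B}_k \right) < \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, this shows $m\left( (X^\nu)^{-1}\left( \partial A_{i_1,\cdots,i_m} \right) \right) = 0$.
If $\omega \in I^\nu_{i_1,\cdots,i_m}$, then from the construction, $Z_p^\nu(\omega) \in \text{int}\left( A_{i_1,\cdots,i_m} \right)$ for all $p \ge k$.
Therefore, taking a limit, as $p \to \infty$,
and so
$$I^\nu_{i_1,\cdots,i_m} \subseteq (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \cup \partial A_{i_1,\cdots,i_m} \right)$$
but also, if $X^\nu(\omega) \in \text{int}\left( A_{i_1,\cdots,i_m} \right)$, then $Z_p^\nu(\omega) \in \text{int}\left( A_{i_1,\cdots,i_m} \right)$ for all $p$ large
enough and so
$$(X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \subseteq I^\nu_{i_1,\cdots,i_m} \subseteq (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \cup \partial A_{i_1,\cdots,i_m} \right)$$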
36.5. A MAJOR EXISTENCE AND CONVERGENCE THEOREM 1013
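Therefore,
$$\begin{aligned}
m\left( (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \right) &\le m\left( I^\nu_{i_1,\cdots,i_m} \right) \\
&\le m\left( (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \right) + m\left( (X^\nu)^{-1}\left( \partial A_{i_1,\cdots,i_m} \right) \right) \\
&= m\left( (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \right)
\end{aligned}$$
which shows
$$m\left( (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \right) = m\left( I^\nu_{i_1,\cdots,i_m} \right) = \nu\left( A_{i_1,\cdots,i_m} \right). \tag{36.12}$$
Also
$$\begin{aligned}
m\left( (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \right) &\le m\left( (X^\nu)^{-1}\left( A_{i_1,\cdots,i_m} \right) \right) \\
&\le m\left( (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \cup \partial A_{i_1,\cdots,i_m} \right) \right) \\
&= m\left( (X^\nu)^{-1}\left( \text{int}\left( A_{i_1,\cdots,i_m} \right) \right) \right)
\end{aligned}$$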
Since this holds for every open set, it is routine to verify using regularity that it
holds for every Borel set and so L (X ν ) = ν as claimed. This proves the theorem.
Recall the following fundamental lemma and definition, Lemma 19.12 on Page
522.
Lemma 36.30 F and F −1 are both one to one, onto, and are inverses of each
other.
Theorem 36.31 Let $X$ and $Y$ be random vectors with values in $\mathbb{R}^p$ and suppose
$E\left(e^{it\cdot X}\right) = E\left(e^{it\cdot Y}\right)$ for all $t \in \mathbb{R}^p$. Then $\lambda_X = \lambda_Y$.
Proof: This follows from the Lemma 9.72 on Page 257. Let
Lemma 36.33 If $E$ is a separable Banach space with $B'$ the closed unit ball in $E'$,
then there exists a sequence $\{f_n\}_{n=1}^\infty \equiv D' \subseteq B'$ with the property that for every
$x \in E$,
$$||x|| = \sup_{f \in D'} |f(x)|$$
Definition 36.34 Let E be a separable real Banach space. A cylindrical set is one
which is of the form
{x ∈ E : x∗i (x) ∈ Γi , i = 1, 2, · · ·, m}
It is obvious that ∅ is a cylindrical set and that the intersection of two cylindrical
sets is another cylindrical set. Thus the cylindrical sets form a π system. What is
the smallest σ algebra containing the cylindrical sets? Letting $\{f_n\}_{n=1}^\infty = D'$ be the
sequence of Lemma 36.33 it follows that
$$\{x \in E : ||x - a|| \le \delta\}
= \Big\{x \in E : \sup_{f \in D'} |f(x - a)| \le \delta\Big\}
= \Big\{x \in E : \sup_{f \in D'} |f(x) - f(a)| \le \delta\Big\}
= \bigcap_{n=1}^\infty \Big\{x \in E : f_n(x) \in \overline{B(f_n(a), \delta)}\Big\}$$
Note this is a little different than earlier when the symbol φX (t) was used and
X was a random variable. Here the focus is more on the measure than a random
variable, X such that L (X) = µ but it does not matter much because of Skorokhod’s
theorem presented above. The fundamental result is the following theorem.
φµ (x∗ ) = φν (x∗ )
$$\tilde{\mu}(A) \equiv \mu\left(\{x \in E : (x_1^*(x), \dots, x_n^*(x)) \in A\}\right),$$
$$\tilde{\nu}(A) \equiv \nu\left(\{x \in E : (x_1^*(x), \dots, x_n^*(x)) \in A\}\right). \tag{36.14}$$
Note these sets in the parentheses are cylindrical sets. Letting $\lambda \in \mathbb{R}^n$, consider in
the definition of the characteristic function, $\lambda_1 x_1^* + \cdots + \lambda_n x_n^* \in E'$. Thus
$$\int_E e^{i(\lambda_1 x_1^*(x) + \cdots + \lambda_n x_n^*(x))}\,d\mu = \int_E e^{i(\lambda_1 x_1^*(x) + \cdots + \lambda_n x_n^*(x))}\,d\nu.$$
Now if $F$ is a Borel measurable subset of $\mathbb{R}^n$,
$$\int_{\mathbb{R}^n} \mathcal{X}_F(y)\,d\tilde{\mu}(y) = \tilde{\mu}(F)$$
and using the usual approximations involving simple functions, it follows that for
any $f$ bounded and Borel measurable,
$$\int_{\mathbb{R}^n} f(y)\,d\tilde{\mu}(y) = \int_E f\left((x_1^*(x), \dots, x_n^*(x))\right)d\mu(x).$$
Similarly,
$$\int_{\mathbb{R}^n} f(y)\,d\tilde{\nu}(y) = \int_E f\left((x_1^*(x), \dots, x_n^*(x))\right)d\nu(x).$$
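Therefore,
$$\int_{\mathbb{R}^n} e^{i\lambda\cdot y}\,d\tilde{\mu}(y)
= \int_E e^{i(\lambda_1 x_1^*(x) + \cdots + \lambda_n x_n^*(x))}\,d\mu
= \int_E e^{i(\lambda_1 x_1^*(x) + \cdots + \lambda_n x_n^*(x))}\,d\nu
= \int_{\mathbb{R}^n} e^{i\lambda\cdot y}\,d\tilde{\nu}(y)$$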
which shows from Theorem 36.31 that $\tilde{\nu} = \tilde{\mu}$ on the Borel sets of $\mathbb{R}^n$. However,
from the definition of these measures in 36.14 this says nothing more than µ = ν
on any cylindrical set. Hence by Corollary 36.32 this shows µ = ν on B (E) . This
proves the theorem.
Finally, I will consider the relation between the characteristic function and
independence of random variables. Recall an earlier proposition which relates
independence of random vectors with characteristic functions. It is proved starting on
Page 865 in the case of two random variables and concludes with the observation
that the general case is entirely similar but more tedious to write down.
Proposition 36.37 Let $\{X_k\}_{k=1}^n$ be random vectors such that $X_k$ has values in
$\mathbb{R}^{p_k}$. Then the random vectors are independent if and only if
$$E\left(e^{iP}\right) = \prod_{j=1}^n E\left(e^{it_j\cdot X_j}\right)$$
where $P \equiv \sum_{j=1}^n t_j \cdot X_j$ for $t_j \in \mathbb{R}^{p_j}$.
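Here is a small worked illustration of how the product formula detects dependence. It is only a sketch under assumptions which are not part of the proposition: $X_1, X_2$ are real valued, each is $N(0,1)$, and the standard formula $E\left(e^{itX}\right) = e^{-t^2/2}$ for a standard normal $X$ is used.

```latex
% Illustrative assumption: X_1, X_2 are real valued and each is N(0,1).
\begin{align*}
\text{(independent case)}\quad
E\left(e^{i(t_1X_1+t_2X_2)}\right)
  &= E\left(e^{it_1X_1}\right)E\left(e^{it_2X_2}\right)
   = e^{-\frac12 t_1^2}\,e^{-\frac12 t_2^2}, \\
\text{(dependent case } X_2 = X_1\text{)}\quad
E\left(e^{i(t_1+t_2)X_1}\right)
  &= e^{-\frac12 (t_1+t_2)^2}
   = e^{-\frac12 t_1^2}\,e^{-\frac12 t_2^2}\,e^{-t_1t_2}.
\end{align*}
% The extra factor e^{-t_1 t_2} differs from 1 whenever t_1 t_2 \neq 0,
% so in the dependent case the product formula fails.
```

In the second case the two sides of the formula disagree for every choice with $t_1t_2 \neq 0$, which is consistent with $X_1$ and $X_2 = X_1$ failing to be independent.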
It turns out there is a generalization of the above proposition to the case where
the random variables have values in a real separable Banach space. Before proving
this recall an earlier theorem which had to do with reducing to the case where the
random variables had values in Rn . It is restated here for convenience.
Theorem 36.38 The random variables, $\{X_i\}_{i\in I}$ are independent if whenever
$\{i_1, \dots, i_n\} \subseteq I$,
$m_{i_1}, \dots, m_{i_n}$ are positive integers, and $g_{m_{i_1}}, \dots, g_{m_{i_n}}$ are in $(E')^{m_{i_1}}, \dots, (E')^{m_{i_n}}$
respectively, $\left\{g_{m_{i_j}} \circ X_{i_j}\right\}_{j=1}^n$ are independent random vectors having values in
$\mathbb{R}^{m_{i_1}}, \dots, \mathbb{R}^{m_{i_n}}$ respectively.
Now here is the theorem about independence and the characteristic functions.
Theorem 36.39 Let $\{X_k\}_{k=1}^n$ be random variables having values in $E$, a real
separable Banach space. Then the random variables are independent if and only if
$$E\left(e^{iP}\right) = \prod_{j=1}^n E\left(e^{it_j^*(X_j)}\right)$$
where $P \equiv \sum_{j=1}^n t_j^*(X_j)$ for $t_j^* \in E'$.
Proof: If the random variables are independent, then so are the random vari-
ables, t∗j (Xj ) and so the equation follows.
The interesting case is when the equation holds. Can you draw the conclusion
the random variables are independent? By Theorem 36.38, it suffices to show the
random variables $\{g_{m_k} \circ X_k\}_{k=1}^n$ are independent. This happens if whenever $t_{m_k} \in \mathbb{R}^{m_k}$ and
$$P = \sum_{k=1}^n t_{m_k} \cdot (g_{m_k} \circ X_k),$$
it follows
$$E\left(e^{iP}\right) = \prod_{k=1}^n E\left(e^{it_{m_k}\cdot(g_{m_k}\circ X_k)}\right). \tag{36.15}$$
which is assumed to hold. Therefore, the random variables are independent. This
proves the theorem.
There is an obvious corollary which is useful.
Corollary 36.40 Let $\{X_k\}_{k=1}^n$ be random variables having values in $E$, a real
separable Banach space. Then the random variables are independent if and only if
$$E\left(e^{iP}\right) = \prod_{j=1}^n E\left(e^{it_j^*(X_j)}\right)$$
where $P \equiv \sum_{j=1}^n t_j^*(X_j)$ for $t_j^* \in M$ where $M$ is a dense subset of $E'$.
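Then define
$$P \equiv \sum_{j=1}^n t_j^* X_j, \quad P_n \equiv \sum_{j=1}^n t_{nj}^* X_j.$$
It follows
$$E\left(e^{iP}\right) = \lim_{n\to\infty} E\left(e^{iP_n}\right)
= \lim_{n\to\infty} \prod_{j=1}^n E\left(e^{it_{nj}^*(X_j)}\right)
= \prod_{j=1}^n E\left(e^{it_j^*(X_j)}\right)$$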
36.7 Convolution
Lemma 36.18 on Page 1002 makes possible a definition of convolution of two prob-
ability measures defined on B (E) where E is a separable Banach space as well as
some other interesting theorems which held earlier in the context of locally compact
spaces. I will first show a little theorem about density of continuous functions in
Lp (E) and then define the convolution of two finite measures. First here is a simple
technical lemma.
Proof: For each $x \in K$, there exists a ball, $B(x, \delta_x)$ such that $B(x, 3\delta_x) \subseteq U$.
Finitely many of these balls cover $K$ because $K$ is compact, say $\{B(x_i, \delta_{x_i})\}_{i=1}^m$.
Let
$$0 < \delta < \min(\delta_{x_i} : i = 1, 2, \dots, m).$$
Now pick any $x \in K$. Then $x \in B(x_i, \delta_{x_i})$ for some $x_i$ and so $B(x, \delta) \subseteq B(x_i, 2\delta_{x_i}) \subseteq U$.
Therefore, for any $x \in K$, $\operatorname{dist}(x, U^C) \ge \delta$. If $x \in B(x_i, 2\delta_{x_i})$ for some $x_i$, it
follows $\operatorname{dist}(x, U^C) \ge \delta$ because then $B(x, \delta) \subseteq B(x_i, 3\delta_{x_i}) \subseteq U$. If $x \notin B(x_i, 2\delta_{x_i})$
for any of the $x_i$, then $x \notin B(y, \delta)$ for any $y \in K$ because all these sets are contained
in some $B(x_i, 2\delta_{x_i})$. Consequently $\operatorname{dist}(x, K) \ge \delta$. This proves the lemma.
From this lemma, there is an easy corollary.
Proof: Consider
$$f(x) \equiv \frac{\operatorname{dist}(x, U^C)}{\operatorname{dist}(x, U^C) + \operatorname{dist}(x, K)}.$$
Therefore,
$$|f(x) - f(x')| \le \frac{2}{\delta}\, d(x, x')$$
and this proves the corollary.
Now suppose µ is a finite measure defined on the Borel sets of a separable Banach
space, E. It was shown above that µ is inner and outer regular. Lemma 36.18 on
Page 1002 shows that µ is inner regular in the usual sense with respect to compact
sets. This makes possible the following theorem.
such that $||f - s||_{L^p(E)} < \varepsilon/2$. Now by regularity of $\mu$ there exist compact sets,
$K_k$ and open sets, $V_k$ such that $2\sum_{k=1}^m |c_k|\,\mu(V_k \setminus K_k)^{1/p} < \varepsilon/2$ and by Corollary
36.42 there exist uniformly continuous functions $g_k$ having values in $[0, 1]$ such that
$g_k = 1$ on $K_k$ and $0$ on $V_k^C$. Then consider
$$g(x) = \sum_{k=1}^m c_k g_k(x).$$
$$\le \sum_{k=1}^m |c_k| \left(\int_{V_k \setminus K_k} 2^p\,d\mu\right)^{1/p}
= 2\sum_{k=1}^m |c_k|\,\mu(V_k \setminus K_k)^{1/p} < \varepsilon/2.$$
Therefore,
||f − g||Lp ≤ ||f − s||Lp + ||s − g||Lp < ε/2 + ε/2.
This proves the theorem.
Lemma 36.44 Let $A \in \mathcal{B}(E)$ where $\mu$ is a finite measure on $\mathcal{B}(E)$ for $E$ a separable
Banach space. Also let $x_i \in E$ for $i = 1, 2, \dots, m$. Then for $x \in E^m$,
$$x \to \mu\Big(A + \sum_{i=1}^m x_i\Big), \quad x \to \mu\Big(A - \sum_{i=1}^m x_i\Big)$$
are
$$\mathcal{B}(E) \times \cdots \times \mathcal{B}(E)$$
measurable where the above denotes the product measurable sets as described in
Theorem 9.75 on Page 260.
where the Ui come from a countable basis for E. Since every open set is the countable
union of sets like the above, each being a measurable box, the open sets are contained
in
B (E) × · · · × B (E)
which implies B (E m ) ⊆ B (E) × · · · × B (E) also. This proves the lemma.
With this lemma, it is possible to define the convolution of two finite measures.
Definition 36.45 Let µ and ν be two finite measures on B (E) , for E a separable
Banach space. Then define a new measure, µ ∗ ν on B (E) as follows
$$\mu * \nu(A) \equiv \int_E \nu(A - x)\,d\mu(x).$$
This is well defined because of Lemma 36.44 which says that x → ν (A − x) is Borel
measurable.
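To see the definition at work, here is a short computation in the simplest possible case. It is only an illustration: the point masses $\delta_a$ and $\delta_b$, with $a, b \in E$, are my choice of example and are not taken from the text.

```latex
% Illustrative assumption: \mu = \delta_a and \nu = \delta_b are point masses, a, b \in E.
\begin{align*}
\nu(A - x) &= \delta_b(A - x) = \mathcal{X}_A(x + b)
  \quad\text{because } b \in A - x \iff x + b \in A, \\
\mu * \nu(A) &= \int_E \mathcal{X}_A(x + b)\,d\delta_a(x)
  = \mathcal{X}_A(a + b) = \delta_{a+b}(A).
\end{align*}
```

Thus $\delta_a * \delta_b = \delta_{a+b}$, which matches the idea that convolution of laws corresponds to adding independent random variables.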
Here is an interesting theorem about convolutions. However, first here is a little
lemma. The following picture is descriptive of the set described in the following
lemma.
[Figure: a diagonal band in $E \times E$, with both axes labeled $E$, showing the set $A$ and the associated set $S_A$.]
$$S_{\cup_{i=1}^\infty A_i} = \cup_{i=1}^\infty S_{A_i}$$
and this shows that G is also closed with respect to countable unions of disjoint
sets. Therefore, by the lemma on π systems, Lemma 9.72 on Page 257 it follows
G = σ (K) = B (E) . This proves the lemma.
while
$$\mu * (\nu * \lambda)(A) \equiv \int_E (\nu * \lambda)(A - y)\,d\mu(y)
= \int_E \int_E \nu(A - y - x)\,d\lambda(x)\,d\mu(y)
= \int_E \int_E \nu(A - y - x)\,d\mu(y)\,d\lambda(x).$$
for all $E$ a Borel set in $\mathbb{R}^p$. In different notation, $\mathcal{L}(X) = \lambda_X$. Then the following
definitions and theorems are proved and presented starting on Page 867
X1 + X2 ∼ Np (m1 + m2 , Σ1 + Σ2 ). (36.20)
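The formula 36.20 can be checked quickly with characteristic functions. The following is a sketch under assumptions I am supplying, since the hypotheses are not restated here: $X_1$ and $X_2$ are independent, $X_k \sim N_p(m_k, \Sigma_k)$, and the characteristic function of $N_p(m, \Sigma)$ is $e^{it\cdot m}e^{-\frac12 t^*\Sigma t}$.

```latex
% Assumptions (not restated in the text above): X_1, X_2 independent,
% X_k \sim N_p(m_k, \Sigma_k), with E(e^{it\cdot X_k}) = e^{it\cdot m_k} e^{-\frac12 t^*\Sigma_k t}.
\begin{align*}
E\left(e^{it\cdot(X_1+X_2)}\right)
 &= E\left(e^{it\cdot X_1}\right)E\left(e^{it\cdot X_2}\right)
    && \text{(independence)} \\
 &= e^{it\cdot m_1}e^{-\frac12 t^*\Sigma_1 t}\; e^{it\cdot m_2}e^{-\frac12 t^*\Sigma_2 t} \\
 &= e^{it\cdot(m_1+m_2)}\,e^{-\frac12 t^*(\Sigma_1+\Sigma_2)t}.
\end{align*}
```

The last expression is the characteristic function of $N_p(m_1+m_2, \Sigma_1+\Sigma_2)$, so the conclusion follows because the characteristic function determines the distribution.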
Following [42] a random vector has a generalized normal distribution if its
characteristic function is given as
$$e^{it\cdot m}\,e^{-\frac12 t^*\Sigma t} \tag{36.22}$$
where $\Sigma$ is symmetric and has nonnegative eigenvalues. For a random real valued
variable, $m$ is scalar and so is $\Sigma$ so the characteristic function of such a generalized
normally distributed random variable is
$$e^{itm}\,e^{-\frac12 t^2\sigma^2} \tag{36.23}$$
where mj = E (Xj ).
then X1 and (X2 , · · ·, Xp ) are both normally distributed and the two random vectors
are independent. Here mj ≡ E (Xj ) .
Proof: First note that if A, B are Borel sets of E then A × B is a Borel set in
E × E where the norm on E × E is given by
$$\mathcal{G} \equiv \{B \in \mathcal{B}(E) : A \times B \in \mathcal{B}(E \times E)\}.$$
Show $\mathcal{G}$ is a σ algebra and it contains the open sets. Therefore, this will show $A \times B$
is in $\mathcal{B}(E \times E)$ whenever $A$ is open and $B$ is Borel. Next repeat a similar argument
to show that this is true whenever either set is Borel. Since E is separable, it is
completely separable and so is E × E. Thus every open set in E × E is the union
of balls from a countable set. However, these balls are of the form B1 × B2 where
Bi is a ball in E. Now let
K ≡ {A × B : A, B are Borel}
Then K ⊆ B (E × E) as was just shown and also every open set from E × E is in
σ (K). It follows σ (K) equals the σ algebra of product measurable sets, B (E)×B (E)
and you can consider the product measure, µ×µ. By Skorokhod’s theorem, Theorem
36.27, there exists (X, Y ) a random variable with values in E × E and a probability
space, (Ω, F, P ) such that L ((X, Y )) = µ × µ. Then for A, B Borel sets in E
P (X ∈ A, Y ∈ B) = (µ × µ) (A × B) = µ (A) µ (B) .
Then letting θ = π,
$$\phi(t) = \phi(-t)\cdot\phi(0) = \phi(-t) = \overline{\phi(t)}$$
showing φ has real values. It is positive near 0 because φ (0) = 1 and φ is a
continuous function of t thanks to the dominated convergence theorem. However,
this and 36.25 implies it is positive everywhere. Here is why. If not, let tm be the
smallest positive value of t where φ (t) = 0. Then tm > 0 by continuity. Now from
36.25, an immediate contradiction results. Therefore, φ (t) > 0 for all t > 0. Similar
reasoning yields the same conclusion for t < 0.
Next note that $\phi(t) = \phi(-t)$ also implies $\phi$ depends only on $|t|$ because it takes
the same value for $t$ as for $-t$. More simply, $\phi$ depends only on $t^2$. Thus one can
define a new function of the form $\phi(t) = f(t^2)$ and 36.24 implies the following for
$\alpha \in [0, 1]$.
$$f(t^2) = f(\alpha^2 t^2)\,f\left((1 - \alpha^2)t^2\right).$$
Taking $\ln$ of both sides, one obtains the following for $g(t^2) \equiv \ln f(t^2)$.
$$\ln f(t^2) = \ln f(\alpha^2 t^2) + \ln f\left((1 - \alpha^2)t^2\right).$$
Now letting $x = \alpha^2 t^2$ and $y = (1 - \alpha^2)t^2$, it follows that for all $x, y \ge 0$
$$\ln f(x + y) = \ln f(x) + \ln f(y).$$
Hence $\ln f(x) = kx$ and so $\ln f(t^2) = kt^2$ and so $\phi(t) = f(t^2) = e^{kt^2}$ for all $t$.
The constant, k must be nonpositive because φ (t) is bounded due to its definition.
Therefore, the characteristic function of $\nu$ is
$$\phi_\nu(t) = e^{-\frac12 t^2\sigma^2}$$
for some $\sigma \ge 0$. That is, $\nu$ is the law of a generalized normal random variable.
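The step from the additive functional equation to $\ln f(x) = kx$ deserves a word. Here is a sketch of the usual argument, assuming that $g \equiv \ln f$ is continuous on $[0, \infty)$, which is available here because $\phi$ is continuous and positive. The letter $g$ is used as in the proof; the constant is $k = g(1)$.

```latex
% g \equiv \ln f is assumed continuous on [0,\infty) with g(x+y) = g(x) + g(y).
\begin{align*}
g(0) &= g(0 + 0) = 2g(0) \implies g(0) = 0, \\
g(nx) &= n\,g(x) \quad (n \in \mathbb{N},\ \text{by induction on } n), \\
g\left(\tfrac{m}{n}\right) &= m\,g\left(\tfrac1n\right) = \tfrac{m}{n}\,g(1)
   \quad \left(\text{since } n\,g\left(\tfrac1n\right) = g(1)\right), \\
g(x) &= x\,g(1) \equiv kx \quad (x \ge 0,\ \text{by density of the rationals and continuity of } g).
\end{align*}
```

Writing $k = -\frac12\sigma^2$, which is possible because $k \le 0$, gives the stated form of $\phi_\nu$.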
Note the other direction of the implication is obvious. If $\xi, \zeta \sim N(0, \sigma^2)$ and
they are independent, then if $\alpha^2 + \beta^2 = 1$, it follows
$$\alpha\xi + \beta\zeta \sim N(0, \sigma^2)$$
because
$$E\left(e^{it(\alpha\xi + \beta\zeta)}\right) = E\left(e^{it\alpha\xi}\right)E\left(e^{it\beta\zeta}\right)
= e^{-\frac12(\alpha t)^2\sigma^2}\,e^{-\frac12(\beta t)^2\sigma^2}
= e^{-\frac12 t^2\sigma^2},$$
the characteristic function for a random variable which is $N(0, \sigma^2)$. This proves the
theorem.
The next theorem is a useful gimmick for showing certain random variables are
independent in the context of normal distributions.
Where the last equality needs to be justified. When this is done it will follow from
Proposition 36.37 on Page 1017 which is proved on Page 996 that X and Y are
independent. Thus all that remains is to verify
$$E\left(e^{iu\cdot X}\right) = e^{iu\cdot m_X}\,e^{-\frac12 u^*\Sigma_X u}, \quad E\left(e^{iv\cdot Y}\right) = e^{iv\cdot m_Y}\,e^{-\frac12 v^*\Sigma_Y v}.$$
However, this follows from 36.26. To get the first formula, let v = 0. To get the
second, let u = 0. This proves the Theorem.
Note that to verify the conclusion of this theorem, it suffices to show
Next are some technical lemmas. The first is like an earlier result but will require
more work because it will not be assumed a certain function is bounded.
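Proof: Let $\{a_i\}_{i=1}^\infty$ be a countable dense subset of $\mathbb{R}$. Let $B_i^n \equiv B\left(a_i, \frac1n\right) \subseteq \mathbb{R}$
and define Borel sets, $A_i^n \subseteq E$ as follows:
$$A_1^n = h^{-1}(B_1^n), \quad A_{k+1}^n \equiv h^{-1}\left(B_{k+1}^n\right) \setminus \left(\cup_{i=1}^k A_i^n\right).$$
Thus $\{A_i^n\}_{i=1}^\infty$ are disjoint Borel sets, with $h(A_i^n) \subseteq B_i^n$. Also let $b_i^n$ denote the
endpoint of $B\left(a_i, \frac1n\right)$ which is closer to 0.
$$h_i^n \equiv \begin{cases} b_i^n & \text{if } h^{-1}(B_i^n) \ne \emptyset \\ 0 & \text{if } h^{-1}(B_i^n) = \emptyset \end{cases}$$
Then define
$$h_n(x) \equiv \sum_{i=1}^\infty h_i^n\,\mathcal{X}_{A_i^n}(x)$$
Thus $|h_n(x)| \le |h(x)|$ for all $x \in E$ and $|h_n(x) - h(x)| \le 1/n$ for all $x \in E$. Then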
Let
$$h_n^k(x) \equiv \sum_{i=1}^k h_i^n\,\mathcal{X}_{A_i^n}(x).$$
Then from the construction in which the $\{A_i^n\}_{i=1}^\infty$ are disjoint,
Now by the uniform convergence in the construction, you can let $n \to \infty$ and obtain
$$\int_\Omega |h(X(\omega))|\,dP = \int_E |h(x)|\,d\mu.$$
Thus $h \in L^1(E, \mu)$. It is obviously Borel measurable, being the limit of a sequence
of Borel measurable functions. Now similar reasoning to the above and using the
dominated convergence theorem when necessary yields
$$\int_\Omega h_n(X(\omega))\,dP = \int_E h_n(x)\,d\mu$$
is given by
$$\int_A \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{1}{2\sigma^2}(x - m)^2}\,dx$$
for some $\sigma$ and $m$. A Gaussian measure is called symmetric if $m$ is always equal
to 0.
t1 h1 ◦ X + · · · + tn hn ◦ X
= (t1 h1 + · · · + tn hn ) ◦ X
(h1 ◦ X, · · ·, hn ◦ X)
where the ξ i are independent normal random variables having mean 0 for conve-
nience. However, this is a rather trivial case. It is much more interesting to consider
the case of infinite sums of random variables.
J = (t1 , · · ·, tn ) ⊆ I,
(t1 , · · ·, tn ) ⊆ (s1 , · · ·, sp ) ,
then
$$\nu_{t_1\cdots t_n}(F_{t_1} \times \cdots \times F_{t_n}) = \nu_{s_1\cdots s_p}\left(G_{s_1} \times \cdots \times G_{s_p}\right) \tag{36.27}$$
where if si = tj , then Gsi = Ftj and if si is not equal to any of the indices, tk ,
then Gsi = Msi . Then there exists a probability space, (Ω, P, F) and measurable
functions, Xt : Ω → Mt for each t ∈ I such that for each (t1 · · · tn ) ⊆ I,
$$\mathcal{L}(\xi_k) = N(0, 1)$$
and $\{\xi_k\}_{k=1}^\infty$ is independent.
Then for the index set equal to N the measures satisfy the necessary consistency
condition for the Kolmogorov theorem above. Therefore, there exists a probability
space, (Ω, P, F) and measurable functions, ξ k : Ω → R such that
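$$P\left([\xi_{i_1} \in F_{i_1}] \cap [\xi_{i_2} \in F_{i_2}] \cap \cdots \cap [\xi_{i_n} \in F_{i_n}]\right)
= \mu_{i_1\cdots i_n}(F_{i_1} \times \cdots \times F_{i_n})
= P\left([\xi_{i_1} \in F_{i_1}]\right) \cdots P\left([\xi_{i_n} \in F_{i_n}]\right)$$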
which shows the random variables are independent as well as normal with mean 0
and variance 1. This proves the Lemma.
Now let H be a separable Hilbert space. Consider
$$\sum_{k=1}^\infty \lambda_k\, e_k \otimes e_k$$
≡ a + Yn (ω) . (36.30)
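For $n > m$,
$$|Y_n(\omega) - Y_m(\omega)|_H^2 = \sum_{k=m}^n \lambda_k\,\xi_k^2(\omega) \tag{36.31}$$
and
$$\int_\Omega \sum_{k=1}^\infty \lambda_k\,\xi_k(\omega)^2\,dP = \sum_{k=1}^\infty \lambda_k < \infty$$
and so for a.e. $\omega$, 36.31 shows $\{Y_n(\omega)\}$ is a Cauchy sequence in $H$ and therefore,
converges.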
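The series also converges in $L^2(\Omega; H)$ because
$$\int_\Omega \Big(\sum_{k=m}^n \sqrt{\lambda_k}\,\xi_k(\omega)e_k, \sum_{j=m}^n \sqrt{\lambda_j}\,\xi_j(\omega)e_j\Big)\,dP
= \int_\Omega \sum_{k=m}^n \lambda_k\,|\xi_k(\omega)|^2\,dP = \sum_{k=m}^n \lambda_k \le \sum_{k=m}^\infty \lambda_k$$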
and this is the characteristic function for a random variable with mean $(h, a)$ and
variance $\sum_{k=1}^\infty \lambda_k (h, e_k)^2$, this last series converging because
$$\sum_{k=1}^M \lambda_k (h, e_k)^2 \le C\sum_{k=1}^\infty (h, e_k)^2 = C\,|h|^2.$$
Thus for every $h \in H'$, $h \circ X$ is normally distributed. Therefore, Lemma 36.59
implies the following theorem.
Theorem 36.62 Let X (ω) be given by 36.29 as described above. Then letting
µ ≡ L (X) , it follows µ is a Gaussian measure on the separable Hilbert space, H.
{x ∈ H : ((x, e1 ) , · · ·, (x, en )) ∈ F }
where F ∈ B (Rn ) , the Borel sets of Rn and {ek } are orthonormal. Denote this
collection of cylinder sets as C.
Lemma 36.64 σ (C) , the smallest σ algebra containing C, contains the Borel sets
of H, B (H).
Proof: It follows from the definition of these cylinder sets that if fi (x) ≡ (x, ei ) ,
so that fi ∈ H 0 , then with respect to σ (C) , each fi is measurable. It follows that
every linear combination of the fi is also measurable with respect to σ (C). However,
this set of linear combinations is dense in H 0 and so the conclusion of the lemma
follows from Lemma 36.9 on Page 993. This proves the lemma.
Definition 36.65 Define ν on the cylinder sets, C by the following rule. For {ek }
a complete orthonormal set in H,
{x ∈ H : ((x, e1 ) , · · ·, (x, en )) ∈ F }
= {x ∈ H : ((x, f1 ) , · · ·, (x, fn )) ∈ G}
Then it needs to be the case that $\nu$ gives the same result for the two equal cylinder
sets. Let
$$L = \sum_i e_i \otimes f_i.$$
Thus $Lf_i = e_i$ and $L$ maps $H$ one to one and onto and preserves norms and
$$L^* = \sum_i f_i \otimes e_i$$
and maps ei to fi and has the same properties of being one to one and onto H and
preserving norms.
Let
x ∈ {x ∈ H : ((x, e1 ) , · · ·, (x, en )) ∈ F } ≡ A.
Then by definition,
((x, e1 ) , · · ·, (x, en )) ∈ F
and so
((x, Lf1 ) , · · ·, (x, Lfn )) ∈ F
which implies
((L∗ x, f1 ) , · · ·, (L∗ x, fn )) ∈ F
Thus, since L, L∗ are one to one and onto,
A = {x ∈ H : ((x, e1 ) , · · ·, (x, en )) ∈ F }
= {x ∈ H : ((L∗ x, f1 ) , · · ·, (L∗ x, fn )) ∈ F }
= {x ∈ LH : ((L∗ x, f1 ) , · · ·, (L∗ x, fn )) ∈ F }
= {L∗ x ∈ H : ((L∗ x, f1 ) , · · ·, (L∗ x, fn )) ∈ F }
= {y ∈ H : ((y, f1 ) , · · ·, (y, fn )) ∈ F }
= {x ∈ H : ((x, f1 ) , · · ·, (x, fn )) ∈ G}
((y, f1 ) , · · ·, (y, fn )) ∈ F
{x ∈ H : ((x, f1 ) , · · ·, (x, fn )) ∈ G}
which shows
((y, f1 ) , · · ·, (y, fn )) ∈ G.
Hence F ⊆ G. Similarly, G ⊆ F and ν is well defined. This proves the lemma.
It would be natural to try to extend ν to the σ algebra determined by C and
obtain a measure defined on this σ algebra. However, this is always impossible if
the Hilbert space, H is infinite dimensional.
Proof: Let {en } be a complete orthonormal set of vectors in H. Then first note
that H is a cylinder set.
H = {x ∈ H : (x, e1 ) ∈ R}
and so
$$\nu(H) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} e^{-x^2/2}\,dx = 1.$$
where an → ∞.
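$$\nu(A_n) \equiv \frac{1}{\left(\sqrt{2\pi}\right)^{a_n}}\int_{B(0,n)} e^{-|x|^2/2}\,dx
\le \frac{1}{\left(\sqrt{2\pi}\right)^{a_n}}\int_{-n}^n \cdots \int_{-n}^n e^{-|x|^2/2}\,dx_1 \cdots dx_{a_n}
= \left(\frac{\int_{-n}^n e^{-x^2/2}\,dx}{\sqrt{2\pi}}\right)^{a_n}$$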
Now pick $a_n$ so large that the above is smaller than $1/2^{n+1}$. This can be done
because no matter what the choice of $n$,
$$\frac{\int_{-n}^n e^{-x^2/2}\,dx}{\sqrt{2\pi}} < 1.$$
Then
$$\sum_{n=1}^\infty \nu(A_n) \le \sum_{n=1}^\infty \frac{1}{2^{n+1}} = \frac12.$$
This proves the proposition and shows something else must be done to get a measure
from ν.
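For concreteness, here is one explicit way to choose the $a_n$. It is only a sketch: the symbol $p_n$ is my abbreviation for the one dimensional integral above, and the final comment assumes, as the proof appears to arrange, that $H$ is the union of the sets $A_n$.

```latex
% p_n is an abbreviation introduced here; it is not notation from the text.
\begin{align*}
p_n &\equiv \frac{1}{\sqrt{2\pi}}\int_{-n}^{n} e^{-x^2/2}\,dx \in (0,1), \\
p_n^{\,a_n} < 2^{-(n+1)}
 &\iff a_n \ln p_n < -(n+1)\ln 2
  \iff a_n > \frac{(n+1)\ln 2}{\ln(1/p_n)} .
\end{align*}
% Since p_n \to 1, \ln(1/p_n) \to 0, so any admissible a_n tends to infinity.
% If H = \cup_n A_n, countable additivity would force
% 1 = \nu(H) \le \sum_n \nu(A_n) \le 1/2, which is impossible.
```

Any integer $a_n$ above the displayed bound works, and the bound itself grows without limit because $\ln(1/p_n) \to 0$.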
Definition 36.68 Let H be a separable Hilbert space and let ||·|| be a norm defined
on H which has the following property. Whenever {en } is an orthonormal sequence
of vectors in H and F ({en }) consists of the set of all orthogonal projections onto
the span of finitely many of the ek the following condition holds. For every ε > 0
there exists Pε ∈ F ({en }) such that if P ∈ F ({en }) and P Pε = 0, then
$$\nu\left(\{x \in H : ||Px|| > \varepsilon\}\right) < \varepsilon.$$
Lemma 36.69 Let ||·|| be Gross measurable. Then there exists c > 0 such that
||x|| ≤ c |x|
Proof: First it is important to consider the question whether the above defi-
nition is well defined. To do this note that on P H, the two norms are equivalent
because P H is a finite dimensional space. Let G = {y ∈ P H : ||y|| > ε} so G is an
open set in P H. Then
{x ∈ H : ||P x|| > ε}
equals
{x ∈ H : P x ∈ G}
which equals a set of the form
$$\{x \in H : ((x, e_{i_1})_H, \dots, (x, e_{i_m})_H) \in G'\}$$
for $G'$ an open set in $\mathbb{R}^m$ and so everything makes sense in the above definition.
Now it is necessary to verify ||·|| ≤ c |·|. If it is not so, there exists e1 such that
||e1 || ≥ 1, |e1 | = 1.
Suppose $\{e_k\}_{k=1}^n$ have been chosen such that each is a unit vector in $H$ and $||e_k|| \ge k$.
Then considering $\operatorname{span}(e_1, \dots, e_n)^\perp$, if for every $x \in \operatorname{span}(e_1, \dots, e_n)^\perp$, $||x|| \le c\,|x|$,
then if $z \in H$ is arbitrary, $z = x + y$ where $y \in \operatorname{span}(e_1, \dots, e_n)$ and so since the two
norms are equivalent on a finite dimensional subspace, there exists $c'$ corresponding
to $\operatorname{span}(e_1, \dots, e_n)$ such that
$$||z||^2 \le (||x|| + ||y||)^2 \le 2\,||x||^2 + 2\,||y||^2
\le 2c^2|x|^2 + 2c'^2|y|^2
\le \left(2c^2 + 2c'^2\right)\left(|x|^2 + |y|^2\right)
= \left(2c^2 + 2c'^2\right)|z|^2$$
and the lemma is proved. Therefore it can be assumed, there exists
$$e_{n+1} \in \operatorname{span}(e_1, \dots, e_n)^\perp$$
such that $|e_{n+1}| = 1$ and $||e_{n+1}|| \ge n + 1$.
This constructs an orthonormal set of vectors, $\{e_k\}$. Letting $0 < \varepsilon < \frac12$, it
follows since ||·|| is measurable, there exists Pε ∈ F ({en }) such that if P Pε = 0
where P ∈ F ({en }) , then
ν ({x ∈ H : ||P x|| > ε}) < ε.
Say Pε is the projection onto the span of finitely many of the ek , the last one being
eN . Then for n > N and Pn the projection onto en , it follows Pε Pn = 0 and from
the definition of ν,
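$$\varepsilon > \nu\left(\{x \in H : ||P_n x|| > \varepsilon\}\right)
= \nu\left(\{x \in H : |(x, e_n)|\,||e_n|| > \varepsilon\}\right)
= \nu\left(\{x \in H : |(x, e_n)| > \varepsilon/||e_n||\}\right)
\ge \nu\left(\{x \in H : |(x, e_n)| > \varepsilon/n\}\right)
> \frac{1}{\sqrt{2\pi}}\int_{\varepsilon/n}^\infty e^{-x^2/2}\,dx$$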
which yields a contradiction for all n large enough. This proves the lemma.
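The contradiction can be made explicit as follows. This is a sketch in which $c_n$ is my name for the lower limit of integration in the last integral of the chain above; all that is used is $c_n \to 0$ and $\varepsilon < \frac12$.

```latex
% c_n denotes the lower limit of integration in the final integral; c_n \to 0+.
\begin{align*}
\frac{1}{\sqrt{2\pi}}\int_{c_n}^{\infty} e^{-x^2/2}\,dx
 &= \frac12 - \frac{1}{\sqrt{2\pi}}\int_{0}^{c_n} e^{-x^2/2}\,dx
 \ \ge\ \frac12 - \frac{c_n}{\sqrt{2\pi}}
 \ \longrightarrow\ \frac12 \quad (n \to \infty).
\end{align*}
% Since \varepsilon < 1/2, the left side exceeds \varepsilon for all large n,
% contradicting \varepsilon > \frac{1}{\sqrt{2\pi}}\int_{c_n}^\infty e^{-x^2/2}\,dx.
```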
What are examples of Gross measurable norms defined on a separable Hilbert
space, H? The following lemma gives an important example.
Lemma 36.70 Let H be a separable Hilbert space and let A ∈ L2 (H, H) , a Hilbert
Schmidt operator. Thus A is a continuous linear operator with the property that for
any orthonormal set, {ek } ,
$$\sum_{k=1}^\infty |Ae_k|^2 < \infty.$$
$$\sum_{k=N}^\infty |Ae_k|^2 < \alpha$$
where $\alpha$ is chosen very small. In fact, $\alpha$ is chosen such that $\alpha < \varepsilon^2/r^2$ where $r$ is
sufficiently large that
$$\frac{2}{\sqrt{2\pi}}\int_r^\infty e^{-t^2/2}\,dt < \varepsilon. \tag{36.32}$$
¯ ¯
¯m ¯
¯X ¡ ¢ ¯
ν x∈H : ¯¯ x, eij Aeij ¯¯ > ε ≤
¯ j=1 ¯
R∞ 2
m
2 ε/(√mα1/2 ) e−t /2 dt
= √
2π
Next consider a weaker norm for $H$ which comes from the inner product
$$(x, y)_E \equiv \sum_{k=1}^\infty \frac{1}{k^2}\,(x, e_k)_H\,(y, e_k)_H.$$
Then let E be the completion of H with respect to this new norm. Thus {kek }
is a complete orthonormal basis for E. This follows from the density of H in E
along with the obvious observation that in the above inner product, {kek } is an
orthonormal set of vectors.
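The "obvious observation" is the following one line computation, written out here as a sketch. It assumes $\{e_k\}$ is orthonormal in $H$ and that the inner product on $E$ is $(x, y)_E = \sum_i \frac{1}{i^2}(x, e_i)_H (y, e_i)_H$.

```latex
% Assumes (e_k, e_i)_H = \delta_{ki} and (x,y)_E = \sum_i i^{-2} (x,e_i)_H (y,e_i)_H.
\begin{align*}
(k e_k,\ j e_j)_E
 &= \sum_{i=1}^{\infty} \frac{1}{i^2}\,(k e_k, e_i)_H\,(j e_j, e_i)_H
  = \sum_{i=1}^{\infty} \frac{kj}{i^2}\,\delta_{ki}\,\delta_{ji}
  = \frac{kj}{k^2}\,\delta_{kj}
  = \delta_{kj}.
\end{align*}
```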
λ (F ) ≡ P ({ω ∈ Ω : X (ω) ∈ F })
One can pass to the limit because XN (ω) converges to X (ω) in E. This proves the
lemma.
Theorem 36.73 Let (i, H, B) be an abstract Wiener space. Then there exists a
Gaussian measure on the Borel sets of B.
In particular,
$$\nu\left(\left\{x : ||Q_n x - Q_m x|| > 2^{-m}\right\}\right) < 2^{-m}$$
whenever $n \ge m$.
I would like to consider the infinite series,
$$S(\omega) \equiv \sum_{k=1}^\infty k^2\,(X(\omega), e_k)_E\,e_k \in B.$$
converging in B but of course this might make no sense because the series might
not converge. It was shown above that the series converges in E but it has not been
shown to converge in B.
Suppose the series did converge a.e. Then let f ∈ B 0 and consider the random
variable f ◦ S which maps Ω to R. I would like to verify this is normally distributed.
First note that the following finite sum is weakly measurable and separably valued
so it is strongly measurable with values in B.
$$S_{p_n}(\omega) \equiv \sum_{k=1}^{p_n} k^2\,(X(\omega), e_k)_E\,e_k,$$
there exists a unique v ∈ H such that f (x) = (x, v) for all x ∈ H. Then from the
above sum,
$$f(S_{p_n}(\omega)) = (S_{p_n}(\omega), v) = \sum_{k=1}^{p_n} k^2\,(X(\omega), e_k)_E\,(e_k, v)$$
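$$= P\left(\left\{\omega \in \Omega : \Big|\Big|\sum_{k=p_m+1}^n k^2\,(X(\omega), e_k)_E\,e_k\Big|\Big| > 2^{-m}\right\}\right)$$
$$= P\left(\left\{\omega \in \Omega : \Big|\Big|\sum_{k=p_m+1}^n \xi_k(\omega)\,e_k\Big|\Big| > 2^{-m}\right\}\right) \tag{36.36}$$
$$= P\left(\left\{\omega \in \Omega : \left(\xi_n(\omega), \dots, \xi_{p_m+1}(\omega)\right) \in F'\right\}\right)$$
$$= \nu\left(\left\{x \in H : \left((x, e_n)_H, \dots, (x, e_{p_m+1})_H\right) \in F'\right\}\right)$$
$$= \nu\left(\{x \in H : Q(x) - Q_m(x) \in F\}\right)$$
$$= \nu\left(\left\{x \in H : ||Q(x) - Q_m(x)|| > 2^{-m}\right\}\right) < 2^{-m}.$$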
Thus the subsequence {Spn } of the sequence of partial sums of the above series
does converge pointwise in B and so the dominated convergence theorem also verifies
that the computations involving the characteristic function in 36.35 are correct.
The random variable obtained as the limit of the partial sums, $\{S_{p_n}(\omega)\}$
described above is strongly measurable because each $S_{p_n}(\omega)$ is strongly measurable
due to each of these being weakly measurable and separably valued. Thus the
measure given as the law of $S$ defined as
$$S(\omega) \equiv \lim_{n\to\infty} S_{p_n}(\omega)$$
for {ξ k } a sequence of independent random variables which are normal with mean
0 and variance 1 which are defined on a probability space, (Ω, F, P ). Furthermore,
for any k > pn ,
$$P\left(\left\{\omega \in \Omega : ||S_k(\omega) - S_{p_n}(\omega)|| > 2^{-n}\right\}\right) < 2^{-n}.$$
By induction, it follows that if you have n independent random variables each having
symmetric distribution, then their sum has symmetric distribution.
Here is a simple lemma about random variables having symmetric distributions.
It will depend on Lemma 36.57 on Page 1029.
You can also change the inequalities in the obvious way, < to ≤ , > to ≥.
Thus $A_k$ consists of those $\omega$ where $||S_k(\omega)|| > r$ for the first time at $k$. Thus
$$\Big[\sup_{k \le N-1} ||S_k|| > r \text{ and } ||S_N|| \le r\Big] = \cup_{j=1}^{N-1} A_j \cap [||S_N|| \le r]$$
and the sets in the above union are disjoint. Consider Aj ∩ [||SN || ≤ r] . For ω in
this set,
||Sj (ω)|| > r, ||Si (ω)|| ≤ r if i < j.
Since ||SN (ω)|| ≤ r in this set, it follows
$$\Big|\Big|S_j(\omega) + \sum_{i=j+1}^N \xi_i(\omega)\Big|\Big| \le r$$
and so from the symmetry of the distributions and Lemma 36.76 the following
computation is valid.
$$P\left(A_j \cap [||S_N|| \le r]\right) \tag{36.40}$$
$$= P\left(\cap_{i=1}^{j-1}[||S_i|| \le r] \cap [||S_j|| > r] \cap \Big[\Big|\Big|S_j + \sum_{i=j+1}^N \xi_i\Big|\Big| \le r\Big]\right) \tag{36.41}$$
Now $\cap_{i=1}^{j-1}[||S_i|| \le r] \cap [||S_j|| > r]$ is of the form
$$\left[\left(\xi_1, \dots, \xi_j\right) \in A\right]$$
for some Borel set, $A$. Then letting $Y = \sum_{i=j+1}^N \xi_i$ in Lemma 36.76 and $X_i = \xi_i$,
36.41 equals
$$P\left(\cap_{i=1}^{j-1}[||S_i|| \le r] \cap [||S_j|| > r] \cap \Big[\Big|\Big|S_j - \sum_{i=j+1}^N \xi_i\Big|\Big| \le r\Big]\right)$$
$$= P\left(\cap_{i=1}^{j-1}[||S_i|| \le r] \cap [||S_j|| > r] \cap [||S_j - (S_N - S_j)|| \le r]\right)$$
$$= P\left(\cap_{i=1}^{j-1}[||S_i|| \le r] \cap [||S_j|| > r] \cap [||2S_j - S_N|| \le r]\right)$$
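The reason this last form is useful is a one line application of the triangle inequality, sketched here: on the event in question, $||S_N||$ is forced to exceed $r$.

```latex
% On the event [ ||S_j|| > r ] \cap [ ||2S_j - S_N|| \le r ]:
\begin{align*}
||S_N|| = ||2S_j - (2S_j - S_N)||
 \ \ge\ 2\,||S_j|| - ||2S_j - S_N||
 \ >\ 2r - r = r .
\end{align*}
% Hence A_j \cap [ ||2S_j - S_N|| \le r ] \subseteq A_j \cap [ ||S_N|| > r ].
```

Consequently $P\left(A_j \cap [||S_N|| \le r]\right) \le P\left(A_j \cap [||S_N|| > r]\right)$ for each $j$.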
It follows that
$$P\left(\Big[\sup_{k \le N-1} ||S_k|| > r \text{ and } ||S_N|| \le r\Big]\right) = \sum_{j=1}^{N-1} P\left(A_j \cap [||S_N|| \le r]\right)
\le \sum_{j=1}^{N-1} P\left(A_j \cap [||S_N|| > r]\right)
\le P\left([||S_N|| > r]\right)$$
Then in fact,
$$S_k(\omega) \to S(\omega) \text{ a.e. } \omega \tag{36.43}$$
In using this lemma, you could renumber the $\zeta_i$ so that the sum
$$\sum_{j=n_k+1}^{l} \zeta_j$$
corresponds to
$$\sum_{j=1}^{l-n_k} \xi_j$$
each of which has measure no more than $2^{-(k-1)}$. Thus $\omega$ must be in a set of
measure zero. This proves the lemma.
Now with this preparation, here is the theorem about white noise.
Theorem 36.79 Let (i, H, B) be an abstract Wiener space. Then there exists a
Gaussian measure on the Borel sets of B. This Gaussian measure equals L (S)
where S (ω) is the a.e. limit of the sequence of partial sums,
$$S_n(\omega) \equiv \sum_{k=1}^n \xi_k(\omega)\,e_k$$
for {ξ k } a sequence of independent random variables which are normal with mean
0 and variance 1 which are defined on a probability space, (Ω, F, P ) and {ek } is a
complete orthonormal sequence in H.
whenever k > pn and so by Lemma 36.78 the original sequence of partial sums
also converges a.e. The reason this lemma applies is that ξ k (ω) ek has symmetric
distribution. This proves the corollary.
Lemma 36.80 Let E be a separable Banach space. Then there exists an increasing
sequence of subspaces, {Fn } such that dim (Fn+1 ) − dim (Fn ) ≤ 1 and equals 1 for
all $n$ if the dimension of $E$ is infinite. Also $\cup_{n=1}^\infty F_n$ is dense in $E$.
Proof: Since $E$ is separable, so is $\partial B(0, 1)$, the boundary of the unit ball. Let
$\{w_k\}_{k=1}^\infty$ be a countable dense subset of $\partial B(0, 1)$.
Let $z_1 = w_1$. Let $F_1 = \mathbb{F}z_1$. Suppose $F_n$ has been obtained and equals $\operatorname{span}(z_1, \dots, z_n)$
where $\{z_1, \dots, z_n\}$ is independent, $||z_k|| = 1$, and if $n \ne m$,
$$||z_m - z_n|| \ge \frac12.$$
$$0 = \lim_{k\to\infty} \frac{y_k}{|c^k|} = \lim_{k\to\infty} \sum_{j=1}^n \frac{c_j^k}{|c^k|}\,z_j. \tag{36.45}$$
Then if $m < n + 1$,
$$||z_{n+1} - z_m|| = \Big|\Big|\frac{w - y}{||w - y||} - z_m\Big|\Big|
= \Big|\Big|\frac{w - y}{||w - y||} - \frac{||w - y||\,z_m}{||w - y||}\Big|\Big|
\ge \frac{1}{2\lambda}\,\big|\big|w - y - ||w - y||\,z_m\big|\big|
\ge \frac{\lambda}{2\lambda} = \frac12.$$
This has shown the existence of an increasing sequence of subspaces, {Fn } as de-
scribed above. It remains to show the union of these subspaces is dense. First note
that the union of these subspaces must contain the $\{w_k\}$ because if $w_m$ is missing,
then it would contradict the construction at the $m$th step. That one should
have been chosen. However, $\{w_k\}$ is dense in $\partial B(0, 1)$. If $x \in E$ and $x \ne 0$, then
$\frac{x}{||x||} \in \partial B(0, 1)$ and so there exists
$$w_m \in \{w_k\} \subseteq \cup_{n=1}^\infty F_n$$
such that $\Big|\Big|w_m - \frac{x}{||x||}\Big|\Big| < \frac{\varepsilon}{||x||}$. But then
Lemma 36.81 Let E be a separable Banach space. Then there exists a sequence
$\{e_n\}$ of points of $E$ such that whenever $|\beta| \le 1$ for $\beta \in \mathbb{F}^n$,
$$\sum_{k=1}^n \beta_k e_k \in B(0, 1)$$
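Then
$$C_n \equiv \left\{\sum_{k=1}^n \beta_k\alpha_k z_k : \beta \in D_n\right\}$$
and so
$$\Big|\Big|\sum_{k=1}^{n+1} \beta_k\alpha_k z_k - \sum_{k=1}^n \beta_k\alpha_k z_k\Big|\Big| = \big|\big|\beta_{n+1}\alpha_{n+1}z_{n+1}\big|\big|
< ||\alpha_{n+1}z_{n+1}|| < \delta$$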
which shows
$$\sum_{k=1}^{n+1} \beta_k\alpha_k z_k \in B(0, 1).$$
Theorem 36.82 Let E be a real separable Banach space with norm ||·||. Then there
exists a separable Hilbert space, H such that H is dense in E and the inclusion map
is continuous. Furthermore, if ν is the Gaussian measure defined earlier on the
cylinder sets of H, ||·|| is Gross measurable.
Proof: Let {ek } be the points of E described in Lemma 36.81. Then let H0
denote the subspace of all finite linear combinations of the {ek }. It follows H0 is
dense in $E$. Next decree that $\{e_k\}$ is an orthonormal basis for $H_0$. Thus for
$$\sum_{k=1}^n c_k e_k,\ \sum_{j=1}^n d_j e_j \in H_0,$$
$$\Big(\sum_{k=1}^n c_k e_k,\ \sum_{j=1}^n d_j e_j\Big)_{H_0} \equiv \sum_{k=1}^n c_k d_k$$
this being well defined because the {ek } are linearly independent. Let the norm on
H0 be denoted by |·|H0 . Let H1 be the completion of H0 with respect to this norm.
I want to show that $|\cdot|_{H_0}$ is stronger than $||\cdot||$. Suppose then that
$$\Big|\sum_{k=1}^n \beta_k e_k\Big|_{H_0} \le 1.$$
and so
$$||h|| < |h|_{H_0}.$$
It follows that the completion of $H_0$ must lie in $E$ because this shows that every
Cauchy sequence in $H_0$ is a Cauchy sequence in $E$. Thus $H_1$ embeds continuously
into $E$ and is dense in $E$. Denote its norm by $|\cdot|_{H_1}$.
Now consider the Hilbert Schmidt operator,
$$A = \sum_{k=1}^\infty \lambda_k\,e_k \otimes e_k$$
where each $\lambda_k > 0$ and $\sum_k \lambda_k^2 < \infty$. This operator is clearly one to one. Let
$$H \equiv AH_1.$$
$$\sum_{k=1}^\infty \lambda_k\,|(x, e_k)| \le \left(\sum_{k=1}^\infty \lambda_k^2\right)^{1/2}\left(\sum_{k=1}^\infty |(x, e_k)|^2\right)^{1/2} < \infty.$$
$H$ is complete because if $\{x_n\}$ is a Cauchy sequence in $H$, this is the same as
$\{A^{-1}x_n\}$ being a Cauchy sequence in $H_1$ which implies $A^{-1}x_n \to y$ for some
$y \in H_1$. Then it follows $x_n = A\left(A^{-1}x_n\right) \to Ay$ in $H$.
For $x \in H \subseteq H_1$,
$$||x|| \le |x|_{H_1} = \left|AA^{-1}x\right|_{H_1} \le ||A||\,\left|A^{-1}x\right|_{H_1} \equiv ||A||\,|x|_H$$
Proof of the claim: From the definition of the inner product in H, it follows
an orthonormal basis for H is {λk ek } . This is because
$$(\lambda_k e_k, \lambda_j e_j)_H \equiv \left(\lambda_k A^{-1}e_k, \lambda_j A^{-1}e_j\right)_{H_1} = (e_k, e_j)_{H_1} = \delta_{jk}.$$
because this is the definition of an operator being Hilbert Schmidt. However, the
above equals
$$\sum_k \left|A^{-1}A(\lambda_k e_k)\right|_{H_1}^2 = \sum_k \lambda_k^2 < \infty.$$
Corollary 36.83 Let E be any real separable Banach space and let {ξ k } be any
sequence of independent random variables such that L (ξ k ) = N (0, 1). Then there
exists a sequence, {ek } ⊆ E such that
\[ X(\omega) \equiv \sum_{k=1}^{\infty} \xi_k(\omega) e_k \]
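This equals
\begin{align*}
& \frac{1}{\sqrt{2}} \sum_{j=1}^{m} t_j h_j(X) + \frac{1}{\sqrt{2}} \sum_{j=1}^{m} t_j h_j(Y) + \frac{1}{\sqrt{2}} \sum_{i=1}^{k} s_i g_i(X) - \frac{1}{\sqrt{2}} \sum_{i=1}^{k} s_i g_i(Y) \\
&= \frac{1}{\sqrt{2}} \left( \sum_{j=1}^{m} t_j h_j + \sum_{i=1}^{k} s_i g_i \right)(X) + \frac{1}{\sqrt{2}} \left( \sum_{j=1}^{m} t_j h_j - \sum_{i=1}^{k} s_i g_i \right)(Y)
\end{align*}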
and this is the sum of two independent normally distributed random variables so it
is also normally distributed. Therefore, by Theorem 36.51
\[ (h_1 \circ X', \cdots, h_m \circ X', g_1 \circ Y', \cdots, g_k \circ Y') \]
is a random variable with multivariate normal distribution and by Theorem 36.56
the two random vectors
\[ (h_1 \circ X', \cdots, h_m \circ X') \ \text{and}\ (g_1 \circ Y', \cdots, g_k \circ Y') \]
are independent if
\[ E\left( (h_i \circ X')(g_j \circ Y') \right) = 0 \]
for all i, j. This is what I will show next.
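\begin{align*}
E\left( (h_i \circ X')(g_j \circ Y') \right) &= \frac{1}{2} E\left( (h_i(X) + h_i(Y))(g_j(X) - g_j(Y)) \right) \\
&= \frac{1}{2} E(h_i(X) g_j(X)) - \frac{1}{2} E(h_i(X) g_j(Y)) \\
&\quad + \frac{1}{2} E(h_i(Y) g_j(X)) - \frac{1}{2} E(h_i(Y) g_j(Y)) \tag{36.47}
\end{align*}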
36.14. FERNIQUE’S THEOREM 1057
Now from the above observation after the definition of Gaussian measure hi (X) gj (X)
and hi (Y ) gj (Y ) are both in L1 because each term in each product is normally dis-
tributed. Therefore, by Lemma 36.57,
\begin{align*}
E(h_i(X) g_j(X)) &= \int_{\Omega} h_i(Y) g_j(Y)\, dP \\
&= \int_{E} h_i(y) g_j(y)\, d\mu \\
&= \int_{\Omega} h_i(X) g_j(X)\, dP \\
&= E(h_i(Y) g_j(Y))
\end{align*}
due to the assumption that µ is symmetric which implies the mean of these random
variables equals 0. The other term works out similarly. This has proved the
independence of the random variables, $X'$ and $Y'$.
Next consider the claim they have the same law and it equals µ. To do this, I
will use Theorem 36.36 on Page 1016. Thus I need to show
\[ E\left( e^{ih(X')} \right) = E\left( e^{ih(Y')} \right) = E\left( e^{ih(X)} \right) \tag{36.48} \]
for all $h \in E'$. Pick such an h. Then $h \circ X$ is normally distributed and has mean 0.
Therefore, for some σ,
\[ E\left( e^{ith \circ X} \right) = e^{-\frac{1}{2} t^2 \sigma^2}. \]
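Now since X and Y are independent,
\begin{align*}
E\left( e^{ith \circ X'} \right) &= E\left( e^{ith\left( \frac{1}{\sqrt{2}}(X+Y) \right)} \right) \\
&= E\left( e^{ith\left( \frac{1}{\sqrt{2}} X \right)} \right) E\left( e^{ith\left( \frac{1}{\sqrt{2}} Y \right)} \right)
\end{align*}
the product of two characteristic functions of two random variables, $\frac{1}{\sqrt{2}} X$ and $\frac{1}{\sqrt{2}} Y$.
The variance of $h$ applied to these two random variables, which are normally distributed with
zero mean, is $\frac{1}{2}\sigma^2$ and so
\[ E\left( e^{ith \circ X'} \right) = e^{-\frac{1}{2} t^2 \left( \frac{1}{2}\sigma^2 \right)} e^{-\frac{1}{2} t^2 \left( \frac{1}{2}\sigma^2 \right)} = e^{-\frac{1}{2} t^2 \sigma^2} = E\left( e^{ith \circ X} \right). \]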
1058 PROBABILITY IN INFINITE DIMENSIONS
Similar reasoning shows $E\left( e^{ith \circ Y'} \right) = E\left( e^{ith \circ Y} \right) = E\left( e^{ith \circ X} \right)$. Letting t = 1, this
yields 36.48. This proves the lemma.
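The following is a minimal numerical sketch of the lemma in the simplest setting, assuming E = R and µ = N(0, 1) (an assumption made only for illustration; the lemma itself concerns a general Banach space). The helper names `rotate_pair` and `sample_moments` are hypothetical. The sketch draws independent standard normal samples X and Y, forms (X + Y)/√2 and (X − Y)/√2, and estimates their variances and covariance, which should be close to 1, 1, and 0.

```python
import math
import random


def rotate_pair(x, y):
    """Return ((x + y)/sqrt(2), (x - y)/sqrt(2))."""
    s = math.sqrt(2.0)
    return (x + y) / s, (x - y) / s


def sample_moments(n=200_000, seed=0):
    """Estimate Var(X'), Var(Y') and Cov(X', Y') by Monte Carlo."""
    rng = random.Random(seed)
    sum_u2 = sum_v2 = sum_uv = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # sample of X
        y = rng.gauss(0.0, 1.0)  # independent sample of Y
        u, v = rotate_pair(x, y)
        sum_u2 += u * u
        sum_v2 += v * v
        sum_uv += u * v
    return sum_u2 / n, sum_v2 / n, sum_uv / n


var_u, var_v, cov_uv = sample_moments()
print(var_u, var_v, cov_uv)
```

Zero covariance only gives independence here because the pair is jointly normal, which is exactly the step the proof takes through the multivariate normal theorem.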
With this preparation, here is an incredible theorem due to Fernique.
then
\[ \int_E e^{\lambda \|x\|^2}\, d\mu \le \exp\left( \lambda r^2 \right) + \frac{e^2}{e^2 - 1}. \]
\[ \frac{1}{\sqrt{2}}(X - Y),\quad \frac{1}{\sqrt{2}}(X + Y) \]
are also independent and have the same law. Now let $0 \le s \le t$ and use independence
of the above random variables along with the fact they have the same law as
X and Y to obtain
\begin{align*}
P(\|X\| \le s)\, P(\|Y\| > t) &= P\left( \left\| \frac{1}{\sqrt{2}}(X - Y) \right\| \le s \right) P\left( \left\| \frac{1}{\sqrt{2}}(X + Y) \right\| > t \right) \\
&= P\left( \left\| \frac{1}{\sqrt{2}}(X - Y) \right\| \le s,\ \left\| \frac{1}{\sqrt{2}}(X + Y) \right\| > t \right) \\
&\le P\left( \frac{1}{\sqrt{2}} \big| \|X\| - \|Y\| \big| \le s,\ \frac{1}{\sqrt{2}} (\|X\| + \|Y\|) > t \right).
\end{align*}
Now consider the following picture in which the region, R represents the points,
(||X|| , ||Y ||) such that
\[ \frac{1}{\sqrt{2}} \big| \|X\| - \|Y\| \big| \le s \ \text{ and } \ \frac{1}{\sqrt{2}} (\|X\| + \|Y\|) > t. \]
[Figure: the region R in the (||X||, ||Y||) plane, with the two marked boundary points ((t − s)/√2, ?) and (?, (t − s)/√2).]
From 36.49,
\[ P\left( \left[ \exp\left( \lambda \|X\|^2 \right) > \exp\left( \lambda t_n^2 \right) \right] \right) \le P([\|X\| \le r])\, e^{\ln(\alpha_0(r)) 2^n}. \]
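Now split the above improper integral into intervals, $\left( \exp(\lambda t_n^2), \exp(\lambda t_{n+1}^2) \right)$ for
$n = 0, 1, \cdots$ and note that $P\left( \left[ e^{\lambda \|X\|^2} > t \right] \right)$ is decreasing in t. Then from 36.50,
\begin{align*}
\int_E e^{\lambda \|x\|^2}\, d\mu &\le \exp\left( \lambda r^2 \right) + \sum_{n=0}^{\infty} \int_{\exp(\lambda t_n^2)}^{\exp(\lambda t_{n+1}^2)} P\left( \left[ e^{\lambda \|X\|^2} > t \right] \right) dt \\
&\le \exp\left( \lambda r^2 \right) + \sum_{n=0}^{\infty} P\left( \left[ e^{\lambda \|X\|^2} > \exp\left( \lambda t_n^2 \right) \right] \right) \left( \exp\left( \lambda t_{n+1}^2 \right) - \exp\left( \lambda t_n^2 \right) \right) \\
&\le \exp\left( \lambda r^2 \right) + \sum_{n=0}^{\infty} P([\|X\| \le r])\, e^{\ln(\alpha_0(r)) 2^n} \exp\left( \lambda t_{n+1}^2 \right) \\
&\le \exp\left( \lambda r^2 \right) + \sum_{n=0}^{\infty} e^{\ln(\alpha_0(r)) 2^n} \exp\left( \lambda t_{n+1}^2 \right).
\end{align*}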
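and so
\[ t_{n+1} \le 5r \left( \sqrt{2} \right)^n. \]
Therefore,
\[ \int_E e^{\lambda \|x\|^2}\, d\mu \le \exp\left( \lambda r^2 \right) + \sum_{n=0}^{\infty} e^{\ln(\alpha_0(r)) 2^n + \lambda 25 r^2 2^n}. \]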
Now first pick r large enough that ln (α0 (r)) < −2 and then let λ be small enough
that 25λr2 < 1 or some such scheme and you obtain ln (α0 (r)) + λ25r2 < −1. Then
36.15. REPRODUCING KERNELS 1061
for this choice of r and λ, or for any other choice which makes ln (α0 (r)) + λ25r2 <
−1,
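\begin{align*}
\int_E e^{\lambda \|x\|^2}\, d\mu &\le \exp\left( \lambda r^2 \right) + \sum_{n=0}^{\infty} e^{-2^n} \\
&\le \exp\left( \lambda r^2 \right) + \sum_{n=0}^{\infty} e^{-2n} \\
&= \exp\left( \lambda r^2 \right) + \frac{e^2}{e^2 - 1}.
\end{align*}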
This proves the theorem.
Note this theorem implies all moments exist for Gaussian measures.
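The last two steps of the proof compare the series with terms e^{-2^n} to the geometric series with terms e^{-2n} and then sum the latter in closed form. The sketch below is only a numerical sanity check of those two steps; the helper name `partial_sum` and the truncation at 60 terms are choices made here, not part of the text.

```python
import math


def partial_sum(term, n_terms=60):
    """Sum term(n) for n = 0, ..., n_terms - 1."""
    return sum(term(n) for n in range(n_terms))


# Series appearing after the choice of r and lambda in the proof.
double_exp_sum = partial_sum(lambda n: math.exp(-(2.0 ** n)))
# Dominating geometric series, valid since 2**n >= 2*n for n >= 0.
geometric_sum = partial_sum(lambda n: math.exp(-2.0 * n))
# Closed form of the geometric series: 1/(1 - e^{-2}) = e^2/(e^2 - 1).
closed_form = math.e ** 2 / (math.e ** 2 - 1.0)

print(double_exp_sum, geometric_sum, closed_form)
```

The truncation is harmless because the neglected tail of the geometric series is of order e^{-120}.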
and so $\phi \in L^2(E)$. Thus you can consider $E' \subseteq L^2(E)$. Let $\overline{E'}$ denote the closure
of $E'$ in $L^2(E)$. Then $\overline{E'}$ is a Hilbert space with inner product given by
\[ (\phi, \psi) \equiv \int_E \phi(x) \psi(x)\, d\mu. \]
For $\phi \in L^2(E)$, denote by $R^{-1}\phi$ the element of E given by the Bochner integral,
\[ R^{-1}\phi \equiv \int_E x \phi(x)\, d\mu. \tag{36.52} \]
Also in 36.52 the integrand is weakly measurable and is separably valued so the
Bochner integral makes sense as claimed and the integrand is in $L^1(E; E)$.
The map, $R^{-1}$ is clearly linear and it is also one to one on $\overline{E'}$ because if $R^{-1}\phi = 0$,
then there exists a sequence $\{\phi_n\} \subseteq E'$ converging to φ in $L^2(E)$. Therefore,
\[ 0 = \phi_n\left( \int_E x \phi(x)\, d\mu \right) = \int_E \phi_n(x) \phi(x)\, d\mu \to \int_E \phi^2(x)\, d\mu \]
Since $R^{-1}$ is one to one, the inner product is well defined and the map, $R^{-1} : \overline{E'} \to H$
is one to one, onto, and preserves norms. Therefore, H is also a Hilbert space.
Now before making the next observation, note that by Fernique's theorem,
Theorem 36.85, there exists λ > 0 such that
\[ \int_E \|x\|^2\, d\mu = \frac{1}{\lambda} \int_E \lambda \|x\|^2\, d\mu \le \frac{1}{\lambda} \int_E e^{\lambda \|x\|^2}\, d\mu \equiv C_\mu < \infty. \]
E λ E λ E
Now it follows from all this that H is a Hilbert space which embeds continuously
into E and for $\phi \in E'$,
\[ = \|\phi\|_{L^2(E)} = \sigma. \]
by 36.51.
Finally, I claim that H must be dense in E. To see this, suppose it is not the
case. Then by a standard use of the Hahn Banach theorem, there would exist $\phi \in E'$
such that $\phi(H) = 0$ but $\phi \ne 0$. But then
\[ 0 = \phi\left( R^{-1}\phi \right) \equiv \phi\left( \int_E x \phi(x)\, d\mu \right) = \int_E \phi^2(x)\, d\mu \]
The integrand is weakly measurable and separably valued and Fernique's theorem
implies
\[ \int_E \|x\|_E |\phi(x)|\, d\mu \le \left( \int_E \|x\|^2\, d\mu \right)^{1/2} \left( \int_E |\phi(x)|^2\, d\mu \right)^{1/2} < \infty \]
a contradiction to $\phi \ne 0$.
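Now suppose $H_1$ and H are two reproducing kernel spaces for µ. Let $\phi \in H_1' \subseteq L^2(E, \mu)$.
Since the norm on $H'$ equals the $L^2(E)$ norm, it follows that for
$\phi \in E'$, $\psi \in H'$ and R the Riesz map from H to $H'$,
\begin{align*}
\phi\left( R^{-1}(\psi) \right) &= R\left( R^{-1}(\phi) \right)\left( R^{-1}\psi \right) \\
&= \left( R^{-1}\phi, R^{-1}\psi \right)_H = (\phi, \psi)_{H'} \\
&= \int_E \phi(x) \psi(x)\, d\mu = \phi\left( S^{-1}(\psi) \right).
\end{align*}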
Also, for $x \in H$,
\[ (h, x)_{H_\mu} = \left( R^{-1}\phi_h, x \right)_{H_\mu} \equiv \phi_h(x), \]
a similar formula holding for g in place of h. Now using this in 36.53 yields the
following interesting formula.
\[ (h, g)_{H_\mu} \equiv \int_E (h, x)_{H_\mu} (g, x)_{H_\mu}\, d\mu. \]
Next consider the question of how to identify reproducing kernels and how to
tell whether a given probability measure is a Gaussian measure.
Before the next theorem is proved, recall the following two theorems proved on
Pages 1029 and 1027 respectively.
Also recall the following theorem and corollary proved on Page 870.
where mj = E (Xj ).
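Lemma 36.94 Let $M \subseteq E'$, where E is a real separable Banach space, be such that
$\sigma(M) = \mathcal{B}(E)$. Also suppose X, Y are two E valued random variables such that for
all $n \in \mathbb{N}$, and $\vec{\phi} \in M^n$, $\mathcal{L}(\vec{\phi} \circ X) = \mathcal{L}(\vec{\phi} \circ Y)$. That is, for all $F \in \mathcal{B}(\mathbb{R}^n)$,
\[ P\left( \left[ \vec{\phi} \circ X \in F \right] \right) = P\left( \left[ \vec{\phi} \circ Y \in F \right] \right) \]
Then $\mathcal{L}(X) = \mathcal{L}(Y)$.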
Proof: Define $\mathcal{F}$ as the π system which consists of cylindrical sets of the form
\[ \left\{ x \in E : \vec{\phi}(x) \in \prod_{i=1}^{m} G_i \right\},\quad G_i \in \mathcal{B}(\mathbb{R}) \]
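and so $\mathcal{F} \subseteq \mathcal{G}$. If $A \in \mathcal{G}$ then
\begin{align*}
P\left( \left[ X \in A^C \right] \right) &= 1 - P([X \in A]) \\
&= 1 - P([Y \in A]) = P\left( \left[ Y \in A^C \right] \right)
\end{align*}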
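of $\mathcal{G}$,
\begin{align*}
P\left( \left[ X \in \cup_{i=1}^{\infty} A_i \right] \right) &= P\left( \cup_{i=1}^{\infty} [X \in A_i] \right) \\
&= \sum_{i=1}^{\infty} P([X \in A_i]) \\
&= \sum_{i=1}^{\infty} P([Y \in A_i]) \\
&= P\left( \left[ Y \in \cup_{i=1}^{\infty} A_i \right] \right).
\end{align*}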
It follows from the lemma about π systems, Lemma 9.72 on Page 257 that G = σ (F) =
B (E) and this says L (X) = L (Y ). This proves the lemma.
So when do the conditions of this lemma hold? It seems a fairly strong assumption
to have $\mathcal{L}(\vec{\phi} \circ X) = \mathcal{L}(\vec{\phi} \circ Y)$ for all $\vec{\phi} \in M^n$ for any $n \in \mathbb{N}$. In the next
corollary, this condition will hold. This corollary says that if $\sigma(M) = \mathcal{B}(E)$, then
in verifying a probability measure is Gaussian, you only need to consider $\phi \in M$
rather than all $\phi \in E'$.
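\[ \mathcal{L}(\phi(\alpha X + \beta Y)) = \mathcal{L}(\phi(X)), \]
and both random variables are normally distributed with 0 mean. Now take $\vec{\phi} \in M^n$
and consider $a \cdot \vec{\phi}$ which is also in M because M is a subspace. Then from the above,
\[ \mathcal{L}\left( a \cdot \vec{\phi}(\alpha X + \beta Y) \right) = \mathcal{L}\left( a \cdot \left( \alpha \vec{\phi}(X) + \beta \vec{\phi}(Y) \right) \right) = \mathcal{L}\left( a \cdot \vec{\phi}(X) \right). \]
and the random variables, $a \cdot \vec{\phi}(X)$ and $a \cdot \left( \alpha \vec{\phi}(X) + \beta \vec{\phi}(Y) \right)$ are both normally
distributed with 0 mean and have the same distribution. Then by Corollary 36.93
\begin{align*}
\mathcal{L}\left( \vec{\phi}(X) \right) &= \mathcal{L}\left( \alpha \vec{\phi}(X) + \beta \vec{\phi}(Y) \right) \\
&= \mathcal{L}\left( \vec{\phi}(\alpha X + \beta Y) \right)
\end{align*}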
2 Recall this means the smallest σ algebra such that each function in M is measurable.
and both are equal to a multivariate normal distribution. Now applying Lemma
36.94, it follows
\[ \mathcal{L}(X) = \mathcal{L}(\alpha X + \beta Y) = \mu \tag{36.54} \]
whenever $\alpha^2 + \beta^2 = 1$.
I want to verify $\mathcal{L}(\phi) = N\left( 0, \|\phi\|_{L^2(E)}^2 \right)$ for all $\phi \in E'$. I have just shown that
whenever X, Y are independent with $\mathcal{L}(X) = \mathcal{L}(Y) = \mu$, then if $\alpha^2 + \beta^2 = 1$, it
follows 36.54 holds. Now take an arbitrary $\phi \in E'$. It follows
Then it follows
\[ \|i^* \phi\|_{H'} = \|\phi\|_{L^2(E)} \]
for all $\phi \in E'$ and H is the reproducing kernel space for µ.
(φ − φ0n , ψ 1 , · · ·, ψ m ) (36.56)
is a multivariate normal random vector having values in Rm+1 . Now the random
vector,
(φ − φ0 , ψ 1 , · · ·, ψ m ) (36.57)
is the limit in L2 (E) as n → ∞ of the random vectors in 36.56 and so the means
and covariances of the vectors in 36.56 converge. Thus the vector in 36.57 is also a
multivariate normal random vector. By Theorem 36.90 it follows the two random
vectors, φ − φ0 and (ψ 1 , · · ·, ψ m ) are independent. Now it is easy to see that
σ (M ) = ∪F ⊆M,F finite σ (F ) .
Lemma 36.98 Suppose G and F are two σ algebras on a probability space, (Ω, S, P )
and suppose they are independent and that G ⊆ F. Then if A ∈ G it follows either
P (A) = 0 or P (A) = 1.
and so $P(A) = P(A)^2$. This proves the lemma.
Now continuing with the proof of Theorem 36.97, σ (φ − φ0 ) ⊆ B (E) and the
two are independent so every set in σ (φ − φ0 ) has measure either 0 or 1. Thus
Therefore, letting $\phi \in M$,
\begin{align*}
\phi\left( \int_E x f(x)\, d\mu \right) &= \int_E \phi(x) f(x)\, d\mu \\
&= \lim_{n \to \infty} \int_E \phi(x) \phi_n(x)\, d\mu
\end{align*}
where R−1 is the inverse of the Riesz map, R from H to H 0 which is defined by
Now here is where the assumption that M separates the points is used. The above equation shows
since φ is arbitrary that
\[ \int_E x f(x)\, d\mu = R^{-1}(i^* f) \]
\[ (i^* f, i^* g)_{H'} = \int_E f(x) g(x)\, d\mu \]
because
\begin{align*}
(i^* f, i^* g)_{H'} &= \left( R^{-1}(i^* f), R^{-1}(i^* g) \right)_H \\
&= i^* f\left( R^{-1}(i^* g) \right) = f\left( \int_E x g(x)\, d\mu \right) \\
&= \int_E f(x) g(x)\, d\mu.
\end{align*}
and this proves the theorem. Since H has the properties of the reproducing kernel
space, it equals Hµ .
Hµ ⊆ E, E 0 ⊆ Hµ0 ⊆ L2 (E, µ) .
Also recall Hµ0 ≡ R−1 (E 0 ) where R is the Riesz map from Hµ to Hµ0 satisfying
Then the theorem about white noise to be proved here is the following.
Theorem 36.99 Let E be a real separable Banach space and let µ be a Gaussian
measure defined on the Borel sets of E and let Hµ be the reproducing kernel space for
E. Suppose also that there exists an orthonormal complete basis for Hµ , {en } ⊆ Hµ0
such that for $\phi_n \in E'$ defined by $e_n = R^{-1}\phi_n$, span $(\{\phi_n\})$ is also dense in $E'$. Then
if $\{\xi_j\}$ is a sequence of independent random variables having mean 0 and variance
1, which are defined on a probability space, $(\Omega, \mathcal{F}, P)$ it follows
\[ X(\omega) \equiv \sum_{i=1}^{\infty} \xi_i(\omega) e_i \tag{36.59} \]
Proof: Let φi = R (ei ) where R is the Riesz map from Hµ to Hµ0 . Then it
follows φi is normally distributed with mean 0 and variance σ 2 . What is σ 2 ? By
definition, and the properties of the reproducing kernel space,
\begin{align*}
\sigma^2 &= \int_E \phi_i^2(x)\, d\mu \equiv (\phi_i, \phi_i)_{H_\mu'} \\
&= \left( R^{-1}\phi_i, R^{-1}\phi_i \right)_{H_\mu} = (e_i, e_i)_{H_\mu} = 1.
\end{align*}
I claim that it is also the case that $\{\phi_i\}$ are independent. First note that if $\alpha_i \in \mathbb{R}$,
then $\sum_{i=1}^{p} \alpha_i \phi_{n_i} \in E'$ and so is also normally distributed. Hence by Theorem 31.23
on Page 870, $\left( \phi_{n_1}, \cdots, \phi_{n_p} \right)$ is multivariate normal. Now
\begin{align*}
E\left( \phi_{n_j} \phi_{n_k} \right) &\equiv \int_E \phi_{n_j}(x) \phi_{n_k}(x)\, d\mu \\
&= \left( \phi_{n_j}, \phi_{n_k} \right)_{H_\mu'} = \left( e_{n_j}, e_{n_k} \right)_{H_\mu} = \delta_{jk}
\end{align*}
and so the covariance matrix is diagonal. It follows from Theorem 31.25 on Page
872 that $\left\{ \phi_{n_j} \right\}_{j=1}^{p}$ is independent. This establishes the claim and shows that a
special case of the theorem involves the consideration of
\[ \sum_{k=1}^{\infty} \phi_k(x) e_k. \tag{36.60} \]
Here the probability space is E and the measure is µ. Now this special case is easier
to work with and the plan is to consider this special case first, showing that the
above sum in 36.60 converges to x for a.e. x ∈ E and then extending to the general
case. The advantage of considering this special case first is that you have a candidate
for the function to which the series converges which has known distribution.
Let $S(x) \equiv x$ and let $S_N(x) \equiv \sum_{n=1}^{N} \phi_n(x) e_n$. First of all, observe that
(E, B (E) , µ) is a probability space and S maps E to E and L (S) = µ. Thus
S has known distribution and it is reasonable to try and get SN (x) to converge to
S (x).
36.16. REPRODUCING KERNELS AND WHITE NOISE 1073
Let
\[ \mathcal{L}(S_N) \equiv \mu_N,\quad \mathcal{L}(S - S_N) \equiv \mu_N^{\perp}. \]
First note that SN and SM − SN for M > N are independent random variables by
the first part of this argument. Letting φ ∈ span ({φn })
\[ \phi\left( \sum_{n=1}^{N} \phi_n(x) e_n \right) = \sum_{n=1}^{N} \phi_n(x) \phi(e_n) \]
and this series converges for each $x \in E$ as $N \to \infty$ because $\phi_k(e_n) = (e_k, e_n)_{H_\mu} = \delta_{kn}$
which implies that if n is sufficiently large $\phi(e_n) = 0$ so the above sequence of
partial sums is eventually constant.
Therefore, letting φ ∈ span ({φn }) ,
Now span ({φn }) is dense in E 0 by assumption and so it follows from Corollary 36.40
on Page 1018 that S − SN and SN are independent random variables.
It follows from Theorem 36.47 on Page 1023 that
\[ \mu = \mu_N * \mu_N^{\perp} \tag{36.61} \]
\[ \mu_N^{\perp}(-K + x_N) = \mu_N^{\perp}(K - x_N) \ge 1 - \varepsilon. \]
Now note
\[ (-K + x_N) \cap (K - x_N) \subseteq \frac{K - K}{2} \]
\[ 2x = k_2 - k_1 \in K - K. \]
Therefore,
\[ \mu_N^{\perp}\left( \frac{K - K}{2} \right) \ge \mu_N^{\perp}\left( (-K + x_N) \cap (K - x_N) \right) \ge 1 - 2\varepsilon. \]
By density of span $(\{\phi_n\})$ in $E'$, it follows that
\[ 1 = E(\exp(i\phi(Y))) = \int_E \exp(i\phi(x))\, d\nu \]
$\nu = \delta_0$ because by Theorem 36.36 the two measures, ν and $\delta_0$, have the same
characteristic functions.
Now consider $B(0, \varepsilon)^C \subseteq E$. $\delta_0\left( \partial\left( B(0, \varepsilon)^C \right) \right) = 0$ and so it follows by Lemma
36.25 on Page 1007 that
\[ \lim_{N \to \infty} \mu_N^{\perp}\left( B(0, \varepsilon)^C \right) = \delta_0\left( B(0, \varepsilon)^C \right) = 0. \]
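Therefore,
\begin{align*}
0 &= \lim_{N \to \infty} \mu_N^{\perp}\left( B(0, \varepsilon)^C \right) \\
&= \lim_{N \to \infty} \mu\left( \left[ S - S_N \in B(0, \varepsilon)^C \right] \right) \\
&= \lim_{N \to \infty} \mu\left( \left[ \|S - S_N\|_E \ge \varepsilon \right] \right).
\end{align*}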
which is a normally distributed random variable having mean 0 and variance $\sum_{k=1}^{n} \phi(e_k)^2$.
Therefore,
\[ E(\exp(it\phi(S_n))) = e^{-\frac{1}{2} t^2 \sum_{k=1}^{n} \phi(e_k)^2} \]
and so, passing to the limit, yields
\[ E(\exp(it\phi(S))) = e^{-\frac{1}{2} t^2 \sum_{k=1}^{\infty} \phi(e_k)^2} \]
Thus $\phi(S)$ is normally distributed with mean 0 and variance $\|\phi\|_{L^2(E)}^2$. Hence
$\mathcal{L}(S) = \mu$ because if $\nu = \mathcal{L}(S)$, the above has just shown, for $\psi \in E'$,
\[ \phi_\nu(\psi) = e^{-\frac{1}{2} \|\psi\|_{L^2(E)}^2} \]
while
\[ \phi_\mu(\psi) \equiv \int_E e^{i\psi(x)}\, d\mu = e^{-\frac{1}{2} \|\psi\|_{L^2(E)}^2} \]
due to the observation that since µ is Gaussian, each $\psi \in E'$ is normal with mean 0
and variance equal to $\|\psi\|_{L^2(E)}^2$. Since the two measures have the same characteristic
functions, they are equal by Theorem 36.36.
It only remains to consider the general case described in 36.59. Consider the
sum,
\[ X_n(\omega) \equiv \sum_{i=1}^{n} \xi_i(\omega) e_i. \]
and the pointwise a.e. convergence of {Xn } will follow as above using Lemma 36.78
on Page 1049 and then the same characteristic function argument will show X (ω)
defined in 36.59 has L (X) = µ. But the ξ i are given to be independent and normally
distributed with mean 0 and variance 1 so
\[ E(\exp(i\phi(X_n - X_m))) = e^{-\frac{1}{2} \sum_{k=m+1}^{n} \phi(e_k)^2} \]
Sobolev Spaces
Weak Derivatives
Definition 37.1 Let $X'$ be the dual of a Banach space X and let $\{x_n^*\}$ be a sequence
of elements of $X'$. Then $x_n^*$ converges weak ∗ to $x^*$ if and only if for all $x \in X$,
Lemma 37.2 Let $X'$ be the dual of a Banach space, X and suppose X is separable.
Then if $\{x_n^*\}$ is a bounded sequence in $X'$, there exists a weak ∗ convergent
subsequence.
Proof: Let D be a dense countable set in X. Then the sequence, {x∗n (x)} is
bounded for all x and in particular for all x ∈ D. Use the Cantor diagonal process to
obtain a subsequence, still denoted by n such that x∗n (d) converges for each d ∈ D.
1080 WEAK DERIVATIVES
Now let x ∈ X be completely arbitrary. In fact {x∗n (x)} is a Cauchy sequence. Let
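ε > 0 be given and pick $d \in D$ such that for all n
\[ |x_n^*(x) - x_n^*(d)| < \frac{\varepsilon}{3}. \]
This is possible because D is dense and the sequence $\{x_n^*\}$ is bounded. By the first part of the proof, there exists $N_\varepsilon$
such that for all $m, n > N_\varepsilon$,
\[ |x_n^*(d) - x_m^*(d)| < \frac{\varepsilon}{3}. \]
Then for such m, n,
\begin{align*}
|x_n^*(x) - x_m^*(x)| &\le |x_n^*(x) - x_n^*(d)| + |x_n^*(d) - x_m^*(d)| \\
&\quad + |x_m^*(d) - x_m^*(x)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\end{align*}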
Since ε is arbitrary, this shows {x∗n (x)} is a Cauchy sequence for all x ∈ X.
Now define f (x) ≡ limn→∞ x∗n (x). Since each x∗n is linear, it follows f is also
linear. In addition to this,
where K is some constant which is larger than all the norms of the x∗n . Such a
constant exists because the sequence, {x∗n } was bounded. This proves the lemma.
The lemma implies the following important theorem.
idea of weak partial derivatives goes further in the direction of defining something
in terms of what it does rather than by a formula, and extra generality is obtained
when it is used. In particular, it is possible to differentiate almost anything if
the notion of what is meant by the derivative is sufficiently weak. This has the
advantage of allowing the consideration of the weak partial derivative of a function
without having to agonize over the important question of existence but it has the
disadvantage of not being able to say much about the derivative. Nevertheless, it
is the idea of weak partial derivatives which makes it possible to use functional
analytic techniques in the study of partial differential equations and it is shown in
this chapter that the concept of weak derivative is useful for unifying the discussion
of some very important theorems. Certain things which should be true are.
Let $\Omega \subseteq \mathbb{R}^n$. A distribution on Ω is defined to be a linear functional on $C_c^\infty(\Omega)$,
called the space of test functions. The space of all such linear functionals will be
denoted by $\mathcal{D}^*(\Omega)$. Actually, more is sometimes done here. One imposes a topology
on $C_c^\infty(\Omega)$ making it into a topological vector space, and when this has been done,
$\mathcal{D}'(\Omega)$ is defined as the dual space of this topological vector space. To see this,
consult the book by Yosida [52] or the book by Rudin [46].
Example: The space $L^1_{loc}(\Omega)$ may be considered as a subset of $\mathcal{D}^*(\Omega)$ as
follows.
\[ f(\phi) \equiv \int_\Omega f(x) \phi(x)\, dx \]
for all $\phi \in C_c^\infty(\Omega)$. Recall that $f \in L^1_{loc}(\Omega)$ if $f \mathcal{X}_K \in L^1(\Omega)$ whenever K is
compact.
Example: δ x ∈ D∗ (Ω) where δ x (φ) ≡ φ (x).
It will be observed from the above two examples and a little thought that D∗ (Ω)
is truly enormous. The derivative of a distribution will be defined in such a way that
it agrees with the usual notion of a derivative on those distributions which are also
continuously differentiable functions. With this in mind, let f be the restriction to
Ω of a smooth function defined on $\mathbb{R}^n$. Then $D_{x_i} f$ makes sense and for $\phi \in C_c^\infty(\Omega)$
\[ D_{x_i} f(\phi) \equiv \int_\Omega D_{x_i} f(x) \phi(x)\, dx = -\int_\Omega f D_{x_i}\phi\, dx = -f(D_{x_i}\phi). \]
and it is clear that all mixed partial derivatives are equal because this holds for
the functions in Cc∞ (Ω). In this weak sense, the derivative of almost anything
exists, even functions that may be discontinuous everywhere. However the notion
of “derivative” is very weak, hence the name, “weak derivatives”.
Then
\[ DH(\phi) = -\int H(x) \phi'(x)\, dx = \phi(0) = \delta_0(\phi). \]
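The identity DH(φ) = φ(0) can be checked numerically for one concrete test function. The sketch below assumes H is the Heaviside function (1 for x > 0 and 0 for x < 0) and uses the standard bump function exp(−1/(1 − x²)) on (−1, 1) as φ; the names `bump`, `bump_prime`, and `heaviside_derivative_on` are hypothetical, and the midpoint rule is just one convenient quadrature choice.

```python
import math


def bump(x):
    """Smooth test function supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def bump_prime(x):
    """Classical derivative of bump."""
    if abs(x) >= 1.0:
        return 0.0
    d = 1.0 - x * x
    return bump(x) * (-2.0 * x / (d * d))


def heaviside_derivative_on(phi_prime, upper=1.0, n=200_000):
    """Approximate DH(phi) = -int H(x) phi'(x) dx = -int_0^upper phi'(x) dx."""
    h = upper / n
    total = sum(phi_prime((j + 0.5) * h) for j in range(n))
    return -total * h


value = heaviside_derivative_on(bump_prime)
print(value, bump(0.0))  # the two numbers should agree closely
```

Only the part of the integral over (0, 1) contributes because H vanishes for negative x and φ vanishes outside (−1, 1).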
Theorem 37.5 Let Ω = (a, b) and suppose that f and Df are both in $L^1(a, b)$.
Then f is equal to a continuous function a.e., still denoted by f and
\[ f(x) = f(a) + \int_a^x Df(t)\, dt. \]
Lemma 37.6 Let $T \in \mathcal{D}^*(a, b)$ and suppose DT = 0. Then there exists a constant
C such that
\[ T(\phi) = \int_a^b C\phi\, dx. \]
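Proof: $T(D\phi) = 0$ for all $\phi \in C_c^\infty(a, b)$ from the definition of DT = 0. Let
\[ \phi_0 \in C_c^\infty(a, b),\quad \int_a^b \phi_0(x)\, dx = 1, \]
and let
\[ \psi_\phi(x) = \int_a^x \left[ \phi(t) - \left( \int_a^b \phi(y)\, dy \right) \phi_0(t) \right] dt \]
Therefore,
\[ \phi = D\psi_\phi + \left( \int_a^b \phi(y)\, dy \right) \phi_0 \]
and so
\[ T(\phi) = T(D\psi_\phi) + \left( \int_a^b \phi(y)\, dy \right) T(\phi_0) = T(\phi_0) \int_a^b \phi(y)\, dy. \]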
Consider
\[ f(\cdot) - \int_a^{(\cdot)} Df(t)\, dt \]
\begin{align*}
&= Df(\phi) + \int_a^b \int_t^b Df(t) \phi'(x)\, dx\, dt \\
&= Df(\phi) - \int_a^b Df(t) \phi(t)\, dt = 0.
\end{align*}
for all $\phi \in C_c^\infty(a, b)$. It follows from Lemma 37.9 in the next section that
\[ f(x) - \int_a^x Df(t)\, dt - C = 0 \ \text{a.e. } x. \]
whenever it makes sense to write $\int_a^x Df(t)\, dt$, if Df is interpreted as a weak derivative.
Somehow, this is the way it ought to be. It follows from the fundamental
theorem of calculus that $f'(x)$ exists for a.e. x where the derivative is taken in the
sense of a limit of difference quotients and $f'(x) = Df(x)$. This raises an interesting
question. Suppose f is continuous on [a, b] and $f'(x)$ exists in the classical
sense for a.e. x. Does it follow that
\[ f(x) = f(a) + \int_a^x f'(t)\, dt? \]
The answer is no. To see an example, consider Problem 4 on Page 445 which gives
an example of a function which is continuous on [0, 1], has a zero derivative for
a.e. x but climbs from 0 to 1 on [0, 1]. Thus this function is not recovered from
integrating its classical derivative.
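A standard function with the properties just described is the Cantor function; whether it is the function of the cited problem is an assumption made here. The sketch below evaluates it through ternary digits (the name `cantor` and the digit depth are choices made for illustration) and shows that it is constant on the removed middle thirds, so its classical derivative vanishes there, while it still climbs from 0 to 1.

```python
def cantor(x, depth=60):
    """Evaluate the Cantor function on [0, 1] from the ternary digits of x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)          # next ternary digit of x
        if digit == 1:
            # x lies in a removed middle third, where the function is constant
            return value + scale
        if digit == 2:
            value += scale      # ternary digit 2 becomes binary digit 1
        x -= digit
        scale *= 0.5
    return value


# Constant on (1/3, 2/3) and on (1/9, 2/9), yet cantor(0) = 0 and cantor(1) = 1.
print(cantor(0.0), cantor(0.2), cantor(0.4), cantor(0.6), cantor(1.0))
```

Since the removed intervals have total length 1, integrating the classical derivative gives 0, not cantor(1) − cantor(0) = 1.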
In summary, if the notion of weak derivative is used, one can at least give
meaning to the derivative of almost anything, the mixed partial derivatives are
always equal, and, in one dimension, one can recover the function from integrating
its derivative. None of these claims are true for the classical derivative. Thus weak
derivatives are convenient and rule out pathologies.
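Definition 37.8 For $\alpha = (k_1, \cdots, k_n)$ where the $k_i$ are nonnegative integers, define
\[ |\alpha| \equiv \sum_{i=1}^{n} |k_i|,\quad D^\alpha f(x) \equiv \frac{\partial^{|\alpha|} f(x)}{\partial x_1^{k_1} \partial x_2^{k_2} \cdots \partial x_n^{k_n}}. \]
Also define $\phi_k$ to be a mollifier if spt $(\phi_k) \subseteq B\left( 0, \frac{1}{k} \right)$, $\phi_k \ge 0$, $\int \phi_k\, dx = 1$, and
$\phi_k \in C_c^\infty\left( B\left( 0, \frac{1}{k} \right) \right)$. In the case a Greek letter like δ or ε is used as a subscript,
it will mean spt $(\phi_\delta) \subseteq B(0, \delta)$, $\phi_\delta \ge 0$, $\int \phi_\delta\, dx = 1$, and $\phi_\delta \in C_c^\infty(B(0, \delta))$. You
can always get a mollifier by letting $\phi \ge 0$, $\phi \in C_c^\infty(B(0, 1))$, $\int \phi\, dx = 1$, and then
defining $\phi_k(x) \equiv k^n \phi(kx)$ or in the case of a Greek subscript, $\phi_\delta(x) = \frac{1}{\delta^n} \phi\left( \frac{x}{\delta} \right)$.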
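The scaling recipe can be illustrated in one dimension (n = 1). The sketch below assumes the standard bump exp(−1/(1 − x²)) as the starting function and normalizes it numerically with a midpoint rule; the names `bump`, `midpoint_integral`, `phi`, and `phi_delta` are hypothetical, and the normalizing constant is only a numerical approximation.

```python
import math


def bump(x):
    """Nonnegative smooth function supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def midpoint_integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


# Numerical normalizing constant so that phi integrates to (approximately) 1.
NORMALIZER = midpoint_integral(bump, -1.0, 1.0)


def phi(x):
    """Normalized bump: phi >= 0, supported in [-1, 1], integral about 1."""
    return bump(x) / NORMALIZER


def phi_delta(x, delta):
    """Rescaled mollifier phi_delta(x) = (1/delta) * phi(x/delta) for n = 1."""
    return phi(x / delta) / delta


mass = midpoint_integral(lambda x: phi_delta(x, 0.1), -0.1, 0.1)
print(mass)  # close to 1
```

The factor 1/δ (in general 1/δⁿ) is exactly what keeps the integral equal to 1 after the support is shrunk to B(0, δ).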
Consider the case where u and $D^\alpha u$ for $|\alpha| = 1$ are each in $L^p_{loc}(\mathbb{R}^n)$. The next
lemma is the one alluded to in the proof of Theorem 37.5.
Lemma 37.11 Let $u \in L^1_{loc}(\mathbb{R}^n)$ and suppose $u_{,i} \in L^1_{loc}(\mathbb{R}^n)$, where the subscript
on the u following the comma denotes the ith weak partial derivative. Then if $\phi_\varepsilon$ is
a mollifier and $u_\varepsilon \equiv u * \phi_\varepsilon$, it follows $u_{\varepsilon,i} \equiv u_{,i} * \phi_\varepsilon$.
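Therefore,
\begin{align*}
u_{\varepsilon,i}(\psi) &= -\int u_\varepsilon \psi_{,i} = -\int \int u(x - y) \phi_\varepsilon(y) \psi_{,i}(x)\, dy\, dx \\
&= -\int \int u(x - y) \psi_{,i}(x) \phi_\varepsilon(y)\, dx\, dy \\
&= \int \int u_{,i}(x - y) \psi(x) \phi_\varepsilon(y)\, dx\, dy \\
&= \int u_{,i} * \phi_\varepsilon(x)\, \psi(x)\, dx.
\end{align*}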
The technical questions about product measurability in the use of Fubini’s theorem
may be resolved by picking a Borel measurable representative for u. This proves
the lemma.
What about the product rule? Does it have some form in the context of weak
derivatives?
(uψ),i = u,i ψ + uψ ,i .
thus
Du (x) v =∇u (x) · v.
Lemma 37.13 Let $u \in C^1(\mathbb{R}^n)$ and p > n. Then there exists a constant, C,
depending only on n such that for any $x, y \in \mathbb{R}^n$,
\[ |u(x) - u(y)| \le C \left( \int_{B(x, 2|x - y|)} |\nabla u(z)|^p\, dz \right)^{1/p} |x - y|^{(1 - n/p)}. \tag{37.1} \]
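To get a feel for the exponent 1 − n/p, here is a sketch of the one dimensional analogue only, which is an assumption made for illustration and not the lemma itself: for n = 1 the fundamental theorem of calculus and Hölder's inequality give |u(x) − u(y)| ≤ (∫ |u'|^p)^{1/p} |x − y|^{1 − 1/p}, with the integral taken over the interval between x and y and constant 1. The helper name `holder_bound` and the choice u = sin, p = 3 are hypothetical.

```python
import math


def holder_bound(du, x, y, p, n=100_000):
    """(int_[x,y] |u'|^p)^(1/p) * |x - y|^(1 - 1/p), by the midpoint rule."""
    a, b = min(x, y), max(x, y)
    h = (b - a) / n
    integral = h * sum(abs(du(a + (j + 0.5) * h)) ** p for j in range(n))
    return integral ** (1.0 / p) * (b - a) ** (1.0 - 1.0 / p)


# Example with u = sin, u' = cos on the interval [0, 1] and p = 3 > n = 1.
lhs = abs(math.sin(1.0) - math.sin(0.0))
rhs = holder_bound(math.cos, 0.0, 1.0, p=3.0)
print(lhs, rhs)  # lhs should not exceed rhs
```

In higher dimensions the same Hölder step appears near the end of the proof, after the difference has been written as an integral of the gradient against |z|^{1−n}.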
[Figure: the sets U, W, and V, with the points x and y and the radius r marked.]
\[ \frac{m(W)}{m(U)} = \frac{m(W)}{m(V)} = C. \]
You could compute this constant if you desired but it is not important here.
Define the average of a function over a set, E ⊆ Rn as follows.
\[ -\!\!\!\!\!\!\int_E f\, dx \equiv \frac{1}{m(E)} \int_E f\, dx. \]
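Then
\begin{align*}
|u(x) - u(y)| &= -\!\!\!\!\!\!\int_W |u(x) - u(y)|\, dz \\
&\le -\!\!\!\!\!\!\int_W |u(x) - u(z)|\, dz + -\!\!\!\!\!\!\int_W |u(z) - u(y)|\, dz \\
&= \frac{C}{m(U)} \left[ \int_W |u(x) - u(z)|\, dz + \int_W |u(z) - u(y)|\, dz \right] \\
&\le C \left[ -\!\!\!\!\!\!\int_U |u(x) - u(z)|\, dz + -\!\!\!\!\!\!\int_V |u(y) - u(z)|\, dz \right]
\end{align*}
Now consider these two terms. Using spherical coordinates and letting $U_0$ denote
the ball of the same radius as U but with center at 0,
\begin{align*}
-\!\!\!\!\!\!\int_U |u(x) - u(z)|\, dz &= \frac{1}{m(U_0)} \int_{U_0} |u(x) - u(z + x)|\, dz \\
&= \frac{1}{m(U_0)} \int_0^r \rho^{n-1} \int_{S^{n-1}} |u(x) - u(\rho w + x)|\, d\sigma(w)\, d\rho \\
&\le \frac{1}{m(U_0)} \int_0^r \rho^{n-1} \int_{S^{n-1}} \int_0^\rho |\nabla u(x + tw) \cdot w|\, dt\, d\sigma\, d\rho \\
&\le \frac{1}{m(U_0)} \int_0^r \rho^{n-1} \int_{S^{n-1}} \int_0^\rho |\nabla u(x + tw)|\, dt\, d\sigma\, d\rho \\
&\le C \frac{1}{r} \int_0^r \int_{S^{n-1}} \int_0^r |\nabla u(x + tw)|\, dt\, d\sigma\, d\rho \\
&= C \frac{1}{r} \int_0^r \int_{S^{n-1}} \int_0^r \frac{|\nabla u(x + tw)|}{t^{n-1}} t^{n-1}\, dt\, d\sigma\, d\rho \\
&= C \int_{S^{n-1}} \int_0^r \frac{|\nabla u(x + tw)|}{t^{n-1}} t^{n-1}\, dt\, d\sigma \\
&= C \int_{U_0} \frac{|\nabla u(x + z)|}{|z|^{n-1}}\, dz \\
&\le C \left( \int_{U_0} |\nabla u(x + z)|^p\, dz \right)^{1/p} \left( \int_{U_0} |z|^{p' - np'}\, dz \right)^{1/p'} \\
&= C \left( \int_U |\nabla u(z)|^p\, dz \right)^{1/p} \left( \int_{S^{n-1}} \int_0^r \rho^{p' - np'} \rho^{n-1}\, d\rho\, d\sigma \right)^{(p-1)/p} \\
&= C \left( \int_U |\nabla u(z)|^p\, dz \right)^{1/p} \left( \int_{S^{n-1}} \int_0^r \frac{1}{\rho^{\frac{n-1}{p-1}}}\, d\rho\, d\sigma \right)^{(p-1)/p}
\end{align*}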
37.4. MORREY’S INEQUALITY 1089
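\begin{align*}
&= C \left( \frac{p-1}{p-n} \right)^{(p-1)/p} \left( \int_U |\nabla u(z)|^p\, dz \right)^{1/p} r^{1 - \frac{n}{p}} \\
&= C \left( \frac{p-1}{p-n} \right)^{(p-1)/p} \left( \int_U |\nabla u(z)|^p\, dz \right)^{1/p} |x - y|^{1 - \frac{n}{p}}
\end{align*}
Similarly,
\[ -\!\!\!\!\!\!\int_V |u(y) - u(z)|\, dz \le C \left( \frac{p-1}{p-n} \right)^{(p-1)/p} \left( \int_V |\nabla u(z)|^p\, dz \right)^{1/p} |x - y|^{1 - \frac{n}{p}} \]
Therefore,
\[ |u(x) - u(y)| \le C \left( \frac{p-1}{p-n} \right)^{(p-1)/p} \left( \int_{B(x, 2|x - y|)} |\nabla u(z)|^p\, dz \right)^{1/p} |x - y|^{1 - \frac{n}{p}} \]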
\[ \le C \left( \frac{1}{m(B(x, 2|x - y|))} \int_{B(x, 2|x - y|)} |\nabla u(z) - \nabla u(x)|^p\, dz \right)^{1/p} |x - y|. \tag{37.2} \]
Proof: This follows easily from letting g (y) ≡ u (y) − u (x) − ∇u (x) · (y − x) .
Then g ∈ C 1 (Rn ), g (x) = 0, and ∇g (z) = ∇u (z) − ∇u (x) . From Lemma 37.13,
Theorem 37.15 Suppose u and all its weak partial derivatives, $u_{,i}$ are in $L^p_{loc}(\mathbb{R}^n)$.
Then there exists a set of measure zero, E such that if $x, y \notin E$ then inequalities
37.2 and 37.1 are both valid. Furthermore, u equals a continuous function a.e.
Proof: Let $u \in L^p_{loc}(\mathbb{R}^n)$ and $\psi_k \in C_c^\infty(\mathbb{R}^n)$, $\psi_k \ge 0$, and $\psi_k(z) = 1$ for all
$z \in B(0, k)$. Then it is routine to verify that
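Here is why:
\begin{align*}
(u\psi_k)_{,i}(\phi) &\equiv -\int_{\mathbb{R}^n} u \psi_k \phi_{,i}\, dx \\
&= -\int_{\mathbb{R}^n} u \psi_k \phi_{,i}\, dx - \int_{\mathbb{R}^n} u \psi_{k,i} \phi\, dx + \int_{\mathbb{R}^n} u \psi_{k,i} \phi\, dx \\
&= -\int_{\mathbb{R}^n} u (\psi_k \phi)_{,i}\, dx + \int_{\mathbb{R}^n} u \psi_{k,i} \phi\, dx \\
&= \int_{\mathbb{R}^n} \left( u_{,i} \psi_k + u \psi_{k,i} \right) \phi\, dx
\end{align*}
which shows
\[ (u\psi_k)_{,i} = u_{,i} \psi_k + u \psi_{k,i} \]
as expected.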
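Let $\phi_\varepsilon$ be a mollifier and consider
\[ (u\psi_k)_\varepsilon \equiv u\psi_k * \phi_\varepsilon. \]
Therefore
\[ (u\psi_k)_{\varepsilon,i} \to (u\psi_k)_{,i} \ \text{in } L^p(\mathbb{R}^n) \tag{37.3} \]
and
\[ (u\psi_k)_\varepsilon \to u\psi_k \ \text{in } L^p(\mathbb{R}^n) \tag{37.4} \]
as ε → 0. By 37.4, there exists a subsequence ε → 0 such that for |z| < k and for
each $i = 1, 2, \cdots, n$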
and
\[ |u(y) - u(x) - \nabla u(x) \cdot (y - x)| \le C \left( \frac{1}{m(B(x, 2|x - y|))} \int_{B(x, 2|x - y|)} |\nabla u(z) - \nabla u(x)|^p\, dz \right)^{1/p} |x - y|. \tag{37.7} \]
Redefining u on the set of measure zero, E yields 37.6 for all x, y. This proves the
theorem.
Corollary 37.16 Let $u, u_{,i} \in L^p_{loc}(\mathbb{R}^n)$ for $i = 1, \cdots, n$ and p > n. Then the
representative of u described in Theorem 37.15 is differentiable a.e.
Thus $w = u_{,i}$ and $u_{,i} \in L^\infty(\mathbb{R}^n)$ for each i. Hence $u, u_{,i} \in L^p_{loc}(\mathbb{R}^n)$ for all p > n
and so u is differentiable a.e. by Corollary 37.16. This proves the corollary.
37.6. CHANGE OF VARIABLES FORMULA LIPSCHITZ MAPS 1093
You see that if x were an interior point of E, then this limit will equal 1.
However, it is sometimes the case that the limit equals 1 even when x is not an
interior point. In fact, these points of density make sense even for sets that have
empty interior.
Lemma 37.20 Let E be a Lebesgue measurable set. Then there exists a set of
measure zero, N , such that if x ∈ E \ N , then x is a point of density of E.
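Proof: Consider the function, $f(x) = \mathcal{X}_E(x)$. This function is in $L^1_{loc}(\mathbb{R}^n)$.
Let $N^C$ denote the Lebesgue points of f. Then for $x \in E \setminus N$,
\begin{align*}
1 = \mathcal{X}_E(x) &= \lim_{r \to 0} \frac{1}{m_n(B(x, r))} \int_{B(x, r)} \mathcal{X}_E(y)\, dm_n \\
&= \lim_{r \to 0} \frac{m_n(B(x, r) \cap E)}{m_n(B(x, r))}.
\end{align*}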
Definition 37.21 Let F be a collection of balls that cover a set, E, which have
the property that if x ∈ E and ε > 0, then there exists B ∈ F, diameter of B < ε
and x ∈ B. Such a collection covers E in the sense of Vitali.
Theorem 37.22 Let $E \subseteq \mathbb{R}^n$ and suppose $\overline{m_n}(E) < \infty$ where $\overline{m_n}$ is the outer
measure determined by $m_n$, n dimensional Lebesgue measure, and let $\mathcal{F}$ be a collection
of closed balls of bounded radii such that $\mathcal{F}$ covers E in the sense of Vitali.
Then there exists a countable collection of disjoint balls from $\mathcal{F}$, $\{B_j\}_{j=1}^{\infty}$, such that
$\overline{m_n}(E \setminus \cup_{j=1}^{\infty} B_j) = 0$.
Now this theorem implies a simple lemma which is what will be used.
As in the proof of the change of variables theorem given earlier, the first step
is to show that h maps Lebesgue measurable sets to Lebesgue measurable sets. In
showing this the key result is the next lemma which states that h maps sets of
measure zero to sets of measure zero.
Proof: Let V be an open set containing T whose measure is less than ε. Now
using the Vitali covering theorem, there exists a sequence of disjoint balls $\{B_i\}$,
$B_i = B(x_i, r_i)$ which are contained in V such that the sequence of enlarged balls,
$\{\widehat{B}_i\}$, having the same center but 5 times the radius, covers T. Then
\begin{align*}
m_n(h(T)) &\le m_n\left( h\left( \cup_{i=1}^{\infty} \widehat{B}_i \right) \right) \\
&\le \sum_{i=1}^{\infty} m_n\left( h\left( \widehat{B}_i \right) \right) \\
&\le \sum_{i=1}^{\infty} \alpha(n) (\mathrm{Lip}(h))^n 5^n r_i^n = 5^n (\mathrm{Lip}(h))^n \sum_{i=1}^{\infty} m_n(B_i) \\
&\le (\mathrm{Lip}(h))^n 5^n m_n(V) \le \varepsilon (\mathrm{Lip}(h))^n 5^n.
\end{align*}
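By Lemma 37.24,
\begin{align*}
m_n(h(A_k)) &\le m_n(h(V)) \\
&\le m_n\left( h\left( \cup_{i=1}^{\infty} B_i \right) \right) + m_n(h(T)) = m_n\left( h\left( \cup_{i=1}^{\infty} B_i \right) \right) \\
&\le \sum_{i=1}^{\infty} m_n(h(B_i)) \le \sum_{i=1}^{\infty} m_n(B(h(x_i), \mathrm{Lip}(h) r_i)) \\
&\le \sum_{i=1}^{\infty} \alpha(n) (\mathrm{Lip}(h) r_i)^n = \mathrm{Lip}(h)^n \sum_{i=1}^{\infty} m_n(B_i) = \mathrm{Lip}(h)^n m_n(V).
\end{align*}
Therefore,
\[ m_n(h(A_k)) \le \mathrm{Lip}(h)^n m_n(V). \]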
Now let k → ∞ to obtain 37.9. This proves the formula. It remains to show h (A)
is measurable.
By inner regularity of Lebesgue measure, there exists a set, F , which is the
countable union of compact sets and a set T with mn (T ) = 0 such that
F ∪ T = Ak .
mn (h (T )) = 0
\[ h(A) = \cup_{k=1}^{\infty} h(A_k) \]
Proof: Suppose $a \in B(0, r(1 - \varepsilon)) \setminus F(B)$ and let
\[ G(v) \equiv \frac{r(a - F(v))}{|a - F(v)|}. \]
Then by the Brouwer fixed point theorem, G(v) = v for some $v \in B$. Using the
formula for G, it follows |v| = r. Taking the inner product with v,
\begin{align*}
(G(v), v) = |v|^2 = r^2 &= \frac{r}{|a - F(v)|} (a - F(v), v) \\
&= \frac{r}{|a - F(v)|} (a - v + v - F(v), v) \\
&= \frac{r}{|a - F(v)|} \left[ (a - v, v) + (v - F(v), v) \right] \\
&= \frac{r}{|a - F(v)|} \left[ (a, v) - |v|^2 + (v - F(v), v) \right] \\
&\le \frac{r}{|a - F(v)|} \left[ r^2(1 - \varepsilon) - r^2 + r^2\varepsilon \right] = 0,
\end{align*}
a contradiction. Therefore, $B(0, r(1 - \varepsilon)) \setminus F(B) = \emptyset$ and this proves the lemma.
Now let Ω be a Lebesgue measurable set and suppose h : Rn → Rn is Lipschitz
continuous and one to one on Ω. Let
Lemma 37.27 Let $x \in \Omega \setminus (S \cup N)$. Then if $\varepsilon \in (0, 1)$ the following hold for all r
small enough.
\[ m_n(h(B(x, r))) \ge m_n(Dh(x) B(0, r(1 - \varepsilon))), \tag{37.13} \]
Consequently, when r is small enough, 37.14 holds. Therefore, 37.15 holds. From
37.19, and the assumption that $Dh(x)^{-1}$ exists,
\[ Dh(x)^{-1} h(x + v) - Dh(x)^{-1} h(x) - v = o(|v|). \tag{37.20} \]
Letting
\[ F(v) = Dh(x)^{-1} h(x + v) - Dh(x)^{-1} h(x), \]
apply Lemma 37.26 in 37.20 to conclude that for r small enough, whenever |v| < r,
\[ Dh(x)^{-1} h(x + v) - Dh(x)^{-1} h(x) \supseteq B(0, (1 - \varepsilon) r). \]
Therefore,
\[ h(B(x, r)) \supseteq h(x) + Dh(x) B(0, (1 - \varepsilon) r) \]
which implies
\[ m_n(h(B(x, r))) \ge m_n(Dh(x) B(0, r(1 - \varepsilon))) \]
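\[ 1 - \varepsilon < \frac{m_n(h(B(x, r) \cap \Omega))}{m_n(h(B(x, r)))} \le 1 \]
and so
\begin{align*}
1 - \varepsilon &< \frac{m_n\left( h\left( B(x, r) \cap \Omega^C \right) \right)}{m_n(h(B(x, r)))} + \frac{m_n(h(B(x, r) \cap \Omega))}{m_n(h(B(x, r)))} \\
&\le \frac{m_n\left( h\left( B(x, r) \cap \Omega^C \right) \right)}{m_n(h(B(x, r)))} + 1.
\end{align*}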
which implies
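\[ m_n(B(x, r) \setminus \Omega) < \varepsilon \alpha(n) r^n. \tag{37.21} \]
Then for such r,
\begin{align*}
1 &\ge \frac{m_n(h(B(x, r) \cap \Omega))}{m_n(h(B(x, r)))} \\
&\ge \frac{m_n(h(B(x, r))) - m_n(h(B(x, r) \setminus \Omega))}{m_n(h(B(x, r)))}.
\end{align*}
From Lemma 37.25, 37.21, and 37.13, this is no smaller than
\[ 1 - \frac{\mathrm{Lip}(h)^n \varepsilon \alpha(n) r^n}{m_n(Dh(x) B(0, r(1 - \varepsilon)))}. \]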
By the theorem on the change of variables for a linear map, this expression equals
\[ 1 - \frac{\mathrm{Lip}(h)^n \varepsilon \alpha(n) r^n}{|\det(Dh(x))|\, r^n \alpha(n) (1 - \varepsilon)^n} \equiv 1 - g(\varepsilon) \]
\[ 1 \ge \frac{m_n(h(B(x, r) \cap \Omega))}{m_n(h(B(x, r)))} \ge 1 - g(\varepsilon) \]
µ (E) ≡ mn (h (E))
Proof: Define
\begin{align*}
|\det Dh(x)| &= \lim_{r \to 0} \frac{m_n(h(B(x, r)))}{m_n(B(x, r))} \\
&= \lim_{r \to 0} \frac{m_n(h(B(x, r)))}{m_n(h(B(x, r) \cap \Omega))} \cdot \frac{m_n(h(B(x, r) \cap \Omega))}{m_n(B(x, r))} \\
&= \lim_{r \to 0} \frac{1}{m_n(B(x, r))} \int_{B(x, r) \cap \Omega} J(y)\, dm_n \\
&= \lim_{r \to 0} \frac{1}{m_n(B(x, r))} \int_{B(x, r)} J(y)\, dm_n = J(x),
\end{align*}
the last equality because J was extended to be zero off Ω. This proves the lemma.
Here is the change of variables formula for Lipschitz mappings. It is a special
case of the area formula.
Define
$$\overline{h}(x) \equiv \inf\{h(w) + K|x-w| : w\in\Omega\}. \tag{37.27}$$
If $x\in\Omega$, then for all $w\in\Omega$,
$$h(w) + K|x-w| \ge h(x)$$
by 37.26. This shows $h(x) \le \overline{h}(x)$. But also you could take $w = x$ in 37.27 which
yields $\overline{h}(x) \le h(x)$. Therefore $\overline{h}(x) = h(x)$ if $x\in\Omega$.
Now suppose $x, y\in\mathbb{R}^n$ and consider $|\overline{h}(x) - \overline{h}(y)|$. Without loss of generality
assume $\overline{h}(x) \ge \overline{h}(y)$. (If not, repeat the following argument with x and y
interchanged.) Pick $w\in\Omega$ such that
$$h(w) + K|y-w| - \varepsilon < \overline{h}(y).$$
Then
$$|\overline{h}(x) - \overline{h}(y)| = \overline{h}(x) - \overline{h}(y) \le h(w) + K|x-w| - [h(w) + K|y-w| - \varepsilon] \le K|x-y| + \varepsilon.$$
Since ε is arbitrary,
$$|\overline{h}(x) - \overline{h}(y)| \le K|x-y|$$
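The extension formula 37.27 can be tried out numerically. The following is only a minimal sketch under stated assumptions: Ω is replaced by a hypothetical finite sample set on the real line, h(w) = |w| and K = 1 are illustrative choices, and the names `lipschitz_extension` and `h_bar` are not from the text.

```python
def lipschitz_extension(samples, K):
    """Return x -> min over sampled (w, h(w)) of h(w) + K*|x - w| (cf. 37.27)."""
    def h_bar(x):
        return min(hw + K * abs(x - w) for w, hw in samples)
    return h_bar

# Hypothetical data: h(w) = |w| on a finite subset of the line, with K = 1.
samples = [(w, abs(w)) for w in (-2.0, -0.5, 0.0, 1.0, 3.0)]
K = 1.0
h_bar = lipschitz_extension(samples, K)

# The extension agrees with h on the sample set.
agrees = all(abs(h_bar(w) - hw) < 1e-12 for w, hw in samples)

# The extension is K-Lipschitz on a grid of test points.
grid = [i / 10.0 for i in range(-50, 51)]
lipschitz = all(
    abs(h_bar(x) - h_bar(y)) <= K * abs(x - y) + 1e-12
    for x in grid
    for y in grid
)
```

On a finite sample set the infimum is a minimum, which is why `min` suffices here; for a general Ω one would need an actual infimum.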
1104 THE AREA AND COAREA FORMULAS
Thus $N_k \uparrow N$. Let ε > 0 be given and let $U \supseteq V_k \supseteq N$ be open and $m_n(V_k) < \frac{\varepsilon}{5^n k^n}$.
Now fix δ > 0. For $x\in N_k$ let $B(x, 5r_x)\subseteq V_k$ such that $r_x < \min\left(\frac{\delta}{5k}, R_x\right)$. By
the Vitali covering theorem, there exists a disjoint sequence of these balls, $\{B_i\}_{i=1}^\infty$,
such that $\{\widehat{B_i}\}_{i=1}^\infty$, the corresponding sequence of balls having the same centers
but five times the radius, covers $N_k$. Then $\mathrm{diam}(\widehat{B_i}) < 2\delta/k$. Hence $\{h(\widehat{B_i})\}_{i=1}^\infty$
covers $h(N_k)$ and $\mathrm{diam}(h(\widehat{B_i})) < 2\delta$. It follows
$$\begin{aligned}
H^n_{2\delta}(h(N_k)) &\le \sum_{i=1}^\infty \alpha(n)\, r\left(h(\widehat{B_i})\right)^n \\
&\le \sum_{i=1}^\infty \alpha(n)\, k^n 5^n\, r(B_i)^n \\
&= 5^n k^n \sum_{i=1}^\infty m_n(B_i) \le 5^n k^n m_n(V_k) < \varepsilon.
\end{aligned}$$
Since δ was arbitrary, this shows $H^n(h(N_k)) \le \varepsilon$. Since k was arbitrary, this
shows $H^n(h(N)) = \lim_{k\to\infty} H^n(h(N_k)) \le \varepsilon$. Since ε is arbitrary, this shows
$H^n(h(N)) = 0$. This proves the lemma.
Now with this lemma, here is one of many possible generalizations of the area
formula.
x → (g ◦ h) (x) J (x)
where $J(x) = \det(U(x)) = \det\left(Dh(x)^* Dh(x)\right)^{1/2}$.
By Lemma 38.2
$$\int_{h(A)} g(y)\,dH^n = \int_{h(A_0)} g(y)\,dH^n = \int_{A_0} g(h(x))J(x)\,dm_n = \int_A g(h(x))J(x)\,dm_n$$
such that $\partial U \equiv \overline{U}\setminus U$ is contained in their union. Also, for each $Q_i$, there exists k
and a Lipschitz function, $g_i$ such that $U\cap Q_i$ is of the form
$$\left\{x : (x_1,\cdots,x_{k-1},x_{k+1},\cdots,x_n) \in \prod_{j=1}^{k-1}\left(a_j^i, b_j^i\right)\times\prod_{j=k+1}^{n}\left(a_j^i, b_j^i\right) \text{ and } a_k^i < x_k < g_i(x_1,\cdots,x_{k-1},x_{k+1},\cdots,x_n)\right\} \tag{38.4}$$
Also, there exists an open set, $Q_0$ such that $Q_0\subseteq\overline{Q_0}\subseteq U$ and $\overline{U}\subseteq Q_0\cup Q_1\cup\cdots\cup Q_N$.
$$\tfrac{2}{3}|Tv| \le |U(b)v| \text{ for all } v \tag{a}$$
$$|h(a) - h(b) - Dh(b)(a-b)| \le \tfrac{1}{2}|T(a-b)| \tag{b}$$
for all $a\in B\left(b, \tfrac{2}{i}\right)$.
First I will show these sets, E (c, T, i) cover B and that they are measurable
sets. To begin with consider the measurability question. Inequality (a) is the same
as saying
$$\tfrac{2}{3}|Tv| \le |Dh(b)v| \text{ for all } v$$
which is the same as saying
$$\tfrac{2}{3}|v| \le \left|Dh(b)T^{-1}v\right| \text{ for all } v.$$
it follows easily that Si is measurable because the component functions of the matrix
of Dh (b) are limits of difference quotients of continuous functions so they are Borel
measurable. (Note that if B were Borel, then Si would also be Borel.) Now by
continuity,
$$\cup_{i=1}^\infty S_i = \left\{b : \tfrac{2}{3}|v| \le \left|Dh(b)T^{-1}v\right| \text{ for all } v\right\}$$
and so this set is measurable also. Inequality (b) also determines a measurable set
by similar reasoning. It is the same as saying that for all $|v| < 2/i$,
$$|h(b+v) - h(b) - Dh(b)(v)| \le \tfrac{1}{2}|T(v)|$$
Use $\{v_i\}$ a countable dense subset of $B(0, 2/i)$ in a similar fashion to (a).
Next I need to show these sets cover B. Let $x\in B$. Then pick $c_i\in B\left(x,\tfrac{1}{i}\right)$ and
$T_i\in B\left(U(x),\tfrac{1}{i}\right)$. I need to show that $x\in E(c_i, T_i, i)$ for i large enough. For i
large enough, $\left\|T_i U(x)^{-1}\right\| < \tfrac{3}{2}$. Therefore, for such i
$$\left|T_i U(x)^{-1}(v)\right| < \tfrac{3}{2}|v|$$
for all v and so
$$|T_i w| < \tfrac{3}{2}|U(x)w|$$
for all w. Next consider (b). An equivalent norm is $v\to|U(x)v|$ and so, for i large
enough,
$$|h(a) - h(x) - Dh(x)(a-x)| \le \tfrac{1}{8}|U(x)(a-x)| \tag{38.10}$$
whenever $|a-x| < 2/i$. Now also, for i large enough, $\left\|U(x)T_i^{-1}\right\| < 4$ and so for
all w,
$$\left|U(x)T_i^{-1}w\right| < 4|w|$$
which implies
$$|U(x)v| < 4|T_i v|.$$
Applying this in 38.10 yields
$$|h(a) - h(x) - Dh(x)(a-x)| \le \tfrac{1}{2}|T_i(a-x)|$$
which implies $x\in E(c_i, T_i, i)$.
38.2. MAPPINGS THAT ARE NOT ONE TO ONE 1109
$$|T(b_2-b_1)| \le \tfrac{3}{2}|U(b_1)(b_2-b_1)| = \tfrac{3}{2}|Dh(b_1)(b_2-b_1)| \le \tfrac{3}{2}\cdot\tfrac{1}{2}|T(b_2-b_1)|$$
which is a contradiction unless $b_2 = b_1$.
There are clearly countably many $E(c,T,i)$. Denote them as $\{F_i\}_{i=1}^\infty$. Then let
$E_1 = F_1$ and if $E_1,\cdots,E_m$ have been chosen, let
$$E_{m+1} = F_{m+1}\setminus\cup_{i=1}^m E_i.$$
Thus the Ei are disjoint measurable sets whose union is B and h is one to one on
each Ei .
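The passage from the covering sets $F_i$ to the disjoint sets $E_i$ is the usual disjointification step. A small sketch on finite sets, where the function name `disjointify` and the sample sets are purely illustrative:

```python
def disjointify(F):
    """Given sets F_1, F_2, ..., return E_1 = F_1, E_{m+1} = F_{m+1} minus (E_1 ∪ ... ∪ E_m)."""
    used = set()
    E = []
    for Fi in F:
        Ei = set(Fi) - used   # remove everything already assigned
        E.append(Ei)
        used |= Ei
    return E

# Hypothetical overlapping cover of {0, ..., 7}.
F = [{0, 1, 2, 3}, {2, 3, 4}, {1, 4, 5, 6}, {6, 7}]
E = disjointify(F)

same_union = set().union(*E) == set().union(*F)
pairwise_disjoint = all(
    not (E[a] & E[b]) for a in range(len(E)) for b in range(a + 1, len(E))
)
```

Since each $E_i \subseteq F_i$, any property inherited by subsets (such as h being one to one) carries over from $F_i$ to $E_i$.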
Now consider one of the Ei . This is a subset of some E (c, T, i) . Let a, b ∈ Ei .
Then using (a) and (b),
$$|T(a-b)| \le \tfrac{3}{2}|U(b)(a-b)| = \tfrac{3}{2}|Dh(b)(a-b)| \le \tfrac{3}{2}|h(a)-h(b)| + \tfrac{3}{4}|T(a-b)|.$$
Hence
$$\tfrac{1}{4}|T(a-b)| \le \tfrac{3}{2}|h(a)-h(b)|$$
Since $v\to|Tv|$ is an equivalent norm, there exists some r > 0 such that $|Tv| \ge r|v|$
for all v. Therefore,
$$|a-b| \le \frac{6}{r}|h(a)-h(b)|.$$
In other words,
$$\left|h^{-1}(h(a)) - h^{-1}(h(b))\right| = |a-b| \le \frac{6}{r}|h(a)-h(b)|,$$
which completes the proof.
With these lemmas, here is the main theorem which is a generalization of Theorem 38.3.
First remember that from Lemma 17.18 on Page 462 a locally Lipschitz
function maps Lebesgue measurable sets to Hausdorff measurable sets.
Then # is Hn measurable,
x → (g ◦ h) (x) J (x)
is Lebesgue measurable, and
$$\int_{h(A)} \#(y)\,g(y)\,dH^n = \int_A g(h(x))J(x)\,dm_n$$
where $J(x) = \det(U(x)) = \det\left(Dh(x)^* Dh(x)\right)^{1/2}$.
Proof: Let $B = A\setminus(S\cup N)$ where S is the set of points where $J(x) = 0$ and
N is the set of points, x of A where $Dh(x)$ does not exist. Also from Lemma 38.7
there exists $\{E_i\}_{i=1}^\infty$, a sequence of disjoint measurable sets whose union equals B
such that h is one to one on each $E_i$. Then from Theorem 38.3
$$\begin{aligned}
\int_A g(h(x))J(x)\,dm_n &= \int_B g(h(x))J(x)\,dm_n = \sum_{i=1}^\infty\int_{E_i} g(h(x))J(x)\,dm_n \\
&= \sum_{i=1}^\infty\int_{h(E_i)} g(y)\,dH^n = \int_{h(B)}\left(\sum_{i=1}^\infty \mathcal{X}_{h(E_i)}(y)\right)g(y)\,dH^n.
\end{aligned} \tag{38.11}$$
Now $\#(y) = \sum_{i=1}^\infty \mathcal{X}_{h(E_i)}(y)$ on $h(B)$ and # differs from this $H^n$ measurable
function only on $h(S\cup N)$, which by Lemma 38.6 is a set of $H^n$ measure zero.
Therefore, # is $H^n$ measurable and the last term of 38.11 equals
$$\int_{h(A)}\left(\sum_{i=1}^\infty \mathcal{X}_{h(E_i)}(y)\right)g(y)\,dH^n = \int_{h(A)} \#(y)\,g(y)\,dH^n.$$
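A one dimensional sanity check of the formula with multiplicity may help. This is a sketch only, with illustrative choices that are not from the text: $h(x) = x^2$ on $A = [-1,1]$, $g(y) = y$, so that $J(x) = |2x|$ and $\#(y) = 2$ for $0 < y \le 1$. In dimension one, $H^1$ on the line is Lebesgue measure, so both sides are ordinary integrals, approximated here by the midpoint rule.

```python
def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def h(x):
    return x * x

def J(x):
    # In one dimension J(x) = |h'(x)|.
    return abs(2.0 * x)

def g(y):
    return y

def count(y):
    # Number of points x in [-1, 1] with h(x) = y.
    if 0.0 < y <= 1.0:
        return 2
    return 1 if y == 0.0 else 0

# Left side: integral over h(A) = [0, 1] of #(y) g(y) dy.
lhs = midpoint(lambda y: count(y) * g(y), 0.0, 1.0)
# Right side: integral over A = [-1, 1] of g(h(x)) J(x) dx.
rhs = midpoint(lambda x: g(h(x)) * J(x), -1.0, 1.0)
```

Both sides come out close to 1: the left is $2\int_0^1 y\,dy$ and the right is $\int_{-1}^1 2|x|^3\,dx$.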
and so
$$\det(I + A^*A) = \det\begin{pmatrix}\lambda_1+1 & & 0\\ & \ddots & \\ 0 & & \lambda_m+1\end{pmatrix} = \det(I + AA^*).$$
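The identity $\det(I + A^*A) = \det(I + AA^*)$ can be checked on a small example. The sketch below uses exact rational arithmetic from the standard library; the helper names and the particular real 2 × 3 matrix A are illustrative assumptions, and for a real matrix $A^*$ is just the transpose.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    sign, result = 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return sign * result

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def identity_plus(M):
    return [[M[r][c] + (1 if r == c else 0) for c in range(len(M))] for r in range(len(M))]

A = [[1, 2, 0], [0, 1, 3]]                           # hypothetical 2 x 3 matrix
left = det(identity_plus(matmul(transpose(A), A)))   # 3 x 3 determinant
right = det(identity_plus(matmul(A, transpose(A))))  # 2 x 2 determinant
```

Note the two identity matrices have different sizes (3 × 3 on the left, 2 × 2 on the right), yet both determinants equal 62 for this A.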
is Borel measurable.
and
$$\sum_{i=1}^\infty \alpha(s)\,(r(S_i))^s < t.$$
38.3. THE COAREA FORMULA 1113
I claim these sets can be taken to be open sets. Choose λ > 1 but close enough to
1 that
$$\sum_{i=1}^\infty \alpha(s)\,(\lambda r(S_i))^s < t$$
Then
$$\mathrm{diam}(S_i + B(0,\eta_i)) \le \lambda\,\mathrm{diam}(S_i)$$
and so $r(S_i + B(0,\eta_i)) \le \lambda r(S_i)$. Thus
$$\sum_{i=1}^\infty \alpha(s)\, r(S_i + B(0,\eta_i))^s < t.$$
Hence you could replace Si with Si + B (0, η i ) and so one can assume the sets Si
are open.
Claim: If z is close enough to y, then $A\cap h^{-1}(z)\subseteq\cup_{i=1}^\infty S_i$.
Proof: If not, then there exists a sequence $\{z_k\}$ such that
$$z_k\to y,$$
and
$$x_k\in\left(A\cap h^{-1}(z_k)\right)\setminus\cup_{i=1}^\infty S_i.$$
$$z_k\to y,\quad x_k\to x\in A\setminus\cup_{i=1}^\infty S_i.$$
Hence
$$h(x) = \lim_{k\to\infty} h(x_k) = \lim_{k\to\infty} z_k = y.$$
But $x\notin\cup_{i=1}^\infty S_i$ contrary to the assumption that $A\cap h^{-1}(y)\subseteq\cup_{i=1}^\infty S_i$.
It follows from this claim that whenever z is close enough to y,
$$H^s_\delta\left(A\cap h^{-1}(z)\right) < t.$$
This shows
$$\left\{z\in\mathbb{R}^p : H^s_\delta\left(A\cap h^{-1}(z)\right) < t\right\}$$
is an open set and so $y\to H^s_\delta\left(A\cap h^{-1}(y)\right)$ is Borel measurable whenever A is
compact. Now let V be an open set and let
$$A_k\uparrow V,\quad A_k \text{ compact}.$$
Then
$$H^s_\delta\left(V\cap h^{-1}(y)\right) = \lim_{k\to\infty} H^s_\delta\left(A_k\cap h^{-1}(y)\right)$$
so $y\to H^s_\delta\left(V\cap h^{-1}(y)\right)$ is Borel measurable for all V open. This proves the lemma.
In particular, if s = n − m and p = n
$$\int_{\mathbb{R}^m} H^{n-m}\left(A\cap h^{-1}(y)\right)dy \le 2^m\,(\mathrm{Lip}(h))^m\,\frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\,m_n(A)$$
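Proof: From Lemma 38.13 $y\to H^s_\delta\left(A\cap h^{-1}(y)\right)$ is Borel measurable for each
δ > 0. Without loss of generality, $H^{s+m}(A) < \infty$. Now let $B_i$ be closed sets with
$\mathrm{diam}(B_i) < \delta$, $A\subseteq\cup_{i=1}^\infty B_i$, and
$$H^{s+m}_\delta(A) + \varepsilon > \sum_{i=1}^\infty \alpha(s+m)\, r(B_i)^{s+m}.$$
Note each $B_i$ is compact so $y\to H^s_\delta\left(B_i\cap h^{-1}(y)\right)$ is Borel measurable. Thus
$$\begin{aligned}
\int_{\mathbb{R}^m} H^s_\delta\left(A\cap h^{-1}(y)\right)dy
&\le \int_{\mathbb{R}^m}\sum_i H^s_\delta\left(B_i\cap h^{-1}(y)\right)dy \\
&= \sum_i\int_{\mathbb{R}^m} H^s_\delta\left(B_i\cap h^{-1}(y)\right)dy \\
&\le \sum_i\int_{h(B_i)} H^s_\delta(B_i)\,dy \\
&= \sum_i m_m(h(B_i))\,H^s_\delta(B_i) \\
&\le \sum_i (\mathrm{Lip}(h))^m\, 2^m\,\alpha(m)\, r(B_i)^m\,\alpha(s)\, r(B_i)^s
\end{aligned}$$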
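By Lemma 38.14
$$\begin{aligned}
&= \int_{\mathbb{R}^m}\left(H^{n-m}_\delta\left(V_k\cap h^{-1}(y)\right) - H^{n-m}_\delta\left(K_k\cap h^{-1}(y)\right)\right)dy \\
&= \int_{\mathbb{R}^m} H^{n-m}_\delta\left((V_k - K_k)\cap h^{-1}(y)\right)dy \\
&\le 2^m\,(\mathrm{Lip}(h))^m\,\frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\,m_n(V_k\setminus K_k) \\
&< 2^m\,(\mathrm{Lip}(h))^m\,\frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\,2^{-k}
\end{aligned}$$
Let the Borel measurable functions, g and f be defined by
$$g(y)\equiv\lim_{k\to\infty} H^{n-m}_\delta\left(V_k\cap h^{-1}(y)\right),\quad f(y)\equiv\lim_{k\to\infty} H^{n-m}_\delta\left(K_k\cap h^{-1}(y)\right)$$
and
$$\int_{\mathbb{R}^m}(g(y) - f(y))\,dy = 0.$$
By completeness of $m_m$, this establishes $y\to H^{n-m}_\delta\left(A\cap h^{-1}(y)\right)$ is Lebesgue measurable.
Then by Lemma 38.14 again,
$$\int_{\mathbb{R}^m} H^{n-m}_\delta\left(A\cap h^{-1}(y)\right)dy \le 2^m\,(\mathrm{Lip}(h))^m\,\frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\,m_n(A).$$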
Letting δ → 0 and using the monotone convergence theorem yields the desired
inequality for $H^{n-m}\left(A\cap h^{-1}(y)\right)$.
The case where A is not bounded can be handled by considering $A_r = A\cap B(0,r)$
and letting r → ∞. This proves the lemma.
By fussing with the isodiametric inequality one can remove the factor of $2^m$ in
the above inequalities obtaining much more attractive formulas. This is done in
[20]. See also [36] which follows [20] and [22]. This last reference probably has the
most complete treatment of these topics.
With these lemmas, it is now possible to give a proof of the coarea formula.
Define Λ (n, m) as all possible ordered lists of m numbers taken from {1, 2, · · ·, n} .
Then the following formula holds along with all measurability assertions needed for
it to make sense.
$$\int_{\mathbb{R}^m} H^{n-m}\left(A\cap h^{-1}(y)\right)dy = \int_A Jh(x)\,dx \tag{38.13}$$
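Formula 38.13 can be tested on a simple case. The sketch below is an illustration under assumptions not in the text: $h(x,y) = x^2 + y^2$ from $\mathbb{R}^2$ to $\mathbb{R}$ (which is only locally Lipschitz), and A the closed unit disk. Here $Jh = |\nabla h| = 2\sqrt{x^2+y^2}$, and the level set $A\cap h^{-1}(t)$ is a circle of radius $\sqrt{t}$ with $H^1$ measure $2\pi\sqrt{t}$ for $0 < t \le 1$. Both sides should equal $4\pi/3$.

```python
import math

def coarea_sides(n_grid=400, n_levels=2000):
    """Approximate both sides of the coarea formula for h(x, y) = x^2 + y^2 on the unit disk."""
    # Right side: integral over the disk of Jh = 2 * sqrt(x^2 + y^2), midpoint rule on a grid.
    step = 2.0 / n_grid
    rhs = 0.0
    for i in range(n_grid):
        x = -1.0 + (i + 0.5) * step
        for j in range(n_grid):
            y = -1.0 + (j + 0.5) * step
            r2 = x * x + y * y
            if r2 <= 1.0:
                rhs += 2.0 * math.sqrt(r2) * step * step
    # Left side: integral over t in (0, 1) of the length 2*pi*sqrt(t) of the level circle.
    dt = 1.0 / n_levels
    lhs = sum(2.0 * math.pi * math.sqrt((k + 0.5) * dt) * dt for k in range(n_levels))
    return lhs, rhs

lhs, rhs = coarea_sides()
exact = 4.0 * math.pi / 3.0
```

The grid sum is crude near the boundary of the disk, so only rough agreement should be expected on the right side.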
$$\cup_{i,j} F_j^i = A.$$
Now let $\{E_j^i\}$ be measurable sets such that $E_j^i\subseteq F_k^i$ for some k, the sets are
disjoint, and their union coincides with $\cup_{i,j} F_j^i$. Then
$$\int_A Jh(x)\,dx = \sum_{i\in\Lambda(n,m)}\sum_{j=1}^\infty\int_{E_j^i\cap A}\det\left(Dh(x)\,Dh(x)^*\right)^{1/2}dx. \tag{38.14}$$
Let $g:\mathbb{R}^n\to\mathbb{R}^n$ be a Lipschitz extension of $\left(h^i\right)^{-1}$ so $g\circ h^i(x) = x$ for all $x\in E_j^i$.
First, using Theorem 38.11, and the fact that Lipschitz mappings take sets of
measure zero to sets of measure zero, replace $E_j^i$ with a measurable set, $\widetilde{E}_j^i\subseteq E_j^i$
such that $E_j^i\setminus\widetilde{E}_j^i$ has measure zero and
$$\sum_{i\in\Lambda(n,m)}\sum_{j=1}^\infty\int_{h^i\left(\widetilde{E}_j^i\cap A\right)}\det\left(Dh(g(y))\,Dh(g(y))^*\right)^{1/2}\left|\det Dh^i(g(y))\right|^{-1}dy. \tag{38.15}$$
Note the integrands are all Borel measurable functions because they are continu-
ous functions of the entries of matrices which entries come from taking limits of
difference quotients of continuous functions. Thus,
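$$\begin{aligned}
&\int_{\widetilde{E}_j^i\cap A}\det\left(Dh(x)\,Dh(x)^*\right)^{1/2}dx \\
&= \int_{\mathbb{R}^n}\mathcal{X}_{h^i\left(\widetilde{E}_j^i\cap A\right)}(y)\,\det\left(Dh(g(y))\,Dh(g(y))^*\right)^{1/2}\left|\det Dh^i(g(y))\right|^{-1}dy \\
&= \int_{\mathbb{R}^m}\int_{\pi_{i_c}\left(h^{-1}(y_1)\cap\widetilde{E}_j^i\cap A\right)}\det\left(Dh(g(y))\,Dh(g(y))^*\right)^{1/2}\left|\det D_{x_i}h(g(y))\right|^{-1}dy_2\,dy_1
\end{aligned} \tag{38.16}$$
where $y_1 = h(x)$ and $y_2 = x_{i_c}$. Thus
$$y_2 = \pi_{i_c}g(y) = \pi_{i_c}g\left(h^i(x)\right) = x_{i_c}. \tag{38.17}$$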
Now consider the inner integral in 38.16 in which y1 is fixed. The integrand
equals
$$\det\left[\begin{pmatrix} D_{x_i}h(g(y)) & D_{x_{i_c}}h(g(y))\end{pmatrix}\begin{pmatrix} D_{x_i}h(g(y))^* \\ D_{x_{i_c}}h(g(y))^*\end{pmatrix}\right]^{1/2}\left|\det D_{x_i}h(g(y))\right|^{-1}. \tag{38.18}$$
I want to massage the above expression slightly. Since y1 is fixed, and y1 =
h (π i g (y) , π ic g (y)) = h (g (y)), it follows from 38.17 that
Letting $A\equiv D_{x_i}h(g(y))$ and $B\equiv D_{y_2}\pi_i g(y)$ and using the above formula, 38.18
is of the form
$$\begin{aligned}
\det\left[\begin{pmatrix} A & -AB\end{pmatrix}\begin{pmatrix} A^* \\ -B^*A^*\end{pmatrix}\right]^{1/2}|\det A|^{-1}
&= \det\left[AA^* + ABB^*A^*\right]^{1/2}|\det A|^{-1} \\
&= \det\left[A\,(I + BB^*)\,A^*\right]^{1/2}|\det A|^{-1} \\
&= \det(I + BB^*)^{1/2},
\end{aligned}$$
which, by Corollary 38.10, equals $\det(I + B^*B)^{1/2}$. (Note the size of the identity
changes in these two expressions, the first being an m × m matrix and the second
being an (n − m) × (n − m) matrix.)
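By 38.17 $\pi_{i_c}g(y) = y_2$ and so,
$$\begin{aligned}
\det(I + B^*B)^{1/2} &= \det\left[\begin{pmatrix} B^* & I\end{pmatrix}\begin{pmatrix} B \\ I\end{pmatrix}\right]^{1/2} \\
&= \det\left[\begin{pmatrix} D_{y_2}\pi_i g(y)^* & D_{y_2}\pi_{i_c}g(y)^*\end{pmatrix}\begin{pmatrix} D_{y_2}\pi_i g(y) \\ D_{y_2}\pi_{i_c}g(y)\end{pmatrix}\right]^{1/2} \\
&= \det\left(D_{y_2}g(y)^*\,D_{y_2}g(y)\right)^{1/2}.
\end{aligned}$$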
Therefore, 38.16 reduces to
$$\int_{\widetilde{E}_j^i\cap A}\det\left(Dh(x)\,Dh(x)^*\right)^{1/2}dx = \int_{\mathbb{R}^m}\int_{\pi_{i_c}\left(h^{-1}(y_1)\cap\widetilde{E}_j^i\cap A\right)}\det\left(D_{y_2}g(y)^*\,D_{y_2}g(y)\right)^{1/2}dy_2\,dy_1. \tag{38.19}$$
By the area formula applied to the inside integral, this integral equals
$$H^{n-m}\left(h^{-1}(y_1)\cap\widetilde{E}_j^i\cap A\right)$$
and so
$$\int_{\widetilde{E}_j^i\cap A}\det\left(Dh(x)\,Dh(x)^*\right)^{1/2}dx = \int_{\mathbb{R}^m} H^{n-m}\left(h^{-1}(y_1)\cap\widetilde{E}_j^i\cap A\right)dy_1.$$
Using Lemma 38.15, along with the inner regularity of Lebesgue measure, $\widetilde{E}_j^i$
can be replaced with $E_j^i$. Therefore, summing the terms over all i and j,
$$\int_A\det\left(Dh(x)\,Dh(x)^*\right)^{1/2}dx = \int_{\mathbb{R}^m} H^{n-m}\left(h^{-1}(y)\cap A\right)dy.$$
Proof: By Lemma 38.15 again, this formula is true for all measurable A ⊆
Rn \ S. It remains to verify the formula for all measurable sets, A, whether or not
they intersect S.
Consider the case where
Then
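$$Dk(x,y) = (Dh(x), \varepsilon I) = (UR, \varepsilon I)$$
where the dependence of U and R on x has been suppressed. Thus
$$\begin{aligned}
Jk^2 &= \det\left[(UR, \varepsilon I)\begin{pmatrix} R^*U \\ \varepsilon I\end{pmatrix}\right] = \det\left(U^2 + \varepsilon^2 I\right) \\
&= \det\left(Q^*DQQ^*DQ + \varepsilon^2 I\right) = \det\left(D^2 + \varepsilon^2 I\right) \\
&= \prod_{i=1}^m\left(\lambda_i^2 + \varepsilon^2\right)\in[\varepsilon^{2m}, C^2\varepsilon^2]
\end{aligned} \tag{38.21}$$
since one of the $\lambda_i$ equals 0. All the eigenvalues must be bounded independent of
x, since $\|Dh(x)\|$ is bounded independent of x due to the assumption that h is
Lipschitz. Since $Jk\ne 0$, the first part of the argument implies
$$\begin{aligned}
\varepsilon C\,m_{n+m}\left(A\times\overline{B(0,1)}\right) &\ge \int_{A\times\overline{B(0,1)}}|Jk|\,dm_{n+m} \\
&= \int_{\mathbb{R}^m} H^n\left(k^{-1}(y)\cap A\times\overline{B(0,1)}\right)dy
\end{aligned}$$
Which by Lemma 38.14,
$$\ge C_{nm}\int_{\mathbb{R}^m}\int_{\mathbb{R}^m} H^{n-m}\left(k^{-1}(y)\cap p^{-1}(w)\cap A\times\overline{B(0,1)}\right)dw\,dy \tag{38.22}$$
where $C_{nm} = \frac{\alpha(n)}{\alpha(n-m)\,\alpha(m)}$.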
Claim:
$$H^{n-m}\left(k^{-1}(y)\cap p^{-1}(w)\cap A\times\overline{B(0,1)}\right) \ge \mathcal{X}_{B(0,1)}(w)\,H^{n-m}\left(h^{-1}(y-\varepsilon w)\cap A\right).$$
The use of Fubini’s theorem is justified because the integrand is Borel measurable.
Now by 38.24, it follows since ε > 0 is arbitrary,
$$\int_{\mathbb{R}^m} H^{n-m}\left(A\cap h^{-1}(y)\right)dy = 0 = \int_A Jh(x)\,dx.$$
Since this holds for arbitrary compact sets in S, it follows from Lemma 38.15 and
inner regularity of Lebesgue measure that the equation holds for all measurable
subsets of S. This completes the proof of the coarea formula.
There is a simple corollary to this theorem in the case of locally Lipschitz maps.
Proof: The assumption that h is locally Lipschitz implies that h is Lipschitz
on $B(0,r)$ for each r > 0. To see this, cover the compact set, $\overline{B(0,r)}$ with
finitely many balls on which h is Lipschitz.
38.4. A NONLINEAR FUBINI’S THEOREM 1121
h (x) = hr (x)
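$$\begin{aligned}
&= \sum_{i=1}^p c_i\int_{h(E_i)} H^{n-m}\left(E_i\cap h^{-1}(y)\right)dy \\
&= \int_{h(\mathbb{R}^n)}\sum_{i=1}^p c_i\,H^{n-m}\left(E_i\cap h^{-1}(y)\right)dy \\
&= \int_{h(\mathbb{R}^n)}\left[\int_{h^{-1}(y)} s\,dH^{n-m}\right]dy.
\end{aligned} \tag{38.26}$$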
h : Rn → Rm , n ≥ m
satisfy the Coarea formula. For example, it could be locally Lipschitz. Then
$$\int_{\mathbb{R}^n} g(x)\,J((Dh(x)))\,dx = \int_{h(\mathbb{R}^n)}\left[\int_{h^{-1}(y)} g\,dH^{n-m}\right]dy.$$
Definition 39.1 Let C be a set whose elements are subsets of Rn .1 Then C is said
to be locally finite if for every x ∈ Rn , there exists an open set, Ux containing x
such that Ux has nonempty intersection with only finitely many sets of C.
Lemma 39.2 Let C be a set whose elements are open subsets of $\mathbb{R}^n$ and suppose
$\cup C\supseteq H$, a closed set. Then there exists a countable list of open sets, $\{U_i\}_{i=1}^\infty$ such
that each $U_i$ is bounded, each $U_i$ is a subset of some set of C, and $\cup_{i=1}^\infty U_i\supseteq H$.
1124 INTEGRATION ON MANIFOLDS
of all consider the rational numbers, $\{r_i\}_{i=1}^\infty$; each rational number is a closed set.
$$\mathbb{Q} = \{r_i\}_{i=1}^\infty = \cup_{i=1}^\infty\overline{\{r_i\}} \ne \overline{\cup_{i=1}^\infty\{r_i\}} = \mathbb{R}$$
Next suppose the elements of C are open sets and that for each U ∈ C, there exists a
differentiable function, ψ U having spt (ψ U ) ⊆ U. Then you can define the following
finite sum for each x ∈ Rn
$$f(x)\equiv\sum\{\psi_U(x) : x\in U\in C\}.$$
Proof: Let p be a limit point of $\cup C$ and let W be an open set which intersects
only finitely many sets of C. Then p must be a limit point of one of these sets. It
follows $p\in\cup\{\overline{H} : H\in C\}$ and so $\overline{\cup C}\subseteq\cup\{\overline{H} : H\in C\}$. The inclusion in the other
direction is obvious.
Now consider the second assertion. Letting x ∈ Rn , there exists an open set, W
intersecting only finitely many open sets of C, U1 , U2 , · · ·, Um . Then for all y ∈ W,
$$f(y) = \sum_{i=1}^m\psi_{U_i}(y)$$
and so the desired result is obvious. It merely says that a finite sum of differentiable
functions is differentiable. Recall the following definition.
Lemma 39.5 Let U be a bounded open set and let K be a closed subset of U. Then
there exist an open set, W, such that $W\subseteq\overline{W}\subseteq U$ and a function, $f\in C_c^\infty(U)$
such that $K\prec f\prec U$.
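Also let
$$W_1\equiv\left\{x : \mathrm{dist}(x,K) < 2^{-1}\,\mathrm{dist}\left(K, U^C\right)\right\}$$
Then it is clear
$$K\subseteq W\subseteq\overline{W}\subseteq W_1\subseteq\overline{W_1}\subseteq U$$
Now consider the function,
$$h(x)\equiv\frac{\mathrm{dist}\left(x, W_1^C\right)}{\mathrm{dist}\left(x, W_1^C\right) + \mathrm{dist}\left(x,\overline{W}\right)}$$
it follows that for such k, the function, $h*\phi_k\in C_c^\infty(U)$, has values in [0, 1], and
equals 1 on K. Let $f = h*\phi_k$.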
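The distance quotient h above is easy to evaluate in one dimension. In this sketch the sets are illustrative assumptions: K = [0, 1], U = (−1, 2), so that dist(K, U^C) = 1 and $W_1$ = (−0.5, 1.5); the set W is taken to be (−0.25, 1.25), which is one admissible choice and not necessarily the one intended in the text.

```python
def dist_to_W1_complement(x):
    # W1 = (-0.5, 1.5); distance from x to the complement of W1.
    return max(0.0, min(x + 0.5, 1.5 - x))

def dist_to_W_closure(x):
    # Closure of W = [-0.25, 1.25]; distance from x to this interval.
    return max(0.0, -0.25 - x, x - 1.25)

def h(x):
    """Continuous function equal to 1 on the closure of W and 0 off W1."""
    a = dist_to_W1_complement(x)
    b = dist_to_W_closure(x)
    return a / (a + b)  # a + b > 0 since the closure of W lies inside W1

values = [h(-1.0 + 3.0 * i / 300) for i in range(301)]
in_unit_interval = all(0.0 <= v <= 1.0 for v in values)
```

Mollifying h with a sufficiently concentrated $\phi_k$ then smooths the corners without disturbing the value 1 on K or the vanishing outside U.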
The above lemma is used repeatedly in the following.
Lemma 39.6 Let K be a closed set and let $\{V_i\}_{i=1}^\infty$ be a locally finite list of bounded
open sets whose union contains K. Then there exist functions, $\psi_i\in C_c^\infty(V_i)$ such
that for all $x\in K$,
$$1 = \sum_{i=1}^\infty\psi_i(x)$$
is in $C^\infty(\mathbb{R}^n)$.
Proof: Let $K_1 = K\setminus\cup_{i=2}^\infty V_i$. Thus $K_1$ is compact because $K_1\subseteq V_1$. Let
$$K_1\subseteq W_1\subseteq\overline{W_1}\subseteq V_1$$
$$\overline{W_i}\subseteq V_i,\quad K\subseteq\cup_{i=1}^\infty W_i.$$
Note $\{W_i\}_{i=1}^\infty$ is locally finite because the original list, $\{V_i\}_{i=1}^\infty$ was locally finite.
Now let $U_i$ be open sets which satisfy
$$\overline{W_i}\subseteq U_i\subseteq\overline{U_i}\subseteq V_i.$$
Similarly, $\{U_i\}_{i=1}^\infty$ is locally finite.
[Figure: the nested sets $W_i$, $U_i$, $V_i$.]
Since the set, $\{W_i\}_{i=1}^\infty$ is locally finite, it follows $\overline{\cup_{i=1}^\infty W_i} = \cup_{i=1}^\infty\overline{W_i}$ and so it
is possible to define $\phi_i$ and γ, infinitely differentiable functions having compact
support such that
$$\overline{U_i}\prec\phi_i\prec V_i,\quad\cup_{i=1}^\infty\overline{W_i}\prec\gamma\prec\cup_{i=1}^\infty U_i.$$
Now define
$$\psi_i(x) = \begin{cases}\gamma(x)\,\phi_i(x)\Big/\sum_{j=1}^\infty\phi_j(x) & \text{if } \sum_{j=1}^\infty\phi_j(x)\ne 0,\\ 0 & \text{if } \sum_{j=1}^\infty\phi_j(x) = 0.\end{cases}$$
If x is such that $\sum_{j=1}^\infty\phi_j(x) = 0$, then $x\notin\cup_{i=1}^\infty\overline{U_i}$ because $\phi_i$ equals one on $\overline{U_i}$.
Consequently $\gamma(y) = 0$ for all y near x thanks to the fact that $\cup_{i=1}^\infty\overline{U_i}$ is closed
and so $\psi_i(y) = 0$ for all y near x. Hence $\psi_i$ is infinitely differentiable at such x. If
$\sum_{j=1}^\infty\phi_j(x)\ne 0$, this situation persists near x because each $\phi_j$ is continuous and so
$\psi_i$ is infinitely differentiable at such points also thanks to Lemma 39.3. Therefore
$\psi_i$ is infinitely differentiable. If $x\in K$, then $\gamma(x) = 1$ and so $\sum_{j=1}^\infty\psi_j(x) = 1$.
Clearly $0\le\psi_i(x)\le 1$ and $\mathrm{spt}(\psi_j)\subseteq V_j$. This proves the theorem.
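The normalization $\psi_i = \phi_i/\sum_j\phi_j$ at the heart of this proof can be illustrated in one dimension. This is a simplified sketch, not the construction of the lemma itself: the sets are taken to be $V_i = (i-1, i+1)$ for integers i, each $\phi_i$ is a translate of the standard smooth bump (so it is positive on $V_i$ rather than equal to one on a set $\overline{U_i}$), and the cutoff γ is taken to be 1 on the region examined.

```python
import math

def bump(t):
    """Standard smooth bump: positive on (-1, 1), zero elsewhere."""
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

INDICES = range(-2, 6)  # enough translates to cover the interval [0, 3]

def phi(i, x):
    return bump(x - i)  # supported in the closure of V_i = (i - 1, i + 1)

def psi(i, x):
    total = sum(phi(j, x) for j in INDICES)
    return phi(i, x) / total if total != 0.0 else 0.0

xs = [3.0 * k / 300 for k in range(301)]  # sample points of K = [0, 3]
sums_to_one = all(abs(sum(psi(i, x) for i in INDICES) - 1.0) < 1e-12 for x in xs)
values_in_range = all(0.0 <= psi(i, x) <= 1.0 for i in INDICES for x in xs)
```

Local finiteness shows up here as the fact that at most two of the $\phi_j$ are nonzero at any given point, so the sum in the denominator is always a finite sum.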
The method of proof of this lemma easily implies the following useful corollary.
Proof: Keep $V_i$ the same but replace $V_j$ with $\widetilde{V_j}\equiv V_j\setminus H$. Now in the proof
above, applied to this modified collection of open sets, if $j\ne i$, $\phi_j(x) = 0$ whenever
$x\in H$. Therefore, $\psi_i(x) = 1$ on H.
Theorem 39.8 Let H be any closed set and let C be any open cover of H. Then
there exist functions $\{\psi_i\}_{i=1}^\infty$ such that $\mathrm{spt}(\psi_i)$ is contained in some set of C and $\psi_i$
is infinitely differentiable having values in [0, 1] such that on H, $\sum_{i=1}^\infty\psi_i(x) = 1$.
Furthermore, the function, $f(x)\equiv\sum_{i=1}^\infty\psi_i(x)$ is infinitely differentiable on $\mathbb{R}^n$.
Also, $\mathrm{spt}(\psi_i)\subseteq U_i$ where $U_i$ is a bounded open set with the property that $\{U_i\}_{i=1}^\infty$
is locally finite and each $U_i$ is contained in some set of C.
39.2. INTEGRATION ON MANIFOLDS 1127
for $u = (u_1,\cdots,u_i,u_{i+1},\cdots,u_n)^T\in U_i$ for $\phi_i\in C^{m,1}(U_i)$ and $g_i : U_i\times\mathbb{R}\to U_i$
given by
$$g_i(u_1,\cdots,u_i,y,u_{i+1},\cdots,u_n)\equiv u$$
for i = 1, 2, · · ·, p. Then for u ∈ Ui , the definition gives
This example can be used to describe the boundary of a bounded open set and since
φi ∈ C m,1 (Ui ) , such an open set is said to have a C m,1 boundary. Note also that
in this example, $U_i$ could be taken to be $\mathbb{R}^n$ or if $U_i$ is given, both $h_i$ and $g_i$
can be taken as restrictions of functions defined on all of $\mathbb{R}^n$ and $\mathbb{R}^p$ respectively.
The symbol, I will refer to an increasing list of n indices taken from {1, · · ·, p} .
Denote by Λ (p, n) the set of all such increasing lists of n indices.
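Let
$$J_i(u)\equiv\left(\sum_{I\in\Lambda(p,n)}\left(\frac{\partial\left(x_{i_1}\cdots x_{i_n}\right)}{\partial(u_1\cdots u_n)}\right)^2\right)^{1/2}$$
where here the sum is taken over all possible increasing lists of n indices, I, from
$\{1,\cdots,p\}$ and $x = h_i u$. Thus there are $\binom{p}{n}$ terms in the sum. In this formula,
$\frac{\partial\left(x_{i_1}\cdots x_{i_n}\right)}{\partial(u_1\cdots u_n)}$ is defined to be the determinant of the following matrix.
$$\begin{pmatrix}\frac{\partial x_{i_1}}{\partial u_1} & \cdots & \frac{\partial x_{i_1}}{\partial u_n}\\ \vdots & & \vdots\\ \frac{\partial x_{i_n}}{\partial u_1} & \cdots & \frac{\partial x_{i_n}}{\partial u_n}\end{pmatrix}$$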
Note that if p = n there is only one term in the sum, the absolute value of the
determinant of Dx (u). Define a positive linear functional, Λ on Cc (Γ) as follows:
First let $\{\psi_i\}$ be a $C^\infty$ partition of unity subordinate to the open sets, $\{W_i\}$. Thus
$\psi_i\in C_c^\infty(W_i)$ and $\sum_i\psi_i(x) = 1$ for all $x\in\Gamma$. Then
$$\Lambda f\equiv\sum_{i=1}^\infty\int_{g_i\Gamma_i} f\psi_i(h_i(u))\,J_i(u)\,du. \tag{39.1}$$
Lemma 39.12 The functional defined in 39.1 does not depend on the choice of
atlas or the partition of unity.
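$$\begin{aligned}
&\sum_{i=1}^\infty\sum_{j=1}^\infty\int_{g_i\Gamma_i}\eta_j\psi_i f(h_i(u))\,J_i(u)\,du \\
&= \sum_{i=1}^\infty\sum_{j=1}^\infty\int_{g_j'\left(\Gamma_i\cap\Gamma_j'\right)}\eta_j\left(h_j'(v)\right)\psi_i\left(h_j'(v)\right)f\left(h_j'(v)\right)J_i(u)\left|\frac{\partial\left(u^1\cdots u^n\right)}{\partial\left(v^1\cdots v^n\right)}\right|dv \\
&= \sum_{i=1}^\infty\sum_{j=1}^\infty\int_{g_j'\left(\Gamma_i\cap\Gamma_j'\right)}\eta_j\left(h_j'(v)\right)\psi_i\left(h_j'(v)\right)f\left(h_j'(v)\right)J_j(v)\,dv.
\end{aligned} \tag{39.3}$$
Thus
the definition of Λf using $(\Gamma_i, g_i)$ ≡
$$\begin{aligned}
\sum_{i=1}^\infty\int_{g_i\Gamma_i}\psi_i f(h_i(u))\,J_i(u)\,du
&= \sum_{i=1}^\infty\sum_{j=1}^\infty\int_{g_j'\left(\Gamma_i\cap\Gamma_j'\right)}\eta_j\left(h_j'(v)\right)\psi_i\left(h_j'(v)\right)f\left(h_j'(v)\right)J_j(v)\,dv \\
&= \sum_{j=1}^\infty\int_{g_j'\left(\Gamma_j'\right)}\eta_j\left(h_j'(v)\right)f\left(h_j'(v)\right)J_j(v)\,dv
\end{aligned}$$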
Since ε is arbitrary, this shows gi V has measure no more than ε with respect to the
measure, ν i . Since ε is arbitrary, gi S has measure zero.
Consider the converse. Suppose gi S has ν i measure zero. Then there exists an
open set, O ⊆ gi Γi such that O ⊇ gi S and
$$\int_O J_i(u)\,du < \varepsilon.$$
As in the first part, Corollary 39.7 on Page 1126 implies there exists a partition of
unity such that h (x) = 0 off the set,
{x ∈ Rp : ψ i (x) = 1}
3 This means V is the intersection of an open set with Γ. Equivalently, it means that V is an
open set in the traditional way regarding Γ as a metric space with the metric it inherits from Rm .
and so from 39.5 and 39.6 µ (S) ≤ 2ε. Since ε is arbitrary, this proves the claim.
For the last part of the theorem, it suffices to let A ⊆ Γr because otherwise, the
above argument would apply to A ∩ Γr . Thus let A ⊆ Γr be µ measurable. By
the regularity of the measure, there exists an Fσ set, F and a Gδ set, G such that
$\Gamma_r\supseteq G\supseteq A\supseteq F$ and $\mu(G\setminus F) = 0$. (Recall a $G_\delta$ set is a countable intersection
of open sets and an $F_\sigma$ set is a countable union of closed sets.) Then since $\Gamma_r$ is
compact, it follows each of the closed sets whose union equals F is a compact set.
Thus if $F = \cup_{k=1}^\infty F_k$, $g_r(F_k)$ is also a compact set and so $g_r(F) = \cup_{k=1}^\infty g_r(F_k)$ is
a Borel set. Similarly, $g_r(G)$ is also a Borel set. Now by the claim,
$$\int_{g_r(G\setminus F)} J_r(u)\,du = 0.$$
and
g ∈ L1 (gr K; ν r ) .
By the pointwise convergence and the claim used in the proof of Theorem 39.13,
g (u) = f (hr (u))
for µ a.e. hr (u) ∈ K. Therefore,
$$\begin{aligned}
\int_K f\,d\mu &= \lim_{k\to\infty}\int_K f_k\,d\mu = \lim_{k\to\infty}\int_{g_r(K)} f_k(h_r(u))\,J_r(u)\,du \\
&= \int_{g_r(K)} g(u)\,J_r(u)\,du = \int_{g_r(K)} f(h_r(u))\,J_r(u)\,du.
\end{aligned} \tag{39.9}$$
where here the sum is taken over all possible increasing lists of n indices, I, from
{1, · · ·, p} and x = hi u and the functional was given as
$$\Lambda f\equiv\sum_{i=1}^\infty\int_{g_i\Gamma_i} f\psi_i(h_i(u))\,J_i(u)\,du \tag{39.10}$$
where the $\{\psi_i\}_{i=1}^\infty$ was a partition of unity subordinate to the open sets, $\{W_i\}_{i=1}^\infty$
as described above. I will show
$$J_i(u) = \det\left(Dh(u)^*\,Dh(u)\right)^{1/2}$$
and then use the area formula. The key result is really a special case of the Binet
Cauchy theorem and this special case is presented in the next lemma.
Lemma 39.15 Let A = (aij ) be a real p×n matrix in which p ≥ n. For I ∈ Λ (p, n)
denote by AI the n × n matrix obtained by deleting from A all rows except for those
corresponding to an element of I. Then
$$\sum_{I\in\Lambda(p,n)}\det(A_I)^2 = \det(A^*A)$$
Proof: For $(j_1,\cdots,j_n)\in\Lambda(p,n)$, define $\theta(j_k)\equiv k$. Then let for $\{k_1,\cdots,k_n\} =
\{j_1,\cdots,j_n\}$ define
$$\begin{aligned}
\det(A^*A) &= \sum_{i_1,\cdots,i_n}\mathrm{sgn}(i_1,\cdots,i_n)\sum_{k_1=1}^p a_{k_1 i_1}a_{k_1 1}\sum_{k_2=1}^p a_{k_2 i_2}a_{k_2 2}\cdots\sum_{k_n=1}^p a_{k_n i_n}a_{k_n n} \\
&= \sum_{J\in\Lambda(p,n)}\ \sum_{\{k_1,\cdots,k_n\}=J}\ \sum_{i_1,\cdots,i_n}\mathrm{sgn}(i_1,\cdots,i_n)\,a_{k_1 i_1}a_{k_1 1}a_{k_2 i_2}a_{k_2 2}\cdots a_{k_n i_n}a_{k_n n} \\
&= \sum_{J\in\Lambda(p,n)}\ \sum_{\{k_1,\cdots,k_n\}=J}\ \sum_{i_1,\cdots,i_n}\mathrm{sgn}(i_1,\cdots,i_n)\,a_{k_1 i_1}a_{k_2 i_2}\cdots a_{k_n i_n}\cdot a_{k_1 1}a_{k_2 2}\cdots a_{k_n n} \\
&= \sum_{J\in\Lambda(p,n)}\ \sum_{\{k_1,\cdots,k_n\}=J}\mathrm{sgn}(k_1,\cdots,k_n)\,\det(A_J)\,a_{k_1 1}a_{k_2 2}\cdots a_{k_n n} \\
&= \sum_{J\in\Lambda(p,n)}\det(A_J)\det(A_J)
\end{aligned}$$
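The identity of Lemma 39.15 can be confirmed by brute force on a small matrix. The following sketch enumerates Λ(p, n) with `itertools.combinations` and uses exact integer arithmetic; the helper names and the particular real 3 × 2 matrix are illustrative assumptions, and $A^*$ is the transpose since A is real.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    sign, result = 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return sign * result

def gram(A):
    """Return A^T A for a real matrix A given as a list of rows."""
    cols = list(zip(*A))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

A = [[1, 2], [3, 4], [5, 6]]  # hypothetical p x n matrix with p = 3, n = 2
p, n = len(A), len(A[0])

# Sum of squared n x n minors A_I over all increasing index lists I in Lambda(p, n).
minor_sum = sum(det([A[r] for r in I]) ** 2 for I in combinations(range(p), n))
gram_det = det(gram(A))
```

For this matrix the three minors are −2, −4, −2, whose squares sum to 24, matching $\det(A^*A) = 35\cdot 56 - 44^2 = 24$.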
Now $H^n$ is a Borel measure defined on Γ which is finite on all compact subsets of Γ.
This finiteness follows from the above formula. If K is a compact subset of Γ, then
there exists an open set, W whose closure is compact and a continuous function
with compact support, f such that $K\prec f\prec W$. Then $H^n(K)\le\int_\Gamma f(y)\,dH^n < \infty$
because of the above formula.
Lemma 39.16 µ = Hn on every µ measurable set.
Proof: The Riesz representation theorem shows that
$$\int_\Gamma f\,d\mu = \int_\Gamma f\,dH^n$$
for every continuous function having compact support. Therefore, since every open
set is the countable union of compact sets, it follows µ = Hn on all open sets. Since
compact sets can be obtained as the countable intersection of open sets, these two
measures are also equal on all compact sets. It follows they are also equal on all
countable unions of compact sets. Suppose now that E is a µ measurable set of finite
measure. Then there exist sets, F, G such that G is the countable intersection of
open sets each of which has finite measure and F is the countable union of compact
sets such that µ (G \ F ) = 0 and F ⊆ E ⊆ G. Thus Hn (G \ F ) = 0,
Hn (G) = µ (G) = µ (F ) = Hn (F )
By completeness of $H^n$ it follows E is $H^n$ measurable and $H^n(E) = \mu(E)$. If
E is not of finite measure, consider $E_r\equiv E\cap B(0,r)$. This is contained in the
compact set $\Gamma\cap\overline{B(0,r)}$ and so $\mu(E_r)$ is finite. Thus from what was just shown,
$H^n(E_r) = \mu(E_r)$ and so, taking r → ∞, $H^n(E) = \mu(E)$.
This shows you can simply use Hn for the measure on Γ.
Basic Theory Of Sobolev Spaces
Definition 40.1 Let U be an open set of Rn . Define X m,p (U ) as the set of all
functions in Lp (U ) whose weak partial derivatives up to order m are also in Lp (U )
where 1 ≤ p. The norm¹ in this space is given by
$$\|u\|_{m,p}\equiv\left(\int_U\sum_{|\alpha|\le m}|D^\alpha u|^p\,dx\right)^{1/p},$$
where $\alpha = (\alpha_1,\cdots,\alpha_n)\in\mathbb{N}^n$ and $|\alpha|\equiv\sum\alpha_i$. Here $D^0u\equiv u$. $C^\infty(\overline{U})$ is defined
to be the set of functions which are restrictions to U of a function in $C_c^\infty(\mathbb{R}^n)$.
Thus $C^\infty(\overline{U})\subseteq W^{m,p}(U)$. The Sobolev space, $W^{m,p}(U)$ is defined to be the closure
of $C^\infty(\overline{U})$ in $X^{m,p}(U)$ with respect to the above norm. Denote this norm by
$\|u\|_{W^{m,p}(U)}$, $\|u\|_{X^{m,p}(U)}$, or $\|u\|_{m,p,U}$ when it is important to identify the open set,
U.
Also the following notation will be used pretty consistently.
Definition 40.2 Let u be a function defined on U. Define
$$\widetilde{u}(x)\equiv\begin{cases} u(x) & \text{if } x\in U\\ 0 & \text{if } x\notin U\end{cases}.$$
Theorem 40.3 Both X m,p (U ) and W m,p (U ) are separable reflexive Banach spaces
provided p > 1.
Proof: Define $\Lambda : X^{m,p}(U)\to L^p(U)^w$ where w equals the number of multi
indices, α, such that |α| ≤ m as follows. Letting $\{\alpha_i\}_{i=1}^w$ be the set of all multi
indices with $\alpha_1 = 0$,
$$\Lambda(u)\equiv\left(D^{\alpha_1}u, D^{\alpha_2}u,\cdots,D^{\alpha_w}u\right) = \left(u, D^{\alpha_2}u,\cdots,D^{\alpha_w}u\right).$$
¹ You could also let the norm be given by $\|u\|_{m,p}\equiv\sum_{|\alpha|\le m}\|D^\alpha u\|_p$ or
$\|u\|_{m,p}\equiv\max\left\{\|D^\alpha u\|_p : |\alpha|\le m\right\}$ because all norms are equivalent on $\mathbb{R}^p$ where p is the number of multi
indices no larger than m. This is used whenever convenient.
Then Λ is one to one because one of the multi indices is 0. Also $\Lambda(X^{m,p}(U))$
is a closed subspace of $L^p(U)^w$. To see this, suppose $\left(u_k, D^{\alpha_2}u_k,\cdots,D^{\alpha_w}u_k\right)\to
(f_1, f_2,\cdots,f_w)$ in $L^p(U)^w$. Then $u_k\to f_1$ in $L^p(U)$ and $D^{\alpha_j}u_k\to f_j$ in $L^p(U)$.
Therefore, letting $\phi\in C_c^\infty(U)$ and letting k → ∞,
$$\int_U\left(D^{\alpha_j}u_k\right)\phi\,dx = (-1)^{|\alpha|}\int_U u_k D^{\alpha_j}\phi\,dx\to(-1)^{|\alpha|}\int_U f_1 D^{\alpha_j}\phi\,dx\equiv D^{\alpha_j}(f_1)(\phi),$$
while the left side converges to
$$\int_U f_j\phi\,dx.$$
Theorem 40.4 Suppose U is an open set and U0 ⊆ U is another open set. Suppose
also Dα u ∈ Lp (U ) . Then for all ψ ∈ Cc∞ (U0 ) ,
$$\int_{U_0}(D^\alpha u)\,\psi\,dx = (-1)^{|\alpha|}\int_{U_0} u\,(D^\alpha\psi).$$
Theorem 40.5 Let U be an open set and let $U_0$ be an open subset of U with the
property that $\mathrm{dist}\left(U_0, U^C\right) > 0$. Then if $u\in X^{m,p}(U)$ and $\widetilde{u}$ denotes the zero
extension of u off U,
$$\lim_{l\to\infty}\|\widetilde{u}*\phi_l - u\|_{X^{m,p}(U_0)} = 0.$$
Proof: Always assume l is large enough that $1/l < \mathrm{dist}\left(U_0, U^C\right)$. Thus for
$x\in U_0$,
$$\widetilde{u}*\phi_l(x) = \int_{B\left(0,\frac{1}{l}\right)} u(x-y)\,\phi_l(y)\,dy. \tag{40.1}$$
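Also,
$$\begin{aligned}
\left(\widetilde{D^\alpha u}*\phi_l\right)(\psi) &\equiv \int_{U_0}\left(\int\widetilde{D^\alpha u}(y)\,\phi_l(x-y)\,dy\right)\psi(x)\,dx \\
&= \int_{U_0}\left(\int_U D^\alpha u(y)\,\phi_l(x-y)\,dy\right)\psi(x)\,dx \\
&= \int_{U_0}\left(\int_U u(y)\,(D^\alpha\phi_l)(x-y)\,dy\right)\psi(x)\,dx \\
&= \int_U\int_{U_0} u(y)\,(D^\alpha\phi_l)(x-y)\,\psi(x)\,dx\,dy \\
&= (-1)^{|\alpha|}\int_U\int_{U_0} u(y)\,\phi_l(x-y)\,(D^\alpha\psi)(x)\,dx\,dy.
\end{aligned}$$
It follows that $D^\alpha(\widetilde{u}*\phi_l) = \left(\widetilde{D^\alpha u}*\phi_l\right)$ as weak derivatives defined on $C_c^\infty(U_0)$.
Therefore,
$$\begin{aligned}
\|D^\alpha(\widetilde{u}*\phi_l) - D^\alpha u\|_{L^p(U_0)} &= \left\|\widetilde{D^\alpha u}*\phi_l - D^\alpha u\right\|_{L^p(U_0)} \\
&\le \left\|\widetilde{D^\alpha u}*\phi_l - \widetilde{D^\alpha u}\right\|_{L^p(\mathbb{R}^n)}\to 0.
\end{aligned}$$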
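The convergence $\widetilde{u}*\phi_l\to u$ on $U_0$ can be observed numerically in one dimension. The sketch below rests on illustrative assumptions not in the text: U = (0, 1), $U_0$ = (0.2, 0.8), $u(x) = |x - 0.5|$, the standard bump rescaled to $B(0, 1/l)$ as mollifier, midpoint quadrature, and discrete normalization of the kernel weights. Only the order zero term of the norm with p = 2 is computed.

```python
import math

def bump(t):
    """Standard smooth bump supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def u_tilde(x):
    """Zero extension of u(x) = |x - 0.5| from U = (0, 1) to the whole line."""
    return abs(x - 0.5) if 0.0 < x < 1.0 else 0.0

def mollify(u, l, m=200):
    """Return x -> (u * phi_l)(x), with phi_l supported in B(0, 1/l)."""
    nodes = [(-1.0 + (2 * j + 1) / m) / l for j in range(m)]  # midpoints of (-1/l, 1/l)
    weights = [bump(l * y) for y in nodes]
    total = sum(weights)
    weights = [w / total for w in weights]  # discrete normalization: weights sum to 1
    return lambda x: sum(w * u(x - y) for w, y in zip(weights, nodes))

def l2_error_on_U0(l, n=600):
    """Approximate the L^2(U_0) norm of u_tilde * phi_l - u, with U_0 = (0.2, 0.8)."""
    v = mollify(u_tilde, l)
    step = 0.6 / n
    total = 0.0
    for i in range(n):
        x = 0.2 + (i + 0.5) * step
        total += (v(x) - u_tilde(x)) ** 2 * step
    return math.sqrt(total)

err_10 = l2_error_on_U0(10)
err_40 = l2_error_on_U0(40)
```

Both values of l satisfy $1/l < \mathrm{dist}(U_0, U^C) = 0.2$, as the proof requires, and the error shrinks as l grows because the smoothing only affects a neighborhood of radius 1/l around the kink at 0.5.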
Corollary 40.6 Let U0 and U be as in the above theorem. Then for all l large
enough and φl a mollifier,
$$D^\alpha(\widetilde{u}*\phi_l) = \left(\widetilde{D^\alpha u}*\phi_l\right) \tag{40.2}$$
Definition 40.7 Let U be an open set. C ∞ (U ) denotes the set of functions which
are defined and infinitely differentiable on U.
Theorem 40.8 (Meyer Serrin) Let U be an open subset of Rn . Then if δ > 0 and
u ∈ X m,p (U ) , there exists J ∈ C ∞ (U ) such that ||J − u||m,p,U < δ.
Proof: Let $\cdots U_k\subseteq\overline{U_k}\subseteq U_{k+1}\cdots$ be a sequence of open subsets of U whose union
equals U such that $\overline{U_k}$ is compact for all k. Also let $U_{-3} = U_{-2} = U_{-1} = U_0 = \emptyset$.
Now define $V_k\equiv U_{k+1}\setminus\overline{U_{k-1}}$. Thus $\{V_k\}_{k=1}^\infty$ is an open cover of U. Note the open
cover is locally finite and therefore, there exists a partition of unity subordinate to
this open cover, $\{\eta_k\}_{k=1}^\infty$ such that each $\eta_k\in C_c(V_k)$. Let $\psi_m$ denote the sum
of all the $\eta_k$ which are non zero at some point of $V_m$. Thus
$$\mathrm{spt}(\psi_m)\subseteq U_{m+2}\setminus\overline{U_{m-2}},\quad\psi_m\in C_c^\infty(U),\quad\sum_{m=1}^\infty\psi_m(x) = 1 \tag{40.3}$$
where $l_m$ is chosen large enough that the following two conditions hold:
$$\mathrm{spt}\left(u\psi_m*\phi_{l_m}\right)\subseteq U_{m+3}\setminus\overline{U_{m-3}}, \tag{40.5}$$
$$\left\|(u\psi_m)*\phi_{l_m} - u\psi_m\right\|_{m,p,U_{m+3}} = \left\|(u\psi_m)*\phi_{l_m} - u\psi_m\right\|_{m,p,U} < \frac{\delta}{2^{m+5}}, \tag{40.6}$$
where 40.6 is obtained from Theorem 40.5. Because of 40.3 only finitely many terms
of the series in 40.4 are nonzero and therefore, $J\in C^\infty(U)$. Now let N > 10, some
large value.
$$\begin{aligned}
\|J - u\|_{m,p,U_{N-3}} &= \left\|\sum_{k=0}^N\left(u\psi_k*\phi_{l_k} - u\psi_k\right)\right\|_{m,p,U_{N-3}} \\
&\le \sum_{k=0}^N\left\|u\psi_k*\phi_{l_k} - u\psi_k\right\|_{m,p,U_{N-3}} \\
&\le \sum_{k=0}^N\frac{\delta}{2^{k+5}} < \delta.
\end{aligned}$$
Now apply the monotone convergence theorem to conclude that $\|J - u\|_{m,p,U}\le\delta$.
This proves the theorem.
Note that J = 0 on ∂U. Later on, you will see that this is pathological.
In the study of partial differential equations it is the space $W^{m,p}(U)$ which is
of the most use, not the space $X^{m,p}(U)$. This is because of the density of $C^\infty(\overline{U})$.
Nevertheless, for reasonable open sets, U, the two spaces coincide.
$$U\cap U_z + ta\subseteq U$$
[Figure: a boundary point z with its neighborhood $U_z$, and the translate of $U_z\cap U$ by $ta$.]
You can imagine open sets which do not satisfy the segment condition. For
example, a pair of circles which are tangent at their boundaries. The condition in
the above definition breaks down at their point of tangency.
Here is a simple lemma which will be used in the proof of the following theorem.
Therefore, $D_{x_i}(u\psi) = \psi D_{x_i}u + u\psi_{,x_i}\in L^p(U)$. In other words, the product rule
holds. Now considering the terms in the last expression, you can do the same
argument with each of these as long as they all have derivatives in Lp (U ) . Therefore,
continuing this process the lemma is proved.
Theorem 40.11 Let U be an open set and suppose there exists a locally finite
covering² of U which is of the form $\{U_i\}_{i=1}^\infty$ such that each $U_i$ is a bounded open set
which satisfies the conditions of Definition 40.9. Thus there exist vectors, $a_i$ such
that for all $t\in(0,1)$,
$$U_i\cap U + ta_i\subseteq U.$$
Then $C^\infty(\overline{U})$ is dense in $X^{m,p}(U)$ and so $W^{m,p}(U) = X^{m,p}(U)$.
Proof: Let $\{\psi_i\}_{i=1}^\infty$ be a partition of unity subordinate to the given open cover
with $\psi_i\in C_c^\infty(U_i)$ and let $u\in X^{m,p}(U)$. Thus
$$u = \sum_{k=1}^\infty\psi_k u.$$
Consider $U_k$ for some k. Let $a_k$ be the special vector associated with $U_k$ such that
$$ta_k + U\cap U_k\subseteq U \tag{40.7}$$
This can be done because $ta_k + U\cap U_k$ is a compact subset of U and so has positive
distance to $U^C$ and $\mathrm{spt}(\psi_k) - ta_k$ is a compact subset of $U_k$ having positive distance
to $U_k^C$. Let $t_k$ be such a value for t and for $\phi_l$ a mollifier, define
$$v_{t_k}(x)\equiv\int_{\mathbb{R}^n}\widetilde{u}(x + t_ka_k - y)\,\psi_k(x + t_ka_k - y)\,\phi_{l(t_k)}(y)\,dy \tag{40.10}$$
Rn or more generally in any metric space due to Stone’s theorem. These are issues best left to
you in case you are interested. I am usually interested in bounded sets, U, and for these, there is
a finite covering.
showing that $v_{t_k}$ has compact support in $U_k$. Now change variables in 40.10 to obtain
$$v_{t_k}(x) \equiv \int_{\mathbb{R}^n} \widetilde{u}(y)\,\psi_k(y)\,\phi_{l(t_k)}(x + t_k a_k - y)\,dy. \qquad (40.11)$$
For $x \in U \cap U_k$, the above equals zero unless
$$y - t_k a_k - x \in B\left(0, \frac{1}{l(t_k)}\right)$$
which implies by 40.9 that
$$y \in t_k a_k + U \cap U_k + B\left(0, \frac{1}{l(t_k)}\right) \subseteq U.$$
Therefore, for such $x \in U \cap U_k$, 40.11 reduces to
$$v_{t_k}(x) = \int_{\mathbb{R}^n} u(y)\,\psi_k(y)\,\phi_{l(t_k)}(x + t_k a_k - y)\,dy = \int_U u(y)\,\psi_k(y)\,\phi_{l(t_k)}(x + t_k a_k - y)\,dy.$$
Actually, this formula holds for all $x \in U$. If $x \in U$ but $x \notin U_k$, then the left side of the above formula equals zero because, as noted above, $\operatorname{spt}(v_{t_k}) \subseteq U_k$. The integrand of the right side equals zero unless
$$x \in B\left(0, \frac{1}{l(t_k)}\right) + \operatorname{spt}(\psi_k) - t_k a_k \subseteq U_k$$
by 40.9 and here $x \notin U_k$.
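Next an estimate is obtained for $\|D^\alpha v_{t_k} - D^\alpha(u\psi_k)\|_{L^p(U)}$. By 40.12,
$$\|D^\alpha v_{t_k} - D^\alpha(u\psi_k)\|_{L^p(U)}$$
$$\le \left( \int_U \left( \int_{\mathbb{R}^n} \left| \widetilde{D^\alpha(u\psi_k)}(x + t_k a_k - y) - \widetilde{D^\alpha(u\psi_k)}(x) \right| \phi_{l(t_k)}(y)\,dy \right)^p dx \right)^{1/p}$$
$$\le \int_{\mathbb{R}^n} \phi_{l(t_k)}(y) \left( \int_U \left| \widetilde{D^\alpha(u\psi_k)}(x + t_k a_k - y) - \widetilde{D^\alpha(u\psi_k)}(x) \right|^p dx \right)^{1/p} dy \le \frac{\varepsilon}{2^k}$$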
whenever $t_k$ is taken small enough. Pick $t_k$ this small and let $w_k \equiv v_{t_k}$. Thus
$$\|D^\alpha w_k - D^\alpha(u\psi_k)\|_{L^p(U)} \le \frac{\varepsilon}{2^k}$$
and $w_k \in C_c^\infty(\mathbb{R}^n)$. Now let
$$J(x) \equiv \sum_{k=1}^\infty w_k.$$
Since the $U_k$ are locally finite and $\operatorname{spt}(w_k) \subseteq U_k$ for each $k$, it follows $D^\alpha J = \sum_{k=1}^\infty D^\alpha w_k$ and the sum is always finite. Similarly,
$$D^\alpha \sum_{k=1}^\infty (\psi_k u) = \sum_{k=1}^\infty D^\alpha(\psi_k u)$$
for each multi index, $\alpha$ such that $|\alpha| \le m$. Therefore, there exists $J \in C_c^\infty(\mathbb{R}^n)$ such that
$$\|J - u\|_{W^{m,p}(U)} \le \varepsilon K$$
where $K$ equals the number of multi indices no larger than $m$. Since $\varepsilon$ is arbitrary, this proves the theorem.
Corollary 40.12 Let U be an open set which has the segment property. Then
W m,p (U ) = X m,p (U ) .
Proof: Start with an open covering of U whose sets satisfy the segment condition
and obtain a locally finite refinement consisting of bounded sets which are of the
sort in the above theorem.
Now consider a situation where h : U → V where U and V are two open sets
in Rn and Dα h exists and is continuous and bounded if |α| < m − 1 and Dα h is
Lipschitz if |α| = m − 1.
Definition 40.13 Whenever $h : U \to V$, define $h^*$ mapping the functions which are defined on $V$ to the functions which are defined on $U$ as follows.
$$h^* f(x) \equiv f(h(x)).$$
$h : U \to V$ is bilipschitz if $h$ is one to one, onto and Lipschitz and $h^{-1}$ is also one to one, onto and Lipschitz.
Theorem 40.14 Let $h : U \to V$ be one to one and onto where $U$ and $V$ are two open sets. Also suppose that $D^\alpha h$ and $D^\alpha(h^{-1})$ exist and are Lipschitz continuous if $|\alpha| \le m - 1$ for $m$ a positive integer. Then
$$h^* : W^{m,p}(V) \to W^{m,p}(U)$$
is continuous, linear, one to one, and has an inverse with the same properties, the inverse being $(h^{-1})^*$.
Proof: It is clear that $h^*$ is linear. It is required to show it is one to one and continuous. First suppose $h^* f = 0$. Then
$$0 = \int_U |f(h(x))|^p\,dx$$
and so $f(h(x)) = 0$ for a.e. $x \in U$. Since $h$ is Lipschitz, it takes sets of measure zero to sets of measure zero. Therefore, $f(y) = 0$ a.e. This shows $h^*$ is one to one.
By the Meyers Serrin theorem, Theorem 40.8, it suffices to verify that $h^*$ is continuous on functions in $C^\infty(V)$. Let $f$ be such a function. Then using the chain rule and product rule, $(h^* f)_{,i}(x) = f_{,k}(h(x))\,h_{k,i}(x)$,
$$(h^* f)_{,ij}(x) = \left( f_{,k}(h(x))\,h_{k,i}(x) \right)_{,j} = f_{,kl}(h(x))\,h_{l,j}(x)\,h_{k,i}(x) + f_{,k}(h(x))\,h_{k,ij}(x)$$
etc. In general, for $|\alpha| \le m - 1$, successive applications of the product rule and chain rule yield that $D^\alpha(h^* f)(x)$ has the form
$$D^\alpha(h^* f)(x) = \sum_{|\beta| \le |\alpha|} h^*\left(D^\beta f\right)(x)\,g_\beta(x)$$
$$= C_{m,p,h} \sum_{|\beta| \le m} \int_U \left| \left(D^\beta f\right)(h(x)) \right|^p dx$$
$$= C_{m,p,h} \sum_{|\beta| \le m} \int_V \left| \left(D^\beta f\right)(y) \right|^p \left| \det Dh^{-1}(y) \right| dy$$
Theorem 40.15 Suppose $u, u_{,i} \in L^p_{loc}(\mathbb{R}^n)$ for $i = 1, \cdots, n$ and $p > n$. Then $u$ has a representative, still denoted by $u$, such that for all $x, y \in \mathbb{R}^n$,
$$|u(x) - u(y)| \le C \left( \int_{B(x, 2|y-x|)} |\nabla u|^p\,dz \right)^{1/p} |x - y|^{(1 - n/p)}. \qquad (40.13)$$
This amazing result shows that every $u \in W^{m,p}(\mathbb{R}^n)$ has a representative which is continuous provided $p > n$.
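As a quick illustration of 40.13, here is a sketch with specific numbers. The values $n = 3$ and $p = 6$ are chosen only as an example and do not come from the text.

```latex
% Illustrative values (assumed, not from the text): n = 3, p = 6, so p > n.
\[
1 - \frac{n}{p} = 1 - \frac{3}{6} = \frac{1}{2},
\qquad
|u(x) - u(y)| \le C \left( \int_{B(x,2|y-x|)} |\nabla u|^{6}\,dz \right)^{1/6} |x-y|^{1/2}.
\]
```

In this case the continuous representative is Hölder continuous with exponent $1/2$ on any set where $\nabla u$ is in $L^6$.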
Using the above inequality, one can give an important embedding theorem.
and so
$$m_n\left( [|u| > r] \cap U \right) \le \frac{M^p}{r^p}.$$
Now choosing $r$ large enough, $M^p / r^p < m_n(U)$ and so, for such $r$, there exists $x_u \in U$ such that $|u(x_u)| \le r$. Therefore from 40.13, whenever $x \in U$,
$$|u(x)| \le |u(x_u)| + C M \operatorname{diam}(U)^{1 - n/p} \le r + C M \operatorname{diam}(U)^{1 - n/p}$$
40.1. EMBEDDING THEOREMS FOR W M,P RN 1145
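Suppose the identity map, id, is not compact. Then there exists $\varepsilon > 0$ and a sequence, $\{f_k\}_{k=1}^\infty \subseteq C^{m,\lambda}(K)$ such that $\|f_k\|_{m,\lambda} < M$ for all $k$ but $\|f_k - f_l\|_\beta \ge \varepsilon$ whenever $k \ne l$. By the Ascoli Arzela theorem, there exists a subsequence of this, still denoted by $f_k$ such that $\sum_{|\alpha| \le m} \|D^\alpha(f_l - f_k)\|_\infty < \delta$ where $\delta$ satisfies
$$0 < \delta < \min\left( \frac{\varepsilon}{2},\ \left(\frac{\varepsilon}{8}\right) \left(\frac{\varepsilon}{8M}\right)^{\beta/(\lambda - \beta)} \right). \qquad (40.14)$$
Therefore, $\sup_{|\alpha| = m} \rho_\beta(D^\alpha(f_k - f_l)) \ge \varepsilon - \delta$ for all $k \ne l$. It follows that there exist pairs of points and a multi index, $\alpha$ with $|\alpha| = m$, $\{x_{kl}, y_{kl}, \alpha\}$ such that
$$\frac{\varepsilon - \delta}{2} < \frac{\left| (D^\alpha f_k - D^\alpha f_l)(x_{kl}) - (D^\alpha f_k - D^\alpha f_l)(y_{kl}) \right|}{|x_{kl} - y_{kl}|^\beta} \le 2M |x_{kl} - y_{kl}|^{\lambda - \beta} \qquad (40.15)$$
and so considering the ends of the above inequality,
$$\left( \frac{\varepsilon - \delta}{4M} \right)^{1/(\lambda - \beta)} < |x_{kl} - y_{kl}|.$$
Now also, since $\sum_{|\alpha| \le m} \|D^\alpha(f_l - f_k)\|_\infty < \delta$, it follows from the first inequality in 40.15 that
$$\frac{\varepsilon - \delta}{2} < \frac{2\delta}{\left( \frac{\varepsilon - \delta}{4M} \right)^{\beta/(\lambda - \beta)}}.$$
Since $\delta < \varepsilon/2$, this implies
$$\frac{\varepsilon}{4} < \frac{2\delta}{\left( \frac{\varepsilon}{8M} \right)^{\beta/(\lambda - \beta)}}$$
and so
$$\left(\frac{\varepsilon}{8}\right) \left(\frac{\varepsilon}{8M}\right)^{\beta/(\lambda - \beta)} < \delta$$
contrary to 40.14. This proves the lemma.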
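Corollary 40.21 Let $p > n$, $U$ and $r_U$ be as in Theorem 40.17 and let $m$ be a nonnegative integer. Then $r_U : W^{m+1,p}(\mathbb{R}^n) \to C^{m,\lambda}(\overline{U})$ is continuous as a map into $C^{m,\lambda}(\overline{U})$ for all $\lambda \in [0, 1 - \frac{n}{p}]$ and $r_U$ is compact if $\lambda < 1 - \frac{n}{p}$.
Proof: Suppose $u_k \to 0$ in $W^{m+1,p}(\mathbb{R}^n)$. Then from 40.13, if $\lambda \le 1 - \frac{n}{p}$ and $|\alpha| = m$
$$\rho_\lambda(D^\alpha u_k) \le C \|D^\alpha u_k\|_{1,p} \operatorname{diam}(U)^{1 - \frac{n}{p} - \lambda}.$$
Therefore, $\rho_\lambda(D^\alpha u_k) \to 0$. From Theorem 40.17 it follows that for $|\alpha| \le m$,
$$\|D^\alpha u_k\|_\infty \to 0$$
and so $\|u_k\|_{m,\lambda} \to 0$. This proves the claim about continuity. The claim about compactness for $\lambda < 1 - \frac{n}{p}$ follows from Lemma 40.20 and this.
$$\left( \text{Bounded in } W^{m+1,p}(\mathbb{R}^n) \xrightarrow{\ r_U\ } \text{Bounded in } C^{m,1-\frac{n}{p}}(\overline{U}) \xrightarrow{\ \mathrm{id}\ } \text{Compact in } C^{m,\lambda}(\overline{U}). \right)$$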
It is just as important to consider the case where p < n. To do this case the
following lemma due to Gagliardo [23] will be of interest. See also [1].
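In this inequality, assume all the functions are continuous so there can be no measurability questions.
Proof: First note that for $n = 2$ the inequality reduces to the statement
$$\int\!\!\int |w_1(x_2)|\,|w_2(x_1)|\,dx_1\,dx_2 \le \int |w_1(x_2)|\,dx_2 \int |w_2(x_1)|\,dx_1$$
which is obviously true. Suppose then that the inequality is valid for some $n$. Using Fubini's theorem, Hölder's inequality, and the induction hypothesis,
$$\int_{\mathbb{R}^{n+1}} \prod_{j=1}^{n+1} |w_j(x)|\,dm_{n+1}$$
$$= \int_{\mathbb{R}} \int_{\mathbb{R}^n} |w_{n+1}(x)| \prod_{j=1}^{n} |w_j(x)|\,dm_n\,dx_{n+1}$$
$$= \int_{\mathbb{R}^n} |w_{n+1}(x)| \int_{\mathbb{R}} \prod_{j=1}^{n} |w_j(x)|\,dx_{n+1}\,dm_n$$
$$\le \int_{\mathbb{R}^n} |w_{n+1}(x)| \prod_{j=1}^{n} \left( \int_{\mathbb{R}} |w_j(x)|^n\,dx_{n+1} \right)^{1/n} dm_n$$
$$\le \left( \int_{\mathbb{R}^n} |w_{n+1}(x)|^n\,dm_n \right)^{1/n} \cdot \left( \int_{\mathbb{R}^n} \left[ \prod_{j=1}^{n} \left( \int_{\mathbb{R}} |w_j(x)|^n\,dx_{n+1} \right)^{1/n} \right]^{n/(n-1)} dm_n \right)^{(n-1)/n}$$
$$= \left( \int_{\mathbb{R}^n} |w_{n+1}(x)|^n\,dm_n \right)^{1/n} \cdot \left( \int_{\mathbb{R}^n} \prod_{j=1}^{n} \left( \int_{\mathbb{R}} |w_j(x)|^n\,dx_{n+1} \right)^{1/(n-1)} dm_n \right)^{(n-1)/n}$$
$$\le \left( \int_{\mathbb{R}^n} |w_{n+1}(x)|^n\,dm_n \right)^{1/n} \cdot \left( \prod_{j=1}^{n} \left( \int_{\mathbb{R}^{n-1}} \left( \int_{\mathbb{R}} |w_j(x)|^n\,dx_{n+1} \right) dm_{n-1} \right)^{1/(n-1)} \right)^{(n-1)/n}$$
$$= \left( \int_{\mathbb{R}^n} |w_{n+1}(x)|^n\,dm_n \right)^{1/n} \prod_{j=1}^{n} \left( \int_{\mathbb{R}^n} |w_j(x)|^n\,dm_n \right)^{1/n}$$
$$= \prod_{j=1}^{n+1} \left( \int_{\mathbb{R}^n} |w_j(x)|^n\,dm_n \right)^{1/n}$$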
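For orientation, the first case produced by the induction step is sketched below. This specialization is written out here only as an illustration; as in the $n = 2$ case of the proof, each $w_j$ is assumed not to depend on $x_j$.

```latex
% Case of three variables (n + 1 = 3 in the last line of the proof).
% Assumption, as in the n = 2 case: w_j does not depend on x_j.
\[
\int_{\mathbb{R}^3} |w_1(x_2,x_3)|\,|w_2(x_1,x_3)|\,|w_3(x_1,x_2)|\,dm_3
\le \prod_{j=1}^{3} \left( \int_{\mathbb{R}^2} |w_j|^{2}\,dm_2 \right)^{1/2}.
\]
```

Each factor on the right integrates over the two variables on which the corresponding $w_j$ actually depends.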
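In fact, the term on the left is one of many terms of the expression on the right. Therefore, taking $n$th roots
$$\prod_{i=1}^{n} a_i^{1/n} \le \frac{1}{\sqrt[n]{n}} \sum_{j=1}^{n} a_j.$$
so
$$\|\phi\|_{n/(n-1)}^{n/(n-1)} \equiv \int_{\mathbb{R}^n} |\phi(x)|^{n/(n-1)}\,dm_n \le \int_{\mathbb{R}^n} \prod_{j=1}^{n} \left( \int_{-\infty}^{\infty} |\phi_{,j}(x)|\,dx_j \right)^{1/(n-1)} dm_n$$
Hence, using $\prod_{i=1}^{n} a_i^{1/n} \le \frac{1}{\sqrt[n]{n}} \sum_{j=1}^{n} a_j$,
$$\|\phi\|_{n/(n-1)} \le \prod_{j=1}^{n} \left( \int_{\mathbb{R}^n} |\phi_{,j}(x)|\,dm_n \right)^{1/n} \le \frac{1}{\sqrt[n]{n}} \sum_{j=1}^{n} \int_{\mathbb{R}^n} |\phi_{,j}(x)|\,dm_n = \frac{1}{\sqrt[n]{n}} \sum_{j=1}^{n} \|\phi_{,j}\|_1$$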
Theorem 40.24 Let $1 \le p < n$ and $\frac{1}{q} = \frac{1}{p} - \frac{1}{n}$. Then if $f \in W^{1,p}(\mathbb{R}^n)$,
$$\|f\|_q \le \frac{1}{\sqrt[n]{n}} \frac{(n-1)p}{n-p} \|f\|_{1,p,\mathbb{R}^n}.$$
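To make the exponent and constant in Theorem 40.24 concrete, here is a small worked instance. The values $n = 3$, $p = 2$ are assumed for illustration only.

```latex
% Illustrative values (assumed): n = 3, p = 2, so 1 <= p < n.
\[
\frac{1}{q} = \frac{1}{2} - \frac{1}{3} = \frac{1}{6}, \qquad q = 6,
\qquad
\frac{1}{\sqrt[3]{3}} \cdot \frac{(3-1)\cdot 2}{3-2} = \frac{4}{\sqrt[3]{3}},
\]
\[
\|f\|_{6} \le \frac{4}{\sqrt[3]{3}}\, \|f\|_{1,2,\mathbb{R}^3}
\quad \text{for } f \in W^{1,2}(\mathbb{R}^3).
\]
```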
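Proof: From the definition of $W^{1,p}(\mathbb{R}^n)$, $C_c^1(\mathbb{R}^n)$ is dense in $W^{1,p}$. Here $C_c^1(\mathbb{R}^n)$ is the space of continuous functions having continuous derivatives which have compact support. The desired inequality will be established for such $\phi$ and then the density of this set in $W^{1,p}(\mathbb{R}^n)$ will be exploited to obtain the inequality for all $f \in W^{1,p}(\mathbb{R}^n)$. First note that the case where $p = 1$ follows immediately from the above lemma and so it is only necessary to consider the case where $p > 1$.
Let $\phi \in C_c^1(\mathbb{R}^n)$ and consider $|\phi|^r$ where $r > 1$. Then a short computation shows $|\phi|^r \in C_c^1(\mathbb{R}^n)$ and
$$\left| \left( |\phi|^r \right)_{,i} \right| = r\,|\phi|^{r-1}\,|\phi_{,i}|.$$
That is, let $r = \frac{p(n-1)}{n-p} > 1$ and so $\frac{rn}{n-1} = \frac{np}{n-p}$. Then this reduces to
$$\left( \int |\phi|^{\frac{np}{n-p}}\,dm_n \right)^{(n-1)/n} \le \frac{r}{\sqrt[n]{n}} \sum_{i=1}^{n} \left( \int |\phi_{,i}|^p \right)^{1/p} \left( \int |\phi|^{\frac{np}{n-p}}\,dm_n \right)^{(p-1)/p}.$$
$$\left( \int |\phi|^{\frac{np}{n-p}}\,dm_n \right)^{\frac{n-p}{np}} \le \frac{r}{\sqrt[n]{n}} \sum_{i=1}^{n} \left( \int |\phi_{,i}|^p \right)^{1/p} \le \frac{r}{\sqrt[n]{n}} \|\phi\|_{1,p,\mathbb{R}^n}.$$
Letting $q = \frac{np}{n-p}$, it follows $\frac{1}{q} = \frac{n-p}{np} = \frac{1}{p} - \frac{1}{n}$ and
$$\|\phi\|_q \le \frac{r}{\sqrt[n]{n}} \|\phi\|_{1,p,\mathbb{R}^n}.$$
Now let $f \in W^{1,p}(\mathbb{R}^n)$ and let $\|\phi_k - f\|_{1,p,\mathbb{R}^n} \to 0$ as $k \to \infty$. Taking another subsequence, if necessary, you can also assume $\phi_k(x) \to f(x)$ a.e. Therefore, by Fatou's lemma,
$$\|f\|_q \le \liminf_{k \to \infty} \left( \int_{\mathbb{R}^n} |\phi_k(x)|^q\,dm_n \right)^{1/q} \le \liminf_{k \to \infty} \frac{r}{\sqrt[n]{n}} \|\phi_k\|_{1,p,\mathbb{R}^n} = \frac{r}{\sqrt[n]{n}} \|f\|_{1,p,\mathbb{R}^n}.$$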
$$\frac{1}{q} = \frac{n - (m-1)p}{np} - \frac{1}{n} = \frac{n - mp}{np}.$$
The above corollaries imply yet another interesting corollary which involves embeddings in the Hölder spaces.
Corollary 40.27 Suppose $jp < n < (j+1)p$ and let $m$ be a positive integer. Let $U$ be any bounded open set in $\mathbb{R}^n$. Then letting $r_U$ denote the restriction to $U$, $r_U : W^{m+j,p}(\mathbb{R}^n) \to C^{m-1,\lambda}(\overline{U})$ is continuous for every $\lambda \le \lambda_0 \equiv (j+1) - \frac{n}{p}$ and if $\lambda < (j+1) - \frac{n}{p}$, then $r_U$ is compact.
Proof: From Corollary 40.26 $W^{m+j,p}(\mathbb{R}^n) \subseteq W^{m,q}(\mathbb{R}^n)$ where $q$ is given by 40.16. Therefore,
$$\frac{np}{n - jp} > n$$
and so by Corollary 40.21, $W^{m,q}(\mathbb{R}^n) \subseteq C^{m-1,\lambda}(\overline{U})$ for all $\lambda$ satisfying
$$0 < \lambda < 1 - \frac{(n - jp)\,n}{np} = \frac{p(j+1) - n}{p} = (j+1) - \frac{n}{p}.$$
The assertion about compactness follows from the compactness of the embedding of $C^{m-1,\lambda_0}(\overline{U})$ into $C^{m-1,\lambda}(\overline{U})$ for $\lambda < \lambda_0$. See Lemma 40.20.
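The numbers in Corollary 40.27 can be checked in a concrete case. The following sketch uses the assumed values $n = 3$, $p = 2$, $j = 1$, which are illustrative and not taken from the text.

```latex
% Illustrative values (assumed): n = 3, p = 2, j = 1.
\[
jp = 2 < 3 < 4 = (j+1)p, \qquad
\frac{np}{n - jp} = \frac{6}{1} = 6 > 3 = n, \qquad
\lambda_0 = (j+1) - \frac{n}{p} = 2 - \frac{3}{2} = \frac{1}{2}.
\]
% Consequently r_U : W^{m+1,2}(R^3) -> C^{m-1,lambda}(closure of U)
% is continuous for lambda <= 1/2 and compact for lambda < 1/2.
```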
There are other embeddings of this sort available. You should see Adams [1] for a
more complete listing of these. Next are some theorems about compact embeddings.
This requires some consideration of which subsets of Lp (U ) are compact. The main
theorem is the following. See [1].
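Theorem 40.28 Let $K$ be a bounded subset of $L^p(U)$ and suppose that for all $\varepsilon > 0$, there exist a $\delta > 0$ such that if $|h| < \delta$, then
$$\int_{\mathbb{R}^n} |\widetilde{u}(x + h) - \widetilde{u}(x)|^p\,dx < \varepsilon^p \qquad (40.17)$$
Suppose also that for each $\varepsilon > 0$ there exists an open set, $G \subseteq U$ such that $\overline{G}$ is compact and for all $u \in K$,
$$\int_{U \setminus G} |u(x)|^p\,dx < \varepsilon^p \qquad (40.18)$$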
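$$K_k \equiv \{u * \phi_k : u \in K\}.$$
and verify the conditions for the Ascoli Arzela theorem for these functions defined on $\overline{G}$. Say $\|u\|_p \le M$ for all $u \in K$.
First of all, for $u \in K$ and $x \in \mathbb{R}^n$,
$$|u * \phi_k(x)|^p \le \left( \int |u(x - y)\,\phi_k(y)|\,dy \right)^p = \left( \int |u(y)\,\phi_k(x - y)|\,dy \right)^p$$
$$\le \int |u(y)|^p\,\phi_k(x - y)\,dy \le \left( \sup_{z \in \mathbb{R}^n} \phi_k(z) \right) \int |u(y)|^p\,dy \le M^p \left( \sup_{z \in \mathbb{R}^n} \phi_k(z) \right)$$
$$|u * \phi_k(x) - u * \phi_k(x_1)| \le \int |u(x - y) - u(x_1 - y)|\,\phi_k(y)\,dy$$
$$\le \left( \int |u(x - y) - u(x_1 - y)|^p\,dy \right)^{1/p} \left( \int \phi_k(y)^q\,dy \right)^{1/q}$$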
Now let ε > 0 be given and let δ and G correspond to ε as given in the hypotheses
and let 1/k < δ and also k is large enough that for all u ∈ K,
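$$\le 2\varepsilon + \left( \int_{G + B(0,1)} |u * \phi_k - u_j * \phi_k|^p\,dx \right)^{1/p} + \left( \int_{\mathbb{R}^n \setminus (G + B(0,1))} |u * \phi_k - u_j * \phi_k|^p\,dx \right)^{1/p}$$
$$\le 2\varepsilon + \varepsilon^{1/p} + \left( \int_{\mathbb{R}^n \setminus (G + B(0,1))} \left( \int |u(x - y) - u_j(x - y)|\,\phi_k(y)\,dy \right)^p dx \right)^{1/p}$$
$$\le 2\varepsilon + \varepsilon^{1/p} + \int \phi_k(y) \left( \int_{\mathbb{R}^n \setminus (G + B(0,1))} \left( |u(x - y)| + |u_j(x - y)| \right)^p dx \right)^{1/p} dy$$
$$\le 2\varepsilon + \varepsilon^{1/p} + \int \phi_k(y) \left( \int_{\mathbb{R}^n \setminus G} \left( |u(x)| + |u_j(x)| \right)^p dx \right)^{1/p} dy$$
$$\le 2\varepsilon + \varepsilon^{1/p} + 2^{p-1} \int \phi_k(y) \left( \int_{\mathbb{R}^n \setminus G} \left( |u(x)|^p + |u_j(x)|^p \right) dx \right)^{1/p} dy$$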
and since ε > 0 is arbitrary, this shows that K is totally bounded and is therefore
precompact.
Now for an arbitrary open set, $U$ and $K$ given in the hypotheses of the theorem, let $\widetilde{K} \equiv \{\widetilde{u} : u \in K\}$ and observe that $\widetilde{K}$ is precompact in $L^p(\mathbb{R}^n)$. But this is the same as saying that $K$ is precompact in $L^p(U)$. This proves the theorem.
Actually the converse of the above theorem is also true [1] but this will not be
needed so I have left it as an exercise for anyone interested.
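Lemma 40.29 Let $u \in W^{1,1}(U)$ for $U$ an open set and let $\phi \in C_c^\infty(U)$. Then there exists a constant,
$$C\left( \phi, \|u\|_{1,1,U} \right),$$
depending only on the indicated quantities such that whenever $v \in \mathbb{R}^n$ with $|v| < \operatorname{dist}\left( \operatorname{spt}(\phi), U^C \right)$, it follows that
$$\int_{\mathbb{R}^n} \left| \widetilde{\phi u}(x + v) - \widetilde{\phi u}(x) \right| dx \le C\left( \phi, \|u\|_{1,1,U} \right) |v|.$$
Proof: First suppose $u \in C^\infty(\overline{U})$. Then for any $x \in \operatorname{spt}(\phi) \cup (\operatorname{spt}(\phi) - v) \equiv G_v$, the chain rule implies
$$|\phi u(x + v) - \phi u(x)| \le \int_0^1 \sum_{i=1}^{n} \left| (\phi u)_{,i}(x + tv)\,v_i \right| dt \le \int_0^1 \sum_{i=1}^{n} \left| \left( \phi_{,i} u + u_{,i} \phi \right)(x + tv) \right| dt\,|v|.$$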
Lemma 40.30 Let $U$ be a bounded open set and define for $p > 1$
$$S \equiv \left\{ u \in W^{1,1}(U) \cap L^p(U) : \|u\|_{1,1,U} + \|u\|_{L^p(U)} \le M \right\} \qquad (40.19)$$
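It remains to satisfy the first condition. It is necessary to verify there exists $\delta > 0$ such that if $|v| < \delta$, then
$$\int_{\mathbb{R}^n} \left| \widetilde{\phi u}(x + v) - \widetilde{\phi u}(x) \right|^q dx < \varepsilon^p. \qquad (40.21)$$
Let $\operatorname{spt}(\phi) \cup (\operatorname{spt}(\phi) - v) \equiv G_v$. Now if $h$ is any measurable function, and if $\theta \in (0,1)$ is chosen small enough that $\theta q < 1$,
$$\int_{G_v} |h|^q\,dx = \int_{G_v} |h|^{\theta q}\,|h|^{(1-\theta)q}\,dx \le \left( \int_{G_v} |h|\,dx \right)^{\theta q} \left( \int_{G_v} \left( |h|^{(1-\theta)q} \right)^{\frac{1}{1 - \theta q}} \right)^{1 - \theta q}$$
$$= \left( \int_{G_v} |h|\,dx \right)^{\theta q} \left( \int_{G_v} |h|^{\frac{(1-\theta)q}{1 - \theta q}} \right)^{1 - \theta q}. \qquad (40.22)$$
Now let $\theta$ also be small enough that there exists $r > 1$ such that
$$r\,\frac{(1-\theta)q}{1 - \theta q} = p$$
and use Hölder's inequality in the last factor of the right side of 40.22. Then 40.22 is dominated by
$$\left( \int_{G_v} |h|\,dx \right)^{\theta q} \left( \int_{G_v} |h|^p \right)^{\frac{1 - \theta q}{r}} \left( \int_{G_v} 1\,dx \right)^{\frac{1 - \theta q}{r'}} = C\left( \|h\|_{L^p(G_v)}, m_n(G_v) \right) \left( \int_{G_v} |h|\,dx \right)^{\theta q}.$$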
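The two smallness requirements on $\theta$ can be checked numerically in a simple case. The values $q = 2$, $p = 4$, $\theta = 1/4$ below are assumed purely for illustration.

```latex
% Illustrative values (assumed): q = 2, p = 4, theta = 1/4.
\[
\theta q = \tfrac{1}{2} < 1, \qquad
\frac{(1-\theta)q}{1-\theta q} = \frac{3/2}{1/2} = 3, \qquad
r = \frac{p}{3} = \frac{4}{3} > 1 .
\]
% In general, as theta -> 0 the ratio (1-theta)q/(1-theta q) tends to q,
% so for q < p a suitable r > 1 exists once theta is small enough.
```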
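Therefore, for $u \in S$,
$$\int_{\mathbb{R}^n} \left| \widetilde{\phi u}(x + v) - \widetilde{\phi u}(x) \right|^q dx = \int_{G_v} |\phi u(x + v) - \phi u(x)|^q\,dx$$
$$\le C\left( \|\phi u(\cdot + v) - \phi u(\cdot)\|_{L^p(G_v)}, m_n(G_v) \right) \left( \int_{G_v} |\phi u(x + v) - \phi u(x)|\,dx \right)^{\theta q}$$
$$\le C\left( 2\|\phi u(\cdot)\|_{L^p(U)}, m_n(U) \right) \left( \int_{G_v} |\phi u(x + v) - \phi u(x)|\,dx \right)^{\theta q}$$
$$\le C(\phi, M, m_n(U)) \left( \int_{G_v} |\phi u(x + v) - \phi u(x)|\,dx \right)^{\theta q}$$
$$= C(\phi, M, m_n(U)) \left( \int_{\mathbb{R}^n} \left| \widetilde{\phi u}(x + v) - \widetilde{\phi u}(x) \right| dx \right)^{\theta q}. \qquad (40.23)$$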
Proof: It suffices to show that every sequence, $\{u_k\}_{k=1}^\infty \subseteq S$ has a subsequence which converges in $L^q(U)$. Let $\{K_m\}_{m=1}^\infty$ denote a sequence of compact subsets of $U$ with the property that $K_m \subseteq K_{m+1}$ for all $m$ and $\cup_{m=1}^\infty K_m = U$. Now let $\phi_m \in C_c^\infty(U)$ such that $\phi_m(x) \in [0,1]$ and $\phi_m(x) = 1$ for all $x \in K_m$. Let $S_m \equiv \{\phi_m u : u \in S\}$. By Lemma 40.30 there exists a subsequence of $\{u_k\}_{k=1}^\infty$, denoted here by $\{u_{1,k}\}_{k=1}^\infty$ such that $\{\phi_1 u_{1,k}\}_{k=1}^\infty$ converges in $L^q(U)$. Now $S_2$ is also precompact in $L^q(U)$ and so there exists a subsequence of $\{u_{1,k}\}_{k=1}^\infty$, denoted by $\{u_{2,k}\}_{k=1}^\infty$ such that $\{\phi_2 u_{2,k}\}_{k=1}^\infty$ converges in $L^q(U)$. Thus it is also the case that $\{\phi_1 u_{2,k}\}_{k=1}^\infty$ converges in $L^q(U)$. Continue taking subsequences in this manner such that for all $l \le m$, $\{\phi_l u_{m,k}\}_{k=1}^\infty$ converges in $L^q(U)$. Let $\{w_m\}_{m=1}^\infty = \{u_{m,m}\}_{m=1}^\infty$ so that $\{w_k\}_{k=m}^\infty$ is a subsequence of $\{u_{m,k}\}_{k=1}^\infty$. Then it follows for all $k$, $\{\phi_k w_m\}_{m=1}^\infty$ must converge in $L^q(U)$. For $u \in S$,
$$\|u - \phi_k u\|_{L^q(U)}^q = \int_U |u|^q (1 - \phi_k)^q\,dx \le \left( \int_U |u|^p\,dx \right)^{q/p} \left( \int_U (1 - \phi_k)^{qr}\,dx \right)^{1/r} \le M^q \left( \int_U (1 - \phi_k)^{qr}\,dx \right)^{1/r}$$
where $q/p + 1/r = 1$. Now $\phi_k(x) \to \mathcal{X}_U(x)$ and so the last integral converges to 0 by the dominated convergence theorem. Therefore, $k$ may be chosen large enough that for all $u \in S$,
$$\|u - \phi_k u\|_{L^q(U)}^q \le \left( \frac{\varepsilon}{3} \right)^q.$$
Fix such a value of $k$. Then
$$\|w_q - w_p\|_{L^q(U)} \le \|w_q - \phi_k w_q\|_{L^q(U)} + \|\phi_k w_q - \phi_k w_p\|_{L^q(U)} + \|w_p - \phi_k w_p\|_{L^q(U)} \le \frac{2\varepsilon}{3} + \|\phi_k w_q - \phi_k w_p\|_{L^q(U)}.$$
But $\{\phi_k w_m\}_{m=1}^\infty$ converges in $L^q(U)$ and so the last term in the above is less than $\varepsilon/3$ whenever $p, q$ are large enough. Thus $\{w_m\}_{m=1}^\infty$ is a Cauchy sequence and must therefore converge in $L^q(U)$. This proves the theorem.
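$$R(Q \cap U) = \{y \in \mathbb{R}^n : \widehat{y} \in B,\ a < y_n < g(\widehat{y})\} \qquad (40.27)$$
where $g$ is Lipschitz continuous on $\overline{B}$, $a < \min\left\{ g(x) : x \in \overline{B} \right\}$, and $\widehat{y} \equiv (y_1, \cdots, y_{n-1})$.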
[Figure: the set $Q$ containing $x$ together with $W$, and their images $R(Q)$ and $R(W)$ under $R$, with the levels $a$ and $b$ marked.]
The following lemma is important.
Proof: For $x \in \partial U$, simply look at a single open set, $Q_x$ described in the above which contains $x$. Then consider an open set whose intersection with $U$ is of the form $R^T\left( \{y : \widehat{y} \in B,\ g(\widehat{y}) - \varepsilon < y_n < g(\widehat{y})\} \right)$ and a vector of the form $\varepsilon R^T(-e_n)$ where $\varepsilon$ is chosen smaller than $\min\left\{ g(x) : x \in \overline{B} \right\} - a$. There is nothing to prove for points of $U$.
One way to extend many of the above theorems to more general open sets than
Rn is through the use of an appropriate extension theorem. In this section, a fairly
general one will be presented.
for $g$ a Lipschitz function of the sort described in this definition. Suppose $u^+$ and $u^-$ are Lipschitz functions defined on $V^+$ and $V^-$ respectively and suppose that $u^+(\widehat{y}, g(\widehat{y})) = u^-(\widehat{y}, g(\widehat{y}))$ for all $\widehat{y} \in B$. Let
$$u(\widehat{y}, y_n) \equiv \begin{cases} u^+(\widehat{y}, y_n) & \text{if } (\widehat{y}, y_n) \in V^+ \\ u^-(\widehat{y}, y_n) & \text{if } (\widehat{y}, y_n) \in V^- \end{cases}$$
and suppose $\operatorname{spt}(u) \subseteq B \times (a, b)$. Then extending $u$ to be 0 off of $B \times (a, b)$, $u$ is continuous and the weak partial derivatives, $u_{,i}$, are all in $L^\infty(\mathbb{R}^n) \cap L^p(\mathbb{R}^n)$ for all $p > 1$ and $u_{,i} = (u^+)_{,i}$ on $V^+$ and $u_{,i} = (u^-)_{,i}$ on $V^-$.
40.2. AN EXTENSION THEOREM 1159
[Figure: $\operatorname{spt}(u)$ contained in the box $B \times (a, b)$.]
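Note first that $u$ is Lipschitz continuous. To see this, consider $|u(y_1) - u(y_2)|$ where $\left( \widehat{y}_i, y_n^i \right) = y_i$. There are various cases to consider depending on whether $y_n^i$ is above $g(\widehat{y}_i)$. Suppose $y_n^1 < g(\widehat{y}_1)$ and $y_n^2 > g(\widehat{y}_2)$. Then letting $K \ge \max\left( \operatorname{Lip}(u^+), \operatorname{Lip}(u^-) \right)$,
$$\left| u\left( \widehat{y}_1, y_n^1 \right) - u\left( \widehat{y}_2, y_n^2 \right) \right| \le \left| u\left( \widehat{y}_1, y_n^1 \right) - u\left( \widehat{y}_2, y_n^1 \right) \right| + \left| u\left( \widehat{y}_2, y_n^1 \right) - u\left( \widehat{y}_2, g(\widehat{y}_2) \right) \right| + \left| u\left( \widehat{y}_2, g(\widehat{y}_2) \right) - u\left( \widehat{y}_2, y_n^2 \right) \right|$$
$$\le K |\widehat{y}_1 - \widehat{y}_2| + K \left[ g(\widehat{y}_2) - y_n^1 + y_n^2 - g(\widehat{y}_2) \right]$$
$$= K \left( |\widehat{y}_1 - \widehat{y}_2| + \left| y_n^1 - y_n^2 \right| \right) \le K \sqrt{n}\,|y_1 - y_2|$$
The other cases are similar. Thus $u$ is a Lipschitz continuous function which has compact support. By Corollary 37.18 on Page 1092 it follows that $u_{,i} \in L^\infty(\mathbb{R}^n) \cap L^p(\mathbb{R}^n)$ for all $p > 1$. It remains to verify $u_{,i} = (u^+)_{,i}$ on $V^+$ and $u_{,i} = (u^-)_{,i}$ on $V^-$. The last claim is obvious from the definition of weak derivatives.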
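Lemma 40.35 In the situation of Lemma 40.34 let $u \in C^1\left( \overline{V^-} \right) \cap C_c^1(B \times (a, b))$³ and define
$$w(\widehat{y}, y_n) \equiv \begin{cases} u(\widehat{y}, y_n) & \text{if } \widehat{y} \in B \text{ and } y_n \le g(\widehat{y}), \\ u(\widehat{y}, 2g(\widehat{y}) - y_n) & \text{if } \widehat{y} \in B \text{ and } y_n > g(\widehat{y}), \\ 0 & \text{if } \widehat{y} \notin B. \end{cases}$$
Then $w \in W^{1,p}(\mathbb{R}^n)$ and there exists a constant, $C$ depending only on $\operatorname{Lip}(g)$ and dimension such that
$$\|w\|_{W^{1,p}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(V^-)}.$$
Denote $w$ by $E_0 u$. Thus $E_0(u)(y) = u(y)$ for all $y \in V^-$ but $E_0 u = w$ is defined on all of $\mathbb{R}^n$. Also, $E_0$ is a linear mapping.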
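The definition of $w$ is easiest to picture when the graph of $g$ is flat. The following sketch assumes $g \equiv 0$ (so $a < 0$), a special case chosen here for illustration rather than one singled out in the text.

```latex
% Special case (assumed for illustration): g = 0 on B, with a < 0 < b.
\[
w(\widehat{y}, y_n) =
\begin{cases}
u(\widehat{y}, y_n) & \text{if } \widehat{y} \in B \text{ and } y_n \le 0, \\
u(\widehat{y}, -y_n) & \text{if } \widehat{y} \in B \text{ and } y_n > 0, \\
0 & \text{if } \widehat{y} \notin B,
\end{cases}
\]
% so w is the even reflection of u across the hyperplane y_n = 0.
% For y_n > 0 the ordinary chain rule gives
\[
\frac{\partial w}{\partial y_i}(\widehat{y}, y_n) = \frac{\partial u}{\partial y_i}(\widehat{y}, -y_n) \ (i < n),
\qquad
\frac{\partial w}{\partial y_n}(\widehat{y}, y_n) = -\frac{\partial u}{\partial y_n}(\widehat{y}, -y_n).
\]
```

In this flat case $\operatorname{Lip}(g) = 0$, and the reflection merely copies the first derivatives of $u$ from below the hyperplane up to a sign.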
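$y_n > g(\widehat{y})$ and then to extract an estimate of the right sort. Denote by $U$ the set of points of $\mathbb{R}^n$ with the property that $(\widehat{y}, y_n) \in U$ if and only if $\widehat{y} \notin B$ or $\widehat{y} \in B$ and $y_n > g(\widehat{y})$. Then letting $\phi \in C_c^\infty(U)$, suppose first that $i < n$. Then
$$\int_U w(\widehat{y}, y_n)\,\phi_{,i}(y)\,dy \equiv \lim_{h \to 0} \int_U \frac{u\left( \widehat{y} - h e_i^{n-1}, 2g\left( \widehat{y} - h e_i^{n-1} \right) - y_n \right) - u(\widehat{y}, 2g(\widehat{y}) - y_n)}{h}\,\phi(y)\,dy \qquad (40.28)$$
$$= \lim_{h \to 0} \Big\{ \frac{-1}{h} \int_U \phi(y) \left[ D_1 u(\widehat{y}, 2g(\widehat{y}) - y_n)\left( h e_i^{n-1} \right) + 2 D_2 u(\widehat{y}, 2g(\widehat{y}) - y_n)\left( g\left( \widehat{y} - h e_i^{n-1} \right) - g(\widehat{y}) \right) \right] dy$$
$$+ \frac{-1}{h} \int_U \phi(y) \left[ o\left( g\left( \widehat{y} - h e_i^{n-1} \right) - g(\widehat{y}) \right) + o(h) \right] dy \Big\}$$
where $e_i^{n-1}$ is the unit vector in $\mathbb{R}^{n-1}$ having all zeros except for a 1 in the $i$th position. Now by Rademacher's theorem, $Dg(\widehat{y})$ exists for a.e. $\widehat{y}$ and so except for a set of measure zero, the expression, $o\left( g\left( \widehat{y} - h e_i^{n-1} \right) - g(\widehat{y}) \right)$ is $o(h)$ and also for $\widehat{y}$ not in the exceptional set,
$$g\left( \widehat{y} - h e_i^{n-1} \right) - g(\widehat{y}) = -h\,Dg(\widehat{y})\,e_i^{n-1} + o(h).$$
Therefore, since the integrand in 40.28 has compact support and because of the Lipschitz continuity of all the functions, the dominated convergence theorem may be applied to obtain
$$\int_U w(\widehat{y}, y_n)\,\phi_{,i}(y)\,dy = \int_U \phi(y) \left[ -D_1 u(\widehat{y}, 2g(\widehat{y}) - y_n)\left( e_i^{n-1} \right) + 2 D_2 u(\widehat{y}, 2g(\widehat{y}) - y_n)\,Dg(\widehat{y})\,e_i^{n-1} \right] dy$$
$$= \int_U \phi(y) \left[ -\frac{\partial u}{\partial y_i}(\widehat{y}, 2g(\widehat{y}) - y_n) + 2 \frac{\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n)\,\frac{\partial g(\widehat{y})}{\partial y_i} \right] dy$$
and so
$$w_{,i}(y) = \frac{\partial u}{\partial y_i}(\widehat{y}, 2g(\widehat{y}) - y_n) - 2 \frac{\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n)\,\frac{\partial g(\widehat{y})}{\partial y_i} \qquad (40.29)$$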
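whenever $i < n$ which is what you would expect from a formal application of the chain rule. Next suppose $i = n$.
$$\int_U w(\widehat{y}, y_n)\,\phi_{,n}(y)\,dy = \lim_{h \to 0} \int_U \frac{u(\widehat{y}, 2g(\widehat{y}) - (y_n - h)) - u(\widehat{y}, 2g(\widehat{y}) - y_n)}{h}\,\phi(y)\,dy$$
$$= \lim_{h \to 0} \int_U \frac{D_2 u(\widehat{y}, 2g(\widehat{y}) - y_n)\,h + o(h)}{h}\,\phi(y)\,dy = \int_U \frac{\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n)\,\phi(y)\,dy$$
showing that
$$w_{,n}(y) = \frac{-\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n) \qquad (40.30)$$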
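$$\|w_{,i}\|_{L^p(U)}^p = \int_U \left| \frac{\partial u}{\partial y_i}(\widehat{y}, 2g(\widehat{y}) - y_n) - 2 \frac{\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n)\,\frac{\partial g(\widehat{y})}{\partial y_i} \right|^p dy$$
$$\le 2^{p-1} \int_U \left| \frac{\partial u}{\partial y_i}(\widehat{y}, 2g(\widehat{y}) - y_n) \right|^p + 2^p \left| \frac{\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n) \right|^p \operatorname{Lip}(g)^p\,dy$$
$$\le 4^p \left( 1 + \operatorname{Lip}(g)^p \right) \int_U \left| \frac{\partial u}{\partial y_i}(\widehat{y}, 2g(\widehat{y}) - y_n) \right|^p + \left| \frac{\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n) \right|^p dy$$
$$= 4^p \left( 1 + \operatorname{Lip}(g)^p \right) \int_B \int_{g(\widehat{y})}^{\infty} \left| \frac{\partial u}{\partial y_i}(\widehat{y}, 2g(\widehat{y}) - y_n) \right|^p + \left| \frac{\partial u}{\partial y_n}(\widehat{y}, 2g(\widehat{y}) - y_n) \right|^p dy_n\,d\widehat{y}$$
$$= 4^p \left( 1 + \operatorname{Lip}(g)^p \right) \int_B \int_{-\infty}^{g(\widehat{y})} \left| \frac{\partial u}{\partial y_i}(\widehat{y}, z_n) \right|^p + \left| \frac{\partial u}{\partial y_n}(\widehat{y}, z_n) \right|^p dz_n\,d\widehat{y}$$
$$= 4^p \left( 1 + \operatorname{Lip}(g)^p \right) \int_B \int_a^{g(\widehat{y})} \left| \frac{\partial u}{\partial y_i}(\widehat{y}, z_n) \right|^p + \left| \frac{\partial u}{\partial y_n}(\widehat{y}, z_n) \right|^p dz_n\,d\widehat{y}$$
$$\le 4^p \left( 1 + \operatorname{Lip}(g)^p \right) \|u\|_{1,p,V^-}^p$$
It follows
$$\|w\|_{1,p,\mathbb{R}^n}^p = \|w\|_{1,p,U}^p + \|u\|_{1,p,V^-}^p \le 4^p n \left( 1 + \operatorname{Lip}(g)^p \right) \|u\|_{1,p,V^-}^p + \|u\|_{1,p,V^-}^p$$
and so
$$\|w\|_{1,p,\mathbb{R}^n}^p \le 4^p n \left( 2 + \operatorname{Lip}(g)^p \right) \|u\|_{1,p,V^-}^p$$
which implies
$$\|w\|_{1,p,\mathbb{R}^n} \le 4 n^{1/p} \left( 2 + \operatorname{Lip}(g)^p \right)^{1/p} \|u\|_{1,p,V^-}$$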
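To get a feel for the size of the constant in the last inequality, here is a sample evaluation. The values $p = 2$, $n = 3$ and $\operatorname{Lip}(g) = 1$ are assumed only for illustration.

```latex
% Illustrative values (assumed): p = 2, n = 3, Lip(g) = 1.
\[
4\, n^{1/p} \left( 2 + \operatorname{Lip}(g)^{p} \right)^{1/p}
= 4 \cdot 3^{1/2} \cdot (2 + 1)^{1/2}
= 4 \cdot \sqrt{3} \cdot \sqrt{3}
= 12 .
\]
```

The constant depends only on $n$, $p$ and $\operatorname{Lip}(g)$, as the lemma asserts; no claim is made that it is sharp.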
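$$R(Q \cap U) = \{y \in \mathbb{R}^n : \widehat{y} \in B,\ a < y_n < g(\widehat{y})\} \qquad (40.32)$$
where $g$ is Lipschitz continuous on $\overline{B}$, $a < \min\left\{ g(x) : x \in \overline{B} \right\}$, and $\widehat{y} \equiv (y_1, \cdots, y_{n-1})$.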
[Figure: the set $Q$ containing $x$ together with $W$, and their images $R(Q)$ and $R(W)$ under $R$, with the levels $a$ and $b$ marked.]
Lemma 40.37 In the situation of Definition 40.32 let $u \in C^1(\overline{U}) \cap C_c^1(Q)$ and define
$$Eu \equiv R^* E_0 \left( R^T \right)^* u,$$
where $\left( R^T \right)^*$ maps $W^{1,p}(U \cap Q)$ to $W^{1,p}(R(W))$. Then $E$ is linear and satisfies
Theorem 40.38 Let $U$ be a bounded open set which has Lipschitz boundary. Then for each $p \ge 1$, there exists $E \in \mathcal{L}\left( W^{1,p}(U), W^{1,p}(\mathbb{R}^n) \right)$ such that $Eu(x) = u(x)$ a.e. $x \in U$.
let $\psi_i \in C_c^\infty(Q_i)$ with $\psi_i(x) \in [0,1]$ and $\sum_{i=0}^{p} \psi_i(x) = 1$ on $\overline{U}$. For $u \in C^\infty(\overline{U})$, let $E^0(\psi_0 u) \equiv \psi_0 u$ on $Q_0$ and 0 off $Q_0$. Thus
$$\left\| E^0(\psi_0 u) \right\|_{1,p,\mathbb{R}^n} = \|\psi_0 u\|_{1,p,U}.$$
For $i \ge 1$, let
$$E^i(\psi_i u) \equiv R_i^* E_0 \left( R_i^T \right)^* (\psi_i u).$$
Thus, by Lemma 40.37
$$\left\| E^i(\psi_i u) \right\|_{1,p,\mathbb{R}^n} \le C \|\psi_i u\|_{1,p,Q_i \cap U}$$
where the constant depends on $\operatorname{Lip}(g_i)$ but is independent of $u \in C^\infty(\overline{U})$. Now define $E$ as follows.
$$Eu \equiv \sum_{i=0}^{p} E^i(\psi_i u).$$
Thus for $u \in C^\infty(\overline{U})$, it follows $Eu(x) = u(x)$ for all $x \in U$. Also,
$$\|Eu\|_{1,p,\mathbb{R}^n} \le \sum_{i=0}^{p} \left\| E^i(\psi_i u) \right\|_{1,p,\mathbb{R}^n} \le \sum_{i=0}^{p} C_i \|\psi_i u\|_{1,p,Q_i \cap U} = \sum_{i=0}^{p} C_i \|\psi_i u\|_{1,p,U} \le \sum_{i=0}^{p} C_i \|u\|_{1,p,U}$$
$$\le (p + 1) \sum_{i=0}^{p} C_i \|u\|_{1,p,U} \equiv C \|u\|_{1,p,U}. \qquad (40.33)$$
where $C$ depends on the $\psi_i$ and the $g_i$ but is independent of $u \in C^\infty(\overline{U})$. Therefore, by density of $C^\infty(\overline{U})$ in $W^{1,p}(U)$, $E$ has a unique continuous extension to $W^{1,p}(U)$ still denoted by $E$ satisfying the inequality determined by the ends of 40.33. It remains to verify that $Eu(x) = u(x)$ a.e. for $x \in U$.
Let $u_k \to u$ in $W^{1,p}(U)$ where $u_k \in C^\infty(\overline{U})$. Therefore, by 40.33, $Eu_k \to Eu$ in $W^{1,p}(\mathbb{R}^n)$. Since $Eu_k(x) = u_k(x)$ for each $k$,
which shows $u(x) = Eu(x)$ for a.e. $x \in U$ as claimed. This proves the theorem.
Definition 40.39 Let U be an open set. Then W0m,p (U ) is the closure of the set,
Cc∞ (U ) in W m,p (U ) .
Corollary 40.40 Let $U$ be a bounded open set which has Lipschitz boundary and let $W$ be an open set containing $\overline{U}$. Then for each $p \ge 1$, there exists $E_W \in \mathcal{L}\left( W^{1,p}(U), W_0^{1,p}(W) \right)$ such that $E_W u(x) = u(x)$ a.e. $x \in U$.
40.3. GENERAL EMBEDDING THEOREMS 1165
Theorem 40.41 Let $1 \le p < n$ and $\frac{1}{q} = \frac{1}{p} - \frac{1}{n}$ and let $U$ be any open set for which there exists a $(1,p)$ extension operator. Then if $u \in W^{1,p}(U)$, there exists a constant independent of $u$ such that
$$\|u\|_{L^q(U)} \le C \|u\|_{1,p,U}.$$
Proof: Let $E$ be the $(1,p)$ extension operator. Then by Theorem 40.24 on Page 1149
$$\|u\|_{L^q(U)} \le \|Eu\|_{L^q(\mathbb{R}^n)} \le \frac{1}{\sqrt[n]{n}} \frac{(n-1)p}{(n-p)} \|Eu\|_{1,p,\mathbb{R}^n} \le C \|u\|_{1,p,U}.$$
Corollary 40.42 Suppose $mp < n$ and $U$ is an open set satisfying the segment condition which has a $(1,p)$ extension operator for all $p$. Then $\mathrm{id} \in \mathcal{L}\left( W^{m,p}(U), L^q(U) \right)$ where $q = \frac{np}{n - mp}$.
$$\frac{1}{q} = \frac{n - (m-1)p}{np} - \frac{1}{n} = \frac{n - mp}{np}.$$
and is compact.
and so for each such $\alpha$, satisfying $|\alpha| \le m$, it follows from Lemma 40.30 on Page 1155 that $\{D^\alpha u : u \in S\}$ is precompact in $L^r(U)$. Therefore, there exists a subsequence, still denoted by $u_k$ such that $D^{\alpha_1} u_k$ converges in $L^r(U)$. Applying the same lemma, there exists a subsequence of this subsequence such that both $D^{\alpha_1} u_k$ and $D^{\alpha_2} u_k$ converge in $L^r(U)$. Continue taking subsequences until you obtain a subsequence, $\{u_k\}_{k=1}^\infty$ for which $\{D^\alpha u_k\}_{k=1}^\infty$ converges in $L^r(U)$ for all $|\alpha| \le m$. But this must be a convergent subsequence in $W^{m,r}(U)$ and this proves the corollary.
Theorem 40.44 Let $U$ be a bounded open set having a $(1,p)$ extension operator and let $p > n$. Then $\mathrm{id} : W^{1,p}(U) \to C(\overline{U})$ is continuous and compact.
Proof: Theorem 40.17 on Page 40.17 implies $r_U : W^{1,p}(\mathbb{R}^n) \to C(\overline{U})$ is continuous and compact. Thus
Corollary 40.45 Let $p > n$, let $U$ be a bounded open set having a $(1,p)$ extension operator which also satisfies the segment condition, and let $m$ be a nonnegative integer. Then $\mathrm{id} : W^{m+1,p}(U) \to C^{m,\lambda}(\overline{U})$ is continuous for all $\lambda \in [0, 1 - \frac{n}{p}]$ and id is compact if $\lambda < 1 - \frac{n}{p}$.
$$\rho_\lambda(E(D^\alpha u_k)) \le C \|E(D^\alpha u_k)\|_{1,p,\mathbb{R}^n} \operatorname{diam}(U)^{1 - \frac{n}{p} - \lambda}.$$
Theorem 40.46 Suppose $jp < n < (j+1)p$ and let $m$ be a positive integer. Let $U$ be any bounded open set in $\mathbb{R}^n$ which has a $(1,p)$ extension operator for each $p \ge 1$ and the segment property. Then $\mathrm{id} \in \mathcal{L}\left( W^{m+j,p}(U), C^{m-1,\lambda}(\overline{U}) \right)$ for every $\lambda \le \lambda_0 \equiv (j+1) - \frac{n}{p}$ and if $\lambda < (j+1) - \frac{n}{p}$, id is compact.
$$0 < \lambda < 1 - \frac{(n - jp)\,n}{np} = \frac{p(j+1) - n}{p} = (j+1) - \frac{n}{p}.$$
The assertion about compactness follows from the compactness of the embedding of $C^{m-1,\lambda_0}(\overline{U})$ into $C^{m-1,\lambda}(\overline{U})$ for $\lambda < \lambda_0$, Lemma 40.20 on Page 1145.
This is satisfied if
$$\sum_{j=1}^{k} (-j)^r \lambda_j = 1$$
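A small instance may help in reading this condition. The sketch below assumes the condition is imposed for $r = 0, 1, \cdots, k - 1$ (the range of $r$ is not visible at this point, so this is an assumption) and takes $k = 2$ for illustration.

```latex
% Assumed: the condition holds for r = 0, 1, ..., k-1; illustrative choice k = 2.
\[
r = 0:\ \lambda_1 + \lambda_2 = 1, \qquad
r = 1:\ -\lambda_1 - 2\lambda_2 = 1 .
\]
% Substituting lambda_1 = 1 - lambda_2 into the second equation:
\[
-(1 - \lambda_2) - 2\lambda_2 = 1 \ \Longrightarrow\ \lambda_2 = -2, \qquad \lambda_1 = 3 .
\]
```

For general $k$ under the same assumption, the coefficient matrix $\left( (-j)^r \right)$ is a Vandermonde matrix in the distinct nodes $-1, \cdots, -k$, so the system has a unique solution.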
Corollary 40.49 Let $H^-$ be the half space of 40.35. There exists $E$ with the property that $E : W^{l,p}(H^-) \to W^{l,p}(\mathbb{R}^n)$ and is linear and continuous for each $l \le k$.
Proof: This is immediate from the density of $C_c^\infty\left( \overline{H^-} \right)$ in $W^{k,p}\left( H^- \right)$ and Lemma 40.48.
There is nothing sacred about a half space or this particular half space. It is clear that everything works as well for a half space of the form
$$H_k^- \equiv \{x : x_k < 0\}.$$
Thus the half space featured in the above discussion is $H_n^-$.
Corollary 40.50 Let $\{k_1, \cdots, k_r\} \subseteq \{1, \cdots, n\}$ where the $k_i$ are distinct and let
$$H^-_{k_1 \cdots k_r} \equiv H^-_{k_1} \cap H^-_{k_2} \cap \cdots \cap H^-_{k_r}. \qquad (40.37)$$
Then there exists $E : W^{k,p}\left( H^-_{k_1 \cdots k_r} \right) \to W^{k,p}(\mathbb{R}^n)$ such that $E$ is linear and continuous.
Proof: Follow the above argument with minor modifications to first extend from $H^-_{k_1 \cdots k_r}$ to $H^-_{k_1 \cdots k_{r-1}}$ and then from $H^-_{k_1 \cdots k_{r-1}}$ to $H^-_{k_1 \cdots k_{r-2}}$ etc.
This easily implies the ability to extend off bounded open sets which near their boundaries look locally like an intersection of half spaces.
Theorem 40.51 Let $U$ be a bounded open set and suppose $U_0, U_1, \cdots, U_m$ are open sets with the property that $\overline{U} \subseteq \cup_{k=0}^{m} U_k$, $\overline{U_0} \subseteq U$, and $\partial U \subseteq \cup_{k=1}^{m} U_k$. Suppose also there exist one to one and onto functions, $h_k : \mathbb{R}^n \to \mathbb{R}^n$, $h_k(U_k \cap U) = W_k$ where $W_k$ equals the intersection of a bounded open set with a finite intersection of half spaces, $H^-_{k_1 \cdots k_r}$, as in 40.37 such that $h_k(\partial U \cap U_k) \subseteq \partial H^-_{k_1 \cdots k_r}$. Suppose also that for all $|\alpha| \le k - 1$,
$$D^\alpha h_k \text{ and } D^\alpha h_k^{-1}$$
exist and are Lipschitz continuous. Then letting $W$ be an open set which contains $\overline{U}$, there exists $E : W^{k,p}(U) \to W^{k,p}(W)$ such that $E$ is a linear continuous map from $W^{l,p}(U)$ to $W^{l,p}(W)$ for each $l \le k$.
Proof: Let $\psi_j \in C_c^\infty(U_j)$, $\psi_j(x) \in [0,1]$ for all $x \in \mathbb{R}^n$, and $\sum_{j=0}^{m} \psi_j(x) = 1$ on $\overline{U}$. This is a $C^\infty$ partition of unity on $\overline{U}$. By Theorem 40.14 $\left( h_j^{-1} \right)^* u\psi_j \in W^{k,p}(W_j)$. By the assumption that $h_j(\partial U \cap U_j) \subseteq \partial H^-_{k_1 \cdots k_r}$, the zero extension of $\left( h_j^{-1} \right)^* u\psi_j$ to the rest of $H^-_{k_1 \cdots k_r}$ results in an element of $W^{k,p}\left( H^-_{k_1 \cdots k_r} \right)$. Apply Corollary 40.50 to conclude there exists $E_j : W^{k,p}\left( H^-_{k_1 \cdots k_r} \right) \to W^{k,p}(\mathbb{R}^n)$ which is continuous and linear. Abusing notation slightly, by using $\left( h_j^{-1} \right)^* u\psi_j$ as the above zero extension, it follows $E_j\left( \left( h_j^{-1} \right)^* u\psi_j \right) \in W^{k,p}(\mathbb{R}^n)$. Now let $\eta$ be a function in $C_c^\infty(h(W))$ such that $\eta(y) = 1$ on $h\left( \overline{U} \right)$. Then define
$$Eu \equiv \sum_{j=0}^{m} h_j^* \left( \eta E_j \left( \left( h_j^{-1} \right)^* (u\psi_j) \right) \right).$$
40.4. MORE EXTENSION THEOREMS 1171
Definition 40.52 When $E$ is a linear continuous map from $W^{l,p}(U)$ to $W^{l,p}(\mathbb{R}^n)$ for each $l \le k$, it is called a strong $(k,p)$ extension map.
There is also a very easy sort of extension theorem for the space, $W_0^{m,p}(U)$ which does not require any assumptions on the boundary of $U$ other than $m_n(\partial U) = 0$. First here is the definition of $W_0^{m,p}(U)$.
This follows because, since $m_n(\partial U) = 0$ it suffices to consider $\phi \in C_c^\infty(U)$ and $\phi \in C_c^\infty\left( \overline{U}^C \right)$. Therefore, $\|Eu\|_{l,p,\mathbb{R}^n} = \|u\|_{l,p,U}$.
There are many other extension theorems and if you are interested in pursuing
this further, consult Adams [1]. One of the most famous which is discussed in this
reference is due to Calderon and depends on the theory of singular integrals.
Sobolev Spaces Based On L2
Definition 41.1 $f \in \mathcal{S}$, the Schwartz class, if $f \in C^\infty(\mathbb{R}^n)$ and for all positive integers $N$,
$$\rho_N(f) < \infty$$
where
$$\rho_N(f) = \sup\{(1 + |x|^2)^N |D^\alpha f(x)| : x \in \mathbb{R}^n,\ |\alpha| \le N\}.$$
Thus $f \in \mathcal{S}$ if and only if $f \in C^\infty(\mathbb{R}^n)$ and
Also recall that the Fourier transform and its inverse are one to one and onto maps
from S to S.
To tie the Fourier transform technique in with what has been done so far, it is
necessary to make the following assumption on the set, U. This assumption is made
so that it is possible to consider elements of W k,2 (U ) as restrictions of elements of
W k,2 (Rn ) .
Assumption 41.2 Assume $U$ satisfies the segment condition and that for any $m$ of interest, there exists $E \in \mathcal{L}\left( W^{m,p}(U), W^{m,p}(\mathbb{R}^n) \right)$ such that for each $k \le m$, $E \in \mathcal{L}\left( W^{k,p}(U), W^{k,p}(\mathbb{R}^n) \right)$. That is, there exists a strong $(m,p)$ extension operator.
Proof: The set, $\mathbb{R}^n$ satisfies the segment condition and so $C_c^\infty(\mathbb{R}^n)$ is dense in $W^{m,p}(\mathbb{R}^n)$. However, $C_c^\infty(\mathbb{R}^n) \subseteq \mathcal{S}$. This proves the lemma.
Recall now Plancherel's theorem which states that $\|f\|_{0,2,\mathbb{R}^n} = \|Ff\|_{0,2,\mathbb{R}^n}$ whenever $f \in L^2(\mathbb{R}^n)$. Also it is routine to verify from the definition of the Fourier transform that for $u \in \mathcal{S}$,
$$F \partial_k u = i x_k F u.$$
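The routine verification can be sketched as follows. The normalization $Fu(x) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-i x \cdot t} u(t)\,dt$ is assumed here; the convention actually used in this book is fixed earlier and is not restated in this section, and a different normalizing constant would not change the conclusion.

```latex
% Assumed convention: F u(x) = (2 pi)^{-n/2} \int e^{-i x . t} u(t) dt.
% Integrate by parts in t_k; boundary terms vanish because u is in the Schwartz class.
\begin{align*}
F(\partial_k u)(x)
 &= (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-i x \cdot t}\, \partial_k u(t)\, dt \\
 &= -(2\pi)^{-n/2} \int_{\mathbb{R}^n} \frac{\partial}{\partial t_k}\!\left( e^{-i x \cdot t} \right) u(t)\, dt \\
 &= -(2\pi)^{-n/2} \int_{\mathbb{R}^n} (-i x_k)\, e^{-i x \cdot t}\, u(t)\, dt
  = i x_k\, F u(x).
\end{align*}
```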
From this it follows that
where $C(n, m)$ is the largest of the multinomial coefficients obtained in the expansion,
$$\left( 1 + \sum_{j=1}^{n} x_j^2 \right)^m.$$
Here the notation, $v|_U$ means $v$ restricted to $U$. Define the norm in this space by
$$\|u\|_{H^m(U)} \equiv \inf\left\{ \|v\|_{H^m(\mathbb{R}^n)} : v|_U = u \right\}. \qquad (41.5)$$
Proof: First it is necessary to verify that the given norm really is a norm.
Suppose then that u = 0. Is ||u||H m (U ) = 0? Of course it is. Just take v ≡ 0.
Then v|U = u and ||v||H m = 0. Next suppose ||u||H m (U ) = 0. Does it follow that
u = 0? Letting ε > 0 be given, there exists v ∈ H m (Rn ) such that v|U = u and
||v||H m (Rn ) < ε. Therefore,
Therefore,
$$\|u_1 + u_2\|_{H^m(U)} \le \|v_1 + v_2\|_{H^m(\mathbb{R}^n)} \le \|v_1\|_{H^m(\mathbb{R}^n)} + \|v_2\|_{H^m(\mathbb{R}^n)} \le \|u_1\|_{H^m(U)} + \|u_2\|_{H^m(U)} + 2\varepsilon$$
which shows that $\{v_{N_k}\}_{k=1}^\infty$ is a Cauchy sequence. Consequently it must converge to $v \in H^m(\mathbb{R}^n)$. Let $u = v|_U$. Then
which shows the subsequence, $\{u_{N_k}\}_k$ converges to $u$. Since $\{u_k\}$ is a Cauchy sequence, it follows it too must converge to $u$. This proves the lemma.
The main result is next.
Theorem 41.8 Suppose U satisfies Assumption 41.2. Then for m a nonnegative
integer, H m (U ) = W m,2 (U ) and the two norms are equivalent.
Proof: Let u ∈ H m (U ) . Then there exists v ∈ H m (Rn ) such that v|U = u.
Hence v ∈ W k,2 (Rn ) and so all its weak derivatives up to order m are in L2 (Rn ) .
Therefore, the restrictions of these weak derivitves are in L2 (U ) . Since U satisfies
the segment condition, it follows u ∈ W m,2 (U ) which shows H m (U ) ⊆ W m,2 (U ) .
Next take u ∈ W m,2 (U ) . Then Eu ∈ W m,2 (Rn ) = H m (Rn ) and this shows
u ∈ H m (U ) . This has shown the two spaces are the same. It remains to verify their
norms are equivalent. Let u ∈ H m (U ) and let v|U = u where v ∈ H m (Rn ) and
||u||H m (U ) + ε > ||v||H m (Rn ) .
Then recalling that ||·||H m (Rn ) and ||·||m,2,Rn are equivalent norms for H m (Rn ) ,
there exists a constant, C such that
||u||H m (U ) + ε > ||v||H m (Rn ) ≥ C ||v||m,2,Rn ≥ C ||u||m,2,U
Now consider the two Banach spaces,
\[ \left(H^m(U), \|\cdot\|_{H^m(U)}\right), \quad \left(W^{m,2}(U), \|\cdot\|_{m,2,U}\right). \]
The above inequality shows, since $\varepsilon > 0$ is arbitrary, that $\mathrm{id} : \left(H^m(U), \|\cdot\|_{H^m(U)}\right) \to \left(W^{m,2}(U), \|\cdot\|_{m,2,U}\right)$ is continuous. By the open mapping theorem, it follows id is continuous in the other direction. Thus there exists a constant $K$ such that $\|u\|_{H^m(U)} \le K\|u\|_{m,2,U}$. Hence the two norms are equivalent as claimed.
Specializing Corollary 40.43 and Theorem 40.46 starting on Page 1166 to the case
of p = 2 while also assuming more on U yields the following embedding theorems.
Theorem 41.9 Suppose $m \ge 0$ and $j$ is a nonnegative integer satisfying $2j < n$. Also suppose $U$ is an open set which satisfies Assumption 41.2. Then $\mathrm{id} \in \mathcal{L}\left(H^{m+j}(U), W^{m,q}(U)\right)$ where
\[ q \equiv \frac{2n}{n-2j}. \tag{41.6} \]
If, in addition to the above, $U$ is bounded and $1 \le r < q$, then
\[ \mathrm{id} \in \mathcal{L}\left(H^{m+j}(U), W^{m,r}(U)\right) \]
and is compact.
Theorem 41.10 Suppose for $j$ a nonnegative integer, $2j < n < 2(j+1)$ and let $m$ be a positive integer. Let $U$ be any bounded open set in $\mathbb{R}^n$ which satisfies Assumption 41.2. Then $\mathrm{id} \in \mathcal{L}\left(H^{m+j}(U), C^{m-1,\lambda}(\overline{U})\right)$ for every $\lambda \le \lambda_0 \equiv (j+1) - \frac{n}{2}$ and if $\lambda < (j+1) - \frac{n}{2}$, id is compact.
1178 SOBOLEV SPACES BASED ON L2
Corollary 41.14 Let U be an open set and let S|U denote the restrictions of func-
tions of S to U. Then S|U is dense in H t (U ) .
Proof: Let u ∈ H t (U ) and let v ∈ H t (Rn ) such that v|U = u a.e. Then since
S is dense in H t (Rn ) , there exists w ∈ S such that
It follows that
These fractional order spaces are important when trying to understand the trace
on the boundary. The Fourier transform description also makes it very easy to
establish interesting inequalities such as interpolation inequalities.
41.2. FRACTIONAL ORDER SPACES 1179
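Thus
\[ \begin{aligned}
\int \left(1+|x|^2\right)^s |Fu|^2\,dx &= \int \left(1+|x|^2\right)^{r\theta}\left(1+|x|^2\right)^{(1-\theta)t}|Fu|^2\,dx \\
&\le \left(\int \left(1+|x|^2\right)^{r}|Fu|^2\,dx\right)^{\theta}\left(\int \left(1+|x|^2\right)^{t}|Fu|^2\,dx\right)^{1-\theta} \\
&= \|u\|^{2\theta}_{H^r(\mathbb{R}^n)}\,\|u\|^{2(1-\theta)}_{H^t(\mathbb{R}^n)}.
\end{aligned} \]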
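The middle inequality is Hölder's inequality. As a sketch, assuming $s = \theta r + (1-\theta)t$ with $\theta \in (0,1)$ (which is what the first equality uses) and the conjugate exponents $1/\theta$ and $1/(1-\theta)$:

```latex
\begin{aligned}
\int \left(1+|x|^2\right)^{s}|Fu|^2\,dx
 &= \int \Big[\left(1+|x|^2\right)^{r}|Fu|^2\Big]^{\theta}
         \Big[\left(1+|x|^2\right)^{t}|Fu|^2\Big]^{1-\theta}\,dx \\
 &\le \left(\int \left(1+|x|^2\right)^{r}|Fu|^2\,dx\right)^{\theta}
      \left(\int \left(1+|x|^2\right)^{t}|Fu|^2\,dx\right)^{1-\theta}.
\end{aligned}
```

Taking square roots gives $\|u\|_{H^s} \le \|u\|_{H^r}^{\theta}\|u\|_{H^t}^{1-\theta}$.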
Is there something like this for the fractional order spaces? Yes there is. How-
ever, in order to prove it, it is convenient to use an equivalent norm for H m+s (Rn )
which does not depend explicitly on the Fourier transform. The following theorem
is similar to one in [28]. It describes the norm in H m+s (Rn ) in terms which are
free of the Fourier transform. This is also called an intrinsic norm [1].
Theorem 41.18 Let s ∈ (0, 1) and let m be a nonnegative integer. Then an equiv-
alent norm for H m+s (Rn ) is
\[ |||u|||^2_{m+s} \equiv \|u\|^2_{m,2,\mathbb{R}^n} + \sum_{|\alpha|=m}\int\int |D^\alpha u(x) - D^\alpha u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy. \]
Also if $|\beta| \le m$, there are constants $m(s)$ and $M(s)$ such that
\[ \begin{aligned}
m(s)\int |Fu(z)|^2 |z^\beta|^2 |z|^{2s}\,dz &\le \int\int |D^\beta u(x) - D^\beta u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy \\
&\le M(s)\int |Fu(z)|^2 |z^\beta|^2 |z|^{2s}\,dz.
\end{aligned} \tag{41.10} \]
Proof: Let u ∈ S which is dense in H m+s (Rn ). The Fourier transform of the
function, y → Dα u (x + y) − Dα u (y) equals
\[ \left(e^{ix\cdot z} - 1\right) F D^\alpha u(z). \]
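Now by Fubini’s theorem and Plancherel’s theorem along with the above, taking $|\alpha| = m$,
\[ \begin{aligned}
&\int\int |D^\alpha u(x) - D^\alpha u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy \\
&= \int\int |D^\alpha u(y+t) - D^\alpha u(y)|^2\,|t|^{-n-2s}\,dt\,dy \\
&= \int |t|^{-n-2s}\int |D^\alpha u(y+t) - D^\alpha u(y)|^2\,dy\,dt \\
&= \int |t|^{-n-2s}\int \left|\left(e^{it\cdot z}-1\right)FD^\alpha u(z)\right|^2 dz\,dt \\
&= \int |FD^\alpha u(z)|^2\left(\int |t|^{-n-2s}\left|e^{it\cdot z}-1\right|^2 dt\right)dz.
\end{aligned} \tag{41.11} \]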
The essential thing to notice about this function of $z$ is that it is a positive real number whenever $z \ne 0$. This is because for small $|t|$, the integrand is dominated by $C|t|^{-n+2(1-s)}$. Changing to polar coordinates, you see that
\[ \int_{[|t|\le 1]} |t|^{-n-2s}\left|e^{it\cdot z}-1\right|^2 dt < \infty. \]
Next, for $|t| > 1$, the integrand is bounded by $4|t|^{-n-2s}$, and changing to polar coordinates shows
\[ \int_{[|t|>1]} |t|^{-n-2s}\left|e^{it\cdot z}-1\right|^2 dt \le 4\int_{[|t|>1]} |t|^{-n-2s}\,dt < \infty. \]
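The two polar coordinate computations can be written out explicitly. In this sketch $\sigma_{n-1}$ denotes the surface area of the unit sphere in $\mathbb{R}^n$ (a symbol introduced here, not in the text), and the bound $|e^{it\cdot z}-1| \le |t||z|$ is used near the origin:

```latex
\begin{aligned}
\int_{[|t|\le 1]} |t|^{-n-2s}\,|t|^{2}\,dt
  &= \sigma_{n-1}\int_0^1 \rho^{2-n-2s}\,\rho^{n-1}\,d\rho
   = \sigma_{n-1}\int_0^1 \rho^{1-2s}\,d\rho
   = \frac{\sigma_{n-1}}{2-2s}, \\
\int_{[|t|>1]} |t|^{-n-2s}\,dt
  &= \sigma_{n-1}\int_1^\infty \rho^{-n-2s}\,\rho^{n-1}\,d\rho
   = \sigma_{n-1}\int_1^\infty \rho^{-1-2s}\,d\rho
   = \frac{\sigma_{n-1}}{2s}.
\end{aligned}
```

Both are finite precisely because $0 < s < 1$.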
More can be said but this will suffice. Also observe that for s ∈ (0, 1) and b > 0,
\[ (1+b)^s \le 1 + b^s, \qquad 2^{1-s}(1+b)^s \ge 1 + b^s. \]
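The pair of inequalities $(1+b)^s \le 1+b^s \le 2^{1-s}(1+b)^s$ is easy to sanity check numerically. The sketch below is only a spot check on a grid, not a proof; the helper names are hypothetical, and a small tolerance is included because equality holds in the upper bound at $b = 1$.

```python
def power_bounds_hold(s, b, tol=1e-12):
    """Check (1+b)^s <= 1 + b^s <= 2^(1-s) (1+b)^s up to rounding."""
    lower = (1.0 + b) ** s
    middle = 1.0 + b ** s
    upper = 2.0 ** (1.0 - s) * (1.0 + b) ** s
    return lower <= middle + tol and middle <= upper + tol


def check_grid():
    """Test the bounds for s in (0, 1) and b over several decades."""
    exponents = [k / 10 for k in range(1, 10)]
    bases = [10.0 ** k for k in range(-6, 7)]
    return all(power_bounds_hold(s, b) for s in exponents for b in bases)
```

The lower bound fails once $s > 1$, for example $s = 2$, $b = 1$, which is why the restriction $s \in (0,1)$ matters.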
In what follows, C (s) will denote a constant which depends on the indicated quan-
tities which may be different on different lines of the argument. Then from 41.11,
\[ \begin{aligned}
\int\int |D^\alpha u(x) - D^\alpha u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy
&\le M(s)\int |FD^\alpha u(z)|^2 |z|^{2s}\,dz \\
&= M(s)\int |Fu(z)|^2 |z^\alpha|^2 |z|^{2s}\,dz.
\end{aligned} \]
No reference was made to |α| = m and so this establishes the top half of 41.10.
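Therefore,
\[ \begin{aligned}
|||u|||^2_{m+s} &\equiv \|u\|^2_{m,2,\mathbb{R}^n} + \sum_{|\alpha|=m}\int\int |D^\alpha u(x) - D^\alpha u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy \\
&\le C\int \left(1+|z|^2\right)^m |Fu(z)|^2\,dz + M(s)\int \sum_{|\alpha|=m} |Fu(z)|^2 |z^\alpha|^2 |z|^{2s}\,dz.
\end{aligned} \]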
Recall that
\[ \sum_{|\alpha|\le m} z_1^{2\alpha_1}\cdots z_n^{2\alpha_n} \le \Big(1 + \sum_{j=1}^n z_j^2\Big)^m \le C(n,m)\sum_{|\alpha|\le m} z_1^{2\alpha_1}\cdots z_n^{2\alpha_n}. \tag{41.12} \]
Therefore,
\[ \begin{aligned}
|||u|||^2_{m+s}
&\le C\int \left(1+|z|^2\right)^m |Fu(z)|^2\,dz + M(s)\int \sum_{|\alpha|=m} |Fu(z)|^2 |z^\alpha|^2 |z|^{2s}\,dz \\
&\le C\int \left(1+|z|^2\right)^{m+s} |Fu(z)|^2\,dz + M(s)\int |Fu(z)|^2 \left(1+|z|^2\right)^m |z|^{2s}\,dz \\
&\le C\int \left(1+|z|^2\right)^{m+s} |Fu(z)|^2\,dz = C\|u\|^2_{H^{m+s}(\mathbb{R}^n)}.
\end{aligned} \]
\[ \begin{aligned}
&\ge m(s)\int |FD^\alpha u(z)|^2 |z|^{2s}\,dz \\
&= m(s)\int |Fu(z)|^2 |z^\alpha|^2 |z|^{2s}\,dz.
\end{aligned} \]
No reference was made to |α| = m and so this establishes the bottom half of 41.10.
Therefore, from 41.12,
\[ \begin{aligned}
|||u|||^2_{m+s}
&\ge C\int \left(1+|z|^2\right)^m |Fu(z)|^2\,dz + m(s)\int \sum_{|\alpha|=m} |Fu(z)|^2 |z^\alpha|^2 |z|^{2s}\,dz \\
&\ge C\int \left(1+|z|^2\right)^m |Fu(z)|^2\,dz + C\int |Fu(z)|^2 \left(1+|z|^2\right)^m |z|^{2s}\,dz \\
&= C\int \left(1+|z|^2\right)^m \left(1+|z|^{2s}\right) |Fu(z)|^2\,dz \\
&\ge C\int \left(1+|z|^2\right)^m \left(1+|z|^2\right)^s |Fu(z)|^2\,dz \\
&= C\int \left(1+|z|^2\right)^{m+s} |Fu(z)|^2\,dz = C\|u\|^2_{H^{m+s}(\mathbb{R}^n)}.
\end{aligned} \]
Proof: Let u ∈ S. From Theorem 41.17 and the equivalence of the norms in
W m,2 (Rn ) and H m (Rn ) ,
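\[ \begin{aligned}
&\|h^*u\|^2_{H^m(\mathbb{R}^n)} + \sum_{|\alpha|=m}\int\int |D^\alpha h^*u(x) - D^\alpha h^*u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy \\
&\le C\|u\|^2_{H^m(\mathbb{R}^n)} + \int\int \sum_{|\alpha|=m} |D^\alpha h^*u(x) - D^\alpha h^*u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy \\
&= C\|u\|^2_{H^m(\mathbb{R}^n)} + \int\int \sum_{|\alpha|=m}\Big|\sum_{|\beta(\alpha)|\le m}\Big( h^*\big(D^{\beta(\alpha)}u\big)g_{\beta(\alpha)}(x) - h^*\big(D^{\beta(\alpha)}u\big)g_{\beta(\alpha)}(y)\Big)\Big|^2 |x-y|^{-n-2s}\,dx\,dy \\
&\le C\|u\|^2_{H^m(\mathbb{R}^n)} + C\int\int \sum_{|\alpha|=m}\sum_{|\beta(\alpha)|\le m}\Big| h^*\big(D^{\beta(\alpha)}u\big)g_{\beta(\alpha)}(x) - h^*\big(D^{\beta(\alpha)}u\big)g_{\beta(\alpha)}(y)\Big|^2 |x-y|^{-n-2s}\,dx\,dy.
\end{aligned} \tag{41.13} \]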
A single term in the last sum corresponding to a given α is then of the form,
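\[ \int\int \left| h^*\big(D^\beta u\big)g_\beta(x) - h^*\big(D^\beta u\big)g_\beta(y)\right|^2 |x-y|^{-n-2s}\,dx\,dy \tag{41.14} \]
\[ \begin{aligned}
&\le 2\Big[\int\int \left| h^*\big(D^\beta u\big)(x)\,g_\beta(x) - h^*\big(D^\beta u\big)(y)\,g_\beta(x)\right|^2 |x-y|^{-n-2s}\,dx\,dy \\
&\qquad + \int\int \left| h^*\big(D^\beta u\big)(y)\,g_\beta(x) - h^*\big(D^\beta u\big)(y)\,g_\beta(y)\right|^2 |x-y|^{-n-2s}\,dx\,dy\Big] \\
&\le C(h)\Big[\int\int \left| h^*\big(D^\beta u\big)(x) - h^*\big(D^\beta u\big)(y)\right|^2 |x-y|^{-n-2s}\,dx\,dy \\
&\qquad + \int\int \left| h^*\big(D^\beta u\big)(y)\right|^2 |g_\beta(x) - g_\beta(y)|^2\, |x-y|^{-n-2s}\,dx\,dy\Big].
\end{aligned} \]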
Changing variables, and then using the names of the old variables to simplify the
notation,
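\[ \begin{aligned}
&\le C\left(h,h^{-1}\right)\Big[\int\int \left|\big(D^\beta u\big)(x) - \big(D^\beta u\big)(y)\right|^2 |x-y|^{-n-2s}\,dx\,dy \\
&\qquad + \int\int \left| h^*\big(D^\beta u\big)(y)\right|^2 |g_\beta(x) - g_\beta(y)|^2\, |x-y|^{-n-2s}\,dx\,dy\Big].
\end{aligned} \]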
By 41.10,
\[ \begin{aligned}
&\le C(h)\int |F(u)(z)|^2 |z^\beta|^2 |z|^{2s}\,dz \\
&\qquad + \int\int \left| h^*\big(D^\beta u\big)(y)\right|^2 |g_\beta(x) - g_\beta(y)|^2\, |x-y|^{-n-2s}\,dx\,dy.
\end{aligned} \]
because the inside integral equals a constant which depends on the Lipschitz con-
stants and bounds of the function, gβ and these things depend only on h. The
reason this integral is finite is that for |t| ≤ 1,
\[ |g_\beta(y+t) - g_\beta(y)|^2\,|t|^{-n-2s} \le K|t|^2\,|t|^{-n-2s} \]
and using polar coordinates, you see
\[ \int_{[|t|\le 1]} |g_\beta(y+t) - g_\beta(y)|^2\,|t|^{-n-2s}\,dt < \infty. \]
Now for $|t| > 1$, the integrand in 41.15 is dominated by $4|t|^{-n-2s}$ and using polar coordinates, this yields
\[ \int_{[|t|>1]} |g_\beta(y+t) - g_\beta(y)|^2\,|t|^{-n-2s}\,dt \le 4\int_{[|t|>1]} |t|^{-n-2s}\,dt < \infty. \]
This proves the theorem because the assertion about h−1 is obvious. Just replace
h with h−1 in the above argument.
Next consider the case where U is an open set.
Lemma 41.20 Let h (U ) ⊆ V where U and V are open subsets of Rn and sup-
pose that h, h−1 : Rn → Rn are both functions in C m,1 (Rn ) . Recall this means
Dα h and Dα h−1 exist and are Lipschitz continuous for all |α| ≤ m. Then h∗ ∈
L (H m+s (V ) , H m+s (U )).
Proof: Let u ∈ H m+s (V ) and let v ∈ H m+s (Rn ) such that v|V = u. Then
from the above, h∗ v ∈ H m+s (Rn ) and so h∗ u ∈ H m+s (U ) because h∗ u = h∗ v|U .
Then by Lemma 41.19,
||h∗ u||H m+s (U ) ≤ ||h∗ v||H m+s (Rn ) ≤ C ||v||H m+s (Rn )
With harder work, you don’t need to have h, h−1 defined on all of Rn but I
don’t feel like including the details so this lemma will suffice.
Another interesting application of the intrinsic norm is the following.
Lemma 41.21 Let φ ∈ C m,1 (Rn ) and suppose spt (φ) is compact. Then there
exists a constant, Cφ such that whenever u ∈ H m+s (Rn ) ,
Proof: It is a routine exercise in the product rule to verify that ||φu||H m (Rn ) ≤
Cφ ||u||H m (Rn ) . It only remains to consider the term involving the integral. A typical
term is
\[ \int\int |D^\alpha \phi u(x) - D^\alpha \phi u(y)|^2\,|x-y|^{-n-2s}\,dx\,dy. \]
By 41.10 and the Lipschitz continuity of all the derivatives of φ, this is dominated
by
\[ \begin{aligned}
& CM(s)\int |Fu(z)|^2 |z^\beta|^2 |z|^{2s}\,dz + K\int\int |D^\beta u(y)|^2\,|x-y|^2\,|x-y|^{-n-2s}\,dx\,dy \\
&= CM(s)\int |Fu(z)|^2 |z^\beta|^2 |z|^{2s}\,dz + K\int\int |D^\beta u(y)|^2\,|t|^{-n+2(1-s)}\,dt\,dy \\
&\le C(s)\left(\int |Fu(z)|^2 |z^\beta|^2 |z|^{2s}\,dz + K\int |D^\beta u(y)|^2\,dy\right) \\
&\le C(s)\int \left(1+|y|^2\right)^{m+s} |Fu(y)|^2\,dy.
\end{aligned} \]
Since there are only finitely many such terms, this proves the lemma.
Corollary 41.22 Let t = m + s for s ∈ [0, 1) and let U, V be open sets. Let
φ ∈ Ccm,1 (V ). This means spt (φ) ⊆ V and φ ∈ C m,1 (Rn ) . Then if u ∈ H t (U ) it
follows that uφ ∈ H t (U ∩ V ) and ||uφ||H t (U ∩V ) ≤ Cφ ||u||H t (U ) .
Taking the infimum for all such v whose restrictions equal u, this yields
||φu||H t (U ∩V ) ≤ Cφ ||u||H t (U ) .
Definition 41.23 Let Cbm (Rn ) denote the functions which are m times continu-
ously differentiable and for which
For $U$ an open set, $C^m(\overline{U})$ denotes the functions which are restrictions of $C_b^m(\mathbb{R}^n)$ to $U$.
41.3. EMBEDDING THEOREMS 1187
It is clear this is a Banach space, the proof being a simple exercise in the use
of the fundamental theorem of calculus along with standard results about uniform
convergence.
Lemma 41.24 Let $u \in \mathcal{S}$ and let $\frac{n}{2} + m < t$. Then there exists $C$ independent of $u$ such that
\[ \|u\|_{C_b^m(\mathbb{R}^n)} \le C\|u\|_{H^t(\mathbb{R}^n)}. \]
Proof: Using the fact that the Fourier transform maps S to S and the definition
of the Fourier transform,
≤ C ||u||H t (Rn )
because for the given values of t and m the first integral is finite. This follows from
a use of polar coordinates. Taking sup over all x ∈ Rn and |α| ≤ m, this proves the
lemma.
Proof: This follows from the above lemma. Let {uk } be a sequence of functions
of S which converges to u in H t and a.e. Then by the inequality of the above
lemma, this sequence is also Cauchy in Cbm (Rn ) and taking the limit,
\[ \|u\|_{C_b^m(\mathbb{R}^n)} = \lim_{k\to\infty}\|u_k\|_{C_b^m(\mathbb{R}^n)} \le C\lim_{k\to\infty}\|u_k\|_{H^t(\mathbb{R}^n)} = C\|u\|_{H^t(\mathbb{R}^n)}. \]
Corollary 41.26 Let $t > m + \frac{n}{2}$ and let $U$ be an open set with $u \in H^t(U)$. Then $u$ is a.e. equal to a function of $C^m(\overline{U})$ still denoted by $u$. Furthermore, there exists a constant $C$ independent of $u$ such that
\[ \|u\|_{C^m(\overline{U})} \le C\|u\|_{H^t(U)}. \]
||u||C m (U ) ≤ C ||u||H t (U ) .
The following elementary lemma featuring trig. substitutions is the basis for the
proof of some of the arguments which follow.
for $a > 0$ and $t > 1/2$. Then this integral is of the form $C_t a^{-2t+1}$ where $C_t$ is some constant which depends on $t$.
and since t > 1/2 the last integral is finite. This yields the desired conclusion and
proves the lemma.
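The scaling $C_t a^{-2t+1}$ can be checked numerically. The sketch below assumes the integral in question is $\int_{\mathbb{R}} (a^2+x^2)^{-t}\,dx$, which is how the lemma is applied later with $a = (1+|y'|^2)^{1/2}$. It uses the trig substitution $x = a\tan\theta$ and compares against the standard Beta function value $C_t = \sqrt{\pi}\,\Gamma(t-1/2)/\Gamma(t)$; the function names are hypothetical.

```python
import math


def c_t(t):
    """Closed form of the integral of cos(theta)^(2t-2) over (-pi/2, pi/2)."""
    return math.sqrt(math.pi) * math.gamma(t - 0.5) / math.gamma(t)


def integral_numeric(a, t, steps=20000):
    """Midpoint rule for the integral of (a^2 + x^2)^(-t) over the real line.

    With x = a*tan(theta) the integral becomes
    a^(1-2t) times the integral of cos(theta)^(2t-2) over (-pi/2, pi/2).
    The midpoint rule is accurate here for t >= 1, where the transformed
    integrand is bounded.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += math.cos(theta) ** (2 * t - 2)
    return a ** (1 - 2 * t) * total * h
```

For instance, with $t = 1$ this reproduces the familiar $\int_{\mathbb{R}} (a^2+x^2)^{-1}dx = \pi/a$.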
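\[ \begin{aligned}
&\equiv \lim_{\varepsilon\to 0}\int_{\mathbb{R}} e^{-(\varepsilon x_n)^2}\left(\frac{1}{2\pi}\right)^{n/2}\int_{\mathbb{R}^n} e^{-i(x'\cdot y' + x_n y_n)}\,u(y',y_n)\,dy'\,dy_n\,dx_n \\
&= \lim_{\varepsilon\to 0}\left(\frac{1}{2\pi}\right)^{n/2}\int_{\mathbb{R}^n} u(y',y_n)\,e^{-ix'\cdot y'}\int_{\mathbb{R}} e^{-(\varepsilon x_n)^2}e^{-ix_n y_n}\,dx_n\,dy'\,dy_n.
\end{aligned} \]
Now $-(\varepsilon x_n)^2 - ix_n y_n = -\varepsilon^2\left(x_n + \frac{iy_n}{2\varepsilon^2}\right)^2 - \frac{y_n^2}{4\varepsilon^2}$ and so the above reduces to an expression of the form
\[ \begin{aligned}
\lim_{\varepsilon\to 0} K_n\int_{\mathbb{R}} \frac{1}{\varepsilon}\,e^{-\frac{y_n^2}{4\varepsilon^2}}\int_{\mathbb{R}^{n-1}} u(y',y_n)\,e^{-ix'\cdot y'}\,dy'\,dy_n
&= K_n\int_{\mathbb{R}^{n-1}} u(y',0)\,e^{-ix'\cdot y'}\,dy' \\
&= K_n F\gamma u(x')
\end{aligned} \]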
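The inner integral over $x_n$ is a standard Gaussian integral. As a sketch of the computation behind the constant $K_n$, completing the square and shifting the contour gives, for $\varepsilon > 0$:

```latex
\begin{aligned}
\int_{\mathbb{R}} e^{-(\varepsilon x)^2}e^{-ixy}\,dx
  &= e^{-\frac{y^2}{4\varepsilon^2}}
     \int_{\mathbb{R}} e^{-\varepsilon^2\left(x+\frac{iy}{2\varepsilon^2}\right)^2}dx
   = \frac{\sqrt{\pi}}{\varepsilon}\,e^{-\frac{y^2}{4\varepsilon^2}}, \\
\int_{\mathbb{R}} \frac{\sqrt{\pi}}{\varepsilon}\,e^{-\frac{y^2}{4\varepsilon^2}}\,dy
  &= \frac{\sqrt{\pi}}{\varepsilon}\cdot 2\varepsilon\sqrt{\pi} = 2\pi .
\end{aligned}
```

So the kernel in $y_n$ has total mass $2\pi$ for every $\varepsilon$ and concentrates at $y_n = 0$ as $\varepsilon \to 0$, which is why only $u(y',0)$ survives in the limit.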
\[ = C_n\int_{\mathbb{R}^{n-1}} \left(1+|y'|^2\right)^s\left|\int_{\mathbb{R}} Fu(y',y_n)\left(1+|y|^2\right)^{t/2}\left(1+|y|^2\right)^{-t/2}dy_n\right|^2 dy' \]
by Lemma 41.28 and taking $a = \left(1+|y'|^2\right)^{1/2}$, this equals
\[ C_t\left(\left(1+|y'|^2\right)^{1/2}\right)^{-2t+1} = C_t\left(1+|y'|^2\right)^{(-2t+1)/2}. \]
\[ C\int_{\mathbb{R}^n} e^{iy\cdot x}\,\frac{\left(1+|y'|^2\right)^{t-1/2}}{\left(1+|y|^2\right)^{t}}\,Fu(y')\,dy \tag{41.19} \]
41.4. THE TRACE ON THE BOUNDARY OF A HALF SPACE 1191
Here the inside Fourier transform is taken with respect to Rn−1 because u is only
defined on Rn−1 and C will be chosen in such a way that γ ◦ ζ = id. First the
existence of C such that γ ◦ ζ = id will be shown. Since u ∈ S0 it follows
\[ y \to \frac{\left(1+|y'|^2\right)^{t-1/2}}{\left(1+|y|^2\right)^{t}}\,Fu(y') \]
is in S. Hence the inverse Fourier transform of this function is also in S and so
for u ∈ S0 , it follows ζu ∈ S. Therefore, to check γ ◦ ζ = id it suffices to plug in
xn = 0. From Lemma 41.28 this yields
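\[ \begin{aligned}
\gamma(\zeta u)(x',0)
&= C\int_{\mathbb{R}^n} e^{iy'\cdot x'}\,\frac{\left(1+|y'|^2\right)^{t-1/2}}{\left(1+|y|^2\right)^{t}}\,Fu(y')\,dy \\
&= C\int_{\mathbb{R}^{n-1}} \left(1+|y'|^2\right)^{t-1/2}e^{iy'\cdot x'}Fu(y')\int_{\mathbb{R}}\frac{1}{\left(1+|y|^2\right)^{t}}\,dy_n\,dy' \\
&= CC_t\int_{\mathbb{R}^{n-1}} \left(1+|y'|^2\right)^{t-1/2}e^{iy'\cdot x'}Fu(y')\left(1+|y'|^2\right)^{\frac{-2t+1}{2}}dy' \\
&= CC_t\int_{\mathbb{R}^{n-1}} e^{iy'\cdot x'}Fu(y')\,dy' = CC_t(2\pi)^{n/2}F^{-1}(Fu)(x')
\end{aligned} \]
and so the correct value of $C$ is $\left(C_t(2\pi)^{n/2}\right)^{-1}$ to obtain $\gamma\circ\zeta = \mathrm{id}$. It only remains to verify that $\zeta$ is continuous. From 41.19, and Lemma 41.28,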
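\[ \begin{aligned}
\|\zeta u\|^2_{H^t(\mathbb{R}^n)}
&= \int_{\mathbb{R}^n} \left(1+|x|^2\right)^t |F\zeta u(x)|^2\,dx \\
&= C^2\int_{\mathbb{R}^n} \left(1+|x|^2\right)^t \left|F\left(F^{-1}(\phi Fu)\right)(x)\right|^2 dx \\
&= C^2\int_{\mathbb{R}^n} \left(1+|x|^2\right)^t |\phi(x)Fu(x')|^2\,dx \\
&= C^2\int_{\mathbb{R}^n} \left(1+|x|^2\right)^t \left|\frac{\left(1+|x'|^2\right)^{t-1/2}}{\left(1+|x|^2\right)^{t}}\,Fu(x')\right|^2 dx \\
&= C^2C_t\int_{\mathbb{R}^{n-1}} \left(1+|x'|^2\right)^{2t-1}|Fu(x')|^2\left(1+|x'|^2\right)^{\frac{-2t+1}{2}}dx' \\
&= C^2C_t\int_{\mathbb{R}^{n-1}} \left(1+|x'|^2\right)^{t-1/2}|Fu(x')|^2\,dx' = C^2C_t\|u\|^2_{H^{t-1/2}(\mathbb{R}^{n-1})}.
\end{aligned} \]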
where here
\[ (f,g)_H \equiv \int_{\mathbb{R}^{n-1}} fg\,dx', \]
just the inner product in $L^2(\mathbb{R}^{n-1})$. Furthermore,
\[ u(\cdot,0) = \gamma u \quad \text{a.e. } x'. \]
Proof: Let $\{u_k\}$ be a sequence of functions from $\mathcal{S}$ which converges to $u$ in $H^1(\mathbb{R}^n)$ and let $\{\phi_k\}$ denote a countable dense subset of $L^2(\mathbb{R}^{n-1})$. Then
\[ (\gamma u_k, \phi_j)_H + \int_0^{x_n} \left(u_{k,n}(\cdot,t), \phi_j\right)_H dt = \left(u_k(\cdot,x_n), \phi_j\right)_H. \tag{41.20} \]
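Now
\[ \begin{aligned}
&\left(\int_0^\infty \left|\left(u_k(\cdot,x_n),\phi_j\right)_H - \left(u(\cdot,x_n),\phi_j\right)_H\right|^2 dx_n\right)^{1/2} \\
&= \left(\int_0^\infty \left|\left(u_k(\cdot,x_n) - u(\cdot,x_n),\phi_j\right)_H\right|^2 dx_n\right)^{1/2} \\
&\le \left(\int_0^\infty |u_k(\cdot,x_n) - u(\cdot,x_n)|_H^2\,|\phi_j|_H^2\,dx_n\right)^{1/2} \\
&= \left(|\phi_j|_H^2\int_0^\infty |u_k(\cdot,x_n) - u(\cdot,x_n)|_H^2\,dx_n\right)^{1/2} \\
&= \left(|\phi_j|_H^2\int_0^\infty\int_{\mathbb{R}^{n-1}} |u_k(x',x_n) - u(x',x_n)|^2\,dx'\,dx_n\right)^{1/2}
\end{aligned} \]
which converges to zero. Therefore, there exists a set of measure zero, $N_j$, and a subsequence, still denoted by $k$, such that if $x_n \notin N_j$, then
\[ \left(u_k(\cdot,x_n),\phi_j\right)_H \to \left(u(\cdot,x_n),\phi_j\right)_H. \]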
Now by Theorem 41.30, $\gamma u_k \to \gamma u$ in $H = L^2(\mathbb{R}^{n-1})$. It only remains to consider the term of 41.20 which involves an integral.
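\[ \begin{aligned}
&\left|\int_0^{x_n} \left(u_{k,n}(\cdot,t),\phi_j\right)_H dt - \int_0^{x_n} \left(u_{,n}(\cdot,t),\phi_j\right)_H dt\right| \\
&\le \int_0^{x_n} \left|\left(u_{k,n}(\cdot,t) - u_{,n}(\cdot,t),\phi_j\right)_H\right| dt \\
&\le \int_0^{x_n} |u_{k,n}(\cdot,t) - u_{,n}(\cdot,t)|_H\,|\phi_j|_H\,dt \\
&\le \left(\int_0^{x_n} |u_{k,n}(\cdot,t) - u_{,n}(\cdot,t)|_H^2\,dt\right)^{1/2}\left(\int_0^{x_n} |\phi_j|_H^2\,dt\right)^{1/2} \\
&= x_n^{1/2}\,|\phi_j|_H\left(\int_0^{x_n}\int_{\mathbb{R}^{n-1}} |u_{k,n}(x',t) - u_{,n}(x',t)|^2\,dx'\,dt\right)^{1/2}
\end{aligned} \]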
\[ \|\gamma(\phi v) - \gamma\phi\,\gamma v\|_{H^{t-1/2}(\mathbb{R}^{n-1})} = \lim_{k\to\infty}\|\gamma(\phi v_k) - \gamma\phi\,\gamma v_k\|_{H^{t-1/2}(\mathbb{R}^{n-1})} = 0 \]
because each term in the sequence equals zero due to the observation that for vk ∈ S
and φ ∈ Cc∞ (U ) , γ (φvk ) = γvk γφ.
Now suppose v = 0 a.e. on U . Define for 0 < r < δ, vr (x) ≡ v (x0 , xn + r) .
Claim: If u ∈ H t (Rn ) , then
Proof of claim: First of all, let v ∈ S. Then v ∈ H m (Rn ) for all m and so by
Lemma 41.15,
\[ \|v_r - v\|_{H^t(\mathbb{R}^n)} \le \|v_r - v\|^{\theta}_{H^m(\mathbb{R}^n)}\,\|v_r - v\|^{1-\theta}_{H^{m+1}(\mathbb{R}^n)} \]
Therefore,
||ur − u||H t (Rn ) ≤ ||ur − vr ||H t (Rn ) + ||vr − v||H t (Rn ) + ||v − u||H t (Rn )
= 2ε/3 + ||vr − v||H t (Rn ) .
Now using what was just shown, it follows that for r small enough, ||ur − u||H t (Rn ) <
ε and this proves the claim.
Now suppose v ∈ H t (Rn ) . By the claim,
and so by continuity of γ,
\[ \gamma v_r \to \gamma v \ \text{ in } H^{t-1/2}(\mathbb{R}^{n-1}). \tag{41.21} \]
Note vr = 0 a.e. on
Let $\phi \in C_c^\infty(U_r)$ and consider $\phi v_r$. Then it follows $\phi v_r = 0$ a.e. on $\mathbb{R}^n$. Let $w \equiv 0$. Then $w \in \mathcal{S}$ and so $\gamma w = 0 = \gamma(\phi v_r) = \gamma\phi\,\gamma v_r$ in $H^{t-1/2}(\mathbb{R}^{n-1})$. It follows that for $m_{n-1}$ a.e. $x' \in [\phi \ne 0] \cap \mathbb{R}^{n-1}$, $\gamma v_r(x') = 0$. Now let $U' = \cup_{k=1}^\infty K_k$ where the $K_k$ are compact sets such that $K_k \subseteq K_{k+1}$ and let $\phi_k \in C_c^\infty(U)$ such that $\phi_k$ has values in $[0,1]$ and $\phi_k(x') = 1$ if $x' \in K_k$. Then from what was just shown, $\gamma v_r = 0$ for a.e. point of $K_k$. Therefore, $\gamma v_r = 0$ for $m_{n-1}$ a.e. point in $U'$.
Therefore, since each γvr = 0, it follows from 41.21 that γv = 0 also. This proves
the lemma.
41.5. SOBOLEV SPACES ON MANIFOLDS 1195
where U 0 is an open subset of Rn−1 and φ (u0 ) is a positive function such that
φ (u0 ) ≤ ∞ and
inf {φ (u0 ) : u0 ∈ U 0 } = δ > 0.
Then there exists a unique
\[ \gamma \in \mathcal{L}\left(H^t(U), H^{t-1/2}(U')\right) \]
which has the property that if u = v|U where v is continuous and also a function of
H 1 (Rn ) , then γu (x0 ) = u (x0 , 0) for a.e. x0 ∈ U 0 .
γu ≡ γv|U 0
Is this well defined? The answer is yes because if vi |U = u a.e., then γ (v1 − v2 ) = 0
a.e. on U 0 which implies γv1 = γv2 a.e. and so the two different versions of γu
differ only on a set of measure zero.
If u = v|U where v is continuous and also a function of H 1 (Rn ) , then for a.e.
x0 ∈ Rn−1 , it follows from Lemma 41.31 on Page 1192 that γv (x0 ) = v (x0 , 0) .
Hence, it follows that for a.e. x0 ∈ U 0 , γu (x0 ) ≡ u (x0 , 0).
In particular, γ is determined by γu (x0 ) = u (x0 , 0) on S|U and the density of
S|U and continuity of γ shows γ is unique.
It only remains to show γ is continuous. Let u ∈ H t (U ) . Thus there exists
v ∈ H t (Rn ) such that u = v|U . Then
for C independent of v. Then taking the inf for all such v ∈ H t (Rn ) which are
equal to u a.e. on U, it follows
3. $\{W_i\}_{i=1}^\infty$ is locally finite.
4. There are open bounded sets, Ui and functions hi : Ui → Γi which are one to
one, onto, and in C m,1 (Ui ) . There exists a constant, C, such that C ≥ Lip hr
for all r.
\[ g_i\circ h_k : U_k\cap h_k^{-1}(\Gamma_i) \to U_i\cap h_i^{-1}(\Gamma_k) \]
Proof: First it is well to show it does indeed map the given open sets. Let
$x \in U_k\cap h_k^{-1}(\Gamma_i)$. Then $h_k(x) \in \Gamma_k\cap\Gamma_i$ and so $g_i(h_k(x)) \in U_i$ because $h_k(x) \in \Gamma_i$.
Now since $h_k(x) \in \Gamma_k$, $g_i(h_k(x)) \in h_i^{-1}(\Gamma_k)$ also and this proves the mappings do
what they should in terms of mapping the two open sets. That gi ◦hk is C m,1 follows
immediately from the chain rule and the assumptions that the functions gi and hk
are C m,1 . The claim about the inverse follows immediately from the definitions of
the functions.
Let $\{\psi_i\}_{i=1}^\infty$ be a partition of unity subordinate to the open cover $\{W_i\}$ satisfying $\psi_i \in C_c^\infty(W_i)$. Then the following definition provides a norm for $H^{m+s}(\Gamma)$.
Definition 41.36 Let $s \in (0,1)$ and let $m$ be a nonnegative integer. Also let $\mu$ denote the surface measure for $\Gamma$ defined in the last section. A $\mu$ measurable function, $u$, is in $H^{m+s}(\Gamma)$ if whenever $\{W_i,\psi_i,\Gamma_i,U_i,h_i,g_i\}_{i=1}^\infty$ is described above, $h_i^*(u\psi_i) \in H^{m+s}(U_i)$ and
\[ \|u\|_{H^{m+s}(\Gamma)} \equiv \left(\sum_{i=1}^\infty \|h_i^*(u\psi_i)\|^2_{H^{m+s}(U_i)}\right)^{1/2} < \infty. \]
Are there functions which are in H m+s (Γ)? The answer is yes. Just take the
restriction to Γ of any function, u ∈ Cc∞ (Rm ) . Then each h∗i (uψ i ) ∈ H m+s (Ui )
and the sum is finite because spt u has nonempty intersection with only finitely
many $W_i$.
It is not at all obvious this norm is well defined. What if $\{W_i',\psi_i',\Gamma_i',U_i,h_i',g_i'\}_{i=1}^\infty$ is as described above? Would the two norms be equivalent? If they aren’t, then this is not a good way to define $H^{m+s}(\Gamma)$ because it would depend on the choice of partition of unity and functions, $h_i$, and choice of the open sets, $U_i$. To begin with pick a particular choice for $\{W_i,\psi_i,\Gamma_i,U_i,h_i,g_i\}_{i=1}^\infty$.
Proof: Let $\{u_j\}_{j=1}^\infty$ be a Cauchy sequence in $H^{m+s}(\Gamma)$. Then $\{h_i^*(u_j\psi_i)\}_{j=1}^\infty$ is a Cauchy sequence in $H^{m+s}(U_i)$ for each $i$. Therefore, for each $i$, there exists
wi ∈ H m+s (Ui ) such that
It is required to show there exists u ∈ H m+s (Γ) such that wi = h∗i (uψ i ) for each i.
Now from Corollary 39.14 it follows easily by approximating with simple functions that for every nonnegative $\mu$ measurable function, $f$,
\[ \int_\Gamma f\,d\mu = \sum_{r=1}^\infty \int_{g_r\Gamma_r} \psi_r f(h_r(u))\,J_r(u)\,du. \]
Therefore,
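\[ \begin{aligned}
\int_\Gamma |u_j - u_k|^2\,d\mu &= \sum_{r=1}^\infty \int_{g_r\Gamma_r} \psi_r |u_j - u_k|^2(h_r(u))\,J_r(u)\,du \\
&\le C\sum_{r=1}^\infty \int_{g_r\Gamma_r} \psi_r |u_j - u_k|^2(h_r(u))\,du \\
&= C\sum_{r=1}^\infty \|h_r^*(\psi_r|u_j - u_k|)\|^2_{0,2,U_r} \\
&\le C\|u_j - u_k\|_{H^{m+s}(\Gamma)}
\end{aligned} \]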
||uj − u||0,2,Γ → 0.
and a subsequence, still denoted by uj such that uj (x) → u (x) for µ a.e. x ∈ Γ. It
is required to show that u ∈ H m+s (Γ) such that wi = h∗i (uψ i ) for each i. First of
all, u is measurable because it is the limit of measurable functions. The pointwise
convergence just established and the fact that sets of measure zero on Γi correspond
to sets of measure zero on Ui which was discussed in the claim found in the proof
of Theorem 39.13 on Page 1129 shows that
a.e. x. Therefore,
h∗i (uψ i ) = wi
and this shows that h∗i (uψ i ) ∈ H m+s (Ui ) . It remains to verify that u ∈ H m+s (Γ) .
This follows from Fatou’s lemma. From 41.22,
\[ \|h_i^*(u_j\psi_i)\|^2_{H^{m+s}(U_i)} \to \|h_i^*(u\psi_i)\|^2_{H^{m+s}(U_i)} \]
and so
\[ \begin{aligned}
\sum_{i=1}^\infty \|h_i^*(u\psi_i)\|^2_{H^{m+s}(U_i)} &\le \liminf_{j\to\infty}\sum_{i=1}^\infty \|h_i^*(u_j\psi_i)\|^2_{H^{m+s}(U_i)} \\
&= \liminf_{j\to\infty}\|u_j\|^2_{H^{m+s}(\Gamma)} < \infty.
\end{aligned} \]
Therefore,
\[ \|u\|_1 \le C\|u\|_2, \qquad \|u\|_2 \le C'\|u\|_1. \]
This proves the following theorem.
and
\[ U_i' \equiv g_i(\Gamma_i') = h_i^{-1}(W_i'\cap\Gamma), \]
it follows that $U_i'$ is an open set because $h_i$ is continuous and $(\Gamma', W_i', U_i', \Gamma_i', h_i', g_i')$ is also a $C^{m,1}$ manifold if you define $h_i'$ to be the restriction of $h_i$ to $U_i'$ and $g_i'$ to be the restriction of $g_i$ to $W_i'$.
As a case of this, consider a C m,1 manifold, Γ where (Γ, Wi , Ui , Γi , hi , gi ) are as
described in Definition 41.34 and the submanifold consisting of Γi . The next lemma
shows there is a simple way to define a norm on H t (Γi ) which does not depend on
dragging in a partition of unity.
Proof: Let $u \in H^t(\Gamma)$ and let $(\Gamma_k, W_i', U_i', \Gamma_i', h_i', g_i')$ be the sets and functions which define what is meant by $\Gamma_k$ being a $C^{m,1}$ manifold as described in Definition 41.34. Also let $(\Gamma, W_i, U_i, \Gamma_i, h_i, g_i)$ pertain to $\Gamma$ in the same way and let $\{\phi_j\}$ be a $C^\infty$ partition of unity for the $\{W_j\}$. Since the $\{W_i'\}$ are locally finite, only finitely many can intersect $\Gamma_k$, say $\{W_1',\cdots,W_s'\}$. Also only finitely many of the $W_i$ can intersect $\Gamma_k$, say $\{W_1,\cdots,W_q\}$. Then letting $\{\psi_i'\}$ be a $C^\infty$ partition of unity subordinate to the $\{W_i'\}$,
\[ \begin{aligned}
\sum_{i=1}^\infty \left\|h_i'^*\left(u\psi_i'\right)\right\|_{H^t(U_i')}
&= \sum_{i=1}^s \Big\|h_i'^*\sum_{j=1}^q \phi_j u\psi_i'\Big\|_{H^t(U_i')} \\
&\le \sum_{i=1}^s\sum_{j=1}^q \left\|h_i'^*\phi_j u\psi_i'\right\|_{H^t(U_i')} \\
&= \sum_{j=1}^q\sum_{i=1}^s \left\|h_i'^*\phi_j u\psi_i'\right\|_{H^t(U_i')} \\
&= \sum_{j=1}^q\sum_{i=1}^s \left\|(g_j\circ h_i')^* h_j^*\phi_j u\psi_i'\right\|_{H^t(U_i')}.
\end{aligned} \]
By Lemma 41.20 on page 1185, there exists a single constant, C such that the above
is dominated by
\[ C\sum_{j=1}^q\sum_{i=1}^s \left\|h_j^*\phi_j u\psi_i'\right\|_{H^t(U_j)}. \]
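\[ \begin{aligned}
C\sum_{j=1}^q\sum_{i=1}^s C_{\psi_i'}\left\|h_j^*\phi_j u\right\|_{H^t(U_j)} &\le C\sum_{j=1}^q\sum_{i=1}^s \left\|h_j^*\phi_j u\right\|_{H^t(U_j)} \\
&\le C\sum_{j=1}^q \left\|h_j^*\phi_j u\right\|_{H^t(U_j)} < \infty.
\end{aligned} \]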
This shows that u restricted to Γk is in H t (Γk ). It also shows that the restriction
map of H t (Γ) to H t (Γk ) is continuous.
Now consider the norm |||·|||t . For u ∈ H t (Γk ) , let (Γk , Wi0 , Ui0 , Γ0i , h0i , gi0 ) be
sets and functions which define an atlas for Γk . Since the {Wi0 } are locally finite,
only finitely many can have nonempty intersection with Γk , say {W1 , · · ·, Ws } . Thus
i ≤ s for some finite s. The problem is to compare |||·|||t with ||·||H t (Γk ) . As above,
let $\{\psi_i'\}$ denote a $C^\infty$ partition of unity subordinate to the $\{W_j'\}$. Then
\[ \begin{aligned}
|||u|||_t \equiv \|h_k^*u\|_{H^t(U_k)} &= \Big\|h_k^*\sum_{j=1}^s \psi_j'u\Big\|_{H^t(U_k)} \\
&\le \sum_{j=1}^s \left\|h_k^*\left(\psi_j'u\right)\right\|_{H^t(U_k)} \\
&= \sum_{j=1}^s \left\|\left(g_j'\circ h_k\right)^* h_j'^*\left(\psi_j'u\right)\right\|_{H^t(U_k)} \\
&\le C\sum_{j=1}^s \left\|h_j'^*\left(\psi_j'u\right)\right\|_{H^t(U_j')} \\
&\le C\Big(\sum_{j=1}^s \left\|h_j'^*\left(\psi_j'u\right)\right\|^2_{H^t(U_j')}\Big)^{1/2} = C\|u\|_{H^t(\Gamma_k)}.
\end{aligned} \]
where Lemma 41.20 on page 1185 was used in the last step. Now also, from Lemma
41.20 on page 1185
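\[ \begin{aligned}
\|u\|_{H^t(\Gamma_k)} &= \Big(\sum_{j=1}^s \left\|h_j'^*\left(\psi_j'u\right)\right\|^2_{H^t(U_j')}\Big)^{1/2} \\
&= \Big(\sum_{j=1}^s \left\|\left(g_k\circ h_j'\right)^* h_k^*\left(\psi_j'u\right)\right\|^2_{H^t(U_j')}\Big)^{1/2} \\
&\le C\Big(\sum_{j=1}^s \left\|h_k^*\left(\psi_j'u\right)\right\|^2_{H^t(U_k)}\Big)^{1/2} \\
&\le C\Big(\sum_{j=1}^s \|h_k^*u\|^2_{H^t(U_k)}\Big)^{1/2} = C_s\|h_k^*u\|_{H^t(U_k)} = C_s|||u|||_t.
\end{aligned} \]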
[Figure: the set $W$ with $\Omega\cap W$, and its image under $R$, showing $R(W)$ and $R(\Omega\cap W)$ with $u' \in U'$.]
I must show it satisfies what it should. Recall the definition of what it means for a
function to be in H t−1/2 (Γ) where t = m + s.
Definition 41.41 Let $s \in (0,1)$ and let $m$ be a nonnegative integer. Also let $\mu$ denote the surface measure for $\Gamma$. A $\mu$ measurable function, $u$, is in $H^{m+s}(\Gamma)$ if whenever $\{W_i,\psi_i,\Gamma_i,U_i,h_i,g_i\}_{i=1}^\infty$ is described above, $h_i^*(u\psi_i) \in H^{m+s}(U_i)$ and
\[ \|u\|_{H^{m+s}(\Gamma)} \equiv \left(\sum_{i=1}^\infty \|h_i^*(u\psi_i)\|^2_{H^{m+s}(U_i)}\right)^{1/2} < \infty. \]
Recall that all these norms which are obtained from various partitions of unity
and functions, hi and gi are equivalent. Here there are only finitely many Wi so
the sum is a finite sum. The theorem is the following.
Theorem 41.42 Let Ω be a bounded open set having C m,1 boundary as discussed
above in Definition 41.40. Then for t ≤ m + 1, there exists a unique
\[ \gamma \in \mathcal{L}\left(H^t(\Omega), H^{t-1/2}(\Gamma)\right) \]
which has the property that for µ the measure on the boundary,
\[ g_i\circ h_j : U_j' \to U_i'. \]
and
\[ \left\|(g_i\circ h_j)^*\left(\gamma H_i^*(u\psi_i)\right)\right\|_{H^{t-1/2}(U_j')} \le C_{ij}\left\|\gamma H_i^*(u\psi_i)\right\|_{H^{t-1/2}(U_i')}. \]
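Also $h_j^*\psi_j \in C^{m,1}(\overline{U_j'})$ and has compact support in $U_j'$ and so by Corollary 41.22 on Page 1186
\[ \left(h_j^*\psi_j\right)(g_i\circ h_j)^*\left(\gamma H_i^*(u\psi_i)\right) \in H^{t-1/2}(U_j') \]
and
\[ \left\|\left(h_j^*\psi_j\right)(g_i\circ h_j)^*\left(\gamma H_i^*(u\psi_i)\right)\right\|_{H^{t-1/2}(U_j')} \le C_{ij}\left\|(g_i\circ h_j)^*\left(\gamma H_i^*(u\psi_i)\right)\right\|_{H^{t-1/2}(U_j')} \tag{41.28} \]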
This shows $\gamma u \in H^{t-1/2}(\Gamma)$ because each $h_j^*\left(\psi_j(\gamma u)\right) \in H^{t-1/2}(U_j')$. Also from 41.29 and 41.27
\[ \begin{aligned}
\|\gamma u\|^2_{H^{t-1/2}(\Gamma)} &\le \sum_{j=1}^q \left\|h_j^*\left(\psi_j(\gamma u)\right)\right\|^2_{H^{t-1/2}(U_j')} \\
&= \sum_{j=1}^q \Big\|\sum_{i=1}^q \left(h_j^*\psi_j\right)(g_i\circ h_j)^*\left(\gamma H_i^*(u\psi_i)\right)\Big\|^2_{H^{t-1/2}(U_j')} \\
&\le C_q\sum_{j=1}^q\sum_{i=1}^q \left\|\left(h_j^*\psi_j\right)(g_i\circ h_j)^*\left(\gamma H_i^*(u\psi_i)\right)\right\|^2_{H^{t-1/2}(U_j')} \\
&\le C_q\sum_{j=1}^q\sum_{i=1}^q C_{ij}\left\|\gamma H_i^*(u\psi_i)\right\|^2_{H^{t-1/2}(U_i')} \\
&\le C_q\sum_{i=1}^q \left\|\gamma H_i^*(u\psi_i)\right\|^2_{H^{t-1/2}(U_i')} \\
&\le C_q\sum_{i=1}^q \left\|H_i^*(u\psi_i)\right\|^2_{H^t(R_i(W_i\cap\Omega))} \\
&\le C_q\sum_{i=1}^q \|u\psi_i\|^2_{H^t(W_i\cap\Omega)} \le C_q\|u\|^2_{H^t(\Omega)}.
\end{aligned} \]
Then
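\[ \begin{aligned}
\gamma u(x) &= \sum_{i\in I_x} \left(\gamma H_i^*(u\psi_i)\right)(g_i(x)) \\
&= \sum_{i\in I_x} \left(\gamma H_i^*(u\psi_i)\right)(g_i(h_i(u_i'))) \\
&= \sum_{i\in I_x} \left(\gamma H_i^*(u\psi_i)\right)(u_i').
\end{aligned} \]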
Theorem 42.2 (Lax Milgram) Let $A \in \mathcal{L}(V,V')$ be coercive. Then $A$ maps one to one and onto.
Proof: The proof that $A$ is onto involves showing $A(V)$ is both dense and closed.
Consider first the claim that $A(V)$ is closed. Let $Ax_n \to y^* \in V'$. Then
\[ \delta\|x_n - x_m\|_V^2 \le \|Ax_n - Ax_m\|_{V'}\,\|x_n - x_m\|_V. \]
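The closedness argument can be spelled out as follows. This sketch assumes coercivity means $A(x)(x) \ge \delta\|x\|_V^2$ for some $\delta > 0$ on a real space $V$; that definition is not restated in this part of the text.

```latex
\begin{aligned}
\delta\|x_n - x_m\|_V^2
  &\le (Ax_n - Ax_m)(x_n - x_m)
   \le \|Ax_n - Ax_m\|_{V'}\,\|x_n - x_m\|_V, \\
\|x_n - x_m\|_V
  &\le \delta^{-1}\|Ax_n - Ax_m\|_{V'} .
\end{aligned}
```

Since $\{Ax_n\}$ converges it is Cauchy in $V'$, so $\{x_n\}$ is Cauchy in $V$ and converges to some $x$. Continuity of $A$ then gives $Ax = y^*$, so $y^* \in A(V)$. The same inequality with $x_m$ replaced by $0$ shows $A$ is one to one.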
1206 WEAK SOLUTIONS
Here is a simple example which illustrates the use of the above theorem. In the
example the repeated index summation convention is being used. That is, you sum
over the repeated indices.
According to the Lax Milgram theorem and the verification of its conditions in
Example 42.3, there exists a unique solution to the problem of finding u ∈ H01 (U )
such that for all v ∈ H01 (U ) ,
\[ \int_U \left(\alpha^{ij}(x)\,u_{,i}(x)\,v_{,j}(x) + u(x)v(x)\right)dx = \int_U f(x)v(x)\,dx \tag{42.1} \]
and since u ∈ H01 (U ) , it must be the case that γu = 0 on ∂U. This is why the
solution to 42.1 is referred to as a weak solution to the boundary value problem
\[ -\left(\alpha^{ij}(x)\,u_{,i}(x)\right)_{,j} + u(x) = f(x), \quad u = 0 \text{ on } \partial U. \]
Of course you then begin to ask the important question whether $u$ really has two derivatives. It is not immediately clear that just because $-\left(\alpha^{ij}(x)\,u_{,i}(x)\right)_{,j} \in L^2(U)$ it follows that the second derivatives of $u$ exist. Actually this will often be
true and is discussed somewhat in the next section.
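To make the weak formulation 42.1 concrete, here is a minimal one-dimensional sketch, assuming $U = (0,1)$ and $\alpha^{ij} = \delta^{ij}$, so the problem is $-u'' + u = f$ with $u(0) = u(1) = 0$. It uses a Galerkin method with piecewise linear hat functions (a discretization chosen for this illustration, not taken from the text) and approximates the load integral by $h f(x_i)$. The function name `solve_weak_1d` is hypothetical.

```python
import math


def solve_weak_1d(f, n):
    """Galerkin solution of -u'' + u = f on (0, 1), u(0) = u(1) = 0.

    Uses n interior nodes and hat functions phi_i. The bilinear form
    a(u, v) = integral of (u'v' + uv) gives a symmetric tridiagonal
    matrix, solved here with the Thomas algorithm.
    """
    h = 1.0 / (n + 1)
    diag = 2.0 / h + 4.0 * h / 6.0   # a(phi_i, phi_i)
    off = -1.0 / h + h / 6.0         # a(phi_i, phi_{i+1})
    rhs = [h * f((i + 1) * h) for i in range(n)]  # approximates (f, phi_i)

    # Forward elimination.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom

    # Back substitution.
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u
```

With $f(x) = (\pi^2+1)\sin(\pi x)$ the exact solution is $u(x) = \sin(\pi x)$, which gives a convenient check of the nodal values.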
Next suppose you choose V = H 1 (U ) and let g ∈ H 1/2 (∂U ). Define F ∈ V 0 by
\[ F(v) \equiv \int_U f(x)v(x)\,dx + \int_{\partial U} g(x)\,\gamma v(x)\,d\mu. \]
Everything works the same way and you get the existence of a unique u ∈ H 1 (U )
such that for all v ∈ H 1 (U ) ,
\[ \int_U \left(\alpha^{ij}(x)\,u_{,i}(x)\,v_{,j}(x) + u(x)v(x)\right)dx = \int_U f(x)v(x)\,dx + \int_{\partial U} g(x)\,\gamma v(x)\,d\mu \tag{42.2} \]
is satisfied. If you pretend $u$ has all second order derivatives in $L^2(U)$ and apply the divergence theorem, you find that you have obtained a weak solution to
\[ -\left(\alpha^{ij}u_{,i}\right)_{,j} + u = f, \quad \alpha^{ij}u_{,i}n_j = g \text{ on } \partial U \]
Therefore, uδ equals a constant on Uδ0 because Uδ0 is a connected open set and
uδ is a smooth function defined on this set which has its gradient equal to 0. By
Minkowski’s inequality,
\[ \left(\int_{U_\delta'} |u(x) - u_\delta(x)|^2\,dx\right)^{1/2} \le \int_{B(0,\delta)} \phi_\delta(y)\left(\int_{U_\delta'} |u(x) - u(x-y)|^2\,dx\right)^{1/2} dy \]
Replacing uk with uk / ||uk ||1,2 , it can be assumed that ||uk ||1,2 = 1 for all k.
Therefore, using the compactness of the embedding of H 1 (U ) into L2 (U ) , there
42.1. THE LAX MILGRAM THEOREM 1209
A fundamental inequality used in elasticity to obtain coercivity and then apply the
Lax Milgram theorem or some other theorem is Korn’s inequality. The proof given
here of this fundamental result follows [41] and [19].
Theorem 43.1 Let f ∈ L2 (Ω) where Ω is a bounded Lipschitz domain. Then there
exist constants, C1 and C2 such that
à n ¯¯ ¯¯ !
X ¯¯ ∂f ¯¯
C1 ||f ||0,2,Ω ≤ ||f ||−1,2,Ω + ¯¯ ¯¯ ≤ C2 ||f ||0,2,Ω ,
¯¯ ∂xi ¯¯
i=1 −1,2,Ω
where here ||·||0,2,Ω represents the L2 norm and ||·||−1,2,Ω represents the norm in
the dual space of H01 (Ω) , denoted by H −1 (Ω) .
Similar conventions will apply for any domain in place of Ω. The proof of this
theorem will proceed through the use of several lemmas.
1212 KORN’S INEQUALITY
\[ {} + \int_{U^-} f\,\frac{\partial\phi}{\partial x_n}\,dx. \tag{43.1} \]
Consider the first integral on the right in 43.1. Changing the variables, letting
yn = 2g (x) − xn in the first term of the integrand and 3g (x) − 2xn in the next, it
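equals
\[ -3\int_{U^-} \frac{\partial\phi}{\partial x_n}\left(x, 2g(x) - y_n\right) f(x,y_n)\,dy_n\,dx + 2\int_{U^-} \frac{\partial\phi}{\partial x_n}\left(x, \frac{3}{2}g(x) - \frac{y_n}{2}\right) f(x,y_n)\,dy_n\,dx. \]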
For (x,yn ) ∈ U − , and defining
\[ \psi(x,y_n) \equiv \phi(x,y_n) + 3\phi\left(x, 2g(x) - y_n\right) - 4\phi\left(x, \frac{3}{2}g(x) - \frac{y_n}{2}\right), \]
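and so
\[ \begin{aligned}
\left\|\frac{\partial f}{\partial x_n}\right\|_{-1,2,\mathbb{R}^n}
&\equiv \sup\left\{\int_{\mathbb{R}^n} f\,\frac{\partial\phi}{\partial x_n}\,dx : \phi \in C_c^\infty(\mathbb{R}^n),\ \|\phi\|_{1,2,\mathbb{R}^n} \le 1\right\} \\
&\le \sup\left\{\left|\int_{U^-} f\,\frac{\partial\psi}{\partial x_n}\,dx\,dy_n\right| : \psi \in H_0^1(U^-),\ \|\psi\|_{1,2,U^-} \le C_g\right\} \\
&= C_g\left\|\frac{\partial f}{\partial x_n}\right\|_{-1,2,U^-}
\end{aligned} \tag{43.2} \]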
43.1. A FUNDAMENTAL INEQUALITY 1213
It remains to establish a similar inequality for the case where the derivatives are
taken with respect to xi for i < n. Let φ ∈ Cc∞ (Rn ) . Then
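\[ \int_{\mathbb{R}^n} f\,\frac{\partial\phi}{\partial x_i}\,dx = \int_{U^-} f\,\frac{\partial\phi}{\partial x_i}\,dx + \int_{U^+} \frac{\partial\phi}{\partial x_i}\left[-3f\left(x, 2g(x) - x_n\right) + 4f\left(x, 3g(x) - 2x_n\right)\right]dx. \]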
U + ∂xi
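Then
\[ \begin{aligned}
\frac{\partial\psi_1}{\partial x_i} &= D_i\phi\left(x, 2g(x) - y_n\right) + D_n\phi\left(x, 2g(x) - y_n\right)2D_ig(x), \\
\frac{\partial\psi_2}{\partial x_i} &= D_i\phi\left(x, \frac{3}{2}g(x) - \frac{y_n}{2}\right) + D_n\phi\left(x, \frac{3}{2}g(x) - \frac{y_n}{2}\right)\frac{3}{2}D_ig(x).
\end{aligned} \]
Also
\[ \begin{aligned}
\frac{\partial\psi_1}{\partial y_n}(x,y_n) &= -D_n\phi\left(x, 2g(x) - y_n\right), \\
\frac{\partial\psi_2}{\partial y_n}(x,y_n) &= \left(\frac{-1}{2}\right)D_n\phi\left(x, \frac{3}{2}g(x) - \frac{y_n}{2}\right).
\end{aligned} \]
Therefore,
\[ \begin{aligned}
\frac{\partial\psi_1}{\partial x_i}(x,y_n) &= D_i\phi\left(x, 2g(x) - y_n\right) - 2\frac{\partial\psi_1}{\partial y_n}(x,y_n)\,D_ig(x), \\
\frac{\partial\psi_2}{\partial x_i}(x,y_n) &= D_i\phi\left(x, \frac{3}{2}g(x) - \frac{y_n}{2}\right) - 3\frac{\partial\psi_2}{\partial y_n}(x,y_n)\,D_ig(x).
\end{aligned} \]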
Using this in 43.3, the integrals in this expression equal
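\[ \begin{aligned}
&-3\int_{U^-}\left[\frac{\partial\psi_1}{\partial x_i}(x,y_n) + 2\frac{\partial\psi_1}{\partial y_n}(x,y_n)\,D_ig(x)\right]f(x,y_n)\,dy_n\,dx \\
&\quad + 2\int_{U^-}\left[\frac{\partial\psi_2}{\partial x_i}(x,y_n) + 3\frac{\partial\psi_2}{\partial y_n}(x,y_n)\,D_ig(x)\right]f(x,y_n)\,dy_n\,dx \\
&= \int_{U^-}\left[-3\frac{\partial\psi_1(x,y_n)}{\partial x_i} + 2\frac{\partial\psi_2(x,y_n)}{\partial x_i}\right]f(x,y_n)\,dy_n\,dx.
\end{aligned} \]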
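Therefore,
\[ \int_{\mathbb{R}^n} f\,\frac{\partial\phi}{\partial x_i}\,dx = \int_{U^-}\left[\frac{\partial\phi}{\partial x_i} - 3\frac{\partial\psi_1}{\partial x_i} + 2\frac{\partial\psi_2}{\partial x_i}\right]f\,dx\,dy_n \]
and also
\[ \phi(x,g(x)) - 3\psi_1(x,g(x)) + 2\psi_2(x,g(x)) = \phi(x,g(x)) - 3\phi(x,g(x)) + 2\phi(x,g(x)) = 0 \]
and so $\phi - 3\psi_1 + 2\psi_2 \in H_0^1(U^-)$. It also follows from the definition of the functions, $\psi_i$, and the assumption that $g$ is Lipschitz, that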
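Therefore,
\[ \begin{aligned}
\left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,\mathbb{R}^n}
&\equiv \sup\left\{\left|\int_{\mathbb{R}^n} f\,\frac{\partial\phi}{\partial x_i}\,dx\right| : \|\phi\|_{1,2,\mathbb{R}^n} \le 1\right\} \\
&= \sup\left\{\left|\int_{U^-} f\left[\frac{\partial\phi}{\partial x_i} - 3\frac{\partial\psi_1}{\partial x_i} + 2\frac{\partial\psi_2}{\partial x_i}\right]dx\right| : \|\phi\|_{1,2,\mathbb{R}^n} \le 1\right\} \\
&\le C_g\left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,U^-}
\end{aligned} \]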
where $C_g$ is a constant which depends on $g$. This inequality along with 43.2 yields
\[
\sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,\mathbb{R}^n} \le C_g\left(\sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,U^-}\right).
\]
The inequality,
\[
\|f\|_{-1,2,\mathbb{R}^n} \le C_g \|f\|_{-1,2,U^-}
\]
follows from 43.4 and the equation,
\[
\int_{\mathbb{R}^n} f\phi\,dx = \int_{U^-} f\phi\,dx - 3\int_{U^-} f(x, y_n)\,\psi_1(x, y_n)\,dx\,dy_n + 2\int_{U^-} f(x, y_n)\,\psi_2(x, y_n)\,dx\,dy_n
\]
which results in the same way as before by changing variables using the definition of $f$ off $U^-$. This proves the lemma.
The next lemma is a simple application of Fourier transforms.
is an equivalent norm to the usual Sobolev space norm for $H_0^1(\mathbb{R}^n)$ and is used in the following argument which depends on Plancherel’s theorem and the fact that $F\left(\frac{\partial\phi}{\partial x_i}\right) = t_i F(\phi)$.
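\[
\begin{aligned}
\left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,\mathbb{R}^n}
&\equiv \sup\left\{\left|\int_{\mathbb{R}^n} f\frac{\partial\phi}{\partial x_i}\,dx\right| : \|\phi\|_{1,2} \le 1\right\} \\
&= C_n \sup\left\{\left|\int_{\mathbb{R}^n} t_i (F\phi)(Ff)\,dt\right| : \|\phi\|_{1,2} \le 1\right\} \\
&= C_n \sup\left\{\left|\int_{\mathbb{R}^n} \frac{t_i (F\phi)\left(1 + |t|^2\right)^{1/2}}{\left(1 + |t|^2\right)^{1/2}}(Ff)\,dt\right| : \|\phi\|_{1,2} \le 1\right\} \\
&= C_n\left(\int_{\mathbb{R}^n} \frac{|Ff|^2\,t_i^2}{1 + |t|^2}\,dt\right)^{1/2}
\end{aligned}
\tag{43.5}
\]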
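Also,
\[
\begin{aligned}
\|f\|_{-1,2}
&\equiv \sup\left\{\left|\int_{\mathbb{R}^n} \phi f\,dx\right| : \|\phi\|_{1,2} \le 1\right\} \\
&= C_n \sup\left\{\left|\int_{\mathbb{R}^n} (F\phi)(Ff)\,dx\right| : \|\phi\|_{1,2} \le 1\right\} \\
&= C_n \sup\left\{\left|\int_{\mathbb{R}^n} \frac{F\phi\left(1 + |t|^2\right)^{1/2}}{\left(1 + |t|^2\right)^{1/2}}(Ff)\,dt\right| : \|\phi\|_{1,2} \le 1\right\} \\
&= C_n\left(\int_{\mathbb{R}^n} \frac{|Ff|^2}{1 + |t|^2}\,dt\right)^{1/2}
\end{aligned}
\]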
This along with 43.5 yields the conclusion of the lemma because
\[
\sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2}^2 + \|f\|_{-1,2}^2 = C_n\int_{\mathbb{R}^n} |Ff|^2\,dx = C_n\|f\|_{0,2}^2.
\]
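The equality just stated can be checked by squaring 43.5 and the corresponding formula for the norm of f in the dual space, then adding. The following is only a sketch: it assumes that every $C_n$ stands for a generic constant depending on n alone, whose value may change from line to line.

```latex
\begin{align*}
\sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2}^2 + \|f\|_{-1,2}^2
&= C_n \int_{\mathbb{R}^n} |Ff|^2\,\frac{\sum_{i=1}^n t_i^2}{1+|t|^2}\,dt
 + C_n \int_{\mathbb{R}^n} |Ff|^2\,\frac{1}{1+|t|^2}\,dt \\
&= C_n \int_{\mathbb{R}^n} |Ff|^2\,\frac{|t|^2 + 1}{1+|t|^2}\,dt
 = C_n \int_{\mathbb{R}^n} |Ff|^2\,dt .
\end{align*}
```

The weights $t_i^2/(1+|t|^2)$ and $1/(1+|t|^2)$ add up to 1, which is why the sum of the negative norms of f and its first derivatives recovers the $L^2$ norm of $Ff$, and hence of f by Plancherel’s theorem.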
Now consider Theorem 43.1. First note that by Lemma 43.2 and $U^-$ defined there, Lemma 43.3 implies that for $f$ extended as in Lemma 43.2,
\[
\begin{aligned}
\|f\|_{0,2,U^-} \le \|f\|_{0,2,\mathbb{R}^n}
&= C_n\left(\|f\|_{-1,2,\mathbb{R}^n} + \sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,\mathbb{R}^n}\right) \\
&\le C_{gn}\left(\|f\|_{-1,2,U^-} + \sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,U^-}\right).
\end{aligned}
\tag{43.6}
\]
Let $\Omega$ be a bounded open set having Lipschitz boundary which lies locally on one side of its boundary. Let $\{Q_i\}_{i=0}^p$ be cubes of the sort used in the proof of the divergence theorem such that $Q_0 \subseteq \Omega$ and the other cubes cover the boundary of $\Omega$. Let $\{\psi_i\}$ be a $C^\infty$ partition of unity with $\mathrm{spt}(\psi_i) \subseteq Q_i$ and let $f \in L^2(\Omega)$. Then for $\phi \in C_c^\infty(\Omega)$ and $\psi$ one of these functions in the partition of unity,
\[
\left\|\frac{\partial (f\psi)}{\partial x_i}\right\|_{-1,2,\Omega} \le \sup_{\|\phi\|_{1,2} \le 1}\left|\int_\Omega f\frac{\partial(\psi\phi)}{\partial x_i}\,dx\right| + \sup_{\|\phi\|_{1,2} \le 1}\left|\int_\Omega f\phi\frac{\partial\psi}{\partial x_i}\,dx\right|
\]
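Therefore,
\[
\begin{aligned}
\left\|\frac{\partial (f\psi)}{\partial x_i}\right\|_{-1,2,\Omega}
&\le \sup_{\|\eta\|_{1,2} \le C_\psi}\left|\int_\Omega f\frac{\partial\eta}{\partial x_i}\,dx\right| + \sup_{\|\eta\|_{1,2} \le C_\psi}\left|\int_\Omega f\eta\,dx\right| \\
&\le C_\psi\left(\left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,\Omega} + \|f\|_{-1,2,\Omega}\right).
\end{aligned}
\tag{43.7}
\]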
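Now using 43.7 and 43.6
\[
\begin{aligned}
\|f\psi_j\|_{0,2,\Omega}
&\le C_g\left(\|f\psi_j\|_{-1,2,\Omega} + \sum_{i=1}^n \left\|\frac{\partial (f\psi_j)}{\partial x_i}\right\|_{-1,2,\Omega}\right) \\
&\le C_{\psi_j} C_g\left(\|f\|_{-1,2,\Omega} + \sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,\Omega}\right).
\end{aligned}
\]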
Therefore, letting $C = \sum_{j=1}^p C_{\psi_j} C_g$,
\[
\|f\|_{0,2,\Omega} \le \sum_{j=1}^p \|f\psi_j\|_{0,2,\Omega} \le C\left(\|f\|_{-1,2,\Omega} + \sum_{i=1}^n \left\|\frac{\partial f}{\partial x_i}\right\|_{-1,2,\Omega}\right).
\tag{43.8}
\]
It is very significant because it is the strain as just defined which occurs in many of
the physical models proposed in continuum mechanics. The inequality is far from
obvious because the strains only involve certain combinations of partial derivatives.
Theorem 43.4 (Korn’s second inequality) Let Ω be any domain for which the con-
clusion of Theorem 43.1 holds. Then the two norms in 43.9 and 43.10 are equivalent.
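\[
\frac{\partial^2 u_i}{\partial x_j\,\partial x_k} = \frac{\partial}{\partial x_j}\left(\varepsilon_{ik}(u)\right) + \frac{\partial}{\partial x_k}\left(\varepsilon_{ij}(u)\right) - \frac{\partial}{\partial x_i}\left(\varepsilon_{jk}(u)\right).
\]
\[
\begin{aligned}
&\le C\left[\left\|\frac{\partial u_i}{\partial x_j}\right\|_{-1,2,\Omega} + \sum_{r,s,p}\left\|\frac{\partial \varepsilon_{rs}(u)}{\partial x_p}\right\|_{-1,2,\Omega}\right] \\
&\le C\left[\left\|\frac{\partial u_i}{\partial x_j}\right\|_{-1,2,\Omega} + \sum_{r,s}\|\varepsilon_{rs}(u)\|_{0,2,\Omega}\right].
\end{aligned}
\]
and so
\[
\left\|\frac{\partial u_i}{\partial x_j}\right\|_{0,2,\Omega} \le C\left[\|u_i\|_{0,2,\Omega} + \sum_{r,s}\|\varepsilon_{rs}(u)\|_{0,2,\Omega}\right]
\]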
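The identity expressing the second derivatives of $u_i$ through derivatives of the strains, used in the proof of Korn’s second inequality, can be verified by direct expansion. The sketch below assumes the usual definition of the strain, $\varepsilon_{ij}(u) = \frac{1}{2}(u_{i,j} + u_{j,i})$ with $u_{i,j} = \partial u_i/\partial x_j$; that definition does not appear in this part of the text, and the comma notation is introduced here only for brevity.

```latex
% Assumption: \varepsilon_{ij}(u) = \tfrac12 (u_{i,j} + u_{j,i}), u_{i,j} = \partial u_i / \partial x_j.
\begin{align*}
\partial_j \varepsilon_{ik}(u) + \partial_k \varepsilon_{ij}(u) - \partial_i \varepsilon_{jk}(u)
&= \tfrac12\left(u_{i,kj} + u_{k,ij}\right) + \tfrac12\left(u_{i,jk} + u_{j,ik}\right)
 - \tfrac12\left(u_{j,ki} + u_{k,ji}\right) \\
&= \tfrac12\left(u_{i,kj} + u_{i,jk}\right) + \tfrac12\left(u_{k,ij} - u_{k,ji}\right)
 + \tfrac12\left(u_{j,ik} - u_{j,ki}\right) \\
&= u_{i,jk}.
\end{align*}
```

The last step uses only the equality of mixed partial derivatives, which holds in the sense of distributions, so the computation needs no smoothness of u beyond what the norms in the inequality require.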
[Figure: the sets $U_1$, $U$, $V$ and $\Gamma$, drawn relative to $\mathbb{R}^{n-1}$ and $\mathbb{R}$.]
1220 ELLIPTIC REGULARITY AND NIRENBERG DIFFERENCES
\[
\alpha^{rs}(y)\,v_r v_s \ge \delta |v|^2, \quad \delta > 0. \tag{44.2}
\]
The following technical lemma gives the essential ideas.
\[
w \in H^1(U), \tag{44.3}
\]
\[
\alpha^{rs} \in C^{0,1}\left(\overline{U}\right), \tag{44.4}
\]
\[
h_s \in H^1(U), \tag{44.5}
\]
\[
f \in L^2(U). \tag{44.6}
\]
and
\[
\int_U \alpha^{rs}(y)\frac{\partial w}{\partial y^r}\frac{\partial z}{\partial y^s}\,dy + \int_U h_s(y)\frac{\partial z}{\partial y^s}\,dy = \int_U f z\,dy \tag{44.7}
\]
for all $z \in H^1(U)$ having the property that $\mathrm{spt}(z) \subseteq V$. Then $w \in H^2(U_1)$ and for some constant $C$, independent of $f$, $w$, and $g$, the following estimate holds.
\[
\|w\|^2_{H^2(U_1)} \le C\left(\|w\|^2_{H^1(U)} + \|f\|^2_{L^2(U)} + \sum_s \|h_s\|^2_{H^1(U)}\right). \tag{44.8}
\]
[Figure: the sets $U_1$, $U$, $W$, $V$ and $\Gamma$, drawn relative to $\mathbb{R}^{n-1}$ and $\mathbb{R}$.]
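44.1. THE CASE OF A HALF SPACE 1221
For $h$ small ($3h < \mathrm{dist}\left(\overline{W}, V^C\right)$), let
\[
z(y) \equiv \frac{1}{h}\left\{\eta^2(y - he_k)\left[\frac{w(y) - w(y - he_k)}{h}\right] - \eta^2(y)\left[\frac{w(y + he_k) - w(y)}{h}\right]\right\} \tag{44.9}
\]
\[
\equiv -D_k^{-h}\left(\eta^2 D_k^h w\right), \tag{44.10}
\]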
where here k < n. Thus z can be used in equation 44.7. Begin by estimating the
left side of 44.7.
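\[
\begin{aligned}
&\int_U \alpha^{rs}(y)\frac{\partial w}{\partial y^r}\frac{\partial z}{\partial y^s}\,dy \\
&= \frac{1}{h}\int_U \alpha^{rs}(y + he_k)\frac{\partial w}{\partial y^r}(y + he_k)\frac{\partial\left(\eta^2 D_k^h w\right)}{\partial y^s}\,dy
 - \frac{1}{h}\int_U \alpha^{rs}(y)\frac{\partial w}{\partial y^r}\frac{\partial\left(\eta^2 D_k^h w\right)}{\partial y^s}\,dy \\
&= \int_U \alpha^{rs}(y + he_k)\frac{\partial\left(D_k^h w\right)}{\partial y^r}\frac{\partial\left(\eta^2 D_k^h w\right)}{\partial y^s}\,dy
 + \frac{1}{h}\int_U \left(\alpha^{rs}(y + he_k) - \alpha^{rs}(y)\right)\frac{\partial w}{\partial y^r}\frac{\partial\left(\eta^2 D_k^h w\right)}{\partial y^s}\,dy
\end{aligned}
\tag{44.11}
\]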
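The first step in 44.11 moves the backward difference quotient in the definition of z onto the other factor. A minimal numerical sketch of this discrete integration by parts follows. It assumes the conventions suggested by 44.9 and 44.10, namely that $D^h p(y) = (p(y + h) - p(y))/h$ and $D^{-h} z(y) = (z(y) - z(y - h))/h$, and it replaces the integral by a Riemann sum on a periodic grid so that no boundary terms appear. The function names and sample data are hypothetical.

```python
import math

def forward_diff(p, h):
    """D^h p on a periodic grid: (p[i+1] - p[i]) / h."""
    n = len(p)
    return [(p[(i + 1) % n] - p[i]) / h for i in range(n)]

def backward_diff(z, h):
    """D^{-h} z on a periodic grid: (z[i] - z[i-1]) / h (z[-1] wraps around)."""
    n = len(z)
    return [(z[i] - z[i - 1]) / h for i in range(n)]

def inner(u, v, h):
    """Riemann-sum stand-in for the integral of u*v over one period."""
    return h * sum(a * b for a, b in zip(u, v))

n = 64
h = 1.0 / n
# Arbitrary sample data on the grid points i*h.
p = [math.sin(2 * math.pi * i * h) for i in range(n)]
z = [math.cos(6 * math.pi * i * h) + i * h * (1 - i * h) for i in range(n)]

# Discrete version of: integral of p * D^{-h} z = - integral of (D^h p) * z.
lhs = inner(p, backward_diff(z, h), h)
rhs = -inner(forward_diff(p, h), z, h)
```

On the grid the two sides agree up to rounding, because shifting the summation index by one turns the sum of `p[i] * z[i-1]` into the sum of `p[i+1] * z[i]`. In the text the same shift is a change of variables, which is legitimate because the support condition on $\eta$ keeps the translated functions inside U.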
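Now
\[
\frac{\partial\left(\eta^2 D_k^h w\right)}{\partial y^s} = 2\eta\frac{\partial\eta}{\partial y^s}D_k^h w + \eta^2\frac{\partial\left(D_k^h w\right)}{\partial y^s}. \tag{44.12}
\]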
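therefore,
\[
\begin{aligned}
&= \int_U \eta^2\alpha^{rs}(y + he_k)\frac{\partial\left(D_k^h w\right)}{\partial y^r}\frac{\partial\left(D_k^h w\right)}{\partial y^s}\,dy \\
&\quad + \left\{\int_{W \cap U} \alpha^{rs}(y + he_k)\frac{\partial\left(D_k^h w\right)}{\partial y^r}\,2\eta\frac{\partial\eta}{\partial y^s}D_k^h w\,dy\right. \\
&\qquad \left. + \frac{1}{h}\int_{W \cap U}\left(\alpha^{rs}(y + he_k) - \alpha^{rs}(y)\right)\frac{\partial w}{\partial y^r}\frac{\partial\left(\eta^2 D_k^h w\right)}{\partial y^s}\,dy\right\} \equiv A. + \{B.\}.
\end{aligned}
\tag{44.13}
\]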
Now consider these two terms. From 44.2,
\[
A. \ge \delta\int_U \eta^2\left|\nabla D_k^h w\right|^2 dy. \tag{44.14}
\]
³ ¯¯ ¯¯2 ´
≤ C (η, Lip (α) , α) Cε ¯¯Dkh w¯¯L2 (W ∩U ) + ||η∇w||L2 (W ∩U ;Rn ) +
2
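Now
\[
\left\|D_k^h w\right\|_{L^2(W)} \le \|\nabla w\|_{L^2(U;\mathbb{R}^n)}. \tag{44.17}
\]
\[
\begin{aligned}
\left(\int_W \left|\frac{w(y + he_k) - w(y)}{h}\right|^2 dy\right)^{1/2}
&\le \left(\int_W \left|\frac{1}{h}\int_0^h \nabla w(y + te_k)\cdot e_k\,dt\right|^2 dy\right)^{1/2} \\
&\le \int_0^h \left(\int_W \left|\nabla w(y + te_k)\cdot e_k\right|^2 dy\right)^{1/2}\frac{dt}{h} \le \|\nabla w\|_{L^2(U;\mathbb{R}^n)}
\end{aligned}
\]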
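\[
B. \le C_\varepsilon(\eta, \mathrm{Lip}(\alpha), \alpha)\,\|\nabla w\|^2_{L^2(U;\mathbb{R}^n)} + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(W \cap U;\mathbb{R}^n)}. \tag{44.18}
\]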
With 44.14 and 44.18 established, consider the other terms of 44.7.
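\[
\begin{aligned}
\left|\int_U f z\,dy\right|
&\le \left|\int_U f\left(-D_k^{-h}\left(\eta^2 D_k^h w\right)\right)dy\right| \\
&\le \left(\int_U |f|^2\,dy\right)^{1/2}\left(\int_U \left|D_k^{-h}\left(\eta^2 D_k^h w\right)\right|^2 dy\right)^{1/2} \\
&\le \|f\|_{L^2(U)}\left\|\nabla\left(\eta^2 D_k^h w\right)\right\|_{L^2(U;\mathbb{R}^n)} \\
&\le \|f\|_{L^2(U)}\left(\left\|2\eta\nabla\eta\,D_k^h w\right\|_{L^2(U;\mathbb{R}^n)} + \left\|\eta^2\nabla D_k^h w\right\|_{L^2(U;\mathbb{R}^n)}\right) \\
&\le C\|f\|_{L^2(U)}\|\nabla w\|_{L^2(U;\mathbb{R}^n)} + \|f\|_{L^2(U)}\left\|\eta\nabla D_k^h w\right\|_{L^2(U;\mathbb{R}^n)} \\
&\le C_\varepsilon\left(\|f\|^2_{L^2(U)} + \|\nabla w\|^2_{L^2(U;\mathbb{R}^n)}\right) + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)}
\end{aligned}
\tag{44.19}
\]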
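\[
\begin{aligned}
\left|\int_U h_s(y)\frac{\partial z}{\partial y^s}\,dy\right|
&\le \left|\int_U h_s(y)\frac{\partial\left(-D_k^{-h}\left(\eta^2 D_k^h w\right)\right)}{\partial y^s}\,dy\right| \\
&\le \left|\int_U D_k^h h_s(y)\frac{\partial\left(\eta^2 D_k^h w\right)}{\partial y^s}\,dy\right| \\
&\le \int_U \left|D_k^h h_s\,2\eta\frac{\partial\eta}{\partial y^s}D_k^h w\right| dy + \int_U \left|\left(\eta D_k^h h_s\right)\left(\eta\frac{\partial\left(D_k^h w\right)}{\partial y^s}\right)\right| dy \\
&\le C\sum_s \|h_s\|_{H^1(U)}\left(\|w\|_{H^1(U)} + \left\|\eta\nabla D_k^h w\right\|_{L^2(U;\mathbb{R}^n)}\right) \\
&\le C_\varepsilon\sum_s \|h_s\|^2_{H^1(U)} + \|w\|^2_{H^1(U)} + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)}.
\end{aligned}
\tag{44.20}
\]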
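\[
B. \le C_\varepsilon(\eta, \mathrm{Lip}(\alpha), \alpha)\,\|\nabla w\|^2_{L^2(U;\mathbb{R}^n)} + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(W \cap U;\mathbb{R}^n)},
\]
\[
\left|\int_U f z\,dy\right| \le C_\varepsilon\left(\|f\|^2_{L^2(U)} + \|\nabla w\|^2_{L^2(U;\mathbb{R}^n)}\right) + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)}
\]
\[
\left|\int_U h_s(y)\frac{\partial z}{\partial y^s}\,dy\right| \le C_\varepsilon\sum_s \|h_s\|^2_{H^1(U)} + \|w\|^2_{H^1(U)} + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)}.
\]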
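Therefore,
\[
\begin{aligned}
\delta\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)}
\le{}& C_\varepsilon(\eta, \mathrm{Lip}(\alpha), \alpha)\,\|\nabla w\|^2_{L^2(U;\mathbb{R}^n)} + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)} \\
&+ C_\varepsilon\sum_s \|h_s\|^2_{H^1(U)} + \|w\|^2_{H^1(U)} + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)} \\
&+ C_\varepsilon\left(\|f\|^2_{L^2(U)} + \|\nabla w\|^2_{L^2(U;\mathbb{R}^n)}\right) + \varepsilon\left\|\eta\nabla D_k^h w\right\|^2_{L^2(U;\mathbb{R}^n)}.
\end{aligned}
\]
\[
C\left(\|w\|^2_{H^1(U)} + \|f\|^2_{L^2(U)} + C_\varepsilon\sum_s \|h_s\|^2_{H^1(U)}\right)
\]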
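where the constant, $C$, depends on $\eta$, $\mathrm{Lip}(\alpha)$, $\alpha$, $\delta$. Since this holds for all $h$ small enough, it follows $\frac{\partial w}{\partial y^k} \in H^1(U_1)$ and
\[
\left\|\nabla\frac{\partial w}{\partial y^k}\right\|^2_{L^2(U_1;\mathbb{R}^n)} \le C\left(\|w\|^2_{H^1(U)} + \|f\|^2_{L^2(U)} + C_\varepsilon\sum_s \|h_s\|^2_{H^1(U)}\right) \tag{44.21}
\]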
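for each $k < n$. It remains to estimate $\left\|\frac{\partial^2 w}{\partial y_n^2}\right\|^2_{L^2(U_1)}$. To do this return to 44.7 which must hold for all $z \in C_c^\infty(U_1)$. Therefore, using 44.7 and integrating the term involving $h_s$ by parts, it follows that for all $z \in C_c^\infty(U_1)$,
\[
\int_U \alpha^{rs}(y)\frac{\partial w}{\partial y^r}\frac{\partial z}{\partial y^s}\,dy = \int_U \frac{\partial h_s}{\partial y^s}\,z\,dy + \int_U f z\,dy.
\]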
$\in L^2(U_1)$
and
\[
\|F\|^2_{L^2(U_1)} \le C\left(\|w\|^2_{H^1(U)} + \|f\|^2_{L^2(U)} + C_\varepsilon\sum_s \|h_s\|^2_{H^1(U)}\right). \tag{44.22}
\]
which with 44.21 and 44.22 implies the existence of a constant, $C$ depending on $\delta$ such that
\[
\|w\|^2_{H^2(U_1)} \le C\left(\|w\|^2_{H^1(U)} + \|f\|^2_{L^2(U)} + C_\varepsilon\sum_s \|h_s\|^2_{H^1(U)}\right),
\]
Proof: The proof involves the following claim which is proved using the conclusion of Lemma 44.1 on Page 1220.
Claim: If $\alpha = (\alpha', 0)$ where $|\alpha'| \le k - 1$, then there exists a constant independent of $w$ such that
\[
\|D^\alpha w\|_{H^2(U_1)} \le C\left(\|f\|_{H^{k-1}(U)} + \sum_s \|h_s\|_{H^k(U)} + \|w\|_{H^k(U)}\right). \tag{44.25}
\]
Proof of claim: First note that if $|\alpha| = 0$, then 44.25 follows from Lemma 44.1 on Page 1220. Now suppose the conclusion of the claim holds for all $|\alpha| \le j - 1$ where $j < k$. Let $|\alpha| = j$ and $\alpha = (\alpha', 0)$. Then for $z \in H^1(U)$ having compact support in $V$, it follows that for $h$ small enough,
\[
D_\alpha^{-h} z \in H^1(U), \quad \mathrm{spt}\left(D_\alpha^{h} z\right) \subseteq V.
\]
Therefore, you can replace z in 44.23 with Dα−h z. Now note that you can apply the
following manipulation.
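\[
\int_U p(y)\,D_\alpha^{-h} z(y)\,dy = \int_U D_\alpha^h p(y)\,z(y)\,dy
\]
and obtain
\[
\int_U \left(D_\alpha^h\left(\alpha^{rs}\frac{\partial w}{\partial y^r}\right)\frac{\partial z}{\partial y^s} + D_\alpha^h(h_s)\frac{\partial z}{\partial y^s}\right)dy = \int_U \left(\left(D_\alpha^h f\right)z\right)dy. \tag{44.26}
\]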
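Now
\[
D^\alpha\left(\alpha^{rs}\frac{\partial w}{\partial y^r}\right) = \alpha^{rs}\frac{\partial\left(D^\alpha w\right)}{\partial y^r} + \sum_{\tau < \alpha} C(\tau)\,D^{\alpha - \tau}\left(\alpha^{rs}\right)\frac{\partial\left(D^\tau w\right)}{\partial y^r}
\]
\[
= \int_U \left(D^\alpha f\right)z\,dy. \tag{44.27}
\]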
[Figure: the sets $U_1$, $\widehat{U}_1$ and $U$, drawn relative to $\mathbb{R}^{n-1}$.]
à !
X
C ||f ||H k−1 (U ) + ||hs ||H k (U ) + ||w||H k (U )
s
and consequently,
¯¯ ¯¯
¯¯ X τ ¯¯
¯¯ α−τ rs ∂ (D w) α ¯¯
¯¯ C (τ ) D (α ) + D (hs )¯¯ ≤
¯¯τ <α ∂y r ¯¯
H1 (Ub1 )
à !
X
C ||f ||H k−1 (U ) + ||hs ||H k (U ) + ||w||H k (U ) . (44.28)
s
Now consider 44.27. The equation remains true if you replace $U$ with $\widehat{U}_1$ and require that $\mathrm{spt}(z) \subseteq \widehat{U}_1$. Therefore, by Lemma 44.1 on Page 1220 there exists a constant, $C$ independent of $w$ such that
\[
\|D^\alpha w\|_{H^2(U_1)} \le C\left(\|D^\alpha f\|_{L^2(\widehat{U}_1)} + \|D^\alpha w\|_{H^1(\widehat{U}_1)} + \sum_s \left\|\sum_{\tau < \alpha} C(\tau)\,D^{\alpha - \tau}\left(\alpha^{rs}\right)\frac{\partial\left(D^\tau w\right)}{\partial y^r} + D^\alpha(h_s)\right\|_{H^1(\widehat{U}_1)}\right)
\]
and by 44.28, this implies
\[
\|D^\alpha w\|_{H^2(U_1)} \le C\left(\|f\|_{H^{k-1}(U)} + \|w\|_{H^k(U)} + \sum_s \|h_s\|_{H^k(U)}\right)
\]
because in this case, you can subtract 1 from a pair of positive αi and obtain a new
multi index, β such that |β| = k − 1 and β n = 0 and then from the claim,
à !
¯¯ β ¯¯ X
α
||D w|| 2 ≤ ¯¯D w ¯¯ 2
L (U1 ) ≤ C ||f || k−1
H (U1 )
+ ||w|| k H + (U )||hs || k .
H (U ) H (U )
s
β ≡ (α1 , · · ·, αn−1 , αn − 2) .
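Thus $D^\alpha = D^\beta D_n^2$. Restricting 44.23 to $z \in C_c^\infty(U_1)$ and using the density of this set of functions in $L^2(U_1)$, it follows that
\[
-\frac{\partial}{\partial y^s}\left(\alpha^{rs}(y)\frac{\partial w}{\partial y^r}\right) - \frac{\partial h_s}{\partial y^s} = f.
\]
As noted earlier, the condition, 44.2 implies $\alpha^{nn}(y) \ge \delta > 0$ and so
\[
D_n^2 w = -\frac{1}{\alpha^{nn}}\left(\frac{\partial\alpha^{rs}}{\partial y^s}\frac{\partial w}{\partial y^r} + \sum_{r \le n-1}\sum_{s \le n-1}\alpha^{rs}\frac{\partial^2 w}{\partial y^s\,\partial y^r} + \sum_s \alpha^{ns}\frac{\partial^2 w}{\partial y^s\,\partial y^n} + \sum_r \alpha^{rn}\frac{\partial^2 w}{\partial y^n\,\partial y^r} + \frac{\partial h_s}{\partial y^s} + f\right).
\]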
44.2. THE CASE OF BOUNDED OPEN SETS 1229
[Figure: the set $\Omega \cap W$ and its image $R_W(\Omega \cap W)$ under $R_W$, with $u_0 \in U_0$.]
[Figure: the sets $\Omega \cap W_i$, $\mathrm{spt}(\psi_i)$, $R_i(\Omega \cap W_i)$, $\Phi_i^{-1}(\Omega \cap W_i)$ and $\Phi_i^{-1}(\mathrm{spt}(\psi_i))$, together with $U_i$, $V_i$, $\mathbb{R}^{n-1}$ and the maps $R_i$ and $G_i$, with $u_0 \in U_0$.]
Therefore, by Lemma 41.20 on Page 1185, it follows that for t ∈ [m, m + 1),
\[
\Phi_i^* \in \mathcal{L}\left(H^t(W_i \cap \Omega), H^t(V_i)\right).
\]
Assume
\[
a^{ij}(x)\,v_i v_j \ge \delta |v|^2. \tag{44.32}
\]
Lemma 44.4 Let W be one of the sets described in the above definition and let
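$m \ge 1$. Let $W_1 \subseteq \overline{W_1} \subseteq W$ where $W_1$ is an open set. Suppose also that
\[
u \in H^1(\Omega), \quad \alpha^{rs} \in C^{0,1}\left(\overline{\Omega}\right), \quad f \in L^2(\Omega), \quad h_k \in H^1(\Omega),
\]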
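Proof: Let
\[
E \equiv \left\{v \in H^1(\Omega \cap W) : \mathrm{spt}(v) \subseteq W\right\}
\]
$u$ restricted to $W \cap \Omega$ is in $H^1(\Omega \cap W)$ and
\[
\int_{\Omega \cap W} a^{ij}(x)\,u_{,i}\,v_{,j}\,dx + \int_\Omega h_k(x)\,v_{,k}(x)\,dx = \int_\Omega f(x)\,v(x)\,dx \quad \text{for all } v \in E. \tag{44.35}
\]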
Now let Φi (y) = x. For this particular W, denote Φi more simply by Φ, Ui ≡
Φi (Ω ∩ Wi ) by U, and Vi by V. Denoting the coordinates of V by y, and letting
u (x) ≡ w (y) and v (x) ≡ z (y) , it follows that in terms of the new coordinates,
44.35 takes the form
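\[
\begin{aligned}
&\int_U a^{ij}(\Phi(y))\frac{\partial w}{\partial y^r}\frac{\partial y^r}{\partial x_i}\frac{\partial z}{\partial y^s}\frac{\partial y^s}{\partial x_j}\left|\det D\Phi(y)\right| dy
 + \int_U h_k(\Phi(y))\frac{\partial z}{\partial y^l}\frac{\partial y^l}{\partial x_k}\left|\det D\Phi(y)\right| dy \\
&= \int_U f(\Phi(y))\,z(y)\left|\det D\Phi(y)\right| dy
\end{aligned}
\]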
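Let
\[
\alpha^{rs}(y) \equiv a^{ij}(\Phi(y))\frac{\partial y^r}{\partial x_i}\frac{\partial y^s}{\partial x_j}\left|\det D\Phi(y)\right|, \tag{44.36}
\]
\[
\widetilde{h}_l(y) \equiv h_k(\Phi(y))\frac{\partial y^l}{\partial x_k}\left|\det D\Phi(y)\right|, \tag{44.37}
\]
and
\[
\widetilde{f}(y) \equiv \Phi^* f\left|\det D\Phi\right|(y) \equiv f(\Phi(y))\left|\det D\Phi(y)\right|. \tag{44.38}
\]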
Now the function on the right in 44.36 is in $C^{0,1}\left(\overline{U}\right)$. This is because of the assumption that $m \ge 1$ in the statement of the lemma. This function is therefore a finite product of bounded functions in $C^{0,1}\left(\overline{U}\right)$.
The function $\widetilde{h}_l$ defined in 44.37 is in $H^1(U)$ and
\[
\left\|\widetilde{h}_l\right\|_{H^1(U)} \le C\sum_k \|h_k\|_{H^1(\Omega \cap W)}
\]
again because $m \ge 1$.
Finally, the right side of 44.38 is a function in $L^2(U)$ by Lemma 41.20 on Page 1185 and the observation that $\left|\det D\Phi(\cdot)\right| \in C^{0,1}\left(\overline{U}\right)$ which follows from the assumption of the lemma that $m \ge 1$ so $\Phi \in C^{1,1}(\mathbb{R}^n)$. Also
\[
\left\|\widetilde{f}\right\|_{L^2(U)} \le C\|f\|_{L^2(\Omega \cap W)}.
\]
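Therefore,
\[
\begin{aligned}
\|u\|^2_{H^2(W_1 \cap \Omega)}
&\le C\left(\|f\|^2_{L^2(W \cap \Omega)} + \|w\|^2_{H^1(W \cap \Omega)} + \sum_k \|h_k\|^2_{H^1(W \cap \Omega)}\right) \\
&\le C\left(\|f\|^2_{L^2(\Omega)} + \|w\|^2_{H^1(\Omega)} + \sum_k \|h_k\|^2_{H^1(\Omega)}\right).
\end{aligned}
\]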
Theorem 44.5 Let Ω be a bounded open set with C 1,1 boundary as in Definition
44.3, let f ∈ L2 (Ω) , hk ∈ H 1 (Ω), and suppose that for all x ∈ Ω,
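\[
a^{ij}(x)\,v_i v_j \ge \delta |v|^2.
\]
for all $v \in H^1(\Omega)$. Then $u \in H^2(\Omega)$ and for some $C$ independent of $f$, $g$, and $u$,
\[
\|u\|^2_{H^2(\Omega)} \le C\left(\|f\|^2_{L^2(\Omega)} + \|u\|^2_{H^1(\Omega)} + \sum_k \|h_k\|^2_{H^1(\Omega)}\right).
\]
\[
C_2 \subseteq D_2 \subseteq \overline{D_2} \subseteq W_2.
\]
\[
D_0 \equiv \Omega \setminus \cup_{i=1}^l D_i.
\]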
What about the Dirichlet problem? The same differencing procedure as above
yields the following.
Theorem 44.6 Let Ω be a bounded open set with C 1,1 boundary as in Definition
44.3, let f ∈ L2 (Ω) , hk ∈ H 1 (Ω), and suppose that for all x ∈ Ω,
\[
a^{ij}(x)\,v_i v_j \ge \delta |v|^2.
\]
for all $v \in H_0^1(\Omega)$. Then $u \in H^2(\Omega)$ and for some $C$ independent of $f$, $g$, and $u$,
\[
\|u\|^2_{H^2(\Omega)} \le C\left(\|f\|^2_{L^2(\Omega)} + \|u\|^2_{H^1(\Omega)} + \sum_k \|h_k\|^2_{H^1(\Omega)}\right).
\]
Lemma 44.7 Let W be one of the sets described in Definition 44.3 and let m ≥ k.
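Let $W_1 \subseteq \overline{W_1} \subseteq W$ where $W_1$ is an open set. Suppose also that
\[
u \in H^k(\Omega), \quad \alpha^{rs} \in C^{k-1,1}\left(\overline{\Omega}\right), \quad f \in H^{k-1}(\Omega), \quad h_s \in H^k(\Omega),
\]
Proof: Let
\[
E \equiv \left\{v \in H^k(\Omega \cap W) : \mathrm{spt}(v) \subseteq W\right\}
\]
$u$ restricted to $W \cap \Omega$ is in $H^k(\Omega \cap W)$ and
\[
\int_{\Omega \cap W} a^{ij}(x)\,u_{,i}\,v_{,j}\,dx + \int_\Omega h_s(x)\,v_{,s}(x)\,dx = \int_\Omega f(x)\,v(x)\,dx \quad \text{for all } v \in E. \tag{44.42}
\]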
u (x) ≡ w (y) and v (x) ≡ z (y) , it follows that in terms of the new coordinates,
44.35 takes the form
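\[
\begin{aligned}
&\int_U a^{ij}(\Phi(y))\frac{\partial w}{\partial y^r}\frac{\partial y^r}{\partial x_i}\frac{\partial z}{\partial y^s}\frac{\partial y^s}{\partial x_j}\left|\det D\Phi(y)\right| dy
 + \int_U h_k(\Phi(y))\frac{\partial z}{\partial y^l}\frac{\partial y^l}{\partial x_k}\left|\det D\Phi(y)\right| dy \\
&= \int_U f(\Phi(y))\,z(y)\left|\det D\Phi(y)\right| dy
\end{aligned}
\]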
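Let
\[
\alpha^{rs}(y) \equiv a^{ij}(\Phi(y))\frac{\partial y^r}{\partial x_i}\frac{\partial y^s}{\partial x_j}\left|\det D\Phi(y)\right|, \tag{44.43}
\]
\[
\widetilde{h}_l(y) \equiv h_k(\Phi(y))\frac{\partial y^l}{\partial x_k}\left|\det D\Phi(y)\right|, \tag{44.44}
\]
and
\[
\widetilde{f}(y) \equiv \Phi^* f\left|\det D\Phi\right|(y) \equiv f(\Phi(y))\left|\det D\Phi(y)\right|. \tag{44.45}
\]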
Now the function on the right in 44.43 is in $C^{k,1}\left(\overline{U}\right)$. This is because of the assumption that $m \ge k$ in the statement of the lemma. This function is therefore a finite product of bounded functions in $C^{k,1}\left(\overline{U}\right)$.
The function $\widetilde{h}_l$ defined in 44.44 is in $H^k(U)$ and
\[
\left\|\widetilde{h}_l\right\|_{H^k(U)} \le C\sum_s \|h_s\|_{H^k(\Omega \cap W)}
\]
again because $m \ge k$.
Finally, the right side of 44.45 is a function in $H^{k-1}(U)$ by Lemma 41.20 on Page 1185 and the observation that $\left|\det D\Phi(\cdot)\right| \in C^{k-1,1}\left(\overline{U}\right)$ which follows from the assumption of the lemma that $m \ge k$ so $\Phi \in C^{k-1,1}(\mathbb{R}^n)$. Also
\[
\left\|\widetilde{f}\right\|_{H^{k-1}(U)} \le C\|f\|_{H^{k-1}(\Omega \cap W)}.
\]
Proof of the claim: If this is not so, there exist vectors, $v^n$, $|v^n| = 1$, and $y_n \in U$ such that $\alpha^{rs}(y_n)\,v_r^n v_s^n \le \frac{1}{n}$. Taking a subsequence, there exists $y \in \overline{U}$ and $|v| = 1$ such that $\alpha^{rs}(y)\,v_r v_s = 0$ contradicting 44.32.
Therefore,
\[
\begin{aligned}
\|u\|^2_{H^{k+1}(W_1 \cap \Omega)}
&\le C\left(\|f\|^2_{H^{k-1}(W \cap \Omega)} + \|w\|^2_{H^k(W \cap \Omega)} + \sum_s \|h_s\|^2_{H^k(W \cap \Omega)}\right) \\
&\le C\left(\|f\|^2_{H^{k-1}(\Omega)} + \|w\|^2_{H^k(\Omega)} + \sum_s \|h_s\|^2_{H^k(\Omega)}\right).
\end{aligned}
\]
Theorem 44.8 Let Ω be a bounded open set with C k,1 boundary as in Definition
44.3, let f ∈ H k−1 (Ω) , hs ∈ H k (Ω), and suppose that for all x ∈ Ω,
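\[
a^{ij}(x)\,v_i v_j \ge \delta |v|^2.
\]
for all $v \in H^k(\Omega)$. Then $u \in H^{k+1}(\Omega)$ and for some $C$ independent of $f$, $g$, and $u$,
\[
\|u\|^2_{H^{k+1}(\Omega)} \le C\left(\|f\|^2_{H^{k-1}(\Omega)} + \|u\|^2_{H^k(\Omega)} + \sum_s \|h_s\|^2_{H^k(\Omega)}\right).
\]
\[
C_2 \subseteq D_2 \subseteq \overline{D_2} \subseteq W_2.
\]
\[
D_0 \equiv \Omega \setminus \cup_{i=1}^l D_i.
\]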
Now let $K_n \subseteq E \subseteq V_n$ with $m(V_n \setminus K_n) < 2^{-n}$. Then from the above,
\[
\left|\int_a^b X_E(t)\,g(t)\,dt\right| \le \int_a^b X_{V_n \setminus K_n}(t)\,\|g(t)\|\,dt
\]
and the integrand of the last integral converges to 0 a.e. as $n \to \infty$ because $\sum_n m(V_n \setminus K_n) < \infty$. By the dominated convergence theorem, this last integral
1240 INTERPOLATION IN BANACH SPACE
Since the endpoints have measure zero, it also follows that for any measurable E,
the above equation holds.
Now g ∈ L1 ([a, b] ; X) and so it is measurable. Therefore, g ([a, b]) is separable.
Let $D$ be a countable dense subset and let $E$ denote the set of linear combinations of the form $\sum_i a_i d_i$ where $a_i$ is a rational point of $\mathbb{F}$ and $d_i \in D$. Thus $E$ is countable. Denote by $Y$ the closure of $E$ in $X$. Thus $Y$ is a separable closed subspace of $X$ which contains all the values of $g$.
Now let $S_n \equiv g^{-1}\left(B(y_n, \|y_n\|/2)\right)$ where $E = \{y_n\}_{n=1}^\infty$. Therefore, $\cup_n S_n = g^{-1}(X \setminus \{0\})$. This follows because if $x \in Y$ and $x \ne 0$, then in $B\left(x, \frac{\|x\|}{4}\right)$ there is a point of $E$, $y_n$. Therefore, $\|y_n\| > \frac{3}{4}\|x\|$ and so $\frac{\|y_n\|}{2} > \frac{3\|x\|}{8} > \frac{\|x\|}{4}$ so $x \in B(y_n, \|y_n\|/2)$. It follows that if each $S_n$ has measure zero, then $g(t) = 0$ for a.e. $t$. Suppose then that for some $n$, the set, $S_n$ has positive measure. Then from what was shown above,
\[
\begin{aligned}
\|y_n\| &= \left\|\frac{1}{m(S_n)}\int_{S_n} g(t)\,dt - y_n\right\| = \left\|\frac{1}{m(S_n)}\int_{S_n} \left(g(t) - y_n\right)dt\right\| \\
&\le \frac{1}{m(S_n)}\int_{S_n} \|g(t) - y_n\|\,dt \le \frac{1}{m(S_n)}\int_{S_n} \|y_n\|/2\,dt = \|y_n\|/2
\end{aligned}
\]
[2a − b, 2b − a] = [a − (b − a) , b + (b − a)]
as follows.
\[
\overline{f}(t) \equiv
\begin{cases}
f(t) & \text{if } t \in [a, b] \\
f(2a - t) & \text{if } t \in [2a - b, a] \\
f(2b - t) & \text{if } t \in [b, 2b - a]
\end{cases}
\]
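The extension by reflection just defined can be written out directly. The following is a small sketch for a scalar or vector valued function given as a Python callable; the name `reflect_extend` is hypothetical and is used only for illustration.

```python
def reflect_extend(f, a, b):
    """Extend f from [a, b] to [2a - b, 2b - a] by reflection across a and b."""
    def f_bar(t):
        if a <= t <= b:
            return f(t)            # original interval
        if 2 * a - b <= t < a:
            return f(2 * a - t)    # reflect across the left endpoint a
        if b < t <= 2 * b - a:
            return f(2 * b - t)    # reflect across the right endpoint b
        raise ValueError("t lies outside [2a - b, 2b - a]")
    return f_bar

# Example: f(t) = t^2 on [1, 3], extended to [-1, 5].
f_bar = reflect_extend(lambda t: t * t, 1.0, 3.0)
```

The reflected arguments $2a - t$ and $2b - t$ always land back in $[a, b]$, so the extension only ever evaluates f where it was originally defined, and the two pieces agree with f at the endpoints a and b.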
Definition 45.3 Also if $f \in L^p(a, b; X)$ and $h > 0$, define for $t \in [a, b]$, $f_h(t) \equiv \overline{f}(t - h)$ for all $h < b - a$. Thus the map $f \to f_h$ is continuous and linear on $L^p(a, b; X)$. It is continuous because
\[
\begin{aligned}
\int_a^b \|f_h(t)\|^p\,dt &= \int_a^{a+h} \|f(2a - t + h)\|^p\,dt + \int_a^{b-h} \|f(t)\|^p\,dt \\
&= \int_a^{a+h} \|f(t)\|^p\,dt + \int_a^{b-h} \|f(t)\|^p\,dt \le 2\|f\|_p^p.
\end{aligned}
\]
Lemma 45.4 Let $\overline{f}$ be as defined in Definition 45.2. Then for $f \in L^p(a, b; X)$ for $p \in [1, \infty)$,
\[
\lim_{\delta \to 0}\int_a^b \left\|\overline{f}(t - \delta) - \overline{f}(t)\right\|_X^p\,dt = 0.
\]
Then $f' \in L^1(a, b; X)$ if there exists $h \in L^1(a, b; X)$ such that for all $\phi \in C_c^\infty(a, b)$,
\[
f'(\phi) = \int_a^b h(t)\,\phi(t)\,dt.
\]
Then $f'$ is defined to equal $h$. Here $f$ and $f'$ are considered as vector valued distributions in the same way as was done for scalar valued functions.
Proof: Suppose both $h$ and $g$ work in the definition. Then for all $\phi \in C_c^\infty(a, b)$,
\[
\int_a^b \left(h(t) - g(t)\right)\phi(t)\,dt = 0.
\]
Lemma 45.7 Suppose $f, f' \in L^1(a, b; X)$. Then if $[c, d] \subseteq [a, b]$, it follows that $\left(f|_{[c,d]}\right)' = f'|_{[c,d]}$. This notation means the restriction to $[c, d]$.
Recall that in the case of scalar valued functions, if you had both $f$ and its weak derivative, $f'$ in $L^1(a, b)$, then you were able to conclude that $f$ is almost everywhere equal to a continuous function, still denoted by $f$ and
\[
f(t) = f(a) + \int_a^t f'(s)\,ds.
\]
In particular, you can define $f(a)$ to be the initial value of this continuous function. It turns out that an identical theorem holds in this case. To begin with here is the same sort of lemma which was used earlier for the case of scalar valued functions. It says that if $f' = 0$ where the derivative is taken in the sense of $X$ valued distributions, then $f$ equals a constant.
Theorem 45.9 Suppose $f, f'$ both are in $L^1(a, b; X)$ where the derivative is taken in the sense of $X$ valued distributions. Then there exists a unique point of $X$, denoted by $f(a)$ such that the following formula holds a.e. $t$.
\[
f(t) = f(a) + \int_a^t f'(s)\,ds
\]
Proof:
\[
\int_a^b \left(f(t) - \int_a^t f'(s)\,ds\right)\phi'(t)\,dt = \int_a^b f(t)\,\phi'(t)\,dt - \int_a^b\int_a^t f'(s)\,\phi'(t)\,ds\,dt.
\]
Now consider $\int_a^b\int_a^t f'(s)\,\phi'(t)\,ds\,dt$. Let $\Lambda \in X'$. Then it is routine from approximating $f'$ with simple functions to verify
\[
\Lambda\left(\int_a^b\int_a^t f'(s)\,\phi'(t)\,ds\,dt\right) = \int_a^b\int_a^t \Lambda\left(f'(s)\right)\phi'(t)\,ds\,dt.
\]
Therefore,
\[
\begin{aligned}
\int_a^b \left(f(t) - \int_a^t f'(s)\,ds\right)\phi'(t)\,dt
&= \int_a^b f(t)\,\phi'(t)\,dt - \int_a^b\int_s^b f'(s)\,\phi'(t)\,dt\,ds \\
&= \int_a^b f(t)\,\phi'(t)\,dt + \int_a^b f'(s)\,\phi(s)\,ds = 0.
\end{aligned}
\]
Therefore, by Lemma 45.8, there exists a constant, denoted as $f(a)$ such that
\[
f(t) - \int_a^t f'(s)\,ds = f(a)
\]
\[
\int_a^b f(t)\,\phi'(t)\,dt = f(b)\,\phi(b) - f(a)\,\phi(a) - \int_a^b f'(t)\,\phi(t)\,dt.
\]
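\[
\begin{aligned}
\int_a^b f(t)\,\phi'(t)\,dt
&= \int_a^b \left(f(a) + \int_a^t f'(s)\,ds\right)\phi'(t)\,dt \\
&= f(a)\left(\phi(b) - \phi(a)\right) + \int_a^b\int_a^t f'(s)\,ds\,\phi'(t)\,dt \\
&= f(a)\left(\phi(b) - \phi(a)\right) + \int_a^b\int_s^b f'(s)\,\phi'(t)\,dt\,ds \\
&= f(a)\left(\phi(b) - \phi(a)\right) + \int_a^b f'(s)\left(\phi(b) - \phi(s)\right)ds \\
&= f(a)\left(\phi(b) - \phi(a)\right) - \int_a^b f'(s)\,\phi(s)\,ds + \left(f(b) - f(a)\right)\phi(b) \\
&= f(b)\,\phi(b) - f(a)\,\phi(a) - \int_a^b f'(s)\,\phi(s)\,ds.
\end{aligned}
\]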
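\[
\overline{f}'(t) \equiv
\begin{cases}
f'(t) & \text{if } t \in [a, b] \\
-f'(2a - t) & \text{if } t \in [2a - b, a] \\
-f'(2b - t) & \text{if } t \in [b, 2b - a]
\end{cases}
\tag{45.1}
\]
where $\overline{f}'(t)$ is given in 45.1. This proves the lemma.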
Definition 45.12 Let $V$ be a Banach space and let $H$ be a Hilbert space. (Typically $H = L^2(\Omega)$.) Suppose $V \subseteq H$ is dense in $H$ meaning that the closure in $H$ of $V$ gives $H$. Then it is often the case that $H$ is identified with its dual space, and then because of the density of $V$ in $H$, it is possible to write
\[
V \subseteq H = H' \subseteq V'
\]
When this is done, $H$ is called a pivot space. Another notation which is often used is $\langle f, g\rangle$ to denote $f(g)$ for $f \in V'$ and $g \in V$. This may also be written as $\langle f, g\rangle_{V',V}$.
Theorem 45.13 Let $V$ and $H$ be a Banach space and Hilbert space as described in Definition 45.12. Suppose $f \in L^p(0, T; V)$ and $f' \in L^{p'}(0, T; V')$. Then $f$ is
Here $f'$ is being taken in the sense of $V'$ valued distributions and $\frac{1}{p} + \frac{1}{p'} = 1$ and $p \ge 2$.
45.1. AN ASSORTMENT OF IMPORTANT THEOREMS 1247
Proof: Let $\Psi \in C_c^\infty(-T, 2T)$ satisfy $\Psi(t) = 1$ if $t \in [-T/2, 3T/2]$ and $\Psi(t) \ge 0$. For $t \in \mathbb{R}$, define
\[
\widehat{f}(t) \equiv
\begin{cases}
\overline{f}(t)\,\Psi(t) & \text{if } t \in [-T, 2T] \\
0 & \text{if } t \notin [-T, 2T]
\end{cases}
\]
and
\[
f_n(t) \equiv \int_{-1/n}^{1/n} \widehat{f}(t - s)\,\phi_n(s)\,ds \tag{45.6}
\]
provided $n$ is large enough. This follows from Lemma 45.4 about continuity of translation. Since $\varepsilon > 0$ is arbitrary, it follows $f_n \to \widehat{f}$ in $L^p(\mathbb{R}; V)$. Similarly, $f_n \to \widehat{f}$ in $L^2(\mathbb{R}; H)$. This follows because $p \ge 2$ and the norm in $V$ and norm in $H$ are related by $|x|_H \le C\|x\|_V$ for some constant, $C$. Now
\[
\widehat{f}(t) =
\begin{cases}
\Psi(t)\,f(t) & \text{if } t \in [0, T], \\
\Psi(t)\,f(2T - t) & \text{if } t \in [T, 2T], \\
\Psi(t)\,f(-t) & \text{if } t \in [-T, 0], \\
0 & \text{if } t \notin [-T, 2T].
\end{cases}
\]
An easy modification of the argument of Lemma 45.11 yields
\[
\widehat{f}'(t) =
\begin{cases}
\Psi'(t)\,f(t) + \Psi(t)\,f'(t) & \text{if } t \in [0, T], \\
\Psi'(t)\,f(2T - t) - \Psi(t)\,f'(2T - t) & \text{if } t \in [T, 2T], \\
\Psi'(t)\,f(-t) - \Psi(t)\,f'(-t) & \text{if } t \in [-T, 0], \\
0 & \text{if } t \notin [-T, 2T].
\end{cases}
\]
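Recall
\[
f_n(t) = \int_{-1/n}^{1/n} \widehat{f}(t - s)\,\phi_n(s)\,ds = \int_{\mathbb{R}} \widehat{f}(t - s)\,\phi_n(s)\,ds = \int_{\mathbb{R}} \widehat{f}(s)\,\phi_n(t - s)\,ds.
\]
Therefore,
\[
\begin{aligned}
f_n'(t) &= \int_{\mathbb{R}} \widehat{f}(s)\,\phi_n'(t - s)\,ds = \int_{-T - \frac{1}{n}}^{2T + \frac{1}{n}} \widehat{f}(s)\,\phi_n'(t - s)\,ds \\
&= \int_{-T - \frac{1}{n}}^{2T + \frac{1}{n}} \widehat{f}'(s)\,\phi_n(t - s)\,ds = \int_{\mathbb{R}} \widehat{f}'(s)\,\phi_n(t - s)\,ds \\
&= \int_{\mathbb{R}} \widehat{f}'(t - s)\,\phi_n(s)\,ds = \int_{-1/n}^{1/n} \widehat{f}'(t - s)\,\phi_n(s)\,ds
\end{aligned}
\]
and it follows from the first line above that $f_n'$ is continuous with values in $V$ for all $t \in \mathbb{R}$. Also note that both $f_n'$ and $f_n$ equal zero if $t \notin [-T, 2T]$ whenever $n$ is large enough. Exactly similar reasoning to the above shows that $f_n' \to \widehat{f}'$ in $L^{p'}(\mathbb{R}; V')$.
Now let $\phi \in C_c^\infty(0, T)$.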
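\[
\begin{aligned}
\int_{\mathbb{R}} |f_n(t)|_H^2\,\phi'(t)\,dt &= \int_{\mathbb{R}} \left(f_n(t), f_n(t)\right)_H \phi'(t)\,dt \\
&= -\int_{\mathbb{R}} 2\left(f_n'(t), f_n(t)\right)\phi(t)\,dt = -\int_{\mathbb{R}} 2\left\langle f_n'(t), f_n(t)\right\rangle\phi(t)\,dt
\end{aligned}
\tag{45.7}
\]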
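Now
\[
\begin{aligned}
&\left|\int_{\mathbb{R}} \left\langle f_n'(t), f_n(t)\right\rangle\phi(t)\,dt - \int_{\mathbb{R}} \left\langle f'(t), f(t)\right\rangle\phi(t)\,dt\right| \\
&\le \int_{\mathbb{R}} \left(\left|\left\langle f_n'(t) - f'(t), f_n(t)\right\rangle\right| + \left|\left\langle f'(t), f_n(t) - f(t)\right\rangle\right|\right)\phi(t)\,dt.
\end{aligned}
\]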
From the first part of this proof which showed that $f_n \to \widehat{f}$ in $L^p(\mathbb{R}; V)$ and $f_n' \to \widehat{f}'$ in $L^{p'}(\mathbb{R}; V')$, an application of Holder’s inequality shows the above converges to 0 as $n \to \infty$. Therefore, passing to the limit as $n \to \infty$ in 45.7,
\[
\int_{\mathbb{R}} \left|\widehat{f}(t)\right|_H^2 \phi'(t)\,dt = -\int_{\mathbb{R}} 2\left\langle \widehat{f}'(t), \widehat{f}(t)\right\rangle\phi(t)\,dt
\]
which shows $t \to \left|\widehat{f}(t)\right|_H^2$ equals a continuous function a.e. and it also has a weak derivative equal to $2\left\langle \widehat{f}', \widehat{f}\right\rangle$.
It remains to verify that $\widehat{f}$ is continuous on $[0, T]$. Of course $\widehat{f} = f$ on this interval. Let $N$ be large enough that $f_n(-T) = 0$ for all $n > N$. Then for $m, n > N$ and $t \in [-T, 2T]$
\[
\begin{aligned}
|f_n(t) - f_m(t)|_H^2 &= 2\int_{-T}^t \left(f_n'(s) - f_m'(s), f_n(s) - f_m(s)\right)ds \\
&= 2\int_{-T}^t \left\langle f_n'(s) - f_m'(s), f_n(s) - f_m(s)\right\rangle_{V',V} ds \\
&\le 2\int_{\mathbb{R}} \|f_n'(s) - f_m'(s)\|_{V'}\,\|f_n(s) - f_m(s)\|_V\,ds \\
&\le 2\,\|f_n' - f_m'\|_{L^{p'}(\mathbb{R}; V')}\,\|f_n - f_m\|_{L^p(\mathbb{R}; V)}
\end{aligned}
\]
which shows from the above that $\{f_n\}$ is uniformly Cauchy on $[-T, 2T]$ with values in $H$. Therefore, there exists $g$ a continuous function defined on $[-T, 2T]$ having values in $H$ such that
Now $g = f$ a.e. and $g$ is continuous with values in $H$ hence continuous with values in $V'$ and so
\[
g(t) = f(0) + \int_0^t f'(s)\,ds \quad \text{in } V'
\]
Definition 45.14 Let E, W be Banach spaces such that E ⊆ W and the injection
map from E into W is continuous. The injection map is said to be compact if every
bounded set in E has compact closure in W. In other words, if a sequence is bounded
in E it has a convergent subsequence converging in W . This is also referred to by
saying that bounded sets in E are precompact in W.
Proof: Suppose not. Then there exists ε > 0 and for each n ∈ N, un such that
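Theorem 45.17 Let $q > 1$ and let $E \subseteq W \subseteq X$ where the injection map is continuous from $W$ to $X$ and compact from $E$ to $W$. Let $S$ be defined by
\[
\left\{u \text{ such that } \|u(t)\|_E + \|u'\|_{L^q([a,b];X)} \le R \text{ for all } t \in [a, b]\right\}.
\]
Thus $S \subseteq C([a, b]; X)$. Let $\varepsilon > 0$ be given. Then by Theorem 45.15 there exists a constant, $C_\varepsilon$ such that for all $u \in W$
\[
\|u\|_W \le \frac{\varepsilon}{6R}\|u\|_E + C_\varepsilon\|u\|_X.
\]
Therefore, for all $u \in S$,
\[
\begin{aligned}
\|u(t) - u(s)\|_W &\le \frac{\varepsilon}{6R}\|u(t) - u(s)\|_E + C_\varepsilon\|u(t) - u(s)\|_X \\
&\le \frac{\varepsilon}{3} + C_\varepsilon\left\|\int_s^t u'(r)\,dr\right\|_X \\
&\le \frac{\varepsilon}{3} + C_\varepsilon\int_s^t \|u'(r)\|_X\,dr \le \frac{\varepsilon}{3} + C_\varepsilon R\,|t - s|^{1/q}.
\end{aligned}
\tag{45.8}
\]
Since $\varepsilon$ is arbitrary, it follows $u \in C([a, b]; W)$.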
Let $D = \mathbb{Q} \cap [a, b]$ so $D$ is a countable dense subset of $[a, b]$. Let $D = \{t_n\}_{n=1}^\infty$. By compactness of the embedding of $E$ into $W$, there exists a subsequence $u_{(n,1)}$ such that as $n \to \infty$, $u_{(n,1)}(t_1)$ converges to a point in $W$. Now take a subsequence of this, called $(n, 2)$ such that as $n \to \infty$, $u_{(n,2)}(t_2)$ converges to a point in $W$. It follows that $u_{(n,2)}(t_1)$ also converges to a point of $W$. Continue this way. Now consider the diagonal sequence, $u_k \equiv u_{(k,k)}$. This sequence is a subsequence of $u_{(n,l)}$ whenever $k > l$. Therefore, $u_k(t_j)$ converges for all $t_j \in D$.
Claim: Let $\{u_k\}$ be as just defined, converging at every point of $D$. Then $\{u_k\}$ converges at every point of $[a, b]$.
Proof of claim: Let $\varepsilon > 0$ be given. Let $t \in [a, b]$. Pick $t_m \in D \cap [a, b]$ such that in 45.8 $C_\varepsilon R\,|t - t_m| < \varepsilon/3$. Then there exists $N$ such that if $l, n > N$, then $\|u_l(t_m) - u_n(t_m)\|_X < \varepsilon/3$. It follows that for $l, n > N$,
\[
\begin{aligned}
\|u_l(t) - u_n(t)\|_X &\le \|u_l(t) - u_l(t_m)\| + \|u_l(t_m) - u_n(t_m)\| + \|u_n(t_m) - u_n(t)\| \\
&\le \frac{2\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{2\varepsilon}{3} < 2\varepsilon
\end{aligned}
\]
Since $\varepsilon$ was arbitrary, this shows $\{u_k(t)\}_{k=1}^\infty$ is a Cauchy sequence. Since $W$ is complete, this shows this sequence converges.
Now for $t \in [a, b]$, it was just shown that if $\varepsilon > 0$ there exists $N_t$ such that if $n, m > N_t$, then
\[
\|u_n(t) - u_m(t)\| < \frac{\varepsilon}{3}.
\]
Now let $s \ne t$. Then
\[
\|u_n(s) - u_m(s)\| \le \|u_n(s) - u_n(t)\| + \|u_n(t) - u_m(t)\| + \|u_m(t) - u_m(s)\|
\]
From 45.8
\[
\|u_n(s) - u_m(s)\| \le 2\left(\frac{\varepsilon}{3} + C_\varepsilon R\,|t - s|^{1/q}\right) + \|u_n(t) - u_m(t)\|
\]
and so it follows that if $\delta$ is sufficiently small and $s \in B(t, \delta)$, then when $n, m > N_t$
||u (t) − u (s)||W ≤ ||u (t) − un (t)||W + ||un (t) − un (s)||W + ||un (s) − u (s)||W
(45.9)
Let N be in the above claim and fix n > N. Then
and similarly, ||un (s) − u (s)||W ≤ ε. Then if |t − s| is small enough, 45.8 shows the
middle term in 45.9 is also smaller than ε. Therefore, if |t − s| is small enough,
Thus u is continuous. Finally, let N be as in the above claim. Then letting m, n >
N, it follows that for all t ∈ [a, b] ,
The idea is to show that $\overline{u}_n$ approximates $u_n$ well and then to argue that a subsequence of the $\{\overline{u}_n\}$ is a Cauchy sequence yielding a contradiction to 45.10.
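Therefore,
\[
u_n(t) - \overline{u}_n(t) = \sum_{i=1}^k \frac{1}{t_i - t_{i-1}}\int_{t_{i-1}}^{t_i} \left(u_n(t) - u_n(s)\right)ds\,X_{[t_{i-1},t_i)}(t).
\]
and so
\[
\begin{aligned}
\int_a^b \left\|u_n(t) - \overline{u}_n(t)\right\|_W^p\,dt
&\le \int_a^b \sum_{i=1}^k \frac{1}{t_i - t_{i-1}}\int_{t_{i-1}}^{t_i} \|u_n(t) - u_n(s)\|_W^p\,ds\,X_{[t_{i-1},t_i)}(t)\,dt \\
&= \sum_{i=1}^k \frac{1}{t_i - t_{i-1}}\int_{t_{i-1}}^{t_i}\int_{t_{i-1}}^{t_i} \|u_n(t) - u_n(s)\|_W^p\,ds\,dt.
\end{aligned}
\tag{45.11}
\]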
From Theorems 45.15 and 45.9, if $\varepsilon > 0$, there exists $C_\varepsilon$ such that
\[
\begin{aligned}
\|u_n(t) - u_n(s)\|_W^p &\le \varepsilon\,\|u_n(t) - u_n(s)\|_E^p + C_\varepsilon\,\|u_n(t) - u_n(s)\|_X^p \\
&\le 2^{p-1}\varepsilon\left(\|u_n(t)\|^p + \|u_n(s)\|^p\right) + C_\varepsilon\left\|\int_s^t u_n'(r)\,dr\right\|_X^p \\
&\le 2^{p-1}\varepsilon\left(\|u_n(t)\|^p + \|u_n(s)\|^p\right) + C_\varepsilon\left(\int_s^t \|u_n'(r)\|_X\,dr\right)^p \\
&\le 2^{p-1}\varepsilon\left(\|u_n(t)\|^p + \|u_n(s)\|^p\right) + C_\varepsilon\left(\left(\int_s^t \|u_n'(r)\|_X^q\,dr\right)^{1/q}|t - s|^{1/q'}\right)^p \\
&= 2^{p-1}\varepsilon\left(\|u_n(t)\|^p + \|u_n(s)\|^p\right) + C_\varepsilon R^{p/q}\,|t - s|^{p/q'}.
\end{aligned}
\]
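This is substituted in to 45.11 to obtain
\[
\begin{aligned}
\int_a^b \left\|u_n(t) - \overline{u}_n(t)\right\|_W^p\,dt
&\le \sum_{i=1}^k \frac{1}{t_i - t_{i-1}}\int_{t_{i-1}}^{t_i}\int_{t_{i-1}}^{t_i} \left(2^{p-1}\varepsilon\left(\|u_n(t)\|^p + \|u_n(s)\|^p\right) + C_\varepsilon R^{p/q}\,|t - s|^{p/q'}\right)ds\,dt \\
&= \sum_{i=1}^k \left(2^p\varepsilon\int_{t_{i-1}}^{t_i} \|u_n(t)\|_W^p\,dt + C_\varepsilon R^{p/q}\frac{1}{t_i - t_{i-1}}\int_{t_{i-1}}^{t_i}\int_{t_{i-1}}^{t_i} |t - s|^{p/q'}\,ds\,dt\right) \\
&\le 2^p\varepsilon\int_a^b \|u_n(t)\|^p\,dt + C_\varepsilon R^{p/q}\sum_{i=1}^k \frac{1}{t_i - t_{i-1}}\int_{t_{i-1}}^{t_i}\int_{t_{i-1}}^{t_i} (t_i - t_{i-1})^{p/q'}\,ds\,dt \\
&= 2^p\varepsilon\int_a^b \|u_n(t)\|^p\,dt + C_\varepsilon R^{p/q}\sum_{i=1}^k \frac{1}{t_i - t_{i-1}}(t_i - t_{i-1})^{p/q'}(t_i - t_{i-1})^2 \\
&\le 2^p\varepsilon R^p + C_\varepsilon R^{p/q}\sum_{i=1}^k (t_i - t_{i-1})^{1 + p/q'} = 2^p\varepsilon R^p + C_\varepsilon R^{p/q}\,k\left(\frac{T}{k}\right)^{1 + p/q'}.
\end{aligned}
\]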
Taking $\varepsilon$ so small that $2^p\varepsilon R^p < \eta^p/8^p$ and then choosing $k$ sufficiently large, it follows
\[
\left\|u_n - \overline{u}_n\right\|_{L^p([a,b];W)} < \frac{\eta}{4}.
\]
Now use compactness of the embedding of $E$ into $W$ to obtain a subsequence such that $\{\overline{u}_n\}$ is Cauchy in $L^p(a, b; W)$ and use this to contradict 45.10. Suppose $\overline{u}_n(t) = \sum_{i=1}^k u_i^n X_{[t_{i-1},t_i)}(t)$. Thus
\[
\left\|\overline{u}_n(t)\right\|_E = \sum_{i=1}^k \|u_i^n\|_E\,X_{[t_{i-1},t_i)}(t)
\]
and so
\[
R \ge \int_a^b \left\|\overline{u}_n(t)\right\|_E^p\,dt = \frac{T}{k}\sum_{i=1}^k \|u_i^n\|_E^p
\]
Therefore, the $\{u_i^n\}$ are all bounded. It follows that after taking subsequences $k$ times there exists a subsequence $\{u_{n_k}\}$ such that $\overline{u}_{n_k}$ is a Cauchy sequence in $L^p(a, b; W)$. You simply get a subsequence such that $u_i^{n_k}$ is a Cauchy sequence in $W$ for each $i$. Then denoting this subsequence by $n$,
\[
\begin{aligned}
\|u_n - u_m\|_{L^p(a,b;W)} &\le \left\|u_n - \overline{u}_n\right\|_{L^p(a,b;W)} + \left\|\overline{u}_n - \overline{u}_m\right\|_{L^p(a,b;W)} + \left\|\overline{u}_m - u_m\right\|_{L^p(a,b;W)} \\
&\le \frac{\eta}{4} + \left\|\overline{u}_n - \overline{u}_m\right\|_{L^p(a,b;W)} + \frac{\eta}{4} < \eta
\end{aligned}
\]
provided $m, n$ are large enough, contradicting 45.10. This proves the theorem.
Lemma 45.20 (A0 + A1 , K (t, ·)) is a Banach space and all the norms, K (t, ·) are
equivalent.
Proof: First, why is K (t, ·) a norm? It is clear that K (t, a) ≥ 0 and that if
a = 0 then K (t, a) = 0. Is this the only way this can happen? Suppose K (t, a) = 0.
Then there exist $a_{0n} \in A_0$ and $a_{1n} \in A_1$ such that $\|a_{0n}\|_0 \to 0$, $\|a_{1n}\|_1 \to 0$, and $a = a_{0n} + a_{1n}$. Since the embedding of $A_i$ into $X$ is continuous and since $X$ is a topological vector space, it follows
\[
a = a_{0n} + a_{1n} \to 0
\]
and so $a = 0$.
Let $\alpha$ be a nonzero scalar. Then
It remains to verify the triangle inequality. Let ε > 0 be given. Then there exist
a0 , a1 , b0 , and b1 in A0 , A1 , A0 , and A1 respectively such that a0 +a1 = a, b0 +b1 = b
and
ε + K (t, a) + K (t, b) > ||a0 ||0 + t ||a1 ||1 + ||b0 ||0 + t ||b1 ||1
≥ ||a0 + b0 ||0 + t ||b1 + a1 ||1 ≥ K (t, a + b) .
This has shown that K (t, ·) is at least a norm. Are all these norms equivalent?
If 0 < s < t then it is clear that K (t, a) ≥ K (s, a) . To show there exists a constant,
C such that CK (s, a) ≥ K (t, a) for all a,
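\[
\begin{aligned}
\frac{t}{s}K(s, a) &\equiv \frac{t}{s}\inf\left\{\|a_0\|_0 + s\,\|a_1\|_1 : a_0 + a_1 = a\right\} \\
&= \inf\left\{\frac{t}{s}\|a_0\|_0 + s\,\frac{t}{s}\|a_1\|_1 : a_0 + a_1 = a\right\} \\
&= \inf\left\{\frac{t}{s}\|a_0\|_0 + t\,\|a_1\|_1 : a_0 + a_1 = a\right\} \\
&\ge \inf\left\{\|a_0\|_0 + t\,\|a_1\|_1 : a_0 + a_1 = a\right\} = K(t, a).
\end{aligned}
\]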
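The two-sided comparison $K(s, a) \le K(t, a) \le \frac{t}{s}K(s, a)$ for $0 < s < t$ can be observed on a toy example. The sketch below assumes a hypothetical couple, not taken from the text: $A_0 = A_1 = \mathbb{R}$ with norms $\|x\|_0 = c_0|x|$ and $\|x\|_1 = c_1|x|$. For this couple the infimum defining K is attained at $a_0 = 0$ or $a_0 = a$, so $K(t, a) = \min(c_0, t c_1)\,|a|$; a brute-force search over splittings is included as a cross-check.

```python
def k_closed_form(t, a, c0, c1):
    """K(t, a) for the toy couple ||x||_0 = c0*|x|, ||x||_1 = c1*|x| on the real line."""
    return min(c0, t * c1) * abs(a)

def k_brute_force(t, a, c0, c1, m=200):
    """Approximate the infimum over splittings a = a0 + a1 on a grid of a0 values."""
    best = float("inf")
    for j in range(-m, 2 * m + 1):
        a0 = a * j / m          # grid contains a0 = 0 (j = 0) and a0 = a (j = m)
        a1 = a - a0
        best = min(best, c0 * abs(a0) + t * c1 * abs(a1))
    return best

c0, c1, a = 2.0, 0.5, 3.0
s, t = 1.0, 8.0
ks = k_closed_form(s, a, c0, c1)
kt = k_closed_form(t, a, c0, c1)
```

With these sample values `ks` is 1.5 and `kt` is 6.0, so the chain `ks <= kt <= (t / s) * ks` holds with room to spare. The upper bound is an equality only when the whole of a is best placed in $A_1$ for both parameters.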
Finally, it is required to verify that (A0 + A1 , K (t, ·)) is a Banach space. Since
all these norms are equivalent, it suffices to only consider the norm, K (1, ·). Let
$\{a_{0n} + a_{1n}\}_{n=1}^{\infty}$ be a Cauchy sequence in $A_0 + A_1$. Then for m, n large enough,
a0n + xn → a 0 ∈ A0
a1n + yn → a 1 ∈ A1 .
Then
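Definition 45.21 Let 1 ≤ q < ∞, 0 < θ < 1. Define $(A_0,A_1)_{\theta,q}$ to be those elements of $A_0 + A_1$, a, such that
\[ \|a\|_{\theta,q} \equiv \left[\int_0^\infty \left(t^{-\theta} K(t,a,A_0,A_1)\right)^q \frac{dt}{t}\right]^{1/q} < \infty. \]
If $a \in A_0 \cap A_1$, then
\[ \|a\|_{\theta,q} \le \left(\frac{1}{q\theta(1-\theta)}\right)^{1/q} \|a\|_1^{\theta}\, \|a\|_0^{1-\theta}. \tag{45.14} \]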
A0 ∩ A1 = A0 ⊆ (A0 , A1 )θ,q ⊆ A1 = A0 + A1 .
Also, if bounded sets in A0 have compact closures in A1 then the same is true if A1
is replaced with (A0 , A1 )θ,q . Finally, if
and if M is its norm, and M0 and M1 are the norms of T as a map in L (A0 , B0 )
and L (A1 , B1 ) respectively, then
\[ = \|a\|_1^q \frac{r^{q-q\theta}}{q - q\theta} + \|a\|_0^q \frac{r^{-\theta q}}{\theta q} < \infty \tag{45.19} \]
Which shows the first inclusion of 45.12. The above holds for all r > 0 and in
particular for the value of r which minimizes the expression on the right in 45.19,
r = ||a||0 / ||a||1 . Therefore, doing some calculus,
\[ \|a\|_{\theta,q}^q \le \frac{1}{\theta q (1-\theta)} \|a\|_0^{q(1-\theta)} \|a\|_1^{q\theta} \]
which shows 45.14. This also verifies that the inclusion map is continuous in 45.12.
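The calculus step can be written out as follows. This is only a sketch, assuming $a \ne 0$ and writing $F(r)$ (a name introduced here for illustration) for the right side of 45.19:
\[ F(r) = \|a\|_1^q \frac{r^{q-q\theta}}{q-q\theta} + \|a\|_0^q \frac{r^{-\theta q}}{\theta q}, \qquad F'(r) = \|a\|_1^q r^{q-q\theta-1} - \|a\|_0^q r^{-\theta q - 1}. \]
Setting $F'(r) = 0$ gives $r^q = \|a\|_0^q / \|a\|_1^q$, that is, $r = \|a\|_0/\|a\|_1$. Substituting this value,
\[ F\!\left(\frac{\|a\|_0}{\|a\|_1}\right) = \|a\|_0^{q(1-\theta)} \|a\|_1^{q\theta}\left(\frac{1}{q(1-\theta)} + \frac{1}{q\theta}\right) = \frac{1}{q\theta(1-\theta)} \|a\|_0^{q(1-\theta)} \|a\|_1^{q\theta}, \]
which is the constant appearing in 45.14 after taking $q$-th roots.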
Now consider the second inclusion in 45.12. The inclusion is obvious because
(A0 , A1 )θ,q is given to be a subset of A0 +A1 . It remains to verify the inclusion map is
continuous. Therefore, suppose an → 0 in (A0 , A1 )θ,q . Since an → 0 in (A0 , A1 )θ,q ,
it follows the function, t → t−θ K (t, an ) converges to zero in Lq (0, ∞) with respect
to the measure, dt/t. Therefore, taking another subsequence, still denoted as an , you
can assume this function converges to 0 a.e. Pick such a t where this convergence
takes place. Then K (t, an ) → 0 as n → ∞ and so an → 0 in A0 + A1 . This shows
that if an → 0 in (A0 , A1 )θ,q , then there exists a subsequence {ank } such that
ank → 0 in A0 + A1 . It follows that if an → 0 in (A0 , A1 )θ,q , then an → 0 in
A0 + A1 . This proves the continuity of the embedding.
What about 45.13? Suppose {an } is a Cauchy sequence in (A0 , A1 )θ,q . Then
there exists a ∈ A0 + A1 such that an → a in A0 + A1 because A0 + A1 is a Banach
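space. Thus, $K(t,a_n) \to K(t,a)$ for all $t > 0$. Therefore, by Fatou's lemma,
\[
\begin{aligned}
\left(\int_0^\infty \left(t^{-\theta} K(t,a)\right)^q \frac{dt}{t}\right)^{1/q} &\le \liminf_{n\to\infty} \left(\int_0^\infty \left(t^{-\theta} K(t,a_n)\right)^q \frac{dt}{t}\right)^{1/q} \\
&\le \max\left\{\|a_n\|_{\theta,q} : n \in \mathbb{N}\right\} < \infty
\end{aligned}
\]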
but unless indicated otherwise, A0 will come first. Now for θ ∈ (0, 1) and q ≥ 1,
define a space, (A0 , A1 )θ,q,J as follows. The space, (A0 , A1 )θ,q,J will consist of those
elements, a, of A0 + A1 which can be written in the form
\[ a = \int_0^\infty u(t) \frac{dt}{t} \equiv \lim_{\varepsilon\to 0+} \int_\varepsilon^1 u(t)\frac{dt}{t} + \lim_{r\to\infty}\int_1^r u(t)\frac{dt}{t} \tag{45.22} \]
where the infimum is taken over all u satisfying 45.22 and 45.23.
Note that a norm on A0 × A1 would be
\[ \|(a_0,a_1)\| \equiv \max\left(\|a_0\|_{A_0},\, t\|a_1\|_{A_1}\right) \]
and so J (t, ·) is the restriction of this norm to the subspace of A0 × A1 defined
by {(a, a) : a ∈ A0 ∩ A1 }. Also for each t > 0 J (t, ·) is a norm on A0 ∩ A1 and
furthermore, any two of these norms are equivalent. In fact, it is easy to see that
for $0 < t < s$, $\frac{t}{s} J(s,a) \le J(t,a) \le J(s,a)$.
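To see why the two inequalities hold, here is a short sketch using only $J(t,a) = \max(\|a\|_{A_0}, t\|a\|_{A_1})$. For $0 < t < s$,
\[ J(t,a) \le \max\left(\|a\|_{A_0},\, s\|a\|_{A_1}\right) = J(s,a), \qquad \frac{t}{s} J(s,a) = \max\left(\frac{t}{s}\|a\|_{A_0},\, t\|a\|_{A_1}\right) \le J(t,a), \]
the last step because $t/s < 1$.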
The following lemma is significant and follows immediately from the above def-
inition.
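Lemma 45.24 Suppose $a \in (A_0,A_1)_{\theta,q,J}$ and $a = \int_0^\infty u(t)\frac{dt}{t}$ where u is described above. Then letting r > 1,
\[ u_r(t) \equiv \begin{cases} u(t) & \text{if } t \in \left(\frac{1}{r}, r\right) \\ 0 & \text{otherwise} \end{cases} \]
it follows that
\[ \int_0^\infty u_r(t) \frac{dt}{t} \in A_0 \cap A_1. \]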
Proof: The integral equals $\int_{1/r}^r u(t)\frac{dt}{t}$. $\int_{1/r}^r \frac{1}{t}\,dt = 2\ln r < \infty$. Now $u_r$ is measurable in $A_0 \cap A_1$ and bounded. Therefore, there exists a sequence of measurable simple functions, $\{s_n\}$ having values in $A_0 \cap A_1$ which converges pointwise and uniformly to $u_r$. It can also be assumed $J(r,s_n(t)) \le J(r,u_r(t))$ for all $t \in [1/r, r]$. Therefore,
\[ \lim_{n,m\to\infty} \int_{1/r}^r J(r, s_m - s_n) \frac{dt}{t} = 0. \]
It follows from the definition of the Bochner integral that
\[ \lim_{n\to\infty} \int_{1/r}^r s_n \frac{dt}{t} = \int_{1/r}^r u_r \frac{dt}{t} \in A_0 \cap A_1. \]
Then let ui ≡ a0,i − a0,i−1 = a1,i−1 − a1,i . The reason these are equal is a =
a0,i + a1,i = a0,i−1 + a1,i−1 . Then
\[ \sum_{i=-m}^{n} u_i = a_{0,n} - a_{0,-(m+1)} = a_{1,-(m+1)} - a_{1,n}. \]
It follows $a - \sum_{i=-m}^{n} u_i = a - \left(a_{0,n} - a_{0,-(m+1)}\right) = a_{0,-(m+1)} + a_{1,n}$, and both terms converge to zero as m and n converge to ∞ by 45.28. Therefore,
\[ K\left(1, a - \sum_{i=-m}^{n} u_i\right) \le \left\|a_{0,-(m+1)}\right\| + \|a_{1,n}\| \]
and so this shows $a = \sum_{i=-\infty}^{\infty} u_i$ which is one of the claims of the lemma. Also
\[
\begin{aligned}
J\left(2^i, u_i\right) &\equiv \max\left(\|u_i\|_{A_0},\, 2^i\|u_i\|_{A_1}\right) \le \|u_i\|_{A_0} + 2^i\|u_i\|_{A_1} \\
&\le \|a_{0,i}\|_{A_0} + 2^i\|a_{1,i}\|_{A_1} + \overbrace{\|a_{0,i-1}\|_{A_0} + 2^i\|a_{1,i-1}\|_{A_1}}^{\le\, 2\left(\|a_{0,i-1}\|_{A_0} + 2^{i-1}\|a_{1,i-1}\|_{A_1}\right)} \\
&\le (1+\varepsilon)K\left(2^i,a\right) + 2(1+\varepsilon)K\left(2^{i-1},a\right) \le 3(1+\varepsilon)K\left(2^i,a\right)
\end{aligned}
\]
because t → K (t, a) is nondecreasing. This proves the lemma.
Lemma 45.26 If $a \in A_0 \cap A_1$, then $K(t,a) \le \min\left(1, \frac{t}{s}\right) J(s,a)$.
Proof: If $s \ge t$, then $\min\left(1,\frac{t}{s}\right) = \frac{t}{s}$ and so
\[ \min\left(1,\frac{t}{s}\right) J(s,a) = \frac{t}{s}\max\left(\|a\|_{A_0},\, s\|a\|_{A_1}\right) \ge \frac{t}{s}\, s\|a\|_{A_1} = t\|a\|_{A_1} \ge K(t,a). \]
Now in case $s < t$, then $\min\left(1,\frac{t}{s}\right) = 1$ and so
\[ \min\left(1,\frac{t}{s}\right) J(s,a) = \max\left(\|a\|_{A_0},\, s\|a\|_{A_1}\right) \ge \|a\|_{A_0} \ge K(t,a). \]
\[ \frac{K(t,a) + t\varepsilon}{t} \ge \frac{\|a_0\|_{A_0} + t\|a_1\|_{A_1}}{t} \ge \frac{\|a_0\|_{A_0} + s\|a_1\|_{A_1}}{s} \ge \frac{K(s,a)}{s}. \]
Since ε is arbitrary, this proves the claim.
u (t) ≡ ui / ln 2.
Then
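\[ a = \sum_{i=-\infty}^{\infty} u_i = \int_0^\infty u(t)\frac{dt}{t}. \tag{45.30} \]
Now
\[
\begin{aligned}
\|a\|_{\theta,q,J}^q &\le \int_0^\infty \left(t^{-\theta} J(t,u(t))\right)^q \frac{dt}{t} \\
&= \sum_{i=-\infty}^{\infty} \int_{2^{i-1}}^{2^i} \left(t^{-\theta} J\left(t, \frac{u_i}{\ln 2}\right)\right)^q \frac{dt}{t} \\
&\le \left(\frac{1}{\ln 2}\right)^q \sum_{i=-\infty}^{\infty} \int_{2^{i-1}}^{2^i} \left(t^{-\theta} J\left(2^i, u_i\right)\right)^q \frac{dt}{t} \\
&\le \left(\frac{1}{\ln 2}\right)^q \sum_{i=-\infty}^{\infty} \int_{2^{i-1}}^{2^i} \left(t^{-\theta}\, 3(1+\varepsilon) K\left(2^i, a\right)\right)^q \frac{dt}{t}
\end{aligned}
\]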
This has shown that if a ∈ (A0 , A1 )θ,q , then by 45.30 and 45.31, a ∈ (A0 , A1 )θ,q,J
and
\[ \|a\|_{\theta,q,J}^q \le \left(\frac{3(1+\varepsilon)}{\ln 2}\right)^q 2\, \|a\|_{\theta,q}^q. \tag{45.32} \]
It remains to prove the other inclusion and norm inequality, both of which are
much easier to obtain. Thus, let a ∈ (A0 , A1 )θ,q,J with
\[ a = \int_0^\infty u(t)\frac{dt}{t} \tag{45.33} \]
where u is a strongly measurable function having values in $A_0 \cap A_1$ and for which
\[ \int_0^\infty \left(t^{-\theta} J(t,u(t))\right)^q \frac{dt}{t} < \infty. \tag{45.34} \]
\[ K(t,a) = K\left(t, \int_0^\infty u(s)\frac{ds}{s}\right) \le \int_0^\infty K(t,u(s))\frac{ds}{s}. \]
Now by 45.26, this is dominated by an expression of the form
\[ \le \int_0^\infty \min\left(1,\frac{t}{s}\right) J(s,u(s))\frac{ds}{s} = \int_0^\infty \min\left(1,\frac{1}{s}\right) J(ts,u(ts))\frac{ds}{s} \]
where the equation follows from a change of variable. From Minkowski’s inequality,
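\[
\begin{aligned}
\|a\|_{\theta,q} &\equiv \left(\int_0^\infty \left(t^{-\theta}K(t,a)\right)^q \frac{dt}{t}\right)^{1/q} \\
&\le \left(\int_0^\infty \left(t^{-\theta}\int_0^\infty \min\left(1,\frac{1}{s}\right) J(ts,u(ts))\frac{ds}{s}\right)^q \frac{dt}{t}\right)^{1/q} \\
&\le \int_0^\infty \left(\int_0^\infty \left(t^{-\theta}\min\left(1,\frac{1}{s}\right) J(ts,u(ts))\right)^q \frac{dt}{t}\right)^{1/q} \frac{ds}{s}.
\end{aligned}
\]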
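Now change the variable in the inside integral to obtain, letting $\tau = ts$,
\[
\begin{aligned}
&\le \int_0^\infty \min\left(1,\frac{1}{s}\right)\left(\int_0^\infty \left(t^{-\theta} J(ts,u(ts))\right)^q \frac{dt}{t}\right)^{1/q}\frac{ds}{s} \\
&= \int_0^\infty \min\left(1,\frac{1}{s}\right) s^{\theta}\frac{ds}{s} \left(\int_0^\infty \left(\tau^{-\theta} J(\tau,u(\tau))\right)^q \frac{d\tau}{\tau}\right)^{1/q} \\
&= \left(\frac{1}{(1-\theta)\theta}\right)\left(\int_0^\infty \left(\tau^{-\theta} J(\tau,u(\tau))\right)^q \frac{d\tau}{\tau}\right)^{1/q}.
\end{aligned}
\]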
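The constant in the last line comes from an elementary integral. As a sketch of the computation, split the range at $s = 1$:
\[ \int_0^\infty \min\left(1,\frac{1}{s}\right) s^{\theta}\frac{ds}{s} = \int_0^1 s^{\theta-1}\,ds + \int_1^\infty s^{\theta-2}\,ds = \frac{1}{\theta} + \frac{1}{1-\theta} = \frac{1}{\theta(1-\theta)}, \]
both integrals being finite precisely because $0 < \theta < 1$.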
This has shown that
\[ \|a\|_{\theta,q} \le \left(\frac{1}{(1-\theta)\theta}\right)\left(\int_0^\infty \left(\tau^{-\theta} J(\tau,u(\tau))\right)^q \frac{d\tau}{\tau}\right)^{1/q} < \infty \]
for all u satisfying 45.33 and 45.34. Therefore, taking the infimum it follows $a \in (A_0,A_1)_{\theta,q}$ and
\[ \|a\|_{\theta,q} \le \left(\frac{1}{(1-\theta)\theta}\right)\|a\|_{\theta,q,J}. \]
This proves the theorem.
What is the dual space of (A0 , A1 )θ,q ? The answer is based on the following
\[ K(t,a') = \sup_{a \in A_0 \cap A_1} \frac{|a'(a)|}{J(t^{-1},a)}. \tag{45.36} \]
Thus $K(t,\cdot)$ is an equivalent norm to the usual operator norm on $(A_0 \cap A_1)'$ taken with respect to $J(t^{-1},\cdot)$. If, in addition to this, $A_i$ is reflexive, then for $a' \in A_0' \cap A_1'$, and $a \in A_0 \cap A_1$,
\[ J(t,a')\, K\left(t^{-1},a\right) \ge |a'(a)|. \tag{45.37} \]
Proof: First consider the claim that $A_0' + A_1' = (A_0 \cap A_1)'$. As noted above, ⊆ is clear. Define a norm on $A_0 \times A_1$ as follows.
\[ \|(a_0,a_1)\|_{A_0\times A_1} \equiv \max\left(\|a_0\|_{A_0},\, t^{-1}\|a_1\|_{A_1}\right). \tag{45.38} \]
Let $a' \in (A_0 \cap A_1)'$. Let
\[ E \equiv \{(a,a) : a \in A_0 \cap A_1\} \]
with the norm $J(t^{-1},a) \equiv \max\left(\|a\|_{A_0},\, t^{-1}\|a\|_{A_1}\right)$. Now define λ on E, the subspace of $A_0 \times A_1$ by
\[ \lambda((a,a)) \equiv a'(a). \]
Thus λ is a continuous linear map on E and in fact,
\[ |\lambda((a,a))| = |a'(a)| \le \|a'\|\, J\left(t^{-1},a\right). \]
\[
\begin{aligned}
|(a_0' + a_1')(a)| &= |a_0'(a) + a_1'(a)| \le |a_0'(a)| + |a_1'(a)| \\
&\le \|a_0'\|\,\|a\|_{A_0} + \|a_1'\|\,\|a\|_{A_1} \\
&\le \|a_0'\|\,\|a\|_{A_0} + t\|a_1'\|\, t^{-1}\|a\|_{A_1} \\
&\le \left(\|a_0'\| + t\|a_1'\|\right) J\left(t^{-1},a\right)
\end{aligned}
\]
Proof of the claim: $|(a_0',a_1')(a_0,a_1)| \le \|a_0'\|\,\|a_0\| + \|a_1'\|\,\|a_1\|$. Now suppose that $\|a_0\| = \max\left(\|a_0\|,\, t^{-1}\|a_1\|\right)$. Then this is no larger than
\[ \left(\|a_0'\| + t\|a_1'\|\right)\|a_0\| = \left(\|a_0'\| + t\|a_1'\|\right)\max\left(\|a_0\|,\, t^{-1}\|a_1\|\right). \]
The other case is that $t^{-1}\|a_1\| = \max\left(\|a_0\|,\, t^{-1}\|a_1\|\right)$. In this case,
This shows $\|(a_0',a_1')\|_{(A_0\times A_1)'} \le \left(\|a_0'\| + t\|a_1'\|\right)$. Is equality achieved? Let $a_{0n}$ and $a_{1n}$ be points of $A_0$ and $A_1$ respectively such that $\|a_{0n}\|, \|a_{1n}\| \le 1$ and $\lim_{n\to\infty} a_i'(a_{in}) = \|a_i'\|$. Then
\[ \le \left(\|\tilde a_0'\| + t\|\tilde a_1'\|\right)\|(a,a)\|_{A_0\times A_1} \]
so λ is continuous on the subspace, E of $A_0 \times A_1$ and
\[ \|\lambda\|_{E'} \le \|\tilde a_0'\| + t\|\tilde a_1'\|. \tag{45.40} \]
By the Hahn Banach theorem, there exists an extension of λ defined on all of $A_0 \times A_1$ with the same norm. Thus, from 45.39, there exists $(a_0',a_1') \in (A_0 \times A_1)'$ which is an extension of λ such that
\[ \|(a_0',a_1')\|_{(A_0\times A_1)'} = \|a_0'\|_{A_0'} + t\|a_1'\|_{A_1'} = \|\lambda\|_{E'} \]
Now $a'' = a_1'' + a_0'' = \eta_1 a_1 + \eta_0 a_0$ where $\eta_i$ is the map from $A_i$ to $A_i''$ which is onto and preserves norms, given by $\eta a(a') \equiv a'(a)$. Therefore, letting $a_1 + a_0 = a$
\[
\begin{aligned}
K(t,a) = K(t,a'') &= \sup_{a' \in A_0' \cap A_1'} \frac{|a''(a')|}{J(t^{-1},a')} \\
&= \sup_{a' \in A_0' \cap A_1'} \frac{|(\eta_1 a_1 + \eta_0 a_0)(a')|}{J(t^{-1},a')} = \sup_{a' \in A_0' \cap A_1'} \frac{|a'(a_1 + a_0)|}{J(t^{-1},a')}
\end{aligned}
\]
and so
\[ K(t,a) = \sup_{a' \in A_0' \cap A_1'} \frac{|a'(a)|}{J(t^{-1},a')} \]
Changing $t \to t^{-1}$,
\[ K\left(t^{-1},a\right) J(t,a') \ge |a'(a)|, \]
which proves the lemma.
Consider $(A_0,A_1)'_{\theta,q}$.
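Definition 45.29 Let q ≥ 1. Then $\lambda^{\theta,q}$ will denote the sequences, $\{\alpha_i\}_{i=-\infty}^{\infty}$ such that
\[ \sum_{i=-\infty}^{\infty} \left(|\alpha_i|\, 2^{-i\theta}\right)^q < \infty. \]
For $\alpha \in \lambda^{\theta,q}$,
\[ \|\alpha\|_{\lambda^{\theta,q}} \equiv \left(\sum_{i=-\infty}^{\infty} \left(|\alpha_i|\, 2^{-i\theta}\right)^q\right)^{1/q}. \]
Thus $\alpha \in \lambda^{\theta,q}$ means $\left\{\alpha_i 2^{-i\theta}\right\} \in l^q$.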
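A concrete instance may help fix the definition. The sequence below is chosen here purely for illustration and does not come from the text. Take $\alpha_i \equiv 2^{i\theta} 2^{-|i|}$. Then $\alpha_i 2^{-i\theta} = 2^{-|i|}$ and
\[ \|\alpha\|_{\lambda^{\theta,q}}^q = \sum_{i=-\infty}^{\infty} 2^{-|i| q} = 1 + 2\sum_{i=1}^{\infty} 2^{-iq} = 1 + \frac{2}{2^q - 1} < \infty, \]
so $\alpha \in \lambda^{\theta,q}$. By contrast, $\alpha_i \equiv 2^{i\theta}$ is not in $\lambda^{\theta,q}$, since every term of the defining sum equals 1.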
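Lemma 45.30 Let f (t) ≥ 0, and let $f(t) = \alpha_i$ for $t \in [2^i, 2^{i+1})$ where $\alpha \in \lambda^{\theta,q}$. Then there exists a constant, C, such that
\[ \left\|t^{-\theta} f\right\|_{L^q\left(0,\infty;\frac{dt}{t}\right)} \le C\|\alpha\|_{\lambda^{\theta,q}}. \tag{45.43} \]
then
\[ \left\|\left\{f\left(2^i\right)\right\}_{i=-\infty}^{\infty}\right\|_{\lambda^{1-\theta,q'}} \le C. \tag{45.45} \]
\[ \le \sum_i \int_{2^i}^{2^{i+1}} \left(2^{-i\theta}\alpha_i\right)^q \frac{dt}{t} = \ln 2 \sum_i \left(2^{-i\theta}\alpha_i\right)^q = \ln 2\, \|\alpha\|_{\lambda^{\theta,q}}^q. \]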
A0 ∩ A1 ⊆ (A0 , A1 )θ,q,J
and if
a ∈ (A0 , A1 )θ,q,J ,
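then a has a representation of the form
\[ a = \int_0^\infty u(t)\frac{dt}{t} \]
where
\[ \int_0^\infty \left(t^{-\theta} J(t,u(t))\right)^q \frac{dt}{t} < \infty \]
where
\[ J(t,u(t)) = \max\left(\|u(t)\|_{A_0},\, t\|u(t)\|_{A_1}\right) \]
for $u(t) \in A_0 \cap A_1$. Now let
\[ u_r(t) \equiv \begin{cases} u(t) & \text{if } t \in \left(\frac{1}{r}, r\right) \\ 0 & \text{otherwise} \end{cases} \]
Then $\int_0^\infty \left(t^{-\theta} J(t,u_r(t))\right)^q \frac{dt}{t} < \infty$ and
\[ a_r \equiv \int_0^\infty u_r(t)\frac{dt}{t} \in A_0 \cap A_1 \]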
Now
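\[ \left\|\frac{b_i 2^{i\theta}}{\max\left(\|b_i\|_{A_0},\, 2^i\|b_i\|_{A_1}\right)}\right\|_{A_0+A_1} \le \begin{cases} 2^{i\theta} & \text{if } i < 0 \\ 2^{-i(1-\theta)} & \text{if } i \ge 0 \end{cases}. \tag{45.47} \]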
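M
\[ \sum_{i=0}^{\infty} 2^{-i(1-\theta)}\, 2^{-i\theta}\alpha_i \le \left(\sum_{i=0}^{\infty} 2^{-i(1-\theta)q'}\right)^{1/q'}\left(\sum_{i=0}^{\infty} 2^{-iq\theta}\alpha_i^q\right)^{1/q} < \infty \]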
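and similarly,
\[ \sum_{i=-\infty}^{0} \left\|\frac{b_i 2^{i\theta}}{\max\left(\|b_i\|_{A_0},\, 2^i\|b_i\|_{A_1}\right)}\, 2^{-i\theta}\alpha_i\right\|_{A_0+A_1} \]
converges. Therefore, $a_\infty$ makes sense in $A_0 + A_1$ and also from 45.47, we see that
\[ \left\{\frac{\|b_i\|_{A_0+A_1}\, 2^{i\theta}}{J\left(2^i, b_i\right)}\right\} \in \lambda^{(1-\theta),q'} \]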
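Now let
\[ u(t) \equiv \frac{\alpha_i b_i}{J\left(2^i,b_i\right)\ln 2} \quad \text{on } [2^{i-1}, 2^i). \]
Then
\[ \int_0^\infty u(t)\frac{dt}{t} = \sum_i \int_{2^{i-1}}^{2^i} \frac{\alpha_i b_i}{J\left(2^i,b_i\right)\ln 2}\frac{dt}{t} = \sum_i \frac{\alpha_i b_i}{J\left(2^i,b_i\right)} = a_\infty. \]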
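Also
\[
\begin{aligned}
\int_0^\infty \left(t^{-\theta} J(t,u(t))\right)^q \frac{dt}{t} &\le \sum_i \int_{2^{i-1}}^{2^i} \left(2^{(1-i)\theta} J\left(2^i, u\left(2^{i-1}\right)\right)\right)^q \frac{dt}{t} \\
&\le \sum_i \left[2^{-(i-1)\theta} J\left(2^i, u\left(2^{i-1}\right)\right)\right]^q \ln 2 \\
&= \sum_i \left[2^{-(i-1)\theta}\frac{J\left(2^i,b_i\right)\alpha_i}{J\left(2^i,b_i\right)\ln 2}\right]^q \ln 2 \\
&= C\sum_i \left(2^{-i\theta}|\alpha_i|\right)^q < \infty
\end{aligned}
\tag{45.48}
\]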
and so $\|a_\infty\|_{\theta,q,J} < \infty$. Now for $a'$ as above, $a' \in (A_0,A_1)'_{\theta,q,J} \subseteq (A_0 + A_1)'$, and so since the sum for $a_\infty$ converges in $A_0 + A_1$, we have
\[ a'(a_\infty) = \sum_i J\left(2^i,b_i\right)^{-1}\alpha_i\, a'(b_i). \]
Therefore,
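\[
\begin{aligned}
a'(a_\infty) &\ge \sum_i \left[K\left(2^{-i},a'\right) - \varepsilon\min\left(1, 2^{-i}\right)\right]\alpha_i \\
&= \sum_i K\left(2^{-i},a'\right)\alpha_i - \varepsilon\sum_i \min\left(1,2^{-i}\right)\alpha_i \\
&= \sum_i K\left(2^{-i},a'\right)\alpha_i - O(\varepsilon)
\end{aligned}
\tag{45.49}
\]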
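The reason for this is that $\alpha \in \lambda^{\theta,q}$ so $\left\{\alpha_i 2^{-i\theta}\right\} \in l^q$. Therefore,
\[
\begin{aligned}
\varepsilon\sum_i \min\left(1,2^{-i}\right)\alpha_i &= \varepsilon\left\{\sum_{i=0}^{\infty} 2^{-i}\alpha_i + \sum_{i=-\infty}^{-1}\alpha_i\right\} \\
&= \varepsilon\left\{\sum_{i=0}^{\infty} 2^{-i\theta}\, 2^{(\theta-1)i}\alpha_i + \sum_{i=-\infty}^{-1}\alpha_i 2^{-i\theta}\, 2^{i\theta}\right\} \\
&\le \varepsilon\left(\sum_i \left|\alpha_i 2^{-i\theta}\right|^q\right)^{1/q}\left(\sum_{i=0}^{\infty}\left(2^{(\theta-1)i}\right)^{q'}\right)^{1/q'} \\
&\quad + \varepsilon\left(\sum_i \left|\alpha_i 2^{-i\theta}\right|^q\right)^{1/q}\left(\sum_{i=-\infty}^{-1}\left(2^{\theta i}\right)^{q'}\right)^{1/q'} \\
&< C\varepsilon.
\end{aligned}
\]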
Also
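\[ |a'(a_\infty)| \le \|a'\|_{(A_0,A_1)'_{\theta,q,J}}\, \|a_\infty\|_{(A_0,A_1)_{\theta,q,J}}. \]
By Lemma 45.30, $\left\{K\left(2^i,a',A_1',A_0'\right)\right\} \in \lambda^{1-\theta,q'}$ and
\[ \left\|\left\{K\left(2^i,a',A_1',A_0'\right)\right\}\right\|_{\lambda^{1-\theta,q'}} \le \|a'\|_{(A_0,A_1)'_{\theta,q,J}}\, C_\theta. \]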
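Therefore,
\[
\begin{aligned}
&\left(\frac{1}{\ln 2}\int_0^\infty \left(K\left(t,a',A_1',A_0'\right) t^{-(1-\theta)}\right)^{q'}\frac{dt}{t}\right)^{1/q'} \\
&= \left(\sum_i \frac{1}{\ln 2}\int_{2^i}^{2^{i+1}} \left(K\left(t,a',A_1',A_0'\right) t^{-(1-\theta)}\right)^{q'}\frac{dt}{t}\right)^{1/q'} \\
&\le \left(\sum_i \left(2^{-i(1-\theta)} K\left(2^i,a',A_1',A_0'\right)\right)^{q'}\right)^{1/q'} \\
&\le \|a'\|_{(A_0,A_1)'_{\theta,q,J}}\, C_\theta.
\end{aligned}
\]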
Thus
\[ \|a'\|_{(A_1',A_0')_{1-\theta,q'}} \equiv \left\|t^{-(1-\theta)} K\left(t,a',A_1',A_0'\right)\right\|_{L^{q'}\left(0,\infty,\frac{dt}{t}\right)} \]
which shows that $(A_0,A_1)'_{\theta,q,J} \subseteq (A_1',A_0')_{1-\theta,q'}$ with the inclusion map continuous.
This proves the lemma.
\[ (A_1',A_0')_{1-\theta,q',J} \subseteq (A_0,A_1)'_{\theta,q} \]
Proof: Let $a' \in (A_1',A_0')_{1-\theta,q',J}$. Thus, there exists $u^*$ bounded on compact subsets of (0, ∞) and measurable with values in $A_0' \cap A_1'$ and
\[ a' = \int_0^\infty u^*(t)\frac{dt}{t}, \tag{45.50} \]
\[ \int_0^\infty \left(t^{-(1-\theta)} J(t,u^*(t))\right)^{q'}\frac{dt}{t} < \infty. \]
Then
\[ a' = \sum_{i=-\infty}^{\infty}\int_{2^i}^{2^{i+1}} u^*(t)\frac{dt}{t} \equiv \sum_{i=-\infty}^{\infty} a_i' \]
where $a_i' \in A_1' \cap A_0'$, the convergence taking place in $A_1' + A_0'$. Now let $a \in A_0 \cap A_1$.
In going from the sums to the integrals, express the first sum as a sum of integrals
on [2i , 2i+1 ) and the second sum as a sum of integrals on (2i−1 , 2i ].
Taking the infimum over all u∗ representing a0 ,
It follows $a' \in (A_0,A_1)'_{\theta,q}$ and $\|a'\|_{(A_0,A_1)'_{\theta,q}} \le C\|a'\|_{(A_1',A_0')_{1-\theta,q',J}}$ which proves the lemma.
With these two lemmas the main result follows.
Proof: This was already explained in the treatment of the K method of inter-
polation. It is just K (1, a) .
\[ f \in L^1_{loc}(0,\infty; A_0 + A_1) \]
as follows.
\[ f'(\phi) \equiv \int_0^\infty -f(t)\phi'(t)\,dt \]
whenever $\phi \in C_c^\infty(0,\infty)$. Define a Banach space, $W(A_0,A_1,p,\theta) = W$ where p ≥ 1, θ ∈ (0, 1). Let
\[ \|f\|_W \equiv \max\left(\left\|t^\theta f\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_0\right)},\, \left\|t^\theta f'\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_1\right)}\right) \tag{46.2} \]
and let W consist of $f \in L^1_{loc}(0,\infty;A_0+A_1)$ such that $\|f\|_W < \infty$.
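Proof: Let 0 < s < t. Let $\nu + \frac{1}{p} = \theta$. Then
\[ \int_0^\infty \|\tau^\nu g(\tau)\|^p\,d\tau = \int_0^\infty \left\|\tau^\theta g(\tau)\right\|^p \frac{d\tau}{\tau} \]
so that $t^\nu f' \in L^p(0,\infty;A_1)$, the measure in this case being usual Lebesgue measure. Then
\[ f(t) - f(s) = \int_s^t f'(\tau)\,d\tau = \int_s^t \tau^\nu f'(\tau)\,\tau^{-\nu}\,d\tau. \]
For $\frac{1}{p} + \frac{1}{p'} = 1$, $\nu p' = \left(\theta - \frac{1}{p}\right)p' < 1$ because $\theta < 1 = \frac{1}{p'} + \frac{1}{p}$. Therefore,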
Taking a subsequence, it can be assumed fn (t) converges to f (t) a.e. But the
above inequality shows that fn (t) is a Cauchy sequence in C ([0, β] ; A0 + A1 ) for
all β < ∞. Therefore, fn (t) → f (t) for all t. Also,
\[ \|f_n(t)\|_{A_0+A_1} \le C_\nu \|f_n\|_W\, t^{1-\nu p'} \le K t^{1-\nu p'} \]
Definition 46.5 Let W be a Banach space and let Z be a closed subspace. Then
the quotient space, denoted by W/Z consists of the set of equivalence classes [x]
The verification of the algebraic claims made in the above definition is left to
the reader. It is routine. What is not as routine is the following lemma. However,
it is similar to some topics in the presentation of the K method of interpolation.
Lemma 46.6 Let W be a Banach space and let Z be a closed subspace of W. Then
W/Z with the norm described above is a Banach space.
Proof: That W/Z is a vector space is left to the reader. Why is ||·|| a norm?
Suppose α ≠ 0. Then
Now let ||[x]|| ≥ ||x + z1 || − ε and let ||[y]|| ≥ ||y + z2 || − ε where zi ∈ Z. Then
Since ε is arbitrary, this shows the triangle inequality. Clearly, ||[x]|| ≥ 0. It remains
to show that the only way ||[x]|| = 0 is for x ∈ Z. Suppose then that ||[x]|| = 0.
This means there exist zn ∈ Z such that ||x + zn || → 0. Therefore, −x is a limit of
a sequence of points of Z and since Z is closed, this requires −x ∈ Z. Hence x ∈ Z
also because Z is a subspace. This shows ||·|| is a norm on W/Z. It remains to
verify that W/Z is a Banach space.
Suppose $\{[x_n]\}$ is a Cauchy sequence in W/Z and suppose $\|[x_n] - [x_{n+1}]\| < \frac{1}{2^{n+1}}$. Let $x_1' = x_1$. If $x_n'$ has been chosen let $x_{n+1}' = x_{n+1} + z_{n+1}$ where $z_{n+1} \in Z$ be such that
\[
\begin{aligned}
\left\|x_{n+1}' - x_n'\right\| &\le \|[x_{n+1} - x_n]\| + \frac{1}{2^{n+1}} \\
&= \|[x_{n+1}] - [x_n]\| + \frac{1}{2^{n+1}} < \frac{1}{2^n}.
\end{aligned}
\]
It follows $\{x_n'\}$ is a Cauchy sequence in W and so it must converge to some x ∈ W. Now
\[ \|[x] - [x_n]\| = \|[x - x_n]\| = \|[x - x_n']\| \le \|x - x_n'\| \]
which converges to 0. Now if $\{[x_n]\}$ is just a Cauchy sequence, there exists a subsequence satisfying $\left\|[x_{n_k}] - [x_{n_{k+1}}]\right\| < \frac{1}{2^{k+1}}$ and so from the first part, the subsequence converges to some [x] ∈ W/Z and so the original Cauchy sequence also converges. Therefore, W/Z is a Banach space as claimed.
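The reason the lifted sequence $\{x_n'\}$ is Cauchy can be spelled out with a telescoping sum. This is a sketch using only the bound $\left\|x_{k+1}' - x_k'\right\| < 2^{-k}$ obtained in the construction: for m > n,
\[ \left\|x_m' - x_n'\right\| \le \sum_{k=n}^{m-1}\left\|x_{k+1}' - x_k'\right\| < \sum_{k=n}^{m-1} 2^{-k} < 2^{1-n}, \]
which tends to 0 as n → ∞, independently of m.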
the limit taking place in $A_0 + A_1$. Let γf be defined for f ∈ W by $\gamma f \equiv \lim_{t\to 0+} f(t)$.
Thus T = γ (W ) . As above Z ≡ {f ∈ W : γf = 0} = ker (γ) .
ψ ([f ]) ≡ γf.
Therefore, the Banach space, W/Z and T are isometric and so T must be a Banach
space since W/Z is.
The following is an important interpolation inequality.
where the infimum is taken over all f ∈ W such that a = f (0) . Also, if a ∈ A0 ∩A1 ,
then a ∈ T and
\[ \|a\|_T \le K\|a\|_{A_0}^{1-\theta}\, \|a\|_{A_1}^{\theta} \tag{46.7} \]
for some constant K. Also
A0 ∩ A1 ⊆ T (A0 , A1 , p, θ) ⊆ A0 + A1 (46.8)
Next choose f ∈ W such that f (0) = a and ||f ||W ≈ ||a||T . More precisely, pick
f ∈ W such that f (0) = a and ||a||T > −ε + ||f ||W . Also let
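\[ R \equiv \left\|t^\theta f\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_0\right)}, \qquad S \equiv \left\|t^\theta f'\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_1\right)}. \]
Then as before,
\[ \left\|t^\theta f_\lambda\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_0\right)} = \lambda^{-\theta}R, \qquad \left\|t^\theta (f_\lambda)'\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_1\right)} = \lambda^{1-\theta}S. \tag{46.9} \]
so that $\|f\|_W = \max(R,S)$. Then, changing the variables, letting λ = R/S,
\[ \left\|t^\theta f_\lambda\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_0\right)} = \left\|t^\theta (f_\lambda)'\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_1\right)} = R^{1-\theta}S^{\theta} \tag{46.10} \]
Since $f_\lambda(0) = a$, $f_\lambda \in W$, and it is always the case that for positive R, S, $R^{1-\theta}S^{\theta} \le \max(R,S)$, this shows that
\[
\begin{aligned}
\|a\|_T &\le \max\left(\left\|t^\theta f_\lambda\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_0\right)},\, \left\|t^\theta (f_\lambda)'\right\|_{L^p\left(0,\infty,\frac{dt}{t};A_1\right)}\right) \\
&= R^{1-\theta}S^{\theta} \le \max(R,S) = \|f\|_W < \|a\|_T + \varepsilon,
\end{aligned}
\]
the first inequality holding because $\|a\|_T$ is the infimum of such things on the right.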
This shows 46.6.
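The scaling identities in 46.9 can be checked directly. The following is a sketch under the assumption, not restated in this passage, that $f_\lambda(t) \equiv f(\lambda t)$. Substituting $\tau = \lambda t$, so that $t^\theta = \lambda^{-\theta}\tau^\theta$ and $\frac{dt}{t} = \frac{d\tau}{\tau}$,
\[ \int_0^\infty \left(t^\theta\|f(\lambda t)\|_{A_0}\right)^p\frac{dt}{t} = \lambda^{-\theta p}\int_0^\infty \left(\tau^\theta\|f(\tau)\|_{A_0}\right)^p\frac{d\tau}{\tau} = \left(\lambda^{-\theta}R\right)^p, \]
and since $(f_\lambda)'(t) = \lambda f'(\lambda t)$, the same substitution gives $\lambda^{1-\theta}S$ for the derivative term. With λ = R/S,
\[ \lambda^{-\theta}R = R^{1-\theta}S^{\theta}, \qquad \lambda^{1-\theta}S = R^{1-\theta}S^{\theta}, \]
so the two terms are balanced, which is the content of 46.10.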
It remains to verify 46.7. To do this, let ψ ∈ C ∞ ([0, ∞)) , with ψ (0) = 1
and ψ (t) = 0 for all t > 1. Then consider the special f ∈ W which is given by
f (t) ≡ aψ (t) where a ∈ A0 ∩ A1 . Thus f ∈ W and f (0) = a so a ∈ T (A0 , A1 , p, θ) .
From the first part, there exists a constant, K such that
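\[
\begin{aligned}
\|a\|_T &\le \left\|t^\theta f\right\|^{1-\theta}_{L^p\left(0,\infty,\frac{dt}{t};A_0\right)}\left\|t^\theta f'\right\|^{\theta}_{L^p\left(0,\infty,\frac{dt}{t};A_1\right)} \\
&\le K\|a\|_{A_0}^{1-\theta}\, \|a\|_{A_1}^{\theta}
\end{aligned}
\]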
This shows 46.7 and the first inclusion in 46.8. From the inequality just obtained,
\[
\begin{aligned}
\|a\|_T &\le K\left((1-\theta)\|a\|_{A_0} + \theta\|a\|_{A_1}\right) \\
&\le K\|a\|_{A_0\cap A_1}.
\end{aligned}
\]
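The step from the product to the sum is the weighted arithmetic-geometric mean inequality. As a brief sketch, for positive R, S and θ ∈ (0, 1), concavity of the logarithm gives
\[ \ln\left((1-\theta)R + \theta S\right) \ge (1-\theta)\ln R + \theta\ln S = \ln\left(R^{1-\theta}S^{\theta}\right), \]
and therefore
\[ R^{1-\theta}S^{\theta} \le (1-\theta)R + \theta S \le \max(R,S). \]
Here it is applied with $R = \|a\|_{A_0}$ and $S = \|a\|_{A_1}$.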
By 46.4,
\[ \|a - f(t)\|_{A_0+A_1} \le C_\nu\, t^{1-\nu p'}\|f\|_W \]
where $\frac{1}{p} + \nu = \theta$, and so
\[ \|a\|_{A_0+A_1} \le \|f(t)\|_{A_0+A_1} + C_\nu\, t^{1-\nu p'}\|f\|_W. \]
Therefore, recalling that $\nu p' < 1$, and integrating both sides from 0 to 1,
To see this,
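\[
\begin{aligned}
\int_0^1 t^\nu\|f(t)\|_{A_0}\, t^{-\nu}\,dt &\le \left(\int_0^1 \left(t^\nu\|f(t)\|_{A_0}\right)^p dt\right)^{1/p}\left(\int_0^1 t^{-\nu p'}\,dt\right)^{1/p'} \\
&\le C\|f\|_W.
\end{aligned}
\]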
Since ε > 0 is arbitrary, this verifies the second inclusion and continuity of the
inclusion map completing the proof of the theorem.
The interpolation inequality, 46.7 is very significant. The next result concerns
bounded linear transformations.
Theorem 46.10 Now suppose A0 , A1 and B0 , B1 are pairs of Banach spaces such
that Ai embeds continuously into a topological vector space, X and Bi embeds con-
tinuously into a topological vector space, Y. Suppose also that L ∈ L (A0 , B0 ) and
L ∈ L (A1 , B1 ) where the operator norm of L in these spaces is Ki , i = 0, 1. Then
L ∈ L (A0 + A1 , B0 + B1 ) (46.11)
with
||La||B0 +B1 ≤ max (K0 , K1 ) ||a||A0 +A1 (46.12)
and
L ∈ L (T (A0 , A1 , p, θ) , T (B0 , B1 , p, θ)) (46.13)
and for K the operator norm,
Then
\[
\begin{aligned}
\|L(a)\|_{B_0+B_1} = \|La_0 + La_1\|_{B_0+B_1} &\le \|La_0\|_{B_0} + \|La_1\|_{B_1} \\
&\le K_0\|a_0\|_{A_0} + K_1\|a_1\|_{A_1} \le \max(K_0,K_1)\left(\|a\|_{A_0+A_1} + \varepsilon\right).
\end{aligned}
\]
This establishes 46.12. Now consider the other assertions.
Let a ∈ T (A0 , A1 , p, θ) and pick f ∈ W (A0 , A1 , p, θ) such that γf = a and
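\[ \|a\|_{T(A_0,A_1,p,\theta)} + \varepsilon > \left\|t^\theta f\right\|^{1-\theta}_{L^p\left(0,\infty,\frac{dt}{t},A_0\right)}\left\|t^\theta f'\right\|^{\theta}_{L^p\left(0,\infty,\frac{dt}{t},A_1\right)}. \]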
The case when m = 1 was discussed in Section 46.1. Note it is not known at
this point whether limt→0+ u (t) even exists for every u ∈ V m . Of course, if m = 1
this was shown earlier but it has not been shown for m > 1. The following theorem
is absolutely amazing. Note the lack of dependence on m of the right side!
Proof: It is enough to show the first equality because of Theorem 45.27 which
identifies (A0 , A1 )θ,p,J and (A0 , A1 )θ,p . Let a ∈ T m . Then there exists u ∈ V m such
that
\[ a = \lim_{t\to 0+} u(t) \text{ in } A_0 + A_1. \]
The first task is to modify this u (t) to get a better one which is more usable in order to show $a \in (A_0,A_1)_{\theta,p,J}$. Remember, it is required to find $w(t) \in A_0 \cap A_1$ for all t ∈ (0, ∞) and $a = \int_0^\infty w(t)\frac{dt}{t}$, a representation which is not known at this time. To get such a thing, let
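with φ ≥ 0 and
\[ \int_0^\infty \phi(t)\frac{dt}{t} = 1. \tag{46.21} \]
Then define
\[ \tilde u(t) \equiv \int_0^\infty \phi\left(\frac{t}{\tau}\right)u(\tau)\frac{d\tau}{\tau} = \int_0^\infty \phi(s)\,u\left(\frac{t}{s}\right)\frac{ds}{s}. \tag{46.22} \]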
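and so
\[
\begin{aligned}
\left\|\tilde u^{(k)}(t)\right\|_{A_0} &\le C_k\int_{t/\beta}^{t/\alpha}\|u(\tau)\|_{A_0}\frac{d\tau}{\tau} \\
&\le C\left(\int_{t/\beta}^{t/\alpha}\frac{d\tau}{\tau}\right)^{1/p'}\left(\int_{t/\beta}^{t/\alpha}\|u(\tau)\|_{A_0}^p\frac{d\tau}{\tau}\right)^{1/p}.
\end{aligned}
\]
Now $\left(\frac{\beta}{t}\right)^{\theta}\tau^{\theta} \ge 1$ for $\tau \ge t/\beta$ and so the above expression
\[ \le C\left(\ln\frac{\beta}{\alpha}\right)^{1/p'}\left(\frac{\beta}{t}\right)^{\theta}\left(\int_{t/\beta}^{\infty}\left(\tau^\theta\|u(\tau)\|_{A_0}\right)^p\frac{d\tau}{\tau}\right)^{1/p} \]
and so $\lim_{t\to\infty}\left\|\tilde u^{(k)}(t)\right\|_{A_0} = 0$ and therefore, this also holds in $A_0 + A_1$. This proves the claim.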
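Thus $\tilde u$ has the same properties as u in terms of having a as its trace. $\tilde u$ is used to build the desired w, representing a as an integral. Define
\[
\begin{aligned}
v(t) \equiv \frac{(-1)^m t^m}{(m-1)!}\tilde u^{(m)}(t) &= \frac{(-1)^m}{(m-1)!}\int_0^\infty \frac{t^m}{\tau^m}\phi^{(m)}\left(\frac{t}{\tau}\right)u(\tau)\frac{d\tau}{\tau} \\
&= \frac{(-1)^m}{(m-1)!}\int_0^\infty s^m\phi^{(m)}(s)\,u\left(\frac{t}{s}\right)\frac{ds}{s}.
\end{aligned}
\tag{46.23}
\]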
Then from the claim, and integration by parts in the last step,
\[ \int_0^\infty v\left(\frac{1}{t}\right)\frac{dt}{t} = \int_0^\infty v(t)\frac{dt}{t} = \frac{(-1)^m}{(m-1)!}\int_0^\infty t^{m-1}\tilde u^{(m)}(t)\,dt = a. \tag{46.24} \]
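Thus $v\left(\frac{1}{t}\right)$ represents a in the way desired for $(A_0,A_1)_{\theta,p,J}$ if it is also true that $v\left(\frac{1}{t}\right) \in A_0 \cap A_1$ and $t \to t^{-\theta}v\left(\frac{1}{t}\right)$ is in $L^p\left(0,\infty,\frac{dt}{t};A_0\right)$ and $t \to t^{1-\theta}v\left(\frac{1}{t}\right)$ is in $L^p\left(0,\infty,\frac{dt}{t};A_1\right)$. First consider whether $v(t) \in A_0 \cap A_1$. $v(t) \in A_0$ for each t from 46.23 and the assumption that $u \in L^p\left(0,\infty,\frac{dt}{t};A_0\right)$. To verify $v(t) \in A_1$, integrate by parts in 46.23 to obtain
\[
\begin{aligned}
v(t) &= \frac{(-1)^m}{(m-1)!}\int_0^\infty \phi^{(m)}(s)\left(s^{m-1}u\left(\frac{t}{s}\right)\right)ds \\
&= \frac{1}{(m-1)!}\int_0^\infty \phi(s)\frac{d^m}{ds^m}\left(s^{m-1}u\left(\frac{t}{s}\right)\right)ds \\
&= \frac{(-1)^m}{(m-1)!}\int_0^\infty \phi(s)\frac{t^m}{s^{m+1}}u^{(m)}\left(\frac{t}{s}\right)ds \in A_1
\end{aligned}
\tag{46.25}
\]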
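The last step may look very mysterious. If so, consider the case where m = 2.
\[
\begin{aligned}
\phi(s)\left(s\,u\left(\frac{t}{s}\right)\right)'' &= \phi(s)\left(-\frac{t}{s}u'\left(\frac{t}{s}\right) + u\left(\frac{t}{s}\right)\right)' \\
&= \phi(s)\left(\left(-\frac{t}{s}\right)u''\left(\frac{t}{s}\right)\left(-\frac{t}{s^2}\right) + \frac{t}{s^2}u'\left(\frac{t}{s}\right) - \frac{t}{s^2}u'\left(\frac{t}{s}\right)\right) \\
&= \phi(s)\frac{t^2}{s^3}u''\left(\frac{t}{s}\right).
\end{aligned}
\]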
You can see the same pattern will take place for other values of m.
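To illustrate the pattern once more, here is the case m = 3 worked out as a sketch, with $u$, $u'$, $u''$, $u'''$ all evaluated at $t/s$ and derivatives taken in s:
\[
\begin{aligned}
\left(s^2 u\right)' &= 2s\,u - t\,u', \\
\left(s^2 u\right)'' &= 2u - \frac{2t}{s}u' + \frac{t^2}{s^2}u'', \\
\left(s^2 u\right)''' &= -\frac{2t}{s^2}u' + \frac{2t}{s^2}u' + \frac{2t^2}{s^3}u'' - \frac{2t^2}{s^3}u'' - \frac{t^3}{s^4}u''' = -\frac{t^3}{s^4}u'''\left(\frac{t}{s}\right).
\end{aligned}
\]
All lower order terms cancel, leaving $(-1)^m\frac{t^m}{s^{m+1}}u^{(m)}\left(\frac{t}{s}\right)$ with m = 3, in agreement with the m = 2 computation.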
Now
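\[
\begin{aligned}
\|a\|_{\theta,p,J} &\le \left(\int_0^\infty \left(t^{-\theta}J\left(t, v\left(\frac{1}{t}\right)\right)\right)^p\frac{dt}{t}\right)^{1/p} \\
&\le C_p\left\{\int_0^\infty \left[\left(t^{-\theta}\left\|v\left(\frac{1}{t}\right)\right\|_{A_0}\right) + \left(t^{1-\theta}\left\|v\left(\frac{1}{t}\right)\right\|_{A_1}\right)\right]^p\frac{dt}{t}\right\}^{1/p} \\
&\le C_p\left(\left(\int_0^\infty \left(t^{-\theta}\left\|v\left(\frac{1}{t}\right)\right\|_{A_0}\right)^p\frac{dt}{t}\right)^{1/p} + \left(\int_0^\infty \left(t^{1-\theta}\left\|v\left(\frac{1}{t}\right)\right\|_{A_1}\right)^p\frac{dt}{t}\right)^{1/p}\right).
\end{aligned}
\tag{46.26}
\]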
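\[
\begin{aligned}
{} &\le \int_0^\infty s^m\left|\phi^{(m)}(s)\right|\left(\int_0^\infty \left(t^\theta\left\|u\left(\frac{t}{s}\right)\right\|_{A_0}\right)^p\frac{dt}{t}\right)^{1/p}\frac{ds}{s} \\
&= \int_0^\infty s^{\theta+m}\left|\phi^{(m)}(s)\right|\frac{ds}{s}\left(\int_0^\infty \left(\tau^\theta\|u(\tau)\|_{A_0}\right)^p\frac{d\tau}{\tau}\right)^{1/p} \\
&= C\left(\int_0^\infty \left(\tau^\theta\|u(\tau)\|_{A_0}\right)^p\frac{d\tau}{\tau}\right)^{1/p}.
\end{aligned}
\tag{46.27}
\]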
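The second term equals
\[
\begin{aligned}
&\left(\int_0^\infty \left(t^{1-\theta}\left\|v\left(\frac{1}{t}\right)\right\|_{A_1}\right)^p\frac{dt}{t}\right)^{1/p} = \left(\int_0^\infty \left(t^{\theta-1}\|v(t)\|_{A_1}\right)^p\frac{dt}{t}\right)^{1/p} \\
&= \left(\int_0^\infty \left(t^{\theta-1}\left\|\frac{1}{(m-1)!}\int_0^\infty \phi(s)\frac{t^m}{s^m}u^{(m)}\left(\frac{t}{s}\right)\frac{ds}{s}\right\|_{A_1}\right)^p\frac{dt}{t}\right)^{1/p} \\
&\le \int_0^\infty \left(\int_0^\infty \left(\frac{t^{\theta+m-1}}{s^m}|\phi(s)|\left\|u^{(m)}\left(\frac{t}{s}\right)\right\|_{A_1}\right)^p\frac{dt}{t}\right)^{1/p}\frac{ds}{s} \\
&\le \int_0^\infty \frac{|\phi(s)|}{s^m}\left(\int_0^\infty \left(t^{\theta+m-1}\left\|u^{(m)}\left(\frac{t}{s}\right)\right\|_{A_1}\right)^p\frac{dt}{t}\right)^{1/p}\frac{ds}{s} \\
&= \int_0^\infty \frac{|\phi(s)|}{s^m}s^{\theta+m-1}\left(\int_0^\infty \left(\tau^{\theta+m-1}\left\|u^{(m)}(\tau)\right\|_{A_1}\right)^p\frac{d\tau}{\tau}\right)^{1/p}\frac{ds}{s} \\
&= C\left(\int_0^\infty \left(\tau^{\theta+m-1}\left\|u^{(m)}(\tau)\right\|_{A_1}\right)^p\frac{d\tau}{\tau}\right)^{1/p}.
\end{aligned}
\tag{46.28}
\]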
Now from the estimates on the two terms in 46.26 found in 46.27 and 46.28, and
the simple estimate,
2 max (α, β) ≥ α + β,
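it follows
\[ \|a\|_{\theta,p,J} \le C\max\left(\left(\int_0^\infty \left(\tau^\theta\|u(\tau)\|_{A_0}\right)^p\frac{d\tau}{\tau}\right)^{1/p},\, \left(\int_0^\infty \left(\tau^{\theta+m-1}\left\|u^{(m)}(\tau)\right\|_{A_1}\right)^p\frac{d\tau}{\tau}\right)^{1/p}\right) \tag{46.29--46.31} \]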
which shows that after taking the infimum over all u whose trace is a, it follows
a ∈ (A0 , A1 )θ,p,J .
||a||θ,p,J ≤ C ||a||T m (46.32)
Thus T m (A0 , A1 , θ, p) ⊆ (A0 , A1 )θ,p,J .
Is (A0 , A1 )θ,p,J ⊆ T m (A0 , A1 , θ, p)? Let a ∈ (A0 , A1 )θ,p,J . There exists u having
values in A0 ∩ A1 and such that
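\[ a = \int_0^\infty u(t)\frac{dt}{t} = \int_0^\infty u\left(\frac{1}{t}\right)\frac{dt}{t}, \]
in $A_0 + A_1$ such that
\[ \int_0^\infty \left(t^{-\theta}J(t,u(t))\right)^p\frac{dt}{t} < \infty, \text{ where } J(t,a) = \max\left(\|a\|_{A_0},\, t\|a\|_{A_1}\right). \]
\[
\begin{aligned}
{} &= \left(\int_0^\infty \left(t^{-\theta}\left\|\int_0^1 (1-\tau)^{m-1}u(\tau t)\frac{d\tau}{\tau}\right\|_{A_0}\right)^p\frac{dt}{t}\right)^{1/p} \\
&\le \int_0^1\left(\int_0^\infty \left(t^{-\theta}(1-\tau)^{m-1}\|u(\tau t)\|_{A_0}\right)^p\frac{dt}{t}\right)^{1/p}\frac{d\tau}{\tau} \\
&= \int_0^1 \tau^\theta(1-\tau)^{m-1}\left(\int_0^\infty \left(s^{-\theta}\|u(s)\|_{A_0}\right)^p\frac{ds}{s}\right)^{1/p}\frac{d\tau}{\tau} \\
&= \left(\int_0^1 \tau^{\theta-1}(1-\tau)^{m-1}\,d\tau\right)\left(\int_0^\infty \left(s^{-\theta}\|u(s)\|_{A_0}\right)^p\frac{ds}{s}\right)^{1/p} \\
&\le C\left(\int_0^\infty \left(s^{-\theta}\|u(s)\|_{A_0}\right)^p\frac{ds}{s}\right)^{1/p} \\
&\le C\left(\int_0^\infty \left(t^{-\theta}J(t,u(t))\right)^p\frac{dt}{t}\right)^{1/p} < \infty.
\end{aligned}
\tag{46.37}
\]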
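The constant multiplying the integral in the fourth line of this chain is a Beta integral. As a brief side computation (a sketch, assuming m ≥ 1 is an integer as in the rest of the argument),
\[ \int_0^1 \tau^{\theta-1}(1-\tau)^{m-1}\,d\tau = B(\theta,m) = \frac{\Gamma(\theta)\Gamma(m)}{\Gamma(\theta+m)} < \infty, \]
finite because θ > 0 and m ≥ 1. For m = 1 this reduces to $\int_0^1 \tau^{\theta-1}\,d\tau = 1/\theta$.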
0
It follows that
||w||V m ≡
õZ ¶ µZ ∞ µ !
∞¡ ¢p dt 1/p ¯¯ ¯¯ ¶p dt ¶1/p
θ θ+m−1 ¯¯ (m) ¯¯
max t ||w (t)||A0 , t ¯¯w (t)¯¯
0 t 0 A1 t
µZ ∞ ¶1/p
¡ −θ
¢p
≤C t J (t, u (t)) dt <∞
0
||a||T m ≤ C ||a||θ,p,J .
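\[ W^{\theta,p}(\Omega) \equiv T\left(W^{1,p}(\Omega), L^p(\Omega), p, 1-\theta\right). \]
Thus, from the above general theory, $W^{1,p}(\Omega) \hookrightarrow W^{\theta,p}(\Omega) \hookrightarrow L^p(\Omega) = L^p(\Omega) + W^{1,p}(\Omega)$. Now we consider the trace map for Sobolev space.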
Lemma 47.2 Let $\phi \in C^\infty\left(\overline{\mathbb{R}^n_+}\right)$. Then $\gamma\phi(x') \equiv \phi(x',0)$. Then $\gamma : C^\infty\left(\overline{\mathbb{R}^n_+}\right) \to L^p\left(\mathbb{R}^{n-1}\right)$ is continuous as a map from $W^{1,p}\left(\mathbb{R}^n_+\right)$ to $L^p\left(\mathbb{R}^{n-1}\right)$.
Proof: We know
\[ \phi(x',x_n) = \gamma\phi(x') + \int_0^{x_n}\frac{\partial\phi(x',t)}{\partial t}\,dt \]
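and we see the same constant holds for all $u \in W^{1,p}\left(\mathbb{R}^n_+\right)$. Now we will assert more than this. From the definition of the norm in the trace space, if $f \in C^\infty\left(\overline{\mathbb{R}^n_+}\right)$, and we let $\theta = 1 - \frac{1}{p}$, then
\[
\begin{aligned}
\|\gamma f\|_{1-\frac{1}{p},p,\mathbb{R}^{n-1}} &\le \max\left(\left(\int_0^\infty \left(t^{1/p}\|f(t)\|_{1,p,\mathbb{R}^{n-1}}\right)^p\frac{dt}{t}\right)^{1/p},\, \left(\int_0^\infty \left(t^{1/p}\|f'(t)\|_{0,p,\mathbb{R}^{n-1}}\right)^p\frac{dt}{t}\right)^{1/p}\right) \\
&\le C\|f\|_{1,p,\mathbb{R}^n_+}.
\end{aligned}
\]
Thus, if $f \in W^{1,p}\left(\mathbb{R}^n_+\right)$, we may define $\gamma f \in W^{1-\frac{1}{p},p}\left(\mathbb{R}^{n-1}\right)$ according to the rule,
\[ \gamma f = \lim_{k\to\infty}\gamma\phi_k, \]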
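where $\phi_k \to f$ in $W^{1,p}\left(\mathbb{R}^n_+\right)$ and $\phi_k \in C^\infty\left(\overline{\mathbb{R}^n_+}\right)$. This shows the continuity part of the following lemma.
Lemma 47.4 The trace map, γ, is a continuous map from $W^{1,p}\left(\mathbb{R}^n_+\right)$ onto $W^{1-\frac{1}{p},p}\left(\mathbb{R}^{n-1}\right)$.
Furthermore, for $f \in W^{1,p}\left(\mathbb{R}^n_+\right)$,
the limit taking place in $L^p\left(\mathbb{R}^{n-1}\right)$.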
Proof: It remains to verify γ is onto along with the displayed equation. But by definition, things in $W^{1-\frac{1}{p},p}\left(\mathbb{R}^{n-1}\right)$ are of the form $\lim_{t\to 0+}f(t)$ where $f \in L^p\left(0,\infty;W^{1,p}\left(\mathbb{R}^{n-1}\right)\right)$, and $f' \in L^p\left(0,\infty;L^p\left(\mathbb{R}^{n-1}\right)\right)$, the limit taking place in
\[ W^{1,p}\left(\mathbb{R}^{n-1}\right) + L^p\left(\mathbb{R}^{n-1}\right) = L^p\left(\mathbb{R}^{n-1}\right), \]
But we also have that for a.e. $x'$, the following equation holds for a.e. t > 0.
\[ f(x',t) = \gamma f(x') + \int_0^t f_{,x_n}(x',s)\,ds, \tag{47.1} \]
showing that
\[ \gamma f = f(0) \in W^{1-\frac{1}{p},p}\left(\mathbb{R}^{n-1}\right) \equiv T\left(W^{1,p}(\Omega), L^p(\Omega), p, \frac{1}{p}\right). \]
To see that 47.1 holds, we approximate f with a sequence from $C^\infty\left(\overline{\mathbb{R}^n_+}\right)$ and finally obtain an equation of the form
\[ \int_{\mathbb{R}^{n-1}}\int_0^\infty \left[f(x',t) - \gamma f(x') - \int_0^t f_{,x_n}(x',s)\,ds\right]\psi(x',t)\,dt\,dx' = 0, \]
which holds for all $\psi \in C_c^\infty\left(\mathbb{R}^n_+\right)$. This proves the lemma.
Thus we lose $\frac{1}{p}$ derivatives when we take the trace of a function in $W^{1,p}\left(\mathbb{R}^n_+\right)$.
which has the property that γ (Rg) = g. We will define this function as follows.
\[ Rg(x',x_n) \equiv \int_{\mathbb{R}^{n-1}} g(y')\,\phi\left(\frac{x'-y'}{x_n}\right)\frac{1}{x_n^{n-1}}\,dy' \tag{47.2} \]
where φ is a mollifier having support in B (0, 1). Then we have the following lemma.
Lemma 47.5 Let R be defined in 47.2. Then $Rg \in W^{1,p}\left(\mathbb{R}^n_+\right)$ and is a continuous linear map from $W^{1-\frac{1}{p},p}\left(\mathbb{R}^{n-1}\right)$ to $W^{1,p}\left(\mathbb{R}^n_+\right)$ with the property that γRg = g.
Proof: Let $f \in W^{1,p}\left(\mathbb{R}^n_+\right)$ be such that γf = g. Let $\psi(x_n) \equiv (1-x_n)_+$ and assume f is Borel measurable by taking a Borel measurable representative. Then for a.e. $x'$ we have the following formula holding for a.e. $x_n$.
\[ Rg(x',x_n) = \int_{\mathbb{R}^{n-1}}\left[\psi(x_n)f(y',\psi(x_n)) - \int_0^{\psi(x_n)}(\psi f)_{,n}(y',t)\,dt\right]\phi\left(\frac{x'-y'}{x_n}\right)x_n^{1-n}\,dy'. \]
Using the repeated index summation convention to save space, we obtain that in
terms of weak derivatives,
Rg,n (x0 , xn )
Z " Z #
ψ(xn )
0
= ψ (xn ) f (y , ψ (xn )) − (ψf ),n (y0 , t) dt ·
Rn−1 0
· µ ¶µ ¶ µ ¶ ¸
x0 − y0 yk − xk x0 − y 0 (1 − n)
φ,k +φ dy 0
xn xnn xn xnn
Z " Z #
ψ(xn )
0 0 0 0
= ψ (xn ) f (x − xn z , ψ (xn )) − (ψf ),n (x − xn z , t) dt ·
Rn−1 0
· µ ¶ ¸
0 yk − xk (1 − n) n 0
0
φ,k (z ) zk + φ (z ) xn dz
xnn xnn
and so
\[
\begin{aligned}
|Rg_{,n}(x',x_n)| &\le C(\phi)\left|\int_{B(0,1)}\left[\psi(x_n)f(x'-x_n z',\psi(x_n)) - \int_0^{\psi(x_n)}(\psi f)_{,n}(x'-x_n z',t)\,dt\right]dz'\right| \\
&\le \frac{C(\phi)}{x_n^{n-1}}\left\{\int_{B(0,x_n)}|\psi(x_n)f(x'+y',\psi(x_n))|\,dy' + \int_{B(0,x_n)}\int_0^{\psi(x_n)}\left|(\psi f)_{,n}(x'+y',t)\right|dt\,dy'\right\}
\end{aligned}
\]
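Therefore,
\[
\begin{aligned}
&\left(\int_0^\infty\int_{\mathbb{R}^{n-1}}|Rg_{,n}(x',x_n)|^p\,dx'\,dx_n\right)^{1/p} \le \\
&C(\phi)\left(\int_0^\infty\int_{\mathbb{R}^{n-1}}\left(\frac{1}{x_n^{n-1}}\int_{B(0,x_n)}|\psi(x_n)f(x'+y',\psi(x_n))|\,dy'\right)^p dx'\,dx_n\right)^{1/p} \\
&+ C(\phi)\left(\int_0^\infty\int_{\mathbb{R}^{n-1}}\left(\frac{1}{x_n^{n-1}}\int_{B(0,x_n)}\int_0^{\psi(x_n)}\left|(\psi f)_{,n}(x'+y',t)\right|dt\,dy'\right)^p dx'\,dx_n\right)^{1/p}
\end{aligned}
\tag{47.3}
\]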
Consider the first term on the right. We change variables, letting $y' = z'x_n$. Then this term becomes
\[
\begin{aligned}
&C(\phi)\left(\int_0^1\int_{\mathbb{R}^{n-1}}\left(\int_{B(0,1)}|\psi(x_n)f(x'+x_n z',\psi(x_n))|\,dz'\right)^p dx'\,dx_n\right)^{1/p} \\
&\le C(\phi)\int_{B(0,1)}\left(\int_0^1\int_{\mathbb{R}^{n-1}}|\psi(x_n)f(x'+x_n z',\psi(x_n))|^p\,dx'\,dx_n\right)^{1/p}dz'
\end{aligned}
\]
Taking the infimum over all such f and using the definition of the norm in $W^{1-\frac{1}{p},p}\left(\mathbb{R}^{n-1}\right)$, it follows
\[ \|Rg\|_{1,p,\mathbb{R}^n_+} \le C(\phi)\|g\|_{1-\frac{1}{p},p,\mathbb{R}^{n-1}}, \]
Corollary 47.7 The space, $W^{s,p}(\Omega)$ is a reflexive Banach space whenever p > 1.
Proof: We know from the theory of interpolation spaces that $W^{\sigma,p}(\Omega)$ is reflexive. This is because it is an interpolation space for the two reflexive spaces, $L^p(\Omega)$ and $W^{1,p}(\Omega)$. Now the formula for the norm of an element in $W^{s,p}(\Omega)$ shows this space is isometric to a closed subspace of $W^{m,p}(\Omega) \times W^{\sigma,p}(\Omega)^k$ for suitable k. Therefore, $W^{s,p}(\Omega)$ is also reflexive.
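Theorem 47.8 The trace map, $\gamma : W^{m,p}\left(\mathbb{R}^n_+\right) \to W^{m-\frac{1}{p},p}\left(\mathbb{R}^{n-1}\right)$ is continuous.
Proof: Let f ∈ S. We let $\sigma = 1 - \frac{1}{p}$ so that $m - \frac{1}{p} = m - 1 + \sigma$. Then from the definition,
\[ \|\gamma f\|_{m-\frac{1}{p},p,\mathbb{R}^{n-1}} = \left(\|\gamma f\|^p_{m-1,p,\mathbb{R}^{n-1}} + \sum_{|\alpha|=m-1}\|D^\alpha\gamma f\|^p_{1-\frac{1}{p},p,\mathbb{R}^{n-1}}\right)^{1/p} \]
and from Lemma 47.4, and the fact that the trace is continuous as a map from $W^{m,p}\left(\mathbb{R}^n_+\right)$ to $W^{m-1,p}\left(\mathbb{R}^{n-1}\right)$,
\[ \|\gamma f\|_{m-\frac{1}{p},p,\mathbb{R}^{n-1}} \le \left(C_1\|f\|^p_{m,p,\mathbb{R}^n_+} + C_2\|f\|^p_{m,p,\mathbb{R}^n_+}\right)^{1/p} \le C\|f\|_{m,p,\mathbb{R}^n_+}. \]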
Theorem 47.9 Let h : U → V where U and V are two open sets and suppose h is
bilipschitz and that Dα h and Dα h−1 exist and are Lipschitz continuous if |α| ≤ m
where m = 0, 1, · · · and s = m + σ where σ ∈ (0, 1) . Then
\[ h^* : W^{s,p}(V) \to W^{s,p}(U) \]
is continuous, linear, one to one, and has an inverse with the same properties, the inverse being $\left(h^{-1}\right)^*$.
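\[ L_k v = h^*(v)\,h_{k,j}. \]
and so
\[
\begin{aligned}
\|D_j(h^*u)\|_{\sigma,p,U} &\le \sum_k \|L_k(u_{,k})\|_{\sigma,p,U} \\
&\le \sum_k C_k\|D_k u\|_{\sigma,p,V} \\
&\le C\left(\sum_k \|D_k u\|^p_{\sigma,p,V}\right)^{1/p}.
\end{aligned}
\]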
The general case is similar. We simply have a more complicated continuous linear
operator in place of Lk .
Now we prove an important interpolation inequality for Sobolev spaces.
Theorem 47.10 Let Ω be an open set in Rn and let f ∈ W m+1,p (Ω) and σ ∈
(0, 1) . Then for some constant, C, independent of f,
1−σ σ
||f ||m+σ,p,Ω ≤ C ||f ||m+1,p,Ω ||f ||m,p,Ω .
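Therefore,
\[
\begin{aligned}
\|f\|_{m+\sigma,p,\Omega} &\le \Big( \|f\|^p_{m,p,\Omega} + \sum_{|\alpha|=m} \big( K \|D^\alpha f\|^{1-\sigma}_{1,p,\Omega} \|D^\alpha f\|^{\sigma}_{0,p,\Omega} \big)^p \Big)^{1/p} \\
&\le C \Big[ \|f\|^p_{m,p,\Omega} + \big( \|f\|^{1-\sigma}_{m+1,p,\Omega} \|f\|^{\sigma}_{m,p,\Omega} \big)^p \Big]^{1/p} \\
&\le C \Big[ \big( \|f\|^{1-\sigma}_{m+1,p,\Omega} \|f\|^{\sigma}_{m,p,\Omega} \big)^p + \big( \|f\|^{1-\sigma}_{m+1,p,\Omega} \|f\|^{\sigma}_{m,p,\Omega} \big)^p \Big]^{1/p} \\
&\le C \|f\|^{1-\sigma}_{m+1,p,\Omega} \|f\|^{\sigma}_{m,p,\Omega}.
\end{aligned}
\]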
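The passage from the second to the third line of this chain uses only the monotonicity of the integer order norms. A sketch of that step follows. It assumes the usual inequalities $\|f\|_{m,p,\Omega} \le \|f\|_{m+1,p,\Omega}$ and, for $|\alpha| = m$, $\|D^\alpha f\|_{1,p,\Omega} \le \|f\|_{m+1,p,\Omega}$ and $\|D^\alpha f\|_{0,p,\Omega} \le \|f\|_{m,p,\Omega}$, which may hold only up to a constant depending on the convention chosen for the norm.

```latex
\[
\|f\|_{m,p,\Omega}
  = \|f\|_{m,p,\Omega}^{1-\sigma}\, \|f\|_{m,p,\Omega}^{\sigma}
  \le \|f\|_{m+1,p,\Omega}^{1-\sigma}\, \|f\|_{m,p,\Omega}^{\sigma},
\]
\[
\|D^\alpha f\|_{1,p,\Omega}^{1-\sigma}\, \|D^\alpha f\|_{0,p,\Omega}^{\sigma}
  \le \|f\|_{m+1,p,\Omega}^{1-\sigma}\, \|f\|_{m,p,\Omega}^{\sigma}
  \qquad (|\alpha| = m).
\]
```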
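This proves the first part. Now we consider the second. Let $\phi \in C^\infty(\overline{\Omega})$. Then
\[
\begin{aligned}
\|L\phi\|_{m+\sigma,p,\Omega} &= \Big( \|L\phi\|^p_{m,p,\Omega} + \sum_{|\alpha|=m} \|D^\alpha L\phi\|^p_{\sigma,p,\Omega} \Big)^{1/p} \\
&= \Big( \|L\phi\|^p_{m,p,\Omega} + \sum_{|\alpha|=m} \|L D^\alpha \phi\|^p_{T(W^{1,p},L^p,p,1-\sigma)} \Big)^{1/p} \\
&= \Big( \|L\phi\|^p_{m,p,\Omega} + \sum_{|\alpha|=m} \Big[ \inf \Big( \big\|t^{1-\sigma} L f_\alpha\big\|_1^{\sigma} \big\|t^{1-\sigma} L f_\alpha'\big\|_2^{1-\sigma} \Big) \Big]^p \Big)^{1/p}
\end{aligned}
\tag{47.6}
\]
where $f_\alpha(0) \equiv \lim_{t\to 0} f_\alpha(t) = D^\alpha \phi$ in $W^{1,p}(\Omega) + L^p(\Omega)$, and the infimum is taken over all such functions. Therefore, from 47.6, and letting $\|L\|_1$ denote the operator norm of $L$ in $W^{1,p}(\Omega)$ and $\|L\|_2$ denote the operator norm of $L$ in $L^p(\Omega)$,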
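\[
\begin{aligned}
\|L\phi\|_{m+\sigma,p,\Omega} &\le \Big( \|L\phi\|^p_{m,p,\Omega} + \sum_{|\alpha|=m} \Big[ \inf \Big( \|L\|_1^{\sigma} \|L\|_2^{1-\sigma} \big\|t^{1-\sigma} f_\alpha\big\|_1^{\sigma} \big\|t^{1-\sigma} f_\alpha'\big\|_2^{1-\sigma} \Big) \Big]^p \Big)^{1/p} \\
&\le \Big( \|L\phi\|^p_{m,p,\Omega} + \big( \|L\|_1^{\sigma} \|L\|_2^{1-\sigma} \big)^p \sum_{|\alpha|=m} \Big[ \inf \Big( \big\|t^{1-\sigma} f_\alpha\big\|_1^{\sigma} \big\|t^{1-\sigma} f_\alpha'\big\|_2^{1-\sigma} \Big) \Big]^p \Big)^{1/p} \\
&\le C \Big( \|\phi\|^p_{m,p,\Omega} + \sum_{|\alpha|=m} \big[ \|D^\alpha \phi\|_{\sigma,p,\Omega} \big]^p \Big)^{1/p} = C \|\phi\|_{m+\sigma,p,\Omega}.
\end{aligned}
\]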
Since $C^\infty(\overline{\Omega})$ is dense in all the Sobolev spaces, this inequality establishes the desired result.
Definition 47.11 We define for $s \ge 0$, $W^{-s,p'}(\mathbb{R}^n)$ to be the dual space of $W^{s,p}(\mathbb{R}^n)$. Here $\frac{1}{p} + \frac{1}{p'} = 1$.
Note that in the case of $s = 0$ this is consistent with the Riesz representation theorem for the $L^p$ spaces.
1298 TRACES OF SOBOLEV SPACES AND FRACTIONAL ORDER SPACES
Sobolev Spaces On Manifolds
converges for a.e. $x$. Since $h$ maps sets of measure zero to sets of $n$ dimensional Hausdorff measure zero, it follows that for a.e. $y \in \Gamma$,
\[
u_j(y) \to u(y) \text{ a.e.}
\]
Therefore, $w_i(x) = h_i^*(u\psi_i)(x)$ a.e. and this shows $h_i^*(u\psi_i) \in W^{s,p}(U_i)$. Thus $u \in W^{s,p}(\Gamma)$ and this shows completeness. It is clear $\|\cdot\|_{s,p,\Gamma}$ is a norm. Thus $L$ is an isometry of $W^{s,p}(\Gamma)$ and a closed subspace of $\prod_{i=1}^l W^{s,p}(U_i)$ so this proves the lemma since by Corollary 47.7, $W^{s,p}(U_i)$ is reflexive.
We now show that any two such norms are equivalent.
Suppose $\{W_j', \phi_j, \Gamma_j, V_j, g_j, G_j\}_{j=1}^r$ and $\{W_i, \psi_i, \Gamma_i, U_i, h_i, H_i\}_{i=1}^l$ both satisfy the conditions described above. Let $\|\cdot\|^1_{s,p,\Gamma}$ denote the norm defined by
\[
\begin{aligned}
\|u\|^1_{s,p,\Gamma} &\equiv \sum_{j=1}^r \big\|g_j^*(u\phi_j)\big\|_{s,p,V_j} \\
&\le \sum_{j=1}^r \Big\| g_j^*\Big( \sum_{i=1}^l u\phi_j\psi_i \Big) \Big\|_{s,p,V_j} \le \sum_{j,i} \big\|g_j^*(u\phi_j\psi_i)\big\|_{s,p,V_j} \\
&= \sum_{j,i} \big\|g_j^*(u\phi_j\psi_i)\big\|_{s,p,g_j^{-1}(W_i \cap W_j')}
\end{aligned}
\tag{48.2}
\]
Now we may define a new norm $\|u\|^{1,g}_{s,p,\Gamma}$ by the formula 48.2. This norm is determined by
\[
\{W_j' \cap W_i, \psi_i\phi_j, \Gamma_j \cap \Gamma_i, V_j, g_{i,j}, G_{i,j}\}
\]
where $g_{i,j} = g_j$. Thus the identity map is continuous from $\big(W^{s,p}(\Gamma), \|\cdot\|^{1,g}_{s,p,\Gamma}\big)$ to $\big(W^{s,p}(\Gamma), \|\cdot\|^{1}_{s,p,\Gamma}\big)$. It follows the two norms, $\|\cdot\|^{1,g}_{s,p,\Gamma}$ and $\|\cdot\|^{1}_{s,p,\Gamma}$, are equivalent by the open mapping theorem. In a similar way, the norms, $\|\cdot\|^{2,h}_{s,p,\Gamma}$ and $\|\cdot\|^{2}_{s,p,\Gamma}$ are equivalent where
\[
\|u\|^2_{s,p,\Gamma} \equiv \sum_{i=1}^l \|h_i^*(u\psi_i)\|_{s,p,U_i}
\]
48.2. THE TRACE ON THE BOUNDARY OF AN OPEN SET 1301
and
\[
\|u\|^{2,h}_{s,p,\Gamma} \equiv \sum_{j,i} \big\|h_i^*(u\phi_j\psi_i)\big\|_{s,p,U_i} = \sum_{j,i} \big\|h_i^*(u\phi_j\psi_i)\big\|_{s,p,h_i^{-1}(W_i \cap W_j')}.
\]
But from the assumptions on h and g, in particular the assumption that these
are restrictions of functions which are defined on open subsets of Rm which have
Lipschitz derivatives up to order k along with their inverses, we know from Theorem
47.9, there exist constants Ci , independent of u such that
\[
\big\|h_i^*(u\phi_j\psi_i)\big\|_{s,p,h_i^{-1}(W_i \cap W_j')} \le C_1 \big\|g_j^*(u\phi_j\psi_i)\big\|_{s,p,g_j^{-1}(W_i \cap W_j')}.
\]
Theorem 48.3 Let $\Gamma$ be described above. Then we may define $W^{s,p}(\Gamma)$ as in Definition 41.36 and any two norms like those given in this definition are equivalent.
\[
\mathbb{R}^{n-1}_k \equiv \{x \in \mathbb{R}^n : x_k = 0\}, \quad \hat{x}_k \equiv (x_1, \cdots, x_{k-1}, 0, x_{k+1}, \cdots, x_n).
\]
We will say an open set, $\Omega$ is $C^{m,1}$ if there exist open sets, $W_i$, $i = 0, 1, \cdots, l$ such that
\[
\Omega = \cup_{i=0}^l W_i
\]
with $W_0 \subseteq \Omega$, open sets $U_i \subseteq \mathbb{R}^{n-1}_k$ for some $k$, and open intervals, $(a_i, b_i)$ containing 0 such that for $i \ge 1$,
\[
\partial\Omega \cap W_i = \{\hat{x}_k + \phi_i(\hat{x}_k) e_k : \hat{x}_k \in U_i\},
\]
\[
\Omega \cap W_i = \{\hat{x}_k + (\phi_i(\hat{x}_k) + x_k) e_k : (\hat{x}_k, x_k) \in U_i \times (0, b_i)\},
\]
where $\phi_i$ is Lipschitz with partial derivatives up to order $m$ also Lipschitz. Note that it makes no difference whether we use $(0, b_i)$ or $(a_i, 0)$ in the last part of this definition since we can go from one to the other by a simple change of the $\phi_i$.
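To make the graph description concrete, the following is a minimal sketch in the plane ($n = 2$, $k = 2$) with one hypothetical boundary profile, phi(t) = |t|, which is Lipschitz (the case $m = 0$). The names phi, h, H, H_inv and in_omega_patch, and the interval length b, are assumptions made for illustration only. The sketch shows how the map sending $(\hat{x}_k, x_k)$ to $\hat{x}_k + (\phi(\hat{x}_k) + x_k) e_k$ carries a flat strip onto the region just above the graph of phi, and how the slice $x_k = 0$ lands on the graph itself.

```python
from typing import Tuple

Point = Tuple[float, float]


def phi(t: float) -> float:
    """Hypothetical boundary profile: Lipschitz with constant 1."""
    return abs(t)


def h(t: float) -> Point:
    """Boundary chart: x_hat + phi(x_hat) e_2, a point on the graph of phi."""
    return (t, phi(t))


def H(x: Point) -> Point:
    """Flattening map: x_hat + (phi(x_hat) + x_2) e_2."""
    t, s = x
    return (t, phi(t) + s)


def H_inv(y: Point) -> Point:
    """Inverse of H: subtract the boundary height from the last coordinate."""
    t, v = y
    return (t, v - phi(t))


def in_omega_patch(y: Point, b: float) -> bool:
    """Model membership test for the local patch: 0 < y_2 - phi(y_1) < b."""
    t, v = y
    return 0.0 < v - phi(t) < b


# A point of the strip U x (0, b) with U = (-1, 1) and b = 1.
x = (-0.5, 0.25)
y = H(x)
```

Here y is the image of x under H, and H_inv(y) recovers x. Since phi is only Lipschitz, H is bilipschitz but not differentiable along the line t = 0, which is the situation covered by taking m = 0.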
\[
h_i(\hat{x}_k) = \hat{x}_k + \phi_i(\hat{x}_k) e_k, \quad H_i(x) \equiv \hat{x}_k + (\phi_i(\hat{x}_k) + x_k) e_k,
\]
and let $\psi_i \in C_c^\infty(W_i)$ with $\sum_{i=0}^l \psi_i(x) = 1$ on $\Omega$, we see that
\[
\{W_i, \psi_i, \partial\Omega \cap W_i, U_i, h_i, H_i\}_{i=1}^l
\]
satisfies all the conditions for defining $W^{s,p}(\partial\Omega)$ for $s \le m$. Let $u \in C^\infty(\overline{\Omega})$ and let $h_i$ be as just described. Using Theorem 47.8, and Theorem 40.14,
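\[
\begin{aligned}
\|\gamma u\|_{m-\frac{1}{p},p,\partial\Omega} &= \sum_{i=1}^l \|h_i^*(\psi_i \gamma u)\|_{m-\frac{1}{p},p,U_i} \\
&= \sum_{i=1}^l \|h_i^*(\psi_i \gamma u)\|_{m-\frac{1}{p},p,\mathbb{R}^{n-1}_k} \le C \sum_{i=1}^l \|H_i^*(\psi_i u)\|_{m,p,\mathbb{R}^n_+} \\
&\le C \sum_{i=1}^l \|H_i^*(\psi_i u)\|_{m,p,U_i \times (0,b_i)} \le C \sum_{i=1}^l \|(\psi_i u)\|_{m,p,W_i \cap \Omega} \\
&\le C \sum_{i=1}^l \|(\psi_i u)\|_{m,p,\Omega} \le C \sum_{i=1}^l \|u\|_{m,p,\Omega} \le C l \|u\|_{m,p,\Omega}.
\end{aligned}
\]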
Now we use the density of $C^\infty(\overline{\Omega})$ in $W^{m,p}(\Omega)$ to see that $\gamma$ extends to a continuous linear map defined on $W^{m,p}(\Omega)$ still called $\gamma$ such that for all $u \in W^{m,p}(\Omega)$,
In addition to this, in the case where $m = 1$, we may use Lemma 47.5 to obtain a continuous linear map, $R$, from $W^{1-\frac{1}{p},p}(\partial\Omega)$ to $W^{1,p}(\Omega)$ which has the property that $\gamma R g = g$ for every $g \in W^{1-\frac{1}{p},p}(\partial\Omega)$. Letting $g \in W^{1-\frac{1}{p},p}(\partial\Omega)$,
\[
g = \sum_{i=1}^l \psi_i g.
\]
Then also,
\[
h_i^*(\psi_i g) \in W^{1-\frac{1}{p},p}(\mathbb{R}^{n-1})
\]
and so from Lemma 47.5, we can extend this to $W^{1,p}(\mathbb{R}^n_+)$, $R h_i^*(\psi_i g)$. We may also assume that $R h_i^*(\psi_i g) \in W^{1,p}(U_i \times (0, b_i))$. We can accomplish this by multiplying by a suitable cut off function in the definition of $R$ or else adjusting the function, $\psi$ occurring in the proof of this lemma so that it vanishes off $(0, b_i)$. Then our extension is
\[
Rg = \sum_{i=1}^l \big(H_i^{-1}\big)^* R h_i^*(\psi_i g).
\]
The purpose of this appendix is to prove the equivalence between the axiom of
choice, the Hausdorff maximal theorem, and the well-ordering principle. The Hausdorff
maximal theorem and the well-ordering principle are very useful but a little
hard to believe; so, it may be surprising that they are equivalent to the axiom of
choice. First it is shown that the axiom of choice implies the Hausdorff maximal
theorem, a remarkable theorem about partially ordered sets.
A nonempty set is partially ordered if there exists a partial order, ≺, satisfying
x≺x
and
if x ≺ y and y ≺ z then x ≺ z.
An example of a partially ordered set is the set of all subsets of a given set and
≺≡⊆. Note that two elements in a partially ordered set may not be related. In
other words, just because x, y are in the partially ordered set, it does not follow
that either x ≺ y or y ≺ x. A subset of a partially ordered set, C, is called a chain
if x, y ∈ C implies that either x ≺ y or y ≺ x. If either x ≺ y or y ≺ x then x and
y are described as being comparable. A chain is also called a totally ordered set. C
is a maximal chain if whenever $\widetilde{C}$ is a chain containing C, it follows the two chains
are equal. In other words C is a maximal chain if there is no strictly larger chain.
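These definitions can be tried out on a small finite example. The sketch below takes the subsets of {1, 2} ordered by inclusion. The function names power_set, is_chain and is_maximal_chain are hypothetical helpers introduced here, not notation from the text. In a finite partially ordered set, a chain is maximal exactly when no single further element can be added while keeping it a chain, which is what the last function tests.

```python
from itertools import combinations


def power_set(base):
    """All subsets of `base`, as frozensets, ordered by inclusion below."""
    items = sorted(base)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_chain(sets):
    """True if every two members are comparable under set inclusion."""
    return all(a <= b or b <= a for a, b in combinations(list(sets), 2))


def is_maximal_chain(chain, poset):
    """True if `chain` is a chain and no other element of `poset` extends it."""
    chain = set(chain)
    if not is_chain(chain):
        return False
    return not any(is_chain(chain | {x}) for x in poset if x not in chain)


P = power_set({1, 2})
empty, one, two, both = frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})

incomparable = is_chain([one, two])                    # {1} and {2} are not related
full_chain = is_maximal_chain([empty, one, both], P)   # cannot be enlarged
short_chain = is_maximal_chain([empty, both], P)       # {1} could still be added
```

In this example the pair {1}, {2} fails to be a chain, the chain ∅ ⊆ {1} ⊆ {1, 2} is maximal, and ∅ ⊆ {1, 2} is a chain which is not maximal.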
Lemma A.1 Let F be a nonempty partially ordered set with partial order ≺. Then
assuming the axiom of choice, there exists a maximal chain in F.
g(C) = C ∪ {f (C)}.
1306 THE HAUSDORFF MAXIMAL THEOREM
Thus g(C) ⊋ C and g(C) \ C = {f (C)} = {a single element of F}. A subset T of X
is called a tower if
∅∈T,
C ∈ T implies g(C) ∈ T ,
and if S ⊆ T is totally ordered with respect to set inclusion, then
∪S ∈ T .
Here S is a chain with respect to set inclusion whose elements are chains.
Note that X is a tower. Let T0 be the intersection of all towers. Thus, T0 is a
tower, the smallest tower. Are any two sets in T0 comparable in the sense of set
inclusion so that T0 is actually a chain? Let C0 be a set of T0 which is comparable
to every set of T0 . Such sets exist, ∅ being an example. Let
B ≡ {D ∈ T0 : D ⊋ C0 and f (C0 ) ∉ D} .
[Figure: two sketches of the sets C0 and D, marking the positions of the points f (C0 ) and f (D).]
Hence if f (D) ∉ C0 , then D ⊇ C0 . If D = C0 , then f (D) = f (C0 ) ∈ g (D) so
Lemma A.2 The Hausdorff maximal principle implies every nonempty set can be
well-ordered.
(S2 , ≤2 ) is well-ordered
and if
y ∈ S2 \ S1 then x ≤2 y for all x ∈ S1 ,
and if ≤1 is the well order of S1 then the two orders are consistent on S1 . Then
observe that ≺ is a partial order on F. By the Hausdorff maximal principle, let C
be a maximal chain in F and let
X∞ ≡ ∪C.
x ≤1 z whenever x ∈ X∞ .
Then let
$\widetilde{C}$ = {S ∈ C or X∞ ∪ {z}}.
Then $\widetilde{C}$ is a strictly larger chain than C contradicting maximality of C. Thus X \
X∞ = ∅ and this shows X is well-ordered by ≤. This proves the lemma.
With these two lemmas the main result follows.
Proof: It only remains to prove that the well-ordering principle implies the
axiom of choice. Let I be a nonempty set and let Xi be a nonempty set for each
i ∈ I. Let X = ∪{Xi : i ∈ I} and well order X. Let f (i) be the smallest element
of Xi . Then
\[
f \in \prod_{i \in I} X_i .
\]
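The construction in this proof can be illustrated on a finite family, with the caveat that for finite sets no appeal to the well-ordering principle is needed, so this is only a model of the argument. In the sketch below the well order on the union is given as a list, earlier entries being smaller, and f(i) is the least element of the set indexed by i. The names choice_function, family and well_order are assumptions made for illustration.

```python
def choice_function(family, well_order):
    """Pick from each set in `family` its least element under `well_order`.

    `family` maps each index i to a nonempty set X_i, and `well_order` lists
    every element of the union of the X_i exactly once, smallest first.
    """
    rank = {x: position for position, x in enumerate(well_order)}
    return {i: min(X_i, key=rank.__getitem__) for i, X_i in family.items()}


family = {"a": {3, 1}, "b": {2, 3}, "c": {2}}
well_order = [3, 1, 2]   # a hypothetical well order on the union {1, 2, 3}

f = choice_function(family, well_order)
```

With this ordering 3 is the smallest element of the union, so it is selected from both the set indexed by "a" and the set indexed by "b", while the only available choice for "c" is 2. Every selected value f[i] lies in the corresponding set, which is what membership in the product requires.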
A.1 Exercises
1. Zorn’s lemma states that in a nonempty partially ordered set, if every chain
has an upper bound, there exists a maximal element, x in the partially ordered
set. x is maximal, means that if x ≺ y, it follows y = x. Show Zorn’s lemma
is equivalent to the Hausdorff maximal theorem.
[1] Adams R. Sobolev Spaces, Academic Press, New York, San Francisco, London,
1975.
[10] Bledsoe W.W., Am. Math. Monthly vol. 77, PP. 180-182 1970.
[12] Bruckner A. , Bruckner J., and Thomson B., Real Analysis Prentice
Hall 1997.
[16] Diestal J. and Uhl J., Vector Measures, American Math. Society, Provi-
dence, R.I., 1977.
1311
1312 BIBLIOGRAPHY
[17] Dontchev A.L. The Graves theorem Revisited, Journal of Convex Analysis,
Vol. 3, 1996, No.1, 45-53.
[18] Dunford N. and Schwartz J.T. Linear Operators, Interscience Publishers,
a division of John Wiley and Sons, New York, part 1 1958, part 2 1963, part 3
1971.
[19] Duvaut, G. and Lions, J. L., Inequalities in Mechanics and Physics,
Springer-Verlag, Berlin, 1976.
[20] Evans L.C. and Gariepy, Measure Theory and Fine Properties of Functions,
CRC Press, 1992.
[21] Evans L.C. Partial Differential Equations, Berkeley Mathematics Lecture
Notes. 1993.
[22] Federer H., Geometric Measure Theory, Springer-Verlag, New York, 1969.
[23] Gagliardo, E., Properieta di alcune classi di funzioni in piu variabili, Ricerche
Mat. 7 (1958), 102-137.
[24] Grisvard, P. Elliptic problems in nonsmooth domains, Pittman 1985.
[25] Gross L. Abstract Wiener Spaces, Proc. fifth Berkeley Sym. Math. Stat. Prob.
1965.
[26] Hewitt E. and Stromberg K. Real and Abstract Analysis, Springer-Verlag,
New York, 1965.
[27] Hille Einar, Analytic Function Theory, Ginn and Company 1962.
[28] Hörmander, Lars Linear Partial Differrential Operators, Springer Verlag,
1976.
[29] Hörmander L. Estimates for translation invariant operators in Lp spaces,
Acta Math. 104 1960, 93-139.
[30] Hui-Hsiung Kuo Gaussian Measures in Banach Spaces Lecture notes in
Mathematics Springer number 463 1975.
[31] John, Fritz, Partial Differential Equations, Fourth edition, Springer Verlag,
1982.
[32] Jones F., Lebesgue Integration on Euclidean Space, Jones and Bartlett 1993.
[33] Karatzas and Shreve, Brownian Motion and Stochastic Calculus, Springer
Verlag, 1991.
[34] Kuratowski K. and Ryll-Nardzewski C. A general theorem on selectors,
Bull. Acad. Pol. Sc., 13, 397-403.
[35] Kuttler K.L. Basic Analysis. Rinton Press. November 2001.
BIBLIOGRAPHY 1313
[44] Rudin, W., Principles of mathematical analysis, McGraw Hill third edition
1976
[45] Rudin W. Real and Complex Analysis, third edition, McGraw-Hill, 1987.
[46] Rudin W. Functional Analysis, second edition, McGraw-Hill, 1991.
[47] Saks and Zygmund, Analytic functions, 1952. (This book is available on the
web. http://www.geocities.com/alex stef/mylist.html#FuncAn)
[48] Smart D.R. Fixed point theorems Cambridge University Press, 1974.