Introduction to Functional Analysis
Yen Do
Fall 2015
Preface
These are the accompanying expository notes for an introductory course in Functional Analysis
that I taught at UVA. The goal of the course is to study the basic principles of linear
analysis, including the spectral theory of compact and self-adjoint operators. This is not a
monograph or a treatise, and of course no originality is claimed. The prerequisite
is some basic knowledge of real analysis and topology. Some preliminary understanding
of functional analysis is beneficial but not required.
Contents
I Linear Spaces
1 Basic facts
1.1 Metric spaces
1.1.1 Separability
1.2 Linear spaces
1.2.1 The Hahn-Banach extension theorem
1.2.2 The HB theorem with symmetry constraints
1.3 Topological spaces
1.3.1 Compactness
1.3.2 Continuous functions on LCH spaces
1.3.3 Proof of Urysohn’s lemma
1.3.4 Proof of Tietze’s theorem
1.3.5 Partition of unity
2 Overview of normed linear spaces
2.1 Linear functionals and dual spaces
2.1.1 Linear functionals
2.1.2 Dual spaces
2.1.3 Reflexivity
2.2 (Non)compactness of the unit ball
3 Geometry and topology on NLS
3.1 Topologies on normed linear spaces
3.1.1 The Banach–Alaoglu theorem
3.1.2 The sequential Banach-Alaoglu theorem
3.1.3 Weak compactness of the closed unit ball in reflexive spaces
3.2 Uniform convexity
3.3 Isometries between normed linear spaces
4 Basis on Hilbert spaces and Banach spaces
4.1 Basic properties
4.2 Orthogonal basis for a Hilbert space
4.3 Schauder basis
5 Bounded maps on Banach spaces
5.1 The Baire category theorem
5.2 The principle of uniform boundedness
5.3 The open mapping theorem
5.4 The closed graph theorem
5.5 Uniform boundedness of canonical Schauder projections
5.6 Lp spaces
5.6.1 Separability
5.6.2 Duality
5.6.3 Basic facts about bounded operators on Lp
6 Bounded continuous functions and dual spaces
6.1 Riesz representation theorems
6.1.1 The dual space of L∞
6.1.2 Measures on locally compact Hausdorff spaces
6.1.3 The dual space of C0(X) and Cc(X)
6.2 The Stone-Weierstrass approximation theorem
7 Locally convex spaces
7.1 Two equivalent definitions of LCS
7.1.1 Basic properties
7.1.2 Linear maps
7.2 The Hyperplane separation theorem
7.3 The Krein-Milman theorem
7.4 Inductive limit and weak solutions
II Spectral analysis for linear operators
8 Elementary spectral theory
8.1 Spectrum and resolvent set
8.2 Functional calculus and spectral mapping
8.3 Examples of operators and their spectra
8.4 Adjoint operators and spectrum
9 Compact operators
9.1 Compact operators
9.2 Compactness of integral operators
9.3 Spectral properties of compact operators
9.3.1 Riesz’s theorem
9.3.2 Spectral properties
10 Bounded self-adjoint operators
10.1 Diagonalization form
10.2 Projection-valued measure and spectral projection
10.3 Spectral representation and decomposition
11 Spectral theory for unbounded self-adjoint operator
12 Fredholm determinant
12.1 Motivation
12.2 Fredholm’s approach for integral operators
12.2.1 Convergence and Continuity of the determinant
12.3 Fredholm determinant for operators on Hilbert spaces
Part I
Linear Spaces
Chapter 1
Preliminaries
Generally speaking, a space is a set with some structures, which could be

- algebraic (add, multiply)
- geometric (distance, angle, convexity)
- topological (open sets, continuity)
Our goal is to study spaces of functions and their structures using analytic tools. One
could think of this study as one place where “analysis” meets “linear algebra” and “geometry/topology”.
Our focus will be on linear spaces with some notion of geometry/topology.
Sometimes further understanding of these structures can be obtained by looking at
functions on these spaces, which could take values in R, C, or even in other spaces with
similar structure. We refer to these functions as operators/maps/functionals depending on
the nature of the target spaces.
In this chapter we will recall several basic facts about metric spaces, linear spaces, and
topological spaces.
1.1 Metric spaces

A metric space is a pair (M, d) consisting of a set M and a function d : M × M → [0, ∞). Here we have an additional geometric structure (distance) on the given set M, namely d
measures the distance between two elements of M, such that
• (positive) d(x, y) ≥ 0, with equality iff x = y, and
• (symmetric) d(x, y) = d(y, x), and
• (triangle inequality) d(x, y) + d(y, z) ≥ d(x, z).
A sequence $(x_n)$ converges to x in (M, d) if $\lim_{n\to\infty} d(x_n, x) = 0$. A sequence $(x_n)$ is called
a Cauchy sequence in (M, d) if $\lim_{\min(m,n)\to\infty} d(x_n, x_m) = 0$.
It is clear that “convergence” ⇒ “Cauchy”.
If the converse holds then the metric space is called complete.
Definition 1. A metric space (M, d) is complete if any Cauchy sequence is convergent.

(Note that the sequence and the limit are required to be in M ).
Examples: Let M = {continuous functions on [0, 1]}, and let

$$d_1(f, g) = \sup_{x \in [0,1]} |f(x) - g(x)|, \qquad d_2(f, g) = \int_0^1 |f(x) - g(x)|\,dx.$$

Then $(M, d_1)$ is complete but $(M, d_2)$ is not complete.
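To see the difference between the two metrics concretely, here is a minimal Python sketch. It is an illustration only: the ramp functions $f_n(x) = \min(1, \max(0, n(x - 1/2)))$ and the grid size are choices made here, not taken from the notes. These functions are continuous, and one computes $d_2(f_n, f_{2n}) = 1/(4n) \to 0$ while $d_1(f_n, f_{2n}) = 1/2$. So the sequence is Cauchy for $d_2$ but not for $d_1$, and its only possible $d_2$-limit is a discontinuous step function.

```python
def ramp(n):
    """Continuous ramp f_n(x) = min(1, max(0, n*(x - 1/2))) on [0, 1]."""
    return lambda x: min(1.0, max(0.0, n * (x - 0.5)))

def d1(f, g, grid=20000):
    """Approximate the sup-distance on a midpoint grid of [0, 1]."""
    return max(abs(f((i + 0.5) / grid) - g((i + 0.5) / grid)) for i in range(grid))

def d2(f, g, grid=20000):
    """Approximate the integral distance on [0, 1] by the midpoint rule."""
    return sum(abs(f((i + 0.5) / grid) - g((i + 0.5) / grid)) for i in range(grid)) / grid

# d1 stays near 1/2 while d2 shrinks like 1/(4n).
for n in (5, 50, 500):
    print(n, d1(ramp(n), ramp(2 * n)), d2(ramp(n), ramp(2 * n)))
```

Both distances are only approximated on a grid, so the printed values are close to, not exactly, the values computed by hand.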
Question: if (M, d) is incomplete, can we add more points to make it complete?
Definition 2. M1 ⊂ M2 is dense in M2 with respect to the metric d if every m ∈ M2 can be
approximated by a sequence in M1. In other words, for every m ∈ M2 there is a sequence $(a_n)$ in M1 such that
$\lim_{n\to\infty} d(a_n, m) = 0$.
Theorem 1. If (M, d) is a metric space then there exists a complete metric space $(\widetilde{M}, \widetilde{d})$
and a map $h : M \to \widetilde{M}$ such that
$$d(m_1, m_2) = \widetilde{d}(h(m_1), h(m_2))$$
for all $m_1, m_2 \in M$, and h(M) is dense in $\widetilde{M}$.
(Such an h is called an isometry between M and h(M).)

Ideas of the proof: One starts by considering the set of all Cauchy sequences in M.
Inside the new space $\widetilde{M}$ these sequences should converge to a limit, although two different
sequences may have the same limit. The idea is to identify such sequences using
an equivalence relation and build $\widetilde{M}$ from there.
More specifically, for two Cauchy sequences $x = (x_n)$ and $y = (y_n)$ in M let $d'(x, y) = \lim_{n\to\infty} d(x_n, y_n)$.
This limit exists since $(d(x_n, y_n))_n$ is a Cauchy sequence of
real numbers, and it is not hard to verify that $d'$ is symmetric and satisfies the triangle inequality. (It is only a pseudometric: $d'(x, y) = 0$ does not force x = y.)
If $d'(x, y) = 0$ we say that x and y are equivalent, and we then let $\widetilde{M}$ be the set of all
equivalence classes, and $\widetilde{d}$ be the induced metric on $\widetilde{M}$.
It turns out that $(\widetilde{M}, \widetilde{d})$ is complete (this requires some work using the Cantor diagonal trick),
and we can embed M inside $\widetilde{M}$ using the following isometry: for x ∈ M,
h(x) = [(x, x, . . . )],
the equivalence class of (x, x, x, . . . ) in $\widetilde{M}$.
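As a small illustration of identifying Cauchy sequences, consider M = Q with the usual distance. The sketch below is an example chosen here, not part of the original notes: it builds two different Cauchy sequences of rationals by Newton's iteration for $x^2 = 2$, started at 1 and at 3. Their termwise distance tends to 0, so they define the same point of the completion, even though neither has a limit in Q.

```python
from fractions import Fraction

def newton_sqrt2(start, steps):
    """Rational Newton iterates x_{k+1} = (x_k + 2/x_k)/2; every term lies in Q."""
    x = Fraction(start)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq

xs = newton_sqrt2(1, 6)
ys = newton_sqrt2(3, 6)

# Termwise distances d(x_n, y_n): exact rationals, printed as floats.
gaps = [abs(a - b) for a, b in zip(xs, ys)]
print([float(g) for g in gaps])
```

Exact rational arithmetic is used so that the sequences really live in Q; no term squares to 2, which is why the limit has to be added by the completion.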
It is not hard to see that if the given metric has some reasonable symmetries then they
are preserved in the completion space. For instance, this applies if the space is linear and
the metric is dilation invariant or translation invariant (for instance if the metric is induced
from a norm). Thus we could easily see that the completion theorem also holds for normed
linear spaces.
1.1.1 Separability
We recall that a topological space is separable if there exists a countable dense set. We say
that this set separates the space; for instance the rationals separate the real numbers.
A metric space is said to be separable if the topology induced from the metric is separable.
Note that there is also the concept of second countability, which says that one can find a
countable collection of open sets from which one gets all open sets in the topology simply
by taking unions.
This implies separability (just take one point from each such open set); for
metric/metrizable spaces the two notions are equivalent.
Separability is important in constructive proofs; for instance we can typically avoid
Zorn’s lemma/the axiom of choice if separability is available.
Basic properties:
1. All compact metric spaces are separable.
2. If the given topological space is a union of a countable number of separable subspaces,
then it is also separable.
Combining Properties 1 and 2, it is clear that a Euclidean space R^I (consisting of
functions from I to R) is separable if and only if the dimension |I| is countable.
3. A metric space is not separable if it contains an uncountable collection of elements such
that the distance between any two of them is at least 1. (Indeed, if there were a countable dense
set, then each element of the collection would be within distance 1/3 of some element of the dense set.
Since the collection is uncountable, some element of the dense set would be within 1/3 of two distinct
elements of the collection, and by the triangle inequality the distance between these two
would be at most 2/3, contradicting the hypothesis.)
Using Property 3, we can show that L∞[0, 1] (or L∞(R) etc.) is not separable. To see
this, take the collection C of characteristic functions of intervals [a, b] ⊂ [0, 1]
where a < b are irrational: there are uncountably many such functions, and any two distinct elements
of C are at L∞ distance 1 from each other.
Using a similar line of reasoning, one could see that the space of all signed Borel measures
on [0, 1] (with norm being the total mass) is not separable. Here simply use the (uncountable)
collection of delta measures at points of [0, 1].
4. We could also define separability for measure spaces by defining a metric on the
underlying σ-algebra. Namely, given a σ-algebra A on X and a corresponding measure µ we
could define a metric on A by
p(A, B) = µ(A∆B)
the measure of the symmetric difference
A∆B = (A ∪ B) \ (A ∩ B)
and we say that (X, A, µ) is separable if the metric space (A, p) is separable.
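A quick sanity check of this definition, as a hedged sketch: take X = {0, 1, 2, 3} with the counting measure (a toy choice made here, not from the notes), so that p(A, B) is just the number of elements of the symmetric difference. The code verifies the triangle inequality for p by brute force over all subsets.

```python
from itertools import combinations

def p(A, B):
    """Counting-measure distance mu(A symmetric-difference B) between finite sets."""
    return len(A ^ B)

X = range(4)
subsets = [frozenset(c) for r in range(5) for c in combinations(X, r)]

# Brute-force check of p(A, C) <= p(A, B) + p(B, C) over all triples of subsets.
triangle_ok = all(
    p(A, C) <= p(A, B) + p(B, C)
    for A in subsets for B in subsets for C in subsets
)
print(len(subsets), triangle_ok)
```

The check relies on the inclusion A∆C ⊂ (A∆B) ∪ (B∆C), which is also what gives the triangle inequality for a general measure.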
Lemma 1. If a σ-algebra is generated from a countable collection of sets, then it is separable.
Proof. Let $A_0$ be a countable generating collection, which we may assume is closed under complements. Let $A'$ consist of all finite intersections of elements
of $A_0$, and $A''$ consist of all finite unions of elements of $A'$. Clearly $A''$ is still countable and it
is an algebra of sets (i.e. complements, finite unions and finite intersections of its elements are still inside $A''$).
Clearly $A''$ also generates the given σ-algebra A, thus to show separability it suffices to show
that given any E ∈ A and any ε > 0 one can find $F \in A''$ such that µ(E∆F) < ε. This is
done by looking at the set of all such E and showing that it is actually a σ-algebra containing
$A''$; thus by minimality it has to contain A and the desired claim follows.
Exercise: R^n with the usual Lebesgue measure is separable.
Theorem 2. Let (X, A, µ) be a separable measure space. Then for any 1 ≤ p < ∞ the space
Lp (X, A, µ) is separable.
Proof: use simple functions. More details later.
1.2 Linear spaces

Here the additional structure is algebraic, namely we allow for addition of two elements and
a somewhat limited notion of multiplication. Let F be a field (for us this is R or C), we
allow for multiplication of an element of the space with an element of F, the space is then
said to be linear over F.
Examples: 1. Finite dimensional vector spaces (linear algebra).
2. Let p ∈ (0, ∞). Then $L^p(\mathbb{R})$ consists of all (complex valued Borel) measurable functions f on R
such that
$$\Big(\int_{\mathbb{R}} |f(x)|^p\,dx\Big)^{1/p} < \infty.$$
Similarly $L^\infty(\mathbb{R})$ consists of all measurable f such that for some finite M > 0 the set {x : |f(x)| > M}
has measure zero. Note that f could be unbounded on a set of measure zero.
$L^p$ spaces are strong type spaces. We could also define weak $L^p(\mathbb{R})$, denoted by $L^{p,\infty}(\mathbb{R})$,
to be all measurable f such that
$$\sup_{\lambda > 0} \lambda\,|\{|f| > \lambda\}|^{1/p} < \infty.$$
For p = ∞, weak $L^\infty$ and $L^\infty$ are the same.
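A standard worked example, added here as an illustration and not part of the original notes, shows that the weak space is strictly larger than the strong one. Take $f(x) = |x|^{-1/p}$ on R with Lebesgue measure:

```latex
\[
|\{x : |x|^{-1/p} > \lambda\}| = |\{x : 0 < |x| < \lambda^{-p}\}| = 2\lambda^{-p},
\qquad\text{so}\qquad
\sup_{\lambda > 0} \lambda\,\big(2\lambda^{-p}\big)^{1/p} = 2^{1/p} < \infty,
\]
\[
\text{while}\qquad \int_{\mathbb{R}} |f(x)|^p\,dx = \int_{\mathbb{R}} \frac{dx}{|x|} = \infty .
\]
```

Thus this f belongs to weak $L^p$ but not to $L^p$.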

One could further generalize this to Lp on an arbitrary measure space. For instance we
could replace the Lebesgue measure with another Borel measure on R.
3. $\ell^p(\mathbb{N})$ consists of all complex sequences $(a_1, a_2, \dots)$ such that
$$\Big(\sum_{n \ge 1} |a_n|^p\Big)^{1/p} < \infty.$$
Now $\ell^\infty$ consists of all bounded sequences.

Exercise: prove that these spaces are actually linear spaces, namely they are closed
under addition and scalar multiplication.
1.2.1 The Hahn-Banach extension theorem

Given a linear space X and p : X → R, we say p is homogeneously convex if
• (positive homogeneity) p(ax) = ap(x) for a > 0 and x ∈ X, and
• (subadditivity) p(x1 + x2 ) ≤ p(x1 ) + p(x2 ) for x1 , x2 ∈ X.
Intuitively, the graph of p over X looks like a convex cone with top at the origin.
Note that this is equivalent to
p(ax1 + bx2 ) ≤ ap(x1 ) + bp(x2 )
for all a, b ≥ 0 with a + b > 0 and x1 , x2 ∈ X.

The following is known as the HB extension theorem for linear spaces over R (there is a
version for C discussed later). Note that it does not require any geometric structure on X; all
we need is linearity.
Theorem 3. Let X be a linear space over R and let p : X → R be a homogeneously convex
function. Let Y ⊂ X be a linear subspace and ℓ : Y → R be linear such that
ℓ(y) ≤ p(y) for all y ∈ Y.
Then there exists a linear L : X → R that agrees with ℓ on Y such that
L(x) ≤ p(x) for all x ∈ X.
One could think of this theorem as an example of a local to global principle for linear
maps: as long as the local constraint is reasonable we could preserve it globally. Here we
need homogeneity and convexity.
The proof of the HB theorem in this generality uses Zorn’s lemma, which is equivalent to the
axiom of choice. (For more concrete X such as $L^p$, $\ell^p$, one could avoid the axiom of choice.)
A relation R on a set X is a subset of X × X (could be empty!). We say xRy if (x, y) ∈ R.
R is an equivalence relation if three conditions hold:
• (reflexive) xRx for all x ∈ X,
• (symmetric) if xRy then yRx, and
• (transitive) if xRy and yRz then xRz.
We say R is a partial ordering if it is reflexive, transitive, and anti-symmetric (namely if xRy
and yRx then x = y).
We say R is a linear ordering if it is a partial ordering and for any x, y ∈ X with x ≠ y either xRy or yRx. In this
case we sometimes say that X is a linearly ordered chain. Note that a chain could
be uncountable.
We say α ∈ X is an upper bound for Y ⊂ X if yRα for all y ∈ Y.
Example: the inclusion order in P(R).
We say that an element m ∈ X is maximal if for every α ∈ X, mRα implies α = m.

Lemma 2 (Zorn’s lemma). Let X ≠ ∅ and let R be a partial ordering on X such that any linearly
ordered chain Y inside X has an upper bound in X. Then for each such Y there is a
maximal m ∈ X such that m is an upper bound for Y.
Note that it is very important that we can find an upper bound for every chain. Without
that we have a counterexample: X = Z+ is the set of all positive integers, and the chain
1 ≤ 2 ≤ . . . does not have any upper bound, so X has no maximal element.
Technically speaking, this assumption is used in the proof of the Lemma (via, say, transfinite
induction plus the axiom of choice); details can be found in most introductory textbooks
in mathematical logic.
Proof of the HB theorem. We divide the proof into two steps. In step 1, we show that
if Y is a proper subspace of X then we can increase the dimension of Y while preserving the
given hypothesis. In step 2, by induction on the dimension of Y (iterating step 1) and using
Zorn’s lemma we obtain the whole space X.
Step 1: Assume existence of z ∈ X \ Y. Let $\widetilde{Y} = \mathrm{span}(Y, z) = \{ay + bz : a, b \in \mathbb{R},\ y \in Y\}$.
We want to define $\ell'$ on $\widetilde{Y}$ such that $\ell'(y) = \ell(y)$ for y ∈ Y and
$$\ell'(ay + bz) = a\,\ell'(y) + b\,\ell'(z),$$
so we only need to choose $\ell'(z)$. We also want
$$\ell'(ay + bz) \le p(ay + bz),$$
thus
$$b\,\ell'(z) \le p(ay + bz) - a\,\ell(y)$$
for all a, b ∈ R and y ∈ Y. Since Y is a subspace (so ay ∈ Y), we just need the case a = 1.
By considering b > 0 and b < 0 we end up needing
$$\sup_{c>0,\,y\in Y} -\frac{1}{c}\big[p(y - cz) - \ell(y)\big] \;\le\; \inf_{c>0,\,y\in Y} \frac{1}{c}\big[p(y + cz) - \ell(y)\big].$$
After algebraic manipulations this follows from the given convexity assumption on p,
more specifically the following estimate for $c_1, c_2 > 0$:
$$p\Big(\frac{c_1}{c_1+c_2}\,x + \frac{c_2}{c_1+c_2}\,y\Big) \le \frac{c_1}{c_1+c_2}\,p(x) + \frac{c_2}{c_1+c_2}\,p(y).$$
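The algebraic manipulations can be sketched as follows. This is a reconstruction under the hypotheses of Theorem 3; the names $y_1, y_2 \in Y$ and $c_1, c_2 > 0$ are introduced here for the two sides of the sup/inf inequality.

```latex
% Claim: for all y_1, y_2 in Y and c_1, c_2 > 0,
%   -(1/c_1) [ p(y_1 - c_1 z) - \ell(y_1) ]  <=  (1/c_2) [ p(y_2 + c_2 z) - \ell(y_2) ].
\begin{align*}
c_2\,\ell(y_1) + c_1\,\ell(y_2)
  &= \ell(c_2 y_1 + c_1 y_2)
     && \text{(linearity of $\ell$ on $Y$)} \\
  &\le p(c_2 y_1 + c_1 y_2)
     && \text{($\ell \le p$ on $Y$)} \\
  &= p\big(c_2 (y_1 - c_1 z) + c_1 (y_2 + c_2 z)\big)
     && \text{(the $z$ terms cancel)} \\
  &\le c_2\, p(y_1 - c_1 z) + c_1\, p(y_2 + c_2 z)
     && \text{(homogeneous convexity of $p$).}
\end{align*}
% Dividing by c_1 c_2 > 0 and rearranging gives the claim.
```

Taking the supremum over $(c_1, y_1)$ and the infimum over $(c_2, y_2)$ gives the required inequality, and any value between the two sides is an admissible choice for the extension at z.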
Step 2: Here we iterate step 1 and use Zorn’s lemma. If X is finite dimensional then it
is clear that the process will stop after finitely many iterations. If X is infinite dimensional
(the dimension of X could even be uncountable), we need to be able to go beyond any
linearly ordered chain of these extensions in order to apply Zorn’s lemma.
Formally, consider the set of all possible extensions of ℓ from Y to bigger subspaces of
X that remain dominated by p. Define a partial order where $(Y_\alpha, \ell_\alpha) \le (Y_\beta, \ell_\beta)$ if $Y_\alpha \subset Y_\beta$
and $\ell_\beta$ agrees with $\ell_\alpha$ on $Y_\alpha$. Now any linearly ordered chain $(Y_\alpha, \ell_\alpha)_{\alpha \in I}$ (here I could be
uncountable) has an upper bound $(Y', \ell')$ defined by
$$Y' = \bigcup_{\alpha \in I} Y_\alpha$$
and $\ell'$ is the same as $\ell_\alpha$ inside $Y_\alpha$. Now apply Zorn’s lemma: a maximal extension must be defined on all of X, since otherwise step 1 would produce a strictly larger one.

Corollary 1 (complex HB). Let X be a complex vector space and let p : X → R+ be such that
p(ax + by) ≤ |a|p(x) + |b|p(y)
for all x, y ∈ X and all complex numbers a, b with |a| + |b| > 0. Assume that ℓ : Y → C is a linear functional on a subspace Y ⊂ X such that
|ℓ(y)| ≤ p(y) for all y ∈ Y. Then we can extend ℓ linearly to X so that it is still dominated by p.
Note: the extension given in the HB theorems is not necessarily unique!
Proof. Idea of proof: $\ell_1(y) = \mathrm{Re}(\ell(y))$ and $\ell_2(y) = \mathrm{Im}(\ell(y))$ are real linear functionals bounded
by p, and $\ell_2(y) = -\ell_1(iy)$. So
$$\ell(y) = \ell_1(y) - i\,\ell_1(iy).$$
By real HB we can extend $\ell_1$ to X, and this formula then extends ℓ too. But we still need to show that
the extended ℓ satisfies
$$|\ell(x)| \le p(x).$$
For any x, let α be such that |α| = 1 and ℓ(x) = α|ℓ(x)|; then the desired estimate
follows:
$$|\ell(x)| = \ell(\alpha^{-1}x) = \ell_1(\alpha^{-1}x) \le p(\alpha^{-1}x) = p(x).$$

We emphasize again that technically speaking the HB extension theorem applies to any
linear space, although heuristically speaking the existence of p implies some implicit geometric
structure.
1.2.2 The HB theorem with symmetry constraints

One possible way to generalize the HB theorem is to introduce additional symmetries. More
specifically, if the local space Y, the linear map ℓ (on Y), and the convex function p
(defined globally on X) are invariant under some “compatible” set of linear transformations,
we would also like to impose this on our extension of ℓ.
More precisely, let A be a set of linear maps from X to itself that commute, thus if
T1, T2 ∈ A then T1T2x = T2T1x for all x ∈ X. Assume that p is invariant under elements
of A, thus p(x) = p(Tx) if T ∈ A and x ∈ X. Assume that Y is mapped into itself by every element of A and that (on Y) ℓ is invariant under A, thus ℓ(Ty) = ℓ(y) if T ∈ A and y ∈ Y.
Then
Theorem 4. ℓ can be extended to all of X so that it remains invariant under A and is still
controlled by p.
Idea of proof: We may add to A all products of its elements, thus we may assume
that A contains 1 and is closed under composition (a semigroup). Let B contain all finite
convex combinations of elements of A. By definition, (inside Y) the given ℓ is controlled by
$p'(x) = \inf_{T \in B} p(Tx)$, which is a homogeneously convex function on X. Thus we can extend
ℓ to all of X so that it remains dominated by $p'$ (and thus by p). We need to show that
the extended ℓ is invariant under A. Let T ∈ A; it can be shown that
$$p'(x - Tx) \le 0$$
for every x ∈ X. Indeed, by definition $p'(x) \le p\big((1 + T + \cdots + T^{n-1})x/n\big)$ for any n ≥ 1,
therefore, using the invariance of p,
$$p'(x - Tx) \le \frac{1}{n}\,p(x - T^n x) \le \frac{1}{n}\big(p(x) + p(-x)\big) \to 0 \quad \text{as } n \to \infty.$$
Thus $\ell(x - Tx) \le p'(x - Tx) \le 0$ and thus ℓ(x) ≤ ℓ(Tx) for all x. Replacing x by −x and using ℓ(−x) = −ℓ(x), it
follows that ℓ(x) = ℓ(Tx).
An application: The most typical applications of the HB extension theorem are hyper-
plane separation theorems which require some local convexity of the underlying space. We
will revisit these applications later, here we discuss a concrete example.
1.3 Topological spaces

We recall the notion of locally compact Hausdorff spaces (LCH) and discuss related results.
1.3.1 Compactness
X is compact if for any open covering there is a finite subcover. A space is locally compact
if every point has a compact neighborhood.
Properties: 1. (finite intersection property) For any family of closed sets with empty
intersection we can find a finite subfamily with empty intersection. In fact this is
equivalent to compactness.
2. Tychonoff’s theorem: If $(X_\alpha)_{\alpha \in A}$ is a family of compact spaces then the product
$X = \prod_{\alpha \in A} X_\alpha$ with the product topology is compact.
(The product topology is the minimal topology such that all coordinate projections $\pi_\alpha : X \to X_\alpha$
are continuous; equivalently, it is generated by the sets $\pi_\alpha^{-1}(U)$ with U open in $X_\alpha$.)
3. If there is some geometry (i.e. the topology is metrizable) then compactness is equivalent
to sequential compactness, which states that every sequence has a convergent subsequence.
4. Net convergence: Without geometry, one needs to use more than sequences. A
net $(x_\alpha)_{\alpha \in I}$ is a collection indexed by some directed set I, i.e. a set with some partial
ordering < such that any two elements have at least one common upper bound. We say this net
converges to x if given any neighborhood P of x there is β ∈ I such that $x_\alpha \in P$ if β < α.
(Be careful, this limit is not necessarily unique in general.)
Bolzano–Weierstrass’s theorem: X is compact iff every net has a convergent subnet.
1.3.2 Continuous functions on LCH spaces

Recall that a function f : X → R is continuous if for every open A ⊂ R the set $f^{-1}(A)$
is open in X. Note that a continuous function maps a compact subset of X to a compact
subset of R, thus continuous functions are always bounded on compact sets.
A space is Hausdorff if one can separate any two distinct points using two disjoint open sets. There
are variants (both weaker/stronger) of this notion, but we won’t discuss them here. The
most important property of a Hausdorff space is that limits are unique (if they exist). Also, a closed
subset of a compact set is compact.
Basic questions:
1. Are there (a lot of) nonconstant continuous functions on a topological space? (We are
interested in bump functions, since this is related to the second question below.)
2. Can we continuously extend a given local continuous function (i.e. one defined on a
compact subset) to all of the space?
3. Can we construct partitions of unity on such a space?
We will show that if the space is LCH then the answer is yes for all questions. (One
could do better than LCH but we will not discuss that here.) These affirmative answers
lead to the study of C(X), the space of continuous functions on X.
Urysohn’s lemma: Let X be an LCH space. Let K be compact and U be open in X,
such that K ⊂ U . Then there exists a continuous function f : X → [0, 1] such that
• f = 1 on K and
• f = 0 outside a compact subset of U .
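For intuition, here is a minimal sketch in the simplest metric setting, X = R with K = [0, 1] and U = (−1, 2) (toy choices made here). It uses the explicit distance formula f(x) = d(x, U^c) / (d(x, K) + d(x, U^c)), which is available in metric spaces; this is not the dyadic construction used in the general proof, and the function names are illustrative.

```python
def dist_to_interval(x, a, b):
    """Distance from x to the closed interval [a, b]."""
    return max(a - x, 0.0, x - b)

def dist_to_complement(x, c, d):
    """Distance from x to the complement of the open interval (c, d)."""
    return max(0.0, min(x - c, d - x))

def urysohn(x, K=(0.0, 1.0), U=(-1.0, 2.0)):
    """Continuous f with f = 1 on K and f = 0 outside U (requires K inside U)."""
    dK = dist_to_interval(x, *K)
    dUc = dist_to_complement(x, *U)
    # dK and dUc never vanish together because K is contained in U.
    return dUc / (dK + dUc)

for x in (-2.0, -0.5, 0.0, 0.5, 1.5, 3.0):
    print(x, urysohn(x))
```

This f vanishes exactly outside U; to make it vanish outside a compact subset of U, one can apply the same formula with a smaller open interval whose closure lies in U.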
Tietze’s extension theorem: Let X be LCH and K ⊂ X compact subset. Then any
function f ∈ C(K) could be continuously extended to all of X. Furthermore the extended
function vanishes outside a compact set.
Partition of unity: this is a collection of nonnegative functions whose sum is 1 everywhere
on the space, but locally at each point only finitely many of them are nonzero. In general
one can only construct such a partition on a compact subset of the space.
1.3.3 Proof of Urysohn’s lemma

Proof consists of two steps.
Step 1: First we consider a simpler setting when the space is actually compact Haus-
dorff. The idea is that a compact Hausdorff space has a nicer structure, namely one could
separate two disjoint closed sets using two open sets. (also known as the “normal” property).
Urysohn’s lemma works for normal spaces, here the assumptions would simply be K is closed
inside an open set U .
To see why compact Hausdorff implies normal, take two disjoint closed sets $C_1, C_2$; they
are then compact. Given $x \in C_1$ and $y \in C_2$ we can use disjoint open sets $W_{x,y} \ni x$ and $V_{x,y} \ni y$ to separate
them. For fixed x, the sets $(V_{x,y})_{y \in C_2}$ cover $C_2$, so using compactness we get a finite subcover $V_{x,y_1}, \dots, V_{x,y_n}$, and thus
we can separate x from $C_2$ using the two open sets $\bigcap_k W_{x,y_k}$ and $\bigcup_k V_{x,y_k}$. Repeat this argument
to separate $C_1$ from $C_2$ using open sets.
Thus now X is a normal space, K closed, U open. We construct a large family of open
sets $(U_r)_r$ that interpolates between K and U. This family is indexed by dyadic rational numbers
$r = m/2^k$ in (0, 1), such that $K \subset U_r \subset U$ and
$$\overline{U_r} \subset U_s$$
if r < s. One then defines
$$g(x) = \inf\{r : x \in U_r\}$$
for all x ∈ X, with the convention g(x) = 1 if x belongs to no $U_r$. Clearly g(x) = 0 if x ∈ K and g(x) = 1 if $x \in U^c$, thus we can define
f(x) = 1 − g(x), so that f = 1 on K and f vanishes outside U. If we want f to vanish outside
a compact subset K' of U, we can apply this to K and $U_{1/2}$, and notice that $\overline{U_{1/2}}$ is a
compact subset of U.
Such f is continuous. Since f = 1 − g, it suffices to show that $g^{-1}((\alpha, \infty))$ and $g^{-1}((-\infty, \alpha))$ are open for all α ∈ R. Wlog
assume 0 ≤ α ≤ 1. Then
g(x) < α ⇔ $x \in U_r$ for some r < α, thus
$$g^{-1}((-\infty, \alpha)) = \bigcup_{r < \alpha} U_r \quad \text{is open.}$$
g(x) > α ⇔ $x \notin U_r$ for some r > α. Using the inclusion assumption, this is equivalent
to the existence of r > α such that $x \notin \overline{U_r}$, so
$$g^{-1}((\alpha, \infty)) = \bigcup_{r > \alpha} \big(\overline{U_r}\big)^c \quad \text{is open.}$$
Thus the remaining step is to construct the family. Here we use normality to get disjoint open sets $W_1$
and $W_2$ that surround K and $U^c$ (disjoint closed sets) respectively. Thus $K \subset W_1 \subset W_2^c \subset U$,
and we define $U_{1/2} = W_1$; now the closure of $U_{1/2}$ is inside the closed set $W_2^c$, and so inside U. Thus
$$K \subset U_{1/2} \subset \overline{U_{1/2}} \subset U.$$
We repeat this with the new pairs $(K, U_{1/2})$ and $(\overline{U_{1/2}}, U)$ and so on, to get the family.
Step 2: Reduction to the compact setting. The idea is to show that for some open V ⊂ X
it holds that $\overline{V}$ is compact and $K \subset V \subset \overline{V} \subset U$. Thus if the Lemma holds for compact
Hausdorff spaces, we simply restrict to the subspace $\overline{V}$ and then extend the local function on $\overline{V}$ to
the whole of X by letting it be 0 outside $\overline{V}$.
The existence of such a V can be shown as follows: first we show that for each x ∈ K
we can get a compact neighborhood $N_x$ that remains inside U. Then the interiors
of these $N_x$ form an open cover of K, thus using a finite subcover we easily get an open
set containing K whose closure is compact and inside U.
Now the existence of such a neighborhood follows from the LCH property: the idea is that in a
Hausdorff space we can actually separate a point from a compact set.
Given x we can get a neighborhood of x, called $M_x$, that is compact but not
necessarily inside U; then the set $P = M_x \cap \overline{U}$ is compact and is a neighborhood of x, and it is
almost contained inside U. One uses Hausdorffness to separate x further from the compact
boundary of this set.
1.3.4 Proof of Tietze’s theorem

Essentially, Urysohn’s lemma implies Tietze’s theorem, at least for compact Hausdorff spaces
(or more generally normal space). We’ll focus on this, the extension to the locally compact
case is similar to the last proof.
The idea is to approximate f by a sequence of continuous functions that have global
extensions, and this sequence converges uniformly to a continuous function on X.
To start, wlog assume 0 ≤ f ≤ 1 everywhere on K. Then by Urysohn’s lemma there is a
bump function $0 \le h_1 \le 1$ such that $h_1 = 1$ on the closed set $f^{-1}([\tfrac23, 1])$ and $h_1 = 0$ outside
the (bigger) open set $f^{-1}((\tfrac13, \infty))$. Clearly
$$0 \le f - \tfrac13 h_1 \le \tfrac23$$
on K.
Applying this argument to $f_1 = \tfrac32 (f - \tfrac13 h_1)$, which takes values in [0, 1] on K, we get a
bump function $h_2$ continuous on X such that locally on K it holds that $0 \le \tfrac32 (f - \tfrac13 h_1) - \tfrac13 h_2 \le \tfrac23$,
or equivalently
$$0 \le f - \tfrac13 h_1 - \tfrac29 h_2 \le \tfrac49.$$
Repeating this argument, we get continuous functions (globally on X) $h_1, h_2, \dots$ such
that $h_j$ takes values in [0, 1] and
$$0 \le f - \tfrac13 h_1 - \tfrac29 h_2 - \cdots - \tfrac{2^{n-1}}{3^n} h_n \le \tfrac{2^n}{3^n}$$
on K. It is clear that the sequence $\tfrac13 h_1,\ \tfrac13 h_1 + \tfrac29 h_2,\ \dots,\ \tfrac13 h_1 + \tfrac29 h_2 + \cdots + \tfrac{2^{n-1}}{3^n} h_n,\ \dots$ converges uniformly to
a continuous function on the compact space X, and by the estimate above this limit agrees with f on K.
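The uniform convergence claimed here can be checked with the Weierstrass M-test; the following short computation is added as a sketch and uses only $0 \le h_n \le 1$:

```latex
\[
\sup_{x \in X} \Big| \frac{2^{n-1}}{3^n}\, h_n(x) \Big| \le \frac{2^{n-1}}{3^n},
\qquad
\sum_{n \ge 1} \frac{2^{n-1}}{3^n}
  = \frac{1}{3} \sum_{k \ge 0} \Big(\frac{2}{3}\Big)^k
  = \frac{1}{3} \cdot \frac{1}{1 - 2/3}
  = 1 .
\]
```

So the series of continuous functions converges uniformly on X to a continuous function with values in [0, 1], and on K the remainder after n terms is at most $(2/3)^n$, which tends to 0.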
1.3.5 Partition of unity

Given K compact subset and a finite open cover (Uj )1≤j≤n can we find a partition of unity
consisting of compactly supported bump functions such that for each j at least one such
function is supported in Uj ? Yes if LCH.
The idea is to find a compact subset $V_j$ (could be empty) of each $U_j$ such that together they
cover K. This follows from the fact that each point in K has a compact neighborhood inside
some $U_j$; thus by compactness one can refine this and get a finite covering of K consisting of
sets of this type, and let $V_j$ be the union of those sets that are inside $U_j$. (The $V_j$ cover
K since any set in the covering is a subset of some $V_j$.)
Then use Urysohn’s lemma to construct bump functions $g_j$ that equal 1 on $V_j$ and vanish
outside $U_j$. (If $V_j = \emptyset$ simply take $g_j \equiv 0$.)
Clearly $g := \sum_j g_j \ge 1$ on K and g is continuous, but g will be zero outside a compact
set, so we can’t simply divide everything by g to get the partition of unity. The idea is to
use Urysohn’s lemma again to get a bump function f that equals 1 on K and vanishes outside {g > 0}
(which is an open set). Now we can add to g the function 1 − f, which does not change
anything inside K but will be 1 as soon as g = 0. We get the partition of unity of K
consisting of the functions
$$\frac{g_j}{g + 1 - f}.$$
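The construction can be tried out on the real line. The sketch below is a toy instance with assumed data: K = [0, 1], the cover U1 = (−0.5, 0.6), U2 = (0.4, 1.5), and compact pieces V1 = [0, 0.5], V2 = [0.5, 1]. The bump functions come from the explicit metric-space formula rather than the general proof of Urysohn's lemma, and all names are illustrative.

```python
def bump(x, K, U):
    """Urysohn-type function on R: 1 on K = [a, b], 0 outside U = (c, d), K inside U."""
    (a, b), (c, d) = K, U
    dK = max(a - x, 0.0, x - b)          # distance from x to K
    dUc = max(0.0, min(x - c, d - x))    # distance from x to the complement of U
    return dUc / (dK + dUc)

# Assumed toy data: K = [0, 1] covered by U[0], U[1], with compact V[j] inside U[j].
U = [(-0.5, 0.6), (0.4, 1.5)]
V = [(0.0, 0.5), (0.5, 1.0)]

def partition(x):
    """Return the values g_j(x) / (g(x) + 1 - f(x)) for j = 0, 1."""
    g = [bump(x, Vj, Uj) for Vj, Uj in zip(V, U)]
    f = bump(x, (0.0, 1.0), (-0.5, 1.5))   # 1 on K, 0 outside {g > 0} = (-0.5, 1.5)
    denom = sum(g) + 1.0 - f                # strictly positive everywhere
    return [gj / denom for gj in g]

for x in (0.0, 0.45, 0.5, 1.0, 1.2):
    values = partition(x)
    print(x, values, sum(values))
```

On K the two values add up to 1 (up to rounding), while outside K the sum may be smaller than 1, in line with the remark that the partition is only a partition of unity on K.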
Chapter 2
Overview of normed linear spaces
Starting from this chapter, we begin examining linear spaces with at least one extra structure
(topology or geometry). We assume linearity; this is a natural feature of function spaces.
In this chapter, we start with spaces whose geometric and linear structures are compatible.
More precisely, we assume that the metric is translation invariant, d(x, y) = d(x + z, y + z),
and homogeneous, d(λx, λy) = |λ|d(x, y). We then define ‖x‖ = d(x, 0), the norm of x, and
it is clear that ‖·‖ satisfies the following three properties (below x, y ∈ X and λ ∈ C):
• ‖x‖ ≥ 0 and equality holds iff x = 0.
• ‖x + y‖ ≤ ‖x‖ + ‖y‖.
• ‖λx‖ = |λ|‖x‖.
A linear space equipped with such a norm is called a normed linear space (NLS).
Conversely, if such a norm exists we can always define a metric d(x, y) = ‖x − y‖ that
is translation invariant and homogeneous.
If the norm topology is separable then we say the NLS is separable.
A Banach space is a complete normed linear space, i.e. the induced metric space is
complete. A Hilbert space has additional structures, where we could talk about “angle”
through the scalar product. We’ll discuss these spaces further in separate chapters.
We recall that an incomplete metric space could be completed, and this applies to normed
linear spaces too: the metric remains invariant under dilation and translation in the com-
pleted space thus it remains a norm.
Some examples:
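1. Let $(X, \mu)$ be a measure space and let $1 \le p \le \infty$. Then the spaces $L^p(X, \mu)$ are normed linear
spaces, with
$$\|f\|_p = \Big(\int_X |f(x)|^p \, d\mu\Big)^{1/p}$$
(for $p < \infty$; for $p = \infty$ one uses the essential supremum of $|f|$).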
In fact they are complete; this is a theorem of Riesz–Fischer. To see this (say for $p < \infty$), it suffices to show that
if $\{f_n\}$ has $\sum_n \|f_n\|_p < \infty$ then $\sum_n f_n$ converges in $L^p$.$^1$
Now by the triangle inequality (and monotone convergence) $\int (\sum_n |f_n|)^p \, d\mu < \infty$, thus $h(x) = \sum_n |f_n(x)|$ is finite for $\mu$-almost
every $x$. Thus $\sum_n f_n(x)$ converges to some measurable $g(x)$, and clearly $\|g\|_p \le \|h\|_p < \infty$,
so $g \in L^p$.
Finally, $|(\sum_{n \le N} f_n) - g|^p$ converges pointwise a.e. to 0 as $N \to \infty$ and is dominated by $(h(x) + |g(x)|)^p$,
which is integrable; thus by dominated convergence $\int |(\sum_{n \le N} f_n) - g|^p \, d\mu \to 0$, thus $\sum_n f_n \to g$
in $L^p$.
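On the other hand, for $p < \infty$ weak $L^p(X, \mu)$, equipped with
$$\|f\|_{p,\infty} = \sup_{\lambda>0} \lambda \, \mu(\{x \in X : |f(x)| > \lambda\})^{1/p},$$
is not a normed linear space, since the triangle inequality fails.
However, we still say “the weak $L^p$ norm” in practice.
Here is a counterexample to the triangle inequality (for $p = 1$, on $\mathbb{R}$ with Lebesgue measure): use $f(x) = x 1_{0\le x\le 1}$ and $g(x) =
(1 - x)1_{0\le x\le 1}$. Then $\|f\|_{1,\infty} = \|g\|_{1,\infty} = 1/4$ while $\|f + g\|_{1,\infty} = 1$.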
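This counterexample can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming $p = 1$ and approximating Lebesgue measure on $[0,1]$ by `n` cells of equal measure; the helper `weak_lp_quasinorm` is a name introduced here for illustration and is not part of the notes.

```python
import numpy as np

def weak_lp_quasinorm(values, cell_measure, p=1.0):
    """Approximate sup_{lam>0} lam * mu({|f| > lam})^(1/p) for a function
    taking the value values[i] on a cell of measure cell_measure."""
    a = np.sort(np.abs(values))[::-1]      # values in decreasing order
    counts = np.arange(1, len(a) + 1)      # as lam increases to a[k], at least k+1 cells have |f| > lam
    return float(np.max(a * (counts * cell_measure) ** (1.0 / p)))

n = 10_000
x = (np.arange(n) + 0.5) / n               # midpoints of n equal cells in [0, 1]
f = x
g = 1.0 - x

norm_f = weak_lp_quasinorm(f, 1.0 / n)
norm_g = weak_lp_quasinorm(g, 1.0 / n)
norm_sum = weak_lp_quasinorm(f + g, 1.0 / n)
print(norm_f, norm_g, norm_sum)            # roughly 0.25, 0.25, 1.0
```

Since the sum has quasinorm about 1 while the two pieces have quasinorm about 1/4 each, the triangle inequality fails by a factor of about 2.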
2. If $X$ is compact Hausdorff then $C(X)$ is a complete normed linear space if we use the
sup norm
$$\|f\| = \sup_{x\in X} |f(x)|$$
If $X$ is not compact then the sup is not necessarily finite and so this is not even a norm.
For locally compact (Hausdorff) spaces we could instead look at $C_c(X)$, consisting of
compactly supported continuous functions, and this space is a normed linear space. However,
$C_c(X)$ is not complete in the noncompact case; in fact its completion is $C_0(X)$, the space of
continuous functions on $X$ that vanish at $\infty$. (These are functions $f \in C(X)$ such that for
every $M > 0$ the set $\{|f| \ge M\}$ is compact.)
To see this, we first show that $C_0(X)$ is complete under the uniform norm. One way to
show this is to use the one-point compactification. Alternatively, consider a Cauchy sequence
$(f_n)$ in $C_0(X)$; then for each $x \in X$, $(f_n(x))$ is a Cauchy sequence of complex numbers, thus
it converges pointwise to some $f(x)$. Furthermore, on any compact subset of $X$ the uniform
convergence $f_n \to f$ implies continuity of $f$. Since $X$ is locally compact and continuity
is a local property, it follows that $f \in C(X)$; then using $f_n \in C_0(X)$ it is not hard to see that
$f \in C_0(X)$. (One way to check this is to use the net convergence characterization of continuity.)
Now, given a function $f \in C_0(X)$ we can approximate it uniformly by a sequence of
compactly supported continuous functions on $X$ using Urysohn's lemma: let $K_n = \{|f| \ge
1/n\}$, which is compact and sits inside the open set $U_n = \{|f| > 1/(n+1)\}$. Then by Urysohn's
lemma we may find a bump function $\varphi_n$, $0 \le \varphi_n \le 1$, that vanishes outside $U_n$ but equals 1 on $K_n$. The product
$\varphi_n f$ is compactly supported (its support lies in $K_{n+1}$), and $|\varphi_n f - f| \le |f| 1_{X \setminus K_n} < 1/n$, so
$\varphi_n f \to f$ in the uniform metric.
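To make the approximation concrete, here is a small numerical sketch in Python with NumPy on $X = \mathbb{R}$. It assumes the specific function $f(x) = 1/(1+x^2)$ and an explicit cutoff that equals 1 on $\{|f| \ge 1/n\}$ and 0 on $\{|f| \le 1/(n+1)\}$, interpolating linearly in $|f|$ in between; both choices, and the name `cutoff`, are illustrative assumptions rather than part of the notes.

```python
import numpy as np

def cutoff(f_vals, n):
    """A continuous function of |f|: 1 where |f| >= 1/n, 0 where |f| <= 1/(n+1)."""
    lo, hi = 1.0 / (n + 1), 1.0 / n
    return np.clip((np.abs(f_vals) - lo) / (hi - lo), 0.0, 1.0)

x = np.linspace(-50.0, 50.0, 200001)
f = 1.0 / (1.0 + x**2)                      # vanishes at infinity, not compactly supported

# sup-norm error of the compactly supported approximation phi_n * f
errors = {n: float(np.max(np.abs(cutoff(f, n) * f - f))) for n in (1, 2, 5, 10, 100)}
for n, err in errors.items():
    print(n, err, err <= 1.0 / n)
```

For this $f$ the truncated function `cutoff(f, n) * f` vanishes for $|x| \ge \sqrt{n}$, so it is compactly supported, and the sup-norm error stays below $1/n$.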
3. $L^2$-Sobolev spaces on $\mathbb{R}$. If $k \ge 0$ is an integer we could define
$$H^k := \{f \in L^2 : f, \dots, f^{(k)} \in L^2\}$$
to be the space of $L^2$ functions whose first $k$ derivatives exist (in the weak sense) and are $L^2$
integrable. $H^k$ is complete and we may alternatively define $H^k$ as the completion of the
space of compactly supported $C^\infty$ functions on $\mathbb{R}$ under the norm
$$\|f\|_{H^k} = \|f\|_2 + \dots + \|f^{(k)}\|_2$$
Using the Fourier transform, we could generalize this to allow for $k$ fractional and even
negative, and we could also use $L^p$ instead of $L^2$. We'll revisit this in the future if time
permits.
$^1$ For normed linear spaces, “completeness” is equivalent to “every absolutely summable sequence is
summable”.
4. If $Y$ is a closed subspace of $X$ then the quotient space $X/Y$ (the space of equivalence
classes where $x_1 \sim x_2$ if $x_1 - x_2 \in Y$) is a normed linear space with norm
$$\|[x]\| = \inf_{y\in Y} \|x - y\|$$
An equivalent definition is $\|[x]\| = \inf_{z\in[x]} \|z\|$. Note that closedness of $Y$ is essential here to
ensure that $\|[x]\| = 0$ iff $x \in Y$.
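As a concrete illustration of the quotient norm, the sketch below (Python with NumPy) takes $X = \mathbb{R}^2$ with the Euclidean norm and $Y$ the line spanned by $(1,1)$; these choices and the function name `quotient_norm` are assumptions made for the example. In this Hilbert-space setting the infimum is attained at the orthogonal projection onto $Y$.

```python
import numpy as np

u = np.array([1.0, 1.0]) / np.sqrt(2.0)     # unit vector spanning Y

def quotient_norm(x):
    """||[x]|| = inf_{y in Y} ||x - y||_2, attained at the orthogonal projection of x onto Y."""
    x = np.asarray(x, dtype=float)
    return float(np.linalg.norm(x - np.dot(x, u) * u))

a = quotient_norm([3.0, 1.0])               # distance from (3, 1) to the line {(t, t)}
b = quotient_norm([3.0 + 5.0, 1.0 + 5.0])   # another representative of the same class
c = quotient_norm([2.0, 2.0])               # an element of Y itself
print(a, b, c)
```

The value does not depend on the representative of the class, and it vanishes exactly on $Y$, in line with the two remarks above.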
5. If $Y$ is a subspace of $X$ then the closure $\overline{Y}$ of $Y$ (with respect to the norm topology)
is another subspace of $X$. If this closure is the same as $X$ then we say that $Y$ is dense in $X$.
Note that this closure is not necessarily complete, and $\overline{Y}$ is not the same as the completion
of $Y$ under the norm. For instance we could take $X = Y$ an incomplete NLS; then the closure
of $Y$ under the norm topology of $X$ is the same as $X$, still incomplete.
2.1 Linear functionals and dual spaces

There are two basic notions of dual spaces:
• algebraic dual, which consists of all linear maps (aka linear functionals) $\ell : X \to \mathbb{R}$ (or
$\mathbb{C}$); and
• continuous dual, which consists of all continuous linear functionals.
We will only be interested in continuous dual spaces, which will be implicitly understood
whenever we refer to dual spaces.
Note that we could define continuous dual spaces even if X is only a topological linear
space, without any norm.
2.1.1 Linear functionals

For a normed linear space, we can talk about boundedness of a linear functional. We say
that a linear functional $\ell$ is bounded if there exists $C > 0$ such that for every $x \in X$ it holds
that
$$|\ell(x)| \le C\|x\|$$
Equivalently, this means $\sup_{x\in X:\|x\|\le 1} |\ell(x)| < \infty$ (equivalently $\sup_{x\in X:\|x\|=1} |\ell(x)| < \infty$),
which sometimes is used as the definition. Note that this fact also holds for linear
transformations from $X$ to a Banach space.
Theorem 5. On normed linear spaces, a linear functional is continuous iff it is bounded.

Proof. In one direction, clearly being bounded implies being continuous. For the other
direction, we'd like to show that the image of the closed unit ball under $\ell$ is a bounded set
if $\ell$ is linear and continuous. Note that $\ell$ maps compact sets to compact sets, but unfortunately
as we'll see the closed unit ball is not compact if $X$ is infinite dimensional. So one has to
exploit linearity of $\ell$. Assume towards a contradiction that $(x_n)$ is a sequence of unit vectors
s.t. $|\ell(x_n)| > n$. Then $x_n/n$ converges to 0 in the norm, but $|\ell(x_n/n)| > 1$, which violates
continuity of $\ell$ at 0.
One of the most frequently (and often implicitly) used theorems for bounded linear functionals on
normed linear spaces is the so-called B.L.T. (bounded linear transformation) theorem. The
theorem applies even for bounded linear maps from $X$ to another Banach space.
Theorem 6. Let $D$ be a dense subspace of the normed linear space $X$. Let $\ell : D \to Y$ be linear, where
$Y$ is a Banach space, and suppose that for some $C > 0$ it holds for all $x \in D$ that
$$\|\ell(x)\|_Y \le C\|x\|_X$$
Then $\ell$ has a unique extension to a bounded linear map from $X$ to $Y$, and the extension satisfies the above
estimate for all $x \in X$.
Proof. It is not hard to see that if such an extension exists it has to be unique. To define it, fix $x \in X$.
Since $D$ is dense in $X$ there exists a sequence $(x_n)$ in $D$ that converges to $x$. We then define
$\ell x = \lim \ell x_n$. Note that this limit exists because $(\ell x_n)$ is a Cauchy sequence in $Y$, which is a
complete space. One can easily show that the value of $\ell x$ does not depend on the choice
of the sequence $(x_n)$. Linearity and boundedness can be easily checked.
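The three claims left to the reader in this proof (the Cauchy property, independence of the approximating sequence, and the bound) each follow from the estimate on $D$ in one line. A sketch, where $(x_n')$ denotes a second approximating sequence introduced here for illustration:

```latex
Since $x_n - x_m \in D$ and $\ell$ is linear on $D$,
\[
\|\ell x_n - \ell x_m\|_Y = \|\ell(x_n - x_m)\|_Y \le C \|x_n - x_m\|_X \to 0
\quad (n, m \to \infty),
\]
so $(\ell x_n)$ is Cauchy in $Y$. If $(x_n')$ is another sequence in $D$ with $x_n' \to x$, then
\[
\|\ell x_n - \ell x_n'\|_Y \le C \|x_n - x_n'\|_X \to 0,
\]
so both sequences produce the same limit. Finally, by continuity of the norms,
\[
\|\ell x\|_Y = \lim_{n \to \infty} \|\ell x_n\|_Y \le C \lim_{n \to \infty} \|x_n\|_X = C \|x\|_X .
\]
```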
While working with singular integral operators on $L^p$ spaces we typically invoke the
above theorem implicitly: these operators are explicitly defined only for a nicer dense subset
of functions (say sufficiently smooth and with sufficient decay), and so the theorem says
that as long as we can bound the operators on this dense subspace we can extend the
operator to all of the corresponding $L^p$ and get the same bound.
Example: the Hilbert transform $Hf(x) = \mathrm{p.v.} \int_{\mathbb{R}} \frac{1}{y} f(x - y)\,dy$ is defined for smooth
compactly supported functions on $\mathbb{R}$, which are dense in $L^p$. It turns out that $\|Hf\|_p \le C\|f\|_p$ for
$1 < p < \infty$, thus $H$ extends to a bounded map on $L^p$.
2.1.2 Dual spaces

For a bounded linear functional we could define $\|\ell\|_* = \sup_{x\in X:\|x\|=1} |\ell(x)|$ and this defines a
norm on the dual space of $X$. By definition $|\ell(x)| \le \|x\|\|\ell\|_*$.
Theorem 7. The dual space of a normed linear space is a Banach space.
Proof. This is actually a special case of a more general fact: the space $B(X, Y)$ of bounded
operators from a normed linear space $X$ to a Banach space $Y$, with the operator norm satisfying
$$\|Tx\|_Y \le \|T\|\|x\|_X$$
is in turn another Banach space. (For us $Y = \mathbb{R}$ (or $\mathbb{C}$) with the absolute value as norm.) It is not
hard to see that this is a normed linear space; the main thing is to show completeness. Given
any Cauchy sequence $(T_n)$ in $B(X, Y)$ it is not hard to see that $\sup_n \|T_n\| < \infty$. Now for
any $x \in X$ the sequence $(T_n x)$ is a Cauchy sequence in $Y$, therefore it converges in $Y$ (thanks to
completeness of $Y$), and we let $T_\infty x$ be this limit. It is clear that $T_\infty$ is linear and
bounded, with $\|T_\infty\| \le \sup_n \|T_n\| < \infty$. It remains to show that $\lim_{n\to\infty} \|T_n - T_\infty\| = 0$.
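The remaining step is short. The following sketch spells it out, writing $T_\infty$ for the pointwise limit constructed in the proof; the symbols $\epsilon$ and $N$ are introduced here.

```latex
Given $\epsilon > 0$, choose $N$ such that $\|T_n - T_m\| \le \epsilon$ for all $n, m \ge N$.
Then for every $x \in X$ and all $n, m \ge N$,
\[
\|T_n x - T_m x\|_Y \le \|T_n - T_m\| \, \|x\|_X \le \epsilon \|x\|_X .
\]
Letting $m \to \infty$ and using $T_m x \to T_\infty x$ in $Y$ gives
\[
\|T_n x - T_\infty x\|_Y \le \epsilon \|x\|_X \qquad \text{for all } x \in X,\ n \ge N,
\]
that is, $\|T_n - T_\infty\| \le \epsilon$ for all $n \ge N$.
```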
Examples:
1. The dual space of a Hilbert space is itself.
2. If $1 < p < \infty$ then the dual space of $L^p(X, \mu)$ is $L^q(X, \mu)$ where $q$ is the conjugate
exponent, $1/p + 1/q = 1$. In particular,
$$\|f\|_{L^p(X,\mu)} = \sup_{g:\|g\|_q=1} \Big|\int f g \, d\mu\Big|$$
If $\mu$ is a σ-finite measure (namely one could break the space into countably many subsets
where $\mu$ is finite) then the dual space of $L^1(X, \mu)$ is $L^\infty(X, \mu)$. This may not be true without
the σ-finite assumption. On the other hand, the dual space of $L^\infty(X, \mu)$ is, generally speaking,
not $L^1(X, \mu)$ (but there are examples when it is, say when $X$ is a finite set with the
counting measure).
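The duality formula for the $L^p$ norm in example 2 can be tested on a finite set with counting measure. The sketch below (Python with NumPy) assumes real-valued $f$ and uses the standard extremizer $g = \mathrm{sign}(f)|f|^{p-1}/\|f\|_p^{p-1}$; the function names `lp_norm` and `extremizer` and the sample vector are illustrative choices.

```python
import numpy as np

def lp_norm(v, p):
    return float(np.sum(np.abs(v) ** p) ** (1.0 / p))

def extremizer(f, p):
    """A g with ||g||_q = 1 and sum(f * g) = ||f||_p (real f, f != 0)."""
    return np.sign(f) * np.abs(f) ** (p - 1) / lp_norm(f, p) ** (p - 1)

p = 3.0
q = p / (p - 1.0)                           # conjugate exponent
f = np.array([1.0, -2.0, 0.5, 3.0])

g = extremizer(f, p)
pairing = float(np.sum(f * g))              # should reproduce ||f||_p

# Hoelder: no other unit vector of ell^q does better
rng = np.random.default_rng(0)
best_random = 0.0
for _ in range(2000):
    h = rng.normal(size=f.size)
    h /= lp_norm(h, q)
    best_random = max(best_random, abs(float(np.sum(f * h))))
print(lp_norm(f, p), pairing, best_random)
```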
3. If $X$ is a locally compact Hausdorff space, then the dual of $C_0(X)$ is the space of
regular Borel measures with finite total mass. This result is one of Riesz's representation
theorems and we will prove it later in the course.
2.1.3 Reflexivity
If $X$ is a normed linear space we let $X^*$ denote its dual and $X^{**}$ denote the dual of $X^*$.
Definition 3 (Reflexive spaces). We say that $X$ is reflexive if $X = X^{**}$ (up to isomorphism;
more precisely, if the canonical embedding $x \mapsto \hat{x}$ described below is onto).
In particular, a reflexive space has to be a Banach space to begin with, but certainly not
all Banach spaces are reflexive.
The interest in reflexive spaces is natural, since we can always isometrically embed $X$ into
$X^{**}$. To see this, fix $x \in X$. Let $\|\cdot\|_*$ and $\|\cdot\|_{**}$ be the norms on $X^*$ and $X^{**}$ respectively.
We map $x \in X$ to the following linear map $\hat{x}$ on $X^*$:
$$\hat{x}(\ell) := \ell(x), \quad \ell \in X^*.$$
Note that $\hat{x}$ is bounded on $X^*$, since by definition $|\hat{x}(\ell)| \le \|x\|\|\ell\|_*$. Thus $\|\hat{x}\|_{**} \le \|x\|$.
On the other hand, using the Hahn–Banach theorem we can find $\ell \in
X^*$ such that $\ell(x) = \|x\|$ and $|\ell(y)| \le \|y\|$ for all $y \in X$. It follows that $\|\ell\|_* \le 1$,
therefore $|\hat{x}(\ell)| = \|x\| \ge \|x\|\|\ell\|_*$, and consequently $\|\hat{x}\|_{**} \ge \|x\|$. Thus $\|x\| = \|\hat{x}\|_{**}$ and the
map $x \mapsto \hat{x}$ is an isometric embedding of $X$ into $X^{**}$.
As we'll see, being reflexive helps in many situations; we'll explore that gradually.
Examples:
1. Hilbert spaces are reflexive.
2. $L^p$ spaces are reflexive if $1 < p < \infty$. As discussed above, generally speaking both $L^1$ and
$L^\infty$ are not reflexive. For example they are not reflexive when the underlying space is $\mathbb{R}^n$
with Lebesgue measure or $\mathbb{Z}$ with counting measure.
3. A closed linear subspace of a reflexive space is also reflexive.
4. $C[-1, 1]$ is not reflexive. Note that this space is separable but its dual is not (since
the dual of this space contains in particular all finite Borel measures). As we'll see later, for a
separable reflexive space its dual space has to be separable too. One could also prove non-reflexivity
directly, without using separability.
2.2 (Non)compactness of the unit ball

One of the key themes in analysis is the existence of a limit for a sequence (in some function
space) that is bounded uniformly in norm. By rescaling if necessary we may assume the
bound is 1.
Sometimes, it suffices to get a convergent subsequence. In other words we are interested
in the sequential compactness of the closed unit ball.
Since an NLS is in particular a metric space, sequential compactness = compactness. So the
question is about compactness of the closed unit ball in the norm topology.
For Euclidean spaces it is clear that in the finite dimensional setting the closed unit ball is
compact. In the infinite dimensional setting the answer is no; for instance consider (in $\ell^2$) the sequence
$a_n = (0, \dots, 0, 1, 0, \dots)$ (the $n$th coordinate is 1). Suppose there is a limit $x = (x_1, x_2, \dots)$ for some
subsequence $(a_{n_k})$, i.e.
$$\|a_{n_k} - x\| \to 0$$
as $k \to \infty$. Then clearly $\|x\| = 1$. On the other hand, given any $j$, for $k$ large it is
clear that $0 \le |x_j| \le \|a_{n_k} - x\| \to 0$, thus $x_1 = x_2 = \dots = 0$, which is a contradiction.
This suggests that the answer should be similar for a general normed linear space: positive
for spaces with finite dimensions and negative in the infinite dimensional setting (whether
the space is complete or not).
In the finite dimensional case, this follows from the fact that in finite dimensional linear
spaces all norms are equivalent. In other words, if $\|\cdot\|_1$ and $\|\cdot\|_2$ are two norms then there
exist $C_1, C_2 > 0$ such that $\frac{1}{C_1}\|x\|_1 \le \|x\|_2 \le C_2\|x\|_1$ for all $x \in X$. To see this, for simplicity
consider linear spaces over $\mathbb{R}$, and assume that the given space has basis $x_1, \dots, x_m$;
it suffices to show the conclusion for $\|\cdot\|_2$ being the Euclidean norm
$$\|a_1x_1 + \dots + a_mx_m\|_2 = \sqrt{a_1^2 + \dots + a_m^2}.$$
Then by the triangle inequality
$$\|a_1x_1 + \dots + a_mx_m\|_1 \le (|a_1| + \dots + |a_m|) \max_j \|x_j\|_1 \le \sqrt{m}\sqrt{a_1^2 + \dots + a_m^2}\, \max_j \|x_j\|_1$$
so we could take $C_1 = \sqrt{m} \max_j \|x_j\|_1$. To show existence of $C_2$, note that the identity map
$\ell : (X, \|\cdot\|_2) \to (X, \|\cdot\|_1)$ (i.e. $\ell x = x$) is continuous since it is Lipschitz, and hence so is $x \mapsto \|x\|_1$. The unit sphere
in $(X, \|\cdot\|_2)$ is compact, and the continuous function $x \mapsto \|x\|_1$ is strictly positive on it, thus it is
bounded below there by a positive constant; therefore for some $C_2 > 0$ we have
$$\inf \|x\|_1 \ge \frac{1}{C_2}$$
where the inf is over $x$ with $\|x\|_2 = 1$, which is equivalent to the desired estimate.
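For a quick sanity check of the two-sided comparison, consider the special case (an assumption made for this sketch) where $X = \mathbb{R}^m$, $x_1, \dots, x_m$ is the standard basis, $\|\cdot\|_1$ is the $\ell^1$ norm and $\|\cdot\|_2$ the Euclidean norm. Then $\max_j \|x_j\|_1 = 1$, so the ratio $\|a\|_1/\|a\|_2$ should stay between 1 and $\sqrt{m}$. The Python snippet below samples random vectors to illustrate this.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
ratios = []
for _ in range(1000):
    a = rng.normal(size=m)
    ratios.append(np.sum(np.abs(a)) / np.sqrt(np.sum(a**2)))   # ||a||_1 / ||a||_2
ratios = np.array(ratios)

lower, upper = float(ratios.min()), float(ratios.max())
print(lower, upper, np.sqrt(m))             # observed range of the ratio versus the bound sqrt(m)
```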
In the infinite dimensional case, this is a theorem of Riesz.
Theorem 8 (Riesz). If $X$ is an infinite dimensional NLS then the closed unit ball is not
compact with respect to the norm topology.
Proof. Our plan is to find a sequence of unit vectors $x_1, x_2, \dots$ in $X$ such that
the distance $\|x_i - x_j\|$ between any two elements of the sequence is uniformly bounded away from
0, for instance $\|x_i - x_j\| \ge \frac{1}{10}$ for all $i \ne j$. If that can be constructed it is clear that
no subsequence of $(x_n)$ is Cauchy, thus no subsequence of this sequence is convergent and
therefore the closed unit ball is not (sequentially) compact.
To construct this sequence it suffices to show that if $Y$ is a closed proper subspace of $X$
then one can find $x \in X \setminus Y$ with $\|x\| = 1$ such that $\mathrm{dist}(x, Y) \ge \frac{1}{10}$. Once this is proved
we can start with any point $x_1$ of unit length and let $Y$ be spanned by $x_1$ (which is closed)
and then select $x_2$ of unit length as above (in particular $\|x_2 - x_1\| \ge \frac{1}{10}$), then reset $Y$ to be
spanned by $x_1, x_2$ (a closed subspace because it has finite dimension, and proper because $X$ is
infinite dimensional) and select $x_3$, etc.
Thus we only need to show existence of $x \in X$ of unit norm satisfying $\mathrm{dist}(x, Y) \ge \frac{1}{10}$ whenever
$Y$ is a closed proper subspace of $X$. Let $z \in X \setminus Y$ and let
$$d := \mathrm{dist}(z, Y) > 0.$$
(If $d = 0$ then the closedness of $Y$ would imply that $z \in Y$, a contradiction.) Now for some
$y \in Y$ we have $\|z - y\| < 10d$. Let $x = z - y$; it follows that $\mathrm{dist}(x, Y) = \mathrm{dist}(z, Y) = d$, thus
$$\mathrm{dist}(x, Y) > \|x\|/10$$
Now we just rescale $x$ so that it has length 1.

Because of this result, there is a natural question of determining a topology on $X$ so
that the closed unit ball is compact with respect to this topology. One wants a weaker
topology (with fewer open sets; having too many open sets is probably the reason why the ball is
not compact).
There is a natural notion of weak topology that is weaker than the norm topology. Here
we want just enough open sets so that all bounded linear functionals are continuous.
Theorem 9. The closed unit ball is compact with respect to the weak topology if and only if
$X$ is reflexive.
Note that compactness of the closed unit ball in this weak topology does not
imply local compactness of $X$.
We'll prove this result later when discussing topologies on Banach spaces; as we'll see
this theorem is a consequence of the Banach–Alaoglu theorem.
Exercises:
1. Prove that for a normed linear space, “completeness” of the norm is equivalent to
“every absolutely summable sequence is summable”.
2. Prove that for every $x$ in a normed linear space $X$ it holds that $\|x\| = \sup_{\ell\in X^* : \|\ell\|=1} |\ell(x)|$.
3. Prove that if $K$ is compact Hausdorff then $C(K)$ is complete with respect to the
sup norm. Use this to complete the proof that $C_0(X)$ is complete if $X$ is locally compact
Hausdorff without appealing to the one-point compactification trick.
4. Let $S$ be a subset of a normed linear space $X$, and let $Y$ be the linear span of $S$
(consisting of all finite linear combinations of elements of $S$). Let $L \subset X^*$ be the set of all
$\ell \in X^*$ that vanish on $S$. Prove that $z \in \overline{Y}$, the closure of $Y$, if and only if $\ell(z) = 0$ for
every $\ell \in L$. (Hint: one direction should be easier, for the other direction you should use
the BLT theorem somewhere.)
5. Prove that all closed linear subspaces of a reflexive (Banach) space are reflexive. [Hint.
Use problem 4 at some point.]
Chapter 3
Basic geometrical and topological properties of normed linear spaces and their duals
3.1 Topologies on normed linear spaces

Recall that the norm topology is induced from the norm and the weak topology is the
minimal topology that makes all bounded linear functionals continuous.
If $X = Y^*$ is the dual of another normed linear space $Y$ (thus in particular $X$ is a Banach
space) we can introduce another topology, called the weak* topology: this is the minimal
topology such that all the bounded linear functionals $\hat{y}$, $y \in Y$, are continuous. We can use
the finite intersections of the following sets as a neighborhood base at 0 in this topology:
$$\{x \in X : |\hat{y}(x)| < \epsilon\}, \qquad y \in Y,\ \epsilon > 0.$$
Note that since the map $y \mapsto \hat{y}$ isometrically embeds $Y$ inside $X^*$, the weak* topology is
smaller than or equal to the weak topology.
Certainly if $X$ is reflexive then the weak* and weak topologies are the same.
3.1.1 The Banach–Alaoglu theorem

Theorem 10 (Banach–Alaoglu, 1940). Let $X$ be the dual of another normed linear space.
Then the closed unit ball $B$ of $X$ is compact in the weak* topology.
Proof. Let $X = Y^*$. For simplicity assume that the spaces are over $\mathbb{R}$. For each $y \in Y$ let
$I_y = [-\|y\|_Y, \|y\|_Y] \subset \mathbb{R}$ and consider
$$I = \prod_{y\in Y} I_y$$
consisting of tuples indexed by $Y$. By Tychonoff's theorem, $I$ is compact in the product
topology and it is also Hausdorff (a product of Hausdorff spaces with the product
topology is Hausdorff). Now, for each element $x$ in the unit ball $B$ of $X$ consider the map
$$x \mapsto Tx := (x(y))_{y\in Y} \in I$$
(the fact that $Tx \in I$ follows because for $x \in B$ we have $|x(y)| \le \|x\|_X \|y\|_Y \le \|y\|_Y$). Note
that elements of $TB$ are special because the coordinates of $Tx$ are related linearly.
Now, $T$ is clearly linear and one to one and it embeds $B$ into $I$; in fact it is not hard
to see that if we use the weak* topology on $B$ and the inherited product topology on $TB$
then $T$ is a homeomorphism between $B$ and $TB$. We now show that $TB$ is a
closed subset of $I$, which together with compactness of $I$ will then imply that $TB$ (hence
$B$) is compact.
Let $z = (z_y)_{y\in Y} \in \overline{TB}$, where the closure is taken inside $I$ under the product topology. Then
we want to show that $z$ is “linear and bounded by 1”, namely
$$z_{\alpha x+\beta y} = \alpha z_x + \beta z_y \qquad (x, y \in Y,\ \alpha, \beta \in \mathbb{R}).$$
Once that's done it follows that the map $y \mapsto z_y$ defines a linear map on $Y$ with norm at
most 1 (the bound $|z_y| \le \|y\|_Y$ holds since $z \in I$), and so $z \in TB$ as desired.
To verify the above property, simply observe that there is a net in $TB$ that converges to $z$
(limits are unique because of Hausdorffness), and the linearity relationship between coordinates is
preserved in the limit, so this property survives and $z$ has it.
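The last step can also be phrased without nets, using only continuity of the coordinate projections. The following is a sketch of that variant; the function $F$ and the symbols $w$, $\xi$ are introduced here for illustration.

```latex
Fix $x, y \in Y$ and $\alpha, \beta \in \mathbb{R}$, and define
\[
F(w) := w_{\alpha x + \beta y} - \alpha w_x - \beta w_y, \qquad w = (w_u)_{u \in Y} \in I .
\]
Each coordinate map $w \mapsto w_u$ is continuous for the product topology, so $F : I \to \mathbb{R}$
is continuous. If $w = T\xi$ with $\xi \in B$, then by linearity of $\xi$,
\[
F(T\xi) = \xi(\alpha x + \beta y) - \alpha \xi(x) - \beta \xi(y) = 0 .
\]
Hence $F$ vanishes on $TB$, and by continuity it vanishes on the closure of $TB$ in $I$.
In particular $F(z) = 0$, which is the required identity
$z_{\alpha x + \beta y} = \alpha z_x + \beta z_y$.
```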
Examples:
1. If $X$ is the space of regular Borel measures on $\mathbb{R}$ with the norm = total mass, then it
is the dual space of $C_0(\mathbb{R})$, which is a separable space. Therefore the closed unit ball in $X$ is
sequentially compact in the weak* topology. In particular, given any sequence $(\mu_n)$ of probability measures on $\mathbb{R}$
there is a subsequence $(\mu_{n_k})$ that converges vaguely to some Borel measure $\mu$, i.e. for every
continuous function $f$ that vanishes at $\infty$ it holds that
$$\lim_{k\to\infty} \int f(x)\,d\mu_{n_k}(x) = \int f(x)\,d\mu(x)$$
Note that $\mu$ is not necessarily a probability measure; in particular it is possible that $\mu = 0$,
which happens when the masses in $\mu_{n_k}$ escape to $\infty$ (for instance we could take $\mu_n$ to have
the density function $1_{[n,n+1]}$). To avoid this there is a notion of tightness, which basically
assumes that given any $\epsilon > 0$ we can find a compact $K_\epsilon$ such that $\mu_n(\mathbb{R} \setminus K_\epsilon) < \epsilon$ for every
$n$. With this assumption we can show that $\mu$ is a probability measure, and basically we
have just shown the Helly selection theorem for probability measures on $\mathbb{R}$.
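The escaping-mass phenomenon is easy to see numerically. In the sketch below (plain Python), $\mu_n$ has density $1_{[n,n+1]}$ as in the text, and the test function $f(x) = 1/(1+x^2)$ is an illustrative choice of a continuous function vanishing at $\infty$; its integral against $\mu_n$ has the closed form $\arctan(n+1) - \arctan(n)$.

```python
import math

def integral_against_mu_n(n):
    """Exact integral of f(x) = 1/(1+x^2) against the density 1_[n, n+1]."""
    return math.atan(n + 1) - math.atan(n)

values = [integral_against_mu_n(n) for n in (0, 1, 10, 100, 1000)]
for n, v in zip((0, 1, 10, 100, 1000), values):
    print(n, v)
```

The integrals tend to 0 even though every $\mu_n$ has total mass 1, which is consistent with the vague limit being the zero measure.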
3.1.2 The sequential Banach-Alaoglu theorem

Recall that compactness means any net has a convergent subnet. In practice we are interested
in the sequential variant of this theorem:
Theorem 11. If $X = Y^*$ where $Y$ is a separable normed linear space, then the closed unit ball
$B$ in $X$ is sequentially compact in the weak* topology. In other words given any sequence
$(x_n)$ in $B$ there exist a subsequence $(x_{n_k})$ and $x \in B$ such that
$$\lim_{k\to\infty} x_{n_k}(y) = x(y)$$
for every $y \in Y$.
Indeed, since $Y$ is separable the weak* topology on $B$ is metrizable (i.e. it arises from some
metric on $B$), thus compactness and sequential compactness are the same for this topology.
We could also prove this directly: let $(y_j)$ be a dense sequence in $Y$; then from the sequence
$(x_n)$ we can select a subsequence $(x_{n_k})$ such that $x_{n_k}(y_1)$ converges. Repeating this and
using a diagonal argument we get a subsequence $(x_{n_k})$ such that for any fixed $j$ the limit
$\lim_{k\to\infty} x_{n_k}(y_j)$ exists. Using the fact that $(y_j)$ is dense in $Y$ and $\|x_{n_k}\| \le 1$ it follows that for every $y \in Y$
the limit $\lim_{k\to\infty} x_{n_k}(y)$ exists; let this limit be $z(y)$. It is clear that $z$ is a linear functional
on $Y$ and also bounded with $\|z\| \le 1$. This implies that $x_{n_k}$ converges in the weak* topology to $z$, an element
of $B$.
3.1.3 Weak compactness of the closed unit ball in reflexive spaces

As a consequence of the Banach Alaoglu theorem, for any reflexive space the closed unit ball
is compact in the weak topology. The reverse direction is true (Kakutani’s theorem) and is
part of the homework.
If the given space is furthermore separable then its dual is also separable, so (from the
sequential BA theorem) the closed unit ball is sequentially compact in this weak topology.
It is surprising that this sequential compactness property still holds even if we don’t
assume separability on the given reflexive space!
Theorem 12. Let X be a Banach space. If X is reflexive then the closed unit ball in X is
sequentially compact in the weak topology.
Notes: The reverse direction also holds; this is a result of Eberlein–Šmulian: if every
bounded sequence has a weakly convergent subsequence then $X$ is reflexive. In fact, ES showed
that a subset of a reflexive Banach space is weakly compact if and only if it is weakly sequentially
compact, which implies the above statement.
Weak convergence of sequences: In a normed linear space, we say that a sequence
converges weakly if it converges in the weak topology. More precisely, $(x_n)$ converges weakly
to $x$ if for every $\ell \in X^*$ it holds that $\lim_{n\to\infty} \ell(x_n) = \ell(x)$. This is in contrast with strong
convergence of the sequence $(x_n)$, which means $\|x_n - x\| \to 0$. Clearly strong convergence
implies weak convergence, but the opposite direction is not true in general.
It follows from the above theorem that in a reflexive Banach space (for instance $L^p$ with
$1 < p < \infty$) any norm bounded sequence has a subsequence that converges weakly.
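A standard example separating the two notions is the sequence of unit coordinate vectors $e_n$ in $\ell^2$: it converges weakly to 0 but not strongly. The Python sketch below illustrates this on a finite truncation of $\ell^2$ (the truncation level `N` and the test vector $y = (1, 1/2, 1/3, \dots)$ are assumptions made for the example), using the fact that functionals on $\ell^2$ are given by pairing with elements of $\ell^2$.

```python
import numpy as np

N = 10_000                                   # truncation level for the finite section of ell^2
y = 1.0 / np.arange(1, N + 1)                # y = (1, 1/2, 1/3, ...), square summable

def e(n):
    """The n-th unit coordinate vector (1-based), truncated to N coordinates."""
    v = np.zeros(N)
    v[n - 1] = 1.0
    return v

indices = (1, 10, 100, 1000)
pairings = [float(np.dot(y, e(n))) for n in indices]     # <y, e_n> = 1/n, tends to 0
norms = [float(np.linalg.norm(e(n))) for n in indices]   # ||e_n - 0|| = 1 for every n
print(pairings, norms)
```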
Proof. We'll show that reflexivity implies sequential compactness of the closed unit ball.
Let $(x_n)$ be the given sequence and let $Y$ be the closure of the linear span of $(x_n)$.
Then $Y$ is a closed linear subspace of $X$ so it is also reflexive; on the other hand $Y$ is separable,
thus the closed unit ball inside $Y$ is sequentially compact with respect to
the weak topology on $Y$. Thus there are a subsequence $(x_{n_k})$ and $x_\infty \in Y$ such that for every
bounded linear functional $\ell$ on $Y$ we have
$$\ell(x_{n_k}) \to \ell(x_\infty)$$
Since the restriction to $Y$ of every bounded linear functional on $X$ is a bounded linear functional on $Y$, this
implies that $x_{n_k}$ converges weakly to $x_\infty$ in $X$.
3.2 Uniform convexity

Uniform convexity is a geometric notion introduced by Clarkson (1936).
Definition 4 (Uniformly convex). We say that a normed linear space $X$ is uniformly convex
if for each $0 < \epsilon \le 2$ there exists $\delta = \delta(\epsilon) > 0$ such that the following holds: if $\|x\| = \|y\| = 1$
and $\|x - y\| \ge \epsilon$ then
$$\Big\|\frac{x+y}{2}\Big\| \le 1 - \delta$$
Note that this is a property of the norm: there may be equivalent norms on the same
space that are not uniformly convex.
Uniform convexity does not imply completeness. For instance we could take the rational
numbers with the usual distance. Then $\big|\frac{x+y}{2}\big|^2 + \big|\frac{x-y}{2}\big|^2 = \frac{|x|^2+|y|^2}{2}$, which implies uniform
convexity.
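To illustrate the definition, the Python sketch below compares two norms on $\mathbb{R}^2$ (both choices are assumptions made for the example). For the Euclidean norm, the parallelogram identity gives $\|(x+y)/2\| = \sqrt{1 - \epsilon^2/4}$ when $\|x\| = \|y\| = 1$ and $\|x - y\| = \epsilon$, so one may take $\delta(\epsilon) = 1 - \sqrt{1 - \epsilon^2/4}$. For the sup norm, the unit vectors $(1,1)$ and $(1,-1)$ are at distance 2 but their midpoint still has norm 1, so no $\delta > 0$ works.

```python
import numpy as np

def midpoint_deficit(x, y, norm):
    """1 - ||(x + y)/2|| for unit vectors x, y: the quantity that delta must bound from below."""
    return 1.0 - norm((x + y) / 2.0)

euclid = lambda v: float(np.linalg.norm(v, 2))
sup_norm = lambda v: float(np.linalg.norm(v, np.inf))

# Euclidean plane: two unit vectors at angle theta
theta = 1.0
x = np.array([1.0, 0.0])
y = np.array([np.cos(theta), np.sin(theta)])
eps = euclid(x - y)
deficit_l2 = midpoint_deficit(x, y, euclid)
predicted = 1.0 - np.sqrt(1.0 - eps**2 / 4.0)

# Sup norm: unit vectors at distance 2 whose midpoint is still a unit vector
u = np.array([1.0, 1.0])
v = np.array([1.0, -1.0])
deficit_sup = midpoint_deficit(u, v, sup_norm)
print(deficit_l2, predicted, deficit_sup)
```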
Theorem 13 (Milman–Pettis). If $X$ is a uniformly convex Banach space then it is reflexive.
Note that there are reflexive Banach spaces where it is not even possible to replace the
given norm with an equivalent norm that is uniformly convex; this is due to M. Day (BAMS,
1941). The idea is to use a vector valued sequence space with norm $\|x\| = (\sum_n \|x_n\|^p)^{1/p}$, where
$x_n$ belongs to a reflexive Banach space $B_n$ (which makes the space reflexive); now for each $n$ one
can still replace the norm on $B_n$ by an equivalent norm that is uniformly convex, but as
$n \to \infty$ these replacements cannot be done uniformly because the underlying equivalence
constants blow up if the $B_n$ are carefully chosen.
Properties:
1. Hilbert spaces are uniformly convex, since in Hilbert spaces we have the parallelogram
law $\|x - y\|^2 + \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2$.
2. $C[-1, 1]$, $L^1(\mathbb{R})$, $L^\infty(\mathbb{R})$ are not uniformly convex because they are not reflexive.
3. Clarkson (1936): if $1 < p < \infty$ then $L^p$ and $\ell^p$ spaces are uniformly convex.
For $p \ge 2$ this holds essentially because some form of parallelogram law holds:
$$(\|x + y\|^p + \|x - y\|^p)^{1/p} \le 2^{1/p} (\|x\|^{p'} + \|y\|^{p'})^{1/p'}$$
where $1/p + 1/p' = 1$. For $p \in (1, 2)$ we have a variant
$$(\|x + y\|^{p'} + \|x - y\|^{p'})^{1/p'} \le 2^{1/p'} (\|x\|^p + \|y\|^p)^{1/p}.\ ^1$$
4. (Radon–Riesz, aka property (H)) If $x_n$ converges weakly to $x$ in a uniformly convex
Banach space $X$, i.e. $\ell(x_n) \to \ell(x)$ for all $\ell \in X^*$, and $\|x_n\| \to \|x\|$, then $\|x_n - x\| \to 0$.$^2$
5. A space is called uniformly smooth if its dual is uniformly convex.
6. A normed linear space is uniformly convex if the following holds for every bounded
sequences $(x_n)$ and $(y_n)$: if
$$\lim_{n\to\infty} \frac{\|x_n\|^2 + \|y_n\|^2}{2} - \Big\|\frac{x_n + y_n}{2}\Big\|^2 = 0$$
then $\lim_{n\to\infty} \|x_n - y_n\| = 0$.
Below are two notions weaker than uniform convexity:
Local uniform convexity: A norm is locally uniformly convex if the following holds
for every $x \in X$ and $(x_n)$ in $X$: if
$$\lim_{n\to\infty} \frac{\|x\|^2 + \|x_n\|^2}{2} - \Big\|\frac{x + x_n}{2}\Big\|^2 = 0$$
then $\lim_{n\to\infty} \|x - x_n\| = 0$. It is clear that this is a consequence of uniform convexity.
Strict convexity: A norm is strictly convex if the following holds for every $x, y$: if
$\frac{\|x\|^2 + \|y\|^2}{2} = \big\|\frac{x+y}{2}\big\|^2$ then $x = y$. It is clear that this is a consequence of local uniform
convexity.
It can be shown that if $X$ is a separable normed linear space over $\mathbb{R}$ then there exists an
equivalent locally uniformly convex norm.
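The Clarkson-type inequalities behind property 3 can be sanity-checked numerically in $\ell^p$ on a finite set. The sketch below (Python with NumPy) assumes the following common form of both cases: with $p'$ the conjugate exponent, $s = \max(p, p')$ and $t = \min(p, p')$, one has $(\|x+y\|^s + \|x-y\|^s)^{1/s} \le 2^{1/s}(\|x\|^t + \|y\|^t)^{1/t}$, where all norms are $\ell^p$ norms. The function names and the random sampling are illustrative; a random test is of course not a proof.

```python
import numpy as np

def lp_norm(v, p):
    return float(np.sum(np.abs(v) ** p) ** (1.0 / p))

def clarkson_gap(x, y, p):
    """Right side minus left side of the Clarkson-type inequality with s = max(p, p')."""
    q = p / (p - 1.0)                        # conjugate exponent p'
    s, t = max(p, q), min(p, q)              # s >= 2 >= t
    lhs = (lp_norm(x + y, p) ** s + lp_norm(x - y, p) ** s) ** (1.0 / s)
    rhs = 2.0 ** (1.0 / s) * (lp_norm(x, p) ** t + lp_norm(y, p) ** t) ** (1.0 / t)
    return rhs - lhs

rng = np.random.default_rng(1)
gaps = [clarkson_gap(rng.normal(size=6), rng.normal(size=6), p)
        for p in (1.5, 2.0, 3.0, 4.0) for _ in range(500)]
min_gap = min(gaps)
print(min_gap)                               # nonnegative up to rounding; p = 2 gives equality
```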
3.3 Isometries between normed linear spaces

We say that a map $T : X \to Y$ between normed linear spaces $X$ and $Y$ is an isometry if
$$\|Tx_1 - Tx_2\|_Y = \|x_1 - x_2\|_X$$
for every $x_1, x_2 \in X$. Note that we do not assume that $T$ is affine (i.e. $T(\alpha x + (1-\alpha)y) =
\alpha Tx + (1-\alpha)Ty$, which is the same as linear if we impose $T0 = 0$).
The Banach–Mazur distance: This distance is useful to compare two norms on $\mathbb{R}^n$;
it is known that all norms are equivalent, so the point is to be more quantitative.
$^1$ Alternatively, use Hanner's inequality, which says that if $\|\cdot\|$ is the $L^p$ or $\ell^p$ norm then
$$\|f + g\|^p + \|f - g\|^p \ge (\|f\| + \|g\|)^p + \big|\|f\| - \|g\|\big|^p \qquad (3.1)$$
if $p \in [1, 2]$, and the reverse holds if $p \in [2, \infty)$.
$^2$ To see this, it suffices to show the theorem for sequences with $\|x_n\| = 1$. Hahn–Banach gives us $\ell \in X^*$
such that $\ell(x) = 1$ and $\|\ell\| \le 1$. Then by weak convergence of $x_n$ to $x$ we obtain $2 \ge \|x_n + x_m\| \ge
\ell(x_n + x_m) \to 2\ell(x) = 2$ as $\min(m, n) \to \infty$. This easily implies $\|x_m + x_n\| \to 2$; from there and uniform
convexity it follows that $(x_n)$ is a Cauchy sequence and converges to some $y$; clearly $\ell(y) = \lim \ell(x_n) = \ell(x)$
for every $\ell \in X^*$, thus $y = x$.
Analytic: Let $X$ and $Y$ be normed linear spaces. The multiplicative BM distance is defined
to be
$$d(X, Y) = \inf\{\|T\|\|T^{-1}\|\}$$
where the infimum is over all isomorphisms $T : X \to Y$. Sometimes people also use $\log d(X, Y)$,
which satisfies the usual (additive) triangle inequality.
Geometric: given two bounded convex sets with nonempty interiors (aka convex bodies)
$K$ and $L$ in $\mathbb{R}^n$ that are symmetric (i.e. $K = -K$ and $L = -L$) the Banach–Mazur distance
between them is
$$d(K, L) = \inf\{C > 0 : L \subset TK \subset CL,\ T \text{ linear on } \mathbb{R}^n\}$$
For $\mathbb{R}^n$ these two notions are equivalent. Basically given any norm on $\mathbb{R}^n$ the unit ball
is a symmetric convex body, and conversely given any symmetric convex body $K$ we can
construct a norm such that its unit ball is $K$, namely $\|x\| = \inf\{t > 0 : x \in tK\}$.
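The norm built from a convex body (its gauge) can be computed from a membership test alone. The sketch below, in plain Python, finds $\inf\{t > 0 : x \in tK\}$ by bisection; it assumes $K$ is a closed symmetric convex body given by a hypothetical membership function `contains`, and that the initial upper bound `hi` is large enough that $x \in \mathrm{hi} \cdot K$. The two bodies used are the unit balls of the $\ell^1$ and Euclidean norms on $\mathbb{R}^2$, so the gauge should reproduce those norms.

```python
def gauge(x, contains, hi=1e6, iters=200):
    """inf{t > 0 : x in t*K} by bisection; contains(p) tests whether p lies in K.
    Relies on {t : x in t*K} being an interval [||x||, infinity) for convex K containing 0."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid > 0 and contains([xi / mid for xi in x]):
            hi = mid                          # x is in mid*K, so the gauge is at most mid
        else:
            lo = mid                          # x is not in mid*K, so the gauge is at least mid
    return hi

diamond = lambda p: abs(p[0]) + abs(p[1]) <= 1.0      # unit ball of the ell^1 norm
disc = lambda p: p[0] ** 2 + p[1] ** 2 <= 1.0         # unit ball of the Euclidean norm

g1 = gauge([3.0, -4.0], diamond)              # ell^1 norm of (3, -4) is 7
g2 = gauge([3.0, -4.0], disc)                 # Euclidean norm of (3, -4) is 5
print(g1, g2)
```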
Theorem 14 (F. John). Let $\ell_2^n$ denote $\mathbb{R}^n$ with the Euclidean norm. If $X$ is an $n$-dimensional
NLS over $\mathbb{R}$ then
$$d(X, \ell_2^n) \le \sqrt{n}$$
(This is actually sharp; equality holds for instance if $X = \ell_\infty^n$; for the geometric case take
the unit ball of this space.)
It follows from the theorem and the (multiplicative) triangle inequality that given any
two $n$-dimensional normed linear spaces over $\mathbb{R}$ it holds that
$$d(X, Y) \le n$$
This result is also sharp up to a constant (Gluskin, 1981, a randomized construction).
Existence of nonlinear isometry: A natural question is whether non-affine isometries
exist (or equivalently nonlinear ones, if we assume the normalization $T0 = 0$).
They do exist for complex NLS, even complex Banach spaces; see for instance the
conjugation map $z \mapsto \bar{z}$ from $\mathbb{C}$ to itself, which is not complex linear. We also need surjectivity, since otherwise the map
$T : \mathbb{R} \to \mathbb{R}^2$ mapping $x$ to $(x, \sin(x))$ is not linear but is an isometry if we equip $\mathbb{R}^2$ with the
$L^\infty$ norm $\|(x, y)\| = \max(|x|, |y|)$ (and the usual absolute value on $\mathbb{R}$).
It turns out that if the underlying normed linear spaces are over $\mathbb{R}$ and $T$ is surjective then the answer is no.
Theorem 15 (Mazur–Ulam, 1932). If $X$ and $Y$ are NLS over $\mathbb{R}$ and $T : X \to Y$ is surjective
and isometric with $T0 = 0$ then $T$ is linear.
If $T0 = 0$ is not given the conclusion could be equivalently changed to “$T$ is affine”,
namely $T(\alpha x + (1 - \alpha)y) = \alpha T(x) + (1 - \alpha)T(y)$.
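The non-surjective counterexample $x \mapsto (x, \sin x)$ can be verified directly. The sketch below (plain Python) checks on random pairs that the sup-norm distance between images equals $|x - y|$, which holds because $|\sin x - \sin y| \le |x - y|$, and then shows that the map fails the midpoint identity required of an affine map. The sample points are arbitrary choices made for the illustration.

```python
import math
import random

def T(x):
    return (x, math.sin(x))

def sup_dist(p, q):
    """Distance induced by the norm max(|x|, |y|) on R^2."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

random.seed(0)
pairs = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(1000)]
max_defect = max(abs(sup_dist(T(x), T(y)) - abs(x - y)) for x, y in pairs)

# Midpoint test: T(pi/2) versus the midpoint of T(0) and T(pi)
mid_of_images = tuple(0.5 * (a + b) for a, b in zip(T(0.0), T(math.pi)))
image_of_mid = T(math.pi / 2)
print(max_defect, mid_of_images, image_of_mid)
```

The second coordinates of `mid_of_images` and `image_of_mid` differ (about 0 versus 1), so $T$ preserves distances and fixes 0 without being linear; its image is not all of $\mathbb{R}^2$.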
Main ideas of proof: Isometries are continuous so it suffices to show $T(kx + my) =
kT(x) + mT(y)$ for all rational $m, k$; in fact it suffices to show this for $k = m = 1/2$. We construct
a nested sequence of subsets $A_1 \supset A_2 \supset \dots$, each of them symmetric around $\frac{x+y}{2}$, such that
$\bigcap_n A_n = \{\frac{x+y}{2}\}$. This construction will be invariant under surjective isometries; in particular
to get $\frac{Tx+Ty}{2}$ we can use the nested sequence $TA_1 \supset TA_2 \supset \dots$, which are symmetric around
$\frac{Tx+Ty}{2}$. When that's done we can take intersections to get $\frac{Tx+Ty}{2} = T(\frac{x+y}{2})$ as desired.
Now, let $A_1$ contain all elements $z$ of $X$ such that $\|x - z\| = \|y - z\| = \frac{\|x-y\|}{2}$. To
construct $A_{n+1}$ from $A_n$, simply let $A_{n+1}$ contain all $z \in A_n$ such that for all $w \in A_n$ it holds
that $\|w - z\| \le \frac{1}{2}\mathrm{diam}(A_n)$; one can check that $A_{n+1}$ is still symmetric around $\frac{x+y}{2}$ and
$$\mathrm{diam}(A_{n+1}) \le \mathrm{diam}(A_n)/2 \le \dots \le \frac{\mathrm{diam}(A_1)}{2^n} \to 0 \quad \text{as } n \to \infty.$$
Note that we need $TX = Y$
since we want the sets $TA_n$ to contain all $z \in Y = TX$ such that “so and so” holds.
Exercises:
1. For any $1 < p < \infty$ and any measure space $(X, \mu)$ prove that $L^p(X, \mu)$ is uniformly convex (use the hints from the lecture notes).
2. Show that the Banach-Mazur distance satisfies $d(\ell^n_\infty, \ell^n_2) \ge \sqrt{n}$. [Hint: use the geometric formulation with corresponding convex bodies, and invoke a generalized parallelogram law.] (Note that together with F. John's theorem this implies $d(\ell^n_\infty, \ell^n_2) = \sqrt{n}$.) Then via Hölder's inequality and the analytic formulation prove that $d(\ell^n_p, \ell^n_q) = n^{|\frac1p - \frac1q|}$ if $1 \le p, q \le 2$ or $2 \le p, q \le \infty$.
3. Prove F. John’s theorem for n = 2. [Hint. Use the geometric formulation, you may
want to use the fact that the Banach-Mazur distance is invariant under isomorphism of the
plane.]
4. Let $X$ be a normed linear space. Let $B = \{x \in X : \|x\| \le 1\}$ and $B^{**} = \{z \in X^{**} : \|z\|_{**} \le 1\}$. We know that there is a canonical map $x \mapsto \hat{x}$ that embeds $B$ isometrically and linearly into $B^{**}$. Prove that the image of $B$ under this map is dense in $B^{**}$ in the weak* topology of $X^{**}$ (i.e. the minimal topology on $X^{**}$ that makes all bounded linear functionals constructed from elements of $X^*$ continuous on $X^{**}$). Use this fact to show that if $B$ is weakly compact then $B = B^{**}$ and therefore $X$ is reflexive.
5. Find two examples demonstrating that “compactness” and “sequential compactness” do not imply each other.
6. Let K be a convex subset of X a Banach space. Prove that if K is closed in the norm
topology then it is closed in the weak topology.
Chapter 4
Basis on Hilbert spaces and Banach spaces
A Hilbert space is a complete normed linear space where the norm arises from an inner product, $\|u\| = \sqrt{\langle u, u\rangle}$, and the inner product $\langle u, v\rangle$ is a bilinear form that satisfies the following properties (we state the properties for Hilbert spaces over $\mathbb{C}$):
(i) $\langle u, u\rangle \ge 0$, with equality iff $u = 0$.
(ii) $\langle u, v\rangle = \overline{\langle v, u\rangle}$.
(iii) $\langle u, v\rangle$ is linear in $u$ (hence conjugate linear in $v$).
Examples: $H = L^2(X, d\mu)$ with $\langle f, g\rangle = \int_X f(x)\overline{g(x)}\,d\mu$; by Hölder this is finite:
\[ \Big|\int_X f(x)\overline{g(x)}\,d\mu\Big| \le \Big(\int_X |f|^2\,d\mu\Big)^{1/2} \Big(\int_X |g|^2\,d\mu\Big)^{1/2}. \]
4.1 Basic properties
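Parallelogram law: $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$.

Schwarz's inequality:
\[ |\langle u, v\rangle| \le \|u\|\|v\| \]
Proof. $\|x + y\| \le \|x\| + \|y\|$, therefore
\[ \|x\|^2 + \|y\|^2 + \langle x, y\rangle + \langle y, x\rangle = \langle x + y, x + y\rangle \le (\|x\| + \|y\|)^2 = \|x\|^2 + \|y\|^2 + 2\|x\|\|y\|. \]
Therefore $\mathrm{Re}\,\langle x, y\rangle \le \|x\|\|y\|$, which gives the desired estimate if $\langle x, y\rangle \in \mathbb{R}$. In general multiply $x$ by a unimodular $\alpha$ such that $|\langle x, y\rangle| = \langle \alpha x, y\rangle \le \|\alpha x\|\|y\| = \|x\|\|y\|$.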
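As a quick sanity check, the parallelogram law and Schwarz's inequality can be tested on $\mathbb{C}^3$, and the law can be seen to fail for the sup norm on $\mathbb{R}^2$ (so that norm does not come from an inner product). This is a minimal sketch; the vectors and function names are illustrative assumptions.

```python
def l2_norm(v):
    return sum(abs(t) ** 2 for t in v) ** 0.5

def sup_norm(v):
    return max(abs(t) for t in v)

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 for the given norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Hypothetical test vectors in C^3.
x = [1 + 2j, -0.5j, 3.0]
y = [2 - 1j, 4.0, 1 + 1j]
defect_l2 = parallelogram_defect(l2_norm, x, y)        # zero up to rounding
schwarz_ok = abs(inner(x, y)) <= l2_norm(x) * l2_norm(y)

# For the sup norm on R^2 the law fails on the standard basis vectors.
e1, e2 = [1.0, 0.0], [0.0, 1.0]
defect_sup = parallelogram_defect(sup_norm, e1, e2)    # 1 + 1 - 2 - 2
```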
Orthogonality: We say $u \perp v$ if $\langle u, v\rangle = 0$. A set $E \subset H$ is orthonormal if $\|u\| = 1$ for all $u \in E$ and $\langle u, v\rangle = 0$ if $u, v \in E$ with $u \neq v$.
Bessel's inequality: If $\{x_1, \dots, x_N\}$ is orthonormal then $\sum_{n=1}^N |\langle x, x_n\rangle|^2 \le \|x\|^2$. A sequence $(x_n)$ is a Bessel sequence if $\sum_n |\langle x, x_n\rangle|^2 < \infty$ for all $x \in H$.
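Bessel's inequality can be illustrated in $\mathbb{R}^3$ with an orthonormal pair that is not a basis. This is a small sketch with made-up vectors; the gap between the two sides is the squared length of the component of $x$ orthogonal to the pair.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

# An orthonormal pair in R^3 (not a basis: the third direction is missing).
s = 1 / math.sqrt(2)
orthonormal = [(s, s, 0.0), (s, -s, 0.0)]

x = (3.0, -1.0, 2.0)
bessel_sum = sum(inner(x, e) ** 2 for e in orthonormal)  # sum of |<x, x_n>|^2
norm_sq = inner(x, x)                                    # ||x||^2
```

Here the missing direction is the third coordinate axis, so `norm_sq - bessel_sum` equals the square of the third coordinate of `x`.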
Theorem 16 (Riesz). For any $\ell \in H^*$ there is $h_\ell \in H$ such that $\ell(x) = \langle x, h_\ell\rangle$ for all $x \in H$, and the map $\ell \mapsto h_\ell$ is a surjective isometry (conjugate linear). Consequently $H^* \cong H$.
Proof. We may assume $\ell \neq 0$ (otherwise take $h_\ell = 0$). Clearly $\ker(\ell)$ is a closed linear subspace of $H$; let $Y = \{y \in H : y \perp \ker(\ell)\}$. Then any $x \in H$ has a unique decomposition into $y + z$ with $y \in Y$ and $z \in \ker(\ell)$. Since $\ell$ is nonzero on $Y$, there is $y_0 \in Y$ such that $\ell(y_0) = 1$. We now decompose each $x \in H$ using $x = \ell(x)y_0 + (x - y_0\ell(x))$; the first element is clearly in $Y$ and the second is in $\ker(\ell)$ since $\ell(y_0) = 1$. It follows that $H = \mathrm{span}(y_0, \ker(\ell))$ and we could let $h_\ell = \frac{y_0}{\|y_0\|^2}$; then
\[ \ell(x) = \ell(\langle x, h_\ell\rangle y_0) = \langle x, h_\ell\rangle. \]
It is not hard to check the isometry and (conjugate) linearity of the map $\ell \mapsto h_\ell$.
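The recipe in the proof, $h_\ell = y_0/\|y_0\|^2$ with $y_0 \perp \ker(\ell)$ and $\ell(y_0) = 1$, can be traced in $\mathbb{R}^3$. The sketch below is illustrative only: the functional `ell` and the helper `riesz_representer` are hypothetical names, and the vector orthogonal to the kernel is supplied by hand.

```python
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def riesz_representer(ell, y_perp):
    """Given a functional ell and a nonzero vector y_perp orthogonal to ker(ell),
    return h = y0 / ||y0||^2 where y0 = y_perp / ell(y_perp), so ell(y0) = 1."""
    c = ell(y_perp)
    y0 = [t / c for t in y_perp]
    n2 = inner(y0, y0)
    return [t / n2 for t in y0]

# Hypothetical example functional on R^3.
def ell(x):
    return 2 * x[0] - x[1] + 3 * x[2]

# ker(ell) is the plane orthogonal to (2, -1, 3), so any nonzero multiple works.
h = riesz_representer(ell, [4.0, -2.0, 6.0])

x = [0.3, -1.7, 2.5]
residual = abs(ell(x) - inner(x, h))   # ell(x) and <x, h> should agree
```

In this example the representer is the coefficient vector $(2, -1, 3)$ itself, as expected for a functional given by a dot product.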

A generalization of Riesz's theorem is the Lax-Milgram theorem:
Theorem 17 (Lax-Milgram). Let $B(x, y) : H^2 \to \mathbb{C}$ be a bounded bilinear map (conjugate linear in $y$), $|B(x, y)| \le C\|x\|\|y\|$, such that $|B(x, x)| \ge c\|x\|^2$. Then every $\ell \in H^*$ has the form $\ell(x) = B(x, y_\ell)$ and the map $\ell \mapsto y_\ell$ is bounded, invertible and (conjugate) linear.
Proof. Since $B(x, y)$ is bounded linear in $x$ for each $y$, by Riesz's theorem we could find $T(y) \in H$ such that $B(x, y) = \langle x, T(y)\rangle$ with $\|y\| \lesssim \|T(y)\| \lesssim \|y\|$; furthermore $y \mapsto T(y)$ is linear. It follows that one could invert this map. Now by the Riesz representation theorem $\ell(x) = \langle x, h_\ell\rangle = \langle x, T(T^{-1}(h_\ell))\rangle = B(x, T^{-1}h_\ell)$, so we could let $y_\ell = T^{-1}(h_\ell)$.
4.2 Orthogonal basis for a Hilbert space

We say that a set $E \subset H$ is an orthogonal basis if $E$ is an orthogonal set and whenever $v \perp \mathrm{span}(E)$ we must have $v = 0$. If furthermore every element of $E$ has unit norm then we say that $E$ is an orthonormal basis.
Theorem 18. A Hilbert space is separable iff it has a countable orthonormal basis.
Proof. If there is a countable orthonormal basis $(v_n)$ then we simply use finite linear combinations of the $v_n$ with (complex) rational coefficients and get a dense countable subset. Conversely, if $H$ has a countable dense subset $D$ we could apply Gram-Schmidt to get a countable orthonormal set $E$ whose span contains $D$. Let $v \perp \mathrm{span}(E)$; we will show $v = 0$. If not, we may assume $\|v\| = 1$; then for all $x \in \mathrm{span}(E)$, exploiting orthogonality of $v$ and $x$, we have
\[ \|v - x\|^2 = \|v\|^2 + \|x\|^2 \ge 1, \]
so $\mathrm{span}(E)$ is not dense in $H$, contradiction.
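The Gram-Schmidt step used in the proof can be sketched in finite dimensions as follows. This is an illustrative implementation (the modified variant, for real vectors), with a tolerance `tol` introduced here as an assumption to discard vectors that are numerically dependent on the earlier ones.

```python
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a list of real vectors, dropping (numerically) dependent ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(w, e)                       # component of w along e
            w = [a - c * b for a, b in zip(w, e)]
        n = inner(w, w) ** 0.5
        if n > tol:                               # keep only genuinely new directions
            basis.append([a / n for a in w])
    return basis

# Hypothetical input playing the role of the dense set D (second vector is redundant).
D = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
E = gram_schmidt(D)
gram = [[inner(u, v) for v in E] for u in E]      # should be the identity matrix
```

Every vector of `D` lies in the span of `E`, which is the property the proof relies on.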

Notes:
1. If H is separable then we can extend any given basis to the full basis using Gram-
Schmidt and avoid axiom of choice for Hahn-Banach.
2. If $H$ is not separable and $E$ is an orthogonal basis of $H$ then for each $x \in H$ the set of elements $v \in E$ such that $x$ has nonzero projection onto that element, $\langle x, v\rangle v \neq 0$, is countable. Using this fact and Zorn's lemma we could show that every Hilbert space has an orthonormal basis.
3. If H is separable then we could define a natural isometry i : H → `2 (Z).
4.3 Schauder basis
Schauder basis: A sequence $E = (x_n)$ is a (Schauder) basis for a normed linear space $X$ if for every $x \in X$ there is a unique scalar sequence $(\alpha_n)$ such that $\lim_{n\to\infty} \|x - \sum_{k=1}^n \alpha_k x_k\| = 0$.
Equivalently, $(x_n)$ is a Schauder basis if two conditions hold: the closure of the linear span of $\{x_n\}$ is the whole space and $\sum_{n=1}^\infty a_n x_n = 0$ iff $a_n = 0$ for all $n$.
Note that this is different from an algebraic basis (aka Hamel basis), which consists of linearly independent vectors such that any $x \in X$ can be written as a finite linear combination of these elements. A Hamel basis can be defined on any linear space; topology (convergence) is not needed, as we don't permit infinite sums. Also the order of the elements in a Hamel basis is not important.
A Schauder basis is said to be an unconditional basis if it remains a basis even after reordering. There are two other equivalent characterizations.
Theorem 19. Let $(x_n)$ be a (Schauder) basis in a Banach space (over $\mathbb{R}$ or $\mathbb{C}$). The following are equivalent:
(i) $(x_n)$ is an unconditional basis.
(ii) For any scalar sequence $(\alpha_n)$, if $\sum_n \alpha_n x_n$ converges then it converges unconditionally (i.e. it remains convergent even if we change the summation order).
(iii) There exists a finite $C > 0$ such that
\[ \Big\|\sum_{k=1}^n \epsilon_k \alpha_k x_k\Big\| \le C \Big\|\sum_{k=1}^n \alpha_k x_k\Big\| \]
uniformly over $n$ and all sequences $(\epsilon_n)$ with $|\epsilon_n| \le 1$ and all scalar coefficients $\alpha_1, \dots, \alpha_n$.
Properties:
1. For NLS, existence of a Schauder basis means the space is separable.
2. Every orthogonal basis in a separable Hilbert space is an unconditional Schauder basis.
Examples:
1. Fourier basis: $L^2[0, 1]$ has $(e^{2\pi i n x}/\sqrt{2\pi})_{n\in\mathbb{Z}}$ as an unconditional Schauder basis. This remains a basis for $L^p[0, 1]$, $1 < p < \infty$, but is not unconditional for $p \neq 2$.
2. Haar basis: $1_{[0,1]}$ and $h_I(x) = \frac{1}{\sqrt{|I|}}(1_{I_l} - 1_{I_r})$, indexed by dyadic subintervals $I \subset [0, 1]$ (with left and right halves $I_l$, $I_r$), is a basis for $L^2$ and all $L^p[0, 1]$ with $1 \le p < \infty$; note that $p$ could be 1 here. This is an unconditional basis if $1 < p < \infty$.
3. Orthogonal polynomials: If $w(x)$, $x \in \mathbb{R}$, has sufficiently fast decay (say subexponential, Bernstein's theorem) then the orthogonal polynomials with respect to $d\mu = w\,dx$ form a basis for $L^2(\mathbb{R}, \mu)$. Here $p_n(x) = a_{n,n}x^n + \dots + a_{0,n}$ where $a_{n,n} > 0$ and
\[ \int p_n(x)p_m(x)\,d\mu = \delta_{mn}, \]
which is 1 if $n = m$ and zero otherwise.
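The orthonormality of the Haar system in $L^2[0,1]$ can be verified on the first few dyadic levels. The sketch below is only an illustration: the indexing of $I = [j/2^k, (j+1)/2^k)$ by a pair `(k, j)` and the cutoff `K` are assumptions made here. Since the functions involved are constant on dyadic cells, a midpoint sum on a fine enough dyadic grid computes the integrals up to rounding.

```python
def haar(k, j):
    """Return h_I for the dyadic interval I = [j/2^k, (j+1)/2^k) as a function on [0, 1)."""
    left, mid, right = j / 2**k, (j + 0.5) / 2**k, (j + 1) / 2**k
    height = 2 ** (k / 2)  # equals 1/sqrt(|I|)
    def h(x):
        if left <= x < mid:
            return height      # left half I_l
        if mid <= x < right:
            return -height     # right half I_r
        return 0.0
    return h

def l2_inner(f, g, depth):
    """Integral of f*g over [0,1) for functions constant on dyadic cells of size 2^-depth."""
    n = 2 ** depth
    return sum(f((i + 0.5) / n) * g((i + 0.5) / n) for i in range(n)) / n

K = 3  # use intervals of length 2^-k for k = 0..K
family = [lambda x: 1.0] + [haar(k, j) for k in range(K + 1) for j in range(2**k)]
gram = [[l2_inner(f, g, K + 1) for g in family] for f in family]
```

The matrix `gram` of pairwise inner products should be the identity up to rounding.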
Proof of Theorem 19. It is clear that (i) and (ii) are the same. So we only show equivalence to (iii).
We first observe that the following are equivalent for any given sequence $(x_n)$ in a Banach space $X$:
(a) the series $\sum_n x_n$ converges unconditionally;
(b) for any $\epsilon > 0$ there exists $n = n(\epsilon)$ such that $\|\sum_{k\in M} x_k\| < \epsilon$ for any finite subset $M \subset (n, \infty)$ of the integers.
Note that the direction (b) → (a) is a consequence of the completeness of the space. (Basically (b) implies that for any permutation $\sigma$ of $\mathbb{N}$ the sequence $\sum_{k=1}^n x_{\sigma(k)}$ would be a Cauchy sequence.)
For the other direction (a) → (b), assume towards a contradiction that there is an $\epsilon > 0$ such that no such $n$ could be found; thus we could find a sequence of finite sets $M_1, M_2, \dots$ such that $\max M_j < \min M_{j+1}$ and $\|\sum_{k\in M_j} x_k\| > \epsilon$. It is not hard to build a permutation $\sigma$ of $\mathbb{N}$ such that each $M_j$ appears as a consecutive block of $\sigma(1), \sigma(2), \dots$, thus $\sum_{k=1}^n x_{\sigma(k)}$ is not a Cauchy sequence in $X$ and so won't be convergent, contradiction.
Using this observation we claim that if a given series $\sum_n x_n$ is unconditionally convergent then $\sum_n x_{\sigma(n)}$ is the same as $\sum_n x_n$ for all permutations $\sigma$ of $\mathbb{N}$. Using this fact it is not hard to see that (i) and (ii) are equivalent. To see this claim, just note that for any $\sigma$ there is some $M$ such that $\sigma(M+1), \sigma(M+2), \dots > n(\epsilon)$, therefore $\|\sum_{k>M} x_{\sigma(k)}\| \le \epsilon$ and $\|\sum_{k>n(\epsilon)} x_k\| \le \epsilon$, and by canceling out the common terms it follows that
\[ \Big\|\sum_{k\le n(\epsilon)} x_k - \sum_{k\le M} x_{\sigma(k)}\Big\| = \Big\|\sum_{k\le M:\ \sigma(k)>n(\epsilon)} x_{\sigma(k)}\Big\| < \epsilon, \]
thus $\|\sum_k x_{\sigma(k)} - \sum_k x_k\| < 3\epsilon$ for all $\epsilon > 0$, implying the desired claim.
Now we show that (ii) and (iii) are equivalent. In fact for simplicity we will only show the equivalence to the version with $\epsilon_k = \pm 1$.
It is clear that (iii) implies (ii): suppose that $\sum_k \alpha_k x_k$ converges and (iii) holds true. Then for any $\epsilon > 0$ there exists $n = n(\epsilon)$ such that $\|\sum_{k>n} \alpha_k x_k\| < \epsilon/(2C)$. Given any finite $A \subset (n, \infty)$, by choosing the sign sequence that equals 1 on $A$, equals 0 on $\{1, 2, \dots, n\}$ and equals $\pm 1$ constantly on $\mathbb{N} \setminus A$ and using (iii) we obtain $\|\sum_{k\in A} \alpha_k x_k\| \le 2C\|\sum_{k>n} \alpha_k x_k\| < \epsilon$. Thus by the observation above the series $\sum_k \alpha_k x_k$ converges unconditionally.
Now, to show that (ii) implies the sign sequence version of (iii), we will use the following lemma.
Lemma 3. Let $P_n$ be the projection $P_n x = \sum_{k\le n} \alpha_k x_k$ if $x = \sum_k \alpha_k x_k$ is the expansion into the Schauder basis $(x_n)$ for $x$. Then $\sup_n \|P_n\| < \infty$.
We first prove the above implication using the lemma. The proof of the lemma will be discussed in the next class, after we have introduced several basic tools in Banach spaces, such as the principle of uniform boundedness and the open mapping theorem. Now, it suffices to show that there exist $N > 0$ and a scalar $C_N > 0$ such that for every sequence of scalars $\alpha_j$ and signs $\epsilon_j$ and every $n \ge N$ it holds that $\|\sum_{j=N}^n \epsilon_j\alpha_j x_j\| < C_N\|\sum_{j=1}^n \alpha_j x_j\|$. Indeed, the desired claim would follow from the fact that given any $N$ and any sign sequence $\epsilon_1, \epsilon_2, \dots$ it holds uniformly over $n \ge N$ that
\[ \Big\|\sum_{j=1}^N \epsilon_j\alpha_j x_j\Big\| \lesssim_N \Big\|\sum_{j=1}^n \alpha_j x_j\Big\|. \]
To see this, simply use Lemma 3 to obtain $\|\alpha_j x_j\| \le \|(P_{j-1} - P_j)(\sum_{k=1}^n \alpha_k x_k)\| \le C\|\sum_{k=1}^n \alpha_k x_k\|$ and then use the triangle inequality.
Now assume towards a contradiction that for every $N > 0$ one cannot find such $C_N$. We then construct a sequence $n_1 < n_2 < \dots$ and sets $A_j \subset [n_j, n_{j+1} - 1]$ and scalars $\beta_1, \beta_2, \dots$ recursively, with the following property: for any $j$ it holds that
\[ \Big\|\sum_{k\in A_j} \beta_k x_k\Big\| \ge j^2 \Big\|\sum_{n_j \le k \le n_{j+1}-1} \beta_k x_k\Big\|. \]
Suppose that we have chosen $n_1 < \dots < n_j$ and $A_1, \dots, A_{j-1}$ and $\beta_1, \dots, \beta_{n_j - 1}$ satisfying the above requirements. Then given $C > 0$, by the assumption there exist $n_{j+1}$ and $\beta_1, \dots, \beta_{n_{j+1}-1}$ and $B \subset [n_j, n_{j+1})$ such that
\[ \Big\|\sum_{k\in B} \beta_k x_k - \sum_{k\in[n_j,n_{j+1})\setminus B} \beta_k x_k\Big\| \ge C\Big\|\sum_{1\le k\le n_{j+1}-1} \beta_k x_k\Big\|. \]
On the other hand, by the triangle inequality
\[ \Big\|\sum_{n_j\le k\le n_{j+1}-1} \beta_k x_k\Big\| \le \Big\|\sum_{1\le k\le n_{j+1}-1} \beta_k x_k\Big\| + \Big\|P_{n_j-1}\Big(\sum_{1\le k\le n_{j+1}-1} \beta_k x_k\Big)\Big\| \lesssim_{n_j} \Big\|\sum_{1\le k\le n_{j+1}-1} \beta_k x_k\Big\|, \]
therefore by choosing $C$ large enough we could ensure that
\[ \Big\|\sum_{k\in B} \beta_k x_k - \sum_{k\in[n_j,n_{j+1})\setminus B} \beta_k x_k\Big\| \ge 2j^2\Big\|\sum_{n_j\le k\le n_{j+1}-1} \beta_k x_k\Big\|. \]
From here it is clear that either $A_j = B$ or $A_j = [n_j, n_{j+1}) \setminus B$ will work, with $\alpha_k = \beta_k$ for $k \in [n_j, n_{j+1})$. This completes the selection of $A_j$ and $n_j$ and $\alpha_j$.
Now by rescaling the $\alpha_k$'s we may assume that $\|\sum_{k\in A_j} \alpha_k x_k\| = 1$ while $\|\sum_{k=n_j}^{n_{j+1}-1} \alpha_k x_k\| \le 1/j^2$. Now, given any $m_1 < m_2$ we let $n_s < \dots < n_t$ be the elements of $(n_j)$ inside $[m_1, m_2)$; then using the uniform boundedness of $P_n$ we obtain
\[ \Big\|\sum_{k=m_1}^{m_2} \alpha_k x_k\Big\| \lesssim \sum_{j=s-1}^{t} \Big\|\sum_{n_j\le k<n_{j+1}} \alpha_k x_k\Big\| \lesssim \frac{1}{s}, \]
and clearly if $m_1 \to \infty$ then so does $s$, thus $\sum_k \alpha_k x_k$ converges. On the other hand it does not converge unconditionally since it violates the second property (b) of the initial observation: given any $n$ we could select $j$ such that $n_j > n$, and so $A_j \subset (n, \infty)$ and $\|\sum_{k\in A_j} \alpha_k x_k\| = 1$.

Exercise:
1. Prove that (e2πinx )n∈Z is an unconditional Schauder basis for L2 [0, 1].
2. Prove that the Haar functions are orthogonal in L2 [0, 1] and form a Schauder basis
for L1 [0, 1] but not unconditional in L1 .
3. (Radon-Nikodym) Let $\mu$ and $\nu$ be two nonnegative $\sigma$-finite measures on the same $\sigma$-algebra $(X, \mathcal{A})$ such that the following holds: for every measurable $A$, if $\mu(A) = 0$ then $\nu(A) = 0$. Show that there exists a nonnegative $g$ such that for every measurable $E$ it holds that $\nu(E) = \int_E g\,d\mu$.
Chapter 5
Basic results about Banach Spaces
Here we will prove several basic results about bounded linear maps between Banach spaces:
the principle of uniform boundedness, the open mapping theorem, and the closed graph
theorem, and applications. We then focus on Lp spaces.
5.1 The Baire category theorem

We first recall the Baire category theorem for metric spaces, which will be used in the proofs
of these theorems.
Below we say that a set $A$ is nowhere dense if its closure $\overline{A}$ has empty interior.
Theorem 20 (Baire). A complete metric space cannot be written as a countable union of nowhere dense sets.
Proof: If $X = \bigcup_n A_n$ with each $A_n$ nowhere dense, we can construct a Cauchy sequence $\{x_n\}$ that does not have a limit as follows. Since $A_1$ is nowhere dense there is an open ball of small radius $B_1 = B(x_1, r_1)$ such that $\overline{B_1} \subset X \setminus \overline{A_1}$. Then again find an open ball $B_2 = B(x_2, r_2)$ such that $\overline{B_2} \subset B_1 \setminus \overline{A_2}$. Repeat this, making sure that $r_k \le r_{k-1}/2$; we get a Cauchy sequence which cannot converge to anything in any $A_k$, so it has no limit inside $X$, contradicting completeness.
5.2 The principle of uniform boundedness

Let $X$ be a Banach space and let $T_\alpha$, $\alpha \in A$, be a family of bounded linear operators from $X$ to some normed linear space $Y$.
Theorem 21 (P.U.B., aka Banach-Steinhaus). If $\sup_{\alpha\in A} \|T_\alpha x\| < \infty$ for every $x \in X$ then $\sup_{\alpha\in A} \|T_\alpha\| < \infty$.
Remark: the opposite direction is clearly true.
Example: If $g_n \in L^\infty(\mathbb{R})$ are such that $\sup_n |\int g_n(x)f(x)\,dx| < \infty$ for every $f \in L^1(\mathbb{R})$ then $\sup_n \|g_n\|_\infty < \infty$.
Proof. We first show that a linear operator $T : X \to Y$ (we only need normed linear spaces $X$, $Y$) is bounded if and only if $T^{-1}(\{\|y\|_Y \le 1\})$ has nonempty interior in $X$.
Certainly if $T$ is bounded then the set $T^{-1}(\{\|y\|_Y \le 1\})$ contains $\{\|x\|_X \le 1/\|T\|\}$, so it has nonempty interior.
Conversely, if $T^{-1}$(unit ball in $Y$) contains an interior point $x_0$, then for some $\epsilon > 0$ it holds for every $\|x\|_X \le \epsilon$ that $\|T(x_0 + x)\| \le 1$. It follows from the triangle inequality that $\|Tx\| \le 1 + \|Tx_0\|$ for such $x$, thus $\|T\| \le \frac{1 + \|Tx_0\|}{\epsilon}$.
Therefore in order to show P.U.B. it suffices to show that the set
\[ B_1 = \bigcap_{\alpha\in A} T_\alpha^{-1}(\{\|y\| \le 1\}) \]
has an interior point. In fact it suffices to show that for some $n > 0$ the (closed) set $B_n$ (replace the radius by $n$) has nonempty interior. But $\bigcup_n B_n = X$ because $\sup_\alpha \|T_\alpha x\| < \infty$ for each $x$. Thus the desired claim follows from the Baire category theorem.
5.3 The open mapping theorem

Theorem 22 (Open mapping, aka Banach-Schauder). Let $T : X \to Y$ be a linear map between Banach spaces. If $T$ is bounded and surjective then $T$ maps open sets to open sets.
Remark: as a corollary, if $T : X \to Y$ is a bounded bijective linear operator between Banach spaces then $T^{-1}$ exists and is bounded linear.
Proof. Let $B_r = \{\|x\| < r\} \subset X$.
By translation invariance it suffices to show that for every $r > 0$ the set $T(B_r)$ contains a neighborhood of 0. By dilation, if this holds for one $r > 0$ then it holds for all $r > 0$.
Now this in turn follows if we could show that $T(B_r)$ contains an interior point: indeed suppose that for some $\epsilon > 0$ and $y_0$ we have $\{\|y - y_0\| < \epsilon\} \subset T(B_r)$; then by linearity $T(B_{2r})$ contains $T(B_r) - T(B_r)$, which in turn contains $\{\|y\| < \epsilon/2\}$, a neighborhood of 0.
To show that $T(B_r)$ has nonempty interior, we first note that the closure $\overline{T(B_r)}$ has nonempty interior. This follows from surjectivity of $T$ and the Baire category theorem, since $\bigcup_n T(\{\|x\| < n\}) = Y$. Therefore it suffices to show that $T(B_r)$ contains $\overline{T(B_{r/10})}$.
To see this, we note that $\overline{T(B_r)}$ contains a neighborhood of 0 via the set difference argument (as above). Assume that $\{\|y\| \le r/C\} \subset \overline{T(B_r)}$ for some $C > 0$; by dilation invariance this $C$ could be chosen uniformly over $r$. Let $y \in \overline{T(B_{r/10})}$; then there is a point $x_1 \in B_{r/10}$ such that $\|y - Tx_1\| < \frac{r}{20C}$, thus $y - Tx_1 \in \overline{T(B_{r/20})}$, so again there is $x_2 \in B_{r/20}$ such that $\|y - Tx_1 - Tx_2\| < \frac{r}{40C}$. By repeating this argument we obtain $x_1, x_2, \dots$ such that $\|x_n\| \le \frac{r}{2^n \cdot 5}$ and $T(x_1 + \dots + x_n) \to y$ in $Y$ as $n \to \infty$. Since $X$ is complete it follows that $x_1 + \dots + x_n \to x \in X$ as $n \to \infty$, thus using continuity of $T$ it follows that $y = Tx$, and clearly $\|x\| \le \sum_k \|x_k\| < r$, thus $y \in T(B_r)$ as desired.
5.4 The closed graph theorem

The graph of a function $f : X \to Y$ is defined to be $\Gamma(f) = \{(x, f(x)) : x \in X\}$, which is a subset of $X \times Y$ equipped with the product topology. If $X$ and $Y$ are normed spaces then the product topology on $X \times Y$ is the topology of the norm $\|(x, y)\| = \|x\| + \|y\|$.
Theorem 23 (Closed graph). Let $T : X \to Y$ be linear where $X$ and $Y$ are Banach spaces. Then $T$ is bounded iff $\Gamma(T)$ is closed.
Remark: Since $X \times Y$ is a Banach space also, $\Gamma(T)$ is closed iff whenever $(x_n, T(x_n))$ converges to $(x, y)$ one must have $y = Tx$. Comparing with the usual continuity of $T$ (which is equivalent to boundedness of $T$), which states that whenever $x_n \to x$ we must have $Tx_n \to Tx$, we see that the extra thing we could assume is the convergence of $Tx_n$.
Proof. Clearly if $T$ is bounded then $T$ is continuous and the graph is closed. Assume now that $\Gamma(T)$ is closed; then $\Gamma(T)$ is a Banach subspace of $X \times Y$. Let $\pi_1(x, y) := x$ and $\pi_2(x, y) := y$ be the coordinate projections. By the open mapping theorem, the bounded bijective map $\pi_1 : \Gamma(T) \to X$ is boundedly invertible, therefore $T = \pi_2 \circ \pi_1^{-1}$ is bounded.
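Corollary 2 (Hellinger-Toeplitz). If $A$ is a linear operator on a Hilbert space $H$ and $\langle x, Ay\rangle = \langle Ax, y\rangle$ for all $x, y \in H$, then $A$ is bounded.
Proof. Clearly $\Gamma(A)$ is closed: if $(x_n, Ax_n)$ converges to $(x, y)$ then for every $z \in H$ we have $\langle z, y\rangle = \lim \langle z, Ax_n\rangle = \lim \langle Az, x_n\rangle = \langle Az, x\rangle = \langle z, Ax\rangle$, thus $y = Ax$.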
5.5 An application: uniform boundedness of partial sum projections for Schauder basis
Here we prove Lemma 3 from the last chapter.
Recall that $P_n$ is the projection $P_n x = \sum_{k\le n} \alpha_k x_k$ if $x = \sum_k \alpha_k x_k$ is the expansion into the Schauder basis $(x_n)$ for $x \in X$, a Banach space.
We need to show that $\sup_n \|P_n\| < \infty$.
By the principle of uniform boundedness it suffices to show that the linear operators $P_n$ are bounded (the pointwise uniform boundedness of $P_n$ follows from the fact that at each $x \in X$ we have $P_n x \to x$, hence $\|P_n x\| \to \|x\|$, and a convergent sequence is always bounded). Now to show that the $P_n$ are bounded, we consider the following norm on $X$:
\[ \|x\|_1 = \sup_n \|P_n x\|. \]
It is not hard to see that this is a norm and $\|x\| \le \|x\|_1$. We will show that $P_n$ is bounded in $(X, \|\cdot\|_1)$ and the norm $\|\cdot\|_1$ is actually equivalent to $\|\cdot\|$; these two facts will take care of the lemma. Now the boundedness of $P_n$ with respect to the new norm is clear:
\[ \|P_n x\|_1 = \sup_m \|P_m P_n x\| = \sup_m \|P_{\min(m,n)} x\| \le \|x\|_1. \]
Now we show that $(X, \|\cdot\|_1)$ is complete. Once we have done that, the identity map from $(X, \|\cdot\|_1)$ to $(X, \|\cdot\|)$ is a bijective bounded linear map, thus by the open mapping theorem it is boundedly invertible, thus the two norms are equivalent, which completes the proof of the lemma. Now, let $X'$ be the completion of $X$ under $\|\cdot\|_1$; then it is not hard to see that $(x_n)$ is still a Schauder basis for $(X', \|\cdot\|_1)$ (see the homework). So if $a \in X'$ we have $a = \sum_j \alpha_j x_j$ with convergence in $\|\cdot\|_1$, but this series also converges in $\|\cdot\|$ because $\|\sum_{m<j\le n} \alpha_j x_j\| \le \|\sum_{m<j\le n} \alpha_j x_j\|_1 \to 0$ and $X$ is complete. So, let $b \in X$ be its limit under $\|\cdot\|$. Now, it is clear that $b = a$ because
\[ \Big\|b - \sum_{j\le n} \alpha_j x_j\Big\|_1 = \sup_m \Big\|P_m\Big(b - \sum_{j\le n} \alpha_j x_j\Big)\Big\| = \sup_{m\ge n} \|(P_m - P_n)b\| \to 0 \]
as $n \to \infty$.
5.6 Lp spaces
In this section we conduct a case study of Lp spaces, which are the most fundamental type
of Banach spaces.
5.6.1 Separability
Recall that if (X, A, µ) is separable then Lp (X, A, µ) is separable for 1 ≤ p < ∞; this is part
of the homework.
5.6.2 Duality
We'll use Hilbert space techniques to show the following.
Theorem 24. For $1 < p < \infty$ the dual of $L^p(X, \mu)$ is $L^q(X, \mu)$ where $1/p + 1/q = 1$. If furthermore the measure $\mu$ is $\sigma$-finite then the dual of $L^1$ is $L^\infty$.
Proof. The simpler case is when $\mu$ is $\sigma$-finite; in this case we will show the result for all $1 \le p < \infty$. We first reduce the general case to it.
If $\mu$ is not necessarily $\sigma$-finite we will need $p > 1$. Let $\ell \in (L^p)^*$. Suppose that $E \subset X$ is measurable such that $\mu|_E$ is $\sigma$-finite. Clearly $\ell$ is also a bounded linear functional on $L^p(E, \mu)$ whose norm does not exceed $\|\ell\|$. Thus, by the $\sigma$-finite case, there is $g_E \in L^q(E, \mu)$ such that for every $f \in L^p(E)$
\[ \ell(f) = \int_E f g_E\,d\mu, \qquad \|g_E\|_q \le \|\ell\|. \]
Now if $E \subset E'$ are both measurable and $\sigma$-finite wrt $\mu$ then it is clear that $g_E = g_{E'}$ on $E$.
Now, using the fact that a countable union of $\sigma$-finite sets is $\sigma$-finite, one could find $F$ $\sigma$-finite wrt $\mu$ such that $\|g_F\|_q = \sup_{E\ \sigma\text{-finite}} \|g_E\|_q$. We will show that
\[ \ell(f) = \int f g_F\,d\mu \]
for every $f \in L^p(X)$. Observe that $g_A|_{A\setminus F} = 0$ for every $\sigma$-finite $A$, because otherwise $\int |g_{A\setminus F}|^q\,d\mu > 0$ and therefore $\|g_{A\cup F}\|_q > \|g_F\|_q$, contradiction (note that here is where we need $p > 1$ so that $q < \infty$). In particular let $A = \{|f| > 0\}$, which is $\sigma$-finite wrt $\mu$, and so
\[ \ell(f) = \int f g_A\,d\mu = \int f g_F\,d\mu. \]
We now focus on the $\sigma$-finite case. For simplicity we will assume $\mu(X) < \infty$; otherwise we could decompose $X$ into countably many subsets $X_k$ where $\mu$ is finite and apply this argument to each subspace and add things up: because of continuity of $\ell$, for each $f \in L^p(X)$ we have
\[ \ell(f) = \lim_{n\to\infty} \sum_{k\le n} \int_{X_k} f g_k\,d\mu, \]
and we use monotone convergence plus the fact that $\|\sum_{k\le n} g_k\|_q = (\sum_{k\le n} \|g_k\|_q^q)^{1/q} \le \|\ell\| < \infty$ to interchange the sum and define $g = \sum_k g_k$. (Note that the $g_k$ have disjoint supports.)
Now we assume $\mu(X) < \infty$. In this case it can be shown that any $\ell$ could be written as a linear combination of finitely many positive linear functionals, i.e. functionals with $\ell(f) \ge 0$ if $f \ge 0$. So we may assume $\ell$ is positive.
For each measurable $E \subset X$ construct a measure
\[ \nu(E) = \ell(1_E). \]
It is clear that $\nu \ll \mu$, therefore by the Radon-Nikodym theorem there exists $g \in L^1$, $g \ge 0$, such that
\[ \ell(1_E) = \nu(E) = \int_E g\,d\mu \]
for any measurable $E \subset X$. Now given any $f \in L^p$ we will show that $\ell(f) = \int f g\,d\mu$. We may assume wlog that $f \ge 0$; now by the monotone convergence theorem it suffices to show this for $f$ being simple functions, for which the claim holds. It remains to show that $g \in L^q$ (note that since we assume $\mu(X) < \infty$ this is better than $g \in L^1$, since $L^1$ contains $L^q$ for all $q \ge 1$). Simply let $f = |g|^{q-1} 1_{|g|\le K}$; then
\[ \int |g|^q 1_{|g|\le K}\,d\mu = \ell(f) \le \|\ell\|\|f\|_p = \|\ell\| \Big(\int |g|^q 1_{|g|\le K}\,d\mu\Big)^{1/p}, \]
thus, dividing and using $1 - 1/p = 1/q$,
\[ \Big(\int |g|^q 1_{|g|\le K}\,d\mu\Big)^{1/q} \le \|\ell\|, \]
therefore by monotone convergence we obtain $\|g\|_q \le \|\ell\|$.
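The choice $f = |g|^{q-1}$ in the last step is exactly the case of equality in Hölder's inequality. This can be seen numerically on a finite set with counting measure. The sketch below is illustrative: the exponent `p`, the data `g` and the helper names are assumptions made here, and a sign factor is included so that signed `g` is handled as well.

```python
def p_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def extremal_f(g, q):
    """f = |g|^(q-1) * sign(g), the test function used to bound ||g||_q."""
    return [abs(t) ** (q - 1) * (1 if t >= 0 else -1) for t in g]

p = 3.0
q = p / (p - 1)          # conjugate exponent, 1/p + 1/q = 1
g = [0.5, -2.0, 1.5, 0.0, -0.25]
f = extremal_f(g, q)

pairing = sum(a * b for a, b in zip(f, g))       # ell(f) = sum f g = sum |g|^q
holder_bound = p_norm(f, p) * p_norm(g, q)       # ||f||_p ||g||_q

# For a generic f the pairing is only bounded by, not equal to, the Hoelder bound.
f_other = [1.0, 1.0, 1.0, 1.0, 1.0]
pairing_other = sum(a * b for a, b in zip(f_other, g))
bound_other = p_norm(f_other, p) * p_norm(g, q)
```

Equality of `pairing` and `holder_bound` is what gives $\|g\|_q \le \|\ell\|$ after dividing by $\|f\|_p$.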

5.6.3 Basic facts about bounded operators on Lp

The most typical operators on $L^p$ spaces are integral operators. Let $(X, \mu)$ and $(Y, \sigma)$ be two measure spaces. Let
\[ Tf(y) = \int_X K(x, y)f(x)\,d\mu(x), \]
mapping from measurable functions on $(X, \mu)$ to measurable functions on $(Y, \sigma)$. At the beginning the operator $T$ may not be defined for all $f$, as the integral may not be convergent. The adjoint of $T$ is
\[ T^*g(x) = \int_Y K(x, y)g(y)\,d\sigma(y). \]
Think of this as a generalization of a matrix in the finite dimensional setting (integration is like adding over indices and $K$ gives the matrix entries). This turns out to be the case if, say, $X = \{1, \dots, n\}$ and $Y = \{1, \dots, m\}$ with the counting measures. The function $K$ is called the kernel (assumed measurable etc.).
Boundedness of integral operators

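Theorem 25. For $1 \le p \le \infty$ it holds that
\[ \|T\|_{L^p\to L^p} \le \Big[\sup_x \int_Y |K(x, y)|\,d\sigma(y)\Big]^{1/p} \Big[\sup_y \int_X |K(x, y)|\,d\mu(x)\Big]^{1/p'}. \]
In the special case $p = 2$ one also has
\[ \|T\|_{L^2\to L^2} \lesssim \Big(\int_{X\times Y} |K(x, y)|^2\,d\mu(x)\,d\sigma(y)\Big)^{1/2}. \]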
Remarks: The dual estimates hold for $T^*$, namely we also have the same $L^2 \to L^2$ estimate and
\[ \|T^*\|_{L^{p'}\to L^{p'}} \le \Big[\sup_x \int_Y |K(x, y)|\,d\sigma(y)\Big]^{1/p} \Big[\sup_y \int_X |K(x, y)|\,d\mu(x)\Big]^{1/p'}. \]
In fact we will see that $\|T\|_{p\to p} = \|T^*\|_{p'\to p'}$.
Operators for which $(\int_{X\times Y} |K(x, y)|^2\,d\mu(x)\,d\sigma(y))^{1/2} < \infty$ are called Hilbert-Schmidt operators and the right hand side is called the Hilbert-Schmidt norm. In the finite dimensional setting, for instance if $X = Y = \{1, \dots, n\}$ with counting measures, this norm is the same as $(\sum_{1\le i,j\le n} |K(i, j)|^2)^{1/2}$, or equivalently the $\ell^2$ norm of the sequence of singular values (which is the $\ell^2$ norm of the eigenvalues when the matrix is normal, e.g. self-adjoint).
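In the matrix setting both bounds on $\|T\|_{L^2 \to L^2}$ are easy to compare numerically. The following is a sketch under stated assumptions: the kernel `K` is a made-up real $3\times 3$ matrix, and the operator norm is only estimated, by power iteration on $T^*T$, which produces a value that never exceeds the true norm.

```python
def apply_T(K, f):
    """(T f)(y) = sum_x K[x][y] f[x], with counting measures on X and Y."""
    return [sum(K[x][y] * f[x] for x in range(len(K))) for y in range(len(K[0]))]

def apply_T_adjoint(K, g):
    """(T* g)(x) = sum_y K[x][y] g[y] for a real kernel."""
    return [sum(K[x][y] * g[y] for y in range(len(K[0]))) for x in range(len(K))]

def norm2(v):
    return sum(t * t for t in v) ** 0.5

def operator_norm_estimate(K, steps=200):
    """Power iteration on T*T; the returned value never exceeds the true L2 -> L2 norm."""
    f = [1.0] * len(K)
    for _ in range(steps):
        f = apply_T_adjoint(K, apply_T(K, f))
        n = norm2(f)
        f = [t / n for t in f]
    return norm2(apply_T(K, f))

# Hypothetical kernel on X = Y = {0, 1, 2}.
K = [[1.0, -2.0, 0.5], [0.0, 3.0, 1.0], [2.0, 1.0, -1.0]]
row_sup = max(sum(abs(t) for t in row) for row in K)               # sup_x sum_y |K(x,y)|
col_sup = max(sum(abs(row[y]) for row in K) for y in range(3))     # sup_y sum_x |K(x,y)|
schur_bound = (row_sup * col_sup) ** 0.5                           # p = 2 case of Theorem 25
hs_norm = sum(t * t for row in K for t in row) ** 0.5              # Hilbert-Schmidt norm
op_norm = operator_norm_estimate(K)
```

Neither bound dominates the other in general; for this particular kernel the Hilbert-Schmidt norm happens to be the smaller of the two.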
For $1 < p < \infty$, to use the first estimate we need both factors to be finite. But if $p = 1$ or $p = \infty$ we only need one of them to be finite. More precisely
\[ \|T\|_{L^1\to L^1} \le \sup_x \int_Y |K(x, y)|\,d\sigma(y), \]
\[ \|T\|_{L^\infty\to L^\infty} \le \sup_y \int_X |K(x, y)|\,d\mu(x). \]
These turn out to be the easiest cases. For the $L^1$ case this follows from
\[ \|Tf\|_{L^1(Y,\sigma)} \le \int_Y \int_X |K(x, y)||f(x)|\,d\mu(x)\,d\sigma(y) = \int_X |f(x)| \Big[\int_Y |K(x, y)|\,d\sigma(y)\Big]\,d\mu(x) \le \|f\|_1 \sup_x \int_Y |K(x, y)|\,d\sigma(y). \]
Similarly for the $L^\infty$ case we have, for each $y \in Y$,
\[ |Tf(y)| \le \int_X |K(x, y)||f(x)|\,d\mu(x) \le \|f\|_\infty \sup_y \int_X |K(x, y)|\,d\mu(x). \]
To get the case $1 < p < \infty$ of the first estimate, we will need the Riesz-Thorin interpolation theorem (aka convexity theorem).
The Riesz-Thorin interpolation theorem

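Theorem 26 (Riesz-Thorin). Let $1 \le p_0 \le p_1 \le \infty$ and $1 \le q_0 \le q_1 \le \infty$. Let $T$ be a linear map from functions on $(X, \mu)$ to functions on $(Y, \sigma)$ such that
\[ M_0 := \|T\|_{L^{p_0}(X,\mu)\to L^{q_0}(Y,\sigma)} < \infty, \qquad M_1 := \|T\|_{L^{p_1}(X,\mu)\to L^{q_1}(Y,\sigma)} < \infty. \]
Let $1 < p, q < \infty$ be such that for some $0 < \alpha < 1$
\[ \Big(\frac1p, \frac1q\Big) = (1 - \alpha)\Big(\frac1{p_0}, \frac1{q_0}\Big) + \alpha\Big(\frac1{p_1}, \frac1{q_1}\Big). \]
Then
\[ \|T\|_{L^p(X,\mu)\to L^q(Y,\sigma)} \le M_0^{1-\alpha} M_1^{\alpha}. \]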
Proof. We will use the Hadamard three lines theorem: let $h$ be analytic and bounded in the strip $\{\mathrm{Re}(z) \in [0, 1]\}$. For each $\alpha \in [0, 1]$ let $H(\alpha) = \sup_{\mathrm{Re}(z)=\alpha} |h(z)|$. Then
\[ H(\alpha) \le H(0)^{1-\alpha} H(1)^{\alpha}. \]
The proof of this is an application of the maximum principle: if $H(0) = H(1) = 1$ then the claim follows by the maximum principle; in the general case modify $h$ by a suitable exponential factor to reduce to this setting.
We now set up the bounded analytic function $h(z)$. For $\mathrm{Re}(z) \in [0, 1]$ let $p_z$ and $q_z$ be defined by
\[ \Big(\frac1{p_z}, \frac1{q_z}\Big) = (1 - z)\Big(\frac1{p_0}, \frac1{q_0}\Big) + z\Big(\frac1{p_1}, \frac1{q_1}\Big). \]
Then $1/p_z$ and $1/q_z$ are analytic functions of $z$. For any $f \in L^p$ and $g \in L^{q'}$ ($1/q' + 1/q = 1$) we define
\[ h(z) = \int_Y T(f_z) g_z\,d\sigma, \]
where $f_z(x) = |f(x)|^{\frac{p}{p_z} - 1} f(x)$ and $g_z(x) = |g(x)|^{\frac{q'}{q'_z} - 1} g(x)$, with $1/q'_z = 1 - 1/q_z$. It is not hard to see that $h(z)$ is bounded analytic for $\mathrm{Re}(z) \in [0, 1]$ and $h(\alpha) = \int Tf\, g\,d\sigma$, therefore using the duality characterization for $L^p$ spaces we obtain
\[ \|T\|_{L^p\to L^q} = \sup_{\|f\|_p = \|g\|_{q'} = 1} |h(\alpha)| \le \sup_{\|f\|_p = \|g\|_{q'} = 1} H(\alpha). \]
Note that everything depends on $f, g$ where $f \in L^p$ and $g \in L^{q'}$. With the normalization $\|f\|_p = \|g\|_{q'} = 1$ it is not hard to see that
\[ H(0) \le M_0, \qquad H(1) \le M_1, \]
therefore the desired claim follows from the Hadamard three lines lemma.
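For completeness, here is how the case $1 < p < \infty$ of the first estimate in Theorem 25 can be read off from Theorem 26 and the two endpoint bounds. The choice of endpoints $(p_0, q_0) = (1, 1)$ and $(p_1, q_1) = (\infty, \infty)$ is the natural one but is spelled out here as a worked sketch rather than quoted from the notes.

```latex
% Endpoint bounds from the L^1 and L^\infty cases:
\[
  M_0 = \|T\|_{L^1 \to L^1} \le \sup_x \int_Y |K(x,y)|\, d\sigma(y), \qquad
  M_1 = \|T\|_{L^\infty \to L^\infty} \le \sup_y \int_X |K(x,y)|\, d\mu(x).
\]
% Interpolation relation with q = p, p_0 = q_0 = 1 and p_1 = q_1 = \infty:
\[
  \frac1p = (1-\alpha)\cdot 1 + \alpha \cdot 0 = 1 - \alpha
  \quad\Longrightarrow\quad
  \alpha = 1 - \frac1p = \frac1{p'}, \qquad 1 - \alpha = \frac1p .
\]
% Conclusion of Riesz-Thorin:
\[
  \|T\|_{L^p \to L^p} \le M_0^{1-\alpha} M_1^{\alpha}
  \le \Big[\sup_x \int_Y |K(x,y)|\, d\sigma(y)\Big]^{1/p}
      \Big[\sup_y \int_X |K(x,y)|\, d\mu(x)\Big]^{1/p'} .
\]
```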
Exercises:
1. Let X be a Banach space. Prove that separability of X ∗ implies separability of X.
2. Prove that the measure space $(X, \mathcal{A}, \mu)$ is separable if and only if $L^2(X, \mathcal{A}, \mu)$ is separable. Is this true if we replace 2 by any fixed $p \in (1, \infty)$? Use this to construct a finite measure space $(X, \mu)$ ($\mu(X) < \infty$) such that $L^2(X, \mu)$ is nonseparable.
3. Prove that if X is a compact metric space then C(X) with the sup norm is separable.
4. Let X be a normed linear space and assume that (xn ) is a Schauder basis, furthermore
the canonical projection Pn into the first n vectors is bounded for each n ≥ 1. Show that (xn )
remains a Schauder basis for the completion of X. [Hint: show that Pn extends boundedly
to the completed space and use Pn to find the linear expansion/proving required properties
for Schauder basis.]
5. Let ` be a bounded linear functional on Lp (X, µ) where µ(X) < ∞. Show directly
(without using the duality theorem) that ` could be written as a linear combination of finitely
many positive linear functionals on Lp (X, µ). Here 1 ≤ p < ∞.
6. Show that every NLS is isomorphic to a subspace of the space of bounded continuous functions on some complete metric space $Y$. (Footnote: if the given space is separable then one could take $Y$ to be $[0, 1]$; this is the Banach-Mazur theorem.)
Chapter 6
Bounded and continuous functions on a locally compact Hausdorff space and dual spaces
Recall that the dual space of a normed linear space is a Banach space, and the dual space of $L^p$ is $L^q$ where $1/p + 1/q = 1$ if $1 < p < \infty$. If the underlying measure space is $\sigma$-finite then the dual of $L^1$ is $L^\infty$. What about the dual of $L^\infty$? Or of its subspaces $C_o(X)$ and $C_c(X)$, where say $X$ is locally compact Hausdorff? In this chapter we discuss several representation theorems related to these themes.
6.1 Riesz representation theorems

6.1.1 The dual space of L∞
The dual space of $L^\infty(X, \Sigma, \mu)$ consists of bounded (i.e. the total measure of $X$ is finite) and finitely additive signed/complex measures on $\Sigma$ that are absolutely continuous with respect to $\mu$.
Clearly if $\sigma$ is such a measure we may define a linear functional $\ell_\sigma(f) := \int f\,d\sigma$; it is not hard to see that this is well defined as a bounded linear functional on $L^\infty$. (The fact that $\sigma \ll \mu$ ensures that if $f$ and $f'$ are equal almost everywhere with respect to $\mu$, i.e. they represent the same $L^\infty$ function, then the two integrals agree.) Note that it is possible to define integration with respect to a finitely additive measure: start with positive functions and define the integral as the supremum over integrals of simple functions dominated by the given positive function. Then define the integral of signed functions and complex valued functions etc.
Conversely, given a bounded linear functional $\ell$, after several reductions we may assume that $\ell$ is nonnegative, namely if $f \ge 0$ then $\ell(f) \ge 0$. Then we may define $\sigma(E) = \ell(1_E)$; it is not hard to see that $\sigma$ is a finitely additive bounded measure, and if $\mu(E) = 0$ then $1_E \equiv 0$ in $L^\infty(\mu)$, thus $\sigma(E) = \ell(1_E) = \ell(0) = 0$. Boundedness of the measure follows from choosing the right input.
6.1.2 Measures on a locally compact Hausdorff space

There are several related and sometimes equivalent notions of measures on a given locally
compact Hausdorff space. Here we list and compare some of them.
Regular Borel measures and Radon measures

Recall that a Borel set is a set in the σ-algebra generated by the open sets.
A regular Borel measure is a Borel measure (i.e. all Borel sets on X are measurable)
such that for any Borel set E the following holds.
µ(E) = inf {µ(U ) : U open, E ⊂ U }
µ(E) = sup {µ(K) : K compact, K ⊂ E}

A Radon measure is a locally finite regular Borel measure, i.e. in addition to being regular Borel it also assigns a finite value to all compact subsets of $X$.
Note that some textbooks define Radon measure as locally finite inner regular Borel
measures, thus requiring only that measures of Borel sets could be approximated from inside
using measures of compact subsets.
Baire sets and Baire measures

A Baire set is an element of the sigma algebra generated by all compact subsets of X that
are at the same time countable intersections of open sets.
A Baire measure is a measure on the Baire sigma algebra.
One could check that a Baire measure is also regular with respect to the Baire sigma
algebra.
The Baire sigma algebra is the smallest sigma algebra such that all continuous functions
are measurable.
It can be shown that any Baire measure extends uniquely to a Radon measure and vice
versa.
6.1.3 The dual space of Co (X) and Cc (X)

Let X be LCH. Since the closure of Cc (X) under the uniform norm is Co (X), it suffices to
consider the dual of Co (X). Note that if X is not compact then C(X) is not a normed linear
space with the sup norm, and we have to look at its dual as a topological dual. More on this
in later chapters.
We say that a signed measure is a signed Radon measure if it could be written as the
difference of two Radon measures. We say that a complex valued measure is a complex
valued Radon measure if its real and imaginary parts are signed Radon measures.
Theorem 27 (Riesz). Let $X$ be a LCH. Then the dual of $C_o(X)$ is $M(X)$, the space of complex valued Radon measures (locally finite regular Borel measures) on $X$. Namely the following map is an isometric isomorphism between $M(X)$ and this dual:
\[ \mu \mapsto \ell_\mu, \qquad \ell_\mu(f) = \int f\,d\mu, \]
and $\|\ell_\mu\| = \|\mu\|$, the total variation of $\mu$, defined by $\sup \sum_{j=1}^m |\mu(A_j)|$ with the supremum taken over all finite collections $A_1, \dots, A_m$ of disjoint measurable subsets of $X$.
Step 1: We first show that if ℓ is a bounded positive linear functional on Co,R (X), the
space of real valued continuous functions on X that vanish at ∞, then there is a Radon
measure µ such that for any f ∈ Co,R (X) it holds that
\[ \ell(f) = \int f \, d\mu. \]
To see this, we would like to define µ by µ(E) = ℓ(1E ); however 1E is not continuous, therefore
one cannot apply ℓ to this function. To get around this, for each open set U ⊂ X we use the
following definition.
Definition: µ(U ) is defined to be the supremum of ℓ(f ) over all f ∈ Co,R (X) such that the
support of f is inside U and 0 ≤ f ≤ 1 pointwise.
(Note that we may not be able to do this if U weren’t open, since such a continuous
function may not exist; for open U the existence of f is due to Urysohn’s lemma.)
Now we obtain a premeasure on open sets which is a ring of sets, thus by Caratheodory
we may extend µ to the sigma algebra generated by these sets, which are exactly the Borel
sets.
We want to show that ℓ(f ) = ∫ f dµ for every f ∈ Co (X). Without loss of generality
assume 0 ≤ f ≤ 1. Since f vanishes at infinity the sets Ak = {f ≥ k/n} are compact, and
we may find continuous functions fk such that
\[ 1_{A_k} \le n f_k \le 1_{A_{k-1}} \]
and f = f1 + · · · + fn . We will use the following lemma.
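Lemma 4. If E ⊂ F are compact subsets and f is continuous on X such that 1E ≤ f ≤ 1F
then µ(E) ≤ ℓ(f ) ≤ µ(F ).
Using the lemma and positivity of ℓ it follows that
\[ \frac{1}{n} \sum_{k=1}^{n} \mu(A_k) \le \ell(f) \le \frac{1}{n} \sum_{k=1}^{n} \mu(A_{k-1}). \]
The same two bounds hold for ∫ f dµ (integrate the pointwise bounds on n fk and sum over k),
consequently
\[ \Big| \ell(f) - \int f \, d\mu \Big| \le \frac{1}{n} \mu(A_0) \le \frac{1}{n} \mu(\operatorname{supp}(f)) = O\Big(\frac{1}{n}\Big), \]
and by sending n → ∞ we obtain the desired claim.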

We now show the lemma.
For the second estimate, recall that by definition µ is outer regular, thus µ(F ) =
inf{µ(U ) : F ⊂ U , U open}. Now for every U open containing F we have 0 ≤ f ≤ 1U ,
thus by definition of µ(U ) it follows that ℓ(f ) ≤ µ(U ). Consequently µ(F ) ≥ ℓ(f ).
For the first estimate, it suffices to show that for every ε > 0 we have µ(E) ≤ (1 + ε)ℓ(f ).
Let U = {f > 1/(1 + ε)}, which is an open set that contains E. Thus
µ(E) ≤ µ(U ) = sup{ℓ(g) : 0 ≤ g ≤ 1 , supp(g) ⊂ U }
≤ sup{ℓ(g) : 0 ≤ g ≤ (1 + ε)f }
≤ (1 + ε)ℓ(f )
We now show regularity properties for µ. (It is clear that µ is finite on compact sets using
Urysohn’s lemma.) Outer regularity of µ follows from the construction using the premeasure, and
it remains to show inner regularity. Let E be Borel; it suffices to show that
µ(E) ≤ sup{µ(K) : K ⊂ E compact}
and using outer regularity it suffices to do this for E open. So let E = U be an open set; we have
µ(U ) = sup{ℓ(f ) : 0 ≤ f continuous ≤ 1, supp(f ) ⊂ U }
≤ sup{µ(supp(f )) . . . }
≤ sup{µ(K) : K ⊂ U compact}
Step 2: We now reduce the desired result to the positive setting. This is done by writing
bounded linear functionals on Co (X) as linear combinations of positive linear functionals. To
see this, note that without loss of generality it suffices to consider bounded linear functionals
on Co,R (X), the real valued members of Co (X). For f ≥ 0 let ℓ+ be defined by
ℓ+ (f ) = sup{ℓ(g) : 0 ≤ g ≤ f, g ∈ Co,R (X)}
and let ℓ− = ℓ+ − ℓ; it is not hard to check that both ℓ+ and ℓ− extend to positive linear functionals
on Co,R (X), and ℓ = ℓ+ − ℓ− .
6.2 The Stone-Weierstrass approximation theorem

Let X be compact Hausdorff. Let P be a subspace of CR (X) such that P is also an algebra
inside CR (X), i.e. p1 p2 ∈ P if both p1 , p2 ∈ P .
Theorem 28. Assume that P separates points, i.e. if x1 ≠ x2 are elements of X then
there exists an element p ∈ P such that p(x1 ) ≠ p(x2 ). Then one of the following holds for the
closure P̄ of P in the uniform norm: P̄ = CR (X), or there is some x0 ∈ X such that
P̄ = {f ∈ CR (X) : f (x0 ) = 0}.
Note that there are also versions for complex valued functions, in which case the same
conclusion holds if we assume further that P is closed under conjugation: if p ∈ P then p̄ ∈ P .
There are also versions for C0 (X) and C0,R (X) where X is locally compact Hausdorff.
The original Weierstrass approximation theorem is the case where X is a compact interval; it
says that polynomials are dense in the continuous functions. There are applications in probability
(moment problems etc.).
Proof: Since the closure of P is also an algebra that separates points, without loss of generality we may
assume that P is closed. We divide the proof into two steps.
Step 1: We will show that P is a lattice, namely if f, g ∈ P then so are max(f, g) and
min(f, g).
Step 2: Using the lattice property of P , we will show that if f ∈ CR (X) is such that for
every x ≠ y in X there exists h ∈ P such that h(x) = f (x) and h(y) = f (y), then f ∈ P .
Combining these two steps, we prove the theorem as follows. Given any x ≠ y consider
the set {(h(x), h(y)) : h ∈ P }. Clearly this is a subalgebra of R2 (with coordinatewise
multiplication), and it can be neither {(0, 0)} nor the diagonal {(t, t)} since P separates points.
Thus it is either 0 × R or R × 0 or R2 . If the last case holds for every pair x ≠ y then
by Step 2 any f ∈ CR (X) must be in P as desired. Otherwise, say in the first case,
h(x) = 0 for all h ∈ P and so we must have {h(y), h ∈ P } = R for any other y. So by Step
2 again it follows that any function f such that f (x) = 0 must be in P as desired.
Proof of step 1: since max(f, g) and min(f, g) can be written as linear combinations of
f , g and |f − g|, namely max(f, g) = (f + g + |f − g|)/2 and min(f, g) = (f + g − |f − g|)/2,
it suffices to show that if f ∈ P then so is |f |. The
idea is to approximate |f | uniformly by polynomials in f . Indeed, f is bounded,
so without loss of generality assume |f | ≤ 1; it then suffices to show that |x| can be
approximated uniformly on [−1, 1] by polynomials (whose constant terms tend to 0 and may
therefore be dropped, so that the polynomials map P into itself). To see this use the Taylor expansion
\[ \sqrt{1 - t} = 1 - \tfrac{1}{2} t + \dots, \]
which converges uniformly on 0 ≤ t ≤ 1, then write
\[ |x| = \sqrt{x^2} = \sqrt{1 - (1 - x^2)} = 1 - \tfrac{1}{2}(1 - x^2) + \dots \]
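The uniform approximation of |x| by polynomials used in Step 1 can be checked numerically. The following is a minimal sketch, assuming the binomial series coefficients c_0 = 1, c_k = c_{k-1}(2k − 3)/(2k) for √(1 − t); the function names are illustrative and not part of the notes.

```python
def sqrt_series_coeffs(n_terms):
    """Coefficients c_0..c_n of the Taylor series of sqrt(1 - t) at t = 0."""
    coeffs = [1.0]
    for k in range(1, n_terms + 1):
        # Binomial series recursion: c_k = c_{k-1} * (2k - 3) / (2k).
        coeffs.append(coeffs[-1] * (2 * k - 3) / (2 * k))
    return coeffs


def abs_approx(x, coeffs):
    """Evaluate the partial sum at t = 1 - x^2, a polynomial in x approximating |x|."""
    t = 1.0 - x * x
    total, power = 0.0, 1.0
    for c in coeffs:
        total += c * power
        power *= t
    return total


def sup_error(n_terms, grid_size=401):
    """Largest error |p(x) - |x|| over a uniform grid on [-1, 1] (the grid contains x = 0)."""
    coeffs = sqrt_series_coeffs(n_terms)
    xs = [-1.0 + 2.0 * i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(abs_approx(x, coeffs) - abs(x)) for x in xs)


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, sup_error(n))
```

The error is largest at x = 0 (that is, t = 1), where the series converges most slowly, so the printed values decrease only slowly with the number of terms.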
Proof of step 2: Let ε > 0. It suffices to show that there exists g ∈ P such that
|g(x) − f (x)| ≤ ε uniformly over x ∈ X. Assume that for every x ∈ X there exists gx ∈ P
such that |gx (x) − f (x)| < ε and gx (y) ≤ f (y) + ε for every y ∈ X. Then the desired g can
be constructed as follows. By continuity for each x ∈ X there exists a neighborhood Ux ⊂ X
of x such that supUx |gx − f | < ε. By compactness of X one can refine the collection
Ux , x ∈ X to a finite subcover Ux1 , . . . , Uxm , and we simply let
g = max(gx1 , . . . , gxm )
which is in P by the lattice property, and clearly supX |g − f | ≤ ε.

Now to show the existence of such a gx we fix x and notice that, using the given hypothesis,
for each y ∈ X we may find hy ∈ P with hy (x) = f (x) and hy (y) = f (y), and hence two open
neighborhoods of x and y respectively,
denoted by Uy and Vy (note that x is fixed so we ignore the dependence on x),
such that
\[ \sup_{U_y} |h_y - f| < \varepsilon, \qquad \sup_{V_y} |h_y - f| < \varepsilon. \]
By compactness of X again we may refine the covering Vy , y ∈ X to a finite subcovering
Vy1 , . . . , Vyn , and we simply let gx = min(hy1 , . . . , hyn ), which has the desired property with
Ux := Uy1 ∩ · · · ∩ Uyn .
Chapter 7
Locally convex spaces, the hyperplane separation theorem, and the Krein-Milman theorem
Recall that C(X) is not a normed linear space when X is not compact. On the other hand
we can use seminorms on C(X): given any compact K ⊂ X let ‖·‖K be the sup norm on
K. This family of seminorms determines the topology of C(X), and together they provide enough
structure that various theorems (for NLS) remain valid here.
7.1 Two equivalent definitions of LCS

Let X be a linear space over R (results over C are similar). We consider topologies on X
such that addition and scalar multiplication are continuous operations (which are assumed
in the definition below), in which case we say that X is a topological linear space.
Now there are two equivalent definitions of local convexity for a topological linear space.
In one definition, we use seminorms. A seminorm ρ on X is essentially a norm except for the
nondegeneracy condition, namely it satisfies the triangle inequality ρ(x + y) ≤ ρ(x) + ρ(y),
nonnegativity ρ(x) ≥ 0, and homogeneity ρ(λx) = |λ|ρ(x); however it is possible that ρ(x) = 0
when x ≠ 0. A typical example is ρ(x) = |ℓ(x)| where ℓ is a linear functional on X. Now, a
topological linear space X is said to be locally convex if there exists a family of seminorms
ρα , α ∈ I, such that the topology on X is the minimal topology such that (addition and
scalar multiplication are continuous and) ρα are continuous.
Another definition uses convex sets. A subset A of X is called balanced if λx ∈ A
whenever x ∈ A and |λ| ≤ 1. We also say that A is absorbent if for every x ∈ X there is
some t > 0 such that tA contains x. Then a topological linear space X is said to be locally
convex if there exists a neighborhood base at 0 consisting of only convex balanced absorbent
(open) sets.
To see the equivalence of the two definitions, we start with the seminorm definition. Then a
neighborhood base at 0 can be taken to consist of all open sets of the form
U = {x ∈ X : ρα1 (x) < ε, . . . , ραm (x) < ε}
where ε > 0 and α1 , . . . , αm are elements of I (m ≥ 1 is arbitrary). It is clear that these
open sets are convex, balanced and absorbent.
Conversely, given a neighborhood base consisting of convex balanced absorbent sets we
can use the Minkowski gauge functional to construct the seminorms. Namely, if A is convex
balanced absorbent we let
ρA (x) = inf{t > 0 : x ∈ tA}
(It is not hard to check that ρA is a seminorm.)
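As an illustration of the gauge functional, the following sketch computes ρA by bisection for a set A given by a membership test. It assumes A is convex, balanced and absorbent (so that {t > 0 : x ∈ tA} is an upward ray) and that x ∈ upper·A for the chosen upper bound; the example set, the open unit ball of the ℓ1 norm in R2, and all names are illustrative assumptions.

```python
def minkowski_gauge(x, contains, upper=1e6, tol=1e-9):
    """Approximate inf{t > 0 : x in tA} by bisection.

    `contains(y)` tests membership of y in A. Assumes x lies in upper * A.
    """
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # x is in mid * A exactly when x / mid is in A.
        if contains([xi / mid for xi in x]):
            hi = mid
        else:
            lo = mid
    return hi


def in_l1_ball(y):
    """Open unit ball of the l1 norm in R^2: convex, balanced and absorbent."""
    return abs(y[0]) + abs(y[1]) < 1.0


if __name__ == "__main__":
    # For this choice of A the gauge should agree with the l1 norm |x_1| + |x_2|.
    print(minkowski_gauge([3.0, -4.0], in_l1_ball))
```

For this particular A the gauge recovers the ℓ1 norm, which gives a quick sanity check of the triangle inequality and homogeneity on sample vectors.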
7.1.1 Basic properties

A family of seminorms is said to be separated if whenever ρα (x) = 0 for every α ∈ I it must
follow that x = 0. This is actually equivalent to the Hausdorffness of the topology.
If (xi )i∈D is a net in X then (xi ) → x if and only if ρα (xi − x) → 0 (as a net in R) for all
α ∈ I.
If there are a lot of seminorms one expects that the topology is very rich; on the other
hand if there are fewer seminorms one expects a more well-structured topology. In particular
if there are only countably many seminorms involved then the topology is equivalent to the
topology of some pseudo-metric, i.e. a distance notion that resembles the metric notion
except for the fact that two distinct points could have distance 0. To see this, enumerate
the seminorms ρ1 , ρ2 , . . . , and let
\[ d(x, y) = \sum_{n=1}^{\infty} 2^{-n} \frac{\rho_n(x - y)}{1 + \rho_n(x - y)}. \]
If one assumes that the (countable) family is separated then the above pseudo-metric is
actually a metric. If this metric is furthermore complete then the given space is called a
Fréchet space.
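The metric built from countably many seminorms can be explored numerically. The sketch below is only an approximation under stated assumptions: it takes the space of continuous functions on R with the seminorms ρn (f ) = sup over |x| ≤ n of |f (x)|, replaces each supremum by a maximum over a finite grid, and truncates the series after finitely many terms (the discarded tail is at most 2^(−n_terms)). All names are illustrative.

```python
import math


def seminorm(n, f, samples=200):
    """Grid approximation of rho_n(f) = sup_{|x| <= n} |f(x)|."""
    return max(abs(f(-n + 2.0 * n * i / samples)) for i in range(samples + 1))


def frechet_distance(f, g, n_terms=30):
    """Truncated sum of 2^{-n} rho_n(f - g) / (1 + rho_n(f - g))."""
    def diff(x):
        return f(x) - g(x)

    total = 0.0
    for n in range(1, n_terms + 1):
        r = seminorm(n, diff)
        total += 2.0 ** (-n) * r / (1.0 + r)
    return total


if __name__ == "__main__":
    square = lambda x: x * x
    print(frechet_distance(math.sin, math.cos))
    # An unbounded difference still gives a distance below 1.
    print(frechet_distance(math.sin, square))
```

Since t/(1 + t) is increasing and subadditive on t ≥ 0, the triangle inequality for each seminorm carries over to d, and d is always smaller than 1 even when the individual seminorms are large.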
Examples: Recall that C(X) is a locally convex space with seminorms given by ρK (f ) =
supx∈K |f (x)| where K ⊂ X is compact. Other examples are
(i) the space of Schwartz functions on Rn : These are functions that are C ∞ and whose
derivatives decay faster than the reciprocal of any polynomial. One can use
ρα,β (f ) = supx |xα Dβ f (x)| where
α and β are nonnegative integer multi-indices.
(ii) if ℓα , α ∈ A, is a family of linear functionals on X then the minimal topology such that
ℓα are continuous and linear operations (addition/scalar multiplication) are continuous is
called the weak topology induced by this set of linear functionals. Examples of these (for
normed linear spaces) are the weak topology and the weak* topology. It can be shown that
if ℓ is a linear functional that is continuous with respect to this topology it must be a finite
linear combination of the given linear functionals.
(iii) Given any open set Ω ⊂ Rn the space D(Ω) consisting of compactly supported
infinitely smooth functions on Ω is also a locally convex space and actually complete (recall
that the version without infinite smoothness, i.e. Cc (Ω), is not complete).
7.1.2 Linear maps

Let X and Y be locally convex spaces with seminorms ρα and ρ′β where α ∈ I and β ∈ J
range over index sets. Then it can be shown that a linear map ℓ : X → Y is continuous if and only if
given any β ∈ J there exist α1 , . . . , αm ∈ I and C > 0 such that the following holds for all
x ∈ X:
ρ′β (ℓ(x)) ≤ C (ρα1 (x) + · · · + ραm (x))
(Note that this generalizes the usual equivalence of continuity and boundedness of linear
maps between normed linear spaces.)
7.2 The Hyperplane separation theorem

Let K be a convex subset of a topological linear space X over R and let y ∈ X \ K. Assume
that K contains at least one interior point.
Theorem 29. There exists a continuous linear functional ℓ on X such that supx∈K ℓ(x) ≤
ℓ(y). If furthermore K is closed then this inequality can be taken strict.
Proof. By translation invariance we may assume that 0 is an interior point of K. Consider
the Minkowski gauge functional
ρK (x) = inf{t > 0 : x ∈ tK}
Since 0 is an interior point of K it is clear that ρK (x) < ∞ for any x ∈ X. Furthermore ρK
is convex and positive homogeneous, and since y ∉ K we have ρK (y) ≥ 1 ≥ ρK (x) for every
x ∈ K. Let ℓ be defined on the one dimensional subspace spanned by y using ℓ(y) = 1. Then
ℓ(z) ≤ ρK (z) inside this subspace (only a one-sided bound, since ρK need not be symmetric),
so by Hahn Banach we may extend ℓ to all of X such
that ℓ(x) ≤ ρK (x) for all x. In particular ℓ ≤ 1 = ℓ(y) on K, and ℓ is continuous since
|ℓ(x)| ≤ max(ρK (x), ρK (−x)), which is at most ε on the neighborhood εK ∩ (−εK) of 0.
Now, if K is closed then one can see that ρK (y) > 1 ≥ supx∈K ρK (x); defining instead
ℓ(y) = ρK (y) we therefore obtain a strict inequality.
For a locally convex Hausdorff space over R, it follows from the above theorem that
for any two distinct points x ≠ y there is a continuous linear functional ℓ on X such that
ℓ(x) ≠ ℓ(y). To see this, by Hausdorffness and local convexity we can find a convex open set
A such that x ∈ A while y ∈ X \ A. Since A is convex with nonempty interior we can apply the hyperplane
separation theorem and get a continuous linear functional ℓ, not identically zero by its
construction, such that supz∈A ℓ(z) ≤ ℓ(y). A nonzero linear functional cannot attain its
supremum on an open set, so in particular ℓ(x) < ℓ(y).
7.3 The Krein-Milman theorem

An extreme point of a convex set K is a point x ∈ K such that if x is a convex combination
tx1 + (1 − t)x2 of x1 , x2 ∈ K with 0 < t < 1 then x1 = x2 = x.
Theorem 30 (Krein-Milman). Let X be locally convex Hausdorff and let K be a nonempty
compact convex subset of X. Then
(i) K has at least one extreme point.
(ii) K is the closure of the convex hull of its extreme points.
Remark: If X = Rn for some finite n then we don’t need to take the closure in (ii); in
fact Caratheodory showed that one can get any point of K from a convex combination of
at most n + 1 extreme points. For infinite dimensional spaces the closure is essential.
Proof:
We first generalize the notion of extreme points to extreme subsets of K. A subset A of
K is said to be extreme if it is nonempty and convex, and furthermore if any x ∈ A is a nontrivial convex
combination of two points x1 and x2 in K then we must have x1 , x2 ∈ A. It is clear that
if a family of extreme subsets of K has nonempty intersection then this intersection is also
extreme. In particular K is an extreme subset of itself.
Now, consider the collection 𝒜 of all closed extreme subsets of K, which is nonempty
since it contains K, and we may order this collection partially using reverse set inclusion, namely
A ≤ B if A ⊃ B.
(i) We first show that there exists a maximal element in 𝒜, which we will show to be a
point later. In order to show existence of the maximal element we plan to use Zorn’s lemma,
and what is needed here is the fact that any chain (i.e. a totally ordered subcollection of
𝒜) has an upper bound. The idea is to take the intersection of the elements in this chain,
and what needs to be shown is the fact that this intersection is not empty. Assume
towards a contradiction that this intersection is empty; it follows from compactness of K
that there exists a finite subcollection that has empty intersection, but this is a contradiction
because the intersection of a finite chain is simply its smallest element, which is nonempty.
Now, let A be a maximal element of 𝒜, i.e. a closed extreme subset of K that contains no
smaller closed extreme subset, and assume towards a contradiction that it
is not a point. Say x ≠ y are two elements of A; then there is a continuous linear functional ℓ
on X that separates x and y. We’ll show that the subset B of A where ℓ achieves its maximum over A
is a closed extreme subset of K, which would violate maximality of A. (B is nonempty since A is
compact, and clearly B ≠ A since ℓ is not constant on A.) Since
A is extreme in K it suffices to show that B is extreme in A. Now extremality of B in A
follows easily from linearity of ℓ.
(ii) We now show that the closure Ē of the convex hull E of the extreme points of K is K, i.e.
K = Ē. Clearly Ē ⊂ K. Suppose that z ∈ K \ Ē. Then by the hyperplane separation theorem there is a
continuous linear functional ℓ such that
supx∈Ē ℓ(x) < ℓ(z)
Again the set where ℓ attains its maximum over K is a proper subset of K and also an extreme subset, and
this set is also closed and disjoint from Ē. So by repeating the above argument one can
find an extreme point of K inside this set, which is therefore not inside Ē, a contradiction.
Examples: Let X be a compact Hausdorff space and consider CR (X) (we could do locally
compact Hausdorff too with CR,0 (X)), and let 𝒜 consist of all positive linear functionals on
CR (X). Let A be the subset of 𝒜 with ℓ(1) = 1; it is not hard to see that A is convex and its
extreme points are the point evaluation linear functionals ex f = f (x).
7.4 Inductive limit and weak solutions

Let Ω be a domain inside Rn . As mentioned before the space D(Ω) consists of C ∞ functions
whose supports are compact subsets of Ω. One way to construct a locally convex topology
on this space is to use an inductive limit of topologies. Let X1 , X2 , . . . , Xj , . . . be linear spaces
such that X1 ⊂ X2 ⊂ · · · ⊂ X = ∪j Xj . Each Xj has a locally convex topology that is
consistent with the topologies on the other Xm in the following sense: the topology of Xj is the
induced topology from Xj+1 . If that is the case one can construct a limiting topology on
X which should be thought of as limj→∞ Xj .
In our context we let K1 ⊂ K2 ⊂ . . . be compact subsets of Ω such that ∪j Kj = Ω. Then
let Xj be the space of C ∞ functions on Rn whose supports are subsets of Kj . Note that each Xj
is a complete metrizable locally convex space. We then define the topology on D(Ω) to be
the inductive limit of the topologies of the Xj .
The dual space of D(Ω) is called the space of distributions on Ω, denoted by
D′ (Ω). This space contains D(Ω) as a dense subspace. We may define the action of a linear
differential operator (a finite linear combination of the Dj , where j is a multiindex)
on any ℓ ∈ D′ (Ω) by defining for each φ ∈ D(Ω)
\[ (D^j \ell)(\varphi) = (-1)^{|j|} \, \ell(D^j \varphi) \]
and extending linearly. A weak solution to a PDE is a distributional solution in the above sense. If this distribution
arises from some sufficiently smooth function then we say it is a classical solution.
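As a one-dimensional illustration of the definition of the distributional derivative, take ℓ to be the distribution given by the Heaviside function, ℓ(φ) = ∫ over (0, ∞) of φ. Then (Dℓ)(φ) = −ℓ(φ′) should equal φ(0), i.e. Dℓ is the point evaluation at 0. The sketch below checks this numerically for one test function; the choice of the standard bump function supported in [−1, 1], the midpoint rule, and all names are assumptions made for the example.

```python
import math


def bump(x):
    """Test function exp(-1 / (1 - x^2)) on (-1, 1), extended by zero."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def bump_derivative(x):
    """Classical derivative of the bump function."""
    if abs(x) >= 1.0:
        return 0.0
    u = 1.0 - x * x
    return bump(x) * (-2.0 * x) / (u * u)


def heaviside_pairing(phi, n=20000):
    """l(phi) = integral of phi over (0, 1), by the midpoint rule.

    Integrating only up to 1 assumes that phi vanishes outside [-1, 1].
    """
    h = 1.0 / n
    return h * sum(phi((i + 0.5) * h) for i in range(n))


def derivative_of_heaviside(phi_prime):
    """(D l)(phi) = -l(phi'), following the definition with |j| = 1."""
    return -heaviside_pairing(phi_prime)


if __name__ == "__main__":
    print(derivative_of_heaviside(bump_derivative), bump(0.0))
```

The two printed numbers should agree up to the quadrature error, reflecting that the distributional derivative of the Heaviside function acts as evaluation at 0.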
Part II
Spectral analysis for linear operators
Chapter 8
Elementary spectral theory
Let B denote the space of all bounded linear operators on some given Banach space X
over C. The analysis here works for more general settings, say B could be a unital Banach
subalgebra of this space (i.e. an associative subalgebra that contains a unit and is a complete
subspace with respect to the operator norm).
8.1 Spectrum and resolvent set

We say M ∈ B is invertible if its inverse exists and is in B. Observe that if M is invertible
then so are all N with ‖N − M ‖ < 1/‖M −1 ‖. To see this, by writing N = M + (N − M ) =
M (I + M −1 (N − M )) it suffices to consider the case M = I. In this case (i.e. M = I and
‖N − I‖ < 1) use the geometric series to show that the inverse is exactly
I + (I − N ) + (I − N )2 + . . .
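The geometric (Neumann) series argument can be tried on a small matrix. The sketch below writes the series in the form of the sum over k ≥ 0 of (I − N)^k, which converges when ‖I − N‖ < 1; it uses plain nested lists so that no external library is assumed, and the sample matrix and function names are illustrative.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_inverse(N, terms=60):
    """Partial sum of sum_{k >= 0} (I - N)^k, assuming ||I - N|| < 1."""
    n = len(N)
    eye = identity(n)
    E = [[eye[i][j] - N[i][j] for j in range(n)] for i in range(n)]
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = mat_mul(power, E)  # power is now the next power of I - N
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


if __name__ == "__main__":
    N = [[1.0, 0.3], [-0.2, 0.9]]  # here I - N has maximum row sum 0.3 < 1
    print(mat_mul(N, neumann_inverse(N)))  # should be close to the identity
```

The number of terms needed grows as ‖I − N‖ approaches 1, which matches the role of the smallness condition in the observation above.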
Definition: The resolvent set of M in B consists of all complex numbers λ such that
λI − M is invertible in B. The resolvent set is denoted by ρ(M ) and the spectrum of M is
σ(M ) := C \ ρ(M ).
Properties:
1. ρ(M ) is open in C.
(This is a consequence of the above observation.)
2. The resolvent function (λI − M )−1 is an analytic function of λ on ρ(M ).
This is because the above observation also shows that the resolvent can be expanded as
a power series around each point λ in the resolvent set ρ(M ),
\[ (\lambda - h - M)^{-1} = \sum_{n=0}^{\infty} (\lambda - M)^{-(n+1)} h^n, \]
which has positive radius of convergence: it converges for |h| < 1/‖(λ − M )−1 ‖.
3. ‖(λ − M )−1 ‖ ≥ 1/dist(λ, σ(M )).
This is a corollary of the above analysis: the disk |h| < 1/‖(λ − M )−1 ‖ around λ lies in ρ(M ).
4. (Gelfand) The spectrum σ(M ) is a bounded nonempty closed subset of C. Furthermore,
the spectral radius of M , defined by |σ(M )| = max |λ| over λ ∈ σ(M ), satisfies
\[ |\sigma(M)| = \lim_{k \to \infty} \|M^k\|^{1/k}. \]
Closedness is clear; boundedness follows from the fact that if λ is large enough in modulus
then λ−1 M has small norm and so (1 − λ−1 M ) is invertible and so λ ∈ ρ(M ). To show that
σ(M ) is nonempty assume towards a contradiction that it is empty. Then consider the contour
integral along CR = {|ξ| = R}
\[ \int_{C_R} (\xi - M)^{-1} \, d\xi. \]
Clearly analyticity of the resolvent on all of C implies that this would be 0. On the other hand
we know that for R > ‖M ‖ the resolvent has the Laurent series expansion at ∞
\[ (\xi - M)^{-1} = \sum_{n=0}^{\infty} M^n \xi^{-(n+1)}, \]
therefore the contour integral picks out the residue term, i.e. the term with n = 0, and we
get 2πi I ≠ 0, a contradiction.
To show the identity for the spectral radius, by elementary complex analysis (the Laurent
series above converges for |ξ| > |σ(M )|) it follows
that |σ(M )| ≥ lim sup ‖M k ‖1/k . So it suffices to show that |σ(M )| ≤ ‖M k ‖1/k for all k ≥ 1.
To see this, observe that
\[ \Big\| \sum_{n \ge 0} M^n \xi^{-(n+1)} \Big\| \le \Big[ \sum_{j=0}^{k-1} \|M^j\| \, |\xi|^{-(j+1)} \Big] \Big[ \sum_{m \ge 0} \big( \|M^k\| \, |\xi|^{-k} \big)^m \Big]; \]
this follows from writing n = mk + j and using ‖M mk+j ‖ ≤ ‖M j ‖‖M k ‖m . It follows that if
|ξ| > ‖M k ‖1/k then the series converges absolutely and therefore ξ ∈ ρ(M ); this completes
the proof that |σ(M )| = lim ‖M k ‖1/k .
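The spectral radius identity is easy to observe on a matrix whose norm is much larger than its spectral radius. In the sketch below the sample matrix is upper triangular with both eigenvalues equal to 0.5, and the norm is the maximum absolute row sum (the operator norm induced by the sup norm on vectors); the matrix and the function names are illustrative choices.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def row_sum_norm(A):
    """Operator norm induced by the sup norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def gelfand_root(M, k):
    """Return ||M^k||^(1/k) for an integer k >= 1."""
    P = M
    for _ in range(k - 1):
        P = mat_mul(P, M)
    return row_sum_norm(P) ** (1.0 / k)


if __name__ == "__main__":
    M = [[0.5, 10.0], [0.0, 0.5]]  # spectrum {0.5}, but ||M|| = 10.5
    for k in (1, 5, 50, 200):
        print(k, gelfand_root(M, k))
```

The printed values start at ‖M ‖ and decrease towards the spectral radius 0.5 as k grows, always staying above it, in line with |σ(M )| ≤ ‖M k ‖1/k .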
8.2 Functional calculus and spectral mapping

We can clearly define a polynomial of a given element M ∈ B; in fact via power series this
works for analytic functions provided that the spectral radius of M is smaller than the radius
of convergence at 0. More generally, we now demonstrate that it is enough for the given function f
to be analytic in some domain Ω that contains σ(M ). Indeed, let C
be a contour in the intersection of ρ(M ) and the domain of analyticity of f that winds once
around every point in σ(M ) but zero times around every point of Ωc . Then define
\[ f(M) = \frac{1}{2\pi i} \int_C (\xi - M)^{-1} f(\xi) \, d\xi; \]
note that by the Cauchy theorem this is independent of the choice of the contour. It turns
out that this is consistent with the usual polynomial case and furthermore
σ(f (M )) = f (σ(M ))
(also known as the spectral mapping property). We also have the resolvent identity
(ξ2 − M )−1 − (ξ1 − M )−1 = (ξ1 − ξ2 )(ξ1 − M )−1 (ξ2 − M )−1
which is useful to show that the functional calculus, which maps the algebra of analytic functions
on open sets containing σ(M ) into B, is a homomorphism.
Also, if g is analytic on an open set containing f (σ(M )) and h = g ◦ f then h(M ) =
g(f (M )).
Spectral Projections: Assume the spectrum σ(M ) can be decomposed into a union
of n disjoint closed components σ1 ∪ · · · ∪ σn .
Let Cj be a contour in ρ(M ) that winds once around each point of σj but not around the other
components, and let
\[ P_j = \frac{1}{2\pi i} \int_{C_j} (\xi - M)^{-1} \, d\xi. \]
Then the Pj are disjoint projections, i.e. Pj2 = Pj and Pj Pk = 0 if j ≠ k, and
\[ \sum_j P_j = I, \]
and if σj ≠ ∅ then Pj ≠ 0.
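A spectral projection can be computed numerically for a small matrix by discretizing the contour integral, here with the normalization 1/(2πi) in front of the integral of the resolvent. The sketch below is a hedged illustration: the 2 × 2 sample matrix with eigenvalues 1 and 3, the circular contour around the eigenvalue 1, the trapezoid rule, and all names are assumptions made for the example.

```python
import cmath
import math


def resolvent_2x2(M, xi):
    """Return (xi - M)^{-1} for a 2x2 matrix M, via the explicit inverse formula."""
    a, b = xi - M[0][0], -M[0][1]
    c, d = -M[1][0], xi - M[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def spectral_projection(M, center, radius, n=400):
    """Trapezoid rule for (1 / (2 pi i)) * integral of (xi - M)^{-1} over a circle."""
    P = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        xi = center + radius * cmath.exp(1j * theta)
        dxi = 1j * radius * cmath.exp(1j * theta) * (2.0 * math.pi / n)
        R = resolvent_2x2(M, xi)
        for i in range(2):
            for j in range(2):
                P[i][j] += R[i][j] * dxi / (2j * math.pi)
    return P


if __name__ == "__main__":
    M = [[1.0, 1.0], [0.0, 3.0]]
    # The circle of radius 0.5 around 1 encloses the eigenvalue 1 but not 3.
    print(spectral_projection(M, center=1.0, radius=0.5))
```

For this matrix the residue of the resolvent at ξ = 1 can be computed by hand and equals the matrix with rows (1, −1/2) and (0, 0), which is idempotent; the numerical result should match it closely.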
8.3 Examples of operators and their spectra
1. Shifts
Let X = ℓ2 consisting of x = (a0 , a1 , . . . ) such that Σj |aj |2 < ∞. The right and left shifts
R and L are
Rx = (0, a0 , a1 , . . . ) , Lx = (a1 , a2 , . . . )
Now LR = I but RL ≠ I so these bounded linear maps are not invertible. The spectrum of
R and of L consists of all points in the closed unit disk {λ ∈ C : |λ| ≤ 1}. Now R′ = L and L′ = R,
i.e. they are adjoints of each other.
The idea is that the spectral radius of L is 1, so the spectrum is contained inside the disk.
2. The Fourier transform F f (ξ) = ∫ f (x)e−2πixξ dx. Note that F 4 = I, therefore the
spectrum of F is a subset of {x ∈ C : x4 = 1}, which consists of ±1, ±√−1. Let Hn denote
the orthogonal polynomials with respect to e^{−x^2} dx,
\[ \int H_m H_n e^{-x^2} \, dx = \sqrt{\pi} \, 2^n n! \, \delta_{mn}; \]
then it can be shown that, up to a rescaling of the variable that depends on the normalization of F ,
\[ F[e^{-x^2/2} H_n(x)] = (-\sqrt{-1})^n e^{-\xi^2/2} H_n(\xi). \]
This follows from
\[ \exp(-x^2/2 + 2xt - t^2) = \sum_{n \ge 0} e^{-x^2/2} H_n(x) \, t^n / n!. \]
3. Volterra integral operator Let X = C([0, 1]) and let
\[ T f(x) = \int_0^x K(x, t) f(t) \, dt, \]
also known as the Volterra integral operator; here K is continuous. The spectrum of T
consists of only one point λ = 0. This follows by computing T n f ; for instance if K ≡ 1 then
\[ T^n f(x) = \frac{1}{(n-1)!} \int_0^x (x - t)^{n-1} f(t) \, dt, \]
so that ‖T n f ‖ ≤ ‖f ‖/n! in the sup norm, and one can use
the spectral radius identity.
4. Diagonal multiplication Let X = `p (Z+ ) and M acts diagonally (M x)n = λn xn ,
then the spectrum of M is the closure of {λn }.
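For the Volterra operator of example 3 with K ≡ 1, the decay of ‖T n ‖ can be computed exactly on polynomials. The sketch below applies T to the constant function 1 repeatedly, representing polynomials by their coefficient lists; it uses the fact that T is a positive operator when K ≡ 1, so that ‖T n ‖ equals the sup norm of T n 1. The function names are illustrative.

```python
def volterra(coeffs):
    """Apply T f(x) = integral_0^x f(t) dt to the polynomial sum_k coeffs[k] x^k."""
    return [0.0] + [c / (k + 1) for k, c in enumerate(coeffs)]


def sup_norm_on_unit_interval(coeffs):
    """Sup norm on [0, 1] of a polynomial with nonnegative coefficients.

    Such a polynomial is increasing on [0, 1], so the sup is its value at x = 1.
    """
    return sum(coeffs)


def norm_of_power(n):
    """Sup norm of T^n applied to the constant function 1, i.e. of x^n / n!."""
    f = [1.0]
    for _ in range(n):
        f = volterra(f)
    return sup_norm_on_unit_interval(f)


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        value = norm_of_power(n)
        print(n, value, value ** (1.0 / n))
```

The last column, ‖T n ‖1/n , tends to 0, which by the spectral radius identity is consistent with σ(T ) = {0}.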
8.4 Adjoint operators and spectrum

For convenience of notation let ⟨x, ℓ⟩ := ℓ(x) for every ℓ ∈ X ∗ and x ∈ X.
Let T : X → X be a bounded operator on a Banach space X. Then its adjoint operator
T ∗ is defined to be the bounded operator from X ∗ to X ∗ such that for every x ∈ X and
ℓ ∈ X ∗ it holds that
⟨x, T ∗ ℓ⟩ = ⟨T x, ℓ⟩
It is not hard to see that the adjoint operator exists (and is linear and bounded), and
‖T ∗ ‖ ≤ ‖T ‖.
Note that this definition differs from the adjoint notion of an operator on a Hilbert space by
a conjugate. Thus some texts use T ′ for the Banach space adjoint of T .
Theorem 31 (Phillips). If T : X → X is bounded linear on a complex Banach space X then
σ(T ) = σ(T ∗ ) and for every λ ∈ ρ(T ) it holds that ([T − λ]−1 )∗ = (T ∗ − λ)−1 .
As a corollary, for a Hilbert space (where the adjoint involves a conjugate) σ(T ∗ ) is the
complex conjugate of σ(T ).
We will prove the first part; the second part follows as a byproduct.
Step 1: ρ(T ) ⊂ ρ(T ∗ ). Take λ ∈ ρ(T ). It suffices to show the following lemma (then
apply it to T − λ)
Lemma 5. Let T : X → X be bounded linear and boundedly invertible. Then (T ∗ )−1 exists
and (T ∗ )−1 = (T −1 )∗ .
Proof. It suffices to show, using the open mapping theorem, that T ∗ : X ∗ → X ∗ is bounded,
injective and onto.
Injectivity: If T ∗ ℓ = 0 for some ℓ ∈ X ∗ then we will show that ℓ = 0. Indeed,
⟨T x, ℓ⟩ = ⟨x, T ∗ ℓ⟩ = 0
for every x, so using the fact that T is onto we obtain ℓ(X) = 0, so ℓ = 0.
Onto: If ℓ ∈ X ∗ then we want some ℓ′ ∈ X ∗ such that T ∗ ℓ′ = ℓ. We equivalently transform
this equation into
⟨x, T ∗ ℓ′ ⟩ = ⟨x, ℓ⟩ ∀ x ∈ X
or equivalently ⟨T x, ℓ′ ⟩ = ⟨x, ℓ⟩ for all x ∈ X. Since T is invertible this is the same as
⟨y, ℓ′ ⟩ = ⟨T −1 y, ℓ⟩ for every y ∈ X, which is the same as ℓ′ = (T −1 )∗ ℓ.
Step 2: Here we show that ρ(T ∗ ) ⊂ ρ(T ). We will use the following Lemma
Lemma 6. Let T : X → X be bounded linear on a complex Banach space. Then
(i) If range(T ∗ ) is dense in X ∗ (in the weak* topology) then T is injective.
(ii) If T ∗ is injective then range(T) is dense.
We first use the lemma to complete this step. If λ ∈ ρ(T ∗ ) we will show λ ∈ ρ(T ). By
the lemma (applied to T − λ) it follows that T − λ is injective and has dense range. It remains to show that
(T − λ)−1 is bounded on its range (then we could invoke the BLT theorem). To see this, for
every y ∈ range(T − λ), i.e. y = (T − λ)z for z ∈ X, and ℓ ∈ X ∗ we have
⟨y, ℓ⟩ = ⟨z, (T − λ)∗ ℓ⟩
and so, replacing ℓ by [T ∗ − λ]−1 ℓ,
⟨z, ℓ⟩ = ⟨y, [T ∗ − λ]−1 ℓ⟩
| ⟨z, ℓ⟩ | ≤ ‖y‖‖ℓ‖‖[T ∗ − λ]−1 ‖
Since this is true for all ℓ we obtain ‖z‖ ≤ ‖[T ∗ − λ]−1 ‖‖y‖, as desired.
It remains to show the lemma. For (i) note that if T x = 0 then ⟨T x, ℓ⟩ = 0 for all ℓ ∈ X ∗
and therefore ⟨x, T ∗ ℓ⟩ = 0, but the denseness of the range of T ∗ implies that ⟨x, ℓ⟩ = 0 for
all ℓ ∈ X ∗ , and consequently x = 0. For (ii), assume that the closure of range(T ) is a strict subspace of
X; then we could find, using Hahn Banach, a nonzero bounded linear functional ℓ such that
ℓ(T x) = 0 for all x ∈ X. Then ⟨x, T ∗ ℓ⟩ = 0 for all x, therefore T ∗ ℓ = 0, but injectivity of T ∗
implies that ℓ = 0, a contradiction.
Chapter 9
Spectral theory for compact operators on Banach spaces
Recall that a subset S of a metric space X is precompact if its closure is compact, or
equivalently every sequence contains a Cauchy subsequence. Another characterization is
that S is totally bounded, namely for any ε > 0 one can cover S by finitely many ε-balls.
If X is a normed linear space we can add/multiply, and we have the following basic
properties:
(i) If S is precompact then so is αS for any α scalar.
(ii) If S1 and S2 are precompact then S1 +S2 := {s1 +s2 , s1 ∈ S1 , s2 ∈ S2 } is precompact.
(iii) If S is precompact then so is the convex hull of S.
(iv) Let T : X → Y where X and Y are say normed linear spaces. If S ⊂ X is precompact
then so is T S ⊂ Y .
9.1 Compact operators

Let T : X → Y be a bounded linear map between Banach spaces X and Y . Let B1 be the unit
ball in X. We say that T is compact if T B1 is a precompact subset of Y .
Properties:
(i) A finite rank operator (namely the range of the operator is finite dimensional) is
compact.
(ii) If T1 : X → Y and T2 : X → Y are compact operators then α1 T1 + α2 T2 is also
compact for any scalars α1 , α2 .
(iii) If T : X → Y is compact and M : U → X and N : Y → V are bounded linear maps
between Banach spaces then N T M : U → V is compact.
(iv) If T : X → Y is compact then it maps a weakly convergent sequence in X to a
strongly convergent sequence in Y . Such an operator is called completely continuous, so we
can say that compactness implies complete continuity. (If T : X → X is completely continuous
and X is reflexive then T is compact; this is left as an exercise.)
To see property (iv), let xn converge weakly to x and first note that T xn converges weakly to T x. To see this, let ℓ ∈ Y ∗ ,
then ℓ(T xn ) = (T ∗ ℓ)xn converges to (T ∗ ℓ)x which is the same as ℓ(T x). It then follows that
any strongly convergent subsequence of T xn has to converge to T x. Now since xn converges
weakly to x in X it follows that (xn ) is a family of bounded linear functionals on X ∗ that is
bounded pointwise; consequently by the principle of uniform boundedness supn ‖xn ‖ < ∞,
thus (T xn ) is precompact, so any subsequence contains a Cauchy subsequence; thus by a
routine argument it follows that T xn is convergent to T x.
(v) If Tn is a sequence of compact operators from X to Y and ‖Tn − T ‖ → 0 as n → ∞
for some T : X → Y bounded linear, then T is compact.
To see property (v), note that for any ε > 0 one can choose N large such that ‖TN − T ‖ <
ε/2. Now, we can use finitely many ε/2-balls to cover TN B1 . The ε-balls with the same
centers will then cover T B1 . (Recall that B1 is the unit ball in X.)
As a corollary, we have
Corollary 3. If T : X → Y is the limit of a sequence of finite rank operators in the norm
topology then T is compact.
The converse direction of this corollary is not true in general (for Banach spaces X and
Y ), the first construction of counter examples is due to P. Enflo (’73). However, if Y is a
Hilbert space then the converse is true.
Theorem 32. Any compact operator T : X → Y where X is Banach and Y is Hilbert can
be approximated by a sequence of finite rank operators.
Proof. We sketch the main ideas of the proof. For every n the set T B1 is covered by
\[ \bigcup_{k=1}^{M_n} B_Y(y_k, 1/n). \]
We may assume that Mn is an increasing sequence. Let Pn be the orthogonal projection
onto the span of y1 , . . . , yMn , which is clearly a finite rank operator, thus Pn T is also finite
rank. It remains to show that ‖T − Pn T ‖ = O(1/n). We observe that ‖Pn y − yk ‖ ≤ ‖y − yk ‖
for every 1 ≤ k ≤ Mn and every y ∈ Y , since Pn yk = yk and projections are contractions. It follows that
for every x ∈ B1
‖Pn T x − T x‖ ≤ ‖Pn T x − yk ‖ + ‖T x − yk ‖ ≤ 2‖T x − yk ‖
and clearly given any x ∈ B1 there is a k such that ‖T x − yk ‖ ≤ 1/n, therefore ‖Pn T x − T x‖ ≤
2/n for every x ∈ B1 , thus ‖Pn T − T ‖ ≤ 2/n as desired.
Theorem 33 (Schauder). Let T : X → Y be a bounded linear operator between Banach
spaces X and Y . Then T is compact if and only if T ∗ : Y ∗ → X ∗ is compact.
Proof. It suffices to show the forward direction, namely if T is compact then T ∗ is also
compact. For the other direction, applying the forward direction to T ∗ it follows that T ∗∗ is compact
from X ∗∗ to Y ∗∗ , and by restricting T ∗∗ to X we obtain T , therefore T is also compact.
Now, assume that T is compact. Given any sequence ℓn ∈ Y ∗ with ‖ℓn ‖ ≤ 1 we will
show that (T ∗ ℓn ) has a Cauchy subsequence T ∗ ℓnj , in other words given any ε > 0 we have
\[ \sup_{\|x\| \le 1} |\ell_{n_j}(T x) - \ell_{n_k}(T x)| \le \varepsilon \]
if j and k are large enough.
Let B be the unit ball in X and let K be the closure of T B, which is a compact subset of Y . Then the
above estimate is a consequence of supy∈K |ℓnj y − ℓnk y| ≤ ε. We may view ℓn as a sequence
of continuous functions on K, which are uniformly bounded pointwise and equicontinuous
on K:
supn |ℓn y| ≤ ‖y‖
supn |ℓn y1 − ℓn y2 | ≤ ‖y1 − y2 ‖
Thus by the Arzela Ascoli theorem the sequence ℓn has a uniformly convergent subsequence
on K, as desired.
9.2 Compactness of integral operators

We now discuss compactness of the integral operator T
\[ T f(y) = \int_U K(x, y) f(x)\, d\mu(x), \qquad y \in V, \]
where U and V are, say, compact metric spaces, viewing T as an operator on different function
spaces.
Viewing T as a map from L^2(U, dµ) to L^2(V, dν) for some measure ν (note that both
spaces are separable Hilbert spaces), we know that one sufficient condition that guarantees
boundedness of T is ∫_{U×V} |K|^2 dµ dν < ∞. It turns out that this also implies compactness
of T. To see this, for each x we expand K(x, y) in a (countable) orthonormal basis
of L^2(V, dν), which we may denote by φ_1, φ_2, . . . :
\[ K(x, y) = \sum_{j=1}^{\infty} K_j(x)\,\varphi_j(y). \]
Note that for almost every x ∈ U the function K(x, ·) is in L^2(V, dν), and so
\[ \int_U \int_V |K(x, y)|^2\, d\nu(y)\, d\mu(x) = \sum_j \int_U |K_j(x)|^2\, d\mu(x). \]
Thus we may approximate T with the finite rank operators
\[ T_n f(y) = \sum_{j \le n} \Big( \int_U K_j(x) f(x)\, d\mu(x) \Big) \varphi_j(y), \]
for which the Cauchy-Schwarz inequality gives ‖T − T_n‖^2 ≤ Σ_{j>n} ∫_U |K_j|^2 dµ → 0,
so T is compact.
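As a hedged illustration of how fast the finite rank approximations can converge, the sketch
below takes U = V = [0, 1] with Lebesgue measure and the kernel K(x, y) = min(x, y). This
kernel and the names `eigenvalue` and `tail` are assumptions made for the example and do not
come from the notes. For this kernel the functions φ_j(y) = √2 sin((j − 1/2)πy) are
eigenfunctions with eigenvalues λ_j = ((j − 1/2)π)^{-2}, so K_j = λ_j φ_j and the tail
Σ_{j>n} ∫ |K_j|^2 dµ equals Σ_{j>n} λ_j^2.

```python
import math

def eigenvalue(j):
    """Eigenvalue lambda_j of the example kernel min(x, y) on [0, 1]."""
    return 1.0 / (((j - 0.5) * math.pi) ** 2)

# Squared Hilbert-Schmidt norm: the double integral of min(x, y)^2 over [0, 1]^2.
HS_NORM_SQ = 1.0 / 6.0

def tail(n):
    """Tail sum over j > n of lambda_j^2, computed as total minus the first n terms.

    This is the quantity that bounds ||T - T_n||^2 in the argument above.
    """
    head = sum(eigenvalue(j) ** 2 for j in range(1, n + 1))
    return HS_NORM_SQ - head

for n in (0, 1, 2, 5, 10):
    print(n, tail(n))
```

In this example a single term already captures most of the Hilbert-Schmidt norm, which is
typical for smooth kernels but is not something the general argument relies on.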
9.3 Spectral properties of compact operators

9.3.1 Riesz’s theorem
One of the main results about compact operators is the following fact: if T : X → X is
a compact operator on a Banach space X and 1 − T is injective, then 1 − T is boundedly
invertible. Note that it is possible for ‖T‖ to be large, so the basic theory of small
perturbations of 1 does not apply here. The key to showing this fact is the following
theorem of Riesz.
Theorem 34 (Riesz). Let T : X → X be a compact operator on a Banach space X. Then
the range of 1 − T is a closed subspace of X and furthermore dim(ker(1 − T)) and
codim(range(1 − T)) are finite and equal to each other.
Part of the proof of this theorem is the following lemma.
Lemma 7. Let T : X → X be a compact operator on a Banach space X. Then
(i) ker(1 − T ) is finite dimensional.
(ii) There exists some k ≥ 1 such that ker((1 − T)^m) = ker((1 − T)^{m+1}) for every m ≥ k.
(iii) range(1 − T ) is a closed subspace of X.
Proof of Lemma 7. (i) If y ∈ ker(1 − T) then T y = y. Assume towards a contradiction that
ker(1 − T) is infinite dimensional. Then by another result of Riesz we could find an infinite
sequence (y_n) in this kernel such that ‖y_n‖ = 1 and ‖y_n − y_m‖ ≥ 1/2 for all m ≠ n. Since
T y_n = y_n, this implies that the set {T y_n, n ≥ 1} is not precompact, a contradiction.
(ii) Note that ker([1 − T]^k) ⊂ ker([1 − T]^{k+1}) for all k. Now, if ker([1 − T]^k) =
ker([1 − T]^{k+1}) then ker([1 − T]^m) = ker([1 − T]^{m+1}) for all m ≥ k. Assume towards a
contradiction that we have the strict inclusion ker([1 − T]^k) ⊊ ker([1 − T]^{k+1}) for all
k ≥ 1. Since the subspaces ker([1 − T]^k) are closed, by Riesz’s lemma we may find
x_n ∈ ker([1 − T]^n) such that ‖x_n‖ = 1 and dist(x_n, ker([1 − T]^{n−1})) ≥ 1/2. We will show
that ‖T x_n − T x_m‖ ≥ 1/2 for all m ≠ n, which will contradict compactness of T. Now, without
loss of generality assume that m < n, then
\[ T x_n - T x_m = x_n - (1 - T)x_n - T x_m \]
and (1 − T)x_n ∈ ker([1 − T]^{n−1}) and T x_m ∈ ker([1 − T]^m) ⊂ ker([1 − T]^{n−1}) (since T
commutes with (1 − T)^m). Therefore by choice of x_n we obtain
\[ \|T x_n - T x_m\| \ge \operatorname{dist}\big(x_n, \ker([1 - T]^{n-1})\big) \ge 1/2. \]
(iii) Assume that y_k = (1 − T)x_k is a sequence in range(1 − T) that converges to some
y ∈ X; we will show that for some x ∈ X we have y = (1 − T)x. Certainly if (x_k) has a
convergent subsequence we could simply take x to be the corresponding limit. Now,
x_k = y_k + T x_k so it suffices to obtain convergence of some subsequence of T x_k, and using
compactness of T this would follow if (x_k) is uniformly bounded. Unfortunately it is possible
for (x_k) to be unbounded, but we could correct this by modifying x_k by an appropriate term in
ker(1 − T) to make it bounded. Note that this correction does not change y_k and hence does
not change the goal of this part. Let
\[ d_k = \operatorname{dist}(x_k, \ker(1 - T)). \]
By modifying x_k by an amount inside ker(1 − T) we may assume that d_k ≤ ‖x_k‖ ≤ 2d_k.
Thus it suffices to show that
\[ \sup_k d_k < \infty. \]
Assume towards a contradiction that some subsequence of d_k converges to ∞. Without loss
of generality we may assume lim d_k = ∞. We then have
\[ \frac{y_k}{d_k} = (1 - T)\Big(\frac{x_k}{d_k}\Big). \]
Now y_k/d_k → 0 and x_k/d_k is uniformly bounded, so using compactness of T it follows that
some subsequence of T(x_k/d_k) converges, which in turn implies that some subsequence of
x_k/d_k converges to some x ∈ X. We obtain 0 = (1 − T)x so x ∈ ker(1 − T); on the other
hand it is clear that dist(x_k/d_k, ker(1 − T)) = 1 for every k, a contradiction.
We now prove Riesz’s theorem using the lemma. It remains to show that ker(1 − T)
has the same dimension as the codimension of the range of 1 − T. We first reduce the proof
to the case when ker(1 − T) is trivial. Let k be the index given by part (ii) of the lemma and
let Y = ker[(1 − T)^{k+1}], a closed subspace of X (note that Y is finite dimensional, by part (i)
applied to the compact operator 1 − (1 − T)^{k+1}). Since (1 − T)Y ⊂ ker[(1 − T)^k] = Y by
choice of k, it follows that T Y ⊂ Y, thus T induces an operator T_Z on Z := X/Y, which is a
Banach space with the induced norm. It follows immediately that T_Z is compact on Z, and
1 − T_Z is injective on Z. Now, the dimension of ker(1 − T) is the same as the dimension
of the kernel of 1 − T viewed as an operator on Y, which by standard linear algebra is the
same as the codimension of the range of 1 − T viewed as an operator on Y. Thus it remains
to show that the codimension of the range of 1 − T_Z in Z is 0. In other words, we have reduced
the proof to the case of a compact operator T_Z on the Banach space Z for which 1 − T_Z is
injective.
Now, if (1 − T_Z)Z is a strict subspace of Z it follows from injectivity that (1 − T_Z)^n Z is a
strictly decreasing sequence of closed subspaces of Z, and again we may find
z_n ∈ (1 − T_Z)^n Z such that ‖z_n‖ = 1 and dist(z_n, (1 − T_Z)^{n+1} Z) ≥ 1/2. Then for n < m
we have
\[ \|T_Z z_n - T_Z z_m\| = \|z_n - (1 - T_Z)z_n - T_Z z_m\| \ge \operatorname{dist}\big(z_n, (1 - T_Z)^{n+1} Z\big) \ge 1/2, \]
violating the compactness of T_Z. It follows that the codimension of the range of 1 − T_Z is 0
as desired.
9.3.2 Spectral properties

Theorem 35. Let T be a compact operator on a Banach space X. Then
(i) its spectrum σ(T) consists of at most countably many elements, and every nonzero element
of σ(T) is an eigenvalue with finite dimensional eigenspace.
(ii) 0 is the only possible accumulation point of σ(T), if such an accumulation point exists,
and 0 belongs to σ(T) if the dimension of X is infinite.
Note that if X is a (separable) Hilbert space then we could furthermore diagonalize T
using the eigenvectors.
We now prove Theorem 35. First, given any λ ∈ σ(T) such that λ ≠ 0 we show that it is
an eigenvalue with finite dimensional eigenspace. Clearly (1/λ)T is compact. Thus, if
ker(1 − (1/λ)T) is trivial then by Riesz’s theorem 1 − (1/λ)T is also onto, therefore by the
open mapping theorem 1 − (1/λ)T is boundedly invertible and therefore λ ∉ σ(T). Thus
ker(1 − (1/λ)T) is nontrivial and also finite dimensional by Riesz’s theorem, and so λ is an
eigenvalue with finite dimensional eigenspace.
Now, we will show that 0 is the only possible accumulation point, which also implies that
σ(T) is at most countable. It suffices to show that given any ε > 0 there is some finite
C = C(T, ε) such that at most C(T, ε) elements λ of σ(T) satisfy |λ| > ε. Let λ_1, . . . , λ_m
be a finite collection of distinct elements of σ(T) with |λ_j| > ε; it suffices to show that
m = O_{T,ε}(1).
The idea is to let Y_n be the eigenspace associated with λ_n and observe that for any n
it holds that Y_n ∩ span{Y_k, k < n} is trivial. Let Y_{<n} = span{Y_k, k < n}, which is now
a strictly increasing sequence of finite dimensional subspaces. Then by Riesz’s lemma one
could choose y_n ∈ Y_n such that ‖y_n‖ = 1 and dist(y_n, Y_{<n}) ≥ 1/2. We will show that
\[ \|T y_n - T y_m\| \ge \epsilon/2 \]
for any n ≠ m. Indeed, without loss of generality assume n > m, then using the fact that
y_n ∈ Y_n and the fact that T leaves Y_m ⊂ Y_{<n} invariant we have
\[ \|T y_n - T y_m\| = \|\lambda_n y_n - T y_m\| \ge |\lambda_n| \operatorname{dist}(y_n, Y_{<n}) \ge |\lambda_n|/2 \ge \epsilon/2. \]
Consequently, m is bounded above by the maximum number of points in T B, where B is
the unit ball of X, that are ε/2 apart. Since T B is precompact this number is finite and
depends only on T and ε (and certainly X), and is independent of the sequence (λ_k).
Chapter 10
Spectral theorems for bounded self-adjoint operators on a Hilbert space
Let H be a Hilbert space. For a bounded operator A : H → H its Hilbert space adjoint is
the operator A* : H → H such that ⟨Ax, y⟩ = ⟨x, A*y⟩ for all x, y ∈ H. We say that A is
bounded self adjoint if A = A*.
In this chapter we discuss several results about the spectrum of a bounded self adjoint
operator on a Hilbert space. We emphasize that in this chapter A is bounded; there is also
a notion of unbounded self adjoint operator which we will discuss in subsequent chapters.
10.1 Diagonalization form

The first result says that A could be diagonalized using some change of basis.
Theorem 36. Let A : H → H be a bounded self-adjoint operator on a Hilbert space H.
Then there exist some L^2(X, µ) and an isometric isomorphism U : L^2(X, µ) → H such that
for some bounded function M on (X, µ) and every f ∈ L^2(X, µ) it holds that
\[ (U^{-1} A U) f(x) = M(x) f(x), \qquad x \in X. \]
Proof. We note that if H is the direct sum of subspaces H_1 and H_2 such that H_1 and H_2
are invariant under A then it suffices to prove the theorem for the restriction of A to each
subspace. The same applies to direct sums indexed by larger index sets.
Now it is not hard to see that H could be written as an orthogonal direct sum of subspaces
of the form span(A^n ξ, n ≥ 0) (closed linear span) where ξ ∈ H. (The proof uses Zorn’s
lemma.) Note that these subspaces are invariant under A, therefore it suffices to show the
theorem when H = span(A^n ξ, n ≥ 0) for some fixed ξ.
Now, recall from spectral calculus for bounded operators on a Banach space that if f is
analytic on a domain containing σ(A) then we could define f(A) and furthermore
σ(f(A)) = f(σ(A)). For polynomials we could do this directly using factorization into linear
factors.
In the case of a bounded self adjoint operator we will show below that σ(A) ⊂ R. (Note
that this fact also holds in the unbounded case, but we will not discuss that in this section.)
Lemma 8. Let A be bounded self adjoint on a complex Hilbert space H. Then σ(A) ⊂ R.
To see this lemma, we will show that if λ ∈ C has nonzero imaginary part then λ ∈ ρ(A).
To do this, we will show that
\[ |\langle x, (\lambda - A)x \rangle| \ge c\|x\|^2 \]
for some c > 0 depending on λ. This would imply that λ − A is invertible using an application
of the Lax-Milgram theorem: consider the sesquilinear form B(x, y) = ⟨x, (λ − A)y⟩, which is
bounded and coercive once we have proved the above estimate, thus given any z we could find
y such that ⟨x, z⟩ = B(x, y) for all x ∈ H, which implies that z = (λ − A)y, thus λ − A is
bijective on H and so is boundedly invertible, and so λ ∈ ρ(A) as desired.
To show the above estimate, simply write λ = a + ib where a, b ∈ R and b ≠ 0, then using
the self-adjoint property of A it follows that ⟨x, (a − A)x⟩ is a real number, therefore
\[ |\langle x, (a + ib - A)x \rangle| = \big|\langle x, (a - A)x \rangle + ib\|x\|^2\big| \ge |b|\,\|x\|^2 \]
as desired. This completes the proof of the above lemma.

We now discuss functional calculus for bounded self adjoint operators. Note that since
σ(A) is bounded and closed it will follow that σ(A) is a compact subset of R, and thus we
could define f (A) even if f is merely continuous (which would be weaker than the analytic
assumption required by the complex method generally applied to all bounded operators).
The idea is to use the Weierstrass theorem and define f (A) to be the limit in operator norm
of pn (A) where (pn ) is a sequence of polynomials that approximates f . To see that this could
be done, it suffices to show that if g is a polynomial then
\[ \|g(A)\| = \sup_{x \in \sigma(A)} |g(x)|. \]
To see this last claim, we first show it for real polynomials. In fact we will first consider
g(x) = x. Then as we proved before
\[ \sup_{x \in \sigma(A)} |x| = \lim_{n} \|A^n\|^{1/n} \le \|A\|, \]
while ‖A‖ ≤ sup_{x∈σ(A)} |x| using either the spectral theorem, or by elementary methods.
(Here we could easily see that ‖Ax‖^2 = ⟨A^2 x, x⟩ ≤ ‖A^2‖‖x‖^2, so ‖A‖^2 ≤ ‖A^2‖; thus by
repeating we obtain ‖A‖ ≤ ‖A^{2^n}‖^{2^{-n}}, and letting n → ∞ gives the claim.)
The real polynomial case then follows from the spectral mapping theorem σ(g(A)) = g(σ(A))
and the fact that g(A) is self adjoint when g is a real polynomial. To allow for a complex
polynomial p, simply write
\[ \|p(A)\|^2 = \|p(A)^* p(A)\| = \|(\bar{p} p)(A)\| = \sup_{\lambda \in \sigma(A)} (\bar{p} p)(\lambda) = \sup_{\lambda \in \sigma(A)} |p(\lambda)|^2. \]
We note that as a consequence of the definition we also have

Corollary 4. For all continuous g on σ(A) it holds that kg(A)k = supx∈σ(A) |g(x)|.
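The identity ‖g(A)‖ = sup |g| on σ(A) can be checked by hand in finite dimensions. The
following is a minimal sketch for a real symmetric 2 × 2 matrix, where the eigenvalues have a
closed form; the matrix A = [[2, 1], [1, 2]], the polynomial g(x) = x^2 − 2x and the helper
names are assumptions chosen for illustration.

```python
import math

def sym_eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def op_norm(a, b, d):
    """Operator norm of a real symmetric 2x2 matrix: the largest |eigenvalue|."""
    return max(abs(lam) for lam in sym_eigs(a, b, d))

def g(x):
    """Sample real polynomial g(x) = x^2 - 2x."""
    return x * x - 2.0 * x

# A = [[2, 1], [1, 2]] is symmetric with spectrum {1, 3}.
a, b, d = 2.0, 1.0, 2.0
spectrum = sym_eigs(a, b, d)

# Entries of g(A) = A^2 - 2A, using A^2 = [[a^2 + b^2, ab + bd], [ab + bd, b^2 + d^2]].
ga = (a * a + b * b) - 2.0 * a
gb = (a * b + b * d) - 2.0 * b
gd = (b * b + d * d) - 2.0 * d

lhs = op_norm(ga, gb, gd)                    # ||g(A)||
rhs = max(abs(g(lam)) for lam in spectrum)   # sup of |g| over the spectrum
print(lhs, rhs)
```

The two printed numbers agree up to rounding, as the corollary predicts for any continuous g.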
Now, we may define a linear functional on the space of polynomials on σ(A) as follows:
given such a polynomial p, let L(p) = ⟨p(A)ξ, ξ⟩; it is clear that
|L(p)| ≤ ‖ξ‖^2 sup_{x∈σ(A)} |p(x)|. Using Weierstrass’s theorem it follows that we could extend
L to the space of continuous functions on σ(A). Using self-adjointness of A it is clear that L
is positive: if f ≥ 0 then write f = g^2 with g real valued and g = lim p_n a limit of real
polynomials, then L(f) = L(g^2) = lim ⟨p_n(A)^2 ξ, ξ⟩ = lim ‖p_n(A)ξ‖^2 ≥ 0. Thus by the Riesz
representation theorem we may write L(f) = ∫ f dµ where µ is some finite Borel measure
on σ(A).
We now construct the operator U, initially from the space of continuous functions on
σ(A) to H. If q is a polynomial then let U q = q(A)ξ; clearly
\[ \int_{\sigma(A)} |q(x)|^2\, d\mu = L(\bar{q} q) = \langle (\bar{q} q)(A)\xi, \xi \rangle = \|q(A)\xi\|^2 = \|U q\|^2, \]
thus the restriction of U to the polynomials is an isometry and the image of the polynomials
under U is clearly dense inside H by the given assumption. Thus U extends to an isometric
isomorphism from L^2(σ(A), dµ) to H.
Finally we will show that U^{-1}AU f(x) = x f(x) for all f ∈ L^2; note that this equality
is understood in the almost everywhere sense with respect to µ, and furthermore M(x) := x is
bounded on σ(A) thanks to compactness of σ(A). Thanks to Weierstrass’s theorem again it
suffices to show this equality when f is a polynomial. In that case let g(x) = x f(x), also a
polynomial; we then have
\[ U^{-1} A U f = U^{-1}(A f(A)\xi) = U^{-1}(g(A)\xi) = g, \qquad g(x) = x f(x). \]
10.2 Projection-valued measure and spectral projection
Recall that P : H → H is a projection if P^2 = P. We say that it is an orthogonal projection
if ker(P) and range(P) are orthogonal subspaces, which is the case if and only if
P = P* (the underlying assumption is that P is bounded). Recall a basic fact:
Theorem 37. Given any closed subspace K of H there is an orthogonal projection P_K onto
K: range(P_K) is K and ker(P_K) is exactly K^⊥.
In this section we take a closer look at the spectral representation of A.

We say that a family (P_Ω) indexed by Borel subsets Ω ⊂ R is a (compactly supported)
projection-valued measure if
(i) P_Ω is an orthogonal projection on H.
(ii) P_∅ = 0 (and P_{[−M,M]} = I for some M sufficiently large).
(iii) If Ω = ⋃_{n≥1} Ω_n is a disjoint union then P_Ω ξ = Σ_n P_{Ω_n} ξ, with convergence in the norm.
(iv) If Ω_1 and Ω_2 are Borel sets then P_{Ω_1} P_{Ω_2} = P_{Ω_1∩Ω_2}.
We note that property (iv) is a corollary of the first three properties.
We first note that we could construct a measure out of (P_Ω) when testing the projections
on vectors φ_1, φ_2 ∈ H. Namely let µ_{φ_1,φ_2}(Ω) = ⟨P_Ω φ_1, φ_2⟩; then this defines a
compactly supported complex Borel measure on R with finite total mass, in fact it is not hard
to see that
\[ \|\mu_{\varphi_1,\varphi_2}\| \le \|\varphi_1\|\,\|\varphi_2\|, \]
therefore we could integrate any bounded Borel measurable function f and obtain a
bounded bilinear form
\[ T_f(\varphi_1, \varphi_2) = \int f(\lambda)\, d\mu_{\varphi_1,\varphi_2}(\lambda). \]
Then by the Riesz representation theorem we may write T_f(φ_1, φ_2) = ⟨Bφ_1, φ_2⟩ where B is a
bounded linear operator on H. If we approximate f using simple functions, it can be seen
that B is the limit in the weak operator topology of the corresponding linear combinations
of P_Ω (note that this implies B is the limit in the strong operator topology too). We will
let ∫ f(λ) dP_λ denote this operator B and we think of this as the integration of f over the
measure induced by the family (P_Ω). In particular, ∫ λ dP_λ is a bounded self adjoint operator
on H.
It turns out that the converse of this is also true:
Theorem 38. Given a bounded self adjoint operator A on a complex Hilbert space H let
P_Ω = 1_Ω(A); then (P_Ω) is a compactly supported projection valued measure, and ∫ f(λ) dP_λ
converges to f(A) (as defined by spectral calculus) in the strong operator topology (T_j → T
means ‖T_j x − T x‖ → 0 for all x). Furthermore this is the unique projection valued measure
with this property.
To check that ∫ f(λ) dP_λ converges to f(A) in the strong operator topology, it suffices
to show convergence in the weak operator topology; then it suffices to check that if f = 1_Ω
then ⟨(∫ 1_Ω dP_λ) φ_1, φ_2⟩ = ⟨1_Ω(A)φ_1, φ_2⟩ for all φ_1, φ_2 ∈ H. By definition the left hand
side is the same as ∫ 1_Ω dµ_{φ_1,φ_2} = ⟨P_Ω φ_1, φ_2⟩, which is the same as the right hand side.
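In finite dimensions a projection-valued measure is just a bookkeeping device for
eigenprojections, and the defining properties can be verified directly. The sketch below does
this for the symmetric matrix A = [[2, 1], [1, 2]]; the matrix, the representation of Borel sets
as Python predicates, and all helper names are assumptions made for the example.

```python
# A = [[2, 1], [1, 2]] has eigenvalue 1 with eigenvector (1, -1)/sqrt(2)
# and eigenvalue 3 with eigenvector (1, 1)/sqrt(2).
A = [[2.0, 1.0], [1.0, 2.0]]
EIGENPROJECTIONS = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],
    3.0: [[0.5, 0.5], [0.5, 0.5]],
}

def zero():
    return [[0.0, 0.0], [0.0, 0.0]]

def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def scale(c, p):
    return [[c * p[i][j] for j in range(2)] for i in range(2)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_projection(omega):
    """P_Omega: sum of the eigenprojections whose eigenvalue lies in Omega.

    `omega` is a predicate t -> bool standing in for a Borel set.
    """
    result = zero()
    for lam, proj in EIGENPROJECTIONS.items():
        if omega(lam):
            result = add(result, proj)
    return result

def integral(f):
    """The operator 'integral of f(lambda) dP_lambda', a finite sum here."""
    result = zero()
    for lam, proj in EIGENPROJECTIONS.items():
        result = add(result, scale(f(lam), proj))
    return result

omega1 = lambda t: 0.0 < t < 2.0
omega2 = lambda t: 0.5 < t < 4.0
both = lambda t: omega1(t) and omega2(t)

product = matmul(spectral_projection(omega1), spectral_projection(omega2))
print(product, spectral_projection(both))   # property (iv): the two should agree
print(integral(lambda t: t), A)             # integral of lambda dP_lambda vs A
```

In infinite dimensions the finite sums become limits of simple-function approximations, which
is where the weak and strong operator topologies enter.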
10.3 Spectral representation and decomposition

Absolutely continuous, singular, and point spectrum
Recall that H has the spectral representation L^2(X, µ) which comes from direct summing
the spectral representations L^2(µ_j) of the cyclic subspaces K_j. We may decompose µ_j into
three parts (absolutely continuous, singular, and point parts) using the Radon Nikodym
theorem. This leads to a decomposition of K_j and also a decomposition of H into the
absolutely continuous, singular, and point subspaces H^{(c)}, H^{(s)}, and H^{(p)}. Note that these
are orthogonal subspaces and the corresponding spectral measure of a cyclic subspace has the
inherited properties. Say if x ∈ H^{(c)} then the spectral measure of A on span(A^n x) is
absolutely continuous with respect to the Lebesgue measure.
Uniqueness of the spectrum
Let A be bounded self adjoint on H and assume that H is separable. Recall from prior
sections that there is a decomposition of H into an orthogonal direct sum of K_1, K_2, . . .
where the K_j are orthogonal closed subspaces of H and each of them is invariant under A;
furthermore there is a linear isometry U_j mapping some L^2(σ(A), µ_j) onto K_j such that
U_j^{-1} A U_j acts on this L^2 space by multiplication, in fact U_j^{-1} f(A) U_j g(x) = f(x)g(x)
for all bounded measurable f and all g ∈ L^2(µ_j). We write S_j for the support of µ_j. One
could certainly modify dµ_j multiplicatively using a bounded positive function; this would
affect the isometry part of the map U_j but the L^2 space remains the same and the
corresponding action of f(A) is still pointwise multiplication.
Now if H has another decomposition into an orthogonal direct sum of closed subspaces
L_1, L_2, . . . with spectral representations L^2(T_j, ν_j) for L_j, then up to a set of (spectral)
measure 0 we have ⋃ S_j = ⋃ T_k; namely we will show that for any k the set difference
T_k \ ⋃ S_j has zero measure under ν_k. Note that if we assume furthermore that these
measures are absolutely continuous with respect to the Lebesgue measure then ⋃ S_j = ⋃ T_k
up to a set of Lebesgue measure 0.
To see this take f ∈ L^2(T_k, ν_k), and let F be any bounded Borel measurable function.
Note that the K_j and L_k are invariant under F(A), and f corresponds to some h_f ∈ L_k ⊂ H.
We now decompose h_f into h_j where h_j ∈ K_j, and h_j in turn corresponds to
f_j ∈ L^2(S_j, µ_j). Then
\[ \int |F(x) f(x)|^2\, d\nu_k = \|F(A) h_f\|^2 = \|F(A) h_1\|^2 + \|F(A) h_2\|^2 + \cdots = \int |F(x)|^2 |f_1(x)|^2\, d\mu_1 + \int |F(x)|^2 |f_2(x)|^2\, d\mu_2 + \dots \]
Therefore if T_k \ ⋃ S_j has positive ν_k measure we may choose F to be the characteristic
function of this set and f ≡ 1; clearly the left hand side is now positive while the right hand
side is 0, a contradiction.
Spectral multiplicity
Consider the spectral representation of H, assumed separable, using the orthogonal direct
sum of K_1, K_2, . . . with K_j = L^2(S_j, µ_j) where S_j is the support of µ_j.
Given λ ∈ R we define its spectral multiplicity with respect to this representation as the
number of j such that λ ∈ S_j.
Assume for simplicity that the spectrum of A is absolutely continuous with respect to the
Lebesgue measure. Then it can be shown that the spectral multiplicity is independent of
the spectral representation of H; this is a theorem of Hellinger.
Chapter 11
Spectral theory for unbounded self-adjoint operator
Let A be a linear operator on some dense subspace D of a Hilbert space H.

Let A* be defined as follows: for every ω ∈ H, if there exists z ∈ H such that for every
x ∈ D it holds that ⟨Ax, ω⟩ = ⟨x, z⟩ then ω ∈ D*, the domain of A*, and A*ω := z (it is clear
that such z is unique if it exists).
We say that z ∈ C is in the resolvent set ρ(A) of A if A − z is bijective from D to H, and
σ(A) = C \ ρ(A) as before.
Theorem 39. Let A be self-adjoint on H with dense domain D ⊂ H. Then σ(A) ⊂ R and there
is an orthogonal projection valued measure (P_Ω) such that P_Ω commutes with A on the domain
D of A, which in turn consists of all vectors φ such that
\[ \int t^2\, d\langle P_t \varphi, \varphi \rangle < \infty, \]
and
\[ A\varphi = \int t\, dP_t \varphi. \]
We first show that σ(A) ⊂ R. To do this we will show that for every z ∉ R the range
of A − z is closed and then we use the denseness of D to show that A − z is onto; as a
byproduct of the proof we will also have that A − z is injective, in fact (A − z)^{-1} is bounded
from H to D.
We now show the spectral theorem. To do that we first show that
(i) R(z) = (A − z)^{-1} is a complex differentiable function of z ∈ C_+ := {Im(z) > 0} in
the strong topology, namely the topology on the space of bounded operators induced by the
seminorms A ↦ ‖Aφ‖, φ ∈ H. In other words, R(z)u is complex differentiable for every u ∈ H.
(ii) the adjoint of R(z) is R(z̄).
To see (i), first note that R(z) is a bounded operator on H for those z; let u(z) =
(A − z)^{-1}u for any u ∈ H. Then by simple algebra
\[ u(z + \omega) - u(z) = \omega R(z) R(z + \omega) u, \]
so u(z) is complex differentiable for all u, thus R(z) is complex differentiable and therefore
is analytic in the strong operator topology.
To see (ii), note that
\[ \langle u, R(\bar{z}) v \rangle = \langle (A - z) R(z) u, R(\bar{z}) v \rangle = \langle R(z) u, (A - \bar{z}) R(\bar{z}) v \rangle = \langle R(z) u, v \rangle. \]
As a consequence, we know that F_u(z) := ⟨R(z)u, u⟩ is a complex valued analytic function
on the upper half plane and |F_u(z)| ≲ 1/Im(z). Furthermore using the second property it
follows that F_u is symmetric across the real line in the sense that F_u(z̄) is the complex
conjugate of F_u(z).
We now show that F_u(z) has nonnegative imaginary part for any u ∈ H and z in the
upper half plane and
\[ \lim_{y \to \infty} y\, \mathrm{Im}\, F_u(iy) = \|u\|^2. \]
To see these points, let v = R(z)u, then F_u(z) = ⟨v, (A − z)v⟩ = ⟨v, Av⟩ − z̄⟨v, v⟩; note
that self adjointness of A implies that ⟨v, Av⟩ ∈ R, therefore Im F_u(z) = Im(z)‖v‖^2 ≥ 0 as
desired. Let z = iy where y > 0, then it follows that
\[ y\, \mathrm{Im}\, F_u(iy) = y^2 \|v\|^2; \]
on the other hand
\[ \|u\|^2 = \langle (A - iy)v, (A - iy)v \rangle = \|Av\|^2 + y^2 \|v\|^2, \]
thus as y → ∞ we have y^2‖v‖^2 → ‖u‖^2 as desired once we show that Av = AR(iy)u
converges to 0 as y → ∞ for any fixed u ∈ H. Note that this holds if u ∈ D: we could
commute A and R(iy) and then ‖AR(iy)u‖ ≲ ‖R(iy)‖‖Au‖ = O(1/y). If u ∉ D we could
approximate it using elements of D; this could be done uniformly as long as we could show
that AR(iy) is uniformly bounded. In fact we will show that
\[ \|A R(iy)\| \le 2. \]
To see this, simply write
\[ \|A R(iy) u\| = \|u + iy R(iy) u\| \le \|u\| + y \|R(iy)\| \|u\| \le 2\|u\|. \]
Now one could think of F_u(z) as an analytic function that maps the upper half plane
into itself, thus using the Poisson integral and the Riesz representation theorem we could show
that there is some nonnegative measure µ_u and some linear function l(z) of z such that
\[ F_u(z) = l(z) + \int \frac{d\mu_u(t)}{t - z} \]
(this is basically Herglotz’s theorem) for all z in the upper half plane. Then combining with
the fact that y Im F_u(iy) → ‖u‖^2 it follows that the linear part is 0 and µ_u(R) = ‖u‖^2 (this is
basically a result of Nevanlinna).
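For a diagonal operator the measure µ_u is a finite sum of point masses and the three facts
used above (positive imaginary part, the bound by 1/Im(z), and the limit of y Im F_u(iy)) can
be observed numerically. This is a sketch under that assumption; the eigenvalues, the vector u
and the function name `F` are illustrative choices.

```python
def F(z, eigenvalues, u):
    """F_u(z) = <R(z)u, u> for the diagonal operator A = diag(eigenvalues).

    Here mu_u is the sum of point masses |u_j|^2 at the eigenvalues, so
    F_u(z) is the sum of |u_j|^2 / (lambda_j - z).
    """
    return sum(abs(c) ** 2 / (lam - z) for lam, c in zip(eigenvalues, u))

eigenvalues = [-2.0, 0.5, 3.0]
u = [1.0, 2.0j, -1.0 + 1.0j]
norm_sq = sum(abs(c) ** 2 for c in u)   # total mass of mu_u

z = 0.3 + 0.2j
print(F(z, eigenvalues, u).imag > 0)                  # positive imaginary part
print(abs(F(z, eigenvalues, u)) <= norm_sq / z.imag)  # bound by ||u||^2 / Im(z)

for y in (1.0, 10.0, 1e3, 1e6):
    print(y, y * F(1j * y, eigenvalues, u).imag, norm_sq)
```

The last column is ‖u‖^2, which the middle column approaches as y grows.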
We note that the values of F_u in the upper half plane determine its values in the lower
half plane via the relationship F_u(z̄) = conjugate of F_u(z). Therefore the equation
\[ F_u(z) = \int \frac{d\mu_u(t)}{t - z} \]
holds for all z ∈ C \ R. Then via polarization it follows that for all u, v ∈ H
\[ \langle R(z)u, v \rangle = \int \frac{d\mu_{u,v}(t)}{t - z} \]
where µ_{u,v} is a complex measure, basically a linear combination of µ_{u±v}, µ_{u±iv}:
\[ \mu_{u,v} = \frac{1}{4}\big( \mu_{u+v} - \mu_{u-v} + i\mu_{u+iv} - i\mu_{u-iv} \big). \]
It is not hard to check the following properties:
(i) µ_{u,u} = µ_u is a nonnegative measure with total mass ‖u‖^2;
(ii) µ_{u,v} is linear in u and conjugate linear in v, and µ_{u,v} is the complex conjugate of µ_{v,u};
(iii) the total variation ‖µ_{u,v}‖ is controlled by ‖u‖‖v‖ (use Cauchy-Schwarz).
Then via repeated application of the Riesz representation theorem there exists a bounded
self adjoint operator P_Ω for each Borel Ω ⊂ R such that µ_{u,v}(Ω) = ⟨P_Ω u, v⟩ for all u, v ∈ H.
We will check that (P_Ω) indeed defines a projection valued measure, and it will then follow
that
\[ \langle R(z)u, v \rangle = \int \frac{1}{t - z}\, d\langle P_t u, v \rangle. \]
Check:
(i) Clearly P_∅ = 0 since µ_{u,v}(∅) = 0 for all u, v; also P_R = 1 since
‖u‖^2 = µ_u(R) = ∫ d⟨P_t u, u⟩ = ⟨P_R u, u⟩ for all u ∈ H.
(ii) We will show that P_Ω commutes with A. First we show that P_Ω commutes with R(z).
Note that R(z_1) and R(z_2) commute with each other for all non real z_1 and z_2. Using
the representation ⟨R(z)u, v⟩ = ∫ (t − z)^{-1} d⟨P_t u, v⟩ and the fact that the adjoint of R(z_2)
is R(z̄_2) we have
\[ \int (t - z_1)^{-1}\, d\langle R(z_2) P_t u, v \rangle = \int (t - z_1)^{-1}\, d\langle P_t u, R(\bar{z}_2) v \rangle = \langle R(z_1) u, R(\bar{z}_2) v \rangle \]
\[ = \langle R(z_2) R(z_1) u, v \rangle = \langle R(z_1) R(z_2) u, v \rangle = \int (t - z_1)^{-1}\, d\langle P_t R(z_2) u, v \rangle. \]
Using the fact that the Cauchy transform determines the measure uniquely, it follows that
R(z_2)P_Ω = P_Ω R(z_2) for any Borel Ω.
Let u ∈ H, then AR(z)u = u + zR(z)u, thus for any Ω we have
\[ P_\Omega A (R(z)u) = P_\Omega u + z P_\Omega (R(z)u) = P_\Omega u + z R(z) P_\Omega u = A R(z) P_\Omega u = A P_\Omega R(z) u. \]
Since the range of R(z) is D it follows that P_Ω Au = AP_Ω u for all u ∈ D.
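(iii) We will show that P_{Ω_1} P_{Ω_2} = P_{Ω_1∩Ω_2}. This is a consequence of the resolvent identity
R(z) − R(w) = (z − w)R(z)R(w). Indeed, this identity implies
\[ \langle R(z)R(w)u, v \rangle = \int \frac{1}{z - w}\Big( \frac{1}{t - z} - \frac{1}{t - w} \Big)\, d\langle P_t u, v \rangle = \int \frac{1}{(t - z)(t - w)}\, d\langle P_t u, v \rangle. \]
On the other hand, as before
\[ \langle R(z)R(w)u, v \rangle = \int (t - z)^{-1}\, d\langle R(w) P_t u, v \rangle, \]
therefore for any Borel Ω_1 ⊂ R and any w ∉ R we have
\[ \int \frac{1_{\Omega_1}(t)}{t - w}\, d\langle P_t u, v \rangle = \langle R(w) P_{\Omega_1} u, v \rangle = \int \frac{1}{t - w}\, d\langle P_t P_{\Omega_1} u, v \rangle. \]
Using again the fact that the Cauchy transform determines the measure, it follows
that given any Ω_2 ⊂ R we have
\[ \langle P_{\Omega_2 \cap \Omega_1} u, v \rangle = \langle P_{\Omega_2} P_{\Omega_1} u, v \rangle \]
as desired.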
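(iv) Finally, assume φ ∈ D. Then φ = R(z)u for some u ∈ H, where z is arbitrary in the
upper half plane. We then have
\[ \langle P_\Omega \varphi, \varphi \rangle = \langle R(\bar{z}) R(z) P_\Omega u, u \rangle = \int_\Omega \frac{1}{|t - z|^2}\, d\langle P_t u, u \rangle, \]
therefore d⟨P_t φ, φ⟩ = |t − z|^{-2} d⟨P_t u, u⟩ as measures (for purely imaginary z the weight
is 1/(t^2 + |z|^2)). In particular,
\[ \int t^2\, d\langle P_t \varphi, \varphi \rangle \lesssim \int d\langle P_t u, u \rangle < \infty. \]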
Using orthogonality of P_Ω on disjoint Ω’s one could construct ∫ t dP_t u for all u ∈ D using
simple functions (the integral exists in the strong operator topology) and we could show
that this is the same as Au. To see this, let v ∈ D and u = (A − z)v; then similarly we could
show for any φ ∈ H:
\[ d\langle P_t v, \varphi \rangle = \frac{1}{t - z}\, d\langle P_t u, \varphi \rangle, \]
\[ (t - z)\, d\langle P_t v, \varphi \rangle = d\langle P_t (A - z)v, \varphi \rangle = d\langle P_t A v, \varphi \rangle - z\, d\langle P_t v, \varphi \rangle, \]
\[ t\, d\langle P_t v, \varphi \rangle = d\langle P_t A v, \varphi \rangle, \]
therefore ∫ t d⟨P_t v, φ⟩ = ⟨Av, φ⟩ for every φ, thus for any v ∈ D we have
\[ A v = \int t\, dP_t v. \]
Chapter 12
Fredholm determinant
12.1 Motivation
The Fredholm determinant originated in Fredholm’s investigation of the integral equation
\[ (1 + K)u = f \]
where K is the integral operator K f(x) = ∫_Y K(x, y) f(y) dy mapping functions on Y to
functions on X. (Here X, Y are compact metric spaces.) We know that K is compact
from L^2(Y) to L^2(X) if the kernel K is square integrable on X × Y. The setting originally
considered is for continuous functions, for which we have the following results:
(i) If K is a continuous function of x and y then K is compact from L1 (Y ) to C(X).
(ii) If K is a continuous function of x in the L^1 norm with respect to y, i.e.
‖K(x, ·) − K(x′, ·)‖_{L^1_y} → 0 as x′ → x, then K is compact from C(Y) to C(X).
The proof of these facts uses the Arzela Ascoli criterion: if S is compact Hausdorff and a family
of functions on S is equicontinuous and pointwise uniformly bounded, then the family is
precompact in the sup norm on S.
For simplicity, assume K : [0, 1]^2 → C is continuous and f ∈ C[0, 1]. Note that this is
not a Hilbert space. By Riesz’s theorem the integral equation
\[ u(x) + \int_0^1 K(x, y) u(y)\, dy = f(x) \]
is solvable on C[0, 1] (i.e. given f ∈ C[0, 1] one could find u ∈ C[0, 1]) iff the operator
1 + K is injective on this space, or equivalently its range is the whole space. It can be shown
that a vector is in the range of 1 + K iff it is orthogonal to the null space of 1 + K*, where K*
has kernel K*(x, y) = K(y, x).
Fredholm investigated the above equation by discretizing the equation and appealing to
linear algebra, and then taking a limit at the end. This gives rise to a number called the
Fredholm determinant of 1 + K (we simply say the Fredholm determinant for K), which
determines whether the given integral equation is solvable or not. The determinant concept
has been extended to other settings, most commonly for K being a trace class operator on
some separable Hilbert space.
We first discuss Fredholm’s original approach for C[0, 1] (modulo some simplifications),
and then we’ll discuss the Hilbert space theory later.
12.2 Fredholm’s approach for integral operators

We now investigate Fredholm’s approach towards solving the above equation. Fredholm’s
idea is to use linear algebra, approximating the integral using discretized sums and taking
an appropriate limit at the end:
\[ u_i + h \sum_{j=1}^{n} K_{ij} u_j = f_i, \qquad i = 1, \dots, n, \]
where h = 1/n, f_i = f(ih), K_{ij} = K(ih, jh) and u_i = u(ih). The determinant of the
matrix acting on the vector (u_1, . . . , u_n) is denoted by D(h),
\[ D(h) = \det(I + h(K_{ij})), \]
which is clearly a polynomial in h of degree at most n:
\[ D(h) = \sum_{m=0}^{n} c_m h^m, \qquad c_m = \frac{1}{m!} \Big( \frac{d}{dh} \Big)^m D(h) \Big|_{h=0}. \]
We then use the product rule, which says that if C = det(C_1, . . . , C_n) is the determinant of
a matrix with columns C_1, . . . , C_n then by multilinearity
\[ \frac{d}{dh} C = \sum_k \det\Big( C_1, \dots, \frac{d}{dh} C_k, \dots, C_n \Big). \]
In our case each column is linear in h, and C_j(0) = (0, . . . , 1, . . . , 0) where the 1 is in the
jth position. Therefore
\[ D(h) = 1 + h \sum_j K_{jj} + \frac{h^2}{2} \sum_{i,j} \det \begin{pmatrix} K_{ii} & K_{ij} \\ K_{ji} & K_{jj} \end{pmatrix} + \dots \]
For convenience let
\[ K\begin{pmatrix} x_1, \dots, x_k \\ y_1, \dots, y_k \end{pmatrix} \]
denote the determinant of the k × k matrix (K(x_i, y_j)); then
letting h = 1/n and sending n → ∞ we obtain
\[ D = \sum_{k=0}^{\infty} \frac{1}{k!} \int \cdots \int K\begin{pmatrix} x_1, \dots, x_k \\ x_1, \dots, x_k \end{pmatrix} dx_1 \dots dx_k. \]
This is called the Fredholm determinant of the integral operator K. Note that this is a
complex number.
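The passage from D(h) to D can be watched numerically. The sketch below evaluates
D(h) = det(I + h(K_{ij})) for the rank-one kernel K(x, y) = xy, an assumed example for which
all determinants of size 2 or more vanish and hence D = 1 + ∫_0^1 x^2 dx = 4/3. The function
names are illustrative.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def discrete_fredholm_det(kernel, n):
    """D(h) = det(I + h K_ij) with h = 1/n and K_ij = kernel(i h, j h), i, j = 1..n."""
    h = 1.0 / n
    m = [[(1.0 if i == j else 0.0) + h * kernel(i * h, j * h)
          for j in range(1, n + 1)] for i in range(1, n + 1)]
    return det(m)

def kernel(x, y):
    return x * y  # rank-one example kernel (an assumption of this sketch)

for n in (10, 50, 100):
    print(n, discrete_fredholm_det(kernel, n))
```

The printed values drift towards 4/3 at rate O(1/n), which is the accuracy of the underlying
Riemann sum.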
12.2.1 Convergence and Continuity of the determinant

We’ll show that the sum defining D converges. We’ll use Hadamard’s inequality: for column
vectors v_1, . . . , v_n ∈ R^n, it holds that
\[ |\det(v_1 | v_2 | \dots | v_n)| \le n^{n/2} \prod_{j=1}^{n} |v_j|_\infty. \]
This follows from the fact that the volume of the parallelepiped is at most the product of
the side lengths |v_j|_2, together with |v_j|_2 ≤ n^{1/2} |v_j|_∞.
Now, the given assumption on K implies that sup_{x,y} |K(x, y)| ≤ M for some finite M.
So by Hadamard’s inequality we have
\[ \Big| K\begin{pmatrix} x_1, \dots, x_k \\ y_1, \dots, y_k \end{pmatrix} \Big| \le (M k^{1/2})^k \tag{12.1} \]
which implies the desired convergence, since Σ_k (M k^{1/2})^k / k! < ∞.
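Both ingredients of the convergence argument are easy to test on small cases. The following
sketch checks Hadamard’s inequality on two sign matrices and evaluates the majorant terms
M^k k^{k/2} / k!; the test matrices, the value M = 2 and the helper names are assumptions made
for illustration.

```python
import math
from itertools import permutations

def det(matrix):
    """Determinant via the Leibniz formula (only sensible for small n)."""
    n = len(matrix)
    total = 0.0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1.0
        for i in range(n):
            prod *= matrix[i][perm[i]]
        total += sign * prod
    return total

def hadamard_bound(matrix):
    """n^(n/2) times the product of the sup norms of the columns."""
    n = len(matrix)
    bound = float(n) ** (n / 2.0)
    for j in range(n):
        bound *= max(abs(matrix[i][j]) for i in range(n))
    return bound

def series_term(M, k):
    """Majorant M^k k^(k/2) / k! for the k-th term of the series defining D."""
    return (M ** k) * (k ** (k / 2.0)) / math.factorial(k)

H2 = [[1.0, 1.0], [1.0, -1.0]]
M3 = [[1.0, 1.0, 1.0], [1.0, -1.0, 1.0], [1.0, 1.0, -1.0]]
for m in (H2, M3):
    print(abs(det(m)), hadamard_bound(m))

print([series_term(2.0, k) for k in (1, 5, 10, 25, 50)])
```

The 2 × 2 example attains the bound, so the constant n^{n/2} cannot be improved in general; the
majorant terms nonetheless decay because k! eventually dominates M^k k^{k/2}.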

Let ‖·‖ denote the sup norm on [0, 1]^2 below. We obtain
Lemma 9. Let F and G be two bounded functions on [0, 1]^2. Then
\[ |\det(F(x_i, x_j)) - \det(G(x_i, x_j))| \le n^{1 + \frac{n}{2}} \|F - G\| \max(\|F\|, \|G\|)^{n-1}; \]
here all matrices are n × n.
Proof. For convenience, let F be the n × n matrix with entries F(x_i, x_j) and G be the n × n
matrix with entries G(x_i, x_j). It is clear that
\[ \det(F) - \det(G) = \det M_1 + \det M_2 + \dots + \det M_n \]
where M_k is the matrix whose first k − 1 rows are the same as G, and the kth row is the
same as F − G, and the last n − k rows are the same as F.
Applying Hadamard’s inequality to each M_k we obtain the desired estimate.
We now discuss how to solve the integral equation using Fredholm’s determinant. The
idea is to solve it at the discrete level and then send h → 0 via n → ∞. Via this heuristic
we obtain
\[ (I + K)(I + L) = (I + L)(I + K) = I \]
where
\[ L f(x) = \int L(x, y) f(y)\, dy, \]
\[ L(x, y) = -D^{-1} \sum_{k \ge 0} \frac{1}{k!} \int \cdots \int K\begin{pmatrix} x, \xi_1, \dots, \xi_k \\ y, \xi_1, \dots, \xi_k \end{pmatrix} d\xi_1 \dots d\xi_k. \]
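As a sanity check of the formula for L, consider again the rank-one kernel K(x, y) = xy on
[0, 1] (an assumed example). Every determinant of size 2 or more built from a rank-one kernel
vanishes, so only the k = 0 term survives in the sum, giving L(x, y) = −K(x, y)/D with
D = 1 + ∫ K(x, x) dx. The sketch below uses a midpoint rule, with hypothetical helper names,
to confirm that u = (I + L)f solves u + Ku = f for f ≡ 1.

```python
def integrate(g, n=200):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def K(x, y):
    return x * y  # rank-one example kernel (an assumption of this sketch)

# For a rank-one kernel all determinants of size >= 2 vanish, so
# D = 1 + integral of K(x, x) dx and the sum defining L reduces to K(x, y).
D = 1.0 + integrate(lambda x: K(x, x))

def L(x, y):
    return -K(x, y) / D

def f(x):
    return 1.0

def u(x):
    """Candidate solution u = (I + L) f."""
    return f(x) + integrate(lambda y: L(x, y) * f(y))

def residual(x):
    """u(x) + integral of K(x, y) u(y) dy - f(x); should be close to 0."""
    return u(x) + integrate(lambda y: K(x, y) * u(y)) - f(x)

worst = max(abs(residual(x)) for x in (0.0, 0.25, 0.5, 0.75, 1.0))
print(D, worst)
```

Here the exact solution is u(x) = 1 − 3x/8, which one can also confirm by substituting into
the integral equation.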
We are now ready to prove
Theorem 40. Let K be a continuous kernel and let K act on C[0, 1].
(i) If D = 0 then the operator I + K has a nontrivial null space and therefore is not
invertible.
(ii) Conversely if D ≠ 0 then the operator I + K is invertible, furthermore its inverse is
given by I + L where L is defined above.
Proof: We first note that if $K_1$ and $K_2$ are two integral operators with kernels $K_1(x,y)$ and $K_2(x,y)$ respectively then $(1+K_1)(1+K_2) = 1+K_3$ where $K_3$ is another integral operator whose kernel is given by
$$K_3(x,y) = K_1(x,y) + K_2(x,y) + \int K_1(x,z)K_2(z,y)\,dz .$$
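The composition rule can be checked numerically on a concrete pair of kernels. The sketch below is an illustration under our own assumptions: the kernels $K_1(x,y)=xy$ and $K_2(x,y)=x+y$, the test function $f\equiv 1$, the evaluation point $x = 0.3$, and the midpoint quadrature are all hypothetical choices. A hand computation gives $((1+K_1)(1+K_2)f)(x) = \tfrac32 + \tfrac{25}{12}x$, which is $2.125$ at $x=0.3$.

```python
def integrate(f, n=200):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def one_plus(kernel, f):
    """Return the function x -> f(x) + int_0^1 kernel(x, y) f(y) dy."""
    return lambda x: f(x) + integrate(lambda y: kernel(x, y) * f(y))


def K1(x, y):
    return x * y


def K2(x, y):
    return x + y


def K3(x, y):
    """Kernel of the composition: K1 + K2 + int K1(x, z) K2(z, y) dz."""
    return K1(x, y) + K2(x, y) + integrate(lambda z: K1(x, z) * K2(z, y))


def f(y):
    return 1.0


lhs = one_plus(K1, one_plus(K2, f))(0.3)   # ((1 + K1)(1 + K2) f)(0.3)
rhs = one_plus(K3, f)(0.3)                 # ((1 + K3) f)(0.3)
print(lhs, rhs)
```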
(i) Now, assume that $D \ne 0$. We need to show
$$K(x,y) + L(x,y) + \int K(x,z)L(z,y)\,dz = 0,$$
$$L(x,y) + K(x,y) + \int L(x,z)K(z,y)\,dz = 0 .$$
We will show the first equality. For convenience of notation let
$$R(x,y) := \sum_{k\ge 0} \frac{1}{k!} \int\cdots\int K\begin{pmatrix} x, x_1,\dots,x_k \\ y, x_1,\dots,x_k \end{pmatrix} dx_1\dots dx_k .$$
Then $L = -\frac{1}{D} R$.
Now, computing the determinant $K\begin{pmatrix} x, x_1,\dots,x_k \\ y, y_1,\dots,y_k \end{pmatrix}$ using its first row, we obtain
$$K\begin{pmatrix} x, x_1,\dots,x_k \\ y, y_1,\dots,y_k \end{pmatrix} = K(x,y)\, K\begin{pmatrix} x_1, x_2,\dots,x_k \\ y_1, y_2,\dots,y_k \end{pmatrix} + \sum_{j=1}^{k} (-1)^{j} K(x,y_j)\, K\begin{pmatrix} x_1, x_2,\dots,x_k \\ y, y_1,\dots,(y_j),\dots,y_k \end{pmatrix};$$
here $(y_j)$ means that one omits $y_j$.

Now let $y_1 = x_1, \dots, y_k = x_k$ and integrate the above over $x_1,\dots,x_k \in [0,1]$. Then the integrals of the last $k$ terms in the above expansion are actually the same; one could prove this by moving the column of $y$ into the $j$th position (which costs a sign $(-1)^{j-1}$) and then making a simple change of variables exchanging $x_j$ and $x_1$. It follows that
$$\int\cdots\int K\begin{pmatrix} x, x_1,\dots,x_k \\ y, x_1,\dots,x_k \end{pmatrix} dx_1\dots dx_k = K(x,y) \int\cdots\int K\begin{pmatrix} x_1, x_2,\dots,x_k \\ x_1, x_2,\dots,x_k \end{pmatrix} dx_1\dots dx_k - k \int\cdots\int K(x,x_1)\, K\begin{pmatrix} x_1, x_2,\dots,x_k \\ y, x_2,\dots,x_k \end{pmatrix} dx_1\dots dx_k .$$
Dividing by $k!$ and summing over $k\ge 0$, we obtain
$$R(x,y) = K(x,y)D - \int K(x,x_1)R(x_1,y)\,dx_1$$
which, after dividing by $-D$, implies the desired equality $K(x,y) + L(x,y) + \int K(x,z)L(z,y)\,dz = 0$.
For the second equality, one argues similarly, computing the determinant using the first column.
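To see the inversion formula at work, the following sketch solves $f + Kf = g$ at the discrete level and compares the result with $(I+L)g$. Everything specific here is an assumption made for illustration: the rank-one kernel $K(x,y) = xy$, the right-hand side $g(x) = x$, the midpoint grid, and the helper `solve`. For this kernel a hand computation gives $D = 4/3$ and $R(x,y) = xy$, hence $L(x,y) = -\tfrac34 xy$ and $(I+L)g(x) = \tfrac34 x$.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


n = 40
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]

# Discrete version of (I + K) f = g for K(x, y) = x*y and g(x) = x.
A = [[(1.0 if i == j else 0.0) + h * xs[i] * xs[j] for j in range(n)] for i in range(n)]
f_discrete = solve(A, xs)

# Prediction of the formula f = (I + L) g with L(x, y) = -(3/4) x y (hand computation).
f_formula = [0.75 * x for x in xs]

max_err = max(abs(u - v) for u, v in zip(f_discrete, f_formula))
print(max_err)
```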
(ii) Since $D = 0$, the identity for $R$ obtained in part (i) (whose derivation did not use $D\ne 0$) gives
$$R(x,y) + \int K(x,z)R(z,y)\,dz = 0$$
for every $x, y$. If $R \not\equiv 0$ then one could find one $y$ such that $g(\cdot) = R(\cdot,y)$ is not the zero function (note that it is continuous), and it satisfies $g + Kg = 0$; therefore $1+K$ is not injective.
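It is however possible that $R \equiv 0$. If so, consider the following functions
$$D(z) = \sum_{k\ge 0} \frac{z^k}{k!} \int\cdots\int K\begin{pmatrix} x_1,\dots,x_k \\ x_1,\dots,x_k \end{pmatrix} dx_1\dots dx_k,$$
$$R(x,y,z) := \sum_{k\ge 0} \frac{z^{k+1}}{k!} \int\cdots\int K\begin{pmatrix} x, x_1,\dots,x_k \\ y, x_1,\dots,x_k \end{pmatrix} dx_1\dots dx_k .$$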
One could think of $D(z)$ and $R(x,y,z)$ as the versions of $D$ and $R$ with $zK$ instead of $K$. It is clear that $D(z)$ and $R(x,y,z)$ are entire functions of $z$. Since $D(0) = 1$, the function $D$ is not identically zero, so $D(1) = 0$ means that the entire function $D$ has a zero of finite order $n \ge 1$ at $z = 1$. By algebraic manipulation we have
$$\int R(x,x,z)\,dx = zD'(z),$$
and the right-hand side vanishes at $z=1$ with order exactly $n-1$; therefore there is some $(x,y)$ such that $R(x,y,z)$ cannot vanish at $z = 1$ with order more than $n-1$. Since $R(x,y,1) = R(x,y) \equiv 0$, it follows that for some $1 \le \ell < n$ it holds that
$$R(x,y,z) = (z-1)^{\ell} g(x,y) + O((z-1)^{\ell+1})$$
where $g(x,y) \not\equiv 0$, and $g$ is continuous in $x,y$ (to see this note that $g$ is the uniform limit of a sequence of continuous functions on $[0,1]^2$). We recall that $R(x,y,z) = zK(x,y)D(z) - z\int K(x,x_1)R(x_1,y,z)\,dx_1$; thus by dividing everything by $(z-1)^{\ell}$ and then sending $z\to 1$ (note that $D(z)/(z-1)^{\ell} \to 0$ since $\ell < n$) we obtain
$$g(x,y) = -\int K(x,x_1) g(x_1,y)\,dx_1$$
and since $g \not\equiv 0$ is continuous it follows that $1+K$ is not injective.

Theorem 41. Assume that $K$ is Hölder $c$-continuous where $c > 1/2$. Then the nonzero eigenvalues $\lambda_j$ of $K$ on $C[0,1]$ (only countably many of them since $K$ is compact) satisfy $\sum_j |\lambda_j| < \infty$ and for every $z\in\mathbb{C}$ we have
$$D(z) = \prod_j (1 + z\lambda_j)$$
and we also have the trace formula
$$\int K(x,x)\,dx = \sum_j \lambda_j .$$
Proof. We recall Hadamard's factorization theorem (or maybe a consequence of this theorem): Let $f$ be an entire function such that for some finite positive $C_1$, $C_2$ and $\rho\in[0,1)$ it holds for every complex number $z$ that
$$|f(z)| \le C_1 \exp(C_2 |z|^{\rho}) .$$
Assume that $f(0) \ne 0$. Then $f$ has at most a countable number of zeros (counted with multiplicity); furthermore
$$\sum_{z:\, f(z)=0} \frac{1}{|z|} < \infty$$
and for every complex number $\lambda$ it holds that
$$f(\lambda) = f(0) \prod_{z:\, f(z)=0} \left(1 - \frac{\lambda}{z}\right) .$$
We plan to use this theorem to show the first part of the theorem. We note that the second part of the theorem, i.e. the trace formula, would then follow from
$$\int K(x,x)\,dx = D'(z)|_{z=0}$$
(which is part of the definition of $D$) and the absolute convergence of the product $\prod_j (1+z\lambda_j)$ (viewed as an infinite power series in $z$).
First, we will show that for $z \ne 0$ it holds that: $D(z) = 0$ if and only if $-1/z$ is an eigenvalue of $K$. This is simply a consequence of our last theorem applied to $zK$ in place of $K$, since $D(z) = \det(1+zK)$.
Now, we will show that if $K$ is Hölder $c$-continuous then
$$|D(z)| \lesssim \exp\big(O(|z|^{\frac{2}{1+2c}})\big)$$
(note that the exponent $\frac{2}{1+2c}$ is less than $1$ precisely when $c > 1/2$, as required by the factorization theorem). To see this, using Stirling's formula it suffices to show that, for some $C$ finite,
$$|D(z)| \lesssim \sum_{n\ge 0} \frac{C^n |z|^{2n/(1+2c)}}{n^n}$$
which is equivalent to showing that, for some $C$ finite,
$$|D(z)| \lesssim \sum_{k\ge 0} \frac{C^k |z|^k}{k^{(\frac12 + c)k}} .$$
Using the definition of $D$ it suffices to show that given any $x_1,\dots,x_k$, $y_1,\dots,y_k$ it holds that for some $C > 0$ finite
$$\left| K\begin{pmatrix} x_1, x_2,\dots,x_k \\ y_1, y_2,\dots,y_k \end{pmatrix} \right| \le (Ck^{1/2})^k\, k^{-ck} .$$
Note that in comparison with (12.1) we gain a factor of $k^{-ck}$. To see this it suffices to show that
$$\left| K\begin{pmatrix} x_1, x_2,\dots,x_k \\ y_1, y_2,\dots,y_k \end{pmatrix} \right| \le (Ck^{1/2})^k \prod_{j=1}^{k-1} |y_{j+1}-y_j|^{c}$$
(note that the constant $C$ may be different in different displays). Indeed, wlog we may assume $y_1 \le y_2 \le \dots \le y_k$ (reordering the $y_j$ only changes the sign of the determinant); then, since the gaps $y_{j+1}-y_j$ sum to at most $1$, by the arithmetic-geometric mean inequality we have $\prod_{j=1}^{k-1} |y_{j+1}-y_j| \le \left(\frac{1}{k-1}\right)^{k-1}$, as desired.
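Now, note that if $c_1,\dots,c_k$ are $k$ column vectors ($k\times 1$) then
$$|\det(c_1|\dots|c_k)| = |\det(c_1 | c_2 - c_1 | \dots | c_k - c_{k-1})| ,$$
therefore using Hadamard's inequality (in its Euclidean form) we obtain
$$\left| K\begin{pmatrix} x_1, x_2,\dots,x_k \\ y_1, y_2,\dots,y_k \end{pmatrix} \right| \le \Big( \sum_{n=1}^{k} |K(x_n,y_1)|^2 \Big)^{1/2} \prod_{j=1}^{k-1} \Big( \sum_{n=1}^{k} |K(x_n,y_{j+1}) - K(x_n,y_j)|^2 \Big)^{1/2}$$
which implies the desired estimate using Hölder continuity of $K$.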
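As an illustration of Theorem 41 (not part of the proof), consider the kernel $K(x,y) = \min(x,y)$, which is Lipschitz and hence Hölder continuous with $c = 1$. This example and the facts used about it are our own additions: its eigenfunctions on $[0,1]$ are $\sin((j-\tfrac12)\pi x)$ with eigenvalues $\lambda_j = ((j-\tfrac12)\pi)^{-2}$, so the theorem predicts $D(z) = \prod_j (1+z\lambda_j) = \cosh\sqrt{z}$ and $\sum_j \lambda_j = \int_0^1 x\,dx = \tfrac12$. The sketch compares a discretized determinant (midpoint grid, an assumption) with a truncated product.

```python
import math


def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def eigenvalue(j):
    """j-th eigenvalue of the min(x, y) kernel on [0, 1] (a standard fact)."""
    return 1.0 / (((j - 0.5) * math.pi) ** 2)


def D_discrete(z, n):
    """det(I + z*h*K(x_i, x_j)) on the midpoint grid, K(x, y) = min(x, y)."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    m = [[(1.0 if i == j else 0.0) + z * h * min(xs[i], xs[j]) for j in range(n)]
         for i in range(n)]
    return det(m)


def D_product(z, terms):
    """Truncated product prod_{j <= terms} (1 + z*lambda_j)."""
    p = 1.0
    for j in range(1, terms + 1):
        p *= 1.0 + z * eigenvalue(j)
    return p


z = 1.0
d_num = D_discrete(z, 60)
d_prod = D_product(z, 2000)
trace_sum = sum(eigenvalue(j) for j in range(1, 2001))
print(d_num, d_prod, math.cosh(math.sqrt(z)), trace_sum)
```

Both approximations should agree with $\cosh(1)$ up to the discretization and truncation errors, and the partial sums of the eigenvalues approach $\tfrac12$.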
12.3 Fredholm determinant for operators on Hilbert spaces
Let H be a separable Hilbert space over C. The Fredholm determinant det(1 + K) could
be defined for trace class operators K on a Hilbert space H. Note that this is not the same
as the setting considered in the last section since C[0, 1] with the sup norm is not a Hilbert
space.
We first define the singular values of a compact operator $T$ on $H$. Let $T^*$ be its adjoint; clearly $T^*T$ is a nonnegative self-adjoint operator on $H$, so one could define its square root $A = \sqrt{T^*T}$ using functional calculus. Note that $T^*T$ and $A$ are both compact operators with nonnegative eigenvalues.
The singular values of T are defined to be the positive eigenvalues of A, counting with
multiplicity.
Definition: We say that $T$ is a trace class operator if the sum of its singular values is finite (counting with multiplicity). In that case the trace norm of $T$ is defined to be this sum, denoted by $\|T\|_{tr}$.
Properties: $T$ and $T^*$ have the same trace norm, and if $T$ is trace class then so are $TB$ and $BT$ where $B$ is any bounded operator on $H$, and $\|\cdot\|_{tr}$ satisfies the triangle inequality
$$\|T_1 + T_2\|_{tr} \le \|T_1\|_{tr} + \|T_2\|_{tr} .$$
This inequality is an immediate consequence of the following equivalent characterization of the trace norm:
Lemma 10. For any trace class operator $T$
$$\|T\|_{tr} = \sup_{(f_n),(e_n)} \sum_n |\langle Tf_n, e_n\rangle|$$
where the sup is taken over all pairs $(f_n)$ and $(e_n)$ of orthonormal bases of $H$.
To show this characterization we will first derive the polar factorization of $T$: namely there is a partial unitary operator $U$ (a partial isometry) such that $T = UA$; here unitarity of $U$ simply means that $U^*U$, when acting on the closure of the range of $A$, is the identity operator. This should be thought of as the operator analogue of the usual polar factorization of a complex number.
To define $U$, note that $\|Au\|^2 = \langle A^2u,u\rangle = \langle T^*Tu,u\rangle = \|Tu\|^2$ for all $u\in H$, therefore we may define an isometry $U$ from range$(A)$ to range$(T)$ by mapping $Au$ to $Tu$; it extends by continuity to the closure of range$(A)$.
One then extends $U$ to $H$ by letting $U$ be zero on the orthogonal complement of range$(A)$. It follows immediately that $U^*H$ is contained in the closure of range$(A)$: one simply notices that for every $z$ in the orthogonal complement of range$(A)$ and $z'\in H$ it holds that $\langle z, U^*z'\rangle = \langle Uz, z'\rangle = 0$.
Now let $z$ be in the closure of the range of $A$. We want to show that $(U^*U - 1)z = 0$, which is the desired local unitary property of $U$. Using the isometric property of $U$ on range$(A)$, for every $z'$ in the closure of the range of $A$ we have
$$\langle z', z\rangle = \langle Uz', Uz\rangle = \langle z', U^*Uz\rangle$$
therefore $(U^*U-1)z$ belongs to the orthogonal complement of range$(A)$. But $U^*U - 1$ leaves the closure of range$(A)$ invariant since the range of $U^*$ is inside it. Thus $(U^*U-1)z$ lies both in a subspace and in its orthogonal complement, so it is zero. This completes the proof of the local unitary property of $U$.
We are now back to proving the above equivalent characterization of the trace class norm of $T$. Let $(F_n)$ be a complete set of normalized eigenvectors of $A = \sqrt{T^*T}$. (Note that since $A^* = A$ and $A$ is compact we could find such a set.) Let $G_n = UF_n$. (Note that $\|F_n\| = 1$ and $\|G_n\| \le 1$, with equality when $F_n$ lies in the range of $A$.) Then
$$\sum_n |\langle TF_n, G_n\rangle| = \sum_n |\langle UAF_n, UF_n\rangle| = \sum_n |\langle AF_n, F_n\rangle| = \|T\|_{tr}$$
therefore it remains to show that
$$\sum_n |\langle Tf_n, e_n\rangle| \le \|T\|_{tr}$$
for any pair of orthonormal bases $(f_n)$ and $(e_n)$. Let $s_n$ be the singular values of $T$, namely $AF_n = s_nF_n$. Then expanding $f_n$ in the basis $(F_k)$ we have
$$f_n = \sum_k \langle f_n, F_k\rangle F_k, \qquad\text{so that}\qquad Tf_n = \sum_k s_k \langle f_n, F_k\rangle G_k,$$
and by an application of Fubini's theorem (it will be clear from the proof that the double sum is absolutely summable)
$$\sum_n |\langle Tf_n, e_n\rangle| \le \sum_k |s_k| \sum_n |\langle f_n, F_k\rangle \langle G_k, e_n\rangle|$$
$$\le \sum_k |s_k| \Big(\sum_n |\langle f_n, F_k\rangle|^2\Big)^{1/2} \Big(\sum_n |\langle G_k, e_n\rangle|^2\Big)^{1/2}$$
$$\le \sum_k |s_k|\, \|F_k\|\, \|G_k\| = \sum_k |s_k| = \|T\|_{tr} .$$
This completes the proof of the characterization.
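Lemma 10 can be explored numerically in the simplest case $H = \mathbb{R}^2$. The sketch below is only an illustration under our own assumptions: we use the elementary identity $(s_1+s_2)^2 = \|T\|_F^2 + 2|\det T|$ for $2\times 2$ matrices to compute the trace norm, and we scan over rotated orthonormal bases (rotations suffice here because the absolute values ignore sign changes of basis vectors). The matrix `T` and all helper names are hypothetical.

```python
import math


def trace_norm_2x2(t):
    """s_1 + s_2 for a real 2x2 matrix, using (s_1 + s_2)^2 = ||T||_F^2 + 2|det T|."""
    fro2 = sum(v * v for row in t for v in row)
    d = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    return math.sqrt(fro2 + 2.0 * abs(d))


def rotated_basis(theta):
    """Orthonormal basis of R^2 obtained by rotating the standard basis by theta."""
    return [(math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta))]


def pairing_sum(t, a, b):
    """sum_n |<T f_n, e_n>| with (f_n) rotated by a and (e_n) rotated by b."""
    total = 0.0
    for f, e in zip(rotated_basis(a), rotated_basis(b)):
        tf = (t[0][0] * f[0] + t[0][1] * f[1], t[1][0] * f[0] + t[1][1] * f[1])
        total += abs(tf[0] * e[0] + tf[1] * e[1])
    return total


T = [[1.0, 2.0], [0.0, 1.0]]
tn = trace_norm_2x2(T)

steps = 180
values = [pairing_sum(T, math.pi * i / steps, math.pi * j / steps)
          for i in range(steps) for j in range(steps)]
best = max(values)
print(tn, best)
```

Every value stays below the trace norm, and the largest value on the grid comes close to it, consistent with the supremum being attained at the bases coming from the polar factorization.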

Trace: Given a trace class operator $T$ one may define a linear functional, namely the trace of $T$,
$$\mathrm{Trace}(T) = \sum_n \langle Tf_n, f_n\rangle$$
where $(f_n)$ is any orthonormal basis of $H$. This definition is independent of the choice of the basis. Let $(g_n)$ be another orthonormal basis; then by expanding $f_n$ into this new basis we have
$$\sum_n \langle Tf_n, f_n\rangle = \sum_n \sum_k \langle f_n, g_k\rangle \langle Tg_k, f_n\rangle ;$$
it is not hard to see that this double sum is absolutely convergent (using Cauchy-Schwarz). Then by Fubini
$$\sum_n \langle Tf_n, f_n\rangle = \sum_k \Big( \sum_n \langle f_n, g_k\rangle \langle Tg_k, f_n\rangle \Big)$$
and using orthogonality of $f_n$ and normalization of $f_n$ we obtain
$$= \sum_k \Big( \sum_n \sum_j \langle f_n, g_k\rangle \langle f_j, f_n\rangle \langle Tg_k, f_j\rangle \Big) = \sum_k \langle Tg_k, g_k\rangle .$$
By Lemma 10 (with $e_n = f_n$) it is clear that $|\mathrm{Trace}(T)| \le \|T\|_{tr}$ and the sum defining the trace is absolutely convergent.
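In finite dimensions the basis independence of the trace is easy to observe. The sketch below uses a hypothetical $3\times 3$ real matrix and three orthonormal bases of $\mathbb{R}^3$ of our own choosing, and evaluates $\sum_n \langle Tf_n, f_n\rangle$ in each of them.

```python
import math


def apply(t, v):
    """Matrix-vector product T v."""
    return [sum(t[i][j] * v[j] for j in range(len(v))) for i in range(len(t))]


def inner(u, v):
    """Real inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def trace_in_basis(t, basis):
    """sum_n <T f_n, f_n> over the given orthonormal basis."""
    return sum(inner(apply(t, f), f) for f in basis)


T = [[2.0, -1.0, 0.5],
     [3.0, 0.0, 1.0],
     [-2.0, 4.0, 1.5]]

standard = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

c, s = math.cos(0.7), math.sin(0.7)
rotated = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

r2, r3, r6 = math.sqrt(2.0), math.sqrt(3.0), math.sqrt(6.0)
mixed = [[1 / r3, 1 / r3, 1 / r3], [1 / r2, -1 / r2, 0.0], [1 / r6, 1 / r6, -2 / r6]]

traces = [trace_in_basis(T, b) for b in (standard, rotated, mixed)]
print(traces)
```

All three values agree with the sum of the diagonal entries of `T`.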
Now, Lidskii's trace formula says that
Theorem 42 (Lidskii). If $T$ is trace class on a separable Hilbert space $H$ then
$$\mathrm{Trace}(T) = \sum_j \lambda_j$$
where $\lambda_j$ are the eigenvalues of $T$ (counted with algebraic multiplicity).

Note that $T$ is compact so it has a countable set of nonzero eigenvalues. It is clear that the sum of the eigenvalues is bounded above by the trace norm.
Proof. The proof of this formula could be divided into two steps: first one considers the case when $T$ does not have any nonzero eigenvalues, and then in the second step we reduce the general case to this setting.
Let's assume for now that $T$ does not have any nonzero eigenvalues; then we want to show that $\mathrm{Trace}(T) = 0$. Let $s_1, s_2, \dots$ be the singular values of $T$. Then we'll show that for every $\lambda > 0$ and every $M$
$$e^{\lambda |\mathrm{Trace}(T)|} \le O\big((1+|\lambda|)^M\big)\, e^{\lambda \sum_{j>M} s_j} \tag{12.2}$$
(the implicit constant is independent of $\lambda$). From here, by taking logarithms, dividing by $\lambda$ and sending $\lambda\to\infty$ we obtain $|\mathrm{Trace}(T)| \le \sum_{j>M} s_j$; and the desired estimate now follows by sending $M\to\infty$, from the fact that $\sum_j s_j = \|T\|_{tr} < \infty$. To show the above claim, we approximate $T$ by finite rank operators, say $T_n = P_nTP_n \to T$ where $P_n$ is the projection onto the span of the first $n$ basis vectors of $H$; here one may fix any orthonormal basis. Let $D_n(\lambda) = \det(1+\lambda T_n)$; here the determinant is defined using linear algebra, in other words if $\Lambda_n$ is the set of eigenvalues of $T_n$ then $D_n(\lambda) = \prod_{\alpha\in\Lambda_n} (1+\lambda\alpha)$.
We will show that uniformly on any compact subset of the complex plane it holds that
$$e^{\lambda\, \mathrm{Trace}(T)} = \lim_{n\to\infty} D_n(\lambda) . \tag{12.3}$$
Note that by definition we have $\|T_n - T\| \to 0$ (in operator norm) and $\mathrm{Trace}(T_n) \to \mathrm{Trace}(T)$. Since the spectral radius of $T$ is $0$ it is clear that the spectral radius $\sigma(T_n)$ of $T_n$ converges to $0$ too. In particular given any bounded subset of the complex plane we may choose $n$ large such that this bounded set is contained inside the ball of radius $1/\sigma(T_n)$. Furthermore one could also show that $s_j(T_n) \le s_j(T)$ if the singular values of $T$ and $T_n$ are ordered in decreasing order.
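Now, it is not hard to see that
$$\frac{D_n'(\lambda)}{D_n(\lambda)} = \sum_{\alpha\in\Lambda_n} \frac{\alpha}{1+\lambda\alpha} = \mathrm{Trace}(T_n) + O\Big( \sum_{k\ge 2} (|\lambda|\sigma(T_n))^{k-1} \|T_n\|_{tr} \Big)$$
therefore uniformly over $\lambda$ in a bounded subset of $\mathbb{C}$ it holds that
$$\lim_{n\to\infty} \Big( \frac{D_n'(\lambda)}{D_n(\lambda)} - \mathrm{Trace}(T) \Big) = 0$$
which (since $D_n(0) = 1$) implies the desired limiting equality (12.3).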

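Now using (12.3), for any $\lambda > 0$ we have
$$e^{\lambda |\mathrm{Trace}(T)|} \le \liminf_{n\to\infty} \prod_{\alpha\in\Lambda_n} (1+\lambda|\alpha|) .$$
We will show that the last limit is bounded above by $\prod_{j=1}^{\infty} (1+\lambda s_j(T))$ where $s_j(T)$ are the singular values of $T$, which easily implies the desired estimate (12.2): indeed $\prod_{j\le M}(1+\lambda s_j(T)) = O((1+\lambda)^M)$ and $\prod_{j>M}(1+\lambda s_j(T)) \le e^{\lambda\sum_{j>M}s_j(T)}$. To show this, using $s_j(T_n) \le s_j(T)$ it suffices to show that
$$\sum_{\alpha\in\Lambda_n} \log(1+\lambda|\alpha|) \le \sum_{j=1}^{n} \log(1+\lambda s_j(T_n)) .$$
Thanks to convexity of $t \mapsto \log(1+\lambda e^{t})$, one could show that this inequality is a consequence of the fact that, for every $N \le n$,
$$\prod_{j=1}^{N} |\alpha_j| \le \prod_{j=1}^{N} s_j(T_n),$$
where $\alpha_1,\alpha_2,\dots$ are the eigenvalues of $T_n$ ordered by decreasing modulus, which in turn could be easily verified (equality holds when $N = n$).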

Now to reduce the general case to the above setting, we consider the subspace $K_1$ of $H$ spanned by the eigenfunctions and generalized eigenfunctions of $T$. Let $K_2$ be the orthogonal complement of $K_1$. Using linear algebra it is clear that the trace of the restriction of $T$ to $K_1$ is exactly the sum of the eigenvalues of $T$. More precisely if $(g_n)$ is an orthonormal basis for $K_2$ and $(f_m)$ is an orthonormal basis for $K_1$ then we may take the union of the two as a basis for $H$ and
$$\mathrm{Trace}(T) = \sum_m \langle Tf_m, f_m\rangle + \sum_n \langle Tg_n, g_n\rangle = \sum_{\alpha\in\Lambda(T)} \alpha + \sum_n \langle g_n, T^*g_n\rangle$$
thus using the previous argument it suffices to show that $T^*$ leaves $K_2$ invariant and the only eigenvalue of $T^*$ on $K_2$ is $0$ (it is clear that the restriction $T_2$ of $T^*$ to $K_2$ is also compact and trace class). These properties could be easily checked.

Determinant: We now discuss $\det(1+T)$ for trace class operators $T$ on a separable Hilbert space $H$.
Define an inner product on $H^k$ by
$$\langle (w_1,\dots,w_k), (v_1,\dots,v_k)\rangle = \det(\langle w_i, v_j\rangle)_{i,j} .$$
Then $T$ extends to a trace class operator $T_k$ on $H^k$, defined by $T_k(w_1,\dots,w_k) = (Tw_1,\dots,Tw_k)$, with $\|T_k\|_{tr} \le \|T\|_{tr}^k$. Then define
$$\det(1+T) = \sum_{k\ge 0} \mathrm{Trace}(T_k) .$$
One could check that if $T$ is finite rank then this determinant is the same as what we would obtain from linear algebra, and $\det(1+\cdot)$ is locally Lipschitz with respect to the trace class norm. Approximating $T$ by a sequence of finite rank operators (for which $\det(1+T)$ could be defined using standard linear algebra), we could show that $\det(1+T)$ is the limit of the corresponding determinants, and thus this limit is independent of the choice of the sequence. Now, one could use the polar decomposition $T = UA$ and approximate $A$ by projections into finite dimensional subspaces of $H$ spanned by eigenfunctions of $A$. By this approximation scheme it can be shown that
$$\det(1+T) = \prod_j (1+\lambda_j)$$
where $\lambda_j$ are the eigenvalues of $T$, and
$$\det[(1+T_1)(1+T_2)] = \det(1+T_1)\det(1+T_2)$$
for any two trace class operators $T_1$ and $T_2$.
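In finite dimensions the definition $\det(1+T) = \sum_k \mathrm{Trace}(T_k)$ can be tested directly. The sketch below assumes the standard finite-dimensional identification (not spelled out in these notes) of $\mathrm{Trace}(T_k)$ with the sum of the $k\times k$ principal minors of the matrix of $T$. The $4\times 4$ matrix and the helper name `trace_of_kth_power` are hypothetical.

```python
from itertools import combinations


def det(m):
    """Determinant via Gaussian elimination; the empty matrix has determinant 1."""
    a = [list(row) for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def trace_of_kth_power(t, k):
    """Sum of the k x k principal minors of t (assumed to equal Trace(T_k))."""
    n = len(t)
    return sum(det([[t[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))


T = [[0.5, 1.0, 0.0, 2.0],
     [-1.0, 0.3, 0.7, 0.0],
     [0.2, 0.0, -0.4, 1.0],
     [0.0, 0.5, 0.1, 0.6]]
n = len(T)

one_plus_T = [[(1.0 if i == j else 0.0) + T[i][j] for j in range(n)] for i in range(n)]
lhs = det(one_plus_T)
rhs = sum(trace_of_kth_power(T, k) for k in range(n + 1))
print(lhs, rhs)
```

The $k=0$ term is $1$ and the $k=1$ term is the ordinary trace, so to first order $\det(1+T) \approx 1 + \mathrm{Trace}(T)$.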

Using $\exp(T) = 1 + T + \dots + \frac{T^n}{n!} + \dots$ we could also define $\det(\exp T)$, and in fact $\det(\exp T) = e^{\mathrm{Trace}(T)}$.
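The identity $\det(\exp T) = e^{\mathrm{Trace}(T)}$ can be checked for matrices by summing the exponential series. The sketch below uses a hypothetical $3\times 3$ matrix and a truncation at 30 terms (our own choice, ample for a matrix of this size).

```python
import math


def matmul(a, b):
    """Product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def expm(t, terms=30):
    """Truncated series 1 + T + T^2/2! + ... + T^terms/terms!."""
    n = len(t)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for m in range(1, terms + 1):
        power = [[v / m for v in row] for row in matmul(power, t)]  # T^m / m!
        result = [[result[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return result


def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


T = [[0.2, 1.0, 0.0],
     [-0.5, 0.1, 0.3],
     [0.0, 0.4, -0.6]]

lhs = det3(expm(T))
rhs = math.exp(sum(T[i][i] for i in range(3)))
print(lhs, rhs)
```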
As an example, if $T$ is an integral operator on some $L^2(X)$ with kernel $K(x,y)$ with mild assumptions on $K$ and $X$, then it could be shown that
$$\mathrm{Trace}(T_k) = \frac{1}{k!} \int\cdots\int K\begin{pmatrix} x_1, x_2,\dots,x_k \\ x_1, x_2,\dots,x_k \end{pmatrix} dx_1\dots dx_k$$
and the definition of the determinant coincides with the previous section.
