
Introduction to Functional Analysis

Michael Müger
26.01.2021

Abstract
These are notes for the course Inleiding in de Functionaalanalyse, Autumn 2020/21 (14×90
min.). They are also recommended as background for my courses on Operator Algebras.

Contents

Part I: Basics 4
1 Rough introduction 4

2 Setting the stage 6


2.1 Topological groups, fields, vector spaces . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Normed spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Brief look at more general notions . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.1 Metrizable TVS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.2 Locally convex and Fréchet spaces . . . . . . . . . . . . . . . . . . . . . . 12

3 Normed and Banach space basics 14


3.1 Why we care about completeness. Closed subspaces . . . . . . . . . . . . . . . . 14
3.2 Linear maps: bounded ⇔ continuous . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Spaces of bounded linear maps. First glimpse of Banach algebras . . . . . . . . . 18

4 The sequence spaces and their dual spaces 19


4.1 Basics. 1 ≤ p ≤ ∞: Hölder and Minkowski inequalities . . . . . . . . . . . . . . . 20
4.2 ? Aside: The translation-invariant metric dp for 0 < p < 1 . . . . . . . . . . . . . 22
4.3 c00 and c0 . Completeness of `p (S, F) and c0 (S, F) . . . . . . . . . . . . . . . . . . 23
4.4 Separability of `p (S, F) and c0 (S, F) . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.5 Dual spaces of `p (S, F), 1 ≤ p < ∞, and c0 (S, F) . . . . . . . . . . . . . . . . . . 26
4.6 Outlook on general Lp -spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

5 Basics of Hilbert spaces 29


5.1 Inner products. Cauchy-Schwarz inequality . . . . . . . . . . . . . . . . . . . . . 29
5.2 The parallelogram and polarization identities . . . . . . . . . . . . . . . . . . . . 32
5.3 Orthogonality, subspaces, orthogonal projections . . . . . . . . . . . . . . . . . . 33
5.3.1 Basic Hilbert space geometry . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.3.2 Closed subspaces, orthogonal complement, and orthogonal projections . . 34

5.4 The dual space H ∗ of a Hilbert space . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.5 Orthonormal sets. Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.6 ? Tensor products of Hilbert spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 41

6 Quotient spaces and complemented subspaces 42


6.1 Quotient spaces of Hilbert spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
6.2 Quotient spaces of Banach spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.3 Complemented subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

7 Hahn-Banach theorem and its applications 45


7.1 First version of Hahn-Banach over R . . . . . . . . . . . . . . . . . . . . . . . . . 46
7.2 Hahn-Banach theorem for (semi)normed spaces . . . . . . . . . . . . . . . . . . . 47
7.3 First applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7.4 Reflexivity of Banach spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

8 Uniform boundedness theorem: Two versions 50


8.1 The weak version, using only countable choice . . . . . . . . . . . . . . . . . . . . 50
8.2 Applications: Banach-Steinhaus, Hellinger-Toeplitz . . . . . . . . . . . . . . . . . 51
8.3 The strong version, using Baire’s theorem . . . . . . . . . . . . . . . . . . . . . . 53
8.4 Application: A dense set of continuous functions with divergent Fourier series . . 54

9 The Open Mapping Theorem and its relatives 55


9.1 The Open Mapping Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
9.2 The Bounded Inverse Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
9.3 ? The Closed Graph Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
9.4 Boundedness below. Invertibility . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

Part II: Spectral theory of operators and algebras 61


10 Spectrum of bounded operators and of (elements of) Banach algebras 61
10.1 The spectra of A ∈ B(E) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
10.2 The spectrum in a unital Banach algebra . . . . . . . . . . . . . . . . . . . . . . 62
10.3 Examples of spectra of operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
10.4 Characters. Spectrum of a Banach algebra . . . . . . . . . . . . . . . . . . . . . . 70

11 Transpose and adjoint of bounded operators. C ∗ -algebras 72


11.1 The transpose of a bounded Banach space operator . . . . . . . . . . . . . . . . . 72
11.2 The adjoint of a bounded Hilbert space operator . . . . . . . . . . . . . . . . . . 73
11.3 Involutions. Definition of C ∗ -algebras . . . . . . . . . . . . . . . . . . . . . . . . 76
11.4 Spectrum of elements of a C ∗ -algebra . . . . . . . . . . . . . . . . . . . . . . . . 78

12 Functional calculus in Banach and C ∗ -algebras 79


12.1 Some functional calculus in Banach algebras . . . . . . . . . . . . . . . . . . . . . 79
12.2 Continuous functional calculus for self-adjoint elements in a C ∗ -algebra . . . . . 81
12.3 Continuous functional calculus for normal elements in a C ∗ -algebra . . . . . . . . 84

13 More on Hilbert space operators 85
13.1 Normal operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
13.2 Self-adjoint operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
13.3 Positive operators. Polar decomposition . . . . . . . . . . . . . . . . . . . . . . . 89

14 Compact operators 91
14.1 Compact Banach space operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
14.2 Fredholm alternative. The spectrum of compact operators . . . . . . . . . . . . . 95
14.3 Spectral theorems for compact Hilbert space operators . . . . . . . . . . . . . . . 97
14.4 ? Hilbert-Schmidt operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

15 Spectral theorems for normal Hilbert space operators 102


15.1 Spectral theorem: Multiplication operator version . . . . . . . . . . . . . . . . . . 102
15.2 Borel functional calculus for normal operators . . . . . . . . . . . . . . . . . . . . 105
15.3 Normal operators vs. projection-valued measures . . . . . . . . . . . . . . . . . . 108

16 Weak and weak ∗-topologies. Alaoglu’s theorem 110


16.1 The weak topology of a Banach space . . . . . . . . . . . . . . . . . . . . . . . . 110
16.2 The weak-∗ topology on the dual space of a Banach space . . . . . . . . . . . . . 111

17 The Gelfand homomorphism for commutative Banach and C ∗ -algebras 113


17.1 The topology of Ω(A). The Gelfand homomorphism . . . . . . . . . . . . . . . . 113
17.2 Application: Absolutely convergent Fourier series . . . . . . . . . . . . . . . . . . 115
17.3 C ∗ -algebras. Continuous functional calculus revisited . . . . . . . . . . . . . . . . 116

A Some topics from topology and measure theory 118


A.1 Unordered sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
A.2 Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
A.3 The Stone-Čech compactification . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
A.4 Reminder of the choice axioms and Zorn’s lemma . . . . . . . . . . . . . . . . . . 122
A.5 Baire’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
A.6 Tietze’s extension theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
A.7 The Stone-Weierstrass theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
A.7.1 Weierstrass’ theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
A.7.2 The Stone-Weierstrass theorem in the simplest case . . . . . . . . . . . . 126
A.7.3 Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
A.8 Totally bounded sets in metric spaces . . . . . . . . . . . . . . . . . . . . . . . . 128
A.9 The Arzelà-Ascoli theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
A.10 Some notions from measure and integration theory . . . . . . . . . . . . . . . . . 130

B Supplements for the curious. NOT part of the course 131


B.1 Functional analysis over fields other than R and C? . . . . . . . . . . . . . . . . . 131
B.2 The dual space of `∞ (S, F) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
B.3 c0 (N, F) ⊆ `∞ (N, F) is not complemented . . . . . . . . . . . . . . . . . . . . . . . 135
B.4 Banach spaces with no or multiple pre-duals . . . . . . . . . . . . . . . . . . . . . 136
B.5 Normability. Separation theorems. Goldstine’s theorem . . . . . . . . . . . . . . 137
B.5.1 Minkowski functionals. Criteria for normability and local convexity . . . . 137
B.5.2 Hahn-Banach separation theorem. Goldstine’s theorem . . . . . . . . . . 140
B.6 Strictly convex and uniformly convex Banach spaces . . . . . . . . . . . . . . . . 141

B.6.1 Strict convexity and uniqueness in the Hahn-Banach theorem . . . . . . . 141
B.6.2 Uniform convexity and reflexivity. Duality of Lp -spaces reconsidered . . . 142
B.7 Schur’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
B.8 The Fuglede-Putnam theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
B.9 Glimpse of non-linear FA: Schauder’s fixed point theorem . . . . . . . . . . . . . 146

C Tentative schedule (14 lectures à 90 minutes) 148

References 149

1 Rough introduction
We will begin with a quick delineation of what we will discuss – and what not!
• “Classical analysis” is concerned with ‘analysis in finitely many dimensions’. ‘Functional
analysis’ is the generalization or extension of classical analysis to infinitely many dimen-
sions. Before one can try to make sense of this, one should make the first sentence more
precise. Since the creation of general topology, one can talk about convergence and con-
tinuity in very general terms. As far as I see it, this is not analysis, even if infinite sums
(=series) are studied. Analysis proper starts as soon as one talks about differentiation
and/or integration. Differentiation has to do with approximating functions locally by lin-
ear ones, and for this one needs the spaces considered to be vector spaces (at least locally).
This is the reason why most of classical analysis considers functions between the vector
spaces Rn and Cn (or subsets of them). (In a second step, one can then generalize to
spaces that look like Rn only locally by introducing topological and smooth manifolds and
their generalizations, but the underlying model of Rn remains important.) On the other
hand, integration, at least in the sense of the modern theory, can be studied much more
generally, i.e. on arbitrary sets equipped with a measure (defined on some σ-algebra). Such
a set can be very far from being a vector space or manifold, for example by being totally
disconnected.
• In view of the above, it is not surprising that functional analysis is concerned with (pos-
sibly) infinite dimensional vector spaces and continuous maps between them. (Again, one
can then generalize to spaces that look like a vector space only locally, but this would
be considered infinite dimensional geometry, not functional analysis.) In addition to the
vector space structure one needs a topology, which naturally leads to topological vector
spaces, which I will define soon.
• The importance of topologies is not specific to infinite dimensions. The point rather is
that Rn , Cn have unique topologies making them topological vector spaces. This is no
more true in infinite dimensions!
• Actually, ‘functional analysis’ most often studies only linear maps between topological
vector spaces so that this domain of study should be called ‘linear functional analysis’, but
this is done only rarely, e.g. [68]. Allowing non-linear maps leads to non-linear functional
analysis. This course will discuss only linear functional analysis. Thorough mastery of the
latter is needed anyway before one can think about non-linear FA or infinite dimensional
geometry. For the simplest result of non-linear functional analysis, see Section B.9. For
more, you could have a look at, e.g., [80, 14, 53]. There even is a five volume treatise [86]!

• The restriction to linear maps means that the notion of differentiation becomes point-
less, the derivative of f (x) = Ax + b being just A everywhere. But there are many
non-trivial connections between linear FA and integration (and measure) theory. For
example, every measure space (X, A, µ) gives rise to a family of topological vector spaces
Lp (X, A, µ), p ∈ (0, ∞], and integration provides linear functionals. Proper appreciation of
these matters requires some knowledge of measure and integration theory, cf. e.g. [11, 69].
I will not suppose that you have followed a course on this subject (but if you haven’t, I
strongly recommend that you do so on the next occasion or, at least, read the appendix in
MacCluer’s book [42]). Yet, one can get a reasonably good idea by focusing on sequence spaces, for
which no measure theory is required, see Section 4.
• One should probably consider linear functional analysis as an infinite dimensional and
topological version of linear algebra rather than as a branch of analysis! This might lead
one to suspect linear FA to be slightly boring, but this would be wrong for many reasons:
– Functional analysis (linear or not) leads to very interesting (and arbitrarily challeng-
ing) technical questions (most of which reduce to very easy ones in finite dimensions).
– Linear FA is essential for non-linear FA, like variational calculus, and the theory of
differential equations – not only linear ones!
– Quantum theory [38] is a linear theory and cannot be done properly without func-
tional analysis, despite the fact that many physicists think so! Conversely, many
developments in FA were directly motivated by quantum theory.
• The above could give the impression that functional analysis arose from the wish of gen-
eralizing analysis to infinitely many dimensions. This may have played a role for some
of its creators, but its beginnings (and much of what is being done now) were mostly
motivated by finite dimensional “classical”1 analysis: If U ⊆ Rn , the set of functions (pos-
sibly continuous, differentiable, etc.) from U to Rm is a vector space as soon as we put
(cf + dg)(x) = cf (x) + dg(x). Unless U is a finite set, this vector space will be infinite
dimensional. Now one can consider certain operations on such vector spaces, like differentiation
C ∞ (U ) → C ∞ (U ), f 7→ f 0 , or integration f 7→ ∫U f . This sort of consideration
provided the initial motivation for the development of functional analysis, and indeed FA
now is a very important tool for the study of ordinary and partial differential equations on
finite dimensional spaces. See e.g. [7, 20]. The relevance of FA is even more obvious if one
studies differential equations in infinitely many dimensions. In fact, it is often useful to
study a partial differential equation (like heat or wave equation) by singling out one of the
variables (typically ‘time’) and studying the equation as an ordinary differential equation
in an infinite dimensional space of functions. FA is also essential for variational calculus
(which in a sense is just a branch of differential calculus in infinitely many dimensions).
• In view of the above, FA studies abstract topological vector spaces as well as ‘concrete’
spaces, whose elements are functions. In order to obtain a proper understanding of FA,
one needs some familiarity with both aspects.
Before we delve into technicalities, some further general remarks:
• The history of functional analysis is quite interesting, cf. e.g. the article [5], [56, Chapter
4] and the books [15, 45]. But clearly it makes little sense to study it before one has some
technical knowledge of FA. It is surprisingly intertwined with the development of linear
1 Ultimately, I find it quite futile to try and draw a neat line between “classical” and “modern” or functional
analysis, in particular since many problems in the former require methods from the latter for their proper treatment.

algebra. One would think that (finite dimensional) vector spaces, linear maps etc. were
defined much earlier than, e.g., Banach spaces, but this is not true. In fact, Banach’s2
book [3], based on his 1920 PhD thesis, is one of the first references containing the mod-
ern definition of a vector space. Some mathematicians, mostly Italian ones, like Peano,
Pincherle and Volterra, essentially had the modern definition already in the last decades
of the 19th century, but they had no impact since the usefulness of an abstract/axiomatic
approach was not yet widely appreciated. Cf. [36, Chapter 5] or [16, 46].
Here I limit myself to mentioning that the basics of functional analysis (Hilbert and Banach
spaces and bounded linear maps between them) were developed in the period 1900-1930.
Nevertheless, many important developments (locally convex spaces, distributions, operator
algebras) took place in 1930-1960. After that, functional analysis has split up into many
very specialized subfields that interact quite little with each other. The very interesting
article [81] ends with the conclusion that ‘functional analysis’ has ceased to exist as a
coherent field of study!
• The study of functional analysis requires a solid background in general topology. It may
well be that you’ll have to refresh and likely also extend yours. In Appendix A I have
collected brief accounts of the topics that – sadly – you are most likely not to have en-
countered before. All of them are contained in [65] (written by a functional analyst!), but
my favorite (I’m admittedly biased) reference is [47]. You should have seen Weierstrass’
theorem, but those of Tietze and Arzelà-Ascoli tend to vanish in the (pedagogical, not
factual) gap between general topology and functional analysis.
• The main reference for this course has been [42] for a number of years, but I am unenthu-
siastic about a number of aspects of it, which is why I wrote these notes. If you find them
too advanced, you might want to have a look at [54, 68, 70]. On the other hand, if you
want more, [55] is a good place to start, followed by [59, 39, 12, 64]. (The MasterMath
course currently uses [12].)
One word about notation (without guarantee of always sticking to it): General vector spaces,
but also normed spaces, are denoted V, W, . . ., normed spaces also as E, F, . . .. Vectors in such
spaces are e, f, . . . , x, y, . . .. Linear maps are always denoted A, B, . . ., except linear functionals
V → F, which are ϕ, ψ. Algebras are usually denoted A, B, . . . and their elements a, b, . . .. (For
A = B(E) this leads to inconsistency, but I cannot bring myself to using capital letters for
abstract algebra elements.)

Acknowledgment. I thank Bram Balkema, Victor Hissink Muller and Niels Vooijs for corrections,
but in particular Tim Peters for a huge number of them.

2 Setting the stage


2.1 Topological groups, fields, vector spaces
As said in the Introduction, functional analysis (even most of the non-linear version) is concerned
with vector spaces, allowing infinite dimensional ones. Large parts of linear algebra of course
work equally well for finite and infinite dimensional spaces. One aspect where problems arise in
infinite dimensions is the description of linear maps by matrices, for example since multiplication
2 Stefan Banach (1892-1945). Polish mathematician and pioneer of functional analysis. Also known for B. algebras,
B.’s contraction principle, the B.-Tarski paradox and the Hahn-B. and B.-Steinhaus theorems, etc.

of infinite matrices involves infinite summations, which require the introduction of topologies.
(Actually, in some restricted contexts infinite matrices still are quite useful.)
We begin with the following

2.1 Definition A topological group is a group (G, ·, 1) equipped with a topology τ such that
the group operations G×G → G, (g, h) 7→ gh and G → G, g 7→ g −1 are continuous (where G×G
is given the product topology). (For abelian groups, one often denotes the binary operation by
+ instead of ·.)

2.2 Example 1. If (G, ·, 1) is any group then it becomes a topological group by putting τ =
τdisc , the discrete topology on G.
2. The group (R, +, 0), where R is equipped with its standard topology, is easily seen to be
a topological group.
3. If n ∈ N3 and F ∈ {R, C} then the set GL(n, F) = {A ∈ Mn×n (F) | det(A) 6= 0} of
invertible n × n matrices is a group w.r.t. matrix product and inversion and in fact a topological
group when equipped with the subspace topology induced from Mn×n (F) ∼= Fn² .

2.3 Remark Topological groups – or rather matrix groups as in 3. above – are the subject of
my 3rd year course Continuous Matrix Groups, taught again next spring. They are
an important (and prototypical) case of Lie groups. The latter are a subject at Master level
that is very much worthy of study! 2

2.4 Definition A topological field is a field (F, +, 0, ·, 1) equipped with a topology on F such
that (F, +, 0) and (F\{0}, ·, 1) are topological groups. (Equivalently, all field operations are
continuous.)
It is very easy to check that R and C are topological fields when equipped with their standard
topologies. (So is Q with the topology induced from R.)

2.5 Exercise Prove the above claims.

2.6 Definition Let F be a topological field. Then a topological vector space (TVS) over F
is an F-vector space equipped with a topology τV (to be distinguished, obviously, from the
topology τF on F) such that the maps V × V → V, (x, y) 7→ x + y and F × V → V, (c, x) 7→ cx
are continuous.
(These conditions imply that V → V, x 7→ −x is continuous, so that (V, +, 0) is a topological
group, but not conversely.)
Again it is very easy to check that Rn and Cn are topological vector spaces over the topo-
logical fields R, C, respectively, when they are equipped with the euclidean topologies (=product
topologies on F × · · · × F).
In this course, the only topological fields considered are R and C. When a result holds for
either of the two, I will write F. But note that one can consider topological vector spaces over
other topological fields, like the p-adic ones Qp [24]. (But the resulting p-adic functional analysis
is quite different in some respects from the ‘usual’ one, cf. the comments in Section B.1 and the
literature, e.g. [62, 57].)

2.7 Exercise Let F be a topological field and V an F-vector space. Is it true that V , equipped
with the discrete topology, is a topological vector space over F? Prove or give a counterexample.
3 Throughout these notes, N = {1, 2, 3, . . .}, thus 0 6∈ N.

Now we can define:

2.8 Definition Functional analysis (ordinary, as opposed to p-adic) is concerned with topo-
logical vector spaces over R or C and continuous maps between them. Linear functional analysis
considers only linear maps.
As it turns out, the above notion of topological vector spaces is a bit too general to build a
satisfactory and useful theory upon it. Just as in topology it is often (but by no means always!)
sufficient to work with metric spaces, for most purposes it is usually sufficient to consider certain
subclasses of topological vector spaces. The following diagram illustrates some of these classes
and their relationships:

topological vector sp.        ⊃        metrized/F -sp.
         ∪                                    ∪
locally convex sp. ⊃ Fréchet sp. ⊃ normed/Banach sp. ⊃ (pre)Hilbert sp.

(Note that F -spaces, Fréchet4 , Banach and Hilbert5 spaces are assumed complete, but one also
has the non-complete versions. There is no special name for Fréchet spaces with completeness
dropped other than metrizable locally convex spaces. In the other cases, one speaks of metrized,
normed and pre-Hilbert spaces.)
The most useful of these classes are those in the bottom row. In fact, locally convex (vector)
spaces are general enough for almost all applications. They are thoroughly discussed in the
MasterMath course on functional analysis, while we will only briefly touch upon them. Most
of the time, we will be discussing Banach and Hilbert spaces. There is much to be said for
studying them in some depth before turning to locally convex spaces (or more general) spaces.
(Some books on functional analysis, like [64], begin with general topological vector spaces and
then turn to some special classes, but for a first encounter this does not seem appropriate. This
said, I also don’t see the point of beginning with proofs of many results on Hilbert spaces that
literally generalize to Banach spaces.)

2.2 Normed spaces


I assume that you remember the notion of a metric on a set X: A map d : X × X → [0, ∞)
satisfying d(x, y) = d(y, x) and d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ X and d(x, y) = 0 ⇔
x = y. Every metric d on X defines a topology τd on X, the smallest topology τ containing all
open balls B(x, r) = {y ∈ X | d(x, y) < r}. (The open balls then form a base, not just a subbase,
for τ .) A topology τ on X is called metrizable if there exists a metric d on X (not necessarily
unique) such that τ = τd . Metrizable topologies automatically have many nice properties, like
e.g. normality and, a fortiori, the Hausdorff property.
Normed spaces are a class of metrizable topological vector spaces. We first recall:

2.9 Definition Let V be a vector space over F ∈ {R, C}. A seminorm on V is a map V →
[0, ∞), x 7→ kxk such that

kx + yk ≤ kxk + kyk ∀x, y ∈ V. (Subadditivity)


kcxk = |c| kxk ∀c ∈ F, x ∈ V. (This implies k0k = 0 and k − xk = kxk.)
4 Maurice Fréchet (1878-1973). French mathematician. Introduced metric spaces in his 1906 PhD thesis.
5 David Hilbert (1862-1943). Eminent German mathematician who worked on many different subjects. Considered
the strongest and most influential mathematician in the decades around 1900, only Poincaré coming close.

(Note that by the above, kxk = ∞ is not allowed!) A norm is a seminorm satisfying also
kxk = 0 ⇒ x = 0.
A normed F-vector space is an F-vector space equipped with a norm.

2.10 Example 0. Clearly F ∈ {R, C} is a vector space over itself and kck := |c| defines a norm,
making F a complete normed F-vector space.
1. Let X be a compact topological space and V = C(X, F). Clearly, V is an F-vector space.
Now kf k = supx∈X |f (x)| is a norm on V . You probably know that the normed space (V, k · k)
is complete.
If X is non-compact then kf k can be infinite, but replacing C(X, F) by

Cb (X, F) = {f ∈ C(X, F) | kf k < ∞},

k · k again is a norm with which Cb (X, F) is complete.


2. Let n ∈ N and V = Cn . For x ∈ V and 1 ≤ p < ∞ (NB: p does not stand for prime!),
define

kxk∞ = max{|x1 |, . . . , |xn |},
kxkp = (|x1 |p + · · · + |xn |p )1/p .

(Note that all these k · kp including p = ∞ coincide if n = 1.) It is quite obvious that for
each p ∈ [1, ∞] we have kxkp = 0 ⇔ x = 0 and kcxkp = |c| kxkp . For p = 1 and p = ∞ also
the subadditivity is trivial to check using only |c + d| ≤ |c| + |d|. Subadditivity also holds for
1 < p < ∞, but is harder to prove. You have probably seen the proof for p = 2, which relies
on the Cauchy-Schwarz inequality. The proof for 1 < p < 2 and 2 < p < ∞ is similar, using
the inequality of Hölder instead. We will return to this and also prove that Rn , Cn is complete
w.r.t. any of the norms k · kp , p ∈ [1, ∞].
3. The above examples are easily generalized to infinite dimensions. Let S be any set. For
a function f : S → F and 1 ≤ p < ∞ define

kf k∞ = sups∈S |f (s)|,    kf kp = ( ∑s∈S |f (s)|p )1/p

with the understanding that (+∞)1/p = +∞. For the definition of infinite sums like ∑s∈S f (s)
see Appendix A.1. Now let

`p (S, F) = {f : S → F | kf kp < ∞}.

Now one can prove that k · kp is a complete norm on (`p (S, F), k · kp ) for each p ∈ [1, ∞]. We
will do this in Section 4.
4. Let (X, A, µ) be a measure space, f : X → F measurable and 1 ≤ p < ∞. We define

kf kp = ( ∫X |f |p dµ )1/p ,
kf k∞ = inf{M > 0 | µ({x ∈ X | |f (x)| > M }) = 0}

and
Lp (X, A, µ; F) = {f : X → F measurable | kf kp < ∞}.

Then kf kp = ( ∫X |f |p dµ)1/p is a seminorm on Lp (X, A, µ; F) for all 1 ≤ p < ∞.

However, in general k · kp is not a norm, since kf kp vanishes whenever f is zero almost
everywhere, i.e. µ(f −1 (F\{0})) = 0, which may well happen even if f is not identically zero. In
order to obtain a normed space one considers the quotient space Lp (X, A, µ) = Lp (X, A, µ)/{f ∈
Lp | kf kp = 0}. Going into the details would require too much measure theory. See [42,
Appendices A.1-A.3] for a crash course or [11, 69] for the full story.
There is an instructive special case: If S is a set, A = P(S) and µ(A) = #A (the counting
measure) then for every f : S → F we have ∫S f (s) dµ(s) = ∑s∈S f (s), where the integral, like
the (unordered) sum, exists if and only if ∑s∈S |f (s)| < ∞. Thus Lp (S, A, µ; F) = `p (S, F)6 .
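The norms k · kp of Example 2.10.2 are easy to experiment with numerically. The following minimal Python sketch (the function name p_norm is ad hoc, and the random spot-check is of course no substitute for the proof of Minkowski’s inequality promised above) computes kxkp on Fn and tests subadditivity on random vectors:

```python
import math
import random

def p_norm(x, p):
    """kxkp from Example 2.10.2: the maximum for p = infinity,
    otherwise (|x_1|^p + ... + |x_n|^p)^(1/p)."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

random.seed(0)
for p in (1, 1.5, 2, 3, math.inf):
    for _ in range(1000):
        x = [random.uniform(-1, 1) for _ in range(5)]
        y = [random.uniform(-1, 1) for _ in range(5)]
        # subadditivity (Minkowski): kx + yk_p <= kxk_p + kyk_p
        lhs = p_norm([a + b for a, b in zip(x, y)], p)
        assert lhs <= p_norm(x, p) + p_norm(y, p) + 1e-12
```

For p = 2 this reduces to the familiar euclidean norm, e.g. p_norm([3, 4], 2) is 5.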

2.11 Lemma Let (V, k · k) be a normed F-vector space, and define d(x, y) = kx − yk. Then
(i) d is a metric on V that is translation-invariant, i.e. satisfies the equivalent statements

d(x, y) = d(x − z, y − z) ∀x, y, z ∈ V ⇔ d(x, y) = d(x − y, 0) ∀x, y ∈ V. (2.1)

(ii) (V, τd ) is a topological vector space.


Proof. (i) Proving the equivalence of the two statements in (2.1) is an easy exercise. That norms
give rise to metrics is probably known: d(x, y) ≥ 0 follows from kxk ≥ 0, and d(x, y) = 0 ⇔ x = y
follows from the norm axiom kxk = 0 ⇔ x = 0. Furthermore,

d(y, x) = ky − xk = k − (x − y)k = kx − yk = d(x, y),

where we used k − xk = kxk, a special case of the second seminorm axiom. Finally,

d(x, z) = kx − zk = kx − y + y − zk ≤ kx − yk + ky − zk = d(x, y) + d(y, z).

Translation invariance of d is obvious: d(x, y) := kx − yk = d(x − y, 0).


(ii) Let x, x0 , y, y 0 ∈ V . Then

d(x+y, x0 +y 0 ) = k(x+y)−(x0 +y 0 )k = k(x−x0 )+(y−y 0 )k ≤ kx−x0 k+ky−y 0 k = d(x, x0 )+d(y, y 0 ),

showing that + : V × V → V is jointly continuous. If x, x0 ∈ V and c, c0 ∈ F then

d(cx, c0 x0 ) = kcx − c0 x0 k = kcx − cx0 + cx0 − c0 x0 k ≤ kc(x − x0 )k + k(c − c0 )x0 k


= |c| kx − x0 k + |c − c0 | kx0 k = |c|d(x, x0 ) + |c − c0 |kx0 k.

This implies joint continuity of the scalar action F × V → V , since |c| and kx0 k remain bounded as (c, x0 ) varies near any given point. 
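The computations in the proof of Lemma 2.11 can be mirrored numerically. Here is a minimal Python check (illustrative only, using the euclidean norm on R4 as a stand-in for an arbitrary norm) of translation invariance (2.1) and the triangle inequality for d(x, y) = kx − yk:

```python
import random

def norm2(x):
    """Euclidean norm on R^n, standing in for an arbitrary norm."""
    return sum(t * t for t in x) ** 0.5

def d(x, y):
    # the metric induced by the norm, as in Lemma 2.11: d(x, y) = kx - yk
    return norm2([a - b for a, b in zip(x, y)])

random.seed(1)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(4)]
    y = [random.uniform(-5, 5) for _ in range(4)]
    z = [random.uniform(-5, 5) for _ in range(4)]
    # translation invariance (2.1): d(x - z, y - z) = d(x, y)
    xz = [a - c for a, c in zip(x, z)]
    yz = [b - c for b, c in zip(y, z)]
    assert abs(d(xz, yz) - d(x, y)) < 1e-9
    # triangle inequality: d(x, z) <= d(x, y) + d(y, z)
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-9
```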

2.12 Definition A topological vector space (V, τ ) is called normable if there exists a norm k · k
on V such that τ = τd with d(x, y) = kx − yk.

2.13 Remark One can prove, see the supplementary Appendix B.5.1, that a topological vector
space (V, τ ) is normable if and only if there is an open U ⊆ V such that 0 ∈ U and U is convex,
cf. Definition 5.21, and ‘bounded’. The latter property means that for every open neighborhood
W of 0 there exists an s > 0 such that U ⊆ sW . (In a normed space it is easy to see that B(0, r)
has all these properties for each r > 0.) 2
6 In view of these facts, which we cannot prove without going deeper into measure and integration theory, I don’t
find it unreasonable to ask that you understand the much simpler unordered summation.
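Remark 2.13 makes convexity of some bounded open neighborhood of 0 the crux of normability. A minimal Python check (illustrative; it previews the aside on 0 < p < 1 announced for Section 4.2) shows what goes wrong for kxkp with p < 1: the midpoint of two points of the p = 1/2 unit “ball” escapes the ball, so that ball is not convex, whereas for p = 1 the midpoint stays inside.

```python
def p_functional(x, p):
    """(|x_1|^p + ... + |x_n|^p)^(1/p): a norm for p >= 1,
    but for 0 < p < 1 subadditivity fails."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

# Both x and y lie on the unit "sphere" for every p.
x, y = [1.0, 0.0], [0.0, 1.0]
mid = [(a + b) / 2 for a, b in zip(x, y)]

assert p_functional(mid, 1) <= 1.0 + 1e-12   # midpoint inside the p = 1 ball: convex
assert p_functional(mid, 0.5) > 1.5          # midpoint outside the p = 1/2 "ball": not convex
```

Indeed p_functional(mid, 0.5) = (2 · (1/2)^(1/2))² = 2, well outside the unit ball.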

2.14 Definition A normed space (V, k · k) is called
• complete, or Banach space, if the metric space (V, d) with d(x, y) = kx − yk is complete.
• separable if the metric space (V, d) is separable (i.e. has a countable7 dense subset).
Just as complete metric spaces are ‘better behaved’ (in the sense of allowing stronger theo-
rems) than general metric spaces, Banach spaces are ‘better’ than normed spaces. Separability
is an annoying restriction that we will try to avoid as much as possible. (An opposite approach
[54] restricts to separable spaces from the beginning in order to make do with weak versions of
the axiom of choice.)
Note that the norm of a normable space (V, τ ) is never unique (unless V = {0}): If k · k is
a norm compatible with τ then the same holds for ck · k for every c > 0. Thus the choice of a
norm on a vector space is an extra piece of structure. If k · k1 , k · k2 are different norms on V
then (V, k · k1 ), (V, k · k2 ) are different as normed spaces even if the norms give rise to the same
topology!

2.15 Definition Let V be an F-vector space. Two norms k·k1 , k·k2 on V are called equivalent
if τd1 = τd2 , where di (x, y) = kx − yki .
We will soon prove (quite easily) the following:

2.16 Proposition Two norms k · k1 , k · k2 on an F-vector space V are equivalent if and only if
there are 0 < c0 ≤ c such that c0 kxk1 ≤ kxk2 ≤ ckxk1 for all x ∈ V .
The following deeper result will be proven later:

2.17 Theorem If V is a vector space that is complete w.r.t. each of the norms k · k1 , k · k2 and
k · k2 ≤ ck · k1 for some c > 0 then also k · k1 ≤ c0 k · k2 for some c0 > 0, thus the two norms are
equivalent.

2.3 Brief look at more general notions


Normed spaces are extremely versatile and have many applications. But there definitely
are topological vector spaces that are not normable. We therefore briefly look at two
generalizations: metrizable TVS and locally convex TVS. The latter contain all normable spaces
and the intermediate class of Fréchet spaces.

2.3.1 Metrizable TVS


2.18 Definition A topological vector space (V, τ ) is called metrizable if there exists a transla-
tion-invariant metric d on V such that τ = τd . An F -space is a TVS that is metrizable by a
translation-invariant and complete metric.

2.19 Remark 1. If a topological vector space (V, τ ) is normable, d(x, y) = kx − yk is a


translation invariant metric such that τ = τd , thus (V, τ ) is metrizable. But in Section 4.2 we
will encounter examples of TVS that are metrizable but not normable. A translation-invariant
metric comes from a norm if and only if d(cx, 0) = |c|d(x, 0) for all x ∈ V, c ∈ F.
2. Every R-vector space equipped with the indiscrete topology is a TVS. (Check this!) Since
the indiscrete topology (on a space with more than one point) is not metrizable, this gives an
example of a TVS that is not even metrizable.
7
‘Countable’ always means ‘at most countable’, otherwise we’ll say ‘countably infinite’.

3. Given a vector space V and some translation-invariant metric d on V , one proves as in
Lemma 2.11 that (V, +, 0), equipped with the topology τd , is a topological abelian group. But
scalar multiplication can fail to be continuous. This continuity holds if and only if d(cx, 0) → 0
as c → 0 for each x ∈ V .
4. There is a nice necessary and sufficient condition for metrizability of a TVS: It must
be Hausdorff and the zero element must have a countable base of open neighborhoods, cf. [64,
Theorem 1.24].
5. Metrizable TVS that are non-normable, while better behaved than general TVS, can still
be rather pathological. They do not have too many applications. 2

2.20 Remark The definition of completeness that we gave for normed spaces immediately
generalizes to metrizable TVS. Remarkably, completeness can actually be defined for arbitrary
TVS: A sequence {xn }n∈N in a TVS V is called a Cauchy sequence if for every open set U ⊆ V
containing 0 there is an n0 such that n, m ≥ n0 ⇒ xn − xm ∈ U . Cauchy nets are defined
analogously. (For the definition of nets see e.g. [47].) Now a TVS V is called complete if every
Cauchy net in V is convergent. It is easy to see that for a metrizable TVS the above notion of
completeness coincides with the metric one. 2

2.3.2 Locally convex and Fréchet spaces


We have seen that every norm on a vector space gives rise to a translation-invariant metric and
a TVS structure. Analogously, if k · k is a seminorm on V , but not a norm, then d(x, y) = kx − yk
defines only a pseudo-metric, and τd is not Hausdorff (if x 6= 0 is such that kxk = 0 then there
are no disjoint open sets U, V containing x, 0, respectively).

2.21 Definition If V is an F-vector space and F is a family of seminorms on V then the


topology τF is the smallest topology on V containing the balls

Bk·k (x, r) = {y ∈ V | kx − yk < r}

for all x ∈ V, r > 0, k · k ∈ F.


More explicitly, τF consists of all unions of finite intersections of such balls, i.e. the latter
form a subbase for τF . Now a sequence or net {xι } in V converges to z ∈ V if and only if
kxι − zk converges to zero for each k · k ∈ F.

2.22 Definition We say that F is separating if for any non-zero x ∈ V there is a k · k ∈ F


such that kxk 6= 0.
The property of being separating is important since one usually is only interested in Hausdorff
topological vector spaces and the following holds:

2.23 Lemma The topology τF induced by a family F of seminorms on V is Hausdorff if and


only if F is separating.
Proof. ⇒ Assume F is not separating. Then there is 0 6= x ∈ V such that kxk = 0 for all
k · k ∈ F. Then by definition of τF , every open set containing 0 also contains x and vice versa,
so that τF is not Hausdorff.
⇐ Assume x 6= y. By assumption there is a k · k ∈ F such that c = kx − yk > 0. Let
U = Bk·k (x, c/2), V = Bk·k (y, c/2). Then U, V are open sets containing x, y, respectively, and

existence of z ∈ U ∩ V would imply the contradiction c = d(x, y) ≤ d(x, z) + d(z, y) < c/2 + c/2 =
c, where d(x, y) := kx − yk. Thus U ∩ V = ∅, so that τF is Hausdorff. 

If V is an F-vector space and F is a family of seminorms on V , one can prove that V is


a topological vector space when equipped with the topology τF . The proof is not much more
complicated than for the case of one (semi)norm considered above.

2.24 Definition A topological vector space (V, τ ) over F is called locally convex if there exists
a separating family F of seminorms on V such that τ = τF .

2.25 Remark 1. Locally convex spaces were introduced in 1935 by John von Neumann (1903-
1957), to whom also von Neumann algebras, the theory of unbounded operators, the spectral
theorem and countless other discoveries in pure and applied mathematics are due.
2. For an equivalent, more geometric way of defining local convexity of a TVS see the
supplementary Section B.5.1, and for more on locally convex spaces see, e.g., [39, 12, 64]. 2

If the separating family F has just one element, we are back at the notion of a normed,
possibly Banach, space. If F is finite, i.e. F = {k · k1 , . . . , k · kn }, then k · k = Σ_{i=1}^{n} k · ki is a
seminorm, and it is a norm if and only if F is separating. Thus the case of finite F again gives
a normed space. Thus F must be infinite in order for interesting things to happen.
If F is infinite, we cannot just put kxk = Σ_{k·k∈F} kxk, since the r.h.s. has no reason to
converge for all x ∈ V . But if the family F of seminorms on V is countable, we can label the
elements of F as k · kn , n ∈ N and define

d(x, y) = Σ_{n=1}^{∞} 2−n min(1, kx − ykn ).

Now each term min(1, kx − ykn ) is a translation-invariant pseudometric on V that is bounded


by 1, and the sum converges to a translation-invariant metric on V . With just a bit more
work one shows that τF = τd , thus (V, τF ) is metrizable. (Note that we could not have defined
kxk = Σ_{i=1}^{∞} 2−i kxki since this again may fail to converge, thus need not be a norm.) If such a
space is complete, it is called a Fréchet space. Clearly, every Fréchet space is an F -space.
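The construction can be made concrete on a toy example of ours (not from the notes): on R⁴ with the hypothetical seminorms kvkn = |vn |, a finite family stands in for the countable one, so the capped, geometrically weighted sum is visibly a finite number even when some seminorm of the difference is large:

```python
def frechet_metric(x, y, seminorms):
    # d(x, y) = sum_n 2^{-n} min(1, ||x - y||_n); each term is capped at 1,
    # which is what makes the infinite sum converge in general
    diff = [a - b for a, b in zip(x, y)]
    return sum(2.0 ** -(n + 1) * min(1.0, p(diff))
               for n, p in enumerate(seminorms))

# toy seminorms ||v||_n = |v_n| on R^4 (our own stand-in for a countable family)
seminorms = [lambda v, i=i: abs(v[i]) for i in range(4)]

x = [0.0, 3.0, 0.5, 0.0]
y = [0.0, 0.0, 0.0, 0.0]
dist = frechet_metric(x, y, seminorms)

# coordinate 1 contributes only 2^{-2} * min(1, 3) = 0.25 despite |x_1| = 3
assert abs(dist - (0.25 + 0.0625)) < 1e-12
assert frechet_metric(x, x, seminorms) == 0.0
```

Note the loss of homogeneity: d(cx, 0) is not |c| d(x, 0), consistent with the fact that this metric need not come from a norm.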
Here is an example of a Fréchet space: For f ∈ C ∞ (R, C) and n, m ∈ N0 , define

kf kn,m = sup_{x∈R} |x|n |f (m) (x)|,

where f (m) is the m-th derivative of f . These k · kn,m are seminorms. Now

S = {f ∈ C ∞ (R, C) | kf kn,m < ∞ ∀n, m ∈ N0 }

is a Fréchet space when equipped with the topology defined by the family F = {k · kn,m | n, m ∈
N0 }, which is countable. Its elements are called Schwartz8 functions. They are infinitely differ-
entiable functions that, together with all their derivatives, vanish as |x| → ∞ faster than |x|−n
for any n. (This definition is easily generalized to functions of several variables.) Note that the
seminorm k · k0,0 alone already separates the elements of S, thus is a norm, but having the other
seminorms around gives rise to a finer topology, one that is not normable.
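As an illustration of ours (not from the notes): for the Gaussian f (x) = e^{−x²}, a standard example of a Schwartz function, the seminorms kf kn,0 can be approximated on a grid and compared with the closed form sup_x |x|^n e^{−x²} = (n/2)^{n/2} e^{−n/2} from elementary calculus:

```python
import math

def gauss(x):
    return math.exp(-x * x)

def seminorm_n0(f, n, xmax=5.0, steps=20000):
    # crude grid approximation of ||f||_{n,0} = sup_x |x|^n |f(x)|;
    # for the Gaussian the sup is attained at |x| = sqrt(n/2), well inside [-xmax, xmax]
    return max((abs(k * xmax / steps) ** n) * abs(f(k * xmax / steps))
               for k in range(-steps, steps + 1))

for n in range(1, 6):
    exact = (n / 2) ** (n / 2) * math.exp(-n / 2)
    approx = seminorm_n0(gauss, n)
    # every seminorm is finite: e^{-x^2} beats any polynomial growth
    assert abs(approx - exact) < 1e-3
```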
8
Laurent Schwartz (1915-2002). French mathematician who invented ‘distributions’, an important notion in func-
tional analysis.

3 Normed and Banach space basics
3.1 Why we care about completeness. Closed subspaces
As you (should) know from topology, completeness of a metric space is convenient since it leads
to results that are not necessarily true without it, like Cantor’s intersection theorem and the
contraction principle (or Banach’s fixed point theorem). The same holds for normed spaces.
Here is one main reason:
3.1 Definition Let (V, k · k) be a normed space and {xn }n∈N a sequence. The series Σ_{n=1}^{∞} xn
is said to be absolutely convergent if Σ_{n=1}^{∞} kxn k < ∞ and to converge to s ∈ V if the sequence
Sn = Σ_{k=1}^{n} xk of partial sums converges to s.

3.2 Proposition Let (V, k · k) be a normed F-vector space. Then the following are equivalent:9
(i) (V, k · k) is complete, thus a Banach space.
(ii) Every absolutely convergent series Σ_{n=1}^{∞} xn in V converges.
Under these assumptions, the sum satisfies kΣn xn k ≤ Σn kxn k.
Proof. ⇒ Assume V to be complete and Σn xn to be absolutely convergent. Let Sn = Σ_{k=1}^{n} xk
and Tn = Σ_{k=1}^{n} kxk k. For all n > m we have (by subadditivity of the norm)

kSn − Sm k = kΣ_{k=m+1}^{n} xk k ≤ Σ_{k=m+1}^{n} kxk k = Tn − Tm .

Since the sequence {Tn } is convergent by assumption, thus Cauchy, the above implies that
{Sn } is Cauchy, thus convergent by completeness of V . The subadditivity of the norm gives
kΣ_{k=1}^{n} xk k ≤ Σ_{k=1}^{n} kxk k for all n, and since the limit n → ∞ of both sides exists, we have the
inequality.
⇐ Assume that every absolutely convergent series in V converges, and let {yn }n∈N be a
Cauchy sequence in V . We can find (why?) a subsequence {zk }k∈N = {ynk } such that kzk −
zk−1 k ≤ 2−k ∀k ≥ 2. Now put z0 = 0 and define xk = zk − zk−1 for k ≥ 1. Now

Σ_{k=1}^{∞} kxk k = Σ_{k=1}^{∞} kzk − zk−1 k ≤ kz1 k + Σ_{k=2}^{∞} 2−k < ∞.

Thus Σ_{k=1}^{∞} xk is absolutely convergent, and therefore convergent by the hypothesis. To wit,
limn→∞ Sn exists, where Sn = Σ_{k=1}^{n} xk = Σ_{k=1}^{n} (zk − zk−1 ) = zn . Thus z = limk→∞ zk =
limk→∞ ynk exists. Now the sequence {yn } is Cauchy and has a convergent subsequence {ynk }.
This implies (why?) that the whole sequence {yn } converges to the limit of the subsequence. 
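In the Banach space (R, | · |) the proposition can be watched at work on a concrete series of our own choosing:

```python
import math

# x_n = (-1)^n / n^2 is absolutely convergent: sum_n |x_n| = pi^2 / 6
N = 100000
terms = [(-1) ** n / n ** 2 for n in range(1, N + 1)]

abs_sum = sum(abs(t) for t in terms)   # partial sum of sum_n ||x_n||
series_sum = sum(terms)                # partial sum S_N of the series itself

# completeness of R guarantees convergence; here the limit is -pi^2 / 12
assert abs(abs_sum - math.pi ** 2 / 6) < 1e-4
assert abs(series_sum + math.pi ** 2 / 12) < 1e-6
# the bound ||sum_n x_n|| <= sum_n ||x_n|| from Proposition 3.2
assert abs(series_sum) <= abs_sum
```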

We will see various other reasons for the importance of completeness. In the next section,
it will be used to prove that every finite dimensional linear subspace of a normed space is
automatically closed. This is
Pnot at all true for infinite dimensional subspaces. For example, let

V = `1 (N) = {f : N → R | n=1 |f (n)| < ∞}. Now W = {f : N → R | #{n ∈ N | f (n) 6= 0} <
∞} $ V is an infinite dimensional linear proper subspace, but not closed: One easily checks
that W = V .
Closedness and completeness of subsets of a metric space are related. We recall from topology
(if you haven’t seen this, prove it!):
9
Unfortunately, some authors write: “Σn kxn k < ∞, thus Σn xn converges” without indicating that something
needs to be proven here.

3.3 Lemma Let (X, d) be a metric space and Y ⊆ X. Then (instead of d|Y 10 I just write d)
(i) If (X, d) is complete and Y ⊆ X is closed (w.r.t. τd , of course) then (Y, d) is complete.
(ii) If (Y, d) is complete then Y ⊆ X is closed (whether or not (X, d) is complete).
The above should be compared with the fact that a closed subset of a compact space is
compact and that a compact subset of a Hausdorff space is closed. In the above, completeness
works as a weak substitute of compactness, an interpretation that is reinforced by the fact that
every compact metric space is complete.
This specializes immediately to normed spaces:

3.4 Lemma Let (V, k · k) be a normed space and W ⊆ V a linear subspace. Then
(i) If V is complete (=Banach) and W ⊆ V is closed then W is Banach.
(ii) If W is complete then W ⊆ V is closed (whether or not V is complete).

3.5 Definition A linear map A : V → W of normed spaces is called an isometry if kAxk = kxk
for all x ∈ V .
Recall that a linear map A : V → W is injective if and only if its kernel ker A = A−1 (0) is
{0}. It follows readily that an isometry is automatically injective. Furthermore, if A : V → W
is a surjective isometry then it is invertible, and its inverse also is an isometry. Then A is called
an isometric isomorphism of normed spaces.

3.6 Corollary Let V be a complete normed space and W a normed space. If A : V → W is


an isometry then the linear subspace AV ⊆ W is closed.
Proof. The map A : V → AV ⊆ W is an isometric bijection and therefore an isometric isomor-
phism of normed spaces. Thus (AV, k · k) is complete, thus closed in W by Lemma 3.4. 

I assume as known that every metric space can be completed, i.e. embedded isometrically
into a complete metric space (unique up to isometry) as a dense subspace.

3.7 Exercise Prove that the completion of a normed space again is a vector space, thus a
Banach space.
Later we will see an alternative construction of the completion.

3.8 Exercise Let (V1 , k · k1 ), (V2 , k · k2 ) be normed spaces.


(i) Prove that k(x1 , x2 )ks = kx1 k1 +kx2 k2 and k(x1 , x2 )km = max(kx1 k1 , kx2 k2 ) are equivalent
norms on V1 ⊕ V2 .
(ii) Prove that (V1 ⊕ V2 , k · ks/m ) is complete if and only if (V1 , k · k1 ), (V2 , k · k2 ) both are
complete.
Proof. (i) It is immediate that k(x1 , x2 )km ≤ k(x1 , x2 )ks ≤ 2k(x1 , x2 )km .
(ii) Assume that V1 , V2 are complete. If {(xi , yi )} ⊆ V1 ⊕ V2 is Cauchy w.r.t. k · km then
{xi } and {yi } are Cauchy in V1 , V2 , thus convergent. I.e. there are x ∈ V1 , y ∈ V2 such that
10
If f : X → Y is a function and Z ⊆ X, we write either f|Z or f  Z for the restriction of f to Z. Usually the first
form is typographically more convenient.

kxi − xk → 0, kyi − yk → 0. Now k(xi , yi ) − (x, y)k → 0 for both sum and maximum norm.
Assume V1 ⊕ V2 is complete for the sum or maximum norm. By their equivalence, it then is
complete for the other. And if {xi } is Cauchy in V1 then {(xi , 0)} ⊆ V1 ⊕ V2 is Cauchy, thus
convergent to some (x, 0). Now it is clear that xi → x. 
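A quick numerical check of the two-sided bound from part (i), taking V1 = V2 = R with the absolute value (the helper names are ours):

```python
import random

def norm_s(x1, x2):
    # sum norm ||(x1, x2)||_s = ||x1||_1 + ||x2||_2, here on R + R
    return abs(x1) + abs(x2)

def norm_m(x1, x2):
    # maximum norm ||(x1, x2)||_m = max(||x1||_1, ||x2||_2)
    return max(abs(x1), abs(x2))

random.seed(1)
for _ in range(1000):
    x1, x2 = random.uniform(-5, 5), random.uniform(-5, 5)
    m, s = norm_m(x1, x2), norm_s(x1, x2)
    # the equivalence bound ||.||_m <= ||.||_s <= 2 ||.||_m
    assert m <= s <= 2 * m
```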

3.9 Exercise (i) Let {(Vi , k · ki )}i∈I be a family of normed spaces, where I is any set. Put
⊕_{i∈I} Vi = {f | Σ_{i∈I} kf (i)ki < ∞}.
(Technically, this is a subset of Π_{i∈I} Vi = {f : I → ∪j Vj | f (i) ∈ Vi ∀i ∈ I}.) Prove that
this is a linear space and kf k = Σi kf (i)ki a norm on it.
(ii) Prove that (⊕_{i∈I} Vi , k · k) is complete if all the Vi are complete. Hint: The proof is an
adaptation of the one for `1 (S) given in Section 4.3.
If Vi = F for all i ∈ I with the usual norm, we have ⊕_{i∈I} Vi ∼= `1 (I, F).

3.2 Linear maps: bounded ⇔ continuous


If E, F are vector spaces over F, a map A : E → F is called linear if A(x + y) = Ax + Ay for
all x, y ∈ E and A(cx) = cAx for all x ∈ E, c ∈ F. Note that, as in linear algebra, we write
Ax instead of A(x).
NB: A map of the form x 7→ Ax + b, where A : E → F is linear and b ∈ F , is not called a
linear map, but an affine one!

3.10 Definition Let E, F be normed spaces and A : E → F a linear map. Then the norm
kAk ∈ [0, ∞] is defined by

kAk = sup_{0 6= e ∈ E} kAek/kek = sup_{e ∈ E, kek=1} kAek.11

(The equality of the second and third expression is due to linearity of A.) If kAk < ∞ then A
is called bounded.

3.11 Remark 1. ‘Linear operator’ is a synonym for linear map, but linear maps A : E → F
into the scalar field F are called linear functionals.
2. While unbounded linear maps exist, cf. Exercise 3.15 below, in the unbounded case it
often is too restrictive to require them to be defined everywhere. See also Remark 8.10. 2

3.12 Exercise If E, G, H are normed spaces and S : E → G, T : G → H are linear maps,


prove that kT ◦ Sk ≤ kSkkT k.

3.13 Lemma Let E, F be normed spaces and A : E → F a linear map. Then the following are
equivalent:
(i) A is bounded.
(ii) A is continuous (w.r.t. the norm topologies).
11
It should be clear that writing sup_{e ∈ E, kek≤1} kAek instead would not change the result.

(iii) A is continuous at 0 ∈ E.
Proof. (i)⇒(ii) For x, y ∈ E we have kAx − Ayk = kA(x − y)k ≤ kAk kx − yk. Thus d(Ax, Ay) ≤
kAk d(x, y), and with kAk < ∞ we have (uniform) continuity of A.
(ii)⇒(iii) This is obvious.
(iii)⇒(i) B F (0, 1) ⊆ F is an open neighborhood of 0 ∈ F . Since A is continuous at 0, there
is an open neighborhood U ⊆ E of 0 ∈ E such that A(U ) ⊆ B F (0, 1). Since the balls form
bases of the topologies, there is C > 0 such that B E (0, C) ⊆ U , thus A(B E (0, C)) ⊆ B F (0, 1).
By linearity of A and the properties of the norm, this is equivalent to A(B E (0, 1)) ⊆ B F (0, D),
where D = 1/C. If 0 6= x ∈ E then
 
Ax = 2kxk A(x / 2kxk),

thus kAxk ≤ 2kxkkA(x/2kxk)k < 2kxkD, and A is bounded. 
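The operator norm and the Lipschitz estimate from (i)⇒(ii) can be illustrated for the diagonal matrix A = diag(3, 1) on (R², k · k2 ), whose operator norm is 3; this is a sketch of ours, not part of the notes:

```python
import math

A = [[3.0, 0.0], [0.0, 1.0]]  # diag(3, 1); its operator norm w.r.t. ||.||_2 is 3

def apply(A, x):
    return [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

def norm(x):
    return math.sqrt(sum(t * t for t in x))

# approximate ||A|| = sup{||Ae|| : ||e|| = 1} over a grid of unit vectors
N = 2000
op_norm = max(norm(apply(A, [math.cos(2 * math.pi * k / N),
                             math.sin(2 * math.pi * k / N)]))
              for k in range(N))
# the sup is attained at e = (1, 0), which the grid hits at k = 0
assert abs(op_norm - 3.0) < 1e-9

# boundedness gives the Lipschitz bound ||Ax - Ay|| <= ||A|| ||x - y||
x, y = [1.0, 2.0], [-0.5, 3.0]
diff = [a - b for a, b in zip(x, y)]
assert norm(apply(A, diff)) <= op_norm * norm(diff) + 1e-12
```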

3.14 Exercise Let V, W be normed spaces, where V is finite dimensional. Prove that every
linear map V → W is bounded.

3.15 Exercise 1. Give an example of an unbounded linear map A : V → W between normed


spaces that is defined on all of V .
2. Bonus: Same as 1., but with V, W Banach.
For linear functionals, i.e. linear maps from an F-vector space to F, there is another charac-
terization of continuity:

3.16 Exercise Let (V, k · k) be a normed F-vector space and ϕ : V → F a linear functional.
Prove that ϕ is continuous if and only if ker ϕ = ϕ−1 (0) ⊆ V is closed.
Hint: For ⇐, pick a ball B(x, r) ⊆ V \ ker ϕ and prove that ϕ(B(0, r)) is bounded.

3.17 Lemma Let V be a normed space, W ⊆ V a dense linear subspace, Z a Banach space
and A : W → Z a bounded linear map. Then there is a unique linear map Â : V → Z with
Â|W = A and kÂk = kAk. If A is an isometry, so is Â.

Proof. Let x ∈ V . Then there is a sequence {wn } in W such that kwn − xk → 0. Then
{wn } ⊆ W is a Cauchy sequence, and so is {Awn } ⊆ Z by boundedness of A. The latter converges
since Z is complete. If {wn0 } is another sequence converging to x then kA(wn − wn0 )k → 0,
so that lim Awn0 = lim Awn . It thus is well-defined to put Âx = limn→∞ Awn . We omit the
easy proof of linearity of Â. If x ∈ W then we can put wn = x ∀n, obtaining Âx = Ax, thus
Â|W = A. Finally, kÂxk = lim kAwn k ≤ kAk kxk. Thus kÂk ≤ kAk, and the converse inequality
is obvious. If A is an isometry then kÂxk = limn kAwn k = limn kwn k = kxk, so that Â is an
isometry. 

We recall the following from topology: If X is a set and τ1 , τ2 are topologies on X then
τ1 = τ2 holds if and only if idX : (X, τ1 ) → (X, τ2 ) is a homeomorphism, i.e. continuous with
continuous inverse.

Proof of Proposition 2.16. Equivalence of k · k1 , k · k2 means that the two norms give rise to the
same topology. By the above, this is equivalent to the maps idV : (V, k · k1 ) → (V, k · k2 ) and
idV : (V, k · k2 ) → (V, k · k1 ) being continuous. Now by Lemma 3.13, this is equivalent to the
existence of C, C 0 such that kxk1 ≤ Ckxk2 and kxk2 ≤ C 0 kxk1 holding for all x ∈ V . 

3.18 Exercise Prove: If V is a vector space and k · k1 , k · k2 are equivalent norms on V then
completeness of V w.r.t. k · k1 is equivalent to completeness of V w.r.t. k · k2 .

3.19 Proposition On a finite dimensional vector space, all norms are equivalent.
Proof. Let F ∈ {R, C}. Let B = {e1 , . . . , ed } be a basis for V , and define the Euclidean norm
k · k2 of x = Σi ci ei by kxk2 = (Σi |ci |2 )1/2 . Since equivalence of norms is an equivalence relation,
it clearly is sufficient to show that any norm k · k is equivalent to k · k2 . Using |ci | ≤ kxk2 ∀i
and the properties of any norm, we have
kxk = kΣ_{i=1}^{d} ci ei k ≤ Σ_{i=1}^{d} |ci | kei k ≤ (Σ_{i=1}^{d} kei k) kxk2 . (3.1)

This implies that x 7→ kxk is continuous w.r.t. the topology on V defined by k · k2 . The sphere
S = {x ∈ Fd | kxk2 = 1} is closed and bounded, thus compact, which implies that there is
z ∈ S such that λ := inf x∈S kxk = kzk. Since z ∈ S implies z 6= 0 and k · k is a norm, we have
λ = kzk > 0. Now, for x 6= 0 we have x/kxk2 ∈ S, and thus

kxk = kxk2 · kx/kxk2 k ≥ kxk2 λ. (3.2)
Combining (3.1, 3.2), we have c1 kxk2 ≤ kxk ≤ c2 kxk2 with 0 < c1 = inf_{x∈S} kxk ≤ Σi kei k = c2 .
(Note that ei ∈ S ∀i, so that c2 ≥ dc1 , showing again that V must be finite dimensional.) 
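For the particular pair k · k1 , k · k2 on R^d the constants can even be made explicit: kxk2 ≤ kxk1 ≤ √d kxk2 , the upper bound being an instance of Cauchy–Schwarz. A numeric check of ours for this instance of the proposition:

```python
import math
import random

def norm1(x):
    return sum(abs(t) for t in x)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

d = 4
random.seed(3)
for _ in range(2000):
    x = [random.uniform(-10, 10) for _ in range(d)]
    # c' ||x||_2 <= ||x||_1 <= c ||x||_2 with c' = 1 and c = sqrt(d)
    assert norm2(x) <= norm1(x) + 1e-9
    assert norm1(x) <= math.sqrt(d) * norm2(x) + 1e-9
```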

3.20 Remark In fact, one can prove the somewhat stronger result that on a finite dimensional
vector space there is precisely one topology making it a TVS. 2

3.21 Exercise Prove that every finite dimensional normed space over R or C is complete.

3.22 Exercise Prove that every finite dimensional subspace of a normed space is closed.

3.3 Spaces of bounded linear maps. First glimpse of Banach


algebras
3.23 Definition Let E, F be normed F-vector spaces. The set of bounded linear maps from
E to F is denoted B(E, F ). Instead of B(E, E) and B(E, F) one also writes B(E) and E ∗ ,
respectively. E ∗ is called the dual space of E.
Clearly B(E, F ) should not be confused with notation B(x, r) for open balls!

3.24 Proposition Let E, F be normed spaces.


(i) B(E, F ) is a vector space and B(E, F ) → [0, ∞), A 7→ kAk is a norm in the sense of
Definition 2.9.
(ii) If F is complete (=Banach) then so is B(E, F ). E ∗ is always Banach.
Proof. (i) If T : E → F is a linear map, it is clear that kαT k = |α|kT k and that kT k = 0 if and
only if T = 0. If S, T ∈ B(E, F ) and x ∈ E then k(S + T )xk ≤ kSxk + kT xk ≤ (kSk + kT k)kxk,
so that kS + T k ≤ kSk + kT k. This implies that B(E, F ) is a vector space.
(ii) Assume F is complete, and let {Tn } ⊆ B(E, F ) be a Cauchy sequence. Then there is
n0 such that m, n ≥ n0 ⇒ kTm − Tn k < 1, in particular Tm ∈ B(Tn0 , 1) for all n ≥ n0 . Thus

with M = max(kT1 k, . . . , kTn0 −1 k, kTn0 k + 1) we have kTn k ≤ M for all n. If now x ∈ E then
k(Tn −Tm )xk ≤ kTn −Tm kkxk, so that {Tn x} is a Cauchy sequence in F and therefore convergent
by completeness of F . Now define T : E → F by T x = limn→∞ Tn x. It is straightforward to
check that T is linear. Since kTn xk ≤ M kxk for all n, we have kT xk = limn→∞ kTn xk ≤ M kxk,
so that T ∈ B(E, F ). Finally, given ε > 0, pick n0 such that kTn − Tm k ≤ ε for all n, m ≥ n0 ;
letting m → ∞ in k(Tn − Tm )xk ≤ εkxk gives k(Tn − T )xk ≤ εkxk, thus kTn − T k ≤ ε for all
n ≥ n0 , so that Tn → T in B(E, F ). 

3.25 Exercise Let (V1 , k · k1 ), (V2 , k · k2 ) be normed spaces. Prove (V1 ⊕ V2 , k · ks )∗ ∼= (V1∗ ⊕
V2∗ , k · km ) and (V1 ⊕ V2 , k · km )∗ ∼= (V1∗ ⊕ V2∗ , k · ks ).
If E is a normed F-vector space, the same holds for B(E) = B(E, E), and by Exercise 3.12,
we have kS ◦ T k ≤ kSkkT k for all S, T ∈ B(E). This motivates the following definition:

3.26 Definition If F is a field, an F-algebra is an F-vector space A together with an associative


bilinear operation A × A → A, the ‘multiplication’.
Examples: A = Mn×n (F) with matrix product as multiplication, A = C(X, F) with point-
wise product of functions.

3.27 Definition A normed F-algebra is an F-algebra A equipped with a norm k · k such that
kabk ≤ kakkbk for all a, b ∈ A (submultiplicativity). A Banach algebra is a normed algebra that
is complete (as a normed space). An algebra A is called unital if it has a unit 1 6= 0. (In fact,
if A 6= {0} then 1 = 0 would imply the contradiction kak = k1ak ≤ k1kkak = 0 ∀a ∈ A.)

3.28 Remark 1. If A is a normed algebra then for all a, a0 , b, b0 ∈ A we have

kab − a0 b0 k = kab − ab0 + ab0 − a0 b0 k ≤ kakkb − b0 k + ka − a0 kkb0 k.

This proves that the multiplication map · : A × A → A is jointly continuous.


2. If A is a normed algebra with unit 1 then 1 = 12 , thus k1k = k12 k ≤ k1k2 . With k1k 6= 0
this implies 1 ≤ k1k. Some authors require all unital normed algebras to satisfy k1k = 1, but
we don’t. Of course this does hold for B(E). 2

By the above, B(E) is a normed algebra for every normed space E, and by Proposition
3.24(ii), B(E) is a Banach algebra whenever E is a Banach space. There is another standard
class of examples:

3.29 Example Let X be a compact Hausdorff space and A = C(X, F). We already know that
A, equipped with the norm kf k = supx∈X |f (x)| is a Banach space. The pointwise product
(f g)(x) = f (x)g(x) of functions is bilinear, associative and clearly satisfies kf gk ≤ kf kkgk.
This makes (A, ·, k · k) a Banach algebra.
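Submultiplicativity of the sup norm under the pointwise product can be checked on a sampled grid standing in for a compact X; the functions below are our own choices:

```python
import math

X = [k / 99 for k in range(100)]  # grid standing in for the compact space [0, 1]

def supnorm(f):
    return max(abs(f(x)) for x in X)

f = lambda x: math.sin(7 * x) + 0.3
g = lambda x: x * x - 0.5

# pointwise product and ||fg|| <= ||f|| ||g||
fg = lambda x: f(x) * g(x)
assert supnorm(fg) <= supnorm(f) * supnorm(g) + 1e-12
# the inequality is generally strict: f and g need not peak at the same point
```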
We will have more to say about Banach algebras later in the course.
Before we go on developing general theory, it seems instructive to study in some depth an
important class of spaces, the `p (S) spaces, where everything can be done very explicitly, in
particular the dual spaces can be determined.

4 The sequence spaces and their dual spaces


In this section we will consider in some detail the sequence spaces `p (S, F) for 0 < p ≤ ∞. These
spaces are worth studying for several reasons:

• They provide a first encounter with the more general Lebesgue spaces Lp (X, A, µ) without
the measure and integration theoretic baggage needed for the latter.
• They can be studied quite completely and have their dual spaces identified.
• We will see that every Hilbert space is isomorphic to `2 (S, F) for some S.

4.1 Basics. 1 ≤ p ≤ ∞: Hölder and Minkowski inequalities


4.1 Definition (`p -Spaces) If F ∈ {R, C}, 0 < p < ∞, S is a set and f : S → F, define
kf k∞ = sup_{s∈S} |f (s)| ∈ [0, ∞], kf kp = (Σ_{s∈S} |f (s)|p )1/p ∈ [0, ∞],

where ∞1/p = ∞ and we use the notion of unordered sums, cf. Appendix A.1. Now for all
p ∈ (0, ∞] put
`p (S, F) := {f : S → F | kf kp < ∞}.

4.2 Lemma For all p ∈ (0, ∞] and f : S → F we have:


(i) kf kp = 0 if and only if f = 0.
(ii) For all c ∈ F we have kcf kp = |c|kf kp (with the understanding that 0 · ∞ = 0).
(iii) If S is finite then `p (S, F) = {f : S → F} = FS . If #S = 1 then all the k · kp coincide.
Proof. Trivial. 

4.3 Lemma (i) (`p (S, F), k · kp ) are vector spaces for all p ∈ [1, ∞].
(ii) k · k1 and k · k∞ are norms.
(iii) If f ∈ `1 (S, F) and g ∈ `∞ (S, F) then

|Σ_{s∈S} f (s)g(s)| ≤ kf gk1 ≤ kf k1 kgk∞ .

Proof. (ii) This follows from

kf + gk∞ = sup_s |f (s) + g(s)| ≤ sup_s |f (s)| + sup_s |g(s)| = kf k∞ + kgk∞ ,
kf + gk1 = Σ_s |f (s) + g(s)| ≤ Σ_s (|f (s)| + |g(s)|) = kf k1 + kgk1 .

(i) Together with the obvious fact that each `p (S, F) is stable under scalar multiplication, (ii)
implies that `p (S, F) is a vector space for p ∈ {1, ∞}. For 1 < p < ∞, with the useful inequality

|a + b|p ≤ (|a| + |b|)p ≤ (2 max(|a|, |b|))p = 2p max(|a|p , |b|p ) ≤ 2p (|a|p + |b|p ) (4.1)
and f, g ∈ `p (S, F) we have
kf + gkpp = Σ_s |f (s) + g(s)|p ≤ 2p Σ_s (|f (s)|p + |g(s)|p ) = 2p (kf kpp + kgkpp ) < ∞,

so that kf + gkp < ∞, thus f + g ∈ `p (S, F).

(iii) |Σ_{s∈S} f (s)g(s)| ≤ Σ_{s∈S} |f (s)||g(s)| ≤ kgk∞ Σ_{s∈S} |f (s)| = kgk∞ kf k1 . 

In order to obtain analogues of (ii), (iii) for 1 < p < ∞, define q ∈ (1, ∞) by 1/p + 1/q = 1.
(This is equivalent to pq = p + q, which often is useful.) Whenever p, q appear together they
are supposed to be a conjugate or dual pair in this sense. We extend this in a natural way by
declaring (1, ∞) and (∞, 1) to be conjugate pairs.
4.4 Proposition Let 1 < p < ∞ and q conjugate to p, i.e. 1/p + 1/q = 1. Then
(i) For all f, g : S → F we have kf gk1 ≤ kf kp kgkq . (Inequality of Hölder12 (1889))
(ii) For all f, g : S → F we have kf + gkp ≤ kf kp + kgkp . (Inequality of Minkowski13 (1896))
Proof. (i) The inequality is trivially true if kf kp or kgkq is zero or infinite. Thus we assume
kf kp , kgkq to be finite and non-zero. The exponential function R → R, x 7→ ex is convex14 , so
that with 1/p + 1/q = 1 we have

ea/p eb/q = exp(a/p + b/q) ≤ ea /p + eb /q ∀a, b ∈ R.

With ea = up , eb = v q , where u, v > 0 this becomes


uv ≤ up /p + v q /q ∀u, v ≥ 0. (4.2)
(The validity also for u = 0 or v = 0 is obvious.)
Putting u = |f (s)|, v = |g(s)| in (4.2), we have |f (s)g(s)| ≤ p−1 |f (s)|p + q −1 |g(s)|q , so that
summing over s gives kf gk1 ≤ p−1 kf kpp + q −1 kgkqq . If kf kp = kgkq = 1 then this reduces to
kf gk1 ≤ 1/p + 1/q = 1. With f 0 = f /kf kp , g 0 = g/kgkq we have kf 0 kp = 1 = kg 0 kq , so that the
above implies kf 0 g 0 k1 ≤ 1. Now inserting the definitions of f 0 , g 0 gives kf gk1 ≤ kf kp kgkq .
(ii) If h ∈ `p then using pq = p + q (where q is conjugate to p) we find Σs |h(s)|(p−1)q =
Σs |h(s)|p < ∞, so that s 7→ |h(s)|p−1 is in `q with k |h|p−1 kq = (khkp )p/q . Now
kf + gkpp = Σs |f (s) + g(s)|p = Σs |f (s) + g(s)| |f (s) + g(s)|p−1
≤ Σs (|f (s)| + |g(s)|) |f (s) + g(s)|p−1
≤ (kf kp + kgkp ) k |f + g|p−1 kq = (kf kp + kgkp ) (kf + gkp )p/q ,

where the second ≤ comes from Hölder’s inequality applied to |f | ∈ `p and |f + g|p−1 ∈ `q and
also to |g| ∈ `p and |f + g|p−1 ∈ `q . If kf + gkp 6= 0, we can divide by (kf + gkp )p/q and with
p − p/q = p(1 − 1/q) = p · (1/p) = 1 we obtain

kf + gkp = (kf + gkp )p−p/q ≤ kf kp + kgkp .

Since this clearly also holds if kf + gkp = 0, we are done. 
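Both inequalities are easy to probe numerically on a finite set S; the sketch below (with our own helper `lp_norm`) checks them for random vectors and several exponents:

```python
import random

def lp_norm(f, p):
    # ||f||_p for f : S -> R with S = {0, ..., len(f)-1}
    return sum(abs(t) ** p for t in f) ** (1 / p)

random.seed(4)
for _ in range(500):
    f = [random.uniform(-2, 2) for _ in range(8)]
    g = [random.uniform(-2, 2) for _ in range(8)]
    for p in (1.5, 2.0, 3.0):
        q = p / (p - 1)  # conjugate exponent: 1/p + 1/q = 1
        # Hölder: ||fg||_1 <= ||f||_p ||g||_q
        fg1 = sum(abs(a * b) for a, b in zip(f, g))
        assert fg1 <= lp_norm(f, p) * lp_norm(g, q) + 1e-9
        # Minkowski: ||f + g||_p <= ||f||_p + ||g||_p
        s = [a + b for a, b in zip(f, g)]
        assert lp_norm(s, p) <= lp_norm(f, p) + lp_norm(g, p) + 1e-9
```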

12
Otto Hölder (1859-1937). German mathematician. Important contributions to analysis and algebra.
13
Hermann Minkowski (1864-1909). German mathematician. Contributions to number theory, relativity and other
fields.
14
f : [a, b] → R is convex if f (tx + (1 − t)y) ≤ tf (x) + (1 − t)f (y) for all x, y ∈ [a, b] and t ∈ [0, 1] and strictly convex
if the inequality is strict whenever x 6= y, 0 < t < 1. See, e.g., [23, Vol. 1, Section 7.2].

For p = q = 2, the inequality of Hölder is known as the Cauchy-Schwarz inequality. We
will also call the trivial inequalities of Lemma 4.3 for {p, q} = {1, ∞} Hölder and Minkowski
inequalities. Now the analogue of Lemma 4.3 for 1 < p < ∞ is clear:

4.5 Corollary Let 1 < p < ∞. Then


(i) (`p (S, F), k · kp ) is a normed vector space.
(ii) If q is conjugate to p and f ∈ `p (S, F) and g ∈ `q (S, F) then

|Σ_{s∈S} f (s)g(s)| ≤ kf gk1 ≤ kf kp kgkq .

4.2 ? Aside: The translation-invariant metric dp for 0 < p < 1


For s ∈ S, let δs : S → F be the function defined by δs (t) = δs,t (which is 1 for s = t and zero
otherwise).

4.6 Proposition If 0 < p < 1 and #S ≥ 2 then


(i) k · kp violates subadditivity, thus is not a norm.
(ii) Nevertheless, `p (S, F) is a vector space.
(iii) Restricted to `p (S, F),

dp (f, g) = Σ_{s∈S} |f (s) − g(s)|p

defines a translation-invariant metric.


(iv) `p (S, F) is a topological vector space when given the metric topology τdp .
Proof. (i) Pick s, t ∈ S, s 6= t and put f = δs , g = δt . Now kf kp = kgkp = 1 and

2 < 21/p = kf + gkp 6≤ kf kp + kgkp = 2

since 1/p > 1. Thus k · kp is not subadditive and therefore not a norm.
(ii) It is clear that f ∈ `p (S, F) implies cf ∈ `p (S, F) for all c ∈ F. For a, b ≥ 0 we have
(a + b)p ≤ (2 max(a, b))p ≤ 2p (ap + bp ), whence the inequality
kf + gkpp = Σ_{s∈S} |f (s) + g(s)|p ≤ Σ_{s∈S} (|f (s)| + |g(s)|)p
≤ 2p Σ_{s∈S} (|f (s)|p + |g(s)|p ) = 2p (kf kpp + kgkpp ),

which still implies that f + g ∈ `p (S, F) for all f, g ∈ `p (S, F).


(iii) That dp (f, g) < ∞ for all f, g ∈ `p (S, F) follows from `p being a vector space. Translation
invariance of dp and the axioms dp (f, g) = dp (g, f ) and dp (f, g) = 0 ⇔ f = g are all evident
from the definition. We claim that

0 < p < 1, a, b ≥ 0 ⇒ (a + b)p ≤ ap + bp .

Believing this for a minute, we have
dp (f, h) = dp (f − h, 0) = Σs |f (s) − h(s)|p ≤ Σs (|f (s) − g(s)| + |g(s) − h(s)|)p
≤ Σs (|f (s) − g(s)|p + |g(s) − h(s)|p )
= dp (f − g, 0) + dp (g − h, 0) = dp (f, g) + dp (g, h),

as wanted, where we first used the triangle inequality and then the claim.
Turning to our claim (a + b)p ≤ ap + bp , it is clear that this holds if a = 0. For a = 1 it
reduces to (1 + b)p ≤ 1 + bp ∀b ≥ 0. For b = 0 this is true, and for all b > 0 it follows from the
fact that
d/db (1 + bp − (1 + b)p ) = p(bp−1 − (b + 1)p−1 ) > 0
due to p − 1 < 0. If now a > 0 then

(a + b)p = ap (1 + (b/a))p ≤ ap (1 + (b/a)p ) = ap + bp ,

and we are done.


(iv) In view of dp (f + g, 0) ≤ dp (f, 0) + dp (g, 0), it is clear that the addition operation
`p × `p → `p is jointly continuous at (0, 0), thus everywhere. It remains to show that scalar
action F × `p → `p is jointly continuous. By distributivity it suffices to do this at (0, 0). Now
dp (cf, 0) = Σs |cf (s)|p = |c|p Σs |f (s)|p = |c|p dp (f, 0),

and this goes to zero as (c, f ) goes to zero in F × `p . 
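The failure of subadditivity and the survival of the triangle inequality for dp are both visible in the two-point example from the proof (p = 1/2, f = δs , g = δt ); the code below is a sketch of ours:

```python
def quasi_norm(h, p):
    # ||h||_p = (sum |h(s)|^p)^{1/p}; for p < 1 this is not a norm
    return sum(abs(t) ** p for t in h) ** (1 / p)

def dp(h1, h2, p):
    # the translation-invariant metric of (iii): no 1/p-th root is taken
    return sum(abs(a - b) ** p for a, b in zip(h1, h2))

p = 0.5
f, g = [1.0, 0.0], [0.0, 1.0]   # delta_s and delta_t on a two-point set
s = [a + b for a, b in zip(f, g)]

# ||f + g||_p = 2^{1/p} = 4 > 2 = ||f||_p + ||g||_p: subadditivity fails
assert quasi_norm(s, p) == 4.0
assert quasi_norm(f, p) + quasi_norm(g, p) == 2.0

# but d_p does satisfy the triangle inequality, e.g. through f:
zero = [0.0, 0.0]
assert dp(s, zero, p) <= dp(s, f, p) + dp(f, zero, p)
```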

The above does not amount to a proof that the topological vector spaces (ℓ^p(S), d_p) with 0 < p < 1 are not normable, but this can be done, cf. Section B.5.1. (In fact they are not even locally convex.) This leads to strange behavior; for example the dual space ℓ^p(S)* is unexpected, cf. [47]. This strangeness is even more pronounced for the continuous versions L^p(X, A, µ): For X = [0, 1] equipped with Lebesgue measure, one has L^p(X, A, µ)* = {0}, which cannot happen for a non-zero Banach (or locally convex) space due to the Hahn-Banach theorem.
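The failure of subadditivity for 0 < p < 1 and the validity of the triangle inequality for d_p are easy to check numerically. The following sketch (plain Python; finitely supported sequences are modeled as dicts, a stand-in for elements of c00 ⊆ ℓ^p, and the random samples are for illustration only):

```python
import random

# Finitely supported "sequences" f: S -> R, modeled as dicts; a toy model of c00.
def norm_p(f, p):
    return sum(abs(v) ** p for v in f.values()) ** (1 / p)

def d_p(f, g, p):
    # Translation-invariant metric for 0 < p < 1: no 1/p-th root is taken.
    keys = set(f) | set(g)
    return sum(abs(f.get(s, 0.0) - g.get(s, 0.0)) ** p for s in keys)

p = 0.5
# ||delta_s + delta_t||_p = 2^(1/p) = 4 > 2 = ||delta_s||_p + ||delta_t||_p:
assert norm_p({"s": 1.0, "t": 1.0}, p) == 4.0

# d_p, in contrast, satisfies the triangle inequality (checked on random samples):
random.seed(0)
for _ in range(1000):
    f, g, h = ({i: random.uniform(-1, 1) for i in range(5)} for _ in range(3))
    assert d_p(f, h, p) <= d_p(f, g, p) + d_p(g, h, p) + 1e-12
print("subadditivity fails for the p-norm, but d_p is a metric (p = 0.5)")
```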

4.3 c00 and c0 . Completeness of `p (S, F) and c0 (S, F)


In what follows we put
$$d_p(f,g) = \begin{cases} \|f-g\|_\infty = \sup_s |f(s)-g(s)| & \text{if } p = \infty,\\ \|f-g\|_p = \big(\sum_s |f(s)-g(s)|^p\big)^{1/p} & \text{if } 1 \le p < \infty,\\ \sum_s |f(s)-g(s)|^p & \text{if } 0 < p < 1, \end{cases}$$
which is a metric in all cases. For a function f : S → F we define supp f = {s ∈ S | f(s) ≠ 0}.

4.7 Definition For a set S and F ∈ {R, C} we define

c00 (S, F) = {f : S → F | #(supp f ) < ∞},


c0 (S, F) = {f : S → F | ε > 0 ⇒ #{s ∈ S | |f (s)| ≥ ε} < ∞}.

(The elements of c0 are the functions that ‘tend to zero at infinity’.)

4.8 Lemma If 0 < p ≤ q < ∞, we have
(i) c00(S, F) ⊆ ℓ^p(S, F) ⊆ ℓ^q(S, F) ⊆ c0(S, F) ⊆ ℓ^∞(S, F),
(ii) ‖f‖_p ≤ 1 implies ‖f‖_q ≤ ‖f‖_p^{p/q}; thus ‖f‖_p → 0 ⇒ ‖f‖_q → 0, so that all inclusion maps ℓ^p(S) ↪ ℓ^q(S) for p ≤ q are continuous.
Proof. (i) If f ∈ c00(S, F) then clearly ‖f‖_p < ∞ for all p ∈ (0, ∞]. And f ∈ c0(S, F) implies boundedness of f. This gives the first and last inclusions.
If f ∈ ℓ^p(S, F) with p ∈ (0, ∞) then finiteness of $\sum_{s\in S} |f(s)|^p$ implies that {s ∈ S | |f(s)| ≥ ε} is finite for each ε > 0, thus f ∈ c0(S, F). In particular F = {s ∈ S | |f(s)| ≥ 1} is finite. If now 0 < p < q < ∞ then
$$\|f\|_q^q - \sum_{s\in F} |f(s)|^q = \sum_{s\in S\setminus F} |f(s)|^q = \sum_{s\in S\setminus F} |f(s)|^{p\cdot\frac{q}{p}} \le \sum_{s\in S\setminus F} |f(s)|^p \le \|f\|_p^p < \infty, \qquad (4.3)$$
since q/p > 1 and |f(s)| < 1, thus |f(s)|^q ≤ |f(s)|^p, for all s ∈ S∖F. With the finiteness of $\sum_{s\in F} |f(s)|^q$ this implies $\sum_{s\in S} |f(s)|^q < \infty$, thus f ∈ ℓ^q(S, F).
(ii) If ‖f‖_p ≤ 1 then |f(s)| ≤ 1 for all s ∈ S, so that |f(s)|^q ≤ |f(s)|^p. Summing gives ‖f‖_q^q ≤ ‖f‖_p^p, thus ‖f‖_q ≤ ‖f‖_p^{p/q}. □

4.9 Exercise Let S be an infinite set and 0 < p < q < ∞. Prove that all inclusions in Lemma
4.8(i) are strict.
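The inclusions and the norm bound in Lemma 4.8 are easy to sanity-check numerically. A sketch (plain Python; finite vectors standing in for elements of c00; the bound ‖f‖_q ≤ ‖f‖_p, stronger than what part (ii) states, is the standard monotonicity of counting-measure norms):

```python
import random

def norm_p(xs, p):
    return sum(abs(x) ** p for x in xs) ** (1 / p)

random.seed(1)
p, q = 1.5, 3.0  # 0 < p <= q < infinity
for _ in range(500):
    f = [random.uniform(-1, 1) for _ in range(8)]
    np_, nq = norm_p(f, p), norm_p(f, q)
    # Monotonicity of the counting-measure norms: ||f||_q <= ||f||_p.
    assert nq <= np_ + 1e-12
    if np_ <= 1:
        # The bound from Lemma 4.8(ii): ||f||_q <= ||f||_p^(p/q).
        assert nq <= np_ ** (p / q) + 1e-12
print("l^p is contained in l^q for p <= q, with continuous inclusion")
```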

4.10 Lemma Let p ∈ (0, ∞] and dp (x, y) = kx − ykp . Then (`p (S, F), dp ) is complete for every
set S and F ∈ {R, C}.
Proof. Let {f_n} ⊆ ℓ^p(S, F) be a Cauchy sequence w.r.t. d_p, thus also w.r.t. ‖·‖_p. Then |f_n(s) − f_m(s)| ≤ ‖f_n − f_m‖_p, so that {f_n(s)} is a Cauchy sequence in F, thus convergent for each s ∈ S. Defining g(s) = lim_n f_n(s), it remains to prove g ∈ ℓ^p(S, F) and d_p(f_n, g) → 0.
For p = ∞ and ε > 0 we can find n_0 such that n, m ≥ n_0 implies ‖f_n − f_m‖_∞ < ε, which readily gives ‖f_m‖_∞ ≤ ‖f_{n_0}‖_∞ + ε for all m ≥ n_0. Thus also ‖g‖_∞ ≤ ‖f_{n_0}‖_∞ + ε < ∞. Taking m → ∞ in sup_s |f_n(s) − f_m(s)| < ε gives sup_s |f_n(s) − g(s)| ≤ ε, whence ‖f_n − g‖_∞ → 0.
For 0 < p < ∞ we give a uniform argument. Since {f_n} is Cauchy w.r.t. d_p, for ε > 0 we can find n_0 such that n, m ≥ n_0 implies d_p(f_n, f_m) < ε. In particular d_p(f_m, f_{n_0}) < ε for all m ≥ n_0, thus also d_p(g, f_{n_0}) ≤ ε, thus g ∈ ℓ^p(S, F). Applying the dominated convergence theorem (in the simple case of an infinite sum rather than a general integral, cf. Proposition A.3) to take m → ∞ in d_p(f_n, f_m) < ε gives d_p(f_n, g) ≤ ε, whence d_p(f_n, g) → 0. □

4.11 Lemma (i) We have
$$\overline{c_{00}(S,F)}^{\,\|\cdot\|_p} = \begin{cases} \ell^p(S,F) & \text{if } 0 < p < \infty,\\ c_0(S,F) & \text{if } p = \infty. \end{cases}$$
(ii) (c0(S, F), ‖·‖_∞) is complete.


Proof. (i) Let 0 < p < ∞ and f ∈ ℓ^p(S, F). Then $\sum_{s\in S} |f(s)|^p = \|f\|_p^p$ implies that for each ε > 0 there is a finite F ⊆ S such that $\|f\|_p^p - \sum_{s\in F} |f(s)|^p < \varepsilon$. Putting g(s) = f(s)χ_F(s), we have g ∈ c00(S, F) and $\|f-g\|_p^p = \sum_{s\in S\setminus F} |f(s)|^p < \varepsilon$. Since ε > 0 is arbitrary, c00(S, F) ⊆ ℓ^p(S, F) is dense.
If f ∈ c0(S, F) and ε > 0 then F = {s ∈ S | |f(s)| ≥ ε} is finite. Now g = fχ_F is in c00(S, F) and ‖f − g‖_∞ < ε, proving $f \in \overline{c_{00}(S,F)}^{\,\|\cdot\|_\infty}$. Conversely, $f \in \overline{c_{00}(S,F)}^{\,\|\cdot\|_\infty}$ means that for each ε > 0 there is a g ∈ c00(S, F) with ‖f − g‖_∞ < ε. But this means |f(s)| < ε for all s ∈ S∖F, where F = supp(g) is finite. Thus f ∈ c0(S, F).
(ii) Being the closure of c00(S, F) in ℓ^∞(S, F), c0(S, F) is closed, thus complete by completeness of ℓ^∞(S, F), cf. Lemmas 4.10 and 3.4. □
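The density argument in part (i) can be watched in action: truncating a p-summable sequence to a growing finite support drives the ℓ^p-error to 0. A minimal sketch, using f(n) = 1/n² ∈ ℓ^1(N) as a concrete example (the choice of sequence and cutoffs is illustrative only):

```python
# f(n) = 1/n^2 lies in l^1(N); truncation to {1,...,m} is the c00-approximation
# from the proof of Lemma 4.11(i).
p = 1.0
N = 10 ** 6  # work with a long finite initial segment of the sequence
f = [1.0 / (n * n) for n in range(1, N + 1)]

previous = float("inf")
for m in (10, 100, 1000):
    tail = sum(abs(x) ** p for x in f[m:])  # ||f - f*chi_F||_p^p with F = {1,...,m}
    assert tail < 1.0 / m                   # integral bound: sum_{n>m} 1/n^2 < 1/m
    assert tail < previous                  # errors decrease as the support grows
    previous = tail
print("c00 truncations converge to f in the l^p metric")
```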

4.12 Remark Note that `∞ (S, F) is a commutative algebra under pointwise multiplication
of functions. (In fact, `∞ (S, F) = Cb (S, F) if we equip S with the discrete topology.) And
c0 (S, F) ⊆ `∞ (S, F) is an ideal. 2

While the finitely supported functions are not dense in `∞ (S, F) (for infinite S), the finite-
image functions are:

4.13 Lemma The set {f : S → F | #f(S) < ∞} of functions assuming only finitely many values, equivalently the set of finite linear combinations $\sum_{k=1}^K c_k \chi_{A_k}$ of characteristic functions, is dense in ℓ^∞(S, F).
Proof. We prove this for F = R, from which the case F = C is easily deduced. Let f ∈ ℓ^∞(S, R) and ε > 0. For k ∈ Z define $A_k = f^{-1}([k\varepsilon, (k+1)\varepsilon))$. Define $K = \lceil \|f\|_\infty/\varepsilon\rceil + 1$ and $g = \varepsilon \sum_{|k|\le K} k\,\chi_{A_k}$. Then g has finite image and ‖f − g‖_∞ < ε. □
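The construction in this proof is completely explicit: round f down to the grid εZ. A small sketch (plain Python; the bounded function on a finite S is made up for illustration):

```python
import math
import random

def finite_image_approx(f_vals, eps):
    # g(s) = k*eps where f(s) lies in A_k = f^{-1}([k*eps, (k+1)*eps)).
    return [eps * math.floor(v / eps) for v in f_vals]

random.seed(2)
f = [random.uniform(-5.0, 5.0) for _ in range(1000)]  # a bounded real f on finite S
eps = 0.1
g = finite_image_approx(f, eps)

assert max(abs(a - b) for a, b in zip(f, g)) < eps  # ||f - g||_inf < eps
assert len(set(g)) <= 101                           # g takes finitely many values
print("finite-image approximation within", eps)
```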

4.4 Separability of `p (S, F) and c0 (S, F)


4.14 Proposition Let p ∈ (0, ∞). The metric space (ℓ^p(S, F), d_p), where d_p(f, g) = ‖f − g‖_p, is separable (⇔ second countable) if and only if the set S is countable.
Proof. We prove this for F = R, from which the claim for F = C is easily deduced. For f : S → R, let supp(f) := {s ∈ S | f(s) ≠ 0} ⊆ S be the support of f. Now, if S is countable, then Y = {g : S → Q | #(supp(g)) < ∞} ⊆ ℓ^p(S, R) is countable, and we claim that $\overline{Y} = \ell^p(S,\mathbb{R})$. To prove this, let f ∈ ℓ^p(S, R) and ε > 0. Since $\|f\|_p^p = \sum_{s\in S} |f(s)|^p < \infty$, there is a finite subset T ⊆ S such that $\sum_{s\in S\setminus T} |f(s)|^p < \varepsilon/2$. On the other hand, since $\mathbb{Q}^{\#T} \subseteq \mathbb{R}^{\#T}$ is dense, we can choose g : T → Q such that $\sum_{t\in T} |f(t)-g(t)|^p < \varepsilon/2$. Defining g to be zero on S∖T, we have g ∈ Y and
$$\|f-g\|_p^p = \sum_{t\in T} |f(t)-g(t)|^p + \sum_{s\in S\setminus T} |f(s)|^p < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$
and since ε > 0 was arbitrary, Y ⊆ ℓ^p(S, R) is dense.
For the converse, assume that S is uncountable. By Proposition A.2(iii), supp(f) is countable for every f ∈ ℓ^p(S, R). Thus if Y ⊆ ℓ^p(S, R) is countable then $T = \bigcup_{f\in Y} \mathrm{supp}(f) \subseteq S$ is a countable union of countable sets and therefore countable. Thus all functions f ∈ Y vanish on S∖T ≠ ∅, and the same holds for all $f \in \overline{Y}$ since the coordinate maps f ↦ f(s) are continuous in view of |f(s)| ≤ ‖f‖_p. Thus Y cannot be dense. □

4.15 Exercise With d∞ (f, g) = kf − gk∞ prove


(i) The space (`∞ (S, F), d∞ ) is separable if and only if S is finite.
(ii) The space (c0 (S, F), d∞ ) is separable if and only if S is countable.
Hint: For (i), consider {0, 1}S ⊆ `∞ (S).

4.5 Dual spaces of ℓ^p(S, F), 1 ≤ p < ∞, and c0(S, F)
If (V, ‖·‖) is a normed vector space over F and ϕ : V → F is a linear functional, Definition 3.10 specializes to
$$\|\varphi\| = \sup_{0\ne x\in V} \frac{|\varphi(x)|}{\|x\|} = \sup_{\|x\|\le 1} |\varphi(x)|.$$
Recall that the dual space V* = {ϕ : V → F linear | ‖ϕ‖ < ∞} is a Banach space with norm ‖ϕ‖. The aim of this section is to concretely identify ℓ^p(S, F)* for 1 ≤ p < ∞ and c0(S, F)*. (We will have something to say about ℓ^∞(S, F)*, but the complete story would lead us too far.)
For the purpose of the following proof, it will be useful to define sgn : C → C by sgn(0) = 0 and sgn(z) = z/|z| otherwise. Then $z = \mathrm{sgn}(z)|z|$ and $|z| = \overline{\mathrm{sgn}(z)}\,z$ for all z ∈ C.

4.16 Theorem (i) Let p ∈ [1, ∞] with conjugate value q. Then for each g ∈ ℓ^q(S, F) the map $\varphi_g : \ell^p(S,F)\to F,\ f\mapsto \sum_{s\in S} f(s)g(s)$ satisfies ‖ϕ_g‖ ≤ ‖g‖_q, thus ϕ_g ∈ ℓ^p(S, F)*. And the map ι : ℓ^q(S, F) → ℓ^p(S, F)*, g ↦ ϕ_g, called the canonical map, is linear with ‖ι‖ ≤ 1.
(ii) For all 1 ≤ p ≤ ∞ the canonical map ℓ^q(S, F) → ℓ^p(S, F)* is isometric.
(iii) If 1 ≤ p < ∞, the canonical map ℓ^q(S, F) → ℓ^p(S, F)* is surjective, thus ℓ^p(S, F)* ≅ ℓ^q(S, F).
(iv) The canonical map ℓ^1(S, F) → c0(S, F)* is an isometric bijection, thus c0(S, F)* ≅ ℓ^1(S, F).
(v) If S is finite, the canonical map ℓ^1(S, F) → ℓ^∞(S, F)* is surjective. If S is infinite, its image is a proper closed subspace of ℓ^∞(S, F)*.
Proof. (i) For all p ∈ [1, ∞] and conjugate q we have
$$\Big|\sum_s f(s)g(s)\Big| \le \sum_{s\in S} |f(s)g(s)| \le \|f\|_p\|g\|_q < \infty \quad \forall f\in\ell^p,\ g\in\ell^q$$
by Hölder's inequality. In either case, the absolute convergence for all f, g implies that $(f,g)\mapsto \sum_s f(s)g(s)$ is bilinear.
(ii) If ‖g‖_∞ ≠ 0 and ε > 0 there is an s ∈ S with |g(s)| > ‖g‖_∞ − ε. If f = δ_s : t ↦ δ_{s,t}, we have |ϕ_g(f)| = |g(s)| > ‖g‖_∞ − ε. Since ‖f‖_1 = 1, this proves ‖ϕ_g‖ > ‖g‖_∞ − ε. Since ε > 0 was arbitrary, we have ‖ϕ_g‖ ≥ ‖g‖_∞.
If ‖g‖_1 ≠ 0, define $f(s) = \overline{\mathrm{sgn}(g(s))}$. Then ‖f‖_∞ = 1 and $\sum_s f(s)g(s) = \sum_s |g(s)| = \|g\|_1$. This proves ‖ϕ_g‖ ≥ ‖g‖_1.
If 1 < p, q < ∞ and ‖g‖_q ≠ 0, define $f(s) = \overline{\mathrm{sgn}(g(s))}\,|g(s)|^{q-1}$. Then
$$\sum_s f(s)g(s) = \sum_s |g(s)|^q = \|g\|_q^q, \qquad \|f\|_p^p = \sum_s |f(s)|^p = \sum_{s:\,g(s)\ne 0} |g(s)|^{(q-1)p} = \sum_s |g(s)|^q = \|g\|_q^q,$$
where we used p + q = pq, whence (q−1)p = q. The above gives
$$\|\varphi_g\| \ge \frac{\big|\sum_s f(s)g(s)\big|}{\|f\|_p} = \frac{\|g\|_q^q}{\|f\|_p} = \frac{\|g\|_q^q}{\|g\|_q^{q/p}} = \|g\|_q^{q(1-1/p)} = \|g\|_q.$$
We thus have proven ‖ϕ_g‖ ≥ ‖g‖_q in all cases and since the opposite inequality is known from (i), g ↦ ϕ_g is isometric.

(iii) Let 0 ≠ ϕ ∈ ℓ^1(S, F)*. Define g : S → F by g(s) = ϕ(δ_s). With ‖δ_s‖_1 = 1, we have |g(s)| = |ϕ(δ_s)| ≤ ‖ϕ‖ for all s ∈ S, thus ‖g‖_∞ ≤ ‖ϕ‖. If f ∈ ℓ^1(S, F) and F ⊆ S is finite, we have $\varphi(f\chi_F) = \varphi\big(\sum_{s\in F} f(s)\delta_s\big) = \sum_{s\in F} f(s)g(s)$. In the limit F ↗ S this becomes $\varphi(f) = \sum_{s\in S} f(s)g(s) = \varphi_g(f)$ (since fg ∈ ℓ^1, thus the r.h.s. is absolutely convergent, and ‖f(1 − χ_F)‖_1 → 0 and ϕ is ‖·‖_1-continuous). This proves ϕ = ϕ_g with g ∈ ℓ^∞(S, F).
Now let 1 < p, q < ∞, and let 0 ≠ ϕ ∈ ℓ^p(S, F)*. Since ℓ^1(S, F) ⊆ ℓ^p(S, F) by Lemma 4.8, we can restrict ϕ to ℓ^1(S, F), and the preceding argument gives a g ∈ ℓ^∞(S, F) such that $\varphi(f) = \sum_{s\in S} f(s)g(s)$ for all f ∈ ℓ^1(S, F). The arguments in the proof of (ii) also show that for 1 < p, q < ∞ and any function g : S → F we have
$$\|g\|_q = \sup\Big\{\Big|\sum_{s\in S} f(s)g(s)\Big| \;:\; f\in c_{00}(S,F),\ \|f\|_p\le 1\Big\}.$$
Using this and $\varphi(f) = \sum_s f(s)g(s)$ for all f ∈ c00(S, F) we have
$$\|g\|_q = \sup\{|\varphi(f)| : f\in c_{00}(S,F),\ \|f\|_p\le 1\} \le \|\varphi\| < \infty.$$
Now $\varphi(f) = \sum_{s\in S} f(s)g(s) = \varphi_g(f)$ for all f ∈ ℓ^p(S, F) follows as before from fg ∈ ℓ^1 and ‖f(1 − χ_F)‖_p → 0 as F ↗ S and the ‖·‖_p-continuity of ϕ.
(iv) Let 0 ≠ g ∈ ℓ^1(S, F). Then ϕ_g ∈ ℓ^∞(S, F)*, which we can restrict to c0(S, F). For finite F ⊆ S define f_F = fχ_F with $f(s) = \overline{\mathrm{sgn}(g(s))}$. Then f_F ∈ c00(S, F) with ‖f_F‖_∞ = 1 (provided F ∩ supp g ≠ ∅) and $\varphi_g(f_F) = \sum_{s\in F} |g(s)|$. Thus $\|\varphi_g\| \ge \sum_{s\in F} |g(s)|$ for all finite F intersecting supp g, and this implies ‖ϕ_g‖ ≥ ‖g‖_1. The opposite being known, we have proven that ℓ^1(S, F) → c0(S, F)* is isometric.
To prove surjectivity, let 0 ≠ ϕ ∈ c0(S, F)* and define g : S → F, s ↦ ϕ(δ_s). If now f ∈ c0(S, F) and F ⊆ S is finite, we have $f\chi_F = \sum_{s\in F} f(s)\delta_s$, thus $\varphi(f\chi_F) = \sum_{s\in F} f(s)g(s)$. In particular with $f(s) = \overline{\mathrm{sgn}(g(s))}$ we have $\varphi(f\chi_F) = \sum_{s\in F} f(s)g(s) = \sum_{s\in F} |g(s)|$. Again we have ‖fχ_F‖_∞ ≤ ‖f‖_∞ = 1, thus |ϕ(fχ_F)| ≤ ‖ϕ‖, and combining these observations gives ‖g‖_1 ≤ ‖ϕ‖ < ∞, thus g ∈ ℓ^1(S, F). As F ↗ S, we have ‖f(1 − χ_F)‖_∞ = ‖fχ_{S∖F}‖_∞ → 0 since f ∈ c0, thus with ‖·‖_∞-continuity of ϕ
$$\varphi(f) = \lim_{F\nearrow S} \varphi(f\chi_F) = \lim_{F\nearrow S} \sum_{s\in F} f(s)g(s) = \sum_{s\in S} f(s)g(s) = \varphi_g(f),$$
where we again used fg ∈ ℓ^1. Thus ϕ = ϕ_g, so that ℓ^1(S, F) → c0(S, F)* is an isometric bijection.
(v) It is clear that ι : ℓ^1(S, F) → ℓ^∞(S, F)* is surjective if S is finite. Closedness of the image of ι always follows from the completeness of ℓ^1(S, F) and the fact that ι is an isometry, cf. Corollary 3.6. The failure of surjectivity is deeper than the results of this section so far, so that it is illuminating to give two proofs.
First proof: If S is infinite, the closed subspace c0(S, F) ⊆ ℓ^∞(S, F) is proper since 1 ∈ ℓ^∞(S, F)∖c0(S, F). Thus the quotient space Z = ℓ^∞(S, F)/c0(S, F) is non-trivial. In Section 6 we will show that Z is a Banach space, thus admits non-zero bounded linear maps ψ : Z → F by the Hahn-Banach theorem (Section 7), and that the quotient map p : ℓ^∞(S, F) → Z is bounded. Thus ϕ = ψ∘p is a non-zero bounded linear functional on ℓ^∞(S, F) that vanishes on the closed subspace c0(S, F). By (iv), the canonical map ℓ^1(S, F) → c0(S, F)* is isometric, thus ϕ_g with g ∈ ℓ^1(S, F) vanishes identically on c0(S, F) if and only if g = 0. Thus ϕ ≠ ϕ_g for all g ∈ ℓ^1(S, F).
Second proof: (This proof uses no unproven results from functional analysis, but it does use the Stone-Čech compactification from general topology. Cf. Appendix A.3 and [47].) Since S is discrete, ℓ^∞(S, F) = C_b(S, F) ≅ C(βS, F), where βS is the Stone-Čech compactification of S. The isomorphism is given by the unique continuous extension C_b(S, F) → C(βS, F), f ↦ f̂, with the restriction map C(βS, F) → C_b(S, F) as inverse. Since S is discrete and infinite, thus non-compact, βS ≠ S. If f ∈ c0(S, F) then f̂(x) = 0 for every x ∈ βS∖S. (Proof: Let x ∈ βS∖S. Since S is dense in βS, we can find a net {x_ι} in S such that x_ι → x. Since x ∉ S, the net eventually leaves every finite subset of S. Now f ∈ c0(S) and continuity of f̂ imply f̂(x) = lim f̂(x_ι) = lim f(x_ι) = 0.) Thus for such an x, the evaluation map ψ_x : C(βS, F) → F, f̂ ↦ f̂(x) gives rise to a non-zero bounded linear functional (in fact character) ϕ(f) = f̂(x) on C_b(S, F) = ℓ^∞(S, F) that vanishes on c0(S, F). Now we conclude as in the first proof that ϕ ≠ ϕ_g for all g ∈ ℓ^1(S, F). □
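The test functions used in the proof of Theorem 4.16(ii) make the isometry concrete. A numerical sketch (plain Python, real scalars and finite support, so sgn is just the sign and the complex conjugate can be ignored; the vector g is random and purely illustrative):

```python
import random

def norm_p(xs, p):
    return sum(abs(x) ** p for x in xs) ** (1 / p)

def sgn(x):
    return (x > 0) - (x < 0)

random.seed(3)
p = 1.5
q = p / (p - 1)  # conjugate exponent: 1/p + 1/q = 1
g = [random.uniform(-1, 1) for _ in range(10)]

# The test function from the proof: f(s) = sgn(g(s)) |g(s)|^(q-1).
f = [sgn(x) * abs(x) ** (q - 1) for x in g]

phi_g_f = sum(a * b for a, b in zip(f, g))                 # phi_g(f)
assert abs(phi_g_f - norm_p(g, q) ** q) < 1e-9             # = ||g||_q^q
assert abs(norm_p(f, p) ** p - norm_p(g, q) ** q) < 1e-9   # ||f||_p^p = ||g||_q^q
# Hence |phi_g(f)| / ||f||_p = ||g||_q, so the bound ||phi_g|| >= ||g||_q is attained:
assert abs(phi_g_f / norm_p(f, p) - norm_p(g, q)) < 1e-9
print("||phi_g|| = ||g||_q, attained at the test function f")
```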

4.17 Remark 1. The two proofs of non-surjectivity of the canonical map `1 (S, F) → `∞ (S, F)∗
for infinite S given above are both very non-constructive: The first used the Hahn-Banach
theorem, which we will prove using Zorn’s lemma, equivalent to AC. The second used the
Stone-Čech compactification βS whose usual construction relies on Tychonov’s theorem, which
is equivalent to the axiom of choice. (But here we only need the restriction of Tychonov’s
theorem to Hausdorff spaces, which is equivalent to the ‘ultrafilter lemma’, which is strictly
weaker than AC. Also Hahn-Banach can be proven using only the ultrafilter lemma. See [47].)
2. In fact, the two proofs are essentially the same. The second proof implicitly uses the
fact that `∞ (S) and c0 (S) are algebras, so that we can consider characters instead of all linear
functionals. Now, the characters on `∞ (S) = Cb (S) = C(βS) correspond bijectively to the
points of βS, and those that vanish on c0 (S) correspond to βS\S. The first construction is more
functional analytic, involving the Banach space quotient Cb (S)/c0 (S) and general functionals
instead of characters.
3. The dual space of `∞ (S, F) can be determined quite explicitly, but it is not a space of
functions on S as are the spaces c0 (S, F)∗ and `p (S, F)∗ for p < ∞. It is the space ba(S, F) of
‘finitely additive F-valued measures on S’. Going into this would lead us too far, but see the
supplementary Section B.2.
4. There are set theoretic frameworks without AC (but with DC_ω) in which ℓ^∞(N)* ≅ ℓ^1(N), see [71, §23.10]. (In this situation, all finitely additive measures on N, cf. Section B.2, are countably additive!)
5. For all p ∈ (0, 1), the dual space ℓ^p(S, F)* equals {ϕ_g | g ∈ ℓ^∞(S, F)} ≅ ℓ^1(S, F)*. See [47, Appendix F.6]. Thus there is no p-dependence, despite the fact that the ℓ^p(S, F) are mutually non-isomorphic! 2

4.6 Outlook on general Lp -spaces


For an arbitrary measure space (X, A, µ) one can define normed spaces L^p(X, A, µ; F) in a broadly analogous fashion. (We will usually omit the F.) Since integration on measure spaces is not among the formal prerequisites of these notes, we only sketch the basic facts, referring to, e.g., [11, 69] for details. If f : X → F is a measurable function and 0 < p < ∞, then $\|f\|_p = \big(\int |f(x)|^p\,d\mu(x)\big)^{1/p} \in [0,\infty]$. If p = ∞, put^15
$$\|f\|_\infty = \operatorname{ess\,sup}_\mu |f| = \inf\{\lambda > 0 \mid \mu(\{x\in X : |f(x)| > \lambda\}) = 0\}.$$
Now 𝓛^p(X, µ) = {f : X → F measurable | ‖f‖_p < ∞} is an F-vector space for all p ∈ (0, ∞]. For 1 ≤ p ≤ ∞, the proofs of the inequalities of Hölder and Minkowski extend to the present setting without any difficulties, so that the ‖·‖_p are seminorms on 𝓛^p(X, A, µ). But the latter fails to be a norm whenever there exists ∅ ≠ Y ∈ A with µ(Y) = 0, since then ‖χ_Y‖_p = 0. For this reason we define L^p(X, A, µ) = 𝓛^p(X, A, µ)/{f | ‖f‖_p = 0}. Now it is straightforward to prove that L^p(X, µ) = 𝓛^p(X, µ)/∼ is a normed space, and in fact complete. The proof now uses Proposition 3.2. If S is a set and µ is the counting measure, we have ℓ^p(S, F) = 𝓛^p(S, P(S), µ; F) = L^p(S, P(S), µ; F).
A measurable function is called simple if it assumes only finitely many values. Equivalently, it is of the form $f(x) = \sum_{k=1}^K c_k \chi_{A_k}(x)$, where A_1, ..., A_K are measurable sets. Now one proves that the simple functions are dense in L^p for all p ∈ [1, ∞]. If X is locally compact and µ is nice enough, the set C_c(X, F) of compactly supported continuous functions is dense in L^p(X, A, µ; F) for 1 ≤ p < ∞, while its closure in L^∞ is C_0(X, F).
The inclusion ℓ^p ⊆ ℓ^q for p ≤ q (Lemma 4.8) is false for general measure spaces! In fact, if µ(X) < ∞ then one has the reverse inclusion p ≤ q ⇒ L^q(X, A, µ) ⊆ L^p(X, A, µ), while for general measure spaces there is no inclusion relation between the L^p with different p.
^15 Warning: [11] defines ‖·‖_∞ using locally null sets instead of null sets, which is very non-standard.
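The reverse inclusion for finite measures comes with a quantitative bound, ‖f‖_p ≤ µ(X)^{1/p−1/q} ‖f‖_q, obtained by applying Hölder's inequality to |f|^p · 1; this constant is not stated above but is the standard route. A sketch on a toy finite measure space (weighted points; the weights are made up for illustration):

```python
import random

random.seed(4)
# A finite measure space: points 0..9 with weights mu_i, total mass mu(X).
mu = [random.uniform(0.1, 2.0) for _ in range(10)]
mu_X = sum(mu)

def norm_p(f, p):  # L^p norm w.r.t. the weighted counting measure mu
    return sum(w * abs(x) ** p for w, x in zip(mu, f)) ** (1 / p)

p, q = 2.0, 4.0
for _ in range(200):
    f = [random.uniform(-3, 3) for _ in range(10)]
    # Reverse inclusion with the Hölder constant mu(X)^(1/p - 1/q):
    assert norm_p(f, p) <= mu_X ** (1 / p - 1 / q) * norm_p(f, q) + 1e-9
print("L^q is contained in L^p on this finite measure space")
```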
If 1 < p, q < ∞ are conjugate, the canonical map ι : L^q(X, A, µ) → L^p(X, A, µ)*, g ↦ ϕ_g, is an isometric bijection for all measure spaces. That ι is an isometry is proven just as for the spaces ℓ^p: Hölder's inequality gives ‖ϕ_g‖ ≤ ‖g‖_q, and equality is proven as in Theorem 4.16(ii), using the analogous test functions f ∈ L^p. However, isometry of L^∞(X, A, µ) → (L^1(X, A, µ))* is not automatic, as the measure space X = {x}, A = P(X) = {∅, X} and µ : ∅ ↦ 0, X ↦ +∞ shows: here L^1(X, A, µ; F) ≅ {0}, thus L^1(X, A, µ; F)* ≅ {0}, while L^∞(X, A, µ; F) ≅ F. It is not hard to show that L^∞ → (L^1)* is isometric if and only if (X, A, µ) is semifinite, i.e.
$$\mu(Y) = \sup\{\mu(Z) \mid Z\in A,\ Z\subseteq Y,\ \mu(Z) < \infty\} \quad \forall Y\in A.$$
If 1 < p < ∞, one still has surjectivity of L^q → (L^p)* for all measure spaces (X, A, µ), but the standard proof is outside our scope since it requires the Radon-Nikodym theorem. (For a more functional-analytic proof see Section B.6.) In order for L^∞ → (L^1)* to be an isometric bijection, the measure space must be 'localizable', cf. [69]. This condition subsumes semifiniteness and is implied by σ-finiteness, to which case many books limit themselves.
Since we relegated the dual spaces ℓ^∞(S, F)* to an appendix, we only remark that also in general L^∞(X, A, µ)* is a space of finitely additive measures, with fairly similar proofs, see [17]. For 0 < p < 1, the dual spaces (L^p)* behave even more strangely than (ℓ^p)*. For example, L^p([0, 1], λ; R)* = {0}.

5 Basics of Hilbert spaces


5.1 Inner products. Cauchy-Schwarz inequality
We have seen that every bounded linear functional ϕ on ℓ^p(S, F), where 1 ≤ p < ∞, is of the form $\varphi_g : f\mapsto \sum_{s\in S} f(s)g(s)$ for a certain unique g ∈ ℓ^q(S, F). Here the conjugate exponent q ∈ (1, ∞] is determined by 1/p + 1/q = 1. Clearly we have p = q if and only if p = 2. In this case we have self-duality: ℓ^2(S, F)* ≅ ℓ^2(S, F). The map
$$\ell^2(S,F)\times\ell^2(S,F)\to F,\quad (f,g)\mapsto \sum_{s\in S} f(s)g(s)$$
is bilinear and symmetric. Furthermore, it satisfies $\big|\sum_{s\in S} f(s)g(s)\big| \le \|f\|_2\|g\|_2$. Defining $\overline{g}(s) = \overline{g(s)}$, we have $\|\overline{g}\| = \|g\|$, so that also $\big|\sum_{s\in S} f(s)\overline{g(s)}\big| \le \|f\|_2\|g\|_2$, which is the Cauchy-Schwarz inequality (in its incarnation for ℓ^2(S, C)).
For the development of a general, abstract theory it is better to adopt a slightly different definition:
5.1 Definition Let V be an F-vector space. An inner product on V is a map V × V →
F, (x, y) 7→ hx, yi such that
• The map x 7→ hx, yi is linear for each choice of y ∈ V .
• $\langle y, x\rangle = \overline{\langle x, y\rangle}$ ∀x, y ∈ V.
• hx, xi ≥ 0 ∀x, and hx, xi = 0 ⇒ x = 0.

5.2 Remark 1. Many authors write (x, y) instead of hx, yi, but this leads to confusion with
the notation for ordered pairs. We will use pointed brackets throughout.
2. If F = R, the complex conjugation has no effect and can be omitted. Then h·, ·i is
symmetric.
3. Combining the first two axioms one finds that the map y 7→ hx, yi is anti-linear for each
choice of x. This means $\langle x, cy + c'y'\rangle = \bar{c}\,\langle x, y\rangle + \bar{c}'\,\langle x, y'\rangle$ for all y, y' ∈ V and c, c' ∈ F. Of
course this reduces to linearity if F = R. A map V × V → C that is linear in one variable and
anti-linear in the other is called sesquilinear.
4. A large minority of authors, mostly (mathematical) physicists, defines inner products to
be linear in the second and anti-linear in the first argument. We follow the majority use like
[42].
5. The first two axioms together already imply hx, xi ∈ R for all x, but not the positivity
assumption.
6. If ⟨x, y⟩ = 0 for all y ∈ V then x = 0. To see this, it suffices to take y = x. 2

5.3 Example 1. If V = C^n then $\langle x, y\rangle = \sum_{i=1}^n x_i\overline{y_i}$ is an inner product and the corresponding norm (see below) is ‖·‖_2, which is complete.
2. Let S be any set and V = ℓ^2(S, C). Then $\langle f, g\rangle = \sum_{s\in S} f(s)\overline{g(s)}$ converges for all f, g ∈ V by Hölder's inequality and is easily seen to be an inner product. Of course, 1. is a special case of 2.
3. If (X, A, µ) is any measure space then $\langle f, g\rangle = \int_X f(x)\overline{g(x)}\,d\mu(x)$ is an inner product on L^2(X, A, µ; F) turning it into a Hilbert space. (Here we allow ourselves a standard sloppiness: the elements of L^p are not functions, but equivalence classes of functions. The inner product of two such classes is defined by picking arbitrary representatives.)
4. Let V = M_{n×n}(C). For a, b ∈ V, define $\langle a, b\rangle = \mathrm{Tr}(b^*a) = \sum_{i,j=1}^n a_{ij}\overline{b_{ij}}$, where $(b^*)_{ij} = \overline{b_{ji}}$. That this is an inner product turning V into a Hilbert space follows from 1. upon the identification $M_{n\times n}(\mathbb{C}) \cong \mathbb{C}^{n^2}$.
In view of ⟨x, x⟩ ≥ 0 for all x, we agree that ⟨x, x⟩^{1/2} always denotes the positive root.

5.4 Lemma (Abstract Cauchy-Schwarz inequality) 16 If h·, ·i is an inner product on V


then
|hx, yi| ≤ hx, xi1/2 hy, yi1/2 ∀x, y ∈ V. (5.1)
(This even holds if one drops the assumption that hx, xi = 0 ⇒ x = 0.)
Equality holds in (5.1) if and only if one of the vectors is zero or x = cy for some c ∈ F.

5.5 Exercise Prove Lemma 5.4 along the following lines:


16
Augustin-Louis Cauchy (1789-1857). French mathematician with many important contributions to analysis.
Karl Hermann Armandus Schwarz (1843-1921). German mathematician, mostly active in complex analysis.
Some authors (mostly Russian ones) include Viktor Yakovlevich Bunyakovski (1804-1889). Russian mathematician.

1. Prove it for y = 0, so that we may assume y 6= 0 from now on.
2. Define $x_1 = \|y\|^{-2}\langle x, y\rangle\, y$ and $x_2 = x - x_1$ and prove ⟨x_1, x_2⟩ = 0.
3. Use 2. to prove kxk2 = kx1 k2 + kx2 k2 ≥ kx1 k2 .
4. Deduce Cauchy-Schwarz from kx1 k2 ≤ kxk2 .
5. Prove the claim about equality.
The above proof is the easiest to memorize (at least in outline) and reconstruct, but there
are many others, e.g.:

5.6 Exercise (i) For x, y ∈ V , define P (t) = kx + tyk2 and show this defines a quadratic
polynomial in t ∈ C with real coefficients.
(ii) Use the obvious fact that this polynomial takes values in [0, ∞) for all t ∈ R, thus also
inf t∈R P (t) ≥ 0, to prove the Cauchy-Schwarz inequality.
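Either proof can be sanity-checked numerically. The sketch below verifies (5.1) for random vectors in C^n (the inner product of Example 5.3.1) together with the stated equality case x = cy; the random samples are illustrative only:

```python
import random

def inner(x, y):  # <x, y> = sum x_i * conj(y_i), linear in the first argument
    return sum(a * b.conjugate() for a, b in zip(x, y))

def rand_vec(n):
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

random.seed(5)
for _ in range(500):
    x, y = rand_vec(6), rand_vec(6)
    lhs = abs(inner(x, y))
    rhs = abs(inner(x, x)) ** 0.5 * abs(inner(y, y)) ** 0.5
    assert lhs <= rhs + 1e-9      # Cauchy-Schwarz

# Equality when the vectors are linearly dependent:
y = rand_vec(6)
x = [complex(2.0, -1.0) * v for v in y]  # x = c y
assert abs(abs(inner(x, y)) - abs(inner(x, x)) ** 0.5 * abs(inner(y, y)) ** 0.5) < 1e-9
print("Cauchy-Schwarz verified, with equality for x = c y")
```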
5.7 Proposition If ⟨·,·⟩ is an inner product on V then $\|x\| = +\sqrt{\langle x, x\rangle}$ is a norm on V.
Proof. ‖x‖ ≥ 0 holds by construction, and the third axiom in Definition 5.1 implies ‖x‖ = 0 ⇒ x = 0. We have
$$\|cx\| = \sqrt{\langle cx, cx\rangle} = \sqrt{c\bar{c}\,\langle x, x\rangle} = \sqrt{|c|^2\langle x, x\rangle} = |c|\,\|x\|,$$
thus ‖cx‖ = |c|‖x‖ for all x ∈ V, c ∈ F. Finally,
$$\|x+y\|^2 = \langle x+y, x+y\rangle = \langle x, x\rangle + \langle x, y\rangle + \langle y, x\rangle + \langle y, y\rangle = \|x\|^2 + \|y\|^2 + \langle x, y\rangle + \langle y, x\rangle.$$
With Re z ≤ |z| for all z ∈ C and the Cauchy-Schwarz inequality we have
$$\langle x, y\rangle + \langle y, x\rangle = \langle x, y\rangle + \overline{\langle x, y\rangle} = 2\,\mathrm{Re}\,\langle x, y\rangle \le 2|\langle x, y\rangle| \le 2\|x\|\|y\|,$$
thus
$$\|x+y\|^2 \le \|x\|^2 + \|y\|^2 + 2\|x\|\|y\| = (\|x\|+\|y\|)^2$$
and therefore ‖x + y‖ ≤ ‖x‖ + ‖y‖, i.e. subadditivity. □

In terms of the norm, the Cauchy-Schwarz inequality just becomes |hx, yi| ≤ kxkkyk.

5.8 Definition A pre-Hilbert space (or inner product space) is a pair (V, h·, ·i), where V is an
F-vector space and h·, ·i an inner product on it. A Hilbert space is a pre-Hilbert space that is
complete for the norm k · k obtained from the inner product.

5.9 Remark 1. By the above, an inner product gives rise to a norm and therefore to a norm topology τ. Now the Cauchy-Schwarz inequality implies that the inner product ⟨·,·⟩ : H × H → F is jointly continuous:
$$|\langle x, y\rangle - \langle x', y'\rangle| = |\langle x, y\rangle - \langle x, y'\rangle + \langle x, y'\rangle - \langle x', y'\rangle| = |\langle x, y-y'\rangle + \langle x-x', y'\rangle| \le \|x\|\|y-y'\| + \|x-x'\|\|y'\|.$$
2. If ⟨·,·⟩ is an inner product on H and ‖·‖ is the norm derived from it then
$$\|x\| = \sup_{y\in H,\,\|y\|=1} |\langle x, y\rangle| \quad \forall x\in H. \qquad (5.2)$$
(For x = 0 this is obvious, and for x ≠ 0 it follows from ⟨x, x/‖x‖⟩ = ‖x‖.)
3. The restriction of an inner product on H to a linear subspace K ⊆ H again is an inner
product. Thus if H is a Hilbert space and K a closed subspace then K again is a Hilbert space
(with the restricted inner product).
4. All spaces considered in Example 5.3 are complete, thus Hilbert spaces. For ℓ^2(S) this was proven in Section 4, and the claim for C^n, thus also M_{n×n}(C), follows since C^n ≅ ℓ^2(S, C) when #S = n. For L^2(X, A, µ) see books on measure theory like [11, 69]. 2

5.10 Definition Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be pre-Hilbert spaces. A linear map A : H1 → H2
is called
• isometry if hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 .
• unitary if it is a surjective isometry.

5.11 Remark Every unitary map is invertible and its inverse is also unitary. Two Hilbert
spaces H1 , H2 are called unitarily equivalent or isomorphic if there exists a unitary U : H1 → H2 .
2

If (H_1, ⟨·,·⟩_1), (H_2, ⟨·,·⟩_2) are (pre)Hilbert spaces then
$$\langle (x_1, x_2), (y_1, y_2)\rangle = \langle x_1, y_1\rangle_1 + \langle x_2, y_2\rangle_2$$
defines an inner product on H_1 ⊕ H_2 turning it into a (pre)Hilbert space. More generally, if {(H_i, ⟨·,·⟩_i)}_{i∈I} is a family of (pre)Hilbert spaces then
$$\bigoplus_{i\in I} H_i = \Big\{\{x_i\}_{i\in I} \;:\; \sum_{i\in I}\langle x_i, x_i\rangle_i < \infty\Big\}$$
with
$$\langle\{x_i\}, \{y_i\}\rangle = \sum_{i\in I}\langle x_i, y_i\rangle_i$$
is a (pre)Hilbert space. (If H_i = F for all i ∈ I, this construction recovers ℓ^2(I, F), while the Banach space direct sum gives ℓ^1(I, F).)

5.12 Exercise Let (V, h·, ·i) be a pre-Hilbert space and k·k the associated norm. Let (V 0 , k·k0 )
be the completion (as a normed space) of (V, k · k). Prove that V 0 is a Hilbert space.

5.2 The parallelogram and polarization identities


Given a normed space (V, k · k), it is natural to ask whether there exists an inner product on V
giving rise to the given norm.

5.13 Exercise Let (V, ⟨·,·⟩) be a pre-Hilbert space over F ∈ {R, C}. Prove the parallelogram identity
$$\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2 \quad \forall x, y\in V \qquad (5.3)$$
and the polarization identities
$$\langle x, y\rangle = \frac{1}{4}\sum_{k=0}^{3} i^k\,\|x + i^k y\|^2 \quad \text{if } F = C, \qquad (5.4)$$
$$\langle x, y\rangle = \frac{1}{4}\big(\|x+y\|^2 - \|x-y\|^2\big) \quad \text{if } F = R. \qquad (5.5)$$

5.14 Remark The proof of (5.4) only uses the sesquilinearity of ⟨·,·⟩, so that the polarization identity $[x, y] = \frac{1}{4}\sum_{k=0}^{3} i^k\,[x + i^k y,\, x + i^k y]$ holds for every sesquilinear form [·,·] over C. 2
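In C^n (a Hilbert space by Example 5.3) both identities can be verified numerically; the sketch below recovers the inner product from the norm alone via (5.4), with random vectors chosen purely for illustration:

```python
import random

def inner(x, y):  # <x, y> = sum x_i * conj(y_i), linear in the first argument
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def polarize(x, y):
    # (5.4): <x, y> = (1/4) * sum_{k=0}^{3} i^k * ||x + i^k y||^2
    return sum((1j ** k) * norm_sq([a + (1j ** k) * b for a, b in zip(x, y)])
               for k in range(4)) / 4

random.seed(6)
x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]
y = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]

assert abs(polarize(x, y) - inner(x, y)) < 1e-9
# The parallelogram identity (5.3) as well:
plus = norm_sq([a + b for a, b in zip(x, y)])
minus = norm_sq([a - b for a, b in zip(x, y)])
assert abs(plus + minus - 2 * norm_sq(x) - 2 * norm_sq(y)) < 1e-9
print("polarization and parallelogram identities hold in C^n")
```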

For a map of (pre)Hilbert spaces we have two a priori different notions of isometry, but they
are equivalent:

5.15 Exercise Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be (pre)Hilbert spaces over C. Let k · k1,2 be the
norms induced by the inner products. Prove that if a linear map A : H1 → H2 is an isometry
of normed spaces then it is an isometry of pre-Hilbert spaces. (I.e. if kAxk2 = kxk1 ∀x ∈ H1
then hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 .) Hint: Polarization.
The polarization identities actually characterize norms coming from inner products:

5.16 Exercise Let (V, k · k) be a normed space over C whose norm satisfies (5.3). Take (5.4) as
definition of h·, ·i and prove that this is an inner product. Thus (5.3) is necessary and sufficient
for the norm to come from an inner product. (See the hints in [42, Exercise 1.14].)
The above is only one of very many characterizations of Hilbert spaces among the Banach
spaces. There is a whole book [2] about them!

5.17 Exercise Let S be a set with #S ≥ 2. Prove that (`p (S, F), k · kp ), where p ∈ [1, ∞],
satisfies the parallelogram identity if and only if p = 2.
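For Exercise 5.17, the single pair x = δ_s, y = δ_t already witnesses the failure for every p ≠ 2, since ‖x ± y‖_p = 2^{1/p}. A sketch:

```python
def norm_p(xs, p):
    return sum(abs(v) ** p for v in xs) ** (1 / p)

def parallelogram_defect(x, y, p):
    # ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2, which is 0 in a pre-Hilbert space
    s = norm_p([a + b for a, b in zip(x, y)], p) ** 2 \
        + norm_p([a - b for a, b in zip(x, y)], p) ** 2
    return s - 2 * norm_p(x, p) ** 2 - 2 * norm_p(y, p) ** 2

x, y = [1.0, 0.0], [0.0, 1.0]  # delta_s, delta_t in l^p({s, t})
for p in (1.0, 1.5, 3.0, 4.0):
    # ||x+y||_p^2 + ||x-y||_p^2 = 2 * 2^(2/p), while 2||x||^2 + 2||y||^2 = 4:
    assert abs(parallelogram_defect(x, y, p) - (2 * 2 ** (2 / p) - 4)) < 1e-9
    assert abs(parallelogram_defect(x, y, p)) > 1e-6  # fails for p != 2
assert abs(parallelogram_defect(x, y, 2.0)) < 1e-9    # holds for p = 2
print("the parallelogram law singles out p = 2")
```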

5.3 Orthogonality, subspaces, orthogonal projections


5.3.1 Basic Hilbert space geometry
5.18 Definition If H is a (pre)Hilbert space then x, y ∈ H are called orthogonal, denoted
x ⊥ y, if hx, yi = 0. If S, T ⊆ H then S ⊥ T means x ⊥ y ∀x ∈ S, y ∈ T .
It should be obvious that x ⊥ y implies cx ⊥ dy for all c, d ∈ F.

5.19 Lemma (Pythagoras’ theorem) Let H be a (pre)Hilbert space and x1 , . . . , xn ∈ H


mutually orthogonal, i.e. i 6= j ⇒ hxi , xj i = 0. Let x = x1 + · · · + xn . Then

kxk2 = kx1 k2 + · · · + kxn k2 .

Proof. We have
$$\|x\|^2 = \langle x, x\rangle = \Big\langle\sum_i x_i, \sum_j x_j\Big\rangle = \sum_{i,j}\langle x_i, x_j\rangle = \sum_i\langle x_i, x_i\rangle = \sum_i\|x_i\|^2,$$
where we used i ≠ j ⇒ ⟨x_i, x_j⟩ = 0. □


5.20 Remark If H is a Hilbert space, I is an infinite set and {x_i}_{i∈I} ⊆ H is such that $\sum_i\|x_i\| < \infty$ then we can make sense of $x = \sum_{i\in I} x_i \in H$ by completeness and Proposition 3.2. If all x_i are mutually orthogonal then by taking the limit over finite subsets we again have $\|x\|^2 = \sum_{i\in I}\|x_i\|^2$. (This shows that $\sum_i\|x_i\| < \infty \Rightarrow \sum_i\|x_i\|^2 < \infty$, which also follows from the inclusion ℓ^1(S) ⊆ ℓ^2(S) proven in Lemma 4.8.) 2

5.21 Definition Let V be an F-vector space. Then C ⊆ V is called convex if x, y ∈ C, t ∈


[0, 1] ⇒ tx + (1 − t)y ∈ C.

5.22 Proposition (Riesz lemma)^17 Let H be a Hilbert space and C ⊆ H a non-empty closed convex set. Then for each x ∈ H there is a unique y ∈ C minimizing ‖x − y‖, i.e. ‖x − y‖ = inf_{z∈C} ‖x − z‖.
Proof. We will prove this for x = 0, in which case the statement says that there is a unique element of C of minimal norm. For general x ∈ H, let y' be the unique element of minimal norm in the convex set C' = C − x. Then y = y' + x is the unique element in C minimizing ‖x − y‖.
Let d = inf_{z∈C} ‖z‖ and pick a sequence {y_n} in C such that ‖y_n‖ → d. Now with the parallelogram identity (5.3) we have
$$\|y_n - y_m\|^2 = 2\|y_n\|^2 + 2\|y_m\|^2 - \|y_n + y_m\|^2 = 2\|y_n\|^2 + 2\|y_m\|^2 - 4\Big\|\frac{y_n + y_m}{2}\Big\|^2.$$
As n, m → ∞ we have $2\|y_n\|^2 + 2\|y_m\|^2 \to 4d^2$. Since C is convex, we have $\frac{y_n + y_m}{2} \in C$, thus $\|\frac{y_n + y_m}{2}\|^2 \ge d^2$ for all n, m, so that $\limsup_{n,m\to\infty}\|y_n - y_m\|^2 \le 0$. Since this lim sup also must be non-negative, it follows that ‖y_n − y_m‖ → 0 as n, m → ∞. Thus {y_n} is a Cauchy sequence and therefore converges to some y ∈ C by completeness of H and closedness of C. By continuity of the norm, ‖y‖ = lim ‖y_n‖ = d.
If y, y' ∈ C with ‖y‖ = ‖y'‖ = d then the parallelogram identity gives $\|y - y'\|^2 = 4d^2 - 4\|\frac{y+y'}{2}\|^2$. Since again $\|\frac{y+y'}{2}\|^2 \ge d^2$, we have ‖y − y'‖ = 0, thus the uniqueness claim. □


5.3.2 Closed subspaces, orthogonal complement, and orthogonal projections


5.23 Definition Let (H, h·, ·i) be a Hilbert space and S ⊆ H. The orthogonal complement
S ⊥ is defined as
S ⊥ = {y ∈ H | hy, xi = 0 ∀x ∈ S}.

Here are some easy facts without proofs: For every S ⊆ H we have $\overline{S}^{\,\perp} = S^\perp$, and S^⊥ ⊆ H is a closed linear subspace. If S ⊆ T ⊆ H then T^⊥ ⊆ S^⊥.
A linear subspace of a vector space clearly is a convex subset. Now,

5.24 Theorem Let H be a Hilbert space and K ⊆ H a closed linear subspace. Define a map
P : H → K by P x = y, where y ∈ K minimizes kx − yk as in Proposition 5.22. Also define
Qx = x − P x. Then
(i) Qx ∈ K ⊥ ∀x.
(ii) For each x ∈ H there are unique y ∈ K, z ∈ K ⊥ with x = y + z, namely y = P x, z = Qx.
(iii) The maps P, Q are linear.
(iv) The map U : H → K ⊕ K ⊥ , x 7→ (P x, Qx) is an isomorphism of Hilbert spaces. In
particular, kxk2 = kP xk2 + kQxk2 ∀x.
(v) The map P : H → H satisfies P 2 = P and hP x, yi = hx, P yi. The same holds for Q.
17
Frigyes Riesz (1880-1956). Hungarian mathematician and one of the pioneers of functional analysis. (The same
applies to his younger brother Marcel Riesz (1886-1969).)

Proof. (i) Let x ∈ H, v ∈ K. We want to prove Qx ⊥ v, i.e. ⟨x − Px, v⟩ = 0. Since y = Px is the element of K minimizing ‖x − y‖, we have for all t ∈ C
$$\|x - Px\| \le \|x - Px - tv\|.$$
Taking squares and putting z = x − y = x − Px, this becomes ⟨z, z⟩ ≤ ⟨z − tv, z − tv⟩, equivalent to
$$2\,\mathrm{Re}\big(t\langle v, z\rangle\big) \le |t|^2\|v\|^2.$$
With the polar decomposition t = |t|e^{iϕ}, dividing by |t| turns this into $2\,\mathrm{Re}(e^{i\varphi}\langle v, z\rangle) \le |t|\,\|v\|^2$. Taking |t| → 0, we find $\mathrm{Re}(e^{i\varphi}\langle v, z\rangle) \le 0$, and since ϕ was arbitrary, we conclude ⟨v, z⟩ = 0. In view of z = x − y = x − Px this is what we wanted.
(ii) For each $x \in H$ we have $x = Px + Qx$ with $Px \in K$, $Qx \in K^\perp$, proving the existence. If $y, y' \in K$, $z, z' \in K^\perp$ are such that $y + z = y' + z'$ then $y - y' = z' - z \in K \cap K^\perp = \{0\}$. Thus $y - y' = z' - z = 0$, proving the uniqueness.
(iii) If $x, x' \in H$, $c, c' \in \mathbb{F}$ then $cx + c'x' = P(cx + c'x') + Q(cx + c'x')$. But also
$$cx + c'x' = c(Px + Qx) + c'(Px' + Qx') = (cPx + c'Px') + (cQx + c'Qx')$$
is a decomposition of $cx + c'x'$ as a sum of vectors in $K$ and $K^\perp$, respectively. Since such a decomposition is unique by (ii), we have $P(cx + c'x') = cPx + c'Px'$ and $Q(cx + c'x') = cQx + c'Qx'$, thus $P, Q$ are linear.
(iv) It is clear that $U$ is a linear isomorphism. Furthermore, $Px \perp Qy$ implies
$$\langle x, y\rangle = \langle Px + Qx, Py + Qy\rangle = \langle Px, Py\rangle + \langle Qx, Qy\rangle = \langle Ux, Uy\rangle,$$
so that $U$ is an isometry of Hilbert spaces.



(v) In view of (iv), i.e. $U: H \xrightarrow{\cong} K \oplus K^\perp$, it suffices to prove this for the map $S: H_1 \oplus H_2 \to H_1 \oplus H_2$, $(x_1, x_2) \mapsto (x_1, 0)$. It is clear that $S^2 := S \circ S = S$. And
$$\langle S(x_1, x_2), (y_1, y_2)\rangle = \langle (x_1, 0), (y_1, y_2)\rangle = \langle x_1, y_1\rangle_1 = \langle (x_1, x_2), (y_1, 0)\rangle = \langle (x_1, x_2), S(y_1, y_2)\rangle. \qquad \square$$

5.25 Remark The above theorem remains valid if $H$ is only a pre-Hilbert space, provided $K \subseteq H$ is finite dimensional. We first note that the proof of Proposition 5.22 only uses completeness of $C \subseteq H$, not that of $H$. And we recall that finite dimensional subspaces of normed spaces are automatically complete and closed, cf. Exercises 3.21 and 3.22. In the proof of Theorem 5.24 we use Proposition 5.22 with $C = K$, which is complete as just noted. $\square$

5.26 Exercise Let $H$ be a Hilbert space and $V \subseteq H$ a linear subspace. Prove:
(i) $V^\perp = \{0\}$ if and only if $\overline{V} = H$.
(ii) $V^{\perp\perp} = \overline{V}$.

5.27 Definition Let V be a vector space and H a (pre)Hilbert space.


• A linear map P : V → V is idempotent if P 2 ≡ P ◦ P = P .
• A linear map P : H → H is self-adjoint if hP x, yi = hx, P yi for all x, y ∈ H.
• A bounded linear map P : H → H is an orthogonal projection if it is a self-adjoint
idempotent.

(In Theorem 8.9 we will prove that every self-adjoint P : H → H is automatically bounded,
but this is not needed here.)
We have seen that every closed subspace K of a Hilbert space gives rise to an orthogonal
projection P with P H = K. Conversely, we have:

5.28 Exercise Let H be a Hilbert space and P an idempotent on H (i.e. in B(H)). Prove:
(i) K = P H ⊆ H and L = (1 − P )H are closed linear subspaces.
(ii) We have K ⊥ L if and only if P = P ∗ , i.e. P is an orthogonal projection.
(iii) If P is an orthogonal projection then it equals the P associated to K by Theorem 5.24.
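In finite dimensions the content of Theorem 5.24 and Exercise 5.28 can be checked directly. The following sketch (ours, in Python/numpy, with a randomly chosen two-dimensional subspace $K \subseteq \mathbb{C}^5$) builds the orthogonal projection onto $K$ from an ONB of $K$ and verifies (i), (iv) and (v) numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: H = C^5, K = span of two random vectors.
# An orthonormal basis of K is obtained via QR decomposition, and
# P = E E^* is the orthogonal projection onto K.
A = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))
E, _ = np.linalg.qr(A)            # columns of E: orthonormal basis of K
P = E @ E.conj().T                # orthogonal projection onto K
Q = np.eye(5) - P                 # projection onto K^perp

x = rng.standard_normal(5) + 1j * rng.standard_normal(5)

# (v): P is a self-adjoint idempotent
assert np.allclose(P @ P, P)
assert np.allclose(P, P.conj().T)
# (i): Qx is orthogonal to K, i.e. to every column of E
assert np.allclose(E.conj().T @ (Q @ x), 0)
# (iv): Pythagoras  ||x||^2 = ||Px||^2 + ||Qx||^2
assert np.isclose(np.linalg.norm(x)**2,
                  np.linalg.norm(P @ x)**2 + np.linalg.norm(Q @ x)**2)
```

The trace of an orthogonal projection equals the dimension of its range, here 2.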

5.4 The dual space H ∗ of a Hilbert space


If H is a Hilbert space, every y ∈ H gives rise to a bounded linear functional on H via ϕy :
x 7→ hx, yi. The Cauchy-Schwarz inequality implies kϕy k ≤ kyk. For every y 6= 0 we have
ϕy (y) = hy, yi = kyk2 > 0, implying kϕy k = kyk. As a consequence, for every non-zero x ∈ H
there is a ϕ ∈ H ∗ with ϕ(x) 6= 0. Thus H ∗ separates the points of H. But we have more:

5.29 Theorem (Riesz-Fréchet representation theorem) If H is a Hilbert space and


ϕ ∈ H ∗ then there is a unique y ∈ H such that ϕ = ϕy = h·, yi.
Proof. See [42, Theorem 1.29]. If $\varphi = 0$, put $y = 0$ (every other $y$ gives $\varphi_y \ne 0$). Now assume $\varphi \ne 0$. Let $K = \ker\varphi = \varphi^{-1}(0)$. Then $K \subseteq H$ is a proper (since $\varphi \ne 0$) closed (by continuity) linear subspace of $H$. By the preceding section, $K^\perp \ne \{0\}$. The dimension of $K^\perp$ is one. (If $y_1, y_2 \in K^\perp\backslash\{0\}$ then $\varphi(y_1) \ne 0 \ne \varphi(y_2)$ implies $\varphi(y_1/\varphi(y_1) - y_2/\varphi(y_2)) = 0$ and therefore $y_1/\varphi(y_1) - y_2/\varphi(y_2) \in K \cap K^\perp = \{0\}$, thus $y_1$ and $y_2$ are linearly dependent.) Pick a non-zero $z \in K^\perp$ and put $y = \frac{\overline{\varphi(z)}}{\|z\|^2}\,z$. Then $\varphi_y$ vanishes on $K$, and $\varphi_y(z) = \langle z, y\rangle = \varphi(z)$. Thus $\varphi = \varphi_y$ on both $K$ and $K^\perp = \mathbb{F}z$, thus on $H = K + K^\perp$. If also $\varphi = \varphi_{y'}$ then $\langle x, y - y'\rangle = 0$ for all $x \in H$, in particular for $x = y - y'$, thus $y = y'$, proving uniqueness. $\square$
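The construction in the proof is completely explicit, which the following sketch (ours, in numpy, for a hypothetical functional on $\mathbb{C}^3$ given by a coefficient vector) illustrates: one computes a unit vector $z$ spanning $(\ker\varphi)^\perp$ and applies the proof's formula for $y$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch (ours) of the proof's construction in H = C^3, with
# <x, y> = sum_j x_j conj(y_j), linear in the first argument.
# phi is a hypothetical functional given by a coefficient vector a:
a = np.array([1.0 - 2.0j, 0.5j, 3.0 + 0.0j])
phi = lambda x: a @ x

# ker(phi)^perp is one-dimensional; an SVD of the 1x3 matrix a yields
# a unit vector z spanning it (the remaining rows of Vh span ker(phi)):
_, _, Vh = np.linalg.svd(a.reshape(1, 3))
z = np.conj(Vh[0])                 # non-zero element of ker(phi)^perp
assert np.isclose(np.linalg.norm(z), 1.0)

y = np.conj(phi(z)) * z / np.linalg.norm(z)**2   # the proof's formula

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.isclose(phi(x), np.vdot(y, x))         # phi = phi_y = <., y>
```

Note that `np.vdot(y, x)` conjugates its first argument, so it computes $\langle x, y\rangle$ in the convention of these notes.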

By the above, the map $H \to H^*$, $y \mapsto \varphi_y$ is an isometric bijection (anti-linear if $\mathbb{F} = \mathbb{C}$). If we want, we can use it to put an inner product on $H^*$.

5.30 Exercise Let $H$ be a Hilbert space and $K \subseteq H$ a linear subspace.
(i) Prove that for every $\varphi \in K^*$ there exists $\hat\varphi \in H^*$ such that $\hat\varphi\restriction K = \varphi$.
(ii) Prove that $\hat\varphi \in H^*$ with $\hat\varphi\restriction K = \varphi$ is unique if and only if $\overline{K} = H$.
(iii) Prove that there is a unique $\hat\varphi \in H^*$ satisfying $\hat\varphi\restriction K = \varphi$ and $\|\hat\varphi\| = \|\varphi\|$.

5.5 Orthonormal sets. Bases


We begin by recalling the notion of bases from linear algebra: A finite subset $\{x_1, \ldots, x_n\}$ of a vector space $V$ over the field $k$ is called linearly independent if $\sum_{i=1}^n c_i x_i = 0$, where $c_1, \ldots, c_n \in k$, implies $c_1 = \cdots = c_n = 0$. (In particular, $x_i \ne 0$ $\forall i$.) An arbitrary subset $B \subseteq V$ is called linearly independent if every finite subset $S \subseteq B$ is linearly independent. A linearly independent subset $B \subseteq V$ is called a base (or Hamel base) if every $x \in V$ can be written as a linear combination of finitely many elements of $B$. This is equivalent to $B$ being maximal, i.e. the non-existence of linearly independent sets $B'$ properly containing $B$. One now proves that any two bases of $V$ have the same cardinality. All this is known from linear algebra, but the following possibly not:

5.31 Proposition Every vector space $V$ has a base.
Proof. If $V = \{0\}$, $\emptyset$ is a base. Thus let $V$ be non-zero and let $\mathcal{B}$ be the set of linearly independent subsets of $V$. The set $\mathcal{B}$ is partially ordered by inclusion $\subseteq$ and non-empty (since it contains $\{x\}$ for all $0 \ne x \in V$). We claim that every chain in (= totally ordered subset of) $(\mathcal{B}, \subseteq)$ has an upper bound in $\mathcal{B}$: Just take the union $\hat{B}$ of all sets in the chain. Since any finite subset of the union over a chain of sets is contained in some element of the chain, every finite subset of $\hat{B}$ is linearly independent. Thus $\hat{B}$ is in $\mathcal{B}$ and clearly is an upper bound of the chain. Thus the assumption of Zorn's Lemma is satisfied, so that $(\mathcal{B}, \subseteq)$ has a maximal element $M$. We claim that $M$ is a base for $V$: If this was false, we could find a $v \in V$ not contained in the span of $M$. But then $M \cup \{v\}$ would be a linearly independent set strictly larger than $M$, contradicting the maximality of $M$. $\square$

The linear algebra notion of base becomes quite irrelevant as soon as we are concerned with topological vector spaces, like normed spaces, since in the presence of a topology we can also talk about infinite linear combinations. This leads to the notion of a Hilbert space base, next to which the above purely algebraic one is of little or no relevance.

5.32 Definition Let H be a (pre)Hilbert space. A subset E ⊆ H is orthonormal if


• kxk = 1 ∀x ∈ E.
• If x, y ∈ E, x 6= y then hx, yi = 0.
(Equivalently, hx, yi = δx,y ∀x, y ∈ E.)

5.33 Exercise Prove that every orthonormal set is linearly independent.

5.34 Lemma Let $H$ be a (pre)Hilbert space and $E \subseteq H$ an orthonormal set. Then the Bessel inequality
$$\sum_{e \in E} |\langle x, e\rangle|^2 \le \|x\|^2 \quad \forall x \in H \qquad (5.6)$$
holds.
Proof. Let $E$ be a finite orthonormal set and $x \in H$. Define $y = x - \sum_{e \in E}\langle x, e\rangle e$. It is straightforward to check that $\langle y, e\rangle = 0$ for all $e \in E$. In view of $x = y + \sum_{e \in E}\langle x, e\rangle e$, Pythagoras gives
$$\|x\|^2 = \|y\|^2 + \sum_{e \in E} |\langle x, e\rangle|^2,$$
which in view of $\|y\|^2 \ge 0$ implies (5.6) for all finite $E$. If $E$ is infinite, $\sum_{e \in E} |\langle x, e\rangle|^2$ equals the supremum of $\sum_{e \in F} |\langle x, e\rangle|^2 \le \|x\|^2$ over the finite subsets $F \subseteq E$, which thus also satisfies (5.6). $\square$
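A quick numerical sanity check (ours, in numpy) of Bessel's inequality in $\mathbb{R}^{10}$, with an orthonormal set of only four vectors, deliberately not maximal, so the inequality is generically strict:

```python
import numpy as np

rng = np.random.default_rng(2)

# Orthonormal set E of 4 vectors in R^10, via QR of a random 10x4 matrix.
E, _ = np.linalg.qr(rng.standard_normal((10, 4)))  # columns: the set E
x = rng.standard_normal(10)

coeffs = E.T @ x                    # the numbers <x, e> for e in E
# Bessel:  sum_e |<x,e>|^2  <=  ||x||^2
assert np.sum(coeffs**2) <= np.linalg.norm(x)**2 + 1e-12
```

Replacing `E` by a full $10 \times 10$ orthogonal matrix would turn the inequality into the Parseval identity of Theorem 5.38 (vi a).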

5.35 Definition An orthonormal base (ONB) in a (pre)Hilbert space is an orthonormal set


E that is maximal. I.e. there is no orthonormal set E 0 properly containing E.

5.36 Lemma For every orthonormal set $E$ in a (pre)Hilbert space $H$ there is an orthonormal base $\hat{E}$ containing $E$. In particular every Hilbert space admits an ONB.

Proof. The proof is essentially the same as that of Proposition 5.31: Let $\mathcal{B}$ be the set of orthonormal sets that contain $E$. A Zorn's lemma argument gives the existence of an orthonormal set $\hat{E}$ that contains $E$ and is maximal. Thus $\hat{E}$ is a base. (There is a tricky point: Our $\hat{E}$ is maximal among the orthonormal sets that contain $E$. Without the latter requirement there might be a bigger $\hat{E}$. Why can't this happen?) $\square$

5.37 Exercise If $H$ is a Hilbert space, $E \subseteq H$ is an ONB and $A \in B(H)$, define $\|A\|_E = \sup_{e \in E} \|Ae\|$.
(i) Prove that $\|\cdot\|_E$ is a norm on $B(H)$ and $\|A\|_E \le \|A\|$.
(ii) Show that there are $H, E, A$ making $\frac{\|A\|}{\|A\|_E} \ge \sqrt{N}$ for arbitrarily large $N$.

5.38 Theorem$^{18}$ Let $H$ be a Hilbert space and $E$ an orthonormal set in $H$. Then the following are equivalent:
(i) $E$ is an orthonormal base, i.e. maximal.
(ii a) If $x \in H$ and $x \perp e$ for all $e \in E$ then $x = 0$.
(ii b) The map $H \to \ell^2(E, \mathbb{F})$, $x \mapsto \{\langle x, e\rangle\}_{e \in E}$ (well-defined thanks to (5.6)) is injective.
(iii) $\overline{\mathrm{span}_\mathbb{F} E} = H$.
(iv) For every $x \in H$, there are numbers $\{a_e\}_{e \in E}$ in $\mathbb{F}$ such that $x = \sum_{e \in E} a_e e$.
(v) For every $x \in H$, the equality $x = \sum_{e \in E} \langle x, e\rangle e$ holds.
(vi a) For every $x \in H$, we have $\|x\|^2 = \sum_{e \in E} |\langle x, e\rangle|^2$. (Abstract Parseval$^{19}$ identity)
(vi b) The map $H \to \ell^2(E, \mathbb{F})$, $x \mapsto \{\langle x, e\rangle\}_{e \in E}$ is an isometric map of normed spaces, where $\ell^2(E, \mathbb{F})$ has the $\|\cdot\|_2$-norm.
(vii a) For all $x, y \in H$ we have $\langle x, y\rangle = \sum_{e \in E} \langle x, e\rangle\langle e, y\rangle = \sum_{e \in E} \langle x, e\rangle\overline{\langle y, e\rangle}$.
(vii b) The map $H \to \ell^2(E, \mathbb{F})$, $x \mapsto \{\langle x, e\rangle\}_{e \in E}$ is an isometric map of pre-Hilbert spaces.
Here all summations over $E$ are in the sense of the unordered summation of Appendix A.1 (with $V = H$ in (iv), (v) and $V = \mathbb{F}$ in (vi a), (vii a)).
Proof. If (ii a) holds then $E$ is maximal, thus (i). If (ii a) is false then there is a non-zero $x \in H$ with $x \perp e$ for all $e \in E$. Then $E \cup \{x/\|x\|\}$ is an orthonormal set larger than $E$, thus $E$ is not maximal. Thus (i)$\Leftrightarrow$(ii a). The equivalence (ii a)$\Leftrightarrow$(ii b) follows from the fact that a linear map is injective if and only if its kernel is $\{0\}$.
(iii)$\Rightarrow$(i) If $\overline{\mathrm{span}_\mathbb{F} E} = H$ and $x \in H$ satisfies $x \perp E$ then also $x \perp \overline{\mathrm{span}_\mathbb{F} E} = H$, thus $x = 0$. Thus $E$ is maximal and therefore a base.
(ii a)$\Rightarrow$(iii) $K = \overline{\mathrm{span}_\mathbb{F} E} \subseteq H$ is a closed linear subspace. If $K \ne H$ then by Theorem 5.24 we can find a non-zero $x \in K^\perp$. In particular $x \perp e$ $\forall e \in E$, contradicting (ii a). Thus $K = H$.
It should be clear that the statements (vi b) and (vii b) are just high-brow versions of (vi a), (vii a), respectively, to which they are equivalent. That (vii a) implies (vi a) is seen by taking $x = y$. Since Exercise 5.15 gives (vi b)$\Rightarrow$(vii b), we have the mutual equivalence of (vi a), (vi b), (vii a), (vii b).
[18] I dislike the approach of [42] of restricting this statement to finite or countably infinite orthonormal sets. This in particular means that the hypotheses of [42, Theorem 1.33] can never be satisfied if H is non-separable! I also find it desirable to understand how much of the theorem survives without completeness since the latter does not hold in situations like Example 5.44. See Remark 5.39.
[19] Marc-Antoine Parseval (1755-1836). French mathematician.

(v)$\Rightarrow$(iv) is trivial. If (iv) holds then continuity of the inner product, cf. Remark 5.9.1, implies $\langle x, y\rangle = \sum_{e \in E} a_e \langle e, y\rangle$ for all $y \in H$. For $y \in E$, the r.h.s. reduces to $a_y$, implying (v).
(iv) means that every $x \in H$ is a limit of finite linear combinations of the $e \in E$, thus (iii) holds.
(v)$\Rightarrow$(vi a) For finite $F \subseteq E$ we define $x_F = \sum_{e \in F}\langle x, e\rangle e$. Pythagoras' theorem gives $\|x_F\|^2 = \sum_{e \in F} |\langle x, e\rangle|^2$. As $F \nearrow E$, the l.h.s. converges to $\|x\|^2$ by (v) and the r.h.s. to $\sum_{e \in E} |\langle x, e\rangle|^2$. Thus (vi a) holds.
If (vi a) holds then for each $\varepsilon > 0$ there is a finite $F \subseteq E$ such that $\sum_{e \in E\backslash F} |\langle x, e\rangle|^2 < \varepsilon$. Since $x - x_F$ is orthogonal to each $e \in F$, we have $x - x_F \perp x_F$, so that $\|x\|^2 = \|x - x_F\|^2 + \|x_F\|^2$. Combining this with (vi a) and $\|x_F\|^2 = \sum_{e \in F} |\langle x, e\rangle|^2$ we find $\|x - x_F\|^2 = \sum_{e \in E\backslash F} |\langle x, e\rangle|^2 < \varepsilon$. Since $\varepsilon$ was arbitrary, this proves $\lim_{F \nearrow E} x_F = x$, thus (v).
It remains to prove (iii)$\Rightarrow$(v). The fact $\overline{\mathrm{span}_\mathbb{F} E} = H$ means that for every $x \in H$ and $\varepsilon > 0$ there are a finite subset $F \subseteq E$ and coefficients $\{a_e\}_{e \in F}$ such that $\|x - \sum_{e \in F} a_e e\| < \varepsilon$. On the other hand, Theorem 5.24 tells us that for each finite $F \subseteq E$ there is a unique $P_F(x) \in K_F = \mathrm{span}_\mathbb{F} F$ minimizing $\|x - P_F(x)\|$. Clearly $P_F \restriction K_F$ is the identity map and $P_F$ is the zero map on $K_F^\perp$. Thus defining $P_F'(x) = \sum_{e \in F}\langle x, e\rangle e$, we have $P_F' = P_F$. Thus $\|x - \sum_{e \in F}\langle x, e\rangle e\| \le \|x - \sum_{e \in F} a_e e\| < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, this proves $\lim_{F \nearrow E} \|x - \sum_{e \in F}\langle x, e\rangle e\| = 0$, which is nothing other than the statement $x = \sum_{e \in E}\langle x, e\rangle e$. Since the finite sums $\sum_{e \in F}\langle x, e\rangle e$ are in $H$, the identity $x = \sum_{e \in E}\langle x, e\rangle e$ also holds in $H$ for all $x \in H$. $\square$

5.39 Remark If $H$ is only a pre-Hilbert space, we still have (i)$\Leftrightarrow$(ii a/b)$\Leftarrow$(iii)$\Leftrightarrow$(iv)$\Leftrightarrow$(v)$\Leftrightarrow$(vi a/b)$\Leftrightarrow$(vii a/b). This follows from the fact that completeness is not used in proving these equivalences, except in (iii)$\Rightarrow$(v) where it can be avoided by appealing to Remark 5.25.
Furthermore, we have equivalence of (iii), i.e. $\overline{\mathrm{span}_\mathbb{F} E} = H$ (closure in $H$), with $\overline{\mathrm{span}_\mathbb{F} E} = \hat{H}$ (closure in the completion $\hat{H}$) and with $E$ being an ONB for $\hat{H}$. The equivalence of the second and third statement comes from (i)$\Leftrightarrow$(iii), applied to $\hat{H}$. If $\mathrm{span}_\mathbb{F} E$ is dense in $H$ then it is dense in $\hat{H}$ since $H$ is dense in $\hat{H}$. And the converse follows from the general topology fact that the closure in $H$ of some $S \subseteq H \subseteq \hat{H}$ equals $\overline{S} \cap H$, where $\overline{S}$ is the closure in $\hat{H}$.
In Example 5.44 below, all statements (i)-(vii) hold despite the incompleteness of $H$. But in the absence of completeness the implication (i)$\Rightarrow$(iii) can fail! For a counterexample see Exercise 5.40. (In view of this, maximal orthonormal sets in pre-Hilbert spaces should not be called bases.) In [26] it is even proven that a pre-Hilbert space in which every maximal orthonormal set $E$ has dense span actually is a Hilbert space. Equivalently, in every incomplete pre-Hilbert space there is a maximal orthonormal set $E$ whose span is non-dense! There even are pre-Hilbert spaces (called pathological) in which no orthonormal set has dense span!
Actually, most of the non-trivial results, like $H \cong K \oplus K^\perp$ for closed subspaces $K$ and Theorem 5.29, hold for a pre-Hilbert space if and only if it is a Hilbert space, see [26]. $\square$

5.40 Exercise (Counterexample) Let $H = \ell^2(\mathbb{N}, \mathbb{F})$ and let $f = \sum_{n=1}^\infty \delta_n/n \in H$ (equivalently, $f(n) = 1/n$). Now $K = \mathrm{span}_\mathbb{F}\{f, \delta_2, \delta_3, \ldots\}$ (no closure!) is a pre-Hilbert space. Prove:
(i) $E = \{\delta_2, \delta_3, \ldots\}$ is a maximal orthonormal set in $K$.
(ii) $f \notin \overline{\mathrm{span}_\mathbb{F} E}$, thus $\overline{\mathrm{span}_\mathbb{F} E} \ne K$ (both closures in $K$).

5.41 Theorem ((F.) Riesz-Fischer)$^{20,21}$ Let $H$ be a pre-Hilbert space and $E$ an orthonormal set such that $\overline{\mathrm{span}_\mathbb{F} E} = H$. Then the following are equivalent:
[20] Ernst Sigismund Fischer (1875-1954). Austrian mathematician. Early pioneer of Hilbert space theory.
[21] Also the completeness of $L^2(X, \mathcal{A}, \mu; \mathbb{F})$ (see Lemma 4.10 for $\ell^2(S)$) is sometimes called the Riesz-Fischer theorem.

(i) $H$ is a Hilbert space (thus complete).
(ii) The isometric map $H \to \ell^2(E, \mathbb{F})$, $x \mapsto \{\langle x, e\rangle\}_{e \in E}$ is surjective. I.e. for every $f \in \ell^2(E, \mathbb{F})$ there is an $x \in H$ such that $\langle x, e\rangle = f(e)$ for all $e \in E$.
Proof. (ii)$\Rightarrow$(i) We know from (iii)$\Rightarrow$(vii b) in Theorem 5.38 that the map $H \to \ell^2(E, \mathbb{F})$ is an isometry. If it is surjective then it is an isomorphism of pre-Hilbert spaces. Since $\ell^2(E, \mathbb{F})$ is complete by Lemma 4.10, so is $H$.
(i)$\Rightarrow$(ii) Let $f \in \ell^2(E, \mathbb{F})$. For each finite subset $F \subseteq E$ we define $x_F = \sum_{e \in F} f(e)e$. For each $\varepsilon > 0$ there is a finite $F \subseteq E$ such that $\sum_{e \in E\backslash F} |f(e)|^2 < \varepsilon$. Whenever $F \subseteq U \cap U'$, the identity $x_U - x_{U'} = \sum_{e \in E} (\chi_U(e) - \chi_{U'}(e)) f(e) e$ implies
$$\|x_U - x_{U'}\|^2 = \sum_{e \in E} |\chi_U(e) - \chi_{U'}(e)|^2 |f(e)|^2 \le \varepsilon$$
since $|\chi_U - \chi_{U'}|$ vanishes on $F$ and is bounded by one on $(U \cup U')\backslash F$. Thus $\{x_F\}$ is a Cauchy net and therefore converges to a unique $x \in H$ by completeness, cf. Lemma A.11. By continuity of the inner product, $\langle x_F, e\rangle$ converges to $f(e)$, so that $\langle x, e\rangle = f(e)$ for all $e \in E$. $\square$

5.42 Remark If $E$ and $E'$ are ONBs for a Hilbert space $H$ then one can prove that $E$ and $E'$ have the same cardinality, i.e. there is a bijection between $E$ and $E'$, cf. [12, Proposition I.4.14]. (This does not follow from the linear algebra proof, since the latter uses a different notion of base, the Hamel bases.) The common cardinality of all bases of $H$ is called the dimension of $H$. $\square$

5.43 Proposition For a Hilbert space $H$, the following are equivalent:
(i) $H$ is separable in the topological sense, i.e. there is a countable dense set $S \subseteq H$.
(ii) $H$ admits a countable orthonormal base.
Proof. (ii)$\Rightarrow$(i) Let $E$ be a countable ONB for $H$. Then by Theorem 5.41 we have a unitary equivalence $H \cong \ell^2(E, \mathbb{F})$, and the claim follows from Proposition 4.14.
(i)$\Rightarrow$(ii) Let $S$ be a countable dense set not containing zero and write it as $S = \{x_1, x_2, \ldots\}$. Put $y_1 = x_1/\|x_1\|$. Put $z_2 = x_2 - \langle x_2, y_1\rangle y_1$. If $z_2 \ne 0$, put $y_2 = z_2/\|z_2\|$, otherwise $y_2 = 0$. Now $z_3 = x_3 - \langle x_3, y_1\rangle y_1 - \langle x_3, y_2\rangle y_2$, etc. Now let $E = \{y_n \mid n \in \mathbb{N}\}\backslash\{0\}$. We claim that $E$ is an ONB. It is clear by construction that $E$ is orthonormal. If $z$ is orthogonal to all $y \in E$ then $z$ is orthogonal to all $s \in S$. If $x \in H$ is arbitrary, there is a sequence $\{s_n\}$ in $S$ such that $s_n \to x$. Now continuity of the inner product implies $\langle x, z\rangle = \lim_{n\to\infty} \langle s_n, z\rangle = 0$ for every $x \in H$, thus $z \in H^\perp = \{0\}$, so $E$ is maximal. $\square$
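The Gram-Schmidt procedure used in the proof of (i)$\Rightarrow$(ii) can be sketched for a finite list of vectors as follows (our Python/numpy version; function name and example vectors are ours):

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a list of vectors, discarding z = 0 as in the proof."""
    basis = []
    for x in vectors:
        z = x - sum(np.dot(x, y) * y for y in basis)   # subtract projections
        if np.linalg.norm(z) > tol:                    # keep only z != 0
            basis.append(z / np.linalg.norm(z))
    return basis

S = [np.array([1.0, 1.0, 0.0]),
     np.array([2.0, 2.0, 0.0]),     # linearly dependent: gets dropped
     np.array([0.0, 1.0, 1.0])]
E = gram_schmidt(S)
assert len(E) == 2                  # the dependent vector produced z = 0
for i, y in enumerate(E):
    for j, w in enumerate(E):
        assert np.isclose(np.dot(y, w), 1.0 if i == j else 0.0)
```

The `tol` cutoff replaces the exact test $z \ne 0$ of the proof, as is unavoidable in floating point.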

5.44 Example Here is an application of Theorem 5.38: Let
$$H = \{f \in C([0, 2\pi], \mathbb{C}) \mid f(0) = f(2\pi)\} \cong C(S^1, \mathbb{C}).$$
One easily checks that $\langle f, g\rangle = (2\pi)^{-1}\int_0^{2\pi} f(x)\overline{g(x)}\,dx$ (Riemann integral) is an inner product, so that $(H, \langle\cdot,\cdot\rangle)$ is a pre-Hilbert space. (If a continuous function satisfies $\int |f(x)|^2 dx = 0$ then it is identically zero.) For $n \in \mathbb{Z}$, let $e_n(x) = e^{inx}$. It is straightforward to show that $E = \{e_n \mid n \in \mathbb{Z}\}$ is an orthonormal set, thus Bessel's inequality holds. For $f \in H$ we have
$$\langle f, e_n\rangle = \frac{1}{2\pi}\int_0^{2\pi} f(x)e^{-inx}\,dx,$$
which is the $n$-th Fourier coefficient $\hat{f}(n)$ of $f$, cf. e.g. [76, 34]. In fact, in Fourier analysis one proves, cf. e.g. [76, Corollary 5.4], that the finite linear combinations of the $e_n$ ('trigonometric polynomials') are dense in $H$, which is (iii) of Theorem 5.38. Thus all other statements in the theorem also hold. The weaker statement (ii a) is also well-known in Fourier analysis, cf. [76, Corollary 5.3]. Furthermore,
$$\frac{1}{2\pi}\int_0^{2\pi} |f(x)|^2\,dx = \|f\|^2 = \sum_{n \in \mathbb{Z}} |\langle f, e_n\rangle|^2 = \sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2.$$
This is the original Parseval formula, cf. e.g. [76, Chapter 3, Theorem 1.3]. Note that $H$ is not complete. Measure theory tells us that its completion is $L^2([0, 2\pi], \lambda; \mathbb{C})$, the measure being Lebesgue measure $\lambda$ (defined on the $\sigma$-algebra of Borel sets). Now the map $L^2([0, 2\pi]) \to \ell^2(\mathbb{Z}, \mathbb{C})$, $f \mapsto \hat{f}$ is an isomorphism of Hilbert spaces. This nice situation shows that the Lebesgue integral is much more appropriate for the purposes of Fourier analysis than the Riemann integral (as for most other purposes).
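For a trigonometric polynomial the Parseval formula can be checked numerically. The following sketch (ours, in numpy, with ad-hoc coefficients) compares the mean of $|f|^2$ on an equispaced grid, which integrates trigonometric polynomials of low degree exactly, with $\sum_n |\hat{f}(n)|^2$:

```python
import numpy as np

# Hypothetical trigonometric polynomial f(x) = sum_n c_n e^{inx}:
c = {-2: 1.5, 0: 2.0, 3: -1.0 + 0.5j}

x = np.linspace(0, 2*np.pi, 4096, endpoint=False)
f = sum(cn * np.exp(1j * n * x) for n, cn in c.items())

# (1/2pi) int_0^{2pi} |f|^2 dx, exact for this f on an equispaced grid:
lhs = np.mean(np.abs(f)**2)
rhs = sum(abs(cn)**2 for cn in c.values())   # sum_n |f^(n)|^2
assert np.isclose(lhs, rhs)                  # Parseval
```

By orthonormality of the $e_n$, the Fourier coefficients of this $f$ are exactly the $c_n$, so both sides equal $1.5^2 + 2^2 + |-1 + 0.5i|^2 = 7.5$.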

5.45 Exercise Prove that the pre-Hilbert space $H = C([0, 1])$ with inner product $\langle f, g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt$ is not complete.

5.6 ? Tensor products of Hilbert spaces


In this optional section, referenced only in Section 14.4 but important well beyond that, I assume known$^{22}$ the notion of (algebraic) tensor product $V \otimes_k W$ of two vector spaces $V, W$ over a field $k$. (In two sentences: $V \otimes_k W$ is the free abelian group spanned by the pairs $(v, w) \in V \times W$, divided by the subgroup generated by all elements of the form $(v + v', w) - (v, w) - (v', w)$ and $(v, w + w') - (v, w) - (v, w')$ and $(cv, w) - (v, cw)$, where $v, v' \in V$, $w, w' \in W$, $c \in k$, the quotient being a $k$-vector space in the obvious way. If $v \in V, w \in W$ then the equivalence class $[(v, w)]$ is denoted $v \otimes w$.)
The crucial property is that given a bilinear map $\alpha: V \times W \to Z$ (where $V \times W$ is the Cartesian product) there is a unique linear map $\beta: V \otimes_k W \to Z$ such that $\beta(v \otimes w) = \alpha((v, w))$.

5.46 Lemma Let $(H, \langle\cdot,\cdot\rangle_H)$, $(H', \langle\cdot,\cdot\rangle_{H'})$ be pre-Hilbert spaces over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$. Then there is a unique inner product $\langle\cdot,\cdot\rangle_Z$ on $Z = H \otimes_\mathbb{F} H'$ such that $\langle v \otimes w, v' \otimes w'\rangle_Z = \langle v, v'\rangle_H \langle w, w'\rangle_{H'}$.
Proof. Every element $z \in Z = H \otimes_\mathbb{F} H'$ has a representation $z = \sum_{k=1}^K v_k \otimes w_k$ with $K < \infty$. Given another $z' = \sum_{l=1}^L v'_l \otimes w'_l \in H \otimes_\mathbb{F} H'$, we must define
$$\langle z, z'\rangle_Z = \sum_{k=1}^K \sum_{l=1}^L \langle v_k, v'_l\rangle_H \langle w_k, w'_l\rangle_{H'}.$$
Since an element $z \in Z$ can have many representations of the form $z = \sum_{k=1}^K v_k \otimes w_k$, we must show that this is well-defined. Let thus $\sum_{k=1}^K v_k \otimes w_k = \sum_{\tilde{k}=1}^{\tilde{K}} \tilde{v}_{\tilde{k}} \otimes \tilde{w}_{\tilde{k}}$. Now, for fixed $l$ the map $H \times H' \to \mathbb{F}$, $(x, y) \mapsto \langle x, v'_l\rangle_H \langle y, w'_l\rangle_{H'}$ clearly is bilinear, thus it gives rise to a unique linear map $H \otimes_\mathbb{F} H' \to \mathbb{F}$. This implies
$$\sum_{k=1}^K \sum_{l=1}^L \langle v_k, v'_l\rangle_H \langle w_k, w'_l\rangle_{H'} = \sum_{\tilde{k}=1}^{\tilde{K}} \sum_{l=1}^L \langle \tilde{v}_{\tilde{k}}, v'_l\rangle_H \langle \tilde{w}_{\tilde{k}}, w'_l\rangle_{H'}.$$
[22] Unfortunately, this is often omitted from undergraduate linear algebra teaching. E.g., it does not appear in [22] despite the book's > 500 pages. See however [35, 40] which, admittedly, are aiming higher.

The independence of $\langle z, z'\rangle_Z$ of the representation of $z'$ is shown in the same way.
It is quite clear that $\langle\cdot,\cdot\rangle_Z$ is sesquilinear and satisfies $\langle z', z\rangle_Z = \overline{\langle z, z'\rangle_Z}$.
In order to study $\langle z, z\rangle_Z$ we may assume that $z = \sum_k v_k \otimes w_k$, where the $w_k$ are mutually orthogonal. This leads to
$$\langle z, z\rangle_Z = \sum_k \langle v_k, v_k\rangle_H \langle w_k, w_k\rangle_{H'} = \sum_k \|v_k\|^2 \|w_k\|^2 \ge 0$$
and $\langle z, z\rangle_Z = 0 \Rightarrow z = 0$. $\square$

5.47 Definition If $H, H'$ are Hilbert spaces then $H \otimes H'$ is the Hilbert space obtained by completing the above pre-Hilbert space $(Z, \langle\cdot,\cdot\rangle_Z)$.

5.48 Remark 1. We usually write the completed tensor products $\otimes$ without subscript to distinguish them from the algebraic ones.
2. If $E, E'$ are ONBs in the Hilbert spaces $H, H'$, respectively, then it is immediate that $\{e \otimes e' \mid e \in E, e' \in E'\}$ is an orthonormal set in the algebraic tensor product $H \otimes_\mathbb{F} H'$, thus also in $H \otimes H'$. In fact its span is dense in $H \otimes H'$, so that it is an ONB.
This leads to a pedestrian way of defining the tensor product $H \otimes H'$ of Hilbert spaces over $\mathbb{F}$: Pick ONBs $E \subseteq H$, $E' \subseteq H'$ and define $H \otimes H' = \ell^2(E \times E', \mathbb{F})$. By Remark 5.42, the outcome is independent of the chosen bases up to isomorphism. If $x \in H, x' \in H'$ then the map $E \times E' \to \mathbb{F}$, $(e, e') \mapsto \langle x, e\rangle_H \langle x', e'\rangle_{H'}$ is in $\ell^2(E \times E', \mathbb{F})$, thus defines an element $x \otimes x' \in H \otimes H'$. This map $H \times H' \to H \otimes H'$ is bilinear. But this definition is very ugly and unconceptual due to its reliance on a choice of bases. $\square$
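For finite dimensional spaces the completed and algebraic tensor products coincide, and $v \otimes w$ can be realised as a Kronecker product of coordinate vectors. A sketch (ours, in numpy, with random vectors) checking the defining property of $\langle\cdot,\cdot\rangle_Z$ from Lemma 5.46:

```python
import numpy as np

rng = np.random.default_rng(3)

# H = C^3, H' = C^4, so H (x) H' = C^12 and v (x) w = np.kron(v, w).
# The inner product convention is <x, y> = sum_j x_j conj(y_j), i.e.
# linear in the first slot, matching the notes; np.vdot(b, a) = <a, b>.
v, v2 = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
w, w2 = rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))

lhs = np.vdot(np.kron(v2, w2), np.kron(v, w))   # <v (x) w, v2 (x) w2>_Z
rhs = np.vdot(v2, v) * np.vdot(w2, w)           # <v, v2>_H <w, w2>_H'
assert np.isclose(lhs, rhs)
```

The identity holds because `np.kron` multiplies all pairs of coordinates, so the double sum in the proof of Lemma 5.46 factorizes.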

6 Quotient spaces and complemented subspaces


In linear algebra, one has a notion of quotient spaces, cf. e.g. [22, Exercise 31 in Section 1.3]: If $V$ is an $\mathbb{F}$-vector space and $W \subseteq V$ is a linear subspace, one defines an equivalence relation on $V$ by $x \sim y \Leftrightarrow x - y \in W$ and then lets $V/W$ denote the quotient space $V/\!\sim$, i.e. the set of $\sim$-equivalence classes. One shows that $V/W$ again is an $\mathbb{F}$-vector space.
It is very natural to ask whether $V/W$ again is a Hilbert (or Banach) space if that is the case for $V$. We begin with Hilbert spaces.

6.1 Quotient spaces of Hilbert spaces


6.1 Exercise Let H be a Hilbert space and K ⊆ H a closed linear subspace. Prove that there
is a linear isomorphism H/K → K ⊥ of F-vector spaces.
Conclude that the quotient space H/K of a Hilbert space H by a closed subspace K admits
an inner product turning it into a Hilbert space.
The above, which completes our first encounter with Hilbert spaces, shows that for a Hilbert space $H$ the notion of quotient $H/K$ by a closed subspace $K$ is in a sense quite superfluous, since one has the orthogonal complement $K^\perp \subseteq H$ as a simpler substitute. The latter is no longer available for general Banach spaces, so that we'll have some more work to do.

6.2 Quotient spaces of Banach spaces
In a general Banach space, we don’t have the notion of orthogonal complement. But in most
situations, having Banach quotient spaces is good enough. (For a different substitute for or-
thogonal complements see Section 6.3.)

6.2 Proposition If $V$ is a normed space, $W \subseteq V$ a linear subspace and $V/W$ denotes the quotient vector space, we define $\|\cdot\|' : V/W \to [0, \infty)$ by $\|v + W\|' = \inf_{w \in W} \|v - w\|$. Then
(i) $\|\cdot\|'$ is a seminorm on $V/W$, and the quotient map $p: V \to V/W$ satisfies $\|p\| \le 1$.
(ii) $\|\cdot\|'$ is a norm if and only if $W \subseteq V$ is closed.
(iii) If $W \subseteq V$ is closed, the topology on $V/W$ induced by $\|\cdot\|'$ coincides with the quotient topology, and the quotient map $p: V \to V/W$ is open.
(iv) If $V$ is a Banach space and $W \subseteq V$ is closed then $(V/W, \|\cdot\|')$ is a Banach space.
(v) If $V$ is a Banach space with closed subspace $W$ and $T \in B(V, E)$, where $E$ is a normed space, with $W \subseteq \ker T$, then there is a unique $T' \in B(V/W, E)$ such that $T' \circ p = T$. Furthermore, $\|T'\| = \|T\|$. $T'$ is surjective if and only if $T$ is, and injective if and only if $W = \ker T$.
(vi) If $A$ is a normed algebra and $I \subseteq A$ is a closed two-sided ideal, then $A/I$ is a normed algebra.
Proof. (i) It is clear that $\|0\|' = 0$ (where we denote the zero element of $V/W$ by $0$ rather than $W$). For $x \in V$, $c \in \mathbb{F}\backslash\{0\}$ we have
$$\|c(x + W)\|' = \|cx + W\|' = \inf_{w \in W} \|cx - w\| = |c| \inf_{w \in W} \|x - w/c\| = |c| \inf_{w \in W} \|x - w\| = |c|\,\|x + W\|',$$
where we used that $W \to W$, $w \mapsto cw$ is a bijection. Now let $x_1, x_2 \in V$ and $\varepsilon > 0$. Then there are $w_1, w_2 \in W$ such that $\|x_i - w_i\| < \|x_i + W\|' + \varepsilon/2$ for $i = 1, 2$. Then
$$\|x_1 + x_2 + W\|' = \inf_{w \in W} \|x_1 + x_2 - w\| \le \|(x_1 - w_1) + (x_2 - w_2)\| \le \|x_1 - w_1\| + \|x_2 - w_2\| < \|x_1 + W\|' + \|x_2 + W\|' + \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, we have $\|x_1 + x_2 + W\|' \le \|x_1 + W\|' + \|x_2 + W\|'$, proving subadditivity of $\|\cdot\|'$. It is immediate that $\|v + W\|' = \inf_{w \in W} \|v - w\| \le \|v\|$.
(ii) If $v \in V$, the definition of $\|\cdot\|'$ readily implies that $\|v + W\|' = 0$ if and only if $v \in \overline{W}$. Thus if $W$ is closed then $w = v + W \in V/W$ has $\|w\|' = 0$ only if $w$ is the zero element of $V/W$. And if $W$ is not closed then every $v \in \overline{W}\backslash W$ satisfies $\|v + W\|' = 0$ even though $v + W \in V/W$ is non-zero. Thus $\|\cdot\|'$ is not a norm.
(iii) Continuity of $p: (V, \|\cdot\|) \to (V/W, \|\cdot\|')$ follows from $\|p\| \le 1$, see (i). Since $p$ is norm-decreasing, we have $p(B^V(0, r)) \subseteq B^{V/W}(0, r)$ for each $r > 0$. And if $y \in V/W$ with $\|y\|' < r$ then there is an $x \in V$ with $p(x) = y$ and $\|x\| < r$ (but typically larger than $\|y\|'$). Thus $p$ maps $B^V(0, r)$ onto $B^{V/W}(0, r)$ for each $r$. Similarly, $p(B^V(x, r)) = B^{V/W}(p(x), r)$, and from this it is easily deduced that $p(U) \subseteq V/W$ is open for each open $U \subseteq V$. Thus $p$ is open (w.r.t. the norm topologies on $V, V/W$), which implies (cf. [47, Lemma 6.4.5]) that $p$ is a quotient map, thus the topology on $V/W$ coming from $\|\cdot\|'$ is the quotient topology.
(iv) Let $\{y_n\} \subseteq V/W$ be a Cauchy sequence. Then we can pass to a subsequence $w_n = y_{i_n}$ such that $\|w_n - w_{n+1}\|' < 2^{-n}$. Pick $x_n \in V$ such that $p(x_n) = w_n$ and $\|x_n - x_{n+1}\| < 2^{-n}$. (Why can this be done?) Then $\{x_n\}$ is a Cauchy sequence converging to some $x \in V$ by completeness of $V$. With $y = p(x)$ we have $\|w_n - y\|' \le \|x_n - x\| \to 0$. Thus $w_n \to y$, and since $\{y_n\}$ is Cauchy, also $y_n \to y$, so $V/W$ is complete.

(v) Existence and uniqueness of $T'$ as a linear map are standard. And using $p(B^V(0, 1)) = B^{V/W}(0, 1)$ we have
$$\|T'\| = \sup\{\|T'y\| \mid y \in B^{V/W}(0, 1)\} = \sup\{\|T'p(x)\| \mid x \in B^V(0, 1)\} = \sup\{\|Tx\| \mid x \in B^V(0, 1)\} = \|T\|.$$
Also the statements concerning injectivity and surjectivity of $T'$ are again pure algebra, but for completeness we give proofs: The statement about surjectivity follows from $T = T' \circ p$ together with surjectivity of $p$, which gives $T(V) = T'(V/W)$. If $W \subsetneq \ker T$, pick $x \in (\ker T)\backslash W$ and put $y = p(x)$. Then $y \ne 0$, but $T'y = T'px = Tx = 0$, so that $T'$ is not injective. Now assume $W = \ker T$. If $y \in \ker T'$ then pick $x \in V$ with $y = p(x)$. Then $Tx = T'px = T'y = 0$, thus $x \in \ker T = W$, so that $y = p(x) = 0$, proving injectivity of $T'$.
(vi) It is known from algebra that $A/I$ is again an algebra. By the above, it is normed. It remains to prove that the quotient norm on $A/I$ is submultiplicative. Let $c, d \in A/I$ and $\varepsilon > 0$. Then there are $a, b \in A$ with $p(a) = c$, $p(b) = d$, $\|a\| < \|c\| + \varepsilon$, $\|b\| < \|d\| + \varepsilon$ (see the exercise below). Then $\|cd\| = \|p(ab)\| \le \|ab\| \le \|a\|\,\|b\| < (\|c\| + \varepsilon)(\|d\| + \varepsilon)$, and since this holds for all $\varepsilon > 0$, we have $\|cd\| \le \|c\|\,\|d\|$. $\square$
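In a Hilbert (or Euclidean) space the infimum in the definition of $\|\cdot\|'$ is attained at the orthogonal projection onto $W$, which makes the quotient norm computable. A small sketch (ours, in numpy, with a hand-picked subspace of $\mathbb{R}^4$):

```python
import numpy as np

# Illustration (ours) of the quotient norm in V = R^4 with the Euclidean
# norm: for W = span{(1,1,0,0)/sqrt(2), (0,0,1,0)} the infimum
# ||v + W||' = inf_{w in W} ||v - w|| equals ||v - P_W v||.
B = np.array([[1/np.sqrt(2), 0.0],
              [1/np.sqrt(2), 0.0],
              [0.0,          1.0],
              [0.0,          0.0]])          # columns: ONB of W
v = np.array([1.0, 0.0, 0.0, 2.0])

quotient_norm = np.linalg.norm(v - B @ (B.T @ v))
assert np.isclose(quotient_norm, np.sqrt(4.5))   # computed by hand

# it is a lower bound for ||v - w|| for arbitrary w in W:
for a, b in [(0.3, -1.0), (0.5, 0.0), (2.0, 1.0)]:
    assert quotient_norm <= np.linalg.norm(v - B @ np.array([a, b])) + 1e-12
```

For a general Banach space no such formula exists, which is exactly why the $\inf$ in Proposition 6.2 has to be handled via the $\varepsilon$-arguments above.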

6.3 Exercise (i) If V is a normed space and W ⊆ V is a closed subspace, prove that for
every y ∈ V /W and every ε > 0 there is an x ∈ V with p(x) = y and kxk ≤ kyk + ε.
(ii) Give an example of a normed space V , a closed subspace W and y ∈ V /W for which no
x ∈ V with y = p(x), kxk = kyk exists.

6.4 Exercise Use the quotient space construction of Banach spaces to give a new proof for
the difficult part of Exercise 3.16.
The following is closely related to the Hilbert space ⊥, but not the same:

6.5 Definition Let $V$ be a Banach space and $W \subseteq V$ a subspace. Then the annihilator of $W$ is $W^\perp = \{\varphi \in V^* \mid \varphi\restriction W = 0\} \subseteq V^*$. One easily checks $W^\perp = \overline{W}^{\,\perp}$.

6.6 Exercise Let $V$ be a Banach space and $W \subseteq V$ a closed subspace. Let $p: V \to V/W$ be the quotient map. Prove that the map $\alpha: (V/W)^* \to V^*$, $\psi \mapsto \psi \circ p$ is injective and isometric and its image is $W^\perp \subseteq V^*$. Thus $W^\perp \cong (V/W)^*$ as Banach spaces.

6.7 Exercise Let $V$ be a Banach space and $Z \subseteq V^*$ a closed subspace. Define $Z^\top \subseteq V$ and prove $V^*/Z \cong (V/Z^\top)^*$.

6.8 Exercise Let V be a Banach space, W ⊆ V a closed subspace and Z ⊆ V a finite


dimensional subspace. Prove that W + Z ⊆ V is closed. Hint: Use V /W .

6.9 Exercise Give a counterexample showing that W + Z = {w + z | w ∈ W, z ∈ Z} ⊆ V need


not be closed for all closed subspaces W, Z of a Banach space. Hint: V = `2 (N, R).

6.3 Complemented subspaces


The following notion provides a partial substitute for orthogonal complements which we don’t
have in Banach spaces:

6.10 Definition Let V be a Banach space. A closed subspace W ⊆ V is called complemented
if there is a closed subspace Z ⊆ V such that V = W + Z and W ∩ Z = {0}.
If V, W, Z are as in the definition (without closedness) then every v ∈ V can be written as
v = w + z with w ∈ W, z ∈ Z in a unique way. (Uniqueness follows from w + z = w0 + z 0 ⇒
w − w0 = z 0 − z ∈ W ∩ Z = {0}.) One says ‘V is the internal direct sum of W and Z’. Purely
algebraically, every subspace W has a complementary subspace Z: Pick a (Hamel) base E for
W , extend to a base E 0 of V and put Z = spanF E 0 \E. But here we want Z to be closed! In
Exercise 9.11 we will prove that with closedness of W, Z we have V ∼= W ⊕ Z also topologically.

6.11 Exercise Let V = C([0, 2], R) with the k · k∞ -norm. Let W = {f ∈ V | f|(1,2] = 0}.
(i) Prove that W is complemented.
(ii) Can you ‘classify’ all possible complements, i.e. put them in bijection with a simpler set?

6.12 Exercise Let V be a Banach space and P ∈ B(V ) satisfying P 2 = P . Prove that
W = P V is a complemented subspace. (The converse is also true, as you will prove later.)
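A two-dimensional illustration (ours, in numpy) of the situation in Exercise 6.12: a bounded idempotent that is not self-adjoint still splits $V$ into complementary closed subspaces $W = PV$ and $Z = (1 - P)V$, which here are two non-orthogonal lines.

```python
import numpy as np

# Hypothetical idempotent on V = R^2 that is not an orthogonal projection:
P = np.array([[1.0, 1.0],
              [0.0, 0.0]])
assert np.allclose(P @ P, P)          # idempotent
assert not np.allclose(P, P.T)        # but not self-adjoint

v = np.array([3.0, 2.0])
w, z = P @ v, v - P @ v               # w in W = PV, z in Z = (1-P)V
assert np.allclose(w + z, v)          # v = w + z, the direct sum splitting

# W = span{(1,0)}, Z = span{(-1,1)}: they intersect only in 0 although
# they are not orthogonal.
assert np.isclose(np.cross(np.array([1.0, 0.0]), z / np.linalg.norm(z)),
                  1 / np.sqrt(2))
```

Replacing $P$ by $\mathrm{diag}(1, 0)$ would make the two subspaces orthogonal and $P$ an orthogonal projection, recovering the Hilbert space situation of Theorem 5.24.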

6.13 Exercise Let V be a Banach space and W ⊆ V a closed subspace such that dim V /W <
∞. Prove that W is complemented.

6.14 Proposition Every finite dimensional subspace of a Banach space is complemented.


Proof. The proof will be given in Section 7.3 since it requires tools to which we now turn. 

Not every closed subspace of a Banach space is complemented! In view of Exercise 6.13
and Proposition 6.14, a non-complemented subspace W ⊆ V must have infinite dimension and
codimension. And indeed, c0 (N, F) ⊆ `∞ (N, F) is non-complemented, as we prove in Appendix
B.3. See also [43] for more on the subject of complemented subspaces.
In fact, a Banach space has complementary subspaces for all closed subspaces if and only
if it is isomorphic to a Hilbert space, i.e. it admits an inner product whose associated norm is
equivalent to the original one! See [41].

In the process of returning from Hilbert to Banach spaces, the above discussion of quotient
spaces and complements was the easiest part. The question of bases is much harder for Banach
spaces, as the existence of the two volume treatment [74] of the subject, having 680+888 pages,
might suggest. (Then again, the basics are quite accessible, cf. e.g. [43, 27, 9, 1], but unfortu-
nately we don’t have the time.) The same is true for the formidable subject of tensor products
of Banach spaces, see e.g. [67]. Going into that would be pointless given that we already slighted
the much simpler tensor products of Hilbert spaces.
A more tractable problem is the fact that in the absence of an inner product, the existence
of non-zero bounded linear functionals is rather non-trivial and can in general only be proven
non-constructively, as we will do in the next section. (Of course, for spaces that are given very
explicitly like `p (S, F), we may well have more concrete approaches as in Section 4.5.)

7 Hahn-Banach theorem and its applications


We have seen that every bounded linear functional ϕ ∈ H ∗ , where H is a Hilbert space, is of the
form ϕ = ϕy for a certain (unique) y ∈ H. Thus dual spaces of Hilbert spaces are completely

understood. (The map H → H ∗ , y 7→ ϕy is an anti-linear bijection.) For a general Banach
space V , matters are much more complicated. The point of the Hahn23 -Banach theorem (which
comes in many versions)24 is to show that there are many bounded linear functionals.

7.1 First version of Hahn-Banach over R


7.1 Definition If V is a real vector space, a map p : V → R is called sublinear if it satisfies
• Positive homogeneity: p(cv) = cp(v) for all v ∈ V and c > 0.
• Subadditivity: p(x + y) ≤ p(x) + p(y) for all x, y ∈ V .

7.2 Theorem Let V be a real vector space and p : V → R a sublinear function. Let W ⊆ V
be a linear subspace and ϕ : W → R a linear functional such that ϕ(w) ≤ p(w) for all w ∈ W .
Then there is a linear functional ϕ̂ : V → R such that ϕ̂  W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ V .
The heart of the proof of the theorem is proving it in the case where we extend ϕ from W
to W + Rv 0 .

7.3 Lemma Let V, p, W, ϕ be as in Theorem 7.2 and v0 ∈ V . Then there is a linear functional
ϕ̂ : Y = W + Rv0 → R such that ϕ̂  W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ Y .
Proof. If v0 ∈ W , there is nothing to do, so we may assume v0 ∈ V \W . Then every
x ∈ W + Rv0 can be written as x = w + cv0 with unique w ∈ W, c ∈ R. Thus if d ∈ R, we can
define ϕ̂ : W + Rv0 → R by w + cv0 7→ ϕ(w) + cd for all w ∈ W and c ∈ R. Since ϕ̂ is linear and
trivially satisfies ϕ̂  W = ϕ, it remains to show that d can be chosen such that

ϕ̂(w + cv0 ) = ϕ(w) + cd ≤ p(w + cv0 ) ∀w ∈ W, c ∈ R.        (7.1)

For c = 0, this holds by assumption. If (7.1) holds for all w ∈ W and c ∈ {1, −1}, i.e.

ϕ(w) ± d ≤ p(w ± v 0 ), (7.2)

then for all e > 0 we have

ϕ̂(w ± ev0 ) = eϕ̂(e−1 w ± v0 ) ≤ ep(e−1 w ± v0 ) = p(w ± ev0 ),

thus the desired inequality (7.1) holds for all w ∈ W, c ∈ R. Now d ∈ R satisfies (7.2) for all
w ∈ W and both signs if and only if

ϕ(w) − p(w − v 0 ) ≤ d ≤ p(w0 + v 0 ) − ϕ(w0 ) ∀w, w0 ∈ W.

Clearly this is possible if and only if ϕ(w) − p(w − v 0 ) ≤ p(w0 + v 0 ) − ϕ(w0 ) for all w, w0 ∈ W ,
which is equivalent to ϕ(w) + ϕ(w0 ) ≤ p(w − v 0 ) + p(w0 + v 0 ) ∀w, w0 . This is indeed satisfied for
all w, w0 ∈ W since w + w0 ∈ W so that

ϕ(w) + ϕ(w0 ) = ϕ(w + w0 ) ≤ p(w + w0 ) ≤ p(w − v 0 ) + p(w0 + v 0 )

holds since ϕ is linear and bounded by p and since p is subadditive. 
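The interval of admissible values of d in the proof above can be computed numerically in a toy case. All concrete choices below are ours, not from the text: V = R2 with the Euclidean norm as p, W = span{(1, 0)}, ϕ(t, 0) = t/2, and v0 = (0, 1).

```python
import math

# Toy instance of Lemma 7.3: sample w = (t,0) in W, compute the bounds
# sup_w [phi(w) - p(w - v0)] <= d <= inf_w [p(w + v0) - phi(w)], pick d,
# and verify that the extension stays dominated by p.
p = lambda v: math.hypot(v[0], v[1])
phi = lambda t: 0.5 * t

ts = [k / 100.0 for k in range(-1000, 1001)]
lower = max(phi(t) - p((t, -1)) for t in ts)   # sup of the lower bounds
upper = min(p((t, 1)) - phi(t) for t in ts)    # inf of the upper bounds
assert lower <= upper                          # an admissible d exists
d = 0.5 * (lower + upper)

# phi_hat(w + c*v0) = phi(w) + c*d is dominated by p on sample points:
for t in ts[::50]:
    for c in (-2.0, -1.0, 0.5, 3.0):
        assert phi(t) + c * d <= p((t, c)) + 1e-9
```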

23
Hans Hahn (1879-1934). Austrian mathematician who mostly worked in analysis and topology.
24
Important early results are due to Eduard Helly (1884-1943), another Austrian mathematician. See [42, p. 54-55].

Proof of Theorem 7.2. If W = V , there is nothing to do, so assume W $ V . Let E be the set
of pairs (Z, ψ), where Z ⊆ V is a linear subspace containing W and ψ : Z → R is a linear
map extending ϕ such that ψ(z) ≤ p(z) ∀z ∈ Z. Since W 6= V , Lemma 7.3 implies E 6= ∅.
We define a partial ordering on E by (Z, ψ) ≤ (Z 0 , ψ 0 ) ⇔ Z ⊆ Z 0 , ψ 0  Z = ψ. If C ⊆ E is
a chain, i.e. totally ordered by ≤, let Y = ∪(Z,ψ)∈C Z and define ψY : Y → R by ψY (v) = ψ(v)
for any (Z, ψ) ∈ C with v ∈ Z. This clearly is consistent and gives a linear map. Now (Y, ψY ) is
an element of E and an upper bound for C. Thus by Zorn’s lemma there is a maximal element
(YM , ψM ) of E. Now ψM : YM → R is an extension of ϕ satisfying ψM (y) ≤ p(y) for all y ∈ YM ,
so we are done if we prove YM = V . If this is not the case, we can pick v0 ∈ V \YM and use
Lemma 7.3 to extend ψM to YM + Rv0 , but this contradicts the maximality of (YM , ψM ). 

7.4 Remark The above proof used Zorn’s lemma, which is equivalent to the Axiom of Choice
(AC), and therefore very non-constructive25 . There is nothing much to be done about this, but
we mention that the Hahn-Banach theorem can be deduced from the ‘ultrafilter lemma’, which
is strictly weaker than AC. For separable spaces, the Hahn-Banach theorem can be proven using
only the axiom DCω of countable dependent choice. For proofs of these claims see [47, Appendix
G]. 2

7.2 Hahn-Banach theorem for (semi)normed spaces


With the exception of Section B.5.2 we will not use Theorem 7.2 directly, but only the following
consequence:

7.5 Theorem (Hahn-Banach Theorem) Let V be a vector space over F ∈ {R, C}, p a semi-
norm on it, W ⊆ V a linear subspace and ϕ : W → F a linear functional such that |ϕ(w)| ≤ p(w)
for all w ∈ W . Then there is a linear functional ϕ̂ : V → F such that ϕ̂  W = ϕ and
|ϕ̂(v)| ≤ p(v) for all v ∈ V .
Proof. F = R: This is an immediate consequence of Theorem 7.2 since a seminorm p is sublinear
with the additional properties p(−v) = p(v) ≥ 0 for all v. In particular, −ϕ̂(v) = ϕ̂(−v) ≤
p(−v) = p(v), so that −p(v) ≤ ϕ̂(v) ≤ p(v) for all v ∈ V , which is equivalent to |ϕ̂(v)| ≤ p(v) ∀v.
F = C: Assume ϕ : W → C, where W ⊆ V , satisfies |ϕ(w)| ≤ p(w) ∀w ∈ W . Define ψ : W → R,
w 7→ Re(ϕ(w)), which clearly is R-linear and satisfies the same bounds. Thus by the real case just
considered, there is an R-linear functional ψ̂ : V → R extending ψ such that |ψ̂(v)| ≤ p(v) for all
v ∈ V . Define ϕ̂ : V → C by

ϕ̂(v) = ψ̂(v) − iψ̂(iv).
Again it is clear that ϕ̂ is R-linear. Furthermore

ϕ̂(iv) = ψ̂(iv) − iψ̂(−v) = ψ̂(iv) + iψ̂(v) = i(ψ̂(v) − iψ̂(iv)) = iϕ̂(v),

proving that ϕ̂ : V → C is C-linear. If w ∈ W then
ϕ̂(w) = ψ̂(w) − iψ̂(iw) = ψ(w) − iψ(iw) = Re(ϕ(w)) − iRe(ϕ(iw))
= Re(ϕ(w)) − iRe(iϕ(w)) = Re(ϕ(w)) + iIm(ϕ(w)) = ϕ(w),

so that ϕ̂ extends ϕ.
Given v ∈ V , let α ∈ C, |α| = 1 be such that αϕ̂(v) ≥ 0. Then αϕ̂(v) = ϕ̂(αv) =
Re(ϕ̂(αv)) = ψ̂(αv), so that |ϕ̂(v)| = |αϕ̂(v)| = ψ̂(αv) ≤ p(αv) = p(v). 
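The complexification formula ϕ̂(v) = ψ̂(v) − iψ̂(iv) used in the proof can be sanity-checked numerically; the C-linear functional ϕ below is an arbitrary illustrative choice:

```python
# If phi is C-linear and psi = Re(phi), then psi(v) - i*psi(iv) recovers phi.

def phi(z):                 # a C-linear functional on C, chosen arbitrarily
    return (2 + 3j) * z

def psi(z):                 # its real part, an R-linear functional
    return phi(z).real

for z in (1 + 0j, 0 + 1j, 2 - 5j, -1.5 + 0.25j):
    rebuilt = psi(z) - 1j * psi(1j * z)
    assert abs(rebuilt - phi(z)) < 1e-12
```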

25
“Such reliance on awful non-constructive results is unfortunately typical of traditional functional analysis.” [38]

7.6 Remark In Exercise 5.30 we saw (with a fairly easy proof) that bounded linear functionals
defined on linear subspaces of Hilbert spaces always have unique norm-preserving extensions to
the whole space. For a general Banach space V this uniqueness is far from true! (It holds if and
only if V ∗ is strictly convex, cf. Section B.6 for definition and proof.) 2

7.7 Exercise Give an example for a Banach space V , a linear subspace K ⊆ V and ϕ ∈ K ∗
such that there are multiple norm-preserving extensions ϕ̂ ∈ V ∗ .

7.3 First applications


We are now in a position to give the

Proof of Proposition 6.14. To begin with, finite dimensional subspaces are automatically closed
by Exercise 3.22. Let W ⊆ V be finite dimensional and let {e1 , . . . , en } be a base for W .
Since every w ∈ W can be written as Σni=1 ci ei in a unique way, there are linear functionals
ϕi : W → F such that w = Σni=1 ϕi (w)ei for each w ∈ W . Since W is finite dimensional, the ϕi
are automatically bounded. Now by the Hahn-Banach Theorem 7.5 there are continuous linear
functionals ϕ̂i : V → F extending the ϕi . Then Z = ∩ni=1 ker ϕ̂i is a closed linear subspace
of V . It should be clear that W ∩ Z = {0}. Define P : V → W, v 7→ Σni=1 ϕ̂i (v)ei . We have
P  W = idW , thus P 2 = P . Now apply Exercise 6.12. 

7.8 Proposition Let E be a normed space over F ∈ {R, C}.


(i) For every 0 6= x ∈ E there is a ϕ ∈ E ∗ with kϕk = 1 such that ϕ(x) = kxk. Thus E ∗
separates the points of E.
(ii) If x ∈ E then x̂ : E ∗ → F, ϕ 7→ ϕ(x) is in E ∗∗ with kx̂k = kxk. The map
ιE : E → E ∗∗ , x 7→ x̂ is an isometric embedding.
(iii) The image26 ιE (E) ⊆ E ∗∗ is closed if and only if E is complete (i.e. Banach).
Proof. (i) Let W = Fx ⊆ E. The linear functional ϕ : W → F, cx 7→ ckxk is isometric since
|ϕ(x)| = kxk, thus kϕk = 1. By the Hahn-Banach Theorem 7.5 there exists a ϕ̂ ∈ E ∗ with
ϕ̂(x) = ϕ(x) = kxk and kϕ̂k = kϕk = 1.
(ii) It is clear that x̂ : E ∗ → F, ϕ 7→ ϕ(x) is a linear functional. If x ∈ E, ϕ ∈ E ∗ then
|x̂(ϕ)| = |ϕ(x)| ≤ kxkkϕk. Thus kx̂k ≤ kxk. By (i) there is ϕ ∈ E ∗ with kϕk = 1 such that
ϕ(x) = kxk. This gives kx̂k = kxk. Thus the map ιE : E → E ∗∗ , x 7→ x̂, which clearly is linear,
is an isometric embedding.
(iii) If E is complete then ιE (E) ⊆ E ∗∗ is closed by Corollary 3.6 since ιE is an isometry
by (ii). Conversely, if ιE (E) ⊆ E ∗∗ is closed then completeness of E ∗∗ (Proposition 3.24(ii))
implies that ιE (E) is complete, thus also E since ιE is an isometry. 

It is customary to write simply ι instead of ιE or to drop it from the notation entirely,


identifying E with its image ιE (E) in E ∗∗ , so that E ⊆ E ∗∗ .
26
If f : X → Y is any function, from a category theory point of view one would call X the source (or domain) and
Y the target (or codomain) of f and call the subset f (X) ⊆ Y the image of f . I prefer to avoid the term ‘range’ since
some authors use it for ‘target’ (thus Y ) and others for ‘image’ (thus f (X)). The term ‘image’ is unambiguous since
no reasonable person would use it intending Y .

7.9 Corollary Every normed space E embeds isometrically into a Banach space Ê as a dense
subspace. The space Ê is unique up to isometric isomorphism and is called the completion of
E.
Proof. This can be proven by completing the metric space (E, d), where d(x, y) = kx − yk, and
showing that the completion is a linear space, which is easy. Alternatively, using the above
result that ιE : E → E ∗∗ is an isometry, we can take Ê to be the closure of ιE (E) in E ∗∗ ,
since this is a closed subspace of the complete space E ∗∗ and therefore complete.
Uniqueness of the completion follows with the same proof as for metric spaces, cf. [47]. 

7.10 Exercise Let V be a Banach space and W ⊆ V a subspace. Prove: W is dense in V ⇔
W ⊥ = {0}. (This is a Banach space analogue of Exercise 5.26(i), but now W ⊥ ⊆ V ∗ , not
W ⊥ ⊆ V !)

7.11 Exercise Let V be an infinite dimensional Banach space over F ∈ {R, C}.
(i) Use Hahn-Banach to construct sequences {xn }n∈N ⊆ V and {ϕn }n∈N ⊆ V ∗ such that
kxn k = 1 and ϕn (xn ) 6= 0 for all n ∈ N and ϕn (xm ) = 0 whenever n 6= m.
(ii) Prove that {xn }n∈N is linearly independent and that xn 6∈ spanF {xm | m 6= n} (the closed linear span) for all n.

7.12 Exercise Let V be a Banach space and x ∈ V, ϕ ∈ V ∗ . Prove that ιV ∗ (ϕ)(ιV (x)) = ϕ(x).

7.4 Reflexivity of Banach spaces


7.13 Definition A Banach space E is called reflexive if the map ιE : E → E ∗∗ is surjective.

7.14 Remark 1. If E is reflexive, ιE : E → E ∗∗ is an isometric isomorphism of Banach spaces.


2. There are Banach spaces E that are not reflexive, yet satisfy E ∼= E ∗∗ non-canonically! An
example is the James space, see e.g. [43, Section 4.5], which is also interesting since E ∗∗ /ιE (E)
example is the James space, see e.g. [43, Section 4.5], which is also interesting since E ∗∗ /ιE (E)
is one-dimensional! For most non-reflexive spaces this quotient is infinite dimensional. 2

7.15 Exercise Prove:


(i) Every finite dimensional Banach space is reflexive.
(ii) Every Hilbert space is reflexive.
(iii) If 1 < p < ∞ then `p (S, F) is reflexive.
(iv) If S is infinite then c0 (S, F) and `1 (S, F) are not reflexive.

7.16 Exercise (i) Prove that if E is reflexive then for each ϕ ∈ E ∗ there is x ∈ E such that
kxk = 1 and |ϕ(x)| = kϕk.
(ii) Use (i) and Theorem 4.16 to prove (again) that c0 (N, C) is not reflexive.

7.17 Remark 1. The converse of the statement in Exercise 7.16(i) is also true, but the proof
is much harder and more than 10 pages long! (See [43, Section 1.13].)
2. See Appendix B.6 for the notion of uniform convexity, which is stronger than the strict
convexity encountered earlier, and a proof of the fact that uniformly convex spaces are reflexive.
We will also prove that Lp (X, A, µ) is uniformly convex for each measure space
(X, A, µ) and 1 < p < ∞. This provides a proof of reflexivity of these spaces that does
not use the relation between Lp and Lq . This in turn leads to a simple proof of surjectivity of
the isometric map Lq → (Lp )∗ known from Section 4.6 (reversing the logic of Exercise 7.15(iii)).

3. If E is a Banach space and F ⊆ E is a closed subspace then E is reflexive if and only if both
F and E/F are reflexive. The proof uses only Hahn-Banach. See [85] for a nice exposition. 2

7.18 Theorem Let V be a Banach space. Then V is reflexive if and only if V ∗ is reflexive.
Proof. ⇒ Given surjectivity of the canonical map ιV : V → V ∗∗ , we want to prove surjectivity
of ιV ∗ : V ∗ → V ∗∗∗ . Let thus ϕ ∈ V ∗∗∗ = (V ∗∗ )∗ . Putting ϕ0 = ϕ ◦ ιV ∈ V ∗ , the implication
is proven if we show ϕ = ιV ∗ (ϕ0 ), which means ϕ(x∗∗ ) = ιV ∗ (ϕ0 )(x∗∗ ) for all x∗∗ ∈ V ∗∗ . By
surjectivity of ιV : V → V ∗∗ , this is equivalent to ϕ(ιV (x)) = ιV ∗ (ϕ0 )(ιV (x)) for all x ∈ V . This
is true since the l.h.s. is ϕ0 (x) by definition of ϕ0 and the r.h.s. equals ϕ0 (x) by Exercise 7.12.
⇐ Assume that V is not reflexive. Then ιV (V ) ⊆ V ∗∗ is a proper closed subspace, so that
ιV (V )⊥ 6= {0} by Exercise 7.10. Let thus 0 6= ϕ ∈ ιV (V )⊥ ⊆ V ∗∗∗ . Since V ∗ is reflexive,
we have ϕ = ιV ∗ (ϕ0 ) for some ϕ0 ∈ V ∗ . Using Exercise 7.12 again, for each x ∈ V we have
ϕ0 (x) = ιV ∗ (ϕ0 )(ιV (x)) = ϕ(ιV (x)) = 0. But this means ϕ0 = 0, thus ϕ = 0, a contradiction. 

7.19 Remark 1. Since `∞ (S, F) ∼ = `1 (S, F)∗ , the theorem implies that also `∞ (S, F) is not
reflexive for infinite S.
2. More generally, for non-reflexive E none of the spaces E ∗ , E ∗∗ , E ∗∗∗ , . . . is reflexive, so that
E $ E ∗∗ $ E ∗∗∗∗ $ · · · and E ∗ $ E ∗∗∗ $ E ∗∗∗∗∗ $ · · · , and we have two somewhat mysterious
successions of ever larger spaces! There do not seem to be many general results about this, but
see Lemma B.10(iv). Even understanding C(X, R)∗∗ for compact X is complicated, cf. [33]. 2

7.20 Exercise Let V be a Banach space. Prove:


(i) If V ∗ is separable then V is separable.
(ii) For infinite dimensional separable V , V ∗ can be separable or non-separable. (Examples!)
(iii) If V is separable and reflexive then V ∗ is separable.
While the material of the present section almost trivializes for Hilbert spaces, the results in
the next two sections remain equally non-trivial when restricted to Hilbert spaces.

8 Uniform boundedness theorem: Two versions


8.1 The weak version, using only countable choice
8.1 Definition Let E, F be normed spaces and F ⊆ B(E, F ) a family of bounded linear maps.
(i) F is called pointwise bounded if supA∈F kAxk < ∞ for each x ∈ E.
(ii) F is called uniformly bounded if supA∈F kAk < ∞.
It is trivial that uniform boundedness of F implies pointwise boundedness. Remarkably:

8.2 Theorem [Helly 1912, Hahn, Banach 1922] Let E be a Banach space, F a normed space
and F ⊆ B(E, F ) pointwise bounded. Then F is uniformly bounded.
Proof. Assume that F is not uniformly bounded. Then the sets Fn = {A ∈ F | kAk ≥ 4n }
are all non-empty, so that using ACω (axiom of countable choice), we can pick an An ∈ Fn for
each n ∈ N. By definition of kAn k, the sets Xn = {x ∈ E | kxk ≤ 1, kAn xk ≥ (2/3)kAn k} are all
non-empty, so that using ACω again, we can choose an xn ∈ Xn for each n ∈ N.

Applying the triangle inequality to Az = (1/2)(A(y + z) − A(y − z)) gives

kAzk = (1/2)kA(y + z) − A(y − z)k ≤ (1/2)(kA(y + z)k + kA(y − z)k) ≤ max(kA(y + z)k, kA(y − z)k).

Applying this inequality to A = An+1 , y = yn , z = ±3−(n+1) xn+1 , recalling kAn xn k ≥ (2/3)kAn k,
we see that for at least one of the signs ± we have

kAn+1 (yn ± 3−(n+1) xn+1 )k ≥ 3−(n+1) kAn+1 xn+1 k ≥ (2/3) 3−(n+1) kAn+1 k.
Thus defining a sequence {yn } ⊆ E by y1 = x1 and

yn+1 = yn + 3−(n+1) xn+1   if kAn+1 (yn + 3−(n+1) xn+1 )k ≥ (2/3) 3−(n+1) kAn+1 k,
yn+1 = yn − 3−(n+1) xn+1   otherwise,        (8.1)

we have kAn yn k ≥ (2/3) 3−n kAn k for all n. (For n = 1 this is true since y1 = x1 .) Since (8.1)
involves no further free choices, this inductive definition can be formalized in ZF (which we
don’t do here, see [21]).
With (8.1) and kxn k ≤ 1 for all n, we have kyn+1 − yn k ≤ 3−(n+1) ∀n. Now for all m > n

kym − yn k = kΣ_{k=n}^{m−1} (yk+1 − yk )k ≤ Σ_{k=n}^{∞} 3−(k+1) = 3−(n+1) · (1 − 1/3)−1 = (1/2) 3−n ,

so that {yn } is a Cauchy sequence. By completeness of E we have yn → y ∈ E with ky − yn k ≤
(1/2) 3−n . Another use of the triangle inequality gives

kAn yn k = kAn (y − (y − yn ))k ≤ kAn yk + kAn (y − yn )k ≤ kAn yk + kAn kky − yn k,

so that with ky − yn k ≤ (1/2) 3−n , kAn yn k ≥ (2/3) 3−n kAn k and kAn k ≥ 4n for all n we finally have

kAn yk ≥ kAn yn k − kAn kky − yn k ≥ kAn k((2/3) 3−n − (1/2) 3−n ) = (1/6) 3−n kAn k ≥ (1/6)(4/3)n → ∞.

Thus y ∈ E is a witness for the failure of pointwise boundedness of F. 

8.3 Remark The above method of proof is called the gliding (or sliding) hump method and
is more than 100 years old. (See also Section B.7 for another use of this method.) Nowadays,
the above theorem is usually deduced from Baire’s theorem, cf. Appendix A.5. As mentioned
there, the latter is equivalent to the axiom DCω of countable dependent choice, whereas above
we only used the weaker axiom ACω of countable choice. The above argument was discovered
only a few years ago and published [21] in 2017! 2

8.2 Applications: Banach-Steinhaus, Hellinger-Toeplitz


8.4 Definition Let E, F be normed spaces. A sequence (or net) {An } ⊆ B(E, F ) is strongly
convergent if limn→∞ An x exists for every x ∈ E.
Under the above assumption, the map A : E → F, x 7→ limn→∞ An x is easily seen to be
linear. Now we write An →s A or A = s-lim An .

8.5 Corollary (Banach-Steinhaus)27 28 If E is a Banach space, F a normed space and
the sequence {An } ⊆ B(E, F ) is strongly convergent then the map A = s-lim An is bounded,
thus in B(E, F ).
Proof. The convergence of {An x} ⊆ F for each x ∈ E implies boundedness of {An x | n ∈ N}
for each x, so that F = {An | n ∈ N} ⊆ B(E, F ) is pointwise bounded and therefore uniformly
bounded by Theorem 8.2. Thus there is T such that kAn k ≤ T ∀n, so that kAn xk ≤ T kxk ∀x ∈
E, n ∈ N. With An x → Ax this implies kAxk ≤ T kxk for all x, thus kAk ≤ T < ∞. 
8.6 Remark Clearly An →s A is equivalent to kAn − Akx → 0 for all x ∈ E, where kAkx :=
kAxk is a seminorm on B(E, F ) for each x ∈ E. If kAkx = 0 for all x ∈ E then Ax = 0 ∀x ∈ E,
thus A = 0. Thus the family F = {k · kx | x ∈ E} is separating and induces a locally convex
topology on B(E, F ), the strong operator topology τsot . Norm convergence kAn − Ak → 0 clearly
implies strong convergence An →s A, but usually the strong (operator) topology is strictly weaker
(despite its name) than the norm topology. See the following exercise for an example. 2

8.7 Exercise Let 1 ≤ p < ∞ and V = `p (N, F). For each m ∈ N define Pm ∈ B(V ) by
(Pm f )(n) = f (n) for n ≥ m and (Pm f )(n) = 0 if n < m. Prove Pm →s 0, but kPm k = 1 ∀m,
thus Pm does not converge to 0 in norm.
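A numerical sketch of Exercise 8.7 for p = 2 (the vector f (n) = 1/n and the truncation length 2000 are our illustrative choices): the tail norms kPm f k2 decrease to 0, while Pm fixes the basis vector em , so kPm k = 1 for every m.

```python
# P_m keeps the tail f(n), n >= m, and kills the head; strong but not norm convergence.

def tail_norm(f, m):
    """||P_m f||_2 for a finitely supported f given as a list (0-based index)."""
    return sum(x * x for x in f[m:]) ** 0.5

f = [1.0 / n for n in range(1, 2001)]                # f(n) = 1/n lies in l^2
norms = [tail_norm(f, m) for m in (1, 10, 100, 1000)]
assert all(a > b for a, b in zip(norms, norms[1:]))  # ||P_m f|| decreases ...
assert norms[-1] < 0.05                              # ... towards 0 (strong convergence)

# But ||P_m|| = 1 for every m, since P_m fixes the unit vector e_m:
m = 500
e_m = [0.0] * 2000
e_m[m] = 1.0
assert tail_norm(e_m, m) == 1.0
```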

8.8 Exercise Let V be a separable Banach space and B ⊆ B(V ) a bounded subset.
(i) Prove: If S ⊆ V is dense and a net {Aι } ⊆ B satisfies kAι xk → 0 for all x ∈ S then
kAι xk → 0 for all x ∈ V , thus Aι → 0 in the strong operator topology.
(ii) Prove that the topological space (B, τsot ) is metrizable.
(iii) BONUS: Prove that (V, τsot ) is not metrizable if V is infinite dimensional.

8.9 Corollary (Hellinger-Toeplitz theorem) 29 If H is a Hilbert space and a linear


map A : H → H is self-adjoint (i.e. hAx, yi = hx, Ayi for all x, y ∈ H) then A is bounded.
Proof. The set F = {x 7→ hx, Ayi | y ∈ H, kyk ≤ 1} clearly is contained in H ∗ = B(H, C). For
each x ∈ H we have

Fx = {hx, Ayi | y ∈ H, kyk ≤ 1} = {hAx, yi | y ∈ H, kyk ≤ 1}.

With Cauchy-Schwarz and kyk ≤ 1 we have |hAx, yi| ≤ kAxk. Thus F is pointwise bounded
and therefore uniformly bounded by Theorem 8.2. Thus there is an M ∈ [0, ∞) such that
|hAx, yi| = |hx, Ayi| ≤ M kxk for all y ∈ H with kyk ≤ 1, and this implies kAk ≤ M . 

8.10 Remark The Hellinger-Toeplitz Theorem shows that on a Hilbert space H there are no
unbounded linear operators A : H → H satisfying hAx, yi = hx, Ayi ∀x, y. This is a typical
example of a ‘no-go-theorem’. Occasionally such results are a nuisance. After all, the operator
of multiplication by n on `2 (N) ‘obviously’ is self-adjoint. What Hellinger-Toeplitz really says
is that such an operator cannot be defined everywhere, i.e. on all of H. This leads to the notion
of symmetric operators, and also illustrates that no-go theorems often can be circumvented by
27
Hugo Steinhaus (1887-1972). Polish mathematician
28
In the literature, one can find either this result or Theorem 8.2 denoted as ‘Banach-Steinhaus theorem’.
29
Ernst David Hellinger (1883-1950), Otto Toeplitz (1881-1940). German mathematicians. Both were forced into
exile in 1939.

generalizing the setting. This is the case here, since the Hellinger-Toeplitz theorem only applies
to operators that are defined everywhere. 2

8.11 Definition A symmetric operator on a Hilbert space H is a linear map A : D → H,
where D ⊆ H is a dense linear subspace, that satisfies hAx, yi = hx, Ayi for all x, y ∈ D.

8.12 Exercise Let H = `2 (N, C), D = {f ∈ `2 (N, C) | Σn |nf (n)|2 < ∞} ⊆ H and A : D →
H, (Af )(n) = nf (n). Prove:
(i) D ⊆ H is a dense proper linear subspace.
(ii) A : D → H is symmetric and unbounded.
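The operator of Exercise 8.12 can be probed numerically on finitely supported sequences, which lie in D; the concrete test vectors below are our choices:

```python
# (Af)(n) = n*f(n) on finitely supported sequences, stored as dicts {n: f(n)}.

def A(f):
    return {n: n * v for n, v in f.items()}

def inner(f, g):
    # real inner product suffices for real-valued test vectors
    return sum(v * g.get(n, 0.0) for n, v in f.items())

f = {1: 1.0, 3: -2.0}
g = {1: 0.5, 2: 4.0, 3: 1.0}
assert inner(A(f), g) == inner(f, A(g))     # symmetry <Af,g> = <f,Ag>

# Unboundedness: ||A e_n|| / ||e_n|| = n grows without bound.
ratios = [inner(A({n: 1.0}), A({n: 1.0})) ** 0.5 for n in (1, 10, 100)]
assert ratios == [1.0, 10.0, 100.0]
```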
There is an extensive theory of unbounded linear operators defined on dense subspaces
of a Hilbert space. Most books on (linear) functional analysis have a chapter on them, e.g.,
[55, 59, 12, 64]. This theory is quite important for applications to differential equations and
quantum mechanics, but since it is quite technical one should not approach it before one has
mastered the material of this course.

8.3 The strong version, using Baire’s theorem


We have seen two applications of the statement F ⊆ B(E, F ) pointwise bounded ⇒ F uniformly
bounded. Some applications of the uniform boundedness theorem use the contraposition: If F
is not uniformly bounded then it is not pointwise bounded, thus there exists x ∈ E with
supA∈F kAxk = ∞. For some of these applications the following statement, which clearly
implies Theorem 8.2, is a very definite improvement30 of the latter:

8.13 Theorem Let E be a Banach space, F a normed space and F ⊆ B(E, F ). Then either
F is uniformly bounded or the set {x ∈ E | supA∈F kAxk = ∞} ⊆ E is dense Gδ .
Proof. The map F → R≥0 , x 7→ kxk is continuous and each A ∈ F is bounded, thus continuous.
Therefore the map fA : E → R≥0 , x 7→ kAxk is continuous for every A ∈ F. Defining for each
n∈N
Vn = {x ∈ E | supA∈F kAxk > n},

the definition of sup implies

Vn = {x ∈ E | ∃A ∈ F : kAxk > n} = ∪A∈F {x ∈ E | kAxk > n} = ∪A∈F fA−1 ((n, ∞)),

which is open by continuity of the fA .


If Vn is non-dense for some n ∈ N, there exists x0 ∈ E and r > 0 such that B(x0 , r) ∩ Vn = ∅.
This means supA∈F kA(x0 + x)k ≤ n for all x with kxk < r. With x = (x0 + x) − x0 and the
triangle inequality we have

kAxk ≤ kA(x0 + x)k + kAx0 k ≤ 2n ∀A ∈ F, x ∈ B(0, r).

This implies kAk ≤ 2n/r for all A ∈ F, thus F is uniformly bounded.


If Vn ⊆ E is dense for all n ∈ N then Baire’s Theorem A.20 gives that the Gδ -set X =
∩n∈N Vn is dense. Since the definition of the Vn gives X = {x ∈ E | supA∈F kAxk = ∞}, the
claim is proven. 
30
Mystifyingly, not many authors state this better result, even though using Baire it comes out without extra effort.

8.4 Application: A dense set of continuous functions with divergent Fourier series
Let f : R → C be 2π-periodic, i.e. f (x + 2π) = f (x) ∀x, and integrable over finite intervals.
Define

cn (f ) = (2π)−1 ∫0^{2π} f (x)e−inx dx        (8.2)
and

Sn (f )(x) = Σ_{k=−n}^{n} ck (f )eikx , n ∈ N.        (8.3)

The fundamental problem of the theory of Fourier series is to find conditions for the conver-
gence Sn (f )(x) → f (x) as n → ∞, where convergence can be understood as (possibly almost)
everywhere pointwise or w.r.t. some norm, like k · k2 (as in Example 5.44) or k · k∞ . Here we
will discuss only continuous functions and we identify continuous 2π-periodic functions with
continuous functions on S 1 . It is not hard to show that Sn (f )(x) → f (x) if f is differentiable
at x (or just Hölder continuous: |f (x0 ) − f (x)| ≤ C|x0 − x|D with C, D > 0 for x0 near x) and
that convergence is uniform when f is continuously differentiable (or the Hölder condition holds
uniformly in x, x0 ). (See any number of books on Fourier analysis, e.g. [76, 34].)
Assuming only continuity of f one can still prove that limn→∞ Sn (f )(x) = f (x) if the limit
exists, but there actually exist continuous functions f such that Sn (f )(x) diverges at some
x. Such functions were first constructed in the 1870s using ‘condensation of singularities’, a
relative and precursor of the gliding hump method. Nowadays, most textbook presentations of
such functions are based on Lemma 8.15 below combined with either the uniform boundedness
theorem or constructions ‘by hand’, see e.g. [34, Section II.2], that are quite close in spirit to
the uniform boundedness method.
However, individual examples of continuous functions with divergent (in a point) Fourier
series can be produced in a totally constructive fashion, avoiding all choice axioms! (See [49]
for a very classical example.) But using non-constructive arguments seems unavoidable if one
wants to prove that there are many such functions as in the following:

8.14 Theorem There is a dense Gδ -set X ⊆ C(S 1 ) such that {Sn (f )(0)}n∈N diverges for each
f ∈ X.
Proof. Inserting (8.2) into (8.3) we obtain

Sn (f )(x) = (2π)−1 Σ_{k=−n}^{n} eikx ∫0^{2π} f (t)e−ikt dt = (2π)−1 ∫0^{2π} f (t) Σ_{k=−n}^{n} eik(x−t) dt = (Dn ? f )(x),

where ? denotes convolution, defined for 2π-periodic f, g by (f ? g)(x) = (2π)−1 ∫0^{2π} f (t)g(x − t)dt,
and

Dn (x) := Σ_{k=−n}^{n} eikx = sin((n + 1/2)x) / sin(x/2)

is the Dirichlet kernel. The quickest way to check the last identity is the telescoping calculation

(eix/2 − e−ix/2 )Dn (x) = Σ_{k=−n}^{n} (eix(k+1/2) − eix(k−1/2) ) = eix(n+1/2) − e−ix(n+1/2) ,

together with eix − e−ix = 2i sin x. Since Dn (x) is an even function, we have

ϕn (f ) := Sn (f )(0) = (2π)−1 ∫0^{2π} f (x)Dn (x)dx.

It is clear that the norm of the map ϕn : (C(S 1 ), k · k∞ ) → C is bounded above by kDn k1 .
For gn (x) = sgn(Dn (x)) we have ϕn (gn ) = (2π)−1 ∫0^{2π} |Dn (x)|dx =: kDn k1 . While gn is not
continuous, we can find a sequence of continuous gn,m bounded by 1 such that gn,m → gn (m → ∞)
pointwise. Now Lebesgue’s dominated convergence theorem implies ϕn (gn,m ) → ϕn (gn ) =
kDn k1 , which implies kϕn k = kDn k1 . By Lemma 8.15 below, kDn k1 → ∞ as n → ∞. Thus the
family F = {ϕn } ⊆ B(C(S 1 ), C) is not uniformly bounded. Now Theorem 8.13 implies that the
set X = {f ∈ C(S 1 , C) | {Sn (f )(0)} is unbounded} is dense Gδ . 

8.15 Lemma We have kDn k1 ≥ (4/π 2 ) log n for all n ∈ N.
Proof. Using | sin x| ≤ |x| for all x ∈ R, we compute

kDn k1 = (2π)−1 ∫_{−π}^{π} |Dn (x)|dx ≥ (2/π) ∫0^{π} | sin((n + 1/2)x)| dx/x
= (2/π) ∫0^{(n+1/2)π} | sin x| dx/x ≥ (2/π) Σ_{k=1}^{n} ∫_{(k−1)π}^{kπ} (| sin x|/x) dx
≥ (2/π) Σ_{k=1}^{n} (1/kπ) ∫0^{π} sin x dx = (4/π 2 ) Σ_{k=1}^{n} 1/k ≥ (4/π 2 ) log n,

where we used Σ_{k=1}^{n} 1/k ≥ ∫1^{n+1} dx/x = log(n + 1) > log n. 
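Lemma 8.15 can also be checked numerically (the grid size M and the values of n below are our choices); we evaluate Dn via the equivalent formula Dn (x) = 1 + 2 Σ_{k=1}^{n} cos(kx), which avoids the singular quotient at x = 0:

```python
import math

def dirichlet(n, x):
    return 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, n + 1))

def L1_norm(n, M=8000):
    # Riemann sum for (2*pi)^(-1) * integral over [0, 2*pi) of |D_n(x)| dx
    return sum(abs(dirichlet(n, 2 * math.pi * j / M)) for j in range(M)) / M

vals = {n: L1_norm(n) for n in (2, 5, 10, 20)}
for n, v in vals.items():
    assert v >= 4 / math.pi ** 2 * math.log(n)   # the bound of Lemma 8.15
assert vals[20] > vals[5] > vals[2]              # ||D_n||_1 grows with n
```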

9 The Open Mapping Theorem and its relatives


9.1 The Open Mapping Theorem
9.1 Theorem (Schauder 1930) 31 Let E, F be Banach spaces and let T ∈ B(E, F ) (thus
linear and bounded) be surjective. Then T is an open map. (I.e. T U ⊆ F is open for every
open U ⊆ E.) Note that E and F must be complete!
Proofs of the open mapping theorem tend to be longish and monolithic. I try to give a more
accessible presentation. Following [25], we begin with a lemma that deduces surjectivity of a
linear map from a certain approximate surjectivity property (and also leads to a quick proof of
the Tietze extension theorem in topology, see Appendix A.6):

9.2 Lemma Let E be a Banach space, F a normed space (real or complex) and T : E → F a
linear map. Assume also that there are m > 0 and r ∈ (0, 1) such that for every y ∈ F there is
an x0 ∈ E with kx0 kE ≤ mkykF and ky − T x0 kF ≤ rkykF . Then for every y ∈ F there is an
x ∈ E such that kxkE ≤ (m/(1 − r))kykF and T x = y. In particular, T is surjective.
Proof. It suffices to consider the case kyk = 1. By assumption, there is x0 ∈ E such that
kx0 k ≤ m and ky − T x0 k ≤ r. Now, applying the hypothesis to y − T x0 instead of y, we find
31
Juliusz Schauder (1899-1943). Polish mathematician. Killed by the Gestapo.

an x1 ∈ E with kx1 k ≤ rm and ky − T (x0 + x1 )k ≤ r2 . Continuing this inductively (thus using
DCω !) we obtain a sequence {xn } such that for all n ∈ N

kxn k ≤ rn m, (9.1)
ky − T (x0 + x1 + · · · + xn )k ≤ rn+1 .        (9.2)
Now, (9.1) together with completeness of E implies, cf. Proposition 3.2, that Σ_{n=0}^{∞} xn converges
to an x ∈ E with

kxk ≤ Σ_{n=0}^{∞} kxn k ≤ Σ_{n=0}^{∞} rn m = m/(1 − r),

and taking n → ∞ in (9.2) gives y = T x. 
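The successive approximation in the proof of Lemma 9.2 can be watched in a scalar toy model; T , the approximate solver and the constants m, r below are our illustrative choices:

```python
# T = identity on R; the 'approximate solver' returns x0 = 0.8*y, so that
# |x0| <= m*|y| and |y - T x0| <= r*|y| with m = 0.8, r = 0.2.

m, r = 0.8, 0.2
T = lambda x: x                  # a bounded linear map (here: the identity)
approx = lambda y: 0.8 * y       # approximate preimage with relative error r

y = 5.0
x, residual = 0.0, y
for _ in range(60):              # the inductive construction from the proof
    step = approx(residual)      # kills a factor r of the residual each time
    x += step
    residual -= T(step)

assert abs(T(x) - y) < 1e-12             # T x = y in the limit
assert abs(x) <= m / (1 - r) * abs(y)    # ||x|| <= m/(1-r) * ||y||
```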

9.3 Proposition Let E be a Banach space, F a normed space and T ∈ B(E, F ) such that
B F (0, β) ⊆ cl(T (B E (0, α))) for certain α, β > 0, where cl denotes the closure in F . Then
B F (0, β 0 ) ⊆ T (B E (0, α)) if 0 < β 0 < β.
Proof. If 0 < β 0 < β 00 < β then cl(B F (0, β 00 )) ⊆ B F (0, β) ⊆ cl(T B E (0, α)). Equivalently (since
x 7→ λx is a homeomorphism for every λ > 0), cl(B F (0, 1)) ⊆ cl(T B E (0, α/β 00 )). With the
definition of the closure, this means that for every y ∈ F with kyk ≤ 1 and every ε > 0 there
exists x ∈ E with kxk < α/β 00 and kT x − yk < ε. Equivalently, for every y ∈ F and ε > 0 there
is x ∈ E with kxk < (α/β 00 )kyk and kT x − yk < εkyk. Now Lemma 9.2 gives (assuming ε < 1)
that for every y ∈ F there is x ∈ E with T x = y and kxk ≤ (α/β 00 )(1 − ε)−1 kyk. If we choose
ε ∈ (0, 1 − β 0 /β 00 ) then (β 0 /β 00 )(1 − ε)−1 < 1, so that for every y ∈ F with kyk ≤ β 0 there is
x ∈ E with T x = y and kxk < α. Thus B F (0, β 0 ) ⊆ T B E (0, α). 

Proof of Theorem 9.1. Since T is surjective, we have

F = T (E) = ∪_{n=1}^{∞} cl(T (B E (0, n))),

where cl denotes the closure in F . Since F is a complete metric space and has non-empty interior
F 0 = F 6= ∅, Corollary A.21 of Baire’s theorem implies that at least one of the closed sets
cl(T (B E (0, n))) has non-empty interior. Thus there are n ∈ N, y ∈ F, ε > 0 such that
B F (y, ε) ⊆ cl(T (B E (0, n))). If x ∈ B F (0, ε) then 2x = (y + x) − (y − x), thus
2B F (0, ε) ⊆ B F (y, ε) − B F (y, ε) and thus

B F (0, ε) ⊆ (1/2)(B F (y, ε) − B F (y, ε)) ⊆ (1/2)(cl(T (B E (0, n))) − cl(T (B E (0, n)))) ⊆ cl(T (B E (0, n))).
Now Proposition 9.3 implies that B F (0, ε0 ) ⊆ T (B E (0, n)) for some ε0 > 0 (actually every
ε0 ∈ (0, ε), but we don’t need this). By linearity we have that for every δ > 0 there is a δ 0 > 0
such that B F (0, δ 0 ) ⊆ T B E (0, δ). Now using the linearity of T , proving its openness is routine.


9.4 Exercise Let E, F be normed spaces and T : E → F linear such that for every δ > 0
there is δ 0 > 0 for which B F (0, δ 0 ) ⊆ T B E (0, δ). Prove that T is open.

9.2 The Bounded Inverse Theorem


Now we can prove Theorem 2.17:

9.5 Corollary (Banach 1929) If E, F are Banach spaces and T ∈ B(E, F ) (thus linear and
bounded) is a bijection then also T −1 is bounded. (Thus T is a homeomorphism.)

Proof. By Theorem 9.1, T is open. Thus the inverse T −1 that exists by bijectivity (and clearly
is linear) is continuous, thus bounded by Lemma 3.13. 

9.6 Definition A linear map A : E → F between normed spaces that is a bijection and a
homeomorphism is called an isomorphism of normed spaces. (Not to be confused with isometric
isomorphisms, for which kAxk = kxk ∀x ∈ E.) If an (isometric) isomorphism A : E → F exists,
we write E ' F (E ∼= F ).

9.7 Remark 1. If k·k1 , k·k2 are norms on V then idV : (V, k·k1 ) → (V, k·k2 ) is an isomorphism
(isometric isomorphism) if and only if the two norms are equivalent (equal).
2. The Bounded Inverse Theorem is a special case of the Open Mapping Theorem, but it
also implies the latter: Assume that the former holds, that E, F are Banach spaces and that
T ∈ B(E, F ) is surjective. The kernel ker T ⊆ E is closed, so that the quotient space E/ ker T is
a Banach space, and the quotient map p : E → E/ ker T is continuous and open by Proposition
6.2. Since T is surjective, the induced map Te : E/ ker T → F is a continuous bijection, so that
Te−1 : F → E/ ker T is continuous by the Bounded Inverse Theorem. Equivalently, Te is open,
so that T = Te ◦ p is open as the composite of two open maps.
3. Also the Bounded Inverse Theorem has an interesting application to Fourier analysis:
For f ∈ L1 ([0, 2π]), we define the Fourier coefficients f̂ (n) = (2π)−1 ∫0^{2π} f (t)e−int dt for all
n ∈ Z. It is immediate that kf̂ k∞ ≤ kf k1 , and it is not hard to prove the Riemann-Lebesgue
theorem f̂ ∈ c0 (Z, C) and injectivity of the resulting map L1 ([0, 2π]) → c0 (Z, C), f 7→ f̂ , see
e.g. [63, Theorem 5.15] or [34]. If this map were surjective, the Bounded Inverse Theorem would
give kf k1 ≤ Ckf̂ k∞ . For the Dirichlet kernel it is immediate that D̂n (m) = χ[−n,n] (m), thus
kD̂n k∞ = 1 for all n ∈ N. Since we know that kDn k1 → ∞, we would have a contradiction.
Thus L1 ([0, 2π]) → c0 (Z, C), f 7→ f̂ is not surjective.
4. The Open Mapping Theorem can be generalized to the case where E is an F -space, i.e.
a TVS admitting a complete translation-invariant metric. See [64, Theorem 2.11]. 2
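The divergence kDn k1 → ∞ used in item 3 above can be checked numerically. The following Python sketch (function name and sample count are ad hoc, not from the text) approximates kDn k1 = (2π)−1 ∫_0^{2π} |Dn (t)| dt by a midpoint Riemann sum, using the closed form Dn (t) = sin((n + 1/2)t)/ sin(t/2):

```python
import numpy as np

def dirichlet_l1_norm(n, samples=400000):
    # D_n(t) = sum_{m=-n}^{n} e^{imt} = sin((n+1/2)t)/sin(t/2)
    t = (np.arange(samples) + 0.5) * 2 * np.pi / samples  # midpoints of [0, 2pi]
    Dn = np.sin((n + 0.5) * t) / np.sin(t / 2)
    # mean over [0, 2pi] equals the (2*pi)^{-1}-normalized L^1 norm
    return np.abs(Dn).mean()

norms = [dirichlet_l1_norm(n) for n in (1, 10, 100, 1000)]
print(norms)  # slow logarithmic growth, roughly like (4/pi^2) log n + const
```

The growth is slow (the Lebesgue constants grow like (4/π 2 ) log n), but it is unmistakably there, while kD̂n k∞ = 1 for all n.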

9.8 Exercise (i) Let V be an infinite dimensional Banach space. Prove that all finite di-
mensional subspaces have empty interior (in V !), then use Baire’s theorem to prove that
V cannot have a countable Hamel basis. (Thus dim V > ℵ0 = #N.)
(ii) Let V be an infinite dimensional Banach space and {xn }, {ϕn } sequences as in Exercise
7.11. For every N ⊆ N define xN = Σ_{n∈N} 2−n xn . Now use Lemma B.9 to find a linearly
independent family in V of cardinality c = #R, so that dim V ≥ c (Hamel dimension).
(iii) Prove that every separable normed space V has cardinality at most c and deduce dim V ≤ c.
(iv) Conclude that every infinite dimensional separable Banach space has Hamel dimension c.

9.9 Remark 1. The result of Exercise 9.8 (i) can be proven using Riesz’ Lemma 14.1 instead of
Baire’s theorem, see [4], but (as in most such cases) the proof uses countable dependent choice
DCω like the proof of Baire’s theorem.
2. If the continuum hypothesis (CH) is true, Exercise 9.8 (ii) readily follows from (i)+(iii).
But the proof of (ii) indicated above is independent of CH. 2

9.10 Exercise Give counterexamples showing that both spaces appearing in the Bounded
Inverse Theorem must be complete.
Hint: For complete E, incomplete F use `p spaces, and for E incomplete, F complete use
F = `1 (N, R) and the fact that it has Hamel dimension c = #R, cf. Exercise 9.8(iv).

9.11 Exercise Let V be a Banach space.
(i) Let W, Z ⊆ V be closed subspaces such that W + Z = V and W ∩ Z = {0}. Give
W ⊕ Z the norm k(w, z)k = kwk + kzk. Prove that α : W ⊕ Z → V, (w, z) 7→ w + z is a
homeomorphism, thus an isomorphism of Banach spaces.
(ii) If W ⊆ V is complemented then:
– There is a bounded linear map P ∈ B(V ) with P 2 = P and W = P V . (The converse
was proven in Exercise 6.12.)
– Every closed Z ⊆ V complementary to W is isomorphic to V /W as Banach spaces.

9.12 Exercise Let V be a Banach space and W, Z ⊆ V closed linear subspaces satisfying
W ∩ Z = {0}, so that W + Z ∼= W ⊕ Z algebraically. Prove that W + Z ⊆ V is closed if and
only if the projection W + Z → W : w + z 7→ w is continuous.
[There is a generalization without the assumption W ∩ Z = {0}, but we don’t pursue this.]

9.13 Exercise Let V, W be Banach spaces and A ∈ B(V, W ) such that dim(W/AV ) < ∞.
(i) Prove that AV ⊆ W is closed, assuming injectivity of A.
(ii) Remove the injectivity assumption.

9.14 Remark The quotient W/AV is called the (algebraic) cokernel of A. Some authors de-
fine the cokernel as the quotient of W by the closure of AV . But we don't do this, since finite
dimensionality of that quotient (the topological cokernel) is a much weaker condition on A and
doesn't imply closedness of AV . 2

9.15 Exercise It is not true that every subspace W ⊆ V with dim(V /W ) < ∞ of a Banach
space V is closed! Find a counterexample! (Hint: codimension one.)

9.3 ? The Closed Graph Theorem


We quickly look at an interesting result equivalent to the Bounded Inverse Theorem, but we
will not need it afterwards.
If f : X → Y is a function, the graph of f is the set G(f ) = {(x, f (x)) | x ∈ X} ⊆ X × Y .

9.16 Exercise Let X be a topological space, Y a Hausdorff space and f : X → Y continuous.


Prove that G(f ) ⊆ X × Y is closed.
If E, F are normed spaces, we know that k(x, y)k = kxk + kyk is a norm on E ⊕ F , complete
if E and F are. The projections p1 : E ⊕ F → E, p2 : E ⊕ F → F are bounded.

9.17 Lemma Let E, F be normed spaces and T : E → F a linear map (not assumed bounded).
Then the following are equivalent:
(i) The graph G(T ) = {(x, T x) | x ∈ E} ⊆ E ⊕ F of T is closed.
(ii) Whenever {xn }n∈N ⊆ E is a sequence such that xn → x ∈ E and T xn → y ∈ F , we have
y = T x.
Proof. Since E ⊕ F is a metric space, G(T ) is closed if and only if it contains the limit (x, y)
of every sequence {(xn , yn )} in G(T ) that converges to some (x, y) ∈ E ⊕ F . But a sequence in
G(T ) is of the form {(xn , T xn )}, and (x, y) ∈ G(T ) ⇔ y = T x. 

9.18 Remark Operators with closed graph (in particular unbounded ones) are often called
closed. But this must not be confused with their closedness as a map, i.e. the property of
sending closed sets to closed sets! Bounded linear operators between Banach spaces have closed
graphs, but need not be closed maps. 2

9.19 Theorem (Banach 1929) If E, F are Banach spaces, then a linear map T : E → F is
bounded if and only if its graph is closed.
Proof. Let E, F be Banach spaces, and let T : E → F be linear. If T is bounded then it is
continuous, thus G(T ) is closed by Exercise 9.16. Now assume T , thus G(T ), is closed. The
Cartesian product E ⊕ F with norm k(e, f )k = kek + kf k is a Banach space. The linear subspace
G(T ) ⊆ E ⊕ F is closed by assumption, thus a Banach space. Since the projection p1 : G(T ) → E
is a bounded bijection, by Corollary 9.5 it has a bounded inverse p1−1 : E → G(T ). Then also
T = p2 ◦ p1−1 is bounded. 

9.20 Exercise Show that the Bounded Inverse Theorem (Corollary 9.5) can be deduced from
the Closed Graph Theorem. (Thus the three main results of this section are ‘equivalent’.)

9.21 Remark 1. The Hellinger-Toeplitz Theorem (Corollary 8.9) can also be deduced from
the Closed Graph Theorem: Let {xn } ⊆ H be a sequence converging to x ∈ H and assume that
Axn → y. Then

hAx, zi = hx, Azi = limn hxn , Azi = limn hAxn , zi = hy, zi ∀z ∈ H,

thus Ax = y. Thus A has closed graph and therefore is bounded by Theorem 9.19.
2. Since we deduced the Hellinger-Toeplitz theorem from the weak version of the uniform
boundedness theorem, it is moderately interesting [But not too much: We needed DCω to prove
the closed graph theorem, whereas we know that ACω suffices for proving the weak version of
the uniform boundedness theorem! And with DCω one has the better Theorem 8.13.] that the
latter can also be deduced from the closed graph theorem: 2

9.22 Exercise Let E, F be Banach spaces, F ⊆ B(E, F ) a pointwise bounded family, and
T : E → F F = Fun(F, F ) the map x 7→ {Ax}A∈F . Use the Closed Graph Theorem to prove
that F is uniformly bounded, as follows:
(i) Prove that FF = {{yA }A∈F ∈ F F | supA∈F kyA k < ∞}, equipped with the norm
k{yA }k = supA∈F kyA k, is a Banach space.
(ii) Show that pointwise boundedness of F is equivalent to T (E) ⊆ FF .
(iii) Prove that the graph G(T ) ⊆ E ⊕ FF of T is closed. (Thus T is bounded by Theorem
9.19.)
(iv) Deduce uniform boundedness of F from the boundedness of T .
(v) Remove the requirement that F be complete.

9.4 Boundedness below. Invertibility


9.23 Definition Let E, F be normed spaces and let A : E → F be a linear map. Then A is
called bounded below32 if there is a δ > 0 such that kAxk ≥ δkxk ∀x ∈ E.
(Equivalently, inf kxk=1 kAxk > 0.)
32
This terminology clashes with another one according to which a self-adjoint operator A is bounded below if
σ(A) ⊆ [c, ∞) for some c ∈ R. Since we consider only bounded operators, we’ll have no use for this notion. The
problem could be avoided by writing ‘bounded away from zero’, as some authors do, but this is a bit tedious.

It is obvious that boundedness below of a map implies injectivity, but the converse is not
true. Furthermore, the image AE = {Ax | x ∈ E} of a linear map A : E → F need not
be closed. In particular, the image can be dense without A being surjective. The operator
A ∈ B(E), where E = `2 (N, C), defined by (Af )(n) = f (n)/n exemplifies both phenomena.
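A finite truncation makes the failure of boundedness below of this operator visible. The sketch below (names are ad hoc) forms the N × N matrix diag(1, 1/2, . . . , 1/N ), which is injective, yet its smallest singular value — the truncated inf kxk=1 kAxk — is 1/N and tends to 0:

```python
import numpy as np

N = 1000
A = np.diag(1.0 / np.arange(1, N + 1))   # truncation of (Af)(n) = f(n)/n
# smallest singular value = inf over unit vectors x of ||Ax||
smin = np.linalg.svd(A, compute_uv=False).min()
# injective (all diagonal entries nonzero), but not bounded below: smin = 1/N -> 0
assert abs(smin - 1.0 / N) < 1e-9
```

Correspondingly, the image of the truncations degenerates: on `2 (N, C) itself the image of A is dense but proper.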

9.24 Exercise Let E, F be normed spaces, where E is finite dimensional, and let A : E → F
be an injective linear map. Prove that A is bounded below.

9.25 Lemma Let E, F be normed spaces and A : E → F a linear bijection. Then

inf kxk=1 kAxk = kA−1 k−1 .

In particular, A is bounded below if and only if its (set-theoretic) inverse A−1 is bounded.
Proof. Using the invertibility of A, thus bijectivity of x 7→ Ax, we have

kA−1 k = sup y∈F \{0} kA−1 yk/kyk = sup x∈E\{0} kxk/kAxk = ( inf x∈E\{0} kAxk/kxk )−1 = ( inf kxk=1 kAxk)−1 .

The second statement follows immediately. 

The following generalizes Corollary 3.6:

9.26 Lemma If E is a Banach space, F is a normed space and A : E → F is a linear map that
is bounded and bounded below then AE ⊆ F is closed.
Proof. Since A is bounded below, it is injective, thus Ã : E → AE (the map A, with the
codomain replaced by AE) is a bijection. Now Ã−1 : AE → E is bounded by Lemma 9.25.
Thus if {fn } is a Cauchy sequence in AE, {Ã−1 fn } is a Cauchy sequence in E. Since E
is complete, there is e ∈ E such that Ã−1 fn → e. Since A is bounded, {fn = A(Ã−1 fn )}
converges to Ae ∈ AE. Thus AE is complete, thus closed. 

9.27 Definition If E, F are normed spaces then A ∈ B(E, F ) is called invertible if there is a
B ∈ B(F, E) such that BA = idE and AB = idF .

9.28 Proposition Let E, F be Banach spaces and A ∈ B(E, F ). Then the following are
equivalent:
(i) A is invertible.
(ii) A is injective and surjective.
(iii) A is bounded below and has dense image.
Proof. It is clear that invertibility implies injectivity and surjectivity, thus in particular dense
image. Since A−1 is bounded, Lemma 9.25 gives that A is bounded below.
If (ii) holds then the set-theoretic inverse, clearly linear, is bounded by the Bounded
Inverse Theorem (Corollary 9.5). Thus A is invertible in the sense of Definition 9.27.
Assume (iii). By boundedness below, A is injective. And AE ⊆ F is dense by assumption
and closed by Lemma 9.26, thus AE = F . Thus A is injective and surjective. Now boundedness
of the inverse A−1 follows from boundedness below of A and Lemma 9.25. 

9.29 Remark 1. Note that dense image is weaker than surjectivity, while boundedness below
is stronger than injectivity. The point of criterion (iii) is that it can be quite hard to verify
surjectivity of A directly, while density of the image usually is easier to establish.
2. The material on bounded below maps discussed so far, including (i)⇔(iii) in Proposition
9.28, was entirely elementary and could be moved to Section 3. 2

9.30 Exercise Let E, F be Banach spaces and A ∈ B(E, F ). Prove:


(i) If A is injective then AE ⊆ F is closed ⇔ A is bounded below.
(ii) If ker A has a complement W then AE ⊆ F is closed ⇔ A|W is bounded below.
(iii) If E = H is a Hilbert space then AH ⊆ F is closed if and only if A|(ker A)⊥ is bounded
below.

9.31 Exercise Let H be a Hilbert space and A ∈ B(H) such that |hAx, xi| ≥ Ckxk2 for some
C > 0. Prove that A is invertible and kA−1 k ≤ C −1 .

10 Spectrum of bounded operators and of (elements of) Banach algebras
10.1 The spectra of A ∈ B(E)
We now specialize from B(E, F ) to B(E), i.e. linear maps from a normed space to itself. If E
is a finite dimensional vector space and A ∈ End E, it is well known that one has: A is injective
⇔ A is surjective ⇔ A is invertible.
It is extremely important that this fails in infinite dimensions:

10.1 Definition Let E = `p (N, F), where 1 ≤ p ≤ ∞. Define L, R ∈ B(E) by

(Lf )(n) = f (n + 1),    (Rf )(n) = 0 if n = 1 and (Rf )(n) = f (n − 1) if n ≥ 2.

Equivalently: Rδn = δn+1 , Lδ1 = 0, Lδn = δn−1 if n ≥ 2, which is why we call L, R the left
and right, respectively, shift operators on E.
It is immediate that R is injective (in fact isometric), but not surjective (since (Rf )(1) =
0 ∀f ∈ E) while L is surjective, but not injective (since Lf does not depend on f (1)). One
easily checks LR = idE , while RL 6= idE since RL = P2 (notation from Exercise 8.7).
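The identities LR = idE and RL = P2 can be checked mechanically on finitely supported sequences. A throwaway Python sketch (lists stand for (f (1), f (2), . . .); the names are ad hoc):

```python
def R(f):          # right shift: (Rf)(1) = 0, (Rf)(n) = f(n-1) for n >= 2
    return [0] + f

def L(f):          # left shift: (Lf)(n) = f(n+1)
    return f[1:]

f = [3, 1, 4, 1, 5]
assert L(R(f)) == f                  # LR = id
assert R(L(f)) == [0, 1, 4, 1, 5]    # RL kills the first coordinate (= P_2)
```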
Let E be a finite dimensional F-vector space, A ∈ End E and λ ∈ F. Then failure of A − λ1E
to be invertible is equivalent to ker(A − λ1E ) 6= {0}, thus the existence of a non-zero x ∈ E
with Ax = λx. Thus λ is an eigenvalue of A and x is a corresponding eigenvector.
Passing to an infinite dimensional Banach space E, there can be A ∈ B(E) and λ ∈ F for
which A − λ1 is injective, but not surjective, thus not invertible. Such λ are not eigenvalues,
but they turn out to be equally important as the former. This motivates the following:

10.2 Definition Let E be a Banach space over F and A ∈ B(E). Then


• The spectrum33 σ(A) is the set of λ ∈ F for which A − λ1E is not invertible.
33
The choice of this term by Hilbert was nothing less than a stroke of genius since it turned out to fit exactly its
later use in quantum theory.

• The point spectrum σp (A) consists of those λ ∈ F for which A − λ1E is not injective.
Equivalently, σp (A) consists of the eigenvalues of A.
• The continuous spectrum σc (A) consists of those λ ∈ F for which A − λ1E is injective, but
not surjective, while it has dense image, i.e. the closure of (A − λ1E )E equals E.
• The residual spectrum σr (A) consists of those λ ∈ F for which A − λ1E is injective and
the closure of (A − λ1E )E is a proper subspace of E.
We have some immediate observations:
• It is obvious by construction that the sets σp (A), σc (A), σr (A) are mutually disjoint and
have σ(A) as their union.
• Clearly 0 ∈ σ(A) is equivalent to non-invertibility of A and 0 ∈ σp (A) to ker A 6= {0}.
• If E is finite dimensional then we know from linear algebra that injectivity and surjectivity
of any A ∈ B(E) are equivalent. Thus for all operators on a finite dimensional space we
have σc (A) = σr (A) = ∅, thus σ(A) = σp (A).
• If E is infinite dimensional, the situation is much more complicated, thus more interesting.
For example, the right shift R on `2 (N) is injective, but not surjective. Thus 0 ∈ σ(R),
while 0 6∈ σp (R).
• If λ ∈ σp (A) then there is a non-zero x ∈ E with Ax = λx. Then An x = λn x ∀n ∈ N. With
the definition of kAk it follows that |λ| ≤ inf n∈N kAn k1/n . (This can be smaller than kAk,
e.g. if A is nilpotent, i.e. An = 0 for some n ∈ N.)
• There are other interesting subsets of σ(A), motivated by Proposition 9.28(iii):
– The approximate point spectrum σapp (A) = {λ ∈ F | A − λ1 not bounded below}.
Clearly σp (A) ⊆ σapp (A).
– The compression spectrum σcp (A), consisting of those λ ∈ F for which the closure of
(A − λ1)E is a proper subspace of E. Obviously σ(A) = σapp (A) ∪ σcp (A) (but the two
need not be disjoint). And σr (A) = σcp (A)\σp (A).
– The discrete spectrum σd (A) ⊆ σp (A), cf. Exercise 10.42. The essential spectrum
σess (A) (or rather one version of it – there are others) is σ(A)\σd (A).

10.3 Exercise Let V be a Banach space, A ∈ B(V ), and let W, Z ⊆ V be closed subspaces such
that W + Z = V, W ∩ Z = {0} and AW ⊆ W, AZ ⊆ Z. Prove σ(A) = σ(A|W ) ∪ σ(A|Z ) and
σt (A) = σt (A|W ) ∪ σt (A|Z ) for all t ∈ {p, c, r}.
Before we try to compute the spectra of some interesting operators, it is better to first prove
some general results, since they will be helpful also for studying examples. Remarkably one can
get rather far using only the fact that B(E) is a Banach algebra.

10.2 The spectrum in a unital Banach algebra


Since B(E) is a unital Banach algebra for every Banach space, all results proven here apply to
bounded operators on Banach spaces. Restricting these results to B(E) does not significantly
simplify their proofs! In the beginning it does not matter whether F is R or C.

10.4 Definition If A is a unital algebra over F then InvA = {a ∈ A | ∃b ∈ A : ab = ba = 1}


is the set of invertible elements of A. The spectrum of a ∈ A is defined as

σ(a) = {λ ∈ F | a − λ1 6∈ InvA}.

The spectral radius of a is r(a) = sup{|λ| | λ ∈ σ(a)}, where r(a) = 0 if σ(a) = ∅. (But we will
soon prove σ(a) 6= ∅ for all a ∈ A if A is normed.)

10.5 Remark It is clear that for an element of the Banach algebra B(E), where E is a Banach
space, this definition is equivalent to Definition 10.2. But in the present abstract setting there
is no distinction between point, continuous and residual spectrum. 2

As to our standard example of a Banach algebra not of the form B(V ) with V Banach:

10.6 Exercise (i) Let X be a compact Hausdorff space. Recall that (C(X, F), k · k∞ ) is a
Banach algebra. For f ∈ C(X, F), prove σ(f ) = f (X) ⊆ F.
(ii) If S is any set, `∞ (S, F) is a Banach algebra w.r.t. pointwise multiplication. If f ∈ `∞ (S, F),
prove that σ(f ) is the closure of f (S).
There is another case where the spectrum is easy to determine:

10.7 Definition An element a ∈ A of an algebra is called nilpotent if an = 0 for some n ∈ N.

10.8 Lemma If A is a unital algebra and a ∈ A is nilpotent then σ(a) = {0}, thus r(a) = 0.
Proof. If a ∈ A is nilpotent then the series b = Σ_{n=0}^∞ an converges since it breaks off after
finitely many terms. Now (1 − a)b = b(1 − a) = Σ_{n=0}^∞ an − Σ_{n=1}^∞ an = 1, thus a − 1 ∈ InvA.
Since the same holds for a/λ whenever λ 6= 0, we have σ(a) ⊆ {0}. Since no nilpotent a is
invertible (why?) we have σ(a) = {0}. 

10.9 Definition If A is a unital Banach algebra, a ∈ A is called quasi-nilpotent if r(a) = 0,


equivalently σ(a) ⊆ {0}. As just proven, nilpotent ⇒ quasi-nilpotent.
For examples of quasi-nilpotent elements that are not nilpotent, see Exercises 10.35, 17.6.

10.10 Lemma Let A be a unital Banach algebra. Then
(i) If a ∈ A, kak < 1 then 1 − a ∈ InvA and (1 − a)−1 = Σ_{n=0}^∞ an .
(ii) InvA ⊆ A is open.
Proof. (i) If kak < 1 then Σ_{n=0}^∞ kan k ≤ Σ_{n=0}^∞ kakn < ∞, so that the series Σ_{n=0}^∞ an converges
to some b ∈ A by completeness and Proposition 3.2. Now again (1 − a)b = b(1 − a) =
Σ_{n=0}^∞ an − Σ_{n=1}^∞ an = 1, so that 1 − a is invertible with inverse b.
(ii) If a ∈ InvA and a′ ∈ A with ka − a′ k < ka−1 k−1 then
k1 − a−1 a′ k = ka−1 (a − a′ )k ≤ ka−1 kka − a′ k < 1, so that a−1 a′ = 1 − (1 − a−1 a′ ) ∈ InvA, thus
a′ = a(a−1 a′ ) ∈ InvA. This proves that InvA is open. 

10.11 Exercise Let A be a unital normed algebra.


(i) If a, b ∈ A and ab = ba ∈ InvA with c = (ab)−1 , prove that a, b ∈ InvA and a−1 = cb =
bc, b−1 = ca = ac.
(ii) Give an example of a unital algebra A and a, b ∈ A such that ab 6= ba ∈ InvA with
a, b 6∈ InvA.
(iii) If a, b ∈ A and 1 − ab ∈ InvA, prove 1 − ba ∈ InvA.
Hint: Assuming that A is Banach and kakkbk < 1, find a formula for (1 − ba)−1 in terms
of (1 − ab)−1 . Now prove that the latter holds without the mentioned assumptions.

(iv) Deduce that σ(ab) ∪ {0} = σ(ba) ∪ {0} and r(ab) = r(ba).

10.12 Proposition If A is a unital Banach algebra and a ∈ A then


(i) σ(a) is closed.
(ii) r(a) ≤ inf n∈N kan k1/n ≤ kak.
Proof. (i) If a ∈ A then fa : F → A, λ 7→ a − λ1 is continuous, thus fa−1 (InvA) ⊆ F is open by
Lemma 10.10(ii). Now σ(a) = F\fa−1 (InvA) is closed.
(ii) If λ ∈ F, |λ| > kak then ka/λk < 1 so that 1 − a/λ ∈ InvA by Lemma 10.10(i). Thus
λ1 − a ∈ InvA, so that λ 6∈ σ(a). This proves r(a) ≤ kak.
In each associative unital algebra, a simple telescoping argument gives the formula

z n − 1 = (z − 1)(1 + z + z 2 + · · · + z n−1 ), (10.1)

known from finite geometric sums. If 0 6= λ ∈ σ(a) then a/λ − 1 is not invertible, thus putting
z = a/λ in (10.1) gives that (a/λ)n − 1 is not invertible (since a product of two commuting
elements is invertible if and only if both are invertible by Exercise 10.11(i)). Thus λn ∈ σ(an ),
so that r(a)n ≤ r(an ). Since this holds for all n ∈ N, using r(b) ≤ kbk just proven, we have

r(a) ≤ inf n∈N r(an )1/n ≤ inf n∈N kan k1/n ≤ kak. 

10.13 Exercise Let A be unital Banach algebra and a ∈ InvA. Prove:


(i) σ(a−1 ) = {λ−1 | λ ∈ σ(a)}.
(ii) If kak ≤ 1 and ka−1 k ≤ 1 then σ(a) ⊆ S 1 = {z ∈ C | |z| = 1}.
More can be said about the relation of r(a) to the norms kan k, but this will be more work.

10.14 Lemma Let A be a unital normed algebra. Then InvA is a topological group (w.r.t. the
norm topology).
Proof. It is clear that InvA is a group and that multiplication is continuous, since multiplication
A × A → A is jointly continuous (Remark 3.28). It remains to show that the inverse map
σ : InvA → InvA, a 7→ a−1 is continuous. To this purpose, let r, r + h ∈ InvA and put
(r + h)−1 = r−1 + k. We must show that khk → 0 implies kkk → 0. From 1 = (r−1 +
k)(r + h) = 1 + r−1 h + kr + kh we obtain r−1 h + kr + kh = 0. Multiplying this on the
right by r−1 we have r−1 hr−1 + k + khr−1 = 0, thus k = −r−1 hr−1 − khr−1 . Therefore
kkk ≤ kr−1 k2 khk + kkkkhkkr−1 k, which is equivalent to kkk(1 − khkkr−1 k) ≤ kr−1 k2 khk and,
for khk < kr−1 k−1 , to

kkk ≤ kr−1 k2 khk / (1 − khkkr−1 k).

From this it is clear that khk → 0 implies kkk → 0. 

10.15 Corollary If A is a unital normed algebra and a ∈ A then the ‘resolvent map’ Ra :
C\σ(a) → A, λ 7→ (a − λ1)−1 is continuous.

10.16 Lemma Let A be a unital normed algebra and a ∈ A. Put ν = inf n∈N kan k1/n . Then
(i) limn→∞ kan k1/n = ν.
(ii) For all µ > ν we have (a/µ)n → 0 as n → ∞, but (a/ν)n 6→ 0 provided ν > 0.
(This is of course trivial if µ > kak, but our hypothesis is weaker when ν < kak.)
(iii) If ν = 0 then a 6∈ InvA, thus 0 ∈ σ(a).
Proof. (i) With kan k ≤ kakn we trivially have

0 ≤ ν = inf n∈N kan k1/n ≤ lim inf n→∞ kan k1/n ≤ lim supn→∞ kan k1/n ≤ kak < ∞. (10.2)

By definition of ν, for every ε > 0 there is a k such that kak k1/k < ν + ε. Every m ∈ N is of the
form m = sk + r with unique s ∈ N0 and 0 ≤ r < k (division with remainder). Then

ka^m k = ka^{sk+r} k ≤ ka^k k^s kak^r < (ν + ε)^{sk} kak^r ,  thus  ka^m k^{1/m} ≤ (ν + ε)^{sk/(sk+r)} kak^{r/(sk+r)} .

Now m → ∞ means sk/(sk + r) → 1 and r/(sk + r) → 0, so that lim supm→∞ kam k1/m ≤ ν + ε. Since this
holds for every ε > 0, we have lim supm→∞ kam k1/m ≤ inf n∈N kan k1/n . Together with (10.2)
this implies that limm→∞ kam k1/m exists and equals inf n∈N kan k1/n . (Compare Exercise 10.36.)
(ii) Let µ > ν, and choose µ′ such that ν < µ′ < µ. Since kan k1/n → ν by (i), there is an n0
such that n ≥ n0 ⇒ kan k1/n < µ′ . For such n we have

k(a/µ)n k = kan k/µn ≤ (µ′ /µ)n → 0 as n → ∞.

This proves the first claim. On the other hand, for all n ∈ N we have kan k1/n ≥ ν. With ν > 0
this implies k(a/ν)n k ≥ 1 ∀n, and therefore (a/ν)n 6→ 0.
(iii) Assume a ∈ InvA. Then there is b ∈ A such that ab = ba = 1. Then 1 = an bn , thus
with Remark 3.28 we have 1 ≤ k1k = kan bn k ≤ kan kkbn k ≤ kan kkbkn . Taking n-th roots, we
have 1 ≤ kan k1/n kbk, and taking the limit gives the contradiction 1 ≤ νkbk = 0. Thus if ν = 0
then a is not invertible, so that 0 ∈ σ(a). 

Everything we did so far works over R and over C. (With the obvious exception of the ONB
{en : x 7→ einx } ⊆ L2 ([0, 2π], λ; C) in the discussion of Fourier series. But over R that can be
replaced by {cos nx | n ∈ N0 } ∪ {sin nx | n ∈ N}.) The rest of this section requires F = C, and
the same applies whenever we use Theorem 10.18 or Corollaries 10.21, 10.24.

10.17 Lemma Let A be a unital normed algebra over C, a ∈ A and λ ∈ C\{0} such that
λS 1 ∩ σ(a) = ∅. Then for all n ∈ N we have (a/λ)n − 1 ∈ InvA and

((a/λ)n − 1)−1 = (1/n) Σ_{k=1}^n (a/λk − 1)−1 ,  where λk = e2πik/n λ. (10.3)

Proof. For 0 6= λ ∈ C and n ∈ N, put λk = λe2πik/n , where k = 1, . . . , n. (One should really
write λn,k , but we suppress the n.) Then λ1 , . . . , λn are the solutions of z n = λn , and we
have z n − λn = Π_k (z − λk ). This is an identity in C[z], thus it also holds in every unital
C-algebra A with z replaced by a ∈ A. Now our assumption λS 1 ∩ σ(a) = ∅ implies λk 6∈ σ(a)
for all k = 1, . . . , n. Thus all a − λk 1 are invertible, and so is
an − λn 1 = Π_k (a − λk 1). Thus also (a/λ)n − 1 ∈ InvA, our first claim.
Putting z = a/λk in (10.1) and observing (λk )n = λn , we have

(a/λ)n − 1 = (a/λk )n − 1 = (a/λk − 1)(1 + a/λk + · · · + (a/λk )n−1 ).

Using the invertibility of a/λk − 1 for all k and of (a/λ)n − 1, we can rewrite this as

φ(λk ) := (a/λk − 1)−1 = ((a/λ)n − 1)−1 (1 + a/λk + · · · + (a/λk )n−1 ) = ((a/λ)n − 1)−1 Σ_{l=0}^{n−1} (a/λ)l e−2πikl/n .

Summing over k ∈ {1, . . . , n}, we obtain

Σ_{k=1}^n φ(λk ) = ((a/λ)n − 1)−1 Σ_{l=0}^{n−1} (a/λ)l Σ_{k=1}^n e−2πikl/n . (10.4)

If l ∈ {1, . . . , n − 1} then z = e−2πil/n satisfies z 6= 1 and z n = 1, so that (10.1) gives

Σ_{k=1}^n e−2πikl/n = z Σ_{k=0}^{n−1} z k = z (z n − 1)/(z − 1) = 0.

Thus only l = 0 contributes to (10.4), and the r.h.s. equals n((a/λ)n − 1)−1 , yielding (10.3). 

10.18 Theorem (Beurling 1938, Gelfand 1939) 34,35 Let A be a unital normed algebra
over C (not necessarily complete) and a ∈ A. Then σ(a) 6= ∅, and

r(a) ≥ inf n∈N kan k1/n = limn→∞ kan k1/n . (10.5)

If A is complete, equality holds in (10.5), which then is called the spectral radius formula.
Proof. The equality of infimum and limit was Lemma 10.16(i). Once the ≥ is proven, combining
it with Proposition 10.12(ii) in the complete case gives r(a) = limn→∞ kan k1/n .
For a ∈ A, define ν as before. If ν = 0 then 0 ∈ σ(a) by Lemma 10.16(iii). Thus σ(a) 6= ∅
and (10.5) is trivially true.
From now on assume ν > 0. Assume that there is no λ ∈ σ(a) with |λ| ≥ ν. This implies
that (a − λ1)−1 exists for all |λ| ≥ ν and depends continuously on λ by Lemma 10.14. The
same holds (since |λ| ≥ ν > 0) for the slightly more convenient function

φ : {λ ∈ C | |λ| ≥ ν} → A, λ 7→ (a/λ − 1)−1 .

Now Lemma 10.17 gives for all λ with |λ| ≥ ν and n ∈ N that (a/λ)n − 1 ∈ InvA, the inverse
given by (10.3). Pick any η > ν. Since the annulus Λ = {λ ∈ C | ν ≤ |λ| ≤ η} is compact, the
continuous map φ : Λ → A is uniformly continuous. I.e., for every ε > 0 we can find δ > 0 such
that λ, λ′ ∈ Λ, |λ−λ′ | < δ ⇒ kφ(λ)−φ(λ′ )k < ε. If ν < µ < ν +δ, we have |νk −µk | = |ν −µ| < δ
and therefore kφ(νk ) − φ(µk )k < ε for all n ∈ N and k = 1, . . . , n. Combining this with (10.3)
we have k((a/ν)n − 1)−1 − ((a/µ)n − 1)−1 k ≤ (1/n) Σ_{k=1}^n kφ(νk ) − φ(µk )k < ε ∀n ∈ N, so that:

∀ε > 0 ∃µ > ν ∀n ∈ N : k((a/ν)n − 1)−1 − ((a/µ)n − 1)−1 k < ε. (10.6)
By Lemma 10.16(ii), µ > ν implies (a/µ)n → 0 as n → ∞. With continuity of the inverse
map, ((a/µ)n − 1)−1 → −1. Thus for n large enough we have k((a/µ)n − 1)−1 + 1k < ε, and
34
Arne Beurling (1905-1986). Swedish mathematician. Worked mostly on harmonic and complex analysis.
35
Israel Moiseevich Gelfand (1913-2009). Outstanding Soviet mathematician. Many important contributions to
many areas of mathematics, among which functional analysis and Banach algebras.

combining this with (10.6) we have k((a/ν)n − 1)−1 + 1k < 2ε. Since ε > 0 was arbitrary, we
have ((a/ν)n − 1)−1 → −1 as n → ∞ and therefore (a/ν)n → 0. This contradicts the other
part of Lemma 10.16(ii), so that our assumption that there is no λ ∈ σ(a) with |λ| ≥ ν is
false. Existence of such a λ obviously gives σ(a) 6= ∅ and r(a) ≥ ν, completing the proof. We
emphasize that completeness of A was not needed! 
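For a matrix the spectral radius formula can be tested against the eigenvalues. The example below (an arbitrary choice) has kak = 2 but r(a) = 1, and kan k1/n indeed approaches r(a), not kak:

```python
import numpy as np

a = np.array([[0.0, 2.0],
              [0.5, 0.0]])        # eigenvalues ±1, so r(a) = 1
print(np.linalg.norm(a, 2))       # operator norm = 2 > r(a)
for n in (1, 5, 25, 125):
    est = np.linalg.norm(np.linalg.matrix_power(a, n), 2) ** (1.0 / n)
    print(n, est)                 # decreases from 2.0 towards r(a) = 1
```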

10.19 Remark 1. The standard proof of the above theorem, which requires completeness, uses
the differentiability of the resolvent map Ra and a certain amount of complex analysis. The
more elementary (which does not mean simple) proof given above, due to Rickart36 (1958),
shows that neither the completeness assumption nor the complex analysis are essential to the
problem. (See also Exercise 10.30 and the subsequent remark.)
2. Even though we avoided complex analysis (holomorphicity etc.), it is clear that the proof
only works over C. In fact, the matrix ( 0 −1 ; 1 0 ) ∈ M2×2 (R) (rotation by 90°) has empty
spectrum over R. 2

10.20 Corollary (‘Fundamental Theorem of Algebra’) Let P ∈ C[z] be a polynomial


of degree d ≥ 1. Then there is λ ∈ C with P (λ) = 0.
Proof. We may assume that P is monic, i.e. the coefficient of the highest power z d is 1. It is
not hard to construct a matrix aP ∈ Md×d (C) such that P (λ) = det(λ1 − aP ) (do it!). Now
Theorem 10.18 gives σ(aP ) 6= ∅, and for every λ ∈ σ(aP ) we have P (λ) = 0. 
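One choice for aP is the companion matrix of P. A sketch for d = 2 (the helper name is hypothetical), checking that its eigenvalues are exactly the roots of P:

```python
import numpy as np

def companion(c):
    """Companion matrix of the monic P(z) = z^d + c[d-1] z^{d-1} + ... + c[0]."""
    d = len(c)
    a = np.zeros((d, d))
    a[1:, :-1] = np.eye(d - 1)       # subdiagonal of ones
    a[:, -1] = [-x for x in c]       # last column carries the coefficients
    return a

aP = companion([2.0, -3.0])          # P(z) = z^2 - 3z + 2 = (z-1)(z-2)
roots = sorted(np.linalg.eigvals(aP).real)
assert np.allclose(roots, [1.0, 2.0])
```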

The above is not a joke! This proof is certainly more elementary than those using complex
analysis (Liouville’s theorem) or topological arguments based on π1 (S 1 ) 6= 0. And the ‘standard’
proof using compactness and n-th roots of complex numbers, cf. e.g. [47, Theorem 7.7.57], has
more than a little in common with the above argument.

10.21 Corollary (Gelfand-Mazur) 37

(i) Every unital normed algebra over C other than C1 has non-zero non-invertible elements.
(ii) If A is a normed division algebra (i.e. unital with InvA = A\{0}) over C then A = C1.
Proof. (i) Let a ∈ A\C1. By Theorem 10.18 we can pick λ ∈ σ(a). Then a − λ1 is non-zero
and non-invertible. Now (ii) is immediate. 

10.22 Remark 1. That there are no finite dimensional division algebras over C other than C
itself is an easy consequence of algebraic closedness. (Why?) There are infinite dimensional
ones (like the field C(z) of rational functions over C), but they do not admit norms by the above
corollary, which does not assume finite dimensionality of A.
2. Over R a theorem of Hurwitz38 says that there are precisely four division algebras
admitting a norm, namely R, C, H (Hamilton’s39 quaternions, which everyone should know)
and O, the octonions of Graves40 . But of these only C is an algebra over C. For more on the
fascinating subject of real division algebras see the 120 pages on the subject in [18]. 2

36
Charles Earl Rickart (1913-2002). American mathematician, mostly operator algebraist.
37
Stanislaw Mazur (1905-1981). Polish mathematician.
38
Adolf Hurwitz (1859-1919). German mathematician who worked on many subjects.
39
Sir William Rowan Hamilton (1805-1865). Irish mathematician. Known particularly for quaternions and Hamil-
tonian mechanics. It was he who advocated the modern view of complex numbers as pairs of real numbers.
40
John T. Graves (1806-1870). Irish jurist (!) and mathematician.

The preceding corollaries only used σ(a) 6= ∅, but also the spectral radius formula will have
many applications.

10.23 Corollary If A is a unital normed algebra and a ∈ A then a is quasi-nilpotent (r(a) =
0) if and only if (za)n → 0 for all z ∈ C.
Proof. Combine Lemma 10.16(ii) with ν = r(a). 

10.24 Corollary Let A be a unital Banach algebra and B ⊆ A a closed subalgebra containing
1. Then σA (b) ⊆ σB (b) and rA (b) = rB (b) for all b ∈ B.
Proof. If b − λ1 has an inverse in B then the latter also is an inverse in A. Thus λ 6∈ σB (b) ⇒
λ 6∈ σA (b), whence the first claim.
Since the norm of B is the restriction to B of the norm of A, the spectral radius formula
gives rB (b) = limn→∞ kbn k1/n = rA (b). 

In the situation of Corollary 10.24, σA (b) $ σB (b) is possible:

10.25 Example Consider the Banach space A = `1 (Z, C) with norm k · k1 . For f, g ∈ A, define
the convolution product (f ? g)(n) = Σ_{m∈Z} f (m)g(n − m) = Σ_{r+s=n} f (r)g(s). Then

kf ? gk1 = Σ_{n∈Z} | Σ_{m∈Z} f (m)g(n − m) | ≤ Σ_{n∈Z} Σ_{m∈Z} |f (m)g(n − m)| = kf k1 kgk1 ,

thus f ? g ∈ A. It is clear that ? is bilinear with 1 = δ0 as unit, and associativity is easy to
check. Thus (A, k · k1 , ?, 1) is a unital Banach algebra. The functions δn (m) = δn,m satisfy
δn ? δm = δn+m . In particular δn is invertible with inverse δ−n for each n ∈ Z.
Let B = {f ∈ A | f (n) = 0 ∀n < 0} ⊆ A, which is the closed linear span of {δn | n ≥ 0}. It
is immediate that B is a closed subalgebra containing 1. Now b = δ1 ∈ B has an inverse in A,
namely δ−1 , but not in B: If there were an inverse in B, it would also be an inverse in A and
would have to equal δ−1 6∈ B, which is a contradiction. Thus 0 ∈ σB (b) and 0 6∈ σA (b), so that
σA (b) $ σB (b).
(The Banach algebra `1 (Z, C) has other interesting applications, cf. Section 17.2. Its con-
struction generalizes to all discrete groups, and in fact to all locally compact groups if one
replaces summation by integration w.r.t. the Haar measure µ, obtaining L1 (G, µ; C).)
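On finitely supported elements of `1 (Z, C) the convolution product is a two-line computation. The sketch below (dicts map integers n to f (n); the names are ad hoc) checks δn ? δm = δn+m and the submultiplicativity kf ? gk1 ≤ kf k1 kgk1 :

```python
from collections import defaultdict

def conv(f, g):                      # (f*g)(n) = sum_{r+s=n} f(r)g(s)
    h = defaultdict(float)
    for r, fr in f.items():
        for s, gs in g.items():
            h[r + s] += fr * gs
    return {n: v for n, v in h.items() if v != 0}

def norm1(f):
    return sum(abs(v) for v in f.values())

delta = lambda n: {n: 1.0}
assert conv(delta(3), delta(-3)) == delta(0)       # delta_n * delta_m = delta_{n+m}
f, g = {0: 1.0, 1: -2.0}, {-1: 0.5, 2: 3.0}
assert norm1(conv(f, g)) <= norm1(f) * norm1(g)    # ||f*g||_1 <= ||f||_1 ||g||_1
```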

10.26 Exercise Let A be a unital Banach algebra over C and a, b ∈ A commuting elements.
Prove r(ab) ≤ r(a)r(b). (I.e. r is submultiplicative. We will soon prove subadditivity.)

10.27 Exercise Let A be a unital Banach algebra and B ⊆ A a maximal abelian Banach
subalgebra with 1 ∈ B. (Maximality means that we cannot have B $ C ⊆ A with C commutative
Banach.) Prove InvB = B ∩ InvA and conclude that σB (b) = σA (b) ∀b ∈ B.

10.28 Exercise Let A be a unital Banach algebra and a ∈ A. Prove the ‘resolvent identity’

Ra (s) − Ra (t) = (s − t)Ra (s)Ra (t) ∀s, t ∈ C\σ(a).

10.29 Exercise Let A be a unital Banach algebra over C.


(i) Use the ideas in the proof of Lemma 10.10 to give an alternative proof of the continuity
of InvA → InvA, a 7→ a−1 .

(ii) BONUS: If E, F are normed spaces and U ⊆ E is open, a map f : U → F is Fréchet
differentiable at x ∈ U if there is a bounded linear map D ∈ B(E, F ) such that

kf (x + h) − f (x) − D(h)k
→0 as khk → 0.
khk

Prove that InvA → InvA, a 7→ a−1 is Fréchet differentiable. Conclude that the map
C\σ(a) → C, λ 7→ ϕ((a − λ1)−1 ) is holomorphic for each ϕ ∈ A∗ .

10.30 Exercise Let A be a unital Banach algebra, a ∈ A and z ∈ C\{0} such that zS 1 ∩ σ(a) = ∅ (i.e. there is no λ ∈ σ(a) with |λ| = |z|). Let pn = (1 − (a/z)n )−1 . Prove:
(i) If |z| > r(a) then pn → 1 as n → ∞.
(ii) If there is no λ ∈ σ(a) with |λ| ≤ |z| then pn → 0 as n → ∞.
(iii) The limit p = limn→∞ pn ∈ A exists and satisfies pa = ap. Hint: Lemma 10.17.
(iv) $p = -\frac{1}{2\pi i}\oint_C R_a(z)\,dz$, where C is the circle of radius |z| around 0 ∈ C with counterclockwise orientation.
(v) BONUS: p2 = p. Hint: You may use Exercise 10.28.
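For matrices the limit p of Exercise 10.30 can be computed directly. The sketch below is illustrative only; the 2 × 2 matrix a (eigenvalues 0.5 and 2) and the circle radius |z| = 1 are my choices, not from the notes. It exhibits p as an idempotent commuting with a:

```python
import numpy as np

# A 2x2 matrix with one eigenvalue inside and one outside the unit circle
s = np.array([[1.0, 1.0], [0.0, 1.0]])
a = s @ np.diag([0.5, 2.0]) @ np.linalg.inv(s)
z = 1.0  # no eigenvalue of a lies on the circle |lambda| = |z|

I = np.eye(2)
def p_n(n):
    """p_n = (1 - (a/z)^n)^{-1} as in Exercise 10.30."""
    return np.linalg.inv(I - np.linalg.matrix_power(a / z, n))

p = p_n(30)  # numerically converged limit
assert np.allclose(p @ p, p)      # (v): p is idempotent
assert np.allclose(p @ a, a @ p)  # (iii): p commutes with a
# p is the spectral projection onto the part of the spectrum inside the circle
assert np.isclose(np.trace(p), 1.0)
```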

10.31 Remark In our (that is Rickart’s) proof of the Beurling-Gelfand theorem we used the
operators ((a/z)n − 1)−1 = −pn for large n and the above (i) concerning existence of p =
limn→∞ pn . The result of (iv) to the effect that p is given by a contour integral establishes a
connection between Rickart’s proof and the standard textbook proof via complex analysis. See
also [60, §149]. 2

Since we need a unit in order to define σ(a), the following construction is quite important
(but we won’t use it):

10.32 Exercise (Unitization of Banach algebras) Let A be a Banach algebra over F, possibly without unit. Define Ã = A ⊕ F, which is an F-vector space in the obvious way. For (a, α), (b, β) ∈ Ã define k(a, α)k∼ = kak + |α| and (a, α)(b, β) = (ab + αb + βa, αβ). Prove:
(i) Ã is an associative algebra with unit (0, 1).
(ii) (Ã, k · k∼ ) is a Banach algebra.
(iii) The map ι : A → Ã, a 7→ (a, 0) is an isometric algebra homomorphism, and ι(A) ⊆ Ã is a two-sided ideal.

10.3 Examples of spectra of operators


We summarize what we have proven: If A ∈ B(E) then σ(A) ⊆ C is closed, non-empty, and
bounded by r(A) = limn→∞ kAn k1/n ≤ kAk. Note that we do not yet have good general results
on σp , σc , σr . We will later prove some, for special classes of operators.

10.33 Exercise Compute σp (L) and σp (R) for the shift operators L, R on `p (N, C) for all p ∈ [1, ∞]. (Of course, the p in σp has nothing to do with the p in `p .)

10.34 Exercise Prove: If V is a finite-dimensional Banach space over C then every quasi-
nilpotent operator on V is nilpotent.

10.35 Exercise Let H = `2 (N, C). Define A ∈ B(H) by Aek = 2−k ek+1 ∀k. Prove that A is
(i) injective, (ii) quasi-nilpotent, but (iii) not nilpotent.

10.36 Exercise Let H = `2 (N, C) and define A ∈ B(H) by Aek = αk ek+1 , where αk = 2 for
odd k and αk = 1/2 for even k. Compute kAn k for all n and show that n 7→ kAn k1/n is not
monotonically decreasing.
The operators in the two preceding exercises are examples of ‘weighted shift operators’.
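Both weighted shifts can be explored numerically via finite truncations. In the sketch below (illustrative only; the truncation size N = 40 is an arbitrary choice, and truncation merely approximates the operators on `2 (N, C)), the first shift shows kAn k1/n → 0, i.e. quasi-nilpotence, while the second shows the failure of monotonicity:

```python
import numpy as np

def weighted_shift(alpha, N):
    """N x N truncation of the weighted shift A e_k = alpha[k] e_{k+1}."""
    A = np.zeros((N, N))
    for k in range(N - 1):
        A[k + 1, k] = alpha[k]
    return A

N = 40
# Exercise 10.35: alpha_k = 2^{-k}; ||A^n||^{1/n} -> 0 (quasi-nilpotent)
A = weighted_shift([2.0 ** (-k) for k in range(N)], N)
norms = [np.linalg.norm(np.linalg.matrix_power(A, n), 2) ** (1 / n)
         for n in range(1, 21)]
assert norms[-1] < 0.01 and all(x > 0 for x in norms)

# Exercise 10.36: alternating weights 2, 1/2; ||A^n||^{1/n} is not monotone
B = weighted_shift([2.0 if k % 2 else 0.5 for k in range(N)], N)
seq = [np.linalg.norm(np.linalg.matrix_power(B, n), 2) ** (1 / n)
       for n in range(1, 8)]
assert any(seq[i] < seq[i + 1] for i in range(len(seq) - 1))
```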

10.37 Exercise Let H = `2 (N, C) and define A ∈ B(H) by (Af )(n) = f (n)/n. Determine
σp (A), σc (A), σr (A).

10.38 Exercise (Assuming some measure theory) Let H = L2 ([a, b]), where −∞ < a <
b < ∞, and define A ∈ B(H) by (Af )(x) = xf (x). Prove σc (A) = [a, b] and σp (A) = σr (A) = ∅.

10.39 Exercise Prove that for every compact set C ⊆ C there is an operator A ∈ B(H),
where H is a separable Hilbert space, such that σ(A) = C.
Hint: Prove and use that C has a countable dense subset.
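The hint's construction can be visualized in finite dimensions: a diagonal operator whose entries enumerate a countable dense subset of C has spectrum equal to the closure of the entries, i.e. to C. A small sketch (illustrative only; the finite grid in the unit square stands in for a countable dense subset):

```python
import numpy as np

# Stand-in for a countable dense subset of the compact set
# C = {x + iy : 0 <= x, y <= 1} (illustrative finite truncation)
pts = np.array([complex(j / 9, k / 9) for j in range(10) for k in range(10)])
A = np.diag(pts)  # diagonal operator with these eigenvalues

# The spectrum of a diagonal operator is the closure of its diagonal entries;
# for the infinite operator over a dense sequence this closure is all of C.
eig = np.sort_complex(np.linalg.eigvals(A))
assert np.allclose(eig, np.sort_complex(pts))
```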
The next exercise is a (quite weak, given the strong hypothesis) converse of Exercise 10.3:

10.40 Exercise Let V be a Banach space and A ∈ B(V ) such that σ(A) is disjoint from the
circle C = {z ∈ C | |z − z0 | = r}.
(i) Apply Exercise 10.30 to A − z0 1 to obtain P 2 = P ∈ B(V ) satisfying P A = AP .
(ii) Prove that AVi ⊆ Vi , where V1 = P V, V2 = (1 − P )V . Conclude that A = A1 ⊕ A2 where Ai = A|Vi .
(iii) Prove V1 = {x ∈ V | limn→∞ (A − z0 1)n x = 0}.
(iv) Deduce σ(A1 ) = σ(A) ∩ B(z0 , r) and σ(A2 ) = σ(A)\B(z0 , r).

10.41 Remark The unnatural assumption that the two parts of the spectrum are separated by
a circle can be removed using holomorphic functional calculus. Cf. e.g. [60, §149]. For normal
operators on Hilbert space, there is an alternative approach, cf. Proposition 13.10. 2

10.42 Exercise (Discrete Spectrum) Let V be a Banach space and A ∈ B(V ). If λ ∈ σ(A)
is isolated, pick r > 0 such that B(λ, r)∩σ(A) = {λ} and let Pλ be the idempotent corresponding
to C = {z | |z − λ| = r} constructed in the preceding exercise. Now put

σd (A) = {λ ∈ σ(A) | λ is isolated and dim Pλ V < ∞}.

Prove σd (A) ⊆ σp (A).

10.4 Characters. Spectrum of a Banach algebra


We now develop a new perspective on the spectrum that will prove very powerful, allowing us to obtain results that would be hard to reach in other ways. For example: if A is a unital Banach algebra and a, b ∈ A, what can we say about σ(a + b) or σ(ab)? Using only the definition of the spectrum this seems quite difficult. In this section we require k1k = 1.

10.43 Definition If A, B are F-algebras, an (algebra) homomorphism α : A → B is a linear
map such that also α(aa0 ) = α(a)α(a0 ) ∀a, a0 ∈ A. If A, B are unital, α is called unital if
α(1A ) = 1B . Algebra homomorphisms from an F-algebra to F are called characters.

10.44 Lemma If A, B are unital algebras and α : A → B is a unital algebra homomorphism


then σB (α(a)) ⊆ σA (a).
Proof. If λ 6∈ σA (a) then a − λ1A ∈ A has an inverse b. Then α(b) is an inverse for
α(a − λ1A ) = α(a) − λ1B ∈ B, thus λ 6∈ σB (α(a)). 

10.45 Lemma Let A be a unital Banach algebra. Then every non-zero character ϕ : A → F
satisfies ϕ(1) = 1, ϕ(a) ∈ σ(a) ∀a ∈ A and kϕk = 1, thus ϕ is continuous.
Proof. If ϕ(1) = 0 then ϕ(a) = ϕ(a1) = ϕ(a)ϕ(1) = 0 for all a ∈ A, thus ϕ = 0. Thus
ϕ 6= 0 ⇒ ϕ(1) 6= 0. Now ϕ(1) = ϕ(12 ) = ϕ(1)2 implies ϕ(1) = 1.
We have just proven that every non-zero character is a unital homomorphism. Thus by
Lemma 10.44, σ(ϕ(a)) ⊆ σ(a). Since the spectrum of z ∈ F clearly is {z}, this means ϕ(a) ∈
σ(a), thus |ϕ(a)| ≤ r(a) ≤ kak by Proposition 10.12, whence kϕk ≤ 1. Since we require k1k = 1,
we also have kϕk ≥ |ϕ(1)|/k1k = 1. 

10.46 Definition If A is a unital Banach algebra, the spectrum Ω(A) of A is the set of non-
zero characters ϕ : A → F.

10.47 Exercise Let X be a compact Hausdorff space and A = C(X, F). For every x ∈ X
define ϕx : A → F, f 7→ f (x). Prove:
(i) ϕx is a non-zero character of A, thus ϕx ∈ Ω(A), for each x ∈ X.
(ii) The map ι : X → Ω(A), x 7→ ϕx is injective.
(iii) For each f ∈ A we have σ(f ) = {ϕ(f ) | ϕ ∈ Ω(A)}.
(Later we will define a topology on Ω(A) and see that ι is a homeomorphism.)
One could hope that σ(a) = {ϕ(a) | ϕ ∈ Ω(A)} holds for every unital Banach algebra A
and a ∈ A. But this is too much to ask since a non-commutative algebra A may well have
Ω(A) = ∅! E.g., this holds for all matrix algebras Mn×n (F), n ≥ 2, since these are simple (no non-zero proper two-sided ideals), so that a homomorphism to another algebra B must be zero or injective, the latter being impossible for B = F for dimension reasons.

10.48 Proposition Let A be a commutative unital Banach algebra over C. Then


(i) If ϕ ∈ Ω(A) then ker ϕ ⊆ A is a maximal ideal (i.e. not contained in a larger proper ideal).
(ii) Every maximal ideal in A is the kernel of a unique ϕ ∈ Ω(A). In particular, Ω(A) 6= ∅.
(iii) For each a ∈ A we have
σ(a) = {ϕ(a) | ϕ ∈ Ω(A)}. (10.7)

Proof. (i) Every ϕ ∈ Ω(A) is continuous, thus M = ker ϕ is a closed ideal. We have M 6= A
since ϕ 6= 0. This ideal has codimension one since A/M ∼ = C and therefore is maximal.
(ii) Now let M ⊆ A be a maximal ideal. Since maximal ideals are proper, no element of M
is invertible. For each b ∈ M we have k1 − bk ≥ 1 since otherwise b = 1 − (1 − b) would be
invertible by Lemma 10.10(i). (This is the only place where completeness is used.) Thus $1 \notin \overline{M}$,

so that $\overline{M}$ is a proper ideal containing M . Since M is maximal, we have $\overline{M} = M$, thus M is
closed. Now by Proposition 6.2(vi), A/M is a normed algebra, and by a well-known algebraic
argument the maximality of M implies that A/M is a division algebra. Thus A/M ∼ = C by
Gelfand-Mazur (Corollary 10.21, which holds only over C), so that there is a unique isomorphism
α : A/M → C sending 1 ∈ A/M to 1 ∈ C. If p : A → A/M is the quotient homomorphism
then ϕ = α ◦ p : A → C is a non-zero character with ker ϕ = M . This ϕ clearly is unique. The
last statement follows from the fact that every commutative unital algebra has maximal ideals
(by a standard Zorn argument).
(iii) We already know that {ϕ(a) | ϕ ∈ Ω(A)} ⊆ σ(a), so that it remains to prove that for every
λ ∈ σ(a) there is a ϕ ∈ Ω(A) such that ϕ(a) = λ. If λ ∈ σ(a) then a − λ1 6∈ Inv A. Thus
I = (a − λ1)A ⊆ A is a proper ideal. (Here we need the commutativity of A since otherwise
this would only be a right ideal!) Using Zorn’s lemma, we can find a maximal ideal M ⊇ I.
By (ii) there is a ϕ ∈ Ω(A) such that ker ϕ = M . Since a − λ1 ∈ I ⊆ M = ker ϕ, we have
ϕ(a − λ1) = 0 and therefore ϕ(a) = λ. 

10.49 Exercise Let A be a commutative unital Banach algebra over C and a, b ∈ A. Prove:
(i) σ(a + b) ⊆ σ(a) + σ(b) and σ(ab) ⊆ σ(a)σ(b).
(ii) r(a + b) ≤ r(a) + r(b) and r(ab) ≤ r(a)r(b).
(iii) If A is non-commutative but ab = ba then (ii) still holds.
(iv) If A is non-commutative but ab = ba then (i) still holds.
Hint: For (iii), use an abelian subalgebra, for (iv) a maximal one.

10.50 Exercise Give an example of a commutative Banach algebra over R for which (ii) and
(iii) of Proposition 10.48 (with C replaced by R) fail.
From now on, whether we say it explicitly or not, we assume F = C in all con-
siderations involving spectra! I.e. essentially everywhere except most of Sections
11, 14.1, 16. Finding out whether a result also holds over R usually is very easy.

We could now discuss the contents of Section 12.1, but it seems better first to go on with
our study of operators on Banach and Hilbert spaces.

11 Transpose and adjoint of bounded operators. C ∗-


algebras
11.1 The transpose of a bounded Banach space operator
Let E, F be normed spaces, A ∈ B(E, F ) and ϕ ∈ F ∗ = B(F, F). Then ϕ ◦ A ∈ B(E, F) = E ∗ .
This defines a bilinear map B(E, F ) × F ∗ → E ∗ , and keeping A fixed, we have a linear map
At : F ∗ → E ∗ , ϕ 7→ ϕ ◦ A, which is called the transpose (or adjoint) of A. We will stick to
‘transpose’ to avoid confusion with the Hilbert space adjoint. Note that the transpose goes in
the ‘opposite direction’! In fact, if E, F, G are normed spaces and A ∈ B(E, F ), B ∈ B(F, G)
then (B ◦ A)t = At ◦ B t in B(G∗ , E ∗ ).
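In finite dimensions, where E ∗ is identified with E via the standard dual bases, the transpose of the operator given by a matrix is given by the transposed matrix. A quick numerical check of the contravariance (B ◦ A)t = At ◦ B t and of the isometry of transposition (illustrative sketch; the random matrices are my choice):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))  # A : E = R^4 -> F = R^3
B = rng.standard_normal((2, 3))  # B : F -> G = R^2

# contravariance: (B o A)^t = A^t o B^t in B(G*, E*)
assert np.allclose((B @ A).T, A.T @ B.T)
# transposition is isometric: operator norm = largest singular value,
# which is the same for A and A^T
assert np.isclose(np.linalg.norm(A, 2), np.linalg.norm(A.T, 2))
```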

11.1 Lemma The linear map B(E, F ) → B(F ∗ , E ∗ ) is isometric, i.e. kAt k = kAk.

Proof. By Proposition 7.8(i), for f ∈ F we have $\|f\| = \sup_{\varphi\in F^*,\,\|\varphi\|=1} |\varphi(f)|$. Thus
$$\|A\| = \sup_{\|e\|=1} \|Ae\| = \sup_{\|e\|=1} \sup_{\|\varphi\|=1} |\varphi(Ae)| = \sup_{\|\varphi\|=1} \sup_{\|e\|=1} |\varphi(Ae)| = \sup_{\|\varphi\|=1} \|A^t \varphi\| = \|A^t\|. \qquad\square$$

The transposition operation can be iterated, giving Att ∈ B(E ∗∗ , F ∗∗ ), etc.

11.2 Lemma Let E, F be Banach spaces and A ∈ B(E, F ). If ιE : E → E ∗∗ and ιF : F → F ∗∗


are the canonical inclusions, then Att ◦ ιE = ιF ◦ A.
Equivalently, Att : E ∗∗ → F ∗∗ maps E ⊆ E ∗∗ to F ⊆ F ∗∗ and Att  E = A.
Proof. Let x ∈ E, ϕ ∈ F ∗ . Then using the definition of ιE , ιF and of the transpose, we have

ιF (Ax)(ϕ) = ϕ(Ax) = (At ϕ)(x) = ιE (x)(At ϕ) = (Att ιE (x))(ϕ).

Now, ιF (Ax) and (Att ιE (x)) are in F ∗∗ , and the fact that they coincide on all ϕ ∈ F ∗ means
ιF (Ax) = Att ιE (x). And since this holds for all x ∈ E, we have ιF A = Att ιE , as claimed. 

11.3 Exercise Let E, F be Banach spaces and A ∈ B(E, F ). Prove:


(i) Prove ker At = (AE)⊥ .
(ii) If A ∈ B(E, F ) is invertible then At ∈ B(F ∗ , E ∗ ) is invertible.
(iii) If At ∈ B(F ∗ , E ∗ ) is invertible then A ∈ B(E, F ) is invertible. (Warning: We don’t assume reflexivity of the spaces involved!)
(iv) σ(A) = σ(At ) for each A ∈ B(E).
Hint: (i),(ii),(iv) are very easy. The proof of (iii) uses (i) and (ii).
11.4 Remark Combining Exercises 11.3(i) and 6.6 we have $\ker A^t = (AE)^\perp = (\overline{AE})^\perp \cong (F/\overline{AE})^*$. Thus dim ker At < ∞ is equivalent to $\dim(F/\overline{AE}) < \infty$, i.e. finite dimensionality of the topological cokernel of A. Compare Remark 9.14. 2

11.2 The adjoint of a bounded Hilbert space operator


In the rest of this section we will study aspects of bounded (linear) operators that are specific
to bounded operators between Hilbert spaces, as well as the closely related C ∗ -algebras.
If H is a Hilbert space then we have a canonical map γH : H → H ∗ given by y 7→ ϕy = h·, yi.
This map is antilinear, and by Riesz’ representation theorem it is a bijection. This bijection
in a sense makes the dual spaces of Hilbert spaces redundant to a large extent, so that it is
desirable to eliminate them from considerations of the transpose:
11.5 Proposition Let H1 , H2 be Hilbert spaces. For A ∈ B(H1 , H2 ), define $A^* := \gamma_{H_1}^{-1} \circ A^t \circ \gamma_{H_2} : H_2 \to H_1$.
γH2 : H2 → H1 .
(i) The map A∗ : H2 → H1 is linear.
(ii) The map B(H1 , H2 ) → B(H2 , H1 ), A 7→ A∗ is anti-linear.
(iii) For all x ∈ H1 , y ∈ H2 we have hAx, yi2 = hx, A∗ yi1 .

Proof. (i) Linearity of $A^* : H_2 \to H_1$ follows from its being the composite of the linear map $A^t$ with the two anti-linear maps $\gamma_{H_2}$ and $\gamma_{H_1}^{-1}$.
(ii) Additivity of $A \mapsto A^*$ is obvious. Let $A \in B(H_1, H_2)$, $c \in \mathbb{C}$, $x \in H_2$. Then
$$(cA)^*(x) = \big(\gamma_{H_1}^{-1} \circ (cA)^t \circ \gamma_{H_2}\big)(x) = \gamma_{H_1}^{-1}\big(c\,A^t(\gamma_{H_2}(x))\big) = \overline{c}\,\gamma_{H_1}^{-1}\big(A^t(\gamma_{H_2}(x))\big) = \overline{c}\,A^*(x),$$
where we used the linearity of $A \mapsto A^t$ and the anti-linearity of $\gamma_{H_1}^{-1}$. This shows $(cA)^* = \overline{c}A^*$. (The anti-linearity of $\gamma_{H_2}$ is irrelevant here.)
(iii) If $y \in H_2$ then $\gamma_{H_2}(y) \in H_2^*$ is the functional $\langle\cdot, y\rangle_2$. Then $(A^t \circ \gamma_{H_2})(y) \in H_1^*$ is the functional $x \mapsto \langle Ax, y\rangle_2$. Thus $z = A^*y = (\gamma_{H_1}^{-1} \circ A^t \circ \gamma_{H_2})(y) \in H_1$ is a vector such that $\langle x, z\rangle_1 = \langle Ax, y\rangle_2$ for all $x \in H_1$. This means $\langle x, A^*y\rangle_1 = \langle Ax, y\rangle_2$ ∀x ∈ H1 , y ∈ H2 , as claimed.


There is a useful bijection between bounded operators and bounded sesquilinear forms. It
can be used to give an alternative (at least in appearance) construction of the adjoint A∗ (and
for many other purposes). It is based on the following observation: If A ∈ B(H) satisfies
hAx, yi = 0 for all x, y ∈ H then Ax = 0 for all x, thus A = 0. Applying this to A − B shows
that hAx, yi = hBx, yi ∀x, y implies A = B. Thus bounded operators are determined by their
‘matrix elements’. This motivates the following developments.

11.6 Definition Let V be an F-vector space. A map V × V → F, (x, y) 7→ [x, y] is called sesquilinear if it is linear w.r.t. x and anti-linear w.r.t. y. A sesquilinear form [·, ·] is bounded if $\sup_{\|x\|=\|y\|=1} |[x, y]| < \infty$.

11.7 Remark Recall that the inner product h·, ·i on a (pre)Hilbert space is sesquilinear and
bounded by Cauchy-Schwarz. If F = R, the definition of course reduces to bilinearity. 2

11.8 Proposition Let H be a Hilbert space. Then there is a bijection between B(H) and the
set of bounded sesquilinear forms on H, given by B(H) 3 A 7→ [·, ·]A , where [x, y]A = hAx, yi.
Proof. Let A ∈ B(H). Sesquilinearity of [·, ·]A = hAx, yi is an obvious consequence of sesquilin-
earity of h·, ·i and linearity of A, and boundedness follows from Cauchy-Schwarz:

|[x, y]A | = |hAx, yi| ≤ kAxkkyk ≤ kAkkxkkyk ∀x, y.

Now let [·, ·] be a sesquilinear form bounded by M . Then for each x ∈ H, the map $\psi_x : H \to \mathbb{C},\ y \mapsto \overline{[x, y]}$ is linear (thanks to the complex conjugation) and satisfies |ψx (y)| ≤ M kykkxk, thus ψx ∈ H ∗ . Thus by Theorem 5.29 there is a unique vector zx ∈ H such that ψx = ϕzx , thus $\overline{[x, y]} = \psi_x(y) = \varphi_{z_x}(y) = \langle y, z_x\rangle$ ∀y and, taking complex conjugates, hzx , yi = [x, y] ∀y. Thus
defining A : H → H by Ax = zx ∀x we have hAx, yi = [x, y] ∀x, y. Since the maps x 7→ ψx and
ψx 7→ zx are both anti-linear, their composite A is linear. And since H → H ∗ , z 7→ ϕz is an
isometry, we have kAxk = kzx k = kϕzx k = kψx k ≤ M kxk, thus A ∈ B(H). 

11.9 Proposition Let H be a Hilbert space and A ∈ B(H). Then there is a unique B ∈ B(H)
such that
hAx, yi = hx, Byi ∀x, y ∈ H.
This B is denoted A∗ and called the adjoint of A.
Proof. The map (y, x) 7→ hy, Axi is sesquilinear and bounded (by kAk). Thus by Proposition 11.8 there is a bounded B ∈ B(H) such that hBy, xi = hy, Axi ∀x, y. Taking the complex conjugate gives $\langle x, By\rangle = \overline{\langle By, x\rangle} = \overline{\langle y, Ax\rangle} = \langle Ax, y\rangle$, which is the wanted identity. 
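For H = Cn the adjoint of a matrix is its conjugate transpose. The following sketch (illustrative; note that the inner product is implemented linear in the first and anti-linear in the second argument, matching Definition 11.6) verifies the defining identity numerically:

```python
import numpy as np

def ip(x, y):
    """Inner product on C^n, linear in the first and anti-linear in the
    second argument (the convention of these notes)."""
    return np.sum(x * np.conj(y))

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

A_star = A.conj().T  # the adjoint of a matrix is its conjugate transpose
assert np.isclose(ip(A @ x, y), ip(x, A_star @ y))  # <Ax, y> = <x, A*y>
```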

11.10 Remark 1. In view of the identity hAx, yi = hx, A∗ yi satisfied by the adjoint as defined
above and Proposition 11.5(iii) (with H1 = H2 = H), it is clear that the two constructions of
A∗ give the same result (and in a sense are the same construction since both use Theorem 5.29).
2. It is obvious that A ∈ B(H) is self-adjoint as defined earlier (hAx, yi = hx, Ayi ∀x, y ∈ H)
if and only if A∗ = A.
3. If [·, ·] is a sesquilinear form then also [x, y]0 := [y, x] is a sesquilinear form, called the
adjoint form. Looking at the above definition of A∗ , one finds that A∗ is the bounded operator
associated with the form [·, ·]0 . Thus self-adjointness of A is equivalent to [·, ·]0A = [·, ·]A , i.e.
self-adjointness of [·, ·]A . (Self-adjoint forms and operators are also called hermitian.) 2

11.11 Lemma The map B(H) → B(H), A 7→ A∗ satisfies


(i) $(cA + dB)^* = \overline{c}A^* + \overline{d}B^*$ ∀A, B ∈ B(H), c, d ∈ F (antilinearity).
(ii) (AB)∗ = B ∗ A∗ (anti-multiplicativity).
(iii) A∗∗ = A (involutivity).
(iv) 1∗ = 1.
Proof. (i) Follows from
$$\langle x, (\overline{c}A^* + \overline{d}B^*)y\rangle = c\langle x, A^*y\rangle + d\langle x, B^*y\rangle = c\langle Ax, y\rangle + d\langle Bx, y\rangle = \langle (cA + dB)x, y\rangle = \langle x, (cA + dB)^*y\rangle.$$

(ii) hx, (AB)∗ yi = h(AB)x, yi = hBx, A∗ yi = hx, B ∗ A∗ yi.


(iii) Complex conjugating hAx, yi = hx, A∗ yi gives
$$\langle y, Ax\rangle = \overline{\langle Ax, y\rangle} = \overline{\langle x, A^*y\rangle} = \langle A^*y, x\rangle,$$

which shows that A is an adjoint of A∗ . Uniqueness of the adjoint now implies A∗∗ = A.
(iv) Obvious. 

11.12 Proposition Let H be a Hilbert space. Then for all A ∈ B(H) we have
(i) kA∗ k = kAk. (The ∗-operation is isometric.)
(ii) kA∗ Ak = kAk2 . (“C ∗ -identity”)
Proof. (i) Similarly to Lemma 11.1, using (5.2) we have
$$\|A^*\| = \sup_{\|x\|=\|y\|=1} |\langle A^*x, y\rangle| = \sup_{\|x\|=\|y\|=1} |\langle x, Ay\rangle| = \sup_{\|x\|=\|y\|=1} |\langle Ay, x\rangle| = \|A\|.$$
(ii) On the one hand, kA∗ Ak ≤ kA∗ kkAk = kAk2 , where we used (i). On the other, using (5.2) we have
$$\|A^*A\| = \sup_{\|x\|=1} \|A^*Ax\| = \sup_{\|x\|=\|y\|=1} |\langle A^*Ax, y\rangle| = \sup_{\|x\|=\|y\|=1} |\langle Ax, Ay\rangle| \ge \sup_{\|x\|=1} \langle Ax, Ax\rangle = \sup_{\|x\|=1} \|Ax\|^2 = \Big(\sup_{\|x\|=1} \|Ax\|\Big)^2 = \|A\|^2. \qquad\square$$
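Both parts of Proposition 11.12 can be checked numerically for matrices, where the operator norm is the largest singular value (illustrative sketch with a random matrix of my choosing):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
op = lambda M: np.linalg.norm(M, 2)  # operator norm on B(C^4)

assert np.isclose(op(A.conj().T), op(A))           # (i)  ||A*|| = ||A||
assert np.isclose(op(A.conj().T @ A), op(A) ** 2)  # (ii) C*-identity
```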

The above construction of A∗ for A ∈ B(H) can be generalized to bounded linear maps
A : H1 → H2 , so as to give A∗ : H2 → H1 satisfying

hAx, yi2 = hx, A∗ yi1 ∀x ∈ H1 , y ∈ H2 .

We refrain from doing so explicitly since, as in the case H1 = H2 , the result would be the same
as that of the construction in Proposition 11.5. Instead we take the latter as definition of A∗ in
the general case. Now one has:

11.13 Lemma Let H1 , H2 be Hilbert spaces and A ∈ B(H1 , H2 ). Then


(i) A is an isometry if and only if A∗ A = idH1 .
(ii) A is unitary if and only if A∗ A = idH1 and AA∗ = idH2 .
Proof. (i) By definition, A is an isometry if hAx, Ayi2 = hx, yi1 for all x, y ∈ H1 . Since the l.h.s.
equals hx, A∗ Ayi1 , A is an isometry if and only if hA∗ Ax, yi1 = hx, yi1 for all x, y ∈ H1 , which
is equivalent to A∗ A = idH1 by the observations at the beginning of the section.
(ii) By definition, a unitary is a surjective isometry, which is equivalent to being an invertible
isometry, or an isometry that has an isometry as inverse. Now apply (i). 
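A concrete instance of Lemma 11.13 (illustrative sketch): a matrix with orthonormal columns is an isometry C2 → C3 , satisfies A∗ A = id but not AA∗ = id, hence is not unitary:

```python
import numpy as np

# A matrix with orthonormal columns: an isometry C^2 -> C^3
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

assert np.allclose(V.conj().T @ V, np.eye(2))      # V*V = id_{H_1}: isometry
assert not np.allclose(V @ V.conj().T, np.eye(3))  # VV* != id_{H_2}: not unitary
```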

11.14 Exercise Consider the left and right shift operators L, R of Definition 10.1 in the Hilbert
space `2 (N, C).
(i) Prove L∗ = R and R∗ = L.
(ii) Prove σ(L) = σ(R) = B(0, 1) (closed unit disk). Hint: Use Exercise 10.33.

11.3 Involutions. Definition of C ∗ -algebras


The properties of the adjoint map A 7→ A∗ on B(H) motivate some definitions:

11.15 Definition Let A be a C-algebra. A map ∗ : A → A satisfying antilinearity, antimulti-


plicativity and involutivity ((i)-(iii) in Lemma 11.11) is called an involution or ∗-operation. An
algebra with a chosen ∗ is called a ∗-algebra. A ∗-homomorphism α : A → B of ∗-algebras is a
homomorphism satisfying α(a∗ ) = α(a)∗ ∀a ∈ A.
If a ∗-algebra has a unit 1 we automatically have 1∗ = 11∗ = 1∗∗ 1∗ = (11∗ )∗ = (1∗ )∗ = 1.

11.16 Definition If A is a Banach algebra and ∗ : A → A an involution then A is called a


• Banach ∗-algebra if ka∗ k = kak ∀a ∈ A.
• C ∗ -algebra if ka∗ ak = kak2 ∀a ∈ A.41

11.17 Lemma Every C ∗ -algebra is a Banach ∗-algebra. If it has a unit 1 then k1k = 1.
Proof. With the C ∗ -identity and submultiplicativity we have kak2 = ka∗ ak ≤ ka∗ kkak, thus
kak ≤ ka∗ k for all a ∈ A. Replacing a by a∗ herein gives the converse inequality, thus ka∗ k = kak.
If 1 is a unit then k1k2 = k1∗ 1k = k1∗ k = k1k, and since k1k 6= 0 this implies k1k = 1. 
41
The original definition by Gelfand and Naimark (1942) had the additional axiom that a∗ a + 1 be invertible for
each a. This turned out to be redundant, cf. Proposition 12.13.

11.18 Remark 1. Clearly B(H) is a C ∗ -algebra for each Hilbert space H. Since this holds also for real Hilbert spaces, it shows that one can discuss Banach ∗-algebras and C ∗ -algebras over R. But we will consider only complex ones.
2. There is no special name for the non-complete variants of the above definitions. But a
submultiplicative norm on a ∗-algebra satisfying the C ∗ -identity is called a C ∗ -norm, whether
A is complete w.r.t. it or not. Completion of a ∗-algebra w.r.t. a C ∗ -norm gives a C ∗ -algebra,
and this is an important way of constructing new C ∗ -algebras. 2

11.19 Exercise Recall the Banach algebra A = `1 (Z, C) from Example 10.25. Show that both $f^*(n) = \overline{f(n)}$ and $f^*(n) = \overline{f(-n)}$ are involutions on A making it a Banach ∗-algebra. Show that neither of them satisfies the C ∗ -identity. (Thus Banach-∗ 6⇒ C ∗ .)
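For the involution $f^*(n) = \overline{f(-n)}$ of Exercise 11.19 the failure of the C ∗ -identity can already be witnessed on a sequence with four-point support. The sketch below is illustrative; the particular f is my choice, and sequences are dicts n 7→ f (n) as in Example 10.25:

```python
# Finitely supported sequences on Z as dicts n -> value (cf. Example 10.25)
def conv(f, g):
    h = {}
    for m, fm in f.items():
        for s, gs in g.items():
            h[m + s] = h.get(m + s, 0) + fm * gs
    return h

norm1 = lambda f: sum(abs(v) for v in f.values())
star = lambda f: {-n: v.conjugate() for n, v in f.items()}  # f*(n) = conj(f(-n))

f = {0: 1 + 0j, 1: 1 + 0j, 2: 1 + 0j, 3: -1 + 0j}
lhs, rhs = norm1(conv(star(f), f)), norm1(f) ** 2
assert abs(lhs - 8) < 1e-9 and abs(rhs - 16) < 1e-9  # ||f* x f||_1 = 8 != 16 = ||f||_1^2
```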

11.20 Lemma Let X be a compact space. For f ∈ C(X, C), define f ∗ by $f^*(x) = \overline{f(x)}$. Then C(X, C) is a C ∗ -algebra. The same holds for Cb (X), where X is arbitrary, thus also for `∞ (S, C).
Proof. We know that C(X, C) equipped with the norm kf k = supx |f (x)| is a Banach algebra. It is immediate that ∗ is an involution. The computation
$$\|f^* f\| = \sup_x |\overline{f(x)}f(x)| = \sup_x |f(x)|^2 = \Big(\sup_x |f(x)|\Big)^2 = \|f\|^2$$

proves the C ∗ -identity. It is clear that this generalizes to the bounded continuous functions on
any space X. 

In a sense, the examples B(H) and C(X, C) for compact X are all there is: One can prove,
as we will do in Theorem 17.11, that every commutative unital C ∗ -algebra is isometrically
∗-isomorphic to C(X, C) for some compact Hausdorff space X, determined uniquely up to
homeomorphism. (For example one has `∞ (S, C) ∼ = C(βS, C), where βS is the Stone-Čech

compactification of (S, τdisc ).) And every C ∗ -algebra is isometrically ∗-isomorphic to a norm-
closed ∗-subalgebra of B(H) for some Hilbert space H. Cf. e.g. [50].

11.21 Definition Let A be a C-algebra with an involution ∗. Then a ∈ A is called


• self-adjoint if a = a∗ .
• unitary if aa∗ = a∗ a = 1. (Obviously A needs to be unital.)
• normal if aa∗ = a∗ a.
• orthogonal projection if a2 = a = a∗ .
A subset S of a ∗-algebra is called self-adjoint if S = S ∗ := {s∗ | s ∈ S}.
11.22 Remark 1. If A is a ∗-algebra and a ∈ A, we define $\mathrm{Re}(a) = \frac{a+a^*}{2}$, $\mathrm{Im}(a) = \frac{a-a^*}{2i}$. Now it is immediate that Re(a), Im(a) are self-adjoint and a = Re(a) + i Im(a). Furthermore, a is
it is immediate that Re(a), Im(a) are self-adjoint and a = Re(a) + i Im(a). Furthermore, a is
self-adjoint ⇔ Im(a) = 0, and a is normal ⇔ Re(a) and Im(a) commute.
2. Obviously all self-adjoint and all unitary elements of a C ∗ -algebra are normal. On
the other hand, non-unitary isometries (a∗ a = 1 6= aa∗ ) are not normal. Every element of
a commutative C ∗ -algebra is normal. Conversely, if a ∈ A is normal then the C ∗ -subalgebras
C ∗ (a) and C ∗ (1, a) of A, i.e. the smallest (unital) C ∗ -subalgebra containing a, are commutative.
Later we will see that normal elements of abstract C ∗ -algebras and normal operators on
Hilbert spaces have very nice properties and closely correspond to continuous functions and
multiplication operators, respectively, cf. Sections 15 and 17. 2

11.4 Spectrum of elements of a C ∗ -algebra
11.23 Lemma Let A be a unital C ∗ -algebra.
(i) If a ∈ A is invertible then a∗ is invertible and (a∗ )−1 = (a−1 )∗ .
(ii) σ(a∗ ) = σ(a)∗ := {$\overline{\lambda}$ | λ ∈ σ(a)}.42
Proof. (i) Taking the adjoint of the equation aa−1 = 1 = a−1 a gives (a−1 )∗ a∗ = 1 = a∗ (a−1 )∗ , thus (a∗ )−1 = (a−1 )∗ . (ii) By (i), a − λ1 is invertible if and only if $a^* - \overline{\lambda}1$ is invertible. 

11.24 Proposition Let A be a unital C ∗ -algebra. Then


(i) If a ∈ A is normal then r(a) = kak.
(ii) If u ∈ A is unitary then σ(u) ⊆ S 1 .
(iii) If a ∈ A is self-adjoint then σ(a) ⊆ R.
Proof. (i) If b = b∗ then kbk2 = kb∗ bk = kb2 k, and induction gives $\|b^{2^n}\| = \|b\|^{2^n}$ ∀n. If a is normal, then
$$\|a^{2^n}\| = \|(a^*)^{2^n} a^{2^n}\|^{1/2} = \|(a^*a)^{2^n}\|^{1/2} = \big(\|a^*a\|^{2^n}\big)^{1/2} = \|a\|^{2^n},$$
since a∗ a is self-adjoint. Now Theorem 10.18 gives
$$r(a) = \lim_{n\to\infty} \|a^{2^n}\|^{1/2^n} = \lim_{n\to\infty} \big(\|a\|^{2^n}\big)^{1/2^n} = \|a\|. \qquad (11.1)$$

(ii) We have kuk2 = ku∗ uk = k1k = 1, and in the same way ku−1 k = ku∗ k = 1. Now Exercise 10.13 gives σ(u) ⊆ S 1 .
(iii) Given λ ∈ σ(a), write λ = α + iβ with α, β ∈ R. Since σ(a + z1) = σ(a) + z ∀z ∈ C (why?), we have iβ(n + 1) = α + iβ − α + inβ ∈ σ(a − α1 + inβ1). Thus with r(c) ≤ kck (Proposition 10.12), the C ∗ -identity and k1k = 1 we have
$$(n^2 + 2n + 1)\beta^2 = |i\beta(n+1)|^2 \le r(a - \alpha 1 + in\beta 1)^2 \le \|a - \alpha 1 + in\beta 1\|^2 = \|(a - \alpha 1 - in\beta 1)(a - \alpha 1 + in\beta 1)\| = \|(a - \alpha 1)^2 + n^2\beta^2 1\| \le \|a - \alpha 1\|^2 + n^2\beta^2,$$
which simplifies to $(2n + 1)\beta^2 \le \|a - \alpha 1\|^2$ ∀n ∈ N. Thus β = 0 and λ ∈ R. 
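Parts (ii) and (iii) of Proposition 11.24 are easy to confirm numerically in B(Cn ) (illustrative sketch with random matrices of my choosing; a unitary is obtained from a QR decomposition):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

a = X + X.conj().T      # self-adjoint element of B(C^4)
u = np.linalg.qr(X)[0]  # unitary factor of a QR decomposition

assert np.allclose(np.linalg.eigvals(a).imag, 0, atol=1e-10)  # (iii) sigma(a) in R
assert np.allclose(np.abs(np.linalg.eigvals(u)), 1)           # (ii)  sigma(u) in S^1
```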

11.25 Remark 1. Since (i) implies kak = ka∗ ak1/2 = r(a∗ a)1/2 for all a ∈ A and the spectral
radius r(a) by definition depends only on the algebraic structure of A, the latter also determines
the norm, which therefore is unique in a C ∗ -algebra! But note that two C ∗ -norms on a ∗-algebra
A can be very different if A fails to be complete w.r.t. one of the two norms!
2. An alternative and perhaps more insightful proof for (iii) goes like this: Since $e^z \equiv \exp(z) = \sum_{n=0}^{\infty} z^n/n!$ converges absolutely for all z ∈ C, Proposition 3.2 gives convergence of exp(a) for all a ∈ A. It is easy to verify $(e^a)^* = e^{(a^*)}$ and $e^{a+b} = e^a e^b$ provided ab = ba. In particular we have ea ∈ InvA for all a with (ea )−1 = e−a . If now a = a∗ then u = eia satisfies u∗ = e−ia , thus uu∗ = u∗ u = 1, so that u is unitary and therefore σ(eia ) ⊆ S 1 by (ii). Now the holomorphic spectral mapping theorem, cf. Section 12.1, in particular (12.1), gives {eiλ | λ ∈ σ(a)} = σ(eia ) ⊆ S 1 , and this implies σ(a) ⊆ R. [The above argument only needs {eiλ | λ ∈ σ(a)} ⊆ σ(eia ), which can be proven more directly: For all λ ∈ C we have
$$e^{ia} - e^{i\lambda}1 = (e^{i(a-\lambda 1)} - 1)e^{i\lambda} = \left(\sum_{k=1}^{\infty} \frac{(i(a-\lambda 1))^k}{k!}\right) e^{i\lambda} = (a - \lambda 1)\,b\,e^{i\lambda},$$
42
If S ⊆ C we write S ∗ for {$\overline{s}$ | s ∈ S} since $\overline{S}$ could be confused with the closure.

where $b = i\sum_{k=1}^{\infty} \frac{(i(a-\lambda 1))^{k-1}}{k!} \in A$. Since a − λ1 and b commute, we have λ ∈ σ(a) ⇒ $e^{i\lambda} \in \sigma(e^{ia})$.] For another (quite striking) application of exp to C ∗ -algebras see Section B.8. 2

11.26 Exercise Let A be a unital C ∗ -algebra and a ∈ A normal. Prove:


(i) kan k = kakn ∀n.
(ii) There is λ ∈ σ(a) with |λ| = kak.
(iii) If σ(a) = {λ} then a = λ1.
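Parts (i) and (ii) of the exercise, and the contrast with non-normal elements, can be seen already in 2 × 2 examples (illustrative sketch; the particular matrices are my choices):

```python
import numpy as np

op = lambda M: np.linalg.norm(M, 2)           # operator norm
r = lambda M: max(abs(np.linalg.eigvals(M)))  # spectral radius

# a normal (here even self-adjoint) element: r(a) = ||a||, ||a^n|| = ||a||^n
a = np.array([[2.0, 1.0], [1.0, 2.0]])        # eigenvalues 1 and 3
assert np.isclose(r(a), op(a))
assert np.isclose(op(np.linalg.matrix_power(a, 3)), op(a) ** 3)

# a non-normal element: the nilpotent Jordan block has r = 0 < ||a|| = 1
n_ = np.array([[0.0, 1.0], [0.0, 0.0]])
assert r(n_) < 1e-12 and np.isclose(op(n_), 1.0)
```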
The following improvement over Corollary 10.24 and Exercise 10.27 illustrates that C ∗ -
algebras are better behaved than general Banach algebras:

11.27 Theorem Let A be a unital C ∗ -algebra, B ⊆ A a C ∗ -subalgebra containing 1. Then


(i) InvB = B ∩ InvA.
(ii) σB (b) = σA (b) for all b ∈ B.
Proof. (i) The inclusion σA (b) ⊆ σB (b) was part of Corollary 10.24, so that we are left with
proving ⊇. Let first b = b∗ ∈ B ∩ InvA. Proposition 11.24(iii) then gives that b − it1 is
invertible in B for all t ∈ R\{0} and in A for all t ∈ R. Lemma 10.14 implies that the function
f : R → A, t 7→ (b − it1)−1 is continuous. For t ∈ R\{0}, b − it1 is invertible in B, so that
uniqueness of inverses gives f (t) ∈ B for all t 6= 0. Now continuity of f and closedness of B ⊆ A
imply f (0) ∈ B. Since f (0) = b−1 , we have b−1 ∈ B, as claimed.
Let now b ∈ B ∩ InvA. Then b has an inverse a ∈ A. Furthermore, b∗ is invertible in A
(Lemma 11.23), thus also bb∗ . Since bb∗ is self-adjoint, it has an inverse c ∈ B by the above.
Thus bb∗ c = 1 = cbb∗ , so that b∗ c ∈ B is a right inverse for b. Now a = abb∗ c = b∗ c proves that
a = b∗ c, and with b, c ∈ B we have b−1 = a ∈ B.
(ii) This is immediate by (i) and the definition of the spectrum. 

11.28 Definition If A is a unital C ∗ -algebra, a ∈ A is called positive, or a ≥ 0, if a = a∗ and


σ(a) ⊆ [0, ∞).

11.29 Exercise Give an example of a unital C ∗ -algebra A and a ∈ A showing that σ(a) ⊆
[0, ∞) does not imply a = a∗ !

11.30 Exercise Prove: If A is a unital C ∗ -algebra and a, b ∈ A are positive and ab = ba then
a + b is positive. (Later we will use different methods to remove the condition ab = ba.)

11.31 Exercise Let A be a unital C ∗ -algebra. Prove: If a, b ∈ A are positive and a + b = 0


then a = b = 0.

12 Functional calculus in Banach and C ∗-algebras


12.1 Some functional calculus in Banach algebras
Let A ∈ B(E) be a bounded operator and f a function (we will soon be more specific). It is
natural to ask how to define f (A). The next question is: Determine σ(f (A)). Does it equal
f (σ(A))? These are the basic questions addressed by the many different ‘functional calculi’ that
there are: holomorphic, continuous, Borel, etc.

We immediately generalize the above questions to elements of unital Banach algebras, but
mostly we will (later) focus on C ∗ -algebras, to which B(H) belongs for each Hilbert space.
Defining f (a) poses no problem in the simplest case, which surely is f = P , a polynomial:

12.1 Definition If A is a unital algebra, a ∈ A and P (x) = cn xn +· · ·+c1 x+c0 is a polynomial,


we put P (a) = cn an + · · · + c1 a + c0 1.

12.2 Exercise Let A be a unital algebra and a ∈ A. Prove that the map C[x] → A, P 7→ P (a)
is a homomorphism of unital C-algebras.

12.3 Lemma Let A be a unital Banach algebra, a ∈ A and P ∈ C[x] a polynomial. Then

σ(P (a)) = P (σ(a)) := {P (λ) | λ ∈ σ(a)}.

Proof. Choose a maximal abelian Banach subalgebra B ⊆ A containing a. Since every ϕ ∈ Ω(B)
is a unital homomorphism, we have

ϕ(P (a)) = ϕ(cn an + · · · + c1 a + c0 1) = cn ϕ(a)n + · · · + c1 ϕ(a) + c0 = P (ϕ(a)).

Now by (10.7) we have

σB (P (a)) = {ϕ(P (a)) | ϕ ∈ Ω(B)} = {P (ϕ(a)) | ϕ ∈ Ω(B)} = {P (λ) | λ ∈ σ(a)} = P (σ(a)).

Now appeal to Exercise 10.27 to have σA (P (a)) = σB (P (a)). 
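For A = B(Cn ) the spectral mapping property of Lemma 12.3 says that the eigenvalues of P (a) are exactly the values of P on the eigenvalues of a. A numerical sketch (illustrative; the random matrix and the polynomial are my choices):

```python
import numpy as np

rng = np.random.default_rng(5)
a = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

def P_matrix(M):
    """P(z) = z^3 - 2z + 1 applied to a matrix (Definition 12.1)."""
    return np.linalg.matrix_power(M, 3) - 2 * M + np.eye(4)

P_scalar = lambda z: z ** 3 - 2 * z + 1

lhs = np.linalg.eigvals(P_matrix(a))  # sigma(P(a))
rhs = P_scalar(np.linalg.eigvals(a))  # P(sigma(a))
# the two spectra coincide as sets
assert all(min(abs(lhs - w)) < 1e-8 for w in rhs)
assert all(min(abs(rhs - w)) < 1e-8 for w in lhs)
```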

There is a more elementary proof of Lemma 12.3, which works in every normed algebra (but
does not lend itself to generalizations like (12.1) or Proposition 12.20(ii)):

12.4 Exercise Let A be a unital normed algebra and P ∈ C[z] with n = deg P .
(i) Prove σ(P (a)) = P (σ(a)) when n = 0.
(ii) Assume n ≥ 1 and λ ∈ C. Use a factorization $P(z) - \lambda = c_n \prod_{k=1}^{n} (z - z_k)$ to prove σ(P (a)) = P (σ(a)) without using characters.
(iii) Why did we assume A to be normed?

The above ‘polynomial functional calculus’ can be generalized: If the power series $f(z) = \sum_{n=0}^{\infty} c_n z^n$ has convergence radius R and a ∈ A satisfies kak < R then
$$\sum_{n=0}^{\infty} \|c_n a^n\| = \sum_{n=0}^{\infty} |c_n|\,\|a^n\| \le \sum_{n=0}^{\infty} |c_n|\,\|a\|^n < \infty,$$
where we used that the power series $\sum_n c_n z^n$ and $\sum_n |c_n| z^n$ have the same convergence radius. Thus we can define f (a) as $\sum_{n=0}^{\infty} c_n a^n$. Furthermore, if ϕ ∈ Ω(A) then by continuity of ϕ we have
$$\varphi(f(a)) = \varphi\Big(\lim_{N\to\infty} \sum_{n=0}^{N} c_n a^n\Big) = \lim_{N\to\infty} \varphi\Big(\sum_{n=0}^{N} c_n a^n\Big) = \lim_{N\to\infty} \sum_{n=0}^{N} c_n \varphi(a)^n = \sum_{n=0}^{\infty} c_n \varphi(a)^n = f(\varphi(a)).$$

Thus if A is commutative we have

σ(f (a)) = {ϕ(f (a)) | ϕ ∈ Ω(A)} = {f (ϕ(a)) | ϕ ∈ Ω(A)} = f (σ(a)). (12.1)

If A is non-commutative, we can find a maximal abelian Banach subalgebra B ⊆ A containing


a. Then f (a) ∈ B, the above gives σB (f (a)) = f (σB (a)), and Exercise 10.27 allows us to drop
the subscript.
If one defines HR to be the set of functions defined by power series with convergence radius
≥ R then HR is easily checked to be a commutative algebra (which coincides with the algebra
of functions holomorphic on B(0, R)). Now for every a ∈ A with kak < R one has a unital
homomorphism HR → A, f 7→ f (a). This can be generalized quite a bit, leading to the fully
fledged holomorphic functional calculus, which we do not discuss. See e.g. [32, 39, 73].
Of course every power series f (z) = ∑_{n=0}^∞ c_n z^n with infinite convergence radius is in all HR , thus it can be ‘applied’ to every a ∈ A. For example exp(a) = e^a = ∑_{n=0}^∞ a^n /n! converges for every a ∈ A. Whenever ab = ba one has e^a e^b = e^{a+b} = e^b e^a by essentially the same argument as for complex numbers. But the commutativity ab = ba is essential! (Why?) Now it follows that e^a e^{−a} = e^{a−a} = e^0 = 1 = e^{−a} e^a , thus e^a ∈ InvA with (e^a )^{−1} = e^{−a} .
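In matrix form this is easy to experiment with. A small sketch (not from the notes; numpy assumed, and `expm` below is just a truncated exponential series, adequate for these small matrices):

```python
import numpy as np

def expm(M, terms=40):
    """Partial sums of e^M = sum_n M^n / n!  (truncated series, fine for small kMk)."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = 2 * A + np.eye(2)        # a polynomial in A, hence AB = BA
C = np.array([[0.0, 0.0],
              [1.0, 0.0]])   # AC != CA

assert np.allclose(expm(A) @ expm(B), expm(A + B))      # commuting: e^A e^B = e^{A+B}
assert not np.allclose(expm(A) @ expm(C), expm(A + C))  # fails without commutativity
assert np.allclose(expm(A) @ expm(-A), np.eye(2))       # e^A invertible, inverse e^{-A}
```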

12.2 Continuous functional calculus for self-adjoint elements in a C ∗ -algebra
Our goal is to make sense of f (a), where a is a normal element of some arbitrary C ∗ -algebra
A, for all functions f ∈ C(σ(a), C), in such a way that f 7→ f (a) is a ∗-homomorphism. (If
you don’t care for this generality, you may substitute A = B(H).) We will first do this for
self-adjoint elements and then generalize to normal ones. We cannot hope to go beyond this: If
f (z) = z^2 z̄ then it is not clear whether to define f (a) as a^2 a∗ or aa∗ a or a∗ a^2 when aa∗ ≠ a∗ a.
[Yet ‘quantization theory’, motivated by quantum theory, tries to do it.] For normal a, this
problem does not arise.

12.5 Proposition Let A be a unital C ∗ -algebra, a ∈ A normal and P a polynomial. Then

kP (a)k = sup_{λ∈σ(a)} |P (λ)| = kP|σ(a) k∞ .

Proof. Normality of a implies that P (a) is normal. Thus

kP (a)k = r(P (a)) = sup_{λ∈σ(P (a))} |λ| = sup_{λ∈σ(a)} |P (λ)|.

The first equality is due to Proposition 11.24(i), the second is the definition of r, and the third comes from Lemma 12.3. 

Even though we are after a result for all normal operators, we first consider self-adjoint
operators:

12.6 Theorem Let A be a unital C ∗ -algebra and a = a∗ ∈ A. Then there is a unique con-
tinuous ∗-homomorphism αa : C(σ(a), C) → A such that αa (P ) = P (a) for all polynomials.
(Usually we will write f (a) instead of αa (f ).) It satisfies
(i) kαa (f )k = sup_{λ∈σ(a)} |f (λ)|. (Thus αa is an isometry.)

(ii) The image of αa is the smallest C ∗ -subalgebra B ⊆ A containing 1 and a, and αa :
C(σ(a), C) → B is a ∗-isomorphism.
(iii) σ(αa (f )) = f (σ(a)) = {f (λ) | λ ∈ σ(a)}. (Spectral mapping theorem)
(iv) If g ∈ C(f (σ(a)), C) then αa (g ◦ f ) = ααa (f ) (g), or just g(f (a)) = (g ◦ f )(a).
Proof. (i) By Propositions 10.12 and 11.24(iii), we have σ(a) ⊆ [−kak, kak]. By the classical
Weierstrass approximation theorem, cf. Theorem A.25, for every continuous function
f : [c, d] → C and ε > 0 there is a polynomial P such that |f (x) − P (x)| ≤ ε for all x ∈ [c, d].
We cannot apply this directly since σ(a), while contained in an interval, need not be an entire
interval. But using Tietze’s extension theorem, cf. Appendix A.6, we can find (very non-
uniquely) a continuous function g : [−kak, kak] → C that coincides with f on σ(a). Now this g
can be approximated uniformly by polynomials thanks to Weierstrass’ theorem. (Alternatively,
apply the more abstract Stone-Weierstrass theorem directly to f .) In any case, the restriction
of the polynomials to σ(a) is dense in C(σ(a), C) w.r.t. k · k∞ . By Proposition 12.5, the map
C(σ(a), C) ⊇ C[x]|σ(a) → A, P 7→ P (a) is an isometry. Thus applying Lemma 3.17 we obtain
a unique isometry αa : C(σ(a), C) → A extending P 7→ P (a). Thus (i) is proven up to the claim
that αa is a ∗-homomorphism. This is left as an exercise.
(ii) Since αa is a ∗-homomorphism, B := αa (C(σ(a), C)) ⊆ A is a ∗-subalgebra. And since
αa is an isometry by (i) and (C(σ(a), C), k · k∞ ) is complete, B is closed, thus a C ∗ -algebra.
Since αa maps the constant-one function to 1 ∈ A and the inclusion map σ(a) ,→ C to a, B
contains 1, a. Conversely, the smallest C ∗ -subalgebra of A containing 1 and a clearly is obtained
by taking the norm-closure of the set {P (a) | P ∈ C[z]}, which is contained in the image of αa .
(iii) Let f ∈ C(σ(a), C). Then clearly αa (f ) ∈ B. Now

σA (αa (f )) = σB (αa (f )) = σC(σ(a),C) (f ) = f (σ(a)),

where the equalities come from Theorem 11.27, from the fact that αa : C(σ(a), C) → B is a
∗-isomorphism, and from Exercise 10.6, respectively.
(iv) If {Pn } is a sequence of polynomials converging to f uniformly on σ(a) and {Qn } is a
sequence of polynomials converging to g uniformly on σ(f (a)), then Qn ◦Pn converges uniformly
to g ◦ f , thus Qn (Pn (a)) = (Qn ◦ Pn )(a) converges to (g ◦ f )(a). On the other hand, {Qn (Pn (a))}
converges in norm to g(f (a)). 

12.7 Exercise Prove that αa is a ∗-homomorphism.
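For a self-adjoint matrix, αa is realized concretely by the eigendecomposition: if A = U diag(λ1 , . . . , λn ) U∗ then f (A) = U diag(f (λ1 ), . . . , f (λn )) U∗. A numerical sketch of the multiplicativity and isometry properties (an illustration, not part of the notes; numpy assumed):

```python
import numpy as np

def func_calc(f, A):
    """f(A) for self-adjoint A via the eigendecomposition A = U diag(lam) U*."""
    lam, U = np.linalg.eigh(A)
    return U @ np.diag(f(lam)) @ U.conj().T

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                  # self-adjoint, sigma(A) = {1, 3}
f, g = np.sqrt, np.square

# multiplicativity: (f g)(A) = f(A) g(A)
assert np.allclose(func_calc(lambda t: f(t) * g(t), A),
                   func_calc(f, A) @ func_calc(g, A))
# isometry: kf(A)k equals the sup of |f| over sigma(A)
assert np.isclose(np.linalg.norm(func_calc(f, A), 2),
                  np.abs(f(np.linalg.eigvalsh(A))).max())
```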

12.8 Remark The isometric ∗-isomorphism αa : C(σ(a), C) → B = C ∗ (1, a) is a special case


of the Gelfand isomorphism for commutative unital C ∗ -algebras proven in Section 17 (which
will go in the opposite direction π : A → C(Ω(A), C)). A general commutative C ∗ -algebra A is
not generated by a single element a, so that we’ll need to find a substitute for σ(a). It shouldn’t
be surprising that this will be Ω(A) (with a suitable topology). 2

Here are a few of the many applications of the preceding results:

12.9 Exercise (i) Define f+ , f− : R → R by f+ (x) = max(x, 0), f− (x) = − min(x, 0). Prove
the alternative formulae f± (x) = (|x| ± x)/2 and f+ f− = 0 and f± ∈ C(R, R).
(ii) Let now A be a unital C ∗ -algebra and a = a∗ ∈ A. Define a± ∈ A by functional calculus
as a± = f± (a). Prove: 1. a+ − a− = a and a+ + a− = |a|, 2. a+ a− = a− a+ = 0, 3.
a+ ≥ 0, a− ≥ 0.
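For a self-adjoint matrix, the decomposition of Exercise 12.9 amounts to splitting the eigenvalues into their positive and negative parts. A quick numerical check (an aside, not part of the notes; numpy assumed):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])                  # self-adjoint, sigma(A) = {3, -1}
lam, U = np.linalg.eigh(A)
a_plus  = U @ np.diag(np.maximum(lam, 0)) @ U.conj().T
a_minus = U @ np.diag(np.maximum(-lam, 0)) @ U.conj().T
a_abs   = U @ np.diag(np.abs(lam)) @ U.conj().T

assert np.allclose(a_plus - a_minus, A)               # a+ - a- = a
assert np.allclose(a_plus + a_minus, a_abs)           # a+ + a- = |a|
assert np.allclose(a_plus @ a_minus, 0)               # a+ a- = 0
assert np.all(np.linalg.eigvalsh(a_plus) >= -1e-12)   # a+ >= 0
```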

12.10 Proposition Let A be a unital C ∗ -algebra.
(i) If a = a∗ ∈ A then a2 is positive.
(ii) If a ∈ A is positive then there is a positive b ∈ A such that b^2 = a, unique in C ∗ (1, a) (and in A). We write b = √a.
Proof. (i) It is clear that a2 is self-adjoint. Since σ(a) ⊆ R by Proposition 11.24(iii), the spectral
mapping theorem (Lemma 12.3 suffices) gives σ(a2 ) = {λ2 | λ ∈ σ(a)} ⊆ [0, ∞), thus a2 ≥ 0.
(ii) In view of a ≥ 0 we have σ(a) ⊆ [0, ∞). Now continuity of the function [0, ∞) → [0, ∞), x 7→ +√x allows us to define b = √a by the continuous functional calculus. It is immediate by construction that b = b∗ , and the spectral mapping theorem gives σ(b) ⊆ [0, ∞), thus b ≥ 0. Now b^2 = (a^{1/2} )^2 = a since (√x)^2 = x. If c ∈ C ∗ (1, a) is positive and c^2 = a then
c = b. This follows from the ∗-isomorphism C ∗ (1, a) ∼ = C(σ(a), C) and the fact that positive
square roots are unique in the function algebra C(σ(a), C) (why?). The stronger result that
positive square roots are unique even in A will be proven in Remark 12.15. 
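The fact that √a lies in C ∗ (1, a), i.e. is a norm-limit of polynomials in 1 and a, can be made concrete: if σ(a) ⊆ (0, 1], the binomial series √(1 − x) = ∑_k binom(1/2, k)(−x)^k applied to x = 1 − a yields polynomial approximants of √a. A sketch for a positive definite matrix (an illustration, not from the notes; numpy assumed, the number of terms is an ad hoc choice):

```python
import numpy as np

def sqrt_poly(A, terms=200):
    """Polynomial approximation of sqrt(A): binomial series for sqrt(1 - X), X = I - A.
    Converges when sigma(A) is contained in (0, 1]."""
    n = len(A)
    X = np.eye(n) - A
    out, term, binom = np.eye(n), np.eye(n), 1.0   # binom = binom(1/2, k)
    for k in range(1, terms):
        binom *= (0.5 - (k - 1)) / k
        term = term @ X
        out = out + binom * (-1) ** k * term
    return out

A = np.array([[0.5, 0.2],
              [0.2, 0.5]])                  # positive, sigma(A) = {0.3, 0.7}
B = sqrt_poly(A)
assert np.allclose(B @ B, A)                # B^2 = A
assert np.allclose(B, B.T)                  # B self-adjoint
assert np.all(np.linalg.eigvalsh(B) > 0)    # B positive
```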

12.11 Exercise Prove: If A is a unital C ∗ -algebra and c ∈ A is normal then c∗ c is positive.


Hint: Remark 11.22.1.
The next two results generalize Exercises 11.30 and 12.11, respectively, and show the use-
fulness of the preceding theory even in very non-commutative situations:

12.12 Proposition Let A be a unital C ∗ -algebra.


(i) a ∈ A is positive if and only if a = a∗ and there is a t ≥ 0 such that ka − t1k ≤ t.
(ii) If a, b ∈ A are positive then a + b is positive. (No assumption that ab = ba!)
Proof. (i) Under either assumption we have a = a∗ . Let αa : C(σ(a), C) → C ∗ (a, 1) ⊆ A be the
∗-isomorphism from Theorem 12.6. We have αa (f ) = a, where f : σ(a) ,→ C is the inclusion
map.
If a ≥ 0 then the function f takes values in [0, kak]. Putting t = kak, f − t1 takes values in
[−t, 0], thus ka − t1k = kf − t1k ≤ t.
Now assume ka − t1k ≤ t for some t ≥ 0. Then the function f − t1 takes values in
σ(a − t1) ⊆ [−t, t], thus f takes values in [0, 2t]. Thus f is positive, and so is a = αa (f ).
(ii) If a, b ≥ 0 then a = a∗ , b = b∗ , so that (a + b)∗ = a + b. And by (i) there are s, t ≥ 0 such
that ka − s1k ≤ s and kb − t1k ≤ t. This implies k(a + b) − (s + t)1k ≤ ka − s1k + kb − t1k ≤ s + t,
so that a + b ≥ 0 by (i). 

12.13 Proposition If A is a unital C ∗ -algebra and a ∈ A then a∗ a is positive.


Proof. First a preparatory argument: Assume c ∈ A is such that −c∗ c is positive. Then by
Exercise 10.11(iv) we have σ(−cc∗ )\{0} = σ(−c∗ c)\{0}, thus −cc∗ is positive. Writing c = a+ib
with a, b self-adjoint, we have c∗ c + cc∗ = (a − ib)(a + ib) + (a + ib)(a − ib) = 2a2 + 2b2 , thus
c∗ c = 2a2 + 2b2 − cc∗ . Using −cc∗ ≥ 0 just proven and Propositions 12.10(i) and 12.12(ii) this
implies c∗ c ≥ 0. Combining −c∗ c ≥ 0 and c∗ c ≥ 0 gives σ(c∗ c) ⊆ [0, ∞) ∩ (−∞, 0] = {0}. This
implies kck2 = kc∗ ck = r(c∗ c) = 0, thus c = 0.
We turn to the proof of the claim. Let a ∈ A be arbitrary. Then b = a∗ a is self-adjoint, thus
with Exercise 12.9(ii) we have b = b+ − b− with b± ≥ 0 and b+ b− = 0. Putting c = ab− we have
−c∗ c = −b− a∗ ab− = −b− (b+ − b− )b− = (b− )^3 , which is positive (spectral mapping theorem). Now the preparatory step gives ab− = c = 0. This implies −(b− )^2 = (b+ − b− )b− = bb− = a∗ ab− = 0, thus b− = 0. (Since d = d∗ , d^2 = 0 implies d = 0.) Now we have a∗ a = b = b+ ≥ 0. 

12.14 Exercise Let A be a unital C ∗ -algebra and a, b ∈ A with a ≥ 0. Prove that bab∗ ≥ 0.

12.15 Remark Now we can prove the strong uniqueness claim in Proposition 12.10: Let A be a unital C ∗ -algebra and a, b, b′ ∈ A positive with b^2 = a = b′^2 . It suffices to show that every such root coincides with √a ∈ C ∗ (1, a), so we may assume b′ = √a. Then b commutes with b′: indeed b commutes with a = b^2 , hence with every norm-limit of polynomials in 1 and a, in particular with b′. Now Exercise 12.14 gives positivity of (b − b′)b(b − b′) and (b − b′)b′(b − b′). A short computation (using bb′ = b′b) gives

(b − b′)b(b − b′) + (b − b′)b′(b − b′) = (b^2 − b′^2 )(b − b′) = 0,

so that by Exercise 11.31 we have (b − b′)b(b − b′) = (b − b′)b′(b − b′) = 0. Thus also their difference (b − b′)^3 vanishes. Since ka^2 k = kak^2 for self-adjoint a and b − b′ is self-adjoint, we have kb − b′k^4 = k(b − b′)^4 k = 0, thus b = b′. 2

12.16 Definition If A is a unital C ∗ -algebra and a ∈ A, we define |a| = (a∗ a)1/2 .


By construction, |a| is positive and is similar to |z| for z ∈ C. But some care is required since |a∗ | = |a| holds if and only if a is normal. The ‘if’ part is obvious, and the converse follows by a∗ a = (√(a∗ a))^2 = |a|^2 = |a∗ |^2 = (√(aa∗ ))^2 = aa∗ , where we used that (√b)^2 = b for b ≥ 0.
It is not unreasonable to ask whether, as for complex numbers, we have a factorization a = b|a|
for each a ∈ A. In Section 13.3 we will prove this for A = B(H), but:

12.17 Exercise Give an example of a unital C ∗ -algebra A and a ∈ A such that there is no
b ∈ A with a = b|a|.

12.3 Continuous functional calculus for normal elements in a C ∗ -algebra
12.18 Theorem Theorem 12.6 literally extends to all normal elements of a unital C ∗ -algebra.
The proof of Theorem 12.6 does not generalize immediately. The reason is that the spectrum
of a normal operator need not be contained in R. (In fact, for normal a we have σ(a) ⊆ R ⇔ a =
a∗ , cf. Exercise 12.22.) If that happens, the polynomials, restricted to σ(a), fail to be uniformly
dense in C(σ(a), C). (All functions that are uniform limits of polynomials on sufficiently large
subsets of C are holomorphic so that, e.g. f (z) = Re z cannot be approximated by polynomials
in z = x+iy.) But with σ(a) ⊆ C ∼ = R2 and considering functions on (a subset of) C as functions
of two real variables, the polynomials in x, y are dense in C(σ(a), C) by the higher dimensional
version of the classical Weierstrass theorem, cf. Theorem A.31. Thus also the polynomials in z = x + iy and z̄ = x − iy are dense43 . Now there is a unique unital homomorphism αa from C[z, z̄] to A sending z to a and z̄ to a∗ , and we need to adapt Proposition 12.5 to this setting.
For this we need another lemma:

12.19 Lemma Let A be a unital C ∗ -algebra. Then every character ϕ ∈ Ω(A) satisfies ϕ(c∗ ) =
ϕ(c) for all c ∈ A, i.e. is a ∗-homomorphism.
Proof. We have c = a + ib, where a = Re(c), b = Im(c) are self-adjoint. Now σ(a) ⊆ R by
Proposition 11.24(iii), thus ϕ(a) ∈ σ(a) ⊆ R by Lemma 10.45. Similarly ϕ(b) ∈ R. Thus

ϕ(c∗ ) = ϕ(a − ib) = ϕ(a) − iϕ(b) = ϕ(a) + iϕ(b) = ϕ(a + ib) = ϕ(c),

where the third equality used that ϕ(a), ϕ(b) ∈ R as shown before. 
43
We allow ourselves the harmless sloppiness of not distinguishing between elements of the ring C[z, z̄] (where z, z̄ are independent variables) and the functions C → C induced by them.

12.20 Proposition Let A be a unital C ∗ -algebra and a ∈ A normal. Then
(i) If P ∈ C[z, z̄], define P (a, a∗ ) by replacing z and z̄ in P by a, a∗ , respectively. Then αa : P 7→ P (a, a∗ ) is a ∗-homomorphism (extending the αa : C[z] → A defined earlier).
(ii) For every P ∈ C[z, z̄] we have σ(P (a, a∗ )) = {P (λ, λ̄) | λ ∈ σ(a)} and

kP (a, a∗ )k = sup_{λ∈σ(a)} |P (λ, λ̄)|.

Proof. (i) If P (z, z̄) = ∑_{i,j=0}^N c_ij z^i z̄^j we put P (a, a∗ ) = ∑_{i,j=0}^N c_ij a^i (a∗ )^j . Using the normality of a it is very easy to see that P 7→ P (a, a∗ ) is an algebra homomorphism. It also is a ∗-homomorphism if we define P ∗ (z, z̄) = ∑_{i,j=0}^N c̄_ij z^j z̄^i .
(ii) Since a is normal, B = C ∗ (1, a) ⊆ A is commutative, so that Proposition 10.48 applies, and using that the ϕ ∈ Ω(B) are ∗-homomorphisms thanks to the preceding lemma, we have

σB (P (a, a∗ )) = {ϕ(P (a, a∗ )) | ϕ ∈ Ω(B)} = { ϕ( ∑_{i,j=0}^N c_ij a^i (a∗ )^j ) | ϕ ∈ Ω(B) }
= { ∑_{i,j=0}^N c_ij ϕ(a)^i ϕ(a∗ )^j | ϕ ∈ Ω(B) } = {P (λ, λ̄) | λ ∈ σ(a)},

where the last step used that ϕ(a∗ ) is the complex conjugate of ϕ(a) (Lemma 12.19). Now we appeal to Theorem 11.27 to get σA (P (a, a∗ )) = σB (P (a, a∗ )).
Since P (a, a∗ ) is normal, the norm formula follows from the spectral formula just proven and Proposition 11.24(i). 

Now the proof of Theorem 12.6 becomes a proof of Theorem 12.18 if we replace the invocation of Proposition 12.5 by one of Proposition 12.20 and use the density of the P (z, z̄)|σ(a) in C(σ(a), C) as explained before.

12.21 Exercise Let A be a unital C ∗ -algebra and a ∈ A normal. For t ∉ σ(a), prove that k(a − t1)^{−1} k = (dist(t, σ(a)))^{−1} .
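For a normal matrix this is easy to see concretely, since everything diagonalizes. A numerical check of the claimed identity (an aside, not from the notes; numpy assumed, the diagonal matrix and the point t are arbitrary choices):

```python
import numpy as np

A = np.diag([1.0, 2.0, 5.0])                    # normal, sigma(A) = {1, 2, 5}
t = 3.0                                         # t not in sigma(A)
R = np.linalg.inv(A - t * np.eye(3))            # resolvent (a - t1)^{-1}
dist = np.abs(np.diag(A) - t).min()             # dist(t, sigma(A)) = 1
assert np.isclose(np.linalg.norm(R, 2), 1.0 / dist)
```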

12.22 Exercise Let A be a unital C ∗ -algebra and a ∈ A normal. Prove


(i) If σ(a) ⊆ R then a is self-adjoint.
(ii) If σ(a) ⊆ S 1 then a is unitary.
We now leave the discussion of abstract Banach and C ∗ -algebras (to which we will briefly
return at the end) and return to operator theory.

13 More on Hilbert space operators


We now return to bounded operators on Hilbert spaces. Of course all results of the preceding
section also apply to the C ∗ -algebra B(H). But there is more to say about operators on Hilbert
space than about general C ∗ -algebra elements.
If H is a finite dimensional Hilbert space and A ∈ B(H) then one easily checks that A is
surjective if and only if A∗ is injective. In infinite dimensions, this becomes modified:

13.1 Lemma Let H be a Hilbert space and A ∈ B(H). Then ker A∗ = (AH)⊥ . Thus A has
dense image (AH = H) if and only if A∗ is injective. (Compare Exercise 11.3(i).)

Proof. We have x ∈ (AH)⊥ ⇔ hAy, xi = 0 ∀y ⇔ hy, A∗ xi = 0 ∀y ⇔ A∗ x = 0. Thus A∗ is
injective if and only if (AH)⊥ = {0}, which is equivalent to AH = H by Exercise 5.26(i). 

13.2 Exercise Let A ∈ B(H). Prove:


(i) If λ ∈ σr (A) then λ̄ ∈ σp (A∗ ).
(ii) If λ ∈ σp (A) then λ̄ ∈ σp (A∗ ) ∪ σr (A∗ ).
(iii) Use the general theory proven so far and the results of Exercise 10.33 to determine
σc (L), σr (L), σc (R), σr (R) for the shift operators L, R on `2 (N, C).

13.3 Remark Using Exercise 11.3(i) instead of Lemma 13.1, one has analogous results for the
transpose At ∈ B(V ∗ ) of a Banach space operator A ∈ B(V ): σr (A) ⊆ σp (At ) and σp (A) ⊆
σp (At ) ∪ σr (At ). (As in Exercise 11.3(iv) there is no complex conjugation since A 7→ At is
linear.) 2

13.1 Normal operators


We proved a bijection between bounded operators and bounded sesquilinear forms. In fact, an
operator A ∈ B(H) is already determined by the ‘diagonal’ elements (or ‘expectation values’ in
quantum theory) [x, x]A = hAx, xi of the associated form:

13.4 Lemma Let H be a Hilbert space over C and A, B ∈ B(H).


(i) If hAx, xi = 0 for all x ∈ H (‘A has vanishing diagonal elements’) then A = 0.
(ii) If hAx, xi = hBx, xi for all x ∈ H then A = B.
(The converse statements are trivially true.)
Proof. (i) The hypothesis implies hA(x + y), x + yi = 0 ∀x, y, and expanding this, using
hAx, xi = hAy, yi = 0 gives hAx, yi + hAy, xi = 0. Replacing x by ix gives hAx, yi − hAy, xi = 0.
Adding the two equations, we obtain hAx, yi = 0 ∀x, y. This implies Ax = 0 ∀x, thus A = 0.
(ii) follows by applying (i) to A − B. 

13.5 Remark There is something to be said for the above simple direct argument, but the
result also follows from Remark 5.14, which even allows to recover A (more precisely [·, ·]A )
from the map x 7→ hAx, xi. 2

Normality of A ∈ B(H) is defined as in general C ∗ -algebras, i.e. as AA∗ = A∗ A.

13.6 Proposition Let H be a Hilbert space and A ∈ B(H). Then


(0) A is normal if and only if A∗ is normal.
(i) A is normal if and only if kAxk = kA∗ xk for all x ∈ H.
(ii) If A is normal then (AH)⊥ = ker A∗ = ker A.
(iii) For a normal operator, injectivity ⇔ dense image.
(iv) For a normal operator, invertibility ⇔ boundedness below ⇔ surjectivity.

Proof. (0) This is trivial, but nevertheless worth pointing out.
(i) If A is normal then for all x ∈ H we have

kA∗ xk2 = hA∗ x, A∗ xi = hAA∗ x, xi = hA∗ Ax, xi = hAx, Axi = kAxk2 .

This computation holds both ways, thus kAxk = kA∗ xk ∀x implies hAA∗ x, xi = hA∗ Ax, xi ∀x.
By Lemma 13.4, this implies AA∗ = A∗ A.
(ii) The first equality is Lemma 13.1, and the second is immediate from (i).
(iii) In view of (AH)⊥ = ker A established in (ii), injectivity of A is equivalent to (AH)⊥ =
{0}, which is equivalent to AH = H by Exercise 5.26(i).
(iv) Invertibility implies boundedness below and surjectivity, cf. Proposition 9.28. If a normal
operator is bounded below, it is injective, so that it has dense image by (iii). Now boundedness
below and dense image imply invertibility by Proposition 9.28. And surjectivity implies dense
image, thus injectivity by (iii). Now injectivity and surjectivity give invertibility. 

Normal operators have very nice spectral properties, which foreshadows the spectral theorem:

13.7 Exercise Let A ∈ B(H) be normal. Prove:


(i) If An x = 0 for some n ∈ N then Ax = 0.
(ii) Lλ (A) = ker(A − λ1). (Thus generalized eigenvectors are eigenvectors.)

13.8 Exercise Let A ∈ B(H) be normal. Let x, x′ be (non-zero) eigenvectors for the eigenvalues λ, λ′ , respectively. Prove:
(i) A∗ x = λ̄x, thus x is an eigenvector for A∗ with eigenvalue λ̄.
(ii) σp (A∗ ) = σp (A)∗ .
(iii) If λ ≠ λ′ then x ⊥ x′ .44

13.9 Exercise (Spectra of normal operators) Let A ∈ B(H) be normal. Prove:


(i) σr (A) = ∅. (No residual spectrum)
(ii) σc (A∗ ) = σc (A)∗ .
(iii) σ(A) = σapp (A) = {λ ∈ C | ∀ε > 0 ∃x ∈ H : kxk = 1, k(A − λ1)xk < ε}.
Our last result in this section improves on those obtained in Exercises 10.40 and 10.42 for
Banach space operators:

13.10 Proposition Let A ∈ B(H) be normal.


(i) If Σ ⊆ σ(A) is clopen with ∅ ≠ Σ ≠ σ(A) and P = χΣ (A) ∈ B(H) then with H1 =
P H, H2 = (1 − P )H we have AHi ⊆ Hi , i = 1, 2, thus A = A|H1 ⊕ A|H2 . The restrictions
A|H1 , A|H2 are normal, and σ(A|H1 ) = Σ, σ(A|H2 ) = σ(A)\Σ.
(ii) If λ ∈ σ(A) is isolated then (A − λ1)x = 0 for every x ∈ P{λ} H 6= {0}, thus λ ∈ σp (A).
Proof. (i) The assumption Σ ≠ ∅ implies σ(χΣ (A)) = χΣ (σ(A)) ≠ {0}, thus P ≠ 0. Similarly,
Σ ≠ σ(A) implies P ≠ 1. Since P commutes with A, the subspaces H1 = P H, H2 = (1−P )H are
mapped into themselves by A and every f (A), where f ∈ C(σ(A), C). Normality of Ai = A|Hi
is clear. Now there are unital ∗-homomorphisms πi : C(σ(A), C) → B(Hi ) such that f (A) =
44
You may have seen this before, but probably only for self-adjoint operators.

π1 (f ) ⊕ π2 (f ) for all f ∈ C(σ(A), C). Since Σ is clopen, the ∗-homomorphism C(σ(A), C) →
C(Σ, C) ⊕ C(σ(A)\Σ, C), f 7→ (f|Σ , f|σ(A)\Σ ) is an isomorphism. (The inverse sends (f1 , f2 ) to
fb1 +fb2 where fbi is the extension of fi to all of σ(A) that vanishes on the complement of the domain
of fi .) Now the composite C(Σ, C) ⊕ C(σ(A)\Σ, C) → C(σ(A), C) → B(H1 ) ⊕ B(H2 ) sends
(f1 , f2 ) to (π1 (f1 ), π2 (f2 )). If now z1 , z2 are the inclusion maps from Σ and σ(A)\Σ, respectively,
to C, we have A = A1 ⊕A2 = π1 (z1 )⊕π2 (z2 ). Thus σ(A1 ) = σ(π1 (z1 )) ⊆ σ(z1 ) = Σ. Analogously
σ(A2 ) ⊆ σ(A)\Σ. Now in view of σ(A) = σ(A1 ) ∪ σ(A2 ), we have σ(A1 ) = Σ, σ(A2 ) = σ(A)\Σ.
(ii) Since λ is isolated, Σ = {λ} ⊆ σ(A) is clopen. Now (i) gives σ(A1 ) = {λ}, thus by
Exercise 11.26(iii) we have A|H1 = λ idH1 , so that every x ∈ H1 satisfies Ax = λx.
Alternative argument: We have P = χ{λ} (A) 6= 0, thus P H ⊆ H is a non-zero closed
subspace. If 0 6= x ∈ P H then x = P x, and
(A − λ1)x = (A − λ1)P x = (z − λ)(A)χ{λ} (A)x = ((z − λ)χ{λ} )(A)x = 0,
where z is the inclusion map σ(A) ,→ C and we used the homomorphism property of the
functional calculus and the fact that the function z 7→ (z − λ)χ{λ} (z) is identically zero. This
proves that x ∈ ker(A − λ1), so that λ ∈ σp (A). 

13.2 Self-adjoint operators


Self-adjoint operators are normal, thus the results of Section 13.1 all apply!

13.11 Lemma Let A ∈ B(H). Then


(i) A = A∗ ⇔ hAx, xi ∈ R ∀x ∈ H.
(ii) If A = A∗ then σp (A) ⊆ R. (We already know σ(A) ⊆ R, but with a longer proof.)
Proof. (i) By Lemma 13.4, A = A∗ is equivalent to hAx, xi = hA∗ x, xi ∀x. In view of hA∗ x, xi =
hx, Axi = hAx, xi, we find that A = A∗ is equivalent to hAx, xi = hAx, xi ∀x, which in turn is
equivalent to hAx, xi ∈ R ∀x.
(ii) If λ ∈ σp (A) then λ is an eigenvalue, so that there is a corresponding eigenvector x ≠ 0.
Now λkxk^2 = λhx, xi = hAx, xi ∈ R by (i). With kxk ≠ 0 this implies λ ∈ R. 

13.12 Exercise Use Exercise 13.9(iii) to give simple(r) proofs for σ(A) ⊆ R and σ(A) ⊆ S 1
for A ∈ B(H) self-adjoint or unitary, respectively.

13.13 Exercise (i) Let A ∈ B(H) and K ⊆ H a closed subspace such that AK ⊆ K. (Thus
K is A-invariant.) Prove that AK ⊥ ⊆ K ⊥ is equivalent to A∗ K ⊆ K.
In this situation, K is called reducing, since then A ≅ A|K ⊕ A|K ⊥ .
(ii) Deduce that every invariant subspace of a self-adjoint operator is reducing.
(iii) Show by example that a normal operator can have invariant but non-reducing subspaces.
For every A ∈ B(H), using (5.2) we have

kAk = sup_{kxk=1} kAxk = sup_{kxk=kyk=1} |hAx, yi|.

For self-adjoint A, the norm is determined already by the ‘diagonal’ elements:

13.14 Proposition If H is a Hilbert space and A = A∗ ∈ B(H) then

kAk = sup_{kxk=1} |hAx, xi|.

Proof. Putting M = sup_{kxk=1} |hAx, xi|, Cauchy-Schwarz gives M ≤ kAk. (We also note for later use that |hAx, xi| ≤ M kxk^2 ∀x.) It remains to prove kAk ≤ M , which in view of kAk = sup_{kxk=1} kAxk follows if we have kAxk ≤ M whenever kxk = 1. This inequality is trivially true if Ax = 0. If not, put y = Ax/kAxk. Using A = A∗ and hAx, yi = kAxk^{−1} hAx, Axi ∈ R, we have hAy, xi = hy, Axi = hAx, yi (the last step since hAx, yi is real). Using this, we have

kAxk = hAx, yi = (1/4) ( hA(x + y), x + yi − hA(x − y), x − yi )
≤ (1/4) ( |hA(x + y), x + yi| + |hA(x − y), x − yi| )
≤ (M/4) ( kx + yk^2 + kx − yk^2 )
= (M/2) ( kxk^2 + kyk^2 ) = M,

where in the last steps we used the parallelogram identity (5.3) and kxk = kyk = 1. 

13.15 Remark 1. The set {hAx, xi | kxk = 1} is called the numerical range of A. In quantum
mechanics [38] it is the set of expectation values of A.
2. The number sup_{kxk=1} |hAx, xi| is the numerical radius |||A||| of A. The identity kAk = |||A||| generalizes to all normal operators, but the proof is a bit trickier, see [55, Proposition 3.2.25] or [29]. (It also follows from the spectral theorem, cf. Section 15.1.) 2
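For a self-adjoint matrix the identity kAk = sup_{kxk=1} |hAx, xi| can be probed numerically: random unit vectors never exceed kAk, and a top eigenvector attains it. A sketch (an aside, not from the notes; numpy assumed, dimension and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                               # self-adjoint
opnorm = np.linalg.norm(A, 2)

xs = rng.standard_normal((1000, 4))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)      # random unit vectors
vals = np.abs(np.einsum('ij,jk,ik->i', xs, A, xs))   # |<A x, x>| for each x
assert vals.max() <= opnorm + 1e-12             # numerical range bounded by kAk

lam, U = np.linalg.eigh(A)
v = U[:, np.argmax(np.abs(lam))]                # eigenvector of largest |eigenvalue|
assert np.isclose(np.abs(v @ A @ v), opnorm)    # the sup is attained
```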

13.3 Positive operators. Polar decomposition


In an abstract C ∗ -algebra A we called a ∈ A positive if a = a∗ and σ(a) ⊆ [0, ∞). In the
C ∗ -algebra B(H), there is another notion of positivity:

13.16 Definition Let H be a Hilbert space and A ∈ B(H). Then A is called operator positive,
A ≥O 0, if hAx, xi ≥ 0 for all x ∈ H. (I.e., the numerical range of A is contained in [0, ∞).)

13.17 Proposition Let A, B ∈ B(H). Then


(i) If A ≥O 0, B ≥O 0 then A + B ≥O 0.
(ii) A∗ A ≥O 0.
(iii) If A ≥O 0 then BAB ∗ ≥O 0.
(iv) If A ≥O 0 then A = A∗ and σ(A) ⊆ [0, ∞).
(v) If A = A∗ and σ(A) ⊆ [0, ∞) then A ≥O 0.
(vi) Thus A ≥O 0 ⇔ A ≥ 0 in the C ∗ -sense. We therefore drop the notation ≥O .
Proof. (i) Obvious.
(ii) For every x ∈ H one has hA∗ Ax, xi = hAx, Axi ≥ 0 so that A∗ A ≥ 0.
(iii) hBAB ∗ x, xi = hAB ∗ x, B ∗ xi ≥ 0.
(Note how easy these three proofs were compared to the analogous statements for C ∗ -algebras.)
(iv) If A ≥ 0 then Lemma 13.11(i) gives A = A∗ so that σ(A) ⊆ R by Proposition 11.24(iii).
It remains to prove that λ < 0 implies λ 6∈ σ(A). For all x ∈ H we have, using λ < 0 and A ≥ 0,

h(A − λ1)x, (A − λ1)xi = kAxk2 + |λ|2 kxk2 − 2λhAx, xi ≥ |λ|2 kxk2 ,

(note that −2λhAx, xi ≥ 0), thus k(A−λ1)xk ≥ |λ| kxk, so that A−λ1 is bounded below. Since
it is also normal, Proposition 13.6(iv) implies that A−λ1 is invertible. Thus σ(A)∩(−∞, 0) = ∅.
(v) A satisfies the hypotheses of Proposition 12.10, so that there is a B = B ∗ ∈ B(H) such
that A = B 2 = B ∗ B. Now the claim follows from (ii).
(vi) Combine (iv) and (v). 

If A ∈ B(H), we put |A| = (A∗ A)1/2 , as in Definition 12.16.

13.18 Definition V ∈ B(H) is a partial isometry if V|(ker V )⊥ : (ker V )⊥ → H is an isometry.

13.19 Exercise Prove that for V ∈ B(H), the following are equivalent.
(i) V is a partial isometry.
(ii) V ∗ is a partial isometry.
(iii) V ∗ V is an orthogonal projection.
(iv) V V ∗ is an orthogonal projection.
(v) V V ∗ V = V (trivially equivalent to V ∗ V V ∗ = V ∗ ).
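A minimal matrix example of conditions (iii)-(v) (an illustration, not from the notes; numpy assumed): the shift-like map e1 7→ e2, e2 7→ 0 on C^2 is a rank-one partial isometry.

```python
import numpy as np

V = np.array([[0.0, 0.0],
              [1.0, 0.0]])        # e1 -> e2, e2 -> 0
P = V.conj().T @ V                # projection onto (ker V)^perp = span{e1}
Q = V @ V.conj().T                # projection onto the image = span{e2}

for R in (P, Q):
    assert np.allclose(R @ R, R) and np.allclose(R, R.conj().T)   # orthogonal projections
assert np.allclose(V @ V.conj().T @ V, V)                          # V V* V = V
```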
In Exercise 12.17 we saw a C ∗ -algebra not admitting polar decomposition. But:

13.20 Proposition (Polar decomposition) Let H be a Hilbert space and A ∈ B(H).


(i) There exists a unique partial isometry V such that A = V |A| and ker A = ker V .
(ii) If A is injective (invertible) then V is an isometry (unitary).
(iii) In addition, we have |A| = V ∗ A. [This follows trivially from (i) only if V is unitary.]
Proof. (i) For each x ∈ H we have
kAxk2 = hAx, Axi = hx, A∗ Axi = h(A∗ A)1/2 x, (A∗ A)1/2 xi = h|A|x, |A|xi = k|A|xk2 . (13.1)
Thus we can define a linear map V : |A|H → H by V |A|x = Ax. (If |A|x = |A|x′ then |A|(x − x′ ) = 0, thus A(x − x′ ) = 0 by (13.1), so that V is well defined.) This map is isometric, thus it extends continuously (Lemma 3.17) to the closure of |A|H. We extend V to all of H by having it send (|A|H)⊥ to zero, obtaining a partial isometry. We have

ker V = (|A|H)⊥ = ker |A|∗ = ker |A| = ker A,

where we used Lemma 13.1, the self-adjointness of |A| and (13.1). It is clear from the definition that V |A| = A.
That V is uniquely determined by its properties is quite clear: It must send |A|x to Ax, which determines it on the closure of |A|H. And in view of ker A = ker |A| = ker |A|∗ = (|A|H)⊥ , the requirement ker V = ker A forces V to be zero on (|A|H)⊥ .
(ii) It is trivial that an injective (bijective) partial isometry is an isometry (unitary).
(iii) Using (13.1) as in (i) one can define a partial isometry W such that W Ax = |A|x ∀x ∈ H and W vanishes on (AH)⊥ . Now it is immediate that W V agrees with the identity on the closure of |A|H and V W agrees with the identity on the closure of AH, while W V and V W vanish on (|A|H)⊥ and (AH)⊥ respectively. Thus W = V ∗ , and we are done. 
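For matrices the polar decomposition can be computed from the singular value decomposition: if A = W Σ X∗ then |A| = X Σ X∗ and V = W X∗ . A sketch (an aside, not from the notes; numpy assumed; for invertible A the partial isometry V is unitary):

```python
import numpy as np

def polar(A):
    """A = V |A| with |A| = (A* A)^{1/2}, via the SVD A = W diag(s) Xh."""
    W, s, Xh = np.linalg.svd(A)
    V = W @ Xh
    absA = Xh.conj().T @ np.diag(s) @ Xh
    return V, absA

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
V, absA = polar(A)
assert np.allclose(V @ absA, A)                       # A = V |A|
assert np.allclose(absA @ absA, A.conj().T @ A)       # |A|^2 = A* A
assert np.allclose(V.conj().T @ V, np.eye(2))         # V unitary since A is invertible
```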

13.21 Remark If A ⊆ B(H) is a von Neumann algebra (a class of particularly nice C ∗ -


subalgebras of B(H)) and A ∈ A one has not only |A| ∈ A, but also V ∈ A. 2

At this point, we could go on to Section 15, where the various versions of the spectral
theorem for normal operators are proven. But it is customary to first study compact operators,
since their spectral theory is much simpler and quite similar to that in finite dimensions.

14 Compact operators
14.1 Compact Banach space operators
We have met compact topological spaces many times in this course. A subset Y of a topological
space (X, τ ) is compact if it is compact when equipped with the induced (=subspace) topology
τ|Y . And Y ⊆ X is called precompact (or relatively compact) if its closure Y is compact. Recall
that a metric space X, thus also a subset of a normed space, is compact if and only if every
sequence {xn } in X has a convergent subsequence.
A subset Y of a normed space (V, k · k) is called bounded if there is an M such that kyk ≤
M ∀y ∈ Y . A compact subset of a normed space is closed and bounded, but the converse, while
true for finite dimensional spaces by the Heine-Borel theorem, is false in infinite dimensional
spaces. This is particularly easy to see for a Hilbert space: Any ONB B ⊆ H clearly is bounded. For any e, e′ ∈ B, e ≠ e′ we have ke − e′ k = he − e′ , e − e′ i^{1/2} = √2. Thus B ⊆ H is closed and discrete. Since it is infinite, it is not compact.
For normed spaces, one needs the following easy, but important lemma:

14.1 Lemma (F. Riesz) Let (V, k · k) be a normed space and W $ V a closed proper subspace.
Then for each δ ∈ (0, 1) there is an xδ ∈ V such that kxδ k = 1 and dist(xδ , W ) ≥ δ, i.e.
kxδ − xk ≥ δ ∀x ∈ W .
Proof. If x0 ∈ V \W then closedness of W implies λ = dist(x0 , W ) > 0. In view of δ ∈ (0, 1), we have λ/δ > λ, so that we can find y0 ∈ W with kx0 − y0 k < λ/δ. Putting

xδ = (y0 − x0 )/ky0 − x0 k,

we have kxδ k = 1. If x ∈ W then

kx − xδ k = kx − (y0 − x0 )/ky0 − x0 kk = k ky0 − x0 kx − y0 + x0 k / ky0 − x0 k ≥ dist(x0 , W )/ky0 − x0 k > λ/(λ/δ) = δ,

where the ≥ is due to ky0 − x0 kx − y0 ∈ W and the > comes from dist(x0 , W ) = λ and kx0 − y0 k < λ/δ. Since this holds for all x ∈ W , we have dist(xδ , W ) ≥ δ. 

14.2 Proposition If (V, k · k) is an infinite dimensional normed space then:


(i) Each closed ball B(x, r) = {y ∈ V | kx − yk ≤ r} (with r > 0) is non-compact.
(ii) Every subset Y ⊆ V with non-empty interior Y 0 is non-compact.
Proof. (i) Choose x1 ∈ V with kx1 k = 1. Then Cx1 is a closed proper subspace, thus there exists x2 ∈ V with kx2 k = 1 and kx1 − x2 k ≥ 1/2. Since V is infinite dimensional, V2 = span{x1 , x2 } is a closed proper subspace, thus there exists x3 ∈ V with dist(x3 , V2 ) ≥ 1/2, thus in particular kx3 − xi k ≥ 1/2 for i = 1, 2. Continuing in this way we can construct a sequence of xi ∈ V with kxi k = 1 and kxi − xj k ≥ 1/2 ∀i ≠ j. The sequence {xi } clearly cannot have a convergent
subsequence, thus the closed unit ball B(0, 1) is non-compact. Since x 7→ λx + x0 with λ > 0 is
a homeomorphism, all closed balls are non-compact.
(ii) If Y ⊆ V and Y 0 ≠ ∅ then Y contains some open ball B(x, r), thus also the closed ball B(x, r/2), which is non-compact. Thus neither Y nor its closure is compact. 

In view of the above, it is interesting to look at linear operators that send sets S ⊆ V to
sets AS with ‘better compactness properties’. There are several such notions:

14.3 Exercise Let V be a normed space and A : V → V a linear map. Prove that the following
conditions are equivalent and imply boundedness of A:
(i) The image AB ⊆ V of the closed unit ball B = V≤1 is precompact.
(ii) AS is precompact whenever S ⊆ V is bounded.
(iii) Given any bounded sequence {xn } ⊆ V , the sequence {Axn } has a convergent subsequence.

14.4 Definition Operators A ∈ B(V ) satisfying the above equivalent conditions are called
compact (or completely continuous). The set of compact operators on V is denoted K(V ).

14.5 Remark 1. Compactness for A ∈ B(V, W ) with V 6= W is defined completely analogously.


2. Some authors write B0 (V ) rather than K(V ), motivated by Exercise 14.11(iii) below.
3. If A ∈ B(V ) is compact and W ⊆ V is a closed A-invariant subspace, thus AW ⊆ W ,
then the restriction A|W is compact.
4. The Heine-Borel theorem implies that every linear operator on a finite dimensional normed
space (automatically bounded by Exercise 3.14) is compact. For infinite dimensional spaces this
is false since every closed ball is bounded but non-compact by Proposition 14.2. In particular
the unit operator 1V is compact if and only if V is finite dimensional.
5. Compactness can also be defined for non-linear maps between Banach spaces. But then the three versions above are no longer equivalent and continuity is no longer automatic. See Section B.9. 2

Before we develop further theory, we should prove that (non-zero) compact operators on
infinite dimensional spaces exist.

14.6 Definition Let V be a normed space and A ∈ B(V ). Then A has finite rank if its image
AV is finite dimensional. The set of finite rank operators on V is denoted F (V ).
For example, if ϕ ∈ V ∗ , y ∈ V then A ∈ B(V ) defined by A : x 7→ ϕ(x)y has finite rank.

14.7 Lemma For each Banach space V , F (V ) ⊆ K(V ).


Proof. Let A ∈ F (V ). If S ⊆ V is bounded then AS ⊆ AV is bounded by boundedness of A. Since AV is finite dimensional, it has the Heine-Borel property, so that the closure of AS in AV is compact. Thus A is compact. 

14.8 Lemma K(V ) ⊆ B(V ) is a two-sided ideal (thus a linear subspace, and if A ∈ B(V ), B ∈
K(V ) then AB, BA ∈ K(V )).
Proof. Let {xn } be a bounded sequence in V . Since A, B are compact, we can find a subsequence
{xnk }k∈N such that Axnk and Bxnk converge as k → ∞. Then also (cA + dB)xnk converges,
thus cA + dB is compact. Thus K(V ) ⊆ B(V ) is a linear subspace.
Alternative argument: Let A, B ∈ K(V ) and S ⊆ V bounded. Then the closures of AS and
BS are compact, and so are those of cAS and dBS if c, d ∈ F. By joint continuity of the map
+ : V × V → V , the sum of these two compact sets is compact, and it contains (cA + dB)S ⊆
cAS + dBS, so that the closure of (cA + dB)S is compact.
Now let A ∈ B(V ), B ∈ K(V ) and S ⊆ V bounded. The closure of BS is compact by
compactness of B, and its image under the continuous map A is compact and contains ABS,
so that ABS has compact closure. And boundedness of A implies boundedness of AS, so that
BAS has compact closure by compactness of B. Thus AB and BA are compact. 

For the proof of the next result, we need the notion of total boundedness in metric spaces,
see Appendix A.8. In particular we will use Exercise A.36(iii).

14.9 Proposition For each Banach space V , K(V ) ⊆ B(V ) is k · k-closed.


Proof. Since B(V ) is a metric space, it suffices to prove that the limit A of every norm-
convergent sequence {An } in K(V ) is in K(V ). Let thus {An } ⊆ K(V ) and A ∈ B(V ) such
that kAn − Ak → 0. We want to prove that AB ⊆ V is precompact, where B is the closed unit
ball. Since (B(V ), k · k) is complete, by Exercise A.36(iii) this is equivalent to AB being totally
bounded. To show this, let ε > 0. Then there is an n such that kAn − Ak < ε/3. Since An is
compact, An B is precompact, thus totally bounded. Thus there are x1 , . . . , xm ∈ B such that
∪_{i=1}^{m} B(An xi , ε/3) ⊇ An B. If now x ∈ B (thus kxk ≤ 1) then there is i ∈ {1, . . . , m} such that
kAn x − An xi k < ε/3. Thus

kAx − Axi k ≤ kAx − An xk + kAn x − An xi k + kAn xi − Axi k < ε/3 + ε/3 + ε/3 = ε,

where we used kA − An k < ε/3 and x, xi ∈ B. This proves that AB is totally bounded, thus
precompact. Thus A is compact. 

14.10 Corollary For each Banach space V , we have F (V ) ⊆ K(V ).

14.11 Exercise Let V = `p (S, F), where S is an infinite set and 1 ≤ p < ∞. If g ∈ `∞ (S, F)
and f ∈ `p (S, F) then Mg (f ) = gf (pointwise product) is in `p (S). This defines a linear map
`∞ (S, F) → B(V ), g 7→ Mg . Prove:
(i) g 7→ Mg is an algebra homomorphism.
(ii) kMg k = kgk∞ .
(iii) Mg ∈ K(V ) if and only if g ∈ c0 (S, F).

14.12 Remark 1. We now have two classes of compact operators: The (rather commutative)
one of multiplication by c0 -functions, and the operators that are norm-limits of finite rank
operators. Actually, the first class is contained in the second. Why?
2. It is quite natural to ask whether in fact F (V ) = K(V ), i.e. whether all compact op-
erators on V are norm-limits of finite rank operators. When this holds, V is said to have the
approximation property. We will later see that this is true for all Hilbert spaces. Whether all
Banach spaces have the approximation property was an open problem until Enflo45 in 1973 [19]
constructed a counterexample. His construction was very complicated and his spaces were not
very ‘natural’ (in the sense of having a simple definition and/or having been encountered pre-
viously). A simpler example, but still tricky and not natural, can be found in [13]. Somewhat
later, very natural examples were found: The Banach space B(H) does not have the approxi-
mation property whenever H is an infinite dimensional Hilbert space, cf. [78]. (Note that this
is about compact operators on B(H), not compact operators in B(H)!) All this is well beyond
the level of this course, but you should be able to understand [13]. 2
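The first of these two classes can be seen very concretely in coordinates: for g ∈ c0 , truncating the ‘diagonal’ of Mg to finitely many entries gives finite rank approximations in operator norm. A small numerical sketch (not part of the notes; the sequence g(k) = 1/k is an arbitrary illustrative choice):

```python
import numpy as np

# Sketch: M_g acts diagonally on l^2, so M_g - M_g p_m (with p_m the projection
# onto the first m coordinates) is again diagonal, with norm sup_{k>m} |g(k)| -> 0.
n = 1000
g = 1.0 / np.arange(1, n + 1)        # a c_0-type sequence, g(k) = 1/k (illustrative choice)

for m in (10, 100):
    tail = g.copy()
    tail[:m] = 0.0                   # diagonal of M_g minus its rank-m truncation
    # operator norm of a diagonal operator = sup of the moduli of its entries:
    assert np.isclose(np.abs(tail).max(), 1.0 / (m + 1))
```

So Mg is a norm limit of finite rank operators, hence compact by Proposition 14.9 and Lemma 14.7.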

None of the above examples of compact operators seems very relevant for applications, even
within mathematics. Indeed the most useful compact operators perhaps are integral operators.
We will briefly look at a class of them in Exercise 14.35. But there are very simple examples:
45 Per H. Enflo (1944-). Swedish mathematician, working mostly in functional analysis.

14.13 Definition Let V = C([0, C], F) for some C > 0, equipped with the norm kf k =
sup_{x∈[0,C]} |f (x)|. As we know, (V, k·k) is a Banach space. Define a linear operator, the Volterra46
operator, by

A : V → V, (Af )(x) = ∫_0^x f (t) dt.

We have kAf k = sup_x |∫_0^x f (t) dt| ≤ ∫_0^C |f (t)| dt ≤ Ckf k, thus kAk ≤ C < ∞.
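Before proving compactness, one can get a numerical feel for it (a sketch outside the notes, using a left-endpoint Riemann sum with C = 1): the discretized Volterra operator is strictly lower triangular, its norm stays below C, and its singular values decay towards 0, as expected of a compact operator.

```python
import numpy as np

# Discretize (Af)(x) = ∫_0^x f(t) dt on [0, 1] with n sample points:
# the left-endpoint rule gives a strictly lower triangular matrix with entries 1/n.
n = 200
A = np.tril(np.ones((n, n)), k=-1) / n

s = np.linalg.svd(A, compute_uv=False)   # singular values, in decreasing order
assert s[0] <= 1.0                       # consistent with ||A|| <= C = 1
assert s[49] < 0.01                      # the singular values decay towards 0
```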

14.14 Proposition The Volterra operator A : V → V is compact.


The proof of this result makes essential use of the Arzelà-Ascoli Theorem47 which charac-
terizes the (pre)compact subsets of (C(X, F), k · k∞ ) for compact X.

Proof. We will prove that F = AB ⊆ V is precompact by showing that it satisfies the hypotheses
of Theorem A.38. If x ∈ [0, C] and f ∈ C([0, C]) with kf k ≤ 1 then

|(Af )(x)| = |∫_0^x f (t) dt| ≤ Ckf k ≤ C < ∞,

showing that F is pointwise bounded. For each f ∈ V with kf k ≤ 1 we have

|(Af )(x) − (Af )(y)| = |∫_x^y f (t) dt| ≤ |x − y|.

Since this is uniform in f it shows that F is equicontinuous. 

The above proof is very easily generalized to give compactness of A : V → V with V =
C([a, b], F) given by (Af )(x) = ∫_a^b K(x, y)f (y) dy for any K ∈ C([a, b] × [a, b], F).

14.15 Proposition Let A ∈ B(V ). Then At ∈ B(V ∗ ) is compact if and only if A is compact.
Proof. As in the proof of Proposition 14.9, we use the equivalence of precompactness and total
boundedness from Exercise A.36(iii). Assume that A ∈ B(V ) is compact. Then the closure
X of AV≤1 in V is compact. Let

F = {ϕ|X | ϕ ∈ (V ∗ )≤1 } ⊆ C(X, F).

For each x ∈ X the set {ϕ(x) | ϕ ∈ (V ∗ )≤1 } clearly is bounded, thus F is pointwise bounded.
If x, x0 ∈ X then for each ϕ|X ∈ F we have |ϕ(x) − ϕ(x0 )| ≤ kϕkkx − x0 k ≤ kx − x0 k, so that
F ⊆ C(X, F) is equicontinuous. Thus by the Arzelà-Ascoli Theorem A.38, F is totally bounded.
This means that for each ε > 0 there are ϕ1 , . . . , ϕN ∈ (V ∗ )≤1 such that for every ϕ ∈ (V ∗ )≤1
there is an i with kϕ − ϕi kC(X,F) = sup_{x∈X} |ϕ(x) − ϕi (x)| < ε. Since AV≤1 ⊆ X, this implies:
for every ε > 0 there are ϕ1 , . . . , ϕN ∈ (V ∗ )≤1 such that for each ϕ ∈ (V ∗ )≤1 there is an i with
sup_{x∈V≤1} |ϕ(Ax) − ϕi (Ax)| < ε. In view of

sup_{x∈V≤1} |ϕ(Ax) − ϕi (Ax)| = sup_{x∈V≤1} |(At ϕ − At ϕi )(x)| = kAt ϕ − At ϕi k,

we have proven the total boundedness (= precompactness) of the set At (V ∗ )≤1 ⊆ V ∗ and there-
fore compactness of At ∈ B(V ∗ ).
46 Vito Volterra (1860-1940). Italian mathematician and one of the early pioneers of functional analysis.
47 You should have seen this theorem in Analysis 2 or Topology. See e.g. Appendix A.9 or [23, Vol. 2, Theorem
15.5.1]. It has many applications in classical analysis, for example Peano’s existence theorem on differential equations.

Now assume that At ∈ B(V ∗ ) is compact. Then by the above, Att ∈ B(V ∗∗ ) is compact.
Since V ⊆ V ∗∗ is a closed subspace, the restriction Att |V is compact. But by Lemma 11.2, the
latter equals A, so that A is compact. 

14.16 Remark The above result (due to Schauder) can be proven in different ways: One can
give essentially the same proof avoiding invocation of Arzelà-Ascoli, cf. [50, Theorem 1.4.4], or
use the circle of ideas in Section 16 as in [12, Theorem VI.3.4]. For another functional analysis
proof see [66]. Cf. also Remark A.40.2 on different proofs of the Arzelà-Ascoli theorem. 2

For Hilbert spaces, we have a simple proof using polar decomposition:

14.17 Proposition If H is a Hilbert space and A ∈ B(H) is compact then A∗ and |A| are
compact.

Proof. Compactness of A∗ follows from Proposition 14.15 and A∗ = γ −1 ◦ At ◦ γ, where
γ : H → H ∗ , y 7→ h·, yi.
For |A| (and also A∗ , if we want to avoid Proposition 14.15) we argue as follows: By the polar
decomposition, there is a partial isometry V such that A = V |A| and |A| = V ∗ A. The second
identity together with compactness of A and Lemma 14.8 gives compactness of |A|. Since the
first identity is equivalent to A∗ = |A|V ∗ , the compactness of A∗ follows.
Yet another proof: If A ∈ K(H) and ε > 0 then by Corollary 14.30 proven below there is
F ∈ F (H) with kA − F k < ε, thus kA∗ − F ∗ k < ε. By the following exercise, F ∗ is finite rank.
Since ε > 0 was arbitrary, Corollary 14.10 gives A∗ ∈ K(H). 

14.18 Exercise Prove: If H is a Hilbert space and A ∈ F (H) then A∗ ∈ F (H).


Thus the closed ideal K(H) ⊆ B(H) is closed under the ∗-operation, i.e. self-adjoint, and
of course a C ∗ -algebra. One can prove, see e.g. [50], that K(H) is the smallest self-adjoint
closed ideal in B(H). If H is separable, it is the only one.

14.2 Fredholm alternative. The spectrum of compact operators


We now begin studying the spectrum of compact operators.

14.19 Lemma If A ∈ B(V ) is compact and λ ∈ C\{0} then ker(A − λ1) is finite dimensional.
Proof. If λ 6∈ σ(A) then this is trivial since A − λ1 is invertible. In general, Vλ = ker(A − λ1)
is the space of eigenvectors of A with eigenvalue λ. Clearly A|Vλ = λ idVλ , so that Vλ is an
invariant subspace. Since Vλ is closed and A|Vλ is compact by Remark 14.5.3, Vλ must be
finite-dimensional by Remark 14.5.4. 

14.20 Proposition (Fredholm alternative) 48 Let V be a Banach space, A ∈ B(V ) com-


pact and λ ∈ C\{0}. Then the following are equivalent:
(i) A − λ1 is invertible. (I.e. λ 6∈ σ(A).)
(ii) A − λ1 is injective.
(iii) A − λ1 is surjective.
48 Erik Ivar Fredholm (1866-1927). Swedish mathematician. Early pioneer of functional analysis through his work
on integral equations.

Proof. We know from Proposition 9.28 that (i) is equivalent to the combination of (ii) and (iii).
It therefore suffices to prove (ii)⇔(iii).
(iii)⇒(ii): It suffices to do this for λ = 1. (Why?) Assume that A − 1 is not injective, but
surjective. Clearly (A − 1)n is surjective for all n. In view of (A − 1)n+1 = (A − 1)(A − 1)n
we have ker(A − 1)n+1 = ((A − 1)n )−1 (ker(A − 1)), where the −1 stands for ‘preimage’. This
space clearly contains ker(A − 1)n , but it is strictly larger since it contains vectors x such
that (A − 1)n x ∈ ker(A − 1)\{0}. Thus ker(A − 1)n+1 ⊋ ker(A − 1)n for all n. Now by
Riesz’ Lemma 14.1, for each n we can find an xn ∈ ker(A − 1)n+1 such that kxn k = 1 and
dist(xn , ker(A − 1)n ) ≥ 1/2. If n > m then (A − 1)xn − Axm ∈ ker(A − 1)n (note that (A − 1)n
commutes with A and xm ∈ ker(A − 1)n since n > m), so that

kAxn − Axm k = kxn + ((A − 1)xn − Axm )k ≥ 1/2.

Thus {Axn } has no convergent subsequence, contradicting the compactness of A.
(ii)⇒(iii): If A − λ1 is injective but not surjective, one similarly proves (A − λ1)n+1 V ⊊
(A − λ1)n V for all n, which again leads to a contradiction with compactness of A. 

14.21 Remark In the above proof we have shown that if A ∈ B(V ) is compact and λ ∈ C\{0}
then we cannot have ker(A − λ1)n+1 ⊋ ker(A − λ1)n for all n or (A − λ1)n+1 V ⊊ (A − λ1)n V for
all n. One says that A − λ1 has finite ascent and descent. 2

Fredholm’s alternative has far-reaching consequences for the spectrum of A:

14.22 Corollary If A ∈ B(V ) is compact then σ(A) ⊆ σp (A) ∪ {0}.


Proof. For λ 6= 0, with Proposition 14.20 we have: λ ∈ σ(A) ⇔ A − λ1 not invertible ⇔ A − λ1
not injective ⇔ λ ∈ σp (A). 

14.23 Proposition Let H be a non-zero complex Hilbert space and A ∈ B(H) a compact
normal operator. Then there is an eigenvalue λ ∈ σp (A) such that |λ| = kAk.
Proof. If A = 0 then it is clear that λ = 0 does the job. Now assume A 6= 0. By Exercise 11.26,
there is λ ∈ σ(A) with |λ| = kAk. Since λ 6= 0, Corollary 14.22 gives λ ∈ σp (A). 

The rest of this subsection is not needed for the proof of the spectral theorem, but puts the
Fredholm alternative into perspective.
Recall that if A : V → W is a linear map, the cokernel of A by definition is the linear quotient
space W/AV . If V, W are Hilbert spaces and AV ⊆ W is closed then, recalling Exercise 6.1 we
may alternatively define the cokernel of A to be (AV )⊥ ⊆ W .

14.24 Proposition Let A ∈ B(V ) be compact and λ ∈ C\{0}. Then


(i) coker(A − λ1) = V /((A − λ1)V ) is finite dimensional.
(ii) (A − λ1)V ⊆ V is closed.
Proof. (i) By Proposition 14.15, At is compact, thus ker(At − λ1V ∗ ) is finite dimensional by
Lemma 14.19. Now

ker(At − λ1V ∗ ) = ker[(A − λ1V )t ] = ((A − λ1V )V )⊥ ≅ (V /((A − λ1V )V ))∗ ,

where the second equality and the final isomorphism come from Exercises 11.3(i) and 6.6, re-
spectively. Thus (V /((A−λ1V )V ))∗ is finite dimensional, implying (why?) finite dimensionality
of V /((A − λ1V )V ) = coker(A − λ1V ).
(ii) This follows from (i) and Exercise 9.13.
Alternatively, a direct proof goes as follows: By Lemma 14.19, K = ker(A − λ1) is finite
dimensional, thus closed by Exercise 3.22. Thus there is a closed subspace S ⊆ V such that
V = K ⊕ S. (If V is a Hilbert space, we can just take S = K ⊥ . For general Banach spaces
this is the statement of Proposition 6.14.) The restriction (A − λ1)|S : S → V is compact and
injective. If (A − λ1)|S is not bounded below, we can find a sequence {xn } in S with kxn k = 1
for all n and k(A − λ1)xn k → 0. Since A is compact, we can find a subsequence {xnk } such
that {Axnk } converges. We relabel, so that now {Axn } converges. Now

xn = λ−1 [Axn − (A − λ1)xn ].

Since {Axn } converges and {(A − λ1)xn } converges to zero by choice of {xn }, {xn } converges
to some y ∈ S (since xn ∈ S ∀n and S is closed). From (A − λ1)xn → 0 and xn → y we obtain
(A − λ1)y = 0, so that y ∈ ker(A − λ1) = K. Thus y ∈ K ∩ S = {0}, which is impossible since
y = limn xn and kxn k = 1 ∀n. This contradiction shows that (A − λ1)|S is bounded below. Now
Lemma 9.26 gives that (A − λ1)V = (A − λ1)S is closed. 

14.25 Remark 1. In fact one has a much stronger result: If A ∈ B(V ) is compact and
λ ∈ C\{0} then
dim ker(A − λ1) = dim coker(A − λ1). (14.1)
Compare this with the equivalence (ii)⇔(iii) in Proposition 14.20, which amounts to the weak
statement dim ker(A − λ1) = 0 ⇔ dim coker(A − λ1) = 0. For proofs of (14.1) in a Banach
space context see any of [39, 50, 43].
2. If V is a Banach space then A ∈ B(V ) is called a Fredholm operator if ker A and
coker A are finite dimensional. (Often it is also assumed that A has closed image, but by
Exercise 9.13 this follows from finite dimensionality of the cokernel.) In this case, one calls
ind(A) = dim ker A − dim coker A ∈ Z the (Fredholm) index of A. If A, B are both Fredholm
then so is AB and ind(AB) = ind(A) + ind(B). (This can be used for proving (14.1).)
Thus if A is compact and λ 6= 0 then A − λ1 is Fredholm with index zero.
Since 1 is Fredholm with index zero, this result is a very special case of the following: If F
is Fredholm and K is compact then F + K is Fredholm and ind(F + K) = ind(F ).
Another important connection between compact and Fredholm operators is Atkinson’s theorem:
A ∈ B(V ) is Fredholm if and only if there exists B ∈ B(V ) such that AB − 1 and BA − 1
are compact. (Equivalently, the image of A in the quotient algebra B(V )/K(V ) is invertible.)
For more on Fredholm operators see [55, p.110-112] or [50, Section 1.4].
Finally, λ ∈ σd (A) is equivalent to λ ∈ σ(A) being isolated and A − λ1 being Fredholm
(equivalently, Fredholm of index zero). 2

14.26 Exercise Let H be a Hilbert space, A ∈ K(H) and λ ∈ C\{0}. Show that each of the
implications (ii)⇒(iii) and (iii)⇒(ii) in Proposition 14.20 can be deduced from the other.

14.3 Spectral theorems for compact Hilbert space operators


It is well-known that self-adjoint n × n-matrices can be diagonalized. The following beautiful
result generalizes this to compact normal operators:

14.27 Theorem (Spectral theorem for compact normal operators) Let H be a Hil-
bert space and A ∈ B(H) compact normal. Then
(i) H is spanned by the eigenvectors of A.
(ii) There is an ONB E of H consisting of eigenvectors, thus A = Σ_{e∈E} λe Pe , where Pe : x 7→
hx, eie.
(iii) For each ε > 0 there are at most finitely many λ ∈ σp (A) with |λ| ≥ ε.
(iv) σp (A) is at most countable and has no accumulation points except perhaps 0.
(v) We have σ(A) ⊆ σp (A) ∪ {0}. Furthermore,
– 0 ∈ σp (A) if and only if A is not injective.
– 0 ∈ σc (A) if and only if A is injective and H is infinite dimensional.
Proof. (i) Let K ⊆ H be the smallest closed subspace containing ∪_{λ∈σp (A)} Hλ , where Hλ =
ker(A − λ1). Clearly K is an invariant subspace: AK ⊆ K. Exercise 13.8(i) implies that also
A∗ K ⊆ K. Now Exercise 13.13(i) gives that also K ⊥ is A-invariant: AK ⊥ ⊆ K ⊥ . Clearly
A|K ⊥ is compact, thus if K ⊥ ≠ {0} then by Proposition 14.23 it contains an eigenvector of A.
Since this contradicts the definition of K, we have K ⊥ = 0, proving that H is spanned by the
eigenvectors of A.
(ii) By Exercise 13.8(ii), the eigenspaces for different eigenvalues of A are mutually orthogonal.
Now the claim follows from (i) by choosing an ONB Eλ for each Hλ and putting E = ∪_λ Eλ .
(iii) Taking into account the unitary equivalence H ≅ `2 (E, C), cf. Theorem 5.41, this
essentially is Exercise 14.11(iii).
(iv) This is an immediate consequence of (iii).
(v) The first statement was already proven in Corollary 14.22. The second is immediate. As
to the last, if H is finite dimensional then 0 ∉ σp (A) = σ(A). If H is infinite dimensional and A
injective then σp (A) is infinite, since the eigenspaces for the λ ≠ 0 are finite dimensional and span
H. Thus in view of (iii), 0 is an accumulation point of σp (A), so that 0 ∈ σ(A) by closedness of
σ(A). Now 0 ∈ σc (A) since A is injective, so 0 ∉ σp (A), and σr (A) = ∅ by normality. 

14.28 Remark 1. The common theme of ‘spectral theorems’ is that normal operators can be
diagonalized, i.e. be interpreted as multiplication operators, compactness simplifying statement
and proof considerably.
2. The statements about σ(A) actually hold for all compact operators on Banach spaces.
(Instead of the orthogonality of eigenvectors for different eigenvalues, it suffices to use their
linear independence.) 2
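In finite dimensions Theorem 14.27 is the diagonalizability of normal matrices, which is easy to check numerically (a sketch with an arbitrary random 5 × 5 example, not taken from the notes):

```python
import numpy as np

# Construct a normal operator A = U D U* (U unitary from a QR decomposition,
# D diagonal with complex entries): a finite dimensional compact normal operator.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5)))
lam = rng.standard_normal(5) + 1j * rng.standard_normal(5)
A = U @ np.diag(lam) @ U.conj().T
assert np.allclose(A @ A.conj().T, A.conj().T @ A)           # A is normal

# The columns e = U[:, k] form an ONB of eigenvectors; P_e = e <., e>:
P = [np.outer(U[:, k], U[:, k].conj()) for k in range(5)]
assert np.allclose(sum(P), np.eye(5))                        # the P_e resolve the identity
assert np.allclose(A, sum(lam[k] * P[k] for k in range(5)))  # A = sum_e lambda_e P_e
```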

For non-normal operators one can only prove a weaker statement:

14.29 Proposition Let A ∈ K(H). Then there are orthonormal sets (not necessarily bases!) E
and F of H, a bijection E → F, e 7→ fe , and positive numbers {βe }, called the singular values of
A, such that e 7→ βe is in c0 (E, C) and

A = Σ_{e∈E} βe fe h·, ei.

The βe are precisely the non-zero eigenvalues of |A|.

Proof. B = A∗ A is compact and self-adjoint, so that there is an ONB EB diagonalizing B, thus
B = Σ_{e∈EB} λe Pe . Clearly E = {e ∈ EB | Ae ≠ 0} is orthonormal. For e ∈ E put fe = Ae/kAek.
Now let F = {fe | e ∈ E}. For all x ∈ H we have

Ax = A Σ_{e∈EB} hx, eie = Σ_{e∈EB} hx, eiAe = Σ_{e∈E} hx, eiAe = Σ_{e∈E} kAekhx, eife .

If e, e0 ∈ E, e ≠ e0 then hAe, Ae0 i = he, A∗ Ae0 i = 0 since EB diagonalizes A∗ A. Thus the fe = Ae/kAek
are mutually orthogonal, and they are normalized by definition. Thus F is orthonormal.
Since EB diagonalizes A∗ A, we have A∗ Ae = λe e for all e ∈ E, where compactness of A∗ A im-
plies that e 7→ λe is in c0 (EB , C), cf. Theorem 14.27(iii). Now kAek2 = hAe, Aei = he, A∗ Aei =
λe , thus βe = kAek = λe^{1/2} , so that also e 7→ βe is in c0 (E). For the final claim, note that
A∗ Ae = |A|2 e = λe e, thus |A|e = βe e. 
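In finite dimensions Proposition 14.29 is exactly the singular value decomposition of a matrix; a numerical sketch (random example, not from the notes):

```python
import numpy as np

# numpy's SVD writes A = U diag(beta) Vh, i.e. Ax = sum_k beta_k <x, e_k> f_k
# with e_k the conjugated rows of Vh and f_k the columns of U, beta_k >= 0.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
U, beta, Vh = np.linalg.svd(A)

# Reconstruct A from the rank-one pieces beta_k f_k <., e_k>:
A_rec = sum(beta[k] * np.outer(U[:, k], Vh[k, :]) for k in range(5))
assert np.allclose(A, A_rec)

# The beta_k are the eigenvalues of |A| = (A*A)^{1/2}:
abs_eigs = np.sqrt(np.linalg.eigvalsh(A.conj().T @ A))   # ascending order
assert np.allclose(np.sort(beta), abs_eigs)
```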

Now we can prove that Hilbert spaces have the approximation property:

14.30 Corollary Let H be a Hilbert space, A ∈ K(H) and ε > 0. Then there is a B ∈ F (H)
(finite rank) with kA − Bk ≤ ε. Thus K(H) is the k · k-closure of F (H).

Proof. Pick a representation A = Σ_{e∈E} βe fe h·, ei as in the preceding proposition. Since
e 7→ βe is in c0 (E), there is a finite subset F ⊆ E such that |βe | < ε for all e ∈ E\F . Define

B = Σ_{e∈F} βe fe h·, ei,

which clearly has finite rank. If x ∈ H then, using the orthonormality of E and of F , we have

k(A − B)xk2 = kΣ_{e∈E\F} βe hx, eife k2 = Σ_{e∈E\F} |βe hx, ei|2 ≤ ε2 kxk2 .

Thus kA − Bk ≤ ε, so that K(H) is contained in the k · k-closure of F (H). The converse
inclusion was Corollary 14.10. 
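The proof idea — discard the singular values below ε — can be checked numerically; in finite dimensions this is the familiar truncated SVD (a sketch with an arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8))
U, s, Vh = np.linalg.svd(A)              # s sorted in decreasing order

r = 3                                    # keep the r largest singular values
B = U[:, :r] @ np.diag(s[:r]) @ Vh[:r]   # finite rank (rank-r) approximant
# The error in operator norm is exactly the largest discarded singular value:
assert np.isclose(np.linalg.norm(A - B, 2), s[r])
```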

14.31 Remark 1. In the above, bases played a crucial role. Even though there is no notion of
orthogonality in general Banach spaces, it turns out that Banach spaces having suitable bases
k·k
do satisfy K(H) = F (H) , i.e. the approximation property.
2. If you like applications of complex analysis to functional analysis, see [59, Section VI.5]
for an interesting alternative approach to compact operators. 2

14.4 ? Hilbert-Schmidt operators


Generalizing a well-known construction in (finite dimensional) linear algebra, if A ∈ B(H) is
positive and E is an orthonormal basis of H, we define the ‘trace’ of A w.r.t. E by

TrE (A) = Σ_{e∈E} hAe, ei ∈ [0, ∞].

14.32 Lemma (i) For every A ∈ B(H) we have TrE (A∗ A) = TrE (AA∗ ).
(ii) If A ≥ 0 and U is unitary then TrE (U AU ∗ ) = TrE (A).
(iii) If A ≥ 0 then TrE (A) is independent of the ONB E. We therefore just write Tr(A).

(iv) For A, B ∈ B(H)+ and λ ≥ 0 we have Tr(A + B) = Tr(A) + Tr(B) and Tr(λA) = λTr(A).
Proof. (i) Using Parseval (kxk2 = Σ_{e0 ∈E} |hx, e0 i|2 ), we have

TrE (A∗ A) = Σ_{e∈E} hA∗ Ae, ei = Σ_{e∈E} hAe, Aei = Σ_{e∈E} kAek2 = Σ_{e∈E} Σ_{e0 ∈E} |hAe, e0 i|2
= Σ_{e0 ∈E} Σ_{e∈E} |he, A∗ e0 i|2 = Σ_{e0 ∈E} kA∗ e0 k2 = TrE (AA∗ ),

where the exchange of summation is justified since all summands are non-negative.
(ii) Put B = U A1/2 . By (i), TrE (A) = TrE (B ∗ B) = TrE (BB ∗ ) = TrE (U AU ∗ ).
(iii) Let A ≥ 0 and let E, F be ONBs for H. Since E, F have the same cardinality, we can
pick a bijection α : F → E. The latter extends to a unitary operator U : H → H. Thus by (ii),

TrE (A) = TrE (U AU ∗ ) = Σ_{e∈E} hAU ∗ e, U ∗ ei = Σ_{f ∈F} hAf, f i = TrF (A).

(iv) The second statement is evident, and the first follows from the fact that a sum of non-
negative numbers is independent of the order or bracketing. 
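Parts (i) and (ii) of the lemma are easy to verify numerically in finite dimensions, where all traces are automatically finite (a sketch with random matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

# (i): Tr(A*A) = Tr(AA*)
assert np.isclose(np.trace(A.conj().T @ A), np.trace(A @ A.conj().T))

# (ii): Tr(U P U*) = Tr(P) for P >= 0 and U unitary
P = A.conj().T @ A                       # a positive operator
U, _ = np.linalg.qr(rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6)))
assert np.isclose(np.trace(U @ P @ U.conj().T), np.trace(P))
```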

We have seen that A∗ A is positive for any A ∈ B(H), so that Tr(A∗ A) ∈ [0, ∞] is well-
defined. For each A ∈ B(H) we define

kAk2 = (Tr(A∗ A))1/2 ∈ [0, ∞].

14.33 Definition An operator A ∈ B(H) is called Hilbert-Schmidt49 operator if kAk2 < ∞.


The set of Hilbert-Schmidt operators is denoted L2 (H).

14.34 Proposition Let H be a Hilbert space. Then


(i) kAk ≤ kAk2 = kA∗ k2 for all A ∈ B(H). Thus L2 (H) is self-adjoint.
(ii) For every A, B ∈ L2 (H),

hA, BiHS = Tr(B ∗ A) = Σ_{e∈E} hB ∗ Ae, ei

is absolutely convergent and independent of the ONB E. Now h·, ·iHS is an inner product
on L2 (H) such that hA, AiHS = kAk22 . And (L2 (H), h·, ·iHS ) is complete, thus a Hilbert
space.
(iii) For all A, B ∈ B(H) we have kABk2 ≤ kAkkBk2 and kABk2 ≤ kAk2 kBk. Thus L2 (H) ⊆
B(H) is a two-sided ideal.
k·k2
(iv) We have F (H) ⊆ L2 (H) ⊆ K(H) and F (H) = L2 (H).
Proof. (i) If x ∈ H is a unit vector, pick an ONB E containing x. Then kAxk2 = hA∗ Ax, xi ≤
TrE (A∗ A) = kAk22 . Thus kAxk ≤ kAk2 whenever kxk = 1, proving the inequality. And Lemma
14.32(i) gives kA∗ k22 = Tr(AA∗ ) = Tr(A∗ A) = kAk22 .
(ii) If E is any ONB (whose choice does not matter) for H, we have (as before)

Tr(A∗ A) = Σ_{e∈E} hA∗ Ae, ei = Σ_{e∈E} hAe, Aei = Σ_{e∈E} kAek2 = Σ_{e∈E} Σ_{e0 ∈E} |hAe, e0 i|2 .

49 Erhard Schmidt (1876-1959). Baltic German mathematician, contributions to functional analysis like Gram-Schmidt orthogonalization.

Thus L2 (H) is the set of A ∈ B(H) for which the matrix elements hAe, e0 i (w.r.t. the ONB E)
are absolutely square summable. We therefore have a map

α : L2 (H) → `2 (E × E), A 7→ {hAe, e0 i}(e,e0 )∈E×E ,

which clearly is injective. (Recall that `2 (S) = L2 (S, µ), where µ is the counting measure.) To
show surjectivity of α, let f = {fee0 } ∈ `2 (E × E). Define a linear operator A : H → H by
A : e 7→ Σ_{e0 ∈E} fee0 e0 . For each e, the r.h.s. is in H by square summability of f . If x ∈ H then

kAxk2 = sup_{E 0 } kΣ_e hx, ei Σ_{e0 ∈E 0 } fee0 e0 k2 = sup_{E 0 } Σ_{e0 ∈E 0 } |Σ_e hx, eifee0 |2 ≤ kxk2 Σ_{e,e0 } |fee0 |2 ,

where the supremum is over the finite subsets E 0 ⊆ E, the change of summation order is allowed
due to the finiteness of E 0 , and the last step uses the Cauchy-Schwarz inequality together with
Bessel’s inequality Σ_e |hx, ei|2 ≤ kxk2 . This computation shows that kAk ≤ (Σ_{e,e0 } |fee0 |2 )1/2 < ∞.
Thus A ∈ B(H) and α(A) = f , so that α is surjective. Thus α : L2 (H) → `2 (E × E) is a linear
bijection. Now `2 (E × E) is a Hilbert space (in particular complete) with inner product
hf, gi = Σ_{e,e0 } fee0 ḡee0 (the bar denoting complex conjugation), and pulling this inner product
back to L2 (H) along α we have

hA, BiHS = Σ_{(e,e0 )∈E×E} hAe, e0 ihe0 , Bei = Σ_e hAe, Bei = Σ_e hB ∗ Ae, ei = Tr(B ∗ A),

where all sums converge absolutely. Lemma 14.32(ii) implies that hA, AiHS = Tr(A∗ A) is inde-
pendent of the chosen ONB, and for general hA, BiHS this follows by the polarization identity.
From the above it is clear that (L2 (H), h·, ·iHS ) is isomorphic to the Hilbert space (`2 (E ×
E), h·, ·i), thus a Hilbert space. And the norm associated to h·, ·iHS is nothing other than k · k2 .
(iii) For any ONB E we have

kABk22 = Tr(B ∗ A∗ AB) = Σ_{e∈E} kABek2 ≤ kAk2 Σ_{e∈E} kBek2 = kAk2 Tr(B ∗ B) = kAk2 kBk22 ,

proving kABk2 ≤ kAkkBk2 . And kABk2 = k(AB)∗ k2 = kB ∗ A∗ k2 ≤ kB ∗ kkA∗ k2 = kAk2 kBk,
where we used the fact just proven and kA∗ k2 = kAk2 . The conclusion is obvious.
(iv) The inclusion F (H) ⊆ L2 (H) is very easy and is left as an exercise. If A ∈ L2 (H) and
F is a finite subset of the ONB E, define pF = Σ_{e∈F} h·, eie and AF = ApF . Then AF ∈ F (H)
and

kA − AF k22 = kA(1 − pF )k22 = Σ_{(e,e0 )∈(E\F )×E} |hAe, e0 i|2 .

This implies kA − AF k2 → 0 as F % E, so that L2 (H) is the k · k2 -closure of F (H).
Finally, by (i) we have kA − AF k ≤ kA − AF k2 → 0, thus A lies in the k · k-closure of F (H),
which equals K(H), where we used Corollary 14.10. This proves L2 (H) ⊆ K(H). 
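In finite dimensions k · k2 is the Frobenius norm, and the inequalities of the proposition can be checked numerically (a sketch with random real matrices):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((7, 7))
B = rng.standard_normal((7, 7))

hs = lambda M: np.sqrt(np.trace(M.T @ M))     # ||M||_2 = (Tr(M*M))^{1/2}
op = lambda M: np.linalg.norm(M, 2)           # operator norm

assert np.isclose(hs(A), np.linalg.norm(A, 'fro'))   # ||A||_2 = Frobenius norm
assert op(A) <= hs(A) + 1e-12                        # (i):  ||A|| <= ||A||_2
assert hs(A @ B) <= op(A) * hs(B) + 1e-12            # (iii): ||AB||_2 <= ||A|| ||B||_2
assert hs(A @ B) <= hs(A) * op(B) + 1e-12            # (iii): ||AB||_2 <= ||A||_2 ||B||
```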

14.35 Example (L2 -Integral operators) Let (X, A, µ) be a measure space, and put H =
L2 (X, A, µ). Let K : X × X → C be measurable (w.r.t. the product σ-algebra A × A) and
assume ∫∫ |K(x, y)|2 dµ(x)dµ(y) < ∞. (Thus K ∈ L2 (X × X, A × A, µ × µ).) Then

(Kf )(x) = ∫_X K(x, y)f (y) dµ(y)

defines a linear operator K : H → H whose Hilbert-Schmidt norm kKk2 coincides with the
norm kKkL2 of K ∈ L2 (X × X). Thus K is Hilbert-Schmidt, and in particular compact.

14.36 Exercise Prove the equality kKkL2 = kKk2 of norms claimed in the above example.
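The equality of the exercise can at least be tested numerically. A sketch (kernel and grid are my own illustrative choices): discretizing K(x, y) = min(x, y) on [0, 1]2 with the midpoint rule, the operator becomes the matrix K(xi , xj )/n in the `2 -normalization, and its Frobenius (= Hilbert-Schmidt) norm approximates kKkL2 = (∫∫ min(x, y)2 dx dy)1/2 = 6−1/2 .

```python
import numpy as np

n = 400
x = (np.arange(n) + 0.5) / n              # midpoint grid on [0, 1]
K = np.minimum.outer(x, x)                # kernel K(x, y) = min(x, y)
M = K / n                                 # matrix of the operator on the sample space

hs_norm = np.linalg.norm(M, 'fro')        # Hilbert-Schmidt norm of the discretization
assert abs(hs_norm - 6 ** -0.5) < 1e-2    # ||K||_{L^2} = (1/6)^{1/2} ≈ 0.4082
```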
If V, W are vector spaces over any field k then there is a canonical linear map W ⊗k V ∗ →
Homk (V, W ) sending w ⊗k ϕ to the linear map v 7→ wϕ(v). (Here V ∗ is the algebraic dual space
and ⊗k is the algebraic tensor product.) If V or W is finite dimensional, this map is a bijection,
but otherwise it is not. For Hilbert spaces, one has a statement that works irrespective of the
dimensions:

14.37 Exercise Let H be a Hilbert space.

(i) Define H̄ to be the vector space H with scalar action (c, x) 7→ c̄x and inner product
hx, yiH̄ = hy, xi. Prove that H̄ is a Hilbert space.
(ii) Define a map α : H ⊗alg H̄ → F (H) by associating to Σ_{i=1}^{n} xi ⊗ yi the operator z 7→
Σ_{i=1}^{n} xi hz, yi i. Prove that α is linear and extends to an isometric bijection α : H ⊗ H̄ →
L2 (H).

14.38 Remark If A ∈ B(H) and 1 ≤ p < ∞ one puts kAkp = (Tr(|A|p ))1/p . For p = 2 this
agrees with our previous definition since |A|2 = A∗ A, while for p = 1 one has kAk1 = Tr|A|.
Now each space Lp (H) = {A ∈ B(H) | kAkp < ∞}, the ‘p-th Schatten class’, is a two-sided
ideal in B(H) and in fact Lp (H) ⊆ K(H) for all p, see e.g. [72]. In particular L1 (H), the
‘trace-class operators’, play an important role in von Neumann algebra theory. The treatments
of them in [59] and [38] are quite good. See also [48]. If 1 ≤ p ≤ q < ∞, it is not hard to show
that kAk := kAk∞ ≤ kAkq ≤ kAkp , thus Lq (H) ⊆ Lp (H) ⊆ K(H). Thus the spaces Lp (H)
behave quite similarly to the `p (S, F), as also this exercise shows: 2

14.39 Exercise Given a set S and g ∈ `∞ (S, F), define H = `2 (S, C) and the multiplication
operator Mg : H → H, f 7→ gf (known from Exercise 14.11, where we saw Mg ∈ K(H) ⇔ g ∈
c0 (S)). Prove |Mg | = M|g| and kMg kp = kgkp for all p ∈ [1, ∞). (Thus Mg ∈ Lp (H) ⇔ g ∈
`p (S, C).)
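The inequalities kAk ≤ kAkq ≤ kAkp of Remark 14.38 amount to monotonicity of the `p -norms of the singular value sequence; a numerical sketch (random example, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 6))
s = np.linalg.svd(A, compute_uv=False)          # the eigenvalues of |A|

schatten = lambda p: np.sum(s ** p) ** (1 / p)  # ||A||_p = (Tr |A|^p)^{1/p}
eps = 1e-12
assert s[0] <= schatten(4) + eps                # ||A||   <= ||A||_4
assert schatten(4) <= schatten(2) + eps         # ||A||_4 <= ||A||_2
assert schatten(2) <= schatten(1) + eps         # ||A||_2 <= ||A||_1 (trace norm)
```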

15 Spectral theorems for normal Hilbert space operators
15.1 Spectral theorem: Multiplication operator version
In the remainder of this section we assume some knowledge of measure theory or willingness to
learn some.
Let H be a Hilbert space, A ∈ B(H) normal and x ∈ H. Then the map

ϕA,x : C(σ(A), C) → C, f 7→ hf (A)x, xi

is a bounded linear functional on (C(σ(A), C), k · k). If f is positive (i.e. takes values in [0, ∞))
then σ(f (A)) ⊆ [0, ∞) by the spectral mapping theorem, so that f (A) ≥ 0 and hf (A)x, xi ≥ 0
by Proposition 13.17. Thus ϕA,x is a bounded positive linear functional on C(σ(A), C). Thus
by the Riesz-Markov-Kakutani theorem, cf. [42, Appendix A.5] for the statement and, e.g., [11,
Theorem 7.2.8] or [63, Theorem 2.14] for proofs, there is a unique finite regular positive measure
µA,x on the Borel σ-algebra of σ(A) such that

∫ f dµA,x = ϕA,x (f ) = hf (A)x, xi ∀f ∈ C(σ(A), C). (15.1)

Taking f = 1 = const., we have f (A) = 1, so that µA,x (σ(A)) = kxk2 < ∞. Since all measures
will be Borel measures, we omit the σ-algebra from the notation and just write L2 (σ(A), µA,x ).
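In finite dimensions the measure µA,x can be written down explicitly (a sketch, taking A self-adjoint for simplicity): if A has eigenpairs (λe , e) w.r.t. an ONB of eigenvectors, then µA,x = Σ_e |hx, ei|2 δλe , and (15.1) becomes hf (A)x, xi = Σ_e f (λe )|hx, ei|2 .

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                             # self-adjoint, hence normal
lam, E = np.linalg.eigh(A)                    # eigenvalues and ONB of eigenvectors
x = rng.standard_normal(5)

weights = (E.T @ x) ** 2                      # mu_{A,x}({lambda_e}) = |<x, e>|^2
f = lambda t: np.exp(t) + t ** 2              # a continuous test function
fA = E @ np.diag(f(lam)) @ E.T                # f(A) via the functional calculus

assert np.isclose(x @ fA @ x, np.sum(f(lam) * weights))   # <f(A)x, x> = ∫ f dmu_{A,x}
assert np.isclose(weights.sum(), x @ x)                   # mu_{A,x}(sigma(A)) = ||x||^2
```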

15.1 Definition Let H be a Hilbert space, A ∈ B(H) and x ∈ H. Then x is called ∗-cyclic
for A if spanC {An (A∗ )m x | n, m ∈ N0 } is dense in H.

15.2 Remark A vector x is cyclic for A if spanC {An x | n ∈ N0 } is dense in H. Clearly the two
notions are equivalent for self-adjoint A, but in general they differ. For the present purpose,
∗-cyclicity is the right notion. 2

15.3 Proposition Let H be a Hilbert space, A ∈ B(H) normal and x ∈ H ∗-cyclic for
A. Then there is a unique unitary U : H → L2 (σ(A), µA,x ) such that U AU ∗ = Mz , where
(Mz f )(z) = zf (z) for all f ∈ L2 (σ(A), µA,x ), z ∈ σ(A).
Thus A is unitarily equivalent to a multiplication operator.
Proof. The computation

kf (A)xk2 = hf (A)x, f (A)xi = hf (A)∗ f (A)x, xi = h(f̄ f )(A)x, xi = ∫ |f |2 dµA,x

shows that the map

D = {f (A)x | f ∈ C(σ(A), C)} → L2 (σ(A), µA,x ), f (A)x 7→ f

is well-defined and isometric. Since the domain D ⊆ H is dense by ∗-cyclicity of x, the
map extends by continuity (Lemma 3.17) to an isometric map U : H → L2 (σ(A), µA,x ).
Since U H ⊆ L2 (σ(A), µA,x ) is closed and C(σ(A), C) is dense in L2 (σ(A), µA,x ), we have
U H = L2 (σ(A), µA,x ). Thus U is unitary. The uniqueness of U (for given x) is clear from
the construction. Now for f ∈ C(σ(A), C) we have

(U AU ∗ )(f )(z) = (U Af (A)x)(z) = (U g(A)x)(z) = zf (z), where g : w 7→ wf (w),

and by density of C(σ(A), C) in L2 (σ(A), µA,x ), this holds for all f ∈ L2 (σ(A), µA,x ). 

Not every normal operator A ∈ B(H) admits a ∗-cyclic vector. (If H is separable, A has a
∗-cyclic vector if and only if the algebra {B ∈ B(H) | AB = BA, A∗ B = BA∗ } is commutative.)
If A does admit one, we say that A is multiplicity free. In general we have:

15.4 Theorem (Spectral theorem for normal operators) Let H be a Hilbert space
and A ∈ B(H) normal. Then there exists a family {µι }ι∈I of finite Borel measures on σ(A) and
a unitary U : H → ⊕_{ι∈I} L2 (σ(A), µι )50 such that U AU ∗ = ⊕_{ι∈I} Mz , i.e.

(U AU ∗ f )ι (z) = zfι (z) ∀f = {fι } ∈ ⊕_{ι∈I} L2 (σ(A), µι ), z ∈ σ(A). (15.2)
50 Here ⊕ is the Hilbert space direct sum defined at the end of Section 5.1.

Proof. Let F be the family of subsets F ⊆ H such that for x, y ∈ F, x ≠ y we have f (A)x ⊥
f 0 (A)y for all f, f 0 ∈ C(σ(A), C). We partially order F by inclusion. One easily checks that
F satisfies the hypothesis of Zorn’s lemma. (Given a totally ordered subset C ⊆ F, the union
∪C is in F, thus an upper bound for C.) Thus there is a maximal element M ∈ F. For each
x ∈ M we put Hx = the closure of {f (A)x | f ∈ C(σ(A), C)}. By construction these Hx are
mutually orthogonal. Let K = ⊕_{x∈M} Hx . By construction, we have f (A)K ⊆ K for all
f ∈ C(σ(A), C), thus also f (A)∗ K ⊆ K since f (A)∗ = f̄ (A). Thus K ⊥ is invariant under all
f (A). If K ≠ H, picking a non-zero y ∈ K ⊥ we would have M ∪ {y} ∈ F, contradicting
maximality of M. Thus K = H.
Since clearly x ∈ M is ∗-cyclic for the restriction of A to Hx , we can use Proposition 15.3 to
obtain unitaries Ux : Hx → L2 (σ(A), µA,x ). Defining U : H → ⊕_{x∈M} L2 (σ(A), µA,x ) by sending
y ∈ Hx to Ux y ∈ L2 (σ(A), µA,x ) and extending linearly, U is unitary, and we are done. (Of
course we have identified I = M and µι = µA,x .) 

15.5 Remark 1. Once the maximal family M of vectors has been picked, the construction
is canonical. But there is no uniqueness in the choice of that family. (This is similar to the
non-uniqueness of the choices of ONBs in the eigenspace ker(A − λ1) that we make in proving
Theorem 14.27.) For much more on this (in the self-adjoint case) see [59, Section VII.2].
2. Theorem 15.4 is perfectly compatible with Theorem 14.27: If A is compact normal and
E is an ONB diagonalizing it then the Hι in Theorem 15.4 are precisely the one-dimensional
spaces Ce for e ∈ E and the measure µι corresponding to Hι = Ce is the δ-measure on P (σ(A))
defined by µ(S) = 1 if λe ∈ S and µ(S) = 0 otherwise. (To be really precise, one should take
the non-uniqueness in both theorems into account.)
3. If A is as in the theorem and g ∈ C(σ(A), C) then the continuous functional calculus gives us a normal operator g(A). We now have

U g(A)U∗ = ⊕ι∈I Mg.

(This is an obvious consequence of (15.2) when g is a polynomial and follows by a density
argument in general.) If one took Theorem 15.4 as given, this could even be used to define the
continuous functional calculus. This would be circular since we used the continuous functional
calculus to prove the theorem, or rather Proposition 15.3 on which it relied, but it shows that
the continuous functional calculus and the spectral theorem are ‘equivalent’ in the sense of being
easily deducible from each other.
4. The statement of Theorem 15.4 may not quite be what we expected, given the slogan
‘normal operators are multiplication operators’, since there is a direct sum involved. But this
can be fixed when H is separable: 2

15.6 Corollary Let H be a separable Hilbert space and A ∈ B(H) normal. Then there
exists a finite measure space (X, A, µ), a function g ∈ L∞ (X, A, µ; C) and a unitary W : H →
L2 (X, A, µ; C) such that W AW ∗ = Mg .
Proof. We apply Theorem 15.4. Since H is separable, the index set I is at most countable, and we write I = {1, . . . , N} where N ∈ N ∪ {∞} with ∞ = #N. Now we put X = I × σ(A) = ⊔i∈I σ(A) and for Y ⊆ X we put Yi = p2(p1−1(i) ∩ Y) = {x ∈ σ(A) | (i, x) ∈ Y} ⊆ σ(A). We define A ⊆ P(X) and µ : A → [0, ∞] by

A = {Y ⊆ X | Yi ∈ B(σ(A)) ∀i ∈ I},    µ(Y) = Σi∈I µi(Yi).

Using the countability of I it is straightforward to check that A is a σ-algebra on X and µ a (positive) measure on (X, A). With (15.1) we have µi(σ(A)) = ‖xi‖2. Thus if we choose the cyclic vectors xi such that ‖xi‖ = 2−i then µ(X) = Σi µ({i} × σ(A)) = Σi µi(σ(A)) < ∞, so that the measure space (X, A, µ) is finite. Now we define a linear map

V : ⊕i∈I L2(σ(A), µi) → L2(X, A, µ), {fi}i∈I ↦ f where f((i, x)) = fi(x).

From the way (X, A, µ) was constructed, it is quite clear that V is unitary. (Check this!) Now W = V U : H → L2(X, A, µ), where U comes from Theorem 15.4, is unitary. In view of (U AU∗f)i(λ) = λfi(λ), defining g : X → C, (i, x) ↦ x (which is bounded by r(A) = ‖A‖), we have W AW∗ = Mg. □

15.7 Exercise (i) Let Σ ⊆ C be compact and non-empty and µ be a finite positive Borel
measure on Σ. Put H = L2 (Σ, µ) and define A ∈ B(H) by (Af )(x) = xf (x) for f ∈ H.
Prove:

σ(A) = {λ ∈ Σ | ∀ε > 0 : µ(B(λ, ε)) > 0},


σp (A) = {λ ∈ Σ | µ({λ}) > 0},

where B(λ, ε) denotes the open ε-disc around λ.


(ii) Let A ∈ B(H) be normal. Use (i) to prove that λ ∈ σp (A) if and only if µι ({λ}) > 0 holds
for at least one ι ∈ I with µι as in Theorem 15.4.

15.2 Borel functional calculus for normal operators


In the preceding section we used the continuous functional calculus to prove the spectral theorem
for normal operators. Now we will turn the logic around and use the spectral theorem to extend
the functional calculus to a larger class of functions!

15.8 Definition If (X, τ ) is a topological space, B ∞ (X, C) denotes the set of bounded func-
tions X → C that are measurable with respect to the Borel σ-algebra B(X, τ ).

15.9 Lemma Let (X, τ ) be a topological space. Then


(i) If {fn}n∈N is a sequence of Borel measurable functions X → C converging pointwise to f then f is Borel measurable.
(ii) (B ∞ (X, C), k·k∞ ), equipped with pointwise multiplication and ∗-operation is a C ∗ -algebra.
Proof. (i) It is an elementary fact of measure theory, cf. e.g. [11, Proposition 2.1.5], that the
pointwise limit of a sequence of measurable functions (whatever the σ-algebra) is measurable.
(ii) Every sequence in B∞(X, C) that is Cauchy w.r.t. ‖·‖∞ converges pointwise everywhere; the limit is thus measurable by (i), and clearly bounded. Thus B∞(X, C) is complete. It is a C∗-algebra since product and ∗-operation satisfy submultiplicativity and the C∗-identity. □

For a normal element a ∈ A of a C ∗ -algebra, we cannot make sense of f (a) if f is not


continuous. But the C ∗ -algebra B(H) has much more structure, and it turns out there is a
Borel functional calculus extending the continuous functional calculus:

15.10 Theorem Let H be a Hilbert space and A ∈ B(H) normal. Then:

(i) There is a unique unital ∗-homomorphism αA : B∞(σ(A), C) → B(H) extending the continuous functional calculus C(σ(A), C) → B(H) and satisfying ‖αA(f)‖ ≤ ‖f‖∞. Again we write more suggestively f(A) = αA(f).
(ii) If B ∈ B(H) commutes with A and A∗,51 then B commutes with g(A) for all g ∈ B∞(σ(A), C).
(iii) If {fn}n∈N ⊆ B∞(σ(A), C) is a bounded sequence converging pointwise to f then f ∈ B∞(σ(A), C) and fn(A) → f(A) w.r.t. τwot, cf. Definition 16.8. (And ‖fn − f‖∞ → 0 ⇒ ‖fn(A) − f(A)‖ → 0.)
Proof. (i) For all x, y ∈ H, the map

ϕx,y : C(σ(A), C) → C, f ↦ ⟨f(A)x, y⟩

is a linear functional on C(σ(A), C) that is bounded since ‖f(A)‖ ≤ ‖f‖∞. Thus by the Riesz-Markov-Kakutani theorem there exists a unique complex Borel measure µx,y on σ(A) such that ϕx,y(f) = ∫ f dµx,y for all f ∈ C(σ(A), C). Since ϕx,y depends in a sesquilinear way on (x, y), the same holds for µx,y, and |µx,y(σ(A))| = |⟨x, y⟩| ≤ ‖x‖‖y‖. Thus if f ∈ B∞(σ(A), C), the map ψf : H2 → C defined by (x, y) ↦ ∫ f dµx,y is a sesquilinear form that is bounded since |ψf(x, y)| ≤ ‖f‖∞‖x‖‖y‖. Thus by Proposition 11.8 there is a unique Af ∈ B(H) such that ⟨Af x, y⟩ = ψf(x, y) for all x, y ∈ H. It satisfies ‖Af‖ ≤ ‖f‖∞. Define αA : B∞(σ(A), C) → B(H) by f ↦ Af. If f ∈ C(σ(A), C) then ψf(x, y) = ⟨f(A)x, y⟩ ∀x, y, implying Af = f(A). Thus αA extends the continuous functional calculus.
It remains to be shown that αA is a ∗-homomorphism. Linearity is quite obvious. Since the continuous functional calculus is a ∗-homomorphism, for f ∈ C(σ(A), C) we have f̄(A) = f(A)∗, thus

∫ f̄ dµx,y = ⟨f̄(A)x, y⟩ = ⟨x, f̄(A)∗y⟩ = ⟨x, f(A)y⟩ = \overline{⟨f(A)y, x⟩} = \overline{∫ f dµy,x} = ∫ f̄ dµ̄y,x,

thus µ̄y,x = µx,y. Reading the above computation backwards, this implies αA(f̄) = αA(f)∗
for all f ∈ B∞(σ(A), C). Since the continuous functional calculus is a homomorphism, for all f, g ∈ C(σ(A), C) we have

∫ (f g) dµx,y = ⟨(f g)(A)x, y⟩ = ⟨f(A)g(A)x, y⟩ = ⟨g(A)x, f̄(A)y⟩ = ∫ g dµx,f̄(A)y.

The fact that this holds for all f, g ∈ C(σ(A), C) implies f µx,y = µx,f̄(A)y. Thus for all f ∈ C(σ(A), C), g ∈ B∞(σ(A), C) we have

⟨(f g)(A)x, y⟩ = ∫ f g dµx,y = ∫ g dµx,f̄(A)y = ⟨g(A)x, f̄(A)y⟩ = ⟨f(A)g(A)x, y⟩,

so that (f g)(A) = f(A)g(A). As above, we deduce from this that f µx,y = µx,f̄(A)y for all f ∈ B∞(σ(A), C), and then (f g)(A) = f(A)g(A) for all f, g ∈ B∞(σ(A), C).
(ii) The assumption implies Bf(A) = f(A)B for all f ∈ C(σ(A), C). Thus

ϕBx,y(f) = ⟨f(A)Bx, y⟩ = ⟨Bf(A)x, y⟩ = ⟨f(A)x, B∗y⟩ = ϕx,B∗y(f) ∀x, y, f.

This implies µBx,y = µx,B∗y for all x, y, whence

⟨f(A)Bx, y⟩ = ∫ f dµBx,y = ∫ f dµx,B∗y = ⟨f(A)x, B∗y⟩ ∀x, y ∈ H, f ∈ B∞(σ(A), C),

51 Since A is normal, AB = BA actually implies A∗B = BA∗ by Fuglede's theorem, cf. Section B.8!

thus f(A)B = Bf(A) for all f ∈ B∞(σ(A), C).
(iii) Measurability of the limit function f follows from Lemma 15.9(i). If ‖fn‖ ≤ M ∀n then clearly ‖f‖ ≤ M. Thus f ∈ B∞(σ(A), C). For all x, y ∈ H we have

⟨αA(fn)x, y⟩ = ∫ fn dµx,y −→ ∫ f dµx,y = ⟨αA(f)x, y⟩,

where convergence in the middle is a trivial application of the dominated convergence theorem, using boundedness of µx,y and ‖fn‖∞ ≤ M for all n. This proves αA(fn) → αA(f) w.r.t. τwot. The final claim clearly follows from ‖fn(A) − f(A)‖ = ‖(fn − f)(A)‖ ≤ ‖fn − f‖∞. □
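In finite dimensions the Borel functional calculus is simply f(A) = U diag(f(λ1), . . . , f(λn)) U∗, which makes sense for any bounded Borel f, continuous or not. A hedged numpy sketch of (i) (the random matrix and the chosen functions are purely illustrative):

```python
import numpy as np

# Finite-dimensional sketch of the Borel functional calculus: for a normal
# matrix A = U diag(lam) U*, set f(A) = U diag(f(lam)) U*.  This is a unital
# *-homomorphism even for discontinuous bounded Borel f.
rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = B + B.conj().T                          # self-adjoint, hence normal
lam, U = np.linalg.eigh(A)

def borel_calc(f):
    """f(A) = U diag(f(lam)) U* for a bounded Borel function f."""
    return U @ np.diag(f(lam).astype(complex)) @ U.conj().T

f = lambda t: (t > 0).astype(float)         # indicator of (0, inf): not continuous
g = lambda t: t ** 2

# multiplicativity (fg)(A) = f(A) g(A), and f real-valued => f(A) self-adjoint
fgA = borel_calc(lambda t: f(t) * g(t))
assert np.allclose(fgA, borel_calc(f) @ borel_calc(g))
assert np.allclose(borel_calc(f), borel_calc(f).conj().T)
```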

The above construction of the Borel functional calculus was independent of the Spectral
Theorem 15.4. We now wish to understand their relationship. This is the first step:

15.11 Exercise Let Σ ⊆ C be compact and λ a finite positive Borel measure on Σ. Let H = L2(Σ, λ; C) and g ∈ B∞(Σ, C).
(i) Prove that the multiplication operator Mg : H → H, [f] ↦ [gf] satisfies

‖Mg‖ = ess supλ |g| = inf{t ≥ 0 | λ({x ∈ Σ | |g(x)| > t}) = 0} ≤ ‖g‖∞.

(ii) Let A = Mz ∈ B(H), where z : Σ ↪ C. Prove that g(A) as defined by the Borel functional calculus coincides with Mg.

15.12 Corollary Let A ∈ B(H) be normal and g ∈ B∞(σ(A), C). Then
(i) σ(g(A)) ⊆ g(σ(A)).
(ii) If h ∈ B∞(g(σ(A)), C) then h(g(A)) = (h ◦ g)(A).
Proof. Let U : H → ⊕ι∈I L2(σ(A), µι) be as in Theorem 15.4, so that U AU∗ = ⊕ι Mz. The projectors Pι onto the subspaces L2(σ(A), µι) of the direct sum commute with A (and A∗), thus also with g(A) for each g ∈ B∞(σ(A), C) by Theorem 15.10(ii). Thus the Borel functional calculus respects the direct sum decomposition of A (no matter how the maximal set M was chosen). It is a pure formality to show that if V : H → H′ is unitary then V g(A)V∗ = g(V AV∗). Thus with the direct sum decomposition U AU∗ = ⊕ι Mz we have U g(A)U∗ = ⊕ι g(Mz) = ⊕ι Mg, where the second equality comes from Exercise 15.11(ii).
(i) If λ ∉ g(σ(A)) then each Mg − λ1 has a bounded inverse with norm ≤ dist(λ, g(σ(A)))−1. Thus all the operators Mg − λ1 in the direct sum decomposition of g(A) − λ1 have inverses with uniformly bounded norms. Thus g(A) − λ1 has a bounded inverse.
(ii) Under the assumption on h, we have

U h(g(A))U∗ = ⊕ι h(Mg) = ⊕ι Mh◦g = U (h ◦ g)(A)U∗.

(This is too sloppy, but the reader should be able to make it precise.) □
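A numerical sketch of (i) and (ii) in finite dimensions, where σ(g(A)) = g(σ(A)) even holds with equality (the prescribed spectrum and the functions g, h are illustrative assumptions):

```python
import numpy as np

# In finite dimensions sigma(g(A)) = g(sigma(A)) exactly, and the composition
# rule h(g(A)) = (h o g)(A) is immediate from diagonalization.
lam = np.array([-1.0, 0.5, 2.0])                  # prescribed spectrum
Q = np.linalg.qr(np.random.default_rng(2).standard_normal((3, 3)))[0]
A = Q @ np.diag(lam) @ Q.T                        # real symmetric, hence normal

g = np.abs                                        # g(sigma(A)) = {1, 0.5, 2}
h = lambda t: (t >= 0.75).astype(float)           # discontinuous Borel h

gA = Q @ np.diag(g(lam)) @ Q.T
# eigenvalues of g(A) are exactly g(lam)
assert np.allclose(np.sort(np.linalg.eigvalsh(gA)), np.sort(g(lam)))

# h(g(A)) computed from gA's own eigendecomposition equals (h o g)(A)
mu, W = np.linalg.eigh(gA)
assert np.allclose(W @ np.diag(h(mu)) @ W.T, Q @ np.diag(h(g(lam))) @ Q.T)
```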

15.13 Remark 1. Since it turns out that g(A) = U∗(⊕ι Mg)U for all g ∈ B∞(σ(A), C), one might try to take this as the definition of g(A). But apart from being very inelegant, this has the problem that one must prove the independence of g(A) thus defined from the choice of the maximal set M ⊆ H in the proof of the spectral theorem. This would not be difficult if every Borel measurable function were a pointwise limit of a sequence of continuous functions. But this is false, making such an approach quite painful. (Compare Lusin's theorem in, e.g., [63].)

2. We cannot hope to prove ‖g(A)‖ = ‖g‖∞ for all g ∈ B∞(σ(A), C), since this is true only if σ(A) = σp(A)! Since singletons in C are closed, thus Borel measurable, we can change g arbitrarily at a point λ ∈ σ(A) without destroying the measurability of g, making ‖g‖∞ as large as we want. But if λ ∈ σ(A)\σp(A), Exercise 15.7 gives µι({λ}) = 0 ∀ι ∈ I, so that this change of g does not affect the norms ess supµι |g| of the multiplication operators making up g(A) (cf. Exercise 15.11), and therefore does not affect ‖g(A)‖.
3. Let A ∈ B(H) be normal and consider the C∗-algebra A = C∗(1, A) ⊆ B(H). Then g(A) ∈ A for continuous g, but for most non-continuous g we have g(A) ∉ A. For this reason there is no Borel functional calculus in abstract C∗-algebras. (But g(A) is always contained in the von Neumann algebra vN(A) = \overline{C∗(1, A)}^{wot} generated by A. This follows from Theorem 15.10(ii) and von Neumann's 'double commutant theorem'.) 2

15.3 Normal operators vs. projection-valued measures


There is yet another perspective on the spectral theorem/functional calculus, provided by
projection-valued measures:

15.14 Definition Let H be a Hilbert space and Σ ⊆ C a compact subset. Let B(Σ) be the Borel σ-algebra on Σ. A projection-valued measure relative to (H, Σ) is a map P : B(Σ) → B(H) such that
(i) P(S) is an orthogonal projection for all S ∈ B(Σ).
(ii) P(∅) = 0, P(Σ) = 1.
(iii) P(S ∩ S′) = P(S)P(S′) for all S, S′ ∈ B(Σ).
(iv) For all x, y ∈ H, the map Ex,y : B(Σ) → C, S ↦ ⟨P(S)x, y⟩ is a complex measure. (Equivalently, if the sets {Sn}n∈N ⊆ B(Σ) are mutually disjoint then Σn P(Sn) converges weakly to P(∪n Sn).)
Note that (iii) implies P (S)P (S 0 ) = P (S 0 )P (S) for all S, S 0 ∈ B(Σ).

15.15 Proposition Let H be a Hilbert space and A ∈ B(H) normal. Put Σ = σ(A). For
each S ∈ B(Σ), define P (S) = χS (A). Then S 7→ P (S) is a projection-valued measure relative
to (H, Σ), also called the spectral resolution of A.
Proof. If g = χS for S ∈ B(Σ), g(A) is a direct sum of operators of multiplication by χS, which clearly all are idempotent. And since g = χS is real-valued, g(A) is self-adjoint. Thus each P(S) = χS(A) is an orthogonal projection. P(∅) = 0 is clear, and P(Σ) = 1(A) = 1H (since the constant 1 function is continuous). Property (iii) is immediate from χS∩S′ = χS χS′. Finally, if x, y ∈ H let Ux = {fι}ι∈I, Uy = {gι}ι∈I. Then

Ex,y(S) = ⟨P(S)x, y⟩ = Σι∈I ∫σ(A) χS(z) fι(z) \overline{gι(z)} dµι(z).

From this it is clear that S ↦ Ex,y(S) is σ-additive on countable families {Sn}n∈N ⊆ B(Σ) by absolute convergence of Σι ∫ fι \overline{gι} dµι. □
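In finite dimensions χS(A) is just the orthogonal projection onto the eigenspaces with eigenvalue in S. A numpy sketch checking (i)-(iii) of Definition 15.14 (the spectrum and the sets S, T are illustrative choices):

```python
import numpy as np

# Sketch of the spectral resolution of a normal matrix: P(S) = chi_S(A) is
# the orthogonal projection onto the eigenvectors with eigenvalue in S.
lam = np.array([0.0, 1.0, 1.0, 3.0])               # spectrum, with multiplicity
Q = np.linalg.qr(np.random.default_rng(3).standard_normal((4, 4)))[0]
A = Q @ np.diag(lam) @ Q.T

def P(S):
    """P(S) = chi_S(A); the Borel set S is given as a boolean predicate."""
    return Q @ np.diag(S(lam).astype(float)) @ Q.T

S = lambda t: t < 2.0                              # the Borel set (-inf, 2)
T = lambda t: t > 0.5                              # the Borel set (0.5, inf)

assert np.allclose(P(S) @ P(S), P(S))                        # projection
assert np.allclose(P(lambda t: S(t) & T(t)), P(S) @ P(T))    # property (iii)
assert np.allclose(P(lambda t: np.full(t.shape, True)), np.eye(4))  # P(Sigma) = 1
```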

15.16 Exercise Let A ∈ B(H) be normal and Σ ⊆ σ(A) a Borel set. Prove σ(A|P (Σ)H ) ⊆
Σ ∪ {0}. Bonus: State and prove a better result.

15.17 Exercise Let A ∈ B(H) be a normal operator and let P be the corresponding spectral
measure. Prove:
(i) λ ∈ σ(A) if and only if P(σ(A) ∩ B(λ, ε)) ≠ 0 for each ε > 0.
(ii) λ ∈ σ(A) is an eigenvalue if and only if P({λ}) ≠ 0.
We have thus seen that every normal operator gives rise to a projection-valued measure. The converse is also true, and we have a bijection between normal operators and projection-valued measures:

15.18 Proposition Let H be a Hilbert space, Σ ⊆ C a compact subset and P a projection-


valued measure relative to (H, Σ). Then
(i) For every f ∈ B∞(Σ, C) there is a unique α(f) ∈ B(H) such that

⟨α(f)x, y⟩ = ∫ f dEx,y  ∀x, y ∈ H

and ‖α(f)‖ ≤ ‖f‖∞ ∀f. We also write, somewhat symbolically, α(f) = ∫ f(z) dP(z).
(ii) α : B ∞ (Σ, C) → B(H) is a unital ∗-homomorphism, and α(f ) is normal for each f .
(iii) Put A = α(z) ∈ B(H), where z : Σ ,→ C is the inclusion map. Then σ(A) ⊆ Σ and
α(f ) = f (A) for each f ∈ C(Σ, C).
(iv) The maps from normal operators to projection-valued measures (Proposition 15.15) and conversely ((i)-(iii) above) are mutually inverse.
Proof. (i) It is clear that the map (x, y) ↦ Ex,y(S) = ⟨P(S)x, y⟩ is sesquilinear for each S ∈ B(Σ). For each f ∈ B∞(Σ, C), we have |∫ f dEx,y| ≤ ‖f‖∞‖x‖‖y‖. Thus [x, y]f = ∫ f dEx,y is a sesquilinear form with norm ≤ ‖f‖∞. Thus by Proposition 11.8 there is a unique Af ∈ B(H) such that ⟨Af x, y⟩ = [x, y]f ∀x, y ∈ H and ‖Af‖ ≤ ‖f‖∞. Now put α(f) := Af.
(ii) The inequality has already been proven. It is clear that f ↦ Af = α(f) is linear. If 1 is the constant one function, we have ⟨α(1)x, y⟩ = ∫ 1 dEx,y = Ex,y(Σ) = ⟨x, y⟩ since P(Σ) = 1. Thus α(1) = 1. It remains to prove α(f g) = α(f)α(g) and α(f̄) = α(f)∗. We first do this for characteristic functions of measurable sets S, T: f = χS, g = χT. Now

⟨α(χS)x, y⟩ = ∫S dEx,y = Ex,y(S) = ⟨P(S)x, y⟩,

so that α(χS) = P(S). Thus α(χS χT) = α(χS∩T) = P(S ∩ T) = P(S)P(T) = α(χS)α(χT). By linearity of α we now have α(f g) = α(f)α(g) for all simple functions, i.e. finite linear combinations of characteristic functions. The latter are ‖·‖∞-dense in B∞(Σ, C). (By a proof very similar to that of Lemma 4.13 in the case of ℓ∞. Note that a measurable function is simple if and only if it assumes only finitely many values.) Now the identity follows for all f, g by continuity of α.
Furthermore, α(χS)∗ = P(S)∗ = P(S) = α(χS), so that α(f̄) = α(f)∗ for simple functions. Now apply the same density and continuity argument as above.
In view of f(A) = α(f), the normality of f(A) follows from

f(A)f(A)∗ = α(f)α(f̄) = α(f f̄) = α(f̄ f) = α(f̄)α(f) = f(A)∗f(A).

(iii) σ(A) ⊆ Σ is clear. Since α(1) = 1 and α(z) = A by definition, we have α(p) = p(A) for each polynomial p. More generally, since α is a ∗-homomorphism, a polynomial in z, z̄ is sent by α to the corresponding polynomial in A, A∗. These polynomials are ‖·‖∞-dense in C(Σ, C) by the Stone-Weierstrass Theorem A.31, so that the continuity proven in (ii) implies that α(f) = f(A) as produced by the continuous functional calculus.
(iv) Left as an exercise. 

We close the discussion of spectral theorems with the advice to look at the paper [29] and at [73, Chapter 5], by two masters.

16 Weak and weak ∗-topologies. Alaoglu’s theorem


16.1 The weak topology of a Banach space
16.1 Definition If V is a Banach space, the weak topology τw is the topology on V induced
by the family of seminorms F = {k · kϕ = |ϕ(·)| | ϕ ∈ V ∗ }. Thus a net {xι } ⊆ V converges
weakly to x ∈ V if and only if ϕ(xι ) → ϕ(x) for all ϕ ∈ V ∗ .
The weak topology is also called the σ(V, V ∗ )-topology (the topology on V induced by the
linear functionals in V ∗ ). The Hahn-Banach theorem immediately gives that F is separating, so
that this topology is locally convex. It is clear that a norm-convergent net is weakly convergent since |ϕ(xι) − ϕ(x)| ≤ ‖ϕ‖‖xι − x‖. This implies \overline{S}^{‖·‖} ⊆ \overline{S}^{w} for every S ⊆ V and τw ⊆ τ‖·‖.
If (H, h·, ·i) is a Hilbert space, Theorem 5.29 implies F = {|h·, yi| | y ∈ H}, so that weak
convergence of a net {xι } in a Hilbert space H is equivalent to convergence of the nets {hxι , yi}
for each y ∈ H.

16.2 Proposition The weak topology on every infinite dimensional Banach space is strictly
weaker than the norm-topology.
Proof. By the definition of τw, for every weakly open neighborhood U of 0 there are ϕ1, . . . , ϕn ∈ V∗ such that {x ∈ V | |ϕi(x)| < 1 ∀i = 1, . . . , n} ⊆ U. Thus U contains the linear subspace W = ker ϕ1 ∩ · · · ∩ ker ϕn ⊆ V, whose codimension is ≤ n. Thus if V is infinite dimensional then dim W is infinite, thus non-zero. On the other hand, it is clear that the (norm-)open ball B(0, 1) contains no linear subspace of dimension > 0. Thus B(0, 1) ∉ τw. Since τw ⊆ τ‖·‖ was clear, we have τw ⊊ τ‖·‖. □

16.3 Exercise (i) Prove that the sequence {δn }n∈N has no weak limit in `1 (N, F).
(ii) Let 1 < p < ∞. Prove that the sequence {δn }n∈N ⊆ `p (N, F) converges to zero weakly, but
not in norm.
(iii) (Bonus) Let 1 ≤ p < ∞ and g, {fn}n∈N ∈ ℓp(N, F). Prove that if fn → g weakly and ‖fn‖p → ‖g‖p then ‖fn − g‖p → 0.52
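A finite truncation illustrating (ii) numerically: pairing δn against a fixed element of the dual gives ⟨δn, y⟩ = yn → 0, while the δn stay at unit distance from each other in norm (the vector y below is an arbitrary illustrative choice):

```python
import numpy as np

# delta_n -> 0 weakly in l^2: <delta_n, y> = y_n -> 0 for every fixed y in l^2,
# while ||delta_n||_2 = 1 and ||delta_n - delta_m||_2 = sqrt(2) for n != m.
N = 1_000_000
y = 1.0 / np.arange(1, N + 1)          # a fixed element of l^2 (truncated)

pair = lambda n: y[n]                  # <delta_n, y> = y_n
assert abs(pair(10)) > abs(pair(10_000)) > abs(pair(999_999))
assert abs(pair(999_999)) < 1e-5       # the pairings tend to 0

e = np.eye(3)                          # no norm convergence is possible:
assert np.isclose(np.linalg.norm(e[0] - e[1]), np.sqrt(2.0))
```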
The deviant behavior of `1 in the preceding exercise can be understood as a consequence of
the following surprising result:
16.4 Theorem (I. Schur 1920)53 If g, {fn}n∈N ⊆ ℓ1(N, F) and fn → g weakly then ‖fn − g‖1 → 0.
52 This implication holds for all uniformly convex Banach spaces, cf. e.g. [37, Proposition 9.11]. The Lp-spaces with 1 < p < ∞ are uniformly convex, cf. Section B.6.2.
53 Issai Schur (1875-1941). Russian-born mathematician. Studied and worked in Germany up to his emigration to Palestine in 1939. Mostly known for his work in group and representation theory.

Like the uniform boundedness theorem, this result can be proven using a beautiful gliding
hump argument or using Baire’s theorem, cf. Section B.7.
Theorem 16.4 does not generalize to nets since the weak and norm topologies on `1 (N, F)
differ by Proposition 16.2 and nets can distinguish topologies, cf. [47, Section 5.1].

16.5 Exercise Prove that every weakly convergent sequence in a Banach space is norm-
bounded. Hint: Uniform boundedness theorem.

16.6 Exercise Let V be a Banach space. Prove that the (norm) closed unit ball V≤1 is also
weakly closed. Hint: Hahn-Banach.
In Section 14.1 we saw that V≤1 is compact w.r.t. the norm topology if and only if V is
finite-dimensional. But the weak topology is weaker than the norm topology, so that a set can
be weakly compact even though it is not norm compact. Indeed, after some further preparations
we will prove the following theorem:

16.7 Theorem Let V be a Banach space. Then the following are equivalent:
(i) V is reflexive. (⇔ V ∗ is reflexive by Theorem 7.18.)
(ii) V≤1 is compact w.r.t. the weak topology.
In Remark 8.6 we have encountered the strong (operator) topology on B(V ): A net {Aι } ⊆
B(V ) converges strongly to A ∈ B(V ) if k(Aι − A)xk → 0 for all x ∈ V . Now we can have a
brief look at the weak operator topology:

16.8 Definition Let V be a Banach space. The weak operator topology τwot on B(V ) is
generated by the family F = {k · kx,ϕ : A 7→ |ϕ(Ax)| | x ∈ V, ϕ ∈ V ∗ } of seminorms. Thus
{Aι } ⊆ B(V ) converges to A ∈ B(V ) w.r.t. τwot if and only if ϕ((Aι − A)x) → 0 for all
x ∈ V, ϕ ∈ V∗, i.e. {Aιx} ⊆ V converges weakly to Ax for all x. The family F is separating, so that τwot is Hausdorff. We write Aι −→wot A or A = w-lim Aι.
If H is a Hilbert space, we have Aι −→wot A if and only if ⟨Aιx, y⟩ → ⟨Ax, y⟩ for all x, y ∈ H.
There is little risk of confusing the weak topology on V with the weak operator topology on
B(V ). But one might confuse the latter with the weak topology that B(V ) has as a Banach
space, in particular since the above k·kx,ϕ are in B(V )∗ ! However, when V is infinite dimensional
these seminorms do not exhaust (or span) the bounded linear functionals on B(V ), so that the
weak operator topology on B(V ) is strictly weaker than the weak topology!
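A standard illustration (folklore, not claimed in the text above): the powers of the unilateral shift on ℓ2 converge to 0 in τwot but not strongly, while the powers of the adjoint do converge strongly, which is also behind Exercise 16.9(ii). Truncated to finitely many coordinates:

```python
import numpy as np

# Powers of the unilateral shift S on (a truncation of) l^2:
#   <S^n x, y> -> 0 for all x, y   (wot convergence to 0),
#   ||S^n x|| = ||x||              (so no sot convergence),
#   ||(S*)^n x|| -> 0              (the adjoints DO converge strongly).
N = 2000

def shift(x, n):           # (S^n x)_i = x_{i-n}
    out = np.zeros_like(x)
    out[n:] = x[: N - n]
    return out

def coshift(x, n):         # ((S*)^n x)_i = x_{i+n}
    out = np.zeros_like(x)
    out[: N - n] = x[n:]
    return out

x = np.zeros(N)
x[:50] = 1.0 / np.sqrt(50.0)                    # a unit vector
y = x.copy()

assert abs(np.dot(shift(x, 1000), y)) < 1e-12   # <S^n x, y> -> 0
assert np.isclose(np.linalg.norm(shift(x, 1000)), 1.0)
assert np.linalg.norm(coshift(x, 1000)) == 0.0  # (S*)^n x -> 0
```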

16.9 Exercise Let H be a Hilbert space.


(i) Prove that the map (B(H), τwot ) → (B(H), τwot ), A 7→ A∗ is continuous.
(ii) Prove that the map (B(H), τsot ) → (B(H), τsot ), A 7→ A∗ is not continuous if dim H = ∞.
The strong and weak operator topologies on B(H) are quite important for the theory of von
Neumann algebras, an important special class of C ∗ -algebras. See [32, 50] for the basics.

16.2 The weak-∗ topology on the dual space of a Banach space


16.10 Definition If V is a Banach space, the weak-∗ topology τw∗ (or σ(V∗, V)-topology) is the topology on the dual space V∗ defined by the family F = {‖·‖x | x ∈ V} of seminorms, where ‖ϕ‖x = |x̂(ϕ)| = |ϕ(x)|. Thus a net {ϕι} in V∗ converges to ϕ ∈ V∗ if and only if ϕι(x) → ϕ(x) for every x ∈ V.

16.11 Remark 1. Since ϕ(x) = 0 for all x ∈ V means ϕ = 0, F is separating, thus the
σ(V ∗ , V )-topology is Hausdorff and therefore locally convex.
2. If V is infinite dimensional, the weak-* topology τw∗ is neither normable nor metrizable.
3. Since the weak-∗ topology is induced by the linear functionals on V∗ of the form x̂, which constitute a subset of V∗∗, it is weaker than the weak topology, thus also weaker than the norm topology: τw∗ ⊆ τw ⊆ τ‖·‖. As we know, the second inclusion is proper whenever V is infinite dimensional. For the first, we have: 2

16.12 Proposition If V is a Banach space, the weak-∗ topology σ(V ∗ , V ) on V ∗ coincides


with the weak topology σ(V ∗ , V ∗∗ ) if and only if V is reflexive.
Proof. If V is reflexive then V∗∗ ≅ V, so that the weak-∗ topology σ(V∗, V) on V∗ clearly coincides with the weak topology σ(V∗, V∗∗). If V is not reflexive, we have V ⊊ V∗∗. Now for ψ ∈ V∗∗\V it is clear that the linear functional ψ on V∗ is σ(V∗, V∗∗)-continuous, whereas Exercise 16.13 gives that it is not σ(V∗, V)-continuous. This proves σ(V∗, V) ≠ σ(V∗, V∗∗). □

16.13 Exercise Let V be an F-vector space with algebraic dual space V ? .


(i) For ϕ, ψ1, . . . , ψn ∈ V? prove that ϕ ∈ spanF{ψ1, . . . , ψn} ⇔ ker ψ1 ∩ · · · ∩ ker ψn ⊆ ker ϕ.

(ii) Let W ⊆ V ? be a linear subspace. Prove that a linear functional ϕ : V → F is σ(V, W )-


continuous if and only if ϕ ∈ W . Hint: Use (i).

16.14 Remark Before we proceed, some comments are in order: While the norm and weak
topologies are defined for each Banach space, the weak-∗ topology is defined only on spaces
that are the dual space V ∗ of a given space V . There are Banach spaces, like c0 (N, F), that are
not isomorphic (isometrically or not) to the dual space of any Banach space, cf. Corollary B.11.
And there are non-isomorphic Banach spaces with isomorphic dual spaces, cf. Corollary B.12.
Thus to define the weak-∗ topology on a Banach space, it is not enough just to know that the
latter is a dual space. We must choose a ‘pre-dual’ space. 2

The following is the reason for the importance of the weak-∗ topology:

16.15 Theorem (Alaoglu's theorem)54 If V is a Banach space then the (norm-)closed unit ball (V∗)≤1 = {ϕ ∈ V∗ | ‖ϕ‖ ≤ 1} is compact in the σ(V∗, V)-topology.
Proof. Define

Z = ∏x∈V {z ∈ C | |z| ≤ ‖x‖},

equipped with the product topology. Since the closed discs in C are compact, Z is compact by Tychonov's theorem. If ϕ ∈ (V∗)≤1 then |ϕ(x)| ≤ ‖x‖ ∀x, so that we have a map

f : (V∗)≤1 → Z, ϕ ↦ (ϕ(x))x∈V.

Since the map ϕ ↦ ϕ(x) is continuous for each x, f is continuous (w.r.t. the weak-∗ topology on (V∗)≤1). It is trivial that V separates the points of V∗, thus f is injective. By definition, a net {ϕι} in (V∗)≤1 converges in the σ(V∗, V)-topology if and only if ϕι(x) converges for all x ∈ V, and therefore if and only if f(ϕι) converges. Thus f : (V∗)≤1 → f((V∗)≤1) ⊆ Z is a homeomorphism.
54
Leonidas Alaoglu (1914-1981). Greek mathematician. (Earlier versions due to Helly and Banach.)

Now let z ∈ \overline{f((V∗)≤1)} ⊆ Z. Clearly, |zx| ≤ ‖x‖ ∀x ∈ V. By Proposition A.9.2 there is a net in f((V∗)≤1) converging to z and therefore a net {ϕι} in (V∗)≤1 such that f(ϕι) → z. This means ϕι(x) → zx ∀x ∈ V. In particular ϕι(αx + βy) → zαx+βy, while also ϕι(αx + βy) = αϕι(x) + βϕι(y) → αzx + βzy. Thus the map ψ : V → C, x ↦ zx is linear with ‖ψ‖ ≤ 1, to wit ψ ∈ (V∗)≤1 and z = f(ψ). Thus \overline{f((V∗)≤1)} ⊆ f((V∗)≤1), so that f((V∗)≤1) ⊆ Z is closed.
Now we have proven that (V ∗ )≤1 is homeomorphic to the closed subset f ((V ∗ )≤1 ) of the
compact space Z, and therefore compact. 

16.16 Remark We deduced Alaoglu’s theorem from Tychonov’s theorem. The latter is known
to be equivalent to the axiom of choice (AC). However, we only needed Tychonov’s theorem as
restricted to Hausdorff spaces. The latter can be proven from a weaker axiom than AC, to which
it actually is equivalent (namely the ‘ultrafilter lemma’, which also implies the Hahn-Banach
theorem). Cf. [47]. 2

16.17 Exercise Use Alaoglu’s theorem to prove that every Banach space V over F admits a
linear isometric bijection onto a closed subspace of C(X, F) for some compact Hausdorff space
X.

16.18 Exercise (i) Use Alaoglu’s theorem to prove (i)⇒(ii) in Theorem 16.7.
(ii) Conclude that the closed unit ball of every Hilbert space is weakly compact.
(iii) Prove σ(V, V∗) = σ(V∗∗, V∗) ↾ V.
(iv) Use Theorem 16.19 and (iii) to prove (ii)⇒(i) in Theorem 16.7.

16.19 Theorem (Goldstine)55 If V is a Banach space then V≤1 ⊆ (V∗∗)≤1 is σ(V∗∗, V∗)-dense.


The fairly non-trivial proof will be given in the supplementary Section B.5.2.

17 The Gelfand homomorphism for commutative Banach and C∗-algebras
We now pick up the discussion begun in Section 10.4.

17.1 The topology of Ω(A). The Gelfand homomorphism


The following is one of the most important applications of Alaoglu’s Theorem 16.15:

17.1 Proposition Let A be a unital Banach algebra and Ω(A) its spectrum. For each a ∈ A define â : Ω(A) → C, ϕ ↦ ϕ(a). Let τ be the initial topology on Ω(A) defined by {â | a ∈ A}, i.e. the weakest topology making all â continuous. Then (Ω(A), τ) is compact Hausdorff.
Proof. We have proven in Section 10.4 that (non-zero) characters are automatically continuous with norm one, so that Ω(A) ⊆ (A∗)≤1. By definition, â(ϕ) = ϕ(a). Thus the topology generated by the â is the restriction to Ω(A) ⊆ A∗ of the σ(A∗, A)-topology, thus Hausdorff. Let {ϕι}
be a net in Ω(A) that converges to ψ ∈ A∗ w.r.t. the σ(A∗ , A)-topology. Then for all a, b ∈ A we
have ψ(ab) = limι ϕι (ab) = limι ϕι (a)ϕι (b) = ψ(a)ψ(b), so that ψ ∈ Ω(A). Thus Ω(A) ⊆ (A∗ )≤1
55
Herman Heine Goldstine (1913-2004). American mathematician and computer scientist. Worked on very pure
and very applied mathematics, like John von Neumann with whom he collaborated on computers.

is σ(A∗ , A)-closed, thus compact since (A∗ )≤1 is σ(A∗ , A)-compact by Alaoglu’s theorem. 

The above works whether or not A is commutative, but we’ll now restrict to commutative
A since Ω(A) can be very small otherwise. We begin by completing Exercise 10.47:

17.2 Proposition Let X be a compact Hausdorff space and A = C(X, C). Then the map
X → Ω(A), x 7→ ϕx is a homeomorphism.
Proof. Injectivity was already proven in Exercise 10.47. In order to prove surjectivity, let
ϕ ∈ Ω(A) and put M = ker ϕ. Then M ⊆ A is a proper closed subalgebra (in fact an ideal),
and it is self-adjoint by Lemma 12.19 since A is a C ∗ -algebra. If x, y ∈ X, x 6= y, pick f ∈ A
with f (x) 6= f (y). With g = f − ϕ(f )1 we have ϕ(g) = 0, thus g ∈ M . This proves that
M separates the points of X, yet it is not dense in A. Now the incarnation Corollary A.34
of the Stone-Weierstrass theorem implies that there must be an x ∈ X at which M vanishes
identically, i.e. ϕx(f) = 0 for all f ∈ M. Now for every f ∈ A we have f − ϕ(f)1 ∈ M, thus ϕx(f − ϕ(f)1) = 0, which is equivalent to ϕx(f) = ϕ(f). Thus ι : X → Ω(A) is surjective.
If {xι } ⊆ X such that xι → x then ϕxι (f ) = f (xι ) → f (x) = ϕx (f ) for every f ∈ A by
continuity of f . But this precisely means that ϕxι → ϕx w.r.t. the weak-∗ topology. Thus ι is
continuous. As a continuous bijection of compact Hausdorff spaces it is a homeomorphism. 

17.3 Definition Let A be a unital Banach algebra. Then its radical is the set of quasi-
nilpotent elements: radA = {a ∈ A | r(a) = 0}. We call A semisimple if radA = {0}.

17.4 Proposition If A is a unital commutative Banach algebra, the map

π : A → C(Ω(A), C), a ↦ â (17.1)

is a unital homomorphism, called the Gelfand homomorphism (or representation) of A, and ‖π(a)‖ = r(a) ≤ ‖a‖ for all a ∈ A. Thus ker π = radA, and π is injective if and only if A is semisimple.
semisimple.
Proof. It is clear that π is linear. Furthermore, 1̂(ϕ) = ϕ(1) = 1 and

\widehat{ab}(ϕ) = ϕ(ab) = ϕ(a)ϕ(b) = â(ϕ) b̂(ϕ)  ∀a, b ∈ A, ϕ ∈ Ω(A),

so that π is a unital homomorphism. We have

‖â‖ = supϕ∈Ω(A) |â(ϕ)| = supϕ∈Ω(A) |ϕ(a)| = r(a) ≤ ‖a‖,

where we used (10.7) and Proposition 10.12. □
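A concrete numerical illustration, anticipating Example 17.9 in Section 17.2: for A = ℓ1(Z) the characters are ϕz(f) = Σn f(n)zⁿ with z ∈ S1, so π(f) is a function on the circle. The inequality ‖π(f)‖ = r(f) ≤ ‖f‖ just proven, and multiplicativity of π (convolution ↦ pointwise product), can be checked on finitely supported elements (the coefficients and grid size are illustrative choices):

```python
import numpy as np

# Gelfand transform of l^1(Z): pi(f)(z) = sum_n f(n) z^n for z on the unit
# circle.  Numerically: sup |pi(f)| <= ||f||_1, and pi turns convolution
# into pointwise multiplication.
rng = np.random.default_rng(4)
ns = np.arange(-5, 6)                          # f, g supported on [-5, 5]
f = rng.standard_normal(ns.size)
g = rng.standard_normal(ns.size)

ts = np.linspace(0.0, 1.0, 4096, endpoint=False)
z = np.exp(2j * np.pi * ts)                    # grid on S^1

def gelfand(coeffs, ks):
    return sum(c * z ** k for c, k in zip(coeffs, ks))

assert np.max(np.abs(gelfand(f, ns))) <= np.sum(np.abs(f)) + 1e-12  # r(f) <= ||f||_1

conv = np.convolve(f, g)                       # f * g, supported on [-10, 10]
assert np.allclose(gelfand(conv, np.arange(-10, 11)),
                   gelfand(f, ns) * gelfand(g, ns))
```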

The Gelfand homomorphism can fail to be surjective or injective or both. See Section 17.2
for an important example for the failure of surjectivity and Exercise 17.6 for a non-trivial unital
Banach algebra with very large radical.

17.5 Proposition Let A be a commutative unital Banach algebra and a ∈ A such that A is generated by {1, a}. Then the map â : Ω(A) → σ(a) is a homeomorphism. The same conclusion holds if a ∈ InvA and A is generated by {1, a, a−1}.

Proof. We know from (10.7) that â(Ω(A)) = σ(a), thus â is surjective. Assume â(ϕ1) = â(ϕ2), thus ϕ1(a) = ϕ2(a). Since the ϕi are unital homomorphisms, this implies ϕ1(an) = ϕ2(an) for all n ∈ N0, so that ϕ1, ϕ2 agree on the polynomials in a. Since the latter are dense in A by assumption and the ϕi are continuous, this implies ϕ1 = ϕ2. Thus â : Ω(A) → σ(a) is injective, thus a continuous bijection. Since Ω(A) is compact and σ(a) ⊆ C Hausdorff, â is a homeomorphism. This proves the first claim.
For the second claim, note that ϕ(a)ϕ(a−1) = ϕ(aa−1) = ϕ(1) = 1, thus ϕ(a−1) = ϕ(a)−1, for each ϕ ∈ Ω(A). This implies that ϕ1(an) = ϕ2(an) also holds for negative n ∈ Z. Now ϕ1, ϕ2 agree on all Laurent polynomials in a, thus on A by density and continuity. The rest of the proof is the same. □

17.6 Exercise Let α : N0 → (0, ∞) be a map satisfying α(0) = 1 and αn+m ≤ αn αm ∀n, m. For f : N0 → C, define ‖f‖ = Σn∈N0 αn |f(n)|, and A = {f : N0 → C | ‖f‖ < ∞}. For f, g ∈ A, define f · g by (f · g)(n) = Σu,v∈N0, u+v=n f(u)g(v).

(i) Prove that (A, ·, 1, k · k) is a commutative Banach algebra with unit 1 = δ0 .


(ii) Prove that δ1 generates A and r(δ1) = limn→∞ αn^{1/n}.
(iii) Find a sequence {αn } satisfying the above requirements such that δ1 is quasi-nilpotent.
(iv) Conclude that radA = {f ∈ A | f (0) = 0}.
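One admissible weight for (iii) — an assumption, not the only possible choice — is αn = e^{−n²}: then α0 = 1, α_{n+m} ≤ αn αm since (n+m)² ≥ n² + m², and αn^{1/n} = e^{−n} → 0, so δ1 is quasi-nilpotent. A quick numerical check:

```python
import numpy as np

# Illustrative weight alpha_n = exp(-n^2) for Exercise 17.6(iii):
# ||delta_1^n|| = ||delta_n|| = alpha_n, so r(delta_1) = lim alpha_n^{1/n}.
n = np.arange(1, 21, dtype=float)
alpha = np.exp(-n ** 2)

radii = alpha ** (1.0 / n)             # = exp(-n), the sequence with limit r(delta_1)
assert np.all(np.diff(radii) < 0)      # strictly decreasing ...
assert radii[-1] < 1e-8                # ... towards 0: quasi-nilpotence

# spot-check submultiplicativity alpha_{i+j} <= alpha_i * alpha_j
i, j = 3, 4
assert np.exp(-(i + j) ** 2) <= np.exp(-i ** 2) * np.exp(-j ** 2)
```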

17.7 Remark 1. Since every commutative unital Banach algebra has at least one non-zero
character ϕ, the worst that can happen is radA = ϕ−1 (0), which has codimension one, as in the
preceding exercise.
2. If A is a non-unital Banach algebra and a ∈ A one defines σ(a) = σÃ(a), where Ã is the unitization of A considered in Exercise 10.32. Now one defines r(a) = supλ∈σ(a) |λ| and radA = r−1(0) ⊆ A as before. Now for the non-unital subalgebra A0 = {f ∈ A | f(0) = 0} of the A from Exercise 17.6 one easily proves Ã0 ≅ A, thus r(a) = 0 ∀a ∈ A0 and radA0 = A0. 2

17.8 Exercise Let A be a commutative unital Banach algebra generated by {a1 , . . . , an } ⊆ A.


Define s : Ω(A) → Cn , ϕ 7→ (ϕ(a1 ), . . . , ϕ(an )). Prove that s is a homeomorphism of Ω(A) onto
a closed subspace of σ(a1 ) × · · · × σ(an ). (The latter is one way of defining the joint spectrum
σ(a1, . . . , an).)

17.2 Application: Absolutely convergent Fourier series


The following example is lengthy, but very instructive:

17.9 Example Let (A = ℓ1(Z, C), ‖·‖, ⋆, 1) be the unital Banach algebra from Example 10.25.
In view of δn ⋆ δm = δn+m, this algebra is generated by a = δ1 ∈ InvA and a^{−1} = δ−1. We have
‖a‖ = ‖a^{−1}‖ = 1 so that by Exercise 10.13 we have σ(a) ⊆ S1. Now, for z ∈ S1 define

ϕz : f ↦ Σ_{n∈Z} f(n) z^n,    (17.2)

which is absolutely and uniformly convergent since f ∈ ℓ1. It is clear that ϕz(δn) = z^n, so that
ϕz(δn ⋆ δm) = ϕz(δn+m) = z^{n+m} = ϕz(δn)ϕz(δm), proving ϕz ∈ Ω(A). In particular, ϕz(a) = z, so
that σ(a) = S1. Now Proposition 17.5 gives Ω(A) = {ϕz | z ∈ S1}. By uniform convergence in

(17.2), one finds that f̌(z) := ϕz(f) is continuous in z and that \widehat{f̌}(n) = ∫_0^1 f̌(e^{2πit}) e^{−2πint} dt = f(n) ∀n.
We have

‖π(f)‖ = r(f) = sup_{z∈S1} |ϕz(f)| = sup_{z∈S1} |f̌(z)| = ‖f̌‖∞,

which vanishes only if f = 0 (by the fact that g ∈ C(S1, C) vanishes if and only if ĝ(n) = 0 ∀n ∈
Z, cf. e.g. [76, Chapter 2, Theorem 2.1]). Thus A is semisimple and π : ℓ1(Z) → C(S1, C) is
injective. But π is not surjective: Its image consists precisely of

B = {g ∈ C(S1, C) | Σ_{n∈Z} |ĝ(n)| < ∞}.

This is an algebra since A is. In (superficially) more elementary terms this is just the observation
that the pointwise product of two elements of C(S 1 , C) corresponds to convolution of their
Fourier coefficients and the fact that ℓ1(Z) is closed under convolution. For g ∈ B the Fourier
series converges absolutely and uniformly to g, but we have proven in Section 8.4 that C(S1, C)
has a dense subset of functions whose Fourier series does not even converge pointwise
everywhere. (Our proof was non-constructive, but as we remarked, single examples can be
produced constructively.)
And functions in C(S1, C)\B can be written down even more concretely: With some effort
(see [49] for an exposition) the series Σ_{n=2}^∞ sin(nx)/(n log n) can be shown to be uniformly
convergent to some f ∈ C(S1, C), and its Fourier coefficients are not absolutely summable since
Σ_{n=2}^∞ (n log n)^{−1} = ∞.
We now turn the non-surjectivity of π : ℓ1(Z) → C(S1) into a virtue! For g ∈ C(S1, C)
define ‖g‖B = Σ_{n∈Z} |ĝ(n)|. Thus B = {g ∈ C(S1, C) | ‖g‖B < ∞}. We have seen that the
Gelfand representation of ℓ1(Z) is an isometric isomorphism (ℓ1(Z), ‖·‖1) → (B, ‖·‖B). Now we
can give Gelfand's slick proof of the following result, proven by Wiener with much more effort:

17.10 Theorem If g ∈ B satisfies g(z) ≠ 0 ∀z ∈ S1 and h ∈ C(S1, C) is its multiplicative
inverse h(z) = 1/g(z) then h ∈ B (thus h has absolutely convergent Fourier series).
Proof. Let f = π^{−1}(g) ∈ ℓ1(Z). We have seen that Ω(A) = S1 and ϕz(f) = g(z) for all z ∈ S1.
Now the assumption g(z) ≠ 0 ∀z implies that 0 ∉ σ(f) = {ϕz(f) | z ∈ S1}, so that f is invertible
in ℓ1(Z). Thus π(f) = g ∈ B is invertible. Since the product on B is pointwise multiplication,
this proves that h = g^{−1} ∈ B, thus h has absolutely convergent Fourier series. □
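Wiener's theorem is easy to illustrate numerically. Take, say, g(e^{2πit}) = 2 + cos(2πt) (our own choice of example): it has Fourier coefficients ĝ(0) = 2, ĝ(±1) = 1/2 and never vanishes on S1, so the theorem predicts that h = 1/g again has absolutely summable Fourier coefficients. Approximating those coefficients by a discrete Fourier transform indeed shows rapid (geometric) decay:

```python
import numpy as np

N = 1024
t = np.arange(N) / N
g = 2 + np.cos(2 * np.pi * t)   # never vanishes: g >= 1 on S^1
h = 1 / g

# Discrete approximation of the Fourier coefficients of h:
hhat = np.abs(np.fft.fft(h)) / N

print(hhat[:5])       # geometric decay of |ĥ(n)|
print(hhat.sum())     # proxy for Σ|ĥ(n)|, which is finite
```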

17.3 C ∗ -algebras. Continuous functional calculus revisited


In discussing when the Gelfand homomorphism π : A → C(Ω(A), F) is an isomorphism, we
limit ourselves to the case where A is a C ∗ -algebra over C.

17.11 Theorem (Gelfand isomorphism) If A is a commutative unital C ∗ -algebra then the


Gelfand homomorphism π : A → C(Ω(A), C) is an isometric ∗-isomorphism.
Proof. For all a ∈ A, ϕ ∈ Ω(A), using Lemma 12.19 we have

π(a∗)(ϕ) = \widehat{a∗}(ϕ) = ϕ(a∗) = \overline{ϕ(a)} = \overline{â(ϕ)} = \overline{π(a)(ϕ)} = π(a)∗(ϕ).

Thus π(a∗ ) = π(a)∗ , so that π is a ∗-homomorphism, and π(A) ⊆ C(Ω(A), C) is self-adjoint.


Since A is commutative, all a ∈ A are normal, thus satisfy r(a) = kak by Proposition
11.24(i). Together with kπ(a)k = r(a) for all a this implies that π is an isometry, thus injective.
Since A is complete, this implies that the image π(A) ⊆ C(Ω(A), C) is complete, thus closed.

If ϕ1 ≠ ϕ2 then there is an a ∈ A such that ϕ1(a) ≠ ϕ2(a), thus π(a)(ϕ1) = â(ϕ1) ≠ â(ϕ2) =
π(a)(ϕ2). This proves that π(A) ⊆ C(Ω(A), C) separates the points of Ω(A). Since π is also
unital, the Stone-Weierstrass theorem (Corollary A.32) gives π(A) = \overline{π(A)} = C(Ω(A), C). □

17.12 Remark 1. Theorem 17.11 can be strengthened to a (contravariant) equivalence of cat-


egories between the categories of commutative unital C ∗ -algebras and unital ∗-homomorphisms
and of compact Hausdorff spaces and continuous maps.
2. With some work, the assumption of A having a unit can be dropped, cf. e.g. [50].
One finds that every commutative C ∗ -algebra is isometrically ∗-isomorphic to C0 (X, C) for a
locally compact Hausdorff space X, unique up to homeomorphism. X is compact if and only
if A is unital. And the equivalence of categories mentioned above extends to a contravariant
equivalence between the category of locally compact Hausdorff spaces and proper maps and the
category of commutative C∗-algebras and non-degenerate ∗-homomorphisms.
3. The preceding comments in a sense end the theory of commutative C∗-algebras since the
latter is reduced to general topology. But the theory of non-commutative C∗-algebras is vast,
see [32, 50] for accessible introductions, and it turns out that commutative C ∗ -algebras are a
very useful tool for studying them, as results like Propositions 12.12 and 12.13 just begin to
illustrate.
4. Comparing Theorem 17.11 with the non-surjectivity of the Gelfand homomorphism for
(ℓ1(Z, C), ⋆) shows that ℓ1(Z, C) does not admit a norm that would make it a C∗-algebra. But
ℓ1(Z, C) admits a non-complete C∗-norm ‖·‖0, and completing ℓ1(Z, C) w.r.t. the latter yields
a C∗-algebra C∗(Z) that is isomorphic to C∗(U) ⊆ B(ℓ2(Z, C)), where U ∈ B(ℓ2(Z, C)) is the
two-sided shift unitary. One also has C∗(Z) ≅ C(S1, C), thus the C∗-completion ‘adds’ the
continuous functions with non-absolutely convergent Fourier series. 2

The following is a C ∗ -version of Proposition 17.5:

17.13 Proposition Let B be a commutative unital C∗-algebra and b ∈ B such that B =
C∗(1, b). Then the map b̂ : Ω(B) → σ(b) is a homeomorphism.
Proof. The proof is similar to that of Proposition 17.5, enriched by the following argument: If
ϕ1(b) = ϕ2(b) then by Lemma 12.19 we have ϕ1(b∗) = \overline{ϕ1(b)} = \overline{ϕ2(b)} = ϕ2(b∗). Thus ϕ1 and ϕ2
coincide on all polynomials in b and b∗, and therefore on B. □

Now we have another proof of the continuous functional calculus for normal elements of a
C ∗ -algebra:

17.14 Theorem Let A be a unital C∗-algebra and a ∈ A normal. Then

(i) There is a unique unital ∗-homomorphism αa : C(σ(a), C) → A such that αa(z) = a,
where z is the inclusion map σ(a) ↪ C. As in Section 12.2, we interpret αa(f) as f(a).
(ii) If f ∈ C(σ(a), C) then σ(f(a)) = f(σ(a)).
(iii) If f ∈ C(σ(a), C) and g ∈ C(f(σ(a)), C) then (g ◦ f)(a) = g(f(a)).
Proof. (i) Let B = C∗(1, a) ⊆ A be the closed ∗-subalgebra generated by {1, a}. Since a is
normal, B is a commutative unital C∗-algebra, thus by Theorem 17.11, there is an isometric
∗-isomorphism π : B → C(Ω(B), C). And by Proposition 17.13 we have a homeomorphism
â : Ω(B) → σ(a). Now we define αa to be the composite of the maps

C(σ(a), C) --ât--> C(Ω(B), C) --π^{−1}--> B ↪ A,

where the first map is ât : f ↦ f ◦ â. It is clear that αa is a unital homomorphism. If z : σ(a) ↪ C
is the inclusion, then αa(z) = π^{−1}(z ◦ â) = π^{−1}(â) = a. Any continuous unital homomorphism
α : C(σ(a)) → B sending 1 to 1A and z to a coincides with αa on the polynomials C[x]. Since
the latter are dense in C(σ(a), C) by Stone-Weierstrass, we have α = αa.
(ii) As used above, the C∗-subalgebra B = C∗(1, a) is abelian and there is an isometric
∗-isomorphism π : B → C(σ(a), C). By construction of the functional calculus we have π(f(a)) =
f ◦ ι, where ι is the inclusion map σ(a) ↪ C. Now, with Theorem 11.27 and Exercise 10.6 we
have

σA(f(a)) = σB(f(a)) = σ_{C(σ(a))}(π(f(a))) = σ_{C(σ(a))}(f ◦ ι) = f(σ(a)).
(iii) This is essentially obvious, since applying f to a and g to f (a) is just composition of
maps on the right hand side of the Gelfand isomorphism. 
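For A = Mn(C), which is a unital C∗-algebra, the functional calculus for a normal matrix amounts to applying f to the eigenvalues in a spectral decomposition. The following sketch (with an arbitrarily chosen unitary as the normal element, our own toy example) checks the spectral mapping property (ii) numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([0.3, 1.1, 2.5])
a = np.diag(np.exp(1j * theta))                      # normal (even unitary) matrix
Q = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))[0]
a = Q @ a @ Q.conj().T                               # conjugate to hide the diagonal form

def func_calc(a, f):
    # Spectral decomposition of the normal matrix a, then apply f to the eigenvalues.
    lam, V = np.linalg.eig(a)
    return V @ np.diag(f(lam)) @ np.linalg.inv(V)

f = lambda z: z ** 2 + z
fa = func_calc(a, f)

spec_fa = np.sort_complex(np.linalg.eigvals(fa))     # sigma(f(a))
f_spec = np.sort_complex(f(np.exp(1j * theta)))      # f(sigma(a))
print(spec_fa)
print(f_spec)
```

Up to numerical error the two printed multisets coincide, as (ii) predicts.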

17.15 Remark 1. It should be clear that Theorem 17.11 is conceptually of fundamental im-
portance, but it is not easy to find applications that are not just applications of the continuous
functional calculus for normal operators. The proof of the latter that we gave in Section 12.3
was a good deal more elementary than the one above: It did not involve the weak-∗ topology
and Alaoglu’s theorem, and it only needed Weierstrass’ classical theorem (in two dimensions)
rather than the more general result of Stone.
2. Enriching Theorem 17.11 by some considerations on von Neumann algebras (which we
don’t define here) one can prove representation theorems for commutative von Neumann alge-
bras, cf. e.g. [42, Chapter 6] or [50, Section 4.4], which add additional perspective to the spectral
theorem for normal operators. 2

17.16 Exercise Let A be a C ∗ -algebra and a, b ∈ A commuting self-adjoint elements. Put


c = a + ib.
(i) Prove σ(a) = Re(σ(c)) and σ(b) = Im(σ(c)).
(ii) Prove that the joint spectrum σ(a, b) defined in Exercise 17.8 coincides with

σ(a, b) = {(Re λ, Im λ) | λ ∈ σ(a + ib)}.

A Some topics from topology and measure theory


A.1 Unordered sums
If S is a finite set, A an abelian group and f : S → A a function, it is not hard to define
Σ_{s∈S} f(s) (even though few textbook authors bother to do so explicitly). One chooses a bijection
α : {1, 2, . . . , #S} → S and defines Σ_{s∈S} f(s) = Σ_{i=1}^{#S} f(α(i)). The only slight difficulty is
proving that the result does not depend on the choice of α.
In order to define infinite sums, we need a topology on A, and we restrict to the case of
functions f : S → V , where (V, k · k) is a normed space. Many authors of introductory texts
(for a nice exception see [79, Vol. I, Section 8.2]) consider only those countable sums known as
series, but for our purposes this is inadequate.

A.1 Definition Let S be a set, (V, ‖·‖) a normed space and f : S → V a function. We say
that Σ_{s∈S} f(s) exists (or converges) and equals x ∈ V if for every ε > 0 there is a finite
subset T ⊆ S such that ‖x − Σ_{s∈U} f(s)‖ < ε holds for every finite U ⊆ S containing T.

In many cases, the above will be applied to V = F ∈ {R, C} and ‖·‖ = |·|.
This notion of summation has some useful properties:
A.2 Proposition (i) If Σ_{s∈S} f(s) exists then the sum x ∈ V is uniquely determined.
(ii) If Σ_{s∈S} f(s) = x and Σ_{s∈S} g(s) = y then Σ_{s∈S} (cf(s) + dg(s)) = cx + dy for all c, d ∈ F.
(iii) If f(s) ≥ 0 ∀s ∈ S then Σ_{s∈S} f(s) exists if and only if sup{Σ_{t∈T} f(t) | T ⊆ S finite} < ∞,
in which case the two expressions coincide. These equivalent conditions imply that the set
{s ∈ S | f(s) ≠ 0} is at most countable.
(iv) If (V, ‖·‖) is complete and Σ_{s∈S} ‖f(s)‖ < ∞ then Σ_{s∈S} f(s) exists, and ‖Σ_{s∈S} f(s)‖ ≤
Σ_{s∈S} ‖f(s)‖.
(v) If f : S → F is such that Σ_{s∈S} f(s) exists then Σ_{s∈S} |f(s)| exists.
The proofs of (i) and (ii) are straightforward and similar to those for the analogous statements
about series. The equivalence in (iii) follows from monotonicity of the map Pfin(S) →
[0, ∞), T ↦ Σ_{t∈T} f(t). If Σ_{s∈S} |f(s)| < ∞ then it follows that for every ε > 0 there are
at most finitely many s ∈ S such that |f(s)| ≥ ε. In particular, for every n ∈ N the set
Sn = {s ∈ S | |f(s)| ≥ 1/n} is finite. Since a countable union of finite sets is countable, we
have countability of {s ∈ S | f(s) ≠ 0} = ∪_{n=1}^∞ Sn. The proof of (iv) is analogous to that of
the implication ⇒ in Proposition 3.2.
The statement (v) probably is surprising since the analogous statement for series is false.
Roughly, the reason is that our definition of Σ_{s∈S} f(s) imposes no ordering on S, while the
sum of a series Σ_{n=1}^∞ f(n) is invariant under reordering of the terms only if the series converges
absolutely. I strongly suggest that you make an effort to understand this! A rigorous proof of
(v) can be found, e.g., in [47].
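A small numerical illustration of this point (not a proof, and with index sets of our own choosing): for an absolutely summable f, the sums over any two large finite supersets of {1, . . . , N} nearly agree, while for f(n) = (−1)^n/n one can pad such a set with extra even indices and shift the sum by roughly (1/2)·log 3, reflecting that the unordered sum does not exist.

```python
def S(f, U):
    # Sum of f over the finite subset U, in the sense of Definition A.1.
    return sum(f(n) for n in U)

f_abs = lambda n: (-1) ** n / n ** 2   # absolutely summable
f_cond = lambda n: (-1) ** n / n       # series converges, unordered sum does not exist

N = 1000
U1 = list(range(1, 2 * N + 1))                               # {1, ..., 2N}
extra_evens = [n for n in range(N + 1, 3 * N + 1) if n % 2 == 0]
U2 = list(range(1, N + 1)) + extra_evens                     # another finite superset of {1, ..., N}

gap_abs = abs(S(f_abs, U1) - S(f_abs, U2))
gap_cond = abs(S(f_cond, U1) - S(f_cond, U2))
print(gap_abs)    # tiny: the unordered sum of f_abs exists
print(gap_cond)   # stays near (1/2)·log 3 ≈ 0.55: no unordered sum for f_cond
```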
In discussing the spaces ℓp(S, F), the following (which is just an easy special case of Lebesgue's
dominated convergence theorem) is useful:

A.3 Proposition (Discrete case of dominated convergence theorem) Let S be a set
and {fn}n∈N functions S → C. Assume that
1. For each s ∈ S, the limit limn→∞ fn(s) exists. Define h : S → C, s ↦ limn→∞ fn(s).
2. There exists a function g : S → [0, ∞) such that Σ_{s∈S} g(s) < ∞ and |fn(s)| ≤ g(s) for all n ∈ N, s ∈ S.
Then
(i) Σ_{s∈S} fn(s) converges for each n ∈ N. So does Σ_{s∈S} h(s).
(ii) limn→∞ Σ_{s∈S} fn(s) = Σ_{s∈S} h(s). (Thus limit and summation can be interchanged.)
Proof. (i) Assumptions 1 and 2 give |h(s)| ≤ g(s) ∀s. Now assumption 2 implies convergence of
Σs h(s) and of Σs fn(s) for all n.
(ii) Let ε > 0. Since Σs g(s) < ∞, there is a finite subset T ⊆ S such that Σ_{s∈S\T} g(s) < ε/4.
For each t ∈ T there is an nt ∈ N such that n ≥ nt ⇒ |fn(t) − h(t)| < ε/(2#T). Put n0 = max_{t∈T} nt.
If n ≥ n0 then

|Σ_{s∈S} fn(s) − Σ_{s∈S} h(s)| ≤ |Σ_{s∈T} fn(s) − Σ_{s∈T} h(s)| + |Σ_{s∈S\T} fn(s) − Σ_{s∈S\T} h(s)|.

The first term on the r.h.s. is bounded by

Σ_{s∈T} |fn(s) − h(s)| ≤ #T · ε/(2#T) = ε/2

due to the definition of n0 and n ≥ n0 ≥ nt. And the second is bounded by

Σ_{s∈S\T} (|fn(s)| + |h(s)|) ≤ 2 Σ_{s∈S\T} g(s) ≤ ε/2,

where we used that g dominates |fn| and |h|, as well as the choice of T. Putting the two
estimates together gives n ≥ n0 ⇒ |Σ_{s∈S} fn(s) − Σ_{s∈S} h(s)| ≤ ε, completing the proof. □
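A quick numerical sanity check of Proposition A.3, with a large finite cutoff standing in for S = N (the particular fn is our own toy example): fn(s) = n/(s²(n + s)) is dominated by g(s) = 1/s² and increases pointwise to h(s) = 1/s², and the sums Σs fn(s) indeed approach Σs 1/s² = π²/6.

```python
import math

M = 100000  # finite cutoff standing in for S = N

def sum_fn(n):
    # f_n(s) = n / (s^2 (n+s)) <= 1/s^2 = g(s), and f_n(s) -> 1/s^2 pointwise
    return sum(n / (s * s * (n + s)) for s in range(1, M + 1))

target = sum(1.0 / (s * s) for s in range(1, M + 1))  # ≈ π²/6

sums = [sum_fn(n) for n in (1, 10, 100, 1000)]
print(sums)
print(target, math.pi ** 2 / 6)
```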

A.2 Nets
The Definition A.1 of unordered sums is an instance of a much more general notion, the con-
vergence of nets.

A.4 Definition A directed set is a set I equipped with a binary relation ≤ on I satisfying
1. a ≤ a for each a ∈ I (reflexivity).
2. If a ≤ b and b ≤ c for a, b, c ∈ I then a ≤ c (transitivity).
3. For any a, b ∈ I there exists a c ∈ I such that a ≤ c and b ≤ c (directedness).

A.5 Remark If only 1. and 2. hold, (I, ≤) is called a pre-ordered set. Some authors, as e.g.
[42], require in addition that a ≤ b and b ≤ a together imply a = b (antisymmetry). Recall
that a pre-ordered set with this property is called partially ordered. But the antisymmetry is
an unnatural assumption in this context and is never used. 2

A.6 Example 1. Every totally ordered set (X, ≤) is a directed set. Only the directedness
needs to be shown, and it follows by taking c = max(a, b). In particular N is a directed set with
its natural total ordering.
2. If S is a set then the power set I = P (S) with its natural partial ordering is directed: For
the directedness, put c = a ∪ b. The same works for the set Pfin (S) of finite subsets of S, which
appeared in the definition of unordered sums.
3. If (X, τ ) is a topological space and x ∈ X, let Ux be the set of open neighborhoods of x.
Now for U, V ∈ Ux , define U ≤ V ⇔ U ⊆ V , thus we take the reversed ordering. Then (Ux , ≤)
is directed with c = a ∩ b.

A.7 Definition If X is a set, a net in X is a map I → X, ι 7→ xι , where (I, ≤) is a directed


set.
If (X, τ ) is a topological space, a net {xι }ι∈I in X converges to z ∈ X if for every open
neighborhood U of z there is a ι0 ∈ I such that ι ≥ ι0 ⇒ xι ∈ U .
When this holds, we write xι → z or limι∈I xι = z. (The second notation should only be
used if X is Hausdorff, since this property is equivalent to uniqueness of limits of nets.)

A.8 Remark 1. With I = N and ≤ the natural total ordering, a net indexed by I just is a
sequence, and this net converges if and only if the sequence does.
2. Unordered summation is a special case of a net limit: If S is any set, let I be the set of
finite subsets of S and let ≤ be the ordinary (partial) ordering of subsets of S. If T, U ∈ I let
V = T ∪ U. Clearly T ≤ V, U ≤ V, showing that (I, ≤) is a directed set. (This is the same as
Example A.6.2, except that now we only look at finite subsets of S.) Now given f : S → F, for
every T ∈ I, thus every finite T ⊆ S, we can clearly define Σ_{t∈T} f(t). Now

Σ_{s∈S} f(s) = lim_{T∈I} Σ_{t∈T} f(t),

where the sum exists if and only if the limit exists. 2

Why nets? The reason is that sequences are totally inadequate for the study of topological
spaces that do not satisfy the first countability axiom.56 Given a metric space X and a subset
Y ⊆ X, one proves that x ∈ Ȳ if and only if there is a sequence {yn} in Y converging to x, but
for general topological spaces this is false. Similarly, the statement that a function f : X → Y
is continuous at x ∈ X if and only if f (xn ) → f (x) for every sequence {xn } converging to x is
true for metric spaces, but false in general! (It is instructive to work out counterexamples.)
On the other hand:

A.9 Proposition 1. Let X be a topological space and Y ⊆ X. If {yι} is a net in Y that
converges to x ∈ X then x ∈ Ȳ.
2. Let X be a topological space and Y ⊆ X. Then for every x ∈ Ȳ there exists a net {yι}
in Y such that yι → x.
3. A topological space X is Hausdorff if and only if there exists no net {xι } in X that
converges to two different points of X.
4. If X, Y are topological spaces, f : X → Y a function, and x ∈ X, then f is continuous at
x if and only if f (xι ) → f (x) for every net {xι } in X converging to x.
For proofs see [47], [42, Section 5.5], or any decent book on topology.
If (X, d) is a metric space, the problems with sequences mentioned above do not arise.
Nevertheless, there are situations where the use of nets in X is useful, as in the proof of
Theorem 5.38 and 5.41 where we considered nets indexed by the finite subsets of an ONB E.
In this case one wants:

A.10 Definition A net {xι}, indexed by a directed set (I, ≤), in a metric space (X, d) is a
Cauchy net if for every ε > 0 there is an ι0 ∈ I such that ι, ι′ ≥ ι0 ⇒ d(xι, xι′) < ε.
(In a normed space, this definition is consistent with the one in Remark 2.20.)

A.11 Lemma In a complete metric space every Cauchy net converges.


Proof. Let {xι}ι∈I be Cauchy. Then for every n ∈ N there is an ιn ∈ I such that ι, ι′ ≥ ιn ⇒
d(xι, xι′) < 1/n. We can also arrange that ι1 ≤ ι2 ≤ · · · (using directedness to replace ι2 by
some ι′2 larger than ι1 and ι2, etc.). Now it is quite clear that {xιn}n∈N is a Cauchy sequence
in X. By completeness of (X, d) it converges to some x ∈ X. Let now ε > 0. Pick n ∈ N
such that n ≥ 1/ε and d(xιm, x) < ε/2 for all m ≥ n. Put ι0 = ι2n. If ι ≥ ι0 = ι2n then
d(xι, xι2n) < 1/(2n) ≤ ε/2, so that d(xι, x) ≤ d(xι, xι2n) + d(xι2n, x) < ε/2 + ε/2 = ε, thus xι → x. □

A.3 The Stone-Čech compactification


If X is a topological space, a compactification of X is a space X̂ together with a continuous
map ι : X → X̂ such that ι(X) ⊆ X̂ is a dense subset and ι : X → ι(X) is a homeomorphism.
You certainly know the one-point or Alexandrov compactification of a topological space.
(Usually it is considered only for locally compact spaces.) It is the smallest possible compacti-
fication in that it just adds one point.
But for many purposes, another compactification is more important, the Stone-Čech com-
pactification. It is defined for spaces that have the following property:
56
Unfortunately many introductory books and courses sweep this under the rug and don’t even mention nets.

A.12 Definition A topological space X is completely regular if for every closed C ⊆ X and
y ∈ X\C there exists a continuous function f : X → [0, 1] such that f ↾ C = 0 and f(y) = 1.
All subspaces of a completely regular space are completely regular. By Urysohn’s lemma,
every normal space is completely regular, in particular every metrizable and every compact
Hausdorff space. This implies that complete regularity is a necessary condition for a space X
to have a compactification X̂ that is Hausdorff. In fact, it also is sufficient:

A.13 Theorem Let X be a topological space. Then the following are equivalent:
(i) X is completely regular.
(ii) There exists a compact Hausdorff space βX together with a dense embedding X ↪ βX
such that for every continuous function f : X → Y, where Y is compact Hausdorff, there
exists a continuous f̂ : βX → Y such that f̂ ↾ X = f. (This f̂ is automatically unique by
density of X ⊆ βX.)
The universal property (ii) implies that βX is unique up to homeomorphism. ‘It’ is called the
Stone57 -Čech58 compactification of X.
Let X be completely regular and F ∈ {R, C}. Then the restriction map C(βX, F) → Cb(X, F)
given by f ↦ f ↾ X is a bijection and an isomorphism of commutative unital F-algebras.
There are many ways to prove the non-trivial implication (i)⇒(ii), the most common one
using Tychonov’s theorem, cf. [47]. But we can also use Gelfand duality for commutative C ∗ -
algebras, cf. Section 17. Here is a sketch: If X is completely regular, A = Cb (X, C) with
norm kf k = supx |f (x)| is a commutative unital C ∗ -algebra. As such it has a spectrum Ω(A),
which is compact Hausdorff. We define βX = Ω(A). There is a map ι : X → Ω(A), x 7→ ϕx ,
where ϕx (f ) = f (x). This map is continuous by definition of the topology on Ω(A). Using the
complete regularity of X one proves that ι is an embedding, i.e. a homeomorphism of X onto
ι(X) ⊆ Ω(A). That ι(X) is dense in Ω(A) is seen as follows: otherwise there would be (using
Urysohn or Tietze) an f ∈ A\{0} such that ι(x)(f) = 0 for all x ∈ X. This is a contradiction,
since the elements of A are functions on X, so that ι(x)(f ) = 0 ∀x implies f = 0.

A.4 Reminder of the choice axioms and Zorn’s lemma


A.14 Definition The Axiom of Choice (AC) is any of the following statements, which are
easily shown to be equivalent:
• If f : X → Y is a surjective function then there exists a function g : Y → X such that
f ◦ g = idY .
• If X is a set, there exists a function s : P(X)\{∅} → X such that s(Y) ∈ Y for each
Y ∈ P(X)\{∅}, i.e. ∅ ≠ Y ⊆ X.
• If {Xi}i∈I is a family of non-empty sets then Π_{i∈I} Xi ≠ ∅. Concretely, there exists a map
f : I → ∪_{j∈I} Xj such that f(i) ∈ Xi ∀i ∈ I.

A.15 Definition Let (X, ≤) be a partially ordered set. Then


• m ∈ X is called a maximal element if y ∈ X, y ≥ m implies y = m.
57
Marshall Harvey Stone (1903-1989). American mathematician, mostly active in topology and (functional) analysis.
58
Eduard Čech (1893-1960). Czech mathematician, worked mostly in topology.

• u ∈ X is called an upper bound for Y ⊆ X if y ≤ u holds for each y ∈ Y. If this u is in Y
it is called the largest element of Y (which is unique).

A.16 Theorem Given the Zermelo-Fraenkel axioms of set theory, the Axiom of Choice is equiv-
alent to Zorn's lemma, which says: If (X, ≤) is a non-empty partially ordered set such that
every totally ordered subset Y ⊆ X has an upper bound then X has a maximal element.

A.17 Definition The Axiom of Countable Choice (ACω ) is the first (or third) of the above
versions of AC with the restriction that Y (respectively I) be at most countable.

A.18 Definition The Axiom of Countable Dependent Choice (DCω ) is the following: If X is
a set and R ⊆ X × X is such that for every x ∈ X there is a y ∈ X such that (x, y) ∈ R then
for each x1 ∈ X there is a sequence {xn } in X such that (xn , xn+1 ) ∈ R for all n ∈ N.
It is easy to prove AC ⇒ DCω ⇒ ACω . The converse implications are false.

A.5 Baire’s theorem


You should have seen the following. (If not, cf. e.g. [47].)

A.19 Lemma Let (X, τ) be a topological space with X ≠ ∅. Then Y ⊆ X is dense if and only
if Y ∩ W ≠ ∅ whenever ∅ ≠ W ∈ τ. (Equivalently, X\Y has empty interior.)

A.20 Theorem^59 Let (X, d) be a complete metric space and {Un}n∈N a countable family of
dense open subsets. Then ∩_{n=1}^∞ Un is dense in X.
Proof. Let ∅ ≠ W ∈ τ. Since U1 is dense, W ∩ U1 ≠ ∅ by Lemma A.19, so we can pick
x1 ∈ W ∩ U1. Since W ∩ U1 is open, we can choose ε1 > 0 such that \overline{B(x1, ε1)} ⊆ W ∩ U1. We
may also assume ε1 < 1. Since U2 is dense, U2 ∩ B(x1, ε1) ≠ ∅ and we pick x2 ∈ U2 ∩ B(x1, ε1).
By openness, we can pick ε2 ∈ (0, 1/2) such that \overline{B(x2, ε2)} ⊆ U2 ∩ B(x1, ε1). Continuing this
iteratively, we find points xn and εn ∈ (0, 1/n) such that \overline{B(xn, εn)} ⊆ Un ∩ B(xn−1, εn−1) ∀n. If
i > n and j > n we have by construction that xi, xj ∈ B(xn, εn) and thus d(xi, xj) ≤ 2εn < 2/n.
Thus {xn} is a Cauchy sequence, and by completeness it converges to some z ∈ X. Since
n > k ⇒ xn ∈ B(xk, εk), the limit z is contained in \overline{B(xk, εk)} for each k, thus

z ∈ ∩_n \overline{B(xn, εn)} ⊆ W ∩ ∩_n Un,

thus W ∩ ∩_n Un is non-empty. Since W was an arbitrary non-empty open set, Lemma A.19
gives that ∩_n Un is dense. □

The following (equivalent) reformulation is useful:

A.21 Corollary Let (X, d) be a complete metric space and {Cn}n∈N a countable family of
closed subsets with empty interior. Then ∪_{n=1}^∞ Cn has empty interior.
Proof. The sets Un = X\Cn, n ∈ N, are open and \overline{Un} = \overline{X\Cn} = X\Cn° = X since the
interiors Cn° are empty. Thus the Un are dense, so that ∩n Un is dense by Baire's theorem, thus
\overline{∩n Un} = X. Thus with X\Ȳ = (X\Y)° we have

(∪n Cn)° = (X\∩n(X\Cn))° = X \ \overline{∩n(X\Cn)} = X \ \overline{∩n Un} = X\X = ∅,

i.e. ∪n Cn has empty interior. □
59
René-Louis Baire (1874-1932). French mathematician, proved this for Rn in his 1899 doctoral thesis. The gener-
alization is due to Hausdorff (1914).

A.22 Remark 1. There are many other ways of stating Baire's theorem, but most of the
alternative versions introduce additional terminology (nowhere dense sets, meager sets, sets of
first or second category, etc.) and tend to obscure the matter unnecessarily.
2. An intersection ∩n Un of a countable family {Un}n∈N of open sets is called a Gδ-set.
3. The proof implicitly used the axiom DCω of countable dependent choice. (Making this
explicit would be a tedious exercise.) Remarkably, one can prove that the (Zermelo-Fraenkel)
axioms of set theory (without any choice axiom) combined with Baire’s theorem imply DCω .
4. Some results usually proven using Baire’s theorem can alternatively be proven without it.
But in most cases, such alternative proofs will also use DCω and therefore not be better from
a foundational point of view. The proof of Theorem 8.2 is a rare exception. 2

A typical application of Baire’s theorem is the following (for a proof see, e.g., [47]):

A.23 Theorem There is a ‖·‖∞-dense Gδ-set F ⊆ C([0, 1], R) such that every f ∈ F is
nowhere differentiable.
Note that a single function f ∈ C([0, 1], R) that is nowhere differentiable can be written down
quite explicitly and constructively, for example f(x) = Σ_{n=1}^∞ 2^{−n} cos(2^n x). But for proving that
such functions are dense one needs Baire's theorem.
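The series defining this f converges uniformly, with tail after N terms bounded by Σ_{n>N} 2^{−n} = 2^{−N}; this (though of course not the nowhere-differentiability itself, which no finite computation can exhibit) is easy to check numerically:

```python
import math

def f_partial(x, N):
    # Partial sum of f(x) = sum_{n>=1} 2^{-n} cos(2^n x)
    return sum(2.0 ** (-n) * math.cos(2.0 ** n * x) for n in range(1, N + 1))

xs = [k / 100 for k in range(101)]
gaps = []
for N in (10, 20):
    # Distance between two partial sums; bounded by the tail estimate 2^{-N}
    gap = max(abs(f_partial(x, N + 20) - f_partial(x, N)) for x in xs)
    gaps.append((N, gap))
print(gaps)
```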

A.6 Tietze’s extension theorem


A.24 Theorem (Tietze-Urysohn extension theorem)^60 Let (X, τ) be normal, Y ⊆ X
closed and f ∈ Cb(Y, R). Then there exists f̂ ∈ Cb(X, R) such that f̂ ↾ Y = f and ‖f̂‖ = ‖f‖.
(In other words, the restriction map T : Cb(X, R) → Cb(Y, R), f ↦ f ↾ Y is surjective.)
Proof. Let f ∈ Cb(Y, R), where we may assume ‖f‖ = 1, i.e. f(Y) ⊆ [−1, 1]. Let A =
f^{−1}([−1, −1/3]) and B = f^{−1}([1/3, 1]). Then A, B are disjoint closed subsets of Y, which are
also closed in X since Y is closed. Thus by Urysohn's Lemma, there is a g ∈ C(X, [−1/3, 1/3])
such that g ↾ A = −1/3 and g ↾ B = 1/3. Thus ‖g‖X = 1/3 and ‖Tg − f‖Y ≤ 2/3. (You should
check this!) Now Lemma 9.2 is applicable with m = 1/3 and r = 2/3 and gives the existence of
f̂ ∈ C(X, R) with T f̂ = f and ‖f̂‖ = ‖f‖ (since m/(1 − r) = 1). □

The theorem is easily extended to C-valued functions.

A.7 The Stone-Weierstrass theorem


A.7.1 Weierstrass’ theorem
The following fundamental theorem of Weierstrass61 (1885) has been proven in many ways. A
fairly standard proof due to E. Landau (1908) involves convolution of f with a sequence {gn } of
functions that is a polynomial approximate unit, cf. e.g. [79, Vol. II, Section 3.8]. The following
proof, given in 1913 by Sergei Bernstein62 , has the advantage of using no integration.
60
H. F. F. Tietze (1880-1964), Austrian mathematician. He proved this for metric spaces (for which Urysohn’s
lemma is a triviality). The generalization to normal spaces is due to Urysohn.
61
Karl Theodor Wilhelm Weierstrass (1815-1897). German mathematician and one of the fathers of rigorous analysis.
62
Sergei Natanovich Bernstein (1880-1968). Russian/Soviet mathematician. Important contributions to approxima-
tion theory, probability, PDEs.

A.25 Theorem Let f ∈ C([a, b], F) and ε > 0. Then there exists a polynomial P ∈ F[x] such
that |f (x) − P (x)| ≤ ε for all x ∈ [a, b]. (As always, F ∈ {R, C}.)
Proof. It clearly suffices to prove this for the interval [0, 1]. For n ∈ N and x ∈ [0, 1], define

Pn(x) = Σ_{k=0}^n f(k/n) \binom{n}{k} x^k (1 − x)^{n−k}.

Clearly Pn is a polynomial of degree at most n, called a Bernstein polynomial. In view of

1 = 1^n = (x + (1 − x))^n = Σ_{k=0}^n \binom{n}{k} x^k (1 − x)^{n−k}    (A.1)

we have

f(x) − Pn(x) = Σ_{k=0}^n (f(x) − f(k/n)) \binom{n}{k} x^k (1 − x)^{n−k},

thus

|f(x) − Pn(x)| ≤ Σ_{k=0}^n |f(x) − f(k/n)| \binom{n}{k} x^k (1 − x)^{n−k}.    (A.2)

Since [0, 1] is compact and f : [0, 1] → F is continuous, it is bounded and uniformly continuous.
Thus there is M such that |f (x)| ≤ M for all x, and for each ε > 0 there is δ > 0 such that
|x − y| < δ ⇒ |f (x) − f (y)| < ε.
Let ε > 0 be given, and choose a corresponding δ > 0 as above. Let x ∈ [0, 1]. Define

A = {k ∈ {0, 1, . . . , n} | |k/n − x| < δ}.

For all k we have |f(x) − f(k/n)| ≤ 2M, and for k ∈ A we have |f(x) − f(k/n)| < ε. Thus with
(A.2) we have

|f(x) − Pn(x)| ≤ ε Σ_{k∈A} \binom{n}{k} x^k (1 − x)^{n−k} + 2M Σ_{k∈A^c} \binom{n}{k} x^k (1 − x)^{n−k}
             ≤ ε + 2M Σ_{k∈A^c} \binom{n}{k} x^k (1 − x)^{n−k},    (A.3)

where we used (A.1) again. In an exercise, we will prove the purely algebraic identity
Σ_{k=0}^n \binom{n}{k} x^k (1 − x)^{n−k} (k − nx)^2 = nx(1 − x)    (A.4)

for all n ∈ N0 and x ∈ [0, 1] (in fact all x ∈ R). Now, k ∈ A^c is equivalent to |k/n − x| ≥ δ and
to (k − nx)^2 ≥ n^2 δ^2. Multiplying both sides of the latter inequality by \binom{n}{k} x^k (1 − x)^{n−k} and
summing over k ∈ A^c, we have

n^2 δ^2 Σ_{k∈A^c} \binom{n}{k} x^k (1 − x)^{n−k} ≤ Σ_{k∈A^c} \binom{n}{k} x^k (1 − x)^{n−k} (k − nx)^2
    ≤ Σ_{k=0}^n \binom{n}{k} x^k (1 − x)^{n−k} (k − nx)^2 = nx(1 − x),    (A.5)

where the last equality comes from (A.4). This implies
Σ_{k∈A^c} \binom{n}{k} x^k (1 − x)^{n−k} ≤ nx(1 − x)/(n^2 δ^2) ≤ 1/(n δ^2),    (A.6)

where we used the obvious inequality x(1 − x) ≤ 1 for x ∈ [0, 1]. Plugging (A.6) into (A.3)
we have |f(x) − Pn(x)| ≤ ε + 2M/(n δ^2). This holds for all x ∈ [0, 1] since, by uniform continuity, δ
depends only on ε, not on x. Thus for n > 2M/(ε δ^2) we have |f(x) − Pn(x)| ≤ 2ε ∀x ∈ [0, 1] and are
done. □

A.26 Exercise Prove (A.4). Hint: Use basic properties of the binomial coefficients or differ-
entiate (x + y)^n = Σ_{k=0}^n \binom{n}{k} x^k y^{n−k} twice with respect to x and then put y = 1 − x.
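The construction in the proof is easy to try out. The following sketch evaluates the Bernstein polynomials of the (arbitrarily chosen) test function f(x) = |x − 1/2| and spot-checks the identity (A.4) at one point; the sup-error indeed shrinks as n grows, at the slow rate of order n^{−1/2} typical for Bernstein approximation.

```python
import math

def bernstein(f, n, x):
    # P_n(x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k), as in the proof of Theorem A.25
    return sum(f(k / n) * math.comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)
xs = [k / 200 for k in range(201)]
errs = []
for n in (10, 100, 1000):
    errs.append(max(abs(f(x) - bernstein(f, n, x)) for x in xs))
print(errs)  # decreasing sup-errors

# Spot-check of (A.4): sum_k C(n,k) x^k (1-x)^(n-k) (k - n x)^2 = n x (1 - x)
n, x = 17, 0.3
lhs = sum(math.comb(n, k) * x ** k * (1 - x) ** (n - k) * (k - n * x) ** 2
          for k in range(n + 1))
print(lhs, n * x * (1 - x))
```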
An immediate consequence of Theorem A.25 is the following:

A.27 Corollary There exists a sequence {pn}n∈N ⊆ R[x] of real polynomials that converges
uniformly on [0, 1] to the function x ↦ √x.
The above corollary can also be proven directly:

A.28 Exercise Define a sequence {pn}n∈N0 ⊆ R[x] of polynomials by p0 = 0 and

pn+1(x) = pn(x) + (x − pn(x)^2)/2.    (A.7)

Prove by induction that the following holds:

(i) pn(x) ≤ √x for all n ∈ N0, x ∈ [0, 1].
(ii) The sequence {pn(x)} increases monotonously for each x ∈ [0, 1] and converges uniformly
to √x.
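The recursion (A.7) is easy to run; numerically one sees the monotone increase towards √x and that the sup-error over [0, 1], while decaying slowly (the convergence is worst near 0), does go to zero:

```python
def p(n, x):
    # n-fold application of the recursion p_{k+1}(x) = p_k(x) + (x - p_k(x)^2)/2, p_0 = 0
    val = 0.0
    for _ in range(n):
        val += (x - val * val) / 2
    return val

xs = [k / 100 for k in range(101)]
sup_errs = []
for n in (5, 50, 500):
    sup_errs.append(max(x ** 0.5 - p(n, x) for x in xs))
print(sup_errs)  # decreasing to 0; and p_n(x) <= sqrt(x) throughout
```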

A.7.2 The Stone-Weierstrass theorem in the simplest case


Theorem A.25 says that the polynomials, restricted to [0, 1], are uniformly dense in C([0, 1]).
Our aim is to generalize this, replacing [0, 1] by (locally) compact Hausdorff spaces.
In order to see what should take the place of polynomials, notice that a polynomial on R is
a linear combination of powers xn , and the latter can be seen as powers f n (under pointwise
multiplication) of the identity function f = idR . Thus the polynomials are the unital subalgebra
P ⊆ C(R, R) generated by the single element idR . Now, if X is a topological space and F ∈
{R, C} then C(X, F) is a unital algebra, and we will consider subalgebras (not necessarily singly
generated) A ⊆ C(X, F). Since the continuous functions on a (locally) compact Hausdorff space separate points, we clearly need to impose the following if we want to prove that A is dense in C(X, F):

A.29 Definition A subalgebra A ⊆ C(X, F) separates points if for any x, y ∈ X with x ≠ y there is an f ∈ A such that f(x) ≠ f(y).

A.30 Theorem (M. H. Stone 1937) If X is compact Hausdorff and A ⊆ C(X, R) is a unital subalgebra separating points then A is uniformly dense in C(X, R).

Proof. Replacing A by its closure (which is again a unital subalgebra separating points), we may and do assume that A is closed; we must then show A = C(X, R). We proceed in several steps.
First we claim that f ∈ A implies |f| ∈ A. Since f is bounded due to compactness, it clearly is enough to prove this under the assumption ‖f‖ ≤ 1. With the pn of Corollary A.27 we have (x ↦ pn(f(x)²)) ∈ A since A is a unital algebra. Since pn ∘ f² converges uniformly to √(f²) = |f|, closedness of A implies |f| ∈ A. In view of

max(f, g) = (f + g + |f − g|)/2,   min(f, g) = (f + g − |f − g|)/2,

and the preceding result, we see that f, g ∈ A implies min(f, g), max(f, g) ∈ A. By induction, this extends to pointwise minima/maxima of finite families of elements of A.
Now let f ∈ C(X, R). Our goal is to find fε ∈ A satisfying ‖f − fε‖ < ε for each ε > 0. Since A is closed, this will give A = C(X, R).
If a ≠ b, the fact that A separates points gives us an h ∈ A such that h(a) ≠ h(b). Thus the function ha,b(x) = (h(x) − h(a))/(h(b) − h(a)) is in A, continuous and satisfies ha,b(a) = 0, ha,b(b) = 1. Thus also fa,b(x) = f(a) + (f(b) − f(a)) ha,b(x) is in A, and it satisfies fa,b(a) = f(a) and fa,b(b) = f(b). This implies that the sets

Ua,b,ε = {x ∈ X | fa,b(x) < f(x) + ε},   Va,b,ε = {x ∈ X | fa,b(x) > f(x) − ε}

are open neighborhoods of a and b, respectively, for every ε > 0. Thus keeping b, ε fixed, {Ua,b,ε}a∈X is an open cover of X, and by compactness we find a finite subcover Ua1,b,ε, ..., Uan,b,ε. By the above preparation, the function fb,ε = min(fa1,b, ..., fan,b) is in A. If x ∈ Uai,b,ε then fb,ε(x) ≤ fai,b(x) < f(x) + ε, and since the Uai,b,ε cover X, we have fb,ε(x) < f(x) + ε ∀x ∈ X. For all x ∈ Vb,ε = ∩i Vai,b,ε we have fai,b(x) > f(x) − ε for every i, and therefore fb,ε(x) = mini fai,b(x) > f(x) − ε. Now {Vb,ε}b∈X is an open cover of X (note b ∈ Vb,ε since fa,b(b) = f(b)), and we find a finite subcover Vb1,ε, ..., Vbm,ε. Then fε = max(fb1,ε, ..., fbm,ε) is in A. Now fε(x) = maxj fbj,ε(x) < f(x) + ε holds everywhere, and for x ∈ Vbj,ε we have fε(x) ≥ fbj,ε(x) > f(x) − ε. Since the Vbj,ε cover X, we conclude that fε(x) ∈ (f(x) − ε, f(x) + ε) for all x, to wit ‖f − fε‖ < ε. □
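The two reduction steps in the proof — recovering |f| via the polynomials pn of Exercise A.28, and max/min via |·| — can be checked numerically; this is only an illustration, with arbitrary sample values:

```python
import random

def p(n, x):
    # the polynomials of Exercise A.28, converging uniformly to sqrt(x) on [0, 1]
    v = 0.0
    for _ in range(n):
        v += (x - v * v) / 2
    return v

random.seed(0)
for _ in range(100):
    s, t = random.uniform(-1, 1), random.uniform(-1, 1)
    # the lattice identities used in the proof
    assert abs(max(s, t) - (s + t + abs(s - t)) / 2) < 1e-12
    assert abs(min(s, t) - (s + t - abs(s - t)) / 2) < 1e-12
    # p_n(f(x)^2) approximates |f(x)| when |f(x)| <= 1
    assert abs(p(500, s * s) - abs(s)) < 0.01
```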

Since the polynomial ring R[x] is an algebra, and the polynomials clearly separate the points
of R, Theorem A.30 recovers Theorem A.25. (This is not circular if one has used Exercise A.28 to
prove Corollary A.27.) But we immediately have the higher dimensional generalization (which
can also be proven by more classical methods, like approximate units):

A.31 Theorem Let X ⊆ Rn be compact. Then the restrictions to X of the P ∈ R[x1 , . . . , xn ]


(considered as functions) are uniformly dense in C(X, R).

A.7.3 Generalizations
Having proven Theorem A.30, it is easy to generalize it to locally compact spaces and/or subalgebras of C(X, C) resp. C0(X, C).63 Recall that a subset S of a ∗-algebra A is called self-adjoint if S = S∗ := {s∗ | s ∈ S}.

A.32 Corollary If X is compact Hausdorff and A ⊆ C(X, C) is a self-adjoint unital subalgebra separating points then A is uniformly dense in C(X, C).
63
Some authors, mostly operator algebraists, write C(X) for C(X, C), whereas topologists put C(X) = C(X, R). I
don’t use C(X).


Proof. Define B = A ∩ C(X, R) and let f ∈ A. Since f∗ ∈ A, we also have Re(f) = (f + f∗)/2 ∈ B and Im(f) = (f − f∗)/(2i) = −Re(if) ∈ B. Thus A = B + iB. It is obvious that B ⊆ C(X, R) is a unital subalgebra. If x ≠ y then, since A separates points, there is f ∈ A such that f(x) ≠ f(y). Thus Re(f)(x) ≠ Re(f)(y) or Re(if)(x) ≠ Re(if)(y) (or both). Since Re(f), Re(if) ∈ B, we see that B separates points. Thus B is dense in C(X, R) by Theorem A.30. Since the closure of A contains the closures of B and iB and is closed under addition, it contains C(X, R) + iC(X, R) = C(X, C). □

A.33 Definition A subalgebra A ⊆ C0 (X, F) vanishes at no point if for every x ∈ X there is


an f ∈ A such that f (x) 6= 0.

A.34 Corollary If X is locally compact Hausdorff and A ⊆ C0(X, R) is a subalgebra separating points and vanishing at no point then A is uniformly dense in C0(X, R).
Proof. Let X∞ = X ∪ {∞} be the one-point compactification of X. Recall that every f ∈ C0(X, R) extends to f̂ ∈ C(X∞, R) with f̂(∞) = 0. Then B = {f̂ | f ∈ A} + R1 clearly is a unital subalgebra of C(X∞, R). We claim that B separates the points of X∞. This is obvious for x, y ∈ X, x ≠ y, since already A does that. Now let x ∈ X. Since A vanishes at no point, there is f ∈ A such that f(x) ≠ 0. In view of f̂(x) = f(x) ≠ 0 = f̂(∞), we see that B also separates ∞ from the points of X, so that Theorem A.30 gives that B is dense in C(X∞, R). Now let f ∈ C0(X, R) and ε > 0. By density there are g ∈ A and c ∈ R with ‖f̂ − (ĝ + c1)‖ < ε; evaluating at ∞ gives |c| < ε, whence ‖f − g‖ < 2ε on X. Thus A is uniformly dense in C0(X, R). □

A.35 Corollary If X is locally compact Hausdorff and A ⊆ C0(X, C) is a self-adjoint subalgebra separating points and vanishing at no point then A is uniformly dense in C0(X, C).
Proof. The proof just combines the ideas of the proofs of Corollaries A.32 and A.34. 

A.8 Totally bounded sets in metric spaces


Recall that a metric space (X, d) is totally bounded if for every ε > 0 there are x1 , . . . , xn ∈ X
such that X = B(x1 , ε) ∪ · · · ∪ B(xn , ε). And: A metric space is compact if and only if it is
complete and totally bounded, cf. e.g. [47]. We will need the following:

A.36 Exercise Let (X, d) be a metric space. Prove:


(i) If (X, d) is totally bounded and Y ⊆ X then (Y, d) is totally bounded.
(ii) If (Y, d) is totally bounded and Y ⊆ X is dense then (X, d) is totally bounded.
(iii) If (X, d) is complete and Y ⊆ X then (Y, d) is totally bounded if and only if Y is precom-
pact.

A.9 The Arzelà-Ascoli theorem


If (X, τ) is a topological space and (Y, d) metric, the set Cb(X, Y) is topologized by the metric

D(f, g) = sup_{x∈X} d(f(x), g(x)).

It is therefore natural to ask whether the (relative) compactness of a set F ⊆ Cb (X, Y ) can
be characterized in terms of the elements of F, which after all are functions f : X → Y .
This will be the subject of this section, but we will restrict ourselves to compact X, for which
C(X, Y ) = Cb (X, Y ).

128
A.37 Definition Let (X, τ) be a topological space and (Y, d) a metric space. A family F of functions X → Y is called equicontinuous if for every x ∈ X and ε > 0 there is an open neighborhood U ∋ x such that f ∈ F, x′ ∈ U ⇒ d(f(x), f(x′)) < ε. Then F ⊆ C(X, Y).
The point of course is that the choice of U depends only on x and ε, but not on f ∈ F.
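For intuition (an illustration, not from the notes): the family fn(x) = xⁿ on [0, 1] is pointwise bounded but not equicontinuous at x = 1, which by Theorem A.38 below is why it has no uniformly convergent subsequence:

```python
def osc_at_1(delta, N=2000):
    # sup over n <= N of |fn(1) - fn(1 - delta)| for fn(x) = x**n
    return max(1 - (1 - delta) ** n for n in range(1, N + 1))

# shrinking the neighborhood of x = 1 does not shrink the
# oscillation uniformly over the family: no single U works for all n
print(osc_at_1(0.1), osc_at_1(0.01), osc_at_1(0.001))
```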

A.38 Theorem (Arzelà-Ascoli) 64 Let (X, τ ) be a compact topological space and (Y, d) a
complete metric space. Then F ⊆ C(X, Y ) is (pre)compact (w.r.t. the uniform topology τD ) if
and only if
• {f (x) | f ∈ F} ⊆ Y is (pre)compact for every x ∈ X,
• F is equicontinuous.
Proof. ⇒ If f, g ∈ C(X, Y) then d(f(x), g(x)) ≤ D(f, g) for every x ∈ X. This implies that the evaluation map ex : C(X, Y) → Y, f ↦ f(x) is continuous for every x. Thus if the closure of F is compact, so is its image under ex. Being compact, this image is closed, and it contains ex(F) = {f(x) | f ∈ F}; thus the closure of {f(x) | f ∈ F} is a closed subset of a compact set, hence compact.
To prove equicontinuity, let x ∈ X and ε > 0. Since the closure of F is compact, F is totally bounded, thus there are g1, ..., gn ∈ F such that F ⊆ ∪i BD(gi, ε). By continuity of the gi, there are open Ui ∋ x, i = 1, ..., n, such that x′ ∈ Ui ⇒ d(gi(x), gi(x′)) < ε. Put U = ∩i Ui. If now f ∈ F, there is an i such that f ∈ BD(gi, ε), to wit D(f, gi) < ε. Now for x′ ∈ U ⊆ Ui we have

d(f(x), f(x′)) ≤ d(f(x), gi(x)) + d(gi(x), gi(x′)) + d(gi(x′), f(x′)) < 3ε,

proving equicontinuity of F (at x, but x was arbitrary).


⇐ We first prove a lemma:

A.39 Lemma Let (X, d) be a metric space. Assume that for each ε > 0 there are a δ > 0, a
metric space (Y, d0 ) and a continuous map h : X → Y such that (h(X), d0 ) is totally bounded
and such that d0 (h(x), h(x0 )) < δ implies d(x, x0 ) < ε. Then (X, d) is totally bounded.
Proof. For ε > 0, pick δ, (Y, d0 ), h S as asserted. Since h(X) is S totally bounded, there are
y1 , . . . , yn ∈ h(X) such that h(X) ⊆ i B(yi , δ) ⊆ Y . Then X = i h−1 (B(yi , δ)). For each i
choose xi ∈ X such that h(xi ) = yi . Now x S ∈ h−1 (B(yi , δ)) ⇒ d0 (h(x), yi ) < δ ⇒ d(x, xi ) < ε,
so that h−1 (B(yi , δ)) ⊆ B(xi , ε). Thus X = ni=1 B(xi , ε), and (X, d) is totally bounded. 

Let ε > 0. Since F is equicontinuous, for every x ∈ X there is an open neighborhood Ux such that f ∈ F, x′ ∈ Ux ⇒ d(f(x), f(x′)) < ε. Since X is compact, there are x1, ..., xn ∈ X such that X = ∪i Uxi. Now define h : F → Y×n, f ↦ (f(x1), ..., f(xn)). Then d̃((y1, ..., yn), (y′1, ..., y′n)) = Σi d(yi, y′i) is a product metric on Y×n making h continuous. By assumption the closure of {f(x) | f ∈ F} is compact for each x ∈ X, thus h(F) is contained in the compact product of the closures of the sets {f(xi) | f ∈ F} ⊆ Y, so that (h(F), d̃) is totally bounded. If now f, g ∈ F satisfy d̃(h(f), h(g)) < ε then d(f(xi), g(xi)) < ε ∀i by definition of d̃. For every x ∈ X there is i such that x ∈ Uxi, thus

d(f(x), g(x)) ≤ d(f(x), f(xi)) + d(f(xi), g(xi)) + d(g(xi), g(x)) < 3ε.

Since this holds for all x ∈ X, we have D(f, g) ≤ 3ε. Thus the assumptions of Lemma A.39 are satisfied, and we obtain total boundedness, thus precompactness, of F. □

64
Giulio Ascoli (1843-1896), Cesare Arzelà (1847-1912), Italian mathematicians. They proved special cases of this
result, of which there also exist more general versions than the one above.

A.40 Remark 1. If Y = Rn , as in most statements of the theorem, then in view of the Heine-
Borel theorem the requirement of precompactness of {f (x) | f ∈ F } for each x reduces to
that of boundedness, i.e. pointwise boundedness of F. One can also formulate the theorem in
terms of existence of uniformly convergent (or Cauchy) subsequences of bounded equicontinuous
sequences in C(X, Rn ).
2. We intentionally stated a more general version of the theorem than needed in order to
argue that the result belongs to general topology rather than functional analysis. For Y = Rn
this is less clear, also since there are many alternative proofs of the theorem using various
methods from topology and functional analysis, cf. e.g. [51]. (This is no surprise since, as
explained in [47], the theorems of Alaoglu, the Stone-Čech compactification and Tychonov’s
theorem for Hausdorff spaces are all equivalent, i.e. easily deducible from each other.) 2

A.10 Some notions from measure and integration theory


A.41 Definition If X is a set, a σ-algebra on X is a family A ⊆ P(X) of subsets such that

1. ∅ ∈ A.
2. If A ∈ A then X\A ∈ A.
3. If {An}n∈N ⊆ A then ∪_{n=1}^∞ An ∈ A.

A measurable space is a pair (X, A) consisting of a set and a σ-algebra on it.
The closedness of A under complements implies that a σ-algebra also contains X and is closed under countable intersections. Obviously P(X) is a σ-algebra.
It is very easy to see that the intersection of any number of σ-algebras on X is a σ-algebra
on X. Thus if F ⊆ P (X) is any family of subsets of X, we can define the σ-algebra generated
by F as the intersection of all σ-algebras on X that contain F.
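On a finite set X the generated σ-algebra can also be computed by brute force, closing under complements and pairwise unions (which suffices, since here all unions are finite); an illustrative sketch, not from the notes:

```python
def generated_sigma_algebra(X, F):
    # smallest family containing F, X, and the empty set,
    # closed under complement and (finite) union
    X = frozenset(X)
    s = {frozenset(A) for A in F} | {frozenset(), X}
    while True:
        new = {X - a for a in s} | {a | b for a in s for b in s}
        if new <= s:
            return s
        s |= new

gen = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
# the atoms are {1}, {2}, {3,4}: the generators cannot separate 3 from 4
print(len(gen))  # 8 = 2**3
```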
If (X, τ ) is a topological space, the σ-algebra on X generated by τ is called the Borel65
σ-algebra B(X) of X. (We should of course write B(X, τ ). . . ) Apart from the open sets, it
contains the closed sets, the Gδ sets and many more. A function f : X → C is called Borel
measurable if f −1 (U ) ∈ B(X) for every open U ⊆ C. (This is equivalent to f −1 (B) ∈ B(X)
for every B ∈ B(C).) If (X, A) is a measurable space, B ∞ (X, C) denotes the set of functions
f : X → C that are Borel-measurable and bounded, i.e. supx∈X |f (x)| < ∞. It is not hard to
check that this is an algebra (with the pointwise product).

A.42 Definition A positive measure on a measurable space (X, A) is a map µ : A → [0, ∞] such that µ(∅) = 0 and µ(∪_{n=1}^∞ An) = Σ_{n=1}^∞ µ(An) whenever {An} ⊆ A is a countable family of mutually disjoint sets. If µ(X) < ∞ then µ is called finite (then µ(A) ≤ µ(X) ∀A ∈ A).
A Borel measure on a topological space (X, τ) is a positive measure on (X, B(X)).
There is a notion of regularity of a measure. Since we will only consider measures on compact
subsets of C, which are second countable, regularity of all finite Borel measures is automatic.
(This follows e.g. from [63, Theorem 2.18].)
For the definition of integration of real or complex valued functions w.r.t. a measure see any
book on measure theory or the appendix of [42].
The counting measure on (X, P(X)) is defined by µc(A) = #(A). It is easy to show that f : X → C (obviously measurable) is µc-integrable if and only if Σ_{x∈X} f(x) exists, in which case ∫ f(x) dµc(x) = Σ_{x∈X} f(x).
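In the finite case this is just summation; a minimal illustration (the function f below is an arbitrary choice):

```python
# counting measure µ_c on X = {0,...,4}: integration is summation
f = {x: (-1) ** x * x for x in range(5)}   # f(x) = (-1)^x * x
integral = sum(f.values())                  # integral of f w.r.t. µ_c
print(integral)  # 0 - 1 + 2 - 3 + 4 = 2
```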
65
Emile Borel (1871-1956). French mathematician. One of the pioneers of measure theory.

A.43 Definition A complex measure on a measurable space (X, A) is a map µ : A → C such that µ(∅) = 0 and µ(∪_{n=1}^∞ An) = Σ_{n=1}^∞ µ(An) whenever {An} ⊆ A is a countable family of mutually disjoint sets.
Note that complex measures are by definition bounded. Furthermore, if {An} is a countable family of mutually disjoint sets then automatically Σn |µ(An)| < ∞, since µ(∪n An) is invariant under permutations of the An, so that the series converges unconditionally, hence absolutely.

B Supplements for the curious. NOT part of the


course
B.1 Functional analysis over fields other than R and C?
The most general meaningful definition of (linear) functional analysis is as the theory of topo-
logical vector spaces over a topological field F and continuous linear maps between them. If
(the topology of) F is discrete, we are effectively doing topological abelian group theory, and
this would not be considered functional analysis. Thus we are restrict ourselves to non-discrete
topological fields. The general theory of topological fields is a thorny subject, almost unknown
to non-specialists. (For reviews see [84, 82].) There would be no point in going into this here
since in this course we considered general topological vector spaces only as a step towards spaces
that are at least metrizable.
But there is a complete (in a sense) classification of the non-discrete locally compact fields.
In characteristic zero, these are precisely R, the p-adic fields Qp , where p runs through the prime
numbers, and all their finite (thus algebraic) extensions. (And in characteristic p 6= 0 one has
the finite extensions of Fp ((x)), the field of formal Laurent series over the finite field Fp .) For
a proof see e.g. [58]. While R has only one algebraic extension (namely C), Qp has infinitely
many finite extensions, so that the algebraic closure of Qp (which is not complete!) is infinite
dimensional over Qp . Like R and C, the p-adic fields and their finite extensions all have a norm,
usually called ‘valuation’ or ‘absolute value’, i.e. a map F → [0, ∞) satisfying |x| = 0 ⇔ x = 0,
|x + y| ≤ |x| + |y| and |xy| = |x||y|. Note that the norm is strictly multiplicative, not just
submultiplicative. The locally compact fields are complete w.r.t. their absolute value | · |.
Books entitled ‘Functional analysis’ or ‘Topological vector spaces’ tend to work entirely over
R and C unless the title contains ‘p-adic’, ‘non-archimedean’ or ‘ultrametric’ (but there are
exceptions like [6, 52]). Nevertheless, functional analysis over p-adic fields is a well-studied
subject, cf. e.g. [62, 57], but a somewhat exotic one since it only seems to have applications to
number theory, algebraic geometry and related fields.66
In the remainder of this short section we briefly comment on the extent to which the theory
covered in these notes remains valid over p-adic fields. As a rule of thumb, one must be very
careful with theorems on normed/Banach spaces that involve R or C either in their statement or
in the proof. Since then either the orderedness of R or the algebraic completeness of C tend to
be used, and the p-adic fields are neither algebraically closed nor orderable! The Hahn-Banach
theorem is a case in point since we first proved it for R, making essential use of the orderedness
of the base field F = R, thus not just of the set [0, ∞) in which the norms take values, and
then extended it to C. (There nevertheless is a p-adic Hahn-Banach theorem, but with slightly
different hypotheses and a different proof.)
66 I am sceptical about claims of relevance of p-adic/ultrametric (functional) analysis to fundamental theoretical/mathematical physics (but statistical/condensed matter physics is another discussion).

Theorems not explicitly referring to R or C have a better chance of carrying over to p-adic
functional analysis. For example, the open mapping theorem and both versions of the uniform
boundedness theorem generalize without change. However, one has to be careful with the above
rule since there are properties, like connectedness, shared by R and C, but not enjoyed by the
p-adic fields! There are other problems: There is no a priori relationship between the subsets
S1 = {|c| | c ∈ F} and S2 = {kxk | x ∈ V } of [0, ∞). Thus given x ∈ V \{0} there may not be a
c ∈ F such that kcxk = 1.
We also have to be very careful with results on Hilbert spaces, since scalars in F can be
pulled out of inner products without picking up an absolute value: hcx, yi = chx, yi. Indeed
this leads to problems adapting the proof of Theorem 5.24. The same holds for the polarization
identities.
We leave the discussion here and refer to the literature on p-adic (functional) analysis for
more information. See e.g. [24, 61, 62, 57].

B.2 The dual space of `∞ (S, F)


We have seen in Theorem 4.16(v) that there are bounded linear functionals ϕ ∈ ℓ∞(S, F)∗ that vanish on c0(S, F). Those clearly cannot be captured by the function g(s) = ϕ(δs) used in the proof of Theorem 4.16. This suggests to consider µϕ(A) = ϕ(χA) for arbitrary A ⊆ S instead. If A1, ..., AK are mutually disjoint and A = ∪_{k=1}^K Ak then χA = Σ_{k=1}^K χAk, thus µϕ(A) = Σ_{k=1}^K µϕ(Ak), so that µϕ is finitely additive.67

B.1 Definition If S is a set, a finitely additive finite F-valued measure on S is a map µ : P(S) → F satisfying µ(∅) = 0 and µ(A1 ∪ · · · ∪ AK) = µ(A1) + · · · + µ(AK) whenever A1, ..., AK are mutually disjoint subsets of S. The set of such µ, which we denote fa(S, F), is a vector space via (c1µ1 + c2µ2)(A) = c1µ1(A) + c2µ2(A). For µ ∈ fa(S, F) we define

‖µ‖ = sup { Σ_{k=1}^K |µ(Ak)| : K ∈ N, A1, ..., AK ⊆ S, i ≠ j ⇒ Ai ∩ Aj = ∅ },
‖µ‖′ = sup_{A⊆S} |µ(A)|.

B.2 Theorem (i) ‖·‖ and ‖·‖′ are equivalent norms on fa(S, F). We write

ba(S, F) = {µ ∈ fa(S, F) | ‖µ‖′ < ∞ (⇔ ‖µ‖ < ∞)}.

(ii) (ba(S, F), ‖·‖) is a Banach space.

(iii) If ϕ ∈ ℓ∞(S, F)∗ then ‖µϕ‖ ≤ ‖ϕ‖, thus we have a norm-decreasing linear map ℓ∞(S, F)∗ → ba(S, F), ϕ ↦ µϕ.
Proof. (i) It is immediate from the definitions that ‖cµ‖ = |c|‖µ‖ and ‖cµ‖′ = |c|‖µ‖′ for all c ∈ F, µ ∈ fa(S, F), and that ‖µ‖ = 0 ⇔ µ = 0 ⇔ ‖µ‖′ = 0. Also ‖µ1 + µ2‖′ ≤ ‖µ1‖′ + ‖µ2‖′ is quite obvious. Now

‖µ1 + µ2‖ = sup { Σ_k |µ1(Ak) + µ2(Ak)| : · · · } ≤ sup { Σ_k (|µ1(Ak)| + |µ2(Ak)|) : · · · }
  ≤ sup { Σ_k |µ1(Ak)| : · · · } + sup { Σ_k |µ2(Ak)| : · · · } = ‖µ1‖ + ‖µ2‖.
67
The discussion in this section strongly borrows from [17].

Thus ‖·‖, ‖·‖′ are norms on fa(S, F). The definition of ‖·‖ clearly implies |µ(A)| ≤ ‖µ‖ for each A ⊆ S, whence ‖µ‖′ ≤ ‖µ‖.
Assume µ ∈ fa(S, R) and ‖µ‖′ < ∞. If A1, ..., AK ⊆ S are mutually disjoint, put

A+ = ∪{Ak | µ(Ak) ≥ 0},   A− = ∪{Ak | µ(Ak) < 0}.

Now by finite additivity, Σ_k |µ(Ak)| = µ(A+) − µ(A−) ≤ 2‖µ‖′ since |µ(A±)| ≤ ‖µ‖′. Taking the supremum over the families {Ak} gives ‖µ‖ ≤ 2‖µ‖′.
If µ ∈ fa(S, C), writing µ = Re µ + i Im µ we find ‖µ‖ ≤ 4‖µ‖′. Thus ‖µ‖′ ≤ ‖µ‖ ≤ 4‖µ‖′ for all µ, and the two norms are equivalent.
(ii) Here it is more convenient to work with the simpler norm ‖·‖′. Now let {µn} be a Cauchy sequence in ba(S, F). Then |µn(A) − µm(A)| ≤ ‖µn − µm‖′, so that {µn(A)} is Cauchy, thus convergent, for every A ⊆ S. Define µ(A) = limn µn(A). It is clear that µ(∅) = 0. If A1, ..., AK are mutually disjoint then

µ(A1 ∪ · · · ∪ AK) = lim_{n→∞} µn(A1 ∪ · · · ∪ AK) = lim_{n→∞} (µn(A1) + · · · + µn(AK)) = µ(A1) + · · · + µ(AK),

so that µ is finitely additive. Since {µn} is Cauchy, for every ε > 0 there is n0 such that n, m ≥ n0 implies ‖µm − µn‖′ < ε. In particular there is n0 such that ‖µm‖′ ≤ ‖µn0‖′ + 1 for m ≥ n0. This implies boundedness of µ. And taking m → ∞ in |µn(A) − µm(A)| ≤ ‖µn − µm‖′ < ε gives ‖µn − µ‖′ ≤ ε, so that ‖µn − µ‖′ → 0. Thus ba(S, F) is complete (w.r.t. ‖·‖′, thus also w.r.t. ‖·‖).
(iii) It is clear that ℓ∞(S, F)∗ → fa(S, F), ϕ ↦ µϕ is linear. Now let A1, ..., AK ⊆ S be mutually disjoint. Then

Σ_{k=1}^K |µϕ(Ak)| = Σ_{k=1}^K sgn(µϕ(Ak)) µϕ(Ak) = Σ_{k=1}^K sgn(µϕ(Ak)) ϕ(χAk) = ϕ( Σ_{k=1}^K sgn(µϕ(Ak)) χAk ).

Since the Ak are mutually disjoint and |sgn(z)| ≤ 1, we have ‖Σ_{k=1}^K sgn(µϕ(Ak)) χAk‖∞ ≤ 1, so that Σ_{k=1}^K |µϕ(Ak)| ≤ ‖ϕ‖. Taking the supremum over the finite families {Ak} gives ‖µϕ‖ ≤ ‖ϕ‖. □

B.3 Theorem (i) For each µ ∈ ba(S, F) there is a unique linear functional ∫µ ∈ ℓ∞(S, F)∗ such that ∫µ(χA) = µ(A) for all A ⊆ S. It satisfies ‖∫µ‖ ≤ ‖µ‖.

(ii) The maps α : ℓ∞(S, F)∗ → ba(S, F), ϕ ↦ µϕ and ∫ : ba(S, F) → ℓ∞(S, F)∗, µ ↦ ∫µ are mutually inverse and isometric, thus ℓ∞(S, F)∗ ≅ ba(S, F).

Proof. (i) If f ∈ ℓ∞(S, F) has finite image, write f = Σ_{k=1}^K ck χAk, where the Ak are mutually disjoint, and define

∫ f dµ = Σ_{k=1}^K ck µ(Ak).

(We write ∫µ(f) or ∫ f dµ according to convenience.) If f = Σ_{l=1}^L c′l χA′l is another such representation of f, then using the finite additivity of µ it is straightforward to check that Σ_{k=1}^K ck µ(Ak) = Σ_{l=1}^L c′l µ(A′l), so that ∫ f dµ is well-defined. Now ∫ cf dµ = c ∫ f dµ for c ∈ F is obvious, and ∫ (f + g) dµ = ∫ f dµ + ∫ g dµ for all finite-image functions follows from the fact that f + g again is a finite-image function and the representation independence of ∫.
Thus ∫µ : f ↦ ∫ f dµ is a linear functional on the bounded finite-image functions. It is clear that this is the unique linear functional sending χA to µ(A) for each A ⊆ S. Now

|∫ f dµ| ≤ Σ_{k=1}^K |ck| |µ(Ak)| ≤ ‖f‖∞ Σ_{k=1}^K |µ(Ak)| ≤ ‖f‖∞ ‖µ‖.

Thus ∫µ is a bounded functional, and since the bounded finite-image functions are dense in ℓ∞(S, F) by Lemma 4.13, ∫µ has a unique extension to a linear functional ∫µ ∈ ℓ∞(S, F)∗ with ‖∫µ‖ ≤ ‖µ‖.
(ii) If µ ∈ ba(S, F) then by definition of ∫µ we have ∫ χA dµ = µ(A) for all A ⊆ S. Thus α ∘ ∫ = id on ba(S, F).
If ϕ ∈ ℓ∞(S, F)∗ then in view of the definition of ∫ we have ∫ χA dµϕ = µϕ(A) = ϕ(χA) for all A ⊆ S. Thus ϕ and ∫µϕ coincide on all characteristic functions, thus on all of ℓ∞(S, F) by linearity, density of the finite-image functions and the ‖·‖∞-continuity of ϕ and ∫µϕ. Thus ∫ ∘ α = id on ℓ∞(S, F)∗.
Since the maps α and ∫ are mutually inverse and both norm-decreasing, they actually both are isometries. □

This completes the determination of ℓ∞(S, F)∗. (Note that we did not use the completeness of ba(S, F) proven in Theorem B.2(ii). Thus it would also follow from the isometric bijection ba(S, F) ≅ ℓ∞(S, F)∗ just established.)

B.4 Exercise Given µ ∈ ba(S, F), prove that µ is {0, 1}-valued if and only if ∫µ ∈ ℓ∞(S, F)∗ is a character, i.e. ∫µ(fg) = ∫µ(f) ∫µ(g) for all f, g ∈ ℓ∞(S, F).

Since `∞ (S, F)∗ has a closed subspace ι(`1 (S, F)), it is interesting to identify the correspond-
ing subspace of ba(S, F).

B.5 Definition A finitely additive measure µ ∈ ba(S, F) is called countably additive if for every countable family A ⊆ P(S) of mutually disjoint sets we have

µ(∪_{A∈A} A) = Σ_{A∈A} µ(A),

and totally additive if the same holds for any family of mutually disjoint sets. The sets of countably and totally additive measures on S are denoted ca(S, F) and ta(S, F), respectively.

B.6 Proposition For µ ∈ ba(S, F), consider the following statements:

(i) There is g ∈ ℓ1(S, F) such that µ(A) = Σ_{s∈A} g(s) for all A ⊆ S.

(ii) ∫µ ∈ ℓ∞(S, F)∗ is normal, i.e. ∫ f dµ = limι ∫ fι dµ for every net {fι} ⊆ F^S that is pointwise convergent (to f) and uniformly bounded.

(iii) µ is totally additive.

(iv) µ is countably additive.

Then (i)⇔(ii)⇔(iii)⇒(iv). If S is countable then also (iv)⇒(iii).

Proof. (i)⇒(ii) If µ is of the given form then clearly ∫ χA dµ = µ(A) = Σ_{s∈A} g(s) for each A ⊆ S. By the way ∫µ is constructed from µ, it is clear that ∫ f dµ = Σ_{s∈S} f(s)g(s) for all f ∈ ℓ∞(S, F). Thus ∫µ = ϕg, and normality of ∫µ follows from Proposition 13.6.
(ii)⇒(iii) We know that we can recover µ from ∫µ as µ(A) = ∫ χA dµ. Let A be a family of mutually disjoint subsets of S. Then the net {fF = χ_{∪F}}, indexed by the finite subsets F ⊆ A, is uniformly bounded and converges pointwise to χB, where B = ∪A. Now normality of ∫µ implies that µ(B) = ∫ χB dµ = limF ∫ fF dµ = limF Σ_{A∈F} µ(A) = Σ_{A∈A} µ(A), which is total additivity of µ.
(iii)⇒(i) If we put g(s) = µ({s}) then total additivity of µ means that µ(A) = Σ_{s∈A} g(s) for all A ⊆ S, convergence being unconditional, hence absolute. Together with the boundedness of µ this gives ‖g‖1 < ∞.
(iii)⇒(iv) is trivial. If S is countable then a family of mutually disjoint non-empty subsets of S is at most countable, so that (iii) and (iv) are equivalent. □
Thus we have the situation of the following diagram of isometric isomorphisms and inclusions:

                 ℓ1(S, F)
               ≅ ↙      ↘ ≅
  (ℓ∞(S, F)∗)n  ——≅——→  ta(S, F)
       ∩                    ∩
  ℓ∞(S, F)∗     ——≅——→  ba(S, F)

where ta(S, F) can be replaced by ca(S, F) if S is countable.

B.3 c0 (N, F) ⊆ `∞ (N, F) is not complemented


B.7 Definition We say that a Banach space V has property S if there is a countable subset
C ⊆ V ∗ separating the points of V . I.e., if x ∈ V and ϕ(x) = 0 for all ϕ ∈ C then x = 0.
If V has property S then every closed subspace W ⊆ V has property S. (And so would
non-closed subspaces, but they are not Banach.)
It is easy to see that V has property S whenever V ∗ is separable. But this is not a necessary
condition: V = `∞ (N, C) has property S, as we see by taking C = {ϕn }n∈N , where ϕn (f ) = f (n).
But V ∗ (∼= ba(N, C)) is not separable, since by Exercise 7.20 this would imply separability of V ,
which is false by Exercise 4.15(i).

B.8 Theorem c0 (N, R) ⊆ `∞ (N, R) is not complemented.


Proof. From now on we abbreviate `∞ (N, F) and c0 (N, F) as `∞ , c0 . Our strategy for proving
that c0 ⊆ `∞ is not complemented is the following: If c0 ⊆ `∞ had a complementary closed
subspace W , Exercise 9.11 would give `∞ ∼ = c0 ⊕ W , thus `∞ /c0 ∼
= W . Since W would have

property S, it would follow that Q = ` /c0 has property S, but we will prove that it doesn’t!
The idea for doing so is to produce an uncountable subset F ⊆ Q such that each functional ϕ ∈ Q∗ is non-zero only on countably many elements of F. Then for any countable C ⊆ Q∗ the set F′ = ∪_{ϕ∈C} {q ∈ F | ϕ(q) ≠ 0} is countable, so that the family C vanishes identically on the uncountable set F\F′. It therefore cannot separate the elements of F, let alone those of Q. Thus Q does not have property S and we are done. For the construction of such an F we use the following lemma:

B.9 Lemma Every countably infinite set X admits a family {Xλ }λ∈Λ of subsets of X such that
(i) Λ has cardinality c = #R, in particular it is uncountable.
(ii) Xλ is infinite for each λ ∈ Λ.
(iii) Xλ ∩ Xλ0 is finite for all λ, λ0 ∈ Λ, λ 6= λ0 .
Proof. Take Y = (0, 1) ∩ Q and Λ = (0, 1)\Q. Clearly Y is countable and Λ is uncountable (since removing a countable set from one of cardinality c does not change the cardinality). For each λ ∈ Λ pick a sequence {an} ⊆ Y converging to λ (for example an = ⌊nλ⌋/n) and put Yλ = {an | n ∈ N}. That each Yλ is infinite follows from the irrationality of λ and the rationality of the an. If λ ≠ λ′ and an → λ, a′n → λ′ then there exists n0 such that n, n′ ≥ n0 implies max(|an − λ|, |a′n′ − λ′|) < |λ − λ′|/2, so that an ≠ a′n′. This implies #(Yλ ∩ Yλ′) < ∞. We thus have a family of subsets of Y with all desired properties. For an arbitrary countably infinite set X the claim now follows using a bijection X ≅ Y. □
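The construction in the proof can be illustrated numerically (illustration only; the two irrationals below are arbitrary choices): the sets of approximants an = ⌊nλ⌋/n are large, yet any two of them share only finitely many elements:

```python
from math import floor, sqrt

def Y(lam, N=500):
    # the first N rational approximants a_n = floor(n*lam)/n of lam
    return {floor(n * lam) / n for n in range(1, N + 1)}

A, B = Y(sqrt(2) - 1), Y(sqrt(3) - 1)   # two irrationals in (0, 1)
print(len(A), len(B), len(A & B))       # both large, intersection tiny
```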

Let {Xλ}λ∈Λ be a family of subsets of N as provided by the lemma. For λ ∈ Λ, the characteristic function χXλ : N → {0, 1} ⊆ C clearly is in ℓ∞. Let p : ℓ∞ → Q = ℓ∞/c0 be the quotient map. Now let qλ = p(χXλ) and F = {qλ | λ ∈ Λ}. If λ, λ′ ∈ Λ, λ ≠ λ′, the symmetric difference Xλ∆Xλ′ = (Xλ ∪ Xλ′)\(Xλ ∩ Xλ′) is infinite by (ii) and (iii). Thus χXλ − χXλ′ ∉ c0 = ker p, so that λ ↦ qλ is injective; with (i) we see that F is uncountable.
Let now ϕ ∈ Q∗, m, n ∈ N and let λ1, ..., λm ∈ Λ be mutually distinct and such that |ϕ(qλi)| ≥ 1/n ∀i = 1, ..., m. For each i pick ti with |ti| = 1 such that ti ϕ(qλi) = |ϕ(qλi)|. Put f = Σ_{i=1}^m ti χXλi ∈ ℓ∞. Since the sets Xλi have pairwise finite intersections, the function f has absolute value larger than one only on a subset of the finite set ∪_{j≠k} (Xλj ∩ Xλk), and absolute value one on the infinite set (∪i Xλi)\(∪_{j≠k} (Xλj ∩ Xλk)). This implies that ‖p(f)‖ = inf_{g∈c0} ‖f − g‖∞ = 1. Thus

‖ϕ‖ ≥ |ϕ(p(f))| = |Σ_{i=1}^m ti ϕ(p(χXλi))| = |Σ_{i=1}^m ti ϕ(qλi)| = Σ_{i=1}^m |ϕ(qλi)| ≥ m/n.

Thus m ≤ n‖ϕ‖ < ∞, so that for each ϕ ∈ Q∗ and n ∈ N there cannot be more than n‖ϕ‖ distinct λ ∈ Λ with |ϕ(qλ)| ≥ 1/n. If there were an uncountable F′ ⊆ F with ϕ(q) ≠ 0 ∀q ∈ F′, there would have to be an n ∈ N such that |ϕ(q)| ≥ 1/n for infinitely (in fact uncountably) many q ∈ F′, contradicting what we just proved. This completes the proof. □

B.4 Banach spaces with no or multiple pre-duals


Recall that we write ∼
= for isometric isomorphism and ' for isomorphism of Banach spaces.

B.10 Lemma Let V be a Banach space. Then:


(i) P = ιV ∗ ◦ (ιV )∗ ∈ B(V ∗∗∗ ) is an idempotent and P V ∗∗∗ = ιV ∗ (V ∗ ).
(ii) ιV ∗ (V ∗ ) ⊆ V ∗∗∗ is a complemented subspace.
(iii) If a Banach space W is isomorphic to V ∗ with V Banach then ιW (W ) ⊆ W ∗∗ is comple-
mented.

(iv) V ∗∗∗ /V ∗ ' (V ∗∗ /V )∗ . (We omitted the ι’s for simplicity.)
Proof. (i) Since ιV, ιV∗ are bounded, with Lemma 11.1 we have boundedness of P. Let ϕ ∈ V∗ and x ∈ V. Then

(PιV∗(ϕ))(ιV(x)) = ιV∗((ιV)∗(ιV∗(ϕ)))(ιV(x)) = [(ιV)∗(ιV∗(ϕ))](x) = ϕ(x) = ιV∗(ϕ)(ιV(x)),

where we used Exercise 7.12 several times. This proves PιV∗(ϕ) = ιV∗(ϕ), i.e. P restricted to ιV∗(V∗) is the identity. On the other hand, it follows directly from the definition of P that PV∗∗∗ ⊆ ιV∗(V∗). Combining these two facts gives P² = P and PV∗∗∗ = ιV∗(V∗).
(ii) This is an immediate consequence of (i) and Exercise 6.12.
(iii) If T : W → V∗ is an isomorphism then we have isomorphisms T∗ : V∗∗ → W∗ and T∗∗ : W∗∗ → V∗∗∗. Using this it is straightforward to deduce the claim from (ii).
(iv) By Exercise 6.6 we have (V∗∗/V)∗ ≅ V⊥ ⊆ V∗∗∗. And by (ii), V∗∗∗ ' V∗ ⊕ W, where W = (1 − P)V∗∗∗, the isomorphism being given by x∗∗∗ ↦ (Px∗∗∗, (1 − P)x∗∗∗) with P as in (i). Thus PV∗∗∗ ' V∗ and V∗∗∗/V∗ ' (1 − P)V∗∗∗. Thus the claimed isomorphism follows if we prove that the subspaces V⊥ and (1 − P)V∗∗∗ of V∗∗∗ are equal.
Now, x∗∗∗ ∈ (1 − P)V∗∗∗ means (1 − P)x∗∗∗ = x∗∗∗, thus Px∗∗∗ = 0. Since P = ιV∗ ∘ (ιV)∗, where ιV∗ is injective, this is equivalent to (ιV)∗(x∗∗∗) = 0. By the definition of the transpose, this means that x∗∗∗ ∘ ιV = 0. Since this is the same as x∗∗∗ ∈ ιV(V)⊥, we are done. □

B.11 Corollary c0 (N, F) is not isomorphic to the dual space of any Banach space.
Proof. We again abbreviate c0(N, F) as c0, etc. We know that c0∗ ≅ ℓ1 and c0∗∗ ≅ ℓ∞, the canonical map ιc0 : c0 → c0∗∗ just being the inclusion map c0 ↪ ℓ∞. By Theorem B.8, c0 ⊆ ℓ∞ is not complemented. Combining this with Lemma B.10(iii), the claim follows. □

B.12 Corollary Let X = c0 ⊕ (`∞ /c0 ). Then X 6' `∞ , but X ∗ ' (`∞ )∗ .
Proof. X ' `∞ would imply that c0 ⊆ `∞ is complemented, which it is not by Theorem B.8. Thus X 6' `∞ . With c∗0 ∼= `1 we have X ∗ ' c∗0 ⊕ (`∞ /c0 )∗ ' `1 ⊕ (`∞ /c0 )∗ .
On the other hand, since `1 ∼= c∗0 is a dual space, we see that `1 ⊆ (`1 )∗∗ ∼= (`∞ )∗ is complemented by Lemma B.10(iii). Thus (`∞ )∗ ' `1 ⊕ (`∞ )∗ /`1 by Exercise 9.11(i). Now Lemma B.10(iv) with V = c0 gives (`∞ )∗ /`1 ' (`∞ /c0 )∗ , so that X ∗ ' (`∞ )∗ . □

One can also find Banach spaces V with V ∗ ∼= `1 while V 6' c0 , but this is a bit more involved.

B.5 Normability. Separation theorems. Goldstine’s theorem


In this section we will prove the normability criterion for topological vector spaces stated in
Remark 2.13 and Goldstine’s Theorem 16.19, which we used in Exercise 16.18 and will use again
in Section B.6. The proofs require some preparations which, however, are quite fundamental,
also for the study of locally convex spaces.

B.5.1 Minkowski functionals. Criteria for normability and local convexity


B.13 Proposition Let V be a topological vector space and U a convex open neighborhood of
0. Define the ‘Minkowski functional’ µU : V → [0, ∞) of U by
µU (x) = inf{t ≥ 0 | x ∈ tU }.
Then µU is sublinear and continuous, and U = {x ∈ V | µU (x) < 1}.

Proof. As t → ∞ we have t−1 x → 0. Since U is an open neighborhood of 0, we have t−1 x ∈ U
for t large enough. Thus µU (x) < ∞ for each x ∈ V . It is quite obvious from the definition
that µU (cx) = cµU (x) for c > 0. Thus µU is positive-homogeneous. We have µU (x) < 1 if and
only if there exists t ∈ (0, 1) such that x ∈ tU . Thus µU (x) < 1 ⇒ x ∈ U . And if x ∈ U then
openness of U implies that (1 − ε)x ∈ U for some ε > 0. Thus µU (x) < 1, so that we have
U = {x ∈ V | µU (x) < 1}.
Let x, y ∈ V , and let s, t > 0 be such that x ∈ sU, y ∈ tU . I.e. there are a, b ∈ U such that x = sa, y = tb. Thus x + y = sa + tb = (s + t) · (sa + tb)/(s + t). Since (s/(s + t))a + (t/(s + t))b ∈ U due to convexity of U , we have x + y ∈ (s + t)U . Thus µU (x + y) ≤ s + t, and since for every ε > 0 we can choose such s < µU (x) + ε and t < µU (y) + ε, the conclusion is µU (x + y) ≤ µU (x) + µU (y), thus subadditivity. Being subadditive and positive-homogeneous, µU is sublinear.
Let {xι }ι∈I ⊆ V be a net converging to zero. For each n ∈ N, n−1 U is an open neighborhood
of zero. Thus there exists ιn ∈ I such that ι ≥ ιn implies xι ∈ n−1 U and therefore, with the
definition of µU , that µU (xι ) ≤ n−1 . Thus µU (xι ) → 0, which is continuity of µU at 0 ∈ V .
If now xι → x then the subadditivity of µU gives

µU (x) − µU (x − xι ) ≤ µU (xι ) ≤ µU (x) + µU (xι − x),

and since µU (xι − x) → 0, we have µU (xι ) → µU (x), thus continuity of µU at all x ∈ V . 
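To make the definition concrete: since {t > 0 | x ∈ tU } is the interval (µU (x), ∞), the Minkowski functional can be evaluated by bisection. A small Python sketch (not part of the notes; the open ellipse U in R2 is an invented example), which recovers the norm whose open unit ball U is:

```python
import math

def in_U(q):
    # U = open ellipse {(x, y) : (x/2)^2 + y^2 < 1}, a convex open neighborhood of 0 in R^2
    x, y = q
    return (x / 2) ** 2 + y ** 2 < 1

def minkowski(q, hi=1e6, iters=200):
    # mu_U(q) = inf{t >= 0 : q in tU}; q in tU iff q/t in U (for t > 0), and
    # {t : q in tU} is the interval (mu_U(q), oo), so bisection on t applies
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if in_U((q[0] / mid, q[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

q = (3.0, 4.0)
print(minkowski(q), math.sqrt((q[0] / 2) ** 2 + q[1] ** 2))  # both ~4.272
```

The bisection works precisely because of the interval structure of {t | x ∈ tU } established in the proof.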

B.14 Definition Let V be a topological vector space and 0 ∈ U ⊆ V . Then U is called


• balanced if x ∈ U, |λ| ≤ 1 ⇒ λx ∈ U ,
• bounded if for every open W 3 0 there exists λ > 0 such that λU ⊆ W .
Note that if U is convex and contains zero, multiplication by t ∈ [0, 1] sends U into itself.
Thus for checking balancedness it suffices to consider |λ| = 1.

B.15 Proposition Let (V, τ ) be a topological vector space and U a convex open neighborhood
of zero. Then
(i) The Minkowski functional µU is a seminorm if and only if U is balanced.
(ii) If U is bounded then µU (x) = 0 implies x = 0.
(iii) If U is balanced and bounded then kxk = µU (x) is a norm inducing the topology τ .
Proof. (i) Since µU is subadditive and positive-homogeneous, it is a seminorm if and only if
µU (λx) = µU (x) for all x ∈ V and λ ∈ F with |λ| = 1. If U is balanced then this is evidently
satisfied. Now assume µU (λx) = µU (x) for all x ∈ V and |λ| = 1. The openness of U implies that {t > 0 | x ∈ tU } =
(µU (x), ∞). Thus if |λ| = 1 then the assumption µU (λx) = µU (x) implies that x ∈ U if and
only if λx ∈ U . Thus U is balanced.
(ii) Assume that U is bounded and that x 6= 0. Since τ is T1 , there is an open W ⊆ V such
that 0 ∈ W 63 x. Since U is bounded, there is λ > 0 such that λU ⊆ W , which clearly implies
x 6∈ λU . Now the definition of µU implies µU (x) ≥ λ > 0.
(iii) Proposition B.13 and the above (i) and (ii) show that k · k = µU is a continuous norm on
V . Thus xn → 0 implies kxn k → 0. If we prove the converse implication then τ = τk·k follows
since V is a topological vector space. Let {xn } be a sequence such that kxn k → 0, and let W
be an open neighborhood of 0. Since U is bounded, there is λ > 0 such that λU ⊆ W . Now,
kxn k → 0 means that there is n0 ∈ N such that n ≥ n0 ⇒ kxn k < λ/2. With the definition of
µU this implies xn ∈ λU , thus xn ∈ λU ⊆ W for all n ≥ n0 . This proves xn → 0. 

B.16 Exercise Let V be a topological vector space and A ⊆ V convex. Prove that the interior
A0 and the closure A are convex.
We now know that a topological vector space is normable if the zero element has a balanced
convex bounded open neighborhood. (The converse is easy.) But this can be improved:

B.17 Lemma Let V be a topological vector space and U a convex open neighborhood of 0.
Then there exists a balanced convex open neighborhood U 0 ⊆ U of 0.

Proof. Since multiplication by scalars is continuous, there exists an ε > 0 such that λU ⊆ U whenever |λ| ≤ ε. Thus with W = εU we have tW ⊆ U whenever |t| ≤ 1. Put Y = ∪_{|t|≤1} tW ⊆ U . By construction, Y is a balanced open neighborhood of 0.
For every λ ∈ F with |λ| = 1 it is clear that λU is a convex open neighborhood of 0. Putting Z = ∩_{|λ|=1} λU , it is manifestly clear that Z is balanced and 0 ∈ Z. Furthermore, Z is convex (as an intersection of convex sets). Since Y is balanced we have Y = λY ⊆ λU for every |λ| = 1, thus Y ⊆ Z, so that Z has non-empty interior Z 0 . Now we put U 0 = Z 0 and claim that U 0 has the desired properties. Clearly U 0 is an open neighborhood of 0, and as the interior of a convex set it is convex (Exercise B.16). If |t| = 1 then the map Z → Z, x 7→ tx is a homeomorphism. Thus if x ∈ Z 0 = U 0 then tx ∈ Z 0 = U 0 , showing that U 0 = Z 0 is balanced. □

Now we are in a position to prove geometric criteria for normability and local convexity of
topological vector spaces:

B.18 Theorem Let V be a topological vector space. Then V is normable if and only if there
exists a bounded convex open neighborhood of 0.
Proof. If V is normable by the norm k · k then Bk·k (0, 1) = {x ∈ V | kxk < 1} is clearly open,
convex (and balanced). To show boundedness, let W 3 0 be open. Then there is ε > 0 such
that B(0, ε) ⊆ W . Now clearly εB(0, 1) = B(0, ε) ⊆ W , thus B(0, 1) is bounded.
If there exists a bounded convex open neighborhood U of 0 then by Lemma B.17 we can
assume U in addition to be balanced. (The U 0 provided by the lemma is a subset of U , thus
bounded if U is bounded.) Now by Proposition B.15(iii), µU is a norm inducing the given
topology on V . 

B.19 Theorem A topological vector space (V, τ ) is locally convex in the sense of Definition
2.24 (i.e. the topology τ comes from a separating family F of seminorms) if and only if it is
Hausdorff and the zero element has an open neighborhood base consisting of convex sets.
Proof. Given a separating family F of seminorms and putting τ = τF , a basis of open neigh-
borhoods of 0 is given by the finite intersections of sets Up,ε = {x ∈ V | p(x) < ε}, where
p ∈ F, ε > 0. Each of the Up,ε is convex and open, thus also the finite intersections.
And if τ has the stated property, Lemma B.17 gives that 0 has a neighborhood base consisting
of balanced convex open sets. Defining F = {µU | U balanced convex open neighborhood of 0},
each of the µU is a continuous seminorm by Propositions B.13 and B.15. Thus if xι → 0 then
kxι kU := µU (xι ) → 0. And kxι kU → 0 for all balanced convex open U implies that xι ultimately
is in every open neighborhood of 0, thus xι → 0. Thus τ = τF , and 2.23 gives that F is
separating. 

B.20 Exercise Let 0 < p < 1.


(i) Prove that (`p (S, F), τdp ) is normable if S is finite.

(ii) Prove that the open unit ball of (`p (S, F), τdp ) does not contain any convex open neigh-
borhood of 0 if S is infinite.
(iii) Prove that (`p (S, F), τdp ) is neither normable nor locally convex if S is infinite.
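The convexity failure behind (ii) and (iii) can be seen numerically. A minimal Python sketch (not part of the notes): two points well inside the unit ball of the metric dp have a midpoint outside it.

```python
# For 0 < p < 1 the "balls" of d_p fail to be convex: two points strictly inside
# the unit ball around 0 can have their midpoint outside it.
p = 0.5

def dp(x, y):
    # the translation-invariant metric d_p(x, y) = sum_i |x_i - y_i|^p
    return sum(abs(a - b) ** p for a, b in zip(x, y))

x, y = (0.99, 0.0), (0.0, 0.99)
mid = tuple((a + b) / 2 for a, b in zip(x, y))
print(dp(x, (0, 0)), dp(y, (0, 0)))   # both < 1
print(dp(mid, (0, 0)))                # 2 * 0.495^0.5 ~ 1.407 > 1
```

The midpoint value 2 · (1/2)^p = 2^{1−p} > 1 is exactly the obstruction used in part (ii).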

B.5.2 Hahn-Banach separation theorem. Goldstine’s theorem


The Hahn-Banach theorem in the sublinear functional version (Theorem 7.2) has an important
geometric application, namely the fact that disjoint convex sets can be separated by hyperplanes,
i.e. sets {x ∈ V | Re ϕ(x) = t} for some ϕ ∈ V ∗ and t ∈ R. The following is just one of many
versions, sufficient for our purposes.

B.21 Theorem Let V be a topological vector space and A, B ⊆ V disjoint non-empty convex
subsets, A being open. Then there is a continuous linear functional ϕ : V → F such that
Re ϕ(a) < inf b∈B Re ϕ(b) ∀a ∈ A. (If F = R, drop the ‘Re’.)
Proof. Pick a0 ∈ A, b0 ∈ B and put z = b0 − a0 and U = (A − a0 ) − (B − b0 ) = A − B + z, which is a convex (as a pointwise sum of convex sets) open (since U = ∪_{x∈−B−a0 +b0 } (A + x)) neighborhood of 0. Let p = µU be the associated Minkowski functional. As a consequence of A ∩ B = ∅ we have 0 6∈ A − B, thus z 6∈ U , and therefore p(z) ≥ 1.
Put W = Rz and define ψ : W → R, cz 7→ c. For c ≥ 0 we have ψ(cz) = c ≤ cp(z) = p(cz).
Thus by sublinearity of p and Theorem 7.2 there exists a linear functional ϕ : V → R satisfying
ϕ  W = ψ, thus ϕ(cz) = c, and ϕ(x) ≤ p(x) ∀x ∈ V . Thus also −p(−x) ≤ −ϕ(−x) = ϕ(x),
and since x → 0 implies p(x) → 0, ϕ is continuous at zero, thus everywhere.
If now a ∈ A, b ∈ B then a − b + z ∈ U , so that p(a − b + z) < 1. Thus

ϕ(a − b) + 1 = ϕ(a − b + z) ≤ p(a − b + z) < 1,

thus ϕ(a) < ϕ(b) for all a ∈ A and b ∈ B. Thus the subsets ϕ(A), ϕ(B) of R are disjoint. Since A, B are convex, they are connected. Consequently, ϕ(A), ϕ(B) are connected, thus intervals. Since A is open, so is ϕ(A) (a non-zero continuous linear functional is an open map). If we put s = sup ϕ(A), we have ϕ(a) < s ≤ ϕ(b) for all a ∈ A, b ∈ B, and this gives ϕ(a) < inf b∈B ϕ(b) for all a ∈ A.
F = C: Considering V as an R-vector space, apply the above to obtain a continuous R-linear functional ϕ0 : V → R such that ϕ0 (a) < inf b∈B ϕ0 (b) ∀a ∈ A. Now define ϕ : V → C, x 7→
ϕ0 (x)−iϕ0 (ix). This clearly is continuous and satisfies Re ϕ = ϕ0 , so that the desired inequality
holds. That ϕ is C-linear follows from the same argument as in the proof of Theorem 7.5. 

B.22 Theorem (Goldstine’s theorem) If V is a Banach space then V≤1 is σ(V ∗∗ , V ∗ )-dense
in (V ∗∗ )≤1 .
Proof. We abbreviate τ = σ(V ∗∗ , V ∗ ). The unit ball (V ∗∗ )≤1 is τ -compact by Alaoglu’s theorem, thus τ -closed, so that the τ -closure B = clτ (V≤1 ), which is convex by Exercise B.16, is contained in (V ∗∗ )≤1 . If this inclusion is strict, pick x∗∗ ∈ (V ∗∗ )≤1 \B. Then x∗∗ has a τ -open neighborhood U disjoint from B, and by Theorem B.19 there is a convex τ -open neighborhood A ⊆ U of x∗∗ . Now Theorem B.21 applied to (V ∗∗ , τ ) and A, B ⊆ V ∗∗ gives a τ = σ(V ∗∗ , V ∗ )-continuous linear functional ϕ ∈ (V ∗∗ )? such that Re ϕ(a) < inf b∈B Re ϕ(b) ∀a ∈ A. Now Exercise 16.13 gives ϕ ∈ V ∗ ⊆ V ∗∗∗ .
Putting ψ = −ϕ we have supb∈B Re ψ(b) < Re ψ(a) ∀a ∈ A, which is more convenient. Since ψ ∈ V ∗ and B ⊇ V≤1 , we have kψk ≤ supb∈B Re ψ(b). On the other hand, with x∗∗ ∈ A and kx∗∗ k ≤ 1, we have Re ψ(x∗∗ ) ≤ |ψ(x∗∗ )| ≤ kx∗∗ kkψk ≤ kψk. Combining these findings, we have kψk ≤ supb∈B Re ψ(b) < Re ψ(x∗∗ ) ≤ kψk, which is absurd. This contradiction proves clτ (V≤1 ) = (V ∗∗ )≤1 . □

B.6 Strictly convex and uniformly convex Banach spaces
B.6.1 Strict convexity and uniqueness in the Hahn-Banach theorem
B.23 Definition A Banach space V is called strictly convex if x, y ∈ V, kxk = kyk = 1, x 6= y
implies kx + yk < 2.

B.24 Exercise (i) Prove that `p (S, F) is not strictly convex if #S ≥ 2 and p ∈ {1, ∞}.
(ii) Prove that `p (S, F) is strictly convex for every S and 1 < p < ∞.
(iii) Prove that all Hilbert spaces are strictly convex.
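A quick numeric illustration of (i) and (ii) in dimension 2 over R (a Python sketch, not part of the notes): for p ∈ {1, ∞} there are distinct unit vectors whose midpoint still has norm 1, while for p = 2 the midpoint is strictly shorter.

```python
import math

def norm(x, p):
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1 / p)

def midpoint_norm(x, y, p):
    # norm of (x + y)/2; strict convexity forces this to be < 1 for distinct unit x, y
    return norm([(a + b) / 2 for a, b in zip(x, y)], p)

print(midpoint_norm([1, 0], [0, 1], 1))          # 1.0: l^1 is not strictly convex
print(midpoint_norm([1, 1], [1, -1], math.inf))  # 1.0: l^infty is not strictly convex
print(midpoint_norm([1, 0], [0, 1], 2))          # ~0.7071: strict convexity for p = 2
```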

B.25 Proposition Let V be a Banach space. Then


(i) The following are equivalent:
(a) V is strictly convex.
(b) If x, y ∈ V satisfy kx + yk = kxk + kyk then y = 0 or x = cy with c ≥ 0.
(ii) [Taylor-Foguel] The following are equivalent:
(a) V ∗ is strictly convex.
(b) For every closed subspace W ⊆ V and ϕ ∈ W ∗ there is a unique ϕ̂ ∈ V ∗ with ϕ̂|W = ϕ and kϕ̂k = kϕk.

B.26 Exercise (i) Prove (b)⇒(a) in Proposition B.25(i).


(ii) Prove (a)⇒(b) in Proposition B.25(ii).
(These are the easier directions.)
Proof of the remaining implications in Proposition B.25. (i) (a)⇒(b) Assume that V is strictly
convex and x, y ∈ V satisfy kx + yk = kxk + kyk. If x = 0 or y = 0 then we are done. By
rescaling and/or exchanging if necessary we may assume 1 = kxk ≤ kyk. Put z = y/kyk. Then

2 ≥ kx + zk = kx + y − (1 − kyk−1 )yk ≥ kx + yk − (1 − kyk−1 )kyk = kx + yk − kyk + 1 = kxk + kyk − kyk + 1 = 2,

where we used ka − bk ≥ kak − kbk and the assumptions kx + yk = kxk + kyk and kxk = 1. This
implies kx + zk = 2. Since kxk = 1 = kzk (by assumption and by construction of z), the strict
convexity implies x = z = y/kyk. Thus y = kykx, and we have proven (b).
(ii) (b)⇒(a) Assume V ∗ is not strictly convex. Then there are ϕ1 , ϕ2 ∈ V ∗ with ϕ1 6= ϕ2
and kϕ1 k = kϕ2 k = 1 and kϕ1 + ϕ2 k = 2. Then W = {x ∈ V | ϕ1 (x) = ϕ2 (x)} ⊆ V is a
closed linear subspace and proper (since ϕ1 6= ϕ2 ). Put ψ = ϕ1 |W = ϕ2 |W ∈ W ∗ . We will prove
kψk = 1. Then ϕ1 , ϕ2 are distinct norm-preserving extensions of ψ ∈ W ∗ to V , providing a
counterexample for uniqueness of norm-preserving extensions.
Since ϕ1 − ϕ2 6= 0, there exists z ∈ V with ϕ1 (z) − ϕ2 (z) = 1. Now every x ∈ V can be
written uniquely as x = y + cz, where y ∈ W, c ∈ C: Put c = ϕ1 (x) − ϕ2 (x) and then y = x − cz.
Now it is obvious that y ∈ W . Uniqueness of such a representation follows from z 6∈ W .
Since kϕ1 + ϕ2 k = 2, we can find a sequence {xn } ⊆ V with kxn k = 1 ∀n such that
ϕ1 (xn ) + ϕ2 (xn ) → 2. Since |ϕi (xn )| ≤ 1 for i = 1, 2 and all n, it follows that ϕi (xn ) → 1 for i =
1, 2. Now write xn = yn +cn z, where {yn } ⊆ W and {cn } ⊆ C. Then cn = ϕ1 (xn )−ϕ2 (xn ) → 0.
Thus kxn − yn k = |cn |kzk → 0, so that kyn k → 1. And with cn → 0 we have

limn→∞ ϕ1 (yn ) = limn→∞ ϕ1 (yn + cn z) = limn→∞ ϕ1 (xn ) = 1.

In view of {yn } ⊆ W and ϕ1 |W = ψ, we have ψ(yn ) = ϕ1 (yn ) → 1. Together with kyn k → 1
this implies kψk ≥ 1. Since the converse inequality is obvious, we have kψk = 1, as claimed. 
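The counterexample mechanism in (b)⇒(a) can be made very concrete. In V = R2 with the `1 -norm the dual norm is the maximum norm, which is not strictly convex, and the functional x 7→ x on W = span(e1 ) has a whole interval of norm-preserving extensions. A Python sketch (not part of the notes):

```python
def dual_inf_norm(a, b):
    # operator norm of (x, y) -> a*x + b*y on (R^2, ||.||_1) is max(|a|, |b|)
    return max(abs(a), abs(b))

# psi(x, 0) = x on W = span(e1) has norm 1; each extension phi_t(x, y) = x + t*y
# preserves it as long as |t| <= 1, so uniqueness fails, exactly as (b) predicts
extensions = {t: dual_inf_norm(1.0, t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)}
print(extensions)  # all values are 1.0
```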

In addition to the above we remark that V is strictly convex if V ∗ is ‘smooth’, and smooth if V ∗ is strictly convex. (Cf. e.g. [43] for definitions and proofs.) Combined with the Taylor-Foguel theorem, this relates uniqueness of Hahn-Banach extensions for subspaces W ⊆ V to smoothness of V .

B.6.2 Uniform convexity and reflexivity. Duality of Lp -spaces reconsidered


B.27 Definition A Banach space V is called uniformly convex if for every ε > 0 there exists
δ(ε) > 0 such that x, y ∈ V, kxk = kyk = 1 and kx − yk ≥ ε imply k(x + y)/2k ≤ 1 − δ(ε).
It is obvious that uniform convexity implies strict convexity.
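For Hilbert spaces the parallelogram law yields the explicit modulus δ(ε) = 1 − (1 − ε2 /4)1/2 . The following Python sketch (not part of the notes) verifies the defining inequality on unit vectors in R2 :

```python
import math

def delta(eps):
    # from the parallelogram law: ||(x+y)/2||^2 = 1 - ||x-y||^2/4 for unit x, y
    return 1 - math.sqrt(1 - eps ** 2 / 4)

# unit vectors at angle theta satisfy ||x - y|| = 2 sin(theta/2)
for theta in (0.5, 1.0, 2.0):
    x, y = (1.0, 0.0), (math.cos(theta), math.sin(theta))
    eps = math.dist(x, y)
    mid = math.hypot((x[0] + y[0]) / 2, (x[1] + y[1]) / 2)
    print(eps, mid, 1 - delta(eps))  # the midpoint norm equals 1 - delta(eps) here
```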

B.28 Theorem (Milman-Pettis 1938/9) Every uniformly convex Banach space is reflexive.
Proof. (Following Ringrose 1958) Assume V is uniformly convex, but not reflexive. Let S ⊆ V
and S ∗∗ ⊆ V ∗∗ be the unit spheres (sets of elements of norm one). Since S = S ∗∗ easily implies
V = V ∗∗ , we have S $ S ∗∗ . If x∗∗ ∈ S ∗∗ \S then by the obvious norm-closedness of S ⊆ S ∗∗ there is ε > 0 such that B(x∗∗ , ε) ∩ S = ∅. Since kx∗∗ k = 1, we can find ϕ ∈ V ∗ with kϕk = 1 and |x∗∗ (ϕ) − 1| < δ(ε)/2. Now U = {y ∗∗ ∈ V ∗∗ | |y ∗∗ (ϕ) − 1| < δ(ε)} ⊆ V ∗∗ is a τ := σ(V ∗∗ , V ∗ )-open neighborhood of x∗∗ . By Goldstine’s Theorem B.22, V≤1 ⊆ (V ∗∗ )≤1 is τ -dense. If {xα } ⊆ V≤1 is a net τ -converging to x ∈ S ∗∗ then kxα k → 1 and xα /kxα k → x. Thus S ⊆ S ∗∗ is τ -dense, thus S ∩ U 6= ∅. If now y1 , y2 ∈ S ∩ U then |ϕ(y1 ) + ϕ(y2 )| > 2 − 2δ(ε). With kϕk = 1 this implies ky1 + y2 k > 2 − 2δ(ε). Thus by uniform convexity we have ky1 − y2 k < ε. Since every net in S that τ -converges to x∗∗ ultimately lives in U , picking any y1 ∈ S ∩ U we have kx∗∗ − y1 k ≤ ε. But this contradicts the choice of ε. □

The converse of the theorem is not true, but the construction of counterexamples is laborious.
Note also that the dual of a uniformly convex space need not be uniformly convex!

B.29 Theorem For every measure space (X, A, µ) and 1 < p < ∞, the space Lp (X, A, µ; F) is
uniformly convex and reflexive.
Proof. We follow [37]. Let 0 < ε ≤ 21−p . Then the set

Z = {(x, y) ∈ R2 | |x|p + |y|p = 2, |(x − y)/2|p ≥ ε}

is closed and bounded, thus compact, and non-empty since (21/p , 0) ∈ Z. Since the function R → R, t 7→ |t|p is strictly convex, we have |(x + y)/2|p < (|x|p + |y|p )/2 whenever x 6= y. Thus

ρ(ε) = inf (x,y)∈Z [ (|x|p + |y|p )/2 − |(x + y)/2|p ] > 0.

Now by homogeneity we have

|(x − y)/2|p ≥ ε (|x|p + |y|p )/2 ⇒ ρ(ε) (|x|p + |y|p )/2 ≤ (|x|p + |y|p )/2 − |(x + y)/2|p . (B.1)

Let now 0 < ε < 21−p , let δ > 0, and let f, g ∈ Lp (X, A, µ) with kf kp = kgkp = 1 and k(f + g)/2kpp > 1 − δ. Writing f, g instead of f (x), g(x), we put

M = {x ∈ X | |(f − g)/2|p ≥ ε (|f |p + |g|p )/2}.

Now

k(f − g)/2kpp = ∫X\M |(f − g)/2|p dµ + ∫M |(f − g)/2|p dµ
≤ ε ∫X\M (|f |p + |g|p )/2 dµ + ∫M (|f |p + |g|p )/2 dµ
≤ ε ∫X (|f |p + |g|p )/2 dµ + (1/ρ(ε)) ∫M [ (|f |p + |g|p )/2 − |(f + g)/2|p ] dµ
≤ ε ∫X (|f |p + |g|p )/2 dµ + (1/ρ(ε)) ∫X [ (|f |p + |g|p )/2 − |(f + g)/2|p ] dµ
≤ ε + 1/ρ(ε) − (1 − δ)/ρ(ε) = ε + δ/ρ(ε).

(In the second row we used the definition of M and (4.1), in the third we used (B.1), which holds on M , in the fourth the fact that the expression in brackets is non-negative on X\M , and finally we used the assumptions kf kpp ≤ 1, kgkpp ≤ 1 and k(f + g)/2kpp > 1 − δ.) Now choosing δ < ερ(ε) we have k(f − g)/2kpp ≤ 2ε, thus uniform convexity (more precisely, an implication equivalent to it).
Reflexivity now follows from Theorem B.28. □
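The two quantitative ingredients, ρ(ε) > 0 and the rescaled implication (B.1), can be checked numerically. The following Python sketch (not part of the notes; the values p = 4, ε = 0.05 are arbitrary choices with ε ≤ 21−p ) estimates ρ(ε) on a grid over Z and spot-checks (B.1) on random pairs:

```python
import random

p, eps = 4.0, 0.05       # any 1 < p < oo and 0 < eps <= 2^(1-p) = 0.125

def gap(x, y):
    # the strict-convexity gap (|x|^p + |y|^p)/2 - |(x+y)/2|^p of t -> |t|^p
    return (abs(x) ** p + abs(y) ** p) / 2 - abs((x + y) / 2) ** p

# grid estimate of rho(eps), the infimum of the gap over the compact set Z
vals = []
n = 20000
for i in range(n + 1):
    t = i / n            # |x|^p = 2t, |y|^p = 2(1-t) parametrizes the constraint
    for sx in (1, -1):
        for sy in (1, -1):
            x, y = sx * (2 * t) ** (1 / p), sy * (2 * (1 - t)) ** (1 / p)
            if abs((x - y) / 2) ** p >= eps:
                vals.append(gap(x, y))
rho = min(vals)
print(rho)               # strictly positive

# spot-check of (B.1) after rescaling (factor 0.9 compensates the grid error)
random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    m = (abs(x) ** p + abs(y) ** p) / 2
    if m > 0 and abs((x - y) / 2) ** p >= eps * m:
        assert 0.9 * rho * m <= gap(x, y) + 1e-9
```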

B.30 Remark The uniform convexity of Lp for 1 < p < ∞ was first proven by Clarkson in
1936 with a fairly complicated proof. (Reflexivity was known earlier thanks to F. Riesz’ proof
of (Lp )∗ ∼= Lq .) A simpler proof, still giving optimal bounds, can be found in [31]. 2

Now we are in a position to complete the determination of Lp (X, A, µ)∗ for arbitrary measure
space (X, A, µ) and 1 < p < ∞ without invocation of the Radon-Nikodym theorem:

B.31 Corollary Let 1 < p < ∞ and (X, A, µ) any measure space. Then the canonical map
Lq (X, A, µ; F) → Lp (X, A, µ; F)∗ is an isometric bijection.
Proof. Let (X, A, µ) be any measure space, 1 < p < ∞ and q the conjugate exponent. We
abbreviate Lp (X, A, µ) to Lp . As discussed (without complete, but hopefully sufficient detail)
in Section 4.6, the map ϕ : Lq → (Lp )∗ , g 7→ ϕg is an isometry, so that only surjectivity remains
to be proven. Assume ϕ(Lq ) $ (Lp )∗ . This subspace being closed (since Lq is complete and ϕ is an isometry), by Hahn-Banach there is a 0 6= ψ ∈ (Lp )∗∗ such that ψ  ϕ(Lq ) = 0. By reflexivity of Lp (Theorem B.29), there is an f ∈ Lp such that ψ = ιLp (f ). This implies ϕg (f ) = ψ(ϕg ) = 0 for all g ∈ Lq . With ϕg (f ) = ∫ f g dµ = ϕ0f (g), where ϕ0 : Lp → (Lq )∗ is the canonical map, this implies ϕ0f = 0. Since ϕ0 is an isometry, we have f = 0 and therefore ψ = 0, which is a contradiction. Thus ϕ : Lq → (Lp )∗ is surjective. □

B.7 Schur’s theorem


As on earlier occasions, we abbreviate `1 = `1 (N, F).

B.32 Theorem (I. Schur) If g, {fn }n∈N ⊆ `1 (N, F) and fn → g weakly then kfn − gk1 → 0.

Proof. It clearly suffices to prove this for g = 0, thus: `1 3 fn → 0 weakly ⇒ kfn k1 → 0. We will follow the gliding hump argument in [3] very closely.
Assume that fn → 0 weakly, but kfn k1 6→ 0. Since δm ∈ `∞ ∼= (`1 )∗ , the first fact clearly implies fn (m) = ϕδm (fn ) → 0 (n → ∞) for all m. And by the second assumption there exists ε > 0 such that
kfn k1 ≥ ε for infinitely many n. Using this, we inductively define {nk }, {rk } ⊆ N as follows:
(a) Let n1 be the smallest number for which kfn1 k1 ≥ ε.
(b) Let r1 be the smallest number for which Σ_{i=1}^{r1 } |fn1 (i)| ≥ ε/2 and Σ_{i=r1 +1}^{∞} |fn1 (i)| ≤ ε/5.
For k ≥ 2:
(c) Let nk be the smallest number such that nk > nk−1 and kfnk k1 ≥ ε and Σ_{i=1}^{rk−1 } |fnk (i)| ≤ ε/5.
(d) Let rk be the smallest number such that rk > rk−1 and Σ_{i=rk−1 +1}^{rk } |fnk (i)| ≥ ε/2 and Σ_{i=rk +1}^{∞} |fnk (i)| ≤ ε/5.
The reader should convince herself that the existence of such nk , rk follows from our assumptions!
Now define {ci }i∈N by ci = sgn(fnk (i)), where k is uniquely determined by rk−1 < i ≤ rk with r0 = 0. Now clearly c = {ci } ∈ `∞ , and for all k we have, using the lower bound in (b),(d),

Σ_{i=rk−1 +1}^{rk } ci fnk (i) = Σ_{i=rk−1 +1}^{rk } |fnk (i)| ≥ ε/2,

while using |ci | ≤ 1 and the upper bounds in (b),(c),(d) we have

|Σ_{i=1}^{rk−1 } ci fnk (i)| ≤ Σ_{i=1}^{rk−1 } |fnk (i)| ≤ ε/5,   |Σ_{i=rk +1}^{∞} ci fnk (i)| ≤ Σ_{i=rk +1}^{∞} |fnk (i)| ≤ ε/5.

Thus |ϕc (fnk )| ≥ ε/2 − ε/5 − ε/5 = ε/10 for all k, so that ϕc (fn ) 6→ 0. Since this contradicts the assumption fn → 0 weakly, we must have kfn k1 → 0. □
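The selection (a)-(d) is a deterministic algorithm, and one can actually run it. The following Python sketch (not part of the notes) applies it to the invented family fn = n−1 (δn+1 + · · · + δ2n ), which tends to 0 coordinatewise while kfn k1 = 1, and verifies the bound |ϕc (fnk )| ≥ ε/10:

```python
# f_n is supported on {n+1, ..., 2n} with entries 1/n (indices are 1-based), so
# f_n(m) -> 0 for every fixed m while ||f_n||_1 = 1; take eps = 1.
N = 40            # truncation: we only ever look at indices 1..N

def f(n, i):
    return 1.0 / n if n < i <= 2 * n else 0.0

eps = 1.0
ns, rs = [], [0]  # r_0 = 0
for k in range(4):
    # (a)/(c): smallest admissible n (here ||f_n||_1 = 1 = eps for every n, so
    # only the smallness of the mass on {1, ..., r_{k-1}} needs checking)
    n = (ns[-1] if ns else 0) + 1
    while sum(f(n, i) for i in range(1, rs[-1] + 1)) > eps / 5:
        n += 1
    # (b)/(d): smallest r with hump mass >= eps/2 and tail mass <= eps/5
    r = rs[-1] + 1
    while not (sum(f(n, i) for i in range(rs[-1] + 1, r + 1)) >= eps / 2
               and sum(f(n, i) for i in range(r + 1, N + 1)) <= eps / 5):
        r += 1
    ns.append(n)
    rs.append(r)

c = [0.0] * (N + 1)            # c_i = sgn(f_{n_k}(i)) on the k-th block (all +1 here)
for k in range(1, len(ns) + 1):
    for i in range(rs[k - 1] + 1, rs[k] + 1):
        c[i] = 1.0
phis = [sum(c[i] * f(n, i) for i in range(1, N + 1)) for n in ns]
print(ns, rs[1:], phis)        # every phi_c(f_{n_k}) is >= eps/10
```

(Of course this family is not weakly null, consistent with Schur's theorem; the point is only to watch the hump selection at work.)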

B.33 Remark In the above proof, the gliding hump philosophy is much more clearly visible than in the proof of Theorem 8.2: The gliding hump is precisely the dominant contribution to ϕc (fnk ) coming from the i in the interval {rk−1 + 1, . . . , rk }, which moves to infinity as k → ∞.
Note also that the determination of the nk , rk in the above proof was deterministic, using
no choice axiom at all. In this sense the proof is better than the alternative one using Baire’s
theorem, thus countable dependent choice, cf. e.g. [12, Proposition V.5.2], which nevertheless is
instructive. But of course also the above proof is non-constructive in the somewhat extremist
sense of intuitionism since the necessary ε > 0 cannot be found algorithmically.
For a high-brow interpretation of Schur’s theorem in terms of Banach space bases see [1,
Section 2.3]. But also this discussion uses gliding humps. 2

B.8 The Fuglede-Putnam theorem


B.34 Theorem Let A be a unital C ∗ -algebra over C.
(i) Let a, c ∈ A. If a is normal and ac = ca then a∗ c = ca∗ (and ac∗ = c∗ a).
(ii) Let a, b, c ∈ A. If a, b are normal and ac = cb then a∗ c = cb∗ .

The theorem is quite remarkable, and asked for a proof one probably wouldn’t know where to begin. For matrices it is quite easy to prove, as we do for (i): Normality of a implies the existence of an ONB {ei } and a decomposition a = Σi λi Pi with distinct λi , where Pi is the orthogonal projection onto the λi -eigenspace (so that each Pi is a polynomial in a). Now ac = ca implies
Pi c = cPi for all i, from which a∗ c = ca∗ is immediate. This argument can be extended to
operators on infinite dimensional spaces, cf. [77]. But the following is quite different:

Proof. Obviously (i) is just the special case a = b of (ii).
(ii) We define f : C → A, z 7→ e^{za∗ } c e^{−zb∗ }, where e^a = exp(a) is defined in terms of the power series as in Section 12.1. Expanding the two power series in the definition of f we have

f (z) = e^{za∗ } c e^{−zb∗ } = (Σ_{k=0}^{∞} z^k (a∗ )^k /k!) c (Σ_{l=0}^{∞} (−z)^l (b∗ )^l /l!) = Σ_{n=0}^{∞} z^n dn ∀z ∈ C

for certain dn ∈ A. (The reshuffling is justified by the uniform convergence of the series.) We only need d1 = a∗ c − cb∗ , which is quite obvious. Thus the theorem follows if we prove d1 = 0.
By induction, the assumption ac = cb is seen to imply a^n c = c b^n . Multiplying by w^n /n! and summing over n ∈ N0 gives e^{wa} c = c e^{wb} for all w ∈ C, thus also e^{−z̄a} c e^{z̄b} = c. Thus

f (z) = e^{za∗ } c e^{−zb∗ } = e^{za∗ } (e^{−z̄a} c e^{z̄b} ) e^{−zb∗ } = e^{za∗ −z̄a} c e^{z̄b−zb∗ } = e^{2iIm(za∗ )} c e^{−2iIm(zb∗ )} ,

where e^{za∗ } e^{−z̄a} = e^{za∗ −z̄a} holds due to the normality aa∗ = a∗ a of a (note z̄a = (za∗ )∗ ), and similarly for b. Now 2Im(za∗ ) and 2Im(zb∗ ), with Im(x) = (x − x∗ )/2i, are self-adjoint, so that e^{2iIm(za∗ )} and e^{−2iIm(zb∗ )} are unitary for all z ∈ C, cf. Remark 11.25(ii), thus bounded. This proves that f : C → A is bounded.
Thus for every ϕ ∈ A∗ , the function z 7→ ϕ(f (z)) is entire and bounded, thus constant by Liouville’s theorem. Thus for all z, z 0 ∈ C, ϕ ∈ A∗ we have ϕ(f (z) − f (z 0 )) = 0. Hahn-Banach now implies f (z) − f (z 0 ) = 0 ∀z, z 0 , thus f is constant. In particular, d1 = f 0 (0) = 0. □
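For a finite-dimensional sanity check of (ii) one can simply multiply out small matrices. The following Python sketch (not from the notes; the 2 × 2 matrices are invented toy data) verifies ac = cb ⇒ a∗ c = cb∗ for a pair of normal matrices:

```python
def mul(A, B):
    # product of 2x2 complex matrices given as nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(A):
    # conjugate transpose
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

a = [[1, 0], [0, 1j]]    # normal (diagonal)
b = [[1j, 0], [0, 1]]    # normal (diagonal)
c = [[0, 1], [1, 0]]     # the flip, intertwining a and b

assert mul(a, adj(a)) == mul(adj(a), a) and mul(b, adj(b)) == mul(adj(b), b)
assert mul(a, c) == mul(c, b)             # hypothesis ac = cb
print(mul(adj(a), c) == mul(c, adj(b)))   # conclusion a*c = cb*: True
```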

B.35 Remark (i) was proven by Fuglede in 1950, (ii) by Putnam in 1951. The above elegant
proof is due to Rosenblum (1958). Nevertheless, the appeal to complex analysis is redundant
and somewhat misleading since for a bounded function given in terms of a power series of infinite
convergence radius, as is the case here, Liouville’s theorem has a proof that involves neither the
notion of holomorphicity nor the general path independence of contour integrals: 2

B.36 Lemma Let f (z) = Σ_{n=0}^{∞} cn z^n have infinite convergence radius (i.e. lim supn |cn |^{1/n} = 0) and assume that f : C → C is bounded. Then cn = 0 ∀n ≥ 1.
Proof. Let r > 0, m ∈ N. Then

∫_0^{2π} e^{−imt} f (re^{it} ) dt = ∫_0^{2π} e^{−imt} (Σ_{n=0}^{∞} r^n e^{int} cn ) dt = Σ_{n=0}^{∞} r^n cn ∫_0^{2π} e^{i(n−m)t} dt = 2π r^m cm ,

where the interchange of integration and summation is justified by the uniform convergence of the series, and we used ∫_0^{2π} e^{i(n−m)t} dt = 2π δn,m . With M = supz∈C |f (z)| < ∞ we have

|cm | = (2π r^m )^{−1} |∫_0^{2π} e^{−imt} f (re^{it} ) dt| ≤ M/r^m ∀m ∈ N, r > 0.

Taking the limit r → +∞, we have cm = 0 for all m ≥ 1. □
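The coefficient formula at the heart of this proof is easy to test numerically. The following Python sketch (not part of the notes) approximates the integral by a Riemann sum for f = exp, for which cm = 1/m!:

```python
import cmath, math

def coeff(f, m, r, n=4096):
    # Riemann-sum approximation of (1/(2 pi r^m)) * int_0^{2pi} e^{-imt} f(r e^{it}) dt
    s = sum(cmath.exp(-1j * m * t) * f(r * cmath.exp(1j * t))
            for t in (2 * math.pi * k / n for k in range(n)))
    return s / (n * r ** m)   # (2 pi / n) * s / (2 pi r^m)

for m in (0, 1, 2, 5):
    print(m, coeff(cmath.exp, m, 2.0).real, 1 / math.factorial(m))
```

For a trigonometric polynomial the equally spaced Riemann sum is exact up to aliasing, so with n = 4096 sample points the recovered coefficients agree with 1/m! to machine precision.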

For another instance where the standard invocation of Liouville’s theorem can be replaced by
harmonic analysis see Rickart’s proof of Theorem 10.18, where only finite cyclic groups appear!

B.9 Glimpse of non-linear FA: Schauder’s fixed point theorem
In this final section we give a glimpse of non-linear functional analysis by proving Schauder’s
fixed point theorem, which is a generalization of Brouwer’s fixed point theorem to Banach
spaces.

B.37 Definition A topological space X has the fixed-point property if for every continuous
f : X → X there is x ∈ X such that f (x) = x, i.e. a fixed-point.

B.38 Theorem (Brouwer, Hadamard, 1910) 68 [0, 1]n has the fixed point property. The
same holds for every non-empty compact convex subset of Rn .
The second result follows from the first since such an X is homeomorphic to some [0, 1]m . There are many proofs of the first result; for what probably is the simplest one (due to Kulpa), using only some easy combinatorics, see [47]. (Proofs using algebraic topology or analysis involve inessential elements and don’t reduce the combinatorics.)

B.39 Theorem (Schauder 1930) 69 Every non-empty compact convex subset K of a normed
vector space has the fixed point property.
Proof. Let (V, k · k) be a normed vector space, K ⊆ V a non-empty compact convex subset and f : K → K continuous. Let ε > 0. Since K is compact, thus totally bounded, there are x1 , . . . , xn ∈ K such that K ⊆ ∪_{i=1}^{n} B(xi , ε). Thus if we define αi (x) ≥ 0 by

αi (x) = ε − kx − xi k if kx − xi k < ε and αi (x) = 0 if kx − xi k ≥ ε, ∀i = 1, . . . , n, (B.2)

we see that for each x ∈ K there is at least one i such that αi (x) > 0. The functions αi clearly
we see that for each x ∈ K there is at least one i such that αi (x) > 0. The functions αi clearly
are continuous. Thus also the map
Pε : K → K, x 7→ (Σ_{i=1}^{n} αi (x) xi ) / (Σ_{i=1}^{n} αi (x))

is continuous. Since Pε (x) is a convex combination of those xi for which kx − xi k < ε, we have
kPε (x) − xk < ε for all x ∈ K. The finite dimensional subspace Vn = span(x1 , . . . , xn ) ⊆ V
is isomorphic to some Rm , and by Proposition 3.19 the restriction of the norm k · k to Vn is
equivalent to the Euclidean norm on Rm . Thus the convex hull conv(x1 , . . . , xn ) ⊆ Vn into
which Pε maps is homeomorphic to a compact convex subset of Rm and thus has the fixed point
property by Theorem B.38. Thus if we define fε = Pε ◦ f then fε maps conv(x1 , . . . , xn ) into
itself and thus has a fixed point x0 = fε (x0 ). Now,
kx0 − f (x0 )k ≤ kx0 − fε (x0 )k + kfε (x0 ) − f (x0 )k = kfε (x0 ) − f (x0 )k = kPε (f (x0 )) − f (x0 )k < ε.
Since ε > 0 was arbitrary, we find inf{kx − f (x)k | x ∈ K} = 0. Since K is compact and
x 7→ kx − f (x)k continuous, the infimum is assumed, thus f has a fixed point in K. 
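The approximation Pε built from (B.2) can be implemented directly. The following Python sketch (not part of the notes; K = [0, 1]2 with the Euclidean norm and a grid ε-net are invented choices) checks kPε (x) − xk < ε on random sample points:

```python
import math, random

eps = 0.15
step = eps / 2           # grid spacing: every point of K is within eps of the net
net = [(min(i * step, 1.0), min(j * step, 1.0))
       for i in range(int(1 / step) + 2) for j in range(int(1 / step) + 2)]

def alpha(x, xi):
    # formula (B.2)
    d = math.dist(x, xi)
    return eps - d if d < eps else 0.0

def P(x):
    ws = [alpha(x, xi) for xi in net]
    s = sum(ws)          # > 0 since some net point is within eps of x
    return (sum(w * xi[0] for w, xi in zip(ws, net)) / s,
            sum(w * xi[1] for w, xi in zip(ws, net)) / s)

random.seed(1)
for _ in range(100):
    x = (random.random(), random.random())
    assert math.dist(P(x), x) < eps   # P(x) is a convex combination of near points
print("P_eps moves every sample by less than eps")
```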

The use of methods/results from algebraic topology is quite typical for non-linear functional
analysis. (But also linear functional analysis connects to algebraic topology, for example via
K-theory, cf. e.g. [50, Chapter 7].)
68
Luitzen Egbertus Jan Brouwer (1881-1966). Dutch mathematician. Important contributions to topology, founding
of intuitionism. Jacques Hadamard (1865-1963). French mathematician.
69
Juliusz Schauder (1899-1943). Born in Lwow/Lviv (now Ukraine, then Lemberg in the Austrian empire) and killed
by the Nazis during WW2.

B.40 Definition Let V be a Banach space and W ⊆ V . A map f : W → V is called compact if it is continuous and f (S) ⊆ V is precompact for every bounded S ⊆ W .

B.41 Corollary Let V be a Banach space and C ⊆ V closed, bounded and convex. If
f : C → V is compact and f (C) ⊆ C then f has a fixed point in C.
Proof. We first show that the norm-closure K of the convex hull conv(f (C)) is compact. Since C is bounded and f is compact, f (C) is totally bounded. Given ε > 0, pick y1 , . . . , ym ∈ f (C) with f (C) ⊆ ∪_i B(yi , ε). Every convex combination of elements of f (C) then lies within ε of a convex combination of y1 , . . . , ym , so the compact set conv(y1 , . . . , ym ) (a continuous image of the standard simplex) is an ε-net for conv(f (C)). Thus conv(f (C)) is totally bounded, so that its closure K in the Banach space V is compact and, of course, convex.
Thus K has the fixed point property by Schauder’s theorem. Since C is closed and convex, we have K ⊆ C, and f maps K into f (K) ⊆ f (C) ⊆ K. Thus f has a fixed point x ∈ K ⊆ C. □

C Tentative schedule (14 lectures à 90 minutes)
1. Introduction, motivation. TVS. Normed spaces, bounded linear maps
2. Continuation of basic material (Sections 2, 3). Sequence spaces `p (S): proof of Hölder and
Minkowski inequalities.
3. More on sequence spaces (most proofs omitted). Basics on Hilbert spaces: inner product,
CS ineq., norm from inner product. Parallelogram equal., polarization. Orthogonality.
4. From Riesz lemma to Sect. 5.3.
5. End of Sect. 5.3. Briefly: tensor products of Hilbert spaces. Quotients of Banach spaces.
Hahn-Banach over R
6. Hahn-Banach over C. Applications: Reflexivity, complemented subspaces. Baire’s thm.
Strong version of the uniform boundedness theorem (gliding hump proof of Theorem 8.2
omitted). Hellinger-Toeplitz.
7. Strong convergence, Banach-Steinhaus. Many continuous functions with divergent Fourier
series. Open mapping, bounded inverse and closed graph theorems. Invertibility of Banach
space operators: (i)⇔(ii) in Proposition 9.28.
8. Bounded below maps, (i)⇔(iii) in Proposition 9.28. Sections 10.1-2.
9. Sections 10.4 and 11.1-2.
10. Sections 11.3-4 and 12.1-2 (incl. brief discussion of Weierstrass and Tietze theorems).
11. Sections 12.3 and 13. If possible: Beginning of Section 14.
12. Brief discussion of Arzelà-Ascoli (App. A.9). Rest of Section 14 (probably no time for
Section 14.4).
13. Section 15: Spectral theorems for normal operators.
14. Sections 16, 17: Weak and weak∗ topologies, Gelfand homomorphism/isomorphism.

All papers appearing in the bibliography are cited somewhere, but not all books. Still, all
are worth looking at.

References
[1] F. Albiac, N. J. Kalton: Topics in Banach space theory. 2nd. ed. Springer, 2016.
[2] D. Amir: Characterizations of inner product spaces. Birkhäuser, 1986.
[3] S. Banach: Théorie des opérations linéaires, 1932. Engl. transl.: Theory of linear opera-
tions. North-Holland, 1987.
[4] W. R. Bauer, R. H. Benner: The non-existence of a Banach space of countably infinite Hamel dimension. Amer. Math. Monthly 78, 895-896 (1971).
[5] G. Birkhoff, E. Kreyszig: The establishment of functional analysis. Hist. Math. 11, 258-321 (1984).
[6] N. Bourbaki: Topological vector spaces. Chapters 1-5. Springer, 1987.
[7] H. Brézis: Functional analysis, Sobolev spaces and partial differential equations. Springer,
2011.
[8] T. Bühler, D. A. Salamon: Functional analysis. American Mathematical Society, 2018.
[9] N. L. Carothers: A short course on Banach space theory. Cambridge University Press,
2005.
[10] P. G. Ciarlet: Linear and nonlinear functional analysis with applications. Society for In-
dustrial and Applied Mathematics, 2013.
[11] D. L. Cohn: Measure theory. 2nd. ed. Springer, 2013.
[12] J. B. Conway: A course in functional analysis. 2nd. ed. Springer, 2007.
[13] A. M. Davie: The Banach approximation problem. J. Approx. Th. 13, 392-394 (1975).
[14] K. Deimling: Nonlinear functional analysis. Springer, 1985.
[15] J. Dieudonné: History of functional analysis. North-Holland, 1981.
[16] J.-L. Dorier: A general outline of the genesis of vector space theory. Hist. Math. 22, 227-261
(1995).
[17] N. Dunford, J. T. Schwartz: Linear operators. I. General theory. Interscience Publishers,
1958, John Wiley & Sons, 1988.
[18] H.-D. Ebbinghaus et al.: Numbers. Springer, 1991.
[19] P. Enflo: A counterexample to the approximation problem in Banach spaces. Acta Math.
130, 309-317 (1973).
[20] L. C. Evans: Partial differential equations. 2nd. ed. American Mathematical Society, 2010.
[21] A. Fellhauer: On the relation of three theorems of analysis to the axiom of choice. J. Logic
Analysis 9, 1-23 (2017).
[22] S. Friedberg, A. Insel, L. Spence: Linear algebra. 4th. ed. Pearson, 2014.
[23] D. J. H. Garling: A course in mathematical analysis. Vol. 1 & 2. Cambridge University
Press, 2013.
[24] F. Q. Gouvêa: p-adic numbers. An introduction. 3rd. ed. Springer, 2020.

[25] S. Grabiner: The Tietze extension theorem and the open mapping theorem. Amer. Math.
Monthly 93, 190-191 (1986).
[26] S. Gudder: Inner product spaces. Amer. Math. Monthly 81, 29-36 (1974), 82, 251-252
(1975), 82, 818 (1975).
[27] C. Heil: A basis theory primer. Birkhäuser, 2011.
[28] P. R. Halmos: Introduction to Hilbert space and the theory of spectral multiplicity. 2nd. ed.
Chelsea, 1957.
[29] P. R. Halmos: What does the spectral theorem say? Amer. Math. Monthly 70, 241-247
(1963).
[30] P. R. Halmos: A Hilbert space problem book. 2nd. ed. Springer, 1982.
[31] O. Hanner: On the uniform convexity of Lp and lp . Ark. Mat. 3, 239-244 (1955).
[32] R. V. Kadison, J. R. Ringrose: Fundamentals of the theory of operator algebras. Vol. 1.
Elementary theory. Academic Press, 1983.
[33] S. Kaplan: The bidual of C(X) I. North-Holland, 1985.
[34] Y. Katznelson: An introduction to harmonic analysis. 3rd. ed. Cambridge University Press,
2004.
[35] Y. & Y. R. Katznelson: A (terse) introduction to linear algebra. American Mathematical
Society, 2008.
[36] I. Kleiner: A history of abstract algebra. Birkhäuser, 2007.
[37] V. Komornik: Lectures on functional analysis and the Lebesgue integral. Springer, 2016.
[38] N. P. Landsman: Foundations of quantum theory. Springer, 2017. Freely available at
https://link.springer.com/book/10.1007%2F978-3-319-51777-3
[39] P. D. Lax: Functional analysis. Wiley, 2002.
[40] P. D. Lax: Linear algebra and its applications. 2nd. ed. Wiley, 2007.
[41] J. Lindenstrauss, L. Tzafriri: On the complemented subspace problem. Isr. J. Math. 9,
263-269 (1971).
[42] B. MacCluer: Elementary functional analysis. Springer, 2009.
[43] R. E. Megginson: An introduction to Banach space theory. Springer, 1998.
[44] R. Meise, D. Vogt: Introduction to functional analysis. Oxford University Press, 1997.
[45] D. F. Monna: Functional analysis in historical perspective. Oosthoek Publishing Company,
1973.
[46] G. H. Moore: The axiomatization of linear algebra: 1875-1940. Hist. Math. 22, 262-303
(1995).
[47] M. Müger: Topology for the working mathematician. (work in progress).
http://www.math.ru.nl/~mueger/topology.pdf
[48] M. Müger: On trace class operators (and Hilbert-Schmidt operators).
http://www.math.ru.nl/~mueger/PDF/Trace-class.pdf
[49] M. Müger: Some examples of Fourier series.
http://www.math.ru.nl/~mueger/PDF/Fourier.pdf
[50] G. J. Murphy: C ∗ -Algebras and operator theory. Academic Press, 1990.

[51] G. Nagy: A functional analysis point of view on the Arzelà-Ascoli theorem. Real Anal.
Exch. 32, 583-586 (2006/7).
D. C. Ullrich: The Ascoli-Arzelà Theorem via Tychonoff’s Theorem. Amer. Math. Monthly
110, 939-940 (2003).
M. Wójtowicz: For eagles only: probably the most difficult proof of the Arzelà-Ascoli
theorem - via the Stone-Čech compactification. Quaest. Math. 40, 981-984 (2017).
[52] L. Narici, E. Beckenstein, G. Bachman: Functional analysis and valuation theory. Marcel
Dekker, Inc. 1971.
[53] L. Nirenberg: Topics in nonlinear functional analysis. American Mathematical Society,
1974.
[54] B. de Pagter, A. C. M. van Rooij: An invitation to functional analysis. Epsilon Uitgaven,
2013.
[55] G. Pedersen: Analysis now. Springer, 1989.
[56] J.-P. Pier: Mathematical analysis during the 20th century. Oxford University Press, 2001.
[57] J. B. Prolla: Topics in functional analysis over valued division rings. North-Holland, 1982.
[58] D. Ramakrishnan, R. J. Valenza: Fourier analysis on number fields. Springer, 1999.
[59] M. Reed, B. Simon: Methods of modern mathematical physics. I: Functional analysis.
Academic Press, 1980.
[60] F. Riesz, B. Sz.-Nagy: Functional analysis. Frederick Ungar Publ., 1955, Dover, 1990.
[61] A. M. Robert: A course in p-adic analysis. Springer, 2000.
[62] A. C. M. van Rooij: Non-Archimedean functional analysis. Marcel Dekker, Inc., 1978.
[63] W. Rudin: Real and complex analysis. McGraw-Hill, 1966, 1974, 1986.
[64] W. Rudin: Functional analysis. 2nd. ed. McGraw-Hill, 1991.
[65] V. Runde: A taste of topology. Springer, 2005.
[66] V. Runde: A new and simple proof of Schauder’s theorem. arXiv:1010.1298.
[67] R. A. Ryan: Introduction to tensor products of Banach spaces. Springer, 2002.
[68] B. P. Rynne, M. A. Youngson: Linear functional analysis. 2nd. ed. Springer, 2008.
[69] D. A. Salamon: Measure and integration. European Mathematical Society, 2016.
[70] K. Saxe: Beginning functional analysis. Springer, 2002.
[71] E. Schechter (ed.): Handbook of analysis and its foundations. Academic Press, 1997.
[72] B. Simon: Trace ideals and their applications. 2nd. ed. American Mathematical Society,
2005.
[73] B. Simon: Operator theory. American Mathematical Society, 2015.
[74] I. Singer: Bases in Banach spaces I & II. Springer, 1970, 1981.
[75] P. Soltan: A primer on Hilbert space. Springer, 2018.
[76] E. M. Stein, R. Shakarchi: Fourier analysis. Princeton University Press, 2005.
[77] V. S. Sunder: Fuglede’s theorem. Indian J. Pure Appl. M. 46, 415-417 (2015).
[78] A. Szankowski: B(H) does not have the approximation property. Acta Math. 147, 89-108
(1981).

[79] T. Tao: Analysis I & II. 3rd. ed. Springer, 2016.
[80] G. Teschl: Topics in linear and nonlinear functional analysis. American Mathematical
Society, 2020.
[81] A. M. Vershik: The life and fate of functional analysis in the twentieth century. In: A.
A. Bolibruch, Yu. S. Osipov, Ya. G. Sinai (eds.): Mathematical events of the twentieth
century. Springer, 2006.
[82] S. Warner: Topological fields. North-Holland, 1989.
[83] J. Weidmann: Lineare Operatoren in Hilberträumen. 1: Grundlagen, 2: Anwendungen.
Teubner, 2000, 2003.
[84] W. Więsław: On topological fields. Colloq. Math. 29, 119-146 (1974).
[85] K.-W. Yang: A note on reflexive Banach spaces. Proc. Amer. Math. Soc. 18, 859-861
(1967).
[86] E. Zeidler: Nonlinear functional analysis. Volumes 1, 2A, 2B, 3, 4, 5. Springer, 1984-1990.
[87] R. J. Zimmer: Essential results of functional analysis. University of Chicago Press, 1990.

