Introduction to Functional Analysis

Michael Müger
24.02.2024

Abstract
These are notes for my Bachelor course Inleiding in de Functionaalanalyse (14×90 min.).
They are also recommended as background for my Master courses on Operator Algebras.
Some familiarity with metric and topological spaces is assumed, and the last lecture (Section
18) will use some measure theory. Complex analysis is not used.

Contents
1 Introduction 5

Part I: Fundamentals 9
2 Setting the stage 9
2.1 Topological algebra: Topological groups, fields, vector spaces . . . . . . . . . . . 9
2.2 Translation-invariant metrics. Normed and Banach spaces . . . . . . . . . . . . . 10
2.3 A glimpse beyond normed spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.1 Finite-dimensional TVS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.2 Locally convex and Fréchet spaces . . . . . . . . . . . . . . . . . . . . . . 17

3 Normed and Banach space basics 19


3.1 Linear maps: bounded ⇔ continuous . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Why we care about completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.1 Extension of bounded linear maps . . . . . . . . . . . . . . . . . . . . . . 22
3.2.2 Convergence of series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.3 Closedness vs. completeness of subspaces . . . . . . . . . . . . . . . . . . 24
3.3 Spaces of bounded linear maps. First glimpse of Banach algebras . . . . . . . . . 25

4 The sequence spaces and their dual spaces 26


4.1 Basics. 1 ≤ p ≤ ∞: Hölder and Minkowski inequalities . . . . . . . . . . . . . . . 27
4.2 ? Aside: The translation-invariant metric dp for 0 < p < 1 . . . . . . . . . . . . . 29
4.3 c00 and c0 . Completeness of `p (S, F) and c0 (S, F) . . . . . . . . . . . . . . . . . . 30
4.4 Separability of `p (S, F) and c0 (S, F) . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.5 Dual spaces of `p (S, F), 1 ≤ p < ∞, and c0 (S, F) . . . . . . . . . . . . . . . . . . 33
4.6 The Banach algebras (`∞ (S, F), ·) and (`1 (Z, F), ⋆) . . . . . . . . . . . . . . . . . 35
4.7 Outlook on general Lp -spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

5 Basics of Hilbert spaces 37
5.1 Inner products. Cauchy-Schwarz inequality . . . . . . . . . . . . . . . . . . . . . 37
5.2 The parallelogram and polarization identities . . . . . . . . . . . . . . . . . . . . 40
5.3 Basic Hilbert space geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.4 Closed subspaces, orthogonal complement, and orthogonal projections . . . . . . 43
5.5 The dual space H ∗ of a Hilbert space . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.6 Orthonormal sets and bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.7 Tensor products of Hilbert spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

6 Subspaces and quotient spaces of Banach spaces 52


6.1 Quotient spaces of Banach spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.2 Complemented subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

7 Open Mapping Theorem and relatives 56


7.1 Open Mapping Theorem. Bounded Inverse Theorem . . . . . . . . . . . . . . . . 56
7.2 Some applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
7.3 The Closed Graph Theorem (CGT). Hellinger-Toeplitz . . . . . . . . . . . . . . . 61
7.3.1 The Hellinger-Toeplitz theorem . . . . . . . . . . . . . . . . . . . . . . . . 62
7.4 Boundedness below. Invertibility . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

8 Uniform Boundedness Theorem and applications 65


8.1 The Uniform Boundedness Theorem (UBT) . . . . . . . . . . . . . . . . . . . . . 65
8.2 Applications of the weak UBT. Banach-Steinhaus . . . . . . . . . . . . . . . . . . 66
8.3 Appl. of the strong UBT: Many continuous functions with divergent Fourier series 68

9 Duality: Hahn-Banach Theorem and applications 69


9.1 First version of Hahn-Banach over R . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.2 Hahn-Banach theorem for (semi)normed spaces . . . . . . . . . . . . . . . . . . . 71
9.3 First applications of Hahn-Banach . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.4 Reflexivity of Banach spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
9.5 The transpose of a bounded Banach space operator . . . . . . . . . . . . . . . . . 77

10 ? Duality: Weak and weak ∗-topologies 79


10.1 The weak topology of a Banach space . . . . . . . . . . . . . . . . . . . . . . . . 80
10.2 The weak operator topology on B(V ) . . . . . . . . . . . . . . . . . . . . . . . . 83
10.3 The weak-∗ topology on a dual space. Alaoglu’s theorem . . . . . . . . . . . . . . 83

11 Hilbert space operators and their adjoints. Special classes of operators 86


11.1 The adjoint of a bounded Hilbert space operator . . . . . . . . . . . . . . . . . . 86
11.2 Unitaries, isometries, coisometries, partial isometries . . . . . . . . . . . . . . . . 90
11.3 Polarization revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
11.4 A little more on self-adjoint operators . . . . . . . . . . . . . . . . . . . . . . . . 92
11.5 Normal operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
11.6 Numerical range and radius . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
11.7 Positive operators and their square roots . . . . . . . . . . . . . . . . . . . . . . . 95
11.8 The polar decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
11.9 ? The trace of positive operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

12 Compact operators 99
12.1 Compact Banach space operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
12.2 ? Compactness vs. weak forms of weak-norm continuity . . . . . . . . . . . . . . 106
12.3 Compact Hilbert space operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
12.4 ? Hilbert-Schmidt operators: L2 (H) . . . . . . . . . . . . . . . . . . . . . . . . . 111

Part II: Spectral theory of operators and algebras 113


13 Spectrum of bounded operators and of Banach algebra elements 113
13.1 Spectra of bounded operators I: Definitions, first results . . . . . . . . . . . . . . 113
13.2 The spectrum in a unital Banach algebra . . . . . . . . . . . . . . . . . . . . . . 117
13.2.1 The group of invertibles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
13.2.2 The spectrum. Basic properties . . . . . . . . . . . . . . . . . . . . . . . . 118
13.2.3 The spectral radius formula (Beurling-Gelfand theorem) . . . . . . . . . . 121
13.2.4 Applications, complements, exercises . . . . . . . . . . . . . . . . . . . . . 123
13.3 Spectra of bounded operators II: Banach algebra methods . . . . . . . . . . . . . 127
13.4 Applications to normal Hilbert space operators . . . . . . . . . . . . . . . . . . . 129

14 Compact operators II: Spectral theorems 129


14.1 The spectrum of compact operators. Fredholm alternative . . . . . . . . . . . . . 129
14.2 ? A glimpse of Fredholm operators . . . . . . . . . . . . . . . . . . . . . . . . . . 131
14.3 Spectral theorems for compact Hilbert space operators . . . . . . . . . . . . . . . 132
14.4 ? Spectral theorem (Jordan normal form) for compact Banach space operators . 134

15 Some functional calculus for Banach algebras 136


15.1 Characters. Spectrum of a Banach algebra . . . . . . . . . . . . . . . . . . . . . . 136
15.2 Baby version of holomorphic functional calculus . . . . . . . . . . . . . . . . . . . 139

16 Basics of C ∗ -algebras 142



16.1 Involutions. Definition of C ∗ -algebras . . . . . . . . . . . . . . . . . . . . . . . . 142
16.2 Some classes of elements in a C ∗ -algebra and their spectra . . . . . . . . . . . . . 143
16.3 Positive elements of a C ∗ -algebra I . . . . . . . . . . . . . . . . . . . . . . . . . . 145

17 Continuous functional calculus for C ∗ -algebras 146


17.1 Continuous functional calculus for self-adjoint elements . . . . . . . . . . . . . . . 146
17.2 Positive elements of a C ∗ -algebra II. Absolute value . . . . . . . . . . . . . . . . 148
17.3 Continuous functional calculus for normal elements . . . . . . . . . . . . . . . . . 149

18 Spectral theorems for normal Hilbert space operators 152


18.1 Spectral theorem: Multiplication operator version . . . . . . . . . . . . . . . . . . 152
18.2 Borel functional calculus for normal operators . . . . . . . . . . . . . . . . . . . . 155
18.3 Normal operators vs. projection-valued measures . . . . . . . . . . . . . . . . . . 159

19 ? The Gelfand homomorphism for commutative Banach and C ∗ -algebras 161


19.1 The topology of Ω(A). The Gelfand homomorphism . . . . . . . . . . . . . . . . 161
19.2 Application: Absolutely convergent Fourier series . . . . . . . . . . . . . . . . . . 163
19.3 C ∗ -algebras. Continuous functional calculus revisited . . . . . . . . . . . . . . . . 164

A Some more advanced topics from topology and measure theory 166
A.1 Unordered infinite sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
A.2 More on unconditional convergence of series . . . . . . . . . . . . . . . . . . . . . 168
A.3 Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
A.4 Reminder of the choice axioms and Zorn’s lemma . . . . . . . . . . . . . . . . . . 171
A.5 Baire’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
A.6 On C(X, F) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
A.6.1 Tietze’s extension theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 174
A.6.2 Weierstrass’ theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
A.6.3 The Stone-Weierstrass theorem . . . . . . . . . . . . . . . . . . . . . . . . 177
A.6.4 The Arzelà-Ascoli theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 178
A.6.5 Separability of C(X, R) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
A.6.6 ? The Stone-Čech compactification . . . . . . . . . . . . . . . . . . . . . . 181
A.7 Some notions from measure and integration theory . . . . . . . . . . . . . . . . . 182

B ? Supplements for the curious 184


B.1 Functional analysis over fields other than R and C? . . . . . . . . . . . . . . . . . 184
B.2 Even more on unconditional and conditional convergence . . . . . . . . . . . . . 185
B.2.1 The Dvoretzky-Rogers theorem . . . . . . . . . . . . . . . . . . . . . . . . 185
B.2.2 Converses of Dvoretzky-Rogers . . . . . . . . . . . . . . . . . . . . . . . . 186
B.2.3 Conditional convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
B.3 More on the spaces `p (S, F) and c0 (S, F) . . . . . . . . . . . . . . . . . . . . . . . 188
B.3.1 Precompact subsets of c0 (S, F) and `p (S, F), 1 ≤ p < ∞ . . . . . . . . . . 188
B.3.2 The dual space of `∞ (S, F) . . . . . . . . . . . . . . . . . . . . . . . . . . 189
B.3.3 c0 (N, F) ⊆ `∞ (N, F) is not complemented . . . . . . . . . . . . . . . . . . 192
B.3.4 c0 (N, F) is not a dual space. Spaces with multiple pre-duals . . . . . . . . 194
B.3.5 Schur’s theorem for `1 (N, F) . . . . . . . . . . . . . . . . . . . . . . . . . . 195
B.3.6 Compactness of all bounded linear maps c0 → `q → `p (1 ≤ p < q < ∞) . 196
B.4 The weak Uniform Boundedness Theorem using only ACω . . . . . . . . . . . . . 197
B.5 Alaoglu’s theorem ⇔ Tychonov for T2 spaces ⇔ Ultrafilter lemma . . . . . . . . . 198
B.6 More on convexity and Hahn-Banach matters . . . . . . . . . . . . . . . . . . . . 201
B.6.1 Tychonov for Hausdorff spaces implies Hahn-Banach . . . . . . . . . . . . 201
B.6.2 Minkowski functionals. Criteria for normability and local convexity . . . . 202
B.6.3 Hahn-Banach separation theorems for locally convex spaces . . . . . . . . 204
B.6.4 First applications of the separation theorems to Banach spaces . . . . . . 206
B.6.5 Convex hulls and closed convex hulls . . . . . . . . . . . . . . . . . . . . . 207
B.6.6 Extreme points and faces of convex sets. The Krein-Milman theorem . . . 208
B.6.7 Strictly convex Banach spaces and Hahn-Banach uniqueness . . . . . . . . 211
B.6.8 Uniform convexity and reflexivity. Duality of Lp -spaces reconsidered . . . 212
B.7 The Eidelheit-Chernoff theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
B.8 A bit more on invariant subspaces. Lomonosov’s theorem . . . . . . . . . . . . . 215
B.9 More on Fredholm operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
B.10 Discrete and essential spectrum . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
B.10.1 Banach space operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
B.10.2 Normal Hilbert space operators. Weyl’s theorem . . . . . . . . . . . . . . 224
B.10.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
B.11 Trace-class operators: L1 (H) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
B.12 More on numerical ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

B.12.1 The numerical range W (A) of a Hilbert space operator . . . . . . . . . . . 232
B.12.2 Numerical range in Banach algebras . . . . . . . . . . . . . . . . . . . . . 234
B.12.3 Positive functionals on and numerical range for C ∗ -algebras . . . . . . . . 238
B.13 Some more basic theory of C ∗ -algebras . . . . . . . . . . . . . . . . . . . . . . . . 242
B.13.1 The Fuglede-Putnam theorem . . . . . . . . . . . . . . . . . . . . . . . . . 242
B.13.2 Homomorphisms of C ∗ -algebras . . . . . . . . . . . . . . . . . . . . . . . . 243
B.14 Unbounded operators (mostly on Hilbert space) . . . . . . . . . . . . . . . . . . . 244
B.14.1 Basic definitions. Closed and closable operators . . . . . . . . . . . . . . . 244
B.14.2 Adjoints of unbounded Hilbert space operators . . . . . . . . . . . . . . . 245
B.14.3 Basic criterion for (essential) self-adjointness . . . . . . . . . . . . . . . . 247
B.15 Glimpse of non-linear FA: Schauder’s fixed point theorem . . . . . . . . . . . . . 249

C The mathematicians encountered in these notes 251

D Results stated, but not proven 254

E What next? 255

F Approximate schedule for 14 lectures à 90 minutes 256

References 257

1 Introduction
We will begin with a quick delineation of what we will discuss – and what not!
• “Classical analysis” is concerned with ‘analysis in finitely many dimensions’. ‘Functional
analysis’ is the generalization or extension of classical analysis to infinitely many dimen-
sions. Before one can try to make sense of this, one should make the first sentence more
precise. Since the creation of general topology, one can talk about convergence and con-
tinuity in very general terms. As far as I see it, this is not analysis, even if infinite sums
(=series) are studied. Analysis proper starts as soon as one talks about differentiation
and/or integration. Differentiation has to do with approximating functions locally by lin-
ear ones, and for this one needs the spaces considered to be vector spaces (at least locally).
This is the reason why most of classical analysis considers functions between the vector
spaces Rn and Cn (or subsets of them). (In a second step, one can then generalize to
spaces that look like Rn only locally by introducing topological and smooth manifolds and
their generalizations, but the underlying model of Rn remains important.) On the other
hand, integration, at least in the sense of the modern theory, can be studied much more
generally, i.e. on arbitrary sets equipped with a measure (defined on some σ-algebra). Such
a set can be very far from being a vector space or manifold, for example by being totally
disconnected.
• In view of the above, it is not surprising that functional analysis is concerned with (pos-
sibly) infinite-dimensional vector spaces and continuous maps between them. (Again, one
can then generalize to spaces that look like a vector space only locally, but this would
be considered infinite-dimensional geometry, not functional analysis.) In addition to the
vector space structure one needs a topology, which naturally leads to topological vector
spaces, which we will define soon.

• The importance of topologies is not specific to infinite dimensions. The point rather is that
Rn , Cn have unique topologies making them Hausdorff topological vector spaces. This is
no longer true in infinite dimensions!
• Actually, ‘functional analysis’ most often studies only linear maps between topological
vector spaces so that this domain of study should be called ‘linear functional analysis’, but
this is done only rarely, e.g. [145]. Allowing non-linear maps leads to non-linear functional
analysis. This course will discuss only linear functional analysis. Thorough mastery of the
latter is needed anyway before one can think about non-linear FA or infinite-dimensional
geometry. For the simplest result of non-linear functional analysis, see Section B.15. For
more, you could have a look at, e.g., [164, 37, 115]. There even is a five volume treatise
[177]!
• The restriction to linear maps means that the notion of differentiation becomes pointless,
the derivative of f (x) = Ax being just A everywhere. But there are many non-trivial
connections between linear FA and integration (and measure) theory. For example, every
measure space (X, A, µ) gives rise to a family of topological vector spaces Lp (X, A, µ), p ∈
(0, ∞], and integration provides linear functionals. Proper appreciation of these matters
requires some knowledge of measure and integration theory, cf. e.g. [29, 146]. I will not
suppose that you have followed a course on this subject (but if you haven’t, I strongly recommend
that you do so on the next occasion or, at least, read the appendix in MacCluer’s book
[101].). Yet, one can get a reasonably good idea by focusing on sequence spaces, for which
no measure theory is required, see Section 4.
• One should probably consider linear functional analysis as an infinite-dimensional and
topological version of linear algebra rather than as a branch of analysis! This might lead
one to suspect linear FA to be slightly boring, but this would be wrong for many reasons:
– Functional analysis (linear or not) leads to very interesting (and arbitrarily challeng-
ing) technical questions (most of which reduce to very easy ones in finite dimensions).
– Linear FA is essential for non-linear FA, like variational calculus, and the theory of
differential equations – not only linear ones!
– Quantum theory [92] is a linear theory and cannot be done properly without func-
tional analysis, despite the fact that many physicists think so! Conversely, many
developments in FA were directly motivated by quantum theory.
• The above could give the impression that functional analysis arose from the wish of gen-
eralizing analysis to infinitely many dimensions. This may have played a role for some
of its creators, but its beginnings (and much of what is being done now) were mostly
motivated by finite-dimensional “classical”1 analysis: If U ⊆ Rn , the set of functions
(possibly continuous, differentiable, etc.) from U to Rm is a vector space as soon as we
put (cf + dg)(x) = cf (x) + dg(x). Unless U is a finite set, this vector space will be
infinite-dimensional. Now one can consider certain operations on such vector spaces, like
differentiation C ∞ (U ) → C ∞ (U ), f 7→ f ′ , or integration f 7→ ∫U f . Considerations of this
sort provided the initial motivation for the development of functional analysis, and indeed
FA now is a very important tool for the study of ordinary and partial differential equations
on finite-dimensional spaces. See e.g. [23, 52]. The relevance of FA is even more obvious if
one studies differential equations in infinitely many dimensions. In fact, it is often useful to
study a partial differential equation (like the heat or wave equation) by singling out one of the
variables (typically ‘time’) and studying the equation as an ordinary differential equation
in an infinite-dimensional space of functions. FA is also essential for variational calculus
(which in a sense is just a branch of differential calculus in infinitely many dimensions).

1 Ultimately, it is quite futile to try and draw a neat line between “classical” and “modern” or functional analysis,
in particular since many problems in the former require methods from the latter for their proper treatment.
• In view of the above, FA studies abstract topological vector spaces as well as ‘concrete’
spaces, whose elements are functions. In order to obtain a proper understanding of FA,
one needs some familiarity with both aspects.
Before we delve into technicalities, some further general remarks:
• The history of functional analysis is quite interesting, cf. e.g. the article [15], [120, Chapter
4] and the books [40, 105, 122]. But clearly it makes little sense to study it before one has
some technical knowledge of FA. It is surprisingly intertwined with the development of
linear algebra. One would think that (finite-dimensional) vector spaces, linear maps etc.
were defined much earlier than, e.g., Banach spaces, but this is not what has happened.
In fact, Banach’s2 book [9], based on his 1920 PhD thesis, is one of the first references
containing the modern definition of a vector space. Some mathematicians, mostly Italian
ones, like Peano, Pincherle and Volterra, essentially had the modern definition already in
the last decades of the 19th century, but their work had little impact since the usefulness
of an abstract/axiomatic approach was not yet widely appreciated. Cf. [85, Chapter 5] or
[41, 106].
Here I limit myself to mentioning that the basics of functional analysis (Hilbert and Banach
spaces and bounded linear maps between them) were developed in the period 1900-1930.
Nevertheless, many important developments (locally convex spaces, distributions, operator
algebras) took place in 1930-1960. After that, functional analysis has split up into many
quite specialized subfields that interact quite little with each other. The very interesting
article [166] ends with the conclusion that ‘functional analysis’ has ceased to exist as a
coherent field of study!
• The study of functional analysis requires a solid background in general topology. It may
well be that you’ll have to refresh and likely also extend yours. In Appendix A I have
collected brief accounts of the topics that – sadly – you are most likely not to have encoun-
tered before. All of them are treated in [56] or [142] (written by a functional analyst!), but
my favorite (I’m admittedly biased) reference is [108]. You should have seen Weierstrass’
theorem, but those of Tietze and Arzelà-Ascoli tend to vanish in the (pedagogical, not
factual) gap between general topology and functional analysis.
• If you find these notes too advanced, you might want to have a look at less ambitious
books like [116, 145, 148]. On the other hand, if you want more, [118] is a good place to
start, followed by [128, 94, 30, 141]. (The Dutch MasterMath course currently uses [30].)
One word about notation (without guarantee of always sticking to it): General vector spaces,
but also normed spaces, are denoted V, W, . . ., normed spaces also as E, F, . . .. Vectors in such
spaces are e, f, . . . , x, y, . . .. Linear maps are always denoted A, B, . . ., except linear functionals
V → F, which are ϕ, ψ. Algebras are usually denoted A, B, . . . and their elements a, b, . . .. (For
A = B(E) this leads to inconsistency, but I cannot bring myself to using capital letters for
abstract algebra elements.)

Note for experts. Our treatment deviates from the beaten path at a number of points. E.g. we
2 Stefan Banach (1892-1945). Polish mathematician and pioneer of functional analysis. Also known for B. algebras,
B.’s contraction principle, the B.-Tarski paradox and the Hahn-B. and B.-Steinhaus theorems, etc.

• emphasize that absolute and unconditional convergence of series are not the same thing
in infinite-dimensional spaces, including a proof of the Dvoretzky-Rogers theorem. Very
strangely, most introductory books on functional analysis fail to point this out.
• following [61] we simplify the lengthy ad hoc argument in the standard proof of the open
mapping theorem by using a lemma that also gives Tietze’s extension theorem.
• include a fairly recent (2017) proof [53] of the uniform boundedness theorem (weak ver-
sion) that only uses the axiom of countable choice, thus neither Baire’s theorem nor the
equivalent axiom of countable dependent choice (used by all previous ‘elementary’ proofs).
• and also show how Baire’s theorem gives a stronger version of uniform boundedness, which
is old but ignored by most authors.
• give a slick proof, inspired by [99], of the Hahn-Banach theorem using only Tychonov’s the-
orem for Hausdorff spaces (equivalent to the Ultrafilter Lemma) instead of Zorn’s lemma.
• follow [72] in proving the Arzelà-Ascoli theorem for complete metric spaces as target spaces
and the Kolmogorov-Riesz compactness theorem (but only for `p ).
• prove Pitt’s compactness theorem without using bases, following [38].
• introduce characters of a Banach algebra, albeit without the weak-∗ topology, at a rela-
tively early stage. Among other things, this allows for a relatively painless extension of
the continuous functional calculus for self-adjoint elements of a C ∗ -algebra to normal ones.
This approach seems to be new. (Doing this via the full-blown Gelfand isomorphism as
e.g. in [101] seems inappropriate in a first course.)
• follow Rickart [130] in proving the Beurling-Gelfand formula for the spectral radius in a
Banach algebra without using complex analysis (one reason being that the author cannot
assume that all his students have been exposed to complex analysis). Since we also do
the same on a few other occasions, like the Fuglede-Putnam theorem, the text does not
assume any knowledge of complex analysis (apart from the elementary fact that power
series converge on disks, which is not genuine complex analysis).
• we give a purely elementary construction of the Riesz projectors associated with isolated
points of the spectrum (which we apply to compact non-normal operators and in the
discussion of discrete and essential spectra).
• we similarly limit the use of measure theory to the absolute minimum until it becomes
unavoidable in discussing the spectral theorem for normal operators. We prove all standard
results on the Lebesgue spaces Lp for discrete measure spaces, i.e. sets S equipped with
the counting measure, not limiting ourselves to S = N. We then indicate how most proofs
generalize to arbitrary measure spaces, while for the duality (Lp )∗ ≅ Lq in the general
case we give a proof using uniform convexity.
case we give a proof using uniform convexity.
• touch upon, in Appendix B, a number of somewhat more advanced topics, for which there
is no time in the author’s lecture course, but which are very closely related to the core
material. In particular, we go slightly further into Banach space theory and operator
theory than many introductory texts.

Acknowledgment. I thank Bram Balkema, Victor Hissink Muller and Niels Vooijs for corrections,
but in particular Tim Peters for a huge number of them.

Part I: Fundamentals
2 Setting the stage
2.1 Topological algebra: Topological groups, fields, vector spaces
As said in the Introduction, functional analysis (even most of the non-linear theory) is concerned
with vector spaces, allowing infinite-dimensional ones. Large parts of linear algebra of course
work equally well for finite- and infinite-dimensional spaces. One aspect where problems arise in
infinite dimensions is the description of linear maps by matrices, for example since multiplication
of infinite matrices involves infinite summations, which require the introduction of topologies.
(Actually, in some restricted contexts infinite matrices still are quite useful.)
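The summation issue can be made concrete with a small sketch (mine, not from the notes): treating an infinite matrix as a function of its indices, each entry of a product is an infinite series whose partial sums may or may not converge. The rapidly decaying matrix A[i][k] = 2^-(i+k) below is a hypothetical example chosen so that the series is geometric and does converge.

```python
# Sketch (not from the notes): multiplying "infinite matrices" forces one to
# sum infinite series. The (i, j) entry of AB is sum_k A[i][k] * B[k][j];
# whether this converges is an analytic question about the entries.

def entry_of_product(A, B, i, j, terms):
    """Partial sum (first `terms` summands) of the series defining (AB)[i][j]."""
    return sum(A(i, k) * B(k, j) for k in range(terms))

# Hypothetical example with rapidly decaying entries A[i][k] = 2^-(i+k):
# the series for (AA)[0][0] is the geometric series sum_k 4^-k = 4/3.
A = lambda i, k: 2.0 ** -(i + k)
approx = entry_of_product(A, A, 0, 0, terms=50)
print(approx)  # close to 4/3
```

For entries that decay more slowly the same series can diverge, which is exactly why topologies (and boundedness conditions) enter the picture.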
We begin with the following

2.1 Definition A topological group is a group (G, ·, 1) equipped with a topology τ such that
the group operations G×G → G, (g, h) 7→ gh and G → G, g 7→ g −1 are continuous (where G×G
is given the product topology). (For abelian groups, one often denotes the binary operation by
+ instead of ·.)

2.2 Example 1. If (G, ·, 1) is any group then it becomes a topological group by putting τ =
τdisc , the discrete topology on G. (This is true since every function from a discrete space to a
topological space is continuous.)
2. The group (R, +, 0), where R is equipped with its standard topology, is easily seen to be
a topological group.
3. If n ∈ N3 and F ∈ {R, C} then the set GL(n, F) = {A ∈ Mn×n (F) | det(A) 6= 0} of
invertible n × n matrices is a group w.r.t. matrix product and inversion and in fact a topological
group when equipped with the subspace topology induced from Mn×n (F) ≅ Fn² .
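For item 3, a quick numerical sanity check (my illustration, not part of the text) shows what continuity of inversion means on GL(2, R): a small perturbation of an invertible matrix changes its inverse only slightly. The particular matrix and perturbation size are arbitrary choices.

```python
# Sanity check (not from the notes): inversion is continuous on GL(2, R).
# A small perturbation of an invertible matrix perturbs its inverse slightly.

def inv2(a, b, c, d):
    """Inverse of [[a, b], [c, d]], stored row-wise; requires det = ad - bc != 0."""
    det = a * d - b * c
    return (d / det, -b / det, -c / det, a / det)

A = (2.0, 1.0, 0.0, 1.0)               # det = 2, so A lies in GL(2, R)
eps = 1e-8
A_pert = (2.0 + eps, 1.0, 0.0, 1.0)    # nearby matrix, still invertible

diff = max(abs(x - y) for x, y in zip(inv2(*A), inv2(*A_pert)))
print(diff)  # on the order of eps
```

Near the boundary of GL(n, F), i.e. as det(A) → 0, the inverse blows up, which is consistent with GL(n, F) being an open subset of Mn×n (F).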

2.3 Definition A topological field is a field (F, +, 0, ·, 1) equipped with a topology on F such
that (F, +, 0) and (F\{0}, ·, 1) are topological groups. (Equivalently, all field operations are
continuous.) Usually we just write F and denote the topology by τF .
Again, if F is any field then (F, τdisc ) is a topological field.

2.4 Exercise Prove that R and C are topological fields when equipped with their standard
topologies induced by the metric d(c, c0 ) = |c − c0 |.

2.5 Definition Let F be a topological field. Then a topological vector space (TVS) over F
is an F-vector space V equipped with a topology τV (to be distinguished, obviously, from the
topology τF on F) such that the maps V × V → V, (x, y) 7→ x + y and F × V → V, (c, x) 7→ cx
are continuous.
(These conditions imply that V → V, x 7→ −x is continuous, so that (V, +, 0) is a topological
group, but not conversely.)

2.6 Exercise Let F be a topological field.


(i) Prove that every F-vector space is a TVS over F if equipped with the indiscrete topology.
(ii) If F ∈ {R, C}, prove that an F-vector space V 6= {0} equipped with the discrete topology
is a TVS over F if and only if F is discrete.
3 Throughout these notes, N = {1, 2, 3, . . .}, thus 0 ∉ N.

(iii) If S is any set, let V = {f : S → F} = FS . With pointwise addition and scalar multiplica-
tion, V is an F-vector space. Let τFS be the product topology on FS . Prove that (V, τFS ) is
a TVS over F.
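For (iii), it may help to recall what convergence in the product topology τFS amounts to: a sequence converges if and only if it converges coordinatewise, i.e. pointwise on S. A small sketch (my illustration, with S finite purely for concreteness):

```python
# Sketch (not from the notes): in F^S with the product topology, convergence
# of a sequence is exactly coordinatewise (pointwise) convergence.
S = range(10)

def f(n):
    # hypothetical sequence f_n(s) = s/n, converging pointwise to the zero function
    return [s / n for s in S]

# pointwise error sup_{s in S} |f_n(s) - 0| for a few values of n:
errors = [max(abs(v) for v in f(n)) for n in (10, 100, 1000)]
print(errors)  # decreasing toward 0
```

For infinite S the same criterion holds coordinate by coordinate, but pointwise convergence then no longer implies uniform convergence, which is one reason FS with the product topology behaves very differently from normed spaces.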
In this course, the only topological fields considered are R and C. When a result holds for
either of the two, we will write F. (In part I of these notes it will hardly ever matter whether
F = R or F = C, but much of part II will require F = C.) Note however that one can consider
topological vector spaces over other topological fields, like the p-adic ones Qp [60]. (This said,
the resulting p-adic functional analysis is quite different in some respects from the ‘usual’ one,
cf. the comments in Section B.1 and the literature, e.g. [138, 125].) Now:

2.7 Definition Functional analysis (ordinary, as opposed to p-adic) is concerned with topo-
logical vector spaces over R or C and continuous maps between them. Linear functional analysis
considers only linear maps.
As it turns out, the notion of a topological vector space is a bit too general to base a satis-
factory and useful theory upon. We’ll prove only one result (Proposition 2.29) in this setting.
Just as in topology it is often (but by no means always!) sufficient to work with metric spaces,
so in functional analysis it usually suffices to consider certain subclasses of topological vector
spaces. The following diagram illustrates some of these classes and their relationships:

topological vector sp.  ⊃  metrized/F-sp.
        ∪                      ∪
locally convex sp.  ⊃  Fréchet sp.  ⊃  normed/Banach sp.  ⊃  (pre-)Hilbert sp.

(Note that F-spaces, Fréchet⁴, Banach and Hilbert⁵ spaces are assumed complete, but one also
has the non-complete versions. There is no special name for Fréchet spaces with completeness
dropped other than metrizable locally convex spaces. In the other cases, one speaks of metrized,
normed and pre-Hilbert spaces.)
The most useful of these classes are those in the bottom row. In fact, locally convex (vector)
spaces are general enough for almost all applications. They are thoroughly discussed in the
MasterMath course on functional analysis, while we will only briefly touch upon them. Most
of the time, we will be discussing Banach and Hilbert spaces. There is much to be said for
studying them in some depth before turning to locally convex (or more general) spaces.
(Some books on functional analysis, like [141], begin with general topological vector spaces and
then turn to some special classes, but for a first encounter this does not seem appropriate. This
said, the author doesn't see the point of beginning by proving many results on Hilbert spaces
that literally generalize to Banach spaces.)

2.2 Translation-invariant metrics. Normed and Banach spaces


I assume that you remember the notion of a metric on a set X: A map d : X × X → [0, ∞)
satisfying d(x, y) = d(y, x) and d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ X and d(x, y) = 0 ⇔
x = y. Every metric d on X defines a topology τd on X, the smallest topology τ containing
all open balls B(x, r) = {y ∈ X | d(x, y) < r}. (The open balls then form a base⁶, not just
⁴ Maurice Fréchet (1878-1973). French mathematician. Introduced metric spaces in his 1906 PhD thesis.
⁵ David Hilbert (1862-1943). Eminent German mathematician who worked on many different subjects. Considered
the strongest and most influential mathematician in the decades around 1900, only Poincaré coming close.
⁶ We write ‘base’ for the notion in topology and ‘basis’ for the one in linear algebra. The plural of both is ‘bases’.

a subbase, for τ .) A topology τ on X is called metrizable if there exists a metric d on X
(not necessarily unique) such that τ = τd . Metrizable topologies automatically have many nice
properties, e.g. normality and, a fortiori, the Hausdorff property. I also assume as familiar
the notion of completeness of a metric space and the fact that every metric space can be
completed, i.e. embedded isometrically into a complete metric space (unique up to isometry) as
a dense subspace.
On a vector space, it is natural and common to consider only metrics of a special type:

2.8 Definition Let F ∈ {R, C}.


(i) A metric d on an F-vector space V is called translation-invariant if it satisfies the equivalent
statements

d(x, y) = d(x − z, y − z) ∀x, y, z ∈ V ⇔ d(x, y) = d(x − y, 0) ∀x, y ∈ V. (2.1)

(ii) A topological F-vector space (V, τ ) is called (completely) metrizable if there exists a
translation-invariant (and complete) metric d on V such that τ = τd . (Completely metriz-
able TVS are also called F -spaces.)

2.9 Lemma Let V be a vector space over F ∈ {R, C} and d a translation-invariant metric on
V . Then addition V × V → V, (x, y) ↦ x + y and inversion V → V, x ↦ −x are continuous,
thus (V, τd ) is a topological abelian group.
Proof. If x, x′, y, y′ ∈ V we have

d(x + y, x′ + y′) = d(x, x′ + y′ − y) ≤ d(x, x′) + d(x′, x′ + y′ − y) = d(x, x′) + d(y, y′),

where we used translation invariance twice and the triangle inequality once. This implies that
the operation of addition + : V × V → V is jointly continuous. Continuity of the inverse
operation x ↦ −x follows from d(−x, −y) = d(0, x − y) = d(x − y, 0) = d(x, y). 

2.10 Remark 1. If d is a translation-invariant metric on a vector space V , (V, τd ) need not be
a TVS since the scalar multiplication F × V → V can fail to be continuous! (This follows from
Exercise 2.6(ii) and the fact that the discrete topology is metrizable by d(x, y) = 1 whenever
x ≠ y.) With d(cx, c′x′) ≤ d(cx, cx′) + d(cx′, c′x′) = d(c(x − x′), 0) + d((c − c′)x′, 0) we find
that continuity of the scalar action F × V → V is equivalent to the condition that d(cx, 0) → 0
whenever c → 0 (with x fixed) or whenever d(x, 0) → 0 (with c fixed). This problem does not
arise for topologies coming from a norm, to which we turn soon.
2. Since metric spaces are Hausdorff and first countable, we have many examples of non-
metrizable TVS: all non-Hausdorff ones, like the indiscrete ones, cf. Exercise 2.6(i). On the
other hand, RS is Hausdorff and a TVS by Exercise 2.6(iii), but for uncountable S it doesn't
have countable neighborhood bases, thus is not metrizable.
3. The necessary condition for metrizability given in 2. can be proven to be sufficient, see
[141, Theorem 1.24], where it suffices to have a countable⁷ open neighborhood base at zero. □

2.11 Exercise Let (V, τ ) be a topological vector space.


(i) Let d be a translation invariant metric on V such that τ = τd . Show that a sequence {xn }
in V is Cauchy w.r.t. d if and only if for every open neighborhood U of 0 there is an N ∈ N
such that n, m ≥ N implies xn − xm ∈ U .
⁷ ‘Countable’ always means ‘at most countable’; otherwise we'll say ‘countably infinite’.

(ii) If d1 , d2 are translation-invariant metrics on V such that τd1 = τd2 , show that d1 is complete
if and only if d2 is complete.

2.12 Remark 1. The analogue of (ii) for topological spaces is false: there can be equivalent
metrics of which only one is complete!
2. ⋆ The equivalent characterization of Cauchy sequences in (i) makes sense in an arbitrary
TVS V : A net {xι }ι∈I in V is called a Cauchy net if for every open neighborhood U of 0 there
is an ι0 ∈ I such that ι, ι′ ≥ ι0 implies xι − xι′ ∈ U . Now V is called complete if every Cauchy
net converges. □

2.13 Definition Let V be a vector space over F ∈ {R, C}. A seminorm on V is a map
V → [0, ∞), x ↦ kxk (thus kxk = ∞ is not allowed!) such that

kx + yk ≤ kxk + kyk ∀x, y ∈ V. (Subadditivity)

kcxk = |c| kxk ∀c ∈ F, x ∈ V. (Absolute homogeneity)

A norm is a seminorm satisfying also kxk = 0 ⇒ x = 0.


A normed space over F is a pair (V, k · k), where V is an F-vector space and k · k a norm on
V.
It is immediate that every seminorm satisfies k0k = 0 and k − xk = kxk.

2.14 Exercise If k · k is a (semi)norm on V , prove | kxk − kyk | ≤ kx − yk ∀x, y ∈ V , the
‘reverse triangle inequality’.

2.15 Lemma Let (V, k · k) be a normed F-vector space, and define d(x, y) = kx − yk. Then
(i) d is a translation-invariant metric on V .
(ii) (V, τd ) is a Hausdorff topological vector space.
(iii) The map V → [0, ∞), x ↦ kxk is τd -continuous.
Proof. (i) That norms give rise to metrics is probably known: d(x, y) ≥ 0 follows from kxk ≥ 0,
and d(x, y) = 0 ⇔ x = y follows from the norm axiom kxk = 0 ⇔ x = 0. Furthermore,

d(y, x) = ky − xk = k − (x − y)k = kx − yk = d(x, y),

where we used k − xk = kxk, a special case of the second seminorm axiom. Finally,

d(x, z) = kx − zk = kx − y + y − zk ≤ kx − yk + ky − zk = d(x, y) + d(y, z).

Now translation invariance of d is obvious: d(x, y) := kx − yk = d(x − y, 0).


(ii) All metric spaces are Hausdorff. We have seen earlier that the topology coming from a
translation-invariant metric makes addition jointly continuous. And if x, x′ ∈ V and c, c′ ∈ F
then

d(cx, c′x′) = kcx − c′x′k = kcx − cx′ + cx′ − c′x′k ≤ kc(x − x′)k + k(c − c′)x′k
            = |c| kx − x′k + |c − c′| kx′k = |c| d(x, x′) + |c − c′| kx′k.

This implies joint continuity of the scalar action F × V → V .


(iii) This is immediate by the inequality proven in Exercise 2.14. 

2.16 Definition • A norm k·k on a vector space V is called complete if the metric d(x, y) =
kx − yk is complete.
• A topological vector space (V, τ ) is called (completely) normable if there exists a (complete)
norm k · k on V such that τ = τd with d(x, y) = kx − yk.
• A complete normed space is called a Banach space.
• A normed space (V, k · k) is called separable if the associated norm topology τ is separable.
(I.e. V has a countable τ -dense subset.)

2.17 Remark 1. Obviously every normable TVS is metrizable. If d is a translation-invariant


metric on V and we put kxk = d(x, 0) then clearly kxk = 0 ⇒ x = 0 and the computation
kx + yk = d(x + y, 0) = d(x, −y) ≤ d(x, 0) + d(0, −y) = d(x, 0) + d(y, 0) = kxk + kyk proves
subadditivity. But there is no reason why kcxk = |c|kxk should hold. We will later see examples
of metrizable TVS that are not normable, see Section 4.2. (Such spaces, while somewhat better
behaved than general TVS, can still be rather pathological. The subclass of Fréchet spaces, see
below, is much better.)
2. We will soon prove that every finite-dimensional Hausdorff TVS is normable.
3. In every normed space (V, k · k) the balls B(0, r) = {x ∈ V | kxk < r} are bounded,
open and convex (see Definition 5.22). The first two properties are obvious, and convexity
follows from ktx + (1 − t)yk ≤ tkxk + (1 − t)kyk < tr + (1 − t)r = r for all x, y ∈ B(0, r) and
t ∈ [0, 1]. Conversely, one can show that every TVS in which zero has a bounded convex
open neighborhood is normable! (Note that in a general TVS one needs a new definition of
boundedness of sets since neither norm nor metric is available a priori. See Appendix B.6.2
for definition and proof.)
4. Just as complete metric spaces are ‘better behaved’ (in the sense of allowing stronger
theorems) than general metric spaces, Banach spaces are ‘better’ than normed spaces. We’ll
meet some applications of completeness in Section 3.2, and more will follow.
5. Separability is a somewhat annoying restriction that we will avoid as much as possible.
(An opposite philosophy, cf. e.g. [116], restricts to separable spaces from the beginning in order
to make do with weak versions of the axiom of choice.) □

2.18 Example 0. Clearly F ∈ {R, C} is a vector space over itself and kck := |c| defines a norm,
making F a complete normed F-vector space.
1. Let X be a compact topological space and V = C(X, F). Clearly, V is an F-vector space.
Now kf k = supx∈X |f (x)| is a norm on V . You probably know that the normed space (V, k · k)
is complete. (See Lemma A.30 for a proof.) One can prove that it is separable if and only if X
is second countable, see Proposition A.48.
If X is non-compact then kf k can be infinite; but on

Cb (X, F) = {f ∈ C(X, F) | kf k < ∞},

k · k again is a norm with which Cb (X, F) is complete.


2. Let n ∈ N and V = Cn . For x ∈ V and 1 ≤ p < ∞ (NB: p does not stand for prime!),
define

kxk∞ = max{ |xi| : i = 1, . . . , n },

kxkp = ( |x1|^p + · · · + |xn|^p )^{1/p}.

(Note that all these k · kp including p = ∞ coincide if n = 1.) It is quite obvious that for
each p ∈ [1, ∞] we have kxkp = 0 ⇔ x = 0 and kcxkp = |c| kxkp . For p = 1 and p = ∞ also
the subadditivity is trivial to check using only |c + d| ≤ |c| + |d|. Subadditivity also holds for
1 < p < ∞, but is harder to prove. You have probably seen the proof for p = 2, which relies on
the Cauchy-Schwarz inequality. The proof for 1 < p < 2 and 2 < p < ∞ is similar, using
Hölder's inequality instead.
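Since these norms involve nothing beyond absolute values and powers, they are easy to experiment with. The following sketch (our own illustration, not part of the notes' formal development) implements k · kp on Rn in plain Python and spot-checks subadditivity on random vectors; it is of course no substitute for the proof via Hölder's inequality.

```python
import random

def p_norm(x, p):
    """||x||_p on R^n; p = float('inf') gives the max norm ||x||_inf."""
    if p == float('inf'):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

random.seed(0)
for p in (1, 1.5, 2, 3, float('inf')):
    for _ in range(200):
        x = [random.uniform(-1, 1) for _ in range(5)]
        y = [random.uniform(-1, 1) for _ in range(5)]
        s = [a + b for a, b in zip(x, y)]
        # subadditivity ||x + y||_p <= ||x||_p + ||y||_p, up to rounding
        assert p_norm(s, p) <= p_norm(x, p) + p_norm(y, p) + 1e-12
```

For n = 1 all these norms reduce to the absolute value, as remarked above.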

2.19 Exercise Prove that the norms k · kp on Fn are complete for all F ∈ {R, C}, n ∈ N, p ∈
[1, ∞].
3. The above examples are easily generalized to infinite dimensions. Let S be any set. For
a function f : S → F and 1 ≤ p < ∞ define

kf k∞ = sup_{s∈S} |f (s)|,        kf kp = ( Σ_{s∈S} |f (s)|^p )^{1/p},

with the understanding that (+∞)^{1/p} = +∞. For the definition of infinite sums like
Σ_{s∈S} f (s) see Appendix A.1. Now let

ℓp (S, F) = {f : S → F | kf kp < ∞}.

One can prove that k · kp is a complete norm on ℓp (S, F) for each p ∈ [1, ∞]. We
will do this in Section 4.
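For a concrete f, take S = N and f(s) = 1/s: the partial sums of Σ_s |f(s)|^p converge for p = 2 (to π²/6, by Euler's classical computation) but grow without bound for p = 1, so f ∈ ℓ2(N, F) \ ℓ1(N, F). A small numeric sketch (our own):

```python
import math

def partial_p_sum(p, N):
    """Partial sum sum_{n=1}^{N} |f(n)|^p for f(n) = 1/n."""
    return sum((1.0 / n) ** p for n in range(1, N + 1))

# p = 2: the partial sums approach pi^2/6, so f lies in l^2(N)
assert abs(partial_p_sum(2, 100_000) - math.pi ** 2 / 6) < 1e-4

# p = 1: the harmonic sums grow like log N without bound, so f is not in l^1(N)
assert partial_p_sum(1, 100_000) - partial_p_sum(1, 10_000) > 2.0
```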
4. Let (X, A, µ) be a measure space, f : X → F measurable and 1 ≤ p < ∞. We define

kf kp = ( ∫_X |f|^p dµ )^{1/p},
kf k∞ = inf{M > 0 | µ({x ∈ X | |f (x)| > M }) = 0}

and
Lp (X, A, µ; F) = {f : X → F measurable | kf kp < ∞}.

Then kf kp = ( ∫_X |f|^p dµ )^{1/p} is a seminorm on Lp (X, A, µ; F) for all 1 ≤ p < ∞.

However, in general k · kp is not a norm, since kf kp vanishes whenever f is zero almost
everywhere, i.e. µ(f^{-1}(F\{0})) = 0, which may well happen even if f is not identically zero. In
order to obtain a normed space one puts V0 = {f ∈ Lp | kf kp = 0}, which is a linear subspace,
and passes to the quotient space Lp (X, A, µ; F)/V0 , to which the norms k · kp
descend. Going into the details would require too much measure theory. See [101, Appendices
A.1-A.3] for a crash course or [29, 146] for the full story.
There is an instructive special case: If S is a set, A = P(S) (the power set) and µ(A) = #A (the counting
measure) then for every f : S → F we have ∫_S f (s) dµ(s) = Σ_{s∈S} f (s), where the integral, like
the (unordered) sum, exists if and only if Σ_{s∈S} |f (s)| < ∞. Thus Lp (S, A, µ; F) = ℓp (S, F)⁸.
Note that the norm of a normable space (V, τ ) is never unique (unless V = {0}): if k · k is
a norm compatible with τ then so is ck · k for every c > 0. Thus the choice of a
norm on a vector space is an extra piece of structure. If k · k1 , k · k2 are different norms on V
then (V, k · k1 ), (V, k · k2 ) are different as normed spaces, even if the norms give rise to the same
topology!
⁸ In view of these facts, which we cannot prove without going deeper into measure and integration theory, it certainly
isn't unreasonable to ask that you understand the much simpler unordered summation.

2.20 Definition Let V be an F-vector space. Two norms k·k1 , k·k2 on V are called equivalent
if τd1 = τd2 , where di (x, y) = kx − yki .
This definition is a special case of the notion of equivalence of two metrics d1 , d2 on a set,
also defined by τd1 = τd2 . In that general situation one can prove criteria for equivalence, cf.
e.g. [108], but for normed spaces one has a much simpler one:

2.21 Proposition Two norms k · k1 , k · k2 on an F-vector space V are equivalent if and only if
there are 0 < c′ ≤ c such that c′kxk1 ≤ kxk2 ≤ ckxk1 for all x ∈ V .
Proof. Since the norm topologies τi are defined in terms of the translation-invariant metrics
di , for them to coincide it suffices that every d1 -open ball around zero contains a d2 -open ball
around zero and vice versa. By the absolute homogeneity of the norms, this is equivalent to
the existence of s, s′ > 0 such that B^{k·k1}(0, s) ⊆ B^{k·k2}(0, 1) and B^{k·k2}(0, s′) ⊆ B^{k·k1}(0, 1), which
means that
kxk1 < s ⇒ kxk2 < 1 and kxk2 < s′ ⇒ kxk1 < 1. (2.2)
This clearly is implied by the statement c′kxk1 ≤ kxk2 ≤ ckxk1 with c′ > 0. On the other hand,
by continuity of the norms (2.2) implies kxk1 ≤ s ⇒ kxk2 ≤ 1 and kxk2 ≤ s′ ⇒ kxk1 ≤ 1
which, using homogeneity again, gives kxk2 ≤ s^{-1}kxk1 and kxk1 ≤ s′^{-1}kxk2 , i.e. our condition
(with c′ = s′, c = s^{-1}). 

2.22 Example Let F ∈ {R, C}, p ∈ [1, ∞), n ∈ N. Then for x ∈ Fn we have

kxk∞^p = max_i |xi|^p ≤ kxkp^p = Σ_{i=1}^n |xi|^p ≤ n kxk∞^p.

Thus kxk∞ ≤ kxkp ≤ n^{1/p} kxk∞ , so that k · k∞ is equivalent to k · kp for all p < ∞.
This clearly implies that all k · kp , p ∈ [1, ∞] are mutually equivalent. (You probably know that
all norms on Fn are equivalent, not only those of the form k · kp . We will prove the even stronger
result that there is only one Hausdorff topology on Fn making it a TVS.)
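Both inequalities, and the sharpness of the constant n^{1/p} at the all-ones vector, can be spot-checked numerically (our own sketch):

```python
import random

def p_norm(x, p):
    """||x||_p on R^n; p = float('inf') gives the max norm."""
    if p == float('inf'):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

random.seed(1)
n = 7
for p in (1, 2, 3.5):
    # the upper constant n^{1/p} is attained at x = (1, ..., 1)
    ones = [1.0] * n
    assert abs(p_norm(ones, p) - n ** (1.0 / p)) < 1e-9
    for _ in range(500):
        x = [random.uniform(-10, 10) for _ in range(n)]
        inf_norm = p_norm(x, float('inf'))
        # ||x||_inf <= ||x||_p <= n^{1/p} ||x||_inf
        assert inf_norm <= p_norm(x, p) + 1e-9
        assert p_norm(x, p) <= n ** (1.0 / p) * inf_norm + 1e-9
```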
Later (Section 7.1) we will also prove the following deeper and quite surprising result:

2.23 Theorem (Two norm theorem) If V is a vector space that is complete w.r.t. each of
the norms k · k1 , k · k2 and k · k2 ≤ ck · k1 for some c > 0, then also k · k1 ≤ c′k · k2 for some c′ > 0,
thus the two norms are equivalent.

2.24 Example Let S be an infinite set, F ∈ {R, C} and V = ℓ1 (S, F). Then kf k∞ ≤ kf k1
for all f ∈ V , but the norms are not equivalent, for example since (V, k · k1 ) is complete while
(V, k · k∞ ) is not. This also is the reason why there is no contradiction with the above theorem.

2.25 Exercise Prove: If V is a vector space and k · k1 , k · k2 are equivalent norms on V then
completeness of (V, k · k1 ) is equivalent to completeness of (V, k · k2 ).

2.26 Exercise Let (V, k · k) be a normed space. Put d(x, y) = kx − yk and let (V̂, d̂) be the
completion of the metric space (V, d). Prove that V̂ is a Banach space (in particular a vector
space!) and give its norm.

2.27 Exercise Let (V1 , k · k1 ), (V2 , k · k2 ) be normed spaces.

(i) Prove that k(x1 , x2 )ks = kx1 k1 +kx2 k2 and k(x1 , x2 )km = max(kx1 k1 , kx2 k2 ) are equivalent
norms on V1 ⊕ V2 .
(ii) Prove that (V1 ⊕ V2 , k · ks/m ) is complete if and only if (V1 , k · k1 ), (V2 , k · k2 ) both are
complete.
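For part (i), the key elementary fact is max(a, b) ≤ a + b ≤ 2 max(a, b) for a, b ≥ 0. A quick numeric sanity check (our own sketch; the concrete choices V1 = (R3, k · k1) and V2 = (R4, k · k∞) are arbitrary):

```python
import random

random.seed(4)
for _ in range(1000):
    x1 = [random.uniform(-5, 5) for _ in range(3)]   # element of V1 = (R^3, ||.||_1)
    x2 = [random.uniform(-5, 5) for _ in range(4)]   # element of V2 = (R^4, ||.||_inf)
    n1 = sum(abs(t) for t in x1)                     # ||x1||_1
    n2 = max(abs(t) for t in x2)                     # ||x2||_inf
    s, m = n1 + n2, max(n1, n2)
    # max(a, b) <= a + b <= 2 max(a, b) for a, b >= 0
    assert m <= s <= 2 * m
```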

2.28 Exercise (i) Let {(Vi , k · ki )}i∈I be a family of normed spaces, where I is any set. Put

⊕_{i∈I} Vi = { f | Σ_{i∈I} kf (i)ki < ∞ }.

(Technically, this is a subset of Π_{i∈I} Vi = {f : I → ∪_j Vj | f (i) ∈ Vi ∀i ∈ I}.) Prove that
this is a linear space and kf k = Σ_i kf (i)ki a norm on it.
(ii) Prove that (⊕_{i∈I} Vi , k · k) is complete if all the Vi are complete. Hint: The proof is an
adaptation of the one for ℓ1 (S) given in Section 4.3.
If Vi = F for all i ∈ I with the usual norm, we have ⊕_{i∈I} Vi ≅ ℓ1 (I, F).

2.3 A glimpse beyond normed spaces


Normed spaces are an extremely versatile notion that has many applications. Below we will
prove that all finite-dimensional Hausdorff TVS are normable. But non-normable spaces exist,
and we have already met one generalization of normable spaces, the metrizable TVS. We will
quickly look at a generalization into a different direction, the locally convex TVS. The spaces
that are both locally convex and metrizable, called Fréchet spaces, are almost as ‘good’ as
Banach spaces.

2.3.1 Finite-dimensional TVS


2.29 Proposition Let V be a finite-dimensional vector space over F ∈ {R, C}. Then there is
a unique Hausdorff topology τ on V making (V, τ ) a topological F-vector space.
Proof. We give the proof for F = R and leave it to the reader to adapt it to F = C.
Let τF be the standard topology on F coming from d(c, c′) = |c − c′|. Let V be a finite-
dimensional F-vector space, put n = dimF V and choose a basis E = {e1 , . . . , en } for V . Now
the product topology τFn = τF × · · · × τF on Fn is Hausdorff, and (Fn , τFn ) is a TVS by Exercise
2.6(iii) or by Lemma 2.15 and the fact that any of the norms k · kp , p ∈ [1, ∞] on Fn induces
the product topology τFn . Since the map αE : Fn → V, (x1 , . . . , xn ) ↦ Σ_{i=1}^n xi ei is a bijection,
τ = {αE (U ) | U ∈ τFn } is a Hausdorff topology on V and one easily checks that (V, τ ) is a TVS.
This proves existence of τ . For uniqueness, let τ be an arbitrary Hausdorff topology on V
making (V, τ ) a TVS. Then the maps · : F × V → V and + : V × V → V are continuous w.r.t.
τ . This implies that αE : (Fn , τFn ) → (V, τ ) is continuous, thus τ′ := {αE^{-1}(U ) | U ∈ τ } ⊆ τFn .
Since αE is a bijection, τ′ is a Hausdorff topology on Fn such that · : F × (Fn , τ′) → (Fn , τ′) is
continuous. Thus if we prove that every such τ′ ⊆ τFn coincides with τFn , the uniqueness of τ
follows.
Let S = {x ∈ Fn | kxk2 = 1} be the euclidean unit sphere, which is τFn -compact. Now
τ′ ⊆ τFn implies that S is also τ′-compact⁹, and therefore τ′-closed (since τ′ is Hausdorff). Thus
Fn\S ∈ τ′. Since the scalar action F × Fn → Fn is continuous (w.r.t. the metric topology τF on F
and τ′ on Fn ), the pre-image W = ·^{-1}(Fn\S) ⊆ F × Fn , which contains (0, 0), is τF × τ′-open. By
definition of the product topology, there are ε > 0 and U′ ∈ τ′ with 0Fn ∈ U′ such that B(0F , ε) × U′ ⊆ W .
In other words, c ∈ F, |c| < ε and x ∈ U′ imply cx ∉ S, which is equivalent to kcxk2 ≠ 1 and
to kxk2 ≠ 1/|c|. Since 1/|c| may assume any value larger than 1/ε, we find that x ∈ U′ implies
kxk2 ≤ 1/ε. Replacing x by d^{-1}x, where d > 0, we find x ∈ dU′ ⇒ kxk2 ≤ d/ε, so that the
map id : (Fn , τ′) → (Fn , τFn ) is continuous at zero, thus everywhere by linearity. This means
τFn ⊆ τ′, completing the proof of τ′ = τFn . 
⁹ An open cover by elements of τ′ is an open cover by elements of τFn , thus has a finite subcover.

2.30 Remark 1. The choice of the basis E in the above proof does not matter. Why?
2. The example of the indiscrete topology, which turns every vector space into a TVS, shows
that there is no uniqueness if one omits the Hausdorff assumption.
3. The above proof is quite typical for proofs in topological algebra. Luckily, as soon as
we have (semi)norms at our disposal, proofs tend to be less point-set topological. For example,
explicit invocations of the product topology are rare. □

2.31 Corollary Every finite-dimensional Hausdorff TVS (V, τ ) over F ∈ {R, C} is normable.
Proof. Pick a basis E = {e1 , . . . , en } for V and define a norm on V by kxk = kαE^{-1}(x)k1 . (Thus
kΣ_i ci ei k = Σ_i |ci |.) By Lemma 2.15 the topology on V induced by this norm is Hausdorff,
thus coincides with τ by Proposition 2.29. Thus τ is normable. 

2.32 Corollary On a finite-dimensional vector space over F ∈ {R, C}, all norms are equiva-
lent.
Proof. Let k · k1 , k · k2 be norms on V . They give rise to topologies τ1 , τ2 such that (V, τi ) is a
Hausdorff TVS for i = 1, 2. Proposition 2.29 implies τ1 = τ2 , so that in view of Definition 2.20
the norms are equivalent. 

2.33 Exercise Prove that every finite-dimensional normed space (V, k · k) over R or C is com-
plete.

2.3.2 Locally convex and Fréchet spaces


We have seen that every norm on a vector space gives rise to a translation-invariant metric and
a TVS structure. Analogously, if k · k is a seminorm on V , but not a norm, then d(x, y) = kx − yk
defines only a pseudo-metric, and τd is not Hausdorff (if x ≠ 0 is such that kxk = 0 then there
are no disjoint open sets U, V containing x, 0, respectively).

2.34 Definition If V is an F-vector space and F is a family of seminorms on V then the


topology τF is the smallest topology on V containing the balls

Bk·k (x, r) = {y ∈ V | kx − yk < r}

for all x ∈ V, r > 0, k · k ∈ F.


More explicitly, τF consists of all unions of finite intersections of such balls, i.e. the latter
form a subbase for τF . Now a sequence or net {xι } in V converges to z ∈ V if and only if
kxι − zk converges to zero for each k · k ∈ F.

2.35 Definition We say that F is separating if for any non-zero x ∈ V there is a k · k ∈ F
such that kxk ≠ 0.

The property of being separating is important since one usually is only interested in Hausdorff
topological vector spaces and the following holds:

2.36 Lemma The topology τF induced by a family F of seminorms on V is Hausdorff if and


only if F is separating.
Proof. ⇒ Assume F is not separating. Then there is 0 ≠ x ∈ V such that kxk = 0 for all
k · k ∈ F. Then by definition of τF , every open set containing 0 also contains x and vice versa,
so that τF is not Hausdorff.
⇐ Assume x ≠ y. By assumption there is a k · k ∈ F such that c = kx − yk > 0. Let
U = Bk·k (x, c/2), V = Bk·k (y, c/2). Then U, V are open sets containing x, y, respectively, and
existence of z ∈ U ∩ V would imply the contradiction kx − yk ≤ kx − zk + kz − yk < c/2 + c/2 =
c = kx − yk. Thus U ∩ V = ∅, so that τF is Hausdorff. 

If V is an F-vector space and F is a family of seminorms on V , one can prove that V is


a topological vector space when equipped with the topology τF . The proof is not much more
complicated than for the case of one (semi)norm considered above.

2.37 Definition A topological vector space (V, τ ) over F is called locally convex if there exists
a separating family F of seminorms on V such that τ = τF .

2.38 Remark 1. Note that with this definition, locally convex spaces are Hausdorff.
2. For an equivalent, more geometric way of defining local convexity of a TVS see the
supplementary Section B.6.2, and for more on locally convex spaces see, e.g., [94, 30, 141].
3. Locally convex spaces, introduced in 1935 by von Neumann10 , have many applications.
In these notes, we will encounter the weak and weak-∗ topologies on (duals of) Banach spaces
and the strong and weak operator topologies. There are many others, as in distribution theory,
relevant for the theory of (partial) differential equations. □

If the separating family F has just one element, we are back at the notion of a normed,
possibly Banach, space. If F is finite, i.e. F = {k · k1 , . . . , k · kn }, then k · k = Σ_{i=1}^n k · ki is a
seminorm, and it is a norm if and only if F is separating. Thus the case of finite F again gives
a normed space, and F must be infinite in order for interesting things to happen.
If F is infinite, we cannot obtain a norm by putting kxk′ = Σ_{k·k∈F} kxk, since the r.h.s. has
no reason to converge for all x ∈ V . But if the family F of seminorms on V is countable, we
can label the elements of F as k · kn , n ∈ N and define

d(x, y) = Σ_{n=1}^∞ 2^{-n} min(1, kx − ykn ).

Now each term min(1, kx − ykn ) is a translation-invariant pseudometric [defined like a metric,
but without the requirement d(x, y) = 0 ⇒ x = y] on V that is bounded by 1, and the sum
converges to a translation-invariant metric on V . With just a bit more work one shows that
τF = τd , thus (V, τF ) is metrizable. (Note that we could not have defined kxk = Σ_{i=1}^∞ 2^{-i} kxki
since this again may fail to converge, thus need not be a norm.) If such a space is complete, it
is called a Fréchet space. Clearly, every Fréchet space is an F-space.
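Truncating the series makes this metric computable: since the n-th term is at most 2^{-n}, stopping after N terms costs at most 2^{-N}. The sketch below (our own; as seminorm family we take kxkn = |xn| on real sequences, i.e. the topology of coordinatewise convergence) illustrates that d is bounded by 1 and that only the early coordinates contribute appreciably:

```python
def frechet_dist(x, y, terms=60):
    """d(x, y) = sum_{n=1}^{oo} 2^{-n} min(1, |x_n - y_n|), truncated after
    `terms` terms; the neglected tail is at most 2^{-terms}."""
    return sum(2.0 ** -n * min(1.0, abs(x(n) - y(n))) for n in range(1, terms + 1))

zero = lambda n: 0.0

# d is bounded by 1, no matter how large the sequence entries are
huge = lambda n: 1e9
assert frechet_dist(huge, zero) <= 1.0

# sequences that are small in every coordinate are d-close to 0
small = lambda n: 0.01
assert frechet_dist(small, zero) < 0.011

# changing only the coordinates beyond n = 30 moves d by less than 2^{-30}
tail = lambda n: 0.0 if n <= 30 else 1e9
assert frechet_dist(tail, zero) < 2.0 ** -30
```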
¹⁰ John von Neumann (1903-1957). Hungarian, later American, mathematician. Countless contributions, mostly to
foundational matters and analysis, e.g. the theory of unbounded operators and the spectral theorem, von Neumann
algebras, locally convex spaces, but also to applied mathematics and computer science.

Here is an example of a Fréchet space: For f ∈ C∞ (R, C) and n, m ∈ N0 , define

kf kn,m = sup_{x∈R} |x|^n |f^(m)(x)|,

where f^(m) is the m-th derivative of f . These k · kn,m are seminorms. Since the family F =
{k · kn,m | n, m ∈ N0 } is countable, the space

S = {f ∈ C ∞ (R, C) | kf kn,m < ∞ ∀n, m ∈ N0 }

equipped with the topology τF is a Fréchet space. Its elements are called Schwartz¹¹ functions.
They are infinitely differentiable functions that, together with all their derivatives, vanish as
|x| → ∞ faster than |x|^{-n} for any n.
variables.) Note that the seminorm k · k0,0 alone already separates the elements of S, thus is
a norm, but having the other seminorms around gives rise to a finer topology, one that is not
normable.
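For m = 0 these seminorms are easy to approximate on a grid. In the sketch below (our own) we take f(x) = e^{-x²/2}, for which sup_x |x|^n e^{-x²/2} = (n/e)^{n/2} is attained at |x| = √n (a short calculus exercise); the sup over R is approximated by the sup over [-20, 20], which is harmless since the function decays rapidly:

```python
import math

def seminorm_n0(f, n, grid):
    """||f||_{n,0} = sup_x |x|^n |f(x)|, approximated on a finite grid."""
    return max(abs(x) ** n * abs(f(x)) for x in grid)

gauss = lambda x: math.exp(-x * x / 2.0)
grid = [k / 1000.0 for k in range(-20000, 20001)]   # [-20, 20] in steps of 0.001

for n in range(5):
    approx = seminorm_n0(gauss, n, grid)
    exact = 1.0 if n == 0 else (n / math.e) ** (n / 2.0)  # sup at |x| = sqrt(n)
    assert abs(approx - exact) < 1e-4
```

All these seminorms of the Gaussian are finite, illustrating that it is a Schwartz function.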

3 Normed and Banach space basics


3.1 Linear maps: bounded ⇔ continuous
If E, F are vector spaces over F, a map A : E → F is called linear if A(x + y) = Ax + Ay for
all x, y ∈ E and A(cx) = cAx for all x ∈ E, c ∈ F. (NB: A map of the form x ↦ Ax + b,
where A : E → F is linear and b ∈ F \{0}, is not called a linear map, but an affine one!) As in
linear algebra, we mostly write Ax instead of A(x).

3.1 Definition A linear map A : V → W of normed spaces is called an isometry if kAxk = kxk
for all x ∈ V .
Recall that a linear map A : V → W is injective if and only if its kernel ker A = A^{-1}(0)
is {0}. It follows that an isometry is automatically injective. Furthermore, if A : V → W is
a surjective isometry then it is invertible, and its inverse also is an isometry. Then A is called
an isometric isomorphism of normed spaces, and we write V ≅ W . Normed spaces that are
isometrically isomorphic are essentially indistinguishable.

3.2 Definition Let E, F be normed spaces and A : E → F a linear map. Then the norm
kAk ∈ [0, ∞] is defined by

kAk = sup_{0≠e∈E} kAek/kek = sup_{kek=1} kAek.¹²

(The equality of the second and third expression is due to linearity of A and homogeneity of
the norms.) If kAk < ∞ then A is called bounded.
¹¹ Laurent Schwartz (1915-2002). French mathematician who invented ‘distributions’, an important notion in functional analysis.
¹² It should be clear that writing sup_{kek≤1} kAek instead would not change the result.

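A concrete finite-dimensional illustration (our own sketch): for A : (R3, k · k∞) → (R3, k · k∞) given by a matrix, the operator norm equals the maximal absolute row sum of the matrix, a standard formula for this pair of norms. The code compares this with the supremum of kAxk∞/kxk∞ over sampled vectors:

```python
import itertools, random

def sup_norm(x):
    return max(abs(t) for t in x)

def apply(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def op_norm_inf(A):
    """Operator norm of A : (R^n, ||.||_inf) -> (R^n, ||.||_inf):
    the maximal absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0],
     [2.0, 2.0, 2.0]]       # absolute row sums 3.5, 4.0, 6.0

random.seed(2)
samples = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(500)]
samples += [list(s) for s in itertools.product([-1.0, 1.0], repeat=3)]  # sign vectors
best = max(sup_norm(apply(A, x)) / sup_norm(x) for x in samples if sup_norm(x) > 0)

assert best <= op_norm_inf(A) + 1e-12          # no vector beats the formula...
assert abs(best - op_norm_inf(A)) < 1e-12      # ...and a sign vector attains it
```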
3.3 Remark 1. Every isometry has norm one, but not every norm one map is an isometry.
2. The definition implies kAxk ≤ kAkkxk ∀x ∈ E, and kAxk ≤ Ckxk ∀x implies kAk ≤ C.
3. Linear maps are also called linear operators, but linear maps A : E → F into the scalar
field F ∈ {R, C} are called linear functionals (whence the term ‘functional analysis’).
4. If V = C∞ (R, R) with norm kf k = sup_x |f (x)| and fc (x) = sin(cx), then for all c > 0
we have fc ∈ V, kfc k = 1 and kfc′ k = c. Thus A : V → V, f ↦ f ′ is unbounded. (The same
holds for essentially all differential operators.) While this operator is defined on all of V , for
unbounded operators it often is too restrictive to require them to be defined on the whole space.
Cf. also Remark 7.33.
5. In fact, every infinite-dimensional space admits unbounded linear maps, cf. Exercise 3.8
below. □
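The unboundedness in item 4 is easy to observe numerically: the ratio kfc′ k/kfc k = c is unbounded in c. A sketch (our own; the sup-norms are approximated on a grid over [0, 2π]):

```python
import math

def sup_on_grid(f, grid):
    return max(abs(f(x)) for x in grid)

grid = [k * 1e-4 for k in range(62832)]   # [0, 2*pi] in steps of 1e-4

for c in (1.0, 10.0, 100.0):
    f_c = lambda x: math.sin(c * x)
    df_c = lambda x: c * math.cos(c * x)       # (f_c)'(x) = c cos(cx)
    assert abs(sup_on_grid(f_c, grid) - 1.0) < 1e-3     # ||f_c|| = 1
    assert abs(sup_on_grid(df_c, grid) - c) < 1e-2 * c  # ||f_c'|| = c
# so sup_c ||f_c'|| / ||f_c|| = sup_c c is infinite: differentiation is unbounded
```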

3.4 Exercise If E, G, H are normed spaces and S : E → G, T : G → H are linear maps, prove
that kT Sk ≤ kSkkT k. (We write T S for the composite map T ◦ S : E → H.)

3.5 Lemma Let E, F be normed spaces and A : E → F a linear map. Then the following are
equivalent:
(i) A is bounded.
(ii) A is continuous (w.r.t. the norm topologies).
(iii) A is continuous at 0 ∈ E.
Proof. (i)⇒(ii) For x, y ∈ E we have kAx − Ayk = kA(x − y)k ≤ kAk kx − yk. Thus d(Ax, Ay) ≤
kAk d(x, y), and with kAk < ∞ we have (uniform) continuity of A.
(ii)⇒(iii) This is obvious.
(iii)⇒(i) B^F(0, 1) ⊆ F is an open neighborhood of 0 ∈ F . Since A is continuous at 0,
the preimage A^{-1}(B^F(0, 1)) ⊆ E, which clearly contains 0, is open. Thus there exists ε > 0
such that B^E(0, ε) ⊆ A^{-1}(B^F(0, 1)). In other words, kxk < ε implies kAxk < 1. A for-
tiori, kxk ≤ ε/2 ⇒ kAxk ≤ 1. By linearity of A and absolute homogeneity of the norms,
this is equivalent to kxk ≤ 1 ⇒ kAxk ≤ 2/ε, thus kAk ≤ 2/ε < ∞. (More precisely,
kAk = (sup{ε > 0 | B^E(0, ε) ⊆ A^{-1}(B^F(0, 1))})^{-1}.) 

The above motivates the following important notion:

3.6 Definition Let V, W be normed spaces. A linear map A : V → W is called an isomorphism
if it satisfies the equivalent statements
(i) A is a bijection and a homeomorphism, i.e. A and A−1 are continuous.
(ii) A is surjective and there are C1 , C2 > 0 such that C1 kxk ≤ kAxk ≤ C2 kxk ∀x ∈ V .
Clearly A is an isometric isomorphism if and only if we can take C1 = C2 = 1. If an isomorphism V → W
exists, we write V ' W (not to be confused with ≅ for isometric isomorphism).
In particular, if k·k1 , k·k2 are norms on V then idV : (V, k·k1 ) → (V, k·k2 ) is an isomorphism
(isometric isomorphism) if and only if the two norms are equivalent (equal).

3.7 Exercise Let V, W be normed spaces, where V is finite-dimensional. Prove that every
linear map V → W is bounded.

3.8 Exercise Let V be an infinite-dimensional Banach space.

(i) Show that there exists an unbounded linear map ϕ : V → F. (Hint: Use a Hamel basis¹³.)
(ii) If W is a non-zero normed space, show that there is an unbounded linear map A : V → W .
For linear functionals, i.e. linear maps from an F-vector space to F, there is another charac-
terization of boundedness/continuity:

3.9 Exercise Let (V, k · k) be a normed F-vector space and ϕ : V → F a linear functional.
Prove that ϕ is continuous if and only if ker ϕ = ϕ^{-1}(0) ⊆ V is closed.
Hint: For ⇐, pick a ball B(x, r) ⊆ V \ ker ϕ and prove that ϕ(B(0, r)) is bounded.
If V is a finite-dimensional normed space of dimension d, by linear algebra there exists a
basis E = {e1 , . . . , ed }. We can normalize its elements so that kei k = 1 for all i. Once E
is fixed, there is a unique family F = {ϕ1 , . . . , ϕd } of linear functionals V → F such that
ϕi (ej ) = δi,j ∀i, j. (One easily checks that F is a basis for V ∗ .) By Exercise 3.7, the ϕi are
bounded, and kϕi k ≥ |ϕi (ei )|/kei k = 1/1 = 1. Less trivially one has:

3.10 Proposition (Auerbach 1929) 14 Every finite-dimensional normed space admits a nor-
malized basis E, called Auerbach basis, such that also the dual basis F is normalized, i.e.
kϕi k = 1 ∀i.
Proof. Since every finite-dimensional vector space is isomorphic to Rn for some n, it suffices
to prove this for W = Rn (but with arbitrary norm). Let X = {w ∈ W | kwk = 1} be the
unit sphere, which is compact (since it is bounded and closed and the Heine-Borel theorem15
applies since the topology on W = Rn is the Euclidean one by Corollary 2.32). Consider the
map f : X n → R, E = (e1 , . . . , en ) 7→ det(e1 |e2 | · · · |en ), the matrix on the right having the
ei as columns. This is a continuous function on the compact space X n , so µ = supE∈X n f (E) is finite and assumed by some E = (e1 , . . . , en ); it is positive since f changes sign under exchange of two columns. Defining
ϕi : x 7→ det(e1 , . . . , ei−1 , x, ei+1 , . . . , en )/µ, we have ϕi (ej ) = δij (since the determinant vanishes
if two columns are equal). The definition of µ implies det(e1 , . . . , ei−1 , x, ei+1 , . . . , en ) ≤ µ for
all x ∈ X, thus kϕi k = supx∈X |ϕi (x)| ≤ 1, and we are done. 
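The determinant maximization in this proof can be imitated numerically. The sketch below (my own illustration, with a coarse grid of sphere points standing in for the compact unit sphere, and the ℓ¹ norm on R² as an example norm) finds an approximate Auerbach basis and checks the dual functionals:

```python
import itertools
import math

def norm1(v):                      # the chosen example norm on R^2 (assumption: l^1)
    return abs(v[0]) + abs(v[1])

# coarse sample of the unit sphere X = { |v|_1 = 1 }
pts = []
for k in range(400):
    t = 2 * math.pi * k / 400
    v = (math.cos(t), math.sin(t))
    s = norm1(v)
    pts.append((v[0] / s, v[1] / s))

def det2(a, b):                    # det(a|b), columns a and b
    return a[0] * b[1] - a[1] * b[0]

# maximize the determinant over X x X, as in the proof
e1, e2 = max(itertools.product(pts, pts), key=lambda p: det2(*p))
mu = det2(e1, e2)

phi1 = lambda x: det2(x, e2) / mu  # phi_1(x) = det(x, e2) / mu
phi2 = lambda x: det2(e1, x) / mu  # phi_2(x) = det(e1, x) / mu

# phi_i(e_j) = delta_ij, and |phi_i| <= 1 on the sampled sphere
dual_ok = abs(phi1(e1) - 1) < 1e-9 and abs(phi1(e2)) < 1e-9 and abs(phi2(e2) - 1) < 1e-9
norm_ok = max(max(abs(phi1(x)), abs(phi2(x))) for x in pts) <= 1 + 1e-9
print(dual_ok, norm_ok)  # True True
```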

3.11 Exercise (Banach-Mazur distance) 16 Let V, W be Banach spaces. Define D(V, W ) ∈


[0, ∞] by D(V, W ) = +∞ if V and W are not isomorphic, i.e. V 6' W , and otherwise by

D(V, W ) = inf{kAk kA−1 k | A : V → W bounded linear isomorphism}.

Prove:
(i) D(V, W ) = D(W, V ) ≥ 1.
(ii) If V ∼= W (isometric isomorphism) then D(V, W ) = 1. In particular D(V, V ) = 1.
(iii) If V ' W ' Z then D(V, Z) ≤ D(V, W )D(W, Z).
13
Recall from linear algebra that a Hamel basis for V is a subset E ⊂ V such that every x ∈ V is a linear combination
of finitely many elements of E, in a unique way.
14
Herman Auerbach (1901-1942). Polish mathematician. Born in Tarnopol (then Austria-Hungary, now Ukraine).
Murdered in the Belzec extermination camp.
15
A subset of Rn , equipped with its standard topology, is compact if and only if it is closed and bounded. While
the name attributes the theorem to Eduard Heine (1821-1881) and Emile Borel (1871-1956), the real history is very
complicated. See [3].
16
Stanislaw Mazur (1905-1981). Polish mathematician. Also known for the Gelfand-M. theorem and others.

(iv) Restricted to a set of mutually isomorphic Banach spaces, d(V, W ) = log D(V, W ) is a
pseudometric.
If V and W are finite-dimensional normed spaces with D(V, W ) = 1, one can prove that they are isometrically isomorphic; in infinite dimensions this is not true!

3.2 Why we care about completeness


As you (should) know from topology, completeness of a metric space is convenient since it leads
to results that are not necessarily true without it, like Cantor’s intersection theorem and the
contraction principle (or Banach’s fixed point theorem). The same holds for normed spaces. In
this section we present three important applications of completeness, each of which will be used
repeatedly. Later we will encounter others.

3.2.1 Extension of bounded linear maps


The following application of completeness will be used several times:

3.12 Lemma Let V be a normed space, W ⊆ V a dense linear subspace, Z a Banach space and A : W → Z a bounded linear map. Then there is a unique bounded linear map Â : V → Z with A = Â|W .17 It satisfies kÂk = kAk. If A is an isometry, so is Â.

Proof. Let x ∈ V . Then there is a sequence {wn } in W such that kwn − xk → 0. Then {wn } ⊆ W is a Cauchy sequence, and so is {Awn } ⊆ Z by boundedness of A. The latter converges since Z is complete. If {wn′ } is another sequence converging to x then kA(wn − wn′ )k → 0, so that lim Awn′ = lim Awn . It thus is well-defined to put Âx = limn→∞ Awn . We omit the easy proof of linearity of Â. If x ∈ W then we can put wn = x ∀n, obtaining Âx = Ax, thus Â|W = A. By density of W ⊆ V , any other continuous extension of A coincides with Â. We have kÂxk = lim kAwn k ≤ kAkkxk. Thus kÂk ≤ kAk, and the converse inequality is obvious. If A is an isometry then kÂxk = limn kAwn k = limn kwn k = kxk, so that Â is an isometry. 

3.13 Exercise Let X, Y be Banach spaces over F ∈ {R, C}. Let {xi }i∈I ⊆ X, {yi }i∈I ⊆ Y be
families of vectors such that spanF {xi | i ∈ I} ⊆ X is dense. Show that there is an A ∈ B(X, Y )
satisfying Axi = yi ∀i ∈ I if and only if there exists a C ∈ [0, ∞) such that
kΣi∈J ci yi k ≤ C kΣi∈J ci xi k

for all finite subsets J ⊆ I and numbers {ci }i∈J in F. Show that this A is uniquely determined.

3.2.2 Convergence of series


3.14 Definition Let (V, k · k) be a normed space and {xn }n∈N ⊂ V a sequence. The series Σ∞n=1 xn is called
• convergent if the sequence {Sn } of partial sums Sn = Σnk=1 xk converges to some s ∈ V .
• unconditionally convergent if Σ∞k=1 xσ(k) converges for each permutation σ of N.

17
If f : X → Y is a function and Z ⊆ X, then according to typographical convenience we write either f|Z or f  Z
for the restriction of f to Z, which is a map Z → Y . But if f : X → X maps Y ⊆ X into itself, f  Y = f|Y usually is meant as a map Y → Y , not Y → X.

• conditionally convergent if it converges, but not unconditionally.
• absolutely convergent if Σ∞n=1 kxn k < ∞.

3.15 Proposition Let (V, k · k) be a normed space over F. Then


(i) If Σ∞n=1 xn converges then kΣ∞n=1 xn k ≤ Σ∞n=1 kxn k.
(ii) Unconditional convergence implies convergence, but the converse is false (unless V = {0}).
(iii) If (V, k · k) is complete then every absolutely convergent series Σ∞n=1 xn in V converges unconditionally, all sums Σ∞k=1 xσ(k) being equal.
(iv) If every absolutely convergent series in V converges then (V, k · k) is complete.
Proof. (i) The subadditivity of the norm gives kΣnk=1 xk k ≤ Σnk=1 kxk k for all n ∈ N. As n → ∞, the l.h.s. converges to kΣn xn k and the r.h.s. to Σn kxn k (which may be infinite, in which case the inequality trivially holds).
(ii) The first part is obvious. If V 6= {0}, pick x ∈ V \{0} and put xn = ((−1)n /n) x. Then Σ∞n=1 xn converges (to x Σn (−1)n /n = −x ln 2), but Σn kxn k = kxk Σn 1/n = ∞. Since all xn lie in the one-dimensional subspace Fx, Proposition 3.16 below shows that this series is not unconditionally convergent.
(iii) Assume V to be complete and Σn xn to be absolutely convergent. Let Sn = Σnk=1 xk and Tn = Σnk=1 kxk k. For all n > m we have (by subadditivity of the norm)

kSn − Sm k = kΣnk=m+1 xk k ≤ Σnk=m+1 kxk k = Tn − Tm .

Since the sequence {Tn } is convergent by assumption, thus Cauchy, the above implies that {Sn } is Cauchy, thus convergent by completeness of V . That the convergence is unconditional follows from the fact that Σ∞n=1 kxσ(n) k is independent of σ, thus finite. The statement that then also Σ∞n=1 xσ(n) is independent of σ is proven as for series of real or complex numbers in a standard analysis course.18
(iv) Assume that every absolutely convergent series in V converges, and let {yn }n∈N be a Cauchy sequence in V . We can find (why?) a subsequence {zk }k∈N = {ynk } such that kzk − zk−1 k ≤ 2−k ∀k ≥ 2. Now put z0 = 0 and define xk = zk − zk−1 for k ≥ 1. Now

Σ∞k=1 kxk k = Σ∞k=1 kzk − zk−1 k ≤ kz1 k + Σ∞k=2 2−k < ∞.

Thus Σ∞k=1 xk is absolutely convergent, and therefore convergent by the hypothesis. To wit, limn→∞ Sn exists, where Sn = Σnk=1 xk = Σnk=1 (zk − zk−1 ) = zn . Thus z = limk→∞ zk = limk→∞ ynk exists. Now the sequence {yn } is Cauchy and has a convergent subsequence {ynk }. This implies (why?) that the whole sequence {yn } converges to the limit of the subsequence. 
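The scalar example in part (ii) is easy to check numerically. This sketch (an illustration, not part of the notes) computes partial sums of Σ(−1)n /n and of the corresponding series of norms Σ1/n:

```python
import math

# Partial sums of the scalar series from part (ii): sum of (-1)^n / n converges
# (to -ln 2), while the series of the norms, sum of 1/n, diverges.
N = 10**6
alt = sum((-1) ** n / n for n in range(1, N + 1))
harm = sum(1.0 / n for n in range(1, N + 1))
close = abs(alt + math.log(2)) < 1e-5   # alternating series error <= 1/(N+1)
print(close, round(harm, 2))            # the harmonic partial sum keeps growing
```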

The following is well-known (and we will re-prove it later):

3.16 Proposition Every unconditionally convergent series in R or C is absolutely convergent.19
18
For completeness, here is a proof: The assumption Σ∞n=1 kxn k < ∞ implies that for each ε > 0 there is an N ∈ N such that Σ∞n=N+1 kxn k < ε. Picking M = max(N, σ −1 (1), . . . , σ −1 (N )), the sums ΣMn=1 xn and ΣMn=1 xσ(n) both comprise x1 , . . . , xN so that their difference is a finite linear combination of xN+1 , xN+2 , . . . with coefficients in {0, 1, −1}. Thus kΣMn=1 xn − ΣMn=1 xσ(n) k ≤ Σ∞n=N+1 kxn k < ε. Taking the limit M → ∞ gives kΣ∞n=1 xn − Σ∞n=1 xσ(n) k ≤ ε, and since ε > 0 was arbitrary, we have Σ∞n=1 xn = Σ∞n=1 xσ(n) .
19
This was proven (for real series) in 1854 by Bernhard Riemann (1826-1866), German mathematician.

3.17 Exercise Let (V, k · k) be a finite-dimensional normed space over F ∈ {R, C}. Prove that every unconditionally convergent series Σ∞n=1 xn in V is absolutely convergent.

3.18 Remark 1. The result of the preceding exercise fails in infinite dimensions! In Exercise 4.16 we will encounter series in infinite-dimensional Banach spaces that are unconditionally but not absolutely convergent. In fact, by the remarkable Dvoretzky-Rogers theorem (1950) every infinite-dimensional Banach space contains unconditionally convergent series that are not absolutely convergent! (See Section B.2.1 for a proof.)
2. We will later prove in two different ways (Corollary 9.11 and Theorem A.4) that the independence of Σ∞n=1 xσ(n) of σ holds for all unconditionally convergent series, not only the absolutely convergent ones.
3. There is no characterization of unconditionally convergent series in terms of {kxn k}, but see the results of Appendices A.2 and B.2. For example, we prove that Σn xn converges unconditionally ⇔ Σn cn xn converges for all bounded sequences {cn } ⊆ F ⇔ Σi xni converges for all n1 < n2 < · · · . The moral is that unconditional convergence does not rely on ‘cancellations’.
4. Proposition 3.15(iv) shows that it is somewhat careless to write “Σn kxn k < ∞, thus Σn xn converges” since the completeness condition is indispensable. 2

3.19 Exercise Let (V, k · k) be a Banach space over F ∈ {R, C} and E ⊂ V a Hamel basis. By linear algebra there are unique linear functionals {ϕe : V → F}e∈E such that every x ∈ V satisfies x = Σe∈E ϕe (x)e, where {e ∈ E | ϕe (x) 6= 0} is finite. Prove:
(i) If V is finite-dimensional then all ϕe are continuous.
(ii) If V is infinite-dimensional then {e ∈ E | ϕe is continuous} is finite. Hint: Argue by
contradiction, using Proposition 3.15 to construct an x ∈ V for which {e ∈ E | ϕe (x) 6= 0}
is infinite.
Part (ii) shows that Hamel bases are not very well suited for infinite-dimensional Banach
spaces (other than for constructing counterexamples). This will be reinforced by Exercise 7.21,
where we will see that every Hamel basis of a separable Banach space has cardinality c = #R
rather than ℵ0 = #N.

3.2.3 Closedness vs. completeness of subspaces


Closedness and completeness of subsets of a metric space are related. We recall from topology
(if you haven’t seen this, prove it!):

3.20 Lemma Let (X, d) be a metric space and Y ⊆ X. Then (instead of d|Y we just write d)
(i) If (X, d) is complete and Y ⊆ X is closed (w.r.t. τd , of course) then (Y, d) is complete.
(ii) If (Y, d) is complete then Y ⊆ X is closed (whether or not (X, d) is complete).
The above should be compared with the fact that a closed subset of a compact space is
compact and that a compact subset of a Hausdorff space is closed. In the above, completeness
works as a weak substitute of compactness, an interpretation that is reinforced by the fact that
every compact metric space is complete.
The above lemma readily specializes to normed spaces:

3.21 Lemma Let (V, k · k) be a normed space and W ⊆ V a linear subspace. Then
(i) If V is complete (=Banach) and W ⊆ V is closed then W is Banach.

(ii) If W is complete then W ⊆ V is closed (whether or not V is complete).
Note: We will often omit ‘linear’ from ‘linear subspace’. When arbitrary, possibly non-linear,
subsets are intended we will make this clear.

3.22 Exercise Prove that every finite-dimensional subspace of a normed space is closed.
The result of the preceding exercise is not at all true for infinite-dimensional subspaces! For example, let V = `1 (N) = {f : N → R | Σ∞n=1 |f (n)| < ∞}. Now the infinite-dimensional linear subspace W = {f : N → R | #{n ∈ N | f (n) 6= 0} < ∞} ⊆ V is non-closed, as follows from the easy facts that W 6= V while W is dense in V .
The following (to be generalized in Lemma 7.39) gives closedness of the image of an isometry:

3.23 Corollary Let V be a Banach space and W a normed space. If A : V → W is a linear


isometry then the linear subspace AV ⊆ W is closed.
Proof. The map A : V → AV ⊆ W is an isometric bijection and therefore an isometric isomor-
phism of normed spaces. Thus (AV, k · k) is complete, thus closed in W by Lemma 3.21. 

3.3 Spaces of bounded linear maps. First glimpse of Banach


algebras
3.24 Definition Let E, F be normed F-vector spaces. The set of bounded linear maps from
E to F is denoted B(E, F ). Instead of B(E, E) and B(E, F) one also writes B(E) and E ∗ ,
respectively. E ∗ is called the dual space of E.20
Clearly B(E, F ) should not be confused with notation B(x, r) for open balls!

3.25 Proposition Let E, F be normed spaces. Then


(i) B(E, F ) is a vector space and B(E, F ) → [0, ∞), A 7→ kAk is a norm in the sense of
Definition 2.13.
(ii) If F is complete (=Banach) then so is B(E, F ). In particular, E ∗ is always Banach.
Proof. (i) If T : E → F is a linear map, it is clear that kαT k = |α|kT k and that kT k = 0 if and
only if T = 0. If S, T ∈ B(E, F ) and x ∈ E then k(S + T )xk ≤ kSxk + kT xk ≤ (kSk + kT k)kxk,
so that kS + T k ≤ kSk + kT k. This implies that B(E, F ) is a vector space.
(ii) Assume F is complete, and let {Tn } ⊆ B(E, F ) be a Cauchy sequence. Then there is
n0 such that m, n ≥ n0 ⇒ kTm − Tn k < 1, in particular Tm ∈ B(Tn0 , 1) for all n ≥ n0 .
Thus with M = max(kT1 k, . . . , kTn0 −1 k, kTn0 k + 1) < ∞ we have kTn k ≤ M for all n. If now
x ∈ E then k(Tn − Tm )xk ≤ kTn − Tm kkxk, so that {Tn x} is a Cauchy sequence in F and
therefore convergent by completeness of F . Now define T : E → F by T x = limn→∞ Tn x. It
is straightforward to check that T is linear. Finally, since kTn xk ≤ M kxk for all n, we have
kT xk = k limn→∞ Tn xk = limn→∞ kTn xk ≤ M kxk, so that T ∈ B(E, F ). The second statement
follows from the completeness of R and C. 

3.26 Exercise Let (V1 , k · k1 ), (V2 , k · k2 ) be normed spaces. Prove (V1 ⊕ V2 , k · ks )∗ ∼= (V1∗ ⊕ V2∗ , k · km ) and (V1 ⊕ V2 , k · km )∗ ∼= (V1∗ ⊕ V2∗ , k · ks ). (See Exercise 2.27 for the definitions of k · ks , k · km .)
20
More generally, if V is any topological vector space we write V ∗ for the space of continuous linear maps V → F.

If E is a normed F-vector space, the same holds for B(E) = B(E, E), and by Exercise 3.4,
we have kST k ≤ kSkkT k for all S, T ∈ B(E). This motivates the following definition:

3.27 Definition If F is a field, an F-algebra is an F-vector space A together with an associative


bilinear operation A × A → A, (a, b) 7→ a · b, the ‘multiplication’.
Examples: (i) A = Mn×n (F) with matrix product as multiplication,
(ii) A = C(X, F) with pointwise product of functions, i.e. (f · g)(x) := f (x)g(x) ∀x ∈ X.

3.28 Definition A normed F-algebra is an F-algebra A equipped with a norm k · k such that
kabk ≤ kakkbk for all a, b ∈ A (submultiplicativity). A Banach algebra is a normed algebra that
is complete (as a normed space). An algebra A is called unital if it has a unit 1 6= 0. (In fact, if A 6= {0} then 1 = 0 would imply the contradiction kak = k1ak ≤ k1kkak = 0 ∀a ∈ A.)

3.29 Remark 1. If A is a normed algebra then for all a, a0 , b, b0 ∈ A we have

kab − a0 b0 k = kab − ab0 + ab0 − a0 b0 k ≤ kakkb − b0 k + ka − a0 kkb0 k.

This proves that the multiplication map · : A × A → A is jointly continuous.


2. If A is a normed algebra with unit 1 then 1 = 12 , thus k1k = k12 k ≤ k1k2 . With k1k 6= 0 this implies 1 ≤ k1k. Some authors require all unital normed algebras to satisfy k1k = 1, but we don’t. Of course this does hold for B(E). 2

By the above, B(E) is a normed algebra for every normed space E, and by Proposition
3.25(ii), B(E) is a Banach algebra whenever E is a Banach space. There is another standard
class of examples:

3.30 Example Let X be a compact topological space and A = C(X, F).21 We already know
that A, equipped with the norm kf k = supx∈X |f (x)| is a Banach space. The pointwise product
(f g)(x) = f (x)g(x) of functions is bilinear, associative and clearly satisfies kf gk ≤ kf kkgk.
This makes (A, k · k, ·) a Banach algebra. An analogous result holds for the algebra Cb (X, F) of
bounded continuous functions on a not necessarily compact space X.
We will have much more to say about Banach algebras later in the course.
Before we go on developing the general theory of Banach spaces, it is instructive to study
in some depth an important class of spaces, the spaces `p (S, F), where everything can be done
very explicitly, in particular the dual spaces can be determined.

4 The sequence spaces and their dual spaces


In this section we will consider in some detail the spaces `p (S, F), which we call sequence spaces
even though strictly speaking this is correct only for S = N. They are worth studying for several
reasons:
• They provide a first encounter with the more general Lebesgue spaces Lp (X, A, µ) without
the measure and integration theoretic baggage needed for the latter.
• They can be studied quite completely and have their dual spaces identified.
• We will see that every Hilbert space is isometrically isomorphic to `2 (S, F) for some S.
21
Some authors, mostly operator algebraists, write C(X) for C(X, C), whereas topologists put C(X) = C(X, R).
Due to this ambiguity we avoid the notation C(X).

4.1 Basics. 1 ≤ p ≤ ∞: Hölder and Minkowski inequalities
4.1 Definition (`p -Spaces) If F ∈ {R, C}, 0 < p < ∞, S is a set and f : S → F, define
kf k∞ = sups∈S |f (s)| ∈ [0, ∞], kf kp = (Σs∈S |f (s)|p )1/p ∈ [0, ∞],

where ∞1/p = ∞ and we use the notion of unordered sums, cf. Appendix A.1. Now for all
p ∈ (0, ∞] put
`p (S, F) := {f : S → F | kf kp < ∞}.
Here are some immediate observations:
• We have |f (s)| ≤ kf kp for all s ∈ S and p ∈ (0, ∞].
• kf kp = 0 if and only if f = 0.
• For all c ∈ F we have kcf kp = |c|kf kp (with the understanding that 0 · ∞ = 0).
• If S is finite then `p (S, F) = {f : S → F} = FS . If #S = 1 then all the k · kp coincide.
• If #S = ∞ then k · kp is not a norm on FS in the sense of Definition 2.13 since we can have
kf kp = +∞. But we’ll see that the restriction of k · kp to `p (S, F) is a norm if p ∈ [1, ∞].
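For finite S the definition can be computed directly; a small sketch of my own (with S = {0, 1, 2}, so FS is just a tuple of scalars):

```python
# Hypothetical helper implementing Definition 4.1 for finite S (a tuple of scalars).
def norm_p(f, p):
    if p == float("inf"):
        return max(abs(x) for x in f)
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

f = (3.0, -4.0, 0.0)
n1, n2, ninf = norm_p(f, 1), norm_p(f, 2), norm_p(f, float("inf"))
print(n1, n2, ninf)  # 7.0 5.0 4.0
```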

4.2 Lemma (i) k · k1 and k · k∞ are subadditive, thus norms on `1 (S, F) and `∞ (S, F), resp.
(ii) If f ∈ `1 (S, F) and g ∈ `∞ (S, F) then

|Σs∈S f (s)g(s)| ≤ kf gk1 ≤ kf k1 kgk∞ .

(iii) (`p (S, F), k · kp ) are vector spaces for all p ∈ (0, ∞].
Proof. (i) In view of the preceding observations, it remains to prove subadditivity:

kf + gk∞ = sup |f (s) + g(s)| ≤ sup |f (s)| + sup |g(s)| = kf k∞ + kgk∞ ,


s s s
kf + gk1 = Σs |f (s) + g(s)| ≤ Σs (|f (s)| + |g(s)|) = kf k1 + kgk1 .

(ii) This just is |Σs∈S f (s)g(s)| ≤ Σs∈S |f (s)||g(s)| ≤ kgk∞ Σs∈S |f (s)| = kgk∞ kf k1 .
(iii) It remains to show that f + g ∈ `p (S, F) whenever f, g ∈ `p (S, F). For p = ∞ this follows
from k · k∞ being a norm. The map R+ → R+ , t 7→ tp is monotone for all p ∈ (0, ∞). Thus

|a + b|p ≤ (|a| + |b|)p ≤ (2 max(|a|, |b|))p = 2p max(|a|p , |b|p ) ≤ 2p (|a|p + |b|p ), (4.1)

so that with f, g ∈ `p (S, F) we have


kf + gkpp = Σs |f (s) + g(s)|p ≤ 2p Σs (|f (s)|p + |g(s)|p ) = 2p (kf kpp + kgkpp ) < ∞.

Thus kf + gkp < ∞ and f + g ∈ `p (S, F). 

In order to obtain analogues of (i), (ii) for 1 < p < ∞, we put

4.3 Definition If p, q ∈ [1, ∞] we say that p and q are dual (or conjugate) to each other if 1/p + 1/q = 1, with the understanding 1/∞ = 0, 1/0 = ∞.

One easily checks that every p ∈ [1, ∞] has a unique conjugate q ∈ [1, ∞]. And for 1 <
p, q < ∞ conjugacy is equivalent to pq = p + q.
4.4 Proposition Let 1 < p < ∞ and let q be conjugate to p, i.e. 1/p + 1/q = 1. Then
(i) For all f, g : S → F we have kf gk1 ≤ kf kp kgkq . (Inequality of Hölder22 (1889))
(ii) For all f, g : S → F we have kf + gkp ≤ kf kp + kgkp . (Inequality of Minkowski23 (1896))
Proof. (i) The inequality is trivially true if kf kp or kgkq is zero or infinite. Thus we assume kf kp and kgkq to be finite and non-zero. The exponential function R → R, x 7→ ex is convex24 , so that with 1/p + 1/q = 1 we have

ea/p eb/q = exp(a/p + b/q) ≤ ea /p + eb /q ∀a, b ∈ R.

With the substitutions ea = up , eb = v q , where u, v > 0 this becomes


uv ≤ up /p + v q /q ∀u, v ≥ 0. (4.2)
(The validity also for u = 0 or v = 0 is obvious.)
Putting u = |f (s)|, v = |g(s)| in (4.2), we have |f (s)g(s)| ≤ p−1 |f (s)|p + q −1 |g(s)|q , so that summing over s gives kf gk1 ≤ p−1 kf kpp + q −1 kgkqq . If kf kp = kgkq = 1 then this reduces to kf gk1 ≤ 1/p + 1/q = 1. With f ′ = f /kf kp , g ′ = g/kgkq we have kf ′ kp = 1 = kg ′ kq , so that the above implies kf ′ g ′ k1 ≤ 1. Inserting the definitions of f ′ , g ′ herein gives kf gk1 ≤ kf kp kgkq .
(ii) We may assume that f, g ∈ `p , thus kf kp , kgkp < ∞, since otherwise the inequality is trivially true. With h = f + g this implies h ∈ `p by Lemma 4.2(iii). If q is conjugate to p, we have pq = p + q, and we find Σs |h(s)|(p−1)q = Σs |h(s)|p < ∞, so that the function s 7→ |h(s)|p−1 is in `q with k|h|p−1 kq = (khkp )p/q . Now

kf + gkpp = Σs |f (s) + g(s)|p = Σs |f (s) + g(s)| |f (s) + g(s)|p−1
≤ Σs (|f (s)| + |g(s)|) |f (s) + g(s)|p−1 . (4.3)

Since f ∈ `p and |h|p−1 ∈ `q , we can apply Hölder’s inequality, obtaining Σs |f (s)||h(s)|p−1 ≤ kf kp k|h|p−1 kq = kf kp (khkp )p/q . Analogously, Σs |g(s)||h(s)|p−1 ≤ kgkp (khkp )p/q . Plugging this into (4.3) we obtain

kf + gkpp ≤ (kf kp + kgkp ) (kf + gkp )p/q .

If kf + gkp 6= 0 we divide by (kf + gkp )p/q , and using p − p/q = p(1 − 1/q) = p · (1/p) = 1 we obtain

kf + gkp = (kf + gkp )p−p/q ≤ kf kp + kgkp .

Since this clearly also holds if kf + gkp = 0, we are done. 
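Both inequalities are easy to spot-check numerically; the following sketch (an illustration of mine, not part of the proof) tests them on random vectors with p = 3:

```python
import random

def norm_p(f, p):
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

random.seed(0)
p = 3.0
q = p / (p - 1)          # conjugate exponent: 1/p + 1/q = 1
ok = True
for _ in range(1000):
    f = [random.uniform(-1, 1) for _ in range(10)]
    g = [random.uniform(-1, 1) for _ in range(10)]
    # Hoelder: |fg|_1 <= |f|_p * |g|_q
    holder = sum(abs(a * b) for a, b in zip(f, g)) <= norm_p(f, p) * norm_p(g, q) + 1e-12
    # Minkowski: |f+g|_p <= |f|_p + |g|_p
    minkowski = norm_p([a + b for a, b in zip(f, g)], p) <= norm_p(f, p) + norm_p(g, p) + 1e-12
    ok = ok and holder and minkowski
print(ok)  # True
```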

22
Otto Hölder (1859-1937). German mathematician. Important contributions to analysis and algebra.
23
Hermann Minkowski (1864-1909). German mathematician. Contributions to number theory, relativity and other
fields. We’ll encounter M.-functionals.
24
f : [a, b] → R is convex if f (tx + (1 − t)y) ≤ tf (x) + (1 − t)f (y) for all x, y ∈ [a, b] and t ∈ [0, 1] and strictly convex
if the inequality is strict whenever x 6= y, 0 < t < 1. See, e.g., [57, Vol. 1, Section 7.2].

For p = q = 2, the inequality of Hölder is known as the Cauchy-Schwarz inequality. We
will also call the trivial inequalities of Lemma 4.2 for {p, q} = {1, ∞} Hölder and Minkowski
inequalities. Now the analogue of Lemma 4.2 for 1 < p < ∞ is clear:

4.5 Corollary Let 1 < p < ∞. Then


(i) (`p (S, F), k · kp ) is a normed vector space.25
(ii) If q is conjugate to p and f ∈ `p (S, F) and g ∈ `q (S, F) then

|Σs∈S f (s)g(s)| ≤ kf gk1 ≤ kf kp kgkq .

4.2 ? Aside: The translation-invariant metric dp for 0 < p < 1


For s ∈ S, let δs : S → F be the function defined by δs (t) = δs,t (which is 1 for s = t and zero
otherwise).

4.6 Proposition If 0 < p < 1 then


(i) k · kp violates subadditivity whenever #S ≥ 2, thus is not a norm.
(ii) Nevertheless, `p (S, F) is a vector space.
(iii) Restricted to `p (S, F), dp (f, g) = Σs∈S |f (s) − g(s)|p

defines a translation-invariant metric. (Note the absence of the p-th root present in k · kp !)
(iv) `p (S, F) is a topological vector space when given the metric topology τdp .
Proof. (i) Pick s, t ∈ S, s 6= t and put f = δs , g = δt . Now kf kp = kgkp = 1 and

2 < 21/p = kf + gkp 6≤ kf kp + kgkp = 2

since 1/p > 1. Thus k · kp is not subadditive and therefore not a norm.
(ii) The proof of Lemma 4.2(iii) included the case 0 < p < 1.
(iii) That dp (f, g) < ∞ for all f, g ∈ `p (S, F) follows from `p being a vector space. Translation
invariance of dp and the axioms dp (f, g) = dp (g, f ) and dp (f, g) = 0 ⇔ f = g are all evident
from the definition. We claim that

0 < p < 1, a, b ≥ 0 ⇒ (a + b)p ≤ ap + bp .

Believing this for a minute, we have


dp (f, h) = dp (f − h, 0) = Σs |f (s) − h(s)|p ≤ Σs (|f (s) − g(s)| + |g(s) − h(s)|)p
≤ Σs (|f (s) − g(s)|p + |g(s) − h(s)|p )
= dp (f − g, 0) + dp (g − h, 0) = dp (f, g) + dp (g, h),
25
If you wonder why the Lp -spaces with p 6∈ {1, 2, ∞} are studied at all, the short answer is that they arise in many
areas of analysis, in particular harmonic analysis and PDE theory. For longer answers, have a look at [180].

as wanted, where we first used the triangle inequality and then the claim.
Turning to our claim (a + b)p ≤ ap + bp , it is clear that this holds if a = 0. For a = 1 it
reduces to (1 + b)p ≤ 1 + bp ∀b ≥ 0. For b = 0 this is true, and for all b > 0 it follows from the
fact that
d/db (1 + bp − (1 + b)p ) = p(bp−1 − (b + 1)p−1 ) > 0
due to p − 1 < 0. If now a > 0 then

(a + b)p = ap (1 + (b/a))p ≤ ap (1 + (b/a)p ) = ap + bp ,

and we are done. (By almost the same argument, for p ≥ 1 we have (a + b)p ≥ ap + bp .)
(iv) By the above, dp is a translation-invariant metric, so that the addition operation + : `p × `p → `p is jointly continuous by Lemma 2.9. Furthermore,

dp (cf, 0) = Σs |cf (s)|p = |c|p Σs |f (s)|p = |c|p dp (f, 0),

which tends to zero if c → 0 for fixed f or f → 0 (in the sense of dp (f, 0) → 0) for fixed c. By
Remark 2.10.3 this implies joint continuity of the scalar action F × `p → `p . 
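The failure of subadditivity in (i) and the triangle inequality for dp in (iii) can both be seen concretely for S = {s, t} and p = 1/2; a sketch of mine, identifying functions on S with pairs:

```python
p = 0.5
f, g = (1.0, 0.0), (0.0, 1.0)            # delta_s and delta_t

def norm_p(h):                           # |h|_p = (sum |h(s)|^p)^(1/p)
    return sum(abs(x) ** p for x in h) ** (1.0 / p)

def d_p(h, k):                           # the metric of (iii): note, no p-th root
    return sum(abs(a - b) ** p for a, b in zip(h, k))

# subadditivity fails: |f+g|_p = 2^(1/p) = 4 > 2 = |f|_p + |g|_p
not_subadditive = norm_p((1.0, 1.0)) > norm_p(f) + norm_p(g)
# but d_p does satisfy the triangle inequality:
h = (0.5, -0.5)
triangle = d_p(f, g) <= d_p(f, h) + d_p(h, g) + 1e-12
print(norm_p((1.0, 1.0)), not_subadditive, triangle)  # 4.0 True True
```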

4.7 Exercise Let S be an infinite set and 0 < p < 1.


(i) Prove that `p (S, F) does not contain a bounded convex open neighborhood of 0.
(ii) Conclude that the metrizable TVS (`p (S, F), dp ) are not normable if 0 < p < 1.
In fact, the `p spaces with 0 < p < 1 are not even locally convex. This leads to strange
behavior. For example the dual space `p (S, F)∗ is unexpected, cf. [108]. This strangeness is
even more pronounced for the continuous versions Lp (X, A, µ): For X = [0, 1] equipped with
Lebesgue measure, one has Lp (X, A, µ)∗ = {0}, which cannot happen for a non-zero Banach (or
locally convex) space in view of the Hahn-Banach theorem.

4.3 c00 and c0 . Completeness of `p (S, F) and c0 (S, F)


In what follows we put

dp (f, g) = kf − gk∞ = sups |f (s) − g(s)| if p = ∞,
dp (f, g) = kf − gkp = (Σs |f (s) − g(s)|p )1/p if 1 ≤ p < ∞,
dp (f, g) = kf − gkpp = Σs |f (s) − g(s)|p if 0 < p < 1,

which is a metric in all cases. For a function f : S → F we define supp f = {s ∈ S | f (s) 6= 0}.

4.8 Lemma Let p ∈ (0, ∞] and let dp be as defined above. Then (`p (S, F), dp ) is complete for every
set S and F ∈ {R, C}.
Proof. For all p ∈ (0, ∞] we have |f (s)−g(s)| ≤ kf −gkp . Thus for p ≥ 1 we have |f (s)−g(s)| ≤
dp (f, g), while for p ∈ (0, 1) we have |f (s) − g(s)| ≤ kf − gkp = dp (f, g)1/p . In either case
d(f, g) → 0 implies f (s) − g(s) → 0 for all s ∈ S. Thus if {fn } ⊆ `p (S, F) is a Cauchy sequence
w.r.t. dp then {fn (s)} is a Cauchy sequence in F, thus convergent for each s ∈ S. Defining
g(s) = limn→∞ fn (s), it remains to prove g ∈ `p (S, F) and dp (fn , g) → 0.
For p = ∞ and ε > 0 we can find n0 such that n, m ≥ n0 implies kfn − fm k∞ < ε, which
readily gives kfm k∞ ≤ kfn0 k∞ + ε for all m ≥ n0 . Thus also kgk∞ ≤ kfn0 k∞ + ε < ∞. Taking
m → ∞ in sups |fn (s) − fm (s)| < ε gives sups |fn (s) − g(s)| ≤ ε, whence kfn − gk∞ → 0.

For 0 < p < ∞ we give a uniform argument. Since {fn } is Cauchy w.r.t. dp , for ε > 0 we
can find n0 such that n, m ≥ n0 implies dp (fn , fm ) < ε. Applying the dominated convergence
theorem (in the simple case of an infinite sum rather than a general integral, cf. Proposition
A.3) gives dp (g, fm ) = limn→∞ dp (fn , fm ) ≤ ε. This implies both g ∈ `p (S, F) and dp (g, fm ) → 0
as m → ∞. 

4.9 Definition For a set S and F ∈ {R, C} we define

c00 (S, F) = {f : S → F | #(supp f ) < ∞},


c0 (S, F) = {f : S → F | ε > 0 ⇒ #{s ∈ S | |f (s)| ≥ ε} < ∞}.

(The elements of c0 are the functions that ‘tend to zero at infinity’.)

4.10 Lemma If 0 < p ≤ q < ∞, we have the set-theoretic inclusions

c00 (S, F) ⊆ `p (S, F) ⊆ `q (S, F) ⊆ c0 (S, F) ⊆ `∞ (S, F),

where kf k∞ ≤ kf kq ≤ kf kp . All the inclusion maps except the first have norm one.
Proof. If f ∈ c00 (S, F) then clearly kf kp < ∞ for all p ∈ (0, ∞]. And f ∈ c0 (S, F) implies
boundedness of f . This gives the first and last inclusion. The map c0 (S, F) ,→ `∞ (S, F) is an
isometry since both spaces have the norm k · k∞ .
If f ∈ `q (S, F) with q ∈ (0, ∞) then finiteness of kf kqq = Σs∈S |f (s)|q implies that {s ∈ S | |f (s)| ≥ ε} is finite for each ε > 0, thus f ∈ c0 (S, F). Since |f (s)| ≤ kf kq for all s, we have kf k∞ = sups∈S |f (s)| ≤ kf kq .
Now let 0 < p < q < ∞ and f ∈ `p (S, F) with kf kp = 1. Then |f (s)| ≤ 1 ∀s, thus

kf kqq = Σs∈S |f (s)|q = Σs∈S (|f (s)|p )q/p ≤ Σs∈S |f (s)|p = kf kpp = 1, (4.4)

where we used q/p > 1 and |f (s)| ≤ 1 ∀s. Thus kf kq ≤ 1. Applying this to f = g/kgkp gives
kgkq ≤ kgkp for all g ∈ `p (S, F).
We now have that all the inclusion maps have norm ≤ 1. Taking f = δs and using kδs kp = 1
for all p ∈ (0, ∞] gives that the inclusion maps all have norm one. 
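The chain of norm inequalities kf k∞ ≤ kf kq ≤ kf kp is easy to spot-check on finite tuples (a sketch, not part of the notes):

```python
import random

def norm_p(f, p):
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

random.seed(2)
p, q = 1.5, 4.0                      # p <= q
ok = True
for _ in range(500):
    f = [random.uniform(-1, 1) for _ in range(8)]
    sup = max(abs(x) for x in f)
    # |f|_inf <= |f|_q <= |f|_p, as in Lemma 4.10
    ok = ok and sup <= norm_p(f, q) + 1e-12 and norm_p(f, q) <= norm_p(f, p) + 1e-12
print(ok)  # True
```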

4.11 Remark While we have found continuous maps between them, the spaces `p (1 ≤ p < ∞)
and c0 are mutually non-isomorphic. See Corollary B.32 for an even stronger statement. 2

4.12 Lemma (i) The closure of c00 (S, F) w.r.t. k · kp is `p (S, F) if 0 < p < ∞ and c0 (S, F) if p = ∞.
(ii) (c0 (S, F), k · k∞ ) is complete.


Proof. (i) Let 0 < p < ∞ and f ∈ `p (S, F). Then Σs∈S |f (s)|p = kf kpp implies that for each ε > 0 there is a finite F ⊆ S such that kf kpp − Σs∈F |f (s)|p < ε. Putting g(s) = f (s)χF (s), we have g ∈ c00 (S, F) and kf − gkpp = Σs∈S\F |f (s)|p < ε. Since ε > 0 is arbitrary, c00 ⊆ `p is dense.
If f ∈ c0 (S, F) and ε > 0 then F = {s ∈ S | |f (s)| ≥ ε} is finite. Now g = f χF is in c00 (S, F) and kf − gk∞ < ε, proving that f lies in the k · k∞ -closure of c00 (S, F). Conversely, membership of f in that closure means that for each ε > 0

there is a g ∈ c00 (S, F) with kf − gk∞ < ε. But this means |f (s)| < ε for all s ∈ S\F , where
F = supp(g) is finite. Thus f ∈ c0 (S, F).
(ii) Being the closure of c00 (S, F) in `∞ (S, F), c0 (S, F) is closed, thus complete by complete-
ness of `∞ (S, F), cf. Lemmas 4.8 and 3.21. 

While the finitely supported functions are not dense in (`∞ (S, F), k · k∞ ) (for infinite S), the
finite-image functions are:

4.13 Lemma The set {f : S → F | #f (S) < ∞} of functions assuming only finitely many values, equivalently, the set of finite linear combinations ΣKk=1 ck χAk of characteristic functions, is dense in (`∞ (S, F), k · k∞ ).
Proof. We prove this for F = R, from which the case F = C is easily deduced. Let f ∈ `∞ (S, R) and ε > 0. For k ∈ Z define Ak = f −1 ([kε, (k + 1)ε)). Define K = ⌈kf k∞ /ε⌉ + 1 and g = ε Σ|k|≤K k χAk . Then g has finite image and kf − gk∞ < ε. 
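The approximating function g in this proof is just f rounded down to the grid εZ; a sketch of mine, with finite S for simplicity:

```python
import math

def quantize(f, eps):
    # g(s) = eps*k on A_k = f^{-1}([k*eps, (k+1)*eps)), i.e. round down to the grid eps*Z
    return [eps * math.floor(x / eps) for x in f]

f = [0.31, -1.7, 2.2, 0.9]
eps = 0.25
g = quantize(f, eps)
err = max(abs(a - b) for a, b in zip(f, g))            # |f - g|_inf
grid_ok = all(abs(x / eps - round(x / eps)) < 1e-9 for x in g)  # g takes values in eps*Z
print(err < eps, grid_ok)  # True True
```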

4.14 Exercise Let S be an infinite set and 0 < p < q < ∞.


(i) Prove that all inclusions in Lemma 4.10 are strict.
(ii) For p < q, show that `p (S) is not a closed subspace of (`q (S), k · kq ) and analogously for
`q (S) ⊆ (c0 (S), k · k∞ ). Why is this compatible with Corollary 3.23?

4.15 Exercise For f : S → C, prove that limp→∞ kf kp = kf k∞ if kf kp < ∞ for some p ∈ [1, ∞), and limp→∞ kf kp = ∞ otherwise.
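A numerical illustration of the first case (my own example with finite support, so every kf kp is finite):

```python
f = [0.3, -2.5, 1.0, 2.5, 0.1]          # |f|_inf = 2.5, attained twice

def norm_p(f, p):
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

vals = [norm_p(f, p) for p in (1, 2, 8, 32, 128)]
decreasing = all(a >= b for a, b in zip(vals, vals[1:]))
print(decreasing, round(vals[-1], 3))    # True, and the last value is close to 2.5
```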

4.16 Exercise Let 1 < p ≤ ∞ and V = `p (N, R). Define δn ∈ V by δn (m) = δn,m and
xn = n−α δn , where α > 0.
(i) For which α > 0 is Σ∞n=1 xn absolutely convergent?
(ii) For which α > 0 is Σ∞n=1 xn unconditionally convergent? (Cf. Remark A.5.)
(iii) Use (i),(ii) to give examples of series that are unconditionally convergent, but not abso-
lutely convergent, in each `p (N, F) with 1 < p ≤ ∞.
(iv) BONUS: As (iii), but for p = 1. (NB: Just invoking Corollary B.3 is not enough!)

4.4 Separability of `p (S, F) and c0 (S, F)


4.17 Proposition Let p ∈ (0, ∞). The metric space (`p (S, F), dp ), where dp (f, g) = kf − gkp ,
is separable (⇔ second countable) if and only if the set S is countable.
Proof. We prove this for F = R, from which the claim for F = C is easily deduced. For f : S → R, let supp(f ) := {s ∈ S | f (s) 6= 0} ⊆ S be the support of f . Now, if S is countable, then Y = {g : S → Q | #(supp(g)) < ∞} ⊆ `p (S, R) is countable, and we claim that Y is dense in `p (S, R). To prove this, let f ∈ `p (S, R) and ε > 0. Since kf kpp = Σs∈S |f (s)|p < ∞, there is a finite subset T ⊆ S such that Σs∈S\T |f (s)|p < ε/2. On the other hand, since Q#T ⊆ R#T is dense, we can choose g : T → Q such that Σt∈T |f (t) − g(t)|p < ε/2. Defining g to be zero on S\T , we have g ∈ Y and

kf − gkpp = Σt∈T |f (t) − g(t)|p + Σs∈S\T |f (s)|p < ε/2 + ε/2 = ε,

and since ε > 0 was arbitrary, Y ⊆ `p (S, R) is dense.
For the converse, assume that S is uncountable. By Proposition A.2(iii), supp(f ) is countable for every f ∈ `p (S, R). Thus if Y ⊆ `p (S, R) is countable then T = ∪_{f∈Y} supp(f ) ⊆ S is a countable union of countable sets and therefore countable. Thus all functions f ∈ Y vanish on S\T ≠ ∅, and the same holds for every f in the closure of Y since the coordinate maps f 7→ f (s) are continuous in view of |f (s)| ≤ kf kp . Thus Y cannot be dense. □
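The density argument of the proof can be carried out concretely. The following sketch (ours, with a truncated stand-in for f (n) = 1/n in `2 (N, R) and Fraction playing the role of Q) picks a finite set T with small tail and a rational approximation on T:

```python
from fractions import Fraction

p = 2
M = 100_000
f = {n: 1.0 / n for n in range(1, M + 1)}   # truncated stand-in for f(n) = 1/n
eps = 1e-2

# finite set T = {1,...,K} with small tail sum_{n>K} |f(n)|^p
K = 10_000
tail_p = sum(f[n] ** p for n in range(K + 1, M + 1))

# Q-valued approximation g on T (bounded denominators, so g(n) is rational)
g = {n: float(Fraction(f[n]).limit_denominator(10 ** 6)) for n in range(1, K + 1)}
head_p = sum(abs(f[n] - g[n]) ** p for n in range(1, K + 1))

dist = (head_p + tail_p) ** (1 / p)   # kf - gk_p, with g := 0 off T
```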

4.18 Exercise With d∞ (f, g) = kf − gk∞ prove


(i) The space (`∞ (S, F), d∞ ) is separable if and only if S is finite.
(ii) The space (c0 (S, F), d∞ ) is separable if and only if S is countable.
Hint: For (i), consider {0, 1}S ⊆ `∞ (S).

4.5 Dual spaces of `p (S, F), 1 ≤ p < ∞, and c0 (S, F)


If (V, k · k) is a normed vector space over F and ϕ : V → F is a linear functional, Definition 3.2
specializes to
kϕk = sup_{0≠x∈V} |ϕ(x)|/kxk = sup_{x∈V, kxk≤1} |ϕ(x)|.


Recall that the dual space V ∗ = {ϕ : V → F linear | kϕk < ∞} is a Banach space with norm
kϕk. The aim of this section is to concretely identify `p (S, F)∗ for 1 ≤ p < ∞ and c0 (S, F)∗ .
(We will have something to say about `∞ (S, F)∗ , but the complete story would lead us too far.)
For the purpose of the following proof, it will be useful to define sgn : C → C by sgn(0) = 0 and sgn(z) = z/|z| otherwise. Then z = sgn(z)|z| and |z| = sgn(z)‾ z for all z ∈ C, where the bar denotes complex conjugation.
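A quick numerical check of these identities (our own illustration):

```python
def sgn(z):
    """sgn(0) = 0 and sgn(z) = z/|z| otherwise."""
    return 0 if z == 0 else z / abs(z)

z = 3 - 4j
prod = sgn(z).conjugate() * z   # conj(sgn(z)) * z equals |z| = 5
```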

4.19 Theorem (i) Let p ∈ [1, ∞] with conjugate value q. Then for each g ∈ `q (S, F) the map ϕg : `p (S, F) → F, f 7→ Σ_{s∈S} f (s)g(s) satisfies kϕg k ≤ kgkq , thus ϕg ∈ `p (S, F)∗ . And the map ι : `q (S, F) → `p (S, F)∗ , g 7→ ϕg , called the canonical map, is linear with kιk ≤ 1.
(ii) For all 1 ≤ p ≤ ∞ the canonical map `q (S, F) → `p (S, F)∗ is isometric.
(iii) If 1 ≤ p < ∞, the canonical map `q (S, F) → `p (S, F)∗ is surjective, thus `p (S, F)∗ ≅ `q (S, F).
(iv) The canonical map `1 (S, F) → c0 (S, F)∗ is an isometric bijection, thus c0 (S, F)∗ ≅ `1 (S, F).
(v) If S is finite, the canonical map `1 (S, F) → `∞ (S, F)∗ is surjective. But its image is a proper closed subspace of `∞ (S, F)∗ whenever S is infinite.
Proof. (i) For all p ∈ [1, ∞] and conjugate q we have

|Σ_s f (s)g(s)| ≤ Σ_{s∈S} |f (s)g(s)| ≤ kf kp kgkq < ∞   ∀f ∈ `p , g ∈ `q

by Hölder’s inequality. In either case, the absolute convergence for all f, g implies that (f, g) 7→ Σ_s f (s)g(s) is bilinear.
(ii) If kgk∞ 6= 0 and ε > 0 there is an s ∈ S with |g(s)| > kgk∞ − ε. If f = δs : t 7→ δs,t , we
have |ϕg (f )| = |g(s)| > kgk∞ − ε. Since kf k1 = 1, this proves kϕg k > kgk∞ − ε. Since ε > 0
was arbitrary, we have kϕg k ≥ kgk∞ .
If kgk1 ≠ 0, define f (s) = sgn(g(s))‾ (the complex conjugate). Then kf k∞ = 1 and Σ_s f (s)g(s) = Σ_s |g(s)| = kgk1 . This proves kϕg k ≥ kgk1 .

If 1 < p, q < ∞ and kgkq ≠ 0, define f (s) = sgn(g(s))‾ |g(s)|^{q−1} . Then

Σ_s f (s)g(s) = Σ_s |g(s)|^q = kgk_q^q ,
kf k_p^p = Σ_s |f (s)|^p = Σ_{s: g(s)≠0} |g(s)|^{(q−1)p} = Σ_s |g(s)|^q = kgk_q^q ,

where we used p + q = pq, whence (q − 1)p = q. The above gives

kϕg k ≥ |Σ_s f (s)g(s)| / kf kp = kgk_q^q / kf kp = kgk_q^q / kgk_q^{q/p} = kgk_q^{q(1−1/p)} = kgkq .

We thus have proven kϕg k ≥ kgkq in all cases and since the opposite inequality is known from (i), g 7→ ϕg is isometric.
(iii) Let 0 ≠ ϕ ∈ `1 (S, F)∗ . Define g : S → F by g(s) = ϕ(δs ). With kδs k1 = 1, we have |g(s)| = |ϕ(δs )| ≤ kϕk for all s ∈ S, thus kgk∞ ≤ kϕk. If f ∈ `1 (S, F) and F ⊆ S is finite, we have ϕ(f χF ) = ϕ(Σ_{s∈F} f (s)δs ) = Σ_{s∈F} f (s)g(s). In the limit F % S this becomes ϕ(f ) = Σ_{s∈S} f (s)g(s) = ϕg (f ) (since f g ∈ `1 , thus the r.h.s. is absolutely convergent, and kf (1 − χF )k1 → 0 and ϕ is k · k1 -continuous). This proves ϕ = ϕg with g ∈ `∞ (S, F).
Now let 1 < p, q < ∞, and let 0 ≠ ϕ ∈ `p (S, F)∗ . Since `1 (S, F) ⊆ `p (S, F) by Lemma 4.10, we can restrict ϕ to `1 (S, F), and the preceding argument gives a g ∈ `∞ (S, F) such that ϕ(f ) = Σ_{s∈S} f (s)g(s) for all f ∈ `1 (S, F). The arguments in the proof of (ii) also show that for 1 < p, q < ∞ and any function g : S → F we have

kgkq = sup{ |Σ_{s∈S} f (s)g(s)| : f ∈ c00 (S, F), kf kp ≤ 1 }.

Using this and ϕ(f ) = Σ_s f (s)g(s) for all f ∈ c00 (S, F) we have

kgkq = sup{ |ϕ(f )| : f ∈ c00 (S, F), kf kp ≤ 1 } = kϕk < ∞.

Now ϕ(f ) = Σ_{s∈S} f (s)g(s) = ϕg (f ) for all f ∈ `p (S, F) follows as before from f g ∈ `1 , kf (1 − χF )kp → 0 as F % S and the k · kp -continuity of ϕ.
(iv) Let 0 ≠ g ∈ `1 (S, F). Then ϕg ∈ `∞ (S, F)∗ , which we can restrict to c0 (S, F). For finite F ⊆ S define fF = f χF with f (s) = sgn(g(s))‾ . Then fF ∈ c00 (S, F) with kfF k∞ = 1 (provided F ∩ supp g ≠ ∅) and ϕg (fF ) = Σ_{s∈F} |g(s)|. Thus kϕg k ≥ Σ_{s∈F} |g(s)| for all finite F intersecting supp g, and this implies kϕg k ≥ kgk1 . The opposite being known, we have proven that `1 (S, F) → c0 (S, F)∗ is isometric.
To prove surjectivity, let 0 ≠ ϕ ∈ c0 (S, F)∗ and define g : S → F, s 7→ ϕ(δs ). If now f ∈ c0 (S, F) and F ⊆ S is finite, we have f χF = Σ_{s∈F} f (s)δs , thus ϕ(f χF ) = Σ_{s∈F} f (s)g(s). In particular with f (s) = sgn(g(s))‾ we have ϕ(f χF ) = Σ_{s∈F} f (s)g(s) = Σ_{s∈F} |g(s)|. Again we have kf χF k∞ ≤ kf k∞ = 1, thus |ϕ(f χF )| ≤ kϕk, and combining these observations gives kgk1 ≤ kϕk < ∞, thus g ∈ `1 (S, F). As F % S, we have kf (1 − χF )k∞ = kf χ_{S\F} k∞ → 0 since f ∈ c0 , thus with the k · k∞ -continuity of ϕ

ϕ(f ) = lim_{F %S} ϕ(f χF ) = lim_{F %S} Σ_{s∈F} f (s)g(s) = Σ_{s∈S} f (s)g(s) = ϕg (f ),

where we again used f g ∈ `1 . Thus ϕ = ϕg , so that `1 (S, F) → c0 (S, F)∗ is an isometric bijection.

(v) It is clear that ι : `1 (S, F) → `∞ (S, F)∗ is surjective if S is finite. Closedness of the
image of ι always follows from the completeness of `1 (S, F) and the fact that ι is an isometry,
cf. Corollary 3.23. The failure of surjectivity is deeper than the results of this section so far, so
that it is illuminating to give two proofs.
First proof: If S is infinite, the closed subspace c0 (S, F) ⊆ `∞ (S, F) is proper since 1 ∈ `∞ (S, F)\c0 (S, F). Thus the quotient space Z = `∞ (S, F)/c0 (S, F) is non-trivial. In Section 6.1

we will show that Z is a Banach space, thus admits non-zero bounded linear maps ψ : Z → F
by the Hahn-Banach theorem (Section 9), and that the quotient map P : `∞ (S, F) → Z is
bounded. Thus ϕ = ψ ◦ P is a non-zero bounded linear functional on `∞ (S, F) that vanishes on
the closed subspace c0 (S, F). By (iv), the canonical map `1 (S, F) → c0 (S, F)∗ is isometric, thus
ϕg with g ∈ `1 (S, F) vanishes identically on c0 (S, F) if and only if g = 0. Thus ϕ 6= ϕg for all
g ∈ `1 (S, F).
Second proof: (This proof uses no as yet unproven results from functional analysis, but it uses the Stone-Čech compactification from general topology. Cf. Appendix A.6.6 and [108].) Since S is discrete, `∞ (S, F) = Cb (S, F) ≅ C(βS, F), where βS is the Stone-Čech compactification of S. The isomorphism is given by the unique continuous extension Cb (S, F) → C(βS, F), f 7→ fb, with the restriction map C(βS, F) → Cb (S, F) as inverse. Since S is discrete and infinite, thus non-compact, βS ≠ S. If f ∈ c0 (S, F) then fb(x) = 0 for every x ∈ βS\S. (Proof: Let x ∈ βS\S. Since S is dense in βS, we can find a net {xι } in S such that xι → x. Since x ∉ S, the net eventually leaves every finite subset of S. Now f ∈ c0 (S) and continuity of fb imply fb(x) = lim fb(xι ) = lim f (xι ) = 0.)
Thus for such an x, the evaluation map ψx : C(βS, F) → F, fb 7→ fb(x) gives rise to a non-zero
bounded linear functional (in fact character) ϕ(f ) = fb(x) on Cb (S, F) = `∞ (S, F) that vanishes
on c0 (S, F). Now we conclude as in the first proof that ϕ 6= ϕg for all g ∈ `1 (S, F). 
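The extremal f constructed in the proof of part (ii) can be tested numerically: for real g and conjugate 1 < p, q < ∞, the pairing with f (s) = sgn(g(s))|g(s)|^{q−1} divided by kf kp reproduces kgkq . A sketch (ours; for real g the complex conjugation is vacuous):

```python
def sgn(x):
    return 0.0 if x == 0 else x / abs(x)

p, q = 3.0, 1.5                      # conjugate exponents: 1/p + 1/q = 1
g = [2.0, -1.0, 0.5, 3.0]            # a (real) element of l^q

# the extremal f from the proof of Theorem 4.19(ii)
f = [sgn(x) * abs(x) ** (q - 1) for x in g]

pairing = sum(fi * gi for fi, gi in zip(f, g))    # = kgk_q^q
f_p = sum(abs(x) ** p for x in f) ** (1 / p)      # = kgk_q^(q/p), using (q-1)p = q
g_q = sum(abs(x) ** q for x in g) ** (1 / q)
ratio = pairing / f_p                             # should equal kgk_q
```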

4.20 Remark 1. The two proofs given above for the non-surjectivity of the canonical map
`1 (S, F) → `∞ (S, F)∗ for infinite S are both non-constructive: The first used the Hahn-Banach
theorem, which we will prove using Zorn’s lemma, equivalent to AC. The second used the Stone-
Čech compactification βS whose usual construction relies on Tychonov’s theorem, which also
is equivalent to the axiom of choice. (But both the Hahn-Banach theorem and the existence
of the Stone-Čech compactification can be proven using only the ‘ultrafilter lemma’, which is
strictly weaker than AC. For Hahn-Banach see Appendix B.5, for Stone-Čech e.g. [149, 108].)
2. The dual space of `∞ (S, F) can be determined quite explicitly, but it is not a space of
functions on S as are the spaces c0 (S, F)∗ and `p (S, F)∗ . It is the space ba(S, F) of ‘finitely
additive F-valued measures on S’. A discussion of this can be found in the supplementary
Section B.3.2.
3. The non-constructiveness mentioned above is unavoidable: There are set theoretic frame-
works without the ultrafilter lemma (but with DCω ) in which `∞ (N)∗ ∼ = `1 (N), see [149, §23.10].
(In this situation, all finitely additive measures on N are countably additive!)
4. For all p ∈ (0, 1), the dual space `p (S, F)∗ equals {ϕg | g ∈ `∞ (S, F)} = `1 (S, F)∗ . See [108,
Appendix F.6]. Thus there is no p-dependence despite the fact that the `p (S, F) are mutually
non-isomorphic! 2

4.6 The Banach algebras (`∞ (S, F), ·) and (`1 (Z, F), ?)
4.21 Definition If f, g ∈ `∞ (S, F) we define f · g by (f · g)(s) = f (s)g(s) (pointwise product).
If f, g ∈ `1 (Z, F) we define the ‘convolution product’ f ? g by
(f ? g)(n) = Σ_{m∈Z} f (m)g(n − m) = Σ_{k,l∈Z, k+l=n} f (k)g(l). (4.5)

4.22 Lemma (i) If f, g ∈ `∞ (S, F) then kf · gk∞ ≤ kf k∞ kgk∞ , thus f · g ∈ `∞ (S, F).
(ii) If f, g ∈ `1 (Z, F) then kf ? gk1 ≤ kf k1 kgk1 , thus f ? g ∈ `1 (Z, F).
(iii) The maps · : `∞ (S, F) × `∞ (S, F) → `∞ (S, F) and ? : `1 (Z, F) × `1 (Z, F) → `1 (Z, F) are
bilinear, commutative and associative. A unit for · is the constant function 1 ∈ `∞ (S, F),
while δ0 (n) = δn,0 is a unit for ?. Thus (`∞ (S, F), ·, 1) and (`1 (Z, F), ?, δ0 ) are commutative
unital Banach algebras with k1k = 1.
(iv) c0 (S, F) ⊆ `∞ (S, F) is a closed two-sided ideal.
Proof. (i) If f, g ∈ `∞ (S, F) then sups∈S |f (s)g(s)| ≤ sups∈S |f (s)| sups∈S |g(s)| < ∞.
(ii) The second claim clearly follows from the first, which is seen by

kf ? gk1 = Σ_{n∈Z} |Σ_{m∈Z} f (m)g(n − m)| ≤ Σ_{n∈Z} Σ_{m∈Z} |f (m)g(n − m)|
= Σ_m |f (m)| (Σ_n |g(n − m)|) = (Σ_n |f (n)|)(Σ_m |g(m)|) = kf k1 kgk1 .

(iii) Bilinearity of both maps is obvious, as are commutativity and associativity of ·. Commutativity of ? is clear from the rightmost expression in (4.5). The latter is easily seen to imply

((f ? g) ? h)(n) = Σ_{k,l,m∈Z, k+l+m=n} f (k)g(l)h(m) = (f ? (g ? h))(n),

thus associativity of ?. The statements about units are easy, (i),(ii) give submultiplicativity of
the norms, and completeness was proven earlier. In both cases it is obvious that k1k = 1.
(iv) Since c0 (S, F) ⊆ `∞ (S, F) is a closed linear subspace, it remains to show that f ∈
c0 (S, F), g ∈ `∞ (S, F) implies f g, gf ∈ c0 (S, F). This is obvious. 
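The convolution product and the estimate of Lemma 4.22(ii) are easy to check for finitely supported elements of `1 (Z, F), represented here as dictionaries n 7→ f (n). A sketch (our own; `conv` and `norm1` are our helper names):

```python
from collections import defaultdict

def conv(f, g):
    """Convolution (4.5) for finitely supported f, g given as dicts n -> value."""
    h = defaultdict(float)
    for k, fk in f.items():
        for l, gl in g.items():
            h[k + l] += fk * gl
    return dict(h)

def norm1(f):
    return sum(abs(v) for v in f.values())

f = {0: 1.0, 1: -2.0, 3: 0.5}
g = {-1: 4.0, 2: 1.0}
delta0 = {0: 1.0}   # the unit for convolution
h = conv(f, g)
```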

4.23 Remark 1. For 1 < p < ∞ there is no natural way of defining a bilinear product on
`p (S, F) turning it into a Banach algebra.
2. If G is a discrete group, the definition of `1 (Z, F) is easily adapted to `1 (G, F) by putting (f ? g)(k) = Σ_{l∈G} f (l)g(l^{−1} k). This again gives rise to a Banach algebra, but it is commutative
if and only if G is abelian. 2

4.7 Outlook on general Lp -spaces


For an arbitrary measure space (X, A, µ) one can define normed spaces Lp (X, A, µ; F) in a
broadly analogous fashion. (We will usually omit the F.) Since integration on measure spaces
is not among the formal prerequisites of these notes, we only sketch the basic facts referring
to, e.g., [29, 146] for details. If f : X → F is a measurable function and 0 < p < ∞, then kf kp = (∫ |f (x)|^p dµ(x))^{1/p} ∈ [0, ∞]. If p = ∞, put^26

kf k∞ = ess sup_µ |f | = inf{λ > 0 | µ({x ∈ X | |f (x)| > λ}) = 0}.

Now Lp (X, µ) = {f : X → F measurable | kf kp < ∞} is an F-vector space for all p ∈ (0, ∞].
For 1 ≤ p ≤ ∞, the proofs of the inequalities of Hölder and Minkowski extend to the present
setting without any difficulties, so that the k · kp are seminorms on Lp (X, A, µ). But the latter
fails to be a norm whenever there exists ∅ 6= Y ∈ A with µ(Y ) = 0 since then kχY kp = 0.
^26 Warning: [29] defines k · k∞ using locally null sets instead of null sets, which is very non-standard.

For this reason we define Lp (X, A, µ) as the quotient of the space above by its subspace {f | kf kp = 0}. Now it is straightforward to prove that Lp (X, µ) is a normed space, and it is in fact complete; the completeness proof again uses Proposition 3.15. If S is a set and µ is the counting measure, we have `p (S, F) = Lp (S, P (S), µ; F) (no quotient is needed, since the counting measure has no non-empty null sets).
A measurable function is called simple if it assumes only finitely many values. Equivalently, it is of the form f (x) = Σ_{k=1}^K ck χ_{Ak} (x), where A1 , . . . , AK are measurable sets. Now one proves
that the simple functions are dense in Lp for all p ∈ [1, ∞]. If X is locally compact and µ is nice
enough, the set Cc (X, F) of compactly supported continuous functions is dense in Lp (X, A, µ; F)
for 1 ≤ p < ∞, while its closure in L∞ is C0 (X, F).
The inclusion `p ⊆ `q for p ≤ q (Lemma 4.10) is false for general measure spaces! In fact,
if µ(X) < ∞ then one has the reverse inclusion p ≤ q ⇒ Lq (X, A, µ) ⊆ Lp (X, A, µ), while for
general measure spaces there is no inclusion relation between the Lp with different p.
If 1 < p, q < ∞ are conjugate, the canonical map Lq (X, A, µ) → Lp (X, A, µ)∗ , g 7→ ϕg , is an isometric bijection for all measure spaces. That it is an isometry is proven just as for the spaces `p : Hölder’s inequality gives kϕg k ≤ kgkq , and equality is proven as in Theorem 4.19(ii) by showing |ϕg (f )| ≥ kf kp kgkq , where the extremal f ∈ Lp are the same as before. However, isometry of L∞ (X, A, µ) → L1 (X, A, µ)∗ is not automatic, as the measure space X = {x}, A = P (X) = {∅, X} and µ : ∅ 7→ 0, X 7→ +∞ shows, for which L1 (X, A, µ; F) ≅ {0}, thus L1 (X, A, µ; F)∗ ≅ {0}, while L∞ (X, A, µ; F) ≅ F. It is not hard to show that L∞ → (L1 )∗ is isometric if and only if (X, A, µ) is semifinite, i.e.

µ(Y ) = sup{µ(Z) | Z ∈ A, Z ⊆ Y, µ(Z) < ∞} ∀Y ∈ A.
If 1 < p < ∞, one still has surjectivity of Lp → (Lq )∗ for all measure spaces (X, A, µ),
but the standard proof is outside our scope since it requires the Radon-Nikodym theorem.
(For a more functional-analytic proof see Section B.6.8.) In order for L∞ → (L1 )∗ to be an
isometric bijection, the measure space must be ‘localizable’, cf. [146]. This condition subsumes
semifiniteness and is implied by σ-finiteness, to which case many books limit themselves.
Since we relegated the dual spaces `∞ (S, F)∗ to an appendix, we only remark that also
in general L∞ (X, A, µ)∗ is a space of finitely additive measures with fairly similar proofs, see
[43]. For 0 < p < 1, the dual spaces (Lp )∗ behave even more strangely than (`p )∗ . For example, Lp ([0, 1], λ; R)∗ = {0}.

5 Basics of Hilbert spaces


5.1 Inner products. Cauchy-Schwarz inequality
We have seen that every bounded linear functional ϕ on `p (S, F), where 1 ≤ p < ∞, is of the form ϕg : f 7→ Σ_{s∈S} f (s)g(s) for a certain unique g ∈ `q (S, F). Here the conjugate exponent q ∈ (1, ∞] is determined by 1/p + 1/q = 1. Clearly we have p = q if and only if p = 2. In this case we have self-duality: `2 (S, F)∗ ≅ `2 (S, F). The map

`2 (S, F) × `2 (S, F) → F, (f, g) 7→ Σ_{s∈S} f (s)g(s)

is bilinear and symmetric. Furthermore, it satisfies |Σ_{s∈S} f (s)g(s)| ≤ kf k2 kgk2 . Defining ḡ(s) = g(s)‾ (complex conjugation), we have kḡk2 = kgk2 , so that also |Σ_{s∈S} f (s)ḡ(s)| ≤ kf k2 kgk2 , which is the Cauchy-Schwarz inequality (in its incarnation for `2 (S, C)).
For the development of a general, abstract theory it is better to adopt a slightly different
definition:

5.1 Definition Let V be an F-vector space. An inner product on V is a map V × V →
F, (x, y) 7→ hx, yi such that
(i) The map x 7→ hx, yi is linear for each choice of y ∈ V .
(ii) hy, xi = hx, yi‾ ∀x, y ∈ V , where the bar denotes complex conjugation.
(iii) hx, xi ≥ 0 ∀x, and hx, xi = 0 ⇒ x = 0.

5.2 Remark 1. Many authors write (x, y) instead of hx, yi, but this often leads to confusion
with the notation for ordered pairs. We will use pointed brackets throughout.
2. Combining the first two axioms one finds that the map y 7→ hx, yi is anti-linear for each choice of x. This means hx, cy + c′ y ′ i = c̄ hx, yi + c̄′ hx, y ′ i for all y, y ′ ∈ V and c, c′ ∈ F. Of course this reduces to linearity if F = R. A map V × V → C that is linear in the first variable and anti-linear in the second is called sesquilinear.
3. A large minority of authors, mostly (mathematical) physicists, defines inner products to
be linear in the second and anti-linear in the first argument. We follow the majority practice.
4. If F = R then hy, xi = hx, yi‾ = hx, yi ∀x, y. Thus h·, ·i is bilinear and symmetric.
5. The first two axioms together already imply hx, xi ∈ R for all x, but not the positivity
assumption.
6. If hx, yi = 0 for all y ∈ V then x = 0. To see this, it suffices to take y = x. 2

5.3 Example 1. If V = Cn then hx, yi = Σ_{i=1}^n xi ȳi is an inner product and the corresponding norm (see below) is k · k2 , which is complete.
2. Let S be any set and V = `2 (S, C). Then hf, gi = Σ_{s∈S} f (s)ḡ(s) converges for all f, g ∈ V by Hölder’s inequality and is easily seen to be an inner product. Of course, 1. is a special case of 2.
3. If (X, A, µ) is any measure space then hf, gi = ∫_X f (x)ḡ(x) dµ(x) is an inner product on L2 (X, A, µ; F) turning it into a Hilbert space. (Here we allow ourselves a standard sloppiness: The elements of Lp are not functions, but equivalence classes of functions. The inner product of two such classes is defined by picking arbitrary representatives.)
4. Let V = Mn×n (C). For a, b ∈ V , define ha, bi = Tr(b∗ a) = Σ_{i,j=1}^n aij b̄ij , where (b∗ )ij = b̄ji . That this is an inner product turning V into a Hilbert space follows from 1. upon the identification Mn×n (C) ≅ C^{n²} .
Since hx, xi ≥ 0 for all x, the square root hx, xi^{1/2} is defined, and we agree that it always denotes the non-negative root.

5.4 Lemma (Abstract Cauchy-Schwarz inequality)^27 If h·, ·i is an inner product on V then

|hx, yi| ≤ hx, xi^{1/2} hy, yi^{1/2} ∀x, y ∈ V. (5.1)
(This even holds if one drops the assumption that hx, xi = 0 ⇒ x = 0.)
Equality holds in (5.1) if and only if one of the vectors is zero or x = cy for some c ∈ F.

5.5 Exercise Prove Lemma 5.4 along the following lines:


1. Prove it for y = 0, so that we may assume y 6= 0 from now on.
2. Define x1 = kyk−2 hx, yiy and x2 = x − x1 and prove hx1 , x2 i = 0.
^27 Augustin-Louis Cauchy (1789-1857). French mathematician with many important contributions to analysis. Karl Hermann Amandus Schwarz (1843-1921). German mathematician, mostly active in complex analysis. Some authors (mostly Russian ones) include Viktor Yakovlevich Bunyakovski (1804-1889), Russian mathematician, in the name of the inequality.

3. Use 2. to prove kxk2 = kx1 k2 + kx2 k2 ≥ kx1 k2 .
4. Deduce Cauchy-Schwarz from kx1 k2 ≤ kxk2 .
5. Prove the claim about equality.
The above proof is the easiest to memorize (at least in outline) and reconstruct, but there
are many others, e.g.:

5.6 Exercise Let V be a vector space with inner product h·, ·i and define kxk = hx, xi1/2 .
(i) For x, y ∈ V and t ∈ R, define P (t) = kx + tyk2 and show this defines a quadratic
polynomial in t with real coefficients.
(ii) Use the obvious fact that this polynomial takes values in [0, ∞) for all t ∈ R, thus also
inf t∈R P (t) ≥ 0, to prove the Cauchy-Schwarz inequality.
5.7 Proposition If h·, ·i is an inner product on V then kxk = +√hx, xi is a norm on V .
(An inner product h·, ·i and a norm k · k related in this way are called compatible.)
Proof. kxk ≥ 0 holds by construction, and the third axiom in Definition 5.1 implies kxk = 0 ⇒ x = 0. We have

kcxk = hcx, cxi^{1/2} = (cc̄ hx, xi)^{1/2} = (|c|^2 hx, xi)^{1/2} = |c| kxk,
thus kcxk = |c|kxk for all x ∈ V, c ∈ F. Finally,

kx + yk2 = hx + y, x + yi = hx, xi + hx, yi + hy, xi + hy, yi = kxk2 + kyk2 + hx, yi + hy, xi.

With Re z ≤ |z| for all z ∈ C and the Cauchy-Schwarz inequality we have

hx, yi + hy, xi = hx, yi + hx, yi‾ = 2 Rehx, yi ≤ 2|hx, yi| ≤ 2kxk kyk,

thus
kx + yk2 ≤ kxk2 + kyk2 + 2kxkkyk = (kxk + kyk)2
and therefore kx + yk ≤ kxk + kyk, i.e. subadditivity. 

In terms of the norm, the Cauchy-Schwarz inequality just becomes |hx, yi| ≤ kxkkyk.

5.8 Definition A pre-Hilbert space (or inner product space) is a pair (V, h·, ·i), where V is an
F-vector space and h·, ·i an inner product on it. A Hilbert space is a pre-Hilbert space that is
complete for the norm k · k obtained from the inner product.

5.9 Remark 1. By the above, an inner product gives rise to a norm and therefore to a norm
topology τ . Now the Cauchy-Schwarz inequality implies that the inner product is jointly continuous:

|hx, yi − hx0 , y 0 i| = |hx, yi − hx, y 0 i + hx, y 0 i − hx0 , y 0 i|


= |hx, y − y 0 i + hx − x0 , y 0 i|
≤ kxkky − y 0 k + kx − x0 kky 0 k.

2. If h·, ·i is an inner product on H and k · k is the norm derived from it then

kxk = sup_{y∈H, kyk=1} |hx, yi| ∀x ∈ H. (5.2)

(For x = 0 this is obvious, and for x ≠ 0 it follows from hx, x/kxki = kxk.)
3. The restriction of an inner product on H to a linear subspace K ⊆ H again is an inner
product. Thus if H is a Hilbert space and K a closed subspace then K again is a Hilbert space
(with the restricted inner product).
4. All spaces considered in Example 5.3 are complete, thus Hilbert spaces. For `2 (S) this was proven in Section 4, and the claim for Cn , thus also Mn×n (C), follows since Cn ≅ `2 (S, C) when #S = n. For L2 (X, A, µ) see books on measure theory like [29, 146]. 2
when #S = n. For L2 (X, A, µ) see books on measure theory like [29, 146]. 2

5.10 Definition Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be pre-Hilbert spaces. A linear map A : H1 → H2
is called
• isometric or an isometry if hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 .
• unitary if it is a surjective isometry.

5.11 Remark Every unitary map is invertible with unitary inverse. Two Hilbert spaces H1 , H2
are called unitarily equivalent or isomorphic if there exists a unitary U : H1 → H2 . 2

If (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) are (pre-)Hilbert spaces then
h(x1 , x2 ), (y1 , y2 )i = hx1 , y1 i1 + hx2 , y2 i2
defines an inner product on H1 ⊕ H2 turning it into a (pre-)Hilbert space. With this definition,
k(x, y)k = h(x, y), (x, y)i^{1/2} = (hx, xi1 + hy, yi2 )^{1/2} = (kxk_1^2 + kyk_2^2 )^{1/2} (thus not kxk1 + kyk2 !).
More generally, if {(Hi , h·, ·ii )}i∈I is a family of (pre-)Hilbert spaces then

⊕_{i∈I} Hi = { {xi }i∈I | Σ_{i∈I} hxi , xi ii < ∞ }   with   h{xi }, {yi }i = Σ_{i∈I} hxi , yi ii

is a (pre-)Hilbert space. (If Hi = F for all i ∈ I, this construction recovers `2 (I, F), while the
Banach space direct sum gives `1 (I, F).)

5.12 Exercise Let (V, h·, ·i) be a pre-Hilbert space and k·k the associated norm. Let (V 0 , k·k0 )
be the completion (as a normed space) of (V, k · k). Prove that V 0 is a Hilbert space.

5.2 The parallelogram and polarization identities


Given a normed space (V, k · k), it is natural to ask whether there exists an inner product h·, ·i
on V compatible with k · k (in the sense hx, xi = kxk2 ∀x).

5.13 Exercise Let (V, h·, ·i) be a pre-Hilbert space over F ∈ {R, C}. Prove the parallelogram
identity
kx + yk2 + kx − yk2 = 2kxk2 + 2kyk2 ∀x, y ∈ V (5.3)
and the polarization identities

hx, yi = (1/4)(kx + yk^2 − kx − yk^2 ) if F = R, (5.4)

hx, yi = (1/4) Σ_{k=0}^{3} i^k kx + i^k yk^2 if F = C. (5.5)
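The complex polarization identity (5.5) can be verified numerically for the standard inner product on C^2 . A sketch (our own illustration):

```python
def inner(x, y):
    """Standard inner product on C^n (linear in the first slot)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def nsq(x):
    return inner(x, x).real   # kxk^2

def polarization(x, y):
    """(1/4) sum_{k=0}^{3} i^k * kx + i^k yk^2, the right-hand side of (5.5)."""
    return sum((1j ** k) * nsq([a + (1j ** k) * b for a, b in zip(x, y)])
               for k in range(4)) / 4

x = [1 + 2j, -0.5 + 1j]
y = [2 - 1j, 1 + 3j]
```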

40
For a map of (pre-)Hilbert spaces we have two a priori different notions of isometry, but
they are equivalent:

5.14 Exercise Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be (pre-)Hilbert spaces over F ∈ {R, C}. Let k · k1,2
be the norms induced by the inner products. Prove that a linear map A : H1 → H2 is an
isometry of normed spaces (i.e. kAxk2 = kxk1 ∀x ∈ H1 ) if and only if it is an isometry of
pre-Hilbert spaces (i.e. hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 ).

5.15 Exercise Let S be a set with #S ≥ 2 and p ∈ [1, ∞]. Prove that the norm k · kp on
`p (S, F) satisfies the parallelogram identity if and only if p = 2.
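The "only if" direction of this exercise is visible already for the two standard basis vectors of F^2 : the parallelogram defect vanishes for p = 2 but not for p = 1 or p = 3. A sketch (ours):

```python
def p_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1 / p)

def defect(x, y, p):
    """kx+yk^2 + kx-yk^2 - 2kxk^2 - 2kyk^2 for the p-norm; (5.3) says this is 0."""
    xp = [a + b for a, b in zip(x, y)]
    xm = [a - b for a, b in zip(x, y)]
    return (p_norm(xp, p) ** 2 + p_norm(xm, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)

x, y = [1.0, 0.0], [0.0, 1.0]
d1, d2, d3 = defect(x, y, 1), defect(x, y, 2), defect(x, y, 3)
```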

5.16 Exercise (Jordan-von Neumann 1935) 28 Let (V, k · k) be a normed space over F ∈
{R, C} satisfying (5.3). Define h·, ·i : V × V → R by (5.4).
(i) Prove hx, yi = hy, xi and hx + x0 , yi = hx, yi + hx0 , yi.
(ii) Prove htx, yi = thx, yi for all t ∈ N, then successively for t ∈ Z, Q, R.
(iii) Prove that h·, ·i is compatible with k · k and makes V a real inner product space.
(iv) If F = C, prove that (5.5) defines an inner product (Definition 5.1) compatible with k · k.
Hint: Prove and use a relationship between the right hand sides of (5.4) and (5.5).
Exercise 5.16 shows that (5.3) characterizes the Banach spaces that ‘are’ Hilbert spaces in
the sense of admitting an inner product compatible with the norm. There are very many such
criteria: The whole book [2] is dedicated to proving about 350 of them! To state just one more:
A Banach space V is a Hilbert space (in the above sense) if and only if for every 2-dimensional
subspace W there exists an idempotent P = P 2 ∈ B(V ) with kP k = 1 such that P V = W .

5.17 Exercise (i) If H is a Hilbert space and x1 , . . . , xn ∈ H, prove [this is easier without induction!] the generalized parallelogram identity

2^{−n} Σ_{s∈{±1}^n} k Σ_{i=1}^n si xi k^2 = Σ_{i=1}^n kxi k^2 . (5.6)

(ii) Prove that a Banach space (V, k · k) is isomorphic to a Hilbert space (not necessarily
isometrically) if and only if there is an inner product h·, ·i on V such that the norm
kxk0 = hx, xi1/2 is equivalent to k · k.
(iii) If V is a Banach space isomorphic to a Hilbert space, prove that there are C ′ ≥ C > 0 such that

C Σ_{i=1}^n kxi k^2 ≤ 2^{−n} Σ_{s∈{±1}^n} k Σ_{i=1}^n si xi k^2 ≤ C ′ Σ_{i=1}^n kxi k^2 (5.7)
for all n ∈ N and x1 , . . . , xn ∈ V .
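Identity (5.6) can be checked numerically by averaging over all 2^n sign patterns. A sketch (our own, in R^3 with n = 4 arbitrary sample vectors):

```python
import itertools

def nsq(x):
    return sum(t * t for t in x)   # kxk^2 in R^3

xs = [[1.0, 2.0, 0.0], [0.5, -1.0, 3.0], [2.0, 2.0, -1.0], [0.0, 1.0, 1.0]]
n = len(xs)

avg = 0.0
for signs in itertools.product([1.0, -1.0], repeat=n):
    combo = [sum(s * v[j] for s, v in zip(signs, xs)) for j in range(3)]
    avg += nsq(combo)
avg /= 2 ** n                   # left-hand side of (5.6)

rhs = sum(nsq(v) for v in xs)   # right-hand side of (5.6)
```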

5.18 Remark ? A Banach space V in which the second inequality in (5.7) holds for some
C 0 > 0, all n ∈ N and x1 , . . . , xn ∈ V is said to have ‘type 2’. It has ‘cotype 2’ if the analogous
statement holds for the first inequality. (Type p and cotype p are defined analogously by
^28 Pascual Jordan (1902-1980). German theoretical physicist with contributions to quantum theory. (Not to be confused with the French mathematician Camille Jordan (1838-1922) to whom e.g. the Jordan normal form is due.)

replacing all k · k2 by k · kp .) By a remarkable theorem of Kwapień29 (1972), every Banach space
of type and cotype 2 (i.e. satisfying (5.7) with fixed C, C 0 > 0 for all n, xi ) is isomorphic to a
Hilbert space! For a proof see e.g. [1, Theorem 7.4.1], [97, Vol. 1, Theorem 5.V.6].
Granting this, we have a criterion for a given Banach space (V, k · k) to be isomorphic (not
necessarily isometrically) to a Hilbert space. We will encounter two more, cf. Remark 6.10.4
and Remark 9.13, but proving the converse directions is way too involved for these notes. 2

5.3 Basic Hilbert space geometry


5.19 Definition If H is a (pre-)Hilbert space then x, y ∈ H are called orthogonal, denoted
x ⊥ y, if hx, yi = 0. If S, T ⊆ H then S ⊥ T means x ⊥ y ∀x ∈ S, y ∈ T .
It should be obvious that x ⊥ y implies cx ⊥ dy for all c, d ∈ F.

5.20 Lemma (Pythagoras’ theorem) Let H be a (pre-)Hilbert space and x1 , . . . , xn ∈ H


mutually orthogonal, i.e. i 6= j ⇒ hxi , xj i = 0. Let x = x1 + · · · + xn . Then

kxk2 = kx1 k2 + · · · + kxn k2 .

Proof. We have

kxk^2 = hx, xi = hΣ_i xi , Σ_j xj i = Σ_{i,j} hxi , xj i = Σ_i hxi , xi i = Σ_i kxi k^2 ,

where we used i 6= j ⇒ hxi , xj i = 0. 


5.21 Remark If H is a Hilbert space, I is an infinite set and {xi }i∈I ⊆ H is such that Σ_i kxi k < ∞ then we can make sense of x = Σ_{i∈I} xi ∈ H by completeness and Proposition 3.15. If all xi are mutually orthogonal then by taking the limit over finite subsets we again have kxk^2 = Σ_{i∈I} kxi k^2 . (This shows that Σ_i kxi k < ∞ ⇒ Σ_i kxi k^2 < ∞, which also follows from the inclusion `1 (S) ⊆ `2 (S) proven in Lemma 4.10.) 2

5.22 Definition Let V be an F-vector space. Then C ⊆ V is called convex if for all x, y ∈ C
and t ∈ [0, 1] we have tx + (1 − t)y ∈ C. (Equivalently tC + (1 − t)C ⊆ C for all t ∈ [0, 1].)
5.23 Exercise If V is an F-vector space and C ⊆ V is convex, prove that Σ_{i=1}^N ti xi ∈ C whenever x1 , . . . , xN ∈ C and t1 , . . . , tN ≥ 0 satisfy Σ_i ti = 1.

5.24 Proposition (Riesz lemma) 30 Let H be a Hilbert space and C ⊆ H a non-empty


closed convex set. Then for each x ∈ H there is a unique y ∈ C minimizing kx − yk, i.e.
kx − yk = inf z∈C kx − zk.
Proof. We will prove this for x = 0, in which case the statement says that there is a unique element of C of minimal norm. For general x ∈ H, let y ′ be the unique element of minimal norm in the convex set C ′ = C − x. Then y = y ′ + x is the unique element in C minimizing kx − yk.
^29 Stanislaw Kwapień (b. 1942). Polish mathematician, mostly in functional analysis and probability. The proof of his theorem indeed uses some probability theory.
^30 Frigyes Riesz (1880-1956). Hungarian mathematician and one of the pioneers of functional analysis. (The same applies to his younger brother Marcel Riesz (1886-1969).)

Let d = inf z∈C kzk and pick a sequence {yn } in C such that kyn k → d. Since C is convex, we have (yn + ym )/2 ∈ C, thus k(yn + ym )/2k ≥ d for all n, m. For every ε > 0 there is an N ∈ N such that n ≥ N implies kyn k < d + ε. Thus if n, m ≥ N , then with the parallelogram identity (5.3) we have

kyn − ym k^2 = 2kyn k^2 + 2kym k^2 − kyn + ym k^2 = 2kyn k^2 + 2kym k^2 − 4 k(yn + ym )/2k^2 < 4(d + ε)^2 − 4d^2 = 8dε + 4ε^2 .

This implies that {yn } is a Cauchy sequence and therefore converges to some y ∈ H by
completeness of H, and closedness of C gives y ∈ C. By continuity of the norm, we have
kyk = k lim yn k = lim kyn k = d.
0
If y, y 0 ∈ C with kyk = ky 0 k = d then y+y 0 2 2
2 ∈ C by convexity, thus k(y + y )/2k ≥ d by the
2
y+y 0
definition of d. Now the parallelogram identity gives 0 ≤ ky − y 0 k2 = 4d2 − 4 2 ≤ 0. Thus
ky − y0k = 0, proving y = y0. 

5.4 Closed subspaces, orthogonal complement, and orthogonal


projections
5.25 Definition Let (H, h·, ·i) be a Hilbert space and S ⊆ H. The orthogonal complement
S ⊥ is defined as
S ⊥ = {y ∈ H | hy, xi = 0 ∀x ∈ S}.

5.26 Exercise Let H be a Hilbert space over F and S, T ⊆ H arbitrary subsets. Prove:
(i) T ⊆ S ⊥ ⇔ S ⊥ T ⇔ S ⊆ T ⊥ .
(ii) S ⊥ ⊆ H is a closed linear subspace.

(iii) S ⊥ = (span_F (S))⊥ , which also equals the orthogonal complement of the closed linear span of S.
(iv) If S ⊆ T then T ⊥ ⊆ S ⊥ .
(v) S ⊆ S ⊥⊥ and S ⊥ = S ⊥⊥⊥ .
A linear subspace of a vector space clearly is a convex subset. Now,

5.27 Theorem Let H be a Hilbert space and K ⊆ H a closed linear subspace. Define a map
P : H → K by P x = y, where y ∈ K minimizes kx − yk as in Proposition 5.24. Also define
Qx = x − P x. Then
(i) Qx ∈ K ⊥ ∀x.
(ii) For each x ∈ H there are unique y ∈ K, z ∈ K ⊥ with x = y + z, namely y = P x, z = Qx.
(iii) The maps P, Q are linear.
(iv) The map P : H → H satisfies P 2 = P and hP x, yi = hx, P yi. The same holds for Q.
(v) The map U : H → K ⊕ K ⊥ , x 7→ (P x, Qx) is an isomorphism of Hilbert spaces. In
particular, kxk2 = kP xk2 + kQxk2 ∀x.
Proof. (i) Let x ∈ H, v ∈ K. We want to prove Qx ⊥ v, i.e. hx − P x, vi = 0. Since y = P x is
the element of K minimizing kx − yk, we have for all t ∈ C

kx − P xk ≤ kx − P x − tvk.

Taking squares and putting z = x − y = x − P x, this becomes hz, zi ≤ hz − tv, z − tvi, equivalent to

2 Re(t hv, zi) ≤ |t|^2 kvk^2 .

With the polar decomposition t = |t|e^{iϕ} , this inequality becomes 2 Re(e^{iϕ} hv, zi) ≤ |t|kvk^2 . Taking |t| → 0, we find Re(e^{iϕ} hv, zi) ≤ 0, and since ϕ was arbitrary (replace ϕ by ϕ + π), we conclude hv, zi = 0. In view of z = x − y = x − P x this is what we wanted.
(ii) For each x ∈ H we have x = P x + Qx with P x ∈ K, Qx ∈ K ⊥ , proving the existence.
If y, y 0 ∈ K, z, z 0 ∈ K ⊥ such that y + z = y 0 + z 0 then y − y 0 = z 0 − z ∈ K ∩ K ⊥ = {0}. Thus
y − y 0 = z 0 − z = 0, proving the uniqueness.
(iii) If x, x′ ∈ H, c, c′ ∈ F then cx + c′ x′ = P (cx + c′ x′ ) + Q(cx + c′ x′ ). But also

cx + c′ x′ = c(P x + Qx) + c′ (P x′ + Qx′ ) = (cP x + c′ P x′ ) + (cQx + c′ Qx′ ),

where the first bracket lies in K and the second in K ⊥ , so this is a decomposition of cx + c′ x′ as a sum of vectors in K and K ⊥ , respectively. Since such a decomposition is unique by (ii), we have P (cx + c′ x′ ) = cP x + c′ P x′ and Q(cx + c′ x′ ) = cQx + c′ Qx′ , which just is the linearity of P and Q.
(iv) The definition of P clearly implies P x = x if x ∈ K (thus P |K = idK ). With P H ⊆ K
we have P 2 x = P (P x) = P x for all x, thus P 2 = P . And for all x, y ∈ H we have

hP x, yi = hP x, P y + Qyi = hP x, P yi = hP x + Qx, P yi = hx, P yi,

where we used the orthogonality of the images of P and Q. The proofs for Q are analogous.
(v) It is clear that U is a linear isomorphism. Furthermore, P x ⊥ Qy implies

hx, yi = hP x + Qx, P y + Qyi = hP x, P yi + hQx, Qyi = hU x, U yi,

so that U is an isometry of Hilbert spaces. 

5.28 Remark The above theorem remains valid if H is only a pre-Hilbert space, provided
K ⊆ H is finite-dimensional. We first note that the proof of Lemma 5.24 only uses completeness
of C ⊆ H, not that of H. And we recall that finite-dimensional subspaces of normed spaces are
automatically complete and closed, cf. Exercises 2.33 and 3.22. In the proof of Theorem 5.27
we use Lemma 5.24 with C = K, which is complete as just noted. 2

5.29 Exercise Let H be a Hilbert space and V ⊆ H a linear subspace. Prove:

(i) V ⊥ = {0} if and only if V is dense in H.
(ii) V ⊥⊥ equals the closure of V in H.

5.30 Definition Let V be a vector space and H a (pre-)Hilbert space.


• A linear map P : V → V is idempotent if P 2 ≡ P ◦ P = P .
• A linear map P : H → H is self-adjoint31 if hP x, yi = hx, P yi for all x, y ∈ H.
• A bounded linear map P : H → H is an orthogonal projection if it is a self-adjoint
idempotent.
31
Many authors write ‘hermitian’ instead of ‘self-adjoint’. We stick to the latter.

(In Theorem 7.32 we will prove that every self-adjoint P : H → H is automatically bounded,
but this is not needed here. There are unbounded idempotents.)
We have seen that every closed subspace K of a Hilbert space gives rise to an orthogonal
projection P with P H = K. Conversely, we have:

5.31 Exercise Let H be a Hilbert space and P ∈ B(H) idempotent. Prove:


(i) K = P H ⊆ H and L = (1 − P )H are closed linear subspaces.
(ii) We have K ⊥ L if and only if P is self-adjoint, i.e. an orthogonal projection.
(iii) If P is an orthogonal projection then it equals the P associated to K by Theorem 5.27.
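The defining properties of an orthogonal projection from Theorem 5.27 and Exercise 5.31 are easy to check numerically in finite dimensions. The following sketch (not part of the notes; NumPy, with the dimension and the subspace K chosen purely for illustration) builds P for a two-dimensional K ⊆ R5 and verifies idempotency, self-adjointness and Pythagoras:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))          # columns span the closed subspace K of H = R^5
Q, _ = np.linalg.qr(A)                   # orthonormal basis of K
P = Q @ Q.T                              # orthogonal projection onto K

assert np.allclose(P @ P, P)             # idempotent: P^2 = P
assert np.allclose(P, P.T)               # self-adjoint: <Px, y> = <x, Py>

x = rng.standard_normal(5)
y, z = P @ x, x - P @ x                  # the unique decomposition x = y + z
assert np.allclose(Q.T @ z, 0)           # z lies in K^perp
assert np.isclose(x @ x, y @ y + z @ z)  # ||x||^2 = ||Px||^2 + ||Qx||^2
```

The identity P = QQT holds precisely because the columns of Q are orthonormal, mirroring the formula PF0 (x) = Σe∈F hx, eie used later in the proof of Theorem 5.42.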
In linear algebra, one has a notion of quotient spaces, cf. e.g. [55, Exercise 31 in Section 1.3]:
If V is an F-vector space and W ⊆ V is a linear subspace, one defines an equivalence relation
on V by x ∼ y ⇔ x − y ∈ W and then lets V /W denote the quotient space V /∼, i.e. the set of
∼-equivalence classes. One shows that V /W again is an F-vector space.
It is very natural to ask whether V /W again is a Hilbert (or Banach) space if that is the
case for V . For Hilbert spaces this is quite easy:

5.32 Exercise Let H be a Hilbert space and K ⊆ H a closed linear subspace. Prove that
there is a linear isomorphism H/K → K ⊥ of F-vector spaces.
Conclude that the quotient space H/K of a Hilbert space H by a closed subspace K admits
an inner product turning it into a Hilbert space.

5.33 Exercise Let H be a Hilbert space and K, L closed linear subspaces of H such that
dim K < ∞ and dim K < dim L. Prove that L ∩ K ⊥ 6= {0}.

5.5 The dual space H ∗ of a Hilbert space


If H is a Hilbert space, every y ∈ H gives rise to a linear functional on H via ϕy : x 7→ hx, yi.
The Cauchy-Schwarz inequality gives |ϕy (x)| = |hx, yi| ≤ kxkkyk, so that kϕy k ≤ kyk < ∞. For
every y 6= 0 we have ϕy (y) = hy, yi = kyk2 > 0, implying kϕy k ≥ kyk and thus kϕy k = kyk. As a consequence, for
every non-zero x ∈ H there is a ϕ ∈ H ∗ with ϕ(x) 6= 0. Thus H ∗ separates the points of H.
But we have more:

5.34 Theorem ((F.) Riesz-Fréchet representation theorem) If H is a Hilbert space and ϕ ∈ H ∗ then there is a unique y ∈ H such that ϕ = ϕy = h·, yi.
Thus the map H → H ∗ , y 7→ ϕy is an anti-linear isometric bijection.
Proof. If ϕ = 0, put y = 0 (every y 6= 0 gives ϕy 6= 0). Now assume ϕ 6= 0. Let K = ker ϕ = ϕ−1 (0). Then K ⊆ H is a linear subspace that is closed (by continuity of ϕ) and proper (since ϕ 6= 0). By the preceding section, K ⊥ 6= {0}. The dimension of K ⊥ is one. (Either by the algebraic fact that dim(H/K) = codim K = 1 or as follows: If y1 , y2 ∈ K ⊥ \{0} then ϕ(y1 ) 6= 0 6= ϕ(y2 ) implies ϕ(y1 /ϕ(y1 ) − y2 /ϕ(y2 )) = 0 and therefore y1 /ϕ(y1 ) − y2 /ϕ(y2 ) ∈ K ∩ K ⊥ = {0}, thus y1 and y2 are linearly dependent.) Pick a non-zero z ∈ K ⊥ and put y = cz, where c is the complex conjugate of ϕ(z)/kzk2 . Then ϕy vanishes on K, and since the inner product is anti-linear in the second variable, ϕy (z) = hz, czi = (ϕ(z)/kzk2 )hz, zi = ϕ(z). Thus ϕ = ϕy on both K and K ⊥ = Fz, thus on H = K + K ⊥ . Uniqueness of y holds since ϕy = ϕy0 implies ky − y 0 k = kϕy−y0 k = 0. Since the inner product is anti-linear in the second variable, the map y 7→ ϕy is anti-linear, and the final claim follows.

By the above, the map H → H ∗ , y 7→ ϕy is an anti-linear isometric bijection of Banach spaces. If we want, we can use it to put an inner product on H ∗ .
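In finite dimensions the Riesz-Fréchet correspondence can be written down explicitly. A hypothetical check (not from the notes), for H = C3 with hx, yi = Σi xi times the conjugate of yi : the functional ϕ(x) = Σi ai xi is represented by the vector y whose entries are the complex conjugates of the ai :

```python
import numpy as np

a = np.array([1 + 2j, -1j, 0.5])             # phi(x) = a_1 x_1 + a_2 x_2 + a_3 x_3
phi = lambda x: a @ x
y = np.conj(a)                               # candidate Riesz representative

x = np.array([0.3 - 1j, 2.0, 1 + 1j])
assert np.isclose(phi(x), x @ np.conj(y))    # phi(x) = <x, y>
# Cauchy-Schwarz is saturated at x = y, reflecting ||phi|| = ||y||:
assert np.isclose(abs(phi(y)), np.linalg.norm(y) ** 2)
```

The conjugation in y is exactly the anti-linearity of y 7→ ϕy seen in Theorem 5.34.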

5.35 Exercise Let H be a Hilbert space, K ⊆ H a linear subspace (not necessarily closed!) and let ϕ ∈ K ∗ . Prove:

(i) There exists ϕ̂ ∈ H ∗ such that ϕ̂|K = ϕ.
(ii) Uniqueness of the ϕ̂ ∈ H ∗ satisfying ϕ̂|K = ϕ holds if and only if K is dense in H.
(iii) There is a unique ϕ̂ ∈ H ∗ satisfying ϕ̂|K = ϕ and kϕ̂k = kϕk.

5.6 Orthonormal sets and bases


We begin by recalling the notion of bases from linear algebra: A finite subset {x1 , . . . , xn } of a vector space V over the field k is called linearly independent if c1 x1 + · · · + cn xn = 0, where c1 , . . . , cn ∈ k, implies c1 = · · · = cn = 0. (In particular, xi 6= 0 ∀i.) An arbitrary subset
B ⊆ V is called linearly independent if every finite subset S ⊆ B is linearly independent. A
linearly independent subset B ⊆ V is called a (Hamel) basis if every x ∈ V can be written as
a linear combination of finitely many elements of B. This is equivalent to B being maximal,
i.e. non-existence of linearly independent sets B 0 properly containing B. One now proves that
any two bases of V have the same cardinality. All this is known from linear algebra, but the
following possibly not:

5.36 Proposition Every vector space V has a basis.


Proof. If V = {0}, ∅ is a basis. Thus let V be non-zero and let B be the set of linearly independent subsets of V . The set B is partially ordered by inclusion ⊆ and non-empty (since it contains {x} for all 0 6= x ∈ V ). We claim that every chain in (= totally ordered subset of) (B, ⊆) has an upper bound: Just take the union B̂ of all sets in the chain. Since any finite subset of the union over a chain of sets is contained in some element of the chain, every finite subset of B̂ is linearly independent. Thus B̂ is in B and clearly is an upper bound of the chain. Thus the assumption of Zorn's Lemma is satisfied, so that (B, ⊆) has a maximal element M . We claim that M is a basis for V : If this were false, we could find a v ∈ V not contained in the span of M . But then M ∪ {v} would be a linearly independent set strictly larger than M , contradicting the maximality of M .

As soon as we study topological vector spaces, we can also talk about infinite linear com-
binations, which renders the linear algebra notion of basis quite irrelevant (except as a tool in
some proofs). For Hilbert spaces we have the following natural notions:

5.37 Definition Let H be a (pre-)Hilbert space. A subset E ⊆ H is called


• orthogonal if hx, yi = 0 whenever x, y ∈ E, x 6= y,
• orthonormal if it is orthogonal and every x ∈ E is a unit vector, i.e. kxk = 1. (Equivalently,
hx, yi = δx,y ∀x, y ∈ E.)
• orthonormal basis (ONB) if it is orthonormal and maximal. (I.e. there is no orthonormal
set E 0 properly containing E.)

5.38 Exercise Prove that every orthonormal set is linearly independent. What about orthog-
onal sets?

5.39 Lemma Let H be a (pre-)Hilbert space and E ⊆ H an orthonormal set. Then

Σe∈E |hx, ei|2 ≤ kxk2 ∀x ∈ H, (5.8)

which is called the inequality of Bessel.32
Proof. Let E be a finite orthonormal set and x ∈ H. Define y = x − Σe∈E hx, eie. It is straightforward to check that hy, ei = 0 for all e ∈ E, so that E ∪ {y} is an orthogonal set. In view of x = y + Σe∈E hx, eie, Pythagoras' theorem (Lemma 5.20) gives

kxk2 = kyk2 + Σe∈E khx, eiek2 = kyk2 + Σe∈E |hx, ei|2 ,

which in view of kyk2 ≥ 0 implies (5.8) for all finite E. If E is infinite, Σe∈E |hx, ei|2 equals the supremum of Σe∈F |hx, ei|2 ≤ kxk2 over the finite subsets F ⊆ E, which thus also satisfies (5.8).
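Bessel's inequality can be illustrated numerically. In this hypothetical sketch (not part of the notes) E consists of three orthonormal vectors in R6 obtained from a QR decomposition; equality holds for x in the span of E, corresponding to y = 0 in the proof:

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((6, 3)))  # E = three orthonormal vectors in R^6
x = rng.standard_normal(6)
coeffs = Q.T @ x                                  # the numbers <x, e> for e in E
assert coeffs @ coeffs <= x @ x                   # Bessel: sum |<x,e>|^2 <= ||x||^2

x_span = Q @ rng.standard_normal(3)               # for x in span(E) equality holds
c = Q.T @ x_span
assert np.isclose(c @ c, x_span @ x_span)
```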

5.40 Lemma For every orthonormal set E in a (pre-)Hilbert space H there is an orthonormal basis Ê containing E. In particular every Hilbert space admits an ONB.

Proof. The proof is essentially the same as that of Proposition 5.36: Let B be the set of orthonormal sets that contain E, partially ordered by inclusion. Then E ∈ B, thus B 6= ∅. A Zorn's lemma argument gives the existence of a maximal element Ê of the partially ordered set (B, ⊆). Thus Ê is maximal among the orthonormal sets containing E. If there is a unit vector f ∈ Ê ⊥ then Ê ∪ {f } is an orthonormal set containing E strictly larger than Ê, contradicting the maximality of Ê. Thus no such f exists, and Ê is an ONB for H.

The following exercise is motivated by a frequently made mistake: If H is a Hilbert space with ONB E and A ∈ B(H) then supe∈E kAek ≤ kAk, but equality need not hold at all!

5.41 Exercise If H is a Hilbert space, E ⊆ H is an ONB and A ∈ B(H), define kAkE = supe∈E kAek.

(i) Prove that k · kE is a norm on B(H) and that kAkE ≤ kAk ∀A ∈ B(H).
(ii) Show that for every N ∈ N there are H, E, A as above with dim H = N such that kAk ≥ √N kAkE .

(Thus there is no C < ∞ such that kAk ≤ CkAkE for all H, E, A, certainly not C = 1.)
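A concrete instance of the gap in (ii), not part of the notes: for the all-ones N × N matrix A and the standard ONB of RN , every column has norm √N , so kAkE = √N , while kAk = N (A is √N times the rank-one projection onto the all-ones direction). A quick NumPy check:

```python
import numpy as np

N = 16
A = np.ones((N, N))
norm_E = max(np.linalg.norm(A[:, j]) for j in range(N))  # sup of ||A e|| over the standard ONB
op_norm = np.linalg.norm(A, 2)                           # operator (spectral) norm

assert np.isclose(norm_E, np.sqrt(N))   # ||A||_E = sqrt(N)
assert np.isclose(op_norm, N)           # ||A|| = N = sqrt(N) * ||A||_E
```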

5.42 Theorem 33 Let H be a Hilbert space and E an orthonormal set in H. Then the following
are equivalent:
(i) E is an orthonormal basis, i.e. maximal.
(ii a) If x ∈ H and x ⊥ e for all e ∈ E then x = 0.
(ii b) The map H → `2 (E, F), x 7→ {hx, ei}e∈E (well-defined thanks to (5.8)) is injective.
(iii) spanF E is dense in H.
(iv) For every x ∈ H, there are numbers {ae }e∈E in F such that x = Σe∈E ae e.
(v) For every x ∈ H, the equality x = Σe∈E hx, eie holds.
(vi a) For every x ∈ H, we have kxk2 = Σe∈E |hx, ei|2 . (Abstract Parseval34 identity)

32
Friedrich Bessel (1784-1846), German mathematician, now best known for certain differential equations.
33
I dislike the approach of some textbooks that restrict this statement to finite or countably infinite orthonormal
sets, which amounts to assuming H to be separable. I also find it desirable to understand how much of the theorem
survives without completeness since the latter does not hold in situations like Example 5.49. See Remark 5.43.
34
Marc-Antoine Parseval (1755-1836). French mathematician.

(vi b) The map H → `2 (E, F), x 7→ {hx, ei}e∈E is an isometric map of normed spaces, where `2 (E, F) has the k · k2 -norm.
(vii a) For all x, y ∈ H we have hx, yi = Σe∈E hx, eihe, yi (each summand being hx, ei times the complex conjugate of hy, ei).
(vii b) The map H → `2 (E, F), x 7→ {hx, ei}e∈E is an isometric map of pre-Hilbert spaces.
Here all summations over E are in the sense of the unordered summation of Appendix A.1 (with
V = H in (iv),(v) and V = F in (vi a),(vii a)).
Proof. If (ii a) holds then E is maximal, thus (i). If (ii a) is false then there is a non-zero x ∈ H
with x ⊥ e for all e ∈ E. Then E ∪ {x/kxk} is an orthonormal set larger than E, thus E is
not maximal. Thus (i)⇔(ii a). The equivalence (ii a)⇔(ii b) follows from the fact that a linear
map is injective if and only if its kernel is {0}.
(iii)⇒(i) If spanF E is dense in H and x ∈ H satisfies x ⊥ E then also x ⊥ spanF E, and by continuity of the inner product x ⊥ H, thus x = 0. Thus E is maximal and therefore a basis.
(ii a)⇒(iii) The closure K of spanF E in H is a closed linear subspace. If K 6= H then by Theorem 5.27 we can find a non-zero x ∈ K ⊥ . In particular x ⊥ e ∀e ∈ E, contradicting (ii a). Thus K = H.
It should be clear that the statements (vi b) and (vii b) are just high-brow versions of (vi a),
(vii a), respectively, to which they are equivalent. That (vii a) implies (vi a) is seen by taking
x = y. Since Exercise 5.14 gives (vi b)⇒(vii b), we have the mutual equivalence of (vi a), (vi
b), (vii a), (vii b).
(v)⇒(iv) is trivial. If (iv) holds then continuity of the inner product, cf. Remark 5.9.1, implies hx, yi = Σe∈E ae he, yi for all y ∈ H. For y ∈ E, the r.h.s. reduces to ay , so that ae = hx, ei ∀e ∈ E, implying (v).
(iv)⇒(iii): (iv) means that every x ∈ H is a limit of finite linear combinations of the e ∈ E, thus (iii) holds.
(v)⇒(vi a) For finite F ⊆ E we define xF = Σe∈F hx, eie. Pythagoras' theorem gives kxF k2 = Σe∈F |hx, ei|2 . As F % E, the l.h.s. converges to kxk2 by (v) and the r.h.s. to Σe∈E |hx, ei|2 . Thus (vi a) holds.
If (vi a) holds then for each ε > 0 there is a finite F ⊆ E such that Σe∈E\F |hx, ei|2 < ε. Since x − xF is orthogonal to each e ∈ F , we have x − xF ⊥ xF , so that kxk2 = kx − xF k2 + kxF k2 . Combining this with (vi a) and kxF k2 = Σe∈F |hx, ei|2 we find kx − xF k2 = Σe∈E\F |hx, ei|2 < ε. Since ε was arbitrary, this proves that limF %E xF = x, thus (v).


It remains to prove (iii)⇒(v). Density of spanF E in H means that for every x ∈ H and ε > 0 there are a finite subset F ⊆ E and coefficients {ae }e∈F such that kx − Σe∈F ae ek < ε. On the other hand, Theorem 5.27 tells us that for each finite F ⊆ E there is a unique PF (x) ∈ KF = spanF F minimizing kx − PF (x)k. Clearly PF |KF is the identity map and PF is the zero map on KF⊥ . Thus defining PF0 (x) = Σe∈F hx, eie, we have PF0 = PF . Thus kx − Σe∈F hx, eiek ≤ kx − Σe∈F ae ek < ε, and the same bound holds for every finite F 0 ⊇ F since PF (x) ∈ KF 0 . Since ε > 0 was arbitrary, this proves limF %E kx − Σe∈F hx, eiek = 0, which is nothing other than the statement x = Σe∈E hx, eie. Since the finite sums Σe∈F hx, eie are in H, the identity x = Σe∈E hx, eie holds in H for all x ∈ H.
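For a finite-dimensional H the equivalences of the theorem become matrix identities: the columns of any orthogonal matrix form an ONB of R5 , and statements (v) and (vi a) can be checked directly. A hypothetical sketch (not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # columns of Q: an ONB E of R^5
x = rng.standard_normal(5)
c = Q.T @ x                                       # the coefficients <x, e>, e in E

assert np.allclose(Q @ c, x)                      # (v):    x = sum_e <x,e> e
assert np.isclose(c @ c, x @ x)                   # (vi a): Parseval identity
```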
5.43 Remark 1. The identity hx, yi = Σe∈E hx, eihe, yi ∀x, y ∈ H in statement (vii a) is sometimes called 'inserting a partition of unity'. Physicists write 1 = Σe∈E |eihe|.
2. If H is only a pre-Hilbert space, we still have (i)⇔(ii a/b)⇐(iii)⇔(iv)⇔(v)⇔(vi a/b)⇔(vii a/b). This follows from the fact that completeness is not used in proving these equivalences, except in (iii)⇒(v), where it can be avoided by appealing to Remark 5.28.
Furthermore, we have equivalence of (iii), i.e. density of spanF E in H, with density of spanF E in the completion Ĥ and with E being an ONB for Ĥ. The equivalence of the second and third statement comes from (i)⇔(iii), applied to Ĥ. If spanF E is dense in H then it is dense in Ĥ since H is dense in Ĥ. And the converse follows from the general topology fact that the closure in H of some S ⊆ H ⊆ Ĥ equals C ∩ H, where C is the closure of S in Ĥ.
In Example 5.49 below, all statements (i)-(vii) hold despite the incompleteness of H. But
in the absence of completeness the implication (i)⇒(iii) can fail! For a counterexample see
Exercise 5.44. (In view of this, maximal orthonormal sets in pre-Hilbert spaces should not
be called bases.) In [63] it is even proven that a pre-Hilbert space in which every maximal
orthonormal set E has dense span actually is a Hilbert space. Equivalently, in every incomplete
pre-Hilbert space there is a maximal orthonormal set E whose span is non-dense! There even
are pre-Hilbert spaces (called pathological) in which no orthonormal set has dense span!
Actually, most of the non-trivial results, like H ∼= K ⊕ K ⊥ for closed subspaces K and
Theorem 5.34, hold for a pre-Hilbert space if and only if it is a Hilbert space, see [63]. 2

5.44 Exercise (Counterexample) Let H = `2 (N, F) and let f = Σn≥1 δn /n ∈ H (equivalently, f (n) = 1/n). Now K = spanF {f, δ2 , δ3 , . . .} (no closure!) is a pre-Hilbert space. Prove:

(i) E = {δ2 , δ3 , . . .} is a maximal orthonormal set in K.
(ii) f 6∈ spanF E, thus spanF E 6= K (both closures taken in K).

5.45 Theorem ((F.) Riesz-Fischer)35,36 Let H be a pre-Hilbert space and E an orthonormal set such that spanF E is dense in H. Then the following are equivalent:
(i) H is a Hilbert space (thus complete).
(ii) The isometric map H → `2 (E, F), x 7→ {hx, ei}e∈E is surjective. I.e. for every f ∈ `2 (E, F)
there is an x ∈ H such that hx, ei = f (e) for all e ∈ E.
Proof. (ii)⇒(i) We know from (iii)⇒(vii b) in Theorem 5.42 that the map H → `2 (E, F) is an
isometry. If it is surjective then it is an isomorphism of pre-Hilbert spaces. Since `2 (E, F) is
complete by Lemma 4.8, so is H.
(i)⇒(ii) With f ∈ `2 (E, F) we have Σe∈E |f (e)|2 < ∞. This implies that for each ε > 0 there is a finite F ⊆ E such that Σe∈E\F |f (e)|2 < ε. For each finite subset F ⊆ E we define xF = Σe∈F f (e)e. Whenever U, U 0 ⊆ E are finite subsets containing F , the identity xU − xU 0 = Σe∈E (χU (e) − χU 0 (e))f (e)e implies

kxU − xU 0 k2 = Σe∈E |χU (e) − χU 0 (e)|2 |f (e)|2 ≤ ε

since |χU − χU 0 | vanishes on F and is bounded by one on (U ∪ U 0 )\F . Thus {xF }F ⊆E finite is
a Cauchy net in H and therefore convergent to a unique x ∈ H by completeness, cf. Lemma
A.14. By continuity of the inner product, hxF , ei converges to f (e), so that hx, ei = f (e) for all
e ∈ E. 

5.46 Remark If the f : E → F in (ii) satisfies Σe∈E |f (e)| < ∞, thus f ∈ `1 (E, F), then we can simply put x = Σe∈E f (e)e, the convergence following from Proposition 3.15(iii). But for infinite E, we have a strict inclusion `1 (E, F) ⊊ `2 (E, F) (compare Exercise 4.14), so that the above proof is not redundant. Here we again see that infinite-dimensional Hilbert spaces contain series that are unconditionally but not absolutely convergent. 2
35
Ernst Sigismund Fischer (1875-1954). Austrian mathematician. Early pioneer of Hilbert space theory.
36
Also the completeness of L2 (X, A, µ; F) (see Lemma 4.8 for `2 (S)) is sometimes called Riesz-Fischer theorem.

5.47 Proposition For a Hilbert space H, the following are equivalent:
(i) H is separable in the topological sense, i.e. there is a countable dense set S ⊆ H.
(ii) H admits a countable orthonormal basis.
(iii) Every orthonormal basis for H is countable.
Proof. If E ⊆ H is any ONB for H, Theorem 5.45 gives a unitary equivalence H ∼ = `2 (E, F).
2
By Proposition 4.17, ` (E, F) is separable if and only if E is countable. Combining these facts
proves the implications (ii)⇒(i)⇒(iii), while (iii)⇒(ii) is trivial. 

5.48 Remark One can prove that any two ONBs E, E 0 for a Hilbert space H have the same
cardinality, i.e. there is a bijection between E and E 0 , cf. e.g. [30, Proposition I.4.14]. (This
does not follow from the linear algebra proof, since the latter uses a different notion of basis,
the Hamel bases.) The common cardinality of all bases of H is called the dimension of H. 2

5.49 Example Here is an application of Theorem 5.42: Let37

H = {f ∈ C([0, 2π], C) | f (0) = f (2π)}.


One easily checks that hf, gi = (2π)−1 ∫[0,2π] f (x)ḡ(x) dx (Riemann integral) is an inner product, so that (H, h·, ·i) is a pre-Hilbert space. (If a continuous function satisfies ∫ |f (x)|2 dx = 0
then it is identically zero.) For n ∈ Z, let en (x) = einx . It is straightforward to show that
E = {en | n ∈ Z} is an orthonormal set, thus Bessel’s inequality holds. For f ∈ H we have
hf, en i = (2π)−1 ∫[0,2π] f (x)e−inx dx,

which is the n-th Fourier coefficient fb(n) of f , cf. e.g. [157, 83]. In fact, in Fourier analysis one
proves, cf. e.g. [157, Corollary 5.4], that the finite linear combinations of the en (‘trigonometric
polynomials’) are dense in H, which is (iii) of Theorem 5.42. Thus all other statements in the
theorem also hold. The weaker statement (ii a) is also well-known in Fourier analysis, cf. [157,
Corollary 5.3]. Furthermore,
(2π)−1 ∫[0,2π] |f (x)|2 dx = kf k2 = Σn∈Z |hf, en i|2 = Σn∈Z |fb(n)|2 .

This is the original Parseval formula, cf. e.g. [157, Chapter 3, Theorem 1.3]. Note that H is
not complete. Measure theory tells us that this completion is L2 ([0, 2π], λ; C), the measure
being Lebesgue measure λ (defined on the σ-algebra of Borel sets). Now the map L2 ([0, 2π]) →
`2 (Z, C), f 7→ fb is an isomorphism of Hilbert spaces. This nice situation shows that the Lebesgue
integral is much more appropriate for the purposes of Fourier analysis than the Riemann integral
(as for most other purposes).
Note: If we consider L2 ([0, 2π], λ; R) instead, we must replace the basis E by ER = {cos nx | n ∈
N0 } ∪ {sin nx | n ∈ N}. One easily checks that spanC E = spanC ER .
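The Parseval formula of this example can be tested numerically by approximating the integrals by Riemann sums. The sketch below (not part of the notes; the function f (x) = x(2π − x) and all cutoffs are arbitrary illustrative choices) compares both sides up to truncation and discretization errors:

```python
import numpy as np

M = 4000                                   # number of Riemann-sum sample points
x = np.linspace(0, 2 * np.pi, M, endpoint=False)
f = x * (2 * np.pi - x)                    # continuous with f(0) = f(2*pi)

fhat = lambda n: np.mean(f * np.exp(-1j * n * x))  # approximates (2*pi)^{-1} integral
lhs = np.mean(np.abs(f) ** 2)                      # (2*pi)^{-1} int |f|^2
rhs = sum(abs(fhat(n)) ** 2 for n in range(-200, 201))
assert abs(lhs - rhs) / lhs < 1e-3                 # Parseval, up to small errors
```

For this f the coefficients decay like 1/n2 , so truncating the sum at |n| = 200 loses very little mass.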

5.50 Exercise Prove that the pre-Hilbert space H = C([0, 1]) with inner product hf, gi = ∫[0,1] f (t)ḡ(t) dt is not complete.
37
The set of continuous 2π-periodic functions can be identified with C(S 1 , C) via z = eix . We write f (x) when we
consider f as a function on [0, 2π] (or R) and f (z) if f is understood as a function on S 1 = {z ∈ C | |z| = 1}.

5.7 Tensor products of Hilbert spaces
In this optional section, referenced only in Section 12.4 but important well beyond that, you are
assumed to know38 the notion of (algebraic) tensor product V ⊗k W of two vector spaces V, W
over a field k. (In two sentences: V ⊗k W is the free abelian group spanned by the pairs (v, w) ∈
V ×W , divided by the subgroup generated by all elements of the form (v +v 0 , w)−(v, w)−(v 0 , w)
and (v, w + w0 ) − (v, w) − (v, w0 ) and (cv, w) − (v, cw), where v, v 0 ∈ V, w, w0 ∈ W, c ∈ k, the
quotient being a k-vector space in the obvious way. If v ∈ V, w ∈ W then the equivalence class
[(v, w)] is denoted v ⊗ w.)
The crucial property is that given a bilinear map α : V × W → Z (where V × W is the
Cartesian product) there is a unique linear map β : V ⊗k W → Z such that β(v ⊗w) = α((v, w)).

5.51 Lemma Let (H, h·, ·iH ), (H 0 , h·, ·iH 0 ) be pre-Hilbert spaces over F ∈ {R, C}. Then there is
a unique inner product h·, ·iZ on Z = H ⊗F H 0 such that hv ⊗ w, v 0 ⊗ w0 i = hv, v 0 iH hw, w0 iH 0 .
Proof. Every element z ∈ Z = H ⊗F H 0 has a representation as a finite sum z = Σk vk ⊗ wk . Given another z 0 = Σl vl0 ⊗ wl0 ∈ H ⊗F H 0 , we must define

hz, z 0 iZ = Σk Σl hvk , vl0 iH hwk , wl0 iH 0 .

Since an element z ∈ Z can have many representations of the form z = Σk vk ⊗ wk , we must show that this is well-defined. Let thus Σk vk ⊗ wk = Σk̃ ṽk̃ ⊗ w̃k̃ . Now, for fixed l the map H × H 0 → F, (x, y) 7→ hx, vl0 iH hy, wl0 iH 0 clearly is bilinear, thus it gives rise to a unique linear map H ⊗F H 0 → F. This implies

Σk Σl hvk , vl0 iH hwk , wl0 iH 0 = Σk̃ Σl hṽk̃ , vl0 iH hw̃k̃ , wl0 iH 0 .

The independence of hz, z 0 iZ of the representation of z 0 is shown in the same way.
It is quite clear that h·, ·iZ is sesquilinear and that hz 0 , ziZ is the complex conjugate of hz, z 0 iZ .
In order to study hz, ziZ we may assume that z = Σk vk ⊗ wk , where the wk are non-zero and mutually orthogonal (apply Gram-Schmidt to the wk and adapt the vk accordingly). This leads to

hz, ziZ = Σk hvk , vk iH hwk , wk iH 0 = Σk kvk k2 kwk k2 ≥ 0

and hz, ziZ = 0 ⇒ vk = 0 ∀k ⇒ z = 0.

5.52 Definition If H, H 0 are Hilbert spaces then H ⊗ H 0 is the Hilbert space obtained by
completing the above pre-Hilbert space (Z, h·, ·iZ ).

5.53 Remark 1. We usually write the completed tensor products ⊗ without subscript to
distinguish them from the algebraic ones.
2. If E, E 0 are ONBs in the Hilbert spaces H, H 0 , respectively, then it is immediate that {e ⊗ e0 | e ∈ E, e0 ∈ E 0 } is an orthonormal set in the algebraic tensor product H ⊗F H 0 , thus also in H ⊗ H 0 . In fact its span is dense in H ⊗ H 0 , so that it is an ONB.
38
Unfortunately, this is often omitted from undergraduate linear algebra teaching. E.g., it does not appear in [55]
despite the book’s > 500 pages. See however [84, 95] which, admittedly, are aiming higher.

This leads to a pedestrian way of defining the tensor product H ⊗ H 0 of Hilbert spaces over
F: Pick ONBs E ⊆ H, E 0 ⊆ H 0 and define H ⊗ H 0 = `2 (E × E 0 , F). By Remark 5.48, the
outcome is independent of the chosen bases up to isomorphism. If x ∈ H, x0 ∈ H 0 then the map
E ×E 0 → F, (e, e0 ) 7→ hx, eiH hx0 , e0 iH 0 is in `2 (E ×E 0 , F), thus defines an element x⊗x0 ∈ H ⊗H 0 .
This map H × H 0 → H ⊗ H 0 is bilinear. But this definition is very ugly and unconceptual due
to its reliance on a choice of bases. 2
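For finite-dimensional spaces the tensor product is realized concretely by the Kronecker product, and the defining identity of Lemma 5.51 can be checked directly. A hypothetical NumPy sketch (not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
v, vp = rng.standard_normal(3), rng.standard_normal(3)
w, wp = rng.standard_normal(4), rng.standard_normal(4)

lhs = np.kron(v, w) @ np.kron(vp, wp)   # <v (x) w, v' (x) w'> computed in R^12
rhs = (v @ vp) * (w @ wp)               # <v, v'> <w, w'>
assert np.isclose(lhs, rhs)
```

Here np.kron(v, w) lists the products v_i w_j in the lexicographic order of the product basis, which is exactly the `2 (E × E 0 ) picture of Remark 5.53.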

6 Subspaces and quotient spaces of Banach spaces


6.1 Quotient spaces of Banach spaces
If H is a Hilbert space and K ⊆ H a closed subspace, we have seen in Exercise 5.32 that the
orthogonal complement K ⊥ is isomorphic to the quotient space H/K. For this reason, quotient
spaces of Hilbert spaces are hardly ever used: We can work with the easily defined orthogonal
complements instead. In a Banach space context, we have no notion of orthogonality and
therefore no orthogonal complements. This is the reason why quotient spaces of Banach spaces
are important, and we will now study them in some detail. (In fact, there is more trouble with
general Banach spaces, to be discussed in Section 6.2.)

6.1 Proposition If V is a normed space, W ⊆ V a linear subspace and V /W denotes the quotient vector space, we define k · k0 : V /W → [0, ∞) by kv + W k0 = inf w∈W kv − wk. Then
(i) k · k0 is a seminorm on V /W , and the quotient map Q : V → V /W satisfies kQk ≤ 1.
(ii) k · k0 is a norm if and only if W ⊆ V is closed.
(iii) If W ⊆ V is closed, the topology on V /W induced by k · k0 coincides with the quotient
topology, and the quotient map p : V → V /W is open.
(iv) If V is a Banach space and W ⊆ V is closed then (V /W, k · k0 ) is a Banach space.
(v) If V is a normed space with closed subspace W and T ∈ B(V, E), where E is a normed space and W ⊆ ker T , then there is a unique T 0 ∈ B(V /W, E) such that T 0 Q = T .
Furthermore, kT 0 k = kT k. T 0 is surjective if and only if T is surjective. T 0 is injective if
and only if W = ker T .
(vi) If A is a normed (resp. Banach) algebra and I ⊆ A is a closed two-sided ideal, then A/I
is a normed (resp. Banach) algebra.
Proof. (i) It is clear that k0k0 = 0 (where we denote the zero element of V /W by 0 rather than
W ). For x ∈ V, c ∈ F\{0} we have

kc(x + W )k0 = kcx + W k0 = inf kcx − wk = |c| inf kx − w/ck = |c| inf kx − wk = |c|kxk0 ,
w∈W w∈W w∈W

where we used that W → W, w 7→ cw is a bijection. Now let x1 , x2 ∈ V and ε > 0. Then there
are w1 , w2 ∈ W such that kxi − wi k < kxi + W k0 + ε/2 for i = 1, 2. Then

kx1 + x2 + W k0 = inf w∈W kx1 + x2 − wk ≤ k(x1 − w1 ) + (x2 − w2 )k ≤ kx1 − w1 k + kx2 − w2 k < kx1 + W k0 + kx2 + W k0 + ε.

Since ε > 0 was arbitrary, we have kx1 +x2 +W k0 ≤ kx1 +W k0 +kx2 +W k0 , proving subadditivity
of k · k0 . In view of 0 ∈ W it is immediate that kv + W k0 = inf w∈W kv − wk ≤ kvk.

(ii) If v ∈ V , the definition of k · k0 readily implies that kv + W k0 = 0 if and only if v lies in the closure of W . Thus if W is closed then w = v + W ∈ V /W has kwk0 = 0 only if w is the zero element of V /W . And if W is not closed then every v in the closure of W with v 6∈ W satisfies kv + W k0 = 0 even though v + W ∈ V /W is non-zero. Thus k · k0 is not a norm.
(iii) Continuity of Q : (V, k · k) → (V /W, k · k0 ) follows from kQk ≤ 1, see (i). Since Q is
norm-decreasing, we have Q(B V (0, r)) ⊆ B V /W (0, r) for each r > 0. And if y ∈ V /W with
kyk < r then there is an x ∈ V with Q(x) = y and kxk < r (but typically larger than kyk).
Thus Q maps B V (0, r) onto B V /W (0, r) for each r. Similarly, Q(B V (x, r)) = B V /W (Q(x), r),
and from this it is easily deduced that Q(U ) ⊆ V /W is open for each open U ⊆ V . Thus Q is
open (w.r.t. the norm topologies on V, V /W ), which implies (cf. [108, Lemma 6.4.5]) that Q is
a quotient map, thus the topology on V /W coming from k · k0 is the quotient topology.
(iv) Let {yn } ⊆ V /W be a Cauchy sequence. Then we can pass to a subsequence wn = yin
such that kwn −wn+1 k < 2−n . Pick xn ∈ V such that Q(xn ) = wn and kxn −xn+1 k < 2−n . (Why
can this be done?) Then {xn } is a Cauchy sequence converging to some x ∈ V by completeness
of V . With y = Q(x) we have kwn − yk ≤ kxn − xk → 0, thus wn → y; since the Cauchy sequence {yn } has the convergent subsequence {wn }, also yn → y, and V /W is complete.
(v) If y ∈ V /W , and x, x0 ∈ V satisfy Q(x) = Q(x0 ) = y then Q(x − x0 ) = 0, thus
x − x0 ∈ ker Q = W ⊆ ker T , implying T x = T x0 . Thus putting T 0 y = T x gives rise to a
well-defined map T 0 : V /W → E satisfying T 0 Q = T . One easily checks that T 0 is linear. And
using Q(B V (0, 1)) = B V /W (0, 1) from the proof of (iii) we have

kT 0 k = sup{kT 0 yk | y ∈ B V /W (0, 1)} = sup{kT 0 Q(x)k | x ∈ B V (0, 1)} = sup{kT xk | x ∈ B V (0, 1)} = kT k.

The statements concerning injectivity and surjectivity of T 0 are pure algebra, but for complete-
ness we give proofs: The statement about surjectivity follows from T = T 0 Q together with
surjectivity of Q, which gives T (V ) = T 0 (V /W ). If W $ ker T , pick x ∈ (ker T )\W and put
y = Q(x). Then y 6= 0, but T 0 y = T 0 Qx = T x = 0, so that T 0 is not injective. Now assume
W = ker T . If y ∈ ker T 0 then pick x ∈ V with y = Q(x). Then T x = T 0 Qx = T 0 y = 0, thus
x ∈ ker T = W , so that y = Q(x) = 0, proving injectivity of T 0 .
(vi) It is known from algebra that A/I is again an algebra. By the above, it is a normed
(resp. Banach) space. It remains to prove that the quotient norm on A/I is submultiplicative.
Let c, d ∈ A/I and ε > 0. Then there are a, b ∈ A with Q(a) = c, Q(b) = d, kak < kck+ε, kbk <
kdk + ε (see the exercise below). Then kcdk = kQ(ab)k ≤ kabk ≤ kakkbk < (kck + ε)(kdk + ε),
and since this holds for all ε > 0, we have kcdk ≤ kckkdk. 
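In a finite-dimensional (Hilbert) setting the quotient norm of Proposition 6.1 is computable: kv + W k0 is the distance from v to W , i.e. the norm of the component of v orthogonal to W . The following sketch (not part of the notes; all concrete choices illustrative) compares this with a least-squares computation of the infimum:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 2))                 # columns span the closed subspace W of R^4
Q, _ = np.linalg.qr(B)
v = rng.standard_normal(4)

quot_norm = np.linalg.norm(v - Q @ (Q.T @ v))   # ||v + W||' = ||(1 - P_W) v||

t, *_ = np.linalg.lstsq(B, v, rcond=None)       # least squares: minimizes ||v - B t||
assert np.isclose(quot_norm, np.linalg.norm(v - B @ t))
assert quot_norm <= np.linalg.norm(v) + 1e-12   # consistent with ||Q|| <= 1 in (i)
```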

6.2 Exercise (i) If V is a normed space and W ⊆ V is a closed subspace, prove that for
every y ∈ V /W and every ε > 0 there is an x ∈ V with Q(x) = y and kxk ≤ kyk + ε.
(ii) Give an example of a normed space V , a closed subspace W and y ∈ V /W for which no
x ∈ V with y = Q(x), kxk = kyk exists.
We have seen that if V is Banach and W ⊆ V is closed then W and V /W (with their
inherited and quotient norms, respectively) are complete. The converse is also true:

6.3 Proposition Let V be a normed space and W ⊆ V a closed subspace. Then


(i) If W and V /W are complete then so is V .
(ii) If W is complete and has finite codimension then V is complete. If W is finite-dimensional
and V /W is complete then V is complete.

Proof. (i) Let {vn } ⊆ V be a Cauchy sequence. Since Q : V → V /W is bounded, the sequence {zn = Q(vn )} ⊆ V /W is Cauchy, so that by completeness of V /W it converges to some z ∈ V /W . Pick ẑ ∈ V with Q(ẑ) = z. By Exercise 6.2(i), for every n ∈ N we can find a yn ∈ V such that Q(yn ) = z − zn = Q(ẑ − vn ) and kyn k ≤ kz − zn k + 2−n . With zn → z this implies yn → 0. With Q(yn + vn − ẑ) = 0, we have yn + vn − ẑ ∈ ker Q = W ∀n. Since {vn } and {yn } are Cauchy, so is {yn + vn − ẑ}, so that by completeness of W we have yn + vn − ẑ → w for some w ∈ W . In view of yn → 0 this implies vn → w + ẑ ∈ V . Thus V is complete.
(ii) Recalling that finite codimensionality of W means dim(V /W ) < ∞, both statements are
immediate by (i) and the completeness of finite-dimensional normed spaces (Exercise 2.33). 

6.4 Exercise Use the quotient space construction of Banach spaces to


(i) give a new proof for the difficult (⇐) part of Exercise 3.9.
(ii) prove that every non-zero ϕ ∈ V ∗ is an open map. (To be vastly generalized later.)

6.5 Exercise If V is a Banach space and W ⊆ V a closed subspace, prove that V is separable if
and only if W and V /W are separable.
If V is a Banach space and W ⊆ V a closed linear subspace, it is natural to ask how the
dual spaces W ∗ and (V /W )∗ are related to V ∗ . This leads to the following definitions, which
are closely related to the Hilbert space ⊥, but not the same:

6.6 Definition Let V be a normed space.


• For any W ⊆ V , the annihilator of W is W ⊥ = {ϕ ∈ V ∗ | ϕ(x) = 0 ∀x ∈ W } ⊆ V ∗ .
• For any Φ ⊆ V ∗ , the annihilator of Φ is Φ> = {x ∈ V | ϕ(x) = 0 ∀ϕ ∈ Φ} ⊆ V .
(Some authors write ⊥ Φ instead of Φ> .)
One easily checks for any W ⊆ V that W ⊥ ⊆ V ∗ is a closed linear subspace and that W ⊥ equals the annihilator of the closed linear span of W . (Similar statements hold for Φ> .) We trivially have W ⊆ (W ⊥ )> and Φ ⊆ (Φ> )⊥ for all W ⊆ V, Φ ⊆ V ∗ , and {0}⊥ = V ∗ , while density of W in V implies W ⊥ = {0}. (For the less trivial converses see Exercise 9.14.) It is easy to see that Φ ⊆ {0} ⇔ Φ> = V . Answering the question when Φ> = {0} requires more preparations, cf. Theorem 10.32(ii).

6.7 Exercise Let V be a Banach space and W ⊆ V a closed subspace. Let Q : V → V /W be the quotient map. Prove that the map α : (V /W )∗ → V ∗ , ψ 7→ ψQ is injective and isometric and that its image is W ⊥ ⊆ V ∗ . Thus (V /W )∗ ∼= W ⊥ as Banach spaces.
Under the same assumptions, W ∗ ∼= V ∗ /W ⊥ , but this has to await Exercise 9.15.

If V is a normed space and W, Z ⊆ V are closed linear subspaces, we will later see that
W + Z = {w + z | w ∈ W, z ∈ Z} ⊆ V can fail to be closed. If W and Z are both finite-
dimensional then W + Z is finite-dimensional, thus closed. More generally:

6.8 Exercise Let V be a normed space, W ⊆ V a closed subspace and Z ⊆ V a finite-


dimensional subspace. Prove that W + Z ⊆ V is closed. Hint: Use V /W .
For more on closedness of sums of closed subspaces see Exercises 7.13 and 7.44.

6.2 Complemented subspaces
If V is a vector space over any field K and W ⊆ V is a linear subspace, it is known from linear
algebra that we can find another subspace Z ⊆ V such that V = W + Z and W ∩ Z = {0}.
The proof is easy: Pick a (Hamel) basis E for W , extend it to a basis E 0 ⊇ E of V and put
Z = spanF (E 0 \E). Such a Z is called an (algebraic) complement for W . Since every x ∈ V can
be written as x = w + z with w ∈ W, z ∈ Z in a unique way (w + z = w 0 + z 0 ⇒ w − w 0 =
z 0 − z ∈ W ∩ Z = {0}), one says V is the internal direct sum of W and Z, or V ∼= W ⊕ Z.
If V is a Banach space and W ⊆ V a closed subspace, it is natural to ask for a complementary
subspace Z with the above properties to be closed, too. We have seen that for every closed
subspace K of a Hilbert space H there is a closed complement, namely the orthogonal comple-
ment K ⊥ ⊆ H. Since K ⊥ is defined in terms of the inner product, it is not surprising that the
situation will turn out more complicated for general Banach spaces, where no inner product
is around. (Simply passing from an algebraic complement Z to its closure is no solution since
there is no reason for W ∩ Z = {0} to hold.) This leads us to define:

6.9 Definition Let V be a topological vector space. A closed subspace W ⊆ V is called


complemented if there is a closed subspace Z ⊆ V such that V = W + Z and W ∩ Z = {0}.

6.10 Remark 1. In Exercise 7.15 we will prove that if V is a Banach space and W, Z ⊆ V are
complementary closed subspaces, the linear isomorphism V ' W ⊕ Z also is a homeomorphism,
thus an isomorphism of Banach spaces.
2. By the comments preceding the definition, every closed subspace of a Hilbert space is
complemented, and the same holds for Banach spaces isomorphic to a Hilbert space.
3. But ‘most’ infinite-dimensional Banach spaces have uncomplemented closed subspaces!
The simplest example of an uncomplemented closed subspace probably is c0 (N, R) ⊆ `∞ (N, R).
For a proof, not entirely trivial, see Appendix B.3.3.
Another easily stated example is given by X = C(S 1 , C) with norm k · k∞ and the closed
subspace Y = {f ∈ X | fb(n) = 0 ∀n < 0}, where fb(n) = ∫_0^1 f (e^{2πit} ) e^{−2πint} dt, n ∈ Z,
are the Fourier coefficients. Proving that Y ⊆ X is not complemented, see e.g. [76, p. 163-4],
boils down to rather classical analysis, namely the fact that ∑_{n=1}^N sin(nx)/n is bounded
uniformly in N ∈ N and x ∈ R, while ∑_{n=1}^N e^{inx} /n is not, combined with Exercise 7.15.
4. In fact, by a remarkable theorem of Lindenstrauss and Tzafriri39 (1971), cf. e.g. [1,
Theorem 13.4.5], [97, Vol. 2, Chap. 1, Theorem V.1], every Banach space all closed subspaces
of which are complemented is isomorphic to a Hilbert space. Cf. also Remark 9.13. 2
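The classical facts cited in the remark above can at least be sanity-checked numerically (this is only an illustration with numpy, not a proof, and no substitute for the references): the sine partial sums stay bounded (around the Gibbs constant ∫_0^π (sin t)/t dt ≈ 1.852) uniformly in N, while sup_x |∑_{n=1}^N e^{inx}/n| equals the harmonic number H_N (attained at x = 0) and so diverges like log N.

```python
import numpy as np

def sine_partial_sup(N, grid=8000):
    """Grid approximation of sup over x of |sum_{n=1}^N sin(nx)/n|."""
    x = np.linspace(0.0, 2.0 * np.pi, grid)
    n = np.arange(1, N + 1)
    return np.abs((np.sin(np.outer(x, n)) / n).sum(axis=1)).max()

def complex_partial_sup(N):
    """sup over x of |sum_{n=1}^N e^{inx}/n| = H_N, attained at x = 0."""
    return sum(1.0 / n for n in range(1, N + 1))

for N in (10, 100, 300):
    print(N, round(sine_partial_sup(N), 3), round(complex_partial_sup(N), 3))
# The sine suprema stay bounded (below ~1.86) for every N,
# while the complex ones grow without bound.
```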

On the positive side, we have some results and exercises:

6.11 Proposition Let V be a normed space and W ⊆ V a linear subspace.


(i) If W is closed and has finite codimension (i.e. dim(V /W ) < ∞) then it is complemented.
(ii) If W is finite-dimensional then it is complemented.
Proof. (i) Finite-dimensionality of V /W is equivalent to the existence of a finite-dimensional
algebraic complement Z ⊆ V for W . (This is just linear algebra, but here is a proof: Let
{e1 , . . . , en } be a basis for V /W . By surjectivity of p : V → V /W we can find x1 , . . . , xn ∈ V
such that p(xi ) = ei . It is easy to see that {x1 , . . . , xn } is linearly independent. Let Z =
spanF {x1 , . . . , xn }. Now it is straightforward that Z is a complement for W .) Being finite-
dimensional, it is automatically closed by Exercise 3.22.
(ii) The proof will be given in Section 9.3 since it requires tools still to be developed. 

39
Joram Lindenstrauss (1936-2012). Lior Tzafriri (1936-2008). Israeli mathematicians with many contributions to
Banach space theory.

6.12 Exercise It is not true that every subspace W ⊆ V with dim(V /W ) < ∞ of a Banach
space V is closed! Find a counterexample! (Hint: try codimension one.)

6.13 Exercise Let V be a normed space and P ∈ B(V ) satisfying P 2 = P (i.e. idempotent).
Prove that W = P V is a complemented subspace. (For a converse see Exercise 7.15.)
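In finite dimensions the situation of Exercise 6.13 is easy to visualize: an idempotent matrix splits the space into its range and kernel, and the projection may well be oblique (non-orthogonal). A small sketch with numpy; the matrix P below is a hypothetical example chosen for illustration:

```python
import numpy as np

# An oblique (non-orthogonal) projection onto span{(1,0)} along span{(1,1)}:
P = np.array([[1.0, -1.0],
              [0.0,  0.0]])

assert np.allclose(P @ P, P)  # idempotent: P^2 = P

x = np.array([3.0, 5.0])
w = P @ x          # component in W = P V
z = x - w          # component in Z = (1 - P)V
assert np.allclose(w + z, x)   # x = w + z
assert np.allclose(P @ z, 0)   # z lies in ker P, so W ∩ Z = {0}
print(w, z)  # → [-2.  0.] [5. 5.]
```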

6.14 Exercise Let V = C([0, 2], R) with the k · k∞ -norm. Let W = {f ∈ V | f|(1,2] = 0}.
(i) Prove that W is complemented.
(ii) Can you ‘classify’ all possible complements, i.e. put them in bijection with a simpler set?
For more on the subject of complemented subspaces see [102].

In the process of returning from Hilbert to the more general Banach spaces, the above
discussion of quotient spaces and complements was the easiest part. The question of bases is
much more involved for Banach spaces, as the very extensive two-volume treatment [153] of
the subject attests. (Then again, the basics are quite accessible40 , cf. e.g. [102, 26, 1, 74], but
unfortunately we don’t have the time.) The same is true for the formidable subject of tensor
products of Banach spaces, see e.g. [144]. Going into that would be pointless given that we
already slighted the much simpler tensor products of Hilbert spaces.
A more tractable issue is that, in the absence of an inner product, the existence of
non-zero bounded linear functionals is rather non-trivial and can in general only be proven non-
constructively. We will do this in Section 9. (Of course, for spaces that are given very explicitly
like `p (S, F), we may well have more concrete descriptions of the dual spaces as in Section 4.5.)
But first we will prove two major theorems that are non-trivial even when restricted to Hilbert
spaces. Both of them use Baire’s theorem on complete metric spaces, cf. Appendix A.5 which
should perhaps be read first.

7 Open Mapping Theorem and relatives


7.1 Open Mapping Theorem. Bounded Inverse Theorem
We begin with a few very easy equivalences:

7.1 Exercise Let E, F be normed spaces and T ∈ B(E, F ). Consider the statements
(i) T is open (i.e. T U ⊆ F is open for each open U ⊆ E).
(ii) For every α > 0 there exists β > 0 such that B F (0, β) ⊆ T B E (0, α).
(iii) There exist α, β > 0 such that B F (0, β) ⊆ T B E (0, α).
(iv) There is a C > 0 such that for every y ∈ F there exists x ∈ E with T x = y and kxk ≤ Ckyk.
(This is a more quantitative or ‘controlled’ version of surjectivity.)
40
A Schauder basis for a Banach space V is a sequence {en }n∈N such that for every x ∈ V there are unique
cn ∈ F such that x = ∑_{n=1}^∞ cn en in the sense of (possibly conditional) convergence of series. One then
proves that there are continuous linear functionals ϕn such that x = ∑_n ϕn (x)en . Existence of a Schauder
basis clearly implies separability of V , but while most ‘natural’ separable Banach spaces have Schauder bases,
counterexamples exist!

(v) T is surjective.
Obviously (iv)⇒(v). Prove the easy equivalences (i)⇔(ii)⇔(iii)⇔(iv).
Remarkably, for Banach spaces also (v) is equivalent to (i)-(iv):

7.2 Theorem (Open Mapping Theorem (OMT), Schauder 1930) 41 If E, F are Banach
spaces then every surjective T ∈ B(E, F ) is open (and (iv) holds, which is often useful).
Most proofs of this theorem are not very transparent. We follow the slightly better approach
of [61], which makes the proof of Proposition 7.4 more palatable by isolating a lemma that also
has other applications42 . It deduces statement (iv) in Exercise 7.1 from completeness of E and
an approximate form of surjectivity of T ∈ B(E, F ):

7.3 Lemma Let E be a Banach space, F a normed space and T ∈ B(E, F ). Assume also that
there are m > 0 and r ∈ (0, 1) such that for every y ∈ F there is an x0 ∈ E with kx0 k ≤ mkyk
and ky − T x0 k ≤ rkyk. Then for every y ∈ F there is an x ∈ E such that kxk ≤ (m/(1 − r))kyk
and T x = y. In particular, T is surjective.
Proof. By linearity we may assume kyk = 1. By assumption, there is x0 ∈ E such that kx0 k ≤ m
and ky − T x0 k ≤ r. Putting y1 = y − T x0 we have ky1 k ≤ r, and applying the hypothesis to y1 ,
we find an x1 ∈ E with kx1 k ≤ mky1 k ≤ rm and ky − T (x0 + x1 )k = ky1 − T x1 k ≤ rky1 k ≤ r^2 .
Continuing this inductively43 we obtain a sequence {xn } ⊆ E such that for all n ∈ N we have

kxn k ≤ r^n m,   (7.1)
ky − T (x0 + x1 + · · · + xn )k ≤ r^{n+1} .   (7.2)
Now, (7.1) together with completeness of E implies, cf. Proposition 3.15, that ∑_{n=0}^∞ xn
converges to an x ∈ E with

kxk ≤ ∑_{n=0}^∞ kxn k ≤ ∑_{n=0}^∞ r^n m = m/(1 − r),

and taking n → ∞ in (7.2) gives y = T x. Not assuming kyk = 1 gives an extra factor kyk. 
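The iteration in this proof can actually be run. Below is a sketch (with a hypothetical 2×2 matrix T and a deliberately crude 'approximate solver' obtained by inverting a perturbation of T, numpy assumed): each pass solves for the current residual only approximately, and by the geometric decay the accumulated corrections converge to an exact solution.

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
M_inv = np.linalg.inv(T + 0.1 * np.eye(2))  # crude approximate inverse of T

def solve_by_correction(y, steps=60):
    """Successive approximation as in Lemma 7.3: x = x0 + x1 + ..."""
    x = np.zeros_like(y)
    residual = y.copy()           # y_k = y - T(x0 + ... + x_{k-1})
    for _ in range(steps):
        xk = M_inv @ residual     # approximate solution of T x_k = y_k
        x += xk
        residual = y - T @ x      # shrinks like r^k for some r < 1
    return x

y = np.array([1.0, 1.0])
x = solve_by_correction(y)
print(np.linalg.norm(T @ x - y))  # tiny: the corrections summed to a true solution
```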

7.4 Proposition If E is a Banach space, F a normed space, T ∈ B(E, F ) and there are
α, β > 0 such that B F (0, β) ⊆ cl(T B E (0, α)) (cl denoting the closure) then the statements
(i)-(iv) from Exercise 7.1 hold.
Proof. The hypothesis clearly implies B̄ F (0, β) ⊆ cl(T B E (0, α)), where B̄ denotes a closed
ball. By linearity of T and the fact that multiplication with a non-zero scalar is a
homeomorphism, thus commutes with the closure, this is equivalent to B̄ F (0, 1) ⊆ cl(T B E (0, γ)),44
where γ = α/β. Thus for each y ∈ F, kyk ≤ 1, ε > 0 there is x ∈ E, kxk < γ such that
kT x − yk < ε. This, in turn, is equivalent to ∀y ∈ F, ε > 0 ∃x ∈ E : kxk < γkyk, kT x − yk < εkyk.
(Why?) Since this precisely is the hypothesis of Lemma 7.3, we can conclude that every y ∈ F
equals T x for an x ∈ E with kxk ≤ (γ/(1 − ε))kyk. This is statement (iv) in Exercise 7.1. 

41
Juliusz Schauder (1899-1943). Polish mathematician. Born in Lwow (then Austria-Hungary, now Ukraine). Killed
by the Gestapo.
42
It leads to a quick proof of the Tietze extension theorem in topology, see Appendix A.6.1.
43
Here we are using the axiom DCω of countable dependent choice, cf. Appendix A.4.
44
In a normed space E we have cl B(x, r) = B̄(x, r) := {y ∈ E | d(x, y) ≤ r}, but in a general metric space
the inclusion ⊇ may fail!

Proof of Theorem 7.2. Since T is surjective, we have

F = T E = ∪_{n=1}^∞ T B E (0, n).

Since F is a complete metric space and has non-empty interior F ◦ = F ≠ ∅, Corollary A.24 of
Baire’s theorem implies that at least one of the closed sets cl(T B E (0, n)) has non-empty interior.
Thus there are n ∈ N, y ∈ F, ε > 0 such that B F (y, ε) ⊆ cl(T B E (0, n)). If x ∈ B F (0, ε) then
2x = (y + x) − (y − x), thus 2B F (0, ε) ⊆ B F (y, ε) − B F (y, ε) and thus

B F (0, ε) ⊆ (1/2)(B F (y, ε) − B F (y, ε)) ⊆ (1/2)(cl(T B E (0, n)) − cl(T B E (0, n))) ⊆ cl(T B E (0, n)).

Thus the hypothesis of Proposition 7.4 is satisfied with α = n, β = ε, and we are done. 

7.5 Exercise Let E be a Banach space, F a normed space and A ∈ B(E, F ). Prove: If A is
open then F is complete, thus Banach.
Hint: Combine Exercise 7.1 and the method of proof of Proposition 6.1(iv).

7.6 Remark 1. The preceding exercise shows that A ∈ B(E, F ) is never open if E is a Banach
space and F an incomplete normed space! On the other hand, if E is incomplete, openness of
A ∈ B(E, F ) can fail even if A is bijective. See Exercise 7.23 below.
2. The proof of the OMT involved two applications of DCω , in the proof of the approximation
Lemma 7.3 and via Baire’s theorem. It actually is possible to replace the invocation of Baire’s theorem
by the (weak) uniform boundedness theorem, see [46]. This is interesting since the latter can be proven
using only the axiom of countable choice, cf. Appendix B.4. But replacing DCω by ACω in proving the
approximation Lemma 7.3 seems impossible.
3. Recent claims by various authors (Kesavan, Liebaug and Spindler, Velasco, . . . ) that OMT can be
deduced from the Uniform Boundedness Theorem (next section) do not make much sense. From a logical
point of view, ‘A implies B’ is true for every true statement B: Just ignore A and prove B from scratch.
In order for ‘A implies B’ to have meaning, we must restrict the tools allowed in proving the implication,
as in Theorem B.43, where no use of choice axioms is allowed. The proposed deduction of OMT from
UBT is not of this type since it uses DCω , while DCω is sufficient to prove OMT without invoking UBT!
(There is another non-rigorous but very common use of ‘A implies B’ in the sense of ‘deducing B from
A is much simpler than proving B without A’. As in: The non-existence of a retraction from a ball to
its boundary implies the Brouwer fixed point theorem. But also this does not apply here.) 2

7.7 Corollary (Bounded Inverse Theorem (BIT), Banach 1929) If E, F are Banach
spaces and T ∈ B(E, F ) is a bijection then also T −1 is bounded.
Proof. By Theorem 7.2, T is open. Thus the inverse T −1 that exists by bijectivity (and clearly
is linear) is continuous, thus bounded by Lemma 3.5 or property (iv) from Exercise 7.1. 

Proof of Theorem 2.23. The hypothesis implies that idV : (V, k · k1 ) → (V, k · k2 ) is a continuous
bijection, thus a homeomorphism by the BIT. Now Lemma 3.5 gives k · k1 ≤ c0 k · k2 . 

7.8 Exercise Prove:


(i) The Two Norm Theorem 2.23 implies the BIT.
(ii) The BIT implies the OMT. (Hint: Use a quotient map.)

7.9 Remark 1. The BIT is equivalent to the statement that every continuous linear bijection of
Banach spaces is a homeomorphism. This is reminiscent of the statement that every continuous
bijection of compact Hausdorff spaces is a homeomorphism. (The analogy ends when we look
at the generalization: Every continuous map from a compact space to a Hausdorff space is a
closed map. But there are other open mapping theorems, e.g. for topological groups.)
2. The Open Mapping Theorem can be generalized to the case where E is an F -space, i.e.
a TVS admitting a complete translation-invariant metric. See [141, Theorem 2.11]. 2

7.2 Some applications


With the Open Mapping Theorem at our disposal, we return to the quotient space construction:

7.10 Corollary Let E, F be Banach spaces and T ∈ B(E, F ) surjective. Then


(i) The topology of F coincides with the quotient topology, thus T is a quotient map.
(ii) The linear bijection T 0 : E/ ker T → F from Proposition 6.1 is a homeomorphism.
(iii) The quotient norm on E/ ker T is equivalent to kT 0 (·)kF .
Proof. (i) Since T is continuous, T −1 (U ) ⊆ E is open whenever U ⊆ F is open. And if U ⊆ F
is such that T −1 (U ) is open then by surjectivity of T we have U = T (T −1 (U )), which is open
by the OMT. Thus U ⊆ F is open if and only if the preimage T −1 (U ) ⊆ E is open, which is the
definition of the quotient topology. (ii) The map T 0 is bounded and injective by construction
and surjective by surjectivity of T . Thus by the BIT it is a homeomorphism. (Alternatively
this follows from continuity of T 0 and the fact that F is homeomorphic to E/ ker T by (i).) (iii)
The fact that T 0 is a homeomorphism means that the two norms on E/ ker T are equivalent. 

The next remarkable application will have several uses later:

7.11 Exercise Let V, W be Banach spaces and A ∈ B(V, W ).


(i) If A is injective and dim(W/AV ) < ∞, prove that AV ⊆ W is closed.
Hint: Use an algebraic complement of AV ⊆ W and the BIT.
(ii) Remove the injectivity assumption in (i).
(iii) Prove: If A has dense image but is not surjective then dim(W/AV ) = ∞.

7.12 Remark 1. The above contrasts interestingly with Exercise 6.12.


2. The quotient W/AV is called the (algebraic) cokernel of A. Some authors define the
cokernel as W/cl(AV ). But we don’t do this, since finite-dimensionality of W/cl(AV ) (the
topological cokernel) is a much weaker condition on A and doesn’t imply closedness of AV . 2

The OMT and BIT have many applications to the questions concerning closed linear sub-
spaces, their sums and complementedness:

7.13 Exercise Let V be a Banach space and K, L ⊆ V closed linear subspaces. Equip W =
K ⊕ L with the norm k(k, l)k = kkk + klk (or an equivalent one like max(kkk, klk)).
(i) Prove that the following are equivalent:
(α) K + L ⊆ V is closed.
(β) The (surjective) map + : W → K + L, (k, l) 7→ k + l is open.

(γ) There is a C such that for every y ∈ K + L there are k ∈ K, l ∈ L such that k + l = y
and kkk + klk ≤ Ckyk.
(ii) If K ∩ L = {0} prove that (α)-(γ) are also equivalent to
(δ) inf{kk − lk | k ∈ K, l ∈ L, kkk = klk = 1} > 0. Thus the unit spheres of K and L
have positive distance (which they cannot have if K ∩ L ≠ {0}).

7.14 Proposition Let V be a Banach space and W, Z ⊆ V closed complementary subspaces,


i.e. W + Z = V and W ∩ Z = {0}. Then W ⊕ Z equipped with the norm k(w, z)ks = kwk + kzk
(or k(w, z)km = max(kwk, kzk)) is a Banach space, and the map α : W ⊕ Z → V, (w, z) 7→ w + z
is an isomorphism of Banach spaces (i.e. a linear bijection and a homeomorphism).
Proof. As closed subspaces, W, Z are Banach spaces, thus W ⊕ Z with one of the given norms
is a Banach space. The assumptions imply that α is a linear bijection. Continuity of α follows
from kα((w, z))k = kw + zk ≤ kwk + kzk = k(w, z)ks ≤ 2k(w, z)km . Now the BIT gives that α
is open, thus a homeomorphism. 

7.15 Exercise Let V be a Banach space.


(i) For W ⊆ V complemented, prove:
– There is a bounded linear map P ∈ B(V ) with P 2 = P and W = P V . (The converse
was proven in Exercise 6.13.)
– Every closed Z ⊆ V complementary to W is isomorphic to V /W as a Banach space.
(ii) Prove that an idempotent P : V → V is bounded if and only if the linear subspaces P V
and (1 − P )V of V are both closed.
(iii) Give an example of a Banach space and an unbounded idempotent on it.

7.16 Remark If V is a Banach space and W ⊆ V a finite-dimensional subspace, thus auto-


matically closed and complemented, proving the existence of an idempotent P ∈ B(V ) with
P V = W does not require the open mapping theorem, cf. the proof of Proposition 6.11(ii) given
in Section 9.3. One can also prove bounds on kP k in terms of dim W . 2

7.17 Definition If A : V → W and B : W → V are linear maps satisfying AB = idW then A


is called a left inverse of B and B is called a right inverse or section of A.
If AB = idW then clearly A is surjective and B is injective. In a pure algebra context,
it is not hard to prove that every surjective (resp. injective) linear map admits a linear right
(resp. left) inverse. (Do it!) But for normed spaces and bounded linear maps, matters are more
complicated:

7.18 Exercise (i) Let V be a Banach space and W ⊆ V a closed subspace. Prove that the
quotient map p : V → V /W has a bounded section if and only if W is complemented.
(ii) If V, W are Banach spaces and A ∈ B(V, W ), prove that A has a bounded right inverse if
and only if A is surjective and ker A ⊆ V is complemented.

7.19 Exercise Let V, W be Banach spaces and A ∈ B(V, W ). Prove that A has a bounded
left inverse if and only if A is injective and AV ⊆ W is complemented.

7.20 Remark Since all closed subspaces of Hilbert spaces are complemented, we find that a
bounded map between Hilbert spaces has a bounded right (resp. left) inverse if and only if it is
surjective (resp. injective with closed image). 2

7.21 Exercise (i) Let V be an infinite-dimensional Banach space. Prove that all finite-
dimensional subspaces have empty interior (in V !), then use Baire’s theorem to prove that
V cannot have a countable Hamel basis. (Thus dim V > ℵ0 = #N.)
(ii) Let V be an infinite-dimensional Banach space and {xn }, {ϕn } sequences as in Exercise
9.17. For every N ⊆ N define xN = ∑_{n∈N} 2^{−n} xn . Now use Lemma B.24 to find a linearly
independent family in V of cardinality c = #R, so that dim V ≥ c (Hamel dimension).
(iii) Prove that every separable normed space V has cardinality at most c and deduce dim V ≤ c.
(iv) Conclude that every infinite-dimensional separable Banach space has Hamel dimension c.

7.22 Remark 1. The result of Exercise 7.21(i) can be proven using Riesz’ Lemma 12.2 instead
of Baire’s theorem, see [10], but (as in most such cases) the proof uses countable dependent
choice DCω like the proof of Baire’s theorem.
2. If the continuum hypothesis (CH) is true, Exercise 7.21 (ii) readily follows from (i)+(iii).
But the proof of (ii) indicated above is independent of CH. 2

7.23 Exercise Give counterexamples showing that both spaces appearing in the Bounded
Inverse Theorem must be complete.
Hint: For complete E, incomplete F use `p spaces, and for E incomplete, F complete use
F = `1 (N, R) and the fact that it has Hamel dimension c = #R, cf. Exercise 7.21(iv).

7.3 The Closed Graph Theorem (CGT). Hellinger-Toeplitz


We quickly look at an interesting result equivalent to the Bounded Inverse Theorem, but we
will not need it afterwards.
If f : X → Y is a function, the graph of f is the set G(f ) = {(x, f (x)) | x ∈ X} ⊆ X × Y .

7.24 Exercise Let X be a topological space, Y a Hausdorff space and f : X → Y continuous.


Prove that G(f ) ⊆ X × Y is closed.
If E, F are normed spaces, we know that k(x, y)k = kxk + kyk is a norm on E ⊕ F , complete
if E and F are. The projections p1 : E ⊕ F → E, p2 : E ⊕ F → F are bounded.

7.25 Lemma Let E, F be normed spaces and T : E → F a linear map (not assumed bounded).
Then the following are equivalent:
(i) The graph G(T ) = {(x, T x) | x ∈ E} ⊆ E ⊕ F of T is closed.
(ii) Whenever {xn }n∈N ⊆ E is a sequence such that xn → x ∈ E and T xn → y ∈ F , we have
y = T x.
Proof. Since E ⊕ F is a metric space, G(T ) is closed if and only if it contains the limit (x, y)
of every sequence {(xn , yn )} in G(T ) that converges to some (x, y) ∈ E ⊕ F . But a sequence in
G(T ) is of the form {(xn , T xn )}, and (x, y) ∈ G(T ) ⇔ y = T x. 

7.26 Remark Operators with closed graph (in particular unbounded ones) are often called
closed. But this must not be confused with their closedness as a map, i.e. the property of 
sending closed sets to closed sets! Bounded linear operators between Banach spaces have closed
graphs, but need not be closed maps. 2

7.27 Theorem (Banach 1929) If E, F are Banach spaces, then a linear map T : E → F is
bounded if and only if its graph is closed.
Proof. Let E, F be Banach spaces, and let T : E → F be linear. If T is bounded then
it is continuous, thus G(T ) is closed by Exercise 7.24. Now assume that G(T ) is closed. The
Cartesian product E ⊕F with norm k(e, f )k = kek+kf k is a Banach space. The linear subspace
G(T ) ⊆ E⊕F is closed by assumption, thus a Banach space. Since the projection p1 : G(T ) → E
is a bounded bijection, by Corollary 7.7 it has a bounded inverse p1^{−1} : E → G(T ). Then also
T = p2 ◦ p1^{−1} is bounded. 

7.28 Exercise Show that the Bounded Inverse Theorem (Corollary 7.7) can be deduced from
the Closed Graph Theorem. (Thus the three main results of this section are ‘equivalent’.)
We discuss a few typical applications of the CGT. (For another one see Exercise 8.5.)

7.29 Exercise Let A : E → F be a linear map of Banach spaces. Prove that A is bounded if
and only if the following holds: If {xn }n∈N ⊆ E is such that xn → 0 and Axn → y then y = 0.

7.30 Exercise Let V be a Banach space over F, p ∈ [1, ∞] and {xn }n∈N a sequence in V such
that for every ϕ ∈ V ∗ the sequence {ϕ(xn )}n∈N is in `p (N, F). Thus there is a (clearly linear)
map A : V ∗ → `p (N, F), ϕ 7→ {ϕ(xn )}n∈N (called the analysis map). Prove that A is bounded,
thus A ∈ B(V ∗ , `p (N, F)).

7.31 Exercise Let H be a Hilbert space and {zn }n∈N ⊆ H a sequence such that for each
f ∈ `2 (N, C) there exists an x ∈ H satisfying hx, zn i = f (n) ∀n. Prove that there exists D such
that for each f ∈ `2 (N, C) there exists an x ∈ H satisfying hx, zn i = f (n) ∀n and kxk ≤ Dkf k2 .
Hint: Put N = {z1 , z2 , . . .}⊥ , construct a map `2 (N, C) → H/N and prove its boundedness.

7.3.1 The Hellinger-Toeplitz theorem


The following is a very typical application of the Closed Graph Theorem:

7.32 Theorem (Hellinger-Toeplitz theorem (1928)) 45

(i) If H, K are Hilbert spaces and A : H → K, B : K → H are linear maps satisfying


hAx, yi = hx, Byi for all x ∈ H, y ∈ K then A and B are bounded.
(ii) If H is a Hilbert space and a linear map A : H → H is self-adjoint (i.e. hAx, yi = hx, Ayi
for all x, y ∈ H) then A is bounded.
Proof. (i) Let {xn } ⊆ H be a sequence converging to x ∈ H and assume that Axn → y. Then

hAx, zi = hx, Bzi = lim_{n→∞} hxn , Bzi = lim_{n→∞} hAxn , zi = hy, zi ∀z ∈ K,

implying Ax = y. Thus A has closed graph and therefore is bounded by Theorem 7.27. The
proof for B is analogous. (ii) is just the special case H = K, A = B of (i). 
45
Ernst David Hellinger (1883-1950), Otto Toeplitz (1881-1940). German mathematicians. Both were forced into
exile in 1939. See also the T.-Hausdorff Theorem B.143.

7.33 Remark The Hellinger-Toeplitz Theorem shows that on a Hilbert space H there are no
unbounded linear operators A : H → H satisfying hAx, yi = hx, Ayi ∀x, y. This is a typical
example of a ‘no-go-theorem’. Occasionally such results are a nuisance. After all, the operator
of multiplication by n on `2 (N) ‘obviously’ is self-adjoint. What Hellinger-Toeplitz really says
is that such an operator cannot be defined everywhere, i.e. on all of H. This leads to the notion
of symmetric operators, and also illustrates that no-go theorems often can be circumvented by
generalizing the setting. This is the case here, since the Hellinger-Toeplitz theorem only applies
to operators that are defined everywhere. 2

7.34 Definition A symmetric operator on a Hilbert space H is a linear map A : D → H,


where D ⊆ H is a dense linear subspace, that satisfies hAx, yi = hx, Ayi for all x, y ∈ D.

7.35 Exercise Let H = `2 (N, C), D = {f ∈ `2 (N, C) | ∑_n |nf (n)|^2 < ∞} ⊆ H and A : D →
H, (Af )(n) = nf (n). Prove:
(i) D ⊆ H is a dense proper linear subspace.
(ii) A : D → H is symmetric and unbounded.
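The unboundedness in (ii) is easy to see numerically (a sketch assuming numpy; real vectors used for simplicity): symmetry can be checked on finitely supported sequences, and on the basis vectors e_N one gets kAe_N k = N while ke_N k = 1, so no bound kAf k ≤ Ckf k can hold.

```python
import numpy as np

def apply_A(f):
    """(Af)(n) = n f(n), acting on a finitely supported sequence f (indices from 1)."""
    return np.arange(1, len(f) + 1) * f

# Symmetry <Af, g> = <f, Ag> (checked on real finitely supported vectors):
rng = np.random.default_rng(0)
f, g = rng.standard_normal(50), rng.standard_normal(50)
assert np.isclose(np.dot(apply_A(f), g), np.dot(f, apply_A(g)))

# Unboundedness: on basis vectors, ||A e_N|| / ||e_N|| = N, unbounded in N.
for N in (1, 10, 100):
    e_N = np.zeros(N); e_N[-1] = 1.0
    print(N, np.linalg.norm(apply_A(e_N)))  # ratio equals N
```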
There is an extensive theory of unbounded linear operators defined on dense subspaces
of a Hilbert space. Most books on (linear) functional analysis have a chapter on them, e.g.,
[118, 128, 30, 141]. This subject is very important for applications to differential equations and
quantum mechanics, but since it is quite technical one should probably not approach it before
one has mastered the material of this course.

7.4 Boundedness below. Invertibility


7.36 Definition Let V, W be normed spaces and let A : V → W be a linear map. Then A is
called bounded below46 if there is a δ > 0 such that kAxk ≥ δkxk ∀x ∈ V .
(Equivalently, inf_{kxk=1} kAxk > 0.)
Boundedness below of a map clearly implies injectivity, but the converse is not true. E.g.,
A ∈ B(`2 (N, F)) defined by (Af )(n) = f (n)/n is injective, but not bounded below.
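A quick numerical look at this example (numpy assumed): on the basis vectors, kAen k = 1/n while ken k = 1, so inf_{kxk=1} kAxk = 0 although Af = 0 forces f = 0.

```python
import numpy as np

def apply_A(f):
    """(Af)(n) = f(n)/n on a finitely supported sequence f (indices from 1)."""
    return f / np.arange(1, len(f) + 1)

for n in (1, 10, 1000):
    e_n = np.zeros(n); e_n[-1] = 1.0
    print(n, np.linalg.norm(apply_A(e_n)))  # = 1/n, tending to 0: not bounded below
```

Injectivity, on the other hand, is clear: if f(n)/n = 0 for all n then f = 0.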

7.37 Exercise Let V, W be normed spaces, where V is finite-dimensional, and let A : V →


W be an injective linear map. Prove that A is bounded below.

7.38 Lemma Let V, W be normed spaces and A : V → W a linear bijection. Then


kA−1 k = ( inf_{kxk=1} kAxk )^{−1} .

In particular, A is bounded below if and only if its (set-theoretic) inverse A−1 is bounded.
Proof. Using the bijectivity of x 7→ Ax, we have

kA−1 k = sup_{y∈W \{0}} kA−1 yk/kyk = sup_{x∈V \{0}} kxk/kAxk = ( inf_{x∈V \{0}} kAxk/kxk )^{−1} = ( inf_{kxk=1} kAxk )^{−1} .

46
This terminology clashes with another one according to which a self-adjoint operator A is bounded below if
σ(A) ⊆ [c, ∞) for some c ∈ R. Since we consider only bounded operators, we’ll have no use for this notion. The
problem could be avoided by writing ‘bounded away from zero’, as some authors do, but this is a bit tedious.

The second statement follows immediately. 

Recall that the image AV ⊆ W of a linear map A : V → W need not be closed (if V is
infinite-dimensional). The following generalizes Corollary 3.23:

7.39 Lemma If V is a Banach space, W is a normed space and A : V → W is a linear map


that is bounded and bounded below then its image AV ⊆ W is closed.
Proof. Let y ∈ AV . Then there is a sequence {yn } ⊆ AV converging to y. Pick {xn } ⊆ V with
Axn = yn ∀n. (They are unique, but this is not needed.) The sequence {yn } is Cauchy, and
the same holds for {xn } by kxk ≤ δ −1 kAxk. Since V is complete, xn converges to some x ∈ V .
With boundedness, thus continuity, of A we have y = lim yn = lim Axn = Ax ∈ AV , so that
AV is closed. 

7.40 Definition If V, W are normed spaces then A ∈ B(V, W ) is called invertible if there is a
B ∈ B(W, V ) such that BA = idV and AB = idW .

7.41 Proposition Let V, W be Banach spaces and A ∈ B(V, W ). Then the following are
equivalent:
(i) A is invertible.
(ii) A is injective and surjective.
(iii) A is bounded below and has dense image.
Proof. (i)⇒(ii)+(iii). It is clear that invertibility implies injectivity and surjectivity, thus in
particular dense image. Since A−1 is bounded, Lemma 7.38 gives that A is bounded below.
(ii)⇒(i) The set-theoretic inverse, which exists by bijectivity and clearly is linear, is bounded
by the bounded inverse theorem (Corollary 7.7). Thus A is invertible.
(iii)⇒(i). By boundedness below, A is injective. And AV ⊆ W is dense by assumption and
closed by Lemma 7.39, thus AV = W . Thus A is injective and surjective. Now boundedness of
the inverse A−1 follows from boundedness below of A and Lemma 7.38. Note: BIT not used. 

7.42 Remark 1. Note that dense image is weaker than surjectivity, while boundedness below
is stronger than injectivity. The point of criterion (iii) is that it can be quite hard to verify
surjectivity of A directly, while density of the image usually is easier to establish.
2. The material on bounded below maps discussed so far, including (i)⇔(iii) in Proposition
7.41, was entirely elementary and could be moved to Section 3. 2

7.43 Exercise Let V, W be Banach spaces and A ∈ B(V, W ). Prove:


(i) The following are equivalent:
(α) A is bounded below.
(β) A is injective and AV ⊆ W is closed.
(γ) Ã : V → AV is an isomorphism.
(ii) If ker A has a complement Z then AV ⊆ W is closed ⇔ A|Z is bounded below.
(iii) If V is a Hilbert space then AV ⊆ W is closed if and only if A|(ker A)⊥ is bounded below.

7.44 Exercise Let V be an infinite-dimensional Banach space and A ∈ B(V ) injective.

(i) Putting W = V ⊕ V with norm k(a, b)k = kak + kbk, prove that the subspaces

K = {(x, 0) | x ∈ V } ⊆ W and L = {(x, Ax) | x ∈ V } ⊆ W

are infinite-dimensional, closed and K ∩ L = {0}.


(ii) Prove that K + L = {(x, Ay) | x, y ∈ V }.
(iii) Prove that K + L ⊆ W is non-closed when A is not bounded below.
(iv) For A not bounded below, prove that the unit spheres of K and L have distance zero (as
they must by Exercise 7.13(ii)).

7.45 Exercise Let H be a Hilbert space and A ∈ B(H) such that |hAx, xi| ≥ Ckxk2 for some
C > 0. Prove that A is invertible and kA−1 k ≤ C −1 . (Such an A is called elliptic or coercive.)
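In the finite-dimensional symmetric positive definite special case, the best coercivity constant is the smallest eigenvalue, and the bound kA−1 k ≤ C −1 of the exercise becomes an equality. A sketch (numpy, hypothetical random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T + 0.5 * np.eye(4)    # symmetric positive definite, so <Ax, x> >= C ||x||^2

C = np.linalg.eigvalsh(A).min()  # best possible coercivity constant: smallest eigenvalue
x = rng.standard_normal(4)
assert x @ A @ x >= C * (x @ x) - 1e-9   # Rayleigh quotient bound

inv_norm = np.linalg.norm(np.linalg.inv(A), 2)  # operator (spectral) norm of A^{-1}
assert inv_norm <= 1.0 / C + 1e-8               # ||A^{-1}|| <= C^{-1}, as the exercise predicts
print(C, inv_norm)
```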

7.46 Exercise Let V be a Banach space. Prove that {A ∈ B(V ) | A bounded below } ⊆ B(V )
is open.

8 Uniform Boundedness Theorem and applications


8.1 The Uniform Boundedness Theorem (UBT)
The Open Mapping Theorem concerned a single bounded operator. Now we consider families:

8.1 Definition Let E, F be normed spaces and F ⊆ B(E, F ) a family of bounded linear maps.
• F is called pointwise bounded if supA∈F kAxk < ∞ for each x ∈ E.
• F is called uniformly bounded if supA∈F kAk < ∞.
That uniform boundedness of F implies pointwise boundedness is trivial, but this is not:

8.2 Theorem Let E be a Banach space, F a normed space and F ⊆ B(E, F ). Then:
(i) Either F is uniformly bounded or the set {x ∈ E | supA∈F kAxk = ∞} ⊆ E is dense Gδ 47 .
(ii) [Helly 1912, Hahn, Banach 1922] If F is pointwise bounded then it is uniformly bounded.
Proof. (i) The map F → R≥0 , x 7→ kxk is continuous and each A ∈ F is bounded, thus
continuous. Therefore the map fA : E → R≥0 , x 7→ kAxk is continuous for every A ∈ F.
Defining for each n ∈ N
Vn = {x ∈ E | sup_{A∈F} kAxk > n},

the definition of sup implies

Vn = {x ∈ E | ∃A ∈ F : kAxk > n} = ∪_{A∈F} {x ∈ E | kAxk > n} = ∪_{A∈F} fA−1 ((n, ∞)),

which is open by continuity of the functions fA .
Thus X = ∩_{n∈N} Vn is Gδ . And X = {x ∈ E | sup_{A∈F} kAxk = ∞} in view of the definition
of the Vn . Now Baire’s Theorem A.23 implies that X is dense if all the Vn are.
47
A Gδ set in a topological space is an intersection of countably many open sets.

On the other hand, if Vn is non-dense for some n ∈ N, there exists x0 ∈ E and r > 0 such
that B(x0 , r) ∩ Vn = ∅. This means supA∈F kA(x0 + x)k ≤ n for all x ∈ E with kxk < r. With
x = (x0 + x) − x0 and the triangle inequality we have

kAxk ≤ kA(x0 + x)k + kAx0 k ≤ 2n ∀A ∈ F, x ∈ B(0, r).

This implies kAk ≤ 2n/r for all A ∈ F, thus F is uniformly bounded.


(ii) If F is not uniformly bounded then by (i) it clearly is not pointwise bounded. 

8.3 Remark 1. We call (i) the strong and (ii) the weak version of the Uniform Bounded-
ness Theorem, respectively. Mystifyingly, most expositions of uniform boundedness use Baire’s
theorem to prove the weak version without pointing out that the proof actually gives a much
stronger result. One of the few exceptions is [140], the source of the elegant proof given above.
2. There is an incessant stream of publications purporting to give ‘elementary’ proofs (i.e.
avoiding Baire) of Theorem 8.2(ii), but all of them (except [53]) use, usually without acknowl-
edging it, the axiom DCω of countable dependent choice which, however, is logically equivalent
over ZF to Baire’s theorem! (See [17].) This also holds for [154], but the proof given there
admits a tiny modification, discovered quite recently [53], that really only uses countable choice
ACω . See Appendix B.4 for the beautiful argument (employing a version of the ‘method of
the gliding hump’) that really is more elementary, in the precise sense of reverse mathematics,
which concerns itself with the weakest axioms needed to prove desired results. (Cf. [158] for an
engaging introduction.) 2

8.4 Exercise (i) Deduce Theorem 8.2(ii) from its special case where E, F are both assumed
complete.
(ii) Show that the completeness assumption on E in Theorem 8.2(ii) cannot be omitted.
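A standard counterexample for (ii) can be checked numerically (a sketch; the space c00 of finitely supported sequences with the sup norm and the functionals ϕn (x) = n xn are my choice for illustration, not necessarily the intended solution): each ϕn is bounded with kϕn k = n, so the family is not uniformly bounded, yet for every fixed x only finitely many ϕn (x) are nonzero.

```python
# Pointwise-but-not-uniformly-bounded family on an incomplete normed space.
# E = c00 (finitely supported sequences, sup norm), elements given as finite lists.

def phi(n, x):
    """phi_n(x) = n * x_n (1-based index n); zero once n exceeds the support of x."""
    return n * x[n - 1] if n - 1 < len(x) else 0.0

x = [1.0, -0.5, 2.0]                          # a fixed element of c00
values = [phi(n, x) for n in range(1, 50)]    # eventually all zero
pointwise_sup = max(abs(v) for v in values)   # finite for every fixed x,
                                              # while ||phi_n|| = n is unbounded
```

The norm kϕn k = n is attained at the basis vector en , so uniform boundedness fails even though every pointwise supremum is finite.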
The weak version of the UBT can also be deduced from the closed graph theorem:48

8.5 Exercise Let E, F be Banach spaces and F ⊆ B(E, F ) a pointwise bounded family. Use
the Closed Graph Theorem to prove that F is uniformly bounded, as follows:
(i) Prove that FF = {{yA }A∈F ∈ F F = Fun(F, F ) | supA∈F kyA k < ∞} is a Banach space (with norm k{yA }k = supA∈F kyA k).
(ii) Let T : E → F F , x 7→ {Ax}A∈F . Show that pointwise boundedness of F is equivalent to T E ⊆ FF .
(iii) Prove that the graph G(T ) ⊆ E ⊕ FF of T is closed. (Thus T is bounded by Theorem 7.27.)
(iv) Deduce uniform boundedness of F from the boundedness of T .
(v) Remove the requirement that F be complete.

8.2 Applications of the weak UBT. Banach-Steinhaus


8.6 Exercise Let X, Y be Banach spaces, Z a normed space and T : X × Y → Z a map such
that Y → Z, y 7→ T (x, y) and X → Z, x 7→ T (x, y) are linear and bounded for each x ∈ X and
y ∈ Y , respectively. Prove that there is a 0 ≤ C < ∞ such that kT (x, y)k ≤ Ckxkkyk for all
x ∈ X, y ∈ Y .
48 While this approach can be found in various books, it is unsatisfactory. After all, the weak UBT can be proven
quite directly using only ACω , cf. Appendix B.4, whereas the proof via the CGT, besides being quite indirect, relies
on Baire’s theorem, without however yielding the strong UBT.

8.7 Exercise Give a proof of the Hellinger-Toeplitz Theorem 7.32 using Theorem 8.2(ii) instead
of the Closed Graph Theorem 7.27. (This is interesting since it shows that also
Hellinger-Toeplitz depends only on the axiom ACω of countable choice.)

8.8 Definition Let E, F be normed spaces. A sequence (or net) {An } ⊆ B(E, F ) is strongly
convergent if limn→∞ An x exists for every x ∈ E.
Under the above assumption, the map A : E → F, x 7→ limn→∞ An x is easily seen to be
linear. We then write An →s A or A = s-lim An .

8.9 Corollary (Banach-Steinhaus)49,50 If E is a Banach space, F a normed space and
the sequence {An } ⊆ B(E, F ) is strongly convergent then the map A = s-lim An is bounded,
thus in B(E, F ).
Proof. The convergence of {An x} ⊆ F for each x ∈ E implies boundedness of {An x | n ∈ N}
for each x, so that F = {An | n ∈ N} ⊆ B(E, F ) is pointwise bounded and therefore uniformly
bounded by Theorem 8.2(ii). Thus there is T such that kAn k ≤ T ∀n, so that kAn xk ≤
T kxk ∀x ∈ E, n ∈ N. With An x → Ax this implies kAxk = lim kAn xk ≤ T kxk for all x, thus
kAk ≤ T < ∞. 
8.10 Remark Clearly An →s A is equivalent to kAn − Akx → 0 for all x ∈ E, where kAkx :=
kAxk is a seminorm on B(E, F ) for each x ∈ E. If kAkx = 0 for all x ∈ E then Ax = 0 ∀x ∈ E,
thus A = 0. Thus the family F = {k · kx | x ∈ E} is separating and induces a locally convex
topology on B(E, F ), the strong operator topology τsot . Norm convergence kAn − Ak → 0 clearly
implies strong convergence An →s A, but usually the strong (operator) topology is strictly weaker
(despite its name) than the norm topology. See the following exercise for an example. 2

8.11 Exercise Let H be a Hilbert space and {e1 , e2 , . . .} an orthonormal sequence in H (not
necessarily an ONB). Define ϕn ∈ H ∗ = B(H, F) by ϕn (x) = hx, en i. Prove that ϕn → 0
strongly, but not in norm.

8.12 Exercise Let 1 ≤ p < ∞ and V = `p (N, F). For each m ∈ N define Pm ∈ B(V ) by
(Pm f )(n) = f (n) for n ≥ m and (Pm f )(n) = 0 if n < m. Prove Pm →s 0, but kPm k = 1 ∀m,
thus Pm does not converge to 0 in norm.
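The phenomenon in this exercise is easy to see numerically for p = 2; the following sketch represents sequences by long finite truncations (an approximation for illustration, not a proof):

```python
import math

def P(m, f):
    """Tail truncation (P_m f)(n) = f(n) for n >= m and 0 for n < m (1-based n)."""
    return [v if n + 1 >= m else 0.0 for n, v in enumerate(f)]

def norm2(f):
    """Euclidean (l^2) norm of a truncated sequence."""
    return math.sqrt(sum(v * v for v in f))

N = 10000
f = [1.0 / k for k in range(1, N + 1)]               # f(n) = 1/n lies in l^2

# ||P_m f|| shrinks towards 0 for this fixed f (strong convergence P_m -> 0) ...
tail_norms = [norm2(P(m, f)) for m in (1, 10, 100, 1000)]
decreasing = all(a > b for a, b in zip(tail_norms, tail_norms[1:]))

# ... while ||P_m|| = 1 for every m, witnessed by the basis vector e_m:
e1000 = [1.0 if n == 999 else 0.0 for n in range(N)]
witness = norm2(P(1000, e1000))                      # = ||e_1000|| = 1
```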

8.13 Exercise Let V be a separable Banach space and B ⊆ B(V ) a bounded subset.
(i) Prove: If S ⊆ V is dense and a net {Aι } ⊆ B satisfies kAι xk → 0 for all x ∈ S then
kAι xk → 0 for all x ∈ V , thus Aι → 0 in the strong operator topology.
(ii) Prove that the topological space (B, τsot ) is metrizable.
(iii) BONUS: Prove that (B(V ), τsot ) is not metrizable if V is infinite-dimensional.
49 Hugo Steinhaus (1887-1972), Polish mathematician.
50 In the literature, one can find either this result or Theorem 8.2(ii) denoted as ‘Banach-Steinhaus theorem’.

8.3 Application of the strong UBT: Many continuous functions with divergent Fourier series
For the preceding applications of the uniform boundedness theorem, the weak version was
sufficient. Other applications use the contraposition for a (non-constructive) existence proof:
If F ⊆ B(E, F ) is not uniformly bounded then it is not pointwise bounded, thus there exists
x ∈ E with supA∈F kAxk = ∞. For such applications, Theorem 8.2(i) is a definite improvement.
Let f : R → C be 2π-periodic, i.e. f (x + 2π) = f (x) ∀x, and integrable over finite intervals.
Define

cn (f ) = (2π)^{−1} ∫_0^{2π} f (x)e^{−inx} dx            (8.1)

and

Sn (f )(x) = Σ_{k=−n}^{n} ck (f )e^{ikx} , n ∈ N.        (8.2)

The fundamental problem of the theory of Fourier series is to find conditions for the conver-
gence Sn (f )(x) → f (x) as n → ∞, where convergence can be understood as (possibly almost)
everywhere pointwise or w.r.t. some norm, like k · k2 (as in Example 5.49) or k · k∞ . Here we
will discuss only continuous functions and we identify continuous 2π-periodic functions with
continuous functions on S 1 . It is not hard to show that Sn (f )(x) → f (x) if f is differentiable
at x (or just Hölder continuous: |f (x′ ) − f (x)| ≤ C|x′ − x|D with C, D > 0 for x′ near x) and
that convergence is uniform when f is continuously differentiable (or the Hölder condition holds
uniformly in x, x′ ). (See any number of books on Fourier analysis, e.g. [157, 83].)
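The definitions (8.1) and (8.2) can be checked numerically; the sketch below approximates cn (f ) by an equispaced Riemann sum, which happens to be exact (up to rounding) for trigonometric polynomials such as the test function f (x) = cos 2x chosen here:

```python
import cmath
import math

def c(n, f, M=4096):
    """c_n(f) = (2*pi)^(-1) * int_0^{2pi} f(x) e^{-inx} dx via an equispaced
    Riemann sum with M nodes (sum * h / (2*pi) simplifies to sum / M)."""
    h = 2 * math.pi / M
    return sum(f(j * h) * cmath.exp(-1j * n * j * h) for j in range(M)) / M

def S(n, f, x):
    """Partial sum S_n(f)(x) = sum_{k=-n}^{n} c_k(f) e^{ikx} as in (8.2)."""
    return sum(c(k, f) * cmath.exp(1j * k * x) for k in range(-n, n + 1))

f = lambda x: math.cos(2 * x)   # c_{+-2}(f) = 1/2, all other coefficients vanish
```

For this f the partial sum S3 (f ) already reproduces f , since all coefficients beyond |k| = 2 vanish.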
Assuming only continuity of f one can still prove that limn→∞ Sn (f )(x) = f (x) if the limit
exists, but there actually exist continuous functions f such that Sn (f )(x) diverges at some
x. Such functions were first constructed in the 1870s using ‘condensation of singularities’,
a relative and precursor of the gliding hump method, cf. Appendix B.3.5. Nowadays, most
textbook presentations of such functions are based on Lemma 8.15 below combined with either
the uniform boundedness theorem or constructions ‘by hand’, see e.g. [83, Section II.2], that
are quite close in spirit to the uniform boundedness method.
However, individual examples of continuous functions with Fourier series divergent in a point
can be produced in a totally constructive fashion, avoiding all choice axioms! (See [109] for a
very classical example.) But using non-constructive arguments seems unavoidable if one wants
to prove that there are many such functions as in the following:

8.14 Theorem There is a subset X ⊆ C(S 1 ) that is dense Gδ (in the k · k∞ -topology) such
that the Fourier series {Sn (f )(0)}n∈N diverges for each f ∈ X.
Proof. Inserting (8.1) into (8.2) we obtain

Sn (f )(x) = (2π)^{−1} Σ_{k=−n}^{n} e^{ikx} ∫_0^{2π} f (t)e^{−ikt} dt = (2π)^{−1} ∫_0^{2π} f (t) ( Σ_{k=−n}^{n} e^{ik(x−t)} ) dt = (Dn ⋆ f )(x),

where ⋆ denotes convolution, defined for 2π-periodic f, g by (f ⋆ g)(x) = (2π)^{−1} ∫_0^{2π} f (t)g(x − t)dt,
and

Dn (x) := Σ_{k=−n}^{n} e^{ikx} = sin((n + 1/2)x) / sin(x/2)

is the Dirichlet kernel. The quickest way to check the last identity is the ‘telescoping’ calculation

(e^{ix/2} − e^{−ix/2} )Dn (x) = Σ_{k=−n}^{n} ( e^{ix(k+1/2)} − e^{ix(k−1/2)} ) = e^{ix(n+1/2)} − e^{−ix(n+1/2)} ,

together with e^{ix} − e^{−ix} = 2i sin x. Since Dn (x) is an even function, we have

ϕn (f ) := Sn (f )(0) = (2π)^{−1} ∫_0^{2π} f (x)Dn (x)dx.
It is clear that the norm of the map ϕn : (C(S 1 ), k · k∞ ) → C is bounded above by kDn k1 .

For gn (x) = sgn(Dn (x)) we have ϕn (gn ) = (2π)^{−1} ∫_0^{2π} |Dn (x)|dx =: kDn k1 . While gn is not
continuous, we can find a sequence of continuous gn,m bounded by 1 such that gn,m → gn
pointwise as m → ∞.
pointwise. Now Lebesgue’s dominated convergence theorem implies ϕn (gn,m ) → ϕn (gn ) =
kDn k1 , which implies kϕn k = kDn k1 . By Lemma 8.15 below, kDn k1 → ∞ as n → ∞. Thus the
family F = {ϕn } ⊆ B(C(S 1 ), C) is not uniformly bounded. Now Theorem 8.2(i) implies that
the set X = {f ∈ C(S 1 , C) | {Sn (f )(0)} is unbounded} is dense Gδ . 
8.15 Lemma We have kDn k1 ≥ (4/π^2 ) log n for all n ∈ N.
Proof. Using | sin x| ≤ |x| for all x ∈ R (applied to the x/2 in the denominator of Dn ), we compute

kDn k1 = (2π)^{−1} ∫_{−π}^{π} |Dn (x)|dx ≥ (2/π) ∫_0^{π} | sin((n + 1/2)x)| dx/x = (2/π) ∫_0^{(n+1/2)π} | sin x| dx/x
≥ (2/π) Σ_{k=1}^{n} ∫_{(k−1)π}^{kπ} (| sin x|/x) dx ≥ (2/π) Σ_{k=1}^{n} (kπ)^{−1} ∫_0^{π} sin x dx = (4/π^2 ) Σ_{k=1}^{n} 1/k ≥ (4/π^2 ) log n,

where we used Σ_{k=1}^{n} 1/k ≥ ∫_1^{n+1} dx/x = log(n + 1) > log n. 
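Both the closed form of Dn and the logarithmic growth of kDn k1 are easy to confirm numerically (a sketch; the quadrature parameters are ad hoc choices of mine):

```python
import math

def D_closed(n, x):
    """Closed form sin((n+1/2)x)/sin(x/2) of the Dirichlet kernel, with the
    removable singularity at x = 0 filled in by the value 2n+1."""
    s = math.sin(x / 2)
    return (2 * n + 1.0) if abs(s) < 1e-12 else math.sin((n + 0.5) * x) / s

def D_sum(n, x):
    """Defining sum D_n(x) = sum_{k=-n}^{n} e^{ikx}, real since terms pair up."""
    return 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, n + 1))

def L1_norm(n, M=20000):
    """||D_n||_1 = (2*pi)^(-1) * int_{-pi}^{pi} |D_n(x)| dx by the midpoint rule."""
    h = 2 * math.pi / M
    return sum(abs(D_closed(n, -math.pi + (j + 0.5) * h)) for j in range(M)) * h / (2 * math.pi)

lower_bound = lambda n: 4.0 / math.pi**2 * math.log(n)   # Lemma 8.15
```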

8.16 Remark Also the Bounded Inverse Theorem has an interesting application to Fourier
analysis: For f ∈ L1 ([0, 2π]), we define the Fourier coefficients f̂ (n) = (2π)^{−1} ∫_0^{2π} f (t)e^{−int} dt
for all n ∈ Z. It is immediate that kf̂ k∞ ≤ kf k1 , and it is not hard to prove the Riemann-Lebesgue
theorem f̂ ∈ c0 (Z, C) and injectivity of the resulting map L1 ([0, 2π]) → c0 (Z, C), f 7→ f̂ , see
e.g. [140, Theorem 5.15] or [83]. If this map were surjective, the Bounded Inverse Theorem
would give kf k1 ≤ Ckf̂ k∞ ∀f ∈ L1 ([0, 2π]). For the Dirichlet kernel it is immediate that
D̂n (m) = χ[−n,n] (m), thus kD̂n k∞ = 1 for all n ∈ N. Since we know that kDn k1 → ∞, we
would have a contradiction. Thus L1 ([0, 2π]) → c0 (Z, C), f 7→ f̂ is not surjective. 2

9 Duality: Hahn-Banach Theorem and applications


We have seen that every bounded linear functional ϕ ∈ H ∗ , where H is a Hilbert space, is
of the form ϕ = ϕy for a certain (unique) y ∈ H. Thus dual spaces of Hilbert spaces are
completely understood. (The map H → H ∗ , y 7→ ϕy is an anti-linear bijection.) For a general
Banach space V , matters are much more complicated. The point of the Hahn51 -Banach theorem
51 Hans Hahn (1879-1934), Austrian mathematician who mostly worked in analysis and topology.

(which comes in many versions)52 is to show that all Banach spaces admit many bounded linear
functionals.

9.1 First version of Hahn-Banach over R


We begin with a slight generalization of the notion of seminorms:

9.1 Definition If V is a real vector space, a map p : V → R is called sublinear if it satisfies


• Positive homogeneity: p(cv) = cp(v) for all v ∈ V and c > 0.
• Subadditivity: p(x + y) ≤ p(x) + p(y) for all x, y ∈ V .

9.2 Theorem Let V be a real vector space and p : V → R a sublinear function. Let W ⊆ V
be a linear subspace and ϕ : W → R a linear functional such that ϕ(w) ≤ p(w) for all w ∈ W .
Then there is a linear functional ϕ̂ : V → R such that ϕ̂  W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ V .
The heart of the proof is the special case where W has codimension one:

9.3 Lemma Let V, p, W, ϕ be as in Theorem 9.2 and v′ ∈ V . Then there is a linear functional
ϕ̂ : Y = W + Rv′ → R such that ϕ̂  W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ Y .
Proof. If v′ ∈ W , there is nothing to do, so we may assume v′ ∈ V \W . Then every
x ∈ W + Rv′ can be written as x = w + cv′ with unique w ∈ W, c ∈ R. Thus if d ∈ R, we can
define ϕ̂ : W + Rv′ → R by w + cv′ 7→ ϕ(w) + cd for all w ∈ W and c ∈ R. Since ϕ̂ is linear and
trivially satisfies ϕ̂  W = ϕ, it remains to show that d can be chosen such that ϕ̂ ≤ p holds on
Y = W + Rv′ , to wit

ϕ̂(w + cv′ ) = ϕ(w) + cd ≤ p(w + cv′ )    ∀w ∈ W, c ∈ R.    (9.1)

For c = 0, this holds by assumption. If (9.1) holds for all w ∈ W and c ∈ R then in particular

ϕ(w) ± d ≤ p(w ± v′ )    ∀w ∈ W.    (9.2)

And if (9.2) holds then by linearity of ϕ and positive homogeneity of p, for all e > 0 we have

ϕ̂(w ± ev′ ) = e ϕ̂(e^{−1} w ± v′ ) ≤ e p(e^{−1} w ± v′ ) = p(w ± ev′ ),

where the middle step is (9.2) applied to e^{−1} w; thus the desired inequality (9.1) holds for all
w ∈ W, c ∈ R. Now (9.2) is equivalent to

ϕ(w) − p(w − v′ ) ≤ d ≤ p(w′ + v′ ) − ϕ(w′ )    ∀w, w′ ∈ W.

Clearly such a d exists if and only if ϕ(w) − p(w − v′ ) ≤ p(w′ + v′ ) − ϕ(w′ ) for all w, w′ ∈ W ,
which in turn is equivalent to ϕ(w) + ϕ(w′ ) ≤ p(w − v′ ) + p(w′ + v′ ) ∀w, w′ . The latter inequality
is indeed satisfied for all w, w′ ∈ W since w + w′ ∈ W , so that

ϕ(w) + ϕ(w′ ) = ϕ(w + w′ ) ≤ p(w + w′ ) = p((w − v′ ) + (w′ + v′ )) ≤ p(w − v′ ) + p(w′ + v′ ),

which holds since ϕ is additive and bounded by p on W and since p is subadditive. 
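The admissible interval for d can be visualized in a toy example; the sketch below samples W = R·(1, 0) inside V = R2 with p the Euclidean norm (a norm, hence sublinear), ϕ(t) = 0.6 t and v′ = (0, 1) — all choices mine, for illustration only:

```python
import math

# Lemma 9.3 in R^2: the proof's condition on d is
#   sup_w [phi(w) - p(w - v')]  <=  d  <=  inf_w' [p(w' + v') - phi(w')].
p = lambda x, y: math.hypot(x, y)    # sublinear (a norm)
phi = lambda t: 0.6 * t              # phi on W = R*(1,0), identified with R; phi <= p

ws = [t / 10.0 for t in range(-500, 501)]      # sample points of W
lower = max(phi(t) - p(t, -1.0) for t in ws)   # phi(w) - p(w - v')
upper = min(p(t, 1.0) - phi(t) for t in ws)    # p(w' + v') - phi(w')

d = (lower + upper) / 2.0                      # any d in [lower, upper] works
phi_hat = lambda t, c: phi(t) + c * d          # phi_hat(w + c*v') = phi(w) + c*d
dominated = all(phi_hat(t, c) <= p(t, c) + 1e-9
                for t in ws for c in (-3.0, -1.0, -0.5, 0.5, 1.0, 3.0))
```

Here lower ≈ −0.8 and upper ≈ 0.8, a genuinely nonempty interval, and any choice of d inside it keeps ϕ̂ ≤ p on the sampled points.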

52 Important early results are due to Eduard Helly (1884-1943), another Austrian mathematician. See [101, p. 54-55].

If n = dim(V /W ) < ∞, proving the theorem amounts to applying the Lemma n times. But
otherwise an infinite inductive procedure is required. This can be formalized via ‘transfinite
induction’, but is easier to invoke Zorn’s lemma, as in the proof of Proposition 5.36:

Proof of Theorem 9.2. If W = V , there is nothing to do, so assume W ⊊ V . Let E be the set
of pairs (Z, ψ), where Z ⊆ V is a linear subspace containing W and ψ : Z → R is a linear
map extending ϕ such that ψ(z) ≤ p(z) ∀z ∈ Z. Since W ≠ V , Lemma 9.3 implies E ≠ ∅.
We define a partial ordering on E by (Z, ψ) ≤ (Z′ , ψ′ ) ⇔ Z ⊆ Z′ , ψ′  Z = ψ. If C ⊆ E is
a chain, i.e. totally ordered by ≤, let Y = ⋃_{(Z,ψ)∈C} Z and define ψY : Y → R by ψY (v) = ψ(v)
for any (Z, ψ) ∈ C with v ∈ Z. This clearly is consistent and gives a linear map. Now (Y, ψY ) is
an element of E and an upper bound for C. Thus by Zorn’s lemma there is a maximal element
(YM , ψM ) of E. Now ψM : YM → R is an extension of ϕ satisfying ψM (y) ≤ p(y) for all y ∈ YM ,
so we are done if we prove YM = V . If this is not the case, we can pick v′ ∈ V \YM and use
Lemma 9.3 to extend ψM to YM + Rv′ , but this contradicts the maximality of (YM , ψM ). 

9.4 Remark The above proof is even more non-constructive than the preceding ones in that
it uses Zorn’s lemma, which is equivalent to the Axiom of Choice (AC)53 . If V is a separable
normed space, we can replace AC by the weaker axiom DCω (countable dependent choice), cf.
Exercise 9.8. But even in the generality of all Banach spaces there is the seldom cited fact that
the Hahn-Banach theorem can be deduced over ZF from the restriction of Tychonov’s theorem
to Hausdorff spaces, which is strictly weaker than AC. See Appendix B.6.1. 2

9.2 Hahn-Banach theorem for (semi)normed spaces


With the exception of Section B.6.3 we will not use Theorem 9.2 directly, but only the following
consequence:

9.5 Theorem (Hahn-Banach Theorem (1927/9)) Let V be a vector space over F ∈ {R, C},
p a seminorm on it, W ⊆ V a linear subspace and ϕ : W → F a linear functional such that
|ϕ(w)| ≤ p(w) for all w ∈ W . Then there is a linear functional ϕ̂ : V → F such that ϕ̂  W = ϕ
and |ϕ̂(v)| ≤ p(v) for all v ∈ V .
Proof. F = R: This is an immediate consequence of Theorem 9.2 since a seminorm p is sublinear
with the additional property p(−v) = p(v) ≥ 0 for all v. In particular, −ϕ̂(v) = ϕ̂(−v) ≤
p(−v) = p(v), so that −p(v) ≤ ϕ̂(v) ≤ p(v) for all v ∈ V , which is equivalent to |ϕ̂(v)| ≤ p(v) ∀v.
F = C:54 Assume ϕ : W → C, where W ⊆ V , satisfies |ϕ(w)| ≤ p(w) ∀w ∈ W . Define ψ : W → R, w 7→
Re(ϕ(w)), which clearly is R-linear and satisfies the same bound. Thus by the real case just
considered, there is an R-linear functional ψ̂ : V → R extending ψ such that |ψ̂(v)| ≤ p(v) for
all v ∈ V . Define ϕ̂ : V → C by

ϕ̂(v) = ψ̂(v) − iψ̂(iv).

Again it is clear that ϕ̂ is R-linear. Furthermore

ϕ̂(iv) = ψ̂(iv) − iψ̂(−v) = ψ̂(iv) + iψ̂(v) = i(ψ̂(v) − iψ̂(iv)) = iϕ̂(v),
53 “Such reliance on awful non-constructive results is unfortunately typical of traditional functional analysis.” [92]
54 This was discovered only in 1938 by Henri Frederic Bohnenblust (1906-2000) and Andrew Florian Sobczyk (1915-1981), Swiss resp. Polish born American mathematicians.

proving that ϕ̂ : V → C is C-linear. If w ∈ W then

ϕ̂(w) = ψ̂(w) − iψ̂(iw) = ψ(w) − iψ(iw) = Re(ϕ(w)) − iRe(ϕ(iw))
= Re(ϕ(w)) − iRe(iϕ(w)) = Re(ϕ(w)) + iIm(ϕ(w)) = ϕ(w),

so that ϕ̂ extends ϕ.
Given v ∈ V , let α ∈ C, |α| = 1 be such that αϕ̂(v) ≥ 0. Then αϕ̂(v) = ϕ̂(αv) =
Re(ϕ̂(αv)) = ψ̂(αv), so that |ϕ̂(v)| = |αϕ̂(v)| = ψ̂(αv) ≤ p(αv) = p(v). 
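The reconstruction ϕ̂(v) = ψ̂(v) − iψ̂(iv) underlying the complex case can be tested numerically on V = C3 (an illustration with a randomly chosen functional, not part of the notes):

```python
import random

# A C-linear functional phi on C^3 is recovered from its real part psi via
# phi(v) = psi(v) - i * psi(i v), as in the complexification step of the proof.
random.seed(0)
a = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(3)]

phi = lambda v: sum(x * y for x, y in zip(a, v))   # a C-linear functional
psi = lambda v: phi(v).real                        # its R-linear real part

def phi_rec(v):
    """Reconstruction psi(v) - i*psi(iv)."""
    return complex(psi(v), -psi([1j * x for x in v]))

v = [1 + 2j, -0.5j, 3.0]
```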

9.6 Remark In Exercise 5.35 we saw (with a fairly easy proof) that bounded linear functionals
defined on linear subspaces of Hilbert spaces always have unique norm-preserving extensions to
the whole space. For a general Banach space V this uniqueness is far from true! (It holds if and
only if V ∗ is strictly convex, cf. Section B.6.7 for definition and proof.) 2

9.7 Exercise Give an example for a Banach space V , a linear subspace W ⊆ V and ϕ ∈ W ∗
such that there are multiple norm-preserving extensions ϕ̂ ∈ V ∗ of ϕ.

9.8 Exercise Give a proof of Theorem 9.5 in the case of a separable normed space (V, k · k)
using only the axiom DCω of countable dependent choice (thus neither AC nor Zorn).

9.3 First applications of Hahn-Banach


9.9 Proposition Let V be a normed space over F ∈ {R, C}.
(i) For every 0 6= x ∈ V there is a ϕ ∈ V ∗ with kϕk = 1 such that ϕ(x) = kxk.
(ii) For each x ∈ V we have kxk = supϕ∈V ∗ ,kϕk=1 |ϕ(x)|.
(iii) ‘V ∗ separates the points of V ’: If x, x0 ∈ V and ϕ(x) = ϕ(x0 ) ∀ϕ ∈ V ∗ then x = x0 .
(iv) If x ∈ V then x̂ : V ∗ → F, ϕ 7→ ϕ(x) is in V ∗∗ with kx̂k = kxk. The map ιV : V →
V ∗∗ , x 7→ x̂ is an isometric embedding.
(v) The image55 ιV (V ) ⊆ V ∗∗ is closed if and only if V is complete (i.e. Banach).
Proof. (i) Let W = Fx ⊆ V . The linear functional ϕ : W → F, cx 7→ ckxk is isometric since
|ϕ(x)| = kxk, thus kϕk = 1. By the Hahn-Banach Theorem 9.5 there exists a ϕ̂ ∈ V ∗ with
ϕ̂(x) = ϕ(x) = kxk and kϕ̂k = kϕk = 1.
(ii) It is clear that supϕ∈V ∗ ,kϕk=1 |ϕ(x)| ≤ kxk, and the converse inequality follows from (i).
(iii) Apply (ii) to x − x0 .
(iv) One easily checks that x̂ : V ∗ → F, ϕ 7→ ϕ(x) is a linear functional. If x ∈ V, ϕ ∈ V ∗
then |x̂(ϕ)| = |ϕ(x)| ≤ kxkkϕk. Thus kx̂k ≤ kxk. By (i) there is ϕ ∈ V ∗ with kϕk = 1 such
that ϕ(x) = kxk. This gives kx̂k = kxk. Thus the map ιV : V → V ∗∗ , x 7→ x̂, which clearly is
linear, is an isometric embedding.
(v) If V is complete then ιV (V ) ⊆ V ∗∗ is closed by Corollary 3.23 since ιV is an isometry
by (iv). Conversely, if ιV (V ) ⊆ V ∗∗ is closed then completeness of V ∗∗ (Proposition 3.25(ii))
55
If f : X → Y is any function, from a category theory point of view one would call X the source (or domain) and
Y the target (or codomain) of f and call the subset f (X) ⊆ Y the image of f . I prefer to avoid the term ‘range’ since
some authors use it for ‘target’ (thus Y ) and others for ‘image’ (thus f (X)). The term ‘image’ is unambiguous since
no reasonable person would use it intending Y .

72
implies that ιV (V ) is complete, thus also V since ιV : V → V ∗∗ is an isometric bijection. 

It is customary to simply write ι instead of ιV or to drop ιV from the notation entirely,


identifying V with its image ιV (V ) in V ∗∗ , so that V ⊆ V ∗∗ .

9.10 Corollary Every normed space V embeds isometrically as a dense subspace into a Banach space V̂ . The latter is unique up to isometric isomorphism and is called the completion of
V.
Proof. This can be proven by completing the metric space (V, d), where d(x, y) = kx − yk, and
showing that the completion is a linear space, but this is a bit tedious. Alternatively, using the
above result that ιV : V → V ∗∗ is an isometry, we can define V̂ to be the closure of ιV (V ) in
V ∗∗ : this is a closed subspace of the complete space V ∗∗ , thus complete, and it contains
ιV (V ) ∼= V as a dense linear subspace.
Uniqueness of the completion follows with the same proof as for metric spaces, cf. [108]. 

The Hahn-Banach theorem allows us to prove two claims made earlier:

9.11 Corollary If V is a normed space and {xn } ⊂ V is such that Σ_{n=1}^{∞} xσ(n) converges for all
permutations σ of N (i.e. unconditionally) then the sums do not depend on σ.
Proof. For each ϕ ∈ V ∗ , by continuity of ϕ we have

ϕ( Σ_{n=1}^{∞} xσ(n) ) = ϕ( lim_{N→∞} Σ_{n=1}^{N} xσ(n) ) = lim_{N→∞} Σ_{n=1}^{N} ϕ(xσ(n) ) = Σ_{n=1}^{∞} ϕ(xσ(n) ).

Thus the series on the r.h.s. converges for all σ. Since it takes values in R or C, this implies absolute convergence and independence of Σ_{n=1}^{∞} ϕ(xσ(n) ) of σ. Thus ϕ( Σ_{n=1}^{∞} xσ(n) ) = ϕ( Σ_{n=1}^{∞} xn )
for all ϕ ∈ V ∗ . Now the claim follows from Proposition 9.9(iii). 

9.12 Proposition Let V be a normed space and W ⊆ V a finite-dimensional subspace. Then


(i) There exists an idempotent P ∈ B(V ) such that W = P V , thus W is complemented.
(ii) We can achieve kP k ≤ dim W .
Proof. (i) To begin with, W is closed by Exercise 3.22. If E = {e1 , . . . , en } is a basis for W ,
there are unique linear functionals ϕi : W → C such that w = Σ_{i=1}^{n} ϕi (w)ei for each w ∈ W ,
equivalently, ϕj (ei ) = δij . Since W is finite-dimensional, the ϕi are automatically bounded by
Exercise 3.7. Now by the Hahn-Banach Theorem 9.5 there are linear functionals ϕ̂i : V → C
extending the ϕi with kϕ̂i k = kϕi k. Then P : v 7→ Σ_{i=1}^{n} ϕ̂i (v)ei is in B(V ). If w ∈ W
then P (w) = Σ_{i=1}^{n} ϕ̂i (w)ei = Σ_{i=1}^{n} ϕi (w)ei = w, thus P  W = idW . This implies W = P V and
P 2 = P , so that W is complemented by Exercise 6.13. (A complement, obviously closed, then
is Z = (1 − P )V = ker P = ⋂_{i=1}^{n} ker ϕ̂i .)
(ii) If E is an Auerbach basis, cf. Proposition 3.10, then kei k = kϕi k = 1 ∀i. Then also the
ϕ̂i have norm one, so that kP xk = kΣi ϕ̂i (x)ei k ≤ Σi |ϕ̂i (x)| kei k ≤ kxk Σi kϕ̂i k kei k = nkxk,
proving the claim. 
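A finite-dimensional sketch of the construction in (i): in R3 take W = span{e1 = (1, 1, 0), e2 = (0, 0, 1)}; the extensions ϕ̂1 (x, y, z) = x and ϕ̂2 (x, y, z) = z below are concrete hand-picked choices of mine (Hahn-Banach guarantees norm-preserving extensions in general, which these need not be):

```python
# P v = phi_hat_1(v) e1 + phi_hat_2(v) e2 is idempotent with image W.
e1, e2 = (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)
phi1 = lambda v: v[0]   # extends the first coordinate functional of W to R^3
phi2 = lambda v: v[2]   # extends the second

def P(v):
    """The projection P v = phi1(v)*e1 + phi2(v)*e2 onto W."""
    a, b = phi1(v), phi2(v)
    return tuple(a * x + b * y for x, y in zip(e1, e2))

v = (2.0, -7.0, 3.0)
Pv = P(v)               # lands in W: here (2, 2, 3) = 2*e1 + 3*e2
```

Applying P twice gives the same result, and both basis vectors of W are fixed, so P is an idempotent with P V = W as in the proof.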

9.13 Remark With more effort one proves the Kadets-Snobar theorem (1971), giving an idempotent P with image W satisfying kP k ≤ √(dim W) (which is almost optimal, but not quite). Cf.
e.g. [103, Theorem 12.14] or [1, Theorem 13.1.7]. If V is a Hilbert space, one clearly has the
orthogonal projections satisfying kP k = 1. If V is isomorphic to a Hilbert space, this implies a

uniform bound kP k ≤ λ < ∞ for all finite-dimensional subspaces. The converse is also true! Cf.
e.g. [1, Theorem 13.4.3]. Combining this with the Kadets-Snobar theorem, it is not too difficult
to prove the characterization of Hilbert spaces mentioned in Remark 6.10.4. 2

Now we can continue the discussion of ⊥ and > begun with Definition 6.6 and Exercise 6.7:

9.14 Exercise Let V be a normed space and W ⊆ V, Φ ⊆ V ∗ linear subspaces. Prove:


(i) W ⊥ = {0} ⇔ W is dense in V .
(ii) W ⊥ = V ∗ ⇔ W = {0}.
(iii) If Φ is dense in V ∗ then Φ> = {0}.
(iv) (W ⊥ )> equals the closure of W in V .

9.15 Exercise Let V be a normed space and W ⊆ V a closed linear subspace. Construct an
isometric linear bijection β : V ∗ /W ⊥ → W ∗ .

9.16 Exercise If V is Banach and Z ⊆ V ∗ a closed subspace, construct an isometric isomor-


phism V ∗ /Z ∼
= (Z > )∗ .

9.17 Exercise Let V be an infinite-dimensional Banach space over F ∈ {R, C}.


(i) Use Hahn-Banach to construct sequences {xn }n∈N ⊆ V and {ϕn }n∈N ⊆ V ∗ such that
kxn k = 1 and ϕn (xn ) ≠ 0 for all n ∈ N and ϕn (xm ) = 0 whenever n ≠ m.
(ii) Prove that {xn }n∈N is linearly independent and that xn ∉ spanF {xm | m ≠ n} for all n.

9.18 Exercise Let V be a normed space and x ∈ V, ϕ ∈ V ∗ . Prove that ιV (x) ∈ V ∗∗ and
ιV ∗ (ϕ) ∈ V ∗∗∗ satisfy ιV ∗ (ϕ)(ιV (x)) = ϕ(x).

9.4 Reflexivity of Banach spaces


The following definition is immediately suggested by the fact that for every Banach space V
there is an isometric embedding ιV : V ↪ V ∗∗ :

9.19 Definition A Banach space V is called reflexive if the map ιV : V → V ∗∗ is surjective.

9.20 Remark 1. Reflexivity of V means that every bounded linear functional ψ ∈ V ∗∗ on V ∗


is of the form ψ = ιV (x), thus ϕ 7→ ϕ(x) for some x ∈ V .
2. If V is reflexive, ιV : V → V ∗∗ is an isometric isomorphism of normed spaces. Now
completeness of V ∗∗ implies completeness of V . For this reason there is little point in defining
reflexivity for normed spaces.
3. There are Banach spaces V that are not reflexive, yet satisfy V ∼= V ∗∗ non-canonically! An
example is the James space,56 see e.g. [102, Section 4.5], which is also interesting since V ∗∗ /ιV (V )
is one-dimensional! For ‘most’ non-reflexive spaces this quotient is infinite-dimensional. (E.g.
c0∗∗ /ιc0 (c0 ) ∼= `∞ /c0 is infinite-dimensional since otherwise c0 ⊆ `∞ would be complemented.) 2

9.21 Theorem Let V be a Banach space. Then V is reflexive if and only if V ∗ is reflexive.
56 Robert Clarke James (1918-2004), American functional analyst.

Proof. ⇒ Reflexivity of V means surjectivity of ιV : V → V ∗∗ . Let ϕ ∈ V ∗∗∗ = (V ∗∗ )∗ .
Then ϕ′ = ϕ ◦ ιV ∈ V ∗ , and we claim that ϕ = ιV ∗ (ϕ′ ). This would clearly imply surjectivity
of ιV ∗ : V ∗ → V ∗∗∗ , thus reflexivity of V ∗ . The claim means ϕ(x∗∗ ) = ιV ∗ (ϕ′ )(x∗∗ ) for all
x∗∗ ∈ V ∗∗ . By surjectivity of ιV : V → V ∗∗ , this is equivalent to ϕ(ιV (x)) = ιV ∗ (ϕ′ )(ιV (x))
for all x ∈ V . The latter identity indeed is true since both sides equal ϕ′ (x), the l.h.s. by the
definition of ϕ′ and the r.h.s. by Exercise 9.18.
⇐ Assume that V is not reflexive. Then ιV (V ) ⊆ V ∗∗ is a proper closed subspace, so that
ιV (V )⊥ ≠ {0} by Exercise 9.14. Let thus 0 ≠ ϕ ∈ ιV (V )⊥ ⊆ V ∗∗∗ . Since V ∗ is reflexive,
we have ϕ = ιV ∗ (ϕ′ ) for some ϕ′ ∈ V ∗ . Using Exercise 9.18 again, for each x ∈ V we have
ϕ′ (x) = ιV ∗ (ϕ′ )(ιV (x)) = ϕ(ιV (x)) = 0 by ϕ ∈ ιV (V )⊥ . Thus ϕ′ = 0, implying ϕ = 0, but this
is a contradiction. 

9.22 Remark For non-reflexive V none of the spaces V ∗ , V ∗∗ , V ∗∗∗ , . . . is reflexive, so that
V ⊊ V ∗∗ ⊊ V ∗∗∗∗ ⊊ · · · and V ∗ ⊊ V ∗∗∗ ⊊ V ∗∗∗∗∗ ⊊ · · · , and we have two somewhat mysterious
successions of ever larger spaces! There do not seem to be many general results about this, but
see Lemma B.25(iv). Even understanding C(X, R)∗∗ for compact X is complicated, cf. [81]. 2

9.23 Exercise Prove:


(i) Every finite-dimensional Banach space is reflexive.
(ii) Every Hilbert space is reflexive.
(iii) If 1 < p < ∞ then `p (S, F) is reflexive.
(iv) If S is infinite then c0 (S, F), `1 (S, F), `∞ (S, F) are not reflexive.

9.24 Exercise (i) Prove that if V is reflexive then for each ϕ ∈ V ∗ there exists an x ∈ V
such that kxk = 1 and |ϕ(x)| = kϕk. (We say ‘ϕ attains its norm’.)
(ii) Identify the ϕ ∈ c0 (N, F)∗ for which there exists x ∈ c0 (N, F) with kxk = 1 such that
ϕ(x) = kϕk. Conclude that such ϕ are dense in c0 (N, F)∗ .
(iii) Prove (again) that c0 (N, C) is not reflexive.

9.25 Remark 1. The converse of the statement in Exercise 9.24(i) is also true: If every ϕ ∈ V ∗
attains its norm, V is reflexive. But the proof, also due to R. C. James, is much harder and
more than 10 pages long! See [102, Section 1.13].
2. On the other hand, Bishop and Phelps57 proved that the result of Exercise 9.24(ii) holds
for every Banach space V , i.e. the set of ϕ ∈ V ∗ that attain their norm is dense in V ∗ . Cf. [16]
or [102, Section 2.11].
3. See Appendix B.6.8 for the notion of uniform convexity, which is stronger than the strict
convexity encountered earlier, and a proof of the fact that uniformly convex spaces are reflexive.
We will also prove that Lp (X, A, µ) is uniformly convex for each measure space
(X, A, µ) and 1 < p < ∞. This provides a proof of reflexivity of these spaces that does
not use the relation between Lp and Lq . This in turn leads to a simple proof of surjectivity of
the isometric map Lq → (Lp )∗ known from Section 4.7 (reversing the logic of Exercise 9.23(iii)).
2

9.26 Exercise Let V be a Banach space. Prove:


57 Errett Albert Bishop (1928-1983) and Robert Ralph Phelps (1926-2013), American functional analysts. Around 1965 Bishop became a strong advocate of and contributor to constructive mathematics.

(i) If V ∗ is separable then V is separable.
(ii) For V infinite-dimensional separable, V ∗ can be separable or non-separable. (Examples!)
(iii) If V is separable and reflexive then V ∗ is separable.

9.27 Theorem (Pettis)58 Let V be a Banach space and W ⊆ V a closed subspace. Then
the following are equivalent:
(i) V is reflexive.
(ii) W and V /W are reflexive.
Proof. We begin with some preparations. Since W ⊆ V is a closed subspace, W ⊥ ⊆ V ∗ is
a closed subspace, thus W ⊥⊥ is a closed subspace of V ∗∗ . Explicitly,

W ⊥⊥ = {ψ ∈ V ∗∗ | ϕ ∈ V ∗ , ϕ  W = 0 ⇒ ψ(ϕ) = 0}.    (9.3)

If w ∈ W and ψ = ιV (w) then for each ϕ ∈ V ∗ we have ψ(ϕ) = ιV (w)(ϕ) = ϕ(w). Thus if
ϕ  W = 0 then ψ(ϕ) = 0. This proves ιV (W ) ⊆ W ⊥⊥ .
By Exercise 9.15 (dual space of subspace) we have an isometric isomorphism W ∗ ∼= V ∗ /W ⊥ .
Now Exercise 6.7 (dual space of quotient space) gives an isometric isomorphism α : W ∗∗ →
W ⊥⊥ , where W ⊥⊥ ⊆ V ∗∗ . We thus have the situation in this diagram:


      W ⊂—————————→ V
      │              │
   ιW │              │ ιV                       (9.4)
      ↓              ↓
     W ∗∗ ——α——→ W ⊥⊥ ⊂—————————→ V ∗∗

Let w ∈ W . Now ιW (w) ∈ W ∗∗ and ιV (w) ∈ V ∗∗ are the linear functionals on W ∗ and V ∗ ,
respectively, given by evaluation at w. Thus for ϕ ∈ V ∗ we have ιV (w)(ϕ) = ϕ(w). On the
other hand, ϕ  W ∈ W ∗ , and ιW (w)(ϕ  W ) = (ϕ  W )(w) = ϕ(w), proving that the left triangle
of the diagram commutes.
(i)⇒(ii) Now assume that V is reflexive, so that ιV : V → V ∗∗ is a bijection. Thus every
ψ ∈ V ∗∗ is of the form ιV (v) for a unique v ∈ V . With this, (9.3) becomes

W ⊥⊥ = {ιV (v) | ϕ ∈ V ∗ , ϕ  W = 0 ⇒ ϕ(v) = 0} = ιV (W ),

where we used that for every v ∈ V \W there exists a ϕ ∈ V ∗ with ϕ  W = 0, ϕ(v) ≠ 0. Thus
ιV : W → W ⊥⊥ is a bijection. Since α is a bijection, also ιW : W → W ∗∗ is a bijection, thus W
is reflexive.
By Theorem 9.21, V ∗ is reflexive. Thus the closed subspace W ⊥ ⊆ V ∗ is reflexive by what
was just proven. Since W ⊥ ∼ = (V /W )∗ by Exercise 6.7, (V /W )∗ is reflexive, thus V /W is
reflexive using Theorem 9.21 again.
(ii)⇒(i) Let ψ ∈ V ∗∗ . Our aim is to find a v ∈ V such that ψ = ιV (v). We have a canonical
isomorphism β : (V /W )∗ → W ⊥ ⊆ V ∗ . Thus ψ ◦ β ∈ (V /W )∗∗ . Since V /W is reflexive, there
exists v + W ∈ V /W such that ιV /W (v + W ) = ψ ◦ β. Now for all ϕ ∈ W ⊥ ⊆ V ∗ we have

ψ(ϕ) = ψ ◦ β ◦ β −1 (ϕ) = ιV /W (v + W )(β −1 (ϕ)) = (β −1 (ϕ))(v + W ) = ϕ(v) = ιV (v)(ϕ).


58 Billy James Pettis (1913-1979), American mathematician who mostly worked in functional analysis.

Thus ψ − ιV (v) ∈ V ∗∗ vanishes on W ⊥ , so that ψ − ιV (v) ∈ W ⊥⊥ . Since W is reflexive, ιW is
a bijection. Together with the fact that α is a bijection, this implies that ιV : W → W ⊥⊥ is
a bijection, thus W ⊥⊥ = ιV (W ). Thus there exists w ∈ W such that ιV (w) = ψ − ιV (v), thus
ψ = ιV (v + w), proving surjectivity of ιV . Thus V is reflexive. 

In these notes, the Banach spaces C(b) (X, F), C0 (X, F) of continuous functions do not play
a very prominent role, since their study requires more general topology than the rest of our
subjects, and also measure theory for the dual spaces. But the following is not too difficult:

9.28 Exercise Let X be a normal (T4 ) topological space. Prove that the Banach space
Cb (X, F) is reflexive if and only if #X < ∞. Hint: If #X = ∞, pick distinct points {xn }n∈N , use
Urysohn’s lemma to produce functions fn ∈ C(X, [0, 1]) with disjoint supports and fn (xm ) =
δn,m . Use these to produce an embedding c0 ,→ Cb (X, F).

9.5 The transpose of a bounded Banach space operator


9.29 Definition If V, W are normed spaces over F, A : V → W is linear and ϕ ∈ W ∗ then
At ϕ := ϕ ◦ A : V → F is linear (and bounded if A is bounded). This defines a linear map
At : W ∗ → Lin(V, F), called the transpose of A. If A is bounded then At maps W ∗ → V ∗ .
Some authors call At the ‘adjoint’ of A, but we stick to transpose in order to avoid confusion
with the Hilbert space adjoint discussed below. Note that the transpose goes in the ‘opposite
direction’ !

9.30 Exercise If V, W, Z are normed spaces and A ∈ B(V, W ), B ∈ B(W, Z), prove (BA)t =
At B t in B(Z ∗ , V ∗ ).

9.31 Lemma If V, W are normed spaces over F and A : V → W is linear then kAt k = kAk.
Thus At is bounded if and only if A is bounded. The map B(V, W ) → B(W ∗ , V ∗ ), A 7→ At is
isometric.
Proof. The identity follows from the computation

kAk = supv∈V, kvk=1 kAvk = supkvk=1 supϕ∈W ∗ , kϕk=1 |ϕ(Av)| = supkϕk=1 supkvk=1 |ϕ(Av)| = supϕ∈W ∗ , kϕk=1 kAt ϕk = kAt k,

where the first and last identities are the definition of the norm, the second and fourth
follow from Proposition 9.9(ii), and the third is the exchangeability of two suprema. (Note that
we did not assume boundedness of A or At .) The rest is clear. 
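In finite dimensions the lemma is easy to visualize: identifying (Fn )∗ with Fn via the dual basis, At becomes the transposed matrix, and the lemma says that a matrix and its transpose have the same operator norm (the largest singular value). A small numerical sanity check, a sketch using numpy and a random matrix (the matrix and its shape are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))        # a bounded operator R^3 -> R^5

# Operator norm w.r.t. Euclidean norms = largest singular value;
# A and A^t have the same singular values.
norm_A = np.linalg.norm(A, 2)
norm_At = np.linalg.norm(A.T, 2)       # the transpose (R^5)* -> (R^3)*
assert np.isclose(norm_A, norm_At)     # kA^t k = kAk
```

The assertion holds for every matrix, since transposition does not change the set of singular values.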

Now we can use W ∗ to test a linear map A : V → W for boundedness:

9.32 Exercise Let V, W be normed spaces and A : V → W a linear map. Prove: A is bounded
if and only if At ϕ = ϕ ◦ A is bounded for all ϕ ∈ W ∗ . Hint: Use Lemma 9.31 and UBT or CGT.
The transposition operation can be iterated, giving Att ∈ B(V ∗∗ , W ∗∗ ), etc.

9.33 Lemma If V, W are normed spaces and A ∈ B(V, W ) then the diagram

            A
      V ---------> W
      |            |
   ιV |            | ιW
      v            v
     V ∗∗ -------> W ∗∗
           Att

commutes, thus considering V and W as subspaces of V ∗∗ , W ∗∗ , we have Att ↾ V = A.


Proof. Let v ∈ V, ϕ ∈ W ∗ . Then using the definition of ιV , ιW and of the transpose, we have

ιW (Av)(ϕ) = ϕ(Av) = (At ϕ)(v) = ιV (v)(At ϕ) = (Att ιV (v))(ϕ).

Now, ιW (Av) and (Att ιV (v)) are in W ∗∗ , and the fact that they coincide on all ϕ ∈ W ∗ means
ιW (Av) = Att ιV (v). And since this holds for all v ∈ V , we have ιW A = Att ιV , as claimed. 

9.34 Remark 1. If V and W are reflexive, we can identify V ∗∗ = V and W ∗∗ = W , obtaining


Att = A, so that B(V, W ) → B(W ∗ , V ∗ ), A 7→ At is a bijection. But in general, the transposition
map B(V, W ) → B(W ∗ , V ∗ ), A 7→ At is not surjective. For a characterization of the A ∈
B(W ∗ , V ∗ ) that are of the form A = B t with B ∈ B(V, W ) see Theorem 10.32(i).
2. If V, W are finite-dimensional, we know from linear algebra that A ∈ B(V, W ) is injective
(surjective) if and only if At ∈ B(W ∗ , V ∗ ) is surjective (injective). In infinite dimensions this
becomes more complicated since, as seen in Section 7.4, we must distinguish between injectivity
and boundedness below and between dense image and surjectivity. 2

9.35 Exercise Let V, W be Banach spaces and A ∈ B(V, W ). Prove:


(i) ker At = (AV )⊥ ⊆ W ∗ , thus At is injective if and only if A has dense image AV ⊆ W .
(ii) If A ∈ B(V, W ) is invertible then At ∈ B(W ∗ , V ∗ ) is invertible.
(iii) If At ∈ B(W ∗ , V ∗ ) is invertible then A ∈ B(V, W ) is invertible. (Warning: We don’t
assume reflexivity of the spaces involved!)
Hint: (i) and (ii) are very easy. The proof of (iii) uses (i) and (ii).

9.36 Exercise Let V, W be Banach spaces and A ∈ B(V, W ).


(i) Prove ker A = (At W ∗ )> .
(ii) Show that both (i) and Exercise 9.35(i) can be used to prove: If At has dense image then
A is injective.

9.37 Exercise Let V, W be Banach spaces and A ∈ B(V, W ). Prove:


(i) If A is surjective then At : W ∗ → V ∗ is bounded below (thus injective).
(ii) If A is bounded below then At : W ∗ → V ∗ is surjective.
(iii) Deduce (not assuming reflexivity!) that At is surjective if and only if A is bounded below.

9.38 Proposition Let V, W be Banach spaces and A ∈ B(V, W ). If At is bounded below then
A is surjective.

Proof. Let C > 0 be such that kAt ϕk ≥ Ckϕk for all ϕ ∈ W ∗ . Let Y ⊂ W be the norm closure
of AB V (0, 1); it is closed, convex and balanced. Thus for each z ∈ W \Y by Proposition B.64 there exists
ϕ ∈ W ∗ such that |ϕ(y)| ≤ 1 for all y ∈ Y and |ϕ(z)| > 1. By the first of these properties,
for all x ∈ B V (0, 1) we have |(At ϕ)(x)| = |ϕ(Ax)| ≤ 1, implying kAt ϕk ≤ 1. Thus with the
hypothesis,
C < C|ϕ(z)| ≤ Ckzkkϕk ≤ kzkkAt ϕk ≤ kzk.
By contraposition, if w ∈ W satisfies kwk ≤ C then w ∈ Y . In particular, B W (0, C) ⊆ Y , the
closure of AB V (0, 1). Now Proposition 7.4 gives the surjectivity of A. 
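In the finite-dimensional case the proposition (together with Exercise 9.37) collapses to a statement about singular values: At is bounded below exactly when its smallest singular value is positive, which happens exactly when A has full row rank, i.e. is surjective. A hedged numpy sketch (the random matrix and the tolerance 1e-10 are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))        # generically a surjection R^5 -> R^3

# A^t is bounded below iff its smallest singular value C is positive ...
C = np.linalg.svd(A.T, compute_uv=False).min()
assert C > 1e-10
# ... and then A is indeed surjective (full row rank):
assert np.linalg.matrix_rank(A) == A.shape[0]
```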

9.39 Remark Summarizing our findings, for A ∈ B(V, W ) we have


A has dense image ⇔ At is injective
A is bounded below ⇔ At is surjective
A is surjective ⇔ At is bounded below
A is injective ⇐ At has dense image
From this, we easily deduce that Att is surjective (resp. bounded below) if and only if A
is surjective (resp. bounded below). And if Att is injective then so is A, but this follows more
directly from Lemma 9.33.
As to the last row, we will see that the condition on At equivalent to injectivity of A is
somewhat weaker than (norm-)density of At W ∗ , cf. Theorem 10.32(i). 2

9.40 Proposition Let V, W be Banach spaces and A ∈ B(V, W ).


(i) If A has closed image AV then At W ∗ = (ker A)⊥ ⊆ V ∗ , thus At has closed image.
(ii) If At has closed image, so has A.
Proof. (i) is Exercise 9.41 below. (ii) Assume At has closed image. Following [141], define
Z ⊆ W to be the norm closure of the image AV and define A0 : V → Z in the obvious way. Then A0 has dense image, so that
(A0 )t : Z ∗ → V ∗ is injective by Exercise 9.35. By Hahn-Banach, every ϕ ∈ Z ∗ has an extension
ϕ̂ ∈ W ∗ . Now for every v ∈ V we have (At ϕ̂)(v) = ϕ̂(Av) = ϕ(Av) = ((A0 )t ϕ)(v), implying
At ϕ̂ = (A0 )t ϕ. Thus At and (A0 )t have the same images in V ∗ . Since At W ∗ ⊆ V ∗ is closed
by assumption, also (A0 )t Z ∗ ⊆ V ∗ is closed, thus complete. As an injective map with closed
image, (A0 )t : Z ∗ → (A0 )t Z ∗ is invertible by the BIT, thus bounded below. Now Proposition
9.38 (applied to A0 ) gives that A0 : V → Z is surjective. Since Z is closed by definition,
AV = A0 V = Z is closed. 
9.41 Exercise Prove (i) of Proposition 9.40. Hint: Factorize A as V −→ V /ker A −→ AV −→ W
(call the three maps A1 , A2 , A3 ) and use the BIT and HB theorems.

9.42 Exercise If V is a reflexive Banach space and W ≅ V (not necessarily isometrically), prove
that W is reflexive.

10 ⋆ Duality: Weak and weak-∗ topologies


Every Banach space has a canonical metric topology defined by the norm. But it also admits
an equally canonical weaker topology, the weak topology. And on the dual space V ∗ of a
Banach space V there is yet another topology, the weak-∗ topology. These topologies have
many applications in functional analysis, cf. e.g. [30], but we will discuss only a few, most
importantly:

• characterization of reflexive Banach spaces: Theorem 10.15.
• applications to > and transposes At of operators: Proposition 10.31, Theorem 10.32.
• characterizations of compact operators: Proposition 12.25 and Theorem 12.28.
• a (locally) compact topology relevant for the theory of commutative Banach algebras:
Section 19.

10.1 The weak topology of a Banach space


10.1 Definition If V is a normed space, the weak topology τw is the topology on V induced
by the family of seminorms F = {k · kϕ = |ϕ(·)| | ϕ ∈ V ∗ }. Thus a net {xι } ⊆ V converges
weakly to x ∈ V if and only if ϕ(xι ) → ϕ(x) for all ϕ ∈ V ∗ .
The weak topology is also called the σ(V, V ∗ )-topology (the topology on V induced by the
linear functionals in V ∗ ). The Hahn-Banach theorem immediately gives that F is separating, so
that this topology is locally convex. It is clear that a norm-convergent net is weakly convergent
since |ϕ(xι ) − ϕ(x)| ≤ kϕkkxι − xk. This implies that the norm closure of every S ⊆ V is contained in its weak closure, and τw ⊆ τk·k .
If (H, h·, ·i) is a Hilbert space, Theorem 5.34 implies F = {|h·, yi| | y ∈ H}, so that weak
convergence of a net {xι } in a Hilbert space H is equivalent to convergence of the nets {hxι , yi}
for each y ∈ H.

Since the norm and weak topologies on a normed space are Hausdorff, Proposition 2.29
implies that they coincide if the space is finite-dimensional. On the other hand:

10.2 Proposition Let V be an infinite-dimensional normed space. Then the weak topology
τw on V is strictly weaker than the norm-topology and not first countable. In particular (V, τw )
is neither normable nor Fréchet nor an F-space.
Proof. By the definition of τw , for every weakly open neighborhood U of 0 there are ϕ1 , . . . , ϕn ∈
V ∗ such that {x ∈ V | |ϕi (x)| < 1 ∀i = 1, . . . , n} ⊆ U . Thus U contains the linear subspace
W = ϕ1 −1 (0) ∩ · · · ∩ ϕn −1 (0) ⊆ V , whose codimension is ≤ n. Thus if V is infinite-dimensional then
dim W is infinite, thus non-zero. On the other hand, it is clear that the (norm-)open ball
B(0, 1) contains no linear subspace of dimension > 0. Thus B(0, 1) ∉ τw . Since τw ⊆ τk·k was
clear, we have τw ⊊ τk·k .
If we assume that τw is first countable, 0 ∈ V has a countable open neighborhood base
{Un }n∈N . Replacing Un by U1 ∩ · · · ∩ Un , we may assume U1 ⊇ U2 ⊇ · · · . As seen above, being
weakly open, each Un contains a non-zero linear subspace Vn . For each n we can pick a non-zero xn ∈ Vn .
If now ϕ ∈ V ∗ and ε > 0 are arbitrary, U = {x ∈ V | |ϕ(x)| < ε} is a weakly open neighborhood
of 0. Since {Un } is a shrinking weak neighborhood base of 0, there exists n0 such that for all
n ≥ n0 we have xn ∈ Vn ⊆ Un ⊆ U , thus |ϕ(xn )| < ε, implying ϕ(xn ) → 0. Since ϕ ∈ V ∗ was
arbitrary, we have proven that xn → 0 weakly. Since this holds for every choice of {xn ∈ Vn } and the Vn
are vector spaces, we can choose {xn } such that kxn k → ∞. But this contradicts the fact that
every weakly convergent sequence is norm-bounded, cf. Exercise 10.6 below. This contradiction
shows that τw is not first countable.
Now the last statement is trivial since for a TVS we have the implications normable ⇒
Fréchet ⇒ F-space ⇒ metrizable ⇒ first countable. 

10.3 Exercise (i) Prove that the sequence {δn }n∈N has no weak limit in `1 (N, F).

(ii) Let 1 < p < ∞. Prove that the sequence {δn }n∈N ⊆ `p (N, F) converges to zero weakly, but
not in norm.
(iii) If fn → g weakly in a Hilbert space and kfn k → kgk, prove that kfn − gk → 0.
(iv) (Bonus) Prove the result of (iii) for sequences in (`p (N, F), k · kp ), where 1 < p < ∞.59
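A finite truncation makes part (ii) concrete: pairing δn against a fixed functional ϕy given by some y ∈ `q yields values tending to zero, while kδn kp = 1 for all n and all p. A minimal numpy sketch, with the test functional y = (1/k)k∈N an arbitrary illustrative choice:

```python
import numpy as np

N = 10_000
y = 1.0 / np.arange(1, N + 1)      # y lies in l^q for every q > 1, so phi_y is in (l^p)*

# phi_y(delta_n) = y_n -> 0, although k delta_n k_p = 1 for every n and every p:
pairings = [y[n - 1] for n in (1, 10, 100, 1000)]
assert np.allclose(pairings, [1.0, 0.1, 0.01, 0.001])
```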
The deviant behavior of `1 in the preceding exercise can be understood as a consequence of
the following surprising result:
10.4 Theorem (I. Schur 1920)60 If g, {fn }n∈N ⊆ `1 (N, F) and fn → g weakly then kfn − gk1 → 0.

10.5 Remark 1. Like the uniform boundedness theorem, this result can be proven using a
beautiful gliding hump argument, cf. Section B.3.5, or using Baire’s theorem.
2. Theorem 10.4 does not generalize to nets since the weak and norm topologies on `1 (N, F)
differ by Proposition 10.2 and nets can distinguish topologies, cf. e.g. [108, Section 5.1].
3. Banach spaces in which weak and norm convergence of sequences are equivalent are said
to have the Schur property. All finite-dimensional spaces have it. See also Remark 12.31.1. 2

10.6 Exercise Prove that every weakly convergent sequence in a normed space is norm-
bounded. Hint: Uniform boundedness theorem. (This does not generalize to nets!)

10.7 Exercise Prove that every weakly compact subset of a Banach space is norm-bounded.

10.8 Exercise Let V be a Banach space. Prove that the (norm) closed unit ball V≤1 is also
weakly closed. Hint: Hahn-Banach.61
Given a linear map A : E → F between Banach spaces, one can consider its continuity w.r.t.
different pairs of topologies on E and F : Norm-norm continuity (w.r.t. the norm topologies on
both spaces), weak-norm continuity (the weak topology on E, the norm topology on F ) and,
similarly, norm-weak and weak-weak continuity. These notions are not all distinct:

10.9 Exercise Let V, W be Banach spaces and A : V → W a linear map. Prove that the
following are equivalent:
(i) A is norm-norm continuous (equivalently, bounded).
(ii) A is norm-weak continuous.
(iii) A is weak-weak continuous.

10.10 Exercise With the same assumptions as above, prove that the following are equivalent:
(i) A is weak-norm continuous.
(ii) There are ϕ1 , . . . , ϕK ∈ V ∗ and y1 , . . . , yK ∈ W such that Ax = ϕ1 (x)y1 + · · · + ϕK (x)yK ∀x ∈ V .
(iii) A is bounded and AV ⊆ W is finite-dimensional (i.e. A has ‘finite rank’).
59 More generally, this implication holds for all uniformly convex Banach spaces, cf. e.g. [86, Proposition 9.11], [23,
Sect. 3.7]. The Lp -spaces with 1 < p < ∞ are uniformly convex, cf. Section B.6.8.
60 Issai Schur (1874-1941). Russian mathematician. Studied and worked in Germany up to his emigration to Israel
in 1939. Mostly known for his work in group and representation theory.
61 More generally, every norm-closed convex set is weakly closed. But this is a bit harder.
10.11 Exercise If V is a normed space and A ∈ B(V ) has finite rank, prove that the trace
(as known from linear algebra) of A ↾ AV ∈ End(AV ) coincides with ϕ1 (y1 ) + · · · + ϕK (yK ) for any
representation A = ϕ1 (·)y1 + · · · + ϕK (·)yK .
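In coordinates, a finite-rank operator A = ϕ1 (·)y1 + · · · + ϕK (·)yK is the sum of the outer products of the yk with the ϕk , and since the image of A is contained in AV , the trace over the whole space equals the trace of A ↾ AV . A numerical sketch with numpy (the random choices of ϕk and yk are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
K, dim = 3, 6
ys = rng.standard_normal((K, dim))
phis = rng.standard_normal((K, dim))            # phi_k(x) = phis[k] . x

# A x = sum_k phi_k(x) y_k   <->   matrix A = sum_k outer(y_k, phi_k)
A = sum(np.outer(ys[k], phis[k]) for k in range(K))

trace_formula = sum(phis[k] @ ys[k] for k in range(K))
assert np.isclose(np.trace(A), trace_formula)   # tr(A) = sum_k phi_k(y_k)
```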

10.12 Definition If V is a Banach space over F ∈ {R, C}, a sequence {xn } ⊂ V is called
weakly Cauchy if the sequence {ϕ(xn )} ⊂ F converges for every ϕ ∈ V ∗ . (Equivalently, {xn } is
Cauchy in the sense of Remark 2.12.2 in the locally convex space (V, τw ).)

10.13 Exercise Prove that in a Banach space one has:


(i) Every weakly convergent sequence is weakly Cauchy.
(ii) Every weakly Cauchy sequence is norm-bounded.
(iii) Every weakly Cauchy sequence in a reflexive Banach space is weakly convergent.
(iv) Give an example of a sequence in a Banach space that is weakly Cauchy, but not weakly
convergent. Hint: Try c0 .

10.14 Exercise Let V be a Banach space over F ∈ {R, C}. Prove:


(i) If V ∗ is separable (thus also V by Exercise 9.26) then every norm-bounded sequence in V
has a weakly Cauchy subsequence.
(ii) If V is reflexive and separable then every norm-bounded sequence in V has a weakly
convergent subsequence.
(iii) Statement (ii) remains true without the separability hypothesis.
(iv) (Bonus) Show that the sequence {δn }n∈N in `1 (N, F) does not have a weakly Cauchy
subsequence.62
The above (iii) means that every norm-bounded weakly closed subset of a reflexive Banach
space is sequentially compact in the weak topology. (The converse also holds, see below.)
Replacing sequential compactness by compactness we have the following, proven later:

10.15 Theorem Let V be a Banach space. Then the following are equivalent:
(i) V≤1 is compact w.r.t. the weak topology.
(ii) V is reflexive. (⇔ V ∗ is reflexive by Theorem 9.21.)

10.16 Remark 1. If V is a finite-dimensional normed space then V≤1 is compact w.r.t. the
norm topology. For infinite-dimensional V this is false, as we will see in Section 12.1. But since
the weak topology is weaker than the norm topology, a set can be weakly compact even though
it is not norm compact.
2. For metric spaces, compactness and sequential compactness are equivalent, but the two
properties are independent for general topological spaces. Despite the fact that weak topologies
on infinite-dimensional Banach spaces are metrizable at best on bounded subsets (under sepa-
rability assumptions), there is the Eberlein-Šmulyan63 Theorem: For subsets of a Banach space
weak compactness and weak sequential compactness are equivalent. (⇒ is not very difficult.)
For proofs see more advanced texts or [173]. 2

62 In fact, given a Banach space V , every bounded sequence in V has a weakly Cauchy subsequence if and only if V
has no subspace isomorphic to `1 . This deep result is due to H. Rosenthal (1974). See e.g. [1, Chapter 11], [97, Vol.
1, Sect. [Link]].
63 William Frederick Eberlein (1917-1986), American mathematician who worked in functional analysis, topology
and related areas. Vitold Lvovich Šmulyan (1914-1944), Soviet mathematician, also known for the Krein-Šmulyan
theorem. Killed while fighting in WW2.

10.17 Exercise If H is a Hilbert space, prove that every orthonormal sequence {en }n∈N in H
converges weakly to zero.

10.2 The weak operator topology on B(V )


In Remark 8.10 we have encountered the strong (operator) topology on B(V ): A net {Aι } ⊆
B(V ) converges strongly to A ∈ B(V ) if k(Aι − A)xk → 0 for all x ∈ V . Now we can have a
brief look at the weak operator topology:

10.18 Definition Let V be a Banach space. The weak operator topology τwot on B(V ) is
generated by the family F = {k · kx,ϕ : A 7→ |ϕ(Ax)| | x ∈ V, ϕ ∈ V ∗ } of seminorms. Thus
{Aι } ⊆ B(V ) converges to A ∈ B(V ) w.r.t. τwot if and only if ϕ((Aι − A)x) → 0 for all
x ∈ V, ϕ ∈ V ∗ , i.e. {Aι x} ⊆ V converges weakly to Ax for all x ∈ V . The family F is
separating, so that τwot is Hausdorff. We write Aι → A (wot) or A = wot-lim Aι .
There is little risk of confusing the weak topology on V with the weak operator topology on
B(V ). But one might confuse the latter with the weak topology that B(V ) has as a Banach
space, in particular since the functionals A 7→ ϕ(Ax) underlying the above k·kx,ϕ are in B(V )∗ ! However, when V is infinite-dimensional
these functionals do not exhaust (or span) the bounded linear functionals on B(V ), so that the
weak operator topology on B(V ) is strictly weaker than the weak topology!

10.19 Exercise Let H be a Hilbert space.


(i) Given A, {Aι } ⊆ B(H), prove that Aι → A w.r.t. τwot if and only if hAι x, yi → hAx, yi for all
x, y ∈ H.
(ii) Prove that the map (B(H), τwot ) → (B(H), τwot ), A 7→ A∗ is continuous.
(iii) Prove that the map (B(H), τsot ) → (B(H), τsot ), A 7→ A∗ is not continuous if dim H = ∞.

10.3 The weak-∗ topology on a dual space. Alaoglu’s theorem


10.20 Definition If V is a Banach space, the weak-∗ topology τw∗ (or σ(V ∗ , V )-topology) is
the topology on the dual space V ∗ defined by the family F = {k · kx | x ∈ V } of seminorms,
where kϕkx = |x̂(ϕ)| = |ϕ(x)|. Thus a net {ϕι } in V ∗ converges to ϕ ∈ V ∗ if and only if
ϕι (x) → ϕ(x) for every x ∈ V .

10.21 Remark 1. Since ϕ(x) = 0 for all x ∈ V means ϕ = 0, F is separating, thus the
σ(V ∗ , V )-topology is Hausdorff and therefore locally convex.
2. If V is infinite-dimensional, the weak-∗ topology τw∗ is neither normable nor metrizable.
3. Since the weak-∗ topology is induced by the linear functionals on V ∗ of the form x̂,
which constitute a subset of V ∗∗ , it is weaker than the weak topology, thus also weaker than
the norm topology: τw∗ ⊆ τw ⊆ τk·k . As we know, the second inclusion is proper whenever V is
infinite-dimensional. For the first, we have: 2

10.22 Proposition If V is a Banach space, the weak-∗ topology σ(V ∗ , V ) on V ∗ coincides


with the weak topology σ(V ∗ , V ∗∗ ) if and only if V is reflexive.

Proof. If V is reflexive then V ∗∗ = ιV (V ), so that the weak-∗ topology σ(V ∗ , V ) on V ∗ coincides
with the weak topology σ(V ∗ , V ∗∗ ). If V is not reflexive, we have V ⊊ V ∗∗ . Now for ψ ∈ V ∗∗ \V
it is clear that the linear functional ψ on V ∗ is σ(V ∗ , V ∗∗ )-continuous, whereas Exercise 10.23
gives that it is not σ(V ∗ , V )-continuous. This proves σ(V ∗ , V ) ≠ σ(V ∗ , V ∗∗ ). 

10.23 Exercise Let V be an F-vector space with algebraic dual space V ⋆ .


(i) For ϕ, ψ1 , . . . , ψn ∈ V ⋆ prove that ϕ ∈ spanF {ψ1 , . . . , ψn } ⇔ ker ψ1 ∩ · · · ∩ ker ψn ⊆ ker ϕ. Hint:
Use the map V → Fn , x 7→ (ψ1 (x), . . . , ψn (x)).
(ii) Let W ⊆ V ⋆ be a linear subspace. Prove that a linear functional ϕ : V → F is σ(V, W )-
continuous if and only if ϕ ∈ W . Hint: Use (i).

10.24 Lemma Let V be a Banach space over F. Then


(i) (V, τw )∗ = V ∗ . (I.e., if ϕ : V → F is linear, norm and weak continuity are equivalent.)
(ii) The continuous linear functionals on the locally convex space (V ∗ , τw∗ ) are precisely the
functionals x̂ : ϕ 7→ ϕ(x) for some x ∈ V .
Proof. Since the weak topology on V and the weak-∗ topology on V ∗ are the σ(V, V ∗ ) and
σ(V ∗ , V ) topologies, respectively, both claims are immediate consequences of Exercise 10.23(ii).


10.25 Remark Before we proceed, some comments are in order: While the norm and weak
topologies are defined for each Banach space, the weak-∗ topology is defined only on spaces
that are the dual space V ∗ of a given space V . There are Banach spaces, like c0 (N, F), that are
not isomorphic (isometrically or not) to the dual space of any Banach space, cf. Corollary B.26.
And there are non-isomorphic Banach spaces with isomorphic dual spaces, cf. Corollary B.27.
Thus to define the weak-∗ topology on a Banach space V , it is not enough just to know that
the latter is a dual space. We must choose a ‘pre-dual’ space W such that W ∗ ≅ V . 2

Recall that V≤1 is norm compact if and only if V is finite-dimensional and weakly compact
if and only if V is reflexive. The weak-∗ topology on the dual of a non-reflexive Banach space
V is strictly weaker than the weak topology, so that the closed unit ball of V ∗ has a chance of
being weak-∗ compact, and in fact this is the case unconditionally:

10.26 Theorem (Alaoglu’s theorem (1940))64 If V is a Banach space then the (norm-)closed
unit ball (V ∗ )≤1 = {ϕ ∈ V ∗ | kϕk ≤ 1} is compact in the σ(V ∗ , V )-topology.
Proof. Define

Z = ∏x∈V {z ∈ C | |z| ≤ kxk},

equipped with the product topology. Since the closed discs in C are compact, Z is compact by
Tychonov’s theorem. If ϕ ∈ (V ∗ )≤1 then |ϕ(x)| ≤ kxk ∀x, so that we have a map

f : (V ∗ )≤1 → Z, ϕ 7→ (ϕ(x))x∈V .

Since the map ϕ 7→ ϕ(x) is continuous for each x, f is continuous (w.r.t. the weak-∗ topology
on (V ∗ )≤1 ). It is trivial that V separates the points of V ∗ , thus f is injective. By definition,
a net {ϕι } in (V ∗ )≤1 converges in the σ(V ∗ , V )-topology if and only if ϕι (x) converges for all
x ∈ V , and therefore if and only if f (ϕι ) converges. Thus f : (V ∗ )≤1 → f ((V ∗ )≤1 ) ⊆ Z is a
homeomorphism.
Now let z ∈ Z be in the closure of f ((V ∗ )≤1 ). Clearly, |zx | ≤ kxk ∀x ∈ V . By Proposition A.12.2 there is
a net in f ((V ∗ )≤1 ) converging to z and therefore a net {ϕι } in (V ∗ )≤1 such that f (ϕι ) → z.
This means ϕι (x) → zx ∀x ∈ V . In particular ϕι (αx + βy) → zαx+βy , while also ϕι (αx + βy) =
αϕι (x) + βϕι (y) → αzx + βzy . Thus the map ψ : V → C, x 7→ zx is linear with kψk ≤ 1, to wit
ψ ∈ (V ∗ )≤1 and z = f (ψ). Thus the closure of f ((V ∗ )≤1 ) is contained in f ((V ∗ )≤1 ), so that f ((V ∗ )≤1 ) ⊆ Z is closed.
Now we have proven that (V ∗ )≤1 is homeomorphic to the closed subset f ((V ∗ )≤1 ) of the
compact space Z, and therefore compact. 

64 Leonidas Alaoglu (1914-1981). Greek mathematician. (Earlier versions due to Helly and Banach.)

10.27 Remark We deduced Alaoglu’s theorem from Tychonov’s theorem, which is known to be
equivalent to the axiom of choice (AC). But we only needed Tychonov as restricted to Hausdorff
spaces, and the converse also holds. See Appendix B.5 where we also prove equivalence of these
statements to the Ultrafilter Lemma (UL), a set theoretic axiom that is known to be strictly
weaker than AC and its equivalents. UL also implies the Hahn-Banach theorem. 2

10.28 Exercise Use Alaoglu’s theorem to prove that every Banach space V over F admits a
linear isometric bijection onto a closed subspace of C(X, F) for some compact Hausdorff space
X.

10.29 Exercise (i) Use Alaoglu’s theorem to prove (ii)⇒(i) in Theorem 10.15.
(ii) Conclude that the closed unit ball of every Hilbert space is weakly compact.
(iii) Prove σ(V, V ∗ ) = σ(V ∗∗ , V ∗ ) ↾ V .
(iv) Use Theorem 10.30 and (iii) to prove (i)⇒(ii) in Theorem 10.15.

10.30 Theorem (Goldstine) 65 If V is Banach then V≤1 is σ(V ∗∗ , V ∗ )-dense in (V ∗∗ )≤1 .


The fairly non-trivial proof is relegated to the supplementary Section B.6.3.
We close by applying weak-∗ topologies to Φ> and to transposes of operators:

10.31 Proposition Let V be a Banach space and W ⊆ V, Φ ⊆ V ∗ linear subspaces. Denote by
Φ̄w∗ the weak-∗ closure of Φ. Then

(i) W ⊥ ⊆ V ∗ is weak-∗ closed.
(ii) (Φ̄w∗ )> = Φ> .
(iii) (Φ> )⊥ = Φ̄w∗ .
(iv) Φ> = {0} holds if and only if Φ ⊆ V ∗ is weak-∗ dense.
Proof. (i) Let ϕ, {ϕι }ι∈I ⊆ W ⊥ such that ϕι → ϕ in the weak-∗ topology. This means ϕι (x) → ϕ(x) ∀x ∈ V . Thus
ϕ(w) = lim ϕι (w) = 0 for all w ∈ W , proving ϕ ∈ W ⊥ .
(ii) Since the weak-∗ closure Φ̄w∗ of Φ contains Φ, it is clear that (Φ̄w∗ )> ⊆ Φ> . Let x ∈ Φ> . If ϕ ∈ Φ̄w∗ then there is a
net {ϕι } in Φ such that ϕι → ϕ in the weak-∗ topology, thus ϕι (x) → ϕ(x). In view of x ∈ Φ> we have ϕι (x) = 0 for
all ι, thus ϕ(x) = 0. Thus x ∈ (Φ̄w∗ )> , proving the missing inclusion Φ> ⊆ (Φ̄w∗ )> .
65 Herman Heine Goldstine (1913-2004). American mathematician and computer scientist. Worked on very pure
and very applied mathematics, like John von Neumann, with whom he collaborated on computers.
(iii) In view of (ii), (Φ> )⊥ = ((Φ̄w∗ )> )⊥ , where Φ̄w∗ is the weak-∗ closure of Φ. Thus it suffices to prove (Φ> )⊥ = Φ for weak-∗
closed Φ. Since it is clear that Φ ⊆ (Φ> )⊥ , it remains to prove the converse inclusion. If ϕ0 ∈
(Φ> )⊥ \Φ, Corollary B.61, applied to the locally convex space (V ∗ , τw∗ ), gives a ψ ∈ (V ∗ , τw∗ )∗
such that ψ ↾ Φ = 0 and ψ(ϕ0 ) ≠ 0. Now by Lemma 10.24 there is a unique x ∈ V such that
ψ(ϕ) = ϕ(x) for all ϕ ∈ V ∗ . Clearly x ≠ 0. In view of ψ ↾ Φ = 0 we have ϕ(x) = 0 ∀ϕ ∈ Φ, thus
x ∈ Φ> . With ϕ0 ∈ (Φ> )⊥ this implies ψ(ϕ0 ) = ϕ0 (x) = 0, which is a contradiction.
(iv) If Φ is weak-∗ dense, i.e. Φ̄w∗ = V ∗ , then (ii) implies Φ> = (Φ̄w∗ )> = (V ∗ )> = {0}. And if Φ> = {0} then (iii)
gives Φ̄w∗ = ({0})⊥ = V ∗ . 

10.32 Theorem Let V, W be Banach spaces.


(i) If A ∈ B(V, W ) then A is injective if and only if At ∈ B(W ∗ , V ∗ ) has weak-∗ dense image,
i.e. the weak-∗ closure of At W ∗ is all of V ∗ . (A priori this condition is weaker than norm-density!)
(ii) If A ∈ B(W ∗ , V ∗ ) then there exists B ∈ B(V, W ) such that A = B t if and only if A is
weak-∗-weak-∗ continuous, i.e. continuous as a map (W ∗ , τw∗ ) → (V ∗ , τw∗ ).
Proof. (i) This is immediate by combining Exercise 9.36(i) with Proposition 10.31(iv).
(ii) Let B ∈ B(V, W ), and let {ϕι } be a net in W ∗ that converges to ϕ ∈ W ∗ in the weak-∗
topology, i.e. ϕι (w) → ϕ(w) for all w ∈ W . If v ∈ V then (B t ϕι )(v) = ϕι (Bv) → ϕ(Bv) =
(B t ϕ)(v), proving that B t ϕι → B t ϕ in the weak-∗ topology, so that indeed B t is weak-∗-weak-∗ continuous.
Now assume A ∈ B(W ∗ , V ∗ ) is weak-∗-weak-∗ continuous. Then for each v ∈ V the linear
functional W ∗ → F, ϕ 7→ (Aϕ)(v) is weak-∗ continuous, thus by Lemma 10.24 there is a unique
w ∈ W such that (Aϕ)(v) = ϕ(w). This defines a map B : V → W such that (Aϕ)(v) = ϕ(Bv)
for all v ∈ V, ϕ ∈ W ∗ . Thus (Aϕ)(v) = (B t ϕ)(v) for all v, ϕ, to wit A = B t . Now Lemma 9.31
gives kBk = kB t k = kAk < ∞. 

10.33 Remark If V is reflexive, the weak and weak-∗ topologies on V ∗ coincide by Proposition
10.22, so that with Corollary B.63 a linear subspace of V ∗ is norm-dense if and only if it is
weak-∗ dense. Thus for reflexive V the conditions of weak-∗ density in Proposition 10.31(iv)
and Theorem 10.32(i) reduce to norm-density. 2

11 Hilbert space operators and their adjoints. Special classes of operators

In Section 9.5 we defined and studied the transpose At ∈ B(F ∗ , E ∗ ) of an operator A ∈ B(E, F )
between Banach spaces. For Hilbert spaces, we have natural identifications H ≅ H ∗ , which leads
to new aspects that give the theory of operators between Hilbert spaces a special flavor.

11.1 The adjoint of a bounded Hilbert space operator


If H is a Hilbert space then we have a canonical map γH : H → H ∗ given by y 7→ ϕy = h·, yi.
This map is antilinear and isometric, and by the representation Theorem 5.34 it is a bijection.
This bijection in a sense makes the dual spaces of Hilbert spaces redundant to a large extent,
so that it is desirable to eliminate them from considerations of the transpose:

11.1 Proposition Let H1 , H2 be Hilbert spaces. For A ∈ B(H1 , H2 ), define the Hilbert space
adjoint A∗ : H2 → H1 as the composite map H2 → H2∗ → H1∗ → H1 of γH2 , At and γH1 −1 ,
i.e. A∗ := γH1 −1 ◦ At ◦ γH2 .
Now
(i) The map A∗ : H2 → H1 is linear and bounded, thus in B(H2 , H1 ).
(ii) The map B(H1 , H2 ) → B(H2 , H1 ), A 7→ A∗ is anti-linear.
(iii) For all x ∈ H1 , y ∈ H2 we have hAx, yi2 = hx, A∗ yi1 .
Proof. (i) Linearity of A∗ : H2 → H1 follows from its being the composite of the linear map At
with the two anti-linear maps γH2 and γH1 −1 . Boundedness follows from kAt k = kAk.

(ii) Additivity of A 7→ A∗ is obvious. Let A ∈ B(H1 , H2 ), c ∈ C, x ∈ H2 . Then

(cA)∗ (x) = (γH1 −1 ◦ (cA)t ◦ γH2 )(x) = γH1 −1 (cAt (γH2 (x))) = c̄ γH1 −1 (At (γH2 (x))) = c̄ A∗ (x),

where we used the linearity of A 7→ At and the anti-linearity of γH1 −1 . This shows (cA)∗ = c̄ A∗ . (The
anti-linearity of γH2 is irrelevant here.)
(iii) If y ∈ H2 then γH2 (y) ∈ H2∗ is the functional h·, yi2 . Then (At ◦ γH2 )(y) ∈ H1∗ is the
functional x 7→ hAx, yi2 . Thus z = A∗ y = (γH1 −1 ◦ At ◦ γH2 )(y) ∈ H1 is a vector such that
hx, zi1 = hAx, yi2 for all x ∈ H1 . This means hx, A∗ yi1 = hAx, yi2 ∀x ∈ H1 , y ∈ H2 , as claimed.

11.2 Remark Combining (iii) above with the Hellinger-Toeplitz theorem (Corollary 7.32), we
see that a linear map A : H1 → H2 has a Hilbert space adjoint if and only if it is bounded. 2

There is a very useful bijection between bounded operators and bounded sesquilinear forms.
It can be used to give an alternative (at least in appearance) construction of the adjoint A∗
(and for many other purposes). It is based on the following observation: If A ∈ B(H) satisfies
hAx, yi = 0 for all x, y ∈ H then Ax = 0 for all x, thus A = 0. Applying this to A − B shows
that hAx, yi = hBx, yi ∀x, y implies A = B. Thus bounded operators are distinguished by their
‘matrix elements’ hAx, yi. This motivates the following developments.

11.3 Definition Let V be an F-vector space. A map V × V → F, (x, y) 7→ [x, y] is called
sesquilinear if it is linear w.r.t. x and anti-linear w.r.t. y. A sesquilinear form [·, ·] is bounded if
sup{|[x, y]| : kxk = kyk = 1} < ∞.

11.4 Remark Recall that the inner product h·, ·i on a (pre-)Hilbert space is sesquilinear and
bounded by Cauchy-Schwarz. If F = R, the definition of course reduces to bilinearity. 2

11.5 Proposition Let H be a Hilbert space. Then there is a bijection between B(H) and the
set of bounded sesquilinear forms on H, given by B(H) ∋ A 7→ [·, ·]A , where [x, y]A = hAx, yi.
Proof. Let A ∈ B(H). Sesquilinearity of [x, y]A = hAx, yi is an obvious consequence of sesquilin-
earity of h·, ·i and linearity of A, and boundedness follows from Cauchy-Schwarz:

|[x, y]A | = |hAx, yi| ≤ kAxkkyk ≤ kAkkxkkyk ∀x, y.

Now let [·, ·] be a sesquilinear form bounded by M . Then for each x ∈ H, the map ψx : H →
C sending y to the complex conjugate of [x, y] is linear (thanks to the complex conjugation) and satisfies |ψx (y)| ≤ M kykkxk,
thus ψx ∈ H ∗ . Thus by Theorem 5.34 there is a unique vector zx ∈ H such that ψx = ϕzx , i.e.
the complex conjugate of [x, y] equals ϕzx (y) = hy, zx i for all y. Taking complex conjugates, hzx , yi = [x, y] ∀y. Thus
defining A : H → H by Ax = zx ∀x we have hAx, yi = [x, y] ∀x, y. Since the maps x 7→ ψx and
ψx 7→ zx are both anti-linear, their composite A is linear. And since H → H ∗ , z 7→ ϕz is an
isometry, we have kAxk = kzx k = kϕzx k = kψx k ≤ M kxk, thus A ∈ B(H). 
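In finite dimensions the bijection is transparent: the form determines the operator through its matrix elements, Aij = [ej , ei ]A for standard basis vectors ei . A hedged numpy illustration (random data; with the convention that h·, ·i is linear in the first and anti-linear in the second argument, hu, vi corresponds to np.vdot(v, u), since np.vdot conjugates its first argument):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def form(x, y):                  # [x, y]_A = <Ax, y>, with <u, v> = np.vdot(v, u)
    return np.vdot(y, A @ x)

# Recover A from the matrix elements of its form: A_ij = [e_j, e_i]_A
E = np.eye(n)
A_rec = np.array([[form(E[j], E[i]) for j in range(n)] for i in range(n)])
assert np.allclose(A_rec, A)
```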

11.6 Proposition Let H be a Hilbert space and A ∈ B(H). Then


(i) There is a unique B ∈ B(H) such that

hAx, yi = hx, Byi ∀x, y ∈ H.

This B is denoted A∗ and called the adjoint of A.


(ii) In particular A = A∗ if and only if hAx, yi = hx, Ayi ∀x, y ∈ H, in which case we call A
self-adjoint.
Proof. (i) The map (y, x) 7→ hy, Axi is sesquilinear and bounded (by kAk). Thus by Proposition
11.5 there is a bounded B ∈ B(H) such that hBy, xi = hy, Axi ∀x, y. Taking complex
conjugates gives hx, Byi = hAx, yi, which is the wanted identity. Now (ii) is
obvious. 

11.7 Remark 1. In view of the identity hAx, yi = hx, A∗ yi satisfied by the adjoint as defined
above and Proposition 11.1(iii) (with H1 = H2 = H), it is clear that the two constructions of
A∗ give the same result (and in a sense are the same construction since both use Theorem 5.34).
2. Proposition 11.5 generalizes readily to a bijection between bounded linear maps A : H1 →
H2 and bounded sesquilinear forms [·, ·] on H1 × H2 . Then also the proof of Proposition 11.6
generalizes in this way and then produces the same adjoint A∗ ∈ B(H2 , H1 ) as Proposition 11.1.
(Of course A ∈ B(H1 , H2 ) can only be self-adjoint if H1 = H2 .)
3. If [·, ·] is a sesquilinear form then also [x, y]0 := [y, x] is a sesquilinear form, called the
adjoint form. Looking at the above definition of A∗ , one finds that A∗ is the bounded operator
associated with the form [·, ·]0 . Thus self-adjointness of A is equivalent to [·, ·]0A = [·, ·]A , i.e.
[·, ·]A being self-adjoint.
4. The following should be known from linear algebra, cf. e.g. [55]: If H is a Hilbert space,
A ∈ B(H) and E is an orthonormal basis for H then hA∗ e, f i = he, Af i, the complex conjugate of hAf, ei, for all
e, f ∈ E. Thus the (possibly infinite) matrix describing A∗ w.r.t. E is obtained from the matrix
corresponding to A by transposition and complex conjugation. 2
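The remark above can be checked numerically: with A∗ realized as the conjugate transpose of the matrix of A, the defining identity hAx, yi = hx, A∗ yi holds. A sketch with numpy and random data (np.vdot conjugates its first argument, so hu, vi = np.vdot(v, u) in our convention):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

A_star = A.conj().T                      # transposition + complex conjugation

def inner(u, v):                         # <u, v>: linear in u, anti-linear in v
    return np.vdot(v, u)

lhs = inner(A @ x, y)                    # <Ax, y>
rhs = inner(x, A_star @ y)               # <x, A* y>
assert np.isclose(lhs, rhs)
```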

11.8 Lemma The map B(H) → B(H), A 7→ A∗ satisfies


(i) (cA + dB)∗ = c̄A∗ + d̄B ∗ ∀A, B ∈ B(H), c, d ∈ F (antilinearity).
(ii) (AB)∗ = B ∗ A∗ (anti-multiplicativity).
(iii) A∗∗ = A (involutivity).
(iv) 1∗ = 1.
Proof. (i) Follows from

hx, (c̄A∗ + d̄B ∗ )yi = chx, A∗ yi + dhx, B ∗ yi = chAx, yi + dhBx, yi = h(cA + dB)x, yi = hx, (cA + dB)∗ yi.

(ii) For all x, y ∈ H we have hx, (AB)∗ yi = h(AB)x, yi = hBx, A∗ yi = hx, B ∗ A∗ yi.
(iii) Complex conjugating both sides of hAx, yi = hx, A∗ yi gives

hy, Axi = hA∗ y, xi,

which shows that A is an adjoint of A∗ . Uniqueness of the adjoint now implies A∗∗ = A.
(iv) Obvious. 

11.9 Proposition Let H be a Hilbert space. Then for all A ∈ B(H) we have
(i) kA∗ k = kAk. (The ∗-operation is isometric.)
(ii) kA∗ Ak = kAk2 . (“C ∗ -identity”)
Proof. (i) Similarly to Lemma 9.31, using (5.2) we have

kA∗ k = sup_{kxk=kyk=1} |hA∗ x, yi| = sup_{kxk=kyk=1} |hx, Ayi| = sup_{kxk=kyk=1} |hAy, xi| = kAk.

(ii) On the one hand, kA∗ Ak ≤ kA∗ kkAk = kAk^2 , where we used (i). On the other hand,

kAk^2 = ( sup_{kxk=1} kAxk )^2 = sup_{kxk=1} kAxk^2 = sup_{kxk=1} hAx, Axi = sup_{kxk=1} hA∗ Ax, xi ≤ kA∗ Ak,

where the last inequality follows from Cauchy-Schwarz. 
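Both (i) and (ii) can be sanity-checked for matrices, where the operator norm is the largest singular value. A small numerical sketch (illustrative, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A_star = A.conj().T

def op_norm(M):
    return np.linalg.norm(M, 2)  # operator norm = largest singular value

assert np.isclose(op_norm(A_star), op_norm(A))           # (i): the *-operation is isometric
assert np.isclose(op_norm(A_star @ A), op_norm(A) ** 2)  # (ii): the C*-identity
```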

The following is a Hilbert space version of the results of Section 9.5, but much easier:

11.10 Lemma Let H1 , H2 be Hilbert spaces and A ∈ B(H1 , H2 ). Then


(i) A∗ is invertible if and only if A is invertible, in which case (A∗ )−1 = (A−1 )∗ .
(ii) ker A∗ = (AH1 )⊥ ⊆ H2 . Thus A∗ is injective if and only if A has dense image: \overline{AH1} = H2 .
Analogously, A is injective if and only if A∗ has dense image.
(iii) A∗ has closed image if and only if A has closed image.
(iv) A∗ is surjective if and only if A is bounded below. Similarly for A ↔ A∗ .
Proof. (i) If A is invertible then applying ∗ to AA−1 = 1 = A−1 A gives (A−1 )∗ A∗ = 1 =
A∗ (A−1 )∗ , as claimed. The converse implication follows from this together with A∗∗ = A.
(ii) For y ∈ H2 we have

A∗ y = 0 ⇔ hA∗ y, xi = 0 ∀x ∈ H1 ⇔ hy, Axi = 0 ∀x ∈ H1 ⇔ y ∈ (AH1 )⊥ .

Thus A∗ is injective if and only if (AH1 )⊥ = {0}, which is equivalent to \overline{AH1} = H2 by Exercise
5.29(i). Applying the fact just proven to A∗ and using A∗∗ = A proves A injective ⇔ A∗ has
dense image.
(iii) We will prove that closedness of AH1 implies closedness of A∗ H2 . Replacing A by A∗
then gives the converse implication. Put H10 = ker A (which is closed) and H11 = H10⊥ .
Then we have a direct sum decomposition H1 = H10 ⊕ H11 by Theorem 5.27. By assumption
H21 = AH1 is closed, so that with H20 = H21⊥ we also have H2 = H20 ⊕ H21 . Now A maps H11
injectively (since H11 ∩ ker A = H11 ∩ H10 = {0}) onto H21 = AH1 . This defines an operator
A0 ∈ B(H11 , H21 ) that is injective and surjective, thus invertible, so that A0∗ is invertible by (i)
and therefore has closed image A0∗ H21 = H11 ⊆ H1 . Now closedness of A∗ H2 follows once we
prove that A∗ (x20 + x21 ) = A0∗ x21 for all x2i ∈ H2i . Since A∗ vanishes on H20 = (AH1 )⊥ by
(ii), it remains to prove that A∗ ↾ H21 coincides with A0∗ followed by the inclusion H11 ,→ H1 .
This follows from the computation

h(x10 + x11 ), A∗ x21 i = hA(x10 + x11 ), x21 i = hAx11 , x21 i = hA0 x11 , x21 i = hx11 , A0∗ x21 i,

where xij ∈ Hij .
(iv) By Exercise 7.43, A is bounded below if and only if it is injective and has closed image.
By (ii), injectivity of A is equivalent to A∗ having dense image, and by (iii) closedness of the
image of A is equivalent to closedness of the image of A∗ . The proof is concluded by appealing
to the trivial fact that surjectivity of A∗ is equivalent to the combination of closedness and
density of its image. 

If H is finite-dimensional, closedness of the images is automatic, while dense image is equivalent
to surjectivity and boundedness below to injectivity. Thus (ii) and (iv) reduce to the well-known
facts from linear algebra that A∗ is injective (surjective) if and only if A is surjective
(injective). Finally, obvious analogues of (i)-(iv) also hold for A ∈ B(H1 , H2 ) with H1 ≠ H2 .

11.11 Exercise Let H be a Hilbert space and A ∈ B(H).


(i) Show by example that injectivity of A and of A∗ does not imply invertibility of A.
(ii) Prove: If A is bounded below and A∗ is injective (or vice versa) then A is invertible.

11.12 Exercise Let H be a Hilbert space. Equip Ĥ = H ⊕ H with the obvious Hilbert space
structure.
(i) Let A ∈ B(H) and A∗ its adjoint. Show (without using boundedness of A∗ ) that the graph
G(A∗ ) ⊆ Ĥ of A∗ is the orthogonal complement in Ĥ of a certain linear subspace.
(ii) Conclude that A∗ is bounded. (Never mind that there are simpler ways of seeing this.)

11.2 Unitaries, isometries, coisometries, partial isometries


11.13 Lemma Let H1 , H2 be Hilbert spaces and A ∈ B(H1 , H2 ). Then
(i) A is an isometry if and only if A∗ A = idH1 , i.e. A∗ is a left inverse of A.
(ii) If A is an isometry then
(α) AH1 ⊆ H2 is closed.
(β) P = AA∗ ∈ B(H2 ) is an orthogonal projection, and P H2 = AH1 .
(γ) ker A∗ = (AH1 )⊥ = (1 − P )H2 .
(δ) The restriction A∗  AH1 : AH1 → H1 is unitary.
(iii) A is unitary if and only if A∗ A = idH1 and AA∗ = idH2 .
Proof. (i) By definition, A is an isometry if hAx, Ayi2 = hx, yi1 for all x, y ∈ H1 . Since the l.h.s.
equals hx, A∗ Ayi1 , A is an isometry if and only if hA∗ Ax, yi1 = hx, yi1 for all x, y ∈ H1 , which
is equivalent to A∗ A = idH1 .
(ii) (α) This is just Corollary 3.23. (β) We have P ∗ = (AA∗ )∗ = A∗∗ A∗ = AA∗ = P and
P 2 = AA∗ AA∗ = AA∗ = P , so that P is an orthogonal projection. We have P Ax = AA∗ Ax =
Ax for all x ∈ H1 , thus AH1 ⊆ P H2 . And if y ∈ P H2 then y = P y = AA∗ y (since P 2 = P ),
thus y = A(A∗ y) so that P H2 ⊆ AH1 . (γ) The equality ker A∗ = (AH1 )⊥ is Lemma 11.10 and
the second comes from (AH1 )⊥ = (P H2 )⊥ = (1 − P )H2 . (δ) If y ∈ AH1 then there is a unique
x ∈ H1 such that y = Ax. Now A∗ y = A∗ Ax = x. Thus A∗  AH1 : AH1 → H1 is isometric and
surjective, thus unitary.
(iii) By definition, A is unitary if and only if it is isometric and surjective. Thus A∗ A = idH1
by (i) and AA∗ = idH2 by (ii)(β). Conversely, if A∗ A = idH1 and AA∗ = idH2 then A is isometric
by (i) and surjective by (ii)(β), thus unitary. 
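A tall matrix with orthonormal columns is the finite-dimensional model of a non-surjective isometry H1 → H2. The following sketch (illustrative; the QR factorization is merely a convenient way to produce orthonormal columns) checks (i) and (ii)(β):

```python
import numpy as np

rng = np.random.default_rng(2)
# isometry C^3 -> C^5: a 5x3 matrix with orthonormal columns
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3)))
A = Q

assert np.allclose(A.conj().T @ A, np.eye(3))  # (i): A*A = id on H1

P = A @ A.conj().T                             # (ii)(beta): P = AA*
assert np.allclose(P, P.conj().T)              # P is self-adjoint
assert np.allclose(P @ P, P)                   # P is idempotent: an orthogonal projection
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.allclose(P @ (A @ x), A @ x)         # P acts as the identity on AH1
```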

11.14 Definition If H1 , H2 are Hilbert spaces then A ∈ B(H1 , H2 ) is called coisometry if
AA∗ = idH2 .
It is clear that A is a coisometry if and only if A∗ is an isometry. Thus by the above, we have
that if A is a coisometry then A  (ker A)⊥ is unitary, and also the converse is easily checked.
This suggests the following generalization:

11.15 Definition V ∈ B(H1 , H2 ) is a partial isometry if V|(ker V )⊥ : (ker V )⊥ → H2 is an
isometry.
It is obvious that every isometry and every coisometry is a partial isometry. And the isome-
tries, respectively coisometries, are just the partial isometries that are injective, respectively
surjective. Partial isometries will be put to use shortly.
For simplicity, the following exercise is stated for H = H1 = H2 , but it generalizes literally
to V ∈ B(H1 , H2 ), where H1 6= H2 .

11.16 Exercise Prove that for V ∈ B(H), the following are equivalent.
(i) V is a partial isometry.
(ii) V ∗ is a partial isometry.
(iii) V ∗ V is an orthogonal projection.
(iv) V V ∗ is an orthogonal projection.
(v) V V ∗ V = V (trivially equivalent to V ∗ V V ∗ = V ∗ ).

11.3 Polarization revisited


By Proposition 11.5 there is a bijection between bounded operators and bounded sesquilinear
forms. For this reason, the following is useful:

11.17 Lemma (i) Let [·, ·] : V × V → C be a sesquilinear form. Then

[x, y] = (1/4) Σ_{k=0}^{3} i^k [x + i^k y, x + i^k y] ∀x, y ∈ V. (11.1)

And [·, ·] : V × V → C is self-adjoint, i.e. [x, y] = \overline{[y, x]} ∀x, y, if and only if [x, x] ∈ R ∀x.
(ii) Every bilinear form [·, ·] : V × V → R that is symmetric, i.e. [x, y] = [y, x] ∀x, y, satisfies

[x, y] = (1/4) ( [x + y, x + y] − [x − y, x − y] ) ∀x, y ∈ V.
Proof. The polarization identities are proven by the same computations as for (5.5), the proof
of which only used sesquilinearity of h·, ·i, and (5.4), for which we also used the symmetry
hx, yi = hy, xi of inner products over R.
If a sesquilinear form [·, ·] is self-adjoint and x ∈ V then [x, x] = \overline{[x, x]}, thus [x, x] ∈ R for all
x ∈ V . Conversely, if [x, x] ∈ R ∀x then with (11.1) we have

\overline{[x, y]} = (1/4) Σ_{k=0}^{3} i^{−k} [x + i^k y, x + i^k y] = (1/4) Σ_{k=0}^{3} i^k [x + i^{−k} y, x + i^{−k} y] = (1/4) Σ_{k=0}^{3} i^k [i^k x + y, i^k x + y] = [y, x],

thus [·, ·] is self-adjoint. 
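Identity (11.1) can be tested numerically for a form [x, y] = hM x, yi given by an arbitrary matrix M; a quick sketch (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def form(x, y):
    # [x, y] = <Mx, y>: linear in x, conjugate-linear in y, hence sesquilinear
    return np.vdot(y, M @ x)

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# polarization identity (11.1)
rhs = sum(1j**k * form(x + 1j**k * y, x + 1j**k * y) for k in range(4)) / 4
assert np.isclose(form(x, y), rhs)
```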

11.18 Remark The above shows that in the case F = C we can omit axiom (ii) in Definition
5.1 if we replace the assumption of linearity in x by sesquilinearity in x, y, since (ii) then follows
from the positivity assumption hx, xi ≥ 0 ∀x. 2

Now we can show that Hilbert space operators tend to be determined by their diagonal
elements66 hAx, xi:

11.19 Lemma Let H be a Hilbert space over F ∈ {R, C} and A, B ∈ B(H).


(i) If hAx, xi = 0 for all x ∈ H and either F = C or A = A∗ then A = 0.
(ii) If hAx, xi = hBx, xi for all x ∈ H and either F = C or A = A∗ and B = B ∗ then A = B.
Proof. (i) Apply Lemma 11.17 to [·, ·]A = hA·, ·i, which is sesquilinear over C, and symmetric
if F = R and A = A∗ by hAx, yi = hx, Ayi = hAy, xi. For (ii) apply (i) to A − B. 
 
11.20 Remark 1. Every antisymmetric real matrix A ∈ Mn×n (R), e.g. A = ( 0 1 ; −1 0 ), satisfies
hAx, xi = 0 ∀x ∈ R^n . Thus the condition ‘F = C or A = A∗ ’ cannot be dropped.
2. If A ∈ B(H) and we define f : H → F by f (x) = hAx, xi, we have seen that A is
completely determined by f (if F = C, while f determines only A + A∗ if F = R). This raises
the natural problem of characterizing the functions H → F of the form x 7→ hAx, xi. 2

11.21 Exercise Let H be a Hilbert space over F and f : H → F a function. Prove that the
following are equivalent:
(i) There exists A ∈ B(H) such that f (x) = hAx, xi ∀x ∈ H.
(ii) The function f satisfies
(α) f (x + y) + f (x − y) = 2f (x) + 2f (y) ∀x, y ∈ H.
(β) f (cx) = |c|2 f (x) ∀x ∈ H, c ∈ F.
(γ) |f (x)| ≤ Ckxk2 for some C ≥ 0.

11.4 A little more on self-adjoint operators


The following shows that the self-adjoint operators are analogues of the real numbers, in some
sense:

11.22 Proposition Let H be a Hilbert space over C and A ∈ B(H). Then A = A∗ is


equivalent to hAx, xi ∈ R for all x ∈ H.
Proof. ⇒ If A = A∗ then hAx, xi = hx, A∗ xi = hx, Axi = \overline{hAx, xi} ∀x, and the claim follows.
⇐ If hAx, xi ∈ R ∀x then hA∗ x, xi = \overline{hx, A∗ xi} = \overline{hAx, xi} = hAx, xi ∀x, and Lemma 11.19(ii)
(applicable since F = C) gives A = A∗ . 

Note that if F = R, the statement hAx, xi ∈ R is trivially true and therefore implies nothing,
in particular not A = A∗ , again by Remark 11.20.
66 This language is a bit misleading since we are looking at hAx, xi for all x, not just x ∈ E for some fixed basis E.

11.23 Exercise (i) Let H be a Hilbert space, A ∈ B(H) and K ⊆ H a closed subspace
such that AK ⊆ K. (We say K is A-invariant.) Prove that AK ⊥ ⊆ K ⊥ is equivalent to
A∗ K ⊆ K.
In this situation, K is called reducing since then A ≅ A|K ⊕ A|K⊥ .
(ii) Deduce that every invariant subspace of a self-adjoint operator is reducing.

11.5 Normal operators


If A ∈ B(H) then A is self-adjoint if A = A∗ and unitary if A∗ A = AA∗ = 1. It is obvious that
either of these properties implies the following:

11.24 Definition A Hilbert space operator A ∈ B(H) is called normal if AA∗ = A∗ A.

11.25 Remark On every complex Hilbert space H 6= {0} there are normal operators that are
neither self-adjoint nor unitary. Non-normal operators exist whenever dim H ≥ 2. 2

11.26 Lemma Let H be a Hilbert space over F ∈ {R, C} and A ∈ B(H). Then

A is normal ⇔ kAxk = kA∗ xk for all x ∈ H ⇔ A∗ is normal.

Proof. For all x ∈ H we have

kA∗ xk2 = hA∗ x, A∗ xi = hAA∗ x, xi, kAxk2 = hAx, Axi = hA∗ Ax, xi,

so that kA∗ xk = kAxk ∀x is equivalent to hA∗ Ax, xi = hAA∗ x, xi ∀x and therefore is implied by
normality of A. If kAxk = kA∗ xk ∀x then hAA∗ x, xi = hA∗ Ax, xi ∀x, so that Lemma 11.19(ii)
gives AA∗ = A∗ A. (This also holds for F = R since AA∗ and A∗ A are self-adjoint.) 
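Lemma 11.26 yields a cheap numerical normality test. In the sketch below (illustrative; the test matrices are ad hoc choices) a permutation matrix, being unitary, passes, while a Jordan block fails, and the failure of kAxk = kA∗ xk is visible already at a standard basis vector:

```python
import numpy as np

def is_normal(A, tol=1e-10):
    return np.allclose(A @ A.conj().T, A.conj().T @ A, atol=tol)

U = np.eye(4)[[1, 2, 3, 0]]   # a permutation matrix: unitary, hence normal
J = np.diag(np.ones(3), 1)    # 4x4 Jordan block with eigenvalue 0: not normal

assert is_normal(U) and not is_normal(J)

# the criterion ||Ax|| = ||A*x||: for J it fails already at x = e1
x = np.array([1.0, 0.0, 0.0, 0.0])
assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(U.T @ x))
assert not np.isclose(np.linalg.norm(J @ x), np.linalg.norm(J.T @ x))
```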

11.27 Proposition Let H be a Hilbert space over F ∈ {R, C} and A ∈ B(H) normal. Then
(i) ker A = ker A∗ = (AH)⊥ = (A∗ H)⊥ .
(ii) With H0 = ker A and H′ = H0⊥ we have AH′ ⊆ H′ and A∗ H′ ⊆ H′ with dense images.
(In particular, ker A is a reducing subspace.)
(iii) A is injective if and only if it has dense image.
(iv) A is invertible ⇔ A is bounded below ⇔ A is surjective.
Proof. (i) The first equality is immediate from Lemma 11.26. The rest follows by applying
Lemma 11.10 to A and A∗ .
(ii) Taking the orthogonal complement of (AH)⊥ = ker A = H0 gives (AH)⊥⊥ = \overline{AH} = H′,
and similarly for A∗ .
(iii) In view of (AH)⊥ = ker A established in (i), injectivity of A is equivalent to (AH)⊥ =
{0}, which is equivalent to density of the image AH by Exercise 5.29(i).
(iv) Invertibility implies boundedness below and surjectivity, cf. Proposition 7.41. If a normal
operator is bounded below, it is injective, so that it has dense image by (iii). Now boundedness
below and dense image imply invertibility by Proposition 7.41. And surjectivity implies dense
image, thus injectivity by (iii). Now injectivity and surjectivity give invertibility by the BIT. 

11.28 Remark In Corollary 17.25 we will characterize invertibility of A  (ker A)⊥ in terms of
the spectrum of A. 2

11.29 Exercise Describe (i) the unitary self-adjoint operators, (ii) the normal partial isome-
tries.

11.30 Exercise Give an example of a normal operator A ∈ B(H) such that there is a non-
trivial A-invariant subspace K ⊆ H that is not reducing. (See Exercise 11.23 for the terminol-
ogy.)

11.31 Exercise Let H be a Hilbert space and A ∈ B(H).


(i) Assuming A = A∗ , prove kA^{2^n} k = kAk^{2^n} ∀n ∈ N.
(ii) Generalize (i) to all normal A.

11.6 Numerical range and radius


In view of Lemma 11.17 and Proposition 11.5, a Hilbert space operator A can be recovered
completely from its diagonal elements hAx, xi when F = C or A = A∗ . We should therefore be
able to prove a more quantitative statement improving on Lemma 11.19 (and implying it).

11.32 Definition Let H be a Hilbert space and A ∈ B(H). Then we define

• the numerical range of A as W (A) = {hAx, xi | x ∈ H, kxk = 1},
• the numerical radius of A as |||A||| = sup_{λ∈W (A)} |λ| = sup_{kxk=1} |hAx, xi|.

(In quantum mechanics, cf. e.g. [92], W (A) is the set of expectation values of A (if A = A∗ ).)
If F = C, Lemma 11.19(i) and Proposition 11.22 can now be restated concisely: For A ∈
B(H) we have W (A) = {0} if and only if A = 0, and W (A) ⊆ R if and only if A = A∗ .
We now focus on ||| · ||| and refer to Appendix B.12.1 for more on W (A).

11.33 Lemma Let H be a Hilbert space over F ∈ {R, C} and A ∈ B(H). Then
(i) |||A||| ≤ kAk.
(ii) |hAx, xi| ≤ kxk^2 |||A||| for all x ∈ H.
(iii) If |||A||| = 0 and either F = C or A = A∗ then A = 0.
Proof. (i) If kxk = 1 then |hAx, xi| ≤ kAxkkxk ≤ kAk.
(ii) For x = 0 this is clear. For x ≠ 0 we have kxk^{−2} |hAx, xi| = |hA(x/kxk), x/kxki| ≤ |||A|||.
(iii) In view of (ii) this is just a restatement of Lemma 11.19(i). 

11.34 Proposition Let H be a Hilbert space over F ∈ {R, C} and A ∈ B(H). Then
(i) If A = A∗ then kAk = |||A|||.
(ii) If F = C then (1/2) kAk ≤ |||A|||.
(iii) If F = C and A is normal then kAk = |||A|||.
Proof. (i) If x, y ∈ H are unit vectors then

|hA(x + y), x + yi − hA(x − y), x − yi| ≤ |hA(x + y), x + yi| + |hA(x − y), x − yi|
≤ (kx + yk^2 + kx − yk^2 ) |||A|||
= 2 (kxk^2 + kyk^2 ) |||A||| = 4 |||A|||,
where we used Lemma 11.33(ii), the parallelogram identity (5.3) and kxk = kyk = 1. Inserting

hA(x + y), x + yi − hA(x − y), x − yi = 2hAx, yi + 2hAy, xi ∀x, y ∈ H,

proven by direct computation, we have

|hAx, yi + hAy, xi| ≤ 2 |||A|||.

Assuming Ax ≠ 0 and putting y = Ax/kAxk, this becomes

| kAxk + kAxk^{−1} hA^2 x, xi | ≤ 2 |||A||| whenever kxk = 1, Ax ≠ 0. (11.2)

If now A = A∗ then hA^2 x, xi = hAx, Axi = kAxk^2 , so that (11.2) reduces to kAxk ≤ |||A||| when
kxk = 1, whence kAk ≤ |||A|||. Combining this with Lemma 11.33(i), we are done.
(ii) If we replace A in (11.2) by αA where |α| = 1 then kAxk and |||A||| are unchanged but
hA^2 x, xi acquires a factor α^2 . Since F = C, we can choose α such that α^2 hA^2 x, xi = |hA^2 x, xi| ≥
0. Then (11.2) becomes kAxk + kAxk^{−1} |hA^2 x, xi| ≤ 2 |||A|||, which implies kAxk ≤ 2 |||A||| for
all unit vectors x with Ax ≠ 0. This also holds if Ax = 0, thus kAk = sup_{kxk=1} kAxk ≤ 2 |||A|||.
(iii) See the following exercise. 

11.35 Exercise Let H be a complex Hilbert space and A ∈ B(H). Prove:

(i) |||A^2||| ≤ |||A|||^2 . Hint: Adapt the proof of Proposition 11.34(ii).
(ii) If A is normal then |||A||| = kAk. Hint: Prove and use kAk^{2^n} ≤ 2 |||A|||^{2^n} ∀n ∈ N.

For F = R, Proposition 11.34(ii) and both parts of Exercise 11.35 are false, a counterexample
being the A in Remark 11.20, satisfying kAk = 1 = |||A^2||| and AA∗ = A(−A) = (−A)A = A∗ A,
but |||A||| = 0. Over C the following shows that Proposition 11.34(ii) is optimal without further
assumption on A:

11.36 Exercise Let A = ( 0 1 ; 0 0 ) ∈ B(C^2 ). Prove W (A) = {z ∈ C | |z| ≤ 1/2} and |||A||| =
(1/2) kAk.
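One can probe W (A) for this matrix by sampling unit vectors; the sketch below (illustrative, of course not a proof) is consistent with the claimed disk of radius 1/2:

```python
import numpy as np

rng = np.random.default_rng(5)
A = np.array([[0.0, 1.0], [0.0, 0.0]])

# sample W(A) = {<Ax, x> : ||x|| = 1}
vals = []
for _ in range(2000):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    vals.append(np.vdot(x, A @ x))   # <Ax, x>, with <.,.> linear in the first slot
num_radius = max(abs(v) for v in vals)

assert np.isclose(np.linalg.norm(A, 2), 1.0)    # ||A|| = 1
assert num_radius <= 0.5 + 1e-12                # samples never exceed 1/2 ...
x = np.array([1.0, 1.0]) / np.sqrt(2)
assert np.isclose(abs(np.vdot(x, A @ x)), 0.5)  # ... and 1/2 is attained
```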

11.7 Positive operators and their square roots


Just as self-adjoint operators are the ‘real’ elements of B(H), there also are positive ones:

11.37 Definition If H is a Hilbert space then A ∈ B(H) is called positive, abbreviated A ≥ 0,


if hAx, xi ≥ 0 for all x ∈ H. If F = R we also require A = A∗ .

11.38 Remark 1. Positivity of A is equivalent to W (A) ⊆ [0, +∞).


2. By Proposition 11.22, the assumption hAx, xi ≥ 0 ∀x automatically implies A = A∗ if
F = C. But this is not true for F = R, and it seems undesirable to consider matrices as in
Remark 11.20 as positive, for example since positivity should be preserved by extending scalars
from R to C. But also the definition without self-adjointness can be found in the literature. 2

11.39 Exercise Let H be a Hilbert space over F ∈ {R, C}. Prove:


(i) If A1 , A2 ∈ B(H) are positive and λ1 , λ2 ≥ 0 then λ1 A1 + λ2 A2 is positive.

(ii) For each A ∈ B(H) we have A∗ A ≥ 0. In particular, A = A∗ ⇒ A2 ≥ 0.
(iii) If A ∈ B(H) is positive then BAB ∗ is positive for every B ∈ B(H).
(iv) If A, B ∈ B(H) are positive and A + B = 0 then A = B = 0.
(v) If A ≥ 0 then Ak ≥ 0 for all k ∈ N.

11.40 Theorem If H is a Hilbert space and A ∈ B(H) is positive then there is a unique
positive B ∈ B(H) such that B 2 = A. We call B the square root A1/2 of A.
Proof. Existence: It suffices to prove the claim under the additional assumption kAk ≤ 1 since
then for general A ≠ 0 we can put A^{1/2} = kAk^{1/2} (A/kAk)^{1/2} .
Let A ≥ 0, kAk ≤ 1. Then for x ∈ H, kxk = 1 we have 0 ≤ hAx, xi ≤ 1, thus h(1 −
A)x, xi = 1 − hAx, xi ∈ [0, 1], so that 1 − A ≥ 0. Furthermore, Proposition 11.34(i) implies
k1 − Ak = sup_{kxk=1} h(1 − A)x, xi ≤ 1. Thus A 7→ 1 − A is an involutive bijection from the set
{A ≥ 0 | kAk ≤ 1} to itself, so that it suffices to define (1 − A)^{1/2} whenever A ≥ 0, kAk ≤ 1.
(This is advantageous since z 7→ (1 − z)^{1/2} is analytic at z = 0 while z 7→ z^{1/2} is not.)
The function z 7→ (1 − z)^{1/2} is infinitely differentiable at z = 0, with (formal) Taylor series

(1 − z)^{1/2} = Σ_{k=0}^{∞} c_k z^k = 1 − (1/2) z/1! − (1/2)(1/2) z^2/2! − (1/2)(1/2)(3/2) z^3/3! − ··· , (11.3)

where the c_k can be found by explicit computation or appeal to Newton’s binomial theorem,
cf. e.g. [57, Vol. 1, Theorem 7.6.4]. We see that c_0 = 1 and c_k < 0 for all k ≥ 1. The power
series has convergence radius one and does converge to (1 − z)^{1/2} whenever |z| < 1. This can
be seen by invoking either complex analysis or Newton’s theorem. As z ↗ 1, the l.h.s. converges
to zero, thus Σ_{k=1}^{∞} c_k = −1, implying Σ_{k=0}^{∞} |c_k| = 2 and |c_k| ≤ 1 ∀k. Thus for kAk ≤ 1
we have Σ_{k=0}^{∞} |c_k| kA^k k < ∞, so that Σ_{k=0}^{∞} c_k A^k converges in norm by Proposition 3.15. We
interpret the sum as (1 − A)^{1/2} . To see that this is justified we note that squaring (11.3) gives
1 − z = (Σ_{k=0}^{∞} c_k z^k)^2 , which by absolute convergence also holds with z replaced by A. Since
A is self-adjoint with kAk ≤ 1, also A^k has these properties for all k ≥ 1. Thus for all k and
x ∈ H, kxk = 1 we have hA^k x, xi ∈ [−1, 1]. (Actually hA^k x, xi ∈ [0, 1] since A^k ≥ 0 ∀k ∈ N by Exercise
11.39(v), but we don’t need this.) With Σ_{k=1}^{∞} |c_k| = 1 it follows that Σ_{k=1}^{∞} c_k hA^k x, xi ∈ [−1, 1],
so that with c_0 = 1 we have

h(1 − A)^{1/2} x, xi = hΣ_{k=0}^{∞} c_k A^k x, xi = Σ_{k=0}^{∞} c_k hA^k x, xi ≥ 0.

Thus (1 − A)^{1/2} ≥ 0. Finally defining A^{1/2} = (1 − (1 − A))^{1/2} we are done.


Uniqueness: Let A ∈ B(H) be positive and B = A1/2 as constructed above. Let also C
satisfy C 2 = A and C ≥ 0. Then CA = CC 2 = C 2 C = AC, and since B is a limit of
polynomials in A, we have BC = CB. Using this and B 2 = A = C 2 we have

(B − C)B(B − C) + (B − C)C(B − C) = (B − C)(B + C)(B − C) = (B 2 − C 2 )(B − C) = 0.

Since (B−C)B(B−C) and (B−C)C(B−C) are positive by Exercise 11.39(iii), Exercise 11.39(iv)
gives (B−C)B(B−C) = (B−C)C(B−C) = 0. Thus also their difference (B−C)3 vanishes, and
so does (B −C)4 . Since B −C is self-adjoint, we have kB −Ck4 = k(B −C)2 k2 = k(B −C)4 k = 0,
thus B = C. 
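The existence part of the proof is effectively an algorithm. The following numpy sketch (illustrative only; the test matrix, the normalization and the truncation at 500 terms are ad hoc choices) sums the series Σ c_k (1 − A)^k, generating the coefficients of (11.3) by the recursion c_{k+1} = c_k (k − 1/2)/(k + 1):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((4, 4))
A = X @ X.T + np.eye(4)       # positive definite
A /= np.linalg.norm(A, 2)     # scale so that ||A|| <= 1, as in the proof

# sqrt(A) = (1 - (1 - A))^{1/2} = sum_k c_k (1 - A)^k, c_0 = 1, c_k < 0 for k >= 1
M = np.eye(4) - A
B = np.zeros_like(A)
c, P = 1.0, np.eye(4)         # current coefficient c_k and power (1 - A)^k
for k in range(500):
    B += c * P
    P = P @ M
    c *= (k - 0.5) / (k + 1)  # recursion c_{k+1} = c_k (k - 1/2)/(k + 1)

assert np.allclose(B @ B, A, atol=1e-8)                           # B^2 = A
assert np.allclose(B, B.T) and np.min(np.linalg.eigvalsh(B)) > 0  # B positive
```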

11.41 Remark 1. For a proof of Theorem 11.40 that uses the Weierstrass approximation
theorem instead of a power series, cf. e.g. [118, Lemma 3.2.10 & Proposition 3.2.11].
2. In Section 17 we will develop a less ad hoc and much more general approach to applying
(continuous) functions to (normal) operators, but this will require much work. 2

11.8 The polar decomposition


Every complex number z can be written as z = ru, where |u| = 1 and r ≥ 0. For z 6= 0 this
decomposition is unique. It is natural to ask whether there is an analogue for A ∈ B(H).

11.42 Definition Let H be a Hilbert space and A ∈ B(H). Then we define the absolute value
|A| = (A∗ A)1/2 . (Apply Theorem 11.40 to A∗ A, which is positive by Exercise 11.39(ii).)

11.43 Exercise Let A ∈ B(H) be invertible. Prove (directly, without Proposition 11.44):
(i) B ∈ B(H) is invertible if and only if B 2 is invertible.
(ii) |A| is invertible.
(iii) U = A|A|−1 is unitary.
Thus we have the polar decomposition A = U |A| with U unitary and |A| positive and invertible.
If A is not invertible, a form of polar decomposition still exists, but is more subtle:

11.44 Proposition (Polar decomposition) Let H be a Hilbert space and A ∈ B(H).


(i) There exists a unique partial isometry V such that A = V |A| and ker A = ker V .
(ii) If A is injective (invertible) then V is an isometry (unitary).
(iii) In addition, we have |A| = V ∗ A. [This follows trivially from (i) only if V is unitary.]
Proof. (i) For each x ∈ H we have

kAxk^2 = hAx, Axi = hx, A∗ Axi = h(A∗ A)^{1/2} x, (A∗ A)^{1/2} xi = h|A|x, |A|xi = k|A|xk^2 , (11.4)

implying ker |A| = ker A. If |A|x = |A|x′ then |A|(x − x′ ) = 0, thus x − x′ ∈ ker |A| = ker A, so
that Ax = Ax′ . Thus there is a well-defined linear map V : |A|H → H satisfying V |A|x = Ax.
By (11.4) this map is isometric and therefore extends continuously to \overline{|A|H} by Lemma 3.12. We
extend V to all of H by having it send (\overline{|A|H})^⊥ to zero, obtaining a partial isometry. We have

ker V = (|A|H)^⊥ = ker |A|∗ = ker |A| = ker A,

where we used Lemma 11.10, the self-adjointness of |A| and (11.4). It is clear from the definition
that V |A| = A.
That V is uniquely determined by its properties is quite clear: It must send |A|x to Ax, which
determines it on \overline{|A|H}. And in view of ker A = ker |A| = ker |A|∗ = (|A|H)^⊥ , the requirement
ker V = ker A forces V to be zero on (|A|H)^⊥ .
(ii) It is trivial that an injective (bijective) partial isometry is an isometry (unitary).
(iii) Using (11.4) as in (i) there is a unique partial isometry W such that W Ax = |A|x ∀x ∈ H
and W ↾ (AH)^⊥ = 0. Now it is immediate that W V ↾ \overline{|A|H} = id and V W ↾ \overline{AH} = id, while W V
and V W vanish on (|A|H)^⊥ and (AH)^⊥ respectively. Thus W = V ∗ , and we are done. 

11.45 Remark Exercise 11.16 generalizes without any difficulty to operators V ∈ B(H, H′)
between different Hilbert spaces H ≠ H′. Then, of course, we have V ∗ V ∈ B(H), V V ∗ ∈ B(H′).
Also Proposition 11.44 generalizes to A ∈ B(H, H′). Then |A| = (A∗ A)^{1/2} ∈ B(H) and
V ∈ B(H, H′). 2
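For matrices the polar decomposition is conveniently obtained from the singular value decomposition rather than via the power series: if A = U ΣW ∗ then |A| = W ΣW ∗ and V = U W ∗ . A sketch (illustrative; since A here is square and U, W are unitary, V comes out unitary):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

U, s, Wh = np.linalg.svd(A)            # A = U diag(s) Wh
absA = Wh.conj().T @ np.diag(s) @ Wh   # |A| = (A*A)^{1/2}
V = U @ Wh                             # the (partial) isometric factor

assert np.allclose(V @ absA, A)                    # A = V|A|
assert np.allclose(V.conj().T @ V, np.eye(4))      # V unitary (A is square)
assert np.allclose(absA, absA.conj().T)            # |A| self-adjoint ...
assert np.min(np.linalg.eigvalsh(absA)) >= -1e-10  # ... and positive
```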

11.46 Exercise Let {λk }k∈N ⊆ C be a bounded sequence. Let H = `2 (N, C) and Aλ ∈ B(H)
defined by Aλ ek = λk ek+1 (where ek (n) = δk,n ).
(i) Compute (Aλ )∗ , |Aλ | and the partial isometry V in the polar decomposition of Aλ .
(ii) Give a necessary and sufficient condition on {λk } for Aλ to be normal.

11.47 Exercise Prove:


(i) If A ∈ B(H) is bounded below, i.e. kAxk ≥ Ckxk with C > 0, then also |A| is bounded
below with the same constant.
(ii) If A ∈ B(H) is bounded below then |A| is invertible.

11.9 ? The trace of positive operators


If A is an n × n matrix, its trace Tr(A) is defined as Σ_{i=1}^{n} A_ii . One easily shows Tr(AB) =
Tr(BA), so that Tr(BAB^{−1} ) = Tr(A) for all invertible B. If V is a finite-dimensional vector
space and A ∈ End V , we pick a basis E of V and define Tr(A) in terms of the matrix
representation of A w.r.t. E, which then is independent of E by the conjugation invariance.
If V is an infinite-dimensional Banach space and A ∈ B(V ), it is a rather non-trivial problem
to define Tr(A) (unless A has finite rank, cf. Exercise 10.11). But for Hilbert spaces, one has
the following:

11.48 Definition If H is a Hilbert space, E an ONB for H and A ∈ B(H) is positive, we
define the trace

TrE (A) = Σ_{e∈E} hAe, ei ∈ [0, +∞].

11.49 Remark 1. If H is finite-dimensional, TrE (A) coincides with our earlier definition and
therefore does not depend on the choice of E.
2. The sum is well-defined since hAe, ei ≥ 0 for all e ∈ E. (The trace of certain non-positive
operators will be considered in Appendix B.11.) But establishing the E-independence
now is less straightforward. We begin with two special cases. 2

11.50 Exercise Let H be a Hilbert space over F and P ∈ B(H) an orthogonal projection.
Prove that TrE (P ) equals the rank of P , i.e. the Hilbert space dimension of the closed subspace
P H ⊆ H as an F-Hilbert space, for each ONB E. Hint: Begin with rank-one projections.

11.51 Lemma Let H be a Hilbert space and A ∈ B(H). Then


(i) TrE (A∗ A) = TrE (AA∗ ) for each ONB E of H.
(ii) The quantity in (i) is independent of the ONB E.

Proof. (i) Using Parseval’s identity (kxk^2 = Σ_{e′∈E} |hx, e′i|^2 ), we have

TrE (A∗ A) = Σ_{e∈E} hA∗ Ae, ei = Σ_{e∈E} hAe, Aei = Σ_{e∈E} kAek^2 = Σ_{e∈E} Σ_{e′∈E} |hAe, e′i|^2
= Σ_{e′∈E} Σ_{e∈E} |hAe, e′i|^2 = Σ_{e′∈E} Σ_{e∈E} |he, A∗ e′i|^2 = Σ_{e′∈E} kA∗ e′k^2
= Σ_{e′∈E} hA∗ e′, A∗ e′i = Σ_{e′∈E} hAA∗ e′, e′i = TrE (AA∗ ),

where the exchange of summations is justified since all summands are non-negative.
(ii) If E is an ONB and U ∈ B(H) is unitary, we have

TrE (U AA∗ U ∗ ) = TrE (U A(U A)∗ ) = TrE ((U A)∗ U A) = TrE (A∗ U ∗ U A) = TrE (A∗ A) = TrE (AA∗ ), (11.5)

where the second and fifth equalities come from (i) as applied to U A and A, respectively. If
now E′ is a second ONB then E and E′ have the same cardinality, so that there is a unitary U
such that U ∗ maps E onto E′. Together with (11.5) this gives TrE (AA∗ ) = TrE (U AA∗ U ∗ ) =
TrE′ (AA∗ ), proving the claim. 

11.52 Corollary If A ∈ B(H) is positive then TrE (A) is independent of the ONB E.
Proof. Positivity of A implies that there is a positive B ∈ B(H) such that A = B 2 = B ∗ B.
Now the claim is immediate by Lemma 11.51(ii). 

(Attempts to prove the corollary without using square roots tend to have gaps.) The above
considerations will be put to use in Section 12.4 (and B.11).
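In finite dimensions Lemma 11.51 and the corollary reduce to familiar matrix identities, easy to confirm numerically (illustrative sketch; a unitary obtained from a QR factorization plays the role of the change of ONB):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# Lemma 11.51(i): Tr(A*A) = Tr(AA*) (= the squared Hilbert-Schmidt norm)
t1 = np.trace(A.conj().T @ A)
t2 = np.trace(A @ A.conj().T)
assert np.isclose(t1, t2)
assert np.isclose(t1.real, np.sum(np.abs(A) ** 2))

# basis independence: a change of ONB is conjugation by a unitary Q
P = A.conj().T @ A   # a positive operator
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5)))
assert np.isclose(np.trace(Q @ P @ Q.conj().T), np.trace(P))
```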

12 Compact operators
12.1 Compact Banach space operators
We have met compact topological spaces many times in this course. A subset Y of a topological
space (X, τ ) is compact if it is compact when equipped with the induced (=subspace) topology
τ|Y . Recall that a metric space X, thus also a subset of a normed space, is compact if and only
if it is sequentially compact, i.e. every sequence {xn } in X has a convergent subsequence. And
Y ⊆ X is called precompact (or relatively compact) if its closure Y is compact. If X is complete
metric, precompactness of Y ⊆ X is equivalent to total boundedness.
A subset Y of a normed space (V, k · k) is called bounded if there is an M such that kyk ≤
M ∀y ∈ Y . A compact subset of a normed space is closed and bounded, but the converse, while
true for finite-dimensional spaces by the Heine-Borel theorem, is false in infinite-dimensional
spaces. This is particularly easy to see for an infinite-dimensional Hilbert space: Any ONB √
B ⊆ H clearly is bounded. For any e, e0 ∈ B, e 6= e0 we have ke − e0 k = he − e0 , e − e0 i1/2 = 2.
Thus B ⊆ H is closed and discrete. Since it is infinite, it is not compact.
Related to this: If H is a Hilbert space and W ⊆ H a proper closed subspace, pick x ∈ W⊥
with kxk = 1. Then kx − wk = √(1 + kwk^2) for each w ∈ W , thus dist(x, W ) = inf_{w∈W} kw − xk =
1. This can be generalized:

12.1 Exercise If V is a reflexive Banach space and W ⊆ V a proper closed subspace, there
exists x ∈ V satisfying kxk = 1 and dist(x, W ) = 1. Hint: Use Exercise 9.24.

Without reflexivity, we have the following weaker, but elementary result:

12.2 Lemma (F. Riesz) Let V be a normed space and W ⊊ V a closed and proper linear
subspace. Then for each θ ∈ (0, 1) there is an xθ ∈ V such that kxθ k = 1 and dist(xθ , W ) ≥ θ.
Proof. Since W is proper and closed, there are x0 ∈ V \W and ε > 0 such that B(x0 , ε) ⊆ V \W .
Thus λ := dist(x0 , W ) = inf_{w∈W} d(x0 , w) ≥ ε > 0. In view of θ ∈ (0, 1), we have λ/θ > λ, so that
we can find w0 ∈ W with λ ≤ kw0 − x0 k < λ/θ. Putting

xθ = (w0 − x0 ) / kw0 − x0 k,

we have kxθ k = 1. If w ∈ W then

kw − xθ k = k w − (w0 − x0 )/kw0 − x0 k k = k kw0 − x0 k w − w0 + x0 k / kw0 − x0 k ≥ dist(x0 , W ) / kw0 − x0 k > λ / (λ/θ) = θ,

where the ≥ is due to kw0 − x0 k w − w0 ∈ W and the > comes from dist(x0 , W ) = λ and
kw0 − x0 k < λ/θ. Since this holds for all w ∈ W , we have dist(xθ , W ) ≥ θ. 

12.3 Proposition If (V, k · k) is an infinite-dimensional normed space then:


(i) Each closed ball B(x, r) = {y ∈ V | kx − yk ≤ r} (with r > 0) is non-compact.
(ii) Every subset Y ⊆ V with non-empty interior Y° is non-compact.
Proof. (i) Choose x1 ∈ V with kx1 k = 1. Then Cx1 is a closed proper subspace, thus by Lemma
12.2 (with θ = 1/2) there exists x2 ∈ V with kx2 k = 1 and kx1 − x2 k ≥ 1/2. Since V is
infinite-dimensional, V2 = span{x1 , x2 } is a closed proper subspace, thus there exists x3 ∈ V
with kx3 k = 1 and dist(x3 , V2 ) ≥ 1/2, thus in particular kx3 − xi k ≥ 1/2 for i = 1, 2. Continuing
in this way we can construct a sequence {xi } ⊆ V with kxi k = 1 and kxi − xj k ≥ 1/2 ∀i ≠ j.
The sequence {xi } clearly cannot have a convergent subsequence, thus the closed unit ball
B(0, 1) is non-compact. Since x 7→ λx + x0 with λ > 0 is a homeomorphism, all closed balls are
non-compact.
(ii) If Y ⊆ V and Y° ≠ ∅ then Y contains some open ball B(x, r), thus also the closed ball
B(x, r/2), which is non-compact. Thus neither Y nor \overline{Y} is compact. 

In view of the above, it is interesting to look at linear operators that send sets S ⊆ V to
sets AS with ‘better compactness properties’. There are several such notions:

12.4 Exercise Let V, W be normed spaces and A : V → W a linear map. Prove that the
following conditions are equivalent and imply boundedness of A:
(i) The image AV≤1 ⊆ W of the closed unit ball V≤1 is precompact.
(ii) Whenever S ⊆ V is bounded, AS is precompact (⇔ totally bounded if V is complete).
(iii) Given any bounded sequence {xn } ⊆ V , the sequence {Axn } has a convergent subsequence.

12.5 Definition Operators A ∈ B(V, W ) satisfying the above equivalent conditions are called
compact. The set of compact operators V → W is denoted K(V, W ), and we put K(V ) =
K(V, V ).

12.6 Remark 0. That compactness implies boundedness is good to know, but rarely important
since it is a priori clear in most situations.
1. Some authors write B0 (V ) rather than K(V ), motivated by Exercise 12.13(iii) below.
2. In the older literature one can find ‘completely continuous’ as synonym for compact, but
this should be avoided since complete continuity now is defined differently and in general is not
equivalent to compactness.
3. If A ∈ B(V, W ) is compact and V 0 ⊆ V is closed then the restriction A|V 0 ∈ B(V 0 , W )
is compact. If A ∈ B(V ) is compact, W ⊆ V is closed and AW ⊆ W then A|W ∈ B(W ) is
compact.
4. The Heine-Borel theorem implies that every linear operator on a finite-dimensional
normed space (automatically bounded by Exercise 3.7) is compact. For infinite-dimensional
spaces this is false since every closed ball is bounded but non-compact by Proposition 12.3. In
particular the unit operator 1V is compact if and only if V is finite-dimensional.
5. Compactness can also be defined for non-linear maps between Banach spaces. But then
the three versions above are no longer equivalent and continuity is no longer automatic. See
Section B.15. 2

Before we develop further theory, we should prove that (non-zero) compact operators on
infinite-dimensional spaces exist. The following may be known from Exercise 10.10:

12.7 Definition Let V, W be normed spaces and A ∈ B(V, W ). Then A has finite rank if
its image AV ⊆ W is finite-dimensional. The set of finite rank operators V → W is denoted
F (V, W ). Again, F (V ) = F (V, V ).

12.8 Remark 1. For example, if ϕ ∈ V ∗ and y ∈ W then A ∈ B(V, W ) defined by A : x 7→ ϕ(x)y
has finite rank.
2. Note that boundedness of A : V → W does not follow from finite-dimensionality of AV ,
as we know from W = F.
3. It is straightforward to check that F (V, W ) ⊆ B(V, W ) is a linear subspace. And if
A ∈ B(V, W ), B ∈ B(W, Z) and at least one of A, B has finite rank then BA ∈ F (V, Z). In
particular, F (V ) ⊆ B(V ) is a two-sided ideal.
4. In Exercise 10.10 we proved F (V, W ) = {A ∈ B(V, W ) | A is weak-norm continuous}. In
Section 12.2 we will relate weak versions of weak-norm continuity to compactness.
5. If V is infinite-dimensional then for each n ∈ N we can find (many) finite-dimensional
subspaces W of dimension n. Since W is complemented, we have V ' W ⊕ Z. This gives an
embedding B(W ) ,→ F (V ). Thus F (V ) has closed subalgebras isomorphic to B(W ) for each
finite-dimensional W .
6. Finite rank operators can be used to prove that an algebraic isomorphism K(V ) ∼= K(W )
or B(V ) ∼= B(W ) implies an isomorphism V ' W of Banach spaces. For a more general
statement and proof see Section B.7. 2

12.9 Lemma Finite rank operators are compact. Thus F (V, W ) ⊆ K(V, W ) for all Banach
spaces V, W .
Proof. Let A ∈ F (V, W ). If S ⊆ V is bounded then AS ⊆ AV is bounded by boundedness of
A. Since AV ⊆ W is finite-dimensional, it is closed and has the Heine-Borel property, so that
the closure of AS, a closed bounded subset of AV , is compact. Thus A is compact. 

12.10 Lemma K(V, W ) ⊆ B(V, W ) is a vector space. If A ∈ B(V, W ), B ∈ B(W, Z) and at
least one of A, B is compact then BA ∈ K(V, Z). In particular, K(V ) ⊆ B(V ) is a two-sided
ideal.
Proof. Let {xn } be a bounded sequence in V and A, B ∈ K(V, W ). Passing to a subsequence
twice, we find {xnk }k∈N such that both Axnk and Bxnk converge as k → ∞. Then also
(cA + dB)xnk converges, thus cA + dB is compact. Thus K(V, W ) ⊆ B(V, W ) is a linear subspace.
Alternative argument: Let A, B ∈ K(V ) and S ⊆ V bounded. Then the closures of AS and
BS are compact, and so are those of cAS and dBS if c, d ∈ F. By joint continuity of the map
+ : V × V → V , also cAS + dBS has compact closure, and (cA + dB)S ⊆ cAS + dBS then
gives that (cA + dB)S has compact closure.
Now let S ⊆ V be bounded. If A ∈ K(V, W ) and B ∈ B(W, Z) then the closure of AS is
compact, hence so is its image under the continuous B, which contains BAS; thus BA is
compact. If instead A ∈ B(V, W ) and B ∈ K(W, Z) then AS is bounded, so that BAS has
compact closure by compactness of B. In either case BA ∈ K(V, Z). 

For the proof of the next result, we need the notion of total boundedness in metric spaces,
see Appendix A.6.4. In particular we will use Exercise A.43(iii).

12.11 Proposition K(V, W ) ⊆ B(V, W ) is k · k-closed for all Banach spaces V, W .


Proof. Since B(V, W ) is a metric space, it suffices to prove that the limit A of every norm-
convergent sequence {An } in K(V, W ) is in K(V, W ). Let thus {An } ⊆ K(V, W ) and A ∈
B(V, W ) such that kAn − Ak → 0. We want to prove that AV≤1 ⊆ W is precompact. Since
(W, k · k) is complete, by Exercise A.43(iii) this is equivalent to AV≤1 being totally
bounded. To show this, let ε > 0. Then there is an n such that kAn − Ak < ε/3. Since
An is compact, An V≤1 is precompact, thus totally bounded, so that there are x1 , . . . , xm ∈ V≤1
such that the balls B(An xi , ε/3), i = 1, . . . , m, cover An V≤1 . Equivalently, for each x ∈ V≤1
(thus kxk ≤ 1) there is i ∈ {1, . . . , m} such that kAn x − An xi k < ε/3. Thus

kAx − Axi k ≤ kAx − An xk + kAn x − An xi k + kAn xi − Axi k < ε/3 + ε/3 + ε/3 = ε,

where we used kA − An k < ε/3 and x, xi ∈ V≤1 . Thus the balls B(Axi , ε), i = 1, . . . , m, cover
AV≤1 , proving that AV≤1 is totally bounded, thus precompact. Thus A is compact. 

12.12 Corollary For all Banach spaces V, W , the norm-closure of F (V, W ) is contained in K(V, W ).

12.13 Exercise Let V = `p (S, F), where S is an infinite set and 1 ≤ p < ∞. If g ∈ `∞ (S, F)
and f ∈ `p (S, F) then Mg (f ) = gf (pointwise product) is in `p (S). This defines a linear map
`∞ (S, F) → B(V ), g 7→ Mg . Prove:
(i) g 7→ Mg is an algebra homomorphism.
(ii) kMg k = kgk∞ .
(iii) Mg ∈ K(V ) if and only if g ∈ c0 (S, F).
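The compactness claim in (iii) can be previewed numerically (an illustrative sketch on a finite section of `2 (N) with an assumed sequence g(k) = 1/(k + 1) ∈ c0 ; all names are ad hoc): Mg acts as a diagonal matrix, its norm is kgk∞ , and truncating g to finitely many coordinates yields finite rank operators approximating Mg in operator norm, so Mg lies in the norm-closure of the finite rank operators.

```python
import numpy as np

# Illustrative sketch for Exercise 12.13 on a finite section of l^2(N):
# M_g acts as the diagonal matrix diag(g). For the c_0-type sequence
# g(k) = 1/(k+1), finite rank truncations g_N approximate M_g in operator
# norm, with error ||M_g - M_{g_N}|| = sup_{k >= N} |g(k)|.
n = 1000
k = np.arange(n)
g = 1.0 / (1.0 + k)           # assumed c_0 sequence g(k) = 1/(k+1)

# (ii) The operator norm of a diagonal operator is the sup-norm of g:
op_norm = np.abs(g).max()
assert np.isclose(op_norm, 1.0)

# (iii) Finite rank truncation: keep only the first N coordinates of g.
N = 100
g_trunc = np.where(k < N, g, 0.0)
err = np.abs(g - g_trunc).max()        # = ||M_g - M_{g_trunc}||
assert np.isclose(err, 1.0 / (N + 1))  # = sup_{k >= N} 1/(k+1) -> 0 as N grows
```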

12.14 Remark 1. We now have two classes of compact operators: The (rather commutative)
one of multiplication by c0 -functions, and the operators that are norm-limits of finite rank
operators. Actually, the first class is contained in the second. Why?
2. In view of Exercise 12.13, the closed subspace K(V ) ⊆ B(V ) is a non-abelian analogue
of c0 ⊆ `∞ . If H is a separable Hilbert space, one can use the fact that c0 is not complemented
in `∞ to prove that K(H) is not complemented in B(H). See [31].

3. If either V or W is finite-dimensional, we have F (V, W ) = K(V, W ) = B(V, W ). While
K(V ) 6= B(V ) whenever V is infinite-dimensional since 1 ∈ B(V )\K(V ), very deep recent
(2011) work [5], see also [6, 179], has produced infinite-dimensional Banach spaces V for which
the difference between B(V ) and K(V ) is minimal in the sense of B(V ) = C1 + K(V )67 ! It is a
much older result that there are V, W both infinite-dimensional with B(V, W ) = K(V, W )! (Of
course then V 6' W .) For large classes of examples see Theorems 12.24 and B.30.
4. It is quite natural to ask whether the norm-closure of F (V, W ) equals K(V, W ). We will later
prove that this holds for each Hilbert space H, i.e. the closure of F (H) equals K(H). A Banach
space V is said to have the approximation property if the closure of F (W, V ) equals K(W, V )
for every Banach space W . (It is known68 that V has the approximation property if and only if
for every compact set X ⊂ V and ε > 0 there is a T ∈ F (V ) such that kx − T xk < ε ∀x ∈ X,
cf. e.g. [98, vol. 1]. Whether this is already implied by the closure of F (V ) being equal to
K(V ) seems to be still open.) Whether all Banach spaces have the approximation
property was an open problem until Enflo69 in 1973 [47] constructed a counterexample. His
construction was quite complicated and his spaces were not very ‘natural’ (in the sense of hav-
ing a simple definition and/or having been encountered previously). A simpler example, but
still tricky and not natural, can be found in [34]. Somewhat later, very natural examples were
found: The Banach space B(H) does not have the approximation property whenever H is an
infinite-dimensional Hilbert space, cf. [162]. (Note that this is about compact operators on
B(H), not compact operators in B(H)!) Most of this is well beyond the level of this course,
but you should be able to understand [34]. 2

None of the above examples of compact operators seems very relevant for applications, even
within mathematics. Indeed the most useful compact operators perhaps are integral operators.
We will briefly look at a class of them in Exercise 12.41. But there are very simple examples:

12.15 Definition Let V = C([0, C], F) for some C > 0, equipped with the norm kf k =
supx∈[0,C] |f (x)|. As we know, (V, k·k) is a Banach space. Define a linear operator, the Volterra70
operator, by

A : V → V, (Af )(x) = ∫0^x f (t) dt.

We have kAf k = supx |∫0^x f (t) dt| ≤ ∫0^C |f (t)| dt ≤ Ckf k, thus kAk ≤ C < ∞.
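As a numerical aside (not from the notes; grid size and quadrature rule are ad hoc choices), discretizing A exhibits the finite-dimensional shadow of compactness: the singular values of the discretization decay to zero, so it is well approximated by finite rank matrices.

```python
import numpy as np

# Illustrative sketch: discretize the Volterra operator (Af)(x) = ∫_0^x f(t) dt
# on [0, 1] (so C = 1) by the left-endpoint rule on n grid points. The result
# is (1/n) times a lower-triangular matrix of ones.
n = 200
A = np.tril(np.ones((n, n))) / n

s = np.linalg.svd(A, compute_uv=False)  # singular values, in decreasing order

# The operator norm of the discretization respects the bound ||A|| <= C = 1:
assert s[0] <= 1.0

# Finite-dimensional shadow of compactness: the singular values decay towards 0,
# so A is close to its finite rank truncations.
assert s[-1] < 0.01
```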

12.16 Proposition The Volterra operator A : V → V is compact.


The proof of this result makes essential use of the Arzelà-Ascoli Theorem71 which charac-
terizes the (pre)compact subsets of the metric space (C(X, F), k · k∞ ) for compact X.

67
Fittingly, this is not only the most recent, but also by far the most sophisticated result mentioned in these notes.
See [179], which offers both a brief introduction to this result and an excellent overview of what happened in Banach space theory since the 1970s.
68
This was proven by Alexander Grothendieck (1928-2014), German-born mathematician (later French) who first
made fundamental contributions to functional analysis and then revolutionized algebraic geometry.
69
Per H. Enflo (1944-). Swedish mathematician, working mostly in functional analysis. He also made seminal
contributions to the ‘invariant subspace problem’ by constructing an infinite-dimensional separable Banach space V
and an A ∈ B(V ) such that the only closed subspaces W ⊆ V with AW ⊆ W are W = 0 and W = V .
70
Vito Volterra (1860-1940). Italian mathematician and one of the early pioneers of functional analysis.
71
You should have seen this theorem in Analysis 2 or Topology. See e.g. Appendix A.6.4 or [57, Vol. 2, Theorem
15.5.1]. It has many applications in classical analysis, for example Peano’s existence theorem on differential equations.

Proof. We will prove that F = AV≤1 ⊆ V is precompact by showing that it satisfies the hypotheses
of Theorem A.45. If x ∈ [0, C] and f ∈ C([0, C], F) with kf k ≤ 1 then

|(Af )(x)| = |∫0^x f (t) dt| ≤ Ckf k ≤ C < ∞,

showing that F is pointwise bounded. For each f ∈ V with kf k ≤ 1 we have

|(Af )(x) − (Af )(y)| = |∫y^x f (t) dt| ≤ |x − y| kf k ≤ |x − y|.

Since this bound is uniform in f ∈ V≤1 it shows that F is equicontinuous. 

The above proof is very easily adapted to give compactness of A : V → V with V =
C([a, b], F) given by (Af )(x) = ∫a^b K(x, y)f (y) dy for any K ∈ C([a, b] × [a, b], F). For another
class of compact integral operators see Example 12.41.

12.17 Theorem (Schauder 1930) Let V, W be Banach spaces and A ∈ B(V, W ). Then
At ∈ B(W ∗ , V ∗ ) is compact if and only if A is compact.
Proof. For simplicity we assume V = W . The general case requires only notational changes. As
in the proof of Proposition 12.11, we use the equivalence of precompactness and total boundedness
from Exercise A.43(iii). If A ∈ B(V ) is compact then AV≤1 ⊆ V is precompact, thus totally
bounded. Thus for every ε > 0 there are x1 , . . . , xn ∈ V≤1 such that the balls B(Axi , ε/3)
cover AV≤1 . Equivalently, for every x ∈ V≤1 there is an i such that kAx − Axi k < ε/3. Now
define a bounded linear map B : V ∗ → Fn (where Fn has the norm k · k∞ ) by
B : ϕ 7→ (ϕ(Ax1 ), . . . , ϕ(Axn )).
Since Fn is finite-dimensional, B has finite rank, thus is compact, so that B(V ∗ )≤1 ⊆ Fn is
totally bounded. Thus there are ψ1 , . . . , ψm ∈ (V ∗ )≤1 such that for every ψ ∈ (V ∗ )≤1 we have
kBψ − Bψj k < ε/3 for some j. This gives

max1≤i≤n |(At ψ)(xi ) − (At ψj )(xi )| = max1≤i≤n |ψ(Axi ) − ψj (Axi )| = kBψ − Bψj k < ε/3. (12.1)

If now x ∈ V≤1 then kAx − Axi k < ε/3 for some i, thus |(At ψ)(x) − (At ψ)(xi )| = |ψ(Ax − Axi )| <
ε/3 for every ψ ∈ (V ∗ )≤1 , and (12.1) gives |(At ψ)(xi ) − (At ψj )(xi )| < ε/3. Thus

|(At ψ)(x) − (At ψj )(x)| ≤
|(At ψ)(x) − (At ψ)(xi )| + |(At ψ)(xi ) − (At ψj )(xi )| + |(At ψj )(xi ) − (At ψj )(x)|
< ε/3 + ε/3 + ε/3 = ε.
Since this holds for all x ∈ V≤1 , we have kAt ψ − At ψj k ≤ ε, and since ψ ∈ (V ∗ )≤1 was arbitrary
we have total boundedness of At (V ∗ )≤1 , thus compactness of At .

Now assume that At ∈ B(V ∗ ) is compact. Then by the above, Att ∈ B(V ∗∗ ) is compact.
Since V ⊆ V ∗∗ is a closed subspace that is Att -invariant with Att |V = A by Lemma 9.33,
compactness of A follows from Remark 12.6.3. 

12.18 Remark The above self-contained proof, taken from [110], has much in common with
the proof of the Arzelà-Ascoli theorem, and indeed the latter can be used to prove Schauder’s
theorem, as has become the standard proof, cf. e.g. [94]. An alternative proof is sketched in
Remark 12.20 below. Yet another proof uses the circle of ideas in Section 10 (weak topologies
and Alaoglu’s theorem) as in [30, Theorem VI.3.4]. 2

The following is an instructive – and useful – characterization of compact operators which
should go some way towards making the notion more intuitive:

12.19 Theorem (H. E. Lacey 1963) Let V, W be Banach spaces and A ∈ B(V, W ). Then
the following are equivalent:
(i) A is compact.
(ii) For every ε > 0 there exists a closed subspace Z ⊆ V of finite codimension such that
kA  Zk ≤ ε.
Proof. (We follow the nice exposition in [121].) (i)⇒(ii) Let ε > 0. Since AV≤1 is precompact,
there are w1 , . . . , wn ∈ W such that the balls B(wi , ε), i = 1, . . . , n, cover AV≤1 . By Proposition 9.9 we can find
ϕ1 , . . . , ϕn ∈ W ∗ such that kϕi k = 1 and ϕi (wi ) = kwi k for all i. Now
Z = {v ∈ V | ϕ1 (Av) = · · · = ϕn (Av) = 0} = ker At ϕ1 ∩ · · · ∩ ker At ϕn

is a closed subspace of V (by continuity of A and the ϕi ). As the intersection of finitely many
spaces of codimension ≤ 1, Z has finite codimension ≤ n. If now z ∈ Z≤1 ⊆ V≤1 , there exists
an i such that kAz − wi k < ε. And with ϕi (Az) = 0 for all z ∈ Z, i ∈ {1, . . . , n} we have

kwi k = |ϕi (wi )| ≤ |ϕi (wi − Az)| + |ϕi (Az)| < ε,

thus
kAzk ≤ kAz − wi k + kwi k ≤ ε + ε = 2ε.
Since this holds for all z ∈ Z≤1 , we have kA  Zk ≤ 2ε.
(ii)⇒(i) If ε > 0, by assumption there is a closed subspace Z ⊆ V of finite codimension such
that kA  Zk ≤ ε. By Proposition 6.11 (i), Z is complemented, thus by Exercise 7.15(i), there is
an idempotent PZ with PZ V = Z. But this is easy to prove directly: We can find unit vectors
e1 , . . . , en , where n = codimZ, such that Y = spanF {e1 , . . . , en } is an algebraic complement
for Z. Then every v ∈ V is of the form v = z + y with unique z ∈ Z, y ∈ Y . Now define
PZ : z + y 7→ z and PY : z + y 7→ y. Since Y is finite-dimensional, PY has finite rank, thus is
compact. Thus with δ = min(1, ε/kAk), there are v1 , . . . , vm ∈ V≤1 such that the balls
B(PY vi , δ), i = 1, . . . , m, cover PY V≤1 .

If now v ∈ V≤1 there is an i such that kPY v − PY vi k < δ. With PZ (v − vi ) = (v − vi ) − PY (v − vi )
we have
kPZ v − PZ vi k ≤ kv − vi k + kPY v − PY vi k ≤ 2 + δ ≤ 3,
so that, again using PZ + PY = 1, and kAPZ k = kA  Zk ≤ ε we have

kAv − Avi k ≤ kAPZ v − APZ vi k + kAPY v − APY vi k ≤ 3kAPZ k + δkAk ≤ 4ε.

Thus the balls B(Avi , 4ε), i = 1, . . . , m, cover AV≤1 . Since ε > 0 was arbitrary, AV≤1 is
totally bounded, thus precompact, and therefore A is compact. 

12.20 Remark Very similarly to the proof of the preceding theorem one can prove [143] that
compactness of A ∈ B(V, W ) is equivalent to the following statement, dual to the above (ii):

(iii) For every ε > 0 there exists a finite-dimensional subspace Y ⊆ W such that kQAk < ε,
where Q is the quotient map W → W/Y .
Now one can give an alternative proof [143] of Schauder’s theorem: Assume A is compact,
thus (iii) holds. Thus for every ε > 0, there is a finite-dimensional subspace Y ⊆ W such that
kQAk < ε, where Q : W → W/Y . Put Z = Y ⊥ ⊆ W ∗ , which has finite codimension. Now
At  Z = (QA)t , so that kAt  Zk = k(QA)t k = kQAk < ε. Thus At satisfies (ii) in Theorem
12.19 and therefore is compact. The converse implication is proven as in Theorem 12.17. (But
ultimately this is the same proof, the finite-(co)dimensional subspaces Y playing a role similar
to that of the auxiliary Fn in the first proof.) 2

12.21 Exercise Let V, W be Banach spaces and A ∈ K(V, W ). Prove:


(i) AV ⊆ W is separable.
(ii) If AV ⊆ W is closed then AV is finite-dimensional.
The following result has applications to Sobolev spaces and PDEs:

12.22 Exercise Let X, Y, Z be Banach spaces, T ∈ B(X, Y ) compact and S ∈ B(Y, Z) injec-
tive. Prove (possibly by contradiction) that for every ε > 0 there is a Cε ≥ 0 such that

kT xkY ≤ εkxkX + Cε kST xkZ ∀x ∈ X.

(Note: If ε ≥ kT k then we can put Cε = 0. The point is that 0 < ε < kT k is allowed!)

12.23 Exercise If V, W are Banach spaces, A ∈ B(V, W ) is called strictly singular if there is
no infinite-dimensional subspace Z ⊆ V such that A  Z is bounded below. Prove:
(i) A ∈ B(V, W ) is strictly singular if and only if there is no infinite-dimensional closed
subspace Z ⊆ V such that A : Z → AZ is an isomorphism.
(ii) Every compact operator is strictly singular.
In Remark B.33 we will see that there are non-compact strictly singular operators!

12.2 ? Compactness vs. weak forms of weak-norm continuity


Recognizing compact operators can be difficult. So far we have seen that compactness of some
integral operators can be proven using the Arzelà-Ascoli theorem (see also Section 12.4), and we
know that operators in the norm-closure of the finite rank operators are compact. But our only
general criterion so far is Theorem 12.19. The fact that the finite rank operators are precisely
the weak-norm continuous operators, cf. Exercise 10.10, suggests that weak topologies can be
brought to bear on proving compactness. We begin with an easy but remarkable instance:

12.24 Theorem If V is a reflexive Banach space then all bounded linear operators V →
`1 (N, F) and c0 (N, F) → V are compact.
Proof. Let A ∈ B(V, `1 (N, F)). By Exercise 10.14(iii), V≤1 with the weak topology is sequentially
compact. Thus every norm-bounded sequence {xn } ⊂ V has a weakly convergent subsequence
{xni }. Since A is weak-weak continuous by Exercise 10.9, {Axni } ⊂ `1 (N, F) converges weakly.
Now Schur’s Theorem 10.4 implies that {Axni } ⊂ `1 (N, F) converges in norm. Thus {Axn } has
a norm-convergent subsequence, so that A is compact.

Let A ∈ B(c0 , V ), thus At ∈ B(V ∗ , c∗0 ). Since V ∗ is reflexive and c∗0 ∼= `1 , the above gives
that At is compact, thus also A by Schauder’s Theorem 12.17. 

Remarkably, Theorem 12.24 requires no information about A other than its boundedness.
But the proof relies on the rather exceptional Schur property of `1 . In view of the proof it is
clear that B(V, W ) = K(V, W ) whenever V is reflexive and W has the Schur property. (See
[90] for further generalizations.)
To obtain results of wider applicability, it will be useful to characterize compactness of an
operator in terms of properties slightly weaker than weak-norm continuity. We will consider
three such properties of an operator, beginning with:

12.25 Proposition Let V, W be Banach spaces and A ∈ B(V, W ). Then A is compact if and
only if the restriction A : V≤1 → W is weak-norm continuous.
Proof. Assume A is compact. Let {xι } be a net in the closed unit ball of V that converges weakly
to zero. Let ε > 0. By Theorem 12.19 there exists a closed subspace Z ⊆ V of finite codimension
such that kA  Zk ≤ ε. Since Z is complemented, there is a finite rank idempotent P ∈ B(V )
such that (1 − P )V = Z. Since AP has finite rank, it is weak-norm continuous by Exercise
10.10, thus kAP xι k → 0. Now with kA(1 − P )k ≤ ε we have

kAxι k ≤ kAP xι k + kA(1 − P )xι k ≤ kAP xι k + 2ε.

Since kAP xι k → 0 and ε > 0 was arbitrary, we have kAxι k → 0. Thus A is weak-norm
continuous.
Now assume that A  V≤1 is weak-norm continuous, and let ε > 0. By the weak-norm
continuity, V≤1 ∩ A−1 (B W (0, ε)) ⊆ V≤1 is a weakly open neighborhood of 0 ∈ V . Thus there
exist ϕ1 , . . . , ϕn ∈ V ∗ such that

{x ∈ V≤1 | |ϕi (x)| < 1 ∀i} ⊆ V≤1 ∩ A−1 (B W (0, ε)). (12.2)

Now Z = ker ϕ1 ∩ · · · ∩ ker ϕn ⊆ V is a closed subspace of codimension ≤ n, and for every z ∈ Z≤1 we
have ϕi (z) = 0 ∀i, and therefore kAzk < ε by (12.2). Thus kAvk ≤ εkvk for all v ∈ Z,
so that we have kA  Zk ≤ ε. Now A is compact by Theorem 12.19. 

12.26 Remark 1. An alternative proof of the implication ⇒ that uses the compactness of A
directly goes like this: Let {xι }ι∈I be a net in V≤1 that converges weakly to zero. Since the
norm-continuous A is weak-weak continuous, Axι → 0 weakly. If kAxι k 6→ 0 then there exists an ε > 0
such that for every ι ∈ I there is a ι0 ≥ ι such that kAxι0 k ≥ ε. Using this we can construct
a subnet {xσ }σ∈Σ of {xι } such that kAxσ k ≥ ε for all σ ∈ Σ. Since the closure of AV≤1 is compact, the net
{Axσ } has an accumulation point w ∈ W , which clearly satisfies kwk ≥ ε. Since w 6= 0 also is a
weak accumulation point of {Axσ }, we have a contradiction with Axι → 0 weakly. Thus kAxι k → 0.
This argument is just the obvious adaptation of the proof of Theorem 12.28(ii) to nets. But it
is less elementary than the one above in that it uses accumulation points and subnets of nets.
2. By general topology, cf. e.g. [142, 108], continuity of A : (V≤1 , τw ) → (W, k·k) is equivalent
to the statement that the net {Axι } ⊂ W is norm-convergent for every weakly convergent
bounded net {xι }ι∈I in V . But either formulation is hard to verify, and we would prefer a
criterion involving only sequences. 2

12.27 Definition A linear map A : V → W of Banach spaces is sequentially weak-norm


continuous if {Axn } ⊂ W is norm-convergent for every weakly convergent sequence {xn } in V .

12.28 Theorem Let V, W be Banach spaces and A : V → W linear. Then
(i) If A is sequentially weak-norm continuous, it is bounded.
(ii) If A is compact then it is sequentially weak-norm continuous.
(iii) Every bounded linear map from `1 to an arbitrary Banach space is sequentially weak-norm
continuous, yet 1 : `1 → `1 is non-compact.
(iv) If V is reflexive or V ∗ is separable then every sequentially weak-norm continuous
A : V → W is compact.
Proof. (i) If A is unbounded, we can find {xn } ⊂ V with kxn k = 1 such that kAxn k ≥ n2 . Now
yn = xn /n → 0 in norm, thus weakly, while Ayn does not converge in norm since kAyn k ≥ n. This
is a contradiction.
(ii) It suffices to prove that kAxn k → 0 whenever {xn } converges weakly to zero. By Exercise
10.6, {xn } is bounded. Since A is weak-weak continuous, we have Axn → 0 weakly. If kAxn k 6→ 0 then
there exist ε > 0 and a subsequence yi = xni such that kAyi k ≥ ε for all i. By compactness
of A there is a second subsequence zi = ymi such that Azi converges in norm. The limit is non-
zero since kAzi k ≥ ε for all i. Thus Azi converges to a non-zero limit also weakly, contradicting
the weak convergence Axn → 0. Thus kAxn k → 0.
(iii) If {xn } ⊂ `1 is a weakly convergent sequence, it is norm-convergent by Schur’s Theorem
10.4, thus {Axn } is norm-convergent for any A : `1 → V . Thus A is sequentially weak-norm
continuous. The second statement follows from the infinite-dimensionality of `1 , cf. Remark 12.6.4.
(iv) If V is reflexive then every bounded sequence {xn } ⊂ V has a weakly convergent
subsequence by Exercise 10.14(iii). Since A maps the latter to a norm convergent sequence in
W , {Axn } has a norm-convergent subsequence, thus A is compact.
If V ∗ is separable then every bounded sequence in V has a weakly Cauchy subsequence by
Exercise 10.14(i). Now compactness of A follows as in the preceding case if we use the following
lemma. 

12.29 Lemma A sequentially weak-norm continuous operator between Banach spaces maps
weakly Cauchy sequences to norm convergent ones.
Proof. Let {xn } be a weakly Cauchy sequence in V and {in }, {jn } strictly increasing sequences in N. Then for each ϕ ∈ V ∗ the sequences {ϕ(xin )}
and {ϕ(xjn )} have the same limit limn→∞ ϕ(xn ). Thus {xin − xjn } is a weakly null sequence,
so that kA(xin − xjn )k → 0 by the sequential weak-norm continuity of A. Now the following
exercise gives that {Axn } is Cauchy w.r.t. k · k and therefore norm-convergent. 

12.30 Exercise Let (X, d) be a metric space and {xn } a sequence in X such that d(xin , xjn ) →
0 for all strictly increasing sequences {in }, {jn } of natural numbers. Prove that {xn } is Cauchy.

12.31 Remark 1. Our first application of Theorem 12.28(iv) is that no infinite-dimensional


Banach space with the Schur property is reflexive or has separable dual. This follows since
the Schur property of V means that idV is sequentially weak-norm continuous, while idV is
non-compact for infinite-dimensional V .
2. In Section B.3.6, Theorem 12.28(iv) will be used to prove some interesting results about
the automatic compactness of a large class of bounded operators.
3. Accepting Footnote 62, Theorem 12.28(iv) immediately generalizes to all V that have no
subspace isomorphic to `1 . 2

We quickly look at a third property of weak-norm type:

12.32 Definition If V, W are Banach spaces then a linear A : V → W is completely continuous
if weak compactness of X ⊂ V implies norm-compactness of AX ⊂ W .

12.33 Proposition Let V, W be Banach spaces and A : V → W linear. Then


(i) If A  V≤1 is weak-norm continuous then it is completely continuous and sequentially
weak-norm continuous.
(ii) If A is completely continuous, it is bounded.
(iii) A is completely continuous if and only if it is sequentially weak-norm continuous.
Proof. (i) The hypothesis clearly implies weak-norm continuity on all bounded sets. If now X ⊂
V is weakly compact, it is norm-bounded by Exercise 10.7, so that AX ⊂ W is norm-compact by
the hypothesis on A. Thus A is completely continuous. Since every weakly convergent sequence
is bounded by Exercise 10.6, the sequential weak-norm continuity of A is equally obvious.
(ii) If A is unbounded, there is a sequence {xn } in V with kxn k = 1 and kAxn k ≥ n2 for all
n. Then yn = xn /n converges to zero in norm, thus weakly. Thus Y = {yn | n ∈ N} ∪ {0} is
weakly compact72 , thus AY is norm compact, thus bounded, contradicting kAyn k ≥ n.
(iii) ⇒: Assume A : V → W to be completely continuous and let {xn } ⊂ V converge weakly
to x ∈ V . Since A is bounded (= norm-continuous) by (ii), it is weak-weak continuous, thus
{Axn } converges weakly. If the set {Axn | n ∈ N} is finite, its linear span in W is finite-
dimensional, thus weak and norm topology coincide on it, so that the weak convergence of
{Axn } implies norm convergence.
Thus from now on we may assume {Axn | n ∈ N} to be infinite. By weak convergence,
X = {xn | n ∈ N} ∪ {x} ⊆ V is weakly compact. Thus AX = {Axn | n ∈ N} ∪ {Ax} ⊂ W
is norm-compact by complete continuity of A. Thus AX has at least one limit point y ∈ AX
(i.e. #(B(y, ε) ∩ AX) = ∞ for every ε > 0) since otherwise it would be discrete and infinite,
contradicting compactness. For every limit point z of AX there is a subsequence of {Axn }
converging to z in norm, thus also weakly. But the weakly convergent sequence {Axn } has
precisely one weak limit point, so that it has exactly one limit point in the norm topology, thus
converges in norm.
⇐: Assume A : V → W is sequentially weak-norm continuous and X ⊂ V is weakly com-
pact, thus bounded by Exercise 10.7. Let {yn } be a sequence in AX. Clearly there exists a
sequence {xn } in X with Axn = yn ∀n. Since X is weakly compact, it is weakly sequentially
compact by the easier half of the Eberlein-Šmulyan theorem, cf. Remark 10.16, thus there is a
weakly convergent subsequence {xni }. By the sequential weak-norm continuity of A, {yni = Axni } is norm-
convergent, proving norm-compactness of AX. Thus A is completely continuous. 

Summarizing, for every linear A : V → W we have the following implications:

A compact ⇐⇒ A  V≤1 weak-norm continuous

⇓ (and ⇑ if V is reflexive or V ∗ is separable)

A sequentially weak-norm continuous ⇐⇒ A completely continuous

72
In any topological space, convergence xn → x implies compactness of {xn | n ∈ N} ∪ {x}.

We close by mentioning another property involving weak topologies that an operator may
have or not: A ∈ B(V, W ) is called weakly compact if it maps bounded sets of V to subsets of
W that are precompact in the weak topology. See e.g. [102, Section 3.5].

12.3 Compact Hilbert space operators


12.34 Proposition If H is a Hilbert space and A ∈ B(H) is compact then A∗ and |A| are
compact.
Proof. By the polar decomposition, there is a partial isometry U such that A = U |A| and
|A| = U ∗ A. Since K(H) ⊆ B(H) is an ideal (Lemma 12.10) the second identity and compactness
of A give compactness of |A|. Since the adjoint of the first identity is A∗ = |A|U ∗ , a second use
of the ideal property of K(H) gives compactness of A∗ . 

12.35 Remark 1. Compactness of A∗ also follows from A∗ = γ −1 ◦ At ◦ γ, compactness of At
(Theorem 12.17) and the fact that γ : H → H ∗ , y 7→ h·, yi, is a homeomorphism.
2. In Corollary 14.15 we will prove, without any use of A∗ , that for compact A and ε > 0
there is F ∈ F (H) with kA − F k < ε, thus kA∗ − F ∗ k < ε. By the following exercise, F ∗ is
finite rank. Since ε > 0 was arbitrary, Corollary 12.12 again gives A∗ ∈ K(H). 2

12.36 Exercise Let H be a Hilbert space.


(i) If x, y ∈ H, define x ⊗ y ∈ B(H) by (x ⊗ y)(z) = xhz, yi. Prove that (x ⊗ y)∗ = (y ⊗ x).
(ii) Prove that A ∈ F (H) ⇒ A∗ ∈ F (H) and dim A∗ H = dim AH.
Hint: You can prove A = x1 ⊗ y1 + · · · + xK ⊗ yK for K = dim AH and x1 , y1 , . . . , xK , yK ∈ H and
use (i), but there also is an elegant direct proof.

12.37 Theorem Let H be a Hilbert space and A ∈ B(H). Then the following are equivalent:
(α) For every ε > 0 there is an orthogonal projection P such that kP AP k < ε and P H ⊆ H
has finite codimension.
(β) A is compact.
(γ) For every orthonormal sequence {en }n∈N ⊆ H we have Aen → 0.
(δ) For every orthonormal sequence {en }n∈N ⊆ H we have hAen , en i → 0.
The implication (γ) ⇒ (δ) follows trivially from Cauchy-Schwarz. As to the rest:

12.38 Exercise Let H be a Hilbert space and A ∈ B(H).


(i) Prove (α) ⇒ (β) in Theorem 12.37.
(ii) Prove (β) ⇒ (γ). Hint: Begin by proving weak convergence.
(iii) Prove (δ) ⇒ (α) for self-adjoint A. Hint: Assuming that (α) is false, use Proposition
11.34(i) to construct an orthonormal sequence {en } such that hAen , en i 6→ 0.
(iv) Deduce the general statement (δ) from the self-adjoint case.
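Condition (γ) is easy to visualize for diagonal operators (an illustrative aside with an assumed diagonal operator, not from the notes): for Aen = en /(n + 1), which is compact by Exercise 12.13(iii), we get kAen k → 0 along the orthonormal sequence {en }, whereas the non-compact identity gives k1en k = 1 for all n.

```python
import numpy as np

# Illustrative aside for condition (γ) of Theorem 12.37, using the assumed
# compact diagonal operator A e_n = e_n/(n+1) on l^2(N): along the orthonormal
# sequence {e_n} the norms ||A e_n|| tend to 0, while the non-compact identity
# satisfies ||1 e_n|| = 1 for every n.
n_max = 10_000
norms_compact = 1.0 / (np.arange(n_max) + 1.0)  # ||A e_n|| = 1/(n+1)
norms_identity = np.ones(n_max)                 # ||1 e_n|| = 1

assert norms_compact[-1] < 1e-3     # tends to 0, consistent with compactness
assert norms_identity.min() == 1.0  # never small: the identity is not compact
```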

12.4 ? Hilbert-Schmidt operators: L2 (H)
If E is an ONB for a Hilbert space H and A ∈ B(H), we have seen in Lemma 11.51 that TrE (A∗ A)
does not depend on E, so that we can write Tr(A∗ A), and that Tr(A∗ A) = Tr(AA∗ ).

12.39 Definition For each A ∈ B(H) we define

kAk2 = (Tr(A∗ A))1/2 ∈ [0, ∞], L2 (H) = {A ∈ B(H) | kAk2 < ∞}.

The elements of L2 (H) are called Hilbert-Schmidt73 operators.

12.40 Proposition Let H be a Hilbert space. Then


(i) kAk ≤ kAk2 = kA∗ k2 for all A ∈ B(H). Thus L2 (H) is self-adjoint.
(ii) For every A, B ∈ L2 (H),

hA, BiHS = Tr(B ∗ A) = Σe∈E hB ∗ Ae, ei

is absolutely convergent and independent of the ONB E. Now h·, ·iHS is an inner product
on L2 (H) such that hA, AiHS = kAk22 . And (L2 (H), h·, ·iHS ) is complete, thus a Hilbert
space.
(iii) For all A, B ∈ B(H) we have kABk2 ≤ kAkkBk2 and kABk2 ≤ kAk2 kBk. Thus L2 (H) ⊆
B(H) is a two-sided ideal.
(iv) We have F (H) ⊆ L2 (H) ⊆ K(H), and the k · k2 -closure of F (H) is all of L2 (H).
Proof. (i) If x ∈ H is a unit vector, pick an ONB E containing x. Then kAxk2 = hA∗ Ax, xi ≤
TrE (A∗ A) = kAk22 . Thus kAxk ≤ kAk2 whenever kxk = 1, proving the inequality. And
kA∗ k22 = Tr(AA∗ ) = Tr(A∗ A) = kAk22 .
(ii) If E is any ONB (whose choice does not matter) for H, we have (as before)

Tr(A∗ A) = Σe∈E hA∗ Ae, ei = Σe∈E hAe, Aei = Σe∈E kAek2 = Σe∈E Σe0 ∈E |hAe, e0 i|2 .

Thus L2 (H) is the set of A ∈ B(H) for which the matrix elements hAe, e0 i (w.r.t. the ONB E)
are absolutely square summable. We therefore have a map

α : L2 (H) → `2 (E × E), A 7→ {hAe, e0 i}(e,e0 )∈E 2

that clearly is injective. (Recall that `2 (S) = L2 (S, µ), where µ is the counting measure.) To
show surjectivity of α, let f = {fee0 } ∈ `2 (E × E). Define a linear operator A : H → H by
A : e 7→ Σe0 ∈E fee0 e0 . For each e, the r.h.s. is in H by square summability of f . If x ∈ H then
kAxk2 = supE 0 k Σe∈E hx, ei Σe0 ∈E 0 fee0 e0 k2 = supE 0 k Σe0 ∈E 0 (Σe∈E hx, eifee0 ) e0 k2
= supE 0 Σe0 ∈E 0 | Σe∈E hx, eifee0 |2 ≤ kxk2 Σe,e0 |fee0 |2 ,

73
Erhard Schmidt (1876-1959). Baltic German mathematician, contributions to functional analysis like Gram-
Schmidt orthogonalization.

where the supremum is over the finite subsets E 0 ⊆ E, we used |hx, ei| ≤ kxk and the change
of summation order is allowed due to the finiteness of E 0 . This computation shows that kAk ≤
(Σe,e0 |fee0 |2 )1/2 < ∞. Thus A ∈ B(H) and α(A) = f , so that α is surjective. Thus α : L2 (H) →
`2 (E × E) is a linear bijection. Now `2 (E × E) is a Hilbert space (in particular complete) with
inner product (f, g) = Σe,e0 fee0 (gee0 )∗ (with ∗ denoting complex conjugation), and pulling this
inner product back to L2 (H) along α we have

hA, BiHS = Σ(e,e0 )∈E 2 hAe, e0 i (hBe, e0 i)∗ = Σ(e,e0 )∈E 2 hAe, e0 ihe0 , Bei
= Σe∈E hAe, Bei = Σe∈E hB ∗ Ae, ei = Tr(B ∗ A),

where all sums converge absolutely. Lemma 11.51 gives that hA, AiHS = Tr(A∗ A) is independent
of the chosen ONB, and for general (A, B) this follows by the polarization identity.
From the above it is clear that (L2 (H), h·, ·iHS ) is isomorphic to the Hilbert space (`2 (E ×
E), h·, ·i), thus a Hilbert space. And the norm associated to h·, ·iHS is nothing other than k · k2 .
(iii) For any ONB E we have
$$\|AB\|_2^2 = \mathrm{Tr}(B^*A^*AB) = \sum_{e\in E}\|ABe\|^2 \le \|A\|^2 \sum_{e\in E}\|Be\|^2 = \|A\|^2\,\mathrm{Tr}(B^*B) = \|A\|^2\|B\|_2^2,$$
proving $\|AB\|_2 \le \|A\|\|B\|_2$. And $\|AB\|_2 = \|(AB)^*\|_2 = \|B^*A^*\|_2 \le \|B^*\|\|A^*\|_2 = \|A\|_2\|B\|$, where we used the fact just proven and $\|A^*\|_2 = \|A\|_2$. The conclusion is obvious.
(iv) The inclusion F(H) ⊆ L²(H) is very easy and is left as an exercise. If A ∈ L²(H) and F is a finite subset of the ONB E, define $p_F = \sum_{e\in F}\langle\cdot, e\rangle e$ and $A_F = Ap_F$. Then $A_F \in F(H)$ and
$$\|A - A_F\|_2^2 = \|A(1 - p_F)\|_2^2 = \sum_{(e,e')\in (E\setminus F)\times E} |\langle Ae, e'\rangle|^2.$$

This implies $\|A - A_F\|_2 \to 0$ as $F \nearrow E$, so that $L^2(H) = \overline{F(H)}^{\|\cdot\|_2}$.
Finally, by (i) we have $\|A - A_F\| \le \|A - A_F\|_2 \to 0$, thus $A \in \overline{F(H)}^{\|\cdot\|} \subseteq K(H)$, where we used Corollary 12.12. This proves L²(H) ⊆ K(H). □

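Not part of the notes, but in finite dimension (where ‖·‖₂ is the Frobenius norm) parts (i)-(iii) are easy to test numerically; the following NumPy sketch is only an illustration of the statements just proved:

```python
# Finite-dimensional check of (i)-(iii): Tr(A*A) = sum of |matrix elements|^2,
# ||A*||_2 = ||A||_2, ||A|| <= ||A||_2, and ||AB||_2 <= ||A|| ||B||_2.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
B = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))

def hs(M):
    """Hilbert-Schmidt norm: sqrt(Tr(M* M))."""
    return np.sqrt(np.trace(M.conj().T @ M).real)

op = lambda M: np.linalg.norm(M, 2)  # operator norm = largest singular value

assert np.isclose(hs(A) ** 2, np.sum(np.abs(A) ** 2))  # (ii)
assert np.isclose(hs(A), hs(A.conj().T))               # (i): ||A*||_2 = ||A||_2
assert op(A) <= hs(A)                                  # (i): ||A|| <= ||A||_2
assert hs(A @ B) <= op(A) * hs(B)                      # (iii)
print("Hilbert-Schmidt identities and inequalities hold")
```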
12.41 Example (L²-Integral operators) Let (X, A, µ) be a measure space, and put H = L²(X, A, µ). Let k : X × X → C be measurable (w.r.t. the product σ-algebra A × A) and assume $\iint |k(x, y)|^2\, d\mu(x)\, d\mu(y) < \infty$. (Thus k ∈ L²(X × X, A × A, µ × µ).) Then
$$(Kf)(x) = \int_X k(x, y) f(y)\, d\mu(y)$$

defines a linear operator K : H → H whose Hilbert-Schmidt norm kKk2 coincides with the
norm kkkL2 of k ∈ L2 (X × X). Thus K is Hilbert-Schmidt, and in particular compact.

12.42 Exercise Prove the equality kKk2 = kkkL2 of norms claimed in the above example.
If V, W are vector spaces over any field K then there is a canonical linear map W ⊗K V ∗ →
HomK (V, W ) sending w ⊗K ϕ to the linear map v 7→ wϕ(v). (Here V ∗ is the algebraic dual space
and ⊗K is the algebraic tensor product.) If V or W is finite-dimensional, this map is a bijection,
but otherwise it is not. For Hilbert spaces, one has a statement that works irrespective of the
dimensions:
12.43 Exercise Let H be a Hilbert space.
(i) Define $\overline{H}$ to be the vector space H with scalar action $(c, x) \mapsto \overline{c}x$ and inner product $\langle x, y\rangle_{\overline{H}} = \overline{\langle x, y\rangle}$. Prove that $\overline{H}$ is a Hilbert space.
(ii) Define a map $\alpha : H \otimes_{\mathrm{alg}} \overline{H} \to F(H)$ by associating to $\sum_{i=1}^n x_i \otimes y_i$ the operator $z \mapsto \sum_{i=1}^n x_i \langle z, y_i\rangle$. Prove that α is linear and extends to an isometric bijection $\alpha : H \otimes \overline{H} \to L^2(H)$.

12.44 Remark See Appendix B.11 for the definition of Lp (H) for all p ∈ [1, ∞) and a more
complete discussion of L1 (H). 2

12.45 Exercise Given a set S and g ∈ ℓ∞(S, C), define H = ℓ²(S, C) and the multiplication operator $M_g : H \to H,\ f \mapsto gf$ (known from Exercise 12.13, where we saw $M_g \in K(H) \Leftrightarrow g \in c_0(S)$). Prove $|M_g| = M_{|g|}$ and $\|M_g\|_p = \|g\|_p$ for all p ∈ [1, ∞). (Thus $M_g \in L^p(H) \Leftrightarrow g \in \ell^p(S, C)$.)
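For finite S the exercise can be checked directly: M_g is then the diagonal matrix diag(g), its singular values are the |g(s)|, and the Schatten p-norm is the ℓᵖ norm of g. A small sketch (an illustration, not part of the notes):

```python
# For finite S, M_g = diag(g); singular values of diag(g) are |g(s)|, so
# |M_g| = M_{|g|} and the Schatten p-norm of M_g equals ||g||_p.
import numpy as np

g = np.array([3.0, -1.0, 0.5, 2.0])
Mg = np.diag(g)
sv = np.linalg.svd(Mg, compute_uv=False)   # spectrum of |M_g|

assert np.allclose(np.sort(sv), np.sort(np.abs(g)))          # |M_g| = M_{|g|}
for p in (1.0, 2.0, 4.0):
    schatten_p = np.sum(sv ** p) ** (1 / p)
    assert np.isclose(schatten_p, np.sum(np.abs(g) ** p) ** (1 / p))
print("Schatten p-norm of M_g equals the l^p norm of g")
```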

Part II: Spectral theory of operators and algebras


13 Spectrum of bounded operators and of Banach
algebra elements
13.1 Spectra of bounded operators I: Definitions, first results
We now return to the discussion of invertible operators, specializing from B(V, W ) to B(V ),
i.e. bounded linear maps from a normed space to itself. We recall some definitions from linear
algebra without assuming finite-dimensionality of V :

13.1 Definition Let V be a Banach space over F ∈ {R, C} and A ∈ B(V ).


• If x ∈ V \{0} and λ ∈ F such that Ax = λx then x is called an eigenvector of A with
corresponding eigenvalue λ.
• Thus λ ∈ F is an eigenvalue of A if and only if A−λ1 is not injective, i.e. ker(A−λ1) 6= {0}.
• The eigenspace of λ ∈ F is ker(A − λ1). (But 0 is not considered as eigenvector!)
• If x ∈ V \{0} and (A − λ1)n x = 0 for some n ∈ N then x is called a generalized eigenvector
for the eigenvalue λ. (If (A − λ1)n x = 0 but y = (A − λ1)n−1 x 6= 0 then (A − λ1)y = 0,
so that λ indeed is an eigenvalue even though x is not an eigenvector.)
• The geometric multiplicity of λ is dim ker(A − λ1).
• The algebraic multiplicity of λ is $\dim \bigcup_{n\in\mathbb N} \ker(A - \lambda 1)^n$.

• If V is finite-dimensional then the algebraic multiplicity of λ coincides with the multiplicity of λ as a zero of the characteristic polynomial P(z) = det(A − z1).
If V is a finite-dimensional vector space and A ∈ End V , it is well known that one has: A is
injective ⇔ A is surjective ⇔ A is invertible. Thus in finite dimensions failure of A − λ1V to
be invertible for some λ ∈ F is equivalent to ker(A − λ1V ) 6= {0}, thus λ to being an eigenvalue.
It is extremely important that the equivalence of injectivity, surjectivity and invertibility 
fails in infinite dimensions, as the following standard examples illustrate:
13.2 Definition Let V = ℓᵖ(N, F), where 1 ≤ p ≤ ∞. Define L, R ∈ B(V) by
$$(Lf)(n) = f(n+1), \qquad (Rf)(n) = \begin{cases} 0 & \text{if } n = 1 \\ f(n-1) & \text{if } n \ge 2 \end{cases}$$
Equivalently: $R\delta_n = \delta_{n+1}$, $L\delta_1 = 0$, $L\delta_n = \delta_{n-1}$ if n ≥ 2, which is why we call L and R the left and right shift operators on V, respectively.
It is immediate that R is injective, but not surjective (since (Rf )(1) = 0 ∀f ∈ V ) while L is
surjective, but not injective (since Lf does not depend on f (1)). One easily checks LR = idV ,
while RL 6= idV since RL = P2 (notation from Exercise 8.12).

13.3 Exercise Consider the shift operators L, R on `p (N, C), p ∈ [1, ∞].
(i) Prove that R is an isometry.
(ii) In the Hilbert space case p = 2, prove L∗ = R, R∗ = L. Conclude that L is a coisometry.
Passing to an infinite-dimensional Banach space V , there can be A ∈ B(V ) and λ ∈ F for
which A − λ1 is injective, but not surjective, thus not invertible. Such λ are not eigenvalues,
but they turn out to be equally important as the former. This motivates the following:

13.4 Definition Let V be a Banach space over F and A ∈ B(V ). Then


• The spectrum74 σ(A) is the set of λ ∈ F for which A − λ1V is not invertible.
• The point spectrum σp (A) consists of those λ ∈ F for which A − λ1V is not injective.
Equivalently, σp (A) consists of the eigenvalues of A.
• The continuous spectrum σc(A) consists of those λ ∈ F for which A − λ1_V is injective, but not surjective, while it has dense image, i.e. $\overline{(A - \lambda 1_V)V} = V$.
• The residual spectrum σr(A) consists of those λ ∈ F for which A − λ1_V is injective and $\overline{(A - \lambda 1_V)V} \ne V$.
We have some immediate observations and comments:
• It is obvious by construction that the sets σp (A), σc (A), σr (A) are mutually disjoint and
have σ(A) as their union.
• Clearly 0 ∈ σ(A) is equivalent to non-invertibility of A and 0 ∈ σp (A) to ker A 6= {0}.
• If V is finite-dimensional then we know from linear algebra that injectivity and surjectivity
of any A ∈ B(V ) are equivalent. Thus for all operators on a finite-dimensional space we
have σc (A) = σr (A) = ∅, thus σ(A) = σp (A).
• If V is infinite-dimensional, the situation is much more complicated, thus more interesting.
For example, the right shift R on H = ℓ²(N) is injective, thus 0 ∉ σp(R); since R is an isometry, RH is closed, and RH ≠ H gives 0 ∈ σr(R) ⊆ σ(R).
• If λ ∈ σp (A) then there is non-zero x ∈ V with Ax = λx. Then An x = λn x ∀n ∈ N. With
the definition of kAk it follows that |λ| ≤ inf n∈N kAn k1/n . (This can be smaller than kAk,
e.g. if A is nilpotent, i.e. An = 0 for some n ∈ N.) We’ll prove a similar result for σ(A).
74 The choice of this term by Hilbert was nothing less than a stroke of genius since it turned out to fit exactly its later use in quantum theory.
• One reason for distinguishing continuous and residual spectrum is the (not quite perfect)
duality between σp and σr , cf. Exercise 13.9(ii)-(iii). Another is that σr often is empty, cf.
e.g. Exercise 13.13.
• The continuous spectrum need not be ‘continuous’ in the sense of connected. But one
can prove, cf. Exercise 13.68, that every isolated λ ∈ σ(A) is an eigenvalue if either A is
a normal operator on Hilbert space (Proposition 17.24(ii)) or some additional condition
is satisfied, cf. Section B.10. This perhaps goes some way towards explaining the term
‘continuous’.
• In Section B.10 we will define the discrete spectrum σd (A), a certain subset of the point
spectrum σp (A), as well as several closely related essential spectra of A.
• Later we will prove that σ(A) is always closed, while we will see in examples that σp , σc , σr
need not be closed.
Proposition 7.41(iii) suggests defining another two interesting subsets of σ(A):

13.5 Definition Let V be a Banach space and A ∈ B(V ). Then define


• the approximate point spectrum σap (A) = {λ ∈ F | A − λ1 is not bounded below},
• the compression spectrum σcp (A) = {λ ∈ F | (A − λ1)V 6= V }.
Again some immediate observations:
• σ(A) = σap (A) ∪ σcp (A) (by Proposition 7.41(iii)), but σap (A) and σcp (A) need not be
disjoint, e.g. for A = 0.
• σr (A) = σcp (A)\σp (A) ⊆ σcp (A) is easily checked.
• σp (A) ⊆ σap (A) is obvious.
• σc (A) ⊆ σap (A), since λ ∈ σc (A) implies that A − λ1 is not invertible, but has dense
image, so that λ 6∈ σcp (A).
• σap(A) is closed as a consequence of Exercise 7.46. Thus $\overline{\sigma_p(A) \cup \sigma_c(A)} \subseteq \sigma_{ap}(A)$.
• If σr(A) = ∅ then σ(A) = σap(A).
• For more on σap (A) see Exercises 13.13(iii), 13.65(ii).

13.6 Exercise Let V be a Banach space and A, B ∈ B(V) with B invertible. Prove σ(BAB⁻¹) = σ(A) and σx(BAB⁻¹) = σx(A) for all x ∈ {p, c, r, ap, cp}.

13.7 Exercise Compute σp(L) and σp(R) for the shift operators L, R on ℓᵖ(N, C) for all p ∈ [1, ∞]. (Of course, the p in σp has nothing to do with the p in ℓᵖ.)

13.8 Exercise Let V, W be Banach spaces and A ∈ B(V ), B ∈ B(W ). Define C ∈ B(V ⊕ W )
by C : (x, y) 7→ (Ax, By). (Thus C = A ⊕ B.)
(i) Prove σ(C) = σ(A) ∪ σ(B) and σp (C) = σp (A) ∪ σp (B).
(ii) Compute σc (C) and σr (C). Warning: These are not simply unions as in (i)!
For operators on a Hilbert space, we can study how the spectra of A and A∗ are related:

13.9 Exercise Let H be a Hilbert space and A ∈ B(H). Use Lemma 11.10 to prove:
(i) σ(A∗) = σ(A)∗ := {λ̄ | λ ∈ σ(A)}.⁷⁵
(ii) If λ ∈ σr(A) then λ̄ ∈ σp(A∗).
(iii) If λ ∈ σp(A) then λ̄ ∈ σp(A∗) ∪ σr(A∗).
(iv) σcp(A) = σp(A∗)∗.

13.10 Remark Using Exercise 9.35 instead of Lemma 11.10, one has analogous results for the
transpose At ∈ B(V ∗ ) of a Banach space operator A ∈ B(V ): σ(At ) = σ(A), σr (A) ⊆ σp (At )
and σp (A) ⊆ σp (At ) ∪ σr (At ). (There is no complex conjugation since A 7→ At is linear.) 2

Normal operators have very nice spectral properties, foreshadowing the spectral theorem:

13.11 Exercise Let H be a Hilbert space and A ∈ B(H) normal. Prove:


(i) If An x = 0 for some n ∈ N then Ax = 0.
(ii) With $L_\lambda(A) = \bigcup_{n\in\mathbb N} \ker(A - \lambda 1)^n$ we have $L_\lambda(A) = \ker(A - \lambda 1)$, thus generalized eigenvectors are eigenvectors.

13.12 Exercise Let H be a Hilbert space and A ∈ B(H) normal. Let x, x0 be (non-zero)
eigenvectors for the eigenvalues λ, λ0 ∈ σp (A), respectively. Prove:
(i) A∗x = λ̄x, thus x is an eigenvector for A∗ with eigenvalue λ̄.
(ii) σp(A∗) = σp(A)∗.
(iii) If λ 6= λ0 then x ⊥ x0 .76
(iv) Give an example of a non-normal A ∈ B(H) for which (iii) fails.

13.13 Exercise (Spectra of normal operators) Let H be a Hilbert space and A ∈ B(H)
normal. Prove:
(i) σr (A) = ∅. (No residual spectrum)
(ii) σc (A∗ ) = σc (A)∗ .
(iii) σ(A) = σap (A) (cf. Definition 13.5).
Keep in mind that self-adjoint operators are normal, so that the above results apply. But for non-normal operators we cannot expect such nice results to hold!

13.14 Exercise Let H be a Hilbert space and A ∈ B(H). Use Exercise 13.13(iii) to prove:
(i) If A is self-adjoint then σ(A) ⊆ R.
(ii) If A is unitary then σ(A) ⊆ S 1 .
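Both conclusions of this exercise can be seen at work already for matrices; the following NumPy sketch (an illustration, not part of the notes) builds a random self-adjoint and a random unitary matrix and inspects their eigenvalues:

```python
# Exercise 13.14 for matrices: a self-adjoint matrix has real spectrum,
# a unitary matrix has its spectrum on the unit circle S^1.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
A = X + X.conj().T                 # self-adjoint
U, _ = np.linalg.qr(X)             # unitary factor of a QR decomposition

assert np.allclose(np.linalg.eigvals(A).imag, 0.0, atol=1e-10)     # sigma(A) in R
assert np.allclose(np.abs(np.linalg.eigvals(U)), 1.0, atol=1e-10)  # sigma(U) in S^1
print("self-adjoint: real spectrum; unitary: spectrum on the unit circle")
```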

13.15 Exercise Let H be a Hilbert space and A ∈ B(H). Recalling Definition 11.32, prove
(i) σp (A) ⊆ W (A).
(ii) σr (A) ⊆ W (A).
(iii) σap(A) ⊆ $\overline{W(A)}$.
(iv) Conclude that σ(A) ⊆ $\overline{W(A)}$.
75 If S ⊆ C we write S∗ for {s̄ | s ∈ S} since $\overline{S}$ could be confused with the closure.
76 You may have seen this before, but probably only for self-adjoint operators.
13.2 The spectrum in a unital Banach algebra
Since B(E) is a unital Banach algebra for every Banach space E, all results proven in the rest
of this section in particular apply to bounded operators on Banach spaces. Restricting these
results to B(E) would not simplify their proofs significantly. Whether F is R or C does not
matter until Section 13.2.3.

13.2.1 The group of invertibles


13.16 Definition If A is a unital algebra over F then InvA = {a ∈ A | ∃b ∈ A : ab = ba = 1}
is the set of invertible elements of A.
It should be clear that InvA ⊆ A is a group with the multiplication of A and unit 1.

13.17 Lemma Let A be a unital normed algebra. Then InvA is a topological group (w.r.t. the
norm topology).
Proof. Since multiplication A × A → A is jointly continuous (Remark 3.29), the same holds
for its restriction to InvA. It remains to show that the inverse map InvA → InvA, a 7→ a−1 is
continuous. To this purpose, let a, a + h ∈ InvA and define k by (a + h)−1 = a−1 + k. Then
1 = (a−1 + k)(a + h) = 1 + a−1 h + ka + kh, thus a−1 h + ka + kh = 0. Multiplying this on
the right by a−1 we have a−1 ha−1 + k + kha−1 = 0, thus k = −a−1 ha−1 − kha−1 . Therefore
kkk ≤ ka−1 k2 khk + kkkkhkka−1 k, which is equivalent to kkk(1 − khkka−1 k) ≤ ka−1 k2 khk.
Assuming khk < ka−1 k−1 , the expression in brackets is positive, so that
$$\|k\| \le \frac{\|a^{-1}\|^2}{1 - \|h\|\|a^{-1}\|}\,\|h\|.$$

It follows that khk → 0 implies kkk → 0, so that a + h → a in InvA entails (a + h)−1 → a−1 . 

So far, we know very little about InvA. We might, in principle, have InvA = F∗ 1, where
F∗ = F\{0}. (This is indeed the case for the algebra A = F[x] which is normed, but not Banach,
with kP k = supx∈[0,1] |P (x)|.) The next results provide invertible elements other than multiples
of 1.

13.18 Definition An element a ∈ A of an algebra is called nilpotent if an = 0 for some n ∈ N.

13.19 Lemma Let A be a unital Banach algebra.


(i) If a ∈ A is nilpotent then 1 − a ∈ InvA. (This is true in every unital algebra.)
(ii) If a ∈ A, ‖a‖ < 1, then 1 − a ∈ InvA and $(1-a)^{-1} = \sum_{n=0}^\infty a^n$.⁷⁷
(iii) InvA ⊆ A is open. More precisely, if a ∈ InvA then B(a, ka−1 k−1 ) ⊆ InvA.
Proof. (i) If a ∈ A is nilpotent then the series $b = \sum_{n=0}^\infty a^n$ converges since it breaks off after finitely many terms. Now $(1-a)b = b(1-a) = \sum_{n=0}^\infty a^n - \sum_{n=1}^\infty a^n = 1$, thus 1 − a ∈ InvA.
(ii) If ‖a‖ < 1 then $\sum_{n=0}^\infty \|a^n\| \le \sum_{n=0}^\infty \|a\|^n < \infty$, so that the series $\sum_{n=0}^\infty a^n$ converges to some b ∈ A by completeness and Proposition 3.15. Now again $(1-a)b = b(1-a) = \sum_{n=0}^\infty a^n - \sum_{n=1}^\infty a^n = 1$, so that 1 − a is invertible with inverse b.
77 In this context, the geometric series $\sum_{n=0}^\infty a^n$ is called the Neumann series, after the German mathematician Carl Gottfried Neumann (1832-1925).
(iii) If a ∈ InvA and a0 ∈ A with ka − a0 k < ka−1 k−1 then
k1 − a−1 a0 k = ka−1 (a − a0 )k ≤ ka−1 kka − a0 k < 1 so that a−1 a0 = 1 − (1 − a−1 a0 ) ∈ InvA by
(ii), thus a0 = a(a−1 a0 ) ∈ InvA. This proves that InvA is open. 
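The Neumann series of part (ii) is easy to watch converge numerically; here is a small NumPy sketch (an illustration, not part of the notes), with a rescaled so that ‖a‖ = 0.9 < 1:

```python
# Lemma 13.19(ii): for ||a|| < 1 the partial sums of sum_n a^n converge
# to (1 - a)^{-1}.
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=(4, 4))
a *= 0.9 / np.linalg.norm(a, 2)          # enforce ||a|| = 0.9 < 1

inv = np.linalg.inv(np.eye(4) - a)
partial = sum(np.linalg.matrix_power(a, n) for n in range(200))
assert np.allclose(inv, partial)
print("Neumann series reproduces (1 - a)^{-1}")
```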

13.20 Exercise Let A be a unital algebra and a, b ∈ A. Prove:


(i) If a and b are invertible then ab and ba are invertible.
(ii) If a and ab (or ba) are invertible then b is invertible.
(iii) Invertibility of ab need not imply invertibility of a or b.
(iv) If ab and ba are invertible then a and b are invertible. Express the inverses a−1 , b−1 in
terms of c = (ab)−1 , d = (ba)−1 .
(v) If a1 , . . . , an ∈ A commute mutually then a1 · · · an ∈ InvA ⇔ ai ∈ InvA ∀i.
(vi) Invertibility of 1 − ab implies invertibility of 1 − ba.
Hint: Assuming that A is Banach and kakkbk < 1, find a formula for (1 − ba)−1 in terms
of (1 − ab)−1 . Now prove that the latter holds without the mentioned assumptions.

13.21 Exercise Let A be a unital Banach algebra. Prove:


(i) For all b ∈ InvA, a ∈ A\InvA we have kb−1 k ≥ ka − bk−1 .
(ii) For all b ∈ InvA we have kb−1 k ≥ (dist(b, A\InvA))−1 .

13.2.2 The spectrum. Basic properties


13.22 Definition Let A be a unital algebra over F. The spectrum of a ∈ A is defined as

σ(a) = {λ ∈ F | a − λ1 6∈ InvA}.

The spectral radius of a is r(a) = sup{|λ| | λ ∈ σ(a)}, where r(a) = 0 if σ(a) = ∅. (We will
prove that σ(a) 6= ∅ for all a ∈ A if A is normed and F = C.) The map

Ra : F\σ(a) → A, λ 7→ (a − λ1)−1

is called the resolvent (map). (Sometimes ρ(a) = F\σ(a) is called the resolvent set.)

13.23 Remark 1. It is clear that for an element of the Banach algebra B(E), where E is
a Banach space, this definition is equivalent to Definition 13.4. But in the present abstract
setting there is no analogue of the point, continuous, residual and compression spectra. (For a
generalization of σap to elements of abstract Banach algebras see Footnote 90.)
2. If a ∈ A and b ∈ InvA it is immediate that σ(a) = σ(bab−1 ), thus r(a) = r(bab−1 ).
3. Lemma 13.17 implies that the resolvent map Ra : F\σ(a) → A, λ 7→ (a − λ1)−1 is
continuous in every unital normed algebra.
4. Be warned that some authors define the resolvent Ra (λ) as (λ1 − a)−1 . 2

As to our standard examples of Banach algebras not of the form B(V ) with V Banach:

13.24 Exercise (i) Let X be a compact Hausdorff space. Recall that (C(X, F), k · k∞ ) is a
Banach algebra. For f ∈ C(X, F), prove σ(f ) = f (X) ⊆ F.
(ii) As we saw in Section 4.6, ℓ∞(S, F) is a Banach algebra w.r.t. pointwise multiplication for every set S. If f ∈ ℓ∞(S, F), prove $\sigma(f) = \overline{f(S)}$.

118
We begin our study of the spectrum with two purely algebraic results:

13.25 Lemma Let A be a unital algebra. Then


(i) If a ∈ A is nilpotent then σ(a) = {0}, thus r(a) = 0.
(ii) For all a, b ∈ A we have σ(ab) ∪ {0} = σ(ba) ∪ {0} and r(ab) = r(ba).
Proof. (i) If λ 6= 0 then 1 − λa is invertible by Lemma 13.19(i). Thus also λ1 − a is invertible,
so that σ(a) ⊆ {0}. Since no nilpotent a is invertible (why?) we have σ(a) = {0}.
(ii) For all λ 6= 0 we have λ ∈ σ(ab) ⇔ λ1 − ab 6∈ InvA ⇔ 1 − (λ−1 a)b 6∈ InvA ⇔
1 − b(λ−1 a) 6∈ InvA ⇔ λ1 − ba 6∈ InvA ⇔ λ ∈ σ(ba), where the third equivalence comes from
Exercise 13.20(vi). (For λ = 0 this argument does not work. As we know already, invertibility
of ab and ba are independent.) The second statement is an obvious consequence. 
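Both parts of the lemma are visible for matrices, where even σ(ab) = σ(ba) holds (compare characteristic polynomials). The following NumPy sketch (an illustration, not part of the notes) also checks a nilpotent example for (i):

```python
# Lemma 13.25 for matrices: ab and ba have the same characteristic polynomial,
# so sigma(ab) = sigma(ba); a nilpotent Jordan block has spectrum {0}.
import numpy as np

rng = np.random.default_rng(3)
a = rng.normal(size=(5, 5))
b = rng.normal(size=(5, 5))
assert np.allclose(np.poly(a @ b), np.poly(b @ a))   # same char. polynomial

n = np.diag(np.ones(4), 1)                           # nilpotent: n^5 = 0
assert np.allclose(np.linalg.matrix_power(n, 5), 0)
assert np.allclose(np.linalg.eigvals(n), 0)          # sigma(n) = {0}, r(n) = 0
print("sigma(ab) = sigma(ba); nilpotent spectrum is {0}")
```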

13.26 Definition If A is a unital Banach algebra, a ∈ A is called quasi-nilpotent if r(a) = 0,


equivalently σ(a) ⊆ {0}.
As just proven, nilpotent ⇒ quasi-nilpotent. For examples of quasi-nilpotent elements that
are not nilpotent, see Exercises 13.61, 13.63, 19.6.

13.27 Proposition If A is a unital Banach algebra and a ∈ A then


(i) σ(a) is closed.
(ii) r(a) ≤ inf n∈N kan k1/n ≤ kak.
Proof. (i) If a ∈ A then fa : F → A, λ 7→ a − λ1 is continuous, thus fa−1 (InvA) ⊆ F is open by
Lemma 13.19(iii). Now σ(a) = F\fa−1 (InvA) is closed.
(ii) If λ ∈ F, |λ| > kak then ka/λk < 1 so that 1 − a/λ ∈ InvA by Lemma 13.19(ii). Thus
λ1 − a ∈ InvA, so that λ 6∈ σ(a). This proves r(a) ≤ kak.
In each unital algebra we have the telescoping computation related to finite geometric sums

$$(z - 1)(1 + z + z^2 + \cdots + z^{n-1}) = (z + \cdots + z^n) - (1 + \cdots + z^{n-1}) = z^n - 1. \tag{13.1}$$


If z − 1 is not invertible, Exercise 13.20(v) implies that z n − 1 is not invertible for any n ∈ N.
Applying this to z = a/λ, where λ ∈ σ(a)\{0}, gives λn ∈ σ(an )78 , thus r(a) ≤ r(an )1/n , for all
n. With r(b) ≤ kbk we have r(a) ≤ inf n∈N r(an )1/n ≤ inf n∈N kan k1/n ≤ kak. 
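For matrices the bound r(a) ≤ ‖aⁿ‖^{1/n} of part (ii) can be observed directly, and the sequence already approaches r(a) (the convergence will be proved in Section 13.2.3). A NumPy sketch, as an illustration only:

```python
# Proposition 13.27(ii): r(a) <= ||a^n||^{1/n} for every n; the sequence
# approaches the spectral radius (here computed as max |eigenvalue|).
import numpy as np

rng = np.random.default_rng(4)
a = rng.normal(size=(5, 5))
r = np.max(np.abs(np.linalg.eigvals(a)))     # spectral radius

roots = [np.linalg.norm(np.linalg.matrix_power(a, n), 2) ** (1.0 / n)
         for n in range(1, 101)]
assert all(r <= x + 1e-9 for x in roots)     # r(a) is a lower bound for every n
assert roots[-1] - r < 0.2 * r               # ||a^100||^{1/100} is already close
print(f"r(a) = {r:.4f}, ||a^100||^(1/100) = {roots[-1]:.4f}")
```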

13.28 Remark 1. Already by the above we have: kan k1/n → 0 ⇒ a is quasi-nilpotent.


2. There is another way of improving on r(a) ≤ kak: Let A be a unital Banach algebra, a ∈ A
and b ∈ InvA. Then with Remark 13.23.2 and Proposition 13.27 we have r(a) = r(bab−1 ) ≤
kbab−1 k, implying r(a) ≤ inf b∈InvA kbab−1 k. In favorable cases including A = B(H) one can
prove this to be an equality, cf. Exercise 17.15.
3. The completeness assumption is essential: If A = F[x] with the incomplete norm kP k =
supx∈[0,1] |P (x)| then σ(a) = F for all a ∈ A\C1. 2

13.29 Exercise Let A be a unital Banach algebra and a, b ∈ A. Prove the first and second
‘resolvent identities’

$$R_a(s) - R_a(t) = (s - t)R_a(s)R_a(t) \qquad \forall s, t \in \mathbb F\setminus\sigma(a),$$
$$R_a(s) - R_b(s) = R_a(s)(b - a)R_b(s) \qquad \forall s \in \mathbb F\setminus(\sigma(a) \cup \sigma(b)).$$
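Both identities can be verified on matrices, choosing s, t of modulus larger than the operator norms so that the resolvents exist. A NumPy sketch (an illustration, not part of the notes):

```python
# The first and second resolvent identities for random 4x4 matrices,
# with R_a(s) = (a - s*1)^{-1} as in Definition 13.22.
import numpy as np

rng = np.random.default_rng(5)
a = rng.normal(size=(4, 4))
b = rng.normal(size=(4, 4))
I = np.eye(4)
Ra = lambda s: np.linalg.inv(a - s * I)
Rb = lambda s: np.linalg.inv(b - s * I)

s, t = 10.0, -7.0                            # far outside both spectra
assert np.allclose(Ra(s) - Ra(t), (s - t) * Ra(s) @ Ra(t))   # first identity
assert np.allclose(Ra(s) - Rb(s), Ra(s) @ (b - a) @ Rb(s))   # second identity
print("both resolvent identities verified")
```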
78 Thus {λⁿ | λ ∈ σ(a)} ⊆ σ(aⁿ). Later we will prove that equality holds if F = C.
13.30 Exercise Let A be unital Banach algebra and a ∈ InvA. Prove:
(i) σ(a−1 ) = {λ−1 | λ ∈ σ(a)}.
(ii) If kak ≤ 1 and ka−1 k ≤ 1 then σ(a) ⊆ S 1 = {z ∈ C | |z| = 1}.

13.31 Exercise Let A be a unital Banach algebra, a ∈ A. Prove that for all λ ∈ F\σ(a):
(i) r((a − λ1)⁻¹) = (dist(λ, σ(a)))⁻¹.
(ii) ‖Ra(λ)‖ ≥ (dist(λ, σ(a)))⁻¹.

13.32 Remark 1. The result of (ii) could also be deduced from Exercise 13.21, but the ap-
proach via r(Ra (λ)) is more conceptual and will give the exact result for kRa (λ)k in certain
situations, cf. Exercise 13.71.
2. Since r(b) < kbk is perfectly possible, the above in general only gives a lower bound for
kRa (λ)k. Proving upper bounds tends to be harder. 2

13.33 Exercise Let A be a unital Banach algebra and a ∈ A nilpotent. With N = min{n ∈ N | aⁿ = 0} prove $\lim_{\lambda\to 0} |\lambda|^N \|(a - \lambda 1)^{-1}\| = \|a^{N-1}\| \ne 0$. (Thus ‖(a − λ1)⁻¹‖ behaves like $|\lambda|^{-N}\|a^{N-1}\|$ as λ → 0.)
It is instructive to compare the above with Exercise 13.52.

13.34 Exercise Let A be a unital Banach algebra. For a ∈ A define ζ(a) = inf{‖ab‖ | b ∈ A, ‖b‖ = 1} and call a a topological left zero-divisor if ζ(a) = 0. Prove:
(i) For a ∈ InvA we have ζ(a) = ka−1 k−1 > 0. Thus ζ −1 (0) ⊆ A\InvA.
(ii) |ζ(a) − ζ(b)| ≤ ka − bk ∀a, b ∈ A.
(iii) If a ∈ ∂InvA79 then a is not invertible.
(iv) Every a ∈ ∂InvA is a topological left zero-divisor. Hint: Use Exercise 13.21.
(v) If λ ∈ ∂σ(a) then a − λ1 is a topological left zero-divisor.
(vi) For A = C(X, F), where X is compact, prove that f ∈ A is a topological (left) zero-divisor if and only if it is non-invertible.
(vii) Give an example of a unital Banach algebra A and a non-invertible a ∈ A that is not a
topological left zero-divisor.

13.35 Exercise Let A be a unital Banach algebra over F.


(i) Use the ideas in the proof of Lemma 13.19 to give an alternative proof of the continuity
of InvA → InvA, a 7→ a−1 .
(ii) BONUS: If E, F are normed spaces and U ⊆ E is open, a map f : U → F is Fréchet
differentiable at x ∈ U if there is a bounded linear map D ∈ B(E, F ) such that
$$\frac{\|f(x + h) - f(x) - D(h)\|}{\|h\|} \to 0 \qquad \text{as } \|h\| \to 0.$$

Prove that InvA → InvA, a 7→ a−1 is Fréchet differentiable. For F = C, conclude that the
map C\σ(a) → C, λ 7→ ϕ((a − λ1)−1 ) is holomorphic for each a ∈ A, ϕ ∈ A∗ .
79 If X is a topological space and Y ⊆ X then $\partial Y = \overline{Y} \cap \overline{X\setminus Y}$ is the boundary of Y.
13.2.3 The spectral radius formula (Beurling-Gelfand theorem)
13.36 Lemma Let A be a unital normed algebra and a ∈ A. Put $\nu = \inf_{n\in\mathbb N} \|a^n\|^{1/n}$. Then
(i) $\lim_{n\to\infty} \|a^n\|^{1/n} = \nu$.⁸⁰ ⁸¹
(ii) For all µ > ν we have $(a/\mu)^n \to 0$ as n → ∞, but $(a/\nu)^n \not\to 0$ provided ν > 0.⁸²
(iii) If ν = 0 then a 6∈ InvA, thus 0 ∈ σ(a).
Proof. (i) With kan k ≤ kakn we trivially have
$$0 \le \nu = \inf_{n\in\mathbb N} \|a^n\|^{1/n} \le \liminf_{n\to\infty} \|a^n\|^{1/n} \le \limsup_{n\to\infty} \|a^n\|^{1/n} \le \|a\| < \infty. \tag{13.2}$$

By definition of ν, for every ε > 0 there is a k such that kak k1/k < ν + ε. Every m ∈ N is of the
form m = sk + r with unique s ∈ N0 and 0 ≤ r < k (division with remainder). Then
$$\|a^m\| = \|a^{sk+r}\| \le \|a^k\|^s \|a\|^r < (\nu + \varepsilon)^{sk} \|a\|^r,$$
$$\|a^m\|^{1/m} \le (\nu + \varepsilon)^{\frac{sk}{sk+r}} \|a\|^{\frac{r}{sk+r}}.$$
Now m → ∞ means $\frac{sk}{sk+r} \to 1$ and $\frac{r}{sk+r} \to 0$, so that $\limsup_{m\to\infty} \|a^m\|^{1/m} \le \nu + \varepsilon$. Since this
holds for every ε > 0, we have lim supm→∞ kam k1/m ≤ inf n∈N kan k1/n . Together with (13.2)
this implies that limm→∞ kam k1/m exists and equals inf n∈N kan k1/n .
(ii) Let µ > ν, and choose µ0 such that ν < µ0 < µ. Since kan k1/n → ν by (i), there exists
n0 such that n ≥ n0 ⇒ kan k1/n < µ0 . For such n we have
$$\Big\|\Big(\frac{a}{\mu}\Big)^n\Big\| = \frac{\|a^n\|}{\mu^n} < \Big(\frac{\mu'}{\mu}\Big)^n \xrightarrow{n\to\infty} 0.$$

This proves the first claim. On the other hand, by definition of ν we have kan k1/n ≥ ν for all
n ∈ N. With ν > 0 this implies k(a/ν)n k ≥ 1 ∀n, and therefore (a/ν)n 6→ 0.
(iii) If a ∈ InvA then there is b ∈ A such that ab = ba = 1. Then 1 = an bn , thus with
Remark 3.29 we have 1 ≤ k1k = kan bn k ≤ kan kkbn k ≤ kan kkbkn . Taking n-th roots, we have
1 ≤ kan k1/n kbk, and taking the limit n → ∞ gives the contradiction 1 ≤ νkbk = 0. Thus if
ν = 0 then a is not invertible, so that 0 ∈ σ(a). 

Essentially everything we did so far works for F = R and F = C. That the natural ONB for L²([0, 2π], λ; F) depends on F is a triviality (but related to the significant fact that x ↦ e^{ix} is a homomorphism, while x ↦ sin x, cos x are not). The R/C-dependence in the polarization identities (Exercise 5.13 and Lemma 11.17) is more serious since it propagates to the fact that Lemma 11.19 and Propositions 11.22 and 11.34 are weaker over R than over C.
By contrast, the rest of this section requires F = C, and the same applies whenever we use Theorem 13.39 and its consequences like Corollary 13.43 or Exercise 13.46. For this reason, from now on we assume F = C throughout. If we still mention C, it is only for emphasis. Identifying the few results that also hold over R (like Lemma 14.1 through Corollary 14.4) usually is quite easy.
80 This would be immediate if n ↦ ‖aⁿ‖^{1/n} were decreasing, but this need not hold! See Exercise 13.64.
81 More generally, if {cₙ}ₙ∈N ⊆ [0, ∞) satisfies $c_{n+m} \le c_n c_m$ ∀n, m then $\lim_{n\to\infty} c_n^{1/n} = \inf_{n\in\mathbb N} c_n^{1/n}$.
82 This is of course trivial if µ > ‖a‖, but µ > ν is a weaker hypothesis when ν < ‖a‖.
13.37 Lemma Let A be a unital algebra over C, and let a ∈ A, λ ∈ C\{0}, n ∈ N. Then $(a/\lambda)^n - 1 \in \mathrm{Inv}A$ if and only if $\lambda_k = e^{\frac{2\pi i}{n}k}\lambda \notin \sigma(a)$ for all k = 1, …, n. In this case,
$$\Big(\Big(\frac{a}{\lambda}\Big)^n - 1\Big)^{-1} = \frac{1}{n}\sum_{k=1}^n \Big(\frac{a}{\lambda_k} - 1\Big)^{-1}. \tag{13.3}$$
Proof. For k ∈ Z we write $e_k = e^{\frac{2\pi i k}{n}}$. It is obvious that $e_k^n = 1$ for all k ∈ Z. Since $e_0, e_1, \ldots, e_{n-1}$ are mutually distinct, we have $z^n - 1 = \prod_{k=0}^{n-1}(z - e_k)$. This identity holds in
every unital algebra, and replacing z by a/λ ∈ A and putting λk = ek λ, it implies the first
statement. On the other hand, it means that there is a partial fraction expansion⁸³
$$\frac{1}{z^n - 1} = \sum_{k=0}^{n-1} \frac{c_k}{z - e_k}$$

for certain unique $c_k \in \mathbb C$. Multiplying this equation by $z - e_\ell$ and taking $z \to e_\ell$ gives
$$c_\ell = \lim_{z\to e_\ell} \frac{z - e_\ell}{z^n - 1}.$$
In view of
$$\lim_{z\to e_\ell} \frac{z^n - 1}{z - e_\ell} = \frac{1}{e_\ell}\,\lim_{z\to e_\ell} \frac{(z/e_\ell)^n - 1}{(z/e_\ell) - 1} = \frac{1}{e_\ell}\,\lim_{z\to e_\ell}\big(1 + (z/e_\ell) + \cdots + (z/e_\ell)^{n-1}\big) = \frac{n}{e_\ell},$$
where we used $e_\ell^n = 1$ and (13.1), we have $c_k = e_k/n$, so that
$$\frac{1}{z^n - 1} = \frac{1}{n}\sum_{k=0}^{n-1} \frac{1}{(z/e_k) - 1}.$$

Replacing z by a/λ herein and using λk = λek we obtain the identity (13.3) in A. That this is
justified follows from λk 6∈ σ(a) for all k (thus also λn 6∈ σ(an )) and the following exercise. 

13.38 Exercise Let A be a unital algebra over C and assume we have an identity $\sum_{i=1}^I \frac{f_i}{g_i} = 0$ in the field C(z) of rational functions. Prove $\sum_i f_i(a)\, g_i(a)^{-1} = 0$ for all a ∈ A for which all $g_i(a)$ are invertible, i.e. $\sigma(a) \cap g_i^{-1}(0) = \emptyset$ ∀i.
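The identity (13.3) itself is easy to test numerically for matrices, choosing |λ| larger than the operator norm so that all λₖ lie outside the spectrum. A NumPy sketch, as an illustration only:

```python
# Identity (13.3): ((a/lam)^n - 1)^{-1} = (1/n) sum_k (a/lam_k - 1)^{-1}
# with lam_k = e^{2 pi i k / n} * lam, checked for a random 4x4 matrix.
import numpy as np

rng = np.random.default_rng(6)
a = rng.normal(size=(4, 4))
lam, n = 10.0, 6                                 # 10 > ||a||, so all inverses exist
I = np.eye(4)

lhs = np.linalg.inv(np.linalg.matrix_power(a / lam, n) - I)
lams = [np.exp(2j * np.pi * k / n) * lam for k in range(1, n + 1)]
rhs = sum(np.linalg.inv(a / lk - I) for lk in lams) / n
assert np.allclose(lhs, rhs)
print("partial fraction identity (13.3) holds")
```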

13.39 Theorem (Beurling 1938 - Gelfand 1939)⁸⁴ ⁸⁵ Let A be a unital normed algebra over C (not necessarily complete) and a ∈ A. Then σ(a) ≠ ∅, and
$$r(a) \ge \inf_{n\in\mathbb N} \|a^n\|^{1/n} = \lim_{n\to\infty} \|a^n\|^{1/n}. \tag{13.4}$$

If A is complete then equality holds in (13.4), which then is called the spectral radius formula.
83 Most calculus books mention this, but a proof is rarely given. Here is a quick analytic proof in the case at hand, where all zeros of the denominator are simple: If $f(z) = (\prod_{j=1}^n (z - z_j))^{-1}$, we have $A_i = \lim_{z\to z_i}(z - z_i)f(z) = \prod_{j\ne i}(z_i - z_j)^{-1} \in \mathbb C\setminus\{0\}$ for each i ∈ {1, …, n}. Now $f(z) - \frac{A_i}{z - z_i} = \frac{1}{z - z_i}\big[\prod_{j\ne i}(z - z_j)^{-1} - A_i\big]$, where the expression in square brackets is holomorphic near $z_i$, where it vanishes. Thus $f(z) - \frac{A_i}{z - z_i}$ extends holomorphically to a neighborhood of $z_i$. Continuing like this, $g(z) = f(z) - \sum_{i=1}^n \frac{A_i}{z - z_i}$ extends to an entire function. Since f and $\sum_{i=1}^n \frac{A_i}{z - z_i}$ tend to zero as z → ∞, it follows that g is bounded, thus constant by Liouville's theorem. Since the constant must be zero, we have proven $f(z) = \sum_{i=1}^n \frac{A_i}{z - z_i}$. (For a purely algebraic treatment of the partial fraction expansion, including the case of multiple zeros of the denominator, see e.g. [93, Ch. IV, §4].)
84 Arne Beurling (1905-1986). Swedish mathematician. Worked mostly on harmonic and complex analysis.
85 Israel Moiseevich Gelfand (1913-2009). Outstanding Soviet mathematician. Many important contributions to many areas of mathematics, among which functional analysis and Banach algebras.
Proof. The equality of infimum and limit was Lemma 13.36(i). For a ∈ A, define ν as before.
If ν = 0 then 0 ∈ σ(a) by Lemma 13.36(iii). Thus σ(a) 6= ∅ and (13.4) is trivially true.
From now on assume ν > 0. Assume that there is no λ ∈ σ(a) with |λ| ≥ ν. This implies
that (a − λ1)−1 exists for all |λ| ≥ ν and depends continuously on λ by Lemma 13.17. The
same holds (since |λ| ≥ ν > 0) for the slightly more convenient function
$$\varphi : \{\lambda \in \mathbb C \mid |\lambda| \ge \nu\} \to A, \qquad \lambda \mapsto \Big(\frac{a}{\lambda} - 1\Big)^{-1}.$$
Now Lemma 13.37 gives for all λ with |λ| ≥ ν and n ∈ N that $(\frac{a}{\lambda})^n - 1 \in \mathrm{Inv}A$ with inverse given by (13.3). Pick any η > ν. Since the annulus Λ = {λ ∈ C | ν ≤ |λ| ≤ η} is compact, the continuous map φ : Λ → A is uniformly continuous. I.e., for every ε > 0 we can find δ > 0 such that λ, λ′ ∈ Λ, |λ − λ′| < δ ⇒ ‖φ(λ) − φ(λ′)‖ < ε. If ν < µ < ν + δ, we have $|\nu_k - \mu_k| = |\nu - \mu| < \delta$ and therefore ‖φ(νₖ) − φ(µₖ)‖ < ε for all n ∈ N and k = 1, …, n. Combining this with (13.3) we have $\|((\frac{a}{\nu})^n - 1)^{-1} - ((\frac{a}{\mu})^n - 1)^{-1}\| \le \frac{1}{n}\sum_{k=1}^n \|\varphi(\nu_k) - \varphi(\mu_k)\| < \varepsilon$ ∀n ∈ N, so that:
$$\forall \varepsilon > 0\ \exists \mu > \nu\ \forall n \in \mathbb N : \Big\|\Big(\Big(\frac{a}{\nu}\Big)^n - 1\Big)^{-1} - \Big(\Big(\frac{a}{\mu}\Big)^n - 1\Big)^{-1}\Big\| < \varepsilon. \tag{13.5}$$

By Lemma 13.36(ii), µ > ν implies (a/µ)ⁿ → 0 as n → ∞. With continuity of the inverse map, $((a/\mu)^n - 1)^{-1} \to -1$. Thus for n large enough we have $\|((a/\mu)^n - 1)^{-1} + 1\| < \varepsilon$, and combining this with (13.5) we have $\|((a/\nu)^n - 1)^{-1} + 1\| < 2\varepsilon$. Since ε > 0 was arbitrary, we have $((a/\nu)^n - 1)^{-1} \to -1$ as n → ∞ and therefore $(a/\nu)^n \to 0$. But this contradicts the other
part of Lemma 13.36(ii), so that our assumption that there is no λ ∈ σ(a) with |λ| ≥ ν is false.
Existence of such a λ obviously gives σ(a) 6= ∅ and r(a) ≥ ν, completing the proof of the main
result. If A is complete then Proposition 13.27(ii) applies, and combining it with (13.4) the
final claim follows. 

13.40 Remark 1. We emphasize that only the last clause requires completeness of A.
2. The standard proof of the above theorem requires completeness and uses the differentia-
bility of the resolvent map Ra and a certain amount of complex analysis. The more elementary
(which does not mean simple) proof given above, due to Rickart86 ([130] 1958), shows that nei-
ther the completeness assumption nor the complex analysis are essential to the problem. (See
also Exercise 13.53 and the subsequent remark.)
3. Even though we avoided complex analysis (holomorphicity etc.), it is clear that the proof only works over C. In fact, the matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in M_{2\times 2}(\mathbb R)$, the counterexample in Remark 11.20, has empty spectrum over R (as does every invertible antisymmetric real matrix). 2

13.2.4 Applications, complements, exercises


13.41 Corollary (‘Fundamental Theorem of Algebra’) Let P ∈ C[z] be a polynomial
of degree d ≥ 1. Then there is λ ∈ C with P (λ) = 0.
Proof. We may assume that P is monic, i.e. the coefficient of the highest power z d is 1. It is
not hard to construct a matrix aP ∈ Md×d (C) such that P (λ) = det(λ1 − aP ) (do it!). Now
P −1 (0) = σ(aP ), and the claim follows since Theorem 13.39 gives σ(aP ) 6= ∅. 
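One standard choice of a_P is the companion matrix (the notes leave the construction as an exercise; the sketch below is only an illustration of one possible answer), checked here on P(z) = z³ − 6z² + 11z − 6 = (z−1)(z−2)(z−3):

```python
# Companion matrix of the monic polynomial P(z) = z^3 + c2 z^2 + c1 z + c0:
# its characteristic polynomial is P, so sigma(a_P) = P^{-1}(0).
import numpy as np

c0, c1, c2 = -6.0, 11.0, -6.0        # P(z) = (z-1)(z-2)(z-3)
aP = np.array([[0.0, 0.0, -c0],
               [1.0, 0.0, -c1],
               [0.0, 1.0, -c2]])
roots = np.sort(np.linalg.eigvals(aP).real)
assert np.allclose(roots, [1.0, 2.0, 3.0])
print("sigma(a_P) =", roots)
```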
86 Charles Earl Rickart (1913-2002). American mathematician, mostly operator algebraist.
13.42 Remark The above argument is not circular since the proof of Theorem 13.39 did not
use the FTA but only some information about the exponential function in the complex domain
(existence of n-th roots, in particular roots of unity). The same holds for the ‘standard’ proof
of the FTA, cf. e.g. [108, Theorem 7.7.57], with which the above has much in common. Both
proofs certainly are more elementary than those using complex analysis (Liouville’s theorem)
or topological arguments based on π1 (S 1 ) 6= 0. 2

13.43 Corollary (Gelfand-Mazur)


(i) Every unital normed algebra over C other than C1 has non-zero non-invertible elements.
(ii) If A is a normed division algebra (i.e. unital with InvA = A\{0}) over C then A = C1.
Proof. (i) Let a ∈ A\C1. By Theorem 13.39 we can pick λ ∈ σ(a). Then a − λ1 is non-zero
and non-invertible. Now (ii) is immediate. 

13.44 Remark 1. That every finite-dimensional division algebra over C is isomorphic to C is
an easy consequence of algebraic closedness. (Why?) There are infinite-dimensional ones (like
the field C(z) of rational functions over C), but they do not admit norms, as a consequence
of the above corollary, which does not assume finite-dimensionality of A.
2. Over R a theorem of Hurwitz87 says that there are precisely four division algebras
admitting a norm, namely R, C, H (Hamilton’s88 quaternions, which everyone should know)
and O, the octonions of Graves89 . But of these only C is an algebra over C. For more on the
fascinating subject of real division algebras see the 120 pages on the subject in [44]. 2

The preceding corollaries only used σ(a) ≠ ∅, but also the spectral radius formula will have
many applications.
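The formula r(a) = lim_{n→∞} ‖a^n‖^{1/n} can be watched converging numerically. A sketch (ours, not from the notes) for a 2×2 matrix, using the operator norm induced by the sup norm on C² (maximal absolute row sum):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def opnorm(A):
    # operator norm on (C^n, sup norm): maximal absolute row sum
    return max(sum(abs(x) for x in row) for row in A)

a = [[1, 2], [3, 4]]                 # eigenvalues (5 +- sqrt(33))/2
r_exact = (5 + 33 ** 0.5) / 2        # spectral radius, ~5.3723

power, estimates = [[1, 0], [0, 1]], []
for n in range(1, 101):
    power = matmul(power, a)
    estimates.append(opnorm(power) ** (1 / n))

# r(a)^n = r(a^n) <= ||a^n|| gives ||a^n||^{1/n} >= r(a) for every n,
# and the sequence converges to r(a) from above
assert all(e >= r_exact - 1e-9 for e in estimates)
assert abs(estimates[-1] - r_exact) < 0.05
```

Note that the limit is independent of the chosen submultiplicative norm, whereas the individual terms ‖a^n‖^{1/n} are not.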

13.45 Exercise Let A be a unital normed algebra over C and a ∈ A. Prove that a is quasi-nilpotent (r(a) = 0) if and only if lim_{n→∞} ‖(za)^n‖ = 0 for all z ∈ C.
If A is a Banach algebra with unit 1, B ⊆ A a Banach subalgebra (=closed subalgebra)
containing 1 and b ∈ B then we can consider the spectrum of b as an element of A or of B,
leading to σA (b), σB (b) and the spectral radii rA (b), rB (b).

13.46 Exercise Let A be a Banach algebra over C with unit 1. Prove:


(i) If a, b ∈ A with ab = ba then r(ab) ≤ r(a)r(b).
(ii) If B ⊆ A is a Banach subalgebra containing 1 then rB (b) = rA (b) for all b ∈ B.
Despite the above (ii), σB (b) = σA (b) does not necessarily hold!

13.47 Lemma Let A be a Banach algebra over C with unit 1 and B ⊆ A a Banach subalgebra
containing 1. Then
(i) InvB ⊆ B ∩ InvA and σA (b) ⊆ σB (b) for all b ∈ B.
(ii) σB (b) = σA (b) holds for all b ∈ B if and only if InvB = B ∩ InvA.
87 Adolf Hurwitz (1859-1919). German mathematician who worked on many subjects.
88 Sir William Rowan Hamilton (1805-1865). Irish mathematician. Known particularly for quaternions and Hamiltonian mechanics. It was he who advocated the modern view of complex numbers as pairs of real numbers.
89 John T. Graves (1806-1870). Irish jurist (!) and mathematician.

Proof. (i) The first statement is obvious. If b − λ1 has an inverse in B then the latter also is an
inverse in A. Thus λ ∉ σB(b) ⇒ λ ∉ σA(b).
(ii) Assume InvB = B ∩ InvA. Then for all b ∈ B, λ ∈ F we have that b − λ1 is invertible in
B if and only if it is invertible in A, so that σB(b) = σA(b). If InvB ≠ B ∩ InvA then in view
of InvB ⊆ B ∩ InvA we have InvB ⊊ B ∩ InvA. If now b ∈ (B ∩ InvA)\InvB then 0 ∈ σB(b), while
0 ∉ σA(b). 

Here is an example of a Banach subalgebra B ⊆ A with InvB ⊊ B ∩ InvA:

13.48 Example In Section 4.6 we saw that the Banach space A = ℓ^1(Z, C) with norm ‖·‖1
becomes a Banach algebra when equipped with the convolution product ⋆. The functions
δn(m) = δn,m satisfy ‖δn‖ = 1 and δn ⋆ δm = δn+m. In particular δn is invertible with δn^{−1} = δ−n for each n ∈ Z.
Now Exercise 13.30 gives σ(δn) ⊆ S^1 for all n.
Let B = {f ∈ A | f(n) = 0 ∀n < 0} ⊆ A, the closed linear span of {δn | n ≥ 0}. It is immediate that B is a
closed subalgebra containing 1. If δ1 had an inverse c ∈ B, c would also be an inverse in A, so
that c = δ−1 by uniqueness of inverses. In view of δ−1 ∉ B we have δ1 ∈ (B ∩ InvA)\InvB.
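The convolution identities used in this example are easy to check in code. A sketch of ours (not from the notes), modelling finitely supported functions Z → C as dicts, which form a dense subalgebra of ℓ^1(Z, C):

```python
def convolve(f, g):
    # (f * g)(n) = sum_k f(k) g(n - k); f, g are dicts {n: value} with finite support
    h = {}
    for k, fk in f.items():
        for m, gm in g.items():
            h[k + m] = h.get(k + m, 0) + fk * gm
    return {n: v for n, v in h.items() if v != 0}

def delta(n):
    return {n: 1}

def norm1(f):
    return sum(abs(v) for v in f.values())

# delta_n * delta_m = delta_{n+m}; in particular delta_n is invertible with inverse delta_{-n}
assert convolve(delta(3), delta(5)) == delta(8)
assert convolve(delta(1), delta(-1)) == delta(0)   # delta_0 is the unit
# the l^1 norm is submultiplicative: ||f * g||_1 <= ||f||_1 ||g||_1
f = {0: 1, 1: -2}
g = {0: 3, 2: 1}
assert norm1(convolve(f, g)) <= norm1(f) * norm1(g)
```

The subalgebra B of the example corresponds to dicts supported on n ≥ 0, and convolution clearly preserves this support condition.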

13.49 Exercise Let A = ℓ^1(Z, C) and B ⊊ A as above. Prove, using no later results:
(i) σB(δ1) = {z ∈ C | |z| ≤ 1}.
(ii) σA(δ1) = S^1.
Since the spectra depend on the set of invertibles, one is interested in subalgebras B ⊆ A
satisfying InvB = B ∩ InvA. For a very useful result in this direction see Theorem 16.19. Other
examples are provided by the following exercise:

13.50 Exercise Let A be a unital Banach algebra with unit 1. For any subset S ⊆ A, define
the ‘commutant’ S′ of S by
S′ = {t ∈ A | st = ts ∀s ∈ S}.
(i) For each S ⊆ A, prove that B = S′ ⊆ A is a Banach subalgebra with unit 1 and that
Inv(B) = B ∩ InvA.
(ii) Let S ⊆ T ⊆ A. Prove: (1) T′ ⊆ S′, (2) S ⊆ S″, (3) S′ = S‴.
(iii) Prove: S ⊆ A is commutative ⇔ S ⊆ S′ ⇔ S″ is commutative.
[Combining (i)-(iii) we have: If S ⊆ A is commutative then B = S″ ⊆ A is a commutative
Banach subalgebra containing 1 and S and satisfying Inv(B) = B ∩ InvA.]
(iv) Prove that a subalgebra B ⊆ A is maximal abelian (i.e. abelian and not properly contained
in a larger abelian subalgebra of A) if and only if B = B′. Conclude that every maximal
abelian subalgebra B satisfies InvB = B ∩ InvA.
(v) If H is a Hilbert space, A = B(H) and S ⊆ A, prove that S′ ⊆ B(H) is also τwot-closed.

13.51 Exercise Let V = C^1([0, 1], C) (the differentiable functions [0, 1] → C with continuous
derivative). For f ∈ V define ‖f‖ = ‖f‖∞ + ‖f′‖∞.
(i) Prove that (V, ‖·‖) is a Banach space. (You may assume from analysis that if fn, g ∈ V
and fn → g and fn′ → h uniformly then g′ = h.)
(ii) Show that V is a Banach algebra when multiplication of functions is defined point-wise,
i.e. (fg)(x) = f(x)g(x).

(iii) Let g(x) = x for all x ∈ [0, 1]. Compute the norm, the spectrum and the spectral radius
of g.
(iv) If E ⊆ [0, 1] is a closed subset, show that

IE = {f ∈ V | f (x) = 0 ∀x ∈ E}

is a closed two-sided ideal in V .


(v) For a ∈ [0, 1] let Ia = {f ∈ V | f(a) = f′(a) = 0}. Show that Ia is a closed ideal that is
not of the form IE as in (iv) for any E.

13.52 Exercise Let A be a unital Banach algebra over C and a ∈ A quasi-nilpotent.
(i) Prove that Σ_{n=0}^∞ z^n a^n converges absolutely for all z ∈ C to (1 − za)^{−1}.
(ii) Prove that a resolvent bound ‖(a − λ1)^{−1}‖ ≤ C|λ|^{−D} ∀λ ≠ 0 implies a^N = 0 for N = ⌊D⌋.
Hint: Use Lemma B.170.

13.53 Exercise Let A be a unital Banach algebra over C, a ∈ A and z ∈ C\{0} such that
zS^1 ∩ σ(a) = ∅ (i.e. there is no λ ∈ σ(a) with |λ| = |z|). Let pn(a, z) = (1 − (a/z)^n)^{−1}. Prove:
(i) If |z| > r(a) then lim_{n→∞} pn(a, z) = 1.
(ii) If there is no λ ∈ σ(a) with |λ| < |z| then lim_{n→∞} pn(a, z) = 0.
(iii) The limit p(a, z) = lim_{n→∞} pn(a, z) ∈ A exists, commutes with a and depends only on |z|.
Hint: Lemma 13.37.
(iv) p(a, z) = lim_{n→∞}(1 − (a/z)^{2n+1})^{−1} = lim_{n→∞}(1 + (a/z)^{2n+1})^{−1}.
(v) Use the preceding results to prove p(a, z)^2 = p(a, z).
(vi) BONUS: p(a, z) = −(2πi)^{−1} ∮_C Ra(w) dw, where C is the circle of radius |z| around 0 ∈ C with
counterclockwise orientation. (This is used as the definition of p(a, z) in the standard
approach and for proving its properties using holomorphicity of the resolvent map w ↦ Ra(w).)

13.54 Remark The operators ((a/z)^n − 1)^{−1} = −pn already appeared in our (that is Rickart’s)
proof of the Beurling-Gelfand theorem, where we did not need their convergence as n → ∞.
The result of (vi) to the effect that the limit p(a, z), known as (F.) Riesz idempotent, is given by
a contour integral establishes a connection between Rickart’s proof and the standard textbook
proof via complex analysis. See also [132, §149]. 2

Since we need a unit in order to define InvA and σ(a), the following construction is quite
important when dealing with non-unital algebras:

13.55 Exercise (Unitization of Banach algebras) Let A be a Banach algebra over F,
possibly without unit. Define Ã = A ⊕ F, which is an F-vector space in the obvious way. For
(a, α), (b, β) ∈ Ã define (a, α)(b, β) = (ab + αb + βa, αβ) and ‖(a, α)‖1 = ‖a‖ + |α|. Prove:
(i) Ã is an algebra with unit (0, 1), the map ι : A → Ã, a ↦ (a, 0) is an algebra homomorphism, and ι(A) ⊆ Ã is a two-sided ideal.
(ii) (Ã, ‖·‖1) is a normed algebra, and ι : A → Ã is an isometry.
(iii) (Ã, ‖·‖1) is complete.

13.3 Spectra of bounded operators II: Banach algebra methods
In this section, we apply our results on abstract Banach algebras to the study of some operators
on Banach spaces.

13.56 Exercise Consider the left and right shift operators L, R of Definition 13.2 in the Hilbert
space `2 (N, C).
(i) Prove σ(L) = σ(R) = B(0, 1) (closed unit disc).
(ii) Determine σc and σr for L, R.
(iii) Find σap and σcp for L, R.
Hint: Use Exercises 13.7, 13.9.

13.57 Exercise Let H = `2 (N, C) and define A ∈ B(H) by (Af )(n) = f (n)/n. Determine
σp (A), σc (A), σr (A).

13.58 Exercise (Assuming some measure theory) Let H = L2 ([a, b], C), where −∞ <
a < b < ∞, and define A ∈ B(H) by (Af )(x) = xf (x). Prove σc (A) = [a, b] and σp (A) =
σr (A) = ∅.

13.59 Exercise Prove that for every compact set C ⊆ C there is an operator A ∈ B(H),
where H is a separable Hilbert space, such that σ(A) = C.
Hint: Prove and use that C has a countable dense subset.

13.60 Exercise Prove that every quasi-nilpotent operator on a finite-dimensional complex


Banach space is nilpotent and non-injective, thus 0 ∈ σp .

13.61 Exercise Let V = C([0, 1], F) with (complete) norm ‖·‖∞. Define the Volterra operator
A : V → V by (Af)(x) = ∫_0^x f(t) dt. Prove
(i) A is injective.
(ii) A is bounded and satisfies ‖A^n‖ = 1/n! for all n ∈ N.
(iii) A is quasi-nilpotent, but not nilpotent, and 0 ∈ σr(A).
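For (ii) and (iii) it helps to see the factorials appear. The sketch below (ours, not from the notes) represents polynomials by coefficient lists and applies A repeatedly to the constant function 1, giving A^n 1 = x^n/n!; its sup norm on [0, 1] is 1/n!, the value asserted for ‖A^n‖ in (ii), and (1/n!)^{1/n} → 0 illustrates quasi-nilpotence.

```python
from fractions import Fraction
import math

def volterra(p):
    # (Af)(x) = integral_0^x f(t) dt, on a polynomial p = [c0, c1, ...] (meaning sum_k ck x^k):
    # integrate term by term, shifting each coefficient up one degree
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(p)]

p = [Fraction(1)]                      # the constant function 1
for n in range(1, 8):
    p = volterra(p)
    assert p == [Fraction(0)] * n + [Fraction(1, math.factorial(n))]  # A^n 1 = x^n / n!

# sup norm of x^n/n! on [0,1] is 1/n!, and (1/n!)^{1/n} -> 0: A is quasi-nilpotent,
# while A^n 1 != 0 for every n: A is not nilpotent
roots = [(1 / math.factorial(n)) ** (1 / n) for n in (1, 5, 20, 100)]
assert all(x > y for x, y in zip(roots, roots[1:]))
assert roots[-1] < 0.03
```

Of course this only computes A on polynomials; proving ‖A^n‖ ≤ 1/n! for all of C([0, 1]) is the content of the exercise.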
The next three exercises study an important class of bounded operators on `2 (N), the
‘weighted shift operators’, generalizing the right shift R, and two typical applications:

13.62 Exercise (i) Let H be a Hilbert space and A ∈ B(H). Define pn = ‖A^n‖ for n ∈ N0
and prove that pi+j ≤ pi pj ∀i, j.
(ii) Let {αk}k∈N0 ⊆ C. Put H = ℓ^2(N0, C) with natural ONB {ek} and define a linear ‘weighted
shift operator’ A : H → H by Aek = αk ek+1. Prove that ‖A‖ = supk |αk|.
(iii) Let {pk}k∈N0 be given so that p0 = 1, pk > 0 ∀k and pi+j ≤ pi pj ∀i, j. Define
α0 = p0 and αk = pk/pk−1 for k ≥ 1.
Prove that the sequence {αk} and the associated weighted shift operator A are bounded.
(iv) Let A be constructed from {pk}k∈N0 as in (iii). Prove ‖A^n‖ = pn ∀n ∈ N0.
(v) (Bonus) Adapt the above construction to the case where pk = 0 is allowed for some k > 0.

13.63 Exercise Let H = `2 (N, C) and define A ∈ B(H) by Aek = 2−k ek+1 ∀k. Prove that A
is (i) injective, (ii) quasi-nilpotent, but (iii) not nilpotent.

13.64 Exercise Let H = ℓ^2(N, C) and define A ∈ B(H) by Aek = αk ek+1, where αk = 2 for
odd k and αk = 1/2 for even k. Compute ‖A^n‖ for all n and show that n ↦ ‖A^n‖^{1/n} is not
monotonically decreasing.
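The computation can be sketched in code (ours, not from the notes). Since A^n e_k = (α_k · · · α_{k+n−1}) e_{k+n}, the norm ‖A^n‖ is the supremum over k of the products of n consecutive weights; the weights here are 2-periodic, so it suffices to check two starting points.

```python
from fractions import Fraction

def alpha(k):
    # weights of Exercise 13.64: 2 for odd k, 1/2 for even k
    return Fraction(2) if k % 2 == 1 else Fraction(1, 2)

def norm_power(n):
    # ||A^n|| = sup_k alpha_k * ... * alpha_{k+n-1}; by 2-periodicity k in {1, 2} suffices
    best = Fraction(0)
    for k in (1, 2):
        prod = Fraction(1)
        for i in range(k, k + n):
            prod *= alpha(i)
        best = max(best, prod)
    return best

norms = [norm_power(n) for n in range(1, 7)]
assert norms == [2, 1, 2, 1, 2, 1]          # ||A^n|| alternates
roots = [float(norm_power(n)) ** (1 / n) for n in range(1, 7)]
# 2, 1, 2^(1/3), 1, 2^(1/5), 1: not monotonically decreasing,
# although lim ||A^n||^{1/n} = r(A) = 1 still exists (Beurling-Gelfand)
assert roots[0] > roots[1] < roots[2]
```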

13.65 Exercise Let V be a Banach space and A ∈ B(V ).


(i) Prove that A is not bounded below if and only if it is a topological left zero-divisor of
B(V ) as defined in Exercise 13.34.90
(ii) Conclude that ∂σ(A) ⊆ σap (A).
Note that the preceding exercises only used results up to Section 13.2.2! The next exercise
is a (rather weak, given the strong hypothesis) converse of Exercise 13.8:

13.66 Exercise Let V be a complex Banach space and A ∈ B(V ) such that σ(A) is disjoint
from the circle C = {z ∈ C | |z − z0 | = r}.
(i) Apply Exercise 13.53 to A − z0 1 to obtain P^2 = P ∈ B(V) satisfying PA = AP.
(ii) Prove that AVi ⊆ Vi, where V1 = PV, V2 = (1 − P)V. Conclude that A = A1 ⊕ A2 where
Ai = A|Vi.
(iii) Prove V1 = {x ∈ V | lim_{n→∞}(A − z0 1)^n x = 0}.
(iv) Deduce σ(A1) = σ(A) ∩ B(z0, r) and σ(A2) = σ(A)\B(z0, r).

13.67 Remark The unnatural assumption that the two parts of the spectrum are separated
by a circle can be removed using holomorphic functional calculus. Cf. e.g. [132, §149], [30,
Chapter VII, §4]. For normal operators on Hilbert space, there is a more powerful approach, cf.
Proposition 17.24. 2

13.68 Exercise (Isolated points in the spectrum) Let V be a complex Banach space,
A ∈ B(V ) and λ ∈ σ(A) isolated. Pick r > 0 such that B(λ, r) ∩ σ(A) = {λ}, and put
C = {z | |z − λ| = r} and let Pλ , Vi , Ai be as constructed in Exercise 13.66. Prove that
(i) A1 − λ1 is quasi-nilpotent.
(ii) A2 − λ1 is invertible.
(iii) If dim V1 < ∞ then λ ∈ σp (A1 ) ⊆ σp (A).
Even though we developed the theory of the Riesz projector (Exercises 13.53, 13.66, 13.68)
reasonably fully only for isolated points of the spectrum, it will suffice for applications to
the spectral theory of compact operators in Section 14.4 and to the discussion of the discrete
spectrum, cf. Section B.10.
90 Thus if A is a unital Banach algebra and a ∈ A, defining σap(a) = {λ ∈ F | a − λ1 is a topological left zero-divisor} is consistent with the usual definition when A = B(V). With this, (ii) also holds for Banach algebras.

13.4 Applications to normal Hilbert space operators
We will have more to say on abstract Banach algebras, in particular the subclass of C ∗ -algebras
still to be defined. But before turning to these matters, we will consider some applications of
the above to operator theory.

13.69 Proposition If H is a complex Hilbert space and A ∈ B(H) is normal then
(i) r(A) = ‖A‖.
(ii) There exists λ ∈ σ(A) such that |λ| = ‖A‖.
(iii) ‖A‖ = |||A||| = sup_{‖x‖=1} |⟨Ax, x⟩|.
Proof. (i) By Exercise 11.31 we have ‖A^{2^n}‖ = ‖A‖^{2^n} for all normal A. Now Theorem 13.39
gives

    r(A) = lim_{n→∞} ‖A^n‖^{1/n} = lim_{n→∞} ‖A^{2^n}‖^{1/2^n} = lim_{n→∞} (‖A‖^{2^n})^{1/2^n} = ‖A‖.    (13.6)

(ii) Since σ(A) is compact, the continuous real-valued function σ(A) → R, λ ↦ |λ| assumes
its supremum r(A), which equals ‖A‖ by (i).
(iii) By Exercise 13.15 we have r(A) ≤ |||A||| ≤ ‖A‖ for every A ∈ B(H). Now for normal A
the claim clearly follows from (i). 

13.70 Remark Recall that Exercise 11.35 gave a direct proof of |||A||| = ‖A‖ not using Theorem
13.39. Using this one can also give a direct proof [14] of (i),(ii): In Exercise B.141 it is shown
that for every A ∈ B(H) there exists a sequence {xn} with ‖xn‖ = 1 such that ⟨Axn, xn⟩ → λ,
where |λ| = |||A|||, which equals ‖A‖ by normality. Now

    ‖Axn − λxn‖^2 = ⟨Axn − λxn, Axn − λxn⟩
                  = ⟨Axn, Axn⟩ − λ̄⟨Axn, xn⟩ − λ⟨xn, Axn⟩ + |λ|^2⟨xn, xn⟩.

The sum of the three rightmost terms converges to −|λ|^2 = −‖A‖^2. Since ⟨Axn, Axn⟩ =
‖Axn‖^2 ≤ ‖A‖^2 ∀n, we have ‖Axn − λxn‖ → 0, proving that A − λ1 is not bounded below.
Thus λ ∈ σ(A), so that r(A) ≥ |λ| = ‖A‖. Combining this with r(A) ≤ ‖A‖ from Proposition
13.27, we have r(A) = ‖A‖. 2

13.71 Exercise Let A ∈ B(H) be normal. Prove:
(i) ‖A^n‖ = ‖A‖^n ∀n ∈ N.
(ii) If σ(A) = {λ} then A = λ1.
(iii) For all λ ∈ C\σ(A) we have ‖RA(λ)‖ = ‖(A − λ1)^{−1}‖ = (dist(λ, σ(A)))^{−1}.

14 Compact operators II: Spectral theorems


14.1 The spectrum of compact operators. Fredholm alternative
We now begin studying the spectrum of compact operators. Throughout, V is a Banach space.

14.1 Lemma If A ∈ B(V ) is compact and λ ∈ F\{0} then ker(A − λ1) is finite-dimensional.

Proof. If λ 6∈ σ(A) then this is trivial since A − λ1 is invertible. In general, Vλ = ker(A − λ1)
is the space of eigenvectors of A with eigenvalue λ. Clearly A|Vλ = λ idVλ , so that Vλ is an
invariant subspace. Since Vλ is closed and A|Vλ is compact by Remark 12.6.3, Vλ must be finite-
dimensional by Remark 12.6.4. 

For λ = 0, the above does not hold since the zero operator on any V is compact.

14.2 Proposition (Fredholm alternative) 91 Let V be a Banach space, A ∈ B(V ) com-


pact and λ ∈ F\{0}. Then the following are equivalent:
(i) A − λ1 is invertible. (I.e. λ 6∈ σ(A).)
(ii) A − λ1 is injective.
(iii) A − λ1 is surjective.
Proof. We know from Proposition 7.41 that (i) is equivalent to the combination of (ii) and (iii).
It therefore suffices to prove (ii)⇔(iii).
(iii)⇒(ii): It suffices to do this for λ = 1. (Why?) Assume that A − 1 is not injective, but
surjective. Then (A − 1)^n is surjective for all n. In view of (A − 1)^{n+1} = (A − 1)(A − 1)^n
we have ker(A − 1)^{n+1} ⊇ ker(A − 1)^n. We claim that this inclusion is strict for each n, i.e.
ker(A − 1)^{n+1} ⊋ ker(A − 1)^n: By non-injectivity of A − 1 we can find y ∈ V\{0} such that
(A − 1)y = 0. By surjectivity of (A − 1)^n there exists x such that (A − 1)^n x = y. Now
(A − 1)^{n+1} x = (A − 1)y = 0 while (A − 1)^n x = y ≠ 0, thus x ∈ ker(A − 1)^{n+1} \ ker(A − 1)^n.
Now by Riesz’ Lemma 12.2, for each n we can find an xn ∈ ker(A − 1)^{n+1} such that
‖xn‖ = 1 and dist(xn, ker(A − 1)^n) ≥ 1/2. If n > m then (A − 1)^n Axm = A(A − 1)^n xm = 0 and
(A − 1)^{n+1} xn = 0, implying (A − 1)xn − Axm ∈ ker(A − 1)^n. With the definition of {xn} it
follows that for all n > m (thus also n < m) we have

    ‖Axn − Axm‖ = ‖xn + ((A − 1)xn − Axm)‖ ≥ dist(xn, ker(A − 1)^n) ≥ 1/2.

Thus {Axn} has no convergent subsequence, contradicting the compactness of A.
(ii)⇒(iii): If A − λ1 is injective but not surjective, one similarly proves (A − λ1)^{n+1}V ⊊
(A − λ1)^n V for all n, which again leads to a contradiction with compactness of A. 

14.3 Remark 1. The result fails for λ = 0 since there are compact injective operators that are
not surjective.
2. In the above proof we have shown that if A ∈ B(V) is compact and λ ∈ F\{0} then we
cannot have ker(A − λ1)^{n+1} ⊋ ker(A − λ1)^n for all n or (A − λ1)^{n+1}V ⊊ (A − λ1)^n V for all n.
One says that A − λ1 has finite ascent and descent.
3. Compactness is essential for this stabilization: For the shift operators on V = ℓ^p(N) we
have ker L^{n+1} ⊋ ker L^n and R^{n+1}V ⊊ R^n V for all n. 2

Fredholm’s alternative has far-reaching consequences for the spectrum of A:

14.4 Corollary If A ∈ B(V ) is compact then


(i) σ(A) ⊆ σp (A) ∪ {0}. (Thus all non-zero elements of the spectrum are eigenvalues.)
(ii) If V is infinite-dimensional then 0 ∈ σ(A).
91 Erik Ivar Fredholm (1866-1927). Swedish mathematician. Early pioneer of functional analysis through his work on integral equations.

Proof. (i) For λ ≠ 0, with Proposition 14.2 we have: λ ∈ σ(A) ⇔ A − λ1 not invertible ⇔ A − λ1
not injective ⇔ λ ∈ σp(A).
(ii) If A were invertible then 1 = A^{−1}A would be compact by Lemma 12.10, but this is false
by Remark 12.6.3. Thus 0 ∈ σ(A). 

14.5 Exercise Show that a compact operator A can have 0 in any of σp (A), σc (A), σr (A).

14.6 Exercise Let H be a Hilbert space, A ∈ K(H) and λ ∈ C\{0}. Show that each of the
implications (ii)⇒(iii) and (iii)⇒(ii) in Proposition 14.2 can be deduced from the other.

14.2 ⋆ A glimpse of Fredholm operators


The content of this subsection is not needed for the proof of the spectral theorem, but puts the
Fredholm alternative into perspective. (For more on Fredholm operators see Section B.9.)
Recall that if A : V → W is a linear map, the cokernel of A by definition is the linear quotient
space W/AV . If V, W are Hilbert spaces and AV ⊆ W is closed then, recalling Exercise 5.32
we may alternatively define the cokernel of A to be (AV )⊥ ⊆ W .

14.7 Proposition Let A ∈ B(V ) be compact and λ ∈ F\{0}. Then


(i) coker(A − λ1) = V /((A − λ1)V ) is finite-dimensional.
(ii) (A − λ1)V ⊆ V is closed.
Proof. (i) By Theorem 12.17, A^t is compact, thus ker(A^t − λ1V∗) is finite-dimensional by Lemma
14.1. Now

    ker(A^t − λ1V∗) = ker[(A − λ1V)^t] = ((A − λ1V)V)^⊥ ≅ (V/((A − λ1V)V))^∗,

where the second equality and the final isomorphism come from Exercises 9.35(i) and 6.7, respectively. Thus (V/((A − λ1V)V))^∗ is finite-dimensional, implying (why?) finite-dimensionality
of V/((A − λ1V)V) = coker(A − λ1V).
(ii) This follows from (i) and Exercise 7.11.
Alternatively, a direct proof goes as follows: By Lemma 14.1, K = ker(A − λ1) is finite-dimensional, thus closed by Exercise 3.22. Thus there is a closed subspace S ⊆ V such that
V = K ⊕ S. (If V is a Hilbert space, we can just take S = K^⊥. For general Banach spaces
this is the statement of Proposition 6.11.) The restriction (A − λ1)|S : S → V is injective.
If (A − λ1)|S is not bounded below, we can find a sequence {xn} in S with ‖xn‖ = 1
for all n and ‖(A − λ1)xn‖ → 0. Since A is compact, we can find a subsequence {xnk} such
that {Axnk} converges. We relabel, so that now {Axn} converges. Now

    xn = λ^{−1}[Axn − (A − λ1)xn].

Since {Axn} converges and {(A − λ1)xn} converges to zero by choice of {xn}, {xn} converges
to some y ∈ S (since xn ∈ S ∀n and S is closed). From (A − λ1)xn → 0 and xn → y we obtain
(A − λ1)y = 0, so that y ∈ ker(A − λ1) = K. Thus y ∈ K ∩ S = {0}, which is impossible since
y = limn xn and ‖xn‖ = 1 ∀n. This contradiction shows that (A − λ1)|S is bounded below. Now
Lemma 7.39 gives that (A − λ1)V = (A − λ1)S is closed. 

14.8 Remark In fact one can prove more: If A ∈ B(V ) is compact and λ ∈ C\{0} then

dim ker(A − λ1) = dim coker(A − λ1). (14.1)

This clearly is much stronger than the equivalence (ii)⇔(iii) in Proposition 14.2, which amounts
to the statement dim ker(A − λ1) = 0 ⇔ dim coker(A − λ1) = 0. See Remark 14.10.2. 2

14.9 Definition If V, W are Banach spaces then A ∈ B(V, W) is called a Fredholm operator if
both ker A and coker A are finite-dimensional.
If A is Fredholm, ind(A) = dim ker A − dim coker A ∈ Z is the (Fredholm) index of A.
Note that we do not require closedness of the image of A since by Exercise 7.11 it follows
automatically from the finite-dimensionality of W/AV.

14.10 Remark 1. If V, W are Banach spaces and A ∈ B(V, W ) then At ∈ B(W ∗ , V ∗ ) is


Fredholm if and only if A is. In this case, ind(At ) = −ind(A). (Proposition B.105)
2. If A, B are Fredholm then so is AB and ind(AB) = ind(A)+ind(B). (Proposition B.104.)
3. If A ∈ B(V ) is compact and λ 6= 0 then Lemma 14.1 and Proposition 14.7 give that
A − λ1 is Fredholm, and (14.1) amounts to ind(A − λ1) = 0.
4. The latter identity follows immediately by combining the trivial fact that 1 is Fredholm
with index zero with the following important stability result: If F is Fredholm and K is compact
then F + K is Fredholm and ind(F + K) = ind(F ). (Theorem B.108.)
5. Another important connection between compact and Fredholm operators is Atkinson’s
theorem: A ∈ B(V) is Fredholm if and only if there exists B ∈ B(V) such that AB − 1 and BA − 1
are compact. (Equivalently, the image of A in the quotient algebra B(V)/K(V) is invertible.)
(Theorem B.107.) 2

14.3 Spectral theorems for compact Hilbert space operators


14.11 Proposition Let H be a non-zero complex Hilbert space and A ∈ B(H) a compact
normal operator. Then there is an eigenvalue λ ∈ σp (A) such that |λ| = kAk.
Proof. If A = 0 then it is clear that λ = 0 does the job. Now assume A 6= 0. By Proposition
13.69(ii) there exists λ ∈ σ(A) with |λ| = kAk. Since λ 6= 0, Corollary 14.4 gives λ ∈ σp (A). 

In linear algebra one proves that for a matrix A ∈ Mn×n (C) the following are equivalent: A
is normal, A can be diagonalized by a unitary matrix, Cn has an orthonormal basis consisting
of eigenvectors of A, the (geometric) dimension of each eigenspace of A coincides with the
(algebraic) multiplicity of the corresponding eigenvalue. Cf. e.g. [55, Theorem 6.16]. In basis-
independent language, A ∈ B(H) with H finite-dimensional is normal if and only if H admits an
ONB consisting of eigenvectors of A. The following beautiful result generalizes this to compact
normal operators:

14.12 Theorem (Spectral theorem for compact normal operators) Let H be a com-
plex Hilbert space and A ∈ B(H) compact normal. Then
(i) H is spanned by the eigenvectors of A.
(ii) There is an ONB E of H consisting of eigenvectors, thus A = Σ_{e∈E} λe Pe, where Pe =
e ⊗ e : x ↦ ⟨x, e⟩e.
(iii) For each ε > 0 there are at most finitely many λ ∈ σp (A) with |λ| ≥ ε.
(iv) σp (A) is at most countable and has no accumulation points except perhaps 0, which is an
accumulation point whenever σ(A) is infinite.

(v) We have σ(A) ⊆ σp(A) ∪ {0}, where 0 ∈ σp(A) if and only if A has non-zero kernel, and 0 ∈ σc(A)
if and only if A is injective and σ(A) is infinite.
Proof. (i) Let K ⊆ H be the smallest closed linear subspace containing ⋃_{λ∈σp(A)} Hλ, where
Hλ = ker(A − λ1). Clearly K is an invariant subspace: AK ⊆ K. Exercise 13.12(i) implies
that also A∗K ⊆ K. Now Exercise 11.23(i) gives that also K^⊥ is A-invariant: AK^⊥ ⊆ K^⊥. If
K^⊥ ≠ {0} then A|K^⊥ is compact and has eigenvectors by Proposition 14.11. Since this would
contradict the definition of K, we have K^⊥ = {0}, proving that H is spanned by the eigenvectors
of A.
(ii) By Exercise 13.12(ii), the eigenspaces for different eigenvalues of A are mutually orthogonal. Now the claim follows from (i) by choosing ONBs Eλ for each Hλ and putting E = ⋃_λ Eλ.
(iii) Taking into account the unitary equivalence H ≅ ℓ^2(E, C), cf. Theorem 5.45, this
essentially is Exercise 12.13(iii).
(iv) This is an immediate consequence of (iii).
(v) Since A is normal, 0 ∈ σr (A) is ruled out by Exercise 13.13(i). The statement about
0 ∈ σp (A) is trivially true by definition. The one about σc (A) now follows from (iv) and the
closedness of σ(A). 

14.13 Remark 1. The statements about σ(A) actually hold for all compact operators on
Banach spaces. (Instead of the orthogonality of eigenvectors for different eigenvalues, it suffices
to use their linear independence.)
2. The common theme of ‘spectral theorems’ is that normal operators can be diagonalized,
i.e. be interpreted as multiplication operators, compactness simplifying statement and proof
considerably. Compare Theorem 18.4 for a result not requiring compactness.
3. If A ∈ B(H) is compact normal and f : σ(A) → C a function, we formally define f(A)
as Σ_{e∈E} f(λe)Pe, where E, λe, Pe are as in Theorem 14.12. The sum converges strongly to a
bounded operator if and only if f is bounded, and f (A) is compact if and only if f (λ) → 0
as λ → 0. Thus setting up a ‘functional calculus’ for compact normal operators is quite easy.
Our discussion of not-necessarily-compact normal operators will proceed in the opposite order:
We begin in Section 17 by constructing a functional calculus, which will then be used to prove
spectral theorems. 2

For non-normal operators one can only prove a weaker statement:

14.14 Proposition Let H be a complex Hilbert space and A ∈ K(H). Then there are orthonormal
sets (not necessarily bases!) E and F of H, a bijection E → F, e ↦ fe and positive numbers
{βe}, called the singular values of A, such that e ↦ βe is in c0(E, C) and

    A = Σ_{e∈E} βe fe ⟨·, e⟩.

The βe are precisely the non-zero eigenvalues of |A|.


Proof. B = A∗A is compact and self-adjoint, so that by Theorem 14.12 there is an ONB EB
diagonalizing B, thus B = Σ_{e∈EB} λe Pe. Clearly E = {e ∈ EB | Ae ≠ 0} is orthonormal. For
e ∈ E put fe = Ae/‖Ae‖, and let F = {fe | e ∈ E}. The fe are normalized by definition,
and if e, e′ ∈ E, e ≠ e′ then ⟨Ae, Ae′⟩ = ⟨e, A∗Ae′⟩ = 0 since EB diagonalizes A∗A, so that F is
orthonormal. For all x ∈ H we have

    Ax = A Σ_{e∈EB} ⟨x, e⟩e = Σ_{e∈EB} ⟨x, e⟩Ae = Σ_{e∈E} ⟨x, e⟩Ae = Σ_{e∈E} ‖Ae‖⟨x, e⟩fe,

and putting βe = ‖Ae‖ > 0 we have the desired form.
Since EB diagonalizes A∗A, we have A∗Ae = λe e for all e ∈ E, where compactness of A∗A implies that e ↦ λe is in c0(EB, C), cf. Theorem 14.12(iii). Now ‖Ae‖^2 = ⟨Ae, Ae⟩ = ⟨e, A∗Ae⟩ =
λe, thus βe = ‖Ae‖ = λe^{1/2} implies that also e ↦ βe is in c0(E). For the final claim, note that
|A|^2 e = A∗Ae = λe e, thus |A|e = βe e. 

The following goes some way towards proving that Hilbert spaces have the approximation
property:

14.15 Corollary Let H be a complex Hilbert space, A ∈ K(H) and ε > 0. Then there is a
B ∈ F(H) (finite rank) with ‖A − B‖ ≤ ε. Thus K(H) is the norm closure of F(H).
Proof. Pick a representation A = Σ_{e∈E} λe fe ⟨·, e⟩ as in the preceding proposition. Since E →
C, e ↦ λe is in c0(E), there is a finite subset F ⊆ E such that |λe| < ε for all e ∈ E\F. Define

    B = Σ_{e∈F} λe fe ⟨·, e⟩,

which clearly has finite rank. If x ∈ H then using the orthonormality of {fe} and Bessel’s
inequality for E, we have

    ‖(A − B)x‖^2 = ‖Σ_{e∈E\F} λe ⟨x, e⟩fe‖^2 = Σ_{e∈E\F} |λe ⟨x, e⟩|^2 ≤ ε^2 ‖x‖^2.

Thus ‖A − B‖ ≤ ε, so that K(H) is contained in the norm closure of F(H). The converse inclusion was Corollary 12.12. 
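As a concrete instance (ours, not from the notes): for the compact diagonal operator Ae_k = e_k/k on ℓ^2(N), the truncation B_N keeping only the first N diagonal terms is finite rank, and since the norm of a diagonal operator is the supremum of the absolute diagonal entries, ‖A − B_N‖ = 1/(N + 1) → 0. A minimal sketch with exact arithmetic:

```python
from fractions import Fraction

def tail_norm(N, K=10_000):
    # ||A - B_N|| for the diagonal operator A e_k = e_k / k, computed over the
    # first K >> N coordinates; the sup of 1/k over k > N is attained at k = N + 1
    return max(Fraction(1, k) for k in range(N + 1, K + 1))

for N in (1, 10, 100):
    assert tail_norm(N) == Fraction(1, N + 1)
```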

14.16 Remark 1. In the above, bases played a crucial role. Even though there is no notion of
orthogonality in general Banach spaces, it turns out that for Banach spaces V having suitable bases
K(V) still is the norm closure of F(V), i.e. the approximation property holds. Cf. e.g. [102, Theorem 4.1.33].
2. If you like applications of complex analysis to functional analysis, see [128, Section VI.5]
for an interesting alternative approach to compact operators. 2

14.4 ⋆ Spectral theorem (Jordan normal form) for compact Banach space operators

We now return to compact operators on general Banach spaces, including of course non-normal
compact operators on Hilbert spaces. We begin by reproving some assertions already shown
for compact normal Hilbert space operators:

14.17 Proposition Let V be a Banach space and A ∈ B(V ) compact. Then


(i) For each ε > 0 there are at most finitely many λ ∈ σp (A) with |λ| ≥ ε.
(ii) σp (A) is at most countable and has no accumulation points except perhaps 0, which is an
accumulation point whenever σ(A) is infinite.
Proof. We follow [102]. (i) For ε > 0 we define Σε = {λ ∈ σ(A) | |λ| ≥ ε} = σ(A)\B(0, ε), which
is compact. We must show that all Σε are finite. Assume this is not the case for some ε > 0.
By compactness, Σε then has an accumulation point. I.e. there is a λ ∈ Σε and a sequence {λn }
of mutually distinct elements of Σε converging to λ. Thus we obtain a contradiction if we prove
that every sequence of distinct non-zero eigenvalues must converge to zero.

Let thus {λn}n∈N ⊂ Σε be mutually distinct non-zero eigenvalues of A. Pick corresponding
eigenvectors {xn} and put Wn = spanF(x1, . . . , xn). Since we know from linear algebra that
the sets {x1, . . . , xn} all are linearly independent, we have Wn ⊊ Wn+1 for all n. Thus by
Riesz’ Lemma 12.2 there are unit vectors yn+1 ∈ Wn+1 such that dist(yn+1, Wn) ≥ 1/2. Since
(A − λn+1 1) kills xn+1, we have (A − λn+1 1)(yn+1) ∈ Wn. Now for all j > k > 1 we have

    A(λj^{−1} yj) − A(λk^{−1} yk) = λj^{−1}(A − λj 1)(yj) − λk^{−1}(A − λk 1)(yk) + yj − yk
                                = yj − [−λj^{−1}(A − λj 1)(yj) + λk^{−1}(A − λk 1)(yk) + yk].

Since the expression in square brackets lies in Wj−1, which has distance ≥ 1/2 from yj, we have
‖A(λj^{−1} yj) − A(λk^{−1} yk)‖ ≥ 1/2, which clearly holds for all j ≠ k. Thus the sequence {A(λj^{−1} yj)}
has no convergent subsequence. Since A is compact, this proves that {λj^{−1} yj} has no bounded
subsequence. With ‖yj‖ = 1 ∀j, this proves λj → 0, as desired.
(ii) This immediately follows from (i) in the same way as for compact normal Hilbert space
operators. 

By the above (which also holds over R), every λ ∈ σ(A)\{0} is isolated. Restricting to
F = C, Exercises 13.53, 13.66, 13.68 provide a Riesz idempotent Pλ ∈ B(V) commuting with
A and such that σ(A|Pλ V) = {λ} and σ(A|(1 − Pλ)V) = σ(A)\{λ}. Thus Vλ = Pλ V is a
closed A-invariant subspace. If λ, λ′ ∈ σ(A)\{0} are distinct, Pλ and Pλ′ commute since both
are limits of inverses of polynomials in A. It is easy to see that Pλ Pλ′ = Pλ′ Pλ = 0.

14.18 Proposition Let V be a complex Banach space, A ∈ B(V) compact and λ ∈ σ(A)\{0}.
Then
(i) the generalized eigenspace ⋃_{n=1}^∞ ker(A − λ1)^n of λ coincides with Vλ = Pλ V and
(ii) is finite-dimensional,
(iii) (A − λ1)|Vλ is nilpotent.
Proof. Since Vλ is invariant under A, the restriction A|Vλ is compact. By construction of Pλ,
we have σ(A|Vλ) = {λ}. Since λ ≠ 0, we have 0 ∉ σ(A|Vλ) so that A|Vλ is invertible. Thus
A|Vλ is compact and invertible, implying that Vλ is finite-dimensional. Since σ(A|Vλ) = {λ},
(A − λ1)|Vλ is quasi-nilpotent, thus nilpotent by Exercise 13.60. Thus for every x ∈ Vλ we
have (A − λ1)^n x = 0 for some n, proving Vλ ⊆ ⋃_{n=1}^∞ ker(A − λ1)^n. On the other hand, the fact
that A − λ1 is invertible on (1 − Pλ)V implies the converse inclusion ⋃_{n=1}^∞ ker(A − λ1)^n ⊆ Vλ.
This concludes the proof. 

14.19 Remark An alternative proof for the finite-dimensionality of the generalized eigenspace
(but not its coincidence with Vλ = Pλ V ) proceeds as follows:
Let B ∈ B(V) be arbitrary and n ∈ N. If x ∈ ker B^{n+1} then Bx ∈ ker B^n, so that B restricts
to a linear map ker B^{n+1} → ker B^n. And Bx ∈ ker B^{n−1} ⊆ ker B^n if and only if x ∈ ker B^n.
Thus B induces an injective linear map ker B^{n+1}/ker B^n → ker B^n/ker B^{n−1}, so that dim ker B^{n+1} −
dim ker B^n ≤ dim ker B^n − dim ker B^{n−1}. (For n = 1 this is dim ker B^2 − dim ker B ≤ dim ker B.)
Now by induction (or a telescoping sum) we have dim ker B^n ≤ n dim ker B.
Putting B = A − λ1, by the proof of Proposition 14.2 there is a d such that ker B^{n+1} = ker B^n
for all n ≥ d. Thus dim ⋃_{n=1}^∞ ker B^n = dim ker B^d ≤ d dim ker B < ∞, where we used Lemma
14.1. 2
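The kernel-growth bound can be watched in action on the d × d nilpotent Jordan block, where dim ker B^n = min(n, d) saturates dim ker B^n ≤ n · dim ker B (here dim ker B = 1). A sketch of ours, with a small exact rank routine:

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    # Gaussian elimination over the rationals
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                factor = M[i][col] / M[r][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

d = 5
B = [[1 if j == i + 1 else 0 for j in range(d)] for i in range(d)]  # Jordan block, eigenvalue 0
P = [[1 if i == j else 0 for i in range(d)] for j in range(d)]      # identity
for n in range(1, d + 2):
    P = matmul(P, B)                    # P = B^n
    assert d - rank(P) == min(n, d)     # dim ker B^n = min(n, d) <= n * dim ker B
```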

Now we can proceed as known from finite-dimensional linear algebra and find for each
λ ∈ σ(A)\{0} a basis for Vλ with respect to which A is given by a block diagonal matrix with
a finite number of standard Jordan blocks, the number of these blocks equaling the geometric
multiplicity dim ker(A − λ1) of λ. We refer to the literature, cf. e.g. [55, 84, 95].
If σ(A) is finite, then 0 is also an isolated point of σ(A) (if it belongs to it), so that we have
a Riesz projector P_0. Now we have an isomorphism V ≅ ⊕_{λ∈σ(A)} V_λ. Note that V_0 need not be finite-
dimensional, since there is an ample supply of compact quasi-nilpotent operators in infinite
dimensions, e.g. the classical Volterra operator and the weighted shift operators with weight
function decreasing fast enough. An attempt at classifying them would lead us too far.
If σ(A) is infinite, matters are more complicated. At least for every ε > 0 we have an
isomorphism V ≅ (⊕_{λ∈σ(A),|λ|>ε} V_λ) ⊕ V_{≤ε}, where V_{≤ε} is a closed A-invariant subspace with
σ(A↾V_{≤ε}) ⊆ B(0, ε). In this case 0 ∈ σ(A) is not isolated, so that we cannot define a Riesz
projector, but we can put V_0 = ⋂_{ε>0} V_{≤ε}. Now V_0 can be zero (as for A ∈ K(ℓ²) defined by
(Af)(n) = f(n)/n) or non-zero, in which case A↾V_0 again is compact and quasi-nilpotent.
For more on spectral theory and normal forms of compact operators see e.g. [24, 135].
Our last major target in this course is proving Theorem 18.4, an analogue of Theorem 14.12
for Hilbert space operators A ∈ B(H) that are normal but not necessarily compact. This will
require extensive preparations, but the mathematics needed is itself a central part of modern
functional analysis.
15 Some functional calculus for Banach algebras
15.1 Characters. Spectrum of a Banach algebra
We now develop a new perspective on the spectrum that will prove very powerful, allowing us
to obtain results that would be hard to reach in other ways. For example: if A is a unital Banach
algebra and a, b ∈ A, what can we say about σ(a + b) or σ(ab)? Using only the definition of
the spectrum this seems quite difficult. In this section we require ‖1‖ = 1.
15.1 Definition If A, B are F-algebras, an (algebra) homomorphism α : A → B is an F-linear
map such that also α(aa′) = α(a)α(a′) ∀a, a′ ∈ A. If A, B are unital, α is called unital if
α(1_A) = 1_B. Algebra homomorphisms from an F-algebra to F are called characters. An algebra
isomorphism is a bijective algebra homomorphism.
15.2 Lemma If A, B are unital algebras and α : A → B is a unital algebra homomorphism then
σ_B(α(a)) ⊆ σ_A(a) ∀a ∈ A.
Proof. If λ ∉ σ_A(a) then a − λ1_A ∈ A has an inverse b. Then α(b) is an inverse for α(a − λ1_A) =
α(a) − λ1_B ∈ B, thus λ ∉ σ_B(α(a)). 
15.3 Lemma Let A be a unital Banach algebra. Then every non-zero character ϕ : A → F
satisfies ϕ(1) = 1, ϕ(a) ∈ σ(a) ∀a ∈ A and ‖ϕ‖ = 1; in particular ϕ is continuous.
Proof. Since ϕ ≠ 0 we can find a ∈ A with ϕ(a) ≠ 0. Now ϕ(a) = ϕ(a1) = ϕ(a)ϕ(1), and
dividing by ϕ(a) gives ϕ(1) = 1. Thus every non-zero character is a unital homomorphism,
so that Lemma 15.2 gives σ_F(ϕ(a)) ⊆ σ_A(a). With σ_F(x) = {x} we have ϕ(a) ∈ σ(a), thus
|ϕ(a)| ≤ r(a) ≤ ‖a‖ by Proposition 13.27, whence ‖ϕ‖ ≤ 1. Since we require ‖1‖ = 1, we also
have ‖ϕ‖ ≥ |ϕ(1)|/‖1‖ = 1. 
15.4 Definition If A is a unital Banach algebra, the spectrum Ω(A) of A is the set of non-zero
characters ϕ : A → F.
15.5 Exercise Let X be a compact Hausdorff space and A = C(X, F). For every x ∈ X define
ϕ_x : A → F, f ↦ f(x). Prove:
(i) ϕ_x is a non-zero character of A, thus ϕ_x ∈ Ω(A), for each x ∈ X.
(ii) The map ι : X → Ω(A), x ↦ ϕ_x is injective.
(iii) For each f ∈ A we have σ(f) = {ϕ(f) | ϕ ∈ Ω(A)}. Do not use Proposition 15.7!
15.6 Remark Since characters are bounded, we have Ω(A) ⊆ A∗. Thus every topology on A∗
restricts to a topology on Ω(A). It will turn out that the 'right' one is the weak-∗ topology
from Section 10.3, for example since it makes the map ι : X → Ω(A) from the preceding exercise
a homeomorphism. But we defer this discussion to Section 19.1. 2
One could hope that σ(a) = {ϕ(a) | ϕ ∈ Ω(A)} holds for every unital Banach algebra A and
a ∈ A. While the inclusion ⊇ always holds, for equality one needs more:
15.7 Proposition Let A be a commutative unital Banach algebra over C. Then
(i) If ϕ ∈ Ω(A) then ker ϕ ⊆ A is a maximal ideal (i.e. not contained in a larger proper ideal).
(ii) Every maximal ideal in A is the kernel of a unique ϕ ∈ Ω(A). In particular, Ω(A) ≠ ∅.92
(iii) For each a ∈ A we have
σ(a) = {ϕ(a) | ϕ ∈ Ω(A)}. (15.1)
Proof. (i) It should be clear that M = ker ϕ is an ideal, and M ≠ A since ϕ ≠ 0. This ideal has
codimension one since A/M ≅ ϕ(A) = C and therefore is maximal.
(ii) Now let M ⊆ A be a maximal ideal. Since maximal ideals are proper, no element of
M is invertible. If b ∈ M satisfied ‖1 − b‖ < 1 then Lemma 13.19(i) would give invertibility
of b = 1 − (1 − b), a contradiction. (This is the only place where completeness is used.) We
thus have ‖1 − b‖ ≥ 1 for all b ∈ M, implying 1 ∉ M̄. Thus the closure M̄ is a proper ideal containing M.
Since M is maximal, we have M̄ = M, thus M is closed. Now by Proposition 6.1(vi), A/M is
a normed algebra, and by a well-known argument from commutative algebra, the maximality
of M implies that A/M is a field, thus a division algebra. Thus A/M ≅ C by the Gelfand-
Mazur theorem (Corollary 13.43), so that there is a unique isomorphism α : A/M → C sending
1 ∈ A/M to 1 ∈ C. If p : A → A/M is the quotient homomorphism then ϕ = α ∘ p : A → C is
a non-zero character with ker ϕ = M. This ϕ clearly is unique. Now Ω(A) ≠ ∅ follows from the
fact that every commutative unital algebra has maximal ideals (by a standard Zorn argument).
(iii) We already know that {ϕ(a) | ϕ ∈ Ω(A)} ⊆ σ(a), so that it remains to prove that for
every λ ∈ σ(a) there is a ϕ ∈ Ω(A) such that ϕ(a) = λ. If λ ∈ σ(a) then a − λ1 ∉ Inv A. Thus
the ideal I = (a − λ1)A ⊆ A does not contain 1 and therefore is proper. Using Zorn's lemma,
we can find a maximal ideal M ⊇ I. By (ii) there is a ϕ ∈ Ω(A) such that ker ϕ = M. Since
a − λ1 ∈ I ⊆ M = ker ϕ, we have ϕ(a − λ1) = 0, and with ϕ(1) = 1 we have ϕ(a) = λ. 
92 If A is an algebra over a field k, one must take care to distinguish between ring ideals I ⊆ A and algebra ideals.
Both are closed under addition and under multiplication by elements of A. Algebra ideals are linear subspaces, thus
also closed under multiplication by the scalars in k. Every algebra ideal clearly is a ring ideal, and the converse holds
if A has a unit (as is assumed here) since cx = (c1)x ∈ I for each c ∈ k and x ∈ I. But a non-unital algebra can have
ring ideals that are not algebra ideals.
15.8 Exercise Let A be a unital Banach algebra over C and a, b ∈ A.
(i) If A is abelian, prove σ(a + b) ⊆ σ(a) + σ(b) and σ(ab) ⊆ σ(a)σ(b). Conclude that
r(a + b) ≤ r(a) + r(b) and r(ab) ≤ r(a)r(b). (No use of Exercise 13.46!)
(ii) Prove that the results of (i) also hold for non-abelian A provided ab = ba.
(iii) Give examples of non-commuting a, b ∈ A = M2×2 (C) for which everything in (i) fails.
Hint: For (ii), use an abelian subalgebra and Exercise 13.50.
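The abelian inclusions of Exercise 15.8(i) can at least be sanity-checked numerically; this is of course not a proof, and it does not settle part (iii). In the sketch below (numpy; all choices arbitrary), b is taken to be a polynomial in a, a simple way of arranging ab = ba, and eigenvalues of matrices stand in for spectra:

```python
import numpy as np

rng = np.random.default_rng(6)
a = rng.standard_normal((3, 3))
b = a @ a                              # a polynomial in a, so ab = ba
sa = np.linalg.eigvals(a)              # σ(a): the eigenvalues, in finite dimensions
sb = np.linalg.eigvals(b)

sumset = np.array([x + y for x in sa for y in sb])    # σ(a) + σ(b)
prodset = np.array([x * y for x in sa for y in sb])   # σ(a)σ(b)

# σ(a+b) ⊆ σ(a)+σ(b) and σ(ab) ⊆ σ(a)σ(b), up to round-off:
for w in np.linalg.eigvals(a + b):
    assert np.min(np.abs(sumset - w)) < 1e-6
for w in np.linalg.eigvals(a @ b):
    assert np.min(np.abs(prodset - w)) < 1e-6
```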

15.9 Exercise Prove by way of counterexamples that the statements of Proposition 15.7(ii)+(iii)
all can fail if we drop the commutativity assumption or replace C by R.
15.10 Exercise Let A be a Banach algebra over F and Ã its unitization (Exercise 13.55).
Define ϕ_∞ : Ã → F, (a, α) ↦ α.
(i) Prove ϕ_∞ ∈ Ω(Ã).
(ii) Prove that every ϕ ∈ Ω(A) has a unique extension to ϕ̂ ∈ Ω(Ã).
(iii) Prove Ω(Ã) = {ϕ̂ | ϕ ∈ Ω(A)} ∪ {ϕ_∞}.
(iv) If A is non-unital and a ∈ A, define σ(a) as σ_Ã(a). For A commutative non-unital over C
and a ∈ A, prove
σ(a) = {ϕ(a) | ϕ ∈ Ω(A)} ∪ {0}.
15.11 Exercise Consider the commutative unital Banach algebra A = ℓ¹(Z, C) with convolu-
tion product ⋆ and unit 1 = δ_0. Prove:
(i) For every ϕ ∈ Ω(A) we have ϕ(δ_1) ∈ S¹.
(ii) If ϕ_1, ϕ_2 ∈ Ω(A) satisfy ϕ_1(δ_1) = ϕ_2(δ_1) then ϕ_1 = ϕ_2.
(iii) For every z ∈ S¹ prove that ϕ_z(f) = Σ_{n∈Z} f(n)z^n defines an element of Ω(A).
(iv) The map S¹ → Ω(A), z ↦ ϕ_z is a bijection.
(v) For every f ∈ A we have
σ(f) = { Σ_{n∈Z} f(n)z^n | z ∈ S¹ }.
(vi) If f ∈ A is quasi-nilpotent then f = 0. Hint: f can be recovered from Σ_n f(n)z^n.
Note how difficult it would be to prove the result of (v) using only the definition of the spectrum!
But Fourier analysis provides an instructive perspective on the result, cf. Section 19.2.
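The formula in (v) can be illustrated numerically in the finite cyclic analogue C[Z_N] of ℓ¹(Z) (a simplification: there, convolution by f is a circulant matrix whose eigenvalues are exactly the character values Σ_n f(n)z^n at the N-th roots of unity). The coefficients of f below are an arbitrary choice:

```python
import numpy as np

f = {-1: 0.5, 0: 1.0, 1: 0.25}     # arbitrary coefficients with finite support
N = 8
C = np.zeros((N, N), dtype=complex)
for n, c in f.items():
    for j in range(N):
        C[(j + n) % N, j] = c       # C acts on C[Z_N] as convolution by f

roots = np.exp(2j * np.pi * np.arange(N) / N)
char_values = np.array([sum(c * z**n for n, c in f.items()) for z in roots])

eig = np.linalg.eigvals(C)
# Same multiset of values, up to ordering and round-off:
for w in char_values:
    assert np.min(np.abs(eig - w)) < 1e-9
for w in eig:
    assert np.min(np.abs(char_values - w)) < 1e-9
```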
15.12 Exercise Let f ∈ ℓ¹ = ℓ¹(Z, C). With f_m(n) = f(n − m), prove that span_C{f_m | m ∈ Z}
(the finite linear combinations of the translates f_m of f) is dense in ℓ¹(Z, C) if and only if
f̂(z) = Σ_{n∈Z} f(n)z^n vanishes for no z ∈ S¹.
Hint: Use the closed ideal I_f = (f ⋆ ℓ¹)⁻ ⊆ ℓ¹ (the norm closure) generated by f, and Exercise 15.11.
15.13 Remark The result of Exercise 15.12 is the simplest of a whole family of 'span of trans-
lates' results. These were initiated by a theorem of N. Wiener93: Given f ∈ L¹(R, λ) (λ is
Lebesgue measure), the linear span of the translates of f is dense in L¹(R, λ) if and only if the
Fourier transform f̂(ξ) = ∫ f(x)e^{−iξx} dx (which is a continuous function R → C) vanishes for
no ξ ∈ R. Already this is harder to prove. Cf. e.g. [141, Theorem 9.5] or [27, Chapter 2]. 2
93 Norbert Wiener (1894-1964). American mathematician with important contributions to harmonic and functional
analysis and many other fields. See Theorem 19.9 for a related result of his.
15.2 Baby version of holomorphic functional calculus
Functional calculus is concerned with defining f(a) when f is a (suitable) function and a is an
element of a Banach or C∗-algebra or B(H). So far, we have only considered the rather trivial
cases f_1 : x ↦ 1/x (for invertible elements of any Banach algebra) and f_2 : z ↦ z^{1/2} (for positive
Hilbert space operators). The next question is: determine σ(f(a)). Does it equal f(σ(a))? (For
f_1 it does by Exercise 13.30.) These are the basic questions addressed by the many different
'functional calculi' that there are: holomorphic, continuous, Borel, etc.
While functional calculus has many applications, our main one will be the proof of spectral
theorems for normal Hilbert space operators (not necessarily compact) in Section 18.
Defining f(a) poses no problem in the simplest case, which surely is f = P, a polynomial:
15.14 Definition If A is a unital algebra, a ∈ A and P(x) = c_n x^n + · · · + c_1 x + c_0 is a
polynomial, we put P(a) = c_n a^n + · · · + c_1 a + c_0 1.
15.15 Exercise (Polynomial functional calculus) Let A be a unital normed algebra
over C, a ∈ A, and P ∈ C[z] with n = deg P.
(i) Prove that the map C[x] → A, P ↦ P(a) is a homomorphism of unital C-algebras.
(ii) Prove σ(P(a)) = P(σ(a)) := {P(λ) | λ ∈ σ(a)}.
Hint: Consider the case n = 0 separately. For n ≥ 1 use a factorization P(z) − λ =
c_n Π_{k=1}^n (z − z_k) (which exists by algebraic completeness of C).
(iii) Why did we assume A to be normed?
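The spectral mapping statement (ii) can be sanity-checked in A = M_3(C), where the spectrum is the set of eigenvalues (the matrix and polynomial below are arbitrary choices, and this is an illustration, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))              # an arbitrary element of M_3(C)
P = np.polynomial.Polynomial([2, -3, 1])     # P(z) = 2 - 3z + z^2

PA = A @ A - 3 * A + 2 * np.eye(3)           # P(A) as in Definition 15.14
spec_PA = np.linalg.eigvals(PA)              # σ(P(A))
P_spec = P(np.linalg.eigvals(A))             # P(σ(A))

# σ(P(A)) = P(σ(A)) as multisets, up to round-off:
for w in P_spec:
    assert np.min(np.abs(spec_PA - w)) < 1e-6
```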
The result of Exercise 15.15 can be generalized in various directions, requiring different
proofs. The following suffices for our purposes:

15.16 Proposition Let A be a unital Banach algebra over C and let f (z) = ∞ n
P
n=0 cn z be a
power series with convergence radius R > 0. Then for all a ∈ A satisfying kak < R we have:
(i) The series f (a) = ∞ n
P
n=0 cn a in A converges absolutely.
(ii) σ(f (a)) = f (σ(a)). [Spectral mapping theorem]
P∞ n
P∞ n
Proof. (i) We Phaven n=0Pkcn a kn ≤ n=0 |cn | kak , which converges since kak < R and−1the
power series cn z and |cn |z have the same convergence radius R (as follows from R =
1/n
lim supn |cn | ). Now use Proposition 3.15(ii).
(ii) Let B = {a}′′ ⊆ A. This is a commutative unital Banach algebra with Inv B = B ∩ Inv A,
and f(a) ∈ B. Since every ϕ ∈ Ω(B) is continuous and a unital homomorphism, we have

ϕ(f(a)) = ϕ( lim_{N→∞} Σ_{n=0}^N c_n a^n ) = lim_{N→∞} ϕ( Σ_{n=0}^N c_n a^n )
= lim_{N→∞} Σ_{n=0}^N c_n ϕ(a)^n = Σ_{n=0}^∞ c_n ϕ(a)^n = f(ϕ(a)).

(Note that Σ_{n=0}^∞ c_n ϕ(a)^n converges absolutely since |ϕ(a)| ≤ r(a) ≤ ‖a‖ < R.)
Applying (15.1) to f(a) ∈ B, we have

σ_B(f(a)) = {ϕ(f(a)) | ϕ ∈ Ω(B)} = {f(ϕ(a)) | ϕ ∈ Ω(B)} = {f(λ) | λ ∈ σ_B(a)} = f(σ_B(a)).

Since Exercise 13.50(i) gives σ_B(b) = σ_A(b) for all b ∈ B, we have σ_A(f(a)) = σ_B(f(a)). 
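As a numerical illustration (numpy, with a matrix algebra standing in for A; the matrix is an arbitrary choice): for f(z) = 1/(1 − z) = Σ z^n with R = 1 and ‖A‖ < 1, the series converges to (1 − A)⁻¹, and the spectral mapping theorem can be checked on eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
A = 0.5 * A / np.linalg.norm(A, 2)     # rescale so that ||A|| = 1/2 < R = 1

# Partial sums of f(A) = Σ A^n for f(z) = 1/(1-z):
S, term = np.eye(4), np.eye(4)
for _ in range(100):
    term = term @ A
    S = S + term
assert np.allclose(S, np.linalg.inv(np.eye(4) - A))   # f(A) = (1-A)^{-1}

# Spectral mapping σ(f(A)) = f(σ(A)), with eigenvalues standing in for spectra:
spec_fA = np.linalg.eigvals(S)
f_spec = 1.0 / (1.0 - np.linalg.eigvals(A))
for w in f_spec:
    assert np.min(np.abs(spec_fA - w)) < 1e-8
```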
15.17 Exercise Let A be a unital Banach algebra and a ∈ A. Let Σ_{n=0}^∞ c_n z^n be a power
series with convergence radius R > 0. Prove that for the absolute convergence of Σ_{n=0}^∞ c_n a^n it
suffices that r(a) < R.
For P ∈ C[x], continuity of the map a ↦ P(a) is evident in every topological algebra. The
analogous result for a power series requires more work:

15.18 Exercise Let A be a unital Banach algebra.


(i) Prove kan − bn k ≤ nka − bk(max(kak, kbk))n−1 for all a, b ∈ A, n ∈ N. Hint: telescope.
(ii) Let f (z) = ∞ n
P
n=0 cn z have convergence radius R > 0. Prove that the map a 7→ f (a) is
uniformly continuous on B A (0, r) for each r < R and continuous on BA (0, R).
If one defines H_R to be the set of functions defined by power series with convergence radius
≥ R then H_R is easily checked to be a commutative algebra (which coincides with the algebra
of functions holomorphic on B(0, R)). Now for every a ∈ A with ‖a‖ < R (or just r(a) < R)
one has a unital homomorphism H_R → A, f ↦ f(a). This can be generalized considerably,
leading to the fully fledged holomorphic functional calculus, cf. e.g. [79, 94, 152], which we
do not discuss since it is insufficient for our purposes: We want to make sense of f(A) when
A ∈ B(H) is normal and f : σ(A) → C is just continuous. This will be done in Section 17.

15.19 Example (Exponential function) Of course every power series f (z) = ∞ n


P
n=0 cn z
with
P∞ infinite convergence radius can be ‘applied’ to every a ∈ A. For example exp(a) = ea =
an a
n=0 n! converges for every a ∈ A. By the spectral mapping theorem we have σ(e ) =
exp(σ(a)).
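A quick numerical check of σ(e^a) = exp(σ(a)) in M_3(C) (the matrix is an arbitrary choice, and eigenvalues stand in for the spectrum), with e^a computed from its series:

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal((3, 3))

# e^a via the everywhere-convergent power series Σ a^n / n!
E, term = np.eye(3), np.eye(3)
for n in range(1, 60):
    term = term @ a / n         # term is now a^n / n!
    E = E + term

# Spectral mapping: σ(e^a) = exp(σ(a)), up to round-off
spec_E = np.linalg.eigvals(E)
exp_spec = np.exp(np.linalg.eigvals(a))
for w in exp_spec:
    assert np.min(np.abs(spec_E - w)) < 1e-6
```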
15.20 Exercise Let A be a unital Banach algebra. Prove exp(a) = lim_{n→∞} (1 + a/n)^n for all
a ∈ A.
15.21 Exercise Let A be a unital Banach algebra over F ∈ {R, C} and a ∈ A.
(i) Give an elementary proof of {e^λ | λ ∈ σ(a)} ⊆ σ(e^a) (i.e. without using the spectral
mapping theorem).
(ii) If F = C and ‖e^{ita}‖ = 1 for all t ∈ R, prove σ(a) ⊆ R.
(iii) Show by example that σ(a) ⊆ R does not imply ‖e^{ita}‖ = 1 ∀t ∈ R.
15.22 Exercise (One parameter groups I) Let A be a Banach algebra with unit 1 over
R or C. Let exp : A → A be defined as above. Prove:
(i) ‖e^a − 1‖ ≤ e^{‖a‖} − 1 ∀a ∈ A.
(ii) If ab = ba then e^{a+b} = e^a e^b = e^b e^a.
(iii) The map R → A, t ↦ W(t) = e^{ta} is a one-parameter group (thus W(0) = 1 and W(s + t) =
W(s)W(t) ∀s, t ∈ R).
(iv) t ↦ e^{ta} is norm-continuous. Do this using (i) and (iii), not Exercise 15.18.
(v) lim_{t→0} (e^{ta} − 1)/t = a.
(vi) If a, b ∈ A and e^{ta} e^{sb} = e^{sb} e^{ta} for all s, t ∈ R then ab = ba.
15.23 Remark One can find 2 × 2 matrices a, b such that e^{a+b} = e^a e^b = e^b e^a but ab ≠ ba, as
well as non-commuting 2 × 2 matrices a, b such that e^{a+b} = e^a e^b ≠ e^b e^a. There is an extensive
literature on this phenomenon. We mention one interesting result [150]: If A is a unital Banach
algebra and a, b ∈ A are such that e^a e^b = e^b e^a and σ(a) and σ(b) are 2πi-congruence free, then ab = ba.
Here Ω ⊆ C is called 2πi-congruence free if λ, λ′ ∈ Ω, λ − λ′ ∈ 2πiZ implies λ = λ′. (In
particular e^a e^b = e^b e^a does imply ab = ba if σ(a), σ(b) ⊂ R.) 2
15.24 Exercise (One parameter groups II) Let A be a unital Banach algebra.
(i) Local inverse for exp.
(a) The logarithm function (more precisely the branch for which z > 0 ⇒ log z ∈ R) has
a unique power series expansion g(z) = Σ_{n=1}^∞ c_n(z − 1)^n around z = 1. Prove that it
has convergence radius one.
(b) For a ∈ A with ‖a − 1‖ < 1 we can define log(a) ∈ A using the power series g. Prove:
If a, b ∈ A commute and ‖a − 1‖ < 1, ‖b − 1‖ < 1 then log a and log b commute.
(c) Prove: If ‖a‖ < log 2 then ‖e^a − 1‖ < 1 and log(e^a) = a. And if ‖b − 1‖ < 1 then
exp(log b) = b.
(d) Let 0 ∈ U = B(0, (log 2)/2) ⊂ A and V = exp(U) ∋ 1. Prove that exp : U → V is a
homeomorphism with inverse log : V → U.
(e) Prove that if a, b ∈ V commute and ab ∈ V then log(ab) = log a + log b.
(ii) Let V be a Banach space, ε > 0 and f : (−ε, ε) → V continuous and satisfying f(0) = 0
and f(s + t) = f(s) + f(t) whenever s, t, s + t ∈ (−ε, ε). Prove that there exists a unique
x ∈ V such that f(t) = tx ∀t ∈ (−ε, ε).
(iii) Now let R → A, t ↦ W(t) be a norm-continuous one parameter group. Use the above
results to prove that there is a unique a ∈ A such that W(t) = e^{ta} for all t ∈ R.
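The local inversion in (i) can be illustrated numerically: for a matrix a with ‖a‖ = (log 2)/2 one can compute e^a by the exponential series and then recover a via the logarithm series g(z) = Σ_{n≥1} (−1)^{n+1}(z − 1)^n/n. (numpy; the matrix is an arbitrary choice.)

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.standard_normal((3, 3))
a = 0.5 * np.log(2) * a / np.linalg.norm(a, 2)   # ||a|| = (log 2)/2, so a lies in U

# e^a by the exponential series
E, term = np.eye(3), np.eye(3)
for n in range(1, 60):
    term = term @ a / n
    E = E + term

# log(e^a) by g(z) = Σ (-1)^{n+1} (z-1)^n / n, valid since ||e^a - 1|| < 1
X = E - np.eye(3)
L, P = np.zeros((3, 3)), np.eye(3)
for n in range(1, 400):
    P = P @ X
    L = L + ((-1) ** (n + 1) / n) * P

assert np.allclose(L, a, atol=1e-8)              # log(e^a) = a on U
```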
15.25 Remark For most applications of one-parameter groups, norm-continuity is too strong
a requirement. To get further one considers one-parameter groups (or semigroups, defined only
for t ≥ 0) in B(V) that are only strongly continuous, i.e. lim_{t→0} W(t)x = x ∀x ∈ V. A typical
result then is Stone's theorem, according to which the unitary one-parameter groups on Hilbert
spaces are of the form W(t) = e^{itA}, where A is a possibly unbounded self-adjoint operator, cf.
e.g. [128]. The subject of operator semigroups is huge, cf. [50] for an introduction. 2
15.26 Exercise Let A be a commutative unital Banach algebra and a_1, . . . , a_n ∈ A.
(i) Define the joint spectrum

σ(a_1, . . . , a_n) = {(ϕ(a_1), . . . , ϕ(a_n)) | ϕ ∈ Ω(A)} ⊆ C^n.

Prove σ(a_1, . . . , a_n) ⊆ σ(a_1) × · · · × σ(a_n).
(ii) Prove (λ_1, . . . , λ_n) ∉ σ(a_1, . . . , a_n) ⇔ ∃b_1, . . . , b_n ∈ A : Σ_{i=1}^n b_i(a_i − λ_i 1) = 1.
Hint: For λ ∈ C^n, use the ideal I_λ = Σ_{i=1}^n A(a_i − λ_i 1) generated by the a_i − λ_i 1.
(iii) Let R_1, . . . , R_n > 0 and assume that Σ_{j∈N_0^n} c_{j_1,...,j_n} z_1^{j_1} · · · z_n^{j_n} converges whenever |z_i| <
R_i ∀i, defining an analytic function f on this domain. Assuming ‖a_i‖ < R_i ∀i, define
f(a_1, . . . , a_n) ∈ A in analogy to the case n = 1 above. Prove

σ(f(a_1, . . . , a_n)) = f(σ(a_1, . . . , a_n)) = {f(λ_1, . . . , λ_n) | (λ_1, . . . , λ_n) ∈ σ(a_1, . . . , a_n)}.
15.27 Remark 1. Example: σ(a + b) = {α + β | (α, β) ∈ σ(a, b)}, improving on Exercise 15.8.
2. In more general situations, like non-abelian algebras and non-commuting operators, there
exist various definitions of the joint spectrum, not necessarily equivalent. 2
16 Basics of C∗-algebras
16.1 Involutions. Definition of C∗-algebras
The properties of the adjoint map A ↦ A∗ on B(H) motivate some definitions:

16.1 Definition Let A be a C-algebra. A map ∗ : A → A satisfying antilinearity, antimulti-


plicativity and involutivity, i.e. (i)-(iii) in Lemma 11.8, is called an involution or ∗-operation.
An algebra with a chosen ∗-operation is called a ∗-algebra. A ∗-homomorphism α : A → B of
∗-algebras is a homomorphism satisfying α(a∗ ) = α(a)∗ ∀a ∈ A.

16.2 Lemma Let A be a unital ∗-algebra. Then


(i) 1∗ = 1.
(ii) If a ∈ A is invertible then a∗ is invertible and (a∗ )−1 = (a−1 )∗ .
(iii) σ(a∗ ) = σ(a)∗ := {λ | λ ∈ σ(a)}.
Proof. (i) 1∗ = 11∗ = 1∗∗ 1∗ = (11∗ )∗ = (1∗ )∗ = 1. The proofs of (ii) and (iii) are identical to
those given earlier for Hilbert space operators. 
16.3 Definition If A is a Banach algebra and ∗ : A → A an involution then A is called a
• Banach ∗-algebra if ‖a∗‖ = ‖a‖ ∀a ∈ A.
• C∗-algebra if ‖a∗a‖ = ‖a‖² ∀a ∈ A.94
16.4 Lemma Every C∗-algebra is a Banach ∗-algebra. If it has a unit 1 then ‖1‖ = 1.
Proof. With the C∗-identity and submultiplicativity we have ‖a‖² = ‖a∗a‖ ≤ ‖a∗‖‖a‖, thus
‖a‖ ≤ ‖a∗‖ for all a ∈ A. Replacing a by a∗ herein gives the converse inequality, thus ‖a∗‖ = ‖a‖.
If 1 is a unit then ‖1‖² = ‖1∗1‖ = ‖1∗‖ = ‖1‖, and since ‖1‖ ≠ 0 this implies ‖1‖ = 1. 
16.5 Remark 1. Clearly B(H) is a C∗-algebra for each Hilbert space H. Since this holds also
for real Hilbert spaces, it shows that one can discuss Banach ∗-algebras and C∗-algebras over
R. But we will consider only complex ones.
2. There is no special name for the non-complete variants of the above definitions. But a
submultiplicative norm on a ∗-algebra satisfying the C∗-identity is called a C∗-norm, whether
A is complete w.r.t. it or not. Completion of a ∗-algebra w.r.t. a C∗-norm gives a C∗-algebra,
and this is an important way of constructing new C∗-algebras. 2
It is easy to find Banach ∗-algebras that are not C∗-algebras:
16.6 Exercise For n ∈ N, consider A = M_{n×n}(C) with the usual ∗-algebra structure. Prove
that ‖a‖ = (Σ_{i,j=1}^n |a_{i,j}|²)^{1/2} defines a norm that satisfies submultiplicativity and ‖a∗‖ = ‖a‖.
Thus (A, ‖·‖) is a Banach ∗-algebra. Prove that it is not a C∗-algebra when n ≥ 2.
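For n = 2 even the identity matrix witnesses the failure of the C∗-identity for this norm, since ‖1∗1‖ = √2 while ‖1‖² = 2. A quick numerical check (numpy; the matrix b is an arbitrary choice):

```python
import numpy as np

fro = lambda m: np.sqrt((np.abs(m) ** 2).sum())   # the norm of Exercise 16.6

a = np.eye(2)
lhs = fro(a.conj().T @ a)      # ||1* 1|| = sqrt(2)
rhs = fro(a) ** 2              # ||1||^2  = 2
assert not np.isclose(lhs, rhs)   # the C*-identity fails

# The submultiplicativity and ||a*|| = ||a|| properties do hold, e.g.:
b = np.array([[1.0, 2.0], [3.0, 4.0]])
assert fro(a @ b) <= fro(a) * fro(b) + 1e-12
assert np.isclose(fro(b.conj().T), fro(b))
```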
94 The original definition by Gelfand and Naimark (1942) had the additional axiom that a∗a + 1 be invertible for
each a. This turned out to be redundant, cf. Proposition 17.6.
16.7 Exercise Recall the Banach algebra A = ℓ¹(Z, C) from Example 13.48. Show that both
f∗(n) = f̄(n) and f∗(n) = f̄(−n), where f̄ denotes the pointwise complex conjugate, define
involutions on A making it a Banach ∗-algebra. Show that neither of them satisfies the C∗-identity.
16.8 Exercise Equip the Banach algebra from Exercise 13.51 with the ∗-operation f∗(x) =
f̄(x). Is it a C∗-algebra?
16.9 Lemma Let X be a compact space. For f ∈ C(X, C), define f∗ by f∗(x) = f̄(x). Then
C(X, C) is a C∗-algebra. The same holds for C_b(X, C), where X is arbitrary, thus also for
ℓ∞(S, C).
Proof. We know that C(X, C) equipped with the norm ‖f‖ = sup_x |f(x)| is a Banach algebra.
It is immediate that ∗ is an involution. The computation

‖f∗f‖ = sup_x |f̄(x)f(x)| = sup_x |f(x)|² = (sup_x |f(x)|)² = ‖f‖²

proves the C∗-identity. It is clear that this generalizes to the bounded continuous functions on
any space X. 
In a sense, the examples B(H) and C(X, C) for compact X are all there is: One can prove,
as we will do in Theorem 19.12, that every commutative unital C∗-algebra is isometrically
∗-isomorphic to C(X, C) for some compact Hausdorff space X, determined uniquely up to
homeomorphism. (For example one has ℓ∞(S, C) ≅ C(βS, C), where βS is the Stone-Čech
compactification of (S, τ_disc).) And one can prove that every C∗-algebra is isometrically ∗-
isomorphic to a norm-closed ∗-subalgebra of B(H) for some Hilbert space H. See e.g. [110].
16.10 Exercise If A is a C∗-algebra and Ã its unitization (Exercise 13.55), the norm ‖(a, α)‖₁ =
‖a‖ + |α| on Ã usually fails to be a C∗-norm. Define ‖(a, α)‖ = sup_{b∈A, ‖b‖≤1} ‖ab + αb‖. Prove:
(i) ‖·‖ is an algebra norm on Ã if and only if A is non-unital, which is assumed from now
on.
(ii) It satisfies ‖(a, α)‖ ≤ ‖(a, α)‖₁ and ‖(a, 0)‖ = ‖a‖ ∀a ∈ A, thus ι : A ↪ Ã is an isometry.
(iii) ‖·‖ is a C∗-norm.
(iv) (Ã, ‖·‖) is complete and the norms ‖·‖₁, ‖·‖ on Ã are equivalent.
16.2 Some classes of elements in a C∗-algebra and their spectra
Inspired by B(H) we define:

16.11 Definition Let A be a C-algebra with an involution ∗. Then a ∈ A is called


• self-adjoint if a = a∗ . We put Asa = {a ∈ A | a = a∗ }.
• normal if aa∗ = a∗ a. (I.e. a and a∗ ‘commute’.)
• unitary if aa∗ = a∗ a = 1. (Obviously A needs to be unital.)
• orthogonal projection if a2 = a = a∗ .
A subset S of a ∗-algebra is called self-adjoint if S = S ∗ := {s∗ | s ∈ S}.
16.12 Exercise Let A be a Banach ∗-algebra and I ⊆ A a closed self-adjoint two-sided ideal.
Prove:

(i) A/I has a natural ∗-operation so that p : A → A/I is a ∗-homomorphism.
(ii) With this ∗-operation and the quotient norm, A/I is a Banach ∗-algebra.
(iii) If α : A → B is a ∗-homomorphism such that I ⊆ ker α then the induced map α′ : A/I →
B, cf. Proposition 6.1(v), is a ∗-homomorphism.
16.13 Remark If A is a C∗-algebra one can prove that every closed two-sided ideal I ⊆ A
automatically is self-adjoint and that A/I is a C∗-algebra. But the proofs would lead us too
far, cf. e.g. [110, Theorems 3.1.3, 3.1.4]. 2
The self-adjoint, respectively unitary, elements of the C∗-algebra A = C are the real num-
bers and the phases (|z| = 1). Thus self-adjoint and unitary elements of a C∗-algebra should
be thought of as generalized real numbers and phases, respectively. Also the real-imaginary
decomposition generalizes:
16.14 Lemma If A is a ∗-algebra and a ∈ A, we define Re(a) = (a + a∗)/2, Im(a) = (a − a∗)/2i. Now
(i) Re(a), Im(a) are self-adjoint and a = Re(a) + i Im(a).
(ii) The representation a = b + ic with b, c self-adjoint is unique for each a.
(iii) a is self-adjoint if and only if Im(a) = 0.
(iv) a is normal if and only if Re(a) and Im(a) commute.
Proof. Mostly trivial computations. We only prove (ii): If b, b′, c, c′ are self-adjoint and b + ic =
b′ + ic′ then b − b′ = i(c′ − c). This implies b − b′ = (b − b′)∗ = −i(c′ − c) = −(b − b′). Thus
b = b′ and in turn c = c′. 
16.15 Exercise Prove that a ∗-algebra is commutative if and only if every element is normal.
There are several reasons why normal elements are important. An element a ∈ A is normal
if and only if there is a ∗-closed commutative subalgebra B ⊆ A containing a. We will see that
normal elements behave like functions on a (locally) compact space.
While ab = ba clearly implies a∗ b∗ = b∗ a∗ , it need not follow that a∗ commutes with b (or
equivalently a with b∗ ). To see this just pick any non-normal a ∈ A and take b = a. But:
16.16 Theorem (Fuglede 1950) Let A be a unital C∗-algebra, and let a, b be commuting
elements at least one of which is normal. Then a∗b = ba∗ (and ab∗ = b∗a).
The theorem is quite remarkable, and asked for a proof one probably wouldn't know where to
begin. For matrices it actually is quite easy: Normality of a implies that a is diagonalizable, i.e.
a = Σ_i λ_i P_i, where the λ_i are the (distinct) eigenvalues and the P_i orthogonal projections onto
the corresponding eigenspaces, see e.g. [55, Theorem 6.16]. Now ab = ba implies P_i b = bP_i ∀i.
Taking adjoints gives P_i b∗ = b∗P_i ∀i, whence ab∗ = b∗a. With effort, this argument can be
extended to operators on infinite-dimensional Hilbert spaces, cf. [160]. But there is a much
more elegant argument that works in all C∗-algebras, for which we refer to Section B.13.1.
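Fuglede's theorem is easy to check numerically for matrices. In the sketch below (numpy; all choices arbitrary) a is a normal matrix and b a polynomial in a, which is a simple way of arranging ab = ba; the assertion a∗b = ba∗ is then confirmed:

```python
import numpy as np

rng = np.random.default_rng(4)
# Build a normal matrix a = Q D Q* with Q unitary and D diagonal:
D = np.diag(rng.standard_normal(3) + 1j * rng.standard_normal(3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
a = Q @ D @ Q.conj().T
assert np.allclose(a @ a.conj().T, a.conj().T @ a)   # a is normal

b = a @ a + 2 * a + np.eye(3)    # a polynomial in a, hence ab = ba
assert np.allclose(a @ b, b @ a)

# Fuglede: a* then commutes with b as well
assert np.allclose(a.conj().T @ b, b @ a.conj().T)
```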
16.17 Proposition Let A be a unital C∗-algebra. Then
(i) If a ∈ A is normal then r(a) = ‖a‖.
(ii) If u ∈ A is unitary then ‖u‖ = 1 and σ(u) ⊆ S¹.
(iii) If a ∈ A is self-adjoint then σ(a) ⊆ R.
Proof. (i) The proof of Proposition 13.69(i) works identically in every abstract C∗-algebra.
(ii) By unitarity of u we have ‖u‖² = ‖u∗u‖ = ‖1‖ = 1, thus ‖u‖ = 1 = ‖u∗‖. With u⁻¹ = u∗
we also have ‖u⁻¹‖ = 1, and Exercise 13.30(ii) gives σ(u) ⊆ S¹.
(iii) Given λ ∈ σ(a), write λ = α + iβ with α, β ∈ R. Applying σ(a + z1) = σ(a) + z ∀z ∈ C
(why?) to z = −α + inβ, where n ∈ N, we have iβ(n + 1) = α + iβ − α + inβ ∈ σ(a − α1 + inβ1).
Thus with r(c) ≤ ‖c‖ (Proposition 13.27), the C∗-identity and ‖1‖ = 1 we have

(n² + 2n + 1)β² = |iβ(n + 1)|² ≤ r(a − α1 + inβ1)² ≤ ‖a − α1 + inβ1‖²
= ‖(a − α1 − inβ1)(a − α1 + inβ1)‖ = ‖(a − α1)² + n²β²1‖ ≤ ‖a − α1‖² + n²β².

This simplifies to (2n + 1)β² ≤ ‖a − α1‖² ∀n ∈ N, which implies β = 0, thus λ ∈ R. 
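Part (i) can be illustrated with matrices (numpy; eigenvalues stand in for the spectrum and the operator norm for ‖·‖, the matrices being arbitrary choices): a normal matrix has r(a) = ‖a‖, while a non-normal one can have r(a) strictly smaller:

```python
import numpy as np

spec_radius = lambda m: np.abs(np.linalg.eigvals(m)).max()
op_norm = lambda m: np.linalg.norm(m, 2)       # operator norm on B(C^k)

a = np.diag([1.0, -2.0, 0.5]) + 0j             # normal (even self-adjoint)
assert np.isclose(spec_radius(a), op_norm(a))  # r(a) = ||a|| = 2

n = np.array([[0.0, 1.0], [0.0, 0.0]])         # non-normal, nilpotent
assert spec_radius(n) < 1e-12 < op_norm(n)     # r(n) = 0 < 1 = ||n||
```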
16.18 Remark 1. Since (i) implies ‖a‖ = ‖a∗a‖^{1/2} = r(a∗a)^{1/2} for all a ∈ A and the spectral
radius r(a) by definition depends only on the algebraic structure of A, the latter also determines
the norm, which therefore is unique in a C∗-algebra! But note that the conclusion ‖·‖₁ = ‖·‖₂
for C∗-norms ‖·‖₁, ‖·‖₂ only follows if A is complete with respect to both norms!
2. The proof of (iii) is short and uses only r(c) ≤ ‖c‖ and the C∗-identity. A less direct,
but perhaps more insightful argument uses the exponential function, cf. Example 15.19: If
a = a∗ ∈ A then u_t = e^{ita} with t ∈ R satisfies u_t∗ = e^{−ita}, thus u_t u_t∗ = u_t∗ u_t = 1, so that u_t is
unitary. Now (ii) gives σ(u_t) ⊆ S¹, and σ(a) ⊆ R follows from the spectral mapping theorem
or Exercise 15.21.
For another application of the unitarity of e^{ia} for self-adjoint a see Section B.13.1. 2
The applications of Proposition 16.17(i) discussed in Section 13.4 (with the exception of the
result on |||A|||) hold in all abstract C∗-algebras. Proposition 16.17(iii) can be used to improve
on the results of Section 13.2, showing that C∗-algebras are better behaved than general Banach
algebras:
16.19 Theorem Let A be a unital C∗-algebra and B ⊆ A a C∗-subalgebra (= closed self-adjoint
subalgebra) containing 1. Then Inv B = B ∩ Inv A and σ_B(b) = σ_A(b) for all b ∈ B.
Proof. Since the inclusion Inv B ⊆ B ∩ Inv A is clear, we need to prove for every b ∈ B ∩ Inv A that
b⁻¹ ∈ B. Suppose first that b = b∗ ∈ B ∩ Inv A. Proposition 16.17(iii) then gives that b − it1 is
invertible in B for all t ∈ R\{0} and invertible in A for all t ∈ R. Lemma 13.17 implies that the
function f : R → A, t ↦ (b − it1)⁻¹ is continuous. For t ∈ R\{0}, b − it1 is invertible in B, so
that uniqueness of inverses gives f(t) ∈ B for all t ≠ 0. Now continuity of f and closedness of
B ⊆ A imply f(0) ∈ B. Since f(0) = b⁻¹, we have b⁻¹ ∈ B, thus b ∈ Inv B.
Let now b ∈ B ∩ Inv A, so that b has an inverse a ∈ A. By Lemma 16.2, also b∗ is invertible
in A, thus the same holds for bb∗. Since bb∗ ∈ B is self-adjoint, it has an inverse c ∈ B by the
first half of the proof; in particular bb∗c = 1. Combining with ab = 1 we have a = abb∗c = b∗c.
In view of b∗, c ∈ B we have b⁻¹ = a ∈ B. This finishes the proof of Inv B = B ∩ Inv A. The rest
follows by Lemma 13.47(ii). 
16.3 Positive elements of a C∗-algebra I
16.20 Definition If A is a unital C∗-algebra then a ∈ A is called positive, or a ≥ 0, if a = a∗
and σ(a) ⊆ [0, ∞). The set of positive elements of A is denoted A₊.
If X is a compact Hausdorff space and f ∈ C(X, C) = A then σ_A(f) = f(X), thus f ≥ 0 is
equivalent to f(x) ≥ 0 ∀x ∈ X.
16.21 Exercise Give an example of a unital C ∗ -algebra A and a ∈ A showing that σ(a) ⊆
[0, ∞) does not imply a = a∗ !
16.22 Exercise Let A be a unital C∗-algebra. Without using results proven later, prove:
(i) If a ∈ A_sa then a² ≥ 0.
(ii) If a, b ∈ A are positive and a + b = 0 then a = b = 0.
(iii) If a, b ∈ A are positive and ab = ba then a + b is positive.
(iv) If c ∈ A is normal then c∗c is positive. Hint: Lemma 16.14.
16.23 Definition If A is a C∗-algebra and a, b ∈ A_sa satisfy b − a ≥ 0 we write a ≤ b and,
equivalently, b ≥ a.
16.24 Exercise Let A be a unital C∗-algebra. Prove:
(i) The binary relation ≤ on A_sa is reflexive and anti-symmetric (i.e. a ≤ b ≤ a ⇒ a = b).
(ii) If a ∈ A_sa then −‖a‖1 ≤ a ≤ ‖a‖1.
(iii) If a ∈ A_sa satisfies −c1 ≤ a ≤ c1 with c ≥ 0 then ‖a‖ ≤ c.
(iv) If a, b ∈ A are positive with ‖a‖ ≤ 1, ‖b‖ ≤ 1 then ‖a − b‖ ≤ 1.
(v) a ∈ A is positive if and only if a = a∗ and there is a t ≥ 0 such that ‖a − t1‖ ≤ t.
(vi) If a, b ∈ A are positive then a + b is positive. (No assumption that ab = ba!)
(vii) ≤ is transitive, and (A_sa, ≤) is a partially ordered set.
17 Continuous functional calculus for C∗-algebras
17.1 Continuous functional calculus for self-adjoint elements
Our goal is to make sense of f(a), where a is a normal element of some arbitrary C∗-algebra
A, for all functions f ∈ C(σ(a), C), in such a way that f ↦ f(a) is a ∗-homomorphism. (If
you don't care for this generality, you may substitute A = B(H).) We will first do this for
self-adjoint elements and then generalize to normal ones. We cannot hope to go beyond this: If
f(z) = z²z̄ then it is not clear whether to define f(a) as a²a∗ or aa∗a or a∗a² when aa∗ ≠ a∗a.
[Yet 'quantization theory', motivated by quantum theory, tries to do it.] For normal a, this
problem does not arise.
17.1 Proposition Let A be a unital C∗-algebra, a ∈ A normal and P a polynomial. Then

‖P(a)‖ = sup_{λ∈σ(a)} |P(λ)| = ‖P↾σ(a)‖∞.

Proof. Normality of a implies that P(a) is normal. Thus

‖P(a)‖ = r(P(a)) = sup_{λ∈σ(P(a))} |λ| = sup_{λ∈σ(a)} |P(λ)|,

where the first equality is due to Proposition 16.17(i), the second is the definition of r and the
third comes from the spectral mapping theorem (Proposition 15.16(ii) or Exercise 15.15). 
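Proposition 17.1 can be sanity-checked in M_4(C) with a self-adjoint (hence normal) matrix, where σ(a) is the set of eigenvalues (the matrix and polynomial below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((4, 4))
a = (X + X.T) / 2                       # self-adjoint, hence normal
P = lambda z: z**3 - 2 * z + 1          # an arbitrary polynomial

Pa = a @ a @ a - 2 * a + np.eye(4)      # P(a) as in Definition 15.14
lhs = np.linalg.norm(Pa, 2)             # ||P(a)||, the operator norm
rhs = np.abs(P(np.linalg.eigvalsh(a))).max()   # sup over σ(a) of |P|
assert np.isclose(lhs, rhs)
```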
Even though we are after a result for all normal operators, we first consider self-adjoint
operators:
17.2 Theorem Let A be a unital C ∗ -algebra and a = a∗ ∈ A. Then there is a unique con-
tinuous ∗-homomorphism αa : C(σ(a), C) → A such that αa (P ) = P (a) for all polynomials.
(Usually we will write f (a) instead of αa (f ).) It satisfies
(i) αa is an isometry: kαa (f )k = supλ∈σ(a) |f (λ)|.
(ii) The image of αa is the smallest C ∗ -subalgebra B ⊆ A containing 1 and a. The map
αa : C(σ(a), C) → B is a ∗-isomorphism. f (a) is self-adjoint if and only if f is real-valued.
(iii) σ(αa (f )) = f (σ(a)) = {f (λ) | λ ∈ σ(a)}. (Spectral mapping theorem)
(iv) If f ∈ C(σ(a), R), g ∈ C(f (σ(a)), C) then αa (g◦f ) = ααa (f ) (g), or just g(f (a)) = (g◦f )(a).
(We require f to be real-valued in order for f (a) to be self-adjoint.)
Proof. (i) By Propositions 13.27 and 16.17(iii), we have σ(a) ⊆ [−kak, kak]. By the classical
Weierstrass approximation theorem, cf. Theorem A.32, for every continuous function
f : [c, d] → C and ε > 0 there is a polynomial P such that |f (x) − P (x)| ≤ ε for all x ∈
[c, d]. We cannot apply this directly since σ(a), while contained in an interval, need not be an
entire interval. But using Tietze’s Extension Theorem A.31, we can find (very non-uniquely)
a continuous function g : [−kak, kak] → C that coincides with f on σ(a). Now this g can
be approximated uniformly by polynomials thanks to Weierstrass’ theorem. (Alternatively,
apply the more abstract Stone-Weierstrass theorem directly to f .) In any case, the restriction
of the polynomials to σ(a) is dense in C(σ(a), C) w.r.t. k · k∞ . By Proposition 17.1, the map
C(σ(a), C) ⊇ C[x]|σ(a) → A, P ↦ P (a) is an isometry. Thus applying Lemma 3.12 we obtain
a unique isometry αa : C(σ(a), C) → A extending P ↦ P (a). Since C(σ(a), C) is complete, its
image under αa is closed, thus equal to the closure C ∗ (1, a) of {P (a) | P ∈ C[x]}. Thus (i) is
proven up to the claim that αa is a ∗-homomorphism. This is left as an exercise.
(ii) Since αa : C[x] → A is a ∗-homomorphism, B := αa (C(σ(a), C)) ⊆ A is a ∗-subalgebra.
And since αa is an isometry by (i) and (C(σ(a), C), k · k∞ ) is complete, B is closed, thus a C ∗ -
algebra. Since αa maps the constant-one function to 1 ∈ A and the inclusion map σ(a) ↪ C
to a, B contains 1, a. Conversely, the smallest C ∗ -subalgebra of A containing 1 and a clearly
is obtained by taking the norm-closure of the set {P (a) | P ∈ C[z]}, which is contained in
the image of αa . As the continuous extension of a ∗-homomorphism, αa : C(σ(a), C) → A
is a ∗-homomorphism. Finally f (a)∗ = αa (f )∗ = αa (f ∗ ). Since αa is injective, this equals
f (a) = αa (f ) if and only if f = f ∗ , which is equivalent to real-valuedness of f .
(iii) Let f ∈ C(σ(a), C). Then clearly αa (f ) ∈ B. Now
σA (αa (f )) = σB (αa (f )) = σC(σ(a),C) (f ) = f (σ(a)),
where the equalities come from Theorem 16.19, from the fact that αa : C(σ(a), C) → B is a
∗-isomorphism, and from Exercise 13.24, respectively.
(iv) If {Pn } is a sequence of polynomials converging to f uniformly on σ(a) and {Qn } is a
sequence of polynomials converging to g uniformly on σ(f (a)), then Qn ◦Pn converges uniformly
to g ◦ f , thus Qn (Pn (a)) = (Qn ◦ Pn )(a) converges to (g ◦ f )(a). On the other hand, {Qn (Pn (a))}
converges in norm to g(f (a)). □

17.3 Exercise Prove that αa is a ∗-homomorphism.
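In the finite-dimensional case A = Mn (C) = B(Cⁿ) the theorem is very concrete: a self-adjoint matrix a is unitarily diagonalizable and αa (f ) just applies f to the eigenvalues. A minimal numpy sketch (the matrix a and all names below are illustrative choices, not part of the notes):

```python
import numpy as np

# Finite-dimensional sketch of the continuous functional calculus
# (Theorem 17.2): for a self-adjoint matrix a, f(a) acts as f on the
# eigenvalues in an orthonormal eigenbasis.

def functional_calculus(a, f):
    """f(a) for a Hermitian matrix a and a function f on sigma(a)."""
    eigvals, u = np.linalg.eigh(a)          # a = u diag(eigvals) u*
    return u @ np.diag(f(eigvals)) @ u.conj().T

a = np.array([[2.0, 1.0], [1.0, 2.0]])      # self-adjoint, sigma(a) = {1, 3}
sqrt_a = functional_calculus(a, np.sqrt)

assert np.allclose(sqrt_a @ sqrt_a, a)      # (sqrt of a) squared gives back a
# Isometry of Theorem 17.2(i): ||f(a)|| = sup over sigma(a) of |f|
assert np.isclose(np.linalg.norm(sqrt_a, 2), np.sqrt(3))
```

The second assertion is the isometry of (i): the operator norm of f (a) equals the supremum of |f | over σ(a) = {1, 3}.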

17.4 Remark The isometric ∗-isomorphism αa : C(σ(a), C) → B = C ∗ (1, a) is a special case


of the Gelfand isomorphism for commutative unital C ∗ -algebras proven in Section 19 (which
will go in the opposite direction π : A → C(Ω(A), C)). A general commutative C ∗ -algebra A is
not generated by a single element a, so that we’ll need to find a substitute for σ(a). It shouldn’t
be surprising that this will be Ω(A) (with a suitable topology). 2

17.2 Positive elements of a C ∗ -algebra II. Absolute value
Using the functional calculus, we can continue the considerations begun in Section 16.3.

17.5 Exercise (i) Define f+ , f− : R → R by f+ (x) = max(x, 0), f− (x) = − min(x, 0).
Prove: 1. f+ f− = 0, 2. f± (x) = (|x| ± x)/2, 3. f± ∈ C(R, R).
(ii) Let now A be a unital C ∗ -algebra and a = a∗ ∈ A. Define a± ∈ A by functional calculus
as a± = f± (a). Prove: 1. a+ − a− = a and a+ + a− = |a|, 2. a+ a− = a− a+ = 0, 3.
a+ ≥ 0, a− ≥ 0.
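In finite dimensions the decomposition of part (ii) can be checked numerically by applying f± to the eigenvalues; a hedged numpy sketch (the matrix a is an arbitrary illustrative choice):

```python
import numpy as np

# Numeric check of Exercise 17.5(ii): for a self-adjoint matrix a, the
# positive/negative parts a± = f±(a) arise by applying f+(x) = max(x, 0)
# and f-(x) = -min(x, 0) to the eigenvalues.

def apply_to_eigenvalues(a, f):
    w, u = np.linalg.eigh(a)
    return u @ np.diag(f(w)) @ u.conj().T

a = np.array([[1.0, 2.0], [2.0, 1.0]])   # self-adjoint, sigma(a) = {-1, 3}
a_plus  = apply_to_eigenvalues(a, lambda x: np.maximum(x, 0.0))
a_minus = apply_to_eigenvalues(a, lambda x: -np.minimum(x, 0.0))

assert np.allclose(a_plus - a_minus, a)                  # a = a+ - a-
assert np.allclose(a_plus @ a_minus, np.zeros((2, 2)))   # a+ a- = 0
assert np.all(np.linalg.eigvalsh(a_plus) >= -1e-12)      # a+ >= 0
assert np.all(np.linalg.eigvalsh(a_minus) >= -1e-12)     # a- >= 0
```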

17.6 Proposition If A is a unital C ∗ -algebra and a ∈ A then a∗ a is positive.


Proof. First a preparatory argument: Assume c ∈ A is such that −c∗ c is positive. Then by
Lemma 13.25(ii) we have σ(−cc∗ )\{0} = σ(−c∗ c)\{0}, thus −cc∗ is positive. Writing c = a + ib
with a, b self-adjoint, we have c∗ c + cc∗ = (a − ib)(a + ib) + (a + ib)(a − ib) = 2a2 + 2b2 , thus
c∗ c = 2a2 + 2b2 − cc∗ . Using −cc∗ ≥ 0 just proven and Exercises 16.22(i) and 16.24(vi), this
implies c∗ c ≥ 0. Combining −c∗ c ≥ 0 and c∗ c ≥ 0 gives σ(c∗ c) ⊆ [0, ∞) ∩ (−∞, 0] = {0}. This
implies kck2 = kc∗ ck = r(c∗ c) = 0, thus c = 0.
We turn to the proof of the claim. Let a ∈ A be arbitrary. Then b = a∗ a is self-adjoint, thus
with Exercise 17.5(ii) we have b = b+ − b− with b± ≥ 0 and b+ b− = 0. Putting c = ab− we have
−c∗ c = −b− a∗ ab− = −b− (b+ − b− )b− = (b− )³, which is positive (spectral mapping theorem). Now
the preparatory step gives ab− = c = 0. This implies −(b− )² = (b+ − b− )b− = bb− = a∗ ab− = 0,
thus b− = 0. (Since d = d∗ , d² = 0 implies d = 0.) Now we have a∗ a = b = b+ ≥ 0. □

17.7 Exercise Let A be a unital C ∗ -algebra and a, b ∈ A with a ≥ 0. Use Proposition 17.6 to
prove that bab∗ ≥ 0. Conclude that a, c ∈ Asa , a ≤ c ⇒ bab∗ ≤ bcb∗ .

17.8 Proposition Let A be a unital C ∗ -algebra. If a ∈ A is positive then there is a positive
b ∈ A such that b² = a, unique in C ∗ (1, a) (and in A). We write b = √a.
Proof. In view of a ≥ 0 we have σ(a) ⊆ [0, ∞). Now continuity of the function [0, ∞) →
[0, ∞), x ↦ +√x allows us to define b = √a by the continuous functional calculus. It is immediate
by construction that b = b∗ , and the spectral mapping theorem gives σ(b) ⊆ [0, ∞), thus
b ≥ 0. Now b² = (a^{1/2})² = a since (√x)² = x. If c ∈ C ∗ (1, a) is positive and c² = a then c = b.
This follows from the ∗-isomorphism C ∗ (1, a) ∼ = C(σ(a), C) and the fact that positive square
roots are unique in the function algebra C(σ(a), C) (why?). The stronger result that positive
square roots are unique even in A can be proven as in Theorem 11.40 if we have replacements
of Exercises 11.39(iii) and (iv) valid for abstract C ∗ -algebras. These are provided by Exercises
17.7 and 16.22(ii), respectively. 

For elements of the C ∗ -algebra B(H), where H is a Hilbert space, we have two competing
definitions of positivity. Luckily there is no conflict:

17.9 Proposition If H is a complex Hilbert space and A ∈ B(H), the following are equivalent:
(i) C ∗ -algebraic positivity: A = A∗ and σ(A) ⊆ [0, +∞).
(ii) Operator positivity: hAx, xi ≥ 0 for all x ∈ H, equivalently W (A) ⊆ [0, +∞).
Proof. If A is C ∗ -positive then by Proposition 17.8 there is a B = B ∗ ∈ B(H) such that
A = B 2 = B ∗ B. Thus A is operator positive by Exercise 11.39(ii).
If A is operator positive then A = A∗ by Proposition 11.22, and using Exercise 13.15 we
have σ(A) ⊆ W (A) ⊆ [0, ∞). Thus A is C ∗ -positive. 


17.10 Exercise Let A be a unital C ∗ -algebra. Prove |a| = √(a²) for all a ∈ Asa .
With Proposition 17.6 we can define the ‘absolute value’ of all elements, not only the self-
adjoint ones:

17.11 Definition If A is a unital C ∗ -algebra and a ∈ A, we define |a| = (a∗ a)1/2 .


By construction, |a| is positive. And if a is positive then |a| = a. These properties are
similar to those of |z| for z ∈ C, but some care is required (compare Exercise 19.17):

17.12 Exercise (i) Let A be a unital C ∗ -algebra and a ∈ A. Prove that |a| = |a∗ | holds if
and only if a is normal.
(ii) Find counterexamples in A = M2×2 (C) disproving |a + b| ≤ |a| + |b| and |ab| ≤ |a| |b|.
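For (ii), one choice that disproves the triangle inequality can be verified numerically. (The matrices below are just one counterexample found by trial; the exercise asks you to produce your own.)

```python
import numpy as np
from numpy.linalg import eigvalsh

# Numeric check for Exercise 17.12(ii): in M_2(C) the "triangle inequality"
# |a+b| <= |a| + |b| fails for the matrices below.

def absval(a):
    """|a| = (a* a)^{1/2} via eigendecomposition of the positive matrix a* a."""
    w, u = np.linalg.eigh(a.conj().T @ a)
    w = np.clip(w, 0.0, None)               # kill tiny negative round-off
    return u @ np.diag(np.sqrt(w)) @ u.conj().T

a = np.array([[1.0, 1.0], [1.0, 1.0]])
b = np.array([[-1.0, 0.0], [0.0, 0.0]])

d = absval(a) + absval(b) - absval(a + b)   # would be >= 0 if the inequality held
assert eigvalsh(d).min() < -1e-6            # ...but d has a negative eigenvalue
```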
In Section 11.8 we have proven polar decomposition for the C ∗ -algebras A = B(H), but this
does not generalize to arbitrary C ∗ -algebras:

17.13 Exercise (i) If A is a unital C ∗ -algebra and a ∈ Inv(A), prove that |a| ∈ InvA. Use
this to define u ∈ A by a = u|a| and prove that u is unitary.
(ii) Give an example of a unital C ∗ -algebra A and a ∈ A such that there is no b ∈ A with
a = b|a|.
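Part (i) can be illustrated in M2 (C): for invertible a, |a| is invertible and u = a|a|⁻¹ is unitary, giving a = u|a|. A numpy sketch with an arbitrary illustrative invertible matrix:

```python
import numpy as np

# Sketch of Exercise 17.13(i) in M_2(C): polar decomposition of an
# invertible matrix a as a = u |a| with u unitary.

def absval(a):
    w, v = np.linalg.eigh(a.conj().T @ a)
    return v @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

a = np.array([[0.0, -2.0], [1.0, 1.0]])   # det = 2, so a is invertible
abs_a = absval(a)
u = a @ np.linalg.inv(abs_a)              # u = a |a|^{-1}

assert np.allclose(u.conj().T @ u, np.eye(2))   # u is unitary
assert np.allclose(u @ abs_a, a)                # a = u |a|
```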

17.14 Remark We have seen in Exercise 17.13 that in a unital C ∗ -algebra one does not always
have polar decomposition. However, there is a class of particularly nice C ∗ -subalgebras A ⊆
B(H), the von Neumann algebras, such that for each A ∈ A one has not only |A| ∈ A, but also
V ∈ A, where V is as in Proposition 11.44. Cf. e.g. [110, Theorem 4.1.10]. 2

In Remark 13.28 we showed r(a) ≤ inf b∈InvA kbab−1 k for every element of a unital Banach
algebra. For C ∗ -algebras, we can now prove this to be an equality:

17.15 Exercise Let A be a unital C ∗ -algebra. Let a ∈ A with r(a) < 1.
(i) Prove that c = Σ_{n=0}^∞ (a∗ )ⁿaⁿ converges and c ≥ 1.
(ii) Define b = c^{1/2} and prove b ≥ 1 and b ∈ Inv A.
(iii) Prove kbab⁻¹k < 1. Hint: Use the C ∗ -identity and b² = c.
(iv) Use (i)-(iii) to prove r(a) = inf_{b∈Inv A} kbab⁻¹k for all a ∈ A.
The results of (i)-(iii) can also be used to relate σ(A), where A ∈ B(H), to the sets
W (BAB −1 ), where B ∈ Inv B(H), cf. Appendix B.12.1.
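The equality r(a) = inf_{b} kbab⁻¹k can be seen concretely for a nilpotent 2×2 matrix, where conjugation by diagonal matrices shrinks the norm towards r(a) = 0 (the choices of a and b below are illustrative):

```python
import numpy as np

# Numeric illustration of Exercise 17.15(iv): for the nilpotent a (r(a) = 0),
# conjugating by b = diag(1, t) scales the off-diagonal entry by 1/t, so
# inf over invertible b of ||b a b^{-1}|| is 0 = r(a).

a = np.array([[0.0, 1.0], [0.0, 0.0]])      # spectral radius r(a) = 0
for t in [1.0, 10.0, 100.0]:
    b = np.diag([1.0, t])
    conj = b @ a @ np.linalg.inv(b)         # equals [[0, 1/t], [0, 0]]
    assert np.isclose(np.linalg.norm(conj, 2), 1.0 / t)
```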

17.3 Continuous functional calculus for normal elements


17.16 Theorem Theorem 17.2 literally extends to all normal elements of a unital C ∗ -algebra,
with R replaced by C.
The proof of Theorem 17.2 does not generalize immediately.95 The reason is that the
spectrum of a normal operator need not be contained in R. (In fact, for normal a we have
σ(a) ⊆ R ⇔ a = a∗ , cf. Exercise 17.20.) If that happens, the polynomials, restricted to σ(a),
95 “It is a well-known technical nuisance that the proof of the spectral theorem for normal operators involves some difficulties that do not arise in the Hermitian case”. [67]

fail to be uniformly dense in C(σ(a), C). (All functions that are uniform limits of polynomials
on sufficiently large subsets of C are holomorphic so that, e.g., f (z) = Re z cannot be approximated
by polynomials in z = x + iy.) But with σ(a) ⊆ C ≅ R² and considering functions on (a
subset of) C as functions of two real variables, the polynomials in x, y are dense in C(σ(a), C)
by the higher dimensional version of the classical Weierstrass theorem, cf. Theorem A.38. Thus
also the polynomials in z = x + iy and z̄ = x − iy are dense96 . If P = Σ_{i,j=0}^N cij zⁱz̄ʲ ∈ C[z, z̄]
we define P ∗ (z, z̄) = Σ_{i,j=0}^N c̄ij z̄ⁱzʲ . This turns C[z, z̄] into a ∗-algebra.
There is a unique unital homomorphism αa from C[z, z̄] to A sending z to a and z̄ to a∗ ,
and we need to adapt Proposition 17.1 to this setting. For this we need another lemma:

17.17 Lemma Let A be a unital C ∗ -algebra. Then every character ϕ ∈ Ω(A) satisfies ϕ(c∗ ) =
conj(ϕ(c)) for all c ∈ A, i.e. is a ∗-homomorphism.
Proof. We have c = a + ib, where a = Re(c), b = Im(c) are self-adjoint. Now σ(a) ⊆ R by
Proposition 16.17(iii), thus ϕ(a) ∈ σ(a) ⊆ R by Lemma 15.3. Similarly ϕ(b) ∈ R. Thus

ϕ(c∗ ) = ϕ(a − ib) = ϕ(a) − iϕ(b) = conj(ϕ(a) + iϕ(b)) = conj(ϕ(a + ib)) = conj(ϕ(c)),

where the third equality used that ϕ(a), ϕ(b) ∈ R as shown before. □

17.18 Proposition Let A be a unital C ∗ -algebra and a ∈ A normal. Then

(i) There is a unique homomorphism αa : C[z, z̄] → A sending z to a and z̄ to a∗ . It is a
∗-homomorphism.
(ii) For every P ∈ C[z, z̄] we have σ(P (a, a∗ )) = {P (λ, λ̄) | λ ∈ σ(a)} and

kP (a, a∗ )k = sup_{λ∈σ(a)} |P (λ, λ̄)|. (17.1)

Proof. (i) It is clear that we must define αa by Σ_{i,j=0}^N cij zⁱz̄ʲ ↦ Σ_{i,j=0}^N cij aⁱ(a∗ )ʲ. Using the
normality of a it is straightforward to see that P ↦ P (a, a∗ ) is a ∗-homomorphism.
(ii) Since a is normal, B = C ∗ (1, a) ⊆ A is commutative, so that Proposition 15.7 applies,
and using that the ϕ ∈ Ω(B) are ∗-homomorphisms by Lemma 17.17, we have

σB (P (a, a∗ )) = {ϕ(P (a, a∗ )) | ϕ ∈ Ω(B)} = {ϕ(Σ_{i,j=0}^N cij aⁱ(a∗ )ʲ) | ϕ ∈ Ω(B)}
            = {Σ_{i,j=0}^N cij ϕ(a)ⁱ conj(ϕ(a))ʲ | ϕ ∈ Ω(B)} = {P (λ, λ̄) | λ ∈ σ(a)}.

Appealing to Theorem 16.19, we get σA (P (a, a∗ )) = σB (P (a, a∗ )). Since P (a, a∗ ) is normal,
with Proposition 16.17(i) we have kP (a, a∗ )k = r(P (a, a∗ )) = sup_{λ∈σ(a)} |P (λ, λ̄)|. □

17.19 Remark The main difficulty in the construction of the continuous functional calculus for
normal operators is proving (17.1). (By contrast, Proposition 17.1 has an elementary proof since
it only uses the spectral mapping theorem for polynomials.) The above proof is efficient and
elegant, but it relies on Zorn’s lemma via Proposition 15.7. Avoiding this at the present level
of generality is possible, but quite cumbersome, cf. e.g. [155, Section 7.4], which makes massive
96 We allow ourselves the harmless sloppiness of not distinguishing between elements of the ring C[z, z̄] (where z, z̄ are independent variables) and the functions C → C, z ↦ f (z, z̄) induced by them.

use of holomorphic functional calculus including a version for several commuting operators.
For normal Hilbert space operators there are proofs [14, 174] (see also [113, Prop. 8.21]) that are
reasonably elementary, but tricky and ad hoc. 2

Proof of Theorem 17.16 Essentially as that of Theorem 17.2, now using the density of the
polynomials in z, z̄ in C(σ(a), C) as explained before and using Proposition 17.18 instead of
Proposition 17.1. □

17.20 Exercise Let A be a unital C ∗ -algebra and a ∈ A normal. Prove


(i) If σ(a) ⊆ R then a is self-adjoint.
(ii) If σ(a) ⊆ S 1 then a is unitary.

17.21 Exercise Let A, B be unital C ∗ -algebras, α : A → B a unital ∗-homomorphism, a ∈ A


normal and f ∈ C(σ(a), C). Prove that α(f (a)) = f (α(a)). Why is the r.h.s. even defined?

17.22 Exercise (i) Let A be a unital C ∗ -algebra and u ∈ A unitary (thus σ(u) ⊆ S 1 ). Prove
that if σ(u) 6= S 1 , there exists a ∈ Asa such that eia = u.
(ii) Give an example of a unital C ∗ -algebra A and a unitary u ∈ A such that there is no
a ∈ Asa with u = eia .
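Part (i) of Exercise 17.22 can be made concrete in M2 (C): if −1 ∉ σ(u), the principal branch of the argument works as the branch of the logarithm. A numpy sketch (the unitary u is an illustrative choice):

```python
import numpy as np

# Sketch of Exercise 17.22(i): for a unitary u whose spectrum misses a point
# of S^1 (here: -1), set a = -i log(u) via functional calculus, i.e. apply
# the scalar function to the eigenvalues. Then a = a* and e^{ia} = u.

u = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation; sigma(u) = {i, -i}
w, v = np.linalg.eig(u)                   # normal with distinct eigenvalues
theta = np.angle(w)                       # in (-pi, pi]; avoids the cut at -1
a = v @ np.diag(theta) @ v.conj().T       # the promised self-adjoint element

exp_ia = v @ np.diag(np.exp(1j * theta)) @ v.conj().T
assert np.allclose(a, a.conj().T)         # a is self-adjoint
assert np.allclose(exp_ia, u)             # e^{ia} = u
```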
The following is a preview of later developments in infinitely many dimensions:

17.23 Exercise Let H be a finite-dimensional Hilbert space and A ∈ B(H) normal. Prove:
(i) A = Σ_{i=1}^n λi Pi , where the Pi are the orthogonal projections onto the eigenspaces of A and
the λi are the associated eigenvalues.
(ii) For any function f : {λ1 , . . . , λn } → C, the f (A) provided by continuous functional calculus
coincides with f (A) = Σ_{i=1}^n f (λi )Pi .
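In coordinates, (i) and (ii) say that f (A) = Σ f (λi )Pi with Pi the projection onto the i-th eigenspace; a numpy sketch for a normal matrix with distinct eigenvalues (the matrix A is an illustrative choice):

```python
import numpy as np

# Exercise 17.23 numerically: spectral decomposition A = sum_i lambda_i P_i
# and f(A) = sum_i f(lambda_i) P_i for a normal matrix.

A = np.array([[0.0, 1.0], [1.0, 0.0]])      # normal (symmetric), sigma(A) = {1, -1}
w, v = np.linalg.eig(A)                     # distinct eigenvalues -> orthonormal v
P = [np.outer(v[:, k], v[:, k].conj()) for k in range(2)]

assert np.allclose(sum(wk * Pk for wk, Pk in zip(w, P)), A)   # A = sum lambda_i P_i
fA = sum(np.exp(wk) * Pk for wk, Pk in zip(w, P))             # f = exp on sigma(A)
# For this A, exp(A) = cosh(1)*1 + sinh(1)*A, which f(A) must reproduce:
assert np.allclose(fA, np.cosh(1.0) * np.eye(2) + np.sinh(1.0) * A)
```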
With the help of continuous functional calculus for normal operators, we can improve on the
results obtained in Exercises 13.66 and 13.68 (for Banach space operators):

17.24 Proposition Let H be a Hilbert space and A ∈ B(H) normal.


(i) If Σ ⊆ σ(A) is clopen with ∅ ≠ Σ ≠ σ(A) and P = χΣ (A) ∈ B(H) then with H1 =
P H, H2 = (1 − P )H we have AHi ⊆ Hi , i = 1, 2, thus A = A|H1 ⊕ A|H2 . The restrictions
A|H1 , A|H2 are normal, and σ(A|H1 ) = Σ, σ(A|H2 ) = σ(A)\Σ.
(ii) If λ ∈ σ(A) is isolated then ker(A − λ1) ≠ {0}, thus λ ∈ σp (A), and (A − λ1)↾ker(A − λ1)⊥
is invertible.
Proof. (i) The assumption Σ ≠ ∅ implies σ(χΣ (A)) = χΣ (σ(A)) ≠ {0}, thus P ≠ 0. Similarly,
Σ ≠ σ(A) implies P ≠ 1. Since P commutes with A, the subspaces H1 = P H, H2 = (1−P )H are
mapped into themselves by A and every f (A), where f ∈ C(σ(A), C). Normality of Ai = A|Hi
is clear. Now there are unital ∗-homomorphisms πi : C(σ(A), C) → B(Hi ) such that f (A) =
π1 (f ) ⊕ π2 (f ) for all f ∈ C(σ(A), C). Since Σ is clopen, the ∗-homomorphism C(σ(A), C) →
C(Σ, C) ⊕ C(σ(A)\Σ, C), f 7→ (f|Σ , f|σ(A)\Σ ) is an isomorphism. (The inverse sends (f1 , f2 ) to
fb1 +fb2 where fbi is the extension of fi to all of σ(A) that vanishes on the complement of the domain
of fi .) Now the composite C(Σ, C) ⊕ C(σ(A)\Σ, C) → C(σ(A), C) → B(H1 ) ⊕ B(H2 ) sends
(f1 , f2 ) to (π1 (f1 ), π2 (f2 )). If now z1 , z2 are the inclusion maps from Σ and σ(A)\Σ, respectively,

to C, we have A = A1 ⊕A2 = π1 (z1 )⊕π2 (z2 ). Thus σ(A1 ) = σ(π1 (z1 )) ⊆ σ(z1 ) = Σ. Analogously
σ(A2 ) ⊆ σ(A)\Σ. Now in view of σ(A) = σ(A1 ) ∪ σ(A2 ), we have σ(A1 ) = Σ, σ(A2 ) = σ(A)\Σ.
(ii) Since λ is isolated, Σ = {λ} ⊆ σ(A) is clopen. Applying (i) to Σ gives H1 , H2 , A1 , A2
with σ(A1 ) = {λ} and σ(A2 ) = σ(A)\{λ}. Now Exercise 13.71(ii) gives A1 = λ1, so that
H1 = ker(A − λ1). And (A2 − λ1) ∈ B(H2 ) is invertible. 

Now we are in a position to answer the question raised in Remark 11.28:

17.25 Corollary Let H be a Hilbert space and A ∈ B(H) normal. With H′ = (ker A)⊥ we
have AH′ ⊆ H′ , and the following are equivalent:
(i) A↾H′ ∈ B(H′ ) is invertible (⇔ surjective ⇔ bounded below).
(ii) 0 ∉ σ(A) or 0 ∈ σ(A) is isolated.
Proof. Recall from Proposition 11.27 that A maps H′ to itself.
(ii)⇒(i) If 0 ∉ σ(A) then A is invertible, thus H′ = H and A↾H′ is invertible. If 0 ∈ σ(A)
is isolated then Proposition 17.24(ii) gives a decomposition H = H1 ⊕ H2 , where H1 = ker A
and H2 = (ker A)⊥ = H′ , with A↾H2 = A↾H′ invertible.
(i)⇒(ii) If A is injective, the assumption of invertibility of A↾H′ becomes invertibility of A,
implying 0 ∉ σ(A). If A is not injective then invertibility of A↾H′ means 0 ∉ σ(A↾H′ ). Since
σ(A↾H′ ) ⊆ C is closed, there is an open neighborhood U ⊂ C of 0 such that U ∩ σ(A↾H′ ) = ∅.
On the other hand, the spectrum of A↾ker A obviously is {0}. Since σ(A) = {0} ∪ σ(A↾H′ ),
we have σ(A) ∩ U = {0}, so that 0 ∈ σ(A) is isolated. □

18 Spectral theorems for normal Hilbert space operators
18.1 Spectral theorem: Multiplication operator version
In this section we will prove several spectral theorems for normal Hilbert space operators, not
requiring compactness. We assume some nodding acquaintance with measure and integration
theory or willingness to learn the basics.
Let H be a Hilbert space, A ∈ B(H) normal and x ∈ H. Then the map

ϕA,x : C(σ(A), C) → C, f 7→ hf (A)x, xi

is a bounded linear functional on the Banach space (C(σ(A), C), k·k∞ ) since kf (A)k = kf k∞ . If
f is positive (i.e. takes values in [0, ∞)) then σ(f (A)) ⊆ [0, ∞) by the spectral mapping theorem,
so that f (A) ≥ 0 and hf (A)x, xi ≥ 0 by Proposition 17.9. Thus ϕA,x is a bounded positive
linear functional on C(σ(A), C). Now by the Riesz-Markov-Kakutani theorem, cf. Section A.7,
there is a unique finite positive measure µA,x on the Borel σ-algebra of σ(A) such that
∫ f dµA,x = ϕA,x (f ) = hf (A)x, xi ∀f ∈ C(σ(A), C). (18.1)

Taking f = 1 = const., we have f (A) = 1, so that µA,x (σ(A)) = ∫ 1 dµA,x = kxk² < ∞. Now we
have the Hilbert space L2 (σ(A), µA,x ), where we omit the Borel σ-algebra from the notation.

18.1 Definition Let H be a Hilbert space, A ∈ B(H) normal and x ∈ H. Then x is called
∗-cyclic for A if spanC {An (A∗ )m x | n, m ∈ N0 } is dense in H.

18.2 Remark A vector x is called cyclic for A if spanC {An x | n ∈ N0 } is dense in H. Clearly the
two notions are equivalent for self-adjoint A, but a normal operator can be ∗-cyclic but not cyclic
(e.g. the shift on ℓ2 (Z, F)). For the present purpose, ∗-cyclicity is the right notion. 2

18.3 Proposition Let H be a complex Hilbert space, A ∈ B(H) normal and x ∈ H ∗-cyclic
for A. Then there is a unitary U : H → L2 (σ(A), µA,x ) such that U AU ∗ = Mz , where
(Mz f )(z) = zf (z) for all f ∈ L2 (σ(A), µA,x ) and µA,x -almost all z ∈ σ(A).
Thus A is unitarily equivalent to a multiplication operator.
Proof. For all f ∈ C(σ(A), C) we have

kf (A)xk² = hf (A)x, f (A)xi = hf (A)∗ f (A)x, xi = h(f̄ f )(A)x, xi = ∫ |f |² dµA,x = kf k2²,

where the third equality comes from the ∗-homomorphism property of the functional calculus
and the fourth from (18.1). Thus if we equip C(σ(A), C) with the seminorm kf k2 =
(∫ |f |² dµA,x )^{1/2} , the map α : C(σ(A), C) → H, f ↦ f (A)x is isometric. With 𝓛2 (σ(A), µA,x ) =
{f : σ(A) → C | f measurable, kf k2 < ∞} we have C(σ(A), C) ⊆ 𝓛2 (σ(A), µA,x ) (since µA,x is
finite). Recall that L2 (σ(A), µA,x ) is defined as the quotient space of 𝓛2 (σ(A), µA,x ) w.r.t.
the equivalence relation ∼ defined by f ∼ g ⇔ kf − gk2 = 0 ⇔ f = g µA,x -a.e. Since
f ∼ g implies kf (A)x − g(A)xk = kf − gk2 = 0, the map α descends to an isometric map
α′ : C(σ(A), C)/∼ → H such that the triangle

    𝓛2 (σ(A), µA,x ) ⊇ C(σ(A), C)
          ↓                ↓       ↘ α
    L2 (σ(A), µA,x ) ⊇ C(σ(A), C)/∼ ──α′──→ H

commutes. It is a fact from measure theory that C(σ(A), C)/∼ ⊆ L2 (σ(A), µA,x ) is a dense linear
subspace, cf. e.g. [140, Theorem 3.14]. Thus by Lemma 3.12 there is a unique isometric map α̂′ :
L2 (σ(A), µA,x ) → H extending α′. With {f (A)x | f ∈ C(σ(A), C)} ⊇ {Ai (A∗ )j x | i, j ∈ N0 }, the
assumption that x be ∗-cyclic implies that α, thus α′, has dense image in H. Since L2 (σ(A), µA,x )
is complete, its image under α̂′ is closed and dense, thus all of H. Thus α̂′ is unitary, and we
define U = (α̂′)∗ : H → L2 (σ(A), µA,x ). For f ∈ C(σ(A), C) we have U ∗ [f ] = f (A)x, thus

(U AU ∗ )([f ])(z) = (U Af (A)x)(z) = (U (zf )(A)x)(z) = [zf ](z),

and by density of C(σ(A), C)/∼ in L2 (σ(A), µA,x ), this holds for all f ∈ L2 (σ(A), µA,x ) and
µA,x -almost all z ∈ σ(A). □

Not every normal operator A ∈ B(H) admits a ∗-cyclic vector, see Exercise 18.9. But we
always have:

18.4 Theorem (Spectral theorem for normal operators) Let H be a complex Hilbert
space and A ∈ B(H) normal. Then there exists a family {µι }ι∈I of finite Borel measures on
σ(A) and a unitary U : H → ⊕ι∈I L2 (σ(A), µι )97 such that U AU ∗ = ⊕ι∈I Mz , i.e.

(U AU ∗ f )ι (z) = zfι (z) ∀f = {fι } ∈ ⊕ι∈I L2 (σ(A), µι ), z ∈ σ(A). (18.2)

97 Here ⊕ is the Hilbert space direct sum defined at the end of Section 5.1.

Proof. Let F be the family of subsets F ⊆ H such that for x, y ∈ F, x ≠ y we have f (A)x ⊥
f ′(A)y for all f, f ′ ∈ C(σ(A), C). We partially order F by inclusion. One easily checks that F
satisfies the hypothesis of Zorn’s lemma. (Given a totally ordered subset C ⊆ F, ∪C is in F,
thus an upper bound for C.) Thus there is a maximal element M ∈ F. For each x ∈ M we
put Hx = the closure of {f (A)x | f ∈ C(σ(A), C)}. By construction these Hx are mutually
orthogonal and f (A)Hx ⊆ Hx ∀x. Putting K = ⊕x∈M Hx , we have f (A)K ⊆ K for all
f ∈ C(σ(A), C), thus also f (A)∗ K ⊆ K since f (A)∗ = f̄ (A). With Exercise 11.23 this means
that K and K ⊥ are invariant under all f (A). If K ⊥ ≠ {0} then for every non-zero y ∈ K ⊥ we
have M ∪ {y} ∈ F, which contradicts the maximality of M . Thus K = H.
For every x ∈ M we have that x ∈ Hx is ∗-cyclic for the restriction of A to Hx , so that we can
apply Proposition 18.3 to obtain unitaries Ux : Hx → L2 (σ(A), µA,x ) such that Ux A = Mz Ux .
Defining U : H → ⊕x∈M L2 (σ(A), µA,x ) by sending y ∈ Hx to Ux y ∈ L2 (σ(A), µA,x ) and
extending linearly, U is unitary. It is clear that we have U AU ∗ = ⊕x∈M Mz . Now we are done
(with the obvious identifications I = M and µx = µA,x ). □

18.5 Remark 1. Once the maximal family M of vectors has been picked, the construction
is canonical. But there is no uniqueness in the choice of that family. (This is similar to the
non-uniqueness of the choices of ONBs in the eigenspaces ker(A − λ1) that we make in proving
Theorem 14.12.) For much more on this (in the self-adjoint case) see [128, Section VII.2].
2. Theorem 18.4 is perfectly compatible with Theorem 14.12: If A is compact normal and
E is an ONB diagonalizing it then the Hι in Theorem 18.4 are precisely the one-dimensional
spaces Ce for e ∈ E and the measure µι corresponding to Hι = Ce is the δ-measure on P (σ(A))
defined by µ(S) = 1 if λe ∈ S and µ(S) = 0 otherwise. (To be really precise, one should take
the non-uniqueness in both theorems into account.)
3. If A is as in the theorem and g ∈ C(σ(A), C) then the continuous functional calculus
gives us a normal operator g(A). We now have
U g(A)U ∗ = ⊕ι∈I Mg .

(This is an obvious consequence of (18.2) when g is a polynomial and follows by a density


argument in general.) If one took Theorem 18.4 as given, this could even be used to define the
continuous functional calculus. This would be circular since we used the continuous functional
calculus to prove the theorem, or rather Proposition 18.3 on which it relied, but it shows that
the continuous functional calculus and the spectral theorem are ‘equivalent’ in the sense of being
easily deducible from each other.
4. The statement of Theorem 18.4 may not quite be what we expected, given the slogan
‘normal operators are multiplication operators’, since there is a direct sum involved. But this
can be fixed when H is separable: 2

18.6 Corollary Let H be a separable complex Hilbert space and A ∈ B(H) normal. Then
there exists a finite measure space (X, A, µ), a function g ∈ L∞ (X, A, µ; C) and a unitary
W : H → L2 (X, A, µ; C) such that W AW ∗ = Mg .
Proof. We apply Theorem 18.4. Since H is separable, the index set I is at most countable, and
we write I = {1, . . . , N } where N ∈ N ∪ {∞} with ∞ = #N. Now we put X = I × σ(A) =
∐i∈I σ(A) and for Y ⊆ X we put Yi = {x ∈ σ(A) | (i, x) ∈ Y } ⊆ σ(A). We define

A ⊆ P (X) and µ : A → [0, ∞] by

A = {Y ⊆ X | Yi ∈ B(σ(A)) ∀i ∈ I},
µ(Y ) = Σ_{i∈I} µi (Yi ).

Using the countability of I it is straightforward to check that A is a σ-algebra on X and µ a
(positive) measure on (X, A). With (18.1) we have µi (σ(A)) = kxi k². Thus if we choose the
∗-cyclic vectors xi such that kxi k = 2⁻ⁱ then µ(X) = Σ_i µ({i} × σ(A)) = Σ_i µi (σ(A)) < ∞, so
that the measure space (X, A, µ) is finite. Now we define a linear map
that the measure space (X, A, µ) is finite. Now we define a linear map
M
V : L2 (σ(A), µi ) → L2 (X, A, µ), {fi }i∈I 7→ f where f ((i, x)) = fi (x).
i∈I

From the way (X, A, µ) was constructed, it is quite clear that V is unitary. (Check this!) Now
W = V U : H → L2 (X, A, µ), where U comes from Theorem 18.4, is unitary. In view of
(U AU ∗ f )i (λ) = λfi (λ), defining g : X → C, (i, x) 7→ x (which is bounded by r(A) = kAk), we
have W AW ∗ = Mg . 

18.7 Exercise Use the above results to prove that for every normal A ∈ B(H), where H is
a Hilbert space of dimension ≥ 2, there is a proper closed subspace {0} ≠ K ⊊ H such that
AK ⊆ K.
For a bit more on the existence of invariant subspaces see Section B.8.

18.8 Exercise (i) Let Σ ⊆ C be compact and non-empty and µ be a finite positive Borel
measure on Σ. Put H = L2 (Σ, µ) and define A ∈ B(H) by A = Mz , thus (Af )(z) = zf (z)
for f ∈ H. Prove:

σ(A) = {λ ∈ Σ | ∀ε > 0 : µ(B(λ, ε)) > 0},


σp (A) = {λ ∈ Σ | µ({λ}) > 0},

where B(λ, ε) denotes the open ε-disc around λ.


(ii) Prove that L2 (B(λ, ε), µ) is infinite-dimensional for each ε > 0 if λ ∈ σ(A)\σp (A).
(iii) Let A ∈ B(H) be normal. With the terminology of Theorem 18.4 prove

σ(A) = {λ ∈ C | ∃ι ∈ I ∀ε > 0 : µι (B(λ, ε)) > 0},


σp (A) = {λ ∈ C | ∃ι ∈ I : µι ({λ}) > 0}.

18.9 Exercise Let H be a Hilbert space and A ∈ B(H) non-zero and normal. Prove that A
admits a ∗-cyclic vector if and only if H is separable and the algebra {A, A∗ }′ is commutative.

18.2 Borel functional calculus for normal operators


In the preceding section we used the continuous functional calculus to prove the spectral theorem
for normal operators. Now we will turn the logic around and use the spectral theorem to extend
the functional calculus to a larger class of functions!

18.10 Definition If (X, τ ) is a topological space, B ∞ (X, C) denotes the set of bounded func-
tions X → C that are measurable with respect to the Borel σ-algebra B(X, τ ).

18.11 Lemma Let (X, τ ) be a topological space. Then
(i) If {fn }n∈N is a sequence of Borel measurable functions X → C converging pointwise to f
then f is Borel measurable.
(ii) (B ∞ (X, C), k·k∞ ), equipped with pointwise multiplication and ∗-operation, is a C ∗ -algebra.
Proof. (i) It is an elementary fact of measure theory, cf. e.g. [29, Proposition 2.1.5], that the
pointwise limit of a sequence of measurable functions (whatever the σ-algebra) is measurable.
(ii) Every sequence in B ∞ (X, C) that is Cauchy w.r.t. k · k∞ converges pointwise everywhere,
so its limit is measurable by (i), and clearly bounded. Thus B ∞ (X, C) is complete. It is a C ∗ -algebra
since product and ∗-operation satisfy submultiplicativity and the C ∗ -identity. □

For a normal element a ∈ A of a C ∗ -algebra, we cannot make sense of f (a) if f is not


continuous. But the C ∗ -algebra B(H) has much more structure, and it turns out there is a
Borel functional calculus extending the continuous functional calculus:

18.12 Theorem Let H be a complex Hilbert space and A ∈ B(H) normal. Then:
(i) There is a unique unital ∗-homomorphism αA : B ∞ (σ(A), C) → B(H) extending the con-
tinuous functional calculus C(σ(A), C) → B(H) and satisfying kαA (f )k ≤ kf k∞ . Again
we write more suggestively f (A) = αA (f ).
(ii) If B ∈ B(H) commutes with A then B commutes with g(A) for all g ∈ B ∞ (σ(A), C).
(iii) If {fn }n∈N ⊆ B ∞ (σ(A), C) is a bounded sequence converging pointwise to f then f ∈
B ∞ (σ(A), C) and fn (A) → f (A) weakly, i.e. w.r.t. τwot , cf. Definition 10.18 and Exercise 10.19(i).
(And kfn − f k∞ → 0 ⇒ kfn (A) − f (A)k → 0.)
Proof. (i) For all x, y ∈ H, the map

ϕx,y : C(σ(A), C) → C, f 7→ hf (A)x, yi

is a linear functional on C(σ(A), C) that is bounded since kf (A)k = kf k∞ . Thus by the Riesz-
Markov-Kakutani Theorem A.56 there exists a unique complex Borel measure µx,y on σ(A) such
that ϕx,y (f ) = ∫ f dµx,y for all f ∈ C(σ(A), C). Since ϕx,y depends in a sesquilinear way on
(x, y), the same holds for µx,y , and |µx,y (σ(A))| = |hx, yi| ≤ kxkkyk. Thus if f ∈ B ∞ (σ(A), C),
the map ψf : H × H → C defined by (x, y) ↦ ∫ f dµx,y is a sesquilinear form that is bounded since
|ψf (x, y)| ≤ kf k∞ kxkkyk. Thus by Proposition 11.5 there is a unique Af ∈ B(H) such that
hAf x, yi = ψf (x, y) for all x, y ∈ H. It satisfies kAf k ≤ kf k∞ . Define αA : B ∞ (σ(A), C) → B(H)
by f ↦ Af . If f ∈ C(σ(A), C) then ψf (x, y) = hf (A)x, yi ∀x, y, implying Af = f (A). Thus αA
extends the continuous functional calculus.
It remains to be shown that αA is a ∗-homomorphism. Linearity is quite obvious. Since the
continuous functional calculus is a ∗-homomorphism, for f ∈ C(σ(A), C) we have f̄ (A) = f (A)∗ ,
thus

∫ f dµx,y = hf (A)x, yi = hx, f (A)∗ yi = hx, f̄ (A)yi = conj(hf̄ (A)y, xi) = conj(∫ f̄ dµy,x ) = ∫ f dµ̄y,x ,

implying µ̄y,x = µx,y . Now for all f ∈ B ∞ (σ(A), C) the above computation can be read back-
wards, giving αA (f̄ ) = αA (f )∗ . Since the continuous functional calculus is a homomorphism,
for all f, g ∈ C(σ(A), C) we have

∫ (f g) dµx,y = h(f g)(A)x, yi = hf (A)g(A)x, yi = hg(A)x, f̄ (A)yi = ∫ g dµx,f̄ (A)y .

The fact that this holds for all f, g ∈ C(σ(A), C) implies f µx,y = µx,f̄ (A)y . Thus for all
f ∈ C(σ(A), C), g ∈ B ∞ (σ(A), C) we have

h(f g)(A)x, yi = ∫ f g dµx,y = ∫ g dµx,f̄ (A)y = hg(A)x, f̄ (A)yi = hf (A)g(A)x, yi,

so that (f g)(A) = f (A)g(A). As above, we deduce from this that f µx,y = µx,f̄ (A)y for all
f ∈ B ∞ (σ(A), C), and then (f g)(A) = f (A)g(A) for all f, g ∈ B ∞ (σ(A), C).
(ii) By normality of A and Fuglede’s Theorem 16.16 we have BA∗ = A∗ B, thus B commutes
with C ∗ (1, A), so that Bf (A) = f (A)B for all f ∈ C(σ(A), C). Thus

ϕBx,y (f ) = hf (A)Bx, yi = hBf (A)x, yi = hf (A)x, B ∗ yi = ϕx,B ∗ y (f ) ∀x, y, f.

This implies µBx,y = µx,B ∗ y for all x, y, whence

hf (A)Bx, yi = ∫ f dµBx,y = ∫ f dµx,B ∗ y = hf (A)x, B ∗ yi ∀x, y ∈ H, f ∈ B ∞ (σ(A), C),

thus f (A)B = Bf (A) for all f ∈ B ∞ (σ(A), C).


(iii) Measurability of the limit function f follows from Lemma 18.11(i). If kfn k∞ ≤ M ∀n
then clearly kf k∞ ≤ M . Thus f ∈ B ∞ (σ(A), C). For all x, y ∈ H we have

hαA (fn )x, yi = ∫ fn dµx,y −→ ∫ f dµx,y = hαA (f )x, yi,

where convergence in the middle is a trivial application of the dominated convergence theorem,
using boundedness of µx,y and kfn k∞ ≤ M for all n. This proves αA (fn ) → αA (f ) weakly. The
final claim clearly follows from kfn (A) − f (A)k = k(fn − f )(A)k ≤ kfn − f k∞ . □

The above construction of the Borel functional calculus was independent of the Spectral
Theorem 18.4. We now wish to understand their relationship. This is the first step:

18.13 Exercise Let Σ ⊆ C be compact and λ a finite positive Borel measure on Σ. Let
H = L2 (Σ, λ; C) and g ∈ B ∞ (Σ, C).
(i) Prove that the multiplication operator Mg : H → H, [f ] 7→ [gf ] satisfies

kMg k = ess supλ |g| = inf{t ≥ 0 | λ({x ∈ Σ | |g(x)| > t}) = 0} ≤ kgk∞ .

(ii) Let A = Mz ∈ B(H), where z : Σ ↪ C. Prove that g(A) as defined by the Borel functional
calculus coincides with Mg .
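A discrete toy model of (i): for a weighted counting measure on a finite set, Mg is a diagonal matrix and kMg k is the essential supremum, which ignores points of measure zero. (The weights and values below are an illustrative assumption.)

```python
import numpy as np

# Toy model of Exercise 18.13(i): Sigma = {0, 1, 2} with weights (1, 1, 0).
# L^2(Sigma, lam) only "sees" points of positive weight, so M_g is the
# diagonal operator diag(g(0), g(1)) and ||M_g|| = ess sup |g|, which can
# be strictly smaller than ||g||_inf.

g = np.array([1.0, 2.0, 5.0])          # |g| is largest at the null point 2
lam = np.array([1.0, 1.0, 0.0])        # point 2 has measure zero

support = lam > 0
Mg = np.diag(g[support])               # M_g on L^2 = C^2
op_norm = np.linalg.norm(Mg, 2)

assert np.isclose(op_norm, 2.0)        # ess sup |g| = 2
assert op_norm < np.max(np.abs(g))     # strictly below ||g||_inf = 5
```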

18.14 Corollary Let A ∈ B(H) be normal and g ∈ B ∞ (σ(A), C). Then


(i) σ(g(A)) ⊆ g(σ(A)).
(ii) If h ∈ B ∞ (g(σ(A)), C) then h(g(A)) = (h ◦ g)(A).
Proof. Let U : H → ⊕ι∈I L2 (σ(A), µι ) be as in Theorem 18.4, so that U AU ∗ = ⊕ι Mz . The
projectors Pι onto the subspaces L2 (σ(A), µι ) of the direct sum commute with A (and A∗ ),
thus also with g(A) for each g ∈ B ∞ (σ(A), C) by Theorem 18.12(ii). Thus the Borel functional
calculus respects the direct sum decomposition of A (no matter how the maximal set M was
chosen). It is a pure formality to show that if V : H → H 0 is unitary then V g(A)V ∗ = g(V AV ∗ ).

Thus with the direct sum decomposition U AU ∗ = ⊕_{ι} Mz we have U g(A)U ∗ = ⊕_{ι} g(Mz ) = ⊕_{ι} Mg , where the second equality comes from Exercise 18.13(ii).
(i) If λ is not in the closure of g(σ(A)) then each Mg − λ1 has a bounded inverse with norm ≤ dist(λ, g(σ(A)))−1 . Thus all Mg − λ1 in the direct sum decomposition of g(A) − λ1 have inverses with uniformly bounded norms. Thus g(A) − λ1 has a bounded inverse.
(ii) Under the assumption on h, we have
U h(g(A))U ∗ = ⊕_{ι} h(Mg ) = ⊕_{ι} Mh◦g = U (h ◦ g)(A)U ∗ .

(This is too sloppy, but the reader should be able to make it precise.) 
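For readers who like to experiment, the composition law can be checked on matrices, where the Borel functional calculus of a normal operator reduces to applying the function to the eigenvalues. The following numpy sketch is my own illustration (not part of the notes' formal development); it verifies h(g(A)) = (h ◦ g)(A) for a self-adjoint A and a genuinely discontinuous (but Borel) h.

```python
# Finite-dimensional sanity check of Corollary 18.14(ii): for a Hermitian
# matrix the Borel functional calculus applies the function to the
# eigenvalues, and h(g(A)) = (h o g)(A) even for discontinuous h.
import numpy as np

def apply_func(A, f):
    """f(A) for a Hermitian matrix A via A = V diag(w) V*."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(f(w).astype(complex)) @ V.conj().T

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                          # self-adjoint, hence normal

g = lambda z: np.abs(z)                     # continuous on sigma(A)
h = lambda z: (z > 1).astype(float)         # Borel but discontinuous

lhs = apply_func(apply_func(A, g), h)       # h(g(A))
rhs = apply_func(A, lambda z: h(g(z)))      # (h o g)(A)
assert np.allclose(lhs, rhs, atol=1e-10)
```
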

18.15 Remark 1. Since it turns out that g(A) = U ∗ (⊕_{ι} Mg )U for all g ∈ B ∞ (σ(A), C), one might try to take this as the definition of g(A). But apart from being very inelegant, this has the problem that one must prove that g(A) thus defined is independent of the choice of the maximal set M ⊆ H in the proof of the spectral theorem. This would not be difficult if every Borel measurable function were a pointwise limit of a sequence of continuous functions. But this is false, making such an approach quite painful. (Compare Lusin’s theorem in, e.g., [140].)
2. We cannot hope to prove kg(A)k = kgk∞ for all g ∈ B ∞ (σ(A), C), since this is true only if σ(A) = σp (A)! Since singletons in C are closed, thus Borel measurable, we can change g arbitrarily at some λ ∈ σ(A) without destroying the measurability of g, making kgk∞ as large as we want. But if λ ∈ σ(A)\σp (A), Exercise 18.8 gives µι ({λ}) = 0 ∀ι ∈ I, so that this change of g does not affect the norms ess sup_{µι} |g| of the multiplication operators making up g(A) (cf. Exercise 18.13) and therefore does not affect kg(A)k.
3. Let A ∈ B(H) be normal and consider the C ∗ -algebra A = C ∗ (1, A) ⊆ B(H). Then
g(A) ∈ A for continuous g, but for most non-continuous g we have g(A) 6∈ A. For this reason
there is no Borel functional calculus in abstract C ∗ -algebras. (But g(A) is always contained in
the von Neumann algebra vN(A) = C ∗ (A, 1)^{wot} (the closure of C ∗ (A, 1) in the weak operator topology) generated by A. This follows from Theorem 18.12(ii) and von Neumann’s ‘double commutant theorem’.) 2

As a remarkable application, we have the following complement to Exercise 17.22:

18.16 Lemma If H is a complex Hilbert space and U ∈ B(H) is unitary, there is a self-adjoint
A ∈ B(H) such that eiA = U .
Proof. Since U is unitary, we have σ(U ) ⊆ S 1 . Define f : S 1 → (−π, π] as the inverse of the bijection (−π, π] → S 1 , x 7→ eix . The function f is continuous on S 1 except at −1, where it has a
jump. Thus it is Borel measurable, so that we can define a normal operator A = f (U ) ∈ B(H)
by Borel functional calculus (Theorem 18.12). By Corollary 18.14 we have σ(A) ⊆ f (σ(U )) ⊆
[−π, π] (which together with normality of A implies A = A∗ ) and eiA = U . 
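In finite dimensions this construction can be carried out numerically. The sketch below (my own illustration; all numerics are assumptions of the example, not part of the notes) applies the Borel function f(e^{ix}) = x, x ∈ (−π, π], to the eigenvalues of a unitary matrix and checks that the result is self-adjoint with e^{iA} = U.

```python
# Matrix illustration of Lemma 18.16: for a unitary U, the branch of the
# argument with values in (-pi, pi] applied to the eigenvalues gives a
# self-adjoint A with e^{iA} = U.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
U, _ = np.linalg.qr(B)                            # a generic unitary matrix

w, V = np.linalg.eig(U)                           # U normal; eigenvalues on S^1
A = V @ np.diag(np.angle(w)) @ np.linalg.inv(V)   # A = f(U); np.angle lands in (-pi, pi]
assert np.allclose(A, A.conj().T, atol=1e-8)      # A is self-adjoint

# compute e^{iA} by an independent spectral decomposition of A
wA, VA = np.linalg.eigh((A + A.conj().T) / 2)
expiA = VA @ np.diag(np.exp(1j * wA)) @ VA.conj().T
assert np.allclose(expiA, U, atol=1e-6)
```
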

18.17 Proposition Let H be a complex Hilbert space. Then the open set InvB(H) ⊂ B(H)
of invertible operators is path-connected.
Proof. Let A ∈ InvB(H), and let A = V |A| be its polar decomposition. Then |A| is positive and
invertible, and V is unitary, cf. Exercise 11.43. Since σ(|A|) ⊆ (0, +∞), we can use continuous
functional calculus to define B = log |A| ∈ B(H), satisfying eB = |A|. By the preceding
lemma, there is a self-adjoint D with eiD = V . Now g : [0, 1] → InvB(H), t 7→ eitD etB
is a continuous path in Inv B(H) (since eX is invertible for all X) such that g(0) = 1 and
g(1) = eiD eB = V |A| = A. Now a standard argument produces paths between any two
invertible operators. 
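The path constructed in the proof can be traced explicitly for matrices. The following sketch (my own; the polar decomposition is computed by hand rather than with a library routine) builds B = log|A| and a self-adjoint D with e^{iD} = V, and checks that g(t) = e^{itD} e^{tB} stays invertible and connects 1 to A.

```python
# Matrix-sized walk-through of the path in the proof of Proposition 18.17:
# A = V|A|, B = log|A|, e^{iD} = V, and g(t) = e^{itD} e^{tB} runs through
# invertible matrices from the identity to A.
import numpy as np

def funm_herm(H, f):
    """f(H) for a Hermitian matrix H via the spectral theorem."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(f(w).astype(complex)) @ V.conj().T

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))  # generically invertible

absA = funm_herm(A.conj().T @ A, lambda x: np.sqrt(np.maximum(x, 0.0)))  # |A| = (A*A)^{1/2}
V = A @ np.linalg.inv(absA)                        # polar part, unitary
B = funm_herm(absA, np.log)                        # log|A|; sigma(|A|) in (0, inf)
w, W = np.linalg.eig(V)
D = W @ np.diag(np.angle(w)) @ np.linalg.inv(W)    # self-adjoint with e^{iD} = V

def g(t):
    return funm_herm(t * D, lambda x: np.exp(1j * x)) @ funm_herm(t * B, np.exp)

assert np.allclose(g(0.0), np.eye(4), atol=1e-10)  # g(0) = 1
assert np.allclose(g(1.0), A, atol=1e-6)           # g(1) = e^{iD} e^{B} = V|A| = A
for t in np.linspace(0.0, 1.0, 11):                # the path avoids singular matrices
    assert abs(np.linalg.det(g(t))) > 1e-10
```
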

18.18 Remark The result also holds for infinite-dimensional real Hilbert spaces, but then it
requires a different proof. This is surprising since GL(n, R) is not path-connected for n ∈ N
but has two path-components, while GL(n, C) is path-connected. In fact, one has this much
stronger theorem of Kuiper98 [89]: Inv B(H) is contractible in the norm topology for all infinite-
dimensional real or complex Hilbert spaces. 2

18.3 Normal operators vs. projection-valued measures


There is yet another perspective on the spectral theorem/functional calculus, provided by
projection-valued measures:

18.19 Definition Let H be a Hilbert space and Σ ⊆ C a compact subset. Let B(Σ) be the Borel
σ-algebra on Σ. A projection-valued measure relative to (H, Σ) is a map P : B(Σ) → B(H)
such that
(i) P (S) is an orthogonal projection for all S ∈ B(Σ).
(ii) P (∅) = 0, P (Σ) = 1.
(iii) P (S ∩ S 0 ) = P (S)P (S 0 ) for all S, S 0 ∈ B(Σ).
(iv) For all x, y ∈ H, the map Ex,y : B(Σ) → C, S 7→ hP (S)x, yi is a complex measure.
(Equivalently, if the {Sn }n∈N ⊆ B(Σ) are mutually disjoint then Σ_{n} P (Sn ) converges
weakly to P (∪_{n} Sn ).)
Note that (iii) implies P (S)P (S 0 ) = P (S 0 )P (S) for all S, S 0 ∈ B(Σ).

18.20 Proposition Let H be a complex Hilbert space and A ∈ B(H) normal. Put Σ = σ(A).
For each S ∈ B(Σ), define PA (S) = χS (A) by Borel functional calculus. Then S 7→ PA (S) is a
projection-valued measure relative to (H, Σ), also called the spectral resolution of A.
Proof. If g = χS for S ∈ B(Σ), g(A) is a direct sum of operators of multiplication by χS , which
clearly all are idempotent. And since g = χS is real-valued, g(A) is self-adjoint. Thus each
PA (S) = χS (A) is an orthogonal projection. PA (∅) = 0 is clear, and PA (Σ) = 1(A) = 1H (since
the constant 1 function is continuous). Property (iii) is immediate from χS∩S 0 = χS χS 0 . Finally,
if x, y ∈ H let U x = {fι }ι∈I , U y = {gι }ι∈I . Then
Ex,y (S) = hPA (S)x, yi = Σ_{ι∈I} ∫_{σ(A)} χS (z)fι (z)gι (z) dµι (z).

From this it is clear that S 7→ Ex,y (S) is countably additive on mutually disjoint families {Sn }n∈N ⊆ B(Σ), by absolute convergence of Σ_{ι} ∫ fι gι dµι . □

18.21 Exercise Let A ∈ B(H) be normal and Σ ⊆ σ(A) a Borel set. Prove σ(A|P (Σ)H ) ⊆
Σ ∪ {0}. Bonus: State and prove a better result.

18.22 Exercise Let A ∈ B(H) be a normal operator and let PA be the corresponding spectral
measure. Prove:
(i) λ ∈ σ(A) if and only if PA (σ(A) ∩ B(λ, ε)) 6= 0 for each ε > 0.
(ii) λ ∈ σ(A) is an eigenvalue if and only if PA ({λ}) 6= 0.
98 Nicolaas Hendrik Kuiper (1920-1994), Dutch mathematician.
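Exercise 18.22(ii) is easy to confirm for matrices, where PA({λ}) is the orthogonal projection onto the eigenspace at λ. The example below is my own construction (a diagonal, hence normal, matrix), not taken from the notes.

```python
# Finite-dimensional check of Exercise 18.22(ii): for a normal matrix,
# P_A({lambda}) = chi_{{lambda}}(A) is the orthogonal projection onto the
# eigenspace at lambda, hence nonzero exactly when lambda is an eigenvalue.
import numpy as np

A = np.diag([1.0, 1.0, 2.0, 3.0]).astype(complex)   # normal; sigma(A) = {1, 2, 3}

def P(A, lam, tol=1e-9):
    """chi_{lam}(A) for a Hermitian A: sum of the eigenprojections at lam."""
    w, V = np.linalg.eigh(A)
    chi = (np.abs(w - lam) < tol).astype(complex)
    return V @ np.diag(chi) @ V.conj().T

P1 = P(A, 1.0)
assert np.allclose(P1 @ P1, P1) and np.allclose(P1, P1.conj().T)  # orth. projection
assert np.isclose(np.trace(P1).real, 2.0)     # two-dimensional eigenspace at 1
assert np.allclose(P(A, 1.5), 0)              # 1.5 is not an eigenvalue
assert np.allclose(P(A, 1.0) @ P(A, 2.0), 0)  # Definition 18.19(iii), disjoint sets
```
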

We have thus seen that every normal operator gives rise to a projection-valued measure. The
converse is also true, and we have a bijection between normal operators and projection-valued
measures:

18.23 Proposition Let H be a complex Hilbert space, Σ ⊆ C a compact subset and P a


projection-valued measure relative to (H, Σ). Then
(i) For every f ∈ B ∞ (Σ, C) there is a unique α(f ) ∈ B(H) such that

hα(f )x, yi = ∫ f dEx,y ∀x, y ∈ H,

and kα(f )k ≤ kf k∞ ∀f . We also write, somewhat symbolically, α(f ) = ∫ f (z) dP (z).
(ii) The map α : B ∞ (Σ, C) → B(H) is a unital ∗-homomorphism. α(f ) is normal for each f .
(iii) Put A = α(z) ∈ B(H), where z : Σ ,→ C is the inclusion map. Then σ(A) ⊆ Σ and
α(f ) = f (A) for each f ∈ C(Σ, C).
(iv) The maps from normal operators to projection-valued measures (Proposition 18.20) and conversely ((i)-(iii) above) are mutually inverse.
Proof. (i) It is clear that the map (x, y) 7→ Ex,y (S) = hP (S)x, yi is sesquilinear for each S ∈ B(Σ). For each f ∈ B ∞ (Σ, C), we have |∫ f dEx,y | ≤ kf k∞ kxkkyk. Thus [x, y]f = ∫ f dEx,y is a sesquilinear form with norm ≤ kf k∞ . Thus by Proposition 11.5 there is a unique Af ∈ B(H) such that hAf x, yi = [x, y]f ∀x, y ∈ H and kAf k ≤ kf k∞ . Now put α(f ) := Af .
(ii) The inequality has already been proven. It is clear that f 7→ Af = α(f ) is linear. If 1 is the constant one function, we have hα(1)x, yi = ∫ 1 dEx,y = Ex,y (Σ) = hx, yi since P (Σ) = 1.
Thus α(1) = 1. It remains to prove α(f g) = α(f )α(g) and α(f )∗ = α(f ). We first do this for
characteristic functions of measurable sets S, T : f = χS , g = χT . Now
hα(χS )x, yi = ∫_{S} dEx,y = Ex,y (S) = hP (S)x, yi.

Thus α(χS )α(χT ) = P (S)P (T ) = P (S ∩ T ) = α(χS∩T ) = α(χS χT ). By linearity of α we now


have α(f g) = α(f )α(g) for all simple functions, i.e. finite linear combinations of characteristic
functions. The latter are k · k∞ -dense in B ∞ (Σ, C). (By a proof very similar to that of Lemma
4.13 in the case of `∞ . Note that a measurable function is simple if and only if it assumes only
finitely many values.) Now the identity follows for all f, g by continuity of α.
Furthermore, α(χS )∗ = P (S)∗ = P (S) = α(χS ), so that α(f )∗ = α(f ) for simple functions.
Now apply the same density and continuity argument as above.
In view of f (A) = α(f ), the normality of f (A) follows from

f (A)f (A)∗ = α(f )α(f )∗ = α(f f ∗ ) = α(f ∗ f ) = α(f )∗ α(f ) = f (A)∗ f (A).

(iii) σ(A) ⊆ Σ is clear. Since α(1) = 1 and α(z) = A by definition, we have α(p) = p(A) for each polynomial p. More generally, since α is a ∗-homomorphism, a polynomial in z, z̄ is sent
by α to the corresponding polynomial in A, A∗ . These polynomials are k · k∞ -dense in C(Σ, C)
by Weierstrass Theorem A.38, so that the continuity proven in (ii) implies that α(f ) = f (A) as
produced by the continuous functional calculus.
(iv) Left as an exercise. 

We close the discussion of spectral theorems with the advice to look at the paper [67] and at [152, Chapter 5], by two masters of functional analysis.

19 ⋆ The Gelfand homomorphism for commutative Banach and C ∗ -algebras
19.1 The topology of Ω(A). The Gelfand homomorphism
Let A be a unital Banach algebra over C and Ω(A) its spectrum. For each a ∈ A define (as in
Section 9.3)
â : Ω(A) → C, ϕ 7→ ϕ(a).

We now want a topology τ on Ω(A) such that â is continuous for each a ∈ A, thus â ∈ C((Ω(A), τ ), C). Since Ω(A) ⊆ (A∗ )≤1 by Lemma 15.3, we could take τ to be the restriction of the norm topology of A∗ to Ω(A) (i.e. the relative topology). But we can also take the weakest topology making all â : Ω(A) → C continuous. This is nothing other than the restriction to
Ω(A) of the weak-∗ topology or σ(A∗ , A)-topology on A∗ .

19.1 Proposition Let A be a unital Banach algebra. Let τ be the restriction of the weak-∗
topology to Ω(A) ⊆ (A∗ )≤1 . Then (Ω(A), τ ) is compact Hausdorff.
Proof. By Alaoglu’s theorem, ((A∗ )≤1 , τw∗ ) is compact. Thus it suffices to prove that Ω(A) ⊆
(A∗ )≤1 is weak-∗ closed. Let {ϕι } be a net in Ω(A) that converges to ψ ∈ A∗ w.r.t. the σ(A∗ , A)-
topology. Then for all a, b ∈ A we have ψ(ab) = limι ϕι (ab) = limι ϕι (a)ϕι (b) = ψ(a)ψ(b), so
that ψ ∈ Ω(A). Thus Ω(A) ⊆ (A∗ )≤1 is σ(A∗ , A)-closed. 

The above works whether or not A is commutative, but we’ll now restrict to commutative
A since Ω(A) can be very small otherwise. We begin by completing Exercise 15.5:

19.2 Proposition Let X be a compact Hausdorff space and A = C(X, C). Then the map
X → Ω(A), x 7→ ϕx is a homeomorphism (with the weak-∗ topology on Ω(A)).
Proof. Injectivity was already proven in Exercise 15.5. In order to prove surjectivity, let ϕ ∈
Ω(A) and put M = ker ϕ. Then M ⊆ A is a proper closed subalgebra (in fact an ideal), and
it is self-adjoint by Lemma 17.17 since A is a C ∗ -algebra. If x, y ∈ X, x 6= y, pick f ∈ A
with f (x) 6= f (y). With g = f − ϕ(f )1 we have ϕ(g) = 0, thus g ∈ M . This proves that
M separates the points of X, yet it is not dense in A. Now the incarnation Corollary A.41
of the Stone-Weierstrass theorem implies that there must be an x ∈ X at which M vanishes
identically, i.e. ϕx (f ) = 0 for all f ∈ M . Now for every f ∈ A we have f − ϕ(f ) ∈ M , thus
ϕx (f − ϕ(f )1) = 0, which is equivalent to ϕx (f ) = ϕ(f ). Thus ι : X → Ω(A) is surjective.
If {xι } ⊆ X such that xι → x then ϕxι (f ) = f (xι ) → f (x) = ϕx (f ) for every f ∈ A by
continuity of f . But this precisely means that ϕxι → ϕx w.r.t. the weak-∗ topology. Thus ι is
continuous. As a continuous bijection of compact Hausdorff spaces it is a homeomorphism. 

19.3 Definition Let A be a unital Banach algebra. Then its radical is the set of quasi-
nilpotent elements: radA = {a ∈ A | r(a) = 0}. We call A semisimple if radA = {0}.

19.4 Proposition If A is a unital commutative Banach algebra, the map

π : A → C(Ω(A), C), a 7→ â (19.1)

is a unital homomorphism, called the Gelfand homomorphism (or representation) of A, and


kπ(a)k = r(a) ≤ kak for all a ∈ A. Thus ker π = radA, and π is injective if and only if A is
semisimple.

Proof. It is clear that π is linear, and 1̂(ϕ) = ϕ(1) = 1 for all ϕ. Let a, b ∈ A, ϕ ∈ Ω(A). Then

π(ab)(ϕ) = ϕ(ab) = ϕ(a)ϕ(b) = â(ϕ)b̂(ϕ) = π(a)(ϕ)π(b)(ϕ) = (π(a)π(b))(ϕ),

where we used multiplicativity of ϕ and the fact that the multiplication on C(Ω(A), C) is pointwise. Thus π(ab) = π(a)π(b), so that π is an algebra homomorphism. We have

kâk = sup_{ϕ∈Ω(A)} |â(ϕ)| = sup_{ϕ∈Ω(A)} |ϕ(a)| = sup_{λ∈σ(a)} |λ| = r(a) ≤ kak,

where we used (15.1) and Proposition 13.27. In particular, ker π = r−1 (0) = radA. □

The Gelfand homomorphism can fail to be surjective or injective or both. See Section 19.2
for an important example for the failure of surjectivity and Exercise 19.6 for a non-trivial unital
Banach algebra with very large radical.

19.5 Proposition Let A be a commutative unital Banach algebra and a ∈ A such that A is
generated by {1, a}. Then the map â : Ω(A) → σ(a) is a homeomorphism.
The same conclusion holds if a ∈ InvA and A is generated by {1, a, a−1 }.
Proof. We know from (15.1) that â(Ω(A)) = σ(a), thus â is surjective. Assume â(ϕ1 ) = â(ϕ2 ), thus ϕ1 (a) = ϕ2 (a). Since the ϕi are unital homomorphisms, this implies ϕ1 (an ) = ϕ2 (an ) for all n ∈ N0 , so that ϕ1 , ϕ2 agree on the polynomials in a. Since the latter are dense in A by assumption and the ϕi are continuous, this implies ϕ1 = ϕ2 . Thus â : Ω(A) → σ(a) is injective, thus a continuous bijection. Since Ω(A) is compact and σ(a) ⊆ C Hausdorff, â is a homeomorphism. This proves the first claim.
For the second claim, note that ϕ(a)ϕ(a−1 ) = ϕ(aa−1 ) = ϕ(1) = 1, thus ϕ(a−1 ) = ϕ(a)−1 ,
for each ϕ ∈ Ω(A). This implies that ϕ1 (an ) = ϕ2 (an ) also holds for negative n ∈ Z. Now
ϕ1 , ϕ2 agree on all Laurent polynomials in a, thus on A by density and continuity. The rest of
the proof is the same. 

19.6 Exercise Let α : N0 → (0, ∞) be a map satisfying α(0) = 1 and α_{n+m} ≤ αn αm ∀n, m. For f : N0 → C, define kf k = Σ_{n∈N0} αn |f (n)|, and A = {f : N0 → C | kf k < ∞}. For f, g ∈ A, define f · g by (f · g)(n) = Σ_{u+v=n} f (u)g(v).

(i) Prove that (A, ·, 1, k · k) is a commutative Banach algebra with unit 1 = δ0 .


(ii) Prove that δ1 generates A and r(δ1 ) = lim_{n→∞} αn^{1/n} .
(iii) Find a sequence {αn } satisfying the above requirements such that δ1 is quasi-nilpotent.
(iv) Conclude that with the {αn } from (iii) we have radA = {f ∈ A | f (0) = 0}.
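A quick numeric companion to parts (ii)-(iii), with my own choice of weights: αn = exp(−n²) satisfies the requirements since (n+m)² ≥ n² + m², and αn^{1/n} = e^{−n} → 0, so δ1 is quasi-nilpotent.

```python
# Companion to Exercise 19.6 with alpha_n = exp(-n^2): the weights are
# submultiplicative, and alpha_n^{1/n} = exp(-n) -> 0, i.e. r(delta_1) = 0.
import math

alpha = lambda n: math.exp(-n * n)

# alpha_{n+m} <= alpha_n * alpha_m, since (n+m)^2 >= n^2 + m^2
for n in range(20):
    for m in range(20):
        assert alpha(n + m) <= alpha(n) * alpha(m) * (1 + 1e-12)

# ||delta_1^n|| = alpha_n in A, so r(delta_1) = lim alpha_n^{1/n}
roots = [alpha(n) ** (1.0 / n) for n in range(1, 26)]    # = exp(-n)
assert all(r2 < r1 for r1, r2 in zip(roots, roots[1:]))  # strictly decreasing
assert roots[-1] < 1e-10                                 # spectral radius 0
```
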

19.7 Remark 1. Since every commutative unital Banach algebra has at least one non-zero
character ϕ, the worst that can happen is radA = ϕ−1 (0), which has codimension one, as in the
preceding exercise.
2. If A is a non-unital Banach algebra and a ∈ A one defines σ(a) = σ_{Ã}(a), where Ã is the unitization of A considered in Exercise 13.55. Now one defines r(a) = sup_{λ∈σ(a)} |λ| and radA = r−1 (0) ⊆ A as before. Now for the non-unital subalgebra A0 = {f ∈ A | f (0) = 0} of the A from Exercise 19.6 one easily proves Ã0 ∼= A, thus r(a) = 0 ∀a ∈ A0 and radA0 = A0 . 2

19.8 Exercise Let A be a commutative unital Banach algebra generated by {a1 , . . . , an } ⊆ A.


Prove that the map s : Ω(A) → Cn , ϕ 7→ (ϕ(a1 ), . . . , ϕ(an )) is a homeomorphism of Ω(A) onto the joint spectrum σ(a1 , . . . , an ), the latter being a closed subset of σ(a1 ) × · · · × σ(an ).

19.2 Application: Absolutely convergent Fourier series
Let (A = `1 (Z, C), k · k1 , ⋆, 1) be the unital Banach algebra from Section 4.6. In Exercise 15.11
we used characters to compute the spectra of elements of A and proved that it is semisimple.
We now give a new interpretation of these somewhat ad-hoc arguments in the light of Fourier
analysis and the Gelfand homomorphism.
By the semisimplicity of A, the Gelfand homomorphism π : A → C(Ω(A), C) is injective. (In
Exercise 15.11 we have proven a bijection S 1 → Ω(A). Since the Banach algebra A is generated
by δ1 and δ1−1 = δ−1 , Proposition 19.5 amplifies this to a homeomorphism Ω(A) → S 1 .) But π
is neither isometric nor surjective: Its image consists precisely of
W = { g ∈ C(S 1 , C) | Σ_{n∈Z} |ĝ(n)| < ∞ }.

This is an algebra since A is. (To see this without reference to π, note that (f · g)̂(n) = (f̂ ⋆ ĝ)(n) and use the fact that `1 (Z) is closed under convolution.) While W inherits the norm k · k∞ from C(S 1 , C), it is not closed in this norm and π is not an isometry. The norm on W for which π : A → W is an isometric isomorphism is kgkW = Σ_{n∈Z} |ĝ(n)|. Since W is generated by the function z 7→ z, it follows that Ω(W) consists of the point evaluations {ϕz | z ∈ S 1 }, as for C(S 1 , C). Now the result of Exercise 15.11 is obvious, since it follows from the isometric isomorphism π : (A, k · k1 ) → (W, k · kW ).
For g ∈ W the Fourier series converges absolutely and uniformly to g, but we have proven in Section 8.3 that C(S 1 , C) has a dense subset of functions whose Fourier series does not even converge pointwise everywhere. (Our proof was non-constructive, but as we remarked, individual examples can be produced constructively.) Functions in C(S 1 , C)\W can actually be written down even more concretely: With some effort (see [109] for an exposition) the series Σ_{n=2}^{∞} sin(nx)/(n log n) can be shown to converge uniformly to some f ∈ C(S 1 , C), and its Fourier coefficients are not absolutely summable since Σ_{n=2}^{∞} (n log n)−1 = ∞. (That the convergence is not unconditional follows from the fact that it isn’t at x = π/2.)
We now turn the non-surjectivity of π : `1 (Z) → C(S 1 ) into a virtue! For g ∈ C(S 1 , C) define kgkW = Σ_{n∈Z} |ĝ(n)|. Thus W = {g ∈ C(S 1 , C) | kgkW < ∞}. We have seen that the Gelfand representation of `1 (Z) is an isometric isomorphism (`1 (Z), k · k1 ) → (W, k · kW ). Now
we have:

19.9 Theorem If g ∈ W satisfies g(z) 6= 0 ∀z ∈ S 1 and h ∈ C(S 1 , C) is its multiplicative


inverse h(z) = 1/g(z) then h ∈ W (thus h has absolutely convergent Fourier series).
Proof. Let f = π −1 (g) ∈ `1 (Z). We have seen that Ω(A) = S 1 and ϕz (f ) = g(z) for all z ∈ S 1 .
Now the assumption g(z) 6= 0 ∀z implies that 0 6∈ σ(f ) = {ϕz (f ) | z ∈ S 1 }, so that f is invertible
in `1 (Z). Thus π(f ) = g ∈ W is invertible. Since the product on W is pointwise multiplication,
this proves that h = g −1 ∈ W, thus h has absolutely convergent Fourier series. 
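Wiener's theorem is easy to see in action numerically. In the sketch below (my own example, computed with the FFT on a fine grid) g(θ) = 3 + cos θ has Fourier coefficients (1/2, 3, 1/2) and never vanishes on S¹, and the coefficients of h = 1/g turn out absolutely summable, in fact geometrically decaying.

```python
# Numerical illustration of Theorem 19.9: g = 3 + cos(theta) is in W and has
# no zeros; the Fourier coefficients of h = 1/g, computed by FFT, are
# absolutely summable, as Wiener's theorem predicts.
import numpy as np

N = 4096
theta = 2 * np.pi * np.arange(N) / N
g = 3.0 + np.cos(theta)                 # min value 2 > 0 on the circle
h = 1.0 / g

g_hat = np.fft.fft(g) / N               # Fourier coefficients on the grid
h_hat = np.fft.fft(h) / N

norm_W_g = np.sum(np.abs(g_hat))
norm_W_h = np.sum(np.abs(h_hat))
assert abs(norm_W_g - 4.0) < 1e-9       # ||g||_W = 3 + 1/2 + 1/2
assert norm_W_h < 0.6                   # h has absolutely convergent Fourier series
assert np.abs(h_hat[40]) < 1e-12        # the coefficients decay geometrically
```
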

19.10 Remark The first proof of this theorem due to Wiener was more involved. The above
proof due to Gelfand was one of the first successes of his theory of commutative C ∗ -algebras.
But now there is a much simpler and quite definitive proof using only convergence of the
geometric/Neumann series in the Banach algebra W. See [114] or [27, Section 2.5]. 2

19.11 Exercise Prove that the Gelfand homomorphism π : `1 (Z, C) → C(S 1 , C) (seen above
to be injective) is not bounded below.
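One concrete way to attack this exercise (my own example; the O(√N) bound for quadratic Gauss sums is a classical fact, not proven here) uses "chirp" sequences: f_N(n) = e^{iπn²/N} for 0 ≤ n < N has kf_N k1 = N, while the sup of |π(f_N)| over S¹ grows only like √N, so kπ(f_N)k∞ /kf_N k1 → 0.

```python
# Numeric evidence for Exercise 19.11: the Gelfand homomorphism
# pi : l^1(Z) -> C(S^1) is not bounded below, witnessed by chirp sequences.
import numpy as np

def ratio(N, M=1 << 14):
    f = np.exp(1j * np.pi * np.arange(N) ** 2 / N)  # ||f||_1 = N
    padded = np.zeros(M, dtype=complex)
    padded[:N] = f
    sup = np.abs(np.fft.fft(padded)).max()  # sup of |pi(f)| on a fine grid in S^1
    return sup / N                          # = ||pi(f)||_infty / ||f||_1

r64, r1024 = ratio(64), ratio(1024)
assert r1024 < r64                          # the ratio shrinks with N ...
assert r1024 < 0.12                         # ... roughly like 1/sqrt(N)
```
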

19.3 C ∗ -algebras. Continuous functional calculus revisited
In discussing when the Gelfand homomorphism π : A → C(Ω(A), F) is an isomorphism, we
limit ourselves to the case where A is a C ∗ -algebra over C.

19.12 Theorem (Gelfand isomorphism) If A is a commutative unital C ∗ -algebra then the


Gelfand homomorphism π : A → C(Ω(A), C) is an isometric ∗-isomorphism.
Proof. Let a ∈ A and ϕ ∈ Ω(A). By Lemma 17.17, ϕ(a∗ ) is the complex conjugate of ϕ(a), so that π(a∗ )(ϕ) = ϕ(a∗ ) is the complex conjugate of π(a)(ϕ), i.e. π(a∗ )(ϕ) = π(a)∗ (ϕ) for all ϕ ∈ Ω(A).

Thus π(a∗ ) = π(a)∗ , so that π is a ∗-homomorphism, and π(A) ⊆ C(Ω(A), C) is self-adjoint.


Since A is commutative, all a ∈ A are normal, thus satisfy r(a) = kak by Proposition
16.17(i). Together with kπ(a)k = r(a) for all a this implies that π is an isometry, thus injective.
Since A is complete, this implies that the image π(A) ⊆ C(Ω(A), C) is complete, thus closed.
If ϕ1 6= ϕ2 then there is an a ∈ A such that ϕ1 (a) 6= ϕ2 (a), thus π(a)(ϕ1 ) = â(ϕ1 ) 6= â(ϕ2 ) = π(a)(ϕ2 ). This proves that π(A) ⊆ C(Ω(A), C) separates the points of Ω(A). Since π is also unital, the Stone-Weierstrass theorem (Corollary A.39) gives that the closure of π(A) is C(Ω(A), C); since π(A) is closed, π(A) = C(Ω(A), C). □

19.13 Proposition Let A, B be commutative unital C ∗ -algebras. If α : A → B is a unital


∗-homomorphism, define α∗ : Ω(B) → Ω(A), ψ 7→ ψ ◦ α. Then
(i) α∗ is continuous w.r.t. the weak-∗ topologies on Ω(A), Ω(B).
(ii) If α is surjective then α∗ is injective.
(iii) If α is injective then α∗ is surjective.
Proof. (i) Let {ψι } be a net in Ω(B) weak-∗ convergent to ψ. Thus ψι (b) → ψ(b) ∀b ∈ B. Then
w∗
in particular ψι (α(a)) → ψ(α(a)) ∀a ∈ A. Thus α∗ (ψι ) = ψι ◦ α −→ ψ ◦ α = α∗ (ψ), proving
continuity of α∗ .
(ii) This is trivial: If ψ, ψ 0 ∈ Ω(B) and α∗ (ψ) = α∗ (ψ 0 ) then ψ ◦ α = ψ 0 ◦ α, and surjectivity
of α implies ψ = ψ 0 .
(iii) Assume α∗ (Ω(B)) $ Ω(A). By (i) and weak-∗ compactness of Ω(B), α∗ (Ω(B)) is a compact, thus closed, proper subset of Ω(A). Since Ω(A) is compact Hausdorff, it is normal, and by Urysohn’s Lemma there is a non-zero f ∈ C(Ω(A), C) that vanishes on α∗ (Ω(B)). By the Gelfand isomorphism, there is a non-zero a ∈ A such that f = â. If now ψ ∈ Ω(B) then we have ψ(α(a)) = â(ψ ◦ α) = f (α∗ (ψ)) = 0. Since this holds for all ψ ∈ Ω(B) we have σB (α(a)) = {0}, thus α(a) = 0. But in view of a 6= 0 this contradicts the injectivity of α. This contradiction proves the surjectivity of α∗ . □

The result of (iii) is equivalent to the following: Every character on a unital C ∗ -subalgebra
of a commutative C ∗ -algebra has an extension to a character of the larger algebra.

19.14 Remark 1. Theorem 19.12 can be strengthened to a (contravariant) equivalence of cat-


egories between the categories of commutative unital C ∗ -algebras and unital ∗-homomorphisms
and of compact Hausdorff spaces and continuous maps.
2. With some work, the assumption of A having a unit can be dropped, cf. e.g. [110].
One finds that every commutative C ∗ -algebra is isometrically ∗-isomorphic to C0 (X, C) for a
locally compact Hausdorff space X, unique up to homeomorphism. X is compact if and only
if A is unital. And the equivalence of categories mentioned above extends to a contravariant

equivalence between the category of locally compact Hausdorff spaces and proper maps and the category of commutative C ∗ -algebras and non-degenerate ∗-homomorphisms.
3. The preceding comments in a sense end the theory of commutative C ∗ -algebras, since they reduce it to general topology. But the theory of non-commutative C ∗ -algebras is vast,
see [79, 110] for accessible introductions, and it turns out that commutative C ∗ -algebras are a
very useful tool for studying them, as results like Exercise 16.24 and Proposition 17.6 just begin
to illustrate.
4. Comparing Theorem 19.12 with the non-surjectivity of the Gelfand homomorphism for (`1 (Z, C), ⋆) shows that `1 (Z, C) does not admit a norm that would make it a C ∗ -algebra. But `1 (Z, C) admits a non-complete C ∗ -norm k · k0 , and completing `1 (Z, C) w.r.t. the latter yields a C ∗ -algebra C ∗ (Z) that is isomorphic to C ∗ (U ) ⊆ B(`2 (Z, C)), where U ∈ B(`2 (Z, C)) is the two-sided shift unitary. One also has C ∗ (Z) ∼= C(S 1 , C), thus the C ∗ -completion ‘adds’ the continuous functions with non-absolutely convergent Fourier series. 2

The following is a C ∗ -version of Proposition 19.5:

19.15 Proposition Let B be a commutative unital C ∗ -algebra and b ∈ B such that B =


C ∗ (1, b). Then the map b̂ : Ω(B) → σ(b) is a homeomorphism.
Proof. The proof is similar to that of Proposition 19.5, enriched by the following argument: If ϕ1 (b) = ϕ2 (b) then by Lemma 17.17 also ϕ1 (b∗ ) = ϕ2 (b∗ ), both being the complex conjugate of ϕ1 (b) = ϕ2 (b). Thus ϕ1 and ϕ2 coincide on all polynomials in b and b∗ , and therefore on B. □

Now we have another, perhaps more conceptual but certainly less elementary, proof of the
continuous functional calculus for normal elements of a C ∗ -algebra (Theorem 17.16):

19.16 Theorem Let A be a unital C ∗ -algebra and a ∈ A normal. Then


(i) There is a unique unital ∗-homomorphism αa : C(σ(a), C) → A such that αa (z) = a,
where z is the inclusion map σ(a) ,→ C. As in Section 17.1, we interpret αa (f ) as f (a).
(ii) If f ∈ C(σ(a), C) then σ(f (a)) = f (σ(a)).
(iii) If f ∈ C(σ(a), C) and g ∈ C(f (σ(a)), C) then (g ◦ f )(a) = g(f (a)).
Proof. (i) Let B = C ∗ (1, a) ⊆ A be the closed ∗-subalgebra generated by {1, a}. Since a is
normal, B is a commutative unital C ∗ -algebra, thus by Theorem 19.12, there is an isometric
∗-isomorphism π : B → C(Ω(B), C). And by Proposition 19.15 we have a homeomorphism â : Ω(B) → σ(a). Now we define αa to be the composite of the maps

C(σ(a), C) −→ C(Ω(B), C) −→ B ,→ A,

where the first map is the transpose ât : f 7→ f ◦ â and the second is π −1 . It is clear that αa is a unital ∗-homomorphism. If z : σ(a) ,→ C is the inclusion, then αa (z) = π −1 (z ◦ â) = π −1 (â) = a. Any continuous unital ∗-homomorphism α : C(σ(a), C) → B sending 1 to 1A and z to a coincides with αa on the polynomials in z and z̄. Since the latter are dense in C(σ(a), C) by Stone-Weierstrass, we have α = αa .
(ii) As used above, the C ∗ -subalgebra B = C ∗ (1, a) is abelian and there is an isometric ∗-isomorphism π : B → C(σ(a), C). By construction of the functional calculus we have π(f (a)) =
f ◦ ι, where ι is the inclusion map σ(a) ,→ C. Now, with Theorem 16.19 and Exercise 13.24 we
have
σA (f (a)) = σB (f (a)) = σC(σ(a)) (π(f (a))) = σC(σ(a)) (f ◦ ι) = f (σ(a)).

(iii) This is essentially obvious, since applying f to a and g to f (a) is just composition of
maps on the right hand side of the Gelfand isomorphism. 

It should be clear that Theorem 19.12 is of fundamental conceptual importance, but most
of its applications just use Theorem 19.16, which we proved in Section 17.3 in a more ele-
mentary fashion (without weak-∗ topology and Alaoglu’s theorem, and using only the classical
Weierstrass theorem). Genuine applications of Theorem 19.12 are harder to find. Here is one:

19.17 Exercise Let A be a unital C ∗ -algebra and let a, b ∈ A be commuting normal elements.
Prove that the absolute values (Defin. 17.11) satisfy |a + b| ≤ |a| + |b| and |ab| = |a| |b|.
Hint: Fuglede’s theorem.
(For A = B(H) one has results under weaker hypotheses. See e.g. [107].)

19.18 Remark If A is a commutative unital C ∗ -algebra generated by a1 , . . . , an ∈ A, Ex-


ercise 19.8 gives a homeomorphism Ω(A) ∼ = σ(a1 , . . . , an ). Combining this with the Gelfand
isomorphism, we see that A is isometrically ∗-isomorphic to C(σ(a1 , . . . , an ), C). 2

A Some more advanced topics from topology and


measure theory
A.1 Unordered infinite sums
If S is a finite set, A an abelian group and f : S → A a function, it is not hard to define Σ_{s∈S} f (s) (even though few textbook authors bother to do so explicitly). One chooses a bijection α : {1, 2, . . . , #S} → S and defines Σ_{s∈S} f (s) = f (α(1)) + · · · + f (α(#S)). The only slight difficulty is proving that the result does not depend on the choice of α.
In order to define infinite sums, we need a topology on A, and we restrict to the case of
functions f : S → V , where (V, k · k) is a normed space. Many authors of introductory texts
(for a nice exception see [163, Vol. I, Section 8.2]) consider only those countable sums known as
series, but for our purposes this is inadequate.

A.1 Definition Let S be a set, (V, k · k) a normed space and f : S → V a function. We say that Σ_{s∈S} f (s) exists or converges (or: f is summable over S) with sum x ∈ V if for every ε > 0 there is a finite subset T ⊆ S such that kx − Σ_{s∈U} f (s)k < ε holds whenever T ⊆ U ⊆ S with U finite.
In many cases, the above will be applied to V = F ∈ {R, C} and k · k = | · |.
This notion of summation has some useful properties:

A.2 Proposition Let S be a set, (V, k · k) a normed space over F and f, g : S → V functions.

(i) If Σ_{s∈S} f (s) exists then the sum x ∈ V is uniquely determined.

(ii) If Σ_{s∈S} f (s) = x and Σ_{s∈S} g(s) = y then Σ_{s∈S} (cf (s) + dg(s)) = cx + dy for all c, d ∈ F.

(iii) If f : S → [0, ∞) ⊆ R then Σ_{s∈S} f (s) exists if and only if sup{Σ_{t∈T} f (t) | T ⊆ S finite} < ∞, in which case the two expressions coincide. These equivalent conditions imply that the set {s ∈ S | f (s) 6= 0} is at most countable.

(iv) If (V, k · k) is complete and Σ_{s∈S} kf (s)k < ∞ then Σ_{s∈S} f (s) exists, and kΣ_{s∈S} f (s)k ≤ Σ_{s∈S} kf (s)k.

(v) If f : S → F ∈ {R, C} is such that Σ_{s∈S} f (s) exists then Σ_{s∈S} |f (s)| exists, i.e. is finite.
The proofs of (i) and (ii) are straightforward and similar to those for the analogous statements about series. The equivalence in (iii) follows from monotonicity of the map Pfin (S) → [0, ∞), T 7→ Σ_{t∈T} f (t). If Σ_{s∈S} |f (s)| < ∞ then it follows that for every ε > 0 there are at most finitely many s ∈ S such that |f (s)| ≥ ε. In particular, for every n ∈ N the set Sn = {s ∈ S | |f (s)| ≥ 1/n} is finite. Since a countable union of finite sets is countable, we have countability of {s ∈ S | f (s) 6= 0} = ∪_{n=1}^{∞} Sn . The proof of (iv) combines the argument in Proposition 3.15(iii) with Lemma A.14 below.
Statement (v) may be surprising at first sight since the analogous statement for series is false. Roughly, the reason is that our definition of Σ_{s∈S} f (s) imposes no ordering on S, while, by a classical result of Riemann, the sum of a convergent series Σ_{n=1}^{∞} f (n) is invariant under reordering of the terms only if the series converges absolutely. The rigorous proof of (v), found e.g. in [12] or [108, Proposition 5.1.28], does not appeal to Riemann’s result, but uses similar ideas. For S = N this is Proposition 3.16, the proof for general S being similar.
In discussing the spaces `p (S, F), the following (easy special case of Lebesgue’s dominated
convergence theorem) is useful:

A.3 Proposition (Discrete case of dominated convergence theorem) Let S be a set and {fn }n∈N functions S → C. Assume that

1. For each s ∈ S, the limit lim_{n→∞} fn (s) exists. Define h : S → C, s 7→ lim_{n→∞} fn (s).

2. There exists a function g : S → [0, ∞) such that Σ_{s∈S} g(s) < ∞ and |fn (s)| ≤ g(s) for all n ∈ N, s ∈ S.

Then

(i) Σ_{s∈S} fn (s) converges for each n ∈ N. So does Σ_{s∈S} h(s).

(ii) lim_{n→∞} Σ_{s∈S} fn (s) = Σ_{s∈S} h(s). (Thus limit and summation can be interchanged.)
Proof. (i) Assumption 1. gives |h(s)| ≤ g(s) ∀s. Now assumption 2. implies convergence of Σ_{s} h(s) and of Σ_{s} fn (s) for all n.
(ii) Let ε > 0. Since Σ_{s} g(s) < ∞, there is a finite subset T ⊆ S such that Σ_{s∈S\T} g(s) < ε/4. For each t ∈ T there is an nt ∈ N such that n ≥ nt ⇒ |fn (t) − h(t)| < ε/(2#T ). Put n0 = max_{t∈T} nt . If n ≥ n0 then

|Σ_{s∈S} fn (s) − Σ_{s∈S} h(s)| ≤ |Σ_{s∈T} fn (s) − Σ_{s∈T} h(s)| + |Σ_{s∈S\T} fn (s) − Σ_{s∈S\T} h(s)|.

The first term on the r.h.s. is bounded by

Σ_{s∈T} |fn (s) − h(s)| ≤ #T · ε/(2#T ) = ε/2

due to the definition of n0 and n ≥ n0 ≥ nt . And the second is bounded by

Σ_{s∈S\T} (|fn (s)| + |h(s)|) ≤ 2 Σ_{s∈S\T} g(s) ≤ ε/2,

where we used that g dominates |fn | and |h|, as well as the choice of T . Putting the two estimates together gives n ≥ n0 ⇒ |Σ_{s∈S} fn (s) − Σ_{s∈S} h(s)| ≤ ε, completing the proof. □
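The interchange of limit and sum is easy to watch numerically. The following sketch (my own choice of fn, h and g, with N truncated at a point where the tail is negligible) verifies the hypotheses and the conclusion of Proposition A.3.

```python
# Numeric check of Proposition A.3 with S = N (truncated) and
# f_n(s) = (1 + 1/n) * 2^{-s}: the f_n are dominated by g(s) = 2 * 2^{-s},
# sum(g) = 4 < infinity, and lim_n sum_s f_n(s) = sum_s lim_n f_n(s).
S = range(60)                                   # the tail beyond 60 is < 2**-59
f = lambda n, s: (1 + 1 / n) * 0.5 ** s
h = lambda s: 0.5 ** s                          # pointwise limit of f_n
g = lambda s: 2 * 0.5 ** s                      # dominating summable function

assert all(f(n, s) <= g(s) for n in range(1, 50) for s in S)
limit_of_sums = sum(f(10_000, s) for s in S)    # sum_s f_n(s) at a large n
sum_of_limits = sum(h(s) for s in S)
assert abs(limit_of_sums - sum_of_limits) < 1e-3
```
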

A.2 More on unconditional convergence of series
In the case S = N and f (n) = xn , there is a connection between the summability of Definition
A.1 and the notion of unconditionally convergent series:

A.4 Theorem Let (V, k · k) be a Banach space and {xn }n∈N ⊂ V . Then the following are
equivalent:
(i) Σ_{n∈N} xn exists in the sense of Definition A.1.

(ii) Σ_{n=1}^{∞} xn is unconditionally convergent, the sums of all rearrangements being equal.

(iii) Σ_{n=1}^{∞} xn is unconditionally convergent.

(iv) For every ε > 0 there exists a finite S ⊆ N such that for every finite subset T ⊂ N\S we have kΣ_{t∈T} xt k < ε.

(v) lim_{N→∞} sup_{ϕ∈(V ∗ )≤1} Σ_{k=N+1}^{∞} |ϕ(xk )| = 0.

(vi) Σ_{n=1}^{∞} cn xn is convergent for all bounded sequences {cn } in F. (The convergence then is unconditional.)

(vii) Σ_{k=1}^{∞} x_{nk} converges for all n1 < n2 < · · · . (Subseries convergence)
P
Proof. (i)⇒(ii) Let Σ_{n∈N} xn = x. If now σ is any permutation of N and ε > 0 then by assumption there is a finite subset T ⊆ N such that kx − Σ_{n∈U} xn k < ε for every finite U ⊂ N containing T . Since σ : N → N is a bijection, there exists n0 such that n ≥ n0 implies T ⊆ {σ(1), . . . , σ(n)}. (We can take n0 = max{σ −1 (k) | k ∈ T }.) Thus kx − Σ_{k=1}^{n} x_{σ(k)} k < ε. This proves that all rearranged sums Σ_{k=1}^{∞} x_{σ(k)} converge to x.
(ii)⇒(iii) Trivial.
(iii)⇒(iv) Assume (iv) does not hold. By elementary logic this means that there is an ε > 0 such that for every finite S ⊆ N there exists a finite T ⊆ N\S with kΣ_{t∈T} xt k ≥ ε. Using this we can construct a sequence S1 , S2 , . . . of mutually disjoint finite subsets Sk ⊆ N such that kΣ_{s∈Sk} xs k ≥ ε for each k. Now we can find a permutation σ of N and a sequence n1 < n2 < · · · such that Sk = σ({nk , nk + 1, . . . , nk + #Sk − 1}) for all k. Thus for each k we have kΣ_{n=nk}^{nk +#Sk −1} x_{σ(n)} k = kΣ_{s∈Sk} xs k ≥ ε, so that the series Σ_{n=1}^{∞} x_{σ(n)} is not Cauchy99 , thus divergent, contradicting (iii).
thus divergent, contradicting (iii).


(iv)⇒(i) If ε > 0, let S be as in (iv). If now T, U ⊆ N are finite sets containing S then
‖Σ_{t∈T} x_t − Σ_{u∈U} x_u‖ ≤ ‖Σ_{t∈T\S} x_t‖ + ‖Σ_{u∈U\S} x_u‖ ≤ 2ε. Thus P_fin(N) → V, T ↦ Σ_{t∈T} x_t
is a Cauchy net (Definition A.13) and therefore converges by Lemma A.14.

(iv)⇒(v) Let ε > 0 and let S ⊆ N be as provided by assumption (iv). Put
N = max(S). It then follows that ‖Σ_{t∈T} x_t‖ < ε for each finite T ⊆ N with min(T) > N. If
ϕ ∈ V* and L ≥ K > N, put

F⁺ = {n ∈ N | K ≤ n ≤ L, Re ϕ(x_n) ≥ 0},
F⁻ = {n ∈ N | K ≤ n ≤ L, Re ϕ(x_n) < 0}.

Clearly min(F⁺) ≥ K > N, so that ‖Σ_{n∈F⁺} x_n‖ < ε. Now for every ϕ ∈ V*_{≤1} we have

Σ_{n∈F⁺} |Re ϕ(x_n)| = Σ_{n∈F⁺} Re ϕ(x_n) = Re ϕ(Σ_{n∈F⁺} x_n) ≤ |ϕ(Σ_{n∈F⁺} x_n)| ≤ ‖ϕ‖ ‖Σ_{n∈F⁺} x_n‖ < ε.

⁹⁹ A series Σ_{n=1}^∞ x_n is Cauchy if the sequence {S_n} of partial sums is Cauchy.

Essentially the same argument holds for F⁻, so that Σ_{n=K}^L |Re ϕ(x_n)| < 2ε. If F = C, a similar
argument gives Σ_{n=K}^L |Im ϕ(x_n)| < 2ε. With |z| ≤ |Re z| + |Im z| we conclude Σ_{n=K}^L |ϕ(x_n)| <
4ε. Taking L → ∞ gives Σ_{n=K}^∞ |ϕ(x_n)| ≤ 4ε. Since this holds for all ϕ ∈ V*_{≤1}, (v) follows.
(v)⇒(vi) We may clearly assume |c_n| ≤ 1 for all n. Proposition 9.9(i) implies for every y ∈ V
that ‖y‖ = sup_{ϕ∈V*_{≤1}} |ϕ(y)|. Thus for M > N we have

‖Σ_{k=N+1}^M c_k x_k‖ = sup_{ϕ∈V*_{≤1}} |ϕ(Σ_{k=N+1}^M c_k x_k)| = sup_{ϕ∈V*_{≤1}} |Σ_{k=N+1}^M c_k ϕ(x_k)|
≤ sup_{ϕ∈V*_{≤1}} Σ_{k=N+1}^M |c_k| |ϕ(x_k)| ≤ sup_{ϕ∈V*_{≤1}} Σ_{k=N+1}^M |ϕ(x_k)| ≤ sup_{ϕ∈V*_{≤1}} Σ_{k=N+1}^∞ |ϕ(x_k)|.

Since the rightmost expression tends to zero as N → ∞ by (v), it follows that the series
Σ_{k=1}^∞ c_k x_k is Cauchy, thus convergent.
By the same proof, Σ_{n=1}^∞ c_n d_n x_n converges for all choices of {d_n} in {0, 1}^N. Thus the
convergence of Σ_{n=1}^∞ c_n x_n is unconditional by (vi)⇒(iii).
(vi)⇒(vii) This follows by rewriting Σ_{k=1}^∞ x_{n_k} as Σ_{n=1}^∞ c_n x_n, where c_n ∈ {0, 1}.
(vii)⇒(iv) Assume (iv) does not hold. Arguing as in the proof of (iii)⇒(iv) we obtain an infinite
sequence {S_k} of mutually disjoint finite subsets of N such that ‖Σ_{n∈S_k} x_n‖ ≥ ε for all k. It
is clear that we can choose the S_k in such a way that max(S_k) < min(S_{k+1}) for all k. Put
S = ∪_k S_k, let n_1 < n_2 < · · · enumerate S, and put N_k = #(S_1 ∪ · · · ∪ S_{k−1}). Then
‖Σ_{j=N_k+1}^{N_{k+1}} x_{n_j}‖ = ‖Σ_{n∈S_k} x_n‖ ≥ ε for each k.
Thus the subseries Σ_{j=1}^∞ x_{n_j} diverges since it is not Cauchy, contradicting (vii). □
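The dichotomy of Theorem A.4 can be made concrete in the scalar case V = F = R. The following numerical sketch (plain Python; all names are ours, not part of the notes) shows that the alternating harmonic series converges, while its subseries of positive terms diverges, so criterion (vii), and hence unconditional convergence, fails:

```python
from math import log

# Illustration of Theorem A.4 in V = R: x_n = (-1)^(n+1)/n converges
# (to log 2), but the subseries criterion (vii) fails, since the subseries
# over the odd indices (all positive terms) grows without bound. Hence the
# series converges only conditionally, not unconditionally.
def x(n):
    return (-1) ** (n + 1) / n

N = 100_000
full = sum(x(n) for n in range(1, N + 1))              # close to log(2)
odd_subseries = sum(x(n) for n in range(1, N + 1, 2))  # 1 + 1/3 + 1/5 + ...

print(full, log(2))   # partial sum of the full series vs. its limit
print(odd_subseries)  # already > 6, and unbounded as N grows
```

By Riemann's rearrangement theorem this is no accident: in finite dimensions, unconditional and absolute convergence coincide.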

A.5 Remark 1. Statement (vi) expresses more clearly than all the others that unconditional
convergence of a series does not rely on ‘cancellations’ between the summands, so that the
convergence is not affected by reordering or omission of any number of summands.
2. Our proof of the hardest implication (iv)⇒(vi) was found in [74]. It is perhaps the most
elementary one, using only Hahn-Banach. There are many other interesting proofs, see e.g.
[102, 4.2.6-4.2.8], [77, Vol. 1, Proposition 4.1.5], [97, Vol. 1, Theorem II.7].
3. If the series Σ_{n=1}^∞ x_n in V is unconditionally convergent and ϕ ∈ V* then Σ_{n=1}^∞ ϕ(x_n)
converges unconditionally, thus absolutely. Thus Σ_{n=1}^∞ |ϕ(x_n)| < ∞ ∀ϕ ∈ V*. A series with
this property is called weakly unconditionally Cauchy (WUC). But the WUC property does
not even imply conditional convergence, as is illustrated by the series Σ_{n=1}^∞ δ_n in c_0, which is easily
shown to be WUC using c_0* ≅ ℓ¹. This is essentially the only counterexample: Bessaga and
Pelczyński¹⁰⁰ proved that every WUC series in a Banach space V converges unconditionally
if and only if V has no subspace isomorphic to c_0. Cf. e.g. [98, Theorem 2.e.4], [1, Theorem
2.4.11]. (This is similar to Rosenthal's ℓ¹-theorem mentioned in footnote 62, but much easier.)
4. While the WUC property of a series is weaker than unconditional convergence, weak-
topology characterizations of unconditional convergence do exist. One of them is statement
(v), being a uniform (in ϕ ∈ V*_{≤1}) version of the statement lim_{N→∞} Σ_{n=N+1}^∞ |ϕ(x_n)| = 0 that
clearly follows from WUC. On the other hand, it is clear that unconditional convergence implies
weak convergence of all subseries. The latter property of a series is somewhat stronger than
being WUC, and indeed by the Orlicz¹⁰¹-Pettis theorem, cf. e.g. [1, Theorem 2.4.14], [97, Vol.
1, Theorem II.3], we can add the following to the list in Theorem A.4:
(viii): Every subseries Σ_{i=1}^∞ x_{n_i}, where n_1 < n_2 < · · ·, converges weakly. 2
100
Czeslaw Bessaga (1932-2021), Aleksander Pelczyński (1932-2012). Polish functional analysts. Both were students
of S. Mazur.
101
Wladyslaw Orlicz (1903-1990). Polish functional analyst and topologist. Also known for O. spaces.

A.6 Exercise Use Theorem A.4 to give a high-brow proof of Proposition 3.16.

A.3 Nets
The Definition A.1 of unordered sums is an instance of a much more general notion, the con-
vergence of nets.

A.7 Definition A directed set is a set I equipped with a binary relation ≤ on I satisfying
1. a ≤ a for each a ∈ I (reflexivity).
2. If a ≤ b and b ≤ c for a, b, c ∈ I then a ≤ c (transitivity).
3. For any a, b ∈ I there exists a c ∈ I such that a ≤ c and b ≤ c (directedness).

A.8 Remark If only 1. and 2. hold, (I, ≤) is called a pre-ordered set. Some authors, as e.g.
[101], require in addition that a ≤ b and b ≤ a together imply a = b (antisymmetry). Recall
that a pre-ordered set with this property is called partially ordered. But the antisymmetry is
an unnatural assumption in this context and is never used. 2

A.9 Example 1. Every totally ordered set (X, ≤) is a directed set. Only the directedness
needs to be shown, and it follows by taking c = max(a, b). In particular N is a directed set with
its natural total ordering.
2. If S is a set then the power set I = P (S) with its natural partial ordering is directed: For
the directedness, put c = a ∪ b. The same works for the set Pfin (S) of finite subsets of S, which
appeared in the definition of unordered sums.
3. If (X, τ ) is a topological space and x ∈ X, let Ux be the set of open neighborhoods of x.
Now for U, V ∈ Ux , define U ≤ V ⇔ U ⊆ V , thus we take the reversed ordering. Then (Ux , ≤)
is directed with c = a ∩ b.

A.10 Definition If X is a set, a net¹⁰² in X is a map I → X, ι ↦ x_ι, where (I, ≤) is a
directed set.
If (X, τ) is a topological space, a net {x_ι}_{ι∈I} in X converges to z ∈ X if for every open
neighborhood U of z there is an ι_0 ∈ I such that ι ≥ ι_0 ⇒ x_ι ∈ U.
When this holds, we write xι → z or limι∈I xι = z. (The second notation should only be
used if X is Hausdorff, since this property is equivalent to uniqueness of limits of nets.)

A.11 Remark 1. With I = N and ≤ the natural total ordering, a net indexed by I is just a
sequence, and this net converges if and only if the sequence does.
2. Unordered summation is a special case of a net limit: If S is any set, let I be the set of
finite subsets of S and let ≤ be the ordinary (partial) ordering of subsets of S. If T, U ∈ I let
V = T ∪ U . Clearly T ≤ V, U ≤ V , showing that (I, ≤) is a directed set. (This is the same as
Example A.9.2, except that now we only look at finite subsets of S.) Now given f : S → F, for
every T ∈ I, thus every finite T ⊆ S, we can clearly define Σ_{t∈T} f(t). Now

Σ_{s∈S} f(s) = lim_{T∈I} Σ_{t∈T} f(t),

where the sum exists if and only if the limit exists. 2
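The net limit in Remark A.11.2 can be watched numerically. In the sketch below (plain Python, names ours) we take S = N₀ and f(n) = 2^{−n}; any sufficiently large finite subset T gives a partial sum within ε of the limit, and the order in which the elements of T are added is irrelevant:

```python
import random

# Unordered summation as a net limit (Remark A.11.2): S = {0,1,2,...},
# f(n) = 2^-n, with unordered sum 2. The partial sum over a finite subset T
# does not depend on the order in which T is enumerated.
def f(n):
    return 2.0 ** (-n)

total = 2.0                  # value of the unordered sum over all of S

T = list(range(50))          # a "large enough" finite subset T of S
random.shuffle(T)            # enumeration order is irrelevant
partial = sum(f(t) for t in T)
print(abs(total - partial))  # difference is the tail 2^-49
```

Every finite subset containing T = {0, …, 49} gives an even better approximation, which is exactly the directedness used in Definition A.1.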

102
Nets were invented by the American mathematicians Eliakim Hastings Moore (1862-1932) and his student Herman
L. Smith (1892-1950). Moore made many contributions to many areas of mathematics.

Why nets? The reason is that sequences are totally inadequate for the study of topological
spaces that do not satisfy the first countability axiom.¹⁰³ Given a metric space X and a subset
Y ⊆ X, one proves that x ∈ Ȳ if and only if there is a sequence {y_n} in Y converging to x, but
for general topological spaces this is false. Similarly, the statement that a function f : X → Y
is continuous at x ∈ X if and only if f (xn ) → f (x) for every sequence {xn } converging to x is
true for metric spaces, but false in general! (It is instructive to work out counterexamples.)
On the other hand:

A.12 Proposition 1. Let X be a topological space and Y ⊆ X. If {y_ι} is a net in Y that
converges to x ∈ X then x ∈ Ȳ.
2. Let X be a topological space and Y ⊆ X. Then for every x ∈ Ȳ there exists a net {y_ι}
in Y such that y_ι → x.
3. A topological space X is Hausdorff if and only if there exists no net {xι } in X that
converges to two different points of X.
4. If X, Y are topological spaces, f : X → Y a function, and x ∈ X, then f is continuous at
x if and only if f (xι ) → f (x) for every net {xι } in X converging to x.
For proofs see any decent book on topology or [108].
If (X, d) is a metric space, the problems with sequences mentioned above do not arise.
Nevertheless, there are situations where the use of nets in X is useful, as in the proofs of
Theorems 5.42 and 5.45, where we considered nets indexed by the finite subsets of an ONB E.
In this case one wants:

A.13 Definition A net {x_ι}, indexed by a directed set (I, ≤), in a metric space (X, d) is a
Cauchy net if for every ε > 0 there is an ι_0 ∈ I such that ι, ι′ ≥ ι_0 ⇒ d(x_ι, x_{ι′}) < ε.

A.14 Lemma In a complete metric space every Cauchy net converges.


Proof. Let {x_ι}_{ι∈I} be Cauchy. Then for every n ∈ N there is an ι_n ∈ I such that ι, ι′ ≥ ι_n ⇒
d(x_ι, x_{ι′}) < 1/n. We can arrange that in addition ι_1 ≤ ι_2 ≤ · · · (using directedness to replace
ι_2 by some ι′_2 larger than ι_1 and ι_2, etc.). Put y_n = x_{ι_n} for n ∈ N. If ε > 0 and n_0 ∈ N is such that
1/n_0 < ε then for n, m ≥ n_0 we have ι_m, ι_n ≥ ι_{n_0}, so that d(y_n, y_m) = d(x_{ι_n}, x_{ι_m}) < 1/n_0 < ε.
Thus {y_n}_{n∈N} is a Cauchy sequence, which by completeness of (X, d) converges to some y ∈ X
with d(y, y_n) ≤ 1/n ∀n. If now ε > 0, pick n ∈ N with 1/n < ε/2. Then for all ι ≥ ι_n we have
d(x_ι, y) ≤ d(x_ι, x_{ι_n}) + d(x_{ι_n}, y) < 1/n + 1/n < ε, thus x_ι → y. □

A.15 Remark In the above proof we cannot argue by saying that there is a sequence ι_1 ≤ ι_2 ≤
· · · such that for every ι ∈ I there is an n with ι ≤ ι_n. If this were true, we could have replaced
the net by a sequence in the first place! 2

A.4 Reminder of the choice axioms and Zorn’s lemma


A.16 Definition The Axiom of Choice (AC) is any of the following statements, which are
easily shown to be equivalent:
• If f : X → Y is a surjective function then there exists a function g : Y → X such that
f ◦ g = idY .
103
Unfortunately many introductory books and courses sweep this problem under the rug and don’t even mention
nets (or their alternatives like filters).

• If X is a set, there exists a function s : P(X)\{∅} → X such that s(Y) ∈ Y for each
Y ∈ P(X)\{∅}, i.e. each Y with ∅ ≠ Y ⊆ X.
• If {X_i}_{i∈I} is a family of non-empty sets then Π_{i∈I} X_i ≠ ∅. Concretely, there exists a map
f : I → ∪_{j∈I} X_j such that f(i) ∈ X_i ∀i ∈ I.

A.17 Definition Let (X, ≤) be a partially ordered set. Then


• m ∈ X is called a maximal element if y ∈ X, y ≥ m implies y = m.
• u ∈ X is called an upper bound for Y ⊆ X if x ≤ u holds for each x ∈ Y. If u ∈ Y then it
is called the largest element of Y (which is unique).

A.18 Theorem Given the Zermelo-Fraenkel axioms of set theory, the Axiom of Choice is equiv-
alent to Zorn's lemma, which says: If (X, ≤) is a non-empty partially ordered set such that
every totally ordered subset Y ⊆ X has an upper bound then X has a maximal element.

A.19 Definition The Axiom of Countable Choice (ACω ) is the first (or third) of the above
versions of AC with the restriction that Y (respectively I) be at most countable.
Many iterative constructions, as in the proof of Urysohn’s lemma or of Lemma 7.3, require
countably many choices where, however, the n-th choice must take into account the preceding
ones. For this we need an axiom that is stronger than ACω :

A.20 Definition The Axiom of Countable Dependent Choice (DC_ω) is the following: If X is
a non-empty set and R ⊆ X × X is such that for every x ∈ X there is a y ∈ X with (x, y) ∈ R,
then there is a sequence {x_n}_{n∈N} in X such that (x_n, x_{n+1}) ∈ R for all n ∈ N.

A.21 Remark 1. Like the other choice axioms, DCω is often used without any comment.
2. It is easy to prove AC ⇒ DCω ⇒ ACω . The converse implications have been proven
false by constructing models of ZF set theory satisfying, say, ACω but not DCω . 2

A.5 Baire’s theorem


To provide context, we begin with some simple considerations. If Y1 , Y2 are dense subsets of a
topological space X, it does not follow that Y1 ∩ Y2 is dense: Consider X = R and the dense
sets Y1 = Q, Y2 = R\Q, for which Y1 ∩ Y2 = ∅. But we have:

A.22 Lemma Let (X, τ) be a topological space with X ≠ ∅.

(i) Y ⊆ X is dense ⇔ X\Y has empty interior ⇔ Y ∩ W ≠ ∅ whenever ∅ ≠ W ∈ τ.
(ii) If Y_1, Y_2 ⊆ X are dense and Y_1 is open then Y_1 ∩ Y_2 is dense.
(iii) If Y_1, . . . , Y_n ⊆ X are dense open subsets then Y_1 ∩ · · · ∩ Y_n is dense (and open).
Proof. (i) Easy exercise. (ii) Let W ⊆ X be open and non-empty. Then by (i) and den-
sity of Y_1 the open set W′ = W ∩ Y_1 is non-empty. By (i) and density of Y_2 we have
W ∩ Y_1 ∩ Y_2 = W′ ∩ Y_2 ≠ ∅, so that using (i) once more we conclude that Y_1 ∩ Y_2 is dense. (iii)
Follows from (ii) by induction. □

Saying something about infinite intersections requires more work and assumptions, as in:

A.23 Theorem (Baire)¹⁰⁴ Let (X, d) be a complete metric space and {U_n}_{n∈N} a countable
family of dense open subsets. Then ∩_{n=1}^∞ U_n is dense in X.
Proof. Let W ⊆ X be open and non-empty. Since U_1 is dense, W ∩ U_1 ≠ ∅ by Lemma A.22, so we
can pick x_1 ∈ W ∩ U_1. Since W ∩ U_1 is open, we can choose ε_1 ∈ (0, 1) such that the closed ball
B̄(x_1, ε_1) is contained in W ∩ U_1. Since U_2 is dense, U_2 ∩ B(x_1, ε_1) ≠ ∅ and we pick x_2 ∈ U_2 ∩ B(x_1, ε_1).
By openness, we can pick ε_2 ∈ (0, 1/2) such that B̄(x_2, ε_2) ⊆ U_2 ∩ B(x_1, ε_1). Continuing this
iteratively, we find points x_n and ε_n ∈ (0, 1/n) such that B̄(x_n, ε_n) ⊆ U_n ∩ B(x_{n−1}, ε_{n−1}) ∀n. If
i > n and j > n we have by construction that x_i, x_j ∈ B(x_n, ε_n) and thus d(x_i, x_j) ≤ 2ε_n < 2/n.
Thus {x_n} is a Cauchy sequence, and by completeness of (X, d) it converges to some z ∈ X.
Since m > k implies x_m ∈ B(x_k, ε_k), the limit z is contained in B̄(x_k, ε_k) for each k, thus

z ∈ ∩_k B̄(x_k, ε_k) ⊆ W ∩ ∩_k U_k,

so that W ∩ ∩_k U_k is non-empty. Since W was an arbitrary non-empty open set, Lemma A.22
gives that ∩_k U_k is dense. □

The following dual and equivalent reformulation is also useful:

A.24 Corollary Let (X, d) be a complete metric space and {C_n}_{n∈N} a countable family of
closed subsets with empty interior. Then ∪_{n=1}^∞ C_n has empty interior.

Proof. The sets U_n = X\C_n, n ∈ N, are open, and Ū_n = X\C_n° = X since the
interiors C_n° are empty. Thus the U_n are dense, so that by Baire's theorem ∩_n U_n = ∩_n(X\C_n)
is dense, i.e. its closure is all of X. Thus with X\Ȳ = (X\Y)° we have (∪_n C_n)° = (X\∩_n(X\C_n))° =
X minus the closure of ∩_n(X\C_n), which is X\X = ∅, i.e. ∪_n C_n has empty interior. □

A.25 Remark 1. There are many other ways of stating Baire’s theorem, but most of the
alternative versions introduce additional terminology (nowhere dense sets, meager sets, sets of
first or second category, etc.) that obscures the matter unnecessarily.
2. An intersection ∩_n U_n of a countable family {U_n}_{n∈N} of open sets is called a G_δ-set. (And
a countable union of closed sets is called an F_σ-set.)
3. The proof implicitly used the axiom DC_ω of countable dependent choice. (Making this
explicit is an instructive but tedious exercise.) Remarkably, the (Zermelo-Fraenkel) axioms of
set theory (without any choice axiom) combined with Baire's theorem imply DC_ω, cf. [17].
4. Some results customarily proven using Baire’s theorem can alternatively be proven with-
out it. But in most cases, such alternative proofs will also use the axiom DCω and therefore not
be better from a foundational (reverse mathematics) point of view. See also Remark 8.3.2. 2

A typical application of Baire’s theorem is the following (for a proof see, e.g., [108]):

A.26 Theorem There is a k · k∞ -dense Gδ -set F ⊆ C([0, 1], R) such that every f ∈ F is
nowhere differentiable.
Note that a single function f ∈ C([0, 1], R) that is nowhere differentiable can be written down
quite explicitly and constructively, for example f(x) = Σ_{n=1}^∞ 2^{−n} cos(2^n x). But for proving that
such functions are dense one needs Baire's theorem (or something related).
104
René-Louis Baire (1874-1932). French mathematician, proved this for Rn in his 1899 doctoral thesis. The gener-
alization is due to Hausdorff (1914).

A.6 On C(X, F)
We recall a few facts from general topology:

A.27 Definition A topological space (X, τ ) is called


• T1 if {x} ⊆ X is closed for each x ∈ X.
• T2 or Hausdorff if for any x, y ∈ X, x ≠ y, there are disjoint open U, V with x ∈ U, y ∈ V.
• T4 or normal if it is T1 and for any two disjoint closed sets C, D ⊆ X there are disjoint
open sets U, V ⊆ X with C ⊆ U, D ⊆ V .
It is immediate that T4 implies T1 and T2, and for T2 ⇒ T1 it suffices to fix x and pick for
each y ≠ x an open U_y with y ∈ U_y ∌ x. Then X\{x} = ∪_{y≠x} U_y is open.
One easily checks that for a T1 space, the following is equivalent to normality: Whenever
C ⊆ U with C closed and U open, there is an open V such that C ⊆ V ⊆ V̄ ⊆ U.

A.28 Proposition Every compact Hausdorff space is normal.

A.29 Proposition (Urysohn's Lemma) If X is a normal space and C, D ⊆ X are
disjoint closed sets, there exists f ∈ C(X, [0, 1]) such that f↾C = 0 and f↾D = 1.
For proofs see e.g. [142].

A.30 Lemma If X is a compact topological space (not necessarily Hausdorff), C(X, F) with
the norm ‖f‖ = sup_{x∈X} |f(x)| is a Banach space.
Proof. By compactness of X, every f ∈ C(X, F) is bounded, thus has kf k < ∞. That the norm
axioms are satisfied is easy enough. If {fn } ⊂ C(X, F) is a Cauchy sequence, so is {fn (x)} for
each x ∈ X. Since F is complete, g(x) = limn→∞ fn (x) exists for each x. Let ε > 0 and n0 be
such that n, m ≥ n0 implies supx∈X |fn (x) − fm (x)| = kfn − fm k < ε. Taking m → ∞ we obtain
‖f_n − g‖ = sup_{x∈X} |f_n(x) − g(x)| ≤ ε for all n ≥ n_0. Thus the convergence f_n(x) → g(x) is
uniform in x. This implies g ∈ C(X, F), as shown in topology. (If x ∈ X and ε > 0, pick n such
that ‖g − f_n‖ < ε/3. By continuity of f_n there is an open neighborhood U ⊆ X of x such that
|f_n(x) − f_n(y)| < ε/3 for all y ∈ U. Then for all y ∈ U we have

|g(x) − g(y)| ≤ |g(x) − f_n(x)| + |f_n(x) − f_n(y)| + |f_n(y) − g(y)| ≤ ε/3 + ε/3 + ε/3 = ε,

so that g is continuous at x. Since this works for all x ∈ X, g is continuous.) Thus every
Cauchy sequence in C(X, F) converges to an element of C(X, F), proving completeness. □
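The sup-norm convergence in Lemma A.30 can be visualized numerically. The sketch below (plain Python; the grid is our stand-in for X = [0, 1] and all names are illustrative) shows the Taylor partial sums of exp forming a Cauchy sequence in (C([0, 1], R), ‖·‖) whose uniform distance to the continuous limit g = exp shrinks:

```python
from math import exp, factorial

xs = [i / 1000 for i in range(1001)]  # grid approximation of X = [0, 1]

def f(n, x):
    # n-th Taylor partial sum of exp at x; each f(n, .) is in C([0,1], R)
    return sum(x ** k / factorial(k) for k in range(n + 1))

def sup_dist(u, v):
    # ||u - v||_infty, approximated by the maximum over the grid
    return max(abs(u(x) - v(x)) for x in xs)

err5 = sup_dist(lambda x: f(5, x), exp)
err10 = sup_dist(lambda x: f(10, x), exp)
print(err5, err10)  # the uniform error for n = 10 is far smaller
```

The uniform (not merely pointwise) smallness of ‖f_n − g‖ is exactly what the ε/3-argument in the proof exploits.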

A.6.1 Tietze’s extension theorem


While Urysohn’s lemma belongs to point set topology, the following result can be given [61] a
functional analytic twist:

A.31 Theorem (Tietze-Urysohn extension theorem)¹⁰⁵ Let (X, τ) be a normal (T4)
topological space, Y ⊆ X closed and f ∈ C_b(Y, R). Then there exists f̂ ∈ C_b(X, R) such that
f̂|Y = f and ‖f̂‖ = ‖f‖.
105
H. F. F. Tietze (1880-1964), Austrian mathematician. He proved this for metric spaces (for which Urysohn’s
lemma is a triviality). The generalization to normal spaces is due to Urysohn.

Proof. Let f ∈ C_b(Y, R), where we may assume ‖f‖ = 1, so that f(Y) ⊆ [−1, 1]. Let A =
f^{−1}([−1, −1/3]) and B = f^{−1}([1/3, 1]). Then A, B are disjoint closed subsets of Y, which are
also closed in X since Y is closed. Thus by Urysohn's Lemma, there is a g ∈ C(X, [−1/3, 1/3])
such that g↾A = −1/3 and g↾B = 1/3. Thus ‖g‖_X = 1/3, and with Tg = g|Y one easily checks
(do it!) that ‖Tg − f‖_Y ≤ 2/3. Now Lemma 7.3 is applicable with m = 1/3 and r = 2/3 and
gives the existence of f̂ ∈ C(X, R) with T f̂ = f and ‖f̂‖ = ‖f‖ (since m/(1 − r) = 1). □

The theorem is easily extended to C-valued functions.

A.6.2 Weierstrass’ theorem


The following fundamental theorem of Weierstrass106 (1885) has been proven in many ways. A
fairly standard proof due to E. Landau (1908) involves convolution of f with a sequence {gn } of
functions that is a polynomial approximate unit, cf. e.g. [163, Vol. II, Section 3.8]. The following
proof, given in 1913 by S. Bernstein107 , has the advantage of using no integration.

A.32 Theorem Let f ∈ C([a, b], F) and ε > 0. Then there exists a polynomial P ∈ F[x] such
that |f (x) − P (x)| ≤ ε for all x ∈ [a, b]. (As always, F ∈ {R, C}.)
Proof. It clearly suffices to prove this for the interval [0, 1]. For n ∈ N and x ∈ [0, 1], define

P_n(x) = Σ_{k=0}^n f(k/n) C(n,k) x^k (1 − x)^{n−k},

where C(n,k) = n!/(k!(n−k)!) denotes the binomial coefficient.

Clearly P_n is a polynomial of degree at most n, called a Bernstein polynomial. In view of

1 = 1^n = (x + (1 − x))^n = Σ_{k=0}^n C(n,k) x^k (1 − x)^{n−k} (A.1)

we have

f(x) − P_n(x) = Σ_{k=0}^n (f(x) − f(k/n)) C(n,k) x^k (1 − x)^{n−k},

thus

|f(x) − P_n(x)| ≤ Σ_{k=0}^n |f(x) − f(k/n)| C(n,k) x^k (1 − x)^{n−k}. (A.2)

Since [0, 1] is compact and f : [0, 1] → F is continuous, it is bounded and uniformly continuous.
Thus there is M such that |f (x)| ≤ M for all x, and for each ε > 0 there is δ > 0 such that
|x − y| < δ ⇒ |f (x) − f (y)| < ε.
Let ε > 0 be given, and choose a corresponding δ > 0 as above. Let x ∈ [0, 1]. Define

A = {k ∈ {0, 1, . . . , n} | |k/n − x| < δ}.
106
Karl Theodor Wilhelm Weierstrass (1815-1897). German mathematician and one of the fathers of rigorous analysis.
107
Sergei Natanovich Bernstein (1880-1968). Russian/Soviet mathematician. Important contributions to approxima-
tion theory, probability, PDEs.

175
For all k we have |f(x) − f(k/n)| ≤ 2M, and for k ∈ A we have |f(x) − f(k/n)| < ε. Thus with
(A.2) we have

|f(x) − P_n(x)| ≤ ε Σ_{k∈A} C(n,k) x^k (1 − x)^{n−k} + 2M Σ_{k∈A^c} C(n,k) x^k (1 − x)^{n−k}
≤ ε + 2M Σ_{k∈A^c} C(n,k) x^k (1 − x)^{n−k}, (A.3)

where we used (A.1) again. In an exercise, we will prove the purely algebraic identity
Σ_{k=0}^n C(n,k) x^k (1 − x)^{n−k} (k − nx)² = nx(1 − x) (A.4)

for all n ∈ N_0 and x ∈ [0, 1] (in fact all x ∈ R). Now, k ∈ A^c is equivalent to |k/n − x| ≥ δ and
to (k − nx)² ≥ n²δ². Multiplying both sides of the latter inequality by C(n,k) x^k (1 − x)^{n−k} and
summing over k ∈ A^c, we have

n²δ² Σ_{k∈A^c} C(n,k) x^k (1 − x)^{n−k} ≤ Σ_{k∈A^c} C(n,k) x^k (1 − x)^{n−k} (k − nx)²
≤ Σ_{k=0}^n C(n,k) x^k (1 − x)^{n−k} (k − nx)² = nx(1 − x), (A.5)

where the last equality comes from (A.4). This implies


Σ_{k∈A^c} C(n,k) x^k (1 − x)^{n−k} ≤ nx(1 − x)/(n²δ²) ≤ 1/(nδ²), (A.6)

where we used the obvious inequality x(1 − x) ≤ 1 for x ∈ [0, 1]. Plugging (A.6) into (A.3)
we have |f(x) − P_n(x)| ≤ ε + 2M/(nδ²). This holds for all x ∈ [0, 1] since, by uniform continuity, δ
depends only on ε, not on x. Thus for n > 2M/(εδ²) we have |f(x) − P_n(x)| ≤ 2ε ∀x ∈ [0, 1] and are
done. □
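The constructive nature of Bernstein's proof invites a direct implementation. The following sketch (plain Python; function names are ours) computes P_n for the non-smooth function f(x) = |x − 1/2| and shows the uniform error decreasing with n:

```python
from math import comb

def bernstein(f, n, x):
    """P_n(x) = sum_{k=0}^n f(k/n) C(n,k) x^k (1-x)^(n-k), as in the proof."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)      # continuous but not differentiable at 1/2
xs = [i / 200 for i in range(201)]
err = {n: max(abs(f(x) - bernstein(f, n, x)) for x in xs)
       for n in (10, 40, 160)}
print(err)  # the uniform errors shrink as n grows
```

For this convex f the Bernstein polynomials even decrease monotonically towards f; the convergence is slow near the kink, matching the n > 2M/(εδ²) bound in the proof, which degrades as δ shrinks.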

A.33 Exercise Prove (A.4). Hint: Use basic properties of the binomial coefficients, or differ-
entiate (x + y)^n = Σ_{k=0}^n C(n,k) x^k y^{n−k} twice with respect to x and then put y = 1 − x.
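Before attempting the exercise, one can check (A.4) exactly on sample values; the sketch below (plain Python, names ours) uses rational arithmetic so that equality is tested without rounding error:

```python
from fractions import Fraction
from math import comb

# Exact spot-check of identity (A.4):
#   sum_{k=0}^n C(n,k) x^k (1-x)^(n-k) (k - nx)^2 = n x (1-x).
def lhs(n, x):
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * (k - n * x)**2
               for k in range(n + 1))

checks = [(n, Fraction(p, 7)) for n in (1, 2, 5, 13) for p in range(8)]
ok = all(lhs(n, x) == n * x * (1 - x) for n, x in checks)
print(ok)
```

Probabilistically, (A.4) is just the statement that a binomial random variable with parameters n and x has variance nx(1 − x).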
An immediate consequence of Theorem A.32 is the following:

A.34 Corollary There exists a sequence {p_n}_{n∈N} ⊆ R[x] of real polynomials that converges
uniformly on [0, 1] to the function x ↦ √x.
The above corollary can also be proven directly:

A.35 Exercise Define a sequence {p_n}_{n∈N_0} ⊆ R[x] of polynomials by p_0 = 0 and

p_{n+1}(x) = p_n(x) + (x − p_n(x)²)/2. (A.7)

Prove by induction that the following holds:

(i) p_n(x) ≤ √x for all n ∈ N_0, x ∈ [0, 1].
(ii) The sequence {p_n(x)} increases monotonically for each x ∈ [0, 1] and converges uniformly
to √x.
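The recursion (A.7) is easy to explore numerically. The following sketch (plain Python, names ours) iterates it pointwise and measures the uniform distance to √x on a grid:

```python
from math import sqrt

def p(n, x):
    # Evaluate p_n(x) via (A.7): p_0 = 0, p_{k+1} = p_k + (x - p_k^2)/2.
    v = 0.0
    for _ in range(n):
        v = v + (x - v * v) / 2
    return v

xs = [i / 100 for i in range(101)]
# By (i), p_n(x) <= sqrt(x), so the uniform error is the maximum of the gap.
sup_err = lambda n: max(sqrt(x) - p(n, x) for x in xs)
print(sup_err(10), sup_err(200))  # uniform error shrinking towards 0
```

This mirrors parts (i) and (ii) of the exercise: the iterates stay below √x and the sup-norm gap decreases to 0.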

A.6.3 The Stone-Weierstrass theorem
Theorem A.32 says that the polynomials, restricted to [0, 1], are uniformly dense in C([0, 1]).
Our aim is to generalize this, replacing [0, 1] by (locally) compact Hausdorff spaces.
In order to see what should take the place of the polynomials, notice that a polynomial on R is
a linear combination of powers x^n, and the latter can be seen as powers f^n (under pointwise
multiplication) of the identity function f = id_R. Thus the polynomials are the unital subalgebra
P ⊆ C(R, R) generated by the single element id_R. Now, if X is a topological space and F ∈
{R, C} then C(X, F) is a unital algebra, and we will consider subalgebras (not necessarily singly
generated) A ⊆ C(X, F). Since the continuous functions on a (locally) compact Hausdorff space separate
points, we clearly need to impose the following if we want to prove Ā = C(X, F):

A.36 Definition A subalgebra A ⊆ C(X, F) separates points if for any x, y ∈ X, x ≠ y, there
is an f ∈ A such that f(x) ≠ f(y).

A.37 Theorem (M. H. Stone 1937)¹⁰⁸ If X is compact Hausdorff and A ⊆ C(X, R) is a
unital subalgebra separating points then Ā = C(X, R).
Proof. Replacing A by its closure, we may assume that A is closed; the claim then is that
A = C(X, R). We proceed in several steps.
We first claim that f ∈ A implies |f| ∈ A. Since f is bounded due to compactness, it
clearly is enough to prove this under the assumption |f| ≤ 1. With the p_n of Corollary A.34,
we have (x ↦ p_n(f(x)²)) ∈ A since A is a unital algebra. Since p_n ∘ f² converges uniformly to
√(f²) = |f|, closedness of A implies |f| ∈ A. In view of

max(f, g) = (f + g + |f − g|)/2,   min(f, g) = (f + g − |f − g|)/2,

and the preceding result, we see that f, g ∈ A implies min(f, g), max(f, g) ∈ A. By induction,
this extends to pointwise minima/maxima of finite families of elements of A.
Now let f ∈ C(X, R). Our goal is to find f_ε ∈ A satisfying ‖f − f_ε‖ < ε for each ε > 0.
Since A is closed, this will give A = C(X, R).
If a ≠ b, the fact that A separates points gives us an h ∈ A such that h(a) ≠ h(b). Thus
the function h_{a,b}(x) = (h(x) − h(a))/(h(b) − h(a)) is in A, continuous, and satisfies h_{a,b}(a) = 0, h_{a,b}(b) = 1. Thus also
f_{a,b}(x) = f(a) + (f(b) − f(a)) h_{a,b}(x) is in A, and it satisfies f_{a,b}(a) = f(a) and f_{a,b}(b) = f(b).
This implies that the sets

U_{a,b,ε} = {x ∈ X | f_{a,b}(x) < f(x) + ε},   V_{a,b,ε} = {x ∈ X | f_{a,b}(x) > f(x) − ε}

are open neighborhoods of a and b, respectively, for every ε > 0. Thus, keeping b, ε fixed,
{U_{a,b,ε}}_{a∈X} is an open cover of X, and by compactness we find a finite subcover {U_{a_i,b,ε}}_{i=1}^n.
By the above preparation, the function f_{b,ε} = min(f_{a_1,b,ε}, . . . , f_{a_n,b,ε}) is in A. If x ∈ U_{a_i,b,ε}
then f_{b,ε}(x) ≤ f_{a_i,b,ε}(x) < f(x) + ε, and since {U_{a_i,b,ε}}_{i=1}^n covers X, we have
f_{b,ε}(x) < f(x) + ε ∀x. For all x ∈ V_{b,ε} := ∩_{i=1}^n V_{a_i,b,ε} we have f_{a_i,b,ε}(x) > f(x) − ε for all i, and therefore
f_{b,ε}(x) = min_i f_{a_i,b,ε}(x) > f(x) − ε. Now {V_{b,ε}}_{b∈X} is an open cover of X, and we find a finite
subcover {V_{b_j,ε}}_{j=1}^m. Then f_ε = max(f_{b_1,ε}, . . . , f_{b_m,ε}) is in A. Now f_ε(x) < f(x) + ε
holds everywhere, and for x ∈ V_{b_j,ε} we have f_ε(x) ≥ f_{b_j,ε}(x) > f(x) − ε. Since {V_{b_j,ε}}_j covers X,
we conclude that f_ε(x) ∈ (f(x) − ε, f(x) + ε) for all x, to wit ‖f − f_ε‖ < ε. □

108
Marshall Harvey Stone (1903-1989). American mathematician, mostly active in topology and (functional) analysis.

Since the polynomial ring R[x] is an algebra, and the polynomials clearly separate the points
of R, Theorem A.37 recovers Theorem A.32. (This is not circular if one has used Exercise A.35 to
prove Corollary A.34.) But we immediately have the higher dimensional generalization (which
can also be proven by more classical methods, like approximate units):

A.38 Theorem Let X ⊆ Rn be compact. Then the restrictions to X of the P ∈ R[x1 , . . . , xn ]


(considered as functions) are uniformly dense in C(X, R).
Having proven Theorem A.37, it is easy to generalize it to locally compact spaces and/or
subalgebras of C_{(0)}(X, C). Recall that a subset S of a ∗-algebra A is called self-adjoint if
S = S* := {s* | s ∈ S}.

A.39 Corollary If X is compact Hausdorff and A ⊆ C(X, C) is a self-adjoint unital subal-
gebra separating points then Ā = C(X, C).

Proof. Define B = A ∩ C(X, R). Let f ∈ A. Since f* ∈ A, we also have Re(f) = (f + f*)/2 ∈ B
and Im(f) = (f − f*)/(2i) = −Re(if) ∈ B. Thus A = B + iB. It is obvious that B ⊆ C(X, R)
is a unital subalgebra. If x ≠ y then there is f ∈ A such that f(x) ≠ f(y). Thus
Re(f)(x) ≠ Re(f)(y) or Re(if)(x) ≠ Re(if)(y) (or both). Since Re(f), Re(if) ∈ B, we see that
B separates points. Thus B̄ = C(X, R) by Theorem A.37, implying Ā ⊇ B̄ + iB̄ =
C(X, R) + iC(X, R) = C(X, C). □

A.40 Definition A subalgebra A ⊆ C_0(X, F) vanishes at no point if for every x ∈ X there is
an f ∈ A such that f(x) ≠ 0.

A.41 Corollary If X is locally compact Hausdorff and A ⊆ C_0(X, R) is a subalgebra sepa-
rating points and vanishing at no point then Ā = C_0(X, R).
Proof. Let X_∞ = X ∪ {∞} be the one-point compactification of X. Recall that every f ∈
C_0(X, R) extends to f̂ ∈ C(X_∞, R) with f̂(∞) = 0. Then B = {f̂ | f ∈ A} + R1 clearly is a
unital subalgebra of C(X_∞, R). We claim that B separates the points of X_∞. This is obvious
for x, y ∈ X, x ≠ y, since already A does that. Now let x ∈ X. Since A vanishes at no point,
there is f ∈ A such that f(x) ≠ 0. Let f̂ ∈ C(X_∞, R) be the extension with f̂(∞) = 0.
In view of f̂(x) = f(x) ≠ 0 = f̂(∞), we see that B also separates ∞ from the points of X, so that
Theorem A.37 gives B̄ = C(X_∞, R). If now g ∈ C_0(X, R) and ε > 0, there are thus f ∈ A and
c ∈ R with ‖ĝ − (f̂ + c1)‖ < ε; evaluating at ∞ gives |c| < ε, whence ‖g − f‖ < 2ε on X. Hence
Ā = C_0(X, R). □

A.42 Corollary If X is locally compact Hausdorff and A ⊆ C_0(X, C) is a self-adjoint subal-
gebra separating points and vanishing at no point then Ā = C_0(X, C).
Proof. The proof just combines the ideas of the proofs of Corollaries A.39 and A.41. 

A.6.4 The Arzelà-Ascoli theorem


Recall that a metric space (X, d) is totally bounded if for every ε > 0 there are x1 , . . . , xn ∈ X
such that X = B(x1 , ε) ∪ · · · ∪ B(xn , ε). And: A metric space is compact if and only if it is
complete and totally bounded, cf. e.g. [108]. The following will be needed on many occasions:

A.43 Exercise Let (X, d) be a metric space. Prove:


(i) If (X, d) is totally bounded and Y ⊆ X then (Y, d) is totally bounded.

(ii) If (Y, d) is totally bounded and Y ⊆ X is dense then (X, d) is totally bounded.
(iii) If (X, d) is complete and Y ⊆ X then (Y, d) is totally bounded if and only if Y is precompact.
If (X, τ) is a topological space and (Y, d) metric, the set C_b(X, Y) is topologized by the
metric

D(f, g) = sup_{x∈X} d(f(x), g(x)).

It is therefore natural to ask whether the (relative) compactness of a set F ⊆ Cb (X, Y ) can
be characterized in terms of the elements of F, which after all are functions f : X → Y .
This will be the subject of this section, but we will restrict ourselves to compact X, for which
C(X, Y ) = Cb (X, Y ).

A.44 Definition Let (X, τ) be a topological space and (Y, d) a metric space. A family F
of functions X → Y is called equicontinuous if for every x ∈ X and ε > 0 there is an open
neighborhood U ∋ x such that f ∈ F, x′ ∈ U ⇒ d(f(x), f(x′)) < ε. Then F ⊆ C(X, Y).
The point of course is that the choice of U depends only on x and ε, but not on f ∈ F.
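A standard non-example makes this concrete: on X = [0, 1] the family {x ↦ x^n : n ∈ N} fails equicontinuity at x = 1. The sketch below (plain Python, names ours) shows that however small the neighborhood (1 − δ, 1], some member of the family still varies by almost 1 across it, so no single neighborhood U works for all members at once:

```python
# Equicontinuity failure of F = {x -> x^n} at x = 1 on [0, 1]:
# sup over f in F of |f(1) - f(1 - delta)| stays near 1 as delta shrinks.
def variation_near_one(delta, n_max=2000):
    return max(1 - (1 - delta) ** n for n in range(1, n_max + 1))

for delta in (0.1, 0.01, 0.001):
    print(delta, variation_near_one(delta))
```

Each individual x ↦ x^n is of course continuous; it is the uniformity over the family that fails, which is why F is not precompact in (C([0, 1]), D), consistently with Theorem A.45.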

A.45 Theorem (Arzelà-Ascoli) 109 Let (X, τ ) be a compact topological space and (Y, d) a
complete metric space. Then F ⊆ C(X, Y ) is (pre)compact (w.r.t. the uniform topology τD ) if
and only if the following conditions are satisfied:
(i) {f (x) | f ∈ F} ⊆ Y is (pre)compact for every x ∈ X.
(ii) F is equicontinuous.
Proof. ⇒ If f, g ∈ C(X, Y) then d(f(x), g(x)) ≤ D(f, g) for every x ∈ X. This implies that the
evaluation map e_x : C(X, Y) → Y, f ↦ f(x) is continuous for every x. Thus if F̄ is compact, so
is e_x(F̄), which is therefore also closed. Since e_x(F̄) contains e_x(F) = {f(x) | f ∈ F}, the
closure of the latter is contained in the compact set e_x(F̄), proving (i).
To prove equicontinuity, let x ∈ X and ε > 0. Since F̄ is compact, F is totally bounded,
thus there are g_1, . . . , g_n ∈ F such that F ⊆ ∪_i B_D(g_i, ε). By continuity of the g_i, there are
open U_i ∋ x, i = 1, . . . , n, such that x′ ∈ U_i ⇒ d(g_i(x), g_i(x′)) < ε. Put U = ∩_i U_i. If now
f ∈ F, there is an i such that f ∈ B_D(g_i, ε), to wit D(f, g_i) < ε. Now for x′ ∈ U ⊆ U_i we have

d(f(x), f(x′)) ≤ d(f(x), g_i(x)) + d(g_i(x), g_i(x′)) + d(g_i(x′), f(x′)) < 3ε,

proving equicontinuity of F (at x, but x was arbitrary).
⇐ Following [72], we begin with a lemma:
A.46 Lemma Let (X, d) be a metric space. Assume that for each ε > 0 there are a δ > 0, a metric space (Y, d′) and a continuous map h : X → Y such that (h(X), d′) is totally bounded and such that d′(h(x), h(x′)) < δ implies d(x, x′) < ε. Then (X, d) is totally bounded.
Proof. For ε > 0, pick δ, (Y, d′), h as asserted. Since h(X) is totally bounded, there are y1, ..., yn ∈ h(X) such that h(X) ⊆ ⋃_i B(yi, δ) ⊆ Y. Then X = ⋃_i h⁻¹(B(yi, δ)). For each i choose xi ∈ X such that h(xi) = yi. Now x ∈ h⁻¹(B(yi, δ)) ⇒ d′(h(x), yi) < δ ⇒ d(x, xi) < ε, so that h⁻¹(B(yi, δ)) ⊆ B(xi, ε). Thus X = ⋃_{i=1}^n B(xi, ε), and (X, d) is totally bounded. □
109 Giulio Ascoli (1843-1896), Cesare Arzelà (1847-1912), Italian mathematicians. They proved special cases of this result, of which there also exist more general versions than the one above.
Let ε > 0. Since F is equicontinuous, for every x ∈ X there is an open neighborhood Ux such that f ∈ F, x′ ∈ Ux ⇒ d(f(x), f(x′)) < ε. Since X is compact, there are x1, ..., xn ∈ X such that X = ⋃_{i=1}^n Uxi. Now define h : F → Y^n, f ↦ (f(x1), ..., f(xn)). Then d̃((y1, ..., yn), (y′1, ..., y′n)) = ∑_i d(yi, y′i) is a product metric on Y^n making h continuous. By assumption the closure of {f(xi) | f ∈ F} in Y is compact for each i, thus h(F) is contained in a compact subset of Y^n, so that (h(F), d̃) is totally bounded. If now f, g ∈ F satisfy d̃(h(f), h(g)) < ε then d(f(xi), g(xi)) < ε for all i by definition of d̃. For every x ∈ X there is an i such that x ∈ Uxi, thus

d(f(x), g(x)) ≤ d(f(x), f(xi)) + d(f(xi), g(xi)) + d(g(xi), g(x)) < 3ε.

Since this holds for all x ∈ X, we have D(f, g) ≤ 3ε. Thus the assumptions of Lemma A.46 are satisfied, and we obtain total boundedness, thus precompactness, of F. □
A.47 Remark 1. If Y = Rn , as in most statements of the theorem, then in view of the Heine-
Borel theorem the requirement of precompactness of {f (x) | f ∈ F } for each x reduces to
that of boundedness, i.e. pointwise boundedness of F. One can also formulate the theorem in
terms of existence of uniformly convergent (or Cauchy) subsequences of bounded equicontinuous
sequences in C(X, Rn ).
2. We intentionally stated a more general version [72] of the theorem than needed in order to
argue that the result belongs to general topology rather than functional analysis. For Y = Rn
this is less clear, also since there are many alternative proofs of the theorem using various
methods from topology and functional analysis, cf. e.g. [111]. 2
A.6.5 Separability of C(X, R)
A.48 Proposition Let X be a compact Hausdorff space. Then the following are equivalent:
(i) X is second countable (⇔ metrizable).
(ii) The normed space (C(X, R), ‖·‖), where ‖·‖ = ‖·‖∞, is second countable (⇔ separable).
Proof. (i)⇒(ii) Let B = {U1 , U2 , . . . } be a countable base for the topology of X, and let
S = {(n, m) ∈ N² | Ūn ⊆ Um}.
For every (n, m) ∈ S use Urysohn's Lemma to find a function f(n,m) ∈ C(X, [0, 1]) such that f(n,m) ↾ Ūn = 0 and f(n,m) ↾ X\Um = 1. Let x, y ∈ X, x ≠ y. Since X\{y} is an open neighborhood of x and B is a base, there exists m ∈ N with x ∈ Um ⊆ X\{y}. By normality of X there exists an open V such that x ∈ V ⊆ V̄ ⊆ Um. Since B is a base there exists n ∈ N with x ∈ Un ⊆ V. Now Ūn ⊆ V̄ ⊆ Um, so that (n, m) ∈ S. Now f(n,m)(x) = 0 and f(n,m)(y) = 1, so that the family F1 = {f(n,m) | (n, m) ∈ S} ⊂ C(X, [0, 1]) separates the points of X.
Let F2 denote the set, clearly countable, of all finite products of elements of F1. Interpreting the empty product as the function 1, we have 1 ∈ F2. Then also the set F3 of finite linear combinations of elements of F2 with Q-coefficients is countable. Since the closure A of F3 (w.r.t. ‖·‖) contains the finite linear combinations of elements of F2 with coefficients in R, it is a unital R-algebra. Since already F1 separates the points of X, the same holds for A. Thus the Stone-Weierstrass Theorem A.37 gives C(X, R) = A, the closure of F3. Thus C(X, R) has F3 as a countable dense subset.
(ii)⇒(i) Since (C(X, R), ‖·‖) is metric, second countability and separability are equivalent. Let F ⊆ C(X, R) be a subset that is dense w.r.t. ‖·‖. Let x, y ∈ X, x ≠ y. We claim that there is an f ∈ F with f(x) ≠ f(y) (i.e. F separates the points of X). If this were false, the uniform density of F in C(X, R) would imply f(x) = f(y) for all f ∈ C(X, R), which however is false by Urysohn's lemma. The map

ιF : X → ∏_{f∈F} [inf f, sup f], x ↦ (f(x))_{f∈F}

is continuous by definition of the product topology, and it is injective since F separates points. Since X is compact and the product space Hausdorff, ιF is an embedding, i.e. ιF : X → ιF(X) is a homeomorphism. If now F is countable, the countable product ∏_{f∈F} [inf f, sup f] is second countable, hence so is its subspace ιF(X), which is homeomorphic to X. □
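For X = [0, 1] the density statement can be made tangible by a constructive route not used in the proof above: Bernstein polynomials converge uniformly to any f ∈ C([0, 1], R), and truncating their coefficients to Q then produces a countable dense set. A sketch (the test function and evaluation grid are our own choices):

```python
from math import comb

def bernstein(f, n, x):
    """Value at x of the n-th Bernstein polynomial of f on [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)  # continuous but not differentiable at 1/2

def sup_error(n, grid_size=400):
    """Approximate sup-norm distance between f and its n-th Bernstein polynomial."""
    grid = (i / grid_size for i in range(grid_size + 1))
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)

assert sup_error(100) < sup_error(10)  # uniform error decreases with n
assert sup_error(100) < 0.05           # already close in the sup-norm
```

The polynomials with rational coefficients form a countable set, and by the above they are ‖·‖∞-dense in C([0, 1], R), in line with Proposition A.48.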
A.49 Remark 1. If X is locally compact Hausdorff, it is most natural to consider C0(X, F). With the one-point (Alexandrov) compactification X∞, one easily proves a Banach space isomorphism C(X∞, F) ≅ C0(X, F) ⊕ F, so that C0(X, F) is separable if and only if X∞ is second countable. Second countability of X∞ implies that of X. The converse is also true, but is more work since it involves proving that ∞ ∈ X∞ has a countable open neighborhood base. This is
equivalent to X being hemicompact, i.e. there is a family {Kn ⊆ X}n∈N of compact sets such
that every compact K ⊆ X is contained in some Kn . One can show that for a locally compact
Hausdorff space, hemicompactness is equivalent to second countability, cf. e.g. [108]. Thus also
for locally compact X one has that C0 (X, F) is separable if and only if X is second countable.
2. For non-compact X, one can also study Cb (X, F). At least if X is completely regular,
it turns out that Cb (X, F) is never separable for non-compact X. (For compact X we have
Cb (X, F) = C(X, F) and are back in the situation of Proposition A.48.) 2
A.6.6 ? The Stone-Čech compactification
If X is a topological space, a compactification of X is a compact space X̂ together with a continuous map ι : X → X̂ such that ι(X) ⊆ X̂ is a dense subset and ι : X → ι(X) is a homeomorphism.
You probably know the one-point or Alexandrov compactification of a topological space.
(Usually it is considered only for locally compact spaces.) It is the smallest possible compacti-
fication in that it just adds one point.
But for many purposes, another compactification is more important, the Stone-Čech com-
pactification. It is defined for spaces that have the following property:
A.50 Definition A topological space X is completely regular or T3.5 if it is T1 and for every closed C ⊆ X and y ∈ X\C there exists f ∈ C(X, [0, 1]) such that f ↾ C = 0 and f(y) = 1.
All subspaces of a completely regular space are completely regular. By Urysohn’s lemma,
every normal space is completely regular, in particular every metrizable and every compact
Hausdorff space. This implies that complete regularity is a necessary condition for a space X to have a compactification X̂ that is Hausdorff. In fact, it is also sufficient:
A.51 Theorem Let X be a topological space. Then the following are equivalent:
(i) X is completely regular.
(ii) There exists a compact Hausdorff space βX together with a dense embedding X ↪ βX such that for every continuous function f : X → Y, where Y is compact Hausdorff, there exists a continuous f̂ : βX → Y such that f̂ ↾ X = f. (This f̂ is automatically unique by density of X ⊆ βX.)
A.52 Remark 1. The universal property (ii) implies that βX is unique up to homeomorphism.
‘It’ is called the Stone-Čech110 compactification of X.
2. If X is completely regular and F ∈ {R, C} then the restriction map C(βX, F) → Cb(X, F) given by f ↦ f ↾ X is a bijection and an isometric isomorphism of Banach algebras.
3. There are many ways to prove the non-trivial implication (i)⇒(ii). We sketch two,
referring to the literature for details.
(A) Define Z = [0, 1]^{C(X,[0,1])} = ∏_{f∈C(X,[0,1])} [0, 1] with the product topology, which is compact Hausdorff. The map ιX : X → Z, x ↦ (f(x))_{f∈C(X,[0,1])}, is continuous and injective since the continuous functions on a completely regular space separate points. Using that they also separate points from closed sets, one proves that ιX is an embedding, thus a homeomorphism X → ιX(X). Let βX be the closure of ιX(X) in Z, which is compact Hausdorff. Now we can identify X with the dense subspace ιX(X) of βX. By construction, for every f ∈ C(X, [0, 1]) we have pf ∘ ιX = f, where pf is the projection Z → [0, 1] indexed by f. Thus pf ↾ βX extends f to βX.
This generalizes to any compact Hausdorff space Y instead of [0, 1], using the fact that every compact Hausdorff space is homeomorphic to a closed subset of a cube (which is proven as in the proof of (ii)⇒(i) of Proposition A.48, but now taking F = C(X, [0, 1])).
(B) Alternatively, one can use Gelfand duality for commutative C*-algebras, cf. Section 19: If X is completely regular, A = Cb(X, C) with norm ‖f‖ = sup_x |f(x)| is a commutative unital C*-algebra. As such it has a spectrum Ω(A), which is compact Hausdorff. We define βX = Ω(A). There is a map ι : X → Ω(A), x ↦ ϕx, where ϕx(f) = f(x). This map is continuous by definition of the topology on Ω(A). Using the complete regularity of X one proves that ι is an embedding, i.e. a homeomorphism of X onto ι(X) ⊆ Ω(A). The density of ι(X) in Ω(A) is seen as follows: if the closure of ι(X) were a proper subset of Ω(A), then (using Urysohn or Tietze) there would be an f ∈ A\{0} such that ι(x)(f) = 0 for all x ∈ X. This is a contradiction, since the elements of A are functions on X, so that ι(x)(f) = 0 for all x implies f = 0.
4. The first of the above constructions used the Tychonov theorem, but only for Hausdorff
spaces. The second approach relies on Alaoglu’s theorem to prove compactness of Ω(A). One
can show that Alaoglu’s theorem and the restriction of Tychonov’s theorem to Hausdorff spaces
are equivalent over the ZF axioms, see Section B.5. In fact, also Theorem A.51 is in this
equivalence class.
5. If X is completely regular and non-compact, one can prove, cf. e.g. [51, 3.6.17], that no point x ∈ βX\X has a countable open neighborhood basis, so that βX is not second countable. Then Cb(X, F) ≅ C(βX, F) is not separable. Without assuming complete regularity, not much can be said, since one can find topological spaces (even regular (T3) ones) on which every continuous R-valued function is constant, in which case Cb(X, F) ≅ F, see [51, 2.7.17, 2.7.18]. 2
A.7 Some notions from measure and integration theory
A.53 Definition If X is a set, a σ-algebra on X is a family A ⊆ P (X) of subsets such that
1. ∅ ∈ A.
2. If A ∈ A then X\A ∈ A.
3. If {An}n∈N ⊆ A then ⋃_{n=1}^∞ An ∈ A.
A measurable space is a pair (X, A) consisting of a set and a σ-algebra on it.
The closedness of A under complements implies that a σ-algebra also contains X and is
closed under countable intersections. Obviously P (X) is a σ-algebra.
110 Eduard Čech (1893-1960). Czech mathematician, worked mostly in topology, e.g. Čech cohomology.
It is very easy to see that the intersection of any number of σ-algebras on X is a σ-algebra
on X. Thus if F ⊆ P (X) is any family of subsets of X, we can define the σ-algebra generated
by F as the intersection of all σ-algebras on X that contain F.
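For finite X, where countable unions reduce to finite ones, the generated σ-algebra can also be computed directly by closing F under complements and unions; the following sketch (the sets are our own toy example) illustrates how the resulting atoms determine its size:

```python
def generated_sigma_algebra(X, F):
    """Smallest family of subsets of X containing F that contains the empty set
    and is closed under complement and (finite = countable, as X is finite) union."""
    X = frozenset(X)
    sigma = {frozenset(), X} | {frozenset(A) for A in F}
    changed = True
    while changed:  # iterate the closure until a fixed point is reached
        changed = False
        current = list(sigma)
        for A in current:
            comp = X - A
            if comp not in sigma:
                sigma.add(comp)
                changed = True
            for B in current:
                union = A | B
                if union not in sigma:
                    sigma.add(union)
                    changed = True
    return sigma

X = {1, 2, 3, 4}
assert len(generated_sigma_algebra(X, [{1}])) == 4       # {∅, {1}, {2,3,4}, X}
assert len(generated_sigma_algebra(X, [{1}, {2}])) == 8  # atoms {1}, {2}, {3,4}
```

Intersections come for free, being complements of unions of complements, so the loop really computes the full σ-algebra. The size is always 2^(number of atoms).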
If (X, τ ) is a topological space, the σ-algebra on X generated by τ is called the Borel111
σ-algebra B(X) of X. (We should of course write B(X, τ ). . . ) Apart from the open sets, it
contains the closed sets, the Gδ sets and many more. A function f : X → C is called Borel measurable if f⁻¹(U) ∈ B(X) for every open U ⊆ C. (This is equivalent to f⁻¹(B) ∈ B(X) for every B ∈ B(C).) If (X, A) is a measurable space, B∞(X, C) denotes the set of functions
f : X → C that are Borel-measurable and bounded, i.e. supx∈X |f (x)| < ∞. It is not hard to
check that this is an algebra (with the pointwise product).
A.54 Definition A positive measure on a measurable space (X, A) is a map µ : A → [0, ∞] such that µ(∅) = 0 and µ(⋃_{n=1}^∞ An) = ∑_{n=1}^∞ µ(An) whenever {An} ⊆ A is a countable family of mutually disjoint sets. If µ(X) < ∞ then µ is called finite (then µ(A) ≤ µ(X) for all A ∈ A).
A Borel measure on a topological space (X, τ ) is a positive measure on (X, B(X)).
There is a notion of regularity of a measure. Since we will only consider measures on compact
subsets of C, which are second countable, regularity of all finite Borel measures is automatic.
(This follows e.g. from [140, Theorem 2.18].)
For the definition of integration of real or complex valued functions w.r.t. a measure see any
book on measure theory or the appendix of [101].
The counting measure on (X, P(X)) is defined by µc(A) = #(A). It is easy to show that f : X → C (obviously measurable) is µc-integrable if and only if ∑_{x∈X} f(x) exists in the sense of Definition A.1, in which case ∫ f(x) dµc(x) = ∑_{x∈X} f(x).
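Since µc-integrability is absolute summability, the value of the sum does not depend on the enumeration of X. A quick numeric check with f(n) = 1/n² on X = N, truncated to finitely many terms (the truncation and the random shuffle are our own illustration):

```python
import random
from math import pi

terms = [1 / n**2 for n in range(1, 100_001)]  # f(n) = 1/n^2 is absolutely summable
s_natural = sum(terms)

random.seed(0)
shuffled = list(terms)
random.shuffle(shuffled)       # a different enumeration of the same index set
s_shuffled = sum(shuffled)

# The unordered sum is enumeration-independent (up to float round-off):
assert abs(s_natural - s_shuffled) < 1e-9
# Both approximate the integral w.r.t. the counting measure, i.e. pi^2/6:
assert abs(s_natural - pi**2 / 6) < 1e-4
```

For a conditionally convergent series this enumeration-independence fails dramatically; see the Lévy-Steinitz discussion in Section B.2.3.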
A.55 Definition A complex measure on a measurable space (X, A) is a map µ : A → C such that µ(∅) = 0 and µ(⋃_{n=1}^∞ An) = ∑_{n=1}^∞ µ(An) whenever {An} ⊆ A is a countable family of mutually disjoint sets.
Note that complex measures are by definition bounded. Furthermore, if {An} is a countable family of mutually disjoint sets then automatically ∑_n |µ(An)| < ∞, since µ(⋃_n An) is invariant under permutations of the An. The supremum of ∑_n |µ(An)| over the families {An}n∈N of mutually disjoint measurable sets is called the total variation ‖µ‖ of µ.
For every bounded measurable function f : X → C one can define the integral ∫_X f dµ, satisfying |∫_X f dµ| ≤ ‖f‖∞ ‖µ‖, thus defining a bounded linear functional on L∞(X, B(X), µ; C).
A.56 Theorem (Riesz-Markov-Kakutani) 112 Let X be a compact Hausdorff space and ϕ : C(X, C) → C a linear functional. Then
(i) If ϕ is bounded, there is a finite complex measure µ on (X, B(X)) such that ‖µ‖ = ‖ϕ‖ and ϕ(f) = ∫_X f dµ for all f ∈ C(X, C).
(ii) If ϕ is positive, i.e. ϕ(f) ≥ 0 whenever f ≥ 0, then it is bounded, and there is a finite positive measure µ on (X, B(X)) with µ(X) = ‖ϕ‖ and ϕ(f) = ∫_X f dµ for all f ∈ C(X, C).
(This also works for R instead of C.)
For proofs see [29, Theorem 7.2.8] or [140, Theorem 2.14] in the real and [29, Theorem
7.3.6] or [140, Theorem 6.19] in the complex case. (For the implication positive ⇒ bounded see
Proposition B.158(i).)
111 Emile Borel (1871-1956). French mathematician. One of the pioneers of measure theory.
112 Andrey Andreyevich Markov (1903-1979), Soviet mathematician. Shizuo Kakutani (1911-2004), Japanese-American mathematician. There also are fixed point theorems due to Kakutani and to Markov-Kakutani.
B ? Supplements for the curious
B.1 Functional analysis over fields other than R and C?
The most general meaningful definition of (linear) functional analysis is as the theory of topo-
logical vector spaces over a topological field F and continuous linear maps between them. If
(the topology of) F is discrete, we are effectively doing topological abelian group theory, and
this would not be considered functional analysis. Thus we restrict ourselves to non-discrete
topological fields. The general theory of topological fields is a thorny subject, almost unknown
to non-specialists. (For reviews see [175, 167].) There would be no point in going into this
here since in this course we considered general topological vector spaces only as a step towards
spaces that are at least metrizable.
But there is a complete (in a sense) classification of the non-discrete locally compact fields.
In characteristic zero, these are precisely R, the p-adic fields Qp , where p runs through the prime
numbers, and all their finite (thus algebraic) extensions. (And in characteristic p ≠ 0 one has the finite extensions of Fp((x)), the field of formal Laurent series over the finite field Fp.) For
a proof see e.g. [126]. While R has only one algebraic extension (namely C), Qp has infinitely
many finite extensions, so that the algebraic closure of Qp (which is not complete!) is infinite-dimensional over Qp. Like R and C, the p-adic fields and their finite extensions all have a norm,
usually called ‘valuation’ or ‘absolute value’, i.e. a map F → [0, ∞) satisfying |x| = 0 ⇔ x = 0,
|x + y| ≤ |x| + |y| and |xy| = |x||y|. Note that the norm is strictly multiplicative, not just
submultiplicative. The locally compact fields are complete w.r.t. their absolute value | · |.
Books entitled ‘Functional analysis’ or ‘Topological vector spaces’ tend to work entirely over
R and C unless the title contains ‘p-adic’, ‘non-archimedean’ or ‘ultrametric’ (but there are
exceptions like [21, 112]). Nevertheless, functional analysis over p-adic fields is a well-studied
subject, cf. e.g. [138, 125], but a somewhat exotic one since it only seems to have applications
to number theory, algebraic geometry and related fields.113
In the remainder of this short section we briefly comment on the extent to which the theory
covered in these notes remains valid over p-adic fields. As a rule of thumb, one must be very
careful with theorems on normed/Banach spaces that involve R or C either in their statement
or in the proof since then either the orderedness of R or the algebraic completeness of C tend to
be used, while the p-adic fields are neither algebraically closed nor orderable! The Hahn-Banach
theorem is a case in point since we first proved it for R, making essential use of the orderedness
of the base field F = R, thus not just of the set [0, ∞) in which the norms take values, and
then extended it to C. (There nevertheless is a p-adic Hahn-Banach theorem, but with slightly
different hypotheses and a different proof.)
Theorems not explicitly referring to R or C have a better chance of carrying over to p-adic
functional analysis. For example, the open mapping theorem and both versions of the uniform
boundedness theorem generalize without change. However, one has to be careful with the above
rule since there are properties, like connectedness, shared by R and C, but not enjoyed by the
p-adic fields! There are other problems: There is no a priori relationship between the subsets S1 = {|c| | c ∈ F} and S2 = {‖x‖ | x ∈ V} of [0, ∞). Thus given x ∈ V\{0} there may not be a c ∈ F such that ‖cx‖ = 1.
We also have to be very careful with results on Hilbert spaces, since scalars in F can be pulled out of inner products without picking up an absolute value: ⟨cx, y⟩ = c⟨x, y⟩. Indeed this leads to problems adapting the proof of Theorem 5.27. The same holds for the polarization
113 The author is skeptical about claims of relevance of p-adic/ultrametric (functional) analysis to fundamental theoretical/mathematical physics (but statistical/condensed matter physics is another discussion).
identities.
We leave the discussion here and refer to the literature on p-adic (functional) analysis for
more information. See e.g. [60, 136, 138, 125].
B.2 Even more on unconditional and conditional convergence
B.2.1 The Dvoretzky-Rogers theorem
B.1 Proposition Let n ∈ N and V be a normed space with dim V ≥ n². Then there are unit vectors x1, ..., xn ∈ V such that

‖∑_{i=1}^n ci xi‖ ≤ 8 (∑_{i=1}^n |ci|²)^{1/2}   ∀ c1, ..., cn ∈ F.   (B.1)
Before we prove the proposition, we consider its consequences:
B.2 Theorem (Dvoretzky-Rogers 1950) 114 Let V be an infinite-dimensional Banach space and {cn}n∈N positive numbers such that ∑_{n=1}^∞ cn² < ∞. Then there is an unconditionally convergent series ∑_{n=1}^∞ xn in V such that ‖xn‖ = cn for all n.
Proof. By ∑_{j=1}^∞ cj² < ∞ we can choose integers n1 < n2 < ··· such that ∑_{j=nk}^∞ cj² ≤ 2^{−2k}. For i < n1 pick arbitrary vectors xi with ‖xi‖ = ci. Since V is infinite-dimensional, Proposition B.1 provides, for every k ∈ N, unit vectors y_{nk}, ..., y_{n_{k+1}−1} ∈ V such that with xi = ci yi we have

‖∑_{i=nk}^{n_{k+1}−1} di xi‖ ≤ 8 (∑_{i=nk}^{n_{k+1}−1} ci²)^{1/2} ≤ 8 · 2^{−k}

for all choices of di ∈ {0, 1}. Thus all subseries of ∑_{i=n1}^∞ xi = ∑_{k=1}^∞ ∑_{i=nk}^{n_{k+1}−1} xi converge, so that ∑_{i=1}^∞ xi converges unconditionally by Theorem A.4(vii)⇒(iii). □
B.3 Corollary In every infinite-dimensional Banach space there are series that converge unconditionally, but not absolutely.
Proof. This is immediate, applying Theorem B.2 to any sequence {cn} of positive numbers such that ∑_n cn = ∞ and ∑_n cn² < ∞, like cn = 1/n. □
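Concretely, take xn = en/n in ℓ²(N, R), where en is the standard basis: the series ∑ xn converges unconditionally (its terms are orthogonal and ∑ ‖xn‖² = ∑ 1/n² < ∞, cf. Exercise B.6 below), while ∑ ‖xn‖ = ∑ 1/n diverges. The two behaviours can be checked numerically (the truncation level is our own choice):

```python
from math import sqrt

N = 100_000
norms = [1 / n for n in range(1, N + 1)]        # ||x_n|| = 1/n for x_n = e_n / n

abs_sum = sum(norms)                            # sum of norms: grows like log N
hilbert_norm = sqrt(sum(c * c for c in norms))  # ||sum_{n<=N} x_n||, by orthogonality

assert abs_sum > 10          # ~ log(100000) + gamma, about 12.1: no absolute convergence
assert hilbert_norm < 1.29   # bounded by pi/sqrt(6), about 1.2825: the series converges
```

The sum of norms keeps growing without bound while the Hilbert space norm of the partial sums stays bounded, which is exactly the dichotomy the corollary asserts.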
Proof of Proposition B.1. We can clearly assume dim V = n². By Proposition 3.10, there exists an Auerbach basis {xi} for V, i.e. ‖xi‖ = ‖ϕi‖ = 1 for all i, where {ϕi} ⊂ V* is the dual basis. Define an inner product on V by ⟨x, y⟩ = ∑_{i=1}^{n²} ϕi(x)ϕi(y), which induces the norm ‖x‖′ = (∑_{i=1}^{n²} |ϕi(x)|²)^{1/2}. Then

‖x‖′/n ≤ max_i |ϕi(x)| ≤ ‖x‖ ≤ ∑_{i=1}^{n²} |ϕi(x)| ≤ n‖x‖′   ∀x,

where the inequalities are due to, in turn, the definition of ‖·‖′ in terms of the ϕi(x), the fact ‖ϕi‖ = 1, the fact x = ∑_i ϕi(x)xi with ‖xi‖ = 1, and the Cauchy-Schwarz inequality (applied to the vectors (|ϕi(x)|)_i and (1, ..., 1)).
114 Aryeh Dvoretzky (1916-2008), Russian-born Israeli mathematician, worked mostly in functional analysis and probability. Claude Ambrose Rogers (1920-2005), British mathematician, mostly in convex geometry.
****************
Thus {yi} is an ONB for (V, ⟨·, ·⟩) such that ‖yi‖ ≥ 1/8 for each i.
Putting xi = yi/‖yi‖, we have ‖xi‖ = 1, ‖xi‖′ ≤ 8, and with the mutual orthogonality of the xi

‖∑_{i=1}^n ci xi‖ ≤ ‖∑_{i=1}^n ci xi‖′ = (∑_{i=1}^n (|ci| ‖xi‖′)²)^{1/2} ≤ 8 (∑_{i=1}^n |ci|²)^{1/2}

for all {ci} in F. □
B.4 Remark Proposition B.1 (from [98]) suffices for Theorem B.2 and its proof only uses the
very classical existence of Auerbach bases (1929). But it can be improved considerably if one
uses slightly later theory like the John ellipsoid (1948) or Lewis’ lemma (1979). Cf. e.g. [97, vol.
1, Proposition IV.1], where the 8 in (B.1) is replaced by 2 and the n2 by 2n.
And in 1960, Dvoretzky proved Dvoretzky’s theorem, a much stronger result that started
the ‘local theory’ of Banach spaces (which focuses on finite-dimensional subspaces): For every
k ∈ N and ε > 0 there exists N (k, ε) ∈ N such that for every normed space (V, k · k) of
dimension ≥ N (k, ε) there is a k-dimensional subspace W ⊆ V such that d(W, E) < 1 + ε,
where d is the Banach-Mazur distance and E = Rk with euclidean norm. In particular every
infinite-dimensional Banach space has finite-dimensional subspaces of arbitrarily large dimension that are arbitrarily close to being Hilbert spaces. See e.g. [1] or [97, vol. 2]. 2
B.2.2 Converses of Dvoretzky-Rogers
In general it is difficult to give necessary and sufficient conditions for unconditional convergence
of a series. There is a relatively easy exception:
B.5 Exercise Let X be a compact Hausdorff space and V = C(X, F) with norm ‖·‖∞. Prove that a series ∑_{n=1}^∞ fn in V is unconditionally convergent if and only if ∑_{n=1}^∞ |fn(x)| < ∞ for all x ∈ X. Hint: Dini's theorem.
(Global absolute convergence means ∑_{n=1}^∞ sup_{x∈X} |fn(x)| < ∞. While it is quite plausible that the latter condition is strictly stronger than pointwise absolute convergence, the Dvoretzky-Rogers theorem proves that this is the case whenever X is infinite.)
B.6 Exercise Let H be a Hilbert space. Prove that a series ∑_{n=1}^∞ xn of mutually orthogonal terms converges unconditionally if and only if ∑_{n=1}^∞ ‖xn‖² < ∞.
It is plausible that unconditional convergence is harder to achieve if the summands are not mutually orthogonal. (For example, if x ≠ 0 and xn = cn x then unconditional convergence of ∑_{n=1}^∞ xn is equivalent to ∑_{n=1}^∞ ‖xn‖ < ∞.) Therefore the following is not surprising:
B.7 Proposition (Orlicz) If a series ∑_{n=1}^∞ xn in a Hilbert space converges unconditionally then ∑_{n=1}^∞ ‖xn‖² < ∞.
Proof. Since ∑_n xn converges unconditionally, by Proposition (iv) there exists a finite S ⊂ N such that ‖∑_{t∈T} xt‖ < 1 for every finite T ⊂ N\S. Now for every finite T ⊂ N we have

‖∑_{t∈T} xt‖ = ‖∑_{t∈S∩T} xt + ∑_{t∈T\S} xt‖ ≤ ‖∑_{t∈S∩T} xt‖ + ‖∑_{t∈T\S} xt‖ ≤ ∑_{s∈S} ‖xs‖ + 1 =: C.
Now given n ∈ N and s1, ..., sn ∈ {±1}, putting T+ = {i = 1, ..., n | si = 1} and T− = {i = 1, ..., n | si = −1}, we have

‖∑_{i=1}^n si xi‖ = ‖∑_{i∈T+} xi − ∑_{i∈T−} xi‖ ≤ ‖∑_{i∈T+} xi‖ + ‖∑_{i∈T−} xi‖ ≤ 2C.   (B.2)

So far we haven't used any property of V, but now we do. With the generalized parallelogram identity (5.6), we have

∑_{i=1}^n ‖xi‖² = 2^{−n} ∑_{s∈{±1}^n} ‖∑_{i=1}^n si xi‖² ≤ (2C)²

for all n, thus also ∑_{i=1}^∞ ‖xi‖² ≤ (2C)² < ∞. □
Recall from Remark 5.18 that a Banach space V has cotype c if there exists C < ∞ such that for all n ∈ N and x1, ..., xn ∈ V

∑_{i=1}^n ‖xi‖^c ≤ (C/2^n) ∑_{s∈{±1}^n} ‖∑_{i=1}^n si xi‖^c.   (B.3)

Combining this with (B.2) immediately gives:
B.8 Proposition If V is a Banach space of cotype c < ∞ then every unconditionally convergent series ∑_{n=1}^∞ xn in V satisfies ∑_{n=1}^∞ ‖xn‖^c < ∞.
B.9 Exercise Let S be an infinite set and p ∈ [1, ∞). Prove that ℓp(S, F) has cotype max(p, 2).
B.10 Corollary Let S be a set, p ∈ [1, ∞) and ∑_{n=1}^∞ xn an unconditionally convergent series in ℓp(S, F). Then ∑_{n=1}^∞ ‖xn‖^c < ∞, where c = max(p, 2).
B.2.3 Conditional convergence
If a series ∑_{n=1}^∞ xn in a Banach space V is not unconditionally convergent, it is natural to ask about the set of sums of all convergent rearrangements. For V = R, the latter is R, as shown by Riemann. In finite dimensions we have:
B.11 Theorem (Lévy (1905), Steinitz (1913-4)) 115 Let ∑_{n=1}^∞ xn be a conditionally convergent series in a finite-dimensional normed space V over R. Then

Φ = {ϕ ∈ V* | ∑_{n=1}^∞ |ϕ(xn)| < ∞} ⊆ V*

is a linear space, and the set

Σ = {s ∈ V | ∃σ : s = ∑_{n=1}^∞ x_{σ(n)}}

of sums of convergent rearrangements equals x0 + Φ^⊥, where x0 ∈ Σ and Φ^⊥ = {x ∈ V | ϕ(x) = 0 ∀ϕ ∈ Φ}.
115 Paul Lévy (1886-1971), French mathematician. Best remembered for his work on probability theory. Ernst Steinitz (1871-1928), German mathematician who worked in many areas. Founder of the theory of fields.
The proof begins with an easy reduction to the case where Φ = {0}, thus Σ = V . For a
nice exposition see [69]. The straightforward generalization of the above to infinite-dimensional
Banach spaces is false, since the set Σ of values of convergent rearrangements can fail to be
an affine space, cf. e.g. [78]. (It can even consist of two points.) But there are a number of
sufficient conditions under which Σ = x0 + Φ^⊥ holds. For example the following result, which has a remarkable application to the ‘universality’ of the Riemann zeta function:
P∞
B.12 Theorem (Pecherski ı̌ 1973) [82, Appendix §6] If H is a real Hilbert space and n=1 xn
is such that Σ 6= ∅ and ∞ 2 >
P
n=1 kxn k < ∞ then Σ = x0 + Φ .
B.3 More on the spaces ℓp(S, F) and c0(S, F)

B.3.1 Precompact subsets of c0(S, F) and ℓp(S, F), 1 ≤ p < ∞
One can characterize the precompact subsets of ℓp(S, F), 1 ≤ p < ∞, in a fashion quite similar to the Arzelà-Ascoli Theorem A.45. As in our discussion of the latter, we follow [72].
B.13 Proposition Let S be a set, F ∈ {R, C} and 1 ≤ p ≤ ∞. Put V = ℓp(S, F) if p < ∞ and V = c0(S, F) if p = ∞. Then X ⊂ V is (pre)compact if and only if
(i) The set {f(s) | f ∈ X} ⊆ F is compact (bounded) for each s ∈ S.
(ii) For each ε > 0 there is a finite F ⊆ S such that ‖(1 − χF)f‖p ≤ ε for all f ∈ X.
(I.e., the elements of X tend to zero at infinity uniformly.)
Proof. If X is compact, its image in F under the continuous evaluation map ps : f ↦ f(s) is compact for each s ∈ S. If X is only precompact then the image of its closure under ps is compact; since this image contains ps(X), it follows that ps(X) is precompact (= bounded).
By (pre)compactness, X is totally bounded. Thus if ε > 0 we can find f1, ..., fn ∈ ℓp(S, F) such that X ⊆ ⋃_{i=1}^n B(fi, ε). Now there are finite sets F1, ..., Fn such that ‖(1 − χFi)fi‖p ≤ ε/2 for each i = 1, ..., n. Then F = ⋃_{i=1}^n Fi is finite. If f ∈ X then pick i such that ‖f − fi‖p < ε/2. Using the rather obvious facts ‖(1 − χF)fi‖p ≤ ‖(1 − χFi)fi‖p ≤ ε/2 and ‖(1 − χF)(f − fi)‖p ≤ ‖f − fi‖p < ε/2, we obtain

‖(1 − χF)f‖p ≤ ‖(1 − χF)fi‖p + ‖(1 − χF)(f − fi)‖p ≤ ε/2 + ε/2 = ε,

thus (ii). (Note that the existence of the Fi would fail for (ℓ∞(S, F), ‖·‖∞).)
Now assume that (i) and (ii) hold and let ε > 0. Then we can find a finite F ⊆ S as in condition (ii). Write F = {s1, ..., sn} and define a map h : X → F^n, f ↦ (f(s1), ..., f(sn)). Equip F^n with the norm ‖·‖p. By condition (i), h(X) ⊆ F^n is bounded/compact. Let f, f′ ∈ X be such that ‖h(f) − h(f′)‖p ≤ ε. Since this is equivalent to ‖χF(f − f′)‖p ≤ ε, we have

‖f − f′‖p ≤ ‖χF(f − f′)‖p + ‖(1 − χF)f‖p + ‖(1 − χF)f′‖p ≤ 3ε.

This shows that the assumptions of Lemma A.46 are satisfied, so that X is totally bounded, thus precompact. Finally, assume the sets in (i) are compact. The total boundedness of X is equivalent to the statement that every sequence {fn} in X has a Cauchy subsequence {fni}. The latter converges to some element f ∈ V. Now the closedness of ps(X) ⊆ F for each s implies that the limit function f is in X. Thus X is compact. □
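As a sanity check of condition (ii), the set X = {δs | s ∈ N} ⊂ c0(N, F) of standard unit vectors satisfies the pointwise condition (i) but violates (ii) for every ε < 1, since any finite F ⊆ N misses some δs entirely; hence X is not precompact. A minimal numeric illustration (the finite truncations are our own choice):

```python
def tail_sup_norm(s, F):
    """||(1 - chi_F) delta_s||_infty: 0 if s lies in F, else 1."""
    return 0.0 if s in F else 1.0

# For any finite F = {1, ..., m}, the vector delta_{m+1} has full tail norm,
# so no finite F can make the tails of ALL elements of X small at once:
for m in (1, 10, 1000):
    F = set(range(1, m + 1))
    assert max(tail_sup_norm(s, F) for s in range(1, m + 2)) == 1.0
```

This matches the elementary observation that ‖δs − δt‖∞ = 1 for s ≠ t, so {δs} has no Cauchy subsequences.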
B.14 Remark It is interesting to compare the conditions in Theorem A.45 and Proposition B.13. While the respective pointwise conditions (i) are identical, the conditions (ii) are totally different. In Theorem A.45 we are concerned with continuous functions and therefore have a uniform (over the functions) version of continuity, while due to compactness there is no condition at infinity. On the other hand, in Proposition B.13 there are no continuity questions, but we need a uniform vanishing condition at infinity.
As one may expect, there is a generalization of Proposition B.13 to Lp-spaces, now involving three conditions. Note that pointwise conditions make no sense, since elements of Lp are only defined up to null sets.
B.15 Theorem (Kolmogorov-Riesz theorem) 116 Let 1 ≤ p < ∞ and X ⊆ Lp(Rn, λ). Then X is precompact if and only if
(i) X is ‖·‖p-bounded.
(ii) lim_{R→∞} sup_{f∈X} (∫_{|x|≥R} |f(x)|^p dλ)^{1/p} = 0 (condition at ∞).
(iii) lim_{y→0} sup_{f∈X} ‖f − fy‖p = 0, where fy(x) = f(x − y) (variant of equicontinuity).
(Note that lim_{y→0} ‖f − fy‖p = 0 holds for each single f ∈ Lp(Rn).) For an elegant proof of the theorem, again using Lemma A.46, see [72]. In the literature one can find more restrictive versions (e.g. in [23, Theorem 4.26] condition (ii) is replaced by the stronger hypothesis that all f ∈ X are supported in the same bounded set) as well as more general ones. See [169, §12] for a version on a locally compact group G, presented more readably in [88]. 2
B.3.2 The dual space of ℓ∞(S, F)
We have seen in Theorem 4.19(v) that there are bounded linear functionals ϕ ∈ ℓ∞(S, F)* that vanish on c0(S, F). Those clearly cannot be captured by the function g(s) = ϕ(δs) used in the proof of Theorem 4.19. This suggests considering µϕ(A) = ϕ(χA) for arbitrary A ⊆ S instead. If A1, ..., AK are mutually disjoint and A = ⋃_{k=1}^K Ak then χA = ∑_{k=1}^K χAk, thus µϕ(A) = ∑_{k=1}^K µϕ(Ak), so that µϕ is finitely additive.117
B.16 Definition If S is a set, a finitely additive finite F-valued measure on S is a map µ : P(S) → F satisfying µ(∅) = 0 and µ(A1 ∪ ··· ∪ AK) = µ(A1) + ··· + µ(AK) whenever A1, ..., AK are mutually disjoint subsets of S. The set of such µ, which we denote fa(S, F), is a vector space via (c1µ1 + c2µ2)(A) = c1µ1(A) + c2µ2(A). For µ ∈ fa(S, F) we define

‖µ‖ = sup { ∑_{k=1}^K |µ(Ak)| | K ∈ N, A1, ..., AK ⊆ S, i ≠ j ⇒ Ai ∩ Aj = ∅ },
‖µ‖0 = sup_{A⊆S} |µ(A)|.
B.17 Theorem (i) ‖·‖ and ‖·‖0 are equivalent norms on fa(S, F). We write

ba(S, F) = {µ ∈ fa(S, F) | ‖µ‖0 < ∞ (⇔ ‖µ‖ < ∞)}.

(ii) (ba(S, F), ‖·‖) is a Banach space.
116
Andrey Nikolaevich Kolmogorov (1903-1987). Soviet mathematician with countless contributions to classical,
harmonic and functional analysis, dynamical systems, probability theory (which he founded on measure theory), etc.
117
The discussion in this section strongly borrows from [43].

(iii) If ϕ ∈ `∞(S, F)∗ then kµϕ k ≤ kϕk, thus we have a norm-decreasing linear map `∞(S, F)∗ →
ba(S, F), ϕ ↦ µϕ.
Proof. (i) It is immediate from the definitions that kcµk = |c|kµk and kcµk0 = |c|kµk0 for all
c ∈ F, µ ∈ fa(S, F), and that kµk = 0 ⇔ µ = 0 ⇔ kµk0 = 0. Also kµ1 + µ2 k0 ≤ kµ1 k0 + kµ2 k0 is
quite obvious. Now

  kµ1 + µ2 k = sup{ Σ_{k=1}^K |µ1(Ak) + µ2(Ak)| : · · · } ≤ sup{ Σ_{k=1}^K (|µ1(Ak)| + |µ2(Ak)|) : · · · }
            ≤ sup{ Σ_{k=1}^K |µ1(Ak)| : · · · } + sup{ Σ_{k=1}^K |µ2(Ak)| : · · · } = kµ1 k + kµ2 k.

Thus k·k, k·k0 are norms on fa(S, F). The definition of k·k clearly implies |µ(A)| ≤ kµk for
each A ⊆ S, whence kµk0 ≤ kµk.
Assume µ ∈ fa(S, R) and kµk0 < ∞. If A1, …, AK ⊆ S are mutually disjoint, put

  A+ = ⋃{Ak | µ(Ak) ≥ 0},  A− = ⋃{Ak | µ(Ak) < 0}.

Now by finite additivity, Σ_k |µ(Ak)| = µ(A+) − µ(A−) ≤ 2kµk0 since |µ(A±)| ≤ kµk0. Taking
the supremum over the families {Ak} gives kµk ≤ 2kµk0.
If µ ∈ fa(S, C), writing µ = Re µ + i Im µ we find kµk ≤ 4kµk0. Thus kµk0 ≤ kµk ≤ 4kµk0
for all µ, and the two norms are equivalent.
(ii) Here it is more convenient to work with the simpler norm k·k0. Let {µn} be a Cauchy
sequence in ba(S, F). Then |µn(A) − µm(A)| ≤ kµn − µm k0, so that {µn(A)} is Cauchy, thus
convergent, for each A ⊆ S. Define µ(A) = limn µn(A). It is clear that µ(∅) = 0. If A1, …, AK
are mutually disjoint then

  µ(A1 ∪ · · · ∪ AK) = lim_{n→∞} µn(A1 ∪ · · · ∪ AK) = lim_{n→∞} (µn(A1) + · · · + µn(AK)) = µ(A1) + · · · + µ(AK),

so that µ is finitely additive. Since {µn} is Cauchy, for every ε > 0 there is n0 such that n, m ≥ n0
implies kµm − µn k0 < ε. In particular there is n0 such that kµm k0 ≤ kµ_{n0} k0 + 1 for m ≥ n0,
which gives kµk0 ≤ kµ_{n0} k0 + 1 < ∞, thus µ ∈ ba(S, F). And taking m → ∞ in |µn(A) − µm(A)| ≤ kµn − µm k0 < ε gives
kµn − µk0 ≤ ε, so that kµn − µk0 → 0. Thus ba(S, F) is complete (w.r.t. k·k0, thus also w.r.t.
k·k).
(iii) It is clear that `∞(S, F)∗ → fa(S, F), ϕ ↦ µϕ is linear. Now let A1, …, AK ⊆ S be
mutually disjoint. Then

  Σ_{k=1}^K |µϕ(Ak)| = Σ_{k=1}^K sgn(µϕ(Ak)) µϕ(Ak) = Σ_{k=1}^K sgn(µϕ(Ak)) ϕ(χ_{Ak}) = ϕ( Σ_{k=1}^K sgn(µϕ(Ak)) χ_{Ak} ).

Since the Ak are mutually disjoint and |sgn(z)| ≤ 1, we have k Σ_{k=1}^K sgn(µϕ(Ak)) χ_{Ak} k∞ ≤ 1,
so that Σ_{k=1}^K |µϕ(Ak)| ≤ kϕk. Taking the supremum over the finite families {Ak} gives kµϕ k ≤
kϕk. □
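The two norms and the factor 2 from the proof of (i) can be checked numerically. The following sketch uses hypothetical toy data: on a finite set every finitely additive measure is given by point masses g, so kµk0 can be found by brute force over all subsets, while the supremum defining kµk is attained by the partition into singletons (any disjoint family gives a smaller sum by the triangle inequality).

```python
import itertools
import random

random.seed(0)
S = range(6)
g = {s: random.uniform(-1, 1) for s in S}     # point masses of a toy measure

def mu(A):
    """The finitely additive measure induced by the point masses g."""
    return sum(g[s] for s in A)

# ||mu||_0 = sup over all subsets A of |mu(A)| (brute force):
norm0 = max(abs(mu(A)) for k in range(len(S) + 1)
            for A in itertools.combinations(S, k))

# ||mu|| = sup over disjoint families of sum_k |mu(A_k)|; on a finite set
# it is attained by the partition into singletons:
norm = sum(abs(g[s]) for s in S)

# The A+/A- argument of the proof gives ||mu|| <= 2 ||mu||_0 for real mu:
assert norm0 <= norm <= 2 * norm0
```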

B.18 Theorem (i) For each µ ∈ ba(S, F) there is a unique linear functional ∫µ ∈ `∞(S, F)∗
such that ∫µ(χA) = µ(A) for all A ⊆ S. It satisfies k∫µ k ≤ kµk.

(ii) The maps α : `∞(S, F)∗ → ba(S, F), ϕ ↦ µϕ and ∫ : ba(S, F) → `∞(S, F)∗, µ ↦ ∫µ are
mutually inverse and isometric, thus `∞(S, F)∗ ∼= ba(S, F).
Proof. (i) If f ∈ `∞(S, F) has finite image, write f = Σ_{k=1}^K ck χ_{Ak}, where the Ak are mutually
disjoint, and define

  ∫ f dµ = Σ_{k=1}^K ck µ(Ak).

(We write ∫µ(f) or ∫ f dµ according to convenience.) If f = Σ_{l=1}^L c′l χ_{A′l} is another representation
of f, then using the finite additivity of µ it is straightforward to check that
Σ_{k=1}^K ck µ(Ak) = Σ_{l=1}^L c′l µ(A′l), so that ∫ f dµ is well-defined. Now ∫ cf dµ = c ∫ f dµ
for c ∈ F is obvious, and ∫ (f + g) dµ = ∫ f dµ + ∫ g dµ for all finite-image functions follows from
the fact that f + g again is a finite-image function and the representation independence of ∫.
Thus ∫µ : f ↦ ∫ f dµ is a linear functional on the bounded finite-image functions. It is clear
that this is the unique linear functional sending χA to µ(A) for each A ⊆ S. Now

  |∫ f dµ| ≤ Σ_{k=1}^K |ck| |µ(Ak)| ≤ kf k∞ Σ_{k=1}^K |µ(Ak)| ≤ kf k∞ kµk.

Thus ∫µ is a bounded functional, and since the bounded finite-image functions are dense in
`∞(S, F) by Lemma 4.13, ∫µ has a unique extension to a linear functional ∫µ ∈ `∞(S, F)∗ with
k∫µ k ≤ kµk.
(ii) If µ ∈ ba(S, F) then by definition of ∫µ we have ∫ χA dµ = µ(A) for all A ⊆ S. Thus
α ◦ ∫ = id_{ba(S,F)}.
If ϕ ∈ `∞(S, F)∗ then in view of the definition of ∫ we have ∫ χA dµϕ = µϕ(A) = ϕ(χA) for
all A ⊆ S. Thus ϕ and ∫µϕ coincide on all characteristic functions, thus on all of `∞(S, F) by
linearity, density of the finite-image functions and the k·k∞-continuity of ϕ and ∫µϕ. Thus
∫ ◦ α = id_{`∞(S,F)∗}.
Since the maps α and ∫ are mutually inverse and both norm-decreasing, they actually both
are isometries. □

This completes the determination of `∞(S, F)∗. (Note that we did not use the completeness
of ba(S, F) proven in Theorem B.17(ii). Thus it would also follow from the isometric bijection
ba(S, F) ∼= `∞(S, F)∗ just established.)
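The simple-function integral from the proof of Theorem B.18(i) can be sketched in a few lines (hypothetical toy data): group the points of S by the value of f (the sets Ak) and sum ck µ(Ak). When µ comes from point masses g, the result agrees with Σ_s f(s)g(s), the formula appearing in Proposition B.21(i) below.

```python
from fractions import Fraction

S = range(5)
g = {0: Fraction(1, 2), 1: Fraction(-1, 3), 2: Fraction(1, 4),
     3: Fraction(0), 4: Fraction(1, 5)}          # hypothetical point masses
mu = lambda A: sum(g[s] for s in A)

def integral(f, mu, S):
    """∫ f dµ for a finite-image f: group S by the value of f (the sets A_k)
    and return the sum of c_k * mu(A_k)."""
    return sum(c * mu([s for s in S if f(s) == c])
               for c in set(f(s) for s in S))

f = lambda s: s % 2                               # finite image {0, 1}
assert integral(f, mu, S) == sum(f(s) * g[s] for s in S) == Fraction(-1, 3)
```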

B.19 Exercise Given µ ∈ ba(S, F), prove that µ is {0, 1}-valued if and only if ∫µ ∈ `∞(S, F)∗
is a character, i.e. ∫µ(fg) = ∫µ(f) ∫µ(g) for all f, g ∈ `∞(S, F).
Since `∞(S, F)∗ has a closed subspace ι(`1(S, F)), it is interesting to identify the corresponding
subspace of ba(S, F).

B.20 Definition A finitely additive measure µ ∈ ba(S, F) is called countably additive if for
every countable family A ⊆ P(S) of mutually disjoint sets we have

  µ(⋃_{A∈A} A) = Σ_{A∈A} µ(A),

and totally additive if the same holds for any family of mutually disjoint sets. The sets of
countably and totally additive measures on S are denoted ca(S, F) and ta(S, F), respectively.

B.21 Proposition For µ ∈ ba(S, F), consider the following statements:

(i) There is g ∈ `1(S, F) such that µ(A) = Σ_{s∈A} g(s) for all A ⊆ S.

(ii) ∫µ ∈ `∞(S, F)∗ is ‘normal’, i.e. ∫ f dµ = lim_ι ∫ fι dµ for every uniformly bounded net
{fι} ⊆ F^S converging pointwise to f.

(iii) µ is totally additive.
(iv) µ is countably additive.
Then (i)⇔(ii)⇔(iii)⇒(iv). If S is countable then also (iv)⇒(iii).
Proof. (i)⇒(ii) If µ is of the given form then clearly ∫ χA dµ = µ(A) = Σ_{s∈A} g(s) for each
A ⊆ S. By the way ∫µ is constructed from µ, it is clear that ∫ f dµ = Σ_{s∈S} f(s)g(s) for all
f ∈ `∞(S, F). Thus ∫µ = ϕg, and normality of ∫µ follows from Proposition A.3.
(ii)⇒(iii) We know that we can recover µ from ∫µ as µ(A) = ∫ χA dµ. Let A be a family
of mutually disjoint subsets of S. Then the net {fF = χ_{⋃F}}, indexed by the finite subsets
F ⊆ A, is uniformly bounded and converges pointwise to χB, where B = ⋃A. Now normality
of ∫µ implies that µ(B) = ∫ χB dµ = limF ∫ fF dµ = limF Σ_{A∈F} µ(A) = Σ_{A∈A} µ(A), which is
total additivity of µ.
(iii)⇒(i) If we put g(s) = µ({s}) then total additivity of µ means that µ(A) = Σ_{s∈A} g(s) for all
A ⊆ S, convergence being absolute. Now the finiteness of µ(S) gives kgk1 < ∞.
(iii)⇒(iv) is trivial. If S is countable then a family of mutually disjoint non-empty subsets
of S is at most countable, so that (iii) and (iv) are equivalent. □

Thus we have the situation of the following diagram:

                     `1 (S, F)
                ∼= ↙           ↘ ∼=
                        ∼=
    (`∞ (S, F)∗ )n −−−−−−−→ ta(S, F)
         ∩                      ∩
         ↓           ∼=         ↓
     `∞ (S, F)∗  −−−−−−−→  ba(S, F)

where ta(S, F) can be replaced by ca(S, F) if S is countable.

B.3.3 c0 (N, F) ⊆ `∞ (N, F) is not complemented


B.22 Definition We say that a Banach space V has property S if there is a countable subset
C ⊆ V ∗ separating the points of V . I.e., if x ∈ V and ϕ(x) = 0 for all ϕ ∈ C then x = 0.
If V has property S then every closed subspace W ⊆ V has property S. (And so would
non-closed subspaces, but they are not Banach.)
It is easy to see that V has property S whenever V∗ is separable. But this is not a necessary
condition: V = `∞(N, C) has property S, as we see by taking C = {ϕn}n∈N, where ϕn(f) = f(n).
But V∗ (∼= ba(N, C)) is not separable, since by Exercise 9.26 this would imply separability of V,
which is false by Exercise 4.18(i).

B.23 Theorem (Phillips 1939, Sobczyk 1940) The closed subspace c0 (N, R) ⊆ `∞ (N, R)
is not complemented.
Proof. (Whitley (1966).) From now on we abbreviate `∞(N, F) and c0(N, F) as `∞, c0. Our
strategy for proving that c0 ⊆ `∞ is not complemented is the following: If c0 ⊆ `∞ had a
complementary closed subspace W, Exercise 7.15 would give `∞ ∼= c0 ⊕ W, thus `∞/c0 ∼= W.
Since W would have property S, it would follow that Q = `∞/c0 has property S, but we will
prove that it doesn't!
The idea for doing so is to produce an uncountable subset F ⊆ Q such that each functional
ϕ ∈ Q∗ is non-zero only on countably many elements of F. Then for any countable C ⊆ Q∗ the
set F′ = ⋃_{ϕ∈C} {q ∈ F | ϕ(q) ≠ 0} is countable, so that the family C ⊆ Q∗ vanishes identically
on the uncountable set F\F′. It therefore cannot separate the elements of F, let alone those of
Q. Thus Q does not have property S and we are done. For the construction of such an F we
use the following lemma:

B.24 Lemma Every countably infinite set X admits a family {Xλ}λ∈Λ of subsets of X such
that
(i) Λ has cardinality c = #R, in particular it is uncountable.
(ii) Xλ is infinite for each λ ∈ Λ.
(iii) Xλ ∩ Xλ′ is finite for all λ, λ′ ∈ Λ, λ ≠ λ′.
Proof. Take Y = (0, 1) ∩ Q and Λ = (0, 1)\Q. Clearly Y is countable and Λ is uncountable
(since removing a countable set from one of cardinality c does not change the cardinality).
For each λ ∈ Λ pick a sequence {an} ⊆ Y converging to λ (for example an = ⌊nλ⌋/n)
and put Yλ = {an | n ∈ N}. That each Yλ is infinite follows from the irrationality of λ
and the rationality of the an. If λ ≠ λ′ and an → λ, a′n → λ′ then there exists n0 such
that n, n′ ≥ n0 ⇒ max(|an − λ|, |a′n′ − λ′|) < |λ − λ′|/2, so that an ≠ a′n′. This implies
#(Yλ ∩ Yλ′) < ∞. We thus have a family of subsets of Y with all desired properties. For an
arbitrary countably infinite set X the claim now follows using a bijection X ∼= Y. □
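The construction in this proof can be illustrated numerically. The following sketch uses float approximations of the irrational λ's (so ⌊nλ⌋ is only approximately the exact floor, which is harmless here): the sets Yλ keep growing with the cutoff N, while their pairwise intersection stays small, since shared values can only come from small n.

```python
from fractions import Fraction
from math import floor, sqrt

def Y(lam, N=2000):
    """First N terms of a_n = floor(n*lam)/n -> lam (proof of Lemma B.24),
    with a float approximation of the irrational lam."""
    return {Fraction(floor(n * lam), n) for n in range(1, N + 1)}

Y1, Y2 = Y(sqrt(2) / 2), Y(sqrt(3) / 3)

# Each Y_lam is infinite (it keeps growing with N), but the intersection
# is finite: eventually a_n lies within |lam - lam'|/2 of its own limit.
assert len(Y1) > 500 and len(Y2) > 500
assert len(Y(sqrt(2) / 2, 4000)) > len(Y1)
assert len(Y1 & Y2) < 10
```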

Let {Xλ}λ∈Λ be a family of subsets of N as provided by the lemma. For λ ∈ Λ, the
characteristic function χ_{Xλ} : N → {0, 1} ⊆ C clearly is in `∞. Let p : `∞ → Q = `∞/c0
be the quotient map. Now let qλ = p(χ_{Xλ}) and F = {qλ | λ ∈ Λ}. If λ, λ′ ∈ Λ, λ ≠ λ′,
the symmetric difference Xλ∆Xλ′ = (Xλ ∪ Xλ′)\(Xλ ∩ Xλ′) is infinite by (ii) and (iii). Thus
χ_{Xλ} − χ_{Xλ′} ∉ c0 = ker p, so that λ ↦ qλ is injective, thus with (i) we see that F is uncountable.
Let now ϕ ∈ Q∗, m, n ∈ N and let λ1, …, λm ∈ Λ be mutually distinct and such that
|ϕ(qλi)| ≥ 1/n ∀i = 1, …, m. For each i pick ti with |ti| = 1 such that ti ϕ(qλi) = |ϕ(qλi)|.
Put f = Σ_{i=1}^m ti χ_{Xλi} ∈ `∞. Since the sets Xλi have pairwise finite intersections, the function
f has absolute value larger than one only on a subset of the finite set ⋃_{j≠k} Xλj ∩ Xλk and
absolute value one on the infinite set (⋃_i Xλi)\(⋃_{j≠k} Xλj ∩ Xλk). This implies that kp(f)k =
inf_{g∈c0} kf − gk∞ = 1. Thus

  kϕk ≥ |ϕ(p(f))| = |Σ_{i=1}^m ti ϕ(p(χ_{Xλi}))| = |Σ_{i=1}^m ti ϕ(qλi)| = Σ_{i=1}^m |ϕ(qλi)| ≥ m/n.

Thus m ≤ nkϕk < ∞, so that for each ϕ ∈ Q∗ and n ∈ N there are at most nkϕk distinct
λ ∈ Λ with |ϕ(qλ)| ≥ 1/n. If there were an uncountable F′ ⊆ F with ϕ(q) ≠ 0 ∀q ∈ F′, there
would have to be an n ∈ N such that |ϕ(q)| ≥ 1/n for infinitely (in fact uncountably) many
q ∈ F′, contradicting what we just proved. This completes the proof. □

B.3.4 c0(N, F) is not a dual space. Spaces with multiple pre-duals

Recall that we write ∼= for isometric isomorphism and ' for isomorphism of Banach spaces.

B.25 Lemma Let V be a Banach space. Then:

(i) P = ιV∗ ◦ (ιV)∗ ∈ B(V∗∗∗) is an idempotent and PV∗∗∗ = ιV∗(V∗).
(ii) ιV∗(V∗) ⊆ V∗∗∗ is a complemented subspace.
(iii) If a Banach space W is isomorphic to V∗ with V Banach then ιW(W) ⊆ W∗∗ is complemented.
(iv) V∗∗∗/V∗ ' (V∗∗/V)∗. (We omit the ι's for simplicity.)
Proof. (i) Since ιV, ιV∗ are bounded, Lemma 9.31 gives boundedness of P. Let ϕ ∈ V∗
and x ∈ V. Then

  (P ιV∗(ϕ))(ιV(x)) = ιV∗((ιV)∗(ιV∗(ϕ)))(ιV(x)) = [(ιV)∗(ιV∗(ϕ))](x) = ϕ(x) = ιV∗(ϕ)(ιV(x)),

where we used Exercise 9.18 several times. This proves P ιV∗(ϕ) = ιV∗(ϕ), thus P ↾ ιV∗(V∗) = id. On
the other hand, it follows directly from the definition of P that PV∗∗∗ ⊆ ιV∗(V∗). Combining
these two facts gives P² = P and PV∗∗∗ = ιV∗(V∗).
(ii) This is an immediate consequence of (i) and Exercise 6.13.
(iii) If T : W → V∗ is an isomorphism then we have isomorphisms T∗ : V∗∗ → W∗ and
T∗∗ : W∗∗ → V∗∗∗. Using this it is straightforward to deduce the claim from (ii).
(iv) By Exercise 6.7 we have (V∗∗/V)∗ ∼= V⊥ ⊆ V∗∗∗. And by (ii), V∗∗∗ ' V∗ ⊕ W, where
W ' V∗∗∗/V∗, the isomorphism being given by x∗∗∗ ↦ (Px∗∗∗, (1 − P)x∗∗∗) with P as in (i).
Thus PV∗∗∗ ' V∗ and V∗∗∗/V∗ ' (1 − P)V∗∗∗. Thus the claimed isomorphism follows if we
prove that the subspaces V⊥ and (1 − P)V∗∗∗ of V∗∗∗ are equal.
Now, x∗∗∗ ∈ (1 − P)V∗∗∗ means (1 − P)x∗∗∗ = x∗∗∗, thus Px∗∗∗ = 0. Since P = ιV∗ ◦ (ιV)∗,
where ιV∗ is injective, this is equivalent to (ιV)∗(x∗∗∗) = 0. By the definition of the transpose,
this means that x∗∗∗ ◦ ιV = 0. Since this is the same as x∗∗∗ ∈ ιV(V)⊥, we are done. □

B.26 Corollary c0(N, F) is not isomorphic to the dual space of any Banach space.

Proof. We again abbreviate c0(N, F) as c0 etc. We know that c0∗ ∼= `1 and c0∗∗ ∼= `∞, the
canonical map ιc0 : c0 → c0∗∗ just being the inclusion map c0 ,→ `∞. By Theorem B.23, c0 ⊆ `∞
is not complemented. Combining this with Lemma B.25(iii), the claim follows. □

B.27 Corollary Let X = c0 ⊕ (`∞/c0). Then X 6' `∞, but X∗ ' (`∞)∗.

Proof. X ' `∞ would imply that c0 ⊆ `∞ is complemented, which it is not by Theorem B.23.
Thus X 6' `∞. With c0∗ ∼= `1 we have X∗ ' c0∗ ⊕ (`∞/c0)∗ ' `1 ⊕ (`∞/c0)∗.
On the other hand, since `1 ∼= c0∗ is a dual space, we see that `1 ⊆ (`1)∗∗ ∼= (`∞)∗ is
complemented by Lemma B.25(iii). Thus (`∞)∗ ' `1 ⊕ (`∞)∗/`1 by Exercise 7.15(i). Now Lemma
B.25(iv) with V = c0 gives (`∞)∗/`1 ' (`∞/c0)∗, so that X∗ ' (`∞)∗. □

One can also find Banach spaces V with V∗ ∼= `1, while V 6' c0. But this is a bit more
involved.

B.3.5 Schur's theorem for `1(N, F)

As on earlier occasions, we abbreviate `1 = `1(N, F).

B.28 Theorem (I. Schur 1921) If g, {fn}n∈N ⊆ `1(N, F) and fn → g weakly then kfn − gk1 → 0.

Proof. It clearly suffices to prove this for g = 0, i.e. that fn → 0 weakly implies kfn k1 → 0.
Assume that fn → 0 weakly, but kfn k1 6→ 0. Since δm ∈ `∞ ∼= (`1)∗, the first fact clearly implies
fn(m) = ϕδm(fn) → 0 as n → ∞ for all m. And by the second assumption there exists ε > 0 such that
kfn k1 ≥ ε for infinitely many n. Using this, we inductively define {nk}, {rk} ⊆ N as follows:
(a) Let n1 be the smallest number for which kfn1 k1 ≥ ε.
(b) Let r1 be the smallest number for which Σ_{i=1}^{r1} |fn1(i)| ≥ ε/2 and Σ_{i=r1+1}^∞ |fn1(i)| ≤ ε/5.
For k ≥ 2:
(c) Let nk be the smallest number such that nk > nk−1, kfnk k1 ≥ ε and Σ_{i=1}^{rk−1} |fnk(i)| ≤ ε/5.
(d) Let rk be the smallest number such that rk > rk−1, Σ_{i=rk−1+1}^{rk} |fnk(i)| ≥ ε/2 and
Σ_{i=rk+1}^∞ |fnk(i)| ≤ ε/5.
The reader should convince herself that the existence of such nk, rk follows from our assumptions!
Now define {ci}i∈N by ci = sgn(fnk(i)), where k is uniquely determined by rk−1 < i ≤ rk
(with r0 = 0). Clearly c = {ci} ∈ `∞, and for all k we have, using the lower bounds in (b),(d),

  |Σ_{i=rk−1+1}^{rk} ci fnk(i)| = Σ_{i=rk−1+1}^{rk} |fnk(i)| ≥ ε/2,

while using |ci| ≤ 1 and the upper bounds in (b),(c),(d) we have

  |Σ_{i=1}^{rk−1} ci fnk(i)| ≤ Σ_{i=1}^{rk−1} |fnk(i)| ≤ ε/5,   |Σ_{i=rk+1}^∞ ci fnk(i)| ≤ Σ_{i=rk+1}^∞ |fnk(i)| ≤ ε/5.

Thus |ϕc(fnk)| ≥ ε/2 − ε/5 − ε/5 = ε/10 > 0 for all k, so that ϕc(fn) 6→ 0. Since this contradicts the
assumption that fn → 0 weakly, we must have kfn k1 → 0. □
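The selection (a)-(d) can be run on concrete data. The following sketch uses a hypothetical hump sequence (mass 0.9 at index n plus 0.1 just after it, truncated to finitely many coordinates), so that kfn k1 = 1 while fn(m) → 0 pointwise; the selection then produces a sign sequence c with |ϕc(fnk)| ≥ ε/10 for every selected index.

```python
import numpy as np

N, M = 60, 400    # number of sequence members / coordinate truncation
eps = 1.0

# Hypothetical hump sequence: ||f_n||_1 = 1, f_n(m) -> 0 for each fixed m.
f = np.zeros((N, M))
for n in range(N):
    f[n, n], f[n, n + 1] = 0.9, 0.1

# The selection (a)-(d) of the proof, written 0-based with r_0 = 0:
ns, rs = [], [0]
while True:
    start = ns[-1] + 1 if ns else 0
    n = next((n for n in range(start, N)
              if np.abs(f[n]).sum() >= eps
              and np.abs(f[n, :rs[-1]]).sum() <= eps / 5), None)
    if n is None:
        break
    r = next(r for r in range(rs[-1] + 1, M)
             if np.abs(f[n, rs[-1]:r]).sum() >= eps / 2
             and np.abs(f[n, r:]).sum() <= eps / 5)
    ns.append(n)
    rs.append(r)

# Sign sequence c; the humps live on disjoint blocks (r_{k-1}, r_k]:
c = np.zeros(M)
for k, n in enumerate(ns):
    c[rs[k]:rs[k + 1]] = np.sign(f[n, rs[k]:rs[k + 1]])

for n in ns:
    assert abs(np.dot(c, f[n])) >= eps / 10    # |phi_c(f_{n_k})| >= eps/10
```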

B.29 Remark 1. The above proof followed the argument in [9] quite closely. The method of
proof is called the ‘gliding (or sliding) hump method’ (which is a variant of the earlier method
with the equally colorful name ‘condensation of singularities’). The gliding hump is precisely
the dominant contribution to ϕc(fnk) coming from the i in the interval {rk−1 + 1, …, rk}, which
moves to infinity as k → ∞. For a high-brow interpretation of Schur's theorem in terms of
Banach space bases see [1, Section 2.3]. But also this discussion uses gliding humps!
2. With the appearance of the proof of the uniform boundedness theorem using Baire's
theorem, the gliding hump method fell a bit out of fashion. (But never completely, cf. e.g.
[161].) Also Schur's theorem can be proven using Baire's theorem, cf. e.g. [30, Proposition
V.5.2], but this is conceptually and technically more complicated.
3. Note also that the determination of the nk, rk in the above proof was deterministic, using
no choice axiom at all. In this sense the proof is better than the alternative one using Baire's
theorem, thus countable dependent choice, which nevertheless is instructive. (But of course also
the above proof is non-constructive in the somewhat extremist sense of intuitionism since the
necessary ε > 0 cannot be found algorithmically.) In the same vein, in the next section we will give
a gliding-hump proof for the weak uniform boundedness theorem that only uses the axiom ACω
of countable choice. □

B.3.6 Compactness of all bounded linear maps `q → `p and c0 → `p (1 ≤ p < q < ∞)

In Theorem 12.24 we proved that for a reflexive Banach space V all bounded linear operators
c0 → V and V → `1 are compact. Applying this to the reflexive spaces `p = `p(N, F) with
1 < p < ∞, we see that for such p all bounded maps c0 → `p and `p → `1 are compact.
We now give an alternative proof that in addition gives compactness of all bounded maps
`q → `p for 1 < p < q < ∞ and c0 → `1. Thus all bounded linear maps between the
spaces c0, `p (1 ≤ p < ∞) that go in the opposite direction of the bounded inclusion maps
`1 ,→ `p ,→ `q ,→ c0 are compact!

B.30 Theorem (H. R. Pitt 1936) Let 1 ≤ p < q < ∞ and let V ⊆ `q or V ⊆ c0 be a closed
subspace. Then every bounded linear map V → `p is compact, thus B(V, `p) = K(V, `p).

We follow the elementary proof by Delpech [38], based upon the following

B.31 Lemma (i) Let 1 ≤ r < ∞, x ∈ `r and {yn} ⊆ `r a weak null-sequence. Then

  lim sup_{n→∞} kx + yn k^r = kxk^r + lim sup_{n→∞} kyn k^r (norms in `r). (B.4)

(ii) If x ∈ c0 and {yn} ⊆ c0 is a weak null-sequence then

  lim sup_{n→∞} kx + yn k = max(kxk, lim sup_{n→∞} kyn k). (B.5)

Proof. (i) Since the evaluation map ϕm : `r → F, y ↦ y(m) is in (`r)∗, the weak convergence
yn → 0 implies yn(m) → 0 for all m. From this it is quite clear that (B.4) holds if x has
finite support supp(x) = {m ∈ N | x(m) ≠ 0} (i.e. x ∈ c00), since in particular yn(m) → 0 for
all m ∈ supp(x). For general x ∈ `r and ε > 0 we pick x′ ∈ c00 with kx − x′k < ε. Since
kx′ + yn k − ε ≤ kx + yn k ≤ kx′ + yn k + ε for all n we find

  (kx′k^r + lim sup_{n→∞} kyn k^r)^{1/r} − ε ≤ lim sup_{n→∞} kx + yn k ≤ (kx′k^r + lim sup_{n→∞} kyn k^r)^{1/r} + ε.

Combined with the fact that kx′k → kxk as ε → 0, (B.4) follows.
The proof of (ii) is essentially the same. □
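The mechanism behind (B.4) is that a weak null-sequence eventually carries almost no mass on supp(x), so the norms add r-th-power-wise as they would for disjoint supports. A minimal numerical sketch with yn = δn (a weak null-sequence in `r for r > 1):

```python
import numpy as np

r, M = 2.0, 100
x = np.zeros(M)
x[:3] = [1.0, -2.0, 0.5]                      # x with finite support

def norm_r(v):
    """The `r norm of a (truncated) sequence."""
    return (np.abs(v) ** r).sum() ** (1.0 / r)

# Once n has left supp(x), the supports of x and y_n = delta_n are
# disjoint, so the r-th powers of the norms add exactly, as in (B.4):
for n in range(10, 20):
    y = np.zeros(M)
    y[n] = 1.0
    assert np.isclose(norm_r(x + y) ** r, norm_r(x) ** r + norm_r(y) ** r)
```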

Proof of Theorem B.30. By Exercise 9.23, `q is reflexive for 1 < q < ∞, thus the same holds for
every closed V ⊆ `q by Theorem 9.27. If V ⊆ c0 is closed then V∗ ∼= c0∗/V⊥ by Exercise 9.15.
Since c0∗ ∼= `1 is separable, the same holds for V∗ by Exercise 6.5. In either case, V is reflexive
or has separable dual. Thus compactness of A ∈ B(V, `p) follows from Theorem 12.28(iv) if we
prove sequential weak-norm continuity. It suffices to prove that xn → 0 weakly implies kAxn k → 0,
and we may clearly assume kAk = 1. If ε > 0 we pick yε ∈ V with kyε k = 1 and kAyε k > 1 − ε.
Let t > 0. In the case V ⊆ `q we apply Lemma B.31 to both sides of the inequality
kAyε + A(txn)k ≤ kyε + txn k, obtaining

  (kAyε k^p + t^p lim sup_{n→∞} kAxn k^p)^{1/p} ≤ (kyε k^q + t^q lim sup_{n→∞} kxn k^q)^{1/q}. (B.6)

Since {xn} is weakly convergent, Exercise 10.6 gives kxn k ≤ M ∀n. With kyε k = 1 and kAyε k >
1 − ε, (B.6) becomes

  lim sup_{n→∞} kAxn k^p ≤ t^{−p} [(1 + t^q M^q)^{p/q} − (1 − ε)^p],
which holds for all t > 0 and ε ∈ (0, 1). Putting t = ε^{1/q} and using (1 + x)^α = 1 + αx + o(x) as
x ↘ 0, we have

  lim sup_{n→∞} kAxn k^p ≤ ε^{−p/q} [(1 + εM^q)^{p/q} − (1 − pε) + o(ε)] = ε^{1−p/q} ((p/q) M^q + p + o(1)),

which vanishes as ε → 0 since p < q. This finishes the proof in the case V ⊆ `q.
In the case V ⊆ c0 we proceed similarly, but use part (ii) of the Lemma on the r.h.s.,
obtaining

  (kAyε k^p + t^p lim sup_{n→∞} kAxn k^p)^{1/p} ≤ max(kyε k, t lim sup_{n→∞} kxn k),

leading similarly to the above to

  lim sup_{n→∞} kAxn k^p ≤ t^{−p} [max(1, tM)^p − (1 − ε)^p].

Putting t = ε^{1/2p}, for ε and thus t small enough (so that tM ≤ 1), this becomes

  lim sup_{n→∞} kAxn k^p ≤ ε^{−1/2} [1 − (1 − ε)^p] = (pε + o(ε))/ε^{1/2},

which again vanishes as ε → 0. □

B.32 Corollary Let V, W ∈ {`p | 1 ≤ p < ∞} ∪ {c0} with V ≠ W. Then there is no
isomorphism from an infinite-dimensional subspace of V to a subspace of W. Equivalently,
every A ∈ B(V, W) is strictly singular. In particular V 6' W.

Proof. Assume K ⊆ V, L ⊆ W are infinite-dimensional closed subspaces and A ∈ B(K, L) is
an isomorphism, thus has a bounded inverse B ∈ B(L, K). Then by Theorem B.30 either A
or B is compact. But this implies that BA = idK is compact, which is impossible by the infinite-
dimensionality of K. Thus no isomorphism K ' L (let alone V ' W) can exist.
The equivalence of this statement to strict singularity was Exercise 12.23(i). □

B.33 Remark 1. One says that the spaces c0 and `p (1 ≤ p < ∞) are incomparable.
2. By the above, in particular the inclusion maps `p ,→ `q ,→ c0, where 1 ≤ p < q < ∞,
are strictly singular. But they are clearly non-compact since they send the bounded sequence
{δn}, which has no norm-convergent subsequence in any of these spaces, to itself. Taking
1 < p < q < ∞ this shows that strict singularity of an operator does not imply compactness
even if both spaces involved are reflexive. But one can prove that for all A ∈ B(`p), where
1 < p < ∞, strict singularity does imply compactness. (Cf. e.g. [1, Problem 2.4].) By an easy
reduction the same holds for all operators between not necessarily separable Hilbert spaces. □

B.4 The weak Uniform Boundedness Theorem using only ACω

The aim of this section is to give a proof of the following theorem that only uses the axiom ACω
of countable choice. The proof uses gliding humps, but less evidently than in Section B.3.5.

B.34 Theorem ZF+ACω ⇒ If E is a Banach space, F a normed space and F ⊆ B(E, F) is
pointwise bounded then F is uniformly bounded.

Proof. Assume that F is not uniformly bounded. Then the sets Fn = {A ∈ F | kAk ≥ 4^n}
are all non-empty, so that using ACω (axiom of countable choice) we can pick an An ∈ Fn for
each n ∈ N. By definition of kAn k, the sets Xn = {x ∈ E | kxk ≤ 1, kAn xk ≥ (2/3)kAn k} are all
non-empty, so that using ACω again, we can choose an xn ∈ Xn for each n ∈ N.
Applying the triangle inequality to Az = (1/2)(A(y + z) − A(y − z)) gives

  kAzk = (1/2)kA(y + z) − A(y − z)k ≤ (1/2)(kA(y + z)k + kA(y − z)k) ≤ max(kA(y + z)k, kA(y − z)k).

Applying this inequality to A = An+1, y = yn, z = ±3^{−(n+1)} xn+1 and recalling kAn+1 xn+1 k ≥ (2/3)kAn+1 k,
we see that for at least one of the signs ± we have

  kAn+1(yn ± 3^{−(n+1)} xn+1)k ≥ 3^{−(n+1)} kAn+1 xn+1 k ≥ (2/3) 3^{−(n+1)} kAn+1 k.

Thus defining a sequence {yn} ⊆ E inductively by y1 = x1 and

  yn+1 = yn + 3^{−(n+1)} xn+1 if kAn+1(yn + 3^{−(n+1)} xn+1)k ≥ (2/3) 3^{−(n+1)} kAn+1 k, else yn+1 = yn − 3^{−(n+1)} xn+1, (B.7)

we have kAn yn k ≥ (2/3) 3^{−n} kAn k for all n. (For n = 1 this is true since y1 = x1.) Since (B.7)
involves no further free choices, this inductive definition can be formalized in ZF (which we
don't spell out here, see [53]).
With (B.7) and kxn k ≤ 1 for all n, we have kyn+1 − yn k ≤ 3^{−(n+1)} ∀n. Now for all m > n

  kym − yn k ≤ Σ_{k=n}^{m−1} kyk+1 − yk k ≤ Σ_{k=n}^∞ 3^{−(k+1)} = 3^{−(n+1)} · 1/(1 − 1/3) = (1/2) 3^{−n},

so that {yn} is a Cauchy sequence. By completeness of E we have yn → y ∈ E with ky − yn k ≤
(1/2) 3^{−n}. Another use of the triangle inequality gives

  kAn yn k = kAn(y − (y − yn))k ≤ kAn yk + kAn(y − yn)k ≤ kAn yk + kAn k ky − yn k,

so that with ky − yn k ≤ (1/2) 3^{−n}, kAn yn k ≥ (2/3) 3^{−n} kAn k and kAn k ≥ 4^n for all n we finally have

  kAn yk ≥ kAn yn k − kAn k ky − yn k ≥ kAn k ((2/3) 3^{−n} − (1/2) 3^{−n}) = (1/6) 3^{−n} kAn k ≥ (1/6) (4/3)^n → ∞.

Thus y ∈ E is a witness for the failure of pointwise boundedness of F. □
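The inductive construction (B.7) can be mimicked numerically. The following sketch uses hypothetical toy data: An f = 4^n f(n) on finitely supported sequences, so kAn k = 4^n and xn = δn even attains kAn xn k = kAn k exactly. The limit point y then satisfies |An y| = (4/3)^n → ∞ for n ≥ 2 (the first step y1 = x1 has full weight 1).

```python
import numpy as np

N = 15

# Hypothetical toy data: (A_n f) = 4^n * f(n), so ||A_n|| = 4^n and
# x_n = delta_n attains the norm.
def A(n, f):
    return 4.0 ** n * f[n]

y = np.zeros(N + 2)
y[1] = 1.0                                    # y_1 = x_1 = delta_1
for n in range(1, N):
    step = np.zeros(N + 2)
    step[n + 1] = 3.0 ** (-(n + 1))           # 3^{-(n+1)} x_{n+1}
    plus, minus = y + step, y - step          # the two candidates in (B.7)
    y = plus if abs(A(n + 1, plus)) >= abs(A(n + 1, minus)) else minus

# Each A_n is bounded, yet the single limit vector y has
# |A_n y| = 4^n * 3^{-n} = (4/3)^n -> infinity for n >= 2:
for n in range(2, N + 1):
    assert np.isclose(abs(A(n, y)), (4 / 3) ** n)
```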

B.35 Remark The above argument was discovered only a few years ago and published [53] in
2017! □

B.5 Alaoglu's theorem ⇔ Tychonov ↾ T2 ⇔ Ultrafilter lemma

Most proofs in functional analysis are quite non-constructive in that they rely on the Axiom of
Choice (AC). But most results actually do not require the full strength of AC! We have seen that
the axiom ACω of countable choice suffices for proving the weak uniform boundedness theorem,
whereas the somewhat stronger axiom DCω of countable dependent choice (which is equivalent
to Baire's theorem) yields the strong uniform boundedness and open mapping theorems, as well
as Hahn-Banach for separable normed spaces.
In the proof of Alaoglu's Theorem 10.26 we have used Tychonov's theorem. The latter is
well known to be equivalent (over ZF) to the Axiom of Choice and to Zorn's lemma. (For proofs
see e.g. [108].) But an inspection of the proof shows that we used Tychonov's theorem only for
a product of Hausdorff spaces. Thus:

B.36 Theorem Over ZF, Tychonov’s theorem for Hausdorff spaces ⇒ Alaoglu’s theorem.
It turns out that Alaoglu’s theorem actually is equivalent over ZF to Tychonov’s theorem
for Hausdorff spaces, as well as to several other statements, e.g. the Ultrafilter Lemma (UL),
Alexander’s subbase lemma, and the Boolean Prime Ideal Theorem. Furthermore, this class of
equivalent statements is strictly weaker than the equivalent statements AC, Zorn and Tychonov,
in the sense that there are models of the ZF axioms in which the former statements are true,
but not the latter, see [70, 71]. We will prove Alaoglu ⇒ UL below.

B.37 Definition Let X be a set.

• A filter118 on X is a family F ⊆ P(X) of subsets satisfying:
(i) If F, G ∈ F then F ∩ G ∈ F.
(ii) If F ∈ F and G ⊇ F then G ∈ F.
(iii) ∅ ∉ F.
(iv) F ≠ ∅.
Notice that filters on X are partially ordered by inclusion (as subsets of P(X)).

• An ultrafilter (or maximal filter) on a set is a filter that is not properly contained in
another filter.

B.38 Example 1. If X is any set and x ∈ X then Fx = {Y ⊆ X | x ∈ Y} is an ultrafilter on
X. The filters Fx are called principal.
2. If (X, τ) is a topological space and x ∈ X then N ⊆ X is called a neighborhood of x if
there exists U ∈ τ such that x ∈ U ⊆ N. Now the set Nx of neighborhoods of x is a filter on X.
Only the closedness under finite intersections merits an argument: If N1, N2 ∈ Nx then there
are open Ui ⊆ Ni containing x. Then U1 ∩ U2 is an open neighborhood of x, thus in Nx, and so
is N1 ∩ N2 ⊇ U1 ∩ U2. (Nx is principal if and only if x is isolated, i.e. {x} is open.)

B.39 Lemma (Ultrafilter Lemma) ZF+AC ⇒ Every filter is contained in an ultrafilter.

Proof. Let X be a set and F a filter on X. The family F of all filters on X that contain F is
a partially ordered set w.r.t. inclusion. If C ⊆ F is a non-empty totally ordered subset of F, we claim that
the union ⋃C of all elements of C is a filter that contains F. That the union of any non-empty
family of filters has the properties (ii), (iii) and (iv) in Definition B.37 is obvious, so that only
(i) remains. Let F1, F2 ∈ ⋃C. By the total order of C, there is an F̃ ∈ C such that F1, F2 ∈ F̃
and thus F1 ∩ F2 ∈ F̃ ⊆ ⋃C. This proves requirement (i), thus ⋃C is in F and is an upper
bound for the chain C. (The empty chain has F itself as an upper bound.) Therefore Zorn's
lemma applies and gives a maximal filter F̂ containing F. Since ultrafilters are just maximal
filters, we are done. □

Ultrafilters are characterized by a quite remarkable property:

B.40 Lemma A filter F on X is an ultrafilter if and only if for every Y ⊆ X exactly one of the
alternatives Y ∈ F, X\Y ∈ F holds.
118 Filters were invented in 1937 by the French mathematician Henri Cartan (1904–2008), an important member of
the Bourbaki group. Unsurprisingly, the best reference on filters is [21]. Preference for nets or filters is sometimes
put as a question of American vs. European (in particular French) tastes, but this is simplistic. Most contemporary
research in general topology is actually done in terms of filters, not nets.

Proof. We begin by noting that we cannot have both Y ∈ F and X\Y ∈ F, since (i) would
imply ∅ = Y ∩ (X\Y) ∈ F, which is forbidden by (iii). Assume F contains Y or X\Y for every
Y ⊆ X. Then F cannot be enlarged by adding some Y ⊆ X, since either already Y ∈ F
or else X\Y ∈ F, which excludes Y ∈ F. Thus F is an ultrafilter.
Now assume that F is an ultrafilter and Y ⊆ X. If there is an F ∈ F such that F ∩ Y = ∅
then F ⊆ X\Y, and property (ii) implies X\Y ∈ F. If, on the other hand, Y ∩ F ≠ ∅ ∀F ∈ F,
then {Z ⊆ X | Z ⊇ F ∩ Y for some F ∈ F} is a filter containing F and Y. Since F is maximal, we must have Y ∈ F. □
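On a small finite set one can enumerate all filters by brute force and check both the dichotomy of Lemma B.40 and the fact from Example B.38.1 that (on a finite set) every ultrafilter is principal. A sketch:

```python
from itertools import chain, combinations

X = frozenset(range(3))
P = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(X), r) for r in range(len(X) + 1))]

def is_filter(F):
    """Conditions (i)-(iv) of Definition B.37 for a family F of subsets of X."""
    return (len(F) > 0 and frozenset() not in F
            and all(A & B in F for A in F for B in F)          # (i)
            and all(B in F for A in F for B in P if A <= B))   # (ii)

families = [frozenset(F) for r in range(1, len(P) + 1)
            for F in combinations(P, r)]
filters = [F for F in families if is_filter(F)]
ultra = [F for F in filters if not any(F < G for G in filters)]

# Lemma B.40: a filter is maximal iff for every Y exactly one of Y, X\Y is in it.
for F in filters:
    assert all((Y in F) != (X - Y in F) for Y in P) == (F in ultra)

# On a finite set every ultrafilter is principal (Example B.38.1):
assert {min(F, key=len) for F in ultra} == {frozenset({x}) for x in X}
```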

B.41 Remark If F is a filter on a set S ≠ ∅ then the map µ = χF : P(S) → {0, 1} (which
sends A ⊆ S to 1 if A ∈ F and to 0 otherwise) clearly satisfies µ(∅) = 0, µ(S) = 1, and
for disjoint A1, A2 ⊆ S we have µ(A1) + µ(A2) ≤ µ(A1 ∪ A2) ≤ 1 since A1 and A2 cannot
both be in F. If equality holds for all disjoint non-empty A1, A2, then in particular for all
∅ ≠ A1 ≠ S and A2 = S\A1. Thus either A1 or its complement A2 belongs to F. By Lemma
B.40 this characterizes the ultrafilters on S. Conversely, one easily checks that if µ is a non-zero
{0, 1}-valued finitely additive measure on S then F = {A ⊆ S | µ(A) = 1} is an ultrafilter. □

B.42 Lemma For a topological space (X, τ), the following are equivalent:
(i) (X, τ) is compact (thus every open cover has a finite subcover).
(ii) Whenever F ⊆ P(X) is a family of closed subsets of X such that ⋂F = ∅, there are
C1, …, Cn ∈ F such that C1 ∩ · · · ∩ Cn = ∅.
(iii) If F ⊆ P(X) is a family of closed subsets of X with the finite intersection property (i.e.
the intersection of any finite number of elements of F is non-empty) then ⋂F ≠ ∅.

Proof. (i) and (ii) are dualizations of each other, using de Morgan's formulas, and (iii) is the
contraposition of (ii). □

B.43 Theorem Over ZF, the following are equivalent:

(i) The Ultrafilter Lemma.
(ii) Tychonov ↾ T2: Any product of compact Hausdorff spaces is compact.
(iii) Alaoglu's theorem.
Proof. (i)⇒(ii) This is ‘just’ general topology, and we do not reproduce it here. Cf. e.g. [108].
(ii)⇒(iii) Mentioned above.
(iii)⇒(i) Let F be a filter on the set X. Then V = `∞(X, R) is a Banach space, and
Σ = (V∗)≤1 (which is a set of finitely additive measures on X, cf. Section B.3.2) is weak-∗
compact, thus weak-∗-closed, by Alaoglu's theorem. Every x ∈ X gives rise to a bounded linear
functional ϕx ∈ Σ, f ↦ f(x), with kϕx k = 1. The map ι : X → Σ, x ↦ ϕx is injective.
Now put F̄ = {cl_w∗(ι(F)) | F ∈ F} ⊆ P(Σ), where cl_w∗ denotes weak-∗ closure. If F1, …, Fn ∈ F then by injectivity of ι and the
finite intersection property of F we have ⋂_k cl_w∗(ι(Fk)) ⊇ ⋂_k ι(Fk) = ι(⋂_k Fk) ≠ ∅, so that F̄
has the finite intersection property. Since the sets cl_w∗(ι(F)) ⊆ Σ are weak-∗ closed, and Σ is
weak-∗ compact, Lemma B.42 gives ⋂F̄ ≠ ∅. Pick ψ ∈ ⋂F̄ ⊆ Σ ⊆ V∗ and define a map
µ : P(X) → R, S ↦ ψ(χS).
Now `∞(X, R) is an algebra and each ϕx is a character. Since ψ ∈ cl_w∗(ι(F)) for each F ∈ F,
it also is a character: We have ψ = limλ ϕλ, where {ϕλ} is a net of characters converging in the
weak-∗ topology, thus

  ψ(fg) = limλ ϕλ(fg) = limλ ϕλ(f)ϕλ(g) = ψ(f)ψ(g).

And χS is idempotent for each S, thus ψ(χS) = ψ(χS²) = ψ(χS)², implying µ(S) = ψ(χS) ∈
{0, 1} for all S ⊆ X. We have µ(X) = ψ(1) = 1 (since ϕx(1) = 1 ∀x), and S ∩ T = ∅ implies
χ_{S∪T} = χS + χT, so that µ(S ∪ T) = µ(S) + µ(T). Thus µ is a finitely additive {0, 1}-valued
measure on X, and we know from Remark B.41 that F̂ = {Y ⊆ X | µ(Y) = 1} is an ultrafilter
on X. If Y ∈ F then ψ ∈ ⋂F̄ = ⋂_{F∈F} cl_w∗(ι(F)) implies ψ ∈ cl_w∗(ι(Y)) = cl_w∗({ϕx | x ∈ Y}). Since
ϕx(χY) = χY(x) = 1 for all x ∈ Y, we have µ(Y) = ψ(χY) = 1, thus F ⊆ F̂. We thus have
embedded F into an ultrafilter. □

B.44 Remark 1. Combining Theorems B.43 and B.45 we obtain the curious fact that over ZF
Alaoglu’s theorem (non-reversibly) implies Hahn-Banach.
2. Our proofs of OMT/BIT/CGT and UBT only use the ZF axioms and Baire’s theorem,
the latter being equivalent to DCω . Since we have seen that Alaolgu’s theorem is equivalent
to the Ultrafilter lemma (and to Tychonov for Hausdorff spaces and various other statements),
all these theorems can be proven in the framework ZF+DCω +UL that is provably weaker than
ZF+AC, see [124]. In Section B.6.1 we will prove that this also holds for Hahn-Banach. Yet there are some results, like the Krein-Milman theorem, that cannot be proven in ZF+DCω +UL.
See Remark B.85. 2

B.6 More on convexity and Hahn-Banach matters


B.6.1 Tychonov for Hausdorff spaces implies Hahn-Banach
Having seen that Alaoglu’s theorem can be deduced over ZF from Tychonov’s theorem for
Hausdorff spaces, we now prove the same for the Hahn-Banach theorem:

B.45 Theorem [Loś & Ryll-Nardzewski (1951)] 119 Over ZF, Tychonov’s theorem for Hausdorff
spaces implies the Hahn-Banach Theorem 9.2.
Proof. Let V, p, W, ϕ be as in Theorem 9.2. Define E = ∏v∈V [−p(−v), p(v)] with the product
topology. Since we assume Tychonov’s theorem for compact Hausdorff spaces, E is compact
(and Hausdorff). Clearly every e ∈ E can be interpreted as a map V → R satisfying the bound
−p(−v) ≤ e(v) ≤ p(v) ∀v. For each v ∈ V the coordinate map e 7→ e(v) is continuous, thus
E′ = {e ∈ E | e(w) = ϕ(w) ∀w ∈ W } ⊆ E is closed. For each finite-dimensional subspace Z ⊆ V
let EZ = {e ∈ E | e ↾ Z is linear}. Again using continuity of the coordinate maps e ↦ e(v), it
follows that each
EZ = ⋂x,x′∈Z {e ∈ E | e(x + x′ ) = e(x) + e(x′ )} ∩ ⋂c∈R,x∈Z {e ∈ E | e(cx) = ce(x)} ⊆ E

is closed. If Z ⊆ V is finite-dimensional, applying Lemma 9.3 a finite number of times, we find a linear extension ψZ of ϕ to W + Z bounded by p. Defining e ∈ E by e(x) = ψZ (x) for x ∈ W + Z and e(x) = p(x) otherwise, we have e ∈ E′Z := E′ ∩ EZ , thus E′Z ≠ ∅. If Z, Z′ ⊆ V are finite-dimensional then E′Z ∩ E′Z′ ⊇ E′Z+Z′ ≠ ∅. Thus the family {E′Z | Z ⊆ V fin. dim.} of closed subsets of E has the finite intersection property. Since E is compact, Lemma B.42 gives EHB = ⋂Z fin. dim. E′Z ≠ ∅. Pick any e ∈ EHB . Now e : V → R coincides with ϕ on W , satisfies the p-bound and is linear on all finite-dimensional subspaces, thus globally. Thus it is a Hahn-Banach extension. □
119
Jerzy Loś (1920-1988), Czeslaw Ryll-Nardzewski (1926-2015). Polish mathematicians. L. mostly worked in set
theory and logic, R.-N. in functional analysis (R.-N. fixed point theorem), measure theory and probability.

B.46 Remark 1. The proof in [99] is phrased in terms of two rather more general theorems,
but we chose to sacrifice the generality and simplify the argument considerably. (There also is
a technical point: In the above proof we used distinguished coordinates p(x) to make sure that
the closed subsets E′Z are non-empty. It is not clear to this author how to draw this conclusion
(without invoking AC) in the generality of [99, Theorem 2] when the spaces {Px }x∈X0 appearing
there have nothing to do with each other. Perhaps they should be assumed pointed?)
2. In 1962, Luxemburg120 [100] deduced Hahn-Banach from the UL by use of ultraproducts
(non-standard analysis). The ultraproducts are not essential and can be removed, cf. [11] (or
[108]). The resulting proof shares some features with the above, which however remains simpler.
3. In [123] it is shown that the Hahn-Banach theorem is strictly weaker than the equivalent
statements of Tychonov for Hausdorff spaces, the Ultrafilter Lemma, etc. (But it still suffices for
proving the Banach-Tarski paradox and the existence of sets that are not Lebesgue measurable,
see [54, 117]. Thus the latter cannot be avoided without giving up Hahn-Banach.) 2

B.6.2 Minkowski functionals. Criteria for normability and local convexity


We begin with an easy fact needed in any study of convex sets in topological vector spaces:

B.47 Exercise Let V be a topological vector space and A ⊆ V convex. Prove that the interior
A⁰ and the closure Ā are convex.

B.48 Proposition Let V be a topological vector space and U a convex open neighborhood of
0. Define the ‘Minkowski functional’121 µU : V → [0, ∞) of U by

µU (x) = inf{t ≥ 0 | x ∈ tU }.

Then µU is sublinear and continuous, and U = {x ∈ V | µU (x) < 1}.


Proof. As t → ∞ we have t−1 x → 0. Since U is an open neighborhood of 0, we have t−1 x ∈ U
for t large enough. Thus µU (x) < ∞ for each x ∈ V . It is quite obvious from the definition
that µU (cx) = cµU (x) for c > 0. Thus µU is positive-homogeneous. We have µU (x) < 1 if and only if there exists t ∈ (0, 1) such that x ∈ tU . Since 0 ∈ U and U is convex, tU ⊆ U for t ∈ (0, 1), so µU (x) < 1 ⇒ x ∈ U . And if x ∈ U then openness of U and continuity of t ↦ tx imply that (1 + ε)x ∈ U for some ε > 0, thus x ∈ (1 + ε)^{-1} U and µU (x) ≤ (1 + ε)^{-1} < 1. So we have U = {x ∈ V | µU (x) < 1}.
Let x, y ∈ V , and let s, t > 0 be such that x ∈ sU , y ∈ tU . I.e. there are a, b ∈ U such that x = sa, y = tb. Thus x + y = sa + tb = (s + t) · (sa + tb)/(s + t). Since (s/(s + t))a + (t/(s + t))b ∈ U due to convexity of U , we have x + y ∈ (s + t)U . Thus µU (x + y) ≤ s + t. Since for every ε > 0 we can choose such s < µU (x) + ε and t < µU (y) + ε, the conclusion is µU (x + y) ≤ µU (x) + µU (y), thus subadditivity. Being subadditive and positive homogeneous, µU is sublinear.
Let {xι }ι∈I ⊆ V be a net converging to zero. For each n ∈ N, n−1 U is an open neighborhood
of zero. Thus there exists a ιn ∈ I such that ι ≥ ιn implies xι ∈ n−1 U and therefore, with the
definition of µU , that µU (xι ) ≤ n−1 . Thus µU (xι ) → 0, which is continuity of µU at 0 ∈ V .
If now xι → x then the subadditivity of µU gives

µU (x) − µU (x − xι ) ≤ µU (xι ) ≤ µU (x) + µU (xι − x),

and since µU (xι − x) → 0, we have µU (xι ) → µU (x), thus continuity of µU at all x ∈ V . 
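Numerically, µU can be evaluated by bisection: for convex open U ∋ 0 the set {t > 0 | x ∈ tU } is the half-line (µU (x), ∞), so membership of x in tU is monotone in t. The following Python sketch illustrates this; the helper `minkowski` and the choice of U as the Euclidean (resp. sup-norm) open unit ball in R² are assumptions of the example, under which µU (x) = ‖x‖₂ (resp. ‖x‖∞).

```python
import math

def minkowski(x, in_U, t_max=1e6, iters=60):
    """Approximate mu_U(x) = inf{t > 0 : x in t*U} by bisection.

    Assumes U is convex, open and contains 0, so that
    {t > 0 : x in t*U} is the half-line (mu_U(x), infinity)."""
    lo, hi = 0.0, t_max
    for _ in range(iters):
        mid = (lo + hi) / 2
        if in_U(tuple(c / mid for c in x)):
            hi = mid      # x in mid*U, hence mu_U(x) <= mid
        else:
            lo = mid      # x not in mid*U, hence mu_U(x) >= mid
    return hi

euclidean_ball = lambda v: math.hypot(*v) < 1.0      # open l2 unit ball
sup_ball = lambda v: max(abs(c) for c in v) < 1.0    # open sup-norm unit ball

print(minkowski((3.0, 4.0), euclidean_ball))  # ~ 5.0 = ||(3,4)||_2
print(minkowski((3.0, 4.0), sup_ball))        # ~ 4.0 = ||(3,4)||_inf
```

That both gauges recover the corresponding norms is an instance of Proposition B.51(iii): both balls are balanced, bounded, convex and open.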


120
W. A. J. Luxemburg (1929-2018). Dutch mathematician, mainly interested in non-standard analysis.
121
Also called gauge functionals, gauge deriving from Minkowski’s ‘Aich. . . ’.

B.49 Definition Let V be a topological vector space and 0 ∈ U ⊆ V . Then U is called
• balanced if x ∈ U, |λ| ≤ 1 ⇒ λx ∈ U ,
• bounded if for every open W ∋ 0 there exists λ > 0 such that λU ⊆ W .
Note that if U is convex and contains zero, multiplication by t ∈ [0, 1] sends U into itself.
Thus for checking balancedness it suffices to consider |λ| = 1.

B.50 Exercise Let (V, τ ) be a TVS, where τ = τd for a translation-invariant metric d. Prove:
(i) B(0, r) is bounded in the TVS sense for each r > 0.
(ii) X ⊆ V is bounded in the TVS sense if and only if X ⊆ B(0, r) for some r > 0.

B.51 Proposition Let (V, τ ) be a topological vector space and U a convex open neighborhood
of zero. Then
(i) The Minkowski functional µU is a seminorm if and only if U is balanced.
(ii) If U is bounded then µU (x) = 0 implies x = 0.
(iii) If U is balanced and bounded then kxk = µU (x) is a norm inducing the topology τ .
Proof. (i) Since µU is subadditive and positive-homogeneous, it is a seminorm if and only if
µU (λx) = µU (x) for all x ∈ V and λ ∈ F with |λ| = 1. If U is balanced then this is evidently
satisfied. Now assume µU (λx) = µU (x). The openness of U implies that {t > 0 | x ∈ tU } =
(µU (x), ∞). Thus if |λ| = 1 then the assumption µU (λx) = µU (x) implies that x ∈ U if and
only if λx ∈ U . Thus U is balanced.
(ii) Assume that U is bounded and that x 6= 0. Since τ is T1 , there is an open W ⊆ V such
that 0 ∈ W 63 x. Since U is bounded, there is λ > 0 such that λU ⊆ W , which clearly implies
x ∉ λU . Now the definition of µU implies µU (x) ≥ λ > 0.
(iii) Proposition B.48 and the above (i) and (ii) show that k · k = µU is a continuous norm on
V . Thus xn → 0 implies kxn k → 0. If we prove the converse implication then τ = τk·k follows
since V is a topological vector space. Let {xn } be a sequence such that kxn k → 0, and let W
be an open neighborhood of 0. Since U is bounded, there is λ > 0 such that λU ⊆ W . Now,
kxn k → 0 means that there is n0 ∈ N such that n ≥ n0 ⇒ kxn k < λ/2. With the definition of
µU this implies xn ∈ λU , thus xn ∈ λU ⊆ W for all n ≥ n0 . This proves xn → 0. 

We now know that a topological vector space is normable if the zero element has a balanced
convex bounded open neighborhood. (The converse is easy.) But this can be improved:

B.52 Lemma Let V be a topological vector space and U a convex open neighborhood of 0.
Then there exists a balanced convex open neighborhood U 0 ⊆ U of 0.

Proof. Since scalar multiplication is continuous at (0, 0), there exist ε > 0 and an open W0 ∋ 0 such that λW0 ⊆ U whenever |λ| ≤ ε. Thus with W = εW0 we have tW ⊆ U whenever |t| ≤ 1. Put Y = ∪0<|t|≤1 tW ⊆ U . By construction, Y is a balanced open neighborhood of 0 (note 0 ∈ W ⊆ Y ).
For every λ ∈ F with |λ| = 1 it is clear that λU is a convex open neighborhood of 0. Putting Z = ∩|λ|=1 λU , it is manifestly clear that Z is balanced and 0 ∈ Z. Furthermore, Z is convex (as an intersection of convex sets). Since tW ⊆ λU whenever |t| ≤ 1 and |λ| = 1, we have Y ⊆ Z, so that Z has non-empty interior Z⁰ containing 0. Now we put U′ = Z⁰ and claim that U′ has the desired properties. Clearly U′ is an open neighborhood of 0, and as the interior of a convex set it is convex (Exercise B.47). If |t| = 1 then the map Z → Z, x ↦ tx is a homeomorphism, mapping Z⁰ onto Z⁰. Thus if x ∈ Z⁰ = U′ then tx ∈ Z⁰ = U′, and by the note after Definition B.49, U′ = Z⁰ is balanced. □

Now we are in a position to prove geometric criteria for normability and local convexity of
topological vector spaces:

B.53 Theorem Let V be a topological vector space. Then V is normable if and only if there
exists a bounded convex open neighborhood of 0.
Proof. If V is normable by the norm k · k then Bk·k (0, 1) = {x ∈ V | kxk < 1} is clearly open,
convex (and balanced). To show boundedness, let W ∋ 0 be open. Then there is ε > 0 such
that B(0, ε) ⊆ W . Now clearly εB(0, 1) = B(0, ε) ⊆ W , thus B(0, 1) is bounded.
If there exists a bounded convex open neighborhood U of 0 then by Lemma B.52 we can
assume U in addition to be balanced. (The U 0 provided by the lemma is a subset of U , thus
bounded if U is bounded.) Now by Proposition B.51(iii), µU is a norm inducing the given
topology on V . 

B.54 Theorem A topological vector space (V, τ ) is locally convex in the sense of Definition
2.37 (i.e. the topology τ comes from a separating family F of seminorms) if and only if it is
Hausdorff and the zero element has an open neighborhood base consisting of convex sets.
Proof. Given a separating family F of seminorms and putting τ = τF , a basis of open neigh-
borhoods of 0 is given by the finite intersections of sets Up,ε = {x ∈ V | p(x) < ε}, where
p ∈ F, ε > 0. Each of the Up,ε is convex and open, thus also their finite intersections.
And if τ has the stated property, Lemma B.52 gives that 0 has a neighborhood base consisting
of balanced convex open sets. Defining F = {µU | U balanced convex open neighborhood of 0},
each of the µU is a continuous seminorm by Propositions B.48 and B.51. Thus if xι → 0 then
kxι kU := µU (xι ) → 0. And kxι kU → 0 for all balanced convex open U implies that xι ultimately
is in every open neighborhood of 0, thus xι → 0. Thus τ = τF , and 2.36 gives that F is
separating. 

B.55 Exercise Let 0 < p < 1.


(i) Prove that (`p (S, F), τdp ) is normable if S is finite.
(ii) Prove that the open unit ball of (`p (S, F), τdp ) does not contain any convex open neigh-
borhood of 0 if S is infinite.
(iii) Prove that (`p (S, F), τdp ) is neither normable nor locally convex if S is infinite.

B.6.3 Hahn-Banach separation theorems for locally convex spaces


The Hahn-Banach theorem in the sublinear functional version (Theorem 9.2) has an important
geometric application, namely the fact that disjoint convex sets in a topological vector space
can be separated by hyperplanes, i.e. sets H = {x ∈ X | Re ϕ(x) = t} for some ϕ ∈ V ∗ , t ∈ R.

B.56 Definition Let V be a topological vector space over F and V ∗ the space of continuous
linear functionals V → F. For A, B ⊆ V we say that A and B are
(i) separated if there exists ϕ ∈ V ∗ with Re ϕ(a) < inf b∈B Re ϕ(b) ∀a ∈ A.
(ii) strictly separated if there exist ϕ ∈ V ∗ , α ∈ R with Re ϕ(a) < α < Re ϕ(b) ∀a ∈ A, b ∈ B.
(iii) very strictly separated if there exists ϕ ∈ V ∗ with supx∈A Re ϕ(x) < inf x∈B Re ϕ(x).

B.57 Remark 1. Geometrically, (ii) and (iii) are equivalent to A, B being contained in two
disjoint open half spaces (having positive distance in case (iii)), while (i) means that A, B are
contained in an open half-space and its complement, respectively.
2. The sets A = (0, 1), B = (1, 2) in V = R are strictly separated, but not very strictly, while
A = (0, 1), B = [1, 2) are only separated. An example for two closed sets that are non-strictly
separated is A = {(x, y) | x, y > 0, xy ≥ 1}, B = {(x, y) | x ≤ 0} in V = R2 . 2
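The last pair can be checked numerically. With the (illustrative) functional ϕ(x, y) = −x one has ϕ < 0 on A and ϕ ≥ 0 on B, so A and B are separated in the sense of (i); but supa∈A ϕ(a) = inf b∈B ϕ(b) = 0, so no α fits strictly between them. A Python sketch, where the sampled points are assumptions of the example:

```python
# phi(x, y) = -x is negative on A = {(x,y) : x,y > 0, xy >= 1}
# and nonnegative on B = {(x,y) : x <= 0}.
phi = lambda p: -p[0]

A_sample = [(1.0 / n, float(n)) for n in range(1, 10001)]        # (1/n, n) lies in A
B_sample = [(-1.0 / n, 0.0) for n in range(1, 10001)] + [(0.0, 3.0)]

assert all(phi(a) < 0 for a in A_sample)     # phi(a) < 0 = inf_B phi: separated
assert all(phi(b) >= 0 for b in B_sample)
print(max(phi(a) for a in A_sample))         # -0.0001: sup_A phi creeps up to 0
```

Any α with ϕ(a) < α < ϕ(b) for all a ∈ A, b ∈ B would have to satisfy 0 ≤ α and α < 0 simultaneously, so the separation cannot be made strict.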

B.58 Theorem Let V be a topological vector space and A, B ⊆ V disjoint non-empty convex
subsets, A being open. Then
(i) A and B are separated.
(ii) If also B is open, A and B are strictly separated.
Proof. (i) Pick a0 ∈ A, b0 ∈ B and put z = b0 − a0 and U = (A − a0 ) − (B − b0 ) = A − B + z, which is a convex (as a pointwise sum of two convex sets) open (since U = ∪x∈−B−a0 +b0 (A + x)) neighborhood of 0. Let p = µU be the associated Minkowski functional. As a consequence of A ∩ B = ∅ we have 0 ∉ A − B, thus z ∉ U , and therefore p(z) ≥ 1.
Put W = Rz and define ψ : W → R, cz 7→ c. For c ≥ 0 we have ψ(cz) = c ≤ cp(z) = p(cz).
Thus by sublinearity of p and Theorem 9.2 there exists a linear functional ϕ : V → R satisfying
ϕ ↾ W = ψ, thus ϕ(cz) = c, and ϕ(x) ≤ p(x) ∀x ∈ V . Thus also −p(−x) ≤ −ϕ(−x) = ϕ(x),
and since x → 0 implies p(x) → 0, ϕ is continuous at zero, thus everywhere.
If now a ∈ A, b ∈ B then a − b + z ∈ U , so that p(a − b + z) < 1. Thus

ϕ(a − b) + 1 = ϕ(a − b + z) ≤ p(a − b + z) < 1,

thus ϕ(a) < ϕ(b) for all a ∈ A and b ∈ B. Thus the subsets ϕ(A), ϕ(B) of R are disjoint.
Since A, B are convex, they are connected. Consequently, ϕ(A), ϕ(B) are connected, thus
intervals. Since A is open, so is ϕ(A) (open mapping theorem). If we put s = sup ϕ(A), we have
ϕ(a) < s ≤ ϕ(b) for all a ∈ A, b ∈ B, and this is equivalent to ϕ(a) ≤ inf b∈B ϕ(b) for all a ∈ A.
F = C: Considering V as an R-vector space, apply the above to obtain a continuous R-linear functional ϕ0 : V → R such that ϕ0 (a) < inf b∈B ϕ0 (b) ∀a ∈ A. Now define ϕ : V → C, x ↦
ϕ0 (x)−iϕ0 (ix). This clearly is continuous and satisfies Re ϕ = ϕ0 , so that the desired inequality
holds. That ϕ is C-linear follows from the same argument as in the proof of Theorem 9.5.
(ii) It suffices to consider F = R. With α = inf b∈B Re ϕ(b), by (i) we have Re ϕ(a) < α for
all a ∈ A. Since B is open, ϕ(B) ⊆ R is open by Exercise 6.4 (or the open mapping theorem),
so that ϕ does not assume its infimum α on B, whence α < Re ϕ(b) ∀b ∈ B. 

B.59 Lemma If V is a locally convex space and K ⊆ U ⊆ V with K compact and U open,
there is a convex open neighborhood N ⊆ V of zero such that K + N ⊆ U .
Proof. For brevity, we only prove the Lemma for normed spaces and leave the generalization to
locally convex spaces as an Exercise.
For every x ∈ K there exists εx > 0 such that B(x, 2εx ) ⊆ U . Since {B(x, εx )}x∈K is an open cover of the compact set K, there are x1 , . . . , xn such that K ⊆ ∪_{i=1}^n B(xi , εxi ). Put ε = min(εx1 , . . . , εxn ) > 0 and N = B(0, ε), which is open and convex. Now

K + N ⊆ ∪_{i=1}^n (B(xi , εxi ) + B(0, ε)) ⊆ ∪_{i=1}^n B(xi , 2εxi ) ⊆ U,

where we used B(xi , 2εxi ) ⊆ U ∀i. 

B.60 Corollary Let V be a locally convex vector space and A, B ⊆ V disjoint non-empty
convex subsets with A compact and B closed. Then A and B are very strictly separated.
Proof. We only discuss F = R, the changes for F = C being the same as above. Applying
Lemma B.59 to K = A, U = V \B, we obtain a convex open N ∋ 0 such that A + N ⊆ V \B. It
is easy to show that A + N is open and convex. (Do it!) Applying Theorem B.58(i) to A + N
and B, we obtain ϕ ∈ V ∗ such that ϕ(a) < inf b∈B ϕ(b) ∀a ∈ A + N ⊃ A. Since ϕ assumes its
supremum on the compact set A, we have supa∈A ϕ(a) < inf b∈B ϕ(b). 

B.61 Corollary Let V be a locally convex vector space and W ⊆ V a proper closed subspace.
Then for every x ∈ V \W there exists a continuous linear functional ϕ ∈ V ∗ such that ϕ ↾ W = 0
and ϕ(x) 6= 0. In particular, for every x ∈ V \{0} there exists ϕ ∈ V ∗ with ϕ(x) 6= 0.
Proof. Let W, x as stated. Put A = {x}, B = W . Then A and B are disjoint closed convex
subsets, and A is compact. Thus they are very strictly separated by Corollary B.60, thus there
exists ϕ ∈ V ∗ such that Re ϕ(x) > supw∈W Re ϕ(w). Since W is a linear subspace, finiteness of
the supremum implies ϕ ↾ W = 0.
For the final claim, apply the above with W = {0}. 

B.6.4 First applications of the separation theorems to Banach spaces


B.62 Proposition Every norm-closed convex set in a Banach space is weakly closed.
Proof. Let V be a Banach space and X ⊆ V convex and norm-closed. If y ∈ clw (X)\X, where clw denotes the weak closure, then {y} is compact, so that applying Corollary B.60 to (V, τ‖·‖ ) and A = {y}, B = X there is a continuous linear functional ϕ ∈ V ∗ such that inf x∈X Re ϕ(x) > Re ϕ(y). If now {xι } is a net in X with xι → y weakly then ϕ(xι ) → ϕ(y), but this contradicts inf x∈X Re ϕ(x) > Re ϕ(y). Thus X is weakly closed. □

B.63 Corollary The weak and norm closures of a convex set in a Banach space coincide.
Proof. Let X ⊆ V be convex. Then the norm closure cl‖·‖ (X) also is convex, thus weakly closed by Proposition B.62, so that clw (X) ⊆ cl‖·‖ (X). Combining this with the obvious inclusion clw (X) ⊇ cl‖·‖ (X) we are done. □

B.64 Proposition If V is a Banach space, X ⊂ V is closed, convex and balanced and y ∈ V \X


then there exists ϕ ∈ V ∗ such that |ϕ(x)| ≤ 1 ∀x ∈ X and |ϕ(y)| > 1.
Proof. Let X, y be as stated. As in the proof of Proposition B.62 there exists ϕ ∈ V ∗ such that
inf x∈X Re ϕ(x) > Re ϕ(y). With ϕ0 = −ϕ this becomes Re ϕ0 (y) > supx∈X Re ϕ0 (x). Now

|ϕ0 (y)| ≥ Re ϕ0 (y) > supx∈X Re ϕ0 (x) = supx∈X |ϕ0 (x)|,

where the final equality is due to the fact that X is balanced, thus in particular closed under multiplication by all λ with |λ| = 1. If s := supx∈X |ϕ0 (x)| > 0 then s^{-1} ϕ0 has the desired properties; if s = 0 then ϕ0 vanishes on X, and any cϕ0 with c > |ϕ0 (y)|^{-1} works. □

B.65 Theorem (Goldstine) If V is a Banach space then V≤1 is σ(V ∗∗ , V ∗ )-dense in (V ∗∗ )≤1 .
Proof. We abbreviate τ = σ(V ∗∗ , V ∗ ). The unit ball (V ∗∗ )≤1 is τ -compact by Alaoglu's theorem, thus τ -closed, so that B = clτ (V≤1 ), which is convex by Exercise B.47, is contained in (V ∗∗ )≤1 . If this inclusion is strict, pick x∗∗ ∈ (V ∗∗ )≤1 \B. Then x∗∗ has a τ -open neighborhood U disjoint from B, and by Theorem B.54 there is a convex open A ⊆ U with x∗∗ ∈ A. Now Theorem B.58(i) applied to

(V ∗∗ , τ ) and A, B ⊆ V ∗∗ gives a τ = σ(V ∗∗ , V ∗ )-continuous linear functional ϕ on V ∗∗ such that Re ϕ(a) < inf b∈B Re ϕ(b) ∀a ∈ A. Now Exercise 10.23 gives ϕ ∈ V ∗ ⊆ V ∗∗∗ .
Putting ψ = −ϕ we have supb∈B Re ψ(b) < Re ψ(a) ∀a ∈ A, which is more convenient. Since ψ ∈ V ∗ and B ⊇ V≤1 , we have ‖ψ‖ ≤ supb∈B Re ψ(b). On the other hand, with x∗∗ ∈ A and ‖x∗∗ ‖ ≤ 1, we have Re ψ(x∗∗ ) ≤ |ψ(x∗∗ )| ≤ ‖x∗∗ ‖ ‖ψ‖ ≤ ‖ψ‖. Combining these findings, we have ‖ψ‖ ≤ supb∈B Re ψ(b) < Re ψ(x∗∗ ) ≤ ‖ψ‖, which is absurd. This contradiction proves clτ (V≤1 ) = (V ∗∗ )≤1 . □

B.6.5 Convex hulls and closed convex hulls


B.66 Definition If V is a vector space over F ∈ {R, C} and X ⊆ V then the convex hull
conv(X) of X is the intersection of all convex Y ⊆ V containing X. (Clearly this is the smallest
convex set containing X.)

B.67 Exercise Prove: If V is a vector space over F ∈ {R, C} and X ⊆ V then

conv(X) = { ∑_{i=1}^I ti xi | I ∈ N, xi ∈ X, ti ≥ 0, ∑_{i=1}^I ti = 1 }.   (B.8)

B.68 Theorem (Carathéodory 1911)122 If X ⊆ Rd then every point in conv(X) is a convex combination of at most d + 1 points of X. Thus (B.8) holds with fixed I = d + 1.
Proof. (Sketch) By (B.8), every element of conv(X) is a convex combination of finitely many
points of X. Now one uses the fact that, just as more than n points in Rn must be linearly
dependent, more than n + 1 points in the n-dimensional affine space must be affinely dependent.
Thus one can remove points from the convex combination until I ≤ d + 1. For details see e.g.
[96, Theorem 2.23] (or Wikipedia). 
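For d = 1 the theorem is easy to see concretely: conv(X) ⊆ [inf X, sup X], and any y in that interval is already a convex combination of just two points of X. A quick Python check, where the random data are an assumption of the example:

```python
import random

random.seed(0)
xs = [random.uniform(-5, 5) for _ in range(20)]   # 20 points of X in R (d = 1)
ts = [random.random() for _ in range(20)]
s = sum(ts)
ts = [t / s for t in ts]                          # normalized convex weights
y = sum(t * x for t, x in zip(ts, xs))            # some point of conv(X)

# Caratheodory with d = 1: rewrite y using only d + 1 = 2 points of X.
lo, hi = min(xs), max(xs)
t = (hi - y) / (hi - lo)                          # weight placed on lo
assert 0.0 <= t <= 1.0
assert abs(t * lo + (1.0 - t) * hi - y) < 1e-9    # y = t*lo + (1-t)*hi
print(t, 1.0 - t)
```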

B.69 Corollary If X ⊆ Rd is compact then conv(X) is compact.


Proof. The simplices ∆d = {(t0 , . . . , td ) | ti ≥ 0, ∑i ti = 1} are compact, and by Carathéodory's theorem, conv(X) is the image of an obvious continuous map X^{d+1} × ∆d → Rd . If X is compact, so is X^{d+1} × ∆d , so that conv(X) is compact, thus closed. □

We will see below that closedness of conv(X) does not follow from closedness of X, and for
infinite-dimensional V not even from compactness of X.

B.70 Exercise Prove: If V is a topological vector space and X ⊆ V is open then conv(X) is
open.
B.71 Exercise Let X = {(x, y) | x ∈ R, y ≥ 1/(1 + x²)} ⊂ R² (which is closed). Prove that conv(X) = {(x, y) | x ∈ R, y > 0} (which is open, but not closed).

B.72 Exercise Let V = `2 (N, R) with standard basis {δn }. Prove:


(i) X = {n−1 δn | n ∈ N} ∪ {0} is compact.
(ii) conv(X) ⊆ V is not closed (thus not compact).
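For part (ii) one possible witness (an assumption of this sketch) is x = ∑n 2^{−n} n^{−1} δn : the partial sums ∑n≤N 2^{−n} n^{−1} δn are convex combinations of elements of X (weights 2^{−1}, . . . , 2^{−N} on the points n^{−1}δn plus the leftover weight 2^{−N} on 0 ∈ X), and they converge to x in ℓ², while x has infinitely many nonzero coordinates and hence is no finite convex combination. The tail norms can be computed in Python:

```python
# ||x - y_N||_2 = sqrt(sum_{n > N} (2^{-n} / n)^2), where y_N is the N-th
# partial sum of x = sum_n 2^{-n} delta_n / n. The series is truncated at a
# cutoff where the remaining terms are negligible.
def dist_to_limit(N, cutoff=200):
    return sum((2.0 ** (-n) / n) ** 2 for n in range(N + 1, cutoff)) ** 0.5

for N in (1, 5, 10, 20):
    print(N, dist_to_limit(N))   # distances decrease toward 0
```

So x lies in the ℓ²-closure of conv(X) but not in conv(X) itself.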
122
Constantin Carathéodory (1873-1950), Greek mathematician with many contributions to real and complex analysis
and other areas and author of many textbooks.

B.73 Exercise Let V be a topological vector space and X ⊆ V . Prove that conv(X) coincides
with the intersection of all closed convex sets that contain X.

B.74 Definition If V is a topological vector space and X ⊆ V we call the closure of conv(X) the closed convex hull of X, but mostly write conv(X) for readability.

B.75 Proposition (Mazur's compactness theorem (S. Mazur 1930)) Let V be a normed space and X ⊆ V . Then
(i) If X is totally bounded then so is conv(X).
(ii) If V is Banach and X is precompact then conv(X) is precompact, i.e. the closed convex hull conv(X) is compact.
Proof. (i) Let X ⊆ V be totally bounded, and let ε > 0. Then there are z1 , . . . , zK ∈ X such that X ⊆ ∪_{k=1}^K B(zk , ε). Now C = conv({z1 , . . . , zK }) ⊆ V is compact as the image in V of the compact set {(t1 , . . . , tK ) | tk ≥ 0, ∑_{k=1}^K tk = 1} ⊆ R^K under the continuous map (t1 , . . . , tK ) ↦ ∑_{k=1}^K tk zk . Thus there are y1 , . . . , yL ∈ C ⊆ conv(X) such that C ⊆ ∪_{l=1}^L B(yl , ε).
If now y ∈ conv(X) then by Exercise B.67 we have y = ∑_{m=1}^M tm xm for certain x1 , . . . , xM ∈ X, t1 , . . . , tM ≥ 0 with ∑_{m=1}^M tm = 1. For each m pick km such that ‖xm − zkm ‖ < ε. Putting y′ = ∑_{m=1}^M tm zkm ∈ C, we have ‖y − y′ ‖ = ‖∑_{m=1}^M tm (xm − zkm )‖ ≤ ∑_{m=1}^M tm ‖xm − zkm ‖ < ε. Picking l such that ‖y′ − yl ‖ < ε, we have ‖y − yl ‖ ≤ ‖y − y′ ‖ + ‖y′ − yl ‖ < 2ε. This proves conv(X) ⊆ ∪_{l=1}^L B(yl , 2ε). Since ε > 0 was arbitrary, conv(X) is totally bounded.
(ii) Since V is complete, total boundedness and precompactness of its subsets are equivalent by Exercise A.43. Thus conv(X) is precompact by (i), and by definition this means that its closure conv(X) is compact. □
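The inequality at the heart of part (i) — replacing each xm in a convex combination by a nearby net point moves the combination by less than ε — can be tested in R²; the random data and the greedy net construction below are assumptions of this sketch:

```python
import math
import random

random.seed(2)
X = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
eps = 0.3

net = []                      # greedy eps-net {z_k} of X
for x in X:
    if all(math.dist(x, z) >= eps for z in net):
        net.append(x)
assert all(min(math.dist(x, z) for z in net) < eps for x in X)

ts = [random.random() for _ in range(len(X))]
s = sum(ts)
ts = [t / s for t in ts]      # convex weights
y = (sum(t * x[0] for t, x in zip(ts, X)), sum(t * x[1] for t, x in zip(ts, X)))

# replace each x_m by its nearest net point z_{k_m}
zs = [min(net, key=lambda z, x=x: math.dist(x, z)) for x in X]
y2 = (sum(t * z[0] for t, z in zip(ts, zs)), sum(t * z[1] for t, z in zip(ts, zs)))

# ||y - y'|| <= sum_m t_m ||x_m - z_{k_m}|| < eps
assert math.dist(y, y2) < eps
print(math.dist(y, y2), "<", eps)
```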

B.76 Remark Giving a description of conv(X) that is similarly explicit as (B.8) is difficult. The convex hull of X = {x1 , . . . , xn } is compact, thus coincides with conv(X). Now assume X = {x1 , x2 , . . .} is a bounded sequence in a Banach space V , i.e. ‖xi ‖ ≤ C ∀i. Then for all {ti ≥ 0} with ∑i ti = 1 we have ∑i ti ‖xi ‖ ≤ C, thus ∑i ti xi converges absolutely to some x ∈ V . It is easy to see that x ∈ conv(X), but in general it is not clear that every x ∈ conv(X) is of the form ∑i ti xi . This can be proven if the sequence {xn } is convergent. See [102, Lemma 3.4.29] for the proof due to Grothendieck. 2

B.6.6 Extreme points and faces of convex sets. The Krein-Milman theorem
B.77 Definition Let V be a vector space over F ∈ {R, C} and X ⊆ V convex.
(i) A subset F ⊆ X is a face of X if it is convex and tx + (1 − t)y ∈ F with x, y ∈ X, 0 < t < 1
implies x, y ∈ F .
(ii) x ∈ X is an extreme point if {x} is a face. (Thus x is not a non-trivial convex combination of two points of X.)
The set of extreme points of X is denoted E(X).

B.78 Exercise If X ⊆ R2 is a convex polygon (non-degenerate, i.e. with non-empty interior),


prove that the faces of X are X, the edges of the boundary and the corner singletons.

B.79 Exercise Let X be a convex set. Prove that x ∈ X is an extreme point if and only if
X\{x} is convex.

B.80 Exercise Let x ∈ F ⊆ X ⊆ V , where V is a vector space, X is convex, F is a face of X


and x an extreme point of F . Prove that x is an extreme point of X. (Thus E(F ) ⊆ E(X).)

B.81 Exercise Let V = R3 and let X be the closed convex hull of

S = {(x, y, 0) | (x − 1)2 + y 2 = 1} ∪ {(0, 0, 1), (0, 0, −1)}.

Determine the set E of extreme points of X and show that it is not closed.

B.82 Exercise Show that the closed unit ball in the Banach space c0 has no extreme points.

B.83 Exercise Show that the closed unit ball in the Banach space L1 ([0, 1], λ), where λ is
Lebesgue measure on the Borel σ-algebra, has no extreme points.

B.84 Theorem (Krein-Milman 1940) 123 Every non-empty compact convex subset of a lo-
cally convex space is the closed convex hull conv(E) of its set E of extreme points. (In particular
E 6= ∅.)
Proof. Let V be locally convex and X ⊆ V compact and convex. We may assume X ≠ ∅ since otherwise there is nothing to prove. Let F be the set of compact faces of X. Then X ∈ F, thus F ≠ ∅. Partially order F by reverse inclusion. If C ⊆ F is a chain (totally ordered subset) then it has the finite intersection property, so that F0 = ∩C is non-empty by Lemma B.42. Assume x ∈ F0 satisfies x = ty + (1 − t)z, where 0 < t < 1 and y, z ∈ X. Since every G ∈ C contains x and is a face, it follows that y, z ∈ G. Thus y, z ∈ ∩G∈C G = F0 , proving that F0 is a face; being a closed subset of the compact set X, it is also compact. Thus F0 is an upper bound for C, so that by Zorn's lemma F has a maximal element F , thus a face that is minimal (in the sense of not having a proper subset that is a compact face).
We claim that F is a singleton. Assuming x, y ∈ F with x ≠ y, Corollary B.61 provides a continuous linear functional ϕ ∈ V ∗ such that ϕ(x − y) ≠ 0. Multiplying by a phase we can achieve Re ϕ(x) ≠ Re ϕ(y). Put M = supw∈F Re ϕ(w) and F′ = {w ∈ F | Re ϕ(w) = M }. Then F′ ≠ ∅ since Re ϕ assumes its supremum on the compact set F . Now F′ ⊆ F is a closed subset and convex by linearity of ϕ. If z, z′ ∈ F, 0 < t < 1 are such that tz + (1 − t)z′ ∈ F′ then in view of tRe ϕ(z) + (1 − t)Re ϕ(z′ ) = M we must have Re ϕ(z) = Re ϕ(z′ ) = M , thus z, z′ ∈ F′ , proving that F′ ⊆ F is a compact face. Since the face F is minimal by construction, we have F′ = F . Thus x, y ∈ F′ , implying the contradiction Re ϕ(x) = M = Re ϕ(y). Thus F is a singleton. Since its element is an extreme point, we have E ≠ ∅.
It remains to prove that X0 = conv(E) equals X. If this is not true, picking z ∈ X\X0 , Corollary B.60 provides a ϕ ∈ V ∗ such that supx∈X0 Re ϕ(x) < Re ϕ(z). Define M = supx∈X Re ϕ(x) and F = {x ∈ X | Re ϕ(x) = M }. Since X is compact, Re ϕ assumes its supremum, thus F ≠ ∅. As before, F is a compact face of X, and applying the above to F there exists an extreme point y ∈ F . Now Re ϕ(y) = M . By Exercise B.80, y is an extreme point of X, thus y ∈ E ⊆ X0 . This implies the contradiction M = Re ϕ(y) ≤ supx∈X0 Re ϕ(x) < Re ϕ(z) ≤ M . Thus conv(E) = X. □

B.85 Remark 1. With a little extra effort one can prove E ⊆ S for every closed set S ⊆ X
such that conv(S) = X.
2. Exercise B.82 shows that ‘compact’ cannot be replaced by ‘closed bounded’ in Theorem
B.84.
3. Also the local convexity of V is essential: In the metrizable, but not locally convex, TVS
Lp ([0, 1]) with 0 < p < 1 there exist [137] compact convex sets without extreme points. 2
123
Mark Grigorievich Krein (1907-1989), David Milman (1912-1982), Soviet mathematicians. (Milman emigrated to
Israel.) Among other things, Krein is also known for the Tannaka-Krein duality theory for compact groups, Milman
e.g. for the M.-Pettis theorem (cf. Section B.6.8).

B.86 Corollary Let V be a Banach space. Then
(i) (V ∗ )≤1 = convw∗ (E((V ∗ )≤1 )) (weak-∗ closure of the convex hull).
(ii) If V is reflexive then V≤1 = convw (E(V≤1 )) (weak closure of the convex hull).
Proof. Since the weak and weak-∗ topologies are locally convex, this follows by combining Krein-Milman with Alaoglu's theorem for (i) and with the weak compactness of V≤1 for (ii). □

This proves that the closed unit ball of a Banach space has ‘enough’ extreme points whenever
V is reflexive or a dual space. (In view of this Exercise B.82 again shows that c0 is not a dual
space.) While `1 and `∞ are non-reflexive, they are dual spaces, so that their unit balls have
extreme points. They can be identified:

B.87 Exercise Let S be a set and F ∈ {R, C}. Prove:


(i) For V = `1 (S, F), the set of extreme points of V≤1 is {cδs | s ∈ S, c ∈ F, |c| = 1}.
(ii) For V = `∞ (S, F), the set of extreme points of V≤1 is {f ∈ V | |f (s)| = 1 ∀s ∈ S}.
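For finite S part (ii) can be probed numerically (the choice S = {1, 2, 3} and the random sampling are assumptions of this sketch): if |f (s)| = 1 for all s and f = (g + h)/2 with g, h in the unit ball, then each coordinate forces g(s) = h(s) = f (s); so setting h = 2f − g pushes h out of the ball for every sampled g ≠ f .

```python
import random

random.seed(1)
f = [1.0, -1.0, 1.0]              # |f(s)| = 1 for all s in S = {1, 2, 3}
hits = 0
for _ in range(10000):
    g = [random.uniform(-1, 1) for _ in range(3)]   # a point of the unit ball
    h = [2.0 * fs - gs for fs, gs in zip(f, g)]     # so that (g + h)/2 = f
    if max(abs(c) for c in h) <= 1.0 and g != f:
        hits += 1
print(hits)  # 0: no non-trivial midpoint representation of f was found
```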

B.88 Remark 1. The statement that the closed unit ball (V ∗ )≤1 has extreme points for every Banach space V implies AC. This is proven in less than a page in [13]. In view of Corollary B.86 we thus have the implication Alaoglu+KM ⇒ AC. (But HB+KM ⇏ AC.) Recall that over
ZF, Alaoglu’s theorem is equivalent to the Ultrafilter Lemma, Tychonov’s theorem for Hausdorff
spaces, etc. Since ZF+UL+DCω is provably weaker than AC, it follows that the Krein-Milman
theorem cannot be proven in the framework ZF+UL+DCω (which does suffice for OMT, UBT,
HBT and Alaoglu).
2. In Exercise B.67 we found a representation for all elements of the convex hull of a set.
Doing a similar thing for the closed convex hull is much harder. One solution is the following
Choquet-Bishop-de Leeuw124 theorem: If V is a locally convex space, X ⊂ V is compact and convex, and x ∈ X then there is a probability measure µ on the set E of extreme points such that x = ∫E y dµ(y) as a weak integral: For every ϕ ∈ V ∗ we have ϕ(x) = ∫E ϕ(y) dµ(y). For a thorough discussion of the rather technical subject see [119].


3. Using the Choquet-Bishop-de Leeuw theorem, one easily proves Rainwater's125 theorem: If {xn } is a (norm-)bounded sequence in a Banach space, then for the weak convergence xn → x it is sufficient that ϕ(xn ) → ϕ(x) for all ϕ ∈ E((V ∗ )≤1 ).

4. If V is locally convex and X ⊂ V compact convex then x ∈ X is called an exposed point


if there is a ϕ ∈ V ∗ such that Re ϕ(x) > Re ϕ(y) ∀y ∈ X\{x}. It is easy to see that every
exposed point is extremal, but that the converse need not hold. Still, one can prove that X is
the closed convex hull of its set of exposed points provided X is metrizable, see [8]. Clearly this
applies if V is Banach.
5. There is much more to be said about the relation between convex sets and linear (and
non-linear!) functionals. We want at least to mention another important result of Bishop and
Phelps: If V is a Banach space over R and X ⊂ V is non-empty, convex, closed and bounded
then the set of bounded linear functionals that assume their supremum on X is (norm-)dense
in V ∗ :
cl‖·‖ {ϕ ∈ V ∗ | ∃x0 ∈ X : |ϕ(x0 )| = supx∈X |ϕ(x)|} = V ∗ .
124
Gustave Choquet (1915-2006), French analyst, proved this under the assumption of metrizability of X. This hypothesis was removed by the American mathematicians Errett Albert Bishop (1928-1983), encountered earlier,
and Karel de Leeuw (1930-1978, murdered by a PhD student).
125
John Rainwater is a fictitious mathematician (see Wikipedia), but “his” results are not.

(Note that this is trivial if X is compact.) This can fail in complex Banach spaces, but it does hold for the closed unit ball in a complex Banach space, giving the result mentioned in Remark 9.25.2. See e.g. [102, Section 2.11]. 2

B.6.7 Strictly convex Banach spaces and Hahn-Banach uniqueness


If V is a Banach space, let E be the set of extreme points of the closed unit ball V≤1 . If 0 < ‖x‖ < 1 then with t = ‖x‖, y = x/‖x‖, z = 0 we have x = ty + (1 − t)z, thus x ∉ E. Similarly 0 ∉ E, thus E ⊆ V1 = {x ∈ V | ‖x‖ = 1}. The question is whether V1 ⊆ E. Assume x ∈ V1 is a non-trivial convex combination x = ty + (1 − t)z with y, z ∈ V≤1 . Then 1 = ‖x‖ ≤ t‖y‖ + (1 − t)‖z‖, which is possible only if y, z ∈ V1 . But this is all we can say in general since there are Banach spaces with E = ∅ or ∅ ≠ E ⊊ V1 , cf. Exercises B.82 and B.87.

B.89 Proposition For every Banach space V , the following are equivalent:
(i) x, y ∈ V, kxk = kyk = 1, x 6= y implies kx + yk < 2.
(ii) If x, y ∈ V satisfy kx + yk = kxk + kyk then y = 0 or x = cy with c ≥ 0.
(iii) The set of extreme points of the closed unit ball V≤1 equals V1 = {x ∈ V | kxk = 1}.
Proof. (i)⇒(ii) Let x, y ∈ V satisfy ‖x + y‖ = ‖x‖ + ‖y‖. If x = 0 or y = 0 then we are done.
By rescaling and/or exchanging if necessary we may assume 1 = ‖x‖ ≤ ‖y‖. Put z = y/‖y‖.
Then

2 ≥ ‖x + z‖ = ‖x + y − (1 − ‖y‖⁻¹)y‖ ≥ ‖x + y‖ − (1 − ‖y‖⁻¹)‖y‖ = ‖x + y‖ − ‖y‖ + 1
  = ‖x‖ + ‖y‖ − ‖y‖ + 1 = 2,

where we used ‖a − b‖ ≥ ‖a‖ − ‖b‖ and the assumptions ‖x + y‖ = ‖x‖ + ‖y‖ and ‖x‖ = 1. This
implies ‖x + z‖ = 2. Since ‖x‖ = 1 = ‖z‖ (by assumption and by construction of z), condition
(i) implies x = z = y/‖y‖. Thus y = ‖y‖x, and we have proven (ii).
(ii)⇒(i) Assume x, y ∈ V, x ≠ y and ‖x‖ = ‖y‖ = 1. If ‖x + y‖ = 2 were true then (ii) would
give x = cy with c > 0, but then ‖x‖ = ‖y‖ gives c = 1, thus the contradiction x = y. Since
‖x + y‖ ≤ 2 is obvious, we must have ‖x + y‖ < ‖x‖ + ‖y‖ = 2.
(ii)⇒(iii) Let E be the set of extreme points of V≤1. As noted above, if x ∈ V1 is a non-trivial
convex combination x = ty + (1 − t)z with y, z ∈ V≤1, we must have y, z ∈ V1, implying
‖ty + (1 − t)z‖ = 1 = ‖ty‖ + ‖(1 − t)z‖. Since (ii) holds and (1 − t)z ≠ 0, we have
ty = c(1 − t)z with c ≥ 0; comparing norms gives c = t/(1 − t), thus y = z = x. So no x ∈ V1
is a non-trivial convex combination of elements of V≤1, i.e. V1 ⊆ E, and with E ⊆ V1 we have (iii).
(iii)⇒(i) Let x, y ∈ V1 with x ≠ y. Then (x + y)/2 is a non-trivial convex combination and
therefore not in E = V1, so that ‖(x + y)/2‖ < 1. Thus ‖x + y‖ < 2, and we have (i). □

B.90 Definition A Banach space V is called strictly convex if it satisfies the equivalent con-
ditions in Proposition B.89.
Combining the above with Exercises B.82, B.87, B.83 shows that the spaces c0 , `1 , `∞ and
L1 ([0, 1], λ)
are not strictly convex. On the other hand:

B.91 Exercise Prove that `p (S, F) is strictly convex for every S and 1 < p < ∞.
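The dichotomy between 1 < p < ∞ and the endpoint p = 1 is easy to probe numerically. The following sketch (a hypothetical illustration using numpy, not part of the exercise) checks condition (i) of Proposition B.89 for the ℓ² norm on random unit vectors and exhibits its failure for the ℓ¹ norm:

```python
import numpy as np

def lp_norm(x, p):
    # finitely supported sequences suffice for this illustration
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

rng = np.random.default_rng(0)

# p = 2: distinct unit vectors satisfy ||x + y||_2 < 2 (condition (i) of B.89)
for _ in range(1000):
    x = rng.normal(size=5)
    y = rng.normal(size=5)
    x, y = x / lp_norm(x, 2), y / lp_norm(y, 2)
    if not np.allclose(x, y):
        assert lp_norm(x + y, 2) < 2.0

# p = 1: e1 != e2 are unit vectors with ||e1 + e2||_1 = 2,
# so condition (i) fails and l^1 is not strictly convex
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
assert lp_norm(e1 + e2, 1) == 2.0
```

Of course this only illustrates the phenomenon; the exercise asks for a proof for all 1 < p < ∞.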
Since every Hilbert space is isometrically isomorphic to some `2 (S, F), it is strictly convex.
This is very easy to show directly and also follows by combining Exercise 5.35 with the following:
B.92 Proposition (Taylor-Foguel) 126 Let V be a Banach space. Then the following are
equivalent:
(i) V∗ is strictly convex.
(ii) For every closed subspace W ⊆ V and ϕ ∈ W∗ there is a unique ϕ̂ ∈ V∗ with ϕ̂|W = ϕ
and ‖ϕ̂‖ = ‖ϕ‖.
Proof. (i)⇒(ii) Existence of ϕ̂ is guaranteed by Hahn-Banach. Assume W ⊆ V is a closed
subspace, ϕ ∈ W∗ and ψ1, ψ2 ∈ V∗ are such that ψ1 ≠ ψ2, but ψ1↾W = ψ2↾W = ϕ,
‖ψ1‖ = ‖ψ2‖ = ‖ϕ‖. Then ψ′ = (ψ1 + ψ2)/2 satisfies ψ′↾W = ϕ, thus

‖ϕ‖ = ‖ψ′↾W‖ ≤ ‖ψ′‖ ≤ (‖ψ1‖ + ‖ψ2‖)/2 = ‖ϕ‖

and therefore ‖ψ′‖ = ‖ϕ‖, contradicting the consequence ‖(ψ1 + ψ2)/2‖ < ‖ψ1‖ = ‖ψ2‖ of strict
convexity of V∗.
(ii)⇒(i) Assume V∗ is not strictly convex. Then there are ϕ1, ϕ2 ∈ V∗ with ϕ1 ≠ ϕ2 and
‖ϕ1‖ = ‖ϕ2‖ = 1 and ‖ϕ1 + ϕ2‖ = 2. Then W = {x ∈ V | ϕ1(x) = ϕ2(x)} ⊆ V is a closed
linear subspace and proper (since ϕ1 ≠ ϕ2). Put ψ = ϕ1|W = ϕ2|W ∈ W∗. We will prove
kψk = 1. Then ϕ1 , ϕ2 are distinct norm-preserving extensions of ψ ∈ W ∗ to V , providing a
counterexample for uniqueness of norm-preserving extensions.
Since ϕ1 − ϕ2 ≠ 0, there exists z ∈ V with ϕ1(z) − ϕ2(z) = 1. Now every x ∈ V can be
written uniquely as x = y + cz, where y ∈ W, c ∈ C: Put c = ϕ1 (x) − ϕ2 (x) and then y = x − cz.
Now it is obvious that y ∈ W. Uniqueness of such a representation follows from z ∉ W.
Since kϕ1 + ϕ2 k = 2, we can find a sequence {xn } ⊆ V with kxn k = 1 ∀n such that
ϕ1 (xn ) + ϕ2 (xn ) → 2. Since |ϕi (xn )| ≤ 1 for i = 1, 2 and all n, it follows that ϕi (xn ) → 1 for i =
1, 2. Now write xn = yn +cn z, where {yn } ⊆ W and {cn } ⊆ C. Then cn = ϕ1 (xn )−ϕ2 (xn ) → 0.
Thus kxn − yn k = |cn |kzk → 0, so that kyn k → 1. And with cn → 0 we have
lim_{n→∞} ϕ1(yn) = lim_{n→∞} ϕ1(yn + cn z) = lim_{n→∞} ϕ1(xn) = 1.

In view of {yn} ⊆ W and ϕ1|W = ψ, we have ψ(yn) = ϕ1(yn) → 1. Together with ‖yn‖ → 1
this implies ‖ψ‖ ≥ 1. Since the converse inequality is obvious, we have ‖ψ‖ = 1, as claimed. □

B.6.8 Uniform convexity and reflexivity. Duality of Lp-spaces reconsidered
B.93 Definition A Banach space V is called uniformly convex if for every ε > 0 there exists
δ(ε) > 0 such that x, y ∈ V, kxk = kyk = 1 and kx − yk ≥ ε imply k(x + y)/2k ≤ 1 − δ(ε).
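For Hilbert spaces an explicit modulus follows from the parallelogram law (a standard computation, added here for illustration): if ‖x‖ = ‖y‖ = 1 and ‖x − y‖ ≥ ε, then

```latex
\left\| \frac{x+y}{2} \right\|^2
  = \frac{\|x\|^2 + \|y\|^2}{2} - \left\| \frac{x-y}{2} \right\|^2
  \le 1 - \frac{\varepsilon^2}{4},
```

so that δ(ε) = 1 − √(1 − ε²/4) > 0 works, and every Hilbert space is uniformly convex.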
It is obvious that uniform convexity implies strict convexity. The converse is not true:

B.94 Exercise Let V = ℓ1(N, R), equipped with the norm ‖f‖ = ‖f‖1 + ‖f‖2. Prove that
(V, ‖·‖) is (i) a Banach space, (ii) strictly convex, but (iii) not uniformly convex.
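A numerical sketch of part (iii) (hypothetical code, not a replacement for a proof): unit vectors built from flat bumps with disjoint supports have negligible ℓ²-part, so their midpoints barely lose norm while their distance stays bounded away from zero.

```python
import numpy as np

def mixed_norm(f):
    # the norm of Exercise B.94: ||f|| = ||f||_1 + ||f||_2
    return np.sum(np.abs(f)) + np.sqrt(np.sum(f ** 2))

for n in (10, 100, 10000):
    f = np.zeros(2 * n); f[:n] = 1.0 / n   # flat bump on the first n coordinates
    g = np.zeros(2 * n); g[n:] = 1.0 / n   # disjoint flat bump
    x, y = f / mixed_norm(f), g / mixed_norm(g)   # unit vectors for || . ||
    # ||x - y|| stays bounded away from 0 while ||(x + y)/2|| -> 1,
    # so no delta(eps) as in Definition B.93 can exist:
    print(n, mixed_norm(x - y), mixed_norm((x + y) / 2))
```

As n grows, the first printed value approaches 2 and the second approaches 1, contradicting any uniform modulus.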

B.95 Theorem (Milman-Pettis 1938/9) Every uniformly convex Banach space is reflexive.
Proof. (Following Ringrose [133]) Assume V is uniformly convex, but not reflexive. Let S ⊆ V
and S∗∗ ⊆ V∗∗ be the unit spheres (sets of elements of norm one). Since S = S∗∗ easily implies
V = V∗∗, we have S ⊊ S∗∗. If x∗∗ ∈ S∗∗\S then by the obvious norm-closedness of S ⊆ S∗∗
126
Angus Ellis Taylor (1911-1999), American mathematician, proved (i)⇒(ii) in 1939. Shaul Reuven Foguel (1931-
2020), Israeli mathematician, proved (ii)⇒(i) in 1958.
there is ε > 0 such that B(x∗∗, 2ε) ∩ S = ∅. Since ‖x∗∗‖ = 1, we can find ϕ ∈ V∗ with ‖ϕ‖ = 1
and x∗∗(ϕ) > 1 − δ(ε)/2, thus |x∗∗(ϕ) − 1| < δ(ε)/2. Now U = {y∗∗ ∈ V∗∗ | |y∗∗(ϕ) − 1| < δ(ε)} ⊆ V∗∗ is a
τ := σ(V∗∗, V∗)-open neighborhood of x∗∗. By Goldstine’s Theorem B.65, V≤1 ⊆ (V∗∗)≤1 is
τ-dense. If {xα} ⊆ V≤1 is a net τ-converging to x ∈ S∗∗ then ‖xα‖ → 1 (the norm is τ-lower
semicontinuous) and xα/‖xα‖ → x in τ. Thus
S ⊆ S∗∗ is τ-dense, thus S ∩ U ≠ ∅. If now y1, y2 ∈ S ∩ U then |ϕ(y1) + ϕ(y2)| > 2 − 2δ(ε). With
‖ϕ‖ = 1 this implies ‖y1 + y2‖ > 2 − 2δ(ε). Thus by uniform convexity we have ‖y1 − y2‖ < ε.
Since every net in S that τ-converges to x∗∗ ultimately lives in U, picking any y1 ∈ S ∩ U we
have ‖x∗∗ − y1‖ ≤ ε (closed norm-balls in V∗∗ are τ-closed). Since y1 ∈ S and ε < 2ε, this
contradicts the choice of ε. □

The converse of the theorem is not true: there are spaces that are reflexive and
strictly convex, but not uniformly convex, though the construction [36] is laborious. Note also that
the dual of a uniformly convex space need not be uniformly convex!

B.96 Theorem For every measure space (X, A, µ) and 1 < p < ∞, the space Lp (X, A, µ; F) is
uniformly convex and reflexive.
Proof. We follow [86]. Let 0 < ε ≤ 2^{1−p}. Then the set

Z = { (x, y) ∈ R² | |x|^p + |y|^p = 2, |(x − y)/2|^p ≥ ε }

is closed and bounded, thus compact, and non-empty since (2^{1/p}, 0) ∈ Z. Since the function
R → R, t ↦ |t|^p is strictly convex, we have |(x + y)/2|^p < (|x|^p + |y|^p)/2 whenever x ≠ y. Thus

ρ(ε) = inf_{(x,y)∈Z} [ (|x|^p + |y|^p)/2 − |(x + y)/2|^p ] > 0.
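The compactness argument guarantees ρ(ε) > 0 but gives no value; a crude grid search over the curve |x|^p + |y|^p = 2 (a hypothetical numerical sketch, here for p = 4) confirms the positivity:

```python
import numpy as np

p = 4.0
eps = 2.0 ** (1 - p) / 2          # some admissible 0 < eps <= 2^(1-p)

# sample the curve |x|^p + |y|^p = 2 (both branches y >= 0 and y <= 0)
xs = np.linspace(-2 ** (1 / p), 2 ** (1 / p), 20001)
ys = np.clip(2 - np.abs(xs) ** p, 0.0, None) ** (1 / p)

rho = np.inf
for y in (ys, -ys):
    in_Z = np.abs((xs - y) / 2) ** p >= eps        # the constraint defining Z
    gap = (np.abs(xs) ** p + np.abs(y) ** p) / 2 - np.abs((xs + y) / 2) ** p
    if in_Z.any():
        rho = min(rho, gap[in_Z].min())

print(rho)   # strictly positive, as compactness plus strict convexity predict
assert rho > 0
```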

Now by homogeneity we have

|(x − y)/2|^p ≥ ε · (|x|^p + |y|^p)/2  ⟹  ρ(ε) · (|x|^p + |y|^p)/2 ≤ (|x|^p + |y|^p)/2 − |(x + y)/2|^p.  (B.9)

Let now 0 < ε < 2^{1−p} and f, g ∈ Lp(X, A, µ) with ‖f‖_p = ‖g‖_p = 1 and ‖(f + g)/2‖_p^p > 1 − δ.
Writing f, g instead of f(x), g(x), we put

M = { x ∈ X | |(f − g)/2|^p ≥ ε · (|f|^p + |g|^p)/2 }.

Now

‖(f − g)/2‖_p^p = ∫_{X\M} |(f − g)/2|^p dµ + ∫_M |(f − g)/2|^p dµ
 ≤ ε ∫_{X\M} (|f|^p + |g|^p)/2 dµ + ∫_M (|f|^p + |g|^p)/2 dµ
 ≤ ε ∫_X (|f|^p + |g|^p)/2 dµ + (1/ρ(ε)) ∫_M [ (|f|^p + |g|^p)/2 − |(f + g)/2|^p ] dµ
 ≤ ε ∫_X (|f|^p + |g|^p)/2 dµ + (1/ρ(ε)) ∫_X [ (|f|^p + |g|^p)/2 − |(f + g)/2|^p ] dµ
 ≤ ε + 1/ρ(ε) − (1 − δ)/ρ(ε) = ε + δ/ρ(ε).
(In the second row we used the definition of M and (4.1), in the third we used (B.9), which
holds on M, in the fourth the fact that the expression in brackets is non-negative on X\M, and
finally we used the assumptions ‖f‖_p^p ≤ 1, ‖g‖_p^p ≤ 1 and ‖(f + g)/2‖_p^p > 1 − δ.) Now choosing
δ < ερ(ε) we have ‖(f − g)/2‖_p^p ≤ 2ε, thus uniform convexity (more precisely, an implication
equivalent to it).
Reflexivity now follows from Theorem B.95. □

B.97 Remark The uniform convexity of Lp for 1 < p < ∞ was first proven by Clarkson in
1936 with a fairly complicated proof. (Reflexivity was known earlier thanks to F. Riesz’ proof
of (Lp)∗ ≅ Lq.) A simpler proof, still giving optimal bounds, can be found in [73]. 2

Now we are in a position to complete the determination of Lp (X, A, µ)∗ for arbitrary measure
space (X, A, µ) and 1 < p < ∞ without invocation of the Radon-Nikodym theorem:

B.98 Corollary Let 1 < p < ∞ and (X, A, µ) any measure space. Then the canonical map
Lq (X, A, µ; F) → Lp (X, A, µ; F)∗ is an isometric bijection.
Proof. Let (X, A, µ) be any measure space, 1 < p < ∞ and q the conjugate exponent. We
abbreviate Lp (X, A, µ) to Lp . As discussed (without complete, but hopefully sufficient detail)
in Section 4.7, the map ϕ : Lq → (Lp )∗ , g 7→ ϕg is an isometry, so that only surjectivity remains
to be proven. Assume ϕ(Lq) ⊊ (Lp)∗. The subspace ϕ(Lq) being closed (since Lq is complete and ϕ is
an isometry), by Hahn-Banach there is a 0 ≠ ψ ∈ (Lp)∗∗ such that ψ↾ϕ(Lq) = 0. By reflexivity
of Lp (Theorem B.96), there is an f ∈ Lp such that ψ = ιLp(f). This implies ϕg(f) = ψ(ϕg) = 0
for all g ∈ Lq. With ϕg(f) = ∫ fg dµ = ϕ′f(g), where ϕ′ : Lp → (Lq)∗ is the canonical map,
this implies ϕ′f = 0. Since ϕ′ is an isometry, we have f = 0 and therefore ψ = 0, which is a
contradiction. Thus ϕ : Lq → (Lp)∗ is surjective. □

B.7 The Eidelheit-Chernoff theorem
It is not unreasonable to ask whether the assignment V 7→ B(V ) (or V 7→ K(V )) is injective.
This certainly is the case for finite-dimensional spaces, for the simple reason that dim B(V ) =
dim K(V ) = (dim V )2 and the fact that all normed spaces of the same finite dimension are
isomorphic.
Dropping the finiteness, we need the notion of algebra isomorphism, cf. Definition 15.1.

B.99 Theorem (Eidelheit 1940, Chernoff 1969/73) 127 Let V, W be normed spaces and
A ⊆ B(V ), B ⊆ B(W ) subalgebras containing the algebras F (V ) and F (W ), respectively, of
finite rank operators. Then every algebra isomorphism α : A → B is of the form α(A) = T AT −1
for some isomorphism T : V → W of Banach spaces.
In particular, an algebraic isomorphism K(V) ≅ K(W) or B(V) ≅ B(W) implies V ≃ W.
Proof. We closely follow [7] with some extra details. Pick a rank-one idempotent P ∈ A. Then
α(P )2 = α(P 2 ) = α(P ), thus α(P ) ∈ B is an idempotent and non-zero (since α is injective).
We claim it has rank one. [If not, pick a non-zero proper subspace W′ of α(P)W and an
idempotent Q with image W′ satisfying Qα(P) = α(P)Q = Q. Since Q ∈ F(W) ⊆ B and α is
an isomorphism, there is an idempotent P 0 ∈ A with α(P 0 ) = Q. It satisfies P 0 P = P P 0 = P 0 ,
thus is a non-zero proper subprojection of P , but this is impossible since P has rank one.]
127
Meier Eidelheit (1910-1943), Polish mathematician, killed in the Holocaust. Paul Robert Chernoff (1942-2017),
American analyst.
Pick non-zero vectors x0 ∈ P V, y0 ∈ α(P )W and an S ∈ B(V, W ) with Sx0 = y0 . [E.g., pick
any ϕ ∈ V ∗ with ϕ(x0 ) 6= 0. Then S : x 7→ ϕ(x0 )−1 ϕ(x)y0 does the job.] Define linear maps

ψ1 : AP → V, AP 7→ AP x0 , ψ2 : Bα(P ) → W, Bα(P ) 7→ Bα(P )y0 .

Now ψ1 is injective quite trivially, and it is surjective since for every x ∈ V there exists A ∈ F(V)
with Ax0 = x and since F (V ) ⊆ A. Similarly, ψ2 is a linear bijection. Since the bijection
α : A → B restricts to a bijection α0 : AP → Bα(P ), the composite T = ψ2 ◦ α0 ◦ ψ1−1 : V → W
is a linear bijection. By definition of these maps, we have

T AP x0 = α(A)α(P )y0 = α(A)α(P )Sx0 ∀A ∈ A,

which is the same as saying T AP = α(A)α(P)SP. If now A, A′ ∈ A then this implies

T AA′P = α(A)α(A′)α(P)SP = α(A)T A′P.

Thus T AA′x0 = α(A)T A′x0, and since every x ∈ V is of the form A′x0 for some A′ ∈ F(V) ⊆ A,
we have the identity T A = α(A)T (of linear maps V → W). With invertibility of T this is the
same as α(A) = T AT⁻¹.
It remains to prove that T is bounded. Pick y 0 ∈ W \{0}. For each ϕ ∈ W ∗ , we have the
finite rank operator y 0 ⊗ ϕ ∈ F (W ) : y 7→ y 0 ϕ(y). Since F (W ) ⊆ B, there exists A ∈ A such that
y 0 ⊗ ϕ = α(A) = T AT −1 . Thus A = T −1 (y 0 ⊗ ϕ)T = (T −1 y 0 ) ⊗ (ϕ ◦ T ). Since A is bounded and
T −1 y 0 6= 0, this implies that ϕ ◦ T = T t ϕ is bounded for each ϕ ∈ W ∗ . As proven in Exercise
9.32, this implies that T is bounded. 

B.100 Corollary Let V be a normed space. Then:
(i) Every automorphism of B(V) is inner, i.e. of the form A ↦ T AT⁻¹ for some T ∈ Inv B(V).
(ii) Every automorphism of K(V) extends to an automorphism of B(V).

B.101 Remark The conclusion of the theorem implies kαk ≤ kT kkT −1 k < ∞, so that α
is bounded. Thus a purely algebraic hypothesis implies a continuity statement. This is an
early instance of the subject of ‘automatic continuity’, cf. [32]. For another such statement see
Theorem B.173. 2

B.8 A bit more on invariant subspaces. Lomonosov’s theorem
A classical question of functional analysis is the invariant subspace problem, asking whether every
Banach space operator A ∈ B(V) admits a non-trivial closed A-invariant subspace (i.e. one
different from {0} and V). Several affirmative
answers are known:
• If A = 0 then every subspace is A-invariant.
• If V is a finite-dimensional complex vector space with dim V ≥ 2, the existence of eigenvalues
(or the Jordan normal form) implies that every A ∈ B(V) has a non-trivial invariant subspace.
• If V is non-separable, pick any x ∈ V\{0} and let W be the closed span of {Aⁿx | n ∈ N0}. Then
W is a non-zero closed A-invariant subspace and separable, thus proper.
• Every normal operator on a complex Hilbert space has proper invariant subspaces by
Exercise 18.7.
Thus we are left with non-normal operators on infinite-dimensional separable spaces. In 1954,
Aronszajn and Smith128 proved that every compact operator on a complex Banach space has a
non-trivial invariant subspace. Lomonosov129 used Schauder’s fixed point theorem, see Section
B.15, to prove the following stronger statement:

B.102 Theorem (Lomonosov 1973) On an infinite-dimensional complex Banach space V ,


every non-zero compact operator K has a non-trivial hyperinvariant subspace, i.e. a closed
subspace 0 6= W $ V such that AW ⊆ W for every A ∈ B(V ) that commutes with K.
Proof. By Corollary 14.4 of Fredholm’s alternative, all non-zero elements of σ(K) are eigen-
values. The eigenspace Vλ corresponding to a non-zero eigenvalue λ is finite-dimensional by
Lemma 14.1, thus proper, and is invariant not only for K but also every operator commuting
with it: If x ∈ Vλ and AK = KA then KAx = AKx = λAx, thus Ax ∈ Vλ . We are therefore
left with the case of compact K with σ(K) = {0}, i.e. the quasi-nilpotent ones. Rather than
follow Lomonosov, we give the fantastic proof of Hilden (as presented by Michaels [104]) that
avoids all fixed point theorems. It only uses that for quasi-nilpotent K the spectral radius
formula gives lim kK n k1/n = r(K) = 0, so that k(cK)n k → 0 for all c ∈ C, cf. Exercise 13.45.
Since multiplying K by a non-zero number does not change the commutant {K}′, we may assume ‖K‖ = 1.
Pick x0 ∈ V such that ‖Kx0‖ > 1. Clearly ‖x0‖ > 1. Let B = x0 + V≤1, the closed ball of
radius one around x0. Then 0 ∉ B and 0 ∉ KB = Kx0 + KV≤1 (since ‖Kx0‖ > 1 and KV≤1 ⊆ V≤1). For each y ∈ V put

Wy = {Ay | A ∈ B(V ), AK = KA}.

Now Wy is a linear subspace of V containing y, so that it is non-zero whenever y ≠ 0. If B
commutes with K then BWy ⊆ Wy. (Why?) Thus the closure of Wy is a closed hyperinvariant
subspace for K, so that we are done if we can find a y ≠ 0 for which Wy is not dense in V.
To prove this by contradiction, assume Wy is dense for all y ≠ 0. Then for each y ≠ 0 there exists
A ∈ B(V) with AK = KA and ‖Ay − x0‖ < 1. With

U(A) = {y ∈ V | ‖Ay − x0‖ < 1} = A⁻¹(B(x0, 1))

this is equivalent to the statement ⋃_{A∈B(V), AK=KA} U(A) = V\{0}. Thus the U(A) form an
open (by continuity of A) cover of the closure of KB, which is compact (by compactness of K)
and contained in V\{0}. Thus there are A1, . . . , An ∈ B(V) such that AiK = KAi for all i
and KB ⊆ ⋃_{i=1}^n U(Ai).
Since Kx0 ∈ KB, we have Kx0 ∈ U (Ai ) for some i1 , meaning kAi1 Kx0 − x0 k < 1, to
wit Ai1 Kx0 ∈ B. Then KAi1 Kx0 ∈ KB, thus KAi1 Kx0 ∈ U (Ai2 ) for some i2 , meaning
Ai2 KAi1 Kx0 ∈ B. It is clear that we can iterate this construction, obtaining a sequence {im }
such that Aim KAim−1 K · · · Ai1 Kx0 ∈ B for all m ∈ N. Since all Ai commute with K, this is
equivalent to Aim Aim−1 · · · Ai1 K m x0 ∈ B for all m. With c = max(kA1 k, . . . , kAn k) this in turn
is equivalent to

(c−1 Aim )(c−1 Aim−1 ) · · · (c−1 Ai1 )(cK)m x0 ∈ B ∀m ∈ N.

The l.h.s. of this tends to zero since ‖c⁻¹Ai‖ ≤ 1 for all i and ‖(cK)ᵐ‖ → 0 by assumption.
Since B is closed by definition, this gives 0 ∈ B, contradicting 0 ∉ B. The only way
out of this contradiction is that there exists y ≠ 0 with Wy not dense in V, whose closure then
is a non-trivial closed hyperinvariant subspace. □
128
Nachman Aronszajn (1907-1980), Polish-American mathematician who worked on functional analysis and
mathematical logic. Kennan Tayler Smith (1926-2000), American mathematician.
129
Victor Lomonosov (1946-2018), Russian-American mathematician who mostly worked on functional analysis.
B.103 Remark 1. While in finite dimensions every operator on a space of dimension ≥ 2 has
non-trivial invariant subspaces, Lomonosov’s theorem fails there since 1 is compact, while there
clearly is no non-trivial subspace that is invariant under all operators. This is also why we
required K ≠ 0 above.
2. With the Aronszajn-Smith result, the invariant subspace problem is reduced (over C) to
non-compact non-normal operators on separable spaces. In 1975/1987 Enflo (already encoun-
tered in connection with the approximation property) constructed an operator on a Banach
space having no invariant subspace [48]. By now, more examples are known, even on `1 [127],
but all live on non-reflexive spaces. It still is an open question whether all operators on reflexive
spaces, or at least on separable Hilbert spaces must have an invariant subspace! (In a paper
[49] from May 2023, Enflo claims to prove just this, but this has not yet been verified.) 2

Why should anyone be interested in the existence of invariant subspaces? The arguments
used to prove the existence of invariant subspaces in finite dimensions and for normal opera-
tors on Hilbert spaces are closely related to results (usually called ‘spectral theorem’ only for
normal operators) giving representations of those operators by standard forms. Thus proving
the existence of invariant subspaces will invariably lead to structural results on the family of
operators in question.
As we saw in Section 14.4, for a compact operator A ∈ B(V) it is not too difficult to construct
a family {Pλ}λ∈σ(A) of mutually commuting idempotents satisfying Σ_λ Pλ = 1 (converging
unconditionally in the strong operator topology) and such that each Vλ = PλV is an A-invariant
subspace with σ(A↾Vλ) = {λ}. For λ ≠ 0, each Vλ is finite-dimensional and (A − λ1)↾Vλ
subspace with σ(A  Vλ ) = {λ}. For λ 6= 0, each Vλ is finite-dimensional and A − λ1  Vλ
nilpotent. But saying something non-trivial about A  V0 requires the less trivial existence of
invariant subspaces for quasi-nilpotent compact operators given by the above theorem. Using
the latter, good results on normal forms for compact operators have indeed been proven by
Ringrose130 [134, 135] and Brodsky131 [24].

B.9 More on Fredholm operators
We now study Fredholm operators, cf. Definition 14.9, in more depth.
We begin with some considerations in pure linear algebra. If X, Y, Z are vector spaces
over the same field k and A : X → Y, B : Y → Z are linear maps with finite-dimensional kernels,
it is easy to prove

dim ker A ≤ dim ker(BA) ≤ dim ker A + dim ker B.

An analogous inequality holds for the cokernels, but there is no exact additivity of the dimensions
of the kernels or cokernels. Yet additivity does hold for the Fredholm indices! Generalizing the
definition of Fredholm operators to arbitrary vector spaces and putting ind(A) = dim ker A −
dim cokerA we have:

B.104 Proposition Let X, Y, Z be vector spaces and A : X → Y, B : Y → Z Fredholm. Then


BA is Fredholm and ind(BA) = ind(A) + ind(B).
Proof. We follow the argument in [147]. If A : X → Y with X and Y finite-dimensional, we have
dim X = dim ker A + dim AX = dim ker A + dim Y − dim cokerA, thus ind(A) = dim X − dim Y .
If also Z is finite-dimensional and B : Y → Z then

ind(BA) = dim X − dim Z = (dim X − dim Y ) + (dim Y − dim Z) = ind(A) + ind(B),


130
John Robert Ringrose (1932-), British functional analyst, working also on operator algebras.
131
Mikhail Samoilovich Brodskii (1913-1989), Soviet mathematician. Student of M. G. Krein.
proving the claim in the case of finite-dimensional spaces.
In the general case, we will find complementary subspaces X0 , X1 ⊆ X (thus X = X0 +
X1 , X0 ∩ X1 = {0}) and similarly Y0 , Y1 ⊆ Y and Z0 , Z1 ⊆ Z such that X0 , Y0 , Z0 are finite-
dimensional, AX0 ⊆ Y0, BY0 ⊆ Z0 and AX1 = Y1, BY1 = Z1 with X1 ∩ ker A = {0} = Y1 ∩ ker B.
Thus BA : X1 → Z1 is a bijection, so that with the first half of the proof we have

ind(BA) = ind(BA : X0 → Z0 ) = ind(A : X0 → Y0 ) + ind(B : Y0 → Z0 ) = ind(A) + ind(B).

It remains to construct the spaces in question. We need one fact from linear algebra: If X1 , X2 ⊆
X are linear subspaces of X with X1 ∩ X2 = {0}, there exists a complementary subspace of X1
containing X2 . (This follows by applying Zorn’s lemma to the family of subspaces of X that
contain X2 and have trivial intersection with X1 .)
We put X0 = ker(BA) = A−1 (ker B) ⊇ ker A, which has dimension at most dim ker A +
dim ker B. Letting X1 be any complement of X0 in X, we have X1 ∩ ker A = {0}. Putting
Y1 = AX1 , the map A : X1 → Y1 is a bijection, and since A : X → Y has finite cokernel, Y1
has finite codimension in Y . We have Y1 ∩ ker B = {0} (since BA  X1 is injective, thus also
B  AX1 = Y1 ). Thus there is a complement Y0 , clearly finite-dimensional, of Y1 in Y containing
ker B. Putting Z1 = BY1 , by the same argument as earlier, Z1 ⊆ Z has finite codimension.
Since ker B ⊆ Y0 by construction, we have BY0 ∩ Z1 = {0}. Thus there exists a complement
Z0 , clearly finite-dimensional, of Z1 containing BY0 . Now all our claims are satisfied. 

Note that by the first part of the proof, every A ∈ End V , where V is finite-dimensional, is
Fredholm with index zero.
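In finite dimensions the index only sees the dimensions of the spaces, which makes the proposition easy to test numerically. The sketch below (a hypothetical illustration; matrix_rank computes dim AX) verifies both ind(A) = dim X − dim Y and the additivity:

```python
import numpy as np

def index(A):
    # A represents a linear map F^n -> F^m as an m x n matrix;
    # ind(A) = dim ker A - dim coker A = (n - r) - (m - r) = n - m
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    return (n - r) - (m - r)

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 7))    # A : R^7 -> R^5, ind(A) = 7 - 5 = 2
B = rng.normal(size=(9, 5))    # B : R^5 -> R^9, ind(B) = 5 - 9 = -4
assert index(A) == 2 and index(B) == -4
assert index(B @ A) == index(A) + index(B)    # additivity as in Proposition B.104
```

Note that the value n − m is independent of the rank, which is exactly the stability that makes the index useful.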

B.105 Proposition Let V, W be Banach spaces and A ∈ B(V, W ). Then At is Fredholm if and
only if A is Fredholm. Under these equivalent hypotheses we have dim ker(At ) = dim coker(A)
and dim coker(At ) = dim ker(A), thus ind(At ) = −ind(A).
Proof. Assume that A is Fredholm, in particular it satisfies dim(W/AV ) < ∞. By Exercise
7.11 this implies closedness of AV ⊆ W . By Exercise 9.35(i), ker At = (AV )⊥ ⊆ W ∗ . And by
Exercise 6.7 (AV )⊥ is isometrically isomorphic to (W/AV )∗ as Banach spaces. Since W/AV is
finite-dimensional, we have dim ker(At ) = dim(W/AV )∗ = dim(W/AV ) = dim coker(A).
Since A has closed image, the same is true for At by Exercise 9.41. By Exercise 9.36(i)
we have ker A = (At W∗)⊤, thus (ker A)∗ ≅ ((At W∗)⊤)∗ ≅ V∗/At W∗, where the (isometric)
isomorphism is from Exercise 9.16. With finite-dimensionality of ker A we thus have dim ker(A) =
dim(ker A)∗ = dim(V ∗ /At W ∗ ) = dim coker(At ). Now it is clear that At is Fredholm with
ind(At ) = −ind(A).
Now assume that At is Fredholm. As before, this implies that At has closed image. As a
consequence of a result that we did not prove, A has closed image. Now we can argue as above,
resulting in A being Fredholm. 

B.106 Corollary Let H, H 0 be Hilbert spaces.


(i) If A ∈ B(H, H 0 ) is Fredholm then A∗ ∈ B(H 0 , H) is Fredholm with ind(A∗ ) = −ind(A).
(ii) If A ∈ B(H) is Fredholm and normal then ind(A) = 0.
Proof. (i) is immediate, combining Proposition B.105 with the definition of A∗ in terms of At ,
cf. Proposition 11.1.
(ii) If A ∈ B(H) is normal then Proposition 11.27(i) gives ker A = ker A∗ , thus dim ker A =
dim ker(A∗ ). Now dim cokerA = dim ker(A∗ ) = dim ker A, whence ind(A) = 0. 
B.107 Theorem (Atkinson) 132 Let V be an infinite-dimensional133 Banach space and A ∈
B(V ). Then the following are equivalent:
(i) A is Fredholm.
(ii) There exists a Fredholm B with ind(B) = −ind(A) and ABA = A134 such that 1 − AB and
1 − BA are finite rank idempotents.
(iii) There exists B ∈ B(V ) such that AB − 1 and BA − 1 are compact.
(iv) The image of A in the Calkin135 algebra C(V ) = B(V )/K(V ) is invertible.
Proof. (i)⇒(ii) Being Fredholm, A has finite-dimensional kernel and cokernel. Thus ker A is
complemented, and we can find a closed complement V1 ⊆ V . And since AV ⊆ V is closed by
Exercise 7.11 and has finite codimension, it has a closed complement V2 . Now A  V1 is injective,
thus A0 : V1 → AV is a bounded linear bijection. By the BIT, its inverse (A0 )−1 : AV → V1 is
bounded. Define B : V → V by B = (A0 )−1 on AV and as zero on the complement V2 of AV .
By construction B is bounded. We have ker B = V2 , thus dim ker B = dim cokerA < ∞, and
BV = V1 , thus dim cokerB = dim ker A < ∞. Thus B is Fredholm with ind(B) = −ind(A).
Now BA is the identity on V1 and zero on ker A, thus BA = 1 − P1 , where P1 is the unique
idempotent with P1V = ker A and (1 − P1)V = V1. Since ker A is finite-dimensional, P1 has
finite rank. And ABA = A(1 − P1 ) = A − AP1 = A. Similarly, AB is the identity on AV and
zero on the complement V2 of AV . Thus AB = 1 − P2 , where P2 is the idempotent with image
V2 . Since V2 is finite-dimensional, P2 has finite rank.
(ii)⇒(iii) is trivial. (iii)⇒(i) There exists B ∈ B(V ) such that AB = 1+C, BA = 1+D with
C, D compact. Since this implies ker A ⊆ ker(1+D), and Lemma 14.1 gives dim ker(1+D) < ∞,
we have dim ker A < ∞. On the other hand, (1 + C)V ⊆ AV and since (1 + C)V ⊆ V has finite
codimension by Proposition 14.7, so has AV ⊆ V , thus cokerA = V /AV is finite-dimensional.
Thus A is Fredholm.
(iii)⇔(iv) Invertibility of q(A) ∈ C(V) means that there is a B ∈ B(V) such that q(A)q(B) =
1C(V) and q(B)q(A) = 1C(V). Since K(V) ⊆ B(V) is a two-sided ideal, q is a homomorphism.
Thus q(A)q(B) = q(AB), which equals 1C(V) if and only if AB ∈ 1 + K(V). Similarly for
q(B)q(A), so that the claim follows. □

B.108 Theorem Let V be a Banach space. Then


(i) The set Fr(V ) of Fredholm operators on V is an open subset of B(V ).
(ii) The map Fr(V ) → Z, A 7→ ind(A) is continuous.
(iii) If A ∈ B(V ) is Fredholm and K ∈ K(V ) is compact then A + K is Fredholm and ind(A +
K) = ind(A).
Proof. (i) By Proposition 6.1 the quotient map q : B(V ) → B(V )/K(V ) = C(V ) is continuous
and C(V ) is a Banach algebra. Thus its set of invertibles is open by Lemma 13.19. Thus
q −1 (Inv C(V )) ⊆ B(V ) is open, and since this coincides with Fr(V ) by Theorem B.107, we are
done.
132
Frederick Valentine Atkinson (1916-2002), British mathematician.
133
For finite-dimensional V , (i)-(iii) are trivially true. The same holds for (iv) if we are willing to consider 0 as unit
element of {0}.
134
An element a of an algebra A is called regular if there exists a b ∈ A such aba = a.
135
John Williams Calkin (1909-1964). American mathematician. Worked at Los Alamos for 12 years on you guess
what.
(ii) Let A be Fredholm and pick B as in Theorem B.107(ii). If now A′ is Fredholm with
‖A − A′‖ < ‖B‖⁻¹ then ‖AB − A′B‖ < 1, so that D = 1 + A′B − AB is invertible by Lemma
13.19(ii), thus ind(D) = 0. Now DA = A + A′BA − ABA = A + A′BA − A = A′BA, using ABA =
A. Since A, A′, B, D are Fredholm, this implies ind(D) + ind(A) = ind(A′) + ind(B) + ind(A).
With ind(D) = 0 and ind(B) = −ind(A), we conclude ind(A′) = ind(A). Thus ind is locally
constant (constant on an open neighborhood of each point) and therefore continuous.
(iii) If A ∈ B(V ) is Fredholm and K is compact then q(A) ∈ C(V ) is invertible by Theorem
B.107, thus also q(A + K) = q(A) is invertible, so that A + K is Fredholm. Now t : [0, 1] →
Fr(V ), t 7→ A + tK is continuous, thus with (ii) also t 7→ ind(A + tK) is continuous. Since
a continuous map from a connected space to a discrete space is constant, we have ind(A) =
ind(A + K). 

B.109 Corollary If A ∈ B(V) is compact and λ ∈ F\{0} then A − λ1 is Fredholm with index
zero, thus dim ker(A − λ1) = dim coker(A − λ1).
Proof. Apply Theorem B.108(iii) to 1 − A/λ, noting that 1 is Fredholm with index zero, and
observe that A − λ1 = −λ(1 − A/λ) has the same kernel and cokernel. □

B.110 Proposition If V is a Banach space and A ∈ B(V ) then the following are equivalent:
(i) A is Fredholm with index zero.
(ii) There exists a compact K ∈ K(V ) such that A + K is invertible.
Proof. (ii)⇒(i) If there exists K ∈ K(V) such that A + K is invertible then with Theorem B.108(iii)
we have ind(A) = ind(A + K) = 0, since the invertible operator A + K has index zero.
(i)⇒(ii) If A is Fredholm with ind(A) = 0 then ker A and cokerA have the same finite
dimension. Now ker A has a closed complement V1 , and AV (which we know to be automatically
closed) has a finite-dimensional complement V2 . Now dim V2 = dim cokerA = dim ker A < ∞.
Thus we can find an invertible (and bounded, by finite-dimensionality) B : ker A → V2 . Since
we have the isomorphisms (ker A) ⊕ V1 ' V ' V2 ⊕ AV , we can define a bounded K : V → V
as B on ker A and zero on V1 . Since K has finite rank, it is compact. And A + K restricts to
the invertible maps B : ker A → V2 and A0 : V1 → AV and therefore is invertible. 

B.111 Exercise Let V be a Banach space and A ∈ B(V ) Fredholm. Prove: There exists a
compact K such that A + K is surjective (resp. injective) if and only if ind(A) ≥ 0 (resp. ≤ 0).
If X is a topological space, recall that π0(X) is the set of path components of X, i.e.
X/∼, where x ∼ y if and only if there exists p ∈ C([0, 1], X) with p(0) = x, p(1) = y. The ∼-
equivalence class of x is denoted [x]. A continuous map f : X → Y induces a map f∗ : π0(X) → π0(Y).

B.112 Proposition Let V be a Banach space. Then π0 (Fr(V )) admits a group structure such
that [A][B] = [AB] and a homomorphism ind : π0 (Fr(V )) → Z such that ind([A]) = ind(A).
If ι : Inv B(V) → Fr(V) is the inclusion map, we have ker ind = ι∗(π0(Inv B(V))). (Thus
the sequence π0(Inv B(V)) → π0(Fr(V)) → Z is exact.)
Proof. The set (Fr(V ), ·, 1) is a monoid (defined like a group, but without inverses). If
A, A0 , B, B 0 ∈ Fr(V ) such that A ∼ A0 and B ∼ B 0 then AB ∼ A0 B 0 , so that [A][B] = [AB]
defines a monoid structure on π0 (Fr(V )). It actually is a group with unit [1]: If A ∈ Fr(V ) then
by Theorem B.107(ii) there is a Fredholm B such that P1 = 1 − BA is a finite rank idempotent.
Define Ct = BA + tP1 . Since {Ct }t∈[0,1] is a continuous path in Fr(V ) from BA to the identity,
we have [B][A] = [BA] = [1] in π0 (Fr(V )). Similarly [A][B] = [1].
The map ind : Fr(V ) → Z is a monoid homomorphism by Proposition B.104, and the local
constancy of ind (Theorem B.108(ii)) implies that we have a well defined map π0 (Fr(V )) → Z,
also denoted by ind, such that ind([A]) = ind(A). This clearly is a group homomorphism.
The set Inv(V ) of invertible operators is a group, and the same holds for π0 (InvB(V )) by the
same argument as before. Since every invertible operator is Fredholm with index zero, the inclusion
map ι : Inv B(V) → Fr(V) induces a monoid homomorphism ι∗ : π0(Inv B(V)) → π0(Fr(V))
such that the composite ind ∘ ι∗ : π0(Inv B(V)) → Z is zero. Thus ι∗(π0(Inv B(V))) ⊆
ker ind.
It remains to prove that ind([A]) = 0 implies [A] ∈ ι∗ (π0 (InvB(V ))). On the level of oper-
ators instead of path-components this amounts to proving that every Fredholm operator with
index zero can be connected to an invertible operator by a continuous path in Fr(V ). Let
thus A ∈ B(V ) with dim ker A = dim cokerA < ∞. Since the subspaces ker V, cokerV ⊆ V
are finite-dimensional, they are complemented, giving rise to direct sum decompositions
 0  V =
A 0
V1 ⊕ ker V = V2 ⊕ cokerV with respect to which A is described by the matrix , where
0 0
A0 : V1 → V2 is invertible. Since ker V and cokerV have  0 the same
 dimensions, we can find an
A 0
invertible B : ker A → cokerA. For t ∈ [0, 1] let At = ∈ B(V ). Since At ∈ Fr(V ) for
0 tB
all t and A0 = A, while A1 is invertible, we have achieved our goal. 

We now specialize to complex Hilbert spaces.

B.113 Exercise Let H be an infinite-dimensional Hilbert space and n ∈ Z. Construct a


Fredholm operator A ∈ B(H) with ind(A) = n.

B.114 Theorem If H is a complex Hilbert space, the set of Fredholm operators of index n on
H is path-connected for every n ∈ Z. Thus Fr(H) is the disjoint union of open path-connected
components, one for each n ∈ Z. Equivalently, the map π0 (Fr(H)) → Z is a bijection.
Proof. Surjectivity of the index map π0(Fr(H)) → Z is immediate by Exercise B.113. Injectivity
follows at once from Proposition B.112 if we prove that π0(Inv B(H)) is trivial. But this is nothing
other than path-connectedness of Inv B(H), which we proved in Proposition 18.17 (for F = C). □

We close with an interesting related result:

B.115 Theorem (Feldman & Kadison 1954) Let H be a separable Hilbert space. Then
the closure of Inv B(H) in the norm topology is
{A ∈ B(H) | AH ⊆ H is non-closed or dim ker A = dim ker A∗ ∈ N0 ∪ {∞}}.

For an accessible proof (using the essential spectrum, see the next section) see [20].

B.10 Discrete and essential spectrum


B.10.1 Banach space operators
The spectra of an operator that we have defined so far are all very unstable under perturbation
of the operator, e.g. by compact operators (or by operators of small norm). This motivates
the search for interesting and relevant subsets of σ(A), called essential spectra, that have such
invariance properties. We begin with two obvious candidates:

B.116 Definition Let V be a complex Banach space and A ∈ B(V ). Define
• σess,1 (A) = {λ ∈ C | A − λ1 is not Fredholm}.
• σess,2 (A) = {λ ∈ C | A − λ1 is not Fredholm with index zero}.
(σess,1 (A) is called the Fredholm (essential) spectrum of A and σess,2 (A) the Weyl (essential)
spectrum. There are many other definitions discussed in the literature, cf. e.g. [45].)
Some immediate observations:
• Since every invertible operator is Fredholm with index zero, we have

σess,1 (A) ⊆ σess,2 (A) ⊆ σ(A).

• In view of Theorem B.108(iii) it is evident that

σess,1 (A + K) = σess,1 (A) and σess,2 (A + K) = σess,2 (A) ∀A ∈ B(V ), K ∈ K(V ).

• If V is infinite-dimensional, this implies that σess,1 (K) = σess,2 (K) = {0} for all compact
K. (If V is finite-dimensional then every A ∈ B(V ) is compact, but A − λ1 is always
Fredholm of index zero, so that both essential spectra are empty.)
• If A is Fredholm of non-zero index then 0 ∉ σess,1 (A), while 0 ∈ σess,2 (A). Thus the two
essential spectra can differ.
• By Corollary B.106(ii), σess,1 (A) = σess,2 (A) if A is a normal operator on Hilbert space.
Both essential spectra can be expressed in terms of the usual spectrum:

B.117 Lemma If V is a complex Banach space and A ∈ B(V ) then with the quotient map
Q : B(V ) → B(V )/K(V )

σess,1 (A) = σ(Q(A)),


σess,2 (A) = ⋂_{K ∈ K(V )} σ(A + K).

Thus σess,1 (A) and σess,2 (A) are closed and non-empty.
Proof. The first statement is immediate by Atkinson’s Theorem B.107 and implies that σess,1 (A)
is closed and non-empty. And λ ∈ ⋂_{K∈K(V )} σ(A + K) is equivalent to the statement that
λ ∈ σ(A + K) for all compact K, thus A + K − λ1 is non-invertible for all K ∈ K(V ).
By Proposition B.110 this is equivalent to A − λ1 not being Fredholm with index zero, thus to
λ ∈ σess,2 (A). As an intersection of closed sets, σess,2 (A) is closed. And σess,2 (A) ⊇ σess,1 (A) ≠ ∅.


σess,2 (A) is the largest part of σ(A) that is stable under compact perturbations of A:

B.118 Exercise If σ′ : B(V ) → P (C) is such that σ′ (A) ⊆ σ(A) and σ′ (A + K) = σ′ (A) for
all A ∈ B(V ) and K ∈ K(V ), prove that σ′ (A) ⊆ σess,2 (A) for all A ∈ B(V ).
In this sense, σess,2 (A) is the ‘best’ definition of essential spectrum. Yet, we will have a look at
a popular third definition, sitting between σess,2 (A) and σ(A). The theory of the Riesz projector
(Exercises 13.53, 13.66, 13.68) will play an important role. We begin with a preparatory result
similar to Proposition B.110:

B.119 Proposition Let V be a complex Banach space and A ∈ B(V ). Then the following are
equivalent:
(i) 0 is an isolated point of σ(A), and P0 V is finite-dimensional, where P0 is the Riesz projector
for λ = 0.
(ii) There are closed subspaces V1 , V2 ⊆ V with V1 + V2 = V and V1 ∩ V2 = {0}, where V1 has
non-zero finite dimension, such that AVi ⊆ Vi . With Ai = A  Vi , A1 is nilpotent, and A2
is invertible.
(iii) 0 is an isolated point of σ(A) and A is Fredholm of index zero.
(iv) 0 is an isolated point of σ(A) and A is Fredholm.
(v) A is not invertible, and there is a compact K such that AK = KA and A + K is invertible.
Proof. (i)⇒(ii) With V1 = P0 V, V2 = (1 − P0 )V , we have AVi ⊆ Vi with σ(A  V1 ) = {0} and
σ(A  V2 ) = σ(A)\{0}. Thus A2 = A  V2 is invertible and A1 = A  V1 quasi-nilpotent, thus
nilpotent by finite-dimensionality of V1 .
(ii)⇒(iii) By (ii) we have A = A1 ⊕ A2 , where A2 is invertible and V1 finite-dimensional.
Thus A1 is Fredholm of index zero, and the same holds for A. And σ(A1 ) = {0} while 0 ∉ σ(A2 ).
Since σ(A2 ) is closed, there is an open neighborhood U of zero such that U ∩ σ(A2 ) = ∅, thus
U ∩ σ(A) = {0}. Thus 0 is an isolated point of σ(A).
(iii)⇒(iv) This is trivial.
(iv)⇒(i) Since 0 ∈ σ(A) is isolated, it has a Riesz projector P0 . Let Vi , Ai be as above. Then A1
is quasi-nilpotent and A2 is invertible. Together with the fact that A is Fredholm, this implies
that A1 ∈ B(V1 ) is Fredholm. Now Exercise B.120 below gives that V1 is finite-dimensional.
(ii)⇒(v) It is clear that A is not invertible. Since A1 is nilpotent, A1 + 1V1 ∈ B(V1 ) is
invertible by Lemma 13.19. Now K = 1V1 ⊕ 0 is compact since dim V1 < ∞, commutes with
A = A1 ⊕ A2 , and A + K = (A1 + 1V1 ) ⊕ A2 is invertible.
(v)⇒(iii) By assumption, B = A + K is invertible, implying that A = B − K is Fredholm
of index zero. Since B commutes with K (and A), with Exercise 15.8(ii) we have σ(A) ⊆
σ(B) − σ(K). Since A is not invertible, we have 0 ∈ σ(A), thus there is some λ ∈ σ(B) ∩ σ(K).
Since σ(B) is closed and does not contain zero, there is an ε > 0 such that B(0, ε) ∩ σ(B) = ∅.
Since σ(K) has zero as only limit point, we see that σ(B) ∩ σ(K) is finite. If λ ∈ σ(K)\{0} and
r > 0 small enough so that B(λ, r) ∩ σ(K) = {λ} then A commutes with 1 − ((K − λ1)/r)^n
(which is invertible) for each n, thus also with the inverses and with their limit, the Riesz
projector (for K!)
Pλ = lim_{n→∞} (1 − ((K − λ1)/r)^n )^{−1} .

Thus A maps each of the spaces Vλ = Pλ V into itself, and the restrictions of A and K
commute on each Vλ . Now P = Σ_{λ∈σ(B)∩σ(K)} Pλ is a finite sum, and we put V1 = P V, V2 =
(1 − P )V . On V2 , both B = A + K and A are invertible, while A  V1 is not. Since V1 is
finite-dimensional, A is Fredholm of index zero, and zero is an isolated point of σ(A). 

B.120 Exercise Let V be a complex Banach space and A ∈ B(V ) quasi-nilpotent and Fred-
holm. Prove that V is finite-dimensional.

B.121 Definition Let V be a complex Banach space and A ∈ B(V ). Then


• The discrete spectrum σd (A) of A is the set of λ ∈ C for which A − λ1 satisfies the
equivalent conditions in Proposition B.119.

• The Browder136 essential spectrum is σess,3 (A) = σ(A)\σd (A).

B.122 Remark From property (v) in Proposition B.119 it is evident that σd (A) is definitely
not stable under compact perturbations! 2

B.123 Proposition Let V be a complex Banach space and A ∈ B(V ). Then the Browder
essential spectrum σess,3 (A) satisfies

σess,2 (A) ⊆ σess,3 (A) = ⋂_{K ∈ K(V ) ∩ {A}′} σ(A + K) ⊆ σ(A) (B.10)

and is closed and non-empty.


Proof. σess,3 (A) ⊆ σ(A) holds by definition. And σess,2 (A) ⊆ σess,3 (A) clearly follows from the fact
that one of the equivalent definitions of σess,3 (A) resulting from the above is

σess,3 (A) = {λ ∈ σ(A) | λ not isolated or A − λ1 not Fredholm of index zero}. (B.11)

If λ ∈ C is such that A − λ1 is not Fredholm of index zero then on the one hand λ ∈ σess,3 (A).
On the other hand, for all compact K, Theorem B.108(iii) gives that A + K − λ1 is not Fredholm of
index zero, thus not invertible, so that λ ∈ σ(A + K). A fortiori, λ is in the intersection
in (B.10). Since the latter is contained in σ(A), in order to complete the proof of (B.10), it
suffices to prove that “λ ∈ σess,3 (A) ⇔ λ ∈ σ(A + K) for all compact K commuting with A”
holds under the assumption that A − λ1 is Fredholm of index zero. It suffices to consider the
case λ = 0, which amounts to the statement that 0 is a non-isolated point of σ(A) if and only
if there is no compact K commuting with A such that A + K is invertible. This is exactly the
equivalence (iii)⇔(v) in Proposition B.119. 

If the Browder essential spectrum σess,3 (A) is bigger than the Weyl essential spectrum
σess,2 (A), it cannot be stable under all compact perturbations of A. But the above shows
that it still has a very nice stability property.

B.10.2 Normal Hilbert space operators. Weyl’s theorem


We now restrict our attention to normal operators on Hilbert spaces.

B.124 Theorem Let H be a complex Hilbert space and A ∈ B(H) normal. Then

σess,3 (A) = {λ ∈ σ(A) | λ not isolated or dim ker(A − λ1) = ∞} (B.12)


= σess,1 (A) = σess,2 (A). (B.13)

(We simply call this the essential spectrum σess (A).) Whenever A, A′ are normal and A − A′ is
compact, σess (A) = σess (A′ ).
Proof. If λ ∈ σ(A) is not isolated, by definition it is contained in (B.11) and in (B.12). If
λ is isolated then by Proposition 17.24 there are mutually orthogonal A-invariant subspaces
H1 , H2 ⊆ H with H = H1 ⊕ H2 such that H1 = ker(A − λ1) ≠ 0 and (A − λ1)  H2 is invertible.
Thus A − λ1 is Fredholm if and only if its restriction to H1 is Fredholm. The latter being
136
Felix Earl Browder (1927-2016). American mathematician who mostly worked on functional analysis and differ-
ential equations.

identically zero, this is equivalent to dim H1 = dim ker(A − λ1) < ∞. This proves the equality
in (B.12).
We already know that for normal A we have σess,1 (A) = σess,2 (A) ⊆ σess,3 (A). Thus to prove
equality of the three spectra we must show σess,3 (A) ⊆ σess,1 (A). Let thus λ ∈ σess,3 (A). By
(B.12), either ker(A − λ1) is infinite-dimensional or λ ∈ σ(A) is not isolated. In the first case,
A − λ1 is not Fredholm, thus λ ∈ σess,1 (A), and we are done.
We are thus reduced to the situation where ker(A − λ1) is finite-dimensional and λ is not
isolated. By Proposition 11.27(ii), we have a direct sum situation H = (ker(A − λ1)) ⊕ H 0 ,
where H 0 = (ker(A − λ1))⊥ . Since ker(A − λ1) is finite-dimensional, A − λ1 is Fredholm if and
only if (A − λ1)  H 0 is. Since (A − λ1)  H 0 is by construction injective, it has dense image
by normality and Proposition 11.27(iii). Since λ ∈ σ(A) is not isolated, Corollary 17.25 gives
that (A − λ1)  H 0 is not invertible. Since it is injective, it is not surjective. Now Exercise
7.11(iii) gives that (A − λ1)  ker(A − λ1)⊥ has infinite-dimensional cokernel. It thus is not
Fredholm, thus also A − λ1 is not Fredholm, implying λ ∈ σess,1 (A). This finishes the proof
of the equality of the three spectra for normal operators. The invariance of σess,3 (A) under
compact perturbations now follows from that of σess,2 (A). 

B.125 Exercise If H is an infinite-dimensional Hilbert space and A ∈ B(H) is normal, prove


σess (A) ≠ ∅.

B.126 Remark The essential spectrum was introduced [170, 171] in 1909/10 by Weyl137 , who
only considered self-adjoint operators, but allowed unbounded ones since he was studying dif-
ferential equations. At that time functional analysis had just started developing, and the tools
used to prove Theorem B.124 were not yet available. (The bounded inverse theorem came in 1929
and Fredholm operators and Theorem B.108 around 1950!) Weyl’s original approach to proving
invariance of σess (A) under compact perturbations was quite different (but not totally) and is
also interesting, which is why we briefly discuss it now. 2

B.127 Definition Let H be a Hilbert space and A ∈ B(H), λ ∈ C. A Weyl sequence for
(A, λ) is an orthonormal sequence {xn } ⊂ H such that k(A − λ1)xn k → 0.
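A standard example: for the diagonal operator D = diag(1, 1/2, 1/3, . . .) on ℓ², the standard basis vectors eₙ form a Weyl sequence for (D, 0), since they are orthonormal and ‖(D − 0·1)eₙ‖ = 1/n → 0. A finite truncation illustrates this (a sketch assuming numpy is available):

```python
import numpy as np

N = 200
D = np.diag(1.0 / np.arange(1, N + 1))  # truncation of diag(1, 1/2, 1/3, ...)
I = np.eye(N)
lam = 0.0

# The n-th basis vector satisfies ||(D - lam*1) e_n|| = 1/(n+1) -> 0,
# so the orthonormal basis vectors form a Weyl sequence for (D, 0).
norms = [np.linalg.norm((D - lam * I) @ I[:, n]) for n in range(N)]
assert abs(norms[0] - 1.0) < 1e-12
assert abs(norms[99] - 0.01) < 1e-12          # = 1/100
assert all(norms[n + 1] < norms[n] for n in range(N - 1))
```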

B.128 Proposition Let H be a complex Hilbert space, A ∈ B(H) normal and λ ∈ C. Then
the following are equivalent:
(i) λ ∈ σess (A).
(ii) PA (B(λ, ε))H ⊆ H is infinite-dimensional for each ε > 0. (Compare Proposition 18.20.)
(iii) There exists a Weyl sequence for (A, λ).
Proof. (i)⇒(ii) If dim ker(A − λ1) = ∞ then already PA ({λ})H = ker(A − λ1) is infinite-
dimensional. If λ is an accumulation point of eigenvalues, the linear span of the eigenspaces
ker(A − λ′1) with λ′ ∈ B(λ, ε) is infinite-dimensional for each ε > 0. If neither of these holds,
there is an ε > 0 such that the restriction of A to H′ = PA (B(λ, ε′))H ∩ (ker(A − λ1))⊥ has
purely continuous spectrum containing λ for all ε′ ∈ (0, ε). In this case, H′ ⊆ PA (B(λ, ε))H is
infinite-dimensional for all ε > 0 since otherwise A  H′ would have λ as eigenvalue. (Compare
also Exercise 18.8(ii).)
137
Hermann Weyl (1885-1955). German mathematician, who worked in many areas of mathematics and mathematical
physics, like real, complex and functional analysis, differential equations, Lie groups, quantum theory and relativity,
as well as philosophy of mathematics.

(ii)⇒(iii) By (ii), for each n ∈ N we can pick xn ∈ PA (B(λ, 1/n))H ∩ {x1 , . . . , xn−1 }⊥ with
kxn k = 1. Then {xn } is an orthonormal sequence, and k(A − λ1)xn k ≤ 1/n → 0. Thus {xn } is
a Weyl sequence.
(iii)⇒(i) Existence of a Weyl sequence implies that A − λ1 is not bounded below, thus
λ ∈ σ(A). Assume λ ∉ σess (A), thus λ ∈ σd (A). Then A = A1 ⊕ A2 with A1 − λ1H1 = 0 and
A2 − λ1H2 invertible, thus bounded below. With k(A − λ1)xn k → 0, this implies kP2 xn k → 0. Combined
with kxn k = 1 ∀n, this gives kP1 xn k → 1. Together with the fact that H1 is finite-dimensional
and the {xn } ⊆ H orthonormal, this produces a contradiction. Thus λ ∈ σess (A). 

Now we have an alternative proof for the invariance of the essential spectrum under compact
perturbations:

B.129 Theorem (Weyl) Let H be a complex Hilbert space and A, B ∈ B(H) normal such
that A − B is compact. Then σess (A) = σess (B).
Proof. If a Weyl sequence for (A, λ) exists, it is clear that λ ∈ σapp (A) = σ(A).
Assume λ ∈ σess (A). Then there exists a Weyl sequence {xn }. Now

k(B − λ1)xn k ≤ k(A − λ1)xn k + k(B − A)xn k.

Now k(A−λ1)xn k → 0 by λ ∈ σess (A), while k(B−A)xn k → 0 since {xn } is a weak null sequence
and B − A is compact, thus sequentially weak-norm continuous, cf. Section 12.2. Using Weyl’s
criterion again, λ ∈ σess (B). Thus σess (A) ⊆ σess (B). The converse inclusion follows by A ↔ B.


B.130 Remark There are many generalizations of the theorem, e.g. to unbounded operators.
Cf. e.g. [45] and [129, Section XIII.4]. These generalizations have many applications to differ-
ential equations and quantum mechanics. 2
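Weyl's theorem can also be watched numerically on truncations: a rank-one Hermitian perturbation moves individual eigenvalues, but eigenvalue interlacing keeps the accumulation behaviour intact (a finite-dimensional sketch assuming numpy; the sequence, the threshold 0.949 and the sizes are arbitrary choices):

```python
import numpy as np

N = 400
d = 1.0 - 1.0 / np.arange(1, N + 1)   # spectrum of D accumulates at 1
D = np.diag(d)
v = np.zeros(N); v[0] = 1.0
K = 0.5 * np.outer(v, v)              # rank-one (hence compact) Hermitian perturbation

ev = np.linalg.eigvalsh(D + K)
# A rank-one Hermitian perturbation changes the number of eigenvalues in a
# half-line by at most 1 (interlacing), so the accumulation at 1 survives.
n_before = int(np.sum(d > 0.949))     # entries 1 - 1/n with n >= 20
n_after = int(np.sum(ev > 0.949))
assert n_before == 381
assert abs(n_after - n_before) <= 1
```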

B.10.3 Applications
Apart from the mentioned applications of the essential spectrum to differential equations, there
are applications to operator theory. We mention a few without proofs:

B.131 Theorem (Weyl-von Neumann-Berg) If H is a separable complex Hilbert space,


A ∈ B(H) is normal and ε > 0, there is a compact K ∈ B(H) with kKk < ε such that
D = A − K is diagonal. (I.e. there exists an ONB E for H that diagonalizes D.)
For the proof, which has nothing particular to do with the essential spectrum, see e.g. [33,
Section II.4]. Combining this with the theory of the essential spectrum, we obtain:

B.132 Theorem Let H be a separable complex Hilbert space and A1 , A2 ∈ B(H) be normal.
Then σess (A1 ) = σess (A2 ) if and only if there exists a unitary U such that A2 − U A1 U ∗ is
compact. (One says A1 and A2 are ‘essentially unitarily equivalent’ or ‘compalent’).
Proof. If A2 is normal and U unitary, it is clear that U A2 U ∗ is normal and σess (U A2 U ∗ ) =
σess (A2 ). Thus if A1 − U A2 U ∗ is compact then σess (A1 ) = σess (U A2 U ∗ ) = σess (A2 ).
Now assume σess (A1 ) = σess (A2 ). By Weyl-von Neumann-Berg there are diagonal D1 , D2
and compact K1 , K2 such that Ai = Di + Ki . Since essential spectra are stable under compact
perturbations,
σess (D1 ) = σess (A1 ) = σess (A2 ) = σess (D2 ).

For a diagonal operator D = diag(dn ) we have λ ∈ σess (D) if and only if λ is an accumulation
point of {dn }, i.e. {n ∈ N | |dn − λ| < ε} is infinite for every ε > 0. Thus the eigenvalue
sequences {d1,n }, {d2,n } of D1 , D2 have the same limit points. Using this one can construct a
permutation σ of N such that |d1,n − d2,σ(n) | → 0. If {e1,n }, {e2,n } are the ONBs diagonalizing
D1 , D2 , respectively, there is a unique unitary U such that U e1,n = e2,σ(n) ∀n. Now U D1 U ∗ −D2
is compact, thus U A1 U ∗ − A2 = U D1 U ∗ + U K1 U ∗ − D2 − K2 is compact. 
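The characterization just used — for diagonal D = diag(dn ), λ ∈ σess (D) iff every ε-neighborhood of λ contains infinitely many dn — can be probed on truncations (a sketch assuming numpy; the sequence below is an arbitrary example with limit points 0 and 1):

```python
import numpy as np

K = 500
d = np.empty(2 * K)
d[0::2] = 1.0 / np.arange(1, K + 1)        # subsequence converging to 0
d[1::2] = 1.0 + 1.0 / np.arange(1, K + 1)  # subsequence converging to 1

def count_near(lam, eps=0.01):
    """Number of diagonal entries within eps of lam."""
    return int(np.sum(np.abs(d - lam) < eps))

# lam lies in sigma_ess(diag(d)) iff every eps-neighborhood of lam catches
# infinitely many d_n; on a truncation the count grows with K.
assert count_near(0.0) >= 400   # 1/n < 0.01 for n = 101, ..., 500
assert count_near(1.0) >= 400
assert count_near(0.7) == 0     # 0.7 is not a limit point of {d_n}
```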

B.133 Theorem Let H be a separable Hilbert space and A, B ∈ B(H) normal. Then the
following are equivalent:
(i) There is a sequence {Un } of unitaries such that B = limn→∞ Un AUn∗ . (‘A and B are
approximately unitarily equivalent.’)
(ii) σess (A) = σess (B) and dim ker(A − λ1) = dim ker(B − λ1) for all λ ∈ C\σess (A).
Proof. See [33, Theorem II.4.4]. 

B.134 Theorem (Brown-Douglas-Fillmore 1973) Given A ∈ B(H), we have A = N +K


with N normal and K compact if and only if A∗ A − AA∗ is compact (A is ‘essentially normal’)
and ind(A − λ1) = 0 for all λ ∉ σess,1 (A).
Proof. ⇒ is quite trivial: If A = N + K with N normal, K compact, then

A∗ A − AA∗ = (N ∗ + K ∗ )(N + K) − (N + K)(N ∗ + K ∗ ) = N ∗ N − N N ∗ + compact ∈ K(H),

and ind(A − λ1) = ind(N + K − λ1) = ind(N − λ1) = 0 whenever N − λ1 is Fredholm, i.e.
λ ∉ σess (N ) = σess,1 (A). For the much deeper converse see e.g. [33, 75]. 

The above is just the tip of an iceberg. For more, see the references given above.

B.11 Trace-class operators: L1 (H)


B.135 Definition If H is a Hilbert space, A ∈ B(H) and 1 ≤ p < ∞, we define

kAkp = (Tr(|A|p ))1/p ∈ [0, ∞], Lp (H) = {A ∈ B(H) | kAkp < ∞}.

For p = 2 this agrees with Definition 12.39 since |A|2 = A∗ A. For p = 1 it specializes to

B.136 Definition Let H be a Hilbert space and A ∈ B(H). Then

kAk1 = Tr|A|, L1 (H) = {A ∈ B(H) | Tr|A| < ∞}.

The elements of L1 (H) are called trace class operators.


Trace class operators play an important role in von Neumann algebra theory. Our treatment
is inspired by [128]. See also [92].
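In finite dimensions Tr|A| is simply the sum of the singular values of A, so several items of the next theorem can be sanity-checked numerically (a finite-dimensional sketch assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(1)

def trace_norm(A):  # ||A||_1 = Tr|A| = sum of the singular values of A
    return float(np.sum(np.linalg.svd(A, compute_uv=False)))

def op_norm(A):     # ||A|| = largest singular value
    return float(np.linalg.svd(A, compute_uv=False)[0])

A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
B = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
tol = 1e-10

assert op_norm(A) <= trace_norm(A) + tol                          # item (i)
assert abs(trace_norm(A.conj().T) - trace_norm(A)) < tol          # item (ii)
assert trace_norm(A + B) <= trace_norm(A) + trace_norm(B) + tol   # item (iii)
assert trace_norm(A @ B) <= op_norm(A) * trace_norm(B) + tol      # item (iv)
assert abs(np.trace(A)) <= trace_norm(A) + tol                    # |Tr A| <= ||A||_1
```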

B.137 Theorem Let H be any Hilbert space. Then


(i) kAk ≤ kAk1 for all A ∈ B(H).
(ii) kA∗ k1 = kAk1 for all A ∈ B(H). Thus L1 (H) is self-adjoint.
(iii) For all λ ∈ C, A, B ∈ B(H) we have kλAk1 = |λ|kAk1 and kA + Bk1 ≤ kAk1 + kBk1 . If
0 ≠ A ∈ L1 (H) then kAk1 > 0. Thus (L1 (H), k · k1 ) is a normed vector space.

(iv) For all A, B ∈ B(H) we have kABk1 ≤ kAkkBk1 and kABk1 ≤ kAk1 kBk. Thus L1 (H) ⊆
B(H) is a two-sided ideal.
(v) F (H) ⊆ L1 (H) ⊆ K(H).
(vi) The k · k1 -closure of F (H) equals L1 (H).
(vii) The normed space (L1 (H), k · k1 ) is complete, thus a Banach space.
(viii) (L1 (H), k · k1 ) is a Banach ∗-algebra.
(ix) For A ∈ B(H), the following are equivalent:
(α) A ∈ L1 (H), i.e. Tr(|A|) < ∞.
(β) A is a finite linear combination of positive operators with finite trace.
(γ) Σ_{e∈E} |hV Ae, ei| < ∞ for each ONB E and each unitary V .138
(δ) Σ_{e∈E} |hV Ae, ei| < ∞ for some ONB E and each unitary V . Under this condition,
kAk1 ≤ sup_{V ∈U (H)} Σ_{e∈E} |hV Ae, ei|.
(ε) Σ_{e∈E} |hAe, ei| < ∞ for each ONB E.
(x) For each ONB E, the unordered sum in
Tr : L1 (H) → C, A ↦ Σ_{e∈E} hAe, ei

is convergent in the sense of Section A.1 (thus absolutely convergent) and independent of
the choice of E, defining a linear functional on L1 (H).
(xi) For all A ∈ L1 (H) we have Tr(A∗ ) = conj(Tr(A)), the complex conjugate of Tr(A).
(xii) If A ∈ B(H), B ∈ L1 (H) then Tr(AB) = Tr(BA) and |Tr(AB)| ≤ kAkkBk1 .
In particular |Tr(B)| ≤ kBk1 , thus Tr ∈ (L1 (H), k · k1 )∗ .
Proof. (i) Let B ≥ 0 and x ∈ H a unit vector. If E is an ONB containing x, we have
hBx, xi ≤ TrE (B). Since B is positive, we have kBk = supx,kxk=1 hBx, xi, thus kBk ≤ Tr(B).
If now A ∈ B(H) with polar decomposition A = U |A|, then applying the above to B = |A| and
using kU k ≤ 1 gives
kAk = kU |A| k ≤ k |A| k ≤ Tr|A| = kAk1 .
(ii) Let A ∈ B(H) with polar decomposition A = U |A| and |A| = U ∗ A. Since U is a partial
isometry, it satisfies U ∗ U U ∗ = U ∗ , so that U ∗ U |A| = U ∗ U U ∗ A = U ∗ A = |A|. Thus

(U |A|U ∗ )2 = U |A|U ∗ U |A|U ∗ = U |A|2 U ∗ = (U |A|)(U |A|)∗ = AA∗ = |A∗ |2 .

Since |A∗ | and U |A|U ∗ are both positive, taking roots gives U |A|U ∗ = |A∗ |. Choosing an ONB
E such that each e ∈ E is either in ker U ∗ or in (ker U ∗ )⊥ , so that {U ∗ e | e ∈ E, U ∗ e ≠ 0} is
an orthonormal set, we find
kA∗ k1 = TrE |A∗ | = TrE (U |A|U ∗ ) = Σ_e h|A|U ∗ e, U ∗ ei ≤ Σ_e h|A|e, ei = Tr|A| = kAk1 .

Replacing A by A∗ gives the opposite inequality so that kA∗ k1 = kAk1 .


138
This is the same as Σ_{i∈I} |hAei , fi i| < ∞ for any ONBs {ei }i∈I , {fi }i∈I .

(iii) The first statement follows from |λA| = |λ||A|. For the second, let A, B ∈ B(H) with
polar decompositions A = U |A|, B = V |B|, A + B = W |A + B|. If E is an ONB and F ⊆ E
is finite,
Σ_{e∈F} h|A + B|e, ei = Σ_{e∈F} hW ∗ (A + B)e, ei = Σ_{e∈F} (hW ∗ U |A|e, ei + hW ∗ V |B|e, ei)
≤ Σ_{e∈F} |hW ∗ U |A|e, ei| + Σ_{e∈F} |hW ∗ V |B|e, ei|. (B.14)

Focusing on the first term of the r.h.s., we have


Σ_{e∈F} |hW ∗ U |A|e, ei| = Σ_{e∈F} |h|A|^{1/2} e, |A|^{1/2} U ∗ W ei| ≤ Σ_{e∈F} k |A|^{1/2} ek k |A|^{1/2} U ∗ W ek
≤ (Σ_{e∈F} k |A|^{1/2} ek^2 )^{1/2} (Σ_{e∈F} k |A|^{1/2} U ∗ W ek^2 )^{1/2} , (B.15)

where the first ≤ comes from applying Cauchy-Schwarz to the inner product h|A|1/2 e, |A|1/2 U ∗ W ei
in H, the second ≤ from Cauchy-Schwarz in C|F | .
The argument of the first square root in the r.h.s. of (B.15) is dominated by Σ_{e∈E} k |A|^{1/2} ek^2 =
Tr|A|, and for the argument of the second root we have
Σ_{e∈F} k |A|^{1/2} U ∗ W ek^2 = Σ_{e∈F} h|A|^{1/2} U ∗ W e, |A|^{1/2} U ∗ W ei = Σ_{e∈F} hW ∗ U |A|U ∗ W e, ei
≤ Tr(W ∗ U |A|U ∗ W ).
Now, picking an ONB E such that each e ∈ E is either in ker W or in (ker W )⊥ , we find
Tr(W ∗ U |A|U ∗ W ) ≤ Tr(U |A|U ∗ ). Repeating the argument with U , we have Tr(U |A|U ∗ ) ≤
Tr|A|. Thus Tr(W ∗ U |A|U ∗ W ) ≤ Tr|A|, so that Σ_{e∈F} k |A|^{1/2} U ∗ W ek^2 ≤ Tr|A|. Inserting this
in (B.15), we find
Σ_{e∈F} |hW ∗ U |A|e, ei| ≤ Tr|A| = kAk1 .
Analogously one proves the bound Σ_{e∈F} |hW ∗ V |B|e, ei| ≤ Tr|B| = kBk1 for the other summand
in (B.14). Now taking the limit F ↗ E we have kA + Bk1 ≤ kAk1 + kBk1 . In view of this, it
is clear that L1 (H) is a vector space.
(iv) Let B = U |B| and AB = V |AB| be the polar decompositions of B, AB, respectively.
Using Proposition 11.44(iii), we have |AB| = V ∗ AB = V ∗ AU |B| = W |B|, where W = V ∗ AU .
In view of kU k, kV k ≤ 1 we have kW k ≤ kAk. Using W ∗ W ≤ kW ∗ W k · 1 = kW k^2 · 1 and Exercise
17.7 we have
|AB|2 = |AB|∗ |AB| = |B|W ∗ W |B| ≤ kW k2 |B|2 .
In view of 0 ≤ A ≤ B ⇒ A1/2 ≤ B 1/2 , cf. [110, Theorem 2.2.6], this implies |AB| ≤ kW k |B|.
Thus kABk1 = Tr|AB| ≤ kW kTr|B| = kW kkBk1 ≤ kAkkBk1 .
The other inequality follows by kABk1 = k(AB)∗ k1 = kB ∗ A∗ k1 ≤ kB ∗ kkA∗ k1 = kAk1 kBk,
where we used the bound just proven and (ii). That L1 (H) is an ideal now is obvious.
(v) We have (x ⊗ y)∗ = (y ⊗ x), thus (x ⊗ y)∗ (x ⊗ y) = kxk2 (y ⊗ y) = kxk2 kyk2 (e ⊗ e) with
e = y/kyk, so that taking roots gives |x ⊗ y| = kxkkyk(e ⊗ e), which clearly is in L1 (H). Since
every element of F (H) is a finite linear combination of such x ⊗ y, we have F (H) ⊆ L1 (H).
If A ∈ L1 (H) then |A|^2 = A∗ A ∈ L1 (H) since L1 (H) is an ideal. Thus for any ONB E we have
Σ_{e∈E} kAek^2 = Σ_e hA∗ Ae, ei = Σ_e h|A|^2 e, ei = TrE (|A|^2 ) < ∞.

Let F ⊆ E be finite and x ∈ F ⊥ with kxk = 1. Then F ∪ {x} is an orthonormal set and can be
completed to an ONB. Thus Σ_{e∈F} kAek^2 + kAxk^2 ≤ Tr(|A|^2 ), or
kAxk^2 ≤ Tr(|A|^2 ) − Σ_{e∈F} kAek^2 .

Since the r.h.s. goes to zero as F ↗ E, we have
sup { kAxk | x ∈ F ⊥ , kxk = 1 } → 0 as F ↗ E. (B.16)

If PF is the orthogonal projection onto spanC F then AF := APF is a finite rank operator that
converges in norm to A by (B.16). Thus A lies in the norm closure of F (H), which is K(H),
proving L1 (H) ⊆ K(H).
(vi) Assume first A ∈ L1 (H)+ . By (v), A is compact. By Theorem 14.12, compact self-
adjoint operators can be diagonalized, thus there is an ONB E such that A = Σ_{e∈E} λe e ⊗ e.
In our case, λe ≥ 0 for all e ∈ E and Σ_e λe = Tr(A) < ∞. For a finite subset F ⊆ E
define AF := Σ_{e∈F} λe e ⊗ e, which is finite rank. Now A − AF = Σ_{e∈E\F} λe e ⊗ e ≥ 0. Thus
kA − AF k1 = Tr(A − AF ) = Σ_{e∈E\F} λe . With Σ_e λe < ∞ this implies kA − AF k1 → 0 as
F ↗ E, thus A lies in the k · k1 -closure of F (H).
Let now A ∈ L1 (H) with polar decomposition A = U |A|. Then |A| ∈ L1 (H), thus by the
above for each ε > 0 there is B ∈ F (H) with k |A| − Bk1 < ε. With (iv) and kU k ≤ 1 this
implies kA − U Bk1 = kU (|A| − B)k1 ≤ k|A| − Bk1 < ε. Since F (H) is an ideal, we have
U B ∈ F (H), finishing the proof that L1 (H) is contained in the k · k1 -closure of F (H). The
converse inclusion is clear.
(vii) This can be proven directly, using L1 (H) ⊆ K(H), cf. e.g. [118, Theorem 3.4.12]. Since
we don’t need the result soon, we will follow Murphy in deducing it later from the isometric
isomorphism L1 (H) ∼ = B(H)∗ of normed spaces and completeness of the dual space B(H)∗ .
(viii) It only remains to prove submultiplicativity: If A, B ∈ L1 (H) then kABk1 ≤ kAkkBk1 ≤
kAk1 kBk1 , where we used (i) and (iv).
(ix) (β) ⇒ (α): This is trivial since L1 (H) is a vector space by (iii) and obviously contains
the positive operators of finite trace.
(α) ⇒ (β) Assume A ∈ L1 (H). By (ii), A∗ ∈ L1 (H) so that (iii) implies Re(A), Im(A) ∈
L1 (H). If A = A∗ ∈ L1 (H), let A = A+ − A− be the canonical decomposition with A± ≥ 0 and
A+ A− = 0. Then |A| = A+ + A− , so that Tr(A± ) ≤ Tr|A| = kAk1 < ∞, implying A± ∈ L1 (H).
Thus every trace class operator is a linear combination of four (or less) positive trace class
operators.
(β) ⇒ (γ) Let A ∈ L1 (H) and E an ONB for H. By (β), we have A = Σ_{k=1}^K λk Ak with
Ak ∈ L1 (H)+ ∀k. Now |hAe, ei| ≤ Σ_{k=1}^K |λk | hAk e, ei for all e ∈ E, so that Ak ∈ L1 (H)+ ∀k
implies Σ_e |hAe, ei| < ∞. Since L1 (H) is an ideal, the same holds for V A instead of A.
(γ) ⇒ (δ) and (γ) ⇒ (ε) are trivial.
(δ) ⇒ (α) Let A = U |A| be the polar decomposition of A ∈ B(H). Recall that U maps the
closure of |A|H isometrically onto the closure of AH = U |A|H and vanishes on its orthogonal
complement. Let V = U ∗ on the closure of AH. Since the closures of |A|H and AH are
unitarily equivalent, they have the same dimension, thus so do their orthogonal complements;
on the orthogonal complement of the closure of AH let V be any isometry onto the orthogonal
complement of the closure of |A|H. Then V is unitary. Since V U |A|e = U ∗ U |A|e = |A|e for
all e, we have
Σ_{e∈E} |hV U |A|e, ei| = Σ_{e∈E} |h|A|e, ei| = Σ_{e∈E} h|A|e, ei = Tr|A|.

By assumption, the l.h.s. is finite, thus Tr|A| < ∞. The bound on k · k1 is obvious in view of
the preceding computation.

(ε) ⇒ (α) Assumption (ε) clearly implies hAen , en i → 0 for any orthonormal sequence
{en }n∈N . Now Theorem 12.37 gives that A is compact. If A = B + iC with B, C self-adjoint
then B, C are compact, thus diagonalizable. Thus B = Σ_{f ∈F} λf f ⊗ f for a certain ONB F
and λf ∈ R. Now (ε), which also holds for B, C, clearly implies kBk1 = Tr|B| = Σ_f |λf | < ∞,
so that B ∈ L1 (H). Similarly, C ∈ L1 (H), thus also A = B + iC is trace-class by (iii).
(x) Let E be an ONB. By (ix)(γ), the sum Σ_{e∈E} hAe, ei is absolutely convergent for each
A ∈ L1 (H). This proves that TrE : L1 (H) → C is well-defined and linear. It remains to show
that TrE is independent of E. By (ix)(β), we have A = Σ_{k=1}^K λk Ak where K is finite and
Ak ∈ L1 (H)+ ∀k. If now F is another ONB, we have
TrE (A) = Σ_k λk TrE (Ak ) = Σ_k λk TrF (Ak ) = TrF (A),
where we used the linearity of TrE and TrF and the fact that TrE (Ak ) = TrF (Ak ) by Ak ≥ 0
and Lemma 11.51(ii). This proves TrE = TrF .
(xi) If A ∈ L1 (H) and E is an ONB, we have
Tr(A∗ ) = Σ_e hA∗ e, ei = Σ_e he, Aei = Σ_e conj(hAe, ei) = conj(Tr(A)),

where the last identity comes from absolute convergence and continuity of complex conjugation.
(xii) Let B ∈ L1 (H) throughout. If U is unitary then BU, U B ∈ L1 (H) and
TrE (BU ) = Σ_{e∈E} hBU e, ei = Σ_{e∈E} hU BU e, U ei = Σ_{f ∈F} hU Bf, f i = TrF (U B),
where F = U E is another ONB. Since Tr does not depend on the ONB, we have Tr(U B) =
Tr(BU ). If now A ∈ B(H) is arbitrary, we have A = Σ_{l=1}^4 λl Ul where λl ∈ C and the Ul are
unitaries. Then Tr(AB) = Σ_l λl Tr(Ul B) = Σ_l λl Tr(BUl ) = Tr(BA), thus the first claim.
Let A ∈ L1 (H) with polar decomposition A = U |A|. Then
|Tr(A)| = |Σ_e hU |A|e, ei| = |Σ_e h|A|^{1/2} e, |A|^{1/2} U ∗ ei|
≤ Σ_e k |A|^{1/2} ek k |A|^{1/2} U ∗ ek
≤ (Σ_{e∈E} k |A|^{1/2} ek^2 )^{1/2} (Σ_{e∈E} k |A|^{1/2} U ∗ ek^2 )^{1/2} .
Now, as in the proof of (iii), Σ_{e∈E} k |A|^{1/2} ek^2 = Tr|A| and Σ_{e∈E} k |A|^{1/2} U ∗ ek^2 ≤ Tr|A|. This
proves |Tr(A)| ≤ kAk1 for all A ∈ L1 (H).
If now A ∈ L1 (H), B ∈ B(H) then |Tr(AB)| ≤ kABk1 ≤ kAk1 kBk by (iv). 
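On matrices, where every operator is trace class, property (xii) is easy to test (a sketch assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

op_norm_A = np.linalg.svd(A, compute_uv=False)[0]        # ||A||
trace_norm_B = np.linalg.svd(B, compute_uv=False).sum()  # ||B||_1

assert abs(np.trace(A @ B) - np.trace(B @ A)) < 1e-10    # Tr(AB) = Tr(BA)
assert abs(np.trace(A @ B)) <= op_norm_A * trace_norm_B  # |Tr(AB)| <= ||A|| ||B||_1
```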

Let H be a Hilbert space. By Theorem B.137(xii), we have linear maps α : L1 (H) →
B(H)∗ , A ↦ Tr(A•) and β : B(H) → L1 (H)∗ , A ↦ Tr(A•) such that kα(A)k ≤ kAk1 and
kβ(A)k ≤ kAk.

B.138 Theorem Let H be a Hilbert space. Then


(i) β : (B(H), k · k) → (L1 (H), k · k1 )∗ is an isometric bijection.
(ii) αK : (L1 (H), k · k1 ) → (K(H), k · k)∗ is an isometric bijection.

(iii) The map α : (L1 (H), k · k1 ) → (B(H), k · k)∗ is isometric, but it is not surjective if H is
infinite-dimensional.
Proof. (i) [110, Theorem 4.2.3].
(ii) [110, Theorem 4.2.1].
(iii) Since kα(A)k ≤ kAk1 for all A ∈ L1 (H) and kα(A)  K(H)k = kAk1 by (ii), it is clear
that also kα(A)k = kAk1 , thus α is isometric.
For each x, y ∈ H we have x ⊗ y ∈ K(H), and Tr(A(x ⊗ y)) = Tr((Ax) ⊗ y) = hAx, yi. Thus
if α(A) = Tr(A·) vanishes on the compact operators then A = 0, thus α(A) = 0.
However, if H is infinite-dimensional, K(H) ⊆ B(H) is a proper closed ideal, and the
quotient space C(H) = B(H)/K(H) (the Calkin algebra) is non-trivial. Thus it admits a
bounded non-zero functional ψ. If p : B(H) → C(H) is the quotient map then ψ ◦ p is a non-
zero norm-continuous functional on B(H) that vanishes on K(H). Such a functional cannot be
of the form Tr(A·) with A ∈ L1 (H), proving that α is not surjective. 

B.139 Remark 1. The results (i), (ii), (iii) are non-commutative analogues of ℓ∞ (S, F) ≅
ℓ1 (S, F)∗ , ℓ1 (S, F) ≅ c0 (S, F)∗ , and Theorem 4.19(v), respectively.
2. One can show that α(L1 (H)) ⊆ B(H)∗ consists precisely of the linear functionals that
are not only norm-continuous but also ultra-weakly continuous (or, equivalently, normal). Cf.
e.g. [110]. This is analogous to Proposition B.21.
3. The analogy between Lp (H) and ℓp (S, F) extends to p ∉ {1, 2, ∞}: Each space Lp (H) =
{A ∈ B(H) | kAkp < ∞}, the ‘p-th Schatten class’, is a two-sided ideal in B(H) and in fact
Lp (H) ⊆ K(H) for all p. If 1 ≤ p ≤ q < ∞, it is not hard to show that kAk := kAk∞ ≤ kAkq ≤
kAkp , thus Lp (H) ⊆ Lq (H) ⊆ K(H). For all 1 < p < ∞ one has Lp (H) = the k · kp -closure of
F (H), and for 1/p + 1/q = 1 the Hölder type inequality kABk1 ≤ kAkp kBkq for A ∈ Lp (H),
B ∈ Lq (H) and in fact the duality Lp (H)∗ ≅ Lq (H). For all this see [151]. 2
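The monotonicity kAk∞ ≤ kAkq ≤ kAkp for p ≤ q mentioned above is, in finite dimensions, just the monotonicity of the ℓp-norms of the singular value vector (a sketch assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 8))
s = np.linalg.svd(A, compute_uv=False)  # singular values, s[0] = ||A||

def schatten(p):
    """||A||_p = (sum of p-th powers of the singular values)^(1/p)."""
    return float(np.sum(s ** p) ** (1.0 / p))

norms = [schatten(p) for p in (1.0, 1.5, 2.0, 4.0, 16.0)]
# p -> ||A||_p is non-increasing and always dominates the operator norm s[0]
assert all(norms[i] >= norms[i + 1] for i in range(len(norms) - 1))
assert norms[-1] >= s[0]
```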

We close with a harder result:

B.140 Theorem (Grothendieck 1956, Lidskii 1959) If H is a Hilbert space and A ∈
L1 (H) then Tr A = Σ_{λ∈σp (A)} nλ λ, where nλ = dim Lλ (A) is the algebraic multiplicity of λ,
proven finite in Proposition 14.18.
The theorem is obvious if A is normal, thus diagonalizable by Theorem 14.12. For finite ma-
trices, there are two standard approaches, using the Jordan normal form and the characteristic
polynomial, respectively. Both can be adapted to infinite dimensions, in the first case using
the results of Section 14.4: One writes H as a direct sum H = H0 ⊕ (⊕_{λ∈σ(A)\{0}} Hλ ) of invariant
subspaces, where A  H0 is quasi-nilpotent and the Hλ are finite-dimensional and A − λ1  Hλ
nilpotent. This already implies Tr A = Tr(A  H0 ) + Σ_{λ∈σp (A)} nλ λ, and it remains to show that
the trace of a quasi-nilpotent trace class operator vanishes. For this see e.g. [59, p. 101-103],
[103, Lemma 16.32] or [94, Sect. 30.3]. For the adaptation of the characteristic polynomial
approach using determinants in infinite dimensions, see [62, 151, 152].
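For matrices, Lidskii's theorem reduces to the familiar fact that the trace equals the sum of the eigenvalues counted with algebraic multiplicity, which holds for non-normal matrices too (a sketch assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((7, 7)) + 1j * rng.standard_normal((7, 7))

# A random Gaussian matrix is (with overwhelming probability) not normal ...
assert np.linalg.norm(A @ A.conj().T - A.conj().T @ A) > 1e-3
# ... yet its trace still equals the eigenvalue sum, eigenvalues being
# repeated according to algebraic multiplicity by numpy.
assert abs(np.trace(A) - np.linalg.eigvals(A).sum()) < 1e-9
```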

B.12 More on numerical ranges


B.12.1 The numerical range W (A) of a Hilbert space operator
We have already proven some results concerning the numerical range W (A) and radius |||A|||
of a Hilbert space operator A, cf. Definition 11.32. See Propositions 11.34 and 11.22, Exercises

11.36 and 13.15, and Proposition 13.69(iii). These results show that these quantities are of some
interest, and here we prove some further results about W (A).
To begin with, W (A) need not be closed:

B.141 Exercise (i) Give an example of a bounded Hilbert space operator A such that there
is no x ∈ H, kxk = 1 such that |hAx, xi| = |||A|||.
(ii) Prove that despite (i) there always exists a sequence {xn } with kxn k = 1 such that
hAxn , xn i → λ where λ ∈ C, |λ| = |||A|||.

B.142 Exercise Let H be a Hilbert space over F ∈ {R, C} and A ∈ B(H). Prove:
(i) W (αA + β1) = αW (A) + β for all α, β ∈ F.
(ii) W (A∗ ) = W (A)∗ .
(iii) W (U AU ∗ ) = W (A) for every unitary U : H → H 0 . (NB: In general W (BAB −1 ) 6= W (A)
for invertible B!)
(iv) If F = C then W (A) = {λ} if and only if A = λ1.
(v) If F = C then W (A) is contained in a line segment [γδ] = {tγ + (1 − t)δ | t ∈ [0, 1]} if and
only if there are α, β ∈ C such that αA + β1 is self-adjoint.
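The identity hAx, xi = Tr(P A) with the rank-one projection P = x ⊗ x, which drives the proof of the Toeplitz–Hausdorff theorem below, is easy to verify numerically (a sketch assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
op_norm_A = np.linalg.svd(A, compute_uv=False)[0]

for _ in range(100):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    x /= np.linalg.norm(x)                    # random unit vector
    P = np.outer(x, x.conj())                 # rank-one orthogonal projection
    w = np.vdot(x, A @ x)                     # the point <Ax, x> of W(A)
    assert abs(np.trace(P @ A) - w) < 1e-10   # <Ax, x> = Tr(PA)
    assert abs(w) <= op_norm_A + 1e-10        # numerical radius <= ||A||
```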

B.143 Theorem (Toeplitz-Hausdorff (1918/9)) 139 Let H be a complex Hilbert space


and A ∈ B(H). Then the numerical range W (A) ⊆ C is convex.
Proof. If x, y ∈ H with kxk = kyk = 1 and t ∈ [0, 1] we must show that thAx, xi+(1−t)hAy, yi ∈
W (A). Let P be the orthogonal projection onto K = Cx + Cy ⊆ H and AK = P AP considered
as element of B(K). Then thAx, xi + (1 − t)hAy, yi = thAK x, xi + (1 − t)hAK y, yi. Thus if we
prove that the r.h.s. is in W (AK ), it is of the form hAK z, zi = hAz, zi for some z ∈ K, thus also
the l.h.s. is in W (A). We have therefore reduced the claim to proving the special case H = C2
(since there is nothing to prove if K is one-dimensional). There are many ways of doing this,
most of which are quite computational. We will give the nice argument from [35].
If x ∈ H is a unit vector then P = x ⊗ x (notation of Exercise 12.36) is an orthogonal
projection of rank one, and every such projection arises in this way. Now for all A ∈ B(H)
we have hAx, xi = Tr(P A). (This is easily checked by computing the trace using an ONB
containing the vector x.) Since Tr(P ) equals the rank of P , denoting the set of rank one
orthogonal projections by P1 we have W (A) = {Tr(P A) | P ∈ P1 }.
Now specialize to H = C². For (x, y, z) ∈ R³ it is clear that

M (x, y, z) = (1/2) [ 1+z   x+iy ; x−iy   1−z ]

is self-adjoint with trace one, and every such matrix is of this form. A trivial computation gives

M (x, y, z)² = (1/2) [ (1+x²+y²+z²)/2 + z   x+iy ; x−iy   (1+x²+y²+z²)/2 − z ],

implying that M (x, y, z) is an idempotent, thus a rank one orthogonal projection, if and only if
x² + y² + z² = 1. Thus the map (x, y, z) 7→ M (x, y, z) restricts to a bijection S² → P1 . Now

Tr(M (x, y, z)A) = Tr(A)/2 + (1/2) Tr( [ z   x+iy ; x−iy   −z ] A ).
139 Felix Hausdorff (1868-1942), German mathematician. Towering figure in the history of general topology and
related areas like measure theory and functional analysis. Driven to suicide by the Nazis.

Since the second summand depends R-linearly on (x, y, z) (for any fixed A), we are done once we
show that the image of S 2 ⊆ R3 under every linear map α : R3 → R2 is convex. For dimensional
reasons, α has a non-trivial kernel K. Now the image of S 2 under the orthogonal projection
p : R3 → K ⊥ ⊂ R3 is a ball, thus convex, so that also α(S 2 ) = α(p(S 2 )) is convex. 
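The reduction above is easy to experiment with. The following numpy sketch (an illustration only; the test matrix A and all names are ad hoc, not from the notes) samples rank-one projections via the parametrization M (x, y, z) of S² and evaluates Tr(P A), which by the proof sweeps out W (A) for H = C²:

```python
import numpy as np

# Illustration of the proof of Theorem B.143 for H = C^2 (assumed example):
# (x, y, z) on S^2 parametrizes the rank-one orthogonal projections, and
# W(A) = { Tr(P A) } is the image of S^2 under an affine map, hence convex.
rng = np.random.default_rng(0)

def M(x, y, z):
    return 0.5 * np.array([[1 + z, x + 1j * y],
                           [x - 1j * y, 1 - z]])

A = np.array([[1.0, 2.0], [0.0, 3.0]])   # arbitrary non-normal test matrix

pts = []
for _ in range(2000):
    v = rng.normal(size=3)
    x, y, z = v / np.linalg.norm(v)      # a point on S^2
    P = M(x, y, z)
    assert np.allclose(P @ P, P)         # idempotent, i.e. a rank-one projection
    pts.append(np.trace(P @ A))
pts = np.array(pts)

# Compare with the direct definition W(A) = { <Av, v> : ||v|| = 1 }:
v = rng.normal(size=2) + 1j * rng.normal(size=2)
v /= np.linalg.norm(v)
w = np.vdot(v, A @ v)                    # <Av, v> (linear in the first slot)
```

Plotting `pts` in the complex plane shows the familiar elliptical shape of the numerical range of a 2 × 2 matrix.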

There are other interesting proofs that do not proceed by reduction to two dimensions, cf.
e.g. [113, Exercise 8.9].
In view of Exercise 13.15 and Theorem B.143 it is clear that for all bounded Hilbert space
operators we have conv(σ(A)) ⊆ W (A)‾. (Recall Definition B.66.) Already Exercise 11.36 shows
that this need not be an equality. Again, normal operators are nicely behaved:

B.144 Exercise Let H be a Hilbert space and A ∈ B(H) normal. Use the Spectral Theorem
18.4 to prove conv(σ(A)) = W (A)‾. (Note that conv(σ(A)) is closed by Corollary B.69.)
For an alternative proof avoiding the spectral theorem (but still using continuous functional
calculus) see [118, Exercise E4.4.5].
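A finite-dimensional sanity check of Exercise B.144 (not a proof; the matrix is an ad hoc example): for a normal matrix, sampled numerical-range values hAv, vi lie in the convex hull of the eigenvalues, which for a real diagonal matrix is an interval.

```python
import numpy as np

# Sampled check that W(A) sits in conv(sigma(A)) for a normal matrix
# (illustration only; A = diag(0, 1, 3) is normal, even self-adjoint).
rng = np.random.default_rng(1)
A = np.diag([0.0, 1.0, 3.0])
vals = []
for _ in range(1000):
    v = rng.normal(size=3) + 1j * rng.normal(size=3)
    v /= np.linalg.norm(v)
    vals.append(np.vdot(v, A @ v))       # <Av, v>, linear in the first slot
vals = np.array(vals)
assert np.allclose(vals.imag, 0, atol=1e-12)           # values are real
assert vals.real.min() >= -1e-12 and vals.real.max() <= 3 + 1e-12
```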
While conv(σ(A)) = W (A)‾ does not hold for all operators, there is a slightly more involved
general fact, somewhat related to the numerical identity from Exercise 17.15(iv):

B.145 Theorem (S. Hildebrandt 1966)140 Let H be a Hilbert space and A ∈ B(H). Then

conv(σ(A)) = ∩_{B∈Inv B(H)} W (BAB⁻¹)‾.

Proof. For each invertible B we have σ(A) = σ(BAB⁻¹) ⊆ W (BAB⁻¹)‾. With the convexity of
the numerical ranges we have

conv(σ(A)) ⊆ ∩_{B∈Inv B(H)} W (BAB⁻¹)‾.   (B.17)

Assume equality does not hold, thus there exists λ such that λ ∈ W (BAB⁻¹)‾ for all B ∈
Inv B(H) but λ ∉ conv(σ(A)). Since conv(σ(A)) is a compact convex set, it is not hard to
find an open disc D such that conv(σ(A)) ⊆ D while λ ∉ D. Using σ(αA + β) = ασ(A) + β
and W (αA + β) = αW (A) + β, we can reduce to the situation where D is the open unit disc
U = B(0, 1) around zero. Then σ(A) ⊆ conv(σ(A)) ⊆ U , so that r(A) < 1. Now Exercise
17.15(i)-(iii) provides B ∈ Inv B(H) such that kBAB⁻¹k < 1. Then also W (BAB⁻¹)‾ ⊆ U ,
which contradicts the facts λ ∈ W (BAB⁻¹)‾ and |λ| ≥ 1. This contradiction proves that the
inclusion in (B.17) is an equality. 
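One can watch the similarity transforms in Theorem B.145 squeeze the numerical range numerically. The example below (assumed, not from the text) takes the nilpotent A = [[0, 1], [0, 0]], for which conv(σ(A)) = {0}; conjugating by B = diag(s, 1) scales the numerical range by s, so the intersection over all invertible B collapses to {0}.

```python
import numpy as np

# Squeezing W(BAB^{-1}) toward conv(sigma(A)) = {0} (illustration only).
rng = np.random.default_rng(2)
A = np.array([[0.0, 1.0], [0.0, 0.0]])

def num_radius(T, n=4000):
    # crude sampled estimate (a lower bound) of sup |<Tv, v>| over unit vectors
    r = 0.0
    for _ in range(n):
        v = rng.normal(size=2) + 1j * rng.normal(size=2)
        v /= np.linalg.norm(v)
        r = max(r, abs(np.vdot(v, T @ v)))
    return r

for s in [1.0, 0.1, 0.01]:
    B = np.diag([s, 1.0])
    T = B @ A @ np.linalg.inv(B)             # = [[0, s], [0, 0]]
    # the exact numerical radius of [[0, s], [0, 0]] is s/2
    assert num_radius(T) <= s / 2 + 1e-9
```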

For much more on W (A) see the book [64].

B.12.2 Numerical range in Banach algebras

In this section, which strongly leans on [18], we exclusively consider complex unital Banach
algebras satisfying k1k = 1.
In the discussion of commutative Banach and C ∗ -algebras, a large role was played by char-
acters, i.e. non-zero algebra homomorphisms A → C. A non-commutative algebra may fail
to have any characters. Of course A∗ separates the points of A, but this set is too big. The
following natural subset of A∗ will turn out useful:
140 Stefan Oscar Walter Hildebrandt (1936-2015). German mathematician who mostly worked on variational calculus.
B.146 Definition If A is a unital complex Banach algebra, ϕ ∈ A∗ is called a state if ϕ(1) =
kϕk = 1. The set of states of A is denoted S(A).

B.147 Proposition Let A be a unital normed algebra over C. For each a ∈ A the subsets

V1 (a) = ∩_{z∈C} B(z, ka − z1k) = {λ ∈ C | |λ − z| ≤ ka − z1k ∀z ∈ C},
V2 (a) = {ϕ(a) | ϕ ∈ A∗ , kϕk = ϕ(1) = 1}

of C coincide. The resulting set V (a) is called the (algebraic) numerical range and R(a) =
sup_{λ∈V (a)} |λ| the (algebraic) numerical radius of a.
Proof. Let a ∈ A and ϕ ∈ A∗ with ϕ(1) = kϕk = 1. Then |ϕ(a) − z| = |ϕ(a − z1)| ≤ ka − z1k
holds for each z ∈ C. Thus ϕ(a) ∈ V1 (a), proving V2 (a) ⊆ V1 (a).
If a = c1 then V2 (a) = {c}, and with z = c the inequality |λ − z| ≤ ka − z1k becomes
|λ − c| ≤ 0, so that also V1 (a) = {c}. Thus V1 (a) = V2 (a) for a ∈ C1. Now assume a ∉ C1 and
let λ ∈ V1 (a). Put W = C1 + Ca and define ϕ0 ∈ W ∗ by c1 + da 7→ c + dλ. Then ϕ0 (1) = 1
and ϕ0 (a) = λ, and

kϕ0 k = sup_{(c,d)≠(0,0)} |ϕ0 (c1 + da)| / kc1 + dak = sup_{(c,d)≠(0,0)} |c + dλ| / kc1 + dak.

For d = 0 the fraction on the r.h.s. is 1, so that with λ ∈ V1 (a) we have

kϕ0 k = max( 1, sup_{z∈C} |z + λ| / kz1 + ak ) = 1.

Now by Hahn-Banach there exists an extension ϕ of ϕ0 to A satisfying kϕk = kϕ0 k = 1. Now
ϕ(a) = λ, proving V1 (a) ⊆ V2 (a). 
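The inclusion V2 (a) ⊆ V1 (a) is easy to test numerically in a matrix algebra (illustration only; the matrix a is an ad hoc example): every vector-state value ϕ(a) = hav, vi must satisfy |ϕ(a) − z| ≤ ka − z1k for all z, since ϕ(1) = kϕk = 1.

```python
import numpy as np

# Checking V2(a) ⊆ V1(a) on 2x2 matrices (illustration only).
rng = np.random.default_rng(3)
a = np.array([[1.0, 2.0], [0.0, 1.0 + 1.0j]])
I = np.eye(2)

def opnorm(T):
    return np.linalg.norm(T, 2)          # operator norm = largest singular value

for _ in range(200):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    v /= np.linalg.norm(v)
    lam = np.vdot(v, a @ v)              # an element of V2(a)
    for _ in range(50):
        z = complex(rng.normal(), rng.normal())
        assert abs(lam - z) <= opnorm(a - z * I) + 1e-9
```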

B.148 Proposition Let A be a unital Banach algebra over C. Then
(i) V (a) ⊆ C is closed and convex,
(ii) σ(a) ⊆ V (a) and r(a) ≤ R(a) ≤ kak. In particular, V (a) ≠ ∅.
(iii) V (αa + β1) = αV (a) + β for all α, β ∈ C.
Proof. (i) Since B(z, ka − z1k) is closed and convex for each z ∈ C and these properties pass to
arbitrary intersections, closedness and convexity of V (a) are obvious from V (a) = V1 (a). And
R(a) ≤ kak follows from V1 (a) ⊆ B(0, ka − 0·1k), the z = 0 term in the intersection.
(ii) Assume λ ∉ V (a). This means that there exists z ∈ C such that |λ − z| > ka − z1k. Then
k(a − z1)/(λ − z)k < 1, so that 1 − (a − z1)/(λ − z) ∈ Inv A by Lemma 13.19. Thus −(a − λ1) =
(λ − z)1 − (a − z1) = (λ − z)(1 − (a − z1)/(λ − z)) ∈
Inv A, so that λ ∉ σ(a). Thus σ(a) ⊆ V (a), implying also r(a) ≤ R(a). Now σ(a) ≠ ∅ implies
V (a) ≠ ∅. (iii) is rather obvious for V2 (a) since the ϕ involved satisfy ϕ(1) = 1. 

B.149 Remark Given the similarity of the properties of V (a) to those of the (spatial) numerical
range W (A) of a Hilbert space operator (except for the closedness of V (a)), it is natural to ask
how the two definitions are related if A = B(H). It is easy to see that W (A)‾ ⊆ V (A). In
fact, this is always an equality. We postpone the proof to Theorem B.168 since it requires some
preparations and we prefer to stick to the general Banach algebra situation for now. 2

B.150 Lemma Let A be a unital complex Banach algebra and a ∈ A. Then

inf{Re λ | λ ∈ V (a)} ≤ inf{kabk | b ∈ A, kbk = 1}.

(Note that the r.h.s. also appeared in Exercise 13.34.)

Proof. For b ∈ A with kbk = 1 put V (a, b) = ∩_{z∈C} B(z, k(a − z1)bk). With k(a − z1)bk ≤ ka − z1k
and the definition of V1 (a) we have V (a, b) ⊆ V1 (a). Since clearly V (a, 1) = V1 (a), we have
∪_{b∈A=1} V (a, b) = V1 (a) = V (a). For all b ∈ A, kbk = 1 we have V (a, b) ⊆ B(0, kabk), implying

inf{Re λ | λ ∈ V (a)} ≤ inf{Re λ | λ ∈ V (a, b)} ≤ kabk.

Now take the infimum over b ∈ A with kbk = 1. 

B.151 Theorem Let A be a unital complex Banach algebra and a ∈ A. Then

sup{Re λ | λ ∈ V (a)} = inf_{t>0} (k1 + tak − 1)/t
                      = lim_{t↘0} (k1 + tak − 1)/t
                      = sup_{t>0} (log keta k)/t          (B.18)
                      = lim_{t↘0} (log keta k)/t.
Proof. Fix a ∈ A, and pick ϕ ∈ S(A), t > 0. Then

1 + t Re ϕ(a) ≤ |1 + t Re ϕ(a)| = |Re(ϕ(1 + ta))| ≤ |ϕ(1 + ta)| ≤ k1 + tak,

implying Re ϕ(a) ≤ (k1 + tak − 1)/t. Since this holds for all t > 0 and ϕ ∈ S(A), we conclude

sup{Re λ | λ ∈ V (a)} = sup_{ϕ∈S(A)} Re ϕ(a) ≤ inf_{t>0} (k1 + tak − 1)/t.   (B.19)

Now put µ = sup{Re λ | λ ∈ V (a)} and let t > 0. Then

1 − tµ = 1 − t sup{Re λ | λ ∈ V (a)} = inf_{ϕ∈S(A)} Re ϕ(1 − ta) ≤ inf{k(1 − ta)bk | b ∈ A, kbk = 1},

where the final inequality comes from Lemma B.150. From this one readily deduces

(1 − tµ)kbk ≤ k(1 − ta)bk ∀b ∈ A,

which for b = 1 + ta gives

(1 − tµ)k1 + tak ≤ k(1 − ta)(1 + ta)k = k1 − t²a²k ≤ 1 + t²ka²k.

Assuming tµ < 1 this gives k1 + tak ≤ (1 − tµ)⁻¹(1 + t²ka²k), implying

(k1 + tak − 1)/t ≤ (µ + tka²k)/(1 − tµ)

and therefore

lim sup_{t↘0} (k1 + tak − 1)/t ≤ µ.

236
Combining this with (B.19) we have

lim sup_{t↘0} (k1 + tak − 1)/t ≤ sup{Re λ | λ ∈ V (a)} ≤ inf_{t>0} (k1 + tak − 1)/t ≤ lim inf_{t↘0} (k1 + tak − 1)/t,

which implies the equality of the first three expressions in (B.18).


*********
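For matrices the quantities in (B.18) are easy to compute, and one can check the claimed equalities numerically. The snippet below uses the standard fact (an assumption of this illustration, not proven in the text here) that for A = B(Cⁿ) one has sup{Re λ | λ ∈ V (a)} = λmax((a + a∗)/2), since Re hav, vi = h((a + a∗)/2)v, vi.

```python
import numpy as np

# Numerical check of (B.18) for a 2x2 matrix (illustration only).
a = np.array([[0.0, 2.0], [0.0, 1.0]])
mu = np.linalg.eigvalsh((a + a.conj().T) / 2).max()   # sup Re V(a) for matrices

def opnorm(T):
    return np.linalg.norm(T, 2)

def expm(T, terms=60):
    # plain power series for e^T; fine for the small T used below
    out, term = np.eye(2, dtype=complex), np.eye(2, dtype=complex)
    for k in range(1, terms):
        term = term @ T / k
        out = out + term
    return out

for t in [1e-3, 1e-5]:
    q1 = (opnorm(np.eye(2) + t * a) - 1) / t          # difference quotient
    q2 = np.log(opnorm(expm(t * a))) / t              # exponential version
    assert abs(q1 - mu) < 1e-2 and abs(q2 - mu) < 1e-2
    assert q1 >= mu - 1e-9        # the inf over t > 0 is approached from above
    assert q2 <= mu + 1e-6        # the sup over t > 0 is approached from below
```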


B.152 Definition Let A be a unital complex Banach algebra. Then a ∈ A is called
• dissipative if Re λ ≤ 0 ∀λ ∈ V (a).
• hermitian if V (a) ⊆ R. We put H(A) = {a ∈ A | a hermitian}.
• normal if there are commuting b, c ∈ H(A) with a = b + ic.

B.153 Corollary If A is a unital Banach algebra and a ∈ A then
(i) keita k = 1 ∀t ∈ R if and only if a is hermitian.
(ii) keta k ≤ 1 ∀t ≥ 0 if and only if a is dissipative.
Proof. Both statements are immediate consequences of (B.18). 
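Corollary B.153 can be illustrated with matrices (assumed examples, not from the text): a self-adjoint a gives unitary e^{ita}, and a matrix whose hermitian part is negative semidefinite is dissipative, so e^{ta} is a contraction for t ≥ 0.

```python
import numpy as np

# Matrix illustration of Corollary B.153 (ad hoc examples).
a = np.array([[1.0, 2.0], [2.0, -1.0]])        # self-adjoint, hence hermitian
lam, S = np.linalg.eigh(a)
for t in [0.3, 1.7, -2.5]:
    u = S @ np.diag(np.exp(1j * t * lam)) @ S.conj().T    # e^{ita}
    assert abs(np.linalg.norm(u, 2) - 1.0) < 1e-12        # unitary, norm 1

d = np.array([[-1.0, 1.0], [-1.0, -1.0]])      # hermitian part = -1: dissipative
for t in [0.1, 1.0, 5.0]:
    lam2, V = np.linalg.eig(t * d)             # d is diagonalizable
    e = V @ np.diag(np.exp(lam2)) @ np.linalg.inv(V)      # e^{td}
    assert np.linalg.norm(e, 2) <= 1.0 + 1e-9  # contraction for t >= 0
```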

B.154 Theorem (Bohnenblust & Karlin 1955)141 Let A be a unital complex Banach
algebra. Then

kak/e ≤ R(a) ≤ kak ∀a ∈ A.

(Here e = exp(1) = 2.718 . . ..) In particular a = 0 ⇔ V (a) = {0}.
Proof. The inequality R(a) ≤ kak has already been proven. Let a ≠ 0. Rescaling if necessary
we may assume R(a) = 1, thus |λ| ≤ 1 for all λ ∈ V (a). Then (B.18) implies

log kea k ≤ sup{Re λ | λ ∈ V (a)} ≤ 1,

which remains valid if we replace a by ca with |c| = 1. Thus keza k ≤ e ∀z ∈ S¹.
Whenever a power series f (z) = Σ_{n=0}^∞ cn z^n has convergence radius R > r(a) we can define
f (e^{2πit} a) and have142

∫₀¹ f (e^{2πit} a) e^{−2πimt} dt = ∫₀¹ Σ_{n=0}^∞ cn a^n e^{2πint} e^{−2πimt} dt = Σ_{n=0}^∞ cn a^n ∫₀¹ e^{2πi(n−m)t} dt = cm a^m   (B.20)

for each m ∈ N0 , where the interchange of integration and summation was justified by the
uniform convergence of the series on bounded sets, and we used ∫₀¹ e^{2πi(n−m)t} dt = δn,m . Thus
|cm | ka^m k ≤ ∫₀¹ kf (e^{2πit} a)k dt. Applying this to f = exp and m = 1 (thus cm = 1) we obtain
kak ≤ ∫₀¹ k exp(e^{2πit} a)k dt ≤ e = eR(a).
Now V (a) = {0} ⇒ R(a) = 0 ⇒ kak = 0 ⇒ a = 0 ⇒ V (a) = {0}. 
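The Bohnenblust-Karlin bound can be checked on random matrices (illustration only; we compute the numerical radius via the standard formula w(a) = max_θ λmax(Re(e^{iθ} a)), an assumption of this sketch, and for A = B(Cⁿ) one even has R(a) ≥ kak/2):

```python
import numpy as np

# Checking ||a||/e <= R(a) <= ||a|| on random complex matrices (illustration).
rng = np.random.default_rng(4)

def num_radius(a, grid=2000):
    thetas = np.linspace(0, 2 * np.pi, grid, endpoint=False)
    return max(np.linalg.eigvalsh((np.exp(1j * th) * a
                + (np.exp(1j * th) * a).conj().T) / 2).max() for th in thetas)

for _ in range(5):
    a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    w, n = num_radius(a), np.linalg.norm(a, 2)
    assert n / np.e <= w + 1e-6 <= n + 1e-6    # Bohnenblust-Karlin
    assert w >= n / 2 - 5e-2                   # sharper Hilbert space bound
```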

141 Samuel Karlin (1924-2007). Polish-born American mathematician. After his work in pure analysis he made many
contributions to mathematical economics and biology.
142 If V is a Banach space and f : [a, b] → V a continuous function, the Riemann integral ∫_a^b f (t)dt is defined (and
existence proven using the uniform continuity of f ) as for R-valued functions. One then has k∫_a^b f (t)dtk ≤ ∫_a^b kf (t)kdt.

The surprising factor e cannot be improved without further assumptions. (Recall that for
the numerical radius of an operator on a complex Hilbert space we proved the slightly stronger
9A9 ≥ kAk/2, which we showed to be optimal.)
We mention without proofs (see e.g. [19]) two more results:

B.155 Theorem If A is a unital complex Banach algebra and a ∈ A is normal then V (a) =
conv(σ(a)), thus R(a) = r(a). (Compare Exercise B.144.) For hermitian a even R(a) = kak.
If a ∈ H(A) ∩ iH(A) then V (a) ⊆ R ∩ iR = {0}, which implies a = 0. Thus H(A) ∩ iH(A) =
{0}. It is natural to ask whether H(A) + iH(A) = A. For C ∗ -algebras this is true since one
can prove H(A) = Asa , cf. Proposition B.161. This actually characterizes C ∗ -algebras:

B.156 Theorem (Vidav (1956), Palmer (1968)) Let A be a unital complex Banach alge-
bra. If A = H(A) + iH(A) then (b + ic)∗ = b − ic for b, c ∈ H(A) defines a ∗-operation and
ka∗ ak = kak² ∀a ∈ A, thus A is a C ∗ -algebra.
For proofs of these results (and many more), see [19].

B.12.3 Positive functionals on and numerical range for C ∗ -algebras


B.157 Definition Let A be a complex C ∗ -algebra. A linear functional ϕ : A → C is called
positive if ϕ(a) ≥ 0 for all positive a, i.e. a ∈ A+ . The set of positive functionals on A is
denoted A+ .
Positive functionals have remarkable properties:

B.158 Proposition Let A be a C ∗ -algebra and ϕ a positive functional. Then
(i) ϕ is bounded. Thus A+ ⊆ A∗ .
(ii) [a, b] = ϕ(ab∗ ) defines a sesquilinear form on A that is self-adjoint and positive.
(iii) For all a, b ∈ A we have |ϕ(ab∗ )| ≤ (ϕ(aa∗ )ϕ(bb∗ ))^{1/2} .
(iv) If A is unital, ϕ(a∗ ) = ϕ(a)‾ for all a ∈ A (the bar denoting complex conjugation).
Proof. (i) Let ϕ be positive. Assume that ϕ is unbounded on the positive elements of A.
Then there is a sequence {an } ⊂ A+ with kan k = 1 and ϕ(an ) ≥ 2^n for each n ∈ N. Define
a = Σ_{n=1}^∞ 2^{−n} an , which clearly converges, with a positive and kak ≤ Σ_{n=1}^∞ 2^{−n} = 1. Since
Σ_{n=1}^N 2^{−n} an ≤ a for all N , we have a − Σ_{n=1}^N 2^{−n} an ≥ 0, thus

ϕ(a) ≥ ϕ( Σ_{n=1}^N 2^{−n} an ) = Σ_{n=1}^N 2^{−n} ϕ(an ) ≥ Σ_{n=1}^N 2^{−n} · 2^n = N

holds for all N . But this contradicts the fact that ϕ(a) ∈ [0, +∞). Thus there exists C such
that ϕ(a) ≤ Ckak for all a ∈ A+ .
For every a ∈ A we have a = b + ic with b, c ∈ Asa , where kbk ≤ kak and kck ≤ kak. And
by Exercise 17.5, b = b+ − b− with b± ∈ A+ and kb± k ≤ kbk ≤ kak and similarly for c. Now

|ϕ(a)| = |ϕ(b+ ) − ϕ(b− ) + iϕ(c+ ) − iϕ(c− )| ≤ ϕ(b+ ) + ϕ(b− ) + ϕ(c+ ) + ϕ(c− ) ≤ 4Ckak.

Thus ϕ is bounded with kϕk ≤ 4C < ∞.

238
(ii) It is obvious that [·, ·] is a sesquilinear form on A. For all a ∈ A we have aa∗ ∈ A+ , thus
[a, a] = ϕ(aa∗ ) ≥ 0, so that [·, ·] is positive in the sense [a, a] ≥ 0 ∀a. A fortiori, [a, a] ∈ R for
all a, so that [·, ·] is self-adjoint by Lemma 11.17(i), i.e. [x, y] = [y, x]‾.
(iii) Every positive sesquilinear form satisfies a Cauchy-Schwarz inequality |[a, b]| ≤ ([a, a][b, b])^{1/2}
with the same proof as for (5.1) since non-degeneracy [x, x] = 0 ⇒ x = 0 was not needed there.
Expressing the inequality in terms of ϕ, the claim follows.
(iv) Since [·, ·] is self-adjoint by (ii), we have ϕ(ab∗ ) = [a, b] = [b, a]‾ = ϕ(ba∗ )‾. Putting b = 1,
the claim follows. 

(Statement (iv) also holds for non-unital algebras, but the proof uses approximate units.)
Be sure not to confuse A+ ⊆ A and A+ ⊆ A∗ !
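The Cauchy-Schwarz inequality of Proposition B.158(iii) is easy to check for vector states on a matrix algebra (illustration only; ϕ and the matrices are ad hoc):

```python
import numpy as np

# Checking |phi(ab*)| <= (phi(aa*) phi(bb*))^{1/2} for the positive functional
# phi(x) = <x v, v> on 3x3 matrices (illustration only).
rng = np.random.default_rng(5)
v = rng.normal(size=3) + 1j * rng.normal(size=3)

def phi(x):
    return np.vdot(v, x @ v)   # <x v, v>; positive, since phi(aa*) = ||a* v||^2

for _ in range(100):
    a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    b = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    lhs = abs(phi(a @ b.conj().T))
    rhs = np.sqrt(phi(a @ a.conj().T).real * phi(b @ b.conj().T).real)
    assert lhs <= rhs + 1e-9
```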

B.159 Proposition (Bohnenblust & Karlin 1955) Let A be a unital complex C ∗ -algebra.
Then a linear functional ϕ : A → C is positive if and only if it is bounded and satisfies
ϕ(1) = kϕk.
Proof. ⇒ If ϕ is positive, it is bounded by Proposition B.158(i), and ϕ(1) > 0 since 1 ∈ A+ . If
a ∈ A≤1 , with Proposition B.158(iii) we have

|ϕ(a)|² ≤ ϕ(aa∗ )ϕ(1) ≤ kaa∗ k kϕk ϕ(1) ≤ kϕkϕ(1).

Since for each ε > 0 we can find a ∈ A≤1 with |ϕ(a)| > kϕk − ε, we have (kϕk − ε)² ≤ kϕkϕ(1).
Taking ε → 0 gives kϕk ≤ ϕ(1) (if ϕ ≠ 0, but the result also holds for ϕ = 0). On the other
hand, with k1k = 1 we have kϕk ≥ ϕ(1). Combining the inequalities we have kϕk = ϕ(1).
⇐ Replacing ϕ by kϕk⁻¹ ϕ, we may assume kϕk = 1. Let a ∈ Asa , kak ≤ 1. Write
ϕ(a) = α + iβ with α, β ∈ R. Then for each s ∈ R,

ka − is1k² = k(a − is1)∗ (a − is1)k = k(a + is1)(a − is1)k = ka² + s²1k ≤ 1 + s²,

where we used the C ∗ -identity. Thus

α² + (β − s)² = |α + iβ − is|² = |ϕ(a) − is|² = |ϕ(a − is1)|² ≤ 1 + s²,

implying α² + β² − 2sβ ≤ 1. Since this must hold for all s ∈ R, we have β = 0, thus ϕ(a) ∈ R.
Thus for a = a∗ we have V (a) ⊆ R.
Now let a ∈ A+ , kak ≤ 1. Then 0 ≤ a ≤ 1, thus 1 − a is positive with k1 − ak ≤ 1 (compare
Exercise 16.24), so that |ϕ(1 − a)| ≤ 1. Combined with ϕ(1 − a) ∈ R from the above, this gives
1 − ϕ(a) = ϕ(1 − a) ≤ 1, implying ϕ(a) ≥ 0. Thus ϕ is positive. 

B.160 Remark If A is a unital C ∗ -algebra, by states on A one usually means the ϕ ∈ A∗ that
are positive and normalized, i.e. ϕ(1) = 1. In view of the above result, this is entirely consistent
with the Banach algebraic Definition B.146 of states. 2

Now we are in a position to use V (a) to characterize the elements of a C ∗ -algebra that satisfy
a = 0, a = a∗ and a ≥ 0, respectively, in analogy to the results for A ∈ B(H) in terms of W (A).

B.161 Proposition Let A be a unital complex C ∗ -algebra and a ∈ A. Then


(i) a = 0 ⇔ ϕ(a) = 0 for all ϕ ∈ A+ ⇔ V (a) = {0}.
(ii) a = a∗ ⇔ ϕ(a) ∈ R for all ϕ ∈ A+ ⇔ V (a) ⊆ R.

239
(iii) a ≥ 0 ⇔ ϕ(a) ≥ 0 for all ϕ ∈ A+ ⇔ V (a) ⊆ [0, ∞).
Proof. By Proposition B.159 the positive functionals are precisely the positive multiples of the
states ϕ appearing in the definition of V2 (a). In view of this connection between V (a) and A+ ,
the three rightmost equivalences are obvious.
The leftmost implications ⇒ are obvious in (i) and (iii). As to (ii), if a = a∗ there are
a± ∈ A+ such that a = a+ − a− . Now with ϕ(a± ) ≥ 0 we have ϕ(a) = ϕ(a+ ) − ϕ(a− ) ∈ R.
To prepare the proof of the leftmost implications ⇐, assume a = a∗ ≠ 0. Then a is normal,
thus r(a) = kak > 0, so that σ(a) contains a λ ≠ 0. Then λ ∈ V (a) by Proposition B.148, so
that by Proposition B.147 there exists ϕ ∈ A∗ with ϕ(1) = kϕk = 1 and ϕ(a) = λ ≠ 0. Since ϕ
is positive by Proposition B.159, we have proven that for every a = a∗ ≠ 0 there exists ϕ ∈ A+
with ϕ(a) ≠ 0.
For arbitrary a we have a = b + ic with b, c ∈ Asa . If ϕ ∈ A+ then ϕ(a) = ϕ(b) + iϕ(c),
where ϕ(b), ϕ(c) ∈ R. In view of this, ϕ(a) ∈ R implies ϕ(c) = 0, while ϕ(a) = 0 implies
ϕ(b) = ϕ(c) = 0. Together with the result just proven (if a = a∗ and ϕ(a) = 0 ∀ϕ ∈ A+ then
a = 0) this finishes the proof of both (i) and (ii). And if ϕ(a) ≥ 0 for all ϕ ∈ A+ then a = a∗
by (ii), and σ(a) ⊆ V (a) ⊆ [0, ∞) gives positivity of a, finishing (iii). 
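For matrices the easy directions of Proposition B.161 are visible directly (illustration only; the matrix is an ad hoc positive element of the form bb∗):

```python
import numpy as np

# Matrix version of Proposition B.161(iii), easy direction: a positive element
# a = bb* satisfies <av, v> >= 0 for all vector states (illustration only).
rng = np.random.default_rng(7)
b = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
a = b @ b.conj().T                       # a positive element
for _ in range(200):
    v = rng.normal(size=3) + 1j * rng.normal(size=3)
    val = np.vdot(v, a @ v)              # <av, v> = ||b* v||^2 >= 0
    assert abs(val.imag) < 1e-9 and val.real >= -1e-9
```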

B.162 Remark 1. We had already proven the equivalence a = 0 ⇔ V (a) = {0} for all unital
Banach algebras, but the above proof for C ∗ -algebras is considerably simpler.
2. We emphasise that V (a) ⊆ R implies a = a∗ , which is not implied by σ(a) ⊆ R.
3. Proposition B.161 shows that the states of a C ∗ -algebra can fulfill many of the tasks
of the characters in the commutative case. One easily checks that every character is a state.
But a state is a character only if it is multiplicative. By the Riesz-Markov-Kakutani theorem,
the states on C(X, C) are in bijection with the normalized positive measures on X, while the
characters correspond to the Dirac measures δx , x ∈ X, for which ∫_X f dδx = f (x). 2

B.163 Corollary For a unital C ∗ -algebra A and a ∈ A, the following are equivalent:
(i) a is hermitian, i.e. V (a) ⊆ R.
(ii) a = a∗ .
(iii) eita is unitary for all t ∈ R.
(iv) keita k = 1 ∀t ∈ R.
Proof. (i)⇔(ii) was part of Proposition B.161. (ii)⇒(iii) is elementary, see Remark 16.18.2.
(iii)⇒(iv) is immediate by the C ∗ -identity. And Corollary B.153 gives (iv)⇔(i). 

B.164 Remark In every Banach ∗-algebra one can still define self-adjointness by a = a∗ and
unitarity by u∗ u = uu∗ = 1. (Confusingly, even in the Banach-∗ literature 'hermitian' can be
used for either of (i) or (ii).) One then still has (ii)⇔(iii), where ⇐ follows by the computation

ia − ia∗ = (d/dt)|_{t=0} e^{ita} (e^{ita})∗ = (d/dt)|_{t=0} e^{ita} e^{−ita} = 0.

And of course (i)⇔(iv) as in every Banach algebra. But in the absence of the C ∗ -identity neither
(i)⇔(ii) nor (iii)⇔(iv) needs to hold. (It is not difficult to construct Banach ∗-algebras with
k1k = 1 having unitary elements with norm ≠ 1.) In fact, one can prove that every Banach
∗-algebra in which (ii)⇒(iv) holds is a C ∗ -algebra, cf. [58]. 2

240
We now study the relation of V (a), for C ∗ -algebra elements, to σ(a) and to the spatial
numerical range W (A) in the case A = B(H).

B.165 Exercise Let A be a complex unital C ∗ -algebra and a ∈ A.


(i) Prove V (a) = conv(σ(a)) for commutative A.
(ii) Prove V (a) = conv(σ(a)) for normal a. (C ∗ -version of Exercise B.144.)
Before we can give the promised proof of V (A) = W (A)‾ for A ∈ B(H), we need some
understanding of state spaces of concrete C ∗ -algebras, i.e. C ∗ -subalgebras of B(H):

B.166 Lemma Let A be a unital C ∗ -algebra. Then S(A) ⊆ A∗ is weak-∗-closed and convex.
Proof. Let ϕ1 , ϕ2 ∈ S(A) and t ∈ [0, 1]. Then ϕ = tϕ1 + (1 − t)ϕ2 is positive and normalized,
since ϕ(1) = 1. Thus ϕ ∈ S(A), proving convexity.
Let {ϕι } ⊂ S(A) be a net such that ϕι → ϕ ∈ A∗ in the weak-∗ topology. With ϕι (1) = 1 this
implies ϕ(1) = 1. By Alaoglu's theorem, A∗≤1 is weak-∗ compact, thus weak-∗ closed. With
ϕι ∈ A∗≤1 this implies ϕ ∈ A∗≤1 , thus kϕk ≤ 1. Together with ϕ(1) = 1 this gives kϕk = 1,
thus ϕ ∈ S(A). 

B.167 Proposition Let H be a complex Hilbert space and A ⊆ B(H) a C ∗ -subalgebra
with 1H ∈ A. Then with V S(A) = {ϕx = h· x, xi | x ∈ H, kxk = 1} ⊆ S(A) we have
cl_{w∗} conv(V S(A)) = S(A), where cl_{w∗} denotes the weak-∗ closure in A∗ .
Proof. If x ∈ H, kxk = 1 then ϕx : A 7→ hAx, xi is in A∗ and satisfies ϕx (1) = kϕx k =
1. Thus V S(A) ⊆ S(A), so that with Lemma B.166 we have cl_{w∗} conv(V S(A)) ⊆ S(A). If
cl_{w∗} conv(V S(A)) ≠ S(A), pick ϕ0 ∈ S(A)\cl_{w∗} conv(V S(A)). Then by Corollary B.60, applied to
(A∗ , τw∗ ), there exist a ψ ∈ (A∗ , τw∗ )∗ and t ∈ R such that

sup{Re ψ(ϕ) | ϕ ∈ cl_{w∗} conv(V S(A))} < t < Re ψ(ϕ0 ).   (B.21)

By Lemma 10.24(ii) there is a (unique) a ∈ A such that ψ(ϕ) = ϕ(a) for all ϕ ∈ A∗ . Writing
a = b + ic with b, c ∈ Asa and using that every ϕ ∈ S(A) assumes real values on Asa , so that
Re ϕ(a) = ϕ(b), (B.21) becomes

sup{ϕ(b) | ϕ ∈ cl_{w∗} conv(V S(A))} < t < ϕ0 (b).   (B.22)

In particular for ϕx = h· x, xi, where x ∈ H is a unit vector, this implies hbx, xi ≤ t. This is
equivalent to hbx, xi ≤ tkxk² for all x ∈ H, thus to h(t1 − b)x, xi ≥ 0 ∀x. With Proposition 17.9
this is equivalent to t1 − b ≥ 0. Since ϕ0 is a state, thus positive, this implies ϕ0 (t1 − b) ≥ 0,
thus ϕ0 (b) ≤ t. Since this contradicts (B.22), we indeed have cl_{w∗} conv(V S(A)) = S(A). 

B.168 Theorem Let H be a complex Hilbert space. Then for all A ∈ B(H) we have V (A) =
W (A)‾ and R(A) = 9A9, where the numerical range V (A) and radius R(A) are taken in the
Banach algebra B(H).
Proof. With W (A) = {ϕ(A) | ϕ ∈ V S(B(H))} it is clear that W (A) ⊆ V (A), thus W (A)‾ ⊆ V (A)
by closedness of V (A). Given λ ∈ V (A), pick ϕ ∈ S(B(H)) with ϕ(A) = λ. By Proposition B.167
there is a net {ϕι } ⊂ conv(V S(B(H))) such that ϕι → ϕ in the weak-∗ topology. In particular,
ϕι (A) → ϕ(A) = λ. Since ϕι ∈ conv(V S(B(H))) implies ϕι (A) ∈ conv(W (A)) = W (A), we have
proven λ ∈ W (A)‾, thus V (A) ⊆ W (A)‾. The equality R(A) = 9A9 is now obvious. 

B.13 Some more basic theory of C ∗ -algebras
B.13.1 The Fuglede-Putnam theorem
B.169 Theorem Let A be a unital C ∗ -algebra over C.
(i) Let a, c ∈ A. If a is normal and ac = ca then a∗ c = ca∗ (and ac∗ = c∗ a).
(ii) Let a, b, c ∈ A. If a, b are normal and ac = cb then a∗ c = cb∗ .
Proof. Obviously (i) is just the special case a = b of (ii).
(ii) We define f : C → A, z 7→ e^{za∗} c e^{−zb∗} , where e^a = exp(a) is defined in terms of the power
series as in Example 15.19. Expanding the two power series in the definition of f we have

f (z) = e^{za∗} c e^{−zb∗} = ( Σ_{k=0}^∞ z^k (a∗ )^k / k! ) c ( Σ_{l=0}^∞ (−z)^l (b∗ )^l / l! ) = Σ_{n=0}^∞ z^n dn   ∀z ∈ C

for certain dn ∈ A. (The reshuffling is justified by the absolute convergence of the series.) We
only need d1 = a∗ c − cb∗ , which is quite obvious. Thus the theorem follows if we prove d1 = 0.
By induction, the assumption ac = cb is seen to imply a^n c = cb^n . Multiplying by z^n /n! and
summing over n ∈ N0 gives e^{za} c = c e^{zb} for all z ∈ C, thus also c = e^{−z̄a} c e^{z̄b} . Thus

f (z) = e^{za∗} c e^{−zb∗} = e^{za∗} (e^{−z̄a} c e^{z̄b} ) e^{−zb∗} = e^{za∗−z̄a} c e^{z̄b−zb∗} = e^{2iIm(za∗)} c e^{−2iIm(zb∗)} ,

where e^{za∗} e^{−z̄a} = e^{za∗−z̄a} and e^{z̄b} e^{−zb∗} = e^{z̄b−zb∗} hold due to normality of a and b, respectively.
Now Im(za∗ ), Im(zb∗ ) are self-adjoint so that e^{2iIm(za∗)} and e^{2iIm(zb∗)} are unitary for all z ∈ C,
cf. Remark 16.18(ii), thus bounded. This proves that f : C → A is bounded. Now the following
Lemma implies d1 = 0, and we are done. 
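A finite-dimensional illustration of Fuglede's theorem (i) (the matrices are ad hoc examples, not from the text): a = diag(i, i, 2) is normal, and the c below commutes with a because it only mixes the eigenspace of the repeated eigenvalue; the theorem predicts that c then also commutes with a∗.

```python
import numpy as np

# Illustration of Theorem B.169(i): ac = ca with a normal forces a*c = ca*.
a = np.diag([1j, 1j, 2.0])               # normal, repeated eigenvalue i
c = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])          # commutes with a, not a polynomial in a
assert np.allclose(a @ c, c @ a)         # ac = ca
astar = a.conj().T
assert np.allclose(astar @ c, c @ astar) # a*c = ca*, as Fuglede predicts
```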

B.170 Lemma Let A be a unital Banach algebra and {dn }n∈N0 ⊆ A such that f : C → A,
z 7→ Σ_{n=0}^∞ z^n dn converges absolutely for all z ∈ C. (⇔ kdn k^{1/n} → 0.) If kf (z)k ≤ C|z|^M ∀z ∈ C
then dn = 0 for all n > M . In particular, if f is bounded then dn = 0 ∀n ≥ 1.
Proof. Let r > 0, m ∈ N. Then similarly to (B.20) we have

∫₀^{2π} e^{−imt} f (re^{it}) dt = ∫₀^{2π} Σ_{n=0}^∞ e^{−imt} r^n e^{int} dn dt = Σ_{n=0}^∞ r^n dn ∫₀^{2π} e^{i(n−m)t} dt = 2πr^m dm ,

where we used the uniform convergence of the series on bounded sets and ∫₀^{2π} e^{i(n−m)t} dt = 2πδn,m .
With k∫₀^{2π} e^{−imt} f (re^{it}) dtk ≤ ∫₀^{2π} kf (re^{it})k dt ≤ 2πCr^M we have

kdm k = (1/(2πr^m)) k ∫₀^{2π} e^{−imt} f (re^{it}) dt k ≤ C r^M /r^m   ∀m ∈ N, r > 0.

Since the r.h.s. tends to zero as r → +∞ if m > M , the claim follows. 
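The coefficient-extraction integral in the proof can be tested numerically for a scalar series (illustration only): for f (z) = e^z one has dm = 1/m!, and the integral over [0, 2π] is approximated by a Riemann sum, which is extremely accurate for periodic integrands.

```python
import numpy as np
from math import factorial

# Recovering d_m = 1/m! of f(z) = e^z via (1/(2*pi*r^m)) * integral
# of e^{-imt} f(r e^{it}) over [0, 2pi] (illustration of the proof's formula).
m, r, N = 3, 2.0, 4096
t = np.arange(N) * 2 * np.pi / N
vals = np.exp(-1j * m * t) * np.exp(r * np.exp(1j * t))   # e^{-imt} f(re^{it})
d_m = vals.mean() / r**m          # Riemann sum of (1/2pi) * integral, then /r^m
assert abs(d_m - 1 / factorial(m)) < 1e-10
```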

B.171 Remark 1. Part (i) of the theorem was proven by Fuglede in 1950, (ii) by Putnam in
1951. The above elegant proof is due to Rosenblum (1958).143
2. For A = C, M = 0 the function f is entire and the lemma reduces to Liouville's theorem
from complex analysis. But the general lemma can be deduced from Liouville's theorem: For
143 Bent Fuglede (1925-2023), Danish mathematician. Calvin Richard Putnam (1924-2008), Marvin Rosenblum (1926-
2003), American mathematicians.

every ϕ ∈ A∗ , the function z 7→ ϕ(f (z)) = Σ_{n=0}^∞ z^n ϕ(dn ) is entire and bounded, thus constant.
Thus for all z, z′ ∈ C, ϕ ∈ A∗ we have ϕ(f (z) − f (z′ )) = 0. The Hahn-Banach theorem now
implies f (z) − f (z′ ) = 0 ∀z, z′ , thus f is constant, so that dn = 0 ∀n ≥ 1.
3. But as our proof of the lemma shows, no use of complex analysis (holomorphicity etc.)
is needed if the function is a priori given by an everywhere convergent power series. Thus as in
Rickart’s proof of Theorem 13.39 the invocation of complex analysis can be replaced by much
simpler harmonic analysis. As there, also here the integration can be removed: 2

B.172 Exercise Give a proof of Lemma B.170 that does not use integration.

B.13.2 Homomorphisms of C ∗ -algebras


B.173 Theorem Let A be a Banach ∗-algebra, B a C ∗ -algebra and α : A → B a ∗-homomorphism.
Then kαk ≤ 1, thus α is continuous.
Proof. Assume first that A, B, α are unital. Let a ∈ A. By Lemma 15.2 we have σB (α(a)) ⊆
σA (a), thus rB (α(a)) ≤ rA (a). Now the claim follows from

kα(a)k2 = kα(a)∗ α(a)k = kα(a∗ a)k = rB (α(a∗ a)) ≤ rA (a∗ a) ≤ ka∗ ak ≤ kak2 ,

where we used the C ∗ -identity for B, the fact that α is a ∗-homomorphism, the fact that
r(b) = kbk for normal elements of the C ∗ -algebra B, Lemma 15.2, r(a) ≤ kak and the inequality
ka∗ ak ≤ kak2 holding in every Banach ∗-algebra.
If α is non-unital, in particular if A or B has no unit, then α̃ : Ã → B̃, (a, λ) 7→ (α(a), λ)
is easily seen to be a unital ∗-homomorphism. Since B̃ is a C ∗ -algebra by Exercise 16.10, the
above applies to α̃, and we have kαk ≤ kα̃k ≤ 1. 

B.174 Remark 1. The above result is one of many cases in the theory of C ∗ -algebras where
the ‘algebra dictates the analysis’. Further instances are Theorem B.176 and Corollary B.177.
2. For (∗-)homomorphisms between general Banach (∗-)algebras the question of continuity
of homomorphisms is much more complicated, with connections to foundational matters like
the continuum hypothesis, cf. [32]. 2

B.175 Exercise Let A be a Banach ∗-algebra. For a ∈ A define kak∗ = sup(H,π) kπ(a)kB(H) ,
where H is a Hilbert space and π : A → B(H) is a ∗-homomorphism. Prove that k · k∗ is a
C ∗ -seminorm on A.

B.176 Theorem Every injective ∗-homomorphism α : A → B of C ∗ -algebras is an isometry,


and α(A) ⊆ B is closed.
Proof. If α is non-unital, define a unital homomorphism α̃ : Ã → B̃ as in the proof of Theorem
B.173. Now α̃ is injective, thus if we prove the theorem in the unital case we have that α̃, and
hence also α = α̃  A, is isometric. Since A is complete, this implies closedness of α(A) ⊆ B.
Thus α is an isometric ∗-isomorphism onto a C ∗ -subalgebra of B.
∗-isomorphism onto a C ∗ -subalgebra of B.
Assume thus α to be unital. It suffices to prove kα(a)k = kak for self-adjoint a ∈ A, since
then using the C ∗ -identity we have kα(a)k2 = kα(a)∗ α(a)k = kα(a∗ a)k = ka∗ ak = kak2 ∀a ∈ A.
Let thus a = a∗ ∈ A. We claim that σA (a) = σB (α(a)). [We cannot invoke Theorem
16.19 since we do not yet know α(A) ⊆ B to be closed!] This will imply kak = rA (a) =
rB (α(a)) = kα(a)k, as claimed. By (part of) Theorem 17.2 there are isometric ∗-homomorphisms
γ : C(σ(a), C) → A and γ′ : C(σ(α(a)), C) → B continuously extending the maps P 7→ P (a)

and P 7→ P (α(a)). Assuming σ(α(a)) ⊊ σ(a), we can find a non-zero f ∈ C(σ(a), C) that
vanishes on σ(α(a)). (Since we are dealing with subsets of R, this can be done by hand,
without Urysohn's lemma.) If {Pn } is a sequence of polynomials converging uniformly to f on
σ(a) ⊆ R then the sequences {Pn (a)} and {Pn (α(a))} converge to γ(f ) = f (a) ∈ A
and γ′ (f ) = f (α(a)) ∈ B, respectively. Since γ, γ′ are isometric we have kf (a)k = kf kσ(a) ≠ 0
and kf (α(a))k = kf kσ(α(a)) = 0. Since α is a unital homomorphism, we have α(Pn (a)) =
Pn (α(a)) ∀n. The r.h.s. converges to f (α(a)), the l.h.s. to α(f (a)) by continuity of α (Theorem
B.173), so that α(f (a)) = f (α(a)) = 0. Since f (a) ≠ 0, this contradicts the injectivity of α.
Thus we have σ(α(a)) = σ(a) as claimed. 

B.177 Corollary If α : A → B is a ∗-homomorphism of C ∗ -algebras then α(A) ⊆ B is closed.


Proof. I = ker α clearly is a two-sided self-adjoint ideal, and it is closed since α is continuous
by Theorem B.173. Thus by Exercise 16.12 and Remark 16.13, C = A/I is a C ∗ -algebra, and
the induced homomorphism γ : C → B is a ∗-homomorphism. By construction, γ is injective,
thus an isometry with closed image γ(C) ⊆ B by Theorem B.176. With α(A) = γ(C) we are
done. 

B.14 Unbounded operators (mostly on Hilbert space)


B.14.1 Basic definitions. Closed and closable operators
B.178 Definition A (possibly unbounded) operator on a Banach space V is a pair (D, A),
where D ⊆ V is a dense linear subspace and A : D → V is a linear map.
In practice, we just denote an unbounded operator by a letter, say A, and denote its domain
by D(A), DA or just D.

B.179 Remark 1. We ignore operators with non-dense domain, since they are of very limited
use. But assuming dense domain forces us to check for every construction of a new operator
whether its domain is dense.
2. Writing ‘unbounded operator’ is undesirable since it would exclude the bounded ones.
Since ‘possibly unbounded operator’ is unbearable, we will just write ‘operator’, ‘densely defined
linear’ being implied.
3. If A is an operator and t ∈ F it is immediate how to define tA on the domain D(A). 2

B.180 Definition Let A, B be operators on V . Then A + B has domain D(A) ∩ D(B), on


which it is defined as x 7→ Ax + Bx.144 And AB is defined as x 7→ A(Bx) on the domain
D(AB) = D(B) ∩ B −1 (D(A)) = {x ∈ D(B) | Bx ∈ D(A)}.
Note that it is quite possible for D(A + B) or D(AB) to be non-dense or even trivial, i.e.
{0}.

B.181 Definition Let V be a Banach space.


• The graph of an operator A is G(A) = {(x, Ax) | x ∈ D(A)} ⊆ V ⊕ V .
• If A, B are operators with D(A) ⊆ D(B) and B  D(A) = A (equivalently, G(A) ⊆ G(B))
then B is called an extension of A, denoted A ⊆ B.
144 In particular, D(A + c1) = D(A) with A + c1 : x 7→ Ax + cx, which we will use often.

• An operator A is closed if G(A) ⊆ V ⊕ V is closed, and closable if G(A)‾ is the graph of an
operator. That operator, the closure of A, is then denoted A‾.
• If A is closed with domain D then a linear subspace D0 ⊆ D is a core for A if (A  D0 )‾ = A.

B.182 Remark 1. If A is defined on D = V then closedness of A (thus of G(A)) is equivalent
to boundedness by the closed graph theorem. But an operator whose domain is only dense can
be closed without being bounded! In this case we cannot appeal to Lemma 3.12 to extend A to
all of V .
2. Since G(A)‾ is a vector space, for closability of A it is necessary and sufficient that G(A)‾
not contain (0, a) with a ≠ 0.
3. Trivially, closed ⇒ closable. If A is closable then A‾ is an extension of A. If A admits some
closed extension B then A is closable: We have G(A) ⊆ G(B), thus G(A)‾ ⊆ G(B) by closedness
of B. This implies that G(A)‾ is the graph of an operator A‾. 2

If V is a Banach space and K, L ⊆ V are closed subspaces, it can happen that K + L ⊆ V


is non-closed. See Exercises 7.13 and 7.44. We focus on the case where K ∩ L = {0}. Replacing
V by cl(K + L), we may assume that K + L is dense.

B.183 Exercise Let V be a Banach space and K, L closed linear subspaces such that K ∩ L =
{0} and cl(K + L) = V . Put D = K + L and define S : D → D by S(k + l) = k − l for all
k ∈ K, l ∈ L. Prove:
(i) S is bounded if and only if K + L = V .
(ii) S is always closed.

B.14.2 Adjoints of unbounded Hilbert space operators


We now focus on Hilbert spaces.

B.184 Proposition Let A be an operator on H with dense domain D(A). Let

D(A∗ ) = {y ∈ H | the functional D(A) ∋ x 7→ hAx, yi is bounded}.

Then
(i) For each y ∈ D(A∗ ) there is a unique zy ∈ H such that hAx, yi = hx, zy i for all x ∈ D(A).
We put A∗ y = zy .
(ii) D(A∗ ) ⊆ H is a linear subspace and the map A∗ : D(A∗ ) → H is linear.
Proof. (i) Let y ∈ D(A∗ ). By density of D(A), the bounded functional x 7→ hAx, yi extends
continuously to all of H, so Riesz's representation theorem provides zy ∈ H with hAx, yi = hx, zy i
for all x ∈ D(A). As to uniqueness, assume z, z ′ ∈ H satisfy hAx, yi = hx, zi = hx, z ′ i for all
x ∈ D(A). This implies hx, z − z ′ i = 0 for all x ∈ D(A). Since D(A) is dense, z − z ′ = 0 follows.
(ii) Let y ∈ D(A∗ ) and c ∈ F. Then hAx, yi = hx, A∗ yi for all x ∈ D(A), implying
hAx, cyi = hx, cA∗ yi ∀x ∈ D(A). It follows that cy ∈ D(A∗ ) and A∗ (cy) = cA∗ y. Now
let y, y ′ ∈ D(A∗ ). Adding the equations hAx, yi = hx, A∗ yi and hAx, y ′ i = hx, A∗ y ′ i gives
hAx, y + y ′ i = hx, A∗ (y + y ′ )i, showing y + y ′ ∈ D(A∗ ) and A∗ (y + y ′ ) = A∗ y + A∗ y ′ . 

Thus hAx, yi = hx, A∗ yi ∀x ∈ D(A), y ∈ D(A∗ ).
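In finite dimension this relation is just the familiar one for the conjugate-transpose matrix. The following minimal Python sketch (illustrative only: in finite dimension every operator is bounded and D(A) = H, so all domain subtleties disappear) checks hAx, yi = hx, A∗ yi for a small complex matrix, with the inner product linear in the first and conjugate-linear in the second argument, as above:

```python
# Sanity check of <Ax, y> = <x, A*y> for A* the conjugate-transpose matrix.

def inner(u, v):
    """<u, v> = sum_i u_i * conj(v_i): linear in u, conjugate-linear in v."""
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def adjoint(M):
    """Conjugate transpose."""
    n, m = len(M), len(M[0])
    return [[M[i][j].conjugate() for i in range(n)] for j in range(m)]

A = [[1 + 2j, 3j], [0 - 1j, 4 + 0j]]
x = [1 + 1j, 2 - 3j]
y = [0 + 2j, 5 + 1j]

lhs = inner(apply(A, x), y)
rhs = inner(x, apply(adjoint(A), y))
assert abs(lhs - rhs) < 1e-12
```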

B.185 Exercise Prove that A ⊆ B implies B ∗ ⊆ A∗ .

B.186 Theorem Let A be a densely defined operator on H. Then

(i) The graph of A∗ is closed.
(ii) The domain D(A∗ ) of A∗ is dense if and only if A is closable.

(iii) If A is closable then Ā∗ = A∗ and Ā = A∗∗ . (In particular A closed ⇒ A = A∗∗ .)
Proof. (i) For frequent later use we notice that for every unitary U on H and every linear
subspace E (not necessarily closed) the identity U E ⊥ = (U E)⊥ holds. Now equip H ⊕ H with
the obvious inner product: h(a, b), (c, d)i = ha, ci + hb, di. With this it is trivial to check that
the linear operator V : H ⊕ H → H ⊕ H, (x, y) 7→ (−y, x) is unitary.
We claim that the following holds for every densely defined A:

G(A∗ ) = V G(A)⊥ = (V G(A))⊥ . (B.23)

This follows from the chain of equivalences

(x, y) ∈ V G(A)⊥ = (V G(A))⊥


⇔ h(x, y), V (z, Az)i = 0 ∀z ∈ D(A)
⇔ h(x, y), (−Az, z)i = 0 ∀z ∈ D(A)
⇔ hx, Azi = hy, zi ∀z ∈ D(A)
⇔ hAz, xi = hz, yi ∀z ∈ D(A)
⇔ x ∈ D(A∗ ), y = A∗ x ⇔ (x, y) ∈ G(A∗ ).

As an orthogonal complement, G(A)⊥ is closed, and the closedness of G(A∗ ) follows from (B.23).
(ii⇐) Since G(A) ⊆ H ⊕ H is a linear subspace, we have

cl G(A) = G(A)⊥⊥ = V V G(A)⊥⊥ = V (V G(A)⊥ )⊥ = V G(A∗ )⊥ (B.24)

where we used V 2 = −1, the commutativity of V and ⊥, and (B.23).


If now D(A∗ ) is not dense, we can find x ∈ D(A∗ )⊥ \{0}. Then for each y ∈ D(A∗ ) we
have h(x, 0), (y, A∗ y)i = hx, yi = 0, implying (x, 0) ∈ G(A∗ )⊥ . Thus using (B.24) we find
(0, x) = V (x, 0) ∈ V G(A∗ )⊥ = cl G(A). In view of x ≠ 0 and (0, 0) ∈ cl G(A), this shows that
cl G(A) is not the graph of an operator and thus A is not closable.
(ii⇒)+(iii) Assuming that D(A∗ ) is dense, we can define A∗∗ . Replacing A by A∗ in (B.23)
gives G(A∗∗ ) = V G(A∗ )⊥ = V V G(A)⊥⊥ = G(A)⊥⊥ = cl G(A). Thus cl G(A) is the graph of
the operator A∗∗ , showing that A is closable with Ā = A∗∗ .

The remaining claim in (iii) follows from the computation A∗ = cl(A∗ ) = A∗∗∗ = Ā∗ , where we
used, in turn, the closedness of A∗ , the already proven identity applied to A∗ (which is closed,
thus closable, with cl(A∗ ) = A∗∗∗ ), and the closability of A with Ā = A∗∗ , giving A∗∗∗ =
(A∗∗ )∗ = Ā∗ . 

B.187 Definition An operator A with dense domain D ⊆ H is called


• symmetric if hAx, yi = hx, Ayi ∀x, y ∈ D.
• self-adjoint if A = A∗ , including D(A∗ ) = D(A).
We notice:
• A is symmetric if and only if A ⊆ A∗ .
• Thus self-adjoint ⇒ symmetric, and a symmetric A is self-adjoint if and only if D(A∗ ) =
D(A).
• Let A ⊆ B. Then B symmetric ⇒ A symmetric. But the converse need not hold.

• If A is symmetric then A ⊆ A∗∗ ⊆ A∗ (and conversely). This follows from the fact that
A∗ is closed and therefore contains the closure Ā = A∗∗ of A.
• The closed symmetric operators are those satisfying A = A∗∗ ⊆ A∗ .

B.188 Exercise Prove that every symmetric operator is closable, and its closure is symmetric.

B.189 Exercise Give an example of A ⊊ B ⊆ C with A, B symmetric but C not symmetric.

B.190 Exercise Let H = `2 (N, C) and h : N → R. Define D = {f ∈ H | hf ∈ H} (pointwise


multiplication) and Af = hf . Prove that A is self-adjoint.

B.191 Definition An operator A is essentially self-adjoint if it is closable with Ā self-adjoint.

B.192 Exercise Prove that the following are equivalent:


(i) A is essentially self-adjoint.
(ii) A ⊆ A∗∗ = A∗ .
(iii) A and A∗ are symmetric.

B.193 Lemma If A is essentially self-adjoint then Ā is the only self-adjoint extension of A.


Proof. Let B be a self-adjoint extension of A. Thus A ⊆ B = B ∗ . Since B ∗ is closed, this
implies A ⊆ Ā ⊆ B = B ∗ ⊆ A∗ = Ā∗ = Ā. Thus Ā = B. 

One can show that the converse is also true: If A has a unique self-adjoint extension B then
A is essentially self-adjoint, and Ā therefore coincides with B.

B.194 Definition A symmetric operator A is maximal symmetric if every symmetric extension


B ⊇ A coincides with A.

B.195 Exercise Prove that self-adjoint ⇒ maximal symmetric ⇒ closed symmetric.


The following diagram summarizes the implications:

A self-adjoint ==⇒ A max. symm. ==⇒ A closed symmetric ==⇒ A closed
      ⇓                                       ⇓                  ⇓
A ess. s.-a. =========================⇒ A symmetric =======⇒ A closable

B.14.3 Basic criterion for (essential) self-adjointness


Verifying (essential) self-adjointness of an operator directly from the definitions can be tedious,
making it desirable to have manageable criteria. We need some preparations.
The kernel and image of an unbounded operator A : D → H (where D ⊆ H) are defined in the
obvious way, namely as ker A = {x ∈ D | Ax = 0} and AD, respectively.

B.196 Lemma For every (densely defined, of course) operator A

ker A∗ = (AD)⊥ .

Proof. The proof is essentially that of Lemma 11.10(ii), with some attention to the domains:
We have hAx, yi = hx, A∗ yi for all x ∈ D(A), y ∈ D(A∗ ). Thus

y ∈ ker A∗ ⇔ hAx, yi = 0 ∀x ∈ D(A) ⇔ y ∈ (AD)⊥ ,

where for ⇐ of the first equivalence note that if hAx, yi = 0 for all x ∈ D(A), the functional
x 7→ hAx, yi is trivially bounded, so that y ∈ D(A∗ ) with A∗ y = 0. 
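In finite dimension (D = H, so no domain issues arise) Lemma B.196 reduces to the familiar linear-algebra fact ker A∗ = (im A)⊥ . A minimal Python check for a hypothetical rank-one real matrix:

```python
# ker A* = (image of A)^perp for a rank-one real 2x2 matrix.

A = [[1.0, 2.0], [2.0, 4.0]]  # rank 1: image = span{(1, 2)}
A_star = [[A[j][i] for j in range(2)] for i in range(2)]  # transpose (real case)

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

y = [2.0, -1.0]  # orthogonal to (1, 2)
assert all(abs(c) < 1e-12 for c in apply(A_star, y))  # y lies in ker A*
# y is orthogonal to Ax for several sample vectors x:
for x in ([1.0, 0.0], [0.0, 1.0], [3.0, -2.0]):
    assert abs(inner(apply(A, x), y)) < 1e-12
```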

B.197 Theorem Let A be a symmetric operator with domain D ⊆ H. Then the following are
equivalent:
(i) A is self-adjoint.
(ii) A is closed and ker(A∗ ± i) = {0}. (A∗ ± i both injective.)
(iii) (A ± i)D = H. (A ± i both surjective.)
(For an unbounded A, one defines A + c1 in the obvious way on the domain D(A).)
Proof. (i)⇒(ii) Being an adjoint, A = A∗ is closed by Theorem B.186(i). Now let x ∈ D(A) =
D(A∗ ). If A∗ x = ix then Ax = ix by A = A∗ , thus

ihx, xi = hix, xi = hAx, xi = hx, A∗ xi = hx, Axi = hx, ixi = −ihx, xi,

implying x = 0 and therefore ker(A − i) = {0}. The identity ker(A + i) = {0} is proven in the
same way.
(ii)⇒(iii) Since A∗ + i = (A − i)∗ is injective, Lemma B.196 gives ((A − i)D)⊥ = {0}, so that
(A − i)D is dense. It remains to prove that (A − i)D is closed. Using that A is symmetric we find

k(A − i)xk2 = h(A − i)x, (A − i)xi = kAxk2 + kxk2 − ihx, Axi + ihAx, xi
= kAxk2 + kxk2

and therefore k(A − i)xk ≥ kxk for all x ∈ D(A). The rest of the proof is an adaptation of
Lemma 7.39 to unbounded operators: By the inequality, the map A − i : D(A) → H is injective,
thus A − i : D(A) → (A − i)D(A) is a bijection. If {yn } is a Cauchy sequence in (A − i)D(A)
then the inequality gives that {xn = (A − i)−1 (yn )} is a Cauchy sequence in D(A). Thus
{(xn , yn )} ⊆ G(A−i) is Cauchy. The closedness of A implies closedness of A−i, thus of G(A−i),
so that (xn , yn ) → (x, y) ∈ G(A − i). This proves that limn yn = y = (A − i)x ∈ (A − i)D, so
that (A − i)D is closed, completing the proof of surjectivity of A − i. For A + i one argues analogously.
(iii)⇒(i) Let y ∈ D(A∗ ). Since (A − i)D = H, there exists x ∈ D(A) such that

(A − i)x = (A∗ − i)y. (B.25)

Since A is symmetric, x ∈ D(A) ⊆ D(A∗ ) and A∗ x = Ax. Thus (B.25) rewrites as

(A∗ − i)(x − y) = 0. (B.26)

Appealing to (A + i)D = H, Lemma B.196 gives ker(A∗ − i) = ker((A + i)∗ ) = ((A + i)D)⊥ =
H ⊥ = {0}, thus injectivity of A∗ − i. Combining this with (B.26) we obtain x − y = 0, so that
y = x ∈ D(A). This proves D(A∗ ) ⊆ D(A), thus A = A∗ . 

Similarly:

B.198 Corollary Let A be symmetric. Then the following are equivalent:


(i) A is essentially self-adjoint.

(ii) ker(A∗ ± i) = {0}. (A∗ ± i both injective.)
(iii) cl((A ± i)D) = H. ((A ± i)D both dense.)
Proof. (i)⇒(ii) By assumption A is closable with self-adjoint closure Ā. Now the theorem gives
ker(Ā∗ ± i) = {0}, and with Ā∗ = A∗ we have (ii) of the corollary.
(ii)⇒(i) From Exercise B.188 we know that A is closable with Ā symmetric. With A∗ = Ā∗ ,
hypothesis (ii) gives ker(Ā∗ ± i) = {0}, so that the closed operator Ā is self-adjoint by implication
(ii)⇒(i) in the theorem. Equivalently, A is essentially self-adjoint.
(ii)⇔(iii) This is immediate from Lemma B.196, applied to A ± i. 
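The identity k(A − i)xk2 = kAxk2 + kxk2 from the proof of Theorem B.197 holds for every symmetric A; for a Hermitian matrix (a bounded symmetric operator) it can be checked numerically. A minimal sketch with a hypothetical 2×2 example:

```python
# Check ||(A - i)x||^2 = ||Ax||^2 + ||x||^2 for a Hermitian 2x2 matrix.

A = [[2 + 0j, 1 - 1j], [1 + 1j, -3 + 0j]]  # Hermitian: A equals its conjugate transpose

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

def normsq(v):
    return sum(abs(c) ** 2 for c in v)

x = [1 + 2j, -1 + 0.5j]
Ax = apply(A, x)
Amix = [Ax[k] - 1j * x[k] for k in range(2)]  # (A - i)x
# The cross terms -i<x, Ax> + i<Ax, x> cancel by symmetry of A:
assert abs(normsq(Amix) - (normsq(Ax) + normsq(x))) < 1e-9
```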

B.15 Glimpse of non-linear FA: Schauder’s fixed point theorem


In this final section we give a glimpse of non-linear functional analysis by proving Schauder’s
fixed point theorem, which is a generalization of Brouwer’s fixed point theorem to Banach
spaces.

B.199 Definition A topological space X has the fixed-point property if for every continuous
map f : X → X there is x ∈ X such that f (x) = x, i.e. a fixed-point.

B.200 Theorem (Brouwer, Hadamard, 1910)145 The cube [0, 1]n has the fixed point property.
The same holds for every non-empty compact convex subset of Rn .
The second result follows from the first since such an X is homeomorphic to some [0, 1]m .
There are many proofs of the first result; for what probably is the simplest one (due to Kulpa),
using only some easy combinatorics, see [108]. (Proofs using algebraic topology or analysis
involve inessential elements and do not reduce the combinatorics.)

B.201 Theorem (Schauder 1930) Every non-empty compact convex subset K of a normed
vector space has the fixed point property.
Proof. Let (V, k · k) be a normed vector space, K ⊆ V a non-empty compact convex subset
and f : K → K continuous. Let ε > 0. Since K is compact, thus totally bounded, there
are x1 , . . . , xn ∈ K such that K ⊆ B(x1 , ε) ∪ · · · ∪ B(xn , ε). Thus if we define continuous
functions αi : K → R+ , i = 1, . . . , n by

αi (x) = max(ε − kx − xi k, 0),

we see that for each x ∈ K there is at least one i such that αi (x) > 0. Since the αi are
continuous, so is the map

Pε : K → K,   Pε (x) = (α1 (x)x1 + · · · + αn (x)xn ) / (α1 (x) + · · · + αn (x)).
Since Pε (x) is a convex combination of those xi for which kx − xi k < ε, we have kPε (x) − xk < ε
for all x ∈ K. The finite-dimensional subspace Vn = span(x1 , . . . , xn ) ⊆ V is isomorphic to some
Rm , and by Corollary 2.32 the restriction of the norm k · k to Vn is equivalent to the Euclidean
norm on Rm . Thus the convex hull conv(x1 , . . . , xn ) ⊆ Vn into which Pε maps is homeomorphic
to a compact convex subset of Rm and thus has the fixed point property by Theorem B.200.
145 Luitzen Egbertus Jan Brouwer (1881-1966). Dutch mathematician. Important contributions to topology, founding
of intuitionism. Jacques Hadamard (1865-1963). French mathematician.

Thus if we define fε = Pε ◦ f then fε maps conv(x1 , . . . , xn ) into itself and thus has a fixed point
x0 = fε (x0 ). Now,

kx0 − f (x0 )k ≤ kx0 − fε (x0 )k + kfε (x0 ) − f (x0 )k = kfε (x0 ) − f (x0 )k = kPε (f (x0 )) − f (x0 )k < ε.

Since ε > 0 was arbitrary, we find inf{kx − f (x)k | x ∈ K} = 0. Since K is compact and
x 7→ kx − f (x)k is continuous, the infimum is attained, thus f has a fixed point in K. 
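The approximation map Pε from the proof is easy to implement. The following Python sketch (in R2 , with hypothetical sample centers and points) verifies the key estimate kPε (x) − xk < ε at points covered by the ε-balls:

```python
# The Schauder approximation map: P_eps(x) is the convex combination of the
# centers x_i, weighted by alpha_i(x) = max(eps - ||x - x_i||, 0).

def dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def P_eps(x, centers, eps):
    weights = [max(eps - dist(x, c), 0.0) for c in centers]
    total = sum(weights)
    assert total > 0, "x must lie in some eps-ball around a center"
    return [sum(w * c[k] for w, c in zip(weights, centers)) / total
            for k in range(len(x))]

centers = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
eps = 0.4
# P_eps averages only centers within distance eps of x, so ||P_eps(x) - x|| < eps:
for x in [(0.1, 0.1), (0.3, 0.4), (0.45, 0.2)]:
    assert dist(P_eps(list(x), centers, eps), x) < eps
```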

The use of methods/results from algebraic topology is quite typical for non-linear functional
analysis. (But also linear functional analysis connects to algebraic topology, for example via
K-theory, cf. e.g. [110, Chapter 7].)

B.202 Definition Let V be a Banach space and W ⊆ V . A map f : W → V is called compact
if it is continuous and f (S) ⊆ V is precompact for every bounded S ⊆ W .

B.203 Corollary Let V be a Banach space and C ⊆ V closed, bounded and convex. If
f : C → V is compact and f (C) ⊆ C then f has a fixed point in C.
Proof. Since C is bounded and f is compact, f (C) ⊆ V is precompact, thus K = cl conv(f (C))
is compact by Mazur's Theorem B.75, and convex. Thus K has the fixed point property by
Schauder's theorem. Since C is closed and convex, we have K ⊆ C; moreover f is defined on K
and maps it into f (K) ⊆ f (C) ⊆ K. Thus f ↾ K has a fixed point x ∈ K ⊆ C. 

C The mathematicians encountered in these notes
• Marc-Antoine Parseval (1755-1836)
• Friedrich Bessel (1784-1846)
• Augustin-Louis Cauchy (1789-1857)
• Viktor Yakovlevich Bunyakovski (1804-1889)
• Sir William Rowan Hamilton (1805-1865)
• Karl Theodor Wilhelm Weierstrass (1815-1897)
• Eduard Heine (1821-1881)
• Bernhard Riemann (1826-1866)
• Carl Gottfried Neumann (1832-1925)
• Karl Hermann Armandus Schwarz (1843-1921)
• Giulio Ascoli (1843-1896)
• Cesare Arzelà (1847-1912)
• Adolf Hurwitz (1859-1919)
• Otto Hölder (1859-1937)
• Vito Volterra (1860-1940)
• Eliakim Hastings Moore (1862-1932)
• David Hilbert (1862-1943)
• Hermann Minkowski (1864-1909)
• Jacques Hadamard (1865-1963)
• Erik Ivar Fredholm (1866-1927)
• Felix Hausdorff (1868-1942)
• Ernst Steinitz (1871-1928)
• Emile Borel (1871-1956)
• Constantin Carathéodory (1873-1950)
• René-Louis Baire (1874-1932)
• Issai Schur (1874-1941)
• Ernst Sigismund Fischer (1875-1954)
• Erhard Schmidt (1876-1959)
• Maurice Fréchet (1878-1973)
• Hans Hahn (1879-1934)
• Frigyes Riesz (1880-1956)
• Sergei Natanovich Bernstein (1880-1968)
• Otto Toeplitz (1881-1940)
• Luitzen Egbertus Jan Brouwer (1881-1966)
• Ernst David Hellinger (1883-1950)

• Eduard Helly (1884-1943)
• Hermann Weyl (1885-1955)
• Marcel Riesz (1886-1969)
• Paul Lévy (1886-1971)
• Hugo Steinhaus (1887-1972)
• Stefan Banach (1892-1945)
• Eduard Čech (1893-1960)
• Norbert Wiener (1894-1964)
• Juliusz Schauder (1899-1943)
• Herman Auerbach (1901-1942)
• John von Neumann (1903-1957)
• Andrey Andreyevich Markov (1903-1979)
• Andrey Nikolaevich Kolmogorov (1903-1987)
• Marshall Harvey Stone (1903-1989)
• Wladyslaw Orlicz (1903-1990)
• Henri Cartan (1904-2008)
• Stanislaw Mazur (1905-1981)
• Arne Beurling (1905-1986)
• Henry Frederic Bohnenblust (1906-2000)
• Nachman Aronszajn (1907-1980)
• Mark Grigorievich Krein (1907-1989)
• Meier Eidelheit (1910-1943)
• Angus Ellis Taylor (1911-1999)
• Shizuo Kakutani (1911-2004)
• David Milman (1912-1982)
• Billy James Pettis (1913-1979)
• Charles Earl Rickart (1913-2002)
• Herman Heine Goldstine (1913-2004)
• Israel Moiseevich Gelfand (1913-2009)
• Vitold Lvovich Šmulyan (1914-1944)
• Leonidas Alaoglu (1914-1981)
• Laurent Schwartz (1915-2002)
• Gustave Choquet (1915-2006)
• Frederick Valentine Atkinson (1916-2002)
• Aryeh Dvoretzky (1916-2008)
• William Frederick Eberlein (1917-1986)

• Robert Clarke James (1918-2004)
• Jerzy Loś (1920-1988)
• Nicolaas Hendrik Kuiper (1920-1994)
• Claude Ambrose Rogers (1920-2005)
• Mychajlo Jossypowytsch Kadets (1923-2011)
• Calvin Richard Putnam (1924-2008)
• Bent Fuglede (1925-2023)
• Robert Ralph Phelps (1926-2013)
• Czeslaw Ryll-Nardzewski (1926-2015)
• Kennan Tayler Smith (1926-2000)
• Felix Earl Browder (1927-2016)
• Errett Albert Bishop (1928-1983)
• Alexander Grothendieck (1928-2014)
• Wilhelmus Anthonius Josephus Luxemburg (1929-2018)
• Karel de Leeuw (1930-1978)
• Shaul Reuven Foguel (1931-2020)
• Aleksander Pelczyński (1932-2012)
• Czeslaw Bessaga (1932-2021)
• John Robert Ringrose (b. 1932)
• Lior Tzafriri (1936-2008)
• Joram Lindenstrauss (1936-2012)
• Stefan Oscar Walter Hildebrandt (1936-2015)
• Haskell Paul Rosenthal (1940-2021)
• Paul Robert Chernoff (1942-2017)
• Stanislaw Kwapień (b. 1942)
• Per Henrik Enflo (b. 1944)
• Victor Lomonosov (1946-2018).
Embarrassingly the above list contains no women. In the related areas of PDEs and vari-
ational calculus (which is functional analysis, but non-linear) there have been quite a few, in
particular Sofia Kowalevskaya (1850-1891), Emmy Noether (1882-1935), Olga Ladyzhenskaya
(1922-2004), Cathleen Synge Morawetz (1923-2017), Yvonne Choquet-Bruhat (b. 1923), Karen
Uhlenbeck (b. 1942, Abel prize 2019), . . . (In classical and harmonic analysis, Grace Chisholm
Young (1868-1944), Nina Bari (1901-1961) and Dorothy Maharam Stone (1917-2014) come to
mind.) But in linear functional analysis the first notable women probably are
• Mary Beth Ruskai (1944-2023), who worked on functional analytic questions of quantum
theory.
• Nicole Tomczak-Jaegermann (1945-2022), who worked on Banach space theory, e.g. [165].
• Dusa McDuff (b. 1945), who after a brilliant PhD thesis (1970) on operator algebras
switched to more geometric matters (symplectic topology and geometry).
• Other female operator algebraists: Marie Choda and Claire Anantharaman-Delaroche with
first publications in 1962 and 1967, respectively.

D Results stated, but not proven
• 1905, 1913: Levy-Steinitz theorem on reordering series in finite dimension.
• 1911: Carathéodory’s convexity theorem.
• 1929/1940: Orlicz-Pettis theorem.
• 1936: Kakutani/Birkhoff: TVS is metrizable ⇔ 0 has countable neighborhood base.
• 1940s: Gelfand, A. E. Taylor, Dunford, Lorch: holomorphic functional calculus.
• 1940/1947: Eberlein-Šmulyan theorem.
• 1951: R.C. James' space with V ≅ V ∗∗ but ιV (V ) ⊊ V ∗∗ .
• 1955: Grothendieck: approx prop. ⇔ finite rank ops. approx id on compact subsets
• 1957/1964: R.C. James: Banach sp. is reflexive ⇔ ∀ϕ ∈ V ∗ ∃ 0 ≠ x ∈ V : |ϕ(x)| = kxkkϕk.
• 1958: Bessaga/Pełczyński: Banach space V does not contain c0 ⇔ every WUC series in
V converges unconditionally.
• 1959: Lidskii’s theorem.
• 1960/1: Dvoretzky’s theorem.
• 1961/3: Bishop-Phelps theorems.
• 1965: Kuiper’s theorem.
• 1971: Kadets-Snobar theorem.
• 1971: Lindenstrauss-Tzafriri: Banach space with all closed subspaces complemented is
isomorphic to Hilbert space.
• 1972: Kwapień’s theorem
• 1973: Enflo: Banach space without approximation property.
• 1973: Pecherskii’s theorem.
• 1974: Rosenthal’s `1 theorem.
• 1975/1987: Enflo: Banach space operator without invariant subspace.
• 1977: Blair: Baire’s theorem ⇔ DCω .
• 1981: Szankowski: B(H) doesn’t have approximation property.
• 2011: Argyros/Haydon: solution of the scalar-plus-compact problem (Banach space V
with B(V ) = C1 + K(V )).
Note that apart from the last one and some new takes on old results like [7, 38, 53, 61] we
have hardly even mentioned any results from the last 40 years!

E What next?
For general orientation, the article [166] and Dieudonné’s book [40] are strongly recommended.
• General topological vector spaces, F-spaces, beginning with [141].
• Locally convex spaces and distributions, beginning with [141, 30, 94]
• Sobolev spaces. Applications of the latter and of distributions to PDEs, e.g. [52].
• Index theory of elliptic PDEs (Atiyah-Singer etc.)
• Much more on Banach spaces, beginning with [102], then [26, 98, 1, 97] etc.
• Connections between Banach spaces and classical/harmonic analysis, e.g. wavelets, Hardy
spaces, and with probability theory, e.g. martingales. E.g. [176, 77].
• More operator theory on Banach and Hilbert spaces, e.g. [59, 135, 24, 68, 129, 152].
• Semigroup theory, e.g. [4, 50]
• Banach algebras [131, 18, 80].
• Connections between Banach algebras and complex analysis.
• C ∗ - and von Neumann algebras: [110, 79] and many other books.
• Interactions of operator algebras and operator theory, beginning with [110, 33, 42].
• Algebraic topology of operator algebras, non-commutative geometry.
• Non-linear functional analysis, e.g. [28, 37, 39, 115, 164, 177].
• Variational calculus (with applications to differential equations), non-linear optimization.
• Applications of operator theory in quantum mechanics, e.g. [92].
• Applications of operator algebras in statistical physics and quantum field theory. E.g.
[22, 65].
• Non-archimedean/p-adic functional analysis, e.g. [138, 125].

F Approximate schedule for 14 lectures à 90 minutes
1. Sections 1-2.2: Introduction. Topological vector spaces, normed spaces.
2. Sections 2.3-3: Glimpse beyond normed spaces. More on normed spaces and bounded
maps
3. Section 4: The spaces `p (S, F) and c0 (S, F). Proofs of Hölder and Minkowski inequalities,
dual spaces. Most other proofs omitted or just sketched.
4. Sections 5.1-5.5: Hilbert spaces up to and incl. H ∗ .
5. Section 5.6-6.1: Bases and tensor products of Hilbert spaces. Quotients of Banach spaces.
6. Section 6.2: complemented subspaces. Section 7: Open mapping thm. incl. Baire, closed
graph theorem, boundedness below.
7. Section 8: Uniform boundedness theorem and applications.
8. Section 9: Hahn-Banach theorem and applications incl. reflexivity and transpose of oper-
ators.
9. Section 11: Hilbert space operators, beginning with adjoint. Self-adjoint, normal ops, etc.
10. Finish Hilbert space operators. Then Section 12 on compact operators.
11. Sections 13.1-13.2.2: spectra of operators, spectrum in a Banach algebra.
12. Sections 13.2.3 and 13.2.4: Beurling-Gelfand theorem and its applications.
13. Section 14: spectral theorems for compact operators (normal or not). Quick mention of
Fredholm operators. Section 15: Characters vs. maximal ideals. Power series functional
calculus.
14. Sections 16, 17: C ∗ -algebras, continuous functional calculus for normal operators. Spectral
theorems for normal operators (only Section 18.1).
− − − − − − − − − − − − − − − − − − − − − − − − −−
15. Section 10 on weak and weak-∗ topologies, first half of Section 12.2.
16. Second half of Section 12.2, Section 19.
Lectures 15 and 16 are not part of the course since the weeks 15-16 of the semester were
scrapped a few years ago. If there is interest, I can give these two lectures (incl. homework) in
January for 1 EC.

All papers appearing in the bibliography are cited somewhere, but not all books. Still, all
are worth looking at.

References
[1] F. Albiac, N. J. Kalton: Topics in Banach space theory. 2nd. ed. Springer, 2016.
[2] D. Amir: Characterizations of inner product spaces. Birkhäuser, 1986.
[3] N. R. Andre, S. M. Engdahl, A. E. Parker: An Analysis of the First Proofs of
the Heine-Borel Theorem. [Link]
an-analysis-of-the-first-proofs-of-the-heine-borel-theorem
[4] D. Applebaum: Semigroups of linear operators. Cambridge University Press, 2019.
[5] S. A. Argyros, R. G. Haydon: A hereditarily indecomposable L∞ -space that solves the
scalar-plus-compact problem. Acta Math. 206, 1-54 (2011).
[6] S. A. Argyros, R. G. Haydon: Bourgain-Delbaen L∞ -spaces, the scalar-plus-compact
property and related problems. Proc. Intern. Congr. Math., vol. 3 (2018).
[7] M. B. Asadi, A. Khosravi: An elementary proof of the characterization of isomorphisms
of standard operator algebras. Proc. Amer. Math. Soc. 134, 3255-3256 (2006).
[8] M. Bachir: On the Krein-Milman-Ky Fan theorem for convex compact metrizable sets.
Illinois Journ. Math. 62, 1-24 (2018).
[9] S. Banach: Théorie des opérations linéaires, 1932. Engl. transl.: Theory of linear opera-
tions. North-Holland, 1987.
[10] W. R. Bauer, R. H. Benner: The non-existence of a Banach space of countably infinite
Hamel dimension. Amer. Math. Monthly 78, 895-896 (1971).
[11] V. Baumann: W. A. J. Luxemburgs Beweis des Satzes von Hahn-Banach. Archiv Math.
18, 271-272 (1967).
[12] A. F. Beardon: Limits. Springer, 1997.
[13] J. L. Bell, D. H. Fremlin: A geometric form of the axiom of choice. Fund. Math. 77,
168-170 (1972/3).
[14] S. J. Bernau: The spectral theorem for normal operators. Journ. London Math. Soc. 40,
478-486 (1965).
[15] G. Birkhoff, E. Kreyszig: The establishment of functional analysis. Histor. Math. 11, 258-321 (1984).
[16] E. Bishop, R. R. Phelps: A proof that every Banach space is subreflexive. Bull. Amer.
Math. Soc. 67, 97-98 (1961).
[17] C. E. Blair: The Baire category theorem implies the principle of dependent choices. Bull.
Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 25, 933-934 (1977).
[18] F. F. Bonsall, J. Duncan: Complete normed algebras. Springer, 1973
[19] F. F. Bonsall, J. Duncan: Numerical ranges of operators on normed spaces and of elements
of normed algebras and Numerical ranges II. Cambridge University Press, 1971, 1973.
[20] R. Bouldin: The essential minimum modulus. Indiana Univ. Math. Journ. 30, 513-517
(1981).
[21] N. Bourbaki: Topological vector spaces. Chapters 1-5. Springer, 1987.

[22] O. Bratteli, D. W. Robinson: Operator algebras and quantum statistical mechanics. Two
volumes. 2nd. eds.. Springer, 1987, 1997.
[23] H. Brézis: Functional analysis, Sobolev spaces and partial differential equations. Springer,
2011.
[24] M. S. Brodskiı̌: Triangular and Jordan representations of linear operators. American
Mathematical Society Translations, 1971.
[25] T. Bühler, D. A. Salamon: Functional analysis. American Mathematical Society, 2018.
[26] N. L. Carothers: A short course on Banach space theory. Cambridge University Press,
2005.
[27] D. Choimet, H. Queffélec: Twelve landmarks of twentieth-century analysis. Cambridge
University Press, 2009.
[28] P. G. Ciarlet: Linear and nonlinear functional analysis with applications. Society for
Industrial and Applied Mathematics, 2013.
[29] D. L. Cohn: Measure theory. 2nd. ed. Springer, 2013.
[30] J. B. Conway: A course in functional analysis. 2nd. ed. Springer, 2007.
[31] J. B. Conway: The compact operators are not complemented in B(H). Proc. Amer. Math.
Soc. 32, 549-550.
[32] H. G. Dales: Banach algebras and automatic continuity. Oxford University Press, 2000.
[33] K. R. Davidson: C ∗ -Algebras by example. American Mathematical Society, 1996.
[34] A. M. Davie: The Banach approximation problem. J. Approx. Th. 13, 392-394 (1975).
[35] C. Davis: The Toeplitz-Hausdorff theorem explained. Canad. Math. Bull. 14, 245-246
(1971).
[36] M. M. Day: Reflexive Banach spaces not isomorphic to uniformly convex spaces. Bull.
Amer. Math. Soc. 47, 313-317 (1941).
[37] K. Deimling: Nonlinear functional analysis. Springer, 1985.
[38] S. Delpech: A short proof of Pitt’s compactness theorem. Proc. Amer. Math. Soc. 137,
1371-1372 (2008).
[39] Z. Denkowski, S. Migórski, N. S. Papageorgiou: An introduction to nonlinear analysis:
Theory. Springer, 2003.
[40] J. Dieudonné: History of functional analysis. North-Holland, 1981.
[41] J.-L. Dorier: A general outline of the genesis of vector space theory. Hist. Math. 22,
227-261 (1995).
[42] R. G. Douglas: Banach algebra techniques in operator theory. 2nd. ed. Springer, 1998.
[43] N. Dunford, J. T. Schwartz: Linear operators. I. General theory. Interscience Publishers,
1958, John Wiley & Sons, 1988.
[44] H.-D. Ebbinghaus et al.: Numbers. Springer, 1991.
[45] D. E. Edmunds, W. D. Evans: Spectral theory and differential operators. Oxford University
Press, 1987, 2018.
[46] N. Eldredge: Answer to the question ‘Is there a simple direct proof of the Open
Mapping Theorem from the Uniform Boundedness Theorem?’ on [Link].
[Link]
mapping-theorem-from-the-uniform-boun

[47] P. Enflo: A counterexample to the approximation problem in Banach spaces. Acta Math.
130, 309-317 (1973).
[48] P. Enflo: On the invariant subspace problem in Banach spaces. Sémin. d’anal. fonction-
nelle (Maurey-Schwartz). Exp. no 14 et 15, p. 1-6 (1975-1976) On the invariant subspace
problem for Banach spaces. Acta Math. 158, 213-313 (1987).
[49] P. Enflo: On the invariant subspace problem in Hilbert spaces. Preprint 2023.
[Link]
[50] K.-J. Engel, R. Nagel: One-parameter semigroups for linear evolution equations. Springer,
2000.
[51] R. Engelking: General topology. Heldermann Verlag, 1989.
[52] L. C. Evans: Partial differential equations. 2nd. ed. American Mathematical Society, 2010.
[53] A. Fellhauer: On the relation of three theorems of analysis to the axiom of choice. J. Logic
Analysis 9, 1-23 (2017).
[54] M. Foreman, F. Wehrung: The Hahn-Banach theorem implies the existence of a non-
Lebesgue measurable set. Fund. Math. 77, 13-19 (1991).
[55] S. Friedberg, A. Insel, L. Spence: Linear algebra. 4th. ed. Pearson, 2014.
[56] T. W. Gamelin, R. E. Greene: Introduction to topology. 2nd. ed. Dover Publications, 1999.
[57] D. J. H. Garling: A course in mathematical analysis. Vol. 1 & 2. Cambridge University
Press, 2013.
[58] B. W. Glickfeld: A metric characterization of C(X) and its generalization to C ∗ -algebras.
Illinois Journ. Math 10, 547-556 (1966).
[59] I. C. Gohberg, M. G. Krein: Introduction to the theory of linear nonselfadjoint operators
in Hilbert Space. American Mathematical Society Translations, 1969.
[60] F. Q. Gouvêa: p-adic numbers. An introduction. 3rd. ed. Springer, 2020.
[61] S. Grabiner: The Tietze extension theorem and the open mapping theorem. Amer. Math.
Monthly 93, 190-191 (1986).
[62] A. Grothendieck: La théorie de Fredholm. Bull. Soc. Math. France 84, 319-384 (1956).
[63] S. Gudder: Inner product spaces. Amer. Math. Monthly 81, 29-36 (1974), 82, 251-252
(1975), 82, 818 (1975).
[64] K. E. Gustafson, D. K. M. Rao: Numerical range. The field of values of linear operators
and matrices. Springer, 1997.
[65] R. Haag: Local quantum physics. 2nd. ed. Springer, 1996.
[66] P. R. Halmos: Introduction to Hilbert space and the theory of spectral multiplicity. 2nd.
ed. Chelsea, 1957.
[67] P. R. Halmos: What does the spectral theorem say? Amer. Math. Monthly 70, 241-247
(1963).
[68] P. R. Halmos: A Hilbert space problem book. 2nd. ed. Springer, 1982.
[69] I. Halperin: Sums of a series, permitting rearrangements. C. R. Math. Rep. Acad. Sci.
Can. VIII, 87-102 (1986).
[70] J. D. Halpern: The independence of the axiom of choice from the Boolean prime ideal
theorem. Fundam. Math. 55, 57-66 (1964).

[71] J. D. Halpern, A. Lévy: The Boolean prime ideal theorem does not imply the axiom of
choice. In: D. Scott (ed.) Axiomatic set theory. Amer. Math. Soc., 1971.
[72] H. Hanche-Olsen, H. Holden: The Kolmogorov-Riesz compactness theorem. Expos. Math.
28, 385-394 (2010).
[73] O. Hanner: On the uniform convexity of Lp and lp . Ark. Mat. 3, 239-244 (1955).
[74] C. Heil: A basis theory primer. Birkhäuser, 2011.
[75] N. Higson, J. Roe: Analytic K-homology. Oxford University Press, 2000.
[76] K. Hoffman: Banach spaces of analytic functions. Prentice-Hall, 1962.
[77] T. Hytönen, J. van Neerven, M. Veraar, L. Weis: Analysis in Banach spaces. Two Volumes.
Springer, 2016, 2017.
[78] V. M. Kadets, M. I. Kadets: Rearrangements of series in Banach spaces. American Math-
ematical Society, 1991.
[79] R. V. Kadison, J. R. Ringrose: Fundamentals of the theory of operator algebras. Two
volumes. Academic Press, 1983, 1986.
[80] E. Kaniuth: A course in commutative Banach algebras. Springer, 2009.
[81] S. Kaplan: The bidual of C(X) I. North-Holland, 1985.
[82] A. A. Karatsuba, S. M. Voronin: The Riemann zeta function. Walter de Gruyter, 1992.
[83] Y. Katznelson: An introduction to harmonic analysis. 3rd. ed. Cambridge University
Press, 2004.
[84] Y. & Y. R. Katznelson: A (terse) introduction to linear algebra. American Mathematical
Society, 2008.
[85] I. Kleiner: A history of abstract algebra. Birkhäuser, 2007.
[86] V. Komornik: Lectures on functional analysis and the Lebesgue integral. Springer, 2016.
[87] E. Kreyszig: Introductory functional analysis with applications. Wiley, 1978.
[88] M. Krukowski: Natural proof of the characterization of relatively compact families in
Lp -spaces on locally compact groups. Publ. Math. 67, 687-713 (2023).
[89] N. H. Kuiper: The homotopy type of the unitary group of Hilbert space. Topology 3,
19-30 (1965).
[90] H. E. Lacey, R. Whitley: Conditions under which all the bounded linear maps are compact.
Math. Ann. 158, 1-5 (1965).
[91] E. Landau: Differential and integral calculus. Chelsea, 1951.
[92] N. P. Landsman: Foundations of quantum theory. Springer, 2017. Freely available at
[Link]
[93] S. Lang: Undergraduate algebra. 2nd. ed. Springer, 1990.
[94] P. D. Lax: Functional analysis. Wiley, 2002.
[95] P. D. Lax: Linear algebra and its applications. 2nd. ed. Wiley, 2007.
[96] S. R. Lay: Convex sets and their applications. Wiley, 1982.
[97] D. Li, H. Queffélec: Introduction to Banach spaces: Analysis and probability. Two Vol-
umes. Cambridge University Press, 2018.
[98] J. Lindenstrauss, L. Tzafriri: Classical Banach spaces. Vol. 1: Sequence spaces, Vol. 2:
Function spaces. Springer, 1977, 1979.
[99] J. Łoś, C. Ryll-Nardzewski: On the application of Tychonoff’s theorem in mathematical
proofs. Fund. Math. 38, 233-237 (1951).
[100] W. A. J. Luxemburg: Two applications of the method of construction by ultrapowers to
analysis. Bull. Amer. Math. Soc. 68, 416-419 (1962).
[101] B. MacCluer: Elementary functional analysis. Springer, 2009.
[102] R. E. Megginson: An introduction to Banach space theory. Springer, 1998.
[103] R. Meise, D. Vogt: Introduction to functional analysis. Oxford University Press, 1997.
[104] A. J. Michaels: Hilden’s simple proof of Lomonosov’s invariant subspace theorem. Adv.
Math. 25, 56-58 (1977).
[105] D. F. Monna: Functional analysis in historical perspective. Oosthoek Publishing Company,
1973.
[106] G. H. Moore: The axiomatization of linear algebra: 1875-1940. Hist. Math. 22, 262-303
(1995).
[107] M. H. Mortad: On the absolute value of the product and the sum of operators. Rend.
Circ. Mat. Palermo, II. Ser. 68, 247-257 (2019).
[108] M. Müger: Topology for the working mathematician. (work in progress).
[Link]
[109] M. Müger: Some examples of Fourier series.
[Link]
[110] G. J. Murphy: C*-algebras and operator theory. Academic Press, 1990.
[111] G. Nagy: A functional analysis point of view on the Arzela-Ascoli theorem. Real Anal.
Exch. 32, 583-586 (2006/7).
D. C. Ullrich: The Ascoli-Arzelà Theorem via Tychonoff’s Theorem. Amer. Math.
Monthly 110, 939-940 (2003).
M. Wójtowicz: For eagles only: probably the most difficult proof of the Arzelà-Ascoli
theorem − via the Stone-Čech compactification. Quaest. Math. 40, 981-984 (2017).
[112] L. Narici, E. Beckenstein, G. Bachman: Functional analysis and valuation theory. Marcel
Dekker, Inc. 1971.
[113] J. van Neerven: Functional analysis. Cambridge University Press, 2022.
[114] D. J. Newman: A simple proof of Wiener’s 1/f theorem. Proc. Amer. Math. Soc. 48,
264-265 (1975).
[115] L. Nirenberg: Topics in nonlinear functional analysis. American Mathematical Society,
1974.
[116] B. de Pagter, A. C. M. van Rooij: An invitation to functional analysis. Epsilon Uitgaven,
2013.
[117] J. Pawlikowski: The Hahn-Banach theorem implies the Banach-Tarski paradox. Fund.
Math. 138, 21-22 (1991).
[118] G. Pedersen: Analysis now. Springer, 1989.
[119] R. R. Phelps: Lectures on Choquet’s theorem. 2nd. ed. Springer, 2001.
[120] J.-P. Pier: Mathematical analysis during the 20th century. Oxford University Press, 2001.
[121] A. Pietsch: Eigenvalues and s-numbers. Cambridge University Press, 1987.
[122] A. Pietsch: History of Banach spaces and linear operators. Birkhäuser, 2007.
[123] D. Pincus: The strength of the Hahn Banach theorem. In: A. Hurd, P. Loeb (eds.):
Victoria Symposium on Nonstandard Analysis. LNM 369, Springer, 1974.
[124] D. Pincus: Adding dependent choice to the prime ideal theorem. In: R. O. Gandy, J. M.
E. Hyland (eds.): Logic Colloquium 76. North-Holland, 1977.
[125] J. B. Prolla: Topics in functional analysis over valued division rings. North-Holland, 1982.
[126] D. Ramakrishnan, R. J. Valenza: Fourier analysis on number fields. Springer, 1999.
[127] C. J. Read: A solution to the Invariant Subspace Problem on the space ℓ1. Bull. London
Math. Soc. 17, 305-317 (1985).
[128] M. Reed, B. Simon: Methods of modern mathematical physics I: Functional analysis.
Academic Press, 1980.
[129] M. Reed, B. Simon: Methods of modern mathematical physics IV: Analysis of operators.
Academic Press, 1980.
[130] C. E. Rickart: An elementary proof of a fundamental theorem in the theory of Banach
algebras. Michigan Math. J. 5, 75-78 (1958).
[131] C. E. Rickart: General theory of Banach algebras. Robert E. Krieger Publishing Co. Inc.,
1960.
[132] F. Riesz, B. Sz.-Nagy: Functional analysis. Frederick Ungar Publ., 1955, Dover, 1990.
[133] J. R. Ringrose: A note on uniformly convex spaces. J. London Math. Soc. 34, 92 (1959).
[134] J. R. Ringrose: Super-diagonal forms for compact linear operators. Proc. London Math.
Soc. 12, 367-384 (1962).
[135] J. R. Ringrose: Compact non-self-adjoint operators. Van Nostrand Reinhold, 1971.
[136] A. M. Robert: A course in p-adic analysis. Springer, 2000.
[137] J. W. Roberts: A compact convex set with no extreme points. Studia Math. 60,
255-266 (1977).
[138] A. C. M. van Rooij: Non-Archimedean functional analysis. Marcel Dekker, Inc., 1978.
[139] W. Rudin: Principles of mathematical analysis. McGraw Hill, 1953, 1964, 1976.
[140] W. Rudin: Real and complex analysis. McGraw-Hill, 1966, 1974, 1986.
[141] W. Rudin: Functional analysis. 2nd. ed. McGraw-Hill, 1991.
[142] V. Runde: A taste of topology. Springer, 2005.
[143] V. Runde: A new and simple proof of Schauder’s theorem.
[Link]
[144] R. A. Ryan: Introduction to tensor products of Banach spaces. Springer, 2002.
[145] B. P. Rynne, M. A. Youngson: Linear functional analysis. 2nd. ed. Springer, 2008.
[146] D. A. Salamon: Measure and integration. European Mathematical Society, 2016.
[147] D. Sarason: The multiplication theorem for Fredholm operators. Amer. Math. Monthly
94, 68-70 (1987).
[148] K. Saxe: Beginning functional analysis. Springer, 2002.
[149] E. Schechter: Handbook of analysis and its foundations. Academic Press, 1997.
[150] C. Schmoeger: Remarks on commuting exponentials in Banach algebras. Proc. Amer.
Math. Soc. 127, 1337-1338 (1999).
[151] B. Simon: Trace ideals and their applications. 2nd. ed. American Mathematical Society,
2005.
[152] B. Simon: Operator theory. American Mathematical Society, 2015.
[153] I. Singer: Bases in Banach spaces I & II. Springer, 1970, 1981.
[154] A. Sokal: A really simple elementary proof of the Uniform Boundedness Theorem. Amer.
Math. Monthly 118, 450-452 (2011).
[155] P. Soltan: A primer on Hilbert space. Springer, 2018.
[156] L. A. Steen: Highlights in the history of spectral theory. Amer. Math. Monthly 80, 359-381
(1973).
[157] E. M. Stein, R. Shakarchi: Fourier analysis. Princeton University Press, 2005.
[158] J. Stillwell: Reverse mathematics. Proofs from the inside out. Princeton University Press,
2018.
[159] M. H. Stone: Linear transformations in Hilbert space and their applications to analysis.
American Mathematical Society, 1932.
[160] V. S. Sunder: Fuglede’s theorem. Indian J. Pure Appl. M. 46, 415-417 (2015).
[161] C. Swartz: Infinite matrices and the gliding hump. World Scientific, 1996.
[162] A. Szankowski: B(H) does not have the approximation property. Acta Math. 147, 89-108
(1981).
[163] T. Tao: Analysis I & II. 3rd. ed. Springer, 2016.
[164] G. Teschl: Topics in linear and nonlinear functional analysis. American Mathematical
Society, 2020.
[165] N. Tomczak-Jaegermann: Banach-Mazur distances and finite-dimensional operator ideals.
Longman Scientific & Technical, 1989.
[166] A. M. Vershik: The life and fate of functional analysis in the twentieth century. In: A.
A. Bolibruch, Yu. S. Osipov, Ya. G. Sinai (eds.): Mathematical events of the twentieth
century. Springer, 2006.
[167] S. Warner: Topological fields. North-Holland, 1989.
[168] J. Weidmann: Lineare Operatoren in Hilberträumen. 1: Grundlagen, 2: Anwendungen.
Teubner, 2000, 2003.
[169] A. Weil: L’intégration dans les groupes topologiques et ses applications. 2ème éd. Hermann,
1965.
[170] H. Weyl: Über beschränkte quadratische Formen, deren Differenz vollstetig ist. Rend.
Circ. Mat. Palermo 27, 373-392 (1909).
[171] H. Weyl: Über gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen
Entwicklungen willkürlicher Funktionen. Math. Ann. 68, 220-269 (1910).
[172] R. Whitley: Projecting m onto c0 . Amer. Math. Monthly 73, 285-286 (1966).
[173] R. Whitley: An elementary proof of the Eberlein-Šmulyan theorem. Math. Ann. 172,
116-118 (1967).
[174] R. Whitley: The spectral theorem for a normal operator. Amer. Math. Monthly 75, 856-
861 (1968).
[175] W. Więsław: On topological fields. Colloq. Math. 29, 119-146 (1974).
[176] P. Wojtaszczyk: Banach spaces for analysts. Cambridge University Press, 1991.
[177] E. Zeidler: Nonlinear functional analysis. Volumes 1, 2A, 2B, 3, 4, 5. Springer, 1984-1990.
[178] R. J. Zimmer: Essential results of functional analysis. University of Chicago Press, 1990.
[179] A. Zsák: On the solution of the scalar-plus-compact problem by Argyros and Haydon.
EMS Newsletter, December 2018, pp. 8-15.
[180] Question “Why do we care about Lp spaces besides p = 1, p = 2, p = ∞?” on MathOverflow,
with answers. [Link]