Functional Analysis Course Notes
Michael Müger
24.02.2024
Abstract
These are notes for my Bachelor course Inleiding in de Functionaalanalyse (14×90 min.).
They are also recommended as background for my Master courses on Operator Algebras.
Some familiarity with metric and topological spaces is assumed, and the last lecture (Section
18) will use some measure theory. Complex analysis is not used.
Contents
1 Introduction
Part I: Fundamentals
2 Setting the stage
2.1 Topological algebra: Topological groups, fields, vector spaces
2.2 Translation-invariant metrics. Normed and Banach spaces
2.3 A glimpse beyond normed spaces
2.3.1 Finite-dimensional TVS
2.3.2 Locally convex and Fréchet spaces
5 Basics of Hilbert spaces
5.1 Inner products. Cauchy-Schwarz inequality
5.2 The parallelogram and polarization identities
5.3 Basic Hilbert space geometry
5.4 Closed subspaces, orthogonal complement, and orthogonal projections
5.5 The dual space H* of a Hilbert space
5.6 Orthonormal sets and bases
5.7 Tensor products of Hilbert spaces
12 Compact operators
12.1 Compact Banach space operators
12.2 ⋆ Compactness vs. weak forms of weak-norm continuity
12.3 Compact Hilbert space operators
12.4 ⋆ Hilbert-Schmidt operators: L^2(H)
A Some more advanced topics from topology and measure theory
A.1 Unordered infinite sums
A.2 More on unconditional convergence of series
A.3 Nets
A.4 Reminder of the choice axioms and Zorn’s lemma
A.5 Baire’s theorem
A.6 On C(X, F)
A.6.1 Tietze’s extension theorem
A.6.2 Weierstrass’ theorem
A.6.3 The Stone-Weierstrass theorem
A.6.4 The Arzelà-Ascoli theorem
A.6.5 Separability of C(X, R)
A.6.6 ⋆ The Stone-Čech compactification
A.7 Some notions from measure and integration theory
B.12.1 The numerical range W(A) of a Hilbert space operator
B.12.2 Numerical range in Banach algebras
B.12.3 Positive functionals on and numerical range for C*-algebras
B.13 Some more basic theory of C*-algebras
B.13.1 The Fuglede-Putnam theorem
B.13.2 Homomorphisms of C*-algebras
B.14 Unbounded operators (mostly on Hilbert space)
B.14.1 Basic definitions. Closed and closable operators
B.14.2 Adjoints of unbounded Hilbert space operators
B.14.3 Basic criterion for (essential) self-adjointness
B.15 Glimpse of non-linear FA: Schauder’s fixed point theorem
References
1 Introduction
We will begin with a quick delineation of what we will discuss – and what not!
• “Classical analysis” is concerned with ‘analysis in finitely many dimensions’. ‘Functional
analysis’ is the generalization or extension of classical analysis to infinitely many dimen-
sions. Before one can try to make sense of this, one should make the first sentence more
precise. Since the creation of general topology, one can talk about convergence and con-
tinuity in very general terms. As far as I see it, this is not analysis, even if infinite sums
(=series) are studied. Analysis proper starts as soon as one talks about differentiation
and/or integration. Differentiation has to do with approximating functions locally by lin-
ear ones, and for this one needs the spaces considered to be vector spaces (at least locally).
This is the reason why most of classical analysis considers functions between the vector
spaces $\mathbb{R}^n$ and $\mathbb{C}^n$ (or subsets of them). (In a second step, one can then generalize to
spaces that look like $\mathbb{R}^n$ only locally by introducing topological and smooth manifolds and
their generalizations, but the underlying model of $\mathbb{R}^n$ remains important.) On the other
hand, integration, at least in the sense of the modern theory, can be studied much more
generally, i.e. on arbitrary sets equipped with a measure (defined on some σ-algebra). Such
a set can be very far from being a vector space or manifold, for example by being totally
disconnected.
• In view of the above, it is not surprising that functional analysis is concerned with (pos-
sibly) infinite-dimensional vector spaces and continuous maps between them. (Again, one
can then generalize to spaces that look like a vector space only locally, but this would
be considered infinite-dimensional geometry, not functional analysis.) In addition to the
vector space structure one needs a topology, which naturally leads to topological vector
spaces, which we will define soon.
• The importance of topologies is not specific to infinite dimensions. The point rather is that
$\mathbb{R}^n$, $\mathbb{C}^n$ have unique topologies making them Hausdorff topological vector spaces. This is
no longer true in infinite dimensions!
• Actually, ‘functional analysis’ most often studies only linear maps between topological
vector spaces so that this domain of study should be called ‘linear functional analysis’, but
this is done only rarely, e.g. [145]. Allowing non-linear maps leads to non-linear functional
analysis. This course will discuss only linear functional analysis. Thorough mastery of the
latter is needed anyway before one can think about non-linear FA or infinite-dimensional
geometry. For the simplest result of non-linear functional analysis, see Section B.15. For
more, you could have a look at, e.g., [164, 37, 115]. There even is a five volume treatise
[177]!
• The restriction to linear maps means that the notion of differentiation becomes pointless,
the derivative of $f(x) = Ax$ being just $A$ everywhere. But there are many non-trivial
connections between linear FA and integration (and measure) theory. For example, every
measure space $(X, \mathcal{A}, \mu)$ gives rise to a family of topological vector spaces $L^p(X, \mathcal{A}, \mu)$,
$p \in (0, \infty]$, and integration provides linear functionals. Proper appreciation of these matters
requires some knowledge of measure and integration theory, cf. e.g. [29, 146]. I will not
suppose that you have followed a course on this subject (but if you haven’t, I strongly
recommend that you do so on the next occasion or, at least, read the appendix in MacCluer’s book
[101]). Yet, one can get a reasonably good idea by focusing on sequence spaces, for which
no measure theory is required, see Section 4.
• One should probably consider linear functional analysis as an infinite-dimensional and
topological version of linear algebra rather than as a branch of analysis! This might lead
one to suspect linear FA to be slightly boring, but this would be wrong for many reasons:
– Functional analysis (linear or not) leads to very interesting (and arbitrarily challeng-
ing) technical questions (most of which reduce to very easy ones in finite dimensions).
– Linear FA is essential for non-linear FA, like variational calculus, and the theory of
differential equations – not only linear ones!
– Quantum theory [92] is a linear theory and cannot be done properly without functional
analysis, even though many physicists think it can! Conversely, many
developments in FA were directly motivated by quantum theory.
• The above could give the impression that functional analysis arose from the wish to
generalize analysis to infinitely many dimensions. This may have played a role for some
of its creators, but its beginnings (and much of what is being done now) were mostly
motivated by finite-dimensional “classical”¹ analysis: If $U \subseteq \mathbb{R}^n$, the set of functions
(possibly continuous, differentiable, etc.) from $U$ to $\mathbb{R}^m$ is a vector space as soon as we
put $(cf + dg)(x) = cf(x) + dg(x)$. Unless $U$ is a finite set, this vector space will be
infinite-dimensional. Now one can consider certain operations on such vector spaces, like
differentiation $C^\infty(U) \to C^\infty(U),\ f \mapsto f'$ or integration $f \mapsto \int_U f$. Considerations
of this sort provided the initial motivation for the development of functional analysis, and indeed
FA now is a very important tool for the study of ordinary and partial differential equations
on finite-dimensional spaces. See e.g. [23, 52]. The relevance of FA is even more obvious if
one studies differential equations in infinitely many dimensions. In fact, it is often useful to
study a partial differential equation (like heat or wave equation) by singling out one of the
variables (typically ‘time’) and studying the equation as an ordinary differential equation
in an infinite-dimensional space of functions. FA is also essential for variational calculus
(which in a sense is just a branch of differential calculus in infinitely many dimensions).
¹ Ultimately, it is quite futile to try and draw a neat line between “classical” and “modern” or functional analysis,
in particular since many problems in the former require methods from the latter for their proper treatment.
• In view of the above, FA studies abstract topological vector spaces as well as ‘concrete’
spaces, whose elements are functions. In order to obtain a proper understanding of FA,
one needs some familiarity with both aspects.
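The remark above, that spaces of functions are vector spaces on which differentiation and integration act as linear operations, can be tried out on a small scale. The following is only a minimal sketch, under the assumption that we restrict to polynomials with integer coefficients on the interval [0, 1], stored as coefficient lists; the helper names add, scale, derivative and integral are illustrative and not part of these notes.

```python
from fractions import Fraction

def add(f, g):
    """Pointwise sum of two polynomials given as coefficient lists [a0, a1, ...]."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def scale(c, f):
    """Scalar multiple c*f."""
    return [c * a for a in f]

def derivative(f):
    """Coefficients of f': the term a_k x^k becomes k a_k x^(k-1)."""
    return [k * a for k, a in enumerate(f)][1:] or [0]

def integral(f, lo, hi):
    """Exact integral of f over [lo, hi], computed from an antiderivative."""
    def antiderivative(x):
        return sum(Fraction(a, k + 1) * x ** (k + 1) for k, a in enumerate(f))
    return antiderivative(hi) - antiderivative(lo)

f = [1, 2, 3]   # 1 + 2x + 3x^2
g = [0, 1]      # x
c, d = 2, -3
h = add(scale(c, f), scale(d, g))   # the polynomial cf + dg

# Linearity of differentiation: (cf + dg)' = c f' + d g'
print(derivative(h) == add(scale(c, derivative(f)), scale(d, derivative(g))))
# Linearity of integration over [0, 1]
print(integral(h, 0, 1) == c * integral(f, 0, 1) + d * integral(g, 0, 1))
```

Polynomials of unbounded degree already form an infinite-dimensional vector space, which is why no finite matrix can represent the derivative on all of them at once.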
Before we delve into technicalities, some further general remarks:
• The history of functional analysis is quite interesting, cf. e.g. the article [15], [120, Chapter
4] and the books [40, 105, 122]. But clearly it makes little sense to study it before one has
some technical knowledge of FA. It is surprisingly intertwined with the development of
linear algebra. One would think that (finite-dimensional) vector spaces, linear maps etc.
were defined much earlier than, e.g., Banach spaces, but this is not what has happened.
In fact, Banach’s² book [9], based on his 1920 PhD thesis, is one of the first references
containing the modern definition of a vector space. Some mathematicians, mostly Italian
ones, like Peano, Pincherle and Volterra, essentially had the modern definition already in
the last decades of the 19th century, but their work had little impact since the usefulness
of an abstract/axiomatic approach was not yet widely appreciated. Cf. [85, Chapter 5] or
[41, 106].
Here I limit myself to mentioning that the basics of functional analysis (Hilbert and Banach
spaces and bounded linear maps between them) were developed in the period 1900-1930.
Nevertheless, many important developments (locally convex spaces, distributions, operator
algebras) took place in 1930-1960. After that, functional analysis has split up into many
quite specialized subfields that interact quite little with each other. The very interesting
article [166] ends with the conclusion that ‘functional analysis’ has ceased to exist as a
coherent field of study!
• The study of functional analysis requires a solid background in general topology. It may
well be that you’ll have to refresh and likely also extend yours. In Appendix A I have
collected brief accounts of the topics that – sadly – you are most likely not to have encoun-
tered before. All of them are treated in [56] or [142] (written by a functional analyst!), but
my favorite (I’m admittedly biased) reference is [108]. You should have seen Weierstrass’
theorem, but those of Tietze and Arzelà-Ascoli tend to vanish in the (pedagogical, not
factual) gap between general topology and functional analysis.
• If you find these notes too advanced, you might want to have a look at less ambitious
books like [116, 145, 148]. On the other hand, if you want more, [118] is a good place to
start, followed by [128, 94, 30, 141]. (The Dutch MasterMath course currently uses [30].)
One word about notation (without guarantee of always sticking to it): General vector spaces,
but also normed spaces, are denoted V, W, . . ., normed spaces also as E, F, . . .. Vectors in such
spaces are e, f, . . . , x, y, . . .. Linear maps are always denoted A, B, . . ., except linear functionals
V → F, which are ϕ, ψ. Algebras are usually denoted A, B, . . . and their elements a, b, . . .. (For
A = B(E) this leads to inconsistency, but I cannot bring myself to use capital letters for
abstract algebra elements.)
² Stefan Banach (1892-1945). Polish mathematician and pioneer of functional analysis. Also known for B. algebras,
B.’s contraction principle, the B.-Tarski paradox and the Hahn-B. and B.-Steinhaus theorems, etc.
Note for experts. Our treatment deviates from the beaten path at a number of points. E.g. we
• emphasize that absolute and unconditional convergence of series are not the same thing
in infinite-dimensional spaces, including a proof of the Dvoretzky-Rogers theorem. Very
strangely, most introductory books on functional analysis fail to point this out.
• following [61], simplify the lengthy ad hoc argument in the standard proof of the open
mapping theorem by using a lemma that also gives Tietze’s extension theorem.
• include a fairly recent (2017) proof [53] of the uniform boundedness theorem (weak ver-
sion) that only uses the axiom of countable choice, thus neither Baire’s theorem nor the
equivalent axiom of countable dependent choice (used by all previous ‘elementary’ proofs).
• and also show how Baire’s theorem gives a stronger version of uniform boundedness, which
is old but ignored by most authors.
• give a slick proof, inspired by [99], of the Hahn-Banach theorem using only Tychonov’s the-
orem for Hausdorff spaces (equivalent to the Ultrafilter Lemma) instead of Zorn’s lemma.
• follow [72] in proving the Arzelà-Ascoli theorem for complete metric spaces as target spaces
and the Kolmogorov-Riesz compactness theorem (but only for $\ell^p$).
• prove Pitt’s compactness theorem without using bases, following [38].
• introduce characters of a Banach algebra, albeit without the weak-∗ topology, at a relatively
early stage. Among other things, this allows for a relatively painless extension of
the continuous functional calculus for self-adjoint elements of a $C^*$-algebra to normal ones.
This approach seems to be new. (Doing this via the full-blown Gelfand isomorphism as
e.g. in [101] seems inappropriate in a first course.)
• follow Rickart [130] in proving the Beurling-Gelfand formula for the spectral radius in a
Banach algebra without using complex analysis (one reason being that the author cannot
assume that all his students have been exposed to complex analysis). Since we also do
the same on a few other occasions, like the Fuglede-Putnam theorem, the text does not
assume any knowledge of complex analysis (apart from the elementary fact that power
series converge on disks, which is not genuine complex analysis).
• give a purely elementary construction of the Riesz projectors associated with isolated
points of the spectrum (which we apply to compact non-normal operators and in the
discussion of discrete and essential spectra).
• similarly limit the use of measure theory to the absolute minimum until it becomes
unavoidable in discussing the spectral theorem for normal operators. We prove all standard
results on the Lebesgue spaces $L^p$ for discrete measure spaces, i.e. sets $S$ equipped with
the counting measure, not limiting ourselves to $S = \mathbb{N}$. We then indicate how most proofs
generalize to arbitrary measure spaces, while for the duality $(L^p)^* \cong L^q$ in the general
case we give a proof using uniform convexity.
• touch upon, in Appendix B, a number of somewhat more advanced topics, for which there
is no time in the author’s lecture course, but which are very closely related to the core
material. In particular, we go slightly further into Banach space theory and operator
theory than many introductory texts.
Acknowledgment. I thank Bram Balkema, Victor Hissink Muller and Niels Vooijs for corrections,
but in particular Tim Peters for a huge number of them.
Part I: Fundamentals
2 Setting the stage
2.1 Topological algebra: Topological groups, fields, vector spaces
As said in the Introduction, functional analysis (even most of the non-linear theory) is concerned
with vector spaces, allowing infinite-dimensional ones. Large parts of linear algebra of course
work equally well for finite and infinite-dimensional spaces. One aspect where problems arise in
infinite dimensions is the description of linear maps by matrices, for example since multiplication
of infinite matrices involves infinite summations, which require the introduction of topologies.
(Actually, in some restricted contexts infinite matrices still are quite useful.)
We begin with the following
2.1 Definition A topological group is a group $(G, \cdot, 1)$ equipped with a topology $\tau$ such that
the group operations $G \times G \to G,\ (g, h) \mapsto gh$ and $G \to G,\ g \mapsto g^{-1}$ are continuous (where $G \times G$
is given the product topology). (For abelian groups, one often denotes the binary operation by
$+$ instead of $\cdot$.)
2.2 Example 1. If $(G, \cdot, 1)$ is any group then it becomes a topological group by putting
$\tau = \tau_{\mathrm{disc}}$, the discrete topology on $G$. (This is true since every function from a discrete space to a
topological space is continuous.)
2. The group $(\mathbb{R}, +, 0)$, where $\mathbb{R}$ is equipped with its standard topology, is easily seen to be
a topological group.
3. If $n \in \mathbb{N}$³ and $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ then the set $GL(n, \mathbb{F}) = \{A \in M_{n \times n}(\mathbb{F}) \mid \det(A) \neq 0\}$ of
invertible $n \times n$ matrices is a group w.r.t. matrix product and inversion and in fact a topological
group when equipped with the subspace topology induced from $M_{n \times n}(\mathbb{F}) \cong \mathbb{F}^{n^2}$.
2.3 Definition A topological field is a field $(\mathbb{F}, +, 0, \cdot, 1)$ equipped with a topology on $\mathbb{F}$ such
that $(\mathbb{F}, +, 0)$ and $(\mathbb{F} \setminus \{0\}, \cdot, 1)$ are topological groups. (Equivalently, all field operations are
continuous.) Usually we just write $\mathbb{F}$ and denote the topology by $\tau_{\mathbb{F}}$.
Again, if $\mathbb{F}$ is any field then $(\mathbb{F}, \tau_{\mathrm{disc}})$ is a topological field.
2.4 Exercise Prove that $\mathbb{R}$ and $\mathbb{C}$ are topological fields when equipped with their standard
topologies induced by the metric $d(c, c') = |c - c'|$.
2.5 Definition Let $\mathbb{F}$ be a topological field. Then a topological vector space (TVS) over $\mathbb{F}$
is an $\mathbb{F}$-vector space $V$ equipped with a topology $\tau_V$ (to be distinguished, obviously, from the
topology $\tau_{\mathbb{F}}$ on $\mathbb{F}$) such that the maps $V \times V \to V,\ (x, y) \mapsto x + y$ and $\mathbb{F} \times V \to V,\ (c, x) \mapsto cx$
are continuous.
(These conditions imply that $V \to V,\ x \mapsto -x$ is continuous, so that $(V, +, 0)$ is a topological
group, but not conversely.)
(iii) If $S$ is any set, let $V = \{f : S \to \mathbb{F}\} = \mathbb{F}^S$. With pointwise addition and scalar
multiplication, $V$ is an $\mathbb{F}$-vector space. Let $\tau_{\mathbb{F}^S}$ be the product topology on $\mathbb{F}^S$. Prove that $(V, \tau_{\mathbb{F}^S})$ is
a TVS over $\mathbb{F}$.
In this course, the only topological fields considered are R and C. When a result holds for
either of the two, we will write F. (In part I of these notes it will hardly ever matter whether
F = R or F = C, but much of part II will require F = C.) Note however that one can consider
topological vector spaces over other topological fields, like the p-adic ones $\mathbb{Q}_p$ [60]. (This said,
the resulting p-adic functional analysis is quite different in some respects from the ‘usual’ one,
cf. the comments in Section B.1 and the literature, e.g. [138, 125].) Now:
2.7 Definition Functional analysis (ordinary, as opposed to p-adic) is concerned with topo-
logical vector spaces over R or C and continuous maps between them. Linear functional analysis
considers only linear maps.
As it turns out, the notion of topological vector spaces is a bit too general to base a satis-
factory and useful theory upon it. We’ll prove only one result (Proposition 2.29) in this setting.
Just as in topology it is often (but by no means always!) sufficient to work with metric spaces,
for most purposes it is usually sufficient to consider certain subclasses of topological vector
spaces. The following diagram illustrates some of these classes and their relationships:
(Note that F-spaces, Fréchet⁴, Banach and Hilbert⁵ spaces are assumed complete but one also
has the non-complete versions. There is no special name for Fréchet spaces with completeness
dropped other than metrizable locally convex spaces. In the other cases, one speaks of metrized,
normed and pre-Hilbert spaces.)
The most useful of these classes are those in the bottom row. In fact, locally convex (vector)
spaces are general enough for almost all applications. They are thoroughly discussed in the
MasterMath course on functional analysis, while we will only briefly touch upon them. Most
of the time, we will be discussing Banach and Hilbert spaces. There is much to be said for
studying them in some depth before turning to locally convex (or more general) spaces.
(Some books on functional analysis, like [141], begin with general topological vector spaces and
then turn to some special classes, but for a first encounter this does not seem appropriate. This
said, the author doesn’t see the point of beginning by proving many results on Hilbert spaces
that literally generalize to Banach spaces.)
a subbase, for $\tau$.) A topology $\tau$ on $X$ is called metrizable if there exists a metric $d$ on $X$
(not necessarily unique) such that $\tau = \tau_d$. Metrizable topologies automatically have many nice
properties, e.g. normality and, a fortiori, the Hausdorff property. I also assume as familiar
the notion of completeness of a metric space and the fact that every metric space can be
completed, i.e. embedded isometrically into a complete metric space (unique up to isometry) as
a dense subspace.
On a vector space, it is natural and common to consider only metrics of a special type:
(ii) A topological $\mathbb{F}$-vector space $(V, \tau)$ is called (completely) metrizable if there exists a
translation-invariant (and complete) metric $d$ on $V$ such that $\tau = \tau_d$. (Completely metrizable
TVS are also called F-spaces.)
2.9 Lemma Let $V$ be a vector space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $d$ a translation-invariant metric on
$V$. Then addition $V \times V \to V,\ (x, y) \mapsto x + y$ and inversion $V \to V,\ x \mapsto -x$ are continuous,
thus $(V, \tau_d)$ is a topological abelian group.
Proof. If $x, x', y, y' \in V$ we have
$$d(x + y, x' + y') \le d(x + y, x' + y) + d(x' + y, x' + y') = d(x, x') + d(y, y'),$$
where we used translation invariance twice and the triangle inequality once. This implies that
the operation of addition $+ : V \times V \to V$ is jointly continuous. Continuity of the inverse
operation $x \mapsto -x$ follows from $d(-x, -y) = d(0, x - y) = d(x - y, 0) = d(x, y)$.
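The two estimates used in this proof can be checked numerically. The following is a minimal sketch, not a proof: it assumes the particular metric $d(x, y) = \sum_i |x_i - y_i| / (1 + |x_i - y_i|)$ on $\mathbb{R}^3$, which is translation-invariant but does not come from a norm, and the function names are illustrative only.

```python
import random

def d(x, y):
    """A translation-invariant metric on R^n that is not induced by a norm."""
    return sum(abs(a - b) / (1 + abs(a - b)) for a, b in zip(x, y))

def vadd(x, y):
    return [a + b for a, b in zip(x, y)]

def neg(x):
    return [-a for a in x]

def check_lemma(trials=1000, dim=3, seed=0, tol=1e-12):
    """Test d(x+y, x'+y') <= d(x, x') + d(y, y') and d(-x, -y) = d(x, y) on random samples."""
    rng = random.Random(seed)

    def rand_vec():
        return [rng.uniform(-10, 10) for _ in range(dim)]

    for _ in range(trials):
        x, x2, y, y2 = rand_vec(), rand_vec(), rand_vec(), rand_vec()
        # Estimate behind joint continuity of addition (tol absorbs rounding errors)
        if d(vadd(x, y), vadd(x2, y2)) > d(x, x2) + d(y, y2) + tol:
            return False
        # Inversion is an isometry
        if abs(d(neg(x), neg(y)) - d(x, y)) > tol:
            return False
    return True

print(check_lemma())
```

The first inequality shows that addition is Lipschitz with respect to the sum metric on $V \times V$, which is more than the continuity claimed in the lemma.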
(ii) If $d_1, d_2$ are translation-invariant metrics on $V$ such that $\tau_{d_1} = \tau_{d_2}$, show that $d_1$ is complete
if and only if $d_2$ is complete.
2.12 Remark 1. The analogue of (ii) for topological spaces is false: There can be equivalent
metrics of which only one is complete!
2. ⋆ The equivalent characterization of Cauchy sequences in (i) makes sense in arbitrary
TVS $V$: A net $\{x_\iota\}_{\iota \in I}$ in $V$ is called a Cauchy net if for every open neighborhood $U$ of $0$ there
is an $\iota_0 \in I$ such that $\iota, \iota' \ge \iota_0$ implies $x_\iota - x_{\iota'} \in U$. Now $V$ is called complete if every Cauchy
net converges. □
2.13 Definition Let $V$ be a vector space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$. A seminorm on $V$ is a map
$V \to [0, \infty),\ x \mapsto \|x\|$ (thus $\|x\| = \infty$ is not allowed!) such that
2.15 Lemma Let $(V, \|\cdot\|)$ be a normed $\mathbb{F}$-vector space, and define $d(x, y) = \|x - y\|$. Then
(i) $d$ is a translation-invariant metric on $V$.
(ii) $(V, \tau_d)$ is a Hausdorff topological vector space.
(iii) The map $V \to [0, \infty),\ x \mapsto \|x\|$ is $\tau_d$-continuous.
Proof. (i) That norms give rise to metrics is probably known: $d(x, y) \ge 0$ follows from $\|x\| \ge 0$,
and $d(x, y) = 0 \Leftrightarrow x = y$ follows from the norm axiom $\|x\| = 0 \Leftrightarrow x = 0$. Furthermore,
$$d(y, x) = \|y - x\| = \|-(x - y)\| = \|x - y\| = d(x, y),$$
where we used $\|-x\| = \|x\|$, a special case of the second seminorm axiom. Finally,
2.16 Definition • A norm $\|\cdot\|$ on a vector space $V$ is called complete if the metric
$d(x, y) = \|x - y\|$ is complete.
• A topological vector space $(V, \tau)$ is called (completely) normable if there exists a (complete)
norm $\|\cdot\|$ on $V$ such that $\tau = \tau_d$ with $d(x, y) = \|x - y\|$.
• A complete normed space is called a Banach space.
• A normed space $(V, \|\cdot\|)$ is called separable if the associated norm topology $\tau$ is separable.
(I.e. $V$ has a countable $\tau$-dense subset.)
2.18 Example 0. Clearly $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ is a vector space over itself and $\|c\| := |c|$ defines a norm,
making $\mathbb{F}$ a complete normed $\mathbb{F}$-vector space.
1. Let $X$ be a compact topological space and $V = C(X, \mathbb{F})$. Clearly, $V$ is an $\mathbb{F}$-vector space.
Now $\|f\| = \sup_{x \in X} |f(x)|$ is a norm on $V$. You probably know that the normed space $(V, \|\cdot\|)$
is complete. (See Lemma A.30 for a proof.) One can prove that it is separable if and only if $X$
is second countable, see Proposition A.48.
If $X$ is non-compact then $\|f\|$ can be infinite, but replacing $C(X, \mathbb{F})$ by
(Note that all these $\|\cdot\|_p$, including $p = \infty$, coincide if $n = 1$.) It is quite obvious that for
each $p \in [1, \infty]$ we have $\|x\|_p = 0 \Leftrightarrow x = 0$ and $\|cx\|_p = |c| \, \|x\|_p$. For $p = 1$ and $p = \infty$ also
the subadditivity is trivial to check using only $|c + d| \le |c| + |d|$. Subadditivity also holds for
$1 < p < \infty$, but is harder to prove. You have probably seen the proof for $p = 2$, which relies on
the Cauchy-Schwarz inequality. The proof for $1 < p < 2$ and $2 < p < \infty$ is similar, using the
inequality of Hölder instead.
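To get a feeling for these norms, one can compute them and test subadditivity on random vectors. The following is a minimal numerical sketch (of course no substitute for the proof via Hölder's inequality); the function names p_norm and subadditive_on_samples are illustrative and do not appear in these notes. The last line shows that for $p = 1/2$ the same formula violates the triangle inequality, which is one way to see why only $p \ge 1$ is considered here.

```python
import random

INF = float("inf")

def p_norm(x, p):
    """The p-norm of a vector x in F^n (real or complex entries), for 0 < p <= infinity."""
    if p == INF:
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def subadditive_on_samples(ps=(1, 1.5, 2, 3, INF), trials=500, dim=4, seed=0, tol=1e-9):
    """Test ||x + y||_p <= ||x||_p + ||y||_p on random complex vectors."""
    rng = random.Random(seed)

    def rand_vec():
        return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(dim)]

    for _ in range(trials):
        x, y = rand_vec(), rand_vec()
        s = [a + b for a, b in zip(x, y)]
        for p in ps:
            if p_norm(s, p) > p_norm(x, p) + p_norm(y, p) + tol:
                return False
    return True

print(p_norm([3, -4], 1), p_norm([3, -4], 2), p_norm([3, -4], INF))
print(subadditive_on_samples())
# For p = 1/2 subadditivity fails: x = (1, 0), y = (0, 1) have "norm" 1 each, but x + y has 4.
print(p_norm([1, 1], 0.5), p_norm([1, 0], 0.5) + p_norm([0, 1], 0.5))
```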
2.19 Exercise Prove that the norms $\|\cdot\|_p$ on $\mathbb{F}^n$ are complete for all $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$, $n \in \mathbb{N}$,
$p \in [1, \infty]$.
3. The above examples are easily generalized to infinite dimensions. Let $S$ be any set. For
a function $f : S \to \mathbb{F}$ and $1 \le p < \infty$ define
$$\|f\|_\infty = \sup_{s \in S} |f(s)|, \qquad \|f\|_p = \Big( \sum_{s \in S} |f(s)|^p \Big)^{1/p}$$
with the understanding that $(+\infty)^{1/p} = +\infty$. For the definition of infinite sums like $\sum_{s \in S} f(s)$
see Appendix A.1. Now let
Now one can prove that $\|\cdot\|_p$ is a complete norm on $\ell^p(S, \mathbb{F})$ for each $p \in [1, \infty]$. We
will do this in Section 4.
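4. Let $(X, \mathcal{A}, \mu)$ be a measure space, $f : X \to \mathbb{F}$ measurable and $1 \le p < \infty$. We define
$$\|f\|_p = \Big( \int_X |f|^p \, d\mu \Big)^{1/p},$$
$$\|f\|_\infty = \inf\{M > 0 \mid \mu(\{x \in X \mid |f(x)| > M\}) = 0\}$$
and
$$L^p(X, \mathcal{A}, \mu; \mathbb{F}) = \{f : X \to \mathbb{F} \text{ measurable} \mid \|f\|_p < \infty\}.$$
Then $\|f\|_p = (\int_X |f|^p \, d\mu)^{1/p}$ is a seminorm on $L^p(X, \mathcal{A}, \mu; \mathbb{F})$ for all $1 \le p < \infty$.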
2.20 Definition Let $V$ be an $\mathbb{F}$-vector space. Two norms $\|\cdot\|_1, \|\cdot\|_2$ on $V$ are called equivalent
if $\tau_{d_1} = \tau_{d_2}$, where $d_i(x, y) = \|x - y\|_i$.
This definition is a special case of the notion of equivalence of two metrics $d_1, d_2$ on a set,
also defined by $\tau_{d_1} = \tau_{d_2}$. In that general situation one can prove criteria for equivalence, cf.
e.g. [108], but for normed spaces one has a much simpler one:
2.21 Proposition Two norms $\|\cdot\|_1, \|\cdot\|_2$ on an $\mathbb{F}$-vector space $V$ are equivalent if and only if
there are $0 < c' \le c$ such that $c' \|x\|_1 \le \|x\|_2 \le c \|x\|_1$ for all $x \in V$.
Proof. Since the norm topologies $\tau_i$ are defined in terms of the translation invariant metrics
$d_i$, they coincide if and only if every $d_1$-open ball around zero contains a $d_2$-open ball
around zero and vice versa. By the absolute homogeneity of the norms, this is equivalent to
the existence of $s, s' > 0$ such that $B^{\|\cdot\|_1}(0, s) \subseteq B^{\|\cdot\|_2}(0, 1)$ and $B^{\|\cdot\|_2}(0, s') \subseteq B^{\|\cdot\|_1}(0, 1)$, which
means that
$$\|x\|_1 < s \Rightarrow \|x\|_2 < 1 \quad \text{and} \quad \|x\|_2 < s' \Rightarrow \|x\|_1 < 1. \tag{2.2}$$
This clearly is implied by the statement $c' \|x\|_1 \le \|x\|_2 \le c \|x\|_1$ with $c' > 0$. On the other hand,
by continuity of the norms (2.2) implies $\|x\|_1 \le s \Rightarrow \|x\|_2 \le 1$ and $\|x\|_2 \le s' \Rightarrow \|x\|_1 \le 1$
which, using homogeneity again, gives $\|x\|_2 \le s^{-1} \|x\|_1$ and $\|x\|_1 \le s'^{-1} \|x\|_2$, i.e. our condition
(with $c' = s'$, $c = s^{-1}$).
2.22 Example Let $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$, $p \in [1, \infty)$, $n \in \mathbb{N}$. Then for $x \in \mathbb{F}^n$ we have
$$\|x\|_\infty^p = \max_i |x_i|^p \le \|x\|_p^p = \sum_{i=1}^n |x_i|^p \le n \|x\|_\infty^p.$$
Thus $\|x\|_\infty \le \|x\|_p \le n^{1/p} \|x\|_\infty$, so that $\|\cdot\|_\infty$ is equivalent to $\|\cdot\|_p$ for all $p < \infty$ if $n$ is finite.
This clearly implies that all $\|\cdot\|_p$, $p \in [1, \infty]$ are mutually equivalent. (You probably know that
all norms on $\mathbb{F}^n$ are equivalent, not only those of the form $\|\cdot\|_p$. We will prove the even stronger
result that there is only one Hausdorff topology on $\mathbb{F}^n$ making it a TVS.)
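The two-sided estimate $\|x\|_\infty \le \|x\|_p \le n^{1/p} \|x\|_\infty$ is easy to test numerically, and one can see that neither constant can be improved: a unit vector gives equality on the left, the vector $(1, \ldots, 1)$ gives equality on the right. The following is a minimal sketch; the function name sandwich is illustrative only.

```python
def p_norm(x, p):
    """The p-norm on F^n for 1 <= p < infinity."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def sup_norm(x):
    return max(abs(c) for c in x)

def sandwich(x, p):
    """Return the triple (||x||_inf, ||x||_p, n^(1/p) ||x||_inf)."""
    n = len(x)
    return sup_norm(x), p_norm(x, p), n ** (1.0 / p) * sup_norm(x)

for p in (1, 2, 3.5):
    lower, middle, upper = sandwich([1, -2, 3 + 4j, 0.5], p)
    print(p, lower <= middle <= upper)

# Equality on the left for a unit vector, on the right for the constant vector.
print(sandwich([0, 1, 0, 0], 2))
print(sandwich([1, 1, 1, 1], 2))
```

Since the upper constant $n^{1/p}$ grows with $n$, this already hints that the analogous norms on infinite-dimensional sequence spaces need not be equivalent.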
Later (Section 7.1) we will also prove the following deeper and quite surprising result:
2.23 Theorem (Two norm theorem) If $V$ is a vector space that is complete w.r.t. each of
the norms $\|\cdot\|_1, \|\cdot\|_2$ and $\|\cdot\|_2 \le c \|\cdot\|_1$ for some $c > 0$ then also $\|\cdot\|_1 \le c' \|\cdot\|_2$ for some $c' > 0$,
thus the two norms are equivalent.
2.24 Example Let $S$ be an infinite set, $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $V = \ell^1(S, \mathbb{F})$. Then $\|f\|_\infty \le \|f\|_1$
for all $f \in V$, but the norms are not equivalent, for example since $(V, \|\cdot\|_1)$ is complete while
$(V, \|\cdot\|_\infty)$ is not. This also is the reason why there is no contradiction with the above theorem.
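Both claims of this example can be made concrete for $S = \mathbb{N}$. The following is a minimal sketch, assuming finitely supported sequences stored as Python lists (padded with zeros); the names indicator and g are illustrative. The indicator functions $f_n$ of $\{1, \ldots, n\}$ satisfy $\|f_n\|_\infty = 1$ and $\|f_n\|_1 = n$, so no bound $\|f\|_1 \le c \|f\|_\infty$ can hold. The truncations $g_n = (1, 1/2, \ldots, 1/n, 0, \ldots)$ form a $\|\cdot\|_\infty$-Cauchy sequence in $V$ whose only possible limit, $k \mapsto 1/k$, is not summable and thus not in $V$.

```python
from fractions import Fraction

def norm_1(f):
    return sum(abs(c) for c in f)

def norm_sup(f):
    return max((abs(c) for c in f), default=0)

def diff(f, h):
    """Difference of two finitely supported sequences, padding the shorter one with zeros."""
    n = max(len(f), len(h))
    f = f + [0] * (n - len(f))
    h = h + [0] * (n - len(h))
    return [a - b for a, b in zip(f, h)]

def indicator(n):
    """Indicator function of {1, ..., n}."""
    return [1] * n

def g(n):
    """The truncated harmonic sequence (1, 1/2, ..., 1/n, 0, 0, ...)."""
    return [Fraction(1, k) for k in range(1, n + 1)]

# The ratio ||f||_1 / ||f||_sup is unbounded on V.
print([norm_1(indicator(n)) / norm_sup(indicator(n)) for n in (1, 10, 100)])

# (g_n) is Cauchy for the sup norm: ||g_m - g_n||_sup = 1/(n+1) for m > n ...
print(norm_sup(diff(g(50), g(10))))
# ... but the 1-norms are the harmonic numbers, which grow without bound.
print([float(norm_1(g(n))) for n in (10, 100, 1000)])
```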
2.25 Exercise Prove: If $V$ is a vector space and $\|\cdot\|_1, \|\cdot\|_2$ are equivalent norms on $V$ then
completeness of $(V, \|\cdot\|_1)$ is equivalent to completeness of $(V, \|\cdot\|_2)$.
2.26 Exercise Let $(V, \|\cdot\|)$ be a normed space. Put $d(x, y) = \|x - y\|$ and let $(\hat{V}, \hat{d})$ be the
completion of the metric space $(V, d)$. Prove that $\hat{V}$ is a Banach space (in particular a vector
space!) and give its norm.
(i) Prove that $\|(x_1, x_2)\|_s = \|x_1\|_1 + \|x_2\|_2$ and $\|(x_1, x_2)\|_m = \max(\|x_1\|_1, \|x_2\|_2)$ are equivalent
norms on $V_1 \oplus V_2$.
(ii) Prove that $(V_1 \oplus V_2, \|\cdot\|_{s/m})$ is complete if and only if $(V_1, \|\cdot\|_1)$, $(V_2, \|\cdot\|_2)$ both are
complete.
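As a sanity check for (i), and not as a solution, one can test the candidate estimate $\|\cdot\|_m \le \|\cdot\|_s \le 2 \|\cdot\|_m$ numerically. The following minimal sketch assumes $V_1 = \mathbb{R}^2$ with the euclidean norm and $V_2 = \mathbb{R}$ with the absolute value; this choice of spaces and all function names are illustrative only.

```python
import math
import random

def norm_V1(x):
    """Euclidean norm on V1 = R^2 (an arbitrary choice for this sketch)."""
    return math.hypot(x[0], x[1])

def norm_V2(t):
    """Absolute value on V2 = R."""
    return abs(t)

def norm_s(x1, x2):
    return norm_V1(x1) + norm_V2(x2)

def norm_m(x1, x2):
    return max(norm_V1(x1), norm_V2(x2))

def estimate_holds(trials=1000, seed=0, tol=1e-12):
    """Test ||.||_m <= ||.||_s <= 2 ||.||_m on random elements of V1 (+) V2."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1 = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        x2 = rng.uniform(-5, 5)
        s, m = norm_s(x1, x2), norm_m(x1, x2)
        if not (m <= s + tol and s <= 2 * m + tol):
            return False
    return True

print(norm_s((3, 4), -2), norm_m((3, 4), -2))
print(estimate_holds())
```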
2.28 Exercise (i) Let $\{(V_i, \|\cdot\|_i)\}_{i \in I}$ be a family of normed spaces, where $I$ is any set. Put
$$\bigoplus_{i \in I} V_i = \Big\{ \{x_i\}_{i \in I} \ \Big|\ \sum_{i \in I} \|x_i\|_i < \infty \Big\}.$$
(Technically, this is a subset of $\prod_i V_i = \{f : I \to \bigcup_j V_j \mid f(i) \in V_i \ \forall i \in I\}$, identifying
$\{x_i\}_{i \in I}$ with $f : i \mapsto x_i$.) Prove that this is a linear space and $\|f\| = \sum_i \|f(i)\|_i$ a norm on it.
(ii) Prove that $(\bigoplus_{i \in I} V_i, \|\cdot\|)$ is complete if all the $V_i$ are complete. Hint: The proof is an
adaptation of the one for $\ell^1(S)$ given in Section 4.3.
If $V_i = \mathbb{F}$ for all $i \in I$ with the usual norm, we have $\bigoplus_{i \in I} V_i \cong \ell^1(I, \mathbb{F})$.
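continuous. Thus if we prove that every such $\tau' \subseteq \tau_{\mathbb{F}^n}$ coincides with $\tau_{\mathbb{F}^n}$, the uniqueness of $\tau$
follows.
Let $S = \{x \in \mathbb{F}^n \mid \|x\|_2 = 1\}$ be the euclidean unit sphere, which is $\tau_{\mathbb{F}^n}$-compact. Now
$\tau' \subseteq \tau_{\mathbb{F}^n}$ implies that $S$ is also $\tau'$-compact⁹, and therefore $\tau'$-closed (since $\tau'$ is Hausdorff). Thus
$\mathbb{F}^n \setminus S \in \tau'$. Since the scalar action $\mathbb{F} \times \mathbb{F}^n \to \mathbb{F}^n$ is continuous (w.r.t. the metric topology $\tau_{\mathbb{F}}$ on $\mathbb{F}$
and $\tau'$ on $\mathbb{F}^n$), the pre-image $V = \cdot^{-1}(\mathbb{F}^n \setminus S) \subseteq \mathbb{F} \times \mathbb{F}^n$, which contains $(0, 0)$, is $\tau_{\mathbb{F}} \times \tau'$-open. By
definition of the product topology, there are $\varepsilon > 0$ and $0_{\mathbb{F}^n} \in U' \in \tau'$ such that $B(0_{\mathbb{F}}, \varepsilon) \times U' \subseteq V$.
In other words, $c \in \mathbb{F}$, $|c| < \varepsilon$ and $x \in U'$ imply $cx \notin S$, which is equivalent to $\|cx\|_2 \neq 1$ and,
for $c \neq 0$, to $\|x\|_2 \neq 1/|c|$. Since $1/|c|$ may assume any value larger than $1/\varepsilon$, we find that $x \in U'$ implies
$\|x\|_2 \le 1/\varepsilon$. Replacing $x$ by $d^{-1} x$, where $d > 0$, we find $x \in dU' \Rightarrow \|x\|_2 \le d/\varepsilon$, so that the
map $\mathrm{id} : (\mathbb{F}^n, \tau') \to (\mathbb{F}^n, \tau_{\mathbb{F}^n})$ is continuous at zero, thus everywhere by linearity. This means
$\tau_{\mathbb{F}^n} \subseteq \tau'$, completing the proof of $\tau' = \tau_{\mathbb{F}^n}$.
⁹ An open cover by elements of $\tau'$ is an open cover by elements of $\tau_{\mathbb{F}^n}$, thus has a finite subcover.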
2.30 Remark 1. The choice of the basis E in the above proof does not matter. Why?
2. The example of the indiscrete topology, which turns every vector space into a TVS, shows
that there is no uniqueness if one omits the Hausdorff assumption.
3. The above proof is quite typical for proofs in topological algebra. Luckily, as soon as
we have (semi)norms at our disposal, proofs tend to be less point set topological. For example,
explicit invocations of the product topology are rare. □
2.31 Corollary Every finite-dimensional Hausdorff TVS $(V, \tau)$ over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ is normable.
Proof. Pick a basis $E = \{e_1, \ldots, e_n\}$ for $V$ and define a norm on $V$ by $\|x\| = \|\alpha_E^{-1}(x)\|_1$. (Thus
$\|\sum_i c_i e_i\| = \sum_i |c_i|$.) By Lemma 2.15 the topology on $V$ induced by this norm is Hausdorff,
thus coincides with $\tau$ by Proposition 2.29. Thus $\tau$ is normable.
2.32 Corollary On a finite-dimensional vector space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$, all norms are equivalent.
Proof. Let $\|\cdot\|_1, \|\cdot\|_2$ be norms on $V$. They give rise to topologies $\tau_1, \tau_2$ such that $(V, \tau_i)$ is a
Hausdorff TVS for $i = 1, 2$. Proposition 2.29 implies $\tau_1 = \tau_2$, so that in view of Definition 2.20
the norms are equivalent.
2.33 Exercise Prove that every finite-dimensional normed space $(V, \|\cdot\|)$ over $\mathbb{R}$ or $\mathbb{C}$ is complete.
The property of being separating is important since one usually is only interested in Hausdorff
topological vector spaces and the following holds:
2.37 Definition A topological vector space (V, τ ) over F is called locally convex if there exists
a separating family F of seminorms on V such that τ = τF .
2.38 Remark 1. Note that with this definition, locally convex spaces are Hausdorff.
2. For an equivalent, more geometric way of defining local convexity of a TVS see the
supplementary Section B.6.2, and for more on locally convex spaces see, e.g., [94, 30, 141].
3. Locally convex spaces, introduced in 1935 by von Neumann10 , have many applications.
In these notes, we will encounter the weak and weak-∗ topologies on (duals of) Banach spaces
and the strong and weak operator topologies. There are many others, as in distribution theory,
relevant for the theory of (partial) differential equations. □
If the separating family $\mathcal{F}$ has just one element, we are back at the notion of a normed,
possibly Banach, space. If $\mathcal{F}$ is finite, i.e. $\mathcal{F} = \{\|\cdot\|_1, \ldots, \|\cdot\|_n\}$, then $\|\cdot\| = \sum_{i=1}^n \|\cdot\|_i$ is a
seminorm, and it is a norm if and only if $\mathcal{F}$ is separating. Thus the case of finite $\mathcal{F}$ again gives
a normed space. Thus $\mathcal{F}$ must be infinite in order for interesting things to happen.
If $\mathcal{F}$ is infinite, we cannot obtain a norm by putting $\|x\| = \sum_{\|\cdot\|' \in \mathcal{F}} \|x\|'$, since the r.h.s. has
no reason to converge for all $x \in V$. But if the family $\mathcal{F}$ of seminorms on $V$ is countable, we
can label the elements of $\mathcal{F}$ as $\|\cdot\|_n$, $n \in \mathbb{N}$ and define
\[ d(x, y) = \sum_{n=1}^{\infty} 2^{-n} \min(1, \|x - y\|_n). \]
Now each term $\min(1, \|x - y\|_n)$ is a translation-invariant pseudometric [defined like a metric,
but without the requirement $d(x, y) = 0 \Rightarrow x = y$] on $V$ that is bounded by 1, and the sum
converges to a translation-invariant metric on $V$. With just a bit more work one shows that
$\tau_{\mathcal{F}} = \tau_d$, thus $(V, \tau_{\mathcal{F}})$ is metrizable. (Note that we could not have defined $\|x\| = \sum_{i=1}^{\infty} 2^{-i} \|x\|_i$
since this again may fail to converge, thus need not be a norm.) If such a space is complete, it
is called a Fréchet space. Clearly, every Fréchet space is an F-space.
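The behaviour of the metric $d$ can be explored with a minimal Python sketch. It assumes, purely for illustration, that $V$ consists of real sequences of a fixed finite length with the coordinate seminorms $\|x\|_n = |x_n|$, so that the infinite sum becomes a finite one; the name `frechet_distance` is hypothetical and not part of these notes.

```python
def frechet_distance(x, y):
    """d(x, y) = sum_n 2^(-n) * min(1, |x_n - y_n|), with n starting at 1.

    Assumption: x and y are equal-length tuples of floats, and the n-th
    seminorm is the absolute value of the n-th coordinate.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(2.0 ** -n * min(1.0, abs(a - b))
               for n, (a, b) in enumerate(zip(x, y), start=1))


x = (5.0, 0.0, 0.0)
zero = (0.0, 0.0, 0.0)
two_x = (10.0, 0.0, 0.0)

# d is bounded by 1 and not homogeneous: doubling x does not double d(x, 0).
print(frechet_distance(x, zero))      # 0.5
print(frechet_distance(two_x, zero))  # 0.5
```

The two printed values coincide, which illustrates why $d$, although translation-invariant, does not come from a norm.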
10
John von Neumann (1903-1957). Hungarian, later American, mathematician. Countless contributions mostly to
foundational matters and analysis, e.g. the theory of unbounded operators and the spectral theorem, von Neumann
algebras, locally convex spaces, but also to applied mathematics and computer science.
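Here is an example of a Fréchet space: For $f \in C^\infty(\mathbb{R}, \mathbb{C})$ and $n, m \in \mathbb{N}_0$, define
\[ \|f\|_{n,m} = \sup_{x \in \mathbb{R}} |x^n f^{(m)}(x)|, \]
where $f^{(m)}$ is the $m$-th derivative of $f$. These $\|\cdot\|_{n,m}$ are seminorms. Since the family $\mathcal{F} =
\{\|\cdot\|_{n,m} \mid n, m \in \mathbb{N}_0\}$ is countable, the space
\[ \mathcal{S} = \{ f \in C^\infty(\mathbb{R}, \mathbb{C}) \mid \|f\|_{n,m} < \infty \ \forall n, m \in \mathbb{N}_0 \} \]
equipped with the topology $\tau_{\mathcal{F}}$ is a Fréchet space. Its elements are called Schwartz¹¹ functions.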
They are infinitely differentiable functions that, together with all their derivatives, vanish as
|x| → ∞ faster than |x|−n for any n. (This definition is easily generalized to functions of several
variables.) Note that the seminorm k · k0,0 alone already separates the elements of S, thus is
a norm, but having the other seminorms around gives rise to a finer topology, one that is not
normable.
3.1 Definition A linear map $A : V \to W$ of normed spaces is called an isometry if $\|Ax\| = \|x\|$
for all $x \in V$.
Recall that a linear map $A : V \to W$ is injective if and only if its kernel $\ker A = A^{-1}(0)$
is $\{0\}$. It follows that an isometry is automatically injective. Furthermore, if $A : V \to W$ is
a surjective isometry then it is invertible, and its inverse also is an isometry. Then $A$ is called
an isometric isomorphism of normed spaces, and we write $V \cong W$. Normed spaces that are
isometrically isomorphic are essentially indistinguishable.
3.2 Definition Let $E, F$ be normed spaces and $A : E \to F$ a linear map. Then the norm
$\|A\| \in [0, \infty]$ is defined by
\[ \|A\| = \sup_{0 \neq e \in E} \frac{\|Ae\|}{\|e\|} = \sup_{e \in E,\ \|e\| = 1} \|Ae\|. \]
(See footnote 12. The equality of the second and third expression is due to linearity of $A$ and homogeneity of
the norms.) If $\|A\| < \infty$ then $A$ is called bounded.
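For matrices the supremum in Definition 3.2 can be approximated directly. The following sketch assumes $E = F = \mathbb{R}^2$ with the $\ell^1$-norm and compares a sampled supremum over the unit sphere with the maximal absolute column sum, which is the standard closed form for this particular operator norm (a fact not taken from these notes). All function names are hypothetical.

```python
def norm1(x):
    """The l^1 norm of a vector given as a list of floats."""
    return sum(abs(c) for c in x)


def apply(A, x):
    """Matrix-vector product for A given as a list of rows."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]


def op_norm_l1_sampled(A, steps=200):
    """Lower estimate of sup{||Ax||_1 : ||x||_1 = 1} for a 2x2 matrix A."""
    best = 0.0
    for k in range(-steps, steps + 1):
        t = k / steps
        for sign in (1.0, -1.0):
            x = [t, sign * (1.0 - abs(t))]   # a point with ||x||_1 = 1
            best = max(best, norm1(apply(A, x)))
    return best


def op_norm_l1_exact(A):
    """Maximal absolute column sum (closed form for the l^1 operator norm)."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


A = [[1.0, -2.0], [3.0, 4.0]]
print(op_norm_l1_sampled(A), op_norm_l1_exact(A))  # both are 6.0 for this A
```

Sampling only ever gives a lower bound for the supremum; here it is exact because the grid contains the point $x = (0, 1)$, where the maximum is attained.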
11
Laurent Schwartz (1915-2002). French mathematician who invented ‘distributions’, an important notion in func-
tional analysis.
12
It should be clear that writing $\sup_{e \in E,\ \|e\| \le 1} \|Ae\|$ instead would not change the result.
3.3 Remark 1. Every isometry has norm one, but not every norm one map is an isometry.
2. The definition implies $\|Ax\| \le \|A\|\|x\| \ \forall x \in E$, and $\|Ax\| \le C\|x\| \ \forall x$ implies $\|A\| \le C$.
3. Linear maps are also called linear operators, but linear maps $A : E \to \mathbb{F}$ are called linear
functionals (whence the term ‘functional analysis’).
4. If $V = C^\infty(\mathbb{R}, \mathbb{R})$ with norm $\|f\| = \sup_x |f(x)|$ and $f_c(x) = \sin(cx)$ then for all $c > 0$
we have $f_c \in V$, $\|f_c\| = 1$ and $\|f_c'\| = c$. Thus $A : V \to V, f \mapsto f'$ is unbounded. (The same
holds for essentially all differential operators.) While this operator is defined on all of $V$, for
unbounded operators it often is too restrictive to require them to be defined on the whole space.
Cf. also Remark 7.33.
5. In fact, every infinite-dimensional space admits unbounded linear maps, cf. Exercise 3.8
below. □
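Item 4 of the remark can be checked numerically. The sketch below approximates the sup norms of $f_c(x) = \sin(cx)$ and of its derivative on a grid; it assumes $c \ge 1$, so that the interval $[0, 2\pi]$ contains a full period, and it uses the derivative computed by hand. The helper names are hypothetical.

```python
import math


def sup_norm_on_grid(f, a, b, n):
    """Approximate sup |f| over [a, b] by the maximum over n+1 equally spaced points."""
    return max(abs(f(a + (b - a) * k / n)) for k in range(n + 1))


def derivative_ratio(c, n=100_000):
    """Return ||f_c'|| / ||f_c|| for f_c(x) = sin(c x), sampled on [0, 2*pi]."""
    f = lambda x: math.sin(c * x)
    df = lambda x: c * math.cos(c * x)   # derivative of f_c, computed by hand
    return (sup_norm_on_grid(df, 0.0, 2 * math.pi, n)
            / sup_norm_on_grid(f, 0.0, 2 * math.pi, n))


for c in (1.0, 10.0, 100.0):
    print(c, derivative_ratio(c))  # the ratio is close to c, hence unbounded in c
```

Since the ratio $\|Af_c\|/\|f_c\|$ grows like $c$, no constant $C$ with $\|Af\| \le C\|f\|$ can exist.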
3.4 Exercise If $E, G, H$ are normed spaces and $S : E \to G$, $T : G \to H$ are linear maps, prove
that $\|TS\| \le \|S\|\|T\|$. (We write $TS$ for the composite map $T \circ S : E \to H$.)
3.5 Lemma Let E, F be normed spaces and A : E → F a linear map. Then the following are
equivalent:
(i) A is bounded.
(ii) A is continuous (w.r.t. the norm topologies).
(iii) A is continuous at 0 ∈ E.
Proof. (i)⇒(ii) For $x, y \in E$ we have $\|Ax - Ay\| = \|A(x - y)\| \le \|A\| \|x - y\|$. Thus $d(Ax, Ay) \le
\|A\| d(x, y)$, and with $\|A\| < \infty$ we have (uniform) continuity of $A$.
(ii)⇒(iii) This is obvious.
(iii)⇒(i) $B^F(0, 1) \subseteq F$ is an open neighborhood of $0 \in F$. Since $A$ is continuous at 0,
the preimage $A^{-1}(B^F(0, 1)) \subseteq E$, which clearly contains 0, is a neighborhood of 0. Thus there exists $\varepsilon > 0$
such that $B^E(0, \varepsilon) \subseteq A^{-1}(B^F(0, 1))$. In other words, $\|x\| < \varepsilon$ implies $\|Ax\| < 1$. A fortiori,
$\|x\| \le \varepsilon/2 \Rightarrow \|Ax\| \le 1$. By linearity of $A$ and absolute homogeneity of the norms,
this is equivalent to $\|x\| \le 1 \Rightarrow \|Ax\| \le 2/\varepsilon$, thus $\|A\| \le 2/\varepsilon < \infty$. (More precisely,
$\|A\| = (\sup\{\varepsilon > 0 \mid B^E(0, \varepsilon) \subseteq A^{-1}(B^F(0, 1))\})^{-1}$.)
3.7 Exercise Let V, W be normed spaces, where V is finite-dimensional. Prove that every
linear map V → W is bounded.
(i) Show that there exists an unbounded linear map ϕ : V → F. (Hint: Use a Hamel basis13 )
(ii) If W is a non-zero normed space, show that there is an unbounded linear map A : V → W .
For linear functionals, i.e. linear maps from an F-vector space to F, there is another charac-
terization of boundedness/continuity:
3.9 Exercise Let $(V, \|\cdot\|)$ be a normed $\mathbb{F}$-vector space and $\varphi : V \to \mathbb{F}$ a linear functional.
Prove that $\varphi$ is continuous if and only if $\ker \varphi = \varphi^{-1}(0) \subseteq V$ is closed.
Hint: For ⇐, pick a ball $B(x, r) \subseteq V \setminus \ker \varphi$ and prove that $\varphi(B(0, r))$ is bounded.
If $V$ is a finite-dimensional normed space of dimension $d$, by linear algebra there exists a
basis $E = \{e_1, \ldots, e_d\}$. We can normalize its elements so that $\|e_i\| = 1$ for all $i$. Once $E$
is fixed, there is a unique family $F = \{\varphi_1, \ldots, \varphi_d\}$ of linear functionals $V \to \mathbb{F}$ such that
$\varphi_i(e_j) = \delta_{i,j} \ \forall i, j$. (One easily checks that $F$ is a basis for $V^*$.) By Exercise 3.7, the $\varphi_i$ are
bounded, and $\|\varphi_i\| \ge |\varphi_i(e_i)|/\|e_i\| = 1/1 = 1$. Less trivially one has:
3.10 Proposition (Auerbach 1929)¹⁴ Every finite-dimensional normed space admits a normalized
basis $E$, called Auerbach basis, such that also the dual basis $F$ is normalized, i.e.
$\|\varphi_i\| = 1 \ \forall i$.
Proof. Since every finite-dimensional vector space is isomorphic to $\mathbb{R}^n$ for some $n$, it suffices
to prove this for $W = \mathbb{R}^n$ (but with arbitrary norm). Let $X = \{w \in W \mid \|w\| = 1\}$ be the
unit sphere, which is compact (since it is bounded and closed and the Heine-Borel theorem¹⁵
applies since the topology on $W = \mathbb{R}^n$ is the Euclidean one by Corollary 2.32). Consider the
map $f : X^n \to \mathbb{R}$, $E = (e_1, \ldots, e_n) \mapsto \det(e_1|e_2| \cdots |e_n)$, the matrix on the right having the
$e_i$ as columns. This is a continuous function, so $\mu = \sup_{E \in X^n} f(E)$ is finite, positive (since $f$
changes sign under exchange of two columns) and is assumed by some $E = (e_1, \ldots, e_n)$. Defining
$\varphi_i : x \mapsto \det(e_1, \ldots, e_{i-1}, x, e_{i+1}, \ldots, e_n)/\mu$, we have $\varphi_i(e_j) = \delta_{ij}$ (since the determinant vanishes
if two columns are equal). The definition of $\mu$ implies $\det(e_1, \ldots, e_{i-1}, x, e_{i+1}, \ldots, e_n) \le \mu$ for
all $x \in X$ (and, replacing $x$ by $-x$, also $\ge -\mu$), thus $\|\varphi_i\| = \sup_{x \in X} |\varphi_i(x)| \le 1$, and we are done.
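The proof is constructive enough to be imitated numerically. The sketch below assumes $W = \mathbb{R}^2$ with the $\ell^1$-norm and replaces the compactness argument by a brute-force search over a finite grid on the unit sphere, so in general it only approximates an Auerbach basis; for this particular norm the maximal determinant is attained at grid points. The function names are hypothetical.

```python
def l1_sphere(steps):
    """Grid of points on the unit sphere of (R^2, ||.||_1)."""
    pts = []
    for k in range(-steps, steps + 1):
        t = k / steps
        for sign in (1.0, -1.0):
            pts.append((t, sign * (1.0 - abs(t))))
    return pts


def det2(u, v):
    """Determinant of the 2x2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]


def auerbach_by_grid(steps=20):
    """Maximise det(e1|e2) over pairs of grid points and build the dual functionals."""
    pts = l1_sphere(steps)
    mu, e1, e2 = max(((det2(u, v), u, v) for u in pts for v in pts),
                     key=lambda triple: triple[0])
    phi1 = lambda x: det2(x, e2) / mu   # replaces the first column by x
    phi2 = lambda x: det2(e1, x) / mu   # replaces the second column by x
    return mu, e1, e2, phi1, phi2


mu, e1, e2, phi1, phi2 = auerbach_by_grid()
print(mu, e1, e2)
print(max(abs(phi1(x)) for x in l1_sphere(20)),
      max(abs(phi2(x)) for x in l1_sphere(20)))  # neither exceeds 1
```

The functionals `phi1`, `phi2` are exactly the maps $\varphi_i$ from the proof, so their values on the sphere are bounded by 1 by construction of $\mu$.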
Prove:
(i) $D(V, W) = D(W, V) \ge 1$.
(ii) If $V \cong W$ (isometric isomorphism) then $D(V, W) = 1$. In particular $D(V, V) = 1$.
(iii) If $V \simeq W \simeq Z$ then $D(V, Z) \le D(V, W) D(W, Z)$.
13
Recall from linear algebra that a Hamel basis for V is a subset E ⊂ V such that every x ∈ V is a linear combination
of finitely many elements of E, in a unique way.
14
Herman Auerbach (1901-1942). Polish mathematician. Born in Tarnopol (then Austria-Hungary, now Ukraine).
Murdered in the Belzec extermination camp.
15
A subset of Rn , equipped with its standard topology, is compact if and only if it is closed and bounded. While
the name attributes the theorem to Eduard Heine (1821-1881) and Emile Borel (1871-1956), the real history is very
complicated. See [3].
16
Stanislaw Mazur (1905-1981). Polish mathematician. Also known for the Gelfand-M. theorem and others.
(iv) Restricted to a set of mutually isomorphic Banach spaces, d(V, W ) = log D(V, W ) is a
pseudometric.
If $V$ and $W$ are finite-dimensional normed spaces with $D(V, W) = 1$, one can prove that they
are isometrically isomorphic, but in infinite dimensions this is not true!
3.12 Lemma Let $V$ be a normed space, $W \subseteq V$ a dense linear subspace, $Z$ a Banach space
and $A : W \to Z$ a bounded linear map. Then there is a unique bounded linear map $\hat{A} : V \to Z$
with $A = \hat{A}|_W$.¹⁷ It satisfies $\|\hat{A}\| = \|A\|$. If $A$ is an isometry, so is $\hat{A}$.
Proof. Let $x \in V$. Then there is a sequence $\{w_n\}$ in $W$ such that $\|w_n - x\| \to 0$. Then $\{w_n\} \subseteq W$
is a Cauchy sequence, and so is $\{Aw_n\} \subseteq Z$ by boundedness of $A$. The latter converges since
$Z$ is complete. If $\{w_n'\}$ is another sequence converging to $x$ then $\|A(w_n - w_n')\| \to 0$, so that
$\lim Aw_n' = \lim Aw_n$. It thus is well-defined to put $\hat{A}x = \lim_{n\to\infty} Aw_n$. We omit the easy
proof of linearity of $\hat{A}$. If $x \in W$ then we can put $w_n = x \ \forall n$, obtaining $\hat{A}x = Ax$, thus
$\hat{A}|_W = A$. By density of $W \subseteq V$, any other continuous extension of $A$ coincides with $\hat{A}$. We
have $\|\hat{A}x\| = \lim \|Aw_n\| \le \|A\|\|x\|$. Thus $\|\hat{A}\| \le \|A\|$, and the converse inequality is obvious.
If $A$ is an isometry then $\|\hat{A}x\| = \lim_n \|Aw_n\| = \lim_n \|w_n\| = \|x\|$, so that $\hat{A}$ is an isometry.
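3.13 Exercise Let $X, Y$ be Banach spaces over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$. Let $\{x_i\}_{i\in I} \subseteq X$, $\{y_i\}_{i\in I} \subseteq Y$ be
families of vectors such that $\mathrm{span}_{\mathbb{F}}\{x_i \mid i \in I\} \subseteq X$ is dense. Show that there is an $A \in B(X, Y)$
satisfying $Ax_i = y_i \ \forall i \in I$ if and only if there exists a $C \in [0, \infty)$ such that
\[ \Big\| \sum_{i\in J} c_i y_i \Big\| \le C \, \Big\| \sum_{i\in J} c_i x_i \Big\| \]
for all finite subsets $J \subseteq I$ and numbers $\{c_i\}_{i\in J}$ in $\mathbb{F}$. Show that this $A$ is uniquely determined.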
• unconditionally convergent if $\sum_{k=1}^{\infty} x_{\sigma(k)}$ converges for each permutation $\sigma$ of $\mathbb{N}$.
17
If $f : X \to Y$ is a function and $Z \subseteq X$, then according to typographical convenience we write either $f|_Z$ or $f \restriction Z$
for the restriction of $f$ to $Z$, which is a map $Z \to Y$. But if $f : X \to X$ maps $Y \subseteq X$ into itself, $f \restriction Y = f|_Y$ usually
is meant as a map $Y \to Y$, not $Y \to X$.
• conditionally convergent if it converges, but not unconditionally.
• absolutely convergent if $\sum_{n=1}^{\infty} \|x_n\| < \infty$.
Since the sequence $\{T_n\}$ is convergent by assumption, thus Cauchy, the above implies that $\{S_n\}$
is Cauchy, thus convergent by completeness of $V$. That the convergence is unconditional follows
from the fact that $\sum_{n=1}^{\infty} \|x_{\sigma(n)}\|$ is independent of $\sigma$, thus finite. The statement that then also
$\sum_{n=1}^{\infty} x_{\sigma(n)}$ is independent of $\sigma$ is proven as for series of real or complex numbers in a standard
analysis course.¹⁸
(iv) Assume that every absolutely convergent series in $V$ converges, and let $\{y_n\}_{n\in\mathbb{N}}$ be
a Cauchy sequence in $V$. We can find (why?) a subsequence $\{z_k\}_{k\in\mathbb{N}} = \{y_{n_k}\}$ such that
$\|z_k - z_{k-1}\| \le 2^{-k} \ \forall k \ge 2$. Now put $z_0 = 0$ and define $x_k = z_k - z_{k-1}$ for $k \ge 1$. Now
\[ \sum_{k=1}^{\infty} \|x_k\| = \sum_{k=1}^{\infty} \|z_k - z_{k-1}\| \le \|z_1\| + \sum_{k=2}^{\infty} 2^{-k} < \infty. \]
Thus $\sum_{k=1}^{\infty} x_k$ is absolutely convergent, and therefore convergent by the hypothesis. To wit,
$\lim_{n\to\infty} S_n$ exists, where $S_n = \sum_{k=1}^{n} x_k = \sum_{k=1}^{n} (z_k - z_{k-1}) = z_n$. Thus $z = \lim_{k\to\infty} z_k =
\lim_{k\to\infty} y_{n_k}$ exists. Now the sequence $\{y_n\}$ is Cauchy and has a convergent subsequence $\{y_{n_k}\}$.
This implies (why?) that the whole sequence $\{y_n\}$ converges to the limit of the subsequence.
3.17 Exercise Let $(V, \|\cdot\|)$ be a finite-dimensional normed space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$. Prove that
every unconditionally convergent series $\sum_{n=1}^{\infty} x_n$ in $V$ is absolutely convergent.
3.18 Remark 1. The result of the preceding exercise fails in infinite dimensions! In Exercise
4.16 we will encounter series in infinite-dimensional Banach spaces that are unconditionally
but not absolutely convergent. In fact, by the remarkable Dvoretzky-Rogers theorem (1950)
every infinite-dimensional Banach space contains unconditionally convergent series that are not
absolutely convergent! (See Section B.2.1 for a proof.)
2. We will later prove in two different ways (Corollary 9.11 and Theorem A.4) that the
independence of $\sum_{n=1}^{\infty} x_{\sigma(n)}$ of $\sigma$ holds for all unconditionally convergent series, not only the
absolutely convergent ones.
3. There is no characterization of unconditionally convergent series in terms of $\{\|x_n\|\}$, but
see the results of Appendices A.2 and B.2. For example, we prove that $\sum_n x_n$ converges
unconditionally ⇔ $\sum_n c_n x_n$ converges for all bounded sequences $\{c_n\} \subseteq \mathbb{F}$ ⇔ $\sum_i x_{n_i}$ converges for
all $n_1 < n_2 < \cdots$. The moral is that unconditional convergence does not rely on ‘cancellations’.
4. Proposition 3.15(iv) shows that it is somewhat careless to write “$\sum_n \|x_n\| < \infty$, thus
$\sum_n x_n$ converges” since the completeness condition is indispensable. □
3.19 Exercise Let $(V, \|\cdot\|)$ be a Banach space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $E \subset V$ a Hamel basis. By
linear algebra there are unique linear functionals $\{\varphi_e : V \to \mathbb{F}\}_{e\in E}$ such that for each $x \in V$ we
have $x = \sum_{e\in E} \varphi_e(x) e$, where $\{e \in E \mid \varphi_e(x) \neq 0\}$ is finite. Prove:
(i) If $V$ is finite-dimensional then all $\varphi_e$ are continuous.
(ii) If $V$ is infinite-dimensional then $\{e \in E \mid \varphi_e \text{ is continuous}\}$ is finite. Hint: Argue by
contradiction, using Proposition 3.15 to construct an $x \in V$ for which $\{e \in E \mid \varphi_e(x) \neq 0\}$
is infinite.
Part (ii) shows that Hamel bases are not very well suited for infinite-dimensional Banach
spaces (other than for constructing counterexamples). This will be reinforced by Exercise 7.21,
where we will see that every Hamel basis of a separable Banach space has cardinality c = #R
rather than ℵ0 = #N.
3.20 Lemma Let (X, d) be a metric space and Y ⊆ X. Then (instead of d|Y we just write d)
(i) If (X, d) is complete and Y ⊆ X is closed (w.r.t. τd , of course) then (Y, d) is complete.
(ii) If (Y, d) is complete then Y ⊆ X is closed (whether or not (X, d) is complete).
The above should be compared with the fact that a closed subset of a compact space is
compact and that a compact subset of a Hausdorff space is closed. In the above, completeness
works as a weak substitute of compactness, an interpretation that is reinforced by the fact that
every compact metric space is complete.
The above lemma readily specializes to normed spaces:
3.21 Lemma Let $(V, \|\cdot\|)$ be a normed space and $W \subseteq V$ a linear subspace. Then
(i) If $V$ is complete (=Banach) and $W \subseteq V$ is closed then $W$ is Banach.
(ii) If $W$ is complete then $W \subseteq V$ is closed (whether or not $V$ is complete).
Note: We will often omit ‘linear’ from ‘linear subspace’. When arbitrary, possibly non-linear,
subsets are intended we will make this clear.
3.22 Exercise Prove that every finite-dimensional subspace of a normed space is closed.
The result of the preceding exercise is not at all true for infinite-dimensional subspaces! For
example, let $V = \ell^1(\mathbb{N}) = \{f : \mathbb{N} \to \mathbb{R} \mid \sum_{n=1}^{\infty} |f(n)| < \infty\}$. Now the infinite-dimensional linear
subspace $W = \{f : \mathbb{N} \to \mathbb{R} \mid \#\{n \in \mathbb{N} \mid f(n) \neq 0\} < \infty\} \subseteq V$ is non-closed as follows from the
easy facts $W \neq V$ and $\overline{W} = V$.
The following (to be generalized in Lemma 7.39) gives closedness of the image of an isometry:
If $E$ is a normed $\mathbb{F}$-vector space, the same holds for $B(E) = B(E, E)$, and by Exercise 3.4,
we have $\|ST\| \le \|S\|\|T\|$ for all $S, T \in B(E)$. This motivates the following definition:
3.28 Definition A normed $\mathbb{F}$-algebra is an $\mathbb{F}$-algebra $A$ equipped with a norm $\|\cdot\|$ such that
$\|ab\| \le \|a\|\|b\|$ for all $a, b \in A$ (submultiplicativity). A Banach algebra is a normed algebra that
is complete (as a normed space). An algebra $A$ is called unital if it has a unit $1 \neq 0$. (In fact,
if $A \neq \{0\}$ then $1 = 0$ would imply the contradiction $\|a\| = \|1a\| \le \|1\|\|a\| = 0 \ \forall a \in A$.)
By the above, B(E) is a normed algebra for every normed space E, and by Proposition
3.25(ii), B(E) is a Banach algebra whenever E is a Banach space. There is another standard
class of examples:
3.30 Example Let $X$ be a compact topological space and $A = C(X, \mathbb{F})$.²¹ We already know
that $A$, equipped with the norm $\|f\| = \sup_{x\in X} |f(x)|$, is a Banach space. The pointwise product
$(fg)(x) = f(x)g(x)$ of functions is bilinear, associative and clearly satisfies $\|fg\| \le \|f\|\|g\|$.
This makes $(A, \|\cdot\|, \cdot)$ a Banach algebra. An analogous result holds for the algebra $C_b(X, \mathbb{F})$ of
bounded continuous functions on a not necessarily compact space $X$.
We will have much more to say about Banach algebras later in the course.
Before we go on developing the general theory of Banach spaces, it is instructive to study
in some depth an important class of spaces, the spaces $\ell^p(S, \mathbb{F})$, where everything can be done
very explicitly, in particular the dual spaces can be determined.
4.1 Basics. 1 ≤ p ≤ ∞: Hölder and Minkowski inequalities
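4.1 Definition ($\ell^p$-Spaces) If $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$, $0 < p < \infty$, $S$ is a set and $f : S \to \mathbb{F}$, define
\[ \|f\|_\infty = \sup_{s\in S} |f(s)| \in [0, \infty], \qquad \|f\|_p = \Big( \sum_{s\in S} |f(s)|^p \Big)^{1/p} \in [0, \infty], \]
where $\infty^{1/p} = \infty$ and we use the notion of unordered sums, cf. Appendix A.1. Now for all
$p \in (0, \infty]$ put
\[ \ell^p(S, \mathbb{F}) := \{f : S \to \mathbb{F} \mid \|f\|_p < \infty\}. \]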
Here are some immediate observations:
• We have $|f(s)| \le \|f\|_p$ for all $s \in S$ and $p \in (0, \infty]$.
• $\|f\|_p = 0$ if and only if $f = 0$.
• For all $c \in \mathbb{F}$ we have $\|cf\|_p = |c|\|f\|_p$ (with the understanding that $0 \cdot \infty = 0$).
• If $S$ is finite then $\ell^p(S, \mathbb{F}) = \{f : S \to \mathbb{F}\} = \mathbb{F}^S$. If $\#S = 1$ then all the $\|\cdot\|_p$ coincide.
• If $\#S = \infty$ then $\|\cdot\|_p$ is not a norm on $\mathbb{F}^S$ in the sense of Definition 2.13 since we can have
$\|f\|_p = +\infty$. But we’ll see that the restriction of $\|\cdot\|_p$ to $\ell^p(S, \mathbb{F})$ is a norm if $p \in [1, \infty]$.
4.2 Lemma (i) $\|\cdot\|_1$ and $\|\cdot\|_\infty$ are subadditive, thus norms on $\ell^1(S, \mathbb{F})$ and $\ell^\infty(S, \mathbb{F})$, resp.
(ii) If $f \in \ell^1(S, \mathbb{F})$ and $g \in \ell^\infty(S, \mathbb{F})$ then
\[ \Big| \sum_{s\in S} f(s)g(s) \Big| \le \|fg\|_1 \le \|f\|_1 \|g\|_\infty. \]
(iii) $(\ell^p(S, \mathbb{F}), \|\cdot\|_p)$ are vector spaces for all $p \in (0, \infty]$.
Proof. (i) In view of the preceding observations, it remains to prove subadditivity:
\[ |a + b|^p \le (|a| + |b|)^p \le (2 \max(|a|, |b|))^p = 2^p \max(|a|^p, |b|^p) \le 2^p (|a|^p + |b|^p), \tag{4.1} \]
4.3 Definition If $p, q \in [1, \infty]$ we say that $p$ and $q$ are dual (or conjugate) to each other if
$\frac{1}{p} + \frac{1}{q} = 1$, with the understanding $\frac{1}{\infty} = 0$, $\frac{1}{0} = \infty$.
One easily checks that every $p \in [1, \infty]$ has a unique conjugate $q \in [1, \infty]$. And for $1 <
p, q < \infty$ conjugacy is equivalent to $pq = p + q$.
4.4 Proposition Let $1 < p < \infty$ and let $q$ be conjugate to $p$, i.e. $\frac{1}{p} + \frac{1}{q} = 1$. Then
(i) For all $f, g : S \to \mathbb{F}$ we have $\|fg\|_1 \le \|f\|_p \|g\|_q$. (Inequality of Hölder²² (1889))
(ii) For all $f, g : S \to \mathbb{F}$ we have $\|f + g\|_p \le \|f\|_p + \|g\|_p$. (Inequality of Minkowski²³ (1896))
Proof. (i) The inequality is trivially true if $\|f\|_p$ or $\|g\|_q$ is zero or infinite. Thus we assume
$\|f\|_p$ and $\|g\|_q$ to be finite and non-zero. The exponential function $\mathbb{R} \to \mathbb{R}, x \mapsto e^x$ is convex²⁴,
so that in view of $\frac{1}{p} + \frac{1}{q} = 1$ we have
\[ e^{a/p} e^{b/q} = \exp\Big( \frac{a}{p} + \frac{b}{q} \Big) \le \frac{e^a}{p} + \frac{e^b}{q} \quad \forall a, b \in \mathbb{R}. \]
If $\|f + g\|_p \neq 0$ we divide by $\|f + g\|_p^{p/q}$, and using $p - p/q = p(1 - 1/q) = p \cdot \frac{1}{p} = 1$ we obtain
22
Otto Hölder (1859-1937). German mathematician. Important contributions to analysis and algebra.
23
Hermann Minkowski (1864-1909). German mathematician. Contributions to number theory, relativity and other
fields. We’ll encounter M.-functionals.
24
$f : [a, b] \to \mathbb{R}$ is convex if $f(tx + (1 - t)y) \le tf(x) + (1 - t)f(y)$ for all $x, y \in [a, b]$ and $t \in [0, 1]$ and strictly convex
if the inequality is strict whenever $x \neq y$, $0 < t < 1$. See, e.g., [57, Vol. 1, Section 7.2].
For p = q = 2, the inequality of Hölder is known as the Cauchy-Schwarz inequality. We
will also call the trivial inequalities of Lemma 4.2 for {p, q} = {1, ∞} Hölder and Minkowski
inequalities. Now the analogue of Lemma 4.2 for 1 < p < ∞ is clear:
\[ \Big| \sum_{s\in S} f(s)g(s) \Big| \le \|fg\|_1 \le \|f\|_p \|g\|_q. \]
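The inequalities of Hölder and Minkowski are easy to test on finitely supported functions. The following sketch assumes $S$ is a finite index set, so that a function is just a list of numbers; the helper names `holder_gap` and `minkowski_gap` are hypothetical. It is a numerical sanity check, not a proof.

```python
def p_norm(values, p):
    """(sum |v|^p)^(1/p) for a finite list of numbers and 0 < p < infinity."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def holder_gap(f, g, p):
    """||f||_p * ||g||_q - ||f g||_1 with 1/p + 1/q = 1 (should be >= 0)."""
    q = p / (p - 1.0)
    fg = [a * b for a, b in zip(f, g)]
    return p_norm(f, p) * p_norm(g, q) - p_norm(fg, 1.0)


def minkowski_gap(f, g, p):
    """||f||_p + ||g||_p - ||f + g||_p (should be >= 0)."""
    f_plus_g = [a + b for a, b in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(f_plus_g, p)


f = [1.0, -2.0, 3.0]
g = [0.5, 4.0, -1.0]
for p in (1.5, 2.0, 3.0):
    print(p, holder_gap(f, g, p), minkowski_gap(f, g, p))  # both gaps are non-negative
```

For $p = q = 2$ and $g = f$ the Hölder gap vanishes up to rounding, which is the equality case of Cauchy-Schwarz.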
defines a translation-invariant metric. (Note the absence of the $p$-th root present in $\|\cdot\|_p$!)
(iv) $\ell^p(S, \mathbb{F})$ is a topological vector space when given the metric topology $\tau_{d_p}$.
Proof. (i) Pick $s, t \in S$, $s \neq t$ and put $f = \delta_s$, $g = \delta_t$. Now $\|f\|_p = \|g\|_p = 1$ and
\[ \|f + g\|_p = 2^{1/p} > 2 = \|f\|_p + \|g\|_p \]
since $1/p > 1$. Thus $\|\cdot\|_p$ is not subadditive and therefore not a norm.
(ii) The proof of Lemma 4.2(iii) included the case 0 < p < 1.
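(iii) That $d_p(f, g) < \infty$ for all $f, g \in \ell^p(S, \mathbb{F})$ follows from $\ell^p$ being a vector space. Translation
invariance of $d_p$ and the axioms $d_p(f, g) = d_p(g, f)$ and $d_p(f, g) = 0 \Leftrightarrow f = g$ are all evident
from the definition. We claim that $(a + b)^p \le a^p + b^p$ for all $a, b \ge 0$. Given this, for $f, g, h \in \ell^p(S, \mathbb{F})$ we have
\[ d_p(f, h) = \sum_s |f(s) - h(s)|^p \le \sum_s (|f(s) - g(s)| + |g(s) - h(s)|)^p \le \sum_s |f(s) - g(s)|^p + \sum_s |g(s) - h(s)|^p = d_p(f, g) + d_p(g, h), \]
as wanted, where we first used the triangle inequality and then the claim.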
Turning to our claim $(a + b)^p \le a^p + b^p$, it is clear that this holds if $a = 0$. For $a = 1$ it
reduces to $(1 + b)^p \le 1 + b^p \ \forall b \ge 0$. For $b = 0$ this is true, and for all $b > 0$ it follows from the
fact that
\[ \frac{d}{db} (1 + b^p - (1 + b)^p) = p(b^{p-1} - (b + 1)^{p-1}) > 0 \]
due to $p - 1 < 0$. If now $a > 0$ then
\[ (a + b)^p = a^p \Big( 1 + \frac{b}{a} \Big)^p \le a^p \Big( 1 + \Big( \frac{b}{a} \Big)^p \Big) = a^p + b^p, \]
and we are done. (By almost the same argument, for $p \ge 1$ we have $(a + b)^p \ge a^p + b^p$.)
(iv) By the above, $d_p$ is a translation-invariant metric, so that the addition operation $+ :
\ell^p \times \ell^p \to \ell^p$ is jointly continuous by Lemma 2.9. Furthermore,
\[ d_p(cf, 0) = \sum_s |cf(s)|^p = |c|^p \sum_s |f(s)|^p = |c|^p d_p(f, 0), \]
which tends to zero if $c \to 0$ for fixed $f$ or $f \to 0$ (in the sense of $d_p(f, 0) \to 0$) for fixed $c$. By
Remark 2.10.3 this implies joint continuity of the scalar action $\mathbb{F} \times \ell^p \to \ell^p$.
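The difference between $\|\cdot\|_p$ and $d_p$ for $0 < p < 1$ shows up already on a two-point set. The sketch below assumes $S = \{s, t\}$ and $p = 1/2$ and represents functions as lists of two numbers; the function names are hypothetical.

```python
def p_norm(values, p):
    """(sum |v|^p)^(1/p); for 0 < p < 1 this is not subadditive."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def d_p(f, g, p):
    """d_p(f, g) = sum |f(s) - g(s)|^p, without the p-th root (for 0 < p < 1)."""
    return sum(abs(a - b) ** p for a, b in zip(f, g))


p = 0.5
delta_s, delta_t, zero = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]
both = [1.0, 1.0]   # delta_s + delta_t

print(p_norm(both, p))     # 4.0 = 2^(1/p), larger than ||delta_s||_p + ||delta_t||_p = 2
print(d_p(both, zero, p))  # 2.0 = d_p(delta_s, 0) + d_p(delta_t, 0)
```

So subadditivity fails for $\|\cdot\|_p$, while the triangle inequality for $d_p$ holds with equality in this example.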
which is a metric in all cases. For a function f : S → F we define supp f = {s ∈ S | f (s) 6= 0}.
4.8 Lemma Let $p \in (0, \infty]$ and $d_p(x, y) = \|x - y\|_p$. Then $(\ell^p(S, \mathbb{F}), d_p)$ is complete for every
set $S$ and $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$.
Proof. For all $p \in (0, \infty]$ we have $|f(s) - g(s)| \le \|f - g\|_p$. Thus for $p \ge 1$ we have $|f(s) - g(s)| \le
d_p(f, g)$, while for $p \in (0, 1)$ we have $|f(s) - g(s)| \le \|f - g\|_p = d_p(f, g)^{1/p}$. In either case
$d_p(f, g) \to 0$ implies $f(s) - g(s) \to 0$ for all $s \in S$. Thus if $\{f_n\} \subseteq \ell^p(S, \mathbb{F})$ is a Cauchy sequence
w.r.t. $d_p$ then $\{f_n(s)\}$ is a Cauchy sequence in $\mathbb{F}$, thus convergent for each $s \in S$. Defining
$g(s) = \lim_{n\to\infty} f_n(s)$, it remains to prove $g \in \ell^p(S, \mathbb{F})$ and $d_p(f_n, g) \to 0$.
For $p = \infty$ and $\varepsilon > 0$ we can find $n_0$ such that $n, m \ge n_0$ implies $\|f_n - f_m\|_\infty < \varepsilon$, which
readily gives $\|f_m\|_\infty \le \|f_{n_0}\|_\infty + \varepsilon$ for all $m \ge n_0$. Thus also $\|g\|_\infty \le \|f_{n_0}\|_\infty + \varepsilon < \infty$. Taking
$m \to \infty$ in $\sup_s |f_n(s) - f_m(s)| < \varepsilon$ gives $\sup_s |f_n(s) - g(s)| \le \varepsilon$, whence $\|f_n - g\|_\infty \to 0$.
For $0 < p < \infty$ we give a uniform argument. Since $\{f_n\}$ is Cauchy w.r.t. $d_p$, for $\varepsilon > 0$ we
can find $n_0$ such that $n, m \ge n_0$ implies $d_p(f_n, f_m) < \varepsilon$. Applying the dominated convergence
theorem (in the simple case of an infinite sum rather than a general integral, cf. Proposition
A.3) gives $d_p(g, f_m) = \lim_{n\to\infty} d_p(f_n, f_m) \le \varepsilon$. This implies both $g \in \ell^p(S, \mathbb{F})$ and $d_p(g, f_m) \to 0$
as $m \to \infty$.
where $\|f\|_\infty \le \|f\|_q \le \|f\|_p$. All the inclusion maps except the first have norm one.
Proof. If $f \in c_{00}(S, \mathbb{F})$ then clearly $\|f\|_p < \infty$ for all $p \in (0, \infty]$. And $f \in c_0(S, \mathbb{F})$ implies
boundedness of $f$. This gives the first and last inclusion. The map $c_0(S, \mathbb{F}) \hookrightarrow \ell^\infty(S, \mathbb{F})$ is an
isometry since both spaces have the norm $\|\cdot\|_\infty$.
If $f \in \ell^q(S, \mathbb{F})$ with $q \in (0, \infty)$ then finiteness of $\|f\|_q^q = \sum_{s\in S} |f(s)|^q$ implies that $\{s \in
S \mid |f(s)| \ge \varepsilon\}$ is finite for each $\varepsilon > 0$, thus $f \in c_0(S, \mathbb{F})$. Since $|f(s)| \le \|f\|_q$ for all $s$, we have
$\|f\|_\infty = \sup_{s\in S} |f(s)| \le \|f\|_q$.
Now let $0 < p < q < \infty$ and $f \in \ell^p(S, \mathbb{F})$ with $\|f\|_p = 1$. Then $|f(s)| \le 1 \ \forall s$, thus
\[ \|f\|_q^q = \sum_{s\in S} |f(s)|^q = \sum_{s\in S} |f(s)|^{p \cdot \frac{q}{p}} \le \sum_{s\in S} |f(s)|^p = \|f\|_p^p = 1, \tag{4.4} \]
where we used $q/p > 1$ and $|f(s)| \le 1 \ \forall s$. Thus $\|f\|_q \le 1$. Applying this to $f = g/\|g\|_p$ gives
$\|g\|_q \le \|g\|_p$ for all $g \in \ell^p(S, \mathbb{F})$.
We now have that all the inclusion maps have norm $\le 1$. Taking $f = \delta_s$ and using $\|\delta_s\|_p = 1$
for all $p \in (0, \infty]$ gives that the inclusion maps all have norm one.
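The monotonicity $\|f\|_\infty \le \|f\|_q \le \|f\|_p$ for $p < q$, and the fact that $\delta_s$ has norm one for every $p$, can be observed numerically. The sketch assumes a finite set $S$, with functions given as lists; `math.inf` stands for $p = \infty$.

```python
import math


def p_norm(values, p):
    """||f||_p for a finite list; p = math.inf gives the sup norm."""
    if p == math.inf:
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


f = [3.0, -4.0, 1.0, 0.5]
exponents = [0.5, 1.0, 2.0, 4.0, math.inf]
norms = [p_norm(f, p) for p in exponents]
print(norms)  # non-increasing in p

delta = [0.0, 1.0, 0.0]
print([p_norm(delta, p) for p in exponents])  # every entry is 1.0
```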
4.11 Remark While we have found continuous maps between them, the spaces $\ell^p$ ($1 \le p < \infty$)
and $c_0$ are mutually non-isomorphic. See Corollary B.32 for an even stronger statement. □
dense.
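If $f \in c_0(S, \mathbb{F})$ and $\varepsilon > 0$ then $F = \{s \in S \mid |f(s)| \ge \varepsilon\}$ is finite. Now $g = f\chi_F$ is in $c_{00}(S, \mathbb{F})$
and $\|f - g\|_\infty < \varepsilon$, proving $f \in \overline{c_{00}(S, \mathbb{F})}^{\|\cdot\|_\infty}$. And $f \in \overline{c_{00}(S, \mathbb{F})}^{\|\cdot\|_\infty}$ means that for each $\varepsilon > 0$
there is a $g \in c_{00}(S, \mathbb{F})$ with $\|f - g\|_\infty < \varepsilon$. But this means $|f(s)| < \varepsilon$ for all $s \in S \setminus F$, where
$F = \mathrm{supp}(g)$ is finite. Thus $f \in c_0(S, \mathbb{F})$.
(ii) Being the closure of $c_{00}(S, \mathbb{F})$ in $\ell^\infty(S, \mathbb{F})$, $c_0(S, \mathbb{F})$ is closed, thus complete by completeness
of $\ell^\infty(S, \mathbb{F})$, cf. Lemmas 4.8 and 3.21.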
While the finitely supported functions are not dense in $(\ell^\infty(S, \mathbb{F}), \|\cdot\|_\infty)$ (for infinite $S$), the
finite-image functions are:
4.13 Lemma The set $\{f : S \to \mathbb{F} \mid \#f(S) < \infty\}$ of functions assuming only finitely many
values, equivalently, the set of finite linear combinations $\sum_{k=1}^{K} c_k \chi_{A_k}$ of characteristic functions,
is dense in $(\ell^\infty(S, \mathbb{F}), \|\cdot\|_\infty)$.
Proof. We prove this for $\mathbb{F} = \mathbb{R}$, from which the case $\mathbb{F} = \mathbb{C}$ is easily deduced. Let $f \in \ell^\infty(S, \mathbb{F})$
and $\varepsilon > 0$. For $k \in \mathbb{Z}$ define $A_k = f^{-1}([k\varepsilon, (k + 1)\varepsilon))$. Define $K = \lceil \|f\|_\infty / \varepsilon \rceil + 1$ and $g =
\varepsilon \sum_{|k| \le K} k \chi_{A_k}$. Then $g$ has finite image and $\|f - g\|_\infty \le \varepsilon$.
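The construction in this proof is a rounding procedure and can be written down in a few lines. The sketch assumes a real-valued bounded $f$ on a finite sample set $S = \{0, \ldots, 999\}$, stored as a dictionary; the function name `quantize` is hypothetical.

```python
import math


def quantize(f, eps):
    """Return g with g(s) = k*eps, where k is the integer with f(s) in [k*eps, (k+1)*eps)."""
    return {s: math.floor(v / eps) * eps for s, v in f.items()}


f = {s: math.sin(s) for s in range(1000)}   # a bounded real function on S
eps = 0.1
g = quantize(f, eps)

sup_error = max(abs(f[s] - g[s]) for s in f)
print(len(set(g.values())), sup_error)  # few distinct values, error at most eps
```

The number of distinct values of `g` is bounded by $2K + 1$ with $K$ as in the proof, independently of the size of $S$.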
4.16 Exercise Let $1 < p \le \infty$ and $V = \ell^p(\mathbb{N}, \mathbb{R})$. Define $\delta_n \in V$ by $\delta_n(m) = \delta_{n,m}$ and
$x_n = n^{-\alpha} \delta_n$, where $\alpha > 0$.
(i) For which $\alpha > 0$ is $\sum_{n=1}^{\infty} x_n$ absolutely convergent?
(ii) For which $\alpha > 0$ is $\sum_{n=1}^{\infty} x_n$ unconditionally convergent? (Cf. Remark A.5.)
(iii) Use (i),(ii) to give examples of series that are unconditionally convergent, but not
absolutely convergent, in each $\ell^p(\mathbb{N}, \mathbb{F})$ with $1 < p \le \infty$.
(iv) BONUS: As (iii), but for $p = 1$. (NB: Just invoking Corollary B.3 is not enough!)
and since $\varepsilon > 0$ was arbitrary, $Y \subseteq \ell^p(S, \mathbb{R})$ is dense.
For the converse, assume that $S$ is uncountable. By Proposition A.2(iii), $\mathrm{supp}(f)$ is countable
for every $f \in \ell^p(S, \mathbb{R})$. Thus if $Y \subseteq \ell^p(S, \mathbb{R})$ is countable then $T = \bigcup_{f\in Y} \mathrm{supp}(f) \subseteq S$ is a
countable union of countable sets and therefore countable. Thus all functions $f \in Y$ vanish on
$S \setminus T \neq \emptyset$, and the same holds for $f \in \overline{Y}$ since the coordinate maps $f \mapsto f(s)$ are continuous in
view of $|f(s)| \le \|f\|$. Thus $Y$ cannot be dense.
Recall that the dual space $V^* = \{\varphi : V \to \mathbb{F} \text{ linear} \mid \|\varphi\| < \infty\}$ is a Banach space with norm
$\|\varphi\|$. The aim of this section is to concretely identify $\ell^p(S, \mathbb{F})^*$ for $1 \le p < \infty$ and $c_0(S, \mathbb{F})^*$.
(We will have something to say about $\ell^\infty(S, \mathbb{F})^*$, but the complete story would lead us too far.)
For the purpose of the following proof, it will be useful to define $\mathrm{sgn} : \mathbb{C} \to \mathbb{C}$ by $\mathrm{sgn}(0) = 0$
and $\mathrm{sgn}(z) = z/|z|$ otherwise. Then $z = \mathrm{sgn}(z)|z|$ and $|z| = \overline{\mathrm{sgn}(z)}\, z$ for all $z \in \mathbb{C}$.
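4.19 Theorem (i) Let $p \in [1, \infty]$ with conjugate value $q$. Then for each $g \in \ell^q(S, \mathbb{F})$ the map
$\varphi_g : \ell^p(S, \mathbb{F}) \to \mathbb{F}$, $f \mapsto \sum_{s\in S} f(s)g(s)$ satisfies $\|\varphi_g\| \le \|g\|_q$, thus $\varphi_g \in \ell^p(S, \mathbb{F})^*$. And the
map $\iota : \ell^q(S, \mathbb{F}) \to \ell^p(S, \mathbb{F})^*$, $g \mapsto \varphi_g$, called the canonical map, is linear with $\|\iota\| \le 1$.
(ii) For all $1 \le p \le \infty$ the canonical map $\ell^q(S, \mathbb{F}) \to \ell^p(S, \mathbb{F})^*$ is isometric.
(iii) If $1 \le p < \infty$, the canonical map $\ell^q(S, \mathbb{F}) \to \ell^p(S, \mathbb{F})^*$ is surjective, thus $\ell^p(S, \mathbb{F})^* \cong
\ell^q(S, \mathbb{F})$.
(iv) The canonical map $\ell^1(S, \mathbb{F}) \to c_0(S, \mathbb{F})^*$ is an isometric bijection, thus $c_0(S, \mathbb{F})^* \cong \ell^1(S, \mathbb{F})$.
(v) If $S$ is finite, the canonical map $\ell^1(S, \mathbb{F}) \to \ell^\infty(S, \mathbb{F})^*$ is surjective. But its image is a
proper closed subspace of $\ell^\infty(S, \mathbb{F})^*$ whenever $S$ is infinite.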
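Proof. (i) For all $p \in [1, \infty]$ and conjugate $q$ we have
\[ \Big| \sum_{s\in S} f(s)g(s) \Big| \le \sum_{s\in S} |f(s)g(s)| \le \|f\|_p \|g\|_q < \infty \quad \forall f \in \ell^p, g \in \ell^q \]
by Hölder’s inequality. In either case, the absolute convergence for all $f, g$ implies that $(f, g) \mapsto
\sum_s f(s)g(s)$ is bilinear.
(ii) If $\|g\|_\infty \neq 0$ and $\varepsilon > 0$ there is an $s \in S$ with $|g(s)| > \|g\|_\infty - \varepsilon$. If $f = \delta_s : t \mapsto \delta_{s,t}$, we
have $|\varphi_g(f)| = |g(s)| > \|g\|_\infty - \varepsilon$. Since $\|f\|_1 = 1$, this proves $\|\varphi_g\| > \|g\|_\infty - \varepsilon$. Since $\varepsilon > 0$
was arbitrary, we have $\|\varphi_g\| \ge \|g\|_\infty$.
If $\|g\|_1 \neq 0$, define $f(s) = \overline{\mathrm{sgn}(g(s))}$. Then $\|f\|_\infty = 1$ and $\sum_s f(s)g(s) = \sum_s |g(s)| = \|g\|_1$.
This proves $\|\varphi_g\| \ge \|g\|_1$.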
If $1 < p, q < \infty$ and $\|g\|_q \neq 0$, define $f(s) = \overline{\mathrm{sgn}(g(s))}\, |g(s)|^{q-1}$. Then
\[ \sum_s f(s)g(s) = \sum_s |g(s)|^q = \|g\|_q^q, \]
\[ \|f\|_p^p = \sum_s |f(s)|^p = \sum_{s,\, g(s) \neq 0} |g(s)|^{(q-1)p} = \sum_s |g(s)|^q = \|g\|_q^q, \]
\[ \|\varphi_g\| \ge \frac{|\sum_s f(s)g(s)|}{\|f\|_p} = \frac{\|g\|_q^q}{\|f\|_p} = \frac{\|g\|_q^q}{\|g\|_q^{q/p}} = \|g\|_q^{q(1 - 1/p)} = \|g\|_q. \]
We thus have proven $\|\varphi_g\| \ge \|g\|_q$ in all cases and since the opposite inequality is known
from (i), $g \mapsto \varphi_g$ is isometric.
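The extremal function used for $1 < p, q < \infty$ can be tested on a finitely supported complex $g$. The sketch below assumes $q = 3$ (so $p = 3/2$) and a four-point set $S$; it uses the complex conjugate of $\mathrm{sgn}$ so that $f(s)g(s) = |g(s)|^q$. The function names are hypothetical.

```python
def sgn(z):
    """sgn(0) = 0 and sgn(z) = z/|z| otherwise."""
    return 0j if z == 0 else z / abs(z)


def p_norm(values, p):
    """(sum |v|^p)^(1/p) for a finite list of complex numbers."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def extremal(g, q):
    """f(s) = conj(sgn(g(s))) * |g(s)|^(q-1), the test function from the proof."""
    return [sgn(z).conjugate() * abs(z) ** (q - 1.0) for z in g]


q = 3.0
p = q / (q - 1.0)                 # conjugate exponent, here 1.5
g = [3 + 4j, -1j, 0j, 2 + 0j]
f = extremal(g, q)

pairing = sum(a * b for a, b in zip(f, g))        # this is phi_g(f)
print(abs(pairing) / p_norm(f, p), p_norm(g, q))  # the two numbers agree up to rounding
```

This confirms on an example that the bound $\|\varphi_g\| \le \|g\|_q$ from (i) is attained.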
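(iii) Let $0 \neq \varphi \in \ell^1(S, \mathbb{F})^*$. Define $g : S \to \mathbb{F}$ by $g(s) = \varphi(\delta_s)$. With $\|\delta_s\|_1 = 1$, we
have $|g(s)| = |\varphi(\delta_s)| \le \|\varphi\|$ for all $s \in S$, thus $\|g\|_\infty \le \|\varphi\|$. If $f \in \ell^1(S, \mathbb{F})$ and $F \subseteq S$ is
finite, we have $\varphi(f\chi_F) = \varphi(\sum_{s\in F} f(s)\delta_s) = \sum_{s\in F} f(s)g(s)$. In the limit $F \nearrow S$ this becomes
$\varphi(f) = \sum_{s\in S} f(s)g(s) = \varphi_g(f)$ (since $fg \in \ell^1$, thus the r.h.s. is absolutely convergent, and
$\|f(1 - \chi_F)\|_1 \to 0$ and $\varphi$ is $\|\cdot\|_1$-continuous). This proves $\varphi = \varphi_g$ with $g \in \ell^\infty(S, \mathbb{F})$.
Now let $1 < p, q < \infty$, and let $0 \neq \varphi \in \ell^p(S, \mathbb{F})^*$. Since $\ell^1(S, \mathbb{F}) \subseteq \ell^p(S, \mathbb{F})$ by Lemma
4.10, we can restrict $\varphi$ to an element of $\ell^1(S, \mathbb{F})^*$, and the preceding argument gives a $g \in \ell^\infty(S, \mathbb{F})$ such that
$\varphi(f) = \sum_{s\in S} f(s)g(s)$ for all $f \in \ell^1(S, \mathbb{F})$. The arguments in the proof of (ii) also show that for
$1 < p, q < \infty$ and any function $g : S \to \mathbb{F}$ we have
\[ \|g\|_q = \sup \Big\{ \Big| \sum_{s\in S} f(s)g(s) \Big| \ \Big|\ f \in c_{00}(S, \mathbb{F}),\ \|f\|_p \le 1 \Big\}. \]
Using this and $\varphi(f) = \sum_s f(s)g(s)$ for all $f \in c_{00}(S, \mathbb{F})$ we have
\[ \|g\|_q = \sup \{ |\varphi(f)| \mid f \in c_{00}(S, \mathbb{F}),\ \|f\|_p \le 1 \} \le \|\varphi\| < \infty, \]
so that $g \in \ell^q(S, \mathbb{F})$.
Now $\varphi(f) = \sum_{s\in S} f(s)g(s) = \varphi_g(f)$ for all $f \in \ell^p(S, \mathbb{F})$ follows as before from $fg \in \ell^1$ and
$\|f(1 - \chi_F)\|_p \to 0$ as $F \nearrow S$ and the $\|\cdot\|_p$-continuity of $\varphi$.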
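(iv) Let $0 \neq g \in \ell^1(S, \mathbb{F})$. Then $\varphi_g \in \ell^\infty(S, \mathbb{F})^*$, which we can restrict to $c_0(S, \mathbb{F})$. For
finite $F \subseteq S$ define $f_F = f\chi_F$ with $f(s) = \overline{\mathrm{sgn}(g(s))}$. Then $f_F \in c_{00}(S, \mathbb{F})$ with $\|f_F\|_\infty = 1$
(provided $F \cap \mathrm{supp}\, g \neq \emptyset$) and $\varphi_g(f_F) = \sum_{s\in F} |g(s)|$. Thus $\|\varphi_g\| \ge \sum_{s\in F} |g(s)|$ for all finite $F$
intersecting $\mathrm{supp}\, g$, and this implies $\|\varphi_g\| \ge \|g\|_1$. The opposite being known, we have proven
that $\ell^1(S, \mathbb{F}) \to c_0(S, \mathbb{F})^*$ is isometric.
To prove surjectivity, let $0 \neq \varphi \in c_0(S, \mathbb{F})^*$ and define $g : S \to \mathbb{F}$, $s \mapsto \varphi(\delta_s)$. If now
$f \in c_0(S, \mathbb{F})$ and $F \subseteq S$ is finite, we have $f\chi_F = \sum_{s\in F} f(s)\delta_s$, thus $\varphi(f\chi_F) = \sum_{s\in F} f(s)g(s)$.
In particular with $f(s) = \overline{\mathrm{sgn}(g(s))}$ we have $\varphi(f\chi_F) = \sum_{s\in F} f(s)g(s) = \sum_{s\in F} |g(s)|$. Again
we have $\|f\chi_F\|_\infty \le \|f\|_\infty = 1$, thus $|\varphi(f\chi_F)| \le \|\varphi\|$, and combining these observations gives
$\|g\|_1 \le \|\varphi\| < \infty$, thus $g \in \ell^1(S, \mathbb{F})$. As $F \nearrow S$, we have $\|f(1 - \chi_F)\|_\infty = \|f\chi_{S\setminus F}\|_\infty \to 0$ since
$f \in c_0$, thus with $\|\cdot\|_\infty$-continuity of $\varphi$
\[ \varphi(f) = \lim_{F \nearrow S} \varphi(f\chi_F) = \lim_{F \nearrow S} \sum_{s\in F} f(s)g(s) = \sum_{s\in S} f(s)g(s) = \varphi_g(f), \]
where we again used $fg \in \ell^1$. Thus $\varphi = \varphi_g$, so that $\ell^1(S, \mathbb{F}) \to c_0(S, \mathbb{F})^*$ is an isometric bijection.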
(v) It is clear that ι : `1 (S, F) → `∞ (S, F)∗ is surjective if S is finite. Closedness of the
image of ι always follows from the completeness of `1 (S, F) and the fact that ι is an isometry,
cf. Corollary 3.23. The failure of surjectivity is deeper than the results of this section so far, so
that it is illuminating to give two proofs.
First proof: If $S$ is infinite, the closed subspace $c_0(S,\mathbb{F}) \subseteq \ell^\infty(S,\mathbb{F})$ is proper since $1 \in \ell^\infty(S,\mathbb{F}) \setminus c_0(S,\mathbb{F})$. Thus the quotient space $Z = \ell^\infty(S,\mathbb{F})/c_0(S,\mathbb{F})$ is non-trivial. In Section 6.1
we will show that $Z$ is a Banach space, thus admits non-zero bounded linear maps $\psi : Z \to \mathbb{F}$
by the Hahn-Banach theorem (Section 9), and that the quotient map P : `∞ (S, F) → Z is
bounded. Thus ϕ = ψ ◦ P is a non-zero bounded linear functional on `∞ (S, F) that vanishes on
the closed subspace c0 (S, F). By (iv), the canonical map `1 (S, F) → c0 (S, F)∗ is isometric, thus
ϕg with g ∈ `1 (S, F) vanishes identically on c0 (S, F) if and only if g = 0. Thus ϕ 6= ϕg for all
g ∈ `1 (S, F).
Second proof: (This proof uses no (as yet) unproven results from functional analysis, but it does use the Stone-Čech compactification from general topology. Cf. Appendix A.6.6 and [108].) Since $S$ is discrete, $\ell^\infty(S,\mathbb{F}) = C_b(S,\mathbb{F}) \cong C(\beta S,\mathbb{F})$, where $\beta S$ is the Stone-Čech compactification of $S$. The isomorphism is given by the unique continuous extension $C_b(S,\mathbb{F}) \to C(\beta S,\mathbb{F}),\ f \mapsto \hat{f}$ with the restriction map $C(\beta S,\mathbb{F}) \to C_b(S,\mathbb{F})$ as inverse. Since $S$ is discrete and infinite, thus non-compact, $\beta S \neq S$. If $f \in c_0(S,\mathbb{F})$ then $\hat{f}(x) = 0$ for every $x \in \beta S \setminus S$. (Proof: Let $x \in \beta S \setminus S$. Since $\overline{S} = \beta S$, we can find a net $\{x_\iota\}$ in $S$ such that $x_\iota \to x$. Since $x \notin S$, $x_\iota$ leaves every finite subset of $S$. Now $f \in c_0(S)$ and continuity of $\hat{f}$ imply $\hat{f}(x) = \lim \hat{f}(x_\iota) = \lim f(x_\iota) = 0$.)
Thus for such an x, the evaluation map ψx : C(βS, F) → F, fb 7→ fb(x) gives rise to a non-zero
bounded linear functional (in fact character) ϕ(f ) = fb(x) on Cb (S, F) = `∞ (S, F) that vanishes
on c0 (S, F). Now we conclude as in the first proof that ϕ 6= ϕg for all g ∈ `1 (S, F).
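For finite $S$ the norm computation in (iv) can be checked directly. The following is a minimal sketch in plain Python, assuming a real-valued $g$ on a four-element set; the names `phi`, `sgn`, `norm1_g` are illustrative and not part of the notes. It evaluates $\varphi_g$ on $f = \mathrm{sgn}(g)$, which has sup-norm 1, and recovers $\|g\|_1$.

```python
def phi(g, f):
    """The functional phi_g(f) = sum_s f(s) g(s), for functions given as dicts on a finite set S."""
    return sum(f[s] * g[s] for s in g)

def sgn(t):
    """Sign of a real number (real case only; an assumption of this sketch)."""
    return (t > 0) - (t < 0)

# A real-valued g on the finite set S = {a, b, c, d}.
g = {"a": 2.0, "b": -3.0, "c": 0.5, "d": -1.5}
f = {s: sgn(v) for s, v in g.items()}     # f = sgn(g), so f(s) g(s) = |g(s)|

norm1_g = sum(abs(v) for v in g.values())  # ||g||_1
sup_f = max(abs(v) for v in f.values())    # ||f||_infty
value = phi(g, f)                          # phi_g(f)
```

Since `sup_f` equals 1 and `value` equals `norm1_g`, the lower bound $\|\varphi_g\| \ge \|g\|_1$ is attained in this finite example.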
4.20 Remark 1. The two proofs given above for the non-surjectivity of the canonical map
`1 (S, F) → `∞ (S, F)∗ for infinite S are both non-constructive: The first used the Hahn-Banach
theorem, which we will prove using Zorn’s lemma, equivalent to AC. The second used the Stone-
Čech compactification βS whose usual construction relies on Tychonov’s theorem, which also
is equivalent to the axiom of choice. (But both the Hahn-Banach theorem and the existence
of the Stone-Čech compactification can be proven using only the ‘ultrafilter lemma’, which is
strictly weaker than AC. For Hahn-Banach see Appendix B.5, for Stone-Čech e.g. [149, 108].)
2. The dual space of `∞ (S, F) can be determined quite explicitly, but it is not a space of
functions on S as are the spaces c0 (S, F)∗ and `p (S, F)∗ . It is the space ba(S, F) of ‘finitely
additive F-valued measures on S’. A discussion of this can be found in the supplementary
Section B.3.2.
3. The non-constructiveness mentioned above is unavoidable: There are set theoretic frameworks without the ultrafilter lemma (but with $DC_\omega$) in which $\ell^\infty(\mathbb{N})^* \cong \ell^1(\mathbb{N})$, see [149, §23.10]. (In this situation, all finitely additive measures on $\mathbb{N}$ are countably additive!)
4. For all $p \in (0,1)$, the dual space $\ell^p(S,\mathbb{F})^*$ equals $\{\varphi_g \mid g \in \ell^\infty(S,\mathbb{F})\} = \ell^1(S,\mathbb{F})^*$. See [108, Appendix F.6]. Thus there is no $p$-dependence despite the fact that the $\ell^p(S,\mathbb{F})$ are mutually non-isomorphic! □
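4.6 The Banach algebras $(\ell^\infty(S,\mathbb{F}), \cdot)$ and $(\ell^1(\mathbb{Z},\mathbb{F}), \star)$
4.21 Definition If $f, g \in \ell^\infty(S,\mathbb{F})$ we define $f \cdot g$ by $(f \cdot g)(s) = f(s)g(s)$ (pointwise product).
If $f, g \in \ell^1(\mathbb{Z},\mathbb{F})$ we define the 'convolution product' $f \star g$ by
\[ (f \star g)(n) = \sum_{m\in\mathbb{Z}} f(m)g(n-m) = \sum_{\substack{k,l\in\mathbb{Z} \\ k+l=n}} f(k)g(l). \tag{4.5} \]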
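4.22 Lemma (i) If $f, g \in \ell^\infty(S,\mathbb{F})$ then $\|f \cdot g\|_\infty \le \|f\|_\infty \|g\|_\infty$, thus $f \cdot g \in \ell^\infty(S,\mathbb{F})$.
(ii) If $f, g \in \ell^1(\mathbb{Z},\mathbb{F})$ then $\|f \star g\|_1 \le \|f\|_1 \|g\|_1$, thus $f \star g \in \ell^1(\mathbb{Z},\mathbb{F})$.
(iii) The maps $\cdot : \ell^\infty(S,\mathbb{F}) \times \ell^\infty(S,\mathbb{F}) \to \ell^\infty(S,\mathbb{F})$ and $\star : \ell^1(\mathbb{Z},\mathbb{F}) \times \ell^1(\mathbb{Z},\mathbb{F}) \to \ell^1(\mathbb{Z},\mathbb{F})$ are bilinear, commutative and associative. A unit for $\cdot$ is the constant function $1 \in \ell^\infty(S,\mathbb{F})$, while $\delta_0(n) = \delta_{n,0}$ is a unit for $\star$. Thus $(\ell^\infty(S,\mathbb{F}), \cdot, 1)$ and $(\ell^1(\mathbb{Z},\mathbb{F}), \star, \delta_0)$ are commutative unital Banach algebras with $\|1\| = 1$.
(iv) $c_0(S,\mathbb{F}) \subseteq \ell^\infty(S,\mathbb{F})$ is a closed two-sided ideal.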
Proof. (i) If $f, g \in \ell^\infty(S,\mathbb{F})$ then $\sup_{s\in S} |f(s)g(s)| \le \sup_{s\in S} |f(s)| \, \sup_{s\in S} |g(s)| < \infty$.
(ii) The second claim clearly follows from the first, which is seen by
\[ \|f \star g\|_1 = \sum_{n\in\mathbb{Z}} \Big| \sum_{m\in\mathbb{Z}} f(m)g(n-m) \Big| \le \sum_{n\in\mathbb{Z}} \sum_{m\in\mathbb{Z}} |f(m)g(n-m)| = \sum_m |f(m)| \sum_n |g(n-m)| = \sum_n |f(n)| \sum_m |g(m)| = \|f\|_1 \|g\|_1. \]
(iii) Bilinearity of both maps is obvious, as are commutativity and associativity of $\cdot$. Commutativity of $\star$ is clear from the rightmost expression in (4.5). The latter is easily seen to imply
\[ ((f \star g) \star h)(n) = \sum_{\substack{k,l,m\in\mathbb{Z} \\ k+l+m=n}} f(k)g(l)h(m) = (f \star (g \star h))(n), \]
thus associativity of $\star$. The statements about units are easy, (i),(ii) give submultiplicativity of the norms, and completeness was proven earlier. In both cases it is obvious that $\|1\| = 1$.
(iv) Since $c_0(S,\mathbb{F}) \subseteq \ell^\infty(S,\mathbb{F})$ is a closed linear subspace, it remains to show that $f \in c_0(S,\mathbb{F})$, $g \in \ell^\infty(S,\mathbb{F})$ implies $fg, gf \in c_0(S,\mathbb{F})$. This is obvious.
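The convolution product (4.5) and the estimate in (ii) can be tried out on finitely supported sequences. The following is a small sketch in plain Python; representing a sequence as a dict from integers to values, and the names `conv`, `norm1`, `delta0`, are choices made here for illustration only.

```python
from collections import defaultdict

def conv(f, g):
    """Convolution (f * g)(n) = sum over k + l = n of f(k) g(l), for finitely supported f, g on Z."""
    h = defaultdict(float)
    for k, a in f.items():
        for l, b in g.items():
            h[k + l] += a * b
    return dict(h)

def norm1(f):
    """The l^1 norm of a finitely supported sequence."""
    return sum(abs(v) for v in f.values())

f = {0: 1.0, 1: -1.0}
g = {0: 1.0, 1: 1.0}
delta0 = {0: 1.0}        # the unit delta_0

fg = conv(f, g)          # cancellation at n = 1, so the norm inequality is strict here
gf = conv(g, f)
```

In this example `fg` and `gf` coincide, `conv(f, delta0)` returns `f`, and `norm1(fg)` is 2 while `norm1(f) * norm1(g)` is 4, so the inequality in (ii) can be strict.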
4.23 Remark 1. For $1 < p < \infty$ there is no natural way of defining a bilinear product on $\ell^p(S,\mathbb{F})$ turning it into a Banach algebra.
2. If $G$ is a discrete group, the definition of $\ell^1(\mathbb{Z},\mathbb{F})$ is easily adapted to $\ell^1(G,\mathbb{F})$ by putting $(f \star g)(k) = \sum_{l\in G} f(l)g(l^{-1}k)$. This again gives rise to a Banach algebra, but it is commutative if and only if $G$ is abelian. □
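The failure of commutativity for non-abelian $G$ is already visible for the symmetric group on three letters. The following is a hedged sketch in plain Python; encoding permutations of {0, 1, 2} as tuples and the helper names `compose` and `conv` are assumptions of this illustration.

```python
from collections import defaultdict

def compose(p, q):
    """Product of permutations given as tuples: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def conv(f, g):
    """(f * g)(k) = sum_l f(l) g(l^{-1} k), computed as a sum over pairs (l, m) with l m = k."""
    h = defaultdict(float)
    for l, a in f.items():
        for m, b in g.items():
            h[compose(l, m)] += a * b
    return dict(h)

s = (1, 0, 2)            # transposition swapping 0 and 1
t = (0, 2, 1)            # transposition swapping 1 and 2
delta_s = {s: 1.0}       # point mass at s
delta_t = {t: 1.0}       # point mass at t

st = conv(delta_s, delta_t)   # point mass at the product s t
ts = conv(delta_t, delta_s)   # point mass at the product t s
```

Here `st` and `ts` are point masses at two different 3-cycles, so $\delta_s \star \delta_t \neq \delta_t \star \delta_s$.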
Now $\mathcal{L}^p(X,\mu) = \{f : X \to \mathbb{F} \text{ measurable} \mid \|f\|_p < \infty\}$ is an $\mathbb{F}$-vector space for all $p \in (0,\infty]$. For $1 \le p \le \infty$, the proofs of the inequalities of Hölder and Minkowski extend to the present setting without any difficulties, so that the $\|\cdot\|_p$ are seminorms on $\mathcal{L}^p(X,\mathcal{A},\mu)$. But $\|\cdot\|_p$ fails to be a norm whenever there exists $\emptyset \neq Y \in \mathcal{A}$ with $\mu(Y) = 0$ since then $\|\chi_Y\|_p = 0$.
26
Warning: [29] defines k · k∞ using locally null sets instead of null sets, which is very non-standard.
For this reason we define $L^p(X,\mathcal{A},\mu) = \mathcal{L}^p(X,\mathcal{A},\mu)/\{f \mid \|f\|_p = 0\}$. Now it is straightforward to prove that $L^p(X,\mu) = \mathcal{L}^p(X,\mu)/\!\sim$ is a normed space, and in fact complete. The proof now uses Proposition 3.15. If $S$ is a set and $\mu$ is the counting measure, we have $\ell^p(S,\mathbb{F}) = \mathcal{L}^p(S,P(S),\mu,\mathbb{F}) = L^p(S,P(S),\mu,\mathbb{F})$.
A measurable function is called simple if it assumes only finitely many values. Equivalently it is of the form $f(x) = \sum_{k=1}^K c_k \chi_{A_k}(x)$, where $A_1, \ldots, A_K$ are measurable sets. Now one proves
that the simple functions are dense in Lp for all p ∈ [1, ∞]. If X is locally compact and µ is nice
enough, the set Cc (X, F) of compactly supported continuous functions is dense in Lp (X, A, µ; F)
for 1 ≤ p < ∞, while its closure in L∞ is C0 (X, F).
The inclusion `p ⊆ `q for p ≤ q (Lemma 4.10) is false for general measure spaces! In fact,
if µ(X) < ∞ then one has the reverse inclusion p ≤ q ⇒ Lq (X, A, µ) ⊆ Lp (X, A, µ), while for
general measure spaces there is no inclusion relation between the Lp with different p.
If $1 < p, q < \infty$ are conjugate, the canonical map $\varphi : L^q(X,\mathcal{A},\mu) \to L^p(X,\mathcal{A},\mu)^*$ is an isometric bijection for all measure spaces. That $\varphi$ is an isometry is proven just as for the spaces $\ell^p$: Hölder's inequality gives $\|\varphi_g\| \le \|g\|_q$, and equality is proven as in Theorem 4.19(ii) by showing $|\varphi_g(f)| \ge \|f\|_p \|g\|_q$, where the $f \in L^p$ are the same as before.
However, isometry of $L^\infty(X,\mathcal{A},\mu) \to (L^1(X,\mathcal{A},\mu))^*$ is not automatic, as the measure space $X = \{x\}$, $\mathcal{A} = P(X) = \{\emptyset, X\}$ and $\mu : \emptyset \mapsto 0,\ X \mapsto +\infty$ shows, for which $L^1(X,\mathcal{A},\mu,\mathbb{F}) \cong \{0\}$ and $\mathbb{F} \cong L^\infty(X,\mathcal{A},\mu,\mathbb{F}) \not\cong L^1(X,\mathcal{A},\mu,\mathbb{F})^*$. It is not hard to show that $L^\infty \to (L^1)^*$ is isometric if and only if $(X,\mathcal{A},\mu)$ is semifinite, i.e.
µ(Y ) = sup{µ(Z) | Z ∈ A, Z ⊆ Y, µ(Z) < ∞} ∀Y ∈ A.
If 1 < p < ∞, one still has surjectivity of Lp → (Lq )∗ for all measure spaces (X, A, µ),
but the standard proof is outside our scope since it requires the Radon-Nikodym theorem.
(For a more functional-analytic proof see Section B.6.8.) In order for L∞ → (L1 )∗ to be an
isometric bijection, the measure space must be ‘localizable’, cf. [146]. This condition subsumes
semifiniteness and is implied by σ-finiteness, to which case many books limit themselves.
Since we relegated the dual spaces `∞ (S, F)∗ to an appendix, we only remark that also
in general L∞ (X, A, µ)∗ is a space of finitely additive measures with fairly similar proofs, see
[43]. For 0 < p < 1, the dual spaces (Lp )∗ behave even stranger than (`p )∗ . For example,
Lp ([0, 1], λ; R)∗ = {0}.
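5.1 Definition Let $V$ be an $\mathbb{F}$-vector space. An inner product on $V$ is a map $V \times V \to \mathbb{F},\ (x,y) \mapsto \langle x,y\rangle$ such that
(i) The map $x \mapsto \langle x,y\rangle$ is linear for each choice of $y \in V$.
(ii) $\langle y,x\rangle = \overline{\langle x,y\rangle}$ $\forall x,y \in V$.
(iii) $\langle x,x\rangle \ge 0$ $\forall x$, and $\langle x,x\rangle = 0 \Rightarrow x = 0$.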
5.2 Remark 1. Many authors write $(x,y)$ instead of $\langle x,y\rangle$, but this often leads to confusion with the notation for ordered pairs. We will use pointed brackets throughout.
2. Combining the first two axioms one finds that the map $y \mapsto \langle x,y\rangle$ is anti-linear for each choice of $x$. This means $\langle x, cy + c'y'\rangle = \overline{c}\langle x,y\rangle + \overline{c'}\langle x,y'\rangle$ for all $y, y' \in V$ and $c, c' \in \mathbb{F}$. Of course this reduces to linearity if $\mathbb{F} = \mathbb{R}$. A map $V \times V \to \mathbb{C}$ that is linear in the first variable and anti-linear in the second is called sesquilinear.
3. A large minority of authors, mostly (mathematical) physicists, define inner products to be linear in the second and anti-linear in the first argument. We follow the majority practice.
4. If $\mathbb{F} = \mathbb{R}$ then $\langle y,x\rangle = \overline{\langle x,y\rangle} = \langle x,y\rangle$ $\forall x,y$. Thus $\langle\cdot,\cdot\rangle$ is bilinear and symmetric.
5. The first two axioms together already imply $\langle x,x\rangle \in \mathbb{R}$ for all $x$, but not the positivity assumption.
6. If $\langle x,y\rangle = 0$ for all $y \in V$ then $x = 0$. To see this, it suffices to take $y = x$. □
5.3 Example 1. If $V = \mathbb{C}^n$ then $\langle x,y\rangle = \sum_{i=1}^n x_i \overline{y_i}$ is an inner product and the corresponding norm (see below) is $\|\cdot\|_2$, which is complete.
2. Let $S$ be any set and $V = \ell^2(S,\mathbb{C})$. Then $\langle f,g\rangle = \sum_{s\in S} f(s)\overline{g(s)}$ converges for all $f, g \in V$ by Hölder's inequality and is easily seen to be an inner product. Of course, 1. is a special case of 2.
3. If $(X,\mathcal{A},\mu)$ is any measure space then $\langle f,g\rangle = \int_X f(x)\overline{g(x)}\, d\mu(x)$ is an inner product on $L^2(X,\mathcal{A},\mu;\mathbb{F})$ turning it into a Hilbert space. (Here we allow ourselves a standard sloppiness: The elements of $L^p$ are not functions, but equivalence classes of functions. The inner product of two such classes is defined by picking arbitrary representatives.)
4. Let $V = M_{n\times n}(\mathbb{C})$. For $a, b \in V$, define $\langle a,b\rangle = \mathrm{Tr}(b^*a) = \sum_{i,j=1}^n a_{ij}\overline{b_{ij}}$, where $(b^*)_{ij} = \overline{b_{ji}}$. That this is an inner product turning $V$ into a Hilbert space follows from 1. upon the identification $M_{n\times n}(\mathbb{C}) \cong \mathbb{C}^{n^2}$.
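The axioms of Definition 5.1 for the standard inner product on $\mathbb{C}^n$ can be verified on concrete vectors. The following is a minimal sketch in plain Python, using the convention of these notes (linear in the first, anti-linear in the second argument); the function name `inner` and the sample vectors are illustrative.

```python
def inner(x, y):
    """<x, y> = sum_i x_i * conj(y_i): linear in x, anti-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -2j]
c = 0.5 - 1.5j

cx = [c * a for a in x]   # the vector c x
cy = [c * b for b in y]   # the vector c y

sym_defect = abs(inner(y, x) - inner(x, y).conjugate())          # axiom (ii)
lin_defect = abs(inner(cx, y) - c * inner(x, y))                 # axiom (i)
anti_defect = abs(inner(x, cy) - c.conjugate() * inner(x, y))    # anti-linearity, Remark 5.2.2
```

All three defects vanish up to rounding, and `inner(x, x)` is the positive real number 15.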
In view of $\langle x,x\rangle \ge 0$ for all $x$, the square root $\langle x,x\rangle^{1/2}$ is well defined, and we agree that it always is the non-negative root.
3. Use 2. to prove $\|x\|^2 = \|x_1\|^2 + \|x_2\|^2 \ge \|x_1\|^2$.
4. Deduce Cauchy-Schwarz from $\|x_1\|^2 \le \|x\|^2$.
5. Prove the claim about equality.
The above proof is the easiest to memorize (at least in outline) and reconstruct, but there
are many others, e.g.:
5.6 Exercise Let $V$ be a vector space with inner product $\langle\cdot,\cdot\rangle$ and define $\|x\| = \langle x,x\rangle^{1/2}$.
(i) For $x, y \in V$ and $t \in \mathbb{R}$, define $P(t) = \|x + ty\|^2$ and show this defines a quadratic polynomial in $t$ with real coefficients.
(ii) Use the obvious fact that this polynomial takes values in $[0,\infty)$ for all $t \in \mathbb{R}$, thus also $\inf_{t\in\mathbb{R}} P(t) \ge 0$, to prove the Cauchy-Schwarz inequality.
5.7 Proposition If $\langle\cdot,\cdot\rangle$ is an inner product on $V$ then $\|x\| = +\sqrt{\langle x,x\rangle}$ is a norm on $V$.
(An inner product $\langle\cdot,\cdot\rangle$ and a norm $\|\cdot\|$ related in this way are called compatible.)
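Proof. $\|x\| \ge 0$ holds by construction, and the third axiom in Definition 5.1 implies $\|x\| = 0 \Rightarrow x = 0$. We have
\[ \|cx\| = \sqrt{\langle cx,cx\rangle} = \sqrt{c\overline{c}\langle x,x\rangle} = \sqrt{|c|^2\langle x,x\rangle} = |c|\|x\|, \]
thus $\|cx\| = |c|\|x\|$ for all $x \in V, c \in \mathbb{F}$. Finally,
\[ \|x+y\|^2 = \langle x+y,x+y\rangle = \langle x,x\rangle + \langle x,y\rangle + \langle y,x\rangle + \langle y,y\rangle = \|x\|^2 + \|y\|^2 + \langle x,y\rangle + \langle y,x\rangle. \]
Since $\langle x,y\rangle + \langle y,x\rangle = 2\,\mathrm{Re}\langle x,y\rangle \le 2|\langle x,y\rangle| \le 2\|x\|\|y\|$ by the Cauchy-Schwarz inequality, we obtain
\[ \|x+y\|^2 \le \|x\|^2 + \|y\|^2 + 2\|x\|\|y\| = (\|x\| + \|y\|)^2 \]
and therefore $\|x+y\| \le \|x\| + \|y\|$, i.e. subadditivity.
In terms of the norm, the Cauchy-Schwarz inequality just becomes $|\langle x,y\rangle| \le \|x\|\|y\|$.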
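Both the Cauchy-Schwarz inequality and the resulting subadditivity can be spot-checked numerically. The following is a sketch in plain Python on randomly generated vectors in $\mathbb{C}^5$; the fixed seed, the sample size and the helper names are arbitrary choices for this illustration, and a numerical check is of course no substitute for the proof.

```python
import math
import random

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product: ||x|| = <x, x>^(1/2)."""
    return math.sqrt(inner(x, x).real)

def random_vector(n):
    """A vector in C^n with real and imaginary parts drawn uniformly from [-1, 1]."""
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

random.seed(0)
pairs = [(random_vector(5), random_vector(5)) for _ in range(100)]

# |<x, y>| <= ||x|| ||y||, with a small tolerance for rounding.
cs_ok = all(abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12 for x, y in pairs)

# ||x + y|| <= ||x|| + ||y||.
tri_ok = all(
    norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y) + 1e-12
    for x, y in pairs
)
```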
5.8 Definition A pre-Hilbert space (or inner product space) is a pair (V, h·, ·i), where V is an
F-vector space and h·, ·i an inner product on it. A Hilbert space is a pre-Hilbert space that is
complete for the norm k · k obtained from the inner product.
5.9 Remark 1. By the above, an inner product gives rise to a norm and therefore to a norm
topology τ . Now the Cauchy-Schwarz inequality implies that the inner product h·, ·i → F is
jointly continuous:
(For $x = 0$ this is obvious, and for $x \neq 0$ it follows from $\langle x, \frac{x}{\|x\|}\rangle = \|x\|$.)
3. The restriction of an inner product on H to a linear subspace K ⊆ H again is an inner
product. Thus if H is a Hilbert space and K a closed subspace then K again is a Hilbert space
(with the restricted inner product).
4. All spaces considered in Example 5.3 are complete, thus Hilbert spaces. For $\ell^2(S)$ this was proven in Section 4, and the claim for $\mathbb{C}^n$, thus also $M_{n\times n}(\mathbb{C})$, follows since $\mathbb{C}^n \cong \ell^2(S,\mathbb{C})$ when $\#S = n$. For $L^2(X,\mathcal{A},\mu)$ see books on measure theory like [29, 146]. □
5.10 Definition Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be pre-Hilbert spaces. A linear map A : H1 → H2
is called
• isometric or an isometry if hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 .
• unitary if it is a surjective isometry.
5.11 Remark Every unitary map is invertible with unitary inverse. Two Hilbert spaces H1 , H2
are called unitarily equivalent or isomorphic if there exists a unitary U : H1 → H2 . 2
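If $(H_1, \langle\cdot,\cdot\rangle_1), (H_2, \langle\cdot,\cdot\rangle_2)$ are (pre-)Hilbert spaces then
\[ \langle (x_1,x_2), (y_1,y_2)\rangle = \langle x_1,y_1\rangle_1 + \langle x_2,y_2\rangle_2 \]
defines an inner product on $H_1 \oplus H_2$ turning it into a (pre-)Hilbert space. With this definition, $\|(x,y)\| = \langle (x,y),(x,y)\rangle^{1/2} = (\langle x,x\rangle_1 + \langle y,y\rangle_2)^{1/2} = (\|x\|_1^2 + \|y\|_2^2)^{1/2}$ (thus not $\|x\|_1 + \|y\|_2$!).
More generally, if $\{H_i, \langle\cdot,\cdot\rangle_i\}_{i\in I}$ is a family of (pre-)Hilbert spaces then
\[ \bigoplus_{i\in I} H_i = \Big\{ \{x_i\}_{i\in I} \;\Big|\; \sum_{i\in I} \langle x_i,x_i\rangle_i < \infty \Big\} \]
with
\[ \langle \{x_i\}, \{y_i\}\rangle = \sum_{i\in I} \langle x_i,y_i\rangle_i \]
is a (pre-)Hilbert space. (If $H_i = \mathbb{F}$ for all $i \in I$, this construction recovers $\ell^2(I,\mathbb{F})$, while the Banach space direct sum gives $\ell^1(I,\mathbb{F})$.)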
5.12 Exercise Let (V, h·, ·i) be a pre-Hilbert space and k·k the associated norm. Let (V 0 , k·k0 )
be the completion (as a normed space) of (V, k · k). Prove that V 0 is a Hilbert space.
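5.13 Exercise Let $(V, \langle\cdot,\cdot\rangle)$ be a pre-Hilbert space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$. Prove the parallelogram identity
\[ \|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2 \quad \forall x,y \in V \tag{5.3} \]
and the polarization identities
\[ \langle x,y\rangle = \frac{1}{4}\big(\|x+y\|^2 - \|x-y\|^2\big) \quad \text{if } \mathbb{F} = \mathbb{R}, \tag{5.4} \]
\[ \langle x,y\rangle = \frac{1}{4}\sum_{k=0}^{3} i^k \|x + i^k y\|^2 \quad \text{if } \mathbb{F} = \mathbb{C}. \tag{5.5} \]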
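Before proving (5.3) and (5.5) it can be reassuring to test them numerically. The following is a minimal sketch in plain Python for the standard inner product on $\mathbb{C}^3$ (linear in the first argument, as in Definition 5.1); the names `inner`, `norm_sq`, `polarization` and the sample vectors are illustrative assumptions.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """||x||^2 = <x, x>."""
    return inner(x, x).real

def polarization(x, y):
    """Right-hand side of (5.5): (1/4) * sum_{k=0}^{3} i^k ||x + i^k y||^2."""
    total = 0
    for k in range(4):
        ik = 1j ** k
        total += ik * norm_sq([a + ik * b for a, b in zip(x, y)])
    return total / 4

x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -2j]

# Defect of the parallelogram identity (5.3) for this pair.
plus = norm_sq([a + b for a, b in zip(x, y)])
minus = norm_sq([a - b for a, b in zip(x, y)])
parallelogram_defect = plus + minus - 2 * norm_sq(x) - 2 * norm_sq(y)

# Defect of the complex polarization identity (5.5).
polarization_defect = abs(polarization(x, y) - inner(x, y))
```

Both defects are zero up to rounding. With the opposite convention (linear in the second argument) the factor $i^k$ in (5.5) would have to be replaced by its conjugate.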
For a map of (pre-)Hilbert spaces we have two a priori different notions of isometry, but
they are equivalent:
5.14 Exercise Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be (pre-)Hilbert spaces over F ∈ {R, C}. Let k · k1,2
be the norms induced by the inner products. Prove that a linear map A : H1 → H2 is an
isometry of normed spaces (i.e. kAxk2 = kxk1 ∀x ∈ H1 ) if and only if it is an isometry of
pre-Hilbert spaces (i.e. hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 ).
5.15 Exercise Let S be a set with #S ≥ 2 and p ∈ [1, ∞]. Prove that the norm k · kp on
`p (S, F) satisfies the parallelogram identity if and only if p = 2.
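The 'only if' direction of Exercise 5.15 can be explored numerically with two unit vectors in $\ell^p(\{1,2\})$. The following sketch in plain Python computes the defect of (5.3) for $x = (1,0)$, $y = (0,1)$, which equals $2 \cdot 2^{2/p} - 4$ for finite $p$; the function names are illustrative, and the computation only indicates what has to be proven.

```python
def p_norm(x, p):
    """The l^p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_defect(x, y, p):
    """||x+y||_p^2 + ||x-y||_p^2 - 2||x||_p^2 - 2||y||_p^2, which is zero iff (5.3) holds for x, y."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)

x, y = [1.0, 0.0], [0.0, 1.0]
defects = {p: parallelogram_defect(x, y, p) for p in (1, 1.5, 2, 3, float("inf"))}
```

The defect is positive for $p < 2$, negative for $p > 2$ and vanishes (up to rounding) only for $p = 2$.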
5.16 Exercise (Jordan-von Neumann 1935) 28 Let (V, k · k) be a normed space over F ∈
{R, C} satisfying (5.3). Define h·, ·i : V × V → R by (5.4).
(i) Prove hx, yi = hy, xi and hx + x0 , yi = hx, yi + hx0 , yi.
(ii) Prove htx, yi = thx, yi for all t ∈ N, then successively for t ∈ Z, Q, R.
(iii) Prove that h·, ·i is compatible with k · k and makes V a real inner product space.
(iv) If F = C, prove that (5.5) defines an inner product (Definition 5.1) compatible with k · k.
Hint: Prove and use a relationship between the right hand sides of (5.4) and (5.5).
Exercise 5.16 shows that (5.3) characterizes the Banach spaces that ‘are’ Hilbert spaces in
the sense of admitting an inner product compatible with the norm. There are very many such
criteria: The whole book [2] is dedicated to proving about 350 of them! To state just one more:
A Banach space V is a Hilbert space (in the above sense) if and only if for every 2-dimensional
subspace W there exists an idempotent P = P 2 ∈ B(V ) with kP k = 1 such that P V = W .
5.17 Exercise (i) If $H$ is a Hilbert space and $x_1, \ldots, x_n \in H$, prove [this is easier without induction!] the generalized parallelogram identity
\[ 2^{-n} \sum_{s\in\{\pm 1\}^n} \Big\| \sum_{i=1}^n s_i x_i \Big\|^2 = \sum_{i=1}^n \|x_i\|^2. \tag{5.6} \]
(ii) Prove that a Banach space $(V, \|\cdot\|)$ is isomorphic to a Hilbert space (not necessarily isometrically) if and only if there is an inner product $\langle\cdot,\cdot\rangle$ on $V$ such that the norm $\|x\|' = \langle x,x\rangle^{1/2}$ is equivalent to $\|\cdot\|$.
(iii) If $V$ is a Banach space isomorphic to a Hilbert space, prove that there are $C' \ge C > 0$ such that
\[ C \sum_{i=1}^n \|x_i\|^2 \le 2^{-n} \sum_{s\in\{\pm 1\}^n} \Big\| \sum_{i=1}^n s_i x_i \Big\|^2 \le C' \sum_{i=1}^n \|x_i\|^2. \tag{5.7} \]
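The averaging over sign patterns in (5.6) is easy to carry out by brute force for small $n$. The following is a sketch in plain Python for three vectors in $\mathbb{C}^2$ with the standard inner product; the names `averaged_sign_sum` and `xs` are illustrative.

```python
from itertools import product

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """||x||^2 = <x, x>."""
    return inner(x, x).real

def averaged_sign_sum(xs):
    """Left-hand side of (5.6): 2^{-n} times the sum over s in {+1,-1}^n of ||sum_i s_i x_i||^2."""
    n = len(xs)
    dim = len(xs[0])
    total = 0.0
    for signs in product((1, -1), repeat=n):
        v = [sum(s * x[j] for s, x in zip(signs, xs)) for j in range(dim)]
        total += norm_sq(v)
    return total / 2 ** n

xs = [[1 + 1j, 2j], [0.5, -1], [3, 1 - 2j]]
lhs = averaged_sign_sum(xs)
rhs = sum(norm_sq(x) for x in xs)
```

Both sides equal 21.25 here. The cross terms $\langle x_i, x_j\rangle$ with $i \neq j$ cancel in the average over signs, which is the idea behind the induction-free proof.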
5.18 Remark ⋆ A Banach space $V$ in which the second inequality in (5.7) holds for some $C' > 0$, all $n \in \mathbb{N}$ and $x_1, \ldots, x_n \in V$ is said to have 'type 2'. It has 'cotype 2' if the analogous statement holds for the first inequality. (Type $p$ and cotype $p$ are defined analogously by replacing all $\|\cdot\|^2$ by $\|\cdot\|^p$.) By a remarkable theorem of Kwapień$^{29}$ (1972), every Banach space of type and cotype 2 (i.e. satisfying (5.7) with fixed $C, C' > 0$ for all $n, x_i$) is isomorphic to a Hilbert space! For a proof see e.g. [1, Theorem 7.4.1], [97, Vol. 1, Theorem 5.V.6].
Granting this, we have a criterion for a given Banach space $(V, \|\cdot\|)$ to be isomorphic (not necessarily isometrically) to a Hilbert space. We will encounter two more, cf. Remark 6.10.4 and Remark 9.13, but proving the converse directions is way too involved for these notes. □
28 Pascual Jordan (1902-1980). German theoretical physicist with contributions to quantum theory. (Not to be confused with the French mathematician Camille Jordan (1838-1922) to whom e.g. the Jordan normal form is due.)
Proof. We have
\[ \|x\|^2 = \langle x,x\rangle = \Big\langle \sum_i x_i, \sum_j x_j \Big\rangle = \sum_{i,j} \langle x_i,x_j\rangle = \sum_i \langle x_i,x_i\rangle = \sum_i \|x_i\|^2, \]
5.22 Definition Let V be an F-vector space. Then C ⊆ V is called convex if for all x, y ∈ C
and t ∈ [0, 1] we have tx + (1 − t)y ∈ C. (Equivalently tC + (1 − t)C ⊆ C for all t ∈ [0, 1].)
5.23 Exercise If $V$ is an $\mathbb{F}$-vector space and $C \subseteq V$ is convex, prove that $\sum_{i=1}^N t_i x_i \in C$ whenever $x_1, \ldots, x_N \in C$ and $t_1, \ldots, t_N \ge 0$ satisfy $\sum_i t_i = 1$.
Let $d = \inf_{z\in C} \|z\|$ and pick a sequence $\{y_n\}$ in $C$ such that $\|y_n\| \to d$. Since $C$ is convex, we have $\frac{y_n+y_m}{2} \in C$, thus $\big\|\frac{y_n+y_m}{2}\big\| \ge d$ for all $n, m$. Since $\|y_n\| \to d$, for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $n \ge N$ implies $\|y_n\| < d + \varepsilon$. Thus if $n, m \ge N$, then with the parallelogram identity (5.3) we have
\[ \|y_n - y_m\|^2 = 2\|y_n\|^2 + 2\|y_m\|^2 - \|y_n + y_m\|^2 = 2\|y_n\|^2 + 2\|y_m\|^2 - 4\Big\|\frac{y_n+y_m}{2}\Big\|^2 < 4(d+\varepsilon)^2 - 4d^2 = 8d\varepsilon + 4\varepsilon^2. \]
This implies that $\{y_n\}$ is a Cauchy sequence and therefore converges to some $y \in H$ by completeness of $H$, and closedness of $C$ gives $y \in C$. By continuity of the norm, we have $\|y\| = \|\lim y_n\| = \lim \|y_n\| = d$.
If $y, y' \in C$ with $\|y\| = \|y'\| = d$ then $\frac{y+y'}{2} \in C$ by convexity, thus $\|(y+y')/2\|^2 \ge d^2$ by the definition of $d$. Now the parallelogram identity gives $0 \le \|y - y'\|^2 = 4d^2 - 4\big\|\frac{y+y'}{2}\big\|^2 \le 0$. Thus $\|y - y'\| = 0$, proving $y = y'$.
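5.26 Exercise Let $H$ be a Hilbert space over $\mathbb{F}$ and $S, T \subseteq H$ arbitrary subsets. Prove:
(i) $T \subseteq S^\perp \Leftrightarrow S \perp T \Leftrightarrow S \subseteq T^\perp$.
(ii) $S^\perp \subseteq H$ is a closed linear subspace.
(iii) $S^\perp = \overline{\mathrm{span}_\mathbb{F}(S)}^\perp$.
(iv) If $S \subseteq T$ then $T^\perp \subseteq S^\perp$.
(v) $S \subseteq S^{\perp\perp}$ and $S^\perp = S^{\perp\perp\perp}$.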
A linear subspace of a vector space clearly is a convex subset. Now,
5.27 Theorem Let H be a Hilbert space and K ⊆ H a closed linear subspace. Define a map
P : H → K by P x = y, where y ∈ K minimizes kx − yk as in Proposition 5.24. Also define
Qx = x − P x. Then
(i) Qx ∈ K ⊥ ∀x.
(ii) For each x ∈ H there are unique y ∈ K, z ∈ K ⊥ with x = y + z, namely y = P x, z = Qx.
(iii) The maps P, Q are linear.
(iv) The map P : H → H satisfies P 2 = P and hP x, yi = hx, P yi. The same holds for Q.
(v) The map U : H → K ⊕ K ⊥ , x 7→ (P x, Qx) is an isomorphism of Hilbert spaces. In
particular, kxk2 = kP xk2 + kQxk2 ∀x.
Proof. (i) Let $x \in H, v \in K$. We want to prove $Qx \perp v$, i.e. $\langle x - Px, v\rangle = 0$. Since $y = Px$ is the element of $K$ minimizing $\|x - y\|$, we have for all $t \in \mathbb{C}$
\[ \|x - Px\| \le \|x - Px - tv\|. \]
Taking squares and putting $z = x - y = x - Px$, this becomes $\langle z,z\rangle \le \langle z - tv, z - tv\rangle$, equivalent to
\[ 2\,\mathrm{Re}(t\langle v,z\rangle) \le |t|^2\|v\|^2. \]
With the polar decomposition $t = |t|e^{i\varphi}$, this inequality becomes $2\,\mathrm{Re}(e^{i\varphi}\langle v,z\rangle) \le |t|\|v\|^2$. Taking $|t| \to 0$, we find $\mathrm{Re}(e^{i\varphi}\langle v,z\rangle) \le 0$, and since $\varphi$ was arbitrary, we conclude $\langle v,z\rangle = 0$. In view of $z = x - y = x - Px$ this is what we wanted.
(ii) For each $x \in H$ we have $x = Px + Qx$ with $Px \in K, Qx \in K^\perp$, proving the existence. If $y, y' \in K$, $z, z' \in K^\perp$ are such that $y + z = y' + z'$ then $y - y' = z' - z \in K \cap K^\perp = \{0\}$. Thus $y - y' = z' - z = 0$, proving the uniqueness.
(iii) If $x, x' \in H$, $c, c' \in \mathbb{F}$ then $cx + c'x' = P(cx + c'x') + Q(cx + c'x')$. But also
where we used the orthogonality of the images of P and Q. The proofs for Q are analogous.
(v) It is clear that U is a linear isomorphism. Furthermore, P x ⊥ Qy implies
5.28 Remark The above theorem remains valid if H is only a pre-Hilbert space, provided
K ⊆ H is finite-dimensional. We first note that the proof of Lemma 5.24 only uses completeness
of C ⊆ H, not that of H. And we recall that finite-dimensional subspaces of normed spaces are
automatically complete and closed, cf. Exercises 2.33 and 3.22. In the proof of Theorem 5.27
we use Lemma 5.24 with C = K, which is complete as just noted. 2
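For a finite-dimensional subspace $K$ with a known orthonormal basis, the projection of Theorem 5.27 can be computed explicitly. The following sketch in plain Python uses the standard formula $Px = \sum_e \langle x,e\rangle e$, summed over an orthonormal basis of $K$; this formula is an assumption of the sketch (the theorem defines $P$ via norm minimization), and the subspace, the vector and the helper names are illustrative.

```python
import math

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """||x||^2 = <x, x>."""
    return inner(x, x).real

def project(x, onb):
    """P x = sum_e <x, e> e, where `onb` is an orthonormal basis of the subspace K."""
    px = [0j] * len(x)
    for e in onb:
        c = inner(x, e)
        px = [p + c * t for p, t in zip(px, e)]
    return px

r = 1 / math.sqrt(2)
onb_K = [[r, r * 1j, 0], [0, 0, 1]]     # two orthonormal vectors spanning K in C^3
x = [1 + 1j, 2, -1j]

Px = project(x, onb_K)
Qx = [a - b for a, b in zip(x, Px)]     # Q x = x - P x
PPx = project(Px, onb_K)                # applying P twice
```

One finds that `Qx` is orthogonal to both basis vectors of $K$, that `PPx` agrees with `Px`, and that $\|x\|^2 = \|Px\|^2 + \|Qx\|^2$, in line with (i), (iv) and (v) of the theorem.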
(In Theorem 7.32 we will prove that every self-adjoint P : H → H is automatically bounded,
but this is not needed here. There are unbounded idempotents.)
We have seen that every closed subspace K of a Hilbert space gives rise to an orthogonal
projection P with P H = K. Conversely, we have:
5.32 Exercise Let H be a Hilbert space and K ⊆ H a closed linear subspace. Prove that
there is a linear isomorphism H/K → K ⊥ of F-vector spaces.
Conclude that the quotient space H/K of a Hilbert space H by a closed subspace K admits
an inner product turning it into a Hilbert space.
5.33 Exercise Let H be a Hilbert space and K, L closed linear subspaces of H such that
dim K < ∞ and dim K < dim L. Prove that L ∩ K ⊥ 6= {0}.
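5.35 Exercise Let $H$ be a Hilbert space, $K \subseteq H$ a linear subspace (not necessarily closed!) and let $\varphi \in K^*$. Prove:
(i) There exists $\hat{\varphi} \in H^*$ such that $\hat{\varphi}|_K = \varphi$.
(ii) Uniqueness of $\hat{\varphi} \in H^*$ satisfying $\hat{\varphi}|_K = \varphi$ holds if and only if $\overline{K} = H$.
(iii) There is a unique $\hat{\varphi} \in H^*$ satisfying $\hat{\varphi}|_K = \varphi$ and $\|\hat{\varphi}\| = \|\varphi\|$.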
As soon as we study topological vector spaces, we can also talk about infinite linear com-
binations, which renders the linear algebra notion of basis quite irrelevant (except as a tool in
some proofs). For Hilbert spaces we have the following natural notions:
5.38 Exercise Prove that every orthonormal set is linearly independent. What about orthog-
onal sets?
which is called the inequality of Bessel.$^{32}$
Proof. Let $E$ be a finite orthonormal set and $x \in V$. Define $y = x - \sum_{e\in E} \langle x,e\rangle e$. It is straightforward to check that $\langle y,e\rangle = 0$ for all $e \in E$, so that $E \cup \{y\}$ is an orthogonal set. In view of $x = y + \sum_{e\in E} \langle x,e\rangle e$, Pythagoras' theorem (Lemma 5.20) gives
\[ \|x\|^2 = \|y\|^2 + \sum_{e\in E} \|\langle x,e\rangle e\|^2 = \|y\|^2 + \sum_{e\in E} |\langle x,e\rangle|^2, \]
5.40 Lemma For every orthonormal set $E$ in a (pre-)Hilbert space $H$ there is an orthonormal basis $\hat{E}$ containing $E$. In particular every Hilbert space admits an ONB.
Proof. The proof is essentially the same as that of Proposition 5.36: Let $\mathcal{B}$ be the set of orthonormal sets that contain $E$, partially ordered by inclusion. Then $E \in \mathcal{B}$, thus $\mathcal{B} \neq \emptyset$. A Zorn's lemma argument gives the existence of a maximal element $\hat{E}$ of the partially ordered set $(\mathcal{B}, \subseteq)$. Thus $\hat{E}$ is maximal among the orthonormal sets containing $E$. If there is a unit vector $f \in \hat{E}^\perp$ then $\hat{E} \cup \{f\}$ is an orthonormal set containing $E$ strictly larger than $\hat{E}$, contradicting the maximality of $\hat{E}$. Thus no such $f$ exists, and $\hat{E}$ is an ONB for $H$.
5.42 Theorem $^{33}$ Let $H$ be a Hilbert space and $E$ an orthonormal set in $H$. Then the following are equivalent:
(i) $E$ is an orthonormal basis, i.e. maximal.
(ii a) If $x \in H$ and $x \perp e$ for all $e \in E$ then $x = 0$.
(ii b) The map $H \to \ell^2(E,\mathbb{F}),\ x \mapsto \{\langle x,e\rangle\}_{e\in E}$ (well-defined thanks to (5.8)) is injective.
(iii) $\overline{\mathrm{span}_\mathbb{F} E} = H$.
(iv) For every $x \in H$, there are numbers $\{a_e\}_{e\in E}$ in $\mathbb{F}$ such that $x = \sum_{e\in E} a_e e$.
(v) For every $x \in H$, the equality $x = \sum_{e\in E} \langle x,e\rangle e$ holds.
(vi a) For every $x \in H$, we have $\|x\|^2 = \sum_{e\in E} |\langle x,e\rangle|^2$. (Abstract Parseval$^{34}$ identity)
(vi b) The map $H \to \ell^2(E,\mathbb{F}),\ x \mapsto \{\langle x,e\rangle\}_{e\in E}$ is an isometric map of normed spaces, where $\ell^2(E,\mathbb{F})$ has the $\|\cdot\|_2$-norm.
(vii a) For all $x, y \in H$ we have $\langle x,y\rangle = \sum_{e\in E} \langle x,e\rangle \overline{\langle y,e\rangle} = \sum_{e\in E} \langle x,e\rangle\langle e,y\rangle$.
(vii b) The map $H \to \ell^2(E,\mathbb{F}),\ x \mapsto \{\langle x,e\rangle\}_{e\in E}$ is an isometric map of pre-Hilbert spaces.
Here all summations over $E$ are in the sense of the unordered summation of Appendix A.1 (with $V = H$ in (iv),(v) and $V = \mathbb{F}$ in (vi a),(vii a)).
32 Friedrich Bessel (1784-1846), German mathematician, now best known for certain differential equations.
33 I dislike the approach of some textbooks that restrict this statement to finite or countably infinite orthonormal sets, which amounts to assuming H to be separable. I also find it desirable to understand how much of the theorem survives without completeness since the latter does not hold in situations like Example 5.49. See Remark 5.43.
34 Marc-Antoine Parseval (1755-1836). French mathematician.
Proof. If (ii a) holds then $E$ is maximal, thus (i). If (ii a) is false then there is a non-zero $x \in H$ with $x \perp e$ for all $e \in E$. Then $E \cup \{x/\|x\|\}$ is an orthonormal set larger than $E$, thus $E$ is not maximal. Thus (i)⇔(ii a). The equivalence (ii a)⇔(ii b) follows from the fact that a linear map is injective if and only if its kernel is $\{0\}$.
(iii)⇒(i) If $\overline{\mathrm{span}_\mathbb{F} E} = H$ and $x \in H$ satisfies $x \perp E$ then also $x \perp \overline{\mathrm{span}_\mathbb{F} E} = H$, thus $x = 0$. Thus $E$ is maximal and therefore a basis.
(ii a)⇒(iii) $K = \overline{\mathrm{span}_\mathbb{F} E} \subseteq H$ is a closed linear subspace. If $K \neq H$ then by Theorem 5.27 we can find a non-zero $x \in K^\perp$. In particular $x \perp e$ $\forall e \in E$, contradicting (ii a). Thus $K = H$.
It should be clear that the statements (vi b) and (vii b) are just high-brow versions of (vi a), (vii a), respectively, to which they are equivalent. That (vii a) implies (vi a) is seen by taking $x = y$. Since Exercise 5.14 gives (vi b)⇒(vii b), we have the mutual equivalence of (vi a), (vi b), (vii a), (vii b).
(v)⇒(iv) is trivial. If (iv) holds then continuity of the inner product, cf. Remark 5.9.1, implies $\langle x,y\rangle = \sum_{e\in E} a_e \langle e,y\rangle$ for all $y \in H$. For $y \in E$, the r.h.s. reduces to $a_y$, implying (v). (iv) means that every $x \in H$ is a limit of finite linear combinations of the $e \in E$, thus (iii) holds.
(v)⇒(vi a) For finite $F \subseteq E$ we define $x_F = \sum_{e\in F} \langle x,e\rangle e$. Pythagoras' theorem gives $\|x_F\|^2 = \sum_{e\in F} |\langle x,e\rangle|^2$. As $F \nearrow E$, the l.h.s. converges to $\|x\|^2$ by (iii) and the r.h.s. to $\sum_{e\in E} |\langle x,e\rangle|^2$. Thus (vi a) holds.
If (vi a) holds then for each $\varepsilon > 0$ there is a finite $F \subseteq E$ such that $\sum_{e\in E\setminus F} |\langle x,e\rangle|^2 < \varepsilon$. Since $x - x_F$ is orthogonal to each $e \in F$, we have $x - x_F \perp x_F$, so that $\|x\|^2 = \|x - x_F\|^2 + \|x_F\|^2$. Combining this with (vi a) and $\|x_F\|^2 = \sum_{e\in F} |\langle x,e\rangle|^2$ we find $\|x - x_F\|^2 = \sum_{e\in E\setminus F} |\langle x,e\rangle|^2 < \varepsilon$.
since $H$ is dense in $\hat{H}$. And the converse follows from the general topology fact that the closure in $H$ of some $S \subseteq H \subseteq \hat{H}$ equals $\overline{S} \cap H$, where $\overline{S}$ is the closure in $\hat{H}$.
In Example 5.49 below, all statements (i)-(vii) hold despite the incompleteness of H. But
in the absence of completeness the implication (i)⇒(iii) can fail! For a counterexample see
Exercise 5.44. (In view of this, maximal orthonormal sets in pre-Hilbert spaces should not
be called bases.) In [63] it is even proven that a pre-Hilbert space in which every maximal
orthonormal set E has dense span actually is a Hilbert space. Equivalently, in every incomplete
pre-Hilbert space there is a maximal orthonormal set E whose span is non-dense! There even
are pre-Hilbert spaces (called pathological) in which no orthonormal set has dense span!
Actually, most of the non-trivial results, like $H \cong K \oplus K^\perp$ for closed subspaces $K$ and Theorem 5.34, hold for a pre-Hilbert space if and only if it is a Hilbert space, see [63]. □
5.45 Theorem ((F.) Riesz-Fischer) $^{35,36}$ Let $H$ be a pre-Hilbert space and $E$ an orthonormal set such that $\overline{\mathrm{span}_\mathbb{F} E} = H$. Then the following are equivalent:
(i) $H$ is a Hilbert space (thus complete).
(ii) The isometric map $H \to \ell^2(E,\mathbb{F}),\ x \mapsto \{\langle x,e\rangle\}_{e\in E}$ is surjective. I.e. for every $f \in \ell^2(E,\mathbb{F})$ there is an $x \in H$ such that $\langle x,e\rangle = f(e)$ for all $e \in E$.
Proof. (ii)⇒(i) We know from (iii)⇒(vii b) in Theorem 5.42 that the map $H \to \ell^2(E,\mathbb{F})$ is an isometry. If it is surjective then it is an isomorphism of pre-Hilbert spaces. Since $\ell^2(E,\mathbb{F})$ is complete by Lemma 4.8, so is $H$.
(i)⇒(ii) With $f \in \ell^2(E,\mathbb{F})$ we have $\sum_{e\in E} |f(e)|^2 < \infty$. This implies that for each $\varepsilon > 0$ there is a finite $F \subseteq E$ such that $\sum_{e\in E\setminus F} |f(e)|^2 < \varepsilon$. For each finite subset $F \subseteq E$ we define $x_F = \sum_{e\in F} f(e)e$. Whenever $U, U' \subseteq E$ are finite subsets containing $F$, the identity $x_U - x_{U'} = \sum_{e\in E} (\chi_U(e) - \chi_{U'}(e))f(e)e$ implies
\[ \|x_U - x_{U'}\|^2 = \sum_{e\in E} |\chi_U(e) - \chi_{U'}(e)|^2 |f(e)|^2 \le \varepsilon \]
since $|\chi_U - \chi_{U'}|$ vanishes on $F$ and is bounded by one on $(U \cup U')\setminus F$. Thus $\{x_F\}_{F\subseteq E \text{ finite}}$ is a Cauchy net in $H$ and therefore convergent to a unique $x \in H$ by completeness, cf. Lemma A.14. By continuity of the inner product, $\langle x_F,e\rangle$ converges to $f(e)$, so that $\langle x,e\rangle = f(e)$ for all $e \in E$.
5.47 Proposition For a Hilbert space H, the following are equivalent:
(i) H is separable in the topological sense, i.e. there is a countable dense set S ⊆ H.
(ii) H admits a countable orthonormal basis.
(iii) Every orthonormal basis for H is countable.
Proof. If $E \subseteq H$ is any ONB for $H$, Theorem 5.45 gives a unitary equivalence $H \cong \ell^2(E,\mathbb{F})$. By Proposition 4.17, $\ell^2(E,\mathbb{F})$ is separable if and only if $E$ is countable. Combining these facts proves the implications (ii)⇒(i)⇒(iii), while (iii)⇒(ii) is trivial.
5.48 Remark One can prove that any two ONBs E, E 0 for a Hilbert space H have the same
cardinality, i.e. there is a bijection between E and E 0 , cf. e.g. [30, Proposition I.4.14]. (This
does not follow from the linear algebra proof, since the latter uses a different notion of basis,
the Hamel bases.) The common cardinality of all bases of H is called the dimension of H. 2
which is the $n$-th Fourier coefficient $\hat{f}(n)$ of $f$, cf. e.g. [157, 83]. In fact, in Fourier analysis one proves, cf. e.g. [157, Corollary 5.4], that the finite linear combinations of the $e_n$ ('trigonometric polynomials') are dense in $H$, which is (iii) of Theorem 5.42. Thus all other statements in the theorem also hold. The weaker statement (ii a) is also well-known in Fourier analysis, cf. [157, Corollary 5.3]. Furthermore,
\[ \frac{1}{2\pi}\int_0^{2\pi} |f(x)|^2\, dx = \|f\|^2 = \sum_{n\in\mathbb{Z}} |\langle f,e_n\rangle|^2 = \sum_{n\in\mathbb{Z}} |\hat{f}(n)|^2. \]
This is the original Parseval formula, cf. e.g. [157, Chapter 3, Theorem 1.3]. Note that $H$ is not complete. Measure theory tells us that its completion is $L^2([0,2\pi],\lambda;\mathbb{C})$, the measure being Lebesgue measure $\lambda$ (defined on the σ-algebra of Borel sets). Now the map $L^2([0,2\pi]) \to \ell^2(\mathbb{Z},\mathbb{C}),\ f \mapsto \hat{f}$ is an isomorphism of Hilbert spaces. This nice situation shows that the Lebesgue integral is much more appropriate for the purposes of Fourier analysis than the Riemann integral (as for most other purposes).
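For a trigonometric polynomial the Parseval formula can be confirmed numerically. The following sketch in plain Python assumes $e_n(x) = e^{inx}$ and the normalized inner product $\langle f,g\rangle = \frac{1}{2\pi}\int_0^{2\pi} f\overline{g}\,dx$, as suggested by the displayed formula; the coefficients and the number `N` of sample points are arbitrary choices. For a trigonometric polynomial whose frequencies differ by less than `N`, the equally spaced Riemann sum of $|f|^2$ is exact up to rounding.

```python
import cmath
import math

# Fourier coefficients of a trigonometric polynomial f = sum_n c_n e_n (illustrative values).
coeffs = {-2: 0.5 - 1j, 0: 2.0, 1: 1j, 3: -1.5}

def f(x):
    """Evaluate f(x) = sum_n c_n exp(i n x)."""
    return sum(c * cmath.exp(1j * n * x) for n, c in coeffs.items())

# (1/2pi) * integral of |f|^2 over [0, 2pi], via an equally spaced Riemann sum with N points.
N = 64
mean_square = sum(abs(f(2 * math.pi * k / N)) ** 2 for k in range(N)) / N

# Sum of |c_n|^2, the right-hand side of the Parseval formula.
parseval_sum = sum(abs(c) ** 2 for c in coeffs.values())
```

Both numbers equal 8.5 up to rounding.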
Note: If we consider L2 ([0, 2π], λ; R) instead, we must replace the basis E by ER = {cos nx | n ∈
N0 } ∪ {sin nx | n ∈ N}. One easily checks that spanC E = spanC ER .
5.50 Exercise Prove that the pre-Hilbert space $H = C([0,1])$ with inner product $\langle f,g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt$ is not complete.
37
The set of continuous 2π-periodic functions can be identified with C(S 1 , C) via z = eix . We write f (x) when we
consider f as a function on [0, 2π] (or R) and f (z) if f is understood as a function on S 1 = {z ∈ C | |z| = 1}.
50
5.7 Tensor products of Hilbert spaces
In this optional section, referenced only in Section 12.4 but important well beyond that, you are
assumed to know³⁸ the notion of (algebraic) tensor product V ⊗_k W of two vector spaces V, W
over a field k. (In two sentences: V ⊗_k W is the free abelian group spanned by the pairs (v, w) ∈
V × W, divided by the subgroup generated by all elements of the form (v + v′, w) − (v, w) − (v′, w)
and (v, w + w′) − (v, w) − (v, w′) and (cv, w) − (v, cw), where v, v′ ∈ V, w, w′ ∈ W, c ∈ k, the
quotient being a k-vector space in the obvious way. If v ∈ V, w ∈ W then the equivalence class
[(v, w)] is denoted v ⊗ w.)
The crucial property is that given a bilinear map α : V × W → Z (where V × W is the
Cartesian product) there is a unique linear map β : V ⊗_k W → Z such that β(v ⊗ w) = α((v, w)).
5.51 Lemma Let (H, ⟨·, ·⟩_H), (H′, ⟨·, ·⟩_{H′}) be pre-Hilbert spaces over F ∈ {R, C}. Then there is
a unique inner product ⟨·, ·⟩_Z on Z = H ⊗_F H′ such that ⟨v ⊗ w, v′ ⊗ w′⟩_Z = ⟨v, v′⟩_H ⟨w, w′⟩_{H′}.
Proof. Every element z ∈ Z = H ⊗_F H′ has a representation z = Σ_{k=1}^K v_k ⊗ w_k with K < ∞.
Given another z′ = Σ_{l=1}^L v′_l ⊗ w′_l ∈ H ⊗_F H′, we must define
\[ \langle z, z'\rangle_Z = \sum_{k=1}^{K}\sum_{l=1}^{L} \langle v_k, v'_l\rangle_H \, \langle w_k, w'_l\rangle_{H'} . \]
\[ \sum_{k=1}^{K}\sum_{l=1}^{L} \langle v_k, v'_l\rangle_H \, \langle w_k, w'_l\rangle_{H'}
   = \sum_{\tilde k=1}^{\tilde K}\sum_{l=1}^{L} \langle \tilde v_{\tilde k}, v'_l\rangle_H \, \langle \tilde w_{\tilde k}, w'_l\rangle_{H'} . \]
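To illustrate why the value of the defining double sum should not depend on the chosen representation of z, here is a small worked example. It assumes H = H′ = F² with the standard inner product and standard ONB {e₁, e₂}; this concrete choice is ours and only serves as an illustration.

```latex
% Assumption (illustrative): H = H' = F^2 with standard ONB e_1, e_2.
% The same element z has two representations:
\[ z = (e_1 + e_2)\otimes e_1 = e_1\otimes e_1 + e_2\otimes e_1 . \]
% First representation (K = 1):
\[ \langle z, z\rangle_Z = \langle e_1+e_2,\, e_1+e_2\rangle \, \langle e_1, e_1\rangle = 2\cdot 1 = 2 . \]
% Second representation (K = 2), using <e_k, e_l> = \delta_{kl}:
\[ \langle z, z\rangle_Z = \sum_{k=1}^{2}\sum_{l=1}^{2} \langle e_k, e_l\rangle \, \langle e_1, e_1\rangle
   = \sum_{k=1}^{2}\sum_{l=1}^{2} \delta_{kl} = 2 . \]
```

Both representations give the same value, as well-definedness requires.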
5.52 Definition If H, H′ are Hilbert spaces then H ⊗ H′ is the Hilbert space obtained by
completing the above pre-Hilbert space (Z, ⟨·, ·⟩_Z).
5.53 Remark 1. We usually write the completed tensor products ⊗ without subscript to
distinguish them from the algebraic ones.
2. If E, E′ are ONBs in the Hilbert spaces H, H′, respectively, then it is immediate that
E × E′ (identified with {e ⊗ e′ | e ∈ E, e′ ∈ E′}) is an orthonormal set in the algebraic tensor
product H ⊗_F H′, thus also in H ⊗ H′. In fact its span is dense in H ⊗ H′, so that it is an ONB.
38 Unfortunately, this is often omitted from undergraduate linear algebra teaching. E.g., it does not appear in [55]
despite the book’s > 500 pages. See however [84, 95] which, admittedly, are aiming higher.
This leads to a pedestrian way of defining the tensor product H ⊗ H′ of Hilbert spaces over
F: Pick ONBs E ⊆ H, E′ ⊆ H′ and define H ⊗ H′ = ℓ²(E × E′, F). By Remark 5.48, the
outcome is independent of the chosen bases up to isomorphism. If x ∈ H, x′ ∈ H′ then the map
E × E′ → F, (e, e′) ↦ ⟨x, e⟩_H ⟨x′, e′⟩_{H′} is in ℓ²(E × E′, F), thus defines an element x ⊗ x′ ∈ H ⊗ H′.
The resulting map H × H′ → H ⊗ H′, (x, x′) ↦ x ⊗ x′ is bilinear. But this definition is very ugly and
unconceptual due to its reliance on a choice of bases. □
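The square-summability claim in this construction can be checked in one line. The following sketch assumes Parseval's identity Σ_{e∈E} |⟨x, e⟩|² = ‖x‖² for an ONB E (one of the equivalent statements for orthonormal bases) and uses that a double sum of non-negative terms factorizes.

```latex
% Assumption: Parseval's identity holds for the ONBs E \subseteq H and E' \subseteq H'.
\[ \sum_{(e,e')\in E\times E'} \big|\langle x, e\rangle_H \, \langle x', e'\rangle_{H'}\big|^2
   = \Big(\sum_{e\in E} |\langle x, e\rangle_H|^2\Big)\Big(\sum_{e'\in E'} |\langle x', e'\rangle_{H'}|^2\Big)
   = \|x\|^2 \, \|x'\|^2 < \infty . \]
```

In particular the element x ⊗ x′ of ℓ²(E × E′, F) has norm ‖x‖‖x′‖.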
\[ \|c(x+W)\|' = \|cx+W\|' = \inf_{w\in W}\|cx-w\| = |c|\inf_{w\in W}\|x-w/c\| = |c|\inf_{w\in W}\|x-w\| = |c|\,\|x+W\|' , \]
where we used that W → W, w ↦ cw is a bijection. Now let x₁, x₂ ∈ V and ε > 0. Then there
are w₁, w₂ ∈ W such that ‖x_i − w_i‖ < ‖x_i + W‖′ + ε/2 for i = 1, 2. Then, since w₁ + w₂ ∈ W,
\[ \|x_1+x_2+W\|' \le \|(x_1+x_2)-(w_1+w_2)\| \le \|x_1-w_1\| + \|x_2-w_2\| < \|x_1+W\|' + \|x_2+W\|' + \varepsilon . \]
Since ε > 0 was arbitrary, we have ‖x₁ + x₂ + W‖′ ≤ ‖x₁ + W‖′ + ‖x₂ + W‖′, proving subadditivity
of ‖·‖′. In view of 0 ∈ W it is immediate that ‖v + W‖′ = inf_{w∈W} ‖v − w‖ ≤ ‖v‖.
(ii) If v ∈ V, the definition of ‖·‖′ readily implies that ‖v + W‖′ = 0 if and only if v ∈ W̄, the
closure of W. Thus if W is closed then w = v + W ∈ V/W has ‖w‖′ = 0 only if w is the zero element of V/W.
And if W is not closed then every v ∈ W̄\W satisfies ‖v + W‖′ = 0 even though v + W ∈ V/W
is non-zero. Thus ‖·‖′ is not a norm.
(iii) Continuity of Q : (V, ‖·‖) → (V/W, ‖·‖′) follows from ‖Q‖ ≤ 1, see (i). Since Q is
norm-decreasing, we have Q(B^V(0, r)) ⊆ B^{V/W}(0, r) for each r > 0. And if y ∈ V/W with
‖y‖ < r then there is an x ∈ V with Q(x) = y and ‖x‖ < r (but typically larger than ‖y‖).
Thus Q maps B^V(0, r) onto B^{V/W}(0, r) for each r. Similarly, Q(B^V(x, r)) = B^{V/W}(Q(x), r),
and from this it is easily deduced that Q(U) ⊆ V/W is open for each open U ⊆ V. Thus Q is
open (w.r.t. the norm topologies on V, V/W), which implies (cf. [108, Lemma 6.4.5]) that Q is
a quotient map, thus the topology on V/W coming from ‖·‖′ is the quotient topology.
(iv) Let {y_n} ⊆ V/W be a Cauchy sequence. Then we can pass to a subsequence w_n = y_{i_n}
such that ‖w_n − w_{n+1}‖ < 2^{−n}. Pick x_n ∈ V such that Q(x_n) = w_n and ‖x_n − x_{n+1}‖ < 2^{−n}. (Why
can this be done?) Then {x_n} is a Cauchy sequence converging to some x ∈ V by completeness
of V. With y = Q(x) we have ‖w_n − y‖ ≤ ‖x_n − x‖ → 0. Thus the subsequence {w_n} converges to y,
and since {y_n} is Cauchy, also y_n → y. Thus V/W is complete.
(v) If y ∈ V/W, and x, x′ ∈ V satisfy Q(x) = Q(x′) = y then Q(x − x′) = 0, thus
x − x′ ∈ ker Q = W ⊆ ker T, implying Tx = Tx′. Thus putting T′y = Tx gives rise to a
well-defined map T′ : V/W → Z satisfying T′Q = T. One easily checks that T′ is linear. And
using Q(B^V(0, 1)) = B^{V/W}(0, 1) from the proof of (iii) we have
\[ \|T'\| = \sup_{y\in B^{V/W}(0,1)} \|T'y\| = \sup_{x\in B^{V}(0,1)} \|T'Qx\| = \sup_{x\in B^{V}(0,1)} \|Tx\| = \|T\| . \]
The statements concerning injectivity and surjectivity of T′ are pure algebra, but for completeness
we give proofs: The statement about surjectivity follows from T = T′Q together with
surjectivity of Q, which gives T(V) = T′(V/W). If W ⊊ ker T, pick x ∈ (ker T)\W and put
y = Q(x). Then y ≠ 0, but T′y = T′Qx = Tx = 0, so that T′ is not injective. Now assume
W = ker T. If y ∈ ker T′ then pick x ∈ V with y = Q(x). Then Tx = T′Qx = T′y = 0, thus
x ∈ ker T = W, so that y = Q(x) = 0, proving injectivity of T′.
(vi) It is known from algebra that A/I is again an algebra. By the above, it is a normed
(resp. Banach) space. It remains to prove that the quotient norm on A/I is submultiplicative.
Let c, d ∈ A/I and ε > 0. Then there are a, b ∈ A with Q(a) = c, Q(b) = d, ‖a‖ < ‖c‖ + ε, ‖b‖ <
‖d‖ + ε (see Exercise 6.2(i)). Then ‖cd‖ = ‖Q(ab)‖ ≤ ‖ab‖ ≤ ‖a‖‖b‖ < (‖c‖ + ε)(‖d‖ + ε),
and since this holds for all ε > 0, we have ‖cd‖ ≤ ‖c‖‖d‖.
6.2 Exercise (i) If V is a normed space and W ⊆ V is a closed subspace, prove that for
every y ∈ V/W and every ε > 0 there is an x ∈ V with Q(x) = y and ‖x‖ ≤ ‖y‖ + ε.
(ii) Give an example of a normed space V, a closed subspace W and y ∈ V/W for which no
x ∈ V with y = Q(x), ‖x‖ = ‖y‖ exists.
We have seen that if V is Banach and W ⊆ V is closed then W and V /W (with their
inherited and quotient norms, respectively) are complete. The converse is also true:
Proof. (i) Let {v_n} ⊆ V be a Cauchy sequence. Since Q : V → V/W is bounded, the sequence
{z_n = Q(v_n)} ⊆ V/W is Cauchy, so that by completeness of V/W it converges to some z ∈ V/W.
Pick ẑ ∈ V with Q(ẑ) = z. By Exercise 6.2(i), for every n ∈ N we can find a y_n ∈ V such that
Q(y_n) = z − z_n = Q(ẑ − v_n) and ‖y_n‖ ≤ ‖z − z_n‖ + 2^{−n}. With z_n → z this implies y_n → 0. With
Q(y_n + v_n − ẑ) = 0, we have y_n + v_n − ẑ ∈ ker Q = W ∀n. Since {v_n} and {y_n} are Cauchy, so
is {y_n + v_n − ẑ}, so that by completeness of W we have y_n + v_n − ẑ → w for some w ∈ W. In
view of y_n → 0 this implies v_n → w + ẑ ∈ V. Thus V is complete.
(ii) Recalling that finite codimensionality of W means dim(V /W ) < ∞, both statements are
immediate by (i) and the completeness of finite-dimensional normed spaces (Exercise 2.33).
6.5 Exercise If V is a Banach space and W ⊆ V a closed subspace, prove that V is separable if
and only if W and V/W are separable.
If V is a Banach space and W ⊆ V a closed linear subspace, it is natural to ask how the
dual spaces W ∗ and (V /W )∗ are related to V ∗ . This leads to the following definitions, which
are closely related to the Hilbert space ⊥, but not the same:
If V is a normed space and W, Z ⊆ V are closed linear subspaces, we will later see that
W + Z = {w + z | w ∈ W, z ∈ Z} ⊆ V can fail to be closed. If W and Z are both finite-
dimensional then W + Z is finite-dimensional, thus closed. More generally:
6.2 Complemented subspaces
If V is a vector space over any field K and W ⊆ V is a linear subspace, it is known from linear
algebra that we can find another subspace Z ⊆ V such that V = W + Z and W ∩ Z = {0}.
The proof is easy: Pick a (Hamel) basis E for W, extend it to a basis E′ ⊇ E of V and put
Z = span_K(E′\E). Such a Z is called an (algebraic) complement for W. Since every x ∈ V can
be written as x = w + z with w ∈ W, z ∈ Z in a unique way (w + z = w′ + z′ ⇒ w − w′ =
z′ − z ∈ W ∩ Z = {0}) one says V is the internal direct sum of W and Z, or V ≅ W ⊕ Z.
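A minimal worked example of an algebraic complement and the resulting unique decomposition may help. The choice V = R² with the two lines W and Z is ours and purely illustrative.

```latex
% Illustrative assumption: V = R^2, W = R(1,0), Z = R(1,1).
\[ W \cap Z = \{0\}, \qquad (a,b) = \underbrace{(a-b)\,(1,0)}_{\in W} + \underbrace{b\,(1,1)}_{\in Z}
   \quad \text{for all } (a,b)\in \mathbb{R}^2 . \]
% Complements are far from unique: Z' = R(0,1) works equally well,
\[ (a,b) = a\,(1,0) + b\,(0,1) . \]
```

The example also shows that a complement is in general not unique, which is why one speaks of a complement rather than the complement.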
If V is a Banach space and W ⊆ V a closed subspace, it is natural to ask for a complementary
subspace Z with the above properties to be closed, too. We have seen that for every closed
subspace K of a Hilbert space H there is a closed complement, namely the orthogonal complement
K^⊥ ⊆ H. Since K^⊥ is defined in terms of the inner product, it is not surprising that the
situation will turn out more complicated for general Banach spaces, where no inner product
is around. (Simply passing from an algebraic complement Z to its closure Z̄ is no solution since
there is no reason for W ∩ Z̄ = {0} to hold.) This leads us to define:
6.10 Remark 1. In Exercise 7.15 we will prove that if V is a Banach space and W, Z ⊆ V are
complementary closed subspaces, the linear isomorphism V ≃ W ⊕ Z also is a homeomorphism,
thus an isomorphism of Banach spaces.
2. By the comments preceding the definition, every closed subspace of a Hilbert space is
complemented, and the same holds for Banach spaces isomorphic to a Hilbert space.
3. But ‘most’ infinite-dimensional Banach spaces have uncomplemented closed subspaces!
The simplest example of an uncomplemented closed subspace probably is c₀(N, R) ⊆ ℓ^∞(N, R).
For a proof, not entirely trivial, see Appendix B.3.3.
Another easily stated example is given by X = C(S¹, C) with norm ‖·‖_∞ and the closed
subspace Y = {f ∈ X | f̂(n) = 0 ∀n < 0}, where f̂(n) = ∫_0^1 f(e^{2πit}) e^{−2πint} dt, n ∈ Z are the
Fourier coefficients. Proving that Y ⊆ X is not complemented, see e.g. [76, p. 163-4], boils down
to rather classical analysis, namely the fact that Σ_{n=1}^N (sin nx)/n is bounded uniformly in N ∈ N
and x ∈ R, while Σ_{n=1}^N e^{inx}/n is not, combined with Exercise 7.15.
4. In fact, by a remarkable theorem of Lindenstrauss and Tzafriri³⁹ (1971), cf. e.g. [1,
Theorem 13.4.5], [97, Vol. 2, Chap. 1, Theorem V.1], every Banach space all closed subspaces
of which are complemented is isomorphic to a Hilbert space. Cf. also Remark 9.13. □
spanF {x1 , . . . , xn }. Now it is straightforward that Z is a complement for W .) Being finite-
dimensional, it is automatically closed by Exercise 3.22.
(ii) The proof will be given in Section 9.3 since it requires tools still to be developed.
6.12 Exercise It is not true that every subspace W ⊆ V with dim(V /W ) < ∞ of a Banach
space V is closed! Find a counterexample! (Hint: try codimension one.)
6.13 Exercise Let V be a normed space and P ∈ B(V) satisfying P² = P (i.e. P is idempotent).
Prove that W = PV is a complemented subspace. (For a converse see Exercise 7.15.)
6.14 Exercise Let V = C([0, 2], R) with the ‖·‖_∞-norm. Let W = {f ∈ V | f|_{(1,2]} = 0}.
(i) Prove that W is complemented.
(ii) Can you ‘classify’ all possible complements, i.e. put them in bijection with a simpler set?
For more on the subject of complemented subspaces see [102].
In the process of returning from Hilbert to the more general Banach spaces, the above
discussion of quotient spaces and complements was the easiest part. The question of bases is
much more involved for Banach spaces, as the very extensive two-volume treatment [153] of
the subject attests. (Then again, the basics are quite accessible⁴⁰, cf. e.g. [102, 26, 1, 74], but
unfortunately we don’t have the time.) The same is true for the formidable subject of tensor
products of Banach spaces, see e.g. [144]. Going into that would be pointless given that we
already slighted the much simpler tensor products of Hilbert spaces.
A more tractable problem is the fact that in the absence of an inner product, the existence of
non-zero bounded linear functionals is rather non-trivial and can in general only be proven non-
constructively. We will do this in Section 9. (Of course, for spaces that are given very explicitly
like ℓ^p(S, F), we may well have more concrete descriptions of the dual spaces as in Section 4.5.)
But first we will prove two major theorems that are non-trivial even when restricted to Hilbert
spaces. Both of them use Baire’s theorem on complete metric spaces, cf. Appendix A.5 which
should perhaps be read first.
7.1 Exercise Let E, F be normed spaces and T ∈ B(E, F). Consider the statements
(i) T is open (i.e. TU ⊆ F is open for each open U ⊆ E).
(ii) For every α > 0 there exists β > 0 such that B^F(0, β) ⊆ T B^E(0, α).
(iii) There exist α, β > 0 such that B^F(0, β) ⊆ T B^E(0, α).
(iv) There is a C > 0 such that for every y ∈ F there exists x ∈ E with Tx = y and ‖x‖ ≤ C‖y‖.
(This is a more quantitative or ‘controlled’ version of surjectivity.)
(v) T is surjective.
Obviously (iv)⇒(v). Prove the easy equivalences (i)⇔(ii)⇔(iii)⇔(iv).
40 A Schauder basis for a Banach space V is a sequence {e_n}_{n∈N} such that for every x ∈ V there are unique c_n ∈ F
such that x = Σ_{n=1}^∞ c_n e_n in the sense of (possibly conditional) convergence of series. One then proves that there are
continuous linear functionals ϕ_n such that x = Σ_n ϕ_n(x) e_n. Existence of a Schauder basis clearly implies separability
of V, but while most ‘natural’ separable Banach spaces have Schauder bases, counterexamples exist!
Remarkably, for Banach spaces also (v) is equivalent to (i)-(iv):
7.2 Theorem (Open Mapping Theorem (OMT), Schauder 1930)⁴¹ If E, F are Banach
spaces then every surjective T ∈ B(E, F) is open (and (iv) holds, which is often useful).
Most proofs of this theorem are not very transparent. We follow the slightly better approach
of [61], which makes the proof of Proposition 7.4 more palatable by isolating a lemma that also
has other applications⁴². It deduces statement (iv) in Exercise 7.1 from completeness of E and
an approximate form of surjectivity of T ∈ B(E, F):
7.3 Lemma Let E be a Banach space, F a normed space and T ∈ B(E, F). Assume also that
there are m > 0 and r ∈ (0, 1) such that for every y ∈ F there is an x₀ ∈ E with ‖x₀‖ ≤ m‖y‖
and ‖y − Tx₀‖ ≤ r‖y‖. Then for every y ∈ F there is an x ∈ E such that ‖x‖ ≤ (m/(1−r))‖y‖ and
Tx = y. In particular, T is surjective.
Proof. By linearity we may assume ‖y‖ = 1. By assumption, there is x₀ ∈ E such that ‖x₀‖ ≤ m
and ‖y − Tx₀‖ ≤ r. Putting y₁ = y − Tx₀ we have ‖y₁‖ ≤ r, and applying the hypothesis to y₁,
we find an x₁ ∈ E with ‖x₁‖ ≤ m‖y₁‖ ≤ rm and ‖y − T(x₀ + x₁)‖ = ‖y₁ − Tx₁‖ ≤ r‖y₁‖ ≤ r².
Continuing this inductively⁴³ we obtain a sequence {x_n} ⊆ E such that for all n ∈ N we have
\[ \|x_n\| \le r^n m, \tag{7.1} \]
\[ \|y - T(x_0 + x_1 + \cdots + x_n)\| \le r^{n+1}. \tag{7.2} \]
Now, (7.1) together with completeness of E implies, cf. Proposition 3.15, that Σ_{n=0}^∞ x_n converges
to an x ∈ E with
\[ \|x\| \le \sum_{n=0}^{\infty} \|x_n\| \le \sum_{n=0}^{\infty} r^n m = \frac{m}{1-r}, \]
and taking n → ∞ in (7.2) gives y = Tx. Not assuming ‖y‖ = 1 gives an extra factor ‖y‖.
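The inductive step behind (7.1) and (7.2) can be written out explicitly. The following is a sketch in which the auxiliary name y_{n+1} is our notation, extending the y₁ used in the proof; it assumes ‖y‖ = 1 as in the proof.

```latex
% Assume x_0, ..., x_n have been chosen so that (7.1) and (7.2) hold. Put
\[ y_{n+1} := y - T(x_0 + \cdots + x_n), \qquad \|y_{n+1}\| \le r^{n+1} \quad \text{by (7.2)} . \]
% The hypothesis of the lemma, applied to y_{n+1}, gives x_{n+1} \in E with
\[ \|x_{n+1}\| \le m\,\|y_{n+1}\| \le r^{n+1} m, \]
\[ \|y - T(x_0 + \cdots + x_{n+1})\| = \|y_{n+1} - T x_{n+1}\| \le r\,\|y_{n+1}\| \le r^{n+2} , \]
% which are (7.1) and (7.2) with n replaced by n+1.
```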
7.4 Proposition If E is a Banach space, F a normed space, T ∈ B(E, F) and there are
α, β > 0 such that B^F(0, β) ⊆ \overline{T B^E(0, α)} then the statements (i)-(iv) from Exercise 7.1 hold.
Proof. The hypothesis clearly implies \overline{B^F(0, β)} ⊆ \overline{T B^E(0, α)}. By linearity of T and the
fact that multiplication with a non-zero scalar is a homeomorphism, thus commutes with
the closure, this is equivalent to \overline{B}^F(0, 1) ⊆ \overline{T B^E(0, γ)},⁴⁴ where γ = α/β. Thus for each
y ∈ F, ‖y‖ ≤ 1, ε > 0 there is x ∈ E, ‖x‖ < γ such that ‖Tx − y‖ < ε. This, in turn, is
equivalent to ∀y ∈ F, ε > 0 ∃x ∈ E : ‖x‖ < γ‖y‖, ‖Tx − y‖ < ε‖y‖. (Why?) Since for ε ∈ (0, 1) this precisely
is the hypothesis of Lemma 7.3 (with m = γ, r = ε), we can conclude that every y ∈ F equals Tx for an x ∈ E with
‖x‖ ≤ (γ/(1−ε))‖y‖. This is statement (iv) in Exercise 7.1.
41 Juliusz Schauder (1899-1943). Polish mathematician. Born in Lwow (then Austria-Hungary, now Ukraine). Killed
by the Gestapo.
42 It leads to a quick proof of the Tietze extension theorem in topology, see Appendix A.6.1.
43 Here we are using the axiom DC_ω of countable dependent choice, cf. Appendix A.4.
44 In a normed space E we have \overline{B(x, r)} = \overline{B}(x, r) := {y ∈ E | d(x, y) ≤ r}, but in a metric space ⊇ may fail!
Proof of Theorem 7.2. Since T is surjective, we have
\[ F = TE = \bigcup_{n=1}^{\infty} T B^E(0,n). \]
Since F is a complete metric space and has non-empty interior F° = F ≠ ∅, Corollary A.24 of
Baire’s theorem implies that at least one of the closed sets \overline{T B^E(0, n)} has non-empty interior.
Thus there are n ∈ N, y ∈ F, ε > 0 such that B^F(y, ε) ⊆ \overline{T B^E(0, n)}. If x ∈ B^F(0, ε) then
2x = (y + x) − (y − x), thus 2B^F(0, ε) ⊆ B^F(y, ε) − B^F(y, ε) and thus
\[ B^F(0,\varepsilon) \subseteq \tfrac12\big(B^F(y,\varepsilon) - B^F(y,\varepsilon)\big)
   \subseteq \tfrac12\big(\overline{T B^E(0,n)} - \overline{T B^E(0,n)}\big) \subseteq \overline{T B^E(0,n)} . \]
Thus the hypothesis of Proposition 7.4 is satisfied with α = n, β = ε, and we are done.
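The last inclusion in the chain above, passing from differences of closures back to a closure, can be justified as follows. This is a sketch; the approximating sequences a_k, b_k, u_k, v_k are our notation.

```latex
% Let a, b \in \overline{T B^E(0,n)}. Choose u_k, v_k \in B^E(0,n) with
\[ a_k := T u_k \to a, \qquad b_k := T v_k \to b . \]
% By linearity of T and the triangle inequality,
\[ \tfrac12 (a_k - b_k) = T\big(\tfrac12 (u_k - v_k)\big), \qquad
   \big\|\tfrac12 (u_k - v_k)\big\| \le \tfrac12\big(\|u_k\| + \|v_k\|\big) < n , \]
% so \tfrac12(a_k - b_k) \in T B^E(0,n) for every k, and passing to the limit gives
\[ \tfrac12 (a - b) = \lim_{k\to\infty} \tfrac12 (a_k - b_k) \in \overline{T B^E(0,n)} . \]
```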
7.5 Exercise Let E be a Banach space, F a normed space and A ∈ B(E, F ). Prove: If A is
open then F is complete, thus Banach.
Hint: Combine Exercise 7.1 and the method of proof of Proposition 6.1(iv).
7.6 Remark 1. The preceding exercise shows that A ∈ B(E, F ) is never open if E is a Banach
space and F an incomplete normed space! On the other hand, if E is incomplete, openness of
A ∈ B(E, F ) can fail even if A is bijective. See Exercise 7.23 below.
2. The proof of the OMT involved two applications of DCω , in the proof of the approximation
Lemma 7.3 and via Baire’s theorem. It actually is possible to replace the invocation of Baire’s theorem
by the (weak) uniform boundedness theorem, see [46]. This is interesting since the latter can be proven
using only the axiom of countable choice, cf. Appendix B.4. But replacing DCω by ACω in proving the
approximation Lemma 7.3 seems impossible.
3. Recent claims by various authors (Kesavan, Liebaug and Spindler, Velasco, . . . ) that OMT can be
deduced from the Uniform Boundedness Theorem (next section) do not make much sense. From a logical
point of view, ‘A implies B’ is true for every true statement B: Just ignore A and prove B from scratch.
In order for ‘A implies B’ to have meaning, we must restrict the tools allowed in proving the implication,
as in Theorem B.43, where no use of choice axioms is allowed. The proposed deduction of OMT from
UBT is not of this type since it uses DC_ω, while DC_ω is sufficient to prove OMT without invoking UBT!
(There is another non-rigorous but very common use of ‘A implies B’ in the sense of ‘deducing B from
A is much simpler than proving B without A’. As in: The non-existence of a retraction from a ball to
its boundary implies the Brouwer fixed point theorem. But also this does not apply here.) 2
7.7 Corollary (Bounded Inverse Theorem (BIT), Banach 1929) If E, F are Banach
spaces and T ∈ B(E, F) is a bijection then also T⁻¹ is bounded.
Proof. By Theorem 7.2, T is open. Thus the inverse T⁻¹ that exists by bijectivity (and clearly
is linear) is continuous, thus bounded by Lemma 3.5 or property (iv) from Exercise 7.1.
Proof of Theorem 2.23. The hypothesis implies that id_V : (V, ‖·‖₁) → (V, ‖·‖₂) is a continuous
bijection, thus a homeomorphism by the BIT. Now Lemma 3.5 gives ‖·‖₁ ≤ c′‖·‖₂.
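The constant in the final estimate can be made explicit. The sketch below assumes, as the proof suggests, that V is complete with respect to both norms and that the identity map from (V, ‖·‖₁) to (V, ‖·‖₂) is bounded; the names ι and c′ for the identity map and the constant are ours.

```latex
% Assumption: \iota := id_V : (V, \|\cdot\|_1) \to (V, \|\cdot\|_2) is a bounded bijection of Banach spaces.
% By the BIT, \iota^{-1} : (V, \|\cdot\|_2) \to (V, \|\cdot\|_1) is bounded, hence for all x \in V
\[ \|x\|_1 = \|\iota^{-1} x\|_1 \le \|\iota^{-1}\| \, \|x\|_2 =: c' \, \|x\|_2 . \]
```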
7.9 Remark 1. The BIT is equivalent to the statement that every continuous linear bijection of
Banach spaces is a homeomorphism. This is reminiscent of the statement that every continuous
bijection of compact Hausdorff spaces is a homeomorphism. (The analogy ends when we look
at the generalization: Every continuous map from a compact space to a Hausdorff space is a
closed map. But there are other open mapping theorems, e.g. for topological groups.)
2. The Open Mapping Theorem can be generalized to the case where E is an F -space, i.e.
a TVS admitting a complete translation-invariant metric. See [141, Theorem 2.11]. 2
The OMT and BIT have many applications to questions concerning closed linear subspaces,
their sums and complementedness:
7.13 Exercise Let V be a Banach space and K, L ⊆ V closed linear subspaces. Equip W =
K ⊕ L with the norm ‖(k, l)‖ = ‖k‖ + ‖l‖ (or an equivalent one like max(‖k‖, ‖l‖)).
(i) Prove that the following are equivalent:
(α) K + L ⊆ V is closed.
(β) The (surjective) map + : W → K + L, (k, l) ↦ k + l is open.
(γ) There is a C such that for every y ∈ K + L there are k ∈ K, l ∈ L such that k + l = y
and ‖k‖ + ‖l‖ ≤ C‖y‖.
(ii) If K ∩ L = {0} prove that (α)-(γ) are also equivalent to
(δ) inf{‖k − l‖ | k ∈ K, l ∈ L, ‖k‖ = ‖l‖ = 1} > 0. Thus the unit spheres of K and L
have positive distance (which they cannot have if K ∩ L ≠ {0}).
7.18 Exercise (i) Let V be a Banach space and W ⊆ V a closed subspace. Prove that the
quotient map p : V → V /W has a bounded section if and only if W is complemented.
(ii) If V, W are Banach spaces and A ∈ B(V, W ), prove that A has a bounded right inverse if
and only if A is surjective and ker A ⊆ V is complemented.
7.19 Exercise Let V, W be Banach spaces and A ∈ B(V, W ). Prove that A has a bounded
left inverse if and only if A is injective and AV ⊆ W is complemented.
7.20 Remark Since all closed subspaces of Hilbert spaces are complemented, we find that a
bounded map between Hilbert spaces has a bounded right (resp. left) inverse if and only if it is
surjective (resp. injective with closed image). 2
7.21 Exercise (i) Let V be an infinite-dimensional Banach space. Prove that all finite-dimensional
subspaces have empty interior (in V!), then use Baire’s theorem to prove that
V cannot have a countable Hamel basis. (Thus dim V > ℵ₀ = #N.)
(ii) Let V be an infinite-dimensional Banach space and {x_n}, {ϕ_n} sequences as in Exercise
9.17. For every subset N ⊆ N define x_N = Σ_{n∈N} 2^{−n} x_n. Now use Lemma B.24 to find a linearly
independent family in V of cardinality c = #R, so that dim V ≥ c (Hamel dimension).
(iii) Prove that every separable normed space V has cardinality at most c and deduce dim V ≤ c.
(iv) Conclude that every infinite-dimensional separable Banach space has Hamel dimension c.
7.22 Remark 1. The result of Exercise 7.21(i) can be proven using Riesz’ Lemma 12.2 instead
of Baire’s theorem, see [10], but (as in most such cases) the proof uses countable dependent
choice DCω like the proof of Baire’s theorem.
2. If the continuum hypothesis (CH) is true, Exercise 7.21 (ii) readily follows from (i)+(iii).
But the proof of (ii) indicated above is independent of CH. 2
7.23 Exercise Give counterexamples showing that both spaces appearing in the Bounded
Inverse Theorem must be complete.
Hint: For complete E, incomplete F use ℓ^p spaces, and for E incomplete, F complete use
F = ℓ¹(N, R) and the fact that it has Hamel dimension c = #R, cf. Exercise 7.21(iv).
7.25 Lemma Let E, F be normed spaces and T : E → F a linear map (not assumed bounded).
Then the following are equivalent:
(i) The graph G(T ) = {(x, T x) | x ∈ E} ⊆ E ⊕ F of T is closed.
(ii) Whenever {xn }n∈N ⊆ E is a sequence such that xn → x ∈ E and T xn → y ∈ F , we have
y = T x.
Proof. Since E ⊕ F is a metric space, G(T ) is closed if and only if it contains the limit (x, y)
of every sequence {(xn , yn )} in G(T ) that converges to some (x, y) ∈ E ⊕ F . But a sequence in
G(T ) is of the form {(xn , T xn )}, and (x, y) ∈ G(T ) ⇔ y = T x.
7.26 Remark Operators with closed graph (in particular unbounded ones) are often called
closed. But this must not be confused with their closedness as a map, i.e. the property of
sending closed sets to closed sets! Bounded linear operators between Banach spaces have closed
graphs, but need not be closed maps. 2
7.27 Theorem (Banach 1929) If E, F are Banach spaces, then a linear map T : E → F is
bounded if and only if its graph is closed.
Proof. Let E, F be Banach spaces, and let T : E → F be linear. If T is bounded then
it is continuous, thus G(T) is closed by Exercise 7.24. Now assume that G(T) is closed. The
Cartesian product E ⊕ F with norm ‖(e, f)‖ = ‖e‖ + ‖f‖ is a Banach space. The linear subspace
G(T) ⊆ E ⊕ F is closed by assumption, thus a Banach space. Since the projection p₁ : G(T) → E
is a bounded bijection, by Corollary 7.7 it has a bounded inverse p₁⁻¹ : E → G(T). Then also
T = p₂ ∘ p₁⁻¹ is bounded.
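The resulting norm bound on T can be spelled out. The sketch assumes the norm ‖(e, f)‖ = ‖e‖ + ‖f‖ on E ⊕ F from the proof and writes p₁, p₂ for the restrictions to G(T) of the coordinate projections onto E and F.

```latex
% The inverse of p_1 sends x to the point of the graph lying over x:
\[ p_1^{-1}(x) = (x, Tx), \qquad \|p_1^{-1}(x)\| = \|x\| + \|Tx\| . \]
% Boundedness of p_1^{-1} (from the BIT) therefore yields, for all x \in E,
\[ \|Tx\| \le \|x\| + \|Tx\| = \|p_1^{-1}(x)\| \le \|p_1^{-1}\| \, \|x\| , \]
% i.e. \|T\| \le \|p_1^{-1}\|.
```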
7.28 Exercise Show that the Bounded Inverse Theorem (Corollary 7.7) can be deduced from
the Closed Graph Theorem. (Thus the three main results of this section are ‘equivalent’.)
We discuss a few typical applications of the CGT. (For another one see Exercise 8.5.)
7.29 Exercise Let A : E → F be a linear map of Banach spaces. Prove that A is bounded if
and only if the following holds: If {xn }n∈N ⊆ E is such that xn → 0 and Axn → y then y = 0.
7.30 Exercise Let V be a Banach space over F, p ∈ [1, ∞] and {x_n}_{n∈N} a sequence in V such
that for every ϕ ∈ V* the sequence {ϕ(x_n)}_{n∈N} is in ℓ^p(N, F). Thus there is a (clearly linear)
map A : V* → ℓ^p(N, F), ϕ ↦ {ϕ(x_n)}_{n∈N} (called the analysis map). Prove that A is bounded,
thus A ∈ B(V*, ℓ^p(N, F)).
7.31 Exercise Let H be a Hilbert space and {z_n}_{n∈N} ⊆ H a sequence such that for each
f ∈ ℓ²(N, C) there exists an x ∈ H satisfying ⟨x, z_n⟩ = f(n) ∀n. Prove that there exists D such
that for each f ∈ ℓ²(N, C) there exists an x ∈ H satisfying ⟨x, z_n⟩ = f(n) ∀n and ‖x‖ ≤ D‖f‖₂.
Hint: Put N = {z₁, z₂, . . .}^⊥, construct a map ℓ²(N, C) → H/N and prove its boundedness.
implying Ax = y. Thus A has closed graph and therefore is bounded by Theorem 7.27. The
proof for B is analogous. (ii) is just the special case H = K, A = B of (i).
45 Ernst David Hellinger (1883-1950), Otto Toeplitz (1881-1940). German mathematicians. Both were forced into
exile in 1939. See also the T.-Hausdorff Theorem B.143.
7.33 Remark The Hellinger-Toeplitz Theorem shows that on a Hilbert space H there are no
unbounded linear operators A : H → H satisfying hAx, yi = hx, Ayi ∀x, y. This is a typical
example of a ‘no-go-theorem’. Occasionally such results are a nuisance. After all, the operator
of multiplication by n on ℓ²(N) ‘obviously’ is self-adjoint. What Hellinger-Toeplitz really says
is that such an operator cannot be defined everywhere, i.e. on all of H. This leads to the notion
of symmetric operators, and also illustrates that no-go theorems often can be circumvented by
generalizing the setting. This is the case here, since the Hellinger-Toeplitz theorem only applies
to operators that are defined everywhere. 2
In particular, A is bounded below if and only if its (set-theoretic) inverse A⁻¹ is bounded.
Proof. Using the bijectivity of x ↦ Ax, we have
\[ \|A^{-1}\| = \sup_{y\in W\setminus\{0\}} \frac{\|A^{-1}y\|}{\|y\|}
   = \sup_{x\in V\setminus\{0\}} \frac{\|x\|}{\|Ax\|}
   = \Big(\inf_{x\in V\setminus\{0\}} \frac{\|Ax\|}{\|x\|}\Big)^{-1}
   = \Big(\inf_{\|x\|=1} \|Ax\|\Big)^{-1} . \]
46 This terminology clashes with another one according to which a self-adjoint operator A is bounded below if
σ(A) ⊆ [c, ∞) for some c ∈ R. Since we consider only bounded operators, we’ll have no use for this notion. The
problem could be avoided by writing ‘bounded away from zero’, as some authors do, but this is a bit tedious.
The second statement follows immediately.
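A finite-dimensional example makes the relation between ‖A⁻¹‖ and the infimum of ‖Ax‖ over the unit sphere concrete. The choice of the diagonal matrix and of the Euclidean norm on F² is ours and purely illustrative.

```latex
% Illustrative assumption: V = W = F^2 with the Euclidean norm, A = diag(2, 3).
\[ \|Ax\|^2 = 4|x_1|^2 + 9|x_2|^2 \ge 4\,\|x\|^2, \quad \text{with equality for } x = e_1,
   \qquad\text{so}\qquad \inf_{\|x\|=1} \|Ax\| = 2 . \]
% On the other hand A^{-1} = diag(1/2, 1/3), whose norm is its largest diagonal entry:
\[ \|A^{-1}\| = \tfrac12 = \Big(\inf_{\|x\|=1} \|Ax\|\Big)^{-1} . \]
```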
Recall that the image AV ⊆ W of a linear map A : V → W need not be closed (if V is
infinite-dimensional). The following generalizes Corollary 3.23:
7.40 Definition If V, W are normed spaces then A ∈ B(V, W ) is called invertible if there is a
B ∈ B(W, V ) such that BA = idV and AB = idW .
7.41 Proposition Let V, W be Banach spaces and A ∈ B(V, W). Then the following are
equivalent:
(i) A is invertible.
(ii) A is injective and surjective.
(iii) A is bounded below and has dense image.
Proof. (i)⇒(ii)+(iii). It is clear that invertibility implies injectivity and surjectivity, thus in
particular dense image. Since A⁻¹ is bounded, Lemma 7.38 gives that A is bounded below.
(ii)⇒(i) The set-theoretic inverse, which exists by bijectivity and clearly is linear, is bounded by
the bounded inverse theorem (Corollary 7.7). Thus A is invertible.
(iii)⇒(i). By boundedness below, A is injective. And AV ⊆ W is dense by assumption and
closed by Lemma 7.39, thus AV = W. Thus A is injective and surjective. Now boundedness of
the inverse A⁻¹ follows from boundedness below of A and Lemma 7.38. Note: BIT not used.
7.42 Remark 1. Note that dense image is weaker than surjectivity, while boundedness below
is stronger than injectivity. The point of criterion (iii) is that it can be quite hard to verify
surjectivity of A directly, while density of the image usually is easier to establish.
2. The material on bounded below maps discussed so far, including (i)⇔(iii) in Proposition
7.41, was entirely elementary and could be moved to Section 3. 2
7.44 Exercise Let V be an infinite dimensional Banach space and A ∈ B(V ) injective.
(i) Putting W = V ⊕ V with norm k(a, b)k = kak + kbk, prove that the subspaces
7.45 Exercise Let H be a Hilbert space and A ∈ B(H) such that |⟨Ax, x⟩| ≥ C‖x‖² for all x ∈ H and some
C > 0. Prove that A is invertible and ‖A⁻¹‖ ≤ C⁻¹. (Such an A is called elliptic or coercive.)
7.46 Exercise Let V be a Banach space. Prove that {A ∈ B(V ) | A bounded below } ⊆ B(V )
is open.
8.1 Definition Let E, F be normed spaces and ℱ ⊆ B(E, F) a family of bounded linear maps.
• ℱ is called pointwise bounded if sup_{A∈ℱ} ‖Ax‖ < ∞ for each x ∈ E.
• ℱ is called uniformly bounded if sup_{A∈ℱ} ‖A‖ < ∞.
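The elementary relation between the two notions rests on the basic estimate ‖Ax‖ ≤ ‖A‖‖x‖ for bounded operators. A short sketch, writing \mathcal{F} for the family of operators and M for its uniform bound (the name M is ours):

```latex
% Assume the family \mathcal{F} is uniformly bounded, M := \sup_{A \in \mathcal{F}} \|A\| < \infty.
% Then for every fixed x \in E:
\[ \sup_{A\in\mathcal{F}} \|Ax\| \le \sup_{A\in\mathcal{F}} \|A\| \, \|x\| = M\,\|x\| < \infty . \]
```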
That uniform boundedness of ℱ implies pointwise boundedness is trivial, but this is not:
8.2 Theorem Let E be a Banach space, F a normed space and ℱ ⊆ B(E, F). Then:
(i) Either ℱ is uniformly bounded or the set {x ∈ E | sup_{A∈ℱ} ‖Ax‖ = ∞} ⊆ E is dense G_δ⁴⁷.
(ii) [Helly 1912, Hahn, Banach 1922] If ℱ is pointwise bounded then it is uniformly bounded.
Proof. (i) The map F → R_{≥0}, x ↦ ‖x‖ is continuous and each A ∈ ℱ is bounded, thus
continuous. Therefore the map f_A : E → R_{≥0}, x ↦ ‖Ax‖ is continuous for every A ∈ ℱ.
For each n ∈ N define
\[ V_n = \{x \in E \mid \sup_{A\in\mathcal{F}} \|Ax\| > n\} = \bigcup_{A\in\mathcal{F}} f_A^{-1}((n,\infty)), \]
which is open by continuity of the functions f_A.
Thus X = ⋂_{n∈N} V_n is G_δ. And X = {x ∈ E | sup_{A∈ℱ} ‖Ax‖ = ∞} in view of the definition
of the V_n. Now Baire’s Theorem A.23 implies that X is dense if all the V_n are.
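On the other hand, if V_n is non-dense for some n ∈ N, there exist x₀ ∈ E and r > 0 such
that B(x₀, r) ∩ V_n = ∅. This means sup_{A∈ℱ} ‖A(x₀ + x)‖ ≤ n for all x ∈ E with ‖x‖ < r. With
x = (x₀ + x) − x₀ and the triangle inequality we have
\[ \|Ax\| \le \|A(x_0 + x)\| + \|Ax_0\| \le 2n \qquad \forall A \in \mathcal{F},\ \|x\| < r, \]
so that ‖A‖ ≤ 2n/r for all A ∈ ℱ, i.e. ℱ is uniformly bounded.
47 A G_δ set in a topological space is an intersection of countably many open sets.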
8.3 Remark 1. We call (i) the strong and (ii) the weak version of the Uniform Bounded-
ness Theorem, respectively. Mystifyingly, most expositions of uniform boundedness use Baire’s
theorem to prove the weak version without pointing out that the proof actually gives a much
stronger result. One of the few exceptions is [140], the source of the elegant proof given above.
2. There is an incessant stream of publications purporting to give ‘elementary’ proofs (i.e.
avoiding Baire) of Theorem 8.2(ii), but all of them (except [53]) use, usually without acknowl-
edging it, the axiom DCω of countable dependent choice which, however, is logically equivalent
over ZF to Baire’s theorem! (See [17].) This also holds for [154], but the proof given there
admits a tiny modification, discovered quite recently [53], that really only uses countable choice
ACω . See Appendix B.4 for the beautiful argument (employing a version of the ‘method of
the gliding hump’) that really is more elementary, in the precise sense of reverse mathematics,
which concerns itself with the weakest axioms needed to prove desired results. (Cf. [158] for an
engaging introduction.) 2
8.4 Exercise (i) Deduce Theorem 8.2(ii) from its special case where E, F are both assumed
complete.
(ii) Show that the completeness assumption on E in Theorem 8.2(ii) cannot be omitted.
The weak version of the UBT can also be deduced from the closed graph theorem:⁴⁸
8.5 Exercise Let E, F be Banach spaces and ℱ ⊆ B(E, F) a pointwise bounded family. Use
the Closed Graph Theorem to prove that ℱ is uniformly bounded, as follows:
(i) Prove that F_ℱ = {{y_A}_{A∈ℱ} ∈ F^ℱ = Fun(ℱ, F) | sup_{A∈ℱ} ‖y_A‖ < ∞} is a Banach space.
(ii) Show that pointwise boundedness of ℱ is equivalent to TE ⊆ F_ℱ, where T : E → F^ℱ, x ↦ {Ax}_{A∈ℱ}.
(iii) Prove that the graph G(T) ⊆ E ⊕ F_ℱ of T is closed. (Thus T is bounded by Theorem
7.27.)
(iv) Deduce uniform boundedness of ℱ from the boundedness of T.
(v) Remove the requirement that F be complete.
8.7 Exercise Give a proof of the Hellinger-Toeplitz Theorem 7.32 using Theorem 8.2(ii) instead
of the Closed Graph Theorem 7.27. (This is interesting since it shows that Hellinger-Toeplitz,
too, depends only on the axiom ACω of countable choice.)
8.8 Definition Let E, F be normed spaces. A sequence (or net) {An } ⊆ B(E, F ) is strongly
convergent if limn→∞ An x exists for every x ∈ E.
Under the above assumption, the map A : E → F, x ↦ lim_{n→∞} A_n x is easily seen to be
linear. Now we write $A_n \xrightarrow{s} A$ or A = s-lim A_n.
8.11 Exercise Let H be a Hilbert space and {e_1, e_2, . . .} an orthonormal sequence in H (not
necessarily an ONB). Define ϕ_n ∈ H∗ = B(H, F) by ϕ_n(x) = ⟨x, e_n⟩. Prove that ϕ_n → 0
strongly, but not in norm.
8.12 Exercise Let 1 ≤ p < ∞ and V = ℓ^p(N, F). For each m ∈ N define P_m ∈ B(V) by
(P_m f)(n) = f(n) for n ≥ m and (P_m f)(n) = 0 if n < m. Prove $P_m \xrightarrow{s} 0$, but ‖P_m‖ = 1 ∀m,
thus P_m does not converge to 0 in norm.
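The behaviour described in Exercise 8.12 can be observed numerically. The following is a minimal sketch, not a proof: it assumes p = 2, truncates the sequence f(n) = 1/n to finitely many terms, and uses the hypothetical helper names `p_norm`, `tail_projection` and `delta`. It shows ‖P_m f‖ shrinking for this one f, while each P_m fixes the unit vector δ_m, so that ‖P_m‖ ≥ 1.

```python
def p_norm(f, p):
    """l^p norm of a finitely supported sequence given as a list."""
    return sum(abs(t) ** p for t in f) ** (1.0 / p)

def tail_projection(f, m):
    """(P_m f)(n) = f(n) for n >= m and 0 for n < m (indices start at 1)."""
    return [t if n >= m else 0.0 for n, t in enumerate(f, start=1)]

def delta(m, length):
    """Unit vector delta_m, truncated to the given length."""
    return [1.0 if n == m else 0.0 for n in range(1, length + 1)]

p = 2          # assumption: any 1 <= p < infinity would do
N = 2000       # truncation length (an artefact of the finite computation)
f = [1.0 / n for n in range(1, N + 1)]   # truncation of f(n) = 1/n, which lies in l^2

ms = (1, 10, 100, 1000)
# ||P_m f|| decreases towards 0 as m grows (strong convergence at this particular f)
tail_norms = [p_norm(tail_projection(f, m), p) for m in ms]
# ||P_m delta_m|| = 1 for every m, so the operator norms do not tend to 0
fixed_norms = [p_norm(tail_projection(delta(m, N), m), p) for m in ms]

for m, a, b in zip(ms, tail_norms, fixed_norms):
    print(m, a, b)
```

The point of the second list is that the witness δ_m moves with m, which is exactly why pointwise (strong) convergence does not upgrade to norm convergence here.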
8.13 Exercise Let V be a separable Banach space and B ⊆ B(V ) a bounded subset.
(i) Prove: If S ⊆ V is dense and a net {Aι} ⊆ B satisfies ‖Aι x‖ → 0 for all x ∈ S then
‖Aι x‖ → 0 for all x ∈ V, thus Aι → 0 in the strong operator topology.
(ii) Prove that the topological space (B, τsot ) is metrizable.
(iii) BONUS: Prove that (B(V), τ_sot) is not metrizable if V is infinite-dimensional.
49
Hugo Steinhaus (1887-1972). Polish mathematician
50
In the literature, one can find either this result or Theorem 8.2(ii) denoted as ‘Banach-Steinhaus theorem’.
8.3 Appl. of the strong UBT: Many continuous functions with
divergent Fourier series
For the preceding applications of the uniform boundedness theorem, the weak version was
sufficient. Other applications use the contraposition for a (non-constructive) existence proof:
If F ⊆ B(E, F ) is not uniformly bounded then it is not pointwise bounded, thus there exists
x ∈ E with supA∈F kAxk = ∞. For such applications, Theorem 8.2(i) is a definite improvement.
Let f : R → C be 2π-periodic, i.e. f(x + 2π) = f(x) ∀x, and integrable over finite intervals.
Define
\[ c_n(f) = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,e^{-inx}\,dx \tag{8.1} \]
and
\[ S_n(f)(x) = \sum_{k=-n}^{n} c_k(f)\,e^{ikx}, \qquad n \in \mathbb{N}. \tag{8.2} \]
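To make the definitions (8.1) and (8.2) concrete, here is a small numerical sketch. It is only an illustration under stated assumptions: the integral in (8.1) is replaced by an equally spaced Riemann sum, the function names `fourier_coefficient` and `partial_sum` are hypothetical, and the test function is a trigonometric polynomial, for which S_n(f) = f as soon as n is at least its degree.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=2048):
    """Approximate c_n(f) = (1/2pi) * int_0^{2pi} f(x) e^{-inx} dx by a Riemann sum."""
    h = 2 * math.pi / samples
    return sum(f(j * h) * cmath.exp(-1j * n * j * h) for j in range(samples)) / samples

def partial_sum(f, n, x, samples=2048):
    """S_n(f)(x) = sum_{k=-n}^{n} c_k(f) e^{ikx}, with c_k approximated as above."""
    return sum(fourier_coefficient(f, k, samples) * cmath.exp(1j * k * x)
               for k in range(-n, n + 1))

def f(x):
    """Example input: a trigonometric polynomial of degree 3."""
    return math.cos(x) + 0.5 * math.sin(3 * x)

c1 = fourier_coefficient(f, 1)   # cos x contributes c_1 = 1/2
c3 = fourier_coefficient(f, 3)   # 0.5 sin 3x contributes c_3 = 0.5/(2i) = -0.25i
s3 = partial_sum(f, 3, 0.7)      # S_3(f) reproduces f, since deg f = 3

print(c1, c3, s3, f(0.7))
```

For a general continuous f the Riemann sum is only an approximation of c_n(f), and the convergence of S_n(f)(x) is precisely the question discussed in this section.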
The fundamental problem of the theory of Fourier series is to find conditions for the convergence
S_n(f)(x) → f(x) as n → ∞, where convergence can be understood as (possibly almost)
everywhere pointwise or w.r.t. some norm, like ‖·‖_2 (as in Example 5.49) or ‖·‖_∞. Here we
will discuss only continuous functions and we identify continuous 2π-periodic functions with
continuous functions on S^1. It is not hard to show that S_n(f)(x) → f(x) if f is differentiable
at x (or just Hölder continuous: |f(x′) − f(x)| ≤ C|x′ − x|^D with C, D > 0 for x′ near x) and
that convergence is uniform when f is continuously differentiable (or the Hölder condition holds
uniformly in x, x′). (See any number of books on Fourier analysis, e.g. [157, 83].)
Assuming only continuity of f one can still prove that limn→∞ Sn (f )(x) = f (x) if the limit
exists, but there actually exist continuous functions f such that Sn (f )(x) diverges at some
x. Such functions were first constructed in the 1870s using ‘condensation of singularities’,
a relative and precursor of the gliding hump method, cf. Appendix B.3.5. Nowadays, most
textbook presentations of such functions are based on Lemma 8.15 below combined with either
the uniform boundedness theorem or constructions ‘by hand’, see e.g. [83, Section II.2], that
are quite close in spirit to the uniform boundedness method.
However, individual examples of continuous functions with Fourier series divergent in a point
can be produced in a totally constructive fashion, avoiding all choice axioms! (See [109] for a
very classical example.) But using non-constructive arguments seems unavoidable if one wants
to prove that there are many such functions as in the following:
8.14 Theorem There is a subset X ⊆ C(S^1) that is dense Gδ (in the ‖·‖_∞-topology) such
that the Fourier series {S_n(f)(0)}_{n∈N} diverges for each f ∈ X.
Proof. Inserting (8.1) into (8.2) we obtain
\[ S_n(f)(x) = \frac{1}{2\pi}\sum_{k=-n}^{n} e^{ikx}\int_0^{2\pi} f(t)\,e^{-ikt}\,dt = \frac{1}{2\pi}\int_0^{2\pi} f(t)\Big(\sum_{k=-n}^{n} e^{ik(x-t)}\Big)dt = (D_n \star f)(x), \]
where ⋆ denotes convolution, defined for 2π-periodic f, g by $(f \star g)(x) = \frac{1}{2\pi}\int_0^{2\pi} f(t)\,g(x-t)\,dt$,
and
\[ D_n(x) := \sum_{k=-n}^{n} e^{ikx} = \frac{\sin\big((n+\tfrac12)x\big)}{\sin\tfrac{x}{2}} \]
is the Dirichlet kernel. The quickest way to check the last identity is the ‘telescoping’ calculation
\[ (e^{ix/2} - e^{-ix/2})\,D_n(x) = \sum_{k=-n}^{n}\big(e^{ix(k+1/2)} - e^{ix(k-1/2)}\big) = e^{ix(n+1/2)} - e^{-ix(n+1/2)}, \]
together with e^{ix} − e^{−ix} = 2i sin x. Since D_n(x) is an even function, we have
\[ \varphi_n(f) := S_n(f)(0) = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,D_n(x)\,dx. \]
It is clear that the norm of the map ϕ_n : (C(S^1), ‖·‖_∞) → C is bounded above by ‖D_n‖_1.
For g_n(x) = sgn(D_n(x)) we have $\varphi_n(g_n) = (2\pi)^{-1}\int_0^{2\pi} |D_n(x)|\,dx =: \|D_n\|_1$. While g_n is not
continuous, we can find a sequence of continuous g_{n,m} bounded by 1 such that g_{n,m} → g_n
pointwise as m → ∞. Now Lebesgue’s dominated convergence theorem implies ϕ_n(g_{n,m}) → ϕ_n(g_n) =
‖D_n‖_1, which implies ‖ϕ_n‖ = ‖D_n‖_1. By Lemma 8.15 below, ‖D_n‖_1 → ∞ as n → ∞. Thus the
family F = {ϕ_n} ⊆ B(C(S^1), C) is not uniformly bounded. Now Theorem 8.2(i) implies that
the set X = {f ∈ C(S^1, C) | {S_n(f)(0)} is unbounded} is dense Gδ.
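The closed form of the Dirichlet kernel used in the proof can be checked numerically against its defining sum. This is a sanity check at a handful of sample points, not a substitute for the telescoping argument; the function names `dirichlet_sum` and `dirichlet_closed` are chosen here for illustration.

```python
import cmath
import math

def dirichlet_sum(n, x):
    """D_n(x) as the defining sum of e^{ikx} over k = -n..n (the sum is real)."""
    return sum(cmath.exp(1j * k * x) for k in range(-n, n + 1)).real

def dirichlet_closed(n, x):
    """Closed form sin((n + 1/2) x) / sin(x / 2), valid when x is not a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)

# Largest discrepancy between the two expressions over a few sample points
max_error = max(abs(dirichlet_sum(n, x) - dirichlet_closed(n, x))
                for n in (1, 5, 20)
                for x in (0.3, 1.0, 2.5, -1.7))

# At x = 0 every summand equals 1, so D_n(0) = 2n + 1
value_at_zero = dirichlet_sum(7, 0.0)

print(max_error, value_at_zero)
```

The value D_n(0) = 2n + 1 already shows that the kernels are unbounded in the sup norm; the proof, however, needs the growth of the L^1 norms.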
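8.15 Lemma We have ‖D_n‖_1 ≥ (4/π²) log n for all n ∈ N.
Proof. Using |sin x| ≤ |x| for all x ∈ R, we compute
\[
\begin{aligned}
\|D_n\|_1 &= \frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(x)|\,dx \;\ge\; \frac{2}{\pi}\int_0^{\pi} \Big|\sin\Big(\big(n+\tfrac12\big)x\Big)\Big|\,\frac{dx}{x} \\
&= \frac{2}{\pi}\int_0^{(n+1/2)\pi} |\sin x|\,\frac{dx}{x} \;\ge\; \frac{2}{\pi}\sum_{k=1}^{n}\int_{(k-1)\pi}^{k\pi} \frac{|\sin x|}{x}\,dx \\
&\ge \frac{2}{\pi}\sum_{k=1}^{n}\frac{1}{k\pi}\int_0^{\pi}\sin x\,dx \;=\; \frac{4}{\pi^2}\sum_{k=1}^{n}\frac{1}{k} \;\ge\; \frac{4}{\pi^2}\log n,
\end{aligned}
\]
where we used $\sum_{k=1}^{n} 1/k \ge \int_1^{n+1} dx/x = \log(n+1) > \log n$.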
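The logarithmic lower bound of Lemma 8.15 can be compared with numerically computed values of ‖D_n‖_1. The sketch below approximates the integral by a midpoint rule; the sample count and the names `dirichlet` and `l1_norm_dirichlet` are assumptions made for this illustration, and the printed numbers are approximations only.

```python
import math

def dirichlet(n, x):
    """Dirichlet kernel via its closed form, with the limit value 2n+1 where sin(x/2) vanishes."""
    s = math.sin(x / 2)
    if abs(s) < 1e-12:
        return 2 * n + 1
    return math.sin((n + 0.5) * x) / s

def l1_norm_dirichlet(n, samples=20000):
    """Midpoint-rule approximation of (1/2pi) * int_{-pi}^{pi} |D_n(x)| dx."""
    h = 2 * math.pi / samples
    total = sum(abs(dirichlet(n, -math.pi + (j + 0.5) * h)) for j in range(samples))
    return total * h / (2 * math.pi)

ns = (1, 2, 4, 8, 16, 32, 64)
norms = {n: l1_norm_dirichlet(n) for n in ns}
bounds = {n: 4 / math.pi ** 2 * math.log(n) for n in ns}

for n in ns:
    print(n, round(norms[n], 4), round(bounds[n], 4))
```

For n = 1 the kernel is 1 + 2 cos x, and the integral can be evaluated by hand to 1/3 + 2√3/π, which gives an independent check of the quadrature.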
8.16 Remark Also the Bounded Inverse Theorem has an interesting application to Fourier
analysis: For f ∈ L^1([0, 2π]), we define the Fourier coefficients $\hat f(n) = (2\pi)^{-1}\int_0^{2\pi} f(t)\,e^{-int}\,dt$
for all n ∈ Z. It is immediate that ‖f̂‖_∞ ≤ ‖f‖_1, and it is not hard to prove the Riemann-Lebesgue
theorem f̂ ∈ c_0(Z, C) and injectivity of the resulting map L^1([0, 2π]) → c_0(Z, C), f ↦ f̂, see
e.g. [140, Theorem 5.15] or [83]. If this map were surjective, the Bounded Inverse Theorem
would give ‖f‖_1 ≤ C‖f̂‖_∞ ∀f ∈ L^1([0, 2π]). For the Dirichlet kernel it is immediate that
D̂_n(m) = χ_{[−n,n]}(m), thus ‖D̂_n‖_∞ = 1 for all n ∈ N. Since we know that ‖D_n‖_1 → ∞, we
would have a contradiction. Thus L^1([0, 2π]) → c_0(Z, C), f ↦ f̂ is not surjective. □
(which comes in many versions)52 is to show that all Banach spaces admit many bounded linear
functionals.
9.2 Theorem Let V be a real vector space and p : V → R a sublinear function. Let W ⊆ V
be a linear subspace and ϕ : W → R a linear functional such that ϕ(w) ≤ p(w) for all w ∈ W.
Then there is a linear functional ϕ̂ : V → R such that ϕ̂|_W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ V.
The heart of the proof is the special case where W has codimension one:
9.3 Lemma Let V, p, W, ϕ be as in Theorem 9.2 and v′ ∈ V. Then there is a linear functional
ϕ̂ : Y = W + Rv′ → R such that ϕ̂|_W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ Y.
Proof. If v′ ∈ W, there is nothing to do, so we may assume v′ ∈ V\W. Then every
x ∈ W + Rv′ can be written as x = w + cv′ with unique w ∈ W, c ∈ R. Thus if d ∈ R, we can
define ϕ̂ : W + Rv′ → R by w + cv′ ↦ ϕ(w) + cd for all w ∈ W and c ∈ R. Since ϕ̂ is linear and
trivially satisfies ϕ̂|_W = ϕ, it remains to show that d can be chosen such that ϕ̂ ≤ p holds on
Y = W + Rv′, to wit
\[ \hat\varphi(w + cv') = \varphi(w) + cd \le p(w + cv') \qquad \forall w \in W,\ c \in \mathbb{R}. \tag{9.1} \]
For c = 0, this holds by assumption. If (9.1) holds for all w ∈ W and c ∈ R then in particular
\[ \varphi(w) \pm d \le p(w \pm v') \qquad \forall w \in W. \tag{9.2} \]
And if (9.2) holds then by linearity of ϕ and positive homogeneity of p, for all e > 0 we have
\[ \hat\varphi(w \pm ev') = e\,\hat\varphi(e^{-1}w \pm v') \le e\,p(e^{-1}w \pm v') = p(w \pm ev'), \]
thus the desired inequality (9.1) holds for all w ∈ W, c ∈ R. Now (9.2) is equivalent to
\[ \varphi(w) - p(w - v') \le d \le p(w' + v') - \varphi(w') \qquad \forall w, w' \in W. \]
Clearly this is possible if and only if ϕ(w) − p(w − v′) ≤ p(w′ + v′) − ϕ(w′) for all w, w′ ∈ W,
which in turn is equivalent to ϕ(w) + ϕ(w′) ≤ p(w − v′) + p(w′ + v′) ∀w, w′. The latter inequality
is indeed satisfied for all w, w′ ∈ W since w + w′ ∈ W, so that
\[ \varphi(w) + \varphi(w') = \varphi(w + w') \le p(w + w') = p\big((w - v') + (w' + v')\big) \le p(w - v') + p(w' + v'). \]
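The one-step extension in Lemma 9.3 amounts to choosing the number d between a supremum and an infimum taken over W. The following sketch works this out in a toy case; everything in it is an assumption made for illustration: V = R², p is the ℓ¹ norm, W is the first coordinate axis with ϕ(t, 0) = t, v′ = (0, 1), and the supremum and infimum are only sampled on a finite grid.

```python
def p(v):
    """Sublinear function on R^2: here the l^1 norm (an illustrative choice)."""
    return abs(v[0]) + abs(v[1])

def phi(t):
    """phi on W = {(t, 0)}: phi(t, 0) = t, which satisfies phi <= p on W."""
    return t

v_prime = (0.0, 1.0)                        # the new direction v'
grid = [k * 0.5 for k in range(-40, 41)]    # sample parameters t for w = (t, 0)

# Admissible values of d lie between these two quantities:
#   sup_w  phi(w) - p(w - v')   and   inf_w'  p(w' + v') - phi(w')
lower = max(phi(t) - p((t - v_prime[0], -v_prime[1])) for t in grid)
upper = min(p((t + v_prime[0], v_prime[1])) - phi(t) for t in grid)

def extension(d):
    """The extension w + c v' -> phi(w) + c d, written in coordinates v = (t, c)."""
    return lambda v: phi(v[0]) + v[1] * d

def is_dominated(d):
    """Check the bound (extension <= p) on the sample grid only."""
    ext = extension(d)
    return all(ext((t, c)) <= p((t, c)) + 1e-12 for t in grid for c in grid)

print(lower, upper, is_dominated(0.3), is_dominated(1.5))
```

In this example every d in [−1, 1] works, which also illustrates that the extension is in general far from unique.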
52
Important early results are due to Eduard Helly (1884-1943), another Austrian mathematician. See [101, p. 54-55].
If n = dim(V/W) < ∞, proving the theorem amounts to applying the Lemma n times. But
otherwise an infinite inductive procedure is required. This can be formalized via ‘transfinite
induction’, but it is easier to invoke Zorn’s lemma, as in the proof of Proposition 5.36:
Proof of Theorem 9.2. If W = V, there is nothing to do, so assume W ⊊ V. Let E be the set
of pairs (Z, ψ), where Z ⊆ V is a linear subspace containing W and ψ : Z → R is a linear
map extending ϕ such that ψ(z) ≤ p(z) ∀z ∈ Z. Since W ≠ V, Lemma 9.3 implies E ≠ ∅.
We define a partial ordering on E by (Z, ψ) ≤ (Z′, ψ′) ⇔ Z ⊆ Z′, ψ′|_Z = ψ. If C ⊆ E is
a chain, i.e. totally ordered by ≤, let Y = ⋃_{(Z,ψ)∈C} Z and define ψ_Y : Y → R by ψ_Y(v) = ψ(v)
for any (Z, ψ) ∈ C with v ∈ Z. This clearly is consistent and gives a linear map. Now (Y, ψ_Y) is
an element of E and an upper bound for C. Thus by Zorn’s lemma there is a maximal element
(Y_M, ψ_M) of E. Now ψ_M : Y_M → R is an extension of ϕ satisfying ψ_M(y) ≤ p(y) for all y ∈ Y_M,
so we are done if we prove Y_M = V. If this is not the case, we can pick v′ ∈ V\Y_M and use
Lemma 9.3 to extend ψ_M to Y_M + Rv′, but this contradicts the maximality of (Y_M, ψ_M).
9.4 Remark The above proof is even more non-constructive than the preceding ones in that
it uses Zorn’s lemma, which is equivalent to the Axiom of Choice (AC)53 . If V is a separable
normed space, we can replace AC by the weaker axiom DCω (countable dependent choice), cf.
Exercise 9.8. But even in the generality of all Banach spaces there is the seldom cited fact that
the Hahn-Banach theorem can be deduced over ZF from the restriction of Tychonov’s theorem
to Hausdorff spaces, which is strictly weaker than AC. See Appendix B.6.1. □
9.5 Theorem (Hahn-Banach Theorem (1927/9)) Let V be a vector space over F ∈ {R, C},
p a seminorm on it, W ⊆ V a linear subspace and ϕ : W → F a linear functional such that
|ϕ(w)| ≤ p(w) for all w ∈ W. Then there is a linear functional ϕ̂ : V → F such that ϕ̂|_W = ϕ
and |ϕ̂(v)| ≤ p(v) for all v ∈ V.
Proof. F = R: This is an immediate consequence of Theorem 9.2 since a seminorm p is sublinear
with the additional property p(−v) = p(v) ≥ 0 for all v. In particular, −ϕ̂(v) = ϕ̂(−v) ≤
p(−v) = p(v), so that −p(v) ≤ ϕ̂(v) ≤ p(v) for all v ∈ V, which is equivalent to |ϕ̂(v)| ≤ p(v) ∀v.
F = C:⁵⁴ Assume ϕ : W → C, where W ⊆ V, satisfies |ϕ(w)| ≤ p(w) ∀w ∈ W. Define ψ : W → R, w ↦
Re(ϕ(w)), which clearly is R-linear and satisfies the same bound. Thus by the real case just
considered, there is an R-linear functional ψ̂ : V → R extending ψ such that |ψ̂(v)| ≤ p(v) for
all v ∈ V. Define ϕ̂ : V → C by
\[ \hat\varphi(v) = \hat\psi(v) - i\,\hat\psi(iv). \]
This map is clearly R-linear, and for all v ∈ V we have
\[ \hat\varphi(iv) = \hat\psi(iv) - i\,\hat\psi(-v) = \hat\psi(iv) + i\,\hat\psi(v) = i\big(\hat\psi(v) - i\,\hat\psi(iv)\big) = i\,\hat\varphi(v), \]
53
“Such reliance on awful non-constructive results is unfortunately typical of traditional functional analysis.” [92]
54
This was discovered only in 1938 by Henri Frederic Bohnenblust (1906-2000) and Andrew Florian Sobczyk (1915-
1981), Swiss resp. Polish born American mathematicians.
proving that ϕ̂ : V → C is C-linear. If w ∈ W then
\[
\begin{aligned}
\hat\varphi(w) &= \hat\psi(w) - i\,\hat\psi(iw) = \psi(w) - i\,\psi(iw) = \mathrm{Re}(\varphi(w)) - i\,\mathrm{Re}(\varphi(iw)) \\
&= \mathrm{Re}(\varphi(w)) - i\,\mathrm{Re}(i\varphi(w)) = \mathrm{Re}(\varphi(w)) + i\,\mathrm{Im}(\varphi(w)) = \varphi(w),
\end{aligned}
\]
so that ϕ̂ extends ϕ.
Given v ∈ V, let α ∈ C, |α| = 1 be such that αϕ̂(v) ≥ 0. Then αϕ̂(v) = ϕ̂(αv) =
Re(ϕ̂(αv)) = ψ̂(αv), so that |ϕ̂(v)| = |αϕ̂(v)| = ψ̂(αv) ≤ p(αv) = p(v).
9.6 Remark In Exercise 5.35 we saw (with a fairly easy proof) that bounded linear functionals
defined on linear subspaces of Hilbert spaces always have unique norm-preserving extensions to
the whole space. For a general Banach space V this uniqueness is far from true! (It holds if and
only if V∗ is strictly convex, cf. Section B.6.7 for definition and proof.) □
9.7 Exercise Give an example for a Banach space V, a linear subspace W ⊆ V and ϕ ∈ W∗
such that there are multiple norm-preserving extensions ϕ̂ ∈ V∗ of ϕ.
9.8 Exercise Give a proof of Theorem 9.5 in the case of a separable normed space (V, ‖·‖)
using only the axiom DCω of countable dependent choice (thus neither AC nor Zorn).
implies that ιV (V ) is complete, thus also V since ιV : V → V ∗∗ is an isometric bijection.
9.10 Corollary Every normed space V embeds isometrically as a dense subspace into a Banach
space V̂. The latter is unique up to isometric isomorphism and is called the completion of
V.
Proof. This can be proven by completing the metric space (V, d), where d(x, y) = ‖x − y‖, and
showing that the completion is a linear space, but this is a bit tedious. Alternatively, using the
above result that ι_V : V → V∗∗ is an isometry, we can take $\widehat{V} = \overline{\iota_V(V)} \subseteq V^{**}$ as the definition
of V̂ since this is a closed subspace of the complete space V∗∗, thus complete, and contains
ι_V(V) ≅ V as a dense linear subspace.
Uniqueness of the completion follows with the same proof as for metric spaces, cf. [108].
9.13 Remark With more effort one proves the Kadets-Snobar theorem (1971) giving an idempotent
P with image W satisfying ‖P‖ ≤ √(dim W) (which is almost optimal, but not quite). Cf.
e.g. [103, Theorem 12.14] or [1, Theorem 13.1.7]. If V is a Hilbert space, one clearly has the
orthogonal projections satisfying ‖P‖ = 1. If V is isomorphic to a Hilbert space, this implies a
uniform bound ‖P‖ ≤ λ < ∞ for all finite-dimensional subspaces. The converse is also true! Cf.
e.g. [1, Theorem 13.4.3]. Combining this with the Kadets-Snobar theorem, it is not too difficult
to prove the characterization of Hilbert spaces mentioned in Remark 6.10.4. □
Now we can continue the discussion of ⊥ and ⊤ begun with Definition 6.6 and Exercise 6.7:
9.15 Exercise Let V be a normed space and W ⊆ V a closed linear subspace. Construct an
isometric linear bijection β : V ∗ /W ⊥ → W ∗ .
9.18 Exercise Let V be a normed space and x ∈ V, ϕ ∈ V ∗ . Prove that ιV (x) ∈ V ∗∗ and
ιV ∗ (ϕ) ∈ V ∗∗∗ satisfy ιV ∗ (ϕ)(ιV (x)) = ϕ(x).
9.21 Theorem Let V be a Banach space. Then V is reflexive if and only if V ∗ is reflexive.
56
Robert Clarke James (1918-2004). American functional analyst.
Proof. ⇒ Reflexivity of V means surjectivity of ι_V : V → V∗∗. Let ϕ ∈ V∗∗∗ = (V∗∗)∗.
Then ϕ′ = ϕ ◦ ι_V ∈ V∗, and we claim that ϕ = ι_{V∗}(ϕ′). This would clearly imply surjectivity
of ι_{V∗} : V∗ → V∗∗∗, thus reflexivity of V∗. The claim means ϕ(x∗∗) = ι_{V∗}(ϕ′)(x∗∗) for all
x∗∗ ∈ V∗∗. By surjectivity of ι_V : V → V∗∗, this is equivalent to ϕ(ι_V(x)) = ι_{V∗}(ϕ′)(ι_V(x))
for all x ∈ V. The latter identity indeed is true since both sides equal ϕ′(x), the l.h.s. by the
definition of ϕ′ and the r.h.s. by Exercise 9.18.
⇐ Assume that V is not reflexive. Then ι_V(V) ⊆ V∗∗ is a proper closed subspace, so that
ι_V(V)⊥ ≠ {0} by Exercise 9.14. Let thus 0 ≠ ϕ ∈ ι_V(V)⊥ ⊆ V∗∗∗. Since V∗ is reflexive,
we have ϕ = ι_{V∗}(ϕ′) for some ϕ′ ∈ V∗. Using Exercise 9.18 again, for each x ∈ V we have
ϕ′(x) = ι_{V∗}(ϕ′)(ι_V(x)) = ϕ(ι_V(x)) = 0 by ϕ ∈ ι_V(V)⊥. Thus ϕ′ = 0, implying ϕ = 0, but this
is a contradiction.
9.22 Remark For non-reflexive V none of the spaces V∗, V∗∗, V∗∗∗, . . . is reflexive, so that
V ⊊ V∗∗ ⊊ V∗∗∗∗ ⊊ · · · and V∗ ⊊ V∗∗∗ ⊊ V∗∗∗∗∗ ⊊ · · · , and we have two somewhat mysterious
successions of ever larger spaces! There do not seem to be many general results about this, but
see Lemma B.25(iv). Even understanding C(X, R)∗∗ for compact X is complicated, cf. [81]. □
9.24 Exercise (i) Prove that if V is reflexive then for each ϕ ∈ V∗ there exists an x ∈ V
such that ‖x‖ = 1 and |ϕ(x)| = ‖ϕ‖. (We say ‘ϕ attains its norm’.)
(ii) Identify the ϕ ∈ c_0(N, F)∗ for which there exists x ∈ c_0(N, F) with ‖x‖ = 1 such that
ϕ(x) = ‖ϕ‖. Conclude that such ϕ are dense in c_0(N, F)∗.
(iii) Prove (again) that c_0(N, C) is not reflexive.
9.25 Remark 1. The converse of the statement in Exercise 9.24(i) is also true: If every ϕ ∈ V ∗
attains its norm, V is reflexive. But the proof, also due to R. C. James, is much harder and
more than 10 pages long! See [102, Section 1.13].
2. On the other hand, Bishop and Phelps57 proved that the result of Exercise 9.24(ii) holds
for every Banach space V , i.e. the set of ϕ ∈ V ∗ that attain their norm is dense in V ∗ . Cf. [16]
or [102, Section 2.11].
3. See Appendix B.6.8 for the notion of uniform convexity, which is stronger than the strict
convexity encountered earlier, and a proof of the fact that uniformly convex spaces are reflexive.
We will also prove that L^p(X, A, µ) is uniformly convex for each measure space
(X, A, µ) and 1 < p < ∞. This provides a proof of reflexivity of these spaces that does
not use the relation between L^p and L^q. This in turn leads to a simple proof of surjectivity of
the isometric map L^q → (L^p)∗ known from Section 4.7 (reversing the logic of Exercise 9.23(iii)).
□
(i) If V ∗ is separable then V is separable.
(ii) For V infinite-dimensional separable, V ∗ can be separable or non-separable. (Examples!)
(iii) If V is separable and reflexive then V ∗ is separable.
9.27 Theorem (Pettis) 58 Let V be a Banach space and W ⊆ V a closed subspace. Then
the following are equivalent:
(i) V is reflexive.
(ii) W and V /W are reflexive.
Proof. We begin with some preparations. Let W ⊆ V be a closed subspace of V. Then W⊥ ⊆ V∗ is
a closed subspace, thus W⊥⊥ is a closed subspace of V∗∗. Explicitly,
\[ W^{\perp\perp} = \{\psi \in V^{**} \mid \varphi \in V^{*},\ \varphi|_W = 0 \Rightarrow \psi(\varphi) = 0\}. \tag{9.3} \]
If w ∈ W and ψ = ι_V(w) then for each ϕ ∈ V∗ we have ψ(ϕ) = ι_V(w)(ϕ) = ϕ(w). Thus if
ϕ|_W = 0 then ψ(ϕ) = 0. This proves ι_V(W) ⊆ W⊥⊥.
By Exercise 6.7 (dual space of quotient space) we have an isometric isomorphism W ∗ ∼ =
V /W ⊥ . Now Exercise 9.15 (dual space of subspace) gives an isometric isomorphism α : W ∗∗ →
∗
ιV ιV (9.4)
? ?
W ∗∗ - W ⊥⊥ ⊂ - V ∗∗
α
Let w ∈ W. Now ι_W(w) ∈ W∗∗ and ι_V(w) ∈ V∗∗ are the linear functionals on W∗ and V∗,
respectively, given by evaluation at w. Thus for ϕ ∈ V∗ we have ι_V(w)(ϕ) = ϕ(w). On the
other hand, ϕ|_W ∈ W∗, and ι_W(w)(ϕ|_W) = (ϕ|_W)(w) = ϕ(w), proving that the left triangle
of the diagram commutes.
(i)⇒(ii) Now assume that V is reflexive, so that ι_V : V → V∗∗ is a bijection. Thus every
ψ ∈ V∗∗ is of the form ι_V(v) for a unique v ∈ V. With this, (9.3) becomes
\[ W^{\perp\perp} = \{\iota_V(v) \mid \varphi \in V^{*},\ \varphi|_W = 0 \Rightarrow \varphi(v) = 0\} = \iota_V(W), \]
where we used that for every v ∈ V\W there exists a ϕ ∈ V∗ with ϕ|_W = 0, ϕ(v) ≠ 0. Thus
ι_V : W → W⊥⊥ is a bijection. Since α is a bijection, also ι_W : W → W∗∗ is a bijection, thus W
is reflexive.
By Theorem 9.21, V∗ is reflexive. Thus the closed subspace W⊥ ⊆ V∗ is reflexive by what
was just proven. Since W⊥ ≅ (V/W)∗ by Exercise 6.7, (V/W)∗ is reflexive, thus V/W is
reflexive using Theorem 9.21 again.
(ii)⇒(i) Let ψ ∈ V ∗∗ . Our aim is to find a v ∈ V such that ψ = ιV (v). We have a canonical
isomorphism β : (V /W )∗ → W ⊥ ⊆ V ∗ . Thus ψ ◦ β ∈ (V /W )∗∗ . Since V /W is reflexive, there
exists v + W ∈ V /W such that ιV /W (v + W ) = ψ ◦ β. Now for all ϕ ∈ W ⊥ ⊆ V ∗ we have
Thus ψ − ιV (v) ∈ V ∗∗ vanishes on W ⊥ , so that ψ − ιV (v) ∈ W ⊥⊥ . Since W is reflexive, ιW is
a bijection. Together with the fact that α is a bijection, this implies that ιV : W → W ⊥⊥ is
a bijection, thus W ⊥⊥ = ιV (W ). Thus there exists w ∈ W such that ιV (w) = ψ − ιV (v), thus
ψ = ιV (v + w), proving surjectivity of ιV . Thus V is reflexive.
In these notes, the Banach spaces C_(b)(X, F), C_0(X, F) of continuous functions do not play
a very prominent role, since their study requires more general topology than the rest of our
subjects, and also measure theory for the dual spaces. But the following is not too difficult:
9.28 Exercise Let X be a normal (T4) topological space. Prove that the Banach space
C_b(X, F) is reflexive if and only if #X < ∞. Hint: If #X = ∞, pick distinct points {x_n}_{n∈N}, use
Urysohn’s lemma to produce functions f_n ∈ C(X, [0, 1]) with disjoint supports and f_n(x_m) =
δ_{n,m}. Use these to produce an embedding c_0 ↪ C_b(X, F).
9.30 Exercise If V, W, Z are normed spaces and A ∈ B(V, W), B ∈ B(W, Z), prove (BA)^t =
A^t B^t in B(Z∗, V∗).
9.31 Lemma If V, W are normed spaces over F and A : V → W is linear then ‖A^t‖ = ‖A‖.
Thus A^t is bounded if and only if A is bounded. The map B(V, W) → B(W∗, V∗), A ↦ A^t is
isometric.
Proof. The identity follows from the computation
\[ \|A\| = \sup_{\substack{v\in V\\ \|v\|=1}} \|Av\| = \sup_{\substack{v\in V\\ \|v\|=1}}\ \sup_{\substack{\varphi\in W^*\\ \|\varphi\|=1}} |\varphi(Av)| = \sup_{\substack{\varphi\in W^*\\ \|\varphi\|=1}}\ \sup_{\substack{v\in V\\ \|v\|=1}} |\varphi(Av)| = \sup_{\substack{\varphi\in W^*\\ \|\varphi\|=1}} \|A^t\varphi\| = \|A^t\|, \]
where the first and last identities are the definition of the norm, the second and fourth
follow from Proposition 9.9(ii), and the third is the exchangeability of two suprema. (Note that
we did not assume boundedness of A or A^t.) The rest is clear.
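In finite dimensions the identity ‖A^t‖ = ‖A‖ can be checked by brute force. The sketch below makes several assumptions for the sake of a concrete computation: A is a real 2×3 matrix regarded as a map (R³, ℓ¹) → (R², ℓ¹), the dual spaces are identified with (R³, ℓ^∞) and (R², ℓ^∞), and A^t is then given by the transposed matrix. Both operator norms are computed by evaluating on the extreme points of the respective unit balls, where a convex function on a polytope attains its maximum.

```python
from itertools import product

A = [[1.0, -2.0, 0.5],
     [3.0,  0.0, -1.0]]          # example matrix: a map (R^3, l^1) -> (R^2, l^1)
m, n = len(A), len(A[0])

def apply(M, v):
    """Matrix-vector product with plain lists."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def norm_1(v):
    return sum(abs(t) for t in v)

def norm_inf(v):
    return max(abs(t) for t in v)

# ||A||: the extreme points of the l^1 unit ball are +-e_j; the sign does not affect the norm
unit_vectors = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
norm_A = max(norm_1(apply(A, e)) for e in unit_vectors)

# ||A^t||: A^t maps (R^2, l^inf) -> (R^3, l^inf); extreme points of the sup-norm ball are sign vectors
At = transpose(A)
norm_At = max(norm_inf(apply(At, list(s))) for s in product((-1.0, 1.0), repeat=m))

print(norm_A, norm_At)
```

With this choice of norms, ‖A‖ is the largest absolute column sum of A and ‖A^t‖ is the largest absolute row sum of the transposed matrix, which is the same number.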
9.32 Exercise Let V, W be normed spaces and A : V → W a linear map. Prove: A is bounded
if and only if At ϕ = ϕ ◦ A is bounded for all ϕ ∈ W ∗ . Hint: Use Lemma 9.31 and UBT or CGT.
The transposition operation can be iterated, giving Att ∈ B(V ∗∗ , W ∗∗ ), etc.
9.33 Lemma If V, W are normed spaces and A ∈ B(V, W ) then the diagram
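\[
\begin{array}{ccc}
V & \xrightarrow{\;A\;} & W \\
\downarrow{\scriptstyle \iota_V} & & \downarrow{\scriptstyle \iota_W} \\
V^{**} & \xrightarrow{\;A^{tt}\;} & W^{**}
\end{array}
\]
commutes, i.e. ι_W A = A^{tt} ι_V.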
Now, ιW (Av) and (Att ιV (v)) are in W ∗∗ , and the fact that they coincide on all ϕ ∈ W ∗ means
ιW (Av) = Att ιV (v). And since this holds for all v ∈ V , we have ιW A = Att ιV , as claimed.
9.38 Proposition Let V, W be Banach spaces and A ∈ B(V, W ). If At is bounded below then
A is surjective.
Proof. Let C > 0 be such that ‖A^t ϕ‖ ≥ C‖ϕ‖ for all ϕ ∈ W∗. The set $Y = \overline{A B^V(0, 1)} \subset W$
is closed, convex and balanced. Thus for each z ∈ W\Y by Proposition B.64 there exists
ϕ ∈ W∗ such that |ϕ(y)| ≤ 1 for all y ∈ Y and |ϕ(z)| > 1. By the first of these properties,
for all x ∈ B^V(0, 1) we have |(A^t ϕ)(x)| = |ϕ(Ax)| ≤ 1, implying ‖A^t ϕ‖ ≤ 1. Thus with the
hypothesis,
\[ C < C|\varphi(z)| \le C\|z\|\|\varphi\| \le \|z\|\|A^t\varphi\| \le \|z\|. \]
By contraposition, if w ∈ W satisfies ‖w‖ ≤ C then w ∈ Y. In particular, $B^W(0, C) \subseteq \overline{A B^V(0, 1)}$.
Now Proposition 7.4 gives the surjectivity of A.
9.42 Exercise If V is a reflexive Banach space and W ≃ V (not necessarily isometric isomorphism),
prove that W is reflexive.
• characterization of reflexive Banach spaces: Theorem 10.15.
• applications to ⊤ and transposes A^t of operators: Proposition 10.31, Theorem 10.32.
• characterizations of compact operators: Proposition 12.25 and Theorem 12.28.
• a (locally) compact topology relevant for the theory of commutative Banach algebras:
Section 19.
Since the norm and weak topologies on a normed space are Hausdorff, Proposition 2.29
implies that they coincide if the space is finite-dimensional. On the other hand:
10.2 Proposition Let V be an infinite-dimensional normed space. Then the weak topology
τw on V is strictly weaker than the norm-topology and not first countable. In particular (V, τw )
is neither normable nor Fréchet nor an F-space.
Proof. By the definition of τ_w, for every weakly open neighborhood U of 0 there are ϕ_1, . . . , ϕ_n ∈
V∗ such that {x ∈ V | |ϕ_i(x)| < 1 ∀i = 1, . . . , n} ⊆ U. Thus U contains the linear subspace
W = ⋂_{i=1}^{n} ϕ_i^{−1}(0) ⊆ V, whose codimension is ≤ n. Thus if V is infinite-dimensional then
dim W is infinite, thus non-zero. On the other hand, it is clear that the (norm-)open ball
B(0, 1) contains no linear subspace of dimension > 0. Thus B(0, 1) ∉ τ_w. Since τ_w ⊆ τ_‖·‖ was
clear, we have τ_w ⊊ τ_‖·‖.
If we assume that τ_w is first countable, 0 ∈ V has a countable open neighborhood base
{U_n}_{n∈N}. Replacing U_n by ⋂_{k=1}^{n} U_k, we may assume U_1 ⊇ U_2 ⊇ · · · . As seen above, being
weakly open, each U_n contains a non-zero linear subspace V_n. For each n we can pick an x_n ∈ V_n.
If now ϕ ∈ V∗ and ε > 0 are arbitrary, U = {x ∈ V | |ϕ(x)| < ε} is a weakly open neighborhood
of 0. Since {U_n} is a shrinking weak neighborhood base of 0, there exists n_0 such that for all
n ≥ n_0 we have x_n ∈ V_n ⊆ U_n ⊆ U, thus |ϕ(x_n)| < ε, implying ϕ(x_n) → 0. Since ϕ ∈ V∗ was
arbitrary, we have proven $x_n \xrightarrow{w} 0$. Since this holds for every choice of {x_n ∈ V_n} and the V_n
are vector spaces, we can choose {x_n} such that ‖x_n‖ → ∞. But this contradicts the fact that
every weakly convergent sequence is norm-bounded, cf. Exercise 10.6 below. This contradiction
shows that τ_w is not first countable.
Now the last statement is trivial since for a TVS we have the implications normable ⇒
Fréchet ⇒ F-space ⇒ metrizable ⇒ first countable.
10.3 Exercise (i) Prove that the sequence {δ_n}_{n∈N} has no weak limit in ℓ^1(N, F).
(ii) Let 1 < p < ∞. Prove that the sequence {δ_n}_{n∈N} ⊆ ℓ^p(N, F) converges to zero weakly, but
not in norm.
(iii) If $f_n \xrightarrow{w} g$ in a Hilbert space and ‖f_n‖ → ‖g‖, prove that ‖f_n − g‖ → 0.
(iv) (Bonus) Prove the result of (iii) for sequences in (ℓ^p(N, F), ‖·‖_p), where 1 < p < ∞.⁵⁹
The deviant behavior of `1 in the preceding exercise can be understood as a consequence of
the following surprising result:
10.4 Theorem (I. Schur 1920)⁶⁰ If g, {f_n}_{n∈N} ⊆ ℓ^1(N, F) and $f_n \xrightarrow{w} g$ then ‖f_n − g‖_1 → 0.
10.5 Remark 1. Like the uniform boundedness theorem, this result can be proven using a
beautiful gliding hump argument, cf. Section B.3.5, or using Baire’s theorem.
2. Theorem 10.4 does not generalize to nets since the weak and norm topologies on `1 (N, F)
differ by Proposition 10.2 and nets can distinguish topologies, cf. e.g. [108, Section 5.1].
3. Banach spaces in which weak and norm convergence of sequences are equivalent are said
to have the Schur property. All finite-dimensional spaces have it. See also Remark 12.31.1. □
10.6 Exercise Prove that every weakly convergent sequence in a normed space is norm-
bounded. Hint: Uniform boundedness theorem. (This does not generalize to nets!)
10.7 Exercise Prove that every weakly compact subset of a Banach space is norm-bounded.
10.8 Exercise Let V be a Banach space. Prove that the (norm) closed unit ball V≤1 is also
weakly closed. Hint: Hahn-Banach.61
Given a linear map A : E → F between Banach spaces, one can consider its continuity w.r.t.
different pairs of topologies on E and F : Norm-norm continuity (w.r.t. the norm topologies on
both spaces), weak-norm continuity (the weak topology on E, the norm topology on F ) and,
similarly, norm-weak and weak-weak continuity. These notions are not all distinct:
10.9 Exercise Let V, W be Banach spaces and A : V → W a linear map. Prove that the
following are equivalent:
(i) A is norm-norm continuous (equivalently, bounded).
(ii) A is norm-weak continuous.
(iii) A is weak-weak continuous.
10.10 Exercise With the same assumptions as above, prove that the following are equivalent:
(i) A is weak-norm continuous.
(ii) There are ϕ_1, . . . , ϕ_K ∈ V∗ and y_1, . . . , y_K ∈ W such that Ax = Σ_{k=1}^{K} ϕ_k(x) y_k ∀x ∈ V.
(iii) A is bounded and AV ⊆ W is finite-dimensional (i.e. A has ‘finite rank’).
59
More generally, this implication holds for all uniformly convex Banach spaces, cf. e.g. [86, Proposition 9.11], [23,
Sect. 3.7]. The Lp -spaces with 1 < p < ∞ are uniformly convex, cf. Section B.6.8.
60
Issai Schur (1874-1941). Russian mathematician. Studied and worked in Germany up to his emigration to Israel
in 1939. Mostly known for his work in group and representation theory.
61
More generally, every norm-closed convex set is weakly closed. But this is a bit harder.
10.11 Exercise If V is a normed space and A ∈ B(V) has finite rank, prove that the trace
(as known from linear algebra) of A|_{AV} ∈ End(AV) coincides with Σ_{k=1}^{K} ϕ_k(y_k) for any
representation A = Σ_{k=1}^{K} ϕ_k(·) y_k.
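The statement of Exercise 10.11 can be illustrated (not proven) in V = R³, where a finite-rank operator is a matrix and the trace of its restriction to the range agrees with the ordinary matrix trace. In the sketch below the vectors `ys`, the functionals `phis` (acting by the dot product) and all function names are illustrative assumptions. Two different representations of the same operator are compared.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def operator(phis, ys):
    """Return x -> sum_k phi_k(x) y_k, with phi_k(x) = <phis[k], x>."""
    dim = len(ys[0])
    return lambda x: [sum(dot(ph, x) * y[i] for ph, y in zip(phis, ys)) for i in range(dim)]

def matrix_trace(op, dim):
    """Trace of the matrix of op in the standard basis: sum_i (op e_i)_i."""
    basis = [[1.0 if i == j else 0.0 for i in range(dim)] for j in range(dim)]
    return sum(op(e)[i] for i, e in enumerate(basis))

def representation_sum(phis, ys):
    """The quantity sum_k phi_k(y_k) attached to a representation."""
    return sum(dot(ph, y) for ph, y in zip(phis, ys))

# First representation: A = phi_1(.) y_1 + phi_2(.) y_2
ys = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
phis = [[3.0, 1.0, 0.0], [2.0, -1.0, 4.0]]

# Second representation of the same operator: A = phi_1(.)(y_1 + y_2) + (phi_2 - phi_1)(.) y_2
ys2 = [[a + b for a, b in zip(ys[0], ys[1])], ys[1]]
phis2 = [phis[0], [b - a for a, b in zip(phis[0], phis[1])]]

trace_A = matrix_trace(operator(phis, ys), 3)
sum_first = representation_sum(phis, ys)
sum_second = representation_sum(phis2, ys2)

print(trace_A, sum_first, sum_second)
```

The two representations use different vectors and functionals, yet the sums agree with each other and with the trace, which is the independence claimed in the exercise.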
10.12 Definition If V is a Banach space over F ∈ {R, C}, a sequence {xn } ⊂ V is called
weakly Cauchy if the sequence {ϕ(xn )} ⊂ F converges for every ϕ ∈ V ∗ . (Equivalently, {xn } is
Cauchy in the sense of Remark 2.12.2 in the locally convex space (V, τw ).)
10.15 Theorem Let V be a Banach space. Then the following are equivalent:
(i) V≤1 is compact w.r.t. the weak topology.
(ii) V is reflexive. (⇔ V ∗ is reflexive by Theorem 9.21.)
10.16 Remark 1. If V is a finite-dimensional normed space then V≤1 is compact w.r.t. the
norm topology. For infinite-dimensional V this is false, as we will see in Section 12.1. But since
the weak topology is weaker than the norm topology, a set can be weakly compact even though
it is not norm compact.
2. For metric spaces, compactness and sequential compactness are equivalent, but the two
properties are independent for general topological spaces. Despite the fact that weak topologies
on infinite-dimensional Banach spaces are metrizable at best on bounded subsets (under sepa-
rability assumptions), there is the Eberlein-Šmulyan63 Theorem: For subsets of a Banach space
62
In fact, given a Banach space V , every bounded sequence in V has a weakly Cauchy subsequence if and only if V
has no subspace isomorphic to `1 . This deep result is due to H. Rosenthal (1974). See e.g. [1, Chapter 11], [97, Vol.
1, Sect. [Link]].
63
William Frederick Eberlein (1917-1986), American mathematician who worked in functional analysis, topology
and related areas. Vitold Lvovich Šmulyan (1914-1944), Soviet mathematician, also known for the Krein-S. theorem.
Killed while fighting in WW2.
weak compactness and weak sequential compactness are equivalent. (⇒ is not very difficult.)
For proofs see more advanced texts or [173]. □
10.17 Exercise If H is a Hilbert space, prove that every orthonormal sequence {en }n∈N in H
converges weakly to zero.
10.18 Definition Let V be a Banach space. The weak operator topology τ_wot on B(V) is
generated by the family F = {‖·‖_{x,ϕ} : A ↦ |ϕ(Ax)| | x ∈ V, ϕ ∈ V∗} of seminorms. Thus
{Aι} ⊆ B(V) converges to A ∈ B(V) w.r.t. τ_wot if and only if ϕ((Aι − A)x) → 0 for all
x ∈ V, ϕ ∈ V∗, i.e. {Aι x} ⊆ V converges weakly to Ax for all x ∈ V. The family F is
separating, so that τ_wot is Hausdorff. We write $A_\iota \xrightarrow{wot} A$ or A = wot-lim Aι.
There is little risk of confusing the weak topology on V with the weak operator topology on
B(V ). But one might confuse the latter with the weak topology that B(V ) has as a Banach
space, in particular since the above ‖·‖_{x,ϕ} are in B(V)∗! However, when V is infinite-dimensional
these seminorms do not exhaust (or span) the bounded linear functionals on B(V ), so that the
weak operator topology on B(V ) is strictly weaker than the weak topology!
10.21 Remark 1. Since ϕ(x) = 0 for all x ∈ V means ϕ = 0, F is separating, thus the
σ(V ∗ , V )-topology is Hausdorff and therefore locally convex.
2. If V is infinite-dimensional, the weak-∗ topology τw∗ is neither normable nor metrizable.
3. Since the weak-∗ topology is induced by the linear functionals on V∗ of the form x̂,
which constitute a subset of V∗∗, it is weaker than the weak topology, thus also weaker than
the norm topology: τ_{w∗} ⊆ τ_w ⊆ τ_‖·‖. As we know, the second inclusion is proper whenever V is
infinite-dimensional. For the first, we have: □
Proof. If V is reflexive then V∗∗ = ι_V(V), so that the weak-∗ topology σ(V∗, V) on V∗ coincides
with the weak topology σ(V∗, V∗∗). If V is not reflexive, we have V ⊊ V∗∗. Now for ψ ∈ V∗∗\V
it is clear that the linear functional ψ on V∗ is σ(V∗, V∗∗)-continuous, whereas Exercise 10.23
gives that it is not σ(V∗, V)-continuous. This proves σ(V∗, V) ≠ σ(V∗, V∗∗).
10.25 Remark Before we proceed, some comments are in order: While the norm and weak
topologies are defined for each Banach space, the weak-∗ topology is defined only on spaces
that are the dual space V ∗ of a given space V . There are Banach spaces, like c0 (N, F), that are
not isomorphic (isometrically or not) to the dual space of any Banach space, cf. Corollary B.26.
And there are non-isomorphic Banach spaces with isomorphic dual spaces, cf. Corollary B.27.
Thus to define the weak-∗ topology on a Banach space $V$, it is not enough just to know that the latter is a dual space. We must choose a ‘pre-dual’ space $W$ such that $W^* \cong V$. □
Recall that V≤1 is norm compact if and only if V is finite-dimensional and weakly compact
if and only if V is reflexive. The weak-∗ topology on the dual of a non-reflexive Banach space
V is strictly weaker than the weak topology, so that the closed unit ball of V ∗ has a chance of
being weak-∗ compact, and in fact this is the case unconditionally:
10.26 Theorem (Alaoglu’s theorem (1940))⁶⁴ If $V$ is a Banach space then the (norm-)closed unit ball $(V^*)_{\le 1} = \{\varphi \in V^* \mid \|\varphi\| \le 1\}$ is compact in the $\sigma(V^*, V)$-topology.
Proof. Define
$$Z = \prod_{x \in V} \{z \in \mathbb{C} \mid |z| \le \|x\|\},$$
equipped with the product topology. Since the closed discs in $\mathbb{C}$ are compact, $Z$ is compact by Tychonov’s theorem. If $\varphi \in (V^*)_{\le 1}$ then $|\varphi(x)| \le \|x\|$ $\forall x$, so that we have a map
$$f : (V^*)_{\le 1} \to Z, \quad \varphi \mapsto \prod_{x \in V} \varphi(x).$$
Since the map $\varphi \mapsto \varphi(x)$ is continuous for each $x$, $f$ is continuous (w.r.t. the weak-∗ topology on $(V^*)_{\le 1}$). It is trivial that $V$ separates the points of $V^*$, thus $f$ is injective. By definition, a net $\{\varphi_\iota\}$ in $(V^*)_{\le 1}$ converges in the $\sigma(V^*, V)$-topology if and only if $\varphi_\iota(x)$ converges for all $x \in V$, and therefore if and only if $f(\varphi_\iota)$ converges. Thus $f : (V^*)_{\le 1} \to f((V^*)_{\le 1}) \subseteq Z$ is a homeomorphism.
Now let $z \in \overline{f((V^*)_{\le 1})} \subseteq Z$. Clearly, $|z_x| \le \|x\|$ $\forall x \in V$. By Proposition A.12.2 there is a net in $f((V^*)_{\le 1})$ converging to $z$ and therefore a net $\{\varphi_\iota\}$ in $(V^*)_{\le 1}$ such that $f(\varphi_\iota) \to z$. This means $\varphi_\iota(x) \to z_x$ $\forall x \in V$. In particular $\varphi_\iota(\alpha x + \beta y) \to z_{\alpha x + \beta y}$, while also $\varphi_\iota(\alpha x + \beta y) = \alpha\varphi_\iota(x) + \beta\varphi_\iota(y) \to \alpha z_x + \beta z_y$. Thus the map $\psi : V \to \mathbb{C}$, $x \mapsto z_x$ is linear with $\|\psi\| \le 1$, to wit $\psi \in (V^*)_{\le 1}$ and $z = f(\psi)$. Thus $\overline{f((V^*)_{\le 1})} \subseteq f((V^*)_{\le 1})$, so that $f((V^*)_{\le 1}) \subseteq Z$ is closed.
Now we have proven that $(V^*)_{\le 1}$ is homeomorphic to the closed subset $f((V^*)_{\le 1})$ of the compact space $Z$, and therefore compact.
⁶⁴ Leonidas Alaoglu (1914-1981). Greek mathematician. (Earlier versions due to Helly and Banach.)
10.27 Remark We deduced Alaoglu’s theorem from Tychonov’s theorem, which is known to be
equivalent to the axiom of choice (AC). But we only needed Tychonov as restricted to Hausdorff
spaces, and the converse also holds. See Appendix B.5 where we also prove equivalence of these
statements to the Ultrafilter Lemma (UL), a set theoretic axiom that is known to be strictly
weaker than AC and its equivalents. UL also implies the Hahn-Banach theorem. □
10.28 Exercise Use Alaoglu’s theorem to prove that every Banach space V over F admits a
linear isometric bijection onto a closed subspace of C(X, F) for some compact Hausdorff space
X.
10.29 Exercise (i) Use Alaoglu’s theorem to prove (ii)⇒(i) in Theorem 10.15.
(ii) Conclude that the closed unit ball of every Hilbert space is weakly compact.
(iii) Prove $\sigma(V, V^*) = \sigma(V^{**}, V^*)\restriction V$.
(iv) Use Theorem 10.30 and (iii) to prove (i)⇒(ii) in Theorem 10.15.
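(iii) In view of (ii), $(\Phi^\top)^\perp = ((\overline{\Phi}^{w^*})^\top)^\perp$. Thus it suffices to prove $(\Phi^\top)^\perp = \Phi$ for weak-∗ closed $\Phi$. Since it is clear that $\Phi \subseteq (\Phi^\top)^\perp$, it remains to prove the converse inclusion. If $\varphi_0 \in (\Phi^\top)^\perp \setminus \Phi$, Corollary B.61, applied to the locally convex space $(V^*, \tau_{w^*})$, gives a $\psi \in (V^*, \tau_{w^*})^*$ such that $\psi \restriction \Phi = 0$ and $\psi(\varphi_0) \neq 0$. Now by Lemma 10.24 there is a unique $x \in V$ such that $\psi(\varphi) = \varphi(x)$ for all $\varphi \in V^*$. Clearly $x \neq 0$. In view of this we have $\varphi(x) = 0$ $\forall \varphi \in \Phi$, thus $x \in \Phi^\top$. With $\varphi_0 \in (\Phi^\top)^\perp$ this implies $\psi(\varphi_0) = \varphi_0(x) = 0$, which is a contradiction.
(iv) If $\overline{\Phi}^{w^*} = V^*$ then (ii) implies $\Phi^\top = (\overline{\Phi}^{w^*})^\top = (V^*)^\top = \{0\}$. And if $\Phi^\top = \{0\}$ then (iii) gives $\overline{\Phi}^{w^*} = (\{0\})^\perp = V^*$.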
10.33 Remark If $V$ is reflexive, the weak and weak-∗ topologies on $V^*$ coincide by Proposition 10.22, so that with Corollary B.63 a linear subspace of $V^*$ is norm-dense if and only if it is weak-∗ dense. Thus for reflexive $V$ the conditions of weak-∗ density in Proposition 10.31(iv) and Theorem 10.32(i) reduce to norm-density. □
11.1 Proposition Let $H_1, H_2$ be Hilbert spaces. For $A \in B(H_1, H_2)$, define the Hilbert space adjoint $A^* : H_2 \to H_1$ as the composite map
$$H_2 \xrightarrow{\ \gamma_{H_2}\ } H_2^* \xrightarrow{\ A^t\ } H_1^* \xrightarrow{\ \gamma_{H_1}^{-1}\ } H_1,$$
i.e. $A^* := \gamma_{H_1}^{-1} \circ A^t \circ \gamma_{H_2}$. Now
(i) The map $A^* : H_2 \to H_1$ is linear and bounded, thus in $B(H_2, H_1)$.
(ii) The map $B(H_1, H_2) \to B(H_2, H_1)$, $A \mapsto A^*$ is anti-linear.
(iii) For all $x \in H_1$, $y \in H_2$ we have $\langle Ax, y\rangle_2 = \langle x, A^* y\rangle_1$.
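Proof. (i) Linearity of $A^* : H_2 \to H_1$ follows from its being the composite of the linear map $A^t$ with the two anti-linear maps $\gamma_{H_2}$ and $\gamma_{H_1}^{-1}$. Boundedness follows from $\|A^t\| = \|A\|$ together with the fact that $\gamma_{H_1}$ and $\gamma_{H_2}$ are isometric.
(ii) Additivity of $A \mapsto A^*$ is obvious. Let $A \in B(H_1, H_2)$, $c \in \mathbb{C}$, $x \in H_2$. Then the computation
$$(cA)^*(x) = \gamma_{H_1}^{-1} \circ (cA)^t \circ \gamma_{H_2}(x) = \gamma_{H_1}^{-1}\big(cA^t(\gamma_{H_2}(x))\big) = \overline{c}\,\gamma_{H_1}^{-1}\big(A^t(\gamma_{H_2}(x))\big) = \overline{c}A^*(x),$$
where we used the linearity of $A \mapsto A^t$ and anti-linearity of $\gamma_{H_1}^{-1}$, shows $(cA)^* = \overline{c}A^*$. (The anti-linearity of $\gamma_{H_2}$ is irrelevant here.)
(iii) If $y \in H_2$ then $\gamma_{H_2}(y) \in H_2^*$ is the functional $\langle\cdot, y\rangle_2$. Then $(A^t \circ \gamma_{H_2})(y) \in H_1^*$ is the functional $x \mapsto \langle Ax, y\rangle_2$. Thus $z = A^*y = (\gamma_{H_1}^{-1} \circ A^t \circ \gamma_{H_2})(y) \in H_1$ is a vector such that $\langle x, z\rangle_1 = \langle Ax, y\rangle_2$ for all $x \in H_1$. This means $\langle x, A^*y\rangle_1 = \langle Ax, y\rangle_2$ $\forall x \in H_1, y \in H_2$, as claimed.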
11.2 Remark Combining (iii) above with the Hellinger-Toeplitz theorem (Corollary 7.32), we see that a linear map $A : H_1 \to H_2$ has a Hilbert space adjoint if and only if it is bounded. □
There is a very useful bijection between bounded operators and bounded sesquilinear forms.
It can be used to give an alternative (at least in appearance) construction of the adjoint A∗
(and for many other purposes). It is based on the following observation: If $A \in B(H)$ satisfies $\langle Ax, y\rangle = 0$ for all $x, y \in H$ then $Ax = 0$ for all $x$, thus $A = 0$. Applying this to $A - B$ shows that $\langle Ax, y\rangle = \langle Bx, y\rangle$ $\forall x, y$ implies $A = B$. Thus bounded operators are distinguished by their ‘matrix elements’ $\langle Ax, y\rangle$. This motivates the following developments.
11.4 Remark Recall that the inner product $\langle\cdot,\cdot\rangle$ on a (pre-)Hilbert space is sesquilinear and bounded by Cauchy-Schwarz. If $\mathbb{F} = \mathbb{R}$, the definition of course reduces to bilinearity. □
11.5 Proposition Let $H$ be a Hilbert space. Then there is a bijection between $B(H)$ and the set of bounded sesquilinear forms on $H$, given by $B(H) \ni A \mapsto [\cdot,\cdot]_A$, where $[x, y]_A = \langle Ax, y\rangle$.
Proof. Let $A \in B(H)$. Sesquilinearity of $[x, y]_A = \langle Ax, y\rangle$ is an obvious consequence of sesquilinearity of $\langle\cdot,\cdot\rangle$ and linearity of $A$, and boundedness follows from Cauchy-Schwarz:
$$|[x, y]_A| = |\langle Ax, y\rangle| \le \|Ax\|\|y\| \le \|A\|\|x\|\|y\|.$$
Now let $[\cdot,\cdot]$ be a sesquilinear form bounded by $M$. Then for each $x \in H$, the map $\psi_x : H \to \mathbb{C}$, $y \mapsto \overline{[x, y]}$ is linear (thanks to the complex conjugation) and satisfies $|\psi_x(y)| \le M\|y\|\|x\|$, thus $\psi_x \in H^*$. Thus by Theorem 5.34 there is a unique vector $z_x \in H$ such that $\psi_x = \varphi_{z_x}$, thus $\overline{[x, y]} = \psi_x(y) = \varphi_{z_x}(y) = \langle y, z_x\rangle$ $\forall y$ and, taking complex conjugates, $\langle z_x, y\rangle = [x, y]$ $\forall y$. Thus defining $A : H \to H$ by $Ax = z_x$ $\forall x$ we have $\langle Ax, y\rangle = [x, y]$ $\forall x, y$. Since the maps $x \mapsto \psi_x$ and $\psi_x \mapsto z_x$ are both anti-linear, their composite $A$ is linear. And since $H \to H^*$, $z \mapsto \varphi_z$ is an isometry, we have $\|Ax\| = \|z_x\| = \|\varphi_{z_x}\| = \|\psi_x\| \le M\|x\|$, thus $A \in B(H)$.
11.7 Remark 1. In view of the identity $\langle Ax, y\rangle = \langle x, A^*y\rangle$ satisfied by the adjoint as defined above and Proposition 11.1(iii) (with $H_1 = H_2 = H$), it is clear that the two constructions of $A^*$ give the same result (and in a sense are the same construction since both use Theorem 5.34).
2. Proposition 11.5 generalizes readily to a bijection between bounded linear maps $A : H_1 \to H_2$ and bounded sesquilinear forms $[\cdot,\cdot]$ on $H_1 \times H_2$. Then also the proof of Proposition 11.6 generalizes in this way and then produces the same adjoint $A^* \in B(H_2, H_1)$ as Proposition 11.1. (Of course $A \in B(H_1, H_2)$ can only be self-adjoint if $H_1 = H_2$.)
3. If $[\cdot,\cdot]$ is a sesquilinear form then also $[x, y]' := \overline{[y, x]}$ is a sesquilinear form, called the adjoint form. Looking at the above definition of $A^*$, one finds that $A^*$ is the bounded operator associated with the form $[\cdot,\cdot]'$. Thus self-adjointness of $A$ is equivalent to $[\cdot,\cdot]'_A = [\cdot,\cdot]_A$, i.e. $[\cdot,\cdot]_A$ being self-adjoint.
4. The following should be known from linear algebra, cf. e.g. [55]: If $H$ is a Hilbert space, $A \in B(H)$ and $E$ is an orthonormal basis for $H$ then $\langle A^*e, f\rangle = \langle e, Af\rangle = \overline{\langle Af, e\rangle}$ for all $e, f \in E$. Thus the (possibly infinite) matrix describing $A^*$ w.r.t. $E$ is obtained from the matrix corresponding to $A$ by transposition and complex conjugation. □
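For finite-dimensional $H_1 = \mathbb{C}^2$, $H_2 = \mathbb{C}^3$ the statement of item 4 can be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`inner`, `apply`, `adjoint`) and the test data are hypothetical, and the inner product is assumed linear in the first argument, as in these notes.

```python
# Sketch: on C^n with <x, y> = sum_i x_i * conj(y_i), the adjoint of a matrix
# is its conjugate transpose, so that <Ax, y> = <x, A*y>.

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product; A is given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Hypothetical test data, chosen only for illustration.
A = [[1 + 2j, 3j], [0.5, -1j], [2, 1 - 1j]]   # maps C^2 -> C^3
x = [1 - 1j, 2 + 0.5j]                        # vector in C^2
y = [0.5j, -1, 3 + 1j]                        # vector in C^3

lhs = inner(apply(A, x), y)            # <Ax, y>_2
rhs = inner(x, apply(adjoint(A), y))   # <x, A*y>_1
print(abs(lhs - rhs) < 1e-12)
```

The check also makes visible why $A^*$ maps in the opposite direction: `adjoint(A)` has two rows and three columns.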
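(ii) For all $x, y \in H$ we have $\langle x, (AB)^*y\rangle = \langle (AB)x, y\rangle = \langle Bx, A^*y\rangle = \langle x, B^*A^*y\rangle$.
(iii) Complex conjugating $\langle Ax, y\rangle = \langle x, A^*y\rangle$ gives
$$\langle A^*y, x\rangle = \langle y, Ax\rangle \quad \forall x, y \in H,$$
which shows that $A$ is an adjoint of $A^*$. Uniqueness of the adjoint now implies $A^{**} = A$.
(iv) Obvious.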
11.9 Proposition Let $H$ be a Hilbert space. Then for all $A \in B(H)$ we have
(i) $\|A^*\| = \|A\|$. (The ∗-operation is isometric.)
(ii) $\|A^*A\| = \|A\|^2$. (“$C^*$-identity”)
Proof. (i) Similarly to Lemma 9.31, using (5.2) we have
$$\|A^*\| = \sup_{\|x\|=\|y\|=1} |\langle A^*x, y\rangle| = \sup_{\|x\|=\|y\|=1} |\langle x, Ay\rangle| = \sup_{\|x\|=\|y\|=1} |\langle Ay, x\rangle| = \|A\|.$$
(ii) On the one hand, $\|A^*A\| \le \|A^*\|\|A\| = \|A\|^2$, where we used (i). On the other hand,
$$\|A\|^2 = \Big(\sup_{\|x\|=1} \|Ax\|\Big)^2 = \sup_{\|x\|=1} \|Ax\|^2 = \sup_{\|x\|=1} \langle Ax, Ax\rangle = \sup_{\|x\|=1} \langle A^*Ax, x\rangle \le \|A^*A\|,$$
where the last step uses the Cauchy-Schwarz inequality.
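Both identities of Proposition 11.9 can be tested on $2 \times 2$ matrices. The sketch below is illustrative only and assumes the standard fact that the operator norm of a matrix $A$ on $\mathbb{C}^2$ is the square root of the largest eigenvalue of $A^*A$; the function names and the sample matrix are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def op_norm(A):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of A*A."""
    G = matmul(adjoint(A), A)                    # Hermitian and positive
    tr = (G[0][0] + G[1][1]).real
    det = (G[0][0] * G[1][1] - G[0][1] * G[1][0]).real
    disc = max(tr * tr - 4.0 * det, 0.0)         # guard against rounding
    return math.sqrt((tr + math.sqrt(disc)) / 2.0)

A = [[1, 2j], [0, 1 + 1j]]                       # hypothetical sample matrix
n = op_norm(A)
print(abs(op_norm(adjoint(A)) - n) < 1e-9)                   # ||A*|| = ||A||
print(abs(op_norm(matmul(adjoint(A), A)) - n * n) < 1e-9)    # ||A*A|| = ||A||^2
```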
The following is a Hilbert space version of the results of Section 9.5, but much easier:
Thus $A^*$ is injective if and only if $(AH_1)^\perp = \{0\}$, which is equivalent to $\overline{AH_1} = H_2$ by Exercise 5.29(i). Applying the fact just proven to $A^*$ and using $A^{**} = A$ proves $A$ injective ⇔ $A^*$ has dense image.
(iii) We will prove that closedness of $AH_1$ implies closedness of $A^*H_2$. Replacing $A$ by $A^*$ then gives the converse implication. Put $H_{10} = \ker A$ (which is closed) and $H_{11} = H_{10}^\perp$. Then we have a direct sum decomposition $H_1 = H_{10} \oplus H_{11}$ by Theorem 5.27. By assumption $H_{21} = AH_1$ is closed, so that with $H_{20} = H_{21}^\perp$ we also have $H_2 = H_{20} \oplus H_{21}$. Now $A$ maps $H_{11}$ injectively (since $H_{11} \cap \ker A = H_{11} \cap H_{11}^\perp = \{0\}$) onto $H_{21} = AH_1$. This defines an operator $A' \in B(H_{11}, H_{21})$ that is injective and surjective, thus invertible, so that $A'^*$ is invertible by (i) and therefore has closed image $A'^*H_{21} = H_{11} \subseteq H_1$. Now closedness of $A^*H_2$ follows once we prove that $A^*(x_{20} + x_{21}) = A'^*x_{21}$ for all $x_{2i} \in H_{2i}$. Since $A^*$ vanishes on $H_{20} = (AH_1)^\perp$ by (ii), it remains to prove that $A^* \restriction H_{21}$ coincides with $A'^*$ followed by the inclusion $H_{11} \hookrightarrow H_1$. This follows from the computation
$$\langle x_{10} + x_{11}, A^*x_{21}\rangle = \langle A(x_{10} + x_{11}), x_{21}\rangle = \langle Ax_{11}, x_{21}\rangle = \langle A'x_{11}, x_{21}\rangle = \langle x_{11}, A'^*x_{21}\rangle,$$
where $x_{ij} \in H_{ij}$.
(iv) By Exercise 7.43, $A$ is bounded below if and only if it is injective and has closed image. By (ii), injectivity of $A$ is equivalent to $A^*$ having dense image, and by (iii) closedness of the image of $A$ is equivalent to closedness of the image of $A^*$. The proof is concluded by appealing to the trivial fact that surjectivity of $A^*$ is equivalent to the combination of closedness and density of its image.
11.14 Definition If $H_1, H_2$ are Hilbert spaces then $A \in B(H_1, H_2)$ is called a coisometry if $AA^* = \mathrm{id}_{H_2}$.
It is clear that $A$ is a coisometry if and only if $A^*$ is an isometry. Thus by the above, we have that if $A$ is a coisometry then $A \restriction (\ker A)^\perp$ is unitary, and also the converse is easily checked. This suggests the following generalization:
11.16 Exercise Prove that for V ∈ B(H), the following are equivalent.
(i) V is a partial isometry.
(ii) V ∗ is a partial isometry.
(iii) V ∗ V is an orthogonal projection.
(iv) V V ∗ is an orthogonal projection.
(v) V V ∗ V = V (trivially equivalent to V ∗ V V ∗ = V ∗ ).
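And $[\cdot,\cdot] : V \times V \to \mathbb{C}$ is self-adjoint, i.e. $[x, y] = \overline{[y, x]}$ $\forall x, y$, if and only if $[x, x] \in \mathbb{R}$ $\forall x$.
(ii) Every bilinear form $[\cdot,\cdot] : V \times V \to \mathbb{R}$ that is symmetric, i.e. $[x, y] = [y, x]$ $\forall x, y$, satisfies
$$[x, y] = \frac{1}{4}\big([x + y, x + y] - [x - y, x - y]\big) \quad \forall x, y \in V.$$
Proof. The polarization identities are proven by the same computations as for (5.5), the proof of which only used sesquilinearity of $\langle\cdot,\cdot\rangle$, and (5.4), for which we also used the symmetry $\langle x, y\rangle = \langle y, x\rangle$ of inner products over $\mathbb{R}$.
If a sesquilinear form $[\cdot,\cdot]$ is self-adjoint and $x \in V$ then $[x, x] = \overline{[x, x]}$, thus $[x, x] \in \mathbb{R}$ for all $x \in V$. Conversely, if $[x, x] \in \mathbb{R}$ $\forall x$ then with (11.1) we have
$$\overline{[x, y]} = \frac{1}{4}\sum_{k=0}^{3} i^{-k}[x + i^k y, x + i^k y] = \frac{1}{4}\sum_{k=0}^{3} i^{k}[x + i^{-k} y, x + i^{-k} y] = \frac{1}{4}\sum_{k=0}^{3} i^{k}[i^k x + y, i^k x + y] = [y, x],$$
so that $[\cdot,\cdot]$ is self-adjoint.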
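The complex polarization identity lends itself to a quick numerical test. The sketch below assumes that (11.1) has the form $[x, y] = \frac{1}{4}\sum_{k=0}^{3} i^k [x + i^k y, x + i^k y]$ for forms linear in the first argument, and uses the form $[x, y]_A = \langle Ax, y\rangle$ on $\mathbb{C}^2$; all names and matrices are hypothetical illustration data.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product; A is given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def form(A, x, y):
    """Sesquilinear form [x, y]_A = <Ax, y>."""
    return inner(apply(A, x), y)

def polarize(A, x, y):
    """(1/4) * sum_{k=0}^{3} i^k [x + i^k y, x + i^k y]_A."""
    total = 0
    for k in range(4):
        ik = 1j ** k
        v = [a + ik * b for a, b in zip(x, y)]
        total += ik * form(A, v, v)
    return total / 4

A = [[1 + 1j, 2], [3j, -1]]            # not self-adjoint
A_h = [[2, 1 - 1j], [1 + 1j, -1]]      # self-adjoint
x, y = [1 + 2j, -1j], [0.5, 2 - 1j]

print(abs(polarize(A, x, y) - form(A, x, y)) < 1e-12)   # identity holds for any A
print(abs(form(A_h, x, x).imag) < 1e-12)                # [x, x] real if A = A*
```

For the non-self-adjoint `A` the diagonal value `form(A, x, x)` is in general not real, in line with the equivalence just proven.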
11.18 Remark The above shows that in the case $\mathbb{F} = \mathbb{C}$ we can omit axiom (ii) in Definition 5.1 if we replace the assumption of linearity in $x$ by sesquilinearity in $x, y$, since (ii) then follows from the positivity assumption $\langle x, x\rangle \ge 0$ $\forall x$. □
Now we can show that Hilbert space operators tend to be determined by their diagonal elements⁶⁶ $\langle Ax, x\rangle$:
11.21 Exercise Let $H$ be a Hilbert space over $\mathbb{F}$ and $f : H \to \mathbb{F}$ a function. Prove that the following are equivalent:
(i) There exists $A \in B(H)$ such that $f(x) = \langle Ax, x\rangle$ $\forall x \in H$.
(ii) The function $f$ satisfies
(α) $f(x + y) + f(x - y) = 2f(x) + 2f(y)$ $\forall x, y \in H$.
(β) $f(cx) = |c|^2 f(x)$ $\forall x \in H, c \in \mathbb{F}$.
(γ) $|f(x)| \le C\|x\|^2$ for some $C \ge 0$.
Note that if $\mathbb{F} = \mathbb{R}$, the statement $\langle Ax, x\rangle \in \mathbb{R}$ is trivially true and therefore implies nothing, in particular not $A = A^*$, again by Remark 11.20.
⁶⁶ This language is a bit misleading since we are looking at $\langle Ax, x\rangle$ for all $x$, not just $x \in E$ for some fixed basis $E$.
11.23 Exercise (i) Let $H$ be a Hilbert space, $A \in B(H)$ and $K \subseteq H$ a closed subspace such that $AK \subseteq K$. (We say $K$ is $A$-invariant.) Prove that $AK^\perp \subseteq K^\perp$ is equivalent to $A^*K \subseteq K$.
In this situation, $K$ is called reducing since then $A \cong A|_K \oplus A|_{K^\perp}$.
(ii) Deduce that every invariant subspace of a self-adjoint operator is reducing.
11.25 Remark On every complex Hilbert space $H \neq \{0\}$ there are normal operators that are neither self-adjoint nor unitary. Non-normal operators exist whenever $\dim H \ge 2$. □
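11.26 Lemma Let $H$ be a Hilbert space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $A \in B(H)$. Then
$$\|A^*x\|^2 = \langle A^*x, A^*x\rangle = \langle AA^*x, x\rangle, \qquad \|Ax\|^2 = \langle Ax, Ax\rangle = \langle A^*Ax, x\rangle,$$
so that $\|A^*x\| = \|Ax\|$ $\forall x$ is equivalent to $\langle A^*Ax, x\rangle = \langle AA^*x, x\rangle$ $\forall x$ and therefore is implied by normality of $A$. If $\|Ax\| = \|A^*x\|$ $\forall x$ then $\langle AA^*x, x\rangle = \langle A^*Ax, x\rangle$ $\forall x$, so that Lemma 11.19(ii) gives $AA^* = A^*A$. (This also holds for $\mathbb{F} = \mathbb{R}$ since $AA^*$ and $A^*A$ are self-adjoint.)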
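The criterion $\|Ax\| = \|A^*x\|$ $\forall x$ is easy to probe on $\mathbb{C}^2$. The following sketch uses two hypothetical matrices: `N`, which is normal but neither self-adjoint nor unitary, and the nilpotent `J`, which is not normal. It tests finitely many vectors only and is of course not a proof.

```python
import math

def apply(A, x):
    """Matrix-vector product; A is given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def norm(x):
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def norm_gap(A, x):
    """| ||Ax|| - ||A*x|| | for a single vector x."""
    return abs(norm(apply(A, x)) - norm(apply(adjoint(A), x)))

N = [[1, 1j], [1j, 1]]   # normal: N = 1 + iS with S = [[0, 1], [1, 0]] self-adjoint
J = [[0, 1], [0, 0]]     # not normal
samples = [[1, 0], [0, 1], [1, 1j], [2 - 1j, 0.5 + 3j]]

print(max(norm_gap(N, x) for x in samples) < 1e-12)   # no gap for the normal N
print(norm_gap(J, [0, 1]))                            # ||J e_2|| = 1, ||J* e_2|| = 0
```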
11.27 Proposition Let $H$ be a Hilbert space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $A \in B(H)$ normal. Then
(i) $\ker A = \ker A^* = (AH)^\perp = (A^*H)^\perp$.
(ii) With $H_0 = \ker A$ and $H' = H_0^\perp$ we have $AH' \subseteq H'$ and $A^*H' \subseteq H'$ with dense images. (In particular, $\ker A$ is a reducing subspace.)
(iii) $A$ is injective if and only if it has dense image.
(iv) $A$ is invertible ⇔ $A$ is bounded below ⇔ $A$ is surjective.
Proof. (i) The first equality is immediate from Lemma 11.26. The rest follows by applying Lemma 11.10 to $A$ and $A^*$.
(ii) Taking the orthogonal complement of $(AH)^\perp = \ker A = H_0$ gives $(AH)^{\perp\perp} = \overline{AH} = H'$ and similarly for $A^*$.
(iii) In view of $(AH)^\perp = \ker A$ established in (i), injectivity of $A$ is equivalent to $(AH)^\perp = \{0\}$, which is equivalent to density of the image $AH$ by Exercise 5.29(i).
(iv) Invertibility implies boundedness below and surjectivity, cf. Proposition 7.41. If a normal operator is bounded below, it is injective, so that it has dense image by (iii). Now boundedness below and dense image imply invertibility by Proposition 7.41. And surjectivity implies dense image, thus injectivity by (iii). Now injectivity and surjectivity give invertibility by the BIT.
11.28 Remark In Corollary 17.25 we will characterize invertibility of $A \restriction (\ker A)^\perp$ in terms of the spectrum of $A$. □
11.29 Exercise Describe (i) the unitary self-adjoint operators, (ii) the normal partial isome-
tries.
11.30 Exercise Give an example of a normal operator A ∈ B(H) such that there is a non-
trivial A-invariant subspace K ⊆ H that is not reducing. (See Exercise 11.23 for the terminol-
ogy.)
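11.33 Lemma Let $H$ be a Hilbert space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $A \in B(H)$. Then
(i) $|||A||| \le \|A\|$.
(ii) $|\langle Ax, x\rangle| \le \|x\|^2\, |||A|||$ for all $x \in H$.
(iii) If $|||A||| = 0$ and either $\mathbb{F} = \mathbb{C}$ or $A = A^*$ then $A = 0$.
Proof. (i) If $\|x\| = 1$ then $|\langle Ax, x\rangle| \le \|Ax\|\|x\| \le \|A\|$.
(ii) For $x = 0$ this is clear. For $x \neq 0$ we have $\|x\|^{-2}|\langle Ax, x\rangle| = \big|\big\langle A\frac{x}{\|x\|}, \frac{x}{\|x\|}\big\rangle\big| \le |||A|||$.
(iii) In view of (ii) this is just a restatement of Lemma 11.19(i).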
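11.34 Proposition Let $H$ be a Hilbert space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $A \in B(H)$. Then
(i) If $A = A^*$ then $\|A\| = |||A|||$.
(ii) If $\mathbb{F} = \mathbb{C}$ then $\frac{1}{2}\|A\| \le |||A|||$.
(iii) If $\mathbb{F} = \mathbb{C}$ and $A$ is normal then $\|A\| = |||A|||$.
Proof. (i) If $x, y \in H$ are unit vectors then
$$\begin{aligned} |\langle A(x + y), x + y\rangle - \langle A(x - y), x - y\rangle| &\le |\langle A(x + y), x + y\rangle| + |\langle A(x - y), x - y\rangle| \\ &\le (\|x + y\|^2 + \|x - y\|^2)\,|||A||| \\ &= 2(\|x\|^2 + \|y\|^2)\,|||A||| = 4\,|||A|||, \end{aligned}$$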
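where we used Lemma 11.33(ii), the parallelogram identity (5.3) and $\|x\| = \|y\| = 1$. Inserting $y = Ax/\|Ax\|$ for a unit vector $x$ with $Ax \neq 0$, the left hand side becomes $2\,\big|\|Ax\| + \|Ax\|^{-1}\langle A^2x, x\rangle\big|$, so that
$$\big|\,\|Ax\| + \|Ax\|^{-1}\langle A^2x, x\rangle\,\big| \le 2\,|||A|||. \qquad (11.2)$$
If now $A = A^*$ then $\langle A^2x, x\rangle = \langle Ax, Ax\rangle = \|Ax\|^2$, so that (11.2) reduces to $\|Ax\| \le |||A|||$ when $\|x\| = 1$, whence $\|A\| \le |||A|||$. Combining this with Lemma 11.33(i), we are done.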
(ii) If we replace $A$ in (11.2) by $\alpha A$ where $|\alpha| = 1$ then $\|Ax\|$ and $|||A|||$ are unchanged but $\langle A^2x, x\rangle$ acquires a factor $\alpha^2$. Since $\mathbb{F} = \mathbb{C}$, we can choose $\alpha$ such that $\alpha^2\langle A^2x, x\rangle = |\langle A^2x, x\rangle| \ge 0$. Then (11.2) becomes $\|Ax\| + \|Ax\|^{-1}|\langle A^2x, x\rangle| \le 2\,|||A|||$, which implies $\|Ax\| \le 2\,|||A|||$ for all unit vectors $x$ with $Ax \neq 0$. This also holds if $Ax = 0$, thus $\|A\| = \sup_{\|x\|=1}\|Ax\| \le 2\,|||A|||$.
(iii) See the following exercise.
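That the factor $\frac{1}{2}$ in (ii) cannot be improved can be seen on $\mathbb{C}^2$. The sketch below approximates $|||A||| = \sup_{\|x\|=1}|\langle Ax, x\rangle|$ by a grid search over unit vectors; the function name, the grid size and the two sample matrices are hypothetical choices, and a grid search only gives a lower bound for the supremum in general.

```python
import cmath
import math

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product; A is given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def numerical_radius_2x2(A, steps=200):
    """Grid approximation of sup |<Ax, x>| over unit vectors x in C^2.
    Up to an irrelevant global phase, x = (cos t, e^{i phi} sin t)."""
    best = 0.0
    for a in range(steps + 1):
        t = (math.pi / 2) * a / steps
        for b in range(steps):
            phi = 2 * math.pi * b / steps
            x = [math.cos(t), cmath.exp(1j * phi) * math.sin(t)]
            best = max(best, abs(inner(apply(A, x), x)))
    return best

N_nilp = [[0, 1], [0, 0]]    # operator norm 1, not normal
D_diag = [[2, 0], [0, -1]]   # self-adjoint, operator norm 2

print(numerical_radius_2x2(N_nilp))   # close to 1/2 = ||N_nilp|| / 2
print(numerical_radius_2x2(D_diag))   # close to 2 = ||D_diag||, as in (i)
```

For `N_nilp` one has $|\langle Ax, x\rangle| = |x_1||x_2| \le \frac{1}{2}$, with equality at $|x_1| = |x_2|$, so the bound in (ii) is attained.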
(ii) For each $A \in B(H)$ we have $A^*A \ge 0$. In particular, $A = A^* \Rightarrow A^2 \ge 0$.
(iii) If $A \in B(H)$ is positive then $BAB^*$ is positive for every $B \in B(H)$.
(iv) If $A, B \in B(H)$ are positive and $A + B = 0$ then $A = B = 0$.
(v) If $A \ge 0$ then $A^k \ge 0$ for all $k \in \mathbb{N}$.
11.40 Theorem If $H$ is a Hilbert space and $A \in B(H)$ is positive then there is a unique positive $B \in B(H)$ such that $B^2 = A$. We call $B$ the square root $A^{1/2}$ of $A$.
Proof. Existence: It suffices to prove the claim under the additional assumption $\|A\| \le 1$ since then for general $A \neq 0$ we can put $A^{1/2} = \|A\|^{1/2}\left(\frac{A}{\|A\|}\right)^{1/2}$.
Let $A \ge 0$, $\|A\| \le 1$. Then for $x \in H$, $\|x\| = 1$ we have $0 \le \langle Ax, x\rangle \le 1$, thus $\langle (1 - A)x, x\rangle = 1 - \langle Ax, x\rangle \in [0, 1]$, so that $1 - A \ge 0$. Furthermore, Proposition 11.34(i) implies $\|1 - A\| = \sup_{\|x\|=1}\langle (1 - A)x, x\rangle \le 1$. Thus $A \mapsto 1 - A$ is an involutive bijection from the set $\{A \ge 0 \mid \|A\| \le 1\}$ to itself, so that it suffices to define $(1 - A)^{1/2}$ whenever $A \ge 0$, $\|A\| \le 1$. (This is advantageous since $z \mapsto (1 - z)^{1/2}$ is analytic at $z = 0$ while $z \mapsto z^{1/2}$ is not.)
The function $z \mapsto (1 - z)^{1/2}$ is infinitely differentiable at $z = 0$, with (formal) Taylor series
$$(1 - z)^{1/2} = \sum_{k=0}^{\infty} c_k z^k = 1 - \left(\frac{1}{2}\frac{z}{1!} + \frac{1}{2}\cdot\frac{1}{2}\frac{z^2}{2!} + \frac{1}{2}\cdot\frac{1}{2}\cdot\frac{3}{2}\frac{z^3}{3!} + \cdots\right), \qquad (11.3)$$
where the $c_k$ can be found by explicit computation or appeal to Newton’s binomial theorem, cf. e.g. [57, Vol. 1, Theorem 7.6.4]. We see that $c_0 = 1$ and $c_k < 0$ for all $k \ge 1$. The power series has convergence radius one and does converge to $(1 - z)^{1/2}$ whenever $|z| < 1$. This can be seen invoking either complex analysis or Newton’s theorem. As $z \nearrow 1$, the l.h.s. converges to zero, implying $\sum_{k=1}^{\infty} c_k = -1$, thus $\sum_{k=0}^{\infty} |c_k| = 2$ and $|c_k| \le 1$ $\forall k$. Thus for $\|A\| \le 1$ we have $\sum_{k=0}^{\infty} |c_k|\|A^k\| < \infty$, so that $\sum_{k=0}^{\infty} c_k A^k$ converges in norm by Proposition 3.15. We interpret the sum as $(1 - A)^{1/2}$. To see that this is justified we note that squaring (11.3) gives $1 - z = \left(\sum_{k=0}^{\infty} c_k z^k\right)^2$, which by absolute convergence also holds with $z$ replaced by $A$. Since $A$ is self-adjoint with $\|A\| \le 1$, also $A^k$ has these properties for all $k \ge 1$. Thus for all $k$ and $x \in H$, $\|x\| = 1$ we have $\langle A^k x, x\rangle \in [-1, 1]$. (Actually $\cdots \in [0, 1]$ since $A^k \ge 0$ $\forall k \in \mathbb{N}$ by Exercise 11.39(v), but we don’t need this.) With $\sum_{k=1}^{\infty} |c_k| = 1$ it follows that $\sum_{k=1}^{\infty} c_k\langle A^k x, x\rangle \in [-1, 1]$, so that with $c_0 = 1$ we have
$$\big\langle (1 - A)^{1/2}x, x\big\rangle = \Big\langle \sum_{k=0}^{\infty} c_k A^k x, x\Big\rangle = \sum_{k=0}^{\infty} c_k\langle A^k x, x\rangle \ge 0.$$
Since $(B - C)B(B - C)$ and $(B - C)C(B - C)$ are positive by Exercise 11.39(iii), Exercise 11.39(iv) gives $(B - C)B(B - C) = (B - C)C(B - C) = 0$. Thus also their difference $(B - C)^3$ vanishes, and so does $(B - C)^4$. Since $B - C$ is self-adjoint, we have $\|B - C\|^4 = \|(B - C)^2\|^2 = \|(B - C)^4\| = 0$, thus $B = C$.
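The power series construction can be tried out on a positive $2 \times 2$ matrix. The sketch below is only an illustration under stated assumptions: the coefficients of $(1 - z)^{1/2}$ are generated by the recurrence $c_k = c_{k-1}(2k - 3)/(2k)$, which follows from the binomial series; the square root of a positive $P$ with $\|P\| \le 1$ is then the truncated sum $\sum_{k \le n} c_k (1 - P)^k$. The names `sqrt_coeffs`, `sqrt_psd` and the matrix `P` are hypothetical.

```python
def sqrt_coeffs(n):
    """Taylor coefficients c_0, ..., c_n of (1 - z)^(1/2) at z = 0,
    via the recurrence c_k = c_{k-1} * (2k - 3) / (2k)."""
    c = [1.0]
    for k in range(1, n + 1):
        c.append(c[-1] * (2 * k - 3) / (2 * k))
    return c

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrt_psd(P, n=200):
    """Truncated series sum_{k<=n} c_k (1 - P)^k for a real symmetric
    positive 2x2 matrix P with norm at most 1."""
    c = sqrt_coeffs(n)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    X = [[identity[i][j] - P[i][j] for j in range(2)] for i in range(2)]
    power = identity                      # X^0
    B = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        B = [[B[i][j] + c[k] * power[i][j] for j in range(2)] for i in range(2)]
        power = matmul(power, X)
    return B

P = [[0.5, 0.2], [0.2, 0.4]]              # positive, eigenvalues in (0, 1)
B = sqrt_psd(P)
B2 = matmul(B, B)
err = max(abs(B2[i][j] - P[i][j]) for i in range(2) for j in range(2))
print(err < 1e-10)                        # B^2 reproduces P up to rounding
```

Convergence is fast here because $\|1 - P\| < 1$; when $\|1 - P\| = 1$ the tail only decays like $k^{-3/2}$, so many more terms would be needed.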
11.41 Remark 1. For a proof of Theorem 11.40 that uses the Weierstrass approximation
theorem instead of a power series, cf. e.g. [118, Lemma 3.2.10 & Proposition 3.2.11].
2. In Section 17 we will develop a less ad hoc and much more general approach to applying (continuous) functions to (normal) operators, but this will require much work. □
11.42 Definition Let $H$ be a Hilbert space and $A \in B(H)$. Then we define the absolute value $|A| = (A^*A)^{1/2}$. (Apply Theorem 11.40 to $A^*A$, which is positive by Exercise 11.39(ii).)
11.43 Exercise Let $A \in B(H)$ be invertible. Prove (directly, without Proposition 11.44):
(i) $B \in B(H)$ is invertible if and only if $B^2$ is invertible.
(ii) $|A|$ is invertible.
(iii) $U = A|A|^{-1}$ is unitary.
Thus we have the polar decomposition $A = U|A|$ with $U$ unitary and $|A|$ positive and invertible.
If $A$ is not invertible, a form of polar decomposition still exists, but is more subtle:
$$\|Ax\|^2 = \langle Ax, Ax\rangle = \langle x, A^*Ax\rangle = \langle (A^*A)^{1/2}x, (A^*A)^{1/2}x\rangle = \langle |A|x, |A|x\rangle = \||A|x\|^2, \qquad (11.4)$$
implying $\ker |A| = \ker A$. If $|A|x = |A|x'$ then $|A|(x - x') = 0$, thus $x - x' \in \ker |A| = \ker A$, so that $Ax = Ax'$. Thus there is a well-defined linear map $V : |A|H \to H$ satisfying $V|A|x = Ax$. By (11.4) this map is isometric and therefore extends continuously to $\overline{|A|H}$ by Lemma 3.12. We extend $V$ to all of $H$ by having it send $(|A|H)^\perp$ to zero, obtaining a partial isometry. We have
$$\ker V = (|A|H)^\perp = \ker |A|^* = \ker |A| = \ker A,$$
where we used Lemma 11.10, the self-adjointness of $|A|$ and (11.4). It is clear from the definition that $V|A| = A$.
That $V$ is uniquely determined by its properties is quite clear: It must send $|A|x$ to $Ax$, which determines it on $\overline{|A|H}$. And in view of $\ker A = \ker |A| = \ker |A|^* = (|A|H)^\perp$, the requirement $\ker V = \ker A$ forces $V$ to be zero on $(|A|H)^\perp$.
(ii) It is trivial that an injective (bijective) partial isometry is an isometry (unitary).
(iii) Using (11.4) as in (i) there is a unique partial isometry $W$ such that $WAx = |A|x$ $\forall x \in H$ and $W \restriction (AH)^\perp = 0$. Now it is immediate that $WV \restriction \overline{|A|H} = \mathrm{id}$ and $VW \restriction \overline{AH} = \mathrm{id}$, while $WV$ and $VW$ vanish on $(|A|H)^\perp$ and $(AH)^\perp$ respectively. Thus $W = V^*$, and we are done.
11.45 Remark Exercise 11.16 generalizes without any difficulty to operators $V \in B(H, H')$ between different Hilbert spaces $H \neq H'$. Then, of course, we have $V^*V \in B(H)$, $VV^* \in B(H')$. Also Proposition 11.44 generalizes to $A \in B(H, H')$. Then $|A| = (A^*A)^{1/2} \in B(H)$ and $V \in B(H, H')$. □
11.46 Exercise Let $\{\lambda_k\}_{k \in \mathbb{N}} \subseteq \mathbb{C}$ be a bounded sequence. Let $H = \ell^2(\mathbb{N}, \mathbb{C})$ and $A_\lambda \in B(H)$ defined by $A_\lambda e_k = \lambda_k e_{k+1}$ (where $e_k(n) = \delta_{k,n}$).
(i) Compute $(A_\lambda)^*$, $|A_\lambda|$ and the partial isometry $V$ in the polar decomposition of $A_\lambda$.
(ii) Give a necessary and sufficient condition on $\{\lambda_k\}$ for $A_\lambda$ to be normal.
11.49 Remark 1. If $H$ is finite-dimensional, $\mathrm{Tr}_E(A)$ coincides with our earlier definition and therefore does not depend on the choice of $E$.
2. The sum is uniquely defined since $\langle Ae, e\rangle \ge 0$ for all $e \in E$, but establishing the $E$-independence now is less straightforward. (The trace of certain non-positive operators will be considered in Appendix B.11.) We begin with two special cases. □
11.50 Exercise Let H be a Hilbert space over F and P ∈ B(H) an orthogonal projection.
Prove that TrE (P ) equals the rank of P , i.e. the Hilbert space dimension of the closed subspace
P H ⊆ H as an F-Hilbert space, for each ONB E. Hint: Begin with rank-one projections.
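Proof. (i) Using Parseval’s identity ($\|x\|^2 = \sum_{e' \in E} |\langle x, e'\rangle|^2$), we have
$$\begin{aligned} \mathrm{Tr}_E(A^*A) &= \sum_{e \in E} \langle A^*Ae, e\rangle = \sum_{e \in E} \langle Ae, Ae\rangle = \sum_{e \in E} \|Ae\|^2 = \sum_{e \in E}\sum_{e' \in E} |\langle Ae, e'\rangle|^2 \\ &= \sum_{e' \in E}\sum_{e \in E} |\langle Ae, e'\rangle|^2 = \sum_{e' \in E}\sum_{e \in E} |\langle e, A^*e'\rangle|^2 = \sum_{e' \in E} \|A^*e'\|^2 \\ &= \sum_{e' \in E} \langle A^*e', A^*e'\rangle = \sum_{e' \in E} \langle AA^*e', e'\rangle = \mathrm{Tr}_E(AA^*), \end{aligned}$$
where the exchange of summations is justified since all summands are non-negative.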
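(ii) If $E$ is an ONB and $U \in B(H)$ is unitary, we have
$$\mathrm{Tr}_E(UAA^*U^*) = \mathrm{Tr}_E((UA)(UA)^*) = \mathrm{Tr}_E((UA)^*(UA)) = \mathrm{Tr}_E(A^*U^*UA) = \mathrm{Tr}_E(A^*A) = \mathrm{Tr}_E(AA^*), \qquad (11.5)$$
where the second and fifth equalities come from (i) as applied to $UA$ and $A$, respectively. If now $E'$ is a second ONB then $E$ and $E'$ have the same cardinality, so that there is a unitary $U$ such that $U^*$ maps $E$ onto $E'$. Together with (11.5) this gives $\mathrm{Tr}_E(AA^*) = \mathrm{Tr}_E(UAA^*U^*) = \mathrm{Tr}_{E'}(AA^*)$, proving the claim.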
11.52 Corollary If $A \in B(H)$ is positive then $\mathrm{Tr}_E(A)$ is independent of the ONB $E$.
Proof. Positivity of $A$ implies that there is a positive $B \in B(H)$ such that $A = B^2 = B^*B$. Now the claim is immediate by Lemma 11.51(ii).
(Attempts to prove the corollary without using square roots tend to have gaps.) The above
considerations will be put to use in Section 12.4 (and B.11).
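In finite dimensions the basis independence of $\mathrm{Tr}_E$ and the identity $\mathrm{Tr}_E(A^*A) = \mathrm{Tr}_E(AA^*)$ are easily confirmed numerically. The sketch below works on $\mathbb{C}^2$ with two orthonormal bases; the matrix `M`, the bases `E1`, `E2` and all function names are hypothetical illustration data.

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product; A is given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace_in_basis(A, basis):
    """Tr_E(A) = sum over e in E of <Ae, e>."""
    return sum(inner(apply(A, e), e) for e in basis)

s = 1 / math.sqrt(2)
E1 = [[1, 0], [0, 1]]                   # standard ONB of C^2
E2 = [[s, s * 1j], [s, -s * 1j]]        # a second ONB of C^2
M = [[1 + 1j, 2], [0, 3 - 1j]]          # arbitrary matrix
P = matmul(adjoint(M), M)               # positive: P = M*M
Q = matmul(M, adjoint(M))               # positive: Q = MM*

print(abs(trace_in_basis(P, E1) - trace_in_basis(P, E2)) < 1e-9)
print(abs(trace_in_basis(P, E1) - trace_in_basis(Q, E1)) < 1e-9)
```

Here all three traces equal the sum of the squared moduli of the entries of `M`.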
12 Compact operators
12.1 Compact Banach space operators
We have met compact topological spaces many times in this course. A subset $Y$ of a topological space $(X, \tau)$ is compact if it is compact when equipped with the induced (=subspace) topology $\tau|_Y$. Recall that a metric space $X$, thus also a subset of a normed space, is compact if and only if it is sequentially compact, i.e. every sequence $\{x_n\}$ in $X$ has a convergent subsequence. And $Y \subseteq X$ is called precompact (or relatively compact) if its closure $\overline{Y}$ is compact. If $X$ is complete metric, precompactness of $Y \subseteq X$ is equivalent to total boundedness.
A subset $Y$ of a normed space $(V, \|\cdot\|)$ is called bounded if there is an $M$ such that $\|y\| \le M$ $\forall y \in Y$. A compact subset of a normed space is closed and bounded, but the converse, while true for finite-dimensional spaces by the Heine-Borel theorem, is false in infinite-dimensional spaces. This is particularly easy to see for an infinite-dimensional Hilbert space: Any ONB $B \subseteq H$ clearly is bounded. For any $e, e' \in B$, $e \neq e'$ we have $\|e - e'\| = \langle e - e', e - e'\rangle^{1/2} = \sqrt{2}$. Thus $B \subseteq H$ is closed and discrete. Since it is infinite, it is not compact.
Related to this: If $H$ is a Hilbert space and $W \subseteq H$ a proper closed subspace, pick $x \in W^\perp$ with $\|x\| = 1$. Then $\|x - w\| = \sqrt{1 + \|w\|^2}$ for each $w \in W$, thus $\mathrm{dist}(x, W) = \inf_{w \in W} \|w - x\| = 1$. This can be generalized:
12.1 Exercise If $V$ is a reflexive Banach space and $W \subseteq V$ a proper closed subspace, there exists $x \in V$ satisfying $\|x\| = 1$ and $\mathrm{dist}(x, W) = 1$. Hint: Use Exercise 9.24.
Without reflexivity, we have the following weaker, but elementary result:
12.2 Lemma (F. Riesz) Let $V$ be a normed space and $W \subsetneq V$ a closed and proper linear subspace. Then for each $\theta \in (0, 1)$ there is an $x_\theta \in V$ such that $\|x_\theta\| = 1$ and $\mathrm{dist}(x_\theta, W) \ge \theta$.
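Proof. Since $W$ is proper and closed, there are $x_0 \in V \setminus W$ and $\varepsilon > 0$ such that $B(x_0, \varepsilon) \subseteq V \setminus W$. Thus $\lambda := \mathrm{dist}(x_0, W) = \inf_{w \in W} d(x_0, w) \ge \varepsilon > 0$. In view of $\theta \in (0, 1)$, we have $\frac{\lambda}{\theta} > \lambda$, so that we can find $w_0 \in W$ with $\lambda \le \|w_0 - x_0\| < \frac{\lambda}{\theta}$. Putting
$$x_\theta = \frac{w_0 - x_0}{\|w_0 - x_0\|},$$
we have $\|x_\theta\| = 1$, and for every $w \in W$
$$\|w - x_\theta\| = \left\|w - \frac{w_0 - x_0}{\|w_0 - x_0\|}\right\| = \frac{\big\|\,\|w_0 - x_0\|w - w_0 + x_0\big\|}{\|w_0 - x_0\|} \ge \frac{\mathrm{dist}(x_0, W)}{\|w_0 - x_0\|} > \frac{\lambda}{\lambda/\theta} = \theta,$$
where the $\ge$ is due to $\|w_0 - x_0\|w - w_0 \in W$ and the $>$ comes from $\mathrm{dist}(x_0, W) = \lambda$ and $\|w_0 - x_0\| < \frac{\lambda}{\theta}$. Since this holds for all $w \in W$, we have $\mathrm{dist}(x_\theta, W) \ge \theta$.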
In view of the above, it is interesting to look at linear operators that send sets S ⊆ V to
sets AS with ‘better compactness properties’. There are several such notions:
12.4 Exercise Let V, W be normed spaces and A : V → W a linear map. Prove that the
following conditions are equivalent and imply boundedness of A:
(i) The image AV≤1 ⊆ W of the closed unit ball V≤1 is precompact.
(ii) Whenever $S \subseteq V$ is bounded, $AS$ is precompact (⇔ totally bounded if $W$ is complete).
(iii) Given any bounded sequence {xn } ⊆ V , the sequence {Axn } has a convergent subsequence.
12.5 Definition Operators A ∈ B(V, W ) satisfying the above equivalent conditions are called
compact. The set of compact operators V → W is denoted K(V, W ), and we put K(V ) =
K(V, V ).
12.6 Remark 0. That compactness implies boundedness is good to know, but rarely important
since it is a priori clear in most situations.
1. Some authors write B0 (V ) rather than K(V ), motivated by Exercise 12.13(iii) below.
2. In the older literature one can find ‘completely continuous’ as synonym for compact, but
this should be avoided since complete continuity now is defined differently and in general is not
equivalent to compactness.
3. If $A \in B(V, W)$ is compact and $V' \subseteq V$ is closed then the restriction $A|_{V'} \in B(V', W)$ is compact. If $A \in B(V)$ is compact, $W \subseteq V$ is closed and $AW \subseteq W$ then $A|_W \in B(W)$ is compact.
4. The Heine-Borel theorem implies that every linear operator on a finite-dimensional normed space (automatically bounded by Exercise 3.7) is compact. For infinite-dimensional spaces this is false since every closed ball is bounded but non-compact by Proposition 12.3. In particular the unit operator $\mathbf{1}_V$ is compact if and only if $V$ is finite-dimensional.
5. Compactness can also be defined for non-linear maps between Banach spaces. But then the three versions above are no longer equivalent and continuity is no longer automatic. See Section B.15. □
Before we develop further theory, we should prove that (non-zero) compact operators on
infinite-dimensional spaces exist. The following may be known from Exercise 10.10:
12.7 Definition Let V, W be normed spaces and A ∈ B(V, W ). Then A has finite rank if
its image AV ⊆ W is finite-dimensional. The set of finite rank operators V → W is denoted
F (V, W ). Again, F (V ) = F (V, V ).
12.9 Lemma Finite rank operators are compact. Thus $F(V, W) \subseteq K(V, W)$ for all Banach spaces $V, W$.
Proof. Let $A \in F(V, W)$. If $S \subseteq V$ is bounded then $AS \subseteq AV$ is bounded by boundedness of $A$. Since $AV \subseteq W$ is finite-dimensional, it is closed and has the Heine-Borel property so that $\overline{AS} \subseteq \overline{AV} = AV$ is compact. Thus $A$ is compact.
12.10 Lemma $K(V, W) \subseteq B(V, W)$ is a vector space. If $A \in B(V, W)$, $B \in B(W, Z)$ and at least one of $A, B$ is compact then $BA \in K(V, Z)$. In particular, $K(V) \subseteq B(V)$ is a two-sided ideal.
Proof. Let $A, B \in K(V, W)$, $c, d \in \mathbb{F}$ and let $\{x_n\}$ be a bounded sequence in $V$. Since $A, B$ are compact, we can find a subsequence $\{x_{n_k}\}_{k \in \mathbb{N}}$ such that $Ax_{n_k}$ and $Bx_{n_k}$ converge as $k \to \infty$. Then also $(cA + dB)x_{n_k}$ converges, thus $cA + dB$ is compact. Thus $K(V, W) \subseteq B(V, W)$ is a linear subspace.
Alternative argument: Let $A, B \in K(V)$ and $S \subseteq V$ bounded. Then $\overline{AS}$ and $\overline{BS}$ are compact, and so are $c\overline{AS}$, $d\overline{BS}$ if $c, d \in \mathbb{F}$. Thus also $c\overline{AS} + d\overline{BS}$ is compact (by joint continuity of the map $+ : V \times V \to V$), thus also $\overline{(cA + dB)S} \subseteq \overline{cAS + dBS} = c\overline{AS} + d\overline{BS}$.
Now let $A \in B(V, W)$, $B \in B(W, Z)$ and $S \subseteq V$ bounded. If $A$ is compact then $\overline{AS}$ is compact, so that $B\overline{AS} \supseteq BAS$ is compact by continuity of $B$, and $BAS$ is precompact. If $B$ is compact then boundedness of $A$ implies boundedness of $AS$, so that $BAS$ has compact closure by compactness of $B$. Thus $BA$ is compact in either case.
For the proof of the next result, we need the notion of total boundedness in metric spaces,
see Appendix A.6.4. In particular we will use Exercise A.43(iii).
12.13 Exercise Let $V = \ell^p(S, \mathbb{F})$, where $S$ is an infinite set and $1 \le p < \infty$. If $g \in \ell^\infty(S, \mathbb{F})$ and $f \in \ell^p(S, \mathbb{F})$ then $M_g(f) = gf$ (pointwise product) is in $\ell^p(S)$. This defines a linear map $\ell^\infty(S, \mathbb{F}) \to B(V)$, $g \mapsto M_g$. Prove:
(i) $g \mapsto M_g$ is an algebra homomorphism.
(ii) $\|M_g\| = \|g\|_\infty$.
(iii) $M_g \in K(V)$ if and only if $g \in c_0(S, \mathbb{F})$.
12.14 Remark 1. We now have two classes of compact operators: The (rather commutative)
one of multiplication by c0 -functions, and the operators that are norm-limits of finite rank
operators. Actually, the first class is contained in the second. Why?
2. In view of Exercise 12.13, the closed subspace $K(V) \subseteq B(V)$ is a non-abelian analogue of $c_0 \subseteq \ell^\infty$. If $H$ is a separable Hilbert space, one can use the fact that $c_0$ is not complemented in $\ell^\infty$ to prove that $K(H)$ is not complemented in $B(H)$. See [31].
3. If either $V$ or $W$ is finite-dimensional, we have $F(V, W) = K(V, W) = B(V, W)$. While $K(V) \neq B(V)$ whenever $V$ is infinite-dimensional since $\mathbf{1} \in B(V) \setminus K(V)$, very deep recent (2011) work [5], see also [6, 179], has produced infinite-dimensional Banach spaces $V$ for which the difference between $B(V)$ and $K(V)$ is minimal in the sense of $B(V) = \mathbb{C}\mathbf{1} + K(V)$⁶⁷! It is a much older result that there are $V, W$ both infinite-dimensional with $B(V, W) = K(V, W)$! (Of course then $V \not\simeq W$.) For large classes of examples see Theorems 12.24 and B.30.
4. It is quite natural to ask whether $\overline{F(V, W)} = K(V, W)$. We will later prove $\overline{F(H)} = K(H)$ for each Hilbert space. A Banach space $V$ is said to have the approximation property if $\overline{F(W, V)} = K(W, V)$ holds for every Banach space $W$. (It is known⁶⁸ that $V$ has the approximation property if and only if for every compact set $X \subset V$ and $\varepsilon > 0$ there is a $T \in F(V)$ such that $\|x - Tx\| < \varepsilon$ $\forall x \in X$, cf. e.g. [98, vol. 1]. Whether this is already implied by $\overline{F(V)} = K(V)$ seems to be still open.) Whether all Banach spaces have the approximation property was an open problem until Enflo⁶⁹ in 1973 [47] constructed a counterexample. His construction was quite complicated and his spaces were not very ‘natural’ (in the sense of having a simple definition and/or having been encountered previously). A simpler example, but still tricky and not natural, can be found in [34]. Somewhat later, very natural examples were found: The Banach space $B(H)$ does not have the approximation property whenever $H$ is an infinite-dimensional Hilbert space, cf. [162]. (Note that this is about compact operators on $B(H)$, not compact operators in $B(H)$!) Most of this is well beyond the level of this course, but you should be able to understand [34]. □
None of the above examples of compact operators seems very relevant for applications, even
within mathematics. Indeed the most useful compact operators perhaps are integral operators.
We will briefly look at a class of them in Exercise 12.41. But there are very simple examples:
12.15 Definition Let $V = C([0, C], \mathbb{F})$ for some $C > 0$, equipped with the norm $\|f\| = \sup_{x \in [0, C]} |f(x)|$. As we know, $(V, \|\cdot\|)$ is a Banach space. Define a linear operator, the Volterra$^{70}$
operator, by
\[ A : V \to V, \qquad (Af)(x) = \int_0^x f(t)\,dt. \]
We have $\|Af\| = \sup_x \left| \int_0^x f(t)\,dt \right| \le \int_0^C |f(t)|\,dt \le C\|f\|$, thus $\|A\| \le C < \infty$.
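The estimate $\|Af\| \le C\|f\|$ can be checked numerically. The following is only a sketch under stated assumptions: the integral is approximated by the trapezoidal rule on a uniform grid, and the helper names `volterra` and `sup_norm` are hypothetical, introduced here for illustration. For the constant function $f = 1$ one has $(Af)(x) = x$, so the bound $C$ is attained.

```python
import math


def volterra(f, C, n=1000):
    """Approximate (Af)(x) = integral_0^x f(t) dt on a uniform grid of [0, C].

    Returns the grid points and the trapezoidal approximation of Af on them.
    """
    h = C / n
    xs = [i * h for i in range(n + 1)]
    values = [0.0]
    for i in range(1, n + 1):
        # trapezoidal rule on the subinterval [x_{i-1}, x_i]
        values.append(values[-1] + 0.5 * h * (f(xs[i - 1]) + f(xs[i])))
    return xs, values


def sup_norm(values):
    """Discrete stand-in for the sup norm on C([0, C])."""
    return max(abs(v) for v in values)


C = 2.0
_, A_one = volterra(lambda t: 1.0, C)   # f = 1 has norm 1 and (Af)(x) = x
_, A_cos = volterra(math.cos, C)        # f = cos has norm 1 and (Af)(x) = sin x

ratio_one = sup_norm(A_one)             # approximately C: the bound is attained
ratio_cos = sup_norm(A_cos)             # approximately 1, which is below C
```

The grid size `n` only affects the quadrature error; the inequality itself does not depend on it.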
67
Fittingly, this is not only the most recent, but also by far the most sophisticated result mentioned in these notes.
See [179] for a brief introduction, an excellent introduction to what happened in Banach space theory since the 1970s.
68
This was proven by Alexander Grothendieck (1928-2014), German-born mathematician (later French) who first
made fundamental contributions to functional analysis and then revolutionized algebraic geometry.
69
Per H. Enflo (1944-). Swedish mathematician, working mostly in functional analysis. He also made seminal
contributions to the ‘invariant subspace problem’ by constructing an infinite-dimensional separable Banach space V
and an A ∈ B(V ) such that the only closed subspaces W ⊆ V with AW ⊆ W are W = 0 and W = V .
70
Vito Volterra (1860-1940). Italian mathematician and one of the early pioneers of functional analysis.
71
You should have seen this theorem in Analysis 2 or Topology. See e.g. Appendix A.6.4 or [57, Vol. 2, Theorem
15.5.1]. It has many applications in classical analysis, for example Peano’s existence theorem on differential equations.
Proof. We will prove that $F = AB \subseteq V$ is precompact by showing that it satisfies the hypotheses
of Theorem A.45. If $x \in [0, C]$ and $f \in C([0, C], \mathbb{F})$ with $\|f\| \le 1$ then
\[ |(Af)(x)| = \left| \int_0^x f(t)\,dt \right| \le C\|f\| \le C < \infty, \]
12.17 Theorem (Schauder 1930) Let $V, W$ be Banach spaces and $A \in B(V, W)$. Then
$A^t \in B(W^*, V^*)$ is compact if and only if $A$ is compact.
Proof. For simplicity we assume $V = W$. The general case requires only notational changes. As
in the proof of Proposition 12.11, we use the equivalence of precompactness and total boundedness
from Exercise A.43(iii). If $A \in B(V)$ is compact then $AV_{\le 1} \subseteq V$ is precompact, thus totally
bounded. Thus for every $\varepsilon > 0$ there are $x_1, \ldots, x_n \in V_{\le 1}$ such that $\bigcup_{i=1}^n B(Ax_i, \varepsilon/3) \supseteq AV_{\le 1}$.
Equivalently, for every $x \in V_{\le 1}$ there is an $i$ such that $\|Ax - Ax_i\| < \varepsilon/3$. Now define a bounded
linear map $B : V^* \to \mathbb{F}^n$ (where $\mathbb{F}^n$ has the norm $\|\cdot\|_\infty$) by $B : \varphi \mapsto (\varphi(Ax_1), \ldots, \varphi(Ax_n))$.
Since $\mathbb{F}^n$ is finite-dimensional, $B$ has finite rank, thus is compact, so that $BV^*_{\le 1} \subseteq \mathbb{F}^n$ is
totally bounded. Thus there are $\psi_1, \ldots, \psi_m \in V^*_{\le 1}$ such that for every $\psi \in V^*_{\le 1}$ there is a $j$ with
\[ \max_{1 \le i \le n} |(A^t\psi)(x_i) - (A^t\psi_j)(x_i)| = \max_{1 \le i \le n} |\psi(Ax_i) - \psi_j(Ax_i)| = \|B\psi - B\psi_j\| < \varepsilon/3. \tag{12.1} \]
If now $x \in V_{\le 1}$ then $\|Ax - Ax_i\| < \varepsilon/3$ for some $i$, thus $|(A^t\psi)(x) - (A^t\psi)(x_i)| = |\psi(Ax - Ax_i)| < \varepsilon/3$
for every $\psi \in V^*_{\le 1}$, and (12.1) gives $|(A^t\psi)(x_i) - (A^t\psi_j)(x_i)| < \varepsilon/3$. Thus
Now assume that $A^t \in B(V^*)$ is compact. Then by the above, $A^{tt} \in B(V^{**})$ is compact.
Since $V \subseteq V^{**}$ is a closed subspace that is $A^{tt}$-invariant with $A^{tt}{\upharpoonright}V = A$ by Lemma 9.33,
compactness of $A$ follows from Remark 12.6.3.
12.18 Remark The above self-contained proof, taken from [110], has much in common with
the proof of the Arzelà-Ascoli theorem, and indeed the latter can be used to prove Schauder’s
theorem, as has become the standard proof, cf. e.g. [94]. An alternative proof is sketched in
Remark 12.20 below. Yet another proof uses the circle of ideas in Section 10 (weak topologies
and Alaoglu’s theorem) as in [30, Theorem VI.3.4]. □
The following is an instructive – and useful – characterization of compact operators which
should go some way towards making the notion more intuitive:
12.19 Theorem (H. E. Lacey 1963) Let V, W be Banach spaces and A ∈ B(V, W ). Then
the following are equivalent:
(i) A is compact.
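(ii) For every $\varepsilon > 0$ there exists a closed subspace $Z \subseteq V$ of finite codimension such that
$\|A{\upharpoonright}Z\| \le \varepsilon$.
Proof. (We follow the nice exposition in [121].) (i)⇒(ii) Let $\varepsilon > 0$. Since $AV_{\le 1}$ is precompact,
there are $w_1, \ldots, w_n \in W$ such that $AV_{\le 1} \subseteq \bigcup_{i=1}^n B(w_i, \varepsilon)$. By Proposition 9.9 we can find
$\varphi_1, \ldots, \varphi_n \in W^*$ such that $\|\varphi_i\| = 1$ and $\varphi_i(w_i) = \|w_i\|$ for all $i$. Now
\[ Z = \{v \in V \mid \varphi_1(Av) = \cdots = \varphi_n(Av) = 0\} = \bigcap_{i=1}^n \ker A^t\varphi_i \]
is a closed subspace of $V$ (by continuity of $A$ and the $\varphi_i$). As the intersection of finitely many
spaces of codimension $\le 1$, $Z$ has finite codimension $\le n$. If now $z \in Z_{\le 1} \subseteq V_{\le 1}$, there exists
an $i$ such that $\|Az - w_i\| < \varepsilon$. And with $\varphi_i(Az) = 0$ for all $z \in Z$, $i \in \{1, \ldots, n\}$ we have
\[ \|w_i\| = \varphi_i(w_i) = \varphi_i(w_i - Az) \le \|w_i - Az\| < \varepsilon, \]
thus
\[ \|Az\| \le \|Az - w_i\| + \|w_i\| \le \varepsilon + \varepsilon = 2\varepsilon. \]
Since this holds for all $z \in Z_{\le 1}$, we have $\|A{\upharpoonright}Z\| \le 2\varepsilon$. As $\varepsilon > 0$ was arbitrary, this proves (ii).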
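(ii)⇒(i) If $\varepsilon > 0$, by assumption there is a closed subspace $Z \subseteq V$ of finite codimension such
that $\|A{\upharpoonright}Z\| \le \varepsilon$. By Proposition 6.11 (i), $Z$ is complemented, thus by Exercise 7.15(i), there is
an idempotent $P_Z$ with $P_Z V = Z$. But this is easy to prove directly: We can find unit vectors
$e_1, \ldots, e_n$, where $n = \operatorname{codim} Z$, such that $Y = \operatorname{span}_{\mathbb{F}}\{e_1, \ldots, e_n\}$ is an algebraic complement
for $Z$. Then every $v \in V$ is of the form $v = z + y$ with unique $z \in Z$, $y \in Y$. Now define
$P_Z : z + y \mapsto z$ and $P_Y : z + y \mapsto y$. Since $Y$ is finite-dimensional, $P_Y$ has finite rank, thus is
compact. Thus with $\delta = \min(1, \varepsilon/\|A\|)$, there are $v_1, \ldots, v_m \in V_{\le 1}$ such that
\[ P_Y V_{\le 1} \subseteq \bigcup_{i=1}^m B(P_Y v_i, \delta). \]
If now $v \in V_{\le 1}$ there is an $i$ such that $\|P_Y v - P_Y v_i\| < \delta$. With $P_Z(v - v_i) = (v - v_i) - P_Y(v - v_i)$
we have
\[ \|P_Z v - P_Z v_i\| \le \|v - v_i\| + \|P_Y v - P_Y v_i\| \le 2 + \delta \le 3, \]
so that, again using $P_Z + P_Y = 1$ and $\|A{\upharpoonright}Z\| \le \varepsilon$, we have
\[ \|Av - Av_i\| \le \|AP_Z(v - v_i)\| + \|AP_Y(v - v_i)\| \le 3\varepsilon + \|A\|\,\|P_Y(v - v_i)\| < 3\varepsilon + \|A\|\delta \le 4\varepsilon. \]
Thus $AV_{\le 1} \subseteq \bigcup_{i=1}^m B(Av_i, 4\varepsilon)$. Since $\varepsilon > 0$ was arbitrary, $AV_{\le 1}$ is precompact and therefore $A$
is compact.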
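To make criterion (ii) concrete, here is a hedged illustration with an operator chosen for this purpose (it is not taken from the text above): the diagonal operator $Ae_n = e_n/n$ on $\ell^2(\mathbb{N})$. On the closed subspace $Z_N = \overline{\operatorname{span}}\{e_n \mid n > N\}$, which has codimension $N$, the restriction is diagonal with entries $1/(N+1), 1/(N+2), \ldots$, so that $\|A{\upharpoonright}Z_N\| = 1/(N+1)$. The function names below are hypothetical.

```python
def restricted_norm(N):
    """Norm of A restricted to Z_N = closed span{e_n : n > N} for A e_n = e_n / n.

    The restriction is diagonal with entries 1/(N+1), 1/(N+2), ..., so its
    norm is the largest entry.
    """
    return 1.0 / (N + 1)


def codimension_needed(eps):
    """Smallest N such that the norm of A restricted to Z_N is at most eps."""
    N = 0
    while restricted_norm(N) > eps:
        N += 1
    return N


N_quarter = codimension_needed(0.25)    # 1/4 <= 0.25 first holds for N = 3
N_small = codimension_needed(0.01)
```

The smaller $\varepsilon$ is, the larger the codimension that has to be removed, but it is finite for every $\varepsilon > 0$, as the theorem predicts for a compact operator.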
12.20 Remark Very similarly to the proof of the preceding theorem one can prove [143] that
compactness of A ∈ B(V, W ) is equivalent to the following statement, dual to the above (ii):
(iii) For every $\varepsilon > 0$ there exists a finite-dimensional subspace $Y \subseteq W$ such that $\|QA\| < \varepsilon$,
where $Q$ is the quotient map $W \to W/Y$.
Now one can give an alternative proof [143] of Schauder’s theorem: Assume $A$ is compact,
thus (iii) holds. Thus for every $\varepsilon > 0$, there is a finite-dimensional subspace $Y \subseteq W$ such that
$\|QA\| < \varepsilon$, where $Q : W \to W/Y$. Put $Z = Y^\perp \subseteq W^*$, which has finite codimension. Now
$A^t{\upharpoonright}Z = (QA)^t$ (identifying $(W/Y)^*$ with $Y^\perp$), so that $\|A^t{\upharpoonright}Z\| = \|(QA)^t\| = \|QA\| < \varepsilon$. Thus $A^t$ satisfies (ii) in Theorem
12.19 and therefore is compact. The converse implication is proven as in Theorem 12.17. (But
ultimately this is the same proof, the finite-(co)dimensional subspaces $Y$ playing a role similar
to that of the auxiliary $\mathbb{F}^n$ in the first proof.) □
12.22 Exercise Let X, Y, Z be Banach spaces, T ∈ B(X, Y ) compact and S ∈ B(Y, Z) injec-
tive. Prove (possibly by contradiction) that for every ε > 0 there is a Cε ≥ 0 such that
(Note: If ε ≥ kT k then we can put Cε = 0. The point is that 0 < ε < kT k is allowed!)
12.23 Exercise If $V, W$ are Banach spaces, $A \in B(V, W)$ is called strictly singular if there is
no infinite-dimensional subspace $Z \subseteq V$ such that $A{\upharpoonright}Z$ is bounded below. Prove:
(i) A ∈ B(V, W ) is strictly singular if and only if there is no infinite-dimensional closed
subspace Z ⊆ V such that A : Z → AZ is an isomorphism.
(ii) Every compact operator is strictly singular.
In Remark B.33 we will see that there are non-compact strictly singular operators!
12.24 Theorem If $V$ is a reflexive Banach space then all bounded linear operators $V \to \ell^1(\mathbb{N}, \mathbb{F})$
and $c_0(\mathbb{N}, \mathbb{F}) \to V$ are compact.
Proof. Let $A \in B(V, \ell^1(\mathbb{N}, \mathbb{F}))$. By Exercise 10.14(iii), $V_{\le 1}$ with the weak topology is sequentially
compact. Thus every norm-bounded sequence $\{x_n\} \subset V$ has a weakly convergent subsequence
$\{x_{n_i}\}$. Since $A$ is weak-weak continuous by Exercise 10.9, $\{Ax_{n_i}\} \subset \ell^1(\mathbb{N}, \mathbb{F})$ converges weakly.
Now Schur’s Theorem 10.4 implies that $\{Ax_{n_i}\} \subset \ell^1(\mathbb{N}, \mathbb{F})$ converges in norm. Thus $\{Ax_n\}$ has
a norm-convergent subsequence, so that $A$ is compact.
Let $A \in B(c_0, V)$, thus $A^t \in B(V^*, c_0^*)$. Since $V^*$ is reflexive and $c_0^* \cong \ell^1$, the above gives
that $A^t$ is compact, thus also $A$ by Schauder’s Theorem 12.17.
Remarkably, Theorem 12.24 requires no information about $A$ other than its boundedness.
But the proof relies on the rather exceptional Schur property of $\ell^1$. In view of the proof it is
clear that $B(V, W) = K(V, W)$ whenever $V$ is reflexive and $W$ has the Schur property. (See
[90] for further generalizations.)
To obtain results of wider applicability, it will be useful to characterize compactness of an
operator in terms of properties slightly weaker than weak-norm continuity. We will consider
three such properties of an operator, beginning with:
12.25 Proposition Let $V, W$ be Banach spaces and $A \in B(V, W)$. Then $A$ is compact if and
only if the restriction $A : V_{\le 1} \to W$ is weak-norm continuous.
Proof. Assume $A$ is compact. Let $\{x_\iota\}$ be a net in the closed unit ball of $V$ that converges weakly to
zero. Let $\varepsilon > 0$. By Theorem 12.19 there exists a closed subspace $Z \subseteq V$ of finite codimension
such that $\|A{\upharpoonright}Z\| \le \varepsilon$. Since $Z$ is complemented, there is a finite rank idempotent $P \in B(V)$
such that $(1 - P)V = Z$. Since $AP$ has finite rank, it is weak-norm continuous by Exercise
10.10, thus $\|APx_\iota\| \to 0$. Now with $\|A(1 - P)\| \le \varepsilon$ we have
\[ \|Ax_\iota\| \le \|APx_\iota\| + \|A(1 - P)x_\iota\| \le \|APx_\iota\| + \varepsilon. \]
Since $\|APx_\iota\| \to 0$ and $\varepsilon > 0$ was arbitrary, we have $\|Ax_\iota\| \to 0$. Thus $A$ is weak-norm
continuous.
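Now assume that $A{\upharpoonright}V_{\le 1}$ is weak-norm continuous, and let $\varepsilon > 0$. By the weak-norm
continuity, $V_{\le 1} \cap A^{-1}(B^W(0, \varepsilon)) \subseteq V_{\le 1}$ is a weakly open neighborhood of $0 \in V$. Thus there
exist $\varphi_1, \ldots, \varphi_n \in V^*$ such that
\[ \{x \in V_{\le 1} \mid |\varphi_i(x)| < 1 \ \forall i\} \subseteq V_{\le 1} \cap A^{-1}(B^W(0, \varepsilon)). \tag{12.2} \]
Now $Z = \bigcap_{i=1}^n \ker \varphi_i \subseteq V$ is a closed subspace of codimension $\le n$, and for every $z \in Z_{\le 1}$ we
have $\varphi_i(z) = 0 \ \forall i$, and therefore $\|Az\| < \varepsilon$ by (12.2). Thus for all $v \in Z$ we have $\|Av\| \le \varepsilon\|v\|$,
so that we have $\|A{\upharpoonright}Z\| \le \varepsilon$. Now $A$ is compact by Theorem 12.19.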
12.26 Remark 1. An alternative proof of the implication ⇒ that uses the compactness of $A$
directly goes like this: Let $\{x_\iota\}_{\iota \in I}$ be a net in $V_{\le 1}$ that converges weakly to zero. Since the
norm-continuous $A$ is weak-weak continuous, $Ax_\iota \xrightarrow{w} 0$. If $\|Ax_\iota\| \not\to 0$ then there exists an $\varepsilon > 0$
such that for every $\iota \in I$ there is a $\iota' \ge \iota$ such that $\|Ax_{\iota'}\| \ge \varepsilon$. Using this we can construct
a subnet $\{x_\sigma\}_{\sigma \in \Sigma}$ of $\{x_\iota\}$ such that $\|Ax_\sigma\| \ge \varepsilon$ for all $\sigma \in \Sigma$. Since $\overline{AV_{\le 1}}$ is compact, the net
$\{Ax_\sigma\}$ has an accumulation point $w \in W$, which clearly satisfies $\|w\| \ge \varepsilon$. Since $w \ne 0$ also is a
weak accumulation point of $\{Ax_\sigma\}$, we have a contradiction with $Ax_\iota \xrightarrow{w} 0$. Thus $\|Ax_\iota\| \to 0$.
This argument is just the obvious adaptation of the proof of Theorem 12.28(ii) to nets. But it
is less elementary than the one above in that it uses accumulation points and subnets of nets.
2. By general topology, cf. e.g. [142, 108], continuity of $A : (V_{\le 1}, \tau_w) \to (W, \|\cdot\|)$ is equivalent
to the statement that the net $\{Ax_\iota\} \subset W$ is norm-convergent for every weakly convergent
bounded net $\{x_\iota\}_{\iota \in I}$ in $V$. But either formulation is hard to verify, and we would prefer a
criterion involving only sequences. □
12.28 Theorem Let $V, W$ be Banach spaces and $A : V \to W$ linear. Then
(i) If $A$ is sequentially weak-norm continuous, it is bounded.
(ii) If $A$ is compact then it is sequentially weak-norm continuous.
(iii) Every bounded linear map from $\ell^1$ to an arbitrary Banach space is sequentially weak-norm
continuous, yet $1 : \ell^1 \to \ell^1$ is non-compact.
(iv) Let $V, W$ be Banach spaces, where $V$ is reflexive or $V^*$ is separable. Then every sequentially
weak-norm continuous $A : V \to W$ is compact.
Proof. (i) If $A$ is unbounded, we can find $\{x_n\} \subset V$ with $\|x_n\| = 1$ such that $\|Ax_n\| \ge n^2$. Now
$y_n = x_n/n \to 0$ in norm, thus weakly, while $Ay_n$ does not converge in norm since $\|Ay_n\| \ge n$. This
is a contradiction.
(ii) It suffices to prove that $\|Ax_n\| \to 0$ whenever $\{x_n\}$ converges weakly to zero. By Exercise
10.6, $\{x_n\}$ is bounded. Since $A$ is weak-weak continuous, we have $Ax_n \xrightarrow{w} 0$. If $\|Ax_n\| \not\to 0$ then
there exist $\varepsilon > 0$ and a subsequence $\{y_i\} = \{x_{n_i}\}$ such that $\|Ay_i\| \ge \varepsilon$ for all $i$. By compactness
of $A$ there is a second subsequence $z_i = y_{m_i}$ such that $Az_i$ converges in norm. The limit is non-zero
since $\|Az_i\| \ge \varepsilon$ for all $i$. Thus $Az_i$ converges to a non-zero limit also weakly, contradicting
$Ax_n \xrightarrow{w} 0$. Thus $\|Ax_n\| \to 0$.
(iii) If $\{x_n\} \subset \ell^1$ is a weakly convergent sequence, it is norm-convergent by Schur’s Theorem
10.4, thus $\{Ax_n\}$ is norm-convergent for any bounded $A : \ell^1 \to V$. Thus $A$ is sequentially weak-norm
continuous. The second claim follows from the infinite-dimensionality of $\ell^1$.
(iv) If V is reflexive then every bounded sequence {xn } ⊂ V has a weakly convergent
subsequence by Exercise 10.14(iii). Since A maps the latter to a norm convergent sequence in
W , {Axn } has a norm-convergent subsequence, thus A is compact.
If V ∗ is separable then every bounded sequence in V has a weakly Cauchy subsequence by
Exercise 10.14(i). Now compactness of A follows as in the preceding case if we use the following
lemma.
12.29 Lemma A sequentially weak-norm continuous operator between Banach spaces maps
weakly Cauchy sequences to norm convergent ones.
Proof. Let $\{x_n\} \subset V$ be weakly Cauchy and let $\{i_n\}, \{j_n\}$ be increasing sequences in $\mathbb{N}$. Then for each $\varphi \in V^*$ the sequences $\{\varphi(x_{i_n})\}$
and $\{\varphi(x_{j_n})\}$ have the same limit $\lim_{n \to \infty} \varphi(x_n)$. Thus $\{x_{i_n} - x_{j_n}\}$ is a weakly null sequence,
so that $\|A(x_{i_n} - x_{j_n})\| \to 0$ by the sequential weak-norm continuity of $A$. Now the following
exercise gives that $\{Ax_n\}$ is Cauchy w.r.t. $\|\cdot\|$ and therefore norm-convergent.
12.30 Exercise Let (X, d) be a metric space and {xn } a sequence in X such that d(xin , xjn ) →
0 for all strictly increasing sequences {in }, {jn } of natural numbers. Prove that {xn } is Cauchy.
12.32 Definition If V, W are Banach spaces then a linear A : V → W is completely continuous
if weak compactness of X ⊂ V implies norm-compactness of AX ⊂ W .
72
In any topological space, convergence xn → x implies compactness of {xn | n ∈ N} ∪ {x}.
We close by mentioning another property involving weak topologies that an operator may
have or not: A ∈ B(V, W ) is called weakly compact if it maps bounded sets of V to subsets of
W that are precompact in the weak topology. See e.g. [102, Section 3.5].
12.37 Theorem Let $H$ be a Hilbert space and $A \in B(H)$. Then the following are equivalent:
(α) For every $\varepsilon > 0$ there is an orthogonal projection $P$ such that $\|PAP\| < \varepsilon$ and $PH \subseteq H$
has finite codimension.
(β) $A$ is compact.
(γ) For every orthonormal sequence $\{e_n\}_{n \in \mathbb{N}} \subseteq H$ we have $Ae_n \to 0$.
(δ) For every orthonormal sequence $\{e_n\}_{n \in \mathbb{N}} \subseteq H$ we have $\langle Ae_n, e_n \rangle \to 0$.
The implication (γ) ⇒ (δ) follows trivially from Cauchy-Schwarz. As to the rest:
12.4 ⋆ Hilbert-Schmidt operators: $L^2(H)$
If $E$ is an ONB for a Hilbert space $H$ and $A \in B(H)$, we have seen in Lemma 11.51 that $\mathrm{Tr}_E(A^*A)$
does not depend on $E$, so that we can write $\mathrm{Tr}(A^*A)$, and that $\mathrm{Tr}(A^*A) = \mathrm{Tr}(AA^*)$.
\[ \|A\|_2 = (\mathrm{Tr}(A^*A))^{1/2} \in [0, \infty], \qquad L^2(H) = \{A \in B(H) \mid \|A\|_2 < \infty\}. \]
is absolutely convergent and independent of the ONB $E$. Now $\langle \cdot, \cdot \rangle_{HS}$ is an inner product
on $L^2(H)$ such that $\langle A, A \rangle_{HS} = \|A\|_2^2$. And $(L^2(H), \langle \cdot, \cdot \rangle_{HS})$ is complete, thus a Hilbert
space.
(iii) For all $A, B \in B(H)$ we have $\|AB\|_2 \le \|A\|\,\|B\|_2$ and $\|AB\|_2 \le \|A\|_2\,\|B\|$. Thus $L^2(H) \subseteq B(H)$
is a two-sided ideal.
(iv) We have $F(H) \subseteq L^2(H) \subseteq K(H)$ and $\overline{F(H)}^{\|\cdot\|_2} = L^2(H)$.
Proof. (i) If $x \in H$ is a unit vector, pick an ONB $E$ containing $x$. Then $\|Ax\|^2 = \langle A^*Ax, x \rangle \le \mathrm{Tr}_E(A^*A) = \|A\|_2^2$. Thus $\|Ax\| \le \|A\|_2$ whenever $\|x\| = 1$, proving the inequality. And
$\|A^*\|_2^2 = \mathrm{Tr}(AA^*) = \mathrm{Tr}(A^*A) = \|A\|_2^2$.
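In finite dimensions these statements can be checked by hand or by machine. The sketch below assumes $H = \mathbb{R}^2$ with a sample matrix chosen only for illustration; `hs_norm_sq` is a hypothetical helper computing $\sum_{e \in E} \|Ae\|^2 = \mathrm{Tr}_E(A^*A)$. It compares the standard basis with a rotated orthonormal basis and checks $\|Ax\| \le \|A\|_2$ for one unit vector.

```python
import math


def apply(A, x):
    """Matrix-vector product for a real matrix given as a list of rows."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]


def hs_norm_sq(A, basis):
    """Sum of ||A e||^2 over the orthonormal basis vectors e, i.e. Tr_E(A*A)."""
    return sum(sum(c * c for c in apply(A, e)) for e in basis)


A = [[1.0, 2.0], [0.0, -1.0]]
standard = [[1.0, 0.0], [0.0, 1.0]]
t = 0.7  # arbitrary rotation angle
rotated = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

hs_standard = hs_norm_sq(A, standard)   # 1 + 0 + 4 + 1 = 6
hs_rotated = hs_norm_sq(A, rotated)     # agrees up to rounding

x = rotated[0]                          # a unit vector
Ax_norm_sq = sum(c * c for c in apply(A, x))
```

In this finite-dimensional setting $\|A\|_2$ is the Frobenius norm of the matrix, and the independence of the basis is the invariance of the trace under orthogonal changes of basis.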
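(ii) If $E$ is any ONB (whose choice does not matter) for $H$, we have (as before)
\[ \mathrm{Tr}(A^*A) = \sum_{e \in E} \langle A^*Ae, e \rangle = \sum_{e \in E} \langle Ae, Ae \rangle = \sum_{e \in E} \|Ae\|^2 = \sum_{e \in E} \sum_{e' \in E} |\langle Ae, e' \rangle|^2. \]
Thus $L^2(H)$ is the set of $A \in B(H)$ for which the matrix elements $\langle Ae, e' \rangle$ (w.r.t. the ONB $E$)
are absolutely square summable. We therefore have a map
\[ \alpha : L^2(H) \to \ell^2(E \times E), \qquad A \mapsto \{\langle Ae, e' \rangle\}_{(e, e') \in E \times E} \]
that clearly is injective. (Recall that $\ell^2(S) = L^2(S, \mu)$, where $\mu$ is the counting measure.) To
show surjectivity of $\alpha$, let $f = \{f_{ee'}\} \in \ell^2(E \times E)$. Define a linear operator $A : H \to H$ by
$A : e \mapsto \sum_{e'} f_{ee'} e'$. For each $e$, the r.h.s. is in $H$ by square summability of $f$. If $x \in H$ then
\[ \|Ax\|^2 = \sup_{E'} \Big\| \sum_e \langle x, e \rangle \sum_{e' \in E'} f_{ee'} e' \Big\|^2 = \sup_{E'} \Big\| \sum_{e' \in E'} \Big( \sum_e \langle x, e \rangle f_{ee'} \Big) e' \Big\|^2 \]
\[ = \sup_{E'} \sum_{e' \in E'} \Big| \sum_e \langle x, e \rangle f_{ee'} \Big|^2 \le \|x\|^2 \sum_{e, e'} |f_{ee'}|^2, \]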
73
Erhard Schmidt (1876-1959). Baltic German mathematician, contributions to functional analysis like Gram-
Schmidt orthogonalization.
where the supremum is over the finite subsets $E' \subseteq E$, we used the Cauchy-Schwarz inequality together with $\sum_e |\langle x, e \rangle|^2 = \|x\|^2$, and the change
of summation order is allowed due to the finiteness of $E'$. This computation shows that $\|A\| \le (\sum_{e, e'} |f_{ee'}|^2)^{1/2} < \infty$. Thus $A \in B(H)$ and $\alpha(A) = f$, so that $\alpha$ is surjective. Thus $\alpha : L^2(H) \to \ell^2(E \times E)$
is a linear bijection. Now $\ell^2(E \times E)$ is a Hilbert space (in particular complete) with
inner product $(f, g) = \sum_{e, e'} f_{ee'} \overline{g_{ee'}}$, and pulling this inner product back to $L^2(H)$ along $\alpha$ we
have
\[ \langle A, B \rangle_{HS} = \sum_{(e, e') \in E^2} \langle Ae, e' \rangle \overline{\langle Be, e' \rangle} = \sum_{(e, e') \in E^2} \langle Ae, e' \rangle \langle e', Be \rangle = \sum_e \langle Ae, Be \rangle = \sum_e \langle B^*Ae, e \rangle = \mathrm{Tr}(B^*A), \]
where all sums converge absolutely. Lemma 11.51 gives that $\langle A, A \rangle_{HS} = \mathrm{Tr}(A^*A)$ is independent
of the chosen ONB, and for general $(A, B)$ this follows by the polarization identity.
From the above it is clear that $(L^2(H), \langle \cdot, \cdot \rangle_{HS})$ is isomorphic to the Hilbert space $(\ell^2(E \times E), \langle \cdot, \cdot \rangle)$,
thus a Hilbert space. And the norm associated to $\langle \cdot, \cdot \rangle_{HS}$ is nothing other than $\|\cdot\|_2$.
(iii) For any ONB $E$ we have
\[ \|AB\|_2^2 = \mathrm{Tr}(B^*A^*AB) = \sum_{e \in E} \|ABe\|^2 \le \|A\|^2 \sum_{e \in E} \|Be\|^2 = \|A\|^2\, \mathrm{Tr}(B^*B) = \|A\|^2 \|B\|_2^2, \]
This implies $\|A - A_F\|_2 \to 0$ as $F \nearrow E$, so that $L^2(H) = \overline{F(H)}^{\|\cdot\|_2}$.
Finally, by (i) we have $\|A - A_F\| \le \|A - A_F\|_2 \to 0$, thus $A \in \overline{F(H)}^{\|\cdot\|} \subseteq K(H)$, where we
used Corollary 12.12. This proves $L^2(H) \subseteq K(H)$.
12.41 Example ($L^2$-Integral operators) Let $(X, \mathcal{A}, \mu)$ be a measure space, and put $H = L^2(X, \mathcal{A}, \mu)$.
Let $k : X \times X \to \mathbb{C}$ be measurable (w.r.t. the product σ-algebra $\mathcal{A} \times \mathcal{A}$) and
assume $\int\!\!\int |k(x, y)|^2\, d\mu(x)\, d\mu(y) < \infty$. (Thus $k \in L^2(X \times X, \mathcal{A} \times \mathcal{A}, \mu \times \mu)$.) Then
\[ (Kf)(x) = \int_X k(x, y) f(y)\, d\mu(y) \]
defines a linear operator $K : H \to H$ whose Hilbert-Schmidt norm $\|K\|_2$ coincides with the
norm $\|k\|_{L^2}$ of $k \in L^2(X \times X)$. Thus $K$ is Hilbert-Schmidt, and in particular compact.
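As a sanity check (not a proof), one can test the claimed equality of norms on a finite measure space. The sketch below assumes a three-point space with arbitrarily chosen weights $\mu(\{i\})$ and a real kernel; all numbers and the names `apply_K` and `inner` are illustrative assumptions. It computes $\mathrm{Tr}(K^*K) = \sum_j \|Ke_j\|^2$ in the orthonormal basis $e_j = \delta_j/\sqrt{\mu(\{j\})}$ and compares it with $\|k\|_{L^2}^2$.

```python
import math

w = [0.5, 1.5, 2.0]                      # weights mu({i}) of a 3-point measure space
k = [[1.0, -2.0, 0.5],
     [0.0, 3.0, 1.0],
     [2.0, 1.0, -1.0]]                   # kernel values k(x, y)
n = len(w)


def apply_K(f):
    """(Kf)(x) = sum_y k(x, y) f(y) mu({y}), the integral against mu."""
    return [sum(k[x][y] * f[y] * w[y] for y in range(n)) for x in range(n)]


def inner(u, v):
    """Inner product of L^2(X, mu) for real-valued functions."""
    return sum(u[x] * v[x] * w[x] for x in range(n))


# Orthonormal basis e_j = delta_j / sqrt(mu({j})).
basis = [[(1.0 / math.sqrt(w[j]) if x == j else 0.0) for x in range(n)]
         for j in range(n)]

hs_sq = sum(inner(apply_K(e), apply_K(e)) for e in basis)      # Tr(K*K)
kernel_sq = sum(k[x][y] ** 2 * w[x] * w[y]
                for x in range(n) for y in range(n))           # ||k||^2 in L^2(mu x mu)
```

Both quantities agree up to rounding, which is the finite analogue of $\|K\|_2 = \|k\|_{L^2}$.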
12.42 Exercise Prove the equality $\|K\|_2 = \|k\|_{L^2}$ of norms claimed in the above example.
If $V, W$ are vector spaces over any field $K$ then there is a canonical linear map $W \otimes_K V^* \to \mathrm{Hom}_K(V, W)$
sending $w \otimes_K \varphi$ to the linear map $v \mapsto w\varphi(v)$. (Here $V^*$ is the algebraic dual space
and $\otimes_K$ is the algebraic tensor product.) If $V$ or $W$ is finite-dimensional, this map is a bijection,
but otherwise it is not. For Hilbert spaces, one has a statement that works irrespective of the
dimensions:
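12.43 Exercise Let $H$ be a Hilbert space.
(i) Define $\overline{H}$ to be the vector space $H$ with scalar action $(c, x) \mapsto \overline{c}x$ and inner product
$\langle x, y \rangle_{\overline{H}} = \overline{\langle x, y \rangle}$. Prove that $\overline{H}$ is a Hilbert space.
(ii) Define a map $\alpha : H \otimes_{\mathrm{alg}} \overline{H} \to F(H)$ by associating to $\sum_{i=1}^n x_i \otimes y_i$ the operator $z \mapsto \sum_{i=1}^n x_i \langle z, y_i \rangle$.
Prove that $\alpha$ is linear and extends to an isometric bijection $\alpha : H \otimes \overline{H} \to L^2(H)$.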
12.44 Remark See Appendix B.11 for the definition of $L^p(H)$ for all $p \in [1, \infty)$ and a more
complete discussion of $L^1(H)$. □
12.45 Exercise Given a set $S$ and $g \in \ell^\infty(S, \mathbb{F})$, define $H = \ell^2(S, \mathbb{C})$ and the multiplication
operator $M_g : H \to H$, $f \mapsto gf$ (known from Exercise 12.13, where we saw $M_g \in K(H) \Leftrightarrow g \in c_0(S)$).
Prove $|M_g| = M_{|g|}$ and $\|M_g\|_p = \|g\|_p$ for all $p \in [1, \infty)$. (Thus $M_g \in L^p(H) \Leftrightarrow g \in \ell^p(S, \mathbb{C})$.)
13.2 Definition Let $V = \ell^p(\mathbb{N}, \mathbb{F})$, where $1 \le p \le \infty$. Define $L, R \in B(V)$ by
\[ (Lf)(n) = f(n + 1), \qquad (Rf)(n) = \begin{cases} 0 & \text{if } n = 1 \\ f(n - 1) & \text{if } n \ge 2 \end{cases} \]
Equivalently: $R\delta_n = \delta_{n+1}$, $L\delta_1 = 0$, $L\delta_n = \delta_{n-1}$ if $n \ge 2$, which is why we call $L, R$ the left
and right, respectively, shift operators on $V$.
It is immediate that $R$ is injective, but not surjective (since $(Rf)(1) = 0 \ \forall f \in V$) while $L$ is
surjective, but not injective (since $Lf$ does not depend on $f(1)$). One easily checks $LR = \mathrm{id}_V$,
while $RL \ne \mathrm{id}_V$ since $RL = P_2$ (notation from Exercise 8.12).
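The relations $LR = \mathrm{id}_V$ and $RL \ne \mathrm{id}_V$ are easy to see on finitely supported sequences. In the following minimal sketch a Python list `[f(1), f(2), ...]` stands for a sequence that is zero beyond the listed entries; this representation is an assumption made for illustration only.

```python
def L(f):
    """Left shift: (Lf)(n) = f(n+1); the first entry is dropped."""
    return f[1:]


def R(f):
    """Right shift: (Rf)(1) = 0 and (Rf)(n) = f(n-1) for n >= 2."""
    return [0] + f


f = [3, 1, 4, 1, 5]        # stands for the sequence (3, 1, 4, 1, 5, 0, 0, ...)

LR_f = L(R(f))             # equals f, illustrating LR = id
RL_f = R(L(f))             # first entry replaced by 0, so RL != id
```

Here `RL_f` differs from `f` exactly in the first entry, matching the observation that $Lf$ does not depend on $f(1)$.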
13.3 Exercise Consider the shift operators $L, R$ on $\ell^p(\mathbb{N}, \mathbb{C})$, $p \in [1, \infty]$.
(i) Prove that $R$ is an isometry.
(ii) In the Hilbert space case $p = 2$, prove $L^* = R$, $R^* = L$. Conclude that $L$ is a coisometry.
Passing to an infinite-dimensional Banach space V , there can be A ∈ B(V ) and λ ∈ F for
which A − λ1 is injective, but not surjective, thus not invertible. Such λ are not eigenvalues,
but they turn out to be equally important as the former. This motivates the following:
• One reason for distinguishing continuous and residual spectrum is the (not quite perfect)
duality between σp and σr , cf. Exercise 13.9(ii)-(iii). Another is that σr often is empty, cf.
e.g. Exercise 13.13.
• The continuous spectrum need not be ‘continuous’ in the sense of connected. But one
can prove, cf. Exercise 13.68, that every isolated λ ∈ σ(A) is an eigenvalue if either A is
a normal operator on Hilbert space (Proposition 17.24(ii)) or some additional condition
is satisfied, cf. Section B.10. This perhaps goes some way towards explaining the term
‘continuous’.
• In Section B.10 we will define the discrete spectrum σd (A), a certain subset of the point
spectrum σp (A), as well as several closely related essential spectra of A.
• Later we will prove that σ(A) is always closed, while we will see in examples that σp , σc , σr
need not be closed.
Proposition 7.41(iii) suggests defining another two interesting subsets of σ(A):
13.6 Exercise Let $V$ be a Banach space and $A, B \in B(V)$ with $B$ invertible. Prove $\sigma(BAB^{-1}) = \sigma(A)$
and $\sigma_x(BAB^{-1}) = \sigma_x(A)$ for all $x \in \{p, c, r, ap, cp\}$.
13.7 Exercise Compute $\sigma_p(L)$ and $\sigma_p(R)$ for the shift operators $L, R$ on $\ell^p(\mathbb{N}, \mathbb{C})$ for all $p \in [1, \infty]$.
(Of course, the $p$ in $\sigma_p$ has nothing to do with the $p$ in $\ell^p$.)
13.8 Exercise Let $V, W$ be Banach spaces and $A \in B(V)$, $B \in B(W)$. Define $C \in B(V \oplus W)$
by $C : (x, y) \mapsto (Ax, By)$. (Thus $C = A \oplus B$.)
(i) Prove $\sigma(C) = \sigma(A) \cup \sigma(B)$ and $\sigma_p(C) = \sigma_p(A) \cup \sigma_p(B)$.
(ii) Compute $\sigma_c(C)$ and $\sigma_r(C)$. Warning: These are not simply unions as in (i)!
For operators on a Hilbert space, we can study how the spectra of A and A∗ are related:
13.9 Exercise Let H be a Hilbert space and A ∈ B(H). Use Lemma 11.10 to prove:
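(i) $\sigma(A^*) = \sigma(A)^* := \{\overline{\lambda} \mid \lambda \in \sigma(A)\}$.$^{75}$
(ii) If $\lambda \in \sigma_r(A)$ then $\overline{\lambda} \in \sigma_p(A^*)$.
(iii) If $\lambda \in \sigma_p(A)$ then $\overline{\lambda} \in \sigma_p(A^*) \cup \sigma_r(A^*)$.
(iv) $\sigma_{cp}(A) = \sigma_p(A^*)^*$.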
13.10 Remark Using Exercise 9.35 instead of Lemma 11.10, one has analogous results for the
transpose $A^t \in B(V^*)$ of a Banach space operator $A \in B(V)$: $\sigma(A^t) = \sigma(A)$, $\sigma_r(A) \subseteq \sigma_p(A^t)$
and $\sigma_p(A) \subseteq \sigma_p(A^t) \cup \sigma_r(A^t)$. (There is no complex conjugation since $A \mapsto A^t$ is linear.) □
Normal operators have very nice spectral properties, foreshadowing the spectral theorem:
13.12 Exercise Let $H$ be a Hilbert space and $A \in B(H)$ normal. Let $x, x'$ be (non-zero)
eigenvectors for the eigenvalues $\lambda, \lambda' \in \sigma_p(A)$, respectively. Prove:
(i) $A^*x = \overline{\lambda}x$, thus $x$ is an eigenvector for $A^*$ with eigenvalue $\overline{\lambda}$.
(ii) $\sigma_p(A^*) = \sigma_p(A)^*$.
(iii) If $\lambda \ne \lambda'$ then $x \perp x'$.$^{76}$
(iv) Give an example of a non-normal $A \in B(H)$ for which (iii) fails.
13.13 Exercise (Spectra of normal operators) Let H be a Hilbert space and A ∈ B(H)
normal. Prove:
(i) σr (A) = ∅. (No residual spectrum)
(ii) σc (A∗ ) = σc (A)∗ .
(iii) σ(A) = σap (A) (cf. Definition 13.5).
Keep in mind that self-adjoint operators are normal, so that the above results apply. But
for non-normal operators we cannot expect such nice results to hold!
13.14 Exercise Let H be a Hilbert space and A ∈ B(H). Use Exercise 13.13(iii) to prove:
(i) If A is self-adjoint then σ(A) ⊆ R.
(ii) If A is unitary then σ(A) ⊆ S 1 .
13.15 Exercise Let $H$ be a Hilbert space and $A \in B(H)$. Recalling Definition 11.32, prove
(i) $\sigma_p(A) \subseteq W(A)$.
(ii) $\sigma_r(A) \subseteq W(A)$.
(iii) $\sigma_{ap}(A) \subseteq \overline{W(A)}$.
(iv) Conclude that $\sigma(A) \subseteq \overline{W(A)}$.
75
If $S \subseteq \mathbb{C}$ we write $S^*$ for $\{\overline{s} \mid s \in S\}$ since $\overline{S}$ could be confused with the closure.
76
You may have seen this before, but probably only for self-adjoint operators.
13.2 The spectrum in a unital Banach algebra
Since B(E) is a unital Banach algebra for every Banach space E, all results proven in the rest
of this section in particular apply to bounded operators on Banach spaces. Restricting these
results to B(E) would not simplify their proofs significantly. Whether F is R or C does not
matter until Section 13.2.3.
13.17 Lemma Let $A$ be a unital normed algebra. Then $\mathrm{Inv}A$ is a topological group (w.r.t. the
norm topology).
Proof. Since multiplication $A \times A \to A$ is jointly continuous (Remark 3.29), the same holds
for its restriction to $\mathrm{Inv}A$. It remains to show that the inverse map $\mathrm{Inv}A \to \mathrm{Inv}A$, $a \mapsto a^{-1}$ is
continuous. To this purpose, let $a, a + h \in \mathrm{Inv}A$ and define $k$ by $(a + h)^{-1} = a^{-1} + k$. Then
$1 = (a^{-1} + k)(a + h) = 1 + a^{-1}h + ka + kh$, thus $a^{-1}h + ka + kh = 0$. Multiplying this on
the right by $a^{-1}$ we have $a^{-1}ha^{-1} + k + kha^{-1} = 0$, thus $k = -a^{-1}ha^{-1} - kha^{-1}$. Therefore
$\|k\| \le \|a^{-1}\|^2\|h\| + \|k\|\|h\|\|a^{-1}\|$, which is equivalent to $\|k\|(1 - \|h\|\|a^{-1}\|) \le \|a^{-1}\|^2\|h\|$.
Assuming $\|h\| < \|a^{-1}\|^{-1}$, the expression in brackets is positive, so that
\[ \|k\| \le \frac{\|a^{-1}\|^2}{1 - \|h\|\|a^{-1}\|}\, \|h\|. \]
It follows that $\|h\| \to 0$ implies $\|k\| \to 0$, so that $a + h \to a$ in $\mathrm{Inv}A$ entails $(a + h)^{-1} \to a^{-1}$.
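The bound on $\|k\|$ can be tested in a concrete normed algebra. The sketch below assumes $A = M_2(\mathbb{R})$ with the maximum absolute row sum norm (a submultiplicative norm with $\|1\| = 1$); the matrices `a` and `h` are arbitrary sample values, and the helpers are written out by hand so that the block needs no external library.

```python
def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]


def inv(x):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]


def norm(x):
    """Maximum absolute row sum: submultiplicative, with norm(identity) = 1."""
    return max(abs(row[0]) + abs(row[1]) for row in x)


a = [[2.0, 1.0], [0.0, 1.0]]
h = [[0.1, -0.05], [0.02, 0.1]]          # small perturbation: ||h|| < 1 / ||a^{-1}||

a_inv = inv(a)
k = sub(inv(add(a, h)), a_inv)           # k is defined by (a + h)^{-1} = a^{-1} + k
bound = norm(a_inv) ** 2 * norm(h) / (1 - norm(h) * norm(a_inv))
```

For these sample values $\|k\|$ is roughly $0.105$ while the bound is roughly $0.176$, so the estimate holds with some room to spare.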
So far, we know very little about $\mathrm{Inv}A$. We might, in principle, have $\mathrm{Inv}A = \mathbb{F}^*1$, where
$\mathbb{F}^* = \mathbb{F} \setminus \{0\}$. (This is indeed the case for the algebra $A = \mathbb{F}[x]$ which is normed, but not Banach,
with $\|P\| = \sup_{x \in [0, 1]} |P(x)|$.) The next results provide invertible elements other than multiples
of 1.
(iii) If $a \in \mathrm{Inv}A$ and $a' \in A$ with $\|a - a'\| < \|a^{-1}\|^{-1}$ then
$\|1 - a^{-1}a'\| = \|a^{-1}(a - a')\| \le \|a^{-1}\|\|a - a'\| < 1$ so that $a^{-1}a' = 1 - (1 - a^{-1}a') \in \mathrm{Inv}A$ by
(ii), thus $a' = a(a^{-1}a') \in \mathrm{Inv}A$. This proves that $\mathrm{Inv}A$ is open.
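\[ \sigma(a) = \{\lambda \in \mathbb{F} \mid a - \lambda 1 \notin \mathrm{Inv}A\}. \]
The spectral radius of $a$ is $r(a) = \sup\{|\lambda| \mid \lambda \in \sigma(a)\}$, where $r(a) = 0$ if $\sigma(a) = \emptyset$. (We will
prove that $\sigma(a) \ne \emptyset$ for all $a \in A$ if $A$ is normed and $\mathbb{F} = \mathbb{C}$.) The map
\[ R_a : \mathbb{F} \setminus \sigma(a) \to A, \qquad \lambda \mapsto (a - \lambda 1)^{-1} \]
is called the resolvent (map). (Sometimes $\rho(a) = \mathbb{F} \setminus \sigma(a)$ is called the resolvent set.)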
13.23 Remark 1. It is clear that for an element of the Banach algebra B(E), where E is
a Banach space, this definition is equivalent to Definition 13.4. But in the present abstract
setting there is no analogue of the point, continuous, residual and compression spectra. (For a
generalization of σap to elements of abstract Banach algebras see Footnote 90.)
2. If $a \in A$ and $b \in \mathrm{Inv}A$ it is immediate that $\sigma(a) = \sigma(bab^{-1})$, thus $r(a) = r(bab^{-1})$.
3. Lemma 13.17 implies that the resolvent map $R_a : \mathbb{F} \setminus \sigma(a) \to A$, $\lambda \mapsto (a - \lambda 1)^{-1}$ is
continuous in every unital normed algebra.
4. Be warned that some authors define the resolvent $R_a(\lambda)$ as $(\lambda 1 - a)^{-1}$. □
As to our standard examples of Banach algebras not of the form B(V ) with V Banach:
13.24 Exercise (i) Let X be a compact Hausdorff space. Recall that (C(X, F), k · k∞ ) is a
Banach algebra. For f ∈ C(X, F), prove σ(f ) = f (X) ⊆ F.
(ii) As we saw in Section 4.6, `∞ (S, F) is a Banach algebra w.r.t. pointwise multiplication for
every set S. If f ∈ `∞ (S, F), prove σ(f ) = f (S).
We begin our study of the spectrum with two purely algebraic results:
13.29 Exercise Let A be a unital Banach algebra and a, b ∈ A. Prove the first and second
‘resolvent identities’
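13.30 Exercise Let $A$ be a unital Banach algebra and $a \in \mathrm{Inv}A$. Prove:
(i) $\sigma(a^{-1}) = \{\lambda^{-1} \mid \lambda \in \sigma(a)\}$.
(ii) If $\|a\| \le 1$ and $\|a^{-1}\| \le 1$ then $\sigma(a) \subseteq S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$.
13.31 Exercise Let $A$ be a unital Banach algebra, $a \in A$. Prove that for all $\lambda \in \mathbb{F} \setminus \sigma(a)$:
(i) $r((a - \lambda 1)^{-1}) = (\mathrm{dist}(\lambda, \sigma(a)))^{-1}$.
(ii) $\|R_a(\lambda)\| \ge (\mathrm{dist}(\lambda, \sigma(a)))^{-1}$.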
13.32 Remark 1. The result of (ii) could also be deduced from Exercise 13.21, but the approach
via $r(R_a(\lambda))$ is more conceptual and will give the exact result for $\|R_a(\lambda)\|$ in certain
situations, cf. Exercise 13.71.
2. Since $r(b) < \|b\|$ is perfectly possible, the above in general only gives a lower bound for
$\|R_a(\lambda)\|$. Proving upper bounds tends to be harder. □
13.33 Exercise Let $A$ be a unital Banach algebra and $a \in A$ nilpotent. With $N = \min\{n \in \mathbb{N} \mid a^n = 0\}$
prove $\lim_{\lambda \to 0} |\lambda|^N \|(a - \lambda 1)^{-1}\| = \|a^{N-1}\| \ne 0$. (Thus $\|(a - \lambda 1)^{-1}\|$ behaves like
$|\lambda|^{-N} \|a^{N-1}\|$ as $\lambda \to 0$.)
It is instructive to compare the above with Exercise 13.52.
13.34 Exercise Let $A$ be a unital Banach algebra. For $a \in A$ define $\zeta(a) = \inf\{\|ab\| \mid b \in A, \|b\| = 1\} \ge 0$
and call $a$ a topological left zero-divisor if $\zeta(a) = 0$. Prove:
(i) For $a \in \mathrm{Inv}A$ we have $\zeta(a) = \|a^{-1}\|^{-1} > 0$. Thus $\zeta^{-1}(0) \subseteq A \setminus \mathrm{Inv}A$.
(ii) $|\zeta(a) - \zeta(b)| \le \|a - b\| \ \forall a, b \in A$.
(iii) If $a \in \partial\mathrm{Inv}A$$^{79}$ then $a$ is not invertible.
(iv) Every $a \in \partial\mathrm{Inv}A$ is a topological left zero-divisor. Hint: Use Exercise 13.21.
(v) If $\lambda \in \partial\sigma(a)$ then $a - \lambda 1$ is a topological left zero-divisor.
(vi) For $A = C(X, \mathbb{F})$, where $X$ is compact, prove that $f \in A$ is a topological (left) zero-divisor
if and only if it is non-invertible.
(vii) Give an example of a unital Banach algebra $A$ and a non-invertible $a \in A$ that is not a
topological left zero-divisor.
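\[ \frac{\|f(x + h) - f(x) - D(h)\|}{\|h\|} \to 0 \quad \text{as } \|h\| \to 0. \]
Prove that $\mathrm{Inv}A \to \mathrm{Inv}A$, $a \mapsto a^{-1}$ is Fréchet differentiable. For $\mathbb{F} = \mathbb{C}$, conclude that the
map $\mathbb{C} \setminus \sigma(a) \to \mathbb{C}$, $\lambda \mapsto \varphi((a - \lambda 1)^{-1})$ is holomorphic for each $a \in A$, $\varphi \in A^*$.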
79
If $X$ is a topological space and $Y \subseteq X$ then $\partial Y = \overline{Y} \cap \overline{X \setminus Y}$ is the boundary of $Y$.
13.2.3 The spectral radius formula (Beurling-Gelfand theorem)
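13.36 Lemma Let $A$ be a unital normed algebra and $a \in A$. Put $\nu = \inf_{n \in \mathbb{N}} \|a^n\|^{1/n}$. Then
(i) $\lim_{n \to \infty} \|a^n\|^{1/n} = \nu$.$^{80,81}$
(ii) For all $\mu > \nu$ we have $\left(\frac{a}{\mu}\right)^n \to 0$ as $n \to \infty$, but $\left(\frac{a}{\nu}\right)^n \not\to 0$ provided $\nu > 0$.$^{82}$
(iii) If $\nu = 0$ then $a \notin \mathrm{Inv}A$, thus $0 \in \sigma(a)$.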
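Proof. (i) With $\|a^n\| \le \|a\|^n$ we trivially have
\[ 0 \le \nu = \inf_{n \in \mathbb{N}} \|a^n\|^{1/n} \le \liminf_{n \to \infty} \|a^n\|^{1/n} \le \limsup_{n \to \infty} \|a^n\|^{1/n} \le \|a\| < \infty. \tag{13.2} \]
By definition of $\nu$, for every $\varepsilon > 0$ there is a $k$ such that $\|a^k\|^{1/k} < \nu + \varepsilon$. Every $m \in \mathbb{N}$ is of the
form $m = sk + r$ with unique $s \in \mathbb{N}_0$ and $0 \le r < k$ (division with remainder). Then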
This proves the first claim. On the other hand, by definition of $\nu$ we have $\|a^n\|^{1/n} \ge \nu$ for all
$n \in \mathbb{N}$. With $\nu > 0$ this implies $\|(a/\nu)^n\| \ge 1 \ \forall n$, and therefore $(a/\nu)^n \not\to 0$.
(iii) If $a \in \mathrm{Inv}A$ then there is $b \in A$ such that $ab = ba = 1$. Then $1 = a^n b^n$, thus with
Remark 3.29 we have $1 \le \|1\| = \|a^n b^n\| \le \|a^n\|\|b^n\| \le \|a^n\|\|b\|^n$. Taking $n$-th roots, we have
$1 \le \|a^n\|^{1/n}\|b\|$, and taking the limit $n \to \infty$ gives the contradiction $1 \le \nu\|b\| = 0$. Thus if
$\nu = 0$ then $a$ is not invertible, so that $0 \in \sigma(a)$.
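The convergence in (i) can be slow, as a small experiment shows. The sketch below assumes $A = M_2(\mathbb{R})$ with the maximum absolute row sum norm, and two sample matrices chosen for illustration: for $a = \begin{pmatrix} 1 & 4 \\ 0 & 1 \end{pmatrix}$ one has $\|a^n\| = 1 + 4n$, so $\|a^n\|^{1/n} \to 1 = \nu$ although $\|a\| = 5$, and for a nilpotent matrix $\nu = 0$, in line with (iii).

```python
def mul(x, y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def norm(x):
    """Maximum absolute row sum: a submultiplicative norm on 2x2 matrices."""
    return max(abs(row[0]) + abs(row[1]) for row in x)


def root_norms(a, n_max):
    """Return the list of ||a^n||^(1/n) for n = 1, ..., n_max."""
    out, power = [], [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, n_max + 1):
        power = mul(power, a)
        out.append(norm(power) ** (1.0 / n))
    return out


a = [[1.0, 4.0], [0.0, 1.0]]            # a^n = [[1, 4n], [0, 1]], so ||a^n|| = 1 + 4n
seq = root_norms(a, 400)                # starts at 5 and decreases towards 1

nilpotent = [[0.0, 1.0], [0.0, 0.0]]    # its square is 0, hence nu = 0
nil_seq = root_norms(nilpotent, 3)      # [1.0, 0.0, 0.0]
```

After 400 steps the value is still about $1.018$, which illustrates that $\nu$ is a limit and need not be attained by any single $n$.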
Essentially everything we did so far works for F = R and F = C. That the natural ONB
for L2 ([0, 2π], λ; F) depends on F is a triviality (but related to the significant fact that x 7→ eix
is a homomorphism, while x 7→ sin x, cos x are not). The R/C-dependence in the polarization
identities (Exercise 5.13 and Lemma 11.17) is more serious since it propagates to the fact that
Lemma 11.19 and Propositions 11.22 and 11.34 are weaker over R than over C.
By contrast, the rest of this section requires $\mathbb{F} = \mathbb{C}$, and the same applies whenever
we use Theorem 13.39 and its consequences like Corollary 13.43 or Exercise
13.46. For this reason, from now on we assume $\mathbb{F} = \mathbb{C}$ throughout. If we still mention
$\mathbb{C}$, it is only for emphasis. Identifying the few results that also hold over $\mathbb{R}$
(like Lemma 14.1 through Corollary 14.4) usually is quite easy.
80 This would be immediate if $n \mapsto \|a^n\|^{1/n}$ was decreasing, but this need not hold! See Exercise 13.64.
81 More generally, if $\{c_n\}_{n\in\mathbb{N}} \subseteq [0, \infty)$ satisfies $c_{n+m} \le c_n c_m$ ∀n, m then $\lim_{n\to\infty} c_n^{1/n} = \inf_{n\in\mathbb{N}} c_n^{1/n}$.
82 This is of course trivial if $\mu > \|a\|$, but µ > ν is a weaker hypothesis when $\nu < \|a\|$.
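13.37 Lemma Let A be a unital algebra over C, and let a ∈ A, λ ∈ C\{0}, n ∈ N. Then
$(a/\lambda)^n - 1 \in \mathrm{Inv}A$ if and only if $\lambda_k = e^{\frac{2\pi i}{n}k}\lambda \notin \sigma(a)$ for all k = 1, . . . , n. In this case,
$$\left(\left(\frac{a}{\lambda}\right)^n - 1\right)^{-1} = \frac{1}{n}\sum_{k=1}^{n}\left(\frac{a}{\lambda_k} - 1\right)^{-1}. \qquad (13.3)$$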
Proof. For k ∈ Z we write $e_k = e^{2\pi i k/n}$. It is obvious that $e_k^n = 1$ for all k ∈ Z. Since
$e_0, e_1, \ldots, e_{n-1}$ are mutually distinct, we have $z^n - 1 = \prod_{k=0}^{n-1}(z - e_k)$. This identity holds in
every unital algebra, and replacing z by a/λ ∈ A and putting $\lambda_k = e_k\lambda$, it implies the first
statement. On the other hand, it means that there is a partial fraction expansion^{83}
$$\frac{1}{z^n - 1} = \sum_{k=0}^{n-1}\frac{c_k}{z - e_k}$$
Replacing z by a/λ herein and using $\lambda_k = \lambda e_k$ we obtain the identity (13.3) in A. That this is
justified follows from $\lambda_k \notin \sigma(a)$ for all k (thus also $\lambda^n \notin \sigma(a^n)$) and the following exercise.
13.38 Exercise Let A be a unital algebra over C and assume we have an identity $\sum_{i=1}^{I} \frac{f_i}{g_i} = 0$
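Identity (13.3) can be sanity-checked numerically in the simplest case A = C, where a is just a complex number. The sketch below is only a plausibility check under that assumption, not a proof; the sample values of a, λ and n are arbitrary choices, and the function names are hypothetical.

```python
import cmath

def lhs(a, lam, n):
    """Left-hand side of (13.3) for scalars: ((a/lam)^n - 1)^(-1)."""
    return 1 / ((a / lam) ** n - 1)

def rhs(a, lam, n):
    """Right-hand side of (13.3): the average of (a/lam_k - 1)^(-1) over k = 1..n."""
    total = 0
    for k in range(1, n + 1):
        lam_k = cmath.exp(2j * cmath.pi * k / n) * lam  # lam_k = e^(2 pi i k / n) * lam
        total += 1 / (a / lam_k - 1)
    return total / n

# Arbitrary sample values with a != lam_k for all k (since |a| != |lam|).
a, lam, n = 0.3 + 0.4j, 2.0 - 1.0j, 5
gap = abs(lhs(a, lam, n) - rhs(a, lam, n))
print(gap)
```

The printed gap should be of the order of floating point rounding error.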
13.39 Theorem (Beurling 1938-Gelfand 1939)^{84, 85} Let A be a unital normed algebra
over C (not necessarily complete) and a ∈ A. Then σ(a) ≠ ∅, and
$$r(a) \ge \inf_{n\in\mathbb{N}} \|a^n\|^{1/n} = \lim_{n\to\infty} \|a^n\|^{1/n}. \qquad (13.4)$$
If A is complete then equality holds in (13.4), which then is called the spectral radius formula.
83 Most calculus books mention this, but a proof is rarely given. Here is a quick analytic proof in the case at
hand, where all zeros of the denominator are simple: If $f(z) = (\prod_{j=1}^{n}(z - z_j))^{-1}$, we have
$A_i = \lim_{z\to z_i}(z - z_i)f(z) = \prod_{j\ne i}(z_i - z_j)^{-1} \in \mathbb{C}\setminus\{0\}$ for each i ∈ {1, . . . , n}. Now
$f(z) - \frac{A_i}{z - z_i} = \frac{1}{z - z_i}\left[\prod_{j\ne i}(z - z_j)^{-1} - A_i\right]$, where the expression in
square brackets is holomorphic near $z_i$, where it vanishes. Thus $f(z) - \frac{A_i}{z - z_i}$ extends holomorphically to a neighborhood
of $z_i$. Continuing like this, $g(z) = f(z) - \sum_{i=1}^{n}\frac{A_i}{z - z_i}$ extends to an entire function. Since f and $\sum_{i=1}^{n}\frac{A_i}{z - z_i}$
tend to zero as z → ∞, it follows that g is bounded, thus constant by Liouville’s theorem. Since the constant must be zero,
we have proven $f(z) = \sum_{i=1}^{n}\frac{A_i}{z - z_i}$. (For a purely algebraic treatment of the partial fraction expansion, including the
case of multiple zeros of the denominator, see e.g. [93, Ch. IV, §4].)
84 Arne Beurling (1905-1986). Swedish mathematician. Worked mostly on harmonic and complex analysis.
85 Israel Moiseevich Gelfand (1913-2009). Outstanding Soviet mathematician. Many important contributions to
many areas of mathematics, among which functional analysis and Banach algebras.
Proof. The equality of infimum and limit was Lemma 13.36(i). For a ∈ A, define ν as before.
If ν = 0 then 0 ∈ σ(a) by Lemma 13.36(iii). Thus σ(a) ≠ ∅ and (13.4) is trivially true.
From now on assume ν > 0. Assume that there is no λ ∈ σ(a) with |λ| ≥ ν. This implies
that $(a - \lambda 1)^{-1}$ exists for all |λ| ≥ ν and depends continuously on λ by Lemma 13.17. The
same holds (since |λ| ≥ ν > 0) for the slightly more convenient function
$$\phi : \{\lambda \in \mathbb{C} \mid |\lambda| \ge \nu\} \to A, \quad \lambda \mapsto \left(\frac{a}{\lambda} - 1\right)^{-1}.$$
Now Lemma 13.37 gives for all λ with |λ| ≥ ν and n ∈ N that $(a/\lambda)^n - 1 \in \mathrm{Inv}A$ with inverse
given by (13.3). Pick any η > ν. Since the annulus Λ = {λ ∈ C | ν ≤ |λ| ≤ η} is compact, the
continuous map φ : Λ → A is uniformly continuous. I.e., for every ε > 0 we can find δ > 0 such
that $\lambda, \lambda' \in \Lambda,\ |\lambda - \lambda'| < \delta \Rightarrow \|\phi(\lambda) - \phi(\lambda')\| < \varepsilon$. If ν < µ < ν + δ, we have $|\nu_k - \mu_k| = |\nu - \mu| < \delta$
and therefore $\|\phi(\nu_k) - \phi(\mu_k)\| < \varepsilon$ for all n ∈ N and k = 1, . . . , n. Combining this with (13.3)
we have $\|((a/\nu)^n - 1)^{-1} - ((a/\mu)^n - 1)^{-1}\| \le \frac{1}{n}\sum_{k=1}^{n}\|\phi(\nu_k) - \phi(\mu_k)\| < \varepsilon$ ∀n ∈ N, so that:
$$\forall \varepsilon > 0\ \exists \mu > \nu\ \forall n \in \mathbb{N}: \quad \left\|\left(\left(\frac{a}{\nu}\right)^n - 1\right)^{-1} - \left(\left(\frac{a}{\mu}\right)^n - 1\right)^{-1}\right\| < \varepsilon. \qquad (13.5)$$
13.40 Remark 1. We emphasize that only the last clause requires completeness of A.
2. The standard proof of the above theorem requires completeness and uses the differentiability
of the resolvent map $R_a$ and a certain amount of complex analysis. The more elementary
(which does not mean simple) proof given above, due to Rickart^{86} ([130] 1958), shows that neither
the completeness assumption nor the complex analysis are essential to the problem. (See
also Exercise 13.53 and the subsequent remark.)
3. Even though we avoided complex analysis (holomorphicity etc.), it is clear that the proof
only works over C. In fact, the matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in M_{2\times 2}(\mathbb{R})$, the counterexample in Remark
11.20, has empty spectrum over R (as does every invertible antisymmetric real matrix). □
13.42 Remark The above argument is not circular since the proof of Theorem 13.39 did not
use the FTA but only some information about the exponential function in the complex domain
(existence of n-th roots, in particular roots of unity). The same holds for the ‘standard’ proof
of the FTA, cf. e.g. [108, Theorem 7.7.57], with which the above has much in common. Both
proofs certainly are more elementary than those using complex analysis (Liouville’s theorem)
or topological arguments based on $\pi_1(S^1) \ne 0$. □
The preceding corollaries only used σ(a) 6= ∅, but also the spectral radius formula will have
many applications.
13.45 Exercise Let A be a unital normed algebra over C and a ∈ A. Prove that a is quasi-nilpotent
(r(a) = 0) if and only if $\lim_{n\to\infty} \|(za)^n\| = 0$ for all z ∈ C.
If A is a Banach algebra with unit 1, B ⊆ A a Banach subalgebra (= closed subalgebra)
containing 1 and b ∈ B then we can consider the spectrum of b as an element of A or of B,
leading to $\sigma_A(b)$, $\sigma_B(b)$ and the spectral radii $r_A(b)$, $r_B(b)$.
13.47 Lemma Let A be a Banach algebra over C with unit 1 and B ⊆ A a Banach subalgebra
containing 1. Then
(i) InvB ⊆ B ∩ InvA and $\sigma_A(b) \subseteq \sigma_B(b)$ for all b ∈ B.
(ii) $\sigma_B(b) = \sigma_A(b)$ holds for all b ∈ B if and only if InvB = B ∩ InvA.
87 Adolf Hurwitz (1859-1919). German mathematician who worked on many subjects.
88 Sir William Rowan Hamilton (1805-1865). Irish mathematician. Known particularly for quaternions and Hamiltonian
mechanics. It was he who advocated the modern view of complex numbers as pairs of real numbers.
89 John T. Graves (1806-1870). Irish jurist (!) and mathematician.
Proof. (i) The first statement is obvious. If b − λ1 has an inverse in B then the latter also is an
inverse in A. Thus $\lambda \notin \sigma_B(b) \Rightarrow \lambda \notin \sigma_A(b)$.
(ii) Assume InvB = B ∩ InvA. Then for all b ∈ B, λ ∈ F we have that b − λ1 is invertible in
B if and only if it is invertible in A, so that $\sigma_B(b) = \sigma_A(b)$. If InvB ≠ B ∩ InvA then in view
of InvB ⊆ B ∩ InvA we have InvB ⊊ B ∩ InvA. If now b ∈ (B ∩ InvA)\InvB then $0 \in \sigma_B(b)$, while
$0 \notin \sigma_A(b)$.
13.48 Example In Section 4.6 we saw that the Banach space $A = \ell^1(\mathbb{Z}, \mathbb{C})$ with norm $\|\cdot\|_1$
becomes a Banach algebra when equipped with the convolution product ⋆. The functions
$\delta_n(m) = \delta_{n,m}$ satisfy $\|\delta_n\| = 1$ and $\delta_n \star \delta_m = \delta_{n+m}$. In particular $\delta_n^{-1} = \delta_{-n}$ for each n ∈ Z.
Now Exercise 13.30 gives $\sigma(\delta_n) \subseteq S^1$ for all n.
Let $B = \{f \in A \mid f(n) = 0\ \forall n < 0\} = \overline{\mathrm{span}_{\mathbb{C}}\{\delta_n \mid n \ge 0\}} \subseteq A$. It is immediate that B is a
closed subalgebra containing 1. If $\delta_1$ had an inverse c ∈ B, c would also be an inverse in A, so
that $c = \delta_{-1}$ by uniqueness of inverses. In view of $\delta_{-1} \notin B$ we have $\delta_1 \in (B \cap \mathrm{Inv}A)\setminus\mathrm{Inv}B$.
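The algebraic relations used in this example can be checked on finitely supported sequences. The following is a minimal sketch in which such a sequence is modelled as a Python dict from indices to values; this representation and all function names are assumptions of the sketch, and it covers only the finitely supported elements of $\ell^1(\mathbb{Z}, \mathbb{C})$.

```python
def delta(n):
    """The sequence delta_n, stored as {index: value} with finite support."""
    return {n: 1.0}

def convolve(f, g):
    """(f * g)(m) = sum_k f(k) g(m - k) for finitely supported f, g on Z."""
    out = {}
    for i, fi in f.items():
        for j, gj in g.items():
            out[i + j] = out.get(i + j, 0.0) + fi * gj
    return {m: v for m, v in out.items() if v != 0.0}

def norm1(f):
    """The l^1 norm of a finitely supported sequence."""
    return sum(abs(v) for v in f.values())

def in_B(f):
    """Membership in B: the sequence vanishes on all negative indices."""
    return all(m >= 0 for m in f)

unit = delta(0)                             # the unit of the convolution algebra
product = convolve(delta(1), delta(-1))     # delta_1 * delta_{-1}
print(product == unit, in_B(delta(1)), in_B(delta(-1)))
```

Here `delta(-1)` inverts `delta(1)` but fails the membership test for B, matching the conclusion of the example.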
13.49 Exercise Let $A = \ell^1(\mathbb{Z}, \mathbb{C})$ and B ⊊ A as above. Prove, using no later results:
(i) $\sigma_B(\delta_1) = \{z \in \mathbb{C} \mid |z| \le 1\}$.
(ii) $\sigma_A(\delta_1) = S^1$.
Since the spectra depend on the set of invertibles, one is interested in subalgebras B ⊆ A
satisfying InvB = B ∩ InvA. For a very useful result in this direction see Theorem 16.19. Other
examples are provided by the following exercise:
13.50 Exercise Let A be a unital Banach algebra with unit 1. For any subset S ⊆ A, define
the ‘commutant’ S′ of S by
S′ = {t ∈ A | st = ts ∀s ∈ S}.
(i) For each S ⊆ A, prove that B = S′ ⊆ A is a Banach subalgebra with unit 1 and that
Inv(B) = B ∩ InvA.
(ii) Let S ⊆ T ⊆ A. Prove: (1) T′ ⊆ S′, (2) S ⊆ S″, (3) S′ = S‴.
(iii) Prove: S ⊆ A is commutative ⇔ S ⊆ S′ ⇔ S″ is commutative.
[Combining (i)-(iii) we have: If S ⊆ A is commutative then B = S″ ⊆ A is a commutative
Banach subalgebra containing 1 and S and satisfying Inv(B) = B ∩ InvA.]
(iv) Prove that a subalgebra B ⊆ A is maximal abelian (i.e. abelian and not properly contained
in a larger abelian subalgebra of A) if and only if B = B′. Conclude that every maximal
abelian subalgebra B satisfies InvB = B ∩ InvA.
(v) If H is a Hilbert space, A = B(H) and S ⊆ A, prove that S′ ⊆ B(H) is also $\tau_{wot}$-closed.
13.51 Exercise Let $V = C^1([0, 1], \mathbb{C})$ (the differentiable functions [0, 1] → C with continuous
derivative). For f ∈ V define $\|f\| = \|f\|_\infty + \|f'\|_\infty$.
(i) Prove that $(V, \|\cdot\|)$ is a Banach space. (You may assume from analysis that if $f_n, g \in V$
and $f_n \to g$ and $f_n' \to h$ uniformly then $g' = h$.)
(ii) Show that V is a Banach algebra when multiplication of functions is defined point-wise,
i.e. (f g)(x) = f(x)g(x).
(iii) Let g(x) = x for all x ∈ [0, 1]. Compute the norm, the spectrum and the spectral radius
of g.
(iv) If E ⊆ [0, 1] is a closed subset, show that
$I_E = \{f \in V \mid f(x) = 0\ \forall x \in E\}$
13.53 Exercise Let A be a unital Banach algebra over C, a ∈ A and z ∈ C\{0} such that
$zS^1 \cap \sigma(a) = \emptyset$ (i.e. there is no λ ∈ σ(a) with |λ| = |z|). Let $p_n(a, z) = (1 - (a/z)^n)^{-1}$. Prove:
(i) If |z| > r(a) then $\lim_{n\to\infty} p_n(a, z) = 1$.
(ii) If there is no λ ∈ σ(a) with |λ| < |z| then $\lim_{n\to\infty} p_n(a, z) = 0$.
(iii) The limit $p(a, z) = \lim_{n\to\infty} p_n(a, z) \in A$ exists, commutes with a and depends only on |z|.
Hint: Lemma 13.37.
(iv) $p(a, z) = \lim_{n\to\infty}(1 - (a/z)^{2n+1})^{-1} = \lim_{n\to\infty}(1 + (a/z)^{2n+1})^{-1}$.
(v) Use the preceding results to prove $p(a, z)^2 = p(a, z)$.
(vi) BONUS: $p(a, z) = -\frac{1}{2\pi i}\oint_C R_a(z)\,dz$, where C is the circle of radius |z| around 0 ∈ C with
counterclockwise orientation. (This is used as the definition of p(a, z) in the standard
approach and for proving its properties using holomorphicity of $z \mapsto R_a(z)$.)
13.54 Remark The operators $((a/z)^n - 1)^{-1} = -p_n$ already appeared in our (that is Rickart’s)
proof of the Beurling-Gelfand theorem, where we did not need their convergence as n → ∞.
The result of (vi) to the effect that the limit p(a, z), known as (F.) Riesz idempotent, is given by
a contour integral establishes a connection between Rickart’s proof and the standard textbook
proof via complex analysis. See also [132, §149]. □
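For a diagonal matrix a, the elements $p_n(a, z)$ of Exercise 13.53 can be computed entry by entry, which makes the limiting behaviour easy to observe. The sketch below assumes this diagonal special case; the eigenvalues, the value of z and the function names are hypothetical choices for illustration, and a numerical experiment of this kind is of course no substitute for the proofs asked for in the exercise.

```python
def p_n(lam, z, n):
    """Scalar version of p_n(a, z) = (1 - (a/z)^n)^(-1) with a = lam."""
    return 1 / (1 - (lam / z) ** n)

def p_n_diag(eigs, z, n):
    """Diagonal entries of p_n(a, z) for a = diag(eigs); the inverse acts entrywise."""
    return [p_n(lam, z, n) for lam in eigs]

# Hypothetical spectrum: two eigenvalues inside the unit circle, two outside.
eigs = [0.5, 0.25j, 3.0, -2.0 + 1.0j]
z = 1.0                      # no eigenvalue has modulus |z| = 1
approx = p_n_diag(eigs, z, 200)
print([abs(v) for v in approx])
```

The entries belonging to eigenvalues of modulus less than |z| are close to 1 and the others close to 0, so the diagonal matrix with these entries is (numerically) an idempotent.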
Since we need a unit in order to define InvA and σ(a), the following construction is quite
important when dealing with non-unital algebras:
13.3 Spectra of bounded operators II: Banach algebra methods
In this section, we apply our results on abstract Banach algebras to the study of some operators
on Banach spaces.
13.56 Exercise Consider the left and right shift operators L, R of Definition 13.2 in the Hilbert
space $\ell^2(\mathbb{N}, \mathbb{C})$.
(i) Prove $\sigma(L) = \sigma(R) = \overline{B(0, 1)}$ (closed unit disc).
(ii) Determine $\sigma_c$ and $\sigma_r$ for L, R.
(iii) Find $\sigma_{ap}$ and $\sigma_{cp}$ for L, R.
Hint: Use Exercises 13.7, 13.9.
13.57 Exercise Let $H = \ell^2(\mathbb{N}, \mathbb{C})$ and define A ∈ B(H) by (Af)(n) = f(n)/n. Determine
$\sigma_p(A)$, $\sigma_c(A)$, $\sigma_r(A)$.
13.58 Exercise (Assuming some measure theory) Let $H = L^2([a, b], \mathbb{C})$, where −∞ <
a < b < ∞, and define A ∈ B(H) by (Af)(x) = xf(x). Prove $\sigma_c(A) = [a, b]$ and $\sigma_p(A) =
\sigma_r(A) = \emptyset$.
13.59 Exercise Prove that for every compact set C ⊆ C there is an operator A ∈ B(H),
where H is a separable Hilbert space, such that σ(A) = C.
Hint: Prove and use that C has a countable dense subset.
13.61 Exercise Let V = C([0, 1], F) with (complete) norm $\|\cdot\|_\infty$. Define the Volterra operator
A : V → V by $(Af)(x) = \int_0^x f(t)\,dt$. Prove
(i) A is injective.
(ii) A is bounded and satisfies $\|A^n\| = 1/n!$ for all n ∈ N.
(iii) A is quasi-nilpotent, but not nilpotent, and $0 \in \sigma_r(A)$.
The next three exercises study an important class of bounded operators on $\ell^2(\mathbb{N})$, the
‘weighted shift operators’, generalizing the right shift R, and two typical applications:
13.62 Exercise (i) Let H be a Hilbert space and A ∈ B(H). Define $p_n = \|A^n\|$ for $n \in \mathbb{N}_0$
and prove that $p_{i+j} \le p_i p_j$ ∀i, j.
(ii) Let $\{\alpha_k\}_{k\in\mathbb{N}_0} \subseteq \mathbb{C}$. Put $H = \ell^2(\mathbb{N}_0, \mathbb{C})$ with natural ONB $\{e_k\}$ and define a linear ‘weighted
shift operator’ A : H → H by $Ae_k = \alpha_k e_{k+1}$. Prove that $\|A\| = \sup_k |\alpha_k|$.
(iii) Let $\{p_k\}_{k\in\mathbb{N}_0}$ be given so that $p_0 = 1$, $p_k > 0$ ∀k and $p_{i+j} \le p_i p_j$ ∀i, j. Define
$$\alpha_0 = p_0 \quad\text{and}\quad \alpha_k = \frac{p_k}{p_{k-1}} \ \text{ if } k \ge 1.$$
Prove that the sequence $\{\alpha_k\}$ and the associated weighted shift operator A are bounded.
(iv) Let A be constructed from $\{p_k\}_{k\in\mathbb{N}_0}$ as in (iii). Prove $\|A^n\| = p_n$ ∀$n \in \mathbb{N}_0$.
(v) (Bonus) Adapt the above construction to the case where $p_k = 0$ is allowed for some k > 0.
13.63 Exercise Let $H = \ell^2(\mathbb{N}, \mathbb{C})$ and define A ∈ B(H) by $Ae_k = 2^{-k}e_{k+1}$ ∀k. Prove that A
is (i) injective, (ii) quasi-nilpotent, but (iii) not nilpotent.
13.64 Exercise Let $H = \ell^2(\mathbb{N}, \mathbb{C})$ and define A ∈ B(H) by $Ae_k = \alpha_k e_{k+1}$, where $\alpha_k = 2$ for
odd k and $\alpha_k = 1/2$ for even k. Compute $\|A^n\|$ for all n and show that $n \mapsto \|A^n\|^{1/n}$ is not
monotonically decreasing.
13.66 Exercise Let V be a complex Banach space and A ∈ B(V) such that σ(A) is disjoint
from the circle $C = \{z \in \mathbb{C} \mid |z - z_0| = r\}$.
(i) Apply Exercise 13.53 to $A - z_0 1$ to obtain $P^2 = P \in B(V)$ satisfying PA = AP.
(ii) Prove that $AV_i \subseteq V_i$, where $V_1 = PV$, $V_2 = (1 - P)V$. Conclude that $A = A_1 \oplus A_2$ where
$A_i = A|_{V_i}$.
(iii) Prove $V_1 = \{x \in V \mid \lim_{n\to\infty}(A - z_0 1)^n x = 0\}$.
(iv) Deduce $\sigma(A_1) = \sigma(A) \cap B(z_0, r)$ and $\sigma(A_2) = \sigma(A)\setminus B(z_0, r)$.
13.67 Remark The unnatural assumption that the two parts of the spectrum are separated
by a circle can be removed using holomorphic functional calculus. Cf. e.g. [132, §149], [30,
Chapter VII, §4]. For normal operators on Hilbert space, there is a more powerful approach, cf.
Proposition 17.24. □
13.68 Exercise (Isolated points in the spectrum) Let V be a complex Banach space,
A ∈ B(V) and λ ∈ σ(A) isolated. Pick r > 0 such that B(λ, r) ∩ σ(A) = {λ}, and put
C = {z | |z − λ| = r} and let $P_\lambda$, $V_i$, $A_i$ be as constructed in Exercise 13.66. Prove that
(i) $A_1 - \lambda 1$ is quasi-nilpotent.
(ii) $A_2 - \lambda 1$ is invertible.
(iii) If $\dim V_1 < \infty$ then $\lambda \in \sigma_p(A_1) \subseteq \sigma_p(A)$.
Even though we developed the theory of the Riesz projector (Exercises 13.53, 13.66, 13.68)
reasonably fully only for isolated points of the spectrum, it will suffice for applications to
the spectral theory of compact operators in Section 14.4 and to the discussion of the discrete
spectrum, cf. Section B.10.
90 Thus if A is a unital Banach algebra and a ∈ A, defining $\sigma_{ap}(a) = \{\lambda \in \mathbb{F} \mid a - \lambda 1 \text{ is topological left zero-divisor}\}$
is consistent with the usual definition when A = B(V). With this, (ii) also holds for Banach algebras.
13.4 Applications to normal Hilbert space operators
We will have more to say on abstract Banach algebras, in particular the subclass of C*-algebras
still to be defined. But before turning to these matters, we will consider some applications of
the above to operator theory.
(ii) Since σ(A) is compact, the continuous real-valued function σ(A) → C, λ ↦ |λ| assumes
its supremum r(A), which equals $\|A\|$ by (i).
(iii) By Exercise 13.15 we have $r(A) \le |||A||| \le \|A\|$ for every A ∈ B(H). Now for normal A
the claim clearly follows from (i).
13.70 Remark Recall that Exercise 11.35 gave a direct proof of $|||A||| = \|A\|$ not using Theorem
13.39. Using this one can also give a direct proof [14] of (i),(ii): In Exercise B.141 it is shown
that for every A ∈ B(H) there exists a sequence $\{x_n\}$ with $\|x_n\| = 1$ such that $\langle Ax_n, x_n\rangle \to \lambda$,
where $|\lambda| = |||A|||$, which equals $\|A\|$ by normality. Now
The sum of the three rightmost terms converges to $-|\lambda|^2 = -\|A\|^2$. Since $\langle Ax_n, Ax_n\rangle =
\|Ax_n\|^2 \le \|A\|^2$ ∀n, we have $\|Ax_n - \lambda x_n\| \to 0$, proving that A − λ1 is not bounded below.
Thus λ ∈ σ(A), so that $r(A) \ge |\lambda| = \|A\|$. Combining this with $r(A) \le \|A\|$ from Proposition
13.27, we have $r(A) = \|A\|$. □
14.1 Lemma If A ∈ B(V ) is compact and λ ∈ F\{0} then ker(A − λ1) is finite-dimensional.
Proof. If λ ∉ σ(A) then this is trivial since A − λ1 is invertible. In general, $V_\lambda = \ker(A - \lambda 1)$
is the space of eigenvectors of A with eigenvalue λ. Clearly $A|_{V_\lambda} = \lambda\,\mathrm{id}_{V_\lambda}$, so that $V_\lambda$ is an
invariant subspace. Since $V_\lambda$ is closed and $A|_{V_\lambda}$ is compact by Remark 12.6.3, $V_\lambda$ must be
finite-dimensional by Remark 12.6.4.
For λ = 0, the above does not hold since the zero operator on any V is compact.
14.3 Remark 1. The result fails for λ = 0 since there are compact injective operators that are
not surjective.
2. In the above proof we have shown that if A ∈ B(V) is compact and λ ∈ F\{0} then we
cannot have $\ker(A - \lambda 1)^{n+1} \supsetneq \ker(A - \lambda 1)^n$ for all n or $(A - \lambda 1)^{n+1}H \subsetneq (A - \lambda 1)^n H$ for all n.
One says that A − λ1 has finite ascent and descent.
3. Compactness is essential for this stabilization: For the shift operators on $V = \ell^p(\mathbb{N})$ we
have $\ker L^{n+1} \supsetneq \ker L^n$ and $R^{n+1}V \subsetneq R^n V$ for all n. □
Proof. (i) For λ ≠ 0, with Proposition 14.2 we have: λ ∈ σ(A) ⇔ A − λ1 not invertible ⇔ A − λ1
not injective ⇔ $\lambda \in \sigma_p(A)$.
(ii) If A was invertible then $1 = A^{-1}A$ would be compact by Lemma 12.10, but this is false
by Remark 12.6.3. Thus 0 ∈ σ(A).
14.5 Exercise Show that a compact operator A can have 0 in any of σp (A), σc (A), σr (A).
14.6 Exercise Let H be a Hilbert space, A ∈ K(H) and λ ∈ C\{0}. Show that each of the
implications (ii)⇒(iii) and (iii)⇒(ii) in Proposition 14.2 can be deduced from the other.
where the second equality and the final isomorphism come from Exercises 9.35(i) and 6.7, respectively.
Thus $(V/((A - \lambda 1_V)V))^*$ is finite-dimensional, implying (why?) finite-dimensionality
of $V/((A - \lambda 1_V)V) = \mathrm{coker}(A - \lambda 1_V)$.
(ii) This follows from (i) and Exercise 7.11.
Alternatively, a direct proof goes as follows: By Lemma 14.1, K = ker(A − λ1) is finite-dimensional,
thus closed by Exercise 3.22. Thus there is a closed subspace S ⊆ V such that
V = K ⊕ S. (If V is a Hilbert space, we can just take $S = K^\perp$. For general Banach spaces
this is the statement of Proposition 6.11.) The restriction $(A - \lambda 1)|_S : S \to H$ is compact and
injective. If $(A - \lambda 1)|_S$ is not bounded below, we can find a sequence $\{x_n\}$ in S with $\|x_n\| = 1$
for all n and $\|(A - \lambda 1)x_n\| \to 0$. Since A is compact, we can find a subsequence $\{x_{n_k}\}$ such
that $\{Ax_{n_k}\}$ converges. We relabel, so that now $\{Ax_n\}$ converges. Now
Since {Axn } converges and {(A − λ1)xn } converges to zero by choice of {xn }, {xn } converges
to some y ∈ S (since xn ∈ S ∀n and S is closed). From (A − λ1)xn → 0 and xn → y we obtain
(A − λ1)y = 0, so that y ∈ ker(A − λ1) = K. Thus y ∈ K ∩ S = {0}, which is impossible since
y = limn xn and kxn k = 1 ∀n. This contradiction shows that (A − λ1)|S is bounded below. Now
Lemma 7.39 gives that (A − λ1)H = (A − λ1)S is closed.
14.8 Remark In fact one can prove more: If A ∈ B(V ) is compact and λ ∈ C\{0} then
This clearly is much stronger than the equivalence (ii)⇔(iii) in Proposition 14.2, which amounts
to the statement dim ker(A − λ1) = 0 ⇔ dim coker(A − λ1) = 0. See Remark 14.10.2. □
14.9 Definition If V, W are Banach spaces then A ∈ B(V, W) is called a Fredholm operator if
both ker A and coker A are finite-dimensional.
If A is Fredholm, ind(A) = dim ker A − dim coker A ∈ Z is the (Fredholm) index of A.
Note that we do not require closedness of the image of A since by Exercise 7.11 it follows
automatically from the finite-dimensionality of W/AV.
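In finite dimensions every linear map is Fredholm, and the index can be computed from the rank. The following minimal sketch does this for a matrix with rational entries, using Gaussian elimination over exact fractions; the function names are hypothetical and the restriction to matrices is an assumption of the sketch. For a matrix with m rows and n columns the index always comes out as n − m, independently of the rank, by the rank-nullity theorem.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Look for a pivot in column c among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def fredholm_data(rows):
    """Return (dim ker, dim coker, index) for the linear map x -> rows * x."""
    n_rows, n_cols = len(rows), len(rows[0])
    rk = rank(rows)
    dim_ker, dim_coker = n_cols - rk, n_rows - rk
    return dim_ker, dim_coker, dim_ker - dim_coker

print(fredholm_data([[1, 2, 3], [2, 4, 6]]))
```

The genuinely infinite-dimensional phenomena, such as the shift operators having index ±1 on a single space, are not captured by this finite-dimensional sketch.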
In linear algebra one proves that for a matrix $A \in M_{n\times n}(\mathbb{C})$ the following are equivalent: A
is normal, A can be diagonalized by a unitary matrix, $\mathbb{C}^n$ has an orthonormal basis consisting
of eigenvectors of A, the (geometric) dimension of each eigenspace of A coincides with the
(algebraic) multiplicity of the corresponding eigenvalue. Cf. e.g. [55, Theorem 6.16]. In
basis-independent language, A ∈ B(H) with H finite-dimensional is normal if and only if H admits an
ONB consisting of eigenvectors of A. The following beautiful result generalizes this to compact
normal operators:
14.12 Theorem (Spectral theorem for compact normal operators) Let H be a complex
Hilbert space and A ∈ B(H) compact normal. Then
(i) H is spanned by the eigenvectors of A.
(ii) There is an ONB E of H consisting of eigenvectors, thus $A = \sum_{e\in E}\lambda_e P_e$, where $P_e =
e \otimes e : x \mapsto \langle x, e\rangle e$.
(iii) For each ε > 0 there are at most finitely many $\lambda \in \sigma_p(A)$ with |λ| ≥ ε.
(iv) $\sigma_p(A)$ is at most countable and has no accumulation points except perhaps 0, which is an
accumulation point whenever σ(A) is infinite.
(v) We have $\sigma(A) \subseteq \sigma_p(A) \cup \{0\}$, where $0 \in \sigma_p(A)$ if and only if A has a kernel and $0 \in \sigma_c(A)$
if and only if A is injective and σ(A) is infinite.
Proof. (i) Let K ⊆ H be the smallest closed linear subspace containing $\bigcup_{\lambda\in\sigma_p(A)} H_\lambda$, where
$H_\lambda = \ker(A - \lambda 1)$. Clearly K is an invariant subspace: AK ⊆ K. Exercise 13.12(i) implies
that also $A^*K \subseteq K$. Now Exercise 11.23(i) gives that also $K^\perp$ is A-invariant: $AK^\perp \subseteq K^\perp$. If
$K^\perp \ne \{0\}$ then $A|_{K^\perp}$ is compact and has eigenvectors by Proposition 14.11. Since this would
contradict the definition of K, we have $K^\perp = 0$, proving that H is spanned by the eigenvectors
of A.
(ii) By Exercise 13.12(ii), the eigenspaces for different eigenvalues of A are mutually orthogonal.
Now the claim follows from (i) by choosing ONBs $E_\lambda$ for each $H_\lambda$ and putting $E = \bigcup_\lambda E_\lambda$.
(iii) Taking into account the unitary equivalence $H \cong \ell^2(E, \mathbb{C})$, cf. Theorem 5.45, this
essentially is Exercise 12.13(iii).
(iv) This is an immediate consequence of (iii).
(v) Since A is normal, $0 \in \sigma_r(A)$ is ruled out by Exercise 13.13(i). The statement about
$0 \in \sigma_p(A)$ is trivially true by definition. The one about $\sigma_c(A)$ now follows from (iv) and the
closedness of σ(A).
14.13 Remark 1. The statements about σ(A) actually hold for all compact operators on
Banach spaces. (Instead of the orthogonality of eigenvectors for different eigenvalues, it suffices
to use their linear independence.)
2. The common theme of ‘spectral theorems’ is that normal operators can be diagonalized,
i.e. be interpreted as multiplication operators, compactness simplifying statement and proof
considerably. Compare Theorem 18.4 for a result not requiring compactness.
3. If A ∈ B(H) is compact normal and f : σ(A) → C a function, we formally define f(A)
as $\sum_{e\in E} f(\lambda_e)P_e$, where E, $\lambda_e$, $P_e$ are as in Theorem 14.12. The sum converges strongly to a
bounded operator if and only if f is bounded, and f(A) is compact if and only if f(λ) → 0
as λ → 0. Thus setting up a ‘functional calculus’ for compact normal operators is quite easy.
Our discussion of not-necessarily-compact normal operators will proceed in the opposite order:
We begin in Section 17 by constructing a functional calculus, which will then be used to prove
spectral theorems. □
14.14 Proposition Let H be a complex Hilbert space and A ∈ K(H). Then there are orthonormal
sets (not necessarily bases!) E and F of H, a bijection $E \to F$, $e \mapsto f_e$ and positive numbers
$\{\beta_e\}$, called the singular values of A, such that $e \mapsto \beta_e$ is in $c_0(E, \mathbb{C})$ and
$$A = \sum_{e\in E} \beta_e f_e \langle\cdot, e\rangle.$$
and putting $\beta_e = \|Ae\| > 0$ we have the desired form.
Since E diagonalizes $A^*A$, we have $A^*Ae = \lambda_e e$ for all e ∈ E, where compactness of $A^*A$ implies
that $e \mapsto \lambda_e$ is in $c_0(E_B, \mathbb{C})$, cf. Theorem 14.12(iii). Now, $\|Ae\|^2 = \langle Ae, Ae\rangle = \langle e, A^*Ae\rangle =
\lambda_e$, thus $\beta_e = \|Ae\| = \lambda_e^{1/2}$ implies that also $e \mapsto \beta_e$ is in $c_0(E)$. For the final claim, note that
$|A|^2 e = A^*Ae = \lambda_e e$, thus $|A|e = \beta_e e$.
The following goes some way towards proving that Hilbert spaces have the approximation
property:
14.15 Corollary Let H be a complex Hilbert space, A ∈ K(H) and ε > 0. Then there is a
B ∈ F(H) (finite rank) with $\|A - B\| \le \varepsilon$. Thus $K(H) = \overline{F(H)}^{\|\cdot\|}$.
Proof. Pick a representation $A = \sum_{e\in E}\lambda_e f_e\langle\cdot, e\rangle$ as in the preceding proposition. Since $E \to
\mathbb{C}$, $e \mapsto \lambda_e$ is in $c_0(E)$, there is a finite subset F ⊆ E such that $|\lambda_e| < \varepsilon$ for all e ∈ E\F. Define
$$B = \sum_{e\in F}\lambda_e\langle\cdot, f_e\rangle e,$$
which clearly has finite rank. If x ∈ H then using the orthonormality of E and Bessel’s
inequality, we have
$$\|(A - B)x\|^2 = \Big\|\sum_{e\in E\setminus F}\lambda_e\langle x, f_e\rangle e\Big\|^2 = \sum_{e\in E\setminus F}|\lambda_e\langle x, f_e\rangle|^2 \le \varepsilon^2\|x\|^2.$$
Thus $\|A - B\| \le \varepsilon$, so that $K(H) \subseteq \overline{F(H)}$. The converse inclusion was Corollary 12.12.
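The selection of the finite set F in the proof can be illustrated for a diagonal operator, whose norm is the supremum of the absolute values of its diagonal entries (a standard fact, cf. Exercise 13.62(ii) for the analogous statement about weighted shifts). The sketch below assumes the weights 1/n from Exercise 13.57, truncated to finitely many terms so that they fit in a list; the truncation length and the function names are hypothetical.

```python
def kept_indices(weights, eps):
    """Indices e with |weight_e| >= eps; off this finite set all weights are < eps."""
    return [i for i, w in enumerate(weights) if abs(w) >= eps]

def tail_sup(weights, kept):
    """Supremum of |weight_e| over the indices not kept, i.e. the norm of A - B
    when A is diagonal and B keeps only the diagonal entries in `kept`."""
    kept = set(kept)
    return max((abs(w) for i, w in enumerate(weights) if i not in kept), default=0.0)

weights = [1.0 / n for n in range(1, 1001)]   # diagonal entries 1, 1/2, 1/3, ...
eps = 0.1
F = kept_indices(weights, eps)
error = tail_sup(weights, F)
print(len(F), error)
```

The approximant B then has rank len(F), and the printed error is the norm of A − B for the truncated diagonal operator.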
14.16 Remark 1. In the above, bases played a crucial role. Even though there is no notion of
orthogonality in general Banach spaces, it turns out that Banach spaces having suitable bases
do satisfy $K(H) = \overline{F(H)}^{\|\cdot\|}$, i.e. the approximation property. Cf. e.g. [102, Theorem 4.1.33].
2. If you like applications of complex analysis to functional analysis, see [128, Section VI.5]
for an interesting alternative approach to compact operators. □
Let thus $\{\lambda_n\}_{n\in\mathbb{N}} \subset \Sigma_\varepsilon$ be mutually distinct non-zero eigenvalues of A. Pick corresponding
eigenvectors $\{x_n\}$ and put $W_n = \mathrm{span}_{\mathbb{F}}(x_1, \ldots, x_n)$. Since we know from linear algebra that
the sets $\{x_1, \ldots, x_n\}$ all are linearly independent, we have $W_n \subsetneq W_{n+1}$ for all n. Thus by
Riesz’ Lemma 12.2 there are unit vectors $y_{n+1} \in W_{n+1}$ such that $\mathrm{dist}(y_{n+1}, W_n) \ge 1/2$. Since
$(A - \lambda_{n+1}1)$ kills $x_{n+1}$, we have $(A - \lambda_{n+1}1)(y_{n+1}) \in W_n$. Now for all j > k > 1 we have
$$A(\lambda_j^{-1}y_j) - A(\lambda_k^{-1}y_k) = \lambda_j^{-1}(A - \lambda_j 1)(y_j) - \lambda_k^{-1}(A - \lambda_k 1)(y_k) + y_j - y_k$$
$$= y_j - [-\lambda_j^{-1}(A - \lambda_j 1)(y_j) + \lambda_k^{-1}(A - \lambda_k 1)(y_k) + y_k].$$
Since the expression in square brackets lies in $W_{j-1}$, which has distance ≥ 1/2 from $y_j$, we have
$\|A(\lambda_j^{-1}y_j) - A(\lambda_k^{-1}y_k)\| \ge 1/2$, which clearly holds for all j ≠ k. Thus the sequence $\{A(\lambda_j^{-1}y_j)\}$
has no convergent subsequence. Since A is compact, this proves that $\{\lambda_j^{-1}y_j\}$ has no bounded
subsequence. With $\|y_j\| = 1$ ∀j, this proves $\lambda_j \to 0$, as desired.
(ii) This immediately follows from (i) in the same way as for compact normal Hilbert space
operators.
By the above (which also holds over R), every λ ∈ σ(A)\{0} is isolated. Restricting to
F = C, Exercises 13.53, 13.66, 13.68 provide a Riesz idempotent $P_\lambda \in B(V)$ commuting with
A and such that $\sigma(A|_{P_\lambda V}) = \{\lambda\}$ and $\sigma(A|_{(1 - P_\lambda)V}) = \sigma(A)\setminus\{\lambda\}$. Thus $V_\lambda = P_\lambda V$ is a
closed A-invariant subspace. If λ, λ′ ∈ σ(A)\{0} are distinct, $P_\lambda$ and $P_{\lambda'}$ commute since both
are limits of inverses of polynomials in A. It is easy to see that $P_\lambda P_{\lambda'} = P_{\lambda'}P_\lambda = 0$.
14.18 Proposition Let V be a complex Banach space, A ∈ B(V) compact and λ ∈ σ(A)\{0}.
Then
(i) the generalized eigenspace $\bigcup_{n=1}^{\infty}\ker(A - \lambda 1)^n$ of λ coincides with $V_\lambda = P_\lambda V$ and
(ii) is finite-dimensional,
(iii) $(A - \lambda 1)|_{V_\lambda}$ is nilpotent.
Proof. Since $V_\lambda$ is invariant under A, the restriction $A|_{V_\lambda}$ is compact. By construction of $P_\lambda$,
we have $\sigma(A|_{V_\lambda}) = \{\lambda\}$. Since λ ≠ 0, we have $0 \notin \sigma(A|_{V_\lambda})$ so that $A|_{V_\lambda}$ is invertible. Thus
$A|_{V_\lambda}$ is compact and invertible, implying that $V_\lambda$ is finite-dimensional. Since $\sigma(A|_{V_\lambda}) = \{\lambda\}$,
$(A - \lambda 1)|_{V_\lambda}$ is quasi-nilpotent, thus nilpotent by Exercise 13.60. Thus for every $x \in V_\lambda$ we
have $(A - \lambda 1)^n x = 0$ for some n, proving $V_\lambda \subseteq \bigcup_{n=1}^{\infty}\ker(A - \lambda 1)^n$. On the other hand, the fact
that A − λ1 is invertible on $(1 - P_\lambda)V$ implies the converse inclusion $\bigcup_{n=1}^{\infty}\ker(A - \lambda 1)^n \subseteq V_\lambda$.
This concludes the proof.
14.19 Remark An alternative proof for the finite-dimensionality of the generalized eigenspace
(but not its coincidence with $V_\lambda = P_\lambda V$) proceeds as follows:
Let B ∈ B(V) be arbitrary and n ∈ N. If $x \in \ker B^{n+1}$ then $Bx \in \ker B^n$, so that B restricts
to a linear map $\ker B^{n+1} \to \ker B^n$. And $Bx \in \ker B^{n-1} \subseteq \ker B^n$ if and only if $x \in \ker B^n$.
Thus B induces an injective linear map $\ker B^{n+1}/\ker B^n \to \ker B^n/\ker B^{n-1}$, so that $\dim\ker B^{n+1} -
\dim\ker B^n \le \dim\ker B^n - \dim\ker B^{n-1}$. (For n = 1 this is $\dim\ker B^2 - \dim\ker B \le \dim\ker B$.)
Now by induction (or a telescoping sum) we have $\dim\ker B^n \le n\dim\ker B$.
Putting B = A − λ1, by the proof of Proposition 14.2 there is a d such that $\ker B^{n+1} = \ker B^n$
for all n ≥ d. Thus $\dim\bigcup_{n=1}^{\infty}\ker B^n = \dim\ker B^d \le d\dim\ker B < \infty$, where we used Lemma
14.1. □
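The two inequalities of this remark can be observed in finite dimensions. The sketch below assumes that B is a direct sum of nilpotent Jordan blocks, for which $\dim\ker B^n$ equals the sum of min(n, s) over the block sizes s (a standard fact from linear algebra); the block sizes and the function name are hypothetical choices for illustration.

```python
def kernel_dims(block_sizes, max_power):
    """dim ker B^n for n = 0..max_power, where B is a direct sum of nilpotent
    Jordan blocks of the given sizes: each block of size s contributes min(n, s)."""
    return [sum(min(n, s) for s in block_sizes) for n in range(max_power + 1)]

dims = kernel_dims([3, 1, 2], 5)           # hypothetical block sizes
# Successive increments dim ker B^(n+1) - dim ker B^n.
jumps = [dims[n + 1] - dims[n] for n in range(len(dims) - 1)]
print(dims, jumps)
```

The increments are non-increasing and vanish from the size of the largest block onwards, which is the stabilization index d of the remark in this example.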
Now we can proceed as known from finite-dimensional linear algebra and find for each
λ ∈ σ(A)\{0} a basis for Vλ with respect to which A is given by a block diagonal matrix with
a finite number of standard Jordan blocks, the number of these blocks equaling the geometric
multiplicity dim ker(A − λ1) of λ. We refer to the literature, cf. e.g. [55, 84, 95].
If σ(A) is finite, also 0 is an isolated point of σ(A) (if it is in it), so that we have a Riesz
projector $P_0$. Now we have an isomorphism $V \simeq \bigoplus_{\lambda\in\sigma(A)} V_\lambda$. Note that $V_0$ need not be
finite-dimensional, since there is an ample supply of compact quasi-nilpotent operators in infinite
dimensions, e.g. the classical Volterra operator and the weighted shift operators with weight
function decreasing fast enough. An attempt at classifying them would lead us too far.
If σ(A) is infinite, matters are more complicated. At least for every ε > 0 we have an
isomorphism $V \simeq \bigoplus_{\lambda\in\sigma(A),\,|\lambda|>\varepsilon} V_\lambda \oplus V_{\le\varepsilon}$, where $V_{\le\varepsilon}$ is a closed A-invariant subspace with
$\sigma(A|_{V_{\le\varepsilon}}) \subseteq B(0, \varepsilon)$. In this case 0 ∈ σ(A) is not isolated, so that we cannot define a Riesz
projector, but we could put $V_0 = \bigcap_{\varepsilon>0} V_{\le\varepsilon}$. Now $V_0$ can be zero (as for $A \in K(\ell^2)$ defined by
(Af)(n) = f(n)/n) or non-zero, in which case $A|_{V_0}$ again is compact quasi-nilpotent.
For more on spectral theory and normal forms of compact operators see e.g. [24, 135].
Our last major target in this course is proving Theorem 18.4, an analogue of Theorem 14.12
for Hilbert space operators A ∈ B(H) that are normal but not necessarily compact. This will
require extensive preparations, but the mathematics needed is itself a central part of modern
functional analysis.
15.2 Lemma If A, B are unital algebras and α : A → B is a unital algebra homomorphism then
$\sigma_B(\alpha(a)) \subseteq \sigma_A(a)$ ∀a ∈ A.
Proof. If $\lambda \notin \sigma_A(a)$ then $a - \lambda 1_A \in A$ has an inverse b. Then α(b) is an inverse for $\alpha(a - \lambda 1_A) =
\alpha(a) - \lambda 1_B \in B$, thus $\lambda \notin \sigma_B(\alpha(a))$.
15.3 Lemma Let A be a unital Banach algebra. Then every non-zero character ϕ : A → F
satisfies ϕ(1) = 1, ϕ(a) ∈ σ(a) ∀a ∈ A and $\|\varphi\| = 1$, thus ϕ is continuous.
Proof. Since ϕ ≠ 0 we can find a ∈ A with ϕ(a) ≠ 0. Now ϕ(a) = ϕ(a1) = ϕ(a)ϕ(1), and
dividing by ϕ(a) gives ϕ(1) = 1. Thus every non-zero character is a unital homomorphism,
so that Lemma 15.2 gives $\sigma_{\mathbb{F}}(\varphi(a)) \subseteq \sigma_A(a)$. With $\sigma_{\mathbb{F}}(x) = \{x\}$ we have ϕ(a) ∈ σ(a), thus
$|\varphi(a)| \le r(a) \le \|a\|$ by Proposition 13.27, whence $\|\varphi\| \le 1$. Since we require $\|1\| = 1$, we also
have $\|\varphi\| \ge |\varphi(1)|/\|1\| = 1$.
15.4 Definition If A is a unital Banach algebra, the spectrum Ω(A) of A is the set of non-zero
characters ϕ : A → F.
15.5 Exercise Let X be a compact Hausdorff space and A = C(X, F). For every x ∈ X define
$\varphi_x : A \to \mathbb{F}$, $f \mapsto f(x)$. Prove:
(i) $\varphi_x$ is a non-zero character of A, thus $\varphi_x \in \Omega(A)$, for each x ∈ X.
(ii) The map ι : X → Ω(A), $x \mapsto \varphi_x$ is injective.
(iii) For each f ∈ A we have σ(f) = {ϕ(f) | ϕ ∈ Ω(A)}. Do not use Proposition 15.7!
15.6 Remark Since characters are bounded, we have $\Omega(A) \subseteq A^*$. Thus every topology of $A^*$
restricts to a topology on Ω(A). It will turn out that the ‘right’ one is the weak-∗ topology
from Section 10.3, for example since it makes the map ι : X → Ω(A) in the preceding exercise
a homeomorphism. But we defer this discussion to Section 19.1. □
One could hope that σ(a) = {ϕ(a) | ϕ ∈ Ω(A)} holds for every unital Banach algebra A and
a ∈ A. While the inclusion ⊇ always holds, for equality one needs more:
Proof. (i) It should be clear that M = ker ϕ is an ideal, and M ≠ A since ϕ ≠ 0. This ideal has
codimension one since $A/M \cong \varphi(A) = \mathbb{C}$ and therefore is maximal.
(ii) Now let M ⊆ A be a maximal ideal. Since maximal ideals are proper, no element of
M is invertible. If b ∈ M satisfied $\|1 - b\| < 1$ then Lemma 13.19(i) would give invertibility
of b = 1 − (1 − b), a contradiction. (This is the only place where completeness is used.) We
thus have $\|1 - b\| \ge 1$ for all b ∈ M, implying $1 \notin \overline{M}$. Thus $\overline{M}$ is a proper ideal containing M.
Since M is maximal, we have $\overline{M} = M$, thus M is closed. Now by Proposition 6.1(vi), A/M is
a normed algebra, and by a well-known argument from commutative algebra, the maximality
of M implies that A/M is a field, thus a division algebra. Thus $A/M \cong \mathbb{C}$ by the Gelfand-Mazur
theorem (Corollary 13.43), so that there is a unique isomorphism α : A/M → C sending
1 ∈ A/M to 1 ∈ C. If p : A → A/M is the quotient homomorphism then ϕ = α ◦ p : A → C is
a non-zero character with ker ϕ = M. This ϕ clearly is unique. Now Ω(A) ≠ ∅ follows from the
fact that every commutative unital algebra has maximal ideals (by a standard Zorn argument).
(iii) We already know that {ϕ(a) | ϕ ∈ Ω(A)} ⊆ σ(a), so that it remains to prove that for
every λ ∈ σ(a) there is a ϕ ∈ Ω(A) such that ϕ(a) = λ. If λ ∈ σ(a) then a − λ1 ∉ Inv A. Thus
the ideal I = (a − λ1)A ⊆ A does not contain 1 and therefore is proper. Using Zorn’s lemma,
we can find a maximal ideal M ⊇ I. By (ii) there is a ϕ ∈ Ω(A) such that ker ϕ = M. Since
a − λ1 ∈ I ⊆ M = ker ϕ, we have ϕ(a − λ1) = 0, and with ϕ(1) = 1 we have ϕ(a) = λ.
92
If A is an algebra over a field k, one must take care to distinguish between ring ideals I ⊆ A and algebra ideals.
Both are closed under addition and under multiplication by elements of A. Algebra ideals are linear subspaces, thus
also closed under multiplication by the scalars in k. Every algebra ideal clearly is a ring ideal, and the converse holds
if A has a unit (as is assumed here) since cx = (c1)x ∈ I for each c ∈ k and x ∈ I. But a non-unital algebra can have
ring ideals that are not algebra ideals.
15.8 Exercise Let A be a unital Banach algebra over C and a, b ∈ A.
(i) If A is abelian, prove σ(a + b) ⊆ σ(a) + σ(b) and σ(ab) ⊆ σ(a)σ(b). Conclude that
r(a + b) ≤ r(a) + r(b) and r(ab) ≤ r(a)r(b). (No use of Exercise 13.46!)
(ii) Prove that the results of (i) also hold for non-abelian A provided ab = ba.
(iii) Give examples of non-commuting a, b ∈ A = M2×2 (C) for which everything in (i) fails.
Hint: For (ii), use an abelian subalgebra and Exercise 13.50.
15.9 Exercise Prove by way of counterexamples that the statements of Proposition 15.7(ii)+(iii)
all can fail if we drop the commutativity assumption or replace C by R.
15.10 Exercise Let A be a Banach algebra over F and Ã its unitization (Exercise 13.55).
Define ϕ_∞ : Ã → F, (a, α) ↦ α.
(i) Prove ϕ_∞ ∈ Ω(Ã).
(ii) Prove that every ϕ ∈ Ω(A) has a unique extension to ϕ̂ ∈ Ω(Ã).
(iii) Prove Ω(Ã) = {ϕ̂ | ϕ ∈ Ω(A)} ∪ {ϕ_∞}.
(iv) If A is non-unital and a ∈ A, define σ(a) as σ_Ã(a). For A commutative non-unital over C
and a ∈ A, prove
σ(a) = {ϕ(a) | ϕ ∈ Ω(A)} ∪ {0}.
15.11 Exercise Consider the commutative unital Banach algebra A = ℓ¹(Z, C) with convolution
product ⋆ and unit 1 = δ_0. Prove:
(i) For every ϕ ∈ Ω(A) we have ϕ(δ_1) ∈ S¹.
(ii) If ϕ_1, ϕ_2 ∈ Ω(A) satisfy ϕ_1(δ_1) = ϕ_2(δ_1) then ϕ_1 = ϕ_2.
(iii) For every z ∈ S¹ prove that ϕ_z(f) = Σ_{n∈Z} f(n) z^n defines an element of Ω(A).
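The multiplicativity claimed in part (iii) of Exercise 15.11 can be checked numerically on finitely supported sequences. The following is only a sanity check under stated assumptions, not a proof: the dictionary representation and the names convolve and phi are illustrative choices, not notation from these notes.

```python
import cmath

def convolve(f, g):
    """Convolution (f * g)(n) = sum_m f(m) g(n - m) of finitely supported
    sequences on Z, stored as dicts {index: value}."""
    h = {}
    for m, fm in f.items():
        for k, gk in g.items():
            h[m + k] = h.get(m + k, 0) + fm * gk
    return h

def phi(z, f):
    """phi_z(f) = sum_n f(n) z^n (a finite sum for finitely supported f)."""
    return sum(value * z ** n for n, value in f.items())

z = cmath.exp(0.9j)                  # a point of the unit circle S^1
f = {-2: 1.5, 0: -1j, 3: 2.0}
g = {-1: 0.5 + 1j, 1: 4.0}
delta_0 = {0: 1.0}                   # the unit of the convolution algebra
delta_1 = {1: 1.0}

multiplicativity_defect = abs(phi(z, convolve(f, g)) - phi(z, f) * phi(z, g))
unit_defect = abs(phi(z, delta_0) - 1)
modulus_of_phi_delta_1 = abs(phi(z, delta_1))    # equals |z| = 1, cf. part (i)
```

Both defects vanish up to rounding. For general f ∈ ℓ¹(Z, C) one still has to justify exchanging the infinite sums, which is where absolute convergence enters.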
15.12 Exercise Let f ∈ ℓ¹ = ℓ¹(Z, C). With f_m(n) = f(n − m), prove that span_C{f_m | m ∈ Z}
(the finite linear combinations of the translates f_m of f) is dense in ℓ¹(Z, C) if and only if
f̂(z) = Σ_{n∈Z} f(n) z^n vanishes for no z ∈ S¹.
15.13 Remark The result of Exercise 15.12 is the simplest of a whole family of ‘span of translates’
results. These were initiated by a theorem of N. Wiener⁹³: Given f ∈ L¹(R, λ) (λ is
Lebesgue measure), the linear span of the translates of f is dense in L¹(R, λ) if and only if the
Fourier transform f̂(ξ) = ∫ f(x) e^{−iξx} dx (which is a continuous function R → C) vanishes for
no ξ ∈ R. Already this is harder to prove. Cf. e.g. [141, Theorem 9.5] or [27, Chapter 2]. □
93
Norbert Wiener (1894-1964). American mathematician with important contributions to harmonic and functional
analysis and many other fields. See Theorem 19.9 for a related result of his.
15.2 Baby version of holomorphic functional calculus
Functional calculus is concerned with defining f(a) when f is a (suitable) function and a is an
element of a Banach or C∗-algebra or B(H). So far, we have only considered the rather trivial
cases f_1 : x ↦ 1/x (for invertible elements of any Banach algebra) and f_2 : z ↦ z^{1/2} (for positive
Hilbert space operators). The next question is: Determine σ(f(a)). Does it equal f(σ(a))? (For
f_1 it does by Exercise 13.30.) These are the basic questions addressed by the many different
‘functional calculi’ that there are: holomorphic, continuous, Borel, etc.
While functional calculus has many applications, our main one will be the proof of spectral
theorems for normal Hilbert space operators (not necessarily compact) in Section 18.
Defining f (a) poses no problem in the simplest case, which surely is f = P , a polynomial:
15.16 Proposition Let A be a unital Banach algebra over C and let f(z) = Σ_{n=0}^∞ c_n z^n be a
power series with convergence radius R > 0. Then for all a ∈ A satisfying ‖a‖ < R we have:
(i) The series f(a) = Σ_{n=0}^∞ c_n a^n in A converges absolutely.
(ii) σ(f(a)) = f(σ(a)). [Spectral mapping theorem]
Proof. (i) We have Σ_{n=0}^∞ ‖c_n a^n‖ ≤ Σ_{n=0}^∞ |c_n| ‖a‖^n, which converges since ‖a‖ < R and the
power series Σ c_n z^n and Σ |c_n| z^n have the same convergence radius R (as follows from R^{−1} =
lim sup_n |c_n|^{1/n}). Now use Proposition 3.15(ii).
(ii) Let B = {a}′′ ⊆ A. This is a commutative unital Banach algebra with Inv B = B ∩ Inv A,
and f(a) ∈ B. Since every ϕ ∈ Ω(B) is continuous and a unital homomorphism, we have
ϕ(f(a)) = ϕ(lim_{N→∞} Σ_{n=0}^N c_n a^n) = lim_{N→∞} ϕ(Σ_{n=0}^N c_n a^n)
        = lim_{N→∞} Σ_{n=0}^N c_n ϕ(a)^n = Σ_{n=0}^∞ c_n ϕ(a)^n = f(ϕ(a)).
(Note that Σ_{n=0}^∞ c_n ϕ(a)^n converges absolutely since |ϕ(a)| ≤ r(a) ≤ ‖a‖ < R.)
Applying (15.1) to f(a) ∈ B, we have
σ_B(f(a)) = {ϕ(f(a)) | ϕ ∈ Ω(B)} = {f(ϕ(a)) | ϕ ∈ Ω(B)} = {f(λ) | λ ∈ σ_B(a)} = f(σ_B(a)).
Since Exercise 13.50(i) gives σ_B(b) = σ_A(b) for all b ∈ B, we have σ_A(f(a)) = σ_B(f(a)) = f(σ_B(a)) = f(σ_A(a)).
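In the matrix algebra A = M_{2×2}(C) the spectral mapping theorem can be observed numerically. The sketch below takes f = exp (so R = ∞) and compares σ(f(a)) with f(σ(a)) for one sample matrix. The helper names power_series and spectrum are illustrative assumptions, and the series is truncated after 60 terms.

```python
import cmath
import math

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_series(coeff, a, terms=60):
    """Partial sum of sum_n coeff(n) * a^n over n < terms, for a 2x2 matrix a."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]          # a^0 = 1
    for n in range(terms):
        c = coeff(n)
        total = [[total[i][j] + c * power[i][j] for j in range(2)] for i in range(2)]
        power = matmul(power, a)
    return total

def spectrum(x):
    """Eigenvalues of a 2x2 matrix via its characteristic polynomial."""
    tr = x[0][0] + x[1][1]
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return sorted([(tr - root) / 2, (tr + root) / 2], key=lambda z: z.real)

a = [[2.0, 1.0], [1.0, 2.0]]                  # sigma(a) = {1, 3}
exp_a = power_series(lambda n: 1.0 / math.factorial(n), a)

lhs = spectrum(exp_a)                                                # sigma(f(a))
rhs = sorted((cmath.exp(lam) for lam in spectrum(a)), key=lambda z: z.real)  # f(sigma(a))
```

The two lists agree up to rounding. In finite dimensions this also follows from triangularizing a; the point of the proposition is that it holds in every unital Banach algebra over C.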
15.17 Exercise Let A be a unital Banach algebra and a ∈ A. Let Σ_{n=0}^∞ c_n z^n be a power
series with convergence radius R > 0. Prove that for the absolute convergence of Σ_{n=0}^∞ c_n a^n it
suffices that r(a) < R.
For P ∈ C[x], continuity of the map a 7→ P (a) is evident in every topological algebra. The
analogous result for a power series requires more work:
15.22 Exercise (One parameter groups I) Let A be a Banach algebra with unit 1 over
R or C. Let exp : A → A be defined as above. Prove:
(i) ‖e^a − 1‖ ≤ e^{‖a‖} − 1 ∀a ∈ A.
(ii) If ab = ba then e^{a+b} = e^a e^b = e^b e^a.
(iii) The map R → A, t ↦ W(t) = e^{ta} is a one-parameter group (thus W(0) = 1 and W(s + t) =
W(s)W(t) ∀s, t ∈ R).
(iv) t ↦ e^{ta} is norm-continuous. Do this using (1a) and (1c), not Exercise 15.18.
(v) lim_{t→0} (e^{ta} − 1)/t = a.
(vi) If a, b ∈ A and e^{ta} e^{sb} = e^{sb} e^{ta} for all s, t ∈ R then ab = ba.
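Before proving parts (i), (iii) and (v) of Exercise 15.22, one can test them numerically in A = M_{2×2}(R). This is a minimal sketch under assumptions: the norm is taken to be the maximal absolute row sum (one possible submultiplicative norm with ‖1‖ = 1), the exponential series is truncated after 40 terms, and all function names are illustrative.

```python
import math

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(a, terms=40):
    """Partial sum of the exponential series sum_n a^n / n! (n < terms)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]                       # a^0 / 0!
    for n in range(1, terms + 1):
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
        term = [[v / n for v in row] for row in matmul(term, a)]   # a^n / n!
    return total

def norm(x):
    """Maximal absolute row sum: submultiplicative, with norm(1) = 1."""
    return max(sum(abs(v) for v in row) for row in x)

def scale(t, x):
    return [[t * v for v in row] for row in x]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

one = [[1.0, 0.0], [0.0, 1.0]]
a = [[0.0, 1.0], [-2.0, 0.5]]

def W(t):
    return expm(scale(t, a))

s, t = 0.3, -1.1
group_defect = norm(sub(W(s + t), matmul(W(s), W(t))))           # (iii)
bound_gap = (math.exp(norm(a)) - 1) - norm(sub(expm(a), one))    # (i): non-negative
h = 1e-6
derivative_defect = norm(sub(scale(1 / h, sub(W(h), one)), a))   # (v): of order h
```

Here group_defect is at rounding level, bound_gap is non-negative, and derivative_defect is of the order of h, as the exercise predicts.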
15.23 Remark One can find 2 × 2 matrices a, b such that e^{a+b} = e^a e^b = e^b e^a, but ab ≠ ba, as
well as non-commuting 2 × 2 matrices a, b such that e^{a+b} = e^a e^b ≠ e^b e^a. There is an extensive
literature on this phenomenon. We mention one interesting result [150]: If A is a unital Banach
algebra, a, b ∈ A such that e^a e^b = e^b e^a and σ(a) and σ(b) are 2πi-congruence free then ab = ba.
Here Ω ⊆ C is called 2πi-congruence free if λ, λ′ ∈ Ω, λ − λ′ ∈ 2πiZ implies λ = λ′. (In
particular e^a e^b = e^b e^a does imply ab = ba if σ(a), σ(b) ⊂ R.) □
15.24 Exercise (One parameter groups II) Let A be a unital Banach algebra.
(i) Local inverse for exp.
(a) The logarithm function (more precisely the branch for which z > 0 ⇒ log z ∈ R) has
a unique power series expansion g(z) = Σ_{n=1}^∞ c_n (z − 1)^n around z = 1. Prove that it
has convergence radius one.
(b) For a ∈ A with ‖a − 1‖ < 1 we can define log(a) ∈ A using the power series g. Prove:
If a, b ∈ A commute and ‖a − 1‖ < 1, ‖b − 1‖ < 1 then log a and log b commute.
(c) Prove: If ‖a‖ < log 2 then ‖e^a − 1‖ < 1 and log(e^a) = a. And if ‖b − 1‖ < 1 then
exp(log b) = b.
(d) Let 0 ∈ U = B(0, (log 2)/2) ⊂ A and V = exp(U) ∋ 1. Prove that exp : U → V is a
homeomorphism with inverse log : V → U.
(e) Prove that if a, b ∈ V commute and ab ∈ V then log(ab) = log a + log b.
(ii) Let V be a Banach space, ε > 0 and f : (−ε, ε) → V continuous and satisfying f(0) = 0
and f(s + t) = f(s) + f(t) whenever s, t, s + t ∈ (−ε, ε). Prove that there exists a unique
x ∈ V such that f(t) = tx ∀t ∈ (−ε, ε).
(iii) Now let R → A, t ↦ W(t) be a norm-continuous one parameter group. Use the above
results to prove that there is a unique a ∈ A such that W(t) = e^{ta} for all t ∈ R.
15.25 Remark For most applications of one-parameter groups, norm-continuity is too strong
a requirement. To get further one considers one-parameter groups (or semigroups, defined only
for t ≥ 0) in B(V) that are only strongly continuous, i.e. lim_{t→0} W(t)x = x ∀x ∈ V. A typical
result then is Stone’s theorem according to which the unitary one-parameter groups on Hilbert
spaces are of the form W(t) = e^{itA}, where A is a possibly unbounded self-adjoint operator, cf.
e.g. [128]. The subject of operator semigroups is huge, cf. [50] for an introduction. □
(iii) Let R_1, . . . , R_n > 0 and assume that Σ_{j∈N_0^n} c_{j_1,...,j_n} z_1^{j_1} · · · z_n^{j_n} converges whenever |z_i| <
R_i ∀i, defining an analytic function f on this domain. Assuming ‖a_i‖ < R_i ∀i, define
f(a_1, . . . , a_n) ∈ A in analogy to the case n = 1 above. Prove
15.27 Remark 1. Example: σ(a + b) = {α + β | (α, β) ∈ σ(a, b)}, improving on Exercise 15.8.
2. In more general situations, like non-abelian algebras and non-commuting operators, there
exist various definitions of the joint spectrum, not necessarily equivalent. □
16 Basics of C ∗-algebras
16.1 Involutions. Definition of C ∗ -algebras
The properties of the adjoint map A 7→ A∗ on B(H) motivate some definitions:
16.4 Lemma Every C ∗ -algebra is a Banach ∗-algebra. If it has a unit 1 then k1k = 1.
Proof. With the C∗-identity and submultiplicativity we have ‖a‖² = ‖a∗a‖ ≤ ‖a∗‖‖a‖, thus
‖a‖ ≤ ‖a∗‖ for all a ∈ A. Replacing a by a∗ herein gives the converse inequality, thus ‖a∗‖ = ‖a‖.
If 1 is a unit then ‖1‖² = ‖1∗1‖ = ‖1∗‖ = ‖1‖, and since ‖1‖ ≠ 0 this implies ‖1‖ = 1.
16.5 Remark 1. Clearly B(H) is a C ∗ -algebra for each Hilbert space H. Since this holds also
for real Hilbert spaces, it shows that one can discuss Banach ∗-algebras and C ∗ -algebras over
R. But we will consider only complex ones.
2. There is no special name for the non-complete variants of the above definitions. But a
submultiplicative norm on a ∗-algebra satisfying the C ∗ -identity is called a C ∗ -norm, whether
A is complete w.r.t. it or not. Completion of a ∗-algebra w.r.t. a C ∗ -norm gives a C ∗ -algebra,
and this is an important way of constructing new C∗-algebras. □
16.6 Exercise For n ∈ N, consider A = M_{n×n}(C) with the usual ∗-algebra structure. Prove
that ‖a‖ = (Σ_{i,j=1}^n |a_{i,j}|²)^{1/2} defines a norm that satisfies submultiplicativity and ‖a∗‖ = ‖a‖.
Thus (A, ‖·‖) is a Banach ∗-algebra. Prove that it is not a C∗-algebra when n ≥ 2.
94
The original definition by Gelfand and Naimark (1942) had the additional axiom that a∗ a + 1 be invertible for
each a. This turned out to be redundant, cf. Proposition 17.6.
16.7 Exercise Recall the Banach algebra A = ℓ¹(Z, C) from Example 13.48. Show that both
f∗(n) = \overline{f(n)} and f∗(n) = \overline{f(−n)} define involutions on A making it a Banach ∗-algebra. Show
that neither of them satisfies the C∗-identity.
16.8 Exercise Equip the Banach algebra from Exercise 13.51 with the ∗-operation f∗(x) =
\overline{f(x)}. Is it a C∗-algebra?
16.9 Lemma Let X be a compact space. For f ∈ C(X, C), define f∗ by f∗(x) = \overline{f(x)}. Then
C(X, C) is a C∗-algebra. The same holds for C_b(X, C), where X is arbitrary, thus also for
ℓ^∞(S, C).
Proof. We know that C(X, C) equipped with the norm ‖f‖ = sup_x |f(x)| is a Banach algebra.
It is immediate that ∗ is an involution. The computation
‖f∗f‖ = sup_x |\overline{f(x)} f(x)| = sup_x |f(x)|² = (sup_x |f(x)|)² = ‖f‖²
proves the C∗-identity. It is clear that this generalizes to the bounded continuous functions on
any space X.
In a sense, the examples B(H) and C(X, C) for compact X are all there is: One can prove,
as we will do in Theorem 19.12, that every commutative unital C∗-algebra is isometrically
∗-isomorphic to C(X, C) for some compact Hausdorff space X, determined uniquely up to
homeomorphism. (For example one has ℓ^∞(S, C) ≅ C(βS, C), where βS is the Stone-Čech
compactification of (S, τ_disc).) And one can prove that every C∗-algebra is isometrically
∗-isomorphic to a norm-closed ∗-subalgebra of B(H) for some Hilbert space H. See e.g. [110].
16.10 Exercise If A is a C∗-algebra and Ã its unitization (Exercise 13.55), the norm ‖(a, α)‖_1 =
‖a‖ + |α| on Ã usually fails to be a C∗-norm. Define ‖(a, α)‖ = sup_{b∈A, ‖b‖≤1} ‖ab + αb‖. Prove:
(i) ‖·‖ is an algebra norm on Ã if and only if A is non-unital, which is assumed from now
on.
(ii) It satisfies ‖(a, α)‖ ≤ ‖(a, α)‖_1 and ‖(a, 0)‖ = ‖a‖ ∀a ∈ A, thus ι : A ↪ Ã is an isometry.
(iii) ‖·‖ is a C∗-norm.
(iv) (Ã, ‖·‖) is complete and the norms ‖·‖_1, ‖·‖ on Ã are equivalent.
16.12 Exercise Let A be a Banach ∗-algebra and I ⊆ A a closed self-adjoint two-sided ideal.
Prove:
(i) A/I has a natural ∗-operation so that p : A → A/I is a ∗-homomorphism.
(ii) With this ∗-operation and the quotient norm, A/I is a Banach ∗-algebra.
(iii) If α : A → B is a ∗-homomorphism such that I ⊆ ker α then the induced map α′ : A/I →
B, cf. Proposition 6.1(v), is a ∗-homomorphism.
16.13 Remark If A is a C ∗ -algebra one can prove that every closed two sided ideal I ⊆ A
automatically is self-adjoint and that A/I is a C ∗ -algebra. But the proofs would lead us too
far, cf. e.g. [110, Theorems 3.1.3, 3.1.4]. □
The self-adjoint, respectively unitary, elements of the C ∗ -algebra A = C are the real num-
bers and the phases (|z| = 1). Thus self-adjoint and unitary elements of a C ∗ -algebra should
be thought of as generalized real numbers and phases, respectively. Also the real-imaginary
decomposition generalizes:
16.14 Lemma If A is a ∗-algebra and a ∈ A, we define Re(a) = (a + a∗)/2, Im(a) = (a − a∗)/(2i). Now
(i) Re(a), Im(a) are self-adjoint and a = Re(a) + i Im(a).
(ii) The representation a = b + ic with b, c self-adjoint is unique for each a.
(iii) a is self-adjoint if and only if Im(a) = 0.
(iv) a is normal if and only if Re(a) and Im(a) commute.
Proof. Mostly trivial computations. We only prove (ii): If b, b′, c, c′ are self-adjoint and b + ic =
b′ + ic′ then b − b′ = i(c′ − c). This implies b − b′ = (b − b′)∗ = −i(c′ − c) = −(b − b′). Thus
b = b′ and in turn c = c′.
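The statements of Lemma 16.14 are easy to test in the ∗-algebra M_{2×2}(C). The sketch below, with illustrative helper names that are not part of the notes, computes Re(a) and Im(a) for one non-normal and one normal matrix and compares normality with commutativity of the two parts.

```python
def adjoint(x):
    """Conjugate transpose of a 2x2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(s, x, t, y):
    """The linear combination s*x + t*y."""
    return [[s * x[i][j] + t * y[i][j] for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

def re(a):
    return combine(0.5, a, 0.5, adjoint(a))          # (a + a*)/2

def im(a):
    return combine(1 / 2j, a, -1 / 2j, adjoint(a))   # (a - a*)/(2i)

def commute(x, y):
    return close(matmul(x, y), matmul(y, x))

def is_normal(a):
    return commute(a, adjoint(a))

a = [[1 + 2j, 3], [1j, 0]]      # not normal
n = [[1, 1j], [1j, 1]]          # normal, but not self-adjoint

checks = []
for m in (a, n):
    checks.append(close(re(m), adjoint(re(m))))                # Re(m) self-adjoint
    checks.append(close(im(m), adjoint(im(m))))                # Im(m) self-adjoint
    checks.append(close(combine(1, re(m), 1j, im(m)), m))      # m = Re(m) + i Im(m)
    checks.append(is_normal(m) == commute(re(m), im(m)))       # part (iv)
```

All entries of checks are True, in line with parts (i) and (iv) of the lemma.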
16.15 Exercise Prove that a ∗-algebra is commutative if and only if every element is normal.
There are several reasons why normal elements are important. An element a ∈ A is normal
if and only if there is a ∗-closed commutative subalgebra B ⊆ A containing a. We will see that
normal elements behave like functions on a (locally) compact space.
While ab = ba clearly implies a∗ b∗ = b∗ a∗ , it need not follow that a∗ commutes with b (or
equivalently a with b∗ ). To see this just pick any non-normal a ∈ A and take b = a. But:
16.16 Theorem (Fuglede 1950) Let A be a unital C ∗ -algebra, and let a, b be commuting
elements at least one of which is normal. Then a∗ b = ba∗ (and ab∗ = b∗ a).
The theorem is quite remarkable, and asked for a proof one probably wouldn’t know where to
begin. For matrices it actually is quite easy: Normality of a implies that a is diagonalizable, i.e.
a = Σ_i λ_i P_i, where the λ_i are the (distinct) eigenvalues and the P_i orthogonal projections onto
the corresponding eigenspaces, see e.g. [55, Theorem 6.16]. Now ab = ba implies P_i b = bP_i ∀i.
Taking adjoints gives P_i b∗ = b∗P_i ∀i, whence ab∗ = b∗a. With effort, this argument can be
extended to operators on infinite-dimensional Hilbert spaces, cf. [160]. But there is a much
more elegant argument that works in all C∗-algebras, for which we refer to Section B.13.1.
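The matrix case of Fuglede's theorem can be illustrated numerically. In the sketch below, a is a normal 3 × 3 matrix with a degenerate eigenvalue and b is a non-normal matrix commuting with a. The particular matrices, the unitary u and the helper names are arbitrary choices made for this illustration only.

```python
import math

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(x):
    n = len(x)
    return [[x[j][i].conjugate() for j in range(n)] for i in range(n)]

def commutator_size(x, y):
    """Largest absolute entry of xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return max(abs(xy[i][j] - yx[i][j]) for i in range(n) for j in range(n))

c, s = math.cos(0.7), math.sin(0.7)
u = [[c, 0, -s], [0, 1j, 0], [s, 0, c]]          # a unitary change of basis

def conjugate_by_u(m):
    return matmul(matmul(u, m), adjoint(u))

# a is normal with the double eigenvalue 2+i; b acts as a non-normal block on
# that eigenspace, hence commutes with a without being normal itself.
a = conjugate_by_u([[2 + 1j, 0, 0], [0, 2 + 1j, 0], [0, 0, -1]])
b = conjugate_by_u([[1, 5, 0], [0, 3j, 0], [0, 0, 7]])

ab_defect = commutator_size(a, b)                  # ab = ba
fuglede_defect = commutator_size(adjoint(a), b)    # a*b = ba*
b_non_normality = commutator_size(b, adjoint(b))   # far from zero
```

Both ab_defect and fuglede_defect are at rounding level, while b_non_normality is large: normality of only one of the two commuting elements suffices.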
Proof. (i) The proof of Proposition 13.69(i) works identically in every abstract C∗-algebra.
(ii) By unitarity of u we have ‖u‖² = ‖u∗u‖ = ‖1‖ = 1, thus ‖u‖ = 1 = ‖u∗‖. With u^{−1} = u∗
we also have ‖u^{−1}‖ = 1, and Exercise 13.30(ii) gives σ(u) ⊆ S¹.
(iii) Given λ ∈ σ(a), write λ = α + iβ with α, β ∈ R. Applying σ(a + z1) = σ(a) + z ∀z ∈ C
(why?) to z = −α + inβ, where n ∈ N, we have iβ(n + 1) = α + iβ − α + inβ ∈ σ(a − α1 + inβ1).
Thus with r(c) ≤ ‖c‖ (Proposition 13.27), the C∗-identity and ‖1‖ = 1 we have
16.18 Remark 1. Since (i) implies ‖a‖ = ‖a∗a‖^{1/2} = r(a∗a)^{1/2} for all a ∈ A and the spectral
radius r(a) by definition depends only on the algebraic structure of A, the latter also determines
the norm, which therefore is unique in a C∗-algebra! But note that the conclusion ‖·‖_1 = ‖·‖_2
for C∗-norms ‖·‖_{1,2} only follows if A is complete with respect to both norms!
2. The proof of (iii) is short and uses only r(c) ≤ ‖c‖ and the C∗-identity. A less direct,
but perhaps more insightful argument uses the exponential function, cf. Example 15.19: If
a = a∗ ∈ A then u_t = e^{ita} with t ∈ R satisfies u_t∗ = e^{−ita}, thus u_t u_t∗ = u_t∗ u_t = 1, so that u_t is
unitary. Now (ii) gives σ(u_t) ⊆ S¹, and σ(a) ⊆ R follows from the spectral mapping theorem
or Exercise 15.21.
For another application of the unitarity of e^{ia} for self-adjoint a see Section B.13.1. □
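The unitarity of u_t = e^{ita} for self-adjoint a, used in part 2 of the remark, can be watched at work in M_{2×2}(C). The following sketch truncates the exponential series after 40 terms. The sample matrix, the value of t and the helper names are assumptions made for the illustration.

```python
import cmath

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(x):
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def expm(a, terms=40):
    """Partial sum of the exponential series sum_n a^n / n! (n < terms)."""
    total = [[0j, 0j], [0j, 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms + 1):
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
        term = [[v / n for v in row] for row in matmul(term, a)]
    return total

a = [[1.0, 2 - 1j], [2 + 1j, -0.5]]                    # self-adjoint
t = 0.8
u = expm([[1j * t * v for v in row] for row in a])     # u_t = e^{ita}

uu = matmul(u, adjoint(u))
unitarity_defect = max(abs(uu[i][j] - (1 if i == j else 0))
                       for i in range(2) for j in range(2))

# Eigenvalues of u_t from its characteristic polynomial; they lie on S^1.
tr = u[0][0] + u[1][1]
det = u[0][0] * u[1][1] - u[0][1] * u[1][0]
root = cmath.sqrt(tr * tr - 4 * det)
moduli = [abs((tr - root) / 2), abs((tr + root) / 2)]
```

The defect is at rounding level and both moduli equal 1 up to rounding, matching σ(u_t) ⊆ S¹.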
The applications of Proposition 16.17(i) discussed in Section 13.4 (with the exception of the
result on |||A|||) hold in all abstract C∗-algebras. Proposition 16.17(iii) can be used to improve
on the results of Section 13.2, showing that C ∗ -algebras are better behaved than general Banach
algebras:
16.21 Exercise Give an example of a unital C ∗ -algebra A and a ∈ A showing that σ(a) ⊆
[0, ∞) does not imply a = a∗ !
16.22 Exercise Let A be a unital C ∗ -algebra. Without using results proven later, prove:
(i) If a ∈ Asa then a2 ≥ 0.
(ii) If a, b ∈ A are positive and a + b = 0 then a = b = 0.
(iii) If a, b ∈ A are positive and ab = ba then a + b is positive.
(iv) If c ∈ A is normal then c∗ c is positive. Hint: Lemma 16.14.
where the first equality is due to Proposition 16.17(i), the second is the definition of r and the
third comes from the spectral mapping theorem (Proposition 15.16(ii) or Exercise 15.15).
Even though we are after a result for all normal operators, we first consider self-adjoint
operators:
17.2 Theorem Let A be a unital C∗-algebra and a = a∗ ∈ A. Then there is a unique continuous
∗-homomorphism α_a : C(σ(a), C) → A such that α_a(P) = P(a) for all polynomials P.
(Usually we will write f(a) instead of α_a(f).) It satisfies
(i) α_a is an isometry: ‖α_a(f)‖ = sup_{λ∈σ(a)} |f(λ)|.
(ii) The image of α_a is the smallest C∗-subalgebra B ⊆ A containing 1 and a. The map
α_a : C(σ(a), C) → B is a ∗-isomorphism. f(a) is self-adjoint if and only if f is real-valued.
(iii) σ(α_a(f)) = f(σ(a)) = {f(λ) | λ ∈ σ(a)}. (Spectral mapping theorem)
(iv) If f ∈ C(σ(a), R), g ∈ C(f(σ(a)), C) then α_a(g ∘ f) = α_{α_a(f)}(g), or just g(f(a)) = (g ∘ f)(a).
(We require f to be real-valued in order for f(a) to be self-adjoint.)
Proof. (i) By Propositions 13.27 and 16.17(iii), we have σ(a) ⊆ [−‖a‖, ‖a‖]. By the classical
Weierstrass approximation theorem, cf. Theorem A.32, for every continuous function
f : [c, d] → C and ε > 0 there is a polynomial P such that |f(x) − P(x)| ≤ ε for all x ∈
[c, d]. We cannot apply this directly since σ(a), while contained in an interval, need not be an
entire interval. But using Tietze’s Extension Theorem A.31, we can find (very non-uniquely)
a continuous function g : [−‖a‖, ‖a‖] → C that coincides with f on σ(a). Now this g can
be approximated uniformly by polynomials thanks to Weierstrass’ theorem. (Alternatively,
apply the more abstract Stone-Weierstrass theorem directly to f.) In any case, the restrictions
of the polynomials to σ(a) are dense in C(σ(a), C) w.r.t. ‖·‖_∞. By Proposition 17.1, the map
C(σ(a), C) ⊇ C[x]|_{σ(a)} → A, P ↦ P(a) is an isometry. Thus applying Lemma 3.12 we obtain
a unique isometry α_a : C(σ(a), C) → A extending P ↦ P(a). Since C(σ(a), C) is complete, its
image under α_a is closed, thus equal to the closure C∗(1, a) of {P(a) | P ∈ C[x]}. Thus (i) is
proven up to the claim that α_a is a ∗-homomorphism. This is left as an exercise.
(ii) Since α_a : C[x] → A is a ∗-homomorphism, B := α_a(C(σ(a), C)) ⊆ A is a ∗-subalgebra.
And since α_a is an isometry by (i) and (C(σ(a), C), ‖·‖_∞) is complete, B is closed, thus a
C∗-algebra. Since α_a maps the constant-one function to 1 ∈ A and the inclusion map σ(a) ↪ C
to a, B contains 1, a. Conversely, the smallest C∗-subalgebra of A containing 1 and a clearly
is obtained by taking the norm-closure of the set {P(a) | P ∈ C[z]}, which is contained in
the image of α_a. As the continuous extension of a ∗-homomorphism, α_a : C(σ(a), C) → A
is a ∗-homomorphism. Finally f(a)∗ = α_a(f)∗ = α_a(f∗). Since α_a is injective, this equals
f(a) = α_a(f) if and only if f = f∗, which is equivalent to real-valuedness of f.
(iii) Let f ∈ C(σ(a), C). Then clearly αa (f ) ∈ B. Now
σA (αa (f )) = σB (αa (f )) = σC(σ(a),C) (f ) = f (σ(a)),
where the equalities come from Theorem 16.19, from the fact that αa : C(σ(a), C) → B is a
∗-isomorphism, and from Exercise 13.24, respectively.
(iv) If {P_n} is a sequence of polynomials converging to f uniformly on σ(a) and {Q_n} is a
sequence of polynomials converging to g uniformly on σ(f(a)), then Q_n ∘ P_n converges uniformly
to g ∘ f, thus Q_n(P_n(a)) = (Q_n ∘ P_n)(a) converges to (g ∘ f)(a). On the other hand, {Q_n(P_n(a))}
converges in norm to g(f(a)).
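For a self-adjoint matrix the functional calculus of Theorem 17.2 becomes completely explicit: σ(a) is finite, so every f on σ(a) agrees with an interpolating polynomial, and f(a) = Σ_i f(λ_i) P_i with the spectral projections P_i. The sketch below does this for a 2 × 2 example with two distinct eigenvalues. The function names spectrum_sa and calculus are illustrative assumptions, not notation from these notes.

```python
import math

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

def spectrum_sa(a):
    """The two real eigenvalues of a self-adjoint 2x2 matrix, in increasing order."""
    tr = (a[0][0] + a[1][1]).real
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    root = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return (tr - root) / 2, (tr + root) / 2

def calculus(f, a):
    """f(a) = f(l1) P1 + f(l2) P2 with the spectral projections P1, P2 of a.
    Assumes that the eigenvalues l1 < l2 are distinct."""
    l1, l2 = spectrum_sa(a)
    one = [[1, 0], [0, 1]]
    p1 = [[(a[i][j] - l2 * one[i][j]) / (l1 - l2) for j in range(2)] for i in range(2)]
    p2 = [[(a[i][j] - l1 * one[i][j]) / (l2 - l1) for j in range(2)] for i in range(2)]
    return [[f(l1) * p1[i][j] + f(l2) * p2[i][j] for j in range(2)] for i in range(2)]

a = [[2.0, 1j], [-1j, 2.0]]                       # self-adjoint, sigma(a) = {1, 3}

root_a = calculus(math.sqrt, a)                   # f(x) = sqrt(x)
a_squared = calculus(lambda x: x * x, a)          # polynomial case: must equal a*a
back = calculus(math.sqrt, a_squared)             # (iv) with g(f(x)) = |x| = x on sigma(a)
```

Here root_a squares to a, a_squared coincides with the matrix product a·a and has spectrum {1, 9} = f(σ(a)), and back returns a, illustrating the homomorphism property, (iii) and (iv).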
17.2 Positive elements of a C ∗ -algebra II. Absolute value
Using the functional calculus, we can continue the considerations begun in Section 16.3.
17.5 Exercise (i) Define f+ , f− : R → R by f+ (x) = max(x, 0), f− (x) = − min(x, 0).
Prove: 1. f+ f− = 0, 2. f± (x) = (|x| ± x)/2, 3. f± ∈ C(R, R).
(ii) Let now A be a unital C ∗ -algebra and a = a∗ ∈ A. Define a± ∈ A by functional calculus
as a± = f± (a). Prove: 1. a+ − a− = a and a+ + a− = |a|, 2. a+ a− = a− a+ = 0, 3.
a+ ≥ 0, a− ≥ 0.
17.7 Exercise Let A be a unital C ∗ -algebra and a, b ∈ A with a ≥ 0. Use Proposition 17.6 to
prove that bab∗ ≥ 0. Conclude that a, c ∈ Asa , a ≤ c ⇒ bab∗ ≤ bcb∗ .
For elements of the C ∗ -algebra B(H), where H is a Hilbert space, we have two competing
definitions of positivity. Luckily there is no conflict:
17.9 Proposition If H is a complex Hilbert space and A ∈ B(H), the following are equivalent:
(i) C∗-algebraic positivity: A = A∗ and σ(A) ⊆ [0, +∞).
(ii) Operator positivity: ⟨Ax, x⟩ ≥ 0 for all x ∈ H, equivalently W(A) ⊆ [0, +∞).
Proof. If A is C ∗ -positive then by Proposition 17.8 there is a B = B ∗ ∈ B(H) such that
A = B 2 = B ∗ B. Thus A is operator positive by Exercise 11.39(ii).
If A is operator positive then A = A∗ by Proposition 11.22, and using Exercise 13.15 we
have σ(A) ⊆ W (A) ⊆ [0, ∞). Thus A is C ∗ -positive.
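Both notions of positivity in Proposition 17.9 can be compared on H = C². The sketch below takes A = B∗B, checks that its smallest eigenvalue is non-negative, and samples the quadratic form ⟨Ax, x⟩ on random vectors. The inner product is taken linear in the first argument. The matrices, the sample size and the helper names are arbitrary choices for this illustration.

```python
import random

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(x):
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def form(m, x):
    """The quadratic form <m x, x>, with <v, w> = sum_i v_i * conj(w_i)."""
    mx = [m[i][0] * x[0] + m[i][1] * x[1] for i in range(2)]
    return sum(mx[i] * x[i].conjugate() for i in range(2))

b = [[1, 2j], [0, 1]]
a = matmul(adjoint(b), b)                 # a = b*b = [[1, 2i], [-2i, 5]]

tr = (a[0][0] + a[1][1]).real
det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
smallest_eigenvalue = (tr - (tr * tr - 4 * det) ** 0.5) / 2

rng = random.Random(0)
samples = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
           for _ in range(1000)]
forms = [form(a, x) for x in samples]

c = [[1, 2], [2, 1]]                      # self-adjoint with eigenvalues -1 and 3
witness = form(c, [1, -1])                # quadratic form on an eigenvector for -1
```

All sampled values of ⟨Ax, x⟩ are real and non-negative, as is the smallest eigenvalue of A, whereas the self-adjoint matrix c with a negative eigenvalue yields the negative value witness.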
17.10 Exercise Let A be a unital C∗-algebra. Prove |a| = √(a²) for all a ∈ A_sa.
With Proposition 17.6 we can define the ‘absolute value’ of all elements, not only the self-
adjoint ones:
17.12 Exercise (i) Let A be a unital C ∗ -algebra and a ∈ A. Prove that |a| = |a∗ | holds if
and only if a is normal.
(ii) Find counterexamples in A = M2×2 (C) disproving |a + b| ≤ |a| + |b| and |ab| ≤ |a| |b|.
In Section 11.8 we have proven polar decomposition for the C ∗ -algebras A = B(H), but this
does not generalize to arbitrary C ∗ -algebras:
17.13 Exercise (i) If A is a unital C ∗ -algebra and a ∈ Inv(A), prove that |a| ∈ InvA. Use
this to define u ∈ A by a = u|a| and prove that u is unitary.
(ii) Give an example of a unital C ∗ -algebra A and a ∈ A such that there is no b ∈ A with
a = b|a|.
17.14 Remark We have seen in Exercise 17.13 that in a unital C ∗ -algebra one does not always
have polar decomposition. However, there is a class of particularly nice C ∗ -subalgebras A ⊆
B(H), the von Neumann algebras, such that for each A ∈ A one has not only |A| ∈ A, but also
V ∈ A, where V is as in Proposition 11.44. Cf. e.g. [110, Theorem 4.1.10]. □
In Remark 13.28 we showed r(a) ≤ inf_{b∈Inv A} ‖bab^{−1}‖ for every element of a unital Banach
algebra. For C∗-algebras, we can now prove this to be an equality:
fail to be uniformly dense in C(σ(a), C). (All functions that are uniform limits of polynomials
on sufficiently large subsets of C are holomorphic so that, e.g. f(z) = Re z cannot be approximated
by polynomials in z = x + iy.) But with σ(a) ⊆ C ≅ R² and considering functions on (a
subset of) C as functions of two real variables, the polynomials in x, y are dense in C(σ(a), C)
by the higher dimensional version of the classical Weierstrass theorem, cf. Theorem A.38. Thus
also the polynomials in z = x + iy and z̄ = x − iy are dense⁹⁶. If P = Σ_{i,j=0}^N c_{ij} z^i z̄^j ∈ C[z, z̄]
17.17 Lemma Let A be a unital C∗-algebra. Then every character ϕ ∈ Ω(A) satisfies ϕ(c∗) =
\overline{ϕ(c)} for all c ∈ A, i.e. is a ∗-homomorphism.
Proof. We have c = a + ib, where a = Re(c), b = Im(c) are self-adjoint. Now σ(a) ⊆ R by
Proposition 16.17(iii), thus ϕ(a) ∈ σ(a) ⊆ R by Lemma 15.3. Similarly ϕ(b) ∈ R. Thus
ϕ(c∗) = ϕ(a − ib) = ϕ(a) − iϕ(b) = \overline{ϕ(a) + iϕ(b)} = \overline{ϕ(a + ib)} = \overline{ϕ(c)},
where the third equality used that ϕ(a), ϕ(b) ∈ R as shown before.
Appealing to Theorem 16.19, we get σ_A(P(a, a∗)) = σ_B(P(a, a∗)). Since P(a, a∗) is normal,
with Proposition 16.17(i) we have ‖P(a, a∗)‖ = r(P(a, a∗)) = sup_{λ∈σ(a)} |P(λ, λ̄)|.
17.19 Remark The main difficulty in the construction of the continuous functional calculus for
normal operators is proving (17.1). (By contrast, Proposition 17.1 has an elementary proof since
it only uses the spectral mapping theorem for polynomials.) The above proof is efficient and
elegant, but it relies on Zorn’s lemma via Proposition 15.7. Avoiding this at the present level
of generality is possible, but quite cumbersome, cf. e.g. [155, Section 7.4], which makes massive
use of holomorphic functional calculus including a version for several commuting operators.
For normal Hilbert space operators there are proofs [14, 174] (see also [113, Prop. 8.21]) that are
reasonably elementary, but tricky and ad hoc. □
96
We allow ourselves the harmless sloppiness of not distinguishing between elements of the ring C[z, z̄] (where z, z̄
are independent variables) and the functions C → C, z ↦ f(z, z̄) induced by them.
Proof of Theorem 17.16 Essentially as that of Theorem 17.2, now using the density of the
polynomials in z, z̄ in C(σ(a), C) as explained before and using Proposition 17.18 instead of
Proposition 17.1.
17.22 Exercise (i) Let A be a unital C∗-algebra and u ∈ A unitary (thus σ(u) ⊆ S¹). Prove
that if σ(u) ≠ S¹, there exists a ∈ A_sa such that e^{ia} = u.
(ii) Give an example of a unital C∗-algebra A and a unitary u ∈ A such that there is no
a ∈ A_sa with u = e^{ia}.
The following is a preview of later developments in infinitely many dimensions:
17.23 Exercise Let H be a finite-dimensional Hilbert space and A ∈ B(H) normal. Prove:
(i) A = Σ_{i=1}^n λ_i P_i, where the P_i are the orthogonal projections onto the eigenspaces of A and
the λ_i are the associated eigenvalues.
(ii) For any function f : {λ_1, . . . , λ_n} → C, the f(A) provided by continuous functional calculus
coincides with f(A) = Σ_{i=1}^n f(λ_i) P_i.
With the help of continuous functional calculus for normal operators, we can improve on the
results obtained in Exercises 13.66 and 13.68 (for Banach space operators):
to C, we have A = A1 ⊕A2 = π1 (z1 )⊕π2 (z2 ). Thus σ(A1 ) = σ(π1 (z1 )) ⊆ σ(z1 ) = Σ. Analogously
σ(A2 ) ⊆ σ(A)\Σ. Now in view of σ(A) = σ(A1 ) ∪ σ(A2 ), we have σ(A1 ) = Σ, σ(A2 ) ⊆ σ(A)\Σ.
(ii) Since λ is isolated, Σ = {λ} ⊆ σ(A) is clopen. Applying (i) to Σ gives H1 , H2 , A1 , A2
with σ(A1 ) = {λ} and σ(A2 ) = σ(A)\{λ}. Now Exercise 13.71(ii) gives A1 = λ1, so that
H1 = ker(A − λ1). And (A2 − λ1) ∈ B(H2 ) is invertible.
Now we are in a position to answer the question raised in Remark 11.28:
17.25 Corollary Let H be a Hilbert space and A ∈ B(H) normal. With H′ = (ker A)^⊥ we
have AH′ ⊆ H′, and the following are equivalent:
(i) A|_{H′} ∈ B(H′) is invertible (⇔ surjective ⇔ bounded below).
(ii) 0 ∉ σ(A) or 0 ∈ σ(A) is isolated.
Proof. Recall from Proposition 11.27 that A maps H′ to itself.
(ii)⇒(i) If 0 ∉ σ(A) then A is invertible, thus H′ = H and A|_{H′} is invertible. If 0 ∈ σ(A)
is isolated then Proposition 17.24(ii) gives a decomposition H = H_1 ⊕ H_2, where H_1 = ker A
and H_2 = (ker A)^⊥ = H′, with A|_{H_2} = A|_{H′} invertible.
(i)⇒(ii) If A is injective, the assumption of invertibility of A|_{H′} becomes invertibility of A,
implying 0 ∉ σ(A). If A is not injective then invertibility of A|_{H′} means 0 ∉ σ(A|_{H′}). Since
σ(A|_{H′}) ⊆ C is closed, there is an open neighborhood U ⊂ C of 0 such that U ∩ σ(A|_{H′}) = ∅.
On the other hand, the spectrum of A|_{ker A} obviously is {0}. Since σ(A) = {0} ∪ σ(A|_{H′}),
we have σ(A) ∩ U = {0}, so that 0 ∈ σ(A) is isolated.
is a bounded linear functional on the Banach space (C(σ(A), C), ‖·‖_∞) since ‖f(A)‖ = ‖f‖_∞. If
f is positive (i.e. takes values in [0, ∞)) then σ(f(A)) ⊆ [0, ∞) by the spectral mapping theorem,
so that f(A) ≥ 0 and ⟨f(A)x, x⟩ ≥ 0 by Proposition 17.9. Thus ϕ_{A,x} is a bounded positive
linear functional on C(σ(A), C). Now by the Riesz-Markov-Kakutani theorem, cf. Section A.7,
there is a unique finite positive measure μ_{A,x} on the Borel σ-algebra of σ(A) such that
∫ f dμ_{A,x} = ϕ_{A,x}(f) = ⟨f(A)x, x⟩ ∀f ∈ C(σ(A), C). (18.1)
Taking f = 1 = const., we have f(A) = 1, so that μ_{A,x}(σ(A)) = ∫ 1 dμ_{A,x} = ‖x‖² < ∞. Now we
have the Hilbert space L²(σ(A), μ_{A,x}), where we omit the Borel σ-algebra from the notation.
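In finite dimensions the measure in (18.1) can be written down by hand: if A has eigenvalues λ_i with orthonormal eigenvectors e_i, then μ_{A,x} = Σ_i |⟨x, e_i⟩|² δ_{λ_i}. The sketch below verifies (18.1) and μ_{A,x}(σ(A)) = ‖x‖² for a normal 2 × 2 matrix built from a chosen unitary u and chosen eigenvalues. All concrete values and helper names are assumptions made for the illustration.

```python
import math

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(x):
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def apply(m, v):
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def inner(v, w):
    """<v, w> = sum_i v_i * conj(w_i), linear in the first argument."""
    return sum(v[i] * w[i].conjugate() for i in range(2))

r = 1 / math.sqrt(2)
u = [[r, r], [1j * r, -1j * r]]          # columns: orthonormal eigenvectors e_1, e_2
eigenvalues = [1j, 2.0]                  # sigma(A) for the normal matrix A = u D u*

def functional_calculus(f):
    """f(A) = u f(D) u* for the diagonal matrix D of eigenvalues."""
    d = [[f(eigenvalues[0]), 0], [0, f(eigenvalues[1])]]
    return matmul(matmul(u, d), adjoint(u))

x = [1.0 - 2.0j, 0.5j]
columns = [[u[0][i], u[1][i]] for i in range(2)]
weights = [abs(inner(x, e)) ** 2 for e in columns]    # mu_{A,x}({lambda_i})

def integral(f):
    """Integral of f against mu_{A,x}, a finite weighted sum."""
    return sum(f(lam) * w for lam, w in zip(eigenvalues, weights))

def quadratic_form(f):
    return inner(apply(functional_calculus(f), x), x)  # <f(A)x, x>

f = lambda z: z.conjugate() * z + 3 * z   # a continuous function on sigma(A)
total_mass = sum(weights)
```

integral(f) and quadratic_form(f) agree up to rounding, and total_mass equals ‖x‖² = 5.25.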
18.1 Definition Let H be a Hilbert space, A ∈ B(H) normal and x ∈ H. Then x is called
∗-cyclic for A if \overline{span}_C{A^n (A∗)^m x | n, m ∈ N_0} = H (the overline denoting the norm closure).
18.2 Remark A vector x is called cyclic for A if \overline{span}_C{A^n x | n ∈ N_0} = H. Clearly the two
notions are equivalent for self-adjoint A, but a normal operator can be ∗-cyclic but not cyclic
(e.g. the shift on ℓ²(Z, F)). For the present purpose, ∗-cyclicity is the right notion. □
where the third equality comes from the ∗-homomorphism property of the functional calculus
and the fourth from (18.1). Thus if we equip C(σ(A), C) with the seminorm ‖f‖_2 =
(∫ |f|² dμ_{A,x})^{1/2}, the map α : C(σ(A), C) → H, f ↦ f(A)x is isometric. With ℒ²(σ(A), μ_{A,x}) =
{f : σ(A) → C | f measurable, ‖f‖_2 < ∞} we have C(σ(A), C) ⊆ ℒ²(σ(A), μ_{A,x}) (since μ_{A,x} is
finite). Recall that L²(σ(A), μ_{A,x}) is defined as the quotient space of ℒ²(σ(A), μ_{A,x}) w.r.t.
the equivalence relation ∼ defined by f ∼ g ⇔ ‖f − g‖_2 = 0 ⇔ f = g μ-a.e. Since
f ∼ g implies ‖f(A)x − g(A)x‖ = ‖f − g‖_2 = 0, the map α descends to an isometric map
α′ : C(σ(A), C)/∼ → H such that the triangle in

    ℒ²(σ(A), μ_{A,x})  ⊇  C(σ(A), C)
           ↓                   ↓         ↘ α
    L²(σ(A), μ_{A,x})  ⊇  C(σ(A), C)/∼  ──α′──→  H

commutes. It is a fact from measure theory that C(σ(A), C)/∼ ⊆ L²(σ(A), μ_{A,x}) is a dense linear
subspace, cf. e.g. [140, Theorem 3.14]. Thus by Lemma 3.12 there is a unique isometric map α̂′ :
L²(σ(A), μ_{A,x}) → H extending α′. With {f(A)x | f ∈ C(σ(A), C)} ⊇ {A^i (A∗)^j x | i, j ∈ N_0}, the
assumption that x be ∗-cyclic implies that α, thus α′, has dense image in H. Since L²(σ(A), μ_{A,x})
is complete, its image under α̂′ is closed and dense, thus all of H. Thus α̂′ is unitary, and we
define U = (α̂′)∗ : H → L²(σ(A), μ_{A,x}). For f ∈ C(σ(A), C) we have U∗[f] = f(A)x, thus
(UAU∗)([f])(z) = (U A f(A)x)(z) = (U (zf)(A)x)(z) = [zf](z),
and by density of C(σ(A), C)/∼ in L²(σ(A), μ_{A,x}), this holds for all f ∈ L²(σ(A), μ_{A,x}) and
μ_{A,x}-almost all z ∈ σ(A).
Not every normal operator A ∈ B(H) admits a ∗-cyclic vector, see Exercise 18.9. But we
always have:
18.4 Theorem (Spectral theorem for normal operators) Let H be a complex Hilbert
space and A ∈ B(H) normal. Then there exists a family {μ_ι}_{ι∈I} of finite Borel measures on
σ(A) and a unitary U : H → ⊕_{ι∈I} L²(σ(A), μ_ι)⁹⁷ such that UAU∗ = ⊕_{ι∈I} M_z, i.e.
(UAU∗ f)_ι(z) = z f_ι(z) ∀f = {f_ι} ∈ ⊕_{ι∈I} L²(σ(A), μ_ι), z ∈ σ(A). (18.2)
97
Here ⊕ is the Hilbert space direct sum defined at the end of Section 5.1.
Proof. Let ℱ be the family of subsets F ⊆ H such that for x, y ∈ F, x ≠ y we have f(A)x ⊥
f′(A)y for all f, f′ ∈ C(σ(A), C). We partially order ℱ by inclusion. One easily checks that ℱ
satisfies the hypothesis of Zorn’s lemma. (Given a totally ordered subset 𝒞 ⊆ ℱ, ∪𝒞 is in ℱ,
thus an upper bound for 𝒞.) Thus there is a maximal element M ∈ ℱ. For each x ∈ M we
let H_x be the closure of {f(A)x | f ∈ C(σ(A), C)}. By construction these H_x are mutually orthogonal and
f(A)H_x ⊆ H_x ∀x. Putting K = ⊕_{x∈M} H_x, we have f(A)K ⊆ K for all f ∈ C(σ(A), C), thus
also f(A)∗K ⊆ K since f(A)∗ = f̄(A). With Exercise 11.23 this means that K and K^⊥ are
invariant under all f(A). If K^⊥ ≠ {0} then for every non-zero y ∈ K^⊥ we have M ∪ {y} ∈ ℱ,
which contradicts the maximality of M. Thus K = H.
For every x ∈ M we have that x ∈ H_x is ∗-cyclic for the restriction of A to H_x, so that we can
apply Proposition 18.3 to obtain unitaries U_x : H_x → L²(σ(A), μ_{A,x}) such that U_x A = M_z U_x.
Defining U : H → ⊕_{x∈M} L²(σ(A), μ_{A,x}) by sending y ∈ H_x to U_x y ∈ L²(σ(A), μ_{A,x}) and
extending linearly, U is unitary. It is clear that we have UAU∗ = ⊕_{x∈M} M_z. Now we are done
(with the obvious identifications I = M and μ_x = μ_{A,x}).
18.5 Remark 1. Once the maximal family M of vectors has been picked, the construction
is canonical. But there is no uniqueness in the choice of that family. (This is similar to the
non-uniqueness of the choices of ONBs in the eigenspaces ker(A − λ1) that we make in proving
Theorem 14.12.) For much more on this (in the self-adjoint case) see [128, Section VII.2].
2. Theorem 18.4 is perfectly compatible with Theorem 14.12: If A is compact normal and
E is an ONB diagonalizing it then the Hι in Theorem 18.4 are precisely the one-dimensional
spaces Ce for e ∈ E and the measure µι corresponding to Hι = Ce is the δ-measure on P (σ(A))
defined by µ(S) = 1 if λe ∈ S and µ(S) = 0 otherwise. (To be really precise, one should take
the non-uniqueness in both theorems into account.)
3. If $A$ is as in the theorem and $g \in C(\sigma(A),\mathbb{C})$ then the continuous functional calculus
gives us a normal operator $g(A)$. We now have
$$U g(A) U^* = \bigoplus_{\iota\in I} M_g.$$
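A finite-dimensional illustration may help here. The following is only a sketch under stated assumptions, not part of the proof: we take $H = \mathbb{C}^n$ and let $A$ be the cyclic shift $e_j \mapsto e_{j+1 \bmod n}$, which is unitary, hence normal, with $\sigma(A)$ the $n$-th roots of unity and ∗-cyclic vector $e_0$. The function names (`shift`, `dft_unitary`, `diagonalize_shift`) are hypothetical helpers chosen for this example. The discrete Fourier transform plays the role of the unitary $U$, and $UAU^*$ becomes multiplication by $z$ on the spectrum.

```python
import cmath

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    """Conjugate transpose of a matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def shift(n):
    """Cyclic shift S on C^n: S e_j = e_{(j+1) mod n}."""
    return [[1.0 + 0j if i == (j + 1) % n else 0j for j in range(n)]
            for i in range(n)]

def dft_unitary(n):
    """Unitary U with entries U[k][j] = omega^(j k) / sqrt(n), omega = e^{2 pi i / n}."""
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** (j * k) / n ** 0.5 for j in range(n)] for k in range(n)]

def diagonalize_shift(n):
    """Return U S U^*, which should be diag(omega^0, ..., omega^(n-1))."""
    u = dft_unitary(n)
    return matmul(matmul(u, shift(n)), adjoint(u))
```

Up to rounding, `diagonalize_shift(n)` is the diagonal matrix whose entries are the points $z$ of $\sigma(A)$, which is the statement $UAU^* = M_z$ in this toy case.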
18.6 Corollary Let H be a separable complex Hilbert space and A ∈ B(H) normal. Then
there exists a finite measure space (X, A, µ), a function g ∈ L∞ (X, A, µ; C) and a unitary
W : H → L2 (X, A, µ; C) such that W AW ∗ = Mg .
Proof. We apply Theorem 18.4. Since $H$ is separable, the index set $I$ is at most countable, and
we write $I = \{1, \ldots, N\}$ where $N \in \mathbb{N} \cup \{\infty\}$ with $\infty = \#\mathbb{N}$. Now we put $X = I \times \sigma(A) =
\bigsqcup_{i\in I} \sigma(A)$ and for $Y \subseteq X$ we put $Y_i = p_2(Y \cap p_1^{-1}(i)) = \{x \in \sigma(A) \mid (i,x) \in Y\} \subseteq \sigma(A)$. We define
$\mathcal{A} \subseteq P(X)$ and $\mu : \mathcal{A} \to [0,\infty]$ by
$$\mathcal{A} = \{Y \subseteq X \mid Y_i \in \mathcal{B}(\sigma(A))\ \forall i \in I\}, \qquad \mu(Y) = \sum_{i\in I} \mu_i(Y_i).$$
From the way $(X, \mathcal{A}, \mu)$ was constructed, it is quite clear that $V$ is unitary. (Check this!) Now
$W = VU : H \to L^2(X, \mathcal{A}, \mu)$, where $U$ comes from Theorem 18.4, is unitary. In view of
$(UAU^* f)_i(\lambda) = \lambda f_i(\lambda)$, defining $g : X \to \mathbb{C}$, $(i,x) \mapsto x$ (which is bounded by $r(A) = \|A\|$), we
have $WAW^* = M_g$.
18.7 Exercise Use the above results to prove that for every normal $A \in B(H)$, where $H$ is
a Hilbert space of dimension $\geq 2$, there is a proper closed subspace $\{0\} \neq K \subsetneq H$ such that
$AK \subseteq K$.
For a bit more on the existence of invariant subspaces see Section B.8.
18.8 Exercise (i) Let Σ ⊆ C be compact and non-empty and µ be a finite positive Borel
measure on Σ. Put H = L2 (Σ, µ) and define A ∈ B(H) by A = Mz , thus (Af )(z) = zf (z)
for f ∈ H. Prove:
18.9 Exercise Let $H$ be a Hilbert space and $A \in B(H)$ non-zero and normal. Prove that $A$
admits a ∗-cyclic vector if and only if $H$ is separable and the algebra $\{A, A^*\}'$ is commutative.
18.10 Definition If (X, τ ) is a topological space, B ∞ (X, C) denotes the set of bounded func-
tions X → C that are measurable with respect to the Borel σ-algebra B(X, τ ).
18.11 Lemma Let $(X, \tau)$ be a topological space. Then
(i) If $\{f_n\}_{n\in\mathbb{N}}$ is a sequence of Borel measurable functions $X \to \mathbb{C}$ converging pointwise to $f$
then $f$ is Borel measurable.
(ii) $(B^\infty(X,\mathbb{C}), \|\cdot\|_\infty)$, equipped with pointwise multiplication and ∗-operation, is a $C^*$-algebra.
Proof. (i) It is an elementary fact of measure theory, cf. e.g. [29, Proposition 2.1.5], that the
pointwise limit of a sequence of measurable functions (whatever the σ-algebra) is measurable.
(ii) Every sequence in $B^\infty(X,\mathbb{C})$ that is Cauchy w.r.t. $\|\cdot\|_\infty$ converges pointwise everywhere.
The limit is measurable by (i), and clearly bounded. Thus $B^\infty(X,\mathbb{C})$ is complete. It is a $C^*$-algebra
since product and ∗-operation satisfy submultiplicativity and the $C^*$-identity.
18.12 Theorem Let $H$ be a complex Hilbert space and $A \in B(H)$ normal. Then:
(i) There is a unique unital ∗-homomorphism $\alpha_A : B^\infty(\sigma(A),\mathbb{C}) \to B(H)$ extending the
continuous functional calculus $C(\sigma(A),\mathbb{C}) \to B(H)$ and satisfying $\|\alpha_A(f)\| \leq \|f\|_\infty$. Again
we write more suggestively $f(A) = \alpha_A(f)$.
(ii) If $B \in B(H)$ commutes with $A$ then $B$ commutes with $g(A)$ for all $g \in B^\infty(\sigma(A),\mathbb{C})$.
(iii) If $\{f_n\}_{n\in\mathbb{N}} \subseteq B^\infty(\sigma(A),\mathbb{C})$ is a bounded sequence converging pointwise to $f$ then
$f \in B^\infty(\sigma(A),\mathbb{C})$ and $f_n(A) \xrightarrow{w} f(A)$, i.e. w.r.t. $\tau_{wot}$, cf. Definition 10.18 and Exercise 10.19(i).
(And $\|f_n - f\|_\infty \to 0 \Rightarrow \|f_n(A) - f(A)\| \to 0$.)
Proof. (i) For all $x, y \in H$, the map
$$\varphi_{x,y} : C(\sigma(A),\mathbb{C}) \to \mathbb{C}, \quad f \mapsto \langle f(A)x, y\rangle$$
is a linear functional on $C(\sigma(A),\mathbb{C})$ that is bounded since $\|f(A)\| = \|f\|_\infty$. Thus by the
Riesz-Markov-Kakutani Theorem A.56 there exists a unique complex Borel measure $\mu_{x,y}$ on $\sigma(A)$ such
that $\varphi_{x,y}(f) = \int f\, d\mu_{x,y}$ for all $f \in C(\sigma(A),\mathbb{C})$. Since $\varphi_{x,y}$ depends in a sesquilinear way on
$(x,y)$, the same holds for $\mu_{x,y}$, and $|\mu_{x,y}(\sigma(A))| = |\langle x,y\rangle| \leq \|x\|\|y\|$. Thus if $f \in B^\infty(\sigma(A),\mathbb{C})$,
the map $\psi_f : H^2 \to \mathbb{C}$ defined by $(x,y) \mapsto \int f\, d\mu_{x,y}$ is a sesquilinear form that is bounded since
$|\psi_f(x,y)| \leq \|f\|_\infty \|x\|\|y\|$. Thus by Proposition 11.5 there is a unique $A_f \in B(H)$ such that
$\langle A_f x, y\rangle = \psi_f(x,y)$ for all $x, y \in H$. It satisfies $\|A_f\| \leq \|f\|_\infty$. Define $\alpha_A : B^\infty(\sigma(A),\mathbb{C}) \to B(H)$
by $f \mapsto A_f$. If $f \in C(\sigma(A),\mathbb{C})$ then $\psi_f(x,y) = \langle f(A)x, y\rangle\ \forall x, y$, implying $A_f = f(A)$. Thus $\alpha_A$
extends the continuous functional calculus.
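It remains to be shown that $\alpha_A$ is a ∗-homomorphism. Linearity is quite obvious. Since the
continuous functional calculus is a ∗-homomorphism, for $f \in C(\sigma(A),\mathbb{C})$ we have $\overline{f}(A) = f(A)^*$,
thus
$$\int \overline{f}\, d\mu_{x,y} = \langle \overline{f}(A)x, y\rangle = \langle x, \overline{f}(A)^* y\rangle = \langle x, f(A)y\rangle = \overline{\langle f(A)y, x\rangle} = \overline{\int f\, d\mu_{y,x}} = \int \overline{f}\, d\overline{\mu_{y,x}},$$
implying $\overline{\mu_{y,x}} = \mu_{x,y}$. Now for all $f \in B^\infty(\sigma(A),\mathbb{C})$ the above computation can be read
backwards, giving $\alpha_A(\overline{f}) = \alpha_A(f)^*$. Since the continuous functional calculus is a homomorphism,
for all $f, g \in C(\sigma(A),\mathbb{C})$ we have
$$\int (fg)\, d\mu_{x,y} = \langle (fg)(A)x, y\rangle = \langle f(A)g(A)x, y\rangle = \langle g(A)x, f(A)^* y\rangle = \int g\, d\mu_{x,f(A)^*y}.$$
The fact that this holds for all $f, g \in C(\sigma(A),\mathbb{C})$ implies $f\mu_{x,y} = \mu_{x,f(A)^*y}$. Thus for all
$f \in C(\sigma(A),\mathbb{C})$, $g \in B^\infty(\sigma(A),\mathbb{C})$ we have
$$\langle (fg)(A)x, y\rangle = \int fg\, d\mu_{x,y} = \int g\, d\mu_{x,f(A)^*y} = \langle g(A)x, f(A)^* y\rangle = \langle f(A)g(A)x, y\rangle,$$
so that $(fg)(A) = f(A)g(A)$. As above, we deduce from this that $f\mu_{x,y} = \mu_{x,f(A)^*y}$ for all
$f \in B^\infty(\sigma(A),\mathbb{C})$, and then $(fg)(A) = f(A)g(A)$ for all $f, g \in B^\infty(\sigma(A),\mathbb{C})$.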
(ii) By normality of A and Fuglede’s Theorem 16.16 we have BA∗ = A∗ B, thus B commutes
with C ∗ (1, A), so that Bf (A) = f (A)B for all f ∈ C(σ(A), C). Thus
where convergence in the center is a trivial application of the dominated convergence theorem,
using boundedness of $\mu_{x,y}$ and $\|f_n\|_\infty \leq M$ for all $n$. This proves $\alpha_A(f_n) \xrightarrow{w} \alpha_A(f)$. The final
claim clearly follows from $\|f_n(A) - f(A)\| = \|(f_n - f)(A)\| \leq \|f_n - f\|_\infty$.
The above construction of the Borel functional calculus was independent of the Spectral
Theorem 18.4. We now wish to understand their relationship. This is the first step:
18.13 Exercise Let $\Sigma \subseteq \mathbb{C}$ be compact and $\lambda$ a finite positive Borel measure on $\Sigma$. Let
$H = L^2(\Sigma, \lambda; \mathbb{C})$ and $g \in B^\infty(\Sigma,\mathbb{C})$.
(i) Prove that the multiplication operator $M_g : H \to H$, $[f] \mapsto [gf]$ satisfies
$$\|M_g\| = \operatorname{ess\,sup}_\lambda |g| = \inf\{t \geq 0 \mid \lambda(\{x \in \Sigma \mid |g(x)| > t\}) = 0\} \leq \|g\|_\infty.$$
(ii) Let $A = M_z \in B(H)$, where $z : \Sigma \hookrightarrow \mathbb{C}$. Prove that $g(A)$ as defined by the Borel functional
calculus coincides with $M_g$.
Thus with the direct sum decomposition $UAU^* = \bigoplus_\iota M_z$ we have $U g(A) U^* = \bigoplus_\iota g(M_z) =
\bigoplus_\iota M_g$, where the second equality comes from Exercise 18.13(ii).
(i) If $\lambda \notin g(\sigma(A))$ then $M_g - \lambda 1$ has a bounded inverse with norm $\leq \operatorname{dist}(\lambda, g(\sigma(A)))^{-1}$. Thus
all $M_g - \lambda 1$ in the direct sum decomposition of $g(A) - \lambda 1$ have inverses with uniformly bounded
norms. Thus $g(A) - \lambda 1$ has a bounded inverse.
(ii) Under the assumption on $h$, we have
$$U h(g(A)) U^* = \bigoplus_\iota h(M_g) = \bigoplus_\iota M_{h\circ g} = U (h\circ g)(A) U^*.$$
(This is too sloppy, but the reader should be able to make it precise.)
18.15 Remark 1. Since it turns out that $g(A) = U^* (\bigoplus_\iota M_g) U$ for all $g \in B^\infty(\sigma(A),\mathbb{C})$, one
might try to take this as the definition of $g(A)$. But apart from being very inelegant, it has
the problem that one must prove the independence of g(A) thus defined of the choice of the
maximal set M ⊆ H in the proof of the spectral theorem. This would not be difficult if every
Borel measurable function was a pointwise limit of a sequence of continuous functions. But this
is false, making such an approach quite painful. (Compare Lusin’s theorem in, e.g., [140].)
2. We cannot hope to prove $\|g(A)\| = \|g\|_\infty$ for all $g \in B^\infty(\sigma(A),\mathbb{C})$ since it is true only
if $\sigma(A) = \sigma_p(A)$! Since singletons in $\mathbb{C}$ are closed, thus Borel measurable, we can change $g$
arbitrarily at some $\lambda \in \sigma(A)$ without destroying the measurability of $g$, making $\|g\|_\infty$ as large
as we want. But if $\lambda \in \sigma(A) \setminus \sigma_p(A)$, Exercise 18.8 gives $\mu_\iota(\{\lambda\}) = 0\ \forall \iota \in I$, so that this change
of $g$ does not affect the norms $\operatorname{ess\,sup}_{\mu_\iota} |g|$ (cf. Exercise 18.13) of the multiplication operators
making up $g(A)$ and therefore does not affect $\|g(A)\|$.
3. Let $A \in B(H)$ be normal and consider the $C^*$-algebra $\mathcal{A} = C^*(1, A) \subseteq B(H)$. Then
$g(A) \in \mathcal{A}$ for continuous $g$, but for most non-continuous $g$ we have $g(A) \notin \mathcal{A}$. For this reason
there is no Borel functional calculus in abstract $C^*$-algebras. (But $g(A)$ is always contained in
the von Neumann algebra $\mathrm{vN}(A) = \overline{C^*(A, 1)}^{\,wot}$ generated by $A$. This follows from Theorem
18.12(ii) and von Neumann's 'double commutant theorem'.) □
18.16 Lemma If $H$ is a complex Hilbert space and $U \in B(H)$ is unitary, there is a self-adjoint
$A \in B(H)$ such that $e^{iA} = U$.
Proof. Since $U$ is unitary, we have $\sigma(U) \subseteq S^1$. Define $f : S^1 \to \mathbb{C}$ as the unique inverse of the
map $(-\pi, \pi] \to S^1$, $x \mapsto e^{ix}$. The function $f$ is continuous on $S^1$ except at $-1$, where it has a
jump. Thus it is Borel measurable, so that we can define a normal operator $A = f(U) \in B(H)$
by Borel functional calculus (Theorem 18.12). By Corollary 18.14 we have $\sigma(A) \subseteq f(\sigma(U)) \subseteq
[-\pi, \pi]$ (which together with normality of $A$ implies $A = A^*$) and $e^{iA} = U$.
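To make the function $f$ of this proof concrete, here is a small sketch for the special case of a diagonal unitary $U = \mathrm{diag}(z_1, \ldots, z_n)$ with $|z_k| = 1$, where $A = f(U)$ is simply $\mathrm{diag}(f(z_1), \ldots, f(z_n))$. This diagonal model is an assumption made for illustration (it corresponds to the multiplication-operator picture of the spectral theorem), and the helper names are hypothetical.

```python
import cmath
import math

def principal_arg(z):
    """The function f of the proof: inverse of (-pi, pi] -> S^1, x -> e^{ix}."""
    theta = cmath.phase(z)  # cmath.phase returns a value in [-pi, pi]
    # Map the boundary value -pi (possible for z = -1 - 0j) to +pi,
    # so that the result always lies in (-pi, pi].
    return math.pi if theta <= -math.pi else theta

def log_of_diagonal_unitary(diag_u):
    """Diagonal of the self-adjoint A = f(U) for U = diag(z_1, ..., z_n)."""
    return [principal_arg(z) for z in diag_u]
```

Each entry of the result is real and lies in $(-\pi, \pi]$, and exponentiating $i$ times it recovers the corresponding $z_k$. The jump of $f$ at $-1$ is visible by evaluating `principal_arg` at points of $S^1$ just above and just below $-1$.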
18.17 Proposition Let $H$ be a complex Hilbert space. Then the open set $\mathrm{Inv}\,B(H) \subset B(H)$
of invertible operators is path-connected.
Proof. Let $A \in \mathrm{Inv}\,B(H)$, and let $A = V|A|$ be its polar decomposition. Then $|A|$ is positive and
invertible, and $V$ is unitary, cf. Exercise 11.43. Since $\sigma(|A|) \subseteq (0, +\infty)$, we can use continuous
functional calculus to define $B = \log |A| \in B(H)$, satisfying $e^B = |A|$. By the preceding
lemma, there is a self-adjoint $D$ with $e^{iD} = V$. Now $g : [0,1] \to \mathrm{Inv}\,B(H)$, $t \mapsto e^{itD} e^{tB}$
is a continuous path in $\mathrm{Inv}\,B(H)$ (since $e^X$ is invertible for all $X$) such that $g(0) = 1$ and
$g(1) = e^{iD} e^B = V|A| = A$. Since every invertible operator can thus be connected to $1$, concatenating
paths through $1$ produces paths between any two invertible operators.
18.18 Remark The result also holds for infinite-dimensional real Hilbert spaces, but then it
requires a different proof. This is surprising since $GL(n, \mathbb{R})$ is not path-connected for $n \in \mathbb{N}$
but has two path-components, while $GL(n, \mathbb{C})$ is path-connected. In fact, one has this much
stronger theorem of Kuiper [98] [89]: $\mathrm{Inv}\,B(H)$ is contractible in the norm topology for all
infinite-dimensional real or complex Hilbert spaces. □
18.19 Definition Let $H$ be a Hilbert space and $\Sigma \subseteq \mathbb{C}$ a compact subset. Let $\mathcal{B}(\Sigma)$ be the Borel
σ-algebra on $\Sigma$. A projection-valued measure relative to $(H, \Sigma)$ is a map $P : \mathcal{B}(\Sigma) \to B(H)$
such that
(i) $P(S)$ is an orthogonal projection for all $S \in \mathcal{B}(\Sigma)$.
(ii) $P(\emptyset) = 0$, $P(\Sigma) = 1$.
(iii) $P(S \cap S') = P(S)P(S')$ for all $S, S' \in \mathcal{B}(\Sigma)$.
(iv) For all $x, y \in H$, the map $E_{x,y} : \mathcal{B}(\Sigma) \to \mathbb{C}$, $S \mapsto \langle P(S)x, y\rangle$ is a complex measure.
(Equivalently, if the $\{S_n\}_{n\in\mathbb{N}} \subseteq \mathcal{B}(\Sigma)$ are mutually disjoint then $\sum_n P(S_n)$ converges
weakly to $P(\bigcup_n S_n)$.)
Note that (iii) implies $P(S)P(S') = P(S')P(S)$ for all $S, S' \in \mathcal{B}(\Sigma)$.
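The axioms can be checked by hand in a finite-dimensional toy model. The sketch below assumes $H = \mathbb{C}^n$ and a diagonal normal operator with eigenvalues $\lambda_1, \ldots, \lambda_n$, and takes $P(S)$ to be the diagonal projection onto the coordinates $k$ with $\lambda_k \in S$. Borel sets are represented by indicator functions, and all function names are hypothetical and chosen only for this example.

```python
def spectral_projection(eigenvalues, indicator):
    """Diagonal of P(S): entry k is 1 if eigenvalue k lies in S, else 0."""
    return [1 if indicator(lam) else 0 for lam in eigenvalues]

def product(p, q):
    """Product of two diagonal projections, computed entrywise."""
    return [a * b for a, b in zip(p, q)]

def pairing(p, x, y):
    """E_{x,y}(S) = <P(S)x, y>, with the inner product linear in the first slot."""
    return sum(pk * xk * yk.conjugate() for pk, xk, yk in zip(p, x, y))
```

For instance, with eigenvalues $1, i, -1, i$, the set $S = \{\operatorname{Im} z > 0\}$ gives the projection onto the second and fourth coordinates, and property (iii) reduces to $\chi_{S\cap S'} = \chi_S \chi_{S'}$ evaluated at the eigenvalues.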
18.20 Proposition Let H be a complex Hilbert space and A ∈ B(H) normal. Put Σ = σ(A).
For each S ∈ B(Σ), define PA (S) = χS (A) by Borel functional calculus. Then S 7→ PA (S) is a
projection-valued measure relative to (H, Σ), also called the spectral resolution of A.
Proof. If $g = \chi_S$ for $S \in \mathcal{B}(\Sigma)$, $g(A)$ is a direct sum of operators of multiplication by $\chi_S$, which
clearly all are idempotent. And since $g = \chi_S$ is real-valued, $g(A)$ is self-adjoint. Thus each
$P_A(S) = \chi_S(A)$ is an orthogonal projection. $P_A(\emptyset) = 0$ is clear, and $P_A(\Sigma) = 1(A) = 1_H$ (since
the constant 1 function is continuous). Property (iii) is immediate from $\chi_{S\cap S'} = \chi_S \chi_{S'}$. Finally,
if $x, y \in H$ let $Ux = \{f_\iota\}_{\iota\in I}$, $Uy = \{g_\iota\}_{\iota\in I}$. Then
$$E_{x,y}(S) = \langle P_A(S)x, y\rangle = \sum_{\iota\in I} \int_{\sigma(A)} \chi_S(z) f_\iota(z) \overline{g_\iota(z)}\, d\mu_\iota(z).$$
18.21 Exercise Let $A \in B(H)$ be normal and $\Sigma \subseteq \sigma(A)$ a Borel set. Prove $\sigma(A|_{P(\Sigma)H}) \subseteq
\overline{\Sigma} \cup \{0\}$. Bonus: State and prove a better result.
18.22 Exercise Let A ∈ B(H) be a normal operator and let PA be the corresponding spectral
measure. Prove:
(i) $\lambda \in \sigma(A)$ if and only if $P_A(\sigma(A) \cap B(\lambda, \varepsilon)) \neq 0$ for each $\varepsilon > 0$.
(ii) $\lambda \in \sigma(A)$ is an eigenvalue if and only if $P_A(\{\lambda\}) \neq 0$.
[98] Nicolaas Hendrik Kuiper (1920-1994), Dutch mathematician.
We have thus seen that every normal operator gives rise to a projection valued measure. The
converse is also true, and we have a bijection between normal operators and projection-valued
measures:
f (A)f (A)∗ = α(f )α(f )∗ = α(f f ∗ ) = α(f ∗ f ) = α(f )∗ α(f ) = f (A)∗ f (A).
(iii) $\sigma(A) \subseteq \Sigma$ is clear. Since $\alpha(1) = 1$ and $\alpha(z) = A$ by definition, we have $\alpha(P) = P(A)$
for each polynomial $P$. More generally, since $\alpha$ is a ∗-homomorphism, a polynomial in $z, \overline{z}$ is sent
by $\alpha$ to the corresponding polynomial in $A, A^*$. These polynomials are $\|\cdot\|_\infty$-dense in $C(\Sigma,\mathbb{C})$
by Weierstrass Theorem A.38, so that the continuity proven in (ii) implies that $\alpha(f) = f(A)$ as
produced by the continuous functional calculus.
(iv) Left as an exercise.
We close the discussion of spectral theorems with the advice of looking at the paper [67] and
[152, Chapter 5] by two masters of functional analysis.
19 ⋆ The Gelfand homomorphism for commutative Banach and $C^*$-algebras
19.1 The topology of Ω(A). The Gelfand homomorphism
Let $A$ be a unital Banach algebra over $\mathbb{C}$ and $\Omega(A)$ its spectrum. For each $a \in A$ define (as in
Section 9.3)
$$\widehat{a} : \Omega(A) \to \mathbb{C}, \quad \varphi \mapsto \varphi(a).$$
We now want a topology $\tau$ on $\Omega(A)$ such that $\widehat{a}$ is continuous for each $a \in A$, thus $\widehat{a} \in
C((\Omega(A), \tau), \mathbb{C})$. Since $\Omega(A) \subseteq (A^*)_{\leq 1}$ by Lemma 15.3, we could take $\tau$ to be the restriction of
the norm topology of $A^*$ to $\Omega(A)$ (i.e. the relative topology). But we can also take the weakest
topology making all $\widehat{a} : \Omega(A) \to \mathbb{C}$ continuous. This is nothing other than the restriction to
$\Omega(A)$ of the weak-∗ topology or $\sigma(A^*, A)$-topology on $A^*$.
19.1 Proposition Let A be a unital Banach algebra. Let τ be the restriction of the weak-∗
topology to Ω(A) ⊆ (A∗ )≤1 . Then (Ω(A), τ ) is compact Hausdorff.
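Proof. By Alaoglu's theorem, $((A^*)_{\leq 1}, \tau_{w^*})$ is compact, and it is Hausdorff since the weak-∗ topology is.
Thus it suffices to prove that $\Omega(A) \subseteq (A^*)_{\leq 1}$ is weak-∗ closed. Let $\{\varphi_\iota\}$ be a net in $\Omega(A)$ that
converges to $\psi \in A^*$ w.r.t. the $\sigma(A^*, A)$-topology. Then for all $a, b \in A$ we have
$\psi(ab) = \lim_\iota \varphi_\iota(ab) = \lim_\iota \varphi_\iota(a)\varphi_\iota(b) = \psi(a)\psi(b)$, and $\psi(1) = \lim_\iota \varphi_\iota(1) = 1$, so
that $\psi$ is a non-zero character, i.e. $\psi \in \Omega(A)$. Thus $\Omega(A) \subseteq (A^*)_{\leq 1}$ is $\sigma(A^*, A)$-closed.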
The above works whether or not A is commutative, but we’ll now restrict to commutative
A since Ω(A) can be very small otherwise. We begin by completing Exercise 15.5:
19.2 Proposition Let $X$ be a compact Hausdorff space and $A = C(X, \mathbb{C})$. Then the map
$\iota : X \to \Omega(A)$, $x \mapsto \varphi_x$ is a homeomorphism (with the weak-∗ topology on $\Omega(A)$).
Proof. Injectivity was already proven in Exercise 15.5. In order to prove surjectivity, let $\varphi \in
\Omega(A)$ and put $M = \ker \varphi$. Then $M \subseteq A$ is a proper closed subalgebra (in fact an ideal), and
it is self-adjoint by Lemma 17.17 since $A$ is a $C^*$-algebra. If $x, y \in X$, $x \neq y$, pick $f \in A$
with $f(x) \neq f(y)$. With $g = f - \varphi(f)1$ we have $\varphi(g) = 0$, thus $g \in M$, and $g(x) \neq g(y)$. This proves that
$M$ separates the points of $X$, yet it is not dense in $A$. Now the incarnation Corollary A.41
of the Stone-Weierstrass theorem implies that there must be an $x \in X$ at which $M$ vanishes
identically, i.e. $\varphi_x(f) = 0$ for all $f \in M$. Now for every $f \in A$ we have $f - \varphi(f)1 \in M$, thus
$\varphi_x(f - \varphi(f)1) = 0$, which is equivalent to $\varphi_x(f) = \varphi(f)$. Thus $\iota : X \to \Omega(A)$ is surjective.
If $\{x_\iota\} \subseteq X$ is a net such that $x_\iota \to x$ then $\varphi_{x_\iota}(f) = f(x_\iota) \to f(x) = \varphi_x(f)$ for every $f \in A$ by
continuity of $f$. But this precisely means that $\varphi_{x_\iota} \to \varphi_x$ w.r.t. the weak-∗ topology. Thus $\iota$ is
continuous. As a continuous bijection of compact Hausdorff spaces it is a homeomorphism.
19.3 Definition Let A be a unital Banach algebra. Then its radical is the set of quasi-
nilpotent elements: radA = {a ∈ A | r(a) = 0}. We call A semisimple if radA = {0}.
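$$\pi : A \to C(\Omega(A), \mathbb{C}), \quad a \mapsto \widehat{a} \qquad (19.1)$$
Proof. It is clear that $\pi$ is linear, and $\widehat{1}(\varphi) = \varphi(1) = 1$ for all $\varphi$. Let $a, b \in A$, $\varphi \in \Omega(A)$. Then
$$\pi(ab)(\varphi) = \widehat{ab}(\varphi) = \varphi(ab) = \varphi(a)\varphi(b) = \widehat{a}(\varphi)\widehat{b}(\varphi) = \pi(a)(\varphi)\pi(b)(\varphi) = (\pi(a)\pi(b))(\varphi),$$
where we used multiplicativity of $\varphi$ and the fact that the multiplication on $C(\Omega(A), \mathbb{C})$ is
pointwise. This shows that $\pi(ab) = \pi(a)\pi(b)$, thus $\pi$ is an algebra homomorphism. We have
$$\|\widehat{a}\| = \sup_{\varphi\in\Omega(A)} |\widehat{a}(\varphi)| = \sup_{\varphi\in\Omega(A)} |\varphi(a)| = \sup_{\lambda\in\sigma(a)} |\lambda| = r(a) \leq \|a\|,$$
where we used (15.1) and Proposition 13.27. In particular, $\ker \pi = r^{-1}(0) = \mathrm{rad}A$.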
The Gelfand homomorphism can fail to be surjective or injective or both. See Section 19.2
for an important example for the failure of surjectivity and Exercise 19.6 for a non-trivial unital
Banach algebra with very large radical.
19.5 Proposition Let $A$ be a commutative unital Banach algebra and $a \in A$ such that $A$ is
generated by $\{1, a\}$. Then the map $\widehat{a} : \Omega(A) \to \sigma(a)$ is a homeomorphism.
The same conclusion holds if $a \in \mathrm{Inv}A$ and $A$ is generated by $\{1, a, a^{-1}\}$.
Proof. We know from (15.1) that $\widehat{a}(\Omega(A)) = \sigma(a)$, thus $\widehat{a}$ is surjective. Assume $\widehat{a}(\varphi_1) = \widehat{a}(\varphi_2)$,
thus $\varphi_1(a) = \varphi_2(a)$. Since the $\varphi_i$ are unital homomorphisms, this implies $\varphi_1(a^n) = \varphi_2(a^n)$
for all $n \in \mathbb{N}_0$, so that $\varphi_1, \varphi_2$ agree on the polynomials in $a$. Since the latter are dense in $A$
by assumption and the $\varphi_i$ are continuous, this implies $\varphi_1 = \varphi_2$. Thus $\widehat{a} : \Omega(A) \to \sigma(a)$ is
injective, thus a continuous bijection. Since $\Omega(A)$ is compact and $\sigma(a) \subseteq \mathbb{C}$ Hausdorff, $\widehat{a}$ is a
homeomorphism. This proves the first claim.
For the second claim, note that $\varphi(a)\varphi(a^{-1}) = \varphi(aa^{-1}) = \varphi(1) = 1$, thus $\varphi(a^{-1}) = \varphi(a)^{-1}$,
for each $\varphi \in \Omega(A)$. This implies that $\varphi_1(a^n) = \varphi_2(a^n)$ also holds for negative $n \in \mathbb{Z}$. Now
$\varphi_1, \varphi_2$ agree on all Laurent polynomials in $a$, thus on $A$ by density and continuity. The rest of
the proof is the same.
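19.6 Exercise Let $\alpha : \mathbb{N}_0 \to (0, \infty)$ be a map satisfying $\alpha_0 = 1$ and $\alpha_{n+m} \leq \alpha_n \alpha_m\ \forall n, m$.
For $f : \mathbb{N}_0 \to \mathbb{C}$, define $\|f\| = \sum_{n\in\mathbb{N}_0} \alpha_n |f(n)|$, and $A = \{f : \mathbb{N}_0 \to \mathbb{C} \mid \|f\| < \infty\}$. For $f, g \in A$,
define $f \cdot g$ by $(f \cdot g)(n) = \sum_{u,v\in\mathbb{N}_0,\ u+v=n} f(u)g(v)$.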
19.7 Remark 1. Since every commutative unital Banach algebra has at least one non-zero
character $\varphi$, the worst that can happen is $\mathrm{rad}A = \varphi^{-1}(0)$, which has codimension one, as in the
preceding exercise.
2. If $A$ is a non-unital Banach algebra and $a \in A$ one defines $\sigma(a) = \sigma_{\widetilde{A}}(a)$, where $\widetilde{A}$ is
the unitization of $A$ considered in Exercise 13.55. Now one defines $r(a) = \sup_{\lambda\in\sigma(a)} |\lambda|$ and
$\mathrm{rad}A = r^{-1}(0) \subseteq A$ as before. Now for the non-unital subalgebra $A_0 = \{f \in A \mid f(0) = 0\}$ of
the $A$ from Exercise 19.6 one easily proves $\widetilde{A_0} \cong A$, thus $r(a) = 0\ \forall a \in A_0$ and $\mathrm{rad}A_0 = A_0$. □
19.2 Application: Absolutely convergent Fourier series
Let $(A = \ell^1(\mathbb{Z}, \mathbb{C}), \|\cdot\|, \star, 1)$ be the unital Banach algebra from Section 4.6. In Exercise 15.11
we used characters to compute the spectra of elements of $A$ and proved that it is semisimple.
We now give a new interpretation of these somewhat ad-hoc arguments in the light of Fourier
analysis and the Gelfand homomorphism.
By the semisimplicity of $A$, the Gelfand homomorphism $\pi : A \to C(\Omega(A), \mathbb{C})$ is injective. (In
Exercise 15.11 we have proven a bijection $S^1 \to \Omega(A)$. Since the Banach algebra $A$ is generated
by $\delta_1$ and $\delta_1^{-1} = \delta_{-1}$, Proposition 19.5 amplifies this to a homeomorphism $\Omega(A) \to S^1$.) But $\pi$
is neither isometric nor surjective: Its image consists precisely of
$$\mathcal{W} = \Big\{ g \in C(S^1, \mathbb{C}) \ \Big|\ \sum_{n\in\mathbb{Z}} |\widehat{g}(n)| < \infty \Big\}.$$
This is an algebra since $A$ is. (To see this without reference to $\pi$, note that $\widehat{f \cdot g}(n) = (\widehat{f} \star \widehat{g})(n)$
and use the fact that $\ell^1(\mathbb{Z})$ is closed under convolution.) While $\mathcal{W}$ inherits the norm $\|\cdot\|_\infty$
from $C(S^1, \mathbb{C})$, it is not closed in this norm and $\pi$ is not an isometry. The norm on $\mathcal{W}$ for
which $\pi : A \to \mathcal{W}$ is an isometric isomorphism is $\|g\|_{\mathcal{W}} = \sum_{n\in\mathbb{Z}} |\widehat{g}(n)|$. Since $\mathcal{W}$ is generated
by the function $z \mapsto z$ and its inverse, it follows that $\Omega(\mathcal{W})$ consists of the point evaluations $\{\varphi_z \mid z \in S^1\}$
as for $C(S^1, \mathbb{C})$. Now the result of Exercise 15.11 is obvious since it follows from the isometric
isomorphism $\pi : (A, \|\cdot\|_1) \to (\mathcal{W}, \|\cdot\|_{\mathcal{W}})$.
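The interplay between convolution and pointwise multiplication can be checked numerically on finitely supported sequences. The sketch below assumes that the character attached to $z \in S^1$ is $\varphi_z(f) = \sum_n f(n) z^n$ (the form suggested by Exercise 15.11, which is not reproduced here), and represents $f \in \ell^1(\mathbb{Z})$ with finite support as a dictionary from integers to complex numbers. The helper names are hypothetical.

```python
import cmath

def convolve(f, g):
    """Convolution (f * g)(n) = sum over u + v = n of f(u) g(v)."""
    h = {}
    for u, fu in f.items():
        for v, gv in g.items():
            h[u + v] = h.get(u + v, 0) + fu * gv
    return h

def norm1(f):
    """The l^1 norm of a finitely supported sequence."""
    return sum(abs(c) for c in f.values())

def gelfand(f, theta):
    """Value of sum_n f(n) z^n at z = e^{i theta} (assumed form of the character)."""
    z = cmath.exp(1j * theta)
    return sum(c * z ** n for n, c in f.items())
```

With these definitions one can observe that `gelfand(convolve(f, g), theta)` agrees with `gelfand(f, theta) * gelfand(g, theta)`, that `norm1(convolve(f, g)) <= norm1(f) * norm1(g)`, and that `abs(gelfand(f, theta)) <= norm1(f)`. The last inequality is typically strict, reflecting the fact that $\pi$ is not isometric for the supremum norm.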
For $g \in \mathcal{W}$ the Fourier series converges absolutely and uniformly to $g$, but we have proven
in Section 8.3 that $C(S^1, \mathbb{C})$ has a dense subset of functions whose Fourier series does not
even converge pointwise everywhere. (Our proof was non-constructive, but as we remarked,
individual examples can be produced constructively.) Functions in $C(S^1, \mathbb{C}) \setminus \mathcal{W}$ can actually
be written down even more concretely: With some effort (see [109] for an exposition) the
series $\sum_{n=2}^\infty \frac{\sin nx}{n \log n}$ can be shown to converge uniformly to some $f \in C(S^1, \mathbb{C})$, and its Fourier
coefficients are not absolutely summable since $\sum_{n=2}^\infty (n \log n)^{-1} = \infty$. (That the convergence is
not unconditional follows from the fact that it isn't at $x = \pi/2$.)
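We now turn the non-surjectivity of $\pi : \ell^1(\mathbb{Z}) \to C(S^1)$ into a virtue! For $g \in C(S^1, \mathbb{C})$
define $\|g\|_{\mathcal{W}} = \sum_{n\in\mathbb{Z}} |\widehat{g}(n)|$. Thus $\mathcal{W} = \{g \in C(S^1, \mathbb{C}) \mid \|g\|_{\mathcal{W}} < \infty\}$. We have seen that the
Gelfand representation of $\ell^1(\mathbb{Z})$ is an isometric isomorphism $(\ell^1(\mathbb{Z}), \|\cdot\|_1) \to (\mathcal{W}, \|\cdot\|_{\mathcal{W}})$. Now
we have: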
19.10 Remark The first proof of this theorem due to Wiener was more involved. The above
proof due to Gelfand was one of the first successes of his theory of commutative C ∗ -algebras.
But now there is a much simpler and quite definitive proof using only convergence of the
geometric/Neumann series in the Banach algebra W. See [114] or [27, Section 2.5]. 2
19.11 Exercise Prove that the Gelfand homomorphism $\pi : \ell^1(\mathbb{Z}, \mathbb{C}) \to C(S^1, \mathbb{C})$ (seen above
to be injective) is not bounded below.
19.3 C ∗ -algebras. Continuous functional calculus revisited
In discussing when the Gelfand homomorphism π : A → C(Ω(A), F) is an isomorphism, we
limit ourselves to the case where A is a C ∗ -algebra over C.
The result of (iii) is equivalent to the following: Every character on a unital C ∗ -subalgebra
of a commutative C ∗ -algebra has an extension to a character of the larger algebra.
equivalence between the category of locally compact Hausdorff spaces and proper maps and the
category of commutative ∗-algebras and non-degenerate homomorphisms.
3. The preceding comments in a sense end the theory of commutative $C^*$-algebras since the
latter is reduced to general topology. But the theory of non-commutative $C^*$-algebras is vast,
see [79, 110] for accessible introductions, and it turns out that commutative $C^*$-algebras are a
very useful tool for studying them, as results like Exercise 16.24 and Proposition 17.6 just begin
to illustrate.
4. Comparing Theorem 19.12 with the non-surjectivity of the Gelfand homomorphism for
$(\ell^1(\mathbb{Z}, \mathbb{C}), \star)$ shows that $\ell^1(\mathbb{Z}, \mathbb{C})$ does not admit a norm that would make it a $C^*$-algebra. But
$\ell^1(\mathbb{Z}, \mathbb{C})$ admits a non-complete $C^*$-norm $\|\cdot\|'$, and completing $\ell^1(\mathbb{Z}, \mathbb{C})$ w.r.t. the latter yields
a $C^*$-algebra $C^*(\mathbb{Z})$ that is isomorphic to $C^*(U) \subseteq B(\ell^2(\mathbb{Z}, \mathbb{C}))$, where $U \in B(\ell^2(\mathbb{Z}, \mathbb{C}))$ is the
two-sided shift unitary. One also has $C^*(\mathbb{Z}) \cong C(S^1, \mathbb{C})$, thus the $C^*$-completion 'adds' the
continuous functions with non-absolutely convergent Fourier series. □
Now we have another, perhaps more conceptual but certainly less elementary, proof of the
continuous functional calculus for normal elements of a C ∗ -algebra (Theorem 17.16):
$$C(\sigma(a), \mathbb{C}) \xrightarrow{\ \widehat{a}^{\,t}\ } C(\Omega(B), \mathbb{C}) \xrightarrow{\ \pi^{-1}\ } B \hookrightarrow A,$$
(iii) This is essentially obvious, since applying f to a and g to f (a) is just composition of
maps on the right hand side of the Gelfand isomorphism.
It should be clear that Theorem 19.12 is of fundamental conceptual importance, but most
of its applications just use Theorem 19.16, which we proved in Section 17.3 in a more ele-
mentary fashion (without weak-∗ topology and Alaoglu’s theorem, and using only the classical
Weierstrass theorem). Genuine applications of Theorem 19.12 are harder to find. Here is one:
19.17 Exercise Let A be a unital C ∗ -algebra and let a, b ∈ A be commuting normal elements.
Prove that the absolute values (Defin. 17.11) satisfy |a + b| ≤ |a| + |b| and |ab| = |a| |b|.
Hint: Fuglede’s theorem.
(For A = B(H) one has results under weaker hypotheses. See e.g. [107].)
A.1 Definition Let $S$ be a set, $(V, \|\cdot\|)$ a normed space and $f : S \to V$ a function. We say
that $\sum_{s\in S} f(s)$ exists or converges (or: $f$ is summable over $S$) with sum $x \in V$ if for every
$\varepsilon > 0$ there is a finite subset $T \subseteq S$ such that $\|x - \sum_{s\in U} f(s)\| < \varepsilon$ holds whenever $T \subseteq U \subseteq S$
with $U$ finite.
In many cases, the above will be applied to V = F ∈ {R, C} and k · k = | · |.
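As a concrete instance of Definition A.1, take $S = \mathbb{N}$ and $f(s) = 1/s^2$, with sum $x = \pi^2/6$. The following sketch (helper names are hypothetical, and the choice $T = \{1, \ldots, 2000\}$ for $\varepsilon = 10^{-3}$ is just one admissible choice) checks the defining condition for a few finite sets $U \supseteq T$. Since all terms are positive and the tail beyond $T$ is smaller than $\varepsilon$, every finite $U \supseteq T$ works.

```python
import math

def partial_sum(f, indices):
    """Sum of f over a finite index set; the order of summation is irrelevant."""
    return math.fsum(f(s) for s in indices)

def inverse_square(s):
    """The summand f(s) = 1 / s^2."""
    return 1.0 / (s * s)

def within_epsilon(x, f, finite_set, eps):
    """Check |x - sum_{s in U} f(s)| < eps for one finite set U."""
    return abs(x - partial_sum(f, finite_set)) < eps
```

For example, `within_epsilon(math.pi ** 2 / 6, inverse_square, T | {5000, 10**6}, 1e-3)` holds for `T = set(range(1, 2001))`, whatever finite set of further indices is adjoined to `T`.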
This notion of summation has some useful properties:
(iv) If $(V, \|\cdot\|)$ is complete and $\sum_{s\in S} \|f(s)\| < \infty$ then $\sum_{s\in S} f(s)$ exists, and $\|\sum_{s\in S} f(s)\| \leq
\sum_{s\in S} \|f(s)\|$.
(v) If $f : S \to \mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ is such that $\sum_{s\in S} f(s)$ exists then $\sum_{s\in S} |f(s)|$ exists, i.e. is finite.
The proofs of (i) and (ii) are straightforward and similar to those for the analogous
statements about series. The equivalence in (iii) follows from monotonicity of the map $P_{fin}(S) \to
[0, \infty)$, $T \mapsto \sum_{t\in T} f(t)$. If $\sum_{s\in S} |f(s)| < \infty$ then it follows that for every $\varepsilon > 0$ there are
at most finitely many $s \in S$ such that $|f(s)| \geq \varepsilon$. In particular, for every $n \in \mathbb{N}$ the set
$S_n = \{s \in S \mid |f(s)| \geq 1/n\}$ is finite. Since a countable union of finite sets is countable, we
have countability of $\{s \in S \mid f(s) \neq 0\} = \bigcup_{n=1}^\infty S_n$. The proof of (iv) combines the argument in
Proposition 3.15(iii) with Lemma A.14 below.
Statement (v) may be surprising at first sight since the analogous statement for series is
false. Roughly, the reason is that our definition of $\sum_{s\in S} f(s)$ imposes no ordering on $S$, while,
by a classical result of Riemann, the sum of a convergent series $\sum_{n=1}^\infty f(n)$ is invariant under
reordering of the terms only if the series converges absolutely. The rigorous proof of (v), found
e.g. in [12] or [108, Proposition 5.1.28], does not appeal to Riemann's result, but uses similar
ideas. For $S = \mathbb{N}$ this is Proposition 3.16, the proof for general $S$ being similar.
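The dependence on ordering can be seen numerically for the alternating harmonic series, which converges conditionally to $\log 2$. The sketch below uses the classical rearrangement that takes two positive terms followed by one negative term, whose sum is known to be $\frac{3}{2}\log 2$. The function names are hypothetical, and the output is only a floating-point approximation.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum 1 - 1/2 + 1/3 - 1/4 + ... with n_terms terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Partial sum of 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ... over n_blocks blocks.

    Block k consists of the terms 1/(4k-3), 1/(4k-1) and -1/(2k), so every
    term of the original series appears exactly once.
    """
    total = 0.0
    for k in range(1, n_blocks + 1):
        total += 1 / (4 * k - 3) + 1 / (4 * k - 1) - 1 / (2 * k)
    return total
```

Comparing `alternating_harmonic(100000)` with `math.log(2)` and `rearranged(100000)` with `1.5 * math.log(2)` shows the two orderings approaching different limits, so this series is not summable in the sense of Definition A.1.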
In discussing the spaces `p (S, F), the following (easy special case of Lebesgue’s dominated
convergence theorem) is useful:
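$$\Big| \sum_{s\in S} f_n(s) - \sum_{s\in S} h(s) \Big| \leq \Big| \sum_{s\in T} f_n(s) - \sum_{s\in T} h(s) \Big| + \Big| \sum_{s\in S\setminus T} f_n(s) - \sum_{s\in S\setminus T} h(s) \Big|.$$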
A.2 More on unconditional convergence of series
In the case S = N and f (n) = xn , there is a connection between the summability of Definition
A.1 and the notion of unconditionally convergent series:
A.4 Theorem Let $(V, \|\cdot\|)$ be a Banach space and $\{x_n\}_{n\in\mathbb{N}} \subset V$. Then the following are
equivalent:
(i) $\sum_{n\in\mathbb{N}} x_n$ exists in the sense of Definition A.1.
(ii) $\sum_{n=1}^\infty x_n$ is unconditionally convergent, the sums of all rearrangements being equal.
(iii) $\sum_{n=1}^\infty x_n$ is unconditionally convergent.
(iv) For every $\varepsilon > 0$ there exists a finite $S \subseteq \mathbb{N}$ such that for every finite subset $T \subset \mathbb{N} \setminus S$ we
have $\|\sum_{t\in T} x_t\| < \varepsilon$.
(v) $\lim_{N\to\infty} \sup_{\varphi\in V^*_{\leq 1}} \sum_{k=N+1}^\infty |\varphi(x_k)| = 0$.
(vi) $\sum_{n=1}^\infty c_n x_n$ is convergent for all bounded sequences $\{c_n\}$ in $\mathbb{F}$. (The convergence then is
unconditional.)
(vii) $\sum_{k=1}^\infty x_{n_k}$ converges for all $n_1 < n_2 < \cdots$. (Subseries convergence)
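Proof. (i)⇒(ii) Let $\sum_{n\in\mathbb{N}} x_n = x$. If now $\sigma$ is any permutation of $\mathbb{N}$ and $\varepsilon > 0$ then by
assumption there is a finite subset $T \subseteq \mathbb{N}$ such that $\|x - \sum_{n\in U} x_n\| < \varepsilon$ for every finite $U \subset \mathbb{N}$
containing $T$. Since $\sigma : \mathbb{N} \to \mathbb{N}$ is a bijection, there exists $n_0$ such that $n \geq n_0$ implies
$T \subseteq \{\sigma(1), \ldots, \sigma(n)\}$. (We can take $n_0 = \max\{\sigma^{-1}(k) \mid k \in T\}$.) Thus $\|x - \sum_{k=1}^n x_{\sigma(k)}\| < \varepsilon$ for all $n \geq n_0$.
This proves that all rearranged sums $\sum_{k=1}^\infty x_{\sigma(k)}$ converge to $x$.
(ii)⇒(iii) Trivial.
(iii)⇒(iv) Assume (iv) does not hold. By elementary logic this means that there is an
$\varepsilon > 0$ such that for every finite $S \subseteq \mathbb{N}$ there exists a finite $T \subseteq \mathbb{N} \setminus S$ with $\|\sum_{t\in T} x_t\| \geq \varepsilon$.
Using this we can construct a sequence $S_1, S_2, \ldots$ of mutually disjoint finite subsets $S_k \subseteq \mathbb{N}$
such that $\|\sum_{s\in S_k} x_s\| \geq \varepsilon$ for each $k$. Now we can find a permutation $\sigma$ of $\mathbb{N}$ and a sequence
$n_1 < n_2 < \ldots$ such that $S_k = \sigma(\{n_k, n_k + 1, \ldots, n_k + \#S_k - 1\})$ for all $k$. Thus for each $k$
we have $\|\sum_{n=n_k}^{n_k + \#S_k - 1} x_{\sigma(n)}\| = \|\sum_{s\in S_k} x_s\| \geq \varepsilon$, so that the series $\sum_{n=1}^\infty x_{\sigma(n)}$ is not Cauchy [99],
contradicting (iii).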
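(iv)⇒(v) Let $\varepsilon > 0$ and $S \subseteq \mathbb{N}$ as provided correspondingly by assumption (iv). Put
$N = \max(S)$. It then follows that $\|\sum_{t\in T} x_t\| < \varepsilon$ for each finite $T \subseteq \mathbb{N}$ with $\min(T) > N$. If
$\varphi \in V^*$ and $L \geq K > N$, put
$$F^+ = \{n \in \mathbb{N} \mid K \leq n \leq L,\ \mathrm{Re}\,\varphi(x_n) \geq 0\}, \qquad F^- = \{n \in \mathbb{N} \mid K \leq n \leq L,\ \mathrm{Re}\,\varphi(x_n) < 0\}.$$
Clearly $\min(F^+) \geq K > N$, so that $\|\sum_{n\in F^+} x_n\| < \varepsilon$. Now for every $\varphi \in V^*_{\leq 1}$ we have
$$\sum_{n\in F^+} |\mathrm{Re}\,\varphi(x_n)| = \sum_{n\in F^+} \mathrm{Re}\,\varphi(x_n) = \mathrm{Re}\,\varphi\Big(\sum_{n\in F^+} x_n\Big) \leq \Big|\varphi\Big(\sum_{n\in F^+} x_n\Big)\Big| \leq \|\varphi\| \Big\|\sum_{n\in F^+} x_n\Big\| < \varepsilon.$$
Essentially the same argument holds for $F^-$, so that $\sum_{n=K}^L |\mathrm{Re}\,\varphi(x_n)| < 2\varepsilon$. If $\mathbb{F} = \mathbb{C}$ a similar
argument gives $\sum_{n=K}^L |\mathrm{Im}\,\varphi(x_n)| < 2\varepsilon$. With $|z| \leq |\mathrm{Re}\,z| + |\mathrm{Im}\,z|$ we conclude $\sum_{n=K}^L |\varphi(x_n)| <
4\varepsilon$. Taking $L \to \infty$ gives $\sum_{n=K}^\infty |\varphi(x_n)| \leq 4\varepsilon$. Since this holds for all $\varphi \in V^*_{\leq 1}$, (v) follows.
[99] A series $\sum_{n=1}^\infty x_n$ is Cauchy if the sequence $\{S_n\}$ of partial sums is Cauchy.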
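(v)⇒(vi) We may clearly assume $|c_n| \leq 1$ for all $n$. Proposition 9.9(i) implies for every $y \in V$
that $\|y\| = \sup_{\varphi\in V^*_{\leq 1}} |\varphi(y)|$. Thus for $M > N$ we have
$$\Big\| \sum_{k=N+1}^M c_k x_k \Big\| = \sup_{\varphi\in V^*_{\leq 1}} \Big| \varphi\Big( \sum_{k=N+1}^M c_k x_k \Big) \Big| = \sup_{\varphi\in V^*_{\leq 1}} \Big| \sum_{k=N+1}^M c_k \varphi(x_k) \Big|$$
$$\leq \sup_{\varphi\in V^*_{\leq 1}} \sum_{k=N+1}^M |c_k|\,|\varphi(x_k)| \leq \sup_{\varphi\in V^*_{\leq 1}} \sum_{k=N+1}^M |\varphi(x_k)| \leq \sup_{\varphi\in V^*_{\leq 1}} \sum_{k=N+1}^\infty |\varphi(x_k)|.$$
Since the rightmost expression tends to zero as $N \to \infty$ by (v), it follows that the series
$\sum_{k=1}^\infty c_k x_k$ is Cauchy, thus convergent.
By the same proof, $\sum_{n=1}^\infty c_n d_n x_n$ converges for all choices of $\{d_n\}$ in $\{0,1\}^{\mathbb{N}}$. Thus the
convergence of $\sum_{n=1}^\infty c_n x_n$ is unconditional by (vi)⇒(iii).
(vi)⇒(vii) This follows by rewriting $\sum_{k=1}^\infty x_{n_k}$ as $\sum_{n=1}^\infty c_n x_n$, where $c_n \in \{0,1\}$.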
(vii)⇒(iv) Assume (iv) does not hold. Arguing as in the proof of (iii)⇒(iv) we have an infinite
sequence $\{S_k\}$ of mutually disjoint finite subsets of $\mathbb{N}$ such that $\|\sum_{n\in S_k} x_n\| \geq \varepsilon$ for all $k$. It
is clear that we can choose the $S_k$ in such a way that $\max(S_k) < \min(S_{k+1})$ for all $k$. Putting
$S = \bigcup_k S_k$ and $c_n = \chi_S(n)$, with $N_k = \#\bigcup_{i=1}^{k-1} S_i$ we have $\|\sum_{n=N_k+1}^{N_{k+1}} c_n x_n\| = \|\sum_{n\in S_k} x_n\| \geq \varepsilon$.
Thus the series $\sum_{n=1}^\infty c_n x_n$ diverges since it is not Cauchy, contradicting (vii).
A.5 Remark 1. Statement (vi) expresses more clearly than all the others that unconditional
convergence of a series does not rely on ‘cancellations’ between the summands, so that the
convergence is not affected by reordering or omission of any number of summands.
2. Our proof of the hardest implication (iv)⇒(vi) was found in [74]. It is perhaps the most
elementary one, using only Hahn-Banach. There are many other interesting proofs, see e.g.
[102, 4.2.6-4.2.8], [77,
P Vol. 1, Proposition 4.1.5], [97, Vol. 1, Theorem II.7]. ∗
3. If the series ∞
P∞
n=1 xn in V is unconditionallyP∞ convergent and ϕ ∈ V then n=1 ϕ(xn )
∗
converges unconditionally, thus absolutely. Thus n=1 |ϕ(xn )| < ∞ ∀ϕ ∈ V . A series with
this property is called weakly unconditionally Cauchy (WUC). But P the WUC property does
not even imply conditional convergence, as is illustrated by series ∞ n=1 δn in c0 , which is easily
∗ ∼ 1
shown to be WUC using c0 = ` . This is essentially the only counterexample: Bessaga and
Pelczyński100 proved that every WUC series in a Banach space V converges unconditionally
if and only if V has no subspace isomorphic to c0 . Cf. e.g. [98, Theorem 2.e.4], [1, Theorem
2.4.11]. (This is similar to Rosenthal’s `1 -theorem mentioned in footnote 62, but much easier.)
4. While the WUC property of a series is weaker than unconditional convergence, weak-topology
characterizations of unconditional convergence do exist. One of them is statement
(v), being a uniform (in $\varphi \in V^*_{\le 1}$) version of the statement $\lim_{N\to\infty} \sum_{n=N+1}^\infty |\varphi(x_n)| = 0$ that
clearly follows from WUC. On the other hand, it is clear that unconditional convergence implies
weak convergence of all subseries. The latter property of a series is somewhat stronger than
being WUC, and indeed by the Orlicz^101-Pettis theorem, cf. e.g. [1, Theorem 2.4.14], [97, Vol.
1, Theorem II.3], we can add the following to the list in Theorem 19.16:
(viii): Every subseries $\sum_{i=1}^\infty x_{n_i}$, where $n_1 < n_2 < \cdots$, converges weakly. □
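The counterexample in item 3 can be made explicit by a short computation. The following is only a sketch, assuming the standard identification of $c_0^*$ with $\ell^1$ via $\varphi(x) = \sum_n a_n x_n$ for $\varphi = (a_n)$, and writing $\delta_n$ for the n-th unit sequence:

```latex
\sum_{n=1}^\infty |\varphi(\delta_n)| \;=\; \sum_{n=1}^\infty |a_n| \;=\; \|\varphi\|_1 \;<\; \infty
\quad\text{for every } \varphi = (a_n)_{n\in\mathbb{N}} \in \ell^1 \cong c_0^*,
\qquad\text{whereas}\qquad
\Big\| \sum_{n=N+1}^{M} \delta_n \Big\|_\infty \;=\; 1 \quad\text{for all } M > N .
```

The first identity is the WUC property; the second shows that the partial sums of $\sum_n \delta_n$ are not Cauchy in $c_0$, so the series does not converge in any ordering.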
100
Czeslaw Bessaga (1932-2021), Aleksander Pelczyński (1932-2012). Polish functional analysts. Both were students
of S. Mazur.
101
Wladyslaw Orlicz (1903-1990). Polish functional analyst and topologist. Also known for O. spaces.
A.6 Exercise Use Theorem A.4 to give a high-brow proof of Proposition 3.16.
A.3 Nets
The Definition A.1 of unordered sums is an instance of a much more general notion, the con-
vergence of nets.
A.7 Definition A directed set is a set I equipped with a binary relation ≤ on I satisfying
1. a ≤ a for each a ∈ I (reflexivity).
2. If a ≤ b and b ≤ c for a, b, c ∈ I then a ≤ c (transitivity).
3. For any a, b ∈ I there exists a c ∈ I such that a ≤ c and b ≤ c (directedness).
A.8 Remark If only 1. and 2. hold, (I, ≤) is called a pre-ordered set. Some authors, as e.g.
[101], require in addition that a ≤ b and b ≤ a together imply a = b (antisymmetry). Recall
that a pre-ordered set with this property is called partially ordered. But the antisymmetry is
an unnatural assumption in this context and is never used. 2
A.9 Example 1. Every totally ordered set (X, ≤) is a directed set. Only the directedness
needs to be shown, and it follows by taking c = max(a, b). In particular N is a directed set with
its natural total ordering.
2. If S is a set then the power set I = P (S) with its natural partial ordering is directed: For
the directedness, put c = a ∪ b. The same works for the set Pfin (S) of finite subsets of S, which
appeared in the definition of unordered sums.
3. If (X, τ) is a topological space and x ∈ X, let $U_x$ be the set of open neighborhoods of x.
Now for $U, V \in U_x$, define U ≤ V ⇔ U ⊇ V, thus we take the reversed ordering. Then ($U_x$, ≤)
is directed with c = a ∩ b.
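The directedness checks in items 2 and 3 can be tried out on small finite models. The sketch below is illustrative only: the helper names (`finite_subsets`, `is_directed`, `partial_sum`), the four-point universe, and the toy "neighborhood system" (all subsets containing the point 0) are assumptions made for the example, not part of the notes. The last part evaluates the net of finite partial sums of f(n) = 2^{-n} at a few indices T containing {0, ..., N}.

```python
from fractions import Fraction
from itertools import combinations

def finite_subsets(universe):
    """Return all subsets of a finite set as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_directed(elements, leq, upper):
    """Check directedness: upper(a, b) stays in the set and dominates a and b."""
    pool = set(elements)
    return all(upper(a, b) in pool and leq(a, upper(a, b)) and leq(b, upper(a, b))
               for a in elements for b in elements)

subsets = finite_subsets({0, 1, 2, 3})

# Item 2: inclusion order on (finite) subsets, upper bound c = a ∪ b.
directed_by_union = is_directed(subsets, lambda a, b: a <= b, lambda a, b: a | b)

# Item 3 (toy version): all subsets containing the point 0,
# ordered by reversed inclusion, upper bound c = a ∩ b.
neighborhoods = [s for s in subsets if 0 in s]
directed_by_intersection = is_directed(
    neighborhoods, lambda a, b: a >= b, lambda a, b: a & b)

def partial_sum(f, T):
    """Finite partial sum over T, i.e. the value of the net at the index T."""
    return sum((f(t) for t in T), Fraction(0))

def f(n):
    return Fraction(1, 2 ** n)  # f(n) = 2^{-n}; the unordered sum over N_0 is 2

N = 10
T0 = frozenset(range(N + 1))
# Every finite index T containing T0 gives a partial sum within 2^{-N} of 2.
larger_indices = [T0, T0 | {15}, T0 | {11, 12, 40}, frozenset(range(60))]
gaps = [2 - partial_sum(f, T) for T in larger_indices]
```

Exact rational arithmetic is used so that the bound on `gaps` is not blurred by rounding.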
A.11 Remark 1. With I = N and ≤ the natural total ordering, a net indexed by I just is a
sequence, and this net converges if and only if the sequence does.
2. Unordered summation is a special case of a net limit: If S is any set, let I be the set of
finite subsets of S and let ≤ be the ordinary (partial) ordering of subsets of S. If T, U ∈ I let
V = T ∪ U . Clearly T ≤ V, U ≤ V , showing that (I, ≤) is a directed set. (This is the same as
Example A.9.2, except that now we only look at finite subsets of S.) Now given f : S → F, for
every T ∈ I, thus every finite T ⊆ S, we can clearly define $\sum_{t\in T} f(t)$. Now
$$\sum_{s\in S} f(s) = \lim_{T\in I} \sum_{t\in T} f(t),$$
102
Nets were invented by the American mathematicians Eliakim Hastings Moore (1862-1932) and his student Herman
L. Smith (1892-1950). Moore made many contributions to many areas of mathematics.
Why nets? The reason is that sequences are totally inadequate for the study of topological
spaces that do not satisfy the first countability axiom.103 Given a metric space X and a subset
Y ⊆ X, one proves that $x \in \overline{Y}$ if and only if there is a sequence $\{y_n\}$ in Y converging to x, but
for general topological spaces this is false. Similarly, the statement that a function f : X → Y
is continuous at x ∈ X if and only if f (xn ) → f (x) for every sequence {xn } converging to x is
true for metric spaces, but false in general! (It is instructive to work out counterexamples.)
On the other hand:
A.13 Definition A net $\{x_\iota\}$, indexed by a directed set (I, ≤), in a metric space (X, d) is a
Cauchy net if for every ε > 0 there is a $\iota_0 \in I$ such that $\iota, \iota' \ge \iota_0 \Rightarrow d(x_\iota, x_{\iota'}) < \varepsilon$.
A.15 Remark In the above proof we cannot argue by saying that there is a sequence ι1 ≤ ι2 ≤
· · · such that for every ι ∈ I there is an n with ι ≤ ιn . If this was true, we could replace the
net by a sequence in the first place! 2
• If X is a set, there exists a function s : P(X)\{∅} → X such that s(Y) ∈ Y for each
Y ∈ P(X)\{∅}, i.e. ∅ ≠ Y ⊆ X.
• If $\{X_i\}_{i\in I}$ is a family of non-empty sets then $\prod_{i\in I} X_i \neq \emptyset$. Concretely, there exists a map
$f : I \to \bigcup_{j\in I} X_j$ such that $f(i) \in X_i$ ∀i ∈ I.
A.18 Theorem Given the Zermelo-Fraenkel axioms of set theory, the Axiom of Choice is equivalent
to Zorn’s lemma, which says: If (X, ≤) is a non-empty partially ordered set such that
every totally ordered subset Y ⊆ X has an upper bound then X has a maximal element.
A.19 Definition The Axiom of Countable Choice (ACω ) is the first (or third) of the above
versions of AC with the restriction that Y (respectively I) be at most countable.
Many iterative constructions, as in the proof of Urysohn’s lemma or of Lemma 7.3, require
countably many choices where, however, the n-th choice must take into account the preceding
ones. For this we need an axiom that is stronger than ACω :
A.20 Definition The Axiom of Countable Dependent Choice (DCω ) is the following: If X is
a set and R ⊆ X × X is such that for every x ∈ X there is a y ∈ X such that (x, y) ∈ R then
there is a sequence {xn }n∈N in X such that (xn , xn+1 ) ∈ R for all n ∈ N.
A.21 Remark 1. Like the other choice axioms, DCω is often used without any comment.
2. It is easy to prove AC ⇒ DCω ⇒ ACω . The converse implications have been proven
false by constructing models of ZF set theory satisfying, say, ACω but not DCω . 2
Saying something about infinite intersections requires more work and assumptions, as in:
A.23 Theorem (Baire)^104 Let (X, d) be a complete metric space and $\{U_n\}_{n\in\mathbb{N}}$ a countable
family of dense open subsets. Then $\bigcap_{n=1}^\infty U_n$ is dense in X.
Proof. Let W ⊆ X be open and non-empty. Since $U_1$ is dense, $W \cap U_1 \neq \emptyset$ by Lemma A.22, so we
can pick $x_1 \in W \cap U_1$. Since $W \cap U_1$ is open, we can choose $\varepsilon_1 > 0$ such that $\overline{B(x_1, \varepsilon_1)} \subseteq W \cap U_1$.
We may also assume $\varepsilon_1 < 1$. Since $U_2$ is dense, $U_2 \cap B(x_1, \varepsilon_1) \neq \emptyset$ and we pick $x_2 \in U_2 \cap B(x_1, \varepsilon_1)$.
By openness, we can pick $\varepsilon_2 \in (0, 1/2)$ such that $\overline{B(x_2, \varepsilon_2)} \subseteq U_2 \cap B(x_1, \varepsilon_1)$. Continuing this
iteratively, we find points $x_n$ and $\varepsilon_n \in (0, 1/n)$ such that $\overline{B(x_n, \varepsilon_n)} \subseteq U_n \cap B(x_{n-1}, \varepsilon_{n-1})$ ∀n. If
i > n and j > n we have by construction that $x_i, x_j \in B(x_n, \varepsilon_n)$ and thus $d(x_i, x_j) \le 2\varepsilon_n < 2/n$.
Thus $\{x_n\}$ is a Cauchy sequence, and by completeness of (X, d) it converges to some z ∈ X.
Since n > k implies $x_n \in B(x_k, \varepsilon_k)$, the limit z is contained in $\overline{B(x_k, \varepsilon_k)}$ for each k, thus
$$z \in \bigcap_n \overline{B(x_n, \varepsilon_n)} \subseteq W \cap \bigcap_n U_n,$$
so that $W \cap \bigcap_n U_n$ is non-empty. Since W was an arbitrary non-empty open set, Lemma A.22
gives that $\bigcap_n U_n$ is dense.
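A.24 Corollary Let (X, d) be a complete metric space and $\{C_n\}_{n\in\mathbb{N}}$ a countable family of
closed subsets with empty interior. Then $\bigcup_{n=1}^\infty C_n$ has empty interior.
Proof. The sets $U_n = X\setminus C_n$, n ∈ N, are open and $\overline{U_n} = \overline{X\setminus C_n} = X\setminus C_n^0 = X$ since the
interiors $C_n^0$ are empty. Thus the $U_n$ are dense so that $\bigcap_n U_n$ is dense by Baire’s theorem, thus
$\overline{\bigcap_n X\setminus C_n} = \overline{\bigcap_n U_n} = X$. Thus with $X\setminus\overline{Y} = (X\setminus Y)^0$ we have
$$\Big(\bigcup_n C_n\Big)^0 = \Big(X\setminus \bigcap_n (X\setminus C_n)\Big)^0 = X\setminus \overline{\bigcap_n X\setminus C_n} = \emptyset,$$
i.e. $\bigcup_n C_n$ has empty interior.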
A.25 Remark 1. There are many other ways of stating Baire’s theorem, but most of the
alternative versions introduce additional terminology (nowhere dense sets, meager sets, sets of
first or second category, etc.) that obscures the matter unnecessarily.
2. An intersection $\bigcap_n U_n$ of a countable family $\{U_n\}_{n\in\mathbb{N}}$ of open sets is called a $G_\delta$-set. (And
a countable union of closed sets is called an $F_\sigma$-set.)
3. The proof implicitly used the axiom DCω of countable dependent choice. (Making this
explicit is an instructive but tedious exercise.) Remarkably, the (Zermelo-Fraenkel) axioms of
set theory (without any choice axiom) combined with Baire’s theorem imply DCω, cf. [17].
4. Some results customarily proven using Baire’s theorem can alternatively be proven with-
out it. But in most cases, such alternative proofs will also use the axiom DCω and therefore not
be better from a foundational (reverse mathematics) point of view. See also Remark 8.3.2. 2
A typical application of Baire’s theorem is the following (for a proof see, e.g., [108]):
A.26 Theorem There is a $\|\cdot\|_\infty$-dense $G_\delta$-set F ⊆ C([0, 1], R) such that every f ∈ F is
nowhere differentiable.
Note that a single function f ∈ C([0, 1], R) that is nowhere differentiable can be written down
quite explicitly and constructively, for example $f(x) = \sum_{n=1}^\infty 2^{-n} \cos(2^n x)$. But for proving that
such functions are dense one needs Baire’s theorem (or something related).
104
René-Louis Baire (1874-1932). French mathematician, proved this for Rn in his 1899 doctoral thesis. The gener-
alization is due to Hausdorff (1914).
A.6 On C(X, F)
We recall a few facts from general topology:
A.30 Lemma If X is a compact topological space (not necessarily Hausdorff), C(X, F) with
norm $\|f\| = \sup_{x\in X} |f(x)|$ is a Banach space.
Proof. By compactness of X, every f ∈ C(X, F) is bounded, thus has $\|f\| < \infty$. That the norm
axioms are satisfied is easy enough. If $\{f_n\}$ ⊂ C(X, F) is a Cauchy sequence, so is $\{f_n(x)\}$ for
each x ∈ X. Since F is complete, $g(x) = \lim_{n\to\infty} f_n(x)$ exists for each x. Let ε > 0 and $n_0$ be
such that $n, m \ge n_0$ implies $\sup_{x\in X} |f_n(x) - f_m(x)| = \|f_n - f_m\| < \varepsilon$. Taking m → ∞ we obtain
$\|f_n - g\| = \sup_{x\in X} |f_n(x) - g(x)| \le \varepsilon$ for all $n \ge n_0$. Thus the convergence $f_n(x) \to g(x)$ is
uniform in x. This implies g ∈ C(X, F), as shown in topology. (If x ∈ X and ε > 0, pick n such
that $\|g - f_n\| < \varepsilon/3$. By continuity of $f_n$ there is an open neighborhood U ⊆ X of x such that
$|f_n(x) - f_n(y)| < \varepsilon/3$ for all y ∈ U. Then for all y ∈ U we have
$$|g(x) - g(y)| \le |g(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - g(y)| \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon,$$
so that g is continuous at x. Since this holds for all x ∈ X, g is continuous.) Thus every
Cauchy sequence in C(X, F) converges to an element of C(X, F), proving completeness.
A.31 Theorem (Tietze-Urysohn extension theorem)^105 Let (X, τ) be a normal ($T_4$)
topological space, Y ⊆ X closed and $f \in C_b(Y, \mathbb{R})$. Then there exists $\hat f \in C_b(X, \mathbb{R})$ such that
$\hat f|_Y = f$ and $\|\hat f\| = \|f\|$.
105
H. F. F. Tietze (1880-1964), Austrian mathematician. He proved this for metric spaces (for which Urysohn’s
lemma is a triviality). The generalization to normal spaces is due to Urysohn.
Proof. Let $f \in C_b(Y, \mathbb{R})$, where we may assume $\|f\| = 1$, so that f(Y) ⊆ [−1, 1]. Let $A =
f^{-1}([-1, -1/3])$ and $B = f^{-1}([1/3, 1])$. Then A, B are disjoint closed subsets of Y, which are
also closed in X since Y is closed. Thus by Urysohn’s Lemma, there is a g ∈ C(X, [−1/3, 1/3])
such that $g\restriction A = -1/3$ and $g\restriction B = 1/3$. Thus $\|g\|_X = 1/3$, and with $Tg = g|_Y$ one easily checks
(do it!) that $\|Tg - f\|_Y \le 2/3$. Now Lemma 7.3 is applicable with m = 1/3 and r = 2/3 and
gives the existence of $\hat f \in C(X, \mathbb{R})$ with $T\hat f = f$ and $\|\hat f\| = \|f\|$ (since m/(1 − r) = 1).
A.32 Theorem Let f ∈ C([a, b], F) and ε > 0. Then there exists a polynomial P ∈ F[x] such
that |f (x) − P (x)| ≤ ε for all x ∈ [a, b]. (As always, F ∈ {R, C}.)
Proof. It clearly suffices to prove this for the interval [0, 1]. For n ∈ N and x ∈ [0, 1], define
$$P_n(x) = \sum_{k=0}^n f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1 - x)^{n-k}.$$
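The polynomials $P_n$ just defined (the Bernstein polynomials of f) are easy to evaluate numerically. The following sketch is an illustration under stated assumptions, not part of the proof: the helper names `bernstein` and `sup_error`, the test function `kink` (x ↦ |x − 1/2|) and the 201-point grid used to estimate the sup-norm are all choices made for the example.

```python
from math import comb

def bernstein(f, n):
    """Return the n-th Bernstein polynomial P_n of f as a callable on [0, 1]."""
    coeffs = [f(k / n) * comb(n, k) for k in range(n + 1)]

    def P(x):
        return sum(c * x**k * (1 - x)**(n - k) for k, c in enumerate(coeffs))

    return P

def sup_error(f, n, grid=201):
    """Estimate sup |f - P_n| on [0, 1] by the maximum over a uniform grid."""
    P = bernstein(f, n)
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(f(x) - P(x)) for x in points)

def kink(x):
    # Continuous but not differentiable at 1/2: a slow case for Bernstein polynomials.
    return abs(x - 0.5)

errors = {n: sup_error(kink, n) for n in (10, 100)}
```

For this test function the estimated error shrinks as n grows, but only slowly, which matches the 1/(nδ²) type of bound obtained in the proof rather than a fast rate.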
we have
$$f(x) - P_n(x) = \sum_{k=0}^n \Big(f(x) - f\Big(\frac{k}{n}\Big)\Big) \binom{n}{k} x^k (1 - x)^{n-k},$$
thus
$$|f(x) - P_n(x)| \le \sum_{k=0}^n \Big|f(x) - f\Big(\frac{k}{n}\Big)\Big| \binom{n}{k} x^k (1 - x)^{n-k}. \tag{A.2}$$
Since [0, 1] is compact and f : [0, 1] → F is continuous, it is bounded and uniformly continuous.
Thus there is M such that |f (x)| ≤ M for all x, and for each ε > 0 there is δ > 0 such that
|x − y| < δ ⇒ |f (x) − f (y)| < ε.
Let ε > 0 be given, and choose a corresponding δ > 0 as above. Let x ∈ [0, 1]. Define
$$A = \Big\{ k \in \{0, 1, \ldots, n\} \ \Big|\ \Big|\frac{k}{n} - x\Big| < \delta \Big\}.$$
106
Karl Theodor Wilhelm Weierstrass (1815-1897). German mathematician and one of the fathers of rigorous analysis.
107
Sergei Natanovich Bernstein (1880-1968). Russian/Soviet mathematician. Important contributions to approxima-
tion theory, probability, PDEs.
For all k we have |f (x) − f (k/n)| ≤ 2M , and for k ∈ A we have |f (x) − f (k/n)| < ε. Thus with
(A.2) we have
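$$\begin{aligned}
|f(x) - P_n(x)| &\le \varepsilon \sum_{k\in A} \binom{n}{k} x^k (1 - x)^{n-k} + 2M \sum_{k\in A^c} \binom{n}{k} x^k (1 - x)^{n-k} \\
&\le \varepsilon + 2M \sum_{k\in A^c} \binom{n}{k} x^k (1 - x)^{n-k},
\end{aligned} \tag{A.3}$$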
where we used (A.1) again. In an exercise, we will prove the purely algebraic identity
$$\sum_{k=0}^n \binom{n}{k} x^k (1 - x)^{n-k} (k - nx)^2 = nx(1 - x) \tag{A.4}$$
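for all $n \in \mathbb{N}_0$ and x ∈ [0, 1] (in fact all x ∈ R). Now, $k \in A^c$ is equivalent to $|\frac{k}{n} - x| \ge \delta$ and
to $(k - nx)^2 \ge n^2\delta^2$. Multiplying both sides of the latter inequality by $\binom{n}{k} x^k (1 - x)^{n-k}$ and
summing over $k \in A^c$, we have
$$\begin{aligned}
n^2\delta^2 \sum_{k\in A^c} \binom{n}{k} x^k (1 - x)^{n-k} &\le \sum_{k\in A^c} \binom{n}{k} x^k (1 - x)^{n-k} (k - nx)^2 \\
&\le \sum_{k=0}^n \binom{n}{k} x^k (1 - x)^{n-k} (k - nx)^2 = nx(1 - x),
\end{aligned} \tag{A.5}$$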
where we used the obvious inequality x(1 − x) ≤ 1 for x ∈ [0, 1]. Plugging (A.6) into (A.3)
we have $|f(x) - P_n(x)| \le \varepsilon + \frac{2M}{n\delta^2}$. This holds for all x ∈ [0, 1] since, by uniform continuity, δ
A.34 Corollary There exists a sequence $\{p_n\}_{n\in\mathbb{N}} \subseteq \mathbb{R}[x]$ of real polynomials that converges
uniformly on [0, 1] to the function $x \mapsto \sqrt{x}$.
The above corollary can also be proven directly:
$$p_{n+1}(x) = p_n(x) + \frac{x - p_n(x)^2}{2}. \tag{A.7}$$
Prove by induction that the following holds:
(i) $p_n(x) \le \sqrt{x}$ for all $n \in \mathbb{N}_0$, x ∈ [0, 1].
(ii) The sequence $\{p_n(x)\}$ increases monotonically for each x ∈ [0, 1] and converges uniformly
to $\sqrt{x}$.
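Claims (i) and (ii) can be explored numerically before proving them. The sketch below iterates the recursion (A.7) pointwise on a grid; the starting polynomial p_0 = 0 is an assumption (it is the usual choice, but the start of the recursion is not visible in the text above), and the names `sqrt_iterates`, `table` and `sup_errors` are hypothetical.

```python
from math import sqrt

def sqrt_iterates(x, steps):
    """Return [p_0(x), ..., p_steps(x)] for p_{n+1} = p_n + (x - p_n^2)/2, with p_0 = 0 (assumed)."""
    values = [0.0]
    for _ in range(steps):
        p = values[-1]
        values.append(p + (x - p * p) / 2)
    return values

grid = [i / 100 for i in range(101)]
steps = 200
table = {x: sqrt_iterates(x, steps) for x in grid}

# Grid estimate of sup_x (sqrt(x) - p_n(x)) for each n; it decreases to 0 if (ii) holds.
sup_errors = [max(sqrt(x) - table[x][n] for x in grid) for n in range(steps + 1)]
```

Each p_n is a polynomial in x because the recursion only uses sums and products; the code evaluates these polynomials at grid points instead of expanding them. The convergence is visibly slowest near x = 0, which is why uniform convergence needs an argument beyond pointwise monotonicity.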
A.6.3 The Stone-Weierstrass theorem
Theorem A.32 says that the polynomials, restricted to [0, 1], are uniformly dense in C([0, 1]).
Our aim is to generalize this, replacing [0, 1] by (locally) compact Hausdorff spaces.
In order to see what should take the place of polynomials, notice that a polynomial on R is
a linear combination of powers $x^n$, and the latter can be seen as powers $f^n$ (under pointwise
multiplication) of the identity function $f = \mathrm{id}_{\mathbb{R}}$. Thus the polynomials are the unital subalgebra
P ⊆ C(R, R) generated by the single element $\mathrm{id}_{\mathbb{R}}$. Now, if X is a topological space and F ∈
{R, C} then C(X, F) is a unital algebra, and we will consider subalgebras (not necessarily singly
generated) A ⊆ C(X, F). Since the continuous functions on a (locally) compact Hausdorff space separate
points, we clearly need to impose the following if we want to prove $\overline{A} = C(X, F)$:
A.37 Theorem (M. H. Stone 1937)^108 If X is compact Hausdorff and A ⊆ C(X, R) is a
unital subalgebra separating points then $\overline{A} = C(X, \mathbb{R})$.
Proof. Replacing A by $\overline{A}$, the claim is equivalent to showing that A = C(X, R). We proceed in
several steps. We claim that f ∈ A implies |f| ∈ A. Since f is bounded due to compactness, it
clearly is enough to prove this under the assumption |f| ≤ 1. With the $p_n$ of Corollary A.34,
we have $(x \mapsto p_n(f(x)^2)) \in A$ since A is a unital algebra. Since $p_n \circ f^2$ converges uniformly to
$\sqrt{f^2} = |f|$, closedness of A implies |f| ∈ A. In view of
$$\max(f, g) = \frac{f + g + |f - g|}{2}, \qquad \min(f, g) = \frac{f + g - |f - g|}{2},$$
and the preceding result, we see that f, g ∈ A implies min(f, g), max(f, g) ∈ A. By induction,
this extends to pointwise minima/maxima of finite families of elements of A.
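The step from polynomials to lattice operations can be checked numerically on real numbers in place of functions. The sketch below is an illustration only: it assumes the recursion $p_{k+1} = p_k + (x - p_k^2)/2$ with start $p_0 = 0$ as the source of the polynomials $p_n$ (the proof only needs some sequence as in Corollary A.34), and the names `p`, `approx_abs`, `approx_max`, `approx_min` are hypothetical.

```python
def p(n, x):
    """n-th iterate of p_{k+1} = p_k + (x - p_k^2)/2 with p_0 = 0 (assumed start); approximates sqrt(x) on [0, 1]."""
    value = 0.0
    for _ in range(n):
        value += (x - value * value) / 2
    return value

def approx_abs(n, t):
    """Polynomial approximation p_n(t^2) of |t|, meant for |t| <= 1."""
    return p(n, t * t)

def approx_max(n, a, b):
    # max(a, b) = (a + b + |a - b|) / 2 with |.| replaced by its polynomial approximation
    return (a + b + approx_abs(n, a - b)) / 2

def approx_min(n, a, b):
    # min(a, b) = (a + b - |a - b|) / 2 with |.| replaced by its polynomial approximation
    return (a + b - approx_abs(n, a - b)) / 2

n = 400
# Sample pairs (a, b) in [-1/2, 1/2]^2, so that |a - b| <= 1 as the approximation requires.
samples = [(i / 20 - 0.5, j / 20 - 0.5) for i in range(21) for j in range(21)]
worst = max(
    max(abs(approx_max(n, a, b) - max(a, b)), abs(approx_min(n, a, b) - min(a, b)))
    for a, b in samples
)
```

Applied pointwise to functions f, g with |f − g| ≤ 1, the same formulas express approximations of max(f, g) and min(f, g) as polynomials in f and g, hence as elements of the algebra A.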
Now let f ∈ C(X, R). Our goal is to find $f_\varepsilon \in A$ satisfying $\|f - f_\varepsilon\| < \varepsilon$ for each ε > 0.
Since A is closed, this will give A = C(X, R).
If a ≠ b, the fact that A separates points gives us an h ∈ A such that h(a) ≠ h(b). Thus
the function $h_{a,b}(x) = \frac{h(x) - h(a)}{h(b) - h(a)}$ is in A, continuous and satisfies $h_{a,b}(a) = 0$, $h_{a,b}(b) = 1$. Thus also
$f_{a,b}(x) = f(a) + (f(b) - f(a)) h_{a,b}(x)$ is in A, and it satisfies $f_{a,b}(a) = f(a)$ and $f_{a,b}(b) = f(b)$.
This implies that the sets
$$U_{a,b,\varepsilon} = \{x \in X \mid f_{a,b}(x) < f(x) + \varepsilon\}, \qquad V_{a,b,\varepsilon} = \{x \in X \mid f_{a,b}(x) > f(x) - \varepsilon\}$$
are open neighborhoods of a and b, respectively, for every ε > 0. Thus keeping b, ε fixed,
$\{U_{a,b,\varepsilon}\}_{a\in X}$ is an open cover of X, and by compactness we find a finite subcover $\{U_{a_i,b,\varepsilon}\}_{i=1}^n$.
By the above preparation, the function $f_{b,\varepsilon} = \min(f_{a_1,b}, \ldots, f_{a_n,b})$ is in A. If $x \in U_{a_i,b,\varepsilon}$
then $f_{b,\varepsilon}(x) \le f_{a_i,b}(x) < f(x) + \varepsilon$, and since $\{U_{a_i,b,\varepsilon}\}_{i=1}^n$ covers X, we have
$f_{b,\varepsilon}(x) < f(x) + \varepsilon$ ∀x. For all $x \in V_{b,\varepsilon} = \bigcap_{i=1}^n V_{a_i,b,\varepsilon}$, we have $f_{a_i,b}(x) > f(x) - \varepsilon$ for all i, and therefore
$f_{b,\varepsilon}(x) = \min_i f_{a_i,b}(x) > f(x) - \varepsilon$. Now $\{V_{b,\varepsilon}\}_{b\in X}$ is an open cover of X, and we find a finite
subcover $\{V_{b_j,\varepsilon}\}_{j=1}^m$. Then $f_\varepsilon = \max(f_{b_1,\varepsilon}, \ldots, f_{b_m,\varepsilon})$ is in A. Now $f_\varepsilon(x) = \max_j f_{b_j,\varepsilon}(x) < f(x) + \varepsilon$
holds everywhere, and for $x \in V_{b_j,\varepsilon}$ we have $f_\varepsilon(x) \ge f_{b_j,\varepsilon}(x) > f(x) - \varepsilon$. Since $\{V_{b_j,\varepsilon}\}_j$ covers X,
we conclude that $f_\varepsilon(x) \in (f(x) - \varepsilon, f(x) + \varepsilon)$ for all x, to wit $\|f - f_\varepsilon\| < \varepsilon$.
108
Marshall Harvey Stone (1903-1989). American mathematician, mostly active in topology and (functional) analysis.
Since the polynomial ring R[x] is an algebra, and the polynomials clearly separate the points
of R, Theorem A.37 recovers Theorem A.32. (This is not circular if one has used Exercise A.35 to
prove Corollary A.34.) But we immediately have the higher dimensional generalization (which
can also be proven by more classical methods, like approximate units):
(ii) If (Y, d) is totally bounded and Y ⊆ X is dense then (X, d) is totally bounded.
(iii) If (X, d) is complete and Y ⊆ X then (Y, d) is totally bounded if and only if Y is precom-
pact.
If (X, τ ) is a topological space and (Y, d) metric, the set Cb (X, Y ) is topologized by the
metric
$$D(f, g) = \sup_{x\in X} d(f(x), g(x)).$$
It is therefore natural to ask whether the (relative) compactness of a set F ⊆ Cb (X, Y ) can
be characterized in terms of the elements of F, which after all are functions f : X → Y .
This will be the subject of this section, but we will restrict ourselves to compact X, for which
C(X, Y ) = Cb (X, Y ).
A.44 Definition Let (X, τ) be a topological space and (Y, d) a metric space. A family F
of functions X → Y is called equicontinuous if for every x ∈ X and ε > 0 there is an open
neighborhood U ∋ x such that f ∈ F, x′ ∈ U ⇒ d(f(x), f(x′)) < ε. Then F ⊆ C(X, Y).
The point of course is that the choice of U depends only on x and ε, but not on f ∈ F.
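A small numerical experiment makes the role of uniformity in f visible. The sketch below compares two families on X = [0, 1] with Y = R; both families, the truncation to 2000 members, the sampling scheme and the helper name `oscillation` are choices made for illustration only. The powers x ↦ x^n are each continuous but fail equicontinuity at x = 1, while the shifted sines x ↦ sin(x + n) are all 1-Lipschitz and hence equicontinuous.

```python
from math import sin

def oscillation(family, x, delta, samples=40):
    """Estimate sup over f in family and |x' - x| <= delta (x' in [0, 1]) of |f(x) - f(x')|."""
    points = [x + delta * (2 * i / samples - 1) for i in range(samples + 1)]
    points = [t for t in points if 0.0 <= t <= 1.0]
    return max(abs(f(x) - f(t)) for f in family for t in points)

# Finite truncations of the two families (2000 members each).
powers = [lambda t, n=n: t ** n for n in range(1, 2001)]
shifted_sines = [lambda t, n=n: sin(t + n) for n in range(1, 2001)]

delta = 0.01
osc_powers = oscillation(powers, 1.0, delta)        # stays close to 1 however small delta is
osc_sines = oscillation(shifted_sines, 1.0, delta)  # bounded by delta (Lipschitz constant 1)
```

For the powers, shrinking delta does not help as long as arbitrarily large n are allowed, which is exactly the failure of a single neighborhood U working for all f in the family.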
A.45 Theorem (Arzelà-Ascoli) 109 Let (X, τ ) be a compact topological space and (Y, d) a
complete metric space. Then F ⊆ C(X, Y ) is (pre)compact (w.r.t. the uniform topology τD ) if
and only if the following conditions are satisfied:
(i) {f (x) | f ∈ F} ⊆ Y is (pre)compact for every x ∈ X.
(ii) F is equicontinuous.
Proof. ⇒ If f, g ∈ C(X, Y) then d(f(x), g(x)) ≤ D(f, g) for every x ∈ X. This implies that the
evaluation map $e_x : C(X, Y) \to Y$, f ↦ f(x) is continuous for every x. Thus if F is compact, so
is $e_x(F)$. And compactness of $\overline{F}$ implies that $e_x(\overline{F}) = \{f(x) \mid f \in \overline{F}\}$ is compact, thus closed.
Since $e_x(\overline{F})$ contains $e_x(F)$, also $\overline{e_x(F)} \subseteq e_x(\overline{F})$ is compact.
To prove equicontinuity, let x ∈ X and ε > 0. Since $\overline{F}$ is compact, F is totally bounded,
thus there are $g_1, \ldots, g_n \in F$ such that $F \subseteq \bigcup_i B^D(g_i, \varepsilon)$. By continuity of the $g_i$, there are
open $U_i \ni x$, i = 1, . . . , n, such that $x' \in U_i \Rightarrow d(g_i(x), g_i(x')) < \varepsilon$. Put $U = \bigcap_i U_i$. If now
f ∈ F, there is an i such that $f \in B^D(g_i, \varepsilon)$, to wit $D(f, g_i) < \varepsilon$. Now for $x' \in U \subseteq U_i$ we have
$$d(f(x), f(x')) \le d(f(x), g_i(x)) + d(g_i(x), g_i(x')) + d(g_i(x'), f(x')) < 3\varepsilon,$$
A.46 Lemma Let (X, d) be a metric space. Assume that for each ε > 0 there are a δ > 0, a
metric space (Y, d′) and a continuous map h : X → Y such that (h(X), d′) is totally bounded
and such that d′(h(x), h(x′)) < δ implies d(x, x′) < ε. Then (X, d) is totally bounded.
Proof. For ε > 0, pick δ, (Y, d′), h as asserted. Since h(X) is totally bounded, there are
$y_1, \ldots, y_n \in h(X)$ such that $h(X) \subseteq \bigcup_i B(y_i, \delta) \subseteq Y$. Then $X = \bigcup_i h^{-1}(B(y_i, \delta))$. For each i
choose $x_i \in X$ such that $h(x_i) = y_i$. Now $x \in h^{-1}(B(y_i, \delta)) \Rightarrow d'(h(x), y_i) < \delta \Rightarrow d(x, x_i) < \varepsilon$,
so that $h^{-1}(B(y_i, \delta)) \subseteq B(x_i, \varepsilon)$. Thus $X = \bigcup_{i=1}^n B(x_i, \varepsilon)$, and (X, d) is totally bounded.
109
Giulio Ascoli (1843-1896), Cesare Arzelà (1847-1912), Italian mathematicians. They proved special cases of this
result, of which there also exist more general versions than the one above.
Let ε > 0. Since F is equicontinuous, for every x ∈ X there is an open neighborhood $U_x$
such that $f \in F,\ x' \in U_x \Rightarrow d(f(x), f(x')) < \varepsilon$. Since X is compact, there are $x_1, \ldots, x_n \in X$
such that $X = \bigcup_{i=1}^n U_{x_i}$. Now define $h : F \to Y^{\times n}$, $f \mapsto (f(x_1), \ldots, f(x_n))$. Now
$\tilde d((y_1, \ldots, y_n), (y_1', \ldots, y_n')) = \sum_i d(y_i, y_i')$ is a product metric on $Y^{\times n}$ making h continuous. By
assumption $\overline{\{f(x) \mid f \in F\}}$ is compact for each x ∈ X, thus $h(F) \subseteq \prod_i \overline{\{f(x_i) \mid f \in F\}} \subseteq Y^{\times n}$
$$d(f(x), g(x)) \le d(f(x), f(x_i)) + d(f(x_i), g(x_i)) + d(g(x_i), g(x)) < 3\varepsilon.$$
Since this holds for all x ∈ X, we have D(f, g) ≤ 3ε. Thus the assumptions of Lemma A.46 are
satisfied, and we obtain total boundedness, thus precompactness, of F.
A.47 Remark 1. If $Y = \mathbb{R}^n$, as in most statements of the theorem, then in view of the Heine-
Borel theorem the requirement of precompactness of {f(x) | f ∈ F} for each x reduces to
that of boundedness, i.e. pointwise boundedness of F. One can also formulate the theorem in
terms of existence of uniformly convergent (or Cauchy) subsequences of bounded equicontinuous
sequences in $C(X, \mathbb{R}^n)$.
2. We intentionally stated a more general version [72] of the theorem than needed in order to
argue that the result belongs to general topology rather than functional analysis. For $Y = \mathbb{R}^n$
this is less clear, also since there are many alternative proofs of the theorem using various
methods from topology and functional analysis, cf. e.g. [111]. □
$S = \{(n, m) \in \mathbb{N}^2 \mid \overline{U_n} \subseteq U_m\}$.
For every (n, m) ∈ S use Urysohn’s Lemma to find a function $f_{(n,m)} \in C(X, [0, 1])$ such that
$f_{(n,m)}\restriction \overline{U_n} = 0$, $f_{(n,m)}\restriction X\setminus U_m = 1$. Let x, y ∈ X, x ≠ y. Since X\{y} is an open neighborhood
of x and B is a base, there exists m ∈ N with $x \in U_m \subseteq X\setminus\{y\}$. By normality of X there exists an
open V such that $x \in V \subseteq \overline{V} \subseteq U_m$. Since B is a base there exists n ∈ N with $x \in U_n \subseteq V$.
Now $\overline{U_n} \subseteq \overline{V} \subseteq U_m$, so that (n, m) ∈ S. Now $f_{(n,m)}(x) = 0$, $f_{(n,m)}(y) = 1$, so that the family
$F_1 = \{f_{(n,m)} \mid (n, m) \in S\} \subset C(X, [0, 1])$ separates the points of X.
Let $F_2$ denote the set, clearly countable, of all finite products of elements of $F_1$. Interpreting
the empty product as the function 1, we have $1 \in F_2$. Then also the set $F_3$ of finite linear
combinations of elements of $F_2$ with Q-coefficients is countable. Since $A = \overline{F_3}$ contains the
finite linear combinations of elements of $F_2$ with coefficients in R, it is a unital R-algebra.
Since already $F_1$ separates the points of X, the same holds for A. Thus the Stone-Weierstrass
Theorem A.37 gives $C(X, \mathbb{R}) = A = \overline{F_3}$. Thus C(X, R) has $F_3$ as countable dense subset.
(ii)⇒(i) Since (C(X, R), k · k) is metric, second countability and separability are equivalent.
Let F ⊆ C(X, R) be a subset that is dense w.r.t. k · k. Let x, y ∈ X, x 6= y. We claim that there
is an f ∈ F with f (x) 6= f (y) (i.e. F separates the points of X). If this was false, the uniform
density of F in C(X, R) would imply f(x) = f(y) for all f ∈ C(X, R), which however is false
by Urysohn’s lemma. The map
$$\iota_F : X \to \prod_{f\in F} [\inf f, \sup f], \qquad x \mapsto \prod_{f\in F} f(x)$$
is continuous by definition of the product topology, and it is injective since F separates points.
Since X is compact and the product space Hausdorff, $\iota_F$ is an embedding, i.e. $\iota_F : X \to \iota_F(X)$
is a homeomorphism. If now F is countable, the countable product $\prod_{f\in F} [\inf f, \sup f]$ is second
countable, thus also its subspace $\iota_F(X)$ which is homeomorphic to X.
A.49 Remark 1. If X is locally compact Hausdorff, it is most natural to consider C0 (X, F).
With the one-point (Alexandrov) compactification $X_\infty$, one easily proves a Banach space
isomorphism $C(X_\infty, F) \cong C_0(X, F) \oplus F$, so that $C_0(X, F)$ is separable if and only if $X_\infty$ is second
countable. Second countability of $X_\infty$ implies that of X. The converse is also true, but is more
work since it involves proving that ∞ ∈ X∞ has a countable open neighborhood base. This is
equivalent to X being hemicompact, i.e. there is a family {Kn ⊆ X}n∈N of compact sets such
that every compact K ⊆ X is contained in some Kn . One can show that for a locally compact
Hausdorff space, hemicompactness is equivalent to second countability, cf. e.g. [108]. Thus also
for locally compact X one has that C0 (X, F) is separable if and only if X is second countable.
2. For non-compact X, one can also study Cb (X, F). At least if X is completely regular,
it turns out that Cb (X, F) is never separable for non-compact X. (For compact X we have
Cb (X, F) = C(X, F) and are back in the situation of Proposition A.48.) 2
A.50 Definition A topological space X is completely regular or $T_{3.5}$ if it is $T_1$ and for every
closed C ⊆ X and y ∈ X\C there exists f ∈ C(X, [0, 1]) such that $f\restriction C = 0$ and f(y) = 1.
All subspaces of a completely regular space are completely regular. By Urysohn’s lemma,
every normal space is completely regular, in particular every metrizable and every compact
Hausdorff space. This implies that complete regularity is a necessary condition for a space X
to have a compactification $\widehat{X}$ that is Hausdorff. In fact, it also is sufficient:
A.51 Theorem Let X be a topological space. Then the following are equivalent:
(i) X is completely regular.
(ii) There exists a compact Hausdorff space βX together with a dense embedding X ↪ βX
such that for every continuous function f : X → Y, where Y is compact Hausdorff, there
exists a continuous $\hat f : \beta X \to Y$ such that $\hat f\restriction X = f$. (This $\hat f$ is automatically unique by
density of X ⊆ βX.)
A.52 Remark 1. The universal property (ii) implies that βX is unique up to homeomorphism.
‘It’ is called the Stone-Čech^110 compactification of X.
2. If X is completely regular and F ∈ {R, C} then the restriction map $C(\beta X, F) \to C_b(X, F)$
given by $f \mapsto f\restriction X$ is a bijection and an isometric isomorphism of Banach algebras.
3. There are many ways to prove the non-trivial implication (i)⇒(ii). We sketch two,
referring to the literature for details.
(A) Define $Z = [0, 1]^{C(X,[0,1])} = \prod_{f\in C(X,[0,1])} [0, 1]$ with the product topology, which is compact
Hausdorff. The map $\iota_X : X \to Z$, $x \mapsto \prod_{f\in C(X,[0,1])} f(x)$ is continuous and injective since
the continuous functions on a completely regular space separate points. Using that they also
separate points from closed sets, one proves that $\iota_X$ is an embedding, thus a homeomorphism
$X \to \iota_X(X)$. Let $\beta X = \overline{\iota_X(X)} \subseteq Z$, which is compact Hausdorff. Now we can identify X
with the dense subspace $\iota_X(X)$ of βX. By construction, for every f ∈ C(X, [0, 1]) we have
$p_f \circ \iota_X = f$, where $p_f$ is the projection Z → [0, 1] indexed by f. Thus $p_f\restriction \beta X$ extends f to βX.
This generalizes to any compact Hausdorff space Y instead of [0, 1] using the fact that every
compact Hausdorff space is homeomorphic to a closed subset of a cube (which is proven as in
the proof of (ii)⇒(i) of Proposition A.48, but now taking F = C(X, [0, 1])).
(B) Alternatively, one can use Gelfand duality for commutative $C^*$-algebras, cf. Section 19: If
X is completely regular, $A = C_b(X, \mathbb{C})$ with norm $\|f\| = \sup_x |f(x)|$ is a commutative unital
$C^*$-algebra. As such it has a spectrum Ω(A), which is compact Hausdorff. We define βX = Ω(A).
There is a map ι : X → Ω(A), $x \mapsto \varphi_x$, where $\varphi_x(f) = f(x)$. This map is continuous by
definition of the topology on Ω(A). Using the complete regularity of X one proves that ι is
an embedding, i.e. a homeomorphism of X onto ι(X) ⊆ Ω(A). Now $\overline{\iota(X)} = \Omega(A)$ is seen as
follows: $\overline{\iota(X)} \neq \Omega(A)$ would imply (using Urysohn or Tietze) that there are f ∈ A\{0} such
that ι(x)(f) = 0 for all x ∈ X. This is a contradiction, since the elements of A are functions on
X, so that ι(x)(f) = 0 ∀x implies f = 0.
4. The first of the above constructions used the Tychonov theorem, but only for Hausdorff
spaces. The second approach relies on Alaoglu’s theorem to prove compactness of Ω(A). One
can show that Alaoglu’s theorem and the restriction of Tychonov’s theorem to Hausdorff spaces
are equivalent over the ZF axioms, see Section B.5. In fact, also Theorem A.51 is in this
equivalence class.
5. If X is completely regular and non-compact, one can prove, cf. e.g. [51, 3.6.17], that no
point x ∈ βX\X has a countable open neighborhood basis, so that βX is not second countable.
Then $C_b(X, F) \cong C(\beta X, F)$ is not separable. Not assuming complete regularity, not much can be
said since one can find topological spaces – even regular ($T_3$) ones – on which every continuous
R-valued function is constant, in which case $C_b(X, F) \cong F$, see [51, 2.7.17, 2.7.18]. □
It is very easy to see that the intersection of any number of σ-algebras on X is a σ-algebra
on X. Thus if F ⊆ P (X) is any family of subsets of X, we can define the σ-algebra generated
by F as the intersection of all σ-algebras on X that contain F.
If (X, τ ) is a topological space, the σ-algebra on X generated by τ is called the Borel111
σ-algebra B(X) of X. (We should of course write B(X, τ ). . . ) Apart from the open sets, it
contains the closed sets, the $G_\delta$ sets and many more. A function f : X → C is called Borel
measurable if $f^{-1}(U) \in B(X)$ for every open U ⊆ C. (This is equivalent to $f^{-1}(B) \in B(X)$
for every B ∈ B(C).) If (X, A) is a measurable space, $B^\infty(X, \mathbb{C})$ denotes the set of functions
f : X → C that are Borel-measurable and bounded, i.e. $\sup_{x\in X} |f(x)| < \infty$. It is not hard to
check that this is an algebra (with the pointwise product).
B ? Supplements for the curious
B.1 Functional analysis over fields other than R and C?
The most general meaningful definition of (linear) functional analysis is as the theory of topo-
logical vector spaces over a topological field F and continuous linear maps between them. If
(the topology of) F is discrete, we are effectively doing topological abelian group theory, and
this would not be considered functional analysis. Thus we restrict ourselves to non-discrete
topological fields. The general theory of topological fields is a thorny subject, almost unknown
to non-specialists. (For reviews see [175, 167].) There would be no point in going into this
here since in this course we considered general topological vector spaces only as a step towards
spaces that are at least metrizable.
But there is a complete (in a sense) classification of the non-discrete locally compact fields.
In characteristic zero, these are precisely R, the p-adic fields Qp , where p runs through the prime
numbers, and all their finite (thus algebraic) extensions. (And in characteristic p ≠ 0 one has
the finite extensions of Fp ((x)), the field of formal Laurent series over the finite field Fp .) For
a proof see e.g. [126]. While R has only one algebraic extension (namely C), Qp has infinitely
many finite extensions, so that the algebraic closure of Qp (which is not complete!) is infinite-
dimensional over Qp . Like R and C, the p-adic fields and their finite extensions all have a norm,
usually called ‘valuation’ or ‘absolute value’, i.e. a map F → [0, ∞) satisfying |x| = 0 ⇔ x = 0,
|x + y| ≤ |x| + |y| and |xy| = |x||y|. Note that the norm is strictly multiplicative, not just
submultiplicative. The locally compact fields are complete w.r.t. their absolute value | · |.
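The stated properties of a p-adic absolute value can be tested on rational numbers, which form a dense subfield of $\mathbb{Q}_p$. The sketch below assumes the usual normalization $|p|_p = 1/p$ and only handles rationals; the function names `padic_valuation` and `padic_abs` are hypothetical. Besides multiplicativity it also checks the strong triangle inequality $|x + y| \le \max(|x|, |y|)$, which holds for p-adic absolute values and implies the ordinary one.

```python
from fractions import Fraction

def padic_valuation(q, p):
    """Exponent of the prime p in the nonzero rational q (may be negative)."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("the valuation of 0 is +infinity")
    v, num, den = 0, q.numerator, q.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def padic_abs(q, p):
    """p-adic absolute value |q|_p = p^(-v_p(q)), with |0|_p = 0 (normalization |p|_p = 1/p assumed)."""
    q = Fraction(q)
    if q == 0:
        return Fraction(0)
    return Fraction(p) ** (-padic_valuation(q, p))

p = 5
samples = [Fraction(50), Fraction(3, 25), Fraction(-7, 10), Fraction(125, 4), Fraction(1)]
multiplicative = all(
    padic_abs(x * y, p) == padic_abs(x, p) * padic_abs(y, p)
    for x in samples for y in samples
)
ultrametric = all(
    padic_abs(x + y, p) <= max(padic_abs(x, p), padic_abs(y, p))
    for x in samples for y in samples
)
```

Note that the values of $|\cdot|_p$ on nonzero rationals are integer powers of p, which illustrates why the set of absolute values of scalars can be much smaller than [0, ∞).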
Books entitled ‘Functional analysis’ or ‘Topological vector spaces’ tend to work entirely over
R and C unless the title contains ‘p-adic’, ‘non-archimedean’ or ‘ultrametric’ (but there are
exceptions like [21, 112]). Nevertheless, functional analysis over p-adic fields is a well-studied
subject, cf. e.g. [138, 125], but a somewhat exotic one since it only seems to have applications
to number theory, algebraic geometry and related fields.113
In the remainder of this short section we briefly comment on the extent to which the theory
covered in these notes remains valid over p-adic fields. As a rule of thumb, one must be very
careful with theorems on normed/Banach spaces that involve R or C either in their statement
or in the proof since then either the orderedness of R or the algebraic completeness of C tend to
be used, while the p-adic fields are neither algebraically closed nor orderable! The Hahn-Banach
theorem is a case in point since we first proved it for R, making essential use of the orderedness
of the base field F = R, thus not just of the set [0, ∞) in which the norms take values, and
then extended it to C. (There nevertheless is a p-adic Hahn-Banach theorem, but with slightly
different hypotheses and a different proof.)
Theorems not explicitly referring to R or C have a better chance of carrying over to p-adic
functional analysis. For example, the open mapping theorem and both versions of the uniform
boundedness theorem generalize without change. However, one has to be careful with the above
rule since there are properties, like connectedness, shared by R and C, but not enjoyed by the
p-adic fields! There are other problems: There is no a priori relationship between the subsets
S_1 = {|c| : c ∈ F} and S_2 = {‖x‖ : x ∈ V} of [0, ∞). Thus given x ∈ V\{0} there may not be a
c ∈ F such that ‖cx‖ = 1.
We also have to be very careful with results on Hilbert spaces, since scalars in F can be
pulled out of inner products without picking up an absolute value: ⟨cx, y⟩ = c⟨x, y⟩. Indeed
this leads to problems adapting the proof of Theorem 5.27. The same holds for the polarization
identities.
¹¹³ The author is skeptical about claims of relevance of p-adic/ultrametric (functional) analysis to fundamental
theoretical/mathematical physics (but statistical/condensed matter physics is another discussion).
We leave the discussion here and refer to the literature on p-adic (functional) analysis for
more information. See e.g. [60, 136, 138, 125].
for all choices of $d_i \in \{0, 1\}$. Thus all subseries of
$\sum_{i=n_1}^\infty x_i = \sum_{k=1}^\infty \sum_{i=n_k}^{n_{k+1}-1} x_i$ converge, so
that $\sum_{i=1}^\infty x_i$ converges unconditionally by Theorem A.4(vii)⇒(iii).
B.3 Corollary In every infinite-dimensional Banach space there are series that converge un-
conditionally, but not absolutely.
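For a concrete instance of the corollary in the Hilbert space ℓ²(N), one may take x_n = e_n/n, where e_n denotes the n-th standard unit vector. This specific series is an illustration chosen here, not taken from the notes. Since the terms are mutually orthogonal, the norm of any finite sub-sum is the square root of a piece of the convergent series Σ 1/n², while the norms ‖x_n‖ = 1/n form the divergent harmonic series. A small numerical sketch:

```python
import math

def sub_sum_norm(indices):
    """l^2 norm of the finite sum of e_n / n over the distinct n in `indices`.

    The terms are mutually orthogonal, so the squared norm is the sum of 1/n^2.
    """
    return math.sqrt(sum(1.0 / n**2 for n in set(indices)))

def sum_of_norms(N):
    """Partial sum of the norms ||e_n / n|| = 1/n for n = 1..N (harmonic numbers)."""
    return sum(1.0 / n for n in range(1, N + 1))

# Any finite sub-sum taken far out is small, regardless of the order of its terms ...
far_out = sub_sum_norm(range(1000, 2000))
# ... whereas the partial sums of the norms keep growing (like log N).
growth = [sum_of_norms(10**k) for k in range(1, 6)]
```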
Proof of Proposition B.1. We can clearly assume $\dim V = n^2$. By Proposition 3.10, there
exists an Auerbach basis $\{x_i\}$ for V, i.e. $\|x_i\| = \|\varphi_i\| = 1$ for all i, where
$\{\varphi_i\} \subseteq V^*$ is the unique dual basis. Define an inner product on V by
$\langle x, y\rangle = n^2 \sum_{i=1}^{n^2} \varphi_i(x)\overline{\varphi_i(y)}$,
which induces the norm $\|x\|' = n \big(\sum_{i=1}^{n^2} |\varphi_i(x)|^2\big)^{1/2}$. Then
$$ \|x\|'/n^2 \le \max_i |\varphi_i(x)| \le \|x\| \le \sum_{i=1}^{n^2} |\varphi_i(x)| \le \|x\|' \qquad \forall x, $$
****************
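Thus $\{y_i\}$ is an ONB for $(V, \langle\cdot,\cdot\rangle)$ such that $\|y_i\| \ge 1/8$ for each i.
Putting $x_i = y_i/\|y_i\|$, we have $\|x_i\| = 1$, $\|x_i\|' \le 8$ and with the mutual orthogonality of the
$x_i$
$$ \Big\|\sum_{i=1}^n c_i x_i\Big\| \le \Big\|\sum_{i=1}^n c_i x_i\Big\|' = \Big(\sum_{i=1}^n (|c_i|\,\|x_i\|')^2\Big)^{1/2} \le 8 \Big(\sum_{i=1}^n |c_i|^2\Big)^{1/2}. $$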
B.4 Remark Proposition B.1 (from [98]) suffices for Theorem B.2 and its proof only uses the
very classical existence of Auerbach bases (1929). But it can be improved considerably if one
uses slightly later theory like the John ellipsoid (1948) or Lewis’ lemma (1979). Cf. e.g. [97, vol.
1, Proposition IV.1], where the 8 in (B.1) is replaced by 2 and the $n^2$ by 2n.
And in 1960, Dvoretzky proved Dvoretzky’s theorem, a much stronger result that started
the ‘local theory’ of Banach spaces (which focuses on finite-dimensional subspaces): For every
k ∈ N and ε > 0 there exists N(k, ε) ∈ N such that for every normed space (V, ‖·‖) of
dimension ≥ N(k, ε) there is a k-dimensional subspace W ⊆ V such that d(W, E) < 1 + ε,
where d is the Banach-Mazur distance and E = R^k with euclidean norm. In particular every
infinite-dimensional Banach space has finite-dimensional subspaces of arbitrarily large dimension
and arbitrarily close to being Hilbert spaces. See e.g. [1] or [97, vol. 2]. □
B.5 Exercise Let X be a compact Hausdorff space and $V = C(X, \mathbb{F})$ with norm $\|\cdot\|_\infty$. Prove
that a series $\sum_{n=1}^\infty f_n$ in V is unconditionally convergent if and only if $\sum_{n=1}^\infty |f_n(x)| < \infty$ for
all x ∈ X. Hint: Dini’s theorem.
(Global absolute convergence means $\sum_{n=1}^\infty \sup_{x\in X} |f_n(x)| < \infty$. While it is quite plausible
that the latter condition is strictly stronger than pointwise absolute convergence, the Dvoretzky-
Rogers theorem proves that this is the case whenever X is infinite.)
B.6 Exercise Let H be a Hilbert space. Prove that a series $\sum_{n=1}^\infty x_n$ of mutually orthogonal
terms converges unconditionally if and only if $\sum_{n=1}^\infty \|x_n\|^2 < \infty$.
It is plausible that unconditional convergence is harder to achieve if the summands are not
mutually orthogonal. (For example, if $x \ne 0$ and $x_n = c_n x$ then unconditional convergence of
$\sum_{n=1}^\infty x_n$ is equivalent to $\sum_{n=1}^\infty \|x_n\| < \infty$.) Therefore the following is not surprising:
B.7 Proposition (Orlicz) If a series $\sum_{n=1}^\infty x_n$ in a Hilbert space converges unconditionally
then $\sum_{n=1}^\infty \|x_n\|^2 < \infty$.
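Proof. Since $\sum_n x_n$ converges unconditionally, by Proposition (iv) there exists a finite $S \subset \mathbb{N}$
such that $\|\sum_{t\in T} x_t\| < 1$ for every finite $T \subset \mathbb{N}\setminus S$. Now for every finite $T \subset \mathbb{N}$ we have
$$ \Big\|\sum_{t\in T} x_t\Big\| = \Big\|\sum_{t\in S\cap T} x_t + \sum_{t\in T\setminus S} x_t\Big\| \le \Big\|\sum_{t\in S\cap T} x_t\Big\| + \Big\|\sum_{t\in T\setminus S} x_t\Big\| \le \sum_{s\in S} \|x_s\| + 1 =: C. $$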
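Now given n ∈ N and $s_1, \ldots, s_n \in \{\pm 1\}$, putting $T_+ = \{i = 1, \ldots, n \mid s_i = 1\}$, $T_- = \{i =
1, \ldots, n \mid s_i = -1\}$ we have
$$ \Big\|\sum_{i=1}^n s_i x_i\Big\| = \Big\|\sum_{t\in T_+} x_t - \sum_{t\in T_-} x_t\Big\| \le \Big\|\sum_{t\in T_+} x_t\Big\| + \Big\|\sum_{t\in T_-} x_t\Big\| \le 2C. \qquad (B.2) $$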
So far we haven’t used any property of V, but now we do. With the generalized parallelogram
identity (5.6), we have
$$ \sum_{i=1}^n \|x_i\|^2 = 2^{-n} \sum_{s\in\{\pm 1\}^n} \Big\|\sum_{i=1}^n s_i x_i\Big\|^2 \le (2C)^2 $$
for all n, thus also $\sum_{i=1}^\infty \|x_i\|^2 \le (2C)^2 < \infty$.
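The generalized parallelogram identity used in the last step can be checked numerically in a finite-dimensional real Hilbert space. The following sketch averages ‖Σ s_i x_i‖² over all sign patterns s ∈ {±1}^n in R^d with the euclidean inner product; the helper names and the random test vectors are illustrative assumptions.

```python
import itertools
import random

def norm_sq(v):
    """Squared euclidean norm of a vector given as a list of floats."""
    return sum(c * c for c in v)

def sign_average(xs):
    """Compute 2^(-n) * sum over s in {+1,-1}^n of || sum_i s_i x_i ||^2."""
    n, d = len(xs), len(xs[0])
    total = 0.0
    for signs in itertools.product((1, -1), repeat=n):
        combo = [sum(s * x[j] for s, x in zip(signs, xs)) for j in range(d)]
        total += norm_sq(combo)
    return total / 2**n

random.seed(0)
xs = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)]  # 5 vectors in R^3
lhs = sum(norm_sq(x) for x in xs)   # sum of ||x_i||^2
rhs = sign_average(xs)              # average over all 2^5 sign patterns
```

The two quantities agree up to rounding because all cross terms ⟨x_i, x_j⟩ with i ≠ j cancel when averaged over the signs.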
Recall from Remark 5.18 that a Banach space V has cotype c if there exists C < ∞ such
that for all n ∈ N and $x_1, \ldots, x_n \in V$
$$ \sum_{i=1}^n \|x_i\|^c \le \frac{C}{2^n} \sum_{s\in\{\pm 1\}^n} \Big\|\sum_{i=1}^n s_i x_i\Big\|^c. \qquad (B.3) $$
B.8 Proposition If V is a Banach space of cotype c < ∞ then every unconditionally convergent
series $\sum_{n=1}^\infty x_n$ in V satisfies $\sum_{n=1}^\infty \|x_n\|^c < \infty$.
B.9 Exercise Let S be an infinite set and p ∈ [1, ∞). Prove that $\ell^p(S, \mathbb{F})$ has cotype max(p, 2).
The proof begins with an easy reduction to the case where Φ = {0}, thus Σ = V . For a
nice exposition see [69]. The straightforward generalization of the above to infinite-dimensional
Banach spaces is false, since the set Σ of values of convergent rearrangements can fail to be
an affine space, cf. e.g. [78]. (It can even consist of two points.) But there are a number of
sufficient conditions under which Σ = x_0 + Φ^⊤ holds. For example the following result, which
has a remarkable application to the ‘universality’ of the Riemann zeta function:
B.12 Theorem (Pecherskiĭ 1973) [82, Appendix §6] If H is a real Hilbert space and $\sum_{n=1}^\infty x_n$
is such that $\Sigma \ne \emptyset$ and $\sum_{n=1}^\infty \|x_n\|^2 < \infty$ then $\Sigma = x_0 + \Phi^\top$.
B.13 Proposition Let S be a set, F ∈ {R, C} and 1 ≤ p ≤ ∞. Put $V = \ell^p(S, \mathbb{F})$ if p < ∞ and
$V = c_0(S, \mathbb{F})$ if p = ∞. Then X ⊂ V is (pre)compact if and only if
(i) The set $\{f(s) \mid f \in X\} \subseteq \mathbb{F}$ is compact (bounded) for each s ∈ S.
(ii) For each ε > 0 there is a finite F ⊆ S such that $\|(1 - \chi_F)f\|_p \le \varepsilon$ for all f ∈ X.
(I.e., the elements of X tend to zero at infinity uniformly.)
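Condition (ii) can be explored on two simple families in ℓ^p(N): the unit vectors δ_n, and the scaled vectors δ_n/n. Both examples are chosen here for illustration. The sketch below represents finitely supported sequences as dictionaries and only handles finite truncations of the families (an assumption forced by the computer), but the pattern is already visible: for the δ_n no finite F makes the tails uniformly small, while for the δ_n/n the set F = {1, ..., m} gives a uniform tail of 1/(m+1).

```python
def tail_norm(f, F, p):
    """||(1 - chi_F) f||_p for a finitely supported f given as {index: value}."""
    return sum(abs(v) ** p for s, v in f.items() if s not in F) ** (1.0 / p)

def uniform_tail(family, F, p):
    """sup over the family of the tail norms outside F."""
    return max(tail_norm(f, F, p) for f in family)

N = 1000                                             # truncation level (illustrative)
deltas = [{n: 1.0} for n in range(1, N + 1)]         # delta_n
scaled = [{n: 1.0 / n} for n in range(1, N + 1)]     # delta_n / n
F = set(range(1, 11))                                # F = {1, ..., 10}

tail_deltas = uniform_tail(deltas, F, 2)   # stays at 1 as long as some delta_n lies outside F
tail_scaled = uniform_tail(scaled, F, 2)   # equals 1/11, attained at n = 11
```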
Proof. If X is compact, its image in F under the continuous evaluation map $p_s : f \mapsto f(s)$ is
compact for each s ∈ S. If X is only precompact then $p_s(\overline{X})$ is compact by continuity of $p_s$, and
$p_s(X) \subseteq p_s(\overline{X})$, so that $p_s(X)$ is precompact (= bounded).
By (pre)compactness of X, it is totally bounded. Thus if ε > 0 we can find $f_1, \ldots, f_n \in
\ell^p(S, \mathbb{F})$ such that $X \subseteq \bigcup_{i=1}^n B(f_i, \varepsilon/2)$. Now there are finite sets $F_1, \ldots, F_n \subseteq S$ such that
$\|(1 - \chi_{F_i})f_i\|_p \le \varepsilon/2$ for each i = 1, . . . , n. Then $F = \bigcup_{i=1}^n F_i$ is finite. If f ∈ X then pick i such
that $\|f - f_i\|_p < \varepsilon/2$. Using the rather obvious facts $\|(1 - \chi_F)f_i\|_p \le \|(1 - \chi_{F_i})f_i\|_p \le \varepsilon/2$ and
$\|(1 - \chi_F)(f - f_i)\|_p \le \|f - f_i\|_p < \varepsilon/2$, we obtain
$$ \|(1 - \chi_F)f\|_p \le \|(1 - \chi_F)f_i\|_p + \|(1 - \chi_F)(f - f_i)\|_p \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, $$
thus (ii). (Note that the existence of the $F_i$ would fail for $(\ell^\infty(S, \mathbb{F}), \|\cdot\|_\infty)$.)
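Now assume that (i) and (ii) hold and ε > 0. Then we can find a finite F ⊆ S as in condition
(ii). Write $F = \{s_1, \ldots, s_n\}$ and define a map $h : X \to \mathbb{F}^n$, $f \mapsto (f(s_1), \ldots, f(s_n))$. Equip $\mathbb{F}^n$
with the norm $\|\cdot\|_p$. By condition (i), $h(X) \subseteq \mathbb{F}^n$ is bounded/compact. Let f, f′ ∈ X such that
$\|h(f) - h(f')\|_p \le \varepsilon$. Since this is equivalent to $\|\chi_F(f - f')\|_p \le \varepsilon$, we have
$$ \|f - f'\|_p \le \|\chi_F(f - f')\|_p + \|(1 - \chi_F)f\|_p + \|(1 - \chi_F)f'\|_p \le 3\varepsilon. $$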
This shows that the assumptions of Lemma A.46 are satisfied, so that X is totally bounded,
thus precompact. Finally, assume the sets in (i) are compact. The total boundedness of X is
equivalent to the statement that every sequence {fn } in X has a Cauchy subsequence {fni }.
The latter converges to some element f ∈ V . Now the closedness of ps (X) ⊆ F for each s
implies that the limit function f is in X. Thus X is compact.
B.14 Remark It is interesting to compare the conditions in Theorem A.45 and Proposition
B.13. While the respective pointwise conditions (i) are identical, the conditions (ii) are totally
different. In Theorem A.45 we are concerned with continuous functions and therefore have a
uniform (over the functions) version of continuity, while due to compactness there is no condition
at infinity. On the other hand, in Proposition B.13 there are no continuity questions, but we
need a uniform vanishing condition at infinity.
As one may expect, there is a generalization of Proposition B.13 to Lp -spaces, now involving
three conditions. Note that pointwise conditions make no sense.
B.15 Theorem (Kolmogorov-M. Riesz theorem)¹¹⁶ Let 1 ≤ p < ∞ and $X \subseteq L^p(\mathbb{R}^n, \lambda)$.
Then X is precompact if and only if
(i) X is k · kp -bounded.
(ii) $\lim_{R\to\infty} \sup_{f\in X} \big(\int_{|x|\ge R} |f(x)|^p \, d\lambda\big)^{1/p} = 0$ (condition at ∞)
(iii) $\lim_{y\to 0} \sup_{f\in X} \|f - f_y\|_p = 0$, where $f_y(x) = f(x - y)$ (variant of equicontinuity).
(Note that $\lim_{y\to 0} \|f - f_y\|_p = 0$ holds for each individual $f \in L^p(\mathbb{R}^n)$.) For an elegant proof of the
theorem, again using Lemma A.46, see [72]. In the literature one can find more restrictive
versions (e.g. in [23, Theorem 4.26] condition (ii) is replaced by the stronger hypothesis that all
f ∈ X are supported in the same bounded set) as well as more general ones. See [169, §12] for
a version on a locally compact group G, presented more readably in [88]. □
B.17 Theorem (i) ‖·‖ and ‖·‖′ are equivalent norms on fa(S, F). We write
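(iii) If $\varphi \in \ell^\infty(S, \mathbb{F})^*$ then $\|\mu_\varphi\| \le \|\varphi\|$, thus we have a norm-decreasing linear map $\ell^\infty(S, \mathbb{F})^* \to
ba(S, \mathbb{F})$, $\varphi \mapsto \mu_\varphi$.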
Proof. (i) It is immediate from the definitions that ‖cµ‖ = |c|‖µ‖ and ‖cµ‖′ = |c|‖µ‖′ for all c ∈
F, µ ∈ fa(S, F) and that ‖µ‖ = 0 ⇔ µ = 0 ⇔ ‖µ‖′ = 0. Also ‖µ_1 + µ_2‖′ ≤ ‖µ_1‖′ + ‖µ_2‖′ is
quite obvious. Now
$$ \|\mu_1 + \mu_2\| = \sup\Big\{\sum_{k=1}^K |\mu_1(A_k) + \mu_2(A_k)| \;\Big|\; \cdots\Big\} \le \sup\Big\{\sum_{k=1}^K \big(|\mu_1(A_k)| + |\mu_2(A_k)|\big) \;\Big|\; \cdots\Big\} $$
$$ \le \sup\Big\{\sum_{k=1}^K |\mu_1(A_k)| \;\Big|\; \cdots\Big\} + \sup\Big\{\sum_{k=1}^K |\mu_2(A_k)| \;\Big|\; \cdots\Big\} = \|\mu_1\| + \|\mu_2\|. $$
Thus ‖·‖, ‖·‖′ are norms on fa(S, F). The definition of ‖·‖ clearly implies |µ(A)| ≤ ‖µ‖ for
each A ⊆ S, whence ‖µ‖′ ≤ ‖µ‖.
Assume µ ∈ fa(S, R) and ‖µ‖′ < ∞. If $A_1, \ldots, A_K \subseteq S$ are mutually disjoint, put
$$ A_+ = \bigcup \{A_k \mid \mu(A_k) \ge 0\}, \qquad A_- = \bigcup \{A_k \mid \mu(A_k) < 0\}. $$
Now by finite additivity, $\sum_k |\mu(A_k)| = \mu(A_+) - \mu(A_-) \le 2\|\mu\|'$ since $|\mu(A_\pm)| \le \|\mu\|'$. Taking
the supremum over the families $\{A_k\}$ gives ‖µ‖ ≤ 2‖µ‖′.
If µ ∈ fa(S, C), writing µ = Re µ + i Im µ we find ‖µ‖ ≤ 4‖µ‖′. Thus ‖µ‖′ ≤ ‖µ‖ ≤ 4‖µ‖′
for all µ, and the two norms are equivalent.
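On a finite set S the two norms can be computed by brute force, which makes the estimate ‖µ‖′ ≤ ‖µ‖ ≤ 2‖µ‖′ for real µ tangible. The sketch below assumes, as the proof above suggests, that ‖µ‖′ = sup_A |µ(A)| and that ‖µ‖ is the supremum of Σ_k |µ(A_k)| over finite families of mutually disjoint sets; the measure is given by hypothetical point masses g(s) = µ({s}).

```python
from itertools import product

def mu(g, A):
    """mu(A) for the measure with point masses g (a dict {point: mass})."""
    return sum(g[s] for s in A)

def sup_norm(g):
    """||mu||' = sup over all subsets A of |mu(A)|, by enumerating subsets."""
    points = list(g)
    best = 0
    for mask in product((False, True), repeat=len(points)):
        A = [s for s, keep in zip(points, mask) if keep]
        best = max(best, abs(mu(g, A)))
    return best

def variation_norm(g):
    """||mu|| = sup of sum_k |mu(A_k)| over families of mutually disjoint sets."""
    points = list(g)
    K = len(points)
    best = 0
    # Assign to every point a label in {0, ..., K}; label 0 means "in no A_k".
    for labels in product(range(K + 1), repeat=K):
        blocks = {}
        for s, lab in zip(points, labels):
            if lab:
                blocks.setdefault(lab, []).append(s)
        best = max(best, sum(abs(mu(g, A)) for A in blocks.values()))
    return best

g = {"a": 2, "b": -3, "c": 1, "d": -1}   # illustrative signed point masses
small, large = sup_norm(g), variation_norm(g)
```

In this example the supremum defining ‖µ‖′ is attained on the set of points with negative mass, and the one defining ‖µ‖ on the family of singletons.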
(ii) Here it is more convenient to work with the simpler norm ‖·‖′. Now let $\{\mu_n\}$ be a Cauchy
sequence in ba(S, F). Then $|\mu_n(A) - \mu_m(A)| \le \|\mu_n - \mu_m\|'$, so that $\{\mu_n(A)\}$ is Cauchy, thus
convergent. Define $\mu(A) = \lim_n \mu_n(A)$. It is clear that µ(∅) = 0. If $A_1, \ldots, A_K$ are mutually
disjoint then
$$ \mu(A_1 \cup \cdots \cup A_K) = \lim_{n\to\infty} \mu_n(A_1 \cup \cdots \cup A_K) = \lim_{n\to\infty} (\mu_n(A_1) + \cdots + \mu_n(A_K)) = \mu(A_1) + \cdots + \mu(A_K), $$
so that µ is finitely additive. Since $\{\mu_n\}$ is Cauchy, for every ε > 0 there is $n_0$ such that $n, m \ge n_0$
implies $\|\mu_m - \mu_n\|' < \varepsilon$. In particular there is $n_0$ such that $\|\mu_m\|' \le \|\mu_{n_0}\|' + 1$ for $m \ge n_0$.
This implies boundedness of µ. And taking m → ∞ in $|\mu_n(A) - \mu_m(A)| \le \|\mu_n - \mu_m\|' < \varepsilon$ gives
$\|\mu_n - \mu\|' \le \varepsilon$, so that $\|\mu_n - \mu\|' \to 0$. Thus ba(S, F) is complete (w.r.t. ‖·‖′, thus also w.r.t.
‖·‖).
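(iii) It is clear that $\ell^\infty(S, \mathbb{F})^* \to fa(S, \mathbb{F})$, $\varphi \mapsto \mu_\varphi$ is linear. Now let $A_1, \ldots, A_K \subseteq S$ be
mutually disjoint. Then
$$ \sum_{k=1}^K |\mu_\varphi(A_k)| = \sum_{k=1}^K \mathrm{sgn}(\mu_\varphi(A_k))\mu_\varphi(A_k) = \sum_{k=1}^K \mathrm{sgn}(\mu_\varphi(A_k))\varphi(\chi_{A_k}) = \varphi\Big(\sum_{k=1}^K \mathrm{sgn}(\mu_\varphi(A_k))\chi_{A_k}\Big). $$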
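B.18 Theorem (i) For each µ ∈ ba(S, F) there is a unique linear functional $\int_\mu \in \ell^\infty(S, \mathbb{F})^*$
(ii) The maps $\alpha : \ell^\infty(S, \mathbb{F})^* \to ba(S, \mathbb{F})$, $\varphi \mapsto \mu_\varphi$ and $\int : ba(S, \mathbb{F}) \to \ell^\infty(S, \mathbb{F})^*$, $\mu \mapsto \int_\mu$ are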
Proof. (i) If $f \in \ell^\infty(S, \mathbb{F})$ has finite image, write $f = \sum_{k=1}^K c_k \chi_{A_k}$, where the $A_k$ are mutually
disjoint, and define
$$ \int f \, d\mu = \sum_{k=1}^K c_k \mu(A_k). $$
all A ⊆ S. Thus $\varphi$ and $\int_{\mu_\varphi}$ coincide on all characteristic functions, thus on all of $\ell^\infty(S, \mathbb{F})$ by
linearity, density of the finite-image functions and the $\|\cdot\|_\infty$ continuity of $\varphi$ and $\int_{\mu_\varphi}$. Thus
$\int \circ\, \alpha = \mathrm{id}_{\ell^\infty(S,\mathbb{F})^*}$.
Since the maps α and $\int$ are mutually inverse and both norm-decreasing, they actually both
are isometries.
This completes the determination of $\ell^\infty(S, \mathbb{F})^*$. (Note that we did not use the completeness
of ba(S, F) proven in Theorem B.17(ii). Thus it would also follow from the isometric bijection
$ba(S, \mathbb{F}) \cong \ell^\infty(S, \mathbb{F})^*$ just established.)
B.19 Exercise Given µ ∈ ba(S, F), prove that µ is {0, 1}-valued if and only if $\int_\mu \in \ell^\infty(S, \mathbb{F})^*$
Since $\ell^\infty(S, \mathbb{F})^*$ has a closed subspace $\iota(\ell^1(S, \mathbb{F}))$, it is interesting to identify the
corresponding subspace of ba(S, F).
B.20 Definition A finitely additive measure µ ∈ ba(S, F) is called countably additive if for
every countable family $\mathcal{A} \subseteq P(S)$ of mutually disjoint sets we have
$$ \mu\Big(\bigcup \mathcal{A}\Big) = \sum_{A\in\mathcal{A}} \mu(A) $$
and totally additive if the same holds for any family of mutually disjoint sets. The sets of
countably and totally additive measures on S are denoted ca(S, F) and ta(S, F), respectively.
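(i) There is $g \in \ell^1(S, \mathbb{F})$ such that $\mu(A) = \sum_{s\in A} g(s)$ for all A ⊆ S.
(ii) $\int_\mu \in \ell^\infty(S, \mathbb{F})^*$ is ‘normal’, thus $\int f \, d\mu = \lim_\iota \int f_\iota \, d\mu$ for every net $\{f_\iota\} \subseteq \mathbb{F}^S$ that is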
(ii)⇒(iii) We know that we can recover µ from $\int_\mu$ as $\mu(A) = \int \chi_A \, d\mu$. Let $\mathcal{A}$ be a family
of mutually disjoint subsets of S. Then the net $\{f_F = \chi_{\bigcup F}\}$, indexed by the finite subsets
$F \subseteq \mathcal{A}$, is uniformly bounded and converges pointwise to $\chi_B$, where $B = \bigcup \mathcal{A}$. Now normality
of $\int_\mu$ implies that $\mu(B) = \int \chi_B \, d\mu = \lim_F \int f_F \, d\mu = \lim_F \sum_{A\in F} \mu(A) = \sum_{A\in\mathcal{A}} \mu(A)$, which is
additivity of µ.
(iii)⇒(i) If we put g(s) = µ({s}) then additivity of µ means that $\mu(A) = \sum_{s\in A} g(s)$ for all
A ⊆ S, convergence being absolute. Now the finiteness of µ(S) gives $\|g\|_1 < \infty$.
(iii)⇒(iv) is trivial. If S is countable then a family of mutually disjoint non-empty subsets
of S is at most countable, so that (iii) and (iv) are equivalent.
                      ℓ^1(S, F)
                   ≅ ↙         ↘ ≅
   (ℓ^∞(S, F)^*)_n   ----≅--->   ta(S, F)
          ∩                          ∩
     ℓ^∞(S, F)^*     ----≅--->   ba(S, F)
B.23 Theorem (Phillips 1939, Sobczyk 1940) The closed subspace c0 (N, R) ⊆ `∞ (N, R)
is not complemented.
Proof. (Whitley (1966).) From now on we abbreviate $\ell^\infty(\mathbb{N}, \mathbb{F})$ and $c_0(\mathbb{N}, \mathbb{F})$ as $\ell^\infty$, $c_0$. Our
strategy for proving that $c_0 \subseteq \ell^\infty$ is not complemented is the following: If $c_0 \subseteq \ell^\infty$ had a
complementary closed subspace W, Exercise 7.15 would give $\ell^\infty \cong c_0 \oplus W$, thus $\ell^\infty/c_0 \cong W$.
Since W would have property S, it would follow that $Q = \ell^\infty/c_0$ has property S, but we will
prove that it doesn’t!
The idea for doing so is to produce an uncountable subset $\mathcal{F} \subseteq Q$ such that each functional
$\varphi \in Q^*$ is non-zero only on countably many elements of $\mathcal{F}$. Then for any countable $C \subseteq Q^*$ the
set $\mathcal{F}' = \bigcup_{\varphi\in C} \{q \in \mathcal{F} \mid \varphi(q) \ne 0\}$ is countable, so that the family $C \subseteq Q^*$ vanishes identically
on the uncountable set $\mathcal{F}\setminus\mathcal{F}'$. It therefore cannot separate the elements of $\mathcal{F}$, let alone those of
Q. Thus Q does not have property S and we are done. For the construction of such an $\mathcal{F}$ we
use the following lemma:
B.24 Lemma Every countably infinite set X admits a family {Xλ }λ∈Λ of subsets of X such
that
(i) Λ has cardinality c = #R, in particular it is uncountable.
(ii) Xλ is infinite for each λ ∈ Λ.
(iii) $X_\lambda \cap X_{\lambda'}$ is finite for all λ, λ′ ∈ Λ, λ ≠ λ′.
Proof. Take Y = (0, 1) ∩ Q and Λ = (0, 1)\Q. Clearly Y is countable and Λ is uncountable
(since removing a countable set from one of cardinality c does not change the cardinality).
For each λ ∈ Λ pick a sequence $\{a_n\} \subseteq Y$ converging to λ (for example $a_n = \lfloor n\lambda\rfloor/n$ for n > 1/λ)
and put $Y_\lambda = \{a_n \mid n \in \mathbb{N}\}$. That each $Y_\lambda$ is infinite follows from the irrationality of λ
and the rationality of the $a_n$. If λ ≠ λ′ and $a_n \to \lambda$, $a'_n \to \lambda'$ then there exists $n_0$ such
that $n, n' \ge n_0 \Rightarrow \max(|a_n - \lambda|, |a'_{n'} - \lambda'|) < |\lambda - \lambda'|/2$, so that $a_n \ne a'_{n'}$. This implies
$\#(Y_\lambda \cap Y_{\lambda'}) < \infty$. We thus have a family of subsets of Y with all desired properties. For an
arbitrary countably infinite set X the claim now follows using a bijection X ≅ Y.
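The construction in the proof is explicit enough to be tried out. The following sketch computes finite parts of the sets Y_λ built from a_n = ⌊nλ⌋/n for two irrational numbers and looks at their intersection. The choices λ = √2 − 1 and λ′ = √3 − 1, the cut-off 200 and the use of floating point numbers for λ are assumptions made for the illustration; values outside (0, 1) are discarded so that the sets really lie in Y.

```python
import math
from fractions import Fraction

def approximating_set(lam, n_max):
    """Finite part of Y_lam: the values floor(n*lam)/n for n <= n_max that lie in (0, 1)."""
    values = set()
    for n in range(1, n_max + 1):
        a = Fraction(math.floor(n * lam), n)   # exact rational, automatically reduced
        if 0 < a < 1:
            values.add(a)
    return values

lam1 = math.sqrt(2) - 1   # approx. 0.4142
lam2 = math.sqrt(3) - 1   # approx. 0.7321
Y1 = approximating_set(lam1, 200)
Y2 = approximating_set(lam2, 200)
common = Y1 & Y2          # elements shared by the two finite parts
```

Both sets keep growing with the cut-off, while the intersection stays finite (for this particular pair it is even empty, since every element of Y1 lies below λ and every element of Y2 lies above it).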
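Let $\{X_\lambda\}_{\lambda\in\Lambda}$ be a family of subsets of N as provided by the lemma. For λ ∈ Λ, the
characteristic function $\chi_{X_\lambda} : \mathbb{N} \to \{0, 1\} \subseteq \mathbb{C}$ clearly is in $\ell^\infty$. Let $p : \ell^\infty \to Q = \ell^\infty/c_0$
be the quotient map. Now let $q_\lambda = p(\chi_{X_\lambda})$ and $\mathcal{F} = \{q_\lambda \mid \lambda \in \Lambda\}$. If λ, λ′ ∈ Λ, λ ≠ λ′,
the symmetric difference $X_\lambda \Delta X_{\lambda'} = (X_\lambda \cup X_{\lambda'})\setminus(X_{\lambda'} \cap X_\lambda)$ is infinite by (ii) and (iii). Thus
$\chi_{X_\lambda} - \chi_{X_{\lambda'}} \notin c_0 = \ker p$, so that $\lambda \mapsto q_\lambda$ is injective, thus with (i) we see that $\mathcal{F}$ is uncountable.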
Let now $\varphi \in Q^*$, m, n ∈ N and let $\lambda_1, \ldots, \lambda_m \in \Lambda$ be mutually distinct and such that
$|\varphi(q_{\lambda_i})| \ge 1/n$ ∀i = 1, . . . , m. For each i pick $t_i$ with $|t_i| = 1$ such that $t_i\varphi(q_{\lambda_i}) = |\varphi(q_{\lambda_i})|$.
Put $f = \sum_{i=1}^m t_i \chi_{X_{\lambda_i}} \in \ell^\infty$. Since the sets $X_{\lambda_i}$ have pairwise finite intersections, the function
f has absolute value larger than one only on a subset of the finite set $\bigcup_{j\ne k} X_{\lambda_j} \cap X_{\lambda_k}$ and
absolute value one on the infinite set $(\bigcup_i X_{\lambda_i})\setminus(\bigcup_{j\ne k} X_{\lambda_j} \cap X_{\lambda_k})$. This implies that $\|p(f)\| =
\inf_{g\in c_0} \|f - g\|_\infty = 1$. Thus
$$ \|\varphi\| \ge |\varphi(p(f))| = \Big|\sum_{i=1}^m t_i\varphi(p(\chi_{X_{\lambda_i}}))\Big| = \sum_{i=1}^m t_i\varphi(q_{\lambda_i}) = \sum_{i=1}^m |\varphi(q_{\lambda_i})| \ge \frac{m}{n}. $$
Thus $m \le n\|\varphi\| < \infty$, so that for each $\varphi \in Q^*$ and n ∈ N there cannot be more than $n\|\varphi\|$ distinct
λ ∈ Λ with $|\varphi(q_\lambda)| \ge 1/n$. If there was an uncountable $\mathcal{F}' \subseteq \mathcal{F}$ with $\varphi(q) \ne 0$ ∀q ∈ $\mathcal{F}'$, there
would have to be an n ∈ N such that $|\varphi(q)| \ge 1/n$ for infinitely (in fact uncountably) many
q ∈ $\mathcal{F}'$, contradicting what we just proved. This completes the proof.
B.3.4 c_0(N, F) is not a dual space. Spaces with multiple pre-duals
Recall that we write ≅ for isometric isomorphism and ≃ for isomorphism of Banach spaces.
$$ (P\iota_{V^*}(\varphi))(\iota_V(x)) = \iota_{V^*}((\iota_V)^*(\iota_{V^*}(\varphi)))(\iota_V(x)) = [(\iota_V)^*(\iota_{V^*}(\varphi))](x) = \varphi(x) = \iota_{V^*}(\varphi)(\iota_V(x)), $$
where we used Exercise 9.18 several times, proves $P\iota_{V^*}(\varphi) = \iota_{V^*}(\varphi)$. Thus $P \restriction \iota_{V^*}(V^*) = \mathrm{id}$. On
the other hand, it follows directly from the definition of P that $PV^{***} \subseteq \iota_{V^*}(V^*)$. Combining
these two facts gives $P^2 = P$ and $PV^{***} = \iota_{V^*}(V^*)$.
(ii) This is an immediate consequence of (i) and Exercise 6.13.
(iii) If $T : W \to V^*$ is an isomorphism then we have isomorphisms $T^* : V^{**} \to W^*$ and
$T^{**} : W^{**} \to V^{***}$. Using this it is straightforward to deduce the claim from (ii).
(iv) By Exercise 6.7 we have $(V^{**}/V)^* \cong V^\perp \subseteq V^{***}$. And by (ii), $V^{***} \simeq V^* \oplus W$, where
$W \simeq V^{***}/V^*$, the isomorphism being given by $x^{***} \mapsto (Px^{***}, (1 - P)x^{***})$ with P as in (i).
Thus $PV^{***} \simeq V^*$ and $V^{***}/V^* \simeq (1 - P)V^{***}$. Thus the claimed isomorphism follows if we
prove that the subspaces $V^\perp$ and $(1 - P)V^{***}$ of $V^{***}$ are equal.
Now, $x^{***} \in (1 - P)V^{***}$ means $(1 - P)x^{***} = x^{***}$, thus $Px^{***} = 0$. Since $P = \iota_{V^*} \circ (\iota_V)^*$,
where $\iota_{V^*}$ is injective, this is equivalent to $(\iota_V)^*(x^{***}) = 0$. By the definition of the transpose,
this means that $x^{***} \circ \iota_V = 0$. Since this is the same as $x^{***} \in \iota_V(V)^\perp$, we are done.
B.26 Corollary $c_0(\mathbb{N}, \mathbb{F})$ is not isomorphic to the dual space of any Banach space.
Proof. We again abbreviate $c_0(\mathbb{N}, \mathbb{F})$ as $c_0$ etc. We know that $c_0^* \cong \ell^1$ and $c_0^{**} \cong \ell^\infty$, the
canonical map $\iota_{c_0} : c_0 \to c_0^{**}$ just being the inclusion map $c_0 \hookrightarrow \ell^\infty$. By Theorem B.23, $c_0 \subseteq \ell^\infty$
is not complemented. Combining this with Lemma B.25(iii), the claim follows.
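B.27 Corollary Let $X = c_0 \oplus (\ell^\infty/c_0)$. Then $X \not\simeq \ell^\infty$, but $X^* \simeq (\ell^\infty)^*$.
Proof. $X \simeq \ell^\infty$ would imply that $c_0 \subseteq \ell^\infty$ is complemented, which it is not by Theorem B.23.
Thus $X \not\simeq \ell^\infty$. With $c_0^* \cong \ell^1$ we have $X^* \simeq c_0^* \oplus (\ell^\infty/c_0)^* \simeq \ell^1 \oplus (\ell^\infty/c_0)^*$.
On the other hand, since $\ell^1 \cong c_0^*$ is a dual space, we see that $\ell^1 \subseteq (\ell^1)^{**} \cong (\ell^\infty)^*$ is
complemented by Lemma B.25(iii). Thus $(\ell^\infty)^* \simeq \ell^1 \oplus (\ell^\infty)^*/\ell^1$ by Exercise 7.15(i). Now Lemma
B.25(iv) with $V = c_0$ gives $(\ell^\infty)^*/\ell^1 \simeq (\ell^\infty/c_0)^*$, so that $X^* \simeq (\ell^\infty)^*$.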
B.3.5 Schur’s theorem for ℓ^1(N, F)
As on earlier occasions, we abbreviate $\ell^1 = \ell^1(\mathbb{N}, \mathbb{F})$.
B.28 Theorem (I. Schur 1921) If $g, \{f_n\}_{n\in\mathbb{N}} \subseteq \ell^1(\mathbb{N}, \mathbb{F})$ and $f_n \xrightarrow{w} g$ then $\|f_n - g\|_1 \to 0$.
Proof. It clearly suffices to prove this for g = 0, thus $\ell^1 \ni f_n \xrightarrow{w} 0 \Rightarrow \|f_n\|_1 \to 0$.
Assume that $f_n \xrightarrow{w} 0$, but $\|f_n\|_1 \not\to 0$. Since $\delta_m \in \ell^\infty \cong (\ell^1)^*$, the first fact clearly implies
$f_n(m) = \varphi_{\delta_m}(f_n) \xrightarrow{n\to\infty} 0$ for all m. And by the second assumption there exists ε > 0 such that
$\|f_n\|_1 \ge \varepsilon$ for infinitely many n. Using this, we inductively define $\{n_k\}, \{r_k\} \subseteq \mathbb{N}$ as follows:
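(a) Let $n_1$ be the smallest number for which $\|f_{n_1}\|_1 \ge \varepsilon$.
(b) Let $r_1$ be the smallest number for which $\sum_{i=1}^{r_1} |f_{n_1}(i)| \ge \frac{\varepsilon}{2}$ and $\sum_{i=r_1+1}^\infty |f_{n_1}(i)| \le \frac{\varepsilon}{5}$.
For k ≥ 2:
(c) Let $n_k$ be the smallest number such that $n_k > n_{k-1}$ and $\|f_{n_k}\|_1 \ge \varepsilon$ and $\sum_{i=1}^{r_{k-1}} |f_{n_k}(i)| \le \frac{\varepsilon}{5}$.
(d) Let $r_k$ be the smallest number such that $r_k > r_{k-1}$ and $\sum_{i=r_{k-1}+1}^{r_k} |f_{n_k}(i)| \ge \frac{\varepsilon}{2}$ and
$\sum_{i=r_k+1}^\infty |f_{n_k}(i)| \le \frac{\varepsilon}{5}$.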
The reader should convince herself that the existence of such $n_k$, $r_k$ follows from our assumptions!
Now define $\{c_i\}_{i\in\mathbb{N}}$ by $c_i = \mathrm{sgn}(f_{n_k}(i))$ where k is uniquely determined by $r_{k-1} < i \le r_k$
with $r_0 = 0$. Now clearly $c = \{c_i\} \in \ell^\infty$, and for all k we have, using the lower bound in (b),(d),
$$ \sum_{i=r_{k-1}+1}^{r_k} c_i f_{n_k}(i) = \sum_{i=r_{k-1}+1}^{r_k} |f_{n_k}(i)| \ge \frac{\varepsilon}{2}, $$
B.29 Remark 1. The above proof followed the argument in [9] quite closely. The method of
proof is called the ‘gliding (or sliding) hump method’ (which is a variant of the earlier method
with the equally colorful name ‘condensation of singularities’). The gliding hump is precisely
the dominant contribution to $\varphi_c(f_{n_k})$ coming from the i in the interval $\{r_{k-1} + 1, \ldots, r_k\}$, which
moves to infinity as k → ∞. For a high-brow interpretation of Schur’s theorem in terms of
Banach space bases see [1, Section 2.3]. But also this discussion uses gliding humps!
2. With the appearance of the proof of the uniform boundedness theorem using Baire’s
theorem the gliding hump method fell a bit out of fashion. (But never completely, cf. e.g.
[161].) Also Schur’s theorem can be proven using Baire’s theorem, cf. e.g. [30, Proposition
V.5.2], but this is conceptually and technically more complicated.
3. Note also that the determination of the nk , rk in the above proof was deterministic, using
no choice axiom at all. In this sense the proof is better than the alternative one using Baire’s
theorem, thus countable dependent choice, which nevertheless is instructive. (But of course also
the above proof is non-constructive in the somewhat extremist sense of intuitionism since the
necessary ε > 0 cannot be found algorithmically.) In the same vein, in the next section we will give
a gliding-hump proof for the weak uniform boundedness theorem that only uses the axiom AC_ω
of countable choice. □
B.3.6 Compactness of all bounded linear maps c_0 → ℓ^q → ℓ^p (1 ≤ p < q < ∞)
In Theorem 12.24 we proved that for a reflexive Banach space V all bounded linear operators
$c_0 \to V$ and $V \to \ell^1$ are compact. Applying this to the reflexive spaces $\ell^p = \ell^p(\mathbb{N}, \mathbb{F})$ with
1 < p < ∞, we see that for such p all bounded maps $c_0 \to \ell^p$ and $\ell^p \to \ell^1$ are compact.
We now give an alternative proof that in addition gives compactness of all bounded maps
$\ell^q \to \ell^p$ for 1 < p < q < ∞ and $c_0 \to \ell^1$. Thus all bounded linear maps between the
spaces $c_0$, $\ell^p$ (1 ≤ p < ∞) that go in the opposite direction of the bounded inclusion maps
$\ell^1 \hookrightarrow \ell^p \hookrightarrow \ell^q \hookrightarrow c_0$ are compact!
B.30 Theorem (H. R. Pitt 1936) Let 1 ≤ p < q < ∞ and let $V \subseteq \ell^q$ or $V \subseteq c_0$ be a closed
subspace. Then every bounded linear map $V \to \ell^p$ is compact, thus $B(V, \ell^p) = K(V, \ell^p)$.
We follow the elementary proof by Delpech [38], based upon the following
B.31 Lemma (i) Let 1 ≤ r < ∞, $x \in \ell^r$ and $\{y_n\} \subset \ell^r$ a weak null-sequence. Then
Proof. (i) Since the evaluation map $\varphi_m : \ell^r \to \mathbb{F}$, $y \mapsto y(m)$ is in $(\ell^r)^*$, the weak convergence
$y_n \xrightarrow{w} 0$ implies $y_n(m) \to 0$ for all m. From this it is quite clear that (B.4) holds if x has
finite support supp(x) = {m ∈ N | x(m) ≠ 0} (i.e. $x \in c_{00}$) since in particular $y_n(m) \to 0$ for
all m ∈ supp(x). For general $x \in \ell^r$ and ε > 0 we pick $x' \in c_{00}$ with ‖x − x′‖ < ε. Since
$\|x' + y_n\| - \varepsilon \le \|x + y_n\| \le \|x' + y_n\| + \varepsilon$ for all n we find
$$ \Big(\|x'\|^r + \limsup_{n\to\infty} \|y_n\|^r\Big)^{1/r} - \varepsilon \le \limsup_{n\to\infty} \|x + y_n\| \le \Big(\|x'\|^r + \limsup_{n\to\infty} \|y_n\|^r\Big)^{1/r} + \varepsilon. $$
Proof of Theorem B.30. By Exercise 9.23, $\ell^q$ is reflexive for 1 < q < ∞, thus the same holds for
every closed $V \subseteq \ell^q$ by Theorem 9.27. If $V \subseteq c_0$ is closed then $V^* \cong c_0^*/V^\perp$ by Exercise 9.15.
Since $c_0^* \cong \ell^1$ is separable, the same holds for $V^*$ by Exercise 6.5. In either case, V is reflexive
or has separable dual. Thus compactness of $A \in B(V, \ell^p)$ follows from Theorem 12.28(iv) if we
prove sequential weak-norm continuity. It suffices to prove that $x_n \xrightarrow{w} 0$ implies $\|Ax_n\| \to 0$,
where we may assume ‖A‖ = 1. If
ε > 0 we pick $y_\varepsilon \in V$ with $\|y_\varepsilon\| = 1$ and $\|Ay_\varepsilon\| > 1 - \varepsilon$. Let t > 0. In the case $V \subseteq \ell^q$ we apply
Lemma B.31 to both sides of the inequality $\|Ay_\varepsilon + A(tx_n)\| \le \|y_\varepsilon + tx_n\|$, obtaining
$$ \Big(\|Ay_\varepsilon\|^p + t^p \limsup_{n\to\infty} \|Ax_n\|^p\Big)^{1/p} \le \Big(\|y_\varepsilon\|^q + t^q \limsup_{n\to\infty} \|x_n\|^q\Big)^{1/q}. \qquad (B.6) $$
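Since $\{x_n\}$ is weakly convergent, Exercise 10.6 gives $\|x_n\| \le M$ ∀n. With $\|y_\varepsilon\| = 1$ and $\|Ay_\varepsilon\| >
1 - \varepsilon$, (B.6) becomes
$$ \limsup_{n\to\infty} \|Ax_n\|^p \le \frac{1}{t^p}\Big[(1 + t^q M^q)^{p/q} - (1 - \varepsilon)^p\Big], $$
which holds for all t > 0 and ε ∈ (0, 1). Putting $t = \varepsilon^{1/q}$ and using $(1 + x)^\alpha = 1 + \alpha x + o(x)$ as
x ↘ 0, we have
$$ \limsup_{n\to\infty} \|Ax_n\|^p \le \frac{1}{\varepsilon^{p/q}}\Big[\Big(1 + \frac{p}{q}\varepsilon M^q\Big) - (1 - p\varepsilon) + o(\varepsilon)\Big] = \frac{\varepsilon}{\varepsilon^{p/q}}\Big[\frac{p}{q} M^q + p + o(1)\Big], $$
which vanishes as ε → 0 since p < q. This finishes the proof in the case $V \subseteq \ell^q$.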
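The decay of the upper bound obtained with t = ε^{1/q} can be observed numerically. The sketch below evaluates the right hand side ε^{-p/q}[(1 + εM^q)^{p/q} − (1 − ε)^p] for a few values of ε; the parameters p = 2, q = 3, M = 1 and the function name `pitt_bound` are arbitrary illustrative choices.

```python
def pitt_bound(eps, p, q, M):
    """Right hand side of the estimate for limsup ||A x_n||^p with t = eps^(1/q)."""
    t_to_p = eps ** (p / q)                                  # t^p = eps^(p/q)
    return ((1 + eps * M**q) ** (p / q) - (1 - eps) ** p) / t_to_p

p, q, M = 2, 3, 1.0
values = [pitt_bound(eps, p, q, M) for eps in (1e-2, 1e-4, 1e-6)]
# To leading order the bound behaves like eps^(1 - p/q) * (p/q * M^q + p).
leading = [eps ** (1 - p / q) * (p / q * M**q + p) for eps in (1e-2, 1e-4, 1e-6)]
```

The computed values shrink roughly like ε^{1/3} here, in line with the exponent 1 − p/q > 0.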
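In the case $V \subseteq c_0$ we proceed similarly, but use part (ii) of the Lemma on the r.h.s.,
obtaining
$$ \Big(\|Ay_\varepsilon\|^p + t^p \limsup_{n\to\infty} \|Ax_n\|^p\Big)^{1/p} \le \max\Big(\|y_\varepsilon\|, \; t \limsup_{n\to\infty} \|x_n\|\Big), $$
$$ \limsup_{n\to\infty} \|Ax_n\|^p \le \frac{1}{\varepsilon^{1/2}}\Big[1 - (1 - \varepsilon)^p\Big] = \frac{p\varepsilon + o(\varepsilon)}{\varepsilon^{1/2}}, $$
which again vanishes as ε → 0.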
B.33 Remark 1. One says that the spaces $c_0$ and $\ell^p$ (1 ≤ p < ∞) are uncomparable.
2. By the above, in particular the inclusion maps $\ell^p \hookrightarrow \ell^q \hookrightarrow c_0$, where 1 ≤ p < q < ∞,
are strictly singular. But they are clearly non-compact since they send the bounded sequence
$\{\delta_n\}$, which has no norm-convergent subsequence in any of these spaces, to itself. Taking
1 < p < q < ∞ this shows that strict singularity of an operator does not imply compactness
even if both spaces involved are reflexive. But one can prove that for all $A \in B(\ell^p)$, where
1 < p < ∞, strict singularity does imply compactness. (Cf. e.g. [1, Problem 2.4].) By an easy
reduction the same holds for all operators between not necessarily separable Hilbert spaces. □
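The two elementary facts behind item 2 are easy to check on finite truncations: the inclusion ℓ^p ↪ ℓ^q is norm-decreasing for p < q, and distinct unit vectors δ_n, δ_m stay at the fixed distance 2^{1/q} from each other in ℓ^q, so no subsequence of {δ_n} can be Cauchy. A minimal sketch, where the vector length and the sample vector are illustrative assumptions:

```python
def p_norm(x, p):
    """l^p norm of a finite sequence x for 1 <= p < infinity."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def delta(n, dim):
    """The n-th unit vector (0-based) in a space of finite sequences of length dim."""
    return [1.0 if i == n else 0.0 for i in range(dim)]

dim, q = 50, 3
# Pairwise distances of distinct unit vectors in the q-norm: all equal to 2^(1/q).
distances = {
    round(p_norm([a - b for a, b in zip(delta(n, dim), delta(m, dim))], q), 12)
    for n in range(dim) for m in range(n + 1, dim)
}

x = [1.0, -2.0, 0.5]
norms = [p_norm(x, p) for p in (1, 2, 3)]   # non-increasing in p
```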
Proof. Assume that $\mathcal{F}$ is not uniformly bounded. Then the sets $\mathcal{F}_n = \{A \in \mathcal{F} \mid \|A\| \ge 4^n\}$
are all non-empty, so that using AC_ω (axiom of countable choice), we can pick an $A_n \in \mathcal{F}_n$ for
each n ∈ N. By definition of $\|A_n\|$, the sets $X_n = \{x \in E \mid \|x\| \le 1, \|A_n x\| \ge \frac{2}{3}\|A_n\|\}$ are all
non-empty, so that using AC_ω again, we can choose an $x_n \in X_n$ for each n ∈ N.
Applying the triangle inequality to $Az = \frac{1}{2}(A(y + z) - A(y - z))$ gives
$$ \|Az\| = \frac{1}{2}\|A(y + z) - A(y - z)\| \le \frac{1}{2}(\|A(y + z)\| + \|A(y - z)\|) \le \max(\|A(y + z)\|, \|A(y - z)\|). $$
Applying this inequality to $A = A_{n+1}$, $y = y_n$, $z = 3^{-(n+1)}x_{n+1}$, recalling $\|A_n x_n\| \ge \frac{2}{3}\|A_n\|$,
we see that for at least one of the signs ± we have
$$ \|A_{n+1}(y_n \pm 3^{-(n+1)}x_{n+1})\| \ge 3^{-(n+1)}\|A_{n+1}x_{n+1}\| \ge \frac{2}{3}\, 3^{-(n+1)}\|A_{n+1}\|. $$
Thus defining a sequence $\{y_n\} \subseteq E$ inductively by $y_1 = x_1$ and
$$ y_{n+1} = \begin{cases} y_n + 3^{-(n+1)}x_{n+1} & \text{if } \|A_{n+1}(y_n + 3^{-(n+1)}x_{n+1})\| \ge \frac{2}{3}\, 3^{-(n+1)}\|A_{n+1}\| \\ y_n - 3^{-(n+1)}x_{n+1} & \text{otherwise} \end{cases} \qquad (B.7) $$
we have $\|A_n y_n\| \ge \frac{2}{3}\, 3^{-n}\|A_n\|$ for all n. (For n = 1 this is true since $y_1 = x_1$.) Since (B.7)
involves no further free choices, this inductive definition can be formalized in ZF (which we
don’t spell out here, see [53]).
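The sign rule (B.7) can be run on a toy example. In the sketch below the operators are linear functionals A_n = ⟨a_n, ·⟩ on the euclidean plane, for which x_n = a_n/‖a_n‖ can be written down explicitly, so no choice is involved; the specific vectors a_n = 4^n(cos n, sin n) are an arbitrary illustrative family with ‖A_n‖ = 4^n. The code only demonstrates the inductive construction and the two estimates, not the theorem itself (in finite dimensions the situation is of course much simpler).

```python
import math

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))

def norm(a):
    return math.sqrt(dot(a, a))

def gliding_hump(functionals):
    """Build y_1, y_2, ... by the sign rule (B.7) for A_n x = <a_n, x>."""
    # x_n = a_n / ||a_n|| has norm 1 and |A_n x_n| = ||A_n||, so x_n lies in X_n.
    xs = [[c / norm(a) for c in a] for a in functionals]
    ys = [xs[0]]                                   # y_1 = x_1
    for i in range(1, len(functionals)):           # list index i holds the (i+1)-th object
        a, step = functionals[i], 3.0 ** (-(i + 1))
        plus = [y + step * x for y, x in zip(ys[-1], xs[i])]
        minus = [y - step * x for y, x in zip(ys[-1], xs[i])]
        threshold = (2.0 / 3.0) * step * norm(a)
        ys.append(plus if abs(dot(a, plus)) >= threshold else minus)
    return ys

functionals = [[4.0**n * math.cos(n), 4.0**n * math.sin(n)] for n in range(1, 13)]
ys = gliding_hump(functionals)

# Lower bound |A_n y_n| >= (2/3) 3^(-n) ||A_n|| for every n (1-based n = i + 1).
lower_bounds_hold = all(
    abs(dot(a, y)) >= (2.0 / 3.0) * 3.0 ** (-(i + 1)) * norm(a)
    for i, (a, y) in enumerate(zip(functionals, ys))
)
```

Since ‖A_n‖ grows like 4^n while the guaranteed lower bound only loses a factor 3^{-n}, the numbers |A_n y_n| are forced to grow at least like (4/3)^n in this example.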
With (B.7) and $\|x_n\| \le 1$ for all n, we have $\|y_{n+1} - y_n\| \le 3^{-(n+1)}$ ∀n. Now for all m > n
$$ \|y_m - y_n\| = \Big\|\sum_{k=n}^{m-1} (y_{k+1} - y_k)\Big\| \le \sum_{k=n}^\infty 3^{-(k+1)} = 3^{-(n+1)} \frac{1}{1 - \frac{1}{3}} = \frac{1}{2}\, 3^{-n}, $$
B.35 Remark The above argument was discovered only a few years ago and published [53] in
2017! □
B.36 Theorem Over ZF, Tychonov’s theorem for Hausdorff spaces ⇒ Alaoglu’s theorem.
It turns out that Alaoglu’s theorem actually is equivalent over ZF to Tychonov’s theorem
for Hausdorff spaces, as well as to several other statements, e.g. the Ultrafilter Lemma (UL),
Alexander’s subbase lemma, and the Boolean Prime Ideal Theorem. Furthermore, this class of
equivalent statements is strictly weaker than the equivalent statements AC, Zorn and Tychonov,
in the sense that there are models of the ZF axioms in which the former statements are true,
but not the latter, see [70, 71]. We will prove Alaoglu ⇒ UL below.
B.40 Lemma A filter ℱ on X is an ultrafilter if and only if for every Y ⊆ X exactly one of the
alternatives Y ∈ ℱ, X\Y ∈ ℱ holds.
¹¹⁸ Filters were invented in 1937 by the French mathematician Henri Cartan (1904-2008), an important member of
the Bourbaki group. Unsurprisingly, the best reference on filters is [21]. Preference for nets or filters is sometimes
put as a question of American vs. European (in particular French) tastes, but this is simplistic. Most contemporary
research in general topology is actually done in terms of filters, not nets.
Proof. We begin by noting that we cannot have both Y ∈ ℱ and X\Y ∈ ℱ since (i) would
imply ∅ = Y ∩ (X\Y) ∈ ℱ, which is forbidden by (iii). Assume ℱ contains Y or X\Y for every
Y ⊆ X. This means that ℱ cannot be enlarged by adding Y ⊆ X since either already Y ∈ ℱ
or else X\Y ∈ ℱ, which excludes Y ∈ ℱ. Thus ℱ is an ultrafilter.
Now assume that ℱ is an ultrafilter and Y ⊆ X. If there is an F ∈ ℱ such that F ∩ Y = ∅
then F ⊆ X\Y, and property (ii) implies X\Y ∈ ℱ. If, on the other hand, Y ∩ F ≠ ∅ ∀F ∈ ℱ
then there is a filter $\widetilde{ℱ}$ containing ℱ and Y. Since ℱ is maximal, we must have Y ∈ ℱ.
B.41 Remark If ℱ is a filter on a set S ≠ ∅ then the map µ = χ_ℱ : P(S) → {0, 1} (that
sends A ⊆ S to 1 if A ∈ ℱ and to 0 otherwise) clearly satisfies µ(∅) = 0, µ(S) = 1, and
for disjoint $A_1, A_2 \subseteq S$ we have $\mu(A_1) + \mu(A_2) \le \mu(A_1 \cup A_2) \le 1$ since $A_1$ and $A_2$ cannot
both be in ℱ. If equality holds for all disjoint non-empty $A_1, A_2$, then in particular for all
$\emptyset \ne A_1 \ne S$ and $A_2 = S\setminus A_1$. Thus either $A_1$ or its complement $A_2$ belongs to ℱ. By Lemma
B.40 this characterizes the ultrafilters on S. Conversely, one easily checks that if µ is a non-zero
{0, 1}-valued finitely additive measure on S then ℱ = {A ⊆ S | µ(A) = 1} is an ultrafilter. □
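On a small finite set, Lemma B.40 and the correspondence of this remark can be verified exhaustively. The sketch below enumerates all families of subsets of a three-element set, keeps those satisfying the filter axioms (closed under intersections, upward closed, not containing ∅, which is how properties (i) to (iii) are read here), and compares three conditions: maximality, the dichotomy of Lemma B.40, and finite additivity of the indicator µ = χ_ℱ. All function names are illustrative.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

SUBSETS = powerset(X)                     # the 8 subsets of X

def is_filter(fam):
    """Non-empty family, closed under intersections and supersets, without the empty set."""
    if not fam or frozenset() in fam:
        return False
    closed_under_meets = all((a & b) in fam for a in fam for b in fam)
    upward_closed = all(b in fam for a in fam for b in SUBSETS if a <= b)
    return closed_under_meets and upward_closed

FILTERS = [fam for fam in powerset(SUBSETS) if is_filter(fam)]   # scan all 256 families

def is_ultrafilter(fam):
    """A filter that is not properly contained in another filter."""
    return is_filter(fam) and not any(fam < other for other in FILTERS)

def decides_every_set(fam):
    """Exactly one of Y, X minus Y belongs to the family, for every Y (Lemma B.40)."""
    return all((y in fam) != ((X - y) in fam) for y in SUBSETS)

def is_finitely_additive(fam):
    """Finite additivity of mu = indicator of the family on disjoint pairs."""
    mu = {a: 1 if a in fam else 0 for a in SUBSETS}
    return all(mu[a | b] == mu[a] + mu[b]
               for a in SUBSETS for b in SUBSETS if not (a & b))

ULTRAFILTERS = [fam for fam in FILTERS if is_ultrafilter(fam)]
```

On a finite set every ultrafilter found this way is principal, i.e. consists of all sets containing one fixed point; non-principal ultrafilters only exist on infinite sets and cannot be exhibited by such an enumeration.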
B.42 Lemma For a topological space (X, τ), the following are equivalent:
(i) (X, τ) is compact (thus every open cover has a finite subcover).
(ii) Whenever ℱ ⊆ P(X) is a family of closed subsets of X such that $\bigcap ℱ = \emptyset$ then there are
$C_1, \ldots, C_n \in ℱ$ such that $C_1 \cap \cdots \cap C_n = \emptyset$.
(iii) If ℱ ⊆ P(X) is a family of closed subsets of X with the finite intersection property (i.e.
the intersection of any finite number of elements of ℱ is non-empty) then $\bigcap ℱ \ne \emptyset$.
Proof. (i) and (ii) are dualizations of each other, using de Morgan’s formulas, and (iii) is the
contraposition of (ii).
And $\chi_S$ is idempotent for each S, thus $\psi(\chi_S) = \psi(\chi_S^2) = \psi(\chi_S)^2$, implying $\mu(S) = \psi(\chi_S) \in
\{0, 1\}$ for all S ⊆ X. We have µ(X) = ψ(1) = 1 (since $\varphi_x(1) = 1$ ∀x), and S ∩ T = ∅ implies
$\chi_{S\cup T} = \chi_S + \chi_T$, so that µ(S ∪ T) = µ(S) + µ(T). Thus µ is a finitely additive {0, 1}-valued
measure on X, and we know from Remark B.41 that $\widehat{ℱ} = \{Y \subseteq X \mid \mu(Y) = 1\}$ is an ultrafilter
on X. If Y ∈ ℱ then $\psi \in \bigcap_{F\in ℱ} \overline{\iota(F)}^{w^*}$ implies $\psi \in \overline{\iota(Y)}^{w^*} = \overline{\{\varphi_x \mid x \in Y\}}^{w^*}$. Since
$\varphi_x(\chi_Y) = \chi_Y(x) = 1$ for all x ∈ Y, we have $\mu(Y) = \psi(\chi_Y) = 1$, thus $ℱ \subseteq \widehat{ℱ}$. We thus have
embedded ℱ into an ultrafilter.
B.44 Remark 1. Combining Theorems B.43 and B.45 we obtain the curious fact that over ZF
Alaoglu’s theorem (non-reversibly) implies Hahn-Banach.
2. Our proofs of OMT/BIT/CGT and UBT only use the ZF axioms and Baire’s theorem,
the latter being equivalent to DC_ω. Since we have seen that Alaoglu’s theorem is equivalent
to the Ultrafilter lemma (and to Tychonov for Hausdorff spaces and various other statements),
all these theorems can be proven in the framework ZF+DC_ω+UL that is provably weaker than
ZF+AC, see [124]. In Section B.6.1 we will prove that this also holds for Hahn-Banach. Yet,
there are some results, like the Krein-Milman theorem, that cannot be proven in ZF+DC_ω+UL.
See Remark B.85. □
B.45 Theorem [Łoś & Ryll-Nardzewski (1951)] 119 Over ZF, Tychonov’s theorem for Hausdorff
spaces implies the Hahn-Banach Theorem 9.2.
Proof. Let V, p, W, ϕ be as in Theorem 9.2. Define E = ∏_{v∈V} [−p(−v), p(v)] with the product
topology. Since we assume Tychonov’s theorem for compact Hausdorff spaces, E is compact
(and Hausdorff). Clearly every e ∈ E can be interpreted as a map V → R satisfying the bound
−p(−v) ≤ e(v) ≤ p(v) ∀v. For each v ∈ V the coordinate map e ↦ e(v) is continuous, thus
E′ = {e ∈ E | e(w) = ϕ(w) ∀w ∈ W} ⊆ E is closed. For each finite-dimensional subspace Z ⊆ V
let E_Z = {e ∈ E | e↾Z is linear}. Again using continuity of the coordinate maps e ↦ e(v), it
follows that each
\[ E_Z = \bigcap_{x,x'\in Z} \{ e \in E \mid e(x+x') = e(x)+e(x') \} \;\cap \bigcap_{c\in\mathbb R,\, x\in Z} \{ e \in E \mid e(cx) = c\,e(x) \} \subseteq E \]
is closed, hence so is each E′_Z = E′ ∩ E_Z. Each E′_Z is non-empty: ϕ extends in finitely many
one-dimensional steps (which requires no choice) to a linear functional on W + Z dominated by p,
and putting e(v) = p(v) for all v ∉ W + Z gives an element of E′_Z. Since
E′_{Z1} ∩ · · · ∩ E′_{Zn} ⊇ E′_{Z1+···+Zn}, the family {E′_Z} of closed subsets of E has
the finite intersection property. Since E is compact, Lemma B.42
gives E_HB = ⋂_{Z fin. dim.} E′_Z ≠ ∅. Pick any e ∈ E_HB. Now e : V → R coincides with ϕ on W,
satisfies the p-bound and is linear on all finite-dimensional subspaces, thus globally. Thus it is
a Hahn-Banach extension.
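The product E is only meaningful if each interval [−p(−v), p(v)] is non-empty. This follows from sublinearity, as the following short check shows; it assumes, as is standard, that a sublinear p is subadditive and positive-homogeneous, so that p(0) = 0.

```latex
% Sketch: the intervals [-p(-v), p(v)] are non-empty for sublinear p.
% Assumption: p(0) = 0, which follows from positive homogeneity.
\[
  0 = p(0) = p\bigl(v + (-v)\bigr) \le p(v) + p(-v)
  \quad\Longrightarrow\quad
  -p(-v) \le p(v) \qquad \forall v \in V .
\]
```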
119 Jerzy Łoś (1920-1988), Czesław Ryll-Nardzewski (1926-2015). Polish mathematicians. Ł. mostly worked in set
theory and logic, R.-N. in functional analysis (R.-N. fixed point theorem), measure theory and probability.
B.46 Remark 1. The proof in [99] is phrased in terms of two rather more general theorems,
but we chose to sacrifice the generality and simplify the argument considerably. (There also is
a technical point: In the above proof we used distinguished coordinates p(x) to make sure that
the closed subsets E′_Z are non-empty. It is not clear to this author how to draw this conclusion
(without invoking AC) in the generality of [99, Theorem 2] when the spaces {Px }x∈X0 appearing
there have nothing to do with each other. Perhaps they should be assumed pointed?)
2. In 1962, Luxemburg120 [100] deduced Hahn-Banach from the UL by use of ultraproducts
(non-standard analysis). The ultraproducts are not essential and can be removed, cf. [11] (or
[108]). The resulting proof shares some features with the above, which however remains simpler.
3. In [123] it is shown that the Hahn-Banach theorem is strictly weaker than the equivalent
statements of Tychonov for Hausdorff spaces, the Ultrafilter Lemma, etc. (But it still suffices for
proving the Banach-Tarski paradox and the existence of sets that are not Lebesgue measurable,
see [54, 117]. Thus the latter cannot be avoided without giving up Hahn-Banach.) □
B.47 Exercise Let V be a topological vector space and A ⊆ V convex. Prove that the interior
A⁰ and the closure Ā are convex.
B.48 Proposition Let V be a topological vector space and U a convex open neighborhood of
0. Define the ‘Minkowski functional’121 µU : V → [0, ∞) of U by
µU (x) = inf{t ≥ 0 | x ∈ tU }.
B.49 Definition Let V be a topological vector space and 0 ∈ U ⊆ V . Then U is called
• balanced if x ∈ U, |λ| ≤ 1 ⇒ λx ∈ U ,
• bounded if for every open W ∋ 0 there exists λ > 0 such that λU ⊆ W.
Note that if U is convex and contains zero, multiplication by t ∈ [0, 1] sends U into itself.
Thus for checking balancedness it suffices to consider |λ| = 1.
B.50 Exercise Let (V, τ ) be a TVS, where τ = τd for a translation-invariant metric d. Prove:
(i) B(0, r) is bounded in the TVS sense for each r > 0.
(ii) X ⊆ V is bounded in the TVS sense if and only if X ⊆ B(0, r) for some r > 0.
B.51 Proposition Let (V, τ ) be a topological vector space and U a convex open neighborhood
of zero. Then
(i) The Minkowski functional µU is a seminorm if and only if U is balanced.
(ii) If U is bounded then µU (x) = 0 implies x = 0.
(iii) If U is balanced and bounded then ‖x‖ = µU(x) is a norm inducing the topology τ.
Proof. (i) Since µU is subadditive and positive-homogeneous, it is a seminorm if and only if
µU (λx) = µU (x) for all x ∈ V and λ ∈ F with |λ| = 1. If U is balanced then this is evidently
satisfied. Now assume µU (λx) = µU (x). The openness of U implies that {t > 0 | x ∈ tU } =
(µU (x), ∞). Thus if |λ| = 1 then the assumption µU (λx) = µU (x) implies that x ∈ U if and
only if λx ∈ U . Thus U is balanced.
(ii) Assume that U is bounded and that x ≠ 0. Since τ is T1, there is an open W ⊆ V such
that 0 ∈ W ∌ x. Since U is bounded, there is λ > 0 such that λU ⊆ W, which clearly implies
x ∉ λU. Now the definition of µU implies µU(x) ≥ λ > 0.
(iii) Proposition B.48 and the above (i) and (ii) show that ‖ · ‖ = µU is a continuous norm on
V. Thus xn → 0 implies ‖xn‖ → 0. If we prove the converse implication then τ = τ_{‖·‖} follows
since V is a topological vector space. Let {xn} be a sequence such that ‖xn‖ → 0, and let W
be an open neighborhood of 0. Since U is bounded, there is λ > 0 such that λU ⊆ W. Now,
‖xn‖ → 0 means that there is n0 ∈ N such that n ≥ n0 ⇒ ‖xn‖ < λ/2. With the definition of
µU this implies xn ∈ λU, thus xn ∈ λU ⊆ W for all n ≥ n0. This proves xn → 0.
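A one-dimensional example may help to see why balancedness is needed in (i). The following sketch computes the Minkowski functional of a convex open neighborhood of 0 that is not balanced; the set U = (−1, 2) ⊆ R is chosen here purely for illustration.

```latex
% Sketch: Minkowski functional of the (non-balanced) set U = (-1, 2) in R.
% Here tU = (-t, 2t) for t > 0, so x \in tU iff t > x/2 (x > 0) resp. t > |x| (x < 0).
\[
  \mu_U(x) = \inf\{ t \ge 0 \mid x \in tU \}
           = \begin{cases} x/2 & x \ge 0, \\ |x| & x < 0 . \end{cases}
\]
% \mu_U is subadditive and positive-homogeneous, and U = \{ x \mid \mu_U(x) < 1 \},
% but it is not a seminorm:
\[
  \mu_U(1) = \tfrac12 \neq 1 = \mu_U(-1) .
\]
```

Replacing U by the balanced set (−1, 1) gives µU(x) = |x|, which is a norm, in line with (iii).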
We now know that a topological vector space is normable if the zero element has a balanced
convex bounded open neighborhood. (The converse is easy.) But this can be improved:
B.52 Lemma Let V be a topological vector space and U a convex open neighborhood of 0.
Then there exists a balanced convex open neighborhood U′ ⊆ U of 0.
Proof. Since multiplication by scalars is continuous, there exist an ε > 0 and an open
neighborhood U0 ⊆ U of 0 such that λU0 ⊆ U
whenever |λ| ≤ ε. Thus with W = εU0 we have tW ⊆ U whenever |t| ≤ 1. Put Y = ⋃_{|t|≤1} tW ⊆
U. By construction, Y is a balanced open neighborhood of 0.
For every λ ∈ F with |λ| = 1 it is clear that λU is a convex open neighborhood of 0. Putting
Z = ⋂_{|λ|=1} λU, it is manifestly clear that Z is balanced and 0 ∈ Z. Furthermore, Z is convex
(as an intersection of convex sets). Since tW ⊆ U for all |t| = 1, we have Y ⊆ Z, so that Z
has non-empty interior Z⁰. Now we put U′ = Z⁰ and claim that U′ has the desired properties.
Clearly U′ is an open neighborhood of 0, and as the interior of a convex set it is convex (Exercise
B.47). If |t| = 1 then the map Z → Z, x ↦ tx is a homeomorphism. Thus if x ∈ Z⁰ = U′ then
tx ∈ Z⁰ = U′, showing that U′ = Z⁰ is balanced.
Now we are in a position to prove geometric criteria for normability and local convexity of
topological vector spaces:
B.53 Theorem Let V be a topological vector space. Then V is normable if and only if there
exists a bounded convex open neighborhood of 0.
Proof. If V is normable by the norm ‖ · ‖ then B_{‖·‖}(0, 1) = {x ∈ V | ‖x‖ < 1} is clearly open,
convex (and balanced). To show boundedness, let W ∋ 0 be open. Then there is ε > 0 such
that B(0, ε) ⊆ W. Now clearly εB(0, 1) = B(0, ε) ⊆ W, thus B(0, 1) is bounded.
If there exists a bounded convex open neighborhood U of 0 then by Lemma B.52 we can
assume U in addition to be balanced. (The U′ provided by the lemma is a subset of U, thus
bounded if U is bounded.) Now by Proposition B.51(iii), µU is a norm inducing the given
topology on V.
B.54 Theorem A topological vector space (V, τ ) is locally convex in the sense of Definition
2.37 (i.e. the topology τ comes from a separating family F of seminorms) if and only if it is
Hausdorff and the zero element has an open neighborhood base consisting of convex sets.
Proof. Given a separating family F of seminorms and putting τ = τF, a basis of open
neighborhoods of 0 is given by the finite intersections of sets U_{p,ε} = {x ∈ V | p(x) < ε}, where
p ∈ F, ε > 0. Each of the U_{p,ε} is convex and open, thus also their finite intersections.
And if τ has the stated property, Lemma B.52 gives that 0 has a neighborhood base consisting
of balanced convex open sets. Defining F = {µU | U balanced convex open neighborhood of 0},
each of the µU is a continuous seminorm by Propositions B.48 and B.51. Thus if xι → 0 then
‖xι‖U := µU(xι) → 0. And ‖xι‖U → 0 for all balanced convex open U implies that xι ultimately
is in every open neighborhood of 0, thus xι → 0. Thus τ = τF, and 2.36 gives that F is
separating.
B.56 Definition Let V be a topological vector space over F and V ∗ the space of continuous
linear functionals V → F. For A, B ⊆ V we say that A and B are
(i) separated if there exists ϕ ∈ V∗ with Re ϕ(a) < inf_{b∈B} Re ϕ(b) ∀a ∈ A.
(ii) strictly separated if there exist ϕ ∈ V∗, α ∈ R with Re ϕ(a) < α < Re ϕ(b) ∀a ∈ A, b ∈ B.
(iii) very strictly separated if there exists ϕ ∈ V∗ with sup_{x∈A} Re ϕ(x) < inf_{x∈B} Re ϕ(x).
B.57 Remark 1. Geometrically, (ii) and (iii) are equivalent to A, B being contained in two
disjoint open half spaces (having positive distance in case (iii)), while (i) means that A, B are
contained in an open half-space and its complement, respectively.
2. The sets A = (0, 1), B = (1, 2) in V = R are strictly separated, but not very strictly, while
A = (0, 1), B = [1, 2) are only separated. An example of two closed sets that are non-strictly
separated is A = {(x, y) | x, y > 0, xy ≥ 1}, B = {(x, y) | x ≤ 0} in V = R². □
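That the two closed sets in the last example are separated, but not strictly, can be checked directly from Definition B.56. The following is a sketch; the functional ϕ(x, y) = −x is a choice made here for illustration.

```latex
% Sketch for A = {(x,y) : x,y > 0, xy >= 1}, B = {(x,y) : x <= 0} in R^2.
% Assumed choice (not from the notes): \varphi(x,y) = -x.
\[
  \varphi(a) = -x < 0 \quad \text{for } a=(x,y)\in A,
  \qquad
  \inf_{b\in B}\varphi(b) = \inf_{x\le 0}\,(-x) = 0 ,
\]
% hence \varphi(a) < \inf_{b\in B}\varphi(b) for all a \in A: A and B are separated.
\[
  \sup_{a\in A}\varphi(a) = 0 = \min_{b\in B}\varphi(b) ,
\]
% hence no \alpha satisfies \varphi(a) < \alpha < \varphi(b) for all a \in A, b \in B.
```

Any functional (x, y) ↦ ux + vy with v ≠ 0 is unbounded from above and below on B, so only multiples of the x-coordinate can separate A from B, and none of them does so strictly.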
B.58 Theorem Let V be a topological vector space and A, B ⊆ V disjoint non-empty convex
subsets, A being open. Then
(i) A and B are separated.
(ii) If also B is open, A and B are strictly separated.
Proof. (i) Pick a0 ∈ A, b0 ∈ B and put z = b0 − a0 and U = (A − a0) − (B − b0) = A − B + z,
which is a convex (as pointwise sum of two convex sets) open (since U = ⋃_{x∈−B−a0+b0} (A + x))
neighborhood of 0. Let p = µU be the associated Minkowski functional. As a consequence of
A ∩ B = ∅ we have 0 ∉ A − B, thus z ∉ U, and therefore p(z) ≥ 1.
Put W = Rz and define ψ : W → R, cz ↦ c. For c ≥ 0 we have ψ(cz) = c ≤ cp(z) = p(cz),
and for c < 0 trivially ψ(cz) = c < 0 ≤ p(cz).
Thus by sublinearity of p and Theorem 9.2 there exists a linear functional ϕ : V → R satisfying
ϕ↾W = ψ, thus ϕ(cz) = c, and ϕ(x) ≤ p(x) ∀x ∈ V. Thus also −p(−x) ≤ −ϕ(−x) = ϕ(x),
and since x → 0 implies p(x) → 0, ϕ is continuous at zero, thus everywhere.
If now a ∈ A, b ∈ B then a − b + z ∈ U, so that p(a − b + z) < 1. Thus
\[ \varphi(a) - \varphi(b) + 1 = \varphi(a - b + z) \le p(a - b + z) < 1 , \]
thus ϕ(a) < ϕ(b) for all a ∈ A and b ∈ B. Thus the subsets ϕ(A), ϕ(B) of R are disjoint.
Since A, B are convex, they are connected. Consequently, ϕ(A), ϕ(B) are connected, thus
intervals. Since A is open, so is ϕ(A) (open mapping theorem). If we put s = sup ϕ(A), we have
ϕ(a) < s ≤ ϕ(b) for all a ∈ A, b ∈ B, and this is equivalent to ϕ(a) < inf_{b∈B} ϕ(b) for all a ∈ A.
F = C: Considering V as R-vector space, apply the above to obtain a continuous R-linear
functional ϕ0 : V → R such that ϕ0(a) < inf_{b∈B} ϕ0(b) ∀a ∈ A. Now define ϕ : V → C, x ↦
ϕ0(x) − iϕ0(ix). This clearly is continuous and satisfies Re ϕ = ϕ0, so that the desired inequality
holds. That ϕ is C-linear follows from the same argument as in the proof of Theorem 9.5.
(ii) It suffices to consider F = R. With α = inf_{b∈B} Re ϕ(b), by (i) we have Re ϕ(a) < α for
all a ∈ A. Since B is open, ϕ(B) ⊆ R is open by Exercise 6.4 (or the open mapping theorem),
so that ϕ does not assume its infimum α on B, whence α < Re ϕ(b) ∀b ∈ B.
B.59 Lemma If V is a locally convex space and K ⊆ U ⊆ V with K compact and U open,
there is a convex open neighborhood N ⊆ V of zero such that K + N ⊆ U .
Proof. For brevity, we only prove the Lemma for normed spaces and leave the generalization to
locally convex spaces as an Exercise.
For every x ∈ K there exists εx > 0 such that B(x, 2εx) ⊆ U. Since {B(x, εx)}_{x∈K} is an
open cover of the compact set K, there are x1, . . . , xn such that K ⊆ ⋃_{i=1}^n B(xi, ε_{xi}). Put
ε = min(ε_{x1}, . . . , ε_{xn}) > 0 and N = B(0, ε), which is open and convex. Now
\[ K + N \subseteq \bigcup_{i=1}^{n} B(x_i, \varepsilon_{x_i}) + B(0, \varepsilon) \subseteq \bigcup_{i=1}^{n} B(x_i, 2\varepsilon_{x_i}) \subseteq U . \]
B.60 Corollary Let V be a locally convex vector space and A, B ⊆ V disjoint non-empty
convex subsets with A compact and B closed. Then A and B are very strictly separated.
Proof. We only discuss F = R, the changes for F = C being the same as above. Applying
Lemma B.59 to K = A, U = V\B, we obtain a convex open N ∋ 0 such that A + N ⊆ V\B. It
is easy to show that A + N is open and convex. (Do it!) Applying Theorem B.58(i) to A + N
and B, we obtain ϕ ∈ V∗ such that ϕ(a) < inf_{b∈B} ϕ(b) ∀a ∈ A + N ⊃ A. Since ϕ assumes its
supremum on the compact set A, we have sup_{a∈A} ϕ(a) < inf_{b∈B} ϕ(b).
B.61 Corollary Let V be a locally convex vector space and W ⊆ V a proper closed subspace.
Then for every x ∈ V\W there exists a continuous linear functional ϕ ∈ V∗ such that ϕ↾W = 0
and ϕ(x) ≠ 0. In particular, for every x ∈ V\{0} there exists ϕ ∈ V∗ with ϕ(x) ≠ 0.
Proof. Let W, x be as stated. Put A = {x}, B = W. Then A and B are disjoint closed convex
subsets, and A is compact. Thus they are very strictly separated by Corollary B.60, thus there
exists ϕ ∈ V∗ such that Re ϕ(x) > sup_{w∈W} Re ϕ(w). Since W is a linear subspace, finiteness of
the supremum implies ϕ↾W = 0.
For the final claim, apply the above with W = {0}.
B.63 Corollary The weak and norm closures of a convex set in a Banach space coincide.
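Proof. Let X ⊆ V be convex. Then also $\overline{X}^{\|\cdot\|}$ is convex, thus weakly closed, so that
$\overline{X}^{w} \subseteq \overline{\overline{X}^{\|\cdot\|}}^{\,w} = \overline{X}^{\|\cdot\|}$. Combining this with the obvious inclusion $\overline{X}^{w} \supseteq \overline{X}^{\|\cdot\|}$ we are done.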
where the final equality is due to the fact that X is balanced, thus in particular closed under
multiplication by all λ with |λ| = 1. Now (sup_{x∈X} |ϕ0(x)|)⁻¹ ϕ0 has the desired properties.
B.65 Theorem (Goldstine) If V is a Banach space then V≤1 is σ(V ∗∗ , V ∗ )-dense in (V ∗∗ )≤1 .
Proof. We abbreviate τ = σ(V∗∗, V∗). The unit ball (V∗∗)≤1 is τ-compact by Alaoglu’s theorem,
thus τ-closed, so that B = $\overline{V_{\le 1}}^{\tau}$, which is convex by Exercise B.47, is contained in (V∗∗)≤1. If
this inclusion is strict, pick x∗∗ ∈ (V∗∗)≤1 \ $\overline{V_{\le 1}}^{\tau}$. Then x∗∗ has a τ-open neighborhood U disjoint
from B, and by Theorem B.54 there is a convex open A ⊆ U containing x∗∗. Now Theorem B.58(i) applied to
(V∗∗, τ) and A, B ⊆ V∗∗ gives a τ = σ(V∗∗, V∗)-continuous linear functional ϕ ∈ (V∗∗)⋆ such
that Re ϕ(a) < inf_{b∈B} Re ϕ(b) ∀a ∈ A. Now Exercise 10.23 gives ϕ ∈ V∗ ⊆ V∗∗∗.
Putting ψ = −ϕ we have sup_{b∈B} Re ψ(b) < Re ψ(a) ∀a ∈ A, which is more convenient. Since
ψ ∈ V∗ and B ⊇ V≤1, we have ‖ψ‖ ≤ sup_{b∈B} Re ψ(b). On the other hand, with x∗∗ ∈ A and
‖x∗∗‖ ≤ 1, we have Re ψ(x∗∗) ≤ |ψ(x∗∗)| ≤ ‖x∗∗‖‖ψ‖ ≤ ‖ψ‖. Combining these findings, we
have ‖ψ‖ ≤ sup_{b∈B} Re ψ(b) < Re ψ(x∗∗) ≤ ‖ψ‖, which is absurd. This contradiction proves
$\overline{V_{\le 1}}^{\tau}$ = (V∗∗)≤1.
We will see below that closedness of conv(X) does not follow from closedness of X, and for
infinite-dimensional V not even from compactness of X.
B.70 Exercise Prove: If V is a topological vector space and X ⊆ V is open then conv(X) is
open.
B.71 Exercise Let X = {(x, y) | x ∈ R, y ≥ 1/(1 + x²)} ⊂ R² (which is closed). Prove that
conv(X) = {(x, y) | x ∈ R, y > 0} (which is open, but not closed).
B.73 Exercise Let V be a topological vector space and X ⊆ V. Prove that $\overline{\mathrm{conv}(X)}$ coincides
with the intersection of all closed convex sets that contain X.
B.74 Definition If V is a topological vector space and X ⊆ V we call $\overline{\mathrm{conv}(X)}$ the closed
convex hull of X, but mostly write $\overline{\mathrm{conv}}(X)$ for readability.
B.75 Proposition (Mazur’s compactness theorem (S. Mazur 1930)) Let V be a normed
space and X ⊆ V. Then
(i) If X is totally bounded then so is conv(X).
(ii) If V is Banach and X is precompact then conv(X) is precompact and $\overline{\mathrm{conv}}(X)$ compact.
Proof. (i) Let X ⊆ V be totally bounded, and let ε > 0. Then there are z1, . . . , zK ∈ X such
that $X \subseteq \bigcup_{k=1}^{K} B(z_k, \varepsilon)$. Now C = conv({z1, . . . , zK}) ⊆ V is compact as the image in V of the
compact set $\{(t_1, \dots, t_K) \mid t_k \ge 0,\ \sum_{k=1}^{K} t_k = 1\} \subseteq \mathbb R^K$ under the continuous map
$(t_1, \dots, t_K) \mapsto \sum_{k=1}^{K} t_k z_k$. Thus there are y1, . . . , yL ∈ C ⊆ conv(X) such that $C \subseteq \bigcup_{l=1}^{L} B(y_l, \varepsilon)$.
If now y ∈ conv(X) then by Exercise B.67 we have $y = \sum_{m=1}^{M} t_m x_m$ for certain x1, . . . , xM ∈
X, t1, . . . , tM ≥ 0 with $\sum_{m=1}^{M} t_m = 1$. For each m pick km such that $\|x_m - z_{k_m}\| < \varepsilon$. Putting
$y' = \sum_{m=1}^{M} t_m z_{k_m} \in C$, we have
\[ \|y - y'\| = \Big\| \sum_{m=1}^{M} t_m (x_m - z_{k_m}) \Big\| \le \sum_{m=1}^{M} t_m \|x_m - z_{k_m}\| < \varepsilon . \]
Picking l such that $\|y' - y_l\| < \varepsilon$, we have $\|y - y_l\| \le \|y - y'\| + \|y' - y_l\| < 2\varepsilon$. This proves
$\mathrm{conv}(X) \subseteq \bigcup_{l=1}^{L} B(y_l, 2\varepsilon)$. Since ε > 0 was arbitrary, conv(X) is totally bounded.
(ii) Since V is complete, total boundedness and precompactness of its subsets are equivalent
by Exercise A.43. Thus conv(X) is precompact by (i), and by definition this means that $\overline{\mathrm{conv}}(X)$
is compact.
B.76 Remark Giving a description of $\overline{\mathrm{conv}}(X)$ that is similarly explicit as (B.8) is difficult.
The convex hull of X = {x1, . . . , xn} is compact, thus coincides with $\overline{\mathrm{conv}}(X)$. Now assume
X = {x1, x2, . . .} is a bounded sequence, i.e. ‖xi‖ ≤ C ∀i. Then for all {ti ≥ 0} with $\sum_i t_i = 1$
we have $\sum_i t_i \|x_i\| \le C$, thus $\sum_i t_i x_i$ converges absolutely to some x ∈ V. It is easy to see that
x ∈ $\overline{\mathrm{conv}}(X)$, but in general it is not clear that every x ∈ $\overline{\mathrm{conv}}(X)$ is of the form $\sum_i t_i x_i$. This
can be proven if the sequence {xn} is convergent. See [102, Lemma 3.4.29] for the proof due to
Grothendieck. □
B.6.6 Extreme points and faces of convex sets. The Krein-Milman theorem
B.77 Definition Let V be a vector space over F ∈ {R, C} and X ⊆ V convex.
(i) A subset F ⊆ X is a face of X if it is convex and tx + (1 − t)y ∈ F with x, y ∈ X, 0 < t < 1
implies x, y ∈ F .
(ii) x ∈ X is an extreme point if {x} is a face. (Thus x is not a non-trivial convex combination.)
The set of extreme points of X is denoted E(X).
B.79 Exercise Let X be a convex set. Prove that x ∈ X is an extreme point if and only if
X\{x} is convex.
B.81 Exercise Let V = R3 and let X be the closed convex hull of
Determine the set E of extreme points of X and show that it is not closed.
B.82 Exercise Show that the closed unit ball in the Banach space c0 has no extreme points.
B.83 Exercise Show that the closed unit ball in the Banach space L1 ([0, 1], λ), where λ is
Lebesgue measure on the Borel σ-algebra, has no extreme points.
B.84 Theorem (Krein-Milman 1940) 123 Every non-empty compact convex subset of a locally
convex space is the closed convex hull $\overline{\mathrm{conv}}(E)$ of its set E of extreme points. (In particular
E ≠ ∅.)
Proof. Let V be locally convex and X ⊆ V compact and convex. We may assume X ≠ ∅ since
otherwise there is nothing to prove. Let 𝓕 be the set of compact faces of X. Then X ∈ 𝓕, thus
𝓕 ≠ ∅. Partially order 𝓕 by reverse inclusion. If 𝒞 ⊆ 𝓕 is a chain (totally ordered subset) then
it has the finite intersection property, so that F = ⋂𝒞 is non-empty by Lemma B.42. Assume
x ∈ F satisfies x = ty + (1 − t)z, where 0 < t < 1 and y, z ∈ X with y ≠ z. Since every G ∈ 𝒞
contains x and is a face, it follows that y, z ∈ G. Thus y, z ∈ ⋂_{G∈𝒞} G = ⋂𝒞 = F, proving that
F is a face and compact. Thus F is an upper bound for 𝒞, so that by Zorn’s lemma 𝓕 has a
maximal element, thus a face that is minimal (in the sense of not having a proper subset that
is a compact face).
We claim that such a minimal face F is a singleton. Assuming x, y ∈ F with x ≠ y, Corollary B.61 provides a
continuous linear functional ϕ ∈ V∗ such that ϕ(x − y) ≠ 0. Multiplying by a phase we can
achieve Re ϕ(x) ≠ Re ϕ(y). Put M = sup_{w∈F} Re ϕ(w) and F′ = {w ∈ F | Re ϕ(w) = M}. Then
F′ ≠ ∅ since Re ϕ assumes its supremum on the compact set F. Now F′ ⊆ F is a closed subset
and convex by linearity of ϕ. If z, z′ ∈ F, 0 < t < 1 are such that tz + (1 − t)z′ ∈ F′ then in view of
t Re ϕ(z) + (1 − t) Re ϕ(z′) = M we must have Re ϕ(z) = Re ϕ(z′) = M, thus z, z′ ∈ F′, proving
that F′ ⊆ F is a compact face. Since the face F is minimal by construction, we have F′ = F.
Thus x, y ∈ F′, implying the contradiction Re ϕ(x) = M = Re ϕ(y). Thus F is a singleton.
Since its element is an extreme point, we have E ≠ ∅.
It remains to prove that X0 = $\overline{\mathrm{conv}}(E)$ equals X. If this is not true, picking z ∈ X\X0, Corollary
B.60 provides a ϕ ∈ V∗ such that sup_{x∈X0} Re ϕ(x) < Re ϕ(z). Define M = sup_{x∈X} Re ϕ(x)
and F = {x ∈ X | Re ϕ(x) = M}. Since X is compact, Re ϕ assumes its supremum, thus
F ≠ ∅. As before F is a compact face of X, and applying the above to F there exists an
extreme point y ∈ F. Now Re ϕ(y) = M. By Exercise B.80, y is an extreme point of X, thus
y ∈ E ⊆ X0. This implies the contradiction M = Re ϕ(y) ≤ sup_{x∈X0} Re ϕ(x) < Re ϕ(z) ≤ M.
Thus $\overline{\mathrm{conv}}(E)$ = X.
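A finite-dimensional example makes the statement concrete. The following sketch treats the square X = [−1, 1]² ⊆ R², which is chosen here for illustration; the abbreviations a, b are likewise introduced only for this computation.

```latex
% Sketch: X = [-1,1]^2 in R^2, with E = {(1,1), (1,-1), (-1,1), (-1,-1)}.
% A boundary point that is not a corner is a non-trivial convex combination:
\[
  (1,s) = \tfrac{1+s}{2}\,(1,1) + \tfrac{1-s}{2}\,(1,-1), \qquad |s| < 1 .
\]
% Every point of X is a convex combination of the corners.
% Assumed abbreviations: a = (1+x)/2, b = (1+y)/2, both in [0,1].
\[
  (x,y) = ab\,(1,1) + a(1-b)\,(1,-1) + (1-a)b\,(-1,1) + (1-a)(1-b)\,(-1,-1) .
\]
% The four weights are non-negative and sum to 1, so X = conv(E).
```

Here conv(E) is already closed, so no closure is needed; the closure in the theorem only matters in infinite dimensions.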
B.85 Remark 1. With a little extra effort one can prove E ⊆ S for every closed set S ⊆ X
such that $\overline{\mathrm{conv}}(S)$ = X.
2. Exercise B.82 shows that ‘compact’ cannot be replaced by ‘closed bounded’ in Theorem
B.84.
3. Also the local convexity of V is essential: In the metrizable, but not locally convex, TVS
L^p([0, 1]) with 0 < p < 1 there exist [137] compact convex sets without extreme points. □
123 Mark Grigorievich Krein (1907-1989), David Milman (1912-1982), Soviet mathematicians. (Milman emigrated to
Israel.) Among other things, Krein is also known for the Tannaka-Krein duality theory for compact groups, Milman
e.g. for the M.-Pettis theorem, cf. Section B.6.8.
B.86 Corollary Let V be a Banach space. Then
(i) $(V^*)_{\le 1} = \overline{\mathrm{conv}}^{\,w^*}\big(E((V^*)_{\le 1})\big)$ (weak-∗ closure).
(ii) If V is reflexive then $V_{\le 1} = \overline{\mathrm{conv}}^{\,w}\big(E(V_{\le 1})\big)$ (weak closure).
Proof. Since the weak and weak-∗ topologies are locally convex, this follows by combining
Krein-Milman with Alaoglu’s Theorem for (i) and with the weak compactness of V≤1 for (ii).
This proves that the closed unit ball of a Banach space has ‘enough’ extreme points whenever
V is reflexive or a dual space. (In view of this Exercise B.82 again shows that c0 is not a dual
space.) While ℓ¹ and ℓ∞ are non-reflexive, they are dual spaces, so that their unit balls have
extreme points. They can be identified:
B.88 Remark 1. The statement that the closed unit ball (V∗)≤1 has extreme points for every
Banach space V implies AC. This is proven in less than a page in [13]. In view of Corollary
B.86 we thus have the implication Alaoglu+KM ⇒ AC. (But HB+KM ⇏ AC.) Recall that over
ZF, Alaoglu’s theorem is equivalent to the Ultrafilter Lemma, Tychonov’s theorem for Hausdorff
spaces, etc. Since ZF+UL+DCω is provably weaker than AC, it follows that the Krein-Milman
theorem cannot be proven in the framework ZF+UL+DCω (which does suffice for OMT, UBT,
HBT and Alaoglu).
2. In Exercise B.67 we found a representation for all elements of the convex hull of a set.
Doing a similar thing for the closed convex hull is much harder. One solution is the following
Choquet-Bishop-de Leeuw124 theorem: If V is a locally convex space, X ⊂ V is compact and
convex, and x ∈ X then there is a probability measure µ on the set E of extreme points such
that $x = \int_E y \, d\mu(y)$ as a weak integral: For every ϕ ∈ V∗ we have $\varphi(x) = \int_E \varphi(y)\, d\mu(y)$. For a
(Note that this is trivial if X is compact.) This can fail in complex Banach spaces, but it does
hold for the closed unit ball in a complex Banach space, giving the result mentioned in Remark
9.25.2.) See e.g. [102, Section 2.11]. □
B.89 Proposition For every Banach space V , the following are equivalent:
(i) x, y ∈ V, ‖x‖ = ‖y‖ = 1, x ≠ y implies ‖x + y‖ < 2.
(ii) If x, y ∈ V satisfy ‖x + y‖ = ‖x‖ + ‖y‖ then y = 0 or x = cy with c ≥ 0.
(iii) The set of extreme points of the closed unit ball V≤1 equals V1 = {x ∈ V | ‖x‖ = 1}.
Proof. (i)⇒(ii) Let x, y ∈ V satisfy ‖x + y‖ = ‖x‖ + ‖y‖. If x = 0 or y = 0 then we are done.
By rescaling and/or exchanging if necessary we may assume 1 = ‖x‖ ≤ ‖y‖. Put z = y/‖y‖.
Then
\[ 2 \ge \|x+z\| = \Big\| x + y - \Big(1-\tfrac{1}{\|y\|}\Big) y \Big\| \ge \|x+y\| - \Big(1-\tfrac{1}{\|y\|}\Big)\|y\| = \|x\| + \|y\| - \|y\| + 1 = 2 , \]
where we used ‖a − b‖ ≥ ‖a‖ − ‖b‖ and the assumptions ‖x + y‖ = ‖x‖ + ‖y‖ and ‖x‖ = 1. This
implies ‖x + z‖ = 2. Since ‖x‖ = 1 = ‖z‖ (by assumption and by construction of z), the strict
convexity implies x = z = y/‖y‖. Thus y = ‖y‖x, and we have proven (ii).
(ii)⇒(i) Assume x, y ∈ V, x ≠ y and ‖x‖ = ‖y‖ = 1. If ‖x + y‖ = 2 were true then (ii) would
give x = cy with c > 0, but then ‖x‖ = ‖y‖ gives c = 1, thus the contradiction x = y. Since
‖x + y‖ ≤ 2 is obvious, we must have ‖x + y‖ < ‖x‖ + ‖y‖ = 2.
(ii)⇒(iii) Let E be the set of extreme points of V≤1. As noted above, if x ∈ V1 is a
non-trivial convex combination x = ty + (1 − t)z with y, z ∈ V≤1, we must have y, z ∈ V1, implying
‖x‖ = ‖ty‖ + ‖(1 − t)z‖. Since (ii) holds, we have that y and z are related by a positive scalar,
thus y = z. Thus x is an extreme point.
(iii)⇒(i) Let x, y ∈ V1 with x ≠ y. Then (x + y)/2 is a non-trivial convex combination and
therefore not in E = V1, so that ‖(x + y)/2‖ < 1. Thus ‖x + y‖ < 2, and we have (i).
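Condition (i) is easy to test in R². The following sketch contrasts the maximum norm with the Euclidean norm; the two vectors are chosen here for illustration, and the second computation uses the parallelogram identity of Section 5.2.

```latex
% Sketch: condition (i) fails for the maximum norm on R^2 ...
% Assumed example vectors (not from the notes): x = (1,1), y = (1,-1).
\[
  \|x\|_\infty = \|y\|_\infty = 1, \quad x \neq y, \quad
  \|x+y\|_\infty = \|(2,0)\|_\infty = 2 .
\]
% ... but holds for any norm coming from an inner product:
% for unit vectors x \neq y the parallelogram identity gives
\[
  \|x+y\|_2^2 = 2\|x\|_2^2 + 2\|y\|_2^2 - \|x-y\|_2^2 = 4 - \|x-y\|_2^2 < 4 .
\]
```

In terms of (iii), the point (1, 0) = (x + y)/2 has maximum norm one but is not an extreme point of the unit square.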
B.90 Definition A Banach space V is called strictly convex if it satisfies the equivalent con-
ditions in Proposition B.89.
Combining the above with Exercises B.82, B.87, B.83 shows that the spaces c0, ℓ¹, ℓ∞ and
L¹([0, 1], λ) are not strictly convex. On the other hand:
B.91 Exercise Prove that ℓ^p(S, F) is strictly convex for every S and 1 < p < ∞.
Since every Hilbert space is isometrically isomorphic to some ℓ²(S, F), it is strictly convex.
This is very easy to show directly and also follows by combining Exercise 5.35 with the following:
B.92 Proposition (Taylor-Foguel) 126 Let V be a Banach space. Then the following are
equivalent:
(i) V∗ is strictly convex.
(ii) For every closed subspace W ⊆ V and ϕ ∈ W∗ there is a unique ϕ̂ ∈ V∗ with ϕ̂|W = ϕ
and ‖ϕ̂‖ = ‖ϕ‖.
Proof. (i)⇒(ii) Existence of ϕ̂ is guaranteed by Hahn-Banach. Assume W ⊆ V is a closed
subspace, ϕ ∈ W∗ and ψ1, ψ2 ∈ V∗ are such that ψ1 ≠ ψ2, but ψ1|W = ψ2|W = ϕ,
‖ψ1‖ = ‖ψ2‖ = ‖ϕ‖. Then ψ′ = (ψ1 + ψ2)/2 satisfies ψ′|W = ϕ, thus
\[ \|\varphi\| = \|\psi'|_W\| \le \|\psi'\| \le \frac{\|\psi_1\| + \|\psi_2\|}{2} = \|\varphi\| \]
and therefore ‖ψ′‖ = ‖ϕ‖, contradicting the consequence ‖(ψ1 + ψ2)/2‖ < ‖ψ1‖ = ‖ψ2‖ of strict
convexity of V∗.
(ii)⇒(i) Assume V∗ is not strictly convex. Then there are ϕ1, ϕ2 ∈ V∗ with ϕ1 ≠ ϕ2 and
‖ϕ1‖ = ‖ϕ2‖ = 1 and ‖ϕ1 + ϕ2‖ = 2. Then W = {x ∈ V | ϕ1(x) = ϕ2(x)} ⊆ V is a closed
linear subspace and proper (since ϕ1 ≠ ϕ2). Put ψ = ϕ1|W = ϕ2|W ∈ W∗. We will prove
‖ψ‖ = 1. Then ϕ1, ϕ2 are distinct norm-preserving extensions of ψ ∈ W∗ to V, providing a
counterexample for uniqueness of norm-preserving extensions.
Since ϕ1 − ϕ2 ≠ 0, there exists z ∈ V with ϕ1(z) − ϕ2(z) = 1. Now every x ∈ V can be
written uniquely as x = y + cz, where y ∈ W, c ∈ C: Put c = ϕ1(x) − ϕ2(x) and then y = x − cz.
Now it is obvious that y ∈ W. Uniqueness of such a representation follows from z ∉ W.
Since ‖ϕ1 + ϕ2‖ = 2, we can find a sequence {xn} ⊆ V with ‖xn‖ = 1 ∀n such that
ϕ1(xn) + ϕ2(xn) → 2. Since |ϕi(xn)| ≤ 1 for i = 1, 2 and all n, it follows that ϕi(xn) → 1 for i =
1, 2. Now write xn = yn + cn z, where {yn} ⊆ W and {cn} ⊆ C. Then cn = ϕ1(xn) − ϕ2(xn) → 0.
Thus ‖xn − yn‖ = |cn|‖z‖ → 0, so that ‖yn‖ → 1. And with cn → 0 we have
B.94 Exercise Let V = ℓ¹(N, R), equipped with the norm ‖f‖ = ‖f‖1 + ‖f‖2. Prove that (i)
(V, ‖ · ‖) is a Banach space, (ii) strictly convex, but (iii) not uniformly convex.
126 Angus Ellis Taylor (1911-1999), American mathematician, proved (i)⇒(ii) in 1939. Shaul Reuven Foguel (1931-
2020), Israeli mathematician, proved (ii)⇒(i) in 1958.
B.95 Theorem (Milman-Pettis 1938/9) Every uniformly convex Banach space is reflexive.
Proof. (Following Ringrose [133]) Assume V is uniformly convex, but not reflexive. Let S ⊆ V
and S∗∗ ⊆ V∗∗ be the unit spheres (sets of elements of norm one). Since S = S∗∗ easily implies
V = V∗∗, we have S ⊊ S∗∗. If x∗∗ ∈ S∗∗\S then by the obvious norm-closedness of S ⊆ S∗∗
there is ε > 0 such that B(x∗∗, ε) ∩ S = ∅. Since ‖x∗∗‖ = 1, we can find ϕ ∈ V∗ with ‖ϕ‖ = 1
and |x∗∗(ϕ) − 1| < δ(ε)/2. Now U = {y∗∗ ∈ V∗∗ | |y∗∗(ϕ) − 1| < δ(ε)} ⊆ V∗∗ is a
τ := σ(V∗∗, V∗)-open neighborhood of x∗∗. By Goldstine’s Theorem B.65, V≤1 ⊆ (V∗∗)≤1 is
τ-dense. If {xα} ⊆ V≤1 is a net τ-converging to x ∈ S∗∗ then ‖xα‖ → 1 and xα/‖xα‖ → x in τ. Thus
S ⊆ S∗∗ is τ-dense, thus S ∩ U ≠ ∅. If now y1, y2 ∈ S ∩ U then |ϕ(y1) + ϕ(y2)| > 2 − 2δ(ε). With
‖ϕ‖ = 1 this implies ‖y1 + y2‖ > 2 − 2δ(ε). Thus by uniform convexity we have ‖y1 − y2‖ < ε.
Since every net in S that τ-converges to x∗∗ ultimately lives in U, picking any y1 ∈ S ∩ U we
have ‖x∗∗ − y1‖ ≤ ε. But this contradicts the choice of ε.
The converse of the theorem is not true. In fact there are spaces that are reflexive and
strictly convex, but not uniformly convex, but the construction [36] is laborious. Note also that
the dual of a uniformly convex space need not be uniformly convex!
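B.96 Theorem For every measure space (X, A, µ) and 1 < p < ∞, the space L^p(X, A, µ; F) is
uniformly convex and reflexive.
Proof. We follow [86]. Let 0 < ε ≤ 2^{1−p}. Then the set
\[ Z = \Big\{ (x,y) \in \mathbb R^2 \;\Big|\; |x|^p + |y|^p = 2,\ \Big|\frac{x-y}{2}\Big|^p \ge \varepsilon \Big\} \]
is closed and bounded, thus compact, and non-empty since (2^{1/p}, 0) ∈ Z. Since the function
R → R, t ↦ |t|^p is strictly convex, we have $\big|\frac{x+y}{2}\big|^p < \frac{|x|^p+|y|^p}{2}$ whenever x ≠ y. Thus
\[ \rho(\varepsilon) = \inf_{(x,y)\in Z} \Big( \frac{|x|^p+|y|^p}{2} - \Big|\frac{x+y}{2}\Big|^p \Big) > 0 . \]
Let now 0 < ε < 2^{1−p} and f, g ∈ L^p(X, A, µ) with ‖f‖p = ‖g‖p = 1 and $\|(f+g)/2\|_p^p > 1 - \delta$.
Writing f, g instead of f(x), g(x), we put
\[ M = \Big\{ x \in X \;\Big|\; \Big|\frac{f-g}{2}\Big|^p \ge \varepsilon\, \frac{|f|^p+|g|^p}{2} \Big\} . \]
Now
\begin{align*}
\Big\|\frac{f-g}{2}\Big\|_p^p
 &= \int_{X\setminus M} \Big|\frac{f-g}{2}\Big|^p + \int_{M} \Big|\frac{f-g}{2}\Big|^p \\
 &\le \varepsilon \int_{X\setminus M} \frac{|f|^p+|g|^p}{2} + \int_{M} \frac{|f|^p+|g|^p}{2} \\
 &\le \varepsilon \int_{X} \frac{|f|^p+|g|^p}{2} + \frac{1}{\rho(\varepsilon)} \int_{M} \Big( \frac{|f|^p+|g|^p}{2} - \Big|\frac{f+g}{2}\Big|^p \Big) \\
 &\le \varepsilon \int_{X} \frac{|f|^p+|g|^p}{2} + \frac{1}{\rho(\varepsilon)} \int_{X} \Big( \frac{|f|^p+|g|^p}{2} - \Big|\frac{f+g}{2}\Big|^p \Big) \\
 &\le \varepsilon + \frac{1}{\rho(\varepsilon)} - \frac{1-\delta}{\rho(\varepsilon)} = \varepsilon + \frac{\delta}{\rho(\varepsilon)} .
\end{align*}
(In the second row we used the definition of M and (4.1), in the third we used (B.9), which
holds on M, in the fourth the fact that the expression in brackets is non-negative on X\M, and
finally we used the assumptions $\|f\|_p^p \le 1$, $\|g\|_p^p \le 1$ and $\|(f+g)/2\|_p^p > 1 - \delta$.) Now choosing
δ < ερ(ε) we have $\|(f-g)/2\|_p^p \le 2\varepsilon$, thus uniform convexity (more precisely, an implication
equivalent to it).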
Reflexivity now follows from Theorem B.95.
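For p = 2 the estimate above collapses to a single identity, which may serve as a sanity check. The following sketch assumes only the parallelogram identity of Section 5.2, applied to unit vectors f, g in a Hilbert space such as L².

```latex
% Sketch: the case p = 2 via the parallelogram identity.
% For \|f\|_2 = \|g\|_2 = 1:
\[
  \Big\|\frac{f+g}{2}\Big\|_2^2 + \Big\|\frac{f-g}{2}\Big\|_2^2
  = \frac{\|f\|_2^2 + \|g\|_2^2}{2} = 1 ,
\]
% so the hypothesis \|(f+g)/2\|_2^2 > 1 - \delta gives directly
\[
  \Big\|\frac{f-g}{2}\Big\|_2^2 < \delta .
\]
```

For p ≠ 2 no such identity is available, which is why the proof has to pass through the compactness argument defining ρ(ε).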
B.97 Remark The uniform convexity of L^p for 1 < p < ∞ was first proven by Clarkson in
1936 with a fairly complicated proof. (Reflexivity was known earlier thanks to F. Riesz’ proof
of (L^p)∗ ≅ L^q.) A simpler proof, still giving optimal bounds, can be found in [73]. □
Now we are in a position to complete the determination of Lp (X, A, µ)∗ for arbitrary measure
space (X, A, µ) and 1 < p < ∞ without invocation of the Radon-Nikodym theorem:
B.98 Corollary Let 1 < p < ∞ and (X, A, µ) any measure space. Then the canonical map
Lq (X, A, µ; F) → Lp (X, A, µ; F)∗ is an isometric bijection.
Proof. Let (X, A, µ) be any measure space, 1 < p < ∞ and q the conjugate exponent. We
abbreviate L^p(X, A, µ) to L^p. As discussed (without complete, but hopefully sufficient detail)
in Section 4.7, the map ϕ : L^q → (L^p)∗, g ↦ ϕg is an isometry, so that only surjectivity remains
to be proven. Assume ϕ(L^q) ⊊ (L^p)∗. The subspace being closed (since L^q is complete and ϕ is
an isometry), by Hahn-Banach there is a 0 ≠ ψ ∈ (L^p)∗∗ such that ψ↾ϕ(L^q) = 0. By reflexivity
of L^p (Theorem B.96), there is an f ∈ L^p such that ψ = ι_{L^p}(f). This implies ϕg(f) = ψ(ϕg) = 0
for all g ∈ L^q. With $\varphi_g(f) = \int f g \, d\mu = \varphi'_f(g)$, where ϕ′ : L^p → (L^q)∗ is the canonical map,
B.99 Theorem (Eidelheit 1940, Chernoff 1969/73) 127 Let V, W be normed spaces and
A ⊆ B(V ), B ⊆ B(W ) subalgebras containing the algebras F (V ) and F (W ), respectively, of
finite rank operators. Then every algebra isomorphism α : A → B is of the form α(A) = T AT −1
for some isomorphism T : V → W of Banach spaces.
In particular, an algebraic isomorphism K(V) ≅ K(W) or B(V) ≅ B(W) implies V ≃ W.
Proof. We closely follow [7] with some extra details. Pick a rank-one idempotent P ∈ A. Then
α(P )2 = α(P 2 ) = α(P ), thus α(P ) ∈ B is an idempotent and non-zero (since α is injective).
We claim it has rank one. [If not, pick a non-zero proper subspace of W 0 = α(P )W and an
idempotent Q with image W 0 . It satisfies Qα(P ) = α(P )Q = Q. Since Q ∈ F (W ) ⊆ B and α is
an isomorphism, there is an idempotent P 0 ∈ A with α(P 0 ) = Q. It satisfies P 0 P = P P 0 = P 0 ,
thus is a non-zero proper subprojection of P , but this is impossible since P has rank one.]
127 Meier Eidelheit (1910-1943), Polish mathematician, killed in the Holocaust. Paul Robert Chernoff (1942-2017), American analyst.
Pick non-zero vectors x0 ∈ PV, y0 ∈ α(P)W and an S ∈ B(V, W) with Sx0 = y0. [E.g., pick
any ϕ ∈ V^* with ϕ(x0) ≠ 0. Then S : x ↦ ϕ(x0)^{-1} ϕ(x) y0 does the job.] Define linear maps
ψ1 : 𝒜P → V, AP ↦ Ax0, ψ2 : ℬα(P) → W, Bα(P) ↦ By0.
Now ψ1 is injective quite trivially, and it is surjective since for every x ∈ V there exists A ∈ F(V)
with Ax0 = x and since F(V) ⊆ 𝒜. Similarly, ψ2 is a linear bijection. Since the bijection
α : 𝒜 → ℬ restricts to a bijection α0 : 𝒜P → ℬα(P), the composite T = ψ2 ◦ α0 ◦ ψ1^{-1} : V → W
is a linear bijection. By definition of these maps, we have
T A x0 = α(A) y0 = α(A)α(P)S x0 for all A ∈ 𝒜,
which is the same as saying TAP = α(A)α(P)SP. If now A, A′ ∈ 𝒜 then this implies
T AA′ P = α(AA′)α(P)SP = α(A)α(A′)α(P)SP = α(A) T A′ P.
Thus TAA′x0 = α(A)TA′x0, and since every x ∈ V is of the form A′x0 for some A′ ∈ F(V) ⊆ 𝒜,
we have the identity TA = α(A)T (of linear maps V → W). With invertibility of T this is the
same as α(A) = TAT^{-1}.
It remains to prove that T is bounded. Pick y′ ∈ W\{0}. For each ϕ ∈ W^*, we have the
finite rank operator y′ ⊗ ϕ ∈ F(W) : y ↦ y′ϕ(y). Since F(W) ⊆ ℬ, there exists A ∈ 𝒜 such that
y′ ⊗ ϕ = α(A) = TAT^{-1}. Thus A = T^{-1}(y′ ⊗ ϕ)T = (T^{-1}y′) ⊗ (ϕ ◦ T). Since A is bounded and
T^{-1}y′ ≠ 0, this implies that ϕ ◦ T = T^t ϕ is bounded for each ϕ ∈ W^*. As proven in Exercise
9.32, this implies that T is bounded.
B.101 Remark The conclusion of the theorem implies ‖α‖ ≤ ‖T‖‖T^{-1}‖ < ∞, so that α
is bounded. Thus a purely algebraic hypothesis implies a continuity statement. This is an
early instance of the subject of ‘automatic continuity’, cf. [32]. For another such statement see
Theorem B.173. □
Thus we are left with non-normal operators on infinite-dimensional separable spaces. In 1954,
Aronszajn and Smith128 proved that every compact operator on a complex Banach space has a
non-trivial invariant subspace. Lomonosov129 used Schauder’s fixed point theorem, see Section
B.15, to prove the following stronger statement:
The l.h.s. of this tends to zero since ‖c^{-1}A_i‖ ≤ 1 for all i and ‖(cK)^m‖ → 0 by assumption,
thus 0 ∈ B. But this contradicts the fact that B is closed by definition and 0 ∉ B. The only way
out of this contradiction is that there exists y ≠ 0 with Wy ≠ V, which then is a non-trivial
closed hyperinvariant subspace.
128 Nachman Aronszajn (1907-1980), Polish-American mathematician who worked on functional analysis and mathematical logic. Kennan Tayler Smith (1926-2000), American mathematician.
129 Victor Lomonosov (1946-2018), Russian-American mathematician who mostly worked on functional analysis.
B.103 Remark 1. While in finite dimensions every operator has non-trivial invariant subspaces,
Lomonosov’s theorem fails there since 1 is compact, while there clearly is no non-trivial
subspace that is invariant under all operators. This is also why we required K ≠ 0 above.
2. With the Aronszajn-Smith result, the invariant subspace problem is reduced (over C) to
non-compact non-normal operators on separable spaces. In 1975/1987 Enflo (already encountered
in connection with the approximation property) constructed an operator on a Banach
space having no invariant subspace [48]. By now, more examples are known, even on ℓ^1 [127],
but all live on non-reflexive spaces. It is still an open question whether all operators on reflexive
spaces, or at least on separable Hilbert spaces, must have an invariant subspace! (In a paper
[49] from May 2023, Enflo claims to prove just this, but this has not yet been verified.) □
Why should anyone be interested in the existence of invariant subspaces? The arguments
used to prove the existence of invariant subspaces in finite dimensions and for normal opera-
tors on Hilbert spaces are closely related to results (usually called ‘spectral theorem’ only for
normal operators) giving representations of those operators by standard forms. Thus proving
the existence of invariant subspaces will invariably lead to structural results on the family of
operators in question.
As we saw in Section 14.4, for a compact operator A ∈ B(V) it is not too difficult to construct
a family {P_λ}_{λ∈σ(A)} of mutually commuting idempotents satisfying Σ_λ P_λ = 1 (converging
unconditionally in the strong operator topology) and such that each V_λ = P_λ V is an A-invariant
subspace with σ(A↾V_λ) = {λ}. For λ ≠ 0, each V_λ is finite-dimensional and (A − λ1)↾V_λ is
nilpotent. But saying something non-trivial about A↾V_0 requires the less trivial existence of
invariant subspaces for quasi-nilpotent compact operators given by the above theorem. Using
the latter, good results on normal forms for compact operators have indeed been proven by
Ringrose130 [134, 135] and Brodsky131 [24].
An analogous inequality holds for the cokernels, but there is no exact additivity of the dimensions
of the kernels or cokernels. Yet additivity does hold for the Fredholm indices! Generalizing the
definition of Fredholm operators to arbitrary vector spaces and putting ind(A) = dim ker A −
dim cokerA we have:
proving the claim in the case of finite-dimensional spaces.
In the general case, we will find complementary subspaces X0, X1 ⊆ X (thus X = X0 +
X1, X0 ∩ X1 = {0}) and similarly Y0, Y1 ⊆ Y and Z0, Z1 ⊆ Z such that X0, Y0, Z0 are
finite-dimensional, AX0 ⊆ Y0, BY0 ⊆ Z0 and AX1 = Y1, BY1 = Z1 with X1 ∩ ker A = {0} = Y1 ∩ ker B.
Thus BA : X1 → Z1 is a bijection, so that with the first half of the proof we have
It remains to construct the spaces in question. We need one fact from linear algebra: If X1 , X2 ⊆
X are linear subspaces of X with X1 ∩ X2 = {0}, there exists a complementary subspace of X1
containing X2 . (This follows by applying Zorn’s lemma to the family of subspaces of X that
contain X2 and have trivial intersection with X1 .)
We put X0 = ker(BA) = A^{-1}(ker B) ⊇ ker A, which has dimension at most dim ker A +
dim ker B. Letting X1 be any complement of X0 in X, we have X1 ∩ ker A = {0}. Putting
Y1 = AX1, the map A : X1 → Y1 is a bijection, and since A : X → Y has finite cokernel, Y1
has finite codimension in Y. We have Y1 ∩ ker B = {0} (since BA↾X1 is injective, thus also
B↾AX1 = B↾Y1). Thus there is a complement Y0, clearly finite-dimensional, of Y1 in Y containing
ker B. Putting Z1 = BY1, by the same argument as earlier, Z1 ⊆ Z has finite codimension.
Since ker B ⊆ Y0 by construction, we have BY0 ∩ Z1 = {0}. Thus there exists a complement
Z0, clearly finite-dimensional, of Z1 containing BY0. Now all our claims are satisfied.
Note that by the first part of the proof, every A ∈ End V , where V is finite-dimensional, is
Fredholm with index zero.
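The finite-dimensional case can be checked by hand or by machine. The following is a minimal sketch, not part of the original notes: it assumes operators F^n → F^m are given as m×n matrices with rational entries, and the helper names rank, index and matmul are ours. It computes ind(A) = dim ker A − dim coker A via rank-nullity and illustrates additivity of the index under composition on one example.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        # look for a pivot in column c at or below row r
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def index(rows):
    """ind(A) = dim ker A - dim coker A for A : F^n -> F^m given as an m x n matrix."""
    n_rows, n_cols = len(rows), len(rows[0])
    rk = rank(rows)
    dim_ker = n_cols - rk     # rank-nullity
    dim_coker = n_rows - rk   # codimension of the image
    return dim_ker - dim_coker

def matmul(b, a):
    """Matrix product b * a, i.e. the composite 'first a, then b'."""
    return [[sum(b[i][k] * a[k][j] for k in range(len(a)))
             for j in range(len(a[0]))] for i in range(len(b))]

A = [[1, 2, 3], [2, 4, 6]]      # F^3 -> F^2, rank 1
B = [[1, 0], [0, 0], [1, 1]]    # F^2 -> F^3, rank 2
BA = matmul(B, A)               # F^3 -> F^3

print(index(A), index(B), index(BA))
```

In this example ind(A) + ind(B) = ind(BA), and the endomorphism BA has index zero even though its kernel and cokernel are non-trivial.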
B.105 Proposition Let V, W be Banach spaces and A ∈ B(V, W). Then A^t is Fredholm if and
only if A is Fredholm. Under these equivalent hypotheses we have dim ker(A^t) = dim coker(A)
and dim coker(A^t) = dim ker(A), thus ind(A^t) = −ind(A).
Proof. Assume that A is Fredholm, in particular it satisfies dim(W/AV) < ∞. By Exercise
7.11 this implies closedness of AV ⊆ W. By Exercise 9.35(i), ker A^t = (AV)^⊥ ⊆ W^*. And by
Exercise 6.7, (AV)^⊥ is isometrically isomorphic to (W/AV)^* as Banach spaces. Since W/AV is
finite-dimensional, we have dim ker(A^t) = dim(W/AV)^* = dim(W/AV) = dim coker(A).
Since A has closed image, the same is true for A^t by Exercise 9.41. By Exercise 9.36(i)
we have ker A = (A^t W^*)^⊤, thus (ker A)^* ≅ ((A^t W^*)^⊤)^* ≅ V^*/A^t W^*, where the (isometric)
isomorphism is the one from Exercise 9.16. With finite-dimensionality of ker A we thus have dim ker(A) =
dim(ker A)^* = dim(V^*/A^t W^*) = dim coker(A^t). Now it is clear that A^t is Fredholm with
ind(A^t) = −ind(A).
Now assume that A^t is Fredholm. As before, this implies that A^t has closed image. As a
consequence of a result that we did not prove, A has closed image. Now we can argue as above,
resulting in A being Fredholm.
B.107 Theorem (Atkinson) 132 Let V be an infinite-dimensional133 Banach space and A ∈
B(V ). Then the following are equivalent:
(i) A is Fredholm.
(ii) There exists a Fredholm B with ind(B) = −ind(A) and ABA = A,¹³⁴ such that 1 − AB and
1 − BA are finite rank idempotents.
(iii) There exists B ∈ B(V ) such that AB − 1 and BA − 1 are compact.
(iv) The image of A in the Calkin135 algebra C(V ) = B(V )/K(V ) is invertible.
Proof. (i)⇒(ii) Being Fredholm, A has finite-dimensional kernel and cokernel. Thus ker A is
complemented, and we can find a closed complement V1 ⊆ V. And since AV ⊆ V is closed by
Exercise 7.11 and has finite codimension, it has a closed complement V2. Now A↾V1 is injective,
thus A′ = A↾V1 : V1 → AV is a bounded linear bijection. By the BIT, its inverse (A′)^{-1} : AV → V1 is
bounded. Define B : V → V by B = (A′)^{-1} on AV and as zero on the complement V2 of AV.
By construction B is bounded. We have ker B = V2, thus dim ker B = dim coker A < ∞, and
BV = V1, thus dim coker B = dim ker A < ∞. Thus B is Fredholm with ind(B) = −ind(A).
Now BA is the identity on V1 and zero on ker A, thus BA = 1 − P1, where P1 is the unique
idempotent with P1V = ker A and (1 − P1)V = V1. Since ker A is finite-dimensional, P1 has
finite rank. And ABA = A(1 − P1) = A − AP1 = A. Similarly, AB is the identity on AV and
zero on the complement V2 of AV. Thus AB = 1 − P2, where P2 is the idempotent with image
V2 and kernel AV. Since V2 is finite-dimensional, P2 has finite rank.
(ii)⇒(iii) is trivial. (iii)⇒(i) There exists B ∈ B(V ) such that AB = 1+C, BA = 1+D with
C, D compact. Since this implies ker A ⊆ ker(1+D), and Lemma 14.1 gives dim ker(1+D) < ∞,
we have dim ker A < ∞. On the other hand, (1 + C)V ⊆ AV and since (1 + C)V ⊆ V has finite
codimension by Proposition 14.7, so has AV ⊆ V , thus cokerA = V /AV is finite-dimensional.
Thus A is Fredholm.
(iii)⇔(iv) Invertibility of q(A) ∈ C(V) means that there is a B ∈ B(V) such that q(A)q(B) =
1_{C(V)} and q(B)q(A) = 1_{C(V)}. Since K(V) ⊆ B(V) is a two-sided ideal, q is a homomorphism.
Thus q(A)q(B) = q(AB), which equals 1_{C(V)} if and only if AB ∈ 1 + K(V). Similarly for
q(B)q(A), so that the claim follows.
(ii) Let A be Fredholm and pick B as in Theorem B.107(ii). If now A′ is Fredholm with
‖A − A′‖ < ‖B‖^{-1} then ‖AB − A′B‖ < 1, so that D = 1 + A′B − AB is invertible by Lemma
13.19(ii), thus ind(D) = 0. Now DA = A + A′BA − ABA = A + A′BA − A = A′BA using ABA =
A. Since A, A′, B, D are Fredholm, this implies ind(D) + ind(A) = ind(A′) + ind(B) + ind(A).
With ind(D) = 0 and ind(B) = −ind(A), we conclude ind(A′) = ind(A). Thus ind is locally
constant (constant on an open neighborhood of each point) and therefore continuous.
(iii) If A ∈ B(V) is Fredholm and K is compact then q(A) ∈ C(V) is invertible by Theorem
B.107, thus also q(A + K) = q(A) is invertible, so that A + K is Fredholm. Now the map [0, 1] →
Fr(V), t ↦ A + tK is continuous, thus with (ii) also t ↦ ind(A + tK) is continuous. Since
a continuous map from a connected space to a discrete space is constant, we have ind(A) =
ind(A + K).
B.109 Corollary If A ∈ B(V ) is compact and λ ∈ F\{0} then A−λ1 is Fredholm with index
zero, thus dim ker(A − λ1) = dim coker(A − λ1).
Proof. Apply Theorem B.108 to 1 − A/λ, noting that 1 is Fredholm with index zero and that A − λ1 = −λ(1 − A/λ).
B.110 Proposition If V is a Banach space and A ∈ B(V ) then the following are equivalent:
(i) A is Fredholm with index zero.
(ii) There exists a compact K ∈ K(V ) such that A + K is invertible.
Proof. (ii)⇒(i) If there exists K ∈ K(V) such that A + K is invertible, then A + K is Fredholm with index zero.
By Theorem B.108(iii), A = (A + K) − K is Fredholm with ind(A) = ind(A + K) = 0.
(i)⇒(ii) If A is Fredholm with ind(A) = 0 then ker A and coker A have the same finite
dimension. Now ker A has a closed complement V1, and AV (which we know to be automatically
closed) has a finite-dimensional complement V2. Now dim V2 = dim coker A = dim ker A < ∞.
Thus we can find an invertible (and bounded, by finite-dimensionality) B : ker A → V2. Since
we have the isomorphisms (ker A) ⊕ V1 ≃ V ≃ V2 ⊕ AV, we can define a bounded K : V → V
as B on ker A and zero on V1. Since K has finite rank, it is compact. And A + K restricts to
the invertible maps B : ker A → V2 and A′ = A↾V1 : V1 → AV and therefore is invertible.
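The construction in (i)⇒(ii) can be traced on a small matrix. The following is only a sketch under our own assumptions: V = F^3 with the standard basis e1, e2, e3, the matrix A and the choice B : e1 ↦ e2 are illustrative, and det and add are hypothetical helper names. Here ker A = span(e1), V1 = span(e2, e3), AV = span(e1, e3) and V2 = span(e2).

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Columns are the images of e1, e2, e3: A e1 = 0, A e2 = e1, A e3 = 2 e3.
A = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 2]]

# K acts as B : ker A -> V2, e1 -> e2, and as zero on V1 = span(e2, e3).
K = [[0, 0, 0],
     [1, 0, 0],
     [0, 0, 0]]

print(det(A), det(add(A, K)))
```

The matrix A is singular while A + K is invertible, and K has rank one, as in the proof.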
B.111 Exercise Let V be a Banach space and A ∈ B(V ) Fredholm. Prove: There exists a
compact K such that A + K is surjective (resp. injective) if and only if ind(A) ≥ 0 (resp. ≤ 0).
If X is a topological space, recall that π0(X) is the set of path components of X, i.e.
X/∼, where x ∼ y if and only if there exists p ∈ C([0, 1], X) with p(0) = x, p(1) = y. The
∼-equivalence class of x is denoted [x]. A continuous map f : X → Y induces a map f∗ : π0(X) → π0(Y).
B.112 Proposition Let V be a Banach space. Then π0 (Fr(V )) admits a group structure such
that [A][B] = [AB] and a homomorphism ind : π0 (Fr(V )) → Z such that ind([A]) = ind(A).
If ι : InvB(V) → Fr(V) is the inclusion map, we have ker ind = ι∗(π0(InvB(V))). (Thus
π0(InvB(V)) → π0(Fr(V)) → Z is an exact sequence.)
Proof. The set (Fr(V), ·, 1) is a monoid (defined like a group, but without inverses). If
A, A′, B, B′ ∈ Fr(V) are such that A ∼ A′ and B ∼ B′ then AB ∼ A′B′, so that [A][B] = [AB]
defines a monoid structure on π0(Fr(V)). It actually is a group with unit [1]: If A ∈ Fr(V) then
by Theorem B.107(ii) there is a Fredholm B such that P1 = 1 − BA is a finite rank idempotent.
Define C_t = BA + tP1. Since {C_t}_{t∈[0,1]} is a continuous path in Fr(V) from BA to the identity,
we have [B][A] = [BA] = [1] in π0(Fr(V)). Similarly [A][B] = [1].
The map ind : Fr(V ) → Z is a monoid homomorphism by Proposition B.104, and the local
constancy of ind (Theorem B.108(ii)) implies that we have a well defined map π0 (Fr(V )) → Z,
also denoted by ind, such that ind([A]) = ind(A). This clearly is a group homomorphism.
The set InvB(V) of invertible operators is a group, and the same holds for π0(InvB(V)) by the
same argument as before. Since every invertible operator is Fredholm with index zero, the inclusion
map ι : InvB(V) → Fr(V) induces a monoid homomorphism ι∗ : π0(InvB(V)) → π0(Fr(V))
such that the composite map ind ◦ ι∗ : π0(InvB(V)) → π0(Fr(V)) → Z is zero. Thus ι∗(π0(InvB(V))) ⊆
ker ind.
It remains to prove that ind([A]) = 0 implies [A] ∈ ι∗(π0(InvB(V))). On the level of operators
instead of path-components this amounts to proving that every Fredholm operator with
index zero can be connected to an invertible operator by a continuous path in Fr(V). Let
thus A ∈ B(V) with dim ker A = dim coker A < ∞. Since ker A ⊆ V is finite-dimensional and
AV ⊆ V is closed of finite codimension, both are complemented. Identifying coker A with a
complement of AV, this gives rise to direct sum decompositions V = V1 ⊕ ker A = V2 ⊕ coker A
(with V2 = AV) with respect to which A is described by the matrix
\begin{pmatrix} A′ & 0 \\ 0 & 0 \end{pmatrix},
where A′ : V1 → V2 is invertible. Since ker A and coker A have the same dimension, we can find an
invertible B : ker A → coker A. For t ∈ [0, 1] let
A_t = \begin{pmatrix} A′ & 0 \\ 0 & tB \end{pmatrix} ∈ B(V).
Since A_t ∈ Fr(V) for all t and A_0 = A, while A_1 is invertible, we have achieved our goal.
B.114 Theorem If H is a complex Hilbert space, the set of Fredholm operators of index n on
H is path-connected for every n ∈ Z. Thus Fr(H) is the disjoint union of open path-connected
components, one for each n ∈ Z. Equivalently, the map π0 (Fr(H)) → Z is a bijection.
Proof. Surjectivity of the index map π0 (Fr(H)) → Z is immediate by Exercise B.113. Injectiv-
ity follows at once from Proposition B.112 if we prove π0 (InvB(H)) = 0. But this is nothing
other than path-connectedness of InvB(H), which we proved in Proposition 18.17 (for F = C).
B.115 Theorem (Feldman & Kadison 1954) Let H be a separable Hilbert space. Then
\overline{Inv B(H)}^{‖·‖} = {A ∈ B(H) | AH ⊆ H is non-closed or dim ker A = dim ker A^* ∈ N0 ∪ {∞}}.
For an accessible proof (using the essential spectrum, see the next section) see [20].
B.116 Definition Let V be a complex Banach space and A ∈ B(V ). Define
• σess,1 (A) = {λ ∈ C | A − λ1 is not Fredholm}.
• σess,2 (A) = {λ ∈ C | A − λ1 is not Fredholm with index zero}.
(σess,1(A) is called the Fredholm (essential) spectrum of A and σess,2(A) the Weyl (essential)
spectrum. There are many other definitions discussed in the literature, cf. e.g. [45].)
Some immediate observations:
• Since every invertible operator is Fredholm with index zero, we have
• This implies that σess,1 (K) = σess,2 (K) = {0} for all compact K, in particular for all
A ∈ B(V ) if V is finite-dimensional.
• If A is Fredholm of non-zero index then 0 ∉ σess,1(A), while 0 ∈ σess,2(A). Thus the two
essential spectra can differ.
• By Corollary B.106(ii), σess,1 (A) = σess,2 (A) if A is a normal operator on Hilbert space.
Both essential spectra can be expressed in terms of the usual spectrum:
B.117 Lemma If V is a complex Banach space and A ∈ B(V) then with the quotient map
Q : B(V) → B(V)/K(V) we have
σess,1(A) = σ(Q(A)), σess,2(A) = ⋂_{K∈K(V)} σ(A + K).
Thus σess,1(A) and σess,2(A) are closed and non-empty.
Proof. The first statement is immediate by Atkinson’s Theorem B.107 and implies that σess,1(A)
is closed and non-empty. And λ ∈ ⋂_{K∈K(V)} σ(A + K) is equivalent to the statement that
λ ∈ σ(A + K) for all compact K, thus A + K − λ1 is non-invertible for all K ∈ K(V).
By Proposition B.110 this is equivalent to A − λ1 not being Fredholm with index zero, thus to
λ ∈ σess,2(A). As an intersection of closed sets, σess,2(A) is closed. And σess,2(A) ⊇ σess,1(A) ≠ ∅.
σess,2 (A) is the largest part of σ(A) that is stable under compact perturbations of A:
B.118 Exercise If σ′ : B(V) → P(C) is such that σ′(A) ⊆ σ(A) and σ′(A + K) = σ′(A) for
all A ∈ B(V) and K ∈ K(V), prove that σ′(A) ⊆ σess,2(A) for all A ∈ B(V).
In this sense, σess,2 (A) is the ‘best’ definition of essential spectrum. Yet, we will have a look at
a popular third definition, sitting between σess,2 (A) and σ(A). The theory of the Riesz projector
(Exercises 13.53, 13.66, 13.68) will play an important role. We begin with a preparatory result
similar to Proposition B.110:
B.119 Proposition Let V be a complex Banach space and A ∈ B(V ). Then the following are
equivalent:
(i) 0 is an isolated point of σ(A), and P0 V is finite-dimensional, where P0 is the Riesz projector
for λ = 0.
(ii) There are closed subspaces V1, V2 ⊆ V with V1 + V2 = V and V1 ∩ V2 = {0}, where V1 has
non-zero finite dimension, such that AVi ⊆ Vi. With Ai = A↾Vi, A1 is nilpotent, and A2
is invertible.
(iii) 0 is an isolated point of σ(A) and A is Fredholm of index zero.
(iv) 0 is an isolated point of σ(A) and A is Fredholm.
(v) A is not invertible, and there is a compact K such that AK = KA and A + K is invertible.
Proof. (i)⇒(ii) With V1 = P0V, V2 = (1 − P0)V, we have AVi ⊆ Vi with σ(A↾V1) = {0} and
σ(A↾V2) = σ(A)\{0}. Thus A2 = A↾V2 is invertible and A1 = A↾V1 quasi-nilpotent, thus
nilpotent by finite-dimensionality of V1.
(ii)⇒(iii) By (ii) we have A = A1 ⊕ A2, where A2 is invertible and V1 finite-dimensional.
Thus A1 is Fredholm of index zero, and the same holds for A. And σ(A1) = {0} while 0 ∉ σ(A2).
Since σ(A2) is closed, there is an open neighborhood U of zero such that U ∩ σ(A2) = ∅, thus
U ∩ σ(A) = {0}. Thus 0 is an isolated point of σ(A).
(iii)⇒(iv) This is trivial.
(iv)⇒(i) Since 0 ∈ σ(A) is isolated, it has a Riesz projector P0. Let Vi, Ai be as above. Then A1
is quasi-nilpotent and A2 is invertible. Together with the fact that A is Fredholm, this implies
that A1 ∈ B(V1) is Fredholm. Now Exercise B.120 below gives that V1 is finite-dimensional.
(ii)⇒(v) It is clear that A is not invertible. Since A1 is nilpotent, A1 + 1_{V1} ∈ B(V1) is
invertible by Lemma 13.19. Now K = 1_{V1} ⊕ 0 is compact since dim V1 < ∞, commutes with
A = A1 ⊕ A2, and A + K = (A1 + 1_{V1}) ⊕ A2 is invertible.
(v)⇒(iii) By assumption, B = A + K is invertible, implying that A = B − K is Fredholm
of index zero. Since B commutes with K (and A), with Exercise 15.8(ii) we have σ(A) ⊆
σ(B) − σ(K). Since A is not invertible, we have 0 ∈ σ(A), thus there are λ ∈ σ(B) ∩ σ(K).
Since σ(B) is closed and does not contain zero, there is an ε > 0 such that B(0, ε) ∩ σ(B) = ∅.
Since σ(K) has zero as only limit point, we see that σ(B) ∩ σ(K) is finite. If λ ∈ σ(K)\{0} and
r > 0 is small enough so that B(λ, r) ∩ σ(K) = {λ} then A commutes with 1 − ((K − λ1)/r)^n
(which is invertible) for each n, thus also with the inverses and with their limit, the Riesz
projector (for K!)
P_λ = lim_{n→∞} (1 − ((K − λ1)/r)^n)^{-1}.
Thus A maps each of the spaces V_λ = P_λ V into itself, and the restrictions of A and K
commute on each V_λ. Now P = Σ_{λ∈σ(B)∩σ(K)} P_λ is a finite sum, and we put V1 = PV, V2 =
(1 − P)V. On V2, both B = A + K and A are invertible, while A↾V1 is not. Since V1 is
finite-dimensional, A is Fredholm of index zero, and zero is an isolated point of σ(A).
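The step (ii)⇒(v) of the preceding proof can be illustrated by a toy example. This is a sketch under our own assumptions, not taken from the notes: V = F^3 = V1 ⊕ V2 with V1 = span(e1, e2) and V2 = span(e3), A1 a nilpotent Jordan block and A2 = 3; the helpers det, matmul and add are hypothetical names.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def matmul(a, b):
    """Matrix product a * b of square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# A = A1 (+) A2 with A1 nilpotent on V1 = span(e1, e2) and A2 = 3 on V2 = span(e3).
A = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 3]]

# K = 1_{V1} (+) 0, a finite rank (hence compact) operator.
K = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]

commutes = matmul(A, K) == matmul(K, A)
print(commutes, det(A), det(add(A, K)))
```

Here A is not invertible, K commutes with A, and A + K = (A1 + 1) ⊕ A2 is invertible, which is property (v) in this example.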
B.120 Exercise Let V be a complex Banach space and A ∈ B(V ) quasi-nilpotent and Fred-
holm. Prove that V is finite-dimensional.
• The Browder136 essential spectrum is σess,3 (A) = σ(A)\σd (A).
B.122 Remark From property (v) in Proposition B.119 it is evident that σd(A) is definitely
not stable under compact perturbations! □
B.123 Proposition Let V be a complex Banach space and A ∈ B(V). Then the Browder essential spectrum
σess,3(A) satisfies
σess,2(A) ⊆ σess,3(A) = ⋂_{K∈K(V)∩{A}′} σ(A + K) ⊆ σ(A), (B.10)
σess,3(A) = {λ ∈ σ(A) | λ not isolated or A − λ1 not Fredholm of index zero}. (B.11)
If λ ∈ C is such that A − λ1 is not Fredholm of index zero then on the one hand λ ∈ σess,3(A).
On the other, for all compact K Theorem B.108(iii) gives that A + K − λ1 is not Fredholm of
index zero, thus not invertible, so that λ ∈ σ(A + K). A fortiori, λ is in the intersection
in (B.10). Since the latter is contained in σ(A), in order to complete the proof of (B.10), it
suffices to prove that “λ ∈ σess,3 (A) ⇔ λ ∈ σ(A + K) for all compact K commuting with A”
holds under the assumption that A − λ1 is Fredholm of index zero. It suffices to consider the
case λ = 0, which amounts to the statement that 0 is a non-isolated point of σ(A) if and only
if there is no compact K commuting with A such that A + K is invertible. This is exactly the
equivalence (iii)⇔(v) in Proposition B.119.
If the Browder essential spectrum σess,3 (A) is bigger than the Weyl essential spectrum
σess,2 (A), it cannot be stable under all compact perturbations of A. But the above shows
that it still has a very nice stability property.
B.124 Theorem Let H be a complex Hilbert space and A ∈ B(H) normal. Then
σess,1(A) = σess,2(A) = σess,3(A) = {λ ∈ σ(A) | λ is not isolated or dim ker(A − λ1) = ∞}. (B.12)
(We simply call this the essential spectrum σess(A).) Whenever A, A′ are normal and A − A′ is
compact, σess(A) = σess(A′).
Proof. If λ ∈ σ(A) is not isolated, by definition it is contained in (B.11) and in (B.12). If
λ is isolated then by Proposition 17.24 there are mutually orthogonal A-invariant subspaces
H1, H2 ⊆ H with H = H1 + H2 such that H1 = ker(A − λ1) ≠ 0 and (A − λ1)↾H2 is invertible.
Thus A − λ1 is Fredholm if and only if its restriction to H1 is Fredholm. The latter being
136 Felix Earl Browder (1927-2016), American mathematician who mostly worked on functional analysis and differential equations.
identically zero, this is equivalent to dim H1 = dim ker(A − λ1) < ∞. This proves the equality
in (B.12).
We already know that for normal A we have σess,1 (A) = σess,2 (A) ⊆ σess,3 (A). Thus to prove
equality of the three spectra we must show σess,3 (A) ⊆ σess,1 (A). Let thus λ ∈ σess,3 (A). By
(B.12), either ker(A − λ1) is infinite-dimensional or λ ∈ σ(A) is not isolated. In the first case,
A − λ1 is not Fredholm, thus λ ∈ σess,1 (A), and we are done.
We are thus reduced to the situation where ker(A − λ1) is finite-dimensional and λ is not
isolated. By Proposition 11.27(ii), we have a direct sum situation H = (ker(A − λ1)) ⊕ H′,
where H′ = (ker(A − λ1))^⊥. Since ker(A − λ1) is finite-dimensional, A − λ1 is Fredholm if and
only if (A − λ1)↾H′ is. Since (A − λ1)↾H′ is by construction injective, it has dense image
by normality and Proposition 11.27(iii). Since λ ∈ σ(A) is not isolated, Corollary 17.25 gives
that (A − λ1)↾H′ is not invertible. Since it is injective, it is not surjective. Now Exercise
7.11(iii) gives that (A − λ1)↾H′ has infinite-dimensional cokernel. It thus is not
Fredholm, thus also A − λ1 is not Fredholm, implying λ ∈ σess,1(A). This finishes the proof
of the equality of the three spectra for normal operators. The invariance of σess,3(A) under
compact perturbations now follows from that of σess,2(A).
B.126 Remark The essential spectrum was introduced [170, 171] in 1909/10 by Weyl137, who
only considered self-adjoint operators, but allowed unbounded ones since he was studying
differential equations. At that time functional analysis had just started developing, and the tools
used to prove Theorem B.124 were not yet available. (The bounded inverse theorem came in 1929
and Fredholm operators and Theorem B.108 around 1950!) Weyl’s original approach to proving
invariance of σess(A) under compact perturbations was different (though not totally) and is
also interesting, which is why we briefly discuss it now. □
B.127 Definition Let H be a Hilbert space and A ∈ B(H), λ ∈ C. A Weyl sequence for
(A, λ) is an orthonormal sequence {x_n} ⊂ H such that ‖(A − λ1)x_n‖ → 0.
B.128 Proposition Let H be a complex Hilbert space, A ∈ B(H) normal and λ ∈ C. Then
the following are equivalent:
(i) λ ∈ σess (A).
(ii) PA (B(λ, ε))H ⊆ H is infinite-dimensional for each ε > 0. (Compare Proposition 18.20.)
(iii) There exists a Weyl sequence for (A, λ).
Proof. (i)⇒(ii) If dim ker(A − λ1) = ∞ then already P_A({λ})H = ker(A − λ1) is
infinite-dimensional. If λ is an accumulation point of eigenvalues, the linear span of the eigenspaces
ker(A − λ′1) with λ′ ∈ B(λ, ε) is infinite-dimensional for each ε > 0. If neither of these holds,
there is an ε > 0 such that the restriction of A to H′ = P_A(B(λ, ε′))H ∩ (ker(A − λ1))^⊥ has
purely continuous spectrum containing λ for all ε′ ∈ (0, ε). In this case, H′ ⊆ P_A(B(λ, ε))H is
infinite-dimensional for all ε > 0 since otherwise A↾H′ would have λ as eigenvalue. (Compare
also Exercise 18.8(ii).)
137 Hermann Weyl (1885-1955), German mathematician who worked in many areas of mathematics and mathematical physics, like real, complex and functional analysis, differential equations, Lie groups, quantum theory and relativity, as well as philosophy of mathematics.
(ii)⇒(iii) By (ii), for each n ∈ N we can pick x_n ∈ P_A(B(λ, 1/n))H ∩ {x_1, . . . , x_{n−1}}^⊥ with
‖x_n‖ = 1. Then {x_n} is an orthonormal sequence, and ‖(A − λ1)x_n‖ ≤ 1/n → 0. Thus {x_n} is
a Weyl sequence.
(iii)⇒(i) Existence of a Weyl sequence implies that A − λ1 is not bounded below, thus
λ ∈ σ(A). Assume λ ∉ σess(A), thus λ ∈ σd(A). Then A = A1 ⊕ A2 on H = H1 ⊕ H2, where
H1 = ker(A − λ1) is finite-dimensional, A1 = λ1_{H1} and A2 − λ1_{H2} is
invertible, thus bounded below. With ‖(A − λ1)x_n‖ → 0, this implies ‖P2 x_n‖ → 0. Combined
with ‖x_n‖ = 1 ∀n, this gives ‖P1 x_n‖ → 1. Together with the fact that H1 is finite-dimensional
and the {x_n} ⊆ H orthonormal, this produces a contradiction. Thus λ ∈ σess(A).
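For a concrete feeling for Weyl sequences, consider a diagonal operator. The sketch below is our own illustration under stated assumptions: D = diag(d_n) on ℓ^2(N) with the hypothetical entries d_n = 2 + 1/n, the candidate point λ = 2, and the standard basis vectors e_n as orthonormal sequence. Since (D − λ1)e_n = (d_n − λ)e_n, the residual norms can be computed exactly.

```python
from fractions import Fraction

LAM = Fraction(2)  # assumed accumulation point of the diagonal entries

def d(n):
    """n-th diagonal entry (n >= 1) of the assumed diagonal operator D = diag(d_n)."""
    return LAM + Fraction(1, n)

def residual_norm(n):
    """||(D - LAM*1) e_n|| for the standard basis vector e_n; equals |d_n - LAM|."""
    return abs(d(n) - LAM)

residuals = [residual_norm(n) for n in range(1, 6)]
print(residuals)
```

The residuals are 1/n and tend to zero, so {e_n} is a Weyl sequence for (D, 2), consistent with 2 being an accumulation point of the diagonal entries.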
Now we have an alternative proof for the invariance of the essential spectrum under compact
perturbations:
B.129 Theorem (Weyl) Let H be a complex Hilbert space and A, B ∈ B(H) normal such
that A − B is compact. Then σess(A) = σess(B).
Proof. If a Weyl sequence for (A, λ) exists, it is clear that λ ∈ σapp(A) = σ(A).
Assume λ ∈ σess(A). Then there exists a Weyl sequence {x_n} for (A, λ). Now
‖(B − λ1)x_n‖ ≤ ‖(A − λ1)x_n‖ + ‖(B − A)x_n‖.
Here ‖(A − λ1)x_n‖ → 0 by the choice of {x_n}, while ‖(B − A)x_n‖ → 0 since {x_n} is a weak null sequence
and B − A is compact, thus sequentially weak-norm continuous, cf. Section 12.2. Thus {x_n} is a Weyl
sequence for (B, λ). Using Weyl’s
criterion again, λ ∈ σess(B). Thus σess(A) ⊆ σess(B). The converse inclusion follows by A ↔ B.
B.130 Remark There are many generalizations of the theorem, e.g. to unbounded operators.
Cf. e.g. [45] and [129, Section XIII.4]. These generalizations have many applications to
differential equations and quantum mechanics. □
B.10.3 Applications
Apart from the mentioned applications of the essential spectrum to differential equations, there
are applications to operator theory. We mention a few without proofs:
B.132 Theorem Let H be a separable complex Hilbert space and A1 , A2 ∈ B(H) be normal.
Then σess (A1 ) = σess (A2 ) if and only if there exists a unitary U such that A2 − U A1 U ∗ is
compact. (One says A1 and A2 are ‘essentially unitarily equivalent’ or ‘compalent’).
Proof. If A2 is normal and U unitary, it is clear that U A2 U ∗ is normal and σess (U A2 U ∗ ) =
σess (A2 ). Thus if A1 − U A2 U ∗ is compact then σess (A1 ) = σess (U A2 U ∗ ) = σess (A2 ).
Now assume σess (A1 ) = σess (A2 ). By Weyl-von Neumann-Berg there are diagonal D1 , D2
and compact K1 , K2 such that Ai = Di + Ki . Since essential spectra are stable under compact
perturbations,
σess (D1 ) = σess (A1 ) = σess (A2 ) = σess (D2 ).
For a diagonal operator D = diag(d_n) we have λ ∈ σess(D) if and only if λ is an accumulation
point of {d_n}, i.e. {n ∈ N | |d_n − λ| < ε} is infinite for every ε > 0. Thus the eigenvalue
sequences {d_{1,n}}, {d_{2,n}} of D1, D2 have the same limit points. Using this one can construct a
permutation σ of N such that |d_{1,n} − d_{2,σ(n)}| → 0. If {e_{1,n}}, {e_{2,n}} are the ONBs diagonalizing
D1, D2, respectively, there is a unique unitary U such that U e_{1,n} = e_{2,σ(n)} ∀n. Now UD1U^* − D2
is compact, thus UA1U^* − A2 = UD1U^* + UK1U^* − D2 − K2 is compact.
B.133 Theorem Let H be a separable Hilbert space and A, B ∈ B(H) normal. Then the
following are equivalent:
(i) There is a sequence {Un } of unitaries such that B = limn→∞ Un AUn∗ . (‘A and B are
approximately unitarily equivalent.’)
(ii) σess (A) = σess (B) and dim ker(A − λ1) = dim ker(B − λ1) for all λ ∈ C\σess (A).
Proof. See [33, Theorem II.4.4].
and ind(A − λ1) = ind(N + K − λ1) = ind(N − λ1) = 0 whenever N − λ1 is Fredholm, i.e.
λ ∉ σess(N) = σess,1(A). For the much deeper converse see e.g. [33, 75].
The above is just the tip of an iceberg. For more, see the references given above.
‖A‖_p = (Tr(|A|^p))^{1/p} ∈ [0, ∞], L^p(H) = {A ∈ B(H) | ‖A‖_p < ∞}.
For p = 2 this agrees with Definition 12.39 since |A|^2 = A^*A. For p = 1 it specializes to
(iv) For all A, B ∈ B(H) we have ‖AB‖_1 ≤ ‖A‖‖B‖_1 and ‖AB‖_1 ≤ ‖A‖_1‖B‖. Thus L^1(H) ⊆
B(H) is a two-sided ideal.
(v) F(H) ⊆ L^1(H) ⊆ K(H).
(vi) \overline{F(H)}^{‖·‖_1} = L^1(H).
(vii) The normed space (L^1(H), ‖·‖_1) is complete, thus a Banach space.
(viii) (L^1(H), ‖·‖_1) is a Banach ∗-algebra.
(ix) For A ∈ B(H), the following are equivalent:
(α) A ∈ L^1(H), i.e. Tr(|A|) < ∞.
(β) A is a finite linear combination of positive operators with finite trace.
(γ) Σ_e |⟨VAe, e⟩| < ∞ for each ONB E and each unitary V.138
(δ) Σ_e |⟨VAe, e⟩| < ∞ for some ONB E and each unitary V. Under this condition,
‖A‖_1 ≤ sup_{V∈U(H)} Σ_e |⟨VAe, e⟩|.
(ε) Σ_e |⟨Ae, e⟩| < ∞ for each ONB E.
(x) For each ONB E, the unordered sum in
Tr : L^1(H) → C, A ↦ Σ_{e∈E} ⟨Ae, e⟩
is convergent in the sense of Section A.1 (thus absolutely convergent) and independent of
the choice of E, defining a linear functional on L^1(H).
(xi) For all A ∈ L^1(H) we have Tr(A^*) = \overline{Tr(A)}.
(xii) If A ∈ B(H), B ∈ L^1(H) then Tr(AB) = Tr(BA) and |Tr(AB)| ≤ ‖A‖‖B‖_1.
In particular |Tr(B)| ≤ ‖B‖_1, thus Tr ∈ (L^1(H), ‖·‖_1)^*.
Proof. (i) Let B ≥ 0 and x ∈ H a unit vector. If E is an ONB containing x, we have
⟨Bx, x⟩ ≤ Tr_E(B). Since B is positive, we have ‖B‖ = sup_{‖x‖=1} ⟨Bx, x⟩, thus ‖B‖ ≤ Tr(B).
If now A ∈ B(H) with polar decomposition A = U|A|, then applying the above to B = |A| and
using ‖U‖ ≤ 1 gives
‖A‖ = ‖U|A|‖ ≤ ‖|A|‖ ≤ Tr|A| = ‖A‖_1.
(ii) Let A ∈ B(H) with polar decomposition A = U|A| and |A| = U^*A. Since U is a partial
isometry, it satisfies U^*UU^* = U^*, so that U^*U|A| = U^*UU^*A = U^*A = |A|. Thus
(U|A|U^*)^2 = U|A|U^*U|A|U^* = U|A||A|U^* = AA^* = |A^*|^2.
Since |A^*| and U|A|U^* are both positive, taking roots gives U|A|U^* = |A^*|. Choosing an ONB
E such that each e ∈ E is either in ker U^* or in (ker U^*)^⊥, thus U^*e = 0 or U^*e = e, we find
‖A^*‖_1 = Tr_E|A^*| = Tr_E(U|A|U^*) = Σ_e ⟨|A|U^*e, U^*e⟩ ≤ Σ_e ⟨|A|e, e⟩ = Tr|A| = ‖A‖_1.
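(iii) The first statement follows from $|\lambda A| = |\lambda||A|$. For the second, let $A, B \in B(H)$ with polar decompositions $A = U|A|$, $B = V|B|$, $A + B = W|A + B|$. If $E$ is an ONB and $F \subseteq E$ is finite,
$$\sum_{e\in F} \langle |A+B|e, e\rangle = \sum_{e\in F} \langle W^*(A+B)e, e\rangle = \sum_{e\in F} \big(\langle W^*U|A|e, e\rangle + \langle W^*V|B|e, e\rangle\big) \le \sum_{e\in F} |\langle W^*U|A|e, e\rangle| + \sum_{e\in F} |\langle W^*V|B|e, e\rangle|. \tag{B.14}$$
For the first sum on the right-hand side we have
$$\sum_{e\in F} |\langle W^*U|A|e, e\rangle| = \sum_{e\in F} |\langle |A|^{1/2}e, |A|^{1/2}U^*We\rangle| \le \sum_{e\in F} \||A|^{1/2}e\|\, \||A|^{1/2}U^*We\| \le \Big(\sum_{e\in F} \||A|^{1/2}e\|^2\Big)^{1/2} \Big(\sum_{e\in F} \||A|^{1/2}U^*We\|^2\Big)^{1/2}, \tag{B.15}$$
where the first ≤ comes from applying Cauchy-Schwarz to the inner product $\langle |A|^{1/2}e, |A|^{1/2}U^*We\rangle$ in $H$, the second ≤ from Cauchy-Schwarz in $\mathbb{C}^{|F|}$.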
The argument of the first square root in the r.h.s. of (B.15) is dominated by $\sum_{e\in E} \||A|^{1/2}e\|^2 = \mathrm{Tr}|A|$, and for the argument of the second root we have
$$\sum_{e\in F} \||A|^{1/2}U^*We\|^2 = \sum_{e\in F} \langle |A|^{1/2}U^*We, |A|^{1/2}U^*We\rangle = \sum_{e\in F} \langle W^*U|A|U^*We, e\rangle \le \mathrm{Tr}(W^*U|A|U^*W).$$
Now, picking an ONB $E$ such that each $e \in E$ is either in $\ker W$ or in $(\ker W)^\perp$, we find $\mathrm{Tr}(W^*U|A|U^*W) \le \mathrm{Tr}(U|A|U^*)$. Repeating the argument with $U$, we have $\mathrm{Tr}(U|A|U^*) \le \mathrm{Tr}|A|$. Thus $\mathrm{Tr}(W^*U|A|U^*W) \le \mathrm{Tr}|A|$, so that $\sum_{e\in F} \||A|^{1/2}U^*We\|^2 \le \mathrm{Tr}|A|$. Inserting this in (B.15), we find
$$\sum_{e\in F} |\langle W^*U|A|e, e\rangle| \le \mathrm{Tr}|A| = \|A\|_1.$$
Analogously one proves the bound $\sum_{e\in F} |\langle W^*V|B|e, e\rangle| \le \mathrm{Tr}|B| = \|B\|_1$ for the other summand in (B.14). Now taking the limit $F \nearrow E$ we have $\|A + B\|_1 \le \|A\|_1 + \|B\|_1$. In view of this, it is clear that $L^1(H)$ is a vector space.
(iv) Let $B = U|B|$ and $AB = V|AB|$ be the polar decompositions of $B$, $AB$, respectively. Using Proposition 11.44(iii), we have $|AB| = V^*AB = V^*AU|B| = W|B|$, where $W = V^*AU$. In view of $\|U\|, \|V\| \le 1$ we have $\|W\| \le \|A\|$. Using $W^*W \le \|W^*W\|\mathbf{1} = \|W\|^2\mathbf{1}$ and Exercise 17.7 we have
$$|AB|^2 = |AB|^*|AB| = |B|W^*W|B| \le \|W\|^2|B|^2.$$
In view of $0 \le A \le B \Rightarrow A^{1/2} \le B^{1/2}$, cf. [110, Theorem 2.2.6], this implies $|AB| \le \|W\||B|$. Thus $\|AB\|_1 = \mathrm{Tr}|AB| \le \|W\|\mathrm{Tr}|B| = \|W\|\|B\|_1 \le \|A\|\|B\|_1$.
The other inequality follows by $\|AB\|_1 = \|(AB)^*\|_1 = \|B^*A^*\|_1 \le \|B^*\|\|A^*\|_1 = \|A\|_1\|B\|$, where we used the bound just proven and (ii). That $L^1(H)$ is an ideal now is obvious.
(v) We have $(x \otimes y)^* = (y \otimes x)$, thus $(x \otimes y)^*(x \otimes y) = \|x\|^2(y \otimes y) = \|x\|^2\|y\|^2(e \otimes e)$ with $e = y/\|y\|$, so that taking roots gives $|x \otimes y| = \|x\|\|y\|(e \otimes e)$, which clearly is in $L^1(H)$. Since every element of $F(H)$ is a finite linear combination of such $x \otimes y$, we have $F(H) \subseteq L^1(H)$.
If $A \in L^1(H)$ then $|A|^2 = A^*A \in L^1(H)$ since $L^1(H)$ is an ideal. Thus for any ONB $E$ we have
$$\sum_{e\in E} \|Ae\|^2 = \sum_e \langle A^*Ae, e\rangle = \sum_e \langle |A|^2e, e\rangle = \mathrm{Tr}_E(|A|^2) < \infty.$$
Let $F \subseteq E$ be finite and $x \in F^\perp$ with $\|x\| = 1$. Then $F \cup \{x\}$ is an orthonormal set and can be completed to an ONB $E$. Thus $\sum_{e\in F} \|Ae\|^2 + \|Ax\|^2 \le \mathrm{Tr}(|A|^2)$, or
$$\|Ax\|^2 \le \mathrm{Tr}(|A|^2) - \sum_{e\in F} \|Ae\|^2. \tag{B.16}$$
If $P_F$ is the orthogonal projection onto $\mathrm{span}_{\mathbb{C}} F$ then $A_F := AP_F$ is a finite rank operator that converges in norm to $A$ by (B.16). Thus $A \in \overline{F(H)}^{\|\cdot\|} = K(H)$, proving $L^1(H) \subseteq K(H)$.
(vi) Assume first $A \in L^1(H)_+$. By (v), $A$ is compact. By Theorem 14.12, compact self-adjoint operators can be diagonalized, thus there is an ONB $E$ such that $A = \sum_{e\in E} \lambda_e\, e \otimes e$. In our case, $\lambda_e \ge 0$ for all $e \in E$ and $\sum_e \lambda_e = \mathrm{Tr}(A) < \infty$. For a finite subset $F \subseteq E$ define $A_F := \sum_{e\in F} \lambda_e\, e \otimes e$, which is finite rank. Now $A - A_F = \sum_{e\in E\setminus F} \lambda_e\, e \otimes e \ge 0$. Thus $\|A - A_F\|_1 = \mathrm{Tr}(A - A_F) = \sum_{e\in E\setminus F} \lambda_e$. With $\sum_e \lambda_e < \infty$ this implies $\|A - A_F\|_1 \to 0$ as $F \nearrow E$, thus $A \in \overline{F(H)}^{\|\cdot\|_1}$.
Let now $A \in L^1(H)$ with polar decomposition $A = U|A|$. Then $|A| \in L^1(H)$, thus by the above for each $\varepsilon > 0$ there is $B \in F(H)$ with $\||A| - B\|_1 < \varepsilon$. With (iv) and $\|U\| \le 1$ this implies $\|A - UB\|_1 = \|U(|A| - B)\|_1 \le \||A| - B\|_1 < \varepsilon$. Since $F(H)$ is an ideal, we have $UB \in F(H)$, finishing the proof of $L^1(H) \subseteq \overline{F(H)}^{\|\cdot\|_1}$. The converse is clear.
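The approximation argument in (vi) can be made concrete on a diagonal operator. The following worked example is only an illustration: the choice $H = \ell^2(\mathbb{N})$ with standard ONB $\{e_n\}$ and eigenvalues $\lambda_n = n^{-2}$ is an assumption made here, not taken from the notes.

```latex
% Illustrative example (assumed data: H = \ell^2(\mathbb{N}), \lambda_n = n^{-2}).
\[
A = \sum_{n=1}^{\infty} n^{-2}\, e_n \otimes e_n , \qquad
A_N = \sum_{n=1}^{N} n^{-2}\, e_n \otimes e_n \in F(H).
\]
% A - A_N is positive, so its trace norm is its trace:
\[
\|A - A_N\|_1 = \mathrm{Tr}(A - A_N) = \sum_{n > N} n^{-2}
\le \int_N^{\infty} \frac{dx}{x^2} = \frac{1}{N} \longrightarrow 0 ,
\]
% whereas the operator norm only sees the largest remaining eigenvalue:
\[
\|A - A_N\| = (N+1)^{-2} \le \|A - A_N\|_1 .
\]
```

The last line is consistent with the bound $\|\cdot\| \le \|\cdot\|_1$ from (i): convergence in $\|\cdot\|_1$ is the stronger statement.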
(vii) This can be proven directly, using $L^1(H) \subseteq K(H)$, cf. e.g. [118, Theorem 3.4.12]. Since we don’t need the result soon, we will follow Murphy in deducing it later from the isometric isomorphism $L^1(H) \cong B(H)_*$ of normed spaces and completeness of the dual space $B(H)^*$.
(viii) It only remains to prove submultiplicativity: If $A, B \in L^1(H)$ then $\|AB\|_1 \le \|A\|\|B\|_1 \le \|A\|_1\|B\|_1$, where we used (i) and (iv).
(ix) (β) ⇒ (α): This is trivial since L1 (H) is a vector space by (iii) and obviously contains
the positive operators of finite trace.
(α) ⇒ (β) Assume A ∈ L1 (H). By (ii), A∗ ∈ L1 (H) so that (iii) implies Re(A), Im(A) ∈
L1 (H). If A = A∗ ∈ L1 (H), let A = A+ − A− be the canonical decomposition with A± ≥ 0 and
A+ A− = 0. Then |A| = A+ + A− , so that Tr(A± ) ≤ Tr|A| = kAk1 < ∞, implying A± ∈ L1 (H).
Thus every trace class operator is a linear combination of four (or fewer) positive trace class operators.
(β) ⇒ (γ) Let $A \in L^1(H)$ and $E$ an ONB for $H$. By (β), we have $A = \sum_{k=1}^K \lambda_k A_k$ with $A_k \in L^1(H)_+$ ∀k. Now $|\langle Ae, e\rangle| \le \sum_{k=1}^K |\lambda_k| \langle A_k e, e\rangle$ for all $e \in E$, so that $A_k \in L^1(H)_+$ ∀k implies $\sum_e |\langle Ae, e\rangle| < \infty$. Since $L^1(H)$ is an ideal, the same holds for $VA$ instead of $A$.
(γ) ⇒ (δ) and (γ) ⇒ (ε) are trivial.
(δ) ⇒ (α) Let $A = U|A|$ be the polar decomposition of $A \in B(H)$. Recall that $U$ maps $\overline{|A|H} \subseteq H$ isometrically to $\overline{AH} \subseteq H$ and sends $\overline{|A|H}^\perp$ to zero. Let $V = U^*$ on $\overline{AH} = \overline{U|A|H} \subseteq H$. Since the closed subspaces $\overline{|A|H}$ and $\overline{AH}$ are unitarily equivalent, they have the same dimension, thus also $\overline{|A|H}^\perp$ and $\overline{AH}^\perp$ have the same dimension. Define $V : \overline{AH}^\perp \to \overline{|A|H}^\perp$ to be (any) isometry. Then $V$ is unitary. Since $VU|A|e = U^*U|A|e = |A|e$ for all $e$, we have
$$\sum_{e\in E} |\langle VU|A|e, e\rangle| = \sum_{e\in E} |\langle |A|e, e\rangle| = \sum_{e\in E} \langle |A|e, e\rangle = \mathrm{Tr}|A|.$$
By assumption, the l.h.s. is finite, thus $\mathrm{Tr}|A| < \infty$. The bound on $\|\cdot\|_1$ is obvious in view of the preceding computation.
(ε) ⇒ (α) Assumption (ε) clearly implies $\langle Ae_n, e_n\rangle \to 0$ for any orthonormal sequence $\{e_n\}_{n\in\mathbb{N}}$. Now Theorem 12.37 gives that $A$ is compact. If $A = B + iC$ with $B, C$ self-adjoint then $B, C$ are compact, thus diagonalizable. Thus $B = \sum_{f\in F} \lambda_f\, f \otimes f$ for a certain ONB $F$ and $\lambda_f \in \mathbb{R}$. Now (ε), which also holds for $B, C$, clearly implies $\|B\|_1 = \mathrm{Tr}|B| = \sum_f |\lambda_f| < \infty$, so that $B \in L^1(H)$. Similarly, $C \in L^1(H)$, thus also $A = B + iC$ is trace-class by (iii).
(x) Let $E$ be an ONB. By (ix)(γ), the sum $\sum_{e\in E} \langle Ae, e\rangle$ is absolutely convergent for each $A \in L^1(H)$. This proves that $\mathrm{Tr}_E : L^1(H) \to \mathbb{C}$ is well-defined and linear. It remains to show that $\mathrm{Tr}_E$ is independent of $E$. By (ix)(β), we have $A = \sum_{k=1}^K \lambda_k A_k$ where $K$ is finite and $A_k \in L^1(H)_+$ ∀k. If now $F$ is another ONB, we have
$$\mathrm{Tr}_E(A) = \sum_k \lambda_k \mathrm{Tr}_E(A_k) = \sum_k \lambda_k \mathrm{Tr}_F(A_k) = \mathrm{Tr}_F(A),$$
where we used the linearity of $\mathrm{Tr}_E$ and $\mathrm{Tr}_F$ and the fact that $\mathrm{Tr}_E(A_k) = \mathrm{Tr}_F(A_k)$ by $A_k \ge 0$ and Lemma 11.51(ii). This proves $\mathrm{Tr}_E = \mathrm{Tr}_F$.
(xi) If $A \in L^1(H)$ and $E$ is an ONB, we have
$$\mathrm{Tr}(A^*) = \sum_e \langle A^*e, e\rangle = \sum_e \langle e, Ae\rangle = \sum_e \overline{\langle Ae, e\rangle} = \overline{\mathrm{Tr}(A)},$$
where the last identity comes from absolute convergence and continuity of complex conjugation.
(xii) Let $B \in L^1(H)$ throughout. If $U$ is unitary then $BU, UB \in L^1(H)$ and
$$\mathrm{Tr}_E(BU) = \sum_{e\in E} \langle BUe, e\rangle = \sum_{e\in E} \langle UBUe, Ue\rangle = \sum_{f\in F} \langle UBf, f\rangle = \mathrm{Tr}_F(UB),$$
Now, as in the proof of (iii), $\sum_{e\in E} \||A|^{1/2}e\|^2 = \mathrm{Tr}|A|$ and $\sum_{e\in E} \||A|^{1/2}U^*e\|^2 \le \mathrm{Tr}|A|$. This proves $|\mathrm{Tr}(A)| \le \|A\|_1$ for all $A \in L^1(H)$.
If now $A \in L^1(H)$, $B \in B(H)$ then $|\mathrm{Tr}(AB)| \le \|AB\|_1 \le \|A\|_1\|B\|$ by (iv).
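In finite dimensions every operator is trace class and $\|A\|_1$ is the sum of the singular values, so (i), (iv) and (xii) can be sanity-checked numerically. The following is a minimal sketch for $2 \times 2$ complex matrices; the helper names (`matmul`, `singular_values`, `op_norm`, `trace_norm`) and the sample matrices are illustrative assumptions, not part of the notes.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    """Trace of a 2x2 matrix."""
    return X[0][0] + X[1][1]

def singular_values(X):
    """Singular values s1 >= s2 >= 0 of a 2x2 matrix.

    Uses s1^2 + s2^2 = sum |x_ij|^2 and s1 * s2 = |det X|.
    """
    frob2 = sum(abs(x) ** 2 for row in X for x in row)
    det = abs(X[0][0] * X[1][1] - X[0][1] * X[1][0])
    plus = math.sqrt(frob2 + 2 * det)             # s1 + s2
    minus = math.sqrt(max(frob2 - 2 * det, 0.0))  # s1 - s2
    return (plus + minus) / 2, (plus - minus) / 2

def op_norm(X):
    """Operator norm = largest singular value."""
    return singular_values(X)[0]

def trace_norm(X):
    """Trace norm ||X||_1 = Tr|X| = sum of the singular values."""
    return sum(singular_values(X))

# Arbitrary sample matrices (assumed for illustration only).
A = [[1 + 2j, -1j], [0.5, 3]]
B = [[2, 1j], [-1 + 1j, 0.25]]

tr_ab = trace(matmul(A, B))
tr_ba = trace(matmul(B, A))

print("Tr(AB) =", tr_ab, " Tr(BA) =", tr_ba)
print("|Tr(AB)| =", abs(tr_ab), "<=", op_norm(A) * trace_norm(B))
print("||AB||_1 =", trace_norm(matmul(A, B)), "<=", op_norm(A) * trace_norm(B))
print("||A|| =", op_norm(A), "<= ||A||_1 =", trace_norm(A))
```

Such a check proves nothing in infinite dimensions, where the convergence questions handled in (ix) and (x) are the real content.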
(iii) The map α : (L1 (H), k · k1 ) → (B(H), k · k)∗ is isometric, but it is not surjective if H is
infinite-dimensional.
Proof. (i) [110, Theorem 4.2.3].
(ii) [110, Theorem 4.2.1].
(iii) Since $\|\alpha(A)\| \le \|A\|_1$ for all $A \in L^1(H)$ and $\|\alpha(A) \restriction K(H)\| = \|A\|_1$ by (ii), it is clear that also $\|\alpha(A)\| = \|A\|_1$, thus $\alpha$ is isometric.
For each x, y ∈ H we have x ⊗ y ∈ K(H), and Tr(A(x ⊗ y)) = Tr((Ax) ⊗ y) = hAx, yi. Thus
if α(A) = Tr(A·) vanishes on the compact operators then A = 0, thus α(A) = 0.
However, if H is infinite-dimensional, K(H) ⊆ B(H) is a proper closed ideal, and the
quotient space C(H) = B(H)/K(H) (the Calkin algebra) is non-trivial. Thus it admits a
bounded non-zero functional ψ. If p : B(H) → C(H) is the quotient map then ψ ◦ p is a non-
zero norm-continuous functional on B(H) that vanishes on K(H). Such a functional cannot be
of the form Tr(A·) with A ∈ L1 (H), proving that α is not surjective.
B.139 Remark 1. The results (i), (ii), (iii) are non-commutative analogues of $\ell^\infty(S, \mathbb{F}) \cong \ell^1(S, \mathbb{F})^*$, $\ell^1(S, \mathbb{F}) \cong c_0(S, \mathbb{F})^*$, and Theorem 4.19(v), respectively.
2. One can show that $\alpha(L^1(H)) \subseteq B(H)^*$ consists precisely of the linear functionals that are not only norm-continuous but also ultra-weakly continuous (or, equivalently, normal). Cf. e.g. [110]. This is analogous to Proposition B.21.
3. The analogy between $L^p(H)$ and $\ell^p(S, \mathbb{F})$ extends to $p \notin \{1, 2, \infty\}$: Each space $L^p(H) = \{A \in B(H) \mid \|A\|_p < \infty\}$, the ‘p-th Schatten class’, is a two-sided ideal in $B(H)$ and in fact $L^p(H) \subseteq K(H)$ for all $p$. If $1 \le p \le q < \infty$, it is not hard to show that $\|A\| := \|A\|_\infty \le \|A\|_q \le \|A\|_p$, thus $L^p(H) \subseteq L^q(H) \subseteq K(H)$. For all $1 < p < \infty$ one has $L^p(H) = \overline{F(H)}^{\|\cdot\|_p}$, the Hölder type inequality $\|AB\|_1 \le \|A\|_p\|B\|_q$ for $A \in L^p(H)$, $B \in L^q(H)$ with $1/p + 1/q = 1$, and in fact the duality $L^p(H)^* \cong L^q(H)$. For all this see [151]. □
11.36 and 13.15, and Proposition 13.69(iii). These results show that these quantities are of some
interest, and here we prove some further results about W (A).
To begin with, W (A) need not be closed:
B.141 Exercise (i) Give an example of a bounded Hilbert space operator $A$ such that there is no $x \in H$, $\|x\| = 1$ such that $|\langle Ax, x\rangle| = |||A|||$.
(ii) Prove that despite (i) there always exists a sequence $\{x_n\}$ with $\|x_n\| = 1$ such that $\langle Ax_n, x_n\rangle \to \lambda$ where $\lambda \in \mathbb{C}$, $|\lambda| = |||A|||$.
B.142 Exercise Let $H$ be a Hilbert space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ and $A \in B(H)$. Prove:
(i) $W(\alpha A + \beta\mathbf{1}) = \alpha W(A) + \beta$ for all $\alpha, \beta \in \mathbb{F}$.
(ii) $W(A^*) = W(A)^*$.
(iii) $W(UAU^*) = W(A)$ for every unitary $U : H \to H'$. (NB: In general $W(BAB^{-1}) \ne W(A)$ for invertible $B$!)
(iv) If $\mathbb{F} = \mathbb{C}$ then $W(A) = \{\lambda\}$ if and only if $A = \lambda\mathbf{1}$.
(v) If $\mathbb{F} = \mathbb{C}$ then $W(A)$ is contained in a line segment $[\gamma\delta] = \{t\gamma + (1 - t)\delta \mid t \in [0, 1]\}$ if and only if there are $\alpha, \beta \in \mathbb{C}$ such that $\alpha A + \beta\mathbf{1}$ is self-adjoint.
implying that $M(x, y, z)$ is an idempotent, thus a rank one orthogonal projection, if and only if $x^2 + y^2 + z^2 = 1$. Thus the map $(x, y, z) \mapsto M(x, y, z)$ restricts to a bijection $S^2 \to P_1$. Now
$$\mathrm{Tr}(M(x, y, z)A) = \frac{\mathrm{Tr}(A)}{2} + \frac{1}{2}\mathrm{Tr}\left(\begin{pmatrix} z & x + iy \\ x - iy & -z \end{pmatrix} A\right).$$
139
Felix Hausdorff (1868-1942), German mathematician. Towering figure in the history of general topology and
related areas like measure theory and functional analysis. Driven to suicide by the Nazis.
Since the second summand depends R-linearly on (x, y, z) (for any fixed A), we are done once we
show that the image of S 2 ⊆ R3 under every linear map α : R3 → R2 is convex. For dimensional
reasons, α has a non-trivial kernel K. Now the image of S 2 under the orthogonal projection
p : R3 → K ⊥ ⊂ R3 is a ball, thus convex, so that also α(S 2 ) = α(p(S 2 )) is convex.
There are interesting other proofs that do not proceed by reduction to two dimensions, cf.
e.g. [113, Exercise 8.9].
In view of Exercise 13.15 and Theorem B.143 it is clear that for all bounded Hilbert space operators we have $\mathrm{conv}(\sigma(A)) \subseteq \overline{W(A)}$. (Recall Definition B.66.) Already Exercise 11.36 shows that this need not be an equality. Again, normal operators behave nicely:
B.144 Exercise Let $H$ be a Hilbert space and $A \in B(H)$ normal. Use the Spectral Theorem 18.4 to prove $\mathrm{conv}(\sigma(A)) = \overline{W(A)}$. (Note that $\mathrm{conv}(\sigma(A))$ is closed by Corollary B.69.)
For an alternative proof avoiding the spectral theorem (but still using continuous functional
calculus) see [118, Exercise E4.4.5].
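To see concretely how the inclusion of $\mathrm{conv}(\sigma(A))$ in the closure of $W(A)$ can be strict for a non-normal operator, here is a short worked example. The matrix is a standard choice made here for illustration (it is not claimed to be the one from Exercise 11.36), and the inner product is taken linear in the first argument, as the computations in these notes suggest.

```latex
% Nilpotent 2x2 matrix; every unit vector in C^2 has the form below up to a
% global phase, which does not affect <Ax, x>.
\[
A=\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad
x=\begin{pmatrix}\cos\theta\\ e^{i\varphi}\sin\theta\end{pmatrix},\qquad
\theta\in[0,\tfrac{\pi}{2}],\ \varphi\in[0,2\pi).
\]
\[
Ax=\begin{pmatrix}e^{i\varphi}\sin\theta\\ 0\end{pmatrix},\qquad
\langle Ax,x\rangle = e^{i\varphi}\sin\theta\cos\theta
= \tfrac12\, e^{i\varphi}\sin 2\theta .
\]
% As theta and phi vary, these values fill the closed disc of radius 1/2:
\[
W(A)=\{\lambda\in\mathbb{C} : |\lambda|\le\tfrac12\},\qquad
\mathrm{conv}(\sigma(A))=\{0\}\subsetneq W(A),\qquad
|||A||| = \tfrac12 = \tfrac{\|A\|}{2}.
\]
```

Here $W(A)$ happens to be closed, as it always is in finite dimensions, so the strictness is not an artefact of taking closures.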
While $\mathrm{conv}(\sigma(A)) = \overline{W(A)}$ does not hold for all operators, there is a slightly more involved general fact, somewhat related to the numerical identity from Exercise 17.15(iv):
B.145 Theorem (S. Hildebrandt 1966) 140 Let $H$ be a Hilbert space and $A \in B(H)$. Then
$$\mathrm{conv}(\sigma(A)) = \bigcap_{B \in \mathrm{Inv}\,B(H)} \overline{W(BAB^{-1})}.$$
Proof. For each invertible $B$ we have $\sigma(A) = \sigma(BAB^{-1}) \subseteq \overline{W(BAB^{-1})}$. With the convexity of the numerical ranges we have
$$\mathrm{conv}(\sigma(A)) \subseteq \bigcap_{B \in \mathrm{Inv}\,B(H)} \overline{W(BAB^{-1})}. \tag{B.17}$$
Assume equality does not hold, thus there exists $\lambda$ such that $\lambda \in \overline{W(BAB^{-1})}$ for all $B \in \mathrm{Inv}\,B(H)$ but $\lambda \notin \mathrm{conv}(\sigma(A))$. Since $\mathrm{conv}(\sigma(A))$ is a compact convex set, it is not hard to find an open disc $D$ such that $\mathrm{conv}(\sigma(A)) \subseteq D$ while $\lambda \notin D$. Using $\sigma(\alpha A + \beta) = \alpha\sigma(A) + \beta$ and $W(\alpha A + \beta) = \alpha W(A) + \beta$, we can reduce to the situation where $D$ is the open unit disc $U = B(0, 1)$ around zero. Then $\sigma(A) \subseteq \mathrm{conv}(\sigma(A)) \subseteq U$, so that $r(A) < 1$. Now Exercise 17.15(i)-(iii) provides $B \in \mathrm{Inv}\,B(H)$ such that $\|BAB^{-1}\| < 1$. Then also $\overline{W(BAB^{-1})} \subseteq U$, which contradicts the facts $\lambda \in \overline{W(BAB^{-1})}$ and $|\lambda| \ge 1$. This contradiction proves that the inclusion in (B.17) is an equality.
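The mechanism behind Theorem B.145 can be observed numerically on the nilpotent matrix $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, for which $\sigma(A) = \{0\}$: conjugating with $B_t = \mathrm{diag}(1, t)$ shrinks the numerical range towards $\{0\}$ as $t \to \infty$. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the numerical radius is approximated via the standard identity $|||X||| = \max_\theta \lambda_{\max}(\mathrm{Re}(e^{i\theta}X))$, which is not stated in these notes.

```python
import cmath
import math

def numerical_radius(X, steps=720):
    """Approximate sup |<Xx, x>| over unit vectors x for a 2x2 matrix X.

    Assumed identity: the numerical radius equals the maximum over theta of
    the largest eigenvalue of the Hermitian part of e^{i theta} X.
    The angle theta is sampled on a uniform grid.
    """
    best = 0.0
    for k in range(steps):
        w = cmath.exp(2j * math.pi * k / steps)
        # Hermitian part H = (wX + (wX)^*) / 2 with entries h11, h22, h12.
        h11 = (w * X[0][0]).real
        h22 = (w * X[1][1]).real
        h12 = (w * X[0][1] + (w * X[1][0]).conjugate()) / 2
        # Largest eigenvalue of the Hermitian matrix [[h11, h12], [conj(h12), h22]].
        lam = (h11 + h22) / 2 + math.sqrt(((h11 - h22) / 2) ** 2 + abs(h12) ** 2)
        best = max(best, lam)
    return best

def conjugate_by_diag(X, t):
    """Return B X B^{-1} for B = diag(1, t) with t > 0."""
    return [[X[0][0], X[0][1] / t],
            [t * X[1][0], X[1][1]]]

A = [[0, 1], [0, 0]]  # nilpotent, spectrum {0}

radii = {t: numerical_radius(conjugate_by_diag(A, t)) for t in (1, 2, 10, 100)}
for t, r in radii.items():
    print(f"t = {t:>3}: numerical radius of B A B^-1 is about {r:.6f}")
```

For this particular $A$ one has $B_tAB_t^{-1} = t^{-1}A$, so the computed radii should be close to $1/(2t)$; the intersection over all $t$ is then $\{0\} = \mathrm{conv}(\sigma(A))$, in line with the theorem.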
B.146 Definition If A is a unital complex Banach algebra, ϕ ∈ A∗ is called a state if ϕ(1) =
kϕk = 1. The set of states of A is denoted S(A).
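B.147 Proposition Let $A$ be a unital normed algebra over $\mathbb{C}$. For each $a \in A$ the subsets
$$V_1(a) = \bigcap_{z\in\mathbb{C}} \overline{B}(z, \|a - z\mathbf{1}\|) = \{\lambda \in \mathbb{C} \mid |\lambda - z| \le \|a - z\mathbf{1}\| \ \forall z \in \mathbb{C}\},$$
$$V_2(a) = \{\varphi(a) \mid \varphi \in A^*, \|\varphi\| = \varphi(\mathbf{1}) = 1\}$$
of $\mathbb{C}$ coincide. The resulting set $V(a)$ is called the (algebraic) numerical range and $R(a) = \sup_{\lambda\in V(a)} |\lambda|$ the (algebraic) numerical radius of $a$.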
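Proof. Let $a \in A$ and $\varphi \in A^*$ with $\varphi(\mathbf{1}) = \|\varphi\| = 1$. Then $|\varphi(a) - z| = |\varphi(a - z\mathbf{1})| \le \|a - z\mathbf{1}\|$ holds for each $z \in \mathbb{C}$. Thus $\varphi(a) \in V_1(a)$, proving $V_2(a) \subseteq V_1(a)$.
If $a = c\mathbf{1}$ then $V_2(a) = \{c\}$, and with $z = c$ the inequality $|\lambda - z| \le \|a - z\mathbf{1}\|$ becomes $|\lambda - z| \le 0$, so that also $V_1(a) = \{c\}$. Thus $V_1(a) = V_2(a)$ for $a \in \mathbb{C}\mathbf{1}$. Now assume $a \notin \mathbb{C}\mathbf{1}$ and let $\lambda \in V_1(a)$. Put $W = \mathbb{C}\mathbf{1} + \mathbb{C}a$ and define $\varphi_0 \in W^*$ by $c\mathbf{1} + da \mapsto c + d\lambda$. Then $\varphi_0(\mathbf{1}) = 1$ and $\varphi_0(a) = \lambda$, and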
B.149 Remark Given the similarity of the properties of V (a) to those of the (spatial) numerical
range W (A) of a Hilbert space operator (except for the closedness of V (a)), it is natural to ask
how the two definitions are related if A = B(H). It is easy to see that W (A) ⊆ V (A). In
fact, this always is an equality. We postpone the proof to Theorem B.168 since it requires some
preparations and we prefer to stick to the general Banach algebra situation for now. □
B.150 Lemma Let $A$ be a unital complex Banach algebra and $a \in A$. Then
$$\sup\{\mathrm{Re}\,\lambda \mid \lambda \in V(a)\} = \sup_{\varphi\in S(A)} \mathrm{Re}\,\varphi(a) \le \inf_{t>0} \frac{\|\mathbf{1} + ta\| - 1}{t}. \tag{B.19}$$
where the final inequality comes from Lemma B.150. From this one readily deduces
$$\frac{\|\mathbf{1} + ta\| - 1}{t} \le \frac{\mu + t\|a^2\|}{1 - t\mu}$$
and therefore
$$\limsup_{t\searrow 0} \frac{\|\mathbf{1} + ta\| - 1}{t} \le \mu.$$
Combining this with (B.19) we have
B.152 Definition Let $A$ be a unital complex Banach algebra. An element $a \in A$ is called
• dissipative if Re λ ≤ 0 ∀λ ∈ V (a).
• hermitian if V (a) ⊆ R. We put H(A) = {a ∈ A | a hermitian}.
• normal if there are commuting b, c ∈ H(A) with a = b + ic.
B.154 Theorem (Bohnenblust & Karlin 1955) 141 Let $A$ be a unital complex Banach algebra. Then
$$\frac{\|a\|}{e} \le R(a) \le \|a\| \quad \forall a \in A.$$
(Here $e = \exp(1) = 2.718\ldots$.) In particular $a = 0 \Leftrightarrow V(a) = \{0\}$.
Proof. The inequality $R(a) \le \|a\|$ has already been proven. Let $a \ne 0$. Rescaling if necessary we may assume $R(a) = 1$, thus $|\lambda| \le 1$ for all $\lambda \in V(a)$. Then (B.18) implies
141 Samuel Karlin (1924-2007). Polish-born American mathematician. After his work in pure analysis he made many contributions to mathematical economics and biology.
142 If $V$ is a Banach space and $f : [a, b] \to V$ a continuous function, the Riemann integral $\int_a^b f(t)\,dt$ is defined (and existence proven using the uniform continuity of $f$) as for $\mathbb{R}$-valued functions. One then has $\|\int_a^b f(t)\,dt\| \le \int_a^b \|f(t)\|\,dt$.
The surprising factor $e$ cannot be improved without further assumptions. (Recall that for the numerical radius of an operator on a complex Hilbert space we proved the slightly stronger $|||A||| \ge \frac{\|A\|}{2}$, which we showed to be optimal.)
We mention without proofs (see e.g. [19]) two more results:
B.155 Theorem If A is a unital complex Banach algebra and a ∈ A is normal then V (a) =
conv(σ(a)), thus R(a) = r(a). (Compare Exercise B.144.) For hermitian a even R(a) = kak.
If a ∈ H(A) ∩ iH(A) then V (a) ⊆ R ∩ iR = {0}, which implies a = 0. Thus H(A) ∩ iH(A) =
{0}. It is natural to ask whether H(A) + iH(A) = A. For C ∗ -algebras this is true since one
can prove H(A) = Asa , cf. Proposition B.161. This actually characterizes C ∗ -algebras:
B.156 Theorem (Vidav (1956), Palmer (1968)) Let $A$ be a unital complex Banach algebra. If $A = H(A) + iH(A)$ then $(b + ic)^* = b - ic$ for $b, c \in H(A)$ defines a ∗-operation and $\|a^*a\| = \|a\|^2$ ∀$a \in A$, thus $A$ is a $C^*$-algebra.
For proofs of these results (and many more), see [19].
$$\varphi(a) \ge \varphi\Big(\sum_{n=1}^N 2^{-n}a_n\Big) = \sum_{n=1}^N 2^{-n}\varphi(a_n) \ge \sum_{n=1}^N 2^{-n} \cdot 2^n = N$$
holds for all $N$. But this contradicts the fact that $\varphi(a) \in [0, +\infty)$. Thus there exists $C$ such that $\varphi(a) \le C\|a\|$ for all $a \in A_+$.
For every a ∈ A we have a = b + ic with b, c ∈ Asa , where kbk ≤ kak and kck ≤ kak. And
by Exercise 17.5, b = b+ − b− with b± ∈ A+ and kb± k ≤ kbk ≤ kak and similarly for c. Now
|ϕ(a)| = |ϕ(b+ ) − ϕ(b− ) + iϕ(c+ ) − iϕ(c− )| ≤ ϕ(b+ ) + ϕ(b− ) + ϕ(c+ ) + ϕ(c− ) ≤ 4Ckak.
(ii) It is obvious that $[\cdot, \cdot]$ is a sesquilinear form on $A$. For all $a \in A$ we have $aa^* \in A_+$, thus $[a, a] = \varphi(aa^*) \ge 0$, so that $[\cdot, \cdot]$ is positive in the sense $[a, a] \ge 0$ ∀a. A fortiori, $[a, a] \in \mathbb{R}$ for all $a$, so that $[\cdot, \cdot]$ is self-adjoint by Lemma 11.17(i), i.e. $[x, y] = \overline{[y, x]}$.
(iii) Every positive sesquilinear form satisfies a Cauchy-Schwarz inequality $|[a, b]| \le ([a, a][b, b])^{1/2}$ with the same proof as for (5.1) since non-degeneracy $[x, x] = 0 \Rightarrow x = 0$ was not needed there. Expressing the inequality in terms of $\varphi$, the claim follows.
(iv) Since $[\cdot, \cdot]$ is self-adjoint by (ii), we have $\varphi(ab^*) = [a, b] = \overline{[b, a]} = \overline{\varphi(ba^*)}$. Putting $b = \mathbf{1}$, the claim follows.
(Statement (iii) also holds for non-unital algebras, but the proof uses approximate units.)
Be sure not to confuse $A_+ \subseteq A$ and $A^*_+ \subseteq A^*$!
B.159 Proposition (Bohnenblust & Karlin 1955) Let A be a unital complex C ∗ -algebra.
Then a linear functional ϕ : A → C is positive if and only if it is bounded and satisfies
ϕ(1) = kϕk.
Proof. ⇒ If ϕ is positive, it is bounded by Proposition B.158(i), and ϕ(1) > 0 since 1 ∈ A+ . If
a ∈ A≤1 , with Proposition B.158(iii) we have
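Since for each $\varepsilon > 0$ we can find $a \in A_{\le 1}$ with $|\varphi(a)| > \|\varphi\| - \varepsilon$, we have $(\|\varphi\| - \varepsilon)^2 \le \|\varphi\|\varphi(\mathbf{1})$. Taking $\varepsilon \to 0$ gives $\|\varphi\| \le \varphi(\mathbf{1})$ (if $\varphi \ne 0$, but the result also holds for $\varphi = 0$). On the other hand, with $\|\mathbf{1}\| = 1$ we have $\|\varphi\| \ge \varphi(\mathbf{1})$. Combining the inequalities we have $\|\varphi\| = \varphi(\mathbf{1})$.
⇐ Replacing $\varphi$ by $\|\varphi\|^{-1}\varphi$, we may assume $\|\varphi\| = 1$. Let $a \in A_{sa}$, $\|a\| \le 1$. Write $\varphi(a) = \alpha + i\beta$ with $\alpha, \beta \in \mathbb{R}$. Then for each $s \in \mathbb{R}$,
implying $\alpha^2 + \beta^2 - 2s\beta \le 1$. Since this must hold for all $s \in \mathbb{R}$, we have $\beta = 0$, thus $\varphi(a) \in \mathbb{R}$.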
Thus for a = a∗ we have V (a) ⊆ R.
Now let a ∈ A+ , kak ≤ 1. Then 0 ≤ a ≤ 1, thus 1 − a is positive with k1 − ak ≤ 1 (compare
Exercise 16.24), so that |ϕ(1 − a)| ≤ 1. Combined with ϕ(1 − a) ∈ R from the above, this gives
1 − ϕ(a) = ϕ(1 − a) ≤ 1, implying ϕ(a) ≥ 0. Thus ϕ is positive.
B.160 Remark If A is a unital C ∗ -algebra, by states on A one usually means the ϕ ∈ A∗ that
are positive and normalized, i.e. ϕ(1) = 1. In view of the above result, this is entirely consistent
with the Banach algebraic Definition B.146 of states. □
Now we are in a position to use V (a) to characterize the elements of a C ∗ -algebra that satisfy
a = 0, a = a∗ and a ≥ 0, respectively, in analogy to the results for A ∈ B(H) in terms of W (A).
(iii) $a \ge 0 \Leftrightarrow \varphi(a) \ge 0$ for all $\varphi \in A^*_+ \Leftrightarrow V(a) \subseteq [0, \infty)$.
Proof. By Proposition B.159 the positive functionals are precisely the positive multiples of the states $\varphi$ appearing in the definition of $V_2(a)$. In view of this connection between $V(a)$ and $A^*_+$, the three rightmost equivalences are obvious.
The leftmost implications ⇒ are obvious in (i) and (iii). As to (ii), if $a = a^*$ there are $a_\pm \in A_+$ such that $a = a_+ - a_-$. Now with $\varphi(a_\pm) \ge 0$ we have $\varphi(a) = \varphi(a_+) - \varphi(a_-) \in \mathbb{R}$.
To prepare the proof of the leftmost implications ⇐, assume $a = a^* \ne 0$. Then $a$ is normal, thus $r(a) = \|a\| > 0$, so that $\sigma(a)$ contains a $\lambda \ne 0$. Then $\lambda \in V(a)$ by Proposition B.148, so that by Proposition B.147 there exists $\varphi \in A^*$ with $\varphi(\mathbf{1}) = \|\varphi\| = 1$ and $\varphi(a) = \lambda \ne 0$. Since $\varphi$ is positive by Proposition B.159, we have proven that for every $a = a^* \ne 0$ there exists $\varphi \in A^*_+$ with $\varphi(a) \ne 0$.
For arbitrary $a$ we have $a = b + ic$ with $b, c \in A_{sa}$. If $\varphi \in A^*_+$ then $\varphi(a) = \varphi(b) + i\varphi(c)$, where $\varphi(b), \varphi(c) \in \mathbb{R}$. In view of this, $\varphi(a) \in \mathbb{R}$ implies $\varphi(c) = 0$, while $\varphi(a) = 0$ implies $\varphi(b) = \varphi(c) = 0$. Together with the result just proven (if $a = a^*$ and $\varphi(a) = 0$ ∀$\varphi \in A^*_+$ then $a = 0$) this finishes the proof of both (i) and (ii). And if $\varphi(a) \ge 0$ for all $\varphi \in A^*_+$ then $a = a^*$ by (ii), and $\sigma(a) \subseteq V(a) \subseteq [0, \infty)$ gives positivity of $a$, finishing (iii).
B.162 Remark 1. We had already proven the equivalence a = 0 ⇔ V (a) = {0} for all unital
Banach algebras, but the above proof for C ∗ -algebras is considerably simpler.
2. We emphasise that $V(a) \subseteq \mathbb{R}$ implies $a = a^*$, which is not implied by $\sigma(a) \subseteq \mathbb{R}$.
3. Proposition B.161 shows that the states of a $C^*$-algebra can fulfill many of the tasks of the characters in the commutative case. One easily checks that every character is a state. But a state is a character only if it is multiplicative. By the Riesz-Markov-Kakutani theorem, the states on $C(X, \mathbb{C})$ are in bijection with the normalized positive measures on $X$, while the characters correspond to the Dirac measures $\delta_x$, $x \in X$, for which $\int_X f\,d\delta_x = f(x)$. □
B.163 Corollary For a unital C ∗ -algebra A and a ∈ A, the following are equivalent:
(i) a is hermitian, i.e. V (a) ⊆ R.
(ii) a = a∗ .
(iii) $e^{ita}$ is unitary for all $t \in \mathbb{R}$.
(iv) $\|e^{ita}\| = 1$ ∀$t \in \mathbb{R}$.
Proof. (i)⇔(ii) was part of Proposition B.161. (ii)⇒(iii) is elementary, see Remark 16.18.2.
(iii)⇒(iv) is immediate by the C ∗ -identity. And Corollary B.153 gives (iv)⇔(i).
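The equivalence (ii)⇔(iv) of Corollary B.163 can be observed numerically in the $C^*$-algebra of $2 \times 2$ complex matrices. The sketch below is only an illustration under stated assumptions: the helper names, the truncation of the exponential series after a fixed number of terms, and the two sample matrices are choices made here, not part of the notes.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_it(a, t, terms=60):
    """Approximate e^{i t a} for a 2x2 matrix a by a truncated power series."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    ita = [[1j * t * x for x in row] for row in a]
    for n in range(1, terms):
        # term becomes (i t a)^n / n!
        term = [[x / n for x in row] for row in matmul(term, ita)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def op_norm(X):
    """Operator norm (largest singular value) of a 2x2 matrix.

    Uses s1^2 + s2^2 = sum |x_ij|^2 and s1 * s2 = |det X|.
    """
    frob2 = sum(abs(x) ** 2 for row in X for x in row)
    det = abs(X[0][0] * X[1][1] - X[0][1] * X[1][0])
    plus = math.sqrt(frob2 + 2 * det)
    minus = math.sqrt(max(frob2 - 2 * det, 0.0))
    return (plus + minus) / 2

hermitian = [[1, 2 - 1j], [2 + 1j, -3]]  # a = a^*
nilpotent = [[0, 1], [0, 0]]             # a != a^*

norm_herm = op_norm(exp_it(hermitian, 0.7))
norm_nilp = op_norm(exp_it(nilpotent, 0.7))
print("||exp(ita)|| for a = a^*:", norm_herm)
print("||exp(ita)|| for nilpotent a:", norm_nilp)
```

Up to rounding, the first norm equals 1, while for the nilpotent matrix $e^{ita} = \mathbf{1} + ita$ has norm strictly larger than 1 for every $t \ne 0$. The singular value formula takes a square root of a quantity close to zero in the unitary case, so agreement with 1 should only be expected to roughly single precision.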
B.164 Remark In every Banach ∗-algebra one can still define self-adjointness by $a = a^*$ and unitarity by $u^*u = uu^* = \mathbf{1}$. (Confusingly, even in the Banach-∗ literature ‘hermitian’ can be used for either of (i) or (ii).) One then still has (ii)⇔(iii), where ⇐ follows by the computation
$$ia - ia^* = \frac{d}{dt}\Big|_{t=0} e^{ita}(e^{ita})^* = \frac{d}{dt}\Big|_{t=0} e^{ita}e^{-ita} = 0.$$
And of course (i)⇔(iv) as in every Banach algebra. But in the absence of the $C^*$-identity neither (i)⇔(ii) nor (iii)⇔(iv) needs to hold. (It is not difficult to construct Banach ∗-algebras with $\|\mathbf{1}\| = 1$ having unitary elements with norm $\ne 1$.) In fact, one can prove that every Banach ∗-algebra in which (ii)⇒(iv) holds is a $C^*$-algebra, cf. [58]. □
We now study the relation between V (a) for C ∗ -algebra elements to σ(a) and to the spatial
numerical range W (A) in the case A = B(H).
B.166 Lemma Let A be a unital C ∗ -subalgebra. Then S(A) ⊆ A∗ is weak-∗-closed and convex.
Proof. Let ϕ1 , ϕ2 ∈ S(A) and t ∈ [0, 1]. Then ϕ = tϕ1 + (1 − t)ϕ2 is positive and normalized,
since ϕ(1) = 1. Thus ϕ ∈ S(A), proving convexity.
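Let $\{\varphi_\iota\} \subset S(A)$ be a net such that $\varphi_\iota \xrightarrow{w^*} \varphi \in A^*$. With $\varphi_\iota(\mathbf{1}) = 1$ this implies $\varphi(\mathbf{1}) = 1$. By Alaoglu’s theorem, $A^*_{\le 1}$ is weak-∗ compact, thus weak-∗ closed. With $\varphi_\iota \in A^*_{\le 1}$ this implies $\varphi \in A^*_{\le 1}$, thus $\|\varphi\| \le 1$. Together with $\varphi(\mathbf{1}) = 1$ this gives $\|\varphi\| = 1$, thus $\varphi \in S(A)$.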
By Lemma 10.24(ii) there is a (unique) $a \in A$ such that $\psi(\varphi) = \varphi(a)$ for all $\varphi \in A^*$. Writing $a = b + ic$ with $b, c \in A_{sa}$ and using that every $\varphi \in S(A)$ assumes real values on $A_{sa}$, so that $\mathrm{Re}\,\varphi(a) = \varphi(b)$, (B.21) becomes
$$\sup\{\varphi(b) \mid \varphi \in \overline{\mathrm{conv}(VS(A))}^{w^*}\} < t < \varphi_0(b). \tag{B.22}$$
In particular for $\varphi_x = \langle \cdot\, x, x\rangle$, where $x \in H$ is a unit vector, this implies $\langle bx, x\rangle \le t$. This is equivalent to $\langle bx, x\rangle \le t\|x\|^2$ for all $x \in H$, thus to $\langle (t\mathbf{1} - b)x, x\rangle \ge 0$ ∀x. With Proposition 17.9 this is equivalent to $t\mathbf{1} - b \ge 0$. Since $\varphi_0$ is a state, thus positive, this implies $\varphi_0(t\mathbf{1} - b) \ge 0$, thus $\varphi_0(b) \le t$. Since this contradicts (B.22), we indeed have $\overline{\mathrm{conv}(VS(A))}^{w^*} = S(A)$.
B.168 Theorem Let $H$ be a complex Hilbert space. Then for all $A \in B(H)$ we have $V(A) = \overline{W(A)}$ and $R(A) = |||A|||$, where the numerical range $V(A)$ and radius $R(A)$ are taken in the Banach algebra $B(H)$.
Proof. With $W(A) = \{\varphi(A) \mid \varphi \in VS(B(H))\}$ it is clear that $W(A) \subseteq V(A)$, thus $\overline{W(A)} \subseteq V(A)$ by closedness of $V(A)$. Given $\lambda \in V(A)$, pick $\varphi \in S(B(H))$ with $\varphi(A) = \lambda$. By Proposition B.167 there is a net $\{\varphi_\iota\} \subset \mathrm{conv}(VS(B(H)))$ such that $\varphi_\iota \xrightarrow{w^*} \varphi$. In particular, $\varphi_\iota(A) \to \varphi(A) = \lambda$. Since $\varphi_\iota \in \mathrm{conv}(VS(B(H)))$ implies $\varphi_\iota(A) \in \mathrm{conv}(W(A)) = W(A)$, we have proven $\lambda \in \overline{W(A)}$, thus $V(A) \subseteq \overline{W(A)}$. The equality $R(A) = |||A|||$ is now obvious.
B.13 Some more basic theory of C ∗ -algebras
B.13.1 The Fuglede-Putnam theorem
B.169 Theorem Let A be a unital C ∗ -algebra over C.
(i) Let a, c ∈ A. If a is normal and ac = ca then a∗ c = ca∗ (and ac∗ = c∗ a).
(ii) Let a, b, c ∈ A. If a, b are normal and ac = cb then a∗ c = cb∗ .
Proof. Obviously (i) is just the special case a = b of (ii).
(ii) We define $f : \mathbb{C} \to A$, $z \mapsto e^{za^*}ce^{-zb^*}$, where $e^a = \exp(a)$ is defined in terms of the power series as in Example 15.19. Expanding the two power series in the definition of $f$ we have
$$f(z) = e^{za^*}ce^{-zb^*} = \left(\sum_{k=0}^\infty \frac{z^k(a^*)^k}{k!}\right) c \left(\sum_{l=0}^\infty \frac{(-z)^l(b^*)^l}{l!}\right) = \sum_{n=0}^\infty z^n d_n \quad \forall z \in \mathbb{C}$$
for certain $d_n \in A$. (The reshuffling is justified by the absolute convergence of the series.) We only need $d_1 = a^*c - cb^*$, which is quite obvious. Thus the theorem follows if we prove $d_1 = 0$.
By induction, the assumption $ac = cb$ is seen to imply $a^nc = cb^n$. Multiplying by $z^n/n!$ and summing over $n \in \mathbb{N}_0$ gives $e^{za}c = ce^{zb}$ for all $z \in \mathbb{C}$, thus also $e^{za}ce^{-zb} = c$. Thus
$$f(z) = e^{za^*}ce^{-zb^*} = e^{za^*}(e^{-\bar{z}a}ce^{\bar{z}b})e^{-zb^*} = e^{za^* - \bar{z}a}\,c\,e^{\bar{z}b - zb^*} = e^{2i\,\mathrm{Im}(za^*)}\,c\,e^{-2i\,\mathrm{Im}(zb^*)},$$
where $e^{za^*}e^{-\bar{z}a} = e^{za^* - \bar{z}a}$ and $e^{\bar{z}b}e^{-zb^*} = e^{\bar{z}b - zb^*}$ hold due to normality of $a$ and $b$, respectively. Now $\mathrm{Im}(za^*)$, $\mathrm{Im}(zb^*)$ are self-adjoint so that $e^{2i\,\mathrm{Im}(za^*)}$ and $e^{2i\,\mathrm{Im}(zb^*)}$ are unitary for all $z \in \mathbb{C}$, cf. Remark 16.18(ii), thus bounded. This proves that $f : \mathbb{C} \to A$ is bounded. Now the following Lemma implies $d_1 = 0$, and we are done.
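B.170 Lemma Let $A$ be a unital Banach algebra and $\{d_n\}_{n\in\mathbb{N}_0} \subseteq A$ such that $f : \mathbb{C} \to A$, $z \mapsto \sum_{n=0}^\infty z^n d_n$ converges absolutely for all $z \in \mathbb{C}$. ($\Leftrightarrow \|d_n\|^{1/n} \to 0$.) If $\|f(z)\| \le C|z|^M$ ∀$z \in \mathbb{C}$ then $d_n = 0$ for all $n > M$. In particular, if $f$ is bounded then $d_n = 0$ ∀$n \ge 1$.
Proof. Let $r > 0$, $m \in \mathbb{N}$. Then similarly to (B.20) we have
$$\int_0^{2\pi} e^{-imt} f(re^{it})\,dt = \int_0^{2\pi} e^{-imt} \left(\sum_{n=0}^\infty r^n e^{int} d_n\right) dt = \sum_{n=0}^\infty r^n d_n \int_0^{2\pi} e^{i(n-m)t}\,dt = 2\pi r^m d_m,$$
where we used the uniform convergence of the series on bounded sets and $\int_0^{2\pi} e^{i(n-m)t}\,dt = 2\pi\delta_{n,m}$. With $\|\int_0^{2\pi} e^{-imt} f(re^{it})\,dt\| \le \int_0^{2\pi} \|f(re^{it})\|\,dt \le 2\pi Cr^M$ we have
$$\|d_m\| = \frac{1}{2\pi r^m} \left\|\int_0^{2\pi} e^{-imt} f(re^{it})\,dt\right\| \le C\frac{r^M}{r^m} \quad \forall m \in \mathbb{N},\ r > 0.$$
For $m > M$, letting $r \to \infty$ gives $d_m = 0$.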
B.171 Remark 1. Part (i) of the theorem was proven by Fuglede in 1950, (ii) by Putnam in
1951. The above elegant proof is due to Rosenblum (1958).143
143 Bent Fuglede (1925-2023), Danish mathematician. Calvin Richard Putnam (1924-2008), Marvin Rosenblum (1926-2003), American mathematicians.
2. For $A = \mathbb{C}$, $M = 0$ the function $f$ is entire and the lemma reduces to Liouville’s theorem from complex analysis. But the general lemma can be deduced from Liouville’s theorem: For every $\varphi \in A^*$, the function $z \mapsto \varphi(f(z)) = \sum_{n=0}^\infty z^n \varphi(d_n)$ is entire and bounded, thus constant. Thus for all $z, z' \in \mathbb{C}$, $\varphi \in A^*$ we have $\varphi(f(z) - f(z')) = 0$. The Hahn-Banach theorem now implies $f(z) - f(z') = 0$ ∀$z, z'$, thus $f$ is constant, so that $d_n = 0$ ∀$n \ge 1$.
3. But as our proof of the lemma shows, no use of complex analysis (holomorphicity etc.)
is needed if the function is a priori given by an everywhere convergent power series. Thus as in
Rickart’s proof of Theorem 13.39 the invocation of complex analysis can be replaced by much
simpler harmonic analysis. As there, also here the integration can be removed: □
B.172 Exercise Give a proof of Lemma B.170 that does not use integration.
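Returning to Theorem B.169, the following worked example (with matrices chosen here purely for illustration, in the $C^*$-algebra of $2 \times 2$ complex matrices) shows one instance of part (ii) and indicates why normality cannot simply be dropped.

```latex
% (a) An instance of Theorem B.169(ii): a, b normal (diagonal), ac = cb.
\[
a=\begin{pmatrix} i&0\\0&2\end{pmatrix},\quad
b=\begin{pmatrix} 2&0\\0&i\end{pmatrix},\quad
c=\begin{pmatrix} 0&1\\1&0\end{pmatrix},\qquad
ac=\begin{pmatrix} 0&i\\2&0\end{pmatrix}=cb ,
\]
\[
a^*c=\begin{pmatrix} 0&-i\\2&0\end{pmatrix}=cb^* .
\]
% (b) Without normality the conclusion fails: take a = b = c = N nilpotent.
\[
N=\begin{pmatrix} 0&1\\0&0\end{pmatrix},\qquad NN=0=NN \quad(\text{so } ac=cb),
\]
\[
N^*N=\begin{pmatrix} 0&0\\0&1\end{pmatrix}\neq
\begin{pmatrix} 1&0\\0&0\end{pmatrix}=NN^* \quad(\text{so } a^*c\neq cb^*).
\]
```

In (b) the proof breaks down at the step $e^{za^*}e^{-\bar{z}a} = e^{za^* - \bar{z}a}$, which requires $a^*$ and $a$ to commute.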
$$\|\alpha(a)\|^2 = \|\alpha(a)^*\alpha(a)\| = \|\alpha(a^*a)\| = r_B(\alpha(a^*a)) \le r_A(a^*a) \le \|a^*a\| \le \|a\|^2,$$
where we used the $C^*$-identity for $B$, the fact that $\alpha$ is a ∗-homomorphism, the fact that $r(b) = \|b\|$ for normal elements of the $C^*$-algebra $B$, Lemma 15.2, $r(a) \le \|a\|$ and the inequality $\|a^*a\| \le \|a\|^2$ holding in every Banach ∗-algebra.
If $\alpha$ is non-unital, in particular if $A$ or $B$ has no unit, then $\tilde{\alpha} : \tilde{A} \to \tilde{B}$, $(a, \lambda) \mapsto (\alpha(a), \lambda)$ is easily seen to be a unital ∗-homomorphism. Since $\tilde{B}$ is a $C^*$-algebra by Exercise 16.10, the above applies to $\tilde{\alpha}$, and we have $\|\alpha\| \le \|\tilde{\alpha}\| \le 1$.
B.174 Remark 1. The above result is one of many cases in the theory of C ∗ -algebras where
the ‘algebra dictates the analysis’. Further instances are Theorem B.176 and Corollary B.177.
2. For (∗-)homomorphisms between general Banach (∗-)algebras the question of continuity
of homomorphisms is much more complicated, with connections to foundational matters like
the continuum hypothesis, cf. [32]. □
B.175 Exercise Let $A$ be a Banach ∗-algebra. For $a \in A$ define $\|a\|_* = \sup_{(H,\pi)} \|\pi(a)\|_{B(H)}$, where $H$ is a Hilbert space and $\pi : A \to B(H)$ is a ∗-homomorphism. Prove that $\|\cdot\|_*$ is a $C^*$-seminorm on $A$.
and $P \mapsto P(\alpha(a))$. Assuming $\sigma(\alpha(a)) \subsetneq \sigma(a)$, we can find a non-zero $f \in C(\sigma(a), \mathbb{C})$ that vanishes on $\sigma(\alpha(a))$. (Since we are dealing with subsets of $\mathbb{R}$, this can be done by hand, without Urysohn’s lemma.) If $\{P_n\}$ is a sequence of polynomials converging uniformly to $f$ on $\sigma(a) \subseteq \mathbb{R}$ then the sequences $\{P_n(a)\}$ and $\{P_n(\alpha(a))\}$ converge in norm to $\gamma(f) = f(a) \in A$ and $\gamma'(f) = f(\alpha(a)) \in B$, respectively. Since $\gamma, \gamma'$ are isometric we have $\|f(a)\| = \|f\|_{\sigma(a)} \ne 0$ and $\|f(\alpha(a))\| = \|f\|_{\sigma(\alpha(a))} = 0$. Since $\alpha$ is a unital homomorphism, we have $\alpha(P_n(a)) = P_n(\alpha(a))$ ∀n. The r.h.s. converges to $f(\alpha(a))$, the l.h.s. to $\alpha(f(a))$ by continuity of $\alpha$ (Theorem B.173), so that $\alpha(f(a)) = f(\alpha(a)) = 0$. Since $f(a) \ne 0$, this contradicts the injectivity of $\alpha$. Thus we have $\sigma(\alpha(a)) = \sigma(a)$ as claimed.
B.179 Remark 1. We ignore operators with non-dense domain, since they are of very limited use. But assuming dense domain forces us to check for every construction of a new operator whether its domain is dense.
2. Writing ‘unbounded operator’ is undesirable since it would exclude the bounded ones. Since ‘possibly unbounded operator’ is unbearable, we will just write ‘operator’, ‘densely defined linear’ being implied.
3. If $A$ is an operator and $t \in \mathbb{F}$ it is immediate how to define $tA$ on the domain $D(A)$. □
• An operator $A$ is closed if $G(A) \subseteq V \oplus V$ is closed, and closable if $\overline{G(A)}$ is the graph of an operator. That operator, the closure of $A$, is then denoted $\overline{A}$.
• If $A$ is closed with domain $D$ then a linear subspace $D_0 \subseteq D$ is a core for $A$ if $\overline{A \restriction D_0} = A$.
B.183 Exercise Let $V$ be a Banach space and $K, L$ closed linear subspaces such that $K \cap L = \{0\}$ and $\overline{K + L} = V$. Put $D = K + L$ and define $S : D \to D$ by $S(k + l) = k - l$ for all $k \in K$, $l \in L$. Prove:
(i) $S$ is bounded if and only if $K + L = V$.
(ii) $S$ is always closed.
Then
(i) For each $y \in D(A^*)$ there is a unique $z_y \in H$ such that $\langle Ax, y\rangle = \langle x, z_y\rangle$ for all $x \in D(A)$. We put $A^*y = z_y$.
(ii) $D(A^*) \subseteq H$ is a linear subspace and the map $A^* : D(A^*) \to H$ is linear.
Proof. (i) Let $y \in D(A^*)$ and assume $z, z' \in H$ satisfy $\langle Ax, y\rangle = \langle x, z\rangle = \langle x, z'\rangle$ for all $x \in D(A)$. This implies $\langle x, z - z'\rangle = 0$ for all $x \in D(A)$. Since $D(A)$ is dense, $z - z' = 0$ follows.
(ii) Let $y \in D(A^*)$ and $c \in \mathbb{F}$. Then $\langle Ax, y\rangle = \langle x, A^*y\rangle$ for all $x \in D(A)$, implying $\langle Ax, cy\rangle = \langle x, cA^*y\rangle$ ∀$x \in D(A)$. It follows that $cy \in D(A^*)$ and $A^*(cy) = cA^*y$. Now let $y, y' \in D(A^*)$. Adding the equations $\langle Ax, y\rangle = \langle x, A^*y\rangle$ and $\langle Ax, y'\rangle = \langle x, A^*y'\rangle$ gives $\langle Ax, y + y'\rangle = \langle x, A^*y + A^*y'\rangle$, showing $y + y' \in D(A^*)$ and $A^*(y + y') = A^*y + A^*y'$.
(i) The graph of A∗ is closed.
(ii) The domain $D(A^*)$ of $A^*$ is dense if and only if $A$ is closable.
(iii) If $A$ is closable then $\overline{A}^* = A^*$ and $\overline{A} = A^{**}$. (In particular $A$ closed ⇒ $A = A^{**}$.)
Proof. (i) For frequent later use we notice that for every unitary U on H and every linear
subspace E (not necessarily closed) the identity U E ⊥ = (U E)⊥ holds. Now equip H ⊕ H with
the obvious inner product: h(a, b), (c, d)i = ha, ci + hb, di. With this it is trivial to check that
the linear operator V : H ⊕ H → H ⊕ H, (x, y) 7→ (−y, x) is unitary.
We claim that the following holds for every densely defined A:
As an orthogonal complement, G(A)⊥ is closed, and the closedness of G(A∗ ) follows from (B.23).
(ii⇐) Since G(A) ⊆ H ⊕ H is a linear subspace, we have
• If $A$ is symmetric then $A \subseteq A^{**} \subseteq A^*$ (and conversely). This follows from the fact that $A^*$ is closed and therefore contains the closure $\overline{A} = A^{**}$ of $A$.
• The closed symmetric operators are those satisfying $A = A^{**} \subseteq A^*$.
B.188 Exercise Prove that every symmetric operator is closable, and its closure is symmetric.
One can show that the converse is also true: If A has a unique self-adjoint extension B then
A is self-adjoint and therefore coincides with B.
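$$\begin{array}{ccccccc}
A \text{ self-adjoint} & \Longrightarrow & A \text{ max. symm.} & \Longrightarrow & A \text{ closed symmetric} & \Longrightarrow & A \text{ closed} \\
\Downarrow & & & & \Downarrow & & \Downarrow \\
A \text{ ess. s.-a.} & \multicolumn{3}{c}{\Longrightarrow} & A \text{ symmetric} & \Longrightarrow & A \text{ closable}
\end{array}$$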
ker A∗ = (AD)⊥ .
Proof. The proof is essentially that of Lemma 11.10(ii) with some attention to the domains:
We have ⟨Ax, y⟩ = ⟨x, A∗y⟩ for all x ∈ D(A), y ∈ D(A∗). Thus
B.197 Theorem Let A be a symmetric operator with domain D ⊆ H. Then the following are
equivalent:
(i) A is self-adjoint.
(ii) A is closed and ker(A∗ ± i) = {0}. (A∗ ± i both injective.)
(iii) (A ± i)D = H. (A ± i both surjective.)
(For an unbounded A, one defines A + c1 in the obvious way on the domain D(A).)
Proof. (i)⇒(ii) A is closed since A = A∗ and adjoints are closed. Let x ∈ D(A) = D(A∗). If
A∗x = ix then Ax = ix by A = A∗, thus
i⟨x, x⟩ = ⟨ix, x⟩ = ⟨Ax, x⟩ = ⟨x, A∗x⟩ = ⟨x, Ax⟩ = ⟨x, ix⟩ = −i⟨x, x⟩,
implying x = 0 and therefore ker(A∗ − i) = {0}. The identity ker(A∗ + i) = {0} is proven in the
same way.
(ii)⇒(iii) Since A∗ + i is injective, Lemma B.196 gives ((A − i)D)⊥ = {0}, so that (A − i)D
is dense. It remains to prove that (A − i)D is closed. Using that A is symmetric we find
‖(A − i)x‖² = ⟨(A − i)x, (A − i)x⟩ = ‖Ax‖² + ‖x‖² − i⟨x, Ax⟩ + i⟨Ax, x⟩
= ‖Ax‖² + ‖x‖²
and therefore ‖(A − i)x‖ ≥ ‖x‖ for all x ∈ D(A). The rest of the proof is an adaptation of
Lemma 7.39 to unbounded operators: By the inequality, the map A − i : D(A) → H is injective,
thus A − i : D(A) → (A − i)D(A) is a bijection. If {y_n} is a Cauchy sequence in (A − i)D(A)
then the inequality gives that {x_n = (A − i)⁻¹(y_n)} is a Cauchy sequence in D(A − i) = D(A). Thus
{(x_n, y_n)} ⊆ G(A − i) is Cauchy. The closedness of A implies closedness of A − i, thus of G(A − i),
so that (x_n, y_n) → (x, y) ∈ G(A − i). This proves that lim_n y_n = y = (A − i)x ∈ (A − i)D, so that
(A − i)D is closed, completing the proof of surjectivity of A − i. For A + i one argues analogously.
(iii)⇒(i) Let y ∈ D(A∗ ). Since (A − i)D = H, there exists x ∈ D(A) such that
Appealing to (A + i)D = H, Lemma B.196 gives ker(A∗ − i) = ker((A + i)∗) = H⊥ = {0}, thus
injectivity of A∗ − i. Combining this with (B.26) we obtain x − y = 0, so that y = x ∈ D(A).
This proves D(A∗) ⊆ D(A), thus A = A∗.
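The key estimate in (ii)⇒(iii) can be checked numerically in finite dimensions. The following is a minimal sketch, assuming H = C^4 and a randomly generated Hermitian matrix as a stand-in for a symmetric operator; it illustrates the identity ‖(A − i)x‖² = ‖Ax‖² + ‖x‖² only and says nothing about domains or closedness. All helper names are hypothetical.

```python
import random

def inner(u, v):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(M, x):
    """Matrix-vector product; M is a list of rows."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def random_hermitian(n, rng):
    """Random n x n matrix with M[j][k] == conj(M[k][j])."""
    M = [[0j] * n for _ in range(n)]
    for j in range(n):
        M[j][j] = complex(rng.uniform(-1, 1), 0.0)
        for k in range(j + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            M[j][k] = z
            M[k][j] = z.conjugate()
    return M

def norm_sq(u):
    """Squared norm induced by the inner product."""
    return inner(u, u).real

rng = random.Random(0)
A = random_hermitian(4, rng)
x = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(4)]

Ax = apply(A, x)
y = [a - 1j * b for a, b in zip(Ax, x)]   # y = (A - i)x

lhs = norm_sq(y)                          # ||(A - i)x||^2
rhs = norm_sq(Ax) + norm_sq(x)            # ||Ax||^2 + ||x||^2
```

Here lhs and rhs agree up to rounding, and in particular lhs ≥ ‖x‖², which is the lower bound that makes A − i injective with closed range in the proof.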
Similarly:
(ii) ker(A∗ ± i) = {0}. (A∗ ± i both injective.)
(iii) The closure of (A ± i)D equals H. ((A ± i)D both dense.)
Proof. (i)⇒(ii) By assumption A is closable with self-adjoint closure Ā. Now the theorem gives
ker((Ā)∗ ± i) = {0}, and with (Ā)∗ = A∗ we have (ii) of the corollary.
(ii)⇒(i) From Exercise B.188 we know that A is closable with Ā symmetric. With A∗ = (Ā)∗,
hypothesis (ii) gives ker((Ā)∗ ± i) = {0}, so that Ā is self-adjoint by implication (ii)⇒(i) in the
theorem. Equivalently, A is essentially self-adjoint.
(ii)⇔(iii) This is immediate from Lemma B.196, applied to A ± i.
B.199 Definition A topological space X has the fixed-point property if for every continuous
map f : X → X there is x ∈ X such that f (x) = x, i.e. a fixed-point.
B.200 Theorem (Brouwer, Hadamard, 1910)¹⁴⁵ [0, 1]^n has the fixed point property. The
same holds for every non-empty compact convex subset of R^n.
The second result follows from the first since such an X is homeomorphic to some [0, 1]^m.
There are many proofs of the first result. For what probably is the simplest proof (due to Kulpa)
of the first statement, using only some easy combinatorics, see [108]. (Proofs using algebraic
topology or analysis involve inessential elements and don’t reduce the combinatorics.)
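For n = 1 the theorem reduces to the intermediate value theorem applied to g(x) = f(x) − x, and a fixed point can be located by bisection. The following is a minimal sketch of this special case only; it does not reproduce the combinatorial argument referred to above, and the function name fixed_point_1d and the choice f = cos are assumptions made for illustration.

```python
import math

def fixed_point_1d(f, tol=1e-12):
    """Approximate a fixed point of a continuous f: [0,1] -> [0,1] by bisection on f(x) - x."""
    lo, hi = 0.0, 1.0
    if f(lo) == lo:
        return lo
    if f(hi) == hi:
        return hi
    # Since f maps into [0,1], we now have f(lo) - lo > 0 and f(hi) - hi < 0.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g = f(mid) - mid
        if g == 0:
            return mid
        if g > 0:
            lo = mid      # sign change lies in [mid, hi]
        else:
            hi = mid      # sign change lies in [lo, mid]
    return (lo + hi) / 2

# cos maps [0, 1] into [cos(1), 1], which is contained in [0, 1].
x_star = fixed_point_1d(math.cos)
```

The returned x_star satisfies cos(x_star) ≈ x_star. In higher dimensions no such sign argument is available, which is why genuinely different proofs are needed there.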
B.201 Theorem (Schauder 1930) Every non-empty compact convex subset K of a normed
vector space has the fixed point property.
Proof. Let (V, ‖·‖) be a normed vector space, K ⊆ V a non-empty compact convex subset
and f : K → K continuous. Let ε > 0. Since K is compact, thus totally bounded, there
are x_1, ..., x_n ∈ K such that K ⊆ ⋃_{i=1}^n B(x_i, ε). Thus if we define continuous functions
α_i : K → R_+, i = 1, ..., n by
α_i(x) = max(ε − ‖x − x_i‖, 0)
we see that for each x ∈ K there is at least one i such that α_i(x) > 0. Since the α_i are
continuous, so is the map
P_ε : K → K, x ↦ (Σ_{i=1}^n α_i(x) x_i) / (Σ_{i=1}^n α_i(x)).
Since P_ε(x) is a convex combination of those x_i for which ‖x − x_i‖ < ε, we have ‖P_ε(x) − x‖ < ε
for all x ∈ K. The finite-dimensional subspace V_n = span(x_1, ..., x_n) ⊆ V is isomorphic to some
R^m, and by Corollary 2.32 the restriction of the norm ‖·‖ to V_n is equivalent to the Euclidean
norm on R^m. Thus the convex hull conv(x_1, ..., x_n) ⊆ V_n into which P_ε maps is homeomorphic
to a compact convex subset of R^m and thus has the fixed point property by Theorem B.200.
¹⁴⁵ Luitzen Egbertus Jan Brouwer (1881-1966). Dutch mathematician. Important contributions to topology, founding
of intuitionism. Jacques Hadamard (1865-1963). French mathematician.
Thus if we define f_ε = P_ε ◦ f then f_ε maps conv(x_1, ..., x_n) into itself and thus has a fixed point
x_0 = f_ε(x_0). Now,
‖x_0 − f(x_0)‖ ≤ ‖x_0 − f_ε(x_0)‖ + ‖f_ε(x_0) − f(x_0)‖ = ‖f_ε(x_0) − f(x_0)‖ = ‖P_ε(f(x_0)) − f(x_0)‖ < ε.
Since ε > 0 was arbitrary, we find inf{‖x − f(x)‖ | x ∈ K} = 0. Since K is compact and
x ↦ ‖x − f(x)‖ continuous, the infimum is assumed, thus f has a fixed point in K.
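The map P_ε used in the proof (often called a Schauder projection) is easy to compute. The following is a minimal sketch, assuming K = [0, 1]^2 with the Euclidean norm and a grid of spacing 0.1 as ε-net for ε = 0.2; the function name schauder_projection, the grid and the sample points are illustrative assumptions.

```python
import math

def schauder_projection(x, centers, eps):
    """P_eps(x) = sum_i a_i(x) x_i / sum_i a_i(x) with a_i(x) = max(eps - |x - x_i|, 0)."""
    weights = [max(eps - math.dist(x, c), 0.0) for c in centers]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is not within eps of any center, so the centers are not an eps-net")
    dim = len(x)
    return tuple(
        sum(w * c[k] for w, c in zip(weights, centers)) / total
        for k in range(dim)
    )

eps = 0.2
steps = 10
# Grid points of spacing 0.1: every point of the square is within about 0.071 of one of them.
centers = [(i / steps, j / steps) for i in range(steps + 1) for j in range(steps + 1)]

samples = [(0.03, 0.97), (0.5, 0.5), (0.333, 0.718), (1.0, 0.0)]
errors = [math.dist(p, schauder_projection(p, centers, eps)) for p in samples]
```

Each entry of errors is smaller than eps, matching the estimate ‖P_ε(x) − x‖ < ε: only centers within distance ε of x receive positive weight, so P_ε(x) is a convex combination of points in the open ε-ball around x.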
The use of methods/results from algebraic topology is quite typical for non-linear functional
analysis. (But also linear functional analysis connects to algebraic topology, for example via
K-theory, cf. e.g. [110, Chapter 7].)
B.203 Corollary Let V be a Banach space and C ⊆ V closed, bounded and convex. If
f : C → V is compact and f (C) ⊆ C then f has a fixed point in C.
Proof. Since C is bounded and f is compact, f(C) ⊆ V is precompact, thus the closed convex
hull K of f(C), i.e. the closure of conv(f(C)), is compact by Mazur’s Theorem B.75 and convex.
Thus K has the fixed point property by Schauder’s theorem. Since C is closed and convex, we
have K ⊆ C, thus f is defined on K and maps it into f(K) ⊆ f(C) ⊆ K. Thus f has a fixed
point x ∈ K ⊆ C.
C The mathematicians encountered in these notes
• Marc-Antoine Parseval (1755-1836)
• Friedrich Bessel (1784-1846)
• Augustin-Louis Cauchy (1789-1857)
• Viktor Yakovlevich Bunyakovski (1804-1889)
• Sir William Rowan Hamilton (1805-1865)
• Karl Theodor Wilhelm Weierstrass (1815-1897)
• Eduard Heine (1821-1881)
• Bernhard Riemann (1826-1866)
• Carl Gottfried Neumann (1832-1925)
• Karl Hermann Armandus Schwarz (1843-1921)
• Giulio Ascoli (1843-1896)
• Cesare Arzelà (1847-1912)
• Adolf Hurwitz (1859-1919)
• Otto Hölder (1859-1937)
• Vito Volterra (1860-1940)
• Eliakim Hastings Moore (1862-1932)
• David Hilbert (1862-1943)
• Hermann Minkowski (1864-1909)
• Jacques Hadamard (1865-1963)
• Erik Ivar Fredholm (1866-1927)
• Felix Hausdorff (1868-1942)
• Ernst Steinitz (1871-1928)
• Emile Borel (1871-1956)
• Constantin Carathéodory (1873-1950)
• René-Louis Baire (1874-1932)
• Issai Schur (1874-1941)
• Ernst Sigismund Fischer (1875-1954)
• Erhard Schmidt (1876-1959)
• Maurice Fréchet (1878-1973)
• Hans Hahn (1879-1934)
• Frigyes Riesz (1880-1956)
• Sergei Natanovich Bernstein (1880-1968)
• Otto Toeplitz (1881-1940)
• Luitzen Egbertus Jan Brouwer (1881-1966)
• Ernst David Hellinger (1883-1950)
• Eduard Helly (1884-1943)
• Hermann Weyl (1885-1955)
• Marcel Riesz (1886-1969)
• Paul Lévy (1886-1971)
• Hugo Steinhaus (1887-1972)
• Stefan Banach (1892-1945)
• Eduard Čech (1893-1960)
• Norbert Wiener (1894-1964)
• Juliusz Schauder (1899-1943)
• Herman Auerbach (1901-1942)
• John von Neumann (1903-1957)
• Andrey Andreyevich Markov (1903-1979)
• Andrey Nikolaevich Kolmogorov (1903-1987)
• Marshall Harvey Stone (1903-1989)
• Wladyslaw Orlicz (1903-1990)
• Henri Cartan (1904-2008)
• Stanislaw Mazur (1905-1981)
• Arne Beurling (1905-1986)
• Henry Frederic Bohnenblust (1906-2000)
• Nachman Aronszajn (1907-1980)
• Mark Grigorievich Krein (1907-1989)
• Meier Eidelheit (1910-1943)
• Angus Ellis Taylor (1911-1999)
• Shizuo Kakutani (1911-2004)
• David Milman (1912-1982)
• Billy James Pettis (1913-1979)
• Charles Earl Rickart (1913-2002)
• Herman Heine Goldstine (1913-2004)
• Israel Moiseevich Gelfand (1913-2009)
• Vitold Lvovich Šmulyan (1914-1944)
• Leonidas Alaoglu (1914-1981)
• Laurent Schwartz (1915-2002)
• Gustave Choquet (1915-2006)
• Frederick Valentine Atkinson (1916-2002)
• Aryeh Dvoretzky (1916-2008)
• William Frederick Eberlein (1917-1986)
• Robert Clarke James (1918-2004)
• Jerzy Loś (1920-1988)
• Nicolaas Hendrik Kuiper (1920-1994)
• Claude Ambrose Rogers (1920-2005)
• Mychajlo Jossypowytsch Kadets (1923-2011)
• Calvin Richard Putnam (1924-2008)
• Bent Fuglede (1925-2023)
• Robert Ralph Phelps (1926-2013)
• Czeslaw Ryll-Nardzewski (1926-2015)
• Kennan Tayler Smith (1926-2000)
• Felix Earl Browder (1927-2016)
• Errett Albert Bishop (1928-1983)
• Alexander Grothendieck (1928-2014)
• Wilhelmus Anthonius Josephus Luxemburg (1929-2018)
• Karel de Leeuw (1930-1978)
• Shaul Reuven Foguel (1931-2020)
• Aleksander Pelczyński (1932-2012)
• Czeslaw Bessaga (1932-2021)
• John Robert Ringrose (b. 1932)
• Lior Tzafriri (1936-2008)
• Joram Lindenstrauss (1936-2012)
• Stefan Oscar Walter Hildebrandt (1936-2015)
• Haskell Paul Rosenthal (1940-2021)
• Paul Robert Chernoff (1942-2017)
• Stanislaw Kwapień (b. 1942)
• Per Henrik Enflo (b. 1944)
• Victor Lomonosov (1946-2018).
Embarrassingly the above list contains no women. In the related areas of PDEs and vari-
ational calculus (which is functional analysis, but non-linear) there have been quite a few, in
particular Sofia Kowalevskaya (1850-1891), Emmy Noether (1882-1935), Olga Ladyzhenskaya
(1922-2004), Cathleen Synge Morawetz (1923-2017), Yvonne Choquet-Bruhat (b. 1923), Karen
Uhlenbeck (b. 1942, Abel prize 2019), . . . (In classical and harmonic analysis, Grace Chisholm
Young (1868-1944), Nina Bari (1901-1961) and Dorothy Maharam Stone (1917-2014) come to
mind.) But in linear functional analysis the first notable women probably are
• Mary Beth Ruskai (1944-2023), who worked on functional analytic questions of quantum
theory.
• Nicole Tomczak-Jaegermann (1945-2022), who worked on Banach space theory, e.g. [165].
• Dusa McDuff (b. 1945), who after a brilliant PhD thesis (1970) on operator algebras
switched to more geometric matters (symplectic topology and geometry).
• Other female operator algebraists: Marie Choda and Claire Anantharaman-Delaroche with
first publications in 1962 and 1967, respectively.
D Results stated, but not proven
• 1905, 1913: Lévy-Steinitz theorem on reordering series in finite dimension.
• 1911: Carathéodory’s convexity theorem.
• 1929/1940: Orlicz-Pettis theorem.
• 1936: Kakutani/Birkhoff: TVS is metrizable ⇔ 0 has countable neighborhood base.
• 1940s: Gelfand, A. E. Taylor, Dunford, Lorch: holomorphic functional calculus.
• 1940/1947: Eberlein-Šmulyan theorem.
• 1951: R.C. James’ space with V ≅ V∗∗ but ι_V(V) ⊊ V∗∗.
• 1955: Grothendieck: approx prop. ⇔ finite rank ops. approx id on compact subsets
• 1957/1964: R.C. James: Banach sp. is reflexive ⇔ ∀ϕ ∈ V∗ ∃ 0 ≠ x ∈ V : |ϕ(x)| = ‖x‖‖ϕ‖.
• 1958: Bessaga/Pelczyński: Banach space V does not contain c_0 ⇔ every WUC series in
V converges unconditionally.
• 1959: Lidskii’s theorem.
• 1960/1: Dvoretzky’s theorem.
• 1961/3: Bishop-Phelps theorems.
• 1965: Kuiper’s theorem.
• 1971: Kadets-Snobar theorem.
• 1971: Lindenstrauss-Tzafriri: Banach space with all closed subspaces complemented is
isomorphic to Hilbert space.
• 1972: Kwapień’s theorem
• 1973: Enflo: Banach space without approximation property.
• 1973: Pecherskii’s theorem.
• 1974: Rosenthal’s ℓ^1 theorem.
• 1975/1987: Enflo: Banach space operator without invariant subspace.
• 1977: Blair: Baire’s theorem ⇔ DC_ω.
• 1981: Szankowski: B(H) doesn’t have approximation property.
• 2011: Argyros/Haydon: solution of the scalar-plus-compact problem (Banach space V
with B(V ) = C1 + K(V )).
Note that apart from the last one and some new takes on old results like [7, 38, 53, 61] we
have hardly even mentioned any results from the last 40 years!
E What next?
For general orientation, the article [166] and Dieudonné’s book [40] are strongly recommended.
• General topological vector spaces, F-spaces, beginning with [141].
• Locally convex spaces and distributions, beginning with [141, 30, 94]
• Sobolev spaces. Applications of the latter and of distributions to PDEs, e.g. [52].
• Index theory of elliptic PDEs (Atiyah-Singer etc.)
• Much more on Banach spaces, beginning with [102], then [26, 98, 1, 97] etc.
• Connections between Banach spaces and classical/harmonic analysis, e.g. wavelets, Hardy
spaces, and with probability theory, e.g. martingales. E.g. [176, 77].
• More operator theory on Banach and Hilbert spaces, e.g. [59, 135, 24, 68, 129, 152].
• Semigroup theory, e.g. [4, 50]
• Banach algebras [131, 18, 80].
• Connections between Banach algebras and complex analysis.
• C ∗ - and von Neumann algebras: [110, 79] and many other books.
• Interactions of operator algebras and operator theory, beginning with [110, 33, 42].
• Algebraic topology of operator algebras, non-commutative geometry.
• Non-linear functional analysis, e.g. [28, 37, 39, 115, 164, 177].
• Variational calculus (with applications to differential equations), non-linear optimization.
• Applications of operator theory in quantum mechanics, e.g. [92].
• Applications of operator algebras in statistical physics and quantum field theory. E.g.
[22, 65].
• Non-archimedean/p-adic functional analysis, e.g. [138, 125].
F Approximate schedule for 14 lectures à 90 minutes
1. Sections 1-2.2: Introduction. Topological vector spaces, normed spaces.
2. Sections 2.3-3: Glimpse beyond normed spaces. More on normed spaces and bounded
maps
3. Section 4: The spaces ℓ^p(S, F) and c_0(S, F). Proofs of Hölder and Minkowski inequalities,
dual spaces. Most other proofs omitted or just sketched.
4. Sections 5.1-5.5: Hilbert spaces up to and incl. H ∗ .
5. Section 5.6-6.1: Bases and tensor products of Hilbert spaces. Quotients of Banach spaces.
6. Section 6.2: complemented subspaces. Section 7: Open mapping thm. incl. Baire, closed
graph theorem, boundedness below.
7. Section 8: Uniform boundedness theorem and applications.
8. Section 9: Hahn-Banach theorem and applications incl. reflexivity and transpose of oper-
ators.
9. Section 11: Hilbert space operators, beginning with adjoint. Self-adjoint, normal ops, etc.
10. Finish Hilbert space operators. Then Section 12 on compact operators.
11. Sections 13.1-13.2.2: spectra of operators, spectrum in a Banach algebra.
12. Sections 13.2.3 and 13.2.4: Beurling-Gelfand theorem and its applications.
13. Section 14: spectral theorems for compact operators (normal or not). Quick mention of
Fredholm operators. Section 15: Characters vs. maximal ideals. Power series functional
calculus.
14. Sections 16, 17: C ∗ -algebras, continuous functional calculus for normal operators. Spectral
theorems for normal operators (only Section 18.1).
− − − − − − − − − − − − − − − − − − − − − − − − −−
15. Section 10 on weak and weak-∗ topologies, first half of Section 12.2.
16. Second half of Section 12.2, Section 19.
Lectures 15 and 16 are not part of the course since the weeks 15-16 of the semester were
scrapped a few years ago. If there is interest, I can give these two lectures (incl. homework) in
January for 1 EC.
All papers appearing in the bibliography are cited somewhere, but not all books. Still, all
are worth looking at.
References
[1] F. Albiac, N. J. Kalton: Topics in Banach space theory. 2nd. ed. Springer, 2016.
[2] D. Amir: Characterizations of inner product spaces. Birkhäuser, 1986.
[3] N. R. Andre, S. M. Engdahl, A. E. Parker: An Analysis of the First Proofs of
the Heine-Borel Theorem. [Link]
an-analysis-of-the-first-proofs-of-the-heine-borel-theorem
[4] D. Applebaum: Semigroups of linear operators. Cambridge University Press, 2019.
[5] S. A. Argyros, R. G. Haydon: A hereditarily indecomposable L∞ -space that solves the
scalar-plus-compact problem. Acta Math. 206, 1-54 (2011).
[6] S. A. Argyros, R. G. Haydon: Bourgain-Delbaen L∞ -spaces, the scalar-plus-compact
property and related problems. Proc. Intern. Congr. Math., vol. 3 (2018).
[7] M. B. Asadi, A. Khosravi: An elementary proof of the characterization of isomorphisms
of standard operator algebras. Proc. Amer. Math. Soc. 134, 3255-3256 (2006).
[8] M. Bachir: On the Krein-Milman-Ky Fan theorem for convex compact metrizable sets.
Illinois Journ. Math. 62, 1-24 (2018).
[9] S. Banach: Théorie des opérations linéaires, 1932. Engl. transl.: Theory of linear opera-
tions. North-Holland, 1987.
[10] W. R. Bauer, R. H. Benner: The non-existence of a Banach space of countably infinite
Hamel dimension. Amer. Math. Monthly 78, 895-896 (1971).
[11] V. Baumann: W. A. J. Luxemburgs Beweis des Satzes von Hahn-Banach. Archiv Math.
18, 271-272 (1967).
[12] A. F. Beardon: Limits. Springer, 1997.
[13] J. L. Bell, D. H. Fremlin: A geometric form of the axiom of choice. Fund. Math. 77,
168-170 (1972/3).
[14] S. J. Bernau: The spectral theorem for normal operators. Journ. London Math. Soc. 40,
478-486 (1965).
[15] G. Birkhoff, E. Kreyszig: The establishment of functional analysis. Histor. Math. 11,
258-321 (1984).
[16] E. Bishop, R. R. Phelps: A proof that every Banach space is subreflexive. Bull. Amer.
Math. Soc. 67, 97-98 (1961).
[17] C. E. Blair: The Baire category theorem implies the principle of dependent choices. Bull.
Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 25, 933-934 (1977).
[18] F. F. Bonsall, J. Duncan: Complete normed algebras. Springer, 1973
[19] F. F. Bonsall, J. Duncan: Numerical ranges of operators on normed spaces and of elements
of normed algebras and Numerical ranges II. Cambridge University Press, 1971, 1973.
[20] R. Bouldin: The essential minimum modulus. Indiana Univ. Math. Journ. 30, 513-517
(1981).
[21] N. Bourbaki: Topological vector spaces. Chapters 1-5. Springer, 1987.
[22] O. Bratteli, D. W. Robinson: Operator algebras and quantum statistical mechanics. Two
volumes. 2nd. eds.. Springer, 1987, 1997.
[23] H. Brézis: Functional analysis, Sobolev spaces and partial differential equations. Springer,
2011.
[24] M. S. Brodskiı̌: Triangular and Jordan representations of linear operators. American
Mathematical Society Translations, 1971.
[25] T. Bühler, D. A. Salamon: Functional analysis. American Mathematical Society, 2018.
[26] N. L. Carothers: A short course on Banach space theory. Cambridge University Press,
2005.
[27] D. Choimet, H. Queffélec: Twelve landmarks of twentieth-century analysis. Cambridge
University Press, 2009.
[28] P. G. Ciarlet: Linear and nonlinear functional analysis with applications. Society for
Industrial and Applied Mathematics, 2013.
[29] D. L. Cohn: Measure theory. 2nd. ed. Springer, 2013.
[30] J. B. Conway: A course in functional analysis. 2nd. ed. Springer, 2007.
[31] J. B. Conway: The compact operators are not complemented in B(H). Proc. Amer. Math.
Soc. 32, 549-550.
[32] H. G. Dales: Banach algebras and automatic continuity. Oxford University Press, 2000.
[33] K. R. Davidson: C ∗ -Algebras by example. American Mathematical Society, 1996.
[34] A. M. Davie: The Banach approximation problem. J. Approx. Th. 13, 392-394 (1975).
[35] C. Davis: The Toeplitz-Hausdorff theorem explained. Canad. Math. Bull. 14, 245-246
(1971).
[36] M. M. Day: Reflexive Banach spaces not isomorphic to uniformly convex spaces. Bull.
Amer. Math. Soc. 47, 313-317 (1941).
[37] K. Deimling: Nonlinear functional analysis. Springer, 1985.
[38] S. Delpech: A short proof of Pitt’s compactness theorem. Proc. Amer. Math. Soc. 137,
1371-1372 (2008).
[39] Z. Denkowski, S. Migórski, N. S. Papageorgiou: An introduction to nonlinear analysis:
Theory. Springer, 2003.
[40] J. Dieudonné: History of functional analysis. North-Holland, 1981.
[41] J.-L. Dorier: A general outline of the genesis of vector space theory. Hist. Math. 22,
227-261 (1995).
[42] R. G. Douglas: Banach algebra techniques in operator theory. 2nd. ed. Springer, 1998.
[43] N. Dunford, J. T. Schwartz: Linear operators. I. General theory. Interscience Publishers,
1958, John Wiley & Sons, 1988.
[44] H.-D. Ebbinghaus et al.: Numbers. Springer, 1991.
[45] D. E. Edmunds, W. D. Evans: Spectral theory and differential operators. Oxford University
Press, 1987, 2018.
[46] N. Eldredge: Answer to the question ‘Is there a simple direct proof of the Open
Mapping Theorem from the Uniform Boundedness Theorem?’ on [Link].
[Link]
mapping-theorem-from-the-uniform-boun
[47] P. Enflo: A counterexample to the approximation problem in Banach spaces. Acta Math.
130, 309-317 (1973).
[48] P. Enflo: On the invariant subspace problem in Banach spaces. Sémin. d’anal. fonction-
nelle (Maurey-Schwartz). Exp. no 14 et 15, p. 1-6 (1975-1976) On the invariant subspace
problem for Banach spaces. Acta Math. 158, 213-313 (1987).
[49] P. Enflo: On the invariant subspace problem in Hilbert spaces. Preprint 2023.
[Link]
[50] K.-J. Engel, R. Nagel: One-parameter semigroups for linear evolution equations. Springer,
2000.
[51] R. Engelking: General topology. Heldermann Verlag, 1989.
[52] L. C. Evans: Partial differential equations. 2nd. ed. American Mathematical Society, 2010.
[53] A. Fellhauer: On the relation of three theorems of analysis to the axiom of choice. J. Logic
Analysis 9, 1-23 (2017).
[54] M. Foreman, F. Wehrung: The Hahn-Banach theorem implies the existence of a non-
Lebesgue measurable set. Fund. Math. 138, 13-19 (1991).
[55] S. Friedberg, A. Insel, L. Spence: Linear algebra. 4th. ed. Pearson, 2014.
[56] T. W. Gamelin, R. E. Greene: Introduction to topology. 2nd. ed. Dover Publications, 1999.
[57] D. J. H. Garling: A course in mathematical analysis. Vol. 1 & 2. Cambridge University
Press, 2013.
[58] B. W. Glickfeld: A metric characterization of C(X) and its generalization to C ∗ -algebras.
Illinois Journ. Math 10, 547-556 (1966).
[59] I. C. Gohberg, M. G. Krein: Introduction to the theory of linear nonselfadjoint operators
in Hilbert Space. American Mathematical Society Translations, 1969.
[60] F. Q. Gouvêa: p-adic numbers. An introduction. 3rd. ed. Springer, 2020.
[61] S. Grabiner: The Tietze extension theorem and the open mapping theorem. Amer. Math.
Monthly 93, 190-191 (1986).
[62] A. Grothendieck: La théorie de Fredholm. Bull. Soc. Math. France 84, 319-384 (1956).
[63] S. Gudder: Inner product spaces. Amer. Math. Monthly 81, 29-36 (1974), 82, 251-252
(1975), 82, 818 (1975).
[64] K. E. Gustafson, D. K. M. Rao: Numerical range. The field of values of linear operators
and matrices. Springer, 1997.
[65] R. Haag: Local quantum physics. 2nd. ed. Springer, 1996.
[66] P. R. Halmos: Introduction to Hilbert space and the theory of spectral multiplicity. 2nd.
ed. Chelsea, 1957.
[67] P. R. Halmos: What does the spectral theorem say? Amer. Math. Monthly 70, 241-247
(1963).
[68] P. R. Halmos: A Hilbert space problem book. 2nd. ed. Springer, 1982.
[69] I. Halperin: Sums of a series, permitting rearrangements. C. R. Math. Rep. Acad. Sci.
Can. VIII, 87-102 (1986).
[70] J. D. Halpern: The independence of the axiom of choice from the Boolean prime ideal
theorem. Fundam. Math. 55, 57-66 (1964).
[71] J. D. Halpern, A. Lévy: The Boolean prime ideal theorem does not imply the axiom of
choice. In: D. Scott (ed.) Axiomatic set theory. Amer. Math. Soc., 1971.
[72] H. Hanche-Olsen, H. Holden: The Kolmogorov-Riesz compactness theorem. Expos. Math.
28, 385-394 (2010).
[73] O. Hanner: On the uniform convexity of Lp and lp . Ark. Mat. 3, 239-244 (1955).
[74] C. Heil: A basis theory primer. Birkhäuser, 2011.
[75] N. Higson, J. Roe: Analytic K-homology. Oxford University Press, 2000.
[76] K. Hoffman: Banach spaces of analytic functions. Prentice-Hall, 1962.
[77] T. Hytönen, J. van Neerven, M. Veraar, L. Weis: Analysis in Banach spaces. Two Volumes.
Springer, 2016, 2017.
[78] V. M. Kadets, M. I. Kadets: Rearrangements of series in Banach spaces. American Math-
ematical Society, 1991.
[79] R. V. Kadison, J. R. Ringrose: Fundamentals of the theory of operator algebras. Two
volumes. Academic Press, 1983, 1986.
[80] E. Kaniuth: A course in commutative Banach algebras. Springer, 2009.
[81] S. Kaplan: The bidual of C(X) I. North-Holland, 1985.
[82] A. A. Karatsuba, S. M. Voronin: The Riemann zeta function. Walter de Gruyter, 1992.
[83] Y. Katznelson: An introduction to harmonic analysis. 3rd. ed. Cambridge University
Press, 2004.
[84] Y. & Y. R. Katznelson: A (terse) introduction to linear algebra. American Mathematical
Society, 2008.
[85] I. Kleiner: A history of abstract algebra. Birkhäuser, 2007.
[86] V. Komornik: Lectures on functional analysis and the Lebesgue integral. Springer, 2016.
[87] E. Kreyszig: Introductory functional analysis with applications. Wiley, 1978.
[88] M. Krukowski: Natural proof of the characterization of relatively compact families in
Lp -spaces on locally compact groups. Publ. Math. 67, 687-713 (2023).
[89] N. H. Kuiper: The homotopy type of the unitary group of Hilbert space. Topology 3,
19-30 (1965).
[90] H. E. Lacey, R. Whitley: Conditions under which all the bounded linear maps are compact.
Math. Ann. 158, 1-5 (1965).
[91] E. Landau: Differential and integral calculus. Chelsea, 1951.
[92] N. P. Landsman: Foundations of quantum theory. Springer, 2017. Freely available at
[Link]
[93] S. Lang: Undergraduate algebra. 2nd. ed. Springer, 1990.
[94] P. D. Lax: Functional analysis. Wiley, 2002.
[95] P. D. Lax: Linear algebra and its applications. 2nd. ed. Wiley, 2007.
[96] S. R. Lay: Convex sets and their applications. Wiley, 1982.
[97] D. Li, H. Queffélec: Introduction to Banach spaces: Analysis and probability. Two Vol-
umes. Cambridge University Press, 2018.
[98] J. Lindenstrauss, L. Tzafriri: Classical Banach spaces. Vol. 1: Sequence spaces, Vol. 2:
Function spaces. Springer, 1977, 1979.
[99] J. Loś, C. Ryll-Nardzewski: On the application of Tychonoff’s theorem in mathematical
proofs. Fund. Math. 38, 233-237 (1951).
[100] W. A. J. Luxemburg: Two applications of the method of construction by ultrapowers to
analysis. Bull. Amer. Math. Soc. 68, 416-419 (1962).
[101] B. MacCluer: Elementary functional analysis. Springer, 2009.
[102] R. E. Megginson: An introduction to Banach space theory. Springer, 1998.
[103] R. Meise, D. Vogt: Introduction to functional analysis. Oxford University Press, 1997.
[104] A. J. Michaels: Hilden’s simple proof of Lomonosov’s invariant subspace theorem. Adv.
Math. 25, 56-58 (1977).
[105] D. F. Monna: Functional analysis in historical perspective. Oosthoek Publishing Company,
1973.
[106] G. H. Moore: The axiomatization of linear algebra: 1875-1940. Hist. Math. 22, 262-303
(1995).
[107] M. H. Mortad: On the absolute value of the product and the sum of operators. Rend.
Circ. Mat. Palermo, II. Ser 68, 247-257 (2019).
[108] M. Müger: Topology for the working mathematician. (work in progress).
[Link]
[109] M. Müger: Some examples of Fourier series.
[Link]
[110] G. J. Murphy: C ∗ -Algebras and operator theory. Academic Press, 1990.
[111] G. Nagy: A functional analysis point of view on the Arzela-Ascoli theorem. Real. Anal.
Exch. 32, 583-586 (2006/7).
D. C. Ullrich: The Ascoli-Arzelà Theorem via Tychonoff’s Theorem. Amer. Math.
Monthly 110, 939-940 (2003).
M. Wójtowicz: For eagles only: probably the most difficult proof of the Arzelà-Ascoli
theorem − via the Stone-Čech compactification. Quaest. Math. 40, 981-984 (2017).
[112] L. Narici, E. Beckenstein, G. Bachman: Functional analysis and valuation theory. Marcel
Dekker, Inc. 1971.
[113] J. van Neerven: Functional analysis. Cambridge University Press, 2022.
[114] D. J. Newman: A simple proof of Wiener’s 1/f theorem. Proc. Amer. Math. Soc. 48,
264-265 (1975).
[115] L. Nirenberg: Topics in nonlinear functional analysis. American Mathematical Society,
1974.
[116] B. de Pagter, A. C. M. van Rooij: An invitation to functional analysis. Epsilon Uitgaven,
2013.
[117] J. Pawlikowski: The Hahn-Banach theorem implies the Banach-Tarski paradox. Fund.
Math. 138, 21-22 (1991).
[118] G. Pedersen: Analysis now. Springer, 1989.
[119] R. R. Phelps: Lectures on Choquet’s theorem. 2nd. ed. Springer, 2001.
[120] J.-P. Pier: Mathematical analysis during the 20th century. Oxford University Press, 2001.
[121] A. Pietsch: Eigenvalues and s-numbers. Cambridge University Press, 1987.
[122] A. Pietsch: History of Banach spaces and linear operators. Birkhäuser, 2007.
[123] D. Pincus: The strength of the Hahn Banach theorem. In: A. Hurd, P. Loeb (eds.):
Victoria Symposium on Nonstandard Analysis. LNM 369, Springer, 1974.
[124] D. Pincus: Adding dependent choice to the prime ideal theorem. In: R. O. Gandy, J. M.
E. Hyland (eds.): Logic Colloquium 76. North-Holland, 1977.
[125] J. B. Prolla: Topics in functional analysis over valued division rings. North-Holland, 1982.
[126] D. Ramakrishnan, R. J. Valenza: Fourier analysis on number fields. Springer, 1999.
[127] C. J. Read: A solution to the Invariant Subspace Problem on the space ℓ^1. Bull. London
Math. Soc. 17, 305-317 (1985).
[128] M. Reed, B. Simon: Methods of modern mathematical physics I: Functional analysis.
Academic Press, 1980.
[129] M. Reed, B. Simon: Methods of modern mathematical physics IV: Analysis of operators.
Academic Press, 1980.
[130] C. E. Rickart: An elementary proof of a fundamental theorem in the theory of Banach
algebras. Michigan Math. J. 5, 75-78 (1958).
[131] C. E. Rickart: General theory of Banach algebras. Robert E. Krieger Publishing Co. Inc.,
1960.
[132] F. Riesz, B. Sz.-Nagy: Functional analysis. Frederick Ungar Publ., 1955, Dover, 1990.
[133] J. R. Ringrose: A note on uniformly convex spaces. J. London Math. Soc. 34, 92 (1959).
[134] J. R. Ringrose: Super-diagonal forms for compact linear operators. Proc. London Math.
Soc. 12, 367-384 (1962).
[135] J. R. Ringrose: Compact non-self-adjoint operators. Van Nostrand Reinhold, 1971.
[136] A. M. Robert: A course in p-adic analysis. Springer, 2000.
[137] J. W. Roberts: A compact convex set with no extreme points. Studia Mathematica 60,
255-266 (1977).
[138] A. C. M. van Rooij: Non-Archimedean functional analysis. Marcel Dekker, Inc., 1978.
[139] W. Rudin: Principles of mathematical analysis. McGraw Hill, 1953, 1964, 1976.
[140] W. Rudin: Real and complex analysis. McGraw-Hill, 1966, 1974, 1986.
[141] W. Rudin: Functional analysis. 2nd. ed. McGraw-Hill, 1991.
[142] V. Runde: A taste of topology. Springer, 2005.
[143] V. Runde: A new and simple proof of Schauder’s theorem.
[Link]
[144] R. A. Ryan: Introduction to tensor products of Banach spaces. Springer, 2002.
[145] B. P. Rynne, M. A. Youngson: Linear functional analysis. 2nd. ed. Springer, 2008.
[146] D. A. Salamon: Measure and integration. European Mathematical Society, 2016.
[147] D. Sarason: The multiplication theorem for Fredholm operators. Amer. Math. Monthly
94, 68-70 (1987).
[148] K. Saxe: Beginning functional analysis. Springer, 2002.
[149] E. Schechter: Handbook of analysis and its foundations. Academic Press, 1997.
[150] C. Schmoeger: Remarks on commuting exponentials in Banach algebras. Proc. Amer.
Math. Soc. 127, 1337-1338 (1999).
[151] B. Simon: Trace ideals and their applications. 2nd. ed. American Mathematical Society,
2005.
[152] B. Simon: Operator theory. American Mathematical Society, 2015.
[153] I. Singer: Bases in Banach spaces I & II. Springer, 1970, 1981.
[154] A. Sokal: A really simple elementary proof of the Uniform Boundedness Theorem. Amer.
Math. Monthly 118, 450-452 (2011).
[155] P. Soltan: A primer on Hilbert space. Springer, 2018.
[156] L. A. Steen: Highlights in the history of spectral theory. Amer. Math. Monthly 80, 359-381
(1973).
[157] E. M. Stein, R. Shakarchi: Fourier analysis. Princeton University Press, 2005.
[158] J. Stillwell: Reverse mathematics. Proofs from the inside out. Princeton University Press,
2018.
[159] M. H. Stone: Linear transformations in Hilbert space and their applications to analysis.
American Mathematical Society, 1932.
[160] V. S. Sunder: Fuglede’s theorem. Indian J. Pure Appl. M. 46, 415-417 (2015).
[161] C. Swartz: Infinite matrices and the gliding hump. World Scientific, 1996.
[162] A. Szankowski: B(H) does not have the approximation property. Acta Math. 147, 89-108
(1981).
[163] T. Tao: Analysis I & II. 3rd. ed. Springer, 2016.
[164] G. Teschl: Topics in linear and nonlinear functional analysis. American Mathematical
Society, 2020.
[165] N. Tomczak-Jaegermann: Banach-Mazur distances and finite-dimensional operator ideals.
Longman Scientific & Technical, 1989.
[166] A. M. Vershik: The life and fate of functional analysis in the twentieth century. In: A.
A. Bolibruch, Yu. S. Osipov, Ya. G. Sinai (eds.): Mathematical events of the twentieth
century. Springer, 2006.
[167] S. Warner: Topological fields. North-Holland, 1989.
[168] J. Weidmann: Lineare Operatoren in Hilberträumen. 1: Grundlagen, 2: Anwendungen.
Teubner, 2000, 2003.
[169] A. Weil: L’intégration dans les groupes topologiques et ses applications. 2ème éd. Hermann,
1965.
[170] H. Weyl: Über beschränkte quadratische Formen, deren Differenz vollstetig ist. Rend.
Circ. Mat. Palermo 27, 373-392 (1909).
[171] H. Weyl: Über gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen
Entwicklungen willkürlicher Funktionen. Math. Ann. 68, 220-269 (1910).
[172] R. Whitley: Projecting m onto c0 . Amer. Math. Monthly 73, 285-286 (1966).
[173] R. Whitley: An elementary proof of the Eberlein-Šmulyan theorem. Math. Ann. 172,
116-118 (1967).
[174] R. Whitley: The spectral theorem for a normal operator. Amer. Math. Monthly 75, 856-
861 (1968).
[175] W. Wieşlaw: On topological fields. Colloq. Math. 29, 119-146 (1974).
[176] P. Wojtaszczyk: Banach spaces for analysts. Cambridge University Press, 1991.
[177] E. Zeidler: Nonlinear functional analysis. Volumes 1, 2A, 2B, 3, 4, 5. Springer, 1984-1990.
[178] R. J. Zimmer: Essential results of functional analysis. University of Chicago Press, 1990.
[179] A. Zsák: On the Solution of the scalar-plus-compact problem by Argyros and Haydon.
EMS Newsletter, December 2018, pp. 8-15.
[180] Question “Why do we care about Lp spaces besides p = 1, p = 2, p = ∞?” on MathOver-
flow with answers. [Link]
spaces-besides-p-1-p-2-and-p-infty