Functional analysis
Michael Müger
26.01.2021
Abstract
These are notes for the course Inleiding in de Functionaalanalyse, Autumn 2020/21 (14×90
min.). They are also recommended as background for my courses on Operator Algebras.
1 Rough introduction
We will begin with a quick delineation of what we will discuss – and what not!
• “Classical analysis” is concerned with ‘analysis in finitely many dimensions’. ‘Functional
analysis’ is the generalization or extension of classical analysis to infinitely many dimen-
sions. Before one can try to make sense of this, one should make the first sentence more
precise. Since the creation of general topology, one can talk about convergence and con-
tinuity in very general terms. As far as I see it, this is not analysis, even if infinite sums
(=series) are studied. Analysis proper starts as soon as one talks about differentiation
and/or integration. Differentiation has to do with approximating functions locally by lin-
ear ones, and for this one needs the spaces considered to be vector spaces (at least locally).
This is the reason why most of classical analysis considers functions between the vector
spaces Rn and Cn (or subsets of them). (In a second step, one can then generalize to
spaces that look like Rn only locally by introducing topological and smooth manifolds and
their generalizations, but the underlying model of Rn remains important.) On the other
hand, integration, at least in the sense of the modern theory, can be studied much more
generally, i.e. on arbitrary sets equipped with a measure (defined on some σ-algebra). Such
a set can be very far from being a vector space or manifold, for example by being totally
disconnected.
• In view of the above, it is not surprising that functional analysis is concerned with (pos-
sibly) infinite dimensional vector spaces and continuous maps between them. (Again, one
can then generalize to spaces that look like a vector space only locally, but this would
be considered infinite dimensional geometry, not functional analysis.) In addition to the
vector space structure one needs a topology, which naturally leads to topological vector
spaces, which I will define soon.
• The importance of topologies is not specific to infinite dimensions. The point rather is
that Rⁿ, Cⁿ have unique topologies making them topological vector spaces. This is no longer true in infinite dimensions!
• Actually, ‘functional analysis’ most often studies only linear maps between topological
vector spaces so that this domain of study should be called ‘linear functional analysis’, but
this is done only rarely, e.g. [68]. Allowing non-linear maps leads to non-linear functional
analysis. This course will discuss only linear functional analysis. Thorough mastery of the
latter is needed anyway before one can think about non-linear FA or infinite dimensional
geometry. For the simplest result of non-linear functional analysis, see Section B.9. For
more, you could have a look at, e.g., [80, 14, 53]. There even is a five volume treatise [86]!
• The restriction to linear maps means that the notion of differentiation becomes point-
less, the derivative of f (x) = Ax + b being just A everywhere. But there are many
non-trivial connections between linear FA and integration (and measure) theory. For
example, every measure space (X, A, µ) gives rise to a family of topological vector spaces
Lp (X, A, µ), p ∈ (0, ∞], and integration provides linear functionals. Proper appreciation of
these matters requires some knowledge of measure and integration theory, cf. e.g. [11, 69].
I will not suppose that you have followed a course on this subject (but if you haven't, I strongly recommend that you do so at the next occasion or, at least, read the appendix in MacCluer's book [42]). Yet, one can get a reasonably good idea by focusing on sequence spaces, for
which no measure theory is required, see Section 4.
• One should probably consider linear functional analysis as an infinite dimensional and
topological version of linear algebra rather than as a branch of analysis! This might lead
one to suspect linear FA to be slightly boring, but this would be wrong for many reasons:
– Functional analysis (linear or not) leads to very interesting (and arbitrarily challeng-
ing) technical questions (most of which reduce to very easy ones in finite dimensions).
– Linear FA is essential for non-linear FA, like variational calculus, and the theory of
differential equations – not only linear ones!
– Quantum theory [38] is a linear theory and cannot be done properly without func-
tional analysis, despite the fact that many physicists think so! Conversely, many
developments in FA were directly motivated by quantum theory.
• The above could give the impression that functional analysis arose from the wish of gen-
eralizing analysis to infinitely many dimensions. This may have played a role for some
of its creators, but its beginnings (and much of what is being done now) were mostly
motivated by finite dimensional “classical”1 analysis: If U ⊆ Rn , the set of functions (pos-
sibly continuous, differentiable, etc.) from U to Rm is a vector space as soon as we put
(cf + dg)(x) = cf (x) + dg(x). Unless U is a finite set, this vector space will be infinite
dimensional. Now one can consider certain operations on such vector spaces, like differentiation $C^\infty(U) \to C^\infty(U),\ f \mapsto f'$, or integration $f \mapsto \int_U f$. This sort of consideration provided the initial motivation for the development of functional analysis, and indeed FA
now is a very important tool for the study of ordinary and partial differential equations on
finite dimensional spaces. See e.g. [7, 20]. The relevance of FA is even more obvious if one
studies differential equations in infinitely many dimensions. In fact, it is often useful to
study a partial differential equation (like heat or wave equation) by singling out one of the
variables (typically ‘time’) and studying the equation as an ordinary differential equation
in an infinite dimensional space of functions. FA is also essential for variational calculus
(which in a sense is just a branch of differential calculus in infinitely many dimensions).
• In view of the above, FA studies abstract topological vector spaces as well as ‘concrete’
spaces, whose elements are functions. In order to obtain a proper understanding of FA,
one needs some familiarity with both aspects.
Before we delve into technicalities, some further general remarks:
• The history of functional analysis is quite interesting, cf. e.g. the article [5], [56, Chapter
4] and the books [15, 45]. But clearly it makes little sense to study it before one has some
technical knowledge of FA. It is surprisingly intertwined with the development of linear
¹ Ultimately, I find it quite futile to try and draw a neat line between "classical" and "modern" or functional analysis, in particular since many problems in the former require methods from the latter for their proper treatment.
algebra. One would think that (finite dimensional) vector spaces, linear maps etc. were
defined much earlier than, e.g., Banach spaces, but this is not true. In fact, Banach’s2
book [3], based on his 1920 PhD thesis, is one of the first references containing the mod-
ern definition of a vector space. Some mathematicians, mostly Italian ones, like Peano,
Pincherle and Volterra, essentially had the modern definition already in the last decades
of the 19th century, but they had no impact since the usefulness of an abstract/axiomatic
approach was not yet widely appreciated. Cf. [36, Chapter 5] or [16, 46].
Here I limit myself to mentioning that the basics of functional analysis (Hilbert and Banach
spaces and bounded linear maps between them) were developed in the period 1900-1930.
Nevertheless, many important developments (locally convex spaces, distributions, operator
algebras) took place in 1930-1960. After that, functional analysis has split up into many
very specialized subfields that interact quite little with each other. The very interesting
article [81] ends with the conclusion that ‘functional analysis’ has ceased to exist as a
coherent field of study!
• The study of functional analysis requires a solid background in general topology. It may
well be that you’ll have to refresh and likely also extend yours. In Appendix A I have
collected brief accounts of the topics that – sadly – you are most likely not to have en-
countered before. All of them are contained in [65] (written by a functional analyst!), but
my favorite (I’m admittedly biased) reference is [47]. You should have seen Weierstrass’
theorem, but those of Tietze and Arzelà-Ascoli tend to vanish in the (pedagogical, not
factual) gap between general topology and functional analysis.
• The main reference for this course has been [42] for a number of years, but I am unenthu-
siastic about a number of aspects of it, which is why I wrote these notes. If you find them
too advanced, you might want to have a look at [54, 68, 70]. On the other hand, if you
want more, [55] is a good place to start, followed by [59, 39, 12, 64]. (The MasterMath
course currently uses [12].)
One word about notation (without guarantee of always sticking to it): General vector spaces,
but also normed spaces, are denoted V, W, . . ., normed spaces also as E, F, . . .. Vectors in such
spaces are e, f, . . . , x, y, . . .. Linear maps are always denoted A, B, . . ., except linear functionals
V → F, which are ϕ, ψ. Algebras are usually denoted A, B, . . . and their elements a, b, . . .. (For
A = B(E) this leads to inconsistency, but I cannot bring myself to using capital letters for
abstract algebra elements.)
Acknowledgment. I thank Bram Balkema, Victor Hissink Muller and Niels Vooijs for corrections,
but in particular Tim Peters for a huge number of them.
of infinite matrices involves infinite summations, which require the introduction of topologies.
(Actually, in some restricted contexts infinite matrices still are quite useful.)
We begin with the following
2.1 Definition A topological group is a group (G, ·, 1) equipped with a topology τ such that
the group operations G×G → G, (g, h) 7→ gh and G → G, g 7→ g −1 are continuous (where G×G
is given the product topology). (For abelian groups, one often denotes the binary operation by
+ instead of ·.)
2.2 Example 1. If (G, ·, 1) is any group then it becomes a topological group by putting τ =
τdisc , the discrete topology on G.
2. The group (R, +, 0), where R is equipped with its standard topology, is easily seen to be
a topological group.
3. If n ∈ N³ and F ∈ {R, C} then the set GL(n, F) = {A ∈ M_{n×n}(F) | det(A) ≠ 0} of invertible n × n matrices is a group w.r.t. matrix product and inversion, and in fact a topological group when equipped with the subspace topology induced from $M_{n\times n}(\mathbb F) \cong \mathbb F^{n^2}$.
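To see why inversion in 3. is continuous: by Cramer's rule the entries of A⁻¹ are polynomials in the entries of A divided by det(A), hence continuous functions on the open set where det(A) ≠ 0; continuity of multiplication is clear since each entry of AB is a polynomial in the entries of A and B.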
2.3 Remark Topological groups – or rather matrix groups as in 3. above – are the subject of my 3rd year course Continuous Matrix Groups, taught again next spring. They are an important (and prototypical) case of Lie groups. The latter are a subject at Master level that is very much worthy of study! □
2.4 Definition A topological field is a field (F, +, 0, ·, 1) equipped with a topology on F such
that (F, +, 0) and (F\{0}, ·, 1) are topological groups. (Equivalently, all field operations are
continuous.)
It is very easy to check that R and C are topological fields when equipped with their standard topologies. (So is Q with the topology induced from R.)
2.6 Definition Let F be a topological field. Then a topological vector space (TVS) over F
is an F-vector space equipped with a topology τV (to be distinguished, obviously, from the
topology τF on F) such that the maps V × V → V, (x, y) 7→ x + y and F × V → V, (c, x) 7→ cx
are continuous.
(These conditions imply that V → V, x 7→ −x is continuous, so that (V, +, 0) is a topological
group, but not conversely.)
Again it is very easy to check that Rn and Cn are topological vector spaces over the topo-
logical fields R, C, respectively, when they are equipped with the euclidean topologies (=product
topologies on F × · · · × F).
In this course, the only topological fields considered are R and C. When a result holds for
either of the two, I will write F. But note that one can consider topological vector spaces over
other topological fields, like the p-adic ones Qp [24]. (But the resulting p-adic functional analysis
is quite different in some respects from the ‘usual’ one, cf. the comments in Section B.1 and the
literature, e.g. [62, 57].)
2.7 Exercise Let F be a topological field and V an F-vector space. Is it true that V , equipped
with the discrete topology, is a topological vector space over F? Prove or give a counterexample.
³ Throughout these notes, N = {1, 2, 3, . . .}, thus 0 ∉ N.
Now we can define:
2.8 Definition Functional analysis (ordinary, as opposed to p-adic) is concerned with topo-
logical vector spaces over R or C and continuous maps between them. Linear functional analysis
considers only linear maps.
As it turns out, the above notion of topological vector spaces is a bit too general to build a
satisfactory and useful theory upon it. Just as in topology it is often (but by no means always!)
sufficient to work with metric spaces, for most purposes it is usually sufficient to consider certain
subclasses of topological vector spaces. The following diagram illustrates some of these classes
and their relationships:
(Note that F -spaces, Fréchet4 , Banach and Hilbert5 spaces are assumed complete but one also
has the non-complete versions. There is no special name for Fréchet spaces with completeness
dropped other than metrizable locally convex spaces. In the other cases, one speaks of metrized,
normed and pre-Hilbert spaces.)
The most useful of these classes are those in the bottom row. In fact, locally convex (vector)
spaces are general enough for almost all applications. They are thoroughly discussed in the
MasterMath course on functional analysis, while we will only briefly touch upon them. Most
of the time, we will be discussing Banach and Hilbert spaces. There is much to be said for
studying them in some depth before turning to locally convex spaces (or more general) spaces.
(Some books on functional analysis, like [64], begin with general topological vector spaces and
then turn to some special classes, but for a first encounter this does not seem appropriate. This
said, I also don’t see the point of beginning with proofs of many results on Hilbert spaces that
literally generalize to Banach spaces.)
2.9 Definition Let V be a vector space over F ∈ {R, C}. A seminorm on V is a map V → [0, ∞), x ↦ ‖x‖ such that
• ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ V (subadditivity),
• ‖cx‖ = |c| ‖x‖ for all c ∈ F, x ∈ V.
(Note that by the above, ‖x‖ = ∞ is not allowed!) A norm is a seminorm satisfying also ‖x‖ = 0 ⇒ x = 0.
A normed F-vector space is an F-vector space equipped with a norm.
2.10 Example 0. Clearly F ∈ {R, C} is a vector space over itself and kck := |c| defines a norm,
making F a complete normed F-vector space.
1. Let X be a compact topological space and V = C(X, F). Clearly, V is an F-vector space.
Now kf k = supx∈X |f (x)| is a norm on V . You probably know that the normed space (V, k · k)
is complete.
If X is non-compact then ‖f‖ can be infinite, but replacing C(X, F) by the subspace C_b(X, F) of bounded continuous functions one again obtains a normed space.
2. Let n ∈ N and V = Fⁿ. For p ∈ [1, ∞) define $\|x\|_p = (\sum_{i=1}^n |x_i|^p)^{1/p}$ and $\|x\|_\infty = \max_i |x_i|$.
(Note that all these ‖·‖_p including p = ∞ coincide if n = 1.) It is quite obvious that for each p ∈ [1, ∞] we have ‖x‖_p = 0 ⇔ x = 0 and ‖cx‖_p = |c| ‖x‖_p. For p = 1 and p = ∞ also the subadditivity is trivial to check using only |c + d| ≤ |c| + |d|. Subadditivity also holds for 1 < p < ∞, but is harder to prove. You have probably seen the proof for p = 2, which relies on the Cauchy-Schwarz inequality. The proof for 1 < p < 2 and 2 < p < ∞ is similar, using the inequality of Hölder instead. We will return to this and also prove that Rⁿ, Cⁿ are complete w.r.t. any of the norms ‖·‖_p, p ∈ [1, ∞].
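As a concrete illustration of how these norms differ, take x = (1, 1, . . . , 1) ∈ Rⁿ: then ‖x‖_p = n^{1/p} for p < ∞ and ‖x‖∞ = 1, so ‖x‖_p decreases as p increases, while all the values stay within a factor n of each other. (That any two of these norms are equivalent in the sense of Definition 2.15 is a special case of Proposition 3.19 below.)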
3. The above examples are easily generalized to infinite dimensions. Let S be any set. For a function f : S → F and 1 ≤ p < ∞ define
$$\|f\|_\infty = \sup_{s\in S} |f(s)|, \qquad \|f\|_p = \Big(\sum_{s\in S} |f(s)|^p\Big)^{1/p},$$
with the understanding that $(+\infty)^{1/p} = +\infty$. For the definition of infinite sums like $\sum_{s\in S} f(s)$ see Appendix A.1. Now let
$$\ell^p(S, \mathbb F) = \{ f : S \to \mathbb F \mid \|f\|_p < \infty \}.$$
Now one can prove that ‖·‖_p is a complete norm on $(\ell^p(S,\mathbb F), \|\cdot\|_p)$ for each p ∈ [1, ∞]. We will do this in Section 4.
4. Let (X, A, µ) be a measure space, f : X → F measurable and 1 ≤ p < ∞. We define
$$\|f\|_p = \Big(\int_X |f|^p\, d\mu\Big)^{1/p}, \qquad \|f\|_\infty = \inf\{ M > 0 \mid \mu(\{x \in X \mid |f(x)| > M\}) = 0 \}$$
and
$$\mathcal L^p(X, \mathcal A, \mu; \mathbb F) = \{ f : X \to \mathbb F \text{ measurable} \mid \|f\|_p < \infty \}.$$
Then $\|f\|_p = (\int_X |f|^p\, d\mu)^{1/p}$ is a seminorm on $\mathcal L^p(X, \mathcal A, \mu; \mathbb F)$ for all 1 ≤ p < ∞.
2.11 Lemma Let (V, k · k) be a normed F-vector space, and define d(x, y) = kx − yk. Then
(i) d is a metric on V that is translation-invariant, i.e. satisfies the equivalent statements
where we used k − xk = kxk, a special case of the second seminorm axiom. Finally,
2.12 Definition A topological vector space (V, τ ) is called normable if there exists a norm k · k
on V such that τ = τd with d(x, y) = kx − yk.
2.13 Remark One can prove, see the supplementary Appendix B.5.1, that a topological vector
space (V, τ ) is normable if and only if there is an open U ⊆ V such that 0 ∈ U and U is convex,
cf. Definition 5.21, and ‘bounded’. The latter property means that for every open neighborhood
V of 0 there exists an s > 0 such that U ⊆ sV . (In a normed space it is easy to see that B(0, r)
has all these properties for each r > 0.) 2
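For instance, convexity of B(0, r) is immediate from the norm axioms: if ‖x‖ < r, ‖y‖ < r and t ∈ [0, 1], then ‖tx + (1 − t)y‖ ≤ t‖x‖ + (1 − t)‖y‖ < tr + (1 − t)r = r.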
⁶ In view of these facts, which we cannot prove without going deeper into measure and integration theory, I don't find it unreasonable to ask that you understand the much simpler unordered summation.
2.14 Definition A normed space (V, k · k) is called
• complete, or Banach space, if the metric space (V, d) with d(x, y) = kx − yk is complete.
• separable if the metric space (V, d) is separable (i.e. has a countable7 dense subset).
Just as complete metric spaces are ‘better behaved’ (in the sense of allowing stronger theo-
rems) than general metric spaces, Banach spaces are ‘better’ than normed spaces. Separability
is an annoying restriction that we will try to avoid as much as possible. (An opposite approach
[54] restricts to separable spaces from the beginning in order to make do with weak versions of
the axiom of choice.)
Note that the norm of a normable space (V, τ ) never is unique (unless V = {0}): If k · k is
a norm compatible with τ then the same holds for ck · k for every c > 0. Thus the choice of a
norm on a vector space is an extra piece of structure. If k · k1 , k · k2 are different norms on V
then (V, k · k1 ), (V, k · k2 ) are different as normed spaces even if the norms give rise to the same
topology!
2.15 Definition Let V be an F-vector space. Two norms k·k1 , k·k2 on V are called equivalent
if τd1 = τd2 , where di (x, y) = kx − yki .
We will soon prove (quite easily) the following:
2.16 Proposition Two norms ‖·‖₁, ‖·‖₂ on an F-vector space V are equivalent if and only if there are 0 < c′ ≤ c such that c′‖x‖₁ ≤ ‖x‖₂ ≤ c‖x‖₁ for all x ∈ V.
The following deeper result will be proven later:
2.17 Theorem If V is a vector space that is complete w.r.t. each of the norms ‖·‖₁, ‖·‖₂ and ‖·‖₂ ≤ c‖·‖₁ for some c > 0, then also ‖·‖₁ ≤ c′‖·‖₂ for some c′ > 0, thus the two norms are equivalent.
3. Given a vector space V and some translation-invariant metric d on V , one proves as in
Lemma 2.11 that (V, +, 0), equipped with the topology τd , is a topological abelian group. But
scalar multiplication can fail to be continuous. This continuity holds if and only if d(cx, 0) → 0
as c → 0 for each x ∈ V .
4. There is a nice necessary and sufficient condition for metrizability of a TVS: It must
be Hausdorff and the zero element must have a countable base of open neighborhoods, cf. [64,
Theorem 1.24].
5. Metrizable TVS that are non-normable, while better behaved than general TVS, can still
be rather pathological. They do not have too many applications. 2
2.20 Remark The definition of completeness that we gave for normed spaces immediately
generalizes to metrizable TVS. Remarkably, completeness can actually be defined for arbitrary
TVS: A sequence {x_n}_{n∈N} in a TVS V is called a Cauchy sequence if for every open set U ⊆ V
containing 0 there is an n0 such that n, m ≥ n0 ⇒ xn − xm ∈ U . Cauchy nets are defined
analogously. (For the definition of nets see e.g. [47].) Now a TVS V is called complete if every
Cauchy net in V is convergent. It is easy to see that for a metrizable TVS the above notion of
completeness coincides with the metric one. 2
existence of z ∈ U ∩ V would imply the contradiction d(x, y) ≤ d(x, z) + d(z, y) < r/2 + r/2 = r = d(x, y). Thus U ∩ V = ∅, so that τ_F is Hausdorff.
2.24 Definition A topological vector space (V, τ ) over F is called locally convex if there exists
a separating family F of seminorms on V such that τ = τF .
2.25 Remark 1. Locally convex spaces were introduced in 1935 by John von Neumann (1903-
1957), to whom also von Neumann algebras, the theory of unbounded operators, the spectral
theorem and countless other discoveries in pure and applied mathematics are due.
2. For an equivalent, more geometric way of defining local convexity of a TVS see the
supplementary Section B.5.1, and for more on locally convex spaces see, e.g., [39, 12, 64]. 2
If the separating family F has just one element, we are back at the notion of a normed, possibly Banach, space. If F is finite, i.e. F = {‖·‖₁, . . . , ‖·‖ₙ}, then $\|\cdot\| = \sum_{i=1}^n \|\cdot\|_i$ is a seminorm, and it is a norm if and only if F is separating. Thus the case of finite F again gives a normed space, and F must be infinite in order for interesting things to happen.
If F is infinite, we cannot just put $\|x\| = \sum_{\|\cdot\|\in F} \|x\|$, since the r.h.s. has no reason to converge for all x ∈ V. But if the family F of seminorms on V is countable, we can label the elements of F as ‖·‖ₙ, n ∈ N, and define
$$d(x, y) = \sum_{n=1}^\infty 2^{-n} \min(1, \|x - y\|_n).$$
An important example is the Schwartz space: for f ∈ C∞(R) and n, m ∈ N₀ put $\|f\|_{n,m} = \sup_{x\in\mathbb R} |x^n f^{(m)}(x)|$, where f^{(m)} is the m-th derivative of f. These ‖·‖_{n,m} are seminorms. Now
$$\mathcal S(\mathbb R) = \{ f \in C^\infty(\mathbb R) \mid \|f\|_{n,m} < \infty \ \ \forall n, m \in \mathbb N_0 \}$$
is a Fréchet space when equipped with the topology defined by the family F = {‖·‖_{n,m} | n, m ∈ N₀}, which is countable. Its elements are called Schwartz⁸ functions. They are infinitely differentiable functions that, together with all their derivatives, vanish as |x| → ∞ faster than |x|^{−n} for any n. (This definition is easily generalized to functions of several variables.) Note that the seminorm ‖·‖_{0,0} alone already separates the elements of S, thus is a norm, but having the other seminorms around gives rise to a finer topology, one that is not normable.
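For a concrete example, the Gaussian f(x) = e^{−x²} is a Schwartz function: each derivative f^{(m)} is of the form p_m(x)e^{−x²} with p_m a polynomial, so ‖f‖_{n,m} = sup_x |x^n p_m(x)| e^{−x²} < ∞ for all n, m, the exponential decaying faster than any polynomial grows. By contrast, f(x) = (1 + x²)^{−1} is smooth with all ‖f‖_{0,m} finite, but ‖f‖_{3,0} = sup_x |x³|(1 + x²)^{−1} = ∞, so it does not lie in S.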
⁸ Laurent Schwartz (1915-2002). French mathematician who invented 'distributions', an important notion in functional analysis.
3 Normed and Banach space basics
3.1 Why we care about completeness. Closed subspaces
As you (should) know from topology, completeness of a metric space is convenient since it leads
to results that are not necessarily true without it, like Cantor’s intersection theorem and the
contraction principle (or Banach’s fixed point theorem). The same holds for normed spaces.
Here is one main reason:
3.1 Definition Let (V, ‖·‖) be a normed space and {x_n}_{n∈N} a sequence. The series $\sum_{n=1}^\infty x_n$ is said to be absolutely convergent if $\sum_{n=1}^\infty \|x_n\| < \infty$, and to converge to s ∈ V if the sequence $S_n = \sum_{k=1}^n x_k$ of partial sums converges to s.
3.2 Proposition Let (V, ‖·‖) be a normed F-vector space. Then the following are equivalent:⁹
(i) (V, ‖·‖) is complete, thus a Banach space.
(ii) Every absolutely convergent series $\sum_{n=1}^\infty x_n$ in V converges.
Under these assumptions, the sum satisfies $\|\sum_n x_n\| \le \sum_n \|x_n\|$.

Proof. ⇒ Assume V to be complete and $\sum_n x_n$ to be absolutely convergent. Let $S_n = \sum_{k=1}^n x_k$ and $T_n = \sum_{k=1}^n \|x_k\|$. For all n > m we have (by subadditivity of the norm)
$$\|S_n - S_m\| = \Big\|\sum_{k=m+1}^n x_k\Big\| \le \sum_{k=m+1}^n \|x_k\| = T_n - T_m.$$
Since the sequence {T_n} is convergent by assumption, thus Cauchy, the above implies that {S_n} is Cauchy, thus convergent by completeness of V. The subadditivity of the norm gives $\|\sum_{k=1}^n x_k\| \le \sum_{k=1}^n \|x_k\|$ for all n, and since the limit n → ∞ of both sides exists, we have the inequality.
⇐ Assume that every absolutely convergent series in V converges, and let {y_n}_{n∈N} be a Cauchy sequence in V. We can find (why?) a subsequence {z_k}_{k∈N} = {y_{n_k}} such that $\|z_k - z_{k-1}\| \le 2^{-k}$ for all k ≥ 2. Now put z₀ = 0 and define x_k = z_k − z_{k−1} for k ≥ 1. Now
$$\sum_{k=1}^\infty \|x_k\| = \sum_{k=1}^\infty \|z_k - z_{k-1}\| \le \|z_1\| + \sum_{k=2}^\infty 2^{-k} < \infty.$$
Thus $\sum_{k=1}^\infty x_k$ is absolutely convergent, and therefore convergent by the hypothesis. To wit, $\lim_{n\to\infty} S_n$ exists, where $S_n = \sum_{k=1}^n x_k = \sum_{k=1}^n (z_k - z_{k-1}) = z_n$. Thus $z = \lim_{k\to\infty} z_k = \lim_{k\to\infty} y_{n_k}$ exists. Now the sequence {y_n} is Cauchy and has a convergent subsequence {y_{n_k}}. This implies (why?) that the whole sequence {y_n} converges to the limit of the subsequence.
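To see that completeness cannot be dropped in (ii), consider the space of finitely supported functions f : N → R with the norm ‖f‖∞ = sup_n |f(n)|. The series $\sum_{n=1}^\infty 2^{-n}\delta_n$ (with δ_n the function that is 1 at n and 0 elsewhere) is absolutely convergent since $\sum_n \|2^{-n}\delta_n\|_\infty = \sum_n 2^{-n} < \infty$, but its partial sums converge uniformly to the function n ↦ 2^{−n}, which has infinite support; thus the series has no limit in the space.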
We will see various other reasons for the importance of completeness. In the next section, it will be used to prove that every finite dimensional linear subspace of a normed space is automatically closed. This is not at all true for infinite dimensional subspaces. For example, let $V = \ell^1(\mathbb N) = \{f : \mathbb N \to \mathbb R \mid \sum_{n=1}^\infty |f(n)| < \infty\}$. Now $W = \{f : \mathbb N \to \mathbb R \mid \#\{n \in \mathbb N \mid f(n) \ne 0\} < \infty\} \subsetneq V$ is an infinite dimensional proper linear subspace, but not closed: one easily checks that $\overline{W} = V$.
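Indeed, given f ∈ V and ε > 0, let f_n ∈ W agree with f on {1, . . . , n} and vanish elsewhere; then $\|f - f_n\|_1 = \sum_{k>n} |f(k)| \to 0$ since the series $\sum_k |f(k)|$ converges. Thus every element of V is a ‖·‖₁-limit of elements of W.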
Closedness and completeness of subsets of a metric space are related. We recall from topology
(if you haven’t seen this, prove it!):
⁹ Unfortunately, some authors write: "$\sum_n \|x_n\| < \infty$, thus $\sum_n x_n$ converges" without indicating that something needs to be proven here.
3.3 Lemma Let (X, d) be a metric space and Y ⊆ X. Then (instead of d|Y 10 I just write d)
(i) If (X, d) is complete and Y ⊆ X is closed (w.r.t. τd , of course) then (Y, d) is complete.
(ii) If (Y, d) is complete then Y ⊆ X is closed (whether or not (X, d) is complete).
The above should be compared with the fact that a closed subset of a compact space is
compact and that a compact subset of a Hausdorff space is closed. In the above, completeness
works as a weak substitute of compactness, an interpretation that is reinforced by the fact that
every compact metric space is complete.
This specializes immediately to normed spaces:
3.4 Lemma Let (V, k · k) be a normed space and W ⊆ V a linear subspace. Then
(i) If V is complete (=Banach) and W ⊆ V is closed then W is Banach.
(ii) If W is complete then W ⊆ V is closed (whether or not V is complete).
3.5 Definition A linear map A : V → W of normed spaces is called an isometry if kAxk = kxk
for all x ∈ V .
Recall that a linear map A : V → W is injective if and only if its kernel ker A = A−1 (0) is
{0}. It follows readily that an isometry is automatically injective. Furthermore, if A : V → W
is a surjective isometry then it is invertible, and its inverse also is an isometry. Then A is called
an isometric isomorphism of normed spaces.
I suppose as known that every metric space can be completed, i.e. embedded isometrically into a complete metric space (unique up to isometry) as a dense subspace.
3.7 Exercise Prove that the completion of a normed space again is a vector space, thus a
Banach space.
Later we will see an alternative construction of the completion.
kxi − xk → 0, kyi − yk → 0. Now k(xi , yi ) − (x, y)k → 0 for both sum and maximum norm.
Assume X1 ⊕ X2 is complete for the sum or maximum norm. By their equivalence, it then is
complete for the other. And if {xi } is Cauchy in X1 then {(xi , 0)} ⊆ X1 ⊕ X2 is Cauchy, thus
convergent to some (x, 0). Now it is clear that xi → x.
3.9 Exercise (i) Let {(V_i, ‖·‖_i)}_{i∈I} be a family of normed spaces, where I is any set. Put
$$\bigoplus_{i\in I} V_i = \Big\{ f \ \Big|\ \sum_{i\in I} \|f(i)\|_i < \infty \Big\}.$$
(Technically, this is a subset of $\prod_i V_i = \{ f : I \to \bigcup_j V_j \mid f(i) \in V_i\ \forall i \in I \}$.) Prove that this is a linear space and $\|f\| = \sum_i \|f(i)\|_i$ a norm on it.
(ii) Prove that $(\bigoplus_{i\in I} V_i, \|\cdot\|)$ is complete if all the V_i are complete. Hint: The proof is an adaptation of the one for ℓ¹(S) given in Section 4.3.
If V_i = F for all i ∈ I with the usual norm, we have $\bigoplus_{i\in I} V_i \cong \ell^1(I, \mathbb F)$.
3.10 Definition Let E, F be normed spaces and A : E → F a linear map. Then the norm ‖A‖ ∈ [0, ∞] is defined by
$$\|A\| = \sup_{0\ne e\in E} \frac{\|Ae\|}{\|e\|} = \sup_{e\in E,\ \|e\|=1} \|Ae\|.$$¹¹
(The equality of the second and third expressions is due to the linearity of A.) If ‖A‖ < ∞ then A is called bounded.
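As a simple illustration, consider E = ℓ¹(N, F) (cf. Section 4) and the linear functional φ(f) = $\sum_n f(n)/n$. Then |φ(f)| ≤ $\sum_n |f(n)|$ = ‖f‖₁, so ‖φ‖ ≤ 1, and testing on the element δ₁ (which has ‖δ₁‖₁ = 1 and φ(δ₁) = 1) gives ‖φ‖ = 1.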
3.11 Remark 1. 'Linear operator' is a synonym for linear map, but linear maps with values in the ground field F are called linear functionals.
2. While unbounded linear maps exist, cf. Exercise 3.15 below, in the unbounded case it
often is too restrictive to require them to be defined everywhere. See also Remark 8.10. 2
3.13 Lemma Let E, F be normed spaces and A : E → F a linear map. Then the following are
equivalent:
(i) A is bounded.
(ii) A is continuous (w.r.t. the norm topologies).
¹¹ It should be clear that writing $\sup_{e\in E,\ \|e\|\le 1} \|Ae\|$ instead would not change the result.
(iii) A is continuous at 0 ∈ E.
Proof. (i)⇒(ii) For x, y ∈ E we have ‖Ax − Ay‖ = ‖A(x − y)‖ ≤ ‖A‖ ‖x − y‖. Thus d(Ax, Ay) ≤ ‖A‖ d(x, y), and with ‖A‖ < ∞ we have (uniform) continuity of A.
(ii)⇒(iii) This is obvious.
(iii)⇒(i) B F (0, 1) ⊆ F is an open neighborhood of 0 ∈ F . Since A is continuous at 0, there
is an open neighborhood U ⊆ E of 0 ∈ E such that A(U ) ⊆ B F (0, 1). Since the balls form
bases of the topologies, there is C > 0 such that B E (0, C) ⊆ U , thus A(B E (0, C)) ⊆ B F (0, 1).
By linearity of A and the properties of the norm, this is equivalent to A(B E (0, 1)) ⊆ B F (0, D),
where D = 1/C. If 0 6= x ∈ E then
x
Ax = 2kxk A ,
2kxk
3.14 Exercise Let V, W be normed spaces, where V is finite dimensional. Prove that every
linear map V → W is bounded.
3.16 Exercise Let (V, k · k) be a normed F-vector space and ϕ : V → F a linear functional.
Prove that ϕ is continuous if and only if ker ϕ = ϕ−1 (0) ⊆ V is closed.
Hint: For ⇐, pick a ball B(x, r) ⊆ V \ ker ϕ and prove that ϕ(B(0, r)) is bounded.
3.17 Lemma Let V be a normed space, W ⊆ V a dense linear subspace, Z a Banach space and A : W → Z a bounded linear map. Then there is a unique linear map $\hat A : V \to Z$ with $\hat A|_W = A$ and $\|\hat A\| = \|A\|$. If A is an isometry, so is $\hat A$.

Proof. Let x ∈ V. Then there is a sequence {w_n} in W such that ‖w_n − x‖ → 0. Then {w_n} ⊆ W is a Cauchy sequence, and so is {Aw_n} ⊆ Z by boundedness of A. The latter converges since Z is complete. If {w′_n} is another sequence converging to x then ‖A(w_n − w′_n)‖ → 0, so that lim Aw′_n = lim Aw_n. It thus is well-defined to put $\hat A x = \lim_{n\to\infty} A w_n$. We omit the easy proof of linearity of $\hat A$. If x ∈ W then we can put w_n = x for all n, obtaining $\hat A x = Ax$, thus $\hat A|_W = A$. Finally, $\|\hat A x\| = \lim \|A w_n\| \le \|A\| \|x\|$. Thus $\|\hat A\| \le \|A\|$, and the converse inequality is obvious. If A is an isometry then $\|\hat A x\| = \lim_n \|A w_n\| = \lim_n \|w_n\| = \|x\|$, so that $\hat A$ is an isometry.
We recall the following from topology: If X is a set and τ1 , τ2 are topologies on X then
τ1 = τ2 holds if and only if idX : (X, τ1 ) → (X, τ2 ) is a homeomorphism, i.e. continuous with
continuous inverse.
Proof of Proposition 2.16. Equivalence of k · k1 , k · k2 means that the two norms give rise to the
same topology. By the above, this is equivalent to the maps idV : (V, k · k1 ) → (V, k · k2 ) and
idV : (V, k · k2 ) → (V, k · k1 ) being continuous. Now by Lemma 3.13, this is equivalent to the
existence of C, C′ such that ‖x‖₁ ≤ C‖x‖₂ and ‖x‖₂ ≤ C′‖x‖₁ holding for all x ∈ V.
3.18 Exercise Prove: If V is a vector space and k · k1 , k · k2 are equivalent norms on V then
completeness of V w.r.t. k · k1 is equivalent to completeness of V w.r.t. k · k2 .
3.19 Proposition On a finite dimensional vector space, all norms are equivalent.
Proof. Let F ∈ {R, C}. Let B = {e₁, . . . , e_d} be a basis for V, and define the Euclidean norm ‖·‖₂ of $x = \sum_i c_i e_i$ by $\|x\|_2 = (\sum_i |c_i|^2)^{1/2}$. Since equivalence of norms is an equivalence relation, it clearly is sufficient to show that any norm ‖·‖ is equivalent to ‖·‖₂. Using |c_i| ≤ ‖x‖₂ for all i and the properties of any norm, we have
$$\|x\| = \Big\|\sum_{i=1}^d c_i e_i\Big\| \le \sum_{i=1}^d |c_i|\, \|e_i\| \le \Big(\sum_{i=1}^d \|e_i\|\Big) \|x\|_2. \tag{3.1}$$
This implies that x ↦ ‖x‖ is continuous w.r.t. the topology on V defined by ‖·‖₂. The sphere S = {x ∈ V | ‖x‖₂ = 1} is closed and bounded, thus compact, which implies that there is z ∈ S such that $\lambda := \inf_{x\in S} \|x\| = \|z\|$. Since z ∈ S implies z ≠ 0 and ‖·‖ is a norm, we have λ = ‖z‖ > 0. Now, for x ≠ 0 we have $\frac{x}{\|x\|_2} \in S$, and thus
$$\|x\| = \|x\|_2\, \Big\|\frac{x}{\|x\|_2}\Big\| \ge \|x\|_2\, \lambda. \tag{3.2}$$
Combining (3.1), (3.2), we have c₁‖x‖₂ ≤ ‖x‖ ≤ c₂‖x‖₂ with $0 < c_1 = \inf_{x\in S} \|x\|$ and $c_2 = \sum_i \|e_i\|$. (Note that e_i ∈ S for all i, so that c₂ ≥ d·c₁, showing again why V must be finite dimensional.)
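In infinite dimensions the conclusion fails. For instance, on the space of finitely supported functions f : N → F the norms ‖f‖₁ = Σ_n |f(n)| and ‖f‖∞ = sup_n |f(n)| are not equivalent: for f_n = χ_{{1,...,n}} one has ‖f_n‖∞ = 1 but ‖f_n‖₁ = n, so no bound ‖·‖₁ ≤ c‖·‖∞ can hold.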
3.20 Remark In fact, one can prove the somewhat stronger result that on a finite dimensional
vector space there is precisely one topology making it a TVS. 2
3.21 Exercise Prove that every finite dimensional normed space over R or C is complete.
3.22 Exercise Prove that every finite dimensional subspace of a normed space is closed.
with M = max(‖T₁‖, . . . , ‖T_{n₀−1}‖, ‖T_{n₀}‖ + 1) we have ‖T_n‖ ≤ M for all n. If now x ∈ E then ‖(T_n − T_m)x‖ ≤ ‖T_n − T_m‖ ‖x‖, so that {T_n x} is a Cauchy sequence in F and therefore convergent by completeness of F. Now define T : E → F by Tx = lim_{n→∞} T_n x. It is straightforward to check that T is linear. Finally, since ‖T_n x‖ ≤ M‖x‖ for all n, we have ‖Tx‖ = lim_{n→∞} ‖T_n x‖ ≤ M‖x‖, so that T ∈ B(E, F).
3.27 Definition A normed F-algebra is an F-algebra A equipped with a norm ‖·‖ such that ‖ab‖ ≤ ‖a‖‖b‖ for all a, b ∈ A (submultiplicativity). A Banach algebra is a normed algebra that is complete (as a normed space). An algebra A is called unital if it has a unit 1 ≠ 0. (In fact, if A ≠ {0} then 1 = 0 would imply the contradiction ‖a‖ = ‖1a‖ ≤ ‖1‖‖a‖ = 0 for all a ∈ A.)
By the above, B(E) is a normed algebra for every normed space E, and by Proposition
3.24(ii), B(E) is a Banach algebra whenever E is a Banach space. There is another standard
class of examples:
3.29 Example Let X be a compact Hausdorff space and A = C(X, F). We already know that
A, equipped with the norm kf k = supx∈X |f (x)| is a Banach space. The pointwise product
(f g)(x) = f (x)g(x) of functions is bilinear, associative and clearly satisfies kf gk ≤ kf kkgk.
This makes (A, ·, k · k) a Banach algebra.
We will have more to say about Banach algebras later in the course.
Before we go on developing general theory, it seems instructive to study in some depth an
important class of spaces, the `p (S) spaces, where everything can be done very explicitly, in
particular the dual spaces can be determined.
• They provide a first encounter with the more general Lebesgue spaces Lp (X, A, µ) without
the measure and integration theoretic baggage needed for the latter.
• They can be studied quite completely and have their dual spaces identified.
• We will see that every Hilbert space is isomorphic to `2 (S, F) for some S.
For f : S → F and 0 < p < ∞ define
$$\|f\|_p = \Big(\sum_{s\in S} |f(s)|^p\Big)^{1/p}, \qquad \|f\|_\infty = \sup_{s\in S} |f(s)|,$$
where $\infty^{1/p} = \infty$ and we use the notion of unordered sums, cf. Appendix A.1. Now for all p ∈ (0, ∞] put
$$\ell^p(S, \mathbb F) := \{ f : S \to \mathbb F \mid \|f\|_p < \infty \}.$$
4.3 Lemma (i) (ℓ^p(S, F), ‖·‖_p) are vector spaces for all p ∈ [1, ∞].
(ii) ‖·‖₁ and ‖·‖∞ are norms.
(iii) If f ∈ ℓ¹(S, F) and g ∈ ℓ^∞(S, F) then
$$\Big|\sum_{s\in S} f(s)g(s)\Big| \le \|fg\|_1 \le \|f\|_1 \|g\|_\infty.$$

Proof. (i) Together with the obvious fact that each ℓ^p(S, F) is stable under scalar multiplication, (ii) implies that ℓ^p(S, F) is a vector space for p ∈ {1, ∞}. For 1 < p < ∞, with the useful inequality
$$|a + b|^p \le (|a| + |b|)^p \le (2\max(|a|, |b|))^p = 2^p \max(|a|^p, |b|^p) \le 2^p (|a|^p + |b|^p) \tag{4.1}$$
and f, g ∈ ℓ^p(S, F) we have
$$\|f + g\|_p^p = \sum_s |f(s) + g(s)|^p \le 2^p \sum_s \big(|f(s)|^p + |g(s)|^p\big) = 2^p (\|f\|_p^p + \|g\|_p^p) < \infty.$$
(iii) $\big|\sum_{s\in S} f(s)g(s)\big| \le \sum_{s\in S} |f(s)||g(s)| \le \|g\|_\infty \sum_{s\in S} |f(s)| = \|g\|_\infty \|f\|_1$.
In order to obtain analogues of (ii), (iii) for 1 < p < ∞, define q ∈ (1, ∞) by $\frac1p + \frac1q = 1$. (This is equivalent to pq = p + q, which often is useful.) Whenever p, q appear together they are supposed to be a conjugate or dual pair in this sense. We extend this in a natural way by declaring (1, ∞) and (∞, 1) to be conjugate pairs.
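For instance, p = 3 has conjugate exponent q = 3/2 (indeed 1/3 + 2/3 = 1 and pq = 9/2 = p + q), and p = 2 is the only self-conjugate exponent, which is why the case p = q = 2 below is special.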
4.4 Proposition Let 1 < p < ∞ and q conjugate to p, i.e. $\frac1p + \frac1q = 1$. Then
(i) For all f, g : S → F we have ‖fg‖₁ ≤ ‖f‖_p ‖g‖_q. (Inequality of Hölder¹² (1889))
(ii) For all f, g : S → F we have ‖f + g‖_p ≤ ‖f‖_p + ‖g‖_p. (Inequality of Minkowski¹³ (1896))

Proof. (i) The inequality is trivially true if ‖f‖_p or ‖g‖_q is zero or infinite. Thus we assume ‖f‖_p, ‖g‖_q to be finite and non-zero. The exponential function R → R, x ↦ e^x is convex¹⁴, so that with $\frac1p + \frac1q = 1$ we have
$$e^{a/p}\, e^{b/q} = \exp\Big(\frac{a}{p} + \frac{b}{q}\Big) \le \frac{e^a}{p} + \frac{e^b}{q} \qquad \forall a, b \in \mathbb R.$$
where the second ≤ comes from Hölder's inequality applied to |f| ∈ ℓ^p and |f + g|^{p−1} ∈ ℓ^q and also to |g| ∈ ℓ^p and |f + g|^{p−1} ∈ ℓ^q. If ‖f + g‖_p ≠ 0, we can divide by $\|f + g\|_p^{p/q}$ and with $p − p/q = p(1 − \frac1q) = p\cdot\frac1p = 1$ we obtain
¹² Otto Hölder (1859-1937). German mathematician. Important contributions to analysis and algebra.
¹³ Hermann Minkowski (1864-1909). German mathematician. Contributions to number theory, relativity and other fields.
¹⁴ f : [a, b] → R is convex if f(tx + (1 − t)y) ≤ tf(x) + (1 − t)f(y) for all x, y ∈ [a, b] and t ∈ [0, 1], and strictly convex if the inequality is strict whenever x ≠ y, 0 < t < 1. See, e.g., [23, Vol. 1, Section 7.2].
For p = q = 2, the inequality of Hölder is known as the Cauchy-Schwarz inequality. We will also call the trivial inequalities of Lemma 4.3 for {p, q} = {1, ∞} Hölder and Minkowski inequalities. Now the analogue of Lemma 4.3 for 1 < p < ∞ is clear:
$$\Big|\sum_{s\in S} f(s)g(s)\Big| \le \|fg\|_1 \le \|f\|_p \|g\|_q.$$
since 1/p > 1. Thus ‖·‖_p is not subadditive and therefore not a norm.
(ii) It is clear that f ∈ ℓ^p(S, F) implies cf ∈ ℓ^p(S, F) for all c ∈ F. For a, b ≥ 0 we have (a + b)^p ≤ (2 max(a, b))^p ≤ 2^p(a^p + b^p), whence the inequality
$$\|f + g\|_p^p = \sum_{s\in S} |f(s) + g(s)|^p \le \sum_{s\in S} (|f(s)| + |g(s)|)^p \le 2^p \sum_{s\in S} \big(|f(s)|^p + |g(s)|^p\big) = 2^p (\|f\|_p^p + \|g\|_p^p),$$
Believing this for a minute, we have
$$d_p(f, h) = d_p(f - h, 0) = \sum_s |f(s) - h(s)|^p \le \sum_s \big(|f(s) - g(s)| + |g(s) - h(s)|\big)^p$$
$$\le \sum_s \big(|f(s) - g(s)|^p + |g(s) - h(s)|^p\big) = d_p(f - g, 0) + d_p(g - h, 0) = d_p(f, g) + d_p(g, h),$$
as wanted, where we first used the triangle inequality and then the claim.
Turning to our claim (a + b)^p ≤ a^p + b^p, it is clear that this holds if a = 0. For a = 1 it reduces to (1 + b)^p ≤ 1 + b^p for all b ≥ 0. For b = 0 this is true, and for all b > 0 it follows from the fact that
$$\frac{d}{db}\big(1 + b^p - (1 + b)^p\big) = p\big(b^{p-1} - (b + 1)^{p-1}\big) > 0$$
due to p − 1 < 0. If now a > 0 then $(a + b)^p = a^p (1 + b/a)^p \le a^p (1 + (b/a)^p) = a^p + b^p$.
The above does not amount to a proof that the topological vector spaces (`p (S), dp ) with
0 < p < 1 are not normable, but this can be done, cf. Section B.5.1. (In fact they are not even
locally convex.) This leads to strange behavior; for example the dual space `p (S)∗ is unexpected,
cf. [47]. This strangeness is even more pronounced for the continuous versions Lp (X, A, µ): For
X = [0, 1] equipped with Lebesgue measure, one has Lp (X, A, µ)∗ = {0}, which cannot happen
for a non-zero Banach (or locally convex) space due to the Hahn-Banach theorem.
which is a metric in all cases. For a function f : S → F we define supp f = {s ∈ S | f(s) ≠ 0}. We write c₀₀(S, F) for the set of f : S → F with finite support, and c₀(S, F) for the set of f such that {s ∈ S | |f(s)| ≥ ε} is finite for every ε > 0.
4.8 Lemma If 0 < p ≤ q < ∞, we have
(i) c₀₀(S, F) ⊆ ℓ^p(S, F) ⊆ ℓ^q(S, F) ⊆ c₀(S, F) ⊆ ℓ^∞(S, F),
(ii) if ‖f‖_p < 1 then $\|f\|_q \le \|f\|_p^{p/q}$; thus ‖f‖_p → 0 ⇒ ‖f‖_q → 0, so that all inclusion maps ℓ^p(S) ↪ ℓ^q(S) for p ≤ q are continuous.

Proof. (i) If f ∈ c₀₀(S, F) then clearly ‖f‖_p < ∞ for all p ∈ (0, ∞]. And f ∈ c₀(S, F) implies boundedness of f. This gives the first and last inclusion.
If f ∈ ℓ^p(S, F) with p ∈ (0, ∞) then finiteness of $\sum_{s\in S} |f(s)|^p$ implies that {s ∈ S | |f(s)| ≥ ε} is finite for each ε > 0, thus f ∈ c₀(S, F). In particular F = {s ∈ S | |f(s)| ≥ 1} is finite. If now 0 < p < q < ∞ then
$$\|f\|_q^q - \sum_{s\in F} |f(s)|^q = \sum_{s\in S\setminus F} |f(s)|^q = \sum_{s\in S\setminus F} |f(s)|^{p\cdot\frac qp} \le \sum_{s\in S\setminus F} |f(s)|^p \le \|f\|_p^p < \infty, \tag{4.3}$$
since q/p > 1 and |f(s)| < 1, thus $|f(s)|^q \le |f(s)|^p$, for all s ∈ S\F. With the finiteness of $\sum_{s\in F} |f(s)|^q$ this implies $\sum_{s\in S} |f(s)|^q < \infty$, thus f ∈ ℓ^q(S, F).
(ii) It suffices to observe that ‖f‖_p < 1 implies that the set F in part (i) of the proof is empty, so that (4.3) reduces to $\|f\|_q^q \le \|f\|_p^p$, thus $\|f\|_q \le \|f\|_p^{p/q}$.
4.9 Exercise Let S be an infinite set and 0 < p < q < ∞. Prove that all inclusions in Lemma
4.8(i) are strict.
4.10 Lemma Let p ∈ (0, ∞] and dp (x, y) = kx − ykp . Then (`p (S, F), dp ) is complete for every
set S and F ∈ {R, C}.
Proof. Let {fn } ⊆ `p (S, F) be a Cauchy sequence w.r.t. dp , thus also w.r.t. k · kp . Then
|fn (s) − fm (s)| ≤ kfn − fm kp , so that {fn (s)} is a Cauchy sequence in F, thus convergent for
each s ∈ S. Defining g(s) = limn fn (s), it remains to prove g ∈ `p (S, F) and dp (fn , g) → 0.
For p = ∞ and ε > 0 we can find n0 such that n, m ≥ n0 implies kfn − fm k∞ < ε, which
readily gives kfm k∞ ≤ kfn0 k∞ + ε for all m ≥ n0 . Thus also kgk∞ ≤ kfn0 k∞ + ε < ∞. Taking
m → ∞ in sup_s |f_n(s) − f_m(s)| < ε gives sup_s |f_n(s) − g(s)| ≤ ε, whence ‖f_n − g‖∞ → 0.
For 0 < p < ∞ we give a uniform argument. Since {fn } is Cauchy w.r.t. dp , for ε > 0 we can
find n0 such that n, m ≥ n0 implies dp (fn , fm ) < ε. In particular dp (fm , fn0 ) < ε for all m ≥ n0 ,
thus also dp (g, fn0 ) ≤ ε, thus g ∈ `p (S, F). Applying the dominated convergence theorem (in
the simple case of an infinite sum rather than a general integral, cf. Proposition A.3) to take
m → ∞ in dp (fn , fm ) < ε gives d(fn , g) ≤ ε, whence d(fn , g) → 0.
4.11 Lemma (i) With respect to ‖·‖∞, the closure of c₀₀(S, F) in ℓ^∞(S, F) is c₀(S, F); in particular c₀₀(S, F) ⊆ c₀(S, F) is dense.
(ii) c₀(S, F) is complete, thus a Banach space.

Proof. (i) If f ∈ c₀(S, F) and ε > 0 then F = {s ∈ S | |f(s)| ≥ ε} is finite. Now g = fχ_F is in c₀₀(S, F) and ‖f − g‖∞ < ε, proving $f \in \overline{c_{00}(S, \mathbb F)}^{\,\|\cdot\|_\infty}$. And $f \in \overline{c_{00}(S, \mathbb F)}^{\,\|\cdot\|_\infty}$ means that for each ε > 0 there is a g ∈ c₀₀(S, F) with ‖f − g‖∞ < ε. But this means |f(s)| < ε for all s ∈ S\F, where F = supp(g) is finite. Thus f ∈ c₀(S, F).
(ii) Being the closure of c₀₀(S, F) in ℓ^∞(S, F), c₀(S, F) is closed, thus complete by completeness of ℓ^∞(S, F), cf. Lemmas 4.10 and 3.4.
4.12 Remark Note that `∞ (S, F) is a commutative algebra under pointwise multiplication
of functions. (In fact, `∞ (S, F) = Cb (S, F) if we equip S with the discrete topology.) And
c0 (S, F) ⊆ `∞ (S, F) is an ideal. 2
While the finitely supported functions are not dense in ℓ^∞(S, F) (for infinite S), the finite-image functions are:

4.13 Lemma The set {f : S → F | #f(S) < ∞} of functions assuming only finitely many values, equivalently, the set of finite linear combinations $\sum_{k=1}^K c_k \chi_{A_k}$ of characteristic functions, is dense in ℓ^∞(S, F).

Proof. We prove this for F = R, from which the case F = C is easily deduced. Let f ∈ ℓ^∞(S, F) and ε > 0. For k ∈ Z define $A_k = f^{-1}([k\varepsilon, (k+1)\varepsilon))$. Define $K = \lceil \frac{\|f\|_\infty}{\varepsilon} \rceil + 1$ and $g = \varepsilon \sum_{|k|\le K} k\, \chi_{A_k}$. Then g has finite image and ‖f − g‖∞ < ε.
4.5 Dual spaces of `p (S, F), 1 ≤ p < ∞, and c0 (S, F)
If (V, k · k) is a normed vector space over F and ϕ : V → F is a linear functional, Definition 3.10
specializes to
$$\|\varphi\| = \sup_{0\ne x\in V} \frac{|\varphi(x)|}{\|x\|} = \sup_{x\in V,\ \|x\|\le 1} |\varphi(x)|.$$
Recall that the dual space V ∗ = {ϕ : V → F linear | kϕk < ∞} is a Banach space with norm
kϕk. The aim of this section is to concretely identify `p (S, F)∗ for 1 ≤ p < ∞ and c0 (S, F)∗ .
(We will have something to say about `∞ (S, F)∗ , but the complete story would lead us too far.)
For the purpose of the following proof, it will be useful to define sgn : C → C by sgn(0) = 0 and sgn(z) = z/|z| otherwise. Then z = sgn(z)|z| and $|z| = \overline{\mathrm{sgn}(z)}\, z$ for all z ∈ C.
4.16 Theorem (i) Let p ∈ [1, ∞] with conjugate value q. Then for each g ∈ ℓ^q(S, F) the map $\varphi_g : \ell^p(S, \mathbb F) \to \mathbb F,\ f \mapsto \sum_{s\in S} f(s)g(s)$ satisfies ‖φ_g‖ ≤ ‖g‖_q, thus φ_g ∈ ℓ^p(S, F)*. And the map ι : ℓ^q(S, F) → ℓ^p(S, F)*, g ↦ φ_g, called the canonical map, is linear with ‖ι‖ ≤ 1.
(ii) For all 1 ≤ p ≤ ∞ the canonical map ℓ^q(S, F) → ℓ^p(S, F)* is isometric.
(iii) If 1 ≤ p < ∞, the canonical map ℓ^q(S, F) → ℓ^p(S, F)* is surjective, thus ℓ^p(S, F)* ≅ ℓ^q(S, F).
(iv) The canonical map ℓ¹(S, F) → c₀(S, F)* is an isometric bijection, thus c₀(S, F)* ≅ ℓ¹(S, F).
(v) If S is finite, the canonical map ℓ¹(S, F) → ℓ^∞(S, F)* is surjective. If S is infinite, its image is a proper closed subspace of ℓ^∞(S, F)*.
Proof. (i) For all p ∈ [1, ∞] and conjugate q we have
$$\Big|\sum_s f(s)g(s)\Big| \le \sum_{s\in S} |f(s)g(s)| \le \|f\|_p \|g\|_q < \infty \qquad \forall f \in \ell^p,\ g \in \ell^q$$
by Hölder's inequality. In either case, the absolute convergence for all f, g implies that $(f, g) \mapsto \sum_s f(s)g(s)$ is bilinear.
(ii) If ‖g‖∞ ≠ 0 and ε > 0 there is an s ∈ S with |g(s)| > ‖g‖∞ − ε. If f = δ_s : t ↦ δ_{s,t}, we have |φ_g(f)| = |g(s)| > ‖g‖∞ − ε. Since ‖f‖₁ = 1, this proves ‖φ_g‖ > ‖g‖∞ − ε. Since ε > 0 was arbitrary, we have ‖φ_g‖ ≥ ‖g‖∞.
If ‖g‖₁ ≠ 0, define $f(s) = \overline{\mathrm{sgn}(g(s))}$. Then ‖f‖∞ = 1 and $\sum_s f(s)g(s) = \sum_s |g(s)| = \|g\|_1$. This proves ‖φ_g‖ ≥ ‖g‖₁.
If 1 < p, q < ∞ and ‖g‖_q ≠ 0, define $f(s) = \overline{\mathrm{sgn}(g(s))}\, |g(s)|^{q-1}$. Then
$$\sum_s f(s)g(s) = \sum_s |g(s)|^q = \|g\|_q^q, \qquad \|f\|_p^p = \sum_s |f(s)|^p = \sum_{s,\ g(s)\ne 0} |g(s)|^{(q-1)p} = \sum_s |g(s)|^q = \|g\|_q^q,$$
$$\|\varphi_g\| \ge \frac{|\sum_s f(s)g(s)|}{\|f\|_p} = \frac{\|g\|_q^q}{\|f\|_p} = \frac{\|g\|_q^q}{\|g\|_q^{q/p}} = \|g\|_q^{q(1-1/p)} = \|g\|_q.$$
We thus have proven ‖φ_g‖ ≥ ‖g‖_q in all cases and since the opposite inequality is known from (i), g ↦ φ_g is isometric.
(iii) Let 0 ≠ φ ∈ ℓ¹(S, F)*. Define g : S → F by g(s) = φ(δ_s). With ‖δ_s‖₁ = 1, we have |g(s)| = |φ(δ_s)| ≤ ‖φ‖ for all s ∈ S, thus ‖g‖∞ ≤ ‖φ‖. If f ∈ ℓ¹(S, F) and F ⊆ S is finite, we have $\varphi(f\chi_F) = \varphi\big(\sum_{s\in F} f(s)\delta_s\big) = \sum_{s\in F} f(s)g(s)$. In the limit F ↗ S this becomes $\varphi(f) = \sum_{s\in S} f(s)g(s) = \varphi_g(f)$ (since fg ∈ ℓ¹, thus the r.h.s. is absolutely convergent, and ‖f(1 − χ_F)‖₁ → 0 and φ is ‖·‖₁-continuous). This proves φ = φ_g with g ∈ ℓ^∞(S, F).
Now let 1 < p, q < ∞, and let 0 ≠ φ ∈ ℓ^p(S, F)*. Since ℓ¹(S, F) ⊆ ℓ^p(S, F) by Lemma 4.8, we can restrict φ to ℓ¹(S, F), and the preceding argument gives a g ∈ ℓ^∞(S, F) such that $\varphi(f) = \sum_{s\in S} f(s)g(s)$ for all f ∈ ℓ¹(S, F). The arguments in the proof of (ii) also show that for 1 < p, q < ∞ and any function g : S → F we have
$$\|g\|_q = \sup\Big\{ \Big|\sum_{s\in S} f(s)g(s)\Big| \ \Big|\ f \in c_{00}(S, \mathbb F),\ \|f\|_p \le 1 \Big\}.$$
Using this and $\varphi(f) = \sum_s f(s)g(s)$ for all f ∈ c₀₀(S, F) we have ‖g‖_q ≤ ‖φ‖ < ∞, thus g ∈ ℓ^q(S, F).
Now $\varphi(f) = \sum_{s\in S} f(s)g(s) = \varphi_g(f)$ for all f ∈ ℓ^p(S, F) follows as before from fg ∈ ℓ¹ and ‖f(1 − χ_F)‖_p → 0 as F ↗ S and the ‖·‖_p-continuity of φ.
(iv) Let 0 ≠ g ∈ ℓ¹(S, F). Then φ_g ∈ ℓ^∞(S, F)*, which we can restrict to c₀(S, F). For finite F ⊆ S define f_F = fχ_F with $f(s) = \overline{\mathrm{sgn}(g(s))}$. Then f_F ∈ c₀₀(S, F) with ‖f_F‖∞ = 1 (provided F ∩ supp g ≠ ∅) and $\varphi_g(f_F) = \sum_{s\in F} |g(s)|$. Thus $\|\varphi_g\| \ge \sum_{s\in F} |g(s)|$ for all finite F intersecting supp g, and this implies ‖φ_g‖ ≥ ‖g‖₁. The opposite being known, we have proven that ℓ¹(S, F) → c₀(S, F)* is isometric.
To prove surjectivity, let 0 ≠ φ ∈ c₀(S, F)* and define g : S → F, s ↦ φ(δ_s). If now f ∈ c₀(S, F) and F ⊆ S is finite, we have $f\chi_F = \sum_{s\in F} f(s)\delta_s$, thus $\varphi(f\chi_F) = \sum_{s\in F} f(s)g(s)$. In particular with $f(s) = \overline{\mathrm{sgn}(g(s))}$ we have $\varphi(f\chi_F) = \sum_{s\in F} f(s)g(s) = \sum_{s\in F} |g(s)|$. Again we have ‖fχ_F‖∞ ≤ ‖f‖∞ = 1, thus |φ(fχ_F)| ≤ ‖φ‖, and combining these observations gives ‖g‖₁ ≤ ‖φ‖ < ∞, thus g ∈ ℓ¹(S, F). As F ↗ S, we have ‖f(1 − χ_F)‖∞ = ‖fχ_{S\F}‖∞ → 0 since f ∈ c₀, thus with ‖·‖∞-continuity of φ
$$\varphi(f) = \lim_{F\nearrow S} \varphi(f\chi_F) = \lim_{F\nearrow S} \sum_{s\in F} f(s)g(s) = \sum_{s\in S} f(s)g(s) = \varphi_g(f),$$
where we again used fg ∈ ℓ¹. Thus φ = φ_g, so that ℓ¹(S, F) → c₀(S, F)* is an isometric bijection.
(v) It is clear that ι : ℓ¹(S, F) → ℓ^∞(S, F)* is surjective if S is finite. Closedness of the image of ι always follows from the completeness of ℓ¹(S, F) and the fact that ι is an isometry, cf. Corollary 3.6. The failure of surjectivity is deeper than the results of this section so far, so that it is illuminating to give two proofs.
First proof: If S is infinite, the closed subspace c₀(S, F) ⊆ ℓ^∞(S, F) is proper since 1 ∈ ℓ^∞(S, F)\c₀(S, F). Thus the quotient space Z = ℓ^∞(S, F)/c₀(S, F) is non-trivial. In Section 6
we will show that Z is a Banach space, thus admits non-zero bounded linear maps ψ : Z → F by
the Hahn-Banach theorem (Section 7), and that the quotient map p : `∞ (S, F) → Z is bounded.
Thus ϕ = ψ ◦ p is a non-zero bounded linear functional on `∞ (S, F) that vanishes on the closed
subspace c0 (S, F). By (iv), the canonical map `1 (S, F) → c0 (S, F)∗ is isometric, thus ϕg with
g ∈ `1 (S, F) vanishes identically on c0 (S, F) if and only if g = 0. Thus ϕ 6= ϕg for all g ∈ `1 (S, F).
Second proof: (This proof uses no unproven results from functional analysis, but the Stone-Čech compactification from general topology. Cf. Appendix A.3 and [47].) Since S is discrete, ℓ^∞(S, F) = C_b(S, F) ≅ C(βS, F), where βS is the Stone-Čech compactification of S. The isomorphism is given by the unique continuous extension $C_b(S, \mathbb F) \to C(\beta S, \mathbb F),\ f \mapsto \hat f$, with the restriction map C(βS, F) → C_b(S, F) as inverse. Since S is discrete and infinite, thus non-compact, βS ≠ S. If f ∈ c₀(S, F) then $\hat f(x) = 0$ for every x ∈ βS\S. (Proof: Let x ∈ βS\S. Since $\overline S = \beta S$, we can find a net {x_ι} in S such that x_ι → x. Since x ∉ S, the net x_ι eventually leaves every finite subset of S. Now f ∈ c₀(S) and continuity of $\hat f$ imply $\hat f(x) = \lim \hat f(x_\iota) = \lim f(x_\iota) = 0$.) Thus for such an x, the evaluation map $\psi_x : C(\beta S, \mathbb F) \to \mathbb F,\ \hat f \mapsto \hat f(x)$ gives rise to a non-zero bounded linear functional (in fact character) $\varphi(f) = \hat f(x)$ on C_b(S, F) = ℓ^∞(S, F) that vanishes on c₀(S, F). Now we conclude as in the first proof that φ ≠ φ_g for all g ∈ ℓ¹(S, F).
4.17 Remark 1. The two proofs of non-surjectivity of the canonical map `1 (S, F) → `∞ (S, F)∗
for infinite S given above are both very non-constructive: The first used the Hahn-Banach
theorem, which we will prove using Zorn’s lemma, equivalent to AC. The second used the
Stone-Čech compactification βS whose usual construction relies on Tychonov’s theorem, which
is equivalent to the axiom of choice. (But here we only need the restriction of Tychonov’s
theorem to Hausdorff spaces, which is equivalent to the ‘ultrafilter lemma’, which is strictly
weaker than AC. Also Hahn-Banach can be proven using only the ultrafilter lemma. See [47].)
2. In fact, the two proofs are essentially the same. The second proof implicitly uses the
fact that `∞ (S) and c0 (S) are algebras, so that we can consider characters instead of all linear
functionals. Now, the characters on `∞ (S) = Cb (S) = C(βS) correspond bijectively to the
points of βS, and those that vanish on c0 (S) correspond to βS\S. The first construction is more
functional analytic, involving the Banach space quotient Cb (S)/c0 (S) and general functionals
instead of characters.
3. The dual space of `∞ (S, F) can be determined quite explicitly, but it is not a space of
functions on S as are the spaces c0 (S, F)∗ and `p (S, F)∗ for p < ∞. It is the space ba(S, F) of
‘finitely additive F-valued measures on S’. Going into this would lead us too far, but see the
supplementary Section B.2.
4. There are set theoretic frameworks without AC (but with DCω ) in which `∞ (N)∗ ∼ = `1 (N),
see [71, §23.10]. (In this situation, all finitely additive measures, cf. Section B.2, on N are
countably additive!)
5. For all p ∈ (0, 1), the dual space `p (S, F)∗ equals {ϕg | g ∈ `∞ (S, F)} = `1 (S, F)∗ . See [47,
Appendix F.6]. Thus there is no p-dependence despite the fact that the `p (S, F) are mutually
non-isomorphic! 2
Now $\mathcal L^p(X, \mu) = \{f : X \to \mathbb F \text{ measurable} \mid \|f\|_p < \infty\}$ is an F-vector space for all p ∈ (0, ∞]. For 1 ≤ p ≤ ∞, the proofs of the inequalities of Hölder and Minkowski extend to the present setting without any difficulties, so that the ‖·‖_p are seminorms on $\mathcal L^p(X, \mathcal A, \mu)$. But the latter fails to be a norm whenever there exists ∅ ≠ Y ∈ A with µ(Y) = 0 since then ‖χ_Y‖_p = 0.
¹⁵ Warning: [11] defines ‖·‖∞ using locally null sets instead of null sets, which is very non-standard.
For this reason we define $L^p(X, \mathcal A, \mu) = \mathcal L^p(X, \mathcal A, \mu)/\{f \mid \|f\|_p = 0\}$. Now it is straightforward to prove that $L^p(X, \mu) = \mathcal L^p(X, \mu)/\!\sim$ is a normed space, and in fact complete. The proof now uses Proposition 3.2. If S is a set and µ is the counting measure, we have $\ell^p(S, \mathbb F) = \mathcal L^p(S, P(S), \mu; \mathbb F) = L^p(S, P(S), \mu; \mathbb F)$.
A measurable function is called simple if it assumes only finitely many values. Equivalently it is of the form $f(x) = \sum_{k=1}^K c_k \chi_{A_k}(x)$, where A₁, . . . , A_K are measurable sets. Now one proves that the simple functions are dense in L^p for all p ∈ [1, ∞]. If X is locally compact and µ is nice enough, the set C_c(X, F) of compactly supported continuous functions is dense in L^p(X, A, µ; F) for 1 ≤ p < ∞, while its closure in L^∞ is C₀(X, F).
The inclusion `p ⊆ `q for p ≤ q (Lemma 4.8) is false for general measure spaces! In fact,
if µ(X) < ∞ then one has the reverse inclusion p ≤ q ⇒ Lq (X, A, µ) ⊆ Lp (X, A, µ), while for
general measure spaces there is no inclusion relation between the Lp with different p.
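For instance, on X = (0, 1] with Lebesgue measure the function f(x) = x^{−1/2} lies in L¹ but not in L², illustrating that the inclusion L² ⊆ L¹ (valid since the measure is finite) is strict; on (1, ∞), on the other hand, x ↦ x^{−1} lies in L² but not in L¹, so for infinite measure neither inclusion holds in general.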
If 1 < p, q < ∞ are conjugate, the canonical map $L^q(X, \mathcal A, \mu) \to L^p(X, \mathcal A, \mu)^*$, g ↦ φ_g, is an isometric bijection for all measure spaces. That it is an isometry is proven just as for the spaces ℓ^p: Hölder's inequality gives ‖φ_g‖ ≤ ‖g‖_q, and equality is proven as in Theorem 4.16(ii) by showing |φ_g(f)| ≥ ‖f‖_p ‖g‖_q, where the f ∈ L^p are the same as before. However, isometry of $L^\infty(X, \mathcal A, \mu) \to (L^1(X, \mathcal A, \mu))^*$ is not automatic, as the measure space X = {x}, A = P(X) = {∅, X} and µ : ∅ ↦ 0, X ↦ +∞ shows, for which $L^1(X, \mathcal A, \mu; \mathbb F) \cong \{0\}$ and $\mathbb F \cong L^\infty(X, \mathcal A, \mu; \mathbb F) \not\cong L^1(X, \mathcal A, \mu; \mathbb F)^*$. It is not hard to show that L^∞ → (L¹)* is isometric if and only if (X, A, µ) is semifinite, i.e.
µ(Y ) = sup{µ(Z) | Z ∈ A, Z ⊆ Y, µ(Z) < ∞} ∀Y ∈ A.
If 1 < p < ∞, one still has surjectivity of Lp → (Lq )∗ for all measure spaces (X, A, µ), but the
standard proof is outside our scope since it requires the Radon-Nikodym theorem. (For a more
functional-analytic proof see Section B.6.) In order for L∞ → (L1 )∗ to be an isometric bijection,
the measure space must be ‘localizable’, cf. [69]. This condition subsumes semifiniteness and is
implied by σ-finiteness, to which case many books limit themselves.
Since we relegated the dual spaces `∞ (S, F)∗ to an appendix, we only remark that also
in general L∞ (X, A, µ)∗ is a space of finitely additive measures with fairly similar proofs, see
[17]. For 0 < p < 1, the dual spaces (Lp )∗ behave even stranger than (`p )∗ . For example,
Lp ([0, 1], λ; R)∗ = {0}.
5.1 Definition Let V be an F-vector space. An inner product on V is a map V × V →
F, (x, y) 7→ hx, yi such that
• The map x ↦ ⟨x, y⟩ is linear for each choice of y ∈ V.
• ⟨y, x⟩ = $\overline{\langle x, y\rangle}$ for all x, y ∈ V.
• ⟨x, x⟩ ≥ 0 for all x, and ⟨x, x⟩ = 0 ⇒ x = 0.
5.2 Remark 1. Many authors write (x, y) instead of hx, yi, but this leads to confusion with
the notation for ordered pairs. We will use pointed brackets throughout.
2. If F = R, the complex conjugation has no effect and can be omitted. Then h·, ·i is
symmetric.
3. Combining the first two axioms one finds that the map y ↦ ⟨x, y⟩ is anti-linear for each choice of x. This means $\langle x, cy + c'y'\rangle = \bar c\,\langle x, y\rangle + \overline{c'}\,\langle x, y'\rangle$ for all y, y′ ∈ V and c, c′ ∈ F. Of course this reduces to linearity if F = R. A map V × V → C that is linear in one variable and anti-linear in the other is called sesquilinear.
4. A large minority of authors, mostly (mathematical) physicists, defines inner products to
be linear in the second and anti-linear in the first argument. We follow the majority use like
[42].
5. The first two axioms together already imply hx, xi ∈ R for all x, but not the positivity
assumption.
6. If hx, yi = 0 for all y ∈ V then x = 0. To see this, it suffices to take y = x. 2
5.3 Example 1. If V = Cn then hx, yi = Σ_{i=1}^n xi ȳi is an inner product and the corresponding
norm (see below) is k · k2 , which is complete.
2. Let S be any set and V = `2 (S, C). Then hf, gi = Σ_{s∈S} f (s)ḡ(s) converges for all f, g ∈ V
by Hölder's inequality and is easily seen to be an inner product. Of course, 1. is a special case
of 2.
3. If (X, A, µ) is any measure space then hf, gi = ∫_X f (x)ḡ(x) dµ(x) is an inner product on
L2 (X, A, µ; F) turning it into a Hilbert space. (Here we allow ourselves a standard sloppiness:
The elements of Lp are not functions, but equivalence classes of functions. The inner product
of two such classes is defined by picking arbitrary representatives.)
4. Let V = Mn×n (C). For a, b ∈ V , define ha, bi = Tr(b∗ a) = Σ_{i,j=1}^n aij b̄ij , where (b∗ )ij = b̄ji .
That this is an inner product turning V into a Hilbert space follows from 1. upon the
identification Mn×n (C) ∼= C^{n²}.
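For Example 5.3.4 it is easy to check numerically that the trace inner product coincides with the standard inner product of C^{n²} under this identification. A minimal Python/NumPy sketch (purely illustrative, with ad hoc names):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    b = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

    ip_trace = np.trace(b.conj().T @ a)          # <a, b> = Tr(b* a)
    ip_flat = np.vdot(b.flatten(), a.flatten())  # standard inner product on C^(n^2); vdot conjugates its first argument
    print(np.allclose(ip_trace, ip_flat))        # True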
In view of hx, xi ≥ 0 for all x, the square root hx, xi^{1/2} is defined, and we agree that it always denotes the non-negative root.
1. Prove it for y = 0, so that we may assume y 6= 0 from now on.
2. Define x1 = kyk−2 hx, yiy and x2 = x − x1 and prove hx1 , x2 i = 0.
3. Use 2. to prove kxk2 = kx1 k2 + kx2 k2 ≥ kx1 k2 .
4. Deduce Cauchy-Schwarz from kx1 k2 ≤ kxk2 .
5. Prove the claim about equality.
The above proof is the easiest to memorize (at least in outline) and reconstruct, but there
are many others, e.g.:
5.6 Exercise (i) For x, y ∈ V , define P (t) = kx + tyk2 and show this defines a quadratic
polynomial in t ∈ C with real coefficients.
(ii) Use the obvious fact that this polynomial takes values in [0, ∞) for all t ∈ R, thus also
inf t∈R P (t) ≥ 0, to prove the Cauchy-Schwarz inequality.
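The decomposition used in the proof outline above is easy to test numerically. The following Python/NumPy sketch (illustrative only; the helper names are ad hoc) checks the orthogonality of x1 and x2, Pythagoras, and the resulting Cauchy-Schwarz inequality for random complex vectors:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=6) + 1j * rng.normal(size=6)
    y = rng.normal(size=6) + 1j * rng.normal(size=6)

    ip = lambda u, v: np.vdot(v, u)          # <u, v>, linear in the first argument
    norm = lambda u: np.sqrt(ip(u, u).real)

    x1 = (ip(x, y) / norm(y)**2) * y         # component of x parallel to y
    x2 = x - x1                              # component orthogonal to y

    print(abs(ip(x1, x2)) < 1e-12)                              # x1 orthogonal to x2
    print(np.isclose(norm(x)**2, norm(x1)**2 + norm(x2)**2))    # Pythagoras
    print(abs(ip(x, y)) <= norm(x) * norm(y) + 1e-12)           # Cauchy-Schwarz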
5.7 Proposition If h·, ·i is an inner product on V then kxk = +√hx, xi is a norm on V .
Proof. kxk ≥ 0 holds by construction, and the third axiom in Definition 5.1 implies kxk = 0 ⇒
x = 0. We have
kcxk = √hcx, cxi = √(c c̄ hx, xi) = √(|c|2 hx, xi) = |c| kxk,
thus kcxk = |c|kxk for all x ∈ V, c ∈ F. Finally,
kx + yk2 = hx + y, x + yi = hx, xi + hx, yi + hy, xi + hy, yi = kxk2 + kyk2 + 2 Re hx, yi.
Since 2 Re hx, yi ≤ 2 |hx, yi| ≤ 2 kxk kyk by the Cauchy-Schwarz inequality, we obtain
kx + yk2 ≤ kxk2 + kyk2 + 2kxkkyk = (kxk + kyk)2
and therefore kx + yk ≤ kxk + kyk, i.e. subadditivity.
In terms of the norm, the Cauchy-Schwarz inequality just becomes |hx, yi| ≤ kxkkyk.
5.8 Definition A pre-Hilbert space (or inner product space) is a pair (V, h·, ·i), where V is an
F-vector space and h·, ·i an inner product on it. A Hilbert space is a pre-Hilbert space that is
complete for the norm k · k obtained from the inner product.
5.9 Remark 1. By the above, an inner product gives rise to a norm and therefore to a norm
topology τ . Now the Cauchy-Schwarz inequality implies that the inner product h·, ·i : V × V → F is
jointly continuous: |hx, yi − hx0 , y0 i| ≤ kx − x0 k kyk + kx0 k ky − y0 k.
2. For every x we have kxk = sup{|hx, yi| | kyk ≤ 1}. (For x = 0 this is obvious, and for x 6= 0 it follows from hx, x/kxki = kxk.)
3. The restriction of an inner product on H to a linear subspace K ⊆ H again is an inner
product. Thus if H is a Hilbert space and K a closed subspace then K again is a Hilbert space
(with the restricted inner product).
4. All spaces considered in Example 5.3 are complete, thus Hilbert spaces. For `2 (S) this
was proven in Section 4, and the claim for Cn , thus also Mn×n (C), follows since Cn ∼= `2 (S, C)
when #S = n. For L2 (X, A, µ) see books on measure theory like [11, 69]. 2
5.10 Definition Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be pre-Hilbert spaces. A linear map A : H1 → H2
is called
• isometry if hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 .
• unitary if it is a surjective isometry.
5.11 Remark Every unitary map is invertible and its inverse is also unitary. Two Hilbert
spaces H1 , H2 are called unitarily equivalent or isomorphic if there exists a unitary U : H1 → H2 .
2
If (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) are (pre)Hilbert spaces then
h(x1 , x2 ), (y1 , y2 )i = hx1 , y1 i1 + hx2 , y2 i2
defines an inner product on H1 ⊕ H2 turning it into a (pre)Hilbert space. More generally, if
{Hi , h·, ·ii }i∈I is a family of (pre)Hilbert spaces then
⊕_{i∈I} Hi = { {xi }i∈I | Σ_{i∈I} hxi , xi ii < ∞ }
with
h{xi }, {yi }i = Σ_{i∈I} hxi , yi ii
is a (pre)Hilbert space. (If Hi = F for all i ∈ I, this construction recovers `2 (I, F), while the
Banach space direct sum gives `1 (I, F).)
5.12 Exercise Let (V, h·, ·i) be a pre-Hilbert space and k·k the associated norm. Let (V 0 , k·k0 )
be the completion (as a normed space) of (V, k · k). Prove that V 0 is a Hilbert space.
5.13 Exercise Let (V, h·, ·i) be a pre-Hilbert space over F ∈ {R, C}. Prove the parallelogram
identity
kx + yk2 + kx − yk2 = 2kxk2 + 2kyk2 ∀x, y ∈ V (5.3)
and the polarization identities
hx, yi = (1/4) Σ_{k=0}^3 i^k kx + i^k yk2 if F = C, (5.4)
hx, yi = (1/4) (kx + yk2 − kx − yk2 ) if F = R. (5.5)
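Both identities are easy to check numerically for vectors in Cn. The following Python/NumPy sketch (illustrative only) verifies (5.3) and recovers the inner product from the norm via (5.4):

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.normal(size=5) + 1j * rng.normal(size=5)
    y = rng.normal(size=5) + 1j * rng.normal(size=5)
    n2 = lambda v: np.vdot(v, v).real        # squared norm ||v||^2

    # Parallelogram identity (5.3)
    print(np.isclose(n2(x + y) + n2(x - y), 2 * n2(x) + 2 * n2(y)))

    # Complex polarization identity (5.4): recover <x, y> from norms alone
    pol = sum((1j**k) * n2(x + (1j**k) * y) for k in range(4)) / 4
    print(np.isclose(pol, np.vdot(y, x)))    # <x, y>, linear in the first slot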
32
5.14 Remark The proof of (5.4) only uses the sesquilinearity of h·, ·i, so that the polarization
identity [x, y] = (1/4) Σ_{k=0}^3 i^k [x + i^k y, x + i^k y] holds for every sesquilinear form [·, ·] over C. 2
For a map of (pre)Hilbert spaces we have two a priori different notions of isometry, but they
are equivalent:
5.15 Exercise Let (H1 , h·, ·i1 ), (H2 , h·, ·i2 ) be (pre)Hilbert spaces over C. Let k · k1,2 be the
norms induced by the inner products. Prove that if a linear map A : H1 → H2 is an isometry
of normed spaces then it is an isometry of pre-Hilbert spaces. (I.e. if kAxk2 = kxk1 ∀x ∈ H1
then hAx, Ayi2 = hx, yi1 ∀x, y ∈ H1 .) Hint: Polarization.
The polarization identities actually characterize norms coming from inner products:
5.16 Exercise Let (V, k · k) be a normed space over C whose norm satisfies (5.3). Take (5.4) as
definition of h·, ·i and prove that this is an inner product. Thus (5.3) is necessary and sufficient
for the norm to come from an inner product. (See the hints in [42, Exercise 1.14].)
The above is only one of very many characterizations of Hilbert spaces among the Banach
spaces. There is a whole book [2] about them!
5.17 Exercise Let S be a set with #S ≥ 2. Prove that (`p (S, F), k · kp ), where p ∈ [1, ∞],
satisfies the parallelogram identity if and only if p = 2.
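For Exercise 5.17 the case #S = 2 already decides the matter. The following Python/NumPy sketch (illustrative) tests (5.3) for x = δs , y = δt with s 6= t and several values of p:

    import numpy as np

    x = np.array([1.0, 0.0])                # delta_s
    y = np.array([0.0, 1.0])                # delta_t, s != t
    norm = lambda v, p: np.sum(np.abs(v)**p)**(1.0 / p)

    for p in (1.0, 1.5, 2.0, 3.0):
        lhs = norm(x + y, p)**2 + norm(x - y, p)**2
        rhs = 2 * norm(x, p)**2 + 2 * norm(y, p)**2
        print(p, np.isclose(lhs, rhs))      # True only for p = 2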
Proof. We have
kxk2 = hx, xi = hΣ_i xi , Σ_j xj i = Σ_{i,j} hxi , xj i = Σ_i hxi , xi i = Σ_i kxi k2 ,
5.22 Proposition (Riesz lemma) 17 Let H be a Hilbert space and C ⊆ H a non-empty
closed convex set. Then for each x ∈ H there is a unique y ∈ C minimizing kx − yk, i.e.
kx − yk = inf z∈C kx − zk.
Proof. We will prove this for x = 0, in which case the statement says that there is a unique
element of C of minimal norm. For general x ∈ H 0 , let y 0 be the unique element of minimal
norm in the convex set C 0 = C − x. Then y = y 0 + x is the unique element in C minimizing
kx − yk.
Let d = inf z∈C kzk and pick a sequence {yn } in C such that kyn k → d. Now with the
parallelogram identity (5.3) we have
5.24 Theorem Let H be a Hilbert space and K ⊆ H a closed linear subspace. Define a map
P : H → K by P x = y, where y ∈ K minimizes kx − yk as in Proposition 5.22. Also define
Qx = x − P x. Then
(i) Qx ∈ K ⊥ ∀x.
(ii) For each x ∈ H there are unique y ∈ K, z ∈ K ⊥ with x = y + z, namely y = P x, z = Qx.
(iii) The maps P, Q are linear.
(iv) The map U : H → K ⊕ K ⊥ , x 7→ (P x, Qx) is an isomorphism of Hilbert spaces. In
particular, kxk2 = kP xk2 + kQxk2 ∀x.
(v) The map P : H → H satisfies P 2 = P and hP x, yi = hx, P yi. The same holds for Q.
17
Frigyes Riesz (1880-1956). Hungarian mathematician and one of the pioneers of functional analysis. (The same
applies to his younger brother Marcel Riesz (1886-1969).)
Proof. (i) Let x ∈ H, v ∈ K. We want to prove Qx ⊥ v, i.e. hx − P x, vi = 0. Since y = P x is
the element of K minimizing kx − yk, we have for all t ∈ C
kx − P xk ≤ kx − P x − tvk.
Taking squares and putting z = x − y = x − P x, this becomes hz, zi ≤ hz − tv, z − tvi, equivalent
to
2 Re(thv, zi) ≤ |t|2 kvk2 .
With the polar decomposition t = |t|eiϕ , the above becomes 2 Re(eiϕ hv, zi) ≤ |t|kvk2 . Taking
|t| → 0, we find Re(eiϕ hv, zi) = 0, and since ϕ was arbitrary, we conclude hv, zi = 0. In view of
z = x − y = x − P x this is what we wanted.
(ii) For each x ∈ H we have x = P x + Qx with P x ∈ K, Qx ∈ K ⊥ , proving the existence.
If y, y 0 ∈ K, z, z 0 ∈ K ⊥ such that y + z = y 0 + z 0 then y − y 0 = z 0 − z ∈ K ∩ K ⊥ = {0}. Thus
y − y 0 = z 0 − z = 0, proving the uniqueness.
(iii) If x, x0 ∈ H, c, c0 ∈ F then cx + c0 x0 = P (cx + c0 x0 ) + Q(cx + c0 x0 ). But also
hS(x1 , x2 ), (y1 , y2 )i = h(x1 , 0), (y1 , y2 )i = hx1 , y1 i1 = h(x1 , x2 ), (y1 , 0)i = h(x1 , x2 ), S(y1 , y2 )i.
5.25 Remark The above theorem remains valid if H is only a pre-Hilbert space, provided
K ⊆ H is finite dimensional. We first note that the proof of Proposition 5.22 only uses completeness
of C ⊆ H, not that of H. And we recall that finite dimensional subspaces of normed spaces are
automatically complete and closed, cf. Exercises 3.21 and 3.22. In the proof of Theorem 5.24
we use Proposition 5.22 with C = K, which is complete as just noted. 2
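Theorem 5.24 is easy to visualize numerically for H = Cn and K the span of a few vectors. The sketch below (Python/NumPy, purely illustrative) builds the projection P from an orthonormal basis of K and checks properties from (i), (iv) and (v):

    import numpy as np

    rng = np.random.default_rng(3)
    n, k = 6, 2
    A = rng.normal(size=(n, k)) + 1j * rng.normal(size=(n, k))   # K = column span of A
    Q, _ = np.linalg.qr(A)            # orthonormal basis of K
    P = Q @ Q.conj().T                # orthogonal projection onto K

    x = rng.normal(size=n) + 1j * rng.normal(size=n)
    Px, Qx = P @ x, x - P @ x

    print(np.allclose(P @ P, P))                          # P^2 = P
    print(np.allclose(P, P.conj().T))                     # <Px, y> = <x, Py>
    print(np.allclose(np.vdot(Qx, Px), 0))                # Qx orthogonal to Px
    print(np.isclose(np.vdot(x, x).real,
                     np.vdot(Px, Px).real + np.vdot(Qx, Qx).real))  # ||x||^2 = ||Px||^2 + ||Qx||^2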
(In Theorem 8.9 we will prove that every self-adjoint P : H → H is automatically bounded,
but this is not needed here.)
We have seen that every closed subspace K of a Hilbert space gives rise to an orthogonal
projection P with P H = K. Conversely, we have:
5.28 Exercise Let H be a Hilbert space and P an idempotent on H (i.e. in B(H)). Prove:
(i) K = P H ⊆ H and L = (1 − P )H are closed linear subspaces.
(ii) We have K ⊥ L if and only if P = P ∗ , i.e. P is an orthogonal projection.
(iii) If P is an orthogonal projection then it equals the P associated to K by Theorem 5.24.
5.31 Proposition Every vector space V has a base.
Proof. If V = {0}, ∅ is a base. Thus let V be non-zero and let B be the set of linearly inde-
pendent subsets of V . The set B is partially ordered by inclusion ⊆ and non-empty (since it
contains {x} for all 0 6= x ∈ V ). We claim that every chain in (=totally ordered subset of)
(B, ⊆) has an upper bound in B: Just take the union B̂ of all sets in the chain. Since any finite
subset of the union over a chain of sets is contained in some element of the chain, every finite
subset of B̂ is linearly independent. Thus B̂ is in B and clearly is an upper bound of the chain.
Thus the assumption of Zorn's Lemma is satisfied, so that (B, ⊆) has a maximal element M .
We claim that M is a base for V : If this was false, we could find a v ∈ V not contained in
the span of M . But then M ∪ {v} would be a linearly independent set strictly larger than M ,
contradicting the maximality of M .
The linear algebra notion of base is of limited use as soon as we are concerned with topological
vector spaces, like normed spaces, since in the presence of a topology we can also talk
about infinite linear combinations. This leads to the notion of a Hilbert space base, next to which the
above purely algebraic one is of little or no relevance.
5.34 Lemma Let H be a (pre)Hilbert space and E ⊆ H an orthonormal set. Then the Bessel
inequality
Σ_{e∈E} |hx, ei|2 ≤ kxk2 ∀x ∈ H (5.6)
holds.
Proof. Let first E be a finite orthonormal set and x ∈ H. Define y = x − Σ_{e∈E} hx, eie. It is
straightforward to check that hy, ei = 0 for all e ∈ E, so that E ∪ {y} is an orthogonal set. In
view of x = y + Σ_{e∈E} hx, eie, Pythagoras gives
kxk2 = kyk2 + Σ_{e∈E} |hx, ei|2 ,
5.36 Lemma For every orthonormal set E in a (pre)Hilbert space H there is an orthonormal
base Ê containing E. In particular every Hilbert space admits an ONB.
Proof. The proof is essentially the same as that of Proposition 5.31: Let B be the set of or-
thonormal sets that contain E. A Zorn’s lemma argument gives the existence of an orthonormal
set Ê that contains E and is maximal. Thus Ê is a base. (There is a tricky point: Our Ê is
maximal among the orthonormal sets that contain E. Without the latter requirement there
might be a bigger Ê. Why can't this happen?)
5.38 Theorem 18 Let H be a Hilbert space and E an orthonormal set in H. Then the following
are equivalent:
(i) E is an orthonormal base, i.e. maximal.
(ii a) If x ∈ H and x ⊥ e for all e ∈ E then x = 0.
(ii b) The map H → `2 (E, F), x 7→ {hx, ei}e∈E (well-defined thanks to (5.6)) is injective.
(iii) spanF E is dense in H.
(iv) For every x ∈ H, there are numbers {ae }e∈E in F such that x = Σ_{e∈E} ae e.
(v) For every x ∈ H, the equality x = Σ_{e∈E} hx, eie holds.
(vi a) For every x ∈ H, we have kxk2 = Σ_{e∈E} |hx, ei|2 . (Abstract Parseval19 identity)
(vi b) The map H → `2 (E, F), x 7→ {hx, ei}e∈E is an isometric map of normed spaces, where
`2 (E, F) has the k · k2 -norm.
(vii a) For all x, y ∈ H we have hx, yi = Σ_{e∈E} hx, eihe, yi = Σ_{e∈E} hx, ei \overline{hy, ei}.
(vii b) The map H → `2 (E, F), x 7→ {hx, ei}e∈E is an isometric map of pre-Hilbert spaces.
Here all summations over E are in the sense of the unordered summation of Appendix A.1 (with
V = H in (iv),(v) and V = F in (vi a),(vii a)).
Proof. If (ii a) holds then E is maximal, thus (i). If (ii a) is false then there is a non-zero x ∈ H
with x ⊥ e for all e ∈ E. Then E ∪ {x/kxk} is an orthonormal set larger than E, thus E is
not maximal. Thus (i)⇔(ii a). The equivalence (ii a)⇔(ii b) follows from the fact that a linear
map is injective if and only if its kernel is {0}.
(iii)⇒(i) If spanF E is dense in H and x ∈ H satisfies x ⊥ E then by linearity and continuity of the inner product also x ⊥ H, thus
x = 0. Thus E is maximal and therefore a base.
(ii a)⇒(iii) The closure K of spanF E is a closed linear subspace of H. If K 6= H then by Theorem 5.24
we can find a non-zero x ∈ K ⊥ . In particular x ⊥ e ∀e ∈ E, contradicting (ii a). Thus K = H.
It should be clear that the statements (vi b) and (vii b) are just high-brow versions of (vi a),
(vii a), respectively, to which they are equivalent. That (vii a) implies (vi a) is seen by taking
x = y. Since Exercise 5.15 gives (vi b)⇒(vii b), we have the mutual equivalence of (vi a), (vi
b), (vii a), (vii b).
18
I dislike the approach of [42] of restricting this statement to finite or countably infinite orthonormal sets. This in
particular means that the hypotheses of [42, Theorem 1.33] can never be satisfied if H is non-separable! I also find
it desirable to understand how much of the theorem survives without completeness since the latter does not hold in
situations like Example 5.44. See Remark 5.39.
19
Marc-Antoine Parseval (1755-1836). French mathematician.
(v)⇒(iv) is trivial. If (iv) holds then continuity of the inner product, cf. Remark 5.9.1,
implies hx, yi = Σ_{e∈E} ae he, yi for all y ∈ H. For y ∈ E, the r.h.s. reduces to ay , implying (v).
(iv) means that every x ∈ H is a limit of finite linear combinations of the e ∈ E, thus (iii)
holds.
(v)⇒(vi a) For finite F ⊆ E we define xF = Σ_{e∈F} hx, eie. Pythagoras' theorem gives
kxF k2 = Σ_{e∈F} |hx, ei|2 . As F % E, the l.h.s. converges to kxk2 by (v) and the r.h.s. to
Σ_{e∈E} |hx, ei|2 . Thus (vi a) holds.
(vi a)⇒(v) If (vi a) holds then for each ε > 0 there is a finite F ⊆ E such that Σ_{e∈E\F} |hx, ei|2 < ε.
Since x − xF is orthogonal to each e ∈ F , we have x − xF ⊥ xF , so that kxk2 = kx − xF k2 + kxF k2 .
Combining this with (vi a) and kxF k2 = Σ_{e∈F} |hx, ei|2 we find kx − xF k2 = Σ_{e∈E\F} |hx, ei|2 < ε.
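Statements (v) and (vi a) can be tested numerically for a finite-dimensional H. In the following Python/NumPy sketch (illustrative only) the columns of a random unitary matrix serve as the ONB E:

    import numpy as np

    rng = np.random.default_rng(4)
    n = 5
    # A random orthonormal base {e_1, ..., e_n} of C^n (columns of U)
    U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    x = rng.normal(size=n) + 1j * rng.normal(size=n)

    coeff = np.array([np.vdot(U[:, k], x) for k in range(n)])   # <x, e_k>

    # (v): x is recovered from its coefficients
    print(np.allclose(x, sum(coeff[k] * U[:, k] for k in range(n))))
    # (vi a), Parseval: ||x||^2 equals the sum of |<x, e_k>|^2
    print(np.isclose(np.vdot(x, x).real, np.sum(np.abs(coeff)**2)))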
5.41 Theorem ((F.) Riesz-Fischer) 20 21 Let H be a pre-Hilbert space and E an orthonormal set such that spanF E is dense in H. Then the following are equivalent:
20
Ernst Sigismund Fischer (1875-1954). Austrian mathematician. Early pioneer of Hilbert space theory.
21
Also the completeness of L2 (X, A, µ; F) (see Lemma 4.10 for `2 (S)) is sometimes called Riesz-Fischer theorem.
(i) H is a Hilbert space (thus complete).
(ii) The isometric map H → `2 (E, F), x 7→ {hx, ei}e∈E is surjective. I.e. for every f ∈ `2 (E, F)
there is an x ∈ H such that hx, ei = f (e) for all e ∈ E.
Proof. (ii)⇒(i) We know from (iii)⇒(vii b) in Theorem 5.38 that the map H → `2 (E, F) is an
isometry. If it is surjective then it is an isomorphism of pre-Hilbert spaces. Since `2 (E, F) is
complete by Lemma 4.10, so is H.
(i)⇒(ii) Let f ∈ `2 (E, F). For each finite subset F ⊆ E we define xF = Σ_{e∈F} f (e)e. For
each ε > 0 there is a finite F ⊆ E such that Σ_{e∈E\F} |f (e)|2 < ε. Whenever F ⊆ U ∩ U 0 (with U, U 0 ⊆ E finite), the
identity xU − xU 0 = Σ_{e∈E} (χU (e) − χU 0 (e))f (e)e implies
kxU − xU 0 k2 = Σ_{e∈E} |χU (e) − χU 0 (e)|2 |f (e)|2 ≤ ε
since |χU − χU 0 | vanishes on F and is bounded by one on (U ∪U 0 )\F . Thus {xF } is a Cauchy net
and therefore convergent to a unique x ∈ H by completeness, cf. Lemma A.11. By continuity
of the inner product, hxF , ei converges to f (e), so that hx, ei = f (e) for all e ∈ E.
5.42 Remark If E and E 0 are ONBs for a Hilbert space H then one can prove that E and E 0
have the same cardinality, i.e. there is a bijection between E and E 0 , cf. [12, Proposition I.4.14].
(This does not follow from the linear algebra proof, since the latter uses a different notion of
base, the Hamel bases.) The common cardinality of all bases of H is called the dimension of H.
2
which is the n-th Fourier coefficient fb(n) of f , cf. e.g. [76, 34]. In fact, in Fourier analysis one
proves, cf. e.g. [76, Corollary 5.4], that the finite linear combinations of the en (‘trigonometric
polynomials’) are dense in H, which is (iii) of Theorem 5.38. Thus all other statements in the
theorem also hold. The weaker statement (ii a) is also well-known in Fourier analysis, cf. [76,
Corollary 5.3]. Furthermore,
(1/2π) ∫_0^{2π} |f (x)|2 dx = kf k2 = Σ_{n∈Z} |hf, en i|2 = Σ_{n∈Z} |fb(n)|2 .
This is the original Parseval formula, cf. e.g. [76, Chapter 3, Theorem 1.3]. Note that H is
not complete. Measure theory tells us that this completion is L2 ([0, 2π], λ; C), the measure
being Lebesgue measure λ (defined on the σ-algebra of Borel sets). Now the map L2 ([0, 2π]) →
`2 (Z, C), f 7→ fb is an isomorphism of Hilbert spaces. This nice situation shows that the Lebesgue
integral is much more appropriate for the purposes of Fourier analysis than the Riemann integral
(as for most other purposes).
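The Parseval formula above can be checked numerically for a concrete function. The following Python/NumPy sketch (illustrative, with the integrals replaced by Riemann sums on a grid) does this for the 2π-periodic extension of f (x) = x:

    import numpy as np

    M, N = 20000, 200                      # grid points, number of Fourier modes kept
    x = np.linspace(0.0, 2 * np.pi, M, endpoint=False)
    f = x                                  # the sawtooth example

    fhat = np.array([np.mean(f * np.exp(-1j * n * x)) for n in range(-N, N + 1)])

    lhs = np.mean(np.abs(f)**2)            # (1/2pi) * integral of |f|^2
    rhs = np.sum(np.abs(fhat)**2)          # sum over |n| <= N of |fhat(n)|^2
    print(lhs, rhs)                        # close; equal in the limit M, N -> infinity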
5.45 Exercise Prove that the pre-Hilbert space H = C([0, 1]) with inner product hf, gi =
∫_0^1 f (t)g(t) dt is not complete.
5.46 Lemma Let (H, h·, ·iH ), (H 0 , h·, ·iH 0 ) be pre-Hilbert spaces over F ∈ {R, C}. Then there is
a unique inner product h·, ·iZ on Z = H ⊗F H 0 such that hv ⊗ w, v 0 ⊗ w0 i = hv, v 0 iH hw, w0 iH 0 .
Proof. Every element z ∈ Z = H ⊗F H 0 has a representation z = Σ_{k=1}^K vk ⊗ wk with K < ∞.
Given another z 0 = Σ_{l=1}^L vl0 ⊗ wl0 ∈ H ⊗F H 0 , we must define
hz, z 0 iZ = Σ_{k=1}^K Σ_{l=1}^L hvk , vl0 iH hwk , wl0 iH 0 .
The independence of hz, z 0 iZ of the representation of z 0 is shown in the same way.
It is quite clear that h·, ·iZ is sesquilinear and satisfies hz 0 , ziZ = \overline{hz, z 0 iZ }.
In order to study hz, ziZ we may assume that z = Σ_k vk ⊗ wk , where the wk are mutually
orthogonal. This leads to
hz, ziZ = Σ_k hvk , vk iH hwk , wk iH 0 = Σ_k kvk k2 kwk k2 ≥ 0
5.47 Definition If H, H 0 are Hilbert spaces then H ⊗ H 0 is the Hilbert space obtained by
completing the above pre-Hilbert space (Z, h·, ·iZ ).
5.48 Remark 1. We usually write the completed tensor products ⊗ without subscript to
distinguish them from the algebraic ones.
2. If E, E 0 are ONBs in the Hilbert spaces H, H 0 , respectively, then it is immediate that
E × E 0 is an orthonormal set in the algebraic tensor product H ⊗F H 0 , thus also in H ⊗ H 0 . In
fact its span is dense in H ⊗ H 0 , so that it is an ONB.
This leads to a pedestrian way of defining the tensor product H ⊗ H 0 of Hilbert spaces over
F: Pick ONBs E ⊆ H, E 0 ⊆ H 0 and define H ⊗ H 0 = `2 (E × E 0 , F). By Remark 5.42, the
outcome is independent of the chosen bases up to isomorphism. If x ∈ H, x0 ∈ H 0 then the map
E × E 0 → F, (e, e0 ) 7→ hx, eiH hx0 , e0 iH 0 is in `2 (E × E 0 , F), thus defines an element x ⊗ x0 ∈ H ⊗ H 0 .
This map H × H 0 → H ⊗ H 0 is bilinear. But this definition is very ugly and unconceptual due
to its reliance on a choice of bases. 2
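In coordinates, elementary tensors become Kronecker products, and the defining property of h·, ·iZ can then be checked numerically. A minimal Python/NumPy sketch (illustrative only):

    import numpy as np

    rng = np.random.default_rng(5)
    v  = rng.normal(size=3) + 1j * rng.normal(size=3)
    vp = rng.normal(size=3) + 1j * rng.normal(size=3)
    w  = rng.normal(size=4) + 1j * rng.normal(size=4)
    wp = rng.normal(size=4) + 1j * rng.normal(size=4)

    # In C^3 (x) C^4 = C^12, elementary tensors are Kronecker products
    lhs = np.vdot(np.kron(vp, wp), np.kron(v, w))        # <v (x) w, v' (x) w'>
    rhs = np.vdot(vp, v) * np.vdot(wp, w)                # <v, v'> <w, w'>
    print(np.allclose(lhs, rhs))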
6.2 Quotient spaces of Banach spaces
In a general Banach space, we don’t have the notion of orthogonal complement. But in most
situations, having Banach quotient spaces is good enough. (For a different substitute for or-
thogonal complements see Section 6.3.)
kc(x + W )k0 = kcx + W k0 = inf_{w∈W} kcx − wk = |c| inf_{w∈W} kx − w/ck = |c| inf_{w∈W} kx − wk = |c| kx + W k0 ,
where we used that W → W, w 7→ cw is a bijection for c 6= 0 (the case c = 0 being trivial). Now let x1 , x2 ∈ V and ε > 0. Then there
are w1 , w2 ∈ W such that kxi − wi k < kxi + W k0 + ε/2 for i = 1, 2. Then
kx1 + x2 + W k0 ≤ k(x1 − w1 ) + (x2 − w2 )k ≤ kx1 − w1 k + kx2 − w2 k < kx1 + W k0 + kx2 + W k0 + ε.
Since ε > 0 was arbitrary, we have kx1 + x2 + W k0 ≤ kx1 + W k0 + kx2 + W k0 , proving subadditivity
of k · k0 . It is immediate that kv + W k0 = inf w∈W kv − wk ≤ kvk.
(ii) If v ∈ V , the definition of k · k0 readily implies that kv + W k0 = 0 if and only if v lies in the closure of W .
Thus if W is closed then w = v + W ∈ V /W has kwk0 = 0 only if w is the zero element of V /W .
And if W is not closed then every v ∈ \overline{W }\W satisfies kv + W k0 = 0 even though v + W ∈ V /W
is non-zero. Thus k · k0 is not a norm.
(iii) Continuity of p : (V, k · k) → (V /W, k · k0 ) follows from kpk ≤ 1, see (i). Since p is
norm-decreasing, we have p(B V (0, r)) ⊆ B V /W (0, r) for each r > 0. And if y ∈ V /W with
kyk < r then there is an x ∈ V with p(x) = y and kxk < r (but typically larger than kyk). Thus
p maps B V (0, r) onto B V /W (0, r) for each r. Similarly, p(B V (x, r)) = B V /W (p(x), r), and from
this it is easily deduced that p(U ) ⊆ V /W is open for each open U ⊆ V . Thus p is open (w.r.t.
the norm topologies on V, V /W ), which implies (cf. [47, Lemma 6.4.5]) that p is a quotient map,
thus the topology on V /W coming from k · k0 is the quotient topology.
(iv) Let {yn } ⊆ V /W be a Cauchy sequence. Then we can pass to a subsequence wn = yin
such that kwn − wn+1 k < 2−n . Pick xn ∈ V such that p(xn ) = wn and kxn − xn+1 k < 2−n . (Why
can this be done?) Then {xn } is a Cauchy sequence converging to some x ∈ V by completeness
of V . With y = p(x) we have kwn − yk ≤ kxn − xk → 0, thus wn → y. Since a Cauchy sequence with a convergent subsequence converges, yn → y, and V /W is complete.
(v) Existence and uniqueness of T 0 as linear map are standard. And using p(B V (0, 1)) =
B V /W (0, 1) we have
Also the statements concerning injectivity and surjectivity of T 0 are again pure algebra, but for
completeness we give proofs: The statement about surjectivity follows from T = T 0 ◦ p together
with surjectivity of p, which gives T (V ) = T 0 (V /W ). If W $ ker T , pick x ∈ (ker T )\W and
put y = p(x). Then y 6= 0, but T 0 y = T 0 px = T x = 0, so that T 0 is not injective. Now assume
W = ker T . If y ∈ ker T 0 then pick x ∈ V with y = p(x). Then T x = T 0 px = T 0 y = 0, thus
x ∈ ker T = W , so that y = p(x) = 0, proving injectivity of T 0 .
(vi) It is known from algebra that A/I is again an algebra. By the above, it is normed. It
remains to prove that the quotient norm on A/I is submultiplicative. Let c, d ∈ A/I and ε > 0.
Then there are a, b ∈ A with p(a) = c, p(b) = d, kak < kck + ε, kbk < kdk + ε (see the exercise
below). Then kcdk = kp(ab)k ≤ kabk ≤ kakkbk < (kck + ε)(kdk + ε), and since this holds for all
ε > 0, we have kcdk ≤ kckkdk.
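The quotient norm is just an infimum and can be approximated numerically in simple examples. The following Python/NumPy sketch (illustrative; the example V = R² with the sup-norm and W = span{(1, 1)}, where k(a, b) + W k0 = |a − b|/2, is chosen ad hoc):

    import numpy as np

    def quotient_norm(v, ts=np.linspace(-10.0, 10.0, 20001)):
        # approximate ||v + W||' = inf_t ||v - t*(1,1)||_inf on a grid of t-values
        diffs = np.abs(np.asarray(v)[None, :] - ts[:, None] * np.array([1.0, 1.0]))
        return float(np.min(np.max(diffs, axis=1)))

    v = (3.0, 1.0)
    print(quotient_norm(v))        # 1.0 = |3 - 1|/2, attained at t = 2
    print(max(abs(c) for c in v))  # 3.0 = ||v||_inf, so ||v + W||' <= ||v|| as in (i)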
6.3 Exercise (i) If V is a normed space and W ⊆ V is a closed subspace, prove that for
every y ∈ V /W and every ε > 0 there is an x ∈ V with p(x) = y and kxk ≤ kyk + ε.
(ii) Give an example of a normed space V , a closed subspace W and y ∈ V /W for which no
x ∈ V with y = p(x), kxk = kyk exists.
6.4 Exercise Use the quotient space construction of Banach spaces to give a new proof for
the difficult part of Exercise 3.16.
The following is closely related to the Hilbert space ⊥, but not the same:
6.5 Definition Let V be a Banach space and W ⊆ V a subspace. Then the annihilator of W
is W ⊥ = {ϕ ∈ V ∗ | ϕ|W = 0} ⊆ V ∗ . One easily checks that W ⊥ ⊆ V ∗ is closed and that W ⊥ = (\overline{W })⊥ .
6.7 Exercise Let V be a Banach space and Z ⊆ V ∗ a closed subspace. Define Z > ⊆ V and
prove V ∗ /Z ∼
= (V /Z > )∗ .
6.10 Definition Let V be a Banach space. A closed subspace W ⊆ V is called complemented
if there is a closed subspace Z ⊆ V such that V = W + Z and W ∩ Z = {0}.
If V, W, Z are as in the definition (without closedness) then every v ∈ V can be written as
v = w + z with w ∈ W, z ∈ Z in a unique way. (Uniqueness follows from w + z = w0 + z 0 ⇒
w − w0 = z 0 − z ∈ W ∩ Z = {0}.) One says ‘V is the internal direct sum of W and Z’. Purely
algebraically, every subspace W has a complementary subspace Z: Pick a (Hamel) base E for
W , extend it to a base E 0 of V and put Z = spanF (E 0 \E). But here we want Z to be closed! In
Exercise 9.11 we will prove that with closedness of W, Z we have V ∼= W ⊕ Z also topologically.
6.11 Exercise Let V = C([0, 2], R) with the k · k∞ -norm. Let W = {f ∈ V | f|(1,2] = 0}.
(i) Prove that W is complemented.
(ii) Can you ‘classify’ all possible complements, i.e. put them in bijection with a simpler set?
6.12 Exercise Let V be a Banach space and P ∈ B(V ) satisfying P 2 = P . Prove that
W = P V is a complemented subspace. (The converse is also true, as you will prove later.)
6.13 Exercise Let V be a Banach space and W ⊆ V a closed subspace such that dim V /W <
∞. Prove that W is complemented.
Not every closed subspace of a Banach space is complemented! In view of Exercise 6.13
and Proposition 6.14, a non-complemented subspace W ⊆ V must have infinite dimension and
codimension. And indeed, c0 (N, F) ⊆ `∞ (N, F) is non-complemented, as we prove in Appendix
B.3. See also [43] for more on the subject of complemented subspaces.
In fact, a Banach space has complementary subspaces for all closed subspaces if and only
if it is isomorphic to a Hilbert space, i.e. it admits an inner product whose associated norm is
equivalent to the original one! See [41].
In the process of returning from Hilbert to Banach spaces, the above discussion of quotient
spaces and complements was the easiest part. The question of bases is much harder for Banach
spaces, as the existence of the two volume treatment [74] of the subject, having 680+888 pages,
might suggest. (Then again, the basics are quite accessible, cf. e.g. [43, 27, 9, 1], but unfortu-
nately we don’t have the time.) The same is true for the formidable subject of tensor products
of Banach spaces, see e.g. [67]. Going into that would be pointless given that we already slighted
the much simpler tensor products of Hilbert spaces.
A more tractable problem is the fact that in the absence of an inner product, the existence
of non-zero bounded linear functionals is rather non-trivial and can in general only be proven
non-constructively, as we will do in the next section. (Of course, for spaces that are given very
explicitly like `p (S, F), we may well have more concrete approaches as in Section 4.5.)
understood. (The map H → H ∗ , y 7→ ϕy is an anti-linear bijection.) For a general Banach
space V , matters are much more complicated. The point of the Hahn23 -Banach theorem (which
comes in many versions)24 is to show that there are many linear functionals.
7.2 Theorem Let V be a real vector space and p : V → R a sublinear function. Let W ⊆ V
be a linear subspace and ϕ : W → R a linear functional such that ϕ(w) ≤ p(w) for all w ∈ W .
Then there is a linear functional ϕ̂ : V → R such that ϕ̂|W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ V .
The heart of the proof of the theorem is proving it in the case where we extend ϕ from W
to W + Rv 0 .
7.3 Lemma Let V, p, W, ϕ be as in Theorem 7.2 and v 0 ∈ V . Then there is a linear functional
ϕ̂ : Y = W + Rv 0 → R such that ϕ̂|W = ϕ and ϕ̂(v) ≤ p(v) for all v ∈ Y .
Proof. If v 0 ∈ W , there is nothing to do, so that we may assume v 0 ∈ V \W . Then every
x ∈ W + Rv 0 can be written as x = w + cv 0 with unique w ∈ W, c ∈ R. Thus if d ∈ R, we can
define ϕ̂ : W + Rv 0 → R by w + cv 0 7→ ϕ(w) + cd for all w ∈ W and c ∈ R. Since ϕ̂ is linear and
trivially satisfies ϕ̂|W = ϕ, it remains to show that d can be chosen such that
ϕ̂(w + cv 0 ) = ϕ(w) + cd ≤ p(w + cv 0 ) ∀w ∈ W, c ∈ R. (7.1)
For c = 0, this holds by assumption. If (7.1) holds for all w ∈ W and c ∈ {1, −1}, i.e.
ϕ(w) ± d ≤ p(w ± v 0 ) ∀w ∈ W, (7.2)
then for every e > 0 we have
ϕ̂(w ± ev 0 ) = e ϕ̂(e−1 w ± v 0 ) ≤ e p(e−1 w ± v 0 ) = p(w ± ev 0 ),
thus the desired inequality (7.1) holds for all w ∈ W, c ∈ R. Now d ∈ R satisfies (7.2) for all
w ∈ W and both signs if and only if
ϕ(w) − p(w − v 0 ) ≤ d ≤ p(w 0 + v 0 ) − ϕ(w 0 ) ∀w, w 0 ∈ W.
Clearly this is possible if and only if ϕ(w) − p(w − v 0 ) ≤ p(w 0 + v 0 ) − ϕ(w 0 ) for all w, w 0 ∈ W ,
which is equivalent to ϕ(w) + ϕ(w 0 ) ≤ p(w − v 0 ) + p(w 0 + v 0 ) ∀w, w 0 . This is indeed satisfied for
all w, w 0 ∈ W since w + w 0 ∈ W, so that
ϕ(w) + ϕ(w 0 ) = ϕ(w + w 0 ) ≤ p(w + w 0 ) = p((w − v 0 ) + (w 0 + v 0 )) ≤ p(w − v 0 ) + p(w 0 + v 0 ).
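The one-step extension of Lemma 7.3 can be made concrete in V = R². The following Python/NumPy sketch (illustrative; the choices of p, W, ϕ and v0 are ad hoc) approximates the interval (7.2) of admissible d on a grid:

    import numpy as np

    # W = span{(1,0)}, phi(t*(1,0)) = t, p = Euclidean norm (sublinear), v0 = (0,1)
    ts = np.linspace(-100.0, 100.0, 400001)   # w = (t, 0) sampled on a grid
    p_minus = np.sqrt(ts**2 + 1.0)            # p(w - v0)
    p_plus = np.sqrt(ts**2 + 1.0)             # p(w + v0)

    lower = np.max(ts - p_minus)              # sup_w (phi(w) - p(w - v0)), tends to 0 from below
    upper = np.min(p_plus - ts)               # inf_w (p(w + v0) - phi(w)), tends to 0 from above
    print(lower, upper)                       # here d = 0 is the admissible choice
    # The extension phihat(t, s) = t is indeed dominated by p on all of R^2.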
23
Hans Hahn (1879-1934). Austrian mathematician who mostly worked in analysis and topology.
24
Important early results are due to Eduard Helly (1884-1943), another Austrian mathematician. See [42, p. 54-55].
Proof of Theorem 7.2. If W = V , there is nothing to do, so assume W $ V . Let E be the set
of pairs (Z, ψ), where Z ⊆ V is a linear subspace containing W and ψ : Z → R is a linear
map extending ϕ such that ψ(z) ≤ p(z) ∀z ∈ Z. Since W 6= V , Lemma 7.3 implies E 6= ∅.
We define a partial ordering on E by (Z, ψ) ≤ (Z 0 , ψ 0 ) ⇔ Z ⊆ Z 0 , ψ 0 |Z = ψ. If C ⊆ E is
a chain, i.e. totally ordered by ≤, let Y = ∪_{(Z,ψ)∈C} Z and define ψY : Y → R by ψY (v) = ψ(v)
for any (Z, ψ) ∈ C with v ∈ Z. This clearly is consistent and gives a linear map. Now (Y, ψY ) is
an element of E and an upper bound for C. Thus by Zorn's lemma there is a maximal element
(YM , ψM ) of E. Now ψM : YM → R is an extension of ϕ satisfying ψM (y) ≤ p(y) for all y ∈ YM ,
so we are done if we prove YM = V . If this is not the case, we can pick v 0 ∈ V \YM and use
Lemma 7.3 to extend ψM to YM + Rv 0 , but this contradicts the maximality of (YM , ψM ).
7.4 Remark The above proof used Zorn’s lemma, which is equivalent to the Axiom of Choice
(AC), and therefore very non-constructive25 . There is nothing much to be done about this, but
we mention that the Hahn-Banach theorem can be deduced from the ‘ultrafilter lemma’, which
is strictly weaker than AC. For separable spaces, the Hahn-Banach theorem can be proven using
only the axiom DCω of countable dependent choice. For proofs of these claims see [47, Appendix
G]. 2
7.5 Theorem (Hahn-Banach Theorem) Let V be a vector space over F ∈ {R, C}, p a seminorm
on it, W ⊆ V a linear subspace and ϕ : W → F a linear functional such that |ϕ(w)| ≤ p(w)
for all w ∈ W . Then there is a linear functional ϕ̂ : V → F such that ϕ̂|W = ϕ and
|ϕ̂(v)| ≤ p(v) for all v ∈ V .
Proof. F = R: This is an immediate consequence of Theorem 7.2 since a seminorm p is sublinear
with the additional properties p(−v) = p(v) ≥ 0 for all v. In particular, −ϕ̂(v) = ϕ̂(−v) ≤
p(−v) = p(v), so that −p(v) ≤ ϕ̂(v) ≤ p(v) for all v ∈ V , which is equivalent to |ϕ̂(v)| ≤ p(v) ∀v.
F = C: Assume that ϕ : W → C (with W ⊆ V ) satisfies |ϕ(w)| ≤ p(w) ∀w ∈ W . Define ψ : W → R, w 7→
Re(ϕ(w)), which clearly is R-linear and satisfies the same bounds. Thus by the real case just
considered, there is an R-linear functional ψ̂ : V → R extending ψ such that |ψ̂(v)| ≤ p(v) for all v ∈ V .
Define ϕ̂ : V → C by
ϕ̂(v) = ψ̂(v) − i ψ̂(iv).
Again it is clear that ϕ̂ is R-linear. Furthermore
ϕ̂(iv) = ψ̂(iv) − i ψ̂(−v) = ψ̂(iv) + i ψ̂(v) = i(ψ̂(v) − i ψ̂(iv)) = i ϕ̂(v),
proving that ϕ̂ : V → C is C-linear. If w ∈ W then
ϕ̂(w) = ψ̂(w) − i ψ̂(iw) = ψ(w) − i ψ(iw) = Re(ϕ(w)) − i Re(ϕ(iw)) = Re(ϕ(w)) − i Re(iϕ(w)) = Re(ϕ(w)) + i Im(ϕ(w)) = ϕ(w),
so that ϕ̂ extends ϕ.
Given v ∈ V , let α ∈ C, |α| = 1 be such that α ϕ̂(v) ≥ 0. Then α ϕ̂(v) = ϕ̂(αv) = Re(ϕ̂(αv)) = ψ̂(αv), so that |ϕ̂(v)| = |α ϕ̂(v)| = ψ̂(αv) ≤ p(αv) = p(v).
25
“Such reliance on awful non-constructive results is unfortunately typical of traditional functional analysis.” [38]
7.6 Remark In Exercise 5.30 we saw (with a fairly easy proof) that bounded linear functionals
defined on linear subspaces of Hilbert spaces always have unique norm-preserving extensions to
the whole space. For a general Banach space V this uniqueness is far from true! (It holds if and
only if V ∗ is strictly convex, cf. Section B.6 for definition and proof.) 2
7.7 Exercise Give an example for a Banach space V , a linear subspace K ⊆ V and ϕ ∈ K ∗
such that there are multiple norm-preserving extensions ϕ̂ ∈ V ∗ .
Proof of Proposition 6.14. To begin with, finite dimensional subspaces are automatically closed
by Exercise 3.22. Let W ⊆ V be finite dimensional and let {e1 , . . . , en } be a base for W .
Since every w ∈ W can be written as Σ_{i=1}^n ci ei in a unique way, there are linear functionals
ϕi : W → C such that w = Σ_{i=1}^n ϕi (w)ei for each w ∈ W . Since W is finite dimensional, the ϕi
are automatically bounded. Now by the Hahn-Banach Theorem 7.5 there are continuous linear
functionals ϕ̂i : V → C extending the ϕi . Then Z = ∩_{i=1}^n ker ϕ̂i is a closed linear subspace
of V . It should be clear that W ∩ Z = {0}. Define P : V → W, v 7→ Σ_{i=1}^n ϕ̂i (v)ei . We have
P |W = idW , thus P 2 = P . Now apply Exercise 6.12.
7.9 Corollary Every normed space E embeds isometrically into a Banach space Ê as a dense
subspace. That space Ê is unique up to isometric isomorphism and is called the completion of
E.
Proof. This can be proven by completing the metric space (E, d), where d(x, y) = kx − yk, and
showing that the completion is a linear space, which is easy. Alternatively, using the above
result that ιE : E → E ∗∗ is an isometry, we can take Ê = \overline{ιE (E)} ⊆ E ∗∗ as the definition of Ê,
since this is a closed subspace of the complete space E ∗∗ and therefore complete.
Uniqueness of the completion follows with the same proof as for metric spaces, cf. [47].
7.11 Exercise Let V be an infinite dimensional Banach space over F ∈ {R, C}.
(i) Use Hahn-Banach to construct sequences {xn }n∈N ⊆ V and {ϕn }n∈N ⊆ V ∗ such that
kxn k = 1 and ϕn (xn ) 6= 0 for all n ∈ N and ϕn (xm ) = 0 whenever n 6= m.
(ii) Prove that {xn }n∈N is linearly independent and that xn 6∈ spanF {xm | m 6= n} for all n.
7.12 Exercise Let V be a Banach space and x ∈ V, ϕ ∈ V ∗ . Prove that ιV ∗ (ϕ)(ιV (x)) = ϕ(x).
7.16 Exercise (i) Prove that if E is reflexive then for each ϕ ∈ E ∗ there is x ∈ E such that
kxk = 1 and |ϕ(x)| = kϕk.
(ii) Use (i) and Theorem 4.16 to prove (again) that c0 (N, C) is not reflexive.
7.17 Remark 1. The converse of the statement in Exercise 7.16(i) is also true, but the proof
is much harder and more than 10 pages long! (See [43, Section 1.13].)
2. See Appendix B.6 for the notion of uniform convexity, which is stronger than the strict
convexity encountered earlier, and a proof of the fact that uniformly convex spaces are reflexive.
We will also prove that Lp (X, A, µ) is uniformly convex for each measure space
(X, A, µ) and 1 < p < ∞. This provides a proof of reflexivity of these spaces that does
not use the relation between Lp and Lq . This in turn leads to a simple proof of surjectivity of
the isometric map Lq → (Lp )∗ known from Section 4.6 (reversing the logic of Exercise 7.15(iii)).
3. If E is a Banach space and F ⊆ E is a closed subspace then E is reflexive if and only if both
F and E/F are reflexive. The proof uses only Hahn-Banach. See [85] for a nice exposition. 2
7.18 Theorem Let V be a Banach space. Then V is reflexive if and only if V ∗ is reflexive.
Proof. ⇒ Given surjectivity of the canonical map ιV : V → V ∗∗ , we want to prove surjectivity
of ιV ∗ : V ∗ → V ∗∗∗ . Let thus ϕ ∈ V ∗∗∗ = (V ∗∗ )∗ . Putting ϕ0 = ϕ ◦ ιV ∈ V ∗ , the implication
is proven if we show ϕ = ιV ∗ (ϕ0 ), which means ϕ(x∗∗ ) = ιV ∗ (ϕ0 )(x∗∗ ) for all x∗∗ ∈ V ∗∗ . By
surjectivity of ιV : V → V ∗∗ , this is equivalent to ϕ(ιV (x)) = ιV ∗ (ϕ0 )(ιV (x)) for all x ∈ V . This
is true since the l.h.s. is ϕ0 (x) by definition of ϕ0 and the r.h.s. equals ϕ0 (x) by Exercise 7.12.
⇐ Assume that V is not reflexive. Then ιV (V ) ⊆ V ∗∗ is a proper closed subspace, so that
ιV (V )⊥ 6= {0} by Exercise 7.10. Let thus 0 6= ϕ ∈ ιV (V )⊥ ⊆ V ∗∗∗ . Since V ∗ is reflexive,
we have ϕ = ιV ∗ (ϕ0 ) for some ϕ0 ∈ V ∗ . Using Exercise 7.12 again, for each x ∈ V we have
ϕ0 (x) = ιV ∗ (ϕ0 )(ιV (x)) = ϕ(ιV (x)) = 0. But this means ϕ0 = 0, thus ϕ = 0, a contradiction.
7.19 Remark 1. Since `∞ (S, F) ∼ = `1 (S, F)∗ , the theorem implies that also `∞ (S, F) is not
reflexive for infinite S.
2. More generally, for non-reflexive E none of the spaces E ∗ , E ∗∗ , E ∗∗∗ , . . . is reflexive, so that
E $ E ∗∗ $ E ∗∗∗∗ $ · · · and E ∗ $ E ∗∗∗ $ E ∗∗∗∗∗ $ · · · , and we have two somewhat mysterious
successions of ever larger spaces! There do not seem to be many general results about this, but
see Lemma B.10(iv). Even understanding C(X, R)∗∗ for compact X is complicated, cf. [33]. 2
8.2 Theorem [Helly 1912, Hahn, Banach 1922] Let E be a Banach space, F a normed space
and F ⊆ B(E, F ) pointwise bounded. Then F is uniformly bounded.
Proof. Assume that F is not uniformly bounded. Then the sets Fn = {A ∈ F | kAk ≥ 4n }
are all non-empty, so that using ACω (axiom of countable choice), we can pick an An ∈ Fn for
each n ∈ N. By definition of kAn k, the sets Xn = {x ∈ E | kxk ≤ 1, kAn xk ≥ (2/3)kAn k} are all
non-empty, so that using ACω again, we can choose an xn ∈ Xn for each n ∈ N.
Applying the triangle inequality to Az = (1/2)(A(y + z) − A(y − z)) gives
kAzk = (1/2) kA(y + z) − A(y − z)k ≤ (1/2)(kA(y + z)k + kA(y − z)k) ≤ max(kA(y + z)k, kA(y − z)k).
we have kAn yn k ≥ (2/3) 3−n kAn k for all n. (For n = 1 this is true since y1 = x1 .) Since (8.1)
involves no further free choices, this inductive definition can be formalized in ZF (which we
don’t do here, see [21]).
With (8.1) and kxn k ≤ 1 for all n, we have kyn+1 − yn k ≤ 3−(n+1) ∀n. Now for all m > n
kym − yn k = kΣ_{k=n}^{m−1} (yk+1 − yk )k ≤ Σ_{k=n}^∞ 3−(k+1) = 3−(n+1) · 1/(1 − 1/3) = (1/2) 3−n ,
so that {yn } is a Cauchy sequence, thus convergent to some y ∈ E by completeness. With
ky − yn k ≤ (1/2) 3−n , kAn yn k ≥ (2/3) 3−n kAn k and kAn k ≥ 4n for all n we finally have
kAn yk ≥ kAn yn k − kAn k ky − yn k ≥ kAn k ((2/3) 3−n − (1/2) 3−n ) = (1/6) 3−n kAn k ≥ (1/6)(4/3)^n → ∞,
contradicting the pointwise boundedness of F at y.
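Completeness of E is essential in Theorem 8.2. A standard illustration (sketched below in Python/NumPy, with ad hoc names) is the space c00 of finitely supported sequences with the sup-norm, on which the functionals ϕn (x) = n · x(n) are pointwise bounded but not uniformly bounded:

    import numpy as np

    def phi(n, x):                        # x given as a finite array of its nonzero initial part
        return n * x[n] if n < len(x) else 0.0

    x = np.array([1.0, -2.0, 0.5])        # a fixed element of c_00
    print(max(abs(phi(n, x)) for n in range(100)))   # finite: sup_n |phi_n(x)| < infinity

    for n in (1, 10, 100):                # but ||phi_n|| = n, witnessed by the n-th unit vector
        e = np.zeros(n + 1); e[n] = 1.0
        print(n, abs(phi(n, e)))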
8.3 Remark The above method of proof is called the gliding (or sliding) hump method and
is more than 100 years old. (See also Section B.7 for another use of this method.) Nowadays,
the above theorem is usually deduced from Baire’s theorem, cf. Appendix A.5. As mentioned
there, the latter is equivalent to the axiom DCω of countable dependent choice, whereas above
we only used the weaker axiom ACω of countable choice. The above argument was discovered
only a few years ago and published [21] in 2017! 2
8.5 Corollary (Banach-Steinhaus) 2728 If E is a Banach space, F a normed space and
the sequence {An } ⊆ B(E, F ) is strongly convergent then the map A = s-lim An is bounded,
thus in B(E, F ).
Proof. The convergence of {An x} ⊆ F for each x ∈ E implies boundedness of {An x | n ∈ N}
for each x, so that F = {An | n ∈ N} ⊆ B(E, F ) is pointwise bounded and therefore uniformly
bounded by Theorem 8.2. Thus there is T such that kAn k ≤ T ∀n, so that kAn xk ≤ T kxk ∀x ∈
E, n ∈ N. With An x → Ax this implies kAxk ≤ T kxk for all x, thus kAk ≤ T < ∞.
8.6 Remark Clearly An →s A is equivalent to kAn − Akx → 0 for all x ∈ E, where kAkx :=
kAxk is a seminorm on B(E, F ) for each x ∈ E. If kAkx = 0 for all x ∈ E then Ax = 0 ∀x ∈ E,
thus A = 0. Thus the family F = {k · kx | x ∈ E} is separating and induces a locally convex
topology on B(E, F ), the strong operator topology τsot . Norm convergence kAn −Ak → 0 clearly
implies strong convergence An →s A, but usually the strong (operator) topology is strictly weaker
(despite its name) than the norm topology. See the following exercise for an example. 2
8.7 Exercise Let 1 ≤ p < ∞ and V = `p (N, F). For each m ∈ N define Pm ∈ B(V ) by
(Pm f )(n) = f (n) for n ≥ m and (Pm f )(n) = 0 if n < m. Prove Pm →s 0, but kPm k = 1 ∀m,
thus Pm 6→ 0 in norm.
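For p = 2 the phenomenon of Exercise 8.7 is easy to see numerically. The following Python/NumPy sketch (illustrative, with sequences truncated to finitely many terms and indices starting at 0) shows kPm f k → 0 for a fixed f while kPm k = 1:

    import numpy as np

    N = 10000
    f = 1.0 / np.arange(1, N + 1)                  # f(n) = 1/(n+1), an element of l^2
    for m in (1, 10, 100, 1000):
        print(m, np.sqrt(np.sum(f[m:]**2)))        # ||P_m f||_2, decreasing to 0

    e = np.zeros(N); e[50] = 1.0                   # unit vector at index 50
    print(np.sqrt(np.sum(e[50:]**2)))              # ||P_50 e|| = 1 = ||e||, so ||P_m|| = 1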
8.8 Exercise Let V be a separable Banach space and B ⊆ B(V ) a bounded subset.
(i) Prove: If S ⊆ V is dense and a net {Aι } ⊆ B satisfies kAι xk → 0 for all x ∈ S then
kAι xk → 0 for all x ∈ V , thus Aι → 0 in the strong operator topology.
(ii) Prove that the topological space (B, τsot ) is metrizable.
(iii) BONUS: Prove that (B(V ), τsot ) is not metrizable if V is infinite dimensional.
With Cauchy-Schwarz and kyk ≤ 1 we have |hAx, yi| ≤ kAxk. Thus F is pointwise bounded
and therefore uniformly bounded by Theorem 8.2. Thus there is an M ∈ [0, ∞) such that
|hAx, yi| = |hx, Ayi| ≤ M kxk for all y ∈ H with kyk ≤ 1, and this implies kAk ≤ M .
8.10 Remark The Hellinger-Toeplitz Theorem shows that on a Hilbert space H there are no
unbounded linear operators A : H → H satisfying hAx, yi = hx, Ayi ∀x, y. This is a typical
example of a ‘no-go-theorem’. Occasionally such results are a nuisance. After all, the operator
of multiplication by n on `2 (N) ‘obviously’ is self-adjoint. What Hellinger-Toeplitz really says
is that such an operator cannot be defined everywhere, i.e. on all of H. This leads to the notion
of symmetric operators, and also illustrates that no-go theorems often can be circumvented by
27
Hugo Steinhaus (1887-1972). Polish mathematician
28
In the literature, one can find either this result or Theorem 8.2 denoted as ‘Banach-Steinhaus theorem’.
29
Ernst David Hellinger (1883-1950), Otto Toeplitz (1881-1940). German mathematicians. Both were forced into
exile in 1939.
generalizing the setting. This is the case here, since the Hellinger-Toeplitz theorem only applies
to operators that are defined everywhere. 2
8.13 Theorem Let E be a Banach space, F a normed space and F ⊆ B(E, F ). Then either
F is uniformly bounded or the set {x ∈ E | supA∈F kAxk = ∞} ⊆ E is dense Gδ .
Proof. The map F → R≥0 , x 7→ kxk is continuous and each A ∈ F is bounded, thus continuous.
Therefore the map fA : E → R≥0 , x 7→ kAxk is continuous for every A ∈ F. Defining for each
n ∈ N the set
Vn = {x ∈ E | sup_{A∈F} kAxk > n},
the definition of sup implies
Vn = {x ∈ E | ∃A ∈ F : kAxk > n} = ∪_{A∈F} {x ∈ E | kAxk > n} = ∪_{A∈F} fA−1 ((n, ∞)),
8.4 Application: A dense set of continuous functions with divergent Fourier series
Let f : R → C be 2π-periodic, i.e. f (x + 2π) = f (x) ∀x, and integrable over finite intervals.
Define
cn (f ) = (1/2π) ∫_0^{2π} f (x)e−inx dx (8.2)
and
Sn (f )(x) = Σ_{k=−n}^n ck (f )eikx , n ∈ N. (8.3)
The fundamental problem of the theory of Fourier series is to find conditions for the conver-
gence Sn (f )(x) → f (x) as n → ∞, where convergence can be understood as (possibly almost)
everywhere pointwise or w.r.t. some norm, like k · k2 (as in Example 5.44) or k · k∞ . Here we
will discuss only continuous functions and we identify continuous 2π-periodic functions with
continuous functions on S 1 . It is not hard to show that Sn (f )(x) → f (x) if f is differentiable
at x (or just Hölder continuous: |f (x0 ) − f (x)| ≤ C|x0 − x|D with C, D > 0 for x0 near x) and
that convergence is uniform when f is continuously differentiable (or the Hölder condition holds
uniformly in x, x0 ). (See any number of books on Fourier analysis, e.g. [76, 34].)
Assuming only continuity of f one can still prove that limn→∞ Sn (f )(x) = f (x) if the limit
exists, but there actually exist continuous functions f such that Sn (f )(x) diverges at some
x. Such functions were first constructed in the 1870s using ‘condensation of singularities’, a
relative and precursor of the gliding hump method. Nowadays, most textbook presentations of
such functions are based on Lemma 8.15 below combined with either the uniform boundedness
theorem or constructions ‘by hand’, see e.g. [34, Section II.2], that are quite close in spirit to
the uniform boundedness method.
However, individual examples of continuous functions with divergent (in a point) Fourier
series can be produced in a totally constructive fashion, avoiding all choice axioms! (See [49]
for a very classical example.) But using non-constructive arguments seems unavoidable if one
wants to prove that there are many such functions as in the following:
8.14 Theorem There is a dense Gδ -set X ⊆ C(S 1 ) such that {Sn (f )(0)}n∈N diverges for each
f ∈ X.
Proof. Inserting (8.2) into (8.3) we obtain
Sn (f )(x) = (1/2π) Σ_{k=−n}^n eikx ∫_0^{2π} f (t)e−ikt dt = (1/2π) ∫_0^{2π} f (t) Σ_{k=−n}^n eik(x−t) dt = (Dn ? f )(x),
where ? denotes convolution, defined for 2π-periodic f, g by (f ? g)(x) = (1/2π) ∫_0^{2π} f (t)g(x − t) dt,
and
Dn (x) := Σ_{k=−n}^n eikx = sin((n + 1/2)x) / sin(x/2)
is the Dirichlet kernel. The quickest way to check the last identity is the telescoping calculation
(eix/2 − e−ix/2 )Dn (x) = Σ_{k=−n}^n eix(k+1/2) − Σ_{k=−n}^n eix(k−1/2) = eix(n+1/2) − e−ix(n+1/2) ,
together with eix − e−ix = 2i sin x. Since Dn (x) is an even function, we have
ϕn (f ) := Sn (f )(0) = (1/2π) ∫_0^{2π} f (x)Dn (x) dx.
It is clear that the norm of the map ϕn : (C(S 1 ), k · k∞ ) → C is bounded above by kDn k1 .
For gn (x) = sgn(Dn (x)) we have ϕn (gn ) = (2π)−1 ∫_0^{2π} |Dn (x)| dx =: kDn k1 . While gn is not
continuous, we can find a sequence of continuous gn,m bounded by 1 such that gn,m → gn
pointwise as m → ∞. Now Lebesgue's dominated convergence theorem implies ϕn (gn,m ) → ϕn (gn ) =
kDn k1 , which implies kϕn k = kDn k1 . By Lemma 8.15 below, kDn k1 → ∞ as n → ∞. Thus the
family F = {ϕn } ⊆ B(C(S 1 ), C) is not uniformly bounded. Now Theorem 8.13 implies that the
set X = {f ∈ C(S 1 , C) | {Sn (f )(0)} is unbounded} is dense Gδ .
8.15 Lemma We have kDn k1 ≥ (4/π 2 ) log n for all n ∈ N.
Proof. Using | sin x| ≤ |x| for all x ∈ R, we compute
kDn k1 = (1/2π) ∫_{−π}^π |Dn (x)| dx ≥ (2/π) ∫_0^π |sin((n + 1/2)x)| dx/x
= (2/π) ∫_0^{(n+1/2)π} |sin x| dx/x ≥ (2/π) Σ_{k=1}^n ∫_{(k−1)π}^{kπ} (|sin x|/x) dx
≥ (2/π) Σ_{k=1}^n (1/(kπ)) ∫_0^π sin x dx = (4/π 2 ) Σ_{k=1}^n 1/k ≥ (4/π 2 ) log n,
where we used Σ_{k=1}^n 1/k ≥ ∫_1^{n+1} dx/x = log(n + 1) > log n.
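The growth of kDn k1 is easy to observe numerically. The following Python/NumPy sketch (illustrative, replacing the integral by a Riemann sum) compares kDn k1 with the lower bound (4/π²) log n:

    import numpy as np

    def dirichlet_L1_norm(n, M=200000):
        # (1/2pi) * integral over [-pi, pi) of |D_n(x)| dx, via a Riemann sum at midpoints
        x = (np.arange(M) + 0.5) / M * 2 * np.pi - np.pi
        D = np.sin((n + 0.5) * x) / np.sin(x / 2)
        return np.mean(np.abs(D))

    for n in (1, 10, 100, 1000):
        print(n, round(dirichlet_L1_norm(n), 3), round(4 / np.pi**2 * np.log(n), 3))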
9.2 Lemma Let E be a Banach space, F a normed space (real or complex) and T : E → F a
linear map. Assume also that there are m > 0 and r ∈ (0, 1) such that for every y ∈ F there is
an x0 ∈ E with kx0 kE ≤ mkykF and ky − T x0 kF ≤ rkykF . Then for every y ∈ F there is an
x ∈ E such that kxkE ≤ (m/(1 − r)) kykF and T x = y. In particular, T is surjective.
Proof. It suffices to consider the case kyk = 1. By assumption, there is x0 ∈ E such that
kx0 k ≤ m and ky − T x0 k ≤ r. Now, applying the hypothesis to y − T x0 instead of y, we find
31
Juliusz Schauder (1899-1943). Polish mathematician. Killed by the Gestapo.
an x1 ∈ E with kx1 k ≤ rm and ky − T (x0 + x1 )k ≤ r2 . Continuing this inductively (thus using
DCω !) we obtain a sequence {xn } such that for all n ∈ N
kxn k ≤ rn m, (9.1)
ky − T (x0 + x1 + · · · + xn )k ≤ rn+1 . (9.2)
Now, (9.1) together with completeness of E implies, cf. Proposition 3.2, that Σ_{n=0}^∞ xn converges
to an x ∈ E with
kxk ≤ Σ_{n=0}^∞ kxn k ≤ Σ_{n=0}^∞ rn m = m/(1 − r),
and taking n → ∞ in (9.2) gives y = T x.
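The iteration in the proof of Lemma 9.2 is a perfectly practical algorithm: if one can only solve T x = y approximately, with a fixed relative error r < 1, then applying the approximate solver to successive residuals converges to an exact solution. A minimal Python/NumPy sketch (illustrative; T and its crude approximation are chosen ad hoc):

    import numpy as np

    rng = np.random.default_rng(6)
    n = 8
    T = np.eye(n) + 0.1 * rng.normal(size=(n, n))
    T_approx = np.round(T, 1)                      # a crude stand-in for T

    def approx_solve(y):                           # returns x0 with ||y - T x0|| <= r ||y||, r < 1
        return np.linalg.solve(T_approx, y)

    y = rng.normal(size=n)
    x, resid = np.zeros(n), y.copy()
    for k in range(30):
        x += approx_solve(resid)                   # add a correction for the current residual
        resid = y - T @ x
        if k % 10 == 0:
            print(k, np.linalg.norm(resid))        # geometric decay of the residual
    print(np.allclose(T @ x, y))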
9.3 Proposition Let E be a Banach space, F a normed space and T ∈ B(E, F ) such that
B F (0, β) ⊆ \overline{T (B E (0, α))} for certain α, β > 0. Then B F (0, β 0 ) ⊆ T (B E (0, α)) if 0 < β 0 < β.
Proof. If 0 < β 0 < β 00 < β then B F (0, β 00 ) ⊆ B F (0, β) ⊆ \overline{T B E (0, α)}. Equivalently (since x 7→ λx
is a homeomorphism for every λ > 0), B F (0, 1) ⊆ \overline{T B E (0, α/β 00 )}. With the definition of the
closure, this means that for every y ∈ F, kyk ≤ 1 and ε > 0 there exists x ∈ E with kxk < α/β 00
and kT x − yk < ε. Equivalently, for every y ∈ F and ε > 0 there is x ∈ E with kxk < (α/β 00 )kyk and
kT x − yk < εkyk. Now Lemma 9.2 gives (assuming ε < 1) that for every y ∈ F there is x ∈ E
with T x = y and kxk ≤ (α/β 00 )/(1 − ε) · kyk. If we choose ε ∈ (0, 1 − β 0 /β 00 ) then (β 0 /β 00 )/(1 − ε) < 1, so that for every
y ∈ F with kyk ≤ β 0 there is x ∈ E with T x = y and kxk < α. Thus B F (0, β 0 ) ⊆ T B E (0, α).
Since F is a complete metric space and has non-empty interior F ° = F 6= ∅, Corollary A.21 of
Baire's theorem implies that at least one of the closed sets \overline{T (B E (0, n))} has non-empty interior.
Thus there are n ∈ N, y ∈ F, ε > 0 such that B F (y, ε) ⊆ \overline{T (B E (0, n))}. If x ∈ B F (0, ε) then
2x = (y + x) − (y − x), thus 2B F (0, ε) ⊆ B F (y, ε) − B F (y, ε) and thus
B F (0, ε) ⊆ (1/2)(B F (y, ε) − B F (y, ε)) ⊆ (1/2)(\overline{T (B E (0, n))} − \overline{T (B E (0, n))}) ⊆ \overline{T (B E (0, n))}.
Now Proposition 9.3 implies that B F (0, ε 0 ) ⊆ T (B E (0, n)) for some ε 0 > 0 (actually every
ε 0 ∈ (0, ε), but we don't need this). By linearity we have that for every δ > 0 there is a δ 0 > 0
such that B F (0, δ 0 ) ⊆ T B E (0, δ). Now using the linearity of T , proving its openness is routine.
9.4 Exercise Let E, F be normed spaces and T : E → F linear such that for every δ > 0
there is δ 0 > 0 for which B F (0, δ 0 ) ⊆ T B E (0, δ). Prove that T is open.
9.5 Corollary (Banach 1929) If E, F are Banach spaces and T ∈ B(E, F ) (thus linear and
bounded) is a bijection then also T −1 is bounded. (Thus T is a homeomorphism.)
Proof. By Theorem 9.1, T is open. Thus the inverse T −1 that exists by bijectivity (and clearly
is linear) is continuous, thus bounded by Lemma 3.13.
9.6 Definition A linear map A : E → F between normed spaces that is a bijection and a
homeomorphism is called an isomorphism of normed spaces. (Not to be confused with isometric
isomorphisms, for which kAxk = kxk ∀x ∈ E.) If an (isometric) isomorphism A : E → F exists,
we write E ' F (E ∼= F ).
9.7 Remark 1. If k·k1 , k·k2 are norms on V then idV : (V, k·k1 ) → (V, k·k2 ) is an isomorphism
(isometric isomorphism) if and only if the two norms are equivalent (equal).
2. The Bounded Inverse Theorem is a special case of the Open Mapping Theorem, but it
also implies the latter: Assume that the former holds, that E, F are Banach spaces and that
T ∈ B(E, F ) is surjective. The kernel ker T ⊆ E is closed, so that the quotient space E/ ker T is
a Banach space, and the quotient map p : E → E/ ker T is continuous and open by Proposition
6.2. Since T is surjective, the induced map T̃ : E/ ker T → F is a continuous bijection, so that
T̃ −1 : F → E/ ker T is continuous by the Bounded Inverse Theorem. Equivalently, T̃ is open,
so that T = T̃ ◦ p is open as the composite of two open maps.
3. Also the Bounded Inverse Theorem has an interesting application to Fourier analysis:
For f ∈ L1 ([0, 2π]), we define the Fourier coefficients f̂ (n) = (2π)−1 ∫_0^{2π} f (t)e−int dt for all
n ∈ Z. It is immediate that kf̂ k∞ ≤ kf k1 , and it is not hard to prove the Riemann-Lebesgue
theorem f̂ ∈ c0 (Z, C) and injectivity of the resulting map L1 ([0, 2π]) → c0 (Z, C), f 7→ f̂ , see
e.g. [63, Theorem 5.15] or [34]. If this map was surjective, the Bounded Inverse Theorem would
give kf k1 ≤ Ckf̂ k∞ for some C > 0. For the Dirichlet kernel it is immediate that D̂n (m) = χ[−n,n] (m), thus
kD̂n k∞ = 1 for all n ∈ N. Since we know that kDn k1 → ∞, we would have a contradiction.
Thus L1 ([0, 2π]) → c0 (Z, C), f 7→ f̂ is not surjective.
4. The Open Mapping Theorem can be generalized to the case where E is an F -space, i.e.
a TVS admitting a complete translation-invariant metric. See [64, Theorem 2.11]. 2
9.8 Exercise (i) Let V be an infinite dimensional Banach space. Prove that all finite di-
mensional subspaces have empty interior (in V !), then use Baire’s theorem to prove that
V cannot have a countable Hamel basis. (Thus dim V > ℵ0 = #N.)
(ii) Let V be an infinite dimensional Banach space and {xn }, {ϕn } sequences as in Exercise
7.11. For every N ⊆ N define xN = Σ_{n∈N} 2−n xn . Now use Lemma B.9 to find a linearly
independent family in V of cardinality c = #R, so that dim V ≥ c (Hamel dimension).
(iii) Prove that every separable normed space V has cardinality at most c and deduce dim V ≤ c.
(iv) Conclude that every infinite dimensional separable Banach space has Hamel dimension c.
9.9 Remark 1. The result of Exercise 9.8 (i) can be proven using Riesz’ Lemma 14.1 instead of
Baire’s theorem, see [4], but (as in most such cases) the proof uses countable dependent choice
DCω like the proof of Baire’s theorem.
2. If the continuum hypothesis (CH) is true, Exercise 9.8 (ii) readily follows from (i)+(iii).
But the proof of (ii) indicated above is independent of CH. 2
9.10 Exercise Give counterexamples showing that both spaces appearing in the Bounded
Inverse Theorem must be complete.
Hint: For complete E, incomplete F use `p spaces, and for E incomplete, F complete use
F = `1 (N, R) and the fact that it has Hamel dimension c = #R, cf. Exercise 9.8(iv).
9.11 Exercise Let V be a Banach space.
(i) Let W, Z ⊆ V be closed subspaces such that W + Z = V and W ∩ Z = {0}. Give
W ⊕ Z the norm k(w, z)k = kwk + kzk. Prove that α : W ⊕ Z → V, (w, z) 7→ w + z is a
homeomorphism, thus an isomorphism of Banach spaces.
(ii) If W ⊆ V is complemented then:
– There is a bounded linear map P ∈ B(V ) with P 2 = P and W = P V . (The converse
was proven in Exercise 6.12.)
– Every closed Z ⊆ V complementary to W is isomorphic to V /W as Banach spaces.
9.12 Exercise Let V be a Banach space and W, Z ⊆ V closed linear subspaces satisfying
W ∩ Z = {0}, so that W + Z ∼ = W ⊕ Z algebraically. Prove that W + Z ⊆ V is closed if and
only if the projection W + Z → W : w + z 7→ w is continuous.
[There is a generalization without the assumption W ∩ Z = {0}, but we don’t pursue this.]
9.13 Exercise Let V, W be Banach spaces and A ∈ B(V, W ) such that dim(W/AV ) < ∞.
(i) Prove that AV ⊆ W is closed, assuming injectivity of A.
(ii) Remove the injectivity assumption.
9.14 Remark The quotient W/AV is called the (algebraic) cokernel of A. Some authors define
the cokernel as W/\overline{AV }. But we don't do this, since finite dimensionality of W/\overline{AV } (the
topological cokernel) is a much weaker condition on A and doesn't imply closedness of AV . 2
9.15 Exercise It is not true that every subspace W ⊆ V with dim(V /W ) < ∞ of a Banach
space V is closed! Find a counterexample! (Hint: codimension one.)
9.17 Lemma Let E, F be normed spaces and T : E → F a linear map (not assumed bounded).
Then the following are equivalent:
(i) The graph G(T ) = {(x, T x) | x ∈ E} ⊆ E ⊕ F of T is closed.
(ii) Whenever {xn }n∈N ⊆ E is a sequence such that xn → x ∈ E and T xn → y ∈ F , we have
y = T x.
Proof. Since E ⊕ F is a metric space, G(T ) is closed if and only if it contains the limit (x, y)
of every sequence {(xn , yn )} in G(T ) that converges to some (x, y) ∈ E ⊕ F . But a sequence in
G(T ) is of the form {(xn , T xn )}, and (x, y) ∈ G(T ) ⇔ y = T x.
9.18 Remark Operators with closed graph (in particular unbounded ones) are often called
closed. But this must not be confused with their closedness as a map, i.e. the property of
sending closed sets to closed sets! Bounded linear operators between Banach spaces have closed
graphs, but need not be closed maps. 2
9.19 Theorem (Banach 1929) If E, F are Banach spaces, then a linear map T : E → F is
bounded if and only if its graph is closed.
Proof. Let E, F be Banach spaces, and let T : E → F be linear. If T is bounded then it is
continuous, thus G(T ) is closed by Exercise 9.16. Now assume T , thus G(T ), is closed. The
Cartesian product E ⊕F with norm k(e, f )k = kek+kf k is a Banach space. The linear subspace
G(T ) ⊆ E⊕F is closed by assumption, thus a Banach space. Since the projection p1 : G(T ) → E
is a bounded bijection, by Corollary 9.5 it has a bounded inverse p1−1 : E → G(T ). Then also
T = p2 ◦ p1−1 is bounded.
9.20 Exercise Show that the Bounded Inverse Theorem (Corollary 9.5) can be deduced from
the Closed Graph Theorem. (Thus the three main results of this section are ‘equivalent’.)
9.21 Remark 1. The Hellinger-Toeplitz Theorem (Corollary 8.9) can also be deduced from
the Closed Graph Theorem: Let {xn } ⊆ H be a sequence converging to x ∈ H and assume that
Axn → y. Then
hAx, zi = hx, Azi = lim_n hxn , Azi = lim_n hAxn , zi = hy, zi ∀z ∈ H,
thus Ax = y. Thus A has closed graph and therefore is bounded by Theorem 9.19.
2. Since we deduced the Hellinger-Toeplitz theorem from the weak version of the uniform
boundedness theorem, it is moderately interesting [But not too much: We needed DCω to prove
the closed graph theorem, whereas we know that ACω suffices for proving the weak version of
the uniform boundedness theorem! And with DCω one has the better Theorem 8.13.] that
the latter can also be deduced from the closed graph theorem: 2
9.22 Exercise Let E, F be Banach spaces and F ⊆ B(E, F ) a pointwise bounded family. Use
the Closed Graph Theorem to prove that F is uniformly bounded, as follows:
(i) Prove that FF = {{yA }A∈F ∈ Fun(F, F ) | supA∈F kyA k < ∞} is a Banach space.
(ii) Define T : E → Fun(F, F ) by T x = {Ax}A∈F and show that pointwise boundedness of F is equivalent to T (E) ⊆ FF .
(iii) Prove that the graph G(T ) ⊆ E ⊕ FF of T is closed. (Thus T is bounded by Theorem
9.19.)
(iv) Deduce uniform boundedness of F from the boundedness of T .
(v) Remove the requirement that F be complete.
It is obvious that boundedness below of a map implies injectivity, but the converse is not
true. Furthermore, the image AE = {Ax | x ∈ E} of a linear map A : E → F need not
be closed. In particular, the image can be dense without A being surjective. The operator
A ∈ B(E), where E = `2 (N, C), defined by (Af )(n) = f (n)/n exemplifies both phenomena.
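This operator can be explored numerically by truncating to finitely many coordinates. The following Python/NumPy sketch (illustrative only) shows that A is not bounded below and that y(n) = 1/n lies in the closure of the image, while the only candidate preimage, the constant sequence 1, is not in `2:

    import numpy as np

    N = 10000
    n = np.arange(1, N + 1)

    # Not bounded below: the unit vectors e_m satisfy ||A e_m|| = 1/m -> 0
    print([1.0 / m for m in (1, 10, 100, 1000)])

    # y(n) = 1/n is in l^2 and in the closure of AE (apply A to truncations of the
    # constant sequence 1), but the truncated preimages have exploding norms:
    y = 1.0 / n
    for M in (10, 100, 1000, 10000):
        f_M = np.zeros(N); f_M[:M] = 1.0          # truncated preimage
        print(M, np.linalg.norm(f_M),             # grows like sqrt(M)
              np.linalg.norm(f_M / n - y))        # ||A f_M - y|| -> 0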
9.24 Exercise Let E, F be be normed spaces, where E is finite dimensional, and let A : E → F
be an injective linear map. Prove that A is bounded below.
In particular, A is bounded below if and only if its (set-theoretic) inverse A−1 is bounded.
Proof. Using the invertibility of A, thus bijectivity of x 7→ Ax, we have
kA−1 k = sup_{y∈F\{0}} kA−1 yk/kyk = sup_{x∈E\{0}} kxk/kAxk = ( inf_{x∈E\{0}} kAxk/kxk )−1 = ( inf_{kxk=1} kAxk )−1 .
9.26 Lemma If E is a Banach space, F is a normed space and A : E → F is a linear map that
is bounded and bounded below then AE ⊆ F is closed.
Proof. Since A is bounded below, it is injective, thus $\tilde A : E \to AE$ (the map A, with the codomain replaced by AE) is a bijection. Now $\tilde A^{-1} : AE \to E$ is bounded by Lemma 9.25. Thus if $\{f_n\}$ is a Cauchy sequence in AE, $\{\tilde A^{-1} f_n\}$ is a Cauchy sequence in E. Since E is complete, there is $e \in E$ such that $\tilde A^{-1} f_n \to e$. Since A is bounded, $\{f_n = A(\tilde A^{-1} f_n)\}$ converges to $Ae \in AE$. Thus AE is complete, thus closed.
9.27 Definition If E, F are normed spaces then A ∈ B(E, F ) is called invertible if there is a
B ∈ B(F, E) such that BA = idE and AB = idF .
9.28 Proposition Let E, F be Banach spaces and A ∈ B(E, F ). Then the following are
equivalent:
(i) A is invertible.
(ii) A is injective and surjective.
(iii) A is bounded below and has dense image.
Proof. It is clear that invertibility implies injectivity and surjectivity, thus in particular dense
image. Since A−1 is bounded, Lemma 9.25 gives that A is bounded below.
If (ii) holds then the set-theoretic inverse, clearly linear, is bounded by the bounded inverse theorem (Corollary 9.5). Thus A is invertible in the sense of Definition 9.27.
Assume (iii). By boundedness below, A is injective. And AE ⊆ F is dense by assumption
and closed by Lemma 9.26, thus AE = F . Thus A is injective and surjective. Now boundedness
of the inverse A−1 follows from boundedness below of A and Lemma 9.25.
9.29 Remark 1. Note that dense image is weaker than surjectivity, while boundedness below
is stronger than injectivity. The point of criterion (iii) is that it can be quite hard to verify
surjectivity of A directly, while density of the image usually is easier to establish.
2. The material on bounded below maps discussed so far, including (i)⇔(iii) in Proposition
9.28, was entirely elementary and could be moved to Section 3. 2
9.31 Exercise Let H be a Hilbert space and A ∈ B(H) such that |hAx, xi| ≥ Ckxk2 for some
C > 0. Prove that A is invertible and kA−1 k ≤ C −1 .
• The point spectrum σp (A) consists of those λ ∈ F for which A − λ1E is not injective.
Equivalently, σp (A) consists of the eigenvalues of A.
• The continuous spectrum σc (A) consists of those λ ∈ F for which A − λ1E is injective, but not surjective, while it has dense image, i.e. $\overline{(A - \lambda 1_E)E} = E$.
• The residual spectrum σr (A) consists of those λ ∈ F for which A − λ1E is injective and $\overline{(A - \lambda 1_E)E} \ne E$.
We have some immediate observations:
• It is obvious by construction that the sets σp (A), σc (A), σr (A) are mutually disjoint and
have σ(A) as their union.
• Clearly 0 ∈ σ(A) is equivalent to non-invertibility of A and 0 ∈ σp (A) to ker A 6= {0}.
• If E is finite dimensional then we know from linear algebra that injectivity and surjectivity
of any A ∈ B(E) are equivalent. Thus for all operators on a finite dimensional space we
have σc (A) = σr (A) = ∅, thus σ(A) = σp (A).
• If E is infinite dimensional, the situation is much more complicated, thus more interesting. For example, the right shift R on ℓ²(N) is injective, but not surjective. Thus 0 ∈ σ(R), while 0 ∉ σp (R).
• If λ ∈ σp (A) then there is a non-zero x ∈ E with Ax = λx. Then $A^n x = \lambda^n x$ for all n ∈ N. With the definition of $\|A\|$ it follows that $|\lambda| \le \inf_{n\in\mathbb N} \|A^n\|^{1/n}$. (This can be smaller than $\|A\|$, e.g. if A is nilpotent, i.e. $A^n = 0$ for some n ∈ N; see the worked example after this list.)
• There are other interesting subsets of σ(A), motivated by Proposition 9.28(iii):
– The approximate point spectrum σapp (A) = {λ ∈ F | A − λ1 not bounded below}.
Clearly σp (A) ⊆ σapp (A).
– The compression spectrum $\sigma_{cp}(A) = \{\lambda \in \mathbb F \mid \overline{(A - \lambda 1)E} \ne E\}$. Obviously σ(A) = σapp (A) ∪ σcp (A) (but the two need not be disjoint). And σr (A) = σcp (A)\σp (A).
– The discrete spectrum σd (A) ⊆ σp (A), cf. Exercise 10.42. The essential spectrum
σess (A) (or rather one version of it – there are others) is σ(A)\σd (A).
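Here is the small worked example announced above (not part of the original notes): let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \in B(\mathbb C^2)$. Then $\|A\| = 1$, but $A^2 = 0$, so $\|A^n\|^{1/n} = 0$ for all $n \ge 2$ and thus $\inf_{n\in\mathbb N} \|A^n\|^{1/n} = 0 < 1 = \|A\|$. Consistently, $\sigma(A) = \sigma_p(A) = \{0\}$, so every eigenvalue satisfies $|\lambda| = 0 \le \inf_n \|A^n\|^{1/n}$.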
10.3 Exercise Let V be a Banach space, A ∈ B(V ), and let W, Z ⊆ V be closed subspaces such
that W + Z = V, W ∩ Z = {0} and AW ⊆ W, AZ ⊆ Z. Prove σ(A) = σ(A|W ) ∪ σ(A|Z ) and
σt (A) = σt (A|W ) ∪ σt (A|Z ) for all t ∈ {p, c, r}.
Before we try to compute the spectra of some interesting operators, it is better to first prove
some general results, since they will be helpful also for studying examples. Remarkably one can
get rather far using only the fact that B(E) is a Banach algebra.
$\sigma(a) = \{\lambda \in \mathbb F \mid a - \lambda 1 \notin \mathrm{Inv}\,A\}.$
The spectral radius of a is r(a) = sup{|λ| | λ ∈ σ(a)}, where r(a) = 0 if σ(a) = ∅. (But we will
soon prove σ(a) 6= ∅ for all a ∈ A if A is normed.)
10.5 Remark It is clear that for an element of the Banach algebra B(E), where E is a Banach
space, this definition is equivalent to Definition 10.2. But in the present abstract setting there
is no distinction between point, continuous and residual spectrum. 2
As to our standard example of a Banach algebra not of the form B(V ) with V Banach:
10.6 Exercise (i) Let X be a compact Hausdorff space. Recall that (C(X, F), k · k∞ ) is a
Banach algebra. For f ∈ C(X, F), prove σ(f ) = f (X) ⊆ F.
(ii) If S is any set, `∞ (S, F) is a Banach algebra w.r.t. pointwise multiplication. If f ∈ `∞ (S, F),
prove σ(f ) = f (S).
There is another case where the spectrum is easy to determine:
10.8 Lemma If A is a unital algebra and a ∈ A is nilpotent then σ(a) = {0}, thus r(a) = 0.
Proof. If a ∈ A is nilpotent then the series $b = \sum_{n=0}^\infty a^n$ converges since it breaks off after finitely many terms. Now $(1-a)b = b(1-a) = \sum_{n=0}^\infty a^n - \sum_{n=1}^\infty a^n = 1$, thus $a - 1 \in \mathrm{Inv}\,A$. Since the same holds for $a/\lambda$ whenever $\lambda \ne 0$, we have $\sigma(a) \subseteq \{0\}$. Since no nilpotent a is invertible (why?) we have $\sigma(a) = \{0\}$.
(iv) Deduce that σ(ab) ∪ {0} = σ(ba) ∪ {0} and r(ab) = r(ba).
known from finite geometric sums. If 0 6= λ ∈ σ(a) then a/λ − 1 is not invertible, thus putting
z = a/λ in (10.1) gives that (a/λ)n − 1 is not invertible (since a product of two commuting
elements is invertible if and only if both are invertible by Exercise 10.11(i)). Thus λn ∈ σ(an ),
so that $r(a)^n \le r(a^n)$. Since this holds for all n ∈ N, using $r(b) \le \|b\|$ just proven, we have
$r(a) \le \inf_{n\in\mathbb N} r(a^n)^{1/n} \le \inf_{n\in\mathbb N} \|a^n\|^{1/n} \le \|a\|.$
10.14 Lemma Let A be a unital normed algebra. Then InvA is a topological group (w.r.t. the
norm topology).
Proof. It is clear that InvA is a group and that multiplication is continuous, since multiplication
A × A → A is jointly continuous (Remark 3.28). It remains to show that the inverse map
σ : InvA → InvA, a 7→ a−1 is continuous. To this purpose, let r, r + h ∈ InvA and put
(r + h)−1 = r−1 + k. We must show that khk → 0 implies kkk → 0. From 1 = (r−1 +
k)(r + h) = 1 + r−1 h + kr + kh we obtain r−1 h + kr + kh = 0. Multiplying this on the
right by r−1 we have r−1 hr−1 + k + khr−1 = 0, thus k = −r−1 hr−1 − khr−1 . Therefore
$\|k\| \le \|r^{-1}\|^2\|h\| + \|k\|\,\|h\|\,\|r^{-1}\|$, which is equivalent to $\|k\|(1 - \|h\|\,\|r^{-1}\|) \le \|r^{-1}\|^2\|h\|$ and, for $\|h\| < \|r^{-1}\|^{-1}$, to
$\|k\| \le \frac{\|r^{-1}\|^2}{1 - \|h\|\,\|r^{-1}\|}\,\|h\|.$
From this it is clear that $\|h\| \to 0$ implies $\|k\| \to 0$.
10.15 Corollary If A is a unital normed algebra and a ∈ A then the ‘resolvent map’ Ra :
C\σ(a) → A, λ 7→ (a − λ1)−1 is continuous.
10.16 Lemma Let A be a unital normed algebra and a ∈ A. Put $\nu = \inf_{n\in\mathbb N} \|a^n\|^{1/n}$. Then
(i) $\lim_{n\to\infty} \|a^n\|^{1/n} = \nu$.
(ii) For all $\mu > \nu$ we have $(a/\mu)^n \to 0$ as $n \to \infty$, but $(a/\nu)^n \not\to 0$ provided $\nu > 0$. (This is of course trivial if $\mu > \|a\|$, but our hypothesis is weaker when $\nu < \|a\|$.)
(iii) If $\nu = 0$ then $a \notin \mathrm{Inv}\,A$, thus $0 \in \sigma(a)$.
Proof. (i) With $\|a^n\| \le \|a\|^n$ we trivially have
$0 \le \nu = \inf_{n\in\mathbb N} \|a^n\|^{1/n} \le \liminf_{n\to\infty} \|a^n\|^{1/n} \le \limsup_{n\to\infty} \|a^n\|^{1/n} \le \|a\| < \infty. \qquad (10.2)$
By definition of ν, for every ε > 0 there is a k such that $\|a^k\|^{1/k} < \nu + \varepsilon$. Every m ∈ N is of the form m = sk + r with unique s ∈ N₀ and 0 ≤ r < k (division with remainder). Then $\|a^m\|^{1/m} \le \|a^k\|^{s/m}\,\|a\|^{r/m} \le (\nu + \varepsilon)^{sk/m}\,\|a\|^{r/m} \to \nu + \varepsilon$ as $m \to \infty$, since $sk/m \to 1$ and $r/m \to 0$. Thus $\limsup_m \|a^m\|^{1/m} \le \nu + \varepsilon$ for every ε > 0, which together with (10.2) proves the first claim.
(ii) If $\mu > \nu$, pick $\mu'$ with $\nu < \mu' < \mu$. By (i) there is $n_0$ such that $\|a^n\| \le \mu'^n$ for all $n \ge n_0$, whence $\|(a/\mu)^n\| \le (\mu'/\mu)^n \to 0$. On the other hand, for all n ∈ N we have $\|a^n\|^{1/n} \ge \nu$. With ν > 0 this implies $\|(a/\nu)^n\| \ge 1$ for all n, and therefore $(a/\nu)^n \not\to 0$.
(iii) Assume a ∈ InvA. Then there is b ∈ A such that ab = ba = 1. Then 1 = an bn , thus
with Remark 3.28 we have 1 ≤ k1k = kan bn k ≤ kan kkbn k ≤ kan kkbkn . Taking n-th roots, we
have 1 ≤ kan k1/n kbk, and taking the limit gives the contradiction 1 ≤ νkbk = 0. Thus if ν = 0
then a is not invertible, so that 0 ∈ σ(a).
Everything we did so far works over R and over C. (With the obvious exception of the ONB
{en : x 7→ einx } ⊆ L2 ([0, 2π], λ; C) in the discussion of Fourier series. But over R that can be
replaced by {cos nx | n ∈ N0 } ∪ {sin nx | n ∈ N}.) The rest of this section requires F = C, and
the same applies whenever we use Theorem 10.18 or Corollaries 10.21, 10.24.
10.17 Lemma Let A be a unital normed algebra over C, a ∈ A and λ ∈ C\{0} such that $\lambda S^1 \cap \sigma(a) = \emptyset$. Then for all n ∈ N we have $\left(\frac{a}{\lambda}\right)^n - 1 \in \mathrm{Inv}\,A$ and
$\Big(\Big(\frac{a}{\lambda}\Big)^n - 1\Big)^{-1} = \frac{1}{n}\sum_{k=1}^n \Big(\frac{a}{\lambda_k} - 1\Big)^{-1}, \quad \text{where } \lambda_k = e^{\frac{2\pi i}{n}k}\lambda. \qquad (10.3)$
Proof. For $0 \ne \lambda \in \mathbb C$ and n ∈ N, put $\lambda_k = \lambda e^{\frac{2\pi i}{n}k}$, where k = 1, . . . , n. (One should really write $\lambda_{n,k}$, but we suppress the n.) Then $\lambda_1, \ldots, \lambda_n$ are the solutions of $z^n = \lambda^n$, and we have $z^n - \lambda^n = \prod_k (z - \lambda_k)$. This is an identity in C[z], thus it also holds in every unital C-algebra A with z replaced by a ∈ A. Now assume $\lambda S^1 \cap \sigma(a) = \emptyset$ and let n ∈ N. Then $\lambda_k \notin \sigma(a)$ for all k = 1, . . . , n. Thus all $a - \lambda_k 1$ are invertible, and so is $a^n - \lambda^n 1 = \prod_k (a - \lambda_k 1)$. Thus also $\left(\frac{a}{\lambda}\right)^n - 1 \in \mathrm{Inv}\,A$, our first claim.
Putting $z = a/\lambda_k$ in (10.1) and observing $\lambda_k^n = \lambda^n$, we have
$\Big(\frac{a}{\lambda}\Big)^n - 1 = \Big(\frac{a}{\lambda_k}\Big)^n - 1 = \Big(\frac{a}{\lambda_k} - 1\Big)\Big(1 + \frac{a}{\lambda_k} + \cdots + \Big(\frac{a}{\lambda_k}\Big)^{n-1}\Big).$
Using the invertibility of $\frac{a}{\lambda_k} - 1$ for all k and of $\left(\frac{a}{\lambda}\right)^n - 1$, we can rewrite this as
$\Big(\frac{a}{\lambda_k} - 1\Big)^{-1} = \Big(\Big(\frac{a}{\lambda}\Big)^n - 1\Big)^{-1}\Big(1 + \frac{a}{\lambda_k} + \cdots + \Big(\frac{a}{\lambda_k}\Big)^{n-1}\Big) = \Big(\Big(\frac{a}{\lambda}\Big)^n - 1\Big)^{-1}\sum_{l=0}^{n-1}\Big(\frac{a}{\lambda}\Big)^l e^{-\frac{2\pi i}{n}kl}. \qquad (10.4)$
If l ∈ {1, . . . , n − 1} then $z = e^{-\frac{2\pi i}{n}l}$ satisfies $z \ne 1$ and $z^n = 1$, so that (10.1) gives
$\sum_{k=1}^n e^{-\frac{2\pi i}{n}kl} = z\sum_{k=0}^{n-1} z^k = z\,\frac{z^n - 1}{z - 1} = 0.$
Thus, summing (10.4) over k = 1, . . . , n, only l = 0 contributes, and the right-hand side equals $n\big(\big(\frac{a}{\lambda}\big)^n - 1\big)^{-1}$, yielding (10.3).
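As a sanity check of (10.3) (added here for illustration, not part of the original text), take n = 2: then $\lambda_1 = -\lambda$, $\lambda_2 = \lambda$, and (10.3) reduces to the partial fraction identity $(z^2 - 1)^{-1} = \frac12\big[(z-1)^{-1} + (-z-1)^{-1}\big]$ with $z = a/\lambda$, which one checks directly by bringing the right-hand side to a common denominator.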
10.18 Theorem (Beurling [34] 1938, Gelfand [35] 1939) Let A be a unital normed algebra over C (not necessarily complete) and a ∈ A. Then σ(a) ≠ ∅, and
$r(a) \ge \inf_{n\in\mathbb N} \|a^n\|^{1/n} = \lim_{n\to\infty} \|a^n\|^{1/n}. \qquad (10.5)$
If A is complete, equality holds in (10.5), which is then called the spectral radius formula.
[34] Arne Beurling (1905-1986). Swedish mathematician. Worked mostly on harmonic and complex analysis.
[35] Israel Moiseevich Gelfand (1913-2009). Outstanding Soviet mathematician. Many important contributions to many areas of mathematics, among which functional analysis and Banach algebras.
Proof. The equality of infimum and limit was Lemma 10.16(i). Once the ≥ is proven, combining
it with Proposition 10.12(ii) in the complete case gives r(a) = limn→∞ kan k1/n .
For a ∈ A, define ν as before. If ν = 0 then 0 ∈ σ(a) by Lemma 10.16(iii). Thus σ(a) 6= ∅
and (10.5) is trivially true.
From now on assume ν > 0. Assume that there is no λ ∈ σ(a) with |λ| ≥ ν. This implies
that (a − λ1)−1 exists for all |λ| ≥ ν and depends continuously on λ by Lemma 10.14. The
same holds (since |λ| ≥ ν > 0) for the slightly more convenient function
$\varphi : \{\lambda \in \mathbb C \mid |\lambda| \ge \nu\} \to A, \quad \lambda \mapsto \Big(\frac{a}{\lambda} - 1\Big)^{-1}.$
Now Lemma 10.17 gives for all λ with |λ| ≥ ν and n ∈ N that $\left(\frac{a}{\lambda}\right)^n - 1 \in \mathrm{Inv}\,A$, the inverse given by (10.3). Pick any η > ν. Since the annulus Λ = {λ ∈ C | ν ≤ |λ| ≤ η} is compact, the continuous map φ : Λ → A is uniformly continuous. I.e., for every ε > 0 we can find δ > 0 such that $\lambda, \lambda' \in \Lambda,\ |\lambda - \lambda'| < \delta \Rightarrow \|\varphi(\lambda) - \varphi(\lambda')\| < \varepsilon$. If ν < µ < ν + δ, we have $|\nu_k - \mu_k| = |\nu - \mu| < \delta$ and therefore $\|\varphi(\nu_k) - \varphi(\mu_k)\| < \varepsilon$ for all n ∈ N and k = 1, . . . , n. Combining this with (10.3) we have $\big\|\big(\big(\frac{a}{\nu}\big)^n - 1\big)^{-1} - \big(\big(\frac{a}{\mu}\big)^n - 1\big)^{-1}\big\| \le \frac1n \sum_{k=1}^n \|\varphi(\nu_k) - \varphi(\mu_k)\| < \varepsilon$ for all n ∈ N, so that:
$\forall \varepsilon > 0\ \exists \mu > \nu\ \forall n \in \mathbb N: \quad \Big\|\Big(\Big(\frac{a}{\nu}\Big)^n - 1\Big)^{-1} - \Big(\Big(\frac{a}{\mu}\Big)^n - 1\Big)^{-1}\Big\| < \varepsilon. \qquad (10.6)$
By Lemma 10.16(ii), µ > ν implies (a/µ)n → 0 as n → ∞. With continuity of the inverse
map, ((a/µ)n − 1)−1 → −1. Thus for n large enough we have k((a/µ)n − 1)−1 + 1k < ε, and
combining this with (10.6) we have k((a/ν)n − 1)−1 + 1k < 2ε. Since ε > 0 was arbitrary, we
have ((a/ν)n − 1)−1 → −1 as n → ∞ and therefore (a/ν)n → 0. This contradicts the other
part of Lemma 10.16(ii), so that our assumption that there is no λ ∈ σ(a) with |λ| ≥ ν is
false. Existence of such a λ obviously gives σ(a) 6= ∅ and r(a) ≥ ν, completing the proof. We
emphasize that completeness of A was not needed!
10.19 Remark 1. The standard proof of the above theorem, which requires completeness, uses the differentiability of the resolvent map Ra and a certain amount of complex analysis. The more elementary (which does not mean simple) proof given above, due to Rickart [36] (1958), shows that neither the completeness assumption nor the complex analysis is essential to the problem. (See also Exercise 10.30 and the subsequent remark.)
2. Even though we avoided complex analysis (holomorphicity etc.), it is clear that the proof only works over C. In fact, $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \in M_{2\times 2}(\mathbb R)$ has empty spectrum over R. 2
The above is not a joke! This proof is certainly more elementary than those using complex
analysis (Liouville’s theorem) or topological arguments based on π1 (S 1 ) 6= 0. And the ‘standard’
proof using compactness and n-th roots of complex numbers, cf. e.g. [47, Theorem 7.7.57], has
more than a little in common with the above argument.
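As a quick numerical illustration of the spectral radius formula (a Python sketch added for illustration only; the matrix, its size and the chosen exponents are arbitrary): for $a \in M_3(\mathbb C) = B(\mathbb C^3)$ with the operator norm, $\|a^n\|^{1/n}$ approaches $\max\{|\lambda| \mid \lambda \in \sigma(a)\}$ as n grows, even when $\|a\|$ itself is much larger.

    import numpy as np

    # spectral radius formula r(a) = lim ||a^n||^(1/n), checked numerically for a 3x3 matrix
    rng = np.random.default_rng(0)
    a = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

    r = max(abs(np.linalg.eigvals(a)))          # r(a) = max |eigenvalue|
    for n in (1, 2, 5, 20, 60):
        # np.linalg.norm(., 2) is the operator (largest singular value) norm
        val = np.linalg.norm(np.linalg.matrix_power(a, n), 2) ** (1.0 / n)
        print(n, val, r)                        # val approaches r as n grows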
10.21 Corollary (Gelfand-Mazur [37])
(i) Every unital normed algebra over C other than C1 has non-zero non-invertible elements.
(ii) If A is a normed division algebra (i.e. unital with InvA = A\{0}) over C then A = C1.
Proof. (i) Let a ∈ A\C1. By Theorem 10.18 we can pick λ ∈ σ(a). Then a − λ1 is non-zero
and non-invertible. Now (ii) is immediate.
10.22 Remark 1. That there are no finite dimensional division algebras over C (other than C itself) is an easy consequence of algebraic closedness. (Why?) There are infinite dimensional ones (like the field C(z) of rational functions over C), but they do not admit norms by the above corollary, which does not assume finite dimensionality of A.
2. Over R a theorem of Hurwitz [38] says that there are precisely four division algebras admitting a norm, namely R, C, H (Hamilton's [39] quaternions, which everyone should know) and O, the octonions of Graves [40]. But of these only C is an algebra over C. For more on the fascinating subject of real division algebras see the 120 pages on the subject in [18]. 2
[36] Charles Earl Rickart (1913-2002). American mathematician, mostly operator algebraist.
[37] Stanislaw Mazur (1905-1981). Polish mathematician.
[38] Adolf Hurwitz (1859-1919). German mathematician who worked on many subjects.
[39] Sir William Rowan Hamilton (1805-1865). Irish mathematician. Known particularly for quaternions and Hamiltonian mechanics. It was he who advocated the modern view of complex numbers as pairs of real numbers.
[40] John T. Graves (1806-1870). Irish jurist (!) and mathematician.
The preceding corollaries only used σ(a) 6= ∅, but also the spectral radius formula will have
many applications.
10.24 Corollary Let A be a unital Banach algebra and B ⊆ A a closed subalgebra containing
1. Then σA (b) ⊆ σB (b) and rA (b) = rB (b) for all b ∈ B.
Proof. If b − λ1 has an inverse in B then the latter also is an inverse in A. Thus λ 6∈ σB (b) ⇒
λ 6∈ σA (b), whence the first claim.
Since the norm of B is the restriction to B of the norm of A, the spectral radius formula
gives rB (b) = limn→∞ kbn k1/n = rA (b).
$\|f \star g\|_1 = \sum_{n\in\mathbb Z}\Big|\sum_{m\in\mathbb Z} f(m)g(n-m)\Big| \le \sum_{n\in\mathbb Z}\sum_{m\in\mathbb Z} |f(m)g(n-m)| = \|f\|_1\|g\|_1,$
10.26 Exercise Let A be a unital Banach algebra over C and a, b ∈ A commuting elements.
Prove r(ab) ≤ r(a)r(b). (I.e. r is submultiplicative. We will soon prove subadditivity.)
10.27 Exercise Let A be a unital Banach algebra and B ⊆ A a maximal abelian Banach subalgebra with 1 ∈ B. (Maximality means that we cannot have $B \subsetneq C \subseteq A$ with C a commutative Banach subalgebra.) Prove InvB = B ∩ InvA and conclude that σB (b) = σA (b) ∀b ∈ B.
10.28 Exercise Let A be a unital Banach algebra and a ∈ A. Prove the ‘resolvent identity’
(ii) BONUS: If E, F are normed spaces and U ⊆ E is open, a map f : U → F is Fréchet differentiable at x ∈ U if there is a bounded linear map D ∈ B(E, F) such that
$\frac{\|f(x+h) - f(x) - D(h)\|}{\|h\|} \to 0 \quad \text{as } \|h\| \to 0.$
Prove that InvA → InvA, $a \mapsto a^{-1}$ is Fréchet differentiable. Conclude that the map C\σ(a) → C, $\lambda \mapsto \varphi((a - \lambda 1)^{-1})$ is holomorphic for each ϕ ∈ A∗ .
10.30 Exercise Let A be a unital Banach algebra, a ∈ A and z ∈ C\{0} such that $zS^1 \cap \sigma(a) = \emptyset$ (i.e. there is no λ ∈ σ(a) with |λ| = |z|). Let $p_n = (1 - (a/z)^n)^{-1}$. Prove:
(i) If |z| > r(a) then $p_n \to 1$ as $n \to \infty$.
(ii) If there is no λ ∈ σ(a) with |λ| ≤ |z| then $p_n \to 0$ as $n \to \infty$.
(iii) The limit $p = \lim_{n\to\infty} p_n \in A$ exists and satisfies pa = ap. Hint: Lemma 10.17.
(iv) $p = -\frac{1}{2\pi i}\oint_C R_a(z)\,dz$, where C is the circle of radius |z| around 0 ∈ C with counterclockwise orientation.
(v) BONUS: p2 = p. Hint: You may use Exercise 10.28.
10.31 Remark In our (that is Rickart’s) proof of the Beurling-Gelfand theorem we used the
operators ((a/z)n − 1)−1 = −pn for large n and the above (i) concerning existence of p =
limn→∞ pn . The result of (iv) to the effect that p is given by a contour integral establishes a
connection between Rickart’s proof and the standard textbook proof via complex analysis. See
also [60, §149]. 2
Since we need a unit in order to define σ(a), the following construction is quite important
(but we won’t use it):
10.33 Exercise Compute σp (L) and σp (R) for the shift operators L, R on ℓp (N, C) for all p ∈ [1, ∞]. (Of course, the p in σp has nothing to do with the p in ℓp .)
10.34 Exercise Prove: If V is a finite-dimensional Banach space over C then every quasi-
nilpotent operator on V is nilpotent.
10.35 Exercise Let H = `2 (N, C). Define A ∈ B(H) by Aek = 2−k ek+1 ∀k. Prove that A is
(i) injective, (ii) quasi-nilpotent, but (iii) not nilpotent.
10.36 Exercise Let H = ℓ²(N, C) and define A ∈ B(H) by $Ae_k = \alpha_k e_{k+1}$, where $\alpha_k = 2$ for odd k and $\alpha_k = 1/2$ for even k. Compute $\|A^n\|$ for all n and show that $n \mapsto \|A^n\|^{1/n}$ is not monotonically decreasing.
The operators in the two preceding exercises are examples of ‘weighted shift operators’.
10.37 Exercise Let H = `2 (N, C) and define A ∈ B(H) by (Af )(n) = f (n)/n. Determine
σp (A), σc (A), σr (A).
10.38 Exercise (Assuming some measure theory) Let H = L2 ([a, b]), where −∞ < a <
b < ∞, and define A ∈ B(H) by (Af )(x) = xf (x). Prove σc (A) = [a, b] and σp (A) = σr (A) = ∅.
10.39 Exercise Prove that for every compact set C ⊆ C there is an operator A ∈ B(H),
where H is a separable Hilbert space, such that σ(A) = C.
Hint: Prove and use that C has a countable dense subset.
The next exercise is a (quite weak, given the strong hypothesis) converse of Exercise 10.3:
10.40 Exercise Let V be a Banach space and A ∈ B(V ) such that σ(A) is disjoint from the
circle C = {z ∈ C | |z − z0 | = r}.
(i) Apply Exercise 10.30 to $A - z_0 1$ to obtain $P^2 = P \in B(V)$ satisfying PA = AP.
(ii) Prove that $AV_i \subseteq V_i$, where $V_1 = PV$, $V_2 = (1 - P)V$. Conclude that $A = A_1 \oplus A_2$, where $A_i = A|_{V_i}$.
(iii) Prove $V_1 = \{x \in V \mid \lim_{n\to\infty}(A - z_0 1)^n x = 0\}$.
(iv) Deduce $\sigma(A_1) = \sigma(A) \cap B(z_0, r)$ and $\sigma(A_2) = \sigma(A)\setminus B(z_0, r)$.
10.41 Remark The unnatural assumption that the two parts of the spectrum are separated by
a circle can be removed using holomorphic functional calculus. Cf. e.g. [60, §149]. For normal
operators on Hilbert space, there is an alternative approach, cf. Proposition 13.10. 2
10.42 Exercise (Discrete Spectrum) Let V be a Banach space and A ∈ B(V ). If λ ∈ σ(A)
is isolated, pick r > 0 such that B(λ, r)∩σ(A) = {λ} and let Pλ be the idempotent corresponding
to C = {z | |z − λ| = r} constructed in the preceding exercise. Now put
10.43 Definition If A, B are F-algebras, an (algebra) homomorphism α : A → B is a linear
map such that also α(aa0 ) = α(a)α(a0 ) ∀a, a0 ∈ A. If A, B are unital, α is called unital if
α(1A ) = 1B . Algebra homomorphisms from an F-algebra to F are called characters.
10.45 Lemma Let A be a unital Banach algebra. Then every non-zero character ϕ : A → F
satisfies ϕ(1) = 1, ϕ(a) ∈ σ(a) ∀a ∈ A and kϕk = 1, thus ϕ is continuous.
Proof. If ϕ(1) = 0 then ϕ(a) = ϕ(a1) = ϕ(a)ϕ(1) = 0 for all a ∈ A, thus ϕ = 0. Thus
ϕ 6= 0 ⇒ ϕ(1) 6= 0. Now ϕ(1) = ϕ(12 ) = ϕ(1)2 implies ϕ(1) = 1.
We have just proven that every non-zero character is a unital homomorphism. Thus by
Lemma 10.44, σ(ϕ(a)) ⊆ σ(a). Since the spectrum of z ∈ F clearly is {z}, this means ϕ(a) ∈
σ(a), thus |ϕ(a)| ≤ r(a) ≤ kak by Proposition 10.12, whence kϕk ≤ 1. Since we require k1k = 1,
we also have kϕk ≥ |ϕ(1)|/k1k = 1.
10.46 Definition If A is a unital Banach algebra, the spectrum Ω(A) of A is the set of non-
zero characters ϕ : A → F.
10.47 Exercise Let X be a compact Hausdorff space and A = C(X, F). For every x ∈ X
define ϕx : A → F, f 7→ f (x). Prove:
(i) ϕx is a non-zero character of A, thus ϕx ∈ Ω(A), for each x ∈ X.
(ii) The map ι : X → Ω(A), x 7→ ϕx is injective.
(iii) For each f ∈ A we have σ(f ) = {ϕ(f ) | ϕ ∈ Ω(A)}.
(Later we will define a topology on Ω(A) and see that ι is a homeomorphism.)
One could hope that σ(a) = {ϕ(a) | ϕ ∈ Ω(A)} holds for every unital Banach algebra A
and a ∈ A. But this is too much to ask since a non-commutative algebra A may well have
Ω(A) = ∅! E.g., this holds for all matrix algebras Mn×n (F), n ≥ 2 since these are simple (no
proper two-sided ideals) so that a homomorphism to another algebra B must be zero or injective,
the latter being impossible for B = F for dimensional reasons.
Proof. (i) Every ϕ ∈ Ω(A) is continuous, thus M = ker ϕ is a closed ideal. We have M ≠ A since ϕ ≠ 0. This ideal has codimension one since $A/M \cong \mathbb C$ and therefore is maximal.
(ii) Now let M ⊆ A be a maximal ideal. Since maximal ideals are proper, no element of M is invertible. For each b ∈ M we have $\|1 - b\| \ge 1$ since otherwise b = 1 − (1 − b) would be invertible by Lemma 10.10(i). (This is the only place where completeness is used.) Thus $1 \notin \overline M$, so that $\overline M$ is a proper ideal containing M. Since M is maximal, we have $\overline M = M$, thus M is closed. Now by Proposition 6.2(vi), A/M is a normed algebra, and by a well-known algebraic argument the maximality of M implies that A/M is a division algebra. Thus $A/M \cong \mathbb C$ by Gelfand-Mazur (Corollary 10.21, which holds only over C), so that there is a unique isomorphism
α : A/M → C sending 1 ∈ A/M to 1 ∈ C. If p : A → A/M is the quotient homomorphism
then ϕ = α ◦ p : A → C is a non-zero character with ker ϕ = M . This ϕ clearly is unique. The
last statement follows from the fact that every commutative unital algebra has maximal ideals
(by a standard Zorn argument).
(iii) We already know that {ϕ(a) | ϕ ∈ Ω(A)} ⊆ σ(a), so that it remains to prove that for every λ ∈ σ(a) there is a ϕ ∈ Ω(A) such that ϕ(a) = λ. If λ ∈ σ(a) then $a - \lambda 1 \notin \mathrm{Inv}\,A$. Thus I = (a − λ1)A ⊆ A is a proper ideal. (Here we need the commutativity of A since otherwise this would only be a right ideal!)
this would only be a right ideal!) Using Zorn’s lemma, we can find a maximal ideal M ⊇ I.
By (ii) there is a ϕ ∈ Ω(A) such that ker ϕ = M . Since a − λ1 ∈ I ⊆ M = ker ϕ, we have
ϕ(a − λ1) = 0 and therefore ϕ(a) = λ.
10.49 Exercise Let A be a commutative unital Banach algebra over C and a, b ∈ A. Prove:
(i) σ(a + b) ⊆ σ(a) + σ(b) and σ(ab) ⊆ σ(a)σ(b).
(ii) r(a + b) ≤ r(a) + r(b) and r(ab) ≤ r(a)r(b).
(iii) If A is non-commutative but ab = ba then (ii) still holds.
(iv) If A is non-commutative but ab = ba then (i) still holds.
Hint: For (iii), use an abelian subalgebra, for (iv) a maximal one.
10.50 Exercise Give an example of a commutative Banach algebra over R for which (ii) and
(iii) of Proposition 10.48 (with C replaced by R) fail.
From now on, whether we say it explicitly or not, we assume F = C in all con-
siderations involving spectra! I.e. essentially everywhere except most of Sections
11, 14.1, 16. Finding out whether a result also holds over R usually is very easy.
We could now discuss the contents of Section 12.1, but it seems better first to go on with
our study of operators on Banach and Hilbert spaces.
11.1 Lemma The linear map $B(E, F) \to B(F^*, E^*),\ A \mapsto A^t$, is isometric, i.e. $\|A^t\| = \|A\|$.
Proof. By Proposition 7.8(i), for f ∈ F we have $\|f\| = \sup_{\varphi\in F^*, \|\varphi\|=1} |\varphi(f)|$. Thus
$\|A\| = \sup_{e\in E, \|e\|=1} \|Ae\| = \sup_{e\in E, \|e\|=1}\ \sup_{\varphi\in F^*, \|\varphi\|=1} |\varphi(Ae)| = \sup_{\varphi\in F^*, \|\varphi\|=1}\ \sup_{e\in E, \|e\|=1} |\varphi(Ae)| = \sup_{\varphi\in F^*, \|\varphi\|=1} \|A^t\varphi\| = \|A^t\|.$
Now, ιF (Ax) and (Att ιE (x)) are in F ∗∗ , and the fact that they coincide on all ϕ ∈ F ∗ means
ιF (Ax) = Att ιE (x). And since this holds for all x ∈ E, we have ιF A = Att ιE , as claimed.
Proof. (i) Linearity of $A^* : H_2 \to H_1$ follows from its being the composite of the linear map $A^t$ with the two anti-linear maps $\gamma_{H_2}$ and $\gamma_{H_1}^{-1}$.
(ii) Additivity of $A \mapsto A^*$ is obvious. Let $A \in B(H_1, H_2)$, $c \in \mathbb C$, $x \in H_2$. Then
$(cA)^*(x) = \gamma_{H_1}^{-1} \circ (cA)^t \circ \gamma_{H_2}(x) = \gamma_{H_1}^{-1}\big(c\,A^t(\gamma_{H_2}(x))\big) = \overline c\,\gamma_{H_1}^{-1}\big(A^t(\gamma_{H_2}(x))\big) = \overline c\,A^*(x),$
where we used the linearity of $A \mapsto A^t$ and the anti-linearity of $\gamma_{H_1}^{-1}$. This shows $(cA)^* = \overline c\,A^*$. (The anti-linearity of $\gamma_{H_2}$ is irrelevant here.)
(iii) If $y \in H_2$ then $\gamma_{H_2}(y) \in H_2^*$ is the functional $\langle\cdot, y\rangle_2$. Then $(A^t \circ \gamma_{H_2})(y) \in H_1^*$ is the functional $x \mapsto \langle Ax, y\rangle_2$. Thus $z = A^*y = (\gamma_{H_1}^{-1} \circ A^t \circ \gamma_{H_2})(y) \in H_1$ is a vector such that $\langle x, z\rangle_1 = \langle Ax, y\rangle_2$ for all $x \in H_1$. This means $\langle x, A^*y\rangle_1 = \langle Ax, y\rangle_2$ for all $x \in H_1, y \in H_2$, as claimed.
There is a useful bijection between bounded operators and bounded sesquilinear forms. It
can be used to give an alternative (at least in appearance) construction of the adjoint A∗ (and
for many other purposes). It is based on the following observation: If A ∈ B(H) satisfies
hAx, yi = 0 for all x, y ∈ H then Ax = 0 for all x, thus A = 0. Applying this to A − B shows
that hAx, yi = hBx, yi ∀x, y implies A = B. Thus bounded operators are determined by their
‘matrix elements’. This motivates the following developments.
11.7 Remark Recall that the inner product h·, ·i on a (pre)Hilbert space is sesquilinear and
bounded by Cauchy-Schwarz. If F = R, the definition of course reduces to bilinearity. 2
11.8 Proposition Let H be a Hilbert space. Then there is a bijection between B(H) and the
set of bounded sesquilinear forms on H, given by B(H) 3 A 7→ [·, ·]A , where [x, y]A = hAx, yi.
Proof. Let A ∈ B(H). Sesquilinearity of $[x, y]_A = \langle Ax, y\rangle$ is an obvious consequence of sesquilinearity of $\langle\cdot,\cdot\rangle$ and linearity of A, and boundedness follows from Cauchy-Schwarz: $|[x, y]_A| = |\langle Ax, y\rangle| \le \|A\|\,\|x\|\,\|y\|$.
Now let $[\cdot,\cdot]$ be a sesquilinear form bounded by M. Then for each x ∈ H, the map $\psi_x : H \to \mathbb C,\ y \mapsto \overline{[x, y]}$ is linear (thanks to the complex conjugation) and satisfies $|\psi_x(y)| \le M\|y\|\,\|x\|$, thus $\psi_x \in H^*$. Thus by Theorem 5.29 there is a unique vector $z_x \in H$ such that $\psi_x = \varphi_{z_x}$, thus $\overline{[x, y]} = \psi_x(y) = \varphi_{z_x}(y) = \langle y, z_x\rangle$ for all y and, taking complex conjugates, $\langle z_x, y\rangle = [x, y]$ for all y. Thus defining A : H → H by $Ax = z_x$ we have $\langle Ax, y\rangle = [x, y]$ for all x, y. Since the maps $x \mapsto \psi_x$ and $\psi_x \mapsto z_x$ are both anti-linear, their composite A is linear. And since $H \to H^*,\ z \mapsto \varphi_z$ is an isometry, we have $\|Ax\| = \|z_x\| = \|\varphi_{z_x}\| = \|\psi_x\| \le M\|x\|$, thus A ∈ B(H).
11.9 Proposition Let H be a Hilbert space and A ∈ B(H). Then there is a unique B ∈ B(H)
such that
hAx, yi = hx, Byi ∀x, y ∈ H.
This B is denoted A∗ and called the adjoint of A.
Proof. The map $(y, x) \mapsto \langle y, Ax\rangle$ is sesquilinear and bounded (by $\|A\|$). Thus by Proposition 11.8 there is a $B \in B(H)$ such that $\langle By, x\rangle = \langle y, Ax\rangle$ for all x, y. Taking complex conjugates gives $\langle x, By\rangle = \overline{\langle By, x\rangle} = \overline{\langle y, Ax\rangle} = \langle Ax, y\rangle$, which is the wanted identity.
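As a concrete illustration (not worked out at this point in the notes): for the multiplication operator $M_g \in B(\ell^2(S,\mathbb C))$ with $g \in \ell^\infty(S,\mathbb C)$ one computes $\langle M_g x, y\rangle = \sum_s g(s)x(s)\overline{y(s)} = \sum_s x(s)\overline{\overline{g(s)}\,y(s)} = \langle x, M_{\bar g}\,y\rangle$, so that $M_g^* = M_{\bar g}$. In particular $M_g$ is self-adjoint if and only if g is real-valued.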
11.10 Remark 1. In view of the identity hAx, yi = hx, A∗ yi satisfied by the adjoint as defined
above and Proposition 11.5(iii) (with H1 = H2 = H), it is clear that the two constructions of
A∗ give the same result (and in a sense are the same construction since both use Theorem 5.29).
2. It is obvious that A ∈ B(H) is self-adjoint as defined earlier (hAx, yi = hx, Ayi ∀x, y ∈ H)
if and only if A∗ = A.
3. If $[\cdot,\cdot]$ is a sesquilinear form then also $[x, y]' := \overline{[y, x]}$ is a sesquilinear form, called the adjoint form. Looking at the above definition of $A^*$, one finds that $A^*$ is the bounded operator associated with the adjoint form $[\cdot,\cdot]_A'$. Thus self-adjointness of A is equivalent to $[\cdot,\cdot]_A' = [\cdot,\cdot]_A$, i.e. self-adjointness of $[\cdot,\cdot]_A$. (Self-adjoint forms and operators are also called hermitian.) 2
which shows that A is an adjoint of A∗ . Uniqueness of the adjoint now implies A∗∗ = A.
(iv) Obvious.
11.12 Proposition Let H be a Hilbert space. Then for all A ∈ B(H) we have
(i) kA∗ k = kAk. (The ∗-operation is isometric.)
(ii) kA∗ Ak = kAk2 . (“C ∗ -identity”)
Proof. (i) Similarly to Lemma 11.1, using (5.2) we have
$\|A^*\| = \sup_{\|x\|=\|y\|=1} |\langle A^*x, y\rangle| = \sup_{\|x\|=\|y\|=1} |\langle x, Ay\rangle| = \sup_{\|x\|=\|y\|=1} |\langle Ay, x\rangle| = \|A\|.$
(ii) On the one hand, $\|A^*A\| \le \|A^*\|\,\|A\| = \|A\|^2$, where we used (i). On the other, using (5.2) we have
$\|A^*A\| = \sup_{\|x\|=1} \|A^*Ax\| = \sup_{\|x\|=\|y\|=1} |\langle A^*Ax, y\rangle| = \sup_{\|x\|=\|y\|=1} |\langle Ax, Ay\rangle| \ge \sup_{\|x\|=1} \langle Ax, Ax\rangle = \|A\|^2.$
The above construction of A∗ for A ∈ B(H) can be generalized to bounded linear maps
A : H1 → H2 , so as to give A∗ : H2 → H1 satisfying
We refrain from doing so explicitly since, as in the case H1 = H2 , the result would be the same
as that of the construction in Proposition 11.5. Instead we take the latter as definition of A∗ in
the general case. Now one has:
11.14 Exercise Consider the left and right shift operators L, R of Definition 10.1 in the Hilbert
space `2 (N, C).
(i) Prove L∗ = R and R∗ = L.
(ii) Prove σ(L) = σ(R) = B(0, 1) (closed unit disk). Hint: Use Exercise 10.33.
11.17 Lemma Every C ∗ -algebra is a Banach ∗-algebra. If it has a unit 1 then k1k = 1.
Proof. With the C ∗ -identity and submultiplicativity we have $\|a\|^2 = \|a^*a\| \le \|a^*\|\,\|a\|$, thus $\|a\| \le \|a^*\|$ for all a ∈ A. Replacing a by $a^*$ herein gives the converse inequality, thus $\|a^*\| = \|a\|$. If 1 is a unit then $\|1\|^2 = \|1^*1\| = \|1^*\| = \|1\|$, and since $\|1\| \ne 0$ this implies $\|1\| = 1$.
[41] The original definition by Gelfand and Naimark (1942) had the additional axiom that $a^*a + 1$ be invertible for each a. This turned out to be redundant, cf. Proposition 12.13.
11.18 Remark 1. Clearly B(H) is a C ∗ -algebra for each Hilbert space H. Since this holds also for real Hilbert spaces, it shows that one can discuss Banach ∗-algebras and C ∗ -algebras over R. But we will consider only complex ones.
2. There is no special name for the non-complete variants of the above definitions. But a
submultiplicative norm on a ∗-algebra satisfying the C ∗ -identity is called a C ∗ -norm, whether
A is complete w.r.t. it or not. Completion of a ∗-algebra w.r.t. a C ∗ -norm gives a C ∗ -algebra,
and this is an important way of constructing new C ∗ -algebras. 2
11.19 Exercise Recall the Banach algebra A = ℓ1 (Z, C) from Example 10.25. Show that both $f^*(n) = \overline{f(n)}$ and $f^*(n) = \overline{f(-n)}$ are involutions on A making it a Banach ∗-algebra. Show that neither of them satisfies the C ∗ -identity. (Thus Banach-∗ ⇏ C ∗ .)
11.20 Lemma Let X be a compact space. For f ∈ C(X, C), define $f^*$ by $f^*(x) = \overline{f(x)}$. Then C(X, C) is a C ∗ -algebra. The same holds for Cb (X), where X is arbitrary, thus also for ℓ∞ (S, C).
Proof. We know that C(X, C) equipped with the norm $\|f\| = \sup_x |f(x)|$ is a Banach algebra. It is immediate that ∗ is an involution. The computation
$\|f^*f\| = \sup_x |\overline{f(x)}f(x)| = \sup_x |f(x)|^2 = \Big(\sup_x |f(x)|\Big)^2 = \|f\|^2$
proves the C ∗ -identity. It is clear that this generalizes to the bounded continuous functions on any space X.
In a sense, the examples B(H) and C(X, C) for compact X are all there is: One can prove,
as we will do in Theorem 17.11, that every commutative unital C ∗ -algebra is isometrically
∗-isomorphic to C(X, C) for some compact Hausdorff space X, determined uniquely up to
homeomorphism. (For example one has $\ell^\infty(S, \mathbb C) \cong C(\beta S, \mathbb C)$, where βS is the Stone-Čech compactification of $(S, \tau_{disc})$.) And every C ∗ -algebra is isometrically ∗-isomorphic to a norm-closed ∗-subalgebra of B(H) for some Hilbert space H. Cf. e.g. [50].
11.4 Spectrum of elements of a C ∗ -algebra
11.23 Lemma Let A be a unital C ∗ -algebra.
(i) If a ∈ A is invertible then $a^*$ is invertible and $(a^*)^{-1} = (a^{-1})^*$.
(ii) $\sigma(a^*) = \sigma(a)^* := \{\overline\lambda \mid \lambda \in \sigma(a)\}$. [42]
Proof. (i) Taking the adjoint of the equation $aa^{-1} = 1 = a^{-1}a$ gives $(a^{-1})^*a^* = 1 = a^*(a^{-1})^*$, thus $(a^*)^{-1} = (a^{-1})^*$. (ii) By (i), $a - \lambda 1$ is invertible if and only if $a^* - \overline\lambda 1$ is invertible.
[42] If S ⊆ C we write $S^*$ for $\{\bar s \mid s \in S\}$ since $\overline S$ could be confused with the closure.
(ii) We have $\|u\|^2 = \|u^*u\| = \|1\| = 1$, and in the same way $\|u^{-1}\| = \|u^*\| = 1$. Now Exercise 10.13 gives $\sigma(u) \subseteq S^1$.
(iii) Given λ ∈ σ(a), write λ = α + iβ with α, β ∈ R. Since σ(a + z1) = σ(a) + z for all z ∈ C (why?), we have $i\beta(n+1) = \alpha + i\beta - \alpha + in\beta \in \sigma(a - \alpha 1 + in\beta 1)$. Thus with $r(c) \le \|c\|$ (Proposition 10.12), the C ∗ -identity and $\|1\| = 1$ we have
$(n^2 + 2n + 1)\beta^2 = |i\beta(n+1)|^2 \le r(a - \alpha 1 + in\beta 1)^2 \le \|a - \alpha 1 + in\beta 1\|^2 = \|(a - \alpha 1 - in\beta 1)(a - \alpha 1 + in\beta 1)\| = \|(a - \alpha 1)^2 + n^2\beta^2 1\| \le \|a - \alpha 1\|^2 + n^2\beta^2,$
which simplifies to $(2n+1)\beta^2 \le \|a - \alpha 1\|^2$ for all n ∈ N. Thus β = 0 and λ ∈ R.
11.25 Remark 1. Since (i) implies kak = ka∗ ak1/2 = r(a∗ a)1/2 for all a ∈ A and the spectral
radius r(a) by definition depends only on the algebraic structure of A, the latter also determines
the norm, which therefore is unique in a C ∗ -algebra! But note that two C ∗ -norms on a ∗-algebra
A can be very different if A fails to be complete w.r.t. one of the two norms!
2. An alternative and perhaps more insightful proof for (iii) goes like this: Since $e^z \equiv \exp(z) = \sum_{n=0}^\infty z^n/n!$ converges absolutely for all z ∈ C, Proposition 3.2 gives convergence of $\exp(a)$ for all a ∈ A. It is easy to verify $(e^a)^* = e^{(a^*)}$ and $e^{a+b} = e^a e^b$ provided ab = ba. In particular we have $e^a \in \mathrm{Inv}\,A$ for all a, with $(e^a)^{-1} = e^{-a}$. If now $a = a^*$ then $u = e^{ia}$ satisfies $u^* = e^{-ia}$, thus $uu^* = u^*u = 1$, so that u is unitary and therefore $\sigma(e^{ia}) \subseteq S^1$ by (ii). Now the holomorphic spectral mapping theorem, cf. Section 12.1, in particular (12.1), gives $\{e^{i\lambda} \mid \lambda \in \sigma(a)\} = \sigma(e^{ia}) \subseteq S^1$, and this implies σ(a) ⊆ R. [The above argument only needs $\{e^{i\lambda} \mid \lambda \in \sigma(a)\} \subseteq \sigma(e^{ia})$, which can be proven more directly: For all λ ∈ C we have
$e^{ia} - e^{i\lambda}1 = (e^{i(a-\lambda 1)} - 1)\,e^{i\lambda} = \Big(\sum_{k=1}^\infty \frac{(i(a-\lambda 1))^k}{k!}\Big)e^{i\lambda} = (a - \lambda 1)\,b\,e^{i\lambda},$
where $b = i\sum_{k=1}^\infty \frac{(i(a-\lambda 1))^{k-1}}{k!} \in A$. Since $a - \lambda 1$ and b commute, we have $\lambda \in \sigma(a) \Rightarrow e^{i\lambda} \in \sigma(e^{ia})$.] For another (quite striking) application of exp to C ∗ -algebras see Section B.8. 2
11.29 Exercise Give an example of a unital C ∗ -algebra A and a ∈ A showing that σ(a) ⊆
[0, ∞) does not imply a = a∗ !
11.30 Exercise Prove: If A is a unital C ∗ -algebra and a, b ∈ A are positive and ab = ba then
a + b is positive. (Later we will use different methods to remove the condition ab = ba.)
We immediately generalize the above questions to elements of unital Banach algebras, but
mostly we will (later) focus on C ∗ -algebras, to which B(H) belongs for each Hilbert space.
Defining f (a) poses no problem in the simplest case, which surely is f = P , a polynomial:
12.2 Exercise Let A be a unital algebra and a ∈ A. Prove that the map C[x] → A, P 7→ P (a)
is a homomorphism of unital C-algebras.
12.3 Lemma Let A be a unital Banach algebra, a ∈ A and P ∈ C[x] a polynomial. Then $\sigma(P(a)) = P(\sigma(a)) := \{P(\lambda) \mid \lambda \in \sigma(a)\}$.
Proof. Choose a maximal abelian Banach subalgebra B ⊆ A containing a. Since every ϕ ∈ Ω(B)
is a unital homomorphism, we have
There is a more elementary proof of Lemma 12.3, which works in every normed algebra (but
does not lend itself to generalizations like (12.1) or Proposition 12.20(ii)):
12.4 Exercise Let A be a unital normed algebra and P ∈ C[z] with n = deg P .
(i) Prove σ(P (a)) = P (σ(a)) when n = 0.
(ii) Assume n ≥ 1 and λ ∈ C. Use a factorization $P(z) - \lambda = c_n\prod_{k=1}^n (z - z_k)$ to prove σ(P(a)) = P(σ(a)) without using characters.
(iii) Why did we assume A to be normed?
The above ‘polynomial functional calculus’ can be generalized: If the power series $f(z) = \sum_{n=0}^\infty c_n z^n$ has convergence radius R and a ∈ A satisfies $\|a\| < R$ then
$\sum_{n=0}^\infty \|c_n a^n\| = \sum_{n=0}^\infty |c_n|\,\|a^n\| \le \sum_{n=0}^\infty |c_n|\,\|a\|^n < \infty,$
Thus if A is commutative we have
The first equality is due to Proposition 11.24(i), the second is the definition of r and the third comes from Lemma 12.3.
Even though we are after a result for all normal operators, we first consider self-adjoint
operators:
12.6 Theorem Let A be a unital C ∗ -algebra and a = a∗ ∈ A. Then there is a unique con-
tinuous ∗-homomorphism αa : C(σ(a), C) → A such that αa (P ) = P (a) for all polynomials.
(Usually we will write f (a) instead of αa (f ).) It satisfies
(i) kαa (f )k = supλ∈σ(a) |f (λ)|. (Thus αa is an isometry.)
(ii) The image of αa is the smallest C ∗ -subalgebra B ⊆ A containing 1 and a, and αa :
C(σ(a), C) → B is a ∗-isomorphism.
(iii) σ(αa (f )) = f (σ(a)) = {f (λ) | λ ∈ σ(a)}. (Spectral mapping theorem)
(iv) If g ∈ C(f (σ(a)), C) then αa (g ◦ f ) = ααa (f ) (g), or just g(f (a)) = (g ◦ f )(a).
Proof. (i) By Propositions 10.12 and 11.24(iii), we have σ(a) ⊆ [−kak, kak]. By the classical Weierstrass approximation theorem, cf. Theorem A.25, for every continuous function f : [c, d] → C and ε > 0 there is a polynomial P such that |f (x) − P (x)| ≤ ε for all x ∈ [c, d].
We cannot apply this directly since σ(a), while contained in an interval, need not be an entire
interval. But using Tietze’s extension theorem, cf. Appendix A.6, we can find (very non-
uniquely) a continuous function g : [−kak, kak] → C that coincides with f on σ(a). Now this g
can be approximated uniformly by polynomials thanks to Weierstrass’ theorem. (Alternatively,
apply the more abstract Stone-Weierstrass theorem directly to f .) In any case, the restriction
of the polynomials to σ(a) is dense in C(σ(a), C) w.r.t. k · k∞ . By Proposition 12.5, the map
C(σ(a), C) ⊇ C[x]|σ(a) → A, P 7→ P (a) is an isometry. Thus applying Lemma 3.17 we obtaining
a unique isometry αa : C(σ(a), C) → A extending P 7→ P (a). Thus (i) is proven up to the claim
that αa is a ∗-homomorphism. This is left as an exercise.
(ii) Since αa is a ∗-homomorphism, B := αa (C(σ(a), C)) ⊆ A is a ∗-subalgebra. And since
αa is an isometry by (i) and (C(σ(a), C), k · k∞ ) is complete, B is closed, thus a C ∗ -algebra.
Since αa maps the constant-one function to 1 ∈ A and the inclusion map σ(a) ,→ C to a, B
contains 1, a. Conversely, the smallest C ∗ -subalgebra of A containing 1 and a clearly is obtained
by taking the norm-closure of the set {P (a) | P ∈ C[z]}, which is contained in the image of αa .
(iii) Let f ∈ C(σ(a), C). Then clearly αa (f ) ∈ B. Now
$\sigma_A(\alpha_a(f)) = \sigma_B(\alpha_a(f)) = \sigma_{C(\sigma(a),\mathbb C)}(f) = f(\sigma(a)),$
where the equalities come from Theorem 11.27, from the fact that αa : C(σ(a), C) → B is a ∗-isomorphism, and from Exercise 10.6, respectively.
(iv) If {Pn } is a sequence of polynomials converging to f uniformly on σ(a) and {Qn } is a
sequence of polynomials converging to g uniformly on σ(f (a)), then Qn ◦Pn converges uniformly
to g ◦ f , thus Qn (Pn (a)) = (Qn ◦ Pn )(a) converges to (g ◦ f )(a). On the other hand, {Qn (Pn (a))}
converges uniformly to g(f (a)).
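A minimal illustration of the functional calculus (added here; it is not part of the original text): let $p = p^* = p^2 \in A$ with $p \ne 0, 1$. Then $\sigma(p) = \{0, 1\}$, and for a polynomial P we have $P(p) = P(0)(1-p) + P(1)p$, since $p^n = p$ for $n \ge 1$. By continuity, $\alpha_p(f) = f(p) = f(0)(1-p) + f(1)p$ for every $f \in C(\{0,1\},\mathbb C)$, which makes properties (i)-(iv) easy to verify by hand in this case.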
12.9 Exercise (i) Define f+ , f− : R → R by f+ (x) = max(x, 0), f− (x) = − min(x, 0). Prove
the alternative formulae f± (x) = (|x| ± x)/2 and f+ f− = 0 and f± ∈ C(R, R).
(ii) Let now A be a unital C ∗ -algebra and a = a∗ ∈ A. Define a± ∈ A by functional calculus
as a± = f± (a). Prove: 1. a+ − a− = a and a+ + a− = |a|, 2. a+ a− = a− a+ = 0, 3.
a+ ≥ 0, a− ≥ 0.
12.10 Proposition Let A be a unital C ∗ -algebra.
(i) If a = a∗ ∈ A then a2 is positive.
(ii) If a ∈ A is positive then there is a positive b ∈ A such that $b^2 = a$, unique in $C^*(1, a)$ (and in A). We write $b = \sqrt{a}$.
Proof. (i) It is clear that a2 is self-adjoint. Since σ(a) ⊆ R by Proposition 11.24(iii), the spectral
mapping theorem (Lemma 12.3 suffices) gives σ(a2 ) = {λ2 | λ ∈ σ(a)} ⊆ [0, ∞), thus a2 ≥ 0.
(ii) In view of a ≥ 0 we have σ(a) ⊆ [0, ∞). Now continuity of the function $[0,\infty) \to [0,\infty),\ x \mapsto +\sqrt{x}$ allows us to define $b = \sqrt{a}$ by the continuous functional calculus. It is immediate by construction that $b = b^*$, and the spectral mapping theorem gives σ(b) ⊆ [0, ∞), thus b ≥ 0. Now $b^2 = (a^{1/2})^2 = a$ since $(\sqrt{x})^2 = x$. If $c \in C^*(1, a)$ is positive and $c^2 = a$ then
c = b. This follows from the ∗-isomorphism C ∗ (1, a) ∼ = C(σ(a), C) and the fact that positive
square roots are unique in the function algebra C(σ(a), C) (why?). The stronger result that
positive square roots are unique even in A will be proven in Remark 12.15.
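Two sanity checks (not from the original text): in $A = C(X, \mathbb C)$, an element a is positive if and only if $a(x) \ge 0$ for all x, and $\sqrt a$ is the pointwise square root. In $A = M_n(\mathbb C)$, for $a = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ with all $\lambda_i \ge 0$ one gets $\sqrt a = \mathrm{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n})$; a general positive matrix is first diagonalized by a unitary.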
12.14 Exercise Let A be a unital C ∗ -algebra and a, b ∈ A with a ≥ 0. Prove that bab∗ ≥ 0.
12.15 Remark Now we can prove the strong uniqueness claim in Proposition 12.10: Let A
be a unital C ∗ -algebra and a, b, b0 ∈ A positive with b2 = a = b02 . Then Exercise 12.14 gives
positivity of (b − b0 )b(b − b0 ) and (b − b0 )b0 (b − b0 ). A short computation gives
12.17 Exercise Give an example of a unital C ∗ -algebra A and a ∈ A such that there is no
b ∈ A with a = b|a|.
12.19 Lemma Let A be a unital C ∗ -algebra. Then every character ϕ ∈ Ω(A) satisfies $\varphi(c^*) = \overline{\varphi(c)}$ for all c ∈ A, i.e. is a ∗-homomorphism.
Proof. We have c = a + ib, where a = Re(c), b = Im(c) are self-adjoint. Now σ(a) ⊆ R by Proposition 11.24(iii), thus ϕ(a) ∈ σ(a) ⊆ R by Lemma 10.45. Similarly ϕ(b) ∈ R. Thus
$\varphi(c^*) = \varphi(a - ib) = \varphi(a) - i\varphi(b) = \overline{\varphi(a) + i\varphi(b)} = \overline{\varphi(a + ib)} = \overline{\varphi(c)},$
where the third equality used that ϕ(a), ϕ(b) ∈ R as shown before.
[43] We allow ourselves the harmless sloppiness of not distinguishing between elements of the ring $\mathbb C[z, \bar z]$ (where $z, \bar z$ are independent variables) and the functions C → C induced by them.
12.20 Proposition Let A be a unital C ∗ -algebra and a ∈ A normal. Then
(i) If $P \in \mathbb C[z, \bar z]$, define $P(a, a^*)$ by replacing z and $\bar z$ in P by a and $a^*$, respectively. Then $\alpha_a : P \mapsto P(a, a^*)$ is a ∗-homomorphism (extending the $\alpha_a : \mathbb C[z] \to A$ defined earlier).
(ii) For every $P \in \mathbb C[z, \bar z]$ we have $\sigma(P(a, a^*)) = \{P(\lambda, \overline\lambda) \mid \lambda \in \sigma(a)\}$ and $\|P(a, a^*)\| = \sup_{\lambda\in\sigma(a)} |P(\lambda, \overline\lambda)|$.
Now the proof of Theorem 12.6 becomes a proof of Theorem 12.18 if we replace the invocation of Proposition 12.5 by one of Proposition 12.20 and use the density of the $P(z, \bar z)|_{\sigma(a)}$ in C(σ(a), C) as explained before.
12.21 Exercise Let A be a unital C ∗ -algebra and a ∈ A normal. For t 6∈ σ(a), prove that
k(a − t1)−1 k = (dist(t, σ(a)))−1 .
13.1 Lemma Let H be a Hilbert space and A ∈ B(H). Then $\ker A^* = (AH)^\perp$. Thus A has dense image ($\overline{AH} = H$) if and only if $A^*$ is injective. (Compare Exercise 11.3(i).)
Proof. We have $x \in (AH)^\perp \Leftrightarrow \langle Ay, x\rangle = 0\ \forall y \Leftrightarrow \langle y, A^*x\rangle = 0\ \forall y \Leftrightarrow A^*x = 0$. Thus $A^*$ is injective if and only if $(AH)^\perp = \{0\}$, which is equivalent to $\overline{AH} = H$ by Exercise 5.26(i).
13.3 Remark Using Exercise 11.3(i) instead of Lemma 13.1, one has analogous results for the
transpose At ∈ B(V ∗ ) of a Banach space operator A ∈ B(V ): σr (A) ⊆ σp (At ) and σp (A) ⊆
σp (At ) ∪ σr (At ). (As in Exercise 11.3(iv) there is no complex conjugation since A 7→ At is
linear.) 2
13.5 Remark There is something to be said for the above simple direct argument, but the
result also follows from Remark 5.14, which even allows to recover A (more precisely [·, ·]A )
from the map x 7→ hAx, xi. 2
Proof. (0) This is trivial, but nevertheless worth pointing out.
(i) If A is normal then for all x ∈ H we have
$\|Ax\|^2 = \langle Ax, Ax\rangle = \langle A^*Ax, x\rangle = \langle AA^*x, x\rangle = \langle A^*x, A^*x\rangle = \|A^*x\|^2.$
This computation holds both ways, thus $\|Ax\| = \|A^*x\|$ for all x implies $\langle A^*Ax, x\rangle = \langle AA^*x, x\rangle$ for all x. By Lemma 13.4, this implies $AA^* = A^*A$.
(ii) The first equality is Lemma 13.1, and the second is immediate from (i).
(iii) In view of $(AH)^\perp = \ker A$ established in (ii), injectivity of A is equivalent to $(AH)^\perp = \{0\}$, which is equivalent to $\overline{AH} = H$ by Exercise 5.26(i).
(iv) Invertibility implies boundedness below and surjectivity, cf. Proposition 9.28. If a normal
operator is bounded below, it is injective, so that it has dense image by (iii). Now boundedness
below and dense image imply invertibility by Proposition 9.28. And surjectivity implies dense
image, thus injectivity by (iii). Now injectivity and surjectivity give invertibility.
Normal operators have very nice spectral properties, which foreshadows the spectral theorem:
13.8 Exercise Let A ∈ B(H) be normal. Let x, x′ be (non-zero) eigenvectors for the eigenvalues λ, λ′, respectively. Prove:
(i) $A^*x = \overline\lambda x$, thus x is an eigenvector for $A^*$ with eigenvalue $\overline\lambda$.
(ii) $\sigma_p(A^*) = \sigma_p(A)^*$.
(iii) If λ ≠ λ′ then x ⊥ x′.
$\pi_1(f) \oplus \pi_2(f)$ for all f ∈ C(σ(A), C). Since Σ is clopen, the ∗-homomorphism $C(\sigma(A), \mathbb C) \to C(\Sigma, \mathbb C) \oplus C(\sigma(A)\setminus\Sigma, \mathbb C),\ f \mapsto (f|_\Sigma, f|_{\sigma(A)\setminus\Sigma})$ is an isomorphism. (The inverse sends $(f_1, f_2)$ to $\hat f_1 + \hat f_2$, where $\hat f_i$ is the extension of $f_i$ to all of σ(A) that vanishes on the complement of the domain of $f_i$.) Now the composite $C(\Sigma, \mathbb C) \oplus C(\sigma(A)\setminus\Sigma, \mathbb C) \to C(\sigma(A), \mathbb C) \to B(H_1) \oplus B(H_2)$ sends $(f_1, f_2)$ to $(\pi_1(f_1), \pi_2(f_2))$. If now $z_1, z_2$ are the inclusion maps from Σ and σ(A)\Σ, respectively, to C, we have $A = A_1 \oplus A_2 = \pi_1(z_1) \oplus \pi_2(z_2)$. Thus $\sigma(A_1) = \sigma(\pi_1(z_1)) \subseteq \sigma(z_1) = \Sigma$. Analogously $\sigma(A_2) \subseteq \sigma(A)\setminus\Sigma$. Now in view of $\sigma(A) = \sigma(A_1) \cup \sigma(A_2)$, we have $\sigma(A_1) = \Sigma$ and $\sigma(A_2) = \sigma(A)\setminus\Sigma$.
(ii) Since λ is isolated, Σ = {λ} ⊆ σ(A) is clopen. Now (i) gives σ(A1 ) = {λ}, thus by
Exercise 11.26(iii) we have A|H1 = λ idH1 , so that every x ∈ H1 satisfies Ax = λx.
Alternative argument: We have P = χ{λ} (A) 6= 0, thus P H ⊆ H is a non-zero closed
subspace. If 0 6= x ∈ P H then x = P x, and
(A − λ1)x = (A − λ1)P x = (z − λ)(A)χ{λ} (A)x = ((z − λ)χ{λ} )(A)x = 0,
where z is the inclusion map σ(A) ,→ C and we used the homomorphism property of the
functional calculus and the fact that the function z 7→ (z − λ)χ{λ} (z) is identically zero. This
proves that x ∈ ker(A − λ1), so that λ ∈ σp (A).
13.12 Exercise Use Exercise 13.9(iii) to give simple(r) proofs for σ(A) ⊆ R and σ(A) ⊆ S 1
for A ∈ B(H) self-adjoint or unitary, respectively.
13.13 Exercise (i) Let A ∈ B(H) and K ⊆ H a closed subspace such that AK ⊆ K. (Thus
K is A-invariant.) Prove that AK ⊥ ⊆ K ⊥ is equivalent to A∗ K ⊆ K.
In this situation, K is called reducing, since then $A \cong A|_K \oplus A|_{K^\perp}$.
(ii) Deduce that every invariant subspace of a self-adjoint operator is reducing.
(iii) Show by example that a normal operator can have invariant but non-reducing subspaces.
For every A ∈ B(H), using (5.2) we have
$\|A\| = \sup_{\|x\|=1} \|Ax\| = \sup_{\|x\|=\|y\|=1} |\langle Ax, y\rangle|.$
13.14 Proposition If $A = A^* \in B(H)$ then $\|A\| = \sup_{\|x\|=1} |\langle Ax, x\rangle|$.
Proof. Putting $M = \sup_{\|x\|=1} |\langle Ax, x\rangle|$, Cauchy-Schwarz gives $M \le \|A\|$. (We also note for later use that $|\langle Ax, x\rangle| \le M\|x\|^2$ for all x.) It remains to prove $\|A\| \le M$, which in view of $\|A\| = \sup_{\|x\|=1}\|Ax\|$ follows if we have $\|Ax\| \le M$ whenever $\|x\| = 1$. This inequality is trivially true if Ax = 0. If not, put $y = \frac{Ax}{\|Ax\|}$. Using $A = A^*$ and $\langle Ax, y\rangle = \|Ax\|^{-1}\langle Ax, Ax\rangle \in \mathbb R$, we have $\langle Ay, x\rangle = \langle y, Ax\rangle = \overline{\langle Ax, y\rangle} = \langle Ax, y\rangle$. Using this, we have
$\|Ax\| = \langle Ax, y\rangle = \frac14\big(\langle A(x+y), x+y\rangle - \langle A(x-y), x-y\rangle\big)$
$= \frac14\big|\langle A(x+y), x+y\rangle - \langle A(x-y), x-y\rangle\big|$
$\le \frac14\big(|\langle A(x+y), x+y\rangle| + |\langle A(x-y), x-y\rangle|\big)$
$\le \frac M4\big(\|x+y\|^2 + \|x-y\|^2\big)$
$= \frac M2\big(\|x\|^2 + \|y\|^2\big) = M,$
where in the last steps we used the parallelogram identity (5.3) and $\|x\| = \|y\| = 1$.
13.15 Remark 1. The set {hAx, xi | kxk = 1} is called the numerical range of A. In quantum
mechanics [38] it is the set of expectation values of A.
2. The number $\sup_{\|x\|=1} |\langle Ax, x\rangle|$ is the numerical radius $|||A|||$ of A. The identity $\|A\| = |||A|||$ generalizes to all normal operators, but the proof is a bit trickier, see [55, Proposition 3.2.25] or [29]. (It also follows from the spectral theorem, cf. Section 15.1.) 2
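That normality cannot be dropped is shown by the following standard example (not from the original notes): $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ satisfies $\langle Ax, x\rangle = x_2\overline{x_1}$, so $\sup_{\|x\|=1} |\langle Ax, x\rangle| = \frac12$, while $\|A\| = 1$.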
13.16 Definition Let H be a Hilbert space and A ∈ B(H). Then A is called operator positive,
A ≥O 0, if hAx, xi ≥ 0 for all x ∈ H. (I.e., the numerical range of A is contained in [0, ∞).)
(note that −2λhAx, xi ≥ 0), thus k(A−λ1)xk ≥ |λ| kxk, so that A−λ1 is bounded below. Since
it is also normal, Proposition 13.6(iv) implies that A−λ1 is invertible. Thus σ(A)∩(−∞, 0) = ∅.
(v) A satisfies the hypotheses of Proposition 12.10, so that there is a B = B ∗ ∈ B(H) such
that A = B 2 = B ∗ B. Now the claim follows from (ii).
(vi) Combine (iv) and (v).
13.19 Exercise Prove that for V ∈ B(H), the following are equivalent.
(i) V is a partial isometry.
(ii) V ∗ is a partial isometry.
(iii) V ∗ V is an orthogonal projection.
(iv) V V ∗ is an orthogonal projection.
(v) V V ∗ V = V (trivially equivalent to V ∗ V V ∗ = V ∗ ).
In Exercise 12.17 we saw a C ∗ -algebra not admitting polar decomposition. But:
At this point, we could go on to Section 15, where the various versions of the spectral
theorem for normal operators are proven. But it is customary to first study compact operators,
since their spectral theory is much simpler and quite similar to that in finite dimensions.
14 Compact operators
14.1 Compact Banach space operators
We have met compact topological spaces many times in this course. A subset Y of a topological
space (X, τ ) is compact if it is compact when equipped with the induced (=subspace) topology
τ|Y . And Y ⊆ X is called precompact (or relatively compact) if its closure Y is compact. Recall
that a metric space X, thus also a subset of a normed space, is compact if and only if every
sequence {xn } in X has a convergent subsequence.
A subset Y of a normed space (V, k · k) is called bounded if there is an M such that kyk ≤
M ∀y ∈ Y . A compact subset of a normed space is closed and bounded, but the converse, while
true for finite dimensional spaces by the Heine-Borel theorem, is false in infinite dimensional
spaces. This is particularly easy to see for a Hilbert space: Any ONB B ⊆ H clearly is bounded. For any $e, e' \in B$, $e \ne e'$, we have $\|e - e'\| = \langle e - e', e - e'\rangle^{1/2} = \sqrt{2}$. Thus B ⊆ H is closed and discrete. Since it is infinite, it is not compact.
For normed spaces, one needs the following easy, but important lemma:
14.1 Lemma (F. Riesz) Let (V, k · k) be a normed space and W $ V a closed proper subspace.
Then for each δ ∈ (0, 1) there is an xδ ∈ V such that kxδ k = 1 and dist(xδ , W ) ≥ δ, i.e.
kxδ − xk ≥ δ ∀x ∈ W .
Proof. If $x_0 \in V\setminus W$ then closedness of W implies $\lambda = \mathrm{dist}(x_0, W) > 0$. In view of δ ∈ (0, 1), we have $\frac\lambda\delta > \lambda$, so that we can find $y_0 \in W$ with $\|x_0 - y_0\| < \frac\lambda\delta$. Putting
$x_\delta = \frac{y_0 - x_0}{\|y_0 - x_0\|},$
we have $\|x_\delta\| = 1$. If x ∈ W then
$\|x - x_\delta\| = \Big\|x - \frac{y_0 - x_0}{\|y_0 - x_0\|}\Big\| = \frac{\big\|\,\|y_0 - x_0\|x - y_0 + x_0\big\|}{\|y_0 - x_0\|} \ge \frac{\mathrm{dist}(x_0, W)}{\|y_0 - x_0\|} > \frac{\lambda}{\lambda/\delta} = \delta,$
where the ≥ is due to $\|y_0 - x_0\|x - y_0 \in W$ and the > comes from $\mathrm{dist}(x_0, W) = \lambda$ and $\|x_0 - y_0\| < \frac\lambda\delta$. Since this holds for all x ∈ W, we have $\mathrm{dist}(x_\delta, W) \ge \delta$.
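In a Hilbert space one can do better (a remark added for comparison, not from the original text): if $W \subsetneq H$ is a closed proper subspace, pick a unit vector $x \in W^\perp$, which exists since $H = W \oplus W^\perp$ and $W \ne H$; then $\|x - w\|^2 = 1 + \|w\|^2 \ge 1$ for all $w \in W$, so one can even take δ = 1 in Lemma 14.1.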
In view of the above, it is interesting to look at linear operators that send sets S ⊆ V to
sets AS with ‘better compactness properties’. There are several such notions:
14.3 Exercise Let V be a normed space and A : V → V a linear map. Prove that the following
conditions are equivalent and imply boundedness of A:
(i) The image AB ⊆ V of the closed unit ball B = V≤1 is precompact.
(ii) AS is precompact whenever S ⊆ V is bounded.
(iii) Given any bounded sequence {xn } ⊆ V , the sequence {Axn } has a convergent subsequence.
14.4 Definition Operators A ∈ B(V ) satisfying the above equivalent conditions are called
compact (or completely continuous). The set of compact operators on V is denoted K(V ).
Before we develop further theory, we should prove that (non-zero) compact operators on
infinite dimensional spaces exist.
14.6 Definition Let V be a normed space and A ∈ B(V ). Then A has finite rank if its image
AV is finite dimensional. The set of finite rank operators on V is denoted F (V ).
For example, if ϕ ∈ V ∗ , y ∈ V then the operator A ∈ B(V ) defined by A : x 7→ ϕ(x)y has finite rank.
14.8 Lemma K(V ) ⊆ B(V ) is a two-sided ideal (thus a linear subspace, and if A ∈ B(V ), B ∈
K(V ) then AB, BA ∈ K(V )).
Proof. Let {xn } be a bounded sequence in V . Since A, B are compact, we can find a subsequence
{xnk }k∈N such that Axnk and Bxnk converge as k → ∞. Then also (cA + dB)xnk converges,
thus cA + dB is compact. Thus K(V ) ⊆ B(V ) is a linear subspace.
Alternative argument: Let A, B ∈ K(V) and S ⊆ V be bounded. Then $\overline{AS}$ and $\overline{BS}$ are compact, and so are $c\overline{AS}$, $d\overline{BS}$ if c, d ∈ F. Thus also $c\overline{AS} + d\overline{BS}$ is compact (by joint continuity of the map + : V × V → V), thus also $\overline{(cA + dB)S} \subseteq \overline{cAS + dBS} = c\overline{AS} + d\overline{BS}$ is compact.
Now let A ∈ B(V), B ∈ K(V) and S ⊆ V be bounded. Then $\overline{BS}$ and $\overline{ABS} = A\overline{BS}$ are compact by compactness of B and continuity of A, respectively. And boundedness of A implies boundedness of AS, so that BAS has compact closure by compactness of B. Thus AB and BA are compact.
For the proof of the next result, we need the notion of total boundedness in metric spaces,
see Appendix A.8. In particular we will use Exercise A.36(iii).
14.11 Exercise Let V = `p (S, F), where S is an infinite set and 1 ≤ p < ∞. If g ∈ `∞ (S, F)
and f ∈ `p (S, F) then Mg (f ) = gf (pointwise product) is in `p (S). This defines a linear map
`∞ (S, F) → B(V ), g 7→ Mg . Prove:
(i) g 7→ Mg is an algebra homomorphism.
(ii) kMg k = kgk∞ .
(iii) Mg ∈ K(V ) if and only if g ∈ c0 (S, F).
14.12 Remark 1. We now have two classes of compact operators: The (rather commutative)
one of multiplication by c0 -functions, and the operators that are norm-limits of finite rank
operators. Actually, the first class is contained in the second. Why?
2. It is quite natural to ask whether in fact $\overline{F(V)} = K(V)$, i.e. whether all compact operators on V are norm-limits of finite rank operators. When this holds, V is said to have the approximation property. We will later see that this is true for all Hilbert spaces. Whether all Banach spaces have the approximation property was an open problem until Enflo [45] in 1973 [19]
constructed a counterexample. His construction was very complicated and his spaces were not
very ‘natural’ (in the sense of having a simple definition and/or having been encountered pre-
viously). A simpler example, but still tricky and not natural, can be found in [13]. Somewhat
later, very natural examples were found: The Banach space B(H) does not have the approxi-
mation property whenever H is an infinite dimensional Hilbert space, cf. [78]. (Note that this
is about compact operators on B(H), not compact operators in B(H)!) All this is well beyond
the level of this course, but you should be able to understand [13]. 2
None of the above examples of compact operators seems very relevant for applications, even
within mathematics. Indeed the most useful compact operators perhaps are integral operators.
We will briefly look at a class of them in Exercise 14.35. But there are very simple examples:
[45] Per H. Enflo (1944-). Swedish mathematician, working mostly in functional analysis.
14.13 Definition Let V = C([0, C], F) for some C > 0, equipped with the norm $\|f\| = \sup_{x\in[0,C]} |f(x)|$. As we know, $(V, \|\cdot\|)$ is a Banach space. Define a linear operator, the Volterra [46] operator, by
$A : V \to V, \qquad (Af)(x) = \int_0^x f(t)\,dt.$
We have $\|Af\| = \sup_x \big|\int_0^x f(t)\,dt\big| \le \int_0^C |f(t)|\,dt \le C\|f\|$, thus $\|A\| \le C < \infty$.
14.14 Proposition The Volterra operator A is compact.
Proof. We will prove that F = AB ⊆ V is precompact by showing that it satisfies the hypotheses
of Theorem A.38. If x ∈ [0, C] and f ∈ C([0, C]) with kf k ≤ 1 then
$|(Af)(x)| = \Big|\int_0^x f(t)\,dt\Big| \le C\|f\| \le C < \infty,$
14.15 Proposition Let A ∈ B(V ). Then At ∈ B(V ∗ ) is compact if and only if A is compact.
Proof. As in the proof of Proposition, we use the equivalence of precompactness and total
boundedness from Exercise A.36(iii). Assume that A ∈ B(V) is compact. Then $X = \overline{AV_{\le 1}} \subseteq V$ is compact. Let
$F = \{\varphi|_X \mid \varphi \in (V^*)_{\le 1}\} \subseteq C(X, \mathbb F).$
For each x ∈ X we have $F_x = \{\varphi(x) \mid \varphi \in (V^*)_{\le 1}\}$, which clearly is bounded. Thus F is
pointwise bounded. If x, x0 ∈ X then for each ϕ|X ∈ F we have |ϕ(x) − ϕ(x0 )| ≤ kϕkkx − x0 k ≤
kx − x0 k, so that F ⊆ C(X, F) is equicontinuous. Thus by the Arzelà-Ascoli Theorem A.38, F
is totally bounded. This means that for each ε > 0 there are ϕ1 , . . . , ϕN ∈ (V ∗ )≤1 such that for
every $\varphi \in (V^*)_{\le 1}$ there is an i such that $\|\varphi - \varphi_i\|_{C(X,\mathbb F)} = \sup_{x\in X} |\varphi(x) - \varphi_i(x)| < \varepsilon$. In view of $X = \overline{AV_{\le 1}}$, this implies: For every ε > 0 there are $\varphi_1, \ldots, \varphi_N \in (V^*)_{\le 1}$ such that for each $\varphi \in (V^*)_{\le 1}$ there is an i such that $\sup_{x\in V_{\le 1}} |\varphi(Ax) - \varphi_i(Ax)| < \varepsilon$. In view of
we have proven the total boundedness (=precompactness) of the set At (V ∗ )≤1 ⊆ V ∗ and there-
fore compactness of At ∈ B(V ∗ ).
[46] Vito Volterra (1860-1940). Italian mathematician and one of the early pioneers of functional analysis.
[47] You should have seen this theorem in Analysis 2 or Topology. See e.g. Appendix A.9 or [23, Vol. 2, Theorem 15.5.1]. It has many applications in classical analysis, for example Peano's existence theorem on differential equations.
Now assume that At ∈ B(V ∗ ) is compact. Then by the above, Att ∈ B(V ∗∗ ) is compact.
Since V ⊆ V ∗∗ is a closed subspace, the restriction Att |V is compact. But by Lemma 11.2, the
latter equals A, so that A is compact.
14.16 Remark The above result (due to Schauder) can be proven in different ways: One can
give essentially the same proof avoiding invocation of Arzelà-Ascoli, cf. [50, Theorem 1.4.4], or
use the circle of ideas in Section 16 as in [12, Theorem VI.3.4]. For another functional analysis
proof see [66]. Cf. also Remark A.40.2 on different proofs of the Arzelà-Ascoli theorem. 2
14.17 Proposition If H is a Hilbert space and A ∈ B(H) is compact then A∗ and |A| are
compact.
Proof. Compactness of $A^*$ follows from Proposition 14.15 and $A^* = \gamma^{-1} \circ A^t \circ \gamma$, where $\gamma : H \xrightarrow{\ \cong\ } H^*,\ y \mapsto \langle\cdot, y\rangle$.
For |A| (and also A∗ , if we want to avoid Proposition 14.15) we argue as follows: By the polar
decomposition, there is a partial isometry V such that A = V |A| and |A| = V ∗ A. The second
identity together with compactness of A and Lemma 14.8 gives compactness of |A|. Since the
first identity is equivalent to A∗ = |A|V ∗ , the compactness of A∗ follows.
Yet another proof: If A ∈ K(H) and ε > 0 then by Corollary 14.30 proven below there is
F ∈ F (H) with kA − F k < ε, thus kA∗ − F ∗ k < ε. By the following exercise, F ∗ is finite rank.
Since ε > 0 was arbitrary, Corollary 14.10 gives A∗ ∈ K(H).
14.19 Lemma If A ∈ B(V ) is compact and λ ∈ C\{0} then ker(A − λ1) is finite dimensional.
Proof. If λ 6∈ σ(A) then this is trivial since A − λ1 is invertible. In general, Vλ = ker(A − λ1)
is the space of eigenvectors of A with eigenvalue λ. Clearly A|Vλ = λ idVλ , so that Vλ is an
invariant subspace. Since Vλ is closed and A|Vλ is compact by Remark 14.5.3, Vλ must be
finite-dimensional by Remark 14.5.4.
95
Proof. We know from Proposition 9.28 that (i) is equivalent to the combination of (ii) and (iii).
It therefore suffices to prove (ii)⇔(iii).
(iii)⇒(ii): It suffices to do this for λ = 1. (Why?) Assume that A − 1 is not injective, but
surjective. Clearly (A − 1)n is surjective for all n. In view of (A − 1)n+1 = (A − 1)(A − 1)n
we have ker(A − 1)n+1 = ((A − 1)n )−1 (ker(A − 1)), where the −1 stands for ‘preimage’. This
space clearly contains ker(A − 1)n , but it is strictly larger since it contains vectors x such
that (A − 1)n x ∈ ker(A − 1)\{0}. Thus ker(A − 1)n+1 % ker(A − 1)n for all n. Now by
Riesz’ Lemma 14.1, for each n we can find an xn ∈ ker(A − 1)n+1 such that kxn k = 1 and
dist(xn , ker(A − 1)n ) ≥ 21 . If n > m then (A − 1)xn − Axm ∈ ker(A − 1)n (note that (A − 1)n
commutes with A and xm ∈ ker(A − 1)n since n > m), so that
1
kAxn − Axm k = kxn + ((A − 1)xn − Axm )k ≥ .
2
Thus {Axn } has no convergent subsequence, contradicting the compactness of A.
(ii)⇒(iii): If A − λ1 is injective but not surjective, one similarly proves (A − λ1)n+1 H $
(A − λ1)n H for all n, which again leads to a contradiction with compactness of A.
14.21 Remark In the above proof we have shown that if A ∈ B(V ) is compact and λ ∈ C\{0}
then we cannot have ker(A − 1)n+1 % ker(A − 1)n for all n or (A − λ1)n+1 H $ (A − λ1)n H for
all n. One says that A − λ1 has finite ascent and descent. 2
14.23 Proposition Let H be a non-zero complex Hilbert space and A ∈ B(H) a compact
normal operator. Then there is an eigenvalue λ ∈ σp (A) such that |λ| = kAk.
Proof. If A = 0 then it is clear that λ = 0 does the job. Now assume A 6= 0. By Exercise 11.26,
there is λ ∈ σ(A) with |λ| = kAk. Since λ 6= 0, Corollary 14.22 gives λ ∈ σp (A).
The rest of this subsection is not needed for the proof of the spectral theorem, but puts the
Fredholm alternative into perspective.
Recall that if A : V → W is a linear map, the cokernel of A by definition is the linear quotient
space W/AV . If V, W are Hilbert spaces and AV ⊆ W is closed then, recalling Exercise 6.1 we
may alternatively define the cokernel of A to be (AV )⊥ ⊆ W .
96
where the second equality and the final isomorphism come from Exercises 11.3(i) and 6.6, re-
spectively. Thus (V /((A−λ1V )V ))∗ is finite dimensional, implying (why?) finite dimensionality
of V /((A − λ1V )V ) = coker(A − λ1V ).
(ii) This follows from (i) and Exercise 9.13.
Alternatively, a direct proof goes as follows: By Lemma 14.19, K = ker(A − λ1) is finite
dimensional, thus closed by Exercise 3.22. Thus there is a closed subspace S ⊆ V such that
V = K ⊕ S. (If V is a Hilbert space, we can just take S = K ⊥ . For general Banach spaces
this is the statement of Proposition 6.14.) The restriction (A − λ1)|S : S → H is compact and
injective. If (A − λ1)|S is not bounded below, we can find a sequence {xn } in S with kxn k = 1
for all n and k(A − λ1)xn k → 0. Since A is compact, we can find a subsequence {xnk } such
that {Axnk } converges. We relabel, so that now {Axn } converges. Now
Since {Axn } converges and {(A − λ1)xn } converges to zero by choice of {xn }, {xn } converges
to some y ∈ S (since xn ∈ S ∀n and S is closed). From (A − λ1)xn → 0 and xn → y we obtain
(A − λ1)y = 0, so that y ∈ ker(A − λ1) = K. Thus y ∈ K ∩ S = {0}, which is impossible since
y = limn xn and kxn k = 1 ∀n. This contradiction shows that (A − λ1)|S is bounded below. Now
Lemma 9.26 gives that (A − λ1)H = (A − λ1)S is closed.
14.25 Remark 1. In fact one has a much stronger result: If A ∈ B(V ) is compact and
λ ∈ C\{0} then
dim ker(A − λ1) = dim coker(A − λ1). (14.1)
Compare this with the equivalence (ii)⇔(iii) in Proposition 14.20, which amounts to the weak
statement dim ker(A − λ1) = 0 ⇔ dim coker(A − λ1) = 0. For proofs of (14.1) in a Banach
space context see any of [39, 50, 43].
2. If V is a Banach space then A ∈ B(V ) is called a Fredholm operator if ker A and
coker A are finite dimensional. (Often it is also assumed that A has closed image, but by
Exercise 9.13 this follows from finite dimensionality of the cokernel.) In this case, one calls
ind(A) = dim ker A − dim coker A ∈ Z the (Fredholm) index of A. If A, B are both Fredholm
then so is AB and ind(AB) = ind(A) + ind(B). (This can be used for proving (14.1).)
Thus if A is compact and λ 6= 0 then A − λ1 is Fredholm with index zero.
Since 1 is Fredholm with index zero, this result is a very special case of the following: If F
is Fredholm and K is compact then F + K is Fredholm and ind(F + K) = ind(F ).
Another important connection between compact and Fredholm operators is Atkinson’s the-
orem: A ∈ B(V ) is Fredholm if and only there exists B ∈ B(V ) such that AB − 1 and BA − 1
are compact. (Equivalently, the image of A in the quotient algebra B(V )/K(V ) is invertible.)
For more on Fredholm operators see [55, p.110-112] or [50, Section 1.4].
Finally, λ ∈ σd (A) is equivalent to λ ∈ σ(A) being isolated and A − λ1 being Fredholm
(equivalently, Fredholm of index zero). 2
14.26 Exercise Let H be a Hilbert space, A ∈ K(H) and λ ∈ C\{0}. Show that each of the
implications (ii)⇒(iii) and (iii)⇒(ii) in Proposition 14.20 can be deduced from the other.
97
14.27 Theorem (Spectral theorem for compact normal operators) Let H be a Hil-
bert space and A ∈ B(H) compact normal. Then
(i) H is spanned by the eigenvectors of A.
P
(ii) There is an ONB E of H consisting of eigenvectors, thus A = e∈E λe Pe , where Pe : x 7→
hx, eie.
(iii) For each ε > 0 there are at most finitely many λ ∈ σp (A) with |λ| ≥ ε.
(iv) σp (A) is at most countable and has no accumulation points except perhaps 0.
(v) We have σ(A) ⊆ σp (A) ∪ {0}. Furthermore,
– 0 ∈ σp (A) if and only if A is not injective.
– 0 ∈ σc (A) if and only if A is injective and H is infinite dimensional.
S
Proof. (i) Let K ⊆ H be the smallest closed subspace containing λ∈σp (A) Hλ , where Hλ =
ker(A − λ1). Clearly K is an invariant subspace: AK ⊆ K. Exercise 13.8(i) implies that also
A∗ K ⊆ K. Now Exercise 13.13(i) gives that also K ⊥ is A-invariant: AK ⊥ ⊆ K ⊥ . Clearly
A|K ⊥ is compact, thus if K ⊥ 6= {0} then by Proposition 14.23 it contains an eigenvector of A.
Since this contradicts the definition of K, we have K ⊥ = 0, proving that H is spanned by the
eigenvectors of A.
(ii) By Exercise 13.8(ii), the eigenspaces for different eigenvalues of A are mutually orthogo-
S
nal. Now the claim follows from (i) by choosing ONBs Eλ for each Hλ and putting E = λ Eλ .
(iii) Taking into account the unitary equivalence H ∼ = `2 (E, C), cf. Theorem 5.41, this
essentially is Exercise 14.11(iii).
(iv) This is an immediate consequence of (iii).
(v) The first statement was already proven in Corollary 14.22. The second is immediate. As
to the last, if H is finite dimensional then 0 6∈ σp (A) = σ(A). If H is infinite dimensional and A
injective then σp is infinite since the eigenspaces for the λ 6= 0 are finite dimensional and span
H. Thus in view of (iii), we have 0 ∈ σp (A). Now 0 ∈ σc (A) follows from the fact that A is not
bounded below or from the closedness of σ(A). (And recall that σr (A) = ∅ by normality.)
14.28 Remark 1. The common theme of ‘spectral theorems’ is that normal operators can be
diagonalized, i.e. be interpreted as multiplication operators, compactness simplifying statement
and proof considerably.
2. The statements about σ(A) actually hold for all compact operators on Banach spaces.
(Instead of the orthogonality of eigenvectors for different eigenvalues, it suffices to use their
linear independence.) 2
14.29 Proposition Let A ∈ K(H). Then are orthonormal sets (not necessarily bases!) E and
F of H, a bijection E → F, e 7→ fe and positive numbers {βe }, called the singular values of A,
such that e 7→ βe is in c0 (E, C) and
X
A= βe fe h·, ei.
e∈E
98
Proof.PB = A∗ A is compact and self-adjoint, so that there is an ONB EB diagonalizing B, thus
Ae
B = e∈EB λe Pe . Clearly E = {e ∈ EB | Ae 6= 0} is orthonormal. For e ∈ E put fe = kAek .
0
Now let F = {fe | e ∈ E }. For all x ∈ H we have
X X X X
Ax = A hx, eie = hx, eiAe = hx, eiAe = kAekhx, eife
e∈EB e∈EB e∈E e∈E
If e, e0 ∈ E, e 6= e0 then hAe, Ae0 i = he, A∗ Ae0 i = 0 since E diagonalizes A∗ A. Thus the fe = kAek
Ae
are mutually orthogonal, and they are normalized by definition. Thus F is orthonormal.
Since E diagonalizes A∗ A, we have A∗ Ae = λe e for all e ∈ E, where compactness of A∗ A im-
plies that e 7→ λe is in c0 (EB , C), cf. Theorem 14.27(iii). Now, kAek2 = hAe, Aei = he, A∗ Aei =
1/2
λe , thus βe = kAek = λe implies that also e 7→ λe is in c0 (E). For the final claim, note that
A∗ Ae = |A|2 e = λe e, thus |A|e = βe e.
Now we can prove that Hilbert spaces have the approximation property:
14.30 Corollary Let H be a Hilbert space, A ∈ K(H) and ε > 0. Then there is a B ∈ F (H)
k·k
(finite rank) with kA − Bk ≤ ε. Thus K(H) = F (H) .
P
Proof. Pick a representation A = e∈E λe fe h·, ei as in the preceding proposition. Since E →
C, e 7→ λe is in c0 (E), there is a finite subset F ⊆ E such that |λe | < ε for all e ∈ E\F . Define
X
B= λe h·, fe ie,
e∈F
which clearly has finite rank. If x ∈ H then using the orthonormality of E, we have
X X
k(A − B)xk2 = k λe hx, fe iek2 = |λe hx, fe i|2 ≤ ε2 kxk2 .
e∈E\F e∈E\F
Thus kA − Bk ≤ ε, so that K(H) ⊆ F (H). The converse inclusion was Corollary 14.10.
14.31 Remark 1. In the above, bases played a crucial role. Even though there is no notion of
orthogonality in general Banach spaces, it turns out that Banach spaces having suitable bases
k·k
do satisfy K(H) = F (H) , i.e. the approximation property.
2. If you like applications of complex analysis to functional analysis, see [59, Section VI.5]
for an interesting alternative approach to compact operators. 2
14.32 Lemma (i) For every A ∈ B(H) we have TrE (A∗ A) = TrE (AA∗ ).
(ii) If A ≥ 0 and U is unitary then TrE (U AU ∗ ) = TrE (A).
(iii) If A ≥ 0 then TrE (A) is independent of the ONB E. We therefore just write Tr(A).
99
(iv) For A, B ∈ B(H)+ and λ ≥ 0 we have Tr(A + B) = Tr(A) + Tr(B) and Tr(λA) = λTr(A).
Proof. (i) Using Parseval (kxk2 = e0 ∈E |hx, e0 i|2 ), we have
P
X X X XX
TrE (A∗ A) = hA∗ Ae, ei = hAe, Aei = kAek2 = |hAe, e0 i|2
e∈E e∈E e∈E e∈E e0 ∈E
XX XX X
0 2 ∗ 0 2
= |hAe, e i| = |he, A e i| = kA∗ e0 k2 = TrE (AA∗ ),
e0 ∈E e∈E e0 ∈E e∈E e0 ∈E
where the exchange of summation is justified since all summands are non-negative.
(ii) Put B = U A1/2 . By (i), TrE (A) = TrE (B ∗ B) = TrE (BB ∗ ) = TrE (U AU ∗ ).
(iii) Let A ≥ 0 and let E, F be ONBs for H. Since E, F have the same cardinality, we can
pick a bijection α : F → E. The latter extends to a unitary operator U : H → H. Thus by (ii),
X X
TrE (A) = TrE (U AU ∗ ) = hAU ∗ e, U ∗ ei = hAf, f i = TrF (A).
e∈E f ∈F
(iv) The second statement is evident, and the first follows from the fact that a sum of non-
negative numbers is independent of the order or bracketing.
We have seen that A∗ A is positive for any A ∈ B(H), so that Tr(A∗ A) ∈ [0, ∞] is well-
defined. For each A ∈ B(H) we define
is absolutely convergent and independent of the ONB E. Now h·, ·iHS is an inner product
on L2 (H) such that hA, AiHS = kAk22 . And (L2 (H), h·, ·iHS ) is complete, thus a Hilbert
space.
(iii) For all A, B ∈ B(H) we have kABk2 ≤ kAkkBk2 and kABk2 ≤ kAk2 kBk. Thus L2 (H) ⊆
B(H) is a two-sided ideal.
k·k2
(iv) We have F (H) ⊆ L2 (H) ⊆ K(H) and F (H) = L2 (H).
Proof. (i) If x ∈ H is a unit vector, pick an ONB E containing x. Then kAxk2 = (A∗ Ax, x) ≤
TrE (A∗ A) = kAk22 . Thus kAxk ≤ kAk2 whenever kxk = 1, proving the inequality. And Lemma
14.32(i) gives kA∗ k22 = Tr(AA∗ ) = Tr(A∗ A) = kAk22 .
(ii) If E is any ONB (whose choice does not matter) for H, we have (as before)
X X X XX
Tr(A∗ A) = hA∗ Ae, ei = hAe, Aei = kAek2 = |hAe, e0 i|2 .
e∈E e∈E e∈E e∈E e0 ∈E
49
Erhard Schmidt (1876-1959). Baltic German mathematician, contributions to functional analysis like Gram-
Schmidt orthogonalization.
100
Thus L2 (H) is the set of A ∈ B(H) for which the matrix elements hAe, e0 i (w.r.t. the ONB E)
are absolutely square summable. We therefore have a map
that clearly is injective. (Recall that `2 (S) = L2 (S, µ), where µ is the counting measure.) To
show surjectivity of α, let f = {fee0 } ∈ `2 (E × E). Define a linear operator A : H → H by
0
P
A : e 7→ e0 fee0 e . For each e, the r.h.s. is in H by square summability of f . If x ∈ H then
2 ! 2
X X X X
2 0 0
kAxk = sup hx, ei fee0 e = sup hx, eifee0 e
E0 e e0 ∈E 0 E0 e0 ∈E 0 e
2
X X X
= sup hx, eifee0 ≤ kxk2 |fee0 |2 ,
E0 e0 e e,e0
where the supremum is over the finite subsets E 0 ⊆ E, we used |hx, ei| ≤ kxk and the change
ofPsummation order is allowed due to the finiteness of E 0 . This computation shows that kAk ≤
( e,e0 |fee0 |2 )1/2 < ∞. Thus A ∈ B(H) and α(A) = f , so that α is surjective. Thus α : L2 (H) →
`2 (E × E) is a linear bijection. Now `2 (E × E) is a Hilbert space (in particular complete) with
inner product (f, g) = e,e0 fee0 gee0 , and pulling this inner product back to L2 (H) along α we
P
have
X X
hA, BiHS = (Ae, e0 )(Be, e0 ) = hAe, e0 ihe0 , Bei
(e,e0 )∈E 2 (e,e0 )∈E 2
X X
= hAe, Bei = hB ∗ Ae, ei = Tr(B ∗ A),
e e
where all sums converge absolutely. Lemma 14.32(ii) implies that (A, A) = Tr(A∗ A) is inde-
pendent of the chosen ONB, and for general (A, B) this follows by the polarization identity.
From the above it is clear that (L2 (H), h·, ·iHS ) is isomorphic to the Hilbert space (`2 (E ×
E), h·, ·i), thus a Hilbert space. And the norm associated to h·, ·iHS is nothing other than k · k2 .
(iii) For any ONB E we have
X X
kABk22 = Tr(B ∗ A∗ AB) = kABek2 ≤ kAk2 kBek2 = kAk2 Tr(B ∗ B) = kAk2 kBk22 ,
e∈E e∈E
k·k2
This implies kA − AF k2 → 0 as F % E, so that L2 (H) = F (H) .
k·k
Finally, by (i) we have kA − AF k ≤ kA − AF k2 → 0, thus A ∈ F (H) = K(H), where we
used Corollary 14.10. This proves L2 (H) ⊆ K(H).
101
14.35 Example (L2 -Integral operators) Let (X, A, µ) be a measure space, and put H =
L2 (X, A,R µ).
R Let K 2: X × X → C be measureable (w.r.t. the product σ-algebra A × A) and
assume |K(x, y)| dµ(x)dµ(y) < ∞. (Thus K ∈ L2 (X × X, A × A, µ × µ).) Then
Z
(Kf )(x) = K(x, y)f (y) dµ(y)
X
defines a linear operator K : H → H whose Hilbert-Schmidt norm kKk2 coincides with the
norm kKkL2 of K ∈ L2 (X × X). Thus K is Hilbert-Schmidt, and in particular compact.
14.36 Exercise Prove the equality kKkL2 = kKk2 of norms claimed in the above example.
If V, W are vector spaces over any field k then there is a canonical linear map W ⊗k V ∗ →
Homk (V, W ) sending w ⊗k ϕ to the linear map v 7→ wϕ(v). (Here V ∗ is the algebraic dual space
and ⊗k is the algebraic tensor product.) If V or W is finite dimensional, this map is a bijection,
but otherwise it is not. For Hilbert spaces, one has a statement that works irrespective of the
dimensions:
14.38 Remark If A ∈ B(H) and 1 ≤ p < ∞ one puts kAkp = (Tr(|A|p ))1/p . For p = 2 this
agrees with our previous definition since |A|2 = A∗ A, while for p = 1 one has kAk1 = Tr|A|.
Now each space Lp (H) = {A ∈ B(H) | kAkp < ∞}, the ‘p-th Schatten class’, is a two-sided
ideal in B(H) and in fact Lp (H) ⊆ K(H) for all p, see e.g. [72]. In particular L1 (H), the
‘trace-class operators’, play an important role in von Neumann algebra theory. The treatments
of them in [59] and [38] are quite good. See also [48]. If 1 ≤ p ≤ q < ∞, it is not hard to show
that kAk := kAk∞ ≤ kAkq ≤ kAkp , thus Lq (H) ⊆ Lp (H) ⊆ K(H). Thus the spaces Lp (H)
behave quite similarly to the `p (S, F), as also this exercise shows: 2
14.39 Exercise Given a set S and f ∈ `∞ (S, F), define H = `2 (S, C) and the multiplication
operator Mg : H → H, f 7→ gf (known from Exercise 14.11, where we saw Mg ∈ K(H) ⇔ g ∈
c0 (S)). Prove |Mg | = M|g| and kMg kp = kgkp for all p ∈ [1, ∞). (Thus Mg ∈ Lp (H) ⇔ g ∈
`p (S, C).)
102
is a bounded linear functional on (C(σ(A), C), k · k). If f is positive (i.e. takes values in [0, ∞))
then σ(f (A)) ⊆ [0, ∞) by the spectral mapping theorem, so that f (A) ≥ 0 and hf (A)x, xi ≥ 0
by Proposition 13.17. Thus ϕA,x is a bounded positive linear functional on C(σ(A), C). Thus
by the Riesz-Markov-Kakutani theorem, cf. [42, Appendix A.5] for the statement and, e.g., [11,
Theorem 7.2.8] or [63, Theorem 2.14] for proofs, there is a unique finite regular positive measure
µA,x on the Borel σ-algebra of C(σ(A)) such that
Z
f dµA,x = ϕA,x (f ) = hf (A)x, xi ∀f ∈ C(σ(A), C). (15.1)
Taking f = 1 = const., we have f (A) = 1, so that µA,x (σ(A)) = kxk2 < ∞. Since all measures
will be Borel measures, we omit the σ-algebra from the notation and just write L2 (σ(A), µA,x ).
15.1 Definition Let H be a Hilbert space, A ∈ B(H) and x ∈ H. Then x is called ∗-cyclic
for A if spanC {An (A∗ )m x | n, m ∈ N0 } = H.
15.2 Remark A vector x is cyclic for A if spanC {An x | n ∈ N0 } = H. Clearly the two notions
are equivalent for self-adjoint A, but in general they differ. For the present purpose, ∗-cyclicity
is the right notion. 2
15.3 Proposition Let H be a Hilbert space, A ∈ B(H) normal and x ∈ H ∗-cyclic for
A. Then there is a unique unitary U : H → L2 (σ(A), µA,x ) such that U AU ∗ = Mz , where
(Mz f )(z) = zf (z) for all f ∈ L2 (σ(A), µA,x ), z ∈ σ(A).
Thus A is unitarily equivalent to a multiplication operator.
Proof. The computation
Z
2 ∗
kf (A)xk = hf (A)x, f (A)xi = hf (A) f (A)x, xi = h(f f )(A)x, xi = |f |2 dµA,x
Not every normal operator A ∈ B(H) admits a ∗-cyclic vector. (If H is separable, A has a
∗-cyclic vector if and only if the algebra {B ∈ B(H) | AB = BA, A∗ B = BA∗ } is commutative.)
In this case we say that A is multiplicity free. In general we have:
15.4 Theorem (Spectral theorem for normal operators) Let H be a Hilbert space
and A ∈ B(H) normal. Then there exists a family {µι }ι∈I L
of finite Borel measures on σ(A) and
2 50 ∗
L
unitary U : H → ι∈I L (σ(A), µι ) such that U AU = ι∈I Mz , i.e.
M
(U AU ∗ f )ι (z) = zfι (z) ∀f = {fι } ∈ L2 (σ(A), µι ), z ∈ σ(A). (15.2)
ι∈I
50
L
Here is the Hilbert space direct sum defined at the end of Section 5.1.
103
Proof. Let F be the family of subsets F ⊆ H such that for x, y ∈ F, x 6= y we have f (A)x ⊥
f 0 (A)y for all f, f 0 ∈ C(σ(A), C). We partially order F by inclusion. One easily checks S that
F satisfies the hypothesis of Zorn’s lemma. (Given a totally ordered subset C ⊆ F, C is in
F, thus an upper bound for C.) Thus there is a maximal element M ∈ F. For each x ∈ M
we put HL x = {f (A)x | f ∈ C(σ(A), C)}. By construction these Hx are mutually orthogonal.
Let K = x∈M Hx . By construction, we have f (A)K ⊆ K for all f ∈ C(σ(A), σ), thus also
f (A)∗ K ⊆ K since f (A)∗ = f (A). Thus K ⊥ is invariant under all f (A). Picking a non-zero
y ∈ K ⊥ , we have M ∪ {y} ∈ F, which is a contradiction. Thus K = H.
Since clearly x ∈ M is ∗-cyclic for the restriction of A to HL
x , we can use Proposition 15.3 to
obtain unitaries Ux : Hx → L2 (σ(A), µA,x ). Defining U : H → x∈M L2 (σ(A), µA,x ) by sending
x ∈ Hx to Ux x ∈ L2 (σ(A), µA,x ) and extending linearly, U is unitary, and we are done. (Of
course we have identified I = M and µι = µA,x .)
15.5 Remark 1. Once the maximal family M of vectors has been picked, the construction
is canonical. But there is no uniqueness in the choice of that family. (This is similar to the
non-uniqueness of the choices of ONBs in the eigenspace ker(A − λ1) that we make in proving
Theorem 14.27.) For much more on this (in the self-adjoint case) see [59, Section VII.2].
2. Theorem 15.4 is perfectly compatible with Theorem 14.27: If A is compact normal and
E is an ONB diagonalizing it then the Hι in Theorem 15.4 are precisely the one-dimensional
spaces Ce for e ∈ E and the measure µι corresponding to Hι = Ce is the δ-measure on P (σ(A))
defined by µ(S) = 1 if λe ∈ S and µ(S) = 0 otherwise. (To be really precise, one should take
the non-uniqueness in both theorems into account.)
3. If A is as in the theorem and g ∈ C(σ(A), C) then the continuous functional calculus
gives us a normal operator g(A). We now have
M
U g(A)U ∗ = Mg .
ι∈I
15.6 Corollary Let H be a separable Hilbert space and A ∈ B(H) normal. Then there
exists a finite measure space (X, A, µ), a function g ∈ L∞ (X, A, µ; C) and a unitary W : H →
L2 (X, A, µ; C) such that W AW ∗ = Mg .
Proof. We apply Theorem 15.4. Since H is separable, the index set I is at most countable, and
L write I = {1, . . . , N } where N ∈ N ∪ {∞}
we
−1
with ∞ = #N. Now we put X = I × σ(A) =
i∈I σ(A) and for Y ⊆ X we put Y i = p 2 (p1 (i)) = {x ∈ σ(A) | (i, x) ∈ Y } ⊆ σ(A). We define
A ⊆ P (X) and µ : A → [0, ∞] by
A = {Y ⊆ X | Yi ∈ B(σ(A)) ∀i ∈ I},
X
µ(Y ) = µi (Yi ).
i∈I
104
Using the countability of I it is straightforward to check that A is a σ-algebra on X and µ a
(positive) measure on (X, A). With (15.1) we have P µi (σ(A)) = kxi k2 . Thus
P if we choose the
−i
cyclic vectors xi such that kxi k = 2 then µ(X) = i µ({i} × σ(A)) = i µi (σ(A)) < ∞, so
that the measure space (X, A, µ) is finite. Now we define a linear map
M
V : L2 (σ(A), µi ) → L2 (X, A, µ), {fi }i∈I 7→ f where f ((i, x)) = fi (x).
i∈I
From the way (X, A, µ) was constructed, it is quite clear that V is unitary. (Check this!) Now
W = V U : H → L2 (X, A, µ), where U comes from Theorem 15.4, is unitary. In view of
(U AU ∗ f )i (λ) = λfi (λ), defining g : X → C, (i, x) 7→ x (which is bounded by r(A) = kAk), we
have W AW ∗ = Mg .
15.7 Exercise (i) Let Σ ⊆ C be compact and non-empty and µ be a finite positive Borel
measure on Σ. Put H = L2 (Σ, µ) and define A ∈ B(H) by (Af )(x) = xf (x) for f ∈ H.
Prove:
15.8 Definition If (X, τ ) is a topological space, B ∞ (X, C) denotes the set of bounded func-
tions X → C that are measurable with respect to the Borel σ-algebra B(X, τ ).
105
(i) There is a unique unital ∗-homomorphism αA : B ∞ (σ(A), C) → B(H) extending the con-
tinuous functional calculus C(σ(A), C) → B(H) and satisfying kαA (f )k ≤ kf k∞ . Again
we write more suggestively f (A) = αA (f ).
(ii) If B ∈ B(H) commutes with A and A∗51 then B commutes with g(A) for all g ∈
B ∞ (σ(A), C).
(iii) If {fn }n∈N ⊆ B ∞ (σ(A), C) is a bounded sequence converging pointwise to f then f ∈
w
B ∞ (σ(A), C) and fn (A) → f (A), i.e. w.r.t. τwot , cf. Definition 16.8. (And kfn − f k∞ →
0 ⇒ kfn (A) − f (A)k → 0.)
Proof. (i) For all x, y ∈ H, the map
ϕx,y : C(σ(A), C) → C, f 7→ hf (A)x, yi
is a linear functional on C(σ(A), C) that is bounded since kf (A)k ≤ kf k∞ . Thus by the Riesz-
Markov-Kakutani
R theorem there exists a unique complex Borel measure µx,y on σ(A) such that
ϕx,y (f ) = f dµx,y for all f ∈ C(σ(A), C). Since ϕx,y depends in a sesquilinear way on (x, y),
the same holds for µx,y , and |µx,y (σ(A))|R = |hx, yi| ≤ kxkkyk. Thus if f ∈ B ∞ (σ(A), C), the
map ψf : H 2 → C defined by (x, y) 7→ f dµx,y is a sesquilinear form that is bounded since
|ψx,y (f )| ≤ kf k∞ kxkkyk. Thus by Proposition 11.8 there is a unique Af ∈ B(H) such that
hAf x, yi = ψx,y (f ) for all x, y ∈ H. It satisfies kAf k ≤ kf k∞ . Define α : B ∞ (σ(A), C) → B(H)
by f 7→ Af . If f ∈ C(σ(A), C) then ψx,y (f ) = hf (A)x, yi ∀x, y, implying Af = f (A). Thus αA
extends the continuous functional calculus.
It remains to be shown that αA is a ∗-homomorphism. Linearity is quite obvious. Since the
continuous functional calculus is a ∗-homomorphism, for f ∈ C(σ(A), C) we have f (A) = f (A)∗ ,
thus
Z Z Z
∗
f dµx,y = hf (A)x, yi = hx, f (A) yi = hx, f (A)yi = hf (A)y, xi = f dµy,x = f dµy,x ,
thus µy,x = µx,y . Reading the above computation backwards, this implies αA (f ) = αA (f )∗
for all f ∈ B ∞ (σ(A), C). Since the continuous functional calculus is a homomorphism, for all
f, g ∈ C(σ(A), C) we have
Z Z
∗
(f g) dµx,y = h(f g)(A)x, yi = hf (A)g(A)x, yi = hg(A)x, f (A) yi = g dµx,f (A)y .
The fact that this holds for all f, g ∈ C(σ(A), C) implies f µx,y = µx,f (A)y . Thus for all
f ∈ C(σ(A), C), g ∈ B ∞ (σ(A), C) we have
Z Z
h(f g)(A)x, yi = f g dµx,y = g dµx,f (A)y = hg(A)x, f (A)yi = hf (A)g(A)x, yi,
so that (f g)(A) = f (A)g(A). As above, we deduce from this that f µx,y = µx,f (A)y for all
f ∈ B ∞ (σ(A), C), and then (f g)(A) = f (A)g(A) for all f, g ∈ B ∞ (σ(A), C).
(ii) The assumption implies Bf (A) = f (A)B for all f ∈ C(σ(A), C). Thus
ϕBx,y (f ) = hf (A)Bx, yi = hBf (A)x, yi = hf (A)x, B ∗ yi = ϕx,B ∗ y (f ) ∀x, y, f.
This implies µBx,y = µx,B ∗ y for all x, y, whence
Z Z
hf (A)Bx, yi = f dµBx,y = f dµx,B ∗ y = hf (A)x, B ∗ yi ∀x, y ∈ H, f ∈ B ∞ (σ(A), C),
51
Since A is normal, AB = BA actually implies A∗ B = BA∗ by Fuglede’s theorem, cf. Section B.8!
106
thus f (A)B = Bf (A) for all f ∈ B ∞ (σ(A), C).
(iii) Measurability of the limit function f follows from Lemma 15.9(i). If kfn k ≤ M ∀n then
clearly kf k ≤ M . Thus f ∈ B ∞ (σ(A)). For all x, y ∈ H we have
Z Z
hαA (fn )x, yi = fn dµx,y −→ f dµx,y = hαA (f )x, yi,
where convergence in the center is a trivial application of the dominated convergence theorem,
w
using boundedness of µx,y and kfn k∞ ≤ M for all n. This proves αA (fn ) → αA (f ). The final
claim clearly follows from kfn (A) − f (A)k = k(fn − f )(A)k ≤ kfn − f k∞ .
The above construction of the Borel functional calculus was independent of the Spectral
Theorem 15.4. We now wish to understand their relationship. This is the first step:
15.11 Exercise Let Σ ⊆ C be compact and λ a finite positive Borel measure on Σ. Let
H = L2 (Σ, λ; C) and g ∈ B ∞ (Σ, C).
(i) Prove that the multiplication operator Mg : H → H, [f ] 7→ [gf ] satisfies
kMg k = ess supµ |g| = inf{t ≥ 0 | λ({x ∈ X | |g(x)| > t}) = 0} ≤ kgk∞ .
(ii) Let A = Mz ∈ B(H), where z : Σ ,→ C. Prove that g(A) as defined by the Borel functional
calculus coincides with Mg .
(This is too sloppy, but the reader should be able to make it precise.)
15.13 Remark 1. Since it turns out that g(A) = U ∗ ( ι Mg )U for all g ∈ B ∞ (σ(A), C), one
L
might try to take this as the definition of g(A). But apart from being very inelegant, it has
the problem that one must prove the independence of g(A) thus defined from the choice of the
maximal set M ⊆ H in the proof of the spectral theorem. This would not be difficult if every
Borel measureable function was a pointwise limit of a sequence of continuous functions. But
this is false, making such an approach quite painful. (Compare Lusin’s theorem in, e.g., [63].)
107
2. We cannot hope to prove kg(A)k = kgk∞ for all g ∈ B ∞ (σ(A), C) since it is true only
if σ(A) = σp (A)! Since singletons in C are closed, thus Borel measurable, we can change g
arbitrarily for some λ ∈ σ(A) without destroying the measurability of g, making kgk∞ as large
as we want. But if λ ∈ σ(A)\σp (A), Exercise 15.7 gives µι ({λ}) = 0 ∀ι ∈ I, so that this change
of g does not affect the norms, cf. Exercise 15.11, ess supµi |g| of the multiplication operators
making up g(A) and therefore does not affect kg(A)k.
3. Let A ∈ B(H) be normal and consider the C ∗ -algebra A = C ∗ (1, A) ⊆ B(H). Then
g(A) ∈ A for continuous g, but for most non-continuous g we have g(A) 6∈ A. For this reason
there is no Borel functional calculus in abstract C ∗ -algebras. (But g(A) is always contained in
wot
the von Neumann algebra vN(A) = C ∗ (A, 1) generated by A. This follows from Theorem
15.10(ii) and von Neumann’s ‘double commutant theorem’.) 2
15.14 Definition Let H be Hilbert space and Σ ⊆ C a compact subset. Let B(Σ) be the Borel
σ-algebra on Σ. A projection-valued measure relative to (H, Σ) is a map P : B(Σ) → B(H)
such that
(i) P (S) is an orthogonal projection for all S ∈ B(Σ).
(ii) P (∅) = 0, P (Σ) = 1.
(iii) P (S ∩ S 0 ) = P (S)P (S 0 ) for all S, S 0 ∈ B(Σ).
(iv) For all x, y ∈ H, the map Ex,y : B(Σ) → C, S 7→ hP (S)x, yi is P a complex measure.
(Equivalently,Sif the {Sn }n∈N ⊆ B(Σ) are mutually disjoint then n P (Sn ) converges
weakly to P ( n Sn ).)
Note that (iii) implies P (S)P (S 0 ) = P (S 0 )P (S) for all S, S 0 ∈ B(Σ).
15.15 Proposition Let H be a Hilbert space and A ∈ B(H) normal. Put Σ = σ(A). For
each S ∈ B(Σ), define P (S) = χS (A). Then S 7→ P (S) is a projection-valued measure relative
to (H, Σ), also called the spectral resolution of A.
Proof. If g = χS for S ∈ B(Σ), g(A) is a direct sum of operators of multiplication by χS , which
clearly all are idempotent. And since g = χS is real-valued, g(A) is self-adjoint. Thus each
P (S) = χS (A) is an orthogonal projection. P (∅) = 0 is clear, and P (Σ) = 1(A) = 1H (since the
constant 1 function is continuous). Property (iii) is immediate from χS∩S 0 = χS χS 0 . Finally, if
x, y ∈ H let U x = {fι }ι∈I , U y = {gι }ι∈I . Then
XZ
Ex,y (S) = hP (S)x, yi = χS (z)fι (z)gι (z) dµι (z).
ι∈I σ(A)
15.16 Exercise Let A ∈ B(H) be normal and Σ ⊆ σ(A) a Borel set. Prove σ(A|P (Σ)H ) ⊆
Σ ∪ {0}. Bonus: State and prove a better result.
108
15.17 Exercise Let A ∈ B(H) be a normal operator and let P be the corresponding spectral
measure. Prove:
(i) λ ∈ σ(A) if and only if P (σ(A) ∩ B(λ, ε)) 6= 0 for each ε > 0.
(ii) λ ∈ σ(A) is an eigenvalue if and only if P ({λ}) 6= 0.
We have thus seen that every normal operator gives rise to a projection valued measure. The
converse is also true, and we have a bijection between normal operators and projection-valued
measures:
f (A)f (A)∗ = α(f )α(f )∗ = α(f f ∗ ) = α(f ∗ f ) = α(f )∗ α(f ) = f (A)∗ f (A).
(iii) σ(A) ⊆ Σ is clear. Since α(1) = 1 and α(z) = A by definition, we have α(P ) = P (A)
for each polynomial. More generally, since α is a ∗-homomorphism, a polynomial in z, z is sent
by α to the corresponding polynomial in A, A∗ . These polynomials are k · k∞ -dense in C(Σ, C)
109
by Weierstrass Theorem A.31, so that the continuity proven in (ii) implies that α(f ) = f (A) as
produced by the continuous functional calculus.
(iv) Left as an exercise.
We close the discussion of spectral theorems with the advice of looking at the paper [29] and
[73, Chapter 5] by two masters.
16.2 Proposition The weak topology on every infinite dimensional Banach space is strictly
weaker than the norm-topology.
Proof. By the definition of τw , for every weakly open neighborhood U of 0 there are ϕ1 , . . . , ϕn ∈
V ∗ such
T that {x ∈ V | |ϕi (x)| < 1 ∀i = 1, . . . , n} ⊆ U . Thus U contains the linear subspace
W = ni=1 ϕ−1 i (0) ⊆ V , whose codimension is ≤ n. Thus if V is infinite dimensional then
dim W is infinite, thus non-zero. On the other hand, it is clear that the (norm-)open ball
B(0, 1) contains no linear subspace of dimension > 0. Thus B(0, 1) 6∈ τw . Since τw ⊆ τk·k was
clear, we have τw $ τk·k .
16.3 Exercise (i) Prove that the sequence {δn }n∈N has no weak limit in `1 (N, F).
(ii) Let 1 < p < ∞. Prove that the sequence {δn }n∈N ⊆ `p (N, F) converges to zero weakly, but
not in norm.
w
(iii) (Bonus) Let 1 ≤ p < ∞ and g, {fn }n∈N ∈ `p (N, F). Prove that if fn → g and kfn kp → kgkp
then kfn − gkp → 0.52
The deviant behavior of `1 in the preceding exercise can be understood as a consequence of
the following surprising result:
53 w
16.4 Theorem (I. Schur 1920) If g, {fn }n∈N ⊆ `1 (N, F) and fn → g then kfn − gk1 → 0.
52
This implication holds for all uniformly convex Banach spaces, cf. e.g. [37, Proposition 9.11]. The Lp -spaces with
1 < p < ∞ are uniformly convex, cf. Section B.6.2.
53
Issai Schur (1874-1941). Russian mathematician. Studied and worked in Germany up to his emigration to Israel
in 1939. Mostly known for his work in group and representation theory.
110
Like the uniform boundedness theorem, this result can be proven using a beautiful gliding
hump argument or using Baire’s theorem, cf. Section B.7.
Theorem 16.4 does not generalize to nets since the weak and norm topologies on `1 (N, F)
differ by Proposition 16.2 and nets can distinguish topologies, cf. [47, Section 5.1].
16.5 Exercise Prove that every weakly convergent sequence in a Banach space is norm-
bounded. Hint: Uniform boundedness theorem.
16.6 Exercise Let V be a Banach space. Prove that the (norm) closed unit ball V≤1 is also
weakly closed. Hint: Hahn-Banach.
In Section 14.1 we saw that V≤1 is compact w.r.t. the norm topology if and only if V is
finite-dimensional. But the weak topology is weaker than the norm topology, so that a set can
be weakly compact even though it is not norm compact. Indeed, after some further preparations
we will prove the following theorem:
16.7 Theorem Let V be a Banach space. Then the following are equivalent:
(i) V is reflexive. (⇔ V ∗ is reflexive by Theorem 7.18.)
(ii) V≤1 is compact w.r.t. the weak topology.
In Remark 8.6 we have encountered the strong (operator) topology on B(V ): A net {Aι } ⊆
B(V ) converges strongly to A ∈ B(V ) if k(Aι − A)xk → 0 for all x ∈ V . Now we can have a
brief look at the weak operator topology:
16.8 Definition Let V be a Banach space. The weak operator topology τwot on B(V ) is
generated by the family F = {k · kx,ϕ : A 7→ |ϕ(Ax)| | x ∈ V, ϕ ∈ V ∗ } of seminorms. Thus
{Aι } ⊆ B(V ) converges to A ∈ B(V ) w.r.t. τwot if and only if ϕ((Aι − A)x) → 0 for all
x ∈ V, ϕ ∈ V ∗ , i.e. {Aι x} ⊆ V converges weakly to Ax for all x. The family F is separating, so
wot
that τwot is Hausdorff. We write Aι −→ A or A = w-lim Aι .
wot
If H is a Hilbert space, we have Aι −→ A if and only if hAι x, yi → hAx, yi for all x, y ∈ H.
There is little risk of confusing the weak topology on V with the weak operator topology on
B(V ). But one might confuse the latter with the weak topology that B(V ) has as a Banach
space, in particular since the above k·kx,ϕ are in B(V )∗ ! However, when V is infinite dimensional
these seminorms do not exhaust (or span) the bounded linear functionals on B(V ), so that the
weak operator topology on B(V ) is strictly weaker than the weak topology!
111
16.11 Remark 1. Since ϕ(x) = 0 for all x ∈ V means ϕ = 0, F is separating, thus the
σ(V ∗ , V )-topology is Hausdorff and therefore locally convex.
2. If V is infinite dimensional, the weak-* topology τw∗ is neither normable nor metrizable.
3. Since the weak-∗ topology is induced by the linear functionals on V ∗ of the form x b, which
constitute a subset of V ∗∗ , it is weaker than the weak topology, thus also weaker than the norm
topology: τw∗ ⊆ τw ⊆ τk·k . As we know, the second inclusion is proper whenever V is infinite
dimensional. For the first, we have: 2
16.14 Remark Before we proceed, some comments are in order: While the norm and weak
topologies are defined for each Banach space, the weak-∗ topology is defined only on spaces
that are the dual space V ∗ of a given space V . There are Banach spaces, like c0 (N, F), that are
not isomorphic (isometrically or not) to the dual space of any Banach space, cf. Corollary B.11.
And there are non-isomorphic Banach spaces with isomorphic dual spaces, cf. Corollary B.12.
Thus to define the weak-∗ topology on a Banach space, it is not enough just to know that the
latter is a dual space. We must choose a ‘pre-dual’ space. 2
The following is the reason for the importance of the weak-∗ topology:
16.15 Theorem (Alaoglu’s theorem) 54 If V is a Banach space then the (norm)closed unit
ball (V ∗ )≤1 = {ϕ ∈ V ∗ | kϕk ≤ 1} is compact in the σ(V ∗ , V )-topology.
Proof. Define Y
Z= {z ∈ C | |z| ≤ kxk},
x∈V
equipped with the product topology. Since the closed discs in C are compact, Z is compact by
Tychonov’s theorem. If ϕ ∈ (V ∗ )≤1 then |ϕ(x)| ≤ kxk ∀x, so that we have a map
Y
f : (V ∗ )≤1 → Z, ϕ 7→ ϕ(x).
x∈V
Since the map ϕ 7→ ϕ(x) is continuous for each x, f is continuous (w.r.t. the weak-∗ topology
on (V ∗ )≤1 ). It is trivial that V separates the points of V ∗ , thus f is injective. By definition,
a net {ϕι } in (V ∗ )≤1 converges in the σ(V ∗ , V )-topology if and only if ϕι (x) converges for all
x ∈ V , and therefore if and only if f (ϕι ) converges. Thus f : (V ∗ )≤1 → f ((V ∗ )≤1 ) ⊆ Z is a
homeomorphism.
54
Leonidas Alaoglu (1914-1981). Greek mathematician. (Earlier versions due to Helly and Banach.)
112
Now let z ∈ f ((V ∗ )≤1 ) ⊆ Z. Clearly, |zx | ≤ kxk ∀x ∈ X. By Proposition A.9.2 there is
a net in f ((V ∗ )≤1 ) converging to z and therefore a net {ϕι } in (V ∗ )≤1 such that f (ϕι ) → z.
This means ϕι (x) → zx ∀x ∈ V . In particular ϕι (αx + βy) → zαx+βy , while also ϕι (αx + βy) =
αϕι (x) + βϕι (y) → αzx + βzy . Thus the map ψ : V → C, x 7→ zx is linear with kψk ≤ 1, to wit
ψ ∈ (V ∗ )≤1 and z = f (ψ). Thus f ((V ∗ )≤1 ) ⊆ f ((V ∗ )≤1 ), so that f ((V ∗ )≤1 ) ⊆ Z is closed.
Now we have proven that (V ∗ )≤1 is homeomorphic to the closed subset f ((V ∗ )≤1 ) of the
compact space Z, and therefore compact.
16.16 Remark We deduced Alaoglu’s theorem from Tychonov’s theorem. The latter is known
to be equivalent to the axiom of choice (AC). However, we only needed Tychonov’s theorem as
restricted to Hausdorff spaces. The latter can be proven from a weaker axiom than AC, to which
it actually is equivalent (namely the ‘ultrafilter lemma’, which also implies the Hahn-Banach
theorem). Cf. [47]. 2
16.17 Exercise Use Alaoglu’s theorem to prove that every Banach space V over F admits a
linear isometric bijection onto a closed subspace of C(X, F) for some compact Hausdorff space
X.
16.18 Exercise (i) Use Alaoglu’s theorem to prove (i)⇒(ii) in Theorem 16.7.
(ii) Conclude that the closed unit ball of every Hilbert space is weakly compact.
(iii) Prove σ(V, V ∗ ) = σ(V ∗∗ , V ∗ ) V .
(iv) Use Theorem 16.19 and (iii) to prove (ii)⇒(i) in Theorem 16.7.
17.1 Proposition Let A be a unital Banach algebra and Ω(A) its spectrum. For each a ∈ A
define ba : Ω(A) → C, ϕ 7→ ϕ(a). Let τ be the initial topology on Ω(A) defined by {b
a | a ∈ A},
i.e. the weakest topology making all b
a continuous. Then (Ω(A), τ ) is compact Hausdorff.
Proof. We have proven in Section 10.4 that (non-zero) characters are automatically continuous
with norm one, so that Ω(A) ⊆ (A∗ )≤1 . By definition, b a(ϕ) = ϕ(a). Thus the topology gener-
ated by the b ∗
a is the restriction to Ω(A) ⊆ A of the σ(A , A)-topology, thus Hausdorff. Let {ϕι }
be a net in Ω(A) that converges to ψ ∈ A∗ w.r.t. the σ(A∗ , A)-topology. Then for all a, b ∈ A we
have ψ(ab) = limι ϕι (ab) = limι ϕι (a)ϕι (b) = ψ(a)ψ(b), so that ψ ∈ Ω(A). Thus Ω(A) ⊆ (A∗ )≤1
55
Herman Heine Goldstine (1913-2004). American mathematician and computer scientist. Worked on very pure
and very applied mathematics, like John von Neumann with whom he collaborated on computers.
113
is σ(A∗ , A)-closed, thus compact since (A∗ )≤1 is σ(A∗ , A)-compact by Alaoglu’s theorem.
The above works whether or not A is commutative, but we’ll now restrict to commutative
A since Ω(A) can be very small otherwise. We begin by completing Exercise 10.47:
17.2 Proposition Let X be a compact Hausdorff space and A = C(X, C). Then the map
X → Ω(A), x 7→ ϕx is a homeomorphism.
Proof. Injectivity was already proven in Exercise 10.47. In order to prove surjectivity, let
ϕ ∈ Ω(A) and put M = ker ϕ. Then M ⊆ A is a proper closed subalgebra (in fact an ideal),
and it is self-adjoint by Lemma 12.19 since A is a C ∗ -algebra. If x, y ∈ X, x 6= y, pick f ∈ A
with f (x) 6= f (y). With g = f − ϕ(f )1 we have ϕ(g) = 0, thus g ∈ M . This proves that
M separates the points of X, yet it is not dense in A. Now the incarnation Corollary A.34
of the Stone-Weierstrass theorem implies that there must be an x ∈ X at which M vanishes
identically, i.e. ϕx (f ) = 0 for all f ∈ M . Now for every f ∈ A we have f − ϕ(f ) ∈ M , thus
ϕx (f − ϕ(f )1) = 0, which is equivalent to ϕx (f ) = ϕ(f ). Thus ι : X → Ω(A) is surjective.
If {xι } ⊆ X such that xι → x then ϕxι (f ) = f (xι ) → f (x) = ϕx (f ) for every f ∈ A by
continuity of f . But this precisely means that ϕxι → ϕx w.r.t. the weak-∗ topology. Thus ι is
continuous. As a continuous bijection of compact Hausdorff spaces it is a homeomorphism.
17.3 Definition Let A be a unital Banach algebra. Then its radical is the set of quasi-
nilpotent elements: radA = {a ∈ A | r(a) = 0}. We call A semisimple if radA = {0}.
π : A → C(Ω(A), C), a 7→ b
a (17.1)
kb
ak = sup |b
a(ϕ)| = sup |ϕ(a)| = r(a) ≤ kak,
ϕ∈Ω(A) ϕ∈Ω(A)
The Gelfand homomorphism can fail to be surjective or injective or both. See Section 17.2
for an important example for the failure of surjectivity and Exercise 17.6 for a non-trivial unital
Banach algebra with very large radical.
17.5 Proposition Let A be a commutative unital Banach algebra and a ∈ A such that A is
generated by {1, a}. Then the map ba : Ω(A) → σ(a) is a homeomorphism. The same conclusion
holds if a ∈ InvA and A is generated by {1, a, a−1 }.
114
Proof. We know from (10.7) that b a(Ω(A)) = σ(a), thus b
a is surjective. Assume b
a(ϕ1 ) = b
a(ϕ2 ),
thus ϕ1 (a) = ϕ2 (a). Since the ϕi are unital homomorphisms, this implies ϕ1 (a ) = ϕ2 (an )
n
for all n ∈ N0 , so that ϕ1 , ϕ2 agree on the polynomials in a. Since the latter are dense in A
by assumption and the ϕi are continuous, this implies ϕ1 = ϕ2 . Thus b a : Ω(A) → σ(a) is
injective, thus a continuous bijection. Since Ω(A) is compact and σ(a) ⊆ C Hausdorff, b a is a
homeomorphism. This proves the first claim.
For the second claim, note that ϕ(a)ϕ(a−1 ) = ϕ(aa−1 ) = ϕ(1) = 1, thus ϕ(a−1 ) = ϕ(a)−1 ,
for each ϕ ∈ Ω(A). This implies that ϕ1 (an ) = ϕ2 (an ) also holds for negative n ∈ Z. Now
ϕ1 , ϕ2 agree on all Laurent polynomials in a, thus on A by density and continuity. The rest of
the proof is the same.
17.6 Exercise Let α : N0 →P (0, ∞) be a map satisfying α(0) = 1 and αn+m ≤ αn αm ∀n, m.
For f : N0 → C, define kf k P
= n∈N0 αn |f (n)|, and A = {f : N0 → C | kf k < ∞}. For f, g ∈ A,
define f · g by (f · g)(n) = u,v∈N0 f (u)g(v).
u+v=n
17.7 Remark 1. Since every commutative unital Banach algebra has at least one non-zero
character ϕ, the worst that can happen is radA = ϕ−1 (0), which has codimension one, as in the
preceding exercise.
2. If A is a non-unital Banach algebra and a ∈ A one defines σ(a) = σAe(a), where Ae is
the unitization of A considered in Exercise 10.32. Now one defines r(a) = supλ∈σ(a) |λ| and
radA = r−1 (0) ⊆ A as before. Now for the non-unital subalgebra A0 = {f ∈ A | f (0) = 0} of
the A from Exercise 17.6 one easily proves Af0 ∼
= A, thus r(a) = 0 ∀a ∈ A0 and radA0 = A0 . 2
17.9 Example Let (A = `1 (Z, C), k · k, ?, 1) be the unital Banach algebra from Example 10.25.
In view of δn ? δm = δn+m , this algebra is generated by a = δ1 ∈ InvA and a−1 = δ−1 . We have
kak = ka−1 k = 1 so that by Exercise 10.13 we have σ(a) ⊆ S 1 . Now, for z ∈ S 1 define
X
ϕz : f 7→ f (n)z n , (17.2)
n∈Z
which is absolutely and uniformly convergent since f ∈ `1 . It is clear that ϕz (δn ) = z n , so that
ϕz (δn δm ) = ϕz (δn+m ) = z n+m = ϕz (δn )ϕz (δm ), proving ϕz ∈ Ω(A). In particular, ϕz (a) = z, so
that σ(a) = S 1 . Now Proposition 17.5 gives Ω(A) = {ϕz | z ∈ S 1 }. By uniform convergence in
115
R1
(17.2), one finds that fˇ(z) = ϕz (f ) is continuous in z and fbˇ(n) = 0 fˇ(e2πit )e−2πit dt = f (n) ∀n.
We have
kπ(f )k = r(f ) = sup |ϕz (f )| = sup |fˇ(z)| = kfˇk∞ ,
z∈S 1 z∈S 1
which vanishes only if f = 0 (by the fact that g ∈ C(S 1 , C) vanishes if and only if gb(n) = 0 ∀n ∈
Z, cf. e.g. [76, Chapter 2, Theorem 2.1]). Thus A is semisimple and π : `1 (Z) → C(S 1 , C) is
injective. But π is not surjective: Its image consists precisely of
X
B = {g ∈ C(S 1 , C) | |b
g (n)| < ∞}.
n∈Z
This is an algebra since A is. In (superficially) more elementary terms this is just the observation
that the pointwise product of two elements of C(S 1 , C) corresponds to convolution of their
Fourier coefficients and the fact that `1 (Z) is closed under convolution. For the g ∈ B the Fourier
series converges absolutely uniformly to g, but we have proven in Section 8.4 that C(S 1 , C)
has a dense subset of functions whose the Fourier series does not even converge pointwise
everywhere. (Our proof was non-constructive, but as we remarked, single examples can be
produced constructively.)
And functions in C(S 1 , C)\B can be writtenP∞ sin down even more concretely: With some ef-
fort (see [49] for an exposition) the series n=2 n lognxn can be shown to be uniformly conver-
gent
P∞ to some −1 f ∈ C(S 1 , C), and its Fourier coefficients are not absolutely summable since
n=2 (n log n) = ∞.
1 1 1
We now turn P the non-surjectivity of π : ` (Z)1 → C(S ) into a virtue! For g ∈ C(S , C)
define kgkB = n∈Z |b g (n)|. Thus B = {g ∈ C(S , C) | kgkB < ∞}. We have seen that the
Gelfand representation of `1 (Z) is an isometric isomorphism (`1 (Z), k · k1 ) → (B, k · kB ). Now we
can give Gelfand’s slick proof of the following result proven by Wiener with much more effort:
116
If ϕ1 6= ϕ2 then there is an a ∈ A such that ϕ1 (a) 6= ϕ2 (a), thus π(a)(ϕ1 ) = b
a(ϕ1 ) 6= b
a(ϕ2 ) =
π(a)(ϕ2 ). This proves that π(A) ⊆ C(Ω(A), R) separates the points of Ω(A). Since π is also
unital, the Stone-Weierstrass theorem (Corollary A.32) gives π(A) = π(A) = C(Ω(A), C).
Now we have another proof of the continuous functional calculus for normal elements of a
C ∗ -algebra:
117
at : f 7→ f ◦b
where the first map is b a. It is clear that αa is a unital homomorphism. If z : σ(a) ,→ C
−1
is the inclusion, then αa (z) = π (z ◦ b a) = π −1 (b
a) = a. Any continuous unital homomorphism
α : C(σ(a)) → B sending 1 to 1A and z to a coincides with αa on the polynomials C[x]. Since
the latter are dense in C(σ(a), C) by Stone-Weierstrass, we have α = αa .
(ii) As used above, C ∗ -subalgebra B = C ∗ (1, a) is abelian and there is an isometric ∗-
isomorphism π : B → C(σ(a), C). By construction of the functional calculus we have π(f (a)) =
f ◦ ι, where ι is the inclusion map σ(a) ,→ C. Now, with Theorem 11.27 and Exercise 10.6 we
have
σA (f (a)) = σB (f (a)) = σC(σ(a)) (π(f (a))) = σC(σ(a)) (f ◦ ι) = f (σ(a)).
(iii) This is essentially obvious, since applying f to a and g to f (a) is just composition of
maps on the right hand side of the Gelfand isomorphism.
17.15 Remark 1. It should be clear that Theorem 17.11 is conceptually of fundamental im-
portance, but it is not easy to find applications that are not just applications of the continuous
functional calculus for normal operators. The proof of the latter that we gave in Section 12.3
was a good deal more elementary than the one above: It did not involve the weak-∗ topology
and Alaoglu’s theorem, and it only needed Weierstrass’ classical theorem (in two dimensions)
rather than the more general result of Stone.
2. Enriching Theorem 17.11 by some considerations on von Neumann algebras (which we
don’t define here) one can prove representation theorems for commutative von Neumann alge-
bras, cf. e.g. [42, Chapter 6] or [50, Section 4.4], which add additional perspective to the spectral
theorem for normal operators. 2
A.1 Definition
P Let S be a set, (V, k · k) a normed space and f : S → V a function. We say
that s∈S f (s) exists (or converges) and equalsPx if there is x ∈ V such that for every ε > 0
there is a finite subset T ⊆ S such that kx − s∈U f (s)k < ε holds for every finite U ⊆ S
containing T .
118
In many cases, the above will be applied to V = F ∈ {R, C} and k · k = | · |.
This notion of summation has some useful properties:
P
A.2 Proposition (i) If s∈S f (s) exists then the sum x ∈ V is uniquely determined.
P P P
(ii) If s∈S f (s) = x and s∈S g(s) = y then s∈S (cf (s) + dg(s)) = cx + dy for all c, d ∈ F.
P P
(iii) If f (s) ≥ 0 ∀s ∈ S then s∈S f (s) exists if and only if sup{ t∈T f (t) | T ⊆ S finite} < ∞,
in which case the two expressions coincide. These equivalent conditions imply that the set
{s ∈ S | f (s) 6= 0} is at most countable.
P P P
P(V, k · k) is complete and s∈S kf (s)k < ∞ then s∈S f (s) exists, and k s∈S f (s)k ≤
(iv) If
s∈S kf (s)k.
P P
(v) If f : S → F is such that s∈S f (s) exists then s∈S |f (s)| exists.
The proofs of (i) and (ii) are straightforward and similar to those for the analogous state-
ments about series.
P The equivalence
P in (iii) follows from monotonicity of the map Pfin (S) →
[0, ∞), T 7→ t∈T f (t). If s∈S |f (s)| < ∞ then it follows that for every ε > 0 there are
at most finitely many s ∈ S such that |f (s)| ≥ ε. In particular, for every n ∈ N the set
Sn = {s ∈ S | |f (s)| ≥ 1/n} is finite. Since S∞a countable union of finite sets is countable, we
have countability of {s ∈ S | f (s) 6= 0} = n=1 Sn . The proof of (iv) is analogous to that of
the implication ⇒ in Proposition 3.2.
The statement (v) probably is surprising since P the analogous statement for series is false.
Roughly, the reason is that our definition of s∈S f (s) imposes no ordering on S, while the
sum of a series ∞
P
n=1 f (n) is invariant under reordering of the terms only if the series converges
absolutely. I strongly suggest that you make an effort to understand this! A rigorous proof of
(v) can be found, e.g., in [47].
In discussing the spaces `p (S, F), the following (which just is an easy special case of Lebesgue’s
dominated convergence theorem) is useful:
X X X X X X
fn (s) − h(s) ≤ fn (s) − h(s) + fn (s) − h(s) .
s∈S s∈S s∈T s∈T s∈S\T s∈S\T
119
due to the definition of n0 and n ≥ n0 ≥ nt . And the second is bounded by
X X ε
(|fn (s)| + |h(s)|) ≤ 2 g(s) ≤ ,
2
S\T S\T
A.2 Nets
The Definition A.1 of unordered sums is an instance of a much more general notion, the con-
vergence of nets.
A.4 Definition A directed set is a set I equipped with a binary relation ≤ on I satisfying
1. a ≤ a for each a ∈ I (reflexivity).
2. If a ≤ b and b ≤ c for a, b, c ∈ I then a ≤ c (transitivity).
3. For any a, b ∈ I there exists a c ∈ I such that a ≤ c and b ≤ c (directedness).
A.5 Remark If only 1. and 2. hold, (I, ≤) is called a pre-ordered set. Some authors, as e.g.
[42], require in addition that a ≤ b and b ≤ a together imply a = b (antisymmetry). Recall
that a pre-ordered set with this property is called partially ordered. But the antisymmetry is
an unnatural assumption in this context and is never used. 2
A.6 Example 1. Every totally ordered set (X, ≤) is a directed set. Only the directedness
needs to be shown, and it follows by taking c = max(a, b). In particular N is a directed set with
its natural total ordering.
2. If S is a set then the power set I = P (S) with its natural partial ordering is directed: For
the directedness, put c = a ∪ b. The same works for the set Pfin (S) of finite subsets of S, which
appeared in the definition of unordered sums.
3. If (X, τ ) is a topological space and x ∈ X, let Ux be the set of open neighborhoods of x.
Now for U, V ∈ Ux , define U ≤ V ⇔ U ⊆ V , thus we take the reversed ordering. Then (Ux , ≤)
is directed with c = a ∩ b.
A.8 Remark 1. With I = N and ≤ the natural total ordering, a net indexed by I just is a
sequence, and this net converges if and only if the sequence does.
2. Unordered summation is a special case of a net limit: If S is any set, let I be the set of
finite subsets of S and let ≤ be the ordinary (partial) ordering of subsets of S. If T, U ∈ I let
V = T ∪ U . Clearly T ≤ V, U ≤ V , showing that (I, ≤) is a directed set. (This is the same as
Example A.6.2, except that now we only look at finite subsets P of S.) Now given f : S → F, for
every T ∈ I, thus every finite T ⊆ S, we can clearly define t∈T f (t). Now
X X
f (s) = lim f (t),
T ∈I
s∈S t∈T
120
where the sum exists if and only if the limit exists. 2
Why nets? The reason is that sequences are totally inadequate for the study of topological
spaces that do not satisfy the first countability axiom.56 Given a metric space X and a subset
Y ⊆ X, one proves that x ∈ Y if and only if there is a sequence {yn } in Y converging to x, but
for general topological spaces this is false. Similarly, the statement that a function f : X → Y
is continuous at x ∈ X if and only if f (xn ) → f (x) for every sequence {xn } converging to x is
true for metric spaces, but false in general! (It is instructive to work out counterexamples.)
On the other hand:
A.10 Definition A net {xι }, indexed by a directed set (I, ≤), in a metric space (X, d) is a
Cauchy net if for every ε > 0 there is a ι0 ∈ I such that ι, ι0 ≥ ι0 ⇒ d(xι , xι0 ) < ε.
(In a normed space, this definition is consistent with the one in Remark 2.20.)
121
A.12 Definition A topological space X is completely regular if for every closed C ⊆ X and
y ∈ X\C there exists a continuous function f : X → [0, 1] such that f C = 0 and f (x) = 1.
All subspaces of a completely regular space are completely regular. By Urysohn’s lemma,
every normal space is completely regular, in particular every metrizable and every compact
Hausdorff space. This implies that complete regularity is a necessary condition for a space X
to have a compactification X
b that is Hausdorff. In fact, it also is sufficient:
A.13 Theorem Let X be a topological space. Then the following are equivalent:
(i) X is completely regular.
(ii) There exists a compact Hausdorff space βX together with a dense embedding X ,→ βX
such that for every continuous function f : X → Y , where Y is compact Hausdorff, there
exists a continuous fb : βX → Y such that fb X = f . (This fb is automatically unique by
density of X ⊆ βX.)
The universal property (ii) implies that βX is unique up to homeomorphism. ‘It’ is called the
Stone57 -Čech58 compactification of X.
Let X be completely regular and F ∈ {R, C}. Then the restriction map C(βX, F) → Cb (X, F)
given by f 7→ f X is a bijection and an isomorphism of commutative unital F-algebras.
There are many ways to prove the non-trivial implication (i)⇒(ii), the most common one
using Tychonov’s theorem, cf. [47]. But we can also use Gelfand duality for commutative C ∗ -
algebras, cf. Section 17. Here is a sketch: If X is completely regular, A = Cb (X, C) with
norm kf k = supx |f (x)| is a commutative unital C ∗ -algebra. As such it has a spectrum Ω(A),
which is compact Hausdorff. We define βX = Ω(A). There is a map ι : X → Ω(A), x 7→ ϕx ,
where ϕx (f ) = f (x). This map is continuous by definition of the topology on Ω(A). Using the
complete regularity of X one proves that ι is an embedding, i.e. a homeomorphism of X onto
ι(X) ⊆ Ω(A). Now ι(X) = Ω(A) is seen as follows: ι(X) 6= Ω(A) would imply (using Urysohn
or Tietze) that there are f ∈ A\{0} such that ι(x)(f ) = 0 for all x ∈ X. This is a contradiction,
since the elements of A are functions on X, so that ι(x)(f ) = 0 ∀x implies f = 0.
122
• u ∈ X is called an upper bound for Y ⊆ X if x ≤ u holds for each u ∈ y. If this u is in Y
it is called largest element of Y (which is unique).
A.16 Theorem Given the Zermelo-Frenkel axioms of set theory, the Axiom of Choice is equiv-
alent to Zorn’s lemma, which says: If (X, ≤) is a non-empty partially ordered set such that
every totally ordered subset Y ⊆ X has an upper bound then X has a maximal element.
A.17 Definition The Axiom of Countable Choice (ACω ) is the first (or third) of the above
versions of AC with the restriction that Y (respectively I) be at most countable.
A.18 Definition The Axiom of Countable Dependent Choice (DCω ) is the following: If X is
a set and R ⊆ X × X is such that for every x ∈ X there is a y ∈ X such that (x, y) ∈ R then
for each x1 ∈ X there is a sequence {xn } in X such that (xn , xn+1 ) ∈ R for all n ∈ N.
It is easy to prove AC ⇒ DCω ⇒ ACω . The converse implications are false.
A.19 Lemma Let (X, τ ) be a topological space with X =6 ∅. Then Y ⊆ X is dense if and only
if Y ∩ W 6= ∅ whenever ∅ =
6 W ∈ τ . (Equivalently, X\Y has empty interior.)
A.20 Theorem 59 Let (X, Td) be a complete metric space and {Un }n∈N a countable family of
∞
dense open subsets. Then n=1 Un is dense in X.
Proof. Let ∅ 6= W ∈ τ . Since U1 is dense, W ∩ U1 6= ∅ by Lemma A.19, so we can pick
x1 ∈ W ∩ U1 . Since W ∩ U1 is open, we can choose ε1 > 0 such that B(x1 , ε1 ) ⊆ W ∩ U1 . We
may also assume ε1 < 1. Since U2 is dense, U2 ∩ B(x1 , ε1 ) 6= ∅ and we pick x2 ∈ U2 ∩ B(x1 , ε1 ).
By openness, we can pick ε2 ∈ (0, 1/2) such that B(x2 , ε2 ) ⊆ U2 ∩ B(x1 , ε1 ). Continuing this
iteratively, we find points xn and εn ∈ (0, 1/n) such that B(xn , εn ) ⊆ Un ∩ B(xn−1 , εn−1 ) ∀n. If
i > n and j > n we have by construction that xi , xj ∈ B(xn , εn ) and thus d(xi , xj ) ≤ 2εn < 2/n.
Thus {xn } is a Cauchy sequence, and by completeness it converges to some z ∈ X. Since
n > k ⇒ xn ∈ B(xk , εk ), the limit z is contained in B(xk , εk ) for each k, thus
\ \
z∈ B(xn , εn ) ⊆ W ∩ Un ,
n n
T
thus W ∩ Tn Un is non-empty. Since W was an arbitrary non-empty open set, Lemma A.19
gives that n Un is dense.
A.21 Corollary Let (X, d) be a complete metric space and {Cn}n∈N a countable family of closed subsets with empty interior. Then ⋃_{n=1}^∞ Cn has empty interior.
Proof. The sets Un = X\Cn, n ∈ N, are open and \overline{U_n} = \overline{X\setminus C_n} = X\setminus (C_n)^0 = X since the interiors (C_n)^0 are empty. Thus the Un are dense, so that ⋂_n U_n is dense by Baire's theorem, thus \overline{⋂_n U_n} = \overline{⋂_n (X\setminus C_n)} = X. Thus with (X\setminus Y)^0 = X \setminus \overline{Y} we have
(⋃_n C_n)^0 = (X \setminus ⋂_n (X\setminus C_n))^0 = X \setminus \overline{⋂_n (X\setminus C_n)} = ∅,
i.e. ⋃_n C_n has empty interior.
59
René-Louis Baire (1874-1932). French mathematician, proved this for Rn in his 1899 doctoral thesis. The gener-
alization is due to Hausdorff (1914).
A.22 Remark 1. There are many other ways of stating Baire’s theorem, but most of the
alternative versions introduce additional terminology (nowhere dense sets, meager sets, sets of
first or second category, etc.) and tend to obscure the matter unnecessarily.
2. An intersection ⋂_n Un of a countable family {Un}n∈N of open sets is called a Gδ-set.
3. The proof implicitly used the axiom DCω of countable dependent choice. (Making this
explicit would be a tedious exercise.) Remarkably, one can prove that the (Zermelo-Fraenkel)
axioms of set theory (without any choice axiom) combined with Baire’s theorem imply DCω .
4. Some results usually proven using Baire’s theorem can alternatively be proven without it.
But in most cases, such alternative proofs will also use DCω and therefore not be better from
a foundational point of view. The proof of Theorem 8.2 is a rare exception. 2
A typical application of Baire’s theorem is the following (for a proof see, e.g., [47]):
A.23 Theorem There is a k · k∞ -dense Gδ -set F ⊆ C([0, 1], R) such that every f ∈ F is
nowhere differentiable.
Note that a single function f ∈ C([0, 1], R) that is nowhere differentiable can be written down quite explicitly and constructively, for example f(x) = \sum_{n=1}^{∞} 2^{-n} cos(2^n x). But for proving that such functions are dense one needs Baire's theorem.
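The series is also easy to evaluate numerically. The following small Python sketch (an illustration only; the helper f_partial, the sample grid and the truncation levels are ad-hoc choices, not part of the text) computes partial sums and their largest difference quotients on a grid; these grow with the truncation level, a numerical hint at the roughness of the limit function.

    import numpy as np

    def f_partial(x, N):
        """Partial sum sum_{n=1}^N 2^{-n} cos(2^n x) of the series above."""
        x = np.asarray(x, dtype=float)
        return sum(2.0**(-n) * np.cos(2.0**n * x) for n in range(1, N + 1))

    xs = np.linspace(0.0, 1.0, 2001)
    for N in (2, 8, 20):
        y = f_partial(xs, N)
        # largest difference quotient on the grid: it grows with N, reflecting
        # the increasing roughness of the partial sums
        print(N, np.max(np.abs(np.diff(y) / np.diff(xs))))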
A.25 Theorem Let f ∈ C([a, b], F) and ε > 0. Then there exists a polynomial P ∈ F[x] such
that |f (x) − P (x)| ≤ ε for all x ∈ [a, b]. (As always, F ∈ {R, C}.)
Proof. It clearly suffices to prove this for the interval [0, 1]. For n ∈ N and x ∈ [0, 1], define
P_n(x) = \sum_{k=0}^{n} f(k/n) \binom{n}{k} x^k (1−x)^{n−k}.
Since \sum_{k=0}^{n} \binom{n}{k} x^k (1−x)^{n−k} = 1 (cf. (A.1)), we have
f(x) − P_n(x) = \sum_{k=0}^{n} \big( f(x) − f(k/n) \big) \binom{n}{k} x^k (1−x)^{n−k},
thus
|f(x) − P_n(x)| ≤ \sum_{k=0}^{n} |f(x) − f(k/n)| \binom{n}{k} x^k (1−x)^{n−k}.    (A.2)
Since [0, 1] is compact and f : [0, 1] → F is continuous, it is bounded and uniformly continuous.
Thus there is M such that |f (x)| ≤ M for all x, and for each ε > 0 there is δ > 0 such that
|x − y| < δ ⇒ |f (x) − f (y)| < ε.
Let ε > 0 be given, and choose a corresponding δ > 0 as above. Let x ∈ [0, 1]. Define
A = { k ∈ {0, 1, . . . , n} : |k/n − x| < δ }.
For all k we have |f (x) − f (k/n)| ≤ 2M , and for k ∈ A we have |f (x) − f (k/n)| < ε. Thus with
(A.2) we have
|f(x) − P_n(x)| ≤ ε \sum_{k∈A} \binom{n}{k} x^k (1−x)^{n−k} + 2M \sum_{k∈A^c} \binom{n}{k} x^k (1−x)^{n−k}
             ≤ ε + 2M \sum_{k∈A^c} \binom{n}{k} x^k (1−x)^{n−k},    (A.3)
where we used (A.1) again. In an exercise, we will prove the purely algebraic identity
\sum_{k=0}^{n} \binom{n}{k} x^k (1−x)^{n−k} (k − nx)^2 = nx(1−x)    (A.4)
for all n ∈ N0 and x ∈ [0, 1] (in fact all x ∈ R). Now, k ∈ A^c is equivalent to |k/n − x| ≥ δ and to (k − nx)^2 ≥ n^2 δ^2. Multiplying both sides of the latter inequality by \binom{n}{k} x^k (1−x)^{n−k} and summing over k ∈ A^c, we have
n^2 δ^2 \sum_{k∈A^c} \binom{n}{k} x^k (1−x)^{n−k} ≤ \sum_{k∈A^c} \binom{n}{k} x^k (1−x)^{n−k} (k − nx)^2 ≤ \sum_{k=0}^{n} \binom{n}{k} x^k (1−x)^{n−k} (k − nx)^2 = nx(1−x),    (A.5)
where the last equality comes from (A.4). This implies
\sum_{k∈A^c} \binom{n}{k} x^k (1−x)^{n−k} ≤ \frac{nx(1−x)}{n^2 δ^2} ≤ \frac{1}{n δ^2},    (A.6)
where we used the obvious inequality x(1−x) ≤ 1 for x ∈ [0, 1]. Plugging (A.6) into (A.3) we have |f(x) − P_n(x)| ≤ ε + \frac{2M}{n δ^2}. This holds for all x ∈ [0, 1] since, by uniform continuity, δ does not depend on x. Choosing n large enough that 2M/(nδ^2) ≤ ε, we obtain ‖f − P_n‖_∞ ≤ 2ε, proving the theorem.
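Since the construction in the proof is completely explicit, it can be illustrated numerically. The following Python sketch (an illustration under ad-hoc choices: the sample function f, the values of n and the grid are arbitrary and not part of the text) evaluates the Bernstein polynomials P_n and the uniform error, and also checks the identity (A.4) for one pair (n, x).

    import numpy as np
    from math import comb

    def bernstein(f, n, x):
        """Evaluate P_n(x) = sum_{k=0}^n f(k/n) C(n,k) x^k (1-x)^(n-k)."""
        x = np.asarray(x, dtype=float)
        return sum(f(k / n) * float(comb(n, k)) * x**k * (1 - x)**(n - k)
                   for k in range(n + 1))

    f = lambda t: np.abs(t - 0.3) + np.sin(5 * t)      # some continuous function on [0, 1]
    xs = np.linspace(0.0, 1.0, 1001)
    for n in (10, 50, 200):
        print(n, np.max(np.abs(f(xs) - bernstein(f, n, xs))))   # uniform error shrinks with n

    # numerical check of the identity (A.4) for one choice of n and x
    n, x = 17, 0.37
    lhs = sum(comb(n, k) * x**k * (1 - x)**(n - k) * (k - n * x)**2 for k in range(n + 1))
    print(abs(lhs - n * x * (1 - x)))                   # ~ 0 up to rounding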
A.27 Corollary There exists a sequence {pn}n∈N ⊆ R[x] of real polynomials that converges uniformly on [0, 1] to the function x ↦ √x.
The above corollary can also be proven directly:
A.28 Exercise Put p0 = 0 and define recursively
p_{n+1}(x) = p_n(x) + \frac{x − p_n(x)^2}{2}.    (A.7)
Prove by induction that the following holds:
(i) p_n(x) ≤ √x for all n ∈ N0, x ∈ [0, 1].
(ii) The sequence {p_n(x)} increases monotonically for each x ∈ [0, 1] and converges uniformly to √x.
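The recursion (A.7) is equally easy to run. The sketch below (assuming the initial value p0 = 0, as above; grid and iteration counts are ad-hoc choices) illustrates the uniform convergence claimed in (ii); note that the convergence is slow near x = 0.

    import numpy as np

    def p_n(x, n):
        """Iterate p_{k+1}(x) = p_k(x) + (x - p_k(x)^2)/2, starting from p_0 = 0."""
        p = np.zeros_like(x)
        for _ in range(n):
            p = p + (x - p**2) / 2.0
        return p

    xs = np.linspace(0.0, 1.0, 1001)
    for n in (5, 20, 100, 500):
        print(n, np.max(np.abs(np.sqrt(xs) - p_n(xs, n))))   # uniform error tends to 0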
A.30 Theorem (M. H. Stone 1937) If X is compact Hausdorff and A ⊆ C(X, R) is a unital subalgebra separating points then \overline{A} = C(X, R).
Proof. Replacing A by its closure \overline{A}, the claim is equivalent to showing that A = C(X, R) whenever A is in addition closed. We proceed in several steps. We claim that f ∈ A implies |f| ∈ A. Since f is bounded due to compactness, it clearly is enough to prove this under the assumption |f| ≤ 1. With the pn of Corollary A.27, we have (x ↦ pn(f(x)^2)) ∈ A since A is a unital algebra. Since pn ∘ f^2 converges uniformly to \sqrt{f^2} = |f|, closedness of A implies |f| ∈ A. In view of
max(f, g) = \frac{f + g + |f − g|}{2},   min(f, g) = \frac{f + g − |f − g|}{2},
and the preceding result, we see that f, g ∈ A implies min(f, g), max(f, g) ∈ A. By induction,
this extends to pointwise minima/maxima of finite families of elements of A.
Now let f ∈ C(X, R). Our goal is to find fε ∈ A satisfying kf − fε k < ε for each ε > 0.
Since A is closed, this will give A = C(X, R).
If a ≠ b, the fact that A separates points gives us an h ∈ A such that h(a) ≠ h(b). Thus the function h_{a,b}(x) = \frac{h(x) − h(a)}{h(b) − h(a)} is in A, continuous and satisfies h_{a,b}(a) = 0, h_{a,b}(b) = 1. Thus also f_{a,b}(x) = f(a) + (f(b) − f(a)) h_{a,b}(x) is in A, and it satisfies f_{a,b}(a) = f(a) and f_{a,b}(b) = f(b).
This implies that the sets
Ua,b,ε = {x ∈ X | fa,b (x) < f (x) + ε}, Va,b,ε = {x ∈ X | fa,b (x) > f (x) − ε}
are open neighborhoods of a and b, respectively, for every ε > 0. Thus keeping b, ε fixed, {U_{a,b,ε}}_{a∈X} is an open cover of X, and by compactness we find a finite subcover {U_{a_i,b,ε}}_{i=1}^n. By the above preparation, the function f_{b,ε} = min(f_{a_1,b}, . . . , f_{a_n,b}) is in A. If x ∈ U_{a_i,b,ε} then f_{b,ε}(x) ≤ f_{a_i,b}(x) < f(x) + ε, and since {U_{a_i,b,ε}}_{i=1}^n covers X, we have f_{b,ε}(x) < f(x) + ε ∀x. For all x ∈ V_{b,ε} = ⋂_{i=1}^n V_{a_i,b,ε} we have f_{a_i,b}(x) > f(x) − ε for each i, and therefore f_{b,ε}(x) = min_i f_{a_i,b}(x) > f(x) − ε. Now {V_{b,ε}}_{b∈X} is an open cover of X, and we find a finite subcover {V_{b_j,ε}}_{j=1}^m. Then f_ε = max(f_{b_1,ε}, . . . , f_{b_m,ε}) is in A. Now f_ε(x) = max_j f_{b_j,ε}(x) < f(x) + ε holds everywhere, and for x ∈ V_{b_j,ε} we have f_ε(x) ≥ f_{b_j,ε}(x) > f(x) − ε. Since {V_{b_j,ε}}_j covers X, we conclude that f_ε(x) ∈ (f(x) − ε, f(x) + ε) for all x, to wit ‖f − f_ε‖ < ε.
Since the polynomial ring R[x] is an algebra, and the polynomials clearly separate the points
of R, Theorem A.30 recovers Theorem A.25. (This is not circular if one has used Exercise A.28 to
prove Corollary A.27.) But we immediately have the higher dimensional generalization (which
can also be proven by more classical methods, like approximate units):
A.7.3 Generalizations
Having proven Theorem A.30, it is easy to generalize it to locally compact spaces and/or subalgebras of C0(X, C).63 Recall that a subset S of a ∗-algebra A is called self-adjoint if S = S∗ := {s∗ | s ∈ S}.
Proof. Define B = A ∩ C(X, R). Let f ∈ A. Since f* ∈ A, we also have Re(f) = (f + f*)/2 ∈ B and Im(f) = (f − f*)/(2i) = −Re(if) ∈ B. Thus A = B + iB. It is obvious that Re(A) = B ⊆ C(X, R) is a unital subalgebra. If x ≠ y then there is f ∈ A such that f(x) ≠ f(y). Thus Re(f)(x) ≠ Re(f)(y) or Re(if)(x) ≠ Re(if)(y) (or both). Since Re(f), Re(if) ∈ B, we see that B separates points. Thus \overline{B} = C(X, R) by Theorem A.30, implying A = B + iB = \overline{B} + i\overline{B} = C(X, R) + iC(X, R) = C(X, C).
It is therefore natural to ask whether the (relative) compactness of a set F ⊆ Cb (X, Y ) can
be characterized in terms of the elements of F, which after all are functions f : X → Y .
This will be the subject of this section, but we will restrict ourselves to compact X, for which
C(X, Y ) = Cb (X, Y ).
A.37 Definition Let (X, τ) be a topological space and (Y, d) a metric space. A family F of functions X → Y is called equicontinuous if for every x ∈ X and ε > 0 there is an open neighborhood U ∋ x such that f ∈ F, x′ ∈ U ⇒ d(f(x), f(x′)) < ε. Then F ⊆ C(X, Y).
The point of course is that the choice of U depends only on x and ε, but not on f ∈ F.
A.38 Theorem (Arzelà-Ascoli) 64 Let (X, τ ) be a compact topological space and (Y, d) a
complete metric space. Then F ⊆ C(X, Y ) is (pre)compact (w.r.t. the uniform topology τD ) if
and only if
• {f (x) | f ∈ F} ⊆ Y is (pre)compact for every x ∈ X,
• F is equicontinuous.
Proof. ⇒ If f, g ∈ C(X, Y) then d(f(x), g(x)) ≤ D(f, g) for every x ∈ X. This implies that the evaluation map e_x : C(X, Y) → Y, f ↦ f(x) is continuous for every x. Thus if \overline{F} is compact, so is e_x(\overline{F}), and compactness of e_x(\overline{F}) implies that it is closed. Since e_x(\overline{F}) contains e_x(F) = {f(x) | f ∈ F}, also \overline{e_x(F)} ⊆ e_x(\overline{F}) is compact.
To prove equicontinuity, let x ∈ X and ε > 0. Since \overline{F} is compact, F is totally bounded, thus there are g1, . . . , gn ∈ F such that F ⊆ ⋃_i B_D(g_i, ε). By continuity of the g_i, there are open U_i ∋ x, i = 1, . . . , n, such that x′ ∈ U_i ⇒ d(g_i(x), g_i(x′)) < ε. Put U = ⋂_i U_i. If now f ∈ F, there is an i such that f ∈ B_D(g_i, ε), to wit D(f, g_i) < ε. Now for x′ ∈ U ⊆ U_i we have
d(f(x), f(x′)) ≤ d(f(x), g_i(x)) + d(g_i(x), g_i(x′)) + d(g_i(x′), f(x′)) < 3ε,
which proves equicontinuity (with 3ε instead of ε).
A.39 Lemma Let (X, d) be a metric space. Assume that for each ε > 0 there are a δ > 0, a metric space (Y, d′) and a continuous map h : X → Y such that (h(X), d′) is totally bounded and such that d′(h(x), h(x′)) < δ implies d(x, x′) < ε. Then (X, d) is totally bounded.
Proof. For ε > 0, pick δ, (Y, d′), h as asserted. Since h(X) is totally bounded, there are y1, . . . , yn ∈ h(X) such that h(X) ⊆ ⋃_i B(y_i, δ) ⊆ Y. Then X = ⋃_i h^{−1}(B(y_i, δ)). For each i choose x_i ∈ X such that h(x_i) = y_i. Now x ∈ h^{−1}(B(y_i, δ)) ⇒ d′(h(x), y_i) < δ ⇒ d(x, x_i) < ε, so that h^{−1}(B(y_i, δ)) ⊆ B(x_i, ε). Thus X = ⋃_{i=1}^n B(x_i, ε), and (X, d) is totally bounded.
⇐ Now assume that the two conditions hold. We verify the hypotheses of Lemma A.39 for (F, D). Let ε > 0. By equicontinuity and compactness of X there are x1, . . . , xn ∈ X and open sets U_i ∋ x_i covering X such that x ∈ U_i implies d(f(x), f(x_i)) < ε for all f ∈ F. Consider h : F → Y^n, f ↦ (f(x_1), . . . , f(x_n)), with Y^n carrying the maximum metric d′; by the first condition each set {f(x_i) | f ∈ F} is totally bounded, hence so is h(F). If now f, g ∈ F satisfy d′(h(f), h(g)) < ε and x ∈ X, pick i with x ∈ U_i; then
d(f(x), g(x)) ≤ d(f(x), f(x_i)) + d(f(x_i), g(x_i)) + d(g(x_i), g(x)) < 3ε.
Since this holds for all x ∈ X, we have D(f, g) ≤ 3ε. Thus the assumptions of Lemma A.39 are satisfied (with 3ε instead of ε), and we obtain total boundedness, thus precompactness, of F.
64
Giulio Ascoli (1843-1896), Cesare Arzelà (1847-1912), Italian mathematicians. They proved special cases of this
result, of which there also exist more general versions than the one above.
A.40 Remark 1. If Y = Rn , as in most statements of the theorem, then in view of the Heine-
Borel theorem the requirement of precompactness of {f (x) | f ∈ F } for each x reduces to
that of boundedness, i.e. pointwise boundedness of F. One can also formulate the theorem in
terms of existence of uniformly convergent (or Cauchy) subsequences of bounded equicontinuous
sequences in C(X, Rn ).
2. We intentionally stated a more general version of the theorem than needed in order to
argue that the result belongs to general topology rather than functional analysis. For Y = Rn
this is less clear, also since there are many alternative proofs of the theorem using various
methods from topology and functional analysis, cf. e.g. [51]. (This is no surprise since, as
explained in [47], the theorems of Alaoglu, the Stone-Čech compactification and Tychonov’s
theorem for Hausdorff spaces are all equivalent, i.e. easily deducible from each other.) 2
A.43 Definition A complex measure on a measurable space (X, A) is a map µ : A → C such that µ(∅) = 0 and µ(⋃_{n=1}^∞ A_n) = \sum_{n=1}^∞ µ(A_n) whenever {A_n} ⊆ A is a countable family of mutually disjoint sets.
Note that complex measures are by definition bounded. Furthermore, if {A_n} is a countable family of mutually disjoint sets then automatically \sum_n |µ(A_n)| < ∞ since µ(⋃_n A_n) is invariant under permutations of the A_n.
Theorems not explicitly referring to R or C have a better chance of carrying over to p-adic
functional analysis. For example, the open mapping theorem and both versions of the uniform
boundedness theorem generalize without change. However, one has to be careful with the above
rule since there are properties, like connectedness, shared by R and C, but not enjoyed by the
p-adic fields! There are other problems: There is no a priori relationship between the subsets
S1 = {|c| | c ∈ F} and S2 = {kxk | x ∈ V } of [0, ∞). Thus given x ∈ V \{0} there may not be a
c ∈ F such that kcxk = 1.
We also have to be very careful with results on Hilbert spaces, since scalars in F can be
pulled out of inner products without picking up an absolute value: hcx, yi = chx, yi. Indeed
this leads to problems adapting the proof of Theorem 5.24. The same holds for the polarization
identities.
We leave the discussion here and refer to the literature on p-adic (functional) analysis for
more information. See e.g. [24, 61, 62, 57].
B.2 Theorem (i) k · k and k · k0 are equivalent norms on f a(S, F). We write
ba(S, F) = {µ ∈ f a(S, F) | kµk0 < ∞ (⇔ kµk < ∞)}.
Thus k · k, k · k0 are norms on f a(S, F). The definition of k · k clearly implies |µ(A)| ≤ kµk for
each A ⊆ S, whence kµk0 ≤ kµk.
Assume µ ∈ fa(S, R) and ‖µ‖′ < ∞. If A1, . . . , AK ⊆ S are mutually disjoint, put
A_+ = ⋃{A_k | µ(A_k) ≥ 0},   A_− = ⋃{A_k | µ(A_k) < 0}.
Now by finite additivity, \sum_k |µ(A_k)| = µ(A_+) − µ(A_−) ≤ 2‖µ‖′ since |µ(A_±)| ≤ ‖µ‖′. Taking the supremum over the families {A_k} gives ‖µ‖ ≤ 2‖µ‖′.
If µ ∈ f a(S, C), writing µ = Re µ + i Im µ we find kµk ≤ 4kµk0 . Thus kµk0 ≤ kµk ≤ 4kµk0
for all µ, and the two norms are equivalent.
(ii) Here it is more convenient to work with the simpler norm k·k0 . Now let {µn } be a Cauchy
sequence in ba(S, F). Then |µn (A) − µm (A)| ≤ kµn − µm k0 , so that {µn (A)} is Cauchy, thus
convergent. Define µ(A) = lim_n µ_n(A). It is clear that µ(∅) = 0. If A1 , . . . , AK are mutually
disjoint then
µ(A1 ∪· · ·∪AK ) = lim µn (A1 ∪· · ·∪AK ) = lim (µn (A1 )+· · ·+µn (AK )) = µ(A1 )+· · ·+µ(AK ),
n→∞ n→∞
so that µ is finitely additive. Since {µn } is Cauchy, for every ε > 0 there is n0 such that n, m ≥ n0
implies kµm − µn k0 < ε. In particular there is n0 such that kµm k0 ≤ kµn0 k0 + 1 for m ≥ n0 .
This implies boundedness of µ. And taking m → ∞ in |µn (A) − µm (A)| ≤ kµn − µm k0 < ε gives
kµn − µk0 ≤ ε, so that kµn − µk0 → 0. Thus ba(S, F) is complete (w.r.t. k · k0 , thus also w.r.t.
k · k).
(iii) It is clear that ℓ∞(S, F)* → fa(S, F), ϕ ↦ µ_ϕ is linear. Now let A1, . . . , AK ⊆ S be mutually disjoint. Then
\sum_{k=1}^{K} |µ_ϕ(A_k)| = \sum_{k=1}^{K} sgn(µ_ϕ(A_k)) µ_ϕ(A_k) = \sum_{k=1}^{K} sgn(µ_ϕ(A_k)) ϕ(χ_{A_k}) = ϕ\Big( \sum_{k=1}^{K} sgn(µ_ϕ(A_k)) χ_{A_k} \Big).
B.3 Theorem (i) For each µ ∈ ba(S, F) there is a unique linear functional ∫_µ ∈ ℓ∞(S, F)* such that ∫_µ(χ_A) = µ(A) for all A ⊆ S; it satisfies ‖∫_µ‖ ≤ ‖µ‖.
(ii) The maps α : ℓ∞(S, F)* → ba(S, F), ϕ ↦ µ_ϕ and ∫ : ba(S, F) → ℓ∞(S, F)*, µ ↦ ∫_µ are mutually inverse isometric bijections.
Thus ∫_µ : f ↦ ∫ f dµ is a linear functional on the bounded finite-image functions. It is clear that this is the unique linear functional sending χ_A to µ(A) for each A ⊆ S. Now
| ∫ f dµ | ≤ \sum_{k=1}^{K} |c_k| |µ(A_k)| ≤ ‖f‖_∞ \sum_{k=1}^{K} |µ(A_k)| ≤ ‖f‖_∞ ‖µ‖.
Thus ∫_µ is a bounded functional, and since the bounded finite-image functions are dense in ℓ∞(S, F) by Lemma 4.13, ∫_µ has a unique extension to a linear functional ∫_µ ∈ ℓ∞(S, F)* with ‖∫_µ‖ ≤ ‖µ‖.
(ii) If µ ∈ ba(S, F) then by definition of ∫_µ we have ∫ χ_A dµ = µ(A) for all A ⊆ S. Thus α ∘ ∫ = id_{ba(S,F)}.
If ϕ ∈ ℓ∞(S, F)* then in view of the definition of ∫ we have ∫ χ_A dµ_ϕ = µ_ϕ(A) = ϕ(χ_A) for all A ⊆ S. Thus ϕ and ∫_{µ_ϕ} coincide on all characteristic functions, thus on all of ℓ∞(S, F) by linearity, density of the finite-image functions and the ‖·‖_∞-continuity of ϕ and ∫_{µ_ϕ}. Thus ∫ ∘ α = id_{ℓ∞(S,F)*}.
Since the maps α and ∫ are mutually inverse and both norm-decreasing, they actually both are isometries.
This completes the determination of `∞ (S, F)∗ . (Note that we did not use the completeness
of ba(S, F) proven in Theorem B.2(ii). Thus it would also follow from the isometric bijection ba(S, F) ≅ ℓ∞(S, F)* just established.)
B.4 Exercise Given µ ∈ ba(S, F), prove that µ is {0, 1}-valued if and only if ∫_µ ∈ ℓ∞(S, F)* is multiplicative.
Since `∞ (S, F)∗ has a closed subspace ι(`1 (S, F)), it is interesting to identify the correspond-
ing subspace of ba(S, F).
B.5 Definition A finitely additive measure µ ∈ ba(S, F) is called countably additive if for
every countable family A ⊆ P(S) of mutually disjoint sets we have
µ\Big( ⋃_{A∈A} A \Big) = \sum_{A∈A} µ(A)
and totally additive if the same holds for any family of mutually disjoint sets. The set of
countably and totally additive measures on S are denoted ca(S, F) and ta(S, F), respectively.
(ii) ∫_µ ∈ ℓ∞(S, F)* is normal, thus ∫ f dµ = lim_ι ∫ f_ι dµ for every net {f_ι} ⊆ F^S that is uniformly bounded and converges pointwise to f.
Proof. (i)⇒(ii) If µ is of the given form then clearly ∫ χ_A dµ = µ(A) = \sum_{s∈A} g(s) for each A ⊆ S. By the way ∫_µ is constructed from µ, it is clear that ∫ f dµ = \sum_{s∈S} f(s)g(s) for all f ∈ ℓ∞(S, F). Thus ∫_µ = ϕ_g, and normality of ∫_µ follows from Proposition 13.6.
(ii)⇒(iii) We know that we can recover µ from ∫_µ as µ(A) = ∫ χ_A dµ. Let A be a family of mutually disjoint subsets of S. Then the net {f_F = χ_{⋃F}}, indexed by the finite subsets F ⊆ A, is uniformly bounded and converges pointwise to χ_B, where B = ⋃A. Now normality of ∫_µ implies that µ(B) = ∫ χ_B dµ = lim_F ∫ f_F dµ = lim_F \sum_{A∈F} µ(A) = \sum_{A∈A} µ(A), which is additivity of µ.
(iii)⇒(i) If we put g(s) = µ({s}) then additivity of µ means that µ(A) = \sum_{s∈A} g(s) for all A ⊆ S, convergence being absolute. Now the finiteness of µ(S) gives ‖g‖_1 < ∞.
(iii)⇒(iv) is trivial. If S is countable then a family of mutually disjoint non-empty subsets
of S is at most countable, so that (iii) and (iv) are equivalent.
Thus we have the situation of the following diagram:

                          ℓ1(S, F)
                       ≅ ↙        ↘ ≅
     (ℓ∞(S, F)*)_n  ──────  ≅  ──────→  ta(S, F)
           ∩                                ∩
           │                                │
      ℓ∞(S, F)*     ──────  ≅  ──────→  ba(S, F)
Q. Thus Q does not have property S and we are done. For the construction of such an F we
use the following lemma:
B.9 Lemma Every countably infinite set X admits a family {Xλ }λ∈Λ of subsets of X such that
(i) Λ has cardinality c = #R, in particular it is uncountable.
(ii) Xλ is infinite for each λ ∈ Λ.
(iii) Xλ ∩ Xλ′ is finite for all λ, λ′ ∈ Λ, λ ≠ λ′.
Proof. Take Y = (0, 1) ∩ Q and Λ = (0, 1)\Q. Clearly Y is countable and Λ is uncountable (since removing a countable set from one of cardinality c does not change the cardinality). For each λ ∈ Λ pick a sequence {a_n} ⊆ Y converging to λ (for example a_n = ⌊nλ⌋/n) and put Y_λ = {a_n | n ∈ N}. That each Y_λ is infinite follows from the irrationality of λ and the rationality of the a_n. If λ ≠ λ′ and a_n → λ, a′_n → λ′ then there exists n_0 such that n, n′ ≥ n_0 ⇒ max(|a_n − λ|, |a′_{n′} − λ′|) < |λ − λ′|/2, so that a_n ≠ a′_{n′}. This implies #(Y_λ ∩ Y_{λ′}) < ∞. We thus have a family of subsets of Y with all desired properties. For an arbitrary countably infinite set X the claim now follows using a bijection X ≅ Y.
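The construction in the proof is explicit, so it can be tried out directly. The following Python sketch (an illustration only; the particular irrationals and the cut-off N = 10000 are arbitrary choices) builds finite pieces of the sets Y_λ = {⌊nλ⌋/n | n ∈ N} and confirms that the pairwise intersections stay small while each Y_λ is large.

    from math import floor, sqrt

    def Y(lam, N=10000):
        """First N members of the sequence a_n = floor(n*lam)/n converging to lam."""
        return {floor(n * lam) / n for n in range(1, N + 1)}

    lams = [sqrt(2) - 1, sqrt(3) - 1, (sqrt(5) - 1) / 2]     # a few irrationals in (0, 1)
    sets = [Y(lam) for lam in lams]
    print([len(s) for s in sets])                            # each Y_lambda has many elements
    for i in range(len(lams)):
        for j in range(i + 1, len(lams)):
            print(i, j, len(sets[i] & sets[j]))              # pairwise intersections are small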
Let {X_λ}_{λ∈Λ} be a family of subsets of N as provided by the lemma. For λ ∈ Λ, the characteristic function χ_{X_λ} : N → {0, 1} ⊆ C clearly is in ℓ∞. Let p : ℓ∞ → Q = ℓ∞/c_0 be the quotient map. Now let q_λ = p(χ_{X_λ}) and F = {q_λ | λ ∈ Λ}. If λ, λ′ ∈ Λ, λ ≠ λ′, the symmetric difference X_λ ∆ X_{λ′} = (X_λ ∪ X_{λ′})\(X_λ ∩ X_{λ′}) is infinite by (ii) and (iii). Thus χ_{X_λ} − χ_{X_{λ′}} ∉ c_0 = ker p, so that λ ↦ q_λ is injective, thus with (i) we see that F is uncountable.
Let now ϕ ∈ Q*, m, n ∈ N and let λ_1, . . . , λ_m ∈ Λ be mutually distinct and such that |ϕ(q_{λ_i})| ≥ 1/n ∀i = 1, . . . , m. For each i pick t_i with |t_i| = 1 such that t_i ϕ(q_{λ_i}) = |ϕ(q_{λ_i})|. Put f = \sum_{i=1}^{m} t_i χ_{X_{λ_i}} ∈ ℓ∞. Since the sets X_{λ_i} have pairwise finite intersections, the function f has absolute value larger than one only on a subset of the finite set ⋃_{j≠k} X_{λ_j} ∩ X_{λ_k} and absolute value one on the infinite set (⋃_i X_{λ_i}) \ (⋃_{j≠k} X_{λ_j} ∩ X_{λ_k}). This implies that ‖p(f)‖ = inf_{g∈c_0} ‖f − g‖_∞ = 1. Thus
‖ϕ‖ ≥ |ϕ(p(f))| = \Big| \sum_{i=1}^{m} t_i ϕ(p(χ_{X_{λ_i}})) \Big| = \Big| \sum_{i=1}^{m} t_i ϕ(q_{λ_i}) \Big| = \sum_{i=1}^{m} |ϕ(q_{λ_i})| ≥ \frac{m}{n}.
Thus m ≤ n‖ϕ‖ < ∞, so that for each ϕ ∈ Q* and n ∈ N there are at most n‖ϕ‖ distinct λ ∈ Λ with |ϕ(q_λ)| ≥ 1/n. If there was an uncountable F′ ⊆ F with ϕ(q) ≠ 0 ∀q ∈ F′, there
would have to be an n ∈ N such that |ϕ(q)| ≥ 1/n for infinitely (in fact uncountably) many
q ∈ F 0 , contradicting what we just proved. This completes the proof.
(iv) V ∗∗∗ /V ∗ ' (V ∗∗ /V )∗ . (We omitted the ι’s for simplicity.)
Proof. (i) Since ι_V, ι_{V*} are bounded, with Lemma 11.1 we have boundedness of P. Let ϕ ∈ V* and x ∈ V. Then
(P ι_{V*}(ϕ))(ι_V(x)) = ι_{V*}((ι_V)*(ι_{V*}(ϕ)))(ι_V(x)) = [(ι_V)*(ι_{V*}(ϕ))](x) = ϕ(x) = ι_{V*}(ϕ)(ι_V(x)),
which, using Exercise 7.12 several times, proves P ι_{V*}(ϕ) = ι_{V*}(ϕ). Thus P restricted to ι_{V*}(V*) is the identity. On the other hand, it follows directly from the definition of P that P V*** ⊆ ι_{V*}(V*). Combining these two facts gives P² = P and P V*** = ι_{V*}(V*).
(ii) This is an immediate consequence of (i) and Exercise 6.12.
(iii) If T : W → V* is an isomorphism then we have isomorphisms T* : V** → W* and T** : W** → V***. Using this it is straightforward to deduce the claim from (ii).
(iv) By Exercise 6.6 we have (V**/V)* ≅ V⊥ ⊆ V***. And by (ii), V*** ≅ V* ⊕ W, where W ≅ V***/V*, the isomorphism being given by x*** ↦ (P x***, (1 − P)x***) with P as in (i). Thus P V*** ≅ V* and V***/V* ≅ (1 − P)V***. Thus the claimed isomorphism follows if we prove that the subspaces V⊥ and (1 − P)V*** of V*** are equal.
Now, x∗∗∗ ∈ (1 − P )V ∗∗∗ means (1 − P )x∗∗∗ = x∗∗∗ , thus P x∗∗∗ = 0. Since P = ιV ∗ ◦ (ιV )∗ ,
where ιV ∗ is injective, this is equivalent to (ιV )∗ (x∗∗∗ ) = 0. By the definition of the transpose,
this means that x∗∗∗ ◦ ιV = 0. Since this is the same as x∗∗∗ ∈ ιV (V )⊥ , we are done.
B.11 Corollary c0 (N, F) is not isomorphic to the dual space of any Banach space.
Proof. We again abbreviate c_0(N, F) as c_0 etc. We know that c_0* ≅ ℓ1 and c_0** ≅ ℓ∞, the canonical map ι_{c_0} : c_0 → c_0** just being the inclusion map c_0 ↪ ℓ∞. By Theorem B.8, c_0 ⊆ ℓ∞ is not complemented. Combining this with Lemma B.10(iii), the claim follows.
B.12 Corollary Let X = c_0 ⊕ (ℓ∞/c_0). Then X ≄ ℓ∞, but X* ≃ (ℓ∞)*.
Proof. X ≃ ℓ∞ would imply that c_0 ⊆ ℓ∞ is complemented, which it is not by Theorem B.8. Thus X ≄ ℓ∞. With c_0* ≅ ℓ1 we have X* ≃ c_0* ⊕ (ℓ∞/c_0)* ≃ ℓ1 ⊕ (ℓ∞/c_0)*.
On the other hand, since ℓ1 ≅ c_0* is a dual space, we see that ℓ1 ⊆ (ℓ1)** ≅ (ℓ∞)* is complemented by Lemma B.10(iii). Thus (ℓ∞)* ≃ ℓ1 ⊕ (ℓ∞)*/ℓ1 by Exercise 9.11(i). Now Lemma B.10(iv) with V = c_0 gives (ℓ∞)*/ℓ1 ≃ (ℓ∞/c_0)*, so that X* ≃ (ℓ∞)*.
Proof. As t → ∞ we have t−1 x → 0. Since U is an open neighborhood of 0, we have t−1 x ∈ U
for t large enough. Thus µU (x) < ∞ for each x ∈ V . It is quite obvious from the definition
that µU (cx) = cµU (x) for c > 0. Thus µU is positive-homogeneous. We have µU (x) < 1 if and
only if there exists t ∈ (0, 1) such that x ∈ tU . Thus µU (x) < 1 ⇒ x ∈ U . And if x ∈ U then
openness of U implies that x ∈ (1 − ε)U for some ε > 0. Thus µU(x) < 1, so that we have
U = {x ∈ V | µU(x) < 1}.
Let x, y ∈ V , and let s, t > 0 such that x ∈ sU, y ∈ tU . I.e. there are a, b ∈ U such that
x = sa, y = tb. Thus x + y = sa + tb = (s + t) \frac{sa + tb}{s + t}. Since \frac{s}{s+t} a + \frac{t}{s+t} b ∈ U due to convexity
of U , we have x + y ∈ (s + t)U . Thus µU (x + y) ≤ s + t, and since we have x ∈ sU, y ∈ tU for
all s < µU (x) + ε, t < µU (y) + ε with ε > 0, the conclusion is µU (x + y) ≤ µU (x) + µU (y), thus
subadditivity. Being subadditive and positive homogeneous, µU is sublinear.
Let {xι }ι∈I ⊆ V be a net converging to zero. For each n ∈ N, n−1 U is an open neighborhood
of zero. Thus there exists a ιn ∈ I such that ι ≥ ιn implies xι ∈ n−1 U and therefore, with the
definition of µU , that µU (xι ) ≤ n−1 . Thus µU (xι ) → 0, which is continuity of µU at 0 ∈ V .
If now xι → x then the subadditivity of µU gives |µU(xι) − µU(x)| ≤ max(µU(xι − x), µU(x − xι)) → 0, so that µU is continuous everywhere.
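Since µU(x) = inf{t > 0 | x ∈ tU} only involves a one-dimensional search, it is easy to compute numerically for a concrete convex U ⊆ R². The sketch below (an illustration only; the ellipse-shaped U and the bisection parameters are ad-hoc choices) computes µU by bisection and confirms that it recovers the norm whose open unit ball is U.

    def minkowski(in_U, x, t_max=1e6, iters=80):
        """mu_U(x) = inf{t > 0 : x in t*U}, computed by bisection.
        in_U(y) decides membership of y in the convex open set U (with 0 in U)."""
        lo, hi = 0.0, t_max
        for _ in range(iters):
            mid = (lo + hi) / 2
            if in_U((x[0] / mid, x[1] / mid)):   # x in mid*U  <=>  x/mid in U
                hi = mid                         # so mu_U(x) <= mid
            else:
                lo = mid                         # so mu_U(x) >= mid
        return hi

    # U = {(x, y) : (x/2)^2 + (3y)^2 < 1}: a balanced, convex, bounded open set in R^2
    in_ellipse = lambda p: (p[0] / 2.0)**2 + (3.0 * p[1])**2 < 1.0
    for x in [(1.0, 0.0), (0.0, 1.0), (2.0, 0.3)]:
        mu = minkowski(in_ellipse, x)
        norm = ((x[0] / 2.0)**2 + (3.0 * x[1])**2) ** 0.5    # the norm whose open unit ball is U
        print(x, mu, norm)                                    # the two values agree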
B.15 Proposition Let (V, τ ) be a topological vector space and U a convex open neighborhood
of zero. Then
(i) The Minkowski functional µU is a seminorm if and only if U is balanced.
(ii) If U is bounded then µU (x) = 0 implies x = 0.
(iii) If U is balanced and bounded then kxk = µU (x) is a norm inducing the topology τ .
Proof. (i) Since µU is subadditive and positive-homogeneous, it is a seminorm if and only if
µU (λx) = µU (x) for all x ∈ V and λ ∈ F with |λ| = 1. If U is balanced then this is evidently
satisfied. Now assume µU (λx) = µU (x). The openness of U implies that {t > 0 | x ∈ tU } =
(µU (x), ∞). Thus if |λ| = 1 then the assumption µU (λx) = µU (x) implies that x ∈ U if and
only if λx ∈ U . Thus U is balanced.
(ii) Assume that U is bounded and that x ≠ 0. Since τ is T1, there is an open W ⊆ V such that 0 ∈ W ∌ x. Since U is bounded, there is λ > 0 such that λU ⊆ W, which clearly implies x ∉ λU. Now the definition of µU implies µU(x) ≥ λ > 0.
(iii) Proposition B.13 and the above (i) and (ii) show that k · k = µU is a continuous norm on
V . Thus xn → 0 implies kxn k → 0. If we prove the converse implication then τ = τk·k follows
since V is a topological vector space. Let {xn } be a sequence such that kxn k → 0, and let W
be an open neighborhood of 0. Since U is bounded, there is λ > 0 such that λU ⊆ W . Now,
kxn k → 0 means that there is n0 ∈ N such that n ≥ n0 ⇒ kxn k < λ/2. With the definition of
µU this implies xn ∈ λU , thus xn ∈ λU ⊆ W for all n ≥ n0 . This proves xn → 0.
B.16 Exercise Let V be a topological vector space and A ⊆ V convex. Prove that the interior
A0 and the closure A are convex.
We now know that a topological vector space is normable if the zero element has a balanced
convex bounded open neighborhood. (The converse is easy.) But this can be improved:
B.17 Lemma Let V be a topological vector space and U a convex open neighborhood of 0.
Then there exists a balanced convex open neighborhood U 0 ⊆ U of 0.
Proof. Since multiplication by scalars is continuous at (0, 0), there exist ε > 0 and an open neighborhood W0 of 0 such that λW0 ⊆ U whenever |λ| ≤ ε. Thus with W = εW0 we have tW ⊆ U whenever |t| ≤ 1. Put Y = ⋃_{|t|≤1} tW ⊆ U. By construction, Y is a balanced open neighborhood of 0.
For every λ ∈ F with |λ| = 1 it is clear that λU is a convex open neighborhood of 0. Putting Z = ⋂_{|λ|=1} λU, it is manifestly clear that λZ = Z for all |λ| = 1 and 0 ∈ Z. Furthermore, Z is convex (as an intersection of convex sets). Since tW ⊆ U for all |t| ≤ 1, we have Y ⊆ Z, so that Z has non-empty interior Z0. Now we put U′ = Z0 and claim that U′ has the desired properties. Clearly U′ is an open neighborhood of 0, and as the interior of a convex set it is convex (Exercise B.16). If |t| = 1 then the map Z → Z, x ↦ tx is a homeomorphism, thus if x ∈ Z0 = U′ then tx ∈ Z0 = U′. If 0 < |t| < 1, write t = |t|s with |s| = 1; then tx = |t|(sx) + (1 − |t|)·0 ∈ Z0 by convexity of Z0 and 0 ∈ Y ⊆ Z0. Thus U′ = Z0 is balanced.
Now we are in a position to prove geometric criteria for normability and local convexity of
topological vector spaces:
B.18 Theorem Let V be a topological vector space. Then V is normable if and only if there
exists a bounded convex open neighborhood of 0.
Proof. If V is normable by the norm k · k then Bk·k (0, 1) = {x ∈ V | kxk < 1} is clearly open,
convex (and balanced). To show boundedness, let W 3 0 be open. Then there is ε > 0 such
that B(0, ε) ⊆ W . Now clearly εB(0, 1) = B(0, ε) ⊆ W , thus B(0, 1) is bounded.
If there exists a bounded convex open neighborhood U of 0 then by Lemma B.17 we can
assume U in addition to be balanced. (The U 0 provided by the lemma is a subset of U , thus
bounded if U is bounded.) Now by Proposition B.15(iii), µU is a norm inducing the given
topology on V .
B.19 Theorem A topological vector space (V, τ ) is locally convex in the sense of Definition
2.24 (i.e. the topology τ comes from a separating family F of seminorms) if and only if it is
Hausdorff and the zero element has an open neighborhood base consisting of convex sets.
Proof. Given a separating family F of seminorms and putting τ = τF , a basis of open neigh-
borhoods of 0 is given by the finite intersections of sets Up,ε = {x ∈ V | p(x) < ε}, where
p ∈ F, ε > 0. Each of the Up,ε is convex and open, thus also the finite intersections.
And if τ has the stated property, Lemma B.17 gives that 0 has a neighborhood base consisting
of balanced convex open sets. Defining F = {µU | U balanced convex open neighborhood of 0},
each of the µU is a continuous seminorm by Propositions B.13 and B.15. Thus if xι → 0 then
kxι kU := µU (xι ) → 0. And kxι kU → 0 for all balanced convex open U implies that xι ultimately
is in every open neighborhood of 0, thus xι → 0. Thus τ = τF , and 2.23 gives that F is
separating.
(ii) Prove that the open unit ball of (`p (S, F), τdp ) does not contain any convex open neigh-
borhood of 0 if S is infinite.
(iii) Prove that (`p (S, F), τdp ) is neither normable nor locally convex if S is infinite.
B.21 Theorem Let V be a topological vector space and A, B ⊆ V disjoint non-empty convex
subsets, A being open. Then there is a continuous linear functional ϕ : V → F such that
Re ϕ(a) < inf b∈B Re ϕ(b) ∀a ∈ A. (If F = R, drop the ‘Re’.)
Proof. Pick a0 ∈ A, b0 ∈ B and put z = b0 − a0 and U = (A − a0) − (B − b0) = A − B + z, which is a convex (as pointwise sum of two convex sets) open (since U = ⋃_{x∈−B−a_0+b_0}(A + x)) neighborhood of 0. Let p = µU be the associated Minkowski functional. As a consequence of A ∩ B = ∅ we have 0 ∉ A − B, thus z ∉ U, and therefore p(z) ≥ 1.
Put W = Rz and define ψ : W → R, cz 7→ c. For c ≥ 0 we have ψ(cz) = c ≤ cp(z) = p(cz).
Thus by sublinearity of p and Theorem 7.2 there exists a linear functional ϕ : V → R satisfying
ϕ W = ψ, thus ϕ(cz) = c, and ϕ(x) ≤ p(x) ∀x ∈ V . Thus also −p(−x) ≤ −ϕ(−x) = ϕ(x),
and since x → 0 implies p(x) → 0, ϕ is continuous at zero, thus everywhere.
If now a ∈ A, b ∈ B then a − b + z ∈ U, so that p(a − b + z) < 1. Thus
ϕ(a) − ϕ(b) + 1 = ϕ(a − b + z) ≤ p(a − b + z) < 1,
thus ϕ(a) < ϕ(b) for all a ∈ A and b ∈ B. Thus the subsets ϕ(A), ϕ(B) of R are disjoint. Since A, B are convex, they are connected. Consequently, ϕ(A), ϕ(B) are connected, thus intervals. Since A is open, so is ϕ(A) (open mapping theorem). If we put s = sup ϕ(A), we have ϕ(a) < s ≤ ϕ(b) for all a ∈ A, b ∈ B, which gives ϕ(a) < inf_{b∈B} ϕ(b) for all a ∈ A.
F = C: Considering V as an R-vector space, apply the above to obtain a continuous R-linear functional ϕ0 : V → R such that ϕ0(a) < inf_{b∈B} ϕ0(b) ∀a ∈ A. Now define ϕ : V → C, x ↦ ϕ0(x) − iϕ0(ix). This clearly is continuous and satisfies Re ϕ = ϕ0, so that the desired inequality holds. That ϕ is C-linear follows from the same argument as in the proof of Theorem 7.5.
B.22 Theorem (Goldstine’s theorem) If V is a Banach space then V≤1 is σ(V ∗∗ , V ∗ )-dense
in (V ∗∗ )≤1 .
Proof. We abbreviate τ = σ(V**, V*). The unit ball (V**)_{≤1} is τ-compact by Alaoglu's theorem, thus τ-closed, so that B = \overline{V_{≤1}}^{τ}, which is convex by Exercise B.16, is contained in (V**)_{≤1}. If this inclusion is strict, pick x** ∈ (V**)_{≤1} \ B. Then x** has a τ-open neighborhood U disjoint
from B, and by Theorem B.19 there is a convex open A ⊆ U . Now Theorem B.21 applied to
(V ∗∗ , τ ) and A, B ⊆ V ∗∗ gives a τ = σ(V ∗∗ , V ∗ )-continuous linear functional ϕ ∈ (V ∗∗ )? such
that Re ϕ(a) < inf b∈B Re ϕ(b) ∀a ∈ A. Now Exercise 16.13 gives ϕ ∈ V ∗ ⊆ V ∗∗∗ .
Putting ψ = −ϕ we have supb∈B Re ψ(b) < Re ψ(a) ∀a ∈ A, which is more convenient. Since
ψ ∈ V ∗ and B ⊇ V≤1 , we have kψk ≤ supb∈B Re ψ(b). On the other hand, with x∗∗ ∈ A and
kx∗∗ k ≤ 1, we have Re ψ(x∗∗ ) ≤ |ψ(x∗∗ )| ≤ kx∗∗ kkψk ≤ kψk. Combining these findings, we
have ‖ψ‖ ≤ sup_{b∈B} Re ψ(b) < Re ψ(x**) ≤ ‖ψ‖, which is absurd. This contradiction proves \overline{V_{≤1}}^{τ} = (V**)_{≤1}.
B.6 Strictly convex and uniformly convex Banach spaces
B.6.1 Strict convexity and uniqueness in the Hahn-Banach theorem
B.23 Definition A Banach space V is called strictly convex if x, y ∈ V, kxk = kyk = 1, x 6= y
implies kx + yk < 2.
B.24 Exercise (i) Prove that `p (S, F) is not strictly convex if #S ≥ 2 and p ∈ {1, ∞}.
(ii) Prove that `p (S, F) is strictly convex for every S and 1 < p < ∞.
(iii) Prove that all Hilbert spaces are strictly convex.
where we used ka − bk ≥ kak − kbk and the assumptions kx + yk = kxk + kyk and kxk = 1. This
implies kx + zk = 2. Since kxk = 1 = kzk (by assumption and by construction of z), the strict
convexity implies x = z = y/kyk. Thus y = kykx, and we have proven (b).
(ii) (b)⇒(a) Assume V ∗ is not strictly convex. Then there are ϕ1 , ϕ2 ∈ V ∗ with ϕ1 6= ϕ2
and kϕ1 k = kϕ2 k = 1 and kϕ1 + ϕ2 k = 2. Then W = {x ∈ V | ϕ1 (x) = ϕ2 (x)} ⊆ V is a
closed linear subspace and proper (since ϕ1 6= ϕ2 ). Put ψ = ϕ1 |W = ϕ2 |W ∈ W ∗ . We will prove
kψk = 1. Then ϕ1 , ϕ2 are distinct norm-preserving extensions of ψ ∈ W ∗ to V , providing a
counterexample for uniqueness of norm-preserving extensions.
Since ϕ1 − ϕ2 6= 0, there exists z ∈ V with ϕ1 (z) − ϕ2 (z) = 1. Now every x ∈ V can be
written uniquely as x = y + cz, where y ∈ W, c ∈ C: Put c = ϕ1 (x) − ϕ2 (x) and then y = x − cz.
Now it is obvious that y ∈ W . Uniqueness of such a representation follows from z 6∈ W .
Since kϕ1 + ϕ2 k = 2, we can find a sequence {xn } ⊆ V with kxn k = 1 ∀n such that
ϕ1 (xn ) + ϕ2 (xn ) → 2. Since |ϕi (xn )| ≤ 1 for i = 1, 2 and all n, it follows that ϕi (xn ) → 1 for i =
1, 2. Now write xn = yn +cn z, where {yn } ⊆ W and {cn } ⊆ C. Then cn = ϕ1 (xn )−ϕ2 (xn ) → 0.
Thus ‖xn − yn‖ = |cn|‖z‖ → 0, so that ‖yn‖ → 1. And with cn → 0 we have ϕ1(yn) = ϕ1(xn) − cn ϕ1(z) → 1.
In view of {yn } ⊆ W and ϕ1 |W = ψ, we have ψ(yn ) = ϕ1 (yn ) → 1. Together with kyn k → 1
this implies kψk ≥ 1. Since the converse inequality is obvious, we have kψk = 1, as claimed.
In addition to the above we remark that V is strictly convex if and only if V ∗ is ‘smooth’, and
conversely. (Cf. e.g. [43] for definition and proof.) Thus uniqueness of Hahn-Banach extensions
for subspaces W ⊆ V holds if and only if V is smooth.
B.28 Theorem (Milman-Pettis 1938/9) Every uniformly convex Banach space is reflexive.
Proof. (Following Ringrose 1958) Assume V is uniformly convex, but not reflexive. Let S ⊆ V
and S ∗∗ ⊆ V ∗∗ be the unit spheres (sets of elements of norm one). Since S = S ∗∗ easily implies
V = V ∗∗ , we have S $ S ∗∗ . If x∗∗ ∈ S ∗∗ \S then by the obvious norm-closedness of S ⊆ S ∗∗
there is ε > 0 such that B(x∗∗ , ε) ∩ S = ∅. Since kx∗∗ k = 1, we can find ϕ ∈ V ∗ with kϕk = 1
and |x**(ϕ) − 1| < δ(ε)/2. Now U = {y** ∈ V** | |y**(ϕ) − 1| < δ(ε)} ⊆ V** is a τ := σ(V**, V*)-open neighborhood of x**. By Goldstine's Theorem B.22, V_{≤1} ⊆ (V**)_{≤1} is τ-dense. If {x_α} ⊆ V_{≤1} is a net τ-converging to x ∈ S** then ‖x_α‖ → 1 and x_α/‖x_α‖ → x. Thus S ⊆ S** is τ-dense, thus S ∩ U ≠ ∅. If now y1, y2 ∈ S ∩ U then |ϕ(y1) + ϕ(y2)| > 2 − 2δ(ε). With
kϕk = 1 this implies ky1 + y2 k > 2 − 2δ(ε). Thus by uniform convexity we have ky1 − y2 k < ε.
Since every net in S that τ -converges to x∗∗ ultimately lives in U , picking any y1 ∈ S ∩ U we
have kx∗∗ − y1 k ≤ ε. But this contradicts the choice of ε.
The converse of the theorem is not true, but the construction of counterexamples is laborious.
Note also that the dual of a uniformly convex space need not be uniformly convex!
B.29 Theorem For every measure space (X, A, µ) and 1 < p < ∞, the space Lp (X, A, µ; F) is
uniformly convex and reflexive.
Proof. We follow [37]. Let 0 < ε ≤ 2^{1−p}. Then the set
Z = { (x, y) ∈ R² : |x|^p + |y|^p = 2,  |\frac{x − y}{2}|^p ≥ ε }
is closed and bounded, thus compact, and non-empty since (2^{1/p}, 0) ∈ Z. Since the function R → R, t ↦ |t|^p is strictly convex, we have |\frac{x + y}{2}|^p < \frac{|x|^p + |y|^p}{2} whenever x ≠ y. Thus
ρ(ε) = \inf_{(x,y)∈Z} \Big( \frac{|x|^p + |y|^p}{2} − \Big|\frac{x + y}{2}\Big|^p \Big) > 0.
Let now 0 < ε < 2^{1−p} and f, g ∈ Lp(X, A, µ) with ‖f‖_p = ‖g‖_p = 1 and ‖(f + g)/2‖_p^p > 1 − δ. Writing f, g instead of f(x), g(x), we put
M = { x ∈ X :  |\frac{f − g}{2}|^p ≥ ε \frac{|f|^p + |g|^p}{2} }.
Now
‖\frac{f − g}{2}‖_p^p = \int_{X\setminus M} |\frac{f − g}{2}|^p dµ + \int_{M} |\frac{f − g}{2}|^p dµ
 ≤ ε \int_{X\setminus M} \frac{|f|^p + |g|^p}{2} dµ + \int_{M} \frac{|f|^p + |g|^p}{2} dµ
 ≤ ε \int_{X} \frac{|f|^p + |g|^p}{2} dµ + \frac{1}{ρ(ε)} \int_{M} \Big( \frac{|f|^p + |g|^p}{2} − \Big|\frac{f + g}{2}\Big|^p \Big) dµ
 ≤ ε \int_{X} \frac{|f|^p + |g|^p}{2} dµ + \frac{1}{ρ(ε)} \int_{X} \Big( \frac{|f|^p + |g|^p}{2} − \Big|\frac{f + g}{2}\Big|^p \Big) dµ
 ≤ ε + \frac{1}{ρ(ε)} − \frac{1 − δ}{ρ(ε)} = ε + \frac{δ}{ρ(ε)}.
(In the second row we used the definition of M and (4.1), in the third we used (B.1), which holds on M, in the fourth the fact that the expression in brackets is non-negative on X\M, and finally we used the assumptions ‖f‖_p^p ≤ 1, ‖g‖_p^p ≤ 1 and ‖(f + g)/2‖_p^p > 1 − δ.) Now choosing δ < ε ρ(ε) we have ‖(f − g)/2‖_p^p ≤ 2ε, thus uniform convexity (more precisely, an implication equivalent to it).
Reflexivity now follows from Theorem B.28.
B.30 Remark The uniform convexity of Lp for 1 < p < ∞ was first proven by Clarkson in
1936 with a fairly complicated proof. (Reflexivity was known earlier thanks to F. Riesz’ proof
of (Lp )∗ ∼
= Lq .) A simpler proof, still giving optimal bounds, can be found in [31]. 2
Now we are in a position to complete the determination of Lp (X, A, µ)∗ for arbitrary measure
space (X, A, µ) and 1 < p < ∞ without invocation of the Radon-Nikodym theorem:
B.31 Corollary Let 1 < p < ∞ and (X, A, µ) any measure space. Then the canonical map
Lq (X, A, µ; F) → Lp (X, A, µ; F)∗ is an isometric bijection.
Proof. Let (X, A, µ) be any measure space, 1 < p < ∞ and q the conjugate exponent. We
abbreviate Lp (X, A, µ) to Lp . As discussed (without complete, but hopefully sufficient detail)
in Section 4.6, the map ϕ : Lq → (Lp )∗ , g 7→ ϕg is an isometry, so that only surjectivity remains
to be proven. Assume ϕ(Lq) ⊊ (Lp)*. The subspace being closed (since Lq is complete and ϕ is an isometry), by Hahn-Banach there is a 0 ≠ ψ ∈ (Lp)** such that ψ↾ϕ(Lq) = 0. By reflexivity of Lp (Theorem B.29), there is an f ∈ Lp such that ψ = ι_{Lp}(f). This implies ϕ_g(f) = ψ(ϕ_g) = 0 for all g ∈ Lq. With ϕ_g(f) = ∫ fg dµ = ϕ′_f(g), where ϕ′ : Lp → (Lq)* is the canonical map, this gives ϕ′_f = 0, thus f = 0 (ϕ′ being isometric), thus ψ = 0, contradicting ψ ≠ 0. Thus ϕ is surjective.
w
B.32 Theorem (I. Schur) If g, {fn }n∈N ⊆ `1 (N, F) and fn → g then kfn − gk1 → 0.
w
Proof. It clearly suffices to prove this for g = 0, thus `1 3 fn → 0 ⇒ kfn k1 → 0. We will
follow the gliding hump argument in [3] very closely.
w
Assume that fn → 0, but kfn k 6→ 0. Since δm ∈ `∞ ∼ = (`1 )∗ , the first fact clearly implies
n→∞
fn (m) = ϕδm (fn ) −→ 0 for all m. And by the second assumption there exists ε > 0 such that
kfn k1 ≥ ε for infinitely many n. Using this, we inductively define {nk }, {rk } ⊆ N as follows:
(a) Let n1 be the smallest number for which kfn1 k1 ≥ ε.
P∞
(b) Let r1 be the smallest number for which ri=1 ε
≤ 5ε .
P1
|fn1 (i)| ≥ 2 and i=r1 +1 |fn1 (i)|
For k ≥ 2:
Prk−1
(c) Let nk be the smallest number such that nk > nk−1 and kfnk k1 ≥ ε and i=1 |fnk (i)| ≤ 5ε .
Prk ε
(d) Let rk be the smallest number such that rk > rk−1 and
P∞ i=rk−1 +1 |fnk (i)| ≥ 2 and
ε
i=rk +1 |fnk (i)| ≤ 5 .
The reader should convince herself that the existence of such nk , rk follows from our assumptions!
Now define {c_i}_{i∈N} by c_i = sgn(f_{n_k}(i)), where k is uniquely determined by r_{k−1} < i ≤ r_k (with r_0 = 0). Now clearly c = {c_i} ∈ ℓ∞, and for all k we have, using the lower bound in (b), (d),
\sum_{i=r_{k−1}+1}^{r_k} c_i f_{n_k}(i) = \sum_{i=r_{k−1}+1}^{r_k} |f_{n_k}(i)| ≥ ε/2,
and, using the upper bounds,
\Big| \sum_{i=1}^{r_{k−1}} c_i f_{n_k}(i) \Big| ≤ \sum_{i=1}^{r_{k−1}} |f_{n_k}(i)| ≤ ε/5,   \Big| \sum_{i=r_k+1}^{∞} c_i f_{n_k}(i) \Big| ≤ \sum_{i=r_k+1}^{∞} |f_{n_k}(i)| ≤ ε/5.
Thus |ϕ_c(f_{n_k})| = | \sum_{i=1}^{∞} c_i f_{n_k}(i) | ≥ ε/2 − ε/5 − ε/5 = ε/10 for all k. Since ϕ_c ∈ (ℓ1)*, this contradicts f_n →w 0.
B.33 Remark In the above proof, the gliding hump philosophy is much more clearly visible
than in the proof of Theorem 8.2: The gliding hump is precisely the dominant contribution to ϕ_c(f_{n_k}) coming from the i in the interval {r_{k−1} + 1, . . . , r_k}, which moves to infinity as k → ∞.
Note also that the determination of the nk , rk in the above proof was deterministic, using
no choice axiom at all. In this sense the proof is better than the alternative one using Baire’s
theorem, thus countable dependent choice, cf. e.g. [12, Proposition V.5.2], which nevertheless is
instructive. But of course also the above proof is non-constructive in the somewhat extremist
sense of intuitionism since the necessary ε > 0 cannot be found algorithmically.
For a high-brow interpretation of Schur’s theorem in terms of Banach space bases see [1,
Section 2.3]. But also this discussion uses gliding humps. 2
The theorem is quite remarkable, and asked for a proof one probably wouldn’t know where
to begin. For matrices this is quite easy to prove, as we do for (i): Normality of a implies the existence of an ONB {e_i} such that a = \sum_i λ_i P_i, where P_i(·) = e_i ⟨·, e_i⟩. Now ac = ca implies
Pi c = cPi for all i, from which a∗ c = ca∗ is immediate. This argument can be extended to
operators on infinite dimensional spaces, cf. [77]. But the following is quite different:
for certain dn ∈ A. (The reshuffling is justified by the uniform convergence of the series.) We
only need d1 = a∗ c − cb∗ , which is quite obvious. Thus the theorem follows if we prove d1 = 0.
By induction, the assumption ac = cb is seen to imply a^n c = c b^n. Multiplying by z^n/n! and summing over n ∈ N0 gives e^{za} c = c e^{zb} for all z ∈ C, thus also e^{za} c e^{−zb} = c. Applying this with z̄ in place of z, we obtain c = e^{−z̄a} c e^{z̄b}, thus
f(z) = e^{za*} c e^{−zb*} = e^{za*} (e^{−z̄a} c e^{z̄b}) e^{−zb*} = e^{za*−z̄a} c e^{z̄b−zb*},
where e^{za*} e^{−z̄a} = e^{za*−z̄a} is true due to the normality aa* = a*a of a, and similarly for b. Now za* − z̄a = (z̄a)* − z̄a and z̄b − zb* = z̄b − (z̄b)* are skew-adjoint, i.e. of the form i·h with h self-adjoint, so that e^{za*−z̄a} and e^{z̄b−zb*} are unitary for all z ∈ C, cf. Remark 11.25(ii), thus bounded. This proves that f : C → A is bounded.
Thus for every ϕ ∈ A∗ , the function z 7→ ϕ(f (z)) is entire and bounded, thus constant by
Liouville’s theorem. Thus for all z, z 0 ∈ C, ϕ ∈ A∗ we have ϕ(f (z) − f (z 0 )) = 0. Hahn-Banach
now implies f (z) − f (z 0 ) = 0 ∀z, z 0 , thus f is constant. In particular, d1 = f 0 (0) = 0.
B.35 Remark (i) was proven by Fuglede in 1950, (ii) by Putnam in 1951. The above elegant
proof is due to Rosenblum (1958). Nevertheless, the appeal to complex analysis is redundant
and somewhat misleading since for a bounded function given in terms of a power series of infinite
convergence radius, as is the case here, Liouville’s theorem has a proof that involves neither the
notion of holomorphicity nor the general path independence of contour integrals: 2
For another instance where the standard invocation of Liouville’s theorem can be replaced by
harmonic analysis see Rickart’s proof of Theorem 10.18, where only finite cyclic groups appear!
B.9 Glimpse of non-linear FA: Schauder’s fixed point theorem
In this final section we give a glimpse of non-linear functional analysis by proving Schauder’s
fixed point theorem, which is a generalization of Brouwer’s fixed point theorem to Banach
spaces.
B.37 Definition A topological space X has the fixed-point property if for every continuous
f : X → X there is x ∈ X such that f (x) = x, i.e. a fixed-point.
B.38 Theorem (Brouwer, Hadamard, 1910) 68 [0, 1]n has the fixed point property. The
same holds for every non-empty compact convex subset of Rn .
The second result follows from the first since such an X is homeomorphic to some [0, 1]m .
There are many proofs of the first result. For what probably is the simplest proof (due to
Kulpa) of the first statement, using only some easy combinatorics, see [47]. (Proofs using
algebraic topology or analysis involve inessential elements and don’t reduce the combinatorics.)
B.39 Theorem (Schauder 1930) 69 Every non-empty compact convex subset K of a normed
vector space has the fixed point property.
Proof. Let (V, k · k) be a normed vector space, K ⊆ V a non-empty compact convex subset
and f : K → K continuous. Let ε > 0. Since K is compact, thus totally bounded, there are x_1, . . . , x_n ∈ K such that K ⊆ ⋃_{i=1}^{n} B(x_i, ε). Thus if we define α_i(x) ≥ 0 by
α_i(x) = ε − ‖x − x_i‖ if ‖x − x_i‖ < ε and α_i(x) = 0 if ‖x − x_i‖ ≥ ε,   ∀i = 1, . . . , n,   (B.2)
we see that for each x ∈ K there is at least one i such that αi (x) > 0. The functions αi clearly
are continuous. Thus also the map
P_ε : K → K,   x ↦ \frac{\sum_{i=1}^{n} α_i(x) x_i}{\sum_{i=1}^{n} α_i(x)}
is continuous. Since Pε (x) is a convex combination of those xi for which kx − xi k < ε, we have
kPε (x) − xk < ε for all x ∈ K. The finite dimensional subspace Vn = span(x1 , . . . , xn ) ⊆ V
is isomorphic to some Rm , and by Proposition 3.19 the restriction of the norm k · k to Vn is
equivalent to the Euclidean norm on Rm . Thus the convex hull conv(x1 , . . . , xn ) ⊆ Vn into
which Pε maps is homeomorphic to a compact convex subset of Rm and thus has the fixed point
property by Theorem B.38. Thus if we define fε = Pε ◦ f then fε maps conv(x1 , . . . , xn ) into
itself and thus has a fixed point x0 = fε (x0 ). Now,
kx0 − f (x0 )k ≤ kx0 − fε (x0 )k + kfε (x0 ) − f (x0 )k = kfε (x0 ) − f (x0 )k = kPε (f (x0 )) − f (x0 )k < ε.
Since ε > 0 was arbitrary, we find inf{‖x − f(x)‖ | x ∈ K} = 0. Since K is compact and x ↦ ‖x − f(x)‖ continuous, the infimum is attained, thus f has a fixed point in K.
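The map Pε in the proof is explicit enough to be coded. The following Python sketch (a finite-dimensional illustration with V = R², the Euclidean norm and the unit disc as K; all concrete choices are ad hoc) builds Pε from the functions αi of (B.2) and checks that ‖Pε(x) − x‖ < ε on K.

    import numpy as np

    def make_P_eps(centers, eps):
        """Return the map P_eps built from the functions alpha_i of (B.2):
        P_eps(x) = sum_i alpha_i(x) x_i / sum_i alpha_i(x)."""
        centers = np.asarray(centers, dtype=float)
        def P(x):
            d = np.linalg.norm(centers - np.asarray(x, dtype=float), axis=1)
            alpha = np.maximum(eps - d, 0.0)        # alpha_i(x) as in (B.2)
            return alpha @ centers / alpha.sum()    # convex combination of nearby centers
        return P

    # K = closed unit disc in R^2; the centers form an eps-net of K
    eps = 0.2
    grid = np.arange(-1.0, 1.01, eps / 2)
    centers = [(a, b) for a in grid for b in grid if a * a + b * b <= 1.0]
    P = make_P_eps(centers, eps)

    rng = np.random.default_rng(0)
    pts = rng.normal(size=(1000, 2))
    pts = pts / np.maximum(np.linalg.norm(pts, axis=1, keepdims=True), 1.0)   # points of K
    print(max(np.linalg.norm(P(x) - x) for x in pts))                         # < eps, as in the proof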
The use of methods/results from algebraic topology is quite typical for non-linear functional
analysis. (But also linear functional analysis connects to algebraic topology, for example via
K-theory, cf. e.g. [50, Chapter 7].)
68
Luitzen Egbertus Jan Brouwer (1881-1966). Dutch mathematician. Important contributions to topology, founding
of intuitionism. Jacques Hadamard (1865-1963). French mathematician.
69
Juliusz Schauder (1899-1943). Born in Lwow/Lviv (now Ukraine, then Lemberg in the Austrian empire) and killed
by the Nazis during WW2.
B.40 Definition Let V be a Banach space and W ⊆ V. A map f : W → V is called compact if it is continuous and f(S) ⊆ V is precompact for every bounded S ⊆ W.
B.41 Corollary Let V be a Banach space and C ⊆ V closed, bounded and convex. If
f : C → V is compact and f (C) ⊆ C then f has a fixed point in C.
Proof. If W ⊆ V is compact then its convex hull co(W ) = {tx + (1 − t)y | x, y ∈ W, t ∈ [0, 1]} is
the image of the compact space W ×W ×[0, 1] under the continuous map (x, y, t) 7→ tx+(1−t)y
and therefore compact.
Now f (C) ⊆ V is compact by boundedness of C and compactness of f , thus by the above
also K = co(f (C)) is compact and, of course, convex. Thus K has the fixed point property by
Schauder’s theorem. Since C is closed and convex, we have K ⊆ C, thus f is defined on K and
maps it into f (K) ⊆ f (C) ⊆ C. Thus f has a fixed point x ∈ K ⊆ C.
C Tentative schedule (14 lectures à 90 minutes)
1. Introduction, motivation. TVS. Normed spaces, bounded linear maps
2. Continuation of basic material (Sections 2, 3). Sequence spaces `p (S): proof of Hölder and
Minkowski inequalities.
3. More on sequence spaces (most proofs omitted). Basics on Hilbert spaces: inner product,
CS ineq., norm from inner product. Parallelogram equal., polarization. Orthogonality.
4. From Riesz lemma to Sect. 5.3.
5. End of Sect. 5.3. Briefly: tensor products of Hilbert spaces. Quotients of Banach spaces.
Hahn-Banach over R
6. Hahn-Banach over C. Applications: Reflexivity, complemented subspaces. Baire’s thm.
Strong version of the uniform boundedness theorem (gliding hump proof of Theorem 8.2
omitted). Hellinger-Toeplitz.
7. Strong convergence, Banach-Steinhaus. Many continuous functions with divergent Fourier
series. Open mapping, bounded inverse and closed graph theorems. Invertibility of Banach
space operators: (i)⇔(ii) in Proposition 9.28.
8. Bounded below maps, (i)⇔(iii) in Proposition 9.28. Sections 10.1-2.
9. Sections 10.4 and 11.1-2.
10. Sections 11.3-4 and 12.1-2 (incl. brief discussion of Weierstrass and Tietze theorems).
11. Sections 12.3 and 13. If possible: Beginning of Section 14.
12. Brief discussion of Arzelà-Ascoli (App. A.9). Rest of Section 14 (probably no time for
Section 14.4).
13. Section 15: Spectral theorems for normal operators.
14. Sections 16, 17: Weak and weak∗ topologies, Gelfand homomorphism/isomorphism.
All papers appearing in the bibliography are cited somewhere, but not all books. Still, all
are worth looking at.
References
[1] F. Albiac, N. J. Kalton: Topics in Banach space theory. 2nd. ed. Springer, 2016.
[2] D. Amir: Characterizations of inner product spaces. Birkhäuser, 1986.
[3] S. Banach: Théorie des opérations linéaires, 1932. Engl. transl.: Theory of linear opera-
tions. North-Holland, 1987.
[4] W. R. Bauer, R. H. Benner: The non-existence of a Banach space of countably infinite
Hamel dimension. Amer. Math. Monthly 78, 895-896 (1971).
[5] G. Birkhoff, E. Kreyszig: The establishment of functional analysis. Hist. Math. 11, 258-321 (1984).
[6] N. Bourbaki: Topological vector spaces. Chapters 1-5. Springer, 1987.
[7] H. Brézis: Functional analysis, Sobolev spaces and partial differential equations. Springer,
2011.
[8] T. Bühler, D. A. Salamon: Functional analysis. American Mathematical Society, 2018.
[9] N. L. Carothers: A short course on Banach space theory. Cambridge University Press,
2005.
[10] P. G. Ciarlet: Linear and nonlinear functional analysis with applications. Society for In-
dustrial and Applied Mathematics, 2013.
[11] D. L. Cohn: Measure theory. 2nd. ed. Springer, 2013.
[12] J. B. Conway: A course in functional analysis. 2nd. ed. Springer, 2007.
[13] A. M. Davie: The Banach approximation problem. J. Approx. Th. 13, 392-394 (1975).
[14] K. Deimling: Nonlinear functional analysis. Springer, 1985.
[15] J. Dieudonné: History of functional analysis. North-Holland, 1981.
[16] J.-L. Dorier: A general outline of the genesis of vector space theory. Hist. Math. 22, 227-261
(1995).
[17] N. Dunford, J. T. Schwartz: Linear operators. I. General theory. Interscience Publishers,
1958, John Wiley & Sons, 1988.
[18] H.-D. Ebbinghaus et al.: Numbers. Springer, 1991.
[19] P. Enflo: A counterexample to the approximation problem in Banach spaces. Acta Math.
130, 309-317 (1973).
[20] L. C. Evans: Partial differential equations. 2nd. ed. American Mathematical Society, 2010.
[21] A. Fellhauer: On the relation of three theorems of analysis to the axiom of choice. J. Logic
Analysis 9, 1-23 (2017).
[22] S. Friedberg, A. Insel, L. Spence: Linear algebra. 4th. ed. Pearson, 2014.
[23] D. J. H. Garling: A course in mathematical analysis. Vol. 1 & 2. Cambridge University
Press, 2013.
[24] F. Q. Gouvêa: p-adic numbers. An introduction. 3rd. ed. Springer, 2020.
[25] S. Grabiner: The Tietze extension theorem and the open mapping theorem. Amer. Math.
Monthly 93, 190-191 (1986).
[26] S. Gudder: Inner product spaces. Amer. Math. Monthly 81, 29-36 (1974), 82, 251-252
(1975), 82, 818 (1975).
[27] C. Heil: A basis theory primer. Birkhäuser, 2011.
[28] P. R. Halmos: Introduction to Hilbert space and the theory of spectral multiplicity. 2nd. ed.
Chelsea, 1957.
[29] P. R. Halmos: What does the spectral theorem say? Amer. Math. Monthly 70, 241-247
(1963).
[30] P. R. Halmos: A Hilbert space problem book. 2nd. ed. Springer, 1982.
[31] O. Hanner: On the uniform convexity of Lp and lp . Ark. Mat. 3, 239-244 (1955).
[32] R. V. Kadison, J. R. Ringrose: Fundamentals of the theory of operator algebras. Vol. 1.
Elementary theory. Academic Press, 1983.
[33] S. Kaplan: The bidual of C(X) I. North-Holland, 1985.
[34] Y. Katznelson: An introduction to harmonic analysis. 3rd. ed. Cambridge University Press,
2004.
[35] Y. & Y. R. Katznelson: A (terse) introduction to linear algebra. American Mathematical
Society, 2008.
[36] I. Kleiner: A history of abstract algebra. Birkhäuser, 2007.
[37] V. Komornik: Lectures on functional analysis and the Lebesgue integral. Springer, 2016.
[38] N. P. Landsman: Foundations of quantum theory. Springer, 2017. Freely available at
https://link.springer.com/book/10.1007%2F978-3-319-51777-3
[39] P. D. Lax: Functional analysis. Wiley, 2002.
[40] P. D. Lax: Linear algebra and its applications. 2nd. ed. Wiley, 2007.
[41] J. Lindenstrauss, L. Tzafriri: On the complemented subspace problem. Isr. J. Math. 9,
263-269 (1971).
[42] B. MacCluer: Elementary functional analysis. Springer, 2009.
[43] R. E. Megginson: An introduction to Banach space theory. Springer, 1998.
[44] R. Meise, D. Vogt: Introduction to functional analysis. Oxford University Press, 1997.
[45] D. F. Monna: Functional analysis in historical perspective. Oosthoek Publishing Company,
1973.
[46] G. H. Moore: The axiomatization of linear algebra: 1875-1940. Hist. Math. 22, 262-303
(1995).
[47] M. Müger: Topology for the working mathematician. (work in progress).
http://www.math.ru.nl/~mueger/topology.pdf
[48] M. Müger: On trace class operators (and Hilbert-Schmidt operators).
http://www.math.ru.nl/~mueger/PDF/Trace-class.pdf
[49] M. Müger: Some examples of Fourier series.
http://www.math.ru.nl/~mueger/PDF/Fourier.pdf
[50] G. J. Murphy: C ∗ -Algebras and operator theory. Academic Press, 1990.
[51] G. Nagy: A functional analysis point of view on the Arzela-Ascoli theorem. Real. Anal.
Exch. 32, 583-586 (2006/7).
D. C. Ullrich: The Ascoli-Arzelà Theorem via Tychonoff’s Theorem. Amer. Math. Monthly
110, 939-940 (2003).
M. Wójtowicz: For eagles only: probably the most difficult proof of the Arzela-Ascoli
theorem − via the Stone-Cech compactification. Quaest. Math. 40, 981-984 (2017).
[52] L. Narici, E. Beckenstein, G. Bachman: Functional analysis and valuation theory. Marcel
Dekker, Inc. 1971.
[53] L. Nirenberg: Topics in nonlinear functional analysis. American Mathematical Society,
1974.
[54] B. de Pagter, A. C. M. van Rooij: An invitation to functional analysis. Epsilon Uitgaven,
2013.
[55] G. Pedersen: Analysis now. Springer, 1989.
[56] J.-P. Pier: Mathematical analysis during the 20th century. Oxford University Press, 2001.
[57] J. B. Prolla: Topics in functional analysis over valued division rings. North-Holland, 1982.
[58] D. Ramakrishnan, R. J. Valenza: Fourier analysis on number fields. Springer, 1999.
[59] M. Reed, B. Simon: Methods of modern mathematical physics. I: Functional analysis.
Academic Press, 1980.
[60] F. Riesz, B. Sz.-Nagy: Functional analysis. Frederick Ungar Publ., 1955, Dover, 1990.
[61] A. M. Robert: A course in p-adic analysis. Springer, 2000.
[62] A. C. M. van Rooij: Non-Archimedean functional analysis. Marcel Dekker, Inc., 1978.
[63] W. Rudin: Real and complex analysis. McGraw-Hill, 1966, 1974, 1986.
[64] W. Rudin: Functional analysis. 2nd. ed. McGraw-Hill, 1991.
[65] V. Runde: A taste of topology. Springer, 2005.
[66] V. Runde: A new and simple proof of Schauder’s theorem. arXiv:1010.1298.
[67] R. A. Ryan: Introduction to tensor products of Banach spaces. Springer, 2002.
[68] B. P. Rynne, M. A. Youngson: Linear functional analysis. 2nd. ed. Springer, 2008.
[69] D. A. Salamon: Measure and integration. European Mathematical Society, 2016.
[70] K. Saxe: Beginning functional analysis. Springer, 2002.
[71] E. Schechter (ed.): Handbook of analysis and its foundations. Academic Press, 1997.
[72] B. Simon: Trace ideals and their applications. 2nd. ed. American Mathematical Society,
2005.
[73] B. Simon: Operator theory. American Mathematical Society, 2015.
[74] I. Singer: Bases in Banach spaces I & II. Springer, 1970, 1981.
[75] P. Soltan: A primer on Hilbert space. Springer, 2018.
[76] E. M. Stein, R. Shakarchi: Fourier analysis. Princeton University Press, 2005.
[77] V. S. Sunder: Fuglede’s theorem. Indian J. Pure Appl. M. 46, 415-417 (2015).
[78] A. Szankowski: B(H) does not have the approximation property. Acta Math. 147, 89-108
(1981).
[79] T. Tao: Analysis I & II. 3rd. ed. Springer, 2016.
[80] G. Teschl: Topics in linear and nonlinear functional analysis. American Mathematical
Society, 2020.
[81] A. M. Vershik: The life and fate of functional analysis in the twentieth century. In: A.
A. Bolibruch, Yu. S. Osipov, Ya. G. Sinai (eds.): Mathematical events of the twentieth
century. Springer, 2006.
[82] S. Warner: Topological fields. North-Holland, 1989.
[83] J. Weidmann: Lineare Operatoren in Hilberträumen. 1: Grundlagen, 2: Anwendungen.
Teubner, 2000, 2003.
[84] W. Wieşlaw: On topological fields. Colloq. Math. 29, 119-146 (1974).
[85] K.-W. Yang: A note on reflexive Banach spaces. Proc. Amer. Math. Soc. 18, 859-861
(1967).
[86] E. Zeidler: Nonlinear functional analysis. Volumes 1, 2A, 2B, 3, 4, 5. Springer, 1984-1990.
[87] R. J. Zimmer: Essential results of functional analysis. University of Chicago Press, 1990.