Analytical Methods for Engineering

Paolo Guiotto
Introduction

The material presented here covers the "Analytical part" of the course Analytical and Stochastic Methods
for Engineering. The aim is to give a solid base in the most relevant advanced analytical tools used
in disparate contexts. To give an idea of this versatility, a few remarkable applications are presented,
touching problems from Engineering, Physics and Finance.
Most of the advanced analytical tools rely on the powerful Lebesgue integration theory. This is the
reason why we start by presenting the Lebesgue measure and integral in the first chapter and extend
this theory to abstract integration in Chapter 2. This framework is also fundamental for Probability
and Stochastic Analysis. Spaces of integrable functions are very important examples of Banach spaces, a
concept introduced in general in Chapter 3 together with some of its main features. In Chapter 4 we
particularize to those Banach spaces whose norm is induced by a scalar product, the so-called Hilbert
spaces. The final part of the course introduces Fourier Analysis: Fourier series and the Fourier transform.
A few words are required as a "disclaimer" about the focus of this course. The theory is developed only as
far as strictly needed to give the student a reasonable understanding and a set of working tools, so that
he/she can rely on a solid base in advanced Mathematics. As a consequence, proofs are given under
simplified assumptions, and the most technical ones are simply omitted. In my opinion this should not
compromise comprehension.
There is a certain number of problems at the end of each chapter. Students are warmly invited to
try to solve them. These are not "standard problems" where one just applies recipes. Even for the
computational problems, students must have a good knowledge of the general theory: definitions and
main properties. To solve the more abstract problems, two main ingredients are needed: first, a good
understanding of the main definitions (solving problems is just a way to learn them); second, the ability
to imitate and adapt proofs. The aim is not to produce a "mathematician", but rather to give tools for
"swimming" in deep water when doing mathematical modeling, which is one of the expected outcomes for
a Mathematical Engineer.
Finally, these Lecture Notes are not free of errors of various natures (from misprints to more serious
conceptual mistakes). The reader is warmly asked to point them out to me.
Contents

Chapter 1. Lebesgue Measure and Integral
1.1. Outer Measure
1.2. Lebesgue measure
1.3. Lebesgue measurable functions
1.4. Lebesgue integral
1.5. Reduction formula
1.6. Change of variables
1.7. Limit theorems
1.8. Continuous and differentiable dependence on parameters
1.9. Exercises

Chapter 2. Abstract Measure and Integral
2.1. Concept of Measure
2.2. How to define a measure
2.3. Integral
2.4. Limit theorems
2.5. Exercises

Chapter 3. Basic Banach spaces
3.1. Normed spaces
3.2. Limit of a sequence
3.3. Completeness
3.4. Linear Operators
3.5. Exercises

Chapter 4. Basic Hilbert spaces
4.1. Definition and first properties
4.2. Orthogonal Projections
4.3. Orthonormal bases
4.4. Gram–Schmidt orthogonalization
4.5. Exercises

Chapter 5. Fourier Series
5.1. Preliminaries
5.2. Euler formulas
5.3. Properties of Fourier coefficients
5.4. Convergence of a Fourier series
5.5. Applications
5.6. Exercises

Chapter 6. Fourier Transform
6.1. Definition and examples
6.2. Behavior of the Fourier transform
6.3. Schwartz Class
6.4. The Fourier–Plancherel Transform
6.5. Convolution
6.6. Applications
6.7. Exercises
CHAPTER 1

Lebesgue Measure and Integral

The aim of this Chapter is to introduce the Lebesgue measure and integral on R^d. The measure of
a set E is an extension of the concept of length (d = 1), area (d = 2) and volume (d = 3) to any dimension
d, with the aim of assigning a quantitative "size" to the largest possible class of sets. Denoting by λ_d(E) the
measure of E, we expect a certain number of properties to be fulfilled by λ_d:

• geometric coherence, that is,
  λ_d([a_1, b_1] × ··· × [a_d, b_d]) = (b_1 − a_1) ··· (b_d − a_d);
• countable additivity, that is, if the E_j are disjoint,
  λ_d(⋃_{j=1}^∞ E_j) = Σ_{j=1}^∞ λ_d(E_j);
• translation invariance, that is, λ_d(E + x) = λ_d(E) for any x ∈ R^d (here E + x = {e + x : e ∈ E});
• homothety invariance, that is, λ_d(cE) = |c|^d λ_d(E) for any c ∈ R (here cE = {ce : e ∈ E}).

The main problem is, of course: how can λ_d(E) be defined for a generic set E ⊂ R^d? We will see that
this is a delicate problem. A major application of a measure is the definition of the integral of a function
f : D ⊂ R^d → R. If f ≥ 0, a natural definition is

∫_D f := λ_{d+1}(Trap(f)), where Trap(f) := {(x, y) ∈ R^{d+1} : x ∈ D ⊂ R^d, 0 ≤ y ≤ f(x)}.

[Figure 1. Trap(f) colored in red.]

Because most of the proofs are long, technical and not really interesting from the applied side, we will omit
them (a good reference is Wheeden R.L., Zygmund A., Measure and Integral, Dekker, 1977).

1.1. Outer Measure

Let's start with the

Definition 1.1.1. A set of the type I = [a_1, b_1] × ··· × [a_d, b_d] ⊂ R^d is called a multi-interval (or interval
for short). Its measure is, by definition,

|I|_d := (b_1 − a_1) ··· (b_d − a_d).

By covering a set E ⊂ R^d with intervals we can easily define an estimate by excess of the measure
of any set:

[Figure: a set E covered by intervals I_1, ..., I_5.]

Definition 1.1.2 (outer measure).

λ*_d(E) := inf { Σ_j |I_j|_d : E ⊂ ⋃_j I_j, I_j intervals ⊂ R^d }, ∀E ⊂ R^d.

By definition we set λ*_d(∅) = 0.

Notice that λ*_d is defined for every subset of R^d. Moreover, the intervals covering E are not necessarily
disjoint. Let's see some properties of λ*_d.
Proposition 1.1.3. λ*_d fulfills the following properties:
i) coherence, that is, λ*_d(I) = |I|_d for every interval I;
ii) monotonicity, that is, λ*_d(E) ≤ λ*_d(F) if E ⊂ F;
iii) translation invariance, that is, λ*_d(E + v) = λ*_d(E), ∀E ⊂ R^d, ∀v ∈ R^d;
iv) homogeneity, that is, λ*_d(cE) = |c|^d λ*_d(E), ∀E ⊂ R^d, ∀c ∈ R;
v) sub-additivity, that is,

λ*_d(⋃_j E_j) ≤ Σ_j λ*_d(E_j), ∀(E_j)_j ⊂ P(R^d).

Proof. i) is less trivial than it may appear, but natural (exercise); ii), iii), iv) are easy and left as
exercises (do them!). Let's prove v). The conclusion is trivially true if Σ_j λ*_d(E_j) = +∞, so assume
that Σ_j λ*_d(E_j) < +∞. In particular (but not equivalently!) λ*_d(E_j) < +∞ for all j. Therefore, by
definition,

∀ε > 0 : ∃(I_k^j)_k : E_j ⊂ ⋃_k I_k^j, and Σ_k |I_k^j|_d ≤ λ*_d(E_j) + ε.

Choosing ε/2^j in place of ε (where ε on the right-hand side is an arbitrary positive number), we get

(1.1.1) ∃(I_k^j)_k : Σ_k |I_k^j|_d ≤ λ*_d(E_j) + ε/2^j.

Now (I_k^j)_{j,k} is clearly a covering of ⋃_j E_j. Therefore, again by the definition of the outer measure,

λ*_d(⋃_j E_j) ≤ Σ_{j,k} |I_k^j|_d = Σ_j Σ_k |I_k^j|_d ≤ Σ_j (λ*_d(E_j) + ε/2^j) = Σ_j λ*_d(E_j) + ε Σ_j 1/2^j = Σ_j λ*_d(E_j) + ε,

using (1.1.1) in the second inequality. Finally, ε being arbitrary, we obtain the conclusion. □

A fundamental property we might expect from λ*_d is countable additivity: this means that if E can
be decomposed as a disjoint union of countably many parts, then its measure is just the sum of the
measures of the parts. Let's introduce a useful notation:

⨆_j E_j := ⋃_j E_j, if E_i ∩ E_j = ∅, ∀i ≠ j.

Countable additivity then means that

λ*_d(⨆_j E_j) = Σ_j λ*_d(E_j).

Unfortunately this is false!

Theorem 1.1.4 (Vitali). λ*_d is not countably additive.

Proof. The proof is based on the construction of a set S ⊂ [−1, 1]^d with weird properties: for a
suitable countable set of points x_n, (S + x_i) ∩ (S + x_j) = ∅ for i ≠ j, and [−1, 1]^d ⊂ ⋃_n (S + x_n) ⊂ [−2, 2]^d. In this way,

2^d ≤ λ*_d(⨆_n (S + x_n)) ≤ 4^d.

However, if λ*_d were countably additive, by translation invariance we would have

λ*_d(⨆_n (S + x_n)) = Σ_n λ*_d(S + x_n) = Σ_n λ*_d(S),

and this sum can only be 0 (if λ*_d(S) = 0) or +∞ (if λ*_d(S) > 0), in contradiction with the
previous bounds. The construction of such an S is non-trivial and not intuitive; we omit it. □

1.2. Lebesgue measure


We will now introduce a suitable class of subsets, wide enough to contain interesting sets (such as intervals,
open and closed sets) but not leading to contradictions.

Definition 1.2.1 (Lebesgue class).

M_d := { E ⊂ R^d : ∃O open : λ*_d(E △ O) = 0 }.

Here E △ O := (E\O) ∪ (O\E) is the symmetric difference of E and O.

Let's see some easy consequences of this definition:
• open sets are Lebesgue measurable: their symmetric difference with themselves is empty;
• intervals are Lebesgue measurable: just take O = Int(I); in this case I △ O = ∂I is the boundary of
the interval which, as is easy to check, has outer measure 0;
• sets of measure 0 are Lebesgue measurable: indeed, if λ*_d(E) = 0 then just take O = ∅ (open by
definition); then λ*_d(E △ O) = λ*_d(E) = 0;
• any set that differs from a measurable set by a set of measure 0 is measurable: indeed, let F = E ∪ N
with λ*_d(N) = 0 and E ∈ M_d; according to the definition there exists an open set O such that
λ*_d(E △ O) = 0; then, because F\O = (E ∪ N)\O ⊂ (E\O) ∪ (N\O) ⊂ (E △ O) ∪ N
and O\F = O\(E ∪ N) ⊂ O\E ⊂ E △ O, we have F △ O ⊂ (E △ O) ∪ N, therefore
λ*_d(F △ O) ≤ λ*_d(E △ O) + λ*_d(N) = 0. □
With some technical work it is possible to prove the

Theorem 1.2.2. The class M_d fulfills the following properties:
i) if E ∈ M_d then E^c ∈ M_d;
ii) if (E_j) ⊂ M_d then ⋃_j E_j ∈ M_d.

The outer measure λ*_d restricted to the sets of M_d is also countably additive, that is,

λ*_d(⨆_j E_j) = Σ_j λ*_d(E_j), E_j ∈ M_d.

The class M_d is called the Lebesgue class and its elements are called Lebesgue measurable sets. The
outer measure λ*_d restricted to M_d is called the Lebesgue measure and is denoted simply by λ_d.

By i) it follows that M_d contains also all the closed (hence all the compact) sets, these being complements
of open sets. Moreover, because

(⋃_j E_j)^c = ⋂_j E_j^c,

one easily sees that M_d is closed also with respect to countable intersections. This class is so huge that
it is basically impossible, in practice, to write down a subset of R^d which is not in M_d. It is for this
reason that in practical applications the Lebesgue measurability of a given set is almost never checked.
Nonetheless, by Vitali's theorem, M_d ≠ P(R^d).

It is convenient to have an equivalent reformulation of countable additivity. Indeed, sometimes we
may approximate a certain set E by an increasing sequence of sets E_n. We will say that

E_n ↗ E ⟺ E_n ⊂ E_{n+1}, ∀n, and E = ⋃_n E_n.

A similar notation, E_n ↘ E, is used for decreasing sequences (E_n ⊃ E_{n+1} and E = ⋂_n E_n).

Proposition 1.2.3 (continuity of the Lebesgue measure). Let (E_n) ⊂ M_d. Then
i) (continuity from below) if E_n ↗ E then λ_d(E) = lim_n λ_d(E_n);
ii) (continuity from above) if E_n ↘ E and λ_d(E_1) < +∞ then λ_d(E) = lim_n λ_d(E_n).

Proof. i) First, E = ⋃_n E_n ∈ M_d by the properties of the Lebesgue class. Notice that the union is not
disjoint (unless all the E_n are empty), but we can rewrite it as

E = ⋃_n E_n = E_1 ⊔ (E_2\E_1) ⊔ (E_3\E_2) ⊔ ... = ⨆_j (E_j\E_{j−1}),

if we set E_0 = ∅ for convenience. Therefore, by countable additivity and then by finite additivity (the
partial sum telescopes),

λ_d(E) = Σ_j λ_d(E_j\E_{j−1}) = lim_{n→+∞} Σ_{j=1}^n λ_d(E_j\E_{j−1}) = lim_{n→+∞} λ_d(E_n).

ii) We reduce to i) by setting F_n := E_1\E_n. Clearly F_n ↗ E_1\⋂_n E_n, therefore, by i),

λ_d(E_1\⋂_n E_n) = lim_n λ_d(F_n) = lim_n λ_d(E_1\E_n).

Now, because λ_d(E_1) < +∞, all the λ_d(E_n) are finite, hence

λ_d(E_1) = λ_d(E_1\E_n) + λ_d(E_n) ⟺ λ_d(E_1\E_n) = λ_d(E_1) − λ_d(E_n)

(notice that if these quantities were infinite this algebra would not make sense, reducing to +∞ − (+∞)).
Similarly,

λ_d(E_1\⋂_n E_n) = λ_d(E_1) − λ_d(⋂_n E_n),

hence

λ_d(E_1) − λ_d(⋂_n E_n) = lim_n (λ_d(E_1) − λ_d(E_n)),

and from this the conclusion follows. □

Remark 1.2.4. Notice that continuity from above may fail if λ_d(E_1) = +∞. For example, take
d = 1, E_n = [n, +∞[. Clearly λ_1(E_n) = +∞ → +∞, but ⋂_n E_n = ∅, hence λ_1(⋂_n E_n) = 0. □

The Lebesgue measure has some other remarkable properties. Among these we quote here a couple
that are interesting and will be used later. The first is the

Proposition 1.2.5. Let T ∈ L(R^d) (the set of linear transformations of R^d) be invertible. Then

λ_d(T(E)) = |det T| λ_d(E), ∀E ∈ M_d.

In particular: because |det T| = 1 for rotations, it turns out that λ_d is invariant under rotations.
The proof is long; it consists in reducing T to a composition of particular linear bijections of R^d, on
each of which the formula is easy to prove. Notice that a particular case of this Proposition is
λ_d(cE) = |c|^d λ_d(E): here T = cI.
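The scaling identity λ_d(T(E)) = |det T| λ_d(E) can be checked numerically on a concrete linear map. Below is a small Monte Carlo sketch (the matrix entries, sample size and seed are arbitrary choices of ours, not from the notes): it estimates the area of the image of the unit square under T = [[2, 1], [0.5, 3]] and compares it with |det T| · λ_2([0,1]^2) = 5.5.

```python
import random

random.seed(0)  # reproducible sketch

# T = [[a, b], [c, d]] maps the unit square to a parallelogram; det T = 5.5
a, b, c, d = 2.0, 1.0, 0.5, 3.0
det = a * d - b * c

# corners of T([0,1]^2) and its bounding box
xs = [0.0, a, b, a + b]
ys = [0.0, c, d, c + d]
x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)

# a point (x, y) lies in T([0,1]^2) iff T^{-1}(x, y) lies in [0,1]^2
N, hits = 200_000, 0
for _ in range(N):
    x = random.uniform(x0, x1)
    y = random.uniform(y0, y1)
    u = (d * x - b * y) / det   # first coordinate of T^{-1}(x, y)
    v = (-c * x + a * y) / det  # second coordinate of T^{-1}(x, y)
    hits += (0.0 <= u <= 1.0) and (0.0 <= v <= 1.0)

area = hits / N * (x1 - x0) * (y1 - y0)
print(area, abs(det))  # both close to 5.5
```

The estimate converges to |det T| as N grows, in agreement with the Proposition.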

The second property worth mentioning is actually a natural generalization of the germinal principle
on which λ_d is based. Recall indeed that if I is an interval,

I = [a_1, b_1] × ··· × [a_d, b_d] = ([a_1, b_1] × ··· × [a_k, b_k]) × ([a_{k+1}, b_{k+1}] × ··· × [a_d, b_d]) =: J × H,

then

λ_d(I) = |I|_d = |J|_k · |H|_{d−k} = λ_k(J) λ_{d−k}(H).

From this, with a not really difficult but long piece of work, it is possible to prove the

Theorem 1.2.6 (Factorization). If E = A × B with A ∈ M_k and B ∈ M_{d−k}, then E ∈ M_d and

λ_d(E) = λ_k(A) λ_{d−k}(B).

We will denote this property briefly as λ_d = λ_k ⊗ λ_{d−k}.

1.3. Lebesgue measurable functions

The first step toward the definition of the integral is to introduce the class of functions we will
consider. To avoid caveats, we will consider extended-valued functions:

Definition 1.3.1. Let f : R^d → [−∞, +∞]. We say that f is Lebesgue measurable (notation
f ∈ L(R^d)) if

{f > a} ≡ {x ∈ R^d : f(x) > a} ∈ M_d, ∀a ∈ R.

The class of measurable functions is huge. To start we have

Proposition 1.3.2. χ_E ∈ L(R^d) iff E ∈ M_d.

Proof. Notice that χ_E ∈ {0, 1}. Therefore

{χ_E > a} = R^d if a < 0; E if 0 ≤ a < 1; ∅ if a ≥ 1.

From this it is evident that {χ_E > a} ∈ M_d iff E ∈ M_d (because ∅ and R^d are trivially in M_d). □
Let's introduce a very important class of functions: those taking only a finite number of values.

Definition 1.3.3. A simple function is a function of the form

s(x) := Σ_{k=1}^n c_k χ_{E_k}(x), E_i ∩ E_j = ∅, ∀i ≠ j.

One can easily check directly that a simple function is measurable iff E_k ∈ M_d for every k = 1, ..., n.
Alternatively, this follows as a by-product of the following

Proposition 1.3.4. Any linear combination, product and ratio (if the denominator never vanishes) of
measurable functions is a measurable function.

Another class of common functions that are measurable is given by the

Proposition 1.3.5. Any continuous function f ∈ C(R^d) is Lebesgue measurable.

Proof. f being continuous, {f > a} = f^←(]a, +∞[) is open, whence it belongs to M_d. □


A measurable function may be modified on a null set without losing measurability:

Proposition 1.3.6. Suppose that f = g on R^d\N, where λ_d(N) = 0. Then f ∈ L(R^d) iff g ∈ L(R^d). In
particular: if f = g ∈ C(R^d) on R^d\N with λ_d(N) = 0, then f ∈ L(R^d).

Proof. Notice that

{g > a} = {x ∈ R^d : g(x) > a} = {x ∈ R^d\N : g(x) > a} ∪ {x ∈ N : g(x) > a} =: {x ∈ R^d\N : g(x) > a} ∪ Ñ.

Now Ñ ⊂ N and, N having measure 0, Ñ has measure 0 as well; as we know, sets of measure 0 are
in M_d. About the first set we have

{x ∈ R^d\N : g(x) > a} = {x ∈ R^d\N : f(x) > a} = {x ∈ R^d : f(x) > a}\{x ∈ N : f(x) > a} =: {f > a}\N̂,

where, again, N̂ ⊂ N has measure 0, hence it is measurable. Therefore {f > a}\N̂ ∈ M_d (difference of
measurable sets), whence, finally, {g > a} ∈ M_d. □

Let's introduce a very useful terminology: we say that a certain property p(x) holds for almost every x (a.e.) if

∃N with λ_d(N) = 0 such that p(x) is true ∀x ∈ R^d\N.

For instance, the previous Proposition can be restated in the following more suggestive form:
• if f = g a.e., then f is measurable iff g is;
• if f is a.e. equal to a continuous function, then it is measurable.

Definition 1.3.7. Let (f_n) ⊂ L(R^d). We say that

f_n → f a.e. ⟺ f_n(x) → f(x), as n → +∞, for a.e. x ∈ R^d.

We have the

Theorem 1.3.8. Let (f_n) ⊂ L(R^d) be such that f_n → f a.e. Then f ∈ L(R^d).

1.4. Lebesgue integral

We are now ready to define the integral of a measurable function f. We start by assuming f ≥ 0.
In this case the natural definition would be

∫_{R^d} f ≡ ∫_{R^d} f(x) dx := λ_{d+1}({(x, y) ∈ R^{d+1} : 0 ≤ y < f(x)}).

To give this definition properly we need to know that the set delimited by R^d and the graph of f is
measurable. This follows, of course, from the measurability of f:

Lemma 1.4.1. Let f ∈ L(R^d), f ≥ 0. Then

Trap(f) := {(x, y) ∈ R^{d+1} : 0 ≤ y < f(x)} ∈ M_{d+1}.

Proof. We prove that Trap(f) is the limit of an increasing sequence of measurable sets. The idea is
simple: divide the co-domain [0, +∞[ into intervals [k/2^n, (k+1)/2^n[. Clearly

E_n := ⋃_k ({k/2^n ≤ f < (k+1)/2^n} × [0, k/2^n[) ∪ ({f = +∞} × [0, +∞[) ⊂ Trap(f)

and, f being measurable, by factorization E_n ∈ M_{d+1}.

[Figure: the sets {k/2^n ≤ f < (k+1)/2^n} × [0, k/2^n[ filling up Trap(f) from below as n increases.]

Easily E_{n+1} ⊃ E_n, so E_n ↗ E ⊂ Trap(f). Conversely, pick (x, y) ∈ Trap(f). If f(x) = +∞, then
immediately (x, y) ∈ {f = +∞} × [0, +∞[ ⊂ E_n for every n. Assume f(x) < +∞. Because y < f(x), we can find n*
such that

1/2^{n*} < f(x) − y.

Now pick k* in such a way that k*/2^{n*} ≤ f(x) < (k*+1)/2^{n*}. In particular,
y < f(x) − 1/2^{n*} ≤ (k*+1)/2^{n*} − 1/2^{n*} = k*/2^{n*}. By this,

(x, y) ∈ {k*/2^{n*} ≤ f < (k*+1)/2^{n*}} × [0, k*/2^{n*}[ ⊂ E_{n*} ⊂ ⋃_n E_n,

so that E = Trap(f). □

Definition 1.4.2. Let f ∈ L(R^d), f ≥ 0. We set

∫_{R^d} f ≡ ∫_{R^d} f(x) dx := λ_{d+1}(Trap(f)).

Some remarks on this definition:

• if E ∈ M_d then, using λ_{d+1} = λ_d ⊗ λ_1,

(1.4.1) ∫_{R^d} χ_E = λ_{d+1}(Trap(χ_E)) = λ_{d+1}(E × [0, 1[) = λ_d(E);

• the definition also covers the case in which ∫_{R^d} f = +∞;
• this definition fulfills some elementary properties. One which is evident is monotonicity: if
f ≥ g ≥ 0 are measurable, then clearly Trap(f) ⊃ Trap(g), hence

∫_{R^d} f = λ_{d+1}(Trap(f)) ≥ λ_{d+1}(Trap(g)) = ∫_{R^d} g.
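The E_n construction in the proof of Lemma 1.4.1 is effectively an algorithm: truncate the values of f down to the dyadic grid k/2^n and sum level heights times the measure of the corresponding level sets. The sketch below (the test function, grid size and dyadic depth are arbitrary choices of ours) approximates a one-dimensional integral this way, with the level-set measures themselves estimated by counting points of a uniform grid:

```python
import math

def lebesgue_lower_sum(f, a, b, n=12, grid=100_000):
    # Sum over k of (k/2^n) * lambda_1({x in [a,b] : k/2^n <= f(x) < (k+1)/2^n}),
    # i.e. each value f(x) is truncated down to the dyadic level below it;
    # the level-set measures are approximated on a uniform grid of [a, b].
    h = (b - a) / grid
    step = 2.0 ** (-n)
    total = 0.0
    for i in range(grid):
        x = a + (i + 0.5) * h
        total += math.floor(f(x) / step) * step * h
    return total

print(lebesgue_lower_sum(lambda x: x * x, 0.0, 1.0))  # close to 1/3
```

As n grows the dyadic truncation error (at most 2^{-n} per point) vanishes, which mirrors E_n ↗ Trap(f) in the lemma.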

Let's now extend this definition to a generic f (not necessarily positive). The idea is just to consider the
area with sign: positive where f is positive, negative where f is negative. To this aim let's introduce
some useful notation:

f^+ := max{f, 0} (positive part of f), f^− := max{−f, 0} (negative part of f).

Be careful: f^± ≥ 0. It is immediate from the definition to check that

f = f^+ − f^−, |f| = f^+ + f^−.

Moreover,

f^+ = φ(f), where φ(u) := max{u, 0}, f^− = ψ(f), where ψ(u) := max{−u, 0},

and φ, ψ being evidently in C(R), we have f^± ∈ L(R^d) if f ∈ L(R^d). A natural position would then be

∫_{R^d} f := ∫_{R^d} f^+ − ∫_{R^d} f^−.

To avoid the problem that this formula could reduce to (+∞) − (+∞), we require that both ∫_{R^d} f^± < +∞.
Since |f| = f^+ + f^−, by monotonicity this condition is equivalent to ∫_{R^d} |f| < +∞.

Definition 1.4.3. Let f ∈ L(R^d). We say that f is Lebesgue integrable (notation: f ∈ L^1(R^d)) if

∫_{R^d} |f| < +∞.

In this case we set

∫_{R^d} f := ∫_{R^d} f^+ − ∫_{R^d} f^−.

The next step is to extend the definition of the integral to integrals over a subset of R^d. Let
E ⊂ R^d and f : E → R. We write

fχ_E : R^d → R, fχ_E := f on E, 0 on R^d\E.

Notice that fχ_E ∈ L(R^d) iff E ∈ M_d and {f > a} ∈ M_d for all a ∈ R (see the exercises). In this case we
say that f ∈ L(E). It is therefore natural to set the

Definition 1.4.4. Let f ∈ L(E). We say that f ∈ L^1(E) iff ∫_{R^d} |fχ_E| < +∞, and we set

∫_E f := ∫_{R^d} fχ_E.

To finish, we introduce a last extension, to complex-valued functions f : E ⊂ R^d → C. This
case is relevant in view of the Fourier Transform, which we will discuss later.

Definition 1.4.5. Let f : E ⊂ R^d → C. We say that f ∈ L(E) iff Re f, Im f ∈ L(E). Moreover, we
say that f ∈ L^1(E) iff |f| ∈ L^1(E) (1), and we set

∫_E f := ∫_E Re f + i ∫_E Im f.
This definition fulfills the natural properties we may expect of an integral:

Proposition 1.4.6. The Lebesgue integral fulfills the following properties:
• linearity: if f, g ∈ L^1(E), then αf + βg ∈ L^1(E) for every α, β ∈ C and
∫_E (αf + βg) = α ∫_E f + β ∫_E g;
• isotonicity: if f, g ∈ L^1(E) are real valued and f ≥ g a.e., then ∫_E f ≥ ∫_E g;
• triangular inequality: if f ∈ L^1(E), then |∫_E f| ≤ ∫_E |f|;
• decomposition: if f ∈ L^1(E) and E = A ⊔ B with A, B ∈ M_d, then ∫_E f = ∫_A f + ∫_B f;
• restriction: if f ∈ L^1(E) and M_d ∋ F ⊂ E, then f ∈ L^1(F) and ∫_F f = ∫_E fχ_F;
• invariance under null sets: if f, g ∈ L^1(E) and f = g a.e., then ∫_E f = ∫_E g. In particular: if f = 0 a.e.,
then ∫_E f = 0.

1.4.1. Lebesgue and Riemann integrals. In dimension d = 1 we have two possible concepts of
integral: the Riemann integral and the Lebesgue integral. Of course there are connections between the
two: the first coincides with the second whenever the first is defined. Precisely:

Theorem 1.4.7. If f ∈ R([a, b]) (Riemann integrable), then f ∈ L^1([a, b]) and

(Riemann) ∫_a^b f(x) dx = ∫_{[a,b]} f (Lebesgue).

Moreover, f ∈ R([a, b]) iff it is bounded and a.e. continuous on [a, b].

In practice, to compute the Lebesgue integral we may in some cases use methods seen in the first Analysis
course, for instance the calculus of primitives together with the fundamental theorem of Calculus.
That is: if f ∈ C([a, b]) has a primitive F, then

∫_{[a,b]} f ≡ ∫_a^b f(x) dx = F(b) − F(a).

For this reason we will use the symbol ∫_a^b f(x) dx for the Lebesgue integral as well. However, the Lebesgue
integral is more general.

Example 1.4.8. The function χ_{[0,1]∩Q^c} (= 1 a.e. on [0, 1]) is not Riemann integrable, but it is
Lebesgue integrable. Indeed, [0, 1] ∩ Q^c is measurable and

∫_{[0,1]} χ_{[0,1]∩Q^c} = λ_1([0, 1] ∩ Q^c) = λ_1([0, 1]) = 1. □
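The failure of Riemann integrability here can be made concrete: Riemann sums of this function depend on where the tags are chosen. Since floating-point numbers cannot encode irrationality, the sketch below (entirely our own simplification) passes the nature of the tags explicitly as a flag — every subinterval of [0, 1] contains both rational and irrational points, so both choices are always available:

```python
def dirichlet(tag_is_irrational):
    # chi_{[0,1] ∩ Q^c}: value 0 at rational points, 1 at irrational points
    return 1.0 if tag_is_irrational else 0.0

def riemann_sum(n, irrational_tags):
    # Riemann sum over n equal subintervals of [0, 1]; each subinterval
    # contains tags of both kinds, so the choice is ours to make
    return sum(dirichlet(irrational_tags) * (1.0 / n) for _ in range(n))

for n in (10, 100, 1000):
    print(riemann_sum(n, False), riemann_sum(n, True))
# rational tags always give 0, irrational tags give (about) 1:
# the Riemann sums have no common limit, while the Lebesgue integral is exactly 1
```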

(1) Notice that by monotonicity it follows that

∫_E |Re f| ≤ ∫_E |f| < +∞ ⟹ Re f ∈ L^1(E),

and similarly Im f ∈ L^1(E).

An extension of the Riemann integral is given by the concept of generalized integral. For instance,

∫_{−∞}^{+∞} f(x) dx := lim_{a→−∞, b→+∞} ∫_a^b f(x) dx,

if the limit exists. Of course, we need a minimal requirement on f so that the definite integrals
∫_a^b f(x) dx make sense. We say that f ∈ R_loc(R) iff f ∈ R([a, b]) for every [a, b] ⊂ R.

Theorem 1.4.9. Let f ∈ R_loc(R). Then

f ∈ L^1(R) ⟺ ∫_{−∞}^{+∞} |f(x)| dx < +∞.

In this case, ∫_{−∞}^{+∞} f(x) dx = ∫_R f.

Remark 1.4.10. The requirement of absolute integrability is really needed. For instance, the function
f(x) := sin x / x is integrable in the generalized sense but is not L^1, because ∫_{−∞}^{+∞} |sin x / x| dx = +∞. □
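This dichotomy can be observed numerically: the truncated integrals of sin x / x settle near π/2 ≈ 1.5708, while those of |sin x / x| keep growing without bound (roughly like (2/π) log T). A quick sketch with a basic composite trapezoidal rule (the helper and step counts are arbitrary choices of ours):

```python
import math

def trapezoid(f, a, b, n):
    # composite trapezoidal rule with n subintervals
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n)))

def sinc(x):
    return math.sin(x) / x if x != 0.0 else 1.0

for T in (10, 100, 1000):
    osc = trapezoid(sinc, 0.0, T, 200 * T)                  # oscillating integral
    absint = trapezoid(lambda x: abs(sinc(x)), 0.0, T, 200 * T)  # absolute value
    print(T, round(osc, 4), round(absint, 4))
# the second column hovers near pi/2 = 1.5708..., the third keeps increasing
```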

1.5. Reduction formula

Intuitively, we could compute the volume of a potato by slicing it and summing up the volumes of the
slices, each computed as area × thickness. As the thickness goes to 0, the sum becomes an integral. To
state precisely this property, which is a continuum version of the factorization, let's start with the

Definition 1.5.1. Given E ⊂ R^d we set

E_x := {y ∈ R^{d−k} : (x, y) ∈ E} (x-section), E^y := {x ∈ R^k : (x, y) ∈ E} (y-section).

[Figure: the x-section E_x and the y-section E^y of a set E.]

Notice that E_x, E^y may be empty for some values of x, y.

Example 1.5.2. If E = A × B we have

E_x = {y ∈ R^{d−k} : (x, y) ∈ A × B} = ∅ if x ∉ A, B if x ∈ A;
E^y = {x ∈ R^k : (x, y) ∈ A × B} = ∅ if y ∉ B, A if y ∈ B.

We have the

Theorem 1.5.3 (slicing). Let E ∈ M_d. Then
i) E_x ∈ M_{d−k} for a.e. x ∈ R^k and E^y ∈ M_k for a.e. y ∈ R^{d−k};
ii) x ↦ λ_{d−k}(E_x) ∈ L(R^k) and y ↦ λ_k(E^y) ∈ L(R^{d−k});
iii) the slicing formula holds:

(1.5.1) λ_d(E) = ∫_{R^k} λ_{d−k}(E_x) dx = ∫_{R^{d−k}} λ_k(E^y) dy.

Formula (1.5.1) can be rewritten as follows:

∫_E = ∫_{R^k} (∫_{E_x} dy) dx = ∫_{R^{d−k}} (∫_{E^y} dx) dy.

This is the first brick of the general integration formula for functions f ≥ 0 of (x, y) ∈ R^k × R^{d−k} ≡ R^d:

Theorem 1.5.4 (Fubini–Tonelli). Let f ∈ L(E), f ≥ 0. Then
i) y ↦ f(x, y) ∈ L(E_x) for a.e. x ∈ R^k and x ↦ f(x, y) ∈ L(E^y) for a.e. y ∈ R^{d−k};
ii) x ↦ ∫_{E_x} f(x, y) dy ∈ L(R^k) and y ↦ ∫_{E^y} f(x, y) dx ∈ L(R^{d−k});
iii) the reduction formula holds:

(1.5.2) ∫_E f = ∫_{R^k} (∫_{E_x} f(x, y) dy) dx = ∫_{R^{d−k}} (∫_{E^y} f(x, y) dx) dy.

Formula (1.5.2) is called the reduction formula. Its aim is evident: we can reduce the dimension of the
integration variables by iterating integrals. It is of course natural to ask whether the formula also works
for an L^1 function, and this seems evident by reducing to positive and negative parts:

Corollary 1.5.5. If f ∈ L^1(E), E ⊂ R^d, then the reduction formula (1.5.2) holds.

The reduction formula can also be used to check whether f ∈ L^1(E). Indeed, this means ∫_E |f| < +∞. But
if f ∈ L(E) then |f| ∈ L(E), hence we can apply the Fubini–Tonelli theorem to |f|, which says that (1.5.2)
holds for |f|. In particular we obtain the

Corollary 1.5.6. Let f ∈ L(E). If one among

∫_{R^k} (∫_{E_x} |f(x, y)| dy) dx, ∫_{R^{d−k}} (∫_{E^y} |f(x, y)| dx) dy

is finite, then f ∈ L^1(E) (and the reduction formula (1.5.2) holds).


Example 1.5.7. Let

f(x, y) = (x − y)/(x + y)^3, (x, y) ∈ Q := [0, 1]^2.

Then ∫_R (∫_{Q_x} f dy) dx ≠ ∫_R (∫_{Q^y} f dx) dy (hence, in particular, f ∉ L^1([0, 1]^2)).

Sol. — Notice first that

Q_x = {y ∈ R : (x, y) ∈ [0, 1]^2} = ∅ if x ∉ [0, 1], [0, 1] if x ∈ [0, 1],

and similarly for Q^y. Therefore, for y ∈ [0, 1] (the inner integral vanishing for y ∉ [0, 1]),

∫_{Q^y} f(x, y) dx = ∫_0^1 (x − y)/(x + y)^3 dx = ∫_0^1 (x + y − 2y)/(x + y)^3 dx
= ∫_0^1 dx/(x + y)^2 − 2y ∫_0^1 dx/(x + y)^3.

Except for y = 0 (hence for a set of measure 0) both integrals are finite, and their combined value is

[−(x + y)^{−1}]_{x=0}^{x=1} − 2y [−(x + y)^{−2}/2]_{x=0}^{x=1}
= (1/y − 1/(y + 1)) − y (1/y^2 − 1/(y + 1)^2) = −1/(y + 1) + y/(y + 1)^2 = −1/(y + 1)^2.

Hence

∫_R (∫_{Q^y} f(x, y) dx) dy = ∫_0^1 −1/(y + 1)^2 dy = [(y + 1)^{−1}]_{y=0}^{y=1} = 1/2 − 1 = −1/2.

Exchanging the roles of x and y we obtain the same result with the opposite sign:
∫_R (∫_{Q_x} f(x, y) dy) dx = 1/2. □
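A quick numerical cross-check of the computation above (the midpoint-rule helper and the step counts are arbitrary choices of ours): the inner integral in x at a fixed y matches the closed form −1/(1 + y)^2, and integrating the two closed-form inner integrals lands on ∓1/2.

```python
def midpoint(f, a, b, n=10_000):
    # composite midpoint rule with n subintervals
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return (x - y) / (x + y) ** 3

# inner integral in x at fixed y = 0.3 vs the closed form -1/(1 + y)^2
y0 = 0.3
print(midpoint(lambda x: f(x, y0), 0.0, 1.0), -1.0 / (1.0 + y0) ** 2)

# outer integrals of the two closed-form inner integrals: dx-first vs dy-first
print(midpoint(lambda y: -1.0 / (1.0 + y) ** 2, 0.0, 1.0))  # close to -1/2
print(midpoint(lambda x: +1.0 / (1.0 + x) ** 2, 0.0, 1.0))  # close to +1/2
```

The two iterated integrals disagree, consistent with f ∉ L^1([0, 1]^2).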

1.6. Change of variables

Let T : R^d → R^d be a transformation. If T is a linear bijection we know that

λ_d(T(E)) = |det T| λ_d(E).

What happens if T is a general bijection? If T is regular (differentiable) then
T(x) = T(x_0) + T'(x_0)(x − x_0) + o(x − x_0) ≈ T(x_0) + T'(x_0)(x − x_0). The sense of ≈ is that T(x) can
be replaced by T(x_0) + T'(x_0)(x − x_0) in a neighborhood of x_0, and the approximation gets more
precise as the neighborhood shrinks. Imagine then that we decompose E as a union of neighborhoods of
its points:

E = ⋃_{x_0} U_{x_0} ⟹ λ_d(T(E)) ≈ Σ_{x_0} λ_d(T(U_{x_0})).

Now, if we replace T by the affine transformation T(x_0) + T'(x_0)(x − x_0), we see that

λ_d(T(U_{x_0})) ≈ λ_d(T(x_0) + T'(x_0)(U_{x_0} − x_0)) = λ_d(T'(x_0)(U_{x_0} − x_0))
= |det T'(x_0)| λ_d(U_{x_0} − x_0) = |det T'(x_0)| λ_d(U_{x_0}).

Hence

λ_d(T(E)) ≈ Σ_{x_0} |det T'(x_0)| λ_d(U_{x_0}) ≈ ∫_E |det T'(ξ)| dξ.

Of course this is not a proof, but it suggests a correct conclusion:

Theorem 1.6.1. Let T : E ⊂ R^d → T(E) be a diffeomorphism (that is, T, T^{−1} ∈ C^1) on an open set E.
Then

λ_d(T(E)) = ∫_E |det T'(ξ)| dξ.

Because null sets are irrelevant for measure and integral, the conclusion remains true if we change E and
T(E) by a set of measure 0.

Transferring from measures to integrals, the previous formula can be rewritten, setting Φ := T^{−1}, in the form

∫_{T(E)} dx = ∫_E |det T'(ξ)| dξ ⟺ ∫_F dx = ∫_{Φ(F)} |det(Φ^{−1})'(ξ)| dξ,

which is an identity involving the function 1(x) ≡ 1. For a generic function f we have the

Theorem 1.6.2 (change of variables formula). Let Φ : F → Φ(F) be a diffeomorphism, F ⊂ R^d
open. Then

(1.6.1) ∫_F f(x) dx = ∫_{Φ(F)} f(Φ^{−1}(ξ)) |det(Φ^{−1})'(ξ)| dξ.

Some remarks on (1.6.1):

• identity (1.6.1) means that the left-hand side is finite (hence f ∈ L^1(F)) iff the right-hand side is finite:
this explains why the formula can be used to check the integrability of a given function;
• (1.6.1) expresses the change of variables in the canonical form, where one introduces a new
variable ξ = Φ(x); sometimes the change of variables is given in the form x = Ψ(ξ): in this case
Φ = Ψ^{−1}, hence |det(Φ^{−1})'| = |det Ψ'|;
• (1.6.1) remains valid if we modify F and Φ(F) by null sets; this is particularly important
for several remarkable changes of variables.
Example 1.6.3 (polar coordinates in R^2).

(x, y) = Ψ(ρ, θ) = (ρ cos θ, ρ sin θ).

If we define, as usual, Ψ : [0, +∞[×[0, 2π] → R^2, we do not have a bijection because, for instance, Ψ(0, θ) = (0, 0)
for every θ ∈ [0, 2π]. If we restrict Ψ : ]0, +∞[×[0, 2π[ → R^2\{0_2} we now have a bijection but, as one can
check, Ψ^{−1} is not continuous! (hence, in particular, Φ = Ψ^{−1} cannot be a diffeomorphism). This is because

Ψ^{−1}(x, y) = (√(x^2 + y^2), arg(x, y)),

where arg(x, y) ∈ [0, 2π[ denotes by definition the angle between the vector (x, y) and the positive real
axis. The problem is that if (x, y) → (1, 0) along the unit circle, then Ψ^{−1}(x, y) → (1, 0) or (1, 2π)
according to where (x, y) moves (in the upper or in the lower half plane). In other words,

∄ lim_{(x,y)→(1,0)} Ψ^{−1}(x, y).

To make Ψ, Ψ^{−1} = Φ a true diffeomorphism we have to eliminate the problematic points of the positive real axis. This
means that we have to restrict Ψ : ]0, +∞[×]0, 2π[ → R^2\{(x, 0) : x ≥ 0} =: R^2\R_+. Now Ψ is a diffeomorphism
between these two open sets (on one side the strip ]0, +∞[×]0, 2π[, on the other side R^2\R_+). Moreover,

|det Ψ'(ρ, θ)| = |det ( cos θ  −ρ sin θ ; sin θ  ρ cos θ )| = |ρ(cos^2 θ + sin^2 θ)| = ρ.

Now, [0, +∞[×[0, 2π[ differing from ]0, +∞[×]0, 2π[ by a set of measure zero, and R^2 differing from R^2\R_+ by
a set of measure zero, we can apply the change of variables formula to conclude (using Fubini–Tonelli in the last step) that

(1.6.2) ∫_{R^2} f(x, y) dxdy = ∫_{[0,+∞[×[0,2π]} f(ρ cos θ, ρ sin θ) ρ dρdθ = ∫_0^{+∞} ρ (∫_0^{2π} f(ρ cos θ, ρ sin θ) dθ) dρ. □

Example 1.6.4 (Gaussian integral). A very beautiful (and relevant) application of (1.6.2) is the
formula

∫_R e^{−x^2/2} dx = √(2π).

Sol. — Let's start with the double integral: by Fubini–Tonelli,

∫_{R^2} e^{−(x^2+y^2)/2} dxdy = ∫_R (∫_R e^{−(x^2+y^2)/2} dx) dy = ∫_R e^{−y^2/2} (∫_R e^{−x^2/2} dx) dy = (∫_R e^{−x^2/2} dx)^2.

On the other hand, by (1.6.2),

∫_{R^2} e^{−(x^2+y^2)/2} dxdy = ∫_0^{+∞} (∫_0^{2π} e^{−ρ^2/2} ρ dθ) dρ = 2π ∫_0^{+∞} e^{−ρ^2/2} ρ dρ = 2π [−e^{−ρ^2/2}]_{ρ=0}^{ρ=+∞} = 2π,

and from this the conclusion follows. □
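Both sides of the argument can be checked numerically: the square of the one-dimensional integral (truncated to [−10, 10], where the tail is negligible) and the polar-coordinates integral both come out as 2π ≈ 6.28319. A sketch with a basic midpoint rule (truncation bounds and step counts are arbitrary choices of ours):

```python
import math

def midpoint(f, a, b, n=200_000):
    # composite midpoint rule with n subintervals
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

one_d = midpoint(lambda x: math.exp(-x * x / 2.0), -10.0, 10.0)
polar = 2.0 * math.pi * midpoint(lambda r: r * math.exp(-r * r / 2.0), 0.0, 20.0)
print(one_d ** 2, polar, 2.0 * math.pi)  # all three close to 6.28319
```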

1.7. Limit theorems


A very common problem, with lots of variations, consists in studying
Z
lim f n, where ( f n )n∈N ⊂ L 1 (E).
n E
The question is: under which conditions can be said that
Z Z
fn = lim f n .
?
lim
n E E n

The aim of this section is to present the two main results on this problems, in the next section we will
discuss some remarkable applications.

1.7.1. Monotone convergence. The monotone convergence theorem concerns an apparently very particular and rarely occurring case. It turns out, however, that it is actually the base of all the other results:

Theorem 1.7.1 (Beppo Levi). Let $(f_n) \subset L(E)$ be such that $0 \leqslant f_n \leqslant f_{n+1}$ a.e. for every $n \in \mathbb{N}$ (we write $f_n \nearrow$ a.e.). Then
$$\lim_n \int_E f_n = \int_E \lim_n f_n.$$

Proof. We first consider the particular case when $E = \mathbb{R}^d$ and a.e. means everywhere, that is, for every $x \in \mathbb{R}^d$. The proof is quite easy: it is nothing but the continuity of the measure. Indeed, by definition,
$$\int_{\mathbb{R}^d} f_n = \lambda_{d+1}(\mathrm{Trap}(f_n)).$$
Easily $\mathrm{Trap}(f_n) \nearrow \mathrm{Trap}(f)$, hence, by the continuity of the measure,
$$\int_{\mathbb{R}^d} f_n = \lambda_{d+1}(\mathrm{Trap}(f_n)) \longrightarrow \lambda_{d+1}(\mathrm{Trap}(f)) = \int_{\mathbb{R}^d} f.$$
The general case is easily reduced to the particular one: let $N_n$ be the null set where $f_n \leqslant f_{n+1}$ doesn't hold and set $N := \bigcup_n N_n$. Then $\lambda_d(N) = 0$ and $f_n \leqslant f_{n+1}$ on $E \backslash N$, hence $f_n \chi_{E\backslash N} \leqslant f_{n+1} \chi_{E\backslash N}$ on $\mathbb{R}^d$ for every $n$. According to the particular case,
$$\lim_n \int_{\mathbb{R}^d} f_n \chi_{E\backslash N} = \int_{\mathbb{R}^d} \lim_n f_n \chi_{E\backslash N} = \int_{\mathbb{R}^d} \left( \lim_n f_n \right) \chi_{E\backslash N}.$$
But, $N$ being null, for any measurable $g \geqslant 0$ we have
$$\int_{\mathbb{R}^d} g \chi_{E\backslash N} = \int_{E\backslash N} g = \int_E g,$$
and by this the conclusion follows. □
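The mechanism of the theorem can be watched numerically on a concrete increasing sequence (our example, not from the text): take $f_n(x) := \min(x^{-1/2}, n) \nearrow f(x) = x^{-1/2}$ on $]0,1]$; then $\int_0^1 f_n = 2 - 1/n$ climbs up to $\int_0^1 f = 2$.

```python
def integral_fn(n, N=200_000):
    """Midpoint Riemann sum of f_n(x) = min(x**-0.5, n) on ]0, 1]."""
    h = 1.0 / N
    return h * sum(min(((k + 0.5) * h) ** -0.5, n) for k in range(N))

# f_n increases to f(x) = x**-0.5; exactly, the integral of f_n is 2 - 1/n -> 2.
vals = [integral_fn(n) for n in (1, 10, 100)]
print(vals)  # increasing towards 2
```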

As with continuity from above for measures, reversing the monotonicity doesn't work in general.

Example 1.7.2. Take $f_n = \chi_{[n,+\infty[}$. Then $f_n \searrow 0$ but
$$\int_{\mathbb{R}} f_n\,dx = +\infty \not\longrightarrow \int_{\mathbb{R}} 0\,dx = 0. \quad \square$$

However, we do have monotone convergence for decreasing sequences if we add one requirement:

Proposition 1.7.3. Let $f_n \geqslant f_{n+1} \geqslant 0$ a.e. on $E$ for every $n$, with $f_1 \in L^1(E)$. Then
$$\lim_n \int_E f_n = \int_E \lim_n f_n.$$

Proof. It is a straightforward application of monotone convergence to the sequence $g_n := f_1 - f_n$. Since $f_n \searrow$, we have $g_n \nearrow$. Hence
$$\lim_n \int_E g_n = \int_E \lim_n g_n, \quad \Longleftrightarrow \quad \lim_n \int_E (f_1 - f_n) = \int_E \left( f_1 - \lim_n f_n \right).$$
Now, the point is that, since $f_1 \in L^1$ and $0 \leqslant f_n \leqslant f_1$, every $f_n \in L^1$. Therefore
$$\int_E (f_1 - f_n) = \int_E f_1 - \int_E f_n, \qquad \int_E \left( f_1 - \lim_n f_n \right) = \int_E f_1 - \int_E \lim_n f_n,$$
and, cancelling the finite quantity $\int_E f_1$, the conclusion follows. □

An immediate application of monotone convergence is to the sum of constant-sign series:

Corollary 1.7.4. Let $(f_n) \subset L(E)$, $f_n \geqslant 0$. Then
$$\int_E \sum_n f_n = \sum_n \int_E f_n.$$

Proof. Just consider the sequence of partial sums $S_n := \sum_{k=0}^n f_k$. Being $f_k \geqslant 0$ for all $k$, we have $S_n \nearrow S = \sum_n f_n$, hence
$$\lim_n \int_E S_n = \int_E \lim_n S_n, \quad \Longleftrightarrow \quad \lim_n \int_E \sum_{k=0}^n f_k = \int_E \sum_k f_k,$$
and because by linearity $\int_E \sum_{k=0}^n f_k = \sum_{k=0}^n \int_E f_k$, we easily have the conclusion. □
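A numerical illustration of the corollary (our example, not from the text): for the positive series $\sum_n x^n$ on $[0,\frac12]$, term-wise integration must reproduce $\int_0^{1/2} \frac{dx}{1-x} = \log 2$.

```python
import math

# Termwise: the integral of x^n over [0, 1/2] is (1/2)^(n+1)/(n+1);
# 200 terms leave a tail smaller than (1/2)^201.
termwise = sum(0.5 ** (n + 1) / (n + 1) for n in range(200))

# Integral of the sum: the series sums to 1/(1-x), whose integral on [0, 1/2] is log 2.
whole = math.log(2.0)

print(termwise, whole)  # both ≈ 0.6931471805
```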

1.7.2. Dominated convergence. Monotone convergence is too restrictive for many purposes. A more general statement (which actually follows from monotone convergence) is given by the

Theorem 1.7.5 (Lebesgue). Let $(f_n) \subset L^1(E)$ be such that
i) $f_n \overset{a.e.}{\longrightarrow} f$ on $E$;
ii) $\exists g \in L^1(E)$ (called an integrable dominant) such that $|f_n| \leqslant g$ a.e. on $E$ for every $n \in \mathbb{N}$.
Then $f \in L^1(E)$ and
$$\lim_n \int_E f_n = \int_E \lim_n f_n.$$

Proof. Being the point-wise limit of measurable functions, $f := \lim_n f_n$ is measurable. Now,
$$|f(x)| \overset{a.e.}{=} \lim_n |f_n(x)| \overset{a.e.}{\leqslant} g(x), \quad \Longrightarrow \quad \int_E |f| \leqslant \int_E g < +\infty, \quad \Longrightarrow \quad f \in L^1(E).$$
Now let's check that $\int_E f_n \longrightarrow \int_E f$. We estimate the modulus of the difference:
$$\left| \int_E f_n - \int_E f \right| = \left| \int_E (f_n - f) \right| \leqslant \int_E |f_n - f| \leqslant \int_E \sup_{k\geqslant n} |f_k - f| =: \int_E \varepsilon_n,$$
where, of course, $\varepsilon_n(x) := \sup_{k\geqslant n} |f_k(x) - f(x)|$. Now, if we prove that $\varepsilon_n \searrow 0$ and $\varepsilon_1 \in L^1(E)$, the conclusion follows by applying monotone convergence for decreasing sequences.
Clearly $\varepsilon_n$ is positive. We accept its measurability (see exercises). Easily,
$$\varepsilon_{n+1}(x) = \sup_{k\geqslant n+1} |f_k(x) - f(x)| \leqslant \sup_{k\geqslant n} |f_k(x) - f(x)| = \varepsilon_n(x), \quad \forall x \in E.$$
Moreover, by i), $\varepsilon_n(x) \longrightarrow 0$ for a.e. $x \in E$. Indeed: except for a null set of $x$, $f_n(x) \longrightarrow f(x)$, therefore
$$\forall \varepsilon > 0, \ \exists N(\varepsilon) : |f_n(x) - f(x)| \leqslant \varepsilon, \ \forall n \geqslant N(\varepsilon), \quad \Longrightarrow \quad 0 \leqslant \varepsilon_n(x) \leqslant \varepsilon, \ \forall n \geqslant N(\varepsilon).$$
To finish, we apply monotone convergence in the decreasing version (Proposition 1.7.3). We need just to check that $\varepsilon_1 \in L^1(E)$:
$$\varepsilon_1 = \sup_k |f_k - f| \leqslant \sup_k (|f_k| + |f|) \leqslant 2g \in L^1(E), \quad \Longrightarrow \quad \varepsilon_1 \in L^1(E). \quad \square$$

Example 1.7.6. Compute
$$\lim_{n\to+\infty} \int_0^{+\infty} n^2 \left( 1 - \cos\frac{x}{n} \right) e^{-\frac{n}{n+1}x}\,dx.$$

Sol. — Let $f_n(x) := n^2 \left( 1 - \cos\frac{x}{n} \right) e^{-\frac{n}{n+1}x}$. Clearly $f_n \in L^1([0,+\infty[)$ for any $n$. Moreover,
$$f_n(x) \sim_{n\to+\infty} n^2 \frac{x^2}{2n^2} e^{-\frac{n}{n+1}x} = \frac{x^2}{2} e^{-\frac{n}{n+1}x} \longrightarrow \frac{x^2}{2} e^{-x}, \quad \forall x \geqslant 0.$$
Now, since $1 - \cos t \leqslant \frac{t^2}{2}$,
$$|f_n(x)| \leqslant n^2 \frac{(x/n)^2}{2} e^{-\frac{n}{n+1}x} = \frac{x^2}{2} e^{-\frac{n}{n+1}x}, \quad \forall x \geqslant 0.$$
Now $\frac{n}{n+1} \nearrow 1$, therefore $\frac{n}{n+1} \geqslant \frac12$ as $n \geqslant 1$. Hence
$$|f_n(x)| \leqslant \frac{x^2}{2} e^{-\frac{x}{2}} =: g(x), \quad \forall n \geqslant 1, \ \forall x \in [0,+\infty[.$$
Clearly $g \in L^1$, so it is an integrable dominant. By dominated convergence, then,
$$\lim_{n\to+\infty} \int_0^{+\infty} f_n(x)\,dx = \int_0^{+\infty} \frac{x^2}{2} e^{-x}\,dx = \left[ -\frac{x^2}{2} e^{-x} \right]_0^{+\infty} + \int_0^{+\infty} x e^{-x}\,dx = \left[ -x e^{-x} \right]_0^{+\infty} + \int_0^{+\infty} e^{-x}\,dx = 1. \quad \square$$
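A numerical check of this limit (a sketch; the integration grid and the cutoff of the unbounded domain are our choices) agrees with the value $1$:

```python
import math

def J(n, X=80.0, N=200_000):
    """Midpoint Riemann sum of f_n(x) = n^2 (1 - cos(x/n)) e^{-n x/(n+1)} on [0, X];
    the dominant g(x) = (x^2/2) e^{-x/2} makes the tail beyond X negligible."""
    h = X / N
    a = n / (n + 1)
    return h * sum(n * n * (1.0 - math.cos((k + 0.5) * h / n))
                   * math.exp(-a * (k + 0.5) * h) for k in range(N))

vals = [J(n) for n in (1, 10, 100)]
print([round(v, 3) for v in vals])  # decreases towards 1
```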

There’s of course a version of dominated convergence for series that we’ll state in the following form:
Corollary 1.7.7. Let ( f n ) ⊂ L 1 (E) be such that
XZ
| f n | < +∞.
n E
P
Then the series n f n converges a.e. to an L 1 (E) function and the series can be integrated term-wise,
that is Z X XZ
fn = fn.
E n n E

Proof. Exercise. 

1.8. Continuous and differentiable dependence on parameters

In some applications the problem arises of studying the regularity of an integral with respect to some parameter $\lambda \in \Lambda$. This means that we want to discuss how the integral
$$I(\lambda) := \int_E f(x,\lambda)\,dx$$
depends on $\lambda$. Here $\lambda$ could be fairly general, but for simplicity we'll consider just a real parameter (this is enough for lots of applications and, in particular, for those that we'll see in this course). Two particular types of regularity are important: continuity and differentiability. In both cases dominated convergence reveals its fundamental role. Let's start with continuity.

Theorem 1.8.1. Let $f : E \times \Lambda \longrightarrow \mathbb{R}$, where $\Lambda \subset \mathbb{R}$. Assume that
i) $f(\cdot,\lambda) \in L^1(E)$ for every $\lambda \in \Lambda$;
ii) $f(x,\cdot) \in C(\Lambda)$ for a.e. $x \in E$;
iii) there exists $g \in L^1(E)$ such that $|f(x,\lambda)| \leqslant g(x)$ for every $\lambda \in \Lambda$ and a.e. $x \in E$.
Then $I(\lambda) := \int_E f(x,\lambda)\,dx \in C(\Lambda)$.

Proof. To prove the continuity we have to check that
$$\forall (\lambda_n) \subset \Lambda : \lambda_n \longrightarrow \lambda_*, \quad \Longrightarrow \quad I(\lambda_n) \longrightarrow I(\lambda_*).$$
Now,
$$I(\lambda_n) = \int_E f(x,\lambda_n)\,dx =: \int_E f_n(x)\,dx, \quad \text{where } f_n(x) := f(x,\lambda_n).$$
The idea is just to apply dominated convergence to $(f_n)$. We have:
• $f_n(x) = f(x,\lambda_n) \longrightarrow f(x,\lambda_*)$ for a.e. $x \in E$, because of assumption ii);
• $|f_n(x)| = |f(x,\lambda_n)| \leqslant g(x)$ for a.e. $x \in E$.
The assumptions of the Lebesgue theorem are therefore fulfilled and we deduce that
$$\lim_n I(\lambda_n) = \lim_n \int_E f_n = \int_E \lim_n f_n = \int_E \lim_n f(x,\lambda_n)\,dx = \int_E f(x,\lambda_*)\,dx = I(\lambda_*). \quad \square$$

A similar argument works for the derivative with respect to the parameter $\lambda \in \Lambda$.

Theorem 1.8.2. Let $f : E \times \Lambda \longrightarrow \mathbb{R}$, where $\Lambda \subset \mathbb{R}$. Assume that
i) $f(\cdot,\lambda) \in L^1(E)$ for every $\lambda \in \Lambda$;
ii) $\exists \partial_\lambda f(x,\lambda)$ for all $\lambda \in \Lambda$ and a.e. $x \in E$;
iii) there exists $g \in L^1(E)$ such that $|\partial_\lambda f(x,\lambda)| \leqslant g(x)$ for every $\lambda \in \Lambda$ and a.e. $x \in E$.
Then
$$\exists \partial_\lambda I(\lambda) = \int_E \partial_\lambda f(x,\lambda)\,dx, \quad \forall \lambda \in \Lambda.$$

Proof. Let $(h_n) \subset \mathbb{R}\backslash\{0\}$, $h_n \longrightarrow 0$, and notice that
$$\frac{I(\lambda+h_n) - I(\lambda)}{h_n} = \int_E \frac{f(x,\lambda+h_n) - f(x,\lambda)}{h_n}\,dx =: \int_E f_n(x)\,dx.$$
Now, by ii) it follows that
$$f_n(x) = \frac{f(x,\lambda+h_n) - f(x,\lambda)}{h_n} \longrightarrow \partial_\lambda f(x,\lambda), \quad \text{a.e. } x \in E.$$
The delicate part is to find an integrable dominant for $f_n$. To this aim, first notice that by the Lagrange mean value theorem and iii),
$$|f(x,\lambda+h_n) - f(x,\lambda)| \leqslant \sup_{\eta\in[\lambda,\lambda+h_n]} |\partial_\lambda f(x,\eta)|\,|h_n| \leqslant g(x)|h_n|, \ \text{a.e. } x \in E, \quad \Longrightarrow \quad |f_n(x)| \leqslant g(x), \ \text{a.e. } x \in E.$$
Therefore, by dominated convergence,
$$\frac{I(\lambda+h_n) - I(\lambda)}{h_n} = \int_E f_n(x)\,dx \longrightarrow \int_E \partial_\lambda f(x,\lambda)\,dx.$$
Being $(h_n) \subset \mathbb{R}\backslash\{0\}$ arbitrary, the conclusion follows. □
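As a quick numerical illustration (our example, not from the text), take $f(x,\lambda) = e^{-\lambda x}$ on $E = [0,+\infty[$, so that $I(\lambda) = 1/\lambda$: the integral of $\partial_\lambda f$ matches a difference quotient of $I$.

```python
import math

def integral(g, a, b, N=200_000):
    """Midpoint Riemann sum of g on [a, b]."""
    h = (b - a) / N
    return h * sum(g(a + (k + 0.5) * h) for k in range(N))

lam = 2.0
I = lambda l: integral(lambda x: math.exp(-l * x), 0.0, 30.0)        # I(lam) = 1/lam
dI_theorem = integral(lambda x: -x * math.exp(-lam * x), 0.0, 30.0)  # equals -1/lam^2

h = 1e-5
dI_quotient = (I(lam + h) - I(lam - h)) / (2 * h)
print(dI_theorem, dI_quotient)  # both ≈ -0.25
```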
Let’s see a beautiful application of this result that will be important for the future:
Example 1.8.3 (Fourier transform of the gaussian). Let
Z 2
− x
I (ξ) := e−i2πξ}x dx, ξ ∈ R.
e| 2σ 2{z
R
f (x,ξ )

Show that I 0 (ξ) is well defined for any ξ ∈ R. Deduce a differential equation for I, solve it and that
I (ξ) = 2πσ 2 e−πσ ξ , ∀λ ∈ R.
p 2 2
(1.8.1)

Sol. — The integral is well defined because


Z 2
Z 2
− x − x
p
e 2σ 2 e−i2πξ x dx = e 2σ 2 dx = 2πσ 2 < +∞.
R R
20

In other words, f (·, ξ) ∈ L 1 for any ξ ∈ R. To compute I 0 let’s apply the derivation under the integral thm. If we
can do this, Z ! Z
2 2
− x − x
I 0 (ξ) = ∂ξ e−i2πξ x e 2σ 2 dx = (−i2πx)e−i2πξ x e 2σ 2 dx.
R R| {z }
∂ξ f (x,ξ )
Of course ∂ξ f (x, ξ) exists. So we need only an integrable bound for ∂ξ f (x, ξ) independent by ξ:
x2 x2
− −
|∂ξ f (x, ξ)| = (−i2πx)e−i f r m−eπξ x e 2σ 2 = 2π|x|e 2σ 2 =: g(x) ∈ L 1 .

Therefore the derivation thm applies and


Z 2
! Z 2
!
− x − x
I 0 (ξ) = −i2π e−i2πξ x xe 2σ 2 dx = i2πσ 2 e−i2πξ x ∂x e 2σ 2 dx =
R R
" 2
# x=+∞ Z !
x x2
−i2πξ x − 2σ 2 −
 
= i2πσ 2
e e − ∂x e −i2πξ x
e 2σ 2 dx
x=−∞ R
Z Z
x2 x2
− −
= i2πσ 2 i2πξe−i2πξ x e 2σ 2 dx = −2πσ 2 ξ e−i2πξ x e 2σ 2 dx = −2πσ 2 ξ I (ξ).
R R
Therefore
I 0 (ξ) = −2πσ 2 ξ I (ξ), =⇒ I (ξ) = e−πσ ξ I (0).
2 2

x2 √ √
dx = 2πσ 2 we have finally I (ξ) = 2πσ 2 e−πσ ξ .

And because I (0) =
R 2 2
R
e 2σ 2 
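A direct numerical check, independent of the derivation above (the discretization parameters are our choices): for this convention of the transform the known closed form is $\sqrt{2\pi\sigma^2}\,e^{-2\pi^2\sigma^2\xi^2}$, and a Riemann sum of the real part reproduces it (the imaginary part vanishes since the Gaussian is even).

```python
import math

sigma, xi = 1.0, 0.3
N = 200_000
a, b = -20.0, 20.0
h = (b - a) / N

# Real part of the integral of exp(-x^2/(2 sigma^2)) exp(-i 2 pi xi x).
I_num = h * sum(math.exp(-(a + (k + 0.5) * h) ** 2 / (2 * sigma ** 2))
                * math.cos(2 * math.pi * xi * (a + (k + 0.5) * h))
                for k in range(N))

I_closed = math.sqrt(2 * math.pi) * sigma * math.exp(-2 * math.pi ** 2 * sigma ** 2 * xi ** 2)
print(I_num, I_closed)  # both ≈ 0.424
```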

1.9. Exercises
Exercise 1.9.1 (Cantor set). Define
$$C_0 := [0,1],$$
$$C_1 := \left[0,\tfrac13\right] \cup \left[\tfrac23,1\right] = C_0 \backslash \left]\tfrac13,\tfrac23\right[,$$
$$C_2 := \left[0,\tfrac19\right] \cup \left[\tfrac29,\tfrac39\right] \cup \left[\tfrac69,\tfrac79\right] \cup \left[\tfrac89,1\right] = C_1 \backslash \left( \left]\tfrac19,\tfrac29\right[ \cup \left]\tfrac79,\tfrac89\right[ \right),$$
$$\vdots$$
$$C_n := \left[0,\tfrac1{3^n}\right] \cup \left[\tfrac2{3^n},\tfrac3{3^n}\right] \cup \ldots \cup \left[\tfrac{3^n-1}{3^n},1\right] = C_{n-1} \backslash \bigcup_{k=0}^{3^{n-1}-1} \left] \frac{3k+1}{3^n}, \frac{3k+2}{3^n} \right[, \quad n \in \mathbb{N}.$$
Let now $C := \bigcap_{n\geqslant 0} C_n$. Then $C \in \mathcal{M}_1$ and $\lambda_1(C) = 0$.
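A small computational sketch (our code, not required by the exercise) makes the claim plausible: each step multiplies the total length by $2/3$, so $\lambda_1(C_n) = (2/3)^n \longrightarrow 0$, and one concludes by continuity from above.

```python
def cantor_step(intervals):
    """Remove the open middle third of each closed interval."""
    out = []
    for a, b in intervals:
        t = (b - a) / 3.0
        out += [(a, a + t), (b - t, b)]
    return out

intervals = [(0.0, 1.0)]
measures = []
for n in range(16):
    measures.append(sum(b - a for a, b in intervals))
    intervals = cantor_step(intervals)

print(measures[:4])  # 1, 2/3, 4/9, 8/27, ... = (2/3)^n
```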

Figure 2. Cantor set (left), Sierpinski carpet (right).



Exercise 1.9.2 (Sierpinski carpet). The Sierpinski carpet is a two-dimensional set of Cantor type. Let $T_0 := [0,1] \times [0,1]$ and define recursively $T_n$ according to the following rule:
$$T_n := T_{n-1} \backslash \bigcup_{i,j=0}^{3^{n-1}-1} \left] \frac{3i+1}{3^n}, \frac{3i+2}{3^n} \right[ \times \left] \frac{3j+1}{3^n}, \frac{3j+2}{3^n} \right[, \quad n \in \mathbb{N}.$$
Define finally $T := \bigcap_{n\geqslant 0} T_n$. Show that $T$ is measurable and determine its measure.
Exercise 1.9.3 (?). Show that if λ 1 (N ) = 0 then λ 1 (N 2 ) = 0 where N 2 = {x 2 : x ∈ N }.
Exercise 1.9.4. Let $S \subset \mathbb{R}^2$ be the set of points $(x,y) \in \mathbb{R}^2$ such that $x$ and $y$ are rationally dependent, that is,
$$\exists (m,n) \in \mathbb{N}^2 \backslash \{(0,0)\} : mx + ny = 0.$$
Prove that $\lambda_2(S) = 0$.
Exercise 1.9.5. Let f ∈ L(Rd ): show that in particular { f = +∞} and { f = −∞} are measurable.
Exercise 1.9.6. Let f : Rd −→ R. Prove that if f ∈ L(Rd ) then { f = a} ∈ Md for all a ∈ R. Does the vice versa
hold true? Prove or find a counterexample.
Exercise 1.9.7. Show that each of the following conditions is equivalent to the measurability of $f$:
i) $\{f \geqslant a\} \in \mathcal{M}_d$ for all $a \in \mathbb{R}$;
ii) $\{f \in I\} \in \mathcal{M}_d$ for all intervals $I \subset \mathbb{R}$.
Exercise 1.9.8. Show that f ∈ L(Rd ) iff { f > a} ∈ Md for every a ∈ Q.
Exercise 1.9.9. Any monotone function f : R −→ R belongs to L(R).
Exercise 1.9.10. Let g ∈ C (Rd ) be such that g = 0 a.e.. Deduce that, necessarily, g ≡ 0 that is g(x) = 0 for every
x ∈ Rd .
Exercise 1.9.11 (?). We want to prove that $f + g$ is measurable. Work out the details by following this trick:
$$\{f + g > a\} = \{f > a - g\} \overset{?}{=} \bigcup_{q\in\mathbb{Q}} \left( \{f > q\} \cap \{g > a - q\} \right).$$

Exercise 1.9.12. Adapt the idea used in the previous exercise to discuss the measurability of f · g if f , g ∈ L(Rd ).
Exercise 1.9.13. Show that if $\varphi \in C(\mathbb{R})$ and $f \in L(\mathbb{R}^d)$ then $\varphi \circ f \in L(\mathbb{R}^d)$. In particular: if $f \in L(\mathbb{R}^d)$ then $|f|^p \in L(\mathbb{R}^d)$. Is it true that if $|f| \in L(\mathbb{R}^d)$ then $f \in L(\mathbb{R}^d)$? And if $f^2 \in L(\mathbb{R}^d)$?
Exercise 1.9.14. Let f : E ⊂ Rd −→ R. Show that f χ E ∈ L(Rd ) iff E ∈ Md and { f > a} ∈ Md for every a ∈ R.
Exercise 1.9.15 (?). Show that if $(f_n) \subset L(\mathbb{R}^d)$ then $M(x) := \sup_n f_n(x)$ and $m(x) := \inf_n f_n(x)$ are measurable.
Exercise 1.9.16. In this problem we assume that the value of the integral $I := \int_{\mathbb{R}} e^{-t^2}\,dt$ is not known, and we compute it. By using in a suitable way the Tonelli theorem, prove that the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$ given by $f(x,y) = y\,e^{-(1+x^2)y^2}$ is in $L^1(\mathbb{R}^2)$, and by using Fubini compute its integral on $[0,+\infty[^2$. Deduce the value of $I$. Justify everything with care.


Exercise 1.9.17. Justifying carefully all the steps, compute
$$\int_0^{+\infty} \left( \int_0^{2\pi} \frac{y}{x}\,e^{-y/x} \sin x\,dx \right) dy.$$

Exercise 1.9.18. Discuss whether $f(x,y) = \frac{1}{(1-xy)^\alpha} \in L^1([0,1]^2)$ as $\alpha > 0$ varies.

Exercise 1.9.19 (?). Let $E_{p,q} := \{(x,y) \in \mathbb{R}^2 : |x|^p + |y|^q \leqslant 1\}$, where $p,q > 0$. Show that
$$\int_{E_{p,q}} \frac{1}{|x|^p + |y|^q}\,dxdy < +\infty \quad \Longleftrightarrow \quad \frac1p + \frac1q > 1.$$
Hint: use polar coordinates.
Exercise 1.9.20 (?). Let $B_d := \{(x_1,\ldots,x_d) \in \mathbb{R}^d : x_1^2 + \cdots + x_d^2 \leqslant 1\}$ be the unit ball of $\mathbb{R}^d$. The goal is to compute the $d$-dimensional measure of $B_d$, that is, $\lambda_d(B_d)$.
i) By suitable use of the reduction formula, show that
$$\lambda_d(B_d) = \frac{2\pi}{d}\,\lambda_{d-2}(B_{d-2}).$$
ii) Use i) to determine $\lambda_d(B_d)$ as a function of $d$.
iii) What is $\lambda_d(B_d^r)$, where $B_d^r := \{(x_1,\ldots,x_d) \in \mathbb{R}^d : x_1^2 + \cdots + x_d^2 \leqslant r^2\}$, $r > 0$?
Exercise 1.9.21. Compute
$$\text{i)}\ \lim_{n\to+\infty} \int_n^{+\infty} e^{-n(x-n)}\,\frac{x}{1+x^2}\,dx. \qquad \text{ii)}\ \lim_{n\to+\infty} \int_0^{+\infty} n\left(1+\frac{x}{n}\right)^{-n} \sin\frac{x}{n}\,dx. \qquad \text{iii)}\ \lim_{n\to+\infty} \int_0^{+\infty} \frac{n \sin\frac{x}{n}}{x(1+x^2)}\,dx.$$

Exercise 1.9.22. For which $n \in \mathbb{N}$ do we have $f_n(x) := \frac{1+nx^2}{(1+x^2)^n} \in L^1([0,+\infty[)$? Compute
$$\lim_{n\to+\infty} \int_0^{+\infty} f_n(x)\,dx.$$
(Hint: $f_n \leqslant f_2$ ....)
Exercise 1.9.23. Let
$$I(x) := \int_0^{+\infty} e^{-y}\,\frac{\sin(xy)}{y}\,dy.$$
Show that $I$ is well defined for any $x \in \mathbb{R}$, compute $I'$ and determine $I$.

Exercise 1.9.24. Let
$$I(x) := \int_0^{+\infty} \frac{e^{-xt} - e^{-t}}{t}\,dt.$$
Show that $I(x)$ is well defined for any $x > 0$ and differentiable, compute $I'$, hence deduce $I$.

Exercise 1.9.25. Define
$$I(\lambda) := \int_0^1 \frac{x^\lambda - 1}{\log x}\,dx.$$
Show that $I$ is well defined for any $\lambda > 0$. By applying differentiation under the integral sign, show that $\exists \partial_\lambda I$. Use this to find $I$.

Exercise 1.9.26. Define
$$I(\lambda) := \int_0^{\pi/2} \frac{\arctan(\lambda \tan x)}{\tan x}\,dx.$$
Show that $I$ is well defined for any $\lambda > 0$. By applying differentiation under the integral sign, show that $\exists \partial_\lambda I$. Use this to find $I$. In particular,
$$\int_0^{\pi/2} x \cot x\,dx = \ldots$$
CHAPTER 2

Abstract Measure and Integral

The basic idea behind the Lebesgue measure and integral can be extended to more general settings. A measure $\mu$ is a set function assigning a positive number $\mu(E)$ to a set $E \subset X$ according to certain rules. While certain properties of the Lebesgue measure depend on the underlying space $\mathbb{R}^d$ (such as translation invariance or the measure of intervals), others can serve as reference properties for more general settings: in particular, countable additivity,
$$\mu\left( \bigsqcup_n E_n \right) = \sum_n \mu(E_n),$$
and $\mu(\emptyset) = 0$. This abstract setting allows many applications and, remarkably, it leads to a general framework for Probability Theory.
The scope of this Chapter is to introduce the general concepts of measure and integral. Because some of the proofs extend straightforwardly from those for the Lebesgue measure, while others demand a quite different and more general approach, we will omit almost all the proofs in this Chapter.

2.1. Concept of Measure


As for the Lebesgue measure, an abstract measure (or simply a measure) is defined on a family of
sets:
Definition 2.1.1. Let $X$ be a set. A family $\mathcal{F} \subset \mathcal{P}(X)$ is called a σ-algebra if
i) $\emptyset, X \in \mathcal{F}$;
ii) if $E \in \mathcal{F}$ then $E^c \in \mathcal{F}$;
iii) if $(E_n) \subset \mathcal{F}$ then $\bigcup_n E_n \in \mathcal{F}$.

Of course the Lebesgue class is a σ-algebra.

Example 2.1.2. $\mathcal{F} := \{\emptyset, X\}$ and $\mathcal{F} := \mathcal{P}(X)$ are both σ-algebras (trivial). □

Example 2.1.3. If $X = \mathbb{R}^d$, then $\mathcal{F} := \{O \subset \mathbb{R}^d : O \text{ open}\}$ is not a σ-algebra. In fact, if $O$ is open, in general $O^c$ is not open (this happens only for $O = \emptyset, \mathbb{R}^d$, by the way). □

In fact, if you think about it a little, apart from very particular examples, it is in general difficult to construct non-trivial examples of σ-algebras. We will return to this problem below.
Definition 2.1.4. Let $X$ be endowed with a σ-algebra $\mathcal{F} \subset \mathcal{P}(X)$. A function $\mu : \mathcal{F} \longrightarrow [0,+\infty]$ is called a measure on $(X,\mathcal{F})$ if
i) $\mu(\emptyset) = 0$;
ii) $\mu\left( \bigsqcup_n E_n \right) = \sum_n \mu(E_n)$, for every $(E_n) \subset \mathcal{F}$ with $E_i \cap E_j = \emptyset$ as $i \neq j$.

The triplet $(X,\mathcal{F},\mu)$ is called a measure space. A measure is said to be finite if $\mu(X) < +\infty$. In particular, if $\mu(X) = 1$, $\mu$ is called a probability measure; in this case $(X,\mathcal{F},\mu)$ is called a probability space.
Of course $(\mathbb{R}^d, \mathcal{M}_d, \lambda_d)$ is a measure space. Let's see some other interesting examples.

Example 2.1.5. Let $X = \{x_n : n \in \mathbb{N}\}$ be a countable set, $\mathcal{F} := \mathcal{P}(X)$ and $\mu : \mathcal{F} \longrightarrow [0,+\infty]$ defined as
$$\mu(E) := \sum_{x\in E} 1,$$
with $\mu(\emptyset) := 0$ by definition. Then $\mu$ is a measure, called the counting measure. The details are left as an exercise. □

Example 2.1.6. Let $X = \mathbb{R}^d$, $\mathcal{F} := \mathcal{M}_d$. If $f \in L(\mathbb{R}^d)$ is such that $f \geqslant 0$, then, setting
$$\mu(E) := \int_E f,$$
$\mu$ is a measure on $(\mathbb{R}^d, \mathcal{M}_d)$, denoted by writing $d\mu = f(x)\,dx$.

Sol. — We have to check the countable additivity. If $E = \bigsqcup_n E_n$ then
$$\mu(E) = \int_E f = \int_{\mathbb{R}^d} f \chi_E.$$
Now notice that, the $E_n$ being disjoint, we can write $\chi_E = \sum_n \chi_{E_n}$: for any $x$ this sum just reduces to a sum of infinitely many zeroes and (possibly) one 1 (if $x \in E_n$ for some $n$). Therefore, by monotone convergence for series (Corollary 1.7.4),
$$\mu(E) = \int_{\mathbb{R}^d} f \sum_n \chi_{E_n} = \int_{\mathbb{R}^d} \sum_n f \chi_{E_n} = \sum_n \int_{\mathbb{R}^d} f \chi_{E_n} = \sum_n \mu(E_n). \quad \square$$

Every measure fulfills certain basic properties, which we summarize in the next Proposition, whose proof is left to the reader as an exercise:

Proposition 2.1.7. Let $\mu$ be a measure on $(X,\mathcal{F})$. Then
i) $\mu$ is additive: $\mu(E \sqcup F) = \mu(E) + \mu(F)$, $\forall E,F \in \mathcal{F}$, $E \cap F = \emptyset$.
ii) $\mu$ is countably sub-additive: $\mu\left(\bigcup_n E_n\right) \leqslant \sum_n \mu(E_n)$, $\forall (E_n) \subset \mathcal{F}$.
iii) $\mu$ is monotone: if $E \subset F$, $E,F \in \mathcal{F}$, then $\mu(E) \leqslant \mu(F)$.
iv) $\mu$ is continuous from below: if $E_n \nearrow E$ (that is, $E_n \subset E_{n+1}$ for every $n$, $E = \bigcup_n E_n$), then $\mu(E) = \lim_n \mu(E_n)$.
v) if $\mu$ is also finite, $\mu$ is continuous from above: if $E_n \searrow E$ (that is, $E_n \supset E_{n+1}$ for every $n$, $E = \bigcap_n E_n$), then $\mu(E) = \lim_n \mu(E_n)$.
vi) if $\mu$ is also finite, $\mu$ is subtractive: if $F \subset E$, $E,F \in \mathcal{F}$, then $\mu(E \backslash F) = \mu(E) - \mu(F)$.

Proof. Exercise. □
Actually, continuity from below (as well as continuity from above for finite measures) is equivalent to countable additivity:

Proposition 2.1.8. Let $\mu$ be additive on $(X,\mathcal{F})$. Then the following are equivalent:
i) $\mu$ is countably additive;
ii) $\mu$ is continuous from below;
iii) if $\mu$ is finite, $\mu$ is continuous from above at $\emptyset$, that is, if $E_n \searrow \emptyset$ then $\mu(E_n) \longrightarrow 0$.

Proof. Let's check just iii) $\Longrightarrow$ i). Take $\bigsqcup_n A_n =: E$ and set $E_n := \bigsqcup_{j=1}^n A_j \nearrow E$. Therefore $E \backslash E_n \searrow \emptyset$, hence by assumption $\mu(E \backslash E_n) \longrightarrow 0$. But $\mu$ is additive, therefore $\mu(E) = \mu(E \backslash E_n) + \mu(E_n)$, that is, $\mu(E_n) \longrightarrow \mu(E)$ and, again because $\mu$ is additive, $\mu(E_n) = \sum_{j=1}^n \mu(A_j)$. By this the conclusion follows. □

2.2. How to define a measure


The elegant definitions of σ-algebra and measure given in the previous section hide the complexity behind them. In fact, normally a measure is assigned, according to some criterion, on a class of sets which is not, in general, a σ-algebra. Think of the case of the Lebesgue measure: it was first assigned on intervals, but the family of intervals is not a σ-algebra. Two questions then arise: first, how is it possible to construct a σ-algebra of sets "tailored" on a given family of sets $\mathcal{S}$? Second, how is it possible to extend a measure assigned on $\mathcal{S}$ to a larger σ-algebra $\mathcal{F}$?
To answer the first question we notice that, in principle, there are of course many σ-algebras containing a given family $\mathcal{S}$ of subsets of $X$ (for sure at least $\mathcal{P}(X)$ is a σ-algebra containing $\mathcal{S}$). However, as the case of the Lebesgue measure teaches, the bigger the σ-algebra, the harder the task of coherently defining a measure on it. This suggests looking for the "smallest" σ-algebra containing $\mathcal{S}$:

Definition 2.2.1. Given a family $\mathcal{S} \subset \mathcal{P}(X)$ we call σ-algebra generated by $\mathcal{S}$ the family
$$\sigma(\mathcal{S}) := \bigcap_{\mathcal{F}\supset\mathcal{S},\ \mathcal{F}\ \sigma\text{-alg}} \mathcal{F}.$$

It is evident that $\sigma(\mathcal{S})$ is non-empty (every $\mathcal{F}$ contains $\emptyset$ and $X$). With some straightforward work one can prove that

Proposition 2.2.2. $\sigma(\mathcal{S})$ is a σ-algebra, and it is the smallest one containing $\mathcal{S}$.

Proof. Exercise. □

Example 2.2.3 (Borel σ-algebra). A very important example is $\mathcal{B}(\mathbb{R}^d) := \sigma(\tau_{\mathbb{R}^d})$, where $\tau_{\mathbb{R}^d}$ is the set of all open sets of $\mathbb{R}^d$. This is called the Borel σ-algebra and its elements are called Borel sets or borelians. Notice that the Lebesgue σ-algebra $\mathcal{M}_d$ contains the open sets by definition, hence $\mathcal{M}_d \supset \mathcal{B}(\mathbb{R}^d)$. A subtle question is: are they the same? The answer is no, but it is not easy to prove this. □

Now, let’s imagine that µ : S −→ [0, +∞] be a set function on S . We wonder under which conditions
µ can be extended to σ(S ) as a measure. Extended means that we can define
µ : σ(S ) −→ [0, +∞], : H
H µ measure and H
µ (E) = µ(E), ∀E ∈ S .
Of course, this will force µ to some minimum requirement. For instance:
• if ∅ ∈ S , then µ(∅) = 0;
• if n En ∈ S with En ∈ S for every n, then
F

G X
µ* En + = µ(En ).
, n - n
26

Notice that this second property doesn’t mean at all that µ is already a measure! In fact, S won’t be
in general closed under countable unions. Thinking to the Lebesgue measure we’re just saying that if a
rectangle is disjoint (eventually countable) union of rectangles then the total area is the sum of all the
areas of the components. We say that µ is a pre-measure.
Similarly to the case of the Lebesgue measure, we can define the µ-outer measure by setting
$$\mu^*(E) := \inf \left\{ \sum_n \mu(E_n) : E \subset \bigcup_n E_n, \ E_n \in \mathcal{S} \right\}, \quad \forall E \subset X.$$
In general, as for the outer Lebesgue measure, we cannot expect that $\mu^*$ will be a measure. It is not difficult to prove that

Proposition 2.2.4. Let $\mu$ be a pre-measure on $\mathcal{S}$. The outer measure $\mu^*$ coincides with $\mu$ on $\mathcal{S}$, and it is sub-additive and monotone.

Proof. Exercise; mimic the proof of Prop. 1.1.3. □

To make $\mu^*$ a true countably additive measure we need, in general, to restrict the class of measurable sets. In the case of the Lebesgue measure we used, as criterion, the requirement that open sets be measurable. This is because, in the context of the Lebesgue measure, these are important and common sets. In the present setting, however, $X$ is just a generic set, therefore we should look for a different criterion. The right general condition was found by Carathéodory, a Greek mathematician who lived across the 19th and 20th centuries. The idea is not difficult: if $\mu^*$ were additive, then
$$\mu^*(E) = \mu^*(E \cap F) + \mu^*(E \cap F^c), \quad \forall E, F.$$
The class of sets $F$ cutting (in the sense of $\mu^*$) every other set $E$ into two exact parts is called the Carathéodory class. It is formally defined by
$$\mathcal{F} := \left\{ F \subset X : \mu^*(E) = \mu^*(E \cap F) + \mu^*(E \cap F^c), \ \forall E \subset X \right\}.$$
For instance, it is possible to prove (a bit technical) that

Proposition 2.2.5. Let $\mu$ be a pre-measure on $\mathcal{S}$. Then the Carathéodory class coincides with the Lebesgue class in the case where $\mathcal{S}$ is the family of all rectangles and $\mu$ the area of rectangles.

It turns out that

Theorem 2.2.6 (Carathéodory). The Carathéodory class $\mathcal{F}$ is a σ-algebra of sets containing $\mathcal{S}$. The outer measure $\mu^*$ restricted to $\mathcal{F}$ is a measure extending $\mu$ on $\mathcal{S}$.

Because $\mathcal{F}$ is a σ-algebra containing $\mathcal{S}$, it also contains $\sigma(\mathcal{S})$. Therefore the Carathéodory extension $\widehat{\mu}$ of $\mu$ is, in particular, a measure on $\sigma(\mathcal{S})$. In general $\sigma(\mathcal{S}) \subsetneq \mathcal{F}$ (this is a subtle point: for example, if $\mathcal{S}$ is the family of rectangles, $\sigma(\mathcal{S})$ basically coincides with the Borel σ-algebra $\mathcal{B}(\mathbb{R}^d)$ which, as we noted a few remarks above, is strictly smaller than the Lebesgue class).

2.2.1. A non-trivial example: the Sequence Space. We want to show how the general ideas presented in this Section apply to a non-trivial setting. The example is borrowed from Probability Theory, which offers "non-geometric" settings in which Measure Theory provides a solid framework to define complex quantities precisely. This approach has remarkable applications also, for instance, in Quantum Physics, Information Theory, etc.
Let $S$ be a finite set, representing the possible outcomes of a certain "experiment". For instance: for tossing a coin, $S := \{H,T\}$ ($H$ stands for head, $T$ for tail); for rolling a die, $S := \{1,2,3,4,5,6\}$, and so on. We assume that to every state $s \in S$ is assigned a probability $p_s \in ]0,1[$ in such a way that $\sum_{s\in S} p_s = 1$.
Let's imagine now that the experiment is repeated indefinitely. A sequence $(s_n) \subset S$ of outcomes tells us that $s_0$ is the outcome at $t = 0$, $s_1$ the one at $t = 1$, and so on. The space of all the possible stories is therefore the set
$$X := \{(s_n) \subset S\} \equiv S^{\mathbb{N}}.$$
It is clear that $S^{\mathbb{N}}$ has infinitely many elements, precisely as many as $\mathbb{R}$ (Cantor's theorem). This is to say that $X$ is a non-trivial object. A single story, like a single real number, is not particularly significant. Imagine the experiment is coin tossing. A single story is an infinite sequence of H's and T's, say $H,H,T,H,T,T,T,H,T,T,\ldots$. Assuming that there's no "memory" in the outcomes, the probability of a single story should be $\frac12 \cdot \frac12 \cdot \frac12 \cdots = 0$. Single stories, the elements of the sample space $X$, are also called elementary events in Probability. More complex events can be described in the following way: fix $N$ times $n_1 < \ldots < n_N$, and consider all the possible stories $x$ such that at time $n_j$ the outcome belongs to a certain "range" $S_j \subset S$. Formally,
$$C(n_1,\ldots,n_N; S_1,\ldots,S_N) := \{x \in X : x_{n_j} \in S_j, \ j = 1,\ldots,N\}.$$
A set of this type is called a cylinder (the origin of this name is due to the fact that, if we call $x_n$ the "coordinates" of $x = (x_n)$, a certain number of them are constrained while the remaining ones are free). Notice that we can write any cylinder in the form
$$C(1,2,\ldots,M; S_1, S_2, \ldots, S_M).$$
Let's show this in an easy example:
$$C(2; S_2) = \{(x_n) : x_2 \in S_2\} \equiv \{(x_n) : x_1 \in S, \ x_2 \in S_2\} = C(1,2; S, S_2).$$
In particular, any cylinder can be represented in infinitely many ways. A reasonable definition of the probability of a cylinder, coherent with the elementary probabilities $(p_s)$ and the "absence of memory", could then be
$$(2.2.1) \qquad \mathbb{P}\left( C(1,\ldots,M; S_1,\ldots,S_M) \right) = \sum_{(s_1,\ldots,s_M)\in S_1\times\cdots\times S_M} p_{s_1} p_{s_2} \cdots p_{s_M}.$$

We used here the letter $\mathbb{P}$ instead of $\mu$, as is common in the Probabilistic literature. Notice that, in principle, there could be a problem with the previous definition. In fact, the same set of stories can be represented as a cylinder in basically infinitely many ways. This could potentially lead to a bad definition of $\mathbb{P}$ if the corresponding probabilities weren't the same. Let's show how it works in an easy example. Of course
$$C(1; S_1) \equiv C(1,2; S_1, S).$$
Following the previous definition,
$$\mathbb{P}(C(1,2; S_1, S)) = \sum_{(s_1,s_2)\in S_1\times S} p_{s_1} p_{s_2} = \sum_{s_1\in S_1} p_{s_1} \sum_{s_2\in S} p_{s_2} = \sum_{s_1\in S_1} p_{s_1} = \mathbb{P}(C(1; S_1)).$$
Therefore, there's no problem with (2.2.1).
Now, let $\mathcal{S}$ be the family of all the cylinders. It is evident that $\mathcal{S}$ is closed under finite unions. To show the principle, consider
$$C(1; S_1) \cup C(2; S_2) \equiv C(1,2; S_1 \times S) \cup C(1,2; S \times S_2) \equiv C(1,2; S_1 \times S \cup S \times S_2).$$
However, $\mathcal{S}$ is not closed under countable (infinite) unions: $\bigcup_n C(n; \widehat{S})$ is not a cylinder unless $\widehat{S} = \emptyset, S$.
It is however possible to prove that

Theorem 2.2.7. $\mathbb{P}$ is a pre-measure on the class of cylinders. Therefore, according to the Carathéodory theorem, $\mathbb{P}$ has an extension to the σ-algebra generated by the cylinders.

Proof. Being $\mathbb{P}$ finite, invoking Prop. 2.1.8, let $C_n \searrow \emptyset$ be a sequence of cylinders. We have to prove that $\mathbb{P}(C_n) \longrightarrow 0$. The conclusion is evident if some of the $C_n$ is empty because, in this case, the sequence $C_n$ is eventually equal to $\emptyset$, whence $\mathbb{P}(C_n) = 0$ eventually. So let's assume $C_n \neq \emptyset$ for all $n$. We can always represent $C_n$ in the form
$$C_n = C(1,\ldots,M_n; S_1^n, \ldots, S_{M_n}^n).$$
We set also $S_j^n = S$ for every $j > M_n$. Because $C_{n+1} \subset C_n$, we have $S_j^{n+1} \subset S_j^n$ for $j = 1,\ldots,M_n$. Now fix $j$ and look at $(S_j^n)_n$: it's a decreasing sequence of sets; let $\widehat{S}_j := \bigcap_n S_j^n$. Therefore
$$\bigcap_n C_n = \left\{ x \in X : x_j \in \widehat{S}_j, \ j \in \mathbb{N} \right\}.$$
Now, this is empty iff at least one of the $\widehat{S}_j$ is empty. However, because $S$ is finite, $\widehat{S}_j = \emptyset$ iff $S_j^n$ is eventually (in $n$) empty, and this means $C_n = \emptyset$ eventually, which contradicts the assumption. But then $\bigcap_n C_n \neq \emptyset$, contradicting $C_n \searrow \emptyset$. □

2.3. Integral

Let $(X,\mathcal{F},\mu)$ be a measure space. We want to define the integral
$$\int_X f\,d\mu \equiv \int_X f(x)\,d\mu(x).$$
We first introduce the class of measurable functions:

Definition 2.3.1. Let $f : X \longrightarrow [-\infty,+\infty]$. We say that $f \in L(X)$ if $\{f > a\} \in \mathcal{F}$ for every $a \in \mathbb{R}$.

The main properties of measurable functions extend easily. Hence sums, differences, products and ratios (if the denominator is $\neq 0$) of measurable functions are measurable functions. Again, measurability is not affected by a change on a measure-zero set, and a.e. point-wise limits of measurable functions are measurable functions. Particularly important are the simple functions:

Definition 2.3.2. A function of the type $s = \sum_{j=1}^n c_j \chi_{E_j}$, where $E_i \cap E_j = \emptyset$, is called simple.

To define the integral, however, we need to take a different road with respect to the Lebesgue measure case. Indeed, in the previous Chapter we defined
$$(2.3.1) \qquad \int_{\mathbb{R}^d} f\,d\lambda_d = \lambda_{d+1}(\mathrm{Trap}(f)).$$
This natural definition is suggested by the geometry, but it seems too specific to the $\mathbb{R}^d$ case, hence it is not affordable in the general context. However, there's a simple alternative way to define the integral.

Definition 2.3.3. Let $s \in L(X)$ be a simple positive function, $s = \sum_j c_j \chi_{E_j}$. We set
$$\int_X s\,d\mu := \sum_j c_j \mu(E_j),$$
with the agreement that $0 \cdot (+\infty) = 0$.

It is not difficult to check that this definition fulfills certain properties such as linearity and monotonicity.

Definition 2.3.4. Let $f \in L(X)$ be positive. We set
$$\int_X f\,d\mu := \sup_{s\leqslant f,\ s\ \text{simple}} \int_X s\,d\mu.$$

With some work one can prove that this definition also fulfills linearity and monotonicity. For general variable-sign or even complex-valued functions we have the

Definition 2.3.5. Let $f \in L(X)$. We say that $f \in L^1(X)$ if
$$\int_X |f|\,d\mu < +\infty.$$
If $f$ is real valued we set
$$\int_X f\,d\mu := \int_X f^+\,d\mu - \int_X f^-\,d\mu,$$
and if $f$ is complex valued then
$$\int_X f\,d\mu := \int_X \mathrm{Re}\,f\,d\mu + i \int_X \mathrm{Im}\,f\,d\mu.$$

With some not particularly difficult work one easily obtains properties such as linearity, monotonicity, the triangular inequality, decomposition, restriction and null-sets invariance. An important inequality is the

Proposition 2.3.6 (Čebishev). Let $(X,\mathcal{F},\mu)$ be a measure space and $f \in L(X)$ be such that $f \geqslant 0$. Then
$$\mu(f > a) \leqslant \frac1a \int_X f\,d\mu, \quad \forall a > 0.$$

Proof. Indeed,
$$\mu(f > a) = \int_{\{f>a\}} d\mu \leqslant \int_{\{f>a\}} \frac{f}{a}\,d\mu \leqslant \frac1a \int_X f\,d\mu. \quad \square$$
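For a concrete instance of the inequality (the numbers below are chosen arbitrarily by us), take the counting measure on a finite set, where $\mu(E) = \#E$ and $\int_X f\,d\mu = \sum_x f(x)$:

```python
# Cebishev for the counting measure on a finite set: mu(f > a) is the number
# of points where f exceeds a, and the integral of f is the plain sum of values.
f_values = [0.2, 1.5, 3.0, 0.1, 2.2, 0.7, 4.1]
a = 2.0

mu_level_set = sum(1 for v in f_values if v > a)   # mu(f > a) = 3
bound = sum(f_values) / a                          # (1/a) * integral = 11.8 / 2

print(mu_level_set, bound)  # 3 is indeed <= 5.9
assert mu_level_set <= bound
```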

In particular, it follows that

Corollary 2.3.7. If $f \in L(X)$ and $f \geqslant 0$ is such that $\int_X f\,d\mu = 0$, then $f = 0$ a.e.

Proof. Just notice that
$$\{f > 0\} = \bigcup_n \left\{ f > \frac1n \right\}, \quad \text{and} \quad \mu\left( f > \frac1n \right) \overset{\check{C}eb.}{\leqslant} n \int_X f\,d\mu = 0. \quad \square$$

2.4. Limit theorems

Monotone convergence and dominated convergence extend to the general setting of an abstract measure space.

Theorem 2.4.1 (monotone convergence). Let $(X,\mathcal{F},\mu)$ be a measure space and $(f_n) \subset L(X)$ be such that $0 \leqslant f_n \leqslant f_{n+1}$ a.e. for every $n \in \mathbb{N}$. Then
$$\lim_n \int_X f_n\,d\mu = \int_X \lim_n f_n\,d\mu.$$

The proof of the Lebesgue dominated convergence theorem remains unchanged, depending only on the monotone convergence theorem:

Theorem 2.4.2 (Lebesgue). Let $(X,\mathcal{F},\mu)$ be a measure space and $(f_n) \subset L^1(X)$ be such that
i) $(f_n(x))$ converges µ-a.e.;
ii) $\exists g \in L^1(X)$ such that $|f_n| \leqslant g$ a.e. for every $n \in \mathbb{N}$.
Then $\lim_n f_n \in L^1(X)$ and
$$\lim_n \int_X f_n\,d\mu = \int_X \lim_n f_n\,d\mu.$$

Applying the Lebesgue theorem to series we have the

Corollary 2.4.3. Let $(X,\mathcal{F},\mu)$ be a measure space and $(f_n) \subset L^1(X)$ be such that
$$\sum_n \int_X |f_n|\,d\mu < +\infty.$$
Then the series $\sum_n f_n$ converges a.e. to an $L^1(X)$ function and the series can be integrated term-wise, that is,
$$\int_X \sum_n f_n\,d\mu = \sum_n \int_X f_n\,d\mu.$$

Because they are direct consequences of dominated convergence, the results concerning continuity and differentiability of integrals depending on parameters follow straightforwardly.

Theorem 2.4.4. Let $(X,\mathcal{F},\mu)$ be a measure space, $E \in \mathcal{F}$, and $f : E \times \Lambda \longrightarrow \mathbb{R}$, where $\Lambda \subset \mathbb{R}$. Assume that
i) $f(\cdot,\lambda) \in L^1(E)$ for every $\lambda \in \Lambda$;
ii) $f(x,\cdot) \in C(\Lambda)$ for a.e. $x \in E$;
iii) there exists $g \in L^1(E)$ such that $|f(x,\lambda)| \leqslant g(x)$ for every $\lambda \in \Lambda$ and a.e. $x \in E$.
Then $I(\lambda) := \int_E f(x,\lambda)\,d\mu(x)$ is continuous. If, furthermore,
iv) $\exists \partial_\lambda f(x,\lambda)$ for all $\lambda \in \Lambda$ and a.e. $x \in E$;
v) there exists $G \in L^1(E)$ such that $|\partial_\lambda f(x,\lambda)| \leqslant G(x)$ for every $\lambda \in \Lambda$ and a.e. $x \in E$;
then
$$\exists \partial_\lambda I(\lambda) = \int_E \partial_\lambda f(x,\lambda)\,d\mu(x), \quad \forall \lambda \in \Lambda.$$

2.5. Exercises
2 , x ∈ X := R, c > 0. Determine the value of c in such a way that dµ := f dx be
c
Exercise 2.5.1. Let f (x) := 1+x
a probability measure on (R, M1 ).
Exercise 2.5.2. Let X be a generic set, F ⊂ P (X ) be a σ−algebra, x 0 ∈ X and

 0,

 E = x 0,
δ x0 (E) := 
E 3 x0 .

 1,

Show that δ x0 is a measure called Dirac delta centered at x 0 . Let X = Rd , F = Md . Does it exists f ∈ L 1 (Rd )
such that δ x0 = f dx?
Exercise 2.5.3 (?). Let X = [0, 1], E ⊂ X and define
( )
1 1 1 k
µ(E) := lim ]( A ∩ N) = lim ] ∈A : k∈N ,
n→+∞ n n n→+∞ n n
if the limit exists.
i) Show that if E = [a, b] ⊂ [0, 1] then µ(E) = b − a.
ii) Show that µ is additive, that is if A ∩ B = ∅, then µ( A ∪ B) = µ( A) ∪ µ(B).
iii) Take as A the set of dyadic numbers A := { 2km : m ∈ N, k = 0, 1, . . . , 2m }. Find ]( A ∩ n1 N). What can
you conclude on µ( A)? Use the answer to respond to the question: is µ a measure?
Exercise 2.5.4. Let X = {1, 2, 3, 4}, F be the σ−algebra generated by S := {{1}, {1, 2}}. i) List the elements of
F . ii) Characterize F measurable functions and exhibit a non measurable one. iii) Let µ be defined on S as
Rµ({1}) = 3 and µ({1, 2}) = 8. We denote by µ the Caratheodory extension of µ to F . If it makes sense, compute
X
(x − 3)(x − 4) dµ(x).
Exercise 2.5.5 (?). Let (X, F, µ) be a measure space, (En )n∈N ⊂ F . Suppose that
X
µ(En ) < +∞.
n
Show that almost every x ∈ X belongs only to a finite number of sets En .
Exercise 2.5.6. Let (X, F, µ) be a finite measure space (that is a µ(X ) < +∞) and let E ∈ F . Show that
Z
min ( χ E (x) − y) 2 dµ(x)
y ∈R X
exists and find it.
Exercise 2.5.7. Let $\mu$ be a measure on $(X,\mathcal{F})$. Let $T : X \longrightarrow Y$ be a function and define $\mathcal{G} := \{G \subset Y : T^{-1}(G) \in \mathcal{F}\}$.
i) Show that $\mathcal{G}$ is a $\sigma$-algebra and that $\nu(G) := \mu(T^{-1}(G))$, $G \in \mathcal{G}$, defines a measure on $(Y,\mathcal{G})$.
ii) Show that $g : Y \longrightarrow \mathbb{R}$ is $\mathcal{G}$-measurable iff $f := g\circ T$ (that is, $f(x) := g(T(x))$, $x \in X$) is $\mathcal{F}$-measurable.
iii) Show that $g$ is $\nu$-integrable iff $f := g \circ T$ is $\mu$-integrable and, in such case,
$$\int_Y g(y)\,d\nu(y) = \int_X f(T(x))\,d\mu(x).$$

Exercise 2.5.8. Let $(X,\mathcal{F},\mu)$ be a measure space and let $f \in L^1$ be such that
$$\left|\int_E f\,d\mu\right| = \int_E |f|\,d\mu,$$
for some $E \in \mathcal{F}$. Prove that $f$ has constant sign on $E$, $\mu$-a.e.
Exercise 2.5.9 (?). Let $(X,\mathcal{F},\mu)$ be a finite measure space. Show that
$$f \in L^1 \iff \sum_n n\,\mu(n \leq |f| < n+1) < +\infty.$$
What happens if $\mu$ is not finite?
Exercise 2.5.10 (?). Let $f \in L^1(X)$. Show that
$$\forall \varepsilon > 0,\ \exists \delta = \delta(\varepsilon) > 0\ :\ \int_E |f|\,d\mu \leq \varepsilon,\quad \forall E \in \mathcal{F},\ \mu(E) \leq \delta.$$
CHAPTER 3

Basic Banach spaces

Many applied problems are modeled through equations involving a function as unknown: differential equations, integral equations, functional equations, . . . . Other problems consist in optimizing a functional whose variable is a function itself, as in the Calculus of Variations. It is therefore natural to consider spaces of functions as natural frameworks. A natural way to look at these spaces is as normed spaces, that is, vector spaces endowed with a norm measuring the length of each vector. In this way we can consider sequences and their limits, which is the base of Analysis.

3.1. Normed spaces


Let’s start with the
Definition 3.1.1. Let V be a vector space (over R or C). A function k · k : V −→ [0, +∞[ is called norm
on V if
i) (vanishing) k f k = 0 iff f = 0;
ii) (homogeneity) kα f k = |α|k f k for every α ∈ C, f ∈ V ;
iii) (triangular inequality) k f + gk 6 k f k + kgk, for every f , g ∈ V .
Of course $V = \mathbb{R}^d$ with the euclidean norm
$$\|(x_1,\dots,x_d)\| := \sqrt{\sum_{k=1}^d x_k^2}$$
is an example of a normed space (over $\mathbb{R}$), and similarly $V = \mathbb{C}^d$ with the norm
$$\|(z_1,\dots,z_d)\| := \sqrt{\sum_{k=1}^d |z_k|^2}$$
is an example of a normed space (over $\mathbb{R}$ as well as $\mathbb{C}$) with the usual definitions of sum and product by a scalar.
In the next subsections we will introduce some important spaces that are frequently used as working
framework.
3.1.1. The space $B(X)$. Let $X$ be a set and let
$$B(X) := \left\{ f : X \longrightarrow \mathbb{C}\ :\ \|f\|_\infty := \sup_{x\in X} |f(x)| < +\infty \right\}$$
be the set of all bounded functions, endowed with the usual sum of functions and product of a function by a scalar.

Proposition 3.1.2. $(B(X), \|\cdot\|_\infty)$ is a normed space.

Proof. Exercise. □
Let $X \subset \mathbb{R}^d$. An important subset of $B(X)$ is that of continuous functions on $X$:
$$C_b(X) := \{ f \in B(X) : f \in C(X) \}.$$
By definition $C_b(X)$ is a subspace of $B(X)$. If $X$ is in particular compact, thanks to the Weierstrass theorem $C_b(X) \equiv C(X)$ (every continuous function is bounded because it attains a global min/max on $X$).
3.1.2. The space $L^p(X)$ ($1 \leq p < +\infty$). Let $(X,\mathcal{F},\mu)$ be a measure space. The set $L^1(X)$ is naturally a vector space with the usual operations of sum and product by a scalar. Indeed, if $f,g \in L^1(X)$ then $f+g \in L^1(X)$ because
$$\int_X |f+g|\,d\mu \leq \int_X (|f|+|g|)\,d\mu = \int_X |f|\,d\mu + \int_X |g|\,d\mu < +\infty,$$
and similarly $\alpha f \in L^1(X)$ for every $\alpha \in \mathbb{C}$. Modulo an agreement, it turns out that $\int_X |f|\,d\mu$ defines a norm:
Proposition 3.1.3. Let
$$\|f\|_1 := \int_X |f|\,d\mu, \quad f \in L^1(X).$$
Then the function $\|\cdot\|_1$ is a norm on $L^1(X)$ with the vanishing property modified in the following form: $\|f\|_1 = 0$ iff $f = 0$ a.e.

Proof. The unique interesting fact is the vanishing (homogeneity and the triangular inequality are straightforward). It is evident that if $f = 0$ a.e. then, by well known properties of the integral, $\|f\|_1 = \int_X |f|\,d\mu = 0$. Vice versa: if $\|f\|_1 = \int_X |f|\,d\mu = 0$ then, as a consequence of the Čebishev inequality, $|f| = 0$ a.e. □
Remark 3.1.4. There's a formal way to make $L^1(X)$ a normed space, and it consists in introducing an equivalence relation between functions, saying that $f \sim g$ if they differ at most on a null set. One easily shows that this is an equivalence relation. By this we may identify all equivalent functions by taking the quotient $L^1/\!\sim$. Elements of this space are equivalence classes $\{f\}$, and if we put $\|\{f\}\|_1 := \|f\|_1$ one may see that this definition is well posed (it doesn't depend on the specific $f$ in the equivalence class) and now $\|\{f\}\|_1 = 0$ iff $\{f\}$ is the class of functions vanishing a.e. This class turns out to be the unique zero of the quotient $L^1/\!\sim$, and by working out the details one can construct a structure of normed space on $L^1$. The universal agreement is to consider $L^1$ normed by $\|\cdot\|_1$, with the unique care about the vanishing. □
An extension of the previous case is the following: define
$$L^p(X) := \left\{ f \in L(X) : \int_X |f|^p\,d\mu < +\infty \right\}, \quad (1 < p < +\infty).$$
It is more difficult to see that $L^p(X)$ is a vector space. There's no problem to prove that if $f \in L^p(X)$ then $\alpha f \in L^p(X)$ for every $\alpha \in \mathbb{C}$. More difficult is to prove that if $f,g \in L^p(X)$ then $f+g \in L^p(X)$. Basically we need an elementary inequality such as
$$(u+v)^p \leq C_p(u^p + v^p), \quad \forall u,v \geq 0, \text{ for some } C_p > 0.$$

Notice that if for instance $u \geq v > 0$ ($v = 0$ is trivial) the previous can be rewritten as
$$v^p\left(1 + \frac{u}{v}\right)^p \leq C_p(u^p + v^p) \iff (1+t)^p \leq C_p(1+t^p), \quad \forall t \geq 1.$$
In other words it is enough to check that the auxiliary function $\gamma(t) := \frac{(1+t)^p}{1+t^p}$ is bounded on $[1,+\infty[$. But this follows by a soft argument: clearly $\gamma \in C([1,+\infty[)$ and $\gamma(+\infty) = 1$; it easily follows that $\gamma$ is bounded. Applying this we have
$$\int_X |f+g|^p\,d\mu \leq \int_X (|f|+|g|)^p\,d\mu \leq C_p \int_X (|f|^p + |g|^p)\,d\mu < +\infty,$$
if $f,g \in L^p(X)$: hence $L^p(X)$ is a vector space. Much less easy is to prove the
Theorem 3.1.5. Let
$$\|f\|_p := \left(\int_X |f|^p\,d\mu\right)^{1/p}, \quad f \in L^p(X).$$
Then the function $\|\cdot\|_p$ is a norm on $L^p(X)$ with the vanishing modified in the following form: $\|f\|_p = 0$ iff $f = 0$ a.e.

Proof. Vanishing works as in the $L^1$ case and homogeneity is straightforward. The main difficulty is the triangular inequality, which is based on the following fundamental inequality for integrals:
Lemma 3.1.6 (Hölder inequality). Let $f,g \in L(X)$, $f,g \geq 0$. Then
$$(3.1.1)\qquad \int_X fg\,d\mu \leq \left(\int_X f^p\,d\mu\right)^{1/p} \left(\int_X g^q\,d\mu\right)^{1/q}, \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1.$$
$p$ and $q$ are called conjugate exponents.
Proof. Call $\|f\|_p = \left(\int_X f^p\,d\mu\right)^{1/p}$ and similarly $\|g\|_q$. If one of the two is $0$ the inequality is trivial (because $f$ or $g$ or both would be $0$ a.e.). Therefore we can assume that both are positive. In the case one of the two is $+\infty$ the inequality is again trivial. So let's assume that both are positive and finite. Then the conclusion is equivalent to saying that
$$\int_X \frac{f}{\|f\|_p}\,\frac{g}{\|g\|_q}\,d\mu \leq 1.$$
The key of the proof relies on a remarkable numerical inequality called the Young inequality:
$$(3.1.2)\qquad ab \leq \frac{1}{p}a^p + \frac{1}{q}b^q, \quad \forall a,b \geq 0.$$
This inequality is deduced easily from the concavity of the function $\log$, by which it follows that (recall that $p,q > 0$ and $\frac{1}{p} + \frac{1}{q} = 1$)
$$\log\left(\frac{1}{p}a^p + \frac{1}{q}b^q\right) \geq \frac{1}{p}\log a^p + \frac{1}{q}\log b^q = \log a + \log b = \log(ab).$$
Therefore, by the Young inequality,
$$\int_X \frac{f}{\|f\|_p}\,\frac{g}{\|g\|_q}\,d\mu \leq \int_X \left(\frac{1}{p}\frac{f^p}{\|f\|_p^p} + \frac{1}{q}\frac{g^q}{\|g\|_q^q}\right)d\mu = \frac{1}{p}\frac{\|f\|_p^p}{\|f\|_p^p} + \frac{1}{q}\frac{\|g\|_q^q}{\|g\|_q^q} = \frac{1}{p} + \frac{1}{q} = 1. \qquad \square$$

Let's return to the triangular inequality for $\|\cdot\|_p$. Notice that
$$\|f+g\|_p^p = \int_X |f+g|^p\,d\mu = \int_X |f+g|\,|f+g|^{p-1}\,d\mu \leq \int_X |f|\,|f+g|^{p-1}\,d\mu + \int_X |g|\,|f+g|^{p-1}\,d\mu.$$
Applying the Hölder inequality,
$$\int_X |f|\,|f+g|^{p-1}\,d\mu \leq \left(\int_X |f|^p\,d\mu\right)^{1/p}\left(\int_X |f+g|^{(p-1)q}\,d\mu\right)^{1/q},$$
and similarly for the other integral. Now, notice that because $\frac{1}{p} + \frac{1}{q} = 1$ we have $\frac{1}{q} = 1 - \frac{1}{p} = \frac{p-1}{p}$, hence $q = \frac{p}{p-1}$. Therefore $(p-1)q = (p-1)\frac{p}{p-1} = p$. Returning to the inequality we get
$$\|f+g\|_p^p \leq \left[\left(\int_X |f|^p\,d\mu\right)^{1/p} + \left(\int_X |g|^p\,d\mu\right)^{1/p}\right]\left(\int_X |f+g|^p\,d\mu\right)^{1/q} = \left(\|f\|_p + \|g\|_p\right)\|f+g\|_p^{p/q}.$$
We want to divide by $\|f+g\|_p^{p/q}$. If this is $0$ the triangular inequality is trivial. If it is $\neq 0$ then we obtain
$$\|f+g\|_p^{p - p/q} \leq \|f\|_p + \|g\|_p \iff \|f+g\|_p \leq \|f\|_p + \|g\|_p. \qquad \square$$

The triangular inequality is also called the Minkowski inequality.

3.1.3. The space $L^\infty(X)$. On the set $L(X)$ we could introduce the sup norm (that is, choosing from $B(X)$ those functions which are also measurable) and we would obtain another interesting subspace of $B(X)$ (as well as of $C_b(X)$ if $X$ has some topological structure). However this choice doesn't seem particularly appropriate, considering that if we change a measurable function on a measure $0$ set nothing changes with respect to its measurability properties. Imagine that $X = \mathbb{R}$ with the Lebesgue measure: by changing a given function in just one point we would obtain functions identical from the point of view of measure but with different sup norms. So it is natural to look for some extension of the concept of sup such that this quantity doesn't depend on changes which are negligible w.r.t. $\mu$. Basically the idea is simple: we don't care about the values of $|f|$ above a certain threshold if these values are taken on a measure $0$ set.
Definition 3.1.7. Let $f \in L(X)$. We say that $f$ is essentially bounded if
$$\exists M > 0\ :\ \mu(|f| > M) = 0 \quad (\text{or, } |f| \leq M \text{ a.e.}).$$
We will write $f \in L^\infty(X)$.
We leave as an exercise to prove that $L^\infty(X)$ is a vector space. The essential sup is then the smallest of these bounds:

Proposition 3.1.8. Let
$$\|f\|_\infty := \inf\{M : \mu(|f| > M) = 0\}, \quad f \in L^\infty(X).$$
Then the function $\|\cdot\|_\infty$ is a norm on $L^\infty(X)$ with the vanishing modified in the following form: $\|f\|_\infty = 0$ iff $f = 0$ a.e.

Proof. Again, vanishing and homogeneity are left as exercise (do it!). About the triangular inequality, the main remark is that, of course,
$$|f| \leq \|f\|_\infty,\ |g| \leq \|g\|_\infty \text{ a.e.} \implies |f+g| \leq |f| + |g| \leq \|f\|_\infty + \|g\|_\infty \text{ a.e.}$$
This says that $\|f\|_\infty + \|g\|_\infty$ is an a.e. bound for $|f+g|$; since $\|f+g\|_\infty$ is the best (the smallest) such bound, $\|f+g\|_\infty \leq \|f\|_\infty + \|g\|_\infty$. □
3.1.4. Spaces $\ell^p$. These spaces are actually just particular cases of the $L^p(X)$ spaces. Precisely,
$$\ell^p := L^p(\mathbb{N}), \text{ with } \mu \text{ the counting measure}.$$
In these cases we identify a function $f : \mathbb{N} \longrightarrow \mathbb{C}$ with a sequence $(f_n)$ (that is, $f_n := f(n)$). The corresponding $L^p$ norm is
$$\|(f_n)\|_{\ell^p} = \left(\int_{\mathbb{N}} |f(n)|^p\,d\mu\right)^{1/p} = \left(\sum_n |f_n|^p\right)^{1/p}.$$
About $p = +\infty$: because with respect to the counting measure the unique null set is the empty set, we deduce that
$$\|(f_n)\|_{\ell^\infty} = \sup_n |f_n|.$$
The $\ell^p$ spaces are interesting simplified models of their older brothers $L^p$. In many circumstances they can be used as "toy" models to get first answers to many problems posed in the more general setting of $L^p$ spaces.
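Since $\ell^p$ is built on the counting measure, its norms are plain sums (finite, for finitely supported sequences), and the Hölder and Minkowski inequalities proved above can be verified numerically. A minimal sketch, where the sequence length, the seed and the choice $p = 3$ are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
f, g = rng.random(50), rng.random(50)    # nonnegative, finitely supported sequences

def lp_norm(a, p):
    # ||a||_p = (sum_n |a_n|^p)^(1/p): the L^p norm w.r.t. the counting measure
    return float(np.sum(np.abs(a) ** p) ** (1.0 / p))

p = 3.0
q = p / (p - 1.0)                        # conjugate exponent: 1/p + 1/q = 1

holder_lhs = float(np.sum(f * g))        # integral of fg = sum_n f_n g_n
holder_rhs = lp_norm(f, p) * lp_norm(g, q)
mink_lhs = lp_norm(f + g, p)
mink_rhs = lp_norm(f, p) + lp_norm(g, p)

print(holder_lhs <= holder_rhs, mink_lhs <= mink_rhs)
```

Both comparisons print `True`, as the Hölder and Minkowski inequalities guarantee for any choice of the data.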

3.2. Limit of a sequence


The norm plays the same role as the modulus on the real line. In particular we have the

Definition 3.2.1. Let $(V,\|\cdot\|)$ be a normed space. Given $(f_n) \subset V$ we say that
$$f_n \xrightarrow{\|\cdot\|} f \iff \|f_n - f\| \longrightarrow 0.$$
It has to be clear that a sequence converges with respect to a certain norm. This is relevant because usually several norms are defined on the same functional space: it may happen that a certain sequence is convergent with respect to one norm but not with respect to another.
Example 3.2.2. On $V = C([0,1])$ let's consider the following two norms: the "natural" norm $\|\cdot\|_\infty$ (uniform norm) and the $L^1$ norm $\|f\|_1 := \int_0^1 |f(x)|\,dx$. Take
$$f_n(x) := \begin{cases} n - n^3 x, & 0 \leq x \leq \frac{1}{n^2}, \\ 0, & \frac{1}{n^2} \leq x \leq 1. \end{cases}$$
Then $f_n \xrightarrow{\|\cdot\|_1} 0$ but $(f_n)$ is not convergent in the uniform norm.

Sol. — The first is easy to check:
$$\|f_n - 0\|_1 = \int_0^1 |f_n(x)|\,dx = \int_0^{1/n^2} (n - n^3 x)\,dx = \frac{n}{2n^2} = \frac{1}{2n} \longrightarrow 0.$$
About the second: notice first that if $f_n \xrightarrow{\|\cdot\|_\infty} f \in C([0,1])$ then
$$\|f_n - f\|_\infty = \sup_{x\in[0,1]} |f_n(x) - f(x)| \geq |f_n(0) - f(0)| = |n - f(0)| \longrightarrow +\infty,$$
no matter what $f(0)$ is. In particular $\|f_n - f\|_\infty \not\longrightarrow 0$. □


On the most relevant spaces (as B(X ), C (X ), L p (X ) and others) there’s always a standard reference
norm (the norms introduced above). It is, however, interesting to consider the case when several different
norms are defined on the same space. A particularly important property is the following:
Definition 3.2.3. Let $V$ be a vector space on which two norms $\|\cdot\|$ and $\|\cdot\|_*$ are defined. We say that $\|\cdot\|_*$ is stronger than $\|\cdot\|$ if
$$\exists C > 0\ :\ \|f\| \leq C\|f\|_*, \quad \forall f \in V.$$
We say that two norms are equivalent if each one is stronger than the other.
Example 3.2.4. On $V = C([0,1])$ let's consider $\|\cdot\|_\infty$ and $\|\cdot\|_1$. The uniform norm is stronger than the $L^1$ norm.

Sol. — Just notice that, because $|f(x)| \leq \|f\|_\infty$ for every $x \in [0,1]$, we have
$$\|f\|_1 = \int_0^1 |f(x)|\,dx \leq \int_0^1 \|f\|_\infty\,dx = \|f\|_\infty.$$
Notice that the two norms are not equivalent. In fact, if this were true, there would exist $C > 0$ such that
$$\|f\|_\infty \leq C\|f\|_1, \quad \forall f \in C([0,1]).$$
By considering the $f_n$ of the previous example, however, we have $\|f_n\|_1 = \frac{1}{2n}$ and $\|f_n\|_\infty = n$, hence if the previous were true
$$n \leq C\,\frac{1}{2n} \iff 2n^2 \leq C, \quad \forall n,$$
which is clearly impossible. □
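The computations above are easy to reproduce numerically. The following sketch (the value $n = 10$ and the grid size are arbitrary choices) evaluates both norms of the spike function $f_n$ of Example 3.2.2: the $L^1$ norm matches the exact value $\frac{1}{2n}$, while the sup norm equals $n$, so no constant $C$ can bound $\|f_n\|_\infty$ by $C\|f_n\|_1$.

```python
import numpy as np

n = 10
x = np.linspace(0.0, 1.0, 2_000_001)
fn = np.clip(n - n**3 * x, 0.0, None)    # f_n(x) = n - n^3 x on [0, 1/n^2], 0 afterwards

# trapezoidal approximation of ||f_n||_1 (exact on each linear piece)
l1 = float(np.sum((fn[1:] + fn[:-1]) * np.diff(x)) / 2.0)
sup = float(fn.max())                    # ||f_n||_inf, attained at x = 0

print(l1, 1.0 / (2 * n))                 # numeric vs exact value 1/(2n)
print(sup)                               # equals n: small in L^1, not uniformly
```

The ratio $\|f_n\|_\infty / \|f_n\|_1 = 2n^2$ is unbounded, which is precisely why the two norms cannot be equivalent.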
An important fact is the
Proposition 3.2.5. If $\|\cdot\|_*$ is stronger than $\|\cdot\|$, then any sequence converging under $\|\cdot\|_*$ converges under $\|\cdot\|$.

Proof. Evident, exercise. □
Definition 3.2.6. Let $(V,\|\cdot\|)$ be a normed space. A subset $S \subset V$ is said to be closed if it contains the limits of all the convergent sequences of $S$. In symbols:
$$S \text{ closed} \iff \forall (f_n) \subset S\ :\ f_n \xrightarrow{\|\cdot\|} f \implies f \in S.$$
Notice that the whole space $V$ is closed by definition. An interesting example is the following:

Theorem 3.2.7. Let $X \subset \mathbb{R}^d$. Then $C_b(X)$ is a closed linear subspace of $B(X)$.

Proof. Assume that $(f_n) \subset C_b(X)$ is such that $f_n \xrightarrow{\|\cdot\|_\infty} f \in B(X)$. We want to prove that $f \in C_b(X)$, hence in particular that $f$ is continuous. Fix $x_0 \in X$ and consider
$$|f(x) - f(x_0)| \leq |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f_n(x_0) - f(x_0)| \leq 2\|f_n - f\|_\infty + |f_n(x) - f_n(x_0)|.$$
Because $f_n \xrightarrow{\|\cdot\|_\infty} f$ we have by definition that $\|f_n - f\|_\infty \leq \varepsilon$ for every $n \geq N(\varepsilon)$. Taking in particular $n = N(\varepsilon)$,
$$|f(x) - f(x_0)| \leq 2\varepsilon + |f_n(x) - f_n(x_0)|.$$
But $f_n \in C_b(X)$, hence $|f_n(x) - f_n(x_0)| \leq \varepsilon$ for $x \in B(x_0,\delta]$ for some $\delta$: therefore
$$|f(x) - f(x_0)| \leq 3\varepsilon, \quad \forall x \in B(x_0,\delta] \implies f \text{ continuous at } x_0.$$
$x_0$ being any point of $X$, the conclusion follows. □
Example 3.2.8. As you might expect, closedness depends on the norm used. For instance: $C_b([0,1]) \subset L^1([0,1])$ is not closed with respect to the $L^1$ norm.

Sol. — Of course $C_b([0,1]) \subset L^1([0,1])$ because, as we know, continuous functions are Riemann integrable and these are Lebesgue integrable too. It suffices to find a counterexample, that is, a sequence $(f_n) \subset C_b([0,1])$ such that $f_n \xrightarrow{\|\cdot\|_1} f \notin C_b([0,1])$. For instance we can approximate $f = \chi_{[1/2,1]} \notin C_b([0,1])$ (because of the discontinuity) through continuous functions. This can be easily done with the piecewise linear functions
$$f_n(x) = \begin{cases} 0, & 0 \leq x \leq \frac{1}{2} - \frac{1}{n}, \\[3pt] \frac{n}{2}\left(x - \frac{1}{2}\right) + \frac{1}{2}, & \frac{1}{2} - \frac{1}{n} \leq x \leq \frac{1}{2} + \frac{1}{n}, \\[3pt] 1, & \frac{1}{2} + \frac{1}{n} \leq x \leq 1. \end{cases}$$
Clearly $(f_n) \subset C_b([0,1])$ and $\|f_n - f\|_1 = 2 \cdot \frac{1}{2} \cdot \frac{1}{n} \cdot \frac{1}{2} = \frac{1}{2n} \longrightarrow 0$. □

Definition 3.2.9. Let $V$ be a normed space and $S \subset V$. The closure of $S$, denoted by $\overline{S}$, is the set
$$\overline{S} := \left\{ f \in V : \exists (f_n) \subset S,\ f_n \xrightarrow{\|\cdot\|_V} f \right\}.$$
We say that $S$ is dense in $V$ if $\overline{S} = V$.

It easily follows that $S$ is closed iff $S = \overline{S}$.
Usually, sequences are used to approximate solutions of complex problems when the solutions themselves
are not explicitly available. As we will see, a natural way to build sequences of functions (as for numbers,
by the way) is through series. This is the reason we introduce the following natural

Definition 3.2.10. Let $V$ be a normed space, $(f_n) \subset V$. We say that the series $\sum_n f_n$ converges in $V$ if
$$\exists \lim_{N\to+\infty} \sum_{n=0}^N f_n \in V.$$
The finite sum $s_N := \sum_{n=0}^N f_n$ is called the partial sum of the series.

As for numerical series, to check whether a series of vectors (or, more specifically, of functions) converges can be a difficult task. This is because in general it is not easy, and possibly impossible, to get a simple closed form for the finite sums. Therefore, the algorithm "first compute $s_N := \sum_{n=0}^N f_n$, then take $\lim_N s_N$" is unfeasible. For this reason, the next section is particularly relevant for series.

3.3. Completeness

To check convergence one of course needs a candidate as possible limit. But it is not always possible to determine a candidate. It would therefore be desirable to have an intrinsic property to test whether a certain sequence converges or not. The most important one is a consequence of the

Proposition 3.3.1. If $(f_n)$ converges in $V$ then it must fulfill the Cauchy property:
$$\forall \varepsilon > 0,\ \exists N(\varepsilon) \in \mathbb{N}\ :\ \|f_n - f_m\| \leq \varepsilon, \quad \forall n,m \geq N(\varepsilon).$$
We also say that $(f_n)$ is a Cauchy sequence.
Proof. It's almost immediate: if $f_n \longrightarrow f$ in $V$ then, being $\|f_n - f\| \longrightarrow 0$, we have
$$\forall \varepsilon > 0,\ \exists N(\varepsilon) \in \mathbb{N}\ :\ \|f_n - f\| \leq \varepsilon, \quad \forall n \geq N(\varepsilon).$$
But then, if $n,m \geq N(\varepsilon)$,
$$\|f_n - f_m\| \leq \|f_n - f\| + \|f - f_m\| \leq 2\varepsilon. \qquad \square$$
Remark 3.3.2. In general a Cauchy sequence is not necessarily convergent. Just go back to Example 3.2.8. There we built a sequence $(f_n) \subset C([0,1])$ converging in the $L^1$ norm to a function $f = \chi_{[1/2,1]} \notin C([0,1])$. Because $(f_n)$ is convergent with respect to the $L^1$ norm, it is a Cauchy sequence with respect to this norm. Now we claim that it cannot converge to any $g \in C([0,1])$. This seems obvious, since we proved that $f_n \xrightarrow{L^1} \chi_{[1/2,1]}$. However a little argument is needed. Indeed: if $f_n \xrightarrow{L^1} g$ then $\|g - \chi_{[1/2,1]}\|_1 = 0$ by uniqueness of the limit, that is, $g = \chi_{[1/2,1]}$ a.e. Now the question is: can $g$ be continuous? This seems to be impossible. To see it formally, notice that because $g = \chi_{[1/2,1]}$ a.e.,
$$\lambda_1(\{x \in [0,1] : g(x) \neq 0,1\}) = 0.$$
But $g$, as a continuous function, must take all the values between $0$ and $1$, so if $c \in\ ]0,1[$ there exists $x_c \in [0,1]$ such that $g(x_c) = c$. Moreover, by continuity, for any $J_c := [c-\varepsilon, c+\varepsilon]$ there exists a neighborhood of $x_c$, $I_{x_c} := [x_c - \delta, x_c + \delta]$, such that $g(I_{x_c}) \subset J_c$. Now, by choosing $\varepsilon$ small enough so that $J_c \subset\ ]0,1[$, we would have
$$\{x \in [0,1] : g(x) \neq 0,1\} \supset I_{x_c} \implies \lambda_1(\{x \in [0,1] : g(x) \neq 0,1\}) \geq 2\delta > 0,$$
which is a contradiction. □
Definition 3.3.3. A normed space V such that every Cauchy sequence is convergent is said to be
complete or a Banach space.

Completeness is a very important property of a normed space. The remaining part of this section is dedicated to discussing some examples.

3.3.1. Completeness of B(X ).


Proposition 3.3.4. $B(X)$ is a Banach space.

Proof. Let $(f_n) \subset B(X)$ be a Cauchy sequence in the uniform norm: rewriting the Cauchy property,
$$\forall \varepsilon > 0,\ \exists N(\varepsilon)\ :\ \sup_{x\in X}|f_n(x) - f_m(x)| \leq \varepsilon, \quad \forall n,m \geq N(\varepsilon).$$
In particular $(f_n(x)) \subset \mathbb{R}$ is a Cauchy sequence w.r.t. the euclidean metric. As is well known, $\mathbb{R}$ is complete w.r.t. the euclidean metric: therefore
$$\exists f(x) := \lim_n f_n(x), \quad \forall x \in X.$$
This defines a function. We prove that $f \in B(X)$ and $f_n \xrightarrow{\|\cdot\|_\infty} f$. About the first, notice just that
$$|f(x)| \leq |f(x) - f_N(x)| + |f_N(x)| \leq \varepsilon + \|f_N\|_\infty,\ \forall x \in X \implies \|f\|_\infty \leq \varepsilon + \|f_N\|_\infty,$$
that is, $f \in B(X)$. To finish, notice that letting $m \longrightarrow +\infty$ in
$$|f_n(x) - f_m(x)| \leq \varepsilon, \quad \forall n,m \geq N(\varepsilon),\ \forall x \in X$$
we get
$$|f_n(x) - f(x)| \leq \varepsilon, \quad \forall n \geq N(\varepsilon),\ \forall x \in X,$$
that is, taking the sup,
$$\|f_n - f\|_\infty = \sup_{x\in X}|f_n(x) - f(x)| \leq \varepsilon, \quad \forall n \geq N(\varepsilon) \iff f_n \xrightarrow{\|\cdot\|_\infty} f. \qquad \square$$

If $X \subset \mathbb{R}^d$ then the subspace $C_b(X) \subset B(X)$ makes sense. If $(f_n) \subset C_b(X)$ is a Cauchy sequence (w.r.t. the uniform norm) then it is of course a Cauchy sequence in $B(X)$: by the previous result it converges to some $f \in B(X)$. To say whether $C_b(X)$ is actually complete we just need to know whether $f \in C_b(X)$. This is true (the proof is not difficult but we'll accept it here):

Theorem 3.3.5. If $(f_n) \subset C_b(X)$ is such that $f_n \xrightarrow{\|\cdot\|_\infty} f$ then $f \in C_b(X)$. In particular: $C_b(X)$ is complete.
3.3.2. Completeness of $L^p(X)$.

Theorem 3.3.6. $L^p(X)$ is a Banach space for every $1 \leq p \leq +\infty$. Moreover,
• if $p < +\infty$, an $L^p$ convergent $(f_n)$ admits a subsequence $(f_{n_k})$ convergent a.e.;
• if $p = +\infty$, an $L^\infty$ convergent $(f_n)$ converges also a.e.

Proof. Case $p = +\infty$. By the Cauchy property,
$$\forall \varepsilon > 0,\ \exists N(\varepsilon)\ :\ \|f_n - f_m\|_\infty \leq \varepsilon, \quad \forall n,m \geq N(\varepsilon).$$
According to the definition of $\|\cdot\|_\infty$, we have
$$|f_n(x) - f_m(x)| \leq \varepsilon, \quad \text{a.e. } x \in X,\ \forall n,m \geq N(\varepsilon).$$

This means that there exist sets $E_{n,m}$ with $\mu(E_{n,m}) = 0$ such that the previous holds for every $x \in X\backslash E_{n,m}$. Let $E := \bigcup_{n,m} E_{n,m}$. By subadditivity, $\mu(E) = 0$ and the previous holds on $X\backslash E$ for every $n,m \geq N(\varepsilon)$, that is,
$$(3.3.1)\qquad |f_n(x) - f_m(x)| \leq \varepsilon, \quad \forall x \in X\backslash E,\ \forall n,m \geq N(\varepsilon).$$
In particular $(f_n(x)) \subset \mathbb{R}$ is a Cauchy sequence for every $x \in X\backslash E$, hence $\exists f(x) := \lim_n f_n(x)$. Being a pointwise limit of measurable functions, $f$ itself is measurable. Moreover, by passing to the limit in (3.3.1) we have
$$(3.3.2)\qquad |f_n(x) - f(x)| \leq \varepsilon, \quad \forall x \in X\backslash E,\ \forall n \geq N(\varepsilon).$$
This says first that $(f_n)$ converges a.e., and moreover that $\|f_n - f\|_\infty \leq \varepsilon$ for all $n \geq N(\varepsilon)$, that is, $f_n \xrightarrow{L^\infty} f$.
Case $p < +\infty$. For technical simplicity we assume $p = 1$ ($p > 1$ is left as an exercise). To attack the problem let $(f_n) \subset L^1(X)$ be a Cauchy sequence: we want to show that there exists $f \in L^1(X)$ such that $f_n \xrightarrow{L^1} f$. The first problem is to understand what such an $f$ should be. Now the trick is the following: we can find a subsequence $(f_{n_k}) \subset (f_n)$ such that
$$\|f_{n_{k+1}} - f_{n_k}\|_1 \leq \frac{1}{2^k}.$$
Indeed, by the Cauchy property there exists $N_k$ such that
$$\|f_n - f_m\|_1 \leq \frac{1}{2^k}, \quad \forall n,m \geq N_k.$$
Take $n_1 \geq N_1$, then $n_2 > n_1$ such that $n_2 \geq N_2$ (of course this is possible because there are infinitely many $n$ after $n_1$ and $N_2$). What happens is that $n_1, n_2 \geq N_1$, therefore
$$\|f_{n_2} - f_{n_1}\|_1 \leq \frac{1}{2}.$$
Now repeat by choosing $n_3 > n_2$ with $n_3 \geq N_3$. We'll have $n_2, n_3 \geq N_2$, hence
$$\|f_{n_3} - f_{n_2}\|_1 \leq \frac{1}{2^2}.$$
And so on. Now, we claim that $(f_{n_k}(x))$ is a.e. convergent. Indeed:
$$f_{n_k}(x) = \sum_{j=1}^{k-1} \left(f_{n_{j+1}}(x) - f_{n_j}(x)\right) + f_{n_1}(x),$$
so $f_{n_k}(x)$ will be convergent (as $k \longrightarrow \infty$ with $x$ fixed) iff the series converges. But
$$\int_X \sum_j \left|f_{n_{j+1}} - f_{n_j}\right| d\mu \overset{\text{m.c.}}{=} \sum_j \int_X \left|f_{n_{j+1}} - f_{n_j}\right| d\mu = \sum_j \|f_{n_{j+1}} - f_{n_j}\|_1 \leq \sum_j \frac{1}{2^j} < +\infty.$$
As is well known, this says that
$$\sum_j \left|f_{n_{j+1}} - f_{n_j}\right| < +\infty, \quad \mu\text{-a.e.},$$
hence the series is absolutely convergent, therefore convergent. Let now $f(x) := \lim_k f_{n_k}(x)$. Being a pointwise limit of measurable functions, $f$ is measurable. Moreover
$$f - f_{n_k} = \sum_{j=k}^{\infty} \left(f_{n_{j+1}} - f_{n_j}\right),$$
so
$$\|f - f_{n_k}\|_1 = \int_X |f - f_{n_k}|\,d\mu \leq \int_X \sum_{j\geq k} |f_{n_{j+1}} - f_{n_j}|\,d\mu \overset{\text{m.c.}}{=} \sum_{j\geq k} \int_X |f_{n_{j+1}} - f_{n_j}|\,d\mu \leq \sum_{j\geq k} \frac{1}{2^j} \longrightarrow 0.$$
This shows actually that $f - f_{n_k} \in L^1$, and, being $f_{n_k} \in L^1$, we deduce $f \in L^1$; moreover it says that $f_{n_k} \xrightarrow{L^1} f$. We're almost done: we proved that $(f_n)$ admits an $L^1$ convergent subsequence $(f_{n_k})$. The conclusion now follows by a general property of Cauchy sequences:
Proposition 3.3.7. Let $(f_n)$ be a Cauchy sequence in a normed space $V$. Then
$$\exists f_{n_k} \xrightarrow{V} f \implies f_n \xrightarrow{V} f.$$
Proof. Indeed, by assumption,
$$\forall \varepsilon > 0,\ \exists K(\varepsilon)\ :\ \|f_{n_k} - f\| \leq \varepsilon, \quad \forall k \geq K(\varepsilon).$$
Moreover the Cauchy property says that
$$\forall \varepsilon > 0,\ \exists N(\varepsilon)\ :\ \|f_n - f_m\| \leq \varepsilon, \quad \forall n,m \geq N(\varepsilon).$$
Now, take $n \geq N(\varepsilon)$ and choose $k \geq K(\varepsilon)$ such that $n_k \geq N(\varepsilon)$ (of course there are infinitely many such $k$). We have
$$\|f_n - f\| \leq \|f_n - f_{n_k}\| + \|f_{n_k} - f\| \leq \varepsilon + \varepsilon = 2\varepsilon, \quad \forall n \geq N(\varepsilon). \qquad \square$$
In general it may happen that an $L^p$ convergent sequence is not pointwise convergent, as the following example shows:
Example 3.3.8. On $X = [0,1]$ with the ordinary Lebesgue measure let $f_1 \equiv 1$, then
$$f_2 = \chi_{[0,1/2[}, \quad f_3 = \chi_{[1/2,1]},$$
$$f_4 = \chi_{[0,1/4[}, \quad f_5 = \chi_{[1/4,2/4[}, \quad f_6 = \chi_{[2/4,3/4[}, \quad f_7 = \chi_{[3/4,1]},$$
$$\vdots$$
$$f_{2^k} = \chi_{[0,1/2^k[}, \quad f_{2^k+1} = \chi_{[1/2^k,\,2/2^k[}, \quad \dots, \quad f_{2^{k+1}-1} = \chi_{[(2^k-1)/2^k,\,1]},$$
$$\vdots$$
Then $f_n \xrightarrow{L^1} 0$ but $(f_n(x))$ is never convergent.

Sol. — The first is evident: if $2^k \leq n < 2^{k+1}$ we have
$$\|f_n\|_1 = \frac{1}{2^k} = \frac{2}{2^{k+1}} < \frac{2}{n} \longrightarrow 0.$$
But of course, whatever $x$ is, for fixed $k$ we have $x \in \left[\frac{j}{2^k}, \frac{j+1}{2^k}\right[$ for just one possible value of $j$: this means that $f_n(x) = 0$ for all $2^k \leq n < 2^{k+1}$ except for just one possible value of $n$, and for that value $f_n(x) = 1$. In other words the sequence $(f_n(x))$ contains $0$ and $1$ infinitely many times, so it cannot be convergent. Notice that here, if we take the subsequence $f_{2^k} = \chi_{[0,1/2^k[}$, we have
$$f_{2^k}(x) \longrightarrow \begin{cases} 1, & x = 0, \\ 0, & x \in\ ]0,1], \end{cases}$$
and this confirms what the proof shows. □
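The "typewriter" sequence of this example can be simulated directly. A minimal sketch (the indexing helper, the horizon $n < 256$ and the evaluation point $x_0 = 0.3$ are arbitrary choices; the closed right endpoint of the last block of each level is ignored, which is irrelevant for $x_0$):

```python
def f(n, x):
    # For 2^k <= n < 2^(k+1), f_n is the indicator of [j/2^k, (j+1)/2^k[ with j = n - 2^k.
    if n == 1:
        return 1.0
    k = n.bit_length() - 1
    j = n - 2**k
    return 1.0 if j / 2**k <= x < (j + 1) / 2**k else 0.0

def l1_norm(n):
    # ||f_n||_1 = width of the block = 2^(-k)
    return 1.0 if n == 1 else 2.0 ** -(n.bit_length() - 1)

x0 = 0.3
vals = [f(n, x0) for n in range(1, 256)]

print(l1_norm(255))            # small: ||f_n||_1 -> 0, so f_n -> 0 in L^1
print(min(vals), max(vals))    # 0.0 1.0 : (f_n(x0)) keeps oscillating, no pointwise limit
```

At each dyadic level exactly one block contains $x_0$, so the pointwise values hit $1$ once per level and are $0$ otherwise, while the $L^1$ norms shrink geometrically.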
3.3.3. Normal convergence. A very important fact concerning numerical series is the following well known result:

Theorem 3.3.9. Any absolutely convergent series, that is with $\sum_n |a_n|$ convergent, is convergent, that is $\sum_n a_n$ converges.

It is also well known that absolute convergence is strictly stronger than convergence. For example, $\sum_n \frac{(-1)^n}{n}$ is convergent (Leibniz test) but not absolutely convergent, because $\sum_n \left|\frac{(-1)^n}{n}\right| = \sum_n \frac{1}{n}$ is divergent. Basically, replacing the modulus with the norm, in a Banach space the previous theorem holds:
Theorem 3.3.10. Let $V$ be a Banach space. Any normally convergent series, that is with $\sum_n \|f_n\|$ convergent, is convergent: $\sum_n f_n$ converges in $V$.

Proof. Let $S_N := \sum_{n=0}^N f_n$ be the $N$-th partial sum. Let's check that $(S_N)$ is a Cauchy sequence. If, for instance, $N > M$,
$$\|S_N - S_M\| = \left\|\sum_{n=M+1}^N f_n\right\| \leq \sum_{n=M+1}^N \|f_n\| = s_N - s_M,$$
where $s_N := \sum_{n=0}^N \|f_n\|$. Now, because by hypothesis $\sum_n \|f_n\|$ converges, the sequence $(s_N)$ converges (in $\mathbb{R}$), therefore it is a Cauchy sequence. Hence,
$$\forall \varepsilon > 0,\ \exists N(\varepsilon)\ :\ 0 \leq s_N - s_M \leq \varepsilon, \quad \forall N \geq M \geq N(\varepsilon).$$
But then $\|S_N - S_M\| \leq \varepsilon$ for all $N \geq M \geq N(\varepsilon)$, that is, $(S_N)$ is Cauchy in $V$. Now the conclusion follows by completeness. □
An example of specialization of the previous general result is the

Corollary 3.3.11 (Weierstrass' test). Let $(f_n) \subset B(X)$. Then
$$\sum_{n=0}^\infty \|f_n\|_\infty \text{ converges} \implies \sum_{n=0}^\infty f_n \text{ converges uniformly}.$$
Proof. Evident. □
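The Weierstrass test can be watched in action on, say, $\sum_n \sin(nx)/n^2$ (an arbitrary choice for which $\|f_n\|_\infty = 1/n^2$ is summable): the uniform distance between two partial sums is controlled by the tail of the numerical series, exactly the Cauchy estimate used in the proof above.

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 10_001)

def partial_sum(N):
    # S_N(x) = sum_{n=1}^N sin(n x)/n^2; here ||f_n||_inf = 1/n^2 is summable
    return sum(np.sin(n * x) / n**2 for n in range(1, N + 1))

sup_diff = float(np.max(np.abs(partial_sum(400) - partial_sum(200))))
tail = sum(1.0 / n**2 for n in range(201, 401))   # sum of ||f_n||_inf over the same range

print(sup_diff, tail)   # sup_diff <= tail: the uniform Cauchy estimate of the test
```

Since the tail of $\sum 1/n^2$ can be made arbitrarily small, the partial sums form a uniform Cauchy sequence, hence converge uniformly by the completeness of $B(X)$.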

3.4. Linear Operators


The most natural functions operating between vector spaces are linear operators. We recall that this means that
$$A : V \longrightarrow W, \quad A(\alpha f + \beta g) = \alpha A(f) + \beta A(g) \equiv \alpha A f + \beta A g, \quad \forall f,g \in V,\ \forall \alpha,\beta \in \mathbb{R}\ (\mathbb{C}).$$
Here we also introduced the usual notation $A f$ to denote $A(f)$, used for linear operators. If $V,W$ are normed spaces we can consider another important property: continuity. In general:

Definition 3.4.1. Let $(V,\|\cdot\|_V)$ and $(W,\|\cdot\|_W)$ be normed spaces and $T : V \longrightarrow W$ a function. We say that $T$ is continuous at $f \in V$ if
$$T(f_n) \xrightarrow{\|\cdot\|_W} T(f), \quad \forall f_n \xrightarrow{\|\cdot\|_V} f.$$
If $T$ is continuous at every $f \in V$ we write $T \in C(V,W)$.
If $V, W$ are finite dimensional (actually just $V$ matters) then it is possible to prove that a linear operator $A$ is continuous, no matter what the norms on $V$ and $W$ are. Therefore: if $V$ has finite dimension, any linear $A$ is also continuous. This may be false if $V$ is not finite dimensional. Here's a classical
Example 3.4.2. Let $V = C^1([0,1])$ with $\|f\|_V := \|f\|_\infty$ (it is easy to check that this is a norm on $V$) and $W := C([0,1])$ with the usual uniform norm. Let
$$A : V = C^1([0,1]) \longrightarrow W = C([0,1]), \quad A f := f'.$$
Clearly $A$ is linear but it is not continuous in this setting.

Sol. — Take $f_n(x) := \frac{1}{n}\sin(n^2 x)$. Clearly $(f_n) \subset C^1([0,1])$ and because
$$\|f_n\|_V = \|f_n\|_\infty = \sup_{x\in[0,1]} \frac{1}{n}\left|\sin(n^2 x)\right| \leq \frac{1}{n} \longrightarrow 0,$$
we have $f_n \xrightarrow{\|\cdot\|_V} 0$. But
$$A f_n = f_n' = \frac{1}{n}\,n^2\cos(n^2 x) = n\cos(n^2 x) \implies \|A f_n\|_W = \|f_n'\|_\infty = \sup_{x\in[0,1]} n\left|\cos(n^2 x)\right| = n \longrightarrow +\infty,$$
hence it cannot be that $A f_n \xrightarrow{\|\cdot\|_W} A0 = 0$. □

Operators like the derivative are not esoteric but very common and relevant in applications. Just think that a PDE can be seen as an equation of the form $A f = 0$ where $A$ is an operator involving derivatives. This is the case for instance of the
• Laplace equation $\Delta f = 0$, here $A = \partial_{xx} + \partial_{yy}$;
• heat equation $\partial_t f = \sigma^2 \partial_{xx} f$, that is $A f = 0$ where $A = \partial_t - \sigma^2 \partial_{xx}$;
• wave equation $\partial_{tt} f = c^2 \partial_{xx} f$, $A = \partial_{tt} - c^2 \partial_{xx}$;
• Schrödinger equation $i\hbar\,\partial_t f = -\frac{\hbar^2}{2}\partial_{xx} f$, $A = i\hbar\,\partial_t + \frac{\hbar^2}{2}\partial_{xx}$.

The class of linear and continuous operators is very important:

Definition 3.4.3. The set of linear and continuous operators from the normed space $(V,\|\cdot\|_V)$ to the normed space $(W,\|\cdot\|_W)$ is denoted by $\mathcal{L}(V,W)$. If $(W,\|\cdot\|_W) = (V,\|\cdot\|_V)$ we write $\mathcal{L}(V,V) \equiv \mathcal{L}(V)$.
A first interesting remark is the following:

Proposition 3.4.4.
$$A \in \mathcal{L}(V,W) \iff A f_n \xrightarrow{\|\cdot\|_W} 0, \quad \forall f_n \xrightarrow{\|\cdot\|_V} 0.$$
Proof. $\Longrightarrow$ is evident. Vice versa: assume that $A$ is continuous at $0$. If $f_n \xrightarrow{\|\cdot\|_V} f$ then easily $f_n - f \xrightarrow{\|\cdot\|_V} 0$, hence, by assumption, $A(f_n - f) = A f_n - A f \xrightarrow{\|\cdot\|_W} 0$, that is, $A f_n \xrightarrow{\|\cdot\|_W} A f$. □

The set $\mathcal{L}(V,W)$ has a natural structure of vector space with the obvious operations
$$(\alpha A + \beta B) f := \alpha A f + \beta B f, \quad f \in V,\ \forall \alpha,\beta \in \mathbb{R}\ (\mathbb{C}).$$
What is less evident is that $\mathcal{L}(V,W)$ also carries a very important and natural norm:

Theorem 3.4.5.
$$A \in \mathcal{L}(V,W) \iff \|A\|_{\mathcal{L}(V,W)} := \sup_{\|f\|_V \leq 1} \|A f\|_W < +\infty.$$
The quantity $\|\cdot\|_{\mathcal{L}(V,W)}$ is a norm on $\mathcal{L}(V,W)$, called the operator norm.

Proof. $\Longrightarrow$ Suppose that $\|A\| = +\infty$: there exists $(f_n)$ with $\|f_n\|_V \leq 1$ such that $\|A f_n\|_W \geq n$. Let's notice that
$$1 \leq \frac{1}{n}\|A f_n\|_W = \left\|\frac{1}{n} A f_n\right\|_W = \left\|A \frac{f_n}{n}\right\|_W.$$
Moreover $\left\|\frac{f_n}{n}\right\|_V = \frac{1}{n}\|f_n\|_V \leq \frac{1}{n} \longrightarrow 0$, so $\frac{f_n}{n} \xrightarrow{\|\cdot\|_V} 0$. But this leads to a contradiction because, $A$ being continuous, $A\frac{f_n}{n} \xrightarrow{\|\cdot\|_W} 0$, which is impossible, being $\left\|A\frac{f_n}{n}\right\|_W \geq 1$.

$\Longleftarrow$ Let $f_n \xrightarrow{\|\cdot\|_V} 0$. Notice that if $f_n \neq 0$ we have
$$\|A f_n\|_W = \left\|A\left(\frac{f_n}{\|f_n\|_V}\right)\right\|_W \|f_n\|_V \leq \|A\|_{\mathcal{L}(V,W)}\|f_n\|_V,$$
because of course $\left\|\frac{f_n}{\|f_n\|_V}\right\|_V = 1 \leq 1$. If $f_n = 0$ the previous estimate remains trivially valid. Therefore
$$\|A f_n\|_W \leq \|A\|_{\mathcal{L}(V,W)}\|f_n\|_V \longrightarrow 0 \implies A f_n \xrightarrow{\|\cdot\|_W} 0.$$
Let's check now that $\|\cdot\|_{\mathcal{L}(V,W)}$ is a norm on $\mathcal{L}(V,W)$. By the previous discussion it follows that the operator norm is well defined on $\mathcal{L}(V,W)$ and it is clearly $\geq 0$. It remains to prove vanishing, homogeneity and the triangular inequality. We have:
• $\|A\|_{\mathcal{L}(V,W)} = 0$ iff $\|A f\|_W = 0$ for every $f \in V$ such that $\|f\|_V \leq 1$; in particular $A f = 0$ for such $f$. If $f \in V$ is generic (but not $0$) we can write $A f = A\left(\|f\|_V \frac{f}{\|f\|_V}\right) = \|f\|_V\, A\frac{f}{\|f\|_V} = 0$, being $\left\|\frac{f}{\|f\|_V}\right\|_V = 1 \leq 1$. Of course $A0 = 0$ by linearity, hence $A \equiv 0$.
• $\|\alpha A\|_{\mathcal{L}(V,W)} = \sup_{\|f\|_V\leq 1}\|\alpha A f\|_W = \sup_{\|f\|_V\leq 1} |\alpha|\,\|A f\|_W = |\alpha| \sup_{\|f\|_V\leq 1}\|A f\|_W = |\alpha|\,\|A\|_{\mathcal{L}(V,W)}$.
• $\|A + B\|_{\mathcal{L}(V,W)} = \sup_{\|f\|_V\leq 1}\|A f + B f\|_W \leq \sup_{\|f\|_V\leq 1}\left(\|A f\|_W + \|B f\|_W\right) \leq \|A\|_{\mathcal{L}(V,W)} + \|B\|_{\mathcal{L}(V,W)}$. □

By the way, notice that we have another very useful characterization of continuity:

Corollary 3.4.6.
$$A \in \mathcal{L}(V,W) \iff \exists L > 0\ :\ \|A f\|_W \leq L\|f\|_V, \quad \forall f \in V.$$
Proof. $\Longleftarrow$ is immediate (exercise). $\Longrightarrow$ If $A \in \mathcal{L}(V,W)$ then, for $f \neq 0$,
$$\|A f\|_W = \|f\|_V \left\|A\frac{f}{\|f\|_V}\right\|_W \leq \|A\|_{\mathcal{L}(V,W)}\|f\|_V. \qquad \square$$
Remark 3.4.7. Notice that, by the previous proof,
$$(3.4.1)\qquad \|A f\|_W \leq \|A\|_{\mathcal{L}(V,W)}\|f\|_V, \quad \forall f \in V.$$
In particular
$$\|A B f\| \leq \|A\|\,\|B f\| \leq \|A\|\,\|B\|\,\|f\| \implies \|AB\| \leq \|A\|\,\|B\|. \qquad \square$$
Example 3.4.8. Let $V = C^1([0,1])$ with $\|f\|_V := \|f\|_\infty + \|f'\|_\infty$ and $W = C([0,1])$ with $\|g\|_W = \|g\|_\infty$. Then $A : V \longrightarrow W$ defined as $A f = f'$ is linear and continuous.

Sol. — Trivial: $\|A f\|_W = \|f'\|_\infty \leq \|f\|_\infty + \|f'\|_\infty = \|f\|_V$. □
The previous examples show that it doesn't make sense to say that $A$ is continuous or not without a specific reference to the norms for which this should be true: the same operator (the derivative), between the same spaces but with different norms on $V$, turns out to be discontinuous or continuous according to which norm on $V$ we choose.
It's now time to shorten notations: we'll write $\|A\|$ for the operator norm (usually upper case letters refer to spaces and operators), and $\|f\|$, $\|A f\|$ for the norms in the domain and codomain. The setting should make clear enough which norms we are referring to. We will now discuss some properties of the space $\mathcal{L}(V) \equiv \mathcal{L}(V,V)$. A very important one is completeness:
Theorem 3.4.9. If $V$ is a Banach space then $\mathcal{L}(V)$ is a Banach space too.

Proof. Let $(A_n) \subset \mathcal{L}(V)$ be a Cauchy sequence, that is,
$$\forall \varepsilon > 0,\ \exists N(\varepsilon)\ :\ \|A_n - A_m\| \leq \varepsilon, \quad \forall n,m \geq N(\varepsilon).$$
Now, because $\|A_n f - A_m f\| \leq \|A_n - A_m\|\,\|f\| \leq \varepsilon\|f\|$, it follows that $(A_n f) \subset V$ is a Cauchy sequence: $V$ being complete, it converges. Let's call $A f$ its limit:
$$A f := \lim_n A_n f, \quad \forall f \in V.$$
This defines a map $A : V \longrightarrow V$. The goal is to show that $A \in \mathcal{L}(V)$ and $A_n \xrightarrow{\|\cdot\|_{\mathcal{L}(V)}} A$. We have
$$A(\alpha f + \beta g) = \lim_n A_n(\alpha f + \beta g) = \lim_n\left(\alpha A_n f + \beta A_n g\right) = \alpha A f + \beta A g.$$
Moreover, if $\|f\| \leq 1$ and $n \geq N(\varepsilon)$,
$$\|A f\| \leq \|A f - A_n f\| + \|A_n f\| \leq \varepsilon + \|A_n\| < +\infty \implies A \in \mathcal{L}(V).$$
Finally, if $m \geq N(\varepsilon)$ and $\|f\| \leq 1$,
$$\|A f - A_m f\| = \lim_n \|A_n f - A_m f\| \leq \varepsilon \implies \|A - A_m\| \leq \varepsilon, \quad \forall m \geq N(\varepsilon),$$
and this means $A_m \xrightarrow{\|\cdot\|_{\mathcal{L}(V)}} A$. □

Let’s show an abstract application of this fact. Consider the equation


f − A f = g.
Here A ∈ L (V ). Of course we can see the previous equation as
(I − A) f = g, ⇐⇒ f = (I − A) −1 g.
So the question is: under which hypotheses is I − A invertible? Let's see how an informal argument suggests the answer. Let's write
(I − A)^{−1} = I/(I − A) = Σ_{n=0}^∞ A^n,
reminding the geometric sum Σ_{n=0}^∞ q^n = 1/(1 − q), which converges if |q| < 1. It turns out that
Theorem 3.4.10 (Neumann series). Let V be a Banach space and A ∈ L (V ) be such that k Ak < 1. Then
∃ (I − A)^{−1} = Σ_{n=0}^∞ A^n,
in the sense that the series converges in the operator norm of L (V ) and its sum is the inverse of I − A.
Proof. According to the previous theorem, L (V ) is a Banach space. Let's first check that SN := Σ_{n=0}^N A^n is a Cauchy sequence in L (V ): if N > M for instance,
kSN − SM k = k Σ_{n=M+1}^N A^n k 6 Σ_{n=M+1}^N k A^n k 6 Σ_{n=M+1}^N k Ak^n,
where we used k A^n k 6 k Ak^n. Because k Ak < 1, the last sum can be made smaller than any ε > 0 for N, M big enough. Therefore Σ_n A^n converges in L (V ). Call S its sum. Let's check that S = (I − A)^{−1}. We have
S(I − A) = S − S A = Σ_{n=0}^∞ A^n − Σ_{n=0}^∞ A^{n+1} = I.
Similarly (I − A)S = I. □
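To see the theorem at work, the partial sums SN of the Neumann series can be compared with the true inverse; here is a minimal numerical sketch (the 2×2 matrix A is a hypothetical example, chosen so that k Ak < 1 in the spectral norm):

```python
import numpy as np

# Hypothetical operator on R^2 with spectral norm < 1 (contraction condition)
A = np.array([[0.2, 0.1],
              [0.0, 0.3]])
assert np.linalg.norm(A, 2) < 1

# Partial sums S_N = sum_{n=0}^{N} A^n of the Neumann series
S = np.zeros_like(A)
P = np.eye(2)          # current power A^n, starting from A^0 = I
for n in range(50):
    S += P
    P = P @ A

exact = np.linalg.inv(np.eye(2) - A)
print(np.allclose(S, exact))   # True: the series sums to (I - A)^{-1}
```

Since kAk^n decays geometrically, 50 terms already agree with the exact inverse to machine precision.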
3.5. Exercises
Exercise 3.5.1. Define, on R²,
k(x, y)k∗ := (√|x| + √|y|)².
Is k · k∗ a norm?
Exercise 3.5.2. On V = C¹([0, 1]) let's consider
i) k f k∗ := k f k∞ + k f ′k∞; ii) k f k∗∗ := k f ′k∞; iii) k f k∗∗∗ := | f (0)| + k f ′k∞; iv) k f k∗∗∗∗ := | f (1)| + ∫_0^1 | f ′(x)| dx.
Which among these are norms (k · k∞ stands for the uniform norm)? For those which are norms, consider the sequence f n (x) := (1/n) sin(n²x). Discuss if this sequence converges in each of the norms. Discuss relations among the norms (which is stronger?).
Exercise 3.5.3. In X = C¹([0, 1]) consider three norms as follows: a) the sup-norm k · k∞; b) the total variation norm
k f kv := k f k∞ + k f ′k₁ ≡ k f k∞ + ∫_0^1 | f ′(x)| dx;
c) the C¹ norm k f k∗ := k f k∞ + k f ′k∞.
i) Show that there exist c, C > 0 such that
k f k∞ 6 ck f kv 6 Ck f k∗ .
ii) By using sequences like f k (x) := ck sin(kπx) or gk (x) := ck x^k with ck > 0, show that there do not exist constants m, M > 0 such that
k f kv 6 m k f k∞, k f k∗ 6 M k f kv .
Exercise 3.5.4. Let D ⊂ Rd be open and let k · k∞ be the L∞(D) norm with respect to the Lebesgue measure. Show that if g is bounded and continuous,
kgk∞ = sup_{x∈D} |g(x)|.
Is the conclusion still true if D is not open?
Exercise 3.5.5. Let µ(X ) < +∞. Show that L p (X ) ⊂ L q (X ) if +∞ > p > q > 1 and, actually, the L p norm is
stronger than the L q norm. Discuss what happens if µ(X ) = +∞.
Exercise 3.5.6. Prove the following extension of the Hölder inequality: if f , g, h ∈ L(X) with f , g, h > 0 and p, q, r > 1 are such that 1/p + 1/q + 1/r = 1, then
∫_X f g h dµ 6 (∫_X f^p dµ)^{1/p} (∫_X g^q dµ)^{1/q} (∫_X h^r dµ)^{1/r}.
Exercise 3.5.7. Let f n (x) := χ_{[−1,0]}(x) + (1 − nx) χ_{]0,1/n]}(x). Discuss if ( f n ) is convergent in L²([−1, 1]).
Exercise 3.5.8. Let f n (x) := n/(1 + n⁹x³), x ∈ [0, 1]. Discuss the convergence of ( f n ) in the L^p([0, 1]) norm as p ∈ [1, +∞].
Exercise 3.5.9. Let ( f n ) ⊂ L 4 ([0, 1]) be such that k f n k4 6 1 for all n ∈ N. For each of the following statements
say if it is true (and provide a proof) or false (and provide a counter example):
i) ( f n ) is bounded in the L 2 ([0, 1]) norm.
ii) ( f n ) is bounded in the L 6 ([0, 1]) norm.
iii) there exists ( f nk ) ⊂ ( f n ) converging almost everywhere.
Exercise 3.5.10. Let f ∈ L 2 (R) (the measure is the Lebesgue measure).
i) Prove or disprove with a counterexample: f ∈ L 1 (R)? f ∈ L 1 ([−R, R]) for every R > 0?
ii) Show that if also x f (x) ∈ L²(R) then f ∈ L¹(R), by proving also the bound
k f k₁ 6 √2 (k f k₂ + k x f k₂).
Exercise 3.5.11 (?). Let ( f n ) ⊂ L(Rd) be such that f n −→ 0 both in the L² and in the L⁴ norm. Prove or disprove, providing a counterexample, the following statements: i) f n −→ 0 in L¹. ii) f n −→ 0 in L³. iii) f n −→ 0 in L⁵.
Exercise 3.5.12 (?). Suppose that (En) ⊂ F (with (X, F, µ) a measure space) is such that χ_{En} −→ f in L¹(X). Show that f = χ_E a.e. for some E ∈ F .
Exercise 3.5.13 (?). Show directly that ` 1 is a Banach space.
Exercise 3.5.14. Let
X := { f ∈ C([0, 1]) : k f k∗ := sup_{t∈]0,1]} | f (t)|/t < +∞ }.
i) Check that k · k∗ is a well defined norm on X.
ii) Let f n be defined as
f n (t) := nt for 0 6 t 6 1/n², and f n (t) := √t for 1/n² 6 t 6 1.
Is ( f n ) ⊂ X? If yes, is ( f n ) convergent to some f ∈ X in the k · k∗ norm?
iii) On X is also defined the k · k∞ norm. Show that k · k∗ is stronger than k · k∞ . Are the two also equivalent?
(prove or disprove)
iv) Discuss if X is a Banach space under k · k∗ .
Exercise 3.5.15. On X = R2 consider the norms
k(x, y)k1 := |x| + |y|; k(x, y)k2 := (|x| 2 + |y| 2 ) 1/2 ; k(x, y)k∞ := max{|x|, |y|}.
Let A : R2 −→ R defined as A(x, y) := 2x + y.
i) Show directly that A is continuous as an operator from R² to R with respect to each of the norms k · k₁, k · k₂, k · k∞.
ii) Find the operator norm k Ak of A with respect to each of the norms k · k₁, k · k₂, k · k∞.
Exercise 3.5.16. Let V = Rd with euclidean norm, A = diag(λ 1, . . . , λ d ). Determine k Ak.
Exercise 3.5.17 (?). Let X = R2 endowed with the euclidean norm, M = [mi j ] a 2 × 2 matrix and
A : R2 −→ R2, Ax = M · x, x ∈ R2,
where M · x stands for the classical product matrix times vector. Suppose that M is diagonal, M = diag(λ 1, λ 2 ).
Express k Ak in terms of λ 1, λ 2 .
Exercise 3.5.18. Let X = C ([0, 1]) be normed with the sup-norm k · k∞ . Consider
( A f )(t) := t f (t), t ∈ [0, 1].
i) Show that A ∈ L (X ).
ii) Show that A is injective (hint: recall that A is linear. . . ).
iii) Is also A surjective? If yes, what is A−1 ? In any case, what is the image of A?
Exercise 3.5.19. Let X = C([0, 1]) be normed with the sup-norm k · k∞. Consider
(A f )(t) := f (t) + ∫_0^t f (s) ds, t ∈ [0, 1].
i) Show that A ∈ L (X ).
ii) Show that A is injective (hint: recall that A is linear. . . ).
iii) Is also A surjective? If yes, what is A−1 ? In any case, what is the image of A?
Exercise 3.5.20. On L²(R) consider the operator f ↦ A f , where A f is defined as
(A f )(x) := ∫_R f (x − y) e^{−y²} dy.
Prove that A ∈ L (L²(R)).
CHAPTER 4

Basic Hilbert spaces

Hilbert spaces are particular Banach spaces in which the norm is induced by a scalar product. The scalar product adds to the structure a "euclidean insight" through the concept of orthogonal vectors. For this reason, among the Banach spaces, Hilbert spaces are particularly rich structures, on which the concept of orthonormal base becomes a natural extension of the canonical base of Rd. The scope of this Chapter is to introduce this order of ideas, which will play a very important role in the next Chapters as well as, more in general, in all modern Mathematics.

4.1. Definition and first properties


Let’s start by the
Definition 4.1.1. Let V be a vector space over the scalar field C. A function h·, ·i : V × V −→ C is called
hermitian product if
i) (positivity) h f , f i > 0 for every f ∈ V ;
ii) (vanishing) h f , f i = 0 iff f = 0;
iii) (linearity) hα f + βg, hi = αh f , hi + βhg, hi, ∀ f , g, h ∈ V and ∀α, β ∈ C;
iv) (anti-symmetry) h f , gi = \overline{hg, f i} (the bar denoting complex conjugation).
Combining linearity and anti-symmetry we see that
h f , g + hi = \overline{hg + h, f i} = \overline{hg, f i} + \overline{hh, f i} = h f , gi + h f , hi
and
h f , αgi = \overline{hαg, f i} = \overline{α hg, f i} = \overline{α} h f , gi.
A hermitian product naturally induces a structure of normed space:
Proposition 4.1.2. Let h·, ·i be a hermitian product. Then k f k := √(h f , f i) defines a norm on V .
Proof. Notice first that by positivity k f k is well defined and positive. The vanishing for the norm follows immediately by the vanishing for the inner product. Next,
kα f k = √(hα f , α f i).
Combining linearity and anti-symmetry we have hα f , α f i = α h f , α f i = α \overline{α} h f , f i = |α|² k f k², and by this kα f k = |α| k f k. Finally, the triangular inequality: first notice that
k f + gk² = h f + g, f + gi = h f , f i + hg, gi + h f , gi + hg, f i = k f k² + kgk² + 2Re h f , gi.
The conclusion now follows by the

Lemma 4.1.3 (Cauchy–Schwarz inequality).
(4.1.1) |h f , gi| 6 k f k kgk, ∀ f , g ∈ V .
The = holds iff f and g are linearly dependent.
Proof. The proof is purely algebraic: notice that kλ f + gk² > 0 for every λ ∈ R. Being
kλ f + gk² = λ² k f k² + kgk² + 2λ Re h f , gi
a second degree polynomial in λ ∈ R which is always > 0, its discriminant must fulfill ∆ 6 0, where
∆ = 4 (Re h f , gi)² − 4 k f k² kgk² 6 0 ⇐⇒ (Re h f , gi)² 6 k f k² kgk² ⇐⇒ |Re h f , gi| 6 k f k kgk.
Notice that the = holds iff ∆ = 0, that is iff kλ f + gk can be = 0, that is iff λ f + g = 0, i.e. iff f , g are linearly dependent.
To replace Re h f , gi with h f , gi, let's write h f , gi = ρ e^{iθ}. Then he^{−iθ} f , gi = e^{−iθ} h f , gi = ρ; in particular he^{−iθ} f , gi is real. Therefore
|h f , gi| = |e^{−iθ} h f , gi| = |he^{−iθ} f , gi| 6 ke^{−iθ} f k kgk = k f k kgk.
Finally, the = holds iff e^{−iθ} f and g are linearly dependent, that is iff f and g are linearly dependent. □
Returning to the main proof, by the Cauchy–Schwarz inequality we have
k f + gk² 6 k f k² + kgk² + 2 |h f , gi| 6 k f k² + kgk² + 2 k f k kgk = (k f k + kgk)²,
which is the conclusion. □
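A quick numerical sanity check of (4.1.1) on C⁶ with the standard hermitian product (the vectors and the random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal(6) + 1j * rng.standard_normal(6)
g = rng.standard_normal(6) + 1j * rng.standard_normal(6)

# hermitian product <f, g> = sum_k f_k * conj(g_k); np.vdot conjugates its FIRST argument
ip = np.vdot(g, f)
print(abs(ip) <= np.linalg.norm(f) * np.linalg.norm(g))   # True: Cauchy-Schwarz

# equality holds iff f, g are linearly dependent: e.g. g = 2i f
print(np.isclose(abs(np.vdot(2j * f, f)),
                 np.linalg.norm(2j * f) * np.linalg.norm(f)))   # True
```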
One of the important byproducts of the Cauchy–Schwarz inequality is that
Proposition 4.1.4. The inner product is continuous with respect to each factor, that is, if f n −→ f in the norm of H, then h f n, gi −→ h f , gi for every g ∈ H.
Proof. Just notice that
|h f n, gi − h f , gi| = |h f n − f , gi| 6 k f n − f k kgk −→ 0. □
Definition 4.1.5. We say that f and g are orthogonal (notation f ⊥ g) if h f , gi = 0.
In this case we have the Pythagorean theorem
k f + gk 2 = k f k 2 + kgk 2, ∀ f , g ∈ V, f ⊥ g.
More generally, if f i ⊥ f j for all i ≠ j, then
k Σ_{j=1}^n f j k² = Σ_{j=1}^n k f j k².
Another remarkable identity is the so-called parallelogram identity
k f + gk 2 + k f − gk 2 = 2(k f k 2 + kgk 2 ), ∀ f , g ∈ V .
Indeed
k f + gk² + k f − gk² = (k f k² + kgk² + 2Re h f , gi) + (k f k² + k−gk² + 2Re h f , −gi) = 2(k f k² + kgk²),
being h f , −gi = −h f , gi.
Definition 4.1.6. A complete inner product space H is called Hilbert space.
Example 4.1.7. The space L²(X, F, µ) is a Hilbert space endowed with the inner product
h f , gi₂ := ∫_X f \overline{g} dµ ≡ ∫_X f (x) \overline{g(x)} dµ(x).
In particular, if µ is the counting measure on (N, P(N)), the space L² is denoted by
ℓ²(N) := { (xn) ⊂ C : Σ_n |xn|² < +∞ }.
Here the inner product is
h(xn), (yn)i := Σ_n xn \overline{yn}. □

4.2. Orthogonal Projections


One of the most important results concerning Hilbert spaces is the existence of the orthogonal projection:
Theorem 4.2.1. Let H be a Hilbert space and U ⊂ H a closed linear subspace. The orthogonal projection on U is well defined, that is:
(4.2.1) ∀ f ∈ H, ∃! ΠU f ∈ U : h f − ΠU f , gi = 0, ∀g ∈ U.
The orthogonal projection ΠU f of f on U is the best approximation of f by an element of U, that is
(4.2.2) k f − ΠU f k = min_{g∈U} k f − gk.

Proof. Of course, in principle it is not evident why the minimum in (4.2.2) should exist, hence we will consider
inf_{g∈U} k f − gk.
Of course if f ∈ U there's nothing to prove: the minimum is achieved at g = f , which is, of course, the orthogonal projection of f on U. Therefore we'll assume f ∉ U and call α > 0 the inf. Take a minimizing sequence, which for convenience we write in the following form:
(4.2.3) ∃(gn) ⊂ U : α² 6 k f − gn k² 6 α² + 1/n, ∀n ∈ N.
We affirm that (gn) is actually a Cauchy sequence. To show this we have to estimate kgn − gm k knowing estimates of kgn − f k and kgm − f k. This can be done through the parallelogram identity:
kgn − gm k² = k(gn − f ) − (gm − f )k² = 2 (kgn − f k² + kgm − f k²) − k(gn − f ) + (gm − f )k²
6 2 ((α² + 1/n) + (α² + 1/m)) − kgn + gm − 2 f k² = 4α² + 2/n + 2/m − 4 k (gn + gm)/2 − f k².
Now, being U a linear space and gn, gm ∈ U, we have (gn + gm)/2 ∈ U: therefore k(gn + gm)/2 − f k > α, hence
kgn − gm k² 6 4α² + 2/n + 2/m − 4α² = 2/n + 2/m 6 ε², ∀n, m > N(ε).
n m n m
This is the Cauchy property: being H complete (gn ) converges. Let g∗ := limn gn : by letting n −→ +∞
in (4.2.3) we have immediately k f − g∗ k 2 = α 2 . This proves the existence of the element at minimum
distance.
Let's prove that it is unique. If g∗ and g∗∗ are two elements at minimum distance then, again by the parallelogram identity,
kg∗ − g∗∗ k² = 2 (kg∗ − f k² + kg∗∗ − f k²) − kg∗ + g∗∗ − 2 f k² = 4α² − 4 k (g∗ + g∗∗)/2 − f k² 6 4α² − 4α² = 0,
that is g∗ = g∗∗. This authorizes us to call this unique element ΠU f .
Finally, let's prove that (4.2.1) is a characterization of ΠU f . Let g ∈ U and consider the scalar function
λ(t) := k f − (ΠU f + tg)k².
Because ΠU f + tg ∈ U and ΠU f is the element of U at minimum distance from f , we have that
λ(t) = k f − (ΠU f + tg)k² > k f − ΠU f k² = λ(0), =⇒ t = 0 is a minimum point for λ.
In particular λ′(0) = 0. Being
λ(t) = k f − ΠU f k² + t² kgk² + 2t Re h f − ΠU f , gi, =⇒ λ′(t) = 2t kgk² + 2 Re h f − ΠU f , gi,
hence λ′(0) = 2 Re h f − ΠU f , gi = 0, that is Re h f − ΠU f , gi = 0. Replacing g by ig, the same argument gives Im h f − ΠU f , gi = 0, which is the conclusion. □
This result has a remarkable application in Probability, where it is used to define the conditional expectation of a random variable with respect to a σ-algebra of events.
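In Rᵈ with the euclidean scalar product, the projection onto a finite dimensional subspace can be computed by solving the normal equations; here is a minimal sketch of properties (4.2.1)–(4.2.2) (the matrix U, whose columns span the subspace, and the vector f are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.standard_normal((5, 2))    # columns span a 2-dimensional subspace of R^5
f = rng.standard_normal(5)

# Normal equations: the coefficients c solve (U^T U) c = U^T f
c = np.linalg.solve(U.T @ U, U.T @ f)
Pf = U @ c                          # the orthogonal projection of f on the subspace

# Characterization (4.2.1): the residual f - Pf is orthogonal to the subspace
print(np.allclose(U.T @ (f - Pf), 0))   # True

# Best-approximation property (4.2.2): Pf beats any other element of the subspace
g = U @ rng.standard_normal(2)
print(np.linalg.norm(f - Pf) <= np.linalg.norm(f - g))   # True
```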
Definition 4.2.2. Given a subset V ⊂ H, we define the orthogonal of V as
V⊥ := {ψ ∈ H : hψ, φi = 0, ∀φ ∈ V }.
It is easy to check that V⊥ is always a closed linear subspace of H (exercise). An important consequence of the orthogonal projection theorem is that
Proposition 4.2.3. If U ⊂ H is a closed linear subspace of H then
∀ f ∈ H, ∃!φ ∈ U, ψ ∈ U ⊥ : f = φ + ψ.
We write H = U ⊕ U ⊥ .
Proof. First, for every f we can write f = ΠU f + ( f − ΠU f ) where ΠU f ∈ U and f − ΠU f ∈ U ⊥ .
This proves the existence of the decomposition. If now f = φ1 + ψ1 = φ2 + ψ2 then φ1 − φ2 = ψ2 − ψ1 .
But ψ2 − ψ1 ∈ U ⊥ therefore
0 = hφ1 − φ2, ψ2 − ψ1 i = kφ1 − φ2 k 2, =⇒ φ1 = φ2,
hence also ψ1 = ψ2 . This proves the uniqueness. 
An important consequence is the following
Corollary 4.2.4 (density test). Let S ⊂ H. Then Span(S) is dense in H iff S⊥ = {0}. In particular, a linear subspace S is dense in H iff S⊥ = {0}.
Proof. Necessity: assume Span(S) dense and take f ∈ H such that h f , φi = 0 for all φ ∈ S; by linearity, h f , φi = 0 for all φ ∈ Span(S). Because Span(S) is dense, there exists (φn) ⊂ Span(S) such that φn −→ f . But then
k f k² = h f , f i = lim_n h f , φn i = 0, =⇒ f = 0.
Sufficiency: assume S⊥ = {0}. Define V := { Σ_{j=1}^N c j φ j : N ∈ N, c j ∈ C, φ j ∈ S } = Span(S). V is a linear space but in general it is not closed; let W be its closure. The thesis consists then in proving that W = H. Assume by contradiction that W ⊊ H. Then H = W ⊕ W⊥ with W⊥ ⊋ {0}. On the other hand, because S ⊂ W, if f ∈ W⊥ then in particular h f , φi = 0 for all φ ∈ S, and by the hypothesis S⊥ = {0} it follows f = 0, whence W⊥ = {0}, contradicting the previous statement. □

4.3. Orthonormal bases


We want now to introduce the concept of orthonormal base for an inner product space. Being H in general infinite dimensional, a base should be an infinite system of vectors (e j ) ⊂ H such that
ei ⊥ e j for i ≠ j, ke j k = 1, and ∀ f ∈ H, f = Σ_j c j e j .
In principle the cardinality of the set (e j ) j∈J could be countable or uncountable according to J. Here we're only interested in the first case.
Definition 4.3.1. Let H be a Hilbert space. A countable set (en)n∈N ⊂ H is called orthonormal system if
hei, e j i = δi j , ∀i, j ∈ N.
An orthonormal system is said to be an orthonormal base for H if
∀ f ∈ H, ∃(cn) ⊂ C : f = Σ_n cn en .

Example 4.3.2. ℓ² has an orthonormal base.
Sol. — Just take en = (δkn)k∈N (the so called "canonical base"). It is straightforward to check that (en)n∈N is an orthonormal system. To check that it is a base, take f = ( f k ) ∈ ℓ². Then
f = Σ_k f k ek ,
and the series converges because Σ_k | f k |² < +∞, being f ∈ ℓ². □
If (en) is an orthonormal base for H and f = Σ_n cn en , then, by the continuity of the inner product,
h f , ek i = hΣ_n cn en, ek i = Σ_n cn hen, ek i = Σ_n cn δnk = ck .
In other words,
(4.3.1) f = Σ_n h f , en i en , ∀ f ∈ H.
This series is called abstract Fourier Series of f and the h f , en i are called abstract Fourier coefficients (usually the specification "abstract" is omitted). Combining the Pythagorean theorem with (4.3.1) we obtain the so called Parseval identity
(4.3.2) k f k² = Σ_n |h f , en i|².
Given an orthonormal system (en) we set
Span(en)n := { Σ_n f n en : the series converges in H } ≡ { Σ_n f n en : Σ_n | f n |² < +∞ }.
The second identity requires a justification. Indeed we have the
Proposition 4.3.3. Let (en)n be an orthonormal system in a Hilbert space H. Then
Σ_n f n en converges ⇐⇒ Σ_n | f n |² < +∞.
Proof. Because H is complete (being a Hilbert space), we have to check the Cauchy property for the sequence of partial sums SN := Σ_{n=0}^N f n en . If N > M, by the Pythagorean theorem,
kSN − SM k² = k Σ_{n=M+1}^N f n en k² = Σ_{n=M+1}^N k f n en k² = Σ_{n=M+1}^N | f n |² = sN − sM ,
where of course sN := Σ_{n=0}^N | f n |². Therefore (SN)N is Cauchy in H iff (sN)N is Cauchy in R, that is, Σ_n f n en converges in H iff Σ_n | f n |² converges in R. □
It is easy to check that Span(en)n is a closed subspace of H. Said in other terms, (en) is an orthonormal base iff Span(en)n = H. In the case Span(en)n ⊊ H, the Parseval identity (4.3.2) holds in a weaker form:
Proposition 4.3.4 (Bessel inequality). Let (en)n be an orthonormal system. Then
(4.3.3) Σ_n |h f , en i|² 6 k f k², ∀ f ∈ H.
Proof. Let V := Span(en)n . Then, for every f ∈ H, f = ΠV f + ( f − ΠV f ). By the Pythagorean theorem,
k f k² = kΠV f k² + k f − ΠV f k² > kΠV f k² = k Σ_n h f , en i en k² = Σ_n |h f , en i|². □

Two main questions arise when we consider orthonormal systems:
• When is a given orthonormal system (en) a base?
• More in general, how to know if a given Hilbert space has an orthonormal base and, in this case,
how to determine one?
The first question assumes we have a candidate orthonormal system to check. Sometimes this can be achieved directly, by proving that any f in the space coincides with its Fourier series with respect to the given system. This was, for instance, the case of ℓ² shown above. An intrinsic condition to check whether an orthonormal system (en) is a base follows by the density test:
Corollary 4.3.5. An orthonormal system (en)n is a base for a Hilbert space H iff the unique vector f ∈ H orthogonal to all the en is the null vector.
Proof. Necessity: evident, because if h f , en i = 0 for all n, being (en) a base, f = Σ_n h f , en i en = 0.
Sufficiency: it follows by the density test. Indeed, take U := Span(en). It is easy to recognize that if (en)⊥ = {0} then U⊥ = {0} as well. But then, by the density test, U is dense in H; being also closed (see above), U = H, i.e. (en) is a base. □
4.3.1. Haar base. Let H := L²([0, 1]) and define the Haar functions:
Definition 4.3.6. e₀(x) ≡ 1 and, for n > 1 and k = 1, . . . , 2ⁿ − 1 with k odd,
e_{k/2^n}(x) := 2^{(n−1)/2} for (k−1)/2^n 6 x < k/2^n, e_{k/2^n}(x) := −2^{(n−1)/2} for k/2^n 6 x < (k+1)/2^n, and e_{k/2^n}(x) := 0 otherwise.
[Figure: graph of e_{k/2^n}, equal to 2^{(n−1)/2} on [(k−1)/2^n, k/2^n), to −2^{(n−1)/2} on [k/2^n, (k+1)/2^n), and 0 elsewhere.]
Proposition 4.3.7. The Haar system is a base for L²([0, 1]).
Proof. The orthogonality can be easily checked as an exercise. Assume that f ⊥ e₀ and f ⊥ e_{k/2^n} for all k, n. Notice first that
0 = h f , e_{k/2^n} i₂ = 2^{(n−1)/2} ∫_{(k−1)/2^n}^{k/2^n} f (x) dx − 2^{(n−1)/2} ∫_{k/2^n}^{(k+1)/2^n} f (x) dx, =⇒ ∫_{(k−1)/2^n}^{k/2^n} f (x) dx = ∫_{k/2^n}^{(k+1)/2^n} f (x) dx.
Therefore
0 = h f , e₀ i₂ = ∫_0^1 f (x) dx = 2 ∫_0^{1/2} f (x) dx = 4 ∫_0^{1/4} f (x) dx = . . . = 2^n ∫_0^{1/2^n} f (x) dx,
and again, by the previous identity, ∫_{k/2^n}^{(k+1)/2^n} f (x) dx = 0 for all admissible n, k. By this it is easy to deduce that
∫_a^b f (x) dx = 0, ∀a, b ∈ { k/2^n : n ∈ N, k ∈ {0, 1, . . . , 2^n} } =: D,
and because D, the set of dyadic numbers, is dense in [0, 1], we can conclude that the previous identity holds for any a, b ∈ [0, 1]. It is now a standard job to conclude that ∫_E f = 0 for every Lebesgue measurable E ⊂ [0, 1], hence f = 0 a.e., and the conclusion follows. □
In particular,
f = h f , e₀ i e₀ + Σ_{n,k} h f , e_{k/2^n} i e_{k/2^n}, ∀ f ∈ L²([0, 1]).
This is among the simplest wavelet reconstruction formulas for a function f .
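As a concrete sketch, the truncated Haar expansion of f (x) = x can be computed by approximating each coefficient h f , e_{k/2^n} i with a Riemann sum (the grid size and the truncation level are arbitrary choices):

```python
import numpy as np

def haar(n, k, x):
    # e_{k/2^n}: +2^{(n-1)/2} on [(k-1)/2^n, k/2^n), -2^{(n-1)/2} on [k/2^n, (k+1)/2^n)
    s = 2.0 ** ((n - 1) / 2)
    return np.where((x >= (k - 1) / 2**n) & (x < k / 2**n), s,
           np.where((x >= k / 2**n) & (x < (k + 1) / 2**n), -s, 0.0))

# midpoint grid on [0, 1]; <f, g> on L^2([0,1]) is approximated by (f*g).mean()
x = (np.arange(4096) + 0.5) / 4096
f = x.copy()                        # f(x) = x

S = np.full_like(x, f.mean())       # <f, e_0> e_0, with e_0 = 1
for n in range(1, 7):               # truncation: levels n = 1..6
    for k in range(1, 2**n, 2):     # k odd
        e = haar(n, k, x)
        S += (f * e).mean() * e     # add <f, e_{k/2^n}> e_{k/2^n}

err = np.sqrt(((f - S) ** 2).mean())    # L^2([0,1]) error of the truncation
print(err < 1e-2)                       # True: the partial sums approach f
```

The truncated sum is piecewise constant on dyadic intervals of width 2⁻⁶, so the L² error is of order 2⁻⁶ here and halves with each extra level.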
4.4. Gram–Schmidt orthogonalization


Let's now take up the second question posed above, that is, when a given Hilbert space H admits an orthonormal base and how to determine one. A first remark is the following: if H has an orthonormal base (en), then the set
S := { Σ_{n=0}^N gn en : N ∈ N, (gn) ⊂ Q + iQ }
is countable and its closure is H. This is intuitively clear; the proof is a bit "bureaucratic". If f = Σ_n f n en with ( f n) ⊂ C, then f is approximated as well as we like by Σ_{n=0}^N f n en as N is big enough. Moreover, because of the density of Q in R, every f n ∈ C is approximated by a gn ∈ Q + iQ, and we can refine the approximation in such a way that Σ_{n=0}^N gn en approximates well Σ_{n=0}^N f n en . Therefore Σ_{n=0}^N gn en approximates f .

Definition 4.4.1. We say that H is separable if there exists S ⊂ H countable and dense in H.

Hence, if H has an orthonormal base it is also separable. This condition turns out to be also sufficient:

Theorem 4.4.2 (Gram–Schmidt). A Hilbert space H admits an orthonormal base iff H is separable. Moreover, if (vn) ⊂ H is a countable set of linearly independent vectors whose span is dense, then by setting
(4.4.1) e₀ := v₀/kv₀k, en := (vn − Σ_{j=0}^{n−1} hvn, e j i e j ) / k vn − Σ_{j=0}^{n−1} hvn, e j i e j k, (n > 1),
(en) is an orthonormal base.

Proof. The necessity has been proved in the premises. Let (vn) ⊂ H be a countable subset whose span is dense in H. We can assume that the vn are linearly independent, in the sense that
v_{n+1} ∉ Span(v₀, . . . , vn),
otherwise we eliminate v_{n+1} without any change. In particular vn ≠ 0 for every n. Now let's define the base: call Hn := Span(v₀, . . . , vn), let Π_{Hn} be the orthogonal projection over Hn and set
e₀ := v₀/kv₀k, en := (vn − Π_{H_{n−1}} vn) / kvn − Π_{H_{n−1}} vn k.
It is evident that i) ken k = 1 for every n; ii) en ⊥ em if n ≠ m. In fact it is enough to prove that e j ⊥ en as j < n. To this aim just notice that e₀, . . . , e_{n−1} ∈ H_{n−1}, while en ∝ vn − Π_{H_{n−1}} vn ⊥ H_{n−1} because of the properties of the orthogonal projection. Because
Span(e₀, . . . , en) = Span(v₀, . . . , vn) = Hn,
we have Π_{H_{n−1}} vn = Σ_{j=0}^{n−1} hvn, e j i e j , and by this (4.4.1) follows. Finally, by the previous remark it follows also that Span(en)n = Span(vn)n , which is dense in H. □
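Formula (4.4.1) translates directly into a few lines of code; here is a minimal sketch in R³ with the euclidean scalar product (the starting vectors are an arbitrary linearly independent family):

```python
import numpy as np

def gram_schmidt(vs):
    """Orthonormalize the list vs of vectors of R^d via formula (4.4.1)."""
    es = []
    for v in vs:
        # v_n - sum_{j<n} <v_n, e_j> e_j, then normalize
        w = v - sum(np.dot(v, e) * e for e in es)
        es.append(w / np.linalg.norm(w))
    return es

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(vs)

# Gram matrix of the output: identity iff (e_n) is an orthonormal system
G = np.array([[np.dot(a, b) for b in es] for a in es])
print(np.allclose(G, np.eye(3)))   # True
```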
4.4.1. Hermite polynomials. L²(R) is a very common framework in many applied settings. It can be easily (if technically) proved that it is separable, hence it admits an orthonormal base. In this Section we will exhibit a very particular one, very important for applications. To attack the problem, we start by changing slightly the setting, considering
H := { f : R −→ R : ∫_R | f (x)|² e^{−x²/2}/√(2π) dx < +∞ } = L²(R, N(0, 1)),
that is, the L² space with respect to the probability measure N(0, 1)(dx) := e^{−x²/2}/√(2π) dx, called also standard gaussian. It is possible to prove, and we accept it here, that H is a Hilbert space with scalar product and norm
h f , gi := ∫_R f (x) g(x) e^{−x²/2}/√(2π) dx, k f k² = ∫_R | f (x)|² e^{−x²/2}/√(2π) dx.
We notice that
1, x, x², . . . , x^n, . . . ∈ L²(R, N(0, 1)).
We will need the Fourier Transform to prove that
(4.4.2) Span(x^n : n ∈ N) is dense in L²(R, N(0, 1)).
Accepting this, let's determine an orthonormal base for L²(R, N(0, 1)) starting from (x^n)n . In general the x^n are not orthogonal: hx^n, x^m i = ∫_R x^{n+m} e^{−x²/2}/√(2π) dx = 0 iff n + m is odd. However, we can apply the Gram–Schmidt algorithm to "orthogonalize" the x^n. Let
en := (1/αn) (x^n − Σ_{j=0}^{n−1} hx^n, e j i e j ) =: Hn/kHn k.
So, for instance,
e₀ = 1/α₀ with α₀² = ∫_R e^{−y²/2}/√(2π) dy = 1, hence e₀(x) ≡ 1.
Next,
e₁ = (1/α₁)(x − hx, e₀i e₀) = x/α₁, since hx, e₀i = ∫_R y e^{−y²/2}/√(2π) dy = 0, and α₁² = ∫_R x² e^{−x²/2}/√(2π) dx = 1,
hence e₁(x) = x. Again
e₂ = (1/α₂)(x² − hx², e₀i e₀ − hx², e₁i e₁) = (1/α₂)(x² − hx², 1i − hx², xi x) = (1/α₂)(x² − 1).
The value of α₂ is
α₂² = ∫_R (x² − 1)² e^{−x²/2}/√(2π) dx = ∫_R x⁴ e^{−x²/2}/√(2π) dx − 2 ∫_R x² e^{−x²/2}/√(2π) dx + ∫_R e^{−x²/2}/√(2π) dx = 3 − 2 + 1 = 2.
In conclusion, e₂(x) = (x² − 1)/√2. It is clear that we can compute e₃, e₄, . . . in this way, but it looks difficult to find a "quick" recipe to compute en for every n. To do this, notice first that the Hn are polynomials, called Hermite polynomials. Precisely, Hn(x) = x^n + p_{n−1}(x), where p_{n−1} is a polynomial of degree at most n − 1. In particular Hn has degree n and
Span(H₀, . . . , Hn) ≡ Span(1, x, . . . , x^n).
Furthermore, by construction Hn ⊥ Hm for n ≠ m. In particular,
Hn ⊥ Span(H₀, . . . , H_{n−1}) ≡ Span(1, x, . . . , x^{n−1}).
Let’s see how to determine more efficiently the Hn . The first step is the
Proposition 4.4.3.
(4.4.3) Hn′ = n H_{n−1}.
Proof. First, by differentiating Hn,
Hn′ = n x^{n−1} + p_{n−1}′ = n H_{n−1} + q_{n−2} = n H_{n−1} + Σ_{j=0}^{n−2} c j H j ,
where q_{n−2} is a polynomial of degree at most n − 2. Now, taking the scalar product of both sides with Hk,
hHn′, Hk i = n hH_{n−1}, Hk i + ck kHk k² = ck kHk k², ∀k 6 n − 2.
On the other side, integrating by parts,
hHn′, Hk i = ∫_R Hn′ Hk e^{−x²/2}/√(2π) dx = − ∫_R Hn (Hk′ − x Hk) e^{−x²/2}/√(2π) dx = −hHn, Hk′ − x Hk i = 0,
because Hk′ − x Hk ∈ Span(1, x, . . . , x^{n−1}) for k 6 n − 2, and Hn ⊥ Span(1, x, . . . , x^{n−1}). The moral is ck = 0 for every k = 0, . . . , n − 2, which is the conclusion. □

The (4.4.3) is not a good rule to compute Hn: even knowing H_{n−1} we should proceed with an integration, which is not a problem, H_{n−1} being a polynomial, but it involves a free constant to be determined by other conditions. Notice that, in proving the previous Proposition, we proved the integration by parts formula
(4.4.4) hp′, qi = hp, xq − q′i, ∀p, q polynomials.
Indeed,
hp′, qi = ∫_R p′(x) q(x) e^{−x²/2}/√(2π) dx = − ∫_R p(x) (q(x) e^{−x²/2})′/√(2π) dx = − ∫_R p(x)(q′(x) − x q(x)) e^{−x²/2}/√(2π) dx = hp, xq − q′i.
By this we obtain easily the
Proposition 4.4.4.
(4.4.5) H_{n+1} = x Hn − Hn′.
Proof. Consider the polynomial x Hn − Hn′: by (4.4.4), hx Hn − Hn′, Hk i = hHn, Hk′ i = 0 for every k 6 n, while for k > n + 2 the orthogonality holds because x Hn − Hn′ has degree n + 1 < k. In particular x Hn − Hn′ ⊥ ek for every k ≠ n + 1 and, because (ek) is an orthonormal base,
x Hn − Hn′ = c_{n+1} e_{n+1} ≡ ĉ_{n+1} H_{n+1}.
Now, because x Hn − Hn′ = x(x^n + p_{n−1}) − (x^n + p_{n−1})′ = x^{n+1} + (x p_{n−1} − n x^{n−1} − p_{n−1}′) ≡ x^{n+1} + pn , we see immediately that ĉ_{n+1} = 1. □
Following the (4.4.5) we have, for instance,
H₃ = x H₂ − H₂′ = x(x² − 1) − 2x = x³ − 3x,
H₄ = x H₃ − H₃′ = x(x³ − 3x) − (3x² − 3) = x⁴ − 6x² + 3,
H₅ = x H₄ − H₄′ = x(x⁴ − 6x² + 3) − (4x³ − 12x) = x⁵ − 10x³ + 15x,
. . .
definitely much easier.
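The recurrence (4.4.5) is immediate to implement; the sketch below (using numpy's generic polynomial class) reproduces H₃, H₄ above and checks kH₃k² = 3! by Gauss–Hermite quadrature for the weight e^{−x²/2}:

```python
import numpy as np
from numpy.polynomial import Polynomial as P
from numpy.polynomial.hermite_e import hermegauss

x = P([0.0, 1.0])
H = [P([1.0])]                            # H_0 = 1
for n in range(5):
    H.append(x * H[-1] - H[-1].deriv())   # recurrence (4.4.5): H_{n+1} = x H_n - H_n'

print(np.allclose(H[3].coef, [0, -3, 0, 1]))       # True: H_3 = x^3 - 3x
print(np.allclose(H[4].coef, [3, 0, -6, 0, 1]))    # True: H_4 = x^4 - 6x^2 + 3

# ||H_n||^2 = n! in L^2(R, N(0,1)): check n = 3 by quadrature for the weight e^{-x^2/2}
xs, ws = hermegauss(8)
norm2 = (ws * H[3](xs) ** 2).sum() / np.sqrt(2 * np.pi)
print(np.isclose(norm2, 6.0))                      # True: 3! = 6
```

The division by √(2π) turns the quadrature (which integrates against e^{−x²/2}) into the integral against the standard gaussian probability measure.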
Let's now compute the norm of Hn to determine the scaling factor of en. We have
kHn k² = hHn, Hn i = hx H_{n−1} − H_{n−1}′, Hn i = hH_{n−1}, x Hn i − hH_{n−1}′, Hn i,
and by (4.4.4), hH_{n−1}′, Hn i = hH_{n−1}, x Hn − Hn′ i, hence
kHn k² = hH_{n−1}, Hn′ i = n hH_{n−1}, H_{n−1} i = n kH_{n−1} k² (by (4.4.3)).
Therefore
kHn k² = n kH_{n−1} k² = n(n − 1) kH_{n−2} k² = . . . = n! kH₀ k² = n!.
In conclusion,
(Hn(x)/√(n!))_{n∈N} is an orthonormal base for L²(R, N(0, 1)).
As a consequence,
(Hn(x) e^{−x²/4} / ((2π)^{1/4} √(n!)))_{n∈N} is an orthonormal base for L²(R).
4.5. Exercises
Exercise 4.5.1. Let V := { f ∈ C¹([0, 1]; R) : f (0) = 0 }. On V define
h f , gi := ∫_0^1 f ′(x) g′(x) dx.
i) Show that h·, ·i is a scalar product on V .
ii) Discuss if V is a Hilbert space.
Exercise 4.5.2. Let f (x) := cos x ∈ L²([0, 2π]). Determine the second degree polynomial closest to f in the L²([0, 2π]) norm.
Exercise 4.5.3. Solve
min_{a,b,c∈R} ∫_{−1}^{1} |x³ + ax² + bx + c|² dx.
Exercise 4.5.4. Solve
min_{a,b∈R} ∫_0^{+∞} |e^{−x} − (a e^{−2x} + b e^{−3x})|² dx.
Exercise 4.5.5. Solve
max { ∫_0^1 f (x) e^x dx : f ∈ L²([0, 1]), ∫_0^1 f² dx = 1 }.
Exercise 4.5.6 (?). Let H be a Hilbert space, f ∈ H and f 1, . . . , f n ∈ H linearly independent. Consider the problem
min_{a_1,...,a_n∈R} k f − Σ_{j=1}^n a j f j k.
i) Show that the minimum exists.
ii) Show that the solution is a vector a = (a₁, . . . , an) such that Ma = b, where M = (h f i, f j i) and b = (h f , f j i).
Exercise 4.5.7 (?). Let (en) be an orthonormal system in a Hilbert space H. We want to prove that
U := Span(en) = { Σ_n f n en : Σ_n | f n |² < +∞ }
is closed. Outline: let (gn) ⊂ U, gn = Σ_k f_k^n ek , be such that gn −→ g. The goal is to prove g ∈ U.
i) Use the Cauchy property of (gn) to deduce that ( f_k^n)n ⊂ C is convergent for every k ∈ N fixed.
ii) Call f k := lim_n f_k^n. Show that Σ_k | f k |² < +∞.
iii) Deduce g = Σ_k f k ek and conclude.
Exercise 4.5.8. Let ΠU be the orthogonal projection on a closed subspace U of a Hilbert space H. Prove that ΠU is symmetric, that is, hΠU φ, ψi = hφ, ΠU ψi for every φ, ψ ∈ H. (hint: recall that H = U ⊕ U⊥ . . . ).
Exercise 4.5.9. Let H be a Hilbert space, V ⊂ H a linear subspace. We define V⊥ := {φ ∈ H : hφ, vi = 0, ∀v ∈ V }.
i) Prove that V⊥ is a closed linear subspace.
ii) Prove that V ⊂ (V⊥)⊥ and that = holds if V is closed.
iii) (?) Discuss what happens if V is not closed.
Exercise 4.5.10. Let e₀, e_{k,n}, n ∈ N, k = 0, 1, . . . , 2ⁿ − 1, be the Haar base of L²([0, 1]). Determine the Fourier series of f (x) = x in this base. Discuss if the series is uniformly convergent, that is, if it converges in the L∞ norm. Suppose in particular that we consider the sum SN of the Fourier series of f truncated at n = N (N fixed). Give an estimate of k f − SN k₂ and k f − SN k∞.
Exercise 4.5.11 (Legendre polynomials (?)). Let H := L²([−1, 1]) and let vn := x^n, n = 0, 1, 2, . . ..
i) By applying the Gram–Schmidt algorithm to (vn)n, compute e₀, e₁, e₂, e₃.
ii) Let p₀(x) := 1, pn(x) := (1/(2ⁿ n!)) (dⁿ/dxⁿ)(x² − 1)ⁿ. Show that hpn, pm i = 0 if n ≠ m.
iii) Compute kpn k₂.
Exercise 4.5.12 (?). Let X = C([0, 1]) with the uniform norm k f k∞ := max_{x∈[0,1]} | f (x)|. Discuss whether or not the uniform norm is the norm associated to some scalar product. Hint: think of the "euclidean geometry" induced by a scalar product.
CHAPTER 5

Fourier Series

Historically, Fourier series arose a century before Hilbert space theory, to respond to a natural problem: is it possible to represent any periodic function f as a sum of elementary periodic functions? To present the problem, let's consider a generic T-periodic function f : R −→ R (that is, f (x + T) = f (x) for any x ∈ R). Classical examples of such f are the fundamental harmonics
sin((2π/T) nx), cos((2π/T) nx), n ∈ N.
We ask if it is always possible to say that
f (x) =? Σ_{n=0}^∞ (an cos((2π/T) nx) + bn sin((2π/T) nx)) = a₀ + Σ_{n=1}^∞ (an cos((2π/T) nx) + bn sin((2π/T) nx)).
The series on the r.h.s. are called trigonometric series. These are series of functions, therefore they can converge in several different ways. For instance:
• pointwise, that is the convergence holds for some x ∈ [0, T] fixed, as numerical series;
• uniformly, that is the convergence is respect to the uniform norm k · k∞ on [0, T];
• in mean, that is the convergence is respect to L 1 ([0, T]) norm k · k1 ;
• in quadratic mean, that is the convergence is respect to L 2 ([0, T]) norm k · k2 .
There are relations between these convergences but, as we will see, the L² convergence deserves a special position. Fourier series is a topic of Fourier Analysis, a well developed mathematical Theory that we will touch in this and in the next Chapter. Our aim is just to give an introduction and to show some first applications.

5.1. Preliminaries
To begin, we start by rewriting a trigonometric series in a more convenient form. Recalling the Euler identities
(5.1.1) cos θ = (e^{iθ} + e^{−iθ})/2, sin θ = (e^{iθ} − e^{−iθ})/(2i),
by some straightforward computation we can write
a₀ + Σ_{n=1}^∞ (an cos((2π/T) nx) + bn sin((2π/T) nx)) = a₀ + Σ_{n=1}^∞ ( ((an − i bn)/2) e^{i(2π/T)nx} + ((an + i bn)/2) e^{−i(2π/T)nx} ) = Σ_{n∈Z} cn e^{i(2π/T)nx}.
One easily checks that if we know the (cn) we can go back to the (an) and (bn):
(5.1.2) an = 2Re cn , bn = −2Im cn .
As in the introduction, there are many possible ways to consider convergence. Let
SN(x) := Σ_{|n|6N} cn e^{i(2π/T)nx}.

Definition 5.1.1. We say that the series Σ_{n∈Z} cn e^{i(2π/T)nx}
• converges pointwise at x ∈ [0, T] to S if lim_N SN(x) = S(x) exists finite as N −→ +∞, for x fixed;
• converges uniformly on [0, T] to S if (SN) converges in the k · k∞ norm to S, that is
kSN − Sk∞ = sup_{x∈[0,T]} |SN(x) − S(x)| −→ 0, as N −→ +∞.
It is clear that if (SN) converges uniformly then it converges pointwise at every x ∈ [0, T]. In fact:
|SN(x) − S(x)| 6 kSN − Sk∞ −→ 0, ∀x ∈ [0, T].
Let's introduce two other important convergences: the L¹ and the L² convergence. These are nothing but the convergence w.r.t. the L¹ and L² norms which, for convenience, will be slightly modified here as follows:
k f k₁ := (1/T) ∫_0^T | f (x)| dx, k f k₂ := ( (1/T) ∫_0^T | f (x)|² dx )^{1/2}.
Notice that, with these definitions, the L² norm is induced by the scalar product
h f , gi_{L²} := (1/T) ∫_0^T f (x) \overline{g(x)} dx.
Definition 5.1.2. We say that the series $\sum_{n\in\mathbb{Z}} c_n e^{i\frac{2\pi}{T}nx}$
• converges in mean to $S$ if $\|S_N - S\|_1 \longrightarrow 0$;
• converges in quadratic mean to $S$ if $\|S_N - S\|_2 \longrightarrow 0$.
According to the Hölder inequality,
\[ \|f\|_1 = \frac1T\int_0^T |f(x)|\,dx \le \frac1T\left(\int_0^T |f(x)|^2\,dx\right)^{1/2}\left(\int_0^T 1\,dx\right)^{1/2} = \left(\frac1T\int_0^T |f(x)|^2\,dx\right)^{1/2} = \|f\|_2. \]
In particular, if $(S_N)$ converges in quadratic mean to $S$ then $(S_N)$ converges in mean to $S$ too. In other words: $L^2$ convergence is stronger than $L^1$ convergence. Noticing also that
\[ \|f\|_1 \le \|f\|_2 = \left(\frac1T\int_0^T |f(x)|^2\,dx\right)^{1/2} \le \left(\frac1T\int_0^T \|f\|_\infty^2\,dx\right)^{1/2} = \|f\|_\infty, \]
we deduce that if $(S_N)$ converges uniformly to $S$, then $(S_N)$ converges also in quadratic mean to $S$ (hence in mean too). We recall also that, in general, $L^1$ convergence does not imply pointwise

convergence but, by Theorem 3.3.6, the pointwise convergence holds along a suitable subsequence $(S_{N_k})$. Summarizing:
\[ L^\infty \text{ conv.} \implies L^2 \text{ conv.} \implies L^1 \text{ conv.} \implies \text{pointwise conv. on } [0,T] \text{ (along a subsequence).} \]
We will also use the following notation to say that $\sum_{n\in\mathbb{Z}} c_n e^{i\frac{2\pi}{T}nx}$ converges, respectively, in the $L^\infty$, $L^2$, $L^1$ norms to $S$:
\[ S(x) \overset{L^\infty,\,L^2,\,L^1}{=} \sum_{n\in\mathbb{Z}} c_n e^{i\frac{2\pi}{T}nx}. \]
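As a quick numerical sanity check of the chain of norms above, the following sketch verifies $\|f\|_1 \le \|f\|_2 \le \|f\|_\infty$ with the normalized norms; the test function $f(x)=\sin(2\pi x/T)$ is an illustrative choice.

```python
import math

# Minimal sketch: check ||f||_1 <= ||f||_2 <= ||f||_inf for f(x) = sin(2*pi*x/T)
# on [0, T], with the normalized norms ||f||_1 = (1/T) int |f| dx and
# ||f||_2 = ((1/T) int |f|^2 dx)^(1/2).  The test function is an arbitrary choice.
T = 2 * math.pi
N = 100_000
xs = [T * (k + 0.5) / N for k in range(N)]          # midpoint quadrature nodes
vals = [abs(math.sin(2 * math.pi * x / T)) for x in xs]

norm1 = sum(vals) / N                               # (1/T) * int_0^T |f| dx
norm2 = math.sqrt(sum(v * v for v in vals) / N)     # ((1/T) * int |f|^2 dx)^(1/2)
norm_inf = max(vals)

print(norm1, norm2, norm_inf)   # approx 0.6366, 0.7071, 1.0
assert norm1 <= norm2 <= norm_inf
```

For this $f$ the exact values are $\|f\|_1 = 2/\pi$, $\|f\|_2 = 1/\sqrt2$, $\|f\|_\infty = 1$, so the chain of inequalities is strict here.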

5.2. Euler formulas


The advantage of the complex form is that we can easily find a formula for the coefficients $c_n$.
Theorem 5.2.1. Suppose that
\[ f(x) \overset{L^1}{=} \sum_{n\in\mathbb{Z}} c_n e^{i\frac{2\pi}{T}nx}. \]
Then the Euler formulas hold:
(5.2.1) \[ c_n = \frac1T\int_0^T f(x)e^{-i\frac{2\pi}{T}nx}\,dx, \quad n\in\mathbb{Z}. \]

Remark 5.2.2. Because L ∞ =⇒ L 2 =⇒ L 1 it should be clear that Euler’s formulas hold also for f ∈ L ∞
or f ∈ L 2 . 
Proof. Multiply both sides of $f = \sum_n c_n e^{i\frac{2\pi}{T}nx}$ by $e^{-i\frac{2\pi}{T}kx}$ and integrate over one period, for instance on the interval $[0,T]$. We have
\[ \int_0^T f(x)e^{-i\frac{2\pi}{T}kx}\,dx = \int_0^T \sum_n c_n e^{i\frac{2\pi}{T}(n-k)x}\,dx = \sum_n c_n \int_0^T e^{i\frac{2\pi}{T}(n-k)x}\,dx. \]
The last passage is justified by dominated convergence. Now
\[ \int_0^T e^{i\frac{2\pi}{T}(n-k)x}\,dx = \begin{cases} \displaystyle\int_0^T 1\,dx = T, & n = k,\\[8pt] \left[\dfrac{e^{i\frac{2\pi}{T}(n-k)x}}{i\frac{2\pi}{T}(n-k)}\right]_{x=0}^{x=T} = 0, & n \ne k, \end{cases} \]
so we deduce (5.2.1). $\Box$
The Euler formulas show, in particular, that if $f$ admits an $L^1$ convergent Fourier series, this series is unique. Its coefficients are called Fourier coefficients and we will use the notation
\[ \widehat{f}(n) := \frac1T\int_0^T f(x)e^{-i\frac{2\pi}{T}nx}\,dx. \]
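Numerically, the Euler formulas can be checked by approximating (5.2.1) with a uniform Riemann sum; for a trigonometric polynomial the quadrature recovers each coefficient essentially exactly. The coefficients below are an illustrative choice.

```python
import math, cmath

# Sketch: recover the coefficients of a trigonometric polynomial through the
# Euler formula c_n = (1/T) int_0^T f(x) e^{-i 2 pi n x / T} dx, approximated
# by a uniform Riemann sum.  The test coefficients are an arbitrary choice.
T = 2 * math.pi
coeffs = {-2: 0.5j, 0: 1.0, 1: 2.0, 3: -1.0 + 1.0j}

def f(x):
    return sum(c * cmath.exp(1j * 2 * math.pi * n * x / T) for n, c in coeffs.items())

def fourier_coeff(f, n, T, M=4096):
    # (1/T) int_0^T f(x) e^{-i 2 pi n x / T} dx via M uniform nodes
    return sum(f(T * k / M) * cmath.exp(-1j * 2 * math.pi * n * (T * k / M) / T)
               for k in range(M)) / M

for n in range(-4, 5):
    assert abs(fourier_coeff(f, n, T) - coeffs.get(n, 0.0)) < 1e-10
```

For trigonometric polynomials the uniform Riemann sum is exact up to round-off as soon as the number of nodes exceeds twice the degree; this is essentially the discrete Fourier transform.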
Some simple properties are often useful in calculating Fourier coefficients.

Proposition 5.2.3. Let $f\in L^1([0,T])$. Then
i) (invariance) $\widehat f(n) = \frac1T\int_a^{a+T} f(x)e^{-i\frac{2\pi}{T}nx}\,dx$, $\forall a\in\mathbb{R}$;
ii) (translation) $\widehat{f(\cdot-\tau)}(n) = e^{-i\frac{2\pi}{T}n\tau}\,\widehat f(n)$.
Proof. The proofs are simple checks. Let's see ii), leaving the rest as an exercise:
\[ \widehat{f(\cdot-\tau)}(n) = \frac1T\int_0^T f(x-\tau)e^{-i\frac{2\pi}{T}nx}\,dx \overset{y=x-\tau}{=} \frac1T\int_{-\tau}^{T-\tau} f(y)e^{-i\frac{2\pi}{T}n(y+\tau)}\,dy \overset{\text{i)}}{=} e^{-i\frac{2\pi}{T}n\tau}\,\frac1T\int_0^T f(y)e^{-i\frac{2\pi}{T}ny}\,dy = e^{-i\frac{2\pi}{T}n\tau}\,\widehat f(n). \quad\Box \]
Let’s see some examples.
Example 5.2.4 (square wave). Compute the Fourier series of
\[ f(x) := \begin{cases} 0, & x\in[0,\pi[,\\ 1, & x\in[\pi,2\pi[, \end{cases} \]
extended by periodicity to all of $\mathbb{R}$.
Sol. — We have
\[ \widehat f(n) = \frac{1}{2\pi}\int_0^{2\pi} f(x)e^{-inx}\,dx = \frac{1}{2\pi}\int_\pi^{2\pi} e^{-inx}\,dx = \begin{cases} \dfrac{1}{2\pi}\displaystyle\int_\pi^{2\pi} 1\,dx = \dfrac12, & n = 0,\\[8pt] \dfrac{1}{2\pi}\left[\dfrac{e^{-inx}}{-in}\right]_{x=\pi}^{x=2\pi} = \dfrac{1}{2\pi}\,\dfrac{1-e^{-in\pi}}{-in} = i\,\dfrac{1-(-1)^n}{2n\pi}, & n \ne 0. \end{cases} \]
Therefore the Fourier series for $f$ is
\[ \frac12 + \sum_{n\in\mathbb{Z}\setminus\{0\}} i\,\frac{1-(-1)^n}{2n\pi}\,e^{inx} = \frac12 + \sum_{k\in\mathbb{Z}} \frac{i}{(2k+1)\pi}\,e^{i(2k+1)x} = \frac12 - \sum_{k=0}^{\infty} \frac{2}{(2k+1)\pi}\,\sin((2k+1)x). \]
It is interesting to plot some partial sums of the series and to compare them with the graph of $f$. [Figure: partial sums with the first 5, 20 and 100 terms respectively, compared with the graph of $f$ on $[0,2\pi]$.] The pictures seem to indicate at least pointwise convergence for $x\in[0,2\pi]$, except at the discontinuity points of $f$ (that is, at $x = m\pi$). At those points notice that the value of the Fourier series is
\[ \frac12 - \sum_{k=0}^{\infty}\frac{2}{(2k+1)\pi}\sin((2k+1)m\pi) = \frac12 - \sum_{k=0}^{\infty}\frac{2}{(2k+1)\pi}\cdot 0 = \frac12. \]
Notice another interesting phenomenon: at the discontinuity points of $f$ the Fourier series seems to have some "unstable" behavior (the peaks in the plots), whose height does not decrease as the number of terms of the series increases. This is called Gibbs' phenomenon and was described by Gibbs in a letter to Nature in 1899 (keep in mind that in 1899 computers didn't exist. . . ). $\Box$
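The behavior just described is easy to reproduce numerically; the following sketch evaluates the partial sums of the square-wave series at a point of continuity and at the jump $x=\pi$.

```python
import math

# Partial sums S_K of the square-wave series
#   1/2 - sum_{k>=0} 2 sin((2k+1)x) / ((2k+1) pi).
def S(x, K):
    return 0.5 - sum(2.0 * math.sin((2 * k + 1) * x) / ((2 * k + 1) * math.pi)
                     for k in range(K))

print(S(3 * math.pi / 2, 5000))  # close to f(3pi/2) = 1 (convergence is slow)
print(S(math.pi, 5000))          # ~ 1/2 at the jump: every sine vanishes there
assert abs(S(3 * math.pi / 2, 5000) - 1.0) < 1e-3
assert abs(S(math.pi, 5000) - 0.5) < 1e-9
```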

Example 5.2.5 (triangular wave). Compute the Fourier series of
\[ f(x) := \begin{cases} x, & x\in[0,\pi[,\\ 2\pi - x, & x\in[\pi,2\pi[, \end{cases} \]
extended by periodicity to all of $\mathbb{R}$.


Sol. — By definition,
\[ \widehat f(0) = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,dx = \frac{1}{2\pi}\,\pi^2 = \frac{\pi}{2}, \]
while, for $n\ne 0$, integrating by parts,
\[ \widehat f(n) = \frac{1}{2\pi}\int_0^{2\pi} f(x)e^{-inx}\,dx = \frac{1}{2\pi}\left(\int_0^{\pi} x\,e^{-inx}\,dx + \int_\pi^{2\pi}(2\pi-x)\,e^{-inx}\,dx\right) \]
\[ = \frac{1}{2\pi}\left(\left[x\,\frac{e^{-inx}}{-in}\right]_{x=0}^{x=\pi} - \int_0^{\pi}\frac{e^{-inx}}{-in}\,dx + \left[(2\pi-x)\,\frac{e^{-inx}}{-in}\right]_{x=\pi}^{x=2\pi} + \int_\pi^{2\pi}\frac{e^{-inx}}{-in}\,dx\right) \]
\[ = \frac{1}{2\pi}\left(i\,\frac{(-1)^n\pi}{n} + \frac{1}{n^2}\big[e^{-inx}\big]_{x=0}^{x=\pi} - i\,\frac{(-1)^n\pi}{n} - \frac{1}{n^2}\big[e^{-inx}\big]_{x=\pi}^{x=2\pi}\right) \]
\[ = \frac{1}{2\pi}\cdot\frac{1}{n^2}\Big(\big((-1)^n - 1\big) - \big(1-(-1)^n\big)\Big) = \frac{(-1)^n - 1}{\pi n^2}. \]
Therefore the Fourier series is
\[ \frac{\pi}{2} + \sum_{n\ne 0}\frac{(-1)^n-1}{\pi n^2}\,e^{inx} = \frac{\pi}{2} - \frac{2}{\pi}\sum_{k\in\mathbb{Z}}\frac{1}{(2k+1)^2}\,e^{i(2k+1)x}. \]
For the real form we have $a_0 = \widehat f(0) = \frac{\pi}{2}$ (the constant term), $a_n = 2\,\mathrm{Re}\,\widehat f(n) = 2\,\frac{(-1)^n-1}{\pi n^2}$ for $n\ge 1$, whereas $b_n = -2\,\mathrm{Im}\,\widehat f(n) = 0$. Therefore the real form is
\[ \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=0}^{\infty}\frac{1}{(2k+1)^2}\cos((2k+1)x). \]
Also in this case let's look at some plots of partial sums. [Figure: partial sums with 1, 4 and 15 terms respectively, compared with the graph of $f$ on $[0,2\pi]$.] Here we see clearly that the convergence seems pointwise and even uniform to $f$. Not only that: it also seems much "better" than in the previous example. This indicates that regularity plays an important role in the rate of

convergence. Finally, if how it seems the sum of the series is f (x) for any x, taking x = 0 we get the remarkable
identity
∞ ∞
π 4X 1 π2 X 1
0= − , ⇐⇒ = . 
2 π k=0 (2k + 1) 2 8 k=0
(2k + 1) 2
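The identity can be checked numerically; the sketch below sums the series with an arbitrarily chosen number of terms.

```python
import math

# Numerical check of  pi^2/8 = sum_{k>=0} 1/(2k+1)^2,
# obtained by evaluating the triangular-wave Fourier series at x = 0.
s = sum(1.0 / (2 * k + 1) ** 2 for k in range(200_000))
print(s, math.pi ** 2 / 8)   # the tail beyond K terms is about 1/(4K)
assert abs(s - math.pi ** 2 / 8) < 1e-5
```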

5.3. Properties of Fourier coefficients


The previous examples determine the Fourier series of a given $f$. However, we don't know yet whether this series converges, and to what. We can, however, try to guess something from the Examples. A first remark is that regularity seems to play an important role. In fact, the square wave has discontinuities while the triangular wave is continuous. Both are differentiable except at certain points; the second is not differentiable at some points but, unlike the first one, at these points left and right derivatives exist. The Fourier coefficients seem to be influenced by this: in both cases they go to 0 as $|n|\to+\infty$, but for the triangular wave this happens faster.
The following result shows that these remarks are not incidental. For simplicity we assume a relatively strong regularity hypothesis; the result has other versions with weaker assumptions.
Theorem 5.3.1. If $f\in C^k([0,T])$ is $T$-periodic (together with its derivatives), then
(5.3.1) \[ \widehat{f^{(k)}}(n) = \left(i\,\frac{2\pi}{T}n\right)^{k}\widehat f(n), \quad \forall n\in\mathbb{Z}. \]
In particular
(5.3.2) \[ |\widehat f(n)| \le \frac{C}{|n|^k}. \]
Proof. We consider only the case $f\in C^1$ (a similar proof works for $f\in C^k$). Integrating by parts,
\[ \widehat{f'}(n) = \frac1T\int_0^T f'(x)e^{-i\frac{2\pi}{T}nx}\,dx = \frac1T\Big[f(x)e^{-i\frac{2\pi}{T}nx}\Big]_{x=0}^{x=T} - \frac1T\int_0^T f(x)\Big(-i\frac{2\pi}{T}n\Big)e^{-i\frac{2\pi}{T}nx}\,dx \]
\[ = \frac1T\big(f(T)e^{-i2\pi n} - f(0)\big) + i\,\frac{2\pi}{T}n\,\frac1T\int_0^T f(x)e^{-i\frac{2\pi}{T}nx}\,dx = i\,\frac{2\pi}{T}n\,\widehat f(n), \]
because of the periodicity. This proves (5.3.1). Moreover, for $n\ne 0$,
\[ |\widehat f(n)| = \frac{T}{2\pi|n|}\,|\widehat{f'}(n)| \le \frac{T}{2\pi}\,\frac{\|f'\|_1}{|n|} \equiv \frac{C}{|n|}. \quad\Box \]
A quick way to denote (5.3.2) is to write
(5.3.3) \[ f\in C^k([0,T]) \implies \widehat f(n) = O\!\left(\frac{1}{|n|^k}\right). \]
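The two Examples above illustrate (5.3.3) numerically: for the discontinuous square wave the coefficients decay like $1/|n|$, for the continuous triangular wave like $1/n^2$. A sketch (the quadrature size is an illustrative choice):

```python
import math, cmath

# Sketch comparing the decay (5.3.3) of Fourier coefficients on [0, 2pi]:
# the discontinuous square wave has |f^(n)| = 1/(|n| pi) for odd n,
# the continuous triangular wave has |f^(n)| = 2/(pi n^2) for odd n.
def coeff(f, n, M=20000):
    T = 2 * math.pi
    return sum(f(T * k / M) * cmath.exp(-1j * n * (T * k / M)) for k in range(M)) / M

def square(x):
    return 0.0 if (x % (2 * math.pi)) < math.pi else 1.0

def triangle(x):
    r = x % (2 * math.pi)
    return r if r < math.pi else 2 * math.pi - r

c_sq = abs(coeff(square, 25))    # exact value: 1/(25 pi)
c_tr = abs(coeff(triangle, 25))  # exact value: 2/(pi 25^2)
print(c_sq, c_tr)
assert abs(c_sq - 1 / (25 * math.pi)) < 1e-3
assert abs(c_tr - 2 / (math.pi * 25 ** 2)) < 1e-4
```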
In particular we have the
Corollary 5.3.2. If f ∈ C 2 ([0, T]) then the FS of f converges uniformly.

Proof. By (5.3.2), $|\widehat f(n)| \le \frac{C}{n^2}$ for $n\ne 0$. To prove that the FS converges in the uniform norm, we apply the Weierstrass test 3.3.11: we have
\[ \sum_{n\in\mathbb{Z}}\Big\|\widehat f(n)e^{i\frac{2\pi}{T}nx}\Big\|_\infty = \sum_{n\in\mathbb{Z}}|\widehat f(n)| \le |\widehat f(0)| + \sum_{n\ne 0}\frac{C}{n^2} < +\infty. \quad\Box \]

This is not a particularly interesting result: it is very restrictive (it cannot be applied to either of the previous Examples) and, above all, it says nothing about the sum of the FS. We will improve this result in the next Section.
A final remark concerns the interpretation of the FS in the $L^2$ setting: defining
\[ e_n(x) := e^{i\frac{2\pi}{T}nx}, \quad x\in[0,T], \]
the Euler formulas read as
\[ \widehat f(n) = \frac1T\int_0^T f(x)e^{-i\frac{2\pi}{T}nx}\,dx = \langle f, e_n\rangle_{L^2}. \]
Notice also that
\[ \langle e_n, e_m\rangle_{L^2} = \frac1T\int_0^T e^{i\frac{2\pi}{T}nx}e^{-i\frac{2\pi}{T}mx}\,dx = \frac1T\int_0^T e^{i\frac{2\pi}{T}(n-m)x}\,dx = \begin{cases} \dfrac1T\displaystyle\int_0^T 1\,dx = 1, & n = m,\\[8pt] \dfrac1T\left[\dfrac{e^{i\frac{2\pi}{T}(n-m)x}}{i\frac{2\pi}{T}(n-m)}\right]_{x=0}^{x=T} = 0, & n \ne m. \end{cases} \]
In particular: $(e_n)$ is an orthonormal system. Therefore
\[ f \overset{L^2}{=} \sum_n \widehat f(n)e_n \iff (e_n) \text{ is a basis for } L^2([0,T]). \]
At this point, we don't know whether $(e_n)$ is a basis or, equivalently, whether every $f\in L^2$ is the sum of its Fourier series. This will be proved in the next Section. In any case, $(e_n)$ being an orthonormal system, by the Bessel inequality (4.3.3) we have
(5.3.4) \[ \sum_n |\widehat f(n)|^2 \le \|f\|_2^2, \quad \forall f\in L^2([0,T]). \]

This says a non-trivial fact:
Corollary 5.3.3. If $f\in L^2([0,T])$ then the FS of $f$ converges in $L^2$.
Notice also that, by the Bessel inequality (5.3.4),
\[ \widehat f(n) \longrightarrow 0, \quad |n|\to+\infty. \]
In other words, the Fourier coefficients of an $L^2$ function vanish as $|n|\to+\infty$. This fact turns out to be true also for $f\in L^1([0,T])$ (which contains, but is not contained in, $L^2([0,T])$; for instance $\frac{1}{\sqrt x}\in L^1\setminus L^2$):

Lemma 5.3.4 (Riemann–Lebesgue). Let $f\in L^1([0,T])$. Then $\widehat f(n)\longrightarrow 0$.



Proof. We already proved that if $g\in C^1([0,T])$ then $|\widehat g(n)| \le \frac{C}{|n|}$. Now, let $f\in L^1([0,T])$: because $C^1([0,T])$ is dense in $L^1$, we have
\[ \forall\varepsilon>0,\ \exists g\in C^1([0,T]) :\ \|f-g\|_1 \le \varepsilon. \]
But then
\[ |\widehat f(n)| = \big|\widehat{f-g}(n) + \widehat g(n)\big| \le |\widehat{f-g}(n)| + |\widehat g(n)| \le \|f-g\|_1 + \frac{C}{|n|} \le \varepsilon + \varepsilon = 2\varepsilon, \quad \forall |n|\ge N(\varepsilon). \]
But this means $\widehat f(n)\longrightarrow 0$. $\Box$

5.4. Convergence of a Fourier series


In this Section we discuss some of the fundamental results concerning the convergence of the Fourier series of a given function $f$. We start, in a sense, with one of the weakest convergences, the pointwise one.
Theorem 5.4.1. Let $f\in C([0,T])$ be $T$-periodic and suppose that $f'(x)$ exists at some $x\in[0,T]$. Then the Fourier series of $f$ converges pointwise at $x$ to $f(x)$, that is
\[ f(x) = \sum_{n\in\mathbb{Z}}\widehat f(n)e^{i\frac{2\pi}{T}nx}. \]


Proof. Let $S_N(x) := \sum_{|n|\le N}\widehat f(n)e^{i\frac{2\pi}{T}nx}$ and let's rework its definition: we have
\[ S_N(x) = \sum_{|n|\le N} e^{i\frac{2\pi}{T}nx}\,\frac1T\int_0^T f(y)e^{-i\frac{2\pi}{T}ny}\,dy = \frac1T\int_0^T f(y)\sum_{|n|\le N} e^{i\frac{2\pi}{T}n(x-y)}\,dy =: \frac1T\int_0^T f(y)D_N(x-y)\,dy, \]
where $D_N$ is called the Dirichlet kernel. By the change of variable $u = x-y$, $y = x-u$, we have
\[ S_N(x) = \frac1T\int_{x-T}^{x} f(x-u)D_N(u)\,du \equiv \frac1T\int_{-T/2}^{T/2} f(x-u)D_N(u)\,du, \]
because everything in the integral is $T$-periodic, hence the integral does not depend on the specific period. The reason why we choose $[-T/2,T/2]$ as period is a certain symmetry of the Dirichlet kernel. It is indeed evident that
\[ D_N(-u) = D_N(u),\ \forall u, \qquad \text{and}\qquad \frac1T\int_{-T/2}^{T/2} D_N(u)\,du = \frac1T\sum_{|n|\le N}\int_0^T e^{i\frac{2\pi}{T}nu}\,du = 1. \]
In particular
(5.4.1) \[ S_N(x) - f(x) = \frac1T\int_{-T/2}^{T/2}\big(f(x-u)-f(x)\big)D_N(u)\,du. \]

Notice also that $D_N$ is only apparently complex, because
\[ \sum_{|n|\le N} e^{inz} = e^{-iNz}\sum_{k=0}^{2N} e^{ikz} = e^{-iNz}\sum_{k=0}^{2N}(e^{iz})^k = e^{-iNz}\,\frac{e^{i(2N+1)z}-1}{e^{iz}-1} = \frac{e^{i(N+\frac12)z} - e^{-i(N+\frac12)z}}{e^{i\frac z2} - e^{-i\frac z2}} = \frac{\sin\big((N+\frac12)z\big)}{\sin\frac z2}. \]
Setting $z = \frac{2\pi}{T}u$,
\[ D_N(u) = \frac{\sin\big((N+\frac12)\frac{2\pi}{T}u\big)}{\sin\frac{\pi}{T}u}. \]
[Figure 1. Plots of $D_3$, $D_6$, $D_9$, $T = 2\pi$.]
Returning to (5.4.1) we have


\[ S_N(x)-f(x) = \frac1T\int_{-T/2}^{T/2}\frac{f(x-u)-f(x)}{\sin\frac{\pi}{T}u}\,\sin\Big(\big(N+\tfrac12\big)\tfrac{2\pi}{T}u\Big)\,du \]
\[ = \frac1T\int_{-T/2}^{T/2}\frac{f(x-u)-f(x)}{\sin\frac{\pi}{T}u}\cdot\frac{e^{i\frac{2\pi}{T}Nu}e^{i\frac{\pi}{T}u} - e^{-i\frac{2\pi}{T}Nu}e^{-i\frac{\pi}{T}u}}{2i}\,du \]
\[ = \frac1T\int_{-T/2}^{T/2}\underbrace{e^{i\frac{\pi}{T}u}\,\frac{f(x-u)-f(x)}{2i\sin\frac{\pi}{T}u}}_{g(u)}\,e^{i\frac{2\pi}{T}Nu}\,du \;-\; \frac1T\int_{-T/2}^{T/2}\underbrace{e^{-i\frac{\pi}{T}u}\,\frac{f(x-u)-f(x)}{2i\sin\frac{\pi}{T}u}}_{h(u)}\,e^{-i\frac{2\pi}{T}Nu}\,du \]
\[ = \widehat g(-N) - \widehat h(N). \]
Now, by our assumptions it follows easily that $g, h\in C([-T/2,T/2])\subset L^1$ (the differentiability of $f$ at $x$ controls the ratio near $u=0$). By the Riemann–Lebesgue Lemma, $\widehat g(-N), \widehat h(N)\longrightarrow 0$ as $N\to+\infty$, and from this the conclusion follows. $\Box$
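The closed form of the Dirichlet kernel used in the proof can be verified directly against its defining sum; a minimal sketch with $T=2\pi$:

```python
import math, cmath

# Check D_N(u) = sin((N + 1/2) (2 pi / T) u) / sin(pi u / T)
# against the defining sum  sum_{|n|<=N} e^{i 2 pi n u / T},  here with T = 2 pi.
T = 2 * math.pi

def D_sum(u, N):
    return sum(cmath.exp(1j * 2 * math.pi * n * u / T) for n in range(-N, N + 1)).real

def D_closed(u, N):
    return math.sin((N + 0.5) * 2 * math.pi * u / T) / math.sin(math.pi * u / T)

for N in (3, 6, 9):
    for u in (0.1, 1.0, 2.5, -1.3):
        assert abs(D_sum(u, N) - D_closed(u, N)) < 1e-10

# The normalization (1/T) int_{-T/2}^{T/2} D_N = 1, via the midpoint rule:
avg = sum(D_closed(-math.pi + 2 * math.pi * (k + 0.5) / 4000, 6)
          for k in range(4000)) / 4000
assert abs(avg - 1.0) < 1e-8
```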
This explains, for instance, the Examples of the square wave (except at the jumps) and of the triangular wave (at every point in this case). With a slight modification of the previous proof we can also treat the case when $f$ has a discontinuity:

Theorem 5.4.2. Assume that $f\in C([0,T]\setminus\{x_0\})$ is such that
i) there exist $f(x_0-)$, $f(x_0+)$ (the limits of $f$ as $x\to x_0^\pm$);
ii) there exist the left and right derivatives $f'_-(x_0)$, $f'_+(x_0)$.
Then the Fourier series of $f$ converges at the point $x_0$ to $\frac{f(x_0-)+f(x_0+)}{2}$.
Proof. Omitted (not an easy exercise: adapt the final step of the previous proof). $\Box$
Pointwise convergence is really a weak type of convergence. The next Theorem shows that by adding some regularity to $f$ its Fourier series becomes uniformly convergent.
Theorem 5.4.3. Let $f\in C^1([0,T])$ be $T$-periodic. Then the Fourier series of $f$ converges uniformly to $f$ and the following convergence rate estimate holds:
(5.4.2) \[ \|S_N - f\|_\infty \le \frac{C\|f'\|_2}{\sqrt N}. \]
Proof. We discuss first the uniform convergence of the Fourier series of $f$, by applying the Weierstrass test 3.3.11. To this aim, notice that
\[ \sum_{n\in\mathbb{Z}}\Big\|\widehat f(n)e^{i\frac{2\pi}{T}nx}\Big\|_\infty = \sum_{n\in\mathbb{Z}}|\widehat f(n)| = |\widehat f(0)| + \sum_{|n|\ge 1}|\widehat f(n)|. \]
Now, by (5.3.1), $\widehat{f'}(n) = i\frac{2\pi}{T}n\,\widehat f(n)$, whence for $n\ne 0$,
\[ |\widehat f(n)| \le \frac{C}{|n|}\,|\widehat{f'}(n)|. \]
Therefore, by the Cauchy–Schwarz inequality,
\[ \sum_{|n|\ge 1}|\widehat f(n)| \le \sum_{|n|\ge 1}\frac{C}{|n|}\,|\widehat{f'}(n)| \le C\left(\sum_{|n|\ge 1}\frac{1}{|n|^2}\right)^{1/2}\left(\sum_{|n|\ge 1}|\widehat{f'}(n)|^2\right)^{1/2}. \]
According to the Bessel inequality (5.3.4),
\[ \sum_{|n|\ge 1}|\widehat{f'}(n)|^2 \le \|f'\|_2^2, \]
hence $\sum_{|n|\ge 1}|\widehat f(n)| < +\infty$. This says that the Fourier series of $f$ converges uniformly. About the rate,
\[ \|S_N - f\|_\infty = \Big\|\sum_{|n|>N}\widehat f(n)e_n\Big\|_\infty \le \sum_{|n|>N}|\widehat f(n)| \le C\|f'\|_2\left(\sum_{|n|>N}\frac{1}{n^2}\right)^{1/2} \le C'\|f'\|_2\left(\int_N^{+\infty}\frac{du}{u^2}\right)^{1/2} = \frac{C'\|f'\|_2}{\sqrt N}. \quad\Box \]

Let's now discuss the $L^2$ case. As expected, we have the
Theorem 5.4.4. Every $f\in L^2([0,T])$ is the sum of its Fourier series, convergent according to the $L^2$ norm. In particular, $\big(e^{i\frac{2\pi}{T}nx}\big)_{n\in\mathbb{Z}}$ is an orthonormal basis for $L^2([0,T])$. Furthermore, the Parseval identity holds:
(5.4.3) \[ \frac1T\int_0^T |f(x)|^2\,dx = \sum_{n\in\mathbb{Z}}|\widehat f(n)|^2. \]

Proof. It is possible to prove (intuitively clear, but it requires a non-trivial proof) that every $f\in L^2$ can be approximated in the $L^2$ norm by a $C^1$ function. So take $g\in C^1$ such that $\|f-g\|_2\le\varepsilon$. By the previous Theorem,
\[ g \overset{\|\cdot\|_\infty}{=} \sum_n \widehat g(n)e_n \overset{\|\cdot\|_2\le\|\cdot\|_\infty}{\implies} g \overset{\|\cdot\|_2}{=} \sum_n \widehat g(n)e_n. \]
In particular, for $N\ge N(\varepsilon)$, $\big\|g - \sum_{|n|\le N}\widehat g(n)e_n\big\|_2 \le \varepsilon$, whence
\[ \Big\|f - \sum_{|n|\le N}\widehat f(n)e_n\Big\|_2 \le \|f-g\|_2 + \Big\|g - \sum_{|n|\le N}\widehat g(n)e_n\Big\|_2 + \Big\|\sum_{|n|\le N}\big(\widehat g(n)-\widehat f(n)\big)e_n\Big\|_2 \le 2\varepsilon + \Big\|\sum_{|n|\le N}\big(\widehat g(n)-\widehat f(n)\big)e_n\Big\|_2. \]
Now, because $(e_n)$ is an orthonormal system, by the Pythagorean theorem,
\[ \Big\|\sum_{|n|\le N}\big(\widehat g(n)-\widehat f(n)\big)e_n\Big\|_2^2 = \sum_{|n|\le N}|\widehat g(n)-\widehat f(n)|^2 = \sum_{|n|\le N}|\widehat{g-f}(n)|^2 \overset{\text{Bessel}}{\le} \|g-f\|_2^2 \le \varepsilon^2, \]
and by this we conclude that
\[ \Big\|f - \sum_{|n|\le N}\widehat f(n)e_n\Big\|_2 \le 3\varepsilon, \quad \forall N\ge N(\varepsilon). \]
This proves that $(e_n)_n$ is an orthonormal basis for $L^2$. The Parseval identity follows from its abstract alter ego (4.3.2): just notice that in the present setting
\[ \|f\|_2^2 = \frac1T\int_0^T|f(x)|^2\,dx. \quad\Box \]
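Parseval's identity can be verified on the square wave of Example 5.2.4, where $\widehat f(0)=\frac12$ and $|\widehat f(n)|^2 = \frac{1}{\pi^2 n^2}$ for odd $n$ (zero for even $n\ne 0$). A sketch:

```python
import math

# Parseval check for the square wave of Example 5.2.4:
# (1/2pi) int_0^{2pi} |f|^2 dx = 1/2  must equal
# |f^(0)|^2 + sum_{n odd} 1/(pi^2 n^2) = 1/4 + (2/pi^2) sum_{k>=0} 1/(2k+1)^2.
lhs = 0.5
rhs = 0.25 + sum(2.0 / (math.pi ** 2 * (2 * k + 1) ** 2) for k in range(100_000))
print(lhs, rhs)
assert abs(lhs - rhs) < 1e-5
```

Note that the two quarters on the right come from the constant term and from the odd harmonics respectively, in agreement with $\sum_{k\ge0}(2k+1)^{-2}=\pi^2/8$.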

5.5. Applications
In this section we show some applications of Fourier series. Among these, the most remarkable are devoted to solving Partial Differential Equations (PDEs). The advantage of Fourier series is to provide exact and explicit solutions to certain equations. The drawback is that the equations solvable through Fourier series must be of a very particular nature, so the method is extremely limited and not suitable for many practical problems. However, it provides the basis for further methods built on different approximations of the solutions through special functions (such as trigonometric polynomials). The emphasis will be on the applications and not on the formal justification of all the details.

5.5.1. Heat diffusion. Consider a rod, which we model as a one-dimensional segment $[0,L]$. For each point $x\in[0,L]$, $u(t,x)$ will denote the temperature at point $x$ at time $t$. The evolution of the temperature is driven by the heat equation, which is a PDE of the type
\[ \partial_t u(t,x) = \frac{\sigma^2(x)}{2}\,\partial_{xx}u(t,x) + f(t,x). \]
Here $\sigma>0$ is the conductivity and $f$ represents an external heat source. Here we will assume $f\equiv 0$ and $\sigma(x)\equiv\sigma$. A typical problem consists in determining the future temperature in the rod given the initial temperature $u(0,x)$, $x\in[0,L]$. If we impose that the solution be periodic, that is
\[ u(t,0) = u(t,L), \]
we can solve the problem
\[ \begin{cases} \partial_t u(t,x) = \frac{\sigma^2}{2}\partial_{xx}u(t,x), & (t,x)\in[0,+\infty[\times[0,L],\\ u(t,0) = u(t,L), & t\ge 0,\\ u(0,x) = \varphi(x), & x\in[0,L], \end{cases} \]
via Fourier series. Let's see how.
Because $u(t,\cdot)$ is an $L$-periodic function, we can write
\[ u(t,x) = \sum_n c_n e^{i\frac{2\pi}{L}nx}. \]
Of course this identity should hold for every $t\ge 0$, therefore it is natural to look for $c_n = c_n(t)$ to be determined. The initial state $\varphi$ is known and periodic, hence
\[ \varphi(x) = \sum_n \widehat\varphi(n)e^{i\frac{2\pi}{L}nx}, \]
where the $\widehat\varphi(n)$ are known, $\varphi$ being known. Therefore
\[ u(0,x) = \varphi(x) \iff c_n(0) = \widehat\varphi(n),\ n\in\mathbb{Z}. \]
How can we determine $c_n(t)$ for $t>0$? By imposing that $u$ fulfill the heat equation. To do this we have to compute $\partial_t u$ and $\partial_{xx}u$. Leaving aside the formal justification, we can write
\[ \partial_t u = \partial_t\sum_n c_n(t)e^{i\frac{2\pi}{L}nx} = \sum_n c_n'(t)e^{i\frac{2\pi}{L}nx}, \qquad \partial_{xx}u = \sum_n c_n(t)\Big(i\frac{2\pi}{L}n\Big)^2 e^{i\frac{2\pi}{L}nx} = -\frac{4\pi^2}{L^2}\sum_n n^2 c_n(t)e^{i\frac{2\pi}{L}nx}. \]
By these,
\[ \partial_t u = \frac{\sigma^2}{2}\partial_{xx}u \iff c_n'(t) = -\frac{2\pi^2\sigma^2}{L^2}n^2\,c_n(t),\ n\in\mathbb{Z}. \]
This is an infinite system of differential equations that we can solve easily:
\[ c_n(t) = e^{-\frac{2\pi^2\sigma^2}{L^2}n^2 t}\,c_n(0) = e^{-\frac{2\pi^2\sigma^2}{L^2}n^2 t}\,\widehat\varphi(n). \]
Conclusion: a formal candidate to solve the heat equation is
(5.5.1) \[ u(t,x) = \sum_{n\in\mathbb{Z}} e^{-\frac{2\pi^2\sigma^2}{L^2}n^2 t}\,\widehat\varphi(n)\,e^{i\frac{2\pi}{L}nx}. \]
The previous argument contains delicate passages. For example, we said that $\partial_t\sum_n c_n(t)e_n(x) = \sum_n c_n'(t)e_n(x)$. Is this correct? If the sum were finite, there would be no problem. But the sum is infinite and we should be careful with such kinds of manoeuvres. In fact, one can take (5.5.1) and check directly that the series can be differentiated term by term. We skip this.

Let's see what happens if $f\ne 0$. Of course we should expect that, to preserve the periodicity of the solution, $f$ must be $x$-periodic too, that is $f(t,0) = f(t,L)$. Therefore we can write
\[ f(t,x) = \sum_n \widehat f_n(t)e_n(x), \quad \text{where } \widehat f_n(t) := \langle f(t,\cdot), e_n\rangle_{L^2}. \]
Therefore, the heat equation assumes the form
\[ \sum_n c_n'(t)e_n(x) = \sum_n\Big(-\frac{2\pi^2\sigma^2}{L^2}n^2 c_n(t) + \widehat f_n(t)\Big)e_n(x) \iff c_n'(t) = -\frac{2\pi^2\sigma^2}{L^2}n^2 c_n(t) + \widehat f_n(t). \]
This is a first order non-homogeneous linear equation. We dispose of its general solution:
\[ c_n(t) = e^{-\frac{2\pi^2\sigma^2}{L^2}n^2 t}\Big(c_n(0) + \int_0^t e^{\frac{2\pi^2\sigma^2}{L^2}n^2 s}\widehat f_n(s)\,ds\Big) = e^{-\frac{2\pi^2\sigma^2}{L^2}n^2 t}\,\widehat\varphi(n) + \int_0^t e^{-\frac{2\pi^2\sigma^2}{L^2}n^2(t-s)}\widehat f_n(s)\,ds. \]
Therefore,
\[ u(t,x) = \sum_n\Big(e^{-\frac{2\pi^2\sigma^2}{L^2}n^2 t}\,\widehat\varphi(n) + \int_0^t e^{-\frac{2\pi^2\sigma^2}{L^2}n^2(t-s)}\widehat f_n(s)\,ds\Big)e^{i\frac{2\pi}{L}nx}. \]
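The formulas above translate directly into a small spectral solver. The sketch below (with illustrative parameter values and a single-mode initial datum, for which the exact solution is known) checks the homogeneous solution (5.5.1):

```python
import math, cmath

# Minimal spectral solver for  u_t = (sigma^2/2) u_xx  with periodic data,
# following (5.5.1): c_n(t) = exp(-2 pi^2 sigma^2 n^2 t / L^2) * phihat(n).
# L, sigma, the grid size M and nmax are all illustrative choices.
L, sigma = 1.0, 0.3
M = 256
xs = [L * k / M for k in range(M)]

def phihat(phi, n):
    return sum(phi(x) * cmath.exp(-1j * 2 * math.pi * n * x / L) for x in xs) / M

def u(t, x, phi, nmax=20):
    return sum(phihat(phi, n)
               * cmath.exp(-2 * math.pi ** 2 * sigma ** 2 * n ** 2 * t / L ** 2)
               * cmath.exp(1j * 2 * math.pi * n * x / L)
               for n in range(-nmax, nmax + 1)).real

phi = lambda x: math.cos(2 * math.pi * x / L)   # single-mode initial datum
t, x = 0.1, 0.3
exact = math.exp(-2 * math.pi ** 2 * sigma ** 2 * t / L ** 2) * math.cos(2 * math.pi * x / L)
assert abs(u(t, x, phi) - exact) < 1e-8
```

For $\varphi(x)=\cos(2\pi x/L)$ only the modes $n=\pm1$ are active, so the series reduces to the single exponentially damped cosine used as the reference value.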

5.5.2. Modelling temperature in the Earth. This is a variation on the previous problem. We now want to model the temperature underground, as a function of depth. Here $u(t,x)$ will be, as usual, the temperature at time $t\in[0,365]$ (here $t=0$ means Jan. 1st, $t=365$ Dec. 31st) and at the point $x\in[0,+\infty[$ ($x=0$ being the surface). However, we change perspective with respect to the previous problem. We will assume:
i) the temperature on the surface $u(t,0)$ is known and periodic in time;
ii) the temperature at depth $x>0$ remains periodic: $u(0,x) = u(365,x)$.
The heat equation remains unchanged, so the problem we wish to discuss is
\[ \begin{cases} \partial_t u(t,x) = \frac{\sigma^2}{2}\partial_{xx}u(t,x), & (t,x)\in[0,365]\times[0,+\infty[,\\ u(0,x) = u(365,x), & x\ge 0,\\ u(t,0) = \varphi(t), & t\in[0,365]. \end{cases} \]
Because the solution is now $t$-periodic, we will assume that it has the form
\[ u(t,x) = \sum_n c_n(x)e_n(t) = \sum_n c_n(x)e^{i\frac{2\pi}{365}nt}. \]

By computing derivatives as before,
\[ \partial_t u = i\,\frac{2\pi}{365}\sum_n n\,c_n(x)e_n, \qquad \partial_{xx}u = \sum_n c_n''(x)e_n, \]
therefore the heat equation imposes
\[ i\,\frac{2\pi}{365}n\,c_n(x) = \frac{\sigma^2}{2}c_n''(x) \iff c_n'' = i\,\frac{4\pi}{365\sigma^2}n\,c_n. \]
This is a second order linear differential equation. In general, the solution of
\[ c''(x) = \lambda c(x) \iff c(x) = \alpha e^{\sqrt\lambda\,x} + \beta e^{-\sqrt\lambda\,x}, \]

where $\pm\sqrt\lambda$ are the two square roots of the complex number $\lambda$. In our case we have to proceed with some care, because $\lambda = i\frac{4\pi}{365\sigma^2}n$ with $n\in\mathbb{Z}$, hence $\sqrt\lambda = \sqrt{\frac{4\pi}{365\sigma^2}}\sqrt{ni}$. Now,
\[ \sqrt{ni} = \begin{cases} \sqrt n\,\sqrt i = \sqrt n\,\dfrac{1+i}{\sqrt2}, & n>0,\\[6pt] \sqrt{-n}\,\sqrt{-i} = \sqrt{-n}\,\dfrac{1-i}{\sqrt2}, & n<0, \end{cases} \qquad \text{that is } \sqrt{ni} = \sqrt{|n|}\,\frac{1\pm i}{\sqrt2},\ \text{where } \pm = \mathrm{sgn}(n). \]
Therefore, for $n\ne 0$,
\[ c_n(x) = \alpha_n\,e^{\sqrt{\frac{2\pi}{365}|n|}\,\frac{1\pm i}{\sigma}x} + \beta_n\,e^{-\sqrt{\frac{2\pi}{365}|n|}\,\frac{1\pm i}{\sigma}x}, \qquad \pm = \mathrm{sgn}(n). \]

For $n=0$, $c_0''(x) = 0$, therefore $c_0(x) = \alpha_0 x + \beta_0$. The $(\alpha_n,\beta_n)$ can be determined by imposing the condition on the surface. This means
\[ u(t,0) = \varphi(t) = \sum_n\widehat\varphi(n)e^{i\frac{2\pi}{365}nt} \implies c_n(0) = \widehat\varphi(n). \]
For instance $c_0(0) = \beta_0 = \widehat\varphi(0)$. This doesn't determine $\alpha_0$, however! The same happens for $n\ne 0$: by imposing the previous condition we get
\[ \alpha_n + \beta_n = \widehat\varphi(n), \quad n\in\mathbb{Z}. \]
We notice, however, that if $\alpha_n\ne 0$ then $c_n(x)$ is unbounded in $x$ and, without an exterior source, this seems incompatible with our setting. For this reason we impose $\alpha_n\equiv 0$. In this way $\beta_n = \widehat\varphi(n)$ and finally,
(5.5.2) \[ u(t,x) = \widehat\varphi(0) + \sum_{n>0}\widehat\varphi(n)\,e^{-\sqrt{\frac{2\pi}{365}|n|}\,\frac{1+i}{\sigma}x}\,e^{i\frac{2\pi}{365}|n|t} + \sum_{n<0}\widehat\varphi(n)\,e^{-\sqrt{\frac{2\pi}{365}|n|}\,\frac{1-i}{\sigma}x}\,e^{-i\frac{2\pi}{365}|n|t} = \sum_{n\in\mathbb{Z}}\widehat\varphi(n)\,e^{-\sqrt{\frac{2\pi}{365}|n|}\,\frac{x}{\sigma}}\,e^{i\frac{2\pi}{365}n\big(t-\sqrt{\frac{365}{2\pi|n|}}\,\frac{x}{\sigma}\big)}. \]

This formula shows an interesting phenomenon. With respect to the temperature on the surface, $\varphi(t) = \sum_n\widehat\varphi(n)e^{i\frac{2\pi}{365}nt}$, at depth $x>0$ the $n$-th harmonic of the temperature is
• damped exponentially by the factor $e^{-\sqrt{\frac{2\pi}{365}|n|}\,\frac{x}{\sigma}}$;
• shifted in time by $\sqrt{\frac{365}{2\pi|n|}}\,\frac{x}{\sigma}$.

To fix ideas, let's consider a particularly idealized situation with
\[ \varphi(t) = A - B\cos\Big(\frac{2\pi}{365}t\Big), \quad t\in[0,365]. \]
Here $A$ represents the mean temperature along the year, $B$ the maximum oscillation from the mean. Hence $\varphi(0) = A-B = \varphi(365)$ while $\varphi(182.5) = A+B$. This $\varphi$ might represent a stylized seasonal temperature shape. We can easily determine the Fourier coefficients here without any computation: just notice that
\[ \varphi(t) = A - B\,\frac{e^{i\frac{2\pi}{365}t} + e^{-i\frac{2\pi}{365}t}}{2} = A\,e_0 - \frac B2\,e_{-1} - \frac B2\,e_1. \]

Therefore $\widehat\varphi(n) = 0$ for $|n|>1$, $\widehat\varphi(0) = A$, $\widehat\varphi(\pm1) = -\frac B2$. According to (5.5.2),
\[ u(t,x) = A - \frac B2\left(e^{-\sqrt{\frac{2\pi}{365}}\frac x\sigma}e^{i\frac{2\pi}{365}\big(t-\sqrt{\frac{365}{2\pi}}\frac x\sigma\big)} + e^{-\sqrt{\frac{2\pi}{365}}\frac x\sigma}e^{-i\frac{2\pi}{365}\big(t-\sqrt{\frac{365}{2\pi}}\frac x\sigma\big)}\right) = A - B\,e^{-\sqrt{\frac{2\pi}{365}}\frac x\sigma}\cos\Big(\frac{2\pi}{365}t - \sqrt{\frac{2\pi}{365}}\,\frac x\sigma\Big). \]
At the depth $x^*$ such that
\[ \sqrt{\frac{2\pi}{365}}\,\frac{x^*}{\sigma} = \pi \iff x^* = \sigma\sqrt{\frac{365\pi}{2}}, \]
we have
\[ u(t,x^*) = A - B\,e^{-\pi}\cos\Big(\frac{2\pi}{365}t - \pi\Big) = A + B\,e^{-\pi}\cos\frac{2\pi}{365}t. \]
In particular $u(0,x^*) = u(365,x^*) = A + Be^{-\pi}$ while $u(182.5,x^*) = A - Be^{-\pi}$: the cycle winter–summer–winter is inverted at depth $x^*$ (and actually damped by a factor $e^{-\pi}\approx\frac{1}{23}$).
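This inversion phenomenon is easy to check numerically; in the sketch below the values of $A$, $B$ and $\sigma$ are illustrative.

```python
import math

# Sketch of the seasonal-inversion depth: with surface temperature
# A - B cos(2 pi t / 365), the temperature at depth x is
# A - B exp(-q) cos(2 pi t / 365 - q),  with q = sqrt(2 pi / 365) x / sigma.
# A, B, sigma are illustrative values.
A, B, sigma = 12.0, 10.0, 1.0

def u(t, x):
    q = math.sqrt(2 * math.pi / 365) * x / sigma
    return A - B * math.exp(-q) * math.cos(2 * math.pi * t / 365 - q)

x_star = sigma * math.sqrt(365 * math.pi / 2)   # depth where the phase shift is pi
damp = math.exp(-math.pi)                       # amplitude damping at x_star, ~ 1/23
print(x_star, damp)
assert abs(u(0, x_star) - (A + B * damp)) < 1e-9      # surface winter = "summer" at x_star
assert abs(u(182.5, x_star) - (A - B * damp)) < 1e-9  # surface summer = "winter" at x_star
```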

5.5.3. Wave equation. Let's now consider a vibrating string. To simplify the model, we assume that the string vibrates in a fixed plane. We can describe the configuration of the string through a function $y(t,x)$ representing, at time $t$, the ordinate of the point of abscissa $x$ on the string. We consider here a finite string, that is $x\in[0,L]$. It is possible to prove that the function $y$ must fulfill a PDE called the wave equation, of the form
\[ \partial_{tt}y(t,x) = c^2\,\partial_{xx}y(t,x) + f(t,x), \]
where $c>0$ represents the wave propagation speed and $f$ an external source. We will assume $f\equiv 0$. Moreover, we assume that the string is periodic, that is $y(t,0) = y(t,L)$. The initial configuration $y(0,x) =: \varphi(x)$ and the initial impulse $\partial_t y(0,x) =: \psi(x)$ are assumed to be known. Therefore, the problem to be solved is
\[ \begin{cases} \partial_{tt}y(t,x) = c^2\partial_{xx}y(t,x), & t\ge 0,\ x\in[0,L],\\ y(t,0) = y(t,L), & t\ge 0,\\ y(0,x) = \varphi(x), & x\in[0,L],\\ \partial_t y(0,x) = \psi(x), & x\in[0,L]. \end{cases} \]
As before, we look for a solution in the form
\[ y(t,x) = \sum_n c_n(t)e^{i\frac{2\pi}{L}nx} \equiv \sum_n c_n(t)e_n(x). \]
Notice that
\[ \partial_t y = \sum_n c_n'(t)e_n(x), \qquad \partial_{tt}y = \sum_n c_n''(t)e_n(x), \qquad \partial_{xx}y = -\frac{4\pi^2}{L^2}\sum_n n^2 c_n(t)e_n(x). \]
The wave equation then becomes
\[ \sum_n c_n''(t)e_n(x) = -c^2\,\frac{4\pi^2}{L^2}\sum_n n^2 c_n(t)e_n(x) \iff c_n''(t) = -\frac{4\pi^2 c^2}{L^2}n^2\,c_n(t), \]
while the initial conditions become
\[ c_n(0) = \widehat\varphi(n), \qquad c_n'(0) = \widehat\psi(n). \]

Putting all this together, we have to solve the infinite system of second order linear equations
\[ \begin{cases} c_n''(t) = -\frac{4\pi^2 c^2}{L^2}n^2\,c_n(t),\\ c_n(0) = \widehat\varphi(n),\\ c_n'(0) = \widehat\psi(n). \end{cases} \]
For $n=0$, $c_0'' = 0$, therefore $c_0(t) = \alpha_0 t + \beta_0$. We deduce that $\beta_0 = c_0(0) = \widehat\varphi(0)$ and $\alpha_0 = c_0'(0) = \widehat\psi(0)$, that is
\[ c_0(t) = \widehat\psi(0)\,t + \widehat\varphi(0). \]
For $n\ne 0$, the general solution has the form
\[ c_n(t) = \alpha_n\cos\frac{2\pi c}{L}|n|t + \beta_n\sin\frac{2\pi c}{L}|n|t. \]
By imposing the initial conditions we get
\[ \alpha_n = \widehat\varphi(n), \qquad \frac{2\pi c|n|}{L}\beta_n = \widehat\psi(n) \iff \beta_n = \frac{L}{2\pi|n|c}\,\widehat\psi(n). \]
Therefore
\[ c_n(t) = \widehat\varphi(n)\cos\frac{2\pi c}{L}|n|t + \frac{L}{2\pi|n|c}\,\widehat\psi(n)\sin\frac{2\pi c}{L}|n|t. \]
Notice that, stretching the previous formula a bit, we can include in it also the case $n=0$, interpreting $\frac{\sin z}{z} = 1$ at $z=0$. Therefore
\[ y(t,x) = \sum_{n\in\mathbb{Z}}\Big(\widehat\varphi(n)\cos\frac{2\pi c}{L}|n|t + \frac{L}{2\pi|n|c}\,\widehat\psi(n)\sin\frac{2\pi c}{L}|n|t\Big)e_n(x). \]
Reworking this formula we can get a more intelligible one. First: notice that we can replace $|n|$ by $n$ everywhere. This is evident if $n>0$; if $n<0$, $|n| = -n$, and because $\cos$ is even and $\sin$ is odd we easily get the conclusion. Moreover, according to the Euler formulas (5.1.1),
\[ \cos\frac{2\pi c}{L}nt = \frac12\Big(e^{i\frac{2\pi c}{L}nt} + e^{-i\frac{2\pi c}{L}nt}\Big), \]
and recalling the properties of Fourier coefficients (Proposition 5.2.3),
\[ e^{i\frac{2\pi c}{L}nt}\,\widehat\varphi(n) = \widehat{\varphi(\cdot+ct)}(n), \qquad e^{-i\frac{2\pi c}{L}nt}\,\widehat\varphi(n) = \widehat{\varphi(\cdot-ct)}(n). \]
Therefore
\[ \sum_{n\in\mathbb{Z}}\widehat\varphi(n)\cos\Big(\frac{2\pi c}{L}nt\Big)e_n(x) = \frac12\sum_{n\in\mathbb{Z}}\Big(\widehat{\varphi(\cdot+ct)}(n)e_n(x) + \widehat{\varphi(\cdot-ct)}(n)e_n(x)\Big) = \frac12\big(\varphi(x+ct) + \varphi(x-ct)\big). \]
Similarly,
\[ \sin\frac{2\pi c}{L}nt = \frac{1}{2i}\Big(e^{i\frac{2\pi c}{L}nt} - e^{-i\frac{2\pi c}{L}nt}\Big). \]
Let's then consider
\[ \frac{1}{2i}\,\frac{L}{2\pi nc}\,\widehat\psi(n)\,e^{i\frac{2\pi c}{L}nt}. \]
Recalling (5.3.1),
\[ \widehat{f'}(n) = i\,\frac{2\pi n}{L}\,\widehat f(n) \iff \widehat f(n) = \frac{L}{i2\pi n}\,\widehat{f'}(n), \]
we have
\[ \frac{1}{2i}\,\frac{L}{2\pi nc}\,\widehat\psi(n)\,e^{i\frac{2\pi c}{L}nt} = \frac{1}{2c}\,\widehat\Psi(n)\,e^{i\frac{2\pi c}{L}nt} = \frac{1}{2c}\,\widehat{\Psi(\cdot+ct)}(n), \quad \text{where } \Psi' = \psi. \]
Similarly,
\[ \frac{1}{2i}\,\frac{L}{2\pi nc}\,\widehat\psi(n)\,e^{-i\frac{2\pi c}{L}nt} = \frac{1}{2c}\,\widehat{\Psi(\cdot-ct)}(n). \]
Therefore,
\[ \sum_{n\in\mathbb{Z}}\frac{L}{2\pi|n|c}\,\widehat\psi(n)\sin\Big(\frac{2\pi c}{L}|n|t\Big)e_n(x) = \frac{1}{2c}\sum_n\Big(\widehat{\Psi(\cdot+ct)}(n)e_n(x) - \widehat{\Psi(\cdot-ct)}(n)e_n(x)\Big) = \frac{1}{2c}\big(\Psi(x+ct) - \Psi(x-ct)\big). \]
Now, as $\Psi$ we can take
\[ \Psi(u) = \int_0^u\psi(v)\,dv \implies \Psi(x+ct) - \Psi(x-ct) = \int_{x-ct}^{x+ct}\psi(v)\,dv. \]
In conclusion, we obtain the d'Alembert formula
\[ y(t,x) = \frac12\big(\varphi(x-ct) + \varphi(x+ct)\big) + \frac{1}{2c}\int_{x-ct}^{x+ct}\psi(v)\,dv. \]
This formula has a beautiful interpretation: the state of the wave function $y$ at the point $x$ at time $t$ depends:
• on the initial configuration at the points $x\pm ct$;
• on the initial impulse between $x-ct$ and $x+ct$.
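The d'Alembert formula can be sanity-checked numerically: for smooth periodic data it must satisfy $\partial_{tt}y = c^2\partial_{xx}y$, which the sketch below tests with centered finite differences (the data and the evaluation point are illustrative choices).

```python
import math

# d'Alembert:  y(t,x) = (phi(x-ct) + phi(x+ct))/2 + (Psi(x+ct) - Psi(x-ct))/(2c),
# where Psi' = psi.  Here phi(x) = sin x and psi(x) = sin 2x, so Psi = -cos(2x)/2.
c = 2.0
phi = lambda x: math.sin(x)
Psi = lambda x: -0.5 * math.cos(2 * x)

def y(t, x):
    return 0.5 * (phi(x - c * t) + phi(x + c * t)) \
         + (Psi(x + c * t) - Psi(x - c * t)) / (2 * c)

assert abs(y(0.0, 1.3) - phi(1.3)) < 1e-12      # initial configuration is recovered

# Test the wave equation y_tt = c^2 y_xx with centered second differences:
t0, x0, h = 0.7, 1.3, 1e-4
ytt = (y(t0 + h, x0) - 2 * y(t0, x0) + y(t0 - h, x0)) / h ** 2
yxx = (y(t0, x0 + h) - 2 * y(t0, x0) + y(t0, x0 - h)) / h ** 2
print(ytt, c ** 2 * yxx)
assert abs(ytt - c ** 2 * yxx) < 1e-3
```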

5.5.4. Laplace equation. The Laplace equation describes the equilibrium configuration of an elastic membrane subject to a traction and constrained to a certain configuration on its edge. We can formalize the problem mathematically in the following way. Let $\Omega\subset\mathbb{R}^2$ be a domain such that the shape of the membrane is described through a function $u = u(x,y):\Omega\to\mathbb{R}$. We denote by $f(x,y)$ the vertical component of the force applied to the membrane. It is possible to show that $u$ solves the following PDE:
\[ \Delta u \equiv \partial_{xx}u + \partial_{yy}u = f, \quad (x,y)\in\Omega. \]
As for the constraint, we can model it through a function $\varphi:\partial\Omega\to\mathbb{R}$ representing the configuration of the membrane on its edge. In other words, $u$ must be a solution of the following problem:
\[ \begin{cases} \Delta u(x,y) = f(x,y), & (x,y)\in\Omega,\\ u(x,y) = \varphi(x,y), & (x,y)\in\partial\Omega. \end{cases} \]
This problem is also called the Dirichlet problem for the Laplace equation. Through Fourier series we can find the solution in a particular case. Assume that $\Omega = \{(x,y)\in\mathbb{R}^2 : x^2+y^2\le 1\}$ (the unit disk); then $\partial\Omega = \{(x,y)\in\mathbb{R}^2 : x^2+y^2 = 1\}$ (the unit circle). Because of the particular form of $\Omega$, the configuration $u$ can be described using polar coordinates. In other words we will write $u = u(r,\theta)$ with $x = r\cos\theta$, $y = r\sin\theta$. In this way, because $(r,\theta+2\pi)$ corresponds to $(r,\theta)$, we have $u(r,\theta+2\pi) = u(r,\theta)$, that is, $u$ is naturally $2\pi$-periodic in $\theta$. The configuration on the boundary can be easily described by assigning $\varphi = \varphi(\theta)$, periodic for the same reason as $u$.
Polar coordinates simplify the way to look at $u$ and $\varphi$. But there is a problem: the Laplace equation is given in cartesian coordinates. In fact, derivatives are computed with respect to $x$ and $y$, and of course this is

not the same as derivatives with respect to $r$ and $\theta$. In other words, to use polar coordinates we have to rewrite the Laplace equation in polar coordinates. How can this be done? We will use the slightly ambiguous notation $u(r,\theta) = u(r\cos\theta, r\sin\theta)$. By some computation it can be proved that
\[ \Delta_{x,y}u \equiv \partial_{rr}u + \frac1r\,\partial_r u + \frac{1}{r^2}\,\partial_{\theta\theta}u. \]
Hence, the Dirichlet problem in polar coordinates assumes the form
\[ \begin{cases} \partial_{rr}u + \frac1r\partial_r u + \frac{1}{r^2}\partial_{\theta\theta}u = f(r,\theta), & 0<r\le 1,\ \theta\in[0,2\pi],\\ u(1,\theta) = \varphi(\theta), & \theta\in[0,2\pi]. \end{cases} \]
As at the beginning, we will assume $f\equiv 0$. Therefore, if we represent the solution as a Fourier series in $\theta$,
\[ u(r,\theta) = \sum_n c_n(r)e^{in\theta}, \]
the equation to be fulfilled by $u$ leads to
\[ \sum_n c_n''e_n + \frac1r\sum_n c_n'e_n + \frac{1}{r^2}\sum_n(-n^2)c_ne_n = 0 \iff c_n'' + \frac1r c_n' - \frac{n^2}{r^2}c_n = 0, \]
with $c_n(1) = \widehat\varphi(n)$. The good news is that the differential equation solved by $c_n$ is linear:
\[ r^2c_n'' + rc_n' - n^2c_n = 0. \]
To solve this equation, let's first separate the case $n=0$ from $n\ne 0$. In the first case,
\[ rc'' + c' = 0 \iff (rc')' = 0 \iff rc' = \kappa \iff c' = \frac\kappa r \iff c = \kappa\log r + \kappa'. \]
Because we look for a bounded solution, $\kappa = 0$, whence $c_0 \equiv \kappa' = c_0(1) = \widehat\varphi(0)$. If $n\ne 0$, let's look for $c$ as a power series $c(r) = \sum_{k=0}^{\infty}\alpha_k r^k$. Then $c$ is a solution iff
\[ r^2\sum_{k\ge 2}k(k-1)\alpha_k r^{k-2} + r\sum_{k\ge 1}k\alpha_k r^{k-1} - n^2\sum_{k\ge 0}\alpha_k r^k = 0, \]
that is, after straightforward calculations,
\[ \sum_k\big(k^2 - n^2\big)\alpha_k r^k \equiv 0. \]
By this, $(k^2-n^2)\alpha_k = 0$ for every $k$. This says $\alpha_k = 0$ for $k\ne|n|$, whence $c_n(r) = \alpha_{|n|}r^{|n|}$. By imposing the boundary condition $c_n(1) = \widehat\varphi(n)$, we finally have
\[ c_n(r) = \widehat\varphi(n)\,r^{|n|}. \]
Notice that this formula also works for $n=0$. Returning to the solution of the Laplace equation, we finally have
\[ u(r,\theta) = \sum_{n\in\mathbb{Z}}\widehat\varphi(n)\,r^{|n|}e^{in\theta}. \]

Notice again that, because
\[ \widehat\varphi(n) = \frac{1}{2\pi}\int_0^{2\pi}\varphi(\vartheta)e^{-in\vartheta}\,d\vartheta, \]
we can represent the solution in integral form as
(5.5.3) \[ u(r,\theta) = \frac{1}{2\pi}\sum_n r^{|n|}e^{in\theta}\int_0^{2\pi}\varphi(\vartheta)e^{-in\vartheta}\,d\vartheta = \frac{1}{2\pi}\int_0^{2\pi}\varphi(\vartheta)\underbrace{\sum_n r^{|n|}e^{in(\theta-\vartheta)}}_{P(r,\theta-\vartheta)}\,d\vartheta. \]
Now,
\[ P(r,\omega) = \sum_n r^{|n|}e^{in\omega} = 1 + \sum_{n=1}^{+\infty}r^n\big(e^{in\omega}+e^{-in\omega}\big) = \sum_{n=0}^{+\infty}r^n\big(e^{in\omega}+e^{-in\omega}\big) - 1, \]
and recalling the geometric sum $\sum_{n=0}^{\infty}q^n = \frac{1}{1-q}$ (which holds for every $|q|<1$) we can, for $r<1$, compute $P$ in closed form:
\[ P(r,\omega) = \frac{1}{1-re^{i\omega}} + \frac{1}{1-re^{-i\omega}} - 1 = \frac{1-r^2}{1+r^2-2r\cos\omega}. \]
The function $P$ is called the Poisson kernel and (5.5.3) is called the Poisson formula.
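The Poisson formula can be tested against a boundary datum whose harmonic extension is known in closed form: for $\varphi(\vartheta) = 3 + \cos 2\vartheta$ the solution is $u(r,\theta) = 3 + r^2\cos 2\theta$, since $\widehat\varphi(\pm2)=\frac12$. The datum and the quadrature size below are illustrative choices.

```python
import math

# Evaluate the Poisson formula  u(r,theta) = (1/2pi) int phi(v) P(r, theta - v) dv
# with P(r,w) = (1 - r^2) / (1 + r^2 - 2 r cos w), and compare with the known
# harmonic extension of the boundary datum phi(v) = 3 + cos(2v).
def P(r, w):
    return (1 - r * r) / (1 + r * r - 2 * r * math.cos(w))

def u(r, theta, phi, M=2000):
    # trapezoidal rule on the periodic integrand (spectrally accurate)
    return sum(phi(2 * math.pi * k / M) * P(r, theta - 2 * math.pi * k / M)
               for k in range(M)) / M

phi = lambda v: 3.0 + math.cos(2 * v)
r, theta = 0.5, 0.8
assert abs(u(r, theta, phi) - (3.0 + r ** 2 * math.cos(2 * theta))) < 1e-6
assert abs(u(0.0, 0.0, phi) - 3.0) < 1e-9   # mean value property at the center
```

The second assertion reflects a general fact visible from $P(0,\omega)\equiv 1$: the value of a harmonic function at the center of the disk equals the mean of its boundary values.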

5.6. Exercises
Exercise 5.6.1. Find the Fourier series of the function $f\in L^2([-\pi,\pi])$ defined as
\[ f(x) := \begin{cases} -1, & -\pi<x<0,\\ +1, & 0<x<\pi. \end{cases} \]
What does the Fourier series converge to at $x = k\pi$, $k\in\mathbb{Z}$?


Exercise 5.6.2. Let $f(x) = x^2$ for $x\in[0,1]$, extended by periodicity to $\mathbb{R}$. Compute its Fourier series and discuss its convergence (pointwise, $L^2$).
Exercise 5.6.3. Let $\alpha\in\mathbb{R}\setminus\mathbb{Z}$ and $f_\alpha(x) := \cos(\alpha x)$ for $x\in\,]-\pi,\pi]$, extended by periodicity to $\mathbb{R}$. Determine the real form of the Fourier series of $f_\alpha$ and discuss its sum. Use this to show the identity
\[ \pi\alpha\cot(\pi\alpha) = 1 + 2\alpha^2\sum_{n=1}^{\infty}\frac{1}{\alpha^2-n^2}, \quad \forall\alpha\in\mathbb{R}\setminus\mathbb{Z}. \]

Exercise 5.6.4. Compute the Fourier series of $f(x) = x$ for $x\in\,]-\pi,\pi]$, extended by periodicity to $\mathbb{R}$. What is the sum of the series? Use the answer to show that
\[ \frac\pi4 = \sum_{k=0}^{\infty}\frac{(-1)^k}{2k+1}. \]
Exercise 5.6.5. Let $f(x) := x\,e^{-\frac{|x|}{\pi}}$ for $x\in[-\pi,\pi]$, extended by periodicity to all of $\mathbb{R}$. By discussing the regularity of $f$, find the maximum $k$ such that
\[ |\widehat f(n)| \le \frac{C}{|n|^k}. \]
Compute the Fourier series of $f$ in complex form. Deduce from this the real form and confirm what was deduced above.
Exercise 5.6.6. Let $f\in L^1$ be real and $T$-periodic. Show that if $f$ is odd the Fourier series of $f$ contains only sines; if $f$ is even, only cosines.
Exercise 5.6.7. Let $f(x) := \frac\pi4 - \frac x2$, $x\in[0,\pi]$. How can $f$ be expanded as a cosine series? Discuss carefully the convergence of such a series (pointwise, $L^2$).

Exercise 5.6.8. Show that if $f\in C^k([0,T])$ its Fourier series converges uniformly and
\[ \|S_N - f\|_\infty \le \frac{K}{N^{k/2}}, \quad \text{as } N\to+\infty. \]
Exercise 5.6.9. Determine the solution of the wave equation ∂tt y = ∂xx y with the following initial conditions:
Exercise 5.6.10. Determine the solution of the Laplace equation ∆u = 0 on the unit disk assuming u = ϕ(θ) on
the boundary of the disk in the following cases:
i) $\varphi(\theta) = \theta(2\pi-\theta)$.
ii) $\varphi(\theta) = \chi_{[0,\pi]}(\theta)$.
iii) $\varphi(\theta) = \theta$.
iv) $\varphi(\theta) = \sum_{n=1}^{\infty} \frac{1}{2^n}\sin(n\theta)$.

Exercise 5.6.11. A column of soldiers marches in unison over a bridge. We model the shape of the bridge as a one-dimensional line, described through the graph of a function $y(t,x)$ of the real variable $x \in [0,L]$. Initially the bridge is assumed to be flat ($y(0,x) \equiv 0$) and with no impulse ($\partial_t y(0,x) \equiv 0$). The march of the soldiers is modeled through a force $f(t,x)$ impressed on the bridge by their steps. We assume that $T > 0$ is the time period of the steps. How can we model $f$ analytically? A simple possibility is to take $f(t,x) = F_0 \sin\frac{2\pi t}{T}$.
i) Assuming the shape of the bridge is modeled through a wave equation with forcing term $f$, write the solution $y = y(t,x)$.
ii) What happens to the bridge (that is, to $y$) as $T \approx \frac{L}{cN}$ for some $N \in \mathbb N$? Describe the behavior as accurately as you can.
iii) Assume now that the march is a bit "out of step", in such a way that the front of the column exerts a force opposite to that of the rear. We model this by assuming
$$f(t,x) = F_0 \sin\frac{2\pi t}{T}\,\sin\frac{2\pi x}{L}.$$
Determine the evolution of the shape of the bridge. What happens if $T \approx \frac{L}{cN}$, for some $N \in \mathbb N$?
CHAPTER 6

Fourier Transform

The main result of the previous Chapter is that for every $f \in L^2$ $T$-periodic it holds
$$(6.0.1)\qquad f(x) \overset{L^2}{=} \sum_{n\in\mathbb Z} \widehat f(n)\, e^{i\frac{2\pi n}{T}x}.$$

The main applications of this fact are to signals (voice, music, etc.), and it is the basis of digital signal processing. Roughly, one replaces the full signal by a finite approximation of it, using a finite number of Fourier coefficients. This makes possible, for instance, mobile phones, compact discs, and so on. The main feature of sound waves is that they are periodic. Other types of signals, like images, are not, so the question arises naturally: is there an analogue of (6.0.1) for non-periodic functions? The answer is yes, and it is not far from (6.0.1).
To catch the point, let's rewrite (6.0.1) by writing the Fourier coefficients as
$$\widehat f(n) = \frac1T\int_{-T/2}^{T/2} f(y)\, e^{-i\frac{2\pi n}{T}y}\,dy.$$
Hence
$$f(x) = \sum_{n\in\mathbb Z} \widehat f(n)\, e^{i\frac{2\pi n}{T}x} = \sum_{n\in\mathbb Z} \left(\frac1T\int_{-T/2}^{T/2} f(y)\, e^{-i\frac{2\pi n}{T}y}\,dy\right) e^{i\frac{2\pi n}{T}x}.$$
Now, suppose that we take a formal limit as $T \longrightarrow +\infty$ (this is to consider a non-periodic function). Something interesting happens if we look carefully at the previous sum: introducing the points $\xi_n := \frac nT$, $n \in \mathbb Z$, as a subdivision of $\mathbb R$, in such a way that $d\xi_n = \xi_{n+1}-\xi_n = \frac1T$, we have
$$f(x) = \sum_{n\in\mathbb Z} \left(\int_{-T/2}^{T/2} f(y)\, e^{-i2\pi\xi_n y}\,dy\right) e^{i2\pi\xi_n x}\,d\xi_n \longrightarrow \int_{\mathbb R}\left(\int_{\mathbb R} f(y)\, e^{-i2\pi\xi y}\,dy\right) e^{i2\pi\xi x}\,d\xi.$$

The integral
$$(6.0.2)\qquad \widehat f(\xi) = \int_{\mathbb R} f(y)\, e^{-i2\pi\xi y}\,dy$$
is called the Fourier transform of $f$, and the previous formula suggests that
$$(6.0.3)\qquad f(x) = \widehat{\widehat f\,}(-x).$$
Basically, these few steps contain the main ideas concerning the Fourier transform. Of course it will take some nontrivial work to show that these ideas can be made rigorous.

6.1. Definition and examples


We will introduce the definition of Fourier Transform (FT) for functions f = f (x), x ∈ Rd .
Definition 6.1.1. Let $f \in L^1(\mathbb R^d)$. The function
$$(6.1.1)\qquad \widehat f(\xi) := \int_{\mathbb R^d} f(y)\, e^{-i2\pi\xi\cdot y}\,dy, \quad \xi \in \mathbb R^d,$$
is called the Fourier Transform (FT) of $f$.
The definition is, of course, well posed. Indeed, the integral in (6.1.1) converges for every $\xi$ iff
$$\int_{\mathbb R^d} |f(y)\, e^{-i2\pi\xi\cdot y}|\,dy = \int_{\mathbb R^d} |f(y)|\,dy < +\infty \iff f \in L^1(\mathbb R^d).$$
In particular the previous computation says that
$$(6.1.2)\qquad |\widehat f(\xi)| \leq \|f\|_{L^1},$$
that is, $\widehat f$ is bounded by $\|f\|_{L^1}$.
Let's see some important examples.
Let’s see some important examples.
Example 6.1.2 (rectangle). Call $\mathrm{rect}_a := \chi_{[-a/2,a/2]}$. Then
$$(6.1.3)\qquad \widehat{\mathrm{rect}_a}(\xi) = a\,\frac{\sin(\pi a\xi)}{\pi a\xi} =: a\,\mathrm{sinc}(\pi a\xi), \quad \forall \xi \in \mathbb R$$
(with the agreement that $\mathrm{sinc}\,0 = 1$).

Figure 1. $\mathrm{rect}_a$ at the left and $\widehat{\mathrm{rect}_a}$ at the right.

Sol. — Clearly $\mathrm{rect}_a = \chi_{[-a/2,a/2]} \in L^1(\mathbb R)$. We have
$$\widehat{\mathrm{rect}_a}(\xi) = \int_{-a/2}^{a/2} e^{-i2\pi\xi y}\,dy = \begin{cases} \displaystyle\int_{-a/2}^{a/2} dy = a, & \xi = 0,\\[2mm] \displaystyle\left[\frac{e^{-i2\pi\xi y}}{-i2\pi\xi}\right]_{y=-a/2}^{y=a/2} = \frac{\sin(\pi a\xi)}{\pi\xi} = a\,\frac{\sin(\pi a\xi)}{\pi a\xi}, & \xi \neq 0. \end{cases} \qquad\square$$
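The computation above lends itself to a quick numerical sanity check. The following sketch (not part of the notes; grid size and test frequencies are arbitrary choices) approximates the Fourier integral of $\mathrm{rect}_a$ with the trapezoidal rule and compares it with the closed form (6.1.3):

```python
import numpy as np

# Approximate hat{rect_a}(xi) = int_{-a/2}^{a/2} e^{-2 pi i xi y} dy by the
# trapezoidal rule and compare with sin(pi a xi)/(pi xi) = a*sinc(pi a xi).
a = 2.0
y = np.linspace(-a/2, a/2, 200001)       # fine grid on the support of rect_a
dy = y[1] - y[0]
xi = np.array([0.3, 0.75, 1.9])          # arbitrary nonzero test frequencies

def ft_rect(s):
    w = np.exp(-2j*np.pi*s*y)            # integrand restricted to the support
    return (np.sum(w) - 0.5*(w[0] + w[-1])) * dy   # trapezoidal rule

ft_num = np.array([ft_rect(s) for s in xi])
ft_exact = np.sin(np.pi*a*xi) / (np.pi*xi)
print(np.max(np.abs(ft_num - ft_exact)))  # essentially zero
```

The same pattern can be reused to check any of the explicit transforms computed in this section.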

Example 6.1.3 (exponential).
$$(6.1.4)\qquad \widehat{e^{-a|\cdot|}}(\xi) = \frac{2a}{a^2+4\pi^2\xi^2}, \quad \xi \in \mathbb R, \; (a > 0).$$

Sol. — Clearly $e^{-a|\cdot|} \in L^1(\mathbb R)$ if $a > 0$. By definition
$$\widehat{e^{-a|\cdot|}}(\xi) = \int_{\mathbb R} e^{-a|y|} e^{-i2\pi\xi y}\,dy = \int_{-\infty}^{0} e^{ay} e^{-i2\pi\xi y}\,dy + \int_{0}^{+\infty} e^{-ay} e^{-i2\pi\xi y}\,dy$$
$$= \int_{-\infty}^{0} e^{(a-i2\pi\xi)y}\,dy + \int_{0}^{+\infty} e^{-(a+i2\pi\xi)y}\,dy = \left[\frac{e^{(a-i2\pi\xi)y}}{a-i2\pi\xi}\right]_{y=-\infty}^{y=0} + \left[-\frac{e^{-(a+i2\pi\xi)y}}{a+i2\pi\xi}\right]_{y=0}^{y=+\infty}$$
$$= \frac{1}{a-i2\pi\xi} + \frac{1}{a+i2\pi\xi} = \frac{2a}{a^2+4\pi^2\xi^2}. \qquad\square$$
The next example is very important:
Example 6.1.4 (gaussian).
$$(6.1.5)\qquad \widehat{e^{-\frac{\cdot^2}{2\sigma^2}}}(\xi) = \sqrt{2\pi\sigma^2}\, e^{-2\pi^2\sigma^2\xi^2}, \quad (\sigma > 0).$$
In particular:
$$(6.1.6)\qquad \widehat{e^{-\pi\cdot^2}}(\xi) = e^{-\pi\xi^2}.$$

Sol. — We've already obtained this Fourier transform: looking at (1.8.1) and taking $\lambda = 2\pi\xi$ we obtain
$$\widehat{e^{-\frac{\cdot^2}{2\sigma^2}}}(\xi) = \sqrt{2\pi\sigma^2}\, e^{-\frac{\sigma^2}{2}4\pi^2\xi^2} = \sqrt{2\pi\sigma^2}\, e^{-2\pi^2\sigma^2\xi^2}. \qquad\square$$
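Since the gaussian pair is used repeatedly below, a hedged numerical check may be reassuring. The sketch below (grid and test frequencies are ad hoc choices, not from the notes) approximates the Fourier integral of the gaussian by a Riemann sum on a truncated domain and compares it with (6.1.5):

```python
import numpy as np

# Check (6.1.5): the FT of exp(-x^2/(2 sigma^2)) should be
# sqrt(2 pi sigma^2) * exp(-2 pi^2 sigma^2 xi^2).
sigma = 0.7
y = np.linspace(-10*sigma, 10*sigma, 400001)   # tails are negligible here
dy = y[1] - y[0]
xi = np.array([0.0, 0.4, 1.1])

g = np.exp(-y**2 / (2*sigma**2))
ft_num = np.array([np.sum(g*np.exp(-2j*np.pi*s*y))*dy for s in xi])
ft_exact = np.sqrt(2*np.pi*sigma**2)*np.exp(-2*np.pi**2*sigma**2*xi**2)
print(np.max(np.abs(ft_num - ft_exact)))       # essentially zero
```

Because the gaussian decays so fast, the truncated Riemann sum is extremely accurate.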

Example 6.1.5 (multivariate gaussian). Let $C$ be a strictly positive definite symmetric matrix. Then
$$(6.1.7)\qquad \widehat{e^{-\frac12 C^{-1}x\cdot x}}(\xi) = \sqrt{(2\pi)^d \det C}\; e^{-2\pi^2 C\xi\cdot\xi}.$$

Sol. — Notice first that, being $C$ strictly positive, $C$ is invertible, and of course being $C$ symmetric the same holds for $C^{-1}$, which is therefore diagonalizable. This means that $C^{-1} = T^{t}\Lambda^{-1}T$ with $T$ an orthogonal matrix, that is, $T^{-1} = T^t$ (the transposed matrix), and $\Lambda^{-1} := \mathrm{diag}\big(\frac{1}{\sigma_1^2},\ldots,\frac{1}{\sigma_d^2}\big)$ a diagonal matrix. Therefore
$$C^{-1}y\cdot y = T^t\Lambda^{-1}Ty\cdot y = \Lambda^{-1}Ty\cdot Ty.$$
Now, notice that, with the change of variable $x = Ty$,
$$\widehat{e^{-\frac12 C^{-1}x\cdot x}}(\xi) = \int_{\mathbb R^d} e^{-\frac12\Lambda^{-1}Ty\cdot Ty}\, e^{-i2\pi\xi\cdot y}\,dy = \int_{\mathbb R^d} e^{-\frac12\Lambda^{-1}x\cdot x}\, e^{-i2\pi T\xi\cdot x}\,|\det T^t|\,dx,$$
and because $T^t = T^{-1}$, easily $|\det T^t| = 1$. Therefore
$$\widehat{e^{-\frac12 C^{-1}x\cdot x}}(\xi) = \prod_{j=1}^{d} \int_{\mathbb R} e^{-\frac{x_j^2}{2\sigma_j^2}}\, e^{-i2\pi(T\xi)_j x_j}\,dx_j \overset{(6.1.5)}{=} \prod_{j=1}^{d} \sqrt{2\pi\sigma_j^2}\, e^{-2\pi^2\sigma_j^2 (T\xi)_j^2}.$$
To finish, notice that
$$\prod_j \sigma_j^2 = \det\Lambda = \det\big((TC^{-1}T^{-1})^{-1}\big) = \det C,$$
and
$$\sum_j \sigma_j^2 (T\xi)_j^2 = (\Lambda T\xi)\cdot T\xi = T^t\Lambda T\xi\cdot\xi = C\xi\cdot\xi,$$
and by these identities the conclusion follows easily. $\square$

Let's finish this Section with a few "algebraic" properties of the Fourier transform that are sometimes useful:
Proposition 6.1.6. Let $f \in L^1$. Then
i) $\widehat{f(\cdot - x_0)}(\xi) = e^{-i2\pi\xi\cdot x_0}\,\widehat f(\xi)$.
ii) $\widehat{e^{-i2\pi x\cdot v} f}(\xi) = \widehat f(\xi + v)$.
iii) $\widehat{f(\lambda\,\cdot)}(\xi) = \frac{1}{|\lambda|^d}\,\widehat f\!\left(\frac{\xi}{\lambda}\right)$.

Proof. We will limit ourselves to proving the first one; the remaining are similar (exercise). We have
$$\widehat{f(\cdot - x_0)}(\xi) = \int_{\mathbb R^d} f(x-x_0)\, e^{-i2\pi\xi\cdot x}\,dx = e^{-i2\pi\xi\cdot x_0}\int_{\mathbb R^d} f(x-x_0)\, e^{-i2\pi\xi\cdot(x-x_0)}\,dx = e^{-i2\pi\xi\cdot x_0}\,\widehat f(\xi). \qquad\square$$
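As an aside (not part of the notes), property i) can also be illustrated numerically; in the sketch below, the test function, shift and frequency are arbitrary choices, and the Fourier integral is approximated by a Riemann sum:

```python
import numpy as np

# Illustration of Prop. 6.1.6 i): translating f multiplies its FT by a phase.
# f is a gaussian (arbitrary choice); the integral is a truncated Riemann sum.
y = np.linspace(-20, 20, 200001)
dy = y[1] - y[0]
f = lambda t: np.exp(-t**2/2)
x0, xi = 1.3, 0.8                            # arbitrary shift and frequency

ft = lambda g: np.sum(g(y)*np.exp(-2j*np.pi*xi*y))*dy   # numeric FT at xi
lhs = ft(lambda t: f(t - x0))                # FT of the translate
rhs = np.exp(-2j*np.pi*xi*x0) * ft(f)        # phase times FT of f
print(abs(lhs - rhs))                        # essentially zero
```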

6.2. Behavior of Fourier transform


In this section we will study the behavior of the Fourier transform as a function, that is, we will discuss some properties of the function
$$\xi \longmapsto \widehat f(\xi).$$
We start with the
Theorem 6.2.1 (Riemann–Lebesgue). Let $f \in L^1$. Then $\widehat f$ is continuous and $\widehat f(\xi) \longrightarrow 0$ as $|\xi| \longrightarrow +\infty$.
Proof. The continuity follows by dominated convergence. Indeed, if $\xi_n \longrightarrow \xi_0$ we clearly have
$$f(y)e^{-i2\pi\xi_n\cdot y} \longrightarrow f(y)e^{-i2\pi\xi_0\cdot y} \;\text{a.e.}, \qquad |f(y)e^{-i2\pi\xi_n\cdot y}| \leq |f(y)|,$$
and because $|f| \in L^1$ we have
$$\widehat f(\xi_n) = \int_{\mathbb R} f(y)e^{-i2\pi\xi_n\cdot y}\,dy \longrightarrow \int_{\mathbb R} f(y)e^{-i2\pi\xi_0\cdot y}\,dy = \widehat f(\xi_0).$$
We omit the proof of the second part of the statement; it follows an idea similar to the one used for the Riemann–Lebesgue Lemma for Fourier series. $\square$
Proposition 6.2.2. Let $f \in L^1$ be such that $f \in C^1$ with $\partial_j f \in L^1$. Then
$$(6.2.1)\qquad \widehat{\partial_j f}(\xi) = i2\pi\xi_j\,\widehat f(\xi).$$
In particular:
$$(6.2.2)\qquad |\widehat f(\xi)| \leq \frac{\|\partial_j f\|_1}{2\pi|\xi|}.$$

Proof. We will limit ourselves to the case $d = 1$ (the general case is just a technical complication but follows the same ideas). Integrating by parts,
$$\widehat{f'}(\xi) = \int_{\mathbb R} f'(y)e^{-i2\pi\xi y}\,dy = \left[f(y)e^{-i2\pi\xi y}\right]_{y=-\infty}^{y=+\infty} - \int_{\mathbb R} f(y)\,\partial_y e^{-i2\pi\xi y}\,dy.$$
We notice that in our case $f(y) \longrightarrow 0$ at $\infty$. Indeed, by the fundamental theorem of integral calculus,
$$f(y) - f(0) = \int_0^y f'(u)\,du \longrightarrow \int_0^{\pm\infty} f'(u)\,du \in \mathbb R, \quad\text{because } f' \in L^1(\mathbb R).$$
Therefore $\lim_{y\to\pm\infty} f(y)$ exists finite. But $f$ is integrable: such a limit cannot be anything other than $0$. Now, $e^{-i2\pi\xi y}$ being a bounded function, we obtain $\left[f(y)e^{-i2\pi\xi y}\right]_{y=-\infty}^{y=+\infty} = 0$. Hence
$$\widehat{f'}(\xi) = -\int_{\mathbb R} f(y)\,\partial_y e^{-i2\pi\xi y}\,dy = i2\pi\xi\int_{\mathbb R} f(y)e^{-i2\pi\xi y}\,dy = i2\pi\xi\,\widehat f(\xi).$$
As for the bound,
$$|\widehat f(\xi)| = \frac{|\widehat{f'}(\xi)|}{2\pi|\xi|} \overset{(6.1.2)}{\leq} \frac{\|f'\|_1}{2\pi|\xi|}. \qquad\square$$
Example 6.2.3. Let
$$f_a(x) := \begin{cases} 0, & x \leq -a,\; x \geq a,\\ x+a, & -a \leq x \leq 0,\\ -x+a, & 0 \leq x \leq a. \end{cases}$$
Compute $f_a'$ and deduce $\widehat{f_a}$.

Sol. — Notice that $f_a$ is of course $L^1(\mathbb R)$ and
$$f_a'(x) = \begin{cases} 0, & x < -a,\; x > a,\\ 1, & -a < x < 0,\\ -1, & 0 < x < a, \end{cases} \;=\; \chi_{[-a,0]}(x) - \chi_{[0,a]}(x).$$
Clearly $f_a' \in L^1(\mathbb R)$ and, by (6.2.1),
$$\widehat{f_a'}(\xi) = i2\pi\xi\,\widehat{f_a}(\xi) = \widehat{\chi_{[-a,0]}}(\xi) - \widehat{\chi_{[0,a]}}(\xi).$$
Notice that
$$\chi_{[-a,0]} = \chi_{[-\frac a2,\frac a2]}\Big(\cdot + \frac a2\Big) = \mathrm{rect}_a\Big(\cdot + \frac a2\Big) \implies \widehat{\chi_{[-a,0]}}(\xi) = e^{i\pi a\xi}\, a\,\mathrm{sinc}(\pi a\xi),$$
and similarly
$$\widehat{\chi_{[0,a]}}(\xi) = \widehat{\mathrm{rect}_a\Big(\cdot - \frac a2\Big)}(\xi) = e^{-i\pi a\xi}\, a\,\mathrm{sinc}(\pi a\xi),$$
so
$$\widehat{f_a'}(\xi) = \left(e^{i\pi a\xi} - e^{-i\pi a\xi}\right) a\,\mathrm{sinc}(\pi a\xi) = 2i\sin(\pi a\xi)\,\frac{\sin(\pi a\xi)}{\pi\xi} = 2i\,\frac{\sin^2(\pi a\xi)}{\pi\xi},$$
and finally
$$\widehat{f_a}(\xi) = \frac{\widehat{f_a'}(\xi)}{i2\pi\xi} = \frac{\sin^2(\pi a\xi)}{\pi^2\xi^2} = \left(\frac{\sin(\pi a\xi)}{\pi\xi}\right)^2. \qquad\square$$
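Constants in this kind of computation are easy to get wrong, so here is a hedged numerical check (not from the notes; grid and frequencies are arbitrary) that the triangle $f_a(x) = a - |x|$ on $[-a,a]$ has transform $\big(\sin(\pi a\xi)/(\pi\xi)\big)^2$ under the convention (6.1.1):

```python
import numpy as np

# FT of the triangle f_a(x) = a - |x| on [-a, a], via a Riemann sum,
# compared with the expected closed form (sin(pi a xi)/(pi xi))^2.
a = 1.5
y = np.linspace(-a, a, 300001)
dy = y[1] - y[0]
f = a - np.abs(y)                       # vanishes at the endpoints
xi = np.array([0.37, 0.9, 2.2])         # arbitrary nonzero frequencies

ft_num = np.array([np.sum(f*np.exp(-2j*np.pi*s*y))*dy for s in xi])
ft_exact = (np.sin(np.pi*a*xi)/(np.pi*xi))**2
print(np.max(np.abs(ft_num - ft_exact)))   # essentially zero
```

At $\xi = 0$ the closed form degenerates to $a^2$, which is indeed the area under the triangle.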

The (6.2.1) and (6.2.2) are of course the analogues of (5.3.1) and (5.3.2), and they connect the regularity of $f$ to how fast $\widehat f$ goes to $0$.
An important consequence of (6.2.1) is that the FT converts derivatives into multiplications by polynomials. In fact, more generally,
$$\widehat{\partial_1^{k_1}\partial_2^{k_2}\cdots\partial_d^{k_d} f}(\xi) = (i2\pi\xi_1)^{k_1}(i2\pi\xi_2)^{k_2}\cdots(i2\pi\xi_d)^{k_d}\,\widehat f(\xi).$$

We can write this formula in a compact way: if $k = (k_1,\ldots,k_d) \in \mathbb N^d$ (called a multi-index) and
$$\partial^k := \partial_1^{k_1}\partial_2^{k_2}\cdots\partial_d^{k_d}, \qquad x^k := \prod_{j=1}^d x_j^{k_j},$$
then, if $\partial^k f \in L^1$,
$$(6.2.3)\qquad \widehat{\partial^k f}(\xi) = (i2\pi\xi)^k\,\widehat f(\xi).$$

What is remarkable is that, by switching derivatives with powers, the same holds:
Proposition 6.2.4. Let $f \in L^1(\mathbb R^d)$ be such that $x^k f(x) \in L^1(\mathbb R^d)$. Then
$$(6.2.4)\qquad \exists\, \partial^k\widehat f(\xi) = \widehat{(-i2\pi x)^k f(x)}(\xi), \;\text{ continuous on } \mathbb R^d.$$

Proof. ($d = 1$ and $k = 1$.) It is an application of differentiation under the integral sign. By definition
$$\widehat f(\xi) = \int_{\mathbb R} f(x)e^{-i2\pi\xi x}\,dx.$$
Differentiating under the integral sign,
$$(\widehat f\,)'(\xi) = \int_{\mathbb R} -i2\pi x\, f(x)\, e^{-i2\pi\xi x}\,dx = \widehat{(-i2\pi x) f(x)}(\xi).$$
To justify the differentiation we need to dominate $-i2\pi x f(x)e^{-i2\pi\xi x}$, uniformly in $\xi$, by an $L^1$ function of $x$. But this follows immediately from our assumptions, being
$$\left|-i2\pi x f(x)e^{-i2\pi\xi x}\right| \leq 2\pi|x f(x)| \in L^1, \quad \forall \xi \in \mathbb R. \qquad\square$$

The properties stated in Prop. 6.2.2 and 6.2.4 can be summarized in a unique elegant form as follows:
Corollary 6.2.5 (Duality Multiplication–Derivation).
$$(6.2.5)\qquad (i2\pi\xi)^h\,\partial^k\widehat f(\xi) = \widehat{\partial^h\big((-i2\pi x)^k f(x)\big)}(\xi).$$

6.3. Schwartz Class


The FT of $f$ is a function. By the Riemann–Lebesgue theorem we know that if $f \in L^1$ then $\widehat f$ is continuous and, moreover, $\widehat f \longrightarrow 0$ at $\infty$. To discuss the validity of (6.0.3) we would need to compute the FT of $\widehat f$. To this aim, we would need to know that $\widehat f \in L^1$ once $f \in L^1$. Unfortunately this is false in general.

Example 6.3.1. Recall that, by (6.1.3),
$$\widehat{\mathrm{rect}_a}(\xi) = a\,\mathrm{sinc}(\pi a\xi).$$
However, $\mathrm{sinc} \notin L^1$ because, for instance taking $a = 1$,
$$\int_{\mathbb R} |\mathrm{sinc}(\pi\xi)|\,d\xi = \int_{\mathbb R} \left|\frac{\sin(\pi\xi)}{\pi\xi}\right|\,d\xi = \int_{\mathbb R} \left|\frac{\sin\eta}{\eta}\right|\,d\eta = +\infty. \qquad\square$$
Apart from the specific example, the reason why $\widehat{\mathrm{rect}_a} \notin L^1$ is to be found in the lack of regularity of $\mathrm{rect}_a$. In fact, as we know, the higher the regularity of $f$, the faster $\widehat f$ goes to $0$ at infinity, and this is what fails for $\widehat{\mathrm{rect}_a}$: its decay is too slow to make its integral convergent. It is for this reason that, if we want (6.0.3) as a general rule, we have to consider $f$ regular enough.
On the other hand, in analogy with the Fourier series from which it comes, we expect (6.0.3) to be true for $f \in L^2$. But this poses another problem: the FT of an $L^2$ function is not always defined! This is simply because $L^2 \not\subset L^1$.
Remark 6.3.2. Neither $L^1 \subset L^2$ nor $L^2 \subset L^1$ is true. For instance
$$f(x) = \frac{1}{\sqrt{|x|}\,(1+|x|)} \in L^1(\mathbb R)\setminus L^2(\mathbb R), \quad\text{while}\quad f(x) = \frac{1}{1+|x|} \in L^2(\mathbb R)\setminus L^1(\mathbb R). \qquad\square$$

This means that an $L^2$ definition of the FT is not evident and, in principle, cannot be given through the Fourier integral (6.1.1).
To overcome these difficulties, we will now introduce a very important class of functions, the class of regular functions rapidly decaying at infinity, also called the Schwartz class in honor of its creator. This is a class of regular functions behaving well at infinity that has some remarkable properties:
• it is contained in both $L^1(\mathbb R^d)$ and $L^2(\mathbb R^d)$;
• the FT maps Schwartz functions into Schwartz functions;
• even more: the FT is a bijection on the Schwartz class, and (6.0.3) holds.
As we will see in the next section, this will lead to a definition of the FT on the space $L^2$. This is a rich program; let's begin with the
Definition 6.3.3 (Schwartz functions and space).
$$\mathscr S(\mathbb R^d) := \Big\{ f \in C^\infty(\mathbb R^d) \;:\; \sup_{x\in\mathbb R^d} (1+\|x\|)^N |\partial^k f(x)| < +\infty, \;\forall N \in \mathbb N,\; \forall k \in \mathbb N^d \Big\}.$$
In words: $\mathscr S(\mathbb R^d)$ is the space of $C^\infty(\mathbb R^d)$ functions rapidly decreasing at infinity, that is, faster than any polynomial.
Example 6.3.4. For instance
$$e^{-x^2} \in \mathscr S(\mathbb R), \qquad e^{-|x|} \notin \mathscr S(\mathbb R) \;(\text{not regular at } 0), \qquad \frac{1}{1+x^2} \notin \mathscr S(\mathbb R) \;(\text{not fast enough to } 0 \text{ at } \infty).$$
As we said:
Proposition 6.3.5. The Schwartz class $\mathscr S(\mathbb R^d)$ is contained and dense in $L^1(\mathbb R^d)$ and $L^2(\mathbb R^d)$.

Proof. ($d = 1$.) If $f \in \mathscr S$ then, for instance,
$$(1+|x|)^2 |f(x)| \leq C \implies |f(x)| \leq \frac{C}{(1+|x|)^2} \in L^1 \cap L^2.$$
The density demands a technical proof that we omit here. $\square$

The second step is the

Lemma 6.3.6. The FT maps $\mathscr S(\mathbb R^d)$ into itself, that is: if $f \in \mathscr S(\mathbb R^d)$ then $\widehat f \in \mathscr S(\mathbb R^d)$.

Proof. For simplicity, we will prove the theorem for $d = 1$. In this case
$$\mathscr S(\mathbb R) := \Big\{ f \in C^\infty(\mathbb R) \;:\; \sup_{x\in\mathbb R} (1+|x|)^N |f^{(k)}(x)| < +\infty, \;\forall N, k \in \mathbb N \Big\}.$$
Let $f \in \mathscr S(\mathbb R)$. To show that $\widehat f \in \mathscr S(\mathbb R)$ we have to check two things:
i) $\widehat f \in C^\infty$;
ii) $\widehat f^{(k)}$ is rapidly decaying at $\infty$.
Let's see how both are direct consequences of the multiplication–derivation duality. Indeed, because $f \in \mathscr S$ we have $x^k f \in L^1$. By Prop. 6.2.4 it follows that $\widehat f^{(k)}$ exists and, this being itself a FT, it is a continuous function as a consequence of the Riemann–Lebesgue theorem. Conclusion: $\widehat f \in C^k$ for every $k$, that is, $\widehat f \in C^\infty$.
Moreover, by the multiplication–derivation duality (6.2.5),
$$|(i2\pi\xi)^h\,\widehat f^{(k)}(\xi)| = \Big|\widehat{\partial^h\big((-i2\pi x)^k f(x)\big)}(\xi)\Big| \overset{(6.1.2)}{\leq} \big\|\partial^h\big((-i2\pi x)^k f(x)\big)\big\|_1 =: C_{h,k},$$
and $C_{h,k} < +\infty$ because, being $f \in \mathscr S$, $\partial^h(x^k f) \in L^1$, as can be easily checked. Therefore
$$\sup_{\xi\in\mathbb R} (1+|\xi|)^h |\widehat f^{(k)}(\xi)| < +\infty, \quad \forall h, k,$$
and this precisely means that $\widehat f \in \mathscr S(\mathbb R)$. $\square$

The FT is invertible on $\mathscr S(\mathbb R^d)$:

Theorem 6.3.7. The inversion formula
$$(6.3.1)\qquad f(x) = \widehat{\widehat f\,}(-x), \quad \forall x \in \mathbb R^d,$$
holds for every $f \in \mathscr S(\mathbb R^d)$. In particular, the FT is a bijection on $\mathscr S(\mathbb R^d)$.

Proof. ($d = 1$.) The proof consists in making rigorous the informal argument of the introduction to this Chapter. Suppose first that $f \in \mathscr S$ is $\equiv 0$ off $[-R,R]$ for a certain value of $R > 0$. Take $T$ big enough that $[-R,R] \subset [-T/2,T/2]$. Because $f \in L^2$, it can be written as the sum of its Fourier series:
$$f(x) \overset{L^2}{=} \sum_{n\in\mathbb Z} \left(\frac1T\int_{-T/2}^{T/2} f(y)e^{-i\frac{2\pi n}{T}y}\,dy\right) e^{i\frac{2\pi n}{T}x}.$$

Notice that, to avoid confusion, we didn't use the notation $\widehat f(n)$ for the $n$-th Fourier coefficient of the Fourier series, because we reserve the notation $\widehat f$ for the FT of $f$. In particular, because we assumed $f \equiv 0$ off $[-R,R] \subset [-T/2,T/2]$,
$$\int_{-T/2}^{T/2} f(y)e^{-i\frac{2\pi n}{T}y}\,dy = \int_{\mathbb R} f(y)e^{-i\frac{2\pi n}{T}y}\,dy = \widehat f\!\left(\frac nT\right),$$
so
$$f(x) \overset{L^2}{=} \sum_{n\in\mathbb Z} \widehat f\!\left(\frac nT\right) e^{i\frac{2\pi n}{T}x}\cdot\frac1T, \quad \forall T > 2R.$$
Call $S_T(x)$ the right-hand side. Because $f = S_T$ in $L^2$, in particular $f = S_T$ a.e.. Moreover, because of the regularity of $f$, by Thm 5.4.3 the Fourier series converges to $S_T$ also uniformly, hence in particular pointwise. Now, for every fixed $x$, as $T \longrightarrow +\infty$,
$$S_T(x) := \sum_{n\in\mathbb Z} \widehat f\!\left(\frac nT\right) e^{i\frac{2\pi n}{T}x}\cdot\frac1T \longrightarrow \int_{\mathbb R} \widehat f(\xi)e^{i2\pi\xi x}\,d\xi = \widehat{\widehat f\,}(-x).$$

Therefore $f \overset{a.e.}{=} S_T \longrightarrow \widehat{\widehat f\,}(-x)$, that is, $f(x) \overset{a.e.}{=} \widehat{\widehat f\,}(-x)$. Now, because $f$ is continuous (by assumption) and $\widehat{\widehat f\,}$ is continuous (by Riemann–Lebesgue), it follows that the previous identity must actually hold for every $x \in \mathbb R$.
Let's see how to remove the restriction that $f \equiv 0$ off $[-R,R]$. Take first a $C^\infty$ function $\varphi$ such that $\varphi \equiv 1$ on $[-1/2,1/2]$, $\varphi \equiv 0$ off $[-1,1]$, $0 \leq \varphi \leq 1$. Then set
$$f_R(x) := f(x)\,\varphi\!\left(\frac xR\right).$$
It is clear that if $f \in \mathscr S$ then $f_R \in \mathscr S$ and $f_R \equiv 0$ off $[-R,R]$. By the previous part of the proof,
$$f_R(x) \overset{a.e.}{=} \widehat{\widehat{f_R}}(-x).$$
It is also evident that $f_R(x) = f(x)\varphi(\frac xR) \longrightarrow f(x)$ as $R \longrightarrow +\infty$ for every $x$ at which $f(x)$ makes sense, that is, a.e.. On the other side,
$$(6.3.2)\qquad \widehat{\widehat{f_R}}(-x) = \int_{\mathbb R} \widehat{f_R}(\xi)\,e^{i2\pi x\xi}\,d\xi.$$
By dominated convergence, easily
$$\widehat{f_R}(\xi) = \int_{\mathbb R} f_R(y)e^{-i2\pi\xi y}\,dy \longrightarrow \int_{\mathbb R} f(y)e^{-i2\pi\xi y}\,dy = \widehat f(\xi), \quad \forall \xi.$$
We now wish to use this and dominated convergence to pass to the limit in the integral (6.3.2). To this aim we need a control on $|\widehat{f_R}(\xi)|$ independent of $R$. We do this by the trick of the derivative: by (6.2.3),
$$|\widehat{f_R}(\xi)| \leq \frac{\|f_R''\|_1}{4\pi^2|\xi|^2},$$
and because $f_R'' = f''\varphi(\frac\cdot R) + \frac2R f'\varphi'(\frac\cdot R) + \frac1{R^2} f\varphi''(\frac\cdot R)$ we have
$$\|f_R''\|_1 \leq \|f''\|_1\|\varphi\|_\infty + \frac2R\|f'\|_1\|\varphi'\|_\infty + \frac1{R^2}\|f\|_1\|\varphi''\|_\infty \leq K, \quad \forall R \geq 1.$$

Therefore,
$$|\widehat{f_R}(\xi)| \leq \frac{K}{|\xi|^2}.$$
This is a good bound at $\infty$ but a bad one near $0$. However, it is easy to fix this point because $|\widehat{f_R}(\xi)| \leq \|f_R\|_1 \leq \|f\|_1$, so $\widehat{f_R}$ is uniformly bounded (this is a good bound on any finite interval but a bad one at $\infty$). It is now easy to recognize that $|\widehat{f_R}(\xi)| \leq \frac{C}{1+|\xi|^2}$ for some constant $C$. We can now apply dominated convergence to (6.3.2) and conclude the proof of (6.3.1) for every $f \in \mathscr S(\mathbb R)$.
It remains to prove that the FT is a bijection. This is easy now: first, as an operator, the FT is clearly linear. Hence it is enough to check that if $\widehat f = 0$ then, necessarily, $f = 0$. According to the inversion formula,
$$f(x) \overset{a.e.}{=} \widehat{\widehat f\,}(-x) = \widehat 0(-x) = 0.$$
Because $f \in \mathscr S$ it is continuous, hence $f = 0$ a.e. necessarily means $f \equiv 0$ on $\mathbb R$. $\square$

6.4. The Fourier–Plancherel Transform


In the previous Section we have seen a setting, the Schwartz class, in which the FT works particularly well. Unfortunately, this class is too restrictive. A natural wish would be to extend the FT to the class $L^2$ but, as we pointed out above, because $L^2 \not\subset L^1$ we cannot directly use Definition (6.1.1). However, a remarkable property of the FT on the Schwartz space naturally suggests a possible way to define the FT for $L^2$ functions. The key property is the
Lemma 6.4.1 (duality Lemma). Let $f, g \in L^1$. Then
$$(6.4.1)\qquad \int_{\mathbb R^d} \widehat f\, g = \int_{\mathbb R^d} f\,\widehat g.$$

Proof. First notice that both members are well defined: $f, g \in L^1$ imply that $\widehat f, \widehat g$ are continuous and bounded by Riemann–Lebesgue, hence $\widehat f g,\; f\widehat g \in L^1$. The (6.4.1) is just an easy computation:
$$\int_{\mathbb R^d} \widehat f\, g = \int_{\mathbb R^d}\int_{\mathbb R^d} f(x)g(y)e^{-i2\pi x\cdot y}\,dx\,dy \overset{\text{Fubini}}{=} \int_{\mathbb R^d}\int_{\mathbb R^d} g(y)f(x)e^{-i2\pi x\cdot y}\,dy\,dx = \int_{\mathbb R^d} f\,\widehat g. \qquad\square$$
In particular, combining the inversion formula with the duality Lemma, we obtain the analogue of the Parseval identity (5.4.3) for the FT:
Corollary 6.4.2 (Parseval Identity).
$$(6.4.2)\qquad \|f\|_2 = \|\widehat f\|_2, \quad \forall f \in \mathscr S(\mathbb R^d).$$
Proof. Just notice that
$$\|f\|_2^2 = \int_{\mathbb R^d} f(x)\overline{f(x)}\,dx = \int_{\mathbb R^d} f(x)\,\overline{\widehat{\widehat f\,}(-x)}\,dx.$$
Now,
$$\overline{\widehat g(\xi)} = \overline{\int_{\mathbb R^d} g(y)e^{-i2\pi\xi\cdot y}\,dy} = \int_{\mathbb R^d} \overline{g(y)}\,e^{i2\pi\xi\cdot y}\,dy = \widehat{\overline g}(-\xi).$$
Therefore
$$\overline{\widehat{\widehat f\,}(-x)} = \widehat{\overline{\widehat f\,}}(x),$$
and, returning to the initial calculation,
$$\|f\|_2^2 = \int_{\mathbb R^d} f(x)\,\widehat{\overline{\widehat f\,}}(x)\,dx \overset{(6.4.1)}{=} \int_{\mathbb R^d} \widehat f(\xi)\,\overline{\widehat f(\xi)}\,d\xi = \|\widehat f\|_2^2. \qquad\square$$
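The Parseval identity can be probed numerically on the explicit pair $\mathrm{rect}_a \leftrightarrow a\,\mathrm{sinc}(\pi a\cdot)$: $\|\mathrm{rect}_a\|_2^2 = a$, so the squared $L^2$ norm of the transform should also equal $a$. The sketch below (not from the notes; the truncation radius is an arbitrary choice, and convergence is slow because of the $1/\xi^2$ tail) checks this:

```python
import numpy as np

# ||rect_a||_2^2 = a should equal int |a sinc(pi a xi)|^2 dxi.
# numpy's np.sinc(x) = sin(pi x)/(pi x), so a*np.sinc(a*xi) = sin(pi a xi)/(pi xi).
a = 1.0
xi = np.linspace(-1000, 1000, 2_000_001)   # truncated frequency domain
dxi = xi[1] - xi[0]

ft = a * np.sinc(a * xi)                   # = hat{rect_a}(xi)
norm_ft_sq = np.sum(ft**2) * dxi           # truncated integral of |hat|^2
print(norm_ft_sq)                          # close to a = 1
```

The residual discrepancy is dominated by the truncated $1/\xi^2$ tail, of order $1/(\pi^2 \cdot 1000)$.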

This opens a new perspective on the FT. As a linear operator on $\mathscr S(\mathbb R^d) \subset L^2(\mathbb R^d)$, the FT preserves the $L^2$ norm. We could say that the FT is an isometry on $L^2$. Actually, the FT is defined only on $\mathscr S$. This being dense in $L^2$, however, the Parseval identity leads to a natural extension to $L^2$:
Theorem 6.4.3 (Fourier–Plancherel). The FT extends to $L^2(\mathbb R^d)$ as an isometric bijection. The inversion formula holds in the sense that
$$(6.4.3)\qquad f(x) \overset{a.e.}{=} \widehat{\widehat f\,}(-x).$$
Proof. Let $f \in L^2$: by density, there exists $(f_n) \subset \mathscr S$ such that $f_n \overset{L^2}{\longrightarrow} f$. Consider $(\widehat{f_n}) \subset \mathscr S \subset L^2$. Let's check that this sequence is $L^2$ convergent. $L^2$ being complete, it is enough to check that $(\widehat{f_n})$ is a Cauchy sequence: by the Parseval identity,
$$\|\widehat{f_n} - \widehat{f_m}\|_2 = \|\widehat{f_n - f_m}\|_2 \overset{(6.4.2)}{=} \|f_n - f_m\|_2.$$
$(f_n)$ being $L^2$ convergent, it is a Cauchy sequence, hence the same holds for $(\widehat{f_n})$. We define
$$\widehat f := L^2\text{-}\lim_n \widehat{f_n}.$$
We have to show that this definition is coherent, that is, if another $g_n \overset{L^2}{\longrightarrow} f$ then $\lim_n \widehat{g_n} = \lim_n \widehat{f_n}$. But, again by the Parseval identity,
$$\|\widehat{g_n} - \widehat{f_n}\|_2 = \|\widehat{g_n - f_n}\|_2 \overset{(6.4.2)}{=} \|g_n - f_n\|_2 \longrightarrow 0.$$
Now $\widehat f$ is well defined for $f \in L^2$. It is clear that the Parseval identity holds for every $f \in L^2$. In particular the FT is injective on $L^2$. By proving the inversion formula (6.4.3) we will prove also that the FT is bijective. Let $f \in L^2$ and let $(f_n) \subset \mathscr S$ be such that $f_n \overset{L^2}{\longrightarrow} f$. Then, by (6.3.1),
$$(6.4.4)\qquad f_n(x) = \widehat{\widehat{f_n}}(-x), \quad \forall x \in \mathbb R^d.$$
As we know, after possibly passing to a subsequence, $f_n(x) \longrightarrow f(x)$ for a.e. $x$. What about the right-hand side? Calling $g_n := \widehat{f_n} \in \mathscr S$ (because $f_n \in \mathscr S$), we have $g_n \overset{L^2}{\longrightarrow} \widehat f$, hence $\widehat{\widehat{f_n}} = \widehat{g_n} \overset{L^2}{\longrightarrow} \widehat{\widehat f\,}$. After possibly passing to a further subsequence, $\widehat{\widehat{f_n}} \longrightarrow \widehat{\widehat f\,}$ a.e.. In conclusion, by extracting subsequences, the left-hand side of (6.4.4) converges a.e. to $f$, the right-hand side to $\widehat{\widehat f\,}(-x)$, and, because the limit is unique, we conclude that (6.4.3) holds. $\square$
L

The Fourier–Plancherel risk to appear as a quite abstract result. A natural question, in fact, is: ok, but how
do I compute in practice the FT of an L 2 function? Is there a formula? Can I use the (1.5.2)? Starting
94

with the last question, the answer is no in general, unless f ∈ L 2 ∩ L 1 (as, in particular, for Schwarz
functions). However, the (1.5.2) holds also for f ∈ L 2 in a slightly weaker sense, that is
Z
L2
f (ξ) = lim
D f (x)e−i2πξ ·x dx,
R→+∞ kx k6R

in which the limit is understood in the L2 sense. Let’s see some example.
Example 6.4.4 (Cauchy).
$$(6.4.5)\qquad \widehat{\frac{1}{a^2+\cdot^2}}(\xi) = \frac{\pi}{a}\,e^{-2\pi a|\xi|}, \quad (a > 0).$$

Sol. — Easily, $\frac{1}{a^2+x^2} \in L^1 \cap L^2$. We could therefore use the (1.5.2) to compute the FT. An alternative and much easier way is the following. We recall that, according to (6.1.4),
$$\widehat{e^{-a|\cdot|}}(\xi) = \frac{2a}{a^2+4\pi^2\xi^2}, \quad \xi \in \mathbb R, \;(a > 0).$$
Notice that both $\frac{1}{a^2+x^2},\, e^{-a|x|} \in L^2$, and
$$\frac{1}{a^2+x^2} = \frac{1}{2a}\,\widehat{e^{-a|\cdot|}}\!\left(\frac{x}{2\pi}\right).$$
By applying the FT to both sides, using the scaling property iii) of Prop. 6.1.6 and the inversion formula (6.4.3), we get
$$\widehat{\frac{1}{a^2+\cdot^2}}(\xi) = \frac{1}{2a}\cdot 2\pi\,\widehat{\widehat{e^{-a|\cdot|}}}(2\pi\xi) \overset{(6.4.3)}{=} \frac{\pi}{a}\,e^{-a|-2\pi\xi|} = \frac{\pi}{a}\,e^{-2\pi a|\xi|}. \qquad\square$$
Example 6.4.5.
$$\widehat{\mathrm{sinc}(a\pi\,\cdot)}(\xi) = \frac1a\,\mathrm{rect}_a(\xi).$$
Sol. — In this case $\mathrm{sinc}(a\pi\,\cdot) \notin L^1$ because $\int_{\mathbb R} |\mathrm{sinc}(a\pi x)|\,dx = \int_{\mathbb R} \big|\frac{\sin(a\pi x)}{a\pi x}\big|\,dx = +\infty$. However, $\mathrm{sinc}(a\pi\,\cdot) \in L^2$ because
$$\int_{\mathbb R} |\mathrm{sinc}(a\pi x)|^2\,dx = \int_{\mathbb R} \left(\frac{\sin(a\pi x)}{a\pi x}\right)^2 dx = \frac{1}{a\pi}\int_{\mathbb R} \left(\frac{\sin y}{y}\right)^2 dy < +\infty.$$
The function $\big(\frac{\sin y}{y}\big)^2$ is continuous at $y = 0$ and bounded by $\frac{1}{y^2}$; therefore $\big(\frac{\sin y}{y}\big)^2 \leq \frac{C}{1+y^2}$, which is integrable. To compute the FT we cannot in this case use the $L^1$ formula. However, we know that
$$\widehat{\mathrm{rect}_a}(\xi) = a\,\mathrm{sinc}(a\pi\xi) \iff \mathrm{sinc}(a\pi x) = \frac1a\,\widehat{\mathrm{rect}_a}(x).$$
Applying the $L^2$ FT to both sides and recalling the inversion formula,
$$\widehat{\mathrm{sinc}(a\pi\,\cdot)}(\xi) = \frac1a\,\widehat{\widehat{\mathrm{rect}_a}}(\xi) = \frac1a\,\mathrm{rect}_a(-\xi) = \frac1a\,\mathrm{rect}_a(\xi). \qquad\square$$
The previous example is interesting because it shows that, in general, the Riemann–Lebesgue Theorem doesn't hold for the $L^2$ FT. In particular, $\widehat f$ is not necessarily continuous.
We conclude this Section by proving (4.4.2):
Proposition 6.4.6. Let $H_n$ be the Hermite polynomials. Then $\overline{\mathrm{Span}}\big(\frac{1}{\sqrt{n!}}H_n : n \in \mathbb N\big) = L^2(\mathbb R, \mathcal N)$.

Proof. We will prove that if $\langle\phi, x^n\rangle = 0$ for every $n \in \mathbb N$ then $\phi = 0$; from this the conclusion easily follows. Notice that
$$0 = \langle\phi, x^n\rangle = \int_{\mathbb R} \phi(x)\,x^n\, e^{-\frac{x^2}{2}}\,\frac{dx}{\sqrt{2\pi}}, \quad \forall n \in \mathbb N.$$
Therefore, multiplying the previous identity by $\frac{(-i\xi)^n}{n!}$ and summing up all the identities, we have
$$(6.4.6)\qquad 0 = \sum_n \int_{\mathbb R} \phi(x)\,\frac{(-i\xi x)^n}{n!}\, e^{-\frac{x^2}{2}}\,\frac{dx}{\sqrt{2\pi}} = \int_{\mathbb R} \phi(x)\sum_n\frac{(-i\xi x)^n}{n!}\, e^{-\frac{x^2}{2}}\,\frac{dx}{\sqrt{2\pi}} = \int_{\mathbb R} \phi(x)\,e^{-\frac{x^2}{2}}\,e^{-i\xi x}\,\frac{dx}{\sqrt{2\pi}}.$$
The previous passage makes perfect sense because $\sum_n \frac{(-i\xi x)^n}{n!}$ converges in $L^2(\mathbb R,\mathcal N)$ to $e^{-i\xi x}$, for every fixed $\xi \in \mathbb R$. It is sufficient to show that $\sum_n \big\|\frac{(\xi x)^n}{n!}\big\|_2 < +\infty$. Now
$$\|x^n\|_2^2 = \int_{\mathbb R} x^{2n}e^{-\frac{x^2}{2}}\,\frac{dx}{\sqrt{2\pi}} = (2n-1)\int_{\mathbb R} x^{2n-2}e^{-\frac{x^2}{2}}\,\frac{dx}{\sqrt{2\pi}} = \ldots = (2n-1)!!,$$
therefore
$$\sum_n \left\|\frac{(\xi x)^n}{n!}\right\|_2 = \sum_n \frac{|\xi|^n}{n!}\sqrt{(2n-1)!!},$$
and this series converges easily for every $\xi \in \mathbb R$. Returning to (6.4.6), we have that
$$\widehat{\phi\, e^{-\frac{\cdot^2}{2}}}(\xi) \equiv 0,$$
hence, by the inversion formula,
$$\phi(x)\,e^{-\frac{x^2}{2}} \equiv 0 \iff \phi \equiv 0. \qquad\square$$

6.5. Convolution
A very important operation in the FT context is the

Definition 6.5.1. Given two functions $f$ and $g$, the function
$$(f*g)(x) := \int_{\mathbb R^d} f(x-y)g(y)\,dy \equiv \int_{\mathbb R^d} f(y)g(x-y)\,dy, \quad x \in \mathbb R^d,$$
is called the convolution of $f$ and $g$.

The first thing we need to check is that the convolution is actually well defined.

Theorem 6.5.2. Let $f \in L^p$ and $g \in L^1$. Then $f*g$ is well defined, it belongs to $L^p$, and the Young inequality holds:
$$(6.5.1)\qquad \|f*g\|_p \leq \|f\|_p\,\|g\|_1.$$

Proof. We skip the details concerning the measurability of $f*g$ and prove the Young inequality in the simplest case $p = 1$. We have
$$\|f*g\|_1 = \int_{\mathbb R^d}\left|\int_{\mathbb R^d} f(x-y)g(y)\,dy\right| dx \leq \int_{\mathbb R^d}\int_{\mathbb R^d} |f(x-y)||g(y)|\,dy\,dx$$
$$\overset{\text{Fubini}}{=} \int_{\mathbb R^d} |g(y)|\int_{\mathbb R^d} |f(x-y)|\,dx\,dy \overset{z=x-y}{=} \int_{\mathbb R^d} |g(y)|\int_{\mathbb R^d} |f(z)|\,dz\,dy = \|f\|_1\,\|g\|_1. \qquad\square$$
The Fourier transform converts a convolution into the product of FTs.
Theorem 6.5.3. Let $f \in L^1$ or $L^2$ and $g \in L^1$. Then
$$(6.5.2)\qquad \widehat{f*g} = \widehat f\,\widehat g.$$
Proof. We do the proof in the simpler case $f, g \in L^1$. By the previous theorem $f*g \in L^1$, so we can compute the FT:
$$\widehat{f*g}(\xi) = \int_{\mathbb R^d}(f*g)(y)e^{-i2\pi\xi\cdot y}\,dy = \int_{\mathbb R^d}\left(\int_{\mathbb R^d} f(y-x)g(x)\,dx\right)e^{-i2\pi\xi\cdot y}\,dy$$
$$\overset{\text{Fubini}}{=} \int_{\mathbb R^d} g(x)e^{-i2\pi\xi\cdot x}\left(\int_{\mathbb R^d} f(y-x)e^{-i2\pi\xi\cdot(y-x)}\,dy\right)dx = \widehat f(\xi)\,\widehat g(\xi). \qquad\square$$
Example 6.5.4. Solve the equation
$$u'' - u = e^{-|x|}, \quad x \in \mathbb R.$$

Sol. — Assuming that $u$ fulfills all the requirements, we can apply the FT to both members of the equation, getting
$$\widehat{u''} - \widehat u = \widehat{e^{-|\cdot|}} = \frac{2}{1+4\pi^2\xi^2}.$$
Now,
$$\widehat{u''}(\xi) \overset{(6.2.1)}{=} (i2\pi\xi)^2\,\widehat u(\xi) = -4\pi^2\xi^2\,\widehat u(\xi),$$
hence we obtain
$$(-4\pi^2\xi^2-1)\,\widehat u = \frac{2}{1+4\pi^2\xi^2} \iff \widehat u = -\frac{2}{(1+4\pi^2\xi^2)^2} = -\frac12\left(\frac{2}{1+4\pi^2\xi^2}\right)^2 = -\frac12\left(\widehat{e^{-|\cdot|}}\right)^2.$$
Now, by writing
$$\left(\widehat{e^{-|\cdot|}}\right)^2 = \widehat{e^{-|\cdot|}}\;\widehat{e^{-|\cdot|}} = \widehat{e^{-|\cdot|}*e^{-|\cdot|}},$$
by the inversion theorem we obtain
$$u(x) = -\frac12\left(e^{-|\cdot|}*e^{-|\cdot|}\right)(x) = -\frac12\int_{\mathbb R} e^{-|x-y|}e^{-|y|}\,dy.$$
Let's compute the integral:
$$\int_{\mathbb R} e^{-|x-y|}e^{-|y|}\,dy = \int_{-\infty}^0 e^{-|x-y|}e^{y}\,dy + \int_0^{+\infty} e^{-|x-y|}e^{-y}\,dy.$$
If for instance $x \geq 0$ this equals
$$\int_{-\infty}^0 e^{-(x-y)}e^{y}\,dy + \int_0^{x} e^{-(x-y)}e^{-y}\,dy + \int_x^{+\infty} e^{-(y-x)}e^{-y}\,dy = e^{-x}\int_{-\infty}^0 e^{2y}\,dy + e^{-x}\int_0^x dy + e^{x}\int_x^{+\infty} e^{-2y}\,dy$$
$$= \frac{e^{-x}}{2} + x e^{-x} + e^{x}\frac{e^{-2x}}{2} = e^{-x}(1+x),$$
while, for $x < 0$, an analogous computation (or the evenness of the convolution of even functions) gives $e^{x}(1-x)$. Hence, in conclusion, $u(x) = -\frac12 e^{-|x|}(1+|x|)$. $\square$
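As a sanity check (not part of the notes), one can verify by finite differences that the candidate $u(x) = -\tfrac12 e^{-|x|}(1+|x|)$ really solves $u'' - u = e^{-|x|}$; the grid below stays away from the kink at $x = 0$, where $u$ is not twice differentiable in the classical sense:

```python
import numpy as np

# Residual of u'' - u - e^{-|x|} for u(x) = -e^{-|x|}(1+|x|)/2,
# with u'' approximated by central second differences, away from x = 0.
x = np.linspace(0.5, 5.0, 2001)
h = x[1] - x[0]
u = lambda t: -0.5*np.exp(-np.abs(t))*(1 + np.abs(t))

u_xx = (u(x+h) - 2*u(x) + u(x-h)) / h**2   # central second difference
residual = u_xx - u(x) - np.exp(-np.abs(x))
print(np.max(np.abs(residual)))            # O(h^2), essentially zero
```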

Example 6.5.5. Solve the equation
$$u(x) = \lambda\int_{\mathbb R} e^{-|x-y|}\,u(y)\,dy + e^{-|x|}.$$

Sol. — First notice that the equation can be rewritten as
$$u = \lambda\, e^{-|\cdot|}*u + e^{-|\cdot|}.$$
By applying the FT to both members we obtain
$$\widehat u = \lambda\,\widehat{e^{-|\cdot|}}\,\widehat u + \widehat{e^{-|\cdot|}} \iff \left(1 - \lambda\,\widehat{e^{-|\cdot|}}\right)\widehat u = \widehat{e^{-|\cdot|}}.$$
Now, recall that by (6.1.4)
$$\widehat{e^{-|\cdot|}}(\xi) = \frac{2}{1+4\pi^2\xi^2},$$
therefore
$$\widehat u(\xi) = \frac{\frac{2}{1+4\pi^2\xi^2}}{1 - \lambda\frac{2}{1+4\pi^2\xi^2}} = \frac{2}{1+4\pi^2\xi^2-2\lambda} = \frac{2}{(1-2\lambda)+4\pi^2\xi^2}.$$
Let's look carefully at this $\widehat u$. If for instance $1-2\lambda = 0$, that is $\lambda = \frac12$,
$$\widehat u = \frac{2}{4\pi^2\xi^2} \notin L^1, L^2,$$
because of the singular behavior at $\xi = 0$. The same happens if $1-2\lambda < 0$: in this case we could write
$$(1-2\lambda)+4\pi^2\xi^2 = (2\pi\xi)^2 - (2\lambda-1) = \left(2\pi\xi-\sqrt{2\lambda-1}\right)\left(2\pi\xi+\sqrt{2\lambda-1}\right),$$
hence
$$\widehat u = \frac{2}{\left(2\pi\xi-\sqrt{2\lambda-1}\right)\left(2\pi\xi+\sqrt{2\lambda-1}\right)}.$$
Such a $\widehat u \notin L^1, L^2$ because of the singularities at $\xi = \pm\frac{\sqrt{2\lambda-1}}{2\pi}$.
The conclusion of these remarks is that a solution $u \in L^1$ or $L^2$ is possible only if $1-2\lambda > 0$, that is $\lambda < \frac12$. In this case
$$\widehat u = \frac{2}{(1-2\lambda)+4\pi^2\xi^2} = \frac{1}{\sqrt{1-2\lambda}}\,\frac{2\sqrt{1-2\lambda}}{(\sqrt{1-2\lambda})^2+4\pi^2\xi^2} \overset{(6.1.4)}{=} \frac{1}{\sqrt{1-2\lambda}}\,\widehat{e^{-\sqrt{1-2\lambda}\,|\cdot|}},$$
that is,
$$u(x) = \frac{1}{\sqrt{1-2\lambda}}\,e^{-\sqrt{1-2\lambda}\,|x|}. \qquad\square$$
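The solution can be checked numerically by plugging it back into the integral equation. The sketch below (not from the notes; $\lambda = 1/4$, grid and truncation are arbitrary choices) approximates the convolution on a truncated grid and compares both sides away from the grid edges:

```python
import numpy as np

# Check u(x) = e^{-sqrt(1-2 lam)|x|}/sqrt(1-2 lam) against
# u = lam * (e^{-|.|} * u) + e^{-|.|}, for lam = 1/4.
lam = 0.25
c = np.sqrt(1 - 2*lam)
x = np.linspace(-25, 25, 5001)
dx = x[1] - x[0]
u = np.exp(-c*np.abs(x)) / c
kernel = np.exp(-np.abs(x))

rhs = lam * np.convolve(kernel, u, mode='same') * dx + np.exp(-np.abs(x))
err = np.max(np.abs(u - rhs)[np.abs(x) <= 5])   # ignore edge truncation
print(err)                                      # small
```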
6.5.1. Approximate units. One of the most important applications of the convolution is to the regularization of an irregular function. The idea is simple: let's look at the convolution
$$f*g(x) = \int_{\mathbb R^d} f(x-y)g(y)\,dy$$
as a "mean" of values of $f$ around the point $x$, using $g$ as weight. That is, imagine $g \geq 0$ such that $\int_{\mathbb R^d} g = 1$. For example, take
$$g_\varepsilon := \frac{1}{(2\varepsilon)^d}\,\chi_{[-\varepsilon,\varepsilon]^d}.$$
This $g_\varepsilon \geq 0$ and $\int_{\mathbb R^d} g_\varepsilon = \frac{1}{(2\varepsilon)^d}\int_{[-\varepsilon,\varepsilon]^d} dx = \frac{(2\varepsilon)^d}{(2\varepsilon)^d} = 1$. Then
$$f_\varepsilon(x) := f*g_\varepsilon(x) = \frac{1}{(2\varepsilon)^d}\int_{[-\varepsilon,\varepsilon]^d} f(x-y)\,dy \equiv \frac{1}{(2\varepsilon)^d}\int_{C(x,\varepsilon]} f(y)\,dy,$$
where $C(x,\varepsilon] := \{y \in \mathbb R^d : |y_j - x_j| \leq \varepsilon,\; j=1,\ldots,d\}$ is a cube of side $2\varepsilon$ centered at $x$. Therefore $f*g_\varepsilon$ in this case represents the integral mean of $f$ on the cube $C(x,\varepsilon]$. The idea is that possible irregularities of $f$ are mollified. At the same time, in some suitable sense, we expect that $f_\varepsilon \longrightarrow f$ as $\varepsilon \longrightarrow 0+$. We understand here why the convolution is important for signal processing: it is a way to filter out noise on a signal.
It is convenient to consider the situation in a general setting. Let's begin with the
Definition 6.5.6. Let $g \in L^1$ be such that
$$g \geq 0 \;\text{a.e.}, \qquad \int_{\mathbb R^d} g = 1.$$
The family of functions $(g_\varepsilon)_{\varepsilon>0}$ defined as
$$g_\varepsilon(x) := \frac{1}{\varepsilon^d}\,g\!\left(\frac x\varepsilon\right), \quad x \in \mathbb R^d,$$
is called an approximate unit.
Remark 6.5.7. One easily checks that $g_\varepsilon \in L^1$ and
$$\int_{\mathbb R^d} g_\varepsilon(x)\,dx = \frac{1}{\varepsilon^d}\int_{\mathbb R^d} g\!\left(\frac x\varepsilon\right)dx \overset{y=x/\varepsilon}{=} \frac{1}{\varepsilon^d}\int_{\mathbb R^d} g(y)\,\varepsilon^d\,dy = \int_{\mathbb R^d} g = 1. \qquad\square$$
Example 6.5.8 (Gaussian approximate unit). An important approximate unit is the one defined by the gaussian density
$$g(x) = \frac{1}{\sqrt{(2\pi)^d}}\,e^{-\frac{|x|^2}{2}} \implies g_\varepsilon(x) = \frac{1}{\sqrt{(2\pi\varepsilon^2)^d}}\,e^{-\frac{|x|^2}{2\varepsilon^2}}. \qquad\square$$

As anticipated we have the

Theorem 6.5.9. Let $(g_\varepsilon)$ be an approximate unit. Then
$$\lim_{\varepsilon\to0+} f*g_\varepsilon \overset{L^p}{=} f, \quad \forall f \in L^p.$$

Proof. We will limit ourselves to a special (but important!) case: $p = 2$ and $g_\varepsilon$ the Gaussian approximate unit. Notice that
$$\|f*g_\varepsilon - f\|_2 \overset{\text{Plancherel}}{=} \|\widehat{f*g_\varepsilon} - \widehat f\|_2 \overset{(6.5.2)}{=} \|\widehat f\,(\widehat{g_\varepsilon}-1)\|_2.$$
Now,
$$\|\widehat f\,(\widehat{g_\varepsilon}-1)\|_2^2 = \int_{\mathbb R^d} |\widehat f(\xi)|^2\left(1 - e^{-2\pi^2\varepsilon^2|\xi|^2}\right)^2 d\xi.$$
Here $|\xi|$ denotes the $\mathbb R^d$ norm of $\xi$. Noticing that $1-e^{-2\pi^2\varepsilon^2|\xi|^2} \longrightarrow 0$ for every $\xi \in \mathbb R^d$ and $0 \leq 1-e^{-2\pi^2\varepsilon^2|\xi|^2} \leq 1$, so that $|\widehat f(\xi)|^2 \in L^1(\mathbb R^d)$ is a summable dominant for the integrand, the conclusion follows by Lebesgue's dominated convergence. $\square$
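The mollifying effect can be seen concretely. The sketch below (not part of the notes; grid and the sequence of $\varepsilon$ values are arbitrary choices) convolves a discontinuous step with the Gaussian approximate unit and checks that the $L^2$ distance to the original function shrinks as $\varepsilon \to 0+$:

```python
import numpy as np

# Convolve a step with the Gaussian approximate unit g_eps and watch
# ||f * g_eps - f||_2 decrease as eps -> 0.
x = np.linspace(-5, 5, 4001)
dx = x[1] - x[0]
f = np.where(np.abs(x) <= 1, 1.0, 0.0)          # rect_2, not continuous

def mollify(eps):
    g = np.exp(-x**2/(2*eps**2)) / np.sqrt(2*np.pi*eps**2)
    return np.convolve(f, g, mode='same') * dx  # samples of f * g_eps

errs = [np.sqrt(np.sum((mollify(e) - f)**2)*dx) for e in (0.4, 0.2, 0.1)]
print(errs)   # decreasing
```

Each $f*g_\varepsilon$ is smooth, while the limit recovers the discontinuous $f$ only in the $L^2$ sense.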

6.6. Applications
In this final section we will illustrate some remarkable applications of the FT. We will start with some classical applications to the solution of linear PDEs. Similarly to Fourier series, the aim here is to show a method that can be used in many situations. Other natural applications of the FT are to signal processing. The Shannon Sampling Theorem concerns the problem of sampling a given signal, and on it are based essentially all modern digital devices (from CDs to YouTube). The last application is to X-ray tomography, and it opens a window onto the important field of image analysis. All these applications show the relevance of the FT in everyday life.
6.6.1. Heat diffusion. The classical equation describing heat diffusion in an infinite volume is the PDE
$$(6.6.1)\qquad \begin{cases} u_t(t,x) = \frac{\sigma^2}{2}\,\Delta u(t,x), & t > 0,\; x \in \mathbb R^3,\\ u(0,x) = \varphi(x), & x \in \mathbb R^3. \end{cases}$$
Here $u = u(t,x)$ represents the temperature at time $t \geq 0$ at each point of an infinite body with initial temperature $\varphi$.
Setting
$$v(t,\xi) := \widehat{u(t,\cdot)}(\xi) \equiv \int_{\mathbb R^3} u(t,x)\,e^{-2i\pi\xi\cdot x}\,dx, \quad \xi \in \mathbb R^3,$$

we have Z Z
t (t, ])(ξ) =
uF ut (t, y)e−i2πξ y dy = ∂t u(t, y)e−i2πξ y dy = ∂t u(t, ])(ξ) = vt (t, ξ).
F
R3 R3
Of course here we’re implicitly assuming that ∂t can be passed under integral. At this point we don’t know anything
on the solution u, so for the moment is a formal passage. Once we’ll have found the solution it should be checked
that this operation can be done.
According to properties of the FT
f g
∆u(t,
G ])(ξ) = ∂xx u +G∂yy u + ∂zz u = (−i2πξ1 ) 2 + (−i2πξ2 ) 2 + (−i2πξ3 ) 2 u(t,
F ])(ξ) = −4π 2 |ξ | 2 v(t, ξ).
So, in term of v the heat equation becomes
vt (t, ξ) = −2π 2 σ 2 |ξ | 2 v(t, ξ), t > 0, ξ ∈ R.
Of course
v(0, ξ) = u(0,
F ])(ξ) = ϕ
D(ξ),
so we have to solve the Cauchy problem


 vt (t, ξ) = −2π 2 σ 2 |ξ | 2 v(t, ξ), t > 0, ξ ∈ R.
(6.6.2) 
 v(0, ξ) = ϕ ξ ∈ R.

D(ξ),

Freezing ξ, the first is a simple ordinary differential equation and easily we get
2 σ 2 |ξ | 2 t
v(t, ξ) = ϕ
D(ξ)e−2π , t > 0, ξ ∈ R.
Now we proceed with the inversion. Recalling the Fourier transform of the gaussian
2
− |]|
G
E 2
− |]| 2
p 2 σ 2 |ξ | 2 2 σ 2 t |ξ | 2 e 4σ 2 t
e 2σ (ξ) = 2πσ e
2 −2π
, =⇒ e −2π
=√ (ξ),
2πσ 2 t
hence
2 2
!
1 − |]|2 1 − |]|2
F G
u(t, ])(ξ) = v(t, ξ) = √ ϕ
F D(ξ) e 4σ t (ξ) = √ ϕ ∗ e 2σ t (ξ),
2t 2πσ 2 t
that gives, finally,
! Z 2
|]| 2
1 − 1 − | x−y2 |
(6.6.3) u(t, x) = √ ϕ∗e 2σ 2 t (x) = √ dy, t > 0, x ∈ R3 .
ϕ(y)e 2σ t
2πσ 2 t 2πσ 2 t R3
Some remarks on this solution. First: the formula does not make sense for t = 0. However, setting

g_t(x) := (2πσ²t)^{−3/2} e^{−|x|²/(2σ²t)},

a Gaussian kernel, one can show that

u(t,·) = φ ∗ g_t → φ in L^p, as t → 0⁺, if φ ∈ L^p(R³).

This is the sense in which we may think the initial condition in (6.6.1) is fulfilled. Once u is defined, it is a boring
but easy job to apply differentiation under the integral sign and check that u is differentiable in t and x for t > 0 and
x ∈ R³. Actually one discovers that, no matter how bad φ is, u ∈ C^∞(]0,+∞[×R³) (this is a typical phenomenon of heat
diffusion, called the regularizing effect). By computing u_t and Δu explicitly it is then easy to check that u fulfills the
heat equation.
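The computation above can also be checked numerically. The sketch below is an illustration of mine, not part of the text: it evaluates the convolution formula (6.6.3) by a midpoint Riemann sum in the one-dimensional case (where the normalizing factor becomes (2πσ²t)^{−1/2}), with the arbitrary choice φ(y) = e^{−y²}, for which the integral has a closed form to compare against; the heat equation itself is then verified by finite differences.

```python
import math

SIGMA = 1.0  # diffusion coefficient σ (illustrative value)

def phi(y):
    """Initial temperature: a Gaussian bump, chosen so the solution has a closed form."""
    return math.exp(-y * y)

def u(t, x, n=20000, L=10.0):
    """1-dimensional version of (6.6.3): (2πσ²t)^{-1/2} ∫ φ(y) e^{-(x-y)²/(2σ²t)} dy,
    evaluated by a midpoint Riemann sum on [-L, L]."""
    s = SIGMA**2 * t
    h = 2 * L / n
    total = 0.0
    for i in range(n):
        y = -L + (i + 0.5) * h
        total += phi(y) * math.exp(-(x - y)**2 / (2 * s))
    return total * h / math.sqrt(2 * math.pi * s)

def u_exact(t, x):
    """Gaussian convolved with the heat kernel, in closed form:
    e^{-x²/(1+2σ²t)} / sqrt(1+2σ²t)."""
    c = 1 + 2 * SIGMA**2 * t
    return math.exp(-x * x / c) / math.sqrt(c)

# Check u_t = (σ²/2) u_xx by centered finite differences at (t, x) = (0.5, 0.3):
h = 1e-3
ut = (u_exact(0.5 + h, 0.3) - u_exact(0.5 - h, 0.3)) / (2 * h)
uxx = (u_exact(0.5, 0.3 + h) - 2 * u_exact(0.5, 0.3) + u_exact(0.5, 0.3 - h)) / h**2
```

As t → 0⁺ one can also check numerically that u(t,x) → φ(x), in agreement with the L^p statement above.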

6.6.2. The Black–Scholes Equation. The Black–Scholes equation is a PDE describing the behavior
of the value of a financial derivative written on a risky asset under certain assumptions. We will not enter
into the derivation of the equation; we limit ourselves to a qualitative description of the model. The Black–Scholes
model describes a simple market where two assets are available:
• a risk free asset, also called bank bond, because we may think of it as a bank account with
instantaneous return rate r, meaning that
dB(t) = r B(t) dt;
• a risky asset, usually called stock, characterized by a random return modeled as
dS(t) = λS(t) dt + (Gaussian noise with mean 0 and variance σ² dt).
The nature of the two quantities B and S is very different. B is deterministic, in the sense that B(t) =
e^{rt} B(0) if B(0) is the initial amount invested in B. S is stochastic, in the sense that S(t) is a random
variable. The uncertainty in S is what makes an investment in it risky. It is therefore natural to look
for forms of protection against financial risk by considering some kind of "insurance" delivering at some
future time T > 0 a pre-determined function of S(T). A typical example of this type of protection is a
contract paying a minimum amount K if S(T) is below K, and S(T) if S(T) is above. This quantity is
also called the payoff. In other words

payoff = { K, if S(T) < K;  S(T), if S(T) > K }  ≡  max{S(T), K}.
As we see, the payoff is a function of the value of the stock at time T, that is, F = F(S(T)). In the previous
example F(x) = max{x, K}. Now one of the main questions is: what is the price of such a contract? That
is, how much should I pay today (time t = 0) to receive such an indemnity at maturity T?
What Black and Scholes discovered is a PDE whose solution leads to the solution of this
problem. Let V = V(t,x) be the price of the contract at time t if the price of the stock is S(t) = x.
Notice that at time t = T, the maturity, the contract pays F(S(T)), hence F(x) if S(T) = x. In this case
the price at time T of the contract is just F(x). Indeed: if V(T,x) < F(x), then an investor could buy
the contract at time T for V(T,x), obtaining immediately F(x) and therefore realizing with certainty a profit
F(x) − V(T,x) > 0. On the other hand, if V(T,x) > F(x), it would be the issuer who realizes a riskless
profit: the dealer would sell the contract, receiving V(T,x) and delivering to the customer F(x), therefore
realizing a profit V(T,x) − F(x) > 0. In both cases, one of the two sides of the deal would make
money without any risk. This is called arbitrage, and we assume that this is impossible in the market. Of
course, we might discuss whether this assumption is realistic or not, but this is another issue. In conclusion,
(6.6.4) V (T, x) = F (x), x > 0.
Notice that the restriction x > 0 is because x represents a price, hence a non-negative quantity. By
similar arguments, Black and Scholes derived a condition on V(t,x) at any time t < T. They proved that,
to prevent arbitrages, it must be
(6.6.5)   ∂_t V(t,x) + (1/2) σ²x² ∂_xx V(t,x) + r x ∂_x V(t,x) − r V(t,x) = 0.

Putting together (6.6.4) and (6.6.5) we get the problem

(6.6.6)   ∂_t V(t,x) + (1/2) σ²x² ∂_xx V(t,x) + r x ∂_x V(t,x) − r V(t,x) = 0,   0 ≤ t ≤ T, x > 0,
          V(T,x) = F(x),   x > 0.
The problem (6.6.6) is apparently similar to (6.6.1). However, it is not evident how to use the FT, since
the spatial domain x > 0 is not the whole line. There is a simple trick to transform (6.6.6) into an equation
manageable by the FT. Set y = log x, that is x = e^y, and
u(t,y) := V(t, e^y).
Then
∂_t u(t,y) = ∂_t V(t,e^y),   ∂_y u(t,y) = e^y ∂_x V(t,e^y),   ∂_yy u(t,y) = (e^y)² ∂_xx V(t,e^y).
Therefore, u solves

(6.6.7)   ∂_t u(t,y) + (1/2) σ² ∂_yy u(t,y) + r ∂_y u(t,y) − r u(t,y) = 0,   0 ≤ t ≤ T, y ∈ R,
          u(T,y) = F(e^y),   y ∈ R.

We can now use the FT to solve this problem. Let
v(t,ξ) := (u(t,·))^∧(ξ).

Then
∂_t v + (1/2) σ² (−i2πξ)² v + r (−i2πξ) v − r v = 0,
or
∂_t v = ( 2π²σ²ξ² + i2πrξ + r ) v.
This is an ordinary differential equation in v(t,ξ) (ξ fixed). The final condition on u(T,y) becomes
v(T,ξ) = (u(T,·))^∧(ξ) = (F(e^·))^∧(ξ).
Therefore
v(t,ξ) = e^{(2π²σ²ξ² + i2πrξ + r)(t−T)} v(T,ξ) = e^{−r(T−t)} e^{−i2πr(T−t)ξ} e^{−2π²σ²(T−t)ξ²} (F(e^·))^∧(ξ).
We can now return to y. First recall that
e^{−2π²σ²(T−t)ξ²} = ( (2πσ²(T−t))^{−1/2} e^{−(·)²/(2σ²(T−t))} )^∧(ξ),
therefore
e^{−2π²σ²(T−t)ξ²} (F(e^·))^∧(ξ) = ( (2πσ²(T−t))^{−1/2} e^{−(·)²/(2σ²(T−t))} ∗ F(e^·) )^∧(ξ).
Moreover, multiplication of the FT by e^{−i2πr(T−t)ξ} corresponds to a translation by r(T−t) in the original variable.
Putting together these facts,

u(t,y) = e^{−r(T−t)} (2πσ²(T−t))^{−1/2} ( e^{−(·)²/(2σ²(T−t))} ∗ F(e^·) )(y − r(T−t))
       = e^{−r(T−t)} ∫_R F( e^{y−r(T−t)−η} ) e^{−η²/(2σ²(T−t))} / √(2πσ²(T−t)) dη.

Returning to V we finally obtain

(6.6.8)   V(t,x) = u(t, log x) = e^{−r(T−t)} ∫_R F( e^{log x − r(T−t) − η} ) e^{−η²/(2σ²(T−t))} / √(2πσ²(T−t)) dη

          (with z := −η/(σ√(T−t)))

          = e^{−r(T−t)} ∫_R F( x e^{−r(T−t)+σ√(T−t) z} ) e^{−z²/2} / √(2π) dz.
This is the Black–Scholes formula, commonly used to price such contracts. For instance, in the case of the example at the
beginning, F(x) = max{K, x}, the price at t = 0 if S(0) = x is
V(0,x) = e^{−rT} ∫_R max{ x e^{−rT+σ√T z}, K } e^{−z²/2} / √(2π) dz

       = e^{−rT} ( K ∫_{−∞}^{d} e^{−z²/2}/√(2π) dz + x e^{−rT} ∫_{d}^{+∞} e^{σ√T z} e^{−z²/2}/√(2π) dz ),

where d := (1/(σ√T)) log(K/x) + (r/σ)√T is the point where x e^{−rT+σ√T z} = K. Completing the square
in the second integral (σ√T z − z²/2 = σ²T/2 − (z − σ√T)²/2) we get

V(0,x) = e^{−rT} ( K Φ(d) + x e^{(σ²/2 − r)T} Φ( σ√T − d ) ),
where we denoted by
Φ(w) := ∫_{−∞}^{w} e^{−z²/2} / √(2π) dz
the distribution function of the standard Gaussian. This function cannot be expressed in terms of elementary
functions; however, extremely good approximations are available through power series. Therefore, the formula above
provides a good numerical recipe to price the insurance.
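To make the final formula concrete, here is a sketch of mine (function names and parameter values are illustrative, not from the text) that implements the closed-form price just derived for the payoff max{S(T), K} and cross-checks it against a direct Riemann-sum evaluation of the z-integral above; Φ is obtained from the error function.

```python
import math

def Phi(w):
    """Distribution function of the standard Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))

def price_closed(x, K, r, sigma, T):
    """Closed-form price derived above for the payoff max{S(T), K} at t = 0."""
    d = math.log(K / x) / (sigma * math.sqrt(T)) + (r / sigma) * math.sqrt(T)
    return math.exp(-r * T) * (K * Phi(d)
            + x * math.exp((sigma**2 / 2 - r) * T) * Phi(sigma * math.sqrt(T) - d))

def price_quadrature(x, K, r, sigma, T, n=100000, L=10.0):
    """Midpoint Riemann sum of e^{-rT} ∫ max{x e^{-rT+σ√T z}, K} e^{-z²/2}/√(2π) dz."""
    h = 2 * L / n
    total = 0.0
    for i in range(n):
        z = -L + (i + 0.5) * h
        total += max(x * math.exp(-r * T + sigma * math.sqrt(T) * z), K) \
                 * math.exp(-z * z / 2)
    return math.exp(-r * T) * total * h / math.sqrt(2 * math.pi)
```

With, say, x = K = 100, r = 0.05, σ = 0.2, T = 1, the two evaluations agree to several digits, which is a useful sanity check on the algebra above.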

6.6.3. Shannon sampling theorem. One of the main applications of the FT is to signals. A signal can
be represented as a function f = f(x), x ∈ R^d. Its FT f̂ = f̂(ξ) represents the "coefficient" of the
harmonic e^{i2πξ·x} in the inversion formula
f(x) = ∫_{R^d} f̂(ξ) e^{i2πξ·x} dξ.
In Information Theory the variable ξ is therefore also called frequency, and the domain of frequencies is
called the band. One of the interesting problems is the
sampling problem: how can a signal be reconstructed from a certain number of its samples? And how
should the sampling be done?
At first sight this seems a hopeless problem. Suppose that we take samples, for instance,
at integer times, that is, f(k) with k ∈ Z. Of course it seems impossible that these values
determine a unique signal f, because we may interpolate the known values f(k) in infinitely many ways.

However, if we require something on the band of f, things change. Suppose for instance that we know
that the band of f is bounded, that is, f̂ ≡ 0 off a certain interval [−λ, λ].
Theorem 6.6.1 (Shannon Sampling Thm, SST). Let f ∈ L¹(R) be such that f̂ ≡ 0 off [−λ, λ]. Then f
is completely determined by its values at the points n/(2λ), n ∈ Z. Precisely, the following reconstruction formula holds:
f(x) = Σ_{n∈Z} f( n/(2λ) ) sinc( π(2λx − n) ),
where sinc(u) = (sin u)/u.

Proof. Let us first construct a periodic copy of f̂, that is, let us reproduce f̂ periodically with period
2λ. Call g = g(ξ) this signal. Then
g(ξ) = Σ_{n∈Z} c_n e^{−i2π(n/(2λ))ξ},
where
c_n = (1/(2λ)) ∫_{−λ}^{λ} f̂(η) e^{i2π(n/(2λ))η} dη = (1/(2λ)) ∫_R f̂(η) e^{i2π(n/(2λ))η} dη = (1/(2λ)) f( n/(2λ) ),
by the inversion formula (valid here because f̂ ≡ 0 off [−λ,λ], hence f̂ ∈ L¹). In particular,
f̂(ξ) = Σ_{n∈Z} (1/(2λ)) f( n/(2λ) ) e^{−i2π(n/(2λ))ξ},   ∀ξ ∈ [−λ, λ].
This shows that f̂ (hence f, by inversion) is determined by a discrete set of values of f. Precisely,
f(x) = ∫_R f̂(ξ) e^{i2πξx} dξ = ∫_{−λ}^{λ} f̂(ξ) e^{i2πξx} dξ = Σ_{n∈Z} (1/(2λ)) f( n/(2λ) ) ∫_{−λ}^{λ} e^{−i2π(n/(2λ))ξ} e^{i2πξx} dξ
     = Σ_{n∈Z} (1/(2λ)) f( n/(2λ) ) ∫_R χ_{[−λ,λ]}(ξ) e^{−i2π( n/(2λ) − x )ξ} dξ = Σ_{n∈Z} (1/(2λ)) f( n/(2λ) ) (χ_{[−λ,λ]})^∧( n/(2λ) − x )
     = Σ_{n∈Z} (1/(2λ)) f( n/(2λ) ) · 2λ sinc( 2πλ( n/(2λ) − x ) ) = Σ_{n∈Z} f( n/(2λ) ) sinc( π(2λx − n) ),
where in the last step we used the evenness of sinc. This is the conclusion. □

The SST is very useful to understand how CDs work. Suppose f : R → R represents the amplitude
of a sound wave f = f(t) as a function of time. The FT f̂ represents the frequency content of the sound wave
f. Humans only hear sounds at frequencies below roughly 20,000 Hertz. So we may conservatively
assume that f̂ ≡ 0 off the interval [−22,000, 22,000], that is, in the SST assumption, λ = 22,000. Then, if
we sample f at time intervals of length 1/(2λ) = 1/44,000, that is, with a frequency of 44,000 samples per second,
we can recover f exactly. In particular, on the CD we can store the samples of the sound wave f at
the discrete times n/(2λ). The same basic idea is at work in the MP3 and JPEG formats, as well as in voice
signals transmitted by a cellular phone or videos streamed online on YouTube.
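A quick numerical experiment (an illustration of mine; the band limit λ = 1 and the test signal are arbitrary choices) makes the reconstruction formula tangible: f(x) = sinc(πx)² is band-limited to [−1, 1] because its FT is the triangle function, so its samples at the half-integers n/2 determine it, and a truncated Shannon series already matches it to high accuracy.

```python
import math

LAM = 1.0  # band limit: the FT of f below vanishes off [-LAM, LAM]

def sinc(u):
    """Unnormalized sinc, sin(u)/u, as in the SST above."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def f(x):
    """sinc(πx)²: band-limited to [-1, 1], since its FT is the triangle function."""
    return sinc(math.pi * x)**2

def reconstruct(x, N=2000):
    """Truncated reconstruction formula: Σ_{|n|<=N} f(n/(2λ)) sinc(π(2λx - n))."""
    s = 0.0
    for n in range(-N, N + 1):
        s += f(n / (2 * LAM)) * sinc(math.pi * (2 * LAM * x - n))
    return s
```

The truncation error decays rapidly here because both the samples and the sinc factors decay; for slowly decaying signals many more terms are needed.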

6.6.4. Radon transform and application to X-ray tomography. X-ray tomography is an impor-
tant field of application in medical diagnosis. Roughly speaking, the aim of X-ray tomography is to use
X-ray absorbance rates to construct a picture of the interior of a solid body. Applications range from
CAT scanners to airport security checks, to engineering applications such as the inspection of
weld joints in delicate constructions where extreme precision is demanded. How does it work?
Tomography is the name of a simple geometrical idea: we can reconstruct a solid body by knowing
all its sections parallel to a certain plane. X-rays provide a method to obtain the sections of a body. Let us
consider therefore a 2-dimensional section represented by a certain function f : Ω ⊂ R² → R, where at any
point (x,y) ∈ Ω the value f(x,y) represents the absorption rate of an X-ray passing through (x,y). For
convenience we will consider Ω = R², setting f = 0 off Ω. An X-ray is represented by a straight line. If
at a certain point (x,y) the intensity is I(x,y), then
dI = −f(x,y) I dℓ,   ⟺   dI/I = −f dℓ,
where dℓ represents a small length element around the point (x,y). In particular, if I_in is the intensity of the X-ray
when it enters the slice and I_out is the intensity when it comes out,

(6.6.9)   I_out = e^{−∫_r f} I_in,
where r is the straight line representing the trajectory of the X-ray and ∫_r f is the line integral of the
density f along r. Basically the problem is: given I_in and I_out on every line r, is it possible to reconstruct f?
This is a typical inverse problem. Clearly, given f, computing I_out from I_in is a calculus exercise.
Conversely, recovering f from I_in and I_out is a very different problem, and it is not clear whether there is a
solution, whether it is unique and, in case, how it depends on the data (this is a very important question
in applications, because normally the data are not known exactly but only up to some approximation). We can
reformulate the problem in the following way: by (6.6.9) we have
I_out/I_in = e^{−∫_r f},   ⟺   g(r) := −log( I_out(r)/I_in(r) ) = ∫_r f,

so the problem becomes: given g = g(r), where r is any straight line, find f such that g(r) = ∫_r f. The
Fourier transform gives a quick and elegant way to solve this problem. Let us see how.
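The attenuation law (6.6.9) behind this reformulation can be illustrated numerically. The sketch below is mine (with an arbitrary Gaussian density along the ray): it marches an X-ray through the medium with an explicit Euler step for dI = −f I dℓ and then recovers the line integral ∫_r f, here equal to √π, from the measured ratio I_out/I_in.

```python
import math

def f(l):
    """Absorption density along the ray: a Gaussian bump with ∫ f = √π."""
    return math.exp(-l * l)

def march(Iin, a=-8.0, b=8.0, n=200000):
    """Explicit Euler for dI = -f I dl along the ray, parametrized by arc length."""
    h = (b - a) / n
    I = Iin
    for i in range(n):
        I -= f(a + i * h) * I * h
    return I

Iout = march(1.0)
g = -math.log(Iout / 1.0)  # the measured quantity g(r); should be close to √π
```

The small Euler step size keeps the simulated intensity close to the exact exponential decay I_out = I_in e^{−∫f}.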
To describe straight lines r in the plane we will use the cartesian form
r : ax + by = c,   ⟺   (a,b)·(x,y) = c,   (a,b) ∈ R²\{0}.
Of course there are infinitely many a, b, c describing the same r. Conventionally we will assume ‖(a,b)‖ = 1,
so that the multiple choice reduces to two, because the triplets (a,b,c) and (−a,−b,−c) describe the same
straight line. So we will consider g as a function
g : S¹ × R → R,   g(u⃗, c) = g(−u⃗, −c),   ∀(u⃗, c) ∈ S¹ × R.
The advantage is that if ‖(a,b)‖ = 1,

∫_{ax+by=c} f = ∫_R f( t, −(a/b)t + c/b ) dt, if b ≠ 0;   ∫_{ax+by=c} f = ∫_R f( −(b/a)t + c/a, t ) dt, if a ≠ 0.

Let us consider the Fourier transform of f with respect to (x,y):

f̂(a,b) = ∫_{R²} f(x,y) e^{−i2π(a,b)·(x,y)} dx dy = ∫_{R²} f(x,y) e^{−i2π‖(a,b)‖ u⃗·(x,y)} dx dy,

where u⃗ := (a,b)/‖(a,b)‖, so that ‖u⃗‖ = 1. We may integrate over the plane along the straight lines u⃗·(x,y) = c, letting c vary in R, so

f̂(a,b) = ∫_R ( ∫_{u⃗·(x,y)=c} f dσ₁ ) e^{−i2π‖(a,b)‖c} dc = ∫_R g(u⃗, c) e^{−i2π‖(a,b)‖c} dc = (g(u⃗, ·))^∧( ‖(a,b)‖ ) = ( g( (a,b)/‖(a,b)‖, · ) )^∧( ‖(a,b)‖ ).
If we assume that f̂ ∈ L¹(R²), the inversion formula gives

f(x,y) = ∫_{R²} f̂(a,b) e^{i2π(a,b)·(x,y)} da db = ∫_{R²} ( g( (a,b)/‖(a,b)‖, · ) )^∧( ‖(a,b)‖ ) e^{i2π(a,b)·(x,y)} da db
       = ∫_{R²} ( ∫_R g( (a,b)/‖(a,b)‖, c ) e^{−i2π‖(a,b)‖c} dc ) e^{i2π(a,b)·(x,y)} da db.

Integrating in polar coordinates in (a,b) ∈ R², we finally get

f(x,y) = ∫_0^{+∞} ( ∫_{u⃗∈S¹} ( ∫_R g(u⃗, c) e^{−i2πrc} dc ) e^{i2πr u⃗·(x,y)} dσ₂(u⃗) ) r dr.
Notice that

r ∫_R g(u⃗, c) e^{−i2πrc} dc = r (g(u⃗,·))^∧(r) = (1/(i2π)) (i2πr) (g(u⃗,·))^∧(r) = −(i/(2π)) (∂_c g(u⃗,·))^∧(r),

so

f(x,y) = −(i/(2π)) ∫_0^{+∞} ∫_{S¹} (∂_c g(u⃗,·))^∧(r) e^{i2πr u⃗·(x,y)} dσ₂(u⃗) dr.

This is actually an inversion formula: given g, we may reconstruct f uniquely. That is, we have existence
and uniqueness of a solution under suitable hypotheses on g.
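The key identity used above, f̂(a,b) = (g(u⃗,·))^∧(‖(a,b)‖), often called the projection-slice relation, can be tested numerically on a density with a known FT. In the sketch below (function names and parameters are mine), f(x,y) = e^{−π(x²+y²)}, whose line integrals along every direction equal e^{−πc²} and whose FT is e^{−π|ξ|²}; both sides of the identity are computed by midpoint quadrature.

```python
import math

def f(x, y):
    """Radially symmetric density with known FT: f̂(ξ) = e^{-π|ξ|²}."""
    return math.exp(-math.pi * (x * x + y * y))

def radon(ux, uy, c, n=600, L=6.0):
    """g(u, c): integral of f along the line u·(x,y) = c, u a unit vector.
    The line is parametrized as c·u + t·u⊥ with u⊥ = (-uy, ux)."""
    h = 2 * L / n
    total = 0.0
    for i in range(n):
        t = -L + (i + 0.5) * h
        total += f(c * ux - t * uy, c * uy + t * ux)
    return total * h

def slice_ft(ux, uy, r, n=600, L=6.0):
    """1D FT of c ↦ g(u, c) at frequency r; should equal f̂(r·u) = e^{-πr²}."""
    h = 2 * L / n
    re = im = 0.0
    for i in range(n):
        c = -L + (i + 0.5) * h
        gv = radon(ux, uy, c)
        re += gv * math.cos(-2 * math.pi * r * c) * h
        im += gv * math.sin(-2 * math.pi * r * c) * h
    return re, im
```

Because g(u⃗, ·) is even here, the imaginary part of the slice FT should vanish up to quadrature error.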

6.7. Exercises
Exercise 6.7.1. Compute the Fourier transforms of the following functions:
1. x rect_a(x).  2. (a − |x|) rect_a(x).  3. (cos x) rect_{π/2}(x).  4. e^{−|x|} sgn(x).  5. e^{−x} χ_{[0,+∞[}(x).
Exercise 6.7.2. Show that if f is real valued and even then f̂ is real valued.
Exercise 6.7.3. Let R be an orthogonal matrix, RRt = Rt R = I. Express the FT of f (Rx) in terms of fD.
Exercise 6.7.4. Compute (χ_{[a,b]})^∧ and (χ_{[−a,a]^d})^∧.

Exercise 6.7.5. For each of the following functions say if they are L¹, L², S:
1. 1/√(1+x²).  2. (sin x)/x.  3. e^{−x⁴}.  4. x² e^{−|x|}.  5. |x|( e^{−x²} − 1 ).

Exercise 6.7.6. Let
g(ξ) := 1 / ( (ξ²+a²)(ξ²+b²) ),   ξ ∈ R,
with a, b > 0 and a ≠ b. i) Show that g has a Fourier original in L² (that is, there exists f such that f̂ = g) and
compute it (hint: split the fraction and recall that (e^{−λ|·|})^∧(ξ) = …). ii) Show that ξ g(ξ) has a Fourier original in
L² and find it in terms of the original f of g. Justify carefully your answer.
Exercise 6.7.7. Let
f(ξ) = 1/(1+ξ⁴),   g(ξ) := e^{−|ξ|}/(1+ξ⁴),   ξ ∈ R.
i) Show that f and g admit L² Fourier originals, that is, functions whose FTs are, respectively, f and g. ii) Without
computing the originals, show that the Fourier original of g is a C^∞(R) function. Is it also an S(R) function?
iii) Determine the originals of f and g.
Exercise 6.7.8. Let f, g ∈ S. Is f g ∈ L¹? In this case, what is (f g)^∧?
Exercise 6.7.9. By a suitable use of the FT compute x e^{−x²} ∗ e^{−x²}.
Exercise 6.7.10. The function f has Fourier transform f̂(ξ) = ξ/(1+ξ⁴). Show that
∫_R x f(x) dx   and   f′(0)
are well defined and determine their values.
Exercise 6.7.11. Solve the equation
∫_R f(x−y) e^{−|y|} dy = 2e^{−|x|} − e^{−2|x|}.

Exercise 6.7.12. By using the FT, compute the convolution f_a ∗ f_b of two Cauchy distributions, where f_a(x) = 1/(a²+x²).
Use this to compute
lim_{n} ∫_{√n α}^{√n β} ( f_{1/n} ∗ f_{1/n} ∗ ⋯ ∗ f_{1/n} )(x) dx   (n-fold convolution).
Exercise 6.7.13. Prove the Young inequality (6.5.1) in the case p = 1.
Exercise 6.7.14. Take g(x) = rect₁(x). Compute explicitly g_ε and f ∗ g_ε. Show that if f ∈ C then f ∗ g_ε(x) → f(x)
as ε → 0⁺, for any x ∈ R.
Exercise 6.7.15 (⋆). Solve the equation
u″(x) − (1/2) ∫_{−∞}^{+∞} e^{−|y|} u(x−y) dy = e^{−|x|} sgn x.
Exercise 6.7.16 (⋆). Solve the equation
∫_R f(y) f(x−y) dy + f(x) = 1/(1+x²).
Exercise 6.7.17 (⋆). Suppose that f ∈ L² is such that
∫_R f(y) e^{−y²} e^{2ξy} dy = 0,   ∀ξ ∈ R.
Show that f ≡ 0. (Hint: (y − ξ)² = …).

Exercise 6.7.18 (⋆). Show that if f ∈ L²(R) then

(6.7.1)   f̂(ξ) = L²-lim_{R→+∞} ∫_{−R}^{R} f(y) e^{−i2πξy} dy.

Use this to show that the function
g(ξ) := 2a sinc(2πaξ),   ξ ∈ R,
has an L² transform and determine it directly.
Exercise 6.7.19. Consider the Cauchy problem for the wave equation on the whole line
∂_tt u(t,x) = c² ∂_xx u(t,x),   t > 0, x ∈ R,
u(0,x) = φ(x),   x ∈ R,
∂_t u(0,x) = ψ(x),   x ∈ R.
Introducing v(t,ξ) := (u(t,·))^∧(ξ), determine v(t,ξ); hence, by using the properties of the FT, prove the D'Alembert
formula
u(t,x) = (1/2)( φ(x+ct) + φ(x−ct) ) + (1/(2c)) ∫_{x−ct}^{x+ct} ψ(y) dy.
Exercise 6.7.20. Find the solution of the following problem:
∂_xx u(t,x) = ∂_tx u(t,x),   x ∈ R, t > 0,
u(0,x) = e^{−|x|},   x ∈ R.
Exercise 6.7.21. Find the solution of the following problem:
∂_t u(t,x) + t ∂_x u(t,x) = 0,   x ∈ R, t > 0,
u(0,x) = f(x),   x ∈ R.
Exercise 6.7.22. Find the solution of the following problem:
∂_t u(t,x) = e^{−t} ∂_xx u(t,x),   x ∈ R, t > 0,
u(0,x) = e^{−|x|},   x ∈ R.

Exercise 6.7.23. Find the solution of the following problem:
∂_tt u(t,x) + ∂_xxxx u(t,x) = 0,   x ∈ R, t > 0,
u(0,x) = rect₁(x),   x ∈ R,
∂_t u(0,x) = 0,   x ∈ R.
Exercise 6.7.24. The model of heat diffusion with convection is described by the Cauchy problem
∂_t u = c² ∂_xx u + k ∂_x u,   t > 0, x ∈ R,
u(0,x) = f(x),   x ∈ R.
Find the evolution of the temperature in the case c = 1, k = 1/2 and initial temperature f(x) = e^{−x²}.
