
PARTIAL DIFFERENTIAL EQUATIONS

Weak derivatives and systems of ODEs

Andreas Rosén
Chalmers University of Technology and the University of Gothenburg, 2021
Contents

0 Preliminaries
  0.1 Function spaces
  0.2 Bases and diagonalization
  0.3 Integral identities
  0.4 Exercises

1 Basic ODEs and PDEs
  1.1 First order ODEs
  1.2 First order linear PDEs
  1.3 Heat, wave and Laplace equations
  1.4 Linear PDE problems
  1.5 Exercises

2 Weak derivatives and FEM
  2.1 Distributions
  2.2 Sobolev spaces
  2.3 Existence of weak solutions to BVPs
  2.4 Discretization and FEM
  2.5 Exercises

3 Initial value problems on Rn
  3.1 Tempered distributions
  3.2 The solution formulas
  3.3 Heat vs. Wave equation
  3.4 Exercises

4 Initial-boundary value problems
  4.1 Generalized sine and cosine series
  4.2 Eigenfunctions and IBVPs
  4.3 FEM: eigenvalues and evolution
  4.4 Exercises

5 Harmonic functions
  5.1 Green functions and Poisson kernels
  5.2 Mean value and maximum theorems
  5.3 Analytic functions and Hardy splittings
  5.4 Exercises

6 Boundary integral equations

7 Beyond our basic PDE problems
  7.1 Classification of second order PDEs
  7.2 Dirac type PDE systems
  7.3 Calculus of variations
  7.4 Curvature flows
  7.5 Non-linear wave equations
  7.6 Exercises

A Answers to Exercises

B Instructions
  B.1 The written exam
  B.2 The FEM projects

C FEM projects
  C.1 Heat equilibrium
  C.2 The sound of drums
  C.3 Propagation of waves

D Matlab
Chapter 0

Preliminaries

The material in this chapter is not meant to be lectured explicitly, but you should
read it before the start of the course as preparation. Prerequisites for the course also
include

• Fourier series and the Fourier transform.

• Linear algebra: if you change your basis, how do the coordinates of vectors and
the matrices of linear transformations change?

• Vector calculus: Gauss' theorem, grad, div and curl.

Functional analysis is not required for the course. We do use some concepts like
function spaces and norms, but develop what we need as we go along.
The course starts in Chapter 1. These notes teach methods for solving the three
basic PDEs: the heat, wave and Laplace equations. We aim for a modern course
which develops the theory needed for a good understanding of the numerical solution
of PDE problems with FEM, rather than teaching techniques for finding explicit
solutions by hand to PDE problems limited to special geometries. An extended
version of the course also provides the theory for solving PDE problems numerically
with boundary integral equations, which give faster and more accurate solutions in
many situations.
For the analysis we use the Fourier transform and generalized Fourier series as
our main tools. The basic strategy is to view our PDE as a vector-valued ODE
and to solve this through diagonalization. In order to diagonalize using the Fourier
transform, we require distributions. Without distributions this transform is quite
useless in practical applications like PDEs. Distributions also allow us to weakly
differentiate any function, even non-smooth ones, which is essential for a modern
understanding of PDE theory.
Necessary for a good understanding of PDEs and distributions are pictures which
convey the geometry hidden behind the abstract formulas. For example, what do
the Dirac delta and its derivative (dipole) look like? What do the Laplace fundamental
solution, Poisson kernels, Riemann functions and heat kernels look like?
This manuscript is unfortunately not yet complete, and these pictures are missing.
Therefore we shall draw them at the lectures, and you can add them in the margin.


In the introduction to each chapter you will find boxes, like the following, with
recommended reading in the course book by Strauss, and elsewhere, which complements
these notes.

Recommended reading:

• Strauss: 7.1-2. Only the beginning of these sections.

Study questions:
What is a Hilbert space? How do Green’s identities follow from
Gauss’ theorem?

0.1 Function spaces

In PDE theory, the vector spaces (or linear spaces, synonymously) that appear are
infinite dimensional and consist of functions satisfying certain conditions. The space
of functions which is most important to us is L2(D), the set of square integrable
functions on a given set D. We assume throughout this section that functions are
real valued.

Definition 0.1.1 (Square integrable functions). Let D ⊂ Rn be an open set. A
function f defined on D is called square integrable, written f ∈ L2(D), if

∫_D |f(x)|^2 dx < ∞.

The L2 norm of f is ∥f∥ := (∫_D |f(x)|^2 dx)^{1/2}, and the L2 inner product of two
square integrable functions f and g on D is ⟨f, g⟩ = ∫_D f(x)g(x) dx.
Similarly, we can and will consider L2 spaces of functions on non-open sets S ⊂ Rn,
notably on boundaries S = ∂D of open sets, in which case we define L2(S) by
replacing the area measure dx by the length measure ds when D ⊂ R2, and by
replacing the volume measure dx by the surface measure dS when D ⊂ R3.
There are many ways to measure the size of a function f by a norm ∥f∥, and
therefore the L2 norm is often more precisely denoted ∥f∥_2 or ∥f∥_{L2(D)}. However,
we shall mostly use the L2 norm and, unless otherwise said, ∥f∥ will denote the L2
norm.
Spaces of functions like L2 (D) share basic properties with Rn : both are linear
spaces, or synonymously vector spaces. This means that they are both sets V , where
the sum x + y ∈ V is well defined for any two x, y ∈ V , and there is a multiplication
by scalars ax ∈ V , for a ∈ R, x ∈ V . We assume that the reader is familiar with
this concept of linear spaces and the usual rules of calculation that addition and
multiplication by scalars must satisfy. The novelty with linear spaces of functions
like L2(D) is that these are typically infinite dimensional. Naively, we can view
the function values f (x), infinitely many, as the coordinates of f , and we view f as
a vector/object in the linear space L2 (D). Basic concepts from functional analysis
which we need are as follows.

Definition 0.1.2 (Banach and Hilbert spaces). Let V be a linear space.
A norm on V assigns a positive value ∥f∥ > 0 to each nonzero f ∈ V, with
∥0∥ = 0, so that ∥af∥ = |a|∥f∥ and ∥f + g∥ ≤ ∥f∥ + ∥g∥ for all a ∈ R and f, g ∈ V.
An inner product on V assigns a real value ⟨f, g⟩ ∈ R to each pair f, g ∈ V, with
⟨f, f⟩ > 0 for nonzero f ∈ V, so that ⟨f, g⟩ = ⟨g, f⟩ and ⟨f, ag + bh⟩ = a⟨f, g⟩ + b⟨f, h⟩
for all a, b ∈ R and f, g, h ∈ V. The norm associated with an inner product ⟨f, g⟩ is
∥f∥ := √⟨f, f⟩.
Consider a sequence of vectors f1 , f2 , f3 , . . . in a linear space V equipped with
a norm ∥ · ∥. The sequence is called a Cauchy sequence with respect to ∥ · ∥ if for
each ϵ > 0, there exists an index N < ∞ so that ∥fi − fj ∥ < ϵ for all i, j ≥ N . The
sequence is said to converge to a vector f ∈ V if for each ϵ > 0, there exists an index
N < ∞ so that ∥fi − f ∥ < ϵ for all i ≥ N .
The linear space V is called complete with respect to the norm ∥ · ∥ if every Cauchy
sequence in V converges to some vector f ∈ V. Such a complete normed linear
space is called a Banach space. If V is a Banach space with respect to a norm ∥ · ∥
which is associated with an inner product, then V is called a Hilbert space.

To explain the intuition behind these abstract concepts, consider the unit sphere

{f ∈ V ; ∥f ∥ = 1}

in V. Completeness means that there are no holes in this sphere as we go along
the infinitely many dimensions. For a general Banach space, the unit sphere may
not be as round as one usually imagines a sphere. It is only the Hilbert spaces that
have perfectly round unit spheres without holes.

Proposition 0.1.3. Let D ⊂ Rn be an open set. Then the function space L2(D) is
a Hilbert space, provided that the integrals appearing in Definition 0.1.1 are in the
sense of Lebesgue.

It is the completeness property which needs to be proved, and for this it is not
enough to use the simple Riemann integral. With only Riemann integrable functions
in L2(D), there will be functions missing in the space (holes). We shall not prove
Proposition 0.1.3, but refer to courses on Lebesgue integration theory for this. We
only remark that the Lebesgue integral, which is always used in higher mathematics
courses, is so powerful that in real life you will never encounter a function which
is too irregular to be integrable. Therefore, whenever we speak of integrability in this
course, we always refer to the size of the function: integrable means that the
integral of the function is absolutely convergent.
A result which we shall need and can prove is the following.

Lemma 0.1.4. Let H be a Hilbert space, and let V ⊂ H be a subspace. If V is a
closed subspace, then V is a Hilbert space itself.

Recall that V being a subspace means that af + bg ∈ V whenever f, g ∈ V and
a, b ∈ R.

Proof. It is clear that V is a linear space in itself, and that the inner product on H
restricts to an inner product on V. To show that V is complete, let f1, f2, f3, . . . be
a Cauchy sequence in V. In particular, this is a Cauchy sequence in H, and since H
is complete we know that there exists f ∈ H such that ∥fi − f∥ → 0 as i → ∞. It
remains to show that f ∈ V. But since f is the limit of a sequence of elements of V
and V is a closed set, this is indeed the case.

0.2 Bases and diagonalization


A very useful technique in linear analysis is the following.

Definition 0.2.1 (Similarity transformation). Let V1 and V2 be two linear spaces,
and let T : V2 → V1 be a linear and invertible map. Consider a linear map
A : V1 → V1. The similarity transformation of A by T is the linear map

T⁻¹AT : V2 → V2.

Let us recall the geometric meaning of a similarity transformation from linear
algebra.

Example 0.2.2 (Change of coordinate system). Set V1 = V2 = R3 and consider
a linear map which in the standard basis is represented by the matrix A. Write
x ∈ R3 for the coordinates of a vector in the standard basis.
Consider now another basis, and denote the coordinates of vectors in this new basis
by y ∈ R3. We write the linear relation between the coordinates as

x = T y,

where T is the change of basis matrix. It follows, by going via the standard basis,
that in the new basis the linear transformation that we consider acts on the y
coordinates by the matrix

T⁻¹AT.

Thus, the matrix A of a linear map changes by a similarity transformation when we
change the basis by T, just like the coordinates of a vector change when we change
basis.

A main result in linear algebra is the following.

Theorem 0.2.3 (Spectral theorem). Let A be a symmetric matrix, that is Aᵀ = A.
Then there exists an isometric linear map T such that

T⁻¹AT

is a diagonal matrix.

In this case the change of basis matrix T is not only invertible, as it must be,
but can in fact be chosen to be a rotation. This gives a very good understanding
of symmetric maps A: the spectral theorem means that if you just turn your own
head appropriately (that is, choose the coordinate system suitably), then your
symmetric map just scales vectors along the coordinate axes.
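
As a hands-on illustration, the following small Python/NumPy sketch (not part of the original notes; the symmetric matrix is an ad hoc choice) computes such a T with numpy.linalg.eigh and checks that the similarity transform is diagonal and that T is an isometry.

import numpy as np

# Spectral theorem in practice: for a symmetric A, eigh returns orthonormal
# eigenvectors as the columns of T, so T^{-1} = T^T and T^T A T is diagonal.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric: A.T == A

lam, T = np.linalg.eigh(A)               # eigenvalues lam, orthogonal T
D = T.T @ A @ T                          # the similarity transformation

print(np.round(D, 10))                   # diagonal, with lam on the diagonal
print(np.allclose(T.T @ T, np.eye(3)))   # True: T is an isometry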

Also non-symmetric matrices are in the generic case possible to diagonalize, if
we use complex numbers. For example, when all complex eigenvalues are distinct,
then we have a similarity transformation T to a diagonal matrix, although T need
not be an isometry.
Similarity transformations are also very relevant in the Hilbert space context.
Here the change of basis maps T are not finite dimensional matrices but linear maps
of functions. The most useful one is the Fourier transform. Although the following
example may seem unnecessarily abstract, it contains the basic idea for solving
initial value problems for PDEs.

Example 0.2.4 (Diagonalization of differential operators). Recall from Fourier analysis
the Fourier transform

F(f)(ξ) = ∫_{Rn} f(x) e^{−i⟨ξ,x⟩} dx,   ξ ∈ Rn.

Given a function f on Rn, this yields its Fourier transform F(f), which is another
function on Rn, usually denoted f̂. A fundamental theorem by Plancherel states
that

∫_{Rn} |f(x)|^2 dx = (2π)^{−n} ∫_{Rn} |f̂(ξ)|^2 dξ.   (1)

If we set V1 = V2 = L2(Rn), then the Fourier transform

F : V1 → V2

is a linear and invertible map of square integrable functions. Furthermore, if we
more precisely define the norms on V1 and V2 to be the square roots of the left and
right hand sides in (1), then F becomes an isometry.
The main use of the Fourier transform in partial differential equations is that it
provides a change of basis that diagonalizes differential operators. The change of
basis idea is that F transforms from a spatial basis to a spectral basis: f(x) shows
how large the function is at various points in space, whereas f̂(ξ) shows how large
f is at various frequencies. Of course, we here use the terminology "basis" in a
generalized continuous sense. The basis idea is clearer in the case of periodic
functions and their Fourier series, where the trigonometric functions indeed form a
basis for the Hilbert space L2.
The diagonalization property of F is that F(∂xk f) = iξk F(f). Seen as a
similarity transformation, we have

F ∂xk F⁻¹ = M_{iξk},   (2)

that is, T = F⁻¹ and A = ∂xk in Definition 0.2.1. Here we view ∂xk as the linear map
∂xk : V1 → V1 which computes the partial derivative ∂xk f. On the right hand side
stands the multiplier M_{iξk}, which is the linear map M_{iξk} : V2 → V2 which computes
iξk f̂(ξ). Such a multiplication operator M_{iξk} is the continuous analogue of a diagonal
matrix, so (2) is rightly viewed as a diagonalization of the derivative ∂xk.
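
A discrete analogue of (2) can be checked with the FFT: differentiation of a periodic function becomes multiplication by iξ in the frequency domain. The following sketch is a hedged illustration only; the test function and grid size are arbitrary choices, and the discrete periodic transform stands in for the continuous one.

import numpy as np

# Discrete illustration of F d/dx F^{-1} = M_{i xi}: differentiate a smooth
# 2*pi-periodic function by multiplying its FFT by i*xi.
n = 128
x = 2 * np.pi * np.arange(n) / n
f = np.exp(np.sin(x))                     # smooth and 2*pi-periodic

xi = np.fft.fftfreq(n, d=1.0 / n)         # integer frequencies 0, 1, ..., -1
df = np.fft.ifft(1j * xi * np.fft.fft(f)).real

exact = np.cos(x) * np.exp(np.sin(x))     # exact derivative of exp(sin x)
print(np.max(np.abs(df - exact)))         # ~1e-13: spectral accuracy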

0.3 Integral identities

A fundamental result from vector calculus is the following integral identity, which
relates the flow of a vector field F out through the boundary ∂D to the total
divergence of F inside D.

Theorem 0.3.1 (Divergence theorem/Gauss's theorem). Let D ⊂ Rn be a domain
and let F : D → Rn be a vector field. If D is bounded and F and D are
sufficiently smooth, then

∫_{∂D} ⟨ν(x), F(x)⟩ dS(x) = ∫_D div F(x) dx.

Throughout this book, ν denotes the outward pointing unit normal vector on
∂D. We recall the following basic differential operators.

• The gradient of a scalar function u is the vector field

∇u = (∂x1 u, . . . , ∂xn u).

• The divergence div F of a vector field F = (F1, . . . , Fn) is the scalar function

div F = ∂x1 F1 + · · · + ∂xn Fn.

• The Laplace operator is the second order differential operator ∆ = div ∇, that is

∆u = div ∇u = ∂x1² u + · · · + ∂xn² u.

There are two versions of the divergence theorem due to George Green, which
we will frequently use. Given two scalar functions u and v, we apply the divergence
theorem to the vector field F = v∇u and note that div F = ⟨∇v, ∇u⟩ + v∆u. This
yields Green's first identity

∫_{∂D} v ∂ν u dS = ∫_D (⟨∇v, ∇u⟩ + v∆u) dx.

Another frequently useful version is obtained by swapping the roles of u and v and
subtracting the two identities. This yields Green's second identity

∫_{∂D} (u ∂ν v − v ∂ν u) dS = ∫_D (u∆v − v∆u) dx,

since the inner product ⟨∇v, ∇u⟩ is symmetric in u and v.
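
As a quick symbolic sanity check (not from the notes), the following SymPy sketch verifies Green's first identity in the one-dimensional case D = (0, 1), where the boundary term reduces to v(1)u′(1) − v(0)u′(0) because the outward normal is +1 at x = 1 and −1 at x = 0; the functions u and v are arbitrary smooth choices.

import sympy as sp

# Green's first identity on D = (0,1):
#   v(1)u'(1) - v(0)u'(0) = int_0^1 ( v'(x)u'(x) + v(x)u''(x) ) dx.
x = sp.symbols('x')
u = x**3 - 2*x                  # any smooth functions will do
v = sp.cos(sp.pi * x)

lhs = (v * u.diff(x)).subs(x, 1) - (v * u.diff(x)).subs(x, 0)
rhs = sp.integrate(v.diff(x) * u.diff(x) + v * u.diff(x, 2), (x, 0, 1))
print(sp.simplify(lhs - rhs))   # prints 0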

0.4 Exercises

1. For which exponents α ∈ R does f(x) = |x|^α belong to L2(D), if (a) D is the
interval D = (−1, 1) ⊂ R? (b) D is the unit disk x² + y² < 1 in R2? (c) D is
the unit ball x² + y² + z² < 1 in R3?

2. Consider the sequence of functions fk(x) = arctan(kx) on D = (−1, 1), k =
1, 2, . . .. Show that fk converges in L2(D), and determine the limit function f.
Is the function space V consisting of continuous functions on [−1, 1], equipped
with the L2 norm, a Hilbert space?
 
3. Diagonalize the matrix A = (1/5) [ 6 −2 ; −2 9 ], that is, find an invertible
matrix T so that T AT⁻¹ is a diagonal matrix. Hint: T⁻¹ should contain the
eigenvectors of A as columns.
Chapter 1

Basic ODEs and PDEs

In this course, we study differential equations. Unlike usual (algebraic) equations
like x² + 2x − 7 = 0, where the unknown x which we seek is a number, for differential
equations the unknown quantity is a whole function u. An ordinary differential
equation (ODE) is a differential equation where u is a one-variable function u = u(x).
For example

u′(x) = −7u(x).

A partial differential equation (PDE) is a differential equation where u is a function
u = u(x1, . . . , xn), n ≥ 2, of several variables. For example Laplace's equation

u′′xx(x, y) + u′′yy(x, y) = 0.

"Differential" refers to the equation involving derivatives of the unknown function.
The order of the PDE is the order of the highest derivative occurring in the PDE. For
Laplace's equation above, the order is 2.

Remark 1.0.1. In this book we write partial derivatives as u′x or ∂x u. Another
shorthand notation in the literature is ux, omitting the prime. Beware that we will
instead reserve this notation for the one-variable function ux(y) = u(x, y) of y, where
x has been fixed.

Many natural PDEs are in fact systems of PDEs, which is the case where we
have not only one unknown function but several unknown functions u1, . . . , uk, which
are assumed to satisfy not only one PDE but a number of PDEs. In this course
however, we mainly focus on scalar PDEs (k = 1), in which case the most interesting
PDEs are of second order, as we shall see in Section 1.3. In Chapter 7, we learn that
many PDEs from physics are systems of PDEs.
We consider here mainly the simplest PDEs, the linear PDEs. A PDE is said to
be linear if it can be written in the form

∑_s a_s(x) ∂^s u(x) = f(x),

where s = (s1, . . . , sn) indexes the various partial derivatives ∂^s = ∂x1^{s1} · · · ∂xn^{sn} of the
unknown function u, and a_s and f are given functions depending on x = (x1, . . . , xn).
The PDE is said to be homogeneous if f = 0. If a PDE is not linear it is called a
non-linear PDE. A typical non-linear PDE is a PDE where a product of u and some
derivative ∂^s u appears.
There are also integral equations, which are equations involving integrals of the
unknown function, see Section 6. Sometimes PDEs can be reformulated as integral
equations.

Recommended reading:

• Basic ODE techniques: http://www.math.chalmers.se/~rosenan/diffekv.pdf

• Strauss: 1.1-5.

Study questions:
What is a characteristic curve/equation? What is the physical
meaning of lower order terms in the three main PDEs? What
does it mean that a PDE problem is well posed? Linear?
Homogeneous?

1.1 First order ODEs


Before studying PDEs, we need to recall some theory for ODEs. The most general
form of an ODE is
u′ (t) = F (t, u(t)), (1.1)
for an RN -valued function u : R → RN . More precisely this means a system
of ODEs involving the component functions u1 (t), u2 (t), . . . , uN (t). The function
F : R × RN → RN on the right hand side describes the rate of change u′ (t), given
the position u(t) ∈ RN and time t ∈ R. As we shall see in this course, PDEs can be
viewed as the limit case of infinitely large ODE systems, that is the case N = ∞.
In this section however, we study the ODE case, that is N < ∞.
The basic existence and uniqueness result for ODEs is the following.

Theorem 1.1.1 (Picard). Consider the initial value problem for (1.1), where at time
t = t0 we are given the initial position

u(t0) = u0 ∈ RN.

Assume that the function F is continuous and that there is a constant L < ∞ so
that F satisfies the Lipschitz condition

|F(t, u1) − F(t, u2)| ≤ L|u1 − u2|,

for all t near t0 and all u1, u2 near u0. Then there exist t1 < t0 < t2 such that there
exists one and only one solution u(t) to this initial value problem for t1 < t < t2.

The proof of Picard's theorem uses a fixed point theorem. We will not look into
this, but instead refer to standard literature on ODEs. Coming back to PDEs, of
which ODEs are a special case, we shall see that a PDE can often in a natural way
be viewed as an ODE with N = ∞. In this case we shall speak of a vector-valued
ODE (meaning an infinite dimensional vector = function) rather than a system of
ODEs. Using this point of view we shall solve our basic PDEs in later chapters. To
practise the basic technique for solving linear constant coefficient ODEs, we consider
here the case of finite dimensional systems of ODEs.
Example 1.1.2 (Diagonalization of systems of ODEs). Let A be a given constant
N × N matrix. We want to solve the system of ODEs

u′(t) = Au(t)   (1.2)

for u : R → RN. Assume that we know a change of basis T so that the matrix

D = T⁻¹AT

is diagonal. As in linear algebra, the columns of T are the eigenvectors of A, and D
contains the eigenvalues of A on the diagonal, with compatible ordering.
We now use this similarity transformation to solve (1.2). Let v(t) := T⁻¹u(t).
With this change of unknown function, (1.2) becomes T v′(t) = AT v(t). By
multiplying this equation from the left by T⁻¹, we obtain

v′(t) = Dv(t).

Since D is diagonal, this is in fact N scalar first order ODEs for the component
functions vk(t). Solving these gives v(t), and consequently u(t) = T v(t).
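
To make the recipe concrete, here is a hedged Python/NumPy sketch which solves u′ = Au by diagonalization and compares with the matrix exponential; the matrix is the one from Exercise 0.3, and the initial data and time are arbitrary test choices.

import numpy as np
from scipy.linalg import expm

# Solve u'(t) = A u(t), u(0) = u0, via A = T D T^{-1}:
# v = T^{-1} u satisfies v' = D v, so v_k(t) = e^{lam_k t} v_k(0).
A = np.array([[6.0, -2.0], [-2.0, 9.0]]) / 5.0
u0 = np.array([1.0, 1.0])
t = 0.7

lam, T = np.linalg.eig(A)        # columns of T are the eigenvectors
v0 = np.linalg.solve(T, u0)      # v(0) = T^{-1} u0
u = T @ (np.exp(lam * t) * v0)   # back to the original coordinates

print(u, expm(A * t) @ u0)       # the two agree to machine precision
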
Example 1.1.3 (Baby wave equation). Replacing the Laplace operator ∆ by a
negative number −ω², we consider the second order ODE

f′′(t) = −ω² f(t) + a(t),

where a is a given function. A way to solve this is to write it as a 2 × 2 system of
first order ODEs and solve it by the methods described above. Let u : R → R2 be
the function given by

u(t) = [ f(t) ; f′(t) ].

Here [ a ; b ] denotes a column vector and, below, [ a b ; c d ] denotes the 2 × 2
matrix with rows (a, b) and (c, d). We see that this satisfies

u′(t) = [ 0 1 ; −ω² 0 ] u(t) + [ 0 ; a(t) ].

We compute eigenvectors of the matrix and obtain

[ 1 1 ; iω −iω ]⁻¹ [ 0 1 ; −ω² 0 ] [ 1 1 ; iω −iω ] = [ iω 0 ; 0 −iω ].

The diagonalized system, when changing variables to v = [ 1 1 ; iω −iω ]⁻¹ u, becomes

v1′(t) = iω v1(t) + a(t)/(2iω),
v2′(t) = −iω v2(t) − a(t)/(2iω).

Solving the first of these two scalar ODEs by multiplication with the integrating
factor e^{−iωt}, we have

v1(t) = C1 e^{iωt} + (2iω)⁻¹ ∫₀ᵗ a(s) e^{iω(t−s)} ds.

Replacing ω and a by −ω and −a, we also have

v2(t) = C2 e^{−iωt} − (2iω)⁻¹ ∫₀ᵗ a(s) e^{−iω(t−s)} ds.

From this we obtain the solution f = u1 = v1 + v2 to the original second order ODE.
We have

f(t) = A cos(ωt) + B sin(ωt) + ∫₀ᵗ a(s) sin(ω(t−s))/ω ds,

with constants A = C1 + C2 and B = i(C1 − C2).
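
The following hedged sketch checks this solution formula numerically against a direct solve of the first order system; ω, the forcing a(t) and the constants are arbitrary test choices.

import numpy as np
from scipy.integrate import solve_ivp, quad

# Check f(t) = A cos(wt) + B sin(wt) + int_0^t a(s) sin(w(t-s))/w ds
# against a direct numerical solve of f'' = -w^2 f + a(t).
w = 2.0
a = lambda t: np.exp(-t)                         # arbitrary forcing
A, B = 1.0, 0.5                                  # so f(0) = A, f'(0) = B*w

rhs = lambda t, u: [u[1], -w**2 * u[0] + a(t)]   # u = (f, f')
t1 = 3.0
sol = solve_ivp(rhs, [0, t1], [A, B * w], rtol=1e-10, atol=1e-12)

duhamel = quad(lambda s: a(s) * np.sin(w * (t1 - s)) / w, 0, t1)[0]
formula = A * np.cos(w * t1) + B * np.sin(w * t1) + duhamel
print(sol.y[0, -1], formula)                     # the two values agree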

1.2 First order linear PDEs

As a first and simplest example of PDEs, we study in this section first order linear
PDEs for functions u(x, y) of two variables x, y. By this we mean a PDE of the form

a(x, y)∂x u(x, y) + b(x, y)∂y u(x, y) + c(x, y)u(x, y) = f(x, y).   (1.3)

Unlike more general PDEs, these are actually quite easy to solve explicitly. The
basic idea is to perform a suitable change of variables

s = s(x, y),   t = t(x, y),   (1.4)

to be chosen so that the PDE becomes simpler in the new coordinates s, t. What
we want is that, after transformation to s, t, only one of the derivatives ∂s u and ∂t u
appears in the PDE. For example, let us aim for a PDE of the form

ã(s, t)∂s u(s, t) + 0 · ∂t u(s, t) + c̃(s, t)u(s, t) = f̃(s, t),   (1.5)

where b̃(s, t) = 0.
To find s(x, y) and t(x, y), use the chain rule to get

∂x u(x, y) = ∂x s(x, y) ∂s u(x, y) + ∂x t(x, y) ∂t u(x, y),
∂y u(x, y) = ∂y s(x, y) ∂s u(x, y) + ∂y t(x, y) ∂t u(x, y).

Inserting into (1.3) gives

(a∂x s + b∂y s)∂s u + (a∂x t + b∂y t)∂t u + cu = f,

and we see that we require

b̃ = a∂x t + b∂y t = 0,

that is, t(x, y) should satisfy (1.3) with c = f = 0. A geometric way to write this
PDE for t(x, y) is

⟨∇t, (a, b)⟩ = 0,

which means that at each point (x0, y0) we require the gradient vector ∇t to be
orthogonal to the vector (a, b). Set C = t(x0, y0). This means that the level curve,
consisting of those (x, y) such that t(x, y) = C, is such that at each point on it the
vector (a, b) is tangent to the level curve.

Figure 1.1: Characteristic curves for a first order linear PDE.

Writing the curve as a graph y = y(x), this amounts to the ODE

dy/dx = b(x, y)/a(x, y). (1.6)

Definition 1.2.1 (Characteristic equation/curves). The ODE (1.6) is called the
characteristic equation for the PDE (1.3). The (graphs of) solutions to (1.6) are
referred to as characteristic curves for the PDE (1.3).

We can now summarize the algorithm for solving PDEs of the form (1.3).

• Solve the characteristic equation (1.6), a first order ODE for y(x). The general
solution will depend on a parameter C.

• Solve for C and write the solution as t(x, y) = C. Choose this t(x, y) for
your change of variables. Pick s(x, y) essentially arbitrarily: we only want t
and s to define an invertible (locally usually suffices) change of variables. It
is recommended to choose s as simple as possible; typically s(x, y) = x or
s(x, y) = y will do.

• Find the transformed PDE (1.5). Make sure no x, y remains in the PDE: not
only should ∂x u and ∂y u be replaced by ∂s u and ∂t u according to the chain rule,
but coefficients x and y must be replaced by s and t, for which you may need
to solve for x and y in (1.4) to obtain the inverse x = x(s, t) and y = y(s, t).

• Solve the PDE (1.5). This can be done since it is now actually an ODE, as
only derivatives in s appear. Therefore we can think of t as having a fixed
value (being a parameter). Note that the general solution to the PDE=ODE
(1.5) depends on an integration constant C, which may depend on t.
So the general solution u(x, y) of our original PDE (1.3), after switching back
to x and y, will depend on an arbitrary one-variable function C(t(x, y)).

The following illustrates how this works and how to find particular solutions to
the PDE.

Example 1.2.2. We want to solve the first order linear PDE

xy∂x u − x²∂y u − yu = xy.

Its characteristic equation is

dy/dx = (−x²)/(xy) = −x/y.

Solving this as a separable equation, we integrate ∫ y dy = −∫ x dx to get y²/2 =
−x²/2 + C. Solving for C we choose

t(x, y) = C = (x² + y²)/2.

With s(x, y) = x, the chain rule gives

∂x u = 1 · ∂s u + x ∂t u,
∂y u = 0 · ∂s u + y ∂t u.

Inserting this into the PDE yields

xy(∂s u + x∂t u) − x²(y∂t u) − yu = xy.

As expected, the ∂t u terms cancel, and we simplify the result to x∂s u − u = x.
Replacing x by s = x, we have the transformed PDE

∂s u − s⁻¹u = 1.

This is a first order linear ODE, and the standard method to solve it involves the
integrating factor e^{−ln s} = 1/s. Multiplying the ODE by 1/s and using the product
rule backwards gives

∂s (s⁻¹u) = 1/s.

After integration, we obtain the general solution s⁻¹u = ln s + C(t), that is

u(x, y) = s ln s + sC(t) = x ln(x) + xC(x²/2 + y²/2).   (1.7)

Note that any choice of one-variable function C yields a particular solution to
the PDE. For example C(α) = sin(2α), using α to denote the dummy variable that
C depends on, gives us the PDE solution u = x ln(x) + x sin(x² + y²). Conversely,
one is often given some complementary boundary conditions which specify the
function C(α). For example, if we demand of the solution u(x, y) that

u(1, y) = y, for all y ≥ 0,

then clearly y = 0 + C((1 + y²)/2). Write α = (1 + y²)/2 and note that α ≥ 1/2.
Therefore we know that C is the function

C(α) = y = √(2α − 1), for all α ≥ 1/2.

Now, letting α = x²/2 + y²/2 in (1.7), we obtain the particular solution

u(x, y) = s ln s + sC(t) = x ln(x) + x√(x² + y² − 1),   x² + y² ≥ 1,

to our PDE, which satisfies this boundary condition.
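
For readers who want to double-check such computations, here is a hedged SymPy sketch verifying that the particular solution above satisfies both the PDE and the boundary condition on its domain; the symbolic setup is an ad hoc illustration.

import sympy as sp

# Verify that u = x*ln(x) + x*sqrt(x^2 + y^2 - 1) solves
# xy u_x - x^2 u_y - y u = xy, with u(1, y) = y.
x, y = sp.symbols('x y', positive=True)
u = x * sp.log(x) + x * sp.sqrt(x**2 + y**2 - 1)

pde = x*y*sp.diff(u, x) - x**2*sp.diff(u, y) - y*u - x*y
print(sp.simplify(pde))             # prints 0
print(sp.simplify(u.subs(x, 1)))    # prints y: the boundary condition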



1.3 Heat, wave and Laplace equations


In this section, we show how conduction of heat and evolution of waves can be
modelled by PDEs. As this is meant to be a mathematics course, we omit details
about physical units for the various quantities appearing below. The linear PDEs
appearing here will be the main PDEs to be studied in this course.
Example 1.3.1 (The heat (diffusion) equation). Consider a conducting plate, which
we model by a domain D ⊂ R2. We seek a function u = u(x, y, t) depending
on position (x, y) ∈ D and time t, which describes the temperature of the plate.
Fourier's law for heat conduction is that

J = −A∇u = −A(∂x u, ∂y u),

where J denotes the local heat flux density vector and A is the conductivity of the
plate. This expresses the fact that heat flows from warm to cold regions in the
plate. Usually we have A as a material constant, but if the plate is inhomogeneous,
A = A(x, y) will be a function. For an anisotropic material, where the conductivity is
orientation-dependent, A = (Aij) is a matrix, but we always assume positivity of A.
Consider now a region Ω ⊂ D and the total heat contained in Ω, which we model
by

H = ∫∫_Ω u dxdy.

The time rate of change of H, that is ∂t H, obviously depends on two things. First,
heat may flow in or out through ∂Ω. The total flow into Ω is

∫_{∂Ω} ⟨J, −ν⟩ ds.

Second, we may have some external heat sources, modelled by a function f =
f(x, y, t). Another possibility is that we have heating/cooling of the plate through
contact with some surrounding material above/below the x, y-plane, which has a
given temperature u0 = u0(x, y, t). A natural linear model for the heat flow into the
plate due to this contact is a(u0 − u), where a = a(x, y, t) is a positive proportionality
factor, since again heat flows from warm to cold regions. In total, we have a total
external heat flow into Ω being

∫∫_Ω (f + a(u0 − u)) dxdy.

The divergence theorem shows that

∂t ∫∫_Ω u dxdy = ∫∫_Ω ∂t u dxdy = ∫_{∂Ω} ⟨J, −ν⟩ ds + ∫∫_Ω (f + a(u0 − u)) dxdy
= ∫∫_Ω (−div J + f + a(u0 − u)) dxdy.   (1.8)

The key observation now is that this holds for all Ω ⊂ D, and this is only possible
if the integrands on the left and right hand sides are pointwise equal. Since div J =
−div(A∇u), and writing R(u) = f + au0 − au, we obtain the PDE

∂t u = div(A∇u) + R(u).   (1.9)

In the special case when R = 0 and A = k is a scalar positive constant, this reads

∂t u = k∆u,   (1.10)

and is called the heat equation, or sometimes the diffusion equation. Indeed, a
diffusion process can be modelled by this PDE through arguments entirely similar to
those above. The unknown u now describes the density of some gas or fluid, diffusion
is described by Fick's law (analogously to Fourier's law), and external sources of the
substance can be modelled by a reaction term R(u) as above. A process described
by a PDE of the form (1.9) is referred to as a reaction-diffusion process. Above, our
function R(u) depended linearly on u, but more general reaction-diffusion processes
are modelled by suitable non-linear reaction terms R(u).
To understand the heat equation, it is instructive to study how (1.10) works
at a point (x, y) ∈ D where u attains a local maximum. Typically we there have
∆u < 0, and since k > 0, the heat equation forces ∂t u < 0. Similarly, at the minima
of u we have ∂t u > 0. This illustrates the main feature of the heat equation: it
lowers/smooths the peaks of u as time t increases.
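
This smoothing is easy to observe numerically. The following hedged sketch (an explicit finite-difference scheme on an ad hoc grid, not a method developed in these notes) integrates the 1D heat equation ∂t u = k∂x²u with homogeneous Dirichlet BCs and prints the decaying maximum of an initial peak.

import numpy as np

# Explicit finite differences for u_t = k u_xx on (0,1), u = 0 at the ends.
k, n = 1.0, 100
x = np.linspace(0.0, 1.0, n + 1)
dx = x[1] - x[0]
dt = 0.4 * dx**2 / k                  # within the stability bound dx^2/(2k)

u = np.exp(-200 * (x - 0.5)**2)       # initial temperature peak
u[0] = u[-1] = 0.0

for step in range(1, 501):
    u[1:-1] += dt * k * (u[2:] - 2*u[1:-1] + u[:-2]) / dx**2
    if step % 100 == 0:
        print(f"t = {step*dt:.4f}, max u = {u.max():.4f}")   # peak decays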

Example 1.3.2 (The wave equation). We again consider a domain D ⊂ R2, but
now consider it as the equilibrium position of a membrane which can vibrate
vertically, for example a drum. The unknown function u = u(x, y, t) which we seek
here describes the vertical position of the membrane at position (x, y) and time t. We
limit ourselves to the linear approximation of the evolution of u, which in particular
means that we only consider very small vibrations of the membrane, as is typically
the case for a drum.
To derive a PDE which u satisfies, we consider a region Ω ⊂ D and Newton's
second law F = ma applied to the part of the membrane described by Ω. Assuming
a small vibration, we can neglect the horizontal motion. For the vertical direction,
the rate of change ma of the total momentum for the Ω part of the membrane is
modelled by

∫∫_Ω ∂t² u dxdy,

assuming for simplicity that the membrane is homogeneous with density 1. On Ω,
there are two types of forces acting.

Figure 1.2: Tension force on membrane element.

First we have the vertical tension force. By Newton's third law, at each point
on ∂Ω, the D \ Ω part of the membrane acts by a force T on Ω, while at the same
time the Ω part of the membrane acts by a force −T on D \ Ω. A natural linear
model is that T has the direction of ν and that the horizontal part of T is constant.
Equivalently, we assume that there is a positive constant, which we write as c²,
describing the tension of the membrane, such that

Tz = c² ∂ν u.

Thus the total vertical force by which the rest of the membrane acts on Ω is

∫_{∂Ω} c² ∂ν u ds.

Second, we may have some external vertical force, modelled by a function f =
f(x, y, t), for example gravity. Another possibility is friction, if the membrane
vibrates in some surrounding medium. A natural linear model for such frictional
forces is −a∂t u, where a is some positive proportionality factor, so that the friction
force is opposite to the velocity ∂t u of the membrane. In total, we have a total
vertical external force on the Ω part of the membrane being

∫∫_Ω (f − a∂t u) dxdy.

Coming back to Newton's second law, and using the divergence theorem, we have

∫∫_Ω ∂t² u dxdy = ∫_{∂Ω} c² ∂ν u ds + ∫∫_Ω (f − a∂t u) dxdy
= ∫∫_Ω (c² ∆u + f − a∂t u) dxdy.

This holds for all Ω ⊂ D, and this is only possible if the integrands on the left and
right hand sides are pointwise equal. We therefore obtain the PDE

∂t² u = c² ∆u + f − a∂t u.   (1.11)

For undamped vibrations and no external forces, that is when a = f = 0, this reads

∂t² u = c² ∆u   (1.12)

and is called the wave equation.

To understand the wave equation, it is instructive to study how (1.12) works at a
point (x, y) ∈ D where u attains a local maximum. Typically we there have ∆u < 0,
and since c > 0, the wave equation forces ∂t² u < 0. Similarly, at the minima of u we
have ∂t² u > 0. This illustrates the main feature of the wave equation: the membrane
pulls/accelerates the peaks of u towards the equilibrium position u = 0.
There are both lower and higher dimensional versions of the heat and wave
equations, which are equally important. In one spatial variable, the heat equation
for u = u(x, t) describes heat flow in a one-dimensional conducting rod, whereas the
wave equation for u = u(x, t) describes a vibrating string, for example a string on
a violin. In three spatial variables, the heat equation for u = u(x, y, z, t) describes
heat flow in a three-dimensional conducting solid object, whereas the wave equation
for u = u(x, y, z, t) describes scalar waves in space, like acoustic/sound waves.
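
For contrast with the smoothing seen in the heat example, the following hedged sketch integrates the 1D wave equation ∂t²u = c²∂x²u with a standard leapfrog scheme (again an ad hoc numerical choice): the initial peak is not damped but splits into traveling waves, and the midpoint value oscillates as they reflect at the ends.

import numpy as np

# Leapfrog scheme for u_tt = c^2 u_xx on (0,1), u = 0 at the ends,
# zero initial velocity. The peak propagates instead of being smoothed.
c, n = 1.0, 200
x = np.linspace(0.0, 1.0, n + 1)
dx = x[1] - x[0]
dt = 0.5 * dx / c                          # CFL condition c*dt/dx <= 1

u = np.exp(-200 * (x - 0.5)**2)            # initial displacement
u[0] = u[-1] = 0.0
u_prev = u.copy()                          # zero initial velocity

for step in range(1, 801):
    u_next = u.copy()
    u_next[1:-1] = (2*u[1:-1] - u_prev[1:-1]
                    + (c*dt/dx)**2 * (u[2:] - 2*u[1:-1] + u[:-2]))
    u_prev, u = u, u_next
    if step % 200 == 0:
        print(f"t = {step*dt:.2f}, u(1/2) = {u[n//2]:+.3f}")  # sign changes
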
Example 1.3.3 (The Laplace and Poisson equations). Assume that we are in an
equilibrium state, where u = u(x, y) does not depend on time t. Then the general
heat equation (1.9) (with A = 1), as well as the wave equation (1.11) (with c = 1),
reduces to the PDE

−∆u = f,

which is referred to as Poisson's equation. (The minus sign is of course not important,
but as we shall see later there are reasons to keep it.) When f = 0, it is called the
Laplace equation, and functions u solving ∆u = 0 are called harmonic functions.

Figure 1.3: A harmonic function u on an x shaped domain D, with u = 1 on the blue
part of ∂D and u = 2 on the red part of ∂D.

The Laplace equation in one dimension, for u = u(x), is trivial, since u′′(x) = 0
simply means that u is a linear function of x. In two and higher dimensions, the
equilibrium state that the Laplace equation describes is much more interesting, as
we shall see. Beyond the Laplace equation, for a general positive matrix-valued
function A = Aij(x), the PDE

div(A(x)∇u(x)) = 0

is referred to as a divergence form elliptic equation, and describes an equilibrium
state in an inhomogeneous and anisotropic material.
Before moving on, it is worthwhile to discuss how we obtained the PDEs from
the integral identities over domains Ω in Examples 1.3.1 and 1.3.2. In each case, by
moving all terms to the left hand side, we know that a certain function F satisfies

∫∫_Ω F dxdy = 0   (1.13)

for any region Ω ⊂ D. In Example 1.3.1, we had F = ∂t u + div J − f − a(u0 − u),
and in Example 1.3.2, we had F = ∂t² u − c² ∆u − f + a∂t u. Let us assume that F
is a continuous function, and to reach a contradiction, assume that F(x0) > 0 at
some x0 ∈ D. By continuity there exists a small ball B around x0, so that F(x) > 0
for all x ∈ B. But then (1.13) cannot hold for Ω = B. Similarly, F(x1) < 0 at
some x1 ∈ D would yield a contradiction to (1.13). Therefore the only possibility is
that F = 0 at each point in D.
Arguments similar to this appear frequently in this book, see for example Proposition 2.1.5.

1.4 Linear PDE problems

By a PDE problem, we mean a PDE together with various additional conditions
imposed in order to uniquely determine the function. These conditions can be:

• Initial/starting conditions (ICs), sometimes referred to as Cauchy data, in
which case we speak of initial value problems (IVPs).
For the heat equation, where only the first time derivative ∂t u appears, the
natural IC is to prescribe all the values of u at t = 0.
For the wave equation, where the second time derivative ∂t² u appears, the
natural IC is to prescribe all the values of u as well as of ∂t u, at t = 0.
Since there is no time-evolution for Laplace's equation, we normally do not
consider IVPs for Laplace's equation. See however Exercise 10 and Section 5.3.

• Boundary conditions (BCs), in which case we speak of boundary value problems
(BVPs).
When we consider some of the second order PDEs derived in Section 1.3, not in
all of Rn but rather in some domain D ⊂ Rn, then the most common BCs at ∂D
are the Dirichlet BC and the Neumann BC, as discussed below. Another
natural choice is the Robin BC. For the evolution equations, the heat and
wave equations, we impose BCs at ∂D for each time t.
There is a generalization of BVPs called transmission problems, where jump
conditions are specified at an interface between two regions where the PDE
holds.

• Decay/radiation conditions at ∞, in the case that the domain D where the
PDE holds is unbounded.

Example 1.4.1 (Dirichlet/Neumann/Robin BCs). The Dirichlet BC is to require
that

u(x) = ϕ(x) for all x ∈ ∂D,

where ϕ is some given function on ∂D. Homogeneous Dirichlet BCs mean that
ϕ = 0.

• When applied to the heat equation, this means that we keep the temperature
at values prescribed by ϕ along ∂D.

• When applied to the wave equation, this means the membrane is fixed at height
ϕ along ∂D.

The Neumann BC is to require that

∂ν u(x) = ϕ(x) for all x ∈ ∂D,

where ϕ is some given function on ∂D. Recall that the normal directional derivative
is ∂ν u = ⟨ν, ∇u⟩. Homogeneous Neumann BCs mean that ϕ = 0.

• When applied to the wave equation, this means the membrane is acted on by
a vertical force ϕ along ∂D. In particular, homogeneous Neumann BCs mean
that the membrane moves freely in the vertical direction at ∂D.

• When applied to the heat equation, in the case where A = k is a scalar constant,
this means the flow of heat through ∂D into D is ⟨J, −ν⟩ = kϕ. In particular,
homogeneous Neumann BCs mean that the plate is insulated at ∂D so
that no heat can flow through ∂D.
For more general divergence form equations involving div(A∇u), the natural
Neumann BC is to specify the conormal derivative ⟨ν, A∇u⟩ = ϕ.

The Robin BC is to require that

∂ν u(x) + a(x)u(x) = ϕ(x) for all x ∈ ∂D,

where ϕ is some given function on ∂D, and a is a given function on ∂D which should
not be negative. One may view the Neumann BC as the special case a = 0, and the
Dirichlet BC, after some re-normalization, as the special case a = ∞.

We can see that each of these BCs alone is strong enough to uniquely determine
the solution, as follows. Consider for example the Poisson equation, and assume that
we have two solutions u1 and u2 to the Dirichlet problem

−∆u(x) = f(x),  x ∈ D,
u(x) = ϕ(x),  x ∈ ∂D.

Let v = u1 − u2. Since the PDE and the BC are linear, we have

∆v(x) = ∆u1 − ∆u2 = −f(x) + f(x) = 0,  x ∈ D,
v(x) = u1(x) − u2(x) = ϕ(x) − ϕ(x) = 0,  x ∈ ∂D.

From Green's first identity we get

∫_D |∇v(x)|² dx = ∫_{∂D} v(x) ∂ν v(x) dS.

Here the right hand side is zero since v = 0 at ∂D, so the left hand side shows that
∇v = 0 in all of D. Thus v is constant, but from the BC we see that this constant
must be zero, so v = 0, or equivalently u1 = u2.
Very similar arguments apply to the other two BCs. For the Neumann BC, the
right hand side vanishes, now since ∂ν v = 0, but we can only conclude that v is
constant. And indeed, a typical feature of solutions to the Neumann problem is that
they are unique only up to constants. For the Robin BC, the right hand side
is −∫_{∂D} a(x)v(x)² dS ≤ 0. But since ∫_D |∇v(x)|² dx ≥ 0, we obtain that v = 0
in all of D if a ≥ 0 and a is non-zero at least on a part of ∂D.
Roughly speaking, a main goal in Chapter 2 will be to show that these BCs in
fact are just strong enough so that a solution u exists for any given data. Note that
if we impose two of these BCs at the same time, for example if we try to prescribe
both u and ∂ν u along ∂D, then in general there will exist no solution to this PDE
problem.
In general, the prescribed data for a PDE problem are of three types: ICs, BCs
and internal sources. For linear problems, we can pass between these types, as the
following example illustrates.

Example 1.4.2 (Homogeneous reduction for linear PDEs). Consider the initial-boundary
value problem (IBVP) for the heat equation

∂t u(t, x) = k∆u(t, x) + f(t, x),  t > 0, x ∈ D,
u(0, x) = g(x),  x ∈ D,   (1.14)
u(t, x) = h(t, x),  t > 0, x ∈ ∂D.

Here f in the PDE represents internal sources, g is the prescribed IC and h is the
prescribed Dirichlet BC.
The point of this example is to demonstrate that if we can solve this linear PDE
problem in the special case when g = h = 0 and for all f, then we can easily also
solve it for general f, g, h. To see this, let any f, g, h be given. Pick any sufficiently
smooth function u1(t, x) such that u1 = g at t = 0 and u1 = h at ∂D. Write

u(t, x) = u0(t, x) + u1(t, x),

where u1 is the known function that we just introduced. Now using that the PDE
∂t u − k∆u = f is linear, in terms of u0 we equivalently have

∂t u0 − k∆u0 = f − ∂t u1 + k∆u1.

Therefore, if we define a new source term f̃ := f − ∂t u1 + k∆u1, then we see that u
solves the PDE problem (1.14) if and only if u0 solves the PDE problem

∂t u0(t, x) = k∆u0(t, x) + f̃(t, x),  t > 0, x ∈ D,
u0(0, x) = 0,  x ∈ D,   (1.15)
u0(t, x) = 0,  t > 0, x ∈ ∂D.

Therefore, if we can solve (1.15) for u0, then we obtain the solution u = u0 + u1 to
(1.14).

We summarize this discussion with a central notion in PDE theory.

Definition 1.4.3. A PDE problem is said to be well-posed (in the sense of Hadamard)
if

(i) there exists a solution for any given data,

(ii) this solution is unique, and

(iii) the solution depends continuously on the data.



Note that for this notion of well-posedness to be well defined, a function space
Y of admissible data and a function space X of possible solutions must be specified
for (i) and (ii) to make sense. For (iii) to make sense, we must also specify norms
on X and Y , in which we measure the continuity.
As we saw examples of above, uniqueness (ii) of a PDE problem is relatively
easy to prove. For linear PDE problems, the strategy is to assume that we have two
solutions u1 and u2 for the same data. Linearity then shows that v = u1 − u2 is a
solution with 0 data, and it is from this often possible to show that v must be zero.
Continuous dependence on data (iii) is typically proved rather similarly, but using
the norms. For linear PDE problems, we now need to show that if the data for v is
small in the Y norm, then v must be small in the X norm.
Existence (i) of solutions is usually the most difficult to prove. Indeed, much of
our coming work is devoted to constructing solutions, numerically and theoretically,
for given data. There are also techniques for linear PDE problems, referred to as
Fredholm theory, which can be used to deduce (i) from (ii) and (iii). See Section 6.
Most of the naturally appearing PDE problems in physics are well-posed. There
are, however, important examples of PDE problems which are not well-posed: we
speak of ill-posed PDE problems. Here it is often (i) and (iii) that fail, and what
can only be proved is uniqueness of solutions (ii). See Exercise 10 and Example 3.3.4.

1.5 Exercises

1. Solve the 2 × 2 ODE system u′(t) = Au(t) by diagonalizing A, where A is the
matrix from Exercise 0.3.

2. Solve the 2 × 2 ODE system u′(t) = Au(t), where A = [ 0 1 ; −1 2 ]. Hint: Use
a similarity transformation to an upper triangular matrix T AT⁻¹. Then start
by solving the simplest ODE in the new system.

3. Determine the order of the following PDEs and whether they are nonlinear, linear
inhomogeneous or linear homogeneous.
(a) ∂t u − ∂x²u + 1 = 0
(b) ∂t u − ∂x²u + xu = 0
(c) ∂t u − ∂x²∂t u + u∂x u = 0
(d) ∂t²u + ∂x²u + x³ = 0
(e) i∂t u + ∂x²u + u/x = 0
(f) (1 + ∂x²u)^{−1/2} ∂x u + (1 + ∂y²u)^{−1/2} ∂y u = 0
(g) ∂x u + e^y ∂y u = 0
(h) ∂t u + ∂x⁴u + (1 + u)⁸ = 0

4. Solve 2∂t u + 3∂x u = 0 when u(0, x) = sin x.

5. Find the general solution to (1 + x²)∂x u + ∂y u = 0. Sketch some characteristic
curves.

6. Solve x∂x u − y∂y u + u = x when u(x, x) = x².

7. Let u0 be any function that solves the heat equation and satisfies the initial
conditions in (1.14). (We learn below in (3.6) how to compute such u0.) Show
how to use this particular solution to reduce the IBVP (1.14) to the case
f = g = 0.

8. Show that the Neumann BVP

∆u(x) = f(x),  x ∈ D,
∂ν u(x) = ϕ(x),  x ∈ ∂D,

can have a solution only if ∫_D f dx = ∫_{∂D} ϕ dS. What does this result mean
for a heat equilibrium?

9. Give an example of a problem from physics which can be modelled by a Robin
BVP for the Laplace equation.

10. Consider the following Cauchy problem for the Laplace equation in a square.

∆u = 0,  0 < x, y < π,
u = 0,  x = 0 or x = π,
u = g,  y = 0,
∂y u = h,  y = 0,

with given ICs g and h at y = 0. Note that y is treated as an evolution variable.
Writing t = y, this becomes similar to the IBVP for the wave equation,
but with opposite sign for the tension term ∂x²u. Explain why the functions
u(x, y) = sin(kx) cosh(ky) show that the solution u does not depend continuously
on the data g and h.

11. Solve √(1 − x²) ∂x u + ∂y u = 0 when u(0, y) = y².

12. Solve ∂x u + y∂y u = u when u(x, 0) = x.
Chapter 2

Weak derivatives and FEM

The goal in this chapter is to learn the theory behind the finite element method
(FEM) for the numerical solution of BVPs for the Laplace equation. This uses a
weak formulation of the PDE problem, and the concept of weak derivatives. So we
start with a short introduction to distribution theory and collect what we need for
this course.

Recommended reading:

• Strauss 12.1.

• Lecture notes: http://www.math.chalmers.se/~rosenan/fundlosndistr.pdf

• Strauss 8.5.

Proposition 2.2.5: Only the statement and understanding how to
apply it is part of the course, not the proof.

Study questions:
What is a distribution? What is a test function? What do we
mean when we say that a distribution is a function? How do we
use integration by parts/use Green to compute a distributional
derivative? What is a fundamental solution to a PDE?

2.1 Distributions
We have seen in Section 0.1 that the Lebesgue integral is needed for modern analysis.
Likewise, we also need the modern distributional definition of derivative. At first
the notion of distributions below may seem more abstract than that of a function,
but in fact distributions are closer to physical reality than functions.

Example 2.1.1 (Point masses and dipoles). General distributions, as defined
mathematically below, can be quite wild creatures. However, the only two types of
distributions beyond ordinary functions which we consider and need in this course are
variations of the physical concepts of point masses and dipoles. Naively, a point mass f
at the origin is defined by

f(x) = ∞, x = 0;   f(x) = 0, x ≠ 0.

However, this is a useless definition, since we cannot calculate with ∞. For example,
remember that 0 · ∞ can be anything/is undefined. The proper and useful definition
of a point mass (Dirac delta) is found below.
Similarly, the naive definition of a dipole at the origin is

f(x) = ∞, x = 0+;   f(x) = −∞, x = 0−;   f(x) = 0, x ≠ 0.

The physical meaning is that we have placed two point masses, infinitely large and
of opposite sign, infinitesimally close at the origin. Again, this is a useless definition,
since we somehow must quantify infinity and the infinitesimal distance. The proper
definition of a dipole (Dirac delta derivative) is found below.

To explain distributions, we need to recall what a weighted average is. The
weighted average of N quantities f1, . . . , fN is a linear combination of the form

ϕ1 f1 + ϕ2 f2 + · · · + ϕN fN,

where ϕ1 + · · · + ϕN = 1 and ϕi ≥ 0, i = 1, . . . , N. The usual uniform average is the
special case when all weights are equal to ϕi = 1/N.
Consider now a function f(x) depending on a real variable x ∈ R. The continuous
analogue of the weighted average above is

⟨f, ϕ⟩ = ∫ f(x)ϕ(x) dx,   (2.1)

where now the weight ϕ is a positive function ϕ(x) ≥ 0 with total mass ∫ ϕ(x) dx = 1.
Here is the core idea of distributions. The classical functional way to describe a
quantity f depending on a variable x is to specify its values f(x) at each point
x. However, if f is a physical quantity then this is typically not possible, since a
point is only a mathematical idealization. In real life, what is possible to measure
are various averages ⟨f, ϕ⟩ of the values f(x). This leads us to the concept of a
distribution, where instead of viewing f as depending on the point x ∈ R, we view
f as depending on the weight function ϕ.

The correct intuition is that instead of talking about the values f(x) of
a function f at mathematical points x, we now talk about the values
⟨f, ϕ⟩ of a distribution f at physical "smeared out" points ϕ.

To turn these ideas into a good mathematical definition, we make the following
observations.

• The restrictions ϕ(x) ≥ 0 and ∫ ϕ(x) dx = 1 were imposed above only for
the physical interpretation of the values ⟨f, ϕ⟩. We shall no longer need these
restrictions and therefore drop them, and we will refer to ϕ as a test function
rather than a weight function.

• The integral (2.1) depends linearly on ϕ (as well as on f), meaning that

⟨f, aϕ + bψ⟩ = a⟨f, ϕ⟩ + b⟨f, ψ⟩,

for all a, b ∈ R and all test functions ϕ, ψ. This reflects the fact that there is
a redundancy in the values ⟨f, ϕ⟩, in that they cannot be specified completely
independently of each other. This is in contrast with the classical function values
f(x), which in general can be independently specified to yield a function.

• For f, ϕ ∈ L2(R), the value (2.1) of the function f at the test function ϕ is
nothing but the L2 inner product of f and ϕ. As further studied in a course on
functional analysis, there is a dual relation between f and ϕ: to make sense
of the value ⟨f, ϕ⟩, the better the function ϕ is, the worse we can allow f to
be. For example, if ϕ is bounded, then ⟨f, ϕ⟩ is well defined for any integrable
function f, that is, if ∫ |f(x)| dx < ∞. And if ϕ = 0 outside a bounded set,
then we can allow f to grow arbitrarily fast as x → ∞. Further assuming
that ϕ is differentiable to any order, then ⟨f, ϕ⟩ will be well defined for any
distribution f.
Definition 2.1.2 (Distributions and test functions). Let D(Rn) denote the set of
all test functions ϕ : Rn → R such that all partial derivatives of ϕ of all orders exist
as continuous functions, and such that ϕ = 0 outside some bounded set.
Let D′(Rn) denote the set of all distributions, that is, functionals F : D(Rn) → R
such that F is linear and defined on all of D(Rn). We denote the value of F on the
test function ϕ by ⟨F, ϕ⟩.
Before we start using distributions in concrete calculations, some remarks are in
order.

• We write ⟨F, ϕ⟩ for the value of the distribution F on the test function ϕ. As
discussed above, if ϕ is a weight function, then the number ⟨F, ϕ⟩ means the ϕ-weighted
average of the quantity represented by the distribution. Alternative
notations are F[ϕ] or F(ϕ), but we use the inner product symbol ⟨F, ϕ⟩ since
this better indicates the linear dependence on ϕ. Note carefully, though, that
unless the distribution F is (represented by) a function, ⟨F, ϕ⟩ is not an integral
but just the value of F at ϕ.

• The usual definition of a distribution F : D(Rn) → R is that it should be
linear and continuous in a certain sense. It turns out that any F which is
linear and defined on all of D(Rn) that you will ever encounter in real life will
be continuous in this sense. This is a consequence of a completeness property
of the function space D(Rn). To be honest, there are in some sense linear
F defined on all of D(Rn) which are not continuous, but to construct such an
example you will need the axiom of choice, a very abstract mathematical tool
from the foundations of logic.

• Test functions ϕ ∈ D(Rn ) exist in abundance. However, this is not immediately obvious. Here is the most well known example, in dimension n = 1, which gives a test function which is nonzero on an interval (a, b):

    ϕ(x) := e^(−1/(x−a)) e^(−1/(b−x)),  a < x < b,
    ϕ(x) := 0,  else.

The idea behind this construction is that e^(−1/x) → 0 so fast as x → 0+ that all derivatives will vanish at 0.
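For readers who want to see this concretely, here is a small Python sketch (our own illustration, not part of the text; the name bump and the choice (a, b) = (0, 1) are ours) which samples this test function:

import numpy as np

def bump(x, a=0.0, b=1.0):
    # Smooth function, positive on (a, b) and identically zero outside.
    y = np.zeros_like(x, dtype=float)
    inside = (x > a) & (x < b)
    xi = x[inside]
    y[inside] = np.exp(-1.0 / (xi - a)) * np.exp(-1.0 / (b - xi))
    return y

x = np.linspace(-0.5, 1.5, 9)
print(bump(x))  # zero outside (0, 1), positive inside

Plotting bump and its difference quotients near x = 0 and x = 1 illustrates the flat contact of all orders claimed above.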

The most well known distribution, which is not a function, is the following point
mass distribution.
Definition 2.1.3 (The Dirac delta). Let a ∈ Rn . The Dirac delta at a is the
distribution δa ∈ D′ (Rn ) given by

⟨δa , ϕ⟩ := ϕ(a).

The Dirac delta distribution δa is not a function, but the limit (in a weak sense introduced below) of functions g(x) such that g ≠ 0 only in a small neighbourhood of a and ∫ g(x)dx = 1.

Lemma 2.1.4. Assume that g ≥ 0 is such that g(x) = 0 when |x − a| ≥ r and ∫ g(x)dx = 1. Then for any function f , we have

    |∫ f (x)g(x)dx − f (a)| ≤ sup_{|x−a|<r} |f (x) − f (a)|.

Proof. Just write

    ∫ f (x)g(x)dx − f (a) = ∫ (f (x) − f (a))g(x)dx = ∫_{|x−a|<r} (f (x) − f (a))g(x)dx

and estimate with the triangle inequality for integrals.


Roughly speaking, functions are special cases of distributions. More precisely, the relation between these two concepts is the following.
Proposition 2.1.5 (Functions vs. distributions). Let f : Rn → R be a locally integrable function, that is we assume that ∫_D |f (x)|dx < ∞ for any bounded set D. Then

    ⟨F, ϕ⟩ = ∫ f (x)ϕ(x)dx,  ϕ ∈ D(Rn ),

is a well defined distribution F ∈ D′ (Rn ).


Assume now that two locally integrable functions f1 and f2 represent the same distribution, that is

    ∫ f1 (x)ϕ(x)dx = ∫ f2 (x)ϕ(x)dx,  ϕ ∈ D(Rn ).

Then f1 (a) = f2 (a) at each point a ∈ Rn where both f1 and f2 are continuous.

We remark that using Lebesgue integration theory, one can prove that f1 (x) = f2 (x) at almost every x ∈ Rn .
Proof. Let ϕ ∈ D(Rn ) be a test function. Then there exist C < ∞ and R < ∞ so that |ϕ(x)| ≤ C for all x ∈ Rn and ϕ(x) = 0 for |x| ≥ R. Therefore

    ∫_{Rn} |f (x)ϕ(x)|dx ≤ C ∫_{|x|<R} |f (x)|dx < ∞,

so the integral defining ⟨F, ϕ⟩ is absolutely convergent. By the properties of the


integral, this depends linearly on ϕ, so F defines a distribution.
Assume that f1 and f2 represent the same distribution and both are continuous at a. Let f := f1 − f2 , so that ∫ f (x)ϕ(x)dx = 0 for all ϕ ∈ D(Rn ). Apply Lemma 2.1.4 with g = ϕ such that ϕ = 0 when |x − a| ≥ r and ∫ ϕ(x)dx = 1. Since

    lim_{r→0} sup_{|x−a|<r} |f (x) − f (a)| = 0

follows from the continuity of f at a, we conclude that f (a) = 0. This proves that
f1 (a) = f2 (a).

Figure 2.1: Example of zero order distributions: (a) Locally integrable functions. (b)
Dirac delta distributions.

Recall the classical pointwise definition of the derivative of a one-variable function f (x) at x = a ∈ R:

    f ′ (a) = lim_{h→0} (f (a + h) − f (a))/h.
A serious disadvantage of this definition is that not all functions are differentiable, since the above limit does not exist at points where f jumps, or has even worse irregularities. By generalizing functions to distributions we can remove this obstruction completely: anything (function or distribution) is differentiable in the following sense.

Definition 2.1.6 (Weak derivative). Let F ∈ D′ (Rn ) be a distribution. Its par-


tial derivative ∂xi F in the xi -direction in the weak or distributional sense is the
distribution
⟨∂xi F, ϕ⟩ = −⟨F, ∂xi ϕ⟩, ϕ ∈ D(Rn ).

In this definition, the derivative ∂xi ϕ is the usual one defined by difference quo-
tients, which is well defined since ϕ is a test function. The motivation for this defi-
nition is integration by parts. Indeed, if F is the distribution defined by a smooth
function f , then
    ∫_{Rn} ∂xi f (x)ϕ(x)dx = − ∫_{Rn} f (x)∂xi ϕ(x)dx,

since ϕ = 0 outside a bounded set.



Carefully note that the weak derivative in Definition 2.1.6 is not defined pointwise. Given a function or distribution F , we only know what ∂xi F is as a whole. We do not tell what ∂xi F (x) is at each x ∈ Rn . Indeed, remember from the initial discussion in this section that this is impossible for distributions in general.

Example 2.1.7. Let f− and f+ be two smooth one-variable functions, and set

    f (x) = f− (x),  x < 0,
    f (x) = f+ (x),  x > 0.

We compute the weak derivative of f :

    ⟨f ′ , ϕ⟩ = −⟨f, ϕ′ ⟩ = − ∫_{−∞}^{0} f− (x)ϕ′ (x)dx − ∫_{0}^{∞} f+ (x)ϕ′ (x)dx
             = −[f− (x)ϕ(x)]_{−∞}^{0} + ∫_{−∞}^{0} f− ′ (x)ϕ(x)dx − [f+ (x)ϕ(x)]_{0}^{∞} + ∫_{0}^{∞} f+ ′ (x)ϕ(x)dx
             = (f+ (0) − f− (0))ϕ(0) + ∫_{−∞}^{∞} g(x)ϕ(x)dx,

where g is the pointwise derivative of f for x ̸= 0. We view g as a distribution so it


is irrelevant to define g(0). What we obtain is that the weak derivative of f is

f ′ = g + (f+ (0) − f− (0))δ0 .

This means that if f is continuous at 0, that is if f+ (0) = f− (0), then the weak
derivative is a function: the pointwise derivative g. But if f jumps at 0, then the
weak derivative also contains a distribution term: a Dirac delta at the jump, with
mass equal to the size of the jump.
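The jump term can be checked numerically. The following Python sketch is our own illustration (the choice of f , with unit jump at 0, and the Gaussian stand-in for a test function are ours): quadrature confirms that −⟨f, ϕ′ ⟩ equals ⟨g, ϕ⟩ plus the jump times ϕ(0).

import numpy as np

x = np.linspace(-5.0, 5.0, 200001)   # grid chosen so that x = 0 is a grid point
f = x**2 + (x > 0)                   # unit jump at 0; pointwise derivative g(x) = 2x
phi = np.exp(-x**2)                  # smooth and negligible at the endpoints
dphi = -2.0 * x * phi
lhs = -np.trapz(f * dphi, x)                                # <f', phi> = -<f, phi'>
rhs = np.trapz(2.0 * x * phi, x) + 1.0 * phi[x.size // 2]   # <g, phi> + jump * phi(0)
print(lhs, rhs)                      # both approximately 1.0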

Example 2.1.8 (Dirac delta derivative). We have seen that the Dirac delta distri-
bution δa at a ∈ R appears as the weak derivative of a function which jumps at
this point. Since any distribution can be weakly differentiated, we may form the
derivative δa′ , which is the distribution defined by

⟨δa′ , ϕ⟩ = −ϕ′ (a).

This is a distribution even more singular than δa . The physical interpretation of δa is as a unit point mass at a, whereas the interpretation of δa′ is as a dipole at a: two infinite charges with opposite sign placed infinitesimally close together, with a balance between charges 1/ϵ and distance ϵ.
The Dirac delta derivative is an example of a first order distribution, unlike
(locally integrable) functions and the Dirac delta itself, which are examples of zero
order distributions. Roughly speaking, a distribution has order k if ⟨F, ϕ⟩ requires
the partial derivatives of ϕ up to order k for its definition. A smeared out version
of δa′ is the distribution

    ⟨F, ϕ⟩ = lim_{ϵ→0} ∫_{|x|>ϵ} x^(−1) ϕ(x)dx = ∫_{0}^{∞} (ϕ(x) − ϕ(−x))/x dx,

usually referred to as the principal value distribution 1/x. Note that 1/x is not locally integrable around 0, but still the value ⟨F, ϕ⟩ is well defined since the two infinite masses ∫_{0}^{∞} x^(−1) dx = ∞ and ∫_{−∞}^{0} x^(−1) dx = −∞ cancel when calculating the integral symmetrically around 0. But note that this requires smoothness of ϕ: the derivative of ϕ should exist at 0 at least, so F is a first order distribution.

Figure 2.2: Example of first order distributions: (a) Dirac delta derivative. (b) Principal
value 1/x.

The last notion for distributions which we need for now, is that of weak conver-
gence of distributions. This is both a simpler and more useful notion of convergence
than that of convergence of test functions, discussed above.

Definition 2.1.9 (Weak convergence). Consider a sequence of distributions Fk ∈


D′ (Rn ), k = 1, 2, . . .. We say that Fk converge weakly, or converge in the sense of
distributions to F ∈ D′ (Rn ), if

⟨Fk , ϕ⟩ → ⟨F, ϕ⟩, as k → ∞,

for each test function ϕ ∈ D(Rn ).

Note that since the distributions are functions Fk : D(Rn ) → R defined on the
set of test functions, weak convergence is nothing but pointwise convergence: we
demand that their values at each test function converge.
As the name suggests, this type of convergence is indeed weak. Roughly speaking, if we have any type of convergence, then we also have weak convergence. For example,

Lemma 2.1.10. Let fk ∈ L2 (Rn ) be a sequence of square integrable functions, and


view these as distributions as in Proposition 2.1.5. If fk → f ∈ L2 (Rn ), that is if
∥fk − f ∥L2 → 0, then
fk → f weakly as k → ∞.

Proof. Let ϕ ∈ D(Rn ). Then

    |⟨fk , ϕ⟩ − ⟨f, ϕ⟩| = |∫ (fk − f )ϕ dx| ≤ (∫ |fk − f |² dx)^(1/2) (∫ |ϕ|² dx)^(1/2) → 0

by the Cauchy–Schwarz inequality.

The two main examples of weak convergence, where we do not have convergence
in any other usual sense, are the following.

Example 2.1.11 (Delta convergence). Fix a ∈ Rn . Let gk : Rn → R be functions such that gk (x) = 0 when |x − a| ≥ rk , where rk → 0 as k → ∞. Assume that ∫ gk dx = 1 and that gk (x) ≥ 0 for all x ∈ Rn . This means that each g1 , g2 , . . . represents a density with total mass 1, and as k → ∞, this concentrates at a.

We claim that

    gk → δa ,  k → ∞,

in the weak sense. Indeed, choosing f = ϕ as a test function in Lemma 2.1.4 shows that

    |⟨gk , ϕ⟩ − ⟨δa , ϕ⟩| = |∫ gk (x)ϕ(x)dx − ϕ(a)| ≤ sup_{|x−a|<rk} |ϕ(x) − ϕ(a)| → 0

as k → ∞, since ϕ in particular is continuous.


Note that we do NOT have gk → δa in the L2 or uniformly. It is however true
that gk (x) → 0 pointwise (in the sense of functions) for x ̸= a, and choosing gk so
that gk (a) = 0 we also trivially have convergence at x = a. This illutrates the major
advantage or distributions and weak convergence: only in this sense we clearly see
the most important fact, namely that the total mass of the functions gk concentrates
at x = a into a Dirac delta. With simple pointwise convergence, we are falsely led
to believe that the mass disappears in the limit.
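Here is a small numerical illustration of this delta convergence (our own Python sketch; the triangular densities and the Gaussian stand-in for a test function are our choices): the pairings ⟨gk , ϕ⟩ approach ϕ(a), while the L2 norms of gk blow up, so there is no L2 limit.

import numpy as np

phi = lambda x: np.exp(-x**2)            # stand-in for a smooth test function
a = 0.5
for k in [1, 10, 100, 1000]:
    r = 1.0 / k
    x = np.linspace(a - r, a + r, 10001)
    g = (1.0 - np.abs(x - a) / r) / r    # triangular density: g >= 0, integral 1
    print(k, np.trapz(g * phi(x), x), np.sqrt(np.trapz(g**2, x)))
# first value -> phi(a) = exp(-0.25); the L2 norm grows like 1/sqrt(r)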

Example 2.1.12 (Oscillatory convergence). For simplicity, consider n = 1, and


define
fk (x) = sin(kx).
We claim that fk → 0 weakly as k → ∞. This is a well known result from Fourier analysis called the Riemann–Lebesgue lemma, which in particular shows that

    ∫ sin(kx)ϕ(x)dx → 0,  as k → ∞,

for any ϕ ∈ D(R). In fact, this is true for any integrable function ϕ. The idea of the proof is that a smooth function and a highly oscillatory function are almost orthogonal in L2 .
This example clearly shows how weak the notion of weak convergence is. Indeed, sin(kx) → 0 weakly but not in any other usual sense. The drawback with weak convergence is that it is quite useless in numerical applications for obvious reasons.
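Still, the oscillatory limit itself is easy to observe by quadrature (our own Python sketch, with a Gaussian as stand-in for a test function):

import numpy as np

phi = lambda x: np.exp(-x**2)
x = np.linspace(-10.0, 10.0, 400001)
for k in [1, 10, 100, 1000]:
    print(k, np.trapz(np.sin(k * x) * phi(x), x))
# the pairings <sin(kx), phi> tend to 0, although sin(kx) has no pointwise limit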

Figure 2.3: Examples of weak convergence: (a) Delta convergence. (b) Oscillatory
convergence.

2.2 Sobolev spaces


Let D ⊂ Rn be a bounded open set with a sufficiently smooth boundary. Similar to
Definitions 2.1.2 and 2.1.6, we define distributions and weak derivatives on D, rather
than on all Rn , by requiring that all the test functions are nonzero only on compact
subsets of D. We write D(D) and D′ (D) for such test functions and distributions.

Definition 2.2.1 (Sobolev space). Define the Sobolev space H 1 (D) to be the space
of square integrable functions f ∈ L2 (D) such that all the weak partial derivatives
∂xi f are square integrable functions, i = 1, . . . , n. Writing ∇f for the weak gradient,
having these weak derivatives as components, we define the Sobolev inner product
    ⟨f, g⟩H 1 := ∫_D (⟨∇f (x), ∇g(x)⟩ + f (x)g(x))dx.

Proposition 2.2.2. H 1 (D), as a linear space with the inner product ⟨·, ·⟩H 1 , is a
Hilbert space.

Proof. It is straightforward to show that H 1 (D) is a linear space and that ⟨f, g⟩H 1
defines an inner product on H 1 (D). So it remains to verify that H 1 (D) is complete
with respect to the Sobolev norm
    ∥f ∥H 1 := ( ∫_D (|∇f (x)|² + |f (x)|²)dx )^(1/2) .

We observe that H 1 (D) ⊂ L2 (D) with ∥f ∥L2 ≤ ∥f ∥H 1 and ∥∂xk f ∥L2 ≤ ∥f ∥H 1 for
all f ∈ H 1 (D). Assume now that f1 , f2 , . . . is a Cauchy sequence in H 1 (D). It
follows that this is a Cauchy sequence also in L2 (D) since ∥fi − fj ∥L2 ≤ ∥fi − fj ∥H 1 .
By Proposition 0.1.3, there exists f ∈ L2 (D) so that ∥fi − f ∥L2 → 0 as i → ∞.
Similarly, for each k = 1, . . . , n, we conclude that ∂xk fi is a Cauchy sequence in
L2 (D) and so there exists gk ∈ L2 (D) so that ∥∂xk fi − gk ∥L2 → 0 as i → ∞. But
∂xk f = gk in the weak sense as the following calculation shows.
    ⟨∂xk f, ϕ⟩ = − ∫_D f ∂xk ϕ dx = − lim_{i→∞} ∫_D fi ∂xk ϕ dx = lim_{i→∞} ∫_D (∂xk fi )ϕ dx = ∫_D gk ϕ dx.

To summarize we have shown that f ∈ H 1 (D), since its weak partial derivatives
gk are square integrable functions, and ∥fi − f ∥H 1 → 0 as i → ∞. This shows
that our given Cauchy sequence f1 , f2 , . . . converges, which proves that H 1 (D) is
complete.

Unlike the case for a general L2 (D) function, see Exercise 8, it is meaningful to
consider the values of a Sobolev function f ∈ H 1 (D) on the boundary ∂D.

Proposition 2.2.3 (Sobolev trace). Let f ∈ H 1 (D) and consider the boundary
values g = f |∂D . Then g ∈ L2 (∂D). Moreover, there exists a constant C < ∞ such
that

    ∫_{∂D} |g(x)|² dS(x) ≤ C ∫_D (|∇f (x)|² + |f (x)|²)dx.

Proof. As often is the case in analysis, it is the estimate that is crucial, so assume
first that f is a smooth function on D, so that in particular f ∈ H 1 (D) and g is well
defined and smooth.
The idea is to use a fixed smooth vector field V on Rn such that

(V (x), ν(x)) ≥ 1, for all x ∈ ∂D,



and apply the divergence theorem to the vector field f ²V . We obtain

    ∫_{∂D} |g(x)|² dS(x) ≤ ∫_{∂D} (g(x)² V (x), ν(x))dS(x)
    = ∫_D div(f (x)² V (x))dx = ∫_D (2f (x)(∇f (x), V (x)) + f (x)² divV (x))dx.

This gives the stated estimate if |V (x)| ≤ C and |divV (x)| ≤ C, if we apply the
inequality 2ab ≤ a2 + b2 , with a = |f (x)| and b = |∇f (x)|, to the first term.
To remove the condition that f is a smooth function on D, one shows that such smooth functions are dense in H 1 (D) and defines the boundary trace by continuity in the general case. We omit the details.

Definition 2.2.4 (Sobolev subspaces). Consider the Sobolev space H 1 (D). We


define two subspaces

H01 (D) := {f ∈ H 1 (D) ; f |∂D = 0},


    H̃ 1 (D) := {f ∈ H 1 (D) ; ∫_D f (x)dx = 0}.

Further define the homogeneous inner product

    ⟨f, g⟩Ḣ 1 := ∫_D ⟨∇f (x), ∇g(x)⟩dx

and norm ∥f ∥Ḣ 1 = ⟨f, f ⟩Ḣ 1 ^(1/2) .

Proposition 2.2.5 (Homogeneous norms). Let D ⊂ Rn be a bounded and connected open set with smooth boundary.
The subspaces H01 (D) and H̃ 1 (D) are closed and in particular Hilbert spaces themselves. On both these subspaces ⟨f, g⟩Ḣ 1 is an inner product and the associated norm ∥f ∥Ḣ 1 is equivalent to the Sobolev norm in the sense that there exists a constant C < ∞ such that

    ∥f ∥Ḣ 1 ≤ ∥f ∥H 1 ≤ C∥f ∥Ḣ 1

for all f ∈ H01 (D) and all f ∈ H̃ 1 (D).

We note that this equivalence of norms is not true on all of H 1 (D) for the simple reason that ∥f ∥Ḣ 1 = 0 for constant functions. However, it is readily seen that neither of the two subspaces contains any constant function except 0. A main idea is that in the Sobolev norm ∥f ∥H 1 it is the gradient term which is the dominant one, and by eliminating the constant functions we may use the simpler homogeneous norm.

Proof. (1) To prove that the subspaces are Hilbert spaces, it suffices by Lemma 0.1.4
to show that they are closed. That H01 (D) is closed follows from the estimate in
Proposition 2.2.3, since if fj → f in H 1 (D) and fj ∈ H01 (D), then

    ∥0 − f |∂D ∥²_{L2 (∂D)} ≤ C∥fj − f ∥²_{H 1 (D)} → 0,  j → ∞,



showing that f ∈ H01 (D).


To show that H̃ 1 (D) is closed, we recall the Cauchy–Schwarz inequality for integrals, which gives

    |∫_D f (x)dx|² ≤ C ∫_D |f (x)|² dx ≤ C∥f ∥²_{H 1} ,

where C = ∫_D dx < ∞ is the measure of D. Using this estimate, we can prove closedness of H̃ 1 (D) similarly to above.
(2) For the equivalence of norms, it suffices to prove an estimate

    ∫_D |f (x)|² dx ≤ Cp ∫_D |∇f (x)|² dx.    (2.2)

This is known as Poincaré's inequality, a proof of which we only sketch now. Consider the subspace H01 (D). The proof for H̃ 1 (D) is similar. Assume that there is no constant Cp < ∞ so that (2.2) holds for all f ∈ H01 (D). This means that there exists a sequence f1 , f2 , . . . in H01 (D) so that

    ∫_D |fj (x)|² dx = 1  and  ∫_D |∇fj (x)|² dx → 0,

when j → ∞. In particular supj ∥fj ∥H 1 (D) < ∞ and by Rellich’s theorem 2.2.6
below, we conclude that there exists a subsequence {fjk }k and f ∈ L2 (D) such that
limk→∞ ∥fjk − f ∥L2 = 0. Considering the weak derivatives of this f , we have

    ⟨∂xi f, ϕ⟩ = − ∫_D f (x)∂xi ϕ(x)dx = − lim_{k→∞} ∫_D fjk (x)∂xi ϕ(x)dx = lim_{k→∞} ∫_D ∂xi fjk (x)ϕ(x)dx = 0.

This shows that ∇f = 0 in the weak sense, from which one can conclude that f is
a constant function. Furthermore ∥fjk − f ∥H 1 (D) → 0 as k → ∞, so f ∈ H01 (D) by
(1). This forces f = 0 since f is constant, and contradicts ∥fj ∥L2 = 1.

We end by stating the result that we used in the above proof, which shows that
in a certain sense the L2 norm is small compared to the H 1 norm, if the domain D
is bounded.

Theorem 2.2.6 (Rellich). Assume that D ⊂ Rn is a bounded domain with smooth


boundary. Let fj be a bounded sequence of functions in H 1 (D), that is

    sup_j ∥fj ∥H 1 (D) < ∞.

Then there exists a subsequence {fjk }k and f ∈ L2 (D) such that limk→∞ ∥fjk −
f ∥L2 (D) = 0.

2.3 Existence of weak solutions to BVPs


Consider the Poisson equation
∆u(x) = −f (x)
on Rn . When u is a C 2 function, we can use the classical definition of the derivatives, but for general functions and distributions, this means that ⟨u, ∆ϕ⟩ = −⟨f, ϕ⟩. However, the most natural assumption on u is u ∈ H 1 (Rn ) since ∫ |∇u(x)|² dx has the physical interpretation of energy. In this case it suffices to integrate by parts once, using Green's first identity, and put one derivative on the test function ϕ. We obtain

    ∫ ⟨∇u(x), ∇ϕ(x)⟩dx = ∫ f (x)ϕ(x)dx,

which is referred to as a variational formulation of the Poisson equation. Since only first order derivatives of ϕ enter, if this holds for all ϕ ∈ D(Rn ) then one can show that it holds for all ϕ ∈ H 1 (Rn ). With V = H 1 (Rn ), a(u, ϕ) = ∫ ⟨∇u(x), ∇ϕ(x)⟩dx and L(ϕ) = ∫ f (x)ϕ(x)dx, we have an example of the following setup.
Definition 2.3.1 (Abstract variational problem). An abstract variational problem
is a linear equation for u ∈ V of the form
a(u, ϕ) = L(ϕ), for all ϕ ∈ V.
Here V is some linear space, a : V × V → R is a bilinear functional and L : V → R
is a linear functional.
We now turn to our main interest in this section: boundary value problems. The following two examples show how to obtain variational formulations of the Dirichlet and Neumann boundary value problems on a domain D by using the Sobolev spaces H 1 (D), H01 (D) and H̃ 1 (D).

Example 2.3.2 (Dirichlet problem I). We want to solve the Dirichlet BVP
    −∆u(x) = f (x),    x ∈ D,
    u(x) = 0,          x ∈ ∂D.        (2.3)
We prefer zero boundary conditions for the Dirichlet problem for the variational
formulation below. However, for a more general boundary condition u = g, we
may first reduce this to the case g = 0 by using that the equation is linear, as in
Section 1.4. The minus sign is a technicality: It gives a plus sign in the variational
formulation.
In the setup from Definition 2.3.1, we use the function space V = H01 (D) for the
Dirichlet problem. Given a solution u ∈ C 2 (D) to (2.3), Green’s first identity gives
    ∫_D ⟨∇u, ∇ϕ⟩dx = ∫_{∂D} (∂ν u)ϕ dS − ∫_D (∆u)ϕ dx = ∫_D f ϕ dx,

for ϕ ∈ H01 (D), since ϕ = 0 on ∂D. Therefore the variational formulation of (2.3)
should be

    ∫_D ⟨∇u, ∇ϕ⟩dx = ∫_D f ϕ dx,  for all ϕ ∈ H01 (D),    (2.4)

that is a(u, ϕ) = ∫_D ⟨∇u, ∇ϕ⟩dx and L(ϕ) = ∫_D f ϕ dx.
Conversely, we need to verify that a solution to (2.4) is a solution to the original BVP (2.3). Here a technical problem appears: a solution u to (2.4) belongs for all we know only to the Sobolev space H01 (D). This means in particular that u = 0 at ∂D, but all we know is that its weak first derivatives are L2 functions, so it is not clear whether u ∈ C 2 (D) and whether Green's identity may be used. (Assuming enough regularity of f and ∂D, this may however be proved.) But assuming that we have a solution u ∈ C 2 (D) to (2.4), we obtain from Green's first identity that

    ∫_D ∆u ϕ dx = − ∫_D ⟨∇u, ∇ϕ⟩dx = − ∫_D f ϕ dx

for all ϕ ∈ H01 (D). This shows that ∆u = −f , assuming f to be a continuous


function. To see this, we note that Lemma 2.1.4 applies with f and g replaced by
∆u + f and ϕ, since we may choose ϕ as an approximation of δa at each a ∈ D.
Modulo a regularity issue for solutions u we have shown that the Dirichlet prob-
lem (2.3) is equivalent to the variational problem (2.4) in H01 (D). After developing
the abstract existence theory we shall return below to the Dirichlet problem and
prove that indeed the variational problem (2.4) has a unique solution.
Quite similarly, but offering several surprising novelties, we can solve the Neu-
mann problem through a variational formulation.
Example 2.3.3 (Neumann problem I). We want to solve the Neumann BVP
    −∆u(x) = f (x),      x ∈ D,
    ∂ν u(x) = g(x),      x ∈ ∂D.        (2.5)

As for the Dirichlet problem, we can easily reduce to the case g = 0 using the
linearity of the equation. However, unlike for the Dirichlet problem this is not really
necessary in order to write a variational formulation of the Neumann problem. So
we keep g.
An important observation is that (2.5) does not have a unique solution. Indeed, we can add any constant to u and obtain a new solution. In general, a solution u also need not exist: from Green's identity we have the necessary condition

    − ∫_D f dx = ∫_D ∆u dx = ∫_{∂D} ∂ν u dS = ∫_{∂D} g dS.

(Physics: Interior sources = flow out through the boundary.) However, assuming this constraint on f and g, it turns out that a solution exists, and is unique modulo constants on a connected domain D. However, to start with we ignore these problems, which turn out to be minor ones.
In the setup from Definition 2.3.1, we use the function space V = H 1 (D) for the Neumann problem. (In order to obtain a unique solution, we later replace H 1 (D) by a space like H̃ 1 (D) to eliminate constants.) Given a solution u ∈ C 2 (D) to (2.5), Green's first identity gives

    ∫_D ⟨∇u, ∇ϕ⟩dx = ∫_{∂D} (∂ν u)ϕ dS − ∫_D (∆u)ϕ dx,

for ϕ ∈ H 1 (D). Therefore the variational formulation of (2.5) should be

    ∫_D ⟨∇u, ∇ϕ⟩dx = ∫_{∂D} gϕ dS + ∫_D f ϕ dx,  for all ϕ ∈ H 1 (D),    (2.6)

that is a(u, ϕ) = ∫_D ⟨∇u, ∇ϕ⟩dx and L(ϕ) = ∫_{∂D} gϕ dS + ∫_D f ϕ dx.
Carefully note that the function space H 1 (D) for the Neumann problem does not at all involve the Neumann boundary condition. Nevertheless, we can now verify that a solution to (2.6) is a solution to the original BVP (2.5); in particular it will satisfy the Neumann boundary condition. As for the Dirichlet problem, we avoid the regularity issue and assume that we have a solution u ∈ H 1 (D) to (2.6) which in fact is twice classically differentiable: u ∈ C 2 (D). Green's first identity and (2.6) show that

    ∫_D ∆u ϕ dx = ∫_{∂D} (∂ν u)ϕ dS − ∫_{∂D} gϕ dS − ∫_D f ϕ dx    (2.7)

for all ϕ ∈ H 1 (D). Surprisingly, we can recover both the PDE and the boundary condition in (2.5) from this equation. We proceed in two steps.

Figure 2.4: Delta convergence (1) in the interior, and (2) at the boundary.

(1) We have in particular that

    ∫_D (∆u + f )ϕ dx = 0

for all ϕ ∈ H 1 (D) such that ϕ = 0 on ∂D, that is for ϕ ∈ H01 (D). We want to apply Lemma 2.1.4 and conclude that h := ∆u + f = 0. At a given point a ∈ D, choose ϕ to be an approximation to δa . We get

    |h(a)| = |∫_D h(x)ϕ(x)dx − h(a)| ≤ sup_{|x−a|<r} |h(x) − h(a)|.

Letting r → 0 and assuming that h is continuous, this shows that h(a) = 0. Since
a was arbitrary, we have ∆u + f = 0 in all D.
(2) Returning to (2.7): since ∆u(x) + f (x) = 0 identically in D by (1), we have
    ∫_{∂D} (∂ν u − g)ϕ dS = 0

for all ϕ ∈ H 1 (D). We now want to conclude from an argument as in Lemma 2.1.4, but on the boundary ∂D rather than in the domain D, that h̃(a) := ∂ν u(a) − g(a) = 0 at each a ∈ ∂D. We design the test function ϕ, given a point a ∈ ∂D.

• First define ϕ on ∂D. Let ϕ(x) = 0 when |x − a| > r and let ϕ be smooth and positive on ∂D with ∫_{∂D} ϕ dS = 1.

• Then extend ϕ into D to a smooth function, so in particular ϕ ∈ H 1 (D).



As in Lemma 2.1.4, we conclude that

    |∫_{∂D} h̃ϕ dS − h̃(a)| → 0

as r → 0. This shows that h̃ = ∂ν u − g = 0 on all ∂D.


Modulo a regularity issue for solutions u we have shown that the Neumann prob-
lem (2.5) is equivalent to the variational problem (2.6) in H 1 (D). After developing
the abstract existence theory we shall return below to the Neumann problem and
prove that after minor modifications we can obtain a Neumann variational formula-
tion with a unique solution.

A take away point from these examples is that the Dirichlet boundary condition
is an essential BC whereas the Neumann boundary condition is a natural BC in the
following sense.

Definition 2.3.4 (Essential/natural BC). Consider a variational formulation (V, a, L)


of a boundary value problem. We call a boundary condition essential if it appears
in the definition of the function space V . Otherwise it is called natural.

The main goal in this section is to prove the following existence and uniqueness result for solutions to a variational problem. The two main hypotheses are that the function space V should be a Hilbert space, and that the bilinear functional a should be coercive.

Theorem 2.3.5 (Lax–Milgram). Consider the abstract variational problem (V, a, L)


from Definition 2.3.1. Assume the following.

• The linear space V is a Hilbert space.

• The bilinear functional a is bounded, that is |a(u, ϕ)| ≤ C1 ∥u∥∥ϕ∥, and coercive, that is a(u, u) ≥ c1 ∥u∥². (C1 < ∞ and c1 > 0 are some constants.)

• The linear functional L is bounded, that is |L(ϕ)| ≤ C2 ∥ϕ∥. (C2 < ∞ is some constant.)

Then there exists a unique solution u ∈ V to the variational problem.

We shall prove this under the further assumption that a is symmetric:

    a(u, ϕ) = a(ϕ, u).

This assumption allows us to reformulate the variational problem as a minimization problem. The finite, even one-, dimensional special case is the linear equation

    ax + b = 0,

writing −L = b, which gives the location of the minimum of the second order polynomial ½ax² + bx + c. The Hilbert space generalization is

Lemma 2.3.6. Consider the abstract variational problem (V, a, L) from Definition 2.3.1, and assume that a is symmetric. The solutions u ∈ V to this variational problem are precisely the minima of the functional F : V → R given by

    F (u) := ½a(u, u) − L(u).

Proof. Assume that u ∈ V is a minimizer. Let ϕ ∈ V . Then the one-variable function f (t) := F (u + tϕ) has a minimum at t = 0. Using bilinearity, symmetry and linearity, we have

    f (t) = ½t²a(ϕ, ϕ) + ½a(u, u) + ta(u, ϕ) − L(u) − tL(ϕ).

Differentiation shows that a(u, ϕ) = L(ϕ).
Conversely, assume that a(u, ϕ) = L(ϕ) for all ϕ ∈ V . Then

    F (u + ϕ) = ½a(u, u) + a(u, ϕ) + ½a(ϕ, ϕ) − L(u) − L(ϕ) = F (u) + ½a(ϕ, ϕ) ≥ F (u).

Since ϕ was arbitrary, we have shown that u is a minimum for F .
Proof of Theorem 2.3.5 for symmetric a. By Lemma 2.3.6, it suffices to prove that

    F (u) = ½a(u, u) − L(u)

has a unique minimum u ∈ V . Uniqueness follows from the coercivity of a, since the proof of Lemma 2.3.6 shows that F (u + ϕ) > F (u) for any variational solution u and ϕ ≠ 0. For existence, let m := inf_{u∈V} F (u).
(1) We first show that F is bounded from below, that is m > −∞. Using the assumed estimates, we have

    F (u) = ½a(u, u) − L(u) ≥ (c1 /2)∥u∥² − C2 ∥u∥.

Clearly this second order polynomial in the real variable x = ∥u∥ is bounded from below since c1 > 0.
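Explicitly (a step we add here for the record), completing the square gives

\[
F(u) \geq \tfrac{c_1}{2}\|u\|^2 - C_2\|u\|
= \tfrac{c_1}{2}\Big(\|u\| - \tfrac{C_2}{c_1}\Big)^2 - \tfrac{C_2^2}{2c_1}
\geq -\tfrac{C_2^2}{2c_1},
\]

so m ≥ −C2²/(2c1 ) > −∞.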
(2) We need to show that the infimum m is attained at some u ∈ V . Let u1 , u2 , . . .
be a sequence in V such that limk→∞ F (uk ) = m. It suffices to show that this in fact
is a Cauchy sequence in V . Indeed, the completeness of the Hilbert space then shows
the existence of a limit u ∈ V , and continuity of F yields F (u) = limk→∞ F (uk ) = m.
Let ϵ > 0. Choose i, j large so that
F (ui ) ≤ m + ϵ and F (uj ) ≤ m + ϵ.
Consider F along the line in V through ui and uj :

    f (t) := F (ui + t(uj − ui ))
           = ½a(uj − ui , uj − ui )t² + (a(ui , uj − ui ) − L(uj − ui ))t + (½a(ui , ui ) − L(ui ))
           =: a1 t² + a2 t + a3 .

This is a second order polynomial with f (0) ≤ m + ϵ, f (1) ≤ m + ϵ and f (t) ≥ m for all t ∈ R, which suggests smallness of a1 , that is closeness of ui and uj as wanted. The following identity for f is readily verified:

    f (0) + f (1) − 2f (1/2) = a1 /2.

We obtain

    ∥ui − uj ∥² ≤ c1 ^(−1) 2a1 = (4/c1 )(f (0) + f (1) − 2f (1/2)) ≤ (4/c1 )(m + ϵ + m + ϵ − 2m) = 8ϵ/c1 → 0

as ϵ → 0. This proves that u1 , u2 , . . . is a Cauchy sequence and completes the proof.

Example 2.3.7 (Dirichlet problem II). Consider the variational formulation (2.4) for the Dirichlet problem. We check the hypotheses in Lax–Milgram's theorem.
The function space V = H01 (D) is a Hilbert space by Proposition 2.2.5.
The functional a(u, ϕ) = ∫_D ⟨∇u, ∇ϕ⟩dx is bilinear. It is bounded: this is the Cauchy–Schwarz inequality. It is coercive: this is Poincaré's inequality and Proposition 2.2.5.
The functional L(ϕ) = ∫_D f ϕ dx is bounded, assuming that f ∈ L2 (D): again this is the Cauchy–Schwarz inequality.
Therefore Lax–Milgram's theorem shows that the Dirichlet problem has a unique solution.

Example 2.3.8 (Neumann problem II). Consider the variational formulation (2.6) for the Neumann problem. We check the hypotheses in Lax–Milgram's theorem.
The functional L(ϕ) = ∫_{∂D} gϕ dS + ∫_D f ϕ dx is bounded, assuming that g ∈ L2 (∂D) and f ∈ L2 (D): this is the Cauchy–Schwarz inequality and Proposition 2.2.3.
The remaining hypotheses are verified as for the Dirichlet problem, except for the coercivity of a, which fails on H 1 (D). Indeed, if u is a constant function, then a(u, u) = 0.
To uniquely solve the Neumann problem, we need to replace H 1 (D) by H̃ 1 (D). We here only consider the case g = 0 (or else we need to modify H̃ 1 (D) by a boundary integral term). So consider the variational problem

    ∫_D ⟨∇u, ∇ϕ⟩dx = ∫_D f ϕ dx,  for all ϕ ∈ H̃ 1 (D),    (2.8)

for u ∈ H̃ 1 (D).
The same argument as before shows that any solution u ∈ C 2 (D) to

    −∆u(x) = f (x),      x ∈ D,
    ∂ν u(x) = 0,         x ∈ ∂D,        (2.9)

satisfies (2.8). The converse, however, presents some novelties. As before, for a variational solution u ∈ C 2 (D) to (2.8), Green's first identity shows that

    ∫_D ∆u ϕ dx = ∫_{∂D} (∂ν u)ϕ dS − ∫_D f ϕ dx    (2.10)

for all ϕ ∈ H̃ 1 (D).
(1) We now only have

    ∫_D (∆u + f )ϕ dx = 0

for all ϕ ∈ H̃ 1 (D) such that ϕ = 0 on ∂D. We want to apply Lemma 2.1.4 and conclude that h := ∆u + f = 0. We cannot as before use a Dirac delta approximation for ϕ since we must have ∫_D ϕ dx = 0. Instead we use differences of two such deltas. Fix two points a, b ∈ D, and let ϕ = ga − gb , where ga and gb are approximations to δa and δb respectively. Since ∫ ϕ dx = ∫ ga dx − ∫ gb dx = 1 − 1 = 0, we get

    |h(a) − h(b)| = |∫_D h(x)ϕ(x)dx − (h(a) − h(b))|
                  ≤ |∫_D h(x)ga (x)dx − h(a)| + |∫_D h(x)gb (x)dx − h(b)|
                  ≤ sup_{|x−a|<r} |h(x) − h(a)| + sup_{|x−b|<r} |h(x) − h(b)|.

Letting r → 0 and assuming that h is continuous, we have shown that h(a) = h(b) for any two points a, b ∈ D, so ∆u(x) + f (x) = c for some constant c. Assuming that ∫_D f dx = 0, we have in fact that

    ∫_D c dx = ∫_D ∆u dx = ∫_{∂D} ∂ν u dS = 0,

so c = 0.
(2) Returning to (2.10): since ∆u(x) + f (x) = 0 identically in D by (1), we have
    ∫_{∂D} (∂ν u)ϕ dS = 0

for all ϕ ∈ H̃ 1 (D). We design the test function ϕ, given a point a ∈ ∂D.

• First define ϕ on ∂D. Let ϕ(x) = 0 when |x − a| > r and let ϕ be smooth and positive on ∂D with ∫_{∂D} ϕ dS = 1.

• Then extend ϕ into D to a smooth function, not to any smooth function in D as before, but sufficiently negative in the interior of D so that ∫_D ϕ dx = 0.

As in Lemma 2.1.4, we conclude that

    |∫_{∂D} (∂ν u)ϕ dS − ∂ν u(a)| → 0

as r → 0. This shows that ∂ν u = 0 on all of ∂D.


Modulo the regularity issue for solutions u we have shown that the Neumann problem (2.9) is equivalent to the variational problem (2.8) in H̃ 1 (D). Since a is coercive on H̃ 1 (D) by Proposition 2.2.5, Lax–Milgram's theorem shows that (2.8) has a unique solution.

Example 2.3.9 (Poincaré failure on Rn ). We have seen in connection with the Neumann problem on a bounded domain D that

    a(u, ϕ) = ∫_D ⟨∇u, ∇ϕ⟩dx

fails to be coercive on all of H 1 (D). This causes a minor problem, which the Poincaré inequality in Proposition 2.2.5 fixes if we instead use the subspace H̃ 1 (D). Note that H̃ 1 (D) is a hyperplane in H 1 (D) since it is the orthogonal complement of the constant functions.
Now consider the bilinear functional a on H 1 (Rn ), that is we replace the bounded domain D by the unbounded full space Rn . Note that in this case H 1 (Rn ) contains no nonzero constant functions. We show nevertheless that the Poincaré inequality fails, and so the coercivity of a fails, on any closed subspace of H 1 (Rn ) which contains a function f ≠ 0 along with all its rescalings

    fk (x) = k^(−n/2) f (x/k),  x ∈ Rn .
Indeed, changing variables x = ky shows that

    ∫_{Rn} |fk (x)|² dx = ∫_{Rn} |f (y)|² dy,

and also using the chain rule shows

    ∫_{Rn} |∇fk (x)|² dx = k^(−2) ∫_{Rn} |∇f (y)|² dy.

Therefore, clearly there is no constant C < ∞ so that

    ∫_{Rn} |fk (x)|² dx ≤ C ∫_{Rn} |∇fk (x)|² dx

for all k = 1, 2, . . ..

2.4 Discretization and FEM


The finite element method (FEM) uses a variational formulation as in Definition 2.3.1
to numerically solve a boundary value problem. The idea is to discretize the problem
by choosing
• a finite dimensional subspace VN of our Hilbert space V , and

• a basis ϕ1 , ϕ2 , . . . , ϕN for VN .
In the discretized finite dimensional problem, we consider the solution u and the
test functions ϕ to belong to VN , and let
u = x1 ϕ1 + . . . + xN ϕN
and ϕ = ϕi , i = 1, . . . , N . We obtain a finite dimensional linear system of equations

    Σ_{j=1}^{N} Ai,j xj = bi ,    (2.11)

where Ai,j = a(ϕi , ϕj ) and bi = L(ϕi ). By tradition, the N × N matrix A = (Ai,j )_{i,j=1}^{N} is called the stiffness matrix, and the R^N -vector b = (bi )_{i=1}^{N} on the right hand side is called the load vector.

Figure 2.5: A coarse triangulation of a domain D, using 145 triangles. The number of
interior nodes, used for the Dirichlet problem, is N = 56. The total number of nodes,
including boundary nodes, used for the Neumann problem, is N = 91. Can you visualize
ϕi at one of the interior nodes i? At one of the boundary nodes i?

Definition 2.4.1 (Abstract FEM). The abstract finite element method is to numerically solve the abstract variational problem (V, a, L) from Definition 2.3.1, using a finite dimensional subspace VN ⊂ V with basis (ϕi )_{i=1}^{N} , by solving (2.11) for (xi )_{i=1}^{N} , giving the numerical solution u = x1 ϕ1 + · · · + xN ϕN .

Given a domain D, there are many possible ways to choose VN and ϕi to solve the Dirichlet and Neumann problems. We here only consider the most standard setup, which uses a given triangulation of D ⊂ R2 . (For a 3D problem one uses tetrahedra instead of triangles.) This means that we approximate D by a union of triangles. To construct such a triangulation is a non-trivial geometric problem, which we do not consider the details of here. To describe the algorithm, we assume given three matrices vertices, triangles and boundary as follows.

• vertices is a matrix with 2 rows, with entries in R. The columns enumerate the vertices (nodes), and in the kth column stand the (x, y) coordinates of vertex k.

• triangles is a matrix with 3 rows, with entries in Z. The columns enumerate the triangles (elements), and in the kth column stand the three vertices (integers specifying the columns in vertices) of triangle k in the triangulation. Standard is to enumerate the vertices of a triangle counter clockwise.

• boundary is a matrix with 2 rows, with entries in Z. The columns enumerate the boundary intervals (boundary elements), that is the triangle edges hitting ∂D. In the kth column stand the two vertices (integers specifying the columns in vertices) of boundary interval k. Standard is to enumerate the vertices counter clockwise along ∂D.
Given this geometric information, the algorithm for solving the two main boundary value problems is as follows.
• The basis functions ϕi are constructed as follows. For each vertex in the triangulation, indexed by i, we define a basis function ϕi which equals 1 at this vertex, equals 0 at all other vertices and is linear, that is of the form ax + by + c, on each triangle.
For the Neumann problem we use V = H 1 (D), and therefore use the basis functions for all vertices in the triangulation, including those at the boundary, that is also those in boundary.
For the Dirichlet problem we use V = H01 (D), and therefore use the basis functions only for the interior vertices, that is those not listed in boundary.
• To compute the stiffness matrix A, we need to compute its elements

    a(ϕi , ϕj ) = Σ_T ∫_T ⟨∇ϕi , ∇ϕj ⟩dx,

where the sum is over all the triangles which have both i and j among their vertices (or possibly i = j). On each triangle T , the two gradients are constant vectors, so the integral is simply

    ⟨∇ϕi , ∇ϕj ⟩ area(T ),    (2.12)

the evaluation of which is a nice linear algebra exercise. See Example D.0.1.
Having a representation of the triangulation as above, one may proceed as follows to fill in the matrix A. Initialize A = 0. Then, rather than iterating over pairs of vertices i, j, we iterate over triangles T in triangles. For each of the 3 × 3 = 9 pairs of vertices i, j of T , found in triangles, we add (2.12) to Aij (a Python sketch of this assembly loop is given after this list).
This gives the stiffness matrix for the Neumann problem. To obtain the smaller stiffness matrix for the Dirichlet problem, we remove the rows and columns corresponding to boundary nodes, using boundary.
• To compute the load vector bi , assuming that boundary data g = 0, we need to compute, using suitable numerical integration,

    L(ϕi ) = Σ_T ∫_T f (x)ϕi (x)dx,

where the sum is over all triangles with one vertex at i. For the Dirichlet problem, this need not be computed for boundary vertices i. For the Neumann problem, if g ≠ 0, we need to add

    Σ_I ∫_I g(x)ϕi (x)ds,

where the sum is over the boundary intervals I in boundary.



• We now solve the linear system (2.11). For the Dirichlet problem this is always uniquely solvable, that is the matrix A is invertible. For the Neumann problem, A is not invertible. However, the matlab command A\b will yield the least squares solution and one solution x. The general solution is

    x + t (1 1 1 · · · 1)^T ,  t ∈ R.

Moreover, the condition

    ∫_D f dx + ∫_{∂D} g ds = 0

for existence of solutions means that Σ_i bi = 0.
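To make the assembly loop concrete, here is a minimal Python sketch of the Dirichlet solver described above (our own illustration, not from the text). Dense matrices, 0-based vertex indices, and a one-point quadrature rule for the load vector are simplifying assumptions; vertices, triangles and boundary are NumPy arrays laid out as in the list above, and f is a callable taking a point (x, y).

import numpy as np

def local_stiffness(p):
    # p is a 2x3 array whose columns are the vertex coordinates of a triangle.
    # The hat function of vertex k is c_k + a_k*x + b_k*y; its coefficients are
    # read off from the inverse of M, and its gradient (a_k, b_k) is constant
    # on the triangle, so the integral (2.12) is <grad_i, grad_j> * area.
    M = np.vstack([np.ones(3), p])        # rows: 1, x, y
    area = 0.5 * abs(np.linalg.det(M))
    G = np.linalg.inv(M)[:, 1:]           # row k = gradient of hat function k
    return area, area * (G @ G.T)

def solve_dirichlet(vertices, triangles, boundary, f):
    N = vertices.shape[1]
    A = np.zeros((N, N))
    b = np.zeros(N)
    for tri in triangles.T:               # iterate over elements, not node pairs
        p = vertices[:, tri]
        area, Aloc = local_stiffness(p)
        A[np.ix_(tri, tri)] += Aloc
        b[tri] += f(p.mean(axis=1)) * area / 3.0   # one-point quadrature for the load
    interior = np.setdiff1d(np.arange(N), np.unique(boundary))
    x = np.zeros(N)                       # boundary values stay 0 (essential BC)
    x[interior] = np.linalg.solve(A[np.ix_(interior, interior)], b[interior])
    return x

For the Neumann problem one would instead keep all nodes, add the boundary integrals of gϕi to b, and solve the singular system in the least squares sense as described above.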

2.5 Exercises

1. Let

       f (x) = x²,        x < 1,
       f (x) = x² + 2x,   1 ≤ x < 2,
       f (x) = 2x,        x ≥ 2.

   At which x does the classical derivative f ′ (x) exist? Compute the weak derivative of f . Sketch a sequence of functions which converge weakly to f ′ .
2. Let f (x, y) = 1 when |x| < 1, |y| < 1, and f (x, y) = 0 else. Compute the weak partial derivative ∂x f . Sketch a sequence of functions which converge weakly to ∂x f .

3. Let Fk = kδ1/k −kδ−1/k . Find the weak limit of the distributions Fk as k → ∞.


Interpret the result in terms of dipoles.

4. Let g : R → R be any integrable function, possibly non-smooth and discontinuous. Show that u(t, x) = g(x − ct) solves the wave equation ∂t²u = c²∆u in the weak sense.

5. Assume that u(x, y) is a harmonic function in the upper half disk x² + y² < 1, y > 0, which is continuous for x² + y² ≤ 1, y ≥ 0, with homogeneous Dirichlet boundary values u(x, 0) = 0. Extend u to an odd function with respect to y in the whole disk x² + y² < 1 by letting

       v(x, y) = u(x, y),      y > 0,
       v(x, y) = −u(x, −y),    y < 0.

   Show that ∆v = 0 in distributional sense in the disk. (With more theory one can deduce from this that v in fact must be C ∞ and harmonic in the whole disk.)

6. For which exponents α ∈ R does f (x) = |x|α belong to H 1 (D), if (a) D is the
interval D = (−1, 1) ⊂ R? (b) D is the unit disk x2 + y 2 < 1 in R2 ? (c) D is
the unit ball x2 + y 2 + z 2 < 1 in R3 ?

7. Let Γ be a curve that cuts a bounded domain D into two parts D1 and D2 .
Let f be a function which is smooth in D1 and in D2 and continuous in all D.
Show that f ∈ H 1 (D). Hint: Gauss theorem.
8. Let D be your favorite domain. For each ϵ > 0, construct two smooth functions
f1 and f2 so that f1 |∂D = 0 and f2 |∂D = 7, but ∥f1 − f2 ∥L2 (D) < ϵ.
9. Let γ > 0. Give a variational formulation of the Robin BVP

       −∆u(x) = f (x),             x ∈ D,
       ∂ν u(x) + γu(x) = g(x),     x ∈ ∂D.

What are V , a and L? Show equivalence of the BVP and the variational
problem, modulo regularity issues, and that the Lax-Milgram hypothesis is
satisfied.
10. Give a variational formulation of the mixed BVP

       −∆u(x) = f (x),    x ∈ D,
       ∂ν u(x) = 0,       x ∈ ΓN ,
       u(x) = 0,          x ∈ ΓD .

Here we assume that ∂D is the disjoint union of ΓN and ΓD , and that ΓD ̸= ∅.


What are V , a and L? Show equivalence of the BVP and the variational
problem, modulo regularity issues, and that the Lax-Milgram hypothesis is
satisfied.
11. Give a variational formulation of the Dirichlet BVP in an inhomogeneous and anisotropic material:

       −div(B(x)∇u(x)) = f (x),    x ∈ D,
       u(x) = 0,                   x ∈ ∂D.

Here D is a two-dimensional domain, and B is a smooth function which to


each x ∈ D assigns a 2 × 2 matrix. What are V , a and L? What condition on
B is needed for the Lax-Milgram hypothesis to be satisfied?
12. Consider ∆u = −4 on the unit square (0, 1)2 with homogeneous Dirichlet
BCs. Use FEM to approximately find u(1/2, 1/2). Partition the square into
four triangles, with the two diagonals.
13. Consider FEM on the one-dimensional interval D = (1, N + 1) with Dirichlet
BCs, and the basis functions

ϕj (x) = max(0, min(1 − j + x, 1 + j − x)), j = 1, 2, . . . , N.

Calculate the stiffness matrix. What special structure does it have?


14. Let fn (x) = sin²(nx), n = 1, 2, 3, . . .. Show that fn converge weakly on R as n → ∞, and determine the limit distribution.
Chapter 3

Initial value problems on Rn

The goal in this chapter is to find explicit formulas for the solution to the IVPs for
the heat and wave equations on all R, on all R2 and on all R3 . Using the Fourier
transform, this is possible when the wave/heat propagates freely in all space, and by
analyzing the obtained formulas we learn much about wave and heat propagation.
Using the Fourier transform, it is important to note that the Fourier transforms of many important functions are not functions, but more general distributions. Indeed, the basic Riemann function appearing in the solution of the wave equation is not a function but a spherical “Dirac delta wall”. So we start by learning how to transform distributions.

Recommended reading:

• Strauss 12.3.

• Folland's book from the Fourier analysis course (Chapter 9 in particular).

• Strauss 2.1, 2.4, 9.2, 12.4.

• Strauss 2.2, 2.5, 9.1.

Study questions:
How does Strauss find the solution formulas for the heat and wave
equations without using the Fourier transform? How do we use the
chain rule to find the d’Alembert formula for the one-dimensional
wave IVP?

3.1 Tempered distributions


In this section, we study the Fourier transform of distributions. For a function f ∈ L2 (Rn ), we recall that its Fourier transform is another function f̂ ∈ L2 (Rn ), which we define by specifying its values

    f̂(ξ) = ∫_{Rn} f (x)e^(−i⟨ξ,x⟩) dx

at each point/frequency ξ ∈ Rn . We refer to a course on Fourier analysis for more


background on the properties of Fourier transforms of functions.
For a distribution F , we of course do not in general expect its Fourier transform F̂ to be a function, but a distribution. Therefore we do not set out to define its values F̂ (ξ) at individual points ξ ∈ Rn , but we rather need to define its values ⟨F̂ , ϕ⟩ on test functions ϕ. To find this distributional definition of F̂ , we consider first the case when F is given by a function f as in Proposition 2.1.5. In this case we have

    ⟨f̂, ϕ⟩ = ∫_{Rn} f̂(ξ)ϕ(ξ)dξ = ∫_{Rn} ( ∫_{Rn} f (x)e^(−i⟨ξ,x⟩) dx ) ϕ(ξ)dξ
           = ∫_{Rn} f (x) ( ∫_{Rn} ϕ(ξ)e^(−i⟨ξ,x⟩) dξ ) dx = ∫_{Rn} f (x)ϕ̂(x)dx = ⟨f, ϕ̂⟩.

We have changed the order of integration in this calculation, and for this to work we need to assume that ∫_{Rn} |f (x)|dx < ∞.
We see that the natural definition of F̂ is as the distribution with the value ⟨F̂ , ϕ⟩ at the test function ϕ being ⟨F, ϕ̂⟩. Here

    ϕ̂(ξ) = ∫_{Rn} ϕ(x)e^(−i⟨ξ,x⟩) dx.

However, we encounter a technical problem here: ϕ̂ is not a test function! To explain this, we recall from Definition 2.1.2 that test functions are C ∞ smooth and = 0 outside a bounded set. The latter condition on ϕ means that there is a radius R < ∞ so that

    ϕ̂(ξ) = ∫_{|x|<R} ϕ(x)e^(−i⟨ξ,x⟩) dx.

This allows us to differentiate under the integral sign any number of times, so clearly ϕ̂ is C ∞ smooth. The problem is rather that ϕ̂ is too smooth! One can show that ϕ̂ is an analytic function, and such functions can never be = 0 on an open set (unless identically zero), in particular not outside any bounded set.
The solution to this problem is to use a slightly different class of test functions. We make the following analogue of Definition 2.1.2.

Definition 3.1.1 (Tempered distributions and Schwartz test functions). Let S(Rn ) denote the set of all Schwartz test functions ϕ : Rn → R such that all partial derivatives of ϕ of all orders exist as continuous functions which are rapidly decaying. By a function being rapidly decaying we mean that it decays as O(|x|^(−N) ) for any N < ∞ as x → ∞.
Let S ′ (Rn ) denote the set of all tempered distributions, that is functionals F : S(Rn ) → R such that F is linear and defined on all S(Rn ).
The following lemma will allow to define Fourier transforms of tempered distri-
butions according to the above discussion.
Lemma 3.1.2 (Fourier transform of Schwartz functions). If ϕ ∈ S(Rn ), then ϕ̂ ∈
S(Rn ).

Proof. To ease notation, we only consider the one-dimensional case n = 1. Differentiation under the integral sign k times shows that

    ϕ̂^(k)(ξ) = ∫ (−ix)^k ϕ(x)e^(−iξx) dx.

This shows that ϕ̂^(k) exists as a continuous function, since x^k ϕ(x) is integrable. To further show that it is rapidly decaying we integrate by parts N times and obtain

    ϕ̂^(k)(ξ) = (iξ)^(−N) ∫ ((−ix)^k ϕ(x))^(N) e^(−iξx) dx.

Since the integrand here is integrable for any N , we see the stated rapid decay.
Definition 3.1.3 (Fourier transform of tempered distributions). Let F ∈ S ′ (Rn ) be a tempered distribution. We define its Fourier transform to be the tempered distribution F̂ defined by

    ⟨F̂ , ϕ⟩ = ⟨F, ϕ̂⟩,  ϕ ∈ S(Rn ).

We summarize that we have basic spaces

D(Rn ) ⊂ S(Rn ) ⊂ L2 (Rn ) ⊂ S ′ (Rn ) ⊂ D′ (Rn ).

Here L2 (Rn ) is a Hilbert space, whereas the other spaces are not even Banach spaces
as there is no norm ∥ · ∥ defined on them. The Fourier transform maps bijectively

F :S(Rn ) → S(Rn ),
F :L2 (Rn ) → L2 (Rn ),
F :S ′ (Rn ) → S ′ (Rn ),

but is not defined on all D′ (Rn ) since F only maps D(Rn ) → S(Rn ).

Figure 3.1: Examples of (a) distribution/test function, and (b) tempered distribu-
tion/Schwartz test function.

Example 3.1.4 (Dirac delta revisited). Consider the Dirac delta distribution δa from Definition 2.1.3. Clearly ϕ(a) is well defined not only for ϕ ∈ D(Rn ), but also for all ϕ ∈ S(Rn ). This means that δa in fact is a tempered distribution. We compute

    ⟨δ̂a , ϕ⟩ = ⟨δa , ϕ̂⟩ = ϕ̂(a) = ∫_{Rn} e^(−i⟨a,x⟩) ϕ(x)dx.

This means that the Fourier transform δ̂a in fact is a function:

δ̂a = e−i⟨a,ξ⟩ .

Note that we have replaced the dummy variable x by ξ. In particular we have

δ̂0 = 1.

This should be observed with some degree of fascination, since the interpretation is that the Dirac delta distribution contains equally much of all frequencies. We know from Fourier analysis that

    f̂(ξ) → 0,  ξ → ∞,

for any integrable function f , by the Riemann–Lebesgue lemma. This shows conversely that δa is not a function, since its Fourier transform does not decay as ξ → ∞.

Example 3.1.5 (Non-tempered distributions). Consider the real exponential function

    f (x) = e^x ,  x ∈ R.

If we try to define f as a distribution, we need the integral

    ⟨f, ϕ⟩ = ∫ f (x)ϕ(x)dx

to be well defined for all test functions. Clearly f ∈ D′ (R) since test functions ϕ ∈ D(R) are zero outside a bounded interval, so the integral converges trivially. However, f is not a tempered distribution since the rapid decay of Schwartz test functions ϕ ∈ S(R) is not enough to counteract the growth of f and make the integral convergent. A concrete example: ⟨f, ϕ⟩ is not defined/convergent for

    ϕ(x) = e^(−(x²+1)^(1/2)) ,

as the reader may verify.


For the non-tempered distribution ex ∈ D′ (R) we cannot define a Fourier trans-
form. It has a well defined Laplace transform, but that is a different story.

We next compute the Fourier transforms of the most important functions in PDE
theory.

Example 3.1.6 (Heat kernel=Gaussian). Fix a scale parameter t > 0 and consider the function

    e^(−|x|²/(4t)) ,  x ∈ Rn .

This so-called Gauss function plays a central role in mathematics. For us it provides a simple example of a Schwartz test function, and we see below that it is the most important function in studying the heat equation.
The standard computation of its Fourier transform is by completing the square in the exponent:

    ∫_{Rn} e^(−|x|²/(4t)−i⟨ξ,x⟩) dx = ∫_{Rn} e^(−⟨x+2itξ, x+2itξ⟩/(4t) − t|ξ|²) dx.

Here the constant last term in the exponent can be factored out, and with a complex change of variables y = (x + 2itξ)/(2√t) we get

    F{e^(−|x|²/(4t))} = e^(−t|ξ|²) ∫_{Rn} e^(−|y|²) (4t)^(n/2) dy = (4πt)^(n/2) e^(−t|ξ|²) .

In particular for t = 1/2, this Gaussian is an eigenfunction of the Fourier transform in that its Fourier transform is the factor (2π)^(n/2) times the function itself, changing the dummy variables ξ and x.
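This identity is easy to sanity-check numerically (our own Python sketch, for n = 1 and t = 1/2; the grid parameters are our choices): the discrete Fourier transform of samples of e^(−x²/2), rescaled to the continuum convention f̂(ξ) = ∫ f (x)e^(−iξx) dx, reproduces √(2π) e^(−ξ²/2).

import numpy as np

L, N = 40.0, 4096
x = np.linspace(-L/2, L/2, N, endpoint=False)
f = np.exp(-x**2 / 2)
xi = 2 * np.pi * np.fft.fftfreq(N, d=L/N)
# Riemann sum for the continuum transform; the phase factor accounts for x starting at -L/2.
fhat = np.fft.fft(f) * (L/N) * np.exp(-1j * xi * x[0])
exact = np.sqrt(2 * np.pi) * np.exp(-xi**2 / 2)
print(np.max(np.abs(fhat - exact)))   # tiny, up to rounding and truncation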

Example 3.1.7 (Laplace fundamental solution=Newton potential). We aim to calculate the Fourier transform of the Laplace fundamental solution, the Newton potential, via a method referred to as subordination. To explain this, we recall that the Fourier transform is linear, meaning that

    F{a1 f1 (x) + · · · + aN fN (x)} = a1 f̂1 (ξ) + · · · + aN f̂N (ξ).

Having in mind a Riemann sum approximating an integral, letting ai = 1/N and N → ∞, we obtain in the limit

    F{ ∫_a^b ft (x)dt } = ∫_a^b f̂t (ξ)dt.

Applying this to the Gauss functions from Example 3.1.6, after division by the constant (4πt)^(n/2), gives

    F{ ∫_a^b (4πt)^(−n/2) e^(−|x|²/(4t)) dt } = ∫_a^b e^(−t|ξ|²) dt.

Integration gives

    F{ (1/(4π^(n/2))) |x|^(2−n) ∫_{|x|²/(4b)}^{|x|²/(4a)} s^(n/2−2) e^(−s) ds } = (e^(−a|ξ|²) − e^(−b|ξ|²))/|ξ|² .

Letting n = 3, a = 0, b = ∞, and calculating the integral, this shows that in three dimensions

    F{ 1/(4π|x|) } = 1/|ξ|² .    (3.1)

This result does not require distributions since both sides are locally integrable functions, since ∫_{|x|<1} |x|^(−k) dx < ∞ when k < n, the dimension of space.

Figure 3.2: Subordination synthesis of 1/|x|.

The corresponding result in R2 however requires distributions, since ∫_{|ξ|<1} |ξ|^(−2) dξ = ∞ in the plane. Letting a = 0, we have

    F{ (1/(4π)) ∫_{|x|²/(4b)}^{∞} s^(−1) e^(−s) ds } = (1 − e^(−b|ξ|²))/|ξ|² .

We cannot, however, let b → ∞ since both sides diverge, even in the sense of distributions. The left hand side is nevertheless close to a logarithm, and we write

    ln(1/|x|²) = ∫_{|x|²/(4b)}^{1/(4b)} ds/s = ∫_{|x|²/(4b)}^{1/(4b)} (1 − e^(−s)) ds/s + ∫_{|x|²/(4b)}^{1/(4b)} e^(−s) ds/s
               = ∫_{|x|²/(4b)}^{1/(4b)} (1 − e^(−s)) ds/s + ∫_{|x|²/(4b)}^{∞} e^(−s) ds/s − ∫_{1/(4b)}^{∞} e^(−s) ds/s.

On the right hand side, the first term defines a function g(x) such that |g(x)| ≤ |1 − |x|²|/(4b). Therefore g → 0 weakly as b → ∞. The last term is a constant c(b), where c(b) → +∞ as b → ∞. We obtain

    F{ (1/(2π)) ln(1/|x|) } = lim_{b→∞} ( (1 − e^(−b|ξ|²))/|ξ|² − c(b)π^(−1) δ0 (ξ) ),    (3.2)

where the limit exists in the weak sense. The conclusion is that outside the origin, the Fourier transform of (1/(2π)) ln(1/|x|) equals the function 1/|ξ|², but at the origin this Fourier transform has a distributional singularity which best can be described as a radial analogue of δ0′ from Example 2.1.8.
It is instructive to note that in (3.2), we have F{ (1/(2π)) ln(1/|x|) } < 0 at ξ = 0, since c(b) → +∞, if we approximate the Dirac delta as in Example 2.1.11. See Figure X. This is indeed reasonable since we know from Fourier analysis that for functions

    ∫_{Rn} f (x)dx = f̂(0),

and in our case ∫_{R2} ln(1/|x|) dx = −∞.

The basic property of the Fourier transform is that derivatives and convolutions
acting on f (x) correspond to multiplication on fˆ(ξ). These operations are extended
to tempered distributions as follows.

• We have already seen in Definition 2.1.6 how the weak derivative ∂xk F of any distribution F is defined through “integration by parts”

    ⟨∂xk F, ϕ⟩ = −⟨F, ∂xk ϕ⟩.

For tempered distributions F ∈ S ′ (Rn ) one can show through a limiting argument that this holds for any ϕ ∈ S(Rn ) if it holds for all ϕ ∈ D(Rn ).

• Recall that the convolution of two functions f (x) and g(x) is the function f ∗ g(x) defined by the integral

    f ∗ g(x) = ∫_{Rn} f (x − y)g(y)dy,  x ∈ Rn .

The convolution product is commutative, meaning that f ∗ g = g ∗ f , as is seen by changing variables from y to z = x − y in the integral.

Consider the definition of the convolution F ∗ G of two distributions F and G. Expecting F ∗ G to be a distribution, we need to define its value ⟨F ∗ G, ϕ⟩ on test functions ϕ. In the case of functions, we calculate

    ⟨F ∗ G, ϕ⟩ = ∫ ( ∫ F (x − y)G(y)dy ) ϕ(x)dx
               = ∫ G(y) ( ∫ F (x − y)ϕ(x)dx ) dy
               = ∫ G(y) ( ∫ F (z)ϕ(y + z)dz ) dy = ⟨G, ϕF ⟩,    (3.3)

where ϕF is the test function ϕF (y) = ∫ F (z)ϕ(y + z)dz, which in the case of a distribution F is interpreted as ϕF (y) = ⟨F, ϕy ⟩, with ϕy being the translated test function ϕy (z) = ϕ(y + z).
The convolution product, both for functions and distributions, clearly requires enough decay of f and g (or F and G) at ∞ to be well defined, that is for the integrals to be convergent.

• The pointwise product

    f (x)g(x),  x ∈ Rn ,

of two functions f and g, although innocent looking at first, also requires restrictions on f and g to be well defined as a distribution. This is seen already for functions. For example, f (x) = e^(−x²)/|x|^(0.7) = g(x), x ∈ R, is seen to be integrable: ∫_R f (x)dx < ∞. But f (x)g(x) = e^(−2x²)/|x|^(1.4) is not locally integrable around 0 since 1.4 > 1. So we can define f , but not f g, as a distribution as in Proposition 2.1.5.
We define the product F g of a distribution F and a smooth function g to be the distribution whose value on a test function ϕ is

    ⟨F g, ϕ⟩ = ∫ (F g)ϕ dx = ∫ F (gϕ)dx = ⟨F, gϕ⟩,    (3.4)

where the two integrals are only written as a motivation for the definition. For this definition to work, g must be smooth so that gϕ is a test function and the right hand side ⟨F, gϕ⟩ is well defined. However, depending on how singular the distribution F is, one can sometimes allow more general functions g. For example, the product with a Dirac delta, δa g, is well defined whenever g is a continuous function.

In fact the following fundamental result, that convolutions correspond to pointwise products under the Fourier transform, shows that the problem of defining pointwise products in fact is the same as the problem of defining convolution products.

Proposition 3.1.8 (Transforms of convolutions). Under suitable decay conditions on F, G ∈ S ′ (Rn ), or equivalently under suitable regularity conditions on F̂ , Ĝ ∈ S ′ (Rn ), we have

    F{F ∗ G} = F̂ Ĝ.

We omit the technical details and refer to the discussion above for the meaning
of “suitable”.

Proof. Writing the values of distributions on test functions formally as integrals, we have

    ⟨F(F ∗ G), ϕ⟩ = ⟨F ∗ G, ϕ̂⟩ = ∫∫_{R2n} F (x)G(y) ( ∫_{Rn} ϕ(z)e^(−i⟨x+y,z⟩) dz ) dxdy
                  = ∫_{Rn} ( ∫_{Rn} F (x)e^(−i⟨x,z⟩) dx ) ( ∫_{Rn} G(y)e^(−i⟨y,z⟩) dy ) ϕ(z)dz = ⟨F̂ Ĝ, ϕ⟩,

for all test functions ϕ, by using the definition (3.3) of the convolution of distributions.

The following example clearly illustrates the conceptual power of distributions.

Example 3.1.9 (Derivative=distributional convolution). For simplicity, let n = 1. Consider the Dirac derivative F = δ0′ from Example 2.1.8. From (3.3) we obtain

    ⟨δ0′ ∗ G, ϕ⟩ = ⟨G(y), ⟨δ0′ , ϕy ⟩⟩ = ⟨G, −ϕ′ ⟩ = ⟨G′ , ϕ⟩

for all G ∈ S ′ (Rn ). On the other hand, the Fourier transform of δ0′ is

    ⟨δ̂0′ , ϕ⟩ = ⟨δ0′ , ϕ̂⟩ = −(ϕ̂)′ (0) = ∫ ixϕ(x)e^(i0x) dx = ⟨ix, ϕ⟩.

Changing the dummy variable from x to ξ, we conclude that F{δ0′ } = iξ. Therefore, we obtain from Proposition 3.1.8 the well known result

    F{G′ } = iξ Ĝ,  G ∈ S ′ (R),

that differentiation in x-space corresponds to multiplication by iξ in Fourier ξ-space. In Rn , the result is that

    F{∂xk G} = iξk Ĝ,  G ∈ S ′ (Rn ),    (3.5)

a result which we use to diagonalize differential operators and solve initial value problems for PDEs.
The interpretation of this result is that a differentiation is nothing but convolu-
tion by a distribution whose Fourier transform is a polynomial which grows at ∞.
The only difference with a usual convolution by a function F , is that F̂ (ξ) → 0,
rather than F̂ (ξ) → ∞, as ξ → ∞. So, convolution by a function may be seen as a
negative differentiation that smoothens the function G.
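Formula (3.5) can be tested in a few lines of Python (our own check; the Gaussian and the grid are our choices): multiplying the FFT by iξ and transforming back reproduces the exact derivative.

import numpy as np

L, N = 40.0, 4096
x = np.linspace(-L/2, L/2, N, endpoint=False)
u = np.exp(-x**2 / 2)
xi = 2 * np.pi * np.fft.fftfreq(N, d=L/N)
du = np.real(np.fft.ifft(1j * xi * np.fft.fft(u)))   # F{u'} = i*xi*F{u}
print(np.max(np.abs(du - (-x) * u)))                 # exact derivative is -x e^{-x^2/2}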

Figure 3.3: (a) A function G. (b) Its derivative G′ = δ0′ ∗ G. (c) The convolution e^(−|x|²) ∗ G(x).

3.2 The solution formulas


In this section we return to PDE. In Section 2.3 we solved BVPs on bounded domains
for the Laplace equation, which represents the stationary case in the heat and wave
equations where we have reached equilibrium.
We now consider the opposite case. The domain is all of Rn , and we want to
solve the initial value problems for our evolution equations. The basic strategy is
the following.

• View the PDE as a vector-valued ODE for a function ut depending on the


time-variable t, where ut is a function

ut (x) = u(t, x), x ∈ Rn ,

which describes the heat distribution, or shape of the wave, at time t. In other
words, we consider a function R → L2 (Rn ).

• We solve the vector-valued ODE by diagonalizing it with the Fourier transform,


similar to the finite dimensional case in Section 1.1.

• We obtain in Fourier ξ-space, at each fixed frequency ξ, a simple scalar ODE


which we solve.

• Inverse Fourier transformation gives a solution formula, in general in terms of convolutions by distributions, which provides the basic understanding of the initial value problem.

We first perform this calculation in the simpler case of the heat equation.

Example 3.2.1 (Heat IVP on Rn). Consider the following initial value problem for
the heat equation:

∂tu(t, x) = k∆u(t, x) + f(t, x),  t > 0, x ∈ Rn,
u(0, x) = g(x),  x ∈ Rn.

Here k > 0 is the constant from Fourier's law, g is the initial data describing the
distribution of heat at time t = 0, and f describes the sources, possibly time-
dependent, which are present.
For a fixed time t, write ut (x) = u(t, x) and ft (x) = f (t, x) and view ut and ft
as vectors in the Hilbert space L2 (Rn ). We obtain the vector valued ODE

∂t ut = k∆ut + ft ,

with the Laplace operator ∆ as a linear map on the Hilbert space L2 (Rn ).
To solve, we proceed similarly to the finite dimensional case in Example 1.1.2
and define ût = F{ut}. Applying, for each fixed t, the Fourier transform in the
x-variable, we obtain

∂t ût = −k|ξ|² ût + f̂t,

with ξ = (ξ1, . . . , ξn), since ∆ = Σk ∂²xk and each ∂xk transforms to iξk.

Now fix ξ ∈ Rn and regard instead t as variable. Consider the scalar ODE
∂t ût(ξ) = −k|ξ|² ût(ξ) + f̂t(ξ). Multiplication by the integrating factor e^{k|ξ|²t}, and
changing dummy variable from t to s, gives

∂s ( e^{k|ξ|²s} ûs ) = e^{k|ξ|²s} f̂s.

Integration over 0 < s < t, using the initial condition û0 = ĝ, and division by e^{k|ξ|²t}
yields

ût(ξ) = ĝ(ξ) e^{−k|ξ|²t} + ∫₀ᵗ f̂s(ξ) e^{−k|ξ|²(t−s)} ds.

To find the solution formula u(t, x) in x-space, we must find the inverse Fourier
transform of e^{−k|ξ|²t}. Replacing t by kt in Example 3.1.6, we have

F{ (4πkt)^{−n/2} e^{−|x|²/(4kt)} } = e^{−k|ξ|²t}.

Translating pointwise products of functions in ξ-space back to convolution products
of functions in x-space by Proposition 3.1.8, we obtain

u(t, x) = ∫_{Rn} g(x − y) e^{−|y|²/(4kt)} dy/(4πkt)^{n/2}
  + ∫₀ᵗ ∫_{Rn} f(s, x − y) e^{−|y|²/(4k(t−s))} dy ds/(4πk(t−s))^{n/2}.   (3.6)

The formula (3.6) completely describes how the solution u(t, x) depends on
the initial data g and the sources f, by convolution with the Gauss function, or heat
kernel. We discuss this further in Section 3.3.

Definition 3.2.2 (Heat kernel). For the heat equation with constant k > 0, we
define the heat kernel

Ht(x) = (4πkt)^{−n/2} e^{−|x|²/(4kt)}.
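
As a sanity check of (3.6), the convolution with the heat kernel can be evaluated
numerically. The following sketch (an added illustration, not part of the original text)
approximates u(t, x) = (Ht ∗ g)(x) in dimension n = 1 by the trapezoidal rule; the
cut-off R and grid size M are arbitrary choices.

```python
import numpy as np

def heat_kernel(y, t, k=1.0):
    # H_t(y) = (4*pi*k*t)^(-1/2) * exp(-y^2/(4*k*t)) in dimension n = 1
    return np.exp(-y**2/(4*k*t))/np.sqrt(4*np.pi*k*t)

def heat_solution(g, x, t, k=1.0, R=20.0, M=4001):
    # u(t,x) = (H_t * g)(x), truncated to [-R, R] and integrated numerically
    y = np.linspace(-R, R, M)
    return np.trapz(g(x - y)*heat_kernel(y, t, k), y)

g = lambda s: np.where(np.abs(s) < 1, 1.0, 0.0)  # a box of initial heat
print(heat_solution(g, x=0.0, t=0.1))  # smoothed-out value at the origin
```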

We next solve the wave equation on Rn in the same way, but with somewhat
more complicated calculations.

Example 3.2.3 (Wave IVP on Rn). Consider the following initial value problem
for the wave equation:

∂t²u(t, x) = c²∆u(t, x) + f(t, x),  t > 0, x ∈ Rn,
u(0, x) = g(x),  x ∈ Rn,
∂tu(0, x) = h(x),  x ∈ Rn.

Here c > 0 is a constant, which we shall see has the meaning of speed of propagation,
g is the initial shape of the wave, h is the initial speed of the wave, and f describes
exterior forces, possibly time-dependent, which are present.
As in our calculation for the heat equation, we obtain a vector valued ODE

∂t2 ut = c2 ∆ut + ft ,

Figure 3.4: The heat kernel Ht (x) for n = 1 and k = 1.

which is now of second order. To solve, we apply, for each fixed t, the Fourier
transform in the x-variable, and obtain
∂t2 ût = −c2 |ξ|2 ût + fˆt .
Now fix ξ ∈ Rn and regard instead t as variable. Consider the scalar ODE ∂t2 ût (ξ) =
−c2 |ξ|2 ût (ξ) + fˆt (ξ). From Example 1.1.3 with ω = c|ξ| and a(t) = fˆt (ξ) we have
the solution
ût(ξ) = ĝ(ξ) cos(c|ξ|t) + ĥ(ξ) sin(c|ξ|t)/(c|ξ|) + ∫₀ᵗ f̂s(ξ) sin(c|ξ|(t − s))/(c|ξ|) ds,

where the initial conditions û0 = ĝ and ∂t û0 = ĥ have been used.
To find the solution formula u(t, x) in x-space, we must find a function Rt such
that

F{Rt} = sin(c|ξ|t)/(c|ξ|),  ξ ∈ Rn.
It turns out however that in dimension n ≥ 3, this Fourier transform decays too slowly
as ξ → ∞ for such a function Rt to exist: we must use distributions! We calculate
these distributions below. Interestingly, these so-called Riemann functions (we use
this traditional terminology even though they are distributions) look very different
depending on the dimension n. However, in any dimension we have Rt = 0 when
|x| > ct, in the sense that ⟨Rt, ϕ⟩ = 0 for all test functions ϕ such that ϕ = 0 when
|x| ≤ ct. In particular, the convolution Rt ∗ f with any locally integrable function f
is well defined, and one can show that such convolutions Rt ∗ f are functions. At the
end of the day, we obtain after inverse Fourier transformation the solution formula
u(t, x) = ∂t(Rt ∗ g)(x) + (Rt ∗ h)(x) + ∫₀ᵗ (R_{t−s} ∗ fs)(x) ds.   (3.7)

We have here used that ∂t R̂t = cos(c|ξ|t).



Proposition 3.2.4 (Riemann functions). Let Rt ∈ S′(Rn) be the distribution with

F{Rt} = sin(c|ξ|t)/(c|ξ|),  ξ ∈ Rn.
If n = 1, then Rt is the function

Rt(x) = 1/(2c) for |x| < ct,  Rt(x) = 0 for |x| > ct.

If n = 2, then Rt is the function

Rt(x) = (2πc)⁻¹ (c²t² − |x|²)^{−1/2} for |x| < ct,  Rt(x) = 0 for |x| > ct.

If n = 3, then Rt is the distribution

⟨Rt, ϕ⟩ = (4πc²t)⁻¹ ∫_{|x|=ct} ϕ(x) dS(x),  ϕ ∈ S(R³).

Figure 3.5: (a) Initial speed ∂t u0 being a Dirac delta δ0 , and propagation speed c = 1,
giving a wave u1 at time t = 1 being the Riemann function R1 (x). (b) R1 (x) in
dimension n = 1. (c) Radial profile of R1 (x) in dimension n = 2. (d) Radial profile of
R1 (x) in dimension n = 3.

Proof. If n = 1, we calculate

∫_{−ct}^{ct} e^{−iξx} dx = 2 sin(ξct)/ξ = 2c · sin(c|ξ|t)/(c|ξ|).

If n = 2, we first choose an ON-basis for R² so that ξ = (|ξ|, 0). We calculate

∫∫_{x1²+x2² < c²t²} e^{−i|ξ|x1} dx1 dx2 / √(c²t² − x1² − x2²)
  = ∫_{−ct}^{ct} ( ∫_{−√(c²t²−x1²)}^{√(c²t²−x1²)} dx2/√(c²t² − x1² − x2²) ) e^{−i|ξ|x1} dx1
  = ∫_{−ct}^{ct} ( ∫_{−1}^{1} du/√(1 − u²) ) e^{−i|ξ|x1} dx1 = 2πc · sin(c|ξ|t)/(c|ξ|),

by the change of variables x2 = √(c²t² − x1²) u and the one-dimensional result.
If n = 3, by definition R̂t is the tempered distribution with value

⟨R̂t, ϕ⟩ = ⟨Rt, ϕ̂⟩ = (4πc²t)⁻¹ ∫_{|x|=ct} ϕ̂(x) dS(x) = ∫_{R³} ( (4πc²t)⁻¹ ∫_{|x|=ct} e^{−i⟨x,ξ⟩} dS(x) ) ϕ(ξ) dξ

on a test function ϕ, by changing the order of integration. Therefore R̂t is in fact
a function R̂t(ξ), defined by the formula inside the last bracket. Now choose an
ON-basis for R³ so that ξ = (0, 0, |ξ|). Using spherical coordinates, we calculate

∫∫_{|x|=ct} e^{−i|ξ|x3} dS(x) = ∫₀^{2π} ∫₀^{π} e^{−i|ξ|ct cos θ} (ct)² sin θ dθ dϕ
  = 2πc²t² ∫₀^{π} e^{−i|ξ|ct cos θ} sin θ dθ = 4πc²t · sin(c|ξ|t)/(c|ξ|).

3.3 Heat vs. Wave equation


In this section we analyze and compare the explicit solution formulas obtained in
Section 3.2 for the heat and wave equations. For simplicity we limit our discussion
to the case of no sources, that is f = 0. In this case the solution formulas for the
heat IVP read

ût(ξ) = ĝ(ξ) e^{−k|ξ|²t},   (3.8)

u(t, x) = ∫_{Rn} g(y) e^{−|y−x|²/(4kt)} dy/(4πkt)^{n/2},   (3.9)

in Fourier ξ-space and in x-space respectively.


By inserting the formulas from Proposition 3.2.4 in (3.7), we obtain solution
formulas for the wave IVP

ût(ξ) = ĝ(ξ) cos(c|ξ|t) + ĥ(ξ) sin(c|ξ|t)/(c|ξ|),   (3.10)

u(t, x) = (g(x − ct) + g(x + ct))/2 + (1/2c) ∫_{x−ct}^{x+ct} h(y) dy,  n = 1,   (3.11)

u(t, x) = ∂t ( (2πc)⁻¹ ∫∫_{|y−x|<ct} g(y) dy1 dy2 / √(c²t² − |y − x|²) )
  + (2πc)⁻¹ ∫∫_{|y−x|<ct} h(y) dy1 dy2 / √(c²t² − |y − x|²),  n = 2,   (3.12)

u(t, x) = ∂t ( (4πc²t)⁻¹ ∫∫_{|y−x|=ct} g(y) dS(y) ) + (4πc²t)⁻¹ ∫∫_{|y−x|=ct} h(y) dS(y),  n = 3.   (3.13)
Note that the solution formula for the wave equation in Fourier ξ-space (3.10),
as compared to the x-space formulas, is the same in any dimension. The solution
formula (3.11) in R is called d’Alembert’s formula and the solution formula (3.13)
in R3 is referred to as Kirchhoff ’s formula or Poisson’s formula.
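
d'Alembert's formula (3.11) is easy to evaluate directly. The following sketch (an
added illustration; the choice of g, h and the quadrature routine are arbitrary)
computes u(t, x) for a given initial shape and speed.

```python
import numpy as np
from scipy.integrate import quad

def dalembert(g, h, t, x, c=1.0):
    # u(t,x) = (g(x-ct) + g(x+ct))/2 + (1/(2c)) * int_{x-ct}^{x+ct} h(y) dy
    avg = 0.5*(g(x - c*t) + g(x + c*t))
    integral, _ = quad(h, x - c*t, x + c*t)
    return avg + integral/(2*c)

g = lambda s: np.exp(-s**2)  # initial shape of the wave
h = lambda s: 0.0            # initial speed
# At t = 2 the initial bump has split into two half-bumps at x = -2 and x = 2.
print(dalembert(g, h, t=2.0, x=2.0), dalembert(g, h, t=2.0, x=0.0))
```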
These formulas show in particular the following.
• Given a point (t, x) at position x ∈ Rn and time t > 0, we ask which part
of the initial data g(y) (and h(y) for the wave equation) the solution u(t, x)
depends on. The answer depends heavily on equation and dimension.

For the heat equation, (3.9) shows that g(y) is needed for all y ∈ Rn to compute
u(t, x). However, the contribution from y with |y − x| ≫ √(kt) is very small,
since the heat kernel is very small in this region.

For the wave equation, we see in all dimensions that at least we never need
any values g(y) and h(y) with |y − x| > ct to compute u(t, x). A remarkable
fact is that in R³, Kirchhoff's formula (3.13) shows that we also do not need
to know the initial data for |y − x| < ct, just for |y − x| = ct; more
precisely, we need the initial position g(y) in an infinitesimal neighbourhood of
|y − x| = ct in order to be able to compute the derivative ∂t in (3.13). Similarly,
in d'Alembert's formula we observe that the initial position g(y) is needed
only at the two points |y − x| = ct, but the same is not true for the
initial speed h(y).
• Conversely, we can ask at which points (t, x) in spacetime the solution u(t, x)
is non-zero, if we have as initial data a point mass g = αδa at a ∈ Rn (and
h = βδa in the case of the wave equation) for some constant α ∈ R (and
β ∈ R). Again the answer depends heavily on equation and dimension.

For the heat equation we see from (3.9) that

u(t, x) = α(4πkt)^{−n/2} e^{−|x−a|²/(4kt)} > 0

for any (t, x), although very small when |x − a| ≫ √(kt).

For the wave equation, we see from (3.7) that

u(t, x) = α ∂tRt(x − a) + β Rt(x − a).

Therefore u(t, x) = 0 when |x − a| > ct. Again we have the remarkable fact in
R3 that u(t, x) = 0 also when |x − a| < ct!

Figure 3.6: (a) Domain of dependence: wave= black, heat=red. (b) Domain of influ-
ence: wave= black, heat=red.

Definition 3.3.1 (Huygens principle). We say that Huygens principle holds in Rn


if the Riemann function Rt in Rn vanishes for |x| < ct.

We have seen that Huygens principle holds in dimension 3, but not in dimension
2, and that a half Huygens principle holds in dimension 1 in the sense that Rt does
not vanish for |x| < ct but is constant there so that ∂t Rt = 0 for |x| < ct. One
can show that Huygens principle holds precisely in odd dimension n ≥ 3. From the
above discussion it is clear that odd dimensional worlds are quiet ones, whereas even
dimensional worlds are noisy ones.
Another observation based on the solution formulas for the IVPs concerns the
direction of time. In this case the formulas in Fourier ξ-space are the most helpful.

• For the heat equation, we see from (3.8) that we cannot solve backward in
time. Indeed, for t < 0 the Gaussian e^{−k|ξ|²t} grows so fast at ∞ that it does
not even define a tempered distribution, and therefore the Fourier transform
breaks down. (Compare Example 3.1.5.) This means that heat flow is an
irreversible process: it is impossible to reconstruct ut for t < 0 from u0 in a
stable way. (See the numerical sketch below.)

A closely related fact is that heat evolution is a smoothing process: no matter
how irregular the distribution of heat is at t = 0, it will be C∞ smooth at any
t > 0. This is clear from (3.9), where we can differentiate under the integral
sign any number of times.

• For the wave equation, we see from (3.10) that the solution is valid for t < 0
as well. This means that wave propagation is a reversible process: we can
reconstruct in a stable way the wave ut at times t < 0 from the data g and h
at t = 0. Also, wave evolution is not a smoothing process. See Exercise 2.4.

Figure 3.7: Heat equation: (a) given function u at t = 0, (b) smoothed out heat at
t = 1, and (c) an attempt to numerically compute u at t = −1.
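
The instability of backward heat evolution is easy to demonstrate numerically in
Fourier ξ-space. In the sketch below (an added illustration; grid and noise level are
arbitrary choices), multiplying by e^{−k|ξ|²t} with t < 0 amplifies a tiny perturbation
of the data beyond recognition, while t > 0 smooths it.

```python
import numpy as np

N, L, k = 256, 20.0, 1.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
xi = 2*np.pi*np.fft.fftfreq(N, d=x[1] - x[0])

u0 = np.exp(-x**2) + 1e-12*np.random.randn(N)  # data with tiny noise

def evolve(u, t):
    # Multiply by exp(-k*xi^2*t) in Fourier space; t < 0 amplifies instead.
    return np.fft.ifft(np.exp(-k*xi**2*t)*np.fft.fft(u)).real

print(np.max(np.abs(evolve(u0,  0.05))))  # O(1): smooth forward solution
print(np.max(np.abs(evolve(u0, -0.05))))  # enormous: the noise has exploded
```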

Related to the reversibility of wave propagation is the following energy consid-


eration.

Proposition 3.3.2 (Conservation of energy). Fix a bounded smooth domain D ⊂
Rn. Assume that u(t, x) solves the wave equation ∂t²u = c²∆u with no sources in
Rn. Define the total energy

E(t) = ∫∫_D ( c⁻²|∂tu(t, x)|² + |∇u(t, x)|² ) dx

of the wave ut in the domain D at time t. Then

E′(t) = 2 ∫_{∂D} ∂tu(t, x) ∂νu(t, x) dS(x).

In particular, if u satisfies Dirichlet boundary conditions u = 0 or Neumann boundary
conditions ∂νu = 0 at ∂D, then the total energy of the wave is conserved in the sense
that E′(t) = 0.
Proof. Differentiation under the integral sign and Green's first identity applied to u
and ∂tu give

E′(t) = ∫∫_D 2( c⁻² ∂t²u ∂tu + ⟨∇(∂tu), ∇u⟩ ) dx
  = ∫∫_D 2( c⁻² ∂t²u ∂tu − ∂tu ∆u ) dx + 2 ∫_{∂D} ∂tu ∂νu dS(x) = 2 ∫_{∂D} ∂tu ∂νu dS(x),

since ∂t²u = c²∆u.

An important consequence of this energy conservation is the uniqueness of solutions
to the wave IBVP.
Corollary 3.3.3. Assume that uj solves ∂t2 uj = c2 ∆uj + f in a bounded smooth
domain D ⊂ Rn , for j = 1, 2. If also u1 = u2 and ∂t u1 = ∂t u2 at t = 0, and if u1
and u2 satisfy the same Dirichlet or Neumann boundary condition, then u1 (t, x) =
u2 (t, x) for all x ∈ D and t > 0.
Proof. Apply Proposition 3.3.2 to u = u1 − u2. Since the wave equation is linear, we
have ∂t²u = c²∆u. We see that E(0) = 0, and since u satisfies homogeneous Dirichlet
or Neumann boundary conditions, it follows that E(t) = 0 for all t > 0. From this
it follows that u is a constant function of x at each t > 0, and that this constant
does not depend on t. Since u = 0 at t = 0, we conclude that u = 0, that is u1 = u2.

Figure 3.8: (a) Wave equation: two given superimposed waves at t = 0. (b) Waves at
t = 1 and (c) waves at t = −1, both computable from (a) and non-smooth.
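
The conservation law in Proposition 3.3.2 can be checked numerically for a standing
wave. The sketch below (an added illustration) evaluates E(t) for u(t, x) = cos(ct) sin(x)
on D = (0, π), which solves the wave equation with Dirichlet boundary conditions,
and finds E(t) = π/2 for all t.

```python
import numpy as np

# Standing wave u(t,x) = cos(c*t)*sin(x) on D = (0, pi), with c = 1,
# solving the wave equation with Dirichlet BC u = 0 at x = 0 and x = pi.
c = 1.0
x = np.linspace(0.0, np.pi, 2001)

def energy(t):
    ut = -c*np.sin(c*t)*np.sin(x)  # time derivative of u
    ux = np.cos(c*t)*np.cos(x)     # spatial derivative of u
    return np.trapz(ut**2/c**2 + ux**2, x)

print([round(energy(t), 6) for t in (0.0, 0.3, 1.0, 2.5)])  # all ~ pi/2
```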

Example 3.3.4 (Uniqueness for the backward heat IBVP). An argument similar to
that in Proposition 3.3.2 can be used to prove uniqueness for the IBVP for the heat
equation. Indeed, if ∂tu = ∆u for x ∈ D and t > 0, then the one-variable function

E(t) = ∫∫_D |u(t, x)|² dx

has derivative

E′(t) = 2 ∫∫_D u ∆u dx = 2 ∫_{∂D} u ∂νu dS(x) − 2 ∫∫_D |∇u|² dx.

Thus if u = 0 or ∂ν u = 0 at ∂D, then E ′ (t) ≤ 0, and it follows that u = 0 for all


t > 0 if u = 0 at t = 0.
Now to something more exciting! We have seen that for initial data u(0, x) =
g(x), there is in general no solution u(t, x) to the heat equation for any t < 0, with
initial (or rather terminal) data g. However, let us assume that there exists such a
solution for −a < t < 0, for some a > 0. We aim to show that if g = 0, then we
must have u = 0 for all −a < t < 0. Since we have a linear PDE problem, this
means that we nevertheless have uniqueness of solutions in the ill-posed backward
heat IBVP. To see this, consider

E(t) = ½ log ∫∫_D |u(t, x)|² dx.

We compute the first and second derivatives, assuming homogeneous Dirichlet or
Neumann boundary conditions:

E′(t) = − ∫∫_D |∇u|² dx / ∫∫_D u² dx,

E″(t) = 2 [ ( ∫∫_D (∆u)² dx )( ∫∫_D u² dx ) − ( ∫∫_D u ∆u dx )² ] / ( ∫∫_D u² dx )².

By the Cauchy–Schwarz inequality, it follows that E″(t) ≥ 0, so that E(t) is a convex
function for −a < t < 0. But if g = 0, then lim_{t→0−} E(t) = −∞. This leads to
a contradiction, since a convex function lies above any of its tangent lines and in
particular cannot go to −∞ in finite time.
In terms of the usual forward IBVP for the heat equation, what we just proved
means that it is impossible for a non-zero heat distribution at t = 0 to completely
wipe itself out in finite time. This may seem clear for physical reasons. But to
illustrate that we really proved something non-trivial, we point out that such a finite-
time heat wipe-out can indeed happen for the heat equation with more general
conductivity coefficients A(x) as in Example 1.3.1. There is a beautiful example of
this by Keith Miller from 1973, in space dimension 2 with a continuous symmetric
matrix-valued function A(x).

3.4 Exercises
1. For which k ∈ R does f (x) = |x|k define a tempered distribution on Rn ? It
suffices to consider n = 2, 3. Note that there are two important points: x = 0
and x = ∞.
2. Define the function fk(x) = 1 for |x| < k and fk(x) = 0 for |x| > k. Show
that fk → 1 weakly as k → ∞. Compute f̂k. Show that sin(kξ)/ξ → πδ0 weakly as
k → ∞. Plot the graphs of sin(kξ)/ξ. Can you see this convergence result?

3. Show that δa ∗ f is a translate of the function f .

4. For which a, b ∈ Rn is the product δa δb well defined? Compute it when it is


defined. Same problem for the convolution product δa ∗ δb .

5. Solve ∂t2 u = c2 ∂x2 u with u(0, x) = ex , ∂t u(0, x) = sin x.

6. Solve the wave equation in dimension n = 1 with g = 0 and initial speed
h(x) = 1 for |x| < a, h(x) = 0 for |x| > a. Sketch the wave resulting from this 1D
hammer blow at times t = a/2c, a/c, 3a/2c, 2a/c and 5a/c.
7. We consider spherical three-dimensional waves u(t, r), where r = √(x² + y² + z²).
Using spherical coordinates, the wave equation in dimension n = 3 reduces to

∂t²u = c²(∂r²u + 2r⁻¹∂ru).

(a) Consider the function v(t, r) = ru(t, r). Show that v solves the wave
equation in dimension n = 1.
(b) Given two even functions g(r) and h(r), solve the IVP u(0, r) = g(r),
∂tu(0, r) = h(r).

8. Solve the heat equation in dimension n = 1 with IC u(0, x) = e3x .

9. Solve the reaction-diffusion problem ∂t u = k∂x2 u − bu, u(0, x) = g(x), with


constant dissipation b. Hint: adapt Example 3.2.1.

10. Solve the heat equation ∂t u = k∂x2 u − a∂x u, u(0, x) = g(x), with constant
convection a. Hint: adapt Example 3.2.1.

11. Let u(t, x) solve the wave equation in dimension n = 1, with c = 1. Define the
energy density e = ((∂tu)² + (∂xu)²)/2 and the momentum density p = ∂tu ∂xu.

(a) Show that ∂te = ∂xp and ∂tp = ∂xe.
(b) Show that both e and p also solve the wave equation.

12. Consider the damped wave equation (1.11) with no sources f = 0. Show that
energy decreases with time.

13. We consider spherical n-dimensional waves u(t, r). Using spherical coordinates
in Rn , the wave equation in dimension n reduces to

∂t2 u = c2 (∂r2 u + (n − 1)r−1 ∂r u).

We consider solutions of the form u = α(r)f (t − β(r)).



(a) Which differential equations must α and β satisfy, if such u solves the
spherical wave equation for any choice of f ? Hint: write down the ODE
for f and set the coefficients to zero.
(b) Show that solutions u ̸= 0 of this form exist only when n = 1 or n = 3.

14. Solve the wave equation, with c = 1, in dimension n = 3 with g = 0 and initial
speed h(x) = 1 for |x| < 1, h(x) = 0 for |x| > 1. Hint: compute the area of the
spherical cap {y ; |y − x| = ct} ∩ {y ; |y| < 1}.

(a) Sketch the wave resulting from this 3D hammer blow at times t = 1/2, 1
and 2.
(b) Sketch u as a function of t, at points where |x| = 1/2 and |x| = 2
respectively.

15. Consider the wave equation, with c = 1, in dimension n = 2 with g = 0 and
initial speed h(x) = 1 for |x| < 1, h(x) = 0 for |x| > 1. Compute and sketch the
solution u as a function of t, at the origin x = 0.
16. Define the function v(x) = 1 − |x| for |x| < 1, v(x) = 0 for |x| > 1. Show that
there is no initial data u(0, x) such that u(1, x) = v(x), where u(t, x) solves the heat
equation for t > 0.
Chapter 4

Initial-boundary value problems

The goal in this chapter is to solve general IBVPs, where we consider the time-
evolution of a PDE on some domain D ⊂ Rn, n = 2, 3 in particular. Our strategy
is similar to Chapter 3, but we replace the Fourier transform (which is suitable only
for D = Rn) by something like Fourier series, adapted to the domain D ⊂ Rn.
These generalized sine and cosine functions for D give important theoretical
insights for IBVPs, and we also learn how to compute them numerically in order to
solve IBVPs.

Recommended reading:

• Strauss 10.1, 11.1, 11.3

• Strauss 4.1, 4.2, 10.2, 11.2

• Lecture notes: http://www.math.chalmers.se/~rosenan/part4-1.pdf

Proposition 4.1.3 and Theorem 4.1.4 are not required reading.

Study questions: What do we mean by Dirichlet/Neumann
eigenfunctions and eigenvalues for a domain D? In what sense are
they minimizers? How do we compute them? How do we diagonalize
our PDE = vector-valued ODE by using the Laplace eigenfunctions?
What is a standing wave?

4.1 Generalized sine and cosine series


The aim in this chapter is to solve our evolution PDEs, the heat and wave
equations, when initial data at t = 0 as well as boundary conditions at ∂D are
imposed. We assume that the domain D where we consider the PDE is bounded, with
a smooth boundary. The method of solution is basically the same as in Section 3.2.
The difference is that there the domain Rn was unbounded, which causes the variable
ξ, with which we diagonalize our vector-valued ODE, to be continuous. In this section
we assume D to be bounded, and we shall see that in this case the diagonalization
leads to a discrete set of scalar ODEs, one for each Laplace eigenvalue.
We know from Fourier analysis, in one dimension, that the Fourier transform is
useful for functions defined on all R, whereas Fourier series are useful for functions
defined on a bounded interval. The real form of a Fourier series uses the sine and
cosine functions.
Example 4.1.1 (Laplace eigenfunctions on intervals). The sine functions sin(kx)
are eigenfunctions of the Laplace operator ∆ = ∂x² on D = (0, π),

∂x² sin(kx) = −k² sin(kx),

with Dirichlet boundary conditions at x = 0 and x = π, k = 1, 2, . . .. As in Fourier
analysis, {√(2/π) sin(kx)}∞_{k=1} form an ON-basis for L2(0, π).

The cosine functions cos(kx) are eigenfunctions of the Laplace operator ∆ = ∂x²
on D = (0, π),

∂x² cos(kx) = −k² cos(kx),

with Neumann boundary conditions at x = 0 and x = π, k = 0, 1, 2, . . .. As in Fourier
analysis, {1/√π} ∪ {√(2/π) cos(kx)}∞_{k=1} form an ON-basis for L2(0, π). Note that
here, unlike in the Dirichlet case, we include k = 0.
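
Numerically, such generalized Fourier expansions are simple integrals. The following
sketch (an added illustration; the function f and the truncation are arbitrary choices)
expands a function in the Dirichlet eigenbasis √(2/π) sin(jx) on (0, π) and checks the
convergence of the partial sums.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)
f = x*(np.pi - x)  # a function vanishing at the endpoints

def coeff(j):
    # f^(j) = <f, e_j> with e_j(x) = sqrt(2/pi)*sin(j*x)
    return np.trapz(f*np.sqrt(2/np.pi)*np.sin(j*x), x)

def partial_sum(N):
    return sum(coeff(j)*np.sqrt(2/np.pi)*np.sin(j*x) for j in range(1, N + 1))

print(np.max(np.abs(f - partial_sum(25))))  # small truncation error
```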
The goal in this section is to show that for any bounded domain D ⊂ Rn we
have, for either choice of boundary condition, an ON-basis for L2(D) consisting of
Laplace eigenfunctions like this. One way to view this result is as a generalization
of the spectral theorem 0.2.3 from linear algebra. The generalization consists in
replacing the finite dimensional vector space by the Hilbert space L2(D) and viewing
the Laplace operator as an infinitely large matrix. As in the spectral theorem, we need
symmetry of the operator in order to have an ON-basis of eigenfunctions. And indeed,
Green's second formula shows that

∫_D (v∆u − u∆v) dx = ∫_{∂D} (v∂νu − u∂νv) dS.

Now, if both u and v satisfy either Dirichlet or Neumann boundary conditions, then
the right hand side vanishes, and we interpret this result as

⟨∆u, v⟩ = ⟨u, ∆v⟩,

which means that the Laplace operator is symmetric.


We need the following generalized eigenvalue concept.
Definition 4.1.2 (Rayleigh quotient). The Rayleigh quotient for a function u ∈
H1(D) is the number

R(u) := ∫_D |∇u|² dx / ∫_D |u|² dx.

We note from Green's first formula that if ∆u = −λu, then

∫_D (−λu² + |∇u|²) dx = 0

if u = 0 or if ∂ν u = 0 on ∂D. This means that R(u) = λ if u is a Dirichlet or


Neumann eigenfunction to ∆, with eigenvalue −λ. We now aim to reverse this
result, and construct eigenfunctions by minimizing Rayleigh quotients in a certain
way. To limit technicalities, we focus on the computation of the eigenfunction e1
corresponding to the first/smallest Dirichlet eigenvalue λ1 > 0.
Proposition 4.1.3 (Existence of minimizer). Assume that D ⊂ Rn is a bounded
domain with smooth boundary. Let
m = inf{R(u) ; u ̸= 0, u ∈ H01 (D)}.
Then there exists a function u ∈ H01 (D) such that R(u) = m.
Proof. Pick a minimizing sequence uj for which
R(uj ) → m, j → ∞.
By normalizing uj , we may assume that ∥uj ∥ = 1, and we have
sup ∥∇uj ∥ < ∞.
j

We now use two general compactness results: Theorems 2.2.6 and 4.1.4. From these
we can deduce that there exists a subsequence {ujk}∞_{k=1}, and limits u ∈ L2(D) and
vi ∈ L2(D), such that

∥ujk − u∥ → 0,  k → ∞,
⟨∂iujk, ϕ⟩ → ⟨vi, ϕ⟩,  k → ∞,
for each ϕ ∈ L2(D) and i = 1, . . . , n. For test functions ϕ ∈ D(D), it follows that

− ∫_D u ∂iϕ dx = − lim_{k→∞} ∫_D ujk ∂iϕ dx = lim_{k→∞} ∫_D (∂iujk)ϕ dx = ∫_D viϕ dx,

so u ∈ H1(D) with weak derivatives ∂iu = vi. Further we check that ∥u∥ = 1, and
that u ∈ H01(D), since H01(D) is a closed subspace of H1(D) and u is the weak limit
of ujk ∈ H01(D) in H1(D). Finally, we verify that R(u) = m, where it remains to
show that ∥∇u∥² ≤ m. Using ϕi = ∂iu, we have

∥∇u∥² = Σ_{i=1}^n ⟨∂iu, ϕi⟩ = lim_{k→∞} Σ_{i=1}^n ⟨∂iujk, ϕi⟩ ≤ √m ∥∇u∥

by the Cauchy–Schwarz inequality, since limj ∥∇uj∥² = m. Dividing by ∥∇u∥, this
proves that R(u) = m and that the minimum is attained at u.
Before proceeding, we state the compactness result that we used in the above
proof, which is a special case of the Banach-Alaoglu theorem.
Theorem 4.1.4 (Weak compactness). Let f1 , f2 , f3 , . . . be vectors in a Hilbert space
V . If the sequence is bounded, that is supj ∥fj ∥ < ∞, then there exists a subsequence
fj1 , fj2 , fj3 , . . . which converges weakly in the sense that there exists f ∈ V such that
⟨fjk , ϕ⟩ → ⟨f, ϕ⟩, k → ∞,
for each ϕ ∈ V .

Proposition 4.1.5 (The first Dirichlet eigenfunction). Assume that D ⊂ Rn is a


bounded domain with smooth boundary. Let e1 ∈ H01 (D) denote the minimizer u from
Proposition 4.1.3. Then e1 is the Dirichlet eigenfunction with smallest eigenvalue
λ1 > 0. Moreover, after suitable normalization, e1 (x) > 0 for all x ∈ D, and the
multiples te1 , t ∈ R \ {0}, are the only Dirichlet eigenfunctions with eigenvalue λ1 .
Note from the definition of e1 as the minimizer of all Rayleigh quotients ∫|∇u|² / ∫u²,
that e1 is in a sense the smoothest of all non-zero functions in D with Dirichlet
boundary conditions.

Proof. (1) We first show that e1 ∈ H01 (D) is a Dirichlet eigenfunction. In the weak
sense, by Green’s first identity, this means that
∫_D ⟨∇e1, ∇ϕ⟩ dx = λ1 ∫_D e1ϕ dx,

for all ϕ ∈ H01 (D), and we have seen that the candidate for λ1 is λ1 = R(e1 ).
Consider the one-variable function

f (t) = R(e1 + tϕ).

By Proposition 4.1.3, f has a minimum at t = 0, so

0 = f′(0) = d/dt [ ( ∫|∇e1|² + 2t ∫⟨∇e1, ∇ϕ⟩ + t² ∫|∇ϕ|² ) / ( ∫e1² + 2t ∫e1ϕ + t² ∫ϕ² ) ]|_{t=0}
  = 2 ( ∫⟨∇e1, ∇ϕ⟩ ∫e1² − ∫e1ϕ ∫|∇e1|² ) / ( ∫e1² )².

This proves that ∆e1 = −R(e1 )e1 in the weak sense. Any Dirichlet eigenvalue,
being a Rayleigh quotient, is positive and cannot be zero, since this would force
the eigenfunction to be constant and hence zero by the boundary conditions. By
construction R(e1 ) is the smallest Dirichlet eigenvalue.
(2) To show that e1 ≥ 0 on all D, consider the function

|e1 (x)|, x ∈ D.

Classically we do not expect this to be differentiable at the zero level set D0 = {x ∈
D ; e1(x) = 0}, but we can prove that in the weak sense actually |e1| ∈ H01(D). Let
D± = {x ∈ D ; ±e1(x) > 0}. Then the divergence theorem shows that

⟨∂xk|e1|, ϕ⟩ = − ∫_{D+} e1 ∂xkϕ dx + ∫_{D−} e1 ∂xkϕ dx = ∫_{D+} (∂xke1)ϕ dx − ∫_{D−} (∂xke1)ϕ dx,

where the boundary integrals vanish since e1 = 0 on ∂D+ and ∂D−. This proves
that in the weak sense, ∇|e1| is the vector field

∇|e1| = ∇e1 on D+,  ∇|e1| = −∇e1 on D−,

which is discontinuous across D0 .


It follows in particular that |e1 | ∈ H01 (D), with R(|e1 |) = R(e1 ). We conclude
that also |e1 | minimizes R(u), and is a Dirichlet eigenfunction by (1). We claim
that this forces D0 = ∅, and therefore that e1 ̸= 0 in all D. If not, pick x ∈ D0
and B(x, r) ⊂ D. A version of Proposition 5.1.2, with Φ replaced by the Green's
function G for the ball B(x, r), shows that

0 = |e1(x)| = ∫_{B(x,r)} (−λ1|e1(y)|) G(y, x) dy + ∫_{∂B(x,r)} |e1(y)| ∂νG(y, x) dS(y) > 0,

since ∆|e1 | = −λ1 |e1 |, G < 0 and ∂ν G > 0. (Note however that |e1 | is not a C 2
function, so some more technical details are needed to justify this, which we omit.)
(3) We have shown that, possibly after normalization, e1 > 0 in all D. Let u be
any other Dirichlet eigenfunction with eigenvalue λ1 . If u is not parallel to e1 , then
we may assume that ⟨e1 , u⟩ = 0 after Gram–Schmidt orthogonalization. But by the
same argument as above, we also have u > 0 in all D. But two positive functions
cannot be orthogonal in L2 . Therefore u must be parallel to e1 .

Let D ⊂ Rn be a bounded domain, with smooth boundary. Then there exists


an ON-basis
e1 (x), e2 (x), e3 (x), . . .
for L2 (D), such that ej ∈ H01 (D) and all are Dirichlet eigenfunctions to the Laplace
operator ∆ in the sense that

∆ej (x) = −λj ej (x),

with eigenvalues 0 < λ1 < λ2 ≤ λ3 ≤ . . .. Note that λ2 > λ1 follows from Proposition
4.1.5, and λ1 > 0 follows from the Poincaré inequality, but some of the remaining
eigenfunctions e2, e3, . . . may have the same eigenvalue, depending on the symmetries
of D. A brief sketch of the construction of this Dirichlet eigenbasis is as follows.

• To construct e2, we consider the orthogonal complement

V1 = {u ∈ H01(D) ; ⟨u, e1⟩ = 0}

of e1 from Proposition 4.1.5. A minimizer of R(u), not over H01(D) but in
the subspace V1, can be shown to exist similarly to Proposition 4.1.3. A
modification of Proposition 4.1.5 shows that this minimizer is e2. However,
somewhat similar to sin(2x) on (0, π), it will have an oscillation.

• To construct e3, we then consider the orthogonal complement

V2 = {u ∈ H01(D) ; ⟨u, e1⟩ = 0, ⟨u, e2⟩ = 0}

of e1 and e2. The minimizer of R(u) in the subspace V2 will be e3.

• Proceeding recursively like this, we obtain an ON-set of Dirichlet eigenfunctions
e1(x), e2(x), e3(x), . . . and the corresponding eigenvalues λ1, λ2, λ3, . . . will

Figure 4.1: The Dirichlet eigenfunctions e1 (x), . . . , e6 (x) for a domain D, computed
with the Rayleigh-Ritz approximation from Section 4.3, using N = 4501 basis functions.
The error in the eigenvalue approximations is about 1%. (Note similarities to sin(kx)
on (0, π).)

be increasing, since we minimize R(u) over decreasing subspaces Vj. Further,
one can show that these eigenfunctions form an ON-basis, that is, any
f ∈ L2(D) can be expanded in a generalized sine series

f(x) = Σ_{j=1}^∞ f̂j ej(x),

with generalized sine coefficients f̂j = ∫_D f(x)ej(x) dx, with convergence in
L2(D) norm.
For a general domain D, the values of λj (and ej (x)) can only be computed
numerically. What is known exactly however, is the rate at which λj → ∞. It turns
out that this depends only on the dimension and the area/volume of D.
Theorem 4.1.6 (Weyl's law). For a bounded domain D ⊂ R² with area A, we have

λj/j → 4π/A,  j → ∞.

For a bounded domain D ⊂ R³ with volume V, we have

λj/j^{2/3} → (6π²/V)^{2/3},  j → ∞.
More generally for a domain D ⊂ Rn , the eigenvalues λj grow like a constant
depending on the n-volume of D times j 2/n .

If we replace the Dirichlet BC by the Neumann BC, we have a completely analogous
result: there exists an ON-basis

ẽ0(x), ẽ1(x), ẽ2(x), ẽ3(x), . . .

for L2(D), such that ẽj ∈ H1(D) and all are Neumann eigenfunctions to the Laplace
operator ∆ in the sense that

∆ẽj(x) = −λ̃j ẽj(x),

with eigenvalues 0 = λ̃0 < λ̃1 ≤ λ̃2 ≤ . . .. We prefer to index these eigenfunctions
from j = 0 since, just like the cosine functions on (0, π), we have that ẽ0(x) is
a constant function, and hence λ̃0 = 0. Again, λ̃1 > λ̃0 follows from the Poincaré
inequality, this time for H̃1(D), but some of the remaining eigenfunctions ẽ1, ẽ2, . . . may
have the same eigenvalue, depending on the symmetries of D. (If D is disconnected,
the first few Neumann eigenfunctions will be locally constant.) The construction
of the Neumann basis is very similar to the Dirichlet eigenbasis, using the whole
Sobolev space H1(D), and subspaces thereof, instead of H01(D).

4.2 Eigenfunctions and IBVPs


One of the few geometries for which we can explicitly calculate the eigenfunctions
is the rectangular one.

Example 4.2.1 (Rectangles). Let D ⊂ R² be a rectangle

D = (0, a) × (0, b).

Then for each pair (n, m) of positive integers, the function

u(x, y) = 2(ab)^{−1/2} sin(nπx/a) sin(mπy/b)

is a Dirichlet eigenfunction on D. From Fourier analysis it follows that these func-


tions form an ON-basis for L2 (D). We calculate that

∆u = −π 2 ((n/a)2 + (m/b)2 )u.

Let us order these eigenvalues not by (n, m), but linearly as λ1, λ2, . . .. Consider the
j'th eigenvalue λj. Each of the j smaller eigenvalues corresponds to a point (n, m)
satisfying

π²((n/a)² + (m/b)²) ≤ λj.

Figure 4.2: The Neumann eigenfunctions ẽ0 (x), . . . , ẽ5 (x) for a domain D, computed
with the Rayleigh-Ritz approximation from Section 4.3, using N = 4781 basis functions.
The error in the eigenvalue approximations is about 1%. (Note similarities to cos(kx)
on (0, π).)

This means that (n, m) is inside a quarter ellipse with axes √λj·a/π and √λj·b/π.
For large j, this number of points j is approximately equal to the area (1/4)πλj ab/π²
of this quarter ellipse, that is,

λj ≈ 4πj/(ab).
Since the area of the rectangle D is A = ab, we have verified Weyl’s law in the case
of a rectangle.
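
This asymptotic count can also be checked directly by sorting the eigenvalues of a
rectangle. The sketch below (an added illustration; the side lengths and the index j
are arbitrary choices) compares λj/j with 4π/A.

```python
import numpy as np

a, b = 1.0, 2.0  # rectangle D = (0,a) x (0,b) with area A = a*b
# Dirichlet eigenvalues pi^2*((n/a)^2 + (m/b)^2), sorted increasingly.
lams = np.sort([np.pi**2*((n/a)**2 + (m/b)**2)
                for n in range(1, 200) for m in range(1, 400)])

j = 2000
print(lams[j - 1]/j, 4*np.pi/(a*b))  # Weyl's law: lambda_j/j -> 4*pi/A
```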
The following examples illustrate how to solve initial/boundary value problems
(IBVPs) using eigenfunction expansions.
Example 4.2.2 (Heat IBVP on D). Consider the Dirichlet IBVP for the heat
equation:

∂tu(t, x) = k∆u(t, x) + f(t, x),  t > 0, x ∈ D,
u(0, x) = g(x),  x ∈ D,
u(t, x) = 0,  t > 0, x ∈ ∂D.

If we can solve this PDE problem, we recall from Section 1.4 that we can also
solve it in the case of inhomogeneous Dirichlet boundary conditions u|∂D = h ̸= 0.

(We can even reduce to the case g = 0, but choose to keep g since our method of
solution does not require g = 0.) The method for solving the problem stated here is
entirely similar to Example 3.2.1. What is different is that we replace the use of the
Fourier transform, which is applicable only to D = Rn , by the use of the Dirichlet
eigenfunctions
e1 (x), e2 (x), . . .
on D. (In the case of a rectangle, this would be a double Fourier sine series as in
Example 4.2.1.)
The computation is as follows. For a fixed time t, write ut (x) = u(t, x) and
ft (x) = f (t, x) and view ut and ft as vectors in the Hilbert space L2 (D). We obtain
the vector valued ODE
∂t ut = k∆ut + ft . (4.1)
Here we write, for each t > 0,

ut(x) = Σj ût(j) ej(x),
ft(x) = Σj f̂t(j) ej(x),

and similarly for the initial data

g(x) = Σj ĝ(j) ej(x).

We use the notation ût(j), f̂t(j) and ĝ(j) for the coordinates of the functions in the
Dirichlet eigenbasis, but remember that this no longer refers to the Fourier transform
but rather denotes generalized Fourier series coefficients.
Since ∆ej = −λj ej , we obtain a sequence of ODEs

∂t ût (j) = −kλj ût (j) + fˆt (j)

from (4.1) by equating the coefficients on both sides. Now fix j and regard instead
t as variable. Multiplication by the integrating factor ekλj t , and changing dummy
variable from t to s, gives

∂s (ekλj s ûs (j)) = ekλj s fˆs (j).

Integration over 0 < s < t, using the initial condition û0(j) = ĝ(j), and division by
e^{kλjt} yields

ût(j) = ĝ(j) e^{−kλjt} + ∫₀ᵗ f̂s(j) e^{−kλj(t−s)} ds.   (4.2)
To summarize, given initial data g and source f, we compute their generalized
Fourier coefficients

ĝ(j) = ∫_D g(x)ej(x) dx,
f̂t(j) = ∫_D ft(x)ej(x) dx.

From these and (4.2) we obtain ût(j), which gives the solution

u(t, x) = Σj ût(j) ej(x),  t > 0, x ∈ D,

to our heat IBVP.


We note that unlike the case D = Rn and the Fourier transform in Example 3.2.1,
for a general bounded domain D it is not possible to sum this series solution for
u(t, x) and obtain a formula similar to (3.6). However, in the case of the heat
equation treated here, it is important to note that the series converges very rapidly.
For simplicity, consider the case f = 0 of no sources. Then we have

ût(j) = ĝ(j) e^{−kλjt},

and we see that even if the initial data has many non-zero coefficients ĝ(j) (corresponding
to a non-smooth g), the coefficients ût(j) of the solution decay fast with j for
not too small t > 0 (corresponding to a smooth u for t > 0). Even

u(t, x) ≈ ĝ(1) e^{−kλ1t} e1(x) = ( ∫_D g(y)e1(y) dy ) e^{−kλ1t} e1(x)

is a good approximation of the solution. Indeed, one can show for the Dirichlet
eigenvalues that λ1 < λ2, so that e^{−kλ1t} ≫ e^{−kλjt} for j ≥ 2 and t large enough.
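
On an interval this rapid decay is easy to observe, since the eigenpairs are explicit.
The sketch below (an added illustration, using the Dirichlet basis from Example 4.1.1
with λj = j² and the arbitrary choice k = 1) evolves a non-smooth initial heat
distribution and shows that three modes already resolve the solution at t = 1.

```python
import numpy as np

# Heat IBVP on D = (0, pi) with Dirichlet BC: e_j(x) = sqrt(2/pi)*sin(j*x),
# lambda_j = j^2, so u_t^(j) = g^(j)*exp(-k*j^2*t) when f = 0.
k = 1.0
x = np.linspace(0.0, np.pi, 2001)
g = np.where((x > 1) & (x < 2), 1.0, 0.0)  # non-smooth initial heat

ghat = [np.trapz(g*np.sqrt(2/np.pi)*np.sin(j*x), x) for j in range(1, 51)]

def u(t, N=50):
    return sum(ghat[j - 1]*np.exp(-k*j**2*t)*np.sqrt(2/np.pi)*np.sin(j*x)
               for j in range(1, N + 1))

# The first omitted coefficient carries a factor exp(-16) ~ 1e-7 at t = 1.
print(np.max(np.abs(u(1.0) - u(1.0, N=3))))
```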
Example 4.2.3 (Hot spots conjecture). Consider instead the Neumann IBVP for
the heat equation:

∂tu(t, x) = k∆u(t, x),  t > 0, x ∈ D,
u(0, x) = g(x),  x ∈ D,
∂νu(t, x) = 0,  t > 0, x ∈ ∂D.

Write the solution ut, for each time t > 0, as a linear combination of the Neumann
eigenfunctions,

ut(x) = Σ_{j=0}^∞ ût(j) ẽj(x),

and similarly for the initial data g. Then computations analogous to those in
Example 4.2.2 show that

u(t, x) ≈ ĝ(0)ẽ0(x) + ĝ(1)e^{−kλ̃1t} ẽ1(x) = (1/|D|) ∫_D g(y) dy + ( ∫_D g(y)ẽ1(y) dy ) e^{−kλ̃1t} ẽ1(x)

for large t. The interpretation of this result is that when the boundary is insulated
(Neumann boundary conditions), then the initial heat g(x) will spread out evenly
and converge to a constant function on D with the same total heat (integral). For
large t, the difference between ut and this constant function is approximately a small
multiple of ẽ1 (x). This shows that this first non-trivial Neumann eigenfunction has
an important physical meaning: the asymptotic shape of non-constant heat in a
conducting plate/body with insulated boundary.

A so far unsolved conjecture in mathematical research is whether a maximum
principle holds for this ẽ1: is the maximum (= a hot spot) of ẽ1 always attained on
∂D? Note that ẽ1 is not a harmonic function. There is a known counterexample
with a domain D ⊂ R² with two holes (so it looks like a figure eight) where the
maximum of ẽ1 is attained in the interior of D, but for simply connected domains
we do not know if this can happen. You can verify that ẽ1 attains its maximum at
∂D, for the domain D in Figure 4.2.

Figure 4.3: Slide show of a heat evolution in a domain D with f = 0, k = 1, and
Neumann BC, computed using eigenfunction expansion, with N = 4781, as in
Section 4.3. The heat distribution u0 = e^{−10|(x−1,y)|²} at t = 0 peaks at u0 = 1
beyond the colormap. Note the similarity between u1.5 and ẽ1 in Figure 4.2, and that
u → ∫_D u0 dxdy/|D| ≈ 0.02 as t → ∞.

Example 4.2.4 (Wave IBVP on D). Consider the Dirichlet IBVP for the wave
equation:

∂t²u(t, x) = c²∆u(t, x) + f(t, x),  t > 0, x ∈ D,
u(0, x) = g(x),  x ∈ D,
∂tu(0, x) = h(x),  x ∈ D,
u(t, x) = 0,  t > 0, x ∈ ∂D.

Again, the more general case of inhomogeneous boundary conditions can be reduced
to this case. We could have also reduced further to the case g = h = 0 by the
methods in Section 1.4, but refrain from this since our method of solution does not
require it.
Consider our wave equation as the vector valued second order ODE

∂t2 ut = c2 ∆ut + ft . (4.3)



As in Example 4.2.2, we use the Dirichlet eigenbasis {ej(x)} for D to write

ut(x) = Σj ût(j) ej(x),
ft(x) = Σj f̂t(j) ej(x),

and similarly for the initial data

g(x) = Σj ĝ(j) ej(x),
h(x) = Σj ĥ(j) ej(x).

By equating the coefficients of the functions on both sides in (4.3), we obtain the
ODEs
∂t2 ût (j) = −c2 λj ût (j) + fˆt (j).
Now fix j and regard instead t as variable. From Example 1.1.3 with ω = cλj^{1/2} and
a(t) = f̂t(j), we have the solution

ût(j) = ĝ(j) cos(cλj^{1/2}t) + ĥ(j) sin(cλj^{1/2}t)/(cλj^{1/2}) + ∫₀ᵗ f̂s(j) sin(cλj^{1/2}(t − s))/(cλj^{1/2}) ds.   (4.4)

To summarize, given initial data g, h and source f, we compute their generalized
Fourier coefficients

ĝ(j) = ∫_D g(x)ej(x) dx,  ĥ(j) = ∫_D h(x)ej(x) dx,  f̂t(j) = ∫_D ft(x)ej(x) dx.

From these and (4.4) we obtain ût(j), which gives the solution

u(t, x) = Σj ût(j) ej(x),  t > 0, x ∈ D,   (4.5)

to our wave IBVP. Like in Example 4.2.2 and unlike the case in Section 3.2, for a
general bounded domain it is not possible to sum this series solution for u(t, x) and
obtain a formula similar to (3.7).
It is important to note in (4.4) that the coefficients ût (j) do not decay as t grows,
as was the case for the heat equation in Example 4.2.2. (This is related to the fact
that the wave equation does not smoothen the functions ut as t grows.) And indeed
the series (4.5) typically converges slowly for the wave equation, which indicates that
eigenfunction expansion is a numerically less efficient tool than for the heat equation.
Recall that in the latter case, we only needed very few eigenfunctions to get a good
approximation, since the coefficients decay exponentially according to λj.
An important special case deserves special attention: the case when only one
mode of oscillation is present. For simplicity assume that f = h = 0 and that the

initial shape g of the wave is one of the eigenfunctions, g = em. This means that
ĝ(j) = 0 for all j except j = m, where ĝ(m) = 1. We obtain in this case the solution

u(t, x) = ĝ(m) cos(cλm^{1/2}t) em(x) = cos(cλm^{1/2}t) g(x).

This means that if we start with an eigenfunction as the initial shape u(0, x), then
the wave ut will keep this shape; it is only the amplitude of the wave that will change.
We refer to u as a standing wave, with frequency cλm^{1/2}.

Example 4.2.5 (Can you hear the shape of a drum?). This intriguing question was
posed by the mathematician Mark Kac in a 1966 paper in the journal American
Mathematical Monthly. So what did he mean? We have seen that the vibrating
membrane on a drum with shape D ⊂ R² evolves according to the wave equation,
in the linear approximation. Given D, we can compute the Dirichlet eigenvalues
λ1, λ2, . . .. The square roots √λj are the frequencies of the pure notes that this
drum with shape D can produce. This means that the eigenvalues determine how
the drum sounds. Kac's question is an example of an inverse problem: if we
know the sequence λ1, λ2, . . ., is it then possible to figure out what the shape of D
must be? Clearly we can translate and rotate D without changing the eigenvalues,
but is the shape of D determined by the pure notes, the sound, that the drum
produces?

Finally Gordon, Webb and Wolpert found a counterexample in 1992: there exist
two drums with different shapes which have exactly the same Dirichlet eigenvalues
(we say that they are isospectral).

4.3 FEM: eigenvalues and evolution


For a general domain D, it is not possible to find a closed formula for the eigen-
functions and eigenvalues. In this case, we need to compute these approximately
by a suitable numerical algorithm. Consider the Dirichlet eigenfunctions. We have
seen in Section 4.1 that the j’th Dirichlet eigenfunction ej minimizes R(u) among all
functions orthogonal to the eigenfunctions e1 , e2 , . . . , ej−1 . To compute ej approxi-
mately, we proceed similarly to Section 2.4 and fix a basis ϕ1 , ϕ2 , . . . , ϕN for a finite
dimensional subspace VN ⊂ H01 (D). Write

u(x) = u1 ϕ1 (x) + · · · + uN ϕN (x)

for a function u ∈ VN. Note that in terms of the coordinates ⃗u = [u1, . . . , uN]ᵀ of u,
we have

R(u) = ⃗uᵀA⃗u / ⃗uᵀB⃗u,

where A is the stiffness matrix with entries Aij = ⟨∇ϕi, ∇ϕj⟩ and B is the matrix
with entries Bij = ⟨ϕi, ϕj⟩, referred to as the mass matrix.
We define approximations fj ∈ VN to the eigenfunctions ej recursively, by let-
ting fj be the function in VN which minimizes the Rayleigh quotient among all
functions orthogonal to f1 , f2 , . . . , fj−1 . The Rayleigh quotients rj = R(fj ) serve as
approximations to the corresponding eigenvalues λj .

Proposition 4.3.1 (Rayleigh–Ritz approximation). Let ϕ1 , ϕ2 , . . . , ϕN ⊂ H01 (D) be


linearly independent functions. The Rayleigh–Ritz approximation r1 , r2 , . . . , rN of
the N first Dirichlet eigenvalues to −∆ on D are the roots of the equation

det(A − λB) = 0,

where the N × N matrices are A = (⟨∇ϕi, ∇ϕj⟩)_{i,j=1}^N and B = (⟨ϕi, ϕj⟩)_{i,j=1}^N. The
approximate eigenfunction fj has the solution f⃗j to (A − rjB)f⃗j = 0 as coordinates.

Proof. Since B is a symmetric matrix, by the spectral theorem we have a similarity


transformation
T BT −1 = D,
where D is a diagonal matrix. Moreover, since xT Bx = ∥u∥2 > 0, the diagonal
elements in D must be positive. Forming the square root of these elements, we write
D = S 2 , where S is also a diagonal matrix with positive diagonal.
Consider now the change of coordinates y = ST x. Note that S T = S and
T T = T −1 . We get

xT Bx = y T y,
xT Ax = y T Cy,

where C = S⁻¹TAT⁻¹S⁻¹. In the new coordinates for u, the Rayleigh quotient is

R(u) = yᵀCy / yᵀy.

Since again C is a symmetric matrix, yet another application of the spectral theorem
reveals that the eigenvalue approximations rj are the eigenvalues of C, that is, the
roots of

det(S⁻¹TAT⁻¹S⁻¹ − λI) = 0.
Multiplying from the left by det(T −1 S) and from the right by det(ST ) yields the
stated equation.

Replacing H01 (D) by H 1 (D) above, gives a numerical algorithm for calculating
approximations f˜j (x) and r̃j to the Neumann eigenfunctions ẽj (x) and eigenvalues
λ̃j .
The Rayleigh–Ritz approximation (RRA) provides the following concrete algo-
rithm for computing numerical approximations to the eigenvalues and eigenfunc-
tions.

• We assume given the three matrices vertices, triangles and boundary, which
encode a triangulation of the domain D, as in Section 2.4.

• Basis functions are indexed by all the vertices listed in vertices when computing
the Neumann eigenfunctions, whereas in the Dirichlet case we only use those
vertices not listed in boundary. See Section 2.4.

• The stiffness matrix A is computed as suggested in Section 2.4. During the
same iteration over the triangles T, we also compute ∫_T ϕiϕj dx and add it to
the element Bij of the mass matrix B.

• We can now solve det(A − λB) = 0 for eigenvalues λ and (A − λB)⃗u = 0
for eigenvector coordinates ⃗u, which yields the eigenfunction approximations
u(x) = u1ϕ1(x) + . . . + uNϕN(x). The final result is a matrix E containing the
coordinates for the ej approximations in the basis {ϕi}_{i=1}^N in its j'th column,
and a diagonal matrix D containing the λj approximations along the diagonal,
as the one-dimensional sketch below illustrates.
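
A minimal sketch of this generalized eigenvalue problem (an added illustration) uses
P1 hat functions on an interval, where A and B are explicit tridiagonal matrices; the
RRA eigenvalues then approximate the exact Dirichlet eigenvalues j² on (0, π) from
above.

```python
import numpy as np
from scipy.linalg import eigh

# P1 FEM (hat functions) on D = (0, pi) with Dirichlet BC; the N interior
# nodes give tridiagonal stiffness and mass matrices A and B.
N = 200
h = np.pi/(N + 1)
main = np.eye(N)
off = np.eye(N, k=1) + np.eye(N, k=-1)
A = (2*main - off)/h    # A_ij = <phi_i', phi_j'>
B = (4*main + off)*h/6  # B_ij = <phi_i, phi_j>

lams, E = eigh(A, B)    # solves det(A - lambda*B) = 0
print(lams[:4])         # ~ [1, 4, 9, 16], slightly above the exact values
```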

Once we have solved for (approximations to) eigenvalues and eigenfunctions,


we can solve evolution problems like those considered in Examples 4.2.2 and 4.2.4
numerically as follows.

• Initial data g(x) ∈ VN is represented by its coordinates ⃗g = [g1, . . . , gN]ᵀ in
the FEM-basis ϕ1(x), . . . , ϕN(x). We now change basis to the eigenfunction
basis represented by the columns in E, as in Example 0.2.2. The coordinates
ĝ = [ĝ(1), . . . , ĝ(N)]ᵀ for g in the eigenfunction basis, the generalized Fourier
coefficients of g, are found by solving the system

Eĝ = ⃗g.

Note that this does not require E to be orthonormal.

• Depending on the evolution PDE, we have a formula like (4.2) or (4.4) available
for the Fourier coefficients ût of the solution at a time t > 0. This allows ût to
be computed from the generalized Fourier coefficients ĝ and the eigenvalues D,
assuming for simplicity that we have no sources f for t > 0. Multiplying by E
will then give the coordinates ⃗ut in the FEM-basis {ϕi}, that is, the values of
u(t, x) at the nodes x in the triangulation.

Figures 4.3 and 4.4 have been produced with this algorithm. Note that a maybe
more straightforward way to solve these evolution problems is to numerically solve
the ODEs for the FEM-coordinates ⃗ut, without diagonalizing and using the analytic
solution for the decoupled generalized Fourier coefficients as we have done here.
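
Continuing the one-dimensional sketch above (again an added illustration, with the
arbitrary choice k = 1), the evolution step amounts to solving Eĝ = ⃗g, damping each
coefficient by e^{−kλjt}, and mapping back to the FEM basis:

```python
# Continuing the sketch above: evolve the heat IBVP in the FEM basis,
# using the computed eigenpairs (lams, E).
k = 1.0
xs = np.linspace(h, np.pi - h, N)               # the interior nodes
gvec = np.where((xs > 1) & (xs < 2), 1.0, 0.0)  # FEM coordinates of g

ghat = np.linalg.solve(E, gvec)                 # solve E*ghat = gvec

def u_nodes(t):
    # Coefficients decay like exp(-k*lambda_j*t); map back with E.
    return E @ (np.exp(-k*lams*t)*ghat)

print(u_nodes(0.5).max())  # the smoothed-out heat at time t = 0.5
```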

4.4 Exercises
1. Does there exist a function f(x) such that f(0) = f(3) = 0 and

∫₀³ f² dx = 1 = ∫₀³ (f′)² dx?

2. Show that the first eigenvalue for −∆ with Robin BC ∂νu + au = 0 is given by

λ1 = min_{u ∈ H1(D), u ≠ 0} ( ∫∫_D |∇u|² dx + ∫_{∂D} au² dS ) / ∫∫_D u² dx.

Figure 4.4: Slide show of a wave evolution in a domain D with f = 0, c = 1, and
Neumann BC, computed using eigenfunction expansion, with N = 4781. The initial
wave u0 = e^{−10|(x−1,y)|²} at t = 0, with zero speed, peaks at u0 = 1 beyond the colormap.
We can verify that the propagation speed is 1 and that the solution does not depend
on the domain D until the wave first hits ∂D at t ≈ 1.

3. Use the eigenfunctions from Example 4.1.1 to solve the one-dimensional IBVP

∂tu = ∂x²u,  0 < x < π, t > 0,
u = 1,  t = 0,
u = 0,  x = 0 and x = π.

4. Consider −∆ = −∂x2 on the interval D = (0, L) with mixed BCs: ∂x u(0) = 0 =


u(L). Show that the eigenfunctions are cos((j +1/2)πx/L), j = 0, 1, 2, . . ., and
use these to find a series solution to the mixed IBVP for the wave equation.

5. Assume that u and v are two Dirichlet eigenfunctions, with different eigenval-
ues, on a domain D. Show that u and v are orthogonal in L2 (D).

6. Compute the RRA of the first two Dirichlet eigenvalues for the interval D =
(0, 1), using the basis functions ϕ1(x) = x − x² and ϕ2(x) = x² − x³. Compare
with the exact values: which are smaller?

7. Compute the Rayleigh quotient for ϕ(x, y) = xy(π − x)(π − y) on the square
D = (0, π)². Compare with the first eigenvalue (see Example 4.2.1): which is
smaller?
Chapter 5

Harmonic functions

In this chapter, we prove basic Green and Poisson representation formulas for
harmonic functions, the equilibrium functions described by Laplace's equation. By
analyzing these formulas, we learn much about the properties of harmonic functions:
they are completely smooth and have a mean value property. All irregularities
have been averaged away! There are similar results for the heat equation, but not
for the wave equation.
As extra material, we also show how BVPs for the Laplace equation can be solved
by diagonalizing a Cauchy–Riemann vector valued system of ODEs, very similar to
the way we solved the heat and wave IVPs in Chapter 3.

Recommended reading:

• Strauss 6.1, 6.3, 7.2, 7.3, 7.4.

• Strauss 7.1, 2.3.

Section 5.3 and the material at the end of Section 5.2
(Lemma 5.2.6 and below) is not part of the course.

Study questions:
What is a Green's function? How do we compute such functions
with reflection methods? Why is the Green's function symmetric?
What are the Poisson formulas for simple domains? Why are harmonic
functions smooth in the interior of their domain of definition? What do
maximum principles say? What do they mean physically?

5.1 Green functions and Poisson kernels


For harmonic functions and the Laplace equation, the most important function,
depending on dimension, is the following.

Definition 5.1.1 (Fundamental solution to ∆). The fundamental solution to the
Laplace operator on Rn is the function

Φ(x) = (1/2π) ln|x|  in dimension n = 2,

and

Φ(x) = −1/(4π|x|)  in dimension n = 3.

Figure 5.1: The logarithmic potential (dimension n = 2) is very narrow at x = 0 and
grows slowly as |x| → ∞. The Newton potential (dimension n = 3) has a stronger
singularity at x = 0 and tends to 0 as |x| → ∞.

To see it appear, consider the Poisson equation ∆u = f, with tempered distributions
u and f on Rn. Applying the Fourier transform, we have −|ξ|²û = f̂. By
Example 3.1.7, we know that

Φ̂(ξ) = −1/|ξ|²,

where in R² we need to define the right hand side in the correct distributional sense
at ξ = 0. If u and f decay sufficiently fast at ∞, then we can multiply by Φ̂(ξ) to
get û = Φ̂f̂, or equivalently

u(x) = Φ(x) ∗ f(x).

This shows that convolution by Φ is the inverse ∆⁻¹ of the Laplace operator. Equivalently,
we have the weak derivative

∆Φ = δ0.
Proposition 5.1.2 (Green's third identity). Let D ⊂ Rn be a bounded smooth
domain. Then for all sufficiently smooth functions u : D → R we have the reproducing
formula

u(x) = ∫_D ∆u(y)Φ(y − x) dy + ∫_{∂D} ( u(y)∂νΦ(y − x) − Φ(y − x)∂νu(y) ) dS(y),  x ∈ D,

where ∂νΦ(y − x) = ⟨ν(y), (∇Φ)(y − x)⟩.


Proof. Using weak derivatives, this result is immediate from Green’s second identity.
See Section 0.3. Indeed, let v(y) = Φ(y − x), where now y is the integration variable
and x ∈ D is fixed. Then the first term in the integral over D is u∆Φ(y − x) = uδx
and therefore

∫_D u(y)∆Φ(y − x) dy = u(x).

Green's third identity is particularly useful for harmonic functions u, in which
case

u(x) = ∫_{∂D} ( u(y)∂νΦ(y − x) − Φ(y − x)∂νu(y) ) dS(y),  x ∈ D.   (5.1)

Let us now switch our point of view, and view x again as our variable. The identity
writes u(x) as a continuous linear combination of the functions Φ(y − x) and ∂yjΦ(y −
x) with poles y ∈ ∂D. Since Φ(x − y) is C∞ smooth for x ≠ y, differentiation under
the integral sign shows the following.
Corollary 5.1.3 (Smoothness). Any harmonic function is C ∞ smooth (and even
real analytic) in the interior of its domain of definition.
To gain further understanding of the properties of harmonic functions, the fol-
lowing concept is useful.
Definition 5.1.4 (Green’s function). Consider a domain D ⊂ Rn and a point x ∈
D. A function G(y) = G(y, x) is said to be a Green’s function for D with pole at x
if
• G(y, x) = Φ(y − x) + gx(y), where

• ∆gx = 0 in all D, so that ∆G(y) = ∆yΦ(y − x) + ∆gx = δx, and

• gx(y) = −Φ(y − x) for all y ∈ ∂D, so that G = 0 at ∂D.


Green's third identity for harmonic functions (5.1) provides an explicit formula
to find u inside D, if we know both the Dirichlet and Neumann data on ∂D. The
raison d'être for Green's functions is that they eliminate the Neumann term. Indeed,
replacing Φ by G in the proof of Proposition 5.1.2, we have for any harmonic function
that

u(x) = ∫_{∂D} u(y)∂νG(y, x) dS(y),  x ∈ D.

This is a solution formula for the Dirichlet problem! (There is also a similar notion
of a Neumann function which allows you to eliminate the Dirichlet term and to solve
the Neumann problem.) To test your understanding: why did we do all the work
with FEM in Chapter 2 to solve the Dirichlet problem when we have such a simple
formula for its solution? (See the answer in footnote 1 below, but do your own
thinking first!)
The standard way to compute Green’s functions for simple domains is through
reflections as the following examples show.
Example 5.1.5 (Half-space). Consider the unbounded upper half space D = {(y1 , y2 , y3 ) ∈
R3 ; y3 > 0}, and fix a point x = (x1 , x2 , x3 ) ∈ D. The key idea is to consider
not only Φ(y − x), which is harmonic in D except at the pole y = x, but also the
function Φ(y − x∗ ), where
x∗ = (x1 , x2 , −x3 )
¹For general domains, you do not know a formula for the Green's function. Green's functions
can solve all BVPs, but you need to be able to solve BVPs to find the Green's function!

is the reflection of x in ∂D. Since x∗ ∉ D, the function gx(y) = −Φ(y − x∗) is harmonic
in all D. Moreover, x∗ being the mirror image of x, it is clear that |y − x∗| = |y − x|
for all y ∈ ∂D. Hence gx(y) = −Φ(y − x) for y ∈ ∂D and so

G(y, x) = −(1/4π) ( 1/|y − x| − 1/|y − x∗| )

is a Green's function for D. Furthermore ∂νG = −∂y3G = (1/2π) x3/|y − x|³ for y ∈ ∂D,
and so the solution to the Dirichlet problem is given by

u(x1, x2, x3) = (x3/2π) ∫∫_{R²} u(y1, y2, 0) dy1 dy2 / ( (y1 − x1)² + (y2 − x2)² + x3² )^{3/2}.   (5.2)

Example 5.1.6 (Disk). Consider the disk D = {(y1 , y2 ) ∈ R2 ; y12 + y22 < a2 } with
radius a, and fix a point x = (x1 , x2 ) ∈ D. We would like to use a reflection as
in Example 5.1.5 to compute a Green’s function. The suitable reflection of x for
a disk/ball is the inversion x∗ in the boundary. By this we mean the point in the
same direction as x but with
|x∗ |/a = a/|x|,
that is, x∗ = a²x/|x|². Set gx(y) = −Φ(y − x∗). Then gx is again harmonic in all D.
We check the boundary condition: if |y| = a we have

Φ(y − x) + gx(y) = (1/4π) ( ln|y − x|² − ln|y − x∗|² )
  = (1/4π) ln [ (a² + |x|² − 2⟨y, x⟩) / (a² + a⁴/|x|² − 2(a²/|x|²)⟨y, x⟩) ] = (1/2π) ln(|x|/a).

Correcting gx by this constant difference, we find the Green's function

G(y, x) = (1/2π) ln ( a|y − x| / (|x| |y − x∗|) ).

Using polar coordinates x = re^{iθ} and y = ρe^{iϕ}, we calculate ∂νG = ∂ρG at ρ = a,
and find that the solution to the Dirichlet problem is given by

u(re^{iθ}) = (a² − r²)/(2π) ∫₀^{2π} u(ae^{iϕ}) dϕ / (r² + a² − 2ar cos(ϕ − θ)).
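
The Poisson integral for the disk is straightforward to evaluate numerically. In the
sketch below (an added illustration; the boundary data is chosen so that the exact
harmonic extension is known), the boundary data cos ϕ extends to the harmonic
function r cos θ = x1.

```python
import numpy as np
from scipy.integrate import quad

def poisson_disk(bdata, r, theta, a=1.0):
    # u(r e^{i theta}) = (a^2 - r^2)/(2 pi) *
    #   int_0^{2 pi} bdata(phi) dphi / (r^2 + a^2 - 2 a r cos(phi - theta))
    f = lambda phi: bdata(phi)/(r**2 + a**2 - 2*a*r*np.cos(phi - theta))
    val, _ = quad(f, 0.0, 2*np.pi)
    return (a**2 - r**2)*val/(2*np.pi)

# Boundary data u(a e^{i phi}) = cos(phi); harmonic extension is r*cos(theta).
print(poisson_disk(np.cos, r=0.5, theta=0.0))  # ~0.5
```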

The function
P (y, x) = ∂ν G(y, x)
appearing in the solution formula for the Dirichlet problem is called the Poisson
kernel for D, or sometimes the harmonic measure on ∂D with pole at x. For any
domain, it has the following properties which can be verified in the examples above.

• It is a positive function: P(y, x) > 0 for all y ∈ ∂D and x ∈ D. This can be
shown using the maximum principle, which we prove in Section 5.2.

Figure 5.2: The Poisson kernel P = ∂ν G as a function of ϕ on the unit circle, for various
poles x in the unit disk.

• It has total mass one: ∫_{∂D} P(y, x) dS(y) = 1 for all x ∈ D. This follows from
Green's first identity, since ∆yG(y, x) = δx.

• When x ∈ D is near a point x′ ∈ ∂D, then P(y, x) is localized near y = x′,
that is, P(y, x) is small except for y ≈ x′. Therefore we expect P to converge
weakly to a Dirac delta distribution at x′ on ∂D, as x → x′.

This illustrates that the solution to the Dirichlet problem is obtained as weighted
averages of its boundary Dirichlet data, with P as a weight function.
We end this section with a symmetry property of Green’s functions, which is
maybe not that obvious in the examples above.

Proposition 5.1.7 (Symmetry). Let G(y, x) be a Green’s function for a domain D.


Then G(a, b) = G(b, a) for all a, b ∈ D, a ̸= b.

Proof. Let u(y) = G(y, a) and v(y) = G(y, b) and apply Green's second identity to
these functions to get

∫_D (u∆v − v∆u) dy = ∫_{∂D} (u∂νv − v∂νu) dS = 0,

since u = v = 0 on ∂D. But ∆u = δa and ∆v = δb , so u(b) = v(a) follows as


claimed.

5.2 Mean value and maximum theorems


This section concerns Laplace's equation as well as the heat equation, despite the title
of this chapter, and two properties that solutions to these PDEs share. We saw in

Figure 5.3: The Green’s function G for the domain D from Figure 1.3, with pole at
x = (2, 1). The neighbourhood near the pole where G < −0.1 is beyond the colormap.
G = 0 on ∂D, with the Poisson kernel P = ∂νy G peaking on ∂D south-east of the pole.

Section 5.1 that a harmonic function is a weighted average of its boundary values. In
particular at the center of a disk, this average is unweighted/uniform. Thus, at any
point, the value of a harmonic function equals the average over any circle centered
at this point. This is perhaps the best way to convey the intuition about harmonic
functions, and this mean value property actually characterizes harmonic functions.
Theorem 5.2.1 (Laplace mean value property). Let u be a C 2 function in some
domain D ⊂ Rn . Then u is a harmonic function if and only if
Z
1
u(x) = u(y)dS(y)
|∂B(x, r)| ∂B(x,r)

for all balls B(x, r) ⊂ D.


Proof. We consider the case n = 3. The proof for n = 2 is similar. Green’s third
identity for D = B(x, r) shows that
Z Z
u(y)/(4πr2 ) + ∂ν u(y)/(4πr) dS(y),

u(x) = Φ(y − x)∆u(y)dy +
B ∂B
R
since
R Φ = −1/(4πr) and ∂ν Φ = 1/(4πr2 ) for y ∈ ∂D. Moreover ∂B ∂ν udS =
B
∆udy by Green’s first identity. We obtain
Z Z
1
u(x) − u(y)dS(y) = (Φ(y − x) + 1/(4πr))∆u(y)dy,
|∂B(x, r)| ∂B(x,r) B

which again shows that any harmonic function has the mean value property. But
conversely, assume given a function u which has the mean value property so that
97

the left hand side is always zero. Assume, to reach a contradiction, that ∆u(x) ̸= 0
at some x ∈ D. Then choose r > 0 small enough so that the sign of u is the same
in all B(x, r). But then
Z
(Φ(y − x) + 1/(4πr))∆u(y)dy = 0
B

is impossible since Φ(y − x) + 1/(4πr) < 0 in all B. We conclude that u must be a


harmonic function.

The mean value property shows that a harmonic function is obtained by aver-
aging some boundary values until equilibrium is reached. In particular, since an
average of some set of values can never be larger than the maximum of these values,
harmonic functions have the following fundamental property.

Theorem 5.2.2 (Laplace maximum principle). Let u be a harmonic function on a


domain D ⊂ Rn . Assume that D is bounded and connected, and that u is continuous
on D. Consider the maximum M = maxD u, which exists by our assumptions.

(W) There exists a point x ∈ ∂D, where u(x) = M .

(S) If there exists x in the interior of D where u(x) = M , then u is the function
which is constant equal to M .

Traditionally, the maximum principle is presented in a weak form (W) and a


strong form (S) in this way. As the names suggest, the strong form implies the
weak form. Indeed, assume that we know (S). If there is no maximum at ∂D, then
there must be a maximum in the interior. But (S) then shows that u is a constant
function, and (W) follows trivially.

Proof. To prove (S), we use the mean value property, and proceed in two steps.
(1) Assume that there exists an interior point x = x0 in D where u(x0 ) = M .
Consider a ball B(x0 , r) ⊂ D centered at x0 and contained in D. Since u(y) ≤ M
for all y ∈ D, we have by the mean value property that
Z Z
1 1
M = u(x0 ) = u(y)dS(y) ≤ M dS(y) = M.
|∂B(x0 , r)| ∂B(x0 ,r) |∂B(x0 , r)| ∂B(x0 ,r)

Therefore the middle inequality must be an equality, which is only possible if u(y) =
M for all y ∈ ∂B(x0 , r).
Let R denote the minimal distance from x0 to ∂D. Repeating the above argument
for 0 < r < R shows that u(x) = M for all x ∈ B(x0 , R).
(2) Next, let x1 ∈ D be arbitrary. Let γ : [0, 1] → D be a curve with γ(0) = x0
and γ(1) = x1 . We showed in (1) that there exists 0 < t0 < 1 such that u(γ(t)) = M
for all 0 ≤ t ≤ t0 . Denote by T the supremum of such t0 . We claim that T = 1,
which will prove that u(x1 ) = M and conclude the proof. To reach a contraction,
assume that T < 1. Now repeat (1) with x0 replaced by x′ = γ(T ). By continuity
u(x′ ) = M and (1) shows that there exists ϵ > 0 such that u(γ(t)) = M for all
t ∈ (T − ϵ, T + ϵ). This contradicts the definition of T .
98

Figure 5.4: Iterating the result (1) from x0 to x1 .

Turning to our other two basic PDEs, one can see by simple examples that
nothing like a mean value property or a maximum principle can hold for the wave
equation. See exercises. However, for the heat equation similar results hold as we
shall now prove.
Theorem 5.2.3 (Heat maximum principle). Let D ⊂ Rn be bounded and connected
domain and consider the cylinder Ω = D × (0, T ), there 0 < T < ∞. Assume
that u = u(x, t) solves the heat equation ∂t u = k∆u in Ω, and that u is continuous
on Ω = D × [0, T ]. Consider the maximum M = maxΩ u, which exists by our
assumptions.
(W) There maximum M is attained either among the initial data at t = 0 or among
the boundary data at x ∈ ∂D.
(S) If there exists an interior point x0 ∈ D and 0 < t0 ≤ T where u(x0 , t0 ) = M ,
then u is constant equal to M for all t ≤ t0 .

Figure 5.5: (a) Maximum among initial data, due to injection of heat at t = 0. (b)
Maximum among boundary data, due to injection of heat at ∂D for some t > 0.

Again, (W) is seen to follow from (S). What differs the heat from the Laplace
maximum principle, is that there is a direction of time as the heat equation is
not reversible, as we saw in Section 3.3. For this reason the top of the cylinder
goes together with the interior, in the formulation of the maximum principle. It is
worthwhile the contemplate the physics behind this!
We shall prove (S) from a mean value property for the heat equation below, but
since this is now more technical, we first give a simpler and direct proof of (W). The
main idea of the latter proof is contained in the following lemma.
Lemma 5.2.4 (Heat max with sinks). Consider a solution v(x, t) to the inhomo-
geneous heat equation ∂t v = k∆v + f in the cylinder Ω, with sinks f < 0. As
in Theorem 5.2.3, we assume that v is continuous on the bounded and connected
cylinder Ω. Then v attains its maximum K = maxΩ v at, and only at, t = 0 or
x ∈ ∂D.
Proof. If K is attained at an interior point (x, t), then this must be a stationary
point so that ∂t v = 0 and ∇x v = 0. Moreover the quadratic form for v must be
negative definite or semidefinite, from which it follows that ∆v ≤ 0. It follows from
the PDE that
0 = ∂t v = k∆v + f ≤ f,
which contradicts our assumption f < 0.
If K is attained at (x, T ) at the top of the cylinder, we can still conclude that
∆v ≤ 0 along the top t = T , but for ∂t v we only have access to a one-sided derivative
since v is not defined for t > T . Nevertheless ∂t v ≥ 0, since u ≤ M for t < T , and
as above this contradicts our assumption f < 0.
99

Proof of (W). Fix ϵ > 0 and consider the auxiliary function


v(x, t) = u(x, t) + ϵ|x|2 . (5.3)
Using linearity, we have ∂t v − k∆v = (∂t u − k∆u) + (0 − kϵ2n) = −2nkϵ. Therefore
we can apply Lemma 5.2.4 to v with constant sinks f = −2nkϵ, and conclude that
maxΩ v is attained at t = 0 or x ∈ ∂D. We now conclude as follows.
ˆ Let M0 be the maximum of u at t = 0 or x ∈ ∂D.
ˆ By (5.3) we have v ≤ M0 +ϵL2 at t = 0 and at x ∈ ∂D, where L = maxx∈∂D |x|.
ˆ By Lemma 5.2.4, we have v ≤ M0 + ϵL2 in all Ω.
ˆ Since u ≤ v and ϵ > 0 is arbitrary, we have u ≤ M0 in all Ω.

To obtain a mean value formula for the heat equation, the key observation is
that in the Laplace mean value formula, the circle/sphere on which the mean value
is calculated, is a level set of the Laplace fundamental solution Φ(y − x), since this
is a radial function. Similarly, the heat mean value formula will use a level set of
the heat kernel. To simplify, we assume k = 1 for the remainder of this section. (If
k ̸= 1, a simple rescaling will reduce to the case k = 1.)
Lemma 5.2.5 (Heat kernel= fundamental solution). Consider Ht (x) from Defini-
tion 3.2.2, with k = 1, as a function of (x, t) ∈ Rn × R, where Ht (x) = 0 when
t < 0. Then we have the weak derivative
∂t Ht − ∆x Ht = δ(0,0) .
Proof. By calculating the classical derivative, we see that ∂t Ht − ∆Ht = 0 when
t > 0, and clearly this holds also for t < 0. Around at point (x, 0) with x ̸= 0,
one can show that Ht (x) is a C ∞ function of (x, t) with all derivatives vanishing at
t = 0 since the exponential function decay very fast as t → 0. It remains to see what
happens at the singularity at (0, 0). To this end, we fix a test function ϕ(x, t) and
compute
ZZ
2
⟨∂t Ht − ∆Ht , ϕ⟩ = ⟨Ht , −∂t ϕ − (−1) ∆ϕ⟩ = lim+ Ht (x)(−∂t ϕ − ∆ϕ)dxdt
ϵ→0 t>ϵ
Z ZZ
= lim+ Hϵ (x)ϕ(x, ϵ)dx + (∂t Ht (x) − ∆Ht (x))ϕ(x, t)dxdt
ϵ→0 Rn t>ϵ
√ √ n
Z Z
−n/2 −|z|2 ϕ(0, 0) 2
= lim+ (4πϵ) e ϕ(2 ϵz, ϵ)(2 ϵ) dz → n/2 e−|z| dz = ⟨δ(0,0) , ϕ⟩
ϵ→0 Rn π Rn

where we have integrated by barts in t and made a change of variables x = 2 ϵz.
Lemma 5.2.6 (Parabolic Green). Let D be a bounded smooth domain in spacetime
Rn+1 , and let f and g be sufficiently smooth on D. Write the normal outward unit
vector on ∂D as ν = (νx , νt ) where νx ∈ Rn and νt ∈ R. Then
Z Z
(f ∂νx g − g∂νx f + νt f g)dS(x, t) = (f (∆x g + ∂t g) − g(∆x f − ∂t f ))dxdt,
∂D D

where ∂νx f = ⟨νx , ∇x,t f ⟩ = ⟨ν, ∇x f ⟩.


100

Proof. Similar to Section 0.3, this integral identity follows by applying the divergence
theorem to the spacetime vector field

F = f ∇x g − g∇x f + et f g,

where et = (0, 1) is the time unit vector.

Theorem 5.2.7 (Heat mean value property). Assume that u(x, t) solves the heat
equation ∂t u = ∆u in a neighbourhood of (x0 , t0 ) ∈ Rn+1 . Define the parabolic ball
at the origin with diameter r to be

BP (r) = {(x, t) ; 0 < t < r, |x|2 < 2nt ln(r/t)} = {(x, t) ; Ht (x) > (4πr)−n/2 }.

Then for small r > 0, we have

|x|2 u(x0 − x, t0 − t)dS(x, t)


Z
1
u(x0 , t0 ) = p . (5.4)
(4πr)n/2 ∂BP (r) 4t2 |x|2 + (2nt − |x|2 )

Figure 5.6: The heat kernel Ht (x), n = 1, with level sets defining the parabolic balls
BP (r).

Proof. We apply Lemma 5.2.6, with Ht (x) as f (x, t) and u(x0 − x, t0 − t) as g(x, t),
and with D = BP (r). Then g solves the backward heat equation ∆x g + ∂t g = 0
and Lemma 5.2.5 shows that ∂t f − ∆x f = δ(0,0) . Moreover, since g is constant on
∂BP (r), we have
Z Z
(f ∂νx g + νt f g)dS(x, t) = f (∂νx g + νt g)dS(x, t)
∂BP (r) ∂BP (r)
ZZ
=g (∆x g + ∂t g)dxdt = 0.
BP (r)

For the boundary term g∂νx f in the parabolic Green identity, computations reveal
that

−∇x Ht (x) = (x/2t)Ht (x),


p
νx = x/ |x|2 + (|x|2 /2t − n)2 .
R
for (x, t) ∈ ∂BP (r). Therefore, on the right in (5.4) stands ∂BP (r) g∂νx f dS. Lemma 5.2.6
R
therefore completes the proof if we can show that BP (r) g(∂t f −∆x f )dxdt = u(x0 , t0 ).
Taking a closer look at f = Ht reveals that the Dirac delta δ(0,0) = ∂t f − ∆x f lies
“infinitesimally above” t = 0, and therefore integrating over BP (r), which has a
horizontal tangent plane at (0, 0) should indeed yield all of u(x0 , t0 ). We leave the
details of a rigorous proof, which involves for example integrating around the origin
on a small cylinder and taking a limit, to the reader.
101

Proof of Theorem 5.2.3(S). This follows from the heat mean value property, in a
way similar to how we derived the strong maximum principle from the mean value
property for harmonic functions.
Assume that u(x0 , t0 ) = M for some interior x0 ∈ D and 0 < t0 ≤ T . Write
Kr (x, t) for the integrand in (5.4), so that
Z
u(x0 , t0 ) = Kr (x, t)u(x0 − x, t0 − t)dS(x, t).
∂Bp (r)

Note that this also applies to R the function u = 1, since this also solves the heat
equation. This shows that ∂Bp (r) Kr (x, t)dS(x, t) = 1, and clearly Kr (x, t) ≥ 0,
so u(x0 , t0 ) is a weighted mean value of its values on ∂Bp (r). As in the proof of
Theorem 5.2.2(S), we can conclude that u(x, t) = M for all (x, t) ∈ BP (R), where
R is the supremum of r such that BP (r) ⊂ Ω.
Finally, if (x1 , t1 ) ∈ Ω is any point such that t1 < t0 , then we can find a curve
γ : [0, 1] → Ω such that γ(0) = (x0 , t0 ) and γ(1) = (x1 , t1 ), which has downward/past
pointing tangent vector γ ′ . Following the proof of Theorem 5.2.2(S), we can prove
that u(x1 , t1 ) = M . Note that the condition on γ ′ is needed since the mean values
of u are taken over a set below the point.

5.3 Analytic functions and Hardy splittings


We saw in Section 3.2 how to solve the basic IVPs on Rn for two of our main
PDEs, the heat and wave equations, by diagonlization with the Fourier transform
and reduction to ODEs. At first it seems impossible that something like this is
possible for the Laplace equation since it contains no time-variable. Nevertheless we
perform an entirely analogous calculation in this section to solve the Dirichlet and
Neumann BVPs for the Laplace equation on the upper half plane in R2 .
We make use of the following type of pairs of harmonic functions in the plane.
In complex analysis these pairs, interpreted as a complex-valued analytic function,
are studied.
Definition 5.3.1 (Cauchy–Riemann’s equations). A pair of functions v1 , v2 : D →
R, defined on an open set D ⊂ R2 , is said to be harmonic conjugate functions if
they satisfy the Cauchy–Riemann system of PDEs
(
∂x v1 = ∂y v2 ,
∂y v1 = −∂x v2 .

Basic relations between harmonic conjugate functions and harmonic functions


are the following.
If (v1 , v2 ) are harmonic conjugate functions, then each of the two functions is
harmonic. Indeed
∆v1 = ∂x (∂x v1 ) + ∂y (∂y v1 ) = ∂x ∂y v2 − ∂y (∂x v2 ) = 0,
∆v2 = ∂x (∂x v2 ) + ∂y (∂y v2 ) = −∂x ∂y v1 + ∂y (∂x v1 ) = 0,
102

by the equality of mixed derivatives.


Let D be a simply connected domain, and let (x0 , y0 ) ∈ D. Given a harmonic
function v1 , then the function
Z (x,y)
v2 (x, y) = (−∂y v1 dx + ∂x v1 dy)
(x0 ,y0 )

defines a harmonic conjugate function v2 to v1 , that is (v1 , v2 ) are harmonic conjugate


functions. The simply connectedness of D ensures that the integral does not depend
on curve of integration.
If v2′ is any other harmonic conjugate function to v1 , then v2′ − v2 is is constant.
Indeed, it follows from Cauchy–Riemann’s equations that ∇(v2′ − v2 ) = 0.
Given a harmonic function u, then the gradient vector field ∇u = (∂x u, ∂y u) is
a divergence and curl-free vector field. This amounts to the anti-Cauchy-Riemann
equations, where the minus sign is instead in the first equation. To conform to the
Cauchy–Riemann equations, we consider here the conjugate gradient vector field

(v1 , v2 ) := (∂x u, −∂y u)

to u. It is the following relation between the Laplace and Cauchy–Riemann equations


which we use here.
Lemma 5.3.2 (Conjugate gradients). If u is a harmonic function, then its conjugate
gradient (v1 , v2 ) solves the Cauchy–Riemann equations. Conversely, if (v1 , v2 ) solves
the Cauchy–Riemann equations and if D is simply connected, then there exists a
harmonic function u, unique up to constants, so that (v1 , v2 ) is the conjugate gradient
(v1 , v2 ) = (∂x u, −∂y u) of u.
Proof. We have
∂x v1 − ∂y v2 = ∂x ∂x u − ∂y (−∂y u) = 0
since ∆u = 0, and the second CR equation similarly holds by equality of mixed
derivatives.
For the converse, from the second CR equation we have

∂y (−v2 ) − ∂x v1 = 0,

that is (v1 , −v2 ) is a curl-free vector field. By vector calculus, there exists a scalar
potential u so that (v1 , −v2 ) = ∇u. From the first CR equation we conclude

∂x2 u + ∂y2 u = ∂x v1 + ∂y (−v2 ) = 0,

so this potential is indeed harmonic.


To simplify computations, we use complex algebra and consider the complex-
valued function
v := v1 + iv2
instead of the two real functions v1 and v2 . In terms of v, the Cauchy–Riemann
equations reads

∂y v = ∂y v1 + i∂y v2 = −∂x v2 + i∂x v1 = i∂x v.


103

We now regard the height y as our evolution variable, playing the role of time t in
Section 3.2. Writing
vy (x) = v(x, y), x ∈ R,
for the function v on height y, the Cauchy–Riemann equations is equivalent to the
vector-valued ODE
∂y vy = i∂x vy ,
for a function R → L2 (R).
To solve this, as in Section 3.2 we apply to each fixed y the Fourier transform in
the x-variable, and obtain
∂y v̂y (ξ) = −ξv̂y (ξ).
Now fix ξ and regard y as variable. The solution of the ODE is clearly

v̂y (ξ) = v̂0 (ξ)e−ξy . (5.5)

Here we encounter a new phenomenon as compared to the calculations in Section 3.2:


the function f (ξ) = e−ξy does not define a tempered distribution unless y = 0 and
so it is not possible to apply the inverse Fourier transform to f to obtain a solution
to Cauchy–Riemanns equations. For y > 0, f grows exponentially as ξ → −∞, and
for y < 0 we have that f grows exponentially as ξ → ∞. Recall Example 3.1.5.
However, this problem disappears if we choose the initial data v0 appropriately.
If v̂0 (ξ) = 0 for all ξ < 0, then we see from (5.5) that |vy | ≤ |v0 | for y ≥ 0. Conversely
if v̂0 (ξ) = 0 for all ξ > 0, then we see from (5.5) that |vy | ≤ |v0 | for y ≤ 0. If we
consider initial data v0 from the function space L2 (R), then the following subspaces
are relevant.

Definition 5.3.3 (Hardy subspaces). Consider the function space L2 (R) = L2 (R; C)
of square integrable complex-valued functions f : R → C. Define the upper Hardy
subspace to be

L+ ˆ
2 (R) := {f ∈ L2 (R) ; f (ξ) = 0 for all ξ < 0}.

Define the lower Hardy subspace to be

L− ˆ
2 (R) := {f ∈ L2 (R) ; f (ξ) = 0 for all ξ > 0}.

To inverse transform, we need the following.

Proposition 5.3.4 (Cauchy and Hilbert kernels). For y ̸= 0, define the functions

1 1
Cy (x) := , x ∈ R.
2π y − ix

Also define the tempered distribution


Z
i 1
⟨H, ϕ⟩ := lim+ ϕ(x)dx, ϕ ∈ S(R).
ϵ→0 π |x|>ϵ x
104

Then we have the following piecewise smooth functions as Fourier transforms .


(
e−yξ , ξ > 0,
F{Cy } = y > 0,
0, ξ < 0,
(
0, ξ > 0,
F{Cy } = −yξ
, y < 0,
−e , ξ < 0,
(
1, ξ > 0,
F{H} =
−1, ξ < 0.

Proof. For y > 0, we compute the inverse Fourier transform


Z ∞
1
e−yξ eixξ dξ. = Cy (x)
2π 0
For y < 0 the proof is similar. For H we first check that ⟨H, ϕ⟩ is well defined for
all test functions. To see this, we write
Z Z ∞
1 1
ϕ(x)dx = (ϕ(x) − ϕ(−x))dx,
|x|>ϵ x ϵ x
by splitting the integral and applying a change of variable. Here we see that the
limit ϵ → 0 is unproblematic since
(ϕ(x) − ϕ(−x))/x → 2ϕ′ (0), x → 0+ .
We now verify that F{H} = sgn(ξ) by taking a weak limit as y → 0+ as follows.
We have
Z Z
−y|ξ|
⟨Cy + C−y , ϕ̂⟩ = sgn(ξ)e ϕ(ξ)dξ → sgn(ξ)ϕ(ξ)dξ, y → 0+ .

But also, again by a change of variables


Z ∞ Z ∞
i x i1
⟨Cy +C−y , ϕ̂⟩ = 2 2
(ϕ̂(x)−ϕ̂(−x))dx → (ϕ̂(x)−ϕ̂(−x))dx = ⟨H, ϕ̂⟩,
0 πy +x 0 πx
as y → 0+ .
We now return to (5.5), where we already noted that for general initial data
v(x, 0), we have no solution to the CR IVP, not for any y > 0 and not for any y < 0.
Consider now v0 ∈ L+ 2 (R) belonging to the upper Hardy subspace. In terms of
Hilbert kernel, this means that
H ∗ v0 = v0 ,
as is seen by checking the Fourier transform of this equation. We see from (5.5) that
v̂y (ξ) = Ĉy (ξ)v̂0 (ξ), so we obtain the solution
v(x, y) = Cy ∗ v0 (x)
to the CR IVP for y > 0 with v(x, 0) = v0 (x).
A similar calculation shows that the CR IVP is solvable for y < 0 when v0 ∈

L2 (R). We summarize our findings.
105

Theorem 5.3.5 (Cauchy integrals). We have an orthogonal splitting



L2 (R) = L+
2 (R) ⊕ L2 (R)

of our function space L2 (R) into the two subspaces L± 2 (R) from Definition 5.3.3.
If v0 ∈ L+
2 (R), or equivalently when H ∗ v0 = v0 then the CR IVP is solvable
,
for y > 0, with solution
Z
1 v(z, 0)
v(x, y) = dz, x ∈ R, y > 0,
2πi z − (x + iy)
decaying as y → +∞.
If v0 ∈ L−
2 (R), or equivalently when H ∗ v0 = −v0 , then the CR IVP is solvable
for y < 0, with solution
Z
i v(z, 0)
v(x, y) = − dz, x ∈ R, y < 0,
π z − (x + iy)
decaying as y → −∞.
The reader with knowledge in complex analys recognize these formulas as the
Cauchy integral formulas for analytic functions.
A comparison with the heat and wave equations is in order.

ˆ The IVP for the heat equation is solvable forward in time t > 0, but not
backwards in time.
ˆ The IVP for the wave equation is solvable both forward and backward in time.

ˆ Theorem 5.3.5 shows that the IVP for the Cauchy–Riemann equations/Laplace
equation is solvable upwards in space for initial data in L+
2 (R) and downward

in space for initial data in L2 (R).

Returning to the Laplace equation, we show next how to solve the Dirichlet and
Neumann BVPs using Theorem 5.3.5. We need the following observation.
Lemma 5.3.6. A function f : R → C is real-valued if and only if its Fourier
transform fˆ is conjugate-symmetric, that is

fˆ(−ξ) = fˆ(ξ), ξ ∈ R.

Example 5.3.7 (Neumann problem on the upper half plane). We wish to solve the
Neumann problem (
∆u(x, y) = 0, y > 0, x ∈ R,
∂y u(x, 0) = g(x), x ∈ R.
We assume that the Neumann data g ∈ L2 (R). For the solution, consider the
conjugate gradient vector field v = (v1 , v2 ) = (∂x u, −∂y u) as above. Since we want
this to solve the CR equations for y > 0, we look for v0 ∈ L+ 2 (R). To match the
boundary data, we want this to have the imaginary part

Im v(x, 0) = −g(x), x ∈ R.
106

So we seek a real-valued function f so that v0 = f − ig has Fourier transform

fˆ(ξ) − iĝ(ξ) = 0, for all ξ < 0.

There exists a unique such f : for ξ < 0 we must have fˆ(ξ) = iĝ(ξ), whereas for ξ > 0
since f and g are real-valued functions we have fˆ(ξ) = fˆ(−ξ) = iĝ(−ξ) = −iĝ(ξ).
We obtain the inital data
(
−2iĝ(ξ), ξ > 0,
v̂0 (ξ) =
0, ξ < 0,

and Theorem 5.3.5 gives the solution u(x, y) to the Neumann problem with conjugate
gradient Z
1 g(z)
v(x, y) = Cy ∗ v0 (x) = dz.
π (x + iy) − z
Recall that the solution u to the Neumann problem is unique only up to constants.
We obtain formulas for the partial derivative
x−z
Z
1
∂x u(x, y) = g(z) dz,
π (x − z)2 + y 2
Z
1 y
∂y u(x, y) = g(z) dz,
π (x − z)2 + y 2
where we recognize the second one to be the Poisson integral for the half plane
applied to ∂y u.

5.4 Exercises
1. Show that for the half-plane D = {(y1 , y2 ) ∈ R2 ; y2 > 0}, the solution to the
Dirichlet problem is given by
Z
1 u(y1 , 0)dy1
u(x1 , x2 ) = .
π R (y1 − x1 )2 + x22

2. Show that for a ball D = {(y1 , y2 , y3 ) ∈ R3 ; y12 + y22 + y32 < a2 }, the solution
to the Dirichlet problem is given by
a2 − |x|2
Z
u(y)dS(y)
u(x1 , x2 , x3 ) = 3
.
4πa |y|=a |y − x|

3. Show that for a given domain D, there is at most one Green’s function.

4. Write the solution to the Poisson Dirichlet BVP


(
∆u = f, x ∈ D,
u = g, x ∈ ∂D

in terms of the Green’s function.


107

5. Find the Green’s function for the one-dimensional interval D = (0, L).
6. Consider the solution formula (5.2) for the Dirichlet problem on the half-space.
Assume that u(y1 , y2 , 0) = 0 when y12 + y22 > R2 for some R < ∞. Show that
u → 0 as x → ∞.
7. Consider the solution formula (5.2) for the Dirichlet problem on the half-space.
Assume that g(y1 , y2 ) = u(y1 , y2 , 0) is a continuous and bounded function.
Show that limx3 →0 u(x1 , x2 , x3 ) = g(x1 , x2 ).
8. The definition of a Neumann function N (y, x) = Φ(y − x) + nx (y) for a domain
D is similar to Definition 5.1.4, but replacing the boundary condition by ∂ν N =
c on ∂D. Show the constant c must be chosen as 1/area of ∂D, for N to exist.
Use Green’s formulas to express the solution to the Neumann problem in term
of N .
9. The Kelvin inversion formula for harmonic functions states that if ∆u = 0 in
a neighbourhood of x = a, then
1
v(x) = u(x/|x|2 )
|x|n−2
satisfies ∆v = 0 in a neighbourhood of x = a/|a|2 . Prove this for n = 3.
Hint: A direct calculation is possible but not pleasant. Use Green’s formula
to reduce to the case when u(x) = Φ(x − p) for some p ∈ Rn .
10. Let u be the harmonic function on the disk x2 + y 2 < 4 with Dirichlet data
u = 3 sin(2θ) + 1. Without computing u, find the maximum value of u on the
closed disk, and the value u(0, 0).
11. Consider the solution u(t, x) = 1 − x2 − 2kt to the heat equation. Find its max
and min on 0 ≤ x ≤ 1, 0 ≤ t ≤ T .
12. Consider the heat IBVP

2
∂t u = ∂x u,
 0 < x < 1, t > 0,
u = 0, x = 0 and x = 1,

u = 4x(1 − x), t = 0.

(a) Show that 0 < u < 1 for all 0 < x < 1, t > 0.
(b) Show that u(t, 1 − x) = u(t, x) for all 0 < x < 1, t > 0.
13. Prove that if u and v solve the heat equation and u ≤ v at t = 0, at x = 0 and
at x = L, then u ≤ v for all t > 0, 0 < x < L.
14. Find an example which shows that there is no maximum principle for the wave
equation.
15. Assume that ∂t u = ∂x2 u for 0 < x < a, 0 < t < T , and that ∂x u = 0 for x = 0,
0 < t < T . Show that u attains a maximum either at t = 0 or x = a. What is
the physics? Hint: Make an even reflection at x = 0.
108

16. Solve the CR BVP


(
∆u(x, y) = 0, y > 0, x ∈ R,
∂x u(x, 0) = g(x), x ∈ R.

similar to what was done in Example 5.3.7 for the Neumann problem. What
is the relation to the Dirichlet BVP for the Laplace equation on the upper half
plane?
Chapter 6

Boundary integral equations

109
110
Chapter 7

Beyond our basic PDE problems

In this course, we have studied essentially only three PDEs. The reason is that a
general PDE, to some extent, can be reduced to one of the basic ones: the heat,
wave and Laplace PDE, as we see in Section 7.1. A general PDE may be different
in many ways: it may be of higher order, it may be a system involving several
PDEs and unknown functions, it may be non-linear or it may have variable, even
non-smooth, coefficients. Linear PDE problem in smooth domains with constants
coefficents like k and c2 appearing in the heat and wave equations, the topic of
a first PDE course like this, have been well understood for more than a century.
The modern PDE research often concerns non-linear PDEs or linear PDE problems
with non-smooth coefficients, or in domains with non-smooth boundaries ∂D. Other
important modern problems include free boundary problems, where the domain D
itself is unknown and evolve with the PDE, and inverse problems where one knows
the solutions to the PDE problem, for many data, and seek the coefficients or the
domain.

7.1 Classification of second order PDEs

Reading:

ˆ Strauss 1.6

ˆ Lecture notes (last 4 pages): http://www.math.chalmers.se/


~rosenan/part1.pdf
What do we mean by an elliptic, hyperbolic or parabolic second
order PDE? What are the standard examples of PDEs from these
three classes respectively? How do we reduce a given second
order PDE to one of the three main PDEs?

111
112

7.2 Dirac type PDE systems

Recommended reading:

ˆ Strauss 13.1, 13.5

ˆ Lecture notes: http://www.math.chalmers.se/~rosenan/part5.


pdf

What is a Dirac type PDE?

In this course we have only considered mainly scalar PDEs, that is the unknown
function u(x1 , . . . , xn ) is scalar-valued. As a consequence the basic PDEs we have
studied are of second order. Many important PDEs, in particular those appearing in
physics, are however not scalar, that is there are more than one unknown function
appearing in the PDE. In this case it is often first order PDE systems that are the
fundamental ones, and one important class of such is the following.
Definition 7.2.1 (Dirac type PDE system). A first order linear system of PDEs is
said to be of elliptic Dirac type if the matrix with the first order derivatives squares
to a diagonal matrix with diagonal elements all being the Laplace operator ∆.
A first order linear system of PDEs is said to be of hyperbolic Dirac type, with
propagation speed c, if the matrix with the first order derivatives squares to a diag-
onal matrix with diagonal elements all being the d’Alembert operator c−2 ∂t2 − ∆.
Since both the first order derivative ∂t and the second order derivatives ∆ appear
in the heat equation, we do not normally consider any parabolic Dirac type PDE
systems.
This concept is best understood by examples. The first one we have already
encountered in Section 5.3.
Example 7.2.2 (Cauchy–Riemann= elliptic Dirac). The Cauchy–Riemann equa-
tions from Definition 5.3.1, written in matrix form, read
    
−∂x ∂y v1 0
=
∂y ∂x v2 0

These form an elliptic Dirac system since


    
−∂x ∂y −∂x ∂y ∆ 0
=
∂y ∂x ∂y ∂x 0 ∆

Example 7.2.3 (Maxwell≈hyperbolic Dirac). Maxwell’s equations read

∇·E = ρ, (7.1)
−1
c ∂t E − ∇ × B = −J, (7.2)
c−1 ∂t B + ∇ × E = 0, (7.3)
∇·B = 0, (7.4)
113

and concern six unknown functions depending on position x ∈ R3 and time t ∈ R,


grouped into the electric vector field E = (E1 , E2 , E3 ) and the magnetic vector field
B = (B1 , B2 , B3 ). The source terms are the charge density ρ and electric current
density J, which are assumed to satisfy the continuity equation c−1 ∂t ρ + ∇ · J = 0.
We have here rescaled E, B, ρ and J to avoid unnecessary technical details, leaving
only the constant c, the speed of light. Recall from vector calculus the three nabla
operators

∇f = gradf,
∇ · F = divF,
∇ × F = curlF.

We aim to show that Maxwell’s equations in a certain sense form a hyperbolic


Dirac system of PDEs. To start with, we approach Maxwell’s equations in the
classical way and observe that the two Gauss equations (7.1) and (7.4) follow from
Maxwell-Ampere’s and Farday’s equations (7.2) and (7.3). This is seen by applying
the divergence to the latter two equations. Written in matrix form, the Maxwell’s
equations therefore can be summarized as
 −1    
−c ∂t curl E J
= .
curl c−1 ∂t B 0

We now observe that Maxwell’s equations essentially are a hyperbolic Dirac type
system. For this we use the identity

∆F = grad(divF ) − curl(curlF ) (7.5)

from vector calculus, where ∆ acts on each component function of the vector field
F . In absense of sources ρ = J = 0, using the Gauss equations divE = divB = 0
we obtain that
 −1
−c ∂t curl −c−1 ∂t curl
     −2 2  
E c ∂t − ∆ 0 E
−1 −1 = −2 2 .
curl c ∂t curl c ∂t B 0 c ∂t − ∆ B

Therefore, at least for divergence free vector fields E and B, Maxwell’s equations is
a hyperbolic Dirac type system.
To avoid the problem with the divergence free constraints, another approch which
we shall use is to add two auxiliary scalar functions f and g and consider the 8 × 8
system of PDEs
 −1    
c ∂t div 0 0 f ρ
−grad −c−1 ∂t curl 0  E  J 
   

 0 −1 = . (7.6)
curl c ∂t grad  B   0 
0 0 −div −c−1 ∂t g 0

Maxwell’s equations is the special case when f = g = 0, in which case the first and
last equations are the Gauss equations. Using (7.5), we see that this is indeed a
hyperbolic 8 × 8 Dirac system of PDEs.
114

Elliptic Dirac systems share many properties with the Laplace equation, and
hyperbolic Dirac systems share many properties with the wave equation. As an
example of this, we solve the Maxwell IVP on R3 by adapting Example 3.2.3.

Example 7.2.4 (Maxwell IVP on R3 ). We aim to solve the IVP on R3 for the
hyperbolic 8 × 8 Dirac system (7.6), where ρ(t, x) and J(t, x) are prescribed sources
that we assume satisfy the continuity equation c−1 ∂t ρ + ∇ · J = 0, assuming initial
conditions 

 f (0, x) = 0, x ∈ R3 ,
E(0, x) = E (x), x ∈ R3 ,

0


 B(0, x) = B0 (x), x ∈ R3 ,
x ∈ R3 ,

g(0, x) = 0,
where E0 and B0 are given vector fields with divE0 (x) = ρ(0, x) and divB0 = 0.
We start by writing (7.6) as the vector valued ODE

c−1 ∂t Ft + M (∇)Ft = jt ,

where  
0 div 0 0
grad 0 −curl 0 
M (∇) = 
 0

curl 0 grad
0 0 div 0
and    
ft ρt
Et  −Jt 
Ft = 
Bt  ,
 jt = 
 0 .

gt 0
Following our usual strategy we apply, for each t > 0, the Fourier transform and
obtain
c−1 ∂t F̂t + iM (ξ)F̂t = ĵt , (7.7)
where  ˆ  
ft ξ · Êt
Êt  ξ fˆt − ξ × B̂t 

M (ξ)  =
B̂  ξ × Êt + ξĝt 
 .
t 
ĝt ξ · B̂t
For each fixed ξ ∈ R3 , we now solve the ODE (7.7) in the variable t. Unlike the
case for scalar equation, we note that the Fourier transform does not completely
diagonalize our Dirac system of PDEs, but has transformed it into 8 × 8 systems
of ODEs. These can be further diagonalized as in Example 1.1.2 to scalar ODEs.
However, the Dirac structure provide a way to simplify this calculation as follows.
We note that M (∇) is an elliptic Dirac system and that the square of the matrix
M (ξ) is |ξ|2 times the identity matrix. A matrix valued integrating factor for (7.7)
is the matrix
sin(c|ξ|t)
eicM (ξ)t := cos(c|ξ|t)I + icM (ξ).
c|ξ|
115

(Just like the classical formula eit = cos(t) + i sin(t) this is justified by Taylor series
expansion, using that (icM (ξ))2 = −(c|ξ|)2 I.) Multiplying (7.7) by this matrix and
c yields
∂s (eicM (ξ)s F̂s ) = ceicM (ξ)s ĵs ,
where we have changed the dummy variable from t to s. Integration over 0 < s < t
and multiplication by e−icM (ξ)t gives
Z t
−icM (ξ)t
F̂t = e F̂0 + c e−icM (ξ)(t−s) ĵs ds. (7.8)
0

Finally we apply the inverse Fourier transform to this solution formula. The simplest
way to do this is to observe that the inverse transform of
sin(c|ξ|t)
e−ictM (ξ) F̂ := cos(c|ξ|t)F̂ − icM (ξ) F̂
c|ξ|
is ZZ
−ctM (∇) 1
e F (x) := (∂t − cM (∇)) F (y)dS(y), (7.9)
4πc2 t |y−x|=ct

using the Riemann function for R3 from Proposition 3.2.4.


To summarize, the solution to the Maxwell–Dirac IVP considered in this example
is given by Z t
F (t, x) = e−ctM (∇) F0 (x) + c e−c(t−s)M (∇) js (x)ds. (7.10)
0
This solution F = [f, E, B, g]T is in fact a Maxwell field in the sense that f = g = 0
for all t > 0. To see this we compute in (7.8) the first component of F̂t to be
Z t
sin(c|ξ|t) sin(c|ξ|(t − s))
− icξ · Ê0 (ξ) + c (cos(c|ξ|(t − s))ρ̂s (ξ) − icξ · Jˆs (ξ))ds.
c|ξ| 0 c|ξ|
By assumption iξ · Ê0 = ρ̂0 and icξ · Jˆs = −∂s ρ̂s , from which we can conclude that
f = 0. Similarly the fourth component g is given by the same formula but with
E0 , ρs and Js replaced by B0 , 0 and 0. This shows that g = 0 since we assume
divB0 = 0.
Finally we remark that the propagation speed is c and Huygens principle holds
for Maxwell’s equation: this is seen from the solution formulas (7.9) and (7.10),
which build on the Riemann function for R3 .

7.3 Calculus of variations

Recommended reading:

ˆ Strauss 14.3.

ˆ Lecture notes: http://www.math.chalmers.se/~rosenan/part5.


pdf

How does Euler equations generalize the Laplace equation in


Section 2.3.
116

7.4 Curvature flows

7.5 Non-linear wave equations

Recommended reading:

ˆ Strauss 13.2.

ˆ Lecture notes: http://www.math.chalmers.se/~rosenan/part5.


pdf

How does Navier-Stokes equations generalize the wave equation?

7.6 Exercises
1. What is the type of each of the following PDEs?

(a) ∂x2 u − ∂x ∂y u + 2∂y u + ∂y2 u − 3∂y ∂x u + 4u = 0


(b) 9∂x2 u + 6∂x ∂y u + ∂y2 u + ∂x u = 0

2. Sketch the regions in xy-plane, where

(1 + x)∂x2 u + 2xy∂x ∂y u − y 2 ∂y2 u = 0

is elliptic, hyperbolic or parabolic respectively.

3. Why is one IC, specifying u(0, x), needed for the heat IVP to be well posed?
Why are two ICs, specifying u(0, x) and ∂ν u(0, x), needed for the heat IVP
to be well posed? Why is one BC, specifying either u|∂D or ∂ν u|∂D but not
both, needed for the Laplace BVP to be well posed? Hint: Example 5.3.7 is
relevant.

4. Write out the second and third component of the solution formula (7.10) and
obtain a formula for Et and Bt , given E0 , B0 , ρs and Js , for 0 < s < t.

5. Consider Maxwell’s equations from Example 7.2.3.

(a) Use vector calculus to deduce that there exists a vector potential A and
a scalar potential u such that ∇ × A = B and −∇u = E + c−1 ∂t A.
(b) Show that A and u are not uniquely determined by E and B. Indeed,
show that given any function λ, a gauge, we can equally well use the
vector potential A + ∇λ together with the scalar potential u − c−1 ∂t λ.
(c) Show that there exists a gauge λ such that the new A and u satisfy
c−1 ∂t u + ∇ · A = 0.
117

(d) Write D for the system of partial differential operators defined by the
matrix in (7.6). Show that D[u, A, 0, 0]T = [0, E, B, 0] in the gauge above.
Conclude that each component of u, A, E and B solves the wave equation
with a source. Which are the source terms?
Rbp
6. Solve the Euler-Lagrange equation for the minizer of a 1 + (u′ (x))2 dx. De-
duce that the shortest path between two point in the plane is a straight line.
Rb p
7. Find the minimal surfaces of revolution. Hints: The area is a 2πu 1 + (u′ (x))2 dx.
p
The primitive u(x)/ 1 + (u′ (x))2 may be useful when integrating the ODE.
118
Appendix A

Answers to Exercises

Problem session 1:

0.1 (a) α > −1/2 (b) α > −1 (c) α > −3/2



π/2,
 x > 0,
0.2 f (x) = 0, x = 0,

−π/2, x < 0,

R∞
∥fk − f ∥2L2 = 2k −1 1/k (arctan y)2 dy/y 2 → 0
V cannot be a Hilbert space since fk ∈ V but f ∈
/ V.
 
−1 2 −1
0.3 T =
1 2
 
−1 1 0
D = T AT =
0 2

1.1 u1 (t) = 2C1 et − C2 e2t , u2 (t) = C1 et + 2C2 e2t

1.2 u1 (t) = (C1 + C2 t)et , u2 (t) = (C1 + C2 + C2 t)et

1.7 With u = u0 + u1 and u0 given as explained, the equivalent PDE problem for
u1 is ∂t u1 = k∆u1 , u1 = 0 at t = 0, and u1 = h̃ on ∂D, where h̃ = h − u0 .

Problem session 2:

2x,
 x < 1,

2.1 f = 2δ1 − 4δ2 + g, where g(x) = 2x + 2, 1 < x < 2, is the classical deriva-

2, x>2

tive. g does not exist at x = 1 and at x = 2.
R1
2.2 ∂x f is the distribution whose ϕ-average is −1 (ϕ(−1, y)−ϕ(1, y))dy. Geometri-
cally: A Dirac wall along {−1}×(−1, 1) minus a Dirac wall along {1}×(−1, 1).

2.3 A dipole −2δ0′ with strength (−)2 at x = 0.

119
120

g(x − ct)ϕ′′tt (x, t)dxdt = c2 g(x − ct)ϕ′′xx (x, t)dxdt for all
RR RR
2.4 Need to show
test functions ϕ(x, t).
RR RR
2.5 Need to show y>0 u(x, y)∆ϕ(x, y)dxdy = y<0 u(x, −y)∆ϕ(x, y)dxdy for all
test functions ϕ(x, y) on the disk.

2.6 (a) α > 1/2 (b) α > 0 (c) α > −1/2 (For a complete solution you should show
that the weak derivative does not contain any Dirac delta at the origin.)

2.7 RGauss applied to aRR


vector field uvek along the coordinate basis vector ek reads
∂D
uv⟨n, ek ⟩dS = D ((∂xk u)v + u(∂xk v))dx.

2.8 D = {x ∈ R2 ; |x| < 1}, f1 (x) = 0, f2 (x) = 7ek(r−1) . For k large, these are
close in L2 (D), but still their boundary values are far apart.
1
RR R RR
2.9 V
R = H (D), a(u, ϕ) = D
⟨∇u, ∇ϕ⟩dx + γ ∂D
uϕdS, L(ϕ) = D
f ϕdx +
∂D
gϕdS.
RR RR
2.10 V = {u ∈ H 1 (D) ; u|ΓD = 0}, a(u, ϕ) = D ⟨∇u, ∇ϕ⟩dx, L(ϕ) = D f ϕdx.
RR RR
2.11 V = H01 (D), a(u, ϕ) = D ⟨B(x)∇u(x), ∇ϕ(x)⟩dx, L(ϕ) = D f ϕdx. The
Lax-Milgram hypothesis is satisfied if B(x) are uniformly bounded and positive
definite matrices.

Problem session 3:

3.1 k > −2 when n = 2, k > −3 when n = 3.

3.2 fˆk (ξ) = 2 sin(kξ)/ξ. Note that if fk → f weakly, then fˆk → fˆ weakly. Why?

3.4 δa δb = 0 if a ̸= b, and δa2 is not defined. δa ∗ δb = δa+b is defined for all a, b.

3.16 Reason: v is not a smooth function.

Problem session 4:

4.15 Physics: no max at an insulated end-point.

4.16 ∂x u(x, y) = π1 g(z) (x−z)y2 +y2 dz, ∂y u(x, y) = 1 x−z


R R
π
g(z) (x−z) 2 +y 2 dz.
Appendix B

Instructions

B.1 The written exam


The written exam on the course will consist of 7 questions in total.

ˆ One question will be to account for some of the Definitions that we have
encountered in the course. Typical questions may be “what is a tempered
distribution?” or “what do we mean by a Green’s function?”. It is impor-
tant that you understand the difference between a Definition (Where it is
explained the precise meaning of X. No proofs relevant here!) and a The-
orem/Proposition/Lemma. (Where we prove something about well defined
things.)

ˆ One question will be to account for the proof of a Theorem/Proposition/Lemma


covered in the course, from the list given below.

ˆ The remaining questions will be on the material covered in the compendium


R, and the parts of Strauss book listed in R. It is described in R which parts
are required only for the higher grades. Typical questions/problems include
solving ODE systems by diagonalization, solving first order PDEs by changes of
variables, calculating fundamental Fourier transforms for PDEs (Ht , Rt , Φ),
solving IVPs and IBVPs by the method of vector-valued ODEs, computing
weak derivatives and limits of distributions, finding an equivalent variational
formulation of a BVP, computing/using Green’s functions, using mean value
and maximum theorems, using/computing Laplace eigenfunctions and values,
classifying second order PDEs.

List of required proofs:

ˆ Functions as distributions: Proposition 2.1.5

ˆ Sobolev trace: Proposition 2.2.3

ˆ Lax-Milgram: Theorem 2.3.5

ˆ Variational vs. minimization problem: Lemma 2.3.6

121
122

ˆ Schwartz test functions: Lemma 3.1.2

ˆ Conservation of energy: Proposition 3.3.2

ˆ Minimum principle for first eigenvalue: Proposition 4.1.5, part (1) of proof

ˆ Symmetry of Green’s function: Proposition 5.1.7

ˆ Weak max for heat: Theorem 5.2.3(W)+Lemma 5.2.4

ˆ Mean give max for Laplace: Theorem 5.2.2

B.2 The FEM projects


The following guidelines have to be respected when carrying out the projects.
1. The projects should be carried out in groups of two. The students form the
groups autonomously. If there are problems, for example if someone does not
find a partner, a solution will be found together with the supervisor. Groups
of one or three are okay as exceptions.
2. Each team chooses one of the projects described in Appendix C and works
on it according to the exercises stated. Matlab must be used, following the
instructions in Appendix D. The project is mandatory and the students can
receive up to five bonus points for the exam.
3. The teams have the possibility to propose an own project to the supervisor
that can be carried out instead of the projects described in this text.
4. The result of the project will be a report containing
(a) a description of the physical problem and how it is modelled by a bound-
ary value problem,
(b) the weak formulation of the boundary value problem,
(c) the process of discretization of the problem,
(d) an explanation of the numerical methods and the code,
(e) a section where the code is tested and the boundary value problem is
solved containing some graphs,
(f) and finally the Matlab - source code (equipped with comments).
5. Deadline for handing in the report is Wednesday, December 22nd 2021, 23.59
PM. The reports should be sent to the supervisor via e-mail.
6. The reports can be written in Swedish or English.
7. The students get help with their projects in meetings with the supervisor.
They are encouraged to book at least one meeting with the supervisor. Before
the first meeting the students should have formed a team, selected a project
and written down the weak formulation.
123

8. (a) Report containing the points (a) - (f) from above with no severe mistakes:
passed, 0 bonus points.
(b) Entirely correct and complete presentation of the theoretical part: 1
bonus point.
(c) Professional presentation of the result with proper images and discussion
about errors: 1 bonus point.
(d) One obtains further bonus points by completing the extra exercises in the
project descriptions in Appendix C.
(e) From the 3rd meeting with the supervisor onwards: −1 bonus point per
meeting.
124
Appendix C

FEM projects

As part of the course, you are supposed to solve one, and only one, of the three
projects described below in this appendix.1 .
Before getting started on your project, you are supposed to read through Sec-
tions 2.4 and 4.3, and Appendix D. Matlab should be used, following Appendix D.
Note that you are only allowed to use the PDE toolbox for creating triangulations
and for plotting, and not for computing matrices and boundary conditions.

C.1 Heat equilibrium


In this project we wish to simulate heat conduction in a non-insulated water hose
filled with water that is assumed to be at rest. The surrounding of the hose has a
temperature of u0 K. We assume that the water in the hose is hit by microwaves
that cause the water to heat up. The microwaves act as heat sources because their
energy turns into heat when the waves are absorbed by the water. This heat source
is described by a function f (x). We consider a cross section of the water hose and
consider the stationary heat equation. Let the hose’s cross section be described by
a domain D and let u : D → R be the heat distribution. Then the temperature
distribution is described by the boundary value problem
(
−div(a∇u(x)) = f (x), x ∈ D,
(C.1)
a∂ν u(x) = c(u0 − u(x)), x ∈ ∂D.

Here a is the thermal conductivity coefficient of water and c is the heat conductiv-
ity coefficient of the hose’s walls. Typical values are awater = 0.6 W/K and c = 10
W/(dm K). Furthermore ν is the outward pointing unit normal vector to the bound-
ary ∂D of the domain and we use dm as unit of length. We assume that the cross
section of the hose is a circle of radius 1 dm.
1
The first and last project was developed by Christoffer Cromvik 2005 and the second by Fredrik
Lindgren 2009. They have also been further developed by Matteo Molteni, Maximilian Thaller and
Andreas Rosén since then.

125
126

Basic exercises (minimum requirement for pass)


1. Find the weak formulation of the boundary value problem and write a Matlab
program that solves the BVP using FEM. Take for now a, c, f and u0 as
constants, but allow D to be an arbitrary domain (although we will always
use the above disk D).
Report: Code for functions IntMatrix, BdyMatrix, IntVector and BdyVector.
Computations of integrals by hand.

2. We next check the code against an analytic solution. Consider the function
u(x, y) = 325 − 20(x2 + y 2 ), a = 0.6 and u0 = 300, and find the corresponding
values for c and f . Then use your code to calculate u numerically, for these
a, f, c and u0 . Check your code if the error does not seem to converge to 0.
(Hint: Carefully specify your circle with pdecirc.)
Report: Maximum absolute errors of the solution, for different meshes.

3. We now consider the following non-linear PDE problem. Set the outside tem-
perature to u0 = 250 and c = 10. Ice has a different thermal conductivity
coefficient than water, namely aice = 2.2 W/K. So a depends on the tempera-
ture u now: (
2.2, u < 273,
a(u) = (C.2)
0.6, u > 273.
This gives a non-linear partial differential equation that we can solve with a
fixed point iteration. The fixed point iteration is the following procedure: As-
sume first a = 0.6 everywhere, then solve the boundary value problem. Change
afterwards a to 2.2 at all node points where u < 273. Solve the boundary value
problem again and adjust a according to the values of u. Repeat this proce-
dure as long as a has to be adjusted. What constant value of the heat source
function f describing the radiation is needed so that in the equilibrium state
we have 50% ice and 50% water of the area cross section?
Report: The value of f and plot of the non-linear solution u.

Extra exercises (1 bonus point each)


4. Consider a = 0.6, c = 10 and u0 = 300. Implement non-constant f using the
quadrature (D.2). Consider the function u(x, y) = 325 − 20(x2 + y 2 )2 , a = 0.6
and u0 = 300, and find the corresponding values for c and f . Then use your
code to calculate u numerically, for these a, f, c and u0 . Check your code if
the error does not seem to converge to 0.
Report: Code for IntVector, and error estimates for different meshes.

5. Consider a = 0.6, u0 = 300 and f (x, y) = 100 sin(6y). Vary the value of c and
compute the solution u. What happens when (a) c → 0+ (b) c → +∞ and (c)
c < 0.
Report: Plots, with mathematical and physical explanations of the results.
127

6. Consider a = 0.6, c = 10 and u0 = 300. Rather than a constant f , a realistic


physical model for the absorption and scattering of microwaves in water is the
Beer-Lambert law: f = f0 exp(−µd), where f0 is the incident intensity, µ is
the attenuation and d is the distance that the microwave has travelled in the
hose. Choose a source incident from the right and µ = ln(2). Compute the
heat distribution.
Report: The value of f0 which gives an average temperature 320 in the hose.

C.2 The sound of drums


The resonance frequencies of a drum, with a shape described by a domain D ⊂ R2 ,
are the eigenvalues λ for a problem
(
−∆u + V (x)u = λu, x ∈ D,
(C.3)
u = 0, x ∈ ∂D,

when V = 0. Adding a potential V (x) yields a Schrödinger operator, and the


corresponding eigenvalues play a central role in quantum mechanics. The project
concerns the computation of the eigenvalues, and how these depend on the domain
D, the potential V and the choice of boundary conditions.
Note that Section 2.4 concerns the solution of BVPs and not eigenvalue problems.
It is Section 4.3 which is relevant here.

Basic exercises (minimum requirement for pass)


1. Formulate a Rayleigh-Ritz algorithm for computing approximations to the
eigenvalues and eigenfunctions for (C.3), for general functions V . Then take
for now V = 0, but allow D to be a general domain. Write a Matlab program
that computes approximations to the eigenvalues and eigenfunctions, plots the
50 first eigenvalues in one figure and plots the 6 first eigenfunctions in another
figure.
Report: Code for functions IntMatrix. Computations of integrals by hand.

2. We next check the code against an analytic solution. Compute the exact values
of the 50 first eigenvalues for the rectangle D = (−1, 1)×(−0.5, 0.5), by sorting
those appearing in Example 4.2.1 in ascending order. Compute the maximum
error among the first 50 eigenvalues for different meshes. Check your code
if your error does not seem to converge to 0. (Hint: carefully specify your
rectangle with pderect.)
Report: Errors of the solution, for different meshes. Explanation of the sign
of the errors.
128

3. Compute the first 100 eigenvalues for the domains, with V = 0.

D1 = (−1, 1) × (−1, 1) (C.4)


D2 = (−3, 3) × (−3, 3) (C.5)
D3 = (−2, −1) × (−2, 2) ∪ (−2, 2) × (1, 2) (C.6)
D4 = {(x, y) ; x2 + y 2 < 2} (C.7)

Estimate the asymptotic behaviour of λj as j → ∞, and compare to theoretic


predictions. Analyze how the size of λj , for each fixed j, depends on the
domain. In what sense does a larger domain ensure a smaller eigenvalue?
Report: Answer to all problems posed, and plots of eigenvalues (not eigen-
functions).

Extra exercises (1 bonus point each)


4. Implement non-constant potentials V (x, y), using the quadrature (D.2). Com-
pute the first 100 eigenvalues of −∆ + V , for the harmonic oscillator potential
V (x, y) = x2 + y 2 on the unit disk, and compare to the case V = 0.
Report: Codes for IntMatrix, plots of differences λVj − λ0j , j = 1, . . . , 100,
(where λVj are eigenvalues for −∆ + V ) and a discussion of the result.

5. Let V = 0, and let D be the domain D1 above. Impose, instead of Dirichlet


BCs, now mixed Dirichlet/Neumann boundary conditions on D1 : require that
u = 0 at x = 1 and ∂ν u = 0 on the remaining 3/4 of ∂D1 . Plot the 50
first eigenvalues and the 6 first eigenfunctions, and compare to full Dirichlet
boundary conditions.
Report: Result of the comparison, supported by plots. Code relevant for the
boundary conditions.

6. Find two different domains D1 and D2 which have exactly the same set of
Dirichlet eigenvalues. Verify this numerically with your Matlab code. Hint:
Example 4.2.5.
Report: Plots of the difference between the 50 first eigenvalues for respective
domains. What domains you are using.

C.3 Propagation of waves


The waves in a bathtub D filled with water can be modeled by the initial-boundary
value problem 
2 2
∂t u − c ∆u = f,
 0 < t < T, x ∈ D,
∂ν u = 0, x ∈ ∂D, (C.8)

u = u0 (x), ∂t u = v0 (x), t = 0, x ∈ D.

Here u describes the height of the water surface and ν is the outward pointing unit
normal to the boundary ∂D of the domain. We use m as unit of length, and a
129

typical value of wave propagation speed which we use is c = 1.6 m/s. We discretize
this evolution problem by writing
N
X
u(t, x) = ξj (t)ϕj (x),
j=1

with spatial FEM basis functions ϕj and coordinates ξj (t) depending on time t.
We do not use eigenfunction expansions here, but instead solve the ODE in time
numerically.

Basic exercises (minimum requirement for pass)


1. Write down a weak formulation of (C.8) by integrating over D only, and not
in time. Treat (ξj (t))N
j=1 as a time-dependent column vector ξ(t), and obtain
a system of ODEs for ξ(t). Then use the discrete second derivative
ξj (t + h) − 2ξj (t) + ξj (t − h)
ξj′′ (t) ≈ ,
h2
with step length h < ϵ, where ϵ is the mesh size of the triangulation (sidelength
of the triangles). For initial data for the recursion, we use u0 and v0 to compute
ξ(0) and ξ ′ (0), and the Taylor approximation
h2 ′′
ξ(−h) ≈ ξ(0) − hξ ′ (0) + 2
ξ (0),
where ξ ′′ (0) is obtained from the system of ODEs.
Write a Matlab program which computes the solution u to the IBVP at a given
time t = kh, k ∈ Z+ , given inital data u0 and v0 . Take for now f = 0.
Report: The system of ODEs and the recursion. Code for IntMatrix, and
computations of integrals by hand.
2. We next check the code against an analytic solution. Let D = (0, π)×(0, π) and
consider the standing wave u(t, x, y) = cos(3x) cos(4y) cos(5ct + π/3) solving
(C.8) with f = 0. Use your code to calculate the solution numerically from u0
and v0 . Compute the error in the numerical solution compared to the exact
solution for various meshes, step lengths and times. Check your code if the
error does not seem to converge to 0 as ϵ → 0.
Report: Plots of maximum error on D as a function of 0 < t < 2, for suitable
meshes and step lengths h.
3. Consider your favorite bathtub D be centered at the origin with inital condi-
tions u0 = η and v0 = 0, where
2 /(10ϵ)2
η(x) = ae−|x| .
Choose a so that η approximates the Dirac delta δ0 . Without using your code,
find the time T1 it will take until the wave hits the boundary for the first time.
Then solve the IBVP using your code and check T1 numerically. (Note that
the wave does not start off exactly at the origin.)
Report: Theoretical and numerical value of T1 . Plots of the wave, for t < T1 ,
t ≈ T1 , t > T1 as well as t >> T1 , with a discussion of what you see.
130

Extra exercises (1 bonus point each)


4. Let u0 = η and v0 = 0. Solve the IBVP for Dirichlet BCs u(t, x) = 0 for all
x ∈ ∂D, t > 0, in your favorite bathtub. In particular, compare what happens
when the wave hits the boundary for the two BCs.
Report: Plot of the wave at suitable T1 < t < 2T1 for the two BCs, and
discussion.

5. Back to Neumann BCs. Implement source functions f of the form

f (t, x, y) = f1 (t)f2 (x, y)

using the quadrature (D.2). You can now take u0 = v0 = 0. Check your
code for the exact solution u = t2 cos x cos y on D = (0, π)2 : calculate the
corresponding source f , and from this reconstruct the solution numerically.
Report: Code for IntVector. Error plot at t = 2.

6. Use your code to make a realistic simulation: a wave machine in a pool, a


dripping faucet,... One good simulation is enough.
Report: Plots.
Appendix D

Matlab

Solving a PDE problem typically splits into 3 parts as follows.

Geometry and triangulation


Matlabs PDE toolbox can be used to create the domain and the triangulation. Click
APPS and scroll down and select PDE modeler. With this graphical interface you
can create your domain and triangulation. Avoid domains with triangulations which
are very refined towards some boundary points. When ready, you choose Mesh →
Export mesh. This gives you the following 3 matlab variables.

p This corresponds to the matrix vertices in Section 2.4.

t This is a matrix with 4 rows, the first 3 corresponding to the matrix triangles
in Section 2.4. The last row is a subdomain reference, useful for example when
data is given by different expressions in parts of D.

e This is a matrix with 7 rows, the first 2 corresponding to the matrix boundary
in Section 2.4. Rows 6 : 7 are references to the triangles to the left and right of
the boundary interval, 0 referring to the exterior of D. The boundary ∂D may
consist of subsegments. Row 5 tells which subsegment the boundary interval
belongs to, and rows 3 : 4 give the ordering of the endpoints for the boundary
interval, within the subsegment.

When estimating errors by comparing the results for a coarse and a fine mesh,
note that as you refine your mesh with PDE modeler, the new nodes for the fine
mesh are listed after the old nodes for the coarse mesh in your variables.

Computing matrices and vectors


As we have seen in Sections 2.4 and 4.3, the numerical solution of a PDE problem
with FEM means that we discretize the problem and reduce it to a finite N × N
linear system
Ax = b,

131
132

or we may be interested in computing the eigenvalues and eigenvector for a matrix A.


Given the test functions ϕi and the geometry of D and ∂D encoded by the variables
p, t, e above, we need to compute the matrix elements Ai,j and the coordinates bj .
The matrix elements to be calculated are typically of the form
X ZZ XZ
Ai,j = a0 (ϕi , ϕj )dx + a1 (ϕi , ϕj )ds,
T T I I

where the first sum is over all triangles T , and approximates the integral over D,
whereas the second sum is over all boundary intervals, and approximates the integral
over ∂D. Depending on the problem, two functions

A0 = IntMatrix(nodes) and A1 = BdyMatrix(nodes)

need to be implemented. (Int= Interior, Bdy= Boundary.) The first function calcu-
lates, given the coordinates of the nodes of a triangle T , a 3 × 3 matrix containing
the values ZZ
a0 (ϕi , ϕj )dx,
T
when i and j each is a vertex of T . The second function calculates, given the
coordinates of the nodes of a boundary interval I, a 2 × 2 matrix containing the
values Z
a1 (ϕi , ϕj )dx,
I
when i and j each is an end point of I. The following matlab script then produces
the matrix A.

N = size(p, 2);
A = zeros(N, N);
for el = 1 : size(t, 2)
nn = t(1:3, el);
A(nn, nn) = A(nn, nn) + IntMatrix( p(:, nn) );
end
for bel = 1 : size(e, 2)
nn = e(1:2, bel);
A(nn, nn) = A(nn, nn) + BdyMatrix( p(:, nn) );
end

For some problems A may only involve a solid integral, in which case the second
for-loop should be omitted.
The vector coordinates to be calculated are typically of the form

b_i = \sum_T \iint_T L_0(\varphi_i)\, dx + \sum_I \int_I L_1(\varphi_i)\, ds.

Again, depending on the problem, two functions

L0 = IntVector(nodes) and L1 = BdyVector(nodes)



need to be implemented. The first function calculates, given the coordinates of the
nodes of a triangle T, a 3 × 1 column vector containing the values

\iint_T L_0(\varphi_i)\, dx,

where i ranges over the vertices of T. The second function calculates, given the
coordinates of the nodes of a boundary interval I, a 2 × 1 column vector containing
the values

\int_I L_1(\varphi_i)\, ds,

where i ranges over the endpoints of I. The following Matlab script then produces the vector b.

N = size(p, 2);                          % total number of vertices
b = zeros(N, 1);
for el = 1 : size(t, 2)                  % loop over all triangles
    nn = t(1:3, el);
    b(nn) = b(nn) + IntVector( p(:, nn) );
end
for bel = 1 : size(e, 2)                 % loop over all boundary intervals
    nn = e(1:2, bel);
    b(nn) = b(nn) + BdyVector( p(:, nn) );
end

For some problems b may only involve a solid/boundary integral, in which case the
first/second for-loop should be omitted.
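
For the simplest case of a constant source, the integrals can be computed exactly:
with L_0(ϕ) = ϕ (that is, source f = 1), each integral equals |T|/3. A minimal
sketch of IntVector for this special case:

function L0 = IntVector(nodes)
% Input: 2 x 3 matrix, node coords as columns
% Output: 3 x 1 column vector of the integrals of phi_i over T
% Sketch for the constant source f = 1: each integral equals area(T)/3.
e1 = nodes(:, 1) - nodes(:, 3);
e2 = nodes(:, 2) - nodes(:, 3);
area = abs(det([e1, e2]))/2; % area of the triangle T
L0 = area/3 * ones(3, 1);
end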

Example D.0.1. We consider the coding of the function IntMatrix corresponding
to the bilinear form

a_0(u, \varphi) = \iint_D \langle \nabla u, \nabla \varphi \rangle\, dx.

Replacing D, u, ϕ by T, ϕ_i, ϕ_j, we note that on the triangle T both vector fields ∇ϕ_i
and ∇ϕ_j are constant. We obtain the following code.

function A0 = IntMatrix(nodes)
% Input: 2 x 3 matrix, node coords as columns
% Output: 3 x 3 matrix of integrals for stiffness matrix
e1 = nodes(:, 1) - nodes(:, 3); % choose 3rd node as origin
e2 = nodes(:, 2) - nodes(:, 3);
basis = [e1, e2];
dualbasis = inv(basis'); % computes the dual basis
grads = [dualbasis(:, 1), dualbasis(:, 2), ...
    -dualbasis(:, 1) - dualbasis(:, 2)]; % gradients of the three hat functions
area = abs(det(basis))/2; % area of the triangle (abs guards against clockwise node ordering)
A0 = grads' * grads * area; % returns the 9 inner products
end

You may use this code if you understand how it works. Recall that the basis {e_1^*, e_2^*}
is dual to {e_1, e_2} if ⟨e_1^*, e_1⟩ = ⟨e_2^*, e_2⟩ = 1 and ⟨e_1^*, e_2⟩ = ⟨e_2^*, e_1⟩ = 0. The code
computes the dual basis by writing these relations in matrix form. Since ϕ_1 = 1 at
node 1 and zero at nodes 2 and 3, it is clear that ∇ϕ_1 = e_1^*. Similarly ∇ϕ_2 = e_2^*,
and for ∇ϕ_3 we differentiate the relation ϕ_1 + ϕ_2 + ϕ_3 = 1.
You are supposed to calculate the exact value of all integrals involving only test
functions ϕ_i by hand. It is convenient to make a change of variables in the double
integral over the triangle:

(s, t) \mapsto p_3 + s e_1 + t e_2

maps the triangle {(s, t) ; s, t > 0, s + t < 1} onto T with vertices p_i and edge
vectors e_1 = p_1 − p_3, e_2 = p_2 − p_3. For example, if ϕ_i is the test function with
ϕ_i(p_3) = 1 and J denotes the Jacobian determinant, then a change of variables gives

\iint_T \varphi_i(x, y)\, dx\, dy = \int_0^1 \int_0^{1-s} J\, (1 - s - t)\, dt\, ds.

Evaluating the inner and outer integrals, and noting that J = 2|T|, gives the exact
value J/6 = |T|/3.

When the integrals over triangles/boundary intervals involve non-constant functions
besides the test functions ϕ_i, these integrals need to be approximated using
numerical integration (quadrature). To integrate a function g(x, y) on a triangle T,
the simplest quadrature is

\iint_T g\, dx\, dy \approx g\Big( \frac{p_1 + p_2 + p_3}{3} \Big) |T|    (D.1)

and the second simplest quadrature is

\iint_T g\, dx\, dy \approx \Big( g\Big( \frac{p_1 + p_2}{2} \Big) + g\Big( \frac{p_2 + p_3}{2} \Big) + g\Big( \frac{p_1 + p_3}{2} \Big) \Big) |T|/3,    (D.2)

where |T| is the area of T. As the size of T tends to zero, (D.2) (which gives the exact
value for any second order polynomial g) gives a more accurate approximation of the
integral than (D.1) (which only gives the exact value for first order polynomials g).
However, you will notice little improvement, since the error in the FEM algorithm
will dominate over the error in computing the integrals. Experiment!
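
To make this concrete, here is a minimal sketch of IntVector for a source integral
∬_T f ϕ_i dx with non-constant f, using the centroid quadrature (D.1). The function
handle f is passed as an extra argument here, purely for the sake of a self-contained
example.

function L0 = IntVector(nodes, f)
% Input: 2 x 3 matrix of node coords, and a function handle f(x, y)
% Output: 3 x 1 column vector approximating the integrals of f*phi_i over T
% Uses the centroid quadrature (D.1); each phi_i equals 1/3 at the centroid.
e1 = nodes(:, 1) - nodes(:, 3);
e2 = nodes(:, 2) - nodes(:, 3);
area = abs(det([e1, e2]))/2; % area of the triangle T
c = (nodes(:, 1) + nodes(:, 2) + nodes(:, 3))/3; % centroid of T
L0 = f(c(1), c(2)) * area/3 * ones(3, 1);
end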

Solving and plotting

Once we have discretized and computed matrices A and vectors b, we can solve
the PDE problem numerically. Since A is large, maybe 1000 × 1000, computing
the inverse A^{-1} is numerically too expensive, so instead we solve Ax = b with
Matlab's backslash x = A\b. To compute the eigenvalues (diagonal matrix D) and
eigenvectors (columns in V) of A, the Matlab command is [V,D] = eig(A). Note that
[V,D] = eig(A,B) directly gives the Rayleigh-Ritz approximation of eigenfunctions
and eigenvalues from Proposition 4.3.1.
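
In code, a typical solve-and-eigenvalue step might look like the following sketch,
assuming A, B and b have been assembled as above.

x = A\b;                       % solves Ax = b without forming inv(A)
[V, D] = eig(A, B);            % generalized eigenproblem A*V = B*V*D
[lambda, idx] = sort(diag(D)); % sort eigenvalues in increasing order
V = V(:, idx);                 % reorder eigenvectors accordingly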
Example D.0.2 (The Dirichlet Sobolev space H_0^1(D)). When solving a Dirichlet
problem, we use the function space V = H_0^1(D). In FEM, this means that we
do not use the boundary vertices, and no boundary matrices BdyMatrix and no
boundary vectors BdyVector need to be computed. The following code produces
the matrices for the Dirichlet problem from the Neumann matrices, following Section 2.4.

nn = size(p, 2);                   % total number of vertices
intnodes = setdiff(1:nn, e(1, :)); % list of non-boundary nodes
ADir = ANeu(intnodes, intnodes);   % removes boundary rows/columns
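
The right-hand side is restricted in the same way. A minimal sketch, assuming
bNeu has been assembled by the loops above:

bDir = bNeu(intnodes); % removes boundary rows
xDir = ADir\bDir;      % solves the Dirichlet system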

Before plotting a Dirichlet solution xDir, we need to add back zeros at boundary
nodes, which the following code does.

nn = size(p, 2);                   % total number of vertices
intnodes = setdiff(1:nn, e(1, :)); % list of non-boundary nodes
x = zeros(nn, 1);
x(intnodes) = xDir;                % adds zeros for boundary nodes in solution

To plot the solution represented by the column vector x, the general plotting
command in Matlab's PDE toolbox is pdeplot. See Matlab's documentation/help.
Two useful and simpler plotting options are pdesurf(p, t, x), which plots the
graph of the solution (using view([0, 90]) produces a color plot), and pdecont(p, t, x),
which plots level curves of the solution (a fourth argument can specify the number
of level curves).
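
For example, a minimal sketch plotting the Dirichlet solution x from Example D.0.2:

pdesurf(p, t, x);    % surface plot of the FEM solution
view([0, 90]);       % top view, giving a color plot
pdecont(p, t, x, 20) % level curves; fourth argument gives their number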
