PDEintro 21
EQUATIONS
systems of ODEs
Andreas Rosén
Chalmers University of Technology and
the University of Gothenburg 2021
Contents
0 Preliminaries
0.1 Function spaces
0.2 Bases and diagonalization
0.3 Integral identities
0.4 Exercises
5 Harmonic functions
5.1 Green functions and Poisson kernels
5.2 Mean value and maximum theorems
5.3 Analytic functions and Hardy splittings
5.4 Exercises
B Instructions
B.1 The written exam
B.2 The FEM projects
D Matlab
Chapter 0
Preliminaries
The material in this chapter is not meant to be explicitly lectured, but you should
read this before the start of the course to prepare. Prerequisites to the course also
include
Linear algebra: if you change your basis, how do the coordinates of vectors and
the matrices of linear transformations change?
Functional analysis is not required for the course. We use some concepts like function
spaces and norms however, but develop what we need as we go along.
The course starts in Chapter 1. These notes teach methods for solving the three
basic PDEs: the heat, wave and Laplace equations. We aim for a modern course
which develops the theory for a good understanding of numerical solution of PDE
problems with FEM, rather than teaches techniques for finding explicit solutions
by hand to PDE problems which are limited to special geometries. An extended
version of the course also provides the theory for solving PDE problems numerically
with boundary integral equations, which give faster and more accurate solutions in
many situations.
For the analysis we use the Fourier transform and generalized Fourier series as
our main tools. The basic strategy is to view our PDE as a vector-valued ODE
and to solve this through diagonalization. In order to diagonalize using the Fourier
transform, we require distributions. Without distributions this transform is quite
useless in practical applications like PDEs. Distributions also allow us to weakly
differentiate any function, even non-smooth ones, which is essential for a modern
understanding of PDE theory.
Necessary for a good understanding of PDEs and distributions are pictures which
convey the geometry hidden behind the abstract formulas. For example, what do
the Dirac delta and its derivative (dipole) look like? What do the Laplace fundamental
solution, Poisson kernels, Riemann functions and heat kernels look like?
This manuscript is unfortunately not yet complete, and these pictures are missing.
Therefore we shall draw them at the lectures, and you can add them in the margin.
In the introduction to each chapter you find boxes, like the following, with recommended
reading in the course book by Strauss, and elsewhere, which complements
these notes.
Recommended reading:
Study questions:
What is a Hilbert space? How do Green’s identities follow from
Gauss’ theorem?
To explain the intuition behind these abstract concepts, consider the unit sphere
{f ∈ V ; ∥f ∥ = 1}
in V . The completeness means that there are no holes in this sphere as we go along
the infinitely many dimensions. For a general Banach space, the unit sphere may
not be as round as one usually imagines a sphere. It is only the Hilbert spaces that
have perfectly round unit spheres without holes.
It is the completeness property which needs to be proved, and for this it is not
enough to use the simple Riemann integral. With only Riemann integrable functions
in $L^2(D)$, there will be functions missing in the space (holes). We shall not prove
Proposition 0.1.3, but refer to courses on Lebesgue integration theory for this. We
only remark that the Lebesgue integral, which is always used in higher mathematics
courses, is so powerful that in real life you will never encounter a function which
is too irregular to be integrable. Therefore whenever we speak of integrable in this
course, we always refer to the size of the function, that is integrable means that the
integral of the function is absolutely convergent.
A result which we shall need and can prove is the following.
$T^{-1}AT : V_2 \to V_2$.
x = T y,
where T is the change of basis matrix. It follows, by going via the standard basis,
that in the new basis the linear transformation that we consider acts on the y
coordinates by the matrix
$T^{-1}AT$.
Thus, the matrix A of a linear map changes by a similarity transformation when we
change the basis by T, just like the coordinates of a vector change when we change
basis.
$T^{-1}AT$
is a diagonal matrix.
In this case the change of basis matrix T is not only invertible as it must be,
but in fact can be chosen to be a rotation. This gives a very good understanding
of symmetric maps A: the spectral theorem means that if you just turn your own
head appropriately (that is, choose the right coordinate system) then your symmetric map
just scales vectors along the coordinate axes.
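To make the rotation picture concrete, here is a minimal Python sketch for a symmetric 2 × 2 matrix. The function names and the sample matrix are illustrative assumptions and do not come from these notes; the rotation angle is chosen by the standard relation tan 2θ = 2b/(a − d).

```python
import math

def rotation_angle_and_eigenvalues(a, b, d):
    """For the symmetric matrix A = [[a, b], [b, d]], return (theta, lam1, lam2)
    such that rotating the basis by theta turns A into diag(lam1, lam2)."""
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a * c * c + 2.0 * b * c * s + d * s * s
    lam2 = a * s * s - 2.0 * b * c * s + d * c * c
    return theta, lam1, lam2

def conjugate_by_rotation(a, b, d, theta):
    """Compute T^{-1} A T, where T is the rotation by theta (so T^{-1} = T^T)."""
    c, s = math.cos(theta), math.sin(theta)
    T = [[c, -s], [s, c]]
    A = [[a, b], [b, d]]
    AT = [[sum(A[i][k] * T[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(T[k][i] * AT[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Sample symmetric matrix (an arbitrary choice for illustration).
theta, lam1, lam2 = rotation_angle_and_eigenvalues(2.0, 1.0, 2.0)
D = conjugate_by_rotation(2.0, 1.0, 2.0, theta)
```

In the rotated coordinates the off-diagonal entries of `D` vanish up to rounding, so the map only scales along the new axes.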
Given a function f on $\mathbb{R}^n$, this yields its Fourier transform $\mathcal{F}(f)$, which is another
function on $\mathbb{R}^n$, usually denoted by $\hat f$. A fundamental theorem by Plancherel states
that
$$\int_{\mathbb{R}^n} |f(x)|^2\,dx = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} |\hat f(\xi)|^2\,d\xi. \tag{1}$$
If we set $V_1 = V_2 = L^2(\mathbb{R}^n)$, then the Fourier transform
$\mathcal{F} : V_1 \to V_2$
that is $T = \mathcal{F}^{-1}$ and $A = \partial_{x_k}$ in Definition 0.2.1. Here we view $\partial_{x_k}$ as the linear map
$\partial_{x_k} : V_1 \to V_1$ which computes the partial derivative $\partial_{x_k} f$. On the right hand side
stands the multiplier $M_{i\xi_k}$, which is the linear map $M_{i\xi_k} : V_2 \to V_2$ which computes
$i\xi_k \hat f(\xi)$. Such a multiplication operator $M_{i\xi_k}$ is the continuous analogue of a diagonal
matrix, so (2) is rightly viewed as a diagonalization of the derivative $\partial_{x_k}$.
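A discrete analogue may help to see why a multiplier is a "diagonal matrix". The sketch below is an illustration under assumptions of our own (a periodic grid with N points on [0, 2π) and a central difference in place of the derivative); it is not the continuous Fourier transform of these notes. Each discrete Fourier mode is an eigenvector of the difference operator, with eigenvalue i sin(kh)/h, which is close to ik for small kh.

```python
import cmath
import math

N = 64                 # number of grid points (illustrative choice)
h = 2 * math.pi / N    # grid spacing on the periodic interval [0, 2*pi)

def central_difference(f):
    """Periodic central difference, a discrete stand-in for d/dx."""
    return [(f[(j + 1) % N] - f[(j - 1) % N]) / (2 * h) for j in range(N)]

def fourier_mode(k):
    """Samples of exp(i k x) on the grid."""
    return [cmath.exp(1j * k * j * h) for j in range(N)]

def multiplier(k):
    """Eigenvalue of the central difference on the k-th Fourier mode."""
    return 1j * math.sin(k * h) / h

# Applying the difference operator to a mode only multiplies it by a number.
k = 3
mode = fourier_mode(k)
differentiated = central_difference(mode)
mismatch = max(abs(differentiated[j] - multiplier(k) * mode[j]) for j in range(N))
```

Here `mismatch` is at rounding level, and `multiplier(k)` approaches `1j * k` as the grid is refined.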
Throughout this book, ν denotes the outward pointing unit normal vector on
∂D. We recall the following basic differential operators.
The Laplace operator is the second order differential operator $\Delta = \operatorname{div}\nabla$, that
is
$$\Delta u = \operatorname{div}\nabla u = \partial_{x_1}^2 u + \cdots + \partial_{x_n}^2 u.$$
There are two versions of the divergence theorem due to George Green, which
we will frequently use. Given two scalar functions u and v, we apply the divergence
theorem to the vector field F = v∇u and note that divF = ⟨∇v, ∇u⟩ + v∆u. This
yields Green’s first identity
$$\int_{\partial D} v\,\partial_\nu u\,dS = \int_D \big(\langle\nabla v, \nabla u\rangle + v\Delta u\big)\,dx.$$
Another frequently useful version is obtained by swapping the roles of u and v and
subtracting the two identities. This yields Green’s second identity
$$\int_{\partial D} (u\,\partial_\nu v - v\,\partial_\nu u)\,dS = \int_D (u\Delta v - v\Delta u)\,dx,$$
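As a sanity check, Green's first identity can be tested numerically in one dimension, where D = (0, 1), the boundary consists of the two endpoints, and the outward normal derivative is u'(1) at x = 1 and −u'(0) at x = 0. The following Python sketch uses sample functions u = sin x and v = x² chosen only for illustration, and a composite Simpson rule as the quadrature; none of these choices come from the notes.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

# Sample functions and their derivatives (illustrative choices).
u, du, d2u = math.sin, math.cos, lambda x: -math.sin(x)
v, dv = lambda x: x * x, lambda x: 2 * x

def boundary_term():
    """Boundary 'integral' of v * d_nu u over the two endpoints of (0, 1)."""
    return v(1.0) * du(1.0) - v(0.0) * du(0.0)

def interior_term():
    """Integral over (0, 1) of <v', u'> + v u''."""
    return simpson(lambda x: dv(x) * du(x) + v(x) * d2u(x), 0.0, 1.0)
```

Both `boundary_term()` and `interior_term()` evaluate to cos 1 up to quadrature error, as the identity predicts.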
0.4 Exercises
1. For which exponents $\alpha \in \mathbb{R}$ does $f(x) = |x|^\alpha$ belong to $L^2(D)$, if (a) D is the
interval $D = (-1, 1) \subset \mathbb{R}$? (b) D is the unit disk $x^2 + y^2 < 1$ in $\mathbb{R}^2$? (c) D is
the unit ball $x^2 + y^2 + z^2 < 1$ in $\mathbb{R}^3$?
Remark 1.0.1. In this book we write partial derivatives as $u'_x$ or $\partial_x u$. Another shorthand
notation in the literature is $u_x$, omitting the prime. Beware that we will instead
reserve this notation for the one-variable function $u_x(y) = u(x, y)$ of y, where x has
been fixed.
Many natural PDEs are rather systems of PDEs, which is the case where we
have not only one unknown function but unknown functions u1 , . . . , uk , which are
assumed to satisfy not only one PDE but a number of PDEs. In this course however,
we mainly focus on scalar PDEs (k = 1), in which case the most interesting PDEs
are of second order as we shall see in Section 1.3. In Chapter 7, we learn that many
PDEs from physics are systems of PDEs.
We consider here mainly the simplest PDEs, the linear PDEs. A PDE is said to
be linear if it can be written in the form
$$\sum_s a_s(x)\,\partial^s u(x) = f(x),$$
where $s = (s_1, \dots, s_n)$ indexes the various partial derivatives $\partial^s = \partial_{x_1}^{s_1} \cdots \partial_{x_n}^{s_n}$ of the
unknown function u, and $a_s$ and f are given functions depending on $x = (x_1, \dots, x_n)$.
The PDE is said to be homogeneous if f = 0. If a PDE is not linear it is called a
non-linear PDE. A typical non-linear PDE is a PDE where a product of u and some
derivative ∂ s u appears.
There are also integral equations, which are equations involving integrals of the
unknown function, see Section 6. Sometimes PDEs can be reformulated as integral
equations.
Recommended reading:
Strauss: 1.1-5.
Study questions:
What is a characteristic curve/equation? What is the physical
meaning of lower order terms in the three main PDEs? What
does it mean that a PDE problem is well posed? Linear?
Homogeneous?
Theorem 1.1.1 (Picard). Consider the initial value problem for (1.1) where at time
t = t0 we are given the initial position
u(t0 ) = u0 ∈ Rn .
Assume that the function F is continuous and that there is a constant L < ∞ so
that F satisfies the Lipschitz condition
$$|F(t, u_1) - F(t, u_2)| \le L\,|u_1 - u_2|$$
for all t near t0 and all u1 , u2 near u0 . Then there exists t1 < t0 < t2 such that there
exists one and only one solution u(t) to this initial value problem for t1 < t < t2 .
The proof of Picard's theorem uses a fixed point theorem. We will not look into
this, but instead refer to standard literature on ODEs. Coming back to PDEs, of
which ODEs are a special case, we shall see that a PDE often in a natural way can
be viewed as an ODE with N = ∞. In this case we shall speak of a vector-valued
ODE (meaning an infinite dimensional vector = function) rather than a system of
ODEs. Using this point of view we shall solve our basic PDEs in later chapters. To
practise the basic technique for solving linear constant coefficient ODEs, we consider
here the case of finite dimensional systems of ODEs.
Example 1.1.2 (Diagonalization of systems of ODEs). Let A be a given constant
N × N matrix. We want to solve the system of ODEs
$$u'(t) = Au(t) \tag{1.2}$$
for $u : \mathbb{R} \to \mathbb{R}^N$. Assume that we know a change of basis T so that the matrix
$$D = T^{-1}AT$$
is diagonal. As in linear algebra, the columns of T are the eigenvectors of A and D
contains the eigenvalues of A on the diagonal, with compatible ordering.
We now use this similarity transformation to solve (1.2). Let $v(t) := T^{-1}u(t)$.
With this change of unknown function, (1.2) becomes $Tv'(t) = ATv(t)$. By multiplying
this equation from the left by $T^{-1}$, we obtain
$$v'(t) = Dv(t).$$
Since D is diagonal, this is in fact N scalar first order ODEs for the component
functions $v_k(t)$. Solving these gives v(t), and as a consequence $u(t) = Tv(t)$.
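The three steps of this example (change to v, solve the scalar ODEs, change back to u) can be sketched in Python as follows. The matrix A, its eigenvector matrix T and the function name are illustrative assumptions; T and its inverse are entered by hand for this small example rather than computed.

```python
import math

def solve_by_diagonalization(T, T_inv, eigenvalues, u0, t):
    """Solve u' = A u, u(0) = u0, given D = T^{-1} A T = diag(eigenvalues)."""
    n = len(u0)
    # Step 1: new unknown v = T^{-1} u, so v(0) = T^{-1} u0.
    v0 = [sum(T_inv[i][j] * u0[j] for j in range(n)) for i in range(n)]
    # Step 2: each scalar ODE v_k' = lambda_k v_k has solution v_k(0) exp(lambda_k t).
    v = [v0[k] * math.exp(eigenvalues[k] * t) for k in range(n)]
    # Step 3: back to the original unknown, u = T v.
    return [sum(T[i][k] * v[k] for k in range(n)) for i in range(n)]

# Illustrative data: A = [[0, 1], [-2, -3]] has eigenvalues -1 and -2
# with eigenvectors (1, -1) and (1, -2), which form the columns of T.
T = [[1.0, 1.0], [-1.0, -2.0]]
T_inv = [[2.0, 1.0], [-1.0, -1.0]]
eigenvalues = [-1.0, -2.0]

u_at_1 = solve_by_diagonalization(T, T_inv, eigenvalues, [1.0, 0.0], 1.0)
```

For the initial value (1, 0) this reproduces the closed form u₁(t) = 2e^{−t} − e^{−2t}, u₂(t) = −2e^{−t} + 2e^{−2t}.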
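Example 1.1.3 (Baby wave equation). Replacing the Laplace operator ∆ by a
negative number $-\omega^2$, we consider the second order ODE
$$f''(t) = -\omega^2 f(t) + a(t),$$
where a is a given function. A way to solve this is to write it as a 2 × 2 system of
first order ODEs and solve it by the methods described above. Let $u : \mathbb{R} \to \mathbb{R}^2$ be
the function given by
$$u(t) = \begin{pmatrix} f(t) \\ f'(t) \end{pmatrix}.$$
We see that this satisfies
$$u'(t) = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix} u(t) + \begin{pmatrix} 0 \\ a(t) \end{pmatrix}.$$
We compute eigenvectors of the matrix and obtain
$$\begin{pmatrix} 1 & 1 \\ i\omega & -i\omega \end{pmatrix}^{-1} \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ i\omega & -i\omega \end{pmatrix} = \begin{pmatrix} i\omega & 0 \\ 0 & -i\omega \end{pmatrix}.$$
The diagonalized system, when changing variables to $v = \begin{pmatrix} 1 & 1 \\ i\omega & -i\omega \end{pmatrix}^{-1} u$, becomes
$$\begin{cases} v_1'(t) = i\omega v_1(t) + a(t)/(2i\omega), \\ v_2'(t) = -i\omega v_2(t) - a(t)/(2i\omega). \end{cases}$$
Solving the first of these two scalar ODEs by multiplication with the integrating
factor $e^{-i\omega t}$, we have
$$v_1(t) = C_1 e^{i\omega t} + (2i\omega)^{-1} \int_0^t a(s)\,e^{i\omega(t-s)}\,ds.$$
From this we obtain the solution $f = u_1 = v_1 + v_2$ to the original second order ODE.
We have
$$f(t) = A\cos(\omega t) + B\sin(\omega t) + \int_0^t a(s)\,\frac{\sin(\omega(t-s))}{\omega}\,ds,$$
with constants $A = C_1 + C_2$ and $B = i(C_1 - C_2)$.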
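The final formula can be checked against a direct numerical integration of the ODE. The sketch below is only an illustration: the forcing a(t) = cos t, the parameter values, the Simpson quadrature and the classical Runge-Kutta scheme are all choices made here, not taken from the notes. Note that the formula gives f(0) = A and f'(0) = Bω, which is what the numerical solver is started from.

```python
import math

def formula_solution(omega, A, B, a, t, n=2000):
    """f(t) = A cos(wt) + B sin(wt) + int_0^t a(s) sin(w(t-s))/w ds (Simpson, n even)."""
    h = t / n
    g = lambda s: a(s) * math.sin(omega * (t - s)) / omega
    total = g(0.0) + g(t)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(j * h)
    return A * math.cos(omega * t) + B * math.sin(omega * t) + total * h / 3

def rk4_solution(omega, f0, df0, a, t, n=2000):
    """Integrate the 2 x 2 first order system u = (f, f') with classical RK4."""
    h = t / n
    rhs = lambda s, u: (u[1], -omega ** 2 * u[0] + a(s))
    u, s = (f0, df0), 0.0
    for _ in range(n):
        k1 = rhs(s, u)
        k2 = rhs(s + h / 2, (u[0] + h / 2 * k1[0], u[1] + h / 2 * k1[1]))
        k3 = rhs(s + h / 2, (u[0] + h / 2 * k2[0], u[1] + h / 2 * k2[1]))
        k4 = rhs(s + h, (u[0] + h * k3[0], u[1] + h * k3[1]))
        u = (u[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             u[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        s += h
    return u[0]

# Illustrative parameters.
omega, A, B = 2.0, 1.0, 0.5
f_formula = formula_solution(omega, A, B, math.cos, 3.0)
f_numeric = rk4_solution(omega, A, B * omega, math.cos, 3.0)
```

The two values agree to well within the discretization error of either method.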
a(x, y)∂x u(x, y) + b(x, y)∂y u(x, y) + c(x, y)u(x, y) = f (x, y). (1.3)
Unlike more general PDEs, these are actually quite easy to solve explicitly. The
basic idea is to perform a suitable change of variables
$$\begin{cases} s = s(x, y), \\ t = t(x, y), \end{cases} \tag{1.4}$$
to be chosen so that the PDE becomes simpler in the new coordinates s, t. What
we want is that, after transformation to s, t, only one of the derivatives ∂s u and ∂t u
appears in the PDE. For example, let us aim for a PDE of the form
where b̃(s, t) = 0.
To find s(x, y) and t(x, y), use the chain rule to get
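$$\begin{cases} \partial_x u(x, y) = \partial_x s(x, y)\,\partial_s u(x, y) + \partial_x t(x, y)\,\partial_t u(x, y), \\ \partial_y u(x, y) = \partial_y s(x, y)\,\partial_s u(x, y) + \partial_y t(x, y)\,\partial_t u(x, y). \end{cases}$$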
that is t(x, y) should satisfy (1.3), with c = f = 0. A geometric way to write this
PDE for t(x, y) is
⟨∇t, (a, b)⟩ = 0,
which means that at each point (x0 , y0 ) we require the gradient vector ∇t to be
orthogonal to the vector (a, b). Set C = t(x0 , y0 ). This means that the level curve
consisting of those (x, y) such that t(x, y) = C, is such that at each point on it, the
vector (a, b) is tangent to this level curve.
We can now summarize the algorithm for solving PDEs of the form (1.3).
1. Solve the characteristic equation (1.6), a first order ODE for y(x). The general
solution will depend on a parameter C.
2. Solve for C and write the solution as t(x, y) = C. Choose this t(x, y) for
your change of variables. Pick s(x, y) essentially arbitrarily: we only want t
and s to define an invertible (locally usually suffices) change of variables. It
is recommended to choose s as simple as possible, typically s(x, y) = x or
s(x, y) = y will do.
3. Find the transformed PDE (1.5). Make sure no x, y remains in the PDE: not
only should ∂x u and ∂y u be replaced by ∂s u and ∂t u according to the chain rule,
but coefficients depending on x and y must be rewritten in terms of s and t, for which you may need
to solve for x and y in (1.4) to obtain the inverse x = x(s, t) and y = y(s, t).
4. Solve the PDE (1.5). This can be done since this is now actually an ODE, since
only derivatives in s appear. Therefore we can think of t as having a fixed
value (being a parameter). Note that the general solution to the PDE=ODE
(1.5) depends on an integration constant C, but that this constant may depend on t.
So the general solution u(x, y) of our original PDE (1.3), after switching back
to x and y, will depend on an arbitrary one-variable function C(t(x, y)).
The following illustrates how this works and how to find particular solutions to
the PDE.
$$xy\,\partial_x u - x^2\,\partial_y u - yu = xy.$$
$$\partial_s u - s^{-1}u = 1.$$
This is a first order linear ODE, and the standard method to solve such involves the
integrating factor $e^{-\ln s} = 1/s$. Multiplying the ODE by 1/s and using the product
rule backwards gives
$$\partial_s(s^{-1}u) = 1/s.$$
After integration, we obtain the general solution $s^{-1}u = \ln s + C(t)$, that is
$$u = s\ln s + s\,C(t).$$
Note that any choice of one-variable function C yields one particular solution to
the PDE. For example $C(\alpha) = \sin(2\alpha)$, using α to denote the dummy variable that
C depends on, gives us the PDE solution $u = x\ln(x) + x\sin(x^2 + y^2)$. Conversely, one
is often given some complementary boundary conditions which specify the function
C(α). For example, if we demand of the solution u(x, y) that
$$u(1, y) = y,$$
then clearly $y = 0 + C((1 + y^2)/2)$. Write $\alpha = (1 + y^2)/2$ and note that α ≥ 1/2.
Therefore we know that C is the function
$$C(\alpha) = y = \sqrt{2\alpha - 1}, \quad \text{for all } \alpha \ge 1/2.$$
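A quick way to gain confidence in such a computed solution is to plug it back into the PDE numerically. The Python sketch below does this with central differences for the two choices of C discussed above. The helper names and the sample point are illustrative; the boundary condition u(1, y) = y for y ≥ 0 is our reading of the relation y = C((1 + y²)/2) and should be treated as an assumption.

```python
import math

def make_solution(C):
    """u(x, y) = x ln x + x C((x^2 + y^2)/2), for a given one-variable function C."""
    def u(x, y):
        return x * math.log(x) + x * C((x * x + y * y) / 2.0)
    return u

def pde_residual(u, x, y, h=1e-5):
    """Residual of x y u_x - x^2 u_y - y u - x y, with central difference derivatives."""
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return x * y * ux - x * x * uy - y * u(x, y) - x * y

u_sine = make_solution(lambda alpha: math.sin(2 * alpha))
u_bc = make_solution(lambda alpha: math.sqrt(2 * alpha - 1))

residual_sine = pde_residual(u_sine, 1.3, 0.7)
residual_bc = pde_residual(u_bc, 1.3, 0.7)
```

Both residuals are zero up to the finite difference error, and `u_bc(1.0, y)` returns y for y ≥ 0.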
The key observation now is that this holds for all Ω ⊂ D, and this is only possible
if the integrands on the left and right hand sides are pointwise equal. Since divJ =
−divA∇u, and writing R(u) = f + au0 − au, we obtain the PDE
∂t u = divA∇u + R(u). (1.9)
In the special case when R = 0 and A = k is a scalar positive constant, then this
reads
∂t u = k∆u, (1.10)
and is called the heat equation, or sometimes the diffusion equation. Indeed, a
diffusion process can be modelled by this PDE through arguments entirely similar to
those above. The unknown u now describes the density of some gas or fluid, diffusion
is described by Fick’s law (analogously to Fourier’s law) and external sources of the
substances can be modelled by a reaction term R(u) as above. A process described
by a PDE of the form (1.9) is referred to as a reaction-diffusion process. Above, our
function R(u) depended linearly on u, but more general reaction-diffusion processes
are modelled by suitable non-linear reaction terms R(u).
To understand the heat equation, it is instructive to study how (1.10) works
at a point (x, y) ∈ D where u attains a local maximum. Typically we here have
∆u < 0, and since k > 0, the heat equation forces ∂t u < 0. Similarly at the minima
of u, we typically have ∆u > 0 and hence ∂t u > 0. This illustrates the main feature of the heat equation: it
lowers peaks and fills in valleys, so that u becomes smoother as time t increases.
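This smoothing can be observed in a small numerical experiment. The sketch below takes one explicit finite difference step for the one-dimensional heat equation on (0, 1) with zero boundary values; the scheme, the grid, the initial profile and the step ratio r = kΔt/Δx² = 0.4 are all illustrative assumptions and are not the FEM approach developed in these notes. The initial profile has two peaks and an interior local minimum at x = 1/2.

```python
import math

def heat_step(u, r):
    """One explicit (forward Euler) time step; r = k*dt/dx**2 should be at most 1/2.
    The two endpoint values are left unchanged (Dirichlet boundary values)."""
    new = u[:]
    for j in range(1, len(u) - 1):
        new[j] = u[j] + r * (u[j - 1] - 2 * u[j] + u[j + 1])
    return new

N = 50                                   # number of grid intervals
x = [j / N for j in range(N + 1)]
u0 = [math.sin(math.pi * xj) + 0.5 * math.sin(3 * math.pi * xj) for xj in x]
u0[0] = u0[-1] = 0.0                     # enforce exact zero boundary values
r = 0.4

u1 = heat_step(u0, r)
peak_before, peak_after = max(u0), max(u1)
valley_before, valley_after = u0[N // 2], u1[N // 2]
```

After the step the peaks are lower (`peak_after < peak_before`) and the interior valley at x = 1/2 has risen (`valley_after > valley_before`).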
First we have the vertical tension force. By Newton’s third law, at each point
on ∂Ω, the D \ Ω part of the membrane acts by a force T on Ω, while at the same
time the Ω part of the membrane acts by a force −T on D \ Ω. A natural linear
model is that T has the direction of ν and that the horizontal part of T is constant.
Equivalently, we assume that there is a positive constant, which we write as $c^2$,
describing the tension of the membrane, such that
$$T_z = c^2\,\partial_\nu u.$$
Thus the total vertical force by which the rest of the membrane acts on Ω, is
$$\int_{\partial\Omega} c^2\,\partial_\nu u\,ds.$$
Coming back to Newton's second law, and using the divergence theorem we have
$$\iint_\Omega \partial_t^2 u\,dxdy = c^2 \int_{\partial\Omega} \partial_\nu u\,ds + \iint_\Omega (f - a\,\partial_t u)\,dxdy = \iint_\Omega (c^2\Delta u + f - a\,\partial_t u)\,dxdy.$$
This holds for all Ω ⊂ D, and this is only possible if the integrands on the left and
right hand sides are pointwise equal. We therefore obtain the PDE
For undamped vibrations and no external forces, that is when a = f = 0, this reads
$$\partial_t^2 u = c^2\Delta u \tag{1.12}$$
which is referred to as Poisson’s equation. (The minus sign is of course not important,
but as we shall later see there are reasons to keep it.) When f = 0, it is called the
Laplace equation and functions u solving ∆u = 0 are called harmonic functions.
The Laplace equation in one dimension, for u = u(x), is trivial, since u′′ (x) = 0
simply means that u is a linear function of x. In two and higher dimensions, the
equilibrium state that the Laplace equation describes is much more interesting as
we shall see. Beyond the Laplace equation, for a general positive matrix-valued
function A = Aij (x), the PDE
div(A(x)∇u(x)) = 0
is referred to as a divergence form elliptic equation, and describes an equilibrium
state in an inhomogeneous and anisotropic material.
Before moving on, it is worthwhile to discuss how we obtained the PDEs from
the integral identities over domains Ω in Examples 1.3.1 and 1.3.2. In each case, by
moving all terms to the left hand side, we know that a certain function F satisfies
$$\iint_\Omega F\,dxdy = 0 \tag{1.13}$$
When applied to the heat equation, this means that we keep the temperature
at values prescribed by ϕ along ∂D.
When applied to the wave equation, this means the membrane is fixed at height
ϕ along ∂D.
where ϕ is some given function on ∂D. Recall that the normal directional derivative
is ∂ν u = ⟨ν, ∇u⟩. Homogeneous Neumann BCs mean that ϕ = 0.
When applied to the wave equation, this means the membrane is acted on by
a vertical force ϕ along ∂D. In particular, homogeneous Neumann BCs mean
that the membrane moves freely in the vertical direction at ∂D.
where ϕ is some given function on ∂D, and a is a given function on ∂D, which should
not be negative. One may view the Neumann BC as the special case a = 0 and the
Dirichlet BC, after some re-normalization, as the special case a = ∞.
We can see that each of these BCs alone is strong enough to uniquely determine
the solution as follows. Consider for example the Poisson equation, and assume that
we have two solutions u1 and u2 to the Dirichlet problem
$$\begin{cases} -\Delta u(x) = f(x), & x \in D, \\ u(x) = \phi(x), & x \in \partial D. \end{cases}$$
Here the right hand side is zero since v = 0 at ∂D, so the left hand side shows that
∇v = 0 in all D. Thus v is constant, but from the BC we see that this constant
must be zero, so v = 0, or equivalently u1 = u2 .
Very similar arguments apply to the other two BCs. For the Neumann BC, the
right hand side vanishes now since ∂ν v = 0, but we can only conclude that v is
constant. And indeed, a typical feature of solutions to the Neumann problem is that
they are clearly unique only up to constants. For the Robin BC, the right hand side
is $-\int_{\partial D} a(x)v(x)^2\,dx \le 0$. But since $\iint_D |\nabla v(x)|^2\,dx \ge 0$, we can conclude that v = 0
in all D if a ≥ 0 and is non-zero at least on a part of ∂D.
Roughly speaking, a main goal in Chapter 2 will be to show that these BCs in
fact are just strong enough so that a solution u exists for any given data. Note that
if we impose two of these BCs at the same time, for example if we try to prescribe
both u and ∂ν u along ∂D, then in general there will exist no solution to this PDE
problem.
In general the prescribed data for a PDE problem are of three types: ICs, BCs
and internal sources. For linear problems, we can pass between these types as the
following example illustrates.
Example 1.4.2 (Homogeneous reduction for linear PDEs). Consider the initial-boundary
value problem (IBVP) for the heat equation
$$\begin{cases} \partial_t u(t, x) = k\Delta u(t, x) + f(t, x), & t > 0,\ x \in D, \\ u(0, x) = g(x), & x \in D, \\ u(t, x) = h(t, x), & t > 0,\ x \in \partial D. \end{cases} \tag{1.14}$$
Here f in the PDE represents internal sources, g is the prescribed IC and h is the
prescribed Dirichlet BC.
The point of this example is to demonstrate that if we can solve this linear PDE
problem in the special case when g = h = 0 and for all f , then we can easily also
solve it for general f, g, h. To see this, let any f, g, h be given. Pick any sufficiently
smooth function $u_1(t, x)$ such that $u_1 = g$ at t = 0 and $u_1 = h$ at ∂D. Write
$$u = u_0 + u_1,$$
where $u_1$ is the known function that we just introduced. Now using that the PDE
$\partial_t u - k\Delta u = f$ is linear, in terms of $u_0$ we equivalently have
$$\partial_t u_0 - k\Delta u_0 = f - \partial_t u_1 + k\Delta u_1.$$
Definition 1.4.3. A PDE problem is said to be well-posed (in the sense of Hadamard)
if
(i) a solution exists for all admissible data (existence),
(ii) the solution is unique (uniqueness),
(iii) the solution depends continuously on the data.
Note that for this notion of well-posedness to be well defined, a function space
Y of admissible data and a function space X of possible solutions must be specified
for (i) and (ii) to make sense. For (iii) to make sense, we must also specify norms
on X and Y , in which we measure the continuity.
As the examples above showed, uniqueness (ii) of a PDE problem is relatively
easy to prove. For linear PDE problems, the strategy is to assume that we have two
solutions u1 and u2 for the same data. Linearity then shows that v = u1 − u2 is a
solution with 0 data, and it is from this often possible to show that v must be zero.
Continuous dependence on data (iii) is typically proved rather similarly, but using
the norms. For linear PDE problems, we now need to show that if the data for v is
small in the Y norm, then v must be small in the X norm.
Existence (i) of solutions is usually the most difficult to prove. Indeed, much of
our coming work is devoted to constructing solutions, numerically and theoretically,
for given data. There are also techniques for linear PDE problems, referred to as
Fredholm theory, which can be used to deduce (i) from (ii) and (iii). See Section 6.
Most of the naturally appearing PDE problems in physics are well-posed. There
are however important examples of PDE problems which are not well-posed: we
speak of ill-posed PDE problems. Here it is often (i) and (iii) that fail, and
only uniqueness of solutions (ii) can be proved. See Exercise 10 and Example 3.3.4.
1.5 Exercises
1. Solve the 2 × 2 ODE system u′ (t) = Au(t) by diagonalizing A, where A is the
matrix from Exercise 0.3.
2. Solve the 2 × 2 ODE system $u'(t) = Au(t)$, where $A = \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}$. Hint: Use
a similarity transformation to an upper triangular matrix $TAT^{-1}$. Then start
by solving the simplest ODE in the new system.
3. Determine the order of the following PDEs and if they are nonlinear, linear
inhomogeneous or linear homogeneous.
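(a) $\partial_t u - \partial_x^2 u + 1 = 0$
(b) $\partial_t u - \partial_x^2 u + xu = 0$
(c) $\partial_t u - \partial_x^2\partial_t u + u\,\partial_x u = 0$
(d) $\partial_t^2 u + \partial_x^2 u + x^3 = 0$
(e) $i\partial_t u + \partial_x^2 u + u/x = 0$
(f) $(1 + \partial_x^2 u)^{-1/2}\,\partial_x u + (1 + \partial_y^2 u)^{-1/2}\,\partial_y u = 0$
(g) $\partial_x u + e^y\,\partial_y u = 0$
(h) $\partial_t u + \partial_x^4 u + (1 + u)^8 = 0$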
4. Solve 2∂t u + 3∂x u = 0 when u(0, x) = sin x.
5. Find the general solution to $(1 + x^2)\,\partial_x u + \partial_y u = 0$. Sketch some characteristic
curves.
7. Let u0 be any function that solves the heat equation and satisfies the initial
conditions in (1.14). (We learn below in (3.6) how to compute such u0 .) Show
how to use this particular solution to reduce the IBVP (1.14) to the case
f = g = 0.
10. Consider the following Cauchy problem for the Laplace equation in a square.
∆u = 0, 0 < x, y < π,
u = 0, x = 0 or x = π,
u = g, y = 0,
∂y u = h, y = 0,
The goal in this chapter is to learn the theory behind the finite element method
(FEM) for the numerical solution of the BVPs for the Laplace equation. This uses a
weak formulation of the PDE problem, and the concept of weak derivatives. So we
start with a short introduction to distribution theory and collect what we need for
this course.
Recommended reading:
Strauss 12.1.
Strauss 8.5.
Study questions:
What is a distribution? What is a test function? What do we
mean when we say that a distribution is a function? How do we
use integration by parts/use Green to compute a distributional
derivative? What is a fundamental solution to a PDE?
2.1 Distributions
We have seen in Section 0.1 that the Lebesgue integral is needed for modern analysis.
Likewise, we also need the modern distributional definition of derivative. At first
the notion of distributions below may seem more abstract than that of a function,
but in fact distributions are closer to physical reality than functions.
Example 2.1.1 (Point masses and dipoles). General distributions, as defined mathematically
below, can be quite wild creatures. However, the only two types of distributions
beyond ordinary functions which we consider and need in this course are
variations of the physical concepts of point masses and dipoles. Naively, a point mass f
at the origin is defined by
$$f(x) = \begin{cases} \infty, & x = 0, \\ 0, & x \ne 0. \end{cases}$$
However, this is a useless definition since we cannot calculate with ∞. For example
remember that 0 · ∞ can be anything/is undefined. The proper and useful definition
of a point mass (Dirac delta) is found below.
Similarly, the naive definition of a dipole at the origin is
$$f(x) = \begin{cases} \infty, & x = 0^+, \\ -\infty, & x = 0^-, \\ 0, & x \ne 0. \end{cases}$$
The physical meaning is that we have placed two point masses, infinitely large and
of opposite sign, infinitesimally close at the origin. Again, this is a useless definition
since we somehow must quantify infinity and the infinitesimal distance. The proper
definition of a dipole (Dirac delta derivative) is found below.
$$\phi_1 f_1 + \phi_2 f_2 + \cdots + \phi_N f_N,$$
where now the weight ϕ is a positive function ϕ(x) ≥ 0 with total mass $\int \phi(x)\,dx = 1$.
Here is the core idea of distributions. The classical functional way to describe a
quantity f depending on a variable x is to specify its values f (x) at each point
x. However, if f is a physical quantity then this is typically not possible, since a
point is only a mathematical idealization. In real life, what is possible to measure
are various averages ⟨f, ϕ⟩ of the values f (x). This leads us to the concept of a
distribution, where instead of viewing f as depending on the point x ∈ R, we view
f as depending on the weight function ϕ.
The correct intuition is that instead of talking about the values f (x) of
a function f at mathematical points x, we now talk about the values
⟨f, ϕ⟩ of a distribution f at physical "smeared out" points ϕ.
To turn these ideas into a good mathematical definition, we make the following
observations.
The restrictions ϕ(x) ≥ 0 and $\int \phi(x)\,dx = 1$ were imposed above only for
the physical interpretation of the values ⟨f, ϕ⟩. We shall no longer need these
restrictions and therefore drop them, and we will refer to ϕ as a test function
rather than a weight function.
The integral (2.1) depends linearly on ϕ (as well as on f ), meaning that
⟨f, aϕ + bψ⟩ = a⟨f, ϕ⟩ + b⟨f, ψ⟩,
for all a, b ∈ R and all test functions ϕ, ψ. This reflects the fact that there is
a redundancy in the values ⟨f, ϕ⟩ in that they cannot be specified completely
independently of each other. This is in contrast with the classical function values
f (x), which in general can be independently specified to yield a function.
For f, ϕ ∈ L2 (R), the value (2.1) of the function f at the test function ϕ is
nothing but the L2 inner product of f and ϕ. As further studied in a course on
functional analysis, there is a dual relation between f and ϕ: To make sense
of the value ⟨f, ϕ⟩, the better the function ϕ is, the worse we can allow f to
be. For example, if ϕ is bounded, then ⟨f, ϕ⟩ is well defined for any integrable
function f , that is if $\int |f(x)|\,dx < \infty$. And if ϕ = 0 outside a bounded set,
then we can allow f to grow arbitrarily fast as x → ∞. Further assuming
that ϕ is differentiable to any order, then ⟨f, ϕ⟩ will be well defined for any
distribution f .
Definition 2.1.2 (Distributions and test functions). Let D(Rn ) denote the set of
all test functions ϕ : Rn → R such that all partial derivatives of ϕ of all orders exist
as continuous functions, and such that ϕ = 0 outside some bounded set.
Let D′ (Rn ) denote the set of all distributions, that is functionals F : D(Rn ) → R
such that F is linear and defined on all D(Rn ). We denote the value of F on the
test function ϕ by ⟨F, ϕ⟩.
Before we start using distributions in concrete calculations, some remarks are in
order.
We write ⟨F, ϕ⟩ for the value of the distribution F on the test function ϕ. As
discussed above, if ϕ is a weight function, then the number ⟨F, ϕ⟩ means the ϕ-
weighted average of the quantity represented by the distribution. Alternative
notations are F [ϕ] or F (ϕ), but we use the inner product symbol ⟨F, ϕ⟩ since
this better indicates the linear dependence on ϕ. Note carefully though that,
unless the distribution F is (represented by) a function, ⟨F, ϕ⟩ is not an integral
but just the value of F at ϕ.
The usual definition of a distribution F : D(Rn ) → R is that it should be
linear and continuous in a certain sense. It turns out that any F which is
linear and defined on all D(Rn ) that you will ever encounter in real life, will
be continuous in this sense. This is a consequence of a completeness property
of the function space D(Rn ). To be honest, there are in some sense linear
F defined on all D(Rn ) which are not continuous, but to construct such an
example you will need the axiom of choice, a very abstract mathematical tool
from the foundations of logic.
The idea behind this construction is that $e^{-1/x} \to 0$ so fast as $x \to 0^+$ that all
derivatives will vanish at 0.
The most well known distribution, which is not a function, is the following point
mass distribution.
Definition 2.1.3 (The Dirac delta). Let a ∈ Rn . The Dirac delta at a is the
distribution δa ∈ D′ (Rn ) given by
⟨δa , ϕ⟩ := ϕ(a).
The Dirac delta distribution δa is not a function, but the limit (in a weak sense
introduced below) of functions g(x) such that g ≠ 0 only in a small neighbourhood
of a and $\int g(x)\,dx = 1$.
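The following Python sketch illustrates this limit numerically in one dimension. It is only an illustration under assumptions of our own: the approximating functions are boxes of height k/2 on (a − 1/k, a + 1/k), the test function is the standard bump exp(−1/(1 − x²)) on (−1, 1), and the integral is computed with the midpoint rule. As k grows, the weighted average of ϕ against the box approaches the value ϕ(a).

```python
import math

def bump(x):
    """A standard test function: smooth, and zero outside (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def pair_with_box(phi, a, k, n=2000):
    """Integral of g_k * phi, where g_k = k/2 on (a - 1/k, a + 1/k) and 0 elsewhere,
    so that g_k has total mass 1. Computed with the midpoint rule."""
    eps = 1.0 / k
    h = 2 * eps / n
    return sum(phi(a - eps + (j + 0.5) * h) for j in range(n)) * h * (k / 2.0)

a = 0.3
error_k10 = abs(pair_with_box(bump, a, 10) - bump(a))
error_k100 = abs(pair_with_box(bump, a, 100) - bump(a))
```

The error shrinks as the box concentrates at a, in line with the claim that such functions converge to δa.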
Then f1 (a) = f2 (a) at each point a ∈ Rn where both f1 and f2 are continuous.
We remark that using Lebesgue integration theory, one can prove that f (x) =
g(x) at almost every x ∈ Rn .
Proof. Let ϕ ∈ D(Rn ) be a test function. Then there exists C < ∞ and R < ∞ so
that |ϕ(x)| ≤ C for all x ∈ Rn and ϕ(x) = 0 for |x| ≥ R. Therefore
$$\int_{\mathbb{R}^n} |f(x)\phi(x)|\,dx \le C \int_D |f(x)|\,dx < \infty,$$
follows from the continuity of f at a, we conclude that f (a) = 0. This proves that
f1 (a) = f2 (a).
Figure 2.1: Example of zero order distributions: (a) Locally integrable functions. (b)
Dirac delta distributions.
In this definition, the derivative ∂xi ϕ is the usual one defined by difference quo-
tients, which is well defined since ϕ is a test function. The motivation for this defi-
nition is integration by parts. Indeed, if F is the distribution defined by a smooth
function f , then
Z Z
∂xi f (x)ϕ(x)dx = − f (x)∂xi ϕ(x)dx,
Rn Rn
Carefully note that the weak derivative in Definition 2.1.6 is not defined pointwise.
Given a function, or distribution F , we only know what ∂xi F is as a whole.
We do not tell what F (x) is at each x ∈ Rn . Indeed, remember from the initial
discussion in this section that this is impossible for distributions in general.
Example 2.1.7. Let $f_-$ and $f_+$ be two smooth one-variable functions, and set
\[ f(x) = \begin{cases} f_-(x), & x < 0, \\ f_+(x), & x > 0. \end{cases} \]
This means that if f is continuous at 0, that is if f+ (0) = f− (0), then the weak
derivative is a function: the pointwise derivative g. But if f jumps at 0, then the
weak derivative also contains a distribution term: a Dirac delta at the jump, with
mass equal to the size of the jump.
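The jump formula can be checked numerically. The following is only a sketch under assumed choices that do not come from the text: $f_-(x) = x$ and $f_+(x) = 2 + x^2$ (a jump of size 2 at 0), the standard bump function as test function $\phi$, and a midpoint rule for the integrals. It compares the defining expression $-\int f\phi'\,dx$ of the weak derivative with $\int g\phi\,dx + 2\phi(0)$, where g is the pointwise derivative.

```python
import math

def phi(x):
    """Bump test function: smooth, positive on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def dphi(x):
    """Classical derivative of phi (chain rule)."""
    if abs(x) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def f(x):
    """Assumed example: f_-(x) = x for x < 0, f_+(x) = 2 + x^2 for x > 0."""
    return x if x < 0 else 2.0 + x * x

def g(x):
    """Pointwise derivative of f away from the jump."""
    return 1.0 if x < 0 else 2.0 * x

def midpoint(func, a, b, n=20000):
    """Composite midpoint rule; on (-1, 1) with n even, x = 0 is a cell boundary."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

jump = 2.0  # f_+(0) - f_-(0)

# Weak derivative paired with phi, by definition: -<f, phi'>
weak_pairing = -midpoint(lambda x: f(x) * dphi(x), -1.0, 1.0)
# Function part plus a Dirac delta at 0 with mass equal to the jump
formula = midpoint(lambda x: g(x) * phi(x), -1.0, 1.0) + jump * phi(0.0)

print(weak_pairing, formula)
```

The two printed numbers agree up to quadrature error. Dropping the term `jump * phi(0.0)` makes them disagree, which is the numerical trace of the Dirac delta at the jump.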
Example 2.1.8 (Dirac delta derivative). We have seen that the Dirac delta distri-
bution δa at a ∈ R appears as the weak derivative of a function which jumps at
this point. Since any distribution can be weakly differentiated, we may form the
derivative δa′ , which is the distribution defined by
usually referred to as the principal value distribution 1/x. Note that 1/x is not
locally integrable around 0, but still the value $\langle F, \phi\rangle$ is well defined since the two
infinite masses $\int_0^\infty x^{-1}\,dx = \infty$ and $\int_{-\infty}^0 x^{-1}\,dx = -\infty$ cancel when calculating the
integral symmetrically around 0. But note that this requires smoothness of $\phi$: the
derivative of $\phi$ should exist at 0 at least, so F is a first order distribution.
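The cancellation of the two infinite masses can be observed numerically. The sketch below assumes the standard symmetric-limit form $\langle F, \phi\rangle = \lim_{\epsilon\to 0}\int_{|x|>\epsilon}\phi(x)/x\,dx$ of the principal value, and uses a shifted bump function as test function. Both the test function and the helper names are illustrative choices, not taken from the text.

```python
import math

def bump(x):
    """Smooth bump supported in (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def phi(x):
    """Assumed test function, supported in (-0.7, 1.3), with phi(0) != 0."""
    return bump(x - 0.3)

def midpoint(func, a, b, n=20000):
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def principal_value(eps):
    """Symmetric integral over |x| > eps, folded onto (eps, 2)."""
    return midpoint(lambda x: (phi(x) - phi(-x)) / x, eps, 2.0)

def one_sided(eps):
    """Only the positive half: grows like phi(0) * ln(1/eps)."""
    return midpoint(lambda x: phi(x) / x, eps, 2.0)

eps_values = (1e-2, 1e-3, 1e-4)
pv_values = [principal_value(e) for e in eps_values]
one_sided_values = [one_sided(e) for e in eps_values]

for e, pv, os_ in zip(eps_values, pv_values, one_sided_values):
    print(e, pv, os_)
```

The symmetric values settle down as $\epsilon \to 0$, because the folded integrand $(\phi(x) - \phi(-x))/x$ stays bounded near 0 exactly when $\phi$ is differentiable there. The one-sided values keep growing.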
Figure 2.2: Example of first order distributions: (a) Dirac delta derivative. (b) Principal
value 1/x.
The last notion for distributions which we need for now, is that of weak conver-
gence of distributions. This is both a simpler and more useful notion of convergence
than that of convergence of test functions, discussed above.
Note that since the distributions are functions Fk : D(Rn ) → R defined on the
set of test functions, weak convergence is nothing but pointwise convergence: we
demand that their values at each test function converge.
As the name suggests, this type of convergence is indeed weak. Roughly
speaking, if we have any type of convergence, then we also have weak convergence.
For example,
by Cauchy–Schwarz inequality.
The two main examples of weak convergence, where we do not have convergence
in any other usual sense, are the following.
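We claim that
\[ g_k \to \delta_a, \qquad k \to \infty, \]
in the weak sense. Indeed, choosing $f = \phi$ as a test function in Lemma 2.1.4 shows
that
\[ |\langle g_k, \phi\rangle - \langle \delta_a, \phi\rangle| = \left| \int g_k(x)\phi(x)\,dx - \phi(a) \right| \le \sup_{|x-a|<r_k} |\phi(x) - \phi(a)| \to 0 \]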
for any ϕ ∈ D(R). In fact, this is true for any integrable function ϕ. The idea of proof
is that a smooth function and a highly oscillatory function are almost orthogonal in
L2 .
This example clearly shows how weak the notion of weak convergence is. Indeed,
sin(kx) → 0 weakly but not in any other usual sense. The drawback of weak
convergence is that it is of little use in numerical applications: the values sin(kx)
themselves never settle down, only their averages against test functions do.
Figure 2.3: Examples of weak convergence: (a) Delta convergence. (b) Oscillatory
convergence.
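Both examples of weak convergence can be tried out numerically. The following sketch uses assumed, illustrative choices: box functions $g_k$ of height $1/(2r)$ on $(a-r, a+r)$ as approximations of $\delta_a$, a bump function as test function, and the midpoint rule.

```python
import math

def bump(x):
    """Smooth bump supported in (-1, 1), used as test function."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(func, a, b, n):
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

a = 0.3  # location of the point mass (arbitrary choice)

def pair_with_box(r):
    """<g, phi> for the box g = 1/(2r) on (a - r, a + r) and phi = bump."""
    return midpoint(bump, a - r, a + r, 2000) / (2.0 * r)

# Delta convergence: <g_k, phi> -> phi(a) as the radius shrinks
delta_errors = [abs(pair_with_box(r) - bump(a)) for r in (0.1, 0.01, 0.001)]

def pair_with_sine(k):
    """<sin(kx), phi> with phi(x) = bump(x - a), supported in (a - 1, a + 1)."""
    return midpoint(lambda x: math.sin(k * x) * bump(x - a), a - 1.0, a + 1.0, 20000)

# Oscillatory convergence: the pairings tend to 0 ...
sine_pairings = [pair_with_sine(k) for k in (1, 10, 100)]
# ... although the squared L2 norm over (-pi, pi) stays equal to pi
sine_l2_squared = [
    midpoint(lambda x: math.sin(k * x) ** 2, -math.pi, math.pi, 20000)
    for k in (1, 10, 100)
]

print(delta_errors)
print(sine_pairings)
print(sine_l2_squared)
```

The last list shows that sin(kx) does not become small in the $L^2$ norm, even though its pairing with the test function does.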
Definition 2.2.1 (Sobolev space). Define the Sobolev space H 1 (D) to be the space
of square integrable functions f ∈ L2 (D) such that all the weak partial derivatives
∂xi f are square integrable functions, i = 1, . . . , n. Writing ∇f for the weak gradient,
having these weak derivatives as components, we define the Sobolev inner product
\[ \langle f, g\rangle_{H^1} := \int_D \big( \langle \nabla f(x), \nabla g(x)\rangle + f(x)g(x) \big)\,dx. \]
Proposition 2.2.2. H 1 (D), as a linear space with the inner product ⟨·, ·⟩H 1 , is a
Hilbert space.
Proof. It is straightforward to show that H 1 (D) is a linear space and that ⟨f, g⟩H 1
defines an inner product on H 1 (D). So it remains to verify that H 1 (D) is complete
with respect to the Sobolev norm
\[ \|f\|_{H^1} := \left( \int_D \big( |\nabla f(x)|^2 + |f(x)|^2 \big)\,dx \right)^{1/2}. \]
We observe that H 1 (D) ⊂ L2 (D) with ∥f ∥L2 ≤ ∥f ∥H 1 and ∥∂xk f ∥L2 ≤ ∥f ∥H 1 for
all f ∈ H 1 (D). Assume now that f1 , f2 , . . . is a Cauchy sequence in H 1 (D). It
follows that this is a Cauchy sequence also in L2 (D) since ∥fi − fj ∥L2 ≤ ∥fi − fj ∥H 1 .
By Proposition 0.1.3, there exists f ∈ L2 (D) so that ∥fi − f ∥L2 → 0 as i → ∞.
Similarly, for each k = 1, . . . , n, we conclude that ∂xk fi is a Cauchy sequence in
L2 (D) and so there exists gk ∈ L2 (D) so that ∥∂xk fi − gk ∥L2 → 0 as i → ∞. But
∂xk f = gk in the weak sense as the following calculation shows.
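\[ \langle \partial_{x_k} f, \phi\rangle = -\int_D f\,\partial_{x_k}\phi\,dx = -\lim_{i\to\infty} \int_D f_i\,\partial_{x_k}\phi\,dx = \lim_{i\to\infty} \int_D (\partial_{x_k} f_i)\phi\,dx = \int_D g_k\phi\,dx \]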
To summarize we have shown that f ∈ H 1 (D), since its weak partial derivatives
gk are square integrable functions, and ∥fi − f ∥H 1 → 0 as i → ∞. This shows
that our given Cauchy sequence f1 , f2 , . . . converges, which proves that H 1 (D) is
complete.
Unlike the case for a general L2 (D) function, see Exercise 8, it is meaningful to
consider the values of a Sobolev function f ∈ H 1 (D) on the boundary ∂D.
Proposition 2.2.3 (Sobolev trace). Let f ∈ H 1 (D) and consider the boundary
values $g = f|_{\partial D}$. Then $g \in L^2(\partial D)$. Moreover, there exists a constant $C < \infty$ such
that
\[ \int_{\partial D} |g(x)|^2\,dS(x) \le C \int_D \big( |\nabla f(x)|^2 + |f(x)|^2 \big)\,dx. \]
Proof. As often is the case in analysis, it is the estimate that is crucial, so assume
first that f is a smooth function on D, so that in particular f ∈ H 1 (D) and g is well
defined and smooth.
The idea is to use a fixed smooth vector field V on Rn such that
This gives the stated estimate if $|V(x)| \le C$ and $|\mathrm{div}\,V(x)| \le C$, if we apply the
inequality $2ab \le a^2 + b^2$, with $a = |f(x)|$ and $b = |\nabla f(x)|$, to the first term.
To remove the condition that f is a smooth function on D, one shows that such
smooth functions are dense in H 1 (D) and define the boundary trace by continuity
in the general case. We omit the details.
and norm $\|f\|_{\dot H^1} = \langle f, f\rangle_{\dot H^1}^{1/2}$.
We note that this equivalence of norms is not true on all $H^1(D)$ for the simple
reason that $\|f\|_{\dot H^1} = 0$ for constant functions. However, it is readily seen that neither
of the two subspaces contains any constant function except 0. A main idea is that
in the Sobolev norm $\|f\|_{H^1}$ it is the gradient term which is the dominant one, and
by eliminating the constant functions we may use the simpler homogeneous norm.
Proof. (1) To prove that the subspaces are Hilbert spaces, it suffices by Lemma 0.1.4
to show that they are closed. That $H_0^1(D)$ is closed follows from the estimate in
Proposition 2.2.3, since if $f_j \to f$ in $H^1(D)$ and $f_j \in H_0^1(D)$, then
This is known as Poincaré's inequality, a proof of which we only sketch now. Consider
the subspace $H_0^1(D)$. The proof for $\widetilde{H}^1(D)$ is similar. Assume that there is no
constant $C_p < \infty$ so that (2.2) holds for all $f \in H_0^1(D)$. This means that there
exists a sequence $f_1, f_2, \ldots$ in $H_0^1(D)$ so that
\[ \int_D |f_j(x)|^2\,dx = 1 \quad\text{and}\quad \int_D |\nabla f_j(x)|^2\,dx \to 0, \]
when j → ∞. In particular supj ∥fj ∥H 1 (D) < ∞ and by Rellich’s theorem 2.2.6
below, we conclude that there exists a subsequence {fjk }k and f ∈ L2 (D) such that
limk→∞ ∥fjk − f ∥L2 = 0. Considering the weak derivatives of this f , we have
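\[ \langle \partial_{x_i} f, \phi\rangle = -\int_D f(x)\,\partial_{x_i}\phi(x)\,dx = -\lim_{k\to\infty} \int_D f_{j_k}(x)\,\partial_{x_i}\phi(x)\,dx = \lim_{k\to\infty} \int_D \partial_{x_i} f_{j_k}(x)\,\phi(x)\,dx = 0. \]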
This shows that $\nabla f = 0$ in the weak sense, from which one can conclude that f is
a constant function. Furthermore $\|f_{j_k} - f\|_{H^1(D)} \to 0$ as $k \to \infty$, so $f \in H_0^1(D)$ by
(1). This forces f = 0 since f is constant, and contradicts $\|f_j\|_{L^2} = 1$.
We end by stating the result that we used in the above proof, which shows that
in a certain sense the L2 norm is small compared to the H 1 norm, if the domain D
is bounded.
Then there exists a subsequence $\{f_{j_k}\}_k$ and $f \in L^2(D)$ such that
$\lim_{k\to\infty} \|f_{j_k} - f\|_{L^2(D)} = 0$.
Example 2.3.2 (Dirichlet problem I). We want to solve the Dirichlet BVP
\[ \begin{cases} -\Delta u(x) = f(x), & x \in D, \\ u(x) = 0, & x \in \partial D. \end{cases} \tag{2.3} \]
We prefer zero boundary conditions for the Dirichlet problem for the variational
formulation below. However, for a more general boundary condition u = g, we
may first reduce this to the case g = 0 by using that the equation is linear, as in
Section 1.4. The minus sign is a technicality: It gives a plus sign in the variational
formulation.
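In the setup from Definition 2.3.1, we use the function space $V = H_0^1(D)$ for the
Dirichlet problem. Given a solution $u \in C^2(D)$ to (2.3), Green's first identity gives
\[ \int_D \langle \nabla u, \nabla\phi\rangle\,dx = \int_{\partial D} (\partial_\nu u)\phi\,dS - \int_D (\Delta u)\phi\,dx = \int_D f\phi\,dx, \]
for $\phi \in H_0^1(D)$, since $\phi = 0$ on $\partial D$. Therefore the variational formulation of (2.3)
should be
\[ \int_D \langle \nabla u, \nabla\phi\rangle\,dx = \int_D f\phi\,dx, \quad \text{for all } \phi \in H_0^1(D), \tag{2.4} \]
that is $a(u, \phi) = \int_D \langle \nabla u, \nabla\phi\rangle\,dx$ and $L(\phi) = \int_D f\phi\,dx$.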
Conversely, we need to verify that a solution to (2.4) is a solution to the original
BVP (2.3). Here a technical problem appears: a solution u to (2.4) belongs, for all we
know, only to the Sobolev space $H_0^1(D)$. This means in particular that u = 0 at $\partial D$,
but all we know is that its weak first derivatives are $L^2$ functions, so it is not clear
that $u \in C^2(D)$ and that Green's identity may be used. (Assuming enough regularity
of f and $\partial D$, this may however be proved.) But assuming that we have a solution
$u \in C^2(D)$ to (2.4), we obtain from Green's first identity that
\[ \int_D \Delta u\,\phi\,dx = -\int_D \langle \nabla u, \nabla\phi\rangle\,dx = -\int_D f\phi\,dx \]
As for the Dirichlet problem, we can easily reduce to the case g = 0 using the
linearity of the equation. However, unlike for the Dirichlet problem this is not really
necessary in order to write a variational formulation of the Neumann problem. So
we keep g.
An important observation is that (2.5) does not have a unique solution. Indeed,
we can add any constant to u and obtain a new solution. In general, a solution u
need not exist either: from Green's identity we have the necessary condition
\[ -\int_D f\,dx = \int_D \Delta u\,dx = \int_{\partial D} \partial_\nu u\,dS = \int_{\partial D} g\,dS. \]
(Physics: Interior sources = flow out through the boundary.) However, assuming this
constraint on f and g, it turns out that a solution exists, and is unique modulo constants
on a connected domain D. However, to start with we ignore these problems,
which turn out to be minor ones.
In the setup from Definition 2.3.1, we use the function space $V = H^1(D)$ for the
Neumann problem. (In order to obtain a unique solution, we later replace $H^1(D)$
by a space like $\widetilde{H}^1(D)$ to eliminate constants.) Given a solution $u \in C^2(D)$ to (2.5),
Green's first identity gives
\[ \int_D \langle \nabla u, \nabla\phi\rangle\,dx = \int_{\partial D} (\partial_\nu u)\phi\,dS - \int_D (\Delta u)\phi\,dx, \]
Figure 2.4: Delta convergence (1) in the interior, and (2) at the boundary.
for all $\phi \in H^1(D)$ such that $\phi = 0$ on $\partial D$, that is for $\phi \in H_0^1(D)$. We want to apply
Lemma 2.1.4 and conclude that $h := \Delta u + f = 0$. At a given point $a \in D$, choose
$\phi$ to be an approximation to $\delta_a$. We get
\[ |h(a)| = \left| \int_D h(x)\phi(x)\,dx - h(a) \right| \le \sup_{|x-a|<r} |h(x) - h(a)|. \]
Letting r → 0 and assuming that h is continuous, this shows that h(a) = 0. Since
a was arbitrary, we have ∆u + f = 0 in all D.
(2) Returning to (2.7): since $\Delta u(x) + f(x) = 0$ identically in D by (1), we have
\[ \int_{\partial D} (\partial_\nu u - g)\phi\,dS = 0 \]
for all $\phi \in H^1(D)$. We now want to conclude from an argument as in Lemma 2.1.4,
but on the boundary $\partial D$ rather than in the domain D, that $\tilde h(a) := \partial_\nu u(a) - g(a) = 0$
at each $a \in \partial D$. We design the test function $\phi$, given a point $a \in \partial D$.
First define $\phi$ on $\partial D$. Let $\phi(x) = 0$ when $|x - a| > r$ and let $\phi$ be smooth and
positive on $\partial D$ with $\int_{\partial D} \phi\,dS = 1$.
A take away point from these examples is that the Dirichlet boundary condition
is an essential BC whereas the Neumann boundary condition is a natural BC in the
following sense.
The main goal in this section is to prove the following existence and uniqueness
result for solutions to a variational problem. The two main hypotheses are that
the function space V should be a Hilbert space, and that the bilinear functional a
should be coercive.
The bilinear functional a is bounded, that is $|a(u, \phi)| \le C_1 \|u\| \|\phi\|$, and coercive,
that is $a(u, u) \ge c_1 \|u\|^2$. ($C_1 < \infty$ and $c_1 > 0$ are some constants.)
The linear functional L is bounded, that is $|L(\phi)| \le C_2 \|\phi\|$. ($C_2 < \infty$ is some
constant.)
\[ ax + b = 0, \]
writing −L = b, which gives the location of the minimum of the second order
polynomial $\tfrac12 ax^2 + bx + c$. The Hilbert space generalization is
Lemma 2.3.6. Consider the abstract variational problem (V, a, L) from Definition 2.3.1,
and assume that a is symmetric. The solutions u ∈ V to this variational problem
are precisely the minima of the functional F : V → R given by
\[ F(u) := \tfrac12 a(u, u) - L(u). \]
Proof. Assume that $u \in V$ is a minimizer. Let $\phi \in V$. Then the one-variable
function $f(t) := F(u + t\phi)$ has a minimum at t = 0. Using bilinearity, symmetry
and linearity, we have
\[ f(t) = \tfrac12 t^2 a(\phi, \phi) + \tfrac12 a(u, u) + t\,a(u, \phi) - L(u) - tL(\phi). \]
Differentiation at t = 0 gives $f'(0) = a(u, \phi) - L(\phi) = 0$, that is $a(u, \phi) = L(\phi)$.
Conversely, assume that $a(u, \phi) = L(\phi)$ for all $\phi \in V$. Then
\[ F(u + \phi) = \tfrac12 a(u, u) + a(u, \phi) + \tfrac12 a(\phi, \phi) - L(u) - L(\phi) = F(u) + \tfrac12 a(\phi, \phi) \ge F(u). \]
Since $\phi$ was arbitrary, we have shown that u is a minimum for F.
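In finite dimensions the lemma says that solving Ax = b and minimizing $\tfrac12 x^T A x - b^T x$ are the same problem when A is symmetric and positive definite. The sketch below illustrates this for an assumed 2 × 2 example. The matrix, the vector and the perturbations are arbitrary choices made for illustration only.

```python
# Assumed data: V = R^2, a(u, v) = u^T A v with A symmetric positive definite,
# and L(v) = b^T v.
A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 0.0]

def a(u, v):
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

def L(v):
    return sum(b[i] * v[i] for i in range(2))

def F(u):
    return 0.5 * a(u, u) - L(u)

# Solve the variational problem a(u, phi) = L(phi) for all phi, i.e. A u = b,
# here by Cramer's rule.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
u = [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
     (A[0][0] * b[1] - A[1][0] * b[0]) / det]

# Check F(u + phi) = F(u) + a(phi, phi) / 2 for a few perturbations phi.
perturbations = [[1.0, 0.0], [0.0, -0.5], [0.3, 0.7], [-2.0, 1.0]]
gaps = []
for phi in perturbations:
    shifted = [u[0] + phi[0], u[1] + phi[1]]
    gaps.append((F(shifted) - F(u), 0.5 * a(phi, phi)))

print(u, F(u))
print(gaps)
```

Each pair in `gaps` contains two equal positive numbers, so every perturbation increases F by exactly $\tfrac12 a(\phi, \phi)$.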
Proof of Theorem 2.3.5 for symmetric a. By Lemma 2.3.6, it suffices to prove
that
\[ F(u) = \tfrac12 a(u, u) - L(u) \]
has a unique minimum $u \in V$. Uniqueness follows from the coercivity of a, since
the proof of Lemma 2.3.6 shows that $F(u + \phi) > F(u)$ for any variational solution
u and $\phi \neq 0$. For existence, let $m := \inf_{u \in V} F(u)$.
(1) We first show that F is bounded from below, that is $m > -\infty$. Using the
assumed estimates, we have
\[ F(u) = \tfrac12 a(u, u) - L(u) \ge \frac{c_1}{2} \|u\|^2 - C_2 \|u\|. \]
Clearly this second order polynomial in the real variable $x = \|u\|$ is bounded from
below since $c_1 > 0$ (its minimum value is $-C_2^2/(2c_1)$).
(2) We need to show that the infimum m is attained at some u ∈ V . Let u1 , u2 , . . .
be a sequence in V such that limk→∞ F (uk ) = m. It suffices to show that this in fact
is a Cauchy sequence in V . Indeed, the completeness of the Hilbert space then shows
the existence of a limit u ∈ V , and continuity of F yields F (u) = limk→∞ F (uk ) = m.
Let ϵ > 0. Choose i, j large so that
F (ui ) ≤ m + ϵ and F (uj ) ≤ m + ϵ.
Consider F along the line in V through ui and uj :
We obtain
Example 2.3.7 (Dirichlet problem II). Consider the variational formulation (2.4)
for the Dirichlet problem. We check the hypotheses in Lax–Milgram's theorem.
The function space $V = H_0^1(D)$ is a Hilbert space by Proposition 2.2.5.
The functional $a(u, \phi) = \int_D \langle \nabla u, \nabla\phi\rangle\,dx$ is bilinear. It is bounded: this is
the Cauchy–Schwarz inequality. It is coercive: this is Poincaré's inequality and
Proposition 2.2.5.
The functional $L(\phi) = \int_D f\phi\,dx$ is bounded, assuming that $f \in L^2(D)$: again
this is the Cauchy–Schwarz inequality.
Therefore Lax–Milgram's theorem shows that the Dirichlet problem has a unique
solution.
Example 2.3.8 (Neumann problem II). Consider the variational formulation (2.6)
for the Neumann problem. We check the hypotheses in Lax–Milgram's theorem.
The functional $L(\phi) = \int_{\partial D} g\phi\,dS + \int_D f\phi\,dx$ is bounded, assuming that $g \in L^2(\partial D)$
and $f \in L^2(D)$: this is the Cauchy–Schwarz inequality and Proposition 2.2.3.
The remaining hypotheses are verified as for the Dirichlet problem, except for the
coercivity of a, which fails on $H^1(D)$. Indeed, if u is a constant function, then
a(u, u) = 0.
To uniquely solve the Neumann problem, we need to replace $H^1(D)$ by $\widetilde{H}^1(D)$.
We here only consider the case g = 0 (or else we need to modify $\widetilde{H}^1(D)$ by a
boundary integral term). So consider the variational problem
\[ \int_D \langle \nabla u, \nabla\phi\rangle\,dx = \int_D f\phi\,dx, \quad \text{for all } \phi \in \widetilde{H}^1(D), \tag{2.8} \]
for $u \in \widetilde{H}^1(D)$.
The same argument as before shows that any solution u ∈ C 2 (D) to
\[ \begin{cases} -\Delta u(x) = f(x), & x \in D, \\ \partial_\nu u(x) = 0, & x \in \partial D, \end{cases} \tag{2.9} \]
satisfies (2.8). The converse, however, presents some novelties. As before, for a
variational solution $u \in C^2(D)$ to (2.8), Green's first identity shows that
\[ \int_D \Delta u\,\phi\,dx = \int_{\partial D} (\partial_\nu u)\phi\,dS - \int_D f\phi\,dx \tag{2.10} \]
for all $\phi \in \widetilde{H}^1(D)$.
(1) We now only have
\[ \int_D (\Delta u + f)\phi\,dx = 0 \]
for all $\phi \in \widetilde{H}^1(D)$ such that $\phi = 0$ on $\partial D$. We want to apply Lemma 2.1.4 and
conclude that $h := \Delta u + f = 0$. We cannot as before use a Dirac delta approximation
for $\phi$ since we must have $\int_D \phi\,dx = 0$. Instead we use differences of two such deltas.
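Fix two points $a, b \in D$, and let $\phi = g_a - g_b$, where $g_a$ and $g_b$ are approximations to
$\delta_a$ and $\delta_b$ respectively. Since $\int \phi\,dx = \int g_a\,dx - \int g_b\,dx = 1 - 1 = 0$, we get
\begin{align*}
|h(a) - h(b)| &= \left| \int_D h(x)\phi(x)\,dx - (h(a) - h(b)) \right| \\
&\le \left| \int_D h(x)g_a(x)\,dx - h(a) \right| + \left| \int_D h(x)g_b(x)\,dx - h(b) \right| \\
&\le \sup_{|x-a|<r} |h(x) - h(a)| + \sup_{|x-b|<r} |h(x) - h(b)|.
\end{align*}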
Letting r → 0 and assuming that h is continuous, we have shown that h(a) = h(b)
for any two points $a, b \in D$, so $\Delta u(x) + f(x) = c$ for some constant c. Assuming
that $\int_D f\,dx = 0$, we have in fact that
\[ \int_D c\,dx = \int_D \Delta u\,dx = \int_{\partial D} \partial_\nu u\,dS = 0, \]
so c = 0.
(2) Returning to (2.10): since ∆u(x) + f (x) = 0 identically in D by (1), we have
\[ \int_{\partial D} (\partial_\nu u)\phi\,dS = 0 \]
First define $\phi$ on $\partial D$. Let $\phi(x) = 0$ when $|x - a| > r$ and let $\phi$ be smooth and
positive on $\partial D$ with $\int_{\partial D} \phi\,dS = 1$.
fails to be coercive on all $H^1(D)$. This causes a minor problem, which the Poincaré
inequality in Proposition 2.2.5 fixes if we instead use the subspace $\widetilde{H}^1(D)$. Note
that $\widetilde{H}^1(D)$ is a hyperplane in $H^1(D)$ since it is the orthogonal complement of the
constant functions.
Now consider the bilinear functional a on $H^1(\mathbb{R}^n)$, that is we replace the bounded
domain D by the unbounded full space $\mathbb{R}^n$. Note that in this case $H^1(\mathbb{R}^n)$ contains
no nonzero constant functions. We show nevertheless that the Poincaré inequality
fails, and so the coercivity of a fails, on any closed subspace of $H^1(\mathbb{R}^n)$ which contains
a function $f \neq 0$ along with all its rescalings
\[ f_k(x) = k^{-n/2} f(x/k), \quad x \in \mathbb{R}^n. \]
Indeed, changing variables x = ky shows that
\[ \int_{\mathbb{R}^n} |f_k(x)|^2\,dx = \int_{\mathbb{R}^n} |f(y)|^2\,dy, \]
for all k = 1, 2, . . ..
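The effect of the rescaling can be seen in a small numerical experiment. The sketch below takes n = 1 and the assumed example $f(x) = e^{-x^2}$, and computes both $\int |f_k|^2\,dx$ and $\int |f_k'|^2\,dx$ with the midpoint rule. By the same change of variables x = ky, the gradient integral equals $k^{-2}\int |f'|^2\,dy$, so the ratio between the two integrals grows like $k^2$ and no Poincaré constant can work for all k.

```python
import math

def f(y):
    """Assumed example function on R."""
    return math.exp(-y * y)

def df(y):
    """Derivative of f."""
    return -2.0 * y * math.exp(-y * y)

def midpoint(func, a, b, n):
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def norms(k):
    """Return (int |f_k|^2, int |f_k'|^2) for f_k(x) = k^(-1/2) f(x / k)."""
    radius = 10.0 * k          # f_k is negligible outside (-10k, 10k)
    n = 2000 * k
    l2 = midpoint(lambda x: (k ** -0.5 * f(x / k)) ** 2, -radius, radius, n)
    grad = midpoint(lambda x: (k ** -1.5 * df(x / k)) ** 2, -radius, radius, n)
    return l2, grad

results = {k: norms(k) for k in (1, 2, 4, 8)}
for k, (l2, grad) in results.items():
    print(k, l2, grad, l2 / grad)
```

The first integral stays constant, while the second decays by a factor 4 each time k is doubled.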
a basis ϕ1 , ϕ2 , . . . , ϕN for VN .
In the discretized finite dimensional problem, we consider the solution u and the
test functions $\phi$ to belong to $V_N$, and let
\[ u = x_1\phi_1 + \ldots + x_N\phi_N \]
and $\phi = \phi_i$, i = 1, . . . , N. We obtain a finite dimensional linear system of equations
\[ \sum_{j=1}^N A_{i,j} x_j = b_i, \tag{2.11} \]
where $A_{i,j} = a(\phi_i, \phi_j)$ and $b_i = L(\phi_i)$. By tradition the N × N matrix $A = (A_{i,j})_{i,j=1}^N$
is called the stiffness matrix and the $\mathbb{R}^N$-vector $b = (b_i)_{i=1}^N$ on the right hand side is
called the load vector.
Figure 2.5: A coarse triangulation of a domain D, using 145 triangles. The number of
interior nodes, used for the Dirichlet problem, is N = 56. The total number of nodes,
including boundary nodes, used for the Neumann problem, is N = 91. Can you visualize
ϕi at one of the interior nodes i? At one of the boundary nodes i?
Definition 2.4.1 (Abstract FEM). The abstract finite element method is to numerically
solve the abstract variational problem (V, a, L) from Definition 2.3.1, using a
finite dimensional subspace $V_N \subset V$ with basis $(\phi_i)_{i=1}^N$, by solving (2.11) for $(x_i)_{i=1}^N$,
giving the numerical solution $u = x_1\phi_1 + \cdots + x_N\phi_N$.
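To make the definition concrete, here is a minimal sketch in one dimension, which is simpler than the triangulated setting considered in these notes. The assumptions are: D = (0, 1), the Dirichlet problem −u'' = 1 with u(0) = u(1) = 0, and $V_N$ spanned by the piecewise linear hat functions on a uniform grid. For these hat functions $a(\phi_i, \phi_i) = 2/h$, $a(\phi_i, \phi_{i\pm1}) = -1/h$ and $L(\phi_i) = h$. The function `solve` is a plain Gaussian elimination written for this illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

N = 9                      # number of interior nodes
h = 1.0 / (N + 1)          # grid spacing
nodes = [(i + 1) * h for i in range(N)]

# Stiffness matrix A[i][j] = a(phi_i, phi_j) = int phi_i' phi_j' dx
A = [[0.0] * N for _ in range(N)]
for i in range(N):
    A[i][i] = 2.0 / h
    if i + 1 < N:
        A[i][i + 1] = -1.0 / h
        A[i + 1][i] = -1.0 / h

# Load vector b[i] = L(phi_i) = int 1 * phi_i dx = h
b = [h] * N

x = solve(A, b)            # coefficients = nodal values of the FEM solution
exact = [t * (1.0 - t) / 2.0 for t in nodes]
max_error = max(abs(xi - ei) for xi, ei in zip(x, exact))
print(max_error)
```

In this particular one-dimensional example the nodal values coincide with the exact solution u(x) = x(1 − x)/2 up to rounding. This is a special feature of the example and should not be expected on a triangulation in the plane.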
Given a domain D, there are many possible ways to choose $V_N$ and $\phi_i$ to solve the
Dirichlet and Neumann problems. We here only consider the most standard setup,
which uses a given triangulation of $D \subset \mathbb{R}^2$. (For a 3D problem one uses tetrahedra
instead of triangles.) This means that we approximate D by a union of triangles.
To construct such a triangulation is a non-trivial geometric problem, the details of
which we do not consider here. To describe the algorithm, we assume given three
matrices vertices, triangles and boundary as follows.
triangles is a matrix with 3 rows, with entries in Z. The columns enumerate the
triangles (elements), and in the kth column stands the three vertices (integers
specifying the columns in vertices) of triangle k in the triangulation. Standard
is to enumerate the vertices of a triangle counter clockwise.
boundary is a matrix with 2 rows, with entries in Z. The columns enumerate the
boundary intervals (boundary elements), that is the triangle edges hitting ∂D.
In the kth column stands the two vertices (integers specifying the columns in
where the sum is over all the triangles which have both i and j among their
vertices (or possibly the single vertex i = j). On each triangle T, the two gradients are
constant vectors, so the integral is simply
\[ \langle \nabla\phi_i, \nabla\phi_j\rangle\,\mathrm{area}(T), \tag{2.12} \]
the evaluation of which is a nice linear algebra exercise. See Example D.0.1.
Having a representation of the triangulation as above, one may proceed as
follows to fill in the matrix A. Initialize A = 0. Then, rather than iterating
over pairs of vertices i, j, we iterate over triangles T in triangles. For each of
the 3 × 3 = 9 pairs of vertices i, j of T, found in triangles, we add (2.12) to $A_{ij}$.
This gives the stiffness matrix for the Neumann problem. To obtain the smaller
stiffness matrix for the Dirichlet problem, we remove the rows and columns
corresponding to boundary nodes, using boundary.
To compute the load vector $b_i$, assuming that the boundary data g = 0, we need
to compute, using suitable numerical integration,
\[ L(\phi_i) = \sum_T \int_T f(x)\phi_i(x)\,dx, \]
where the sum is over all triangles with one vertex at i. For the Dirichlet
problem, this need not be computed for boundary vertices i. For the Neumann
problem, if $g \neq 0$, we need to add
\[ \sum_I \int_I g(x)\phi_i(x)\,ds \]
We now solve the linear system (2.11). For the Dirichlet problem this is always
uniquely solvable, that is the matrix A is invertible. For the Neumann problem,
A is not invertible. However, the Matlab command A\b will yield a least
squares solution, giving one solution x. The general solution is
\[ x + c\,\begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \end{bmatrix}^T, \quad c \in \mathbb{R}. \]
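The assembly loop over triangles described above can be sketched as follows. This is an illustration in Python rather than Matlab, under assumed conventions that differ from the matrices in the text: `vertices` is a list of coordinate pairs, `triangles` a list of index triples, and indices start at 0. The mesh is a hypothetical unit square cut into two triangles. The gradients of the hat functions on a triangle are computed from the barycentric coordinate formulas.

```python
def local_stiffness(p1, p2, p3):
    """3 x 3 matrix with entries <grad phi_i, grad phi_j> * area(T), as in (2.12)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # signed, 2 * area
    area = abs(two_area) / 2.0
    # Constant gradients of the three linear hat functions on T
    grads = [((y2 - y3) / two_area, (x3 - x2) / two_area),
             ((y3 - y1) / two_area, (x1 - x3) / two_area),
             ((y1 - y2) / two_area, (x2 - x1) / two_area)]
    return [[area * (gi[0] * gj[0] + gi[1] * gj[1]) for gj in grads]
            for gi in grads]

def assemble(vertices, triangles):
    """Stiffness matrix for the Neumann problem: loop over triangles, add 9 entries each."""
    n = len(vertices)
    A = [[0.0] * n for _ in range(n)]
    for tri in triangles:
        K = local_stiffness(*(vertices[v] for v in tri))
        for a_loc, i in enumerate(tri):
            for b_loc, j in enumerate(tri):
                A[i][j] += K[a_loc][b_loc]
    return A

# Hypothetical mesh: the unit square split along the diagonal from (0,0) to (1,1)
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]

A = assemble(vertices, triangles)
row_sums = [sum(row) for row in A]
for row in A:
    print(row)
```

Every row of A sums to zero, which says that the constant vector lies in the null space of A. This is the discrete counterpart of the non-uniqueness of the Neumann problem. In this tiny mesh all four nodes are boundary nodes, so the Dirichlet reduction would leave nothing; a finer mesh is needed for that step.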
2.5 Exercises
1. Let
\[ f(x) = \begin{cases} x^2, & x < 1, \\ x^2 + 2x, & 1 \le x < 2, \\ 2x, & x \ge 2. \end{cases} \]
At which x does the classical derivative $f'(x)$ exist? Compute the weak derivative of f.
Sketch a sequence of functions which converge weakly to $f'$.
2. Let
\[ f(x, y) = \begin{cases} 1, & |x| < 1,\ |y| < 1, \\ 0, & \text{else}. \end{cases} \]
Compute the weak partial derivative $\partial_x f$.
Sketch a sequence of functions which converge weakly to $\partial_x f$.
5. Assume that u(x, y) is a harmonic function in the upper half disk $x^2 + y^2 < 1$,
y > 0, which is continuous for $x^2 + y^2 \le 1$, y ≥ 0, with homogeneous Dirichlet
boundary values u(x, 0) = 0. Extend u to an odd function with respect to y
in the whole disk $x^2 + y^2 < 1$ by letting
\[ v(x, y) = \begin{cases} u(x, y), & y > 0, \\ -u(x, -y), & y < 0. \end{cases} \]
Show that ∆v = 0 in distributional sense in the disk. (With more theory one
can deduce from this that v in fact must be $C^\infty$ and harmonic in the whole
disk.)
6. For which exponents $\alpha \in \mathbb{R}$ does $f(x) = |x|^\alpha$ belong to $H^1(D)$, if (a) D is the
interval $D = (-1, 1) \subset \mathbb{R}$? (b) D is the unit disk $x^2 + y^2 < 1$ in $\mathbb{R}^2$? (c) D is
the unit ball $x^2 + y^2 + z^2 < 1$ in $\mathbb{R}^3$?
7. Let Γ be a curve that cuts a bounded domain D into two parts D1 and D2 .
Let f be a function which is smooth in D1 and in D2 and continuous in all D.
Show that f ∈ H 1 (D). Hint: Gauss theorem.
8. Let D be your favorite domain. For each ϵ > 0, construct two smooth functions
f1 and f2 so that f1 |∂D = 0 and f2 |∂D = 7, but ∥f1 − f2 ∥L2 (D) < ϵ.
9. Let γ > 0. Give a variational formulation of the Robin BVP
\[ \begin{cases} -\Delta u(x) = f(x), & x \in D, \\ \partial_\nu u(x) + \gamma u(x) = g(x), & x \in \partial D. \end{cases} \]
What are V, a and L? Show equivalence of the BVP and the variational
problem, modulo regularity issues, and that the Lax–Milgram hypotheses are
satisfied.
10. Give a variational formulation of the mixed BVP
\[ \begin{cases} -\Delta u(x) = f(x), & x \in D, \\ \partial_\nu u(x) = 0, & x \in \Gamma_N, \\ u(x) = 0, & x \in \Gamma_D. \end{cases} \]
The goal in this chapter is to find explicit formulas for the solution to the IVPs for
the heat and wave equations on all $\mathbb{R}$, on all $\mathbb{R}^2$ and on all $\mathbb{R}^3$. Using the Fourier
transform, this is possible when the wave/heat propagates freely in all space, and by
analyzing the obtained formulas we learn much about wave and heat propagation.
When using the Fourier transform, it is important to note that the Fourier transforms of
many important functions are not functions, but more general distributions. Indeed,
the basic Riemann function appearing in the solution of the wave equation is not a
function but a spherical “Dirac delta wall”. So we start by learning how to transform
distributions.
Recommended reading:
Strauss 12.3.
Study questions:
How does Strauss find the solution formulas for the heat and wave
equations without using the Fourier transform? How do we use the
chain rule to find the d’Alembert formula for the one-dimensional
wave IVP?
We have changed the order of integration in this calculation, and for this to work we
need to assume that $\int_{\mathbb{R}^n} |f(x)|\,dx < \infty$.
We see that the natural definition of $\hat F$ is as the distribution with the value $\langle \hat f, \phi\rangle$
at the test function $\phi$ being $\langle f, \hat\phi\rangle$. Here
\[ \hat\phi(\xi) = \int_{\mathbb{R}^n} \phi(x) e^{-i\langle \xi, x\rangle}\,dx. \]
This allows us to differentiate under the integral sign any number of times, so clearly
ϕ̂ is C ∞ smooth. The problem is rather that ϕ̂ is too smooth! One can show that
ϕ̂ is an analytic function, and such functions can never be = 0 on an open set, in
particular not outside any bounded set.
The solution to this problem is to use a slightly different class of test functions.
We make the following analogue of Definition 2.1.2.
Definition 3.1.1 (Tempered distributions and Schwartz test functions). Let $\mathcal{S}(\mathbb{R}^n)$
denote the set of all Schwartz test functions $\phi : \mathbb{R}^n \to \mathbb{R}$ such that all partial
derivatives of $\phi$ of all orders exist as continuous functions which are rapidly decaying.
By a function being rapidly decaying we mean that it decays as $O(|x|^{-N})$ for any
N < ∞ as x → ∞.
Let $\mathcal{S}'(\mathbb{R}^n)$ denote the set of all tempered distributions, that is functionals
$F : \mathcal{S}(\mathbb{R}^n) \to \mathbb{R}$ such that F is linear and defined on all $\mathcal{S}(\mathbb{R}^n)$.
The following lemma will allow us to define Fourier transforms of tempered
distributions according to the above discussion.
Lemma 3.1.2 (Fourier transform of Schwartz functions). If ϕ ∈ S(Rn ), then ϕ̂ ∈
S(Rn ).
This shows that $\hat\phi^{(k)}$ exists as a continuous function, since $x^k\phi(x)$ is integrable. To
further show that it is rapidly decaying we integrate by parts N times and obtain
\[ \hat\phi^{(k)}(\xi) = (i\xi)^{-N} \int \big( (-ix)^k \phi(x) \big)^{(N)} e^{-i\xi x}\,dx. \]
Since the integrand here is integrable for any N, we see the stated rapid decay.
Definition 3.1.3 (Fourier transform of tempered distributions). Let $F \in \mathcal{S}'(\mathbb{R}^n)$
be a tempered distribution. We define its Fourier transform to be the tempered
distribution $\hat F$ defined by
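Here $L^2(\mathbb{R}^n)$ is a Hilbert space, whereas the other spaces are not even Banach spaces
as there is no norm ∥ · ∥ defined on them. The Fourier transform maps bijectively
\begin{align*}
\mathcal{F} &: \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n), \\
\mathcal{F} &: L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n), \\
\mathcal{F} &: \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n),
\end{align*}
but is not defined on all $\mathcal{D}'(\mathbb{R}^n)$ since $\mathcal{F}$ only maps $\mathcal{D}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$.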
Figure 3.1: Examples of (a) distribution/test function, and (b) tempered distribu-
tion/Schwartz test function.
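Example 3.1.4 (Dirac delta revisited). Consider the Dirac delta distribution $\delta_a$
from Definition 2.1.3. Clearly
\[ \phi(a) \]
is well defined not only for $\phi \in \mathcal{D}(\mathbb{R}^n)$, but also for all $\phi \in \mathcal{S}(\mathbb{R}^n)$. This means that
$\delta_a$ in fact is a tempered distribution. We compute
\[ \langle \hat\delta_a, \phi\rangle = \langle \delta_a, \hat\phi\rangle = \hat\phi(a) = \int_{\mathbb{R}^n} e^{-i\langle a, x\rangle} \phi(x)\,dx. \]
This shows that
\[ \hat\delta_a = e^{-i\langle a, \xi\rangle}. \]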
\[ \hat\delta_0 = 1. \]
This should be observed with some degree of fascination, since the interpretation is
that the Dirac delta distribution contains equally much of all frequencies. We know
from Fourier analysis that
\[ \hat f(\xi) \to 0, \quad \xi \to \infty, \]
for any integrable function f by the Riemann–Lebesgue lemma. This shows conversely
that $\delta_a$ is not a function, since its Fourier transform does not decay as ξ → ∞.
to be well defined for all test functions. Clearly $f \in \mathcal{D}'(\mathbb{R})$ since test functions
$\phi \in \mathcal{D}(\mathbb{R})$ are zero outside a bounded interval, so the integral converges trivially.
However, f is not a tempered distribution since the rapid decay of Schwartz test
functions $\phi \in \mathcal{S}(\mathbb{R})$ is not enough to counteract the growth of f and make the
integral convergent. A concrete example: $\langle f, \phi\rangle$ is not defined/convergent for
\[ \phi(x) = e^{-\sqrt{x^2+1}/2} \]
We next compute the Fourier transforms of the most important functions in PDE
theory.
Example 3.1.6 (Heat kernel=Gaussian). Fix a scale parameter t > 0 and consider
the function
\[ e^{-|x|^2/(4t)}, \quad x \in \mathbb{R}^n. \]
This so-called Gauss function plays a central role in mathematics. For us it provides
a simple example of a Schwartz test function, and we see below that it is the most
important function in studying the heat equation.
The standard computation of its Fourier transform is by completing the square
in the exponent
\[ \int_{\mathbb{R}^n} e^{-|x|^2/(4t) - i\langle \xi, x\rangle}\,dx = \int_{\mathbb{R}^n} e^{-\langle x + 2it\xi,\, x + 2it\xi\rangle/(4t) - t|\xi|^2}\,dx. \]
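The outcome of this computation, namely that the Fourier transform of $e^{-|x|^2/(4t)}$ is $(4\pi t)^{n/2} e^{-t|\xi|^2}$, can be checked numerically. The following sketch does this for n = 1 with the midpoint rule. The value t = 0.5, the truncation radius and the sample frequencies are arbitrary choices for illustration.

```python
import cmath
import math

def gaussian_fourier(t, xi, radius=12.0, n=12000):
    """Midpoint approximation of int exp(-x^2/(4t)) exp(-i xi x) dx over (-radius, radius)."""
    h = 2.0 * radius / n
    total = 0j
    for i in range(n):
        x = -radius + (i + 0.5) * h
        total += math.exp(-x * x / (4.0 * t)) * cmath.exp(-1j * xi * x)
    return total * h

t = 0.5
checks = []
for xi in (0.0, 1.0, 2.5):
    numeric = gaussian_fourier(t, xi)
    closed_form = math.sqrt(4.0 * math.pi * t) * math.exp(-t * xi * xi)
    checks.append((xi, numeric, closed_form))
    print(xi, numeric, closed_form)
```

The numerical value is complex, but its imaginary part vanishes up to rounding since the Gauss function is even.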
Applying this to the Gauss functions from Example 3.1.6, after division by the
constant $(4\pi t)^{n/2}$, gives
\[ \mathcal{F}\Big\{ \int_a^b (4\pi t)^{-n/2} e^{-|x|^2/(4t)}\,dt \Big\} = \int_a^b e^{-t|\xi|^2}\,dt. \]
Integration gives
\[ \mathcal{F}\Big\{ \frac{1}{4\pi^{n/2}} |x|^{2-n} \int_{|x|^2/(4b)}^{|x|^2/(4a)} s^{n/2-2} e^{-s}\,ds \Big\} = \big( e^{-a|\xi|^2} - e^{-b|\xi|^2} \big)/|\xi|^2. \]
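The corresponding result in $\mathbb{R}^2$ however requires distributions since
$\int_{|\xi|<1} |\xi|^{-2}\,d\xi = \infty$ in the plane. Letting a = 0, we have
\[ \mathcal{F}\Big\{ \frac{1}{4\pi} \int_{|x|^2/(4b)}^{\infty} s^{-1} e^{-s}\,ds \Big\} = \big( 1 - e^{-b|\xi|^2} \big)/|\xi|^2. \]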
We cannot however let b → ∞ since both sides diverge, even in the sense of
distributions. The left hand side is nevertheless close to a logarithm, and we write
\begin{align*}
\ln\frac{1}{|x|^2} = \int_{|x|^2/(4b)}^{1/(4b)} \frac{ds}{s}
&= \int_{|x|^2/(4b)}^{1/(4b)} (1 - e^{-s}) \frac{ds}{s} + \int_{|x|^2/(4b)}^{1/(4b)} e^{-s} \frac{ds}{s} \\
&= \int_{|x|^2/(4b)}^{1/(4b)} (1 - e^{-s}) \frac{ds}{s} + \int_{|x|^2/(4b)}^{\infty} e^{-s} \frac{ds}{s} - \int_{1/(4b)}^{\infty} e^{-s} \frac{ds}{s}.
\end{align*}
On the right hand side, the first term defines a function g(x) such that $|g(x)| \le
|1 - |x|^2|/(4b)$. Therefore g → 0 weakly as b → ∞. The last term is a constant c(b),
where c(b) → +∞ as b → ∞. We obtain
\[ \mathcal{F}\Big\{ \frac{1}{2\pi} \ln\frac{1}{|x|} \Big\} = \lim_{b\to\infty} \Big( \big( 1 - e^{-b|\xi|^2} \big)/|\xi|^2 - c(b)\pi^{-1}\delta_0(\xi) \Big), \tag{3.2} \]
where the limit exists in the weak sense. The conclusion is that outside the origin,
the Fourier transform of $\frac{1}{2\pi} \ln\frac{1}{|x|}$ equals the function $1/|\xi|^2$, but at the origin this
Fourier transform has a distributional singularity which best can be described as a
radial analogue of $\delta_0'$ from Example 2.1.8.
It is instructive to note that in (3.2), we have $\mathcal{F}\{\frac{1}{2\pi} \ln\frac{1}{|x|}\} < 0$ at ξ = 0, since
c(b) → +∞, if we approximate the Dirac delta as in Example 2.1.11. See Figure X.
This is indeed reasonable since we know from Fourier analysis that for functions
\[ \int_{\mathbb{R}^n} f(x)\,dx = \hat f(0), \]
and in our case $\int_{\mathbb{R}^2} \ln\frac{1}{|x|}\,dx = -\infty$.
The basic property of the Fourier transform is that derivatives and convolutions
acting on f (x) correspond to multiplication on fˆ(ξ). These operations are extended
to tempered distributions as follows.
We have already seen in Definition 2.1.6 how the weak derivative $\partial_{x_k} F$ of any
distribution F is defined through “integration by parts”
For tempered distributions $F \in \mathcal{S}'(\mathbb{R}^n)$ one can show through a limiting argument
that this holds for any $\phi \in \mathcal{S}(\mathbb{R}^n)$ if it holds for all $\phi \in \mathcal{D}(\mathbb{R}^n)$.
Recall that the convolution of two functions f(x) and g(x) is the function
f ∗ g(x) defined by the integral
\[ f * g(x) = \int_{\mathbb{R}^n} f(x - y)g(y)\,dy, \quad x \in \mathbb{R}^n. \]
where the two integrals are only written as a motivation for the definition. For
this definition to work, g must be smooth so that gϕ is a test function and the
right hand side ⟨F, gϕ⟩ is well defined. However, depending on how singular
the distribution F is, one can sometimes allow more general functions g. For
example, the product with a Dirac delta δa g is well defined whenever g is a
continuous function.
We omit the technical details and refer to the discussion above for the meaning
of “suitable”.
for all test functions ϕ, by using the definition (3.3) of the convolution of distribu-
tions.
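for all $G \in \mathcal{S}'(\mathbb{R}^n)$. On the other hand, the Fourier transform of $\delta_0'$ is
\[ \langle \widehat{\delta_0'}, \phi\rangle = \langle \delta_0', \hat\phi\rangle = -(\hat\phi)'(0) = \int ix\phi(x)e^{i0x}\,dx = \langle ix, \phi\rangle. \]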
′
a result which we use to diagonalize differential operators and solve initial value
problems for PDEs.
The interpretation of this result is that a differentiation is nothing but convolution
by a distribution whose Fourier transform is a polynomial which grows at ∞.
The only difference from a usual convolution by a function F is that there $\hat F(\xi) \to 0$,
rather than $\hat F(\xi) \to \infty$, as ξ → ∞. So, convolution by a function may be seen as a
negative differentiation that smooths the function G.
Figure 3.3: (a) A function G. (b) Its derivative $G' = \delta_0' * G$. (c) The convolution
$e^{-|x|^2} * G(x)$.
which describes the heat distribution, or shape of the wave, at time t. In other
words, we consider a function R → L2 (Rn ).
We first perform this calculation in the simpler case of the heat equation.
Example 3.2.1 (Heat IVP on Rn ). Consider the following initial value problem for
the heat equation.
\[ \begin{cases} \partial_t u(t, x) = k\Delta u(t, x) + f(t, x), & t > 0,\ x \in \mathbb{R}^n, \\ u(0, x) = g(x), & x \in \mathbb{R}^n. \end{cases} \]
Here k > 0 is the constant from Fourier's law, g is the initial data describing the
distribution of heat at time t = 0, and f describes the sources, possibly time-dependent,
which are present.
For a fixed time t, write ut (x) = u(t, x) and ft (x) = f (t, x) and view ut and ft
as vectors in the Hilbert space L2 (Rn ). We obtain the vector valued ODE
∂t ut = k∆ut + ft ,
with the Laplace operator ∆ as a linear map on the Hilbert space L2 (Rn ).
To solve, we proceed similarly to the finite dimensional case in Example 1.1.2
and define ût = F{ut }. Applying, for each fixed t, the Fourier transform in the
x-variable, we obtain
\[
\partial_t \hat u_t = -k|\xi|^2 \hat u_t + \hat f_t,
\]
with ξ = (ξ1 , . . . , ξn ), since $\Delta = \sum_k \partial_{x_k}^2$ and each $\partial_{x_k}$ transforms to $i\xi_k$.
Now fix ξ ∈ Rn and regard instead t as variable. Consider the scalar ODE
$\partial_t \hat u_t(\xi) = -k|\xi|^2 \hat u_t(\xi) + \hat f_t(\xi)$. Multiplication by the integrating factor $e^{k|\xi|^2 t}$, and
To find the solution formula u(t, x) in x-space, we must find the inverse Fourier
transform of $e^{-k|\xi|^2 t}$. Replacing t by kt in Example 3.1.6, we have
\[
\mathcal{F}\{(4\pi k t)^{-n/2} e^{-|x|^2/(4kt)}\} = e^{-k|\xi|^2 t}.
\]
The formula (3.6) completely describes how the solution u(t, x) depends on
the initial data g and the sources f , by convolution with the Gauss function, or heat
kernel. We discuss this further in Section 3.3.
Definition 3.2.2 (Heat kernel). For the heat equation with constant k > 0, we
define the heat kernel
\[
H_t(x) = (4\pi k t)^{-n/2} e^{-|x|^2/(4kt)}.
\]
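As a sanity check of the Fourier transform pair behind the heat kernel, the following minimal sketch compares a numerical Fourier transform of $H_t$ in dimension n = 1 with $e^{-k\xi^2 t}$. The function names, the truncation length L and the grid size N are illustrative assumptions, not part of the notes; the convention $\hat f(\xi) = \int f(x)e^{-i\xi x}dx$ is assumed.

```python
import math

def heat_kernel(x, t, k=1.0):
    """Heat kernel H_t(x) = (4 pi k t)^(-1/2) exp(-x^2 / (4 k t)) in dimension n = 1."""
    return (4.0 * math.pi * k * t) ** (-0.5) * math.exp(-x * x / (4.0 * k * t))

def fourier_transform_even(f, xi, L=40.0, N=8000):
    """Trapezoidal approximation of int_{-L}^{L} f(x) exp(-i xi x) dx.
    For a real even f only the cosine part survives, so the result is real."""
    h = 2.0 * L / N
    total = 0.0
    for j in range(N + 1):
        x = -L + j * h
        weight = 0.5 if j in (0, N) else 1.0
        total += weight * f(x) * math.cos(xi * x)
    return total * h

k, t = 0.5, 0.3  # arbitrary sample values
for xi in (0.0, 1.0, 2.5):
    numeric = fourier_transform_even(lambda x: heat_kernel(x, t, k), xi)
    exact = math.exp(-k * xi * xi * t)
    print(xi, numeric, exact)
```

The two printed columns should agree to many digits, since the trapezoidal rule is very accurate for rapidly decaying smooth integrands.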
We next solve the wave equation on Rn in the same way, but with somewhat
more complicated calculations.
Example 3.2.3 (Wave IVP on Rn ). Consider the following initial value problem
for the wave equation.
\[
\begin{cases}
\partial_t^2 u(t,x) = c^2\Delta u(t,x) + f(t,x), & t > 0,\ x \in \mathbb{R}^n,\\
u(0,x) = g(x), & x \in \mathbb{R}^n,\\
\partial_t u(0,x) = h(x), & x \in \mathbb{R}^n.
\end{cases}
\]
Here c > 0 is a constant, which we shall see has the meaning of speed of propagation,
g is the initial shape of the wave, h is the initial speed of the wave, and f describes
exterior forces, possibly time-dependent, which are present.
As in our calculation for the heat equation, we obtain a vector valued ODE
∂t2 ut = c2 ∆ut + ft ,
which is now of second order. To solve, we apply, for each fixed t, the Fourier
transform in the x-variable, and obtain
∂t2 ût = −c2 |ξ|2 ût + fˆt .
Now fix ξ ∈ Rn and regard instead t as variable. Consider the scalar ODE ∂t2 ût (ξ) =
−c2 |ξ|2 ût (ξ) + fˆt (ξ). From Example 1.1.3 with ω = c|ξ| and a(t) = fˆt (ξ) we have
the solution
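\[
\hat u_t(\xi) = \hat g(\xi)\cos(c|\xi|t) + \hat h(\xi)\,\frac{\sin(c|\xi|t)}{c|\xi|} + \int_0^t \frac{\sin(c|\xi|(t-s))}{c|\xi|}\,\hat f_s(\xi)\,ds,
\]
where the constants from Example 1.1.3 have been determined by the initial conditions $\hat u_0 = \hat g$ and $\partial_t \hat u_0 = \hat h$.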
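The structure of this solution formula can be checked numerically for a single fixed ξ. The sketch below treats the scalar ODE $y'' = -\omega^2 y + a(t)$, $y(0) = g$, $y'(0) = h$, and compares the formula (homogeneous part plus Duhamel integral, evaluated with Simpson's rule) against a direct Runge–Kutta integration. All names, step counts and sample data are illustrative assumptions.

```python
import math

def duhamel_solution(g, h, a, omega, t, N=2000):
    """y(t) = g cos(wt) + h sin(wt)/w + int_0^t sin(w(t-s))/w a(s) ds (Simpson, N even)."""
    homogeneous = g * math.cos(omega * t) + h * math.sin(omega * t) / omega
    ds = t / N
    total = 0.0
    for j in range(N + 1):
        s = j * ds
        weight = 1.0 if j in (0, N) else (4.0 if j % 2 == 1 else 2.0)
        total += weight * math.sin(omega * (t - s)) / omega * a(s)
    return homogeneous + total * ds / 3.0

def rk4_solution(g, h, a, omega, t, N=4000):
    """Integrate y'' = -omega^2 y + a(t) as a first order system with classical RK4."""
    dt = t / N
    y, v, s = g, h, 0.0

    def rhs(s, y, v):
        return v, -omega * omega * y + a(s)

    for _ in range(N):
        k1y, k1v = rhs(s, y, v)
        k2y, k2v = rhs(s + dt / 2, y + dt / 2 * k1y, v + dt / 2 * k1v)
        k3y, k3v = rhs(s + dt / 2, y + dt / 2 * k2y, v + dt / 2 * k2v)
        k4y, k4v = rhs(s + dt, y + dt * k3y, v + dt * k3v)
        y += dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        s += dt
    return y

source = lambda s: math.exp(-s)  # sample forcing term
print(duhamel_solution(1.0, -0.5, source, 3.0, 2.0))
print(rk4_solution(1.0, -0.5, source, 3.0, 2.0))
```

Here ω plays the role of c|ξ|, and g, h, a(t) play the roles of $\hat g(\xi)$, $\hat h(\xi)$ and $\hat f_t(\xi)$.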
To find the solution formula u(t, x) in x-space, we must find a function Rt such
that
sin(c|ξ|t)
F{Rt } = , ξ ∈ Rn .
c|ξ|
It turns out however that in dimension n ≥ 3, this Fourier transform decays too slowly
as ξ → ∞ for such a function Rt to exist: we must use distributions! We calculate
these distributions below. Interestingly, these so-called Riemann functions (we use
this traditional terminology even though these are distributions) look very different
depending on the dimension n. However, in any dimension we have Rt = 0 when
|x| > ct, in the sense that ⟨Rt , ϕ⟩ = 0 for all test functions such that ϕ = 0 when
|x| ≤ ct. In particular, the convolution Rt ∗ f with any locally integrable function f
is well defined, and one can show that such convolutions Rt ∗ f are functions. At the
end of the day, we obtain after inverse Fourier transformation the solution formula
\[
u(t,x) = \partial_t (R_t * g)(x) + (R_t * h)(x) + \int_0^t (R_{t-s} * f_s)(x)\,ds. \tag{3.7}
\]
Figure 3.5: (a) Initial speed ∂t u0 being a Dirac delta δ0 , and propagation speed c = 1,
giving a wave u1 at time t = 1 being the Riemann function R1 (x). (b) R1 (x) in
dimension n = 1. (c) Radial profile of R1 (x) in dimension n = 2. (d) Radial profile of
R1 (x) in dimension n = 3.
Proof. If n = 1, we calculate
\[
\int_{-ct}^{ct} e^{-i\xi x}\,dx = 2\,\frac{\sin(\xi ct)}{\xi} = 2c\,\frac{\sin(|\xi|ct)}{c|\xi|}.
\]
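\[
\iint_{x_1^2+x_2^2<c^2t^2} \frac{e^{-i|\xi|x_1}\,dx_1dx_2}{\sqrt{c^2t^2-x_1^2-x_2^2}}
= \int_{-ct}^{ct}\left(\int_{-\sqrt{c^2t^2-x_1^2}}^{\sqrt{c^2t^2-x_1^2}} \frac{dx_2}{\sqrt{c^2t^2-x_1^2-x_2^2}}\right) e^{-i|\xi|x_1}\,dx_1
\]
\[
= \int_{-ct}^{ct}\left(\int_{-1}^{1} \frac{du}{\sqrt{1-u^2}}\right) e^{-i|\xi|x_1}\,dx_1 = 2\pi c\,\frac{\sin(c|\xi|t)}{c|\xi|},
\]
by a change of variables $x_2 = \sqrt{c^2t^2-x_1^2}\,u$ and the one-dimensional result.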
If n = 3, by definition R̂t is the tempered distribution with value
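\[
\langle \hat R_t, \phi\rangle = \langle R_t, \hat\phi\rangle = \frac{1}{4\pi c^2 t}\iint_{|\xi|=ct} \hat\phi(\xi)\,dS(\xi) = \iiint_{\mathbb{R}^3}\left(\frac{1}{4\pi c^2 t}\iint_{|\xi|=ct} e^{-i\langle\xi,x\rangle}\,dS(\xi)\right)\phi(x)\,dx
\]
\[
\iint_{|x|=ct} e^{-i|\xi|x_3}\,dS(x) = \int_0^{2\pi}\!\!\int_0^{\pi} e^{-i|\xi|ct\cos\theta}\,(ct)^2\sin\theta\,d\theta\,d\varphi
= 2\pi c^2t^2\int_0^{\pi} e^{-i|\xi|ct\cos\theta}\sin\theta\,d\theta = 4\pi c^2 t\,\frac{\sin(c|\xi|t)}{c|\xi|}.
\]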
\[
\hat u_t(\xi) = \hat g(\xi)\,e^{-k|\xi|^2 t}, \tag{3.8}
\]
\[
u(t,x) = \int_{\mathbb{R}^n} g(y)\,e^{-|y-x|^2/(4kt)}\,\frac{dy}{(4\pi k t)^{n/2}}, \tag{3.9}
\]
Therefore u(t, x) = 0 when |x − a| > ct. Again we have the remarkable fact in
R3 that u(t, x) = 0 also when |x − a| < ct!
Figure 3.6: (a) Domain of dependence: wave= black, heat=red. (b) Domain of influ-
ence: wave= black, heat=red.
We have seen that Huygens principle holds in dimension 3, but not in dimension
2, and that a half Huygens principle holds in dimension 1 in the sense that Rt does
not vanish for |x| < ct but is constant there so that ∂t Rt = 0 for |x| < ct. One
can show that Huygens principle holds precisely in odd dimension n ≥ 3. From the
above discussion it is clear that odd dimensional worlds are quiet ones, whereas even
dimensional worlds are noisy ones.
Another observation based on the solution formulas for the IVPs concerns the
direction of time. In this case the formulas in Fourier ξ-space are the most helpful.
For the heat equation, we see from (3.8) that we cannot solve backward in
time. Indeed, for t < 0 the Gaussian $e^{-k|\xi|^2 t}$ grows so fast at ∞ that it does
not even define a tempered distribution, and therefore the Fourier transform
breaks down. (Compare Example 3.1.5.) This means that heat flow is an
irreversible process, meaning that it is impossible to reconstruct ut for t < 0
from u0 in a stable way.
A closely related fact is that heat evolution is a smoothing process: No matter
how irregular the distribution of heat is at t = 0, it will be C ∞ smooth at any
t > 0. This is clear from (3.9), where we can differentiate under the integral
sign any number of times.
For the wave equation, we see from (3.10) that this solution is valid for t < 0
as well. This means that wave propagation is a reversible process, meaning
that we can reconstruct in a stable way the wave ut at times t < 0 from the
data g and h at t = 0. Also, wave evolution is not a smoothing process. See
Exercise 2.4.
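To make the contrast between the two equations concrete, the following small sketch tabulates the Fourier multipliers acting on $\hat g(\xi)$: the heat factor $e^{-k|\xi|^2 t}$ from (3.8) and the wave factor $\cos(c|\xi|t)$. The function names and sample frequencies are illustrative assumptions. For t < 0 the heat factor amplifies high frequencies enormously, so any small error in the data is blown up, whereas the wave factor stays bounded by 1 in both time directions.

```python
import math

def heat_factor(xi, t, k=1.0):
    """Multiplier exp(-k xi^2 t) acting on g-hat for the heat equation."""
    return math.exp(-k * xi * xi * t)

def wave_factor(xi, t, c=1.0):
    """Multiplier cos(c |xi| t) acting on g-hat for the wave equation."""
    return math.cos(c * abs(xi) * t)

print("xi, heat(t=1), heat(t=-1), wave(t=-1)")
for xi in (1.0, 5.0, 10.0):
    print(xi, heat_factor(xi, 1.0), heat_factor(xi, -1.0), wave_factor(xi, -1.0))
```

At frequency ξ = 10 a data error of size $10^{-16}$ would already be magnified beyond $10^{27}$ when going one time unit backward in the heat equation (with k = 1).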
Figure 3.7: Heat equation: (a) given function u at t = 0, (b) smoothed out heat at
t = 1, and (c) an attempt to numerically compute u at t = −1.
Figure 3.8: (a) Wave equation: two given superimposed waves at t = 0. (b) Waves at
t = 1 and (c) waves at t = −1, both computable from (a) and non-smooth.
Example 3.3.4 (Uniqueness for the backward heat IBVP). An argument similar to
that in Proposition 3.3.2 can be used to prove uniqueness for the IBVP for the heat
equation. Indeed, if ∂t u = ∆u for x ∈ D and t > 0, then the one-variable function
\[
E(t) = \iint_D |u(t,x)|^2\,dx
\]
has derivative
\[
E'(t) = 2\iint_D u\,\Delta u\,dx = 2\int_{\partial D} u\,\partial_\nu u\,dS(x) - 2\iint_D |\nabla u|^2\,dx.
\]
3.4 Exercises
1. For which k ∈ R does f (x) = |x|k define a tempered distribution on Rn ? It
suffices to consider n = 2, 3. Note that there are two important points: x = 0
and x = ∞.
2. Define the function
\[
f_k(x) = \begin{cases} 1, & |x| < k,\\ 0, & |x| > k. \end{cases}
\]
Show that fk → 1 weakly as k → ∞. Compute fˆk . Show that sin(kξ)/ξ → πδ0 weakly as k → ∞. Plot the
graphs of sin(kξ)/ξ. Can you see this convergence result?
Sketch the wave resulting from this 1D hammer blow at times t = a/2c, a/c, 3a/2c, 2a/c
and 5a/c.
7. We consider spherical three-dimensional waves u(t, r), where $r = \sqrt{x^2+y^2+z^2}$.
Using spherical coordinates, the wave equation in dimension n = 3 reduces to
(a) Consider the function v(t, r) = ru(t, r). Show that v solves the wave
equation in dimension n = 1.
(b) Given two even functions g(r) and h(r), solve the IVP u(0, r) = g(r),
∂t u(0, r) = h(r).
10. Solve the heat equation ∂t u = k∂x2 u − a∂x u, u(0, x) = g(x), with constant
convection a. Hint: adapt Example 3.2.1.
11. Let u(t, x) solve the wave equation in dimension n = 1, with c = 1. Define the
energy density $e = ((\partial_t u)^2 + (\partial_x u)^2)/2$ and the momentum density p = ∂t u∂x u.
12. Consider the damped wave equation (1.11) with no sources f = 0. Show that
energy decreases with time.
13. We consider spherical n-dimensional waves u(t, r). Using spherical coordinates
in Rn , the wave equation in dimension n reduces to
(a) Which differential equations must α and β satisfy, if such u solves the
spherical wave equation for any choice of f ? Hint: write down the ODE
for f and set the coefficients to zero.
(b) Show that solutions u ̸= 0 of this form exist only when n = 1 or n = 3.
14. Solve the wave equation, with c = 1, in dimension n = 3 with g = 0 and initial
speed (
1, |x| < 1,
h(x) =
0, |x| > 1.
Hint: compute the area of the spherical cap {y ; |y − x| = ct} ∩ {y ; |y| < 1}.
(a) Sketch the wave resulting from this 3D hammer blow at times t = 1/2, 1
and 2.
(b) Sketch u as a function of t, at points where |x| = 1/2 and |x| = 2
respectively.
The goal in this chapter is to solve general IBVPs, where we consider the time-
evolution of a PDE on some domain D ⊂ Rn , n = 2, 3 in particular. Our strategy
is similar to Chapter 3, but replacing the Fourier transform (which is suitable only
for D = Rn ) by something like Fourier series, but adapted to the domain D ⊂
Rn . These generalized sine and cosine functions for D give important theoretical
insights for IBVPs, and we also learn how to numerically compute them in order to
solve IBVPs.
Recommended reading:
we assume D to be bounded, and we shall see that in this case the diagonalization
leads to a discrete set of scalar ODEs, one for each Laplace eigenvalue.
We know from Fourier analysis, in one dimension, that the Fourier transform is
useful for functions defined on all R, whereas Fourier series are useful for functions
defined on a bounded interval. The real form of a Fourier series uses the sine and
cosine functions.
Example 4.1.1 (Laplace eigenfunctions on intervals). The sine functions sin(kx)
are eigenfunctions to the Laplace operator ∆ = ∂x2 on D = (0, π),
\[
\partial_x^2 \sin(kx) = -k^2\sin(kx),
\]
with Dirichlet boundary conditions at x = 0 and x = π, k = 1, 2, . . . As in Fourier
analysis, $\{\sqrt{2/\pi}\,\sin(kx)\}_{k=1}^{\infty}$ form an ON-basis for L2 (0, π).
The cosine functions cos(kx) are eigenfunctions to the Laplace operator ∆ = ∂x2
on D = (0, π),
∂x2 cos(kx) = −k 2 cos(kx),
with Neumann boundary conditions at x = 0 and x = π, k = 0, 1, 2, . . . As in Fourier
analysis, $\{1/\sqrt{\pi}\} \cup \{\sqrt{2/\pi}\,\cos(kx)\}_{k=1}^{\infty}$ form an ON-basis for L2 (0, π). Note that we
here, unlike the Dirichlet case, include k = 0.
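The orthonormality claimed in this example is easy to check numerically. The sketch below computes Gram matrices of the first few Dirichlet and Neumann eigenfunctions on (0, π) with Simpson's rule; they should be close to identity matrices. Function names and the quadrature resolution are illustrative assumptions.

```python
import math

def inner(f, g, a=0.0, b=math.pi, N=2000):
    """L2(a, b) inner product of real functions via composite Simpson's rule (N even)."""
    h = (b - a) / N
    total = 0.0
    for j in range(N + 1):
        x = a + j * h
        weight = 1.0 if j in (0, N) else (4.0 if j % 2 == 1 else 2.0)
        total += weight * f(x) * g(x)
    return total * h / 3.0

def dirichlet_basis(k):
    """Normalized Dirichlet eigenfunction sqrt(2/pi) sin(kx), k = 1, 2, ..."""
    return lambda x: math.sqrt(2.0 / math.pi) * math.sin(k * x)

def neumann_basis(k):
    """Normalized Neumann eigenfunction: 1/sqrt(pi) for k = 0, else sqrt(2/pi) cos(kx)."""
    if k == 0:
        return lambda x: 1.0 / math.sqrt(math.pi)
    return lambda x: math.sqrt(2.0 / math.pi) * math.cos(k * x)

def gram(basis, indices):
    """Matrix of inner products <e_i, e_j> for the given indices."""
    return [[inner(basis(i), basis(j)) for j in indices] for i in indices]

for row in gram(dirichlet_basis, [1, 2, 3]):
    print([round(v, 10) for v in row])
for row in gram(neumann_basis, [0, 1, 2]):
    print([round(v, 10) for v in row])
```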
The goal in this section is to show that for any bounded domain D ⊂ Rn we
have, for either choice of boundary condition, an ON-basis for L2 (D) consisting of
Laplace eigenfunctions like this. One way to view this result is as a generalization
of the spectral theorem 0.2.3 from linear algebra. The generalization consists in
replacing the finite dimensional vector space by the Hilbert space L2 (D) and viewing
the Laplace operator as an infinitely large matrix. As in the spectral theorem, we need
symmetry of the operator in order to have an ON-basis of eigenfunctions. And indeed,
Green’s second formula shows that
\[
\int_D (v\Delta u - u\Delta v)\,dx = \int_{\partial D} (v\,\partial_\nu u - u\,\partial_\nu v)\,dS.
\]
Now, if both u and v satisfy either Dirichlet or Neumann boundary conditions, then
the right hand side vanishes, and we interpret this result as
We now use two general compactness results: Theorems 2.2.6 and 4.1.4. From these
we can deduce that there exists a subsequence $\{u_{j_k}\}_{k=1}^{\infty}$, and limits u ∈ L2 (D) and
vi ∈ L2 (D) such that
\[
\|u_{j_k} - u\| \to 0, \quad k \to \infty,
\]
\[
\langle \partial_i u_{j_k}, \phi\rangle \to \langle v_i, \phi\rangle, \quad k \to \infty,
\]
for each ϕ ∈ L2 (D) and i = 1, . . . , n. For test functions ϕ ∈ D(D), it follows that
\[
-\int_D u\,\partial_i\phi\,dx = -\lim_{k\to\infty}\int_D u_{j_k}\,\partial_i\phi\,dx = \lim_{k\to\infty}\int_D (\partial_i u_{j_k})\,\phi\,dx = \int_D v_i\,\phi\,dx,
\]
so $u \in H^1(D)$ with weak derivatives ∂i u = vi . Further we check that ∥u∥ = 1, and
that $u \in H_0^1(D)$ since $H_0^1(D)$ is a closed subspace of $H^1(D)$ and u is the weak limit
of $u_{j_k} \in H_0^1(D)$ in $H^1(D)$. Finally, we verify that R(u) = m, where it remains to
show that $\|\nabla u\|^2 \le m$. Using ϕi = ∂i u, we have
\[
\|\nabla u\|^2 = \sum_{i=1}^{n} \langle \partial_i u, \phi_i\rangle = \lim_{k\to\infty} \sum_{i=1}^{n} \langle \partial_i u_{j_k}, \phi_i\rangle \le \sqrt{m}\,\|\nabla u\|
\]
Proof. (1) We first show that e1 ∈ H01 (D) is a Dirichlet eigenfunction. In the weak
sense, by Green’s first identity, this means that
\[
\int_D \langle \nabla e_1, \nabla\phi\rangle\,dx = \lambda_1 \int_D e_1\,\phi\,dx,
\]
for all ϕ ∈ H01 (D), and we have seen that the candidate for λ1 is λ1 = R(e1 ).
Consider the one-variable function
This proves that ∆e1 = −R(e1 )e1 in the weak sense. Any Dirichlet eigenvalue,
being a Rayleigh quotient, is positive and cannot be zero, since this would force
the eigenfunction to be constant and hence zero by the boundary conditions. By
construction R(e1 ) is the smallest Dirichlet eigenvalue.
(2) To show that e1 ≥ 0 on all D, consider the function
|e1 (x)|, x ∈ D.
where the boundary integrals vanish since e1 = 0 on ∂D+ and ∂D− . This proves
that in the weak sense, ∇|e1 | is the vector field
\[
\nabla|e_1| = \begin{cases} \nabla e_1(x), & x \in D_+,\\ -\nabla e_1(x), & x \in D_-, \end{cases}
\]
since ∆|e1 | = −λ1 |e1 |, G < 0 and ∂ν G > 0. (Note however that |e1 | is not a C 2
function, so some more technical details are needed to justify this, which we omit.)
(3) We have shown that, possibly after normalization, e1 > 0 in all D. Let u be
any other Dirichlet eigenfunction with eigenvalue λ1 . If u is not parallel to e1 , then
we may assume that ⟨e1 , u⟩ = 0 after Gram–Schmidt orthogonalization. But by the
same argument as above, we also have u > 0 in all D. But two positive functions
cannot be orthogonal in L2 . Therefore u must be parallel to e1 .
with eigenvalues 0 < λ1 < λ2 ≤ λ3 ≤ . . .. Note that λ2 > λ1 follows from Proposi-
tion 4.1.5, and λ1 > 0 follows from the Poincaré inequality, but some of the remaining
eigenfunctions e2 , e3 , . . . may have the same eigenvalue, depending on the symmetries
of D. A brief sketch of the construction of this Dirichlet eigenbasis is as follows.
of e1 from Proposition 4.1.5. A minimizer of R(u), not over H01 (D) but in
the subspace V1 , can be shown to exist similarly to Proposition 4.1.3. A
modification of Proposition 4.1.5 shows that this minimizer is e2 . However,
somewhat similar to sin(2x) on (0, π), it will have an oscillation.
Figure 4.1: The Dirichlet eigenfunctions e1 (x), . . . , e6 (x) for a domain D, computed
with the Rayleigh-Ritz approximation from Section 4.3, using N = 4501 basis functions.
The error in the eigenvalue approximations is about 1%. (Note similarities to sin(kx)
on (0, π).)
for L2 (D), such that ẽj ∈ H01 (D) and all are Neumann eigenfunctions to the Laplace
operator ∆ in the sense that
with eigenvalues $0 = \tilde\lambda_0 < \tilde\lambda_1 \le \tilde\lambda_2 \le \ldots$. We prefer to index these eigenfunctions
from j = 0 since, just like the cosine functions on (0, π), we have that $\tilde e_0(x)$ is
a constant function, and hence $\tilde\lambda_0 = 0$. Again, $\tilde\lambda_1 > \tilde\lambda_0$ follows from the Poincaré
inequality, this time for $\tilde H^1(D)$, but some of the remaining eigenfunctions $\tilde e_1, \tilde e_2, \ldots$ may
have the same eigenvalue, depending on the symmetries of D. (If D is disconnected,
the first few Neumann eigenfunctions will be locally constant.) The construction
of the Neumann basis is very similar to the Dirichlet eigenbasis, using the whole
Sobolev space $H^1(D)$, and subspaces thereof, instead of $H_0^1(D)$.
Let us order these eigenvalues not by (n, m), but linearly as λ1 , λ2 , . . .. Consider the
j’th eigenvalue λj . Each of the j smaller eigenvalues corresponds to a point (n, m)
satisfying
\[
\pi^2\left((n/a)^2 + (m/b)^2\right) \le \lambda_j.
\]
Figure 4.2: The Neumann eigenfunctions ẽ0 (x), . . . , ẽ5 (x) for a domain D, computed
with the Rayleigh-Ritz approximation from Section 4.3, using N = 4781 basis functions.
The error in the eigenvalue approximations is about 1%. (Note similarities to cos(kx)
on (0, π).)
This means that (n, m) is inside a quarter ellipse with axes $\sqrt{\lambda_j}\,a/\pi$ and $\sqrt{\lambda_j}\,b/\pi$.
For large j, this number of points j is approximately equal to the area $(1/4)\pi\lambda_j ab/\pi^2$
of this quarter ellipse, that is
\[
\lambda_j \approx j\,4\pi/(ab).
\]
Since the area of the rectangle D is A = ab, we have verified Weyl’s law in the case
of a rectangle.
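The lattice point argument above can be tried out directly. The following sketch lists the Dirichlet eigenvalues $\pi^2((n/a)^2 + (m/b)^2)$, n, m ≥ 1, of an a × b rectangle in increasing order and compares the j'th one with the Weyl prediction 4πj/(ab). The function names and the lattice cutoff `nmax` are illustrative, ad hoc assumptions.

```python
import math

def rectangle_eigenvalues(a, b, count, nmax=None):
    """Smallest `count` Dirichlet eigenvalues of the rectangle (0,a) x (0,b), sorted."""
    if nmax is None:
        # Generous cutoff for the lattice (heuristic), so that no small eigenvalue is missed.
        nmax = int(2.0 * math.sqrt(count * max(a / b, b / a))) + 10
    values = [math.pi ** 2 * ((n / a) ** 2 + (m / b) ** 2)
              for n in range(1, nmax + 1) for m in range(1, nmax + 1)]
    values.sort()
    return values[:count]

def weyl_ratio(a, b, j):
    """The j'th eigenvalue divided by the Weyl prediction 4 pi j / (ab)."""
    lam_j = rectangle_eigenvalues(a, b, j)[j - 1]
    return lam_j * a * b / (4.0 * math.pi * j)

for j in (10, 100, 1000):
    print(j, weyl_ratio(1.0, 2.0, j))
```

The printed ratios approach 1 slowly from above as j grows; the convergence is slow because lattice points near the coordinate axes are not counted, a boundary effect of lower order.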
The following examples illustrate how to solve initial/boundary value problems
(IBVPs) using eigenfunction expansions.
Example 4.2.2 (Heat IBVP on D). Consider the Dirichlet IBVP for the heat equa-
tion
∂t u(t, x) = k∆u(t, x) + f (t, x), t > 0, x ∈ D,
u(0, x) = g(x), x ∈ D,
u(t, x) = 0, t > 0, x ∈ ∂D.
If we can solve this PDE problem, we recall from Section 1.4 that we can also
solve it in the case of inhomogeneous Dirichlet boundary conditions u|∂D = h ̸= 0.
(We can even reduce to the case g = 0, but choose to keep g since our method of
solution does not require g = 0.) The method for solving the problem stated here is
entirely similar to Example 3.2.1. What is different is that we replace the use of the
Fourier transform, which is applicable only to D = Rn , by the use of the Dirichlet
eigenfunctions
e1 (x), e2 (x), . . .
on D. (In the case of a rectangle, this would be a double Fourier sine series as in
Example 4.2.1.)
The computation is as follows. For a fixed time t, write ut (x) = u(t, x) and
ft (x) = f (t, x) and view ut and ft as vectors in the Hilbert space L2 (D). We obtain
the vector valued ODE
∂t ut = k∆ut + ft . (4.1)
Here we write, for each t > 0,
\[
u_t(x) = \sum_j \hat u_t(j)\,e_j(x),
\]
\[
f_t(x) = \sum_j \hat f_t(j)\,e_j(x),
\]
We use the notation ût (j), fˆt (j) and ĝ(j) for the coordinates of the functions in the
Dirichlet eigenbasis, but remember that this no longer refers to the Fourier transform
but rather denotes generalized Fourier series coefficients.
Since ∆ej = −λj ej , we obtain a sequence of ODEs
from (4.1) by equating the coefficients on both sides. Now fix j and regard instead
t as variable. Multiplication by the integrating factor $e^{k\lambda_j t}$, and changing dummy
variable from t to s, gives
Integration over 0 < s < t, using the initial condition û0 (j) = ĝ(j), and division by
$e^{k\lambda_j t}$ yields
\[
\hat u_t(j) = \hat g(j)\,e^{-k\lambda_j t} + \int_0^t \hat f_s(j)\,e^{-k\lambda_j (t-s)}\,ds. \tag{4.2}
\]
To summarize, given initial data g and source f , we compute their generalized
Fourier coefficients
\[
\hat g(j) = \int_D g(x)\,e_j(x)\,dx,
\]
\[
\hat f_t(j) = \int_D f_t(x)\,e_j(x)\,dx.
\]
From these and (4.2) we obtain ût (j), which gives the solution
\[
u(t,x) = \sum_j \hat u_t(j)\,e_j(x), \quad t > 0,\ x \in D,
\]
and we see that even if many of the coefficients ĝ(j) of the initial data are large (corresponding
to a non-smooth g), the coefficients ût (j) of the solution decay fast with j for
not too small t > 0 (corresponding to a smooth u for t > 0). Even
\[
u(t,x) \approx \hat g(1)\,e^{-k\lambda_1 t}\,e_1(x) = \left(\int_D g(y)\,e_1(y)\,dy\right) e^{-k\lambda_1 t}\,e_1(x)
\]
is a good approximation for the solution. Indeed, one can show for the Dirichlet
eigenvalues that λ1 < λ2 so that $e^{-k\lambda_1 t} \gg e^{-k\lambda_j t}$ for j ≥ 2 and t large enough.
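The recipe of this example can be illustrated in the simplest case D = (0, π) with f = 0, where $e_j(x) = \sqrt{2/\pi}\sin(jx)$ and $\lambda_j = j^2$ by Example 4.1.1. The sketch below computes the coefficients ĝ(j) by Simpson's rule for the sample initial data g(x) = x(π − x), sums the series, and compares with the one-mode approximation. The choice of g, the number of modes J and all function names are illustrative assumptions.

```python
import math

def simpson(f, a, b, N=2000):
    """Composite Simpson's rule on [a, b] with N (even) subintervals."""
    h = (b - a) / N
    total = f(a) + f(b)
    for j in range(1, N):
        total += (4.0 if j % 2 else 2.0) * f(a + j * h)
    return total * h / 3.0

def e(j, x):
    """Normalized Dirichlet eigenfunction on (0, pi) with eigenvalue j^2."""
    return math.sqrt(2.0 / math.pi) * math.sin(j * x)

def heat_coefficients(g, J):
    """Generalized Fourier coefficients g_hat(1), ..., g_hat(J)."""
    return [simpson(lambda y: g(y) * e(j, y), 0.0, math.pi) for j in range(1, J + 1)]

def heat_solution(g_hat, t, x, k=1.0):
    """Partial sum of u(t, x) = sum_j g_hat(j) exp(-k j^2 t) e_j(x), as in (4.2) with f = 0."""
    return sum(c * math.exp(-k * j * j * t) * e(j, x)
               for j, c in enumerate(g_hat, start=1))

g = lambda x: x * (math.pi - x)      # sample initial data, vanishing at the boundary
g_hat = heat_coefficients(g, 50)

x = 1.0
print("t = 0:", heat_solution(g_hat, 0.0, x), "vs g(x) =", g(x))
print("t = 1:", heat_solution(g_hat, 1.0, x),
      "vs one mode:", g_hat[0] * math.exp(-1.0) * e(1, x))
```

Already at t = 1 the full sum and the first term agree to several digits, which reflects the fast decay of $e^{-k\lambda_j t}$ in j.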
Example 4.2.3 (Hot spots conjecture). Consider instead the Neumann IBVP for
the heat equation
∂t u(t, x) = k∆u(t, x), t > 0, x ∈ D,
u(0, x) = g(x), x ∈ D,
∂ν u(t, x) = 0, t > 0, x ∈ ∂D.
Write the solution ut , for each time t > 0, as a linear combination of the Neumann
eigenfunctions
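\[
u_t(x) = \sum_{j=0}^{\infty} \hat u_t(j)\,\tilde e_j(x),
\]
and similarly for the initial data g. Then computations analogous to those in Example 4.2.2 show that
\[
u(t,x) \approx \hat g(0)\,\tilde e_0(x) + \hat g(1)\,e^{-k\tilde\lambda_1 t}\,\tilde e_1(x) = \frac{1}{|D|}\int_D g(y)\,dy + \left(\int_D g(y)\,\tilde e_1(y)\,dy\right) e^{-k\tilde\lambda_1 t}\,\tilde e_1(x)
\]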
for large t. The interpretation of this result is that when the boundary is insulated
(Neumann boundary conditions), then the initial heat g(x) will spread out evenly
and converge to a constant function on D with the same total heat (integral). For
large t, the difference between ut and this constant function is approximately a small
multiple of ẽ1 (x). This shows that this first non-trivial Neumann eigenfunction has
an important physical meaning: the asymptotic shape of non-constant heat in a
conducting plate/body with insulated boundary.
Example 4.2.4 (Wave IBVP on D). Consider the Dirichlet IBVP for the wave
equation.
\[
\begin{cases}
\partial_t^2 u(t,x) = c^2\Delta u(t,x) + f(t,x), & t > 0,\ x \in D,\\
u(0,x) = g(x), & x \in D,\\
\partial_t u(0,x) = h(x), & x \in D,\\
u(t,x) = 0, & t > 0,\ x \in \partial D.
\end{cases}
\]
Again, the more general case of inhomogeneous boundary conditions can be reduced
to this case. We could have also reduced further to the case g = h = 0 by the
methods in Section 1.4, but refrain from this since our method of solution does not
require this.
Consider our wave equation as the vector valued second order ODE
As in Example 4.2.2, we use the Dirichlet eigenbasis {ej (x)} for D to write
\[
u_t(x) = \sum_j \hat u_t(j)\,e_j(x),
\]
\[
f_t(x) = \sum_j \hat f_t(j)\,e_j(x),
\]
By equating the coefficients of the functions on both sides in (4.3), we obtain the
ODEs
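\[
\partial_t^2 \hat u_t(j) = -c^2\lambda_j\,\hat u_t(j) + \hat f_t(j).
\]
Now fix j and regard instead t as variable. From Example 1.1.3 with $\omega = c\lambda_j^{1/2}$ and
a(t) = fˆt (j), we have the solution
\[
\hat u_t(j) = \hat g(j)\cos(c\lambda_j^{1/2} t) + \hat h(j)\,\frac{\sin(c\lambda_j^{1/2} t)}{c\lambda_j^{1/2}} + \int_0^t \hat f_s(j)\,\frac{\sin(c\lambda_j^{1/2}(t-s))}{c\lambda_j^{1/2}}\,ds, \tag{4.4}
\]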
From these and (4.4) we obtain ût (j), which gives the solution
\[
u(t,x) = \sum_j \hat u_t(j)\,e_j(x), \quad t > 0,\ x \in D, \tag{4.5}
\]
to our wave IBVP. Like in Example 4.2.2 and unlike the case in Section 3.2, for a
general bounded domain it is not possible to sum this series solution for u(t, x) and
obtain a formula similar to (3.7).
It is important to note in (4.4) that the coefficients ût (j) do not decay as t grows,
as was the case for the heat equation in Example 4.2.2. (This is related to the fact
that the wave equation does not smooth the functions ut as t grows.) And indeed
the series (4.5) typically converges slowly for the wave equation, which indicates that
eigenfunction expansion is a numerically less efficient tool here than for the heat equation.
Recall that in this latter case, we only needed very few eigenfunctions to get a good
approximation, since the coefficients decay exponentially according to λj .
An important special case deserves special attention: the case when only one
mode of oscillation is present. For simplicity assume that f = h = 0 and that the
initial shape g of the wave is one of the eigenfunctions g = em . This means that
ĝ(j) = 0 for all j except j = m, where ĝ(m) = 1. We obtain in this case the solution
\[
u(t,x) = \hat g(m)\cos(c\lambda_m^{1/2} t)\,e_m(x) = \cos(c\lambda_m^{1/2} t)\,g(x).
\]
This means that if we start with an eigenfunction as the initial shape u(0, x), then
the wave ut will keep this shape, it is only the amplitude of the wave that will change.
We refer to u as a standing wave, with frequency $c\lambda_m^{1/2}$.
Example 4.2.5 (Can you hear the shape of a drum?). This intriguing question was
posed by the mathematician Mark Kac in a 1966 paper in the journal American
Mathematical Monthly. So what did he mean? We have seen that the vibrating
membrane on a drum with shape D ⊂ R2 evolves according to the wave equation,
in the linear approximation. Given D, we can compute the Dirichlet eigenvalues
λ1 , λ2 , . . .. The square roots $\sqrt{\lambda_j}$ are the frequencies of the pure notes that this
drum with shape D can produce. This means that the eigenvalues determine what
the drum sounds like. Kac’s question is an example of an inverse problem: If we
know the sequence λ1 , λ2 , . . ., is it then possible to figure out what the shape of D
must be? Clearly we can translate and rotate D without changing the eigenvalues,
but is the shape of D determined by the pure notes, the sound, that the drum
produces?
Finally Gordon, Webb and Wolpert found a counterexample in 1992: there exist
two drums with different shapes which have exactly the same Dirichlet eigenvalues
(we say that they are isospectral).
det(A − λB) = 0,
xT Bx = y T y,
xT Ax = y T Cy,
\[
R(u) = \frac{y^T C y}{y^T y}.
\]
Since again C is a symmetric matrix, yet another application of the spectral theorem
reveals that the eigenvalue approximations rj are the eigenvalues of C, that is the
root of
det(S −1 T AT −1 S −1 − λI) = 0.
Multiplying from the left by det(T −1 S) and from the right by det(ST ) yields the
stated equation.
Replacing H01 (D) by H 1 (D) above, gives a numerical algorithm for calculating
approximations f˜j (x) and r̃j to the Neumann eigenfunctions ẽj (x) and eigenvalues
λ̃j .
The Rayleigh–Ritz approximation (RRA) provides the following concrete algo-
rithm for computing numerical approximations to the eigenvalues and eigenfunc-
tions.
We assume given the three matrices vertices, triangles and boundary, which
encode a triangulation of the domain D, as in Section 2.4.
Basis functions are indexed by all the vertices listed in vertices when computing
the Neumann eigenfunctions, whereas in the Dirichlet case we only use those
vertices not listed in boundary. See Section 2.4.
The stiffness matrix A is computed as suggested in Section 2.4. During the
same iteration over triangles T , we also compute $\int_T \phi_i\phi_j\,dx$ and add to the
element Bij in the mass matrix B.
Eĝ = ⃗g .
Depending on the evolution PDE, we have a formula like (4.2) or (4.4) available
for the Fourier coefficients ût of the solution at a time t > 0. This allows ût to
be computed from the generalized Fourier coefficients ĝ and the eigenvalues D,
assuming for simplicity that we have no sources f for t > 0. Multiplying by E
will then give the coordinates ⃗ut in the FEM-basis {ϕi }, that is the values of
u(t, x) at the nodes x in the triangulation.
Figures 4.3 and 4.4 have been produced with this algorithm. Note that a perhaps
more straightforward way to solve these evolution problems is to numerically solve
the ODEs for the FEM-coordinates ⃗ut , without diagonalizing and using the analytic
solution for the decoupled generalized Fourier coefficients as we have done here.
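To see the Rayleigh–Ritz idea at work without a triangulation, here is a minimal one-dimensional sketch on D = (0, π), where the exact Dirichlet eigenvalues are 1, 4, 9, . . .. It uses only two polynomial trial functions (an illustrative choice, not the FEM basis of Section 2.4), assembles the 2 × 2 stiffness matrix A and mass matrix B by Simpson's rule, and solves det(A − rB) = 0 as a quadratic equation. All names are assumptions made for this illustration.

```python
import math

def simpson(f, a, b, N=2000):
    """Composite Simpson's rule on [a, b] with N (even) subintervals."""
    h = (b - a) / N
    total = f(a) + f(b)
    for j in range(1, N):
        total += (4.0 if j % 2 else 2.0) * f(a + j * h)
    return total * h / 3.0

# Two trial functions vanishing at x = 0 and x = pi, together with their derivatives.
phi = [lambda x: x * (math.pi - x),
       lambda x: (x * (math.pi - x)) ** 2]
dphi = [lambda x: math.pi - 2.0 * x,
        lambda x: 2.0 * x * (math.pi - x) * (math.pi - 2.0 * x)]

def rayleigh_ritz_2x2():
    """Return the two roots r1 <= r2 of det(A - r B) = 0."""
    A = [[simpson(lambda x: dphi[i](x) * dphi[j](x), 0.0, math.pi) for j in range(2)]
         for i in range(2)]
    B = [[simpson(lambda x: phi[i](x) * phi[j](x), 0.0, math.pi) for j in range(2)]
         for i in range(2)]
    # det(A - r B) = p r^2 - q r + s for symmetric 2 x 2 matrices A and B.
    p = B[0][0] * B[1][1] - B[0][1] ** 2
    q = A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2.0 * A[0][1] * B[0][1]
    s = A[0][0] * A[1][1] - A[0][1] ** 2
    disc = math.sqrt(q * q - 4.0 * p * s)
    return (q - disc) / (2.0 * p), (q + disc) / (2.0 * p)

r1, r2 = rayleigh_ritz_2x2()
print(r1, r2)
```

Both trial functions are symmetric about x = π/2, so the two roots approximate the eigenvalues 1 and 9 of the symmetric modes sin(x) and sin(3x) rather than 1 and 4. Since Rayleigh quotients are minimized over a smaller space, the approximations lie above the exact values.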
4.4 Exercises
1. Does there exist a function f (x) such that f (0) = f (3) = 0 and
\[
\int_0^3 f^2\,dx = 1 = \int_0^3 (f')^2\,dx?
\]
3. Use the eigenfunctions from Example 4.1.1 to solve the one-dimensional IBVP
\[
\begin{cases}
\partial_t u = \partial_x^2 u, & 0 < x < \pi,\ t > 0,\\
u = 1, & t = 0,\\
u = 0, & x = 0 \text{ and } x = \pi.
\end{cases}
\]
5. Assume that u and v are two Dirichlet eigenfunctions, with different eigenval-
ues, on a domain D. Show that u and v are orthogonal in L2 (D).
6. Compute the RRA of the first two Dirichlet eigenvalues for the interval D =
(0, 1), using the basis functions ϕ1 (x) = x − x2 and ϕ2 (x) = x2 − x3 . Compare
with the exact values: which are the smaller?
7. Compute the Rayleigh quotient for ϕ(x, y) = xy(π − x)(π − y) on the square
D = (0, π)2 . Compare with the first eigenvalue (see Example 4.2.1): which is
smallest?
Chapter 5
Harmonic functions
In this chapter, we prove basic Green and Poisson representation formulas for har-
monic functions, the equilibrium functions described by Laplace’s equation. By an-
alyzing these formulas, we learn much about the properties of harmonic functions:
they are completely smooth and have a mean value property. All irregularities
have been averaged away! There are similar results for the heat equation, but not
for the wave equation.
As extra material, we also show how BVPs for the Laplace equation can be solved
by diagonalizing a Cauchy–Riemann vector valued system of ODEs, very similar to
the way we solved the heat and wave IVPs in Chapter 3.
Recommended reading:
Study questions:
What is a Green’s function? How do we compute such with
reflection methods? Why is it symmetric? Poisson formulas
for simple domains? Why are harmonic functions smooth in
the interior of their domain of definition? What do maximum
principles say? What do they mean physically?
Let us now switch our point of view, and view x again as our variable. The identity
writes u(x) as a continuous linear combination of the functions Φ(y−x) and ∂yj Φ(y−
x) with poles y ∈ ∂D. Since Φ(x − y) is C ∞ smooth for x ≠ y, differentiation under
the integral sign shows the following.
Corollary 5.1.3 (Smoothness). Any harmonic function is C ∞ smooth (and even
real analytic) in the interior of its domain of definition.
To gain further understanding of the properties of harmonic functions, the fol-
lowing concept is useful.
Definition 5.1.4 (Green’s function). Consider a domain D ⊂ Rn and a point x ∈
D. A function G(y) = G(y, x) is said to be a Green’s function for D with pole at x
if
G(y, x) = Φ(y − x) + gx (y), where
Example 5.1.6 (Disk). Consider the disk D = {(y1 , y2 ) ∈ R2 ; y12 + y22 < a2 } with
radius a, and fix a point x = (x1 , x2 ) ∈ D. We would like to use a reflection as
in Example 5.1.5 to compute a Green’s function. The suitable reflection of x for
a disk/ball is the inversion x∗ in the boundary. By this we mean the point in the
same direction as x but with
|x∗ |/a = a/|x|,
that is x∗ = a2 x/|x|2 . Set gx (y) = −Φ(y − x∗ ). Then gx is again harmonic in all D.
We check the boundary condition: if |y| = a we have
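\[
\Phi(y-x) + g_x(y) = \frac{1}{4\pi}\left(\ln|y-x|^2 - \ln|y-x^*|^2\right)
= \frac{1}{4\pi}\ln\frac{a^2 + |x|^2 - 2\langle y, x\rangle}{a^2 + a^4/|x|^2 - 2(a^2/|x|^2)\langle y, x\rangle} = \frac{1}{2\pi}\ln(|x|/a).
\]
\[
G(y,x) = \frac{1}{2\pi}\ln\frac{a\,|y-x|}{|x|\,|y-x^*|}.
\]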
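The formula for G on the disk is easy to test numerically. The sketch below evaluates $G(y,x) = \frac{1}{2\pi}\ln\frac{a|y-x|}{|x||y-x^*|}$ with $x^* = a^2x/|x|^2$, and checks that it vanishes for |y| = a and that it is symmetric in its two arguments. The function name and the sample points are illustrative assumptions, and the pole x = 0 is excluded since the inversion is undefined there.

```python
import math

def green_disk(y, x, a=1.0):
    """Green's function of the disk |y| < a with pole at x, for 0 < |x| < a."""
    norm_x_sq = x[0] ** 2 + x[1] ** 2
    x_star = (a * a * x[0] / norm_x_sq, a * a * x[1] / norm_x_sq)  # inversion of x
    dist = math.hypot(y[0] - x[0], y[1] - x[1])
    dist_star = math.hypot(y[0] - x_star[0], y[1] - x_star[1])
    return math.log(a * dist / (math.sqrt(norm_x_sq) * dist_star)) / (2.0 * math.pi)

pole = (0.5, -0.3)
a = 2.0
for j in range(8):
    theta = 2.0 * math.pi * j / 8
    boundary_point = (a * math.cos(theta), a * math.sin(theta))
    print(theta, green_disk(boundary_point, pole, a))   # should be ~0

other = (0.2, 0.1)
print(green_disk(other, pole, a), green_disk(pole, other, a))  # symmetry
```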
The function
P (y, x) = ∂ν G(y, x)
appearing in the solution formula for the Dirichlet problem is called the Poisson
kernel for D, or sometimes the harmonic measure on ∂D with pole at x. For any
domain, it has the following properties which can be verified in the examples above.
Figure 5.2: The Poisson kernel P = ∂ν G as a function of ϕ on the unit circle, for various
poles x in the unit disk.
It has total mass one: $\int_{\partial D} P(y,x)\,dS(y) = 1$ for all x ∈ D. This follows from
This illustrates that the solution to the Dirichlet problem is obtained as weighted
averages of its boundary Dirichlet data, with P as a weight function.
We end this section with a symmetry property of Green’s functions, which is
maybe not that obvious in the examples above.
Proof. Let u(y) = G(y, a) and v(y) = G(y, b) and apply Green’s second identity to
these functions to get
\[
\int_D (u\Delta v - v\Delta u)\,dy = \int_{\partial D} (u\,\partial_\nu v - v\,\partial_\nu u)\,dS = 0,
\]
Figure 5.3: The Green’s function G for the domain D from Figure 1.3, with pole at
x = (2, 1). The neighbourhood near the pole where G < −0.1 is beyond the colormap.
G = 0 on ∂D, with the Poisson kernel P = ∂νy G peaking on ∂D south-east of the pole.
Section 5.1 that a harmonic function is a weighted average of its boundary values. In
particular at the center of a disk, this average is unweighted/uniform. Thus, at any
point, the value of a harmonic function equals the average over any circle centered
at this point. This is perhaps the best way to convey the intuition about harmonic
functions, and this mean value property actually characterizes harmonic functions.
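A quick numerical experiment illustrates this in the plane. The sketch below averages a function over a circle with equally spaced sample points and compares with the value at the center, for the harmonic function $e^x\cos y$ and for the non-harmonic function $x^2 + y^2$ (whose Laplacian is 4). The function names, the center and the radius are illustrative assumptions.

```python
import math

def circle_average(u, center, r, N=720):
    """Average of u over the circle of radius r around center, using N equally spaced points."""
    total = 0.0
    for j in range(N):
        theta = 2.0 * math.pi * j / N
        total += u(center[0] + r * math.cos(theta), center[1] + r * math.sin(theta))
    return total / N

harmonic = lambda x, y: math.exp(x) * math.cos(y)   # Laplacian is 0
not_harmonic = lambda x, y: x * x + y * y           # Laplacian is 4

center, r = (0.3, -0.2), 0.7
print(circle_average(harmonic, center, r), harmonic(*center))
print(circle_average(not_harmonic, center, r), not_harmonic(*center))
```

For the harmonic function the two numbers agree to machine precision, whereas for $x^2 + y^2$ the circle average exceeds the center value by exactly $r^2$.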
Theorem 5.2.1 (Laplace mean value property). Let u be a C 2 function in some
domain D ⊂ Rn . Then u is a harmonic function if and only if
\[
u(x) = \frac{1}{|\partial B(x,r)|}\int_{\partial B(x,r)} u(y)\,dS(y)
\]
which again shows that any harmonic function has the mean value property. But
conversely, assume given a function u which has the mean value property so that
the left hand side is always zero. Assume, to reach a contradiction, that ∆u(x) ≠ 0
at some x ∈ D. Then choose r > 0 small enough so that the sign of ∆u is the same
in all B(x, r). But then
\[
\int_B \left(\Phi(y-x) + 1/(4\pi r)\right)\Delta u(y)\,dy = 0
\]
The mean value property shows that a harmonic function is obtained by aver-
aging some boundary values until equilibrium is reached. In particular, since an
average of some set of values can never be larger than the maximum of these values,
harmonic functions have the following fundamental property.
(S) If there exists x in the interior of D where u(x) = M , then u is the function
which is constant equal to M .
Proof. To prove (S), we use the mean value property, and proceed in two steps.
(1) Assume that there exists an interior point x = x0 in D where u(x0 ) = M .
Consider a ball B(x0 , r) ⊂ D centered at x0 and contained in D. Since u(y) ≤ M
for all y ∈ D, we have by the mean value property that
\[
M = u(x_0) = \frac{1}{|\partial B(x_0,r)|}\int_{\partial B(x_0,r)} u(y)\,dS(y) \le \frac{1}{|\partial B(x_0,r)|}\int_{\partial B(x_0,r)} M\,dS(y) = M.
\]
Therefore the middle inequality must be an equality, which is only possible if u(y) =
M for all y ∈ ∂B(x0 , r).
Let R denote the minimal distance from x0 to ∂D. Repeating the above argument
for 0 < r < R shows that u(x) = M for all x ∈ B(x0 , R).
(2) Next, let x1 ∈ D be arbitrary. Let γ : [0, 1] → D be a curve with γ(0) = x0
and γ(1) = x1 . We showed in (1) that there exists 0 < t0 < 1 such that u(γ(t)) = M
for all 0 ≤ t ≤ t0 . Denote by T the supremum of such t0 . We claim that T = 1,
which will prove that u(x1 ) = M and conclude the proof. To reach a contradiction,
assume that T < 1. Now repeat (1) with x0 replaced by x′ = γ(T ). By continuity
u(x′ ) = M and (1) shows that there exists ϵ > 0 such that u(γ(t)) = M for all
t ∈ (T − ϵ, T + ϵ). This contradicts the definition of T .
Turning to our other two basic PDEs, one can see by simple examples that
nothing like a mean value property or a maximum principle can hold for the wave
equation. See exercises. However, for the heat equation similar results hold as we
shall now prove.
Theorem 5.2.3 (Heat maximum principle). Let D ⊂ Rn be a bounded and connected
domain and consider the cylinder Ω = D × (0, T ), where 0 < T < ∞. Assume
that u = u(x, t) solves the heat equation ∂t u = k∆u in Ω, and that u is continuous
on the closure Ω̄ = D̄ × [0, T ]. Consider the maximum M of u over Ω̄, which exists by our
assumptions.
(W) The maximum M is attained either among the initial data at t = 0 or among
the boundary data at x ∈ ∂D.
(S) If there exists an interior point x0 ∈ D and 0 < t0 ≤ T where u(x0 , t0 ) = M ,
then u is constant equal to M for all t ≤ t0 .
Figure 5.5: (a) Maximum among initial data, due to injection of heat at t = 0. (b)
Maximum among boundary data, due to injection of heat at ∂D for some t > 0.
Again, (W) is seen to follow from (S). What distinguishes the heat maximum principle
from the Laplace one is that time has a direction, since the heat equation is
not reversible, as we saw in Section 3.3. For this reason the top of the cylinder
goes together with the interior in the formulation of the maximum principle. It is
worthwhile to contemplate the physics behind this!
We shall prove (S) from a mean value property for the heat equation below, but
since this is now more technical, we first give a simpler and direct proof of (W). The
main idea of the latter proof is contained in the following lemma.
Lemma 5.2.4 (Heat max with sinks). Consider a solution v(x, t) to the inhomo-
geneous heat equation ∂t v = k∆v + f in the cylinder Ω, with sinks f < 0. As
in Theorem 5.2.3, we assume that v is continuous on the closure Ω̄ of the bounded and connected
cylinder Ω. Then v attains its maximum K over Ω̄ at, and only at, t = 0 or
x ∈ ∂D.
Proof. If K is attained at an interior point (x, t), then this must be a stationary
point so that ∂t v = 0 and ∇x v = 0. Moreover the Hessian of v with respect to x must be
negative semidefinite at this point, from which it follows that ∆v ≤ 0. It follows from
the PDE that
0 = ∂t v = k∆v + f ≤ f,
which contradicts our assumption f < 0.
If K is attained at (x, T ) at the top of the cylinder, we can still conclude that
∆v ≤ 0 along the top t = T , but for ∂t v we only have access to a one-sided derivative
since v is not defined for t > T . Nevertheless ∂t v ≥ 0, since v ≤ K for t < T , and
the PDE then gives 0 ≤ ∂t v = k∆v + f ≤ f , which again contradicts our assumption f < 0.
To obtain a mean value formula for the heat equation, the key observation is
that in the Laplace mean value formula, the circle/sphere on which the mean value
is calculated, is a level set of the Laplace fundamental solution Φ(y − x), since this
is a radial function. Similarly, the heat mean value formula will use a level set of
the heat kernel. To simplify, we assume k = 1 for the remainder of this section. (If
k ̸= 1, a simple rescaling will reduce to the case k = 1.)
Lemma 5.2.5 (Heat kernel= fundamental solution). Consider Ht (x) from Defini-
tion 3.2.2, with k = 1, as a function of (x, t) ∈ Rn × R, where Ht (x) = 0 when
t < 0. Then we have the weak derivative
∂t Ht − ∆x Ht = δ(0,0) .
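Proof. By calculating the classical derivative, we see that ∂t Ht − ∆Ht = 0 when
t > 0, and clearly this holds also for t < 0. Around a point (x, 0) with x ≠ 0,
one can show that Ht (x) is a C ∞ function of (x, t) with all derivatives vanishing at
t = 0, since the exponential function decays very fast as t → 0. It remains to see what
happens at the singularity at (0, 0). To this end, we fix a test function ϕ(x, t) and
compute
\[
\begin{aligned}
\langle \partial_t H_t - \Delta H_t, \phi \rangle
&= \langle H_t, -\partial_t \phi - (-1)^2 \Delta \phi \rangle
= \lim_{\epsilon \to 0^+} \iint_{t>\epsilon} H_t(x)\,(-\partial_t \phi - \Delta \phi)\, dx\, dt \\
&= \lim_{\epsilon \to 0^+} \Big( \int_{\mathbf{R}^n} H_\epsilon(x) \phi(x,\epsilon)\, dx
+ \iint_{t>\epsilon} (\partial_t H_t(x) - \Delta H_t(x))\, \phi(x,t)\, dx\, dt \Big) \\
&= \lim_{\epsilon \to 0^+} \int_{\mathbf{R}^n} (4\pi\epsilon)^{-n/2} e^{-|z|^2} \phi(2\sqrt{\epsilon} z, \epsilon)\, (2\sqrt{\epsilon})^n\, dz
= \frac{\phi(0,0)}{\pi^{n/2}} \int_{\mathbf{R}^n} e^{-|z|^2}\, dz = \langle \delta_{(0,0)}, \phi \rangle,
\end{aligned}
\]
where we have integrated by parts in t (and in x), used that the double integral in the
second line vanishes since ∂t Ht − ∆Ht = 0 for t > ϵ, and made the change of variables x = 2√ϵ z.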
Lemma 5.2.6 (Parabolic Green). Let D be a bounded smooth domain in spacetime
Rn+1 , and let f and g be sufficiently smooth on D. Write the normal outward unit
vector on ∂D as ν = (νx , νt ) where νx ∈ Rn and νt ∈ R. Then
\[ \int_{\partial D} (f \partial_{\nu_x} g - g \partial_{\nu_x} f + \nu_t f g)\, dS(x,t) = \int_D \big( f(\Delta_x g + \partial_t g) - g(\Delta_x f - \partial_t f) \big)\, dx\, dt, \]
Proof. Similar to Section 0.3, this integral identity follows by applying the divergence
theorem to the spacetime vector field
F = f ∇x g − g∇x f + et f g,
where et denotes the unit vector in the time direction.
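To make the appeal to the divergence theorem concrete, the spacetime divergence of F can be expanded term by term. The following is a routine check using only the product rule; the symbol div_{x,t}, for the divergence in all n + 1 variables, is notation introduced here for this sketch.

```latex
\begin{aligned}
\operatorname{div}_{x,t} F
&= \nabla_x \cdot (f \nabla_x g) - \nabla_x \cdot (g \nabla_x f) + \partial_t (f g) \\
&= f \Delta_x g - g \Delta_x f + f \partial_t g + g \partial_t f \\
&= f (\Delta_x g + \partial_t g) - g (\Delta_x f - \partial_t f),
\end{aligned}
\qquad
F \cdot \nu = f\, \partial_{\nu_x} g - g\, \partial_{\nu_x} f + \nu_t f g .
```

Integrating the first identity over D and the second over ∂D gives the two sides of the lemma.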
Theorem 5.2.7 (Heat mean value property). Assume that u(x, t) solves the heat
equation ∂t u = ∆u in a neighbourhood of (x0 , t0 ) ∈ Rn+1 . Define the parabolic ball
at the origin with diameter r to be
BP (r) = {(x, t) ; 0 < t < r, |x|2 < 2nt ln(r/t)} = {(x, t) ; Ht (x) > (4πr)−n/2 }.
Figure 5.6: The heat kernel Ht (x), n = 1, with level sets defining the parabolic balls
BP (r).
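The two descriptions of the parabolic ball BP (r) can be compared numerically. The following Matlab lines are a minimal sketch, assuming n = 1, k = 1 and the sample value r = 2 (the grid sizes are arbitrary choices made here); they evaluate both conditions on a grid and count the grid points where they disagree.

```matlab
% Sketch: compare the two descriptions of the parabolic ball B_P(r) for n = 1.
n = 1;  r = 2;                                   % sample values (assumptions)
H = @(x, t) (4*pi*t).^(-n/2) .* exp(-x.^2 ./ (4*t));   % heat kernel with k = 1
[x, t] = meshgrid(linspace(-3, 3, 241), linspace(0.005, 2.995, 300));
inBall1 = (t < r) & (x.^2 < 2*n*t .* log(r ./ t));     % first description
inBall2 = H(x, t) > (4*pi*r)^(-n/2);                   % level set description
mismatch = nnz(inBall1 ~= inBall2);                    % grid points that disagree
fprintf('grid points in ball: %d, mismatches: %d\n', nnz(inBall1), mismatch);
```

Up to rounding at grid points that happen to lie extremely close to the boundary, the two logical arrays should coincide.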
Proof. We apply Lemma 5.2.6, with Ht (x) as f (x, t) and u(x0 − x, t0 − t) as g(x, t),
and with D = BP (r). Then g solves the backward heat equation ∆x g + ∂t g = 0
and Lemma 5.2.5 shows that ∂t f − ∆x f = δ(0,0) . Moreover, since f is constant on
∂BP (r), we have
\[
\int_{\partial B_P(r)} (f \partial_{\nu_x} g + \nu_t f g)\, dS(x,t)
= f \int_{\partial B_P(r)} (\partial_{\nu_x} g + \nu_t g)\, dS(x,t)
= f \iint_{B_P(r)} (\Delta_x g + \partial_t g)\, dx\, dt = 0.
\]
For the boundary term g∂νx f in the parabolic Green identity, computations reveal
that
Proof of Theorem 5.2.3(S). This follows from the heat mean value property, in a
way similar to how we derived the strong maximum principle from the mean value
property for harmonic functions.
Assume that u(x0 , t0 ) = M for some interior x0 ∈ D and 0 < t0 ≤ T . Write
Kr (x, t) for the integrand in (5.4), so that
\[ u(x_0,t_0) = \int_{\partial B_P(r)} K_r(x,t)\, u(x_0 - x, t_0 - t)\, dS(x,t). \]
Note that this also applies to the function u = 1, since this also solves the heat
equation. This shows that ∫_{∂BP (r)} Kr (x, t)dS(x, t) = 1, and clearly Kr (x, t) ≥ 0,
so u(x0 , t0 ) is a weighted mean value of the values u(x0 − x, t0 − t), (x, t) ∈ ∂BP (r). As in the proof of
Theorem 5.2.2(S), we can conclude that u(x0 − x, t0 − t) = M for all (x, t) ∈ BP (R), where
R is the supremum of r such that the reflected and translated ball (x0 , t0 ) − BP (r) is contained in Ω.
Finally, if (x1 , t1 ) ∈ Ω is any point such that t1 < t0 , then we can find a curve
γ : [0, 1] → Ω such that γ(0) = (x0 , t0 ) and γ(1) = (x1 , t1 ), which has downward/past
pointing tangent vector γ ′ . Following the proof of Theorem 5.2.2(S), we can prove
that u(x1 , t1 ) = M . Note that the condition on γ ′ is needed since the mean values
of u are taken over a set below the point.
∂x (−v2 ) − ∂y v1 = 0,
that is (v1 , −v2 ) is a curl-free vector field. By vector calculus, there exists a scalar
potential u so that (v1 , −v2 ) = ∇u. From the first CR equation we conclude
We now regard the height y as our evolution variable, playing the role of time t in
Section 3.2. Writing
vy (x) = v(x, y), x ∈ R,
for the function v at height y, the Cauchy–Riemann equations are equivalent to the
vector-valued ODE
∂y vy = i∂x vy ,
for a function R → L2 (R).
To solve this, as in Section 3.2 we apply to each fixed y the Fourier transform in
the x-variable, and obtain
∂y v̂y (ξ) = −ξv̂y (ξ).
Now fix ξ and regard y as variable. The solution of the ODE is clearly
Definition 5.3.3 (Hardy subspaces). Consider the function space L2 (R) = L2 (R; C)
of square integrable complex-valued functions f : R → C. Define the upper Hardy
subspace to be
\[ L_2^+(\mathbf{R}) := \{ f \in L_2(\mathbf{R}) \,;\, \hat f(\xi) = 0 \text{ for all } \xi < 0 \}, \]
and the lower Hardy subspace to be
\[ L_2^-(\mathbf{R}) := \{ f \in L_2(\mathbf{R}) \,;\, \hat f(\xi) = 0 \text{ for all } \xi > 0 \}. \]
Proposition 5.3.4 (Cauchy and Hilbert kernels). For y ̸= 0, define the functions
\[ C_y(x) := \frac{1}{2\pi} \frac{1}{y - ix}, \quad x \in \mathbf{R}. \]
of our function space L2 (R) into the two subspaces L2⁺(R) and L2⁻(R) from Definition 5.3.3.
If v0 ∈ L2⁺(R), or equivalently when H ∗ v0 = v0 , then the CR IVP is solvable
for y > 0, with solution
\[ v(x,y) = \frac{1}{2\pi i} \int_{\mathbf{R}} \frac{v(z,0)}{z - (x+iy)}\, dz, \quad x \in \mathbf{R},\ y > 0, \]
decaying as y → +∞.
If v0 ∈ L2⁻(R), or equivalently when H ∗ v0 = −v0 , then the CR IVP is solvable
for y < 0, with solution
\[ v(x,y) = -\frac{1}{2\pi i} \int_{\mathbf{R}} \frac{v(z,0)}{z - (x+iy)}\, dz, \quad x \in \mathbf{R},\ y < 0, \]
decaying as y → −∞.
Readers with knowledge of complex analysis will recognize these formulas as the
Cauchy integral formulas for analytic functions.
A comparison with the heat and wave equations is in order.
The IVP for the heat equation is solvable forward in time t > 0, but not
backwards in time.
The IVP for the wave equation is solvable both forward and backward in time.
Theorem 5.3.5 shows that the IVP for the Cauchy–Riemann equations/Laplace
equation is solvable upwards in space for initial data in L2⁺(R) and downward
in space for initial data in L2⁻(R).
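The splitting into Hardy subspaces and the decay upwards in space can be illustrated with the discrete Fourier transform. The following Matlab lines are only a sketch under stated assumptions: the real line is replaced by a periodic grid on [0, 2π), the sample data v0 is an arbitrary choice, and the zero frequency is assigned to the upper part. Matlab's fft uses the kernel e^{−ikx}, so that ∂x corresponds to multiplication by ik, and the Fourier multiplier e^{−yξ} is the solution of the ODE ∂y v̂y (ξ) = −ξ v̂y (ξ).

```matlab
% Sketch: discrete (periodic) analogue of the Hardy splitting of initial data.
N = 256;
x = 2*pi*(0:N-1)/N;
k = [0:N/2-1, -N/2:-1];                 % integer frequencies in fft ordering
v0 = exp(cos(x)) + 1i*sin(3*x);         % arbitrary periodic sample data
vhat = fft(v0);
vPlus  = ifft(vhat .* (k >= 0));        % part with frequencies k >= 0
vMinus = ifft(vhat .* (k < 0));         % part with frequencies k < 0
y = 0.5;                                % height above the boundary
vUp = ifft(exp(-y*k) .* vhat .* (k >= 0));   % multiplier exp(-y*k) on the upper part
splitErr = norm(vPlus + vMinus - v0);   % the two parts add up to v0
fprintf('split error %.2e, norm at height y: %.4f, at height 0: %.4f\n', ...
        splitErr, norm(vUp), norm(vPlus));
```

For the lower part the same multiplier would grow exponentially for y > 0, which is the discrete counterpart of the statement that such data can only be continued downwards.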
Returning to the Laplace equation, we show next how to solve the Dirichlet and
Neumann BVPs using Theorem 5.3.5. We need the following observation.
Lemma 5.3.6. A function f : R → C is real-valued if and only if its Fourier
transform fˆ is conjugate-symmetric, that is
\[ \hat f(-\xi) = \overline{\hat f(\xi)}, \quad \xi \in \mathbf{R}. \]
Example 5.3.7 (Neumann problem on the upper half plane). We wish to solve the
Neumann problem
\[
\begin{cases}
\Delta u(x,y) = 0, & y > 0,\ x \in \mathbf{R}, \\
\partial_y u(x,0) = g(x), & x \in \mathbf{R}.
\end{cases}
\]
We assume that the Neumann data g ∈ L2 (R). For the solution, consider the
conjugate gradient vector field v = (v1 , v2 ) = (∂x u, −∂y u) as above. Since we want
this to solve the CR equations for y > 0, we look for v0 ∈ L2⁺(R). To match the
boundary data, we want this to have the imaginary part
Im v(x, 0) = −g(x), x ∈ R.
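There exists a unique such f : for ξ < 0 we must have f̂(ξ) = iĝ(ξ), whereas for ξ > 0,
since f and g are real-valued functions, Lemma 5.3.6 gives
\[ \hat f(\xi) = \overline{\hat f(-\xi)} = \overline{i \hat g(-\xi)} = -i \hat g(\xi). \]
We obtain the initial data
\[
\hat v_0(\xi) =
\begin{cases}
-2i \hat g(\xi), & \xi > 0, \\
0, & \xi < 0,
\end{cases}
\]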
and Theorem 5.3.5 gives the solution u(x, y) to the Neumann problem with conjugate
gradient
\[ v(x,y) = C_y * v_0(x) = \frac{1}{\pi} \int_{\mathbf{R}} \frac{g(z)}{(x+iy) - z}\, dz. \]
Recall that the solution u to the Neumann problem is unique only up to constants.
Taking real and imaginary parts, we obtain formulas for the partial derivatives
\[ \partial_x u(x,y) = \frac{1}{\pi} \int_{\mathbf{R}} g(z) \frac{x - z}{(x-z)^2 + y^2}\, dz, \]
\[ \partial_y u(x,y) = \frac{1}{\pi} \int_{\mathbf{R}} g(z) \frac{y}{(x-z)^2 + y^2}\, dz, \]
where we recognize the second one to be the Poisson integral for the half plane
applied to the boundary data g of ∂y u.
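The second formula can be sanity-checked numerically for data whose Poisson integral is known in closed form. The sketch below assumes the sample data g(z) = 1/(1 + z²) and the sample point (x, y) = (0.7, 0.5), both chosen here for illustration. Since g is π times the Poisson kernel at height 1, and Poisson kernels at heights 1 and y convolve to the kernel at height 1 + y, the exact value is (1 + y)/(x² + (1 + y)²).

```matlab
% Sketch: check the Poisson integral formula for the sample data g(z) = 1/(1+z^2).
g = @(z) 1 ./ (1 + z.^2);                    % sample Neumann data (assumption)
x = 0.7;  y = 0.5;                           % sample evaluation point (assumption)
integrand = @(z) (1/pi) * g(z) .* y ./ ((x - z).^2 + y^2);
dyuNum   = integral(integrand, -Inf, Inf);   % numerical Poisson integral
dyuExact = (1 + y) / (x^2 + (1 + y)^2);      % closed form for this particular g
err = abs(dyuNum - dyuExact);
fprintf('numerical %.8f, exact %.8f\n', dyuNum, dyuExact);
```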
5.4 Exercises
1. Show that for the half-plane D = {(y1 , y2 ) ∈ R2 ; y2 > 0}, the solution to the
Dirichlet problem is given by
\[ u(x_1,x_2) = \frac{x_2}{\pi} \int_{\mathbf{R}} \frac{u(y_1,0)\, dy_1}{(y_1 - x_1)^2 + x_2^2}. \]
2. Show that for a ball D = {(y1 , y2 , y3 ) ∈ R3 ; y1² + y2² + y3² < a²}, the solution
to the Dirichlet problem is given by
\[ u(x_1,x_2,x_3) = \frac{a^2 - |x|^2}{4\pi a} \int_{|y|=a} \frac{u(y)\, dS(y)}{|y - x|^3}. \]
3. Show that for a given domain D, there is at most one Green’s function.
5. Find the Green’s function for the one-dimensional interval D = (0, L).
6. Consider the solution formula (5.2) for the Dirichlet problem on the half-space.
Assume that u(y1 , y2 , 0) = 0 when y1² + y2² > R² for some R < ∞. Show that
u → 0 as x → ∞.
7. Consider the solution formula (5.2) for the Dirichlet problem on the half-space.
Assume that g(y1 , y2 ) = u(y1 , y2 , 0) is a continuous and bounded function.
Show that limx3 →0 u(x1 , x2 , x3 ) = g(x1 , x2 ).
8. The definition of a Neumann function N (y, x) = Φ(y − x) + nx (y) for a domain
D is similar to Definition 5.1.4, but replacing the boundary condition by ∂ν N =
c on ∂D. Show that the constant c must be chosen as 1/(area of ∂D) for N to exist.
Use Green’s formulas to express the solution to the Neumann problem in terms
of N .
9. The Kelvin inversion formula for harmonic functions states that if ∆u = 0 in
a neighbourhood of x = a, then
\[ v(x) = \frac{1}{|x|^{n-2}}\, u(x/|x|^2) \]
satisfies ∆v = 0 in a neighbourhood of x = a/|a|². Prove this for n = 3.
Hint: A direct calculation is possible but not pleasant. Use Green’s formula
to reduce to the case when u(x) = Φ(x − p) for some p ∈ Rn .
10. Let u be the harmonic function on the disk x² + y² < 4 with Dirichlet data
u = 3 sin(2θ) + 1. Without computing u, find the maximum value of u on the
closed disk, and the value u(0, 0).
11. Consider the solution u(t, x) = 1 − x² − 2kt to the heat equation. Find its max
and min on 0 ≤ x ≤ 1, 0 ≤ t ≤ T .
12. Consider the heat IBVP
\[
\begin{cases}
\partial_t u = \partial_x^2 u, & 0 < x < 1,\ t > 0, \\
u = 0, & x = 0 \text{ and } x = 1, \\
u = 4x(1-x), & t = 0.
\end{cases}
\]
(a) Show that 0 < u < 1 for all 0 < x < 1, t > 0.
(b) Show that u(t, 1 − x) = u(t, x) for all 0 < x < 1, t > 0.
13. Prove that if u and v solve the heat equation and u ≤ v at t = 0, at x = 0 and
at x = L, then u ≤ v for all t > 0, 0 < x < L.
14. Find an example which shows that there is no maximum principle for the wave
equation.
15. Assume that ∂t u = ∂x² u for 0 < x < a, 0 < t < T , and that ∂x u = 0 for x = 0,
0 < t < T . Show that u attains a maximum either at t = 0 or x = a. What is
the physics? Hint: Make an even reflection at x = 0.
similar to what was done in Example 5.3.7 for the Neumann problem. What
is the relation to the Dirichlet BVP for the Laplace equation on the upper half
plane?
Chapter 6
Chapter 7
In this course, we have studied essentially only three PDEs. The reason is that a
general PDE, to some extent, can be reduced to one of the basic ones: the heat,
wave and Laplace PDE, as we see in Section 7.1. A general PDE may be different
in many ways: it may be of higher order, it may be a system involving several
PDEs and unknown functions, it may be non-linear or it may have variable, even
non-smooth, coefficients. Linear PDE problems in smooth domains with constant
coefficients, like k and c² appearing in the heat and wave equations, the topic of
a first PDE course like this, have been well understood for more than a century.
Modern PDE research often concerns non-linear PDEs or linear PDE problems
with non-smooth coefficients, or in domains with non-smooth boundaries ∂D. Other
important modern problems include free boundary problems, where the domain D
itself is unknown and evolves with the PDE, and inverse problems, where one knows
the solutions to the PDE problem for many data and seeks the coefficients or the
domain.
Reading:
Strauss 1.6
Recommended reading:
In this course we have mainly considered scalar PDEs, that is, the unknown
function u(x1 , . . . , xn ) is scalar-valued. As a consequence the basic PDEs we have
studied are of second order. Many important PDEs, in particular those appearing in
physics, are however not scalar, that is, there is more than one unknown function
appearing in the PDE. In this case it is often first order PDE systems that are the
fundamental ones, and one important class of such systems is the following.
Definition 7.2.1 (Dirac type PDE system). A first order linear system of PDEs is
said to be of elliptic Dirac type if the matrix with the first order derivatives squares
to a diagonal matrix with diagonal elements all being the Laplace operator ∆.
A first order linear system of PDEs is said to be of hyperbolic Dirac type, with
propagation speed c, if the matrix with the first order derivatives squares to a diagonal
matrix with diagonal elements all being the d’Alembert operator c⁻² ∂t² − ∆.
Since both the first order derivative ∂t and the second order derivatives ∆ appear
in the heat equation, we do not normally consider any parabolic Dirac type PDE
systems.
This concept is best understood by examples. The first one we have already
encountered in Section 5.3.
Example 7.2.2 (Cauchy–Riemann= elliptic Dirac). The Cauchy–Riemann equa-
tions from Definition 5.3.1, written in matrix form, read
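\[
\begin{bmatrix} -\partial_x & \partial_y \\ \partial_y & \partial_x \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]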
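That this system is of elliptic Dirac type in the sense of Definition 7.2.1 can be checked by squaring the matrix of first order derivatives. The short computation below is a sketch that only uses that partial derivatives commute.

```latex
\begin{bmatrix} -\partial_x & \partial_y \\ \partial_y & \partial_x \end{bmatrix}^2
=
\begin{bmatrix}
\partial_x^2 + \partial_y^2 & -\partial_x \partial_y + \partial_y \partial_x \\
-\partial_y \partial_x + \partial_x \partial_y & \partial_y^2 + \partial_x^2
\end{bmatrix}
=
\begin{bmatrix} \Delta & 0 \\ 0 & \Delta \end{bmatrix}.
```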
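\begin{align}
\nabla \cdot E &= \rho, \tag{7.1} \\
c^{-1} \partial_t E - \nabla \times B &= -J, \tag{7.2} \\
c^{-1} \partial_t B + \nabla \times E &= 0, \tag{7.3} \\
\nabla \cdot B &= 0, \tag{7.4}
\end{align}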
∇f = gradf,
∇ · F = divF,
∇ × F = curlF.
We now observe that Maxwell’s equations essentially are a hyperbolic Dirac type
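system. For this we use the identity
\[ \operatorname{curl} \operatorname{curl} F = \operatorname{grad} \operatorname{div} F - \Delta F \tag{7.5} \]
from vector calculus, where ∆ acts on each component function of the vector field
F . In absence of sources ρ = J = 0, using the Gauss equations divE = divB = 0
we obtain that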
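\[
\begin{bmatrix} -c^{-1}\partial_t & \operatorname{curl} \\ \operatorname{curl} & c^{-1}\partial_t \end{bmatrix}
\begin{bmatrix} -c^{-1}\partial_t & \operatorname{curl} \\ \operatorname{curl} & c^{-1}\partial_t \end{bmatrix}
\begin{bmatrix} E \\ B \end{bmatrix}
=
\begin{bmatrix} c^{-2}\partial_t^2 - \Delta & 0 \\ 0 & c^{-2}\partial_t^2 - \Delta \end{bmatrix}
\begin{bmatrix} E \\ B \end{bmatrix}.
\]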
Therefore, at least for divergence free vector fields E and B, Maxwell’s equations are
a hyperbolic Dirac type system.
To avoid the problem with the divergence free constraints, another approach which
we shall use is to add two auxiliary scalar functions f and g and consider the 8 × 8
system of PDEs
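\[
\begin{bmatrix}
c^{-1}\partial_t & \operatorname{div} & 0 & 0 \\
-\operatorname{grad} & -c^{-1}\partial_t & \operatorname{curl} & 0 \\
0 & \operatorname{curl} & c^{-1}\partial_t & \operatorname{grad} \\
0 & 0 & -\operatorname{div} & -c^{-1}\partial_t
\end{bmatrix}
\begin{bmatrix} f \\ E \\ B \\ g \end{bmatrix}
=
\begin{bmatrix} \rho \\ J \\ 0 \\ 0 \end{bmatrix}.
\tag{7.6}
\]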
Maxwell’s equations are the special case when f = g = 0, in which case the first and
last equations are the Gauss equations. Using (7.5), we see that this is indeed a
hyperbolic 8 × 8 Dirac system of PDEs.
Elliptic Dirac systems share many properties with the Laplace equation, and
hyperbolic Dirac systems share many properties with the wave equation. As an
example of this, we solve the Maxwell IVP on R3 by adapting Example 3.2.3.
Example 7.2.4 (Maxwell IVP on R3 ). We aim to solve the IVP on R3 for the
hyperbolic 8 × 8 Dirac system (7.6), where ρ(t, x) and J(t, x) are prescribed sources
that we assume satisfy the continuity equation c−1 ∂t ρ + ∇ · J = 0, assuming initial
conditions
\[
\begin{cases}
f(0,x) = 0, & x \in \mathbf{R}^3, \\
E(0,x) = E_0(x), & x \in \mathbf{R}^3, \\
B(0,x) = B_0(x), & x \in \mathbf{R}^3, \\
g(0,x) = 0, & x \in \mathbf{R}^3,
\end{cases}
\]
where E0 and B0 are given vector fields with divE0 (x) = ρ(0, x) and divB0 = 0.
We start by writing (7.6) as the vector valued ODE
c−1 ∂t Ft + M (∇)Ft = jt ,
where
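\[
M(\nabla) =
\begin{bmatrix}
0 & \operatorname{div} & 0 & 0 \\
\operatorname{grad} & 0 & -\operatorname{curl} & 0 \\
0 & \operatorname{curl} & 0 & \operatorname{grad} \\
0 & 0 & \operatorname{div} & 0
\end{bmatrix}
\]
and
\[
F_t = \begin{bmatrix} f_t \\ E_t \\ B_t \\ g_t \end{bmatrix},
\qquad
j_t = \begin{bmatrix} \rho_t \\ -J_t \\ 0 \\ 0 \end{bmatrix}.
\]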
Following our usual strategy we apply, for each t > 0, the Fourier transform and
obtain
c−1 ∂t F̂t + iM (ξ)F̂t = ĵt , (7.7)
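where
\[
M(\xi)
\begin{bmatrix} \hat f_t \\ \hat E_t \\ \hat B_t \\ \hat g_t \end{bmatrix}
=
\begin{bmatrix}
\xi \cdot \hat E_t \\
\xi \hat f_t - \xi \times \hat B_t \\
\xi \times \hat E_t + \xi \hat g_t \\
\xi \cdot \hat B_t
\end{bmatrix}.
\]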
For each fixed ξ ∈ R3 , we now solve the ODE (7.7) in the variable t. Unlike the
case of a scalar equation, we note that the Fourier transform does not completely
diagonalize our Dirac system of PDEs, but has transformed it into 8 × 8 systems
of ODEs. These can be further diagonalized as in Example 1.1.2 to scalar ODEs.
However, the Dirac structure provides a way to simplify this calculation as follows.
We note that M (∇) is an elliptic Dirac system and that the square of the matrix
M (ξ) is |ξ|² times the identity matrix. A matrix-valued integrating factor for (7.7)
is the matrix
\[ e^{icM(\xi)t} := \cos(c|\xi|t)\, I + \frac{\sin(c|\xi|t)}{c|\xi|}\, icM(\xi). \]
(Just like the classical formula e^{it} = cos(t) + i sin(t), this is justified by Taylor series
expansion, using that (icM (ξ))² = −(c|ξ|)² I.) Multiplying (7.7) by this matrix and
c yields
\[ \partial_s \big( e^{icM(\xi)s} \hat F_s \big) = c\, e^{icM(\xi)s} \hat j_s, \]
where we have changed the dummy variable from t to s. Integration over 0 < s < t
and multiplication by e^{−icM (ξ)t} gives
\[ \hat F_t = e^{-icM(\xi)t} \hat F_0 + c \int_0^t e^{-icM(\xi)(t-s)} \hat j_s\, ds. \tag{7.8} \]
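The two algebraic facts used in this derivation, that M (ξ)² equals |ξ|² times the identity and that the matrix exponential has the stated closed form, can be checked numerically. The following Matlab lines are a sketch for one sample frequency ξ and sample values of c and t (all three are arbitrary choices made here); the 8 × 8 matrix acts on vectors ordered as [f; E; B; g].

```matlab
% Sketch: numerical check of M(xi)^2 = |xi|^2 I and of the closed form of exp(i c M t).
xi = [0.3; -1.2; 0.7];                  % sample frequency (assumption)
c = 2;  t = 0.4;                        % sample speed and time (assumptions)
X = [0, -xi(3), xi(2); xi(3), 0, -xi(1); -xi(2), xi(1), 0];   % X*E equals cross(xi, E)
M = [0,          xi.',       zeros(1,3), 0;
     xi,         zeros(3),   -X,         zeros(3,1);
     zeros(3,1), X,          zeros(3),   xi;
     0,          zeros(1,3), xi.',       0];
errSquare = norm(M*M - norm(xi)^2 * eye(8));          % should be at rounding level
w = c * norm(xi);
Eclosed = cos(w*t) * eye(8) + 1i*c*M * sin(w*t) / w;  % closed form from the text
errExp = norm(expm(1i*c*M*t) - Eclosed);              % compare with Matlab's expm
fprintf('square error %.2e, exponential error %.2e\n', errSquare, errExp);
```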
Finally we apply the inverse Fourier transform to this solution formula. The simplest
way to do this is to observe that the inverse transform of
\[ e^{-ictM(\xi)} \hat F := \cos(c|\xi|t)\, \hat F - icM(\xi)\, \frac{\sin(c|\xi|t)}{c|\xi|}\, \hat F \]
is
\[ e^{-ctM(\nabla)} F(x) := (\partial_t - cM(\nabla))\, \frac{1}{4\pi c^2 t} \iint_{|y-x|=ct} F(y)\, dS(y), \tag{7.9} \]
Recommended reading:
Strauss 14.3.
Recommended reading:
Strauss 13.2.
7.6 Exercises
1. What is the type of each of the following PDEs?
3. Why is one IC, specifying u(0, x), needed for the heat IVP to be well posed?
Why are two ICs, specifying u(0, x) and ∂t u(0, x), needed for the wave IVP
to be well posed? Why is one BC, specifying either u|∂D or ∂ν u|∂D but not
both, needed for the Laplace BVP to be well posed? Hint: Example 5.3.7 is
relevant.
4. Write out the second and third component of the solution formula (7.10) and
obtain a formula for Et and Bt , given E0 , B0 , ρs and Js , for 0 < s < t.
(a) Use vector calculus to deduce that there exists a vector potential A and
a scalar potential u such that ∇ × A = B and −∇u = E + c−1 ∂t A.
(b) Show that A and u are not uniquely determined by E and B. Indeed,
show that given any function λ, a gauge, we can equally well use the
vector potential A + ∇λ together with the scalar potential u − c−1 ∂t λ.
(c) Show that there exists a gauge λ such that the new A and u satisfy
c−1 ∂t u + ∇ · A = 0.
(d) Write D for the system of partial differential operators defined by the
matrix in (7.6). Show that D[u, A, 0, 0]T = [0, E, B, 0] in the gauge above.
Conclude that each component of u, A, E and B solves the wave equation
with a source. Which are the source terms?
6. Solve the Euler-Lagrange equation for the minimizer of ∫_a^b √(1 + (u′(x))²) dx. Deduce
that the shortest path between two points in the plane is a straight line.
7. Find the minimal surfaces of revolution. Hints: The area is ∫_a^b 2πu √(1 + (u′(x))²) dx.
The primitive u(x)/√(1 + (u′(x))²) may be useful when integrating the ODE.
Appendix A
Answers to Exercises
Problem session 1:
1.7 With u = u0 + u1 and u0 given as explained, the equivalent PDE problem for
u1 is ∂t u1 = k∆u1 , u1 = 0 at t = 0, and u1 = h̃ on ∂D, where h̃ = h − u0 .
Problem session 2:
2.1 f′ = 2δ1 − 4δ2 + g, where
\[
g(x) =
\begin{cases}
2x, & x < 1, \\
2x + 2, & 1 < x < 2, \\
2, & x > 2
\end{cases}
\]
is the classical derivative. g does not exist at x = 1 and at x = 2.
2.2 ∂x f is the distribution whose ϕ-average is ∫_{−1}^{1} (ϕ(−1, y) − ϕ(1, y))dy. Geometrically:
a Dirac wall along {−1}×(−1, 1) minus a Dirac wall along {1}×(−1, 1).
2.4 Need to show ∬ g(x − ct)ϕ′′tt (x, t)dxdt = c² ∬ g(x − ct)ϕ′′xx (x, t)dxdt for all
test functions ϕ(x, t).
2.5 Need to show ∬_{y>0} u(x, y)∆ϕ(x, y)dxdy = ∬_{y<0} u(x, −y)∆ϕ(x, y)dxdy for all
test functions ϕ(x, y) on the disk.
2.6 (a) α > 1/2 (b) α > 0 (c) α > −1/2 (For a complete solution you should show
that the weak derivative does not contain any Dirac delta at the origin.)
2.8 D = {x ∈ R2 ; |x| < 1}, f1 (x) = 0, f2 (x) = 7e^{k(r−1)} . For k large, these are
close in L2 (D), but still their boundary values are far apart.
2.9 V = H¹(D), a(u, ϕ) = ∬_D ⟨∇u, ∇ϕ⟩dx + γ ∫_{∂D} uϕ dS, L(ϕ) = ∬_D f ϕ dx + ∫_{∂D} gϕ dS.
2.10 V = {u ∈ H¹(D) ; u|ΓD = 0}, a(u, ϕ) = ∬_D ⟨∇u, ∇ϕ⟩dx, L(ϕ) = ∬_D f ϕ dx.
2.11 V = H0¹(D), a(u, ϕ) = ∬_D ⟨B(x)∇u(x), ∇ϕ(x)⟩dx, L(ϕ) = ∬_D f ϕ dx. The
Lax-Milgram hypothesis is satisfied if B(x) are uniformly bounded and positive
definite matrices.
Problem session 3:
3.2 fˆk (ξ) = 2 sin(kξ)/ξ. Note that if fk → f weakly, then fˆk → fˆ weakly. Why?
Problem session 4:
Instructions
One question will be to account for some of the Definitions that we have
encountered in the course. Typical questions may be “what is a tempered
distribution?” or “what do we mean by a Green’s function?”. It is important
that you understand the difference between a Definition (where the precise
meaning of X is explained; no proofs are relevant here!) and a
Theorem/Proposition/Lemma (where we prove something about well-defined
things).
Minimum principle for first eigenvalue: Proposition 4.1.5, part (1) of proof
8. (a) Report containing the points (a) - (f) from above with no severe mistakes:
passed, 0 bonus points.
(b) Entirely correct and complete presentation of the theoretical part: 1
bonus point.
(c) Professional presentation of the result with proper images and discussion
about errors: 1 bonus point.
(d) One obtains further bonus points by completing the extra exercises in the
project descriptions in Appendix C.
(e) From the 3rd meeting with the supervisor onwards: −1 bonus point per
meeting.
Appendix C
FEM projects
As part of the course, you are supposed to solve one, and only one, of the three
projects described below in this appendix.¹
Before getting started on your project, you are supposed to read through Sec-
tions 2.4 and 4.3, and Appendix D. Matlab should be used, following Appendix D.
Note that you are only allowed to use the PDE toolbox for creating triangulations
and for plotting, and not for computing matrices and boundary conditions.
Here a is the thermal conductivity coefficient of water and c is the heat conductiv-
ity coefficient of the hose’s walls. Typical values are awater = 0.6 W/K and c = 10
W/(dm K). Furthermore ν is the outward pointing unit normal vector to the bound-
ary ∂D of the domain and we use dm as unit of length. We assume that the cross
section of the hose is a circle of radius 1 dm.
¹ The first and last projects were developed by Christoffer Cromvik in 2005 and the second by Fredrik
Lindgren in 2009. They have also been further developed by Matteo Molteni, Maximilian Thaller and
Andreas Rosén since then.
2. We next check the code against an analytic solution. Consider the function
u(x, y) = 325 − 20(x² + y²), a = 0.6 and u0 = 300, and find the corresponding
values for c and f . Then use your code to calculate u numerically, for these
a, f, c and u0 . Check your code if the error does not seem to converge to 0.
(Hint: Carefully specify your circle with pdecirc.)
Report: Maximum absolute errors of the solution, for different meshes.
3. We now consider the following non-linear PDE problem. Set the outside tem-
perature to u0 = 250 and c = 10. Ice has a different thermal conductivity
coefficient than water, namely aice = 2.2 W/K. So a depends on the temperature
u now:
\[
a(u) =
\begin{cases}
2.2, & u < 273, \\
0.6, & u > 273.
\end{cases}
\tag{C.2}
\]
This gives a non-linear partial differential equation that we can solve with a
fixed point iteration. The fixed point iteration is the following procedure: As-
sume first a = 0.6 everywhere, then solve the boundary value problem. Change
afterwards a to 2.2 at all node points where u < 273. Solve the boundary value
problem again and adjust a according to the values of u. Repeat this proce-
dure as long as a has to be adjusted. What constant value of the heat source
function f describing the radiation is needed so that in the equilibrium state
we have 50% ice and 50% water of the area cross section?
Report: The value of f and plot of the non-linear solution u.
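The structure of the fixed point iteration can be tried out on a much simpler toy problem before it is built into the FEM code. The sketch below is not the project problem: it assumes a one-dimensional analogue −(a(u)u′)′ = f on (0, 1) with Dirichlet values u = u0 at both ends, a finite difference discretization, and the sample source f = 400 (all of these are assumptions made here for illustration). Only the loop "solve, update a from u, repeat while a changes" is meant to carry over.

```matlab
% Sketch: fixed point iteration for a(u) on a 1D toy problem (not the project PDE).
N = 101;  x = linspace(0, 1, N).';  h = x(2) - x(1);
u0 = 250;  f = 400;                     % toy boundary value and source (assumptions)
a = 0.6 * ones(N, 1);                   % start with water everywhere
maxIter = 50;                           % safety cap, the iteration may also oscillate
for iter = 1:maxIter
    am   = (a(1:end-1) + a(2:end)) / 2;           % conductivity at cell midpoints
    main = (am(1:end-1) + am(2:end)) / h^2;       % diagonal for the interior nodes
    off  = -am(2:end-1) / h^2;                    % coupling between interior nodes
    K = diag(main) + diag(off, 1) + diag(off, -1);
    rhs = f * ones(N-2, 1);
    rhs(1)   = rhs(1)   + am(1)   * u0 / h^2;     % Dirichlet value at x = 0
    rhs(end) = rhs(end) + am(end) * u0 / h^2;     % Dirichlet value at x = 1
    u = [u0; K \ rhs; u0];
    aNew = 0.6 * ones(N, 1);
    aNew(u < 273) = 2.2;                          % ice where u < 273
    if isequal(aNew, a), break; end               % stop when a needs no adjustment
    a = aNew;
end
fprintf('stopped after %d passes, fraction of ice nodes: %.2f\n', iter, mean(a == 2.2));
```

The cap maxIter is a design choice of this sketch: nothing here guarantees that the update of a settles down, so the loop should not rely on the stopping test alone.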
5. Consider a = 0.6, u0 = 300 and f (x, y) = 100 sin(6y). Vary the value of c and
compute the solution u. What happens when (a) c → 0+ (b) c → +∞ and (c)
c < 0.
Report: Plots, with mathematical and physical explanations of the results.
2. We next check the code against an analytic solution. Compute the exact values
of the 50 first eigenvalues for the rectangle D = (−1, 1)×(−0.5, 0.5), by sorting
those appearing in Example 4.2.1 in ascending order. Compute the maximum
error among the first 50 eigenvalues for different meshes. Check your code
if your error does not seem to converge to 0. (Hint: carefully specify your
rectangle with pderect.)
Report: Errors of the solution, for different meshes. Explanation of the sign
of the errors.
6. Find two different domains D1 and D2 which have exactly the same set of
Dirichlet eigenvalues. Verify this numerically with your Matlab code. Hint:
Example 4.2.5.
Report: Plots of the difference between the 50 first eigenvalues for respective
domains. What domains you are using.
Here u describes the height of the water surface and ν is the outward pointing unit
normal to the boundary ∂D of the domain. We use m as unit of length, and a
typical value of wave propagation speed which we use is c = 1.6 m/s. We discretize
this evolution problem by writing
\[ u(t,x) = \sum_{j=1}^{N} \xi_j(t) \phi_j(x), \]
with spatial FEM basis functions ϕj and coordinates ξj (t) depending on time t.
We do not use eigenfunction expansions here, but instead solve the ODE in time
numerically.
using the quadrature (D.2). You can now take u0 = v0 = 0. Check your
code for the exact solution u = t² cos x cos y on D = (0, π)²: calculate the
corresponding source f , and from this reconstruct the solution numerically.
Report: Code for IntVector. Error plot at t = 2.
Matlab
t This is a matrix with 4 rows, the first 3 corresponding to the matrix triangles
in Section 2.4. The last row is a subdomain reference, useful for example when
data is given by different expressions in parts of D.
e This is a matrix with 7 rows, the first 2 corresponding to the matrix boundary
in Section 2.4. Rows 6 : 7 are references to the triangles to the left and right of
the boundary interval, 0 referring to the exterior of D. The boundary ∂D may
consist of subsegments. Row 5 tells which subsegment the boundary interval
belongs to, and rows 3 : 4 give the ordering of the endpoints for the boundary
interval, within the subsegment.
When estimating errors by comparing the results for a coarse and a fine mesh,
note that as you refine your mesh with PDE modeler, the new nodes for the fine
mesh are listed after the old nodes for the coarse mesh in your variables.
where the first sum is over all triangles T , and approximates the integral over D,
whereas the second sum is over all boundary intervals, and approximates the integral
over ∂D. Depending on the problem, two functions
need to be implemented. (Int = Interior, Bdy = Boundary.) The first function calculates,
given the coordinates of the nodes of a triangle T , a 3 × 3 matrix containing
the values
\[ \iint_T a_0(\phi_i, \phi_j)\, dx, \]
when i and j each is a vertex of T . The second function calculates, given the
coordinates of the nodes of a boundary interval I, a 2 × 2 matrix containing the
values
\[ \int_I a_1(\phi_i, \phi_j)\, ds, \]
when i and j each is an end point of I. The following Matlab script then produces
the matrix A.
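N = size(p, 2);               % number of nodes
A = zeros(N, N);
for el = 1 : size(t, 2)       % loop over all triangles
    nn = t(1:3, el);          % global node indices of the triangle
    A(nn, nn) = A(nn, nn) + IntMatrix( p(:, nn) );
end
for bel = 1 : size(e, 2)      % loop over all boundary intervals
    nn = e(1:2, bel);         % global node indices of the end points
    A(nn, nn) = A(nn, nn) + BdyMatrix( p(:, nn) );
end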
For some problems A may only involve a solid integral, in which case the second
for-loop should be omitted.
The vector coordinates to be calculated are typically of the form
\[ b_i = \sum_T \iint_T L_0(\phi_i)\, dx + \sum_I \int_I L_1(\phi_i)\, ds. \]
when i is a vertex of T . The second function calculates, given the coordinates of the
nodes of a boundary interval I, a 2 × 1 column vector containing the values
\[ \int_I L_1(\phi_i)\, ds, \]
when i is an end point of I. The following Matlab script then produces the vector b.
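N = size(p, 2);               % number of nodes
b = zeros(N, 1);
for el = 1 : size(t, 2)       % loop over all triangles
    nn = t(1:3, el);          % global node indices of the triangle
    b(nn) = b(nn) + IntVector( p(:, nn) );
end
for bel = 1 : size(e, 2)      % loop over all boundary intervals
    nn = e(1:2, bel);         % global node indices of the end points
    b(nn) = b(nn) + BdyVector( p(:, nn) );
end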
For some problems b may only involve a solid/boundary integral, in which case the
first/second for-loop should be omitted.
function A0 = IntMatrix(nodes)
% Input: 2 x 3 matrix, node coords as columns
% Output: 3 x 3 matrix of integrals for stiffness matrix
e1 = nodes(:, 1) - nodes(:, 3); % choose 3rd node as origin
e2 = nodes(:, 2) - nodes(:, 3);
basis = [e1, e2];
dualbasis = inv(basis'); % computes the dual basis
grads = [dualbasis(:, 1), dualbasis(:, 2), ...
         -dualbasis(:, 1) - dualbasis(:, 2)]; % gradients of the three basis functions
area = abs(det(basis))/2; % area of the triangle, for either orientation of the nodes
A0 = grads' * grads * area; % returns the 9 inner products
end
You may use this code if you understand how it works. Recall that the basis {e∗1 , e∗2 }
is dual to {e1 , e2 } if ⟨e∗1 , e1 ⟩ = ⟨e∗2 , e2 ⟩ = 1 and ⟨e∗1 , e2 ⟩ = ⟨e∗2 , e1 ⟩ = 0. The code
computes the dual basis by writing these relations in matrix form. Since ϕ1 = 1 at
node 1 and zero at nodes 2 and 3, it is clear that ∇ϕ1 = e∗1 . Similarly ∇ϕ2 = e∗2 ,
and for ∇ϕ3 we differentiate the relation ϕ1 + ϕ2 + ϕ3 = 1.
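To see the dual basis computation at work, one can run the same steps by hand on a triangle where the answer is easy to verify. The lines below are a sketch that repeats the body of IntMatrix inline for the reference triangle with nodes p1 = (1, 0), p2 = (0, 1), p3 = (0, 0), a sample triangle chosen here. For this triangle the gradients are (1, 0), (0, 1) and (−1, −1), and the area is 1/2.

```matlab
% Sketch: the steps of IntMatrix on the reference triangle (1,0), (0,1), (0,0).
nodes = [1 0 0; 0 1 0];                          % node coordinates as columns
basis = [nodes(:,1) - nodes(:,3), nodes(:,2) - nodes(:,3)];
dualbasis = inv(basis');                         % columns are the dual basis vectors
grads = [dualbasis(:,1), dualbasis(:,2), -dualbasis(:,1) - dualbasis(:,2)];
area = abs(det(basis)) / 2;
A0 = grads' * grads * area;                      % local stiffness matrix
expected = 0.5 * [1 0 -1; 0 1 -1; -1 -1 2];      % computed by hand for this triangle
dualErr = norm(dualbasis' * basis - eye(2));     % duality relations <e_i^*, e_j>
disp(A0);
```

Each row of A0 sums to zero, reflecting that ϕ1 + ϕ2 + ϕ3 = 1 has zero gradient.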
You are supposed to calculate the exact value of all integrals involving only test
functions ϕi by hand. It is convenient to make a change of variables in the double
integral over the triangle:
(s, t) 7→ p3 + se1 + te2
maps the triangle {(s, t) ; s, t > 0, s + t < 1} onto T with vertices pi and edge
vectors e1 = p1 − p3 , e2 = p2 − p3 . For example, if ϕi is the test function with
ϕi (p3 ) = 1 and J denotes the Jacobian determinant, then a change of variables gives
\[ \iint_T \phi_i(x,y)\, dx\, dy = J \int_0^1 \int_0^{1-x} (1 - x - y)\, dy\, dx. \]
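The value of the reference integral, and hence of the integral over T , can be double checked numerically. This is a sketch with a sample triangle whose vertices p1, p2, p3 are chosen here for illustration; the iterated integral over the reference triangle equals 1/6, so the integral of ϕi over T is J/6, that is one third of the area of T .

```matlab
% Sketch: check the change of variables for the integral of a hat function over T.
ref = integral2(@(x, y) 1 - x - y, 0, 1, 0, @(x) 1 - x);   % reference triangle integral
p1 = [2; 0];  p2 = [1; 3];  p3 = [0; 1];                   % sample triangle (assumption)
J = abs(det([p1 - p3, p2 - p3]));                          % Jacobian determinant, 2*area
val = J * ref;                                             % integral of phi_i over T
fprintf('reference integral %.6f, integral over T %.6f, area/3 %.6f\n', ref, val, J/6);
```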
do not use the boundary vertices, and no boundary matrices BdyMatrix and no
boundary vectors BdyVector need to be computed. The following code will produce
the matrices for Dirichlet problem from the Neumann matrices, following Section 2.4.
Before plotting a Dirichlet solution xDir, we need to add back zeros at boundary
nodes, which the following code does.
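As a rough illustration of both steps, the following sketch uses a toy setup that is an assumption made here and not the listing from the course: five nodes, of which nodes 1 to 4 lie on the boundary according to rows 1:2 of a small matrix e, and a placeholder symmetric matrix A and vector b standing in for the Neumann matrix and right hand side. The Dirichlet system keeps only the rows and columns of interior nodes, and zeros are added back at the boundary nodes afterwards.

```matlab
% Sketch: reduce a Neumann system to a Dirichlet system and pad the solution (toy data).
N = 5;
e = [1 2 3 4; 2 3 4 1];            % toy boundary intervals: rows 1:2 of e (assumption)
A = 4*eye(N) - ones(N);            % placeholder for the assembled Neumann matrix
b = (1:N).';                       % placeholder for the assembled vector
bdy = unique(e(1:2, :));           % indices of boundary nodes
interior = setdiff(1:N, bdy);      % indices of interior nodes
ADir = A(interior, interior);      % Dirichlet matrix: interior rows and columns only
bDir = b(interior);
xDir = ADir \ bDir;                % Dirichlet solution at the interior nodes
x = zeros(N, 1);                   % add back zeros at the boundary nodes
x(interior) = xDir;
disp(x.');
```

The full vector x has one entry per node of the mesh and can therefore be passed to the plotting commands together with p and t.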
To plot the solution represented by the column vector x, the general plotting
command in Matlab's PDE toolbox is pdeplot. See Matlab's documentation/help.
Two useful and simpler to use plotting options are pdesurf(p, t, x), which plots the
graph of the solution (using view([0, 90]) produces a colorplot), and pdecont(p, t, x),
which plots level curves of the solution (a fourth argument can specify the number
of level curves).