
MATH3083/MATH6163

Advanced Partial Differential Equations


Carsten Gundlach
School of Mathematical Sciences
University of Southampton
Contents
1 Overview 5
1.1 Notation and terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Partial derivatives and index notation . . . . . . . . . . . . . . . . . . . . . 5
1.1.2 Systems of PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.3 PDE problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.4 Revision: Linear PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Revision: Vector calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.2 The Laplace operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.3 The divergence theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Revision: Separation of variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Complex Fourier series and Fourier transforms . . . . . . . . . . . . . . . . . . . . 10
1.4.1 Motivation from separation of variables: PDE problems with fall-off boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4.2 Revision: Complex Fourier series . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4.3 Derivation of the Fourier transform from the complex Fourier series . . . . 11
1.4.4 Use of the Fourier transform: derivatives and fall-off boundary conditions . 12
1.5 Common linear PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5.1 Laplace, Poisson and Helmholtz equations . . . . . . . . . . . . . . . . . . . 12
1.5.2 Heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5.3 Wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6 Cauchy data and the Cauchy-Kowalewski solution . . . . . . . . . . . . . . . . . . 16
1.7 Weak solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2 Well-posedness 20
2.1 Main definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Examples of ill-posed problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Wave equation with Dirichlet boundary conditions: many solutions . . . . . 20
2.2.2 Wave equation with Dirichlet boundary conditions: no continuous dependence on the boundary data . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.3 Cauchy problem for the Laplace equation: no continuous dependence on the boundary data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3 Continuous dependence on the data . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4 Examples of well-posedness results . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4.1 An energy estimate for the heat equation . . . . . . . . . . . . . . . . . . . 24
2.4.2 An energy estimate for the wave equation . . . . . . . . . . . . . . . . . . . 25
2.4.3 Uniqueness of solutions of the boundary value problem for the Poisson equation 25
2.5 The importance of well-posedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3 Classification of PDEs from their symbol 29


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 The symbol of a linear PDE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2.1 The Fourier transform in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2.2 Definition of the symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2.3 The symbol and plane wave solutions . . . . . . . . . . . . . . . . . . . . . 30
3.3 Ellipticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3.1 A model elliptic PDE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3.2 Formal definition of ellipticity . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3.3 The high-frequency approximation . . . . . . . . . . . . . . . . . . . . . . . 33
3.4 A model parabolic PDE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Hyperbolicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.5.1 A model hyperbolic PDE . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.5.2 Strict hyperbolicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.7 Special cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.7.1 First order scalar PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.7.2 Second order scalar PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.8 Nonlinear PDEs and systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4 Conservation laws 43
4.1 Integral and differential form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.1.1 One space dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.1.2 Higher space dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.2 Scalar conservation laws in one space dimension . . . . . . . . . . . . . . . . . . . . 45
4.2.1 The advection equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2.2 Method of characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.2.3 Propagation of small disturbances . . . . . . . . . . . . . . . . . . . . . . . 47
4.3 Weak solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3.1 Shock formation and Riemann problems . . . . . . . . . . . . . . . . . . . . 47
4.3.2 Propagating shock solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.3.3 Rarefaction waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.3.4 The Lax condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.5 A few words on systems and higher dimensions . . . . . . . . . . . . . . . . 50
4.4 Example: Traffic flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

5 Elementary generalised functions 55


5.1 Test functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 The δ-function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.3 The Heaviside and signum functions . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.4 Generalised functions and derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.5 Properties of the δ-function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

6 Green’s functions for ODEs 61


6.1 A simple example: first-order linear ODE with constant coefficients . . . . . . . . . 61
6.2 Another example: the harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . 63
6.3 The general second order linear ODE with constant coefficients . . . . . . . . . . . 64
6.4 Initial-value problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

7 Green’s functions for the Poisson and Helmholtz equations 68


7.1 Three-dimensional δ-function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
7.2 Free space Green’s function for the Poisson equation . . . . . . . . . . . . . . . . . 69
7.3 Free space Green’s function for the Helmholtz equation . . . . . . . . . . . . . . . 71
7.4 An alternative derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.5 The large distance and long wavelength approximations . . . . . . . . . . . . . . . 74
7.6 Uniqueness of the solution to the free-space problem . . . . . . . . . . . . . . . . . 76
7.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

8 Green’s functions for bounded regions 79


8.1 Green’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.2 The reciprocal theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.3 The Kirchhoff-Helmholtz formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.4 Problems on bounded regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.4.1 The Dirichlet problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.4.2 The Neumann problem for the Helmholtz equation . . . . . . . . . . . . . . 82
8.4.3 The Neumann problem for the Poisson equation . . . . . . . . . . . . . . . 82

8.4.4 Robin boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
8.5 The method of images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.5.1 Example: Laplace equation with Neumann BCs on a plane . . . . . . . . . 84
8.5.2 Example: Helmholtz equation with Dirichlet BCs on a plane . . . . . . . . 85
8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

9 The diffusion equation 87


9.1 The one-dimensional diffusion equation . . . . . . . . . . . . . . . . . . . . . . . . 87
9.2 The initial-value problem in one dimension . . . . . . . . . . . . . . . . . . . . . . 89
9.3 The three-dimensional problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
9.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

10 The wave equation 95


10.1 One space dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
10.1.1 The Green’s function in one space dimension . . . . . . . . . . . . . . . . . 95
10.1.2 The initial value problem in one space dimension . . . . . . . . . . . . . . . 96
10.2 The three-dimensional problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
10.2.1 The Green’s function in three space dimensions . . . . . . . . . . . . . . . . 97
10.2.2 Retarded potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
10.3 The method of descent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
10.3.1 From three to one dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . 99
10.3.2 From three to two dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . 100
10.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

Index 102

1 Overview
1.1 Notation and terminology
In these lecture notes I use boldface to highlight technical terms when they are being defined,
either informally by using them, or in a formal definition. These terms also appear in the index at
the end of these notes. I use italics to highlight something that is important. In equations, either
a := b or b =: a means that a is being defined in terms of b. Three dots . . . in a mathematical
expression signify something that I do not write out in full because it repeats something earlier –
what, should be clear from the context. Vectors are written in boldface in the printed notes, for
example x, but on the board I will underline them.

1.1.1 Partial derivatives and index notation


In these notes, I often abbreviate partial derivatives by commas, as in

u_{,x} := \frac{\partial u}{\partial x}, \qquad u_{,xy} := \frac{\partial^2 u}{\partial x\,\partial y}.  (1)
Obviously, u,xy = u,yx , as partial derivatives commute. It is helpful to stick to some preferred
order (say x first). To take an example of this notation, the most general linear second-order PDE
in one dependent variable u and two independent variables (or coordinates) x and y can be written
as
au,xx + 2bu,xy + cu,yy + pu,x + qu,y = f. (2)
(This is linear if the coefficients a, b, c, p, q and f do not depend on u.)
Remark 1.1. In the PDE literature (for example in Renardy and Rogers), it is more common to
write ux , uxy etc., without the comma, but I find it helpful to distinguish partial derivatives from
other indices.

To write a PDE even more concisely when discussing general theory, we may number the
independent variables, for example
x1 := x, x2 := y, (3)
and do the same for the partial derivatives, obviously in the same numbering:

u,1 := u,x , u,2 := u,y . (4)

We do the same for the coefficients in (2),

a11 := a, a12 = a21 := b, a22 := c, b1 := p, b2 := q (5)

and then we can write our example PDE (2) as


\sum_{i=1}^{2} \sum_{j=1}^{2} a_{ij} u_{,ij} + \sum_{i=1}^{2} b_i u_{,i} = f,  (6)

or in more relaxed notation

\sum_{i,j} a_{ij} u_{,ij} + \sum_i b_i u_{,i} = f.  (7)

We can, and always will, assume that aij is symmetric. Make sure you understand where the factor
of 2 in the coefficient 2b of (2) has gone when it is written in the form (6). The form (6) has the
advantage that it can be easily generalised to an arbitrary number of independent variables. We
will use this in definitions and theorems later.
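To see concretely where the factor of 2 in the coefficient 2b has gone, here is a quick numerical check (the test function, coefficient values and sample point are arbitrary choices made only for this illustration) that the index form (6), with symmetric aij as in (5), agrees with (2):

```python
import math

# Test function with known derivatives: u(x, y) = sin(x) cos(2y).
def u_x(x, y):  return math.cos(x) * math.cos(2*y)
def u_y(x, y):  return -2 * math.sin(x) * math.sin(2*y)
def u_xx(x, y): return -math.sin(x) * math.cos(2*y)
def u_xy(x, y): return -2 * math.cos(x) * math.sin(2*y)
def u_yy(x, y): return -4 * math.sin(x) * math.cos(2*y)

# Arbitrary coefficient values for (2).
a, b, c, p, q = 1.0, 3.0, -2.0, 0.5, 4.0

# Index form (5): a11 = a, a12 = a21 = b, a22 = c, b1 = p, b2 = q.
A = [[a, b], [b, c]]
B = [p, q]

x0, y0 = 0.7, -0.3
d2 = [[u_xx(x0, y0), u_xy(x0, y0)],
      [u_xy(x0, y0), u_yy(x0, y0)]]      # u_,ij (symmetric)
d1 = [u_x(x0, y0), u_y(x0, y0)]          # u_,i

# Left-hand side of (2), with the explicit factor of 2 ...
lhs_2 = a*d2[0][0] + 2*b*d2[0][1] + c*d2[1][1] + p*d1[0] + q*d1[1]
# ... and of (6): the mixed term is counted twice, once as a12 u_,12
# and once as a21 u_,21, which is where the factor of 2 has gone.
lhs_6 = (sum(A[i][j] * d2[i][j] for i in range(2) for j in range(2))
         + sum(B[i] * d1[i] for i in range(2)))

assert math.isclose(lhs_2, lhs_6)
```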

1.1.2 Systems of PDEs
More generally, we may want to solve a PDE system of N ≥ 2 PDEs for N dependent variables
(u1 , u2 , . . . , uN ), in n independent variables (x1 , x2 , . . . xn ). Note that typically there will be as
many PDEs as dependent variables (unknowns) (N of each), but the number of independent
variables n has nothing to do with that. It is then useful to write the dependent variables and the
equations as components of column vectors.
A simple example with N = 2 and n = 2 are the Cauchy-Riemann equations

u_{,y} + v_{,x} = 0,  (8)
u_{,x} - v_{,y} = 0.  (9)

If we write the dependent variables u and v as the vector

u := \begin{pmatrix} u \\ v \end{pmatrix},  (10)

we can write the two PDEs as

\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} u_{,x} + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} u_{,y} = 0,  (11)

or more abstractly as

\sum_{i=1}^{2} A^i u_{,i} = 0,  (12)

where

A^x := A^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad A^y := A^2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.  (13)
As another example, the most general second-order linear system in N dependent and n independent variables can, in an abstract way, be written as

\sum_{i=1}^{n} \sum_{j=1}^{n} A^{ij} u_{,ij} + \sum_{i=1}^{n} B^i u_{,i} = f,  (14)

where each A^{ij} and each B^i (say, A^{12} = A^{21}, or B^3) is an N × N matrix. One sometimes refers to u = (u_1, . . . , u_N) as a vector in state space to distinguish it from a vector x = (x_1, . . . , x_n) in physical space.
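You can check the Cauchy-Riemann example on a computer. The real and imaginary parts of the holomorphic function (x + iy)² satisfy (8)-(9), and hence the matrix form (11)-(13); the following sketch (the sample point is an arbitrary choice) verifies both, using the exact partial derivatives:

```python
# u + i v = (x + i y)^2, so u = x^2 - y^2 and v = 2 x y (holomorphic,
# hence the Cauchy-Riemann equations hold).
x, y = 1.3, -0.8
u_x, u_y = 2*x, -2*y
v_x, v_y = 2*y,  2*x

# Component form (8)-(9):
eq8 = u_y + v_x
eq9 = u_x - v_y

# Matrix form: A1 u_,x + A2 u_,y = 0 with u = (u, v)^T and the
# matrices of (13): A1 = [[0,1],[1,0]], A2 = [[1,0],[0,-1]].
A1 = [[0, 1], [1, 0]]
A2 = [[1, 0], [0, -1]]
ux_vec = [u_x, v_x]          # u_,x componentwise
uy_vec = [u_y, v_y]          # u_,y componentwise
residual = [sum(A1[r][i]*ux_vec[i] + A2[r][i]*uy_vec[i] for i in range(2))
            for r in range(2)]

assert eq8 == 0 and eq9 == 0
assert residual == [0, 0]
```

Row 1 of the matrix system reproduces (8) and row 2 reproduces (9), which is exactly how the matrices in (13) are read off.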

1.1.3 PDE problems


A PDE (or PDE system) is not usually solved in isolation. We must specify a domain in space,
or in space and time, on which we want the PDE to hold. If the domain is bounded, then we will
need to impose one or several boundary conditions on each boundary. For a time-dependent
problem, boundary conditions at the initial time t = 0 are also called initial conditions. (For
other purposes, it may make sense to refer to boundary conditions in space and initial conditions
together as “boundary conditions”.) It is important to understand that the initial and boundary
conditions are independent of the PDE (system) in the interior.
If the domain is unbounded, for example all of Rn , then we still need boundary conditions,
but they may be less obvious. For example, we may want the solution to fall off (go to zero)
sufficiently fast as we approach infinity.
Definition 1.2. A PDE problem is a PDE (system), together with its domain and all necessary
boundary and/or initial conditions.

1.1.4 Revision: Linear PDEs
Probably all the PDEs you have seen until now were linear PDEs. This means that they are linear
expressions (containing the first power or the zeroth power) in the unknown and its derivatives.
Our example PDE (2) is linear if the coefficients a, b, c, p, q, f are known functions of x and y (they
may be constants), but not of u. All terms in it are of the first power in the unknown u, except
for f, which is zeroth power. If there is a zeroth-power term, also called the inhomogeneous
term or source term, the linear PDE is called inhomogeneous. If this is absent (f = 0 in our
example), the linear PDE is homogeneous. A PDE system is linear if each PDE is linear in the
vector u of unknowns.
Sometimes it is useful to write an inhomogeneous linear PDE abstractly as
Lu = f, (15)
where L is a homogeneous linear differential operator (say the Laplace operator) and f is the
inhomogeneous term. For a PDE system, we could write Lu = f .
The general solution (without imposing any boundary conditions) of any inhomogeneous linear
PDE (or PDE system) is of the form u = uCF + uPI . Here the complementary function uCF
is the general solution of the corresponding homogeneous linear PDE Lu = 0, and the particular
integral uPI is any one solution of the original inhomogeneous linear PDE (15). This is just the
same as for linear ODEs. It works because
Lu = L(uCF + uPI ) = LuCF + LuPI = 0 + f = f. (16)
We have used the linearity of L in the second equality. Recall that any operator L is linear if
L(f + g) = L(f ) + L(g) and L(cf ) = cL(f ), where c is a constant. For example L(f ) = df /dx or
L(f ) = h(x)f are linear, but L(f ) = f 2 would not be. Moreover, it is customary in mathematics
to write homogeneous linear functions as prefix operators, that is, we write L(u) as Lu if L is
homogeneous linear.
Similarly, in linear PDE problems we distinguish homogeneous boundary conditions and
inhomogeneous boundary conditions. u = 0 at the boundary is an example of a homogeneous
boundary condition, while u = g, with g(x) a given function, would be inhomogeneous. This g(x)
is an example of what is called boundary data. With inhomogeneous boundary conditions and a
source term, we can then again write the general solution as the sum of a particular integral that
solves the PDE problem with the source term and with inhomogeneous boundary conditions, and
a complementary function that is the general solution of the corresponding PDE with zero source
term and homogeneous boundary conditions.

1.2 Revision: Vector calculus


1.2.1 Notation
In (14), we have written the dependent variables as a vector u, but we have used index notation for
the independent variables. In other contexts, it may be useful to write the independent variables
as a vector x, especially if they represent a position vector in three-dimensional Euclidean space.
A dependent variable may itself be a direction vector in Euclidean space, for example the velocity
vector v(x) in fluid dynamics. We then write

x := (x_1, x_2, x_3) := (x, y, z),  (17)
v(x) := (v^1, v^2, v^3) := (v^x, v^y, v^z),  (18)
\nabla := \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right).  (19)

We can use ∇ (pronounced nabla) to define the gradient of the scalar function u as

\nabla u := (u_{,x}, u_{,y}, u_{,z})  (20)

and the divergence of the vector-valued function v as

\nabla \cdot v := v^x_{,x} + v^y_{,y} + v^z_{,z} = \sum_i v^i_{,i}.  (21)

Note that

\nabla \cdot x = \sum_{i=1}^{n} x^i_{,i} = n  (22)

gives the number of space dimensions.


Remark 1.3. In this course, I use the notation of denoting a vector by a boldface letter both
for vectors in three (or two) space dimensions such as x, short for x_i with i = 1, . . . , n, and for
vectors of variables such as u, short for u_α with α = 1, . . . , N. I use ∇ only to denote derivatives
with respect to space (the independent variables). (By contrast, Renardy and Rogers use it also
for derivatives with respect to the dependent variables.)

1.2.2 The Laplace operator


The second-order derivative operator ∇ · ∇ is called the Laplace operator and is usually denoted by either ∇² or Δ. In three-dimensional Cartesian coordinates x = (x_1, x_2, x_3) the Laplace operator is

\Delta^{(3)} = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}.  (23)

In two or one dimensions we have

\Delta^{(2)} = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}, \qquad \Delta^{(1)} = \frac{\partial^2}{\partial x_1^2}.  (24)
Without proof, we state a few other formulas. In cylindrical polar coordinates (r, θ, z), where

x_1 = r\cos\theta, \quad x_2 = r\sin\theta, \quad x_3 = z,  (25)

we have

\Delta^{(3)} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}  (26)
            = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}.  (27)

By just removing the coordinate z we obtain polar coordinates in the plane

x_1 = r\cos\theta, \quad x_2 = r\sin\theta,  (28)

where

\Delta^{(2)} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}.  (29)

In spherical polar coordinates (r, θ, φ), where

x_1 = r\sin\theta\cos\varphi, \quad x_2 = r\sin\theta\sin\varphi, \quad x_3 = r\cos\theta,  (30)

we have

\Delta^{(3)} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]  (31)
            = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\left(\frac{\partial^2}{\partial\theta^2} + \cot\theta\,\frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right).  (32)

In the following, Δ will be used to denote any of Δ^{(3)}, Δ^{(2)} or Δ^{(1)}. Note that in curvilinear coordinates, such as the polar coordinates we have used here, the Laplace operator generally contains first, as well as second, derivatives.
All of these formulas can be derived from the chain rule of partial derivatives

\frac{\partial}{\partial x'_i} = \sum_{j=1}^{n} \frac{\partial x_j}{\partial x'_i} \frac{\partial}{\partial x_j},  (33)

where x_j is one system of coordinates and x'_i the other. As a simple example, where n = 2, take the transformation from coordinates (x, y) to (ξ, η), given by

\frac{\partial}{\partial\xi} = \frac{\partial x}{\partial\xi}\frac{\partial}{\partial x} + \frac{\partial y}{\partial\xi}\frac{\partial}{\partial y},  (34)
\frac{\partial}{\partial\eta} = \frac{\partial x}{\partial\eta}\frac{\partial}{\partial x} + \frac{\partial y}{\partial\eta}\frac{\partial}{\partial y}.  (35)

1.2.3 The divergence theorem


There is more to vector calculus than concise notation. One result we will need is

Theorem 1.4 (Divergence theorem). Let V be a volume and S := ∂V its surface. Let dV be the volume element and dS the surface element, and let n(x) be the outward-pointing unit normal vector at each point of S. Then

\iiint_V \nabla \cdot f \, dV = \iint_S n \cdot f \, dS.  (36)

It is easy to prove the theorem in the simple case in three dimensions where V is the rectangular box x_0 ≤ x ≤ x_1, y_0 ≤ y ≤ y_1, z_0 ≤ z ≤ z_1, and hence S is the union of six rectangles.
Using the fundamental theorem of calculus,

\int_{x_0}^{x_1} \frac{df}{dx}\,dx = f(x_1) - f(x_0),  (37)

separately in the x, y and z directions, one can show that

\int_{x_0}^{x_1}\!dx \int_{y_0}^{y_1}\!dy \int_{z_0}^{z_1}\!dz \,\left(f^x_{,x} + f^y_{,y} + f^z_{,z}\right)
  = \int_{y_0}^{y_1}\!dy \int_{z_0}^{z_1}\!dz \,\left[f^x(x_1,y,z) - f^x(x_0,y,z)\right]
  + \int_{x_0}^{x_1}\!dx \int_{z_0}^{z_1}\!dz \,\left[f^y(x,y_1,z) - f^y(x,y_0,z)\right]
  + \int_{x_0}^{x_1}\!dx \int_{y_0}^{y_1}\!dy \,\left[f^z(x,y,z_1) - f^z(x,y,z_0)\right].  (38)

In vector calculus notation this can be written as (36), where n = (0, -1, 0) for all points in the part of S given by y = y_0, x_0 ≤ x ≤ x_1, z_0 ≤ z ≤ z_1, and so on for the other five faces of the box.
The actual divergence theorem above generalises our simple result in two ways. First, one can
prove that (36) holds for any volume V with boundary S, not only a rectangular box. Secondly,
r, f , dV , dS and n are geometric objects, in the sense that the two integrals on either side of (36)
can be defined and evaluated not only in Cartesian coordinates, but in arbitrary coordinates.
You have previously learned to evaluate (36) in some simple situations, but for this course we
mostly need the abstract version.
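If you want to see the theorem in action numerically, here is a midpoint-rule check of (36) on the unit box for one arbitrarily chosen vector field (the field and the grid resolution are illustrative choices, not part of the theorem):

```python
# Check (36) on the unit box 0 <= x, y, z <= 1 for f = (x^2, y*z, z*x),
# for which div f = 2x + z + x = 3x + z.
N = 40
h = 1.0 / N
pts = [(i + 0.5) * h for i in range(N)]          # midpoint-rule sample points

# Left-hand side: volume integral of div f.
vol = sum((3*x + z) * h**3 for x in pts for y in pts for z in pts)

# Right-hand side: surface integral of n . f over the six faces,
# taking the outward normals into account.
surf = 0.0
for a in pts:
    for b in pts:
        surf += (1.0 - 0.0) * h**2   # x-faces: f^x = x^2 is 1 at x=1, 0 at x=0
        surf += (b - 0.0) * h**2     # y-faces: f^y = y*z is z=b at y=1, 0 at y=0
        surf += (a - 0.0) * h**2     # z-faces: f^z = z*x is x=a at z=1, 0 at z=0

assert abs(vol - 2.0) < 1e-9
assert abs(surf - 2.0) < 1e-9
```

Both sides give 2, as they must; for this polynomial field the midpoint rule is exact up to rounding.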

1.3 Revision: Separation of variables


In MATH2038, MATH2047, MATH2048 or MATH2015 you have learned to solve PDEs using
separation of variables. Keep in mind that this works only under the following conditions (we
initially assume there are only n = 2 independent variables):
1. The PDE is homogeneously linear and separable. This means that an ansatz like

u(x, y) = X(x) Y (y) (39)

turns the linear PDE


Lu(x, y) = 0, (40)

where L is a homogeneous linear differential operator, into

Lx X(x) = Ly Y (y), (41)

where Lx and Ly are homogeneous linear differential operators. Hence both sides must be equal
to some constant K, and so we have the ODEs

Lx XK (x) = K, Ly YK (y) = K. (42)

These ODEs are then solved for arbitrary separation constant K. Typically, homogeneous
boundary conditions allow only discrete values of K. The solution u(x, y) is obtained as the sum
u(x, y) = \sum_K c_K X_K(x) Y_K(y).  (43)

The constants cK are typically determined by initial conditions or inhomogeneous boundary con-
ditions.
2. The domain of the problem is a coordinate rectangle. In other words, the domain has the form a < x < b and c < y < d. The important consequence of this is that each part of the boundary affects only one of the ODEs above. So a boundary condition at x = 0 becomes a boundary condition for X(0) but does not affect Y(y). This in turn means that the ODEs for X and Y can be solved separately.
A coordinate rectangle does not have to be a rectangle in physical space. For example, a sphere in spherical coordinates is given by

0 \le r \le 1, \quad 0 \le \theta \le \pi, \quad 0 \le \varphi < 2\pi,  (44)

which is again a coordinate rectangle, but now in spherical polar coordinates.


We do not review basic separation of variables any further here, but it is worth reminding you
of a few more advanced aspects below.
More than two independent variables: Separate in one variable first and introduce a first separation constant. You now have an ODE, and a PDE in one fewer independent variable. Repeat.
Inhomogeneous boundary conditions: Say we want to solve the Laplace equation u_{,xx} + u_{,yy} = 0 on a coordinate rectangle, with inhomogeneous (non-zero) boundary conditions on each side. We do this by writing the solution u as the sum of four solutions, each of which obeys homogeneous boundary conditions on three sides and the given inhomogeneous one on one side.
Another method works more generally. Here we turn inhomogeneous boundary conditions into a source term. Let u_1 be any function that obeys the inhomogeneous boundary condition, but is not actually a solution of the PDE, i.e. Lu_1 ≠ f. Then write u = u_1 + u_2, where u_2 obeys the corresponding homogeneous boundary condition but has a non-vanishing source term, given by

L u_2 = -L u_1 + f.  (45)

You should convince yourself that this works.
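As a concrete instance of conditions 1 and 2 above, u(x, y) = sin(nx) sinh(ny) is a separable solution of the Laplace equation u_{,xx} + u_{,yy} = 0 obeying homogeneous Dirichlet conditions at x = 0 and x = π. A short numerical confirmation (n and the sample point are arbitrary choices for this sketch):

```python
import math

# Separable solution of u_,xx + u_,yy = 0 with X(0) = X(pi) = 0:
#   X_n(x) = sin(n x),  Y_n(y) = sinh(n y).
n = 3
x, y = 0.8, 0.5

u_xx = -n**2 * math.sin(n*x) * math.sinh(n*y)   # X_n'' Y_n = -n^2 X_n Y_n
u_yy =  n**2 * math.sin(n*x) * math.sinh(n*y)   # X_n Y_n'' = +n^2 X_n Y_n
assert abs(u_xx + u_yy) < 1e-12                 # the Laplace equation holds

assert math.sin(n * 0.0) == 0.0                 # boundary condition at x = 0
assert abs(math.sin(n * math.pi)) < 1e-12       # boundary condition at x = pi
```

The two second derivatives cancel precisely because X and Y share the same separation constant with opposite signs.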

1.4 Complex Fourier series and Fourier transforms


In this course, we will need the complex Fourier transform for two purposes: formally, in the classification of PDEs, and in solving PDEs by separation of variables where the domain in, say, x is not an interval a ≤ x ≤ b but the real line −∞ < x < ∞. You have already seen the FT briefly in MATH2047, MATH2048 or MATH2015, but not in MATH2038.

1.4.1 Motivation from separation of variables: PDE problems with fall-off boundary conditions
In separation of variables on an interval, we typically have a boundary condition on each end of the interval. As a simple example, assume we want to solve some PDE for u(x, y, z) with homogeneous boundary conditions u(0, y, z) = u(2π, y, z) = 0 on u. After separation of variables with the usual ansatz u(x, y, z) = X(x)Y(y)Z(z) this gives the boundary conditions X(0) = X(2π) = 0 on X. Then we would make each X_n(x) obey these boundary conditions, by trying X_n(x) = sin nx, n = 1, 2, . . . . Note that this choice of X_n(x) also implies the simple relation X_n'' = −n² X_n.
If instead of an interval we have the real line, we typically have a fall-off condition, u(x, y, z) → 0 as |x| → ∞. In principle it would be possible to make each X_k(x) vanish separately at infinity. But then we would lose the useful property that X_k'' is related to X_k in a simple way, namely X_k'' = −k² X_k. Instead we use X_k(x) = e^{ikx}. This is periodic (with period 2π/k), and so does not fall off at infinity. It also obeys X_k'' = −k² X_k again. However, by superimposing a continuum of such functions we can make a function that falls off at infinity. This leads us to the concept of a Fourier transform.

1.4.2 Revision: Complex Fourier series


You have already covered the complex Fourier series in the prerequisite course for this one. If f(x) is a periodic function with period L, that is f(x + L) = f(x), then we can write it as

f(x) = \sum_{n=-\infty}^{\infty} c_n e^{i k_n x}, \qquad k_n := \frac{2\pi n}{L},  (46)

where the complex Fourier coefficients c_n are given by

c_n = \frac{1}{L} \int_{-L/2}^{L/2} f(x)\, e^{-i k_n x}\,dx.  (47)

Note the integral could be taken over any other full period, for example from 0 to L. If f(x) is real, then c_{-n} = c_n^*, where the star denotes the complex conjugate.
[You can verify (47) directly by changing n to m and then substituting the expression (46) for f(x). After integrating, only the n = m term survives, and you get c_m = c_m, as required.]
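A numerical sanity check of (47), with the arbitrary choice f(x) = cos x and L = 2π, so that c_1 = c_{−1} = 1/2 and all other coefficients vanish (the number of quadrature points is also an arbitrary choice):

```python
import cmath, math

L = 2 * math.pi
def f(x):                      # real, L-periodic test function
    return math.cos(x)

def c(n, M=400):               # midpoint-rule approximation of (47)
    k_n = 2 * math.pi * n / L
    h = L / M
    s = sum(f(-L/2 + (m + 0.5)*h) * cmath.exp(-1j * k_n * (-L/2 + (m + 0.5)*h))
            for m in range(M))
    return s * h / L

# cos x = (e^{ix} + e^{-ix}) / 2, so c_1 = c_{-1} = 1/2, and c_2 = 0;
# for real f we also have c_{-n} = c_n^*.
assert abs(c(1) - 0.5) < 1e-10
assert abs(c(2)) < 1e-10
assert abs(c(-1) - c(1).conjugate()) < 1e-10
```

For a trigonometric polynomial the equally spaced midpoint sum over one full period is exact up to rounding, which is why such a tight tolerance works.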

1.4.3 Derivation of the Fourier transform from the complex Fourier series
From the complex Fourier series we can obtain the Fourier transform as a limit. Define

\hat f(k_n) := \frac{c_n L}{\sqrt{2\pi}}, \qquad \Delta k := k_{n+1} - k_n = \frac{2\pi}{L}.  (48)

With this notation, (46) and (47) become

f(x) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} \hat f(k_n)\, e^{i k_n x}\,\Delta k,  (49)
\hat f(k_n) = \frac{1}{\sqrt{2\pi}} \int_{-L/2}^{L/2} f(x)\, e^{-i k_n x}\,dx.  (50)

So far, the function \hat f(k) is defined only at the discrete points k_n. Now as we take the limit L → ∞, we have Δk → 0 and we can think of it as dk under an integral, that is \sum \Delta k \to \int dk. At the same time, \hat f(k) is then defined on the continuous real line. We have motivated

Definition 1.5 (Fourier transform).

f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat f(k)\, e^{ikx}\,dk,  (51)
\hat f(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\,dx.  (52)

Note the pleasing symmetry. When f(x) is real, then \hat f(k) = \hat f(-k)^* (the complex conjugate).
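A numerical check of (52) for one concrete (and arbitrary) choice: the Gaussian f(x) = e^{−x²/2}, which in this symmetric convention is its own Fourier transform, \hat f(k) = e^{−k²/2}:

```python
import cmath, math

def f(x):                      # Gaussian test function
    return math.exp(-x**2 / 2)

def fhat(k, X=10.0, M=2000):   # midpoint-rule approximation of (52) on [-X, X]
    h = 2*X / M
    s = sum(f(-X + (m + 0.5)*h) * cmath.exp(-1j*k*(-X + (m + 0.5)*h))
            for m in range(M))
    return s * h / math.sqrt(2*math.pi)

# The Gaussian is its own transform in this convention:
for k in (0.0, 0.7, 1.5):
    assert abs(fhat(k) - math.exp(-k**2 / 2)) < 1e-8
```

Truncating the integral at |x| = 10 is harmless here because the Gaussian is far below rounding error there, which is also why the fall-off discussion of the next subsection matters.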

1.4.4 Use of the Fourier transform: derivatives and fall-off boundary conditions
We can now go back to the motivation from separation of variables. First, taking derivatives. It is easy to see that

f'(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} ik\, \hat f(k)\, e^{ikx}\,dk,  (53)

so we know how to differentiate.
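You can verify (53) numerically, again with the Gaussian f(x) = e^{−x²/2} as an arbitrary test function, for which \hat f(k) = e^{−k²/2}: the transform of f' should be ik \hat f(k).

```python
import cmath, math

def fprime(x):                 # derivative of the Gaussian f(x) = exp(-x^2/2)
    return -x * math.exp(-x**2 / 2)

def ft(g, k, X=10.0, M=2000):  # midpoint-rule approximation of (52) for any g
    h = 2*X / M
    s = sum(g(-X + (m + 0.5)*h) * cmath.exp(-1j*k*(-X + (m + 0.5)*h))
            for m in range(M))
    return s * h / math.sqrt(2*math.pi)

# (53): the transform of f' is ik * fhat(k), with fhat(k) = exp(-k^2/2) here.
for k in (0.3, 1.0, 2.0):
    assert abs(ft(fprime, k) - 1j*k*math.exp(-k**2 / 2)) < 1e-8
```

This is the property that turns differentiation in x into multiplication by ik in k, the key step when solving PDEs by Fourier transform.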


Second, fall-off boundary conditions at infinity. It is a theorem that f(x) admits a Fourier transform if and only if

\int_{-\infty}^{\infty} |f(x)|^2\,dx < \infty.  (54)

We say that f is square-integrable, or that it is in the function space L²(R). But this means that f(x) must vanish at infinity sufficiently fast for this integral to exist. Hence if we assume that f(x) can be written as an inverse Fourier transform (51), f(x) is square-integrable and we automatically have the fall-off condition f(x) → 0 as |x| → ∞. In other words, if we solve a PDE using a Fourier transform in the variable x with range −∞ < x < ∞, then we automatically impose a fall-off condition f(x) → 0 as |x| → ∞.

1.5 Common linear PDEs


1.5.1 Laplace, Poisson and Helmholtz equations
We start with PDEs that do not have time derivatives. The Laplace equation is

\Delta u = 0.  (55)

The Poisson equation is just the Laplace equation with a (given) source term, or

\Delta u = f(x).  (56)

The (inhomogeneous) Helmholtz equation, which we will derive below, is

(\Delta + k^2)\, u = f(x).  (57)

Let us assume that we want to solve one of these linear PDEs on a bounded domain V with
boundary S, with unit normal vector n. At each point on the boundary we can impose precisely
one boundary condition. The standard classes of conditions that can be imposed are
Dirichlet boundary conditions: Specify the value of u on S.

Neumann boundary conditions: Specify the value of the normal derivative

u_{,n} := n \cdot \nabla u = \sum_i n^i u_{,i}  (58)

on S, where n is the outward-pointing unit normal vector.


Robin boundary conditions: Specify the value of a linear combination of u and u_{,n} on S.
For example, the Poisson equation with inhomogeneous Robin boundary conditions specifies the PDE problem

\Delta u = f(x) \quad \text{in } V,  (59)
\beta u_{,n} + \alpha u = g(x) \quad \text{on } S.  (60)

Here the special value β = 0 gives Dirichlet boundary conditions and α = 0 gives Neumann boundary conditions. Here, and in similar examples, g(x) ≠ 0 specifies an inhomogeneous boundary condition and g(x) = 0 the corresponding homogeneous boundary condition.

Remark 1.6. For the Poisson equation the solution with Neumann boundary conditions is only unique up to an arbitrary constant, for if u_1 satisfies the problem so too does u_2 = u_1 + C where C is constant. This is so because the derivative of a constant is zero, and so both ΔC = 0 and C_{,n} = 0.
Remark 1.7. For the Poisson or Laplace equation with Neumann boundary conditions,

Δu = f (x) in V, (61)
u,n = g(x) on S, (62)

we must have

∫_V f dV = ∫_V Δu dV = ∫_V ∇ · (∇u) dV = ∫_S (∇u) · n dS = ∫_S u,n dS = ∫_S g dS (63)

for the source term f and boundary data g to be compatible. Otherwise, this problem has no
solution. (We will come back to this in Sec. 8.4.3.)
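In one space dimension the compatibility condition (63) reduces to ∫_0^1 f dx = g(0) + g(1), since the "surface integral" is just a sum over the two endpoints. A quick numerical check in Python (the data below are illustrative choices, not taken from the notes):

```python
import numpy as np

# 1-d check of the compatibility condition (63): with Neumann data
# g(0) = g(1) = 0, the source f must integrate to zero over the domain.
x = np.linspace(0.0, 1.0, 2001)
h = x[1] - x[0]
f_good = np.cos(2.0 * np.pi * x)   # integrates to zero: compatible
f_bad = np.ones_like(x)            # integrates to one: incompatible
g_sum = 0.0                        # g(0) + g(1)

def integral(f):
    """Trapezoidal rule for int_0^1 f dx."""
    return np.sum(0.5 * (f[1:] + f[:-1])) * h

lhs_good = integral(f_good)
lhs_bad = integral(f_bad)
compatible_good = abs(lhs_good - g_sum) < 1e-8
compatible_bad = abs(lhs_bad - g_sum) < 1e-8
```

A solver should make this check before attempting the Neumann problem: if it fails, no solution exists.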
Remark 1.8. Dirichlet, Neumann or Robin boundary conditions, homogeneous or inhomogeneous,
all apply only at a boundary at finite distance. If the domain of the PDE is infinite, then most
likely we will want to impose a fall-off condition u(x) → 0 as |x| → ∞ in some direction, or in
any direction. Note that there is only one type of fall-off condition, and in particular we should
think of it as a (homogeneous) boundary condition at infinity. It turns out that if, for a given
PDE, we can impose either a Dirichlet, Neumann or Robin boundary condition on a finite volume,
as a fourth alternative we can also impose a fall-off condition on any open side of
an infinite (unbounded) volume. If the domain is all of space, and we have fall-off at infinity in
all directions, this type of boundary condition is also called a free space boundary condition.
This is often the simplest boundary condition to work with. In particular, we will use it a lot when
we find Green’s functions for PDEs.
The Laplace, Poisson and Helmholtz equations, in any number of space dimensions, are exam-
ples of a class of PDEs that are called elliptic equations. They are, in fact, the most important
examples of linear second-order elliptic equations. A formal definition of ellipticity will be given
in Sec. 3 below.
Remark 1.9. A key feature of elliptic PDEs is that they smooth out their data: the interior
solution is more often differentiable than the boundary data.

1.5.2 Heat equation


For a time-dependent problem on a bounded domain, with a PDE that contains one or more time
derivatives, we must specify both initial data throughout the domain (volume) V at some time
t = 0, and boundary data on the boundary of the domain (surface) S for t ≥ 0; see Fig. 1. The
problem is then solved for t > 0 inside the volume V . This is called the Cauchy problem.

[Figure: left panel, the heat equation φ,t = Δφ on a domain in (x, y), with initial data
φ(x, y, 0) = f (x, y) and boundary data αφ + βφ,n = g(x, y, t); right panel, the wave equation
φ,tt = Δφ with initial data φ(x, y, 0) = f (x, y) and φ,t (x, y, 0) = g(x, y), and boundary data
αφ + βφ,n = h(x, y, t).]

Figure 1: The Cauchy problem for the heat equation (left) and the wave equation (right)

We begin by looking at the heat equation, also called the diffusion equation,

u,t = κ Δu. (64)

Here κ > 0 is a constant with dimension length2 /time, the diffusion constant. To see this, note
that both terms in this equation must have the same dimension. The left-hand side has whatever
the dimension of u is, divided by time. The right-hand side has the dimension of u, times the
dimension of κ, divided by length2 . The dimension of u cancels.
Because the heat equation gives us u,t (x, 0) if we know u(x, 0), it is intuitively clear that we
need to specify precisely u(x, 0) as initial data. A particular class of solutions of the heat equation
are those that are independent of time: they then obey the Laplace equation. This suggests that
the heat equation requires the same boundary conditions as the Laplace equation. Thus, the
Cauchy problem for the heat equation is

u,t = κ Δu, t > 0, (65)

u(x, 0) = f (x) for x ∈ V, (66)
βu,n + αu = g(x, t) for x ∈ S. (67)

(Either α or β can be zero, to give Neumann or Dirichlet boundary conditions.)


The heat equation, in any number of space dimensions, is an example of a class of PDEs that
are called parabolic. We give a formal definition of parabolic second-order scalar PDEs in Sec. 3,
but if Lu(x) = 0 is a linear elliptic equation for u(x), then u,t = Lu(x, t) is a parabolic equation for
u(x, t). Going from the Laplace equation (elliptic) to the heat equation (parabolic) is one example
of this. One can prove that a parabolic equation with Dirichlet, Neumann or Robin boundary
conditions and initial data for u has a unique solution.
Remark 1.10. Parabolic PDEs have these key features:
a) They smooth out their initial data: the solution is more often differentiable than the initial
data and boundary data.
b) They have infinite propagation speed. Even if the initial data at t = 0 is non-zero only in a
finite region of space, at any t > 0 the solution is typically non-zero in the entire domain: in this
sense, the solution has spread from the initial data at infinite speed.
c) They cannot be run backwards. If the time evolution problem with initial data at t = 0 is
well-posed for t > 0, it is ill-posed for t < 0. (Well-posedness will be defined below).
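Point b) can be illustrated with the free-space heat kernel K(x, t) = exp(−x2/(4κt))/√(4πκt): convolving compactly supported initial data with K gives a solution that is non-zero everywhere for any t > 0. The Python sketch below is illustrative only (the bump profile, the evaluation point and the times are arbitrary choices):

```python
import numpy as np

# Infinite propagation speed (Remark 1.10 b): evolve compactly supported
# initial data with the free-space heat kernel and evaluate far away.
kappa = 1.0
y = np.linspace(-0.1, 0.1, 401)          # initial data supported only here
f = np.cos(np.pi * y / 0.2) ** 2          # smooth bump, zero outside [-0.1, 0.1]
h = y[1] - y[0]

def u(x, t):
    """u(x,t) = int K(x - y, t) f(y) dy by the trapezoidal rule."""
    K = np.exp(-(x - y) ** 2 / (4.0 * kappa * t)) / np.sqrt(4.0 * np.pi * kappa * t)
    w = K * f
    return np.sum(0.5 * (w[1:] + w[:-1])) * h

# At t = 0 the solution vanishes identically at x = 3; for any t > 0 it
# does not, although it is exponentially small there.
u_far = u(3.0, 0.25)
u_centre = u(0.0, 0.25)
```

The value u_far is tiny but strictly positive: the data have "spread" to x = 3 at infinite speed.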

1.5.3 Wave equation


The wave equation, in any number of space dimensions, is

u,tt = c2 Δu, (68)

where c > 0 is the wave speed. For the wave equation to have consistent dimensions, c must have
dimension length/time.
The wave equation contains two time derivatives, and so we need initial data for both u(x, 0)
and u,t (x, 0) at t = 0. To understand this intuitively, consider the wave equation that describes
the motion of a string, that is, the wave equation in one space dimension

u,tt = c2 u,xx . (69)

(Here x is the position along the string, and u the transversal displacement of the string from
its rest position, assumed small.) We need to specify the initial position u(x, 0) and velocity
u,t (x, 0) of each part of the string. Newton’s force law and Hooke’s law of elasticity then give us
the acceleration and we can evolve in time. (By contrast, heat has no inertia, so it is sufficient to
specify the initial temperature.) If one considers a time-periodic solution to the wave equation, it
reduces to the Helmholtz equation. This suggests that the wave equation needs Dirichlet, Neumann
or Robin boundary conditions, like the Helmholtz equation.

All these conditions together constitute the Cauchy problem for the wave equation (on a volume
V with boundary S, and assuming Robin boundary conditions):

u,tt = c2 Δu for x ∈ V, t ≥ 0, (70)

u(x, 0) = f (x) for x ∈ V, (71)
u,t (x, 0) = g(x) for x ∈ V, (72)
βu,n + αu = h(x, t) for x ∈ S. (73)

The wave equation is an example of a class of PDEs called linear second-order hyperbolic
equations. We give a formal definition below. However, if Lu = 0 is an elliptic equation for some
differential operator L, then c−2 u,tt + bu,t = Lu is a hyperbolic second-order PDE, for any c > 0
and real b. The wave equation is obviously in this class, but we shall see that the class of hyperbolic
PDEs is much larger.
Remark 1.11. Hyperbolic PDEs have these key features:
a) Any features in the initial data move about in space at one or several specific velocities.
b) There is no smoothing: the solution is typically only as often differentiable as the initial data and
boundary data.
c) They can be run backwards: the same initial data at t = 0 can in general be evolved to t > 0
or t < 0.
Remark 1.12. The one-dimensional wave equation (69) is unusual among wave equations in that,
on the infinite domain −∞ < x < ∞, it can be solved in closed form. Its general solution, known
as the d’Alembert solution, is

u(x, t) = F (x − ct) + E(x + ct). (74)

This can be easily verified by using the chain rule.


D’Alembert’s solution does not help us for more general hyperbolic problems, but it provides
a nice explicit illustration of all points of Remark 1.11: F (x − ct) represents a wave of shape F (x)
(at time t = 0) moving in the direction of increasing x at constant speed c. That is, F (x − ct)
represents the same shape as F (x), but with the origin moved to x = ct, i.e. the same shape
as F (x) but translated a distance ct in the positive x direction. (If you are confused about the
signs, imagine the example where F has its maximum when its argument is zero, say.) Conversely,
E(x + ct) represents a wave of shape E(x) going in the negative x direction at constant speed c.
It is also obvious that the solution u(x, t) is just as often differentiable as the initial data u(x, 0)
and once more than u,t (x, 0).
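These statements are easy to check numerically. The Python sketch below takes E = 0 and a Gaussian profile F (an illustrative choice), confirms that the peak of F (x − ct) has moved a distance ct, and checks by finite differences that u,tt − c2 u,xx vanishes to truncation error:

```python
import numpy as np

# Check the d'Alembert solution (74) with E = 0 and a Gaussian profile F.
c = 2.0
F = lambda z: np.exp(-z ** 2)

x = np.linspace(-10.0, 10.0, 2001)
t = 1.5
u0 = F(x)                  # shape at t = 0, peak at x = 0
u1 = F(x - c * t)          # shape at t = 1.5, peak translated by c*t = 3

peak_shift = x[np.argmax(u1)] - x[np.argmax(u0)]

# finite-difference residual of u_tt - c^2 u_xx, using exact translates
h = 1e-3
u_tt = (F(x - c * (t + h)) - 2 * F(x - c * t) + F(x - c * (t - h))) / h ** 2
u_xx = (F(x + h - c * t) - 2 * F(x - c * t) + F(x - h - c * t)) / h ** 2
residual = np.max(np.abs(u_tt - c ** 2 * u_xx))
```

The residual is O(h2), as expected for centred second differences applied to a smooth solution.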
Remark 1.13. The Helmholtz equation is derived from the wave equation by separating the
time variable (assuming a periodic time dependence) as follows. Consider the wave equation with
a time-periodic source term. For simplicity, we assume the source has only a single (angular)
frequency ω, and we use complex notation:

−(1/c2) φ,tt + Δφ = e−iωt f (x). (75)

If we look for a time-periodic solution of this, called a standing wave,

φ(x, t) = e−iωt u(x), (76)

we find that u(x) satisfies (57), where

k = ω/c = 2πν/c (77)

is the wave number. Here c is the wave speed, ν the frequency and ω := 2πν the angular
frequency. The wavelength of a plane wave of frequency ν is

λ = c/ν = 2π/k. (78)

1.6 Cauchy data and the Cauchy-Kowalewski solution
The initial data for the Cauchy problem consist of specifying u at t = 0 for a PDE that contains
only a first time derivative, u and u,t for a PDE that contains up to the second time derivative,
and so on: one fewer time derivative in the initial data than in the PDE. These initial data are
called Cauchy data. The following two remarks are meant to motivate why Cauchy data can be
expected to give rise to well-posed initial-value problems.

Remark 1.14. If we solve the heat equation numerically, we start with data u(x, 0) at t = 0. The
PDE then gives us u,t (x, 0). We use this to obtain an approximation for u(x, h), where h is some
short time interval. In the simplest case, called forward Euler, this is just the zeroth and first
term of a Taylor series,

u(x, h) ≃ u(x, 0) + h u,t (x, 0) = u(x, 0) + hκ Δu(x, 0). (79)

Then consider this as new initial data and repeat the process to get u(x, 2h), then u(x, 3h) and so
on. This is the basic idea behind all numerical methods for solving time-dependent problems.
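The forward Euler idea of Remark 1.14 can be sketched in a few lines of Python for the heat equation with κ = 1 on [0, π] with homogeneous Dirichlet BCs. The initial data sin x decay exactly as e−t sin x, which gives a check on the scheme; the grid size and the time step (which must obey the stability limit h ≤ Δx2/2 for forward Euler, a fact not discussed in the notes but assumed here) are illustrative:

```python
import numpy as np

# Forward Euler time stepping (79) for u_t = u_xx (kappa = 1) on [0, pi]
# with homogeneous Dirichlet BCs and initial data sin(x), whose exact
# evolution is exp(-t) sin(x).
N = 100
x = np.linspace(0.0, np.pi, N + 1)
dx = x[1] - x[0]
h = 4.0e-4                       # time step, below the stability limit dx^2/2
u = np.sin(x)

t = 0.0
while t < 0.5 - 1e-12:
    lap = np.zeros_like(u)
    lap[1:-1] = (u[:-2] - 2.0 * u[1:-1] + u[2:]) / dx ** 2
    u = u + h * lap              # u(x, t+h) ~ u(x, t) + h * u_xx(x, t)
    t += h                       # boundary values stay fixed at zero

err = np.max(np.abs(u - np.exp(-t) * np.sin(x)))
```

Each pass through the loop is one application of (79), with the new u treated as fresh initial data, exactly as described above.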
Remark 1.15. A related theoretical way of looking at Cauchy data is to try and write the solution
for small t > 0 by expanding u(x, t) into its Taylor series about t = 0, that is
u(x, t) = u(x, 0) + u,t (x, 0) t + (1/2!) u,tt (x, 0) t2 + O(t3 ). (80)
To do this, we need to find all t-derivatives of u at t = 0. In principle, this can be done by replacing
successive time derivatives by spatial derivatives and using the fact that partial derivatives, and in
particular space and time derivatives, commute. In the example of the heat equation, we have

u,t = κ Δu ⇒ u,tt = (κ Δu),t = κ Δu,t = κ Δ(κ Δu) = κ2 Δ2 u, (81)

and so on for higher time derivatives. If we evaluate this at t = 0 and substitute it into (80), we
obtain the Cauchy-Kowalewski solution: an infinite series in powers of t, with coefficients that
contain arbitrarily many spatial derivatives of the initial data. For the heat equation this is
u(x, t) = u(x, 0) + κ Δu(x, 0) t + (1/2!) κ2 Δ2 u(x, 0) t2 + O(t3 ). (82)
However, the Cauchy-Kowalewski solution is just a formal solution. In the first place, it only
exists if u is infinitely often differentiable in t and x. (The technical term for infinitely often
differentiable is smooth.) More importantly, in almost any interesting situation, the infinite series
does not converge for any finite t > 0, and so this solution does not make sense. We have introduced
it here to show why a k-th order evolution equation requires the time derivatives of u of order 0
through k − 1 as Cauchy data.
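For the special initial data u(x, 0) = sin x (with κ = 1) we have Δm sin x = (−1)m sin x, so the Cauchy-Kowalewski series (82) becomes Σ_m (−t)m/m! sin x, and in this exceptional analytic case it does converge, to e−t sin x. A Python sketch of the partial sums (the sample point and number of terms are illustrative); remember that such convergence is the exception, not the rule:

```python
import numpy as np

# Partial sums of the Cauchy-Kowalewski series (82) for the heat equation
# (kappa = 1) with u(x,0) = sin(x): the m-th term is (-t)^m / m! * sin(x).
t = 1.0
x = 0.7                                   # an arbitrary sample point

partial, term = 0.0, np.sin(x)            # term for m = 0 is sin(x)
sums = []
for m in range(30):
    partial += term
    sums.append(partial)
    term *= -t / (m + 1)                  # build (-t)^(m+1)/(m+1)! recursively

exact = np.exp(-t) * np.sin(x)
err = abs(sums[-1] - exact)
```

Here the factorials tame the powers of t; for generic (non-analytic) initial data the coefficients κm Δm u(x, 0) grow too fast for this to happen.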

1.7 Weak solutions


The d’Alembert solution of the wave equation gives us a glimpse of a general fact: hyperbolic PDEs
have discontinuous solutions. To derive (74) we have assumed that u(x, t) is twice differentiable,
but intuitively the d’Alembert solution makes sense for non-differentiable and even discontinuous
functions F and E. For example, one could consider initial data that correspond to a wave moving
right and then make these initial data more and more square until they become a step function,
which is discontinuous.
Remark 1.16. It is clear that a discontinuity in the solution arises when either F or E is a
discontinuous function. Say F (z) jumps at z0 ; then F (x − ct) jumps at x − ct = z0 . Similarly, a jump
in E(z) at z0 becomes a jump at x + ct = z0 . These lines are the characteristic hypersurfaces
of (69). This is a simple example of a general fact: the solution of a linear hyperbolic equation can
be discontinuous across its characteristic hypersurfaces. (A hypersurface has one dimension fewer
than the space it is in, so a hypersurface in (t, x, y, z) is three-dimensional, but a hypersurface in
(t, x) is a curve.)

In what sense does such a discontinuous solution still obey the wave equation? Consider initially
a solution u(x, t) of the wave equation (69) that is at least twice differentiable (and so really can
be said to obey the wave equation), and an arbitrary function φ(x, t) that is smooth and vanishes
at infinity, together with all its derivatives. (More precisely, φ and all its derivatives must have
finite L2 norm.) Such a function is called a test function. Then u,tt = c2 u,xx for all x and t
implies that

∫_{−∞}^{∞} ∫_{−∞}^{∞} φ (−u,tt + c2 u,xx ) dx dt = 0 (83)

for any such function φ, simply because the round bracket is zero. We can imagine in particular that
φ, while smooth, is sharply peaked about the point (x, t). It then probes if u obeys u,tt = c2 u,xx
at that point.
Integrating the first term by parts twice in t, and the second term twice in x, gives

∫_{−∞}^{∞} ∫_{−∞}^{∞} (−φ,tt + c2 φ,xx ) u dx dt = 0. (84)

The boundary terms in the integration by parts vanish by the assumption that φ and its derivatives
vanish at infinity. But φ was an arbitrary function, and this integral no longer requires u to be
differentiable in order to be defined. This observation motivates
Definition 1.17. u(x, t) is a weak solution of the wave equation (69) if it obeys (84) for all test
functions φ(x, t).
Definition 1.18. A strong solution of (69) is a twice differentiable function u(x, t) for which
(69) holds for every x and t.
Remark 1.19. Any strong solution is also a weak solution, as we have just shown by integrating
by parts and considering all possible φ. The reverse is not true [because we may not be able to
differentiate u, and hence may not be able to go back from (84) to (83).]
Weak solutions are important for hyperbolic (wave equation-like) problems for two reasons.
First, the solution of a linear hyperbolic equation with discontinuous initial data often makes
physical sense. Secondly, the solution of a nonlinear hyperbolic equation typically becomes discon-
tinuous in finite time even if the initial data are everywhere smooth. We will see examples of this
when we discuss conservation laws.

1.8 Exercises
1. Revision problem 1: a) Solve the following PDE problem (Laplace equation in two di-
mensions on a square, with homogeneous Dirichlet boundary conditions, on three sides of
the square, and inhomogeneous Dirichlet boundary conditions on one side):

u,xx + u,yy = 0, 0 ≤ x ≤ π, 0 ≤ y ≤ π, (85)

u(0, y) = 0, 0 ≤ y ≤ π, (86)
u(π, y) = 0, 0 ≤ y ≤ π, (87)
u(x, 0) = 0, 0 ≤ x ≤ π, (88)
u(x, π) = f (x), 0 ≤ x ≤ π. (89)

b) Hence show that the solution for f (x) = 1 is

u(x, y) = Σ_{m=0}^{∞} [4 / ((2m + 1)π sinh[(2m + 1)π])] sin[(2m + 1)x] sinh[(2m + 1)y] (90)

c) Use Maple, Matlab, Mathematica, or some other software, to plot the solution series with
the first 1, 2, 10, 100 terms. Observe that the solution is differentiable in the interior even
though the boundary data jumps (between 0 and 1) at the two corners (x, y) = (0, π) and
(π, π).
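A numerical sketch of part c) in Python (instead of Maple or Matlab). On the boundary y = π the series (90) reduces to the Fourier sine series of f (x) = 1, so the partial sums at x = π/2 should approach 1; the rewritten form of sinh(ny)/sinh(nπ) below is an implementation choice to avoid overflow for large n. The number of terms is illustrative:

```python
import numpy as np

# Partial sums of the series (90).  The ratio sinh(n*y)/sinh(n*pi) is
# computed as exp(n*(y-pi)) * (1 - exp(-2*n*y)) / (1 - exp(-2*n*pi)),
# which is algebraically identical but avoids overflow of sinh.
def u_series(x, y, terms):
    m = np.arange(terms)
    n = 2 * m + 1
    ratio = (np.exp(n * (y - np.pi))
             * (1.0 - np.exp(-2.0 * n * y))
             / (1.0 - np.exp(-2.0 * n * np.pi)))
    return np.sum(4.0 / (n * np.pi) * np.sin(n * x) * ratio)

boundary_val = u_series(np.pi / 2, np.pi, 2000)   # should approach f = 1
interior_val = u_series(np.pi / 2, np.pi / 2, 2000)
```

At y = π the ratio equals 1 and we recover the slowly converging Fourier series of 1; in the interior the exponential factors make the series converge very fast, which is the smoothing of Remark 1.9 at work.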

2. Revision problem 2: a) Solve the following PDE problem (one-dimensional heat equation
on an interval, with homogeneous Dirichlet boundary conditions at both ends of the interval):

u,xx − u,y = 0, 0 ≤ x ≤ π, y ≥ 0, (91)

u(0, y) = 0, y ≥ 0, (92)
u(π, y) = 0, y ≥ 0, (93)
u(x, 0) = f (x), 0 ≤ x ≤ π. (94)

b) Hence show that the solution for f (x) = 1 is

u(x, y) = Σ_{m=0}^{∞} [4 / ((2m + 1)π)] sin[(2m + 1)x] e^{−(2m+1)2 y} (95)

c) Use Maple, Matlab, Mathematica, or some other software, to plot the solution series with
the first 1, 2, 10, 100 terms for 0 ≤ y ≤ π. Observe that the solution is differentiable in
the interior even though there is a jump from 0 to 1 between the initial data and the
boundary data at the corners (x, y) = (0, 0) and (π, 0). (Or put differently, the initial data do
not obey the boundary condition.)
3. Revision problem 3: a) Solve the following PDE problem (one-dimensional wave equation
on an interval, with homogeneous Dirichlet boundary conditions at both ends of the interval):

u,xx − u,yy = 0, 0 ≤ x ≤ π, y ≥ 0, (96)

u(0, y) = 0, y ≥ 0, (97)
u(π, y) = 0, y ≥ 0, (98)
u(x, 0) = f (x), 0 ≤ x ≤ π, (99)
u,y (x, 0) = g(x), 0 ≤ x ≤ π. (100)

b) Hence show that the solution for f (x) = 1, g(x) = 0 is

u(x, y) = Σ_{m=0}^{∞} [4 / ((2m + 1)π)] sin[(2m + 1)x] cos[(2m + 1)y] (101)

c) Use Maple, Matlab, Mathematica, or some other software, to plot the solution series with
the first 1, 2, 10, 100 terms for 0 ≤ y ≤ π. Observe that the solution is discontinuous, and
that these discontinuities propagate with speeds ±1.
4. Homework 1: By applying the chain rule of partial derivatives, transform the 2-dimensional
Laplacian in Cartesian coordinates to polar coordinates. In other words, show that
u,xx + u,yy = u,rr + (1/r) u,r + (1/r2) u,θθ . (102)

[Hint: solve for r(x, y) and θ(x, y) first.]
5. Homework 2: Find the solution of the wave equation u,tt = c2 u,xx on the line −∞ < x < ∞
in terms of the initial data u(x, 0) = f (x), u,t (x, 0) = g(x). [Hint: use the d’Alembert
solution. Your answer will contain f (x) and an integral over g(x).]
6. Homework 3: a) Write down the Cauchy-Kowalewski solution for the one-dimensional wave
equation u,tt = c2 u,xx . Use this to find the solution with the Cauchy data u(x, 0) = sin x,
u,t (x, 0) = 0 as an infinite series. Show that the series sums to u(x, t) = cos(ct) sin(x). b)
Then write this also as a d’Alembert solution.
7. Write out

∇ · (u∇u) = ∇u · ∇u + u Δu (103)

in Cartesian coordinates (x, y, z).

8. Consider the answer (101) to a previous wave equation problem - it does not matter now
how we found that. Write this as

u(x, y) = E(x + y) + F (x − y) (104)

for two functions E(z) and F (z). Give these functions both as Fourier series, and explicitly.
[Hint: For the first part, use a trig identity for sin a cos b. For the second part, note first
from the Fourier series that E and F are periodic functions. Then use part of the answer
to Revision Problem 3 to determine what the value of these functions is over the interval
0 ≤ z ≤ π. Hence work out what they are for all z.]
9. Using separation of variables, find the solution of

−u,tt + u,xx = 0, a ≤ x ≤ b, t ≥ 0, (105)

αL u(a, t) − βL u,x (a, t) = 0, t ≥ 0, (106)
αR u(b, t) + βR u,x (b, t) = 0, t ≥ 0, (107)
u(x, 0) = f (x), a ≤ x ≤ b, (108)
u,t (x, 0) = g(x), a ≤ x ≤ b (109)

(wave equation on an interval with Robin BCs). (I have written −βL because u,n = −u,x at
the left boundary, but that is just a convention.)
Hint: Your main challenge in this problem is to find the basis functions Xn (x). Sturm-
Liouville theory then tells you that the Xn (x) must obey the orthogonality conditions

∫_a^b Xn (x) X̄m (x) dx = 0 for n ≠ m, and = Nn for n = m, (110)

where X̄ denotes the complex conjugate (if you have chosen to use complex notation for the
Xn ) for some coefficients Nn > 0. In your answer you can use Nn without finding them
explicitly – we leave that for the next problem.
10. Show explicitly by integration that (110) holds for the basis functions Xn (x) from your
solution of the previous Problem and find the coefficients Nn .
Hint: Use integration by parts twice, and the boundary condition.
11. Using separation of variables, find the solution of

−u,tt + u,xx = 0, −∞ < x < ∞, t ≥ 0, (111)

u(x, t) → 0, |x| → ∞, t ≥ 0, (112)
u(x, 0) = f (x), −∞ < x < ∞, (113)
u,t (x, 0) = g(x), −∞ < x < ∞, (114)

(1-dimensional wave equation with free space boundary conditions).

Hint: Here of course you have a Fourier transform instead of a Fourier series, and this problem
is much simpler than the same problem on a ≤ x ≤ b. It is included here for comparison,
and to reassure you that Fourier transforms are not intrinsically difficult.

2 Well-posedness
2.1 Main definition
Recall the definition of a PDE problem in Sec. 1.1.3. Some boundary conditions are unsuitable
for certain types of PDE in that they can lead to unphysical behaviour. For example, Cauchy
type conditions are unsuitable for the Laplace equation and Dirichlet conditions are unsuitable
for the wave equation. This leads to the notion of a well-posed problem. Problems which arise in
practical applications are usually well-posed boundary value problems (for PDEs in space only)
or well-posed Cauchy problems (for PDEs in time and space). It is always a PDE problem that is
well-posed (or not), not a PDE on its own.
Definition 2.1. A PDE problem is well-posed if and only if

1. a solution exists (existence);


2. for given data there is only one solution (uniqueness);
3. a small change in the data (boundary data, initial data, source terms) produces only a small
change in the solution (continuous dependence on the data).

The need for existence and uniqueness seems only common sense, but in practice they may be
hard to prove, and typically need to be proved separately. Only for very simple problems can one
prove existence by explicitly deriving the solution. The third condition, continuity, will be less
familiar to you. To give it a precise meaning requires some technical preparation. For motivation,
we look at a few simple but illustrative examples of ill-posed problems first.

2.2 Examples of ill-posed problems


2.2.1 Wave equation with Dirichlet boundary conditions: many solutions
Consider the wave equation

u,tt = u,xx , 0 < x < π, 0 < t < π, (115)

with the homogeneous Dirichlet boundary conditions in space

u(0, t) = 0, u(π, t) = 0, (116)

and the homogeneous initial and final Dirichlet boundary conditions in time,

u(x, 0) = 0, u(x, π) = 0. (117)

Separation of variables leads us to look for solutions of the form u(x, t) = X(x)T (t), and it is easy
to see that there are infinitely many such solutions which obey all boundary conditions, namely

u(x, t) = Σ_{n=1}^{∞} An sin nx sin nt, (118)

where all the An are arbitrary. Thus there are infinitely many solutions to this problem. It is ill-posed.
We have not considered inhomogeneous boundary conditions, or a source term, but we can
immediately see that those problems are also ill-posed. Suppose they have a solution (if not, they
are ill-posed). As this is a linear PDE, we can add any solution of the problem with homogeneous
boundary conditions and zero source term, and get another solution. But, as we have just shown,
there are infinitely many of these. So the problem with a source term and/or inhomogeneous
boundary conditions is also ill-posed.

2.2.2 Wave equation with Dirichlet boundary conditions: no continuous dependence
on the boundary data
Perhaps what was wrong with the previous example was that we imposed final conditions at the
special time t = π. So let us impose them at a general value t = a instead, with a not an integer
multiple of π. We now also allow for inhomogeneous final conditions. Hence consider

u,tt = u,xx , 0 < x < π, 0 < t < a, (119)

with homogeneous Dirichlet boundary conditions in space

u(0, t) = 0, u(π, t) = 0, (120)

as before, and the homogeneous initial and inhomogeneous final Dirichlet boundary conditions in
time,
u(x, 0) = 0, u(x, a) = f (x). (121)
It is easy to see that (118) again obeys the wave equation and all three homogeneous boundary
conditions. Substituting this into the inhomogeneous boundary condition gives

Σ_{n=1}^{∞} An sin nx sin na = f (x), (122)

and we can solve this by determining the constants An sin na as the coefficients of a sine
series:

An = [1 / sin na] (2/π) ∫_0^π sin nx f (x) dx. (123)
Hence we have a unique solution for any given function f (x). For f (x) = 0 this unique solution is
just u(x, t) = 0. So we avoid the problem of the previous example, as we suspected.
But now there is another problem instead. Focus on the factor 1/ sin na. As we have assumed
that a is not an integer multiple of π, sin na is not zero for any n. But how small can it be?
One can prove that it can become arbitrarily small, and hence An can become arbitrarily large,
in the following technical sense. Fix any ε > 0, as small as you like. Then there exists some
positive n (which depends on ε), such that | sin na| < ε. Now consider f (x) = sin nx. The absolute
value of these data is bounded by 1. But the solution is bounded only by 1/ε, which we can
make arbitrarily large. In this sense, the solution does not depend continuously on the data. The
problem is ill-posed.
Note that it was sufficient to show ill-posedness to look at one (suitably chosen) particular
family of data, here f (x) = sin nx.
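This can be seen numerically. Taking a = 1 (an illustrative choice, and not an integer multiple of π), the following Python sketch finds the n ≤ 1000 for which | sin na| is smallest; 1/| sin na| then bounds how large the solution can be for data f (x) = sin nx of size 1:

```python
import numpy as np

# For a = 1, |sin(n*a)| is never zero, but it becomes very small whenever
# n is close to an integer multiple of pi.
a = 1.0
n = np.arange(1, 1001)
vals = np.abs(np.sin(n * a))
n_min = int(n[np.argmin(vals)])
s_min = float(vals.min())
amplification = 1.0 / s_min     # size of the solution for data sin(n_min * x)
```

The minimising n = 355 reflects the good rational approximation π ≈ 355/113: data of size 1 produce a solution of size of order 104, and extending the search range makes the amplification arbitrarily large.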

2.2.3 Cauchy problem for the Laplace equation: no continuous dependence on the
boundary data
We now look at another problem that is ill-posed because its solution does not depend continuously
on the data. Consider the solution of the Laplace equation

u,xx + u,yy = 0, 0 < x < π, y > 0, (124)

with the homogeneous Dirichlet boundary conditions

u(0, y) = 0, u(π, y) = 0, (125)

and the Cauchy data


u(x, 0) = 0, u,y (x, 0) = sin nx, (126)

where n > 0 is an integer. Separation of variables suggests that we look for a solution of the form
u(x, y) = X(x)Y (y). Doing this, and taking into account all boundary conditions, we find the
unique solution
u(x, y) = (1/n) sin nx sinh ny. (127)

Now, as n → ∞, the Cauchy data remain finite and bounded between −1 and 1, but the solution
for y > 0 diverges at almost every x, because for any y > 0,

lim_{n→∞} (sinh ny)/n = ∞. (128)

[We say at almost every x, because obviously u(mπ/n, y) = 0, for any integer m.] Hence once
again, we have given an example of a family of data (with parameter n) for which the solution
does not depend continuously on the data, and so the problem is ill-posed.
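A two-line numerical illustration in Python: at fixed y = 0.1 (an illustrative value), the Cauchy data stay bounded by 1 while the solution amplitude sinh(ny)/n blows up with n:

```python
import numpy as np

# Amplitude of the solution (127) at fixed y = 0.1 for increasing n,
# while the Cauchy data u_y(x, 0) = sin(nx) stay bounded by 1.
y = 0.1
amps = {n: np.sinh(n * y) / n for n in (1, 10, 100, 500)}
```

The growth is exponential in n at any fixed y > 0, so no data-independent bound K can control the solution in terms of the data.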

2.3 Continuous dependence on the data


We now introduce some concepts that allow us to define the intuitive notion in which the solution
did not depend continuously on the data in the previous examples.
Definition 2.2. A real vector space V is a set whose elements (called vectors) can be added
and multiplied by real numbers (called scalars in contrast to vectors). Hence if X, Y ∈ V and
c ∈ R then cX ∈ V and X + Y ∈ V .
Example 2.3. The prototypical vector space that you know already is Rn . However, when using
this example keep in mind that a norm or an inner product is not part of the definition of a vector
space.
Example 2.4. Functions from R to R also form a vector space, with f + g defined by (f + g)(x) :=
f (x) + g(x) and cf defined by (cf )(x) = cf (x). More generally, we could consider the vector space
of functions from Rn to RN , or in other words, vector-valued functions on Rn . Clearly a vector
space of functions is infinite-dimensional in the sense that it is not spanned by a finite number of
basis vectors, but we can still do many of the same things that we can do with finite-dimensional
vector spaces such as Rn .
Definition 2.5. A norm on a real vector space V is a function from V to the non-negative real
numbers with the properties
1. ‖X‖ = 0 if and only if X = 0;
2. ‖cX‖ = |c| ‖X‖ for all X ∈ V and c ∈ R;
3. ‖X + Y ‖ ≤ ‖X‖ + ‖Y ‖ for all X, Y ∈ V (the triangle inequality).
Example 2.6. The Euclidean norm, also called the l2 norm, on Rn is given by

|x| := √(x12 + x22 + · · · + xn2). (129)

I use |x| instead of ‖x‖ for this norm because that is the established notation for this particular
norm. Also, for n = 1, this norm is just the absolute value of the real number x. Convince yourself
that the square root is necessary for property 2 above to hold.
This is the most important norm on finite-dimensional vector spaces, and we mention two more
only to illustrate the concept of norm.
Example 2.7. The maximum norm on Rn is given by

‖x‖max := max(|x1 |, |x2 |, . . . , |xn |), (130)

where of course |x1 | stands for the absolute value.


Example 2.8. The l1 norm on Rn is given by

‖x‖l1 := |x1 | + |x2 | + · · · + |xn |. (131)

Convince yourself that properties 1 and 2 above hold for these norms. (Property 3 is a little
harder to check.)
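The three norms of Examples 2.6-2.8 are easily computed with numpy; the following sketch (the sample vectors are illustrative) also spot-checks properties 2 and 3 of Definition 2.5:

```python
import numpy as np

# The l2 (Euclidean), maximum, and l1 norms on R^n.
def euclidean(x):
    return np.sqrt(np.sum(x ** 2))

def maximum(x):
    return np.max(np.abs(x))

def l1(x):
    return np.sum(np.abs(x))

x = np.array([3.0, -4.0])
y = np.array([1.0, 2.0])

# property 2 (homogeneity) needs the square root in the l2 norm:
hom_ok = np.isclose(euclidean(-2.0 * x), 2.0 * euclidean(x))
# property 3 (the triangle inequality) for all three norms:
tri_ok = all(N(x + y) <= N(x) + N(y) + 1e-12 for N in (euclidean, maximum, l1))
```

For x = (3, −4) the three norms are 5, 4 and 7, illustrating the general inequalities ‖x‖max ≤ |x| ≤ ‖x‖l1.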
We will only introduce one norm on a function space:

Example 2.9. The L2 norm on functions from Rn to RN is defined by

‖f (·)‖L2 := √( ∫Rn |f (x)|2 dn x ), (132)

where |f (x)| denotes the Euclidean norm on RN (over the components of the vector f , at constant
x). The notation f (·) indicates that the function norm depends on the value of f (x) for all x.
Functions from Rn to R with finite L2 norm are said to be in the vector space L2 (Rn ), but there
is no space in this course to give a formal definition.
Definition 2.10. The function f from the vector space V with norm ‖ · ‖V to the vector space
W with norm ‖ · ‖W is Lipschitz continuous with Lipschitz constant K > 0 if

‖f (X1 ) − f (X2 )‖W ≤ K ‖X1 − X2 ‖V (133)

for all pairs of vectors X1 , X2 ∈ V .
Remark 2.11. You will get the right intuitive idea if you simply take both V and W to be R
with the usual absolute value of real numbers |x| as the norm: the change in f cannot be bigger
than K times the change in x.
We are now ready to formally define continuous dependence on the initial data. For
simplicity, we consider an evolution problem that is first order in time and has homogeneous BCs,
so that u(x, 0) are the only data. Consider two solutions u1 (x, t) and u2 (x, t) with initial data
u1 (x, 0) and u2 (x, 0), respectively.
Definition 2.12. The solution u(x, t) of a first-order in time evolution problem depends contin-
uously on the initial data u(x, 0) in the function norm ‖ · ‖ if and only if there exists a function
K(t) > 0 such that

‖u1 (·, t) − u2 (·, t)‖ ≤ K(t) ‖u1 (·, 0) − u2 (·, 0)‖ (134)

for all pairs of initial data u1 (x, 0), u2 (x, 0), and with K(t) independent of these data.
Note that continuity, and hence well-posedness, is defined only in some particular function
norm. An inequality such as (134) in PDE theory is called an estimate. The notion of continuity
we have used here is Lipschitz continuity, where the Lipschitz constant K(t) is allowed to depend on
t. The choice of function norm is an art, and depends on the problem in hand. (You will not be
asked to find estimates in this course, just to check and use them.)
Remark 2.13. I have kept the definition as simple as possible by restricting it to a first-order in
time evolution problem, and by using the same norm for the solution as for the initial data. In
practice, it may be necessary to use different function norms on the initial data and the solution in
order to prove an estimate at all, or in order to get the most useful one. For second-order in time
evolution problems, we need a norm on u and u,t for the initial data, and if there are boundary
data, we need a norm for those as well, and the solution must be bounded by the sum of all these
norms. For a system of PDEs, the norm combines a norm over state space u with a function norm
over position x, as in the example (132).
Proposition 2.14. Continuous dependence on all the data implies uniqueness.
Proof. We again restrict to a problem with only initial data for simplicity. If there are two solutions
u1 and u2 that have the same data, then the right-hand side of (134) is zero. As the left-hand
side cannot be negative, it must also be zero, so u1 − u2 has zero norm for all t ≥ 0. But the only
function that has zero norm is the zero function, and so u1 (x, t) − u2 (x, t) vanishes for all t. Hence
u1 = u2 for all t.
Remark 2.15. If the PDE or PDE system is linear, then the difference u = u1 − u2 of any
two solutions u1 and u2 is itself a solution of the corresponding homogeneous problem (zero
inhomogeneous term) with homogeneous BCs (zero boundary data). Hence we can reformulate
the continuity condition for linear problems as

‖u(·, t)‖ ≤ K(t) ‖u(·, 0)‖ (135)

for all initial data u(·, 0), with K(t) independent of these data.

Remark 2.16. The condition (135) can also be viewed as a form of stability with respect to the
initial data, where by stability we mean that the solution cannot grow arbitrarily rapidly. It is
clear that often solutions will grow in time, so we cannot demand that K(t) is a constant. But it
is essential that it is independent of the initial data u(x, 0), or else the condition would be trivial.
In fact, a typical way in which an evolution problem fails to be well-posed is that initial data that
vary more rapidly in space lead to solutions that grow more quickly in time. We have seen an
example of this in Sec. 2.2.3.
Remark 2.17. In fact, (134) is often too strong for nonlinear problems. Instead one may want the
problem to be well-posed only for solutions u close to a given reference solution u0. The relevant
criterion is then

‖δu(·, t)‖ ≤ K(t) ‖δu(·, 0)‖, (136)

where δu := u − u0 denotes any sufficiently small perturbation about a reference solution u0. One
then says that the problem is well-posed in a neighbourhood of u0.
2.4 Examples of well-posedness results
2.4.1 An energy estimate for the heat equation
Consider now the 1-dimensional heat equation

u,t = u,xx (137)

on all of space. (For simplicity we have set κ = 1.) Define the “energy”

Ẽ(t) := (1/2) ∫_{−∞}^{∞} u² dx, (138)

and again consider initial data where Ẽ(0) is finite. We then have

dẼ/dt = (1/2) ∫_{−∞}^{∞} ∂/∂t (u²) dx
      = ∫_{−∞}^{∞} u u,t dx
      = ∫_{−∞}^{∞} u u,xx dx
      = −∫_{−∞}^{∞} (u,x)² dx ≤ 0. (139)
In the first equality we have differentiated under the integral, and in the second we have used
the product rule. We have used (137) in the third equality, and in the fourth equality we have
integrated by parts and used that finite Ẽ implies that u vanishes as x → ±∞. Hence we have
obtained the estimate

‖u(·, t)‖_Ẽ ≤ ‖u(·, 0)‖_Ẽ (140)

in the norm defined by

‖u(·, t)‖_Ẽ := √(Ẽ(t)). (141)

The square root is required here to obtain the property ‖cu‖ = |c| ‖u‖ of a norm. In this example
we have K(t) = 1, and we use the same norm for the initial data and the solution, but neither is
typical.
With this estimate, we have shown continuous dependence of the solution on the initial data
in the energy norm, for (differentiable) solutions. From Prop. 2.14 we also have uniqueness for
(differentiable) solutions.
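The decay of Ẽ(t) can also be observed numerically. The following is a minimal sketch (not part of the notes; the grid, domain size and time step are arbitrary choices): we evolve the heat equation with explicit finite differences on a large interval standing in for the real line, and monitor the discrete version of the energy (138).

```python
# Numerical check of the energy estimate (140) for u_,t = u_,xx.
# Explicit finite differences; u is effectively zero at the ends of the
# (large) interval, which stands in for fall-off at x -> +-infinity.
import numpy as np

N, L = 400, 20.0
x = np.linspace(-L, L, N)
dx = x[1] - x[0]
dt = 0.4 * dx**2                    # stable explicit time step (dt <= dx^2/2)

u = np.exp(-x**2)                   # smooth initial data with finite energy

def energy(u):
    return 0.5 * np.sum(u**2) * dx  # discrete version of (138)

E0 = energy(u)
energies = [E0]
for _ in range(200):
    u[1:-1] += dt * (u[2:] - 2*u[1:-1] + u[:-2]) / dx**2   # u_,t = u_,xx
    energies.append(energy(u))

monotone = all(b <= a + 1e-12 for a, b in zip(energies, energies[1:]))
print(monotone, energies[-1] < E0)
```

With a stable time step every discrete Fourier mode is damped, so the discrete energy is non-increasing, mirroring the estimate (140).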
2.4.2 An energy estimate for the wave equation
Consider the 1-dimensional wave equation

u,tt = u,xx (142)

on all of space. (For simplicity we have set c = 1.) Define the energy

E(t) := (1/2) ∫_{−∞}^{∞} ((u,t)² + (u,x)²) dx, (143)

and consider initial data where E(0) is defined and is finite. We then have

dE/dt = (1/2) ∫_{−∞}^{∞} ∂/∂t ((u,t)² + (u,x)²) dx
      = ∫_{−∞}^{∞} (u,t u,tt + u,x u,xt) dx
      = ∫_{−∞}^{∞} (u,t u,xx + u,x u,xt) dx
      = ∫_{−∞}^{∞} (u,t u,x),x dx = 0. (144)
In the first equality we have differentiated under the integral, and in the second we have used the
product rule. We have used (142) in the third equality, and in the fourth equality we have used
the fact that finite E implies that u,t and u,x must vanish as x → ±∞. This gives us the estimate

‖u(·, t)‖_E = ‖u(·, 0)‖_E (145)

in the energy norm defined by

‖u(·, t)‖_E := √(E(t)). (146)

As for the heat equation, we have K(t) = 1, and we use the same norm for the initial data and the
solution. The inequality (135) is actually a strict equality here. Finally, we note that E is really
the physical energy (potential energy plus kinetic energy) carried by the wave. This has lent the
name “energy norm” to this type of function norm.
As for the heat equation, we have shown continuous dependence on the initial data, and hence
uniqueness, for strong (differentiable) solutions.
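Conservation of E(t) can also be seen numerically. A sketch (not part of the notes; scheme and parameters are arbitrary choices): the leapfrog finite-difference scheme for the wave equation conserves a discrete analogue of the energy (143) up to discretisation error.

```python
# Numerical check of (144)-(145): for u_,tt = u_,xx the energy (143) is
# constant in time. Leapfrog finite differences on a large interval; the
# discrete energy is conserved up to O(dx^2, dt) error.
import numpy as np

N, L = 400, 20.0
x = np.linspace(-L, L, N)
dx = x[1] - x[0]
dt = 0.2 * dx                        # CFL-stable time step for wave speed 1

u_old = np.exp(-x**2)                # u(x, 0), with u_,t(x, 0) = 0
u = u_old.copy()                     # first step by Taylor expansion in t
u[1:-1] += 0.5 * dt**2 * (u_old[2:] - 2*u_old[1:-1] + u_old[:-2]) / dx**2

def energy(u_new, u_prev):
    ut = (u_new - u_prev) / dt                  # discrete u_,t
    ux = (u_new[1:] - u_new[:-1]) / dx          # discrete u_,x
    return 0.5 * (np.sum(ut**2) + np.sum(ux**2)) * dx

E0 = energy(u, u_old)
for _ in range(100):                 # leapfrog update for u_,tt = u_,xx
    u_new = np.zeros_like(u)
    u_new[1:-1] = (2*u[1:-1] - u_old[1:-1]
                   + dt**2 * (u[2:] - 2*u[1:-1] + u[:-2]) / dx**2)
    u_old, u = u, u_new
E1 = energy(u, u_old)
print(abs(E1 - E0) / E0)             # small relative drift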

2.4.3 Uniqueness of solutions of the boundary value problem for the Poisson equation
Theorem 2.18. The solution of the Poisson equation (59) with either Dirichlet boundary con-
ditions, or Robin boundary conditions (60) with αβ > 0, is unique. With Neumann boundary
conditions, the solution is unique up to addition of a constant.
Proof. Suppose we have two solutions u1 and u2 that have the same boundary conditions and
source term but differ in V. We define their difference

u := u1 − u2. (147)

Convince yourself that if u1 and u2 are solutions of the Poisson equation with some inhomogeneous
boundary condition, u satisfies Laplace’s equation with the homogeneous boundary condition of
the same type (say Dirichlet) as in the original problem. Hence we have ∆u = 0, and therefore

∫_V u ∆u dV = 0. (148)

Now ∇ · (u∇u) = u ∆u + ∇u · ∇u, so that

∫_V [∇ · (u∇u) − |∇u|²] dV = 0 (149)
for any volume V. The divergence theorem gives us

∫_V ∇ · (u∇u) dV = ∫_S u u,n dS, (150)

and hence

∫_V |∇u|² dV = ∫_S u u,n dS (151)

for any volume V with surface S. Now if our original boundary conditions were inhomogeneous
Dirichlet boundary conditions then u = 0 on the boundary, and if they were Neumann, then
u,n = 0 on the boundary. Either way, the right-hand side of (151) vanishes. Finally, if αβ > 0,
then u u,n = −(α/β)u² ≤ 0. But then the left-hand side of (151) is non-negative and the right-hand
side is non-positive, so both must be zero. Hence ∇u = 0, and u must be constant. But for either
Dirichlet or Robin boundary conditions, this constant must be zero. Hence u1 = u2, and the
solution is unique. For Neumann boundary conditions, u = u1 − u2 can be constant, and so the
solution of the Neumann problem is unique only up to addition of a constant. □
Remark 2.19. Here we have assumed that the solutions are twice differentiable, so we have proved
uniqueness only in the space of strong solutions.
2.5 The importance of well-posedness
Why is well-posedness important?
If the solution exists and is unique, but arbitrarily small changes in the data result in large
changes in the solution, then the solution is likely to be useless for engineering and science,
because in practice all data come from measurements, which have small errors in them.
Similarly, any numerical solution of the PDE is only an approximation and so will have small errors
in it. Consider some initial data at t0 and evolve them to t1. The data at t1 then have some
numerical error in them, and so we have a small random change of initial data when we evolve
further from t1 to t2. If this gives rise to a very large change in the solution later on, the numerical
solution becomes nonsense.
There is a large mathematical literature on the well-posedness or ill-posedness of PDEs. You
need to understand the basic concept of well-posedness so that you know when to worry about
possible ill-posedness and consult this literature. There are several reasons why an ill-posed PDE
problem may not be recognised as such:
• The mathematical problems that you encounter in this course are both standard problems
and well-posed, but in modelling an engineering system, a system of ODEs and PDEs may
well arise that has never been investigated mathematically before and that may or may not
be well-posed.
• Many PDEs or systems of PDEs that one can write down are neither elliptic, hyperbolic nor
parabolic, and then the question of well-posedness becomes tricky.
• A physical or engineering problem may be “physically well posed” or may “clearly have a
unique solution”, but the mathematical problem one solves may be ill-posed because it does
not completely reflect the physical situation one had in mind. As a (further) example, the
solutions of the Navier-Stokes equations in the limit of small viscosity are completely different,
mathematically and in fact (they have boundary layers), from solutions of the Euler equations
(where the viscosity is set to zero), but the only difference is a tiny amount of dissipation
that one might naively consider irrelevant.
Trying to solve an ill-posed problem numerically, there is a risk that you get an answer
that looks reasonable but is actually nonsense. However, there is one important clue: a
numerical solution of a well-posed problem will converge with increasing resolution (finer
numerical mesh). A numerical solution of an ill-posed problem will actually get worse, by
the mechanism of Sec. 2.2.3: a finer mesh allows sin nx with larger n to be resolved.
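This mechanism can be made quantitative. A sketch with illustrative numbers (not from the notes): for the backwards heat equation the mode sin nx is amplified by exp(n²t), and the largest n a mesh can represent grows with the number of grid points, so refining the mesh makes the worst-case amplification explode.

```python
# Why refining the mesh makes an ill-posed problem worse: for the backwards
# heat equation the mode sin(nx) grows like exp(n^2 t), and a mesh with N
# points per period 2*pi can represent wave numbers up to about N/2.
import numpy as np

t = 0.1
amplifications = {}
for N in [8, 16, 32, 64]:              # grid points per period
    n_max = N // 2                     # largest representable wave number
    amplifications[N] = np.exp(n_max**2 * t)
    print(N, amplifications[N])

growth_coarse = amplifications[8]
growth_fine = amplifications[64]
```

The amplification of the finest resolvable mode grows by tens of orders of magnitude as the mesh is refined, so numerical noise at the grid scale overwhelms the solution.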
2.6 Exercises
12. Homework 4: (A reworking of a similar example in Lecture 4, and practice with Fourier
transforms) Find the solution of the Cauchy problem (sic!) for the Laplace equation on the
half-plane

u,xx + u,yy = 0, −∞ < x < ∞, y ≥ 0, (152)
u(x, 0) = f(x), u,y(x, 0) = g(x), −∞ < x < ∞, (153)
u(x, y) → 0 as x → ±∞. (154)

Show that it is ill-posed (because the solution can grow arbitrarily rapidly compared to the
initial data).
13. Homework 5: (Tests your understanding of the properties of a norm, but not much cal-
culation required) Show that the Cauchy problem for the backwards in time heat equation
u,t = −κu,xx with κ > 0 is ill-posed, because the solution cannot depend continuously on
the initial data in any possible function norm over x. [For definiteness, you may assume the
Dirichlet boundary conditions u(0, t) = u(2π, t) = 0, but the answer will not depend on the
boundary conditions.]
14. Homework 6: (A variation on the example in Lecture 6) Derive the values of the constants
α_L, β_L, α_R and β_R for which the 1-dimensional heat equation on the interval [a, b] with
Robin boundary conditions

u,t = u,xx, a ≤ x ≤ b, t ≥ 0, (155)
α_L u(a, t) + β_L u,x(a, t) = 0, t ≥ 0, (156)
α_R u(b, t) + β_R u,x(b, t) = 0, t ≥ 0, (157)

is well-posed in the energy norm

‖u(·, t)‖_Ẽ := √(Ẽ(t)), where Ẽ(t) := (1/2) ∫_a^b u² dx. (158)

15. In a previous problem we showed that the Cauchy problem for the backwards in time heat
equation u,t = −κu,xx with κ > 0 is ill-posed, because the solution does not depend continu-
ously on the initial data. In the model answer we neglected boundary conditions. Extend that
answer assuming that the domain is 0 ≤ x ≤ π, t ≥ 0, with initial data and inhomogeneous
Dirichlet boundary data

u(x, 0) = f(x), u(0, t) = g(t), u(π, t) = h(t). (159)

[Hint: write u = u1 + u2, where

u1(x, 0) = f(x), u1(0, t) = 0, u1(π, t) = 0 (160)

and

u2(x, 0) = 0, u2(0, t) = g(t), u2(π, t) = h(t), (161)

so that u1 takes care of the initial data and u2 of any inhomogeneous boundary conditions.
You can use the triangle inequality in the form

‖X + Y‖ ≥ | ‖X‖ − ‖Y‖ | (162)

(the right-hand side denotes the absolute value of the difference of the two norms).]
16. a) How would you define the l^p norm on the vector space R^n, where p is any positive integer,
such that l^1 and l^2 give the special cases we have already seen?
b) [Much harder, but might be a coursework problem for first-year analysis for mathemati-
cians] With the answer to the previous question, can you show that the limit as p → ∞ of
the l^p norm is the maximum norm? (It is in fact also called the l^∞ norm.)

17. Homework 7: State and prove an energy estimate for the wave equation on R²,

u,tt = u,xx + u,yy, −∞ < x < ∞, −∞ < y < ∞, (163)

where u and its first derivatives vanish as x → ±∞ or y → ±∞.

18. Show that the energy norm for the wave equation can be written as

‖u(·, t)‖_E = (1/√2) ‖U(·, t)‖_{L²}, (164)

where

U(x, t) := (u,t(x, t), u,x(x, t)). (165)

Hence conclude that ‖·‖_E obeys the three conditions for a norm, as long as we exclude
constant solutions.
3 Classification of PDEs from their symbol
3.1 Introduction
In Sec. 1.5 we highlighted some typical features of what we called, provisionally, elliptic, parabolic
and hyperbolic PDEs. In Sec. 2.2 we saw examples of boundary conditions that make these PDEs
well-posed or ill-posed. Both are closely related to the coefficients of the highest derivatives in the
PDE.
In this Section we give formal definitions of elliptic PDEs and PDE systems, and of hyperbolic
PDEs and PDE systems. These definitions are quite general and abstract, and we will try to
motivate them from simple examples, and then check that simple examples fall into these categories.
Initially, we look at linear PDEs with constant coefficients. It turns out that this captures the
essence of what we are trying to do, and it allows us to use two essential technical tools, the Fourier
transform (for functions of several variables) and, based on the Fourier transform, the concept of
the symbol of a PDE.
We then look at an example of an elliptic PDE, review why it has a well-posed boundary value
problem but not a well-posed Cauchy problem, and from there make a definition of ellipticity for
a linear scalar PDE with constant coefficients.
Next, we make two generalisations: to linear PDEs with coefficients that depend on x, and to
systems of linear PDEs.
We then move on to hyperbolic PDEs. Again we start with an example to motivate the general
definition that follows.
Then we look at two important classes of scalar PDEs (i.e. not PDE systems), namely first-
order and second-order linear PDEs. We will see that first-order scalar PDEs are all hyperbolic,
and we will see that determining if second-order scalar PDEs are elliptic or hyperbolic can be done
with an elegant shortcut. For second-order scalar PDEs (only) we also formally define parabolicity.
Finally, we generalise to nonlinear PDEs (or systems). We do not consider the most general
type of nonlinear PDE, but only the most common type, called quasilinear PDEs. The conservation
laws we encounter in the next section will be quasilinear, for example.
We do not state or prove any well-posedness theorems for the classes of PDEs that we define,
as even stating them rigorously goes beyond this course.

3.2 The symbol of a linear PDE
3.2.1 The Fourier transform in Rn
For any (vector-valued) function u(x) defined on all of Rⁿ, we define its Fourier transform

û(k) := (2π)^{−n/2} ∫_{Rⁿ} u(x) e^{−ik·x} dⁿx, (166)

where

k · x := Σ_{i=1}^{n} k_i x_i (167)

is the standard inner product on Rⁿ. k is called the wave number (even though for n > 1 it is
not a number but a vector). If the Fourier transform exists, one can show that it has the unique
inverse

u(x) = (2π)^{−n/2} ∫_{Rⁿ} û(k) e^{ik·x} dⁿk. (168)

In the case n = 1, the formulas (166) and (168) reduce to the formulas (52) and (51) above.
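As a sanity check on the conventions of (166) and (168), here is a sketch (not from the notes) in the case n = 1: with the symmetric factor (2π)^{−1/2}, the Gaussian e^{−x²/2} is its own Fourier transform, which we can confirm by direct quadrature.

```python
# Check of the transform (166) in n = 1: with the symmetric (2*pi)^(-1/2)
# convention, the Fourier transform of exp(-x^2/2) is exp(-k^2/2).
# We evaluate the integral by a simple quadrature sum on a large grid.
import numpy as np

x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
u = np.exp(-x**2 / 2)

def fourier(k):
    return (2*np.pi)**-0.5 * np.sum(u * np.exp(-1j * k * x)) * dx

ks = np.array([0.0, 0.5, 1.0, 2.0])
uhat = np.array([fourier(k) for k in ks])
expected = np.exp(-ks**2 / 2)      # the Gaussian is its own transform
err = np.max(np.abs(uhat - expected))
print(err)                         # quadrature error only
```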

3.2.2 Definition of the symbol
Definition 3.1. We can write any (scalar) linear PDE in the abstract form

L(x, ∇)u = f, (169)

where L is a linear derivative operator acting to the right on u(x). More generally, we can
write any system of N linear PDEs for N unknowns u_α in the form

L(x, ∇)u = f, (170)

where L is now an N × N matrix of linear derivative operators, or equivalently a matrix-valued
derivative operator. We denote by L_p the principal part of L, comprising only the highest
derivatives.

Definition 3.2. The symbol of the linear differential operator L(x, ∇) is the algebraic expression
L(x, ik). (For a system of PDEs, the symbol is matrix-valued.) The principal symbol L_p(x, ik)
is just the symbol of the principal part.
Example 3.3. The second-order linear PDE

Σ_ij a_ij(x) u,ij + Σ_i b_i(x) u,i + c(x)u = f (171)

can be written in operator form as Lu = f, where

L(x, ∇) = Σ_ij a_ij(x) ∂²/(∂x^i ∂x^j) + Σ_i b_i(x) ∂/∂x^i + c(x). (172)

(To avoid ambiguity about what the partial derivatives act on, we write all coefficients to the left
of all partial derivatives, so that the partial derivatives act only on u.) Its symbol is therefore

L(x, ik) = −Σ_ij a_ij(x) k_i k_j + i Σ_i b_i(x) k_i + c(x), (173)

where k_i are the components of the vector k. This is simply a quadratic expression in k. (Note
that the inhomogeneous term f does not appear in the symbol.) The principal symbol is

L_p(x, ik) = −Σ_ij a_ij(x) k_i k_j, (174)

that is, a homogeneous quadratic expression in k.
3.2.3 The symbol and plane wave solutions
One useful property of the Fourier transform is that taking a partial derivative in real space x
corresponds to multiplication in Fourier space k. Namely

∂/∂x^i u(x) = ∂/∂x^i [ (2π)^{−n/2} ∫_{Rⁿ} û(k) e^{ik·x} dⁿk ]
            = (2π)^{−n/2} ∫_{Rⁿ} û(k) ∂/∂x^i e^{ik·x} dⁿk
            = (2π)^{−n/2} ∫_{Rⁿ} û(k) (ik_i) e^{ik·x} dⁿk. (175)

So we have shown (if the Fourier transform of u exists) that

(u,i)^(k) = (ik_i) û(k). (176)

We can use this again to get the formula for a second derivative,

(u,ij)^(k) = (ik_i)(ik_j) û(k) = −k_i k_j û(k), (177)

and so on.
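The rule (176) is also the basis of spectral differentiation in numerical work. A sketch using numpy's discrete FFT on a periodic grid (not from the notes; the discrete conventions differ from (166) by constant factors, which cancel in the derivative rule):

```python
# Differentiation becomes multiplication by i*k under the Fourier transform:
# compute u' spectrally via (176) and compare with the exact derivative.
import numpy as np

N = 128
x = np.linspace(0, 2*np.pi, N, endpoint=False)
u = np.exp(np.sin(x))                        # a smooth periodic test function

k = np.fft.fftfreq(N, d=1.0/N)               # integer wave numbers on this grid
du_spectral = np.fft.ifft(1j * k * np.fft.fft(u)).real   # u' via (176)
du_exact = np.cos(x) * np.exp(np.sin(x))

err = np.max(np.abs(du_spectral - du_exact))
print(err)                                   # spectrally small
```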

Now assume that the coefficients of the PDE (system) are constant (independent of x). Hence
we can write L(x, ∇) = L(∇). Then we can pull the coefficients of each derivative into the integral
together with the derivative, and so we have

L(∇)u(x) = L(∇) (2π)^{−n/2} ∫_{Rⁿ} û(k) e^{ik·x} dⁿk
         = (2π)^{−n/2} ∫_{Rⁿ} û(k) L(∇) e^{ik·x} dⁿk
         = (2π)^{−n/2} ∫_{Rⁿ} û(k) L(ik) e^{ik·x} dⁿk. (178)

We have shown the following:

Lemma 3.4. If the Fourier transform of L(∇)u exists, then it is given by

(L(∇)u)^(k) = L(ik) û(k). (179)

In words, one obtains the Fourier transform of Lu by multiplying the Fourier transform of u by
the symbol of L.
Remark 3.5. Conversely, assume we have a homogeneous linear scalar PDE with constant coef-
ficients L(∇)u = 0. Then we can try to find a plane wave solution of the form

u(x) = e^{ik·x}. (180)

Clearly

L(∇)u(x) = L(∇)e^{ik·x} = L(ik)e^{ik·x} = 0 ⇔ L(ik) = 0. (181)

So for each solution of the algebraic equation L(ik) = 0 we have a plane wave solution of the PDE
L(∇)u = 0.

Remark 3.6. Note that if there is no real wave number vector k such that det L(ik) = 0, there
can be no purely oscillating plane wave solutions.

We can then construct the general solution by adding together (integrating over) different plane
wave solutions, with arbitrary coefficients. Basically, this integral is an inverse Fourier transform.
We can combine plane wave solutions and their complex conjugates to obtain real solutions, along
the lines of exp(ikx) + exp(−ikx) = 2 cos kx. In this context, the complex plane wave solutions or
their real counterparts are also called elementary solutions, as they are the building blocks of
the general solution.
If we apply separation of variables to a scalar PDE with constant coefficients, then once again
L(ik) = 0 appears: those components of k that can be chosen freely are the separation constants.
Example 3.7. Take the linear scalar PDE with constant coefficients in three variables x := (x, y, z)

a u,xx + b u,yy + c u,zz = 0, (182)

and look for solutions of the form

u(x) = e^{ik·x} = e^{iξx + iηy + iζz}, (183)

where k := (ξ, η, ζ). This is a solution of (182) if and only if

a ξ² + b η² + c ζ² = 0. (184)

Now look at this from the point of view of separation of variables: we try

u(x, y, z) = X(x)Y(y)Z(z). (185)

Separating off x first (the order does not matter in this example) gives us

a X″/X = −A,  b Y″/Y + c Z″/Z = A. (186)

Then we separate y, getting

a X″/X = −A,  b Y″/Y = −B,  c Z″/Z = A + B. (187)

If we now rewrite A and B as A = aξ², B = bη², and introduce the shorthand ζ² = −(A + B)/c,
we have

X″ = −ξ²X,  Y″ = −η²Y,  Z″ = −ζ²Z. (188)

By solving these ODEs we again get (183), and from our definitions of ξ, η and ζ we get (184).
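The condition (184) is easy to check in numbers. A sketch (not from the notes, with arbitrarily chosen coefficients a = b = 1, c = −2): applying the operator in (182) to the plane wave (183) multiplies it by −(aξ² + bη² + cζ²), which vanishes exactly when (184) holds.

```python
# Dispersion relation (184) in numbers: applying a d2/dx2 + b d2/dy2 + c d2/dz2
# to exp(i(xi x + eta y + zeta z)) multiplies it by -(a xi^2 + b eta^2 + c zeta^2).
a, b, c = 1.0, 1.0, -2.0

def pde_residual_coeff(xi, eta, zeta):
    # the factor the plane wave picks up under the operator of (182)
    return -(a*xi**2 + b*eta**2 + c*zeta**2)

res_good = pde_residual_coeff(1.0, 1.0, 1.0)   # satisfies (184): a solution
res_bad = pde_residual_coeff(1.0, 1.0, 2.0)    # violates (184): not a solution
print(res_good, res_bad)                       # 0.0 and 6.0
```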
Remark 3.8. Consider now a linear PDE system, still with constant coefficients, and look for
plane wave solutions of the form

u(x) = e^{ik·x} r, (189)

where r is a constant vector in state space R^N. (But x and k are vectors in physical space Rⁿ!)
Then

L(∇)u = 0 ⇔ L(ik)r = 0 ⇒ det L(ik) = 0. (190)

The middle equality states that r is an eigenvector of the N × N matrix L(ik) with eigenvalue 0,
and the last equality is a necessary condition for this.

3.3 Ellipticity
3.3.1 A model elliptic PDE
Consider the Laplace equation in two dimensions,

u,xx + u,yy = 0, (191)

without worrying about the domain or boundary conditions at this point. Using separation of
variables, or looking for plane wave solutions, we find, for example, the particular elementary
solution
u(x, y) = sin kx sinh ky, (192)
for any real constant k. This solution remains finite and oscillates in the x-direction, with periodic
zeros (at kx = nπ), and it increases in the y-direction, with only one zero (at y = 0). More
generally, any elementary solution, any linear combination of elementary solutions, and in fact any
solution of (191), may vanish at two values of x and at one value of y, but can never vanish at two
values of x and two values of y.
So there is no solution of (191) that vanishes on all four sides of a rectangle in the xy-plane.
This implies uniqueness of the Dirichlet problem on a rectangle for this PDE: if there were two
di↵erent solutions that obeyed the same Dirichlet boundary conditions, their di↵erence would be
a non-zero solution that vanished everywhere on the boundary. Uniqueness is actually true for the
Laplace equation on a domain V of arbitrary shape, and with other types of boundary condition,
as we have already shown in Sec. 2.4.3.
Now consider the Cauchy problem for the same PDE, considering y as the time variable.
Because the PDE is second order in y, we specify both u and u,y on y = 0. For example, (192)
is a solution with u(x, 0) = 0 and u,y(x, 0) = k sin kx. But now we run into the problem we have
encountered already in Sec. 2.2.3: there is a unique solution for these Cauchy data, but it grows
without bound as k → ∞ because sinh ky → ∞ for any y > 0. Hence there is no continuous
dependence on the initial data.
Remark 3.9. It is very important to understand that the problem for the well-posedness of the
Cauchy problem for (191) is not that the solution (192) grows exponentially with increasing y.
The problem is that it grows without bound with k at fixed y. As u can grow as sinh ky, for any
k, the same is true for ‖u(·, y)‖. This means that there can be no norms ‖u‖_data and ‖u‖_solution
such that

‖u(·, y)‖_solution ≤ K(y) ‖u(·, 0)‖_data, (193)

where K(y) does not depend on the data and so does not depend on k in this example.

But how do we know that there is not some norm for which (193) does hold? From the property
of a norm that ‖cu‖ = |c| ‖u‖ for any constant c it follows that any norm over x of sin kx sinh ky
must be proportional to sinh ky. So we would have the problem of unlimited growth in any norm.
[This trick only works because (192) is the product of a function of x and a function of y. But
that is fair enough: we only need one counterexample to prove that something does not hold.]
Remark 3.10. The key property of elliptic PDEs is that every solution grows in at least one
direction. This gives us uniqueness of the pure boundary-value problem problem, but at the same
time destroys the continuous dependence on the data for the Cauchy problem: the well-posed
problem for this PDE is a boundary value-problem, not a Cauchy problem.

3.3.2 Formal definition of ellipticity
Definition 3.11. (Provisional definition) The linear PDE L(∇)u = 0 in n independent variables
with constant coefficients is called elliptic if and only if

L_p(ik) ≠ 0 for all k ∈ Rⁿ with k ≠ 0. (194)

Remark 3.12. From Remark 3.6, the absence of real k that solve L(ik) = 0 means that there are
no plane wave solutions that oscillate in every direction – any plane wave solution must be growing
exponentially in at least one direction (the direction in which k has an imaginary part). But the
general solution is made from a superposition of plane wave solutions, via a Fourier transform or
Fourier series. We saw in Sec. 3.3.1 how this growth in at least one direction gives both uniqueness
of the boundary value problem and ill-posedness of the Cauchy problem.
Going to systems, Remark 3.8 motivates the following definition:

Definition 3.13. (Provisional definition) The linear PDE system with constant coefficients L(∇)u =
0 for N dependent variables in n independent variables is called elliptic if and only if

det L_p(ik) ≠ 0 for all k ∈ Rⁿ with k ≠ 0. (195)

3.3.3 The high-frequency approximation
What about linear PDEs with coefficients that do depend on x? If a PDE fails to be well-posed,
it is often because solutions are badly behaved when they vary very rapidly in space (or space
and time). For example, we saw in Sec. 3.3.1 that continuous dependence of the solution on the
data typically breaks down because the solution becomes arbitrarily large relative to its data if we
consider arbitrarily large wave numbers k.
Denote by ℓ_coeff the typical distance on which the coefficients of the PDE vary. Consider
solutions u(x) which vary on a typical length scale ℓ_soln, with

ℓ_soln ≪ ℓ_coeff. (196)

We shall refer to this as the high-frequency approximation. We can then consider the coeffi-
cients of the PDE as approximately constant. The resulting Fourier transform is then L(ik)û(k).
Furthermore, because the solution is rapidly varying, this is dominated by L_p(ik)û(k), because
when |k| is large, a higher power of it is much larger.
This motivates a new definition, where a PDE is elliptic at a point x if it is elliptic taking into
account only the highest derivatives, and “freezing” the coefficients of the PDE at their values at
x. Hence we have the following two definitions.

Definition 3.14. (Generalises and replaces Def. 3.11.) The linear PDE (now with variable coeffi-
cients) L(x, ∇)u = 0 is called elliptic at the point x if and only if

L_p(x, ik) ≠ 0 for all k ∈ Rⁿ with k ≠ 0.

Definition 3.15. (Generalises and replaces Def. 3.13.) The linear PDE system L(x, ∇)u = 0 is
called elliptic at the point x if and only if

det L_p(x, ik) ≠ 0 for all k ∈ Rⁿ with k ≠ 0. (197)

To save repetition, in the following subsections we will define hyperbolicity and parabolicity
(at a point) immediately for PDEs and PDE systems with variable coefficients.

3.4 A model parabolic PDE
Consider now the diffusion equation in one dimension

u,t = u,xx. (198)

One elementary solution is

u(x, t) = sin kx e^{−k²t}, (199)

for any real constant k. This solution decays with t for all real k. For larger k (more rapidly
changing initial data), it just decays faster. Essentially for this reason, the Cauchy problem is in
fact well-posed for t > 0, as we have seen in Sec. 2.4.1. But the Cauchy problem would be ill-posed
for t < 0 (the backwards heat equation), because then the solution grows with k without bound
at fixed t.
The Dirichlet problem on a rectangle in the xt-plane is also ill-posed, simply because the solution
is already determined by data on, for example, t = 0, x = 0 and x = π: there is no freedom to
specify boundary data at any t > 0.

Remark 3.16. The key property of a parabolic equation determining all this is that every solution
decays in one and the same direction (time).

We will not give a definition of parabolicity for arbitrary PDEs and PDE systems, but we will
give one for second-order scalar PDEs, in Sec. 3.7.2 below.
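The contrast between the forward and backwards problems is visible directly in the mode amplitudes. A sketch with arbitrary sample values (not from the notes): the factor e^{−k²t} of (199) decays faster for larger k, while its backwards-in-time counterpart e^{+k²t} grows faster for larger k.

```python
# Mode amplitudes of the heat equation at time t: each mode sin(kx) is
# multiplied by exp(-k^2 t) forwards in time and by exp(+k^2 t) backwards.
import numpy as np

t = 0.5
ks = np.array([1, 2, 4, 8])
forward = np.exp(-ks**2 * t)       # forward heat equation: decay
backward = np.exp(ks**2 * t)       # backwards heat equation: growth

forward_decreasing = np.all(np.diff(forward) < 0)
backward_increasing = np.all(np.diff(backward) > 0)
print(forward_decreasing, backward_increasing)
```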

3.5 Hyperbolicity
3.5.1 A model hyperbolic PDE
Consider now the wave equation in one space dimension,

u,tt = u,xx. (200)

For any real k, one elementary solution is

u(x, t) = sin kx (A cos kt + B sin kt). (201)

We saw in Sec. 2.2.1 that the Dirichlet problem on the rectangle 0 ≤ x ≤ π, 0 ≤ t ≤ π
is ill-posed because it does not have unique solutions. This is because sin kx sin kt vanishes on
the four sides of this rectangle for k = 1, 2, . . . . Intuitively, this is possible because this solution
oscillates, with repeating zeros, in both the x and t-direction. We saw that if we choose a generic
rectangle, rather than a square, there is a unique solution, but this can then become arbitrarily
large compared to the boundary data. So then the problem is ill-posed because we lack continuous
dependence on the initial data. Either way, the Dirichlet problem is ill-posed.
On the other hand, we get continuous dependence on initial data for the Cauchy problem
because the solution oscillates but does not grow. We showed this more formally in Sec. 2.4.2.
We also saw from the d’Alembert solution that any Cauchy data u(x, 0) = f(x) and u,t(x, 0) =
g(x) give rise to a unique solution. The solution (201), for example, has Cauchy data f(x) =
A sin kx and g(x) = Bk sin kx. We have also seen that we can reassemble the initial data f(x) and
g(x) into a right-moving wave F(x − t) and a left-moving wave G(x + t).

Remark 3.17. The key property of a (strongly or strictly hyperbolic) PDE is that there are enough
linearly independent solutions oscillating in all directions (space and time) to split Cauchy data
into “waves”.

3.5.2 Strict hyperbolicity
Definition 3.18. The linear PDE L(x, ∇)u = 0 in n independent variables is called strictly
hyperbolic in the time direction n at the point x if and only if

L_p(x, in) ≠ 0, (202)

and all roots ω of the polynomial equation

L_p(x, ik + iωn) = 0 (203)

are real and distinct for all k ∈ Rⁿ that are not zero or a multiple of n.
This is a complicated definition, and we look at it piece by piece.
Remark 3.19. By the same argument as before, for a PDE with constant coefficients,

u(x) = e^{(ik + iωn)·x} (204)

is a solution of L_p u = 0 if and only if (203) holds. Moreover, it is an approximate solution of
Lu = 0 in the high-frequency approximation.

Remark 3.20. Often, the time direction n we want to consider is just along one of the coordinate
axes, say x =: (t, y) with n = (1, 0). Without loss of generality, we can then also restrict k
to k =: (0, l). (See Example 3.26 for an explanation why this is possible.) Here l ∈ R^{n−1} and
y ∈ R^{n−1}. In this special case, (204) becomes

u(t, y) = e^{il·y + iωt} = e^{i|l|(l̂·y − vt)}, (205)

where

l · y := Σ_{α=1}^{n−1} l_α y^α,  l̂ := l/|l|,  v := −ω/|l|. (206)

We see that this corresponds to a plane wave travelling with velocity v in the spatial direction
given by the unit vector l̂.
Remark 3.21. Why the condition that all eigenvalues ω be real? As the coefficients of L_p are
real, if ω is a solution of (203), so is its complex conjugate ω*. Hence unless ω is real, either ω
or ω* has positive imaginary part, and so (205) grows exponentially with time for one of them.
Furthermore, if L is a differential operator of order m, L_p(ik) is a homogeneous polynomial of
order m, or L_p(iλk) = λ^m L_p(ik). Hence if the pair (k, ω) is a solution of (203), so is any multiple
(λk, λω). Hence if there are complex ω, Lu = 0 has solutions (205) that grow the more rapidly
the faster they oscillate in space. Hence there cannot be an estimate of the form (134) because no
K(t) would grow fast enough, and so the Cauchy problem cannot be well-posed. (This is precisely
what happened in our example of trying to solve the Cauchy problem for the Laplace equation,
Sec. 2.2.3.)

Remark 3.22. Why the condition that all eigenvalues ω be distinct? Consider a Fourier transform
in y, but not in t, that is û(l, t). The condition that all eigenvalues ω be distinct then guarantees
that we can obtain û(l, t) (the Fourier transform of the solution) from the Fourier transform of
the Cauchy data û(l, 0), û,t(l, 0), û,tt(l, 0), and so on, up to k − 1 time derivatives for a k-th order
in time PDE.
For a system of PDEs, we again look for plane waves of the form

u(x) = e^{(ik + iωn)·x} r, (207)

where r is a constant vector in state space that obeys the matrix equation

L(ik + iωn)r = 0. (208)

A necessary condition for such an r ≠ 0 to exist is that the determinant of this matrix vanishes.
We will not spell out all the details, but arguments similar to the ones we gave in the scalar case
then motivate the following definition.

Definition 3.23. The linear PDE system L(x, ∇)u = 0 in n independent variables is called
strictly hyperbolic in the time direction n at the point x if

det L_p(x, in) ≠ 0, (209)

and all roots ω of the polynomial equation

det L_p(x, ik + iωn) = 0 (210)

are real and distinct for all k ∈ Rⁿ that are not zero or a multiple of n.
There are other definitions of hyperbolicity, notably strong hyperbolicity. We do not mention
them here to keep things simple.

3.6 Examples
Example 3.24. The second-order scalar PDE

u,xx + u,yy = 0 (211)

(the Laplace equation) is elliptic. Proof: Write k := (ξ, η). Then

L(ik) = −(ξ² + η²) = −|k|². (212)

But |k|² is positive definite in k, so the only real solution of |k|² = 0 is k = 0.

Example 3.25. The first-order PDE system

u,y + v,x = 0, (213)
u,x − v,y = 0 (214)

(the Cauchy-Riemann equations) is elliptic. Proof: This system can be written as

[ ∂/∂y   ∂/∂x ] [ u ]
[ ∂/∂x  −∂/∂y ] [ v ] = 0. (215)

Hence

L(ik) = i [ η   ξ ]
          [ ξ  −η ]  ⇒  det L(ik) = ξ² + η². (216)

Note that by taking a further derivative this first-order system implies u,xx + u,yy = 0 and
v,xx + v,yy = 0, both of which are also elliptic.
Example 3.26. The second-order scalar PDE
−u,xx + u,yy + u,zz = 0 (217)
(the wave equation in two space dimensions) is strictly hyperbolic in the x-direction.
Proof: With k := (ξ, η, ζ), the principal symbol (which is equal to the full symbol) is
L(ik) = ξ² − η² − ζ². (218)
With n the x-direction, we write n = (1, 0, 0). Hence L(in) = 1 ≠ 0, and
L(iωn + ik) = L[i(ω + ξ, η, ζ)] = (ω + ξ)² − η² − ζ². (219)
The two solutions of L(iωn + ik) = 0 are therefore

ω± = −ξ ± √(η² + ζ²). (220)

These are real for all real k, and they are distinct unless η = ζ = 0. But we can exclude that case,
as then k would be a multiple of n.
Note that we could set ξ to zero by absorbing it into ω. Therefore, without loss of generality
we can restrict to k = (0, l) = (0, η, ζ) for simplicity. We then have

L(iωn + ik) = L[i(ω, η, ζ)] = ω² − |l|², |l| = √(η² + ζ²). (221)

The solutions of L(iωn + ik) = 0 are then
ω± = ±|l|. (222)
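As a quick numerical sanity check (my addition, not from the notes), the roots (220) can be substituted back into the symbol (219). A minimal Python sketch with illustrative function names:

```python
import math

def symbol(omega, xi, eta, zeta):
    # Principal symbol of -u,xx + u,yy + u,zz = 0, evaluated at
    # i*omega*n + i*k with n = (1,0,0) and k = (xi, eta, zeta),
    # i.e. (219): L = (omega + xi)^2 - eta^2 - zeta^2.
    return (omega + xi) ** 2 - eta ** 2 - zeta ** 2

def roots(xi, eta, zeta):
    # The two roots omega± = -xi ± sqrt(eta^2 + zeta^2) of (220).
    r = math.sqrt(eta ** 2 + zeta ** 2)
    return (-xi + r, -xi - r)

# Check at a sample wave vector k = (0.7, 1.2, -0.5).
xi, eta, zeta = 0.7, 1.2, -0.5
wp, wm = roots(xi, eta, zeta)
assert abs(symbol(wp, xi, eta, zeta)) < 1e-12
assert abs(symbol(wm, xi, eta, zeta)) < 1e-12
assert wp != wm  # distinct, since eta and zeta are not both zero
```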

Example 3.27. The second-order scalar PDE (217) is in fact strictly hyperbolic in all directions

n = (1, v) := (1, v^y, v^z) with |v|² := (v^y)² + (v^z)² < 1, (223)

that is, in all directions inside a cone around (1, 0, 0).
Beginning of proof: With k =: (ξ, η, ζ) =: (ξ, l), we have L(ik) = ξ² − |l|². Also, ωn + k =
(ω + ξ, ωv + l). Hence

L(iωn + ik) = (ω + ξ)² − |ωv + l|² = (1 − |v|²)ω² + 2ω(ξ − l·v) + (ξ² − |l|²) =: Aω² + 2Bω + C. (224)

Setting this to zero gives two distinct real solutions if and only if Δ := B² − AC > 0. To finish the
proof, we need to show that in fact Δ > 0 for all ξ if and only if |v|² < 1. (We do this first for
ξ = 0, and then for ξ ≠ 0.)
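The remaining step of the proof can be explored numerically. The following sketch (my addition; the sampled values are arbitrary) evaluates Δ = B² − AC from (224) for a choice of v with |v| < 1:

```python
def discriminant(xi, l, v):
    # Delta = B^2 - A*C from (224), with l = (eta, zeta) and v = (vy, vz).
    A = 1 - (v[0] ** 2 + v[1] ** 2)
    B = xi - (l[0] * v[0] + l[1] * v[1])
    C = xi ** 2 - (l[0] ** 2 + l[1] ** 2)
    return B * B - A * C

v = (0.6, 0.5)      # |v|^2 = 0.61 < 1: inside the cone
l = (1.0, -2.0)     # l != 0, so k is not a multiple of n
for xi in [-3.0, -0.5, 0.0, 0.4, 2.5]:
    assert discriminant(xi, l, v) > 0   # two distinct real roots omega
```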
Example 3.28. The first-order PDE system

u,x + v,y + w,z = 0, (225)


v,x + u,y = 0, (226)
w,x + u,z = 0 (227)

is a) strictly hyperbolic for the same directions n as in Example 3.27, and b) actually equivalent
to (217).
Example 3.29. The first-order PDE system

u,x + µu,y = 0, (228)
v,x + νv,y + λu,y = 0, (229)

where µ, ν and λ are real constants, is strictly hyperbolic in the x-direction if and only if µ ≠ ν.
Beginning of proof: Let n = (1, 0) and k := (ξ, η). Then

L(iωn + ik) = i ( ω + ξ + µη        0         )
               (     λη        ω + ξ + νη ).   (230)

The result follows by explicitly calculating the eigenvalues and eigenvectors of this matrix.
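Since the matrix in (230) is triangular, the determinant factorises and the roots can be read off. A small sketch (my addition, with illustrative names and values):

```python
def char_roots(xi, eta, mu, nu):
    # det L(i*omega*n + i*k) = 0 for the triangular matrix (230)
    # factorises as (omega + xi + mu*eta)(omega + xi + nu*eta) = 0,
    # independently of the coupling constant lambda.
    return (-xi - mu * eta, -xi - nu * eta)

mu, nu = 2.0, -1.0      # mu != nu: strictly hyperbolic
xi, eta = 0.3, 1.5
w1, w2 = char_roots(xi, eta, mu, nu)
assert w1 != w2          # real and distinct for eta != 0
# with mu == nu the two roots coincide, so strict hyperbolicity fails:
assert char_roots(xi, eta, mu, mu)[0] == char_roots(xi, eta, mu, mu)[1]
```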

3.7 Special cases


3.7.1 First order scalar PDEs
Remark 3.30. Any scalar first-order linear PDE in n independent variables

Σ_{i=1}^n a^i(x) u,i + c(x)u + d(x) = 0 (231)

is strictly hyperbolic in the sense of Def. 3.18 for all n that are not normal to a. Hence all such
PDEs are of the same type.
Remark 3.31. This is not so for first-order systems, which can be hyperbolic, elliptic, or neither.
Remark 3.32. Solving a scalar, first-order PDE is equivalent to solving a system of first-order
ODEs. Intuitively, the principal part of (231) can be thought of as just a directional derivative
along the vector field a^i(x). u can then be obtained from initial data by integrating along this
vector field.

3.7.2 Second order scalar PDEs


For scalar, second-order linear PDEs in n independent variables, there is a complete classification
into four types, as follows.

Definition 3.33. Consider the scalar linear second-order PDE for the unknown u in n independent
variables x,

L(x, ∇)u = Σ_{i,j=1}^n a^{ij}(x) u,ij + Σ_{i=1}^n b^i(x) u,i + c(x)u + d(x) = 0. (232)

Hence the principal symbol is

Lp(x, ik) = −Σ_{i,j=1}^n a^{ij}(x) ki kj. (233)

Now consider the coefficients a^{ij}(x) of the principal symbol as a symmetric n × n matrix. Then we
call the PDE parabolic at x if a^{ij} is singular (so one or more eigenvalues are zero). If instead
all eigenvalues are nonzero, we call the PDE elliptic at x if all eigenvalues of a^{ij} have the same
sign. We call it hyperbolic at x if one eigenvalue has the opposite sign from the others. The
remaining case, where there is more than one eigenvalue of each sign (but none zero), is called
ultrahyperbolic at x. The PDE is simply called parabolic if it is parabolic at every x, and so
on.
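For n = 2 the eigenvalue-sign test is easy to implement in closed form (the ultrahyperbolic case cannot occur with only two eigenvalues). A sketch, my addition with hypothetical function names:

```python
import math

def classify_2x2(a11, a12, a22):
    # Classify a second-order scalar PDE in two variables by the
    # eigenvalue signs of the symmetric coefficient matrix a^{ij}.
    tr, det = a11 + a22, a11 * a22 - a12 * a12
    disc = math.sqrt(tr * tr - 4 * det)
    ev1, ev2 = (tr + disc) / 2, (tr - disc) / 2
    if abs(ev1) < 1e-12 or abs(ev2) < 1e-12:
        return "parabolic"        # a singular matrix
    if ev1 * ev2 > 0:
        return "elliptic"         # both eigenvalues of the same sign
    return "hyperbolic"           # eigenvalues of opposite sign

assert classify_2x2(1, 0, 1) == "elliptic"      # u,xx + u,yy = 0
assert classify_2x2(-1, 0, 1) == "hyperbolic"   # -u,xx + u,yy = 0
assert classify_2x2(0, 0, 1) == "parabolic"     # u,yy = 0
```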
Proposition 3.34. This definition of ellipticity agrees with Def. 3.14, and this definition of hy-
perbolicity agrees with strict hyperbolicity for at least one choice of n defined in Def. 3.18 (namely
the eigenvector corresponding to the one eigenvalue that has opposite sign from all the others).
Proof. If all eigenvalues of the matrix a^{ij} have the same sign, then without loss of generality we
can assume they are all positive. Hence the matrix is positive definite, which is defined as

Σ_{i,j} a^{ij} ki kj ≥ 0 ∀ k ∈ R^n, with Σ_{i,j} a^{ij} ki kj = 0 ⇔ k = 0. (234)

Hence we have ellipticity.


Now assume that one eigenvalue is negative and all others are positive. Let the corresponding
eigenvector be n, so that

Σ_{i,j} a^{ij} ni nj < 0. (235)

We can uniquely decompose any other vector k into a part in the direction of n and a part “orthog-
onal” to it, in the sense that

k = ξn + l, with Σ_{i,j} a^{ij} ni lj = 0. (236)

(Here, “orthogonal” means orthogonal with respect to the inner product defined by the matrix A.)
Moreover,

Σ_{i,j} a^{ij} li lj ≥ 0, (237)

with

Σ_{i,j} a^{ij} li lj = 0 ⇔ l = 0. (238)

We then have

L(iωn + ik) = L[i(ω + ξ)n + il] = −Σ_{i,j} a^{ij} [(ω+ξ)ni + li][(ω+ξ)nj + lj] = −(ω+ξ)² Σ_{i,j} a^{ij} ni nj − Σ_{i,j} a^{ij} li lj. (239)

Therefore the equation L(iωn + ik) = 0 has two distinct real solutions

ω = −ξ ± √( −Σ_{i,j} a^{ij} li lj / Σ_{i,j} a^{ij} ni nj ), (240)

for any k that is not a multiple of n (so that l ≠ 0).

Remark 3.35. The only parabolic second-order PDEs we will consider in this course are
those with precisely one zero eigenvalue. The corresponding eigenvector n is the direction of time.
We will not say anything about ultrahyperbolic PDEs, which arise less naturally in physics and
engineering than elliptic, parabolic and hyperbolic ones.

3.8 Nonlinear PDEs and systems


Recall the definition of a linear PDE in Sec. 1.1.4. Some PDEs and systems in engineering and
physics are linear at a deep physical level, for example the Poisson equation for the gravitational
field, the Maxwell equations, or the Schrödinger equation.
Other linear PDEs arise as an approximation when we consider small perturbations of an
equilibrium solution. For example, the equations of acoustics are valid for small pressure and
velocity perturbations of the Euler equations. For larger perturbations, for example in an explosion,
the full nonlinear Euler equations are needed, and interesting things such as shock formation will
happen. In another example, the heat equation is linear because the heat flux is approximately
proportional to the temperature gradient, but this is only an approximation.
We can write any PDE system in the abstract form

F(x, u, ∇) = 0, (241)

meaning that some (vector-valued) function F of x, u and the partial derivatives of u with respect to
x vanishes.
If one wants to look, for example, at continuous dependence on the data, or at uniqueness,
one is in effect asking how the solution changes if there is a small change in the boundary or
initial data. Hence consider a solution u that consists of a known solution u0 plus a small change,
u = u0 + ε δu, where ε > 0 is a small number. We now expand F(x, u, ∇) in a Taylor series about
ε = 0 as

F(x, u0 + ε δu, ∇) = F(x, u0, ∇) + ε L(x, u0, ∇) δu + O(ε²), (242)

where the linear differential operator L is defined by

L(x, u0, ∇) δu := d/dε F(x, u0 + ε δu, ∇) |_{ε=0}. (243)

This is just F(ε) = F(0) + εF′(0) + O(ε²) with knobs on. Now dF(ε∇δu)/dε = F′(. . .)∇δu from
the chain rule, and similarly for higher derivatives. This shows that the coefficient of ε is linear in
δu and its derivatives, and can indeed be written in the form L δu.
Definition 3.36. The linearisation L δu = 0 of a system of PDEs of the form (241) about a
solution u0(x) is defined by (243).
Example 3.37. Consider the nonlinear PDE

F(u, ∇) = u(u,xx + u,yy) + u,x² + u,y² = 0. (244)

We linearise about a solution u0(x, y) by computing

d/dε|_{ε=0} F(u0 + ε δu, ∇) = d/dε|_{ε=0} [(u0 + ε δu)(u0,xx + ε δu,xx) + · · · + (u0,x + ε δu,x)² + · · ·]
= u0(δu,xx + δu,yy) + δu(u0,xx + u0,yy) + 2(u0,x δu,x + u0,y δu,y)
= L(u0, ∇) δu, (245)

where the dots denote the obvious terms with y-derivatives. We read off

L(u0, ∇) = u0 (∂²/∂x² + ∂²/∂y²) + 2(u0,x ∂/∂x + u0,y ∂/∂y) + (u0,xx + u0,yy). (246)
The fact that well-posedness of nonlinear PDE problems is a statement about small perturbations
motivates the following definition.

Definition 3.38. A nonlinear PDE or PDE system is called strictly hyperbolic, hyperbolic,
elliptic or parabolic, and so on, about a solution u0(x) if its linearisation about u0(x) has the
appropriate property. In other words, we look at algebraic properties of the principal symbol of
the linearised system, Lp(u0, x, ik), where L(u0, x, ∇) is defined by (243).
Remark 3.39. Well-posedness of the linearisation of a PDE problem is necessary for well-posedness
of the full nonlinear system, but it is not sufficient. (And finding sufficient conditions is an art.)
Definition 3.40. A system of PDEs is called quasilinear if derivatives of the principal order
occur only linearly. (Their coefficients may depend nonlinearly on the lower derivatives and the
independent coordinates).
As an example, the PDE (2) is quasilinear if we allow a, b, c and f to depend only on x, y, u,
u,x and u,y (but not on u,xx, u,yy and u,xy, or any higher derivatives). The systems of conservation
laws we look at later (for example the Euler equations) are quasilinear. So are many other PDEs
of interest in physics and engineering. The PDE (244) is also quasilinear.
Remark 3.41. The key property of a quasilinear PDE or PDE system is that it and its linearisation
have essentially the same principal part. That means we can save ourselves the trouble of linearising
the non-principal part, and simply consider Lp(u0, x, ∇). More precisely, in the principal part of the
quasilinear PDE, replace the highest derivatives of u by the corresponding derivatives of δu, and
replace all lower derivatives of u (including u undifferentiated) by those of u0.
Example 3.42. Consider the quasilinear first-order PDE

u,t + f′(u) u,x = 0   ⇔   (∂/∂t + f′(u) ∂/∂x) u = 0. (247)

Its linearisation about a solution u0(x, t) is

δu,t + f′(u0) δu,x + f″(u0) u0,x δu = 0. (248)

Hence

L(u0, x, ∇) = ∂/∂t + f′(u0) ∂/∂x + f″(u0) u0,x   ⇒   Lp(u0, x, ∇) = ∂/∂t + f′(u0) ∂/∂x. (249)

Note the similarity between the last part of (247) and the last part of (249).
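The linearisation (243) can be checked numerically by treating u, u,t and u,x as independent slots of F and differencing in ε. A sketch for the Burgers flux f(u) = u²/2 (my addition; names are illustrative):

```python
def F(u, ut, ux):
    # quasilinear form (247) with the illustrative flux f(u) = u**2/2,
    # so f'(u) = u and f''(u) = 1 (Burgers' equation)
    return ut + u * ux

def L_du(u0, u0x, du, dut, dux):
    # linearisation (248): du,t + f'(u0) du,x + f''(u0) u0,x du
    return dut + u0 * dux + 1.0 * u0x * du

# compare with a centred finite-difference derivative in epsilon
u0, u0t, u0x = 1.3, 0.2, -0.7
du, dut, dux = 0.5, -0.1, 0.9
eps = 1e-6
fd = (F(u0 + eps * du, u0t + eps * dut, u0x + eps * dux)
      - F(u0 - eps * du, u0t - eps * dut, u0x - eps * dux)) / (2 * eps)
assert abs(fd - L_du(u0, u0x, du, dut, dux)) < 1e-8
```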
Example 3.43. The principal part of the quasilinear PDE (244) is

u(u,xx + u,yy) + · · · = 0, (250)

where the dots denote non-principal terms. The principal part of its linearisation L δu = 0, with L
given by (246), is

u0(δu,xx + δu,yy) + · · · = 0, (251)

where again the dots denote non-principal terms. Hence the principal symbol of the linearisation
is

Lp(x, ik) = −u0(x)(ξ² + η²), (252)

where the x-dependence of the symbol comes from the x-dependence of u0.

3.9 Exercises
19. Homework 8: (Simple example for the lecture) For the linear PDE with constant coefficients

Lu = 0 ⇔ u,xx + 7u,xy + 3u,x = 0, (253)

find L(∇) and L(ik). Show explicitly that the Fourier transform of L(∇)u is L(ik)û(k).

20. Homework 9: (A fourth-order elliptic PDE, and practice with separation of variables) a)
Using separation of variables, find the general solution of the PDE

u,xxxx + u,yyyy = 0 (254)

on the domain 0 ≤ x ≤ π with boundary conditions

u(0, y) = 0, u(π, y) = 0, u,xx(0, y) = 0, u,xx(π, y) = 0 (255)

for all y. [Hint: this is not a complete PDE problem yet. We have not specified the domain in
y and the related boundary conditions, so your solution should contain four free constants.]
b) Based on this solution, does it look as if the PDE (254) has the key property of an elliptic
equation, namely that every solution grows in at least one direction?
21. (A variation on the previous problem: this looks like a fourth-order wave equation, but do
you think it is hyperbolic?) Using separation of variables, find the general solution of the
PDE

u,xxxx − u,yyyy = 0 (256)

on the domain 0 ≤ x ≤ π with boundary conditions

u(0, y) = 0, u(π, y) = 0, u,xx(0, y) = 0, u,xx(π, y) = 0 (257)

for all y. [Hint: this is not a complete PDE problem yet. We have not specified the domain
in y and the related boundary conditions, so your solution should contain four sets of free
constants.] Based on this solution, does it look as if the PDE (256) has any of the following
properties: 1) every solution grows in at least one direction, 2) every solution oscillates in all
directions, 3) every solution decays in one and the same direction?
22. (Continuation of the previous problem, practice with ill-posedness results) Show that the
Cauchy problem consisting of (256) on the domain 0 ≤ x ≤ π, y ≥ 0 with boundary conditions
(257) and initial data

u(x, 0) = u0(x), u,y(x, 0) = u1(x), u,yy(x, 0) = u2(x), u,yyy(x, 0) = u3(x) (258)

has a unique solution. Show that this Cauchy problem is nevertheless ill-posed because the
solution cannot depend continuously on the initial data in any function norm.
23. (Begun in lecture) Show that

−u,xx + u,yy + u,zz + · · · = 0 (259)

is strictly hyperbolic in all directions

n = (1, v^y, v^z, . . .) =: (1, v) (260)

with v < 1, where v := |v| := √(v · v).
24. Homework 10: (Practice with notation) Show, using the formal definitions, that a) the
PDE (254) is elliptic, and that b) the PDE (256) is neither elliptic, nor strictly hyperbolic
in the y-direction.
25. Homework 11: Show that the system of first-order PDEs

u,x + v,y + w,z = 0, (261)
v,x + u,y = 0, (262)
w,x + u,z = 0 (263)

is (a) strictly hyperbolic for suitable n and (b) equivalent to the second-order wave equation
−u,xx + u,yy + u,zz = 0.

26. Homework 12: a) Show that the PDE

4u,xy + u,zz = 0 (264)

for u(x, y, z) is hyperbolic, using the criterion for second-order scalar PDEs. b) Show that
none of the x, y or z-directions is a good time direction for this PDE. In other words, show
that the PDE is not strictly hyperbolic in the directions n = (1, 0, 0), (0, 1, 0) or (0, 0, 1).
c) Show that the direction n = (1, 1, 0) is a good time direction. d) Change variables from
(x, y, z) to (t, s, Z), where t = (x + y)/2, s = (x − y)/2 and Z = z, and show that you get the
standard form of the wave equation. [This is consistent with the fact that we have already
shown that ∇t = (1/2, 1/2, 0) is a good time direction.]
27. Coursework 1, 2020/21: The compressible Euler equations in one space dimension are,
in conservation law form,

ρ,t + (ρv),x = 0, (265)
(ρv),t + (ρv² + P),x = 0, (266)
(e + ½ρv²),t + [v(e + ½ρv² + P)],x = 0, (267)

where P = P(ρ, e) is a given function with P,ρ > 0 and P,e > 0. In this question you will show
that this system is strictly hyperbolic. Here ρ > 0 is mass/volume, v is velocity, e > 0 is
internal (heat) energy per volume and P > 0 is pressure. (And as we are in one space
dimension, “volume” means length.)
a) Write this system of PDEs in the quasilinear form

At u,t + Ax u,x = 0, (268)

where u := (ρ, v, e), and At and Ax are 3 × 3 matrices that you need to find, and which depend
only on the variables u. [Hint: This part is fairly straightforward. Start by expanding
(ρv),t = ρv,t + ρ,t v, and similarly for the other derivatives. Use the chain rule to write
P,x = P,ρ ρ,x + P,e e,x. You now have the equations in the quasilinear form, where every term
in the PDE is one of the derivatives ρ,t, v,t, e,t, ρ,x, v,x or e,x, times a coefficient that depends
only on ρ, v and e. Then read off the matrices.]
b) Write down the conditions for this system to be strictly hyperbolic in the t-direction, in
terms of the matrices At and Ax. Show that the first of these conditions, det Lp(x, in) ≠ 0,
actually holds (as long as ρ > 0). [Hint: This is a straight application of lecture examples.
Use the notation n = (1, 0) and k = (τ, ξ). We have seen that for classification purposes we
can treat a quasilinear first-order system as if it were linear. We have to check whether the
system is hyperbolic at each point x, because ρ, v and e in the background solution all depend
on x. But that x-dependence makes no practical difference to your calculation.]
c) Show that the three roots ω of det Lp(x, iωn + ik) = 0 are given by ω = −ξω0 − τ, where
ω0 are the three eigenvalues of the 3 × 3 matrix Ãx := (At)⁻¹ Ax. Hence conclude that all
three values of ω are real and distinct if the three eigenvalues ω0 of Ãx are real and distinct.
[Hint: this is not covered by the lecture notes. But it helps with the final part, which involves
the heavy algebra.]
d) Use a computer algebra program such as Maple or Mathematica to show that the three
eigenvalues ω0 of Ãx are v, v + c and v − c, where

c := √( P,ρ + P,e (P + e)/ρ ) (269)

is the sound speed. (These are real and distinct for c > 0, as required.) [Hint: you could just
about do this by hand, but it would be tedious and error-prone.]

4 Conservation laws
4.1 Integral and differential form
4.1.1 One space dimension
To give the simplest example of a conservation law, consider the flow of mass through a pipe, so we
have a problem in one space dimension x and time t. Let u(x, t) be the density of mass, measured
in units of mass/length, at position x and time t.
Let f (x, t) be the mass flux through the pipe, measured in mass/time, again at position x and
time t. We count a flux going in the direction of increasing x as positive. Consider a segment of
pipe a  x  b. The mass in that segment at time t is
m(t) = ∫_a^b u(x, t) dx. (270)

This mass changes with time because of the fluxes through the ends x = a and x = b, as follows:
dm/dt = f(a, t) − f(b, t). (271)
Note the signs: flux towards increasing x means into the segment at x = a but out of it at x = b.
Combining the last two equations, we have
d/dt ∫_a^b u(x, t) dx = f(a, t) − f(b, t). (272)

We now turn (272) into a PDE. Assuming u(x, t) is once differentiable in t, we can write the
left-hand side as

d/dt ∫_a^b u(x, t) dx = ∫_a^b ∂u/∂t (x, t) dx. (273)

Also, if f(x, t) is once differentiable in x, we can write the right-hand side as

f(a, t) − f(b, t) = −∫_a^b ∂f/∂x (x, t) dx. (274)

Combining these two results and bringing both terms to the same side, we obtain

∫_a^b [u,t(x, t) + f,x(x, t)] dx = 0. (275)

If we want this to hold for any segment of pipe, a and b can take any values. Then the integral
can vanish only if the integrand in square brackets vanishes for every x, or

u,t + f,x = 0. (276)

This first-order PDE is a conservation law in differential form or strong form. A solution of
this is called a strong solution of the conservation law.
However, we are often interested in the case where u and f are not differentiable, and are in
fact discontinuous. (This will lead us to “shocks”.) Hence we want to go the other way from (272)
and also remove the time derivative. For this, we integrate (272) over a time interval t0 ≤ t ≤ t1
to get

∫_a^b u(x, t1) dx − ∫_a^b u(x, t0) dx = ∫_{t0}^{t1} f(a, t) dt − ∫_{t0}^{t1} f(b, t) dt. (277)

This consists of four integrals, one over each side of the rectangle (a ≤ x ≤ b, t0 ≤ t ≤ t1). Note
that u and f now only need to be integrable, not differentiable.
Definition 4.1. A weak solution of the conservation law (276) is a function u(x, t) that obeys
(277) for all a, b, t0 and t1 .

In defining this, we use (276) only as a shorthand notation for (277). Be sure you get the
signs right. This definition of a weak solution looks different from the weak solutions of the wave
equation we defined in Sec. 1.7, but is closely related. (277) itself is called the integral form or
weak form of the conservation law (276). We have just shown by construction that any strong
solution is also a weak solution. The reverse is clearly not true, as a weak solution does not have
to be once differentiable. In fact, interesting weak solutions are often discontinuous.
Note that until we have given an expression for f in terms of u, the problem has not been
completely specified.
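The integral form is also the basis of finite-volume numerical schemes: each cell is updated by the fluxes through its faces, so the total mass changes only through the boundary. A minimal sketch (my addition; the advection flux f(u) = u and the grid values are illustrative):

```python
def step(u, dx, dt):
    # One finite-volume update for u,t + f,x = 0 with f(u) = u
    # (advection with v0 = 1). Each cell gains the flux through its
    # left face and loses the flux through its right face.
    f = [ui for ui in u]                     # flux f(u) = u
    return [ui - dt / dx * (f[i] - f[i - 1])
            for i, ui in enumerate(u)]       # f[-1] wraps: periodic

u = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
dx, dt = 1.0, 0.5
total_before = sum(u) * dx
u = step(u, dx, dt)
total_after = sum(u) * dx
# with periodic boundaries the boundary fluxes cancel exactly,
# so the total "mass" is conserved to machine precision:
assert abs(total_before - total_after) < 1e-12
```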

4.1.2 Higher space dimensions

It is straightforward to generalise the integral and differential forms to any number of spatial
dimensions. Consider the rectangle (a ≤ x ≤ b, c ≤ y ≤ d) in two space dimensions. u is the
density of mass, now measured in mass/(length)2 . The mass in the rectangle at time t is
m(t) = ∫_a^b dx ∫_c^d dy u(x, y, t). (278)

It changes with time because of the fluxes through the four sides of the rectangle:

dm/dt = ∫_c^d [f^x(a, y, t) − f^x(b, y, t)] dy + ∫_a^b [f^y(x, c, t) − f^y(x, d, t)] dx. (279)

Here f^x(x, y, t) is the mass flux in the x-direction, and f^y(x, y, t) the mass flux in the y-direction. Both
are measured in mass/(length·time). Once again, this is not the form we need. The differential
form is

u,t + f^x,x + f^y,y = 0 (280)
and the integral form is

∫_a^b dx ∫_c^d dy [u(x, y, t1) − u(x, y, t0)]
+ ∫_{t0}^{t1} dt ∫_c^d dy [f^x(b, y, t) − f^x(a, y, t)]
+ ∫_{t0}^{t1} dt ∫_a^b dx [f^y(x, d, t) − f^y(x, c, t)] = 0. (281)

The integration is over the six faces of the rectangular box (t0 ≤ t ≤ t1, a ≤ x ≤ b, c ≤ y ≤ d).
Each of these six faces is itself two-dimensional.
In the three-dimensional case, u is measured in units of mass/(length)³ and the three fluxes
in units of mass/(length²·time). The differential form of the conservation law is

u,t + f^x,x + f^y,y + f^z,z = 0. (282)

We do not write out the corresponding integral form, but it now has eight integrals, each over
three of the four coordinates (x, y, z, t).
In n space dimensions and time, the differential form generalises to

u,t + Σ_{i=1}^n f^i,i = 0. (283)

In vector calculus notation, the same equation is

u,t + ∇ · f = 0. (284)

We can also write the mixed form of our conservation law as

d/dt ∫_V u dV + ∫_S f · n dS = 0, (285)

and the fully integral form as

∫_V [u(x, t1) − u(x, t0)] dV + ∫_{t0}^{t1} dt ∫_S f · n dS = 0, (286)

where S is the boundary of V . Because they use the divergence theorem, these now hold for any
volume V with boundary S (not just rectangular boxes). In any conservation law of the form
(286), u is called the conserved quantity and f the corresponding flux.

4.2 Scalar conservation laws in one space dimension


The general form of a scalar conservation law in one space dimension is

u,t + [f (u, x, t)],x = 0. (287)

As for other PDEs, “scalar” means that there is only one PDE for one dependent variable u,
as opposed to a system of conservation laws (such as the Euler equations). We have seen in
Remark 3.30 above that this equation is hyperbolic. The appropriate initial data are

u(x, 0) = g(x). (288)

Often, f (u, x, t) is actually independent of x and t. (Any ODE or PDE whose coefficients are
independent of all the independent variables is called autonomous.) In this autonomous case,
(287) reduces to
u,t + [f (u)],x = 0, u(x, 0) = g(x). (289)
The function f (u) is sometimes called the flux function or flux law. In the following we consider
only the autonomous case, which includes many physical conservation laws.
Using the chain rule, we can also write (289) as

u,t + f′(u) u,x = 0, (290)

where f′(u) := df/du. This form is explicitly quasilinear, but no longer explicitly in conservation
law form. The two forms are equivalent if and only if u(x, t) is at least once differentiable. By
contrast, (289) is often used as a shorthand for the integral form (277), which is defined for any
solution u(x, t) that is integrable.
We define (in the autonomous case)

v(u) := f(u)/u, (291)

or

f(u) = u v(u). (292)
If we can interpret u as the density of something countable, say particles per length of pipe, then
v is the velocity with which these particles move, the particle velocity.

4.2.1 The advection equation


The simplest case of a scalar conservation law is the one where v is just constant in space and
time, v = v0 and hence f (u) = v0 u. This is called the advection equation (here in one space
dimension). It can be solved in closed form. It is easy to verify that the unique solution of

u,t + (uv0 ),x = 0, u(x, 0) = g(x), (293)

is
u(x, t) = g(x − v0 t). (294)
The solution means that if g(x0 ) = u0 , then u = u0 all along the characteristic curve
x(t) = x0 + v0 t. (Compare this with the discussion of the d’Alembert solution in Remark 1.12.)

We see that the advection equation just translates the initial data along characteristic curves, or
with velocity v0 .
It is natural to also admit weak solutions of (293), of the form (294) but where the initial data
g(x) and hence the solution u(x, t) are discontinuous. Weak solutions of a conservation law are not
everywhere di↵erentiable, and hence do not obey the di↵erential form of the conservation law, but
they obey its corresponding integral form. Note that any discontinuities in g(x) also propagate
along characteristic curves.
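The solution (294) can be checked numerically by central differencing (my addition; the Gaussian initial data are an arbitrary illustrative choice):

```python
import math

def g(x):                       # smooth initial data
    return math.exp(-x * x)

def u(x, t, v0=2.0):            # the solution (294): u(x,t) = g(x - v0*t)
    return g(x - v0 * t)

# check u,t + v0*u,x = 0 at a sample point, by central differences
x, t, v0, h = 0.4, 1.7, 2.0, 1e-5
u_t = (u(x, t + h, v0) - u(x, t - h, v0)) / (2 * h)
u_x = (u(x + h, t, v0) - u(x - h, t, v0)) / (2 * h)
assert abs(u_t + v0 * u_x) < 1e-8
```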

4.2.2 Method of characteristics


Now consider the generic autonomous scalar conservation law (287,288), where f (u) is some given
function. We can derive a solution either in implicit form or graphically using the method of
characteristics.
Assume u is constant on the curve in the (x, t)-plane given by x = ξ(x0, t) that starts at x0 at
t = 0. In other words,

u[ξ(x0, t), t] = u(x0, 0) = g(x0), (295)

for all t ≥ 0. We can therefore take ∂/∂t of this equation (at constant x0) and obtain

0 = (∂/∂t)|_{x0} u[ξ(x0, t), t] = u,t + u,x ∂ξ/∂t = −f′(u) u,x + u,x ∂ξ/∂t = u,x (∂ξ/∂t − f′(u)). (296)

In the second equality we have used the chain rule of partial derivatives, and in the third equality
we have used (290). If we ignored the parameter x0, we could also write
d/dt u[ξ(t), t] = u,t + u,x ξ′(t) (297)
to stress that u has two arguments x and t, but that the value ⇠(t) of the formal argument x also
depends on t. To give an intuitive interpretation of (297), think of u(x, t) as the temperature of the
air, which depends on both position and time. You move around, carrying a thermometer around
with you. Your position is given by x = ⇠(t). Then (297) is the rate of change with time of your
thermometer reading.
Generically u,x ≠ 0, and hence we must have

∂ξ/∂t (x0, t) = f′[u(x0, 0)] = f′[g(x0)]. (298)

We also have the initial condition

ξ(x0, 0) = x0. (299)
Although ξ depends on x0 and t, this differential equation contains no derivative with respect to
x0, and so it is in effect an ODE in t, for each value of the parameter x0. Moreover, the right-hand
side of (298) does not depend on ξ or t, and so we can simply integrate both sides to obtain

ξ(x0, t) = x0 + f′[g(x0)] t, (300)

where the integration constant is fixed by the initial condition (299). So the characteristic curves
for the general autonomous scalar conservation law in one space dimension are the straight lines

x = x0 + f′[g(x0)] t. (301)

We now have the solution u(x, t) in implicit form. If we can solve the algebraic equation (301)
to find x0(x, t) in closed form [whether this is possible depends on f′(u)], we can also write the solution
in explicit form as

u(x, t) = g[x0(x, t)]. (302)

[Above, we have already discussed the special case of the advection equation, where the characteristic
curves were x = x0 + v0 t. This can be solved for x0(x, t) = x − v0 t, and hence we obtain the closed-
form solution (294).]
f′(u) is called the characteristic velocity. Note that this is different from the particle velocity
v = f(u)/u, except for the advection equation, where both are equal to v0.
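The implicit solution (301)-(302) can be evaluated numerically by root-finding, as long as the characteristics have not crossed. A sketch for Burgers' equation f(u) = u²/2 (my addition; function names are illustrative):

```python
def solve_characteristic(x, t, g, fprime, lo=-50.0, hi=50.0):
    # Solve x = x0 + f'(g(x0)) * t for x0 by bisection (assumes the
    # characteristics have not yet crossed, so the root is unique
    # and the residual changes sign exactly once on [lo, hi]).
    def residual(x0):
        return x0 + fprime(g(x0)) * t - x
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Burgers' equation f(u) = u^2/2, f'(u) = u, with increasing data
# g(x) = x: the characteristics fan out, and the exact solution
# is u(x,t) = x/(1+t).
g = lambda x: x
fprime = lambda w: w
x, t = 2.0, 3.0
x0 = solve_characteristic(x, t, g, fprime)
assert abs(g(x0) - x / (1 + t)) < 1e-9
```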

4.2.3 Propagation of small disturbances
Assume that u(x, t) is at least once differentiable. Then start from the quasilinear form (290) of a
scalar conservation law in one space dimension, and find its linearisation as

d/dε|_{ε=0} [(u0 + ε δu),t + f′(u0 + ε δu)(u0 + ε δu),x] = δu,t + f′(u0) δu,x + f″(u0) u0,x δu = 0. (303)

If u0(x, t) is constant, the third term vanishes and we obtain

δu,t + f′(u0) δu,x = 0. (304)

But this is just the advection equation for δu, so small perturbations travel at the characteristic
velocity v0 = f′(u0).
As an approximation, the third term in (303) can be neglected with respect to the second one
if and only if

(f″(u0)/f′(u0)) u0,x ≪ δu,x/δu. (305)

We can also write this as

[ln f′(u0)],x ≪ (ln δu),x. (306)

In other words, ln δu varies much more rapidly than ln f′(u0), or put differently again, the
relative change in the perturbation δu is much greater than the relative change in the (approximate)
advection speed v0 = f′(u0). We then obtain again an approximate advection equation, now with
a v0 that depends slowly on x and t. You can think of the equations governing the atmosphere:
there will be changes of pressure and density over large scales, corresponding to the weather, and
small but rapidly varying changes corresponding to sound waves.
Considering that sound can convey information, and sound is a (very) small perturbation of gas
pressure and velocity, it is said (in a non-rigorous sense) that information in hyperbolic equations
travels at the characteristic velocity or velocities.

4.3 Weak solutions


4.3.1 Shock formation and Riemann problems
For the conservation law (289) with initial data (288), consider initial data g(x) such that the
function f′[g(x)] is an increasing function of x. Then the characteristics fan out from t = 0 and
never intersect. Hence the initial density profile is stretched out. The solution remains smooth for
all t > 0.
Now consider initial data g(x) such that the characteristic speed f′[g(x)] is a de-
creasing function of x (at least for some interval in x). From the chain rule, this is the case
if

d/dx f′[g(x)] = f″[g(x)] g′(x) < 0. (307)

Then the characteristics converge from t = 0. Intuitively, “particles” at the back move faster than
particles in front, and hence catch them up. The initial density profile is compressed and becomes
steeper, until the solution given by following characteristics becomes multivalued and no longer
makes sense. At this point the physical solution has become discontinuous, and hence non-differentiable,
and we do not know how to continue. A shock has formed. Shocks develop generically in nonlinear
hyperbolic PDEs, of which nonlinear conservation laws are an example.
To understand what happens once a discontinuity has formed, or when a discontinuity is already
present in the initial data, we consider the Riemann problem, which consists of a conservation
law (here, a scalar conservation law) with piecewise constant initial data:

u,t + [f(u)],x = 0,   u(x, 0) = uL for x < 0, uR for x > 0. (308)

4.3.2 Propagating shock solutions
We have seen that the advection equation admits solutions with a travelling discontinuity. This mo-
tivates us to look for a solution to the Riemann problem (308) with a discontinuity that propagates
with constant velocity:

u(x, t) = uL for x < st, uR for x > st, (309)

for t ≥ 0, where s is a constant shock velocity. The shock location is x = st. (309) cannot
be a solution of the differential form u,t + [f(u)],x = 0 of the conservation law because it is not
differentiable. Instead, consider the equivalent integral form

∫_a^b [u(x, t1) − u(x, t0)] dx + ∫_{t0}^{t1} (f[u(b, t)] − f[u(a, t)]) dt = 0, (310)

which must hold for all rectangles (t0 ≤ t ≤ t1, a ≤ x ≤ b). On a rectangle where u is simply
constant, this is trivial. Instead consider a rectangle that is cut diagonally into two triangles by
the propagating shock, for example the rectangle (0 ≤ t ≤ Δt, 0 ≤ x ≤ Δx), where Δx := s Δt
(assuming here that s > 0). Then (310), after dividing by Δt, immediately gives

s(uL − uR) = f(uL) − f(uR), (311)

the Rankine-Hugoniot condition. This is often written as

s[u] = [f(u)], (312)

where the square brackets denote the jump across the shock. It is also called the jump condition
at the shock.
By making the rectangle under consideration arbitrarily small, it is easy to show that the jump
condition still holds when the solution is once di↵erentiable but not necessarily constant on either
side of the shock. Intuitively, if we zoom in on an isolated discontinuity, the smooth derivative
becomes less and less relevant, and the discontinuity looks like a step function. Hence, what
happens in the neighbourhood of the shock should depend only on the value just to the left and
right. The shock still moves with velocity s given by (311), but that velocity will in general depend
on time as uL and uR change.
For a scalar conservation law, we can obtain a shock solution for arbitrary values of uL and
uR , and simply read o↵ the shock speed as

f (uL ) f (uR )
s= . (313)
uL uR
If we now let uL = u + Δu and uR = u in (313) and take the limit Δu → 0, we see that the right-hand side is just the formal definition of a derivative as a limit, so we find that s → f'(u).
Hence weak shocks (shocks with an infinitesimally small jump) propagate with the characteristic
velocity.
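As a quick concrete check (an assumed example, not worked in the notes), take Burgers' flux f(u) = u²/2: then (313) gives s = (uL + uR)/2, and shrinking the jump reproduces the characteristic velocity f'(u) = u.

```python
# Shock speed (313) for a scalar conservation law; the flux here is Burgers'
# f(u) = u^2/2, chosen as an illustration.
def shock_speed(f, uL, uR):
    """Rankine-Hugoniot speed s = (f(uL) - f(uR)) / (uL - uR)."""
    return (f(uL) - f(uR)) / (uL - uR)

f = lambda u: 0.5 * u ** 2   # Burgers' flux

print(shock_speed(f, 2.0, 1.0))          # prints 1.5 = (2 + 1)/2
print(shock_speed(f, 1.0 + 1e-8, 1.0))   # ~ 1.0 = f'(1): the weak-shock limit
```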

4.3.3 Rarefaction waves


The initial data in a Riemann problem do not single out any particular length scale, and neither
does the solution (309), and so the solution must be scale-invariant. Dimensional analysis shows
that there is no length scale that can be formed from the conservation law itself and the initial
data of the Riemann problem. The solution must be a similarity solution, which here means
that it must be a function of x/t only. Looking back, we see that this does hold for (309).
There is in fact another type of similarity solution of the Riemann problem that is continuous,
although still only a weak solution. We look for a solution of the form

u(x, t) = φ(z),  z := x/t,  t > 0.   (314)

Substituting this into the differential form of the conservation law we find

u,t + f'(u) u,x = φ'(z) (−x/t²) + f'[φ(z)] φ'(z) (1/t) = 0,   (315)

and after multiplying by t,

f'[φ(z)] φ'(z) = z φ'(z).   (316)
We can assume φ'(z) ≠ 0 (or else the solution we are constructing would be constant) and divide by it to obtain

f'[φ(z)] = z.   (317)

As f(u) and hence f'(u) is a known function, this is just an algebraic equation that in principle can be solved for φ(z) = (f')⁻¹(z). Assume there exist zL and zR with zL < zR such that φ(zL) = uL and φ(zR) = uR. Then a solution to the Riemann problem is given by
u(x, t) = { uL, x < zL t;  φ(x/t), zL t < x < zR t;  uR, x > zR t },   (318)

where φ(zL,R) = uL,R and (317) give

zL,R = f'(uL,R).   (319)

So this solution can only exist when

f'(uL) < f'(uR).   (320)

This solution is called a rarefaction wave. Clearly, it is continuous but not differentiable at
x = zL,R t. One can show that it is a weak solution of the conservation law.
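For instance (a sketch, not part of the notes), with Burgers' flux f(u) = u²/2 we have f'(u) = u, so φ(z) = (f')⁻¹(z) = z, and the rarefaction solution (318) can be evaluated directly:

```python
# Rarefaction wave (318) for Burgers' equation, where zL = uL and zR = uR.
def rarefaction(x, t, uL, uR):
    """Similarity solution u(x,t) of the Riemann problem, valid for uL < uR."""
    z = x / t
    if z <= uL:
        return uL    # left constant state
    if z >= uR:
        return uR    # right constant state
    return z         # the fan: u = phi(x/t) = x/t

print(rarefaction(-2.0, 1.0, 0.0, 1.0))   # prints 0.0 (left state)
print(rarefaction(0.25, 1.0, 0.0, 1.0))   # prints 0.25 (inside the fan)
print(rarefaction(2.0, 1.0, 0.0, 1.0))    # prints 1.0 (right state)
```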

4.3.4 The Lax condition


A Riemann problem with given left and right states uL and uR may well admit more than one
shock or rarefaction solution. One then needs additional input to decide which of these possible
weak solutions is the physically correct one.
In this course, we are solving Riemann problems only for scalar conservation laws. Then a
shock solution exists for any uL and uR. On the other hand, a rarefaction solution exists if and only if zL < zR, that is for

f'(uL) < f'(uR).   (321)
So for values of uL and uR that obey this condition we have both a shock solution and a rarefaction
solution. Which one is correct?
A possible criterion for selecting the correct solution is to demand continuous dependence on
the initial data. If we take initial data for the Riemann problem that obey (321), but smooth out the jump from uL to uR in the initial data over a distance ℓ, then in the resulting solution the characteristics diverge everywhere, so a shock never forms. In fact, after a time proportional to ℓ the solution looks very similar to the rarefaction solution of the (unsmoothed) original Riemann problem, but very different from the shock solution. So for f'(uL) < f'(uR) the correct solution of the Riemann problem is the rarefaction wave, as smoothing it out on an arbitrarily small scale ℓ changes the solution by arbitrarily little. On the other hand, for

f'(uL) > f'(uR),   (322)

we have only the shock solution. Suppose we again smooth out the initial data over a length scale ℓ. Then we initially have a smooth solution, but the characteristics converge, and so after a short time proportional to ℓ a shock forms anyway. So here it is the shock solution that is stable against small perturbations and so is the correct one.
Expressing the same statement more geometrically, shocks are the correct solution to the Rie-
mann problem only if characteristics run into them from both sides. This is called the Lax shock

condition. Using our expressions for the shock velocities and the characteristic velocities to the
left and right, this is
f'(uL) > [f(uL) − f(uR)] / (uL − uR) > f'(uR).   (323)
Of course, (322) is necessary for (323) to hold.
We note for completeness that, conversely, (322) implies (323) if and only if either f''(u) < 0 or f''(u) > 0 for all u. We say the flux function is convex. Draw a graph of f(u) to understand this.
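The selection rule above can be summarised in a few lines of code (a hedged sketch; the function name is our own):

```python
# Choose the admissible wave for a scalar Riemann problem by comparing
# characteristic velocities, as in the Lax condition.
def admissible_wave(fprime, uL, uR):
    if fprime(uL) > fprime(uR):
        return "shock"        # characteristics run into the discontinuity
    if fprime(uL) < fprime(uR):
        return "rarefaction"  # characteristics fan out
    return "constant"         # uL == uR for a strictly convex flux

# Burgers' equation, f'(u) = u:
print(admissible_wave(lambda u: u, 2.0, 1.0))  # prints shock
print(admissible_wave(lambda u: u, 1.0, 2.0))  # prints rarefaction
```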

4.3.5 A few words on systems and higher dimensions


A system of N conservation laws for N unknowns u := (u1, u2, . . . , uN) in one space dimension can be written as

u,t + (f(u)),x = 0,   (324)

where now both u and f are vectors in state space R^N, or if we write out the N components,

uα,t + (fα(u)),x = 0,  α = 1, . . . , N.   (325)

After using the chain rule of partial derivatives, we obtain the quasilinear form

u,t + (∂f/∂u)(u) · u,x = 0,   (326)

where now ∂f/∂u is an N × N matrix that depends on u, and the dot denotes multiplying the column vector u,x by this matrix. Writing this out in components, we have

uα,t + Σ_{β=1}^{N} (∂fα/∂uβ)(u) uβ,x = 0,  α = 1, . . . , N.   (327)

For a system of N conservation laws in one space dimension, the Rankine-Hugoniot condition can be derived in the same way, and is

s(uL − uR) = f(uL) − f(uR),   (328)

or in components

s(uα,L − uα,R) = fα(uL) − fα(uR),  α = 1, . . . , N.   (329)
These are now N equations. We can interpret them as N − 1 constraints between the 2N components of uL and uR, as well as one equation for the shock speed s, so we cannot choose uL and uR freely. Put a different way, we can think of uL (say) as fixed, and interpret the N components of the Rankine-Hugoniot condition as one equation for the shock speed s and N − 1 constraints on the N components of uR. This means that the possible values of uR, given uL, form a one-dimensional set, namely a continuous curve through the point uR = uL (the trivial shock). In fact, because the equations are nonlinear, through every point uL in state space there are not just one but N curves of possible values of uR that obey the Rankine-Hugoniot condition.
A similar statement holds for rarefaction waves in a system.
Hence constructing weak solutions for systems is more complicated. Roughly speaking, if we look for the solution of a Riemann problem with arbitrarily given left and right states uL and uR in a system of N conservation laws, the solution will consist of N "waves" sandwiched between the left and right state and N − 1 intermediate states. Here a "wave" means a shock, rarefaction wave, or a third kind of similarity solution called a contact discontinuity.
The statement that a weak shock propagates at a characteristic speed is still true for a system, except that a system of N conservation laws now has N characteristic speeds (the eigenvalues of the N × N matrix ∂f/∂u).
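As a minimal sketch of this eigenvalue computation (the helper and the 2×2 restriction are our own; the Jacobian below is a hypothetical example), the characteristic speeds of a two-component system follow from the quadratic formula:

```python
# Characteristic speeds of a 2x2 system: eigenvalues of the Jacobian df/du.
def char_speeds_2x2(J):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic polynomial."""
    (a, b), (c, d) = J
    tr, det = a + d, a * d - b * c
    disc = (tr * tr - 4.0 * det) ** 0.5   # assumes real eigenvalues (hyperbolicity)
    return ((tr - disc) / 2.0, (tr + disc) / 2.0)

# Example Jacobian [[0, -1], [-4, 0]]: the two characteristic speeds are -2 and +2.
print(char_speeds_2x2([[0.0, -1.0], [-4.0, 0.0]]))   # prints (-2.0, 2.0)
```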
In more space dimensions, say three, an isolated shock between constant states is planar, so
we can simply orient our coordinate system so that the shock propagates in the x-direction, and
nothing depends on y and z. A similar statement holds for rarefaction waves. So there is nothing
fundamentally different from one space dimension, but of course solutions can become extremely
complicated in practice.

4.4 Example: Traffic flow
Look again at (292) but interpret it as the conservation of cars on a road, where u is now measured
in cars/length, and v is the velocity of the traffic flow, measured of course in length/time. In this
context, (292) is called the traffic flow equation.
Now let us look at a particular velocity law for traffic flow. There is a maximum density umax when
cars are bumper to bumper, and there is a maximum velocity vmax given by the speed limit. On
an open road, cars will go at the speed limit, but in dense traffic they will slow down until they
reach the maximum density at zero speed – a traffic jam. For simplicity, we assume v(u) to be the
linear function defined by these two points, namely

v(u) = vmax (1 − u/umax),   (330)

and hence

f(u) = u vmax (1 − u/umax).   (331)

The characteristic velocities are therefore

f'(u) = vmax (1 − 2u/umax).   (332)

Hence f'(u) is a decreasing function. Finally, the shock velocity is given by

s = [f(uL) − f(uR)] / (uL − uR)
  = vmax [uL (1 − uL/umax) − uR (1 − uR/umax)] / (uL − uR)
  = vmax [(uL − uR) − (uL² − uR²)/umax] / (uL − uR)
  = vmax (1 − (uL + uR)/umax).   (333)
Consider now the evolution of two kinds of initial data:
1) Assume that g(x) is a decreasing function. From (332) we see that f'(u) is a decreasing function of u. Hence f'(g(x)) is an increasing function of x. In other words, the characteristic velocities in the initial data increase with x. Hence the characteristics fan out from t = 0 and never intersect. Physically, the cars in front are in less dense traffic and hence move faster. Hence the initial density profile is stretched out. The solution remains smooth for all t > 0.
2) g(x) is an increasing function. Then the characteristics converge from t = 0. Physically, the
cars in front are in denser traffic and hence move more slowly, allowing the cars behind to catch up.
Hence the initial density profile is compressed and becomes steeper, until a moving shock forms in
the traffic flow at which each driver suddenly hits the brakes.
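A one-line implementation of (333) (with assumed unit values vmax = umax = 1) shows, for example, that a shock with denser traffic ahead moves backwards:

```python
# Traffic flow shock speed from (333).
def traffic_shock_speed(uL, uR, vmax=1.0, umax=1.0):
    """s = vmax * (1 - (uL + uR)/umax)."""
    return vmax * (1.0 - (uL + uR) / umax)

# Cars running into denser traffic ahead: uL = 0.3*umax behind, uR = 0.9*umax ahead.
s = traffic_shock_speed(0.3, 0.9)
print(s)   # negative: the back of the jam moves upstream
```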

4.5 Exercises
28. Homework 13: (Very short) In n space dimensions, the general form of a scalar conservation law is

u,t + Σ_{i=1}^{n} [f^i(u)],i = 0,   (334)

and the total mass is m = ∫ u dⁿx. What are the dimensions of u and f^i?

29. Homework 14: a) Use separation of variables to solve the Cauchy problem for the advection equation on the line,

u,t + v0 u,x = 0,  −∞ < x < ∞,  t ≥ 0,   (335)
u → 0 as x → ±∞,   (336)
u(x, 0) = g(x),   (337)

and hence show that the solution is u(x, t) = g(x − v0 t). (Hint: you need a Fourier transform.)
b) Now do the same using the method of characteristics.

30. Homework 15: Use the method of characteristics to show that the solution of the Burgers equation with linear initial data,

u,t + (u²/2),x = 0,   (338)
u(x, 0) = ax,   (339)

is

u(x, t) = ax / (1 + at).   (340)

For what a does a shock form at some t > 0, and what value of t is that?
31. Homework 16: (Very short) Find the shock solution of the Riemann problem for the Burgers equation

u,t + (u²/2),x = 0,  u(x, 0) = { uL, x < 0;  uR, x > 0 }.   (341)

32. Homework 17: (Very short) Find the rarefaction wave solution to the Riemann problem
for the Burgers equation, and find when it actually exists. Also find when the shock solution
is the correct solution to the Riemann problem instead.
33. Homework 18: (Longish, but important) For the traffic flow Riemann problem

u,t + [vmax u (1 − u/umax)],x = 0,  u(x, 0) = { uL = a umax, x < 0;  uR = b umax, x > 0 },   (342)

find the shock or rarefaction solution, as appropriate, and in either case the characteristics ξ(x0, t). Let vmax = 60 mph, and consider the two cases where a = 0.3, b = 0.9, and a = 0.9, b = 0.3.
34. For the solution of the traffic flow example, also find the car trajectories ξ̃(x0, t).
35. Consider the PDE problem

u,t + [f(u)],x = 0,  −∞ < x < ∞,  t ≥ 0,   (343)
u(x, 0) = g(x).   (344)

Determine if these data will form a shock, and if so compute the time ts when the shock first
forms.
Hint: a) Recall the method of characteristics for solving this problem graphically, and recall
that a shock forms when two characteristics cross. b) By drawing a picture, or otherwise,
convince yourself that the first characteristics that cross will be two neighbouring charac-
teristics. In other words, you cannot have characteristics starting at x01 and x02 crossing
without some characteristics starting from intermediate values of x0 crossing first, or at the
same time. c) Now look at characteristics starting from x0 and x0 + h, and find out where
and when they cross, working to leading order in h. You will find that to leading order the
crossing time ts = ts (x0 , h) does not depend on h. d) Now find the smallest value of ts (x0 )
and you are done.
36. Solve the PDE problem
u,t + (u²/2),x = 0,  −∞ < x < ∞,  t ≥ 0,   (345)
u(x, 0) = { uL, x < 0;  uL + (uR − uL) x/L, 0 < x < L;  uR, x > L }.   (346)

(Burgers' equation with continuous, piecewise linear initial data.)

Hint: You will need to solve this problem separately for the two cases uL < uR and uL > uR, as the solutions are qualitatively different. Sketch the initial data. Sketch some characteristics in the (x, t)-plane. Recall shocks and rarefaction waves. Recall the solution of another homework problem where u(x, 0) = ax for −∞ < x < ∞ (what is a here?). Try to glue together a solution from these ingredients in different regions of the (x, t)-plane.
37. Coursework 2, 2020/21:
In this question we consider the system of two conservation laws for the two variables u := (u, v),

u,t − v,x = 0,   (347)
v,t + (P(u)),x = 0,   (348)

where P(u) is a given function that obeys P'(u) < 0. We first find the subset of all Riemann problems that admit a solution with a single shock. We then use this to construct a solution with two shocks for the general Riemann problem. (We ignore the existence of solutions with two rarefaction waves, or one shock and one rarefaction wave, and the question which of these is the correct solution.)
a) In this part of the question, we find all the single-shock solutions

u(x, t) = { uL, x < st;  uR, x > st },   (349)

for restricted values of uL and uR. Write down the two components of the Rankine-Hugoniot condition in terms of the left and right states uL and uR and shock speed s. Show that uL and uR must obey the restriction

vL − vR = ∓ √( −[P(uL) − P(uR)] / (uL − uR) ) (uL − uR),   (350)

for either the upper or lower sign, and find the corresponding two shock speeds s± in terms of uL and uR. Show that what is inside the square root in (350) is strictly positive for any uL and uR, and find the limit of the shock speeds for small shocks, that is, as uL → uR.
b) We now find a solution with two shocks of the general Riemann problem, that is

u(x, t) = { uL, x < sL t;  u∗, sL t < x < sR t;  uR, x > sR t },   (351)

for arbitrary uL and uR. Sketch the regions in the (x, t)-plane where u takes each value. For given uL and uR, write down four equations (from using the Rankine-Hugoniot condition separately for each shock) that can be used in principle to determine u∗, v∗, sL and sR, to actually find a solution of the conservation law. Explain your choice of signs, and check that your sketch has the correct signs. Then find a single equation for u∗ in terms of uL and uR only (you do not need to solve this equation yet), and explain how you would complete the solution.
c) Assume now the specific flux function

P(u) = −4u,   (352)

and the specific numerical values

uL := (uL , vL ) = (1, 1), uR := (uR , vR ) = (4, 3). (353)

Plot accurately(!), in the (u, v)-plane, the two curves that link our uL to all possible u∗, and the two curves that link our uR to all possible u∗. Then find the correct u∗ graphically, as the intersection of one curve from each pair. Find the corresponding sL and sR. There are two intersections, but only one is the correct intermediate state u∗ for solving the Riemann problem: explain your choice. For what other Riemann problem would we need the other intersection u∗?
Verify your graphical solution by solving the Rankine-Hugoniot condition for u∗, v∗, sL and sR algebraically, as you planned in part b).
d) Repeat part c) for the same left and right state, but now for the flux function

P(u) = −e^u.   (354)

Use Mathematica, Maple, python, Matlab, or some other software to plot the four shock curves (in a single figure, chosen so the two intersections are clearly visible, and with axes clearly labelled). [Hint: use parametric plotting with parameter u∗.] Compute (numerically) the intermediate state u∗ and the two shock speeds sL and sR. [Hint: use FindRoot in Mathematica or something similar in other languages, or write your own Newton solver. Check against the plot that your numbers make sense.]
(This question is adapted from problem 13.7 of R. J. LeVeque, “Finite volume methods for
hyperbolic problems”.)

5 Elementary generalised functions
5.1 Test functions
Definition 5.1. A function φ(x) is a test function if it has the following properties: φ(x) and all its derivatives exist and are continuous at all points −∞ < x < ∞ (φ is smooth); the integrals over (−∞, ∞) of φ(x) and all its derivatives exist and are finite.
Note that since the integrals of a test function and all its derivatives exist, the test function and all its derivatives must vanish as x → ±∞.
A simple example of a test function is

φ(x) = e^{−x²},   (355)

which is analytic. Another example is the function

f(x) = { 0, x ≤ a;  e^{−1/[(x−a)(b−x)]}, a < x < b;  0, x ≥ b }   (356)

which is smooth but vanishes outside the interval (a, b) (and is not analytic at x = a and x = b).

5.2 The δ-function


Definition 5.2. A generalised function G (also called a distribution) is a linear map from test functions to real numbers. That is, it assigns to each test function φ a number G[φ]. Two generalised functions F and G are equal if F[φ] = G[φ] for all test functions φ.
This may seem a little abstract, but in practice we can think of generalised functions as "functions" that are only defined when integrated over a test function. We can then write

G[φ] := ∫_{−∞}^{∞} G(x) φ(x) dx.   (357)

In the following, we only use this integral notation, and never G[φ].


The most important generalised function is the δ-function, which in spite of its name is not a function, but a generalised function. Intuitively, one can think of it as a "function" that is only defined under an integral.
Definition 5.3. The δ-function is the generalised function defined by

∫_{−∞}^{∞} δ(x) φ(x) dx := φ(0)   (358)

for every test function φ(x).

Remark 5.4. Intuitively, we can think of the δ-function as an ordinary function with the properties

δ(x) = 0 if x ≠ 0,  ∫_{−∞}^{∞} δ(x) dx = 1.   (359)

Note that we do not, and cannot, assign a value to δ(0).


Remark 5.5. We can also define the δ-function as the limit of various sequences of regular functions f_ε that obey

lim_{ε→0} f_ε(x) = 0,  x ≠ 0   (360)

and

lim_{ε→0} ∫_{−∞}^{∞} f_ε(x) dx = 1.   (361)

Figure 2: Sequence of functions converging to the δ-function.

An example is

δ(x) = lim_{ε→0} (1/π) ε/(ε² + x²),   (362)

see Fig. 2. This definition of the δ-function as a limit has a physical interpretation in terms of a very large force acting over a very short time, while conveying a finite momentum. In applications the δ-function can be used to represent an impulse, e.g. when a string is hit with a hammer.

Remark 5.6. More generally, if f(x) is any function such that ∫_{−∞}^{∞} f(x) dx = 1, which implies that f(x) → 0 as |x| → ∞, then

lim_{ε→0} (1/ε) f(x/ε) = δ(x).   (363)

Say φ(x) is a test function. Then, with the change of variable y = x/ε,

lim_{ε→0} ∫_{−∞}^{∞} (1/ε) f(x/ε) φ(x) dx = lim_{ε→0} ∫_{−∞}^{∞} f(y) φ(εy) dy = ∫_{−∞}^{∞} f(y) φ(0) dy = φ(0) ∫_{−∞}^{∞} f(y) dy = φ(0).   (364)
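This limit can be checked numerically (a sketch with an assumed midpoint-rule quadrature, using the Lorentzian sequence (362) and the Gaussian test function (355)):

```python
import math

def f_eps(x, eps):
    """The Lorentzian nascent delta of (362)."""
    return eps / (math.pi * (eps ** 2 + x ** 2))

def integrate(g, a, b, n=100000):
    """Simple midpoint rule, adequate here."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

phi = lambda x: math.exp(-x ** 2)   # the test function (355)
for eps in (1.0, 0.1, 0.01):
    I = integrate(lambda x: f_eps(x, eps) * phi(x), -50.0, 50.0)
    print(eps, I)   # tends to phi(0) = 1 as eps -> 0
```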

5.3 The Heaviside and signum functions


Definition 5.7. The Heaviside function is the generalised function defined by

∫_{−∞}^{∞} H(x) φ(x) dx := ∫_{0}^{∞} φ(x) dx   (365)

for all test functions φ(x).


Remark 5.8. It is easy to see that this is equivalent to

H(x) = { 1, x > 0;  0, x < 0 }   (366)

but note that we do not need to assign a value to H(0).


Remark 5.9. We have

∫_{−∞}^{x} δ(y) dy = H(x).   (367)

Figure 3: The Heaviside function.

Because δ(x) = 0 for x ≠ 0, it only matters whether y = 0 is part of the integration domain −∞ < y < x. Note also that the left-hand side of this equation is not defined for x = 0, and hence for consistency we cannot give H(x) a value at x = 0 either.

In applications the Heaviside function is often used as a “switch”. For example, it can be used
to mathematically model an electrical circuit where the power is switched on at a specific moment
in time.
Definition 5.10. The signum function sgn(x) is defined by

sgn(x) = H(x) − H(−x).   (368)

Hence we have

sgn(x) = { 1, x > 0;  −1, x < 0 }.   (369)

Clearly sgn(x) is just the sign of the number x, but sgn(0) is not defined. Considering sgn(x) as a generalised function allows us to show that

sgn(x) = d|x|/dx   (370)

and that

d sgn(x)/dx = 2 δ(x).   (371)

5.4 Generalised functions and derivatives


We are going to use generalised functions to solve inhomogeneous differential equations. In manipulations we will often need the derivative, G'(x), of a generalised function, G(x).
Definition 5.11. The derivative G'(x) of the generalised function G(x) is defined by

∫_{−∞}^{∞} G'(x) φ(x) dx := −∫_{−∞}^{∞} G(x) φ'(x) dx   (372)

for all test functions φ(x).

Figure 4: Sequence of functions converging to the derivative of the δ-function.

Remark 5.12. This definition is essentially integration by parts (since test functions vanish at ±∞). The right-hand side of this equation is always known, since that is how G(x) is defined, and if φ is a test function so is φ'.
Example 5.13. In this sense we have

H'(x) = δ(x),   (373)

because

∫_{−∞}^{∞} H'(x) φ(x) dx = −∫_{−∞}^{∞} H(x) φ'(x) dx = −∫_{0}^{∞} φ'(x) dx = −φ(x)|_{0}^{∞} = φ(0).   (374)
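The same computation can be done numerically (an assumed quadrature sketch): by Definition 5.11, −∫ H(x) φ'(x) dx should return φ(0) = 1 for the Gaussian test function (355).

```python
import math

phi = lambda x: math.exp(-x ** 2)            # the test function (355)
dphi = lambda x: -2.0 * x * math.exp(-x ** 2)  # its derivative

# Midpoint rule on [0, L]; phi' is negligible beyond L = 10.
n, L = 100000, 10.0
h = L / n
val = -h * sum(dphi((k + 0.5) * h) for k in range(n))   # = -int_0^inf phi'
print(val, phi(0.0))   # both ~ 1
```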

Remark 5.14. The derivative of the δ-function, δ'(x), is defined by

∫_{−∞}^{∞} δ'(x) φ(x) dx = −∫_{−∞}^{∞} δ(x) φ'(x) dx = −φ'(0)   (375)

for all test functions φ(x).


We can also define δ' as the limit of a sequence of functions such as

δ'(x) = lim_{ε→0} d/dx [ (1/π) ε/(ε² + x²) ] = lim_{ε→0} [ −(2/π) εx/(ε² + x²)² ]   (376)

as shown in Fig. 4.

5.5 Properties of the δ-function


All of the following properties are a consequence of the definition of the δ-function and the standard properties of integrals. Recall that to prove F(x) = G(x) for two generalised functions F and G, we really have to prove that ∫_{−∞}^{∞} F(x) φ(x) dx = ∫_{−∞}^{∞} G(x) φ(x) dx for any test function φ(x). Typically, this will involve a change of variable, and working from both sides towards a number.

1. ∫_{−∞}^{∞} δ(x) φ(x) dx = φ(0)

2. ∫_{a}^{b} δ(x) φ(x) dx = φ(0) for a < 0 < b
3. ∫_{−∞}^{∞} φ(x) δ(x − a) dx = φ(a)

4. δ(−x) = δ(x)

5. δ(ax) = δ(x)/|a|

6. δ(a² − x²) = [δ(x − a) + δ(x + a)]/(2|a|)

7. δ(f(x)) = Σ_i δ(x − x_i)/|f'(x_i)|, where the x_i are the simple zeros of f(x)

8. x δ(x) = 0

9. g(x) δ(x) = g(0) δ(x), provided g(x) is continuous and g(0) exists.

10. H'(x) = δ(x)

11. H(x) = ∫_{−∞}^{x} δ(y) dy

12. ∫_{−∞}^{∞} δ'(x) φ(x) dx = −φ'(0)

13. (g(x) δ(x))' = g(0) δ'(x) (the naive product rule does not apply)
Example 5.15. Let us take a closer look at property 6. Consider δ(a² − x²) as a generalised function and let it operate on a test function φ(x), i.e. evaluate

I = ∫_{−∞}^{∞} φ(x) δ(a² − x²) dx.   (377)

Change variables to y = a² − x² and use

x = √(a² − y) for x > 0,   (378)
x = −√(a² − y) for x < 0.   (379)

Then we get

I = ∫_{x=0}^{x=∞} φ(x) δ(y) [dy / (2√(a² − y))] + ∫_{x=−∞}^{x=0} φ(x) δ(y) [dy / (2√(a² − y))]
  = ∫_{−∞}^{a²} φ(√(a² − y)) δ(y) [dy / (2√(a² − y))] + ∫_{−∞}^{a²} φ(−√(a² − y)) δ(y) [dy / (2√(a² − y))]
  = (1/(2|a|)) [φ(|a|) + φ(−|a|)] = ∫_{−∞}^{∞} (1/(2|a|)) [δ(x − a) + δ(x + a)] φ(x) dx.   (380)
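Property 6 can also be sanity-checked numerically (a sketch; the narrow Gaussian standing in for δ and all parameter values are our own choices):

```python
import math

eps = 1e-2   # width of the nascent delta (an assumed value)
d = lambda u: math.exp(-u ** 2 / (2 * eps ** 2)) / (eps * math.sqrt(2 * math.pi))
phi = lambda x: math.exp(-(x - 0.5) ** 2)   # an asymmetric test function
a = 2.0

# Midpoint rule for the left-hand side of property 6 on [-6, 6].
n, L = 400000, 6.0
h = 2 * L / n
lhs = h * sum(phi(-L + (k + 0.5) * h) * d(a ** 2 - (-L + (k + 0.5) * h) ** 2)
              for k in range(n))
rhs = (phi(a) + phi(-a)) / (2 * abs(a))
print(lhs, rhs)   # should agree to several digits
```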

Example 5.16. To show property 11, let Φ(x) be any function such that Φ'(x) = φ(x). Then

∫_{−∞}^{∞} φ(x) (∫_{−∞}^{x} δ(y) dy) dx = [ Φ(x) ∫_{−∞}^{x} δ(y) dy ]_{−∞}^{∞} − ∫_{−∞}^{∞} Φ(x) δ(x) dx
  = Φ(∞) ∫_{−∞}^{∞} δ(y) dy − Φ(−∞) ∫_{−∞}^{−∞} δ(y) dy − Φ(0)
  = Φ(∞) − Φ(0) = ∫_{0}^{∞} φ(x) dx = ∫_{−∞}^{∞} φ(x) H(x) dx,   (381)

for all test functions φ.
5.6 Exercises
38. Homework 19: Show that lim_{ε→0} f_ε(x) = δ(x), where

f_ε(x) := (1/π) ε/(ε² + x²).   (382)

39. Prove that, in the sense of generalised functions, δ(−x) = δ(x).

40. Prove that

∫_{−∞}^{∞} δ(x − y) f(x) dx = f(y).   (383)

41. Show that ∫_{a}^{b} δ(x) φ(x) dx = φ(0) for a < 0 < b, ∫_{a}^{b} δ(x) φ(x) dx = −φ(0) for b < 0 < a, and otherwise zero, or undefined if a = 0 or b = 0.

6 Green’s functions for ODEs
In this Section we introduce the concept of Green's function, which can be used to solve inhomogeneous differential equations. Since the technique readily generalises from ODEs to PDEs, we will first consider the simpler case of ODEs. We begin with the general definition, and then consider some examples.
Definition 6.1. A Green’s function for the inhomogeneous linear ODE

L y(t) = f (t), (384)

where L = L(t, d/dt) is any homogeneous linear ordinary di↵erential operator, is a generalised
function G(t, s) of two variables that satisfies

Lt G(t, s) = (t s), (385)

where Lt signifies that the di↵erential operator L acts on the variable t, not s, that is Lt :=
L(t, @/@t).
Remark 6.2. The reason for this definition is this: if we put

y(t) = ∫_{−∞}^{∞} G(t, s) f(s) ds,   (386)

we find that

L y(t) = ∫_{−∞}^{∞} L_t G(t, s) f(s) ds = ∫_{−∞}^{∞} δ(t − s) f(s) ds = f(t),   (387)

so that (386) is a solution of (384).


Remark 6.3. The above definition of a Green's function becomes unique only when we complement it with appropriate boundary conditions, either on the ODE, or equivalently on the Green's function. Clearly, two different Green's functions for the same problem must differ by a solution of the homogeneous problem L_t G = 0.

6.1 A simple example: first-order linear ODE with constant coefficients


Example 6.4. A Green’s function for the first-order linear ODE with constant coefficients,

ẏ + ay = f (t) (388)

is a generalised function G(t, s) of two variables t and s that satisfies

G,t + aG = (t s). (389)

To find the Green’s function G(t, s) for (388), we note that for t 6= s we have

G,t + aG = 0, (390)

and hence we have (


at
A(s)e for t < s
G(t, s) = at
(391)
B(s)e for t > s,
where A(s) and B(s) are integration constants. (They are functions of s, rather than true constants
because, although G(t, s) obeys an ODE in the single variable t, it also depends on the parameter
s, and so these integration “constants” can depend on s. You may remember seeing something
similar when you learned how to solve exact PDEs by integration.)
We now recall that a δ-function is the derivative of the Heaviside function, i.e. of a jump discontinuity of unit size. Thus it will be enough to make G(t, s) jump by one as we go across t = s. Thus we want

lim_{t→s+} G(t, s) = lim_{t→s−} G(t, s) + 1,   (392)

Figure 5: The Green's function H(t − 1) e^{−(t−1)}.

or

B(s) e^{−as} = A(s) e^{−as} + 1,   (393)

from which it follows that

B(s) = A(s) + e^{as}.   (394)

Thus we have

G(t, s) = { A(s) e^{−at}, t < s;  A(s) e^{−at} + e^{−a(t−s)}, t > s },   (395)

which can be written as

G(t, s) = A(s) e^{−at} + H(t − s) e^{−a(t−s)}.   (396)
Then we see that (386) implies

y(t) = ∫_{−∞}^{∞} G(t, s) f(s) ds
     = e^{−at} ∫_{−∞}^{∞} A(s) f(s) ds + ∫_{−∞}^{∞} H(t − s) e^{−a(t−s)} f(s) ds
     = C e^{−at} + ∫_{−∞}^{t} e^{−a(t−s)} f(s) ds.   (397)

The term C e^{−at} is of course the general solution of (388) with f(t) = 0, or in other words, the complementary function. Like an ordinary solution to this ODE, the Green's function needs boundary conditions to be uniquely defined.
If we choose A(s) = 0, and hence C = 0, then

G(t, s) = H(t − s) e^{−a(t−s)} = { 0, t < s;  e^{−a(t−s)}, t > s }.   (398)

This Green’s function represents the reaction of the system to a unit impulse at time t = s where
the system is at rest prior to the impulse; see Figs. 5 and 6. It is called the causal Green’s
function, defined by the property that

G(t, s) = 0 for t < s. (399)

Figure 6: The Green's function H(t − s) e^{−(t−s)} for −2 < s < 2, −2 < t < 2.

The solution (386) with (398) is

y(t) = ∫_{−∞}^{t} e^{−a(t−s)} f(s) ds.   (400)

It is called the causal solution because it depends only on f(s) for s < t, that is, it depends only on the values of f(s) before the present time. Thus this solution can be said to be caused by the driving force f(t). This is in contrast to the general case (386), where y(t) can depend on the values of f(s) for both s < t (i.e., values that have already occurred) and t < s (i.e., values that have yet to happen). One is typically mainly interested in the causal solution to a physical problem. In general, we obtain such solutions by imposing G(t, s) = 0 for t < s. The factor H(t − s) in (398) obviously makes sure of that.
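A short numerical sketch (with an assumed forcing f(s) = H(s) and a = 1, for which the exact causal solution (400) for t > 0 is y(t) = 1 − e^{−t}):

```python
import math

def causal_solution(t, a, f, n=20000):
    """Midpoint-rule evaluation of y(t) = int_{-inf}^t e^{-a(t-s)} f(s) ds;
    the lower limit is cut to 0 because this f vanishes for s < 0."""
    h = t / n
    return h * sum(math.exp(-a * (t - (k + 0.5) * h)) * f((k + 0.5) * h)
                   for k in range(n))

a, t = 1.0, 2.0
y = causal_solution(t, a, lambda s: 1.0)
print(y, 1.0 - math.exp(-t))   # the two should agree closely
```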

6.2 Another example: the harmonic oscillator


Definition 6.5. The causal Green’s function for the harmonic oscillator problem
ÿ + ! 2 y = f (t) (401)
is a function G(t, s) that satisfies
G,tt + ! 2 G = (t s). (402)
and G = 0 for t < s.
We take G(t, s) = 0 for t < s. For t > s we have
G,tt + ! 2 G = 0, (403)
so that
G(t, s) = A(s)ei!t + B(s)e i!t
. (404)
The -function in (402) must come from the term G,tt , which implies that G,t must jump by one
as we go across t = s. This implies that G(t, s) must be continuous across t = s. If G(t, s) was
discontinuous then G,t would have a -function and G,tt would contain 0 (t).
This can be derived more formally in the following way. Assume that the Green’s function is
continuous, but that its derivative may have discontinuities. Then integrate (402) from s ✏ to
s + ✏. This gives
Z s+✏ Z s+✏
⇥ ⇤
G,tt + ! 2 G dt = G,t |s+✏ G,t |s ✏ = (t s) dt = 1. (405)
s ✏ s ✏

(In the limit ε → 0, the integral over ω²G vanishes.) This method can be used also for more complicated equations.
As we let t → s+ we have

lim_{t→s+} G(t, s) = A(s) e^{iωs} + B(s) e^{−iωs},   (406)

which must be zero since G(t, s) = 0 for t < s and we want G(t, s) to be continuous. Thus

A(s) e^{iωs} + B(s) e^{−iωs} = 0.   (407)

We have G,t = 0 for t < s and we want G,t to jump by one as we go across t = s, so we must have

lim_{t→s+} G,t = iω A(s) e^{iωs} − iω B(s) e^{−iωs} = 1.   (408)

Solving for A(s) and B(s) we find that

A(s) = e^{−iωs}/(2ωi),  B(s) = −e^{iωs}/(2ωi),   (409)

and hence (see Fig. 7)

G(t, s) = { 0, t < s;  (1/ω) sin ω(t − s), t > s } = (1/ω) H(t − s) sin ω(t − s).   (410)

Thus, the causal solution of (401) is

y(t) = (1/ω) ∫_{−∞}^{t} f(s) sin ω(t − s) ds.   (411)

Figure 7: The Green's function (410) as a function of s and t.
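The analogous numerical sketch for the oscillator (assumed forcing f(t) = H(t), for which (411) gives y(t) = (1 − cos ωt)/ω² for t > 0):

```python
import math

def oscillator_response(t, w, n=20000):
    """y(t) = (1/w) int_0^t sin(w (t - s)) ds for f = H, by the midpoint rule."""
    h = t / n
    return (h / w) * sum(math.sin(w * (t - (k + 0.5) * h)) for k in range(n))

w, t = 2.0, 3.0
print(oscillator_response(t, w), (1.0 - math.cos(w * t)) / w ** 2)   # agree closely
```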

6.3 The general second order linear ODE with constant coefficients
The procedure from the previous Section works for more general equations. Take the general linear second order ODE with constant coefficients,

ÿ + αẏ + βy = f(t).   (412)

The causal Green's function is the solution of

G,tt + αG,t + βG = δ(t − s),   (413)
which satisfies G(t, s) = 0 for t < s. Finding G(t, s) involves solving

G,tt + αG,t + βG = 0   (414)

for t > s, and the solution is

G(t, s) = A(s) e^{λ1 t} + B(s) e^{λ2 t},   (415)

where λ1 and λ2 are the roots of the quadratic equation

λ² + αλ + β = 0.   (416)

Thus G(t, s) may be written as

G(t, s) = { 0, t < s;  A(s) e^{λ1 t} + B(s) e^{λ2 t}, t > s }.   (417)

(Here we have assumed that λ1 ≠ λ2.) In order to obtain the term δ(t − s) we need G(t, s) to be continuous at t = s and G,t(t, s) to have a jump of magnitude one. Thus we find that

A(s) e^{λ1 s} + B(s) e^{λ2 s} = 0   (418)

from the continuity of G(t, s) at t = s, and that

λ1 A(s) e^{λ1 s} + λ2 B(s) e^{λ2 s} = 1   (419)

from the jump in G,t. These are two equations for the functions A(s) and B(s), which we can solve to find them. Once we have done this, the Green's function is

G(t, s) = H(t − s) [ A(s) e^{λ1 t} + B(s) e^{λ2 t} ].   (420)
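Solving (418)–(419) explicitly (a sketch, assuming λ1 ≠ λ2) gives A(s) = e^{−λ1 s}/(λ1 − λ2) and B(s) = −e^{−λ2 s}/(λ1 − λ2), so that G(t, s) = H(t − s) [e^{λ1(t−s)} − e^{λ2(t−s)}]/(λ1 − λ2). The defining continuity and unit-jump properties can then be checked numerically:

```python
import math

def G(t, s, l1, l2):
    """Causal Green's function of (412), assuming distinct real roots l1 != l2."""
    if t < s:
        return 0.0
    return (math.exp(l1 * (t - s)) - math.exp(l2 * (t - s))) / (l1 - l2)

# With l1, l2 = -1, -2 (i.e. alpha = 3, beta = 2): G is continuous at t = s
# and its t-derivative jumps by one there.
l1, l2, s, h = -1.0, -2.0, 0.5, 1e-6
print(G(s + h, s, l1, l2))                          # ~ 0: continuity
slope = (G(s + h, s, l1, l2) - G(s, s, l1, l2)) / h
print(slope)                                        # ~ 1: unit jump in G,t
```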

6.4 Initial-value problems


Rather than starting at t = −∞ and looking for the causal solution, we often want to solve the initial value problem. We will illustrate this with the example of the general second-order linear ODE (not restricting to constant coefficients). It should be clear afterwards how to apply the same method to a first-order or higher-order linear ODE.
So consider the initial value problem

ÿ + α(t)ẏ + β(t)y = f(t),  y(0) = A,  ẏ(0) = B   (421)

for t > 0. We can extend the Green's function methods described above to deal with this situation. We define a new function

z(t) = H(t) y(t),   (422)

so that z(t) equals y(t) for t > 0 and is zero for t < 0. Then

ż(t) = y(0) δ(t) + ẏ(t) H(t) = A δ(t) + ẏ(t) H(t)   (423)

and

z̈(t) = A δ'(t) + ẏ(t) δ(t) + ÿ(t) H(t) = A δ'(t) + B δ(t) + ÿ(t) H(t),   (424)

and thus

z̈ + α(t)ż + β(t)z = A δ'(t) + [α(0)A + B] δ(t) + H(t) f(t) =: f̃(t).   (425)

Now assume that G(t, s) is the causal Green's function, so that the causal solution of (425) is

z(t) = ∫_{−∞}^{t} G(t, s) f̃(s) ds
     = ∫_{−∞}^{t} G(t, s) { A δ'(s) + [α(0)A + B] δ(s) + H(s) f(s) } ds
     = H(t) { −A G,s(t, 0) + [α(0)A + B] G(t, 0) + ∫_{0}^{t} G(t, s) f(s) ds }.   (426)
The factor H(t) multiplies the first two terms because δ(s) and δ'(s) both vanish for s ≠ 0, and hence for t < 0 they vanish everywhere inside the integration range. Similarly, H(t) also multiplies the third term because H(s) = 0 for s < 0, and so if t < 0, it vanishes everywhere inside the integration range. Finally, if t > 0, H(s) = 1 still only for s > 0, and so we can remove it if we limit the integration range to s > 0 (and s < t).
For t > 0 we have y(t) = z(t) and hence
Z t
y(t) = AG,s (t, 0) + [↵(0)A + B]G(t, 0) + G(t, s)f (s) ds. (427)
0
00 0
G(t, 0) obeys the homogeneous equation y + ↵y + y = 0 for t > 0 because G(t, s) does
for t > s and s = 0 here. Similarly, G,s (t, 0) also obeys the homogeneous equation. The third
term obeys the inhomogeneous equation y 00 + ↵y 0 + y = f for t > 0 by construction. Therefore
the first two terms can be considered as the “complementary function” and the third term as the
“particular integral”.
(427) obeys the initial conditions by construction, but it is instructive to check this explicitly.
Note that at t = s, G(t, s) is continuous, and G,t(t, s) jumps by 1. This implies that

G(t, s) = H(t − s)[(t − s) + (t − s)² g(t, s)]    (428)

where g(t, s) is some regular function of t and s. But from this we have that

G(0⁺, 0) = 0,   G,s(0⁺, 0) = −1,   G,t(0⁺, 0) = 1,   G,ts(0⁺, 0) = −2g(0, 0) = −G,tt(0⁺, 0).    (429)

Furthermore, G(t, s) obeys the homogeneous ODE G,tt + αG,t + βG = 0 for t > s. Therefore

G,tt(0⁺, 0) = −α(0)G,t(0⁺, 0) − β(0)G(0⁺, 0) = −α(0).    (430)

Using these results, we have

y(0⁺) = −A G,s(0⁺, 0) + (α(0)A + B)G(0⁺, 0) = A,    (431)

y′(0⁺) = −A G,st(0⁺, 0) + (α(0)A + B)G,t(0⁺, 0)
       = A G,tt(0⁺, 0) + α(0)A + B = B.    (432)
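The formula (427) can also be sanity-checked numerically. The sketch below (my own choice of example, not from the notes) takes the constant-coefficient problem ÿ + 5ẏ + 6y = e^{−t}, whose causal Green's function is G(t, s) = e^{−2(t−s)} − e^{−3(t−s)} for t > s, evaluates (427) by quadrature, and compares it with a direct Runge-Kutta integration of the initial value problem:

```python
# Check of (427) on the assumed example y'' + 5y' + 6y = e^{-t}, y(0)=A, y'(0)=B,
# with causal Green's function G(t,s) = e^{-2(t-s)} - e^{-3(t-s)} for t > s.
import math

A, B = 1.3, -0.4
alpha0 = 5.0                                   # alpha(0) in (427)
f = lambda t: math.exp(-t)

G  = lambda t, s: math.exp(-2*(t - s)) - math.exp(-3*(t - s)) if t >= s else 0.0
Gs = lambda t, s: 2*math.exp(-2*(t - s)) - 3*math.exp(-3*(t - s))  # G_{,s}, t > s

def y_formula(t, n=2000):
    """Evaluate (427), with the trapezoidal rule for the integral term."""
    h = t/n
    integral = sum(0.5*h*(G(t, i*h)*f(i*h) + G(t, (i + 1)*h)*f((i + 1)*h))
                   for i in range(n))
    return -A*Gs(t, 0.0) + (alpha0*A + B)*G(t, 0.0) + integral

def y_rk4(t_end, n=4000):
    """Reference solution of the IVP by classical RK4."""
    y, v, t = A, B, 0.0
    h = t_end/n
    rhs = lambda t, y, v: (v, f(t) - 5*v - 6*y)
    for _ in range(n):
        k1 = rhs(t, y, v)
        k2 = rhs(t + h/2, y + h/2*k1[0], v + h/2*k1[1])
        k3 = rhs(t + h/2, y + h/2*k2[0], v + h/2*k2[1])
        k4 = rhs(t + h, y + h*k3[0], v + h*k3[1])
        y += h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        v += h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
        t += h
    return y

assert abs(y_formula(2.0) - y_rk4(2.0)) < 1e-4
```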

6.5 Exercises

42. Homework 20: (Very short) Check by differentiation that

G(t, s) = A(s)e^{−at} + H(t − s)e^{−a(t−s)}    (433)

obeys

G,t + aG = δ(t − s).    (434)

43. Homework 21: (Very short) Check by differentiation that

G(t, s) = (1/ω) H(t − s) sin ω(t − s)    (435)

obeys

G,tt + ω²G = δ(t − s).    (436)

44. (Standard exam question) a) Find the causal Green's function for dy/dt + y/t = f(t). b) Use
it to solve dy/dt + y/t = f(t) for t > 1, with y(1) = A.

45. Show that the Green's function G(t, s) = ω⁻¹ H(t − s) sin ω(t − s) satisfies

G,tt + ω²G = δ(t − s)    (437)

in the sense that

∫_{−∞}^{∞} G(t, s) (φ̈(t) + ω²φ(t)) dt = φ(s)    (438)

for every test function φ(t).
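The weak identity (438) can be confirmed numerically (this is not the requested proof, just a check). The sketch below uses ω = 2, s = 0.5 and the rapidly decaying test function φ(t) = e^{−t²}, all my own choices:

```python
# Numerical check of (438): G(t,s) = H(t-s) sin(omega (t-s))/omega applied to
# phi(t) = exp(-t^2) (not compactly supported, but it decays fast enough
# for the quadrature below).
import math

w, s = 2.0, 0.5
phi   = lambda t: math.exp(-t*t)
phidd = lambda t: (4*t*t - 2.0)*math.exp(-t*t)      # phi''
G = lambda t: math.sin(w*(t - s))/w if t >= s else 0.0

n, tmax = 40000, 12.0
h = (tmax - s)/n
# midpoint rule; the integrand vanishes for t < s, so integrate from s only
val = sum(h * G(s + (i + 0.5)*h) * (phidd(s + (i + 0.5)*h) + w*w*phi(s + (i + 0.5)*h))
          for i in range(n))
assert abs(val - phi(s)) < 1e-4                     # equals phi(s), as (438) claims
```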

46. (Standard exam questions) a) Find the Green's function and causal solutions of

ÿ + 5ẏ + 6y = f(t).    (439)

Find the particular causal solutions when b) f(t) = e^{−t} and c) f(t) = H(t)e^{−t}.

47. (All standard exam questions) Use the method of Green's functions to find the causal solutions of:

(a) ÿ + 4ẏ + 4y = H(t) + H(−t)e^{2t};
(b) ÿ + 4ẏ + 4y = H(−t) sin t;
(c) ÿ + 4ẏ + 3y = 2 sin 5t;
(d) ÿ + 6ẏ + 9y = t²;
(e) ÿ + 3ẏ + 2y = δ̇(t);
(f) y‴ + y′ = f(x)

Hint for the last one: you can either solve ∂³G/∂x³ + ∂G/∂x = δ(x − y) with G = 0 for
x < y, or put f(x) = F′(x) and integrate the equation once.

7 Green's functions for the Poisson and Helmholtz equations

After warming up with inhomogeneous linear ODEs, we now turn to inhomogeneous linear PDEs.
In this Section we consider

Lu = f(x)    (440)

where L is the Laplacian Δ or the Helmholtz operator Δ + k², and f(x) is a given function. The
nature of the solution depends on the boundary conditions, and in this Section we assume that
solutions are required in unbounded space.

7.1 Three-dimensional δ-function

Definition 7.1. The three-dimensional δ-function can be defined by

δ(x) = δ(x₁)δ(x₂)δ(x₃),    (441)

or by the properties

δ(x) = 0 for x ≠ 0,  and  ∫ δ(x) d³x = 1,    (442)

or by the property

∫ δ(x)φ(x) d³x = φ(0)    (443)

for all test functions φ(x).

Proposition 7.2. In the spherical polar coordinates (30), with r² = x₁² + x₂² + x₃², we have

δ(x) = δ(r)/(4πr²).    (444)
Proof. We will show that (444) obeys (443). It makes sense to carry out the integral ∫ d³x in
spherical polar coordinates. The integration range is 0 ≤ r < ∞, and we face the problem that
∫₀^∞ δ(r) dr is not defined. But ∫₀^∞ δ(r − ε) dr = 1 for all ε > 0, as now r − ε = 0 occurs inside the
integration range.
So instead of (444) we consider

F_ε(x) = δ(r − ε)/(4πr²)    (445)

for ε > 0, and then take ε → 0⁺ at the end. Now

∫ F_ε(x)φ(x) d³x = ∫₀^{2π} ∫₀^π ∫₀^∞ [δ(r − ε)/(4πr²)] φ(r, θ, ϕ) r² dr sin θ dθ dϕ    (446)
                 = (1/4π) ∫₀^{2π} ∫₀^π φ(ε, θ, ϕ) sin θ dθ dϕ,    (447)

where the two factors of r² have cancelled, and we have carried out the integration over r. The
remaining integral over θ and ϕ is the average of φ over a sphere of radius ε. As ε → 0, and the
sphere shrinks to a point, this average becomes simply φ(0). We have shown that lim_{ε→0} F_ε(x) =
δ(x). We shall use this trick of "protecting" δ(r) again.
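The key step of this proof, that the average of φ over a sphere of radius ε tends to φ(0) as ε → 0, is easy to check numerically; the polynomial φ below is my own test function:

```python
# Average of a smooth function over a sphere of radius eps -> phi(0) as eps -> 0.
# phi is an arbitrary polynomial (my own choice of test function).
import math

phi = lambda x, y, z: 1.0 + x - 2*y + 3*z + x*y + z*z

def sphere_avg(eps, n=200):
    """Midpoint-rule average of phi over the sphere |x| = eps."""
    total, wsum = 0.0, 0.0
    for i in range(n):
        th = (i + 0.5)*math.pi/n          # polar angle
        for j in range(2*n):
            ph = (j + 0.5)*math.pi/n      # azimuth, step pi/n covers [0, 2 pi)
            w = math.sin(th)              # area weight sin(theta) dtheta dphi
            total += w*phi(eps*math.sin(th)*math.cos(ph),
                           eps*math.sin(th)*math.sin(ph),
                           eps*math.cos(th))
            wsum += w
    return total/wsum

# the odd terms average to zero over the sphere; the z^2 term contributes
# eps^2/3, which vanishes as eps -> 0
assert abs(sphere_avg(1e-3) - phi(0.0, 0.0, 0.0)) < 1e-5
```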
Remark 7.3. Although we have said that

g(x)δ(x) = g(0)δ(x),    (448)

this only applies to functions g(x) which are continuous at x = 0. An expression of the form
g(x)δ(x) when g(x) is not continuous at x = 0 must be interpreted as a generalised function in its
own right, that is, under an integral. Thus δ(r)/r² is a generalised function and it is certainly not
equal to δ(r)/0².

7.2 Free space Green's function for the Poisson equation

Definition 7.4. The free space problem for the Poisson equation is

Δu = f(x)    (449)

with the boundary condition at infinity

u(x) → 0 as |x| → ∞.    (450)

We assume that f(x) → 0 as |x| → ∞ so that there are no sources at infinity.

That is, we consider the effect of a source distribution that falls off far away from these sources,
and require a solution that also falls off. This fall-off condition takes the place of the Dirichlet,
Neumann or Robin boundary conditions. We can solve this problem in much the same way as we
found causal solutions for ODEs.

Definition 7.5. The free space Green's function G(x, y) for the Laplace equation satisfies

Δ_x G(x, y) = δ(x − y),    G(x, y) → 0 as |x − y| → ∞.    (451)

Here Δ_x indicates differentiation with respect to x = (x₁, x₂, x₃) and not with respect to
y = (y₁, y₂, y₃). If we put

u(x) = ∫ G(x, y)f(y) d³y,    (452)

then, since the integral is with respect to y and the derivatives are with respect to x,

Δ_x u = ∫ Δ_x G(x, y)f(y) d³y = ∫ δ(x − y)f(y) d³y = f(x).    (453)

Because the boundary conditions are at infinity, we can shift the source f(y) and the solution
u(x) about together without making a difference. Hence the Green's function G(x, y) depends
only on the relative position of x and y and so G(x, y) = G(x − y) = G(x′). Then the problem
for G becomes

Δ_{x′} G = δ(x′).    (454)

Because the problem is invariant under rotations, we expect G to be spherically symmetric, since
Δ_{x′} and δ(x′) are, so we look for a generalised function

G(x′) = G(|x′|) =: G(r).    (455)

Then, using (31) and (444), (454) becomes

(1/r²) ∂/∂r (r² ∂G/∂r) = (1/4π) δ(r)/r².    (456)

For r > 0, δ(r) = 0, so we find that

G = A/r + B  (for r > 0),    (457)

where A and B are constants. Since G → 0 as r → ∞, B = 0.
To find A we argue as follows. Since Δ_{x′} G = δ(x′), we must have

∫_V Δ_{x′} G d³x′ = 1    (458)

for any volume V that includes the origin. By the divergence theorem we also have

∫_V Δ_{x′} G d³x′ = ∫_S G,n d²x′,    (459)

where S is the surface of V.

Figure 8: Sphere of radius R, with surface element dS = (R sin θ dϕ)(R dθ) = R² sin θ dθ dϕ.

Now, choose V to be a sphere of radius R (see Fig. 8). The outward normal to the surface is
just the unit vector pointing from the origin to the point on the surface, and so

G,n = G,r = −A/r² = −A/R²    (460)

on the surface r = R of the sphere. Also, on the surface r = R, the surface element d²x′ is given
by R² sin θ dθ dϕ, so

∫₀^π ∫₀^{2π} (∂G/∂r)|_{r=R} R² sin θ dθ dϕ = 1.    (461)

Thus

−∫₀^π ∫₀^{2π} A sin θ dθ dϕ = 1,    (462)

and so

A = −1/(4π).    (463)

Thus

G = −(1/4π) (1/r) = −(1/4π) (1/|x′|) = −(1/4π) (1/|x − y|) = −(1/4π) 1/√((x₁ − y₁)² + (x₂ − y₂)² + (x₃ − y₃)²),    (464)
and the solution of the problem

Δu = f(x),   u → 0 as |x| → ∞    (465)

is

u(x) = −(1/4π) ∫ f(y) d³y / |x − y|,    (466)

or

u(x₁, x₂, x₃) = −(1/4π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(y₁, y₂, y₃) dy₁ dy₂ dy₃ / √((x₁ − y₁)² + (x₂ − y₂)² + (x₃ − y₃)²).    (467)
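A quick numerical sanity check (my own, not in the notes) that G = −1/(4πr) is indeed harmonic away from the origin and that ∇G has unit flux through any sphere around it, which is the content of (458)-(463):

```python
# G = -1/(4 pi r): harmonic away from the origin, unit flux through any sphere.
import math

g = lambda r: -1.0/(4*math.pi*r)
G3 = lambda x, y, z: g(math.sqrt(x*x + y*y + z*z))
h = 1e-3

# Laplacian by central differences at a generic point away from the origin
x0 = (0.7, -0.4, 0.5)
lap = sum((G3(*[c + h*(i == j) for j, c in enumerate(x0)])
           - 2*G3(*x0)
           + G3(*[c - h*(i == j) for j, c in enumerate(x0)]))/h**2 for i in range(3))
assert abs(lap) < 1e-4

# flux of grad G through the sphere r = R: 4 pi R^2 G'(R) = 1, cf. (458)-(463)
R = 1.3
flux = 4*math.pi*R*R*(g(R + h) - g(R - h))/(2*h)
assert abs(flux - 1.0) < 1e-5
```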

Note that G(x, y) is not defined at x = y. Rather, it is a generalised function. One useful
interpretation (see Section 7.4) is

G(x, y) = −(1/4π) lim_{ε→0} H(|x − y| − ε)/|x − y|    (468)
as shown in Fig. 9.

Figure 9: Generalised function interpretation of the Green's function. We consider r = |x − y| and
illustrate −H(r − ε)/r for a small non-zero ε.

7.3 Free space Green's function for the Helmholtz equation

Recall the derivation of the Helmholtz equation for time-periodic solutions with angular frequency
ω of the wave equation in Remark 1.13.

Figure 10: Radiation condition; waves move away from the source.

The Green's function for the Helmholtz equation satisfies

(Δ_x + k²)G(x, y) = δ(x − y).    (469)

Hence

u(x) = ∫ G(x, y)f(y) d³y    (470)
is a solution of (57). However, this solution depends on the boundary conditions, which we have not
yet specified. For the Helmholtz equation, it is natural to impose this boundary condition directly
on the Green's function. Physically, we expect waves to propagate away from the disturbance
generating them and not towards it. This gives us a radiation boundary condition, which
replaces the fall-off condition that u → 0 as |x| → ∞ used for Poisson's equation. We shall see
shortly what form this radiation condition takes for the Green's function.
As before, it is convenient to introduce x′ = x − y, in which case the problem becomes

(Δ_{x′} + k²)G = δ(x′),    (471)

which clearly has spherical symmetry. So, we look for a solution with G(x′) = G(r), and the
problem is then

(1/r) ∂²(rG)/∂r² + k²G = δ(r)/(4πr²),    (472)

since

Δ₃ f(r) = (1/r²) ∂/∂r (r² ∂f/∂r) = (1/r) ∂²(rf)/∂r²    (473)

when acting on a spherically symmetric function f(x) = f(|x|) = f(r). (The second equality
is a useful identity to remember.) So for r > 0 we have

∂²(rG)/∂r² + k²(rG) = 0,    (474)

which implies that rG = αe^{ikr} + βe^{−ikr}, or

G = (A/r)e^{ikr} + (B/r)e^{−ikr}.    (475)

Figure 11: Radiation condition; waves move away from the source. (The two panels illustrate an
outgoing profile f(r − ct) and an ingoing profile g(r + ct).)

If we consider Ge^{−ikct}, which is a solution of the wave equation if G is a solution of the
Helmholtz equation (remember Remark 1.13), we have

Ge^{−ikct} = (A/r)e^{ik(r−ct)} + (B/r)e^{−ik(r+ct)}.    (476)
Now any function f(r − ct) represents a wave moving away from r = 0 towards r → ∞ with
speed c as t increases, because f is constant on lines r − ct = C (see Fig. 11). On the other hand, a
function g(r + ct) represents a wave moving inwards, see Fig. 11. The δ-function in the problem
for G represents a disturbance at the origin. Physically we expect waves to propagate outward
away from this disturbance and not inward from infinity towards the disturbance. So the radiation
condition tells us that B = 0. Hence

G = (A/r)e^{ikr},   r > 0.    (477)

To find A in a handwaving way, we note that the δ-function at r = 0 in (Δ + k²)G = δ must be coming
from the 1/r term, as e^{ikr} is a regular function that takes the value one there. Put differently, at
x = 0 we can neglect k²G compared to δ(x). Hence by analogy with ΔG = δ we set A = −1/(4π).
We obtain

G(x′) = −(1/4πr) e^{ikr} = −(1/4π) e^{ik|x′|}/|x′|,    (478)

or

G(x, y) = −(1/4π) e^{ik|x−y|}/|x − y|
        = −(1/4π) exp(ik√((x₁ − y₁)² + (x₂ − y₂)² + (x₃ − y₃)²)) / √((x₁ − y₁)² + (x₂ − y₂)² + (x₃ − y₃)²).    (479)

Note that as k → 0 we recover the Green's function for the Poisson equation.
To summarize: the solution of the inhomogeneous Helmholtz problem

(Δ + k²)u = f(x),    (480)
f(x) → 0, |x| → ∞,    (481)

that satisfies the outgoing radiation boundary condition is given by

u(x) = −(1/4π) ∫ (f(y)/|x − y|) e^{ik|x−y|} d³y.    (482)

This represents the spatial part of an outgoing train of waves caused by a disturbance in the
region where f(x) ≠ 0.
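Using the radial identity (473), one can confirm numerically that G = −e^{ikr}/(4πr) satisfies the homogeneous Helmholtz equation for r > 0; a minimal sketch (my own, with an arbitrary choice of k):

```python
# Radial check that G = -e^{i k r}/(4 pi r) solves (Delta + k^2) G = 0 for r > 0,
# using Delta f(r) = (1/r) d^2(r f)/dr^2 from (473).  k is arbitrary (my choice).
import cmath

k = 2.0
G  = lambda r: -cmath.exp(1j*k*r)/(4*cmath.pi*r)
rG = lambda r: r*G(r)                    # r G = -e^{ikr}/(4 pi), smooth at r = 0

r, h = 0.8, 1e-4
lap = (rG(r + h) - 2*rG(r) + rG(r - h))/(h*h)/r   # (1/r)(r G)''
assert abs(lap + k*k*G(r)) < 1e-6                 # (Delta + k^2) G = 0 away from 0
```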

7.4 An alternative derivation

It is instructive to check by direct differentiation that the free space Green's functions we have
derived for the Poisson and Helmholtz equations actually obey the inhomogeneous PDEs they are
supposed to.
Let G(r) be the (generalised) function defined by

G(r) := lim_{ε→0} G_ε(r),   G_ε(r) := −H(r − ε) f(r)/(4πr),    (483)

where f(r) is smooth and bounded (so that f(r) times any test function is again a test function).
This is similar to the trick we have used in Sec. 7.1, where we replaced δ(r) by δ(r − ε) and took
ε → 0 at the end.
We now use the product rule to find

ΔG_ε = (1/r²) ∂/∂r [r² ∂/∂r (−H(r − ε) f(r)/(4πr))]    (484)
     = −(1/(4πr²)) ∂/∂r [r f(r) δ(r − ε) + (r f′(r) − f(r)) H(r − ε)].    (485)
To be sure that we evaluate the derivative of the term with the δ-function correctly, we integrate
it over a test function φ(r):

∫ (1/(4πr²)) ∂/∂r [r f(r) δ(r − ε)] φ(r) 4πr² dr = −∫ r f(r) δ(r − ε) φ′(r) dr    (486)
= −ε f(ε) φ′(ε) = ∫ (1/(4πr²)) ε f(ε) δ′(r − ε) φ(r) 4πr² dr.    (487)

The important point here is that the 1/r² term in (484) is cancelled by the factor of r² in the
volume element 4πr² dr in the integral (486), and so is not differentiated during the integration
by parts in r, but it comes back when we write (487) as a volume integral over 4πr² dr again. To
be quite sure, we could have carried out the entire calculation under a volume integral, but this is
the one step where this really matters. Putting this term together with the other, unproblematic,
terms in (485), we have

ΔG_ε = −(1/(4πr²)) [ε f(ε) δ′(r − ε) + (ε f′(ε) − f(ε)) δ(r − ε) + (r f″(r) + f′(r) − f′(r)) H(r − ε)].    (488)

Again, note that the 1/r² in front has not become a 1/ε². In the limit ε → 0,

Δ(−f(r)/(4πr)) = −f″(r)/(4πr) + f(0) δ(r)/(4πr²).    (489)

Hence we have proved that

Δ(−f(r)/(4πr)) = −f″(r)/(4πr) + f(0) δ(x).    (490)

Then for f(r) = 1 we have

Δ(−1/(4πr)) = δ(x),    (491)

which confirms our result for the Poisson equation, and for f(r) = e^{ikr} we have

(Δ + k²)(−e^{ikr}/(4πr)) = δ(x),    (492)

which confirms our result for the Helmholtz equation.

7.5 The large distance and long wavelength approximations

If the source distribution is nonzero only in a bounded region, or if it falls off sufficiently rapidly
with distance, one can deduce the approximate behaviour of the solution at large distance from
the source without solving the full problem.
It is important to note that two related but separate approximations are involved here. For the Poisson
equation, the point where we evaluate the solution must be much further away from the source
than the size of the source (large distance approximation). For the Helmholtz equation, we need
the large distance approximation and a separate approximation, namely that the size of the source
is much smaller than the wavelength (long wavelength approximation). (Recall that the Helmholtz
equation is about waves of a specific frequency and hence wavelength.)
We study these two approximations separately.

Example 7.6. Consider the free-space Poisson problem

Δu = e^{−|x|²/ℓ²} =: f(x),   u(x) → 0 as |x| → ∞.    (493)

What approximation can we make for u(x) as |x| ≫ ℓ?

The exact solution is

u(x) = −(1/4π) ∫_{y∈ℝ³} f(y)/|x − y| d³y.    (494)

But let us assume we cannot or do not want to evaluate this exactly. The key observation is that
in this example, although the source f(x) is nowhere zero, it falls off very rapidly with |x|, and we
can approximate it as zero for |x| > R, where R is the "size" of the source region, chosen to be a
few ℓ. [Of course if f(x) is nonzero only on a bounded region, say f(x) = H(R − |x|), the size of
the source is defined unambiguously.] So we approximate

u(x) ≃ −(1/4π) ∫_{|y|<R} f(y)/|x − y| d³y.    (495)

For any two vectors x, y ∈ ℝⁿ and the Euclidean norm |·|, we have the triangle inequalities

| |x| − |y| | ≤ |x − y| ≤ |x| + |y|.    (496)

The outer bracket in the first expression is just the absolute value of a real number. (To convince
yourself that it is true, draw some vectors x and y in ℝ². Prove it as an Exercise (for any norm).)
Now we consider the limit |x| ≫ R (and hence in particular |x| > R). In the integral (495),
|y| < R. Hence we have

|x| − R ≤ |x − y| ≤ |x| + R.    (497)

We can write this as

|x − y| = |x| + O(R) as |x| → ∞.    (498)

The symbol O(·) is pronounced "order of" or "big-O of". Its formal definition is that f(x) =
O[g(x)] as x → ∞ if cg(x) < |f(x)| < Cg(x) for two constants 0 < c < C as x → ∞. Its intuitive
meaning is "grows or decays like". Note that a limit is always part of the definition of O.
We now rewrite the estimate (498) of the absolute error in |x − y| as a relative error,

|x − y| = |x| [1 + O(R/|x|)].    (499)

We can approximate this by |x| if and only if R/|x| ≪ 1, that is when the relative error in |x − y|
is small. Clearly, it is the relative error (in percent) that matters here, not the absolute error (in
meters).
We can now continue from (495) as

u(x) ≃ −(1/4π) ∫_{|y|<R} f(y)/|x| d³y
     = −(1/(4π|x|)) ∫_{|y|<R} f(y) d³y
     ≃ −(1/(4π|x|)) ∫_{y∈ℝ³} f(y) d³y
     = −A/(4π|x|),    (500)

where

A := ∫ f(y) d³y    (501)

is the magnitude of the source. This is the large distance approximation: the source term is
approximated as a point source of magnitude A located at the origin.
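The sketch below (my own) compares the exact solution, computed by reducing (494) to a radial quadrature for a spherically symmetric source (this reduction is formula (523) of Exercise 52), with the point source approximation −A/(4π|x|), for f = e^{−|x|²} (that is, ℓ = 1, so A = π^{3/2}):

```python
# Large distance approximation (500) versus the exact solution, for the
# spherically symmetric source f = exp(-|x|^2).  u_exact uses the radial
# quadrature of Exercise 52, formula (523).
import math

f = lambda rho: math.exp(-rho*rho)

def u_exact(r, n=4000, rmax=12.0):
    h1 = r/n                                   # midpoint rule on [0, r]
    inner = sum(h1*f((i + 0.5)*h1)*((i + 0.5)*h1)**2 for i in range(n))
    h2 = (rmax - r)/n                          # midpoint rule on [r, rmax]
    outer = sum(h2*f(r + (i + 0.5)*h2)*(r + (i + 0.5)*h2) for i in range(n))
    return -inner/r - outer

A = math.pi**1.5                               # total source integral of f
r = 8.0
approx = -A/(4*math.pi*r)                      # point source of magnitude A
assert abs(u_exact(r) - approx)/abs(approx) < 1e-3
```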
Example 7.7. Consider the free-space Helmholtz problem

(Δ + k²)u = e^{−|x|²/ℓ²} =: f(x),   outgoing wave BC as |x| → ∞.    (502)

What approximation can we now make for u(x) as |x| ≫ ℓ?

The exact solution is now

u(x) = −(1/4π) ∫_{y∈ℝ³} (e^{ik|x−y|}/|x − y|) f(y) d³y.    (503)

We now want to approximate |x − y| by |x| in two different places, namely in the amplitude
1/|x − y| and in the complex phase exp ik|x − y|. For the amplitude, we have once again (499),
where it matters that the relative error in amplitude (in percent) is small. So, as in the Poisson
example, we need the large distance condition

R/|x| ≪ 1.    (504)
But the complex phase, measured in radians, is

k|x − y| = k|x| [1 + O(R/|x|)] = k|x| + O(kR).    (505)

We are clearly completely out of phase when the error in the phase approaches 2π. The relative
error in the phase (first equality above) is completely irrelevant; what matters is the absolute phase
error (second equality). Hence we can approximate

e^{ik|x−y|} ≃ e^{ik|x|}    (506)

if and only if

kR ≪ 2π.    (507)

(You will also find kR ≪ 1 in the literature, which is equally good within the approximation
implied by ≪.) As k = 2π/λ, this means that the wavelength λ of the waves is much larger than the
size of the source,

R/λ ≪ 1,    (508)

the long wavelength condition.
As a real world example, consider the loudspeaker of a radio, with a diameter of 10 cm. Hence
the large distance approximation holds if we look at the sound field at distances much larger than
10 cm from the loudspeaker, and the long wavelength approximation holds for wavelengths much
larger than 10 cm, or frequencies much lower than (340 m/s)/(10 cm) = 3400 Hz.

7.6 Uniqueness of the solution to the free-space problem

Given the Green's function we deduced in Sec. 5.3, we know that the solution of the free space
Poisson problem is

u(x) = −(1/4π) ∫ f(y) d³y / |x − y|.    (509)

Theorem 7.8. The solution (509) of the free space Poisson problem (449,450) is unique.

Proof. Assume there are two solutions u₁ and u₂, and let u := u₁ − u₂. The proof is then identical
to the proof for a bounded volume V given in Sec. 2.4.3 up to Eq. (151).
Now consider S in that equation to be a sphere of radius R, and let R → ∞. On the sphere

u ∼ 1/R,   u,n = (∂u/∂r)|_{r=R} ∼ 1/R²,   d²x = R² sin θ dθ dϕ,    (510)

and so

∫_S u u,n d²x ∼ ∫₀^π ∫₀^{2π} (1/R) sin θ dϕ dθ = 4π/R → 0.    (511)

Thus, in the limit R → ∞, we get

∫ |∇u|² d³x = 0,    (512)

so that |∇u|² = 0 at all points. Hence u must be constant. Now note that since u → 0 as |x| → ∞,
this constant must be zero. Hence u₁ = u₂, and the solution is unique.
7.7 Exercises

48. Homework 22: (Short and easy) Check that

Δf(r) = ∂²f/∂r² + (2/r) ∂f/∂r = (1/r²) ∂/∂r (r² ∂f/∂r) = (1/r) ∂²(rf)/∂r²    (513)

when acting on a spherically symmetric function f(x) = f(r).


49. Verify that the 2-dimensional δ-function

δ(x, y) := δ(x)δ(y)    (514)

can also be defined as either of the limits a)

lim_{ε→0} { 1/(4ε²) for |x| < ε and |y| < ε;  0 for |x| > ε or |y| > ε }    (515)

or b)

lim_{ε→0} ε/(2π(x² + y² + ε²)^{3/2}).    (516)

Hint: use polar coordinates and recall that

∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = ∫₀^∞ ∫₀^{2π} f(r, θ) dθ r dr.    (517)

50. Show by differentiation, and using Δf = r⁻¹(rf)″ for f(x) = f(r), that

Δ(−f(r)/(4πr)) = δ(r)f(0)/(4πr²) − f″(r)/(4πr).    (518)

51. Homework 23: (The version of the triangle inequality used in the lecture) We originally
met the triangle inequality as

‖X + Z‖ ≤ ‖X‖ + ‖Z‖.    (519)

Make two appropriate choices of Z to obtain the equivalent triangle inequalities

| ‖X‖ − ‖Y‖ | ≤ ‖X − Y‖ ≤ ‖X‖ + ‖Y‖.    (520)

52. By integration over a suitable Green's function, show that the solution of the Poisson equation
(in three space dimensions) with free space boundary conditions and a spherically symmetric
source,

Δu = f(r) where r := |x|,    (521)
u → 0 as |x| → ∞,    (522)

is

u(r) = −(1/r) ∫₀^r f(ρ)ρ² dρ − ∫_r^∞ f(ρ)ρ dρ.    (523)

Hint: use the appropriate Green's function, and do the integration in spherical polar coordinates
(ρ, θ, ϕ). Without loss of generality you can assume that the point x is on the x₃-axis.
Further hint: consider the substitution z = cos θ.

53. Show that (523) is a solution (it is in fact the unique solution) of (521,522) by checking the
PDE explicitly by differentiating, and the boundary condition by taking a limit.

54. If we approximate the earth as a sphere of radius r₀ and constant mass density ρ₀, then its
gravitational potential Φ(x) obeys

ΔΦ = { 4πGρ₀ for 0 < |x| < r₀;  0 for |x| > r₀ },    (524)
Φ → 0 as |x| → ∞,    (525)

where G is Newton's gravitational constant.

a) Using the formula (523), find Φ(r) where r := |x|. Hint: calculate Φ separately for 0 < r < r₀
and r > r₀.

b) The gravitational acceleration of a freely falling body is given by

d²x/dt² = −∇Φ.    (526)

You build a rapid transit system by drilling a straight hole from the north pole to the south
pole through the centre of the earth. You drop a capsule (like a lift cabin) from the north
pole. It accelerates down, reaches maximum velocity at the centre of the earth, and comes
to a stop just as it reaches the south pole. Find the ODE obeyed by its position z(t), and
solve it with initial conditions z(0) = r₀ and dz/dt(0) = 0, where z = r₀ is the north pole
and z = −r₀ the south pole. (We assume the tunnel goes along the earth's rotation axis so
we do not have to worry about forces due to the rotation of the earth). Look up numbers
for r₀, ρ₀ and G and calculate the time (in minutes) the one-way trip takes.

c) You convert your tunnel into a gigantic cannon and shoot the capsule (now a spaceship)
straight up from the north pole into the sky. Find the ODE obeyed by its position z(t).
This cannot be solved in closed form. However, show that the total energy (per mass) of the
capsule

E(t) := (1/2)(dz/dt)² − GM/z,    (527)

where M is the mass of the earth, is conserved, that is dE/dt = 0. Hence find the maximum
height z_max reached for a given initial velocity v₀. Hence find the escape velocity v_escape
for which the capsule never comes back. Find the numerical value for the Earth's escape
velocity (in kilometers per second).

8 Green's functions for bounded regions

So far we have only considered free-space problems. However, in many situations of interest
we are looking for a solution that satisfies both the PDE and given boundary conditions. This
Section discusses how we can find a Green's function solution that satisfies the relevant boundary
conditions. We focus on the Helmholtz equation

(Δ + k²)u = f(x),    (528)

but the method is readily generalised to other problems. For example, results for the Poisson
equation follow by taking the limit k → 0.
Our main mathematical tool will be the Kirchhoff-Helmholtz formula. To derive it, we need
two major ingredients, Green's theorem and the reciprocal theorem.

8.1 Green's theorem

Theorem 8.1. Suppose that G(x) and u(x) are functions with continuous second derivatives on
a region V with surface S. Then

∫_V (GΔu − uΔG) d³x = ∫_S (Gu,n − uG,n) d²x.    (529)

Proof. Recalling that

Δu = ∇·(∇u)    (530)

and likewise for G, and that

∇·(G∇u) = GΔu + (∇G)·(∇u),    (531)
∇·(u∇G) = uΔG + (∇u)·(∇G),    (532)

we find that

GΔu − uΔG = ∇·(G∇u − u∇G).    (533)

Thus, integrating over V we have

∫_V (GΔu − uΔG) d³x = ∫_V ∇·(G∇u − u∇G) d³x.    (534)

By the divergence theorem

∫_V ∇·(G∇u − u∇G) d³x = ∫_S (G∇u − u∇G)·n d²x = ∫_S (Gu,n − uG,n) d²x,    (535)

which establishes Green's theorem.
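The product rule (531) that drives the proof can be checked numerically at a point; G and u below are arbitrary smooth test functions of my own choosing, not Green's functions:

```python
# Pointwise check of (531): div(G grad u) = G lap u + grad G . grad u,
# with arbitrary smooth G and u, via central finite differences at one point.
import math

G = lambda x, y, z: math.sin(x)*math.cos(y) + z*z
u = lambda x, y, z: math.exp(0.3*x - 0.2*y)*math.cos(z)
p = (0.4, -0.7, 0.9)
h = 1e-3

def grad(f, q):
    return [(f(*[c + h*(i == j) for j, c in enumerate(q)])
             - f(*[c - h*(i == j) for j, c in enumerate(q)]))/(2*h) for i in range(3)]

def lap(f, q):
    return sum((f(*[c + h*(i == j) for j, c in enumerate(q)])
                - 2*f(*q)
                + f(*[c - h*(i == j) for j, c in enumerate(q)]))/h**2 for i in range(3))

def div_G_grad_u(q):
    total = 0.0
    for i in range(3):                    # central difference of F_i = G u_{,i}
        qp = [c + h*(i == j) for j, c in enumerate(q)]
        qm = [c - h*(i == j) for j, c in enumerate(q)]
        total += (G(*qp)*grad(u, qp)[i] - G(*qm)*grad(u, qm)[i])/(2*h)
    return total

lhs = div_G_grad_u(p)
rhs = G(*p)*lap(u, p) + sum(a*b for a, b in zip(grad(G, p), grad(u, p)))
assert abs(lhs - rhs) < 1e-4
```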

8.2 The reciprocal theorem

In all of the problems in this Section, the Green's function is symmetric, that is

G(x, y) = G(y, x).    (536)

The physical meaning of this symmetry is that hitting the system at x produces the same effect
at y as the other way around. Mathematically speaking, Green's functions are symmetric if
they correspond to self-adjoint differential operators with homogeneous (that is, zero) boundary
conditions. Not all physical systems have this property. Note also that the causal Green's function
for a time evolution problem (or for an ODE) does not have this symmetry under the interchange
of t and s.
Theorem 8.2. If G(x, y) is the solution of

(Δ_x + k²)G(x, y) = δ(x − y) for x, y in V,    (537)
αG(x, y) + βG,nx(x, y) = 0 for x on S,    (538)

then

G(x, y) = G(y, x).    (539)

The theorem also holds for the Poisson equation, with k = 0.
Proof. Consider the two PDEs

(Δ_x + k²)G(x, y₁) = δ(x − y₁),   (Δ_x + k²)G(x, y₂) = δ(x − y₂),    (540)

and multiply the first equation by G(x, y₂), multiply the second equation by G(x, y₁), subtract
and integrate over V:

∫_V [G(x, y₂)Δ_x G(x, y₁) − G(x, y₁)Δ_x G(x, y₂)] d³x
= ∫_V [G(x, y₂)δ(x − y₁) − G(x, y₁)δ(x − y₂)] d³x.    (541)

(The terms containing k² have cancelled.) Now apply Green's theorem to the left-hand side and
integrate out the δ-function on the right-hand side. This gives

∫_S [G(x, y₂)G,nx(x, y₁) − G(x, y₁)G,nx(x, y₂)] d²x = G(y₁, y₂) − G(y₂, y₁).    (542)

One can argue from the boundary condition (538) that this surface integral vanishes (see problem).
We conclude that

G(y₂, y₁) − G(y₁, y₂) = 0.    (543)

This proves the theorem since y₁ and y₂ can be any points inside V.
The reciprocal theorem is of physical interest in its own right, but for the derivation of the
Kirchhoff-Helmholtz formula that follows we need the following

Corollary 8.3. If G obeys (537) and (538) then it also obeys

(Δ_y + k²)G(x, y) = δ(x − y) for x, y in V,    (544)
αG(x, y) + βG,ny(x, y) = 0 for y on S.    (545)

Idea of proof: Because G is symmetric under interchange of its two arguments, we can write

G(x, y) = ½ (G(x, y) + G(y, x)).    (546)

Using suffixes 1 and 2 to denote partial derivatives with respect to the first and second argument
of the abstract function G(·, ·), this gives us

∇_x G(x, y) = ½ (∇₁G(x, y) + ∇₂G(y, x)).    (547)

Now write out (537) and (538) in this notation, interchange the labels x and y, use the reciprocal
theorem G(x, y) = G(y, x) and the identity δ(x − y) = δ(y − x), and so obtain (544) and (545).

8.3 The Kirchhoff-Helmholtz formula

We want to derive a formula for the solution u(x) of a Helmholtz problem

(Δ + k²)u(x) = f(x)    (548)

on a bounded domain V, with some boundary conditions on S = ∂V. We leave these boundary
conditions unspecified for now.
To obtain the desired formula, consider

∫_V G(x, y)f(y) d³y − u(x)    (549)
= ∫_V [G(x, y)f(y) − u(y)δ(x − y)] d³y    (550)
= ∫_V [G(x, y)(Δ_y + k²)u(y) − u(y)(Δ_y + k²)G(x, y)] d³y    (551)
= ∫_V [G(x, y)Δ_y u(y) − u(y)Δ_y G(x, y)] d³y    (552)
= ∫_S [G(x, y)u,n(y) − u(y)G,ny(x, y)] d²y.    (553)

In (550) we have used the definition of the δ-function. In (551), we have used (548), but with x
renamed to y, to replace f(y), and we have used (544) to replace the δ-function. In (552), we have
cancelled the term proportional to k². In (553), we have used Green's theorem in the variable y.
Here G,ny denotes the normal derivative with respect to the y variables, i.e.,

G,ny := n·(∇_y G).    (554)

Combining (549) and (553) and rearranging, we obtain the Kirchhoff-Helmholtz representation

u(x) = ∫_V G(x, y)f(y) d³y + ∫_S [G,ny(x, y)u(y) − G(x, y)u,n(y)] d²y.    (555)

This formula gives the value of u(x) inside the region V in terms of the source distribution f(x)
in V and the values of u and u,n on the surface S. It is true for any G(x, y) that is symmetric.
Remark 8.4. When attempting to solve (548) analytically, we must choose G so that we minimise
the amount of information we need to know about u and u,n on the boundary. For example, if
we are given a Dirichlet problem, where u is prescribed on the boundary, we try to find G so
that G(x, y) = 0 when y is on the boundary. This eliminates the unknown u,n and allows us
to calculate u in terms of known quantities. Similarly, for the Neumann problem where we know
u,n on the boundary we try to find G so that G,ny(x, y) = 0 when y is on the boundary. This
eliminates the unknown u from the integral over the surface.

Remark 8.5. (555) can also be used numerically. We choose a simple G, say the free space
Green's function, and this gives us an integral equation to solve numerically. For example if we are
given a Neumann problem with u,n specified on the boundary (but where we do not know u on the
boundary) then by choosing x to be a point on the boundary, (555) becomes an integral equation
for the unknown u(x) on the boundary. This is solved numerically, and once we know u(x) on
the boundary, then (555) tells us the value of u(x) at all points inside the boundary. This is the
essence of boundary integral methods. Note that the integral equation is 2-dimensional whereas
the original problem is 3-dimensional. This reduction in dimensionality is why boundary integral
methods are useful.

8.4 Problems on bounded regions

8.4.1 The Dirichlet problem

This is the problem of finding u in V given that

(Δ + k²)u(x) = f(x) in V,
u(x) = g(x) on S.    (556)

We solve this in terms of the Kirchhoff-Helmholtz representation by eliminating the unknown
u,n from the integral, that is, we attempt to find a Green's function such that

(Δ_x + k²)G(x, y) = δ(x − y) in V,
G(x, y) = 0 when y on S.    (557)

In practice it may be difficult to find such a G, but assuming G is known the solution is then, from
the Kirchhoff-Helmholtz representation,

u(x) = ∫_V G(x, y)f(y) d³y + ∫_S G,ny(x, y)g(y) d²y.    (558)

8.4.2 The Neumann problem for the Helmholtz equation

This is the problem of finding u in V given that

(Δ + k²)u(x) = f(x) in V,   u,n(x) = g(x) on S.    (559)

We solve this in terms of the Kirchhoff-Helmholtz representation by eliminating the unknown
u from the integral, that is, we attempt to find a Green's function such that

(Δ_x + k²)G(x, y) = δ(x − y) in V,   G,ny(x, y) = 0 when y on S.    (560)

Assuming G can be found, the solution is then

u(x) = ∫_V G(x, y)f(y) d³y − ∫_S G(x, y)g(y) d²y.    (561)

8.4.3 The Neumann problem for the Poisson equation

As we have already discussed in Remark 1.7, the Neumann problem for the Poisson equation
(k = 0),

Δu(x) = f(x) in V,
u,n(x) = g(x) on S,    (562)

is posed consistently only if

∫_V f(x) d³x = ∫_V Δu d³x = ∫_S u,n d²x = ∫_S g(x) d²x.    (563)

That is, f(x) and g(x) must satisfy the compatibility condition for the data of the Poisson
equation with Neumann boundary conditions,

∫_V f(x) d³x = ∫_S g(x) d²x.    (564)

By the same reasoning, for x ∈ V we find another compatibility condition, this time for the
Green's function G(x, y) (and integrating over y),

∫_S G,ny(x, y) d²y = ∫_V Δ_y G(x, y) d³y = ∫_V δ(x − y) d³y = 1.    (565)

Hence we cannot impose G,ny(x, y) = 0 for y on S!

However, it turns out that this is not necessary. Instead, consider the Green's function defined
by

Δ_x G(x, y) = δ(x − y) in V,    (566)
G,ny(x, y) = 1/A for y on S,    (567)

where

A := ∫_S d²y    (568)

is the area of S. Then the compatibility condition (565) for the Green's function is satisfied. The
solution of the Neumann problem is, from the Kirchhoff-Helmholtz representation (555), given by

u(x) = ∫_V G(x, y)f(y) d³y + ∫_S (u(y)/A − G(x, y)g(y)) d²y.    (569)

Now in the term

(1/A) ∫_S u(y) d²y    (570)

we do not know u(y) on the boundary, so we cannot evaluate this integral, but it is just a constant
(independent of x). On the other hand, the solution to the Neumann problem is unique only up
to adding an arbitrary constant anyway. So we can absorb the unknown additive constant (570)
into this, and write

u(x) = ∫_V G(x, y)f(y) d³y − ∫_S G(x, y)g(y) d²y + C,    (571)

where C is an arbitrary constant.


The compatibility conditions (564) for the data or (565) for the Green’s function only apply if
the volume V and hence its boundary S are finite. If all or part of the boundary are at infinity,
with the usual condition that u ! 0 at infinity, then there is no compatibility condition, and we
must have C = 0.

8.4.4 Robin boundary conditions


This is the problem of finding u in V given that

(Δ + k²) u(x) = f(x) in V,
u,n(x) + σ(x) u(x) = g(x) on S,   (572)

where f(x), g(x) and σ(x) are all given functions.


We solve this in terms of the Kirchhoff-Helmholtz representation by eliminating the unknown
u,n from the problem, using the fact that

u,n(x) = g(x) − σ(x) u(x) on S,   (573)

so the Kirchhoff-Helmholtz representation (555) becomes

u(x) = ∫_V G(x, y) f(y) d³y + ∫_S [G,ny(x, y) + σ(y) G(x, y)] u(y) d²y − ∫_S G(x, y) g(y) d²y.   (574)

Then, as we do not know u on the surface S, we choose G so that this term is eliminated. That
is, we choose G to be a solution of

(Δ_x + k²) G(x, y) = δ(x − y) in V,
G,ny(x, y) + σ(y) G(x, y) = 0 for y on S.   (575)

Assuming G can be found, the solution is then


u(x) = ∫_V f(y) G(x, y) d³y − ∫_S g(y) G(x, y) d²y.   (576)

83
8.5 The method of images
8.5.1 Example: Laplace equation with Neumann BCs on a plane
It remains to find a Green’s function that obeys the homogeneous version of the boundary condition
we want to impose. As an example, consider the half-space Neumann problem (see Fig. 12) for
Laplace’s equation:
Δu = 0,  x₃ > 0,
u,x₃ = v(x₁, x₂) on x₃ = 0,
u → 0 as x₃ → ∞.   (577)
As we are given u,x₃ = −u,n on the boundary, we choose a Green's function that satisfies
G,ny(x, y) = 0 on y₃ = 0.   (578)
Technically, the surface we need to integrate over also includes y₃ → ∞. However, we can assume
that G → 0 and G,n → 0 as y₃ → ∞, hence eliminating the integral over those parts of the surface
at infinity.
The Kirchhoff-Helmholtz representation (555) becomes

u(x) = −∫_S G(x, y) u,n(y) d²y   (579)

or, with u,n = −u,x₃ = −v,

u(x) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} G(x, (y₁, y₂, 0)) v(y₁, y₂) dy₁ dy₂.   (580)

We still need to find a Green’s function that obeys the homogeneous Neumann condition (578).
This can be done by the method of images. We start with the free space Green’s function
G_F(x, y) = −1/(4π |x − y|).   (581)
This represents the effect at x of a unit source at y. Now consider what would happen if there
were another point source at the image point

y′ := (y₁, y₂, −y₃).   (582)
Because the sources are now symmetric under a reflection in the x₃ = 0 plane (see Fig. 12), so is the
solution u(x). Hence its x₃-derivative vanishes on the x₃ = 0 plane.
To add in this fictitious image charge, we use the Green’s function
G(x, y) = G_F(x, y) + G_F(x, y′) = −(1/4π) ( 1/|x − y| + 1/|x − y′| )   (583)
= −(1/4π) [ 1/√((x₁ − y₁)² + (x₂ − y₂)² + (x₃ − y₃)²)
+ 1/√((x₁ − y₁)² + (x₂ − y₂)² + (x₃ + y₃)²) ].   (584)
Clearly this obeys
G(x, (y₁, y₂, −y₃)) = G(x, (y₁, y₂, y₃)),   (585)
and so G,y3 = 0 on y3 = 0.
When we evaluate G(x, y) on y3 = 0, the two terms in (584) become identical, and substituting
into (580) we obtain
u(x₁, x₂, x₃) = −(1/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} v(y₁, y₂) / √((x₁ − y₁)² + (x₂ − y₂)² + x₃²) dy₁ dy₂.   (586)

84
One might object that G obeys ΔG = δ(x − y) + δ(x − y′). But when y is inside the domain
of the PDE then y′ is outside it, and so we do not integrate over the region containing y′; hence
δ(x − y′) makes no contribution.
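The reflection symmetry (585), and hence the homogeneous Neumann condition (578), can be checked numerically. The following Python sketch (field point, boundary point and step size are arbitrary illustrative choices) differentiates the image Green's function (583) across the plane y₃ = 0:

```python
import math

def gf_free(x, y):
    # Free-space Green's function of the Laplacian, G_F = -1/(4*pi*|x - y|), eq. (581)
    return -1.0 / (4.0 * math.pi * math.dist(x, y))

def gf_neumann(x, y):
    # Image Green's function (583): source at y plus image source at y' = (y1, y2, -y3)
    return gf_free(x, y) + gf_free((x[0], x[1], x[2]), (y[0], y[1], -y[2]))

x = (0.3, -0.7, 1.2)   # arbitrary field point in the half-space x3 > 0
h = 1e-6
# Central difference of G with respect to y3 at y3 = 0: vanishes by the symmetry (585)
dG = (gf_neumann(x, (0.5, 0.2, h)) - gf_neumann(x, (0.5, 0.2, -h))) / (2 * h)
assert abs(dG) < 1e-9   # homogeneous Neumann condition (578) holds on the plane
```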

[Figure 12 shows the source point (x₁, x₂, x₃) in the half-space (where Δφ = 0), the image point (x₁, x₂, −x₃), and the boundary plane with axes y₁, y₂.]
Figure 12: Method of images for the Neumann problem for Laplace's equation.

8.5.2 Example: Helmholtz equation with Dirichlet BCs on a plane


As a second example, consider now the (homogeneous) Helmholtz equation with (inhomogeneous)
Dirichlet BCs on a plane,

(Δ + k²)u = 0,  x₃ > 0,   (587)
u = v(x₁, x₂),  x₃ = 0,   (588)
u → 0,  |x| → ∞.   (589)

We need G to obey G(x, y) = 0 for y ∈ S, that is for y₃ = 0. The required Green's function is
given by
G(x, y) = G_F(x, y) − G_F(x, y′),   (590)
where GF is the Green’s function for the free space Helmholtz problem,

G_F(x, y) = −e^{ik|x−y|} / (4π|x − y|),   (591)

and where the image point is y′ = (y₁, y₂, −y₃).


We then have

u(x) = ∫_S G,ny(x, y) u(y) d²y = −∫_{−∞}^{∞} ∫_{−∞}^{∞} G,y₃(x, (y₁, y₂, 0)) v(y₁, y₂) dy₁ dy₂,   (592)

where the minus sign comes from n = (0, 0, −1). (The rest of the example is left as an exercise.)
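The Dirichlet image Green's function (590) can likewise be checked numerically. In the following Python sketch the wavenumber k and the field point are arbitrary illustrative values; the check confirms that G vanishes identically on the plane y₃ = 0 but not elsewhere:

```python
import cmath
import math

K = 2.0  # illustrative wavenumber

def gf_helmholtz(x, y):
    # Free-space Helmholtz Green's function (591): G_F = -exp(ik|x-y|)/(4*pi*|x-y|)
    r = math.dist(x, y)
    return -cmath.exp(1j * K * r) / (4.0 * math.pi * r)

def gf_dirichlet(x, y):
    # Image Green's function (590): subtract the image source at y' = (y1, y2, -y3)
    return gf_helmholtz(x, y) - gf_helmholtz(x, (y[0], y[1], -y[2]))

x = (0.4, 0.1, 0.9)                            # arbitrary field point, x3 > 0
on_plane = gf_dirichlet(x, (-0.2, 0.6, 0.0))   # boundary point, y3 = 0
off_plane = gf_dirichlet(x, (0.0, 0.2, 0.6))   # interior point, y3 > 0
assert abs(on_plane) < 1e-15    # G vanishes on the plane, as required
assert abs(off_plane) > 1e-6    # but not identically
```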

8.6 Exercises
55. Homework 24: Fill in a missing step in the proof of the reciprocal theorem in the lecture
notes by showing that
∫_S [G(x, y₂) G,nx(x, y₁) − G(x, y₁) G,nx(x, y₂)] d²x = 0   (593)

if G obeys either Dirichlet, Neumann or Robin boundary conditions for x ∈ S.

85
56. Homework 25: (Begun in lecture, past exam question) Solve the PDE problem

(Δ + k²)u = 0,  x₃ > 0,   (594)
u = v(x₁, x₂),  x₃ = 0,   (595)
u → 0,  |x| → ∞.   (596)

57. Find the solution u(r) of

Δu = f(r),  0 ≤ r < R,   (597)

αu(R) + u,r(R) = 0   (598)

(Poisson problem in three space dimensions inside a sphere with spherically symmetric source
and spherically symmetric Robin BC) in the form
u(r) = ∫₀^R G(r, s) f(s) ds.   (599)

Hint: Write the PDE in spherical coordinates. We effectively now have an ODE problem.
Proceed from first principles, as we have done for constructing the Green's function for
ODEs. It is implicit in the problem that u(r) must obey the boundary condition u′(0) = 0.
If u′(0) ≠ 0, u(x, y, z) would have a conical shape at the origin r = 0 (or x = y = z = 0), and
this would correspond to a δ-function source.

86
9 The diffusion equation
In the previous Sections we have set out the general principles involved in solving a PDE by means
of the Green's function technique. The method readily generalises to PDEs other than the ones we
have considered so far. The only additional complication is that we also need to consider time-
dependent problems, like diffusion or wave equations. These are the topics of this and the following
Section.
The diffusion equation (or heat equation) is

u,t̂ − κ Δu = f̂(x, t̂),   (600)

where κ is the diffusion constant (with dimension length²/time). This can be reduced to the form

u,t − Δu = f(x, t)   (601)

by making the change of variable

t := κ t̂,  f(x, t) := κ⁻¹ f̂(x, t̂).   (602)

We shall assume that this change of variable has been made and only consider (601) in these notes.
We shall begin with the heat equation in one space dimension. This is interesting in its own
right, but as we shall see, it is easy to find the Green’s function in n space dimensions from the
one in one space dimension.

9.1 The one-dimensional diffusion equation


The one-dimensional diffusion equation is

u,t − u,xx = f(x, t),   (603)

that is, we assume that u depends only on t and x. The associated causal Green's function
G(x, t; y, τ) satisfies

G,t − G,xx = δ(t − τ) δ(x − y),  G = 0 if t < τ.   (604)

The causal solution of (603) is given by

u(x, t) = ∫_{−∞}^{t} ∫_{−∞}^{∞} G(x, t; y, τ) f(y, τ) dy dτ.   (605)

In order to find G(x, t; y, τ) we introduce the variables

z = x − y,  T = t − τ.   (606)

In terms of these variables, the problem for G(x, t; y, τ) = G(z, T) becomes

G,T − G,zz = δ(T) δ(z),  G(z, T) = 0 if T < 0.   (607)

In order to accommodate the condition G(z, T) = 0 if T < 0 we put

G(z, T) = H(T) g(z, T).   (608)

Recalling that dH(T)/dT = δ(T) and δ(T) g(z, T) = δ(T) g(z, 0), we find that

G,T − G,zz = δ(T) g(z, 0) + H(T) (g,T − g,zz),   (609)

so that problem (607) for G(z, T) is satisfied if g(z, T) satisfies

g,T = g,zz,  g(z, 0) = δ(z).   (610)

87
There are many ways of solving (610) to find g(z, T ). One particularly elegant way involves
the use of similarity variables, leading to a similarity ansatz.
The basic idea behind the similarity method is to notice that if λ > 0 is a constant and we put

z̄ = λz,  T̄ = λ²T,  ḡ(z̄, T̄) = λ⁻¹ g(z, T),   (611)

then

ḡ,T̄ = λ⁻³ g,T = λ⁻³ g,zz = ḡ,z̄z̄,   (612)

and

ḡ(z̄, 0) = λ⁻¹ g(z, 0) = λ⁻¹ δ(z) = λ⁻¹ λ δ(z̄) = δ(z̄).   (613)

[The third equality is the identity δ(ax) = δ(x)/|a|.] That is, the problem (610) is invariant under
the change of variables (611), in the sense that it is the same in barred variables and unbarred
variables. That is, if (in unbarred variables)

g,T = g,zz,  g(z, 0) = δ(z)   (614)

then (in barred variables)

ḡ,T̄ = ḡ,z̄z̄,  ḡ(z̄, 0) = δ(z̄).   (615)

Given that the problem is invariant under the transformation (611), it is sensible to look for
solutions of the problem in terms of variables which are invariant under the same transformation.
It is fairly obvious that both of

φ := √T g = √(λ²T) (λ⁻¹ g) = √T̄ ḡ   (616)

and

ξ := z/√T = λz/√(λ²T) = z̄/√T̄   (617)

are invariant under the transformation (611). As z and T are independent variables and g is a
dependent variable, it is reasonable to look for a solution, in terms of the invariants φ and ξ, in
the form

φ = φ(ξ).   (618)

Since φ = √T g and ξ = z/√T, this amounts to looking for a solution of the form

g(z, T) = (1/√T) φ(ξ),  where ξ = z/√T.   (619)
If we look for a solution of this form we find that

g,T = −(1/(2T^{3/2})) φ(ξ) + T^{−1/2} ξ,T φ′(ξ)
    = −(1/(2T^{3/2})) φ(ξ) + T^{−1/2} (−z/(2T^{3/2})) φ′(ξ)
    = −(1/(2T^{3/2})) [φ(ξ) + ξ φ′(ξ)],   (620)

and that

g,zz = T^{−1/2} ∂²φ(ξ)/∂z² = T^{−3/2} φ″(ξ),   (621)

since ∂/∂z = (∂ξ/∂z) d/dξ = T^{−1/2} d/dξ.
This shows that

−(1/(2T^{3/2})) [φ(ξ) + ξ φ′(ξ)] = T^{−3/2} φ″(ξ),   (622)

so that φ(ξ) satisfies the ordinary differential equation

φ″(ξ) + ½ [ξ φ′(ξ) + φ(ξ)] = 0,   (623)

which can be written as

φ″(ξ) + ½ [ξ φ(ξ)]′ = 0.   (624)

88
This can be integrated once to give

φ′(ξ) + ½ ξ φ(ξ) = C.   (625)

This is a first-order linear ordinary differential equation for φ(ξ). With the integrating factor
exp(ξ²/4) it is equivalent to

(d/dξ) ( e^{ξ²/4} φ ) = C e^{ξ²/4}.   (626)

This integrates to give

φ(ξ) = C e^{−ξ²/4} ∫₀^ξ e^{η²/4} dη + B e^{−ξ²/4}.   (627)

In order to get φ(ξ) to vanish fast enough as ξ → ±∞ (that is, in order to get g(z, 0) = δ(z)) we
have to take C = 0. [One way of seeing this is to note from (625) that for C ≠ 0, φ ≃ 2C/ξ as
ξ → ±∞: asymptotically, the last two terms of the ODE balance each other, while the first term
becomes negligible. Hence for C ≠ 0, g ∼ 1/z, but ∫ z⁻¹ dz does not converge.] With C = 0, we
have

φ(ξ) = B e^{−ξ²/4}.   (628)

Substituting this into (619) gives

g(z, T) = (B/√T) exp( −z²/(4T) ).   (629)

To determine the constant B we note that

∫_{−∞}^{∞} g(z, 0) dz = ∫_{−∞}^{∞} δ(z) dz = 1.   (630)

We also have

∫_{−∞}^{∞} g(z, T) dz = (B/√T) ∫_{−∞}^{∞} exp( −z²/(4T) ) dz   (631)

and putting q = z/(2√T) this becomes

∫_{−∞}^{∞} g(z, T) dz = 2B ∫_{−∞}^{∞} e^{−q²} dq = 2B√π.   (632)

This is valid for any T > 0, so taking the limit T → 0

∫_{−∞}^{∞} g(z, 0) dz = 2B√π = 1   (633)

and hence

B = 1/(2√π).   (634)

Recalling that G(z, T) = H(T) g(z, T), z = x − y and T = t − τ, we conclude that the one-dimensional
causal Green's function for the heat equation is

G(x, t; y, τ) = [ H(t − τ) / (2√(π(t − τ))) ] exp( −(x − y)² / (4(t − τ)) ).   (635)
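As a sanity check on (635), the following Python sketch verifies numerically that the kernel integrates to 1 over x for fixed t > τ, and that it satisfies the diffusion equation away from the source to finite-difference accuracy. The grid sizes and sample points are arbitrary illustrative choices:

```python
import math

def heat_kernel(x, t, y, tau):
    # Causal Green's function (635) of u,t - u,xx = f in one space dimension
    if t <= tau:
        return 0.0
    T = t - tau
    return math.exp(-(x - y) ** 2 / (4 * T)) / (2 * math.sqrt(math.pi * T))

# Normalisation: the integral over x is 1 for any t > tau (total heat is conserved)
h = 0.01
total = sum(heat_kernel(i * h, 1.3, 0.2, 0.0) * h for i in range(-3000, 3001))
assert abs(total - 1.0) < 1e-6

# PDE check away from the source: G,t - G,xx = 0 to finite-difference accuracy
e = 1e-4
x, t = 0.7, 0.5
Gt = (heat_kernel(x, t + e, 0, 0) - heat_kernel(x, t - e, 0, 0)) / (2 * e)
Gxx = (heat_kernel(x + e, t, 0, 0) - 2 * heat_kernel(x, t, 0, 0)
       + heat_kernel(x - e, t, 0, 0)) / e ** 2
assert abs(Gt - Gxx) < 1e-5
```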

9.2 The initial-value problem in one dimension


In practice, we are often interested in finding the solutions of initial value problems, rather than
causal solutions. Consider the initial value problem

u,t = u,xx,  t > 0,
u(x, 0) = g(x).   (636)

89
Since we are only interested in u(x, t) for t > 0, we again use a trick we have seen before, and write

v(x, t) = H(t) u(x, t),   (637)

so v = 0 for t < 0 and v = u for t > 0. Compare the definition of z(t) in Eq. (422) above. Then

v,t = δ(t) u(x, t) + H(t) u,t   (638)

and since δ(t) u(x, t) = δ(t) u(x, 0) (this is just δ(t)f(t) = δ(t)f(0); think of x as some fixed
parameter), this becomes

v,t = δ(t) g(x) + H(t) u,t.   (639)

As H(t) does not depend on x,

v,xx = H(t) u,xx   (640)

and hence

v,t − v,xx = δ(t) g(x) + H(t) (u,t − u,xx),   (641)

so that

v,t − v,xx = δ(t) g(x).   (642)
The causal solution of this problem is

v(x, t) = ∫_{−∞}^{t} ∫_{−∞}^{∞} [1/(2√(π(t − τ)))] e^{−(x−y)²/(4(t−τ))} δ(τ) g(y) dy dτ.   (643)

For t > 0, the integral ∫_{−∞}^{t} … δ(τ) dτ simply picks out the value of the integrand at τ = 0, while
for t < 0 the point τ = 0 is not in the integration range, so that

v(x, t) = H(t) ∫_{−∞}^{∞} [1/(2√(πt))] e^{−(x−y)²/(4t)} g(y) dy,   (644)

and hence for t > 0 we have

u(x, t) = (1/(2√(πt))) ∫_{−∞}^{∞} e^{−(x−y)²/(4t)} g(y) dy.   (645)

Example 9.1. Consider the particular case g(x) = H(x). Since H(y) = 0 for y < 0 and H(y) = 1
for y > 0,

u(x, t) = ∫₀^{∞} (1/(2√(πt))) e^{−(x−y)²/(4t)} dy.   (646)

Now put q = (y − x)/(2√t) to obtain

u(x, t) = (1/√π) ∫_{−x/(2√t)}^{∞} e^{−q²} dq.   (647)

This integral cannot be evaluated in terms of elementary functions, but we can express it in terms
of a special function, the complementary error function:

u(x, t) = ½ erfc( −x/(2√t) ).   (648)
The error function is defined as

erf(x) := (2/√π) ∫₀^x e^{−q²} dq,   (649)

and it has the properties (see Fig. 13)

erf(−x) = −erf(x),
erf(0) = 0,
erf(∞) = 1,
erf(−∞) = −1,
erf(x) is a monotonically increasing function of x.

Figure 13: The error function erf(x).

The complementary error function is defined as

erfc(x) := (2/√π) ∫_x^{∞} e^{−q²} dq.   (650)

The error and complementary error functions are related by

erf(x) + erfc(x) = 1,   (651)

because

(2/√π) ( ∫₀^x e^{−q²} dq + ∫_x^{∞} e^{−q²} dq ) = (2/√π) ∫₀^{∞} e^{−q²} dq = 1.   (652)

91
Figure 14: The complementary error function erfc(x).
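The closed form (648) can be compared against direct numerical quadrature of (645) with g = H, using the standard library's math.erfc. In the following Python sketch the quadrature grid and the sample points are arbitrary illustrative choices:

```python
import math

def u_closed(x, t):
    # Closed form (648): u = (1/2) erfc(-x / (2 sqrt(t)))
    return 0.5 * math.erfc(-x / (2 * math.sqrt(t)))

def u_quadrature(x, t, n=4000, length=30.0):
    # Midpoint-rule evaluation of (645) with g = H: integrate the kernel over y > 0
    h = length / n
    s = sum(math.exp(-(x - (i + 0.5) * h) ** 2 / (4 * t)) for i in range(n)) * h
    return s / (2 * math.sqrt(math.pi * t))

for x in (-1.0, 0.0, 0.5, 2.0):   # arbitrary sample points
    assert abs(u_quadrature(x, 1.0) - u_closed(x, 1.0)) < 1e-5
```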

Example 9.2. Show that if u(x, t) satisfies

u,t − u,xx = 0,  t > 0,
u(x, 0) = e^{−x²/ℓ²},   (653)

then

u(0, t) = ℓ / √(ℓ² + 4t).   (654)

We have

u(x, t) = (1/(2√(πt))) ∫_{−∞}^{∞} e^{−(x−y)²/(4t)} e^{−y²/ℓ²} dy.   (655)

Hence

u(0, t) = (1/(2√(πt))) ∫_{−∞}^{∞} e^{−y²(1/ℓ² + 1/(4t))} dy   (656)

and putting q := y √(1/ℓ² + 1/(4t)) this becomes

u(0, t) = (1/(2√(πt))) (1/√(1/ℓ² + 1/(4t))) ∫_{−∞}^{∞} e^{−q²} dq = ℓ / √(ℓ² + 4t).   (657)
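The result (654) can be confirmed by numerical quadrature of the Gaussian integral (656). In the following Python sketch the values of t and ℓ and the quadrature grid are arbitrary test cases:

```python
import math

def u0_closed(t, ell):
    # Closed form (654)
    return ell / math.sqrt(ell ** 2 + 4 * t)

def u0_quadrature(t, ell, n=4000, length=40.0):
    # Midpoint-rule evaluation of (656) over the real line
    a = 1.0 / ell ** 2 + 1.0 / (4 * t)
    h = 2 * length / n
    s = sum(math.exp(-a * (-length + (i + 0.5) * h) ** 2) for i in range(n)) * h
    return s / (2 * math.sqrt(math.pi * t))

for t, ell in ((0.1, 1.0), (1.0, 2.0), (5.0, 0.5)):   # arbitrary test cases
    assert abs(u0_quadrature(t, ell) - u0_closed(t, ell)) < 1e-8
```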

9.3 The three-dimensional problem


The (causal) Green’s function, G(x, t; y, ⌧ ) for the di↵usion equation is defined by

G,t xG = (t ⌧ ) (x y), G = 0 if t < ⌧. (658)

To find it, let G(x, t, y, ⌧ ) = G(z, T ), where z := |x y| and T := t ⌧ , and recall that
✓ ◆
1 z2
g(z, T ) = p exp (659)
2 ⇡T 4T
obeys
g,T = g,zz , g(z, 0) = (z). (660)
Therefore G(z, T ) = H(T )g(z, T ) obeys

G,T G,zz = (T )g(z, 0) + H(T ) (g,T g,zz ) = (T ) (z) (661)

92
In three dimensions, try

G(z, T) = H(T) g(z₁, T) g(z₂, T) g(z₃, T).   (662)

Clearly this vanishes for T < 0 as required. Furthermore, it also obeys

G,T − ΔG = G,T − (G,z₁z₁ + G,z₂z₂ + G,z₃z₃)
= δ(T) g(z₁, 0) g(z₂, 0) g(z₃, 0)
  + H(T) { [g,T(z₁, T) − g,z₁z₁(z₁, T)] g(z₂, T) g(z₃, T) + 2 more terms }
= δ(T) δ(z₁) δ(z₂) δ(z₃) = δ(T) δ(z).   (663)

We have shown that the causal Green's function for the heat equation in three dimensions in
free space is

G(x, t; y, τ) = [ H(t − τ) / (8(π(t − τ))^{3/2}) ] exp( −|x − y|² / (4(t − τ)) ).   (664)

By a similar argument, the two-dimensional Green's function is given by

G(x, t; y, τ) = [ H(t − τ) / (4π(t − τ)) ] exp( −|x − y|² / (4(t − τ)) ),   (665)

where now x = (x₁, x₂), y = (y₁, y₂). Clearly this method works in any number of space dimensions.
Remark 9.3. Consider the initial value problem for the heat equation in three dimensions,

u,t − Δu = 0,
u(x, 0) = f(x).   (666)

We turn this initial value problem into a causal problem in the usual way, that is, we assume t > 0
and write

v(x, t) = H(t) u(x, t).   (667)

Then v satisfies

v,t − Δv = δ(t) f(x).   (668)

Thus

v(x, t) = ∫_{−∞}^{t} ∫_{R³} G(x, t; y, τ) δ(τ) f(y) d³y dτ.   (669)

For t > 0 the δ(τ) in the integral ∫_{−∞}^{t} … dτ simply picks out the value of the integrand at τ = 0,
and for t < 0 we get nothing, so

v(x, t) = H(t) ∫_{R³} G(x, t; y, 0) f(y) d³y.   (670)

Thus for t > 0 (when v = u) we have

u(x, t) = ∫_{R³} G(x, t; y, 0) f(y) d³y = ∫_{R³} (1/(8(πt)^{3/2})) exp( −|x − y|²/(4t) ) f(y) d³y.   (671)

Example 9.4. Suppose that u(x, t) satisfies the initial value problem

u,t = Δu,  t > 0,  u(x, 0) = (1 − |x|) H(1 − |x|).   (672)

Show that for t > 0

u(0, t) = erf( 1/(2√t) ) + 4√(t/π) ( e^{−1/(4t)} − 1 ).   (673)

The solution of the initial value problem is

u(x, t) = (1/(8(πt)^{3/2})) ∫_{R³} exp( −|x − y|²/(4t) ) (1 − |y|) H(1 − |y|) d³y,   (674)

and hence

u(0, t) = (1/(8(πt)^{3/2})) ∫_{R³} exp( −|y|²/(4t) ) (1 − |y|) H(1 − |y|) d³y.   (675)

As this integrand depends only on r = |y| we change to spherical polar coordinates:

u(0, t) = (1/(8(πt)^{3/2})) ∫₀^{∞} ∫₀^{2π} ∫₀^{π} exp( −r²/(4t) ) (1 − r) H(1 − r) r² sin θ dθ dϕ dr.   (676)

The integrals separate and we have

∫₀^{2π} ∫₀^{π} sin θ dθ dϕ = 4π,   (677)

so that

u(0, t) = (1/(2√π t^{3/2})) ∫₀^{∞} exp( −r²/(4t) ) (1 − r) H(1 − r) r² dr.   (678)

Since H(1 − r) = 0 for r > 1 and H(1 − r) = 1 for r < 1 this becomes

u(0, t) = (1/(2√π t^{3/2})) ∫₀^{1} exp( −r²/(4t) ) (1 − r) r² dr.   (679)

If we put q = r/(2√t) this becomes

u(0, t) = (4/√π) ∫₀^{1/(2√t)} e^{−q²} ( q² − 2√t q³ ) dq,   (680)

and hence, integrating by parts, we obtain the desired solution (673).
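Both sides of (673) can be compared numerically, which is a useful check on the integration by parts. In this Python sketch the radial integral (679) is evaluated by the midpoint rule, with arbitrary grid parameters and sample times:

```python
import math

def u0_radial(t, n=20000):
    # Midpoint-rule evaluation of the radial integral (679)
    h = 1.0 / n
    s = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        s += math.exp(-r * r / (4 * t)) * (1 - r) * r * r * h
    return s / (2 * math.sqrt(math.pi) * t ** 1.5)

def u0_closed(t):
    # Claimed closed form (673)
    return (math.erf(1 / (2 * math.sqrt(t)))
            + 4 * math.sqrt(t / math.pi) * (math.exp(-1 / (4 * t)) - 1))

for t in (0.05, 0.5, 2.0):   # arbitrary times
    assert abs(u0_radial(t) - u0_closed(t)) < 1e-6
```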


Note that we can always integrate an integral of the form ∫ e^{−q²} qⁿ dq by parts to get either an
error function if n is even, or an exponential function if n is odd.

9.4 Exercises
58. Homework 26: (A good way of understanding the lecture better) Show that an alternative
similarity ansatz to the one in the lectures is ξ := z²/T and φ := z g, and use this to find
g(z, T).
59. Homework 27: (Exam-style) Find the solution of

u,t = u,xx,  t > 0,   (681)
u → 0,  x → ±∞,   (682)
u(x, 0) = H(x + 1) H(1 − x)   (683)

in terms of error functions.


60. Homework 28: (Exam-style) Evaluate
u(0, t) = (1/(2√π t^{3/2})) ∫₀^{1} exp( −r²/(4t) ) (1 − r) r² dr.   (684)

94
10 The wave equation
10.1 One space dimension
10.1.1 The Green’s function in one space dimension
Consider the Green’s function for the wave equation in one space dimension,
1
G,tt G,xx = (x y) (t ⌧ ). (685)
c2
If we put z = x y and T = t ⌧ this becomes
1
G,T T G,zz = (z) (T ). (686)
c2
As a boundary condition, we want the solution G(z, T ) which has outgoing wave behaviour, that
is, the solution for which waves move outwards from z = 0 towards infinity.
Outside the point (z = 0, T = 0), the d'Alembert solution applies, which can be written as

G(z, T) = F(T − z/c) + E(T + z/c).   (687)

Here F(T − z/c) represents a wave travelling towards increasing z, while E(T + z/c) represents a
wave travelling towards decreasing z.
Now, if we want a wave that travels away from its source at z = 0, then for z > 0 we need
the wave moving towards increasing z, but for z < 0 we need it to travel towards decreasing z.
Consider therefore the ansatz

G(z, T) = f(T − |z|/c).   (688)

Note that |z| = z for z > 0 and |z| = −z for z < 0. It is easy to see that (688) has the property of
travelling away from z = 0 for either z < 0 or z > 0. However, at z = 0 it does not obey the wave
equation. But that may be to the good, because d|z|/dz = sgn(z) and d sgn(z)/dz = 2δ(z).
Taking two time derivatives of (688), we easily find

(1/c²) G,TT = (1/c²) f″(T − |z|/c).   (689)

Taking a first space derivative, we find

G,z = −(sgn(z)/c) f′(T − |z|/c),   (690)

and hence taking another space derivative we find

G,zz = −(2δ(z)/c) f′(T − |z|/c) + (sgn(z)/c)² f″(T − |z|/c).   (691)

Since sgn(z)² = 1 whatever z is, and δ(z)f(z) = δ(z)f(0), this becomes

G,zz = −(2δ(z)/c) f′(T) + (1/c²) f″(T − |z|/c).   (692)

Hence

(1/c²) G,TT − G,zz = (2δ(z)/c) f′(T).   (693)

Comparing with (686), we have

f′(T) = (c/2) δ(T)   (694)

and hence

f(T) = (c/2) H(T)   (695)

plus a constant, which is seen to be zero from causality: we want f(t − τ) = 0 for t < τ. Thus

G(z, T) = f(T − |z|/c) = (c/2) H(T − |z|/c),   (696)

so that

G(x, t; y, τ) = (c/2) H[ (t − τ) − |x − y|/c ].   (697)

95
10.1.2 The initial value problem in one space dimension
Often, instead of wanting the causal solution, we want to solve the initial value problem

(1/c²) u,tt − u,xx = 0,  t > 0,  u(x, 0) = f(x),  u,t(x, 0) = g(x).   (698)

We can turn this into a problem solvable with the causal Green's function by the usual method.
That is, put

v(x, t) = H(t) u(x, t),   (699)

so that v = u for t > 0. Then, as usual, we have

v,t = δ(t) u(x, 0) + H(t) u,t,   (700)

where we have used that δ(t) u(x, t) = δ(t) u(x, 0),

v,tt = δ′(t) u(x, 0) + δ(t) u,t(x, 0) + H(t) u,tt,   (701)

and note that the naive Leibniz rule does not apply. Also

v,xx = H(t) u,xx.   (702)

Thus

(1/c²) v,tt − v,xx = (1/c²) [δ′(t) f(x) + δ(t) g(x)] + (1/c²) H(t) [u,tt − c² u,xx]   (703)

and hence

(1/c²) v,tt − v,xx = (1/c²) ( δ′(t) f(x) + δ(t) g(x) ).   (704)

The source term vanishes except at t = 0. If we solve this with the causal Green's function (697)
we therefore get a solution v(x, t) that vanishes for t < 0, which is precisely what we want, so the
causal Green's function is the correct one to use.
For t > 0, v = u and we have

u = (1/c²) ∫_{−∞}^{∞} ∫_{−∞}^{∞} G(x, t; y, τ) ( δ′(τ) f(y) + δ(τ) g(y) ) dy dτ,   (705)

where G is the one-dimensional Green's function. Using the defining properties of δ(τ) and its
derivative δ′(τ), we obtain

u(x, t) = (1/c²) ∫_{−∞}^{∞} [ G(x, t; y, 0) g(y) − G,τ(x, t; y, 0) f(y) ] dy.   (706)
Now from (697) we find

G,τ(x, t; y, 0) = −(c/2) δ(t − |x − y|/c),   (707)

and so

u(x, t) = (1/(2c)) [ ∫_{−∞}^{∞} H(t − |x − y|/c) g(y) dy + ∫_{−∞}^{∞} δ(t − |x − y|/c) f(y) dy ].   (708)

Finally note that

t − |x − y|/c < 0  ⟺  y < x − ct or y > x + ct,   (709)
t − |x − y|/c = 0  ⟺  y = x − ct or y = x + ct,   (710)
t − |x − y|/c > 0  ⟺  x − ct < y < x + ct,   (711)

and that the δ-function in (708) contributes a factor c at each of its zeros y = x ± ct, because
|d(t − |x − y|/c)/dy| = 1/c there. Hence

u(x, t) = ½ [ f(x + ct) + f(x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(y) dy.   (712)

It is interesting to write this explicitly in the d'Alembert form. In terms of the primitive
function ĝ of g, defined by

ĝ(z) := (1/c) ∫^z g(y) dy,   (713)

we can write

u(x, t) = ½ [ f(x + ct) + ĝ(x + ct) ] + ½ [ f(x − ct) − ĝ(x − ct) ],   (714)

which is of the form

u(x, t) = F(x − ct) + E(x + ct).   (715)
96
10.2 The three-dimensional problem
10.2.1 The Green’s function in three space dimensions
The three-dimensional wave equation problem is

(1/c²) u,tt − Δu = f(x, t),   (716)

together with a radiation condition. Specifically, waves should travel outwards from points where
f(x, t) ≠ 0 towards infinity, rather than travel in from infinity.
The associated Green's function G(x, t; y, τ) satisfies

(1/c²) G,tt − Δ_x G = δ(t − τ) δ(x − y),   (717)

together with the radiation condition that disturbances should radiate away from the point of
disturbance, x = y, rather than towards it. If we introduce z = x − y and T = t − τ then (717)
becomes

(1/c²) G,TT − Δ_z G = δ(T) δ(z).   (718)

Since δ(z) = δ(r)/(4πr²), where r = |z|, it follows that G = G(r, T). Using this, and the identity
(473), we find

(1/c²) G,TT − (1/r)(rG),rr = δ(T) δ(r)/(4πr²).   (719)

For r ≠ 0 we have δ(r) = 0, and after multiplying through by r we obtain

(1/c²) (rG),TT − (rG),rr = 0,   (720)

which is the one-dimensional wave equation for rG. Its general solution can be written as

G(r, T) = F(T − r/c)/(4πr) + E(T + r/c)/(4πr)   (721)

for some functions F(T − r/c) and E(T + r/c). (The factor of 1/4π is introduced purely for
convenience in what follows.)
The radiation condition that disturbances should radiate away from x = y rather than towards
it becomes the condition that disturbances should move away from r = 0 towards increasing r.
Hence we must take

G(r, T) = F(T − r/c)/(4πr).   (722)
This solution is singular as r → 0, so we define a "generalised version" of it by

G(r, T) = lim_{ε→0} ( H(r − ε) F(T − r/c)/(4πr) ),   (723)

where the point r = 0 is excluded. We then use the result from Sec. 7.4 that

lim_{ε→0} Δ_z ( H(r − ε) f(r)/(4πr) ) = f″(r)/(4πr) − f(0) δ(z),   (724)

where r := |z|, to find

Δ_z G = F″(T − r/c)/(4πrc²) − F(T) δ(z).   (725)

We also have

G,TT = F″(T − r/c)/(4πr),   (726)

and putting the two together, we have

(1/c²) G,TT − Δ_z G = F(T) δ(z).   (727)

Thus we need

F(T) = δ(T)   (728)

and so

G(r, T) = δ(T − r/c)/(4πr).   (729)
Recalling our shorthands T = t − τ, z = x − y and r = |z|, written out in full this is

G(x, t; y, τ) = (1/(4π|x − y|)) δ( t − τ − |x − y|/c ).   (730)

Note that G = 0 except where |x − y| = c(t − τ). The δ-function here is the scalar one, not the
three-dimensional one.

10.2.2 Retarded potentials


The causal solution of the problem

(1/c²) u,tt − Δu = f(x, t)   (731)

in three space dimensions is

u(x, t) = ∫_{−∞}^{∞} ∫_{R³} f(y, τ) (1/(4π|x − y|)) δ( t − τ − |x − y|/c ) d³y dτ.   (732)

The integral over τ simply picks out the value of the integrand at τ = t − |x − y|/c, and so

u(x, t) = (1/4π) ∫_{R³} f(y, t − |x − y|/c) / |x − y| d³y.   (733)

This form of the solution is called a retarded potential. Essentially, the solution at point x and
time t represents the superposition (in the form of an integral over space) of disturbances at points
y, attenuated by a factor of 1/distance. This looks very similar to the Green's function
solution of the Poisson equation; hence the name "potential". But the source is evaluated not at
time t, but at an earlier time, namely earlier by the time it takes for the disturbance to travel in a
straight line at speed c; hence the name "retarded".
Example 10.1. Find the field generated by a time-harmonic point source at the origin which is
switched on at time t = 0:

(1/c²) u,tt − Δu = H(t) δ(x) e^{iωt}.   (734)

Using the retarded potential integral (733) with f(x, t) = δ(x) H(t) e^{iωt}, we have

u(x, t) = (1/4π) ∫_{R³} δ(y) H(t − |x − y|/c) e^{iω(t − |x − y|/c)} / |x − y| d³y = H(t − |x|/c) e^{iω(t − |x|/c)} / (4π|x|).   (735)

The Heaviside function confines the solution to the expanding sphere

|x| ≤ ct,   (736)

whose radius expands at the wave speed c. The surface |x| = ct of this sphere represents the wave
front of the expanding sphere of disturbance. Once the wave front has passed a given point x, so
that x is inside the sphere, the solution is simply

u(x, t) = (1/(4π|x|)) e^{iω(t − |x|/c)}   (737)

and the phase of the oscillation at x differs from the phase of the oscillation at the source of
the disturbance, 0, by −ω|x|/c = −k|x|. This phase difference simply reflects the time |x|/c it takes the
disturbance to propagate from 0 to x, travelling at speed c. The factor of 1/(4π|x|) says that the
amplitude decays inversely with the distance from the source.
Discarding the time factor in (737), this has the same form as the Green's function for the
Helmholtz equation (but with the opposite sign). This is not surprising since, for |x| < ct, we have

(1/c²) u,tt = Δu = −k² u   (738)

and hence

(1/c²) u,tt − Δu = H(t) δ(x) e^{iωt}   (739)

reduces to

−(Δ + k²) u = H(t) δ(x) e^{iωt},   (740)

and writing u = H(t) e^{iωt} ψ this becomes the problem for (minus) the Helmholtz Green's function:

(Δ + k²) ψ = −δ(x).   (741)

10.3 The method of descent


10.3.1 From three to one dimensions
The one-dimensional wave equation, with a source term, is

(1/c²) u,tt − u,xx = f(x₁, t).   (742)

We can think of this as a three-dimensional equation in planar symmetry,

(1/c²) u,tt − Δu = f(x₁, t),   (743)

where, since the source term depends only on x₁ and t, we can choose u(x, t) to also depend only
on x₁ and t. Thus the solution can be written as

u(x₁, t) = ∫_{τ=−∞}^{∞} ∫_{y₁=−∞}^{∞} ( ∫_{y₂=−∞}^{∞} ∫_{y₃=−∞}^{∞} G₃(x, t; y, τ) dy₂ dy₃ ) f(y₁, τ) dy₁ dτ,   (744)

where G₃(x, t; y, τ) is the three-dimensional Green's function (730). Because f depends only on
y₁ and τ, we can evaluate the term in round brackets first.
But because the solution of the one-dimensional problem is also, by definition, given by

u(x₁, t) = ∫_{τ=−∞}^{∞} ∫_{y₁=−∞}^{∞} G₁(x₁, t; y₁, τ) f(y₁, τ) dy₁ dτ,   (745)

we have an expression for the one-dimensional Green's function G₁ as an integral over the three-
dimensional one G₃, namely

G₁(x₁, t; y₁, τ) = ∫_{y₂=−∞}^{∞} ∫_{y₃=−∞}^{∞} G₃(x, t; y, τ) dy₂ dy₃.   (746)

This is called the method of descent (in the number of dimensions). It is equally valid for
the Poisson, Helmholtz and diffusion equations (although it is rather pointless in the case of
the diffusion equation, where we used the one-dimensional Green's function to find the three-
dimensional one).
We now evaluate this integral. First we write z = x − y and T = t − τ, so that the problem
becomes

G₁(z₁, T) = ∫_{z₂=−∞}^{∞} ∫_{z₃=−∞}^{∞} (1/(4π|z|)) δ(T − |z|/c) dz₂ dz₃.   (747)

99
Now introduce cylindrical polar coordinates in which z₁ is the axial direction and

z₂ = r cos θ,  z₃ = r sin θ,   (748)

so that

|z| = √(z₁² + z₂² + z₃²) = √(z₁² + r²)   (749)

and

dz₂ dz₃ = r dθ dr,   (750)

with −∞ < z₁ < ∞, 0 ≤ r < ∞ and 0 ≤ θ < 2π. Then

G₁(z₁, T) = ∫_{r=0}^{∞} ∫_{θ=0}^{2π} (1/(4π√(z₁² + r²))) δ( T − √(z₁² + r²)/c ) r dθ dr.   (751)

As the integrand does not depend on θ, the integration over θ simply gives a factor of 2π, or

G₁(z₁, T) = ½ ∫₀^{∞} (1/√(z₁² + r²)) δ( T − √(z₁² + r²)/c ) r dr.   (752)

Now note that

(d/dr) √(z₁² + r²) = r/√(z₁² + r²),   (753)

so that we can make the substitution

q = (1/c) √(z₁² + r²)  ⟹  dq = (1/c) ( (d/dr) √(z₁² + r²) ) dr   (754)

to obtain

G₁(z₁, T) = (c/2) ∫_{|z₁|/c}^{∞} δ(T − q) dq.   (755)

If |z₁|/c > T the integral is zero, since T = q is not in the integration range, and if |z₁|/c < T the
integral is one, since now the zero of the δ-function is inside the integration range. Thus

G₁(z₁, T) = (c/2) H(T − |z₁|/c),   (756)

and we have recovered (696), as expected.
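The descent integral (752) can also be checked numerically by replacing the δ-function with a narrow Gaussian. In this Python sketch the mollifier width, grid parameters and sample points are arbitrary illustrative choices; the result should approach (756) with c = 1:

```python
import math

EPS = 1e-3  # width of the Gaussian standing in for the delta function

def g3_mollified(dist, T, c=1.0):
    # 3D Green's function (730) with delta(T - r/c) replaced by a narrow Gaussian
    delta = math.exp(-((T - dist / c) / EPS) ** 2) / (EPS * math.sqrt(math.pi))
    return delta / (4 * math.pi * dist)

def g1_by_descent(z1, T, c=1.0, n=40000, radius=10.0):
    # Integrate G3 over the (z2, z3) plane in polar coordinates, as in (752)
    h = radius / n
    s = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        s += g3_mollified(math.sqrt(z1 * z1 + r * r), T) * 2 * math.pi * r * h
    return s

# Expected result (756): G1 = (c/2) H(T - |z1|/c), with c = 1 here
assert abs(g1_by_descent(0.3, 2.0) - 0.5) < 1e-3   # inside the light cone: c/2
assert abs(g1_by_descent(3.0, 2.0)) < 1e-6         # outside the light cone: 0
```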

10.3.2 From three to two dimensions


We can use the method of descent to deduce the two-dimensional Green's function

G₂(x₁, x₂, t; y₁, y₂, τ)   (757)

for the two-dimensional wave equation

(1/c²) u,tt − (u,x₁x₁ + u,x₂x₂) = f(x₁, x₂, t).   (758)

By starting from the three-dimensional wave equation in cylindrical symmetry we find that

G₂(x₁, x₂, t; y₁, y₂, τ) = ∫_{y₃=−∞}^{∞} G₃(x, t; y, τ) dy₃.   (759)

Evaluating this integral is in principle similar to the calculation given above. Writing x = (x₁, x₂)
and y = (y₁, y₂), it can be shown that

G₂(x, t; y, τ) = H(t − τ − |x − y|/c) / ( 2π √( (t − τ)² − |x − y|²/c² ) ).   (760)

(In fact, the method of descent was invented not to find the one-dimensional Green's function but
rather as the easiest way of finding the two-dimensional Green's function!)

100
10.4 Exercises
61. Homework 29: Use the causal Green’s function for the wave equation in one space dimen-
sion to solve the Cauchy problem with a source term
c⁻² u,tt − u,xx = s(x, t),   (761)
u(x, 0) = f(x),   (762)
u,t(x, 0) = g(x).   (763)

[Hint: No need to repeat the part already done in the lecture in detail, focus here on the
source term that was absent in the lecture.]
62. Homework 30: a) Find the causal solution of
c⁻² u,tt − Δu = H(R − |x|) H(t) sin ωt   (764)

in three-dimensional free space, where R > 0 and ! are constants, and simplify as much as
possible (but not more). b) State the long wavelength and large distance approximations
for this problem and find u(x, t) using these approximations. c) Find u(0, t) in closed form.
[Hint: This is a bit fiddly. Use polar coordinates for y, and in your final expression distinguish
the three cases t < 0, 0 < t < c⁻¹R and t > c⁻¹R.]
63. Write out the causal solution of
c⁻² u,tt − Δu = f(x, t)   (765)

in two-dimensional free space. Then eliminate the Heaviside function by restricting the
integration domain instead. [Hint: The restricted integration domain can be described as
the interior of the past lightcone of (x, t). What we are looking for is the two-dimensional
equivalent of the retarded potential for the three-dimensional wave equation.]

101
Index
L2 norm, 23 Euclidean norm, 22
δ-function, 55
l1 norm, 22
l2 norm, 22 fall-off condition, 12, 13, 69
flux, 45
advection equation, 45 flux function, 45
angular frequency, 15 flux law, 45
ansatz, 9 forward Euler, 16
autonomous, 45 Fourier coefficients, 11
Fourier series, 11
boundary, 6 Fourier transform, 11, 29
boundary condition at infinity, 13 free space boundary condition, 13
boundary conditions, 6 free space problem, 69
boundary data, 7 frequency, 15
bounded, 6 fundamental theorem of calculus, 9

Cartesian coordinates, 8 generalised function, 55


Cauchy data, 16 gradient, 7
Cauchy problem, 13
Cauchy-Kowalewski solution, 16 heat equation, 14
Cauchy-Riemann equations, 6, 36 Heaviside function, 56
causal Green’s function, 62 Helmholtz equation, 12
causal solution, 63 high-frequency approximation, 33
chain rule of partial derivatives, 8 homogeneous, 7
characteristic curve, 45, 46 homogeneous boundary conditions, 7
characteristic hypersurfaces, 16 hyperbolic, 15, 38, 40
characteristic velocity, 46
compatibility condition, 82 image point, 84
complementary error function, 91 inhomogeneous, 7
complementary function, 7 inhomogeneous term, 7
conservation law, 43 inhomogenous boundary conditions, 7
conserved quantity, 45 initial conditions, 6
continuous dependence on the data, 20 integral form, 44
continuous dependence on the initial data, 23
convex, 50 jump condition, 48
cylindrical polar coordinates, 8
Kirchhoff-Helmholtz representation, 81
cylindrical symmetry, 100
Laplace equation, 12
d’Alembert solution, 15
large distance approximation, 75
differential form, 43
large distance condition, 76
diffusion constant, 14
Lax shock condition, 50
diffusion equation, 14
linear derivative operator, 30
Dirichlet boundary conditions, 12
linearisation, 39
distribution, 55
Lipschitz continuous, 23
divergence, 7
long wavelength condition, 76
Divergence theorem, 9
domain, 6 maximum norm, 22
method of characteristics, 46
elementary solutions, 31
method of descent, 99
elliptic, 13, 33, 38, 40
method of images, 84
elliptic at the point x, 33
energy norm, 25 nabla, 7
error function, 90 Neumann boundary conditions, 12
estimate, 23 norm, 22

102
normal derivative, 12 unit normal vector, 9

outward-pointing, 9 vectors, 22
volume element, 9
parabolic, 14, 38, 40
particle velocity, 45 wave number, 15, 29
particular integral, 7 wavelength, 15
PDE problem, 6 weak form, 44
PDE system, 6 weak solution, 17, 43
physical space, 6 well-posed, 20
planar symmetry, 99
plane wave solution, 31
Poisson equation, 12
principal part, 30
principal symbol, 30

quasilinear, 40

radiation boundary condition, 72


Rankine-Hugoniot condition, 48
rarefaction wave, 49
real vector space, 22
retarded potential, 98
Riemann problem, 47
Robin boundary conditions, 12

scalar conservation law, 45


scalars, 22
scale-invariant, 48
separation constant, 10
shock, 47
shock location, 48
shock velocity, 48
signum function, 57
similarity ansatz, 88
similarity solution, 48
similarity variables, 88
smooth, 16
source term, 7
spherical polar coordinates, 8
spherically symmetric, 72
stability, 24
standing wave, 15
state space, 6
strictly hyperbolic, 34, 35, 40
strong form, 43
strong solution, 17, 43
surface element, 9
symbol, 30

test function, 17, 55


traffic flow equation, 51
triangle inequality, 22

ultrahyperbolic, 38
unbounded, 6
uniqueness, 20

103
