Multi Grid
Volker John

Contents
1 Literature
2 Model Problems
4 Grid Transfer
  4.1 Algorithms with Coarse Grid Systems, the Residual Equation
  4.2 Prolongation or Interpolation
  4.3 Restriction
8 Outlook
Chapter 1
Literature
Remark 1.1 Literature. There are several text books about multigrid methods,
e.g.,
• Briggs et al. (2000), an easy-to-read introduction,
• Hackbusch (1985), the classical book, sometimes rather hard to read,
• Shaidurov (1995),
• Wesseling (1992), an introductory book,
• Trottenberg et al. (2001).
Chapter 2
Model Problems
Remark 2.1 Motivation. The basic ideas and properties of multigrid methods will
be explained in this course on two model problems. 2
Example 2.2 A two-point boundary value problem. Consider the boundary value
problem
   −u'' = f in Ω = (0, 1),   u(0) = u(1) = 0.   (2.1)
Often, this problem can be solved analytically.
Multigrid methods are solvers for linear system of equations that arise, e.g., in
the discretization of partial differential equations. For this reason, discretizations
of (2.1) will be considered: a finite difference method and a finite element method.
These discretizations are described in detail in the lecture notes of Numerical Math-
ematics III.
Consider an equidistant triangulation of Ω with the nodes 0 = x0 < x1 < . . . <
xN = 1 with the distance h = 1/N between two neighboring nodes.
The application of the second order finite difference scheme leads to a linear
system of equations
Au = f (2.2)
with the tridiagonal matrix A ∈ R^{(N−1)×(N−1)} given by
   a_ij = (1/h²) · {  2 if i = j, i = 1, ..., N − 1;
                     −1 if i = j − 1, i = 2, ..., N − 1, or i = j + 1, i = 1, ..., N − 2;
                      0 else },   (2.3)
and the right-hand side
(f )i = fi = f (xi ), i = 1, . . . , N − 1.
Using the P1 finite element method leads to a linear system of equations (2.2)
with the tridiagonal matrix
   a_ij = (1/h) · {  2 if i = j, i = 1, ..., N − 1;
                    −1 if i = j − 1, i = 2, ..., N − 1, or i = j + 1, i = 1, ..., N − 2;
                     0 else },   (2.4)
and the right-hand side (f)_i = f_i = (f, ϕ_i), i = 1, ..., N − 1, where ϕ_i(x) is the function from the local basis that does not vanish in the node x_i. Note that there is a different scaling in the matrices of the finite difference and the finite element method. 2
Example 2.3 Poisson equation in two dimensions. The Poisson equation in two dimensions with homogeneous boundary conditions has the form
   −Δu = f in Ω = (0, 1)²,   u = 0 on ∂Ω.   (2.5)
Again, an equidistant grid is considered for the discretization of (2.5) with mesh width h_x = h_y = h = 1/N. The nodes are numbered lexicographically.
The application of the finite difference method with the five point stencil leads to a linear system of equations of dimension (N − 1)² with the matrix entries
   a_ij = (1/h²) · {  4 if i = j;
                     −1 if i = j − 1, i = j + 1, i = j − (N + 1), or i = j + (N + 1);
                      0 else },
with obvious modifications for the nodes near the boundary of the domain.
For applying the P1 finite element method, the grid has to be decomposed into
triangles. Using a decomposition where the edges are either parallel to the axes or
parallel to the line y = x, one obtains the matrix
   a_ij = {  4 if i = j;
            −1 if i = j − 1, i = j + 1, i = j − (N + 1), or i = j + (N + 1);
             0 else },
again with obvious modifications for the degrees of freedom near the boundary. 2
Remark 2.4 Properties of the matrices. The matrices arising in both model problems possess the following properties:
• The matrix A is symmetric and positive definite, i.e.,
   xᵀAx > 0 ∀ x ≠ 0,
  where the dimension of the vector x corresponds to the dimension of the matrix A. It follows that all eigenvalues of A are positive.
• The matrix A is diagonally dominant, i.e., it is
   |a_ii| ≥ Σ_{j≠i} |a_ij| ∀ i,
  and there is at least one index for which strict inequality holds. For the considered problems, the strict inequality applies for all nodes or degrees of freedom which are close to the boundary.
It is well known from the course on iterative methods for sparse large linear systems
of equations, Numerical Mathematics II, that these properties are favorable. In fact,
also for multigrid methods, the state of the art is that most of the analysis is known
for systems with symmetric positive definite matrices, or matrices which are only
slight perturbations of such matrices. However, in practice, multigrid methods often
work very well also for the solution of systems with other matrices.
Even if the properties given above are favorable, the condition number of the matrices might be large. A direct calculation reveals (this was an exercise problem in Numerical Mathematics II) that in one dimension the eigenvalues of the finite element matrix A are
   λ_k = (4/h) sin²(kπ/(2N)), k = 1, ..., N − 1,   (2.6)
and the corresponding eigenvectors v_k = (v_{k,1}, ..., v_{k,N−1})ᵀ with
   v_{k,j} = sin(jkπ/N), j, k = 1, ..., N − 1.   (2.7)
Then, a direct calculation, using a theorem for trigonometric functions and a Taylor series expansion, shows for the spectral condition number
   κ₂(A) = λ_max(A)/λ_min(A) = sin²((N−1)π/(2N)) / sin²(π/(2N)) = sin²((1−h)π/2) / sin²(hπ/2)
        = ( (sin(π/2) cos(hπ/2) − cos(π/2) sin(hπ/2)) / sin(hπ/2) )² = ( cos(hπ/2) / sin(hπ/2) )²
        = cot²(hπ/2) = ( 2/(πh) − O(h) )² = O(h⁻²).
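The eigenvalue formula (2.6) and the growth of the condition number can be checked numerically. The following Python sketch (not part of the original notes; all names are ad hoc) builds the one-dimensional finite element matrix and compares its eigenvalues with (2.6); the last printed column approaches 4/π², confirming κ₂(A) = O(h⁻²).

```python
import numpy as np

# Check of (2.6): eigenvalues of the 1D finite element matrix A = (1/h) tridiag(-1, 2, -1)
# and the growth of the spectral condition number kappa_2(A) = O(h^{-2}).
for N in [8, 16, 32, 64]:
    h = 1.0 / N
    A = (np.diag(2.0 * np.ones(N - 1)) + np.diag(-np.ones(N - 2), 1)
         + np.diag(-np.ones(N - 2), -1)) / h
    lam = np.sort(np.linalg.eigvalsh(A))
    k = np.arange(1, N)
    lam_formula = 4.0 / h * np.sin(k * np.pi / (2 * N)) ** 2   # formula (2.6)
    print(N, np.max(np.abs(lam - lam_formula)), lam[-1] / lam[0] * h ** 2)
```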
Example 2.5 Behavior of iterative methods for the Poisson equation. Consider
the Poisson equation (2.5) with f = 1 for all x ∈ Ω and the P1 finite element
discretization of this problem on meshes with different fineness.
Table 2.1 gives the number of iterations and the computing times for different
solvers applied to the solution of this problem. The simulations were performed
with the research code MooNMD, John and Matthies (2004). The SSOR method
and the conjugate gradient method (CG) are already known from Numerical Math-
ematics II. For these methods, not the system Au = f was solved, but the diagonally preconditioned system
   D⁻¹Au = D⁻¹f.
Table 2.1: Example 2.5. Number of iterations and computing times (13/10/11 on a HP BL460c Gen8 2xXeon, Eight-Core 2700MHz). The number of degrees of freedom (d.o.f.) includes the Dirichlet values. The last row gives the approximate growth factor from one level to the next.

level      h      d.o.f.      SSOR            PCG            MG        FGMRES+MG    UMFPACK
                           ite    time    ite    time    ite   time    ite  time    ite  time
  1      1/4         25       49     0      3      0     11      0      6     0      1     0
  2      1/8         81      164     0      9      0     13      0      8     0      1     0
  3      1/16       289      543     0     31      0     13      0      8     0      1     0
  4      1/32      1089     2065  0.07     66   0.01     14   0.03      8  0.01      1  0.01
  5      1/64      4225     7998  0.92    132   0.02     14   0.11      8  0.03      1  0.03
  6      1/128    16641    31054 14.61    263   0.16     13   0.35      8  0.21      1  0.12
  7      1/256    66049  >100000          524   1.79     13   1.55      8  1.06      1  0.75
  8      1/512   263169                  1038  16.55     12   6.09      8  3.90      1  5.40
  9      1/1024 1050625                  1986 127.76     12  27.46      7 18.32      1 46.46
 10      1/2048 4198401                  3944 1041.68    12 111.03      7 68.38
factor      ≈ 4                4    16      2      8      1      4      1     4      1
The number of floating point operations per iteration is, for all iterative methods, proportional to the number of degrees of freedom. This yields the total numbers of operations given in Table 2.2. One can observe that the estimate for the
number of iterations is sharp for PCG. For the multigrid approaches, the total
number of operations is proportional to the number of unknowns. Since in the
solution of a linear system of equations, each unknown has to be considered at
least once, the total number of operations is asymptotically optimal for multigrid
methods.
Table 2.2: Example 2.5. Number of floating point operations, where n is the number of degrees of freedom.

method   op./iter.   no. of iterations                       total no. of operations
SSOR     O(n)        O(κ₂(A)) = O(h⁻²) = O(n)                O(n²)
PCG      O(n)        O(√κ₂(A)) = O(h⁻¹) = O(n^{1/2})         O(n^{3/2})
MG       O(n)        O(1)                                    O(n)
In addition, it can be seen that it is even more efficient to use the multigrid
method as a preconditioner in a Krylov subspace method than as a solver. One
has to use here the flexible GMRES method since the preconditioner is not a fixed
matrix but a method. That means, the preconditioner might change slightly from
iteration to iteration. The flexible GMRES method can cope with this difficulty.
The development of sparse direct solvers has shown remarkable progress in the
last couple of years. One can observe that for the model problem, the direct solver is
best for small and medium sized problems, up to about 100000 degrees of freedom.
But for large problems, good iterative methods are still better. On the finest grid, UMFPACK could not solve the problem because of an internal memory limitation in this program. 2
Chapter 3
Detailed Investigation of
Classical Iterative Schemes
Remark 3.1 The linear system of equations. Consider a linear system of equations
   Au = f, A ∈ R^{n×n}, u, f ∈ R^n.   (3.1)

Remark 3.2 General approach. Classical iterative schemes for the solution of (3.1) were introduced and studied in Numerical Mathematics II. Here, a short review is presented and notations are introduced.
Classical iterative schemes are based on a fixed point iteration for solving the
linear system of equations. To this end, decompose the matrix
A = M − N, (3.2)
where M is a non-singular matrix. Then, one can write system (3.1) in the fixed point form
   Mu = Nu + f
or
   u = M⁻¹Nu + M⁻¹f =: Su + M⁻¹f.
Given an initial iterate u(0), a fixed point iteration can be applied to this equation:
   u(m+1) = Su(m) + M⁻¹f, m = 0, 1, 2, ....   (3.3)
A damped version of this iteration, with a damping parameter ω, reads
   u(m+1) = (ωS + (1 − ω)I) u(m) + ωM⁻¹f.   (3.4)
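Both (3.3) and (3.4) update the current iterate with a (scaled) preconditioned residual, since ωS + (1 − ω)I = I − ωM⁻¹A. A minimal Python sketch of this update (an illustration with hypothetical names, assuming dense NumPy matrices):

```python
import numpy as np

def fixed_point_iteration(A, f, M, u0, omega=1.0, maxit=100):
    """Damped fixed point iteration (3.3)/(3.4) for A u = f with splitting A = M - N.
    Equivalent form: u <- u + omega * M^{-1} (f - A u); omega = 1 gives (3.3)."""
    u = u0.copy()
    for _ in range(maxit):
        u = u + omega * np.linalg.solve(M, f - A @ u)
    return u
```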
Remark 3.3 The residual equation. Let u be the solution of (3.1) and u(m) an
approximation computed with (3.3) or (3.4). The error is denoted by
e(m) = u − u(m)
and the residual by
r(m) = f − Au(m) .
It is for the fixed point iteration (3.3)
   Se(m) = M⁻¹Nu − Su(m) = M⁻¹Nu − (u(m+1) − M⁻¹f)
        = M⁻¹(Nu + f) − u(m+1) = u − u(m+1) = e(m+1).   (3.5)
For both iterations (3.3) and (3.4), the so-called residual equation has the form
Ae(m) = Au − Au(m) = f − Au(m) = r(m) . (3.6)
Remark 3.4 To multigrid methods. In multigrid methods, the residual equation
(3.6) is used for updating the current iterate u(m) . An approximation ẽ(m) of e(m)
is computed from (3.6) and the new iterate is given by u(m+1) = u(m) + ẽ(m) . An
advantage of using the residual equation is that, at least close to the solution, e(m)
is small and the zero vector is a good initial guess for an iterative solution of (3.6).
Remark 3.5 To the convergence of classical iteration schemes. From (3.5), it follows by induction that e(m) = S^m e(0), such that
   ||e(m)|| ≤ ||S^m|| ||e(0)||   (3.7)
for each vector norm and its induced matrix norm. The iteration is called convergent if
   lim_{m→∞} ||S^m|| = 0,
and ||S^m|| is called the contraction number of the fixed point iteration (3.3). It was shown in the course Numerical Mathematics II, Theorem 3.3 in the part on iterative solvers, that the fixed point iteration (3.3) converges for any initial iterate if and only if ρ(S) < 1, where ρ(S) = max_i |λ_i(S)| is the spectral radius of S. In connection with iterative schemes, the spectral radius is also called convergence factor. It is, loosely speaking, the worst factor for the reduction of the error in each step of the iteration.
For each eigenvalue λ_i ∈ C of a matrix A ∈ R^{n×n} it is |λ_i| ≤ ||A||, where ||·|| is any induced matrix norm. It follows that ρ(S) ≤ ||S||.
Let M ∈ N be the smallest natural number for which
   ||e(M)|| / ||e(0)|| ≤ 10⁻¹,
i.e., the smallest number of iterations which are needed for reducing the error by the factor 10. This condition is satisfied approximately if
   ||e(M)|| / ||e(0)|| ≤ ||S^M|| ≈ ρ(S^M) = (ρ(S))^M ≈ 10⁻¹.
It follows that
   M ≳ − 1 / log₁₀ |ρ(S)|.
The number − log₁₀ |ρ(S)| is called the rate of convergence. If it is close to zero, i.e., if ρ(S) is close to one, then M is large and the convergence is slow. The convergence is the faster the closer ρ(S) is to zero. 2
3.2 The Jacobi and Damped Jacobi Method
Remark 3.6 The Jacobi and the damped Jacobi method. The Jacobi method is given by M = diag(A) = D in (3.2). A straightforward calculation shows, see also Numerical Mathematics II, that it has the form
   u(m+1) = u(m) + D⁻¹(f − Au(m)), m = 0, 1, 2, ....
The damped Jacobi method applies a damping parameter ω to the update:
   u(m+1) = u(m) + ωD⁻¹(f − Au(m)), m = 0, 1, 2, ....   (3.8)
Remark 3.7 Discrete Fourier modes. To study the behavior of the (damped)
Jacobi method for the one-dimensional model problem, it is sufficient to consider
the homogeneous linear system of equations
Au = 0 (3.9)
and an arbitrary initial iterate u(0) . The solution of the homogeneous system is
u = 0. Obviously, the matrix from the finite element discretization can be used
without loss of generality.
Let b be a given integrable function in [0, 1] with b(0) = b(1) = 0. This function
can be expanded in the form
   b(x) = Σ_{k=1}^∞ b_k sin(kπx),
where k is the wave number and b_k is the k-th Fourier coefficient. The functions sin(kπx) are called Fourier modes; their frequency increases with k. Small wave numbers characterize long and smooth waves, whereas large wave numbers describe highly oscillating waves, see Figure 3.1.
Figure 3.1: Fourier modes sin(kπx) for k = 1, 3, 6.

On a grid with the nodes x_j = jh, j = 1, ..., N − 1, the Fourier modes are represented by the vectors w_k with the components
   w_{k,j} = sin(jkπ/N), j, k = 1, ..., N − 1,   (3.10)
the so-called discrete Fourier modes.
Note that these discrete Fourier modes are also the eigenvectors of the matrix A,
see (2.7).
The discrete Fourier modes in the lower part of the spectrum 1 ≤ k < N/2 are
called low frequency or smooth modes. The modes in the upper part of the spectrum
N/2 ≤ k ≤ N − 1 are the so-called high frequency or oscillating modes. Note that
the classification of the discrete modes depends on the number of nodes N . The
discrete analogs of the Fourier modes have different properties on different grids. 2
Example 3.8 Application of the damped Jacobi method for the solution of the model problem. The damped Jacobi method (3.8) with ω = 2/3 is applied to the solution of the model problem (3.9) in the following two situations:
• the number of intervals N is fixed and the wave number k is varied,
• the wave number k is fixed and the number of intervals N is varied.
For each simulation, 100 iterations were performed and the error is measured in the l∞ vector norm ||·||_∞. The obtained results are presented in Figures 3.2 and 3.3.
Figure 3.2: Convergence of the damped Jacobi method with ω = 2/3 for initial iterates with different wave numbers (k = 1, 3, 6) on a fixed grid (N = 64), left linear plot, right semilogarithmic plot.
Figure 3.3: Convergence of the damped Jacobi method with ω = 2/3 on different grids (N = 64, 128, 256) for an initial iterate with a fixed wave number (k = 6), left linear plot, right semilogarithmic plot.
• For a fixed wave number, the error is reduced on a coarser grid better than on
a finer grid.
• The logarithm of the error decays linearly, i.e., the error itself decays geometrically. Thus, there is a constant 0 < C(k) < 1 such that
   ||e(n)||_∞ ≤ (C(k))^n ||e(0)||_∞.
Table 3.1: Convergence of the damped Jacobi method with ω = 2/3 for the initial
iterate with wave number k = 6.
N no. of iterations
16 27
32 116
64 475
128 1908
256 7642
512 30576
1024 122314
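The experiment of Example 3.8 is easy to reproduce. A Python sketch (a hypothetical helper, using the finite element matrix from Example 2.2; the exact solution of (3.9) is u = 0, so the iterate itself is the error):

```python
import numpy as np

def damped_jacobi_error(N, k, omega=2.0/3.0, iters=100):
    """Damped Jacobi iteration (3.8) for A u = 0 with the discrete Fourier
    mode w_k as initial iterate; returns the final error in the l_infty norm."""
    h = 1.0 / N
    A = (np.diag(2.0 * np.ones(N - 1)) + np.diag(-np.ones(N - 2), 1)
         + np.diag(-np.ones(N - 2), -1)) / h
    D = np.diag(A).copy()                          # diagonal of A
    u = np.sin(np.arange(1, N) * k * np.pi / N)    # initial error = w_k, see (3.10)
    for _ in range(iters):
        u = u - omega * (A @ u) / D                # u <- u + omega D^{-1} (0 - A u)
    return np.max(np.abs(u))

# the smooth mode (k = 1) decays slowly, oscillating modes decay faster
print(damped_jacobi_error(64, 1), damped_jacobi_error(64, 6))
```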
Remark 3.9 Analytical considerations of the damped Jacobi method. The iteration
matrix of the damped Jacobi method (3.8) has the form
   S_jac,ω = I − ωD⁻¹A = I − (ωh/2) A,   (3.11)
where the diagonal of the finite element system matrix has been inserted. The
convergence of the damped Jacobi method is determined by the eigenvalues of the
iteration matrix. From the special form of this matrix, one can see that
   λ_k(S_jac,ω) = 1 − (ωh/2) λ_k(A) = 1 − 2ω sin²(kπ/(2N)), k = 1, ..., N − 1,   (3.12)
where (2.6) and h = 1/N have been used. The eigenvectors vk of A, see (2.7), are
the same as the eigenvectors of Sjac,ω , exercise. 2
Lemma 3.10 Convergence of the damped Jacobi method. The damped Ja-
cobi method converges for the one-dimensional model problem for all initial iterates
if ω ∈ (0, 1]. The method converges fastest for ω = 1.
Proof: From Numerical Mathematics II it is known that the method converges for all
initial iterates if and only if the spectral radius of the iteration matrix ρ(Sjac,ω ) is smaller
than 1. Since it is
   0 < sin²(kπh/2) = sin²(kπ/(2N)) < 1 for k = 1, ..., N − 1,
it follows from (3.12) that λk (Sjac,ω ) ∈ (−1, 1) for k = 1, . . . , N − 1, and ω ∈ (0, 1]. Hence
it is ρ(Sjac,ω ) < 1.
It is also known from Numerical Mathematics II that the method converges the faster
the smaller ρ(Sjac,ω ) is, i.e., one has to solve (exercise)
   min_{ω∈(0,1]} max_{k=1,...,N−1} |1 − 2ω sin²(kπ/(2N))|.
Remark 3.11 General effect of the damped Jacobi method on the discrete Fourier modes. For studying the effect of the damped Jacobi method on the discrete Fourier modes, see (3.10), an arbitrary initial error e(0) will be represented with respect to the basis {w_1, ..., w_{N−1}}, where w_1, ..., w_{N−1} are the eigenvectors of S_jac,ω and A,
   e(0) = Σ_{k=1}^{N−1} c_k w_k, c_k ∈ R.
Since the damped Jacobi method can be written in form (3.3), it follows from (3.5) that
   e(m) = S_jac,ω^m e(0).
Using the property of w_k being an eigenvector of S_jac,ω, one obtains
   e(m) = Σ_{k=1}^{N−1} c_k S_jac,ω^m w_k = Σ_{k=1}^{N−1} c_k S_jac,ω^{m−1} (S_jac,ω w_k)
        = Σ_{k=1}^{N−1} c_k λ_k(S_jac,ω) S_jac,ω^{m−1} w_k = ... = Σ_{k=1}^{N−1} c_k λ_k^m(S_jac,ω) w_k.
This calculation shows that after m iterations, the initial error with respect to the k-th discrete Fourier mode is reduced by the factor λ_k^m(S_jac,ω). If |λ_k(S_jac,ω)| is close to 1, then the reduction will be small. A strong reduction will occur if |λ_k(S_jac,ω)| is close to zero. 2
Remark 3.12 Effect on the smooth error modes. Using (3.12), one finds that
   λ_k(S_jac,ω) ≈ 1 ⟺ sin²(kπ/(2N)) ≈ 0 ⟺ k small,
   λ_k(S_jac,ω) ≈ −1 ⟺ ω sin²(kπ/(2N)) ≈ 1 ⟺ ω ≈ 1 and k close to N.
For the smoothest error mode, k = 1, a Taylor series expansion gives
   λ_1(S_jac,ω) = 1 − 2ω sin²(πh/2) ≈ 1 − 2ω (h²π²/4) = 1 − ω h²π²/2.
This eigenvalue is close to 1 for all damping parameters ω ∈ (0, 1]. Hence, there is no choice of the damping parameter which results in an efficient damping of the smooth error modes connected to w_1. In addition, λ_1(S_jac,ω) is the closer to 1 the finer the grid is. It follows that refining the grid results in a worse convergence with respect to the smooth error modes. 2
Remark 3.13 Effect on the oscillating error modes. The distribution of the eigenvalues of the iteration matrix for ω ∈ {1, 2/3, 1/2} and N = 16 is presented in Figure 3.4. As was observed in the previous remark, none of the damping parameters gives a method that reduces the smooth error modes efficiently. It can be seen in Figure 3.4 that, using the damping parameter ω = 1, the method does not damp the oscillating error modes efficiently either, but it damps some intermediate error modes efficiently. The situation is much different for the damping parameter ω = 1/2. For this parameter, it can be observed that the oscillating error modes are damped efficiently.

Figure 3.4: Eigenvalues of the iteration matrix S_jac,ω of the damped Jacobi method for different values of the damping parameter, N = 16.
The situation as it occurs for ω = 1/2 is advantageous, since it allows one to distinguish clearly between the low and the high frequencies. With the damped Jacobi method and appropriate damping parameters, there is an iterative scheme that
damps the high frequencies fast and the low frequencies slowly. Now, one needs
another method with complementary properties to combine both methods. The
construction of the complementary method is the goal of multigrid methods. 2
Example 3.14 Optimal damping parameter for the oscillating modes. The damping parameter ω has to be determined such that one finds the smallest interval [−λ̄, λ̄] with λ_k(S_jac,ω) ∈ [−λ̄, λ̄] for k = N/2, ..., N − 1. This goal is achieved with ω = 2/3. In this case it is, using the monotonicity of the sine function,
   4/3 ≥ (4/3) sin²(kπ/(2N)) ≥ (4/3) sin²(Nπ/(4N)) = (4/3) · (1/2) = 2/3
for k = N/2, ..., N − 1. One gets
   max_{k≥N/2} |λ_k(S_{2/3})| = max_{k≥N/2} |1 − (4/3) sin²(kπ/(2N))| ≤ 1/3,
see also Figure 3.4. It follows that the oscillating error modes are reduced in each iteration at least by the factor three. This damping rate for the oscillating error modes is called the smoothing rate of the method. As one can see, the smoothing rate of the damped Jacobi method (with fixed ω) is independent of the fineness of the grid. 2
Remark 3.15 On the multigrid idea. Consider a fixed Fourier mode sin(kπx) and
its discrete representation sin(jkπ/N ), j = 1, . . . , N − 1. As already noted at the
end of Remark 3.7, the classification of this mode depends on the fineness of the
grid:
• If the grid is sufficiently fine, i.e., k < N/2, it is a smooth mode and it can be
damped only slowly with the damped Jacobi method.
• If the grid is sufficiently coarse, i.e., N/2 ≤ k ≤ N − 1, it is an oscillating mode
and can be damped quickly with the damped Jacobi method.
From this observation, one can already derive the multigrid idea. On a fine grid,
only the oscillating error modes on this grid are damped. The smooth modes on
this grid are oscillating on coarser grids and they will be reduced on these grids. 2
3.3 The Gauss–Seidel Method and the SOR Method
Remark 3.16 The Gauss–Seidel Method and the SOR Method. The Gauss–Seidel
method and the SOR (successive over relaxation) method were also already intro-
duced and studied in Numerical Mathematics II. Decompose the system matrix of
(3.1) into
A = D + L + U,
where D is the diagonal, L is the strict lower part, and U is the strict upper part.
The Gauss–Seidel method is obtained with M = D + L and N = −U in the fixed
point method (3.3)
   u(m+1) = −(D + L)⁻¹ U u(m) + (D + L)⁻¹ f = S_GS u(m) + (D + L)⁻¹ f
          = u(m) + (D + L)⁻¹ (f − Au(m)), m = 0, 1, 2, ...,
with the iteration matrix S_GS = −(D + L)⁻¹ U. One can see that for the computation of u(m+1) not only the old iterate u(m) is used, as in the damped Jacobi method, but also the already computed components of u(m+1). The SOR method is obtained by introducing, in addition, a relaxation parameter ω, i.e., by choosing M = ω⁻¹D + L in (3.2).
By the last property, one can say that the SOR method is somewhat more
advanced than the damped Jacobi method. However, it will turn out that the
SOR method shows a similar behavior for the solution of the model problem as the
damped Jacobi method. 2
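The componentwise form of one Gauss–Seidel/SOR sweep can be sketched as follows in Python (an illustration with hypothetical names; dense NumPy arrays, forward numbering of the unknowns):

```python
import numpy as np

def sor_sweep(A, f, u, omega=1.0):
    """One forward SOR sweep; omega = 1 gives the Gauss-Seidel method."""
    n = len(u)
    for i in range(n):
        # A[i, :] @ u already uses the updated components u[0..i-1]
        s = f[i] - A[i, :] @ u + A[i, i] * u[i]
        u[i] = (1.0 - omega) * u[i] + omega * s / A[i, i]
    return u
```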
Remark 3.17 Properties of the SOR method. The properties of the SOR method
were studied in Numerical Mathematics II. They will be summarized here.
• Lemma of Kahan¹. If the SOR method converges for every initial iterate u(0) ∈ R^n, then ω ∈ (0, 2).
• Let A ∈ R^{n×n} be a symmetric positive definite matrix. Then the SOR method converges for all initial iterates u(0) ∈ R^n if ω ∈ (0, 2).
The rate of convergence depends on ω. It can be shown that for a certain class of matrices, to which also the matrix obtained in the discretization of the model problem belongs, there is an optimal value ω_opt ∈ (1, 2). However, the determination of ω_opt is difficult in practice. For the model problem, one finds that ω_opt tends to 2 if the grids are refined, cf. Numerical Mathematics II, exercise problem 04/2.
The behavior of the SOR method depends on the numbering of the unknowns,
which is in contrast to the damped Jacobi method. There are classes of problems
where the efficiency of the SOR method depends essentially on the numbering of
the unknowns. 2
1 William M. Kahan, born 1933
Example 3.18 Application of the Gauss–Seidel method for the solution of the model problem. The SOR method with ω = 1 is studied in the same way as the damped Jacobi method in Example 3.8. The qualitative behavior does not change for other values of the relaxation parameter. The numbering of the unknowns in the model problem is from left to right.
Figures 3.5 and 3.6 present the results. One can see that they are qualitatively
the same as for the damped Jacobi method.
Figure 3.5: Convergence of the SOR method with ω = 1 for initial iterates with different wave numbers (k = 1, 3, 6) on a fixed grid (N = 64), left linear plot, right semilogarithmic plot.
Figure 3.6: Convergence of the SOR method with ω = 1 on different grids (N = 64, 128, 256) for an initial iterate with a fixed wave number (k = 6), left linear plot, right semilogarithmic plot.
The number of iterations for k = 6 and the stopping criterion ||e(m)||_∞ = ||u(m)||_∞ < 10⁻⁶ is presented in Table 3.2. Like for the damped Jacobi method, the number of iterations increases by a factor of about four if the grid is refined once.
Altogether, one can draw for the SOR method the same conclusions as for the
damped Jacobi method. 2
Lemma 3.19 Some eigenvalues and eigenvectors of SGS . Let A be the matrix
obtained by discretizing the model problem (2.1) with the finite element method.
Then, some eigenvalues of the iteration matrix of the Gauss–Seidel method are
given by
   λ_k(S_GS) = cos²(kπ/N), k = 1, ..., N/2,
with corresponding eigenvectors w_k whose components are
   w_{k,j} = (λ_k(S_GS))^{j/2} sin(jkπ/N), j = 1, ..., N − 1.
Table 3.2: Convergence of the SOR method with ω = 1 for the initial iterate with
wave number k = 6.
N no. of iterations
16 274
32 1034
64 3859
128 14297
256 52595
512 191980
1024 –
Proof: Considering the j-th component of the eigenvalue equation λ_k(S_GS)(D + L)w_k = −Uw_k and using the special form of the matrices D, L, and U, see Example 2.2, one obtains
   λ_k(S_GS) ((2/h) w_{k,j} − (1/h) w_{k,j−1}) = (1/h) w_{k,j+1}.
Scaling this equation by h and inserting the representation of the k-th eigenvector yields
   λ_k(S_GS) (2 (λ_k(S_GS))^{j/2} sin(jkπ/N) − (λ_k(S_GS))^{(j−1)/2} sin((j−1)kπ/N))
   = (λ_k(S_GS))^{(j+1)/2} sin((j+1)kπ/N),
which is equivalent to
   (λ_k(S_GS))^{(j+1)/2} (2 (λ_k(S_GS))^{1/2} sin(jkπ/N) − sin((j−1)kπ/N))
   = (λ_k(S_GS))^{(j+1)/2} sin((j+1)kπ/N).
Applying the formula for the eigenvalues, noting that cos(kπ/N) ≥ 0 for k = 1, ..., N/2, gives
   (λ_k(S_GS))^{(j+1)/2} (2 cos(kπ/N) sin(jkπ/N) − sin((j−1)kπ/N))
   = (λ_k(S_GS))^{(j+1)/2} sin((j+1)kπ/N).
Using the identity sin α + sin β = 2 sin((α + β)/2) cos((α − β)/2) with α = (j+1)kπ/N and β = (j−1)kπ/N, one finds that both sides are in fact identical.
17
For j = 1 and j = N − 1, one can perform the same calculation by formally introducing
   w_{k,0} = (λ_k(S_GS))^{0/2} sin(0 · kπ/N) = 0,  w_{k,N} = (λ_k(S_GS))^{N/2} sin(Nkπ/N) = 0.
Remark 3.20 Discussion of Lemma 3.19. The eigenvalues of S_GS are close to one for small k. They are close to zero if k is close to N/2. This situation is similar to the Jacobi method without damping.
One can derive, analogously to the damped Jacobi method in Remark 3.11, the error formula
   e(m) = Σ_{k=1}^{N−1} c_k (λ_k(S_GS))^m w_k(S_GS).
Using the eigenvectors w_k(S_GS) of S_GS as initial iterates, it follows from the eigenvalues of S_GS that a fast error reduction can be expected only for k ≈ N/2, whereas for k close to 0, there is only slow convergence, see Table 3.3. It turns out that the situation for k close to N − 1 is similar to that for k close to 0.
Table 3.3: Number of iterations for damping the norm of the error by the factor 100 for the initial iterate w_k(S_GS), N = 64.

k     no. of iterations
1     1895
3     207
6     50
16    6
32    1
60    115
63    1895
3.4 Summary
Remark 3.21 Summary. The investigation of classical iterative schemes led to the
following important observations:
• Classical iterative schemes might damp highly oscillating discrete error modes
very quickly. There is only a slow damping of the smooth discrete error modes.
• A smooth error mode on a given grid is generally less smooth on a coarser grid.
2
Chapter 4
Grid Transfer
Remark 4.1 Contents of this chapter. Consider a grid with grid size h and the
corresponding linear system of equations
Ah uh = f h .
The summary given in Section 3.4 leads to the idea that there might be an iterative
method for solving this system efficiently, which uses also coarser grids. In order
to construct such a method, one needs mechanisms that transfer the information in
an appropriate way between the grids. 2
Remark 4.3 Study of the discrete Fourier modes on different grids. Consider a grid Ω_2h. In practice, a uniform refinement step consists in dividing all intervals of Ω_2h into halves, leading to the grid Ω_h. Then, the nodes of Ω_2h are the nodes of Ω_h with even numbers, see Figure 4.1.

0   1   2   3   4   5   6   7   8   Ω_h
0       1       2       3       4   Ω_2h

Figure 4.1: Nodes of the fine grid Ω_h and of the coarse grid Ω_2h.
Consider the k-th Fourier mode of the fine grid Ωh . If 1 ≤ k ≤ N/2, then it
follows for the even nodes that
   w^h_{k,2j} = sin(2jkπ/N) = sin(jkπ/(N/2)) = w^{2h}_{k,j}, j = 1, ..., N/2 − 1.
Hence, the k-th Fourier mode on Ω_h is the k-th Fourier mode on Ω_2h. From the definition of the smooth and oscillating modes, Remark 3.7, it follows that by going from the fine to the coarse grid, the k-th mode gets a higher frequency if 1 ≤ k < N/2. Note again that the notion of frequency depends on the grid size. The Fourier mode on Ω_h for k = N/2 is represented on Ω_2h by the zero vector.
For the transfer of the oscillating modes on Ω_h, i.e., for N/2 < k < N, one obtains a somewhat unexpected result. These modes are represented on Ω_2h as relatively smooth modes. The k-th mode on Ω_h becomes the negative of the (N − k)-th mode on Ω_2h:
   w^h_{k,2j} = sin(2jkπ/N) = sin(jkπ/(N/2)),
   −w^{2h}_{N−k,j} = −sin(j(N − k)π/(N/2)) = −sin(2j(N − k)π/N)
                = −sin(2jπ − 2jkπ/N)
                = −sin(2jπ) cos(2jkπ/N) + cos(2jπ) sin(2jkπ/N)
                = sin(2jkπ/N),
since sin(2jπ) = 0 and cos(2jπ) = 1, i.e., w^h_{k,2j} = −w^{2h}_{N−k,j}. This aspect shows that it is necessary to damp the oscillating error modes on Ω_h before a problem on Ω_2h is considered. Otherwise, one would get additional smooth error modes on the coarser grid. 2
Remark 4.4 The residual equation. An iterative method for the solution of Au = f
can be applied either directly to this equation or to an equation for the error, the
so-called residual equation. Let u(m) be an approximation of u; then the error e(m) = u − u(m) satisfies the residual equation
   Ae(m) = r(m) = f − Au(m).   (4.1)
Remark 4.5 Nested iteration. This remark gives a first strategy for using coarse
grid problems for the improvement of an iterative method for solving Auh = f h .
This strategy is a generalization of the idea from Remark 4.2. It is called nested
iteration:
• Solve A^{h₀} u^{h₀} = f^{h₀} on a very coarse grid approximately by applying a smoother.
  ...
• Smooth A^{2h} u^{2h} = f^{2h} on Ω_2h.
• Solve A^h u^h = f^h on Ω_h by an iterative method with the initial iterate provided from the coarser grids.
However, there are some open questions with this strategy. How are the linear systems defined on the coarser grids? What can be done if there are still smooth error modes on the finest grid? In this case, the convergence of the last step will be slow. 2
Remark 4.6 Coarse grid correction, two-level method. A second strategy uses also
the residual equation (4.1):
• Smooth Ah uh = f h on Ωh . This step gives an approximation vh of the solution
which still has to be updated appropriately. Compute the residual rh = f h −
Ah v h .
• Project (restrict) the residual to Ω2h . The result is called R(rh ).
• Solve A2h e2h = R(rh ) on Ω2h . With this step, one obtains an approximation
e2h of the error.
• Project (prolongate) e2h to Ωh . The result is denoted by P (e2h ).
• Update the approximation of the solution on Ωh by vh := vh + P (e2h ).
This approach is called coarse grid correction or two-level method. With this ap-
proach, one computes on Ω2h an approximation of the error. However, also for
this approach one has to answer some questions. How to define the system on the
coarse grid? How to restrict the residual to the coarse grid and how to prolongate
the correction to the fine grid? 2
Example 4.8 Linear interpolation for finite difference methods. For finite differ-
ence methods, the prolongation operator is defined by a local averaging. Let Ω2h be
divided into N/2 intervals and Ωh into N intervals. The node j on Ω2h corresponds
to the node 2j on Ωh , 0 ≤ j ≤ N/2, see Figure 4.1. Let v2h be given on Ω2h . Then,
the linear interpolation
   I^h_{2h} : R^{N/2−1} → R^{N−1}, v^h = I^h_{2h} v^{2h},
is given by
   v^h_{2j} = v^{2h}_j, j = 1, ..., N/2 − 1,
   v^h_{2j+1} = (v^{2h}_j + v^{2h}_{j+1})/2, j = 0, ..., N/2 − 1,   (4.2)
with v^{2h}_0 = v^{2h}_{N/2} = 0, see Figure 4.2. For even nodes of Ω_h, one takes directly the value of the corresponding node of Ω_2h. For odd nodes of Ω_h, the arithmetic mean of the values of the neighboring nodes is computed.
Figure 4.2: Linear interpolation (prolongation) I^h_{2h} from Ω_2h to Ω_h.
The linear prolongation is a linear operator, see Lemma 4.10 below, between two finite-dimensional spaces. Hence, it can be represented as a matrix. Using the standard bases of R^{N/2−1} and R^{N−1}, then
   I^h_{2h} = (1/2) ·
      | 1         |
      | 2         |
      | 1 1       |
      |   2       |
      |   1 ⋱     |
      |     ⋱ 1   |
      |       2   |
      |       1   |   ∈ R^{(N−1)×(N/2−1)}.   (4.3)
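The action of the matrix (4.3) is cheap to apply without assembling it. A Python sketch of the prolongation (4.2) (a hypothetical helper; homogeneous Dirichlet boundary values are assumed):

```python
import numpy as np

def prolongate(v2h):
    """Linear interpolation (4.2): coarse vector (N/2-1 interior values)
    to fine vector (N-1 interior values); zero Dirichlet boundary values."""
    nc = len(v2h)                                # N/2 - 1
    v = np.concatenate(([0.0], v2h, [0.0]))      # pad with boundary values
    vh = np.zeros(2 * nc + 1)                    # N - 1 fine values
    vh[1::2] = v[1:-1]                           # even fine nodes copy coarse values
    vh[0::2] = 0.5 * (v[:-1] + v[1:])            # odd fine nodes: arithmetic mean
    return vh
```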
Example 4.9 Canonical prolongation for finite element methods. Consider con-
forming finite element methods and denote the spaces on Ω2h and Ωh with V 2h
and V h , respectively. Because Ωh is a uniform refinement of Ω2h , it follows that
V 2h ⊂ V h . Hence, each finite element function defined on Ω2h is contained in the
space V h . This aspect defines a canonical prolongation
   I^h_{2h} : V^{2h} → V^h, v^{2h} ↦ v^{2h}.
The canonical prolongation will be discussed in detail for P1 finite elements. Let {ϕ^{2h}_i}_{i=1}^{N/2−1} be the local basis of V^{2h} and {ϕ^h_i}_{i=1}^{N−1} be the local basis of V^h. Each function v^{2h} ∈ V^{2h} has a representation of the form
   v^{2h}(x) = Σ_{i=1}^{N/2−1} v^{2h}_i ϕ^{2h}_i(x), v^{2h}_i ∈ R.
From this formula, one can see that the representation in the basis of V^h is of the following form. For basis functions that correspond to nodes which are already on Ω_2h (even indices on the fine grid), the coefficient is the same as for the basis function on the coarse grid. For basis functions that correspond to new nodes, the coefficient is the arithmetic mean of the coefficients of the neighboring basis functions. Hence, if local bases are used, the coefficients of the prolongated finite element function can be computed by multiplying the coefficients of the coarse grid finite element function with the matrix (4.3). 2
Lemma 4.10 Injectivity of the prolongation. The prolongation operator I^h_{2h} is injective, i.e., the only element in the kernel of I^h_{2h} is the zero vector.
Remark 4.11 Effect of the prolongation on different error modes. Assume that the error, which is of course unknown, is a smooth function on the fine grid Ω_h. In
addition, the coarse grid approximation on Ω2h is computed and it should be exact
in the nodes of the coarse grid. The interpolation of this coarse grid approximation
is a smooth function on the fine grid (there are no new oscillations). For this reason,
one can expect a rather good approximation of the smooth error on the fine grid.
If the error on the fine grid is oscillating, then each interpolation of a coarse grid
approximation to the fine grid is a smooth function and one cannot expect that the
error on the fine grid is approximated well, see Figure 4.3.
Altogether, the prolongation gives the best results, if the error on the fine grid
is smooth. Hence, the prolongation is an appropriate complement to the smoother,
which works most efficiently if the error is oscillating. 2
4.3 Restriction
Remark 4.12 General remarks. For the two-level method, one has to transfer the
residual from Ωh to Ω2h before the coarse grid equation can be solved. This transfer
is called restriction. 2
Example 4.13 Injection for finite difference schemes. The simplest restriction is
the injection. It is defined by
   I^{2h}_h : R^{N−1} → R^{N/2−1}, v^{2h} = I^{2h}_h v^h, v^{2h}_j = v^h_{2j}, j = 1, ..., N/2 − 1,
see Figure 4.4. For this restriction, one takes for each node on the coarse grid simply
the value of the grid function at the corresponding node on the fine grid.
Figure 4.3: An oscillating error on the fine grid and the interpolant of an exact coarse grid approximation.
Figure 4.4: Injection I^{2h}_h from Ω_h to Ω_2h.
It turns out that the injection does not lead to an efficient method. If one ignores
every other node on Ωh , then the values of the residual in these nodes, and with
that also the error in these nodes, do not possess any impact on the system on the
coarse grid. Consequently, these errors will generally not be corrected. 2
Example 4.14 Weighted restriction for finite difference schemes. The weighted
restriction uses all nodes on the fine grid. It is defined by an appropriate averaging
   I^{2h}_h : R^{N−1} → R^{N/2−1},
   v^{2h} = I^{2h}_h v^h, v^{2h}_j = (v^h_{2j−1} + 2v^h_{2j} + v^h_{2j+1})/4, j = 1, ..., N/2 − 1,   (4.4)
see Figure 4.5. For finite difference schemes, only the weighted restriction will be
considered in the following.
If the spaces R^{N−1} and R^{N/2−1} are equipped with the standard bases, the matrix representation of the weighted restriction is
   I^{2h}_h = (1/4) ·
      | 1 2 1           |
      |     1 2 1       |
      |         ⋱       |
      |          1 2 1  |   ∈ R^{(N/2−1)×(N−1)}.   (4.5)

Figure 4.5: Weighted restriction I^{2h}_h from Ω_h to Ω_2h with the weights 1/4, 1/2, 1/4.
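Like the prolongation, the weighted restriction (4.4) is usually applied matrix-free. A Python sketch (a hypothetical helper, matching the prolongation sketch above):

```python
import numpy as np

def restrict_weighted(vh):
    """Weighted restriction (4.4): fine vector (N-1 interior values)
    to coarse vector (N/2-1 interior values)."""
    v = np.concatenate(([0.0], vh, [0.0]))     # pad with boundary values
    # coarse node j corresponds to fine node 2j; weights 1/4, 1/2, 1/4
    return 0.25 * v[1:-2:2] + 0.5 * v[2:-1:2] + 0.25 * v[3::2]
```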
Example 4.16 Canonical restriction for finite element schemes. Whereas for finite difference methods one works only with vectors of real numbers, finite element methods are imbedded into the Hilbert space setting. In this setting, a finite element function is, e.g., from the space V^h, but the residual, which is the right-hand side minus the finite element operator applied to a finite element function (the current iterate), is from the dual space (V^h)* of V^h. In this setting, it makes a difference if one restricts an element from V^h or from its dual space.
For restricting a finite element function from V^h to V^{2h}, one can take the analogon of the weighted restriction. If local bases are used, then the coefficients of the finite element function from V^h are multiplied with the matrix (4.5) to get the coefficients of the finite element function in V^{2h}.
In the two-level method, one has to restrict the residual, i.e., one needs a restriction from (V^h)* to (V^{2h})*. In this situation, a natural choice consists in using the dual prolongation operator, i.e.,
   I^{2h}_h : (V^h)* → (V^{2h})*, I^{2h}_h = (I^h_{2h})*.
The dual operator is defined by
   ⟨I^h_{2h} v^{2h}, r^h⟩_{V^h,(V^h)*} = ⟨v^{2h}, I^{2h}_h r^h⟩_{V^{2h},(V^{2h})*} ∀ v^{2h} ∈ V^{2h}, r^h ∈ (V^h)*.
Thus, if local bases and the bijection between finite element spaces and the Euclidean spaces are used, then the restriction of the residual can be represented by the transpose of the matrix (4.3). This matrix differs by a factor of 2 from the matrix (4.5) of the weighted restriction. 2
Chapter 5
The Two-Level Method
Remark 5.1 The two-level method. In this chapter, the two-level method or coarse
grid correction scheme will be analyzed. The two-level method, whose principle was
already introduced in Remark 4.6, has the following form:
• Smooth Ah uh = f h on Ωh with some steps of a simple iterative scheme. This
procedure gives an approximation vh . Compute the residual rh = f h − Ah vh .
• Restrict the residual to the coarse grid Ω2h using the restriction operator Ih2h
(weighted restriction for finite difference methods, canonical restriction for finite
element methods).
• Solve the coarse grid equation
   A^{2h} e^{2h} = I^{2h}_h r^h   (5.1)
on Ω_2h.
• Prolongate e^{2h} to Ω_h using the prolongation operator I^h_{2h}.
• Update v^h := v^h + I^h_{2h} e^{2h}.
After the update, one can apply once more some iterations with the smoother. This
step is called post smoothing, whereas the first step of the two-level method is called
pre smoothing. 2
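In pseudocode-like Python, one step of the two-level method reads as follows (a sketch; the helpers smooth, restrict, and prolongate are assumed to be available, e.g., as in the sketches of Chapter 4):

```python
import numpy as np

def two_level_step(A_h, A_2h, f, u0, smooth, restrict, prolongate, nu1=3, nu2=3):
    """One iteration of the two-level method of Remark 5.1."""
    u = smooth(A_h, f, u0, nu1)                # pre smoothing
    r = f - A_h @ u                            # residual on the fine grid
    e2h = np.linalg.solve(A_2h, restrict(r))   # solve coarse grid equation (5.1)
    u = u + prolongate(e2h)                    # coarse grid correction
    return smooth(A_h, f, u, nu2)              # post smoothing
```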
Remark 5.3 Definition of the coarse grid matrix by using a discrete scheme on
Ω2h . A straightforward approach consists in defining A2h by applying a finite
difference or finite element method to the differential operator on Ω2h . 2
Remark 5.4 Definition of the coarse grid matrix by Galerkin projection. Start-
ing point for the derivation of an appropriate coarse grid matrix by the Galerkin
projection is the residual equation
Ah eh = rh . (5.2)
It will be assumed for the moment that e^h lies in the range of the prolongation operator I^h_{2h}. Then, there is a vector e^{2h} defined on the coarse grid such that
   e^h = I^h_{2h} e^{2h}.
Substituting this equation into (5.2) gives
   A^h I^h_{2h} e^{2h} = r^h.
Applying now on both sides of this equation the restriction operator gives
   I^{2h}_h A^h I^h_{2h} e^{2h} = I^{2h}_h r^h.
This motivates the definition of the coarse grid matrix by the Galerkin projection,
   A^{2h} := I^{2h}_h A^h I^h_{2h}.   (5.3)
Remark 5.5 Matrix representation of the Galerkin projection. For all operators
on the right-hand side of (5.3), matrix representations are known, e.g., see (2.3),
(4.3), and (4.5) for the case of the finite difference discretization. Using these
representations, one obtains
   A^{2h} = I^{2h}_h A^h I^h_{2h},
where I^{2h}_h is the matrix (4.5) with rows (1/4)(1, 2, 1), A^h = (1/h²) tridiag(−1, 2, −1) is the matrix (2.3), and I^h_{2h} is the matrix (4.3) with columns (1/2)(1, 2, 1)ᵀ. Carrying out the two matrix multiplications yields
   A^{2h} = (1/(8h²)) tridiag(−2, 4, −2) = (1/(4h²)) tridiag(−1, 2, −1) ∈ R^{(N/2−1)×(N/2−1)}.
This matrix has the form of the matrix (2.3) with h replaced by 2h. Thus, in the
case of the model problem, the matrix defined by the Galerkin projection (5.3) and
the matrix (2.3) obtained by discretizing the differential operator on the coarse grid
Ω2h coincide.
In the finite element case, the matrices differ only by the factors in front of
the parentheses: 1/2, 1/h, 1/2, instead of 1/4, 1/h², 1/2. Then, the final factor is 1/(2h) instead of 1/(4h²). The factor 1/(2h) is exactly the factor of the finite
element matrix on Ω2h , see (2.4). That means, also in this case Galerkin projection
and the discretization on Ω2h coincide.
This connection of the Galerkin projection and of the discretized problem on
Ω2h does not hold in all cases (problems and discretizations), but it can be found
often. 2
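This coincidence can be verified directly. A small Python sketch (assuming the matrices (2.4) and (4.3); for the finite element case, the restriction is the transpose of the prolongation matrix, see Example 4.16):

```python
import numpy as np

N = 8
h = 1.0 / N
A_h = (np.diag(2*np.ones(N-1)) + np.diag(-np.ones(N-2), 1)
       + np.diag(-np.ones(N-2), -1)) / h          # FE matrix (2.4) on the fine grid
P = np.zeros((N-1, N//2-1))                       # prolongation matrix (4.3)
for j in range(N//2 - 1):
    P[2*j:2*j+3, j] = [0.5, 1.0, 0.5]
A_2h = P.T @ A_h @ P                              # Galerkin projection (5.3), FE restriction = P^T
A_coarse = (np.diag(2*np.ones(N//2-1)) + np.diag(-np.ones(N//2-2), 1)
            + np.diag(-np.ones(N//2-2), -1)) / (2*h)  # FE matrix (2.4) on the coarse grid
print(np.allclose(A_2h, A_coarse))                # True: both definitions coincide
```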
5.2 General Approach for Proving the Convergence
of the Two-Level Method
Remark 5.6 The iteration matrix of the two-level method. For studying the con-
vergence of the two-level method, one first has to find the iteration matrix S2lev of
this scheme. For simplicity, only the case of pre smoothing is considered, but no
post smoothing.
Let S_sm be the iteration matrix of the smoother. The approximation of the solution before the pre smoothing step is denoted by v^(n) and the result after the update will be v^(n+1). Applying ν pre smoothing steps, it is known from (3.7) that
   e^(ν) = S_sm^ν e^(0), with e^(0) = u − v^(n), e^(ν) = u − v_ν^(n).
It follows that
   v_ν^(n) = u − S_sm^ν (u − v^(n)),
where now v_ν^(n) stands for v^h in the general description of the two-level method from Remark 5.1. It follows that
   r = f − A^h v_ν^(n) = f − A^h u + A^h S_sm^ν (u − v^(n)) = A^h S_sm^ν (u − v^(n)).
Applying this formula in the two-level method from Remark 5.1, starting with the update step, one obtains
   v^(n+1) = v_ν^(n) + I^h_{2h} (A^{2h})⁻¹ I^{2h}_h r
          = u − S_sm^ν (u − v^(n)) + I^h_{2h} (A^{2h})⁻¹ I^{2h}_h A^h S_sm^ν (u − v^(n))
          = (I − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h A^h) S_sm^ν v^(n)   (5.4)
            + ((I − S_sm^ν) + I^h_{2h} (A^{2h})⁻¹ I^{2h}_h A^h S_sm^ν) (A^h)⁻¹ f,
where u = (A^h)⁻¹ f was used. Hence, the iteration matrix of the two-level method is
   S_2lev = (I − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h A^h) S_sm^ν.   (5.5)
Remark 5.7 Goal of the convergence analysis. From the course Numerical Mathematics II, Theorem 3.3 in the part on iterative solvers, it is known that a sufficient and necessary condition for the convergence of the fixed point iteration is that ρ(S_2lev) < 1. But the computation of ρ(S_2lev) is rather complicated, even in simple situations. However, from linear algebra it is known that ρ(S_2lev) ≤ |||S_2lev||| for induced matrix norms, e.g., the spectral norm. The goal of the convergence analysis will be to show that |||S_2lev||| is bounded by a constant smaller than 1, independently of h. Writing
   S_2lev = ((A^h)⁻¹ − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h) A^h S_sm^ν,
it follows that
   |||S_2lev||| ≤ ||| (A^h)⁻¹ − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h ||| |||A^h S_sm^ν|||.   (5.6)
The first factor in (5.6) describes the effect of the coarse grid approximation. The second factor measures the efficiency of the smoothing step. The smaller the first factor is, the better the coarse grid solution approximates e^h. Hence, the two essential components of the two-level method, the smoothing and the coarse grid correction, can be analyzed separately. 2
Definition 5.8 Smoothing property. The matrix S_sm is said to possess the smoothing property, if there exist a function η(ν) with lim_{ν→∞} η(ν) = 0 and a function ν̄(h), whose definitions are independent of h, and a number α > 0 such that
   |||A^h S_sm^ν||| ≤ η(ν) h^{−α} for all 1 ≤ ν ≤ ν̄(h).   (5.7)
Remark 5.9 On the smoothing property. The smoothing property does not necessarily mean that the smoothing iteration is a convergent iteration. It is only required that the error is smoothed in a certain way using up to ν̄(h) smoothing steps. In fact, there are examples where divergent iterative schemes are good smoothers. But in this course, only the case ν̄(h) = ∞ will be considered, i.e., the case of a convergent smoothing iteration. 2
Definition 5.10 Approximation property. The approximation property is satisfied if there is a constant C_a, independent of h, such that
   ||| (A^h)⁻¹ − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h ||| ≤ C_a h^α.   (5.8)

Theorem 5.11 Convergence of the two-level method. Let the smoothing property (5.7) and the approximation property (5.8) be satisfied. Then, for each ρ ∈ (0, 1) there is a number ν̄ such that
   |||S_2lev||| ≤ C_a η(ν) ≤ ρ < 1
whenever ν ≥ ν̄.
Proof: From (5.6), one obtains with the approximation property (5.8) and the smoothing property (5.7)
   |||S_2lev||| ≤ C_a h^α η(ν) h^{−α} = C_a η(ν).
Since η(ν) → 0 as ν → ∞, the right-hand side of this estimate is smaller than any given ρ > 0 if ν is sufficiently large, e.g., if ν ≥ ν̄.
Remark 5.12 On the convergence theorem. Note that the estimate C_a η(ν) is independent of h. The convergence theorem says that the two-level method converges with a rate that is independent of h if sufficiently many smoothing steps are applied. For many problems, one finds that only a few pre smoothing steps, i.e., 1 to 3, are sufficient for convergence. 2
5.3 The Smoothing Property of the Damped Jacobi Iteration

Remark 5.13 Contents of this section. In this section, the smoothing property of the damped Jacobi iteration for the model problem will be proved. Therefore, one has to estimate |||A^h S_jac,ω^ν|||, where now the spectral matrix norm ||A^h S_jac,ω^ν||₂ is considered. In the proof, one has to estimate a term of the form ||B(I − B)^ν||₂ for some symmetric positive definite matrix B with 0 < B ≤ I, i.e., for all eigenvalues λ of B it is λ ∈ (0, 1]. 2
Lemma 5.14 Estimate for a symmetric positive definite matrix. Let 0 < B = Bᵀ ≤ I. Then
   ||B(I − B)^ν||₂ ≤ η₀(ν)
with
   η₀(ν) = ν^ν / (ν + 1)^{ν+1}, ν ∈ N.   (5.10)
Proof: For each eigenvalue λ of B, it is
   0 ≤ λ(1 − λ)^ν ≤ 1,
since both factors are between 0 and 1. Hence, B(I − B)^ν is positive semi-definite. One
gets, using the definition of the spectral norm, the symmetry of the matrix, the eigenvalues of the square of a matrix, and the nonnegativity of the eigenvalues,
   ||B(I − B)^ν||₂ = (λ_max((B(I − B)^ν)ᵀ B(I − B)^ν))^{1/2}
                  = (λ_max((B(I − B)^ν)²))^{1/2}
                  = ((λ_max(B(I − B)^ν))²)^{1/2}
                  = λ_max(B(I − B)^ν)
                  = max_{λ eigenvalue of B} λ(1 − λ)^ν.
Thus, one has to maximize λ(1 − λ)^ν for λ ∈ [0, 1] to get an upper bound for ||B(I − B)^ν||₂. This expression takes the value zero at the boundary of the interval and it is positive in the interior. Thus, one can compute the maximum with standard calculus:
   d/dλ (λ(1 − λ)^ν) = (1 − λ)^ν − νλ(1 − λ)^{ν−1} = 0.
This necessary condition becomes
   1 − λ − νλ = 0  ⟹  λ = 1/(1 + ν).
It follows that
   ||B(I − B)^ν||₂ ≤ (1/(1 + ν)) (1 − 1/(1 + ν))^ν = ν^ν / (1 + ν)^{1+ν}.
Remark 5.15 Damped Jacobi method. Now, the smoothing property of the damped Jacobi method can be proved. The iteration matrix of the damped Jacobi method for the model problem is given by, see also (3.11),
   S_jac,ω = I − ωD⁻¹A^h,
where D⁻¹A^h is the same for the finite difference and the finite element method. 2
Theorem 5.16 Smoothing property of the damped Jacobi method. Let ω ∈ (0, 1/2]. Then, it holds
   ||A^h S_jac,ω^ν||₂ ≤ (2/(ωh)) η₀(ν),
where η₀(ν) was defined in (5.10).
Proof: The proof will be presented for the finite element method; it can be performed analogously for the finite difference method. For the finite element method, it is D = 2I/h. Hence, one gets
   ||A^h S_jac,ω^ν||₂ = ||A^h (I − ωD⁻¹A^h)^ν||₂
                     = (2/(ωh)) || (ωh/2) A^h (I − (ωh/2) A^h)^ν ||₂.
The matrix B = (ωh/2) A^h is symmetric and positive definite and its eigenvalues are, see (2.6),
   λ_k((ωh/2) A^h) = (ωh/2) λ_k(A^h) = (ωh/2)(4/h) sin²(kπ/(2N)) = 2ω sin²(kπ/(2N)) ≤ 2ω ≤ 1
with the assumptions of the theorem. Hence 0 < B ≤ I and Lemma 5.14 can be applied, which gives immediately the statement of the theorem.
Since η₀(ν) ≤ 1/(ν + 1) < 1/ν, one obtains in particular
   ||A^h S_jac,ω^ν||₂ ≤ (2/(ωh)) (1/ν),
and the smoothing rate is said to be linear, i.e., O(ν⁻¹).
5.4 The Approximation Property
Remark 5.19 Isomorphism between finite element spaces and Euclidean spaces. There is a bijection between the functions in the finite element space V^h and the coefficients of the finite element functions in the space R^{n_h}. This bijection is denoted by P^h : R^{n_h} → V^h, v^h ↦ v^h with
   v^h(x) = Σ_{i=1}^{n_h} v^h_i ϕ^h_i(x), v^h = (v^h_1, ..., v^h_{n_h})ᵀ.
If the Euclidean space R^{n_h} is equipped with the standard Euclidean norm, then the norm equivalence
   c h^{1/2} ||v^h||₂ ≤ ||P^h v^h||_{L²((0,1))} ≤ C h^{1/2} ||v^h||₂   (5.12)
holds with constants that are independent of the mesh size, exercise.
There are commutation properties between the grid transfer operators and the bijection. For instance, for a function given in V^{2h}, one gets the same result if one first applies the bijection to R^{n_{2h}} and then the interpolation to R^{n_h}, or if one first applies the prolongation to V^h (imbedding) and then the bijection to R^{n_h}, i.e.,
   I^h_{2h} (P^{2h})⁻¹ v^{2h} = (P^h)⁻¹ I^h_{2h} v^{2h},   (5.13)
where I^h_{2h} on the left-hand side is the matrix representation of the prolongation operator I^h_{2h} between the finite element spaces. Similarly, if the vector of coefficients is given on the fine grid, one can first apply the bijection and then the restriction, or vice versa:
   I^{2h}_h P^h v^h = P^{2h} I^{2h}_h v^h.   (5.14)
2
Theorem 5.20 Approximation property. Consider the P1 finite element discretization of the model problem. Then, there is a constant C, independent of h, such that
   || (A^h)⁻¹ − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h ||₂ ≤ C h.

Proof: Using the definition of an operator norm, the left-hand side of the approximation property (5.8) can be rewritten in the form
   sup_{w^h ∈ R^{n_h}} || ((A^h)⁻¹ − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h) w^h ||₂ / ||w^h||₂.   (5.15)
Define
   z^h = (A^h)⁻¹ w^h,  z^{2h} = (A^{2h})⁻¹ I^{2h}_h w^h.   (5.16)
By construction, z^h is the solution of a finite element problem on the fine grid and z^{2h} is the solution of almost the same problem on the coarse grid. The right-hand side of the coarse grid problem is the restriction of the right-hand side of the fine grid problem. Therefore, it is a straightforward idea to apply results that are known from finite element error analysis. Consider the finite element problems
   ((u^h)′, (ϕ^h)′) = (P^h w^h, ϕ^h) = (w^h, ϕ^h) ∀ ϕ^h ∈ V^h,
   ((u^{2h})′, (ϕ^{2h})′) = (w^h, ϕ^{2h}) ∀ ϕ^{2h} ∈ V^{2h},
where w^h = P^h w^h is identified with a finite element function.
Approximating the right-hand side of the first problem by the composite trapezoidal rule and using ϕ^h_i(x_{i−1}) = ϕ^h_i(x_{i+1}) = 0, ϕ^h_i(x_i) = 1, one gets
   ∫_{x_{i−1}}^{x_{i+1}} w^h(x) ϕ^h_i(x) dx
   ≈ h (w^h(x_{i−1}) ϕ^h_i(x_{i−1}) + w^h(x_i) ϕ^h_i(x_i))/2 + h (w^h(x_i) ϕ^h_i(x_i) + w^h(x_{i+1}) ϕ^h_i(x_{i+1}))/2
   = h w^h(x_i) = h w_i.
This formula, which is exact for constant vectors w^h, gives the algebraic form of the first problem, A^h u^h = h w^h. With the definition of z^h, one obtains
   z^h = (A^h)⁻¹ w^h = h⁻¹ u^h = h⁻¹ (P^h)⁻¹ u^h.
Using the commutation property P^{2h} I^{2h}_h w^h = I^{2h}_h P^h w^h = I^{2h}_h w^h, see (5.14), the finite element function z^{2h} = P^{2h} z^{2h} is the solution of the coarse grid problem
   ((z^{2h})′, (ϕ^{2h})′) = (I^{2h}_h w^h, ϕ^{2h}) = (w^h, I^h_{2h} ϕ^{2h}) ∀ ϕ^{2h} ∈ V^{2h},
where the duality of prolongation and restriction was used, see Example 4.16. The canonical prolongation of ϕ^{2h} is the embedding, see Example 4.9, hence I^h_{2h} ϕ^{2h} = ϕ^{2h} and one obtains
   ((z^{2h})′, (ϕ^{2h})′) = (w^h, ϕ^{2h}) ∀ ϕ^{2h} ∈ V^{2h}.
With the same quadrature rule as on the fine grid, it follows that
   z^{2h} = P^{2h} z^{2h} = (2h)⁻¹ u^{2h}  ⟹
   I^h_{2h} z^{2h} = (2h)⁻¹ I^h_{2h} (P^{2h})⁻¹ u^{2h} = (2h)⁻¹ (P^h)⁻¹ I^h_{2h} u^{2h},
where (5.13) was used. Since I^h_{2h} is the identity (the imbedding V^{2h} ⊂ V^h), one gets that (5.16) can be written in the form
   || z^h − I^h_{2h} z^{2h} ||₂ = h⁻¹ || (P^h)⁻¹ (u^h − u^{2h}) ||₂.   (5.17)
Since the norm equivalence (5.12) should be applied, the error ||u^h − u^{2h}||_{L²((0,1))} will be estimated. Let u ∈ H¹₀((0,1)) be the solution of the variational problem
   (u′, ϕ′) = (w^h, ϕ) ∀ ϕ ∈ H¹₀((0,1)).
This problem is by assumption 2-regular, i.e., it is u ∈ H²((0,1)) and it holds ||u||_{H²((0,1))} ≤ c ||w^h||_{L²((0,1))}. Then, it is known from Numerical Mathematics III that the error estimates
   ||u^h − u||_{L²((0,1))} ≤ C h² ||w^h||_{L²((0,1))},  ||u^{2h} − u||_{L²((0,1))} ≤ C (2h)² ||w^h||_{L²((0,1))}
hold, such that the triangle inequality yields
   ||u^h − u^{2h}||_{L²((0,1))} ≤ C h² ||w^h||_{L²((0,1))}.   (5.18)
Finally, inserting (5.16), (5.17), (5.18) into (5.15) and using the norm equivalence (5.12) gives
   sup_{w^h ∈ R^{n_h}} || ((A^h)⁻¹ − I^h_{2h} (A^{2h})⁻¹ I^{2h}_h) w^h ||₂ / ||w^h||₂
   = sup_{w^h ∈ R^{n_h}} || z^h − I^h_{2h} z^{2h} ||₂ / ||w^h||₂
   = C h⁻¹ sup_{w^h ∈ R^{n_h}} || (P^h)⁻¹ (u^h − u^{2h}) ||₂ / ||w^h||₂
   ≤ C h^{−3/2} sup_{w^h ∈ R^{n_h}} || P^h (P^h)⁻¹ (u^h − u^{2h}) ||_{L²((0,1))} / ||w^h||₂
   = C h^{−3/2} sup_{w^h ∈ R^{n_h}} || u^h − u^{2h} ||_{L²((0,1))} / ||w^h||₂
   ≤ C h^{1/2} sup_{w^h ∈ R^{n_h}} || w^h ||_{L²((0,1))} / ||w^h||₂
   ≤ C h sup_{w^h ∈ R^{n_h}} || w^h ||₂ / ||w^h||₂ = C h.
5.5 Summary
Remark 5.22 Summary. This chapter considered the convergence of the two-level
method or coarse grid correction scheme. First, an appropriate coarse grid operator
was defined. It was shown that the spectral radius of the iteration matrix of the
two-level method can be bounded with a constant lower than 1, independently of
the mesh width h, if
• the smoothing property holds and sufficiently many smoothing steps are per-
formed,
• and if the approximation property holds.
Considering the model problem (2.1), the smoothing property for the damped Jacobi method with ω ∈ (0, 1/2] was proved, as well as the approximation property. 2
Chapter 6
Multigrid Methods

Remark 6.1 Motivation. The two-level method leaves an open question: How to solve the coarse grid equation
   A^{2h} e^{2h} = I^{2h}_h r^h   (6.1)
efficiently? The answer might be apparent: by a two-level method. The form of (6.1) is not much different from that of the original problem. Thus, if one applies the two-level method to the original equation, its application to (6.1) should be easy. A recursive application of this idea, of using the two-level method for solving the coarse grid equation, leads to the multigrid method. 2
Example 6.3 A multigrid method. Now, the two-level method will be imbedded into itself. It will be assumed that there are l + 1 grids, l ≥ 0, where the finest grid has the grid spacing h and the grid spacing increases by the factor 2 for each coarser grid. Let L = 2^l.
• Apply the smoother ν₁ times to A^h u^h = f^h with the initial guess v^h. The result is denoted by v^h.
• Compute f^{2h} = I^{2h}_h r^h = I^{2h}_h (f^h − A^h v^h).
  ◦ Apply the smoother ν₁ times to A^{2h} u^{2h} = f^{2h} with the initial guess v^{2h} = 0. Denote the result by v^{2h}.
  ◦ Compute f^{4h} = I^{4h}_{2h} r^{2h} = I^{4h}_{2h} (f^{2h} − A^{2h} v^{2h}).
    ...
      − Solve A^{Lh} u^{Lh} = f^{Lh}.
    ...
  ◦ Correct v^{2h} := v^{2h} + I^{2h}_{4h} v^{4h}.
  ◦ Apply the smoother ν₂ times to A^{2h} u^{2h} = f^{2h} with the initial guess v^{2h}.
• Correct v^h := v^h + I^h_{2h} v^{2h}.
• Apply the smoother ν₂ times to A^h u^h = f^h with the initial guess v^h.
Example 6.4 Multigrid method with γ-cycle. The multigrid scheme from Example 6.3 is just one possibility to perform a multigrid method. It belongs to a family of multigrid methods, the so-called multigrid methods with γ-cycle, that have the following compact recursive definition:

v^h ← M^h_γ(v^h, f^h)
1. Pre smoothing: Apply the smoother ν₁ times to A^h u^h = f^h with the initial guess v^h.
2. If Ω_h is the coarsest grid,
   − solve the problem,
   else
   − Restrict to the next coarser grid: f^{2h} ← I^{2h}_h (f^h − A^h v^h).
   − Set the initial iterate on the next coarser grid: v^{2h} = 0.
   − If Ω_h is the finest grid, set γ = 1.
   − Call the γ-cycle scheme γ times for the next coarser grid: v^{2h} ← M^{2h}_γ(v^{2h}, f^{2h}).
3. Coarse grid correction: v^h ← v^h + I^h_{2h} v^{2h}.
4. Post smoothing: Apply the smoother ν₂ times to A^h u^h = f^h with the initial guess v^h.
For γ = 1, one obtains the so-called V-cycle, see Figure 6.1, and for γ = 2 the so-called W-cycle, see Figure 6.2.

Figure 6.1: Multigrid V-cycle on four levels (h, 2h, 4h, 8h); s = smoothing, r = restriction, p = prolongation, e = solve on the coarsest grid.
Example 6.5 Multigrid F-cycle. In between the V-cycle and the W-cycle is the
F-cycle, see Figure 6.3. The F-cycle starts with the restriction to the coarsest grid.
In the prolongation process, after having reached each level the first time, again a
restriction to the coarsest grid is performed. 2
Figure 6.2: Multigrid cycle with γ = 2 (W-cycle) on four levels.

Figure 6.3: Multigrid F-cycle on four levels.
• The coarse grid correction step can be damped with a factor β:
   v^h ← v^h + β I^h_{2h} v^{2h}.
• The initial guess for the first pre smoothing step on the finest grid can be obtained by a nested iteration, see Remark 4.5. In the nested iteration, the system is first solved (or smoothed) on a very coarse grid, then one goes to the next finer grid and smoothes the system on this grid, and so on, until the finest grid is reached. This approach is called full multigrid. If one uses on each grid, which is not the finest grid, one multigrid V-cycle for smoothing, the so-called full multigrid V-cycle is performed, see Figure 6.4. The full multigrid V-cycle looks like an F-cycle without restriction and pre smoothing.
In practice, one solves the systems on the coarser grids up to a certain accuracy
before one enters the next finer grid.
Figure 6.4: Full multigrid V-cycle.
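The recursive structure of the γ-cycle from Example 6.4 translates directly into code. A Python sketch (assuming a hierarchy of matrices and the grid transfer helpers from Chapter 4; all names are hypothetical):

```python
import numpy as np

def mg_cycle(l, A, v, f, smooth, restrict, prolongate, gamma=1, nu1=3, nu2=3):
    """Multigrid gamma-cycle (Example 6.4); A is a list of level matrices,
    level 0 = coarsest grid; gamma = 1: V-cycle, gamma = 2: W-cycle."""
    if l == 0:
        return np.linalg.solve(A[0], f)            # solve on the coarsest grid
    for _ in range(nu1):                           # pre smoothing
        v = smooth(A[l], f, v)
    d = restrict(f - A[l] @ v)                     # restrict the residual
    e = np.zeros(A[l - 1].shape[0])                # zero initial iterate
    for _ in range(gamma):                         # gamma recursive calls
        e = mg_cycle(l - 1, A, e, d, smooth, restrict, prolongate, gamma, nu1, nu2)
    v = v + prolongate(e)                          # coarse grid correction
    for _ in range(nu2):                           # post smoothing
        v = smooth(A[l], f, v)
    return v
```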
Remark 6.8 Preliminaries. As usual, one has to study the iteration matrix for the investigation of the convergence of an iterative solver. The levels of the multigrid hierarchy are numbered by 0, ..., l, where level 0 is the coarsest grid. The iteration matrix of the two-level method on level l, where the corresponding mesh width should be h, is denoted by S_l and it has the form, see (5.5),
   S_l(ν) = (I − I^l_{l−1} (A_{l−1})⁻¹ I^{l−1}_l A_l) S_sm,l^ν.   (6.2)
Lemma 6.9 Iteration matrix of the multigrid γ-cycle. The iteration matrix of the multigrid γ-cycle scheme is given by
   S_mg,l(ν) = S_l(ν) + I^l_{l−1} (S_mg,l−1(ν))^γ (A_{l−1})⁻¹ I^{l−1}_l A_l S_sm,l^ν, l ≥ 1,   (6.3)
with S_mg,0(ν) := 0.

Proof: Without post smoothing, one step of the multigrid method has the form S_mg,l(ν) = C_mg,l S_sm,l^ν, where C_mg,l represents the iteration matrix of the complete coarse grid correction, i.e., everything which was done on the levels 0, ..., l − 1. This matrix has to be determined. To this end, consider the multigrid method with f_l = 0 and let u_l be arbitrary. For the restricted residual, it holds
   f_{l−1} = I^{l−1}_l (f_l − A_l u_l) = −I^{l−1}_l (A_l u_l).
The γ applications of the multigrid scheme to the coarse grid system
   A_{l−1} u_{l−1} = f_{l−1}   (6.4)
can be described with the basic form of a fixed point iteration given in (3.3),
   v^{(j+1)}_{l−1} = S_mg,l−1(ν) v^{(j)}_{l−1} + N_{l−1} f_{l−1},   (6.5)
with the initial iterate v^{(0)}_{l−1} = 0.
From Remark 6.8, it follows that the solution of (6.4) is the fixed point of (6.5). One obtains
   v^{(1)}_{l−1} = S_mg,l−1(ν) v^{(0)}_{l−1} + N_{l−1} f_{l−1} = N_{l−1} f_{l−1},
   v^{(2)}_{l−1} = S_mg,l−1(ν) N_{l−1} f_{l−1} + N_{l−1} f_{l−1},
   v^{(3)}_{l−1} = S_mg,l−1(ν) (S_mg,l−1(ν) N_{l−1} f_{l−1} + N_{l−1} f_{l−1}) + N_{l−1} f_{l−1}
              = (S_mg,l−1(ν))² N_{l−1} f_{l−1} + S_mg,l−1(ν) N_{l−1} f_{l−1} + N_{l−1} f_{l−1},
   ...
   v^{(γ)}_{l−1} = Σ_{k=0}^{γ−1} (S_mg,l−1(ν))^k N_{l−1} f_{l−1}
              = Σ_{k=0}^{γ−1} (S_mg,l−1(ν))^k N_{l−1} (−I^{l−1}_l (A_l u_l)).   (6.6)
Let u_{l−1} be the fixed point of (6.5) and the solution of (6.4); then it is
   u_{l−1} = S_mg,l−1(ν) u_{l−1} + N_{l−1} f_{l−1} = S_mg,l−1(ν) u_{l−1} + N_{l−1} A_{l−1} u_{l−1}
          = (S_mg,l−1(ν) + N_{l−1} A_{l−1}) u_{l−1}.
It follows that
   I = S_mg,l−1(ν) + N_{l−1} A_{l−1}
and
   N_{l−1} = (I − S_mg,l−1(ν)) A_{l−1}⁻¹.   (6.7)
Using the telescopic sum
   Σ_{k=0}^{γ−1} x^k (1 − x) = Σ_{k=0}^{γ−1} x^k − Σ_{k=0}^{γ−1} x^{k+1} = 1 − x^γ,
one obtains from (6.6) and (6.7)
   v^{(γ)}_{l−1} = (I − (S_mg,l−1(ν))^γ) A_{l−1}⁻¹ (−I^{l−1}_l (A_l u_l)).   (6.8)
From the coarse grid correction, step 3 of the multigrid γ-cycle scheme, see Example 6.4, it follows for the result of the multigrid γ-cycle that
   u^new_l := C_mg,l u_l = u_l + I^l_{l−1} v^{(γ)}_{l−1}.
Inserting (6.8), one obtains for the iteration matrix of the coarse grid correction
   C_mg,l = I + I^l_{l−1} (I − (S_mg,l−1(ν))^γ) A_{l−1}⁻¹ (−I^{l−1}_l A_l)
         = (I − I^l_{l−1} A_{l−1}⁻¹ I^{l−1}_l A_l) + I^l_{l−1} (S_mg,l−1(ν))^γ A_{l−1}⁻¹ I^{l−1}_l A_l.
After multiplication with S_sm,l^ν, the first term is equal to S_l(ν), see (6.2). Thus, (6.3) is proved for level l under the assumption that it holds for level l − 1.
One can write the iteration matrix for l = 1 also in form (6.3), using the definition
Smg,0 (ν) := 0. Then, (6.3) holds for l = 1 and hence it holds for all l ≥ 1 by induction.
Remark 6.10 Estimate of the spectral norm of the iteration matrix. The iteration
matrix Smg,l (ν) of the multigrid γ-cycle scheme is the sum of the iteration matrix
of the two-level method and a perturbation. It will be shown that this perturbation
is, under certain assumptions, small.
The spectral norm of $S_{mg,l}(\nu)$ will be estimated in a first step by the triangle inequality and the rule for estimating the norm of products of matrices

$\|S_{mg,l}(\nu)\|_2 \le \|S_l(\nu)\|_2 + \left\| I_{l-1}^{l} \left(S_{mg,l-1}(\nu)\right)^{\gamma} A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} \right\|_2$
$\qquad\qquad \le \|S_l(\nu)\|_2 + \left\| I_{l-1}^{l} \right\|_2 \left\| S_{mg,l-1}(\nu) \right\|_2^{\gamma} \left\| A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} \right\|_2. \qquad (6.9)$
Now, bounds for all factors on the right-hand side of (6.9) are needed. 2
Remark 6.11 Assumptions on the prolongation operator. It will be assumed that the prolongation is a bounded linear operator with a bound independent of l, i.e., there is a constant $c_p$ such that

$\left\| I_{l-1}^{l} \right\|_2 \le c_p \quad \forall\ l \ge 1. \qquad (6.10)$

In addition, a bound of $\left\| I_{l-1}^{l} \right\|_2$ from below will be needed. Thus, it will be assumed that there is a constant $c_p > 0$ such that for all $u_{l-1}$ defined on level l − 1 it is

$c_p^{-1} \left\| u_{l-1} \right\|_2 \le \left\| I_{l-1}^{l} u_{l-1} \right\|_2 \quad \forall\ l \ge 1. \qquad (6.11)$

The assumptions (6.10) and (6.11) are satisfied for the prolongation operator defined in Section 4.2. These properties can be deduced, e.g., by using the definition of the operator norm, exercise. 2
Remark 6.12 Assumptions on the smoother. It will be assumed that there is a constant $c_s$ such that

$\left\| S_{sm,l}^{\nu} \right\|_2 \le c_s \quad \forall\ l \ge 1,\ 0 < \nu < \infty. \qquad (6.12)$

This assumption is satisfied, e.g., for the damped Jacobi iteration, $S_{sm,l} = S_{jac,\omega}$, applied to the model problem, with $c_s = 1$. It was shown in the proof of Lemma 3.10 that $\rho(S_{jac,\omega}) < 1$. Since $S_{jac,\omega}$ is a symmetric matrix, it is $\|S_{jac,\omega}\|_2 = \rho(S_{jac,\omega})$. It follows that

$\left\| S_{sm,l}^{\nu} \right\|_2 = \left\| S_{sm,l} \right\|_2^{\nu} = \rho(S_{jac,\omega})^{\nu} < 1.$

2
Lemma 6.13 Estimate of the last term in (6.9) with the iteration matrix of the two-level method. Suppose (6.11) and (6.12), then

$\left\| A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} \right\|_2 \le c_p \left( c_s + \|S_l(\nu)\|_2 \right). \qquad (6.13)$
Proof: Applying assumption (6.11) to the vector $A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} v$ gives for all v

$c_p^{-1} \left\| A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} v \right\|_2 \le \left\| I_{l-1}^{l} A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} v \right\|_2. \qquad (6.14)$

From the definition (6.2) of the two-level iteration matrix, one has the identity $I_{l-1}^{l} A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} = S_{sm,l}^{\nu} - S_l(\nu)$. Using this identity in (6.14), applying the triangle inequality, and assumption (6.12) gives

$\left\| A_{l-1}^{-1} I_l^{l-1} A_l S_{sm,l}^{\nu} \right\|_2 \le c_p \left( \left\| S_{sm,l}^{\nu} \right\|_2 + \|S_l(\nu)\|_2 \right) \le c_p \left( c_s + \|S_l(\nu)\|_2 \right).$
Remark 6.14 Impact on estimate (6.9). Only the case will be considered that the number ν of smoothing steps is sufficiently large such that the two-level method converges, i.e., it is

$\|S_l(\nu)\|_2 < 1.$

Inserting (6.13) into (6.9) and using the assumption on the number of smoothing steps yields, together with (6.10),

$\|S_{mg,l}(\nu)\|_2 \le \|S_l(\nu)\|_2 + c_p \left\| I_{l-1}^{l} \right\|_2 \left( c_s + \|S_l(\nu)\|_2 \right) \|S_{mg,l-1}(\nu)\|_2^{\gamma}$
$\qquad\qquad \le \|S_l(\nu)\|_2 + c_p\, c_p \left( c_s + 1 \right) \|S_{mg,l-1}(\nu)\|_2^{\gamma}$
$\qquad\qquad = \|S_l(\nu)\|_2 + c^{*} \|S_{mg,l-1}(\nu)\|_2^{\gamma}. \qquad (6.15)$

This estimate leads to the study of sequences of the form

$x_1 = x, \quad x_l \le x + c^{*} x_{l-1}^{\gamma}, \quad l \ge 2, \qquad (6.16)$

with $x = \|S_l(\nu)\|_2 < 1$, since for l = 1 the multigrid and the two-level method coincide. 2
Lemma 6.15 Bound for the iterates of inequality (6.16). Assume that $c^{*}\gamma > 1$. If $\gamma \ge 2$ and

$x \le x_{max} := \frac{\gamma-1}{\gamma} \left( c^{*}\gamma \right)^{-\frac{1}{\gamma-1}},$

then every iterate of (6.16) is bounded by

$x_l \le \frac{\gamma}{\gamma-1}\, x < 1.$

Proof: The proof of the bound is performed by induction. For l = 2, one has

$x_2 \le x + c^{*} x_1^{\gamma} = x + c^{*} x^{\gamma} = x \left( 1 + c^{*} x^{\gamma-1} \right) \le x \left( 1 + c^{*} x_{max}^{\gamma-1} \right) = x \left( 1 + c^{*} \left( \frac{\gamma-1}{\gamma} \right)^{\gamma-1} \frac{1}{c^{*}\gamma} \right)$
$\quad = x \left( 1 + \frac{(\gamma-1)^{\gamma-1}}{\gamma^{\gamma}} \right) = x \left( 1 + \frac{1}{\gamma-1} \left( 1 - \frac{1}{\gamma} \right)^{\gamma} \right) \le x\, \frac{\gamma-1+1}{\gamma-1} = x\, \frac{\gamma}{\gamma-1},$

since $\left( 1 - \frac{1}{\gamma} \right)^{\gamma} < 1$ (positive power of a real number in (0, 1)).
Let the statement be already proved for l − 1, then one obtains with the assumption of the induction

$x_l \le x + c^{*} x_{l-1}^{\gamma} \le x + c^{*} \left( \frac{\gamma}{\gamma-1} \right)^{\gamma} x^{\gamma} = x \left( 1 + c^{*} \left( \frac{\gamma}{\gamma-1} \right)^{\gamma} x^{\gamma-1} \right) \le x \left( 1 + c^{*} \left( \frac{\gamma}{\gamma-1} \right)^{\gamma} x_{max}^{\gamma-1} \right)$
$\quad = x \left( 1 + \left( \frac{\gamma}{\gamma-1} \right)^{\gamma} \left( \frac{\gamma-1}{\gamma} \right)^{\gamma-1} \frac{1}{\gamma} \right) = x \left( 1 + \frac{1}{\gamma-1} \right) = x\, \frac{\gamma}{\gamma-1}.$

Using now the assumption on x and the assumption $c^{*}\gamma > 1$, one gets

$\frac{\gamma}{\gamma-1}\, x \le \frac{\gamma}{\gamma-1}\, x_{max} = \left( c^{*}\gamma \right)^{-\frac{1}{\gamma-1}} < 1.$
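The statement of the lemma is easy to observe numerically. The following check uses the arbitrary choices γ = 2 and c* = 2 (so x_max = 1/8) and iterates (6.16) with equality; the iterates increase towards, but never exceed, the bound γ/(γ − 1) x = 1/4.

    c_star, gamma = 2.0, 2
    x = (gamma - 1) / gamma * (c_star * gamma)**(-1.0 / (gamma - 1))  # x = x_max
    x_l = x
    for l in range(2, 50):
        x_l = x + c_star * x_l**gamma            # recursion (6.16) with equality
        assert x_l <= gamma / (gamma - 1) * x    # bound of Lemma 6.15
    print(x_l)                                   # approaches 0.25 from below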
Theorem 6.16 Convergence of the multigrid γ-cycle. Let the assumptions (6.10), (6.11), and (6.12) be satisfied, let γ ≥ 2, and let the number ν of smoothing steps be sufficiently large. Then there is a constant ρ < 1, independent of the level l, such that

$\|S_{mg,l}(\nu)\|_2 \le \rho \quad \text{and} \quad \|S_{mg,l}(\nu)\|_2 \le \frac{\gamma}{\gamma-1}\, C_a \eta(\nu) \quad \forall\ l \ge 1,$

where $C_a \eta(\nu)$ denotes the bound of the two-level iteration matrix from Chapter 5, $\|S_l(\nu)\|_2 \le C_a \eta(\nu)$.

Proof: Starting point of the proof is inequality (6.15). Lemma 6.15 will be applied with

$x = \|S_l(\nu)\|_2, \quad x_l = \|S_{mg,l}(\nu)\|_2.$

Without loss of generality, one can choose

$c^{*} > \frac{1}{\gamma} \iff c^{*}\gamma > 1.$

In particular, $c^{*}$ can be chosen so large that

$x \le \frac{\gamma-1}{\gamma} \left( c^{*}\gamma \right)^{-\frac{1}{\gamma-1}} \le \frac{\gamma-1}{\gamma}\, \rho < 1.$

Note that large values of $c^{*}$ require small values of x, which can always be obtained by applying sufficiently many smoothing steps. Thus, the assumptions of Lemma 6.15 are satisfied and one obtains

$\|S_{mg,l}(\nu)\|_2 \le \frac{\gamma}{\gamma-1} \|S_l(\nu)\|_2 = \frac{\gamma}{\gamma-1}\, x \le \rho.$

The second estimate is obtained recursively. Using formally the same computations as in the proof of Lemma 6.15, one gets

$\|S_{mg,2}(\nu)\|_2 \le \frac{\gamma}{\gamma-1} \|S_2(\nu)\|_2 \le \frac{\gamma}{\gamma-1}\, C_a \eta(\nu),$

and by induction

$\|S_{mg,l}(\nu)\|_2 \le \frac{\gamma}{\gamma-1} \|S_l(\nu)\|_2 \le \frac{\gamma}{\gamma-1}\, C_a \eta(\nu).$

The details of this proof are an exercise.
Remark 6.17 On the convergence result.
• A similar result can be proved if only post-smoothing and no pre-smoothing is applied, as well as in the case that both pre- and post-smoothing are used.
• The convergence proof for the V-cycle, i.e., γ = 1, does not rely on the convergence of the two-level method. In this proof, the multigrid iteration matrix is analyzed directly, e.g., see (Hackbusch, 1985, pp. 164).
• For problems without a symmetric positive definite system matrix, multigrid often works quite well in practice. But only very little is proved on the convergence of multigrid methods for such problems. Results on the multigrid convergence for problems without a symmetric positive definite matrix are in general restricted to problems which are only a slight perturbation of an s.p.d. problem. But many interesting problems are not small perturbations of an s.p.d. problem, like convection-dominated convection-diffusion equations or the Navier–Stokes equations. In these fields, many questions concerning the theory of multigrid methods are open. Some results for convection-diffusion problems can be found in Reusken (2002); Olshanskii and Reusken (2004).
2
Remark 6.20 Assumptions on the computational costs. Let $N_l$ be the number of degrees of freedom on level l. It will be assumed that the following operations require at most the following numbers of floating point operations (flops):
• one smoothing step on level l:
  flops ≤ $c_s N_l$, l ≥ 1,
• computation of the residual and its restriction to level l − 1:
  flops ≤ $c_r N_l$, l ≥ 1,
• prolongation and correction $u_l := u_l + I_{l-1}^{l} v_{l-1}$:
  flops ≤ $c_p N_l$, l ≥ 1,
• solution of the system on the coarsest grid:
  flops ≤ $c_0$.
For sparse matrices and the prolongation and restriction which were introduced in Chapter 4, these bounds are true. The system on the coarsest grid can be solved, e.g., by Gaussian elimination. Then, $c_0$ depends on the number of degrees of freedom on the coarsest grid, but not on $N_l$.
Let

$c_h = \sup_{l \ge 1} \frac{N_{l-1}}{N_l}.$

For uniformly refined grids, i.e., $h_{l-1} = 2h_l$, this constant has the value $c_h = 2^{-d}$, where d is the dimension of the domain. 2
Theorem 6.21 Computational costs of the multigrid γ-cycle. Let the assumptions of Remark 6.20 hold and let $\theta = \gamma c_h < 1$. Then, the number of flops of one multigrid γ-cycle on level l is bounded by

$\left( \frac{1}{1-\theta} \left( \nu c_s + c_r + c_p \right) + \theta^{l-1} \frac{c_0}{N_1} \right) N_l. \qquad (6.18)$

Proof: In one γ-cycle on level l, level l − k is visited $\gamma^{k}$ times, k = 0, . . . , l − 1, and the system on the coarsest grid is solved $\gamma^{l-1}$ times. Hence, with $N_{l-k} \le c_h^{k} N_l$, the number of flops is bounded by

$\left( \nu c_s + c_r + c_p \right) \sum_{k=0}^{l-1} \gamma^{k} N_{l-k} + \gamma^{l-1} c_0 \le \left( \left( \nu c_s + c_r + c_p \right) \sum_{k=0}^{l-1} \theta^{k} + \theta^{l-1} \frac{c_0}{N_1} \right) N_l, \qquad (6.19)$

since $c_h^{l-1} \ge N_1/N_l$ for l ≥ 1. Bounding the geometric sum by $1/(1-\theta)$ proves the statement.
Remark 6.22 On the bound (6.18). The bound (6.18) depends formally on l. One can remove this dependence by using that $\theta^{l-1} \le \theta$ for l ≥ 2. However, in the form (6.18) it becomes clearer that the importance of the flops of the coarsest grid solver decreases with increasing level. 2
Example 6.23 Computational costs for different cycles. Consider a standard uniform refinement, i.e., it is $c_h = 2^{-d}$, where d is the dimension of the domain.
In one dimension, the theory applies for the V-cycle because $\gamma c_h = 1/2$, but not for the W-cycle since $\gamma c_h = 1$.
In two dimensions, one has for the V-cycle $\gamma c_h = 1/4$ and for the W-cycle $\gamma c_h = 1/2$. Then, one obtains from (6.18) the following estimates for the computational costs, where $c_l$ denotes the number of flops per degree of freedom on level l:
• V-cycle:

$c_l < \frac{4}{3} \left( \nu c_s + c_r + c_p \right) + \left( \frac{1}{4} \right)^{l-1} \frac{c_0}{N_1},$

• W-cycle:

$c_l < 2 \left( \nu c_s + c_r + c_p \right) + \left( \frac{1}{2} \right)^{l-1} \frac{c_0}{N_1}.$

Neglecting the flops for the coarsest grid solver, a W-cycle for a two-dimensional problem requires roughly 1.5 times the number of flops of a V-cycle.
In three dimensions, one finds for the V-cycle that $\gamma c_h = 1/8$ and for the W-cycle that $\gamma c_h = 1/4$. Then, the number of flops per cycle is bounded by
• V-cycle:

$c_l < \frac{8}{7} \left( \nu c_s + c_r + c_p \right) + \left( \frac{1}{8} \right)^{l-1} \frac{c_0}{N_1},$

• W-cycle:

$c_l < \frac{4}{3} \left( \nu c_s + c_r + c_p \right) + \left( \frac{1}{4} \right)^{l-1} \frac{c_0}{N_1}.$

Hence, in three dimensions the W-cycle is only about 1.167 times as expensive as the V-cycle.
These results suggest thinking about different strategies in different dimensions. The V-cycle is always more efficient, whereas the W-cycle is generally more stable. Since the efficiency gain of the V-cycle in three dimensions is only small, one should apply the W-cycle there. In two dimensions, one should first try whether the V-cycle works. As an alternative, one can use in both cases the F-cycle. The computation of the numerical costs of the F-cycle is an exercise. 2
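The factors in these estimates follow directly from (6.18). The following lines tabulate the factor 1/(1 − γ c_h) in front of (ν c_s + c_r + c_p), with the coarse grid solver neglected:

    for d in (1, 2, 3):
        c_h = 2.0**-d
        for name, gamma in (("V-cycle", 1), ("W-cycle", 2)):
            theta = gamma * c_h
            if theta < 1.0:
                print(d, name, 1.0 / (1.0 - theta))
            else:
                print(d, name, "estimate (6.18) not applicable")
    # 2D: W/V = 2/(4/3) = 1.5;  3D: W/V = (4/3)/(8/7) = 7/6 = 1.167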
Corollary 6.24 Number of flops for θ = 1. Let the notations be as in Theorem 6.21 and let θ = 1. Then, the number of operations on level l is bounded by

$\left( \left( \nu c_s + c_r + c_p \right) l + \frac{c_0}{N_1} \right) N_l.$

Proof: The proof starts like the proof of Theorem 6.21 until (6.19). Then, one sets θ = 1 in (6.19) to obtain the statement of the corollary.

Example 6.25 W-cycle in one dimension. The corollary states that the number of flops for the W-cycle in one dimension is not proportional to $N_l$. Hence, the W-cycle is not optimal in one dimension. 2
Remark 6.26 Memory requirements of the multigrid method. The sparse matrix on level l requires the storage of $c_m N_l$ numbers, where $c_m$ is independent of l. In addition, one has to store the arrays $v_l$ and $f_l$, which are $2N_l$ numbers. It follows that the total storage requirement is

$(2 + c_m) \sum_{k=0}^{l} N_k = (2 + c_m) N_l \left( 1 + \frac{N_{l-1}}{N_l} + \frac{N_{l-1}}{N_l} \frac{N_{l-2}}{N_{l-1}} + \ldots \right) \le (2 + c_m) N_l \sum_{k=0}^{l} c_h^{k} \le \frac{(2 + c_m) N_l}{1 - c_h},$

if $c_h < 1$. A method that works only on the finest grid requires at least the storage of $(2 + c_m) N_l$ numbers. Thus, for uniform standard refinement, i.e., $c_h = 2^{-d}$, one has for
• d = 1: that the multigrid method needs 100 %,
• d = 2: that the multigrid method needs 33.3 %,
• d = 3: that the multigrid method needs 14.3 %,
more memory than a single grid algorithm on the finest grid. 2
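These percentages follow from the geometric series bound; a one-line check:

    for d in (1, 2, 3):
        c_h = 2.0**-d
        extra = 1.0 / (1.0 - c_h) - 1.0          # hierarchy vs. finest grid only
        print(f"d = {d}: {100.0 * extra:.1f} % more memory")   # 100.0, 33.3, 14.3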
Chapter 7

Algebraic Multigrid Methods
Remark 7.1 Motivation. The (geometric) multigrid methods described so far need
a hierarchy of (geometric) grids, from the coarsest one (l = 0) to the finest one.
On all levels but the coarsest one, the smoother will be applied and on the coarsest
level, the system is usually solved exactly. However, the following question arises:
• What should be done if the available coarsest grid already possesses so many degrees of freedom that the use of a direct solver takes too much time?
This situation happens frequently if the problem is given in a complicated domain in $\mathbb{R}^d$, d ∈ {2, 3}, see Figure 7.1 for an (academic) example. Complicated domains are very likely to occur in applications. Then, the application of a grid generator will often produce (coarsest) grids that are already so fine that a refinement would create more degrees of freedom than an efficient simulation of the problem allows. Altogether, there is just one grid.
To handle the situation of a coarsest grid with many degrees of freedom, there
are at least two possibilities.
• One level iterative scheme. In the case that there is a geometric grid hierarchy but the coarsest grid is already fine, one can use a simple iterative method, e.g., the smoother, to solve the system on the coarsest grid approximately. Then, the smooth error modes on this grid are not damped. However, experience shows that this approach sometimes works quite well in practice.
If there is just one grid available, a Krylov subspace method can be used for
solving the arising linear systems of equations.
• Iterative scheme with multilevel ideas. Construct a more complicated iterative
method which uses a kind of multigrid idea for the solution of the system on the
coarsest geometric grid. The realization of this multigrid idea should be based
only on information which can be obtained from the matrix on the coarsest grid.
This type of solver is called Algebraic Multigrid Method (AMG).
2
Figure 7.1: Top: domain with many holes (like the stars in the flag of the United States); middle: triangular grid from a grid generator; bottom: zoom into the region with the holes.
where “f” refers to the fine grid and “c” to the next coarser grid. The coarse grid operator is defined by the Galerkin projection

$A_c = I_f^{c} A_f I_c^{f}.$
Remark 7.3 Main tasks in the construction of AMGs. There remain two main
tasks in the construction of an AMG:
• An appropriate hierarchy of levels has to be constructed fully automatically,
using only information from the matrix on the current grid to construct the
next coarser grid.
• One has to define an appropriate prolongation operator.
These two components will determine the efficiency of the AMG.
In contrast to geometric multigrid methods, an AMG constructs from a given
grid a coarser grid. Since the final number of coarser grids is not known a priori, it
makes sense to denote the starting grid by level 0, the next coarser grid by level 1
and so on.
The coarsening process of an AMG should work automatically, based only on
information from the matrix on the current level. To describe this process, some
notation is needed. AMGs are set up in an algebraic environment. However, it
is often convenient to use a grid terminology by introducing fictitious grids with
grid points being the nodes of a graph which is associated with the given matrix
A = (aij ). 2
Definition 7.4 Graph of a matrix, neighborhood. The graph of the matrix A = (a_ij) ∈ $\mathbb{R}^{n\times n}$ consists of the set of vertices

$V = \{v_1, \ldots, v_n\}$

and the set of edges E, where the edge $e_{ij}$, i ≠ j, belongs to E if and only if $a_{ij} \neq 0$. The neighborhood of the vertex $v_i$ is defined by

$N_i = \{v_j \in V\ :\ e_{ij} \in E\}.$

Let the vertex $v_i$ correspond to the i-th unknown, i.e., to the degree of freedom that corresponds to the i-th row of the matrix. Then the graph of A has the form as given in Figure 7.2.
[Figure 7.2: graph of the matrix, with the vertices v1, v2, v3, v4.]
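The neighborhoods can be read off directly from the matrix entries. A minimal sketch (the 4 × 4 matrix below is a hypothetical example, not necessarily the matrix belonging to Figure 7.2):

    import numpy as np

    def neighborhoods(A):
        # N_i of the graph of A: v_j is a neighbor of v_i iff a_ij != 0, j != i
        n = A.shape[0]
        return [{j for j in range(n) if j != i and A[i, j] != 0} for i in range(n)]

    A = np.array([[ 2., -1.,  0., -1.],
                  [-1.,  2., -1.,  0.],
                  [ 0., -1.,  2., -1.],
                  [-1.,  0., -1.,  2.]])
    print(neighborhoods(A))              # [{1, 3}, {0, 2}, {1, 3}, {0, 2}]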
It will be assumed that the spectral radius of $D^{-1}A$, where D is the diagonal of A, is uniformly bounded, i.e., the spectral radius is bounded independently of the grid. This property holds for many classes of matrices which occur in applications. 2
Proof: (7.2). This estimate follows from the Cauchy–Schwarz inequality and the symmetry of D:

$\|v\|_1^2 = (Av, v) = v^{T}Av = v^{T}AD^{-1/2}D^{1/2}v = \left( D^{-1/2}Av, D^{1/2}v \right)$
$\qquad \le \left\| D^{-1/2}Av \right\|_2 \left\| D^{1/2}v \right\|_2$
$\qquad = \left( D^{-1/2}Av, D^{-1/2}Av \right)^{1/2} \left( D^{1/2}v, D^{1/2}v \right)^{1/2}$
$\qquad = \left( Av, D^{-1/2}D^{-1/2}Av \right)^{1/2} \left( v, D^{1/2}D^{1/2}v \right)^{1/2}$
$\qquad = \left( Av, D^{-1}Av \right)^{1/2} \left( v, Dv \right)^{1/2}$
$\qquad = \|v\|_2 \|v\|_0.$
Consider the eigenvalue problem

$D^{-1}Ax = \lambda x.$

The matrix $D^{-1}A$ is similar to $A^{1/2}D^{-1}A^{1/2}$, since $A^{1/2}\left(D^{-1}A\right)A^{-1/2} = A^{1/2}D^{-1}A^{1/2}$. In particular, the spectral radii of both matrices are the same. Using the definition of the positive definiteness, one sees that $A^{1/2}D^{-1}A^{1/2}$ is positive definite, since the diagonal of a positive definite matrix is a positive definite matrix. Hence, one gets, using a well known property of the spectral radius for symmetric positive definite matrices (Rayleigh quotient),

$\rho\left(D^{-1}A\right) = \rho\left(A^{1/2}D^{-1}A^{1/2}\right) = \lambda_{\max}\left(A^{1/2}D^{-1}A^{1/2}\right) = \sup_{x\in\mathbb{R}^n} \frac{\left(A^{1/2}D^{-1}A^{1/2}x, x\right)}{(x,x)}.$

Setting now $x = A^{1/2}v$ gives an estimate of the spectral radius from below

$\rho\left(D^{-1}A\right) \ge \frac{\left(A^{1/2}D^{-1}A^{1/2}A^{1/2}v, A^{1/2}v\right)}{\left(A^{1/2}v, A^{1/2}v\right)} = \frac{\left(D^{-1}Av, Av\right)}{(Av, v)} = \frac{\|v\|_2^2}{\|v\|_1^2},$

i.e., $\|v\|_2^2 \le \rho\left(D^{-1}A\right) \|v\|_1^2$ for all v. Similarly, $D^{-1}A$ is similar to the matrix $D^{-1/2}AD^{-1/2}$, which is symmetric and positive definite, which follows by the definition of the positive definiteness and the assumed positive definiteness of A. Using the Rayleigh quotient yields

$\rho\left(D^{-1}A\right) = \rho\left(D^{-1/2}AD^{-1/2}\right) = \lambda_{\max}\left(D^{-1/2}AD^{-1/2}\right) = \sup_{x\in\mathbb{R}^n} \frac{\left(D^{-1/2}AD^{-1/2}x, x\right)}{(x,x)}.$
Lemma 7.8 On the eigenvectors of $D^{-1}A$. Let A ∈ $\mathbb{R}^{n\times n}$ be a symmetric positive definite matrix and φ be an eigenvector of $D^{-1}A$ with the eigenvalue λ, i.e., $D^{-1}A\varphi = \lambda\varphi$. Then it is

$\|\varphi\|_2^2 = \lambda \|\varphi\|_1^2, \qquad \|\varphi\|_1^2 = \lambda \|\varphi\|_0^2.$

Proof: The first statement is obtained by multiplying the eigenvalue problem from the left with $\varphi^{T}A$, giving

$\left( A\varphi, D^{-1}A\varphi \right) = \lambda \left( A\varphi, \varphi \right).$

The second equality follows from multiplying the eigenvalue problem from the left with $\varphi^{T}D$.
Remark 7.10 On the smoothing property. The definition of the smoothing property implies that S reduces the error efficiently as long as $\|v\|_2$ is relatively large compared with $\|v\|_1$. However, the smoothing operator will become very inefficient if $\|v\|_2 \ll \|v\|_1$. 2
Proof: It is

$\|Sv\|_1^2 = (ASv, Sv) = \left( A\left(I - Q^{-1}A\right)v, \left(I - Q^{-1}A\right)v \right)$
$\quad = (Av, v) - \left( AQ^{-1}Av, v \right) - \left( Av, Q^{-1}Av \right) + \left( AQ^{-1}Av, Q^{-1}Av \right)$
$\quad = \|v\|_1^2 - \left( Q^{T}Q^{-1}Av, Q^{-1}Av \right) - \left( QQ^{-1}Av, Q^{-1}Av \right) + \left( AQ^{-1}Av, Q^{-1}Av \right)$
$\quad = \|v\|_1^2 - \left( \left( Q^{T} + Q - A \right) Q^{-1}Av, Q^{-1}Av \right).$

Hence, the algebraic smoothing property (7.5) is equivalent to the condition that for all $v \in \mathbb{R}^n$:

$\sigma \|v\|_2^2 \le \left( \left( Q^{T} + Q - A \right) Q^{-1}Av, Q^{-1}Av \right) \iff$
$\sigma \left( D^{-1}Av, Av \right) \le \left( \left( Q^{T} + Q - A \right) Q^{-1}Av, Q^{-1}Av \right) \iff$
$\sigma \left( D^{-1}Qy, Qy \right) \le \left( \left( Q^{T} + Q - A \right) y, y \right),$

with $y = Q^{-1}Av$. Since the matrices A and Q are non-singular, y is an arbitrary vector from $\mathbb{R}^n$. Hence, the statement of the lemma is proved.
Thus, if

$\eta \le \frac{2}{\omega} - \frac{\sigma}{\omega^2}, \qquad (7.8)$

then (7.7) is satisfied (sufficient condition) and the damped Jacobi iteration fulfills the algebraic smoothing property. One obtains from (7.8)

$\sigma \le 2\omega - \eta\omega^2 = \omega \left( 2 - \omega\eta \right).$
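For the 1D model matrix, D is a multiple of the identity, so $S = I - \omega D^{-1}A$ shares its eigenvectors with $D^{-1}A$ and the smoothing property can be checked eigenvalue by eigenvalue. A minimal numerical check of σ = ω(2 − ωη), under this special assumption:

    import numpy as np

    n, omega = 63, 0.5
    A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # scaling cancels in sigma
    lam = np.linalg.eigvalsh(A / 2.0)       # eigenvalues of D^{-1}A, here in (0, 2)
    eta = lam.max()
    sigma = np.min(2*omega - omega**2 * lam)  # largest sigma satisfying (7.5)
    print(sigma, omega * (2 - omega * eta))   # both approximately 0.5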
with $s_i = \sum_{j=1}^{n} a_{ij}$ being the i-th row sum of A. It follows from (7.9) that

$\frac{1}{2} \sum_{i,j=1}^{n} \left( -a_{ij} \right) \left( v_i - v_j \right)^2 + \sum_{i=1}^{n} s_i v_i^2 \ll \sum_{i=1}^{n} a_{ii} v_i^2. \qquad (7.10)$

In the sum, there are only nonnegative terms. Thus, if $|a_{ij}|/a_{ii}$ is large, then $(v_i - v_j)^2 / v_i^2$ has to be small such that the sum becomes small. One says that schemes which satisfy the smoothing property (7.5) smooth the error along the so-called strong connections, i.e., where $|a_{ij}|/a_{ii}$ is large, since for these connections a good smoothing can be expected on the given grid. This property implies that the corresponding nodes i and j do not both need to be on the coarse grid. 2
7.3 Coarsening
Remark 7.17 Goal. Based on the matrix information only, one has to choose in
the graph of the matrix nodes which become coarse nodes and nodes which stay on
the fine grid. There are several strategies for coarsening. In this course, a standard
way will be described. It will be restricted to the case that A ∈ Rn×n is a symmetric
positive definite M-matrix. 2
A variable i is said to be strongly coupled to a variable j if

$-a_{ij} \ge \varepsilon_{str} \max_{k \neq i} \left\{ -a_{ik} \right\},$

with fixed $\varepsilon_{str} \in (0, 1)$. The set of all strong couplings of i is denoted by

$S_i = \{ j \in N_i\ :\ i \text{ is strongly coupled to } j \}.$

The set $S_i^{T}$ of strong transposed couplings of i consists of all variables j which are strongly coupled to i

$S_i^{T} = \{ j \in N_i\ :\ i \in S_j \}.$

2
Then, one gets $S_1 = \{2\}$, $S_2 = \{1\}$, $S_3 = \{1, 2\}$, such that $S_1^{T} = \{2, 3\}$, $S_2^{T} = \{1, 3\}$, $S_3^{T} = \emptyset$.
• The actual choice of εstr is in practical computations not very critical. Values
of around 0.25 are often used.
2
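A possible implementation of the sets $S_i$ and $S_i^{T}$, assuming the strong coupling criterion stated above:

    def strong_couplings(A, eps_str=0.25):
        # S_i and S_i^T for an s.p.d. M-matrix
        n = A.shape[0]
        S = []
        for i in range(n):
            thresh = eps_str * max(-A[i, k] for k in range(n) if k != i)
            S.append({j for j in range(n) if j != i and -A[i, j] >= thresh > 0})
        ST = [{j for j in range(n) if i in S[j]} for i in range(n)]
        return S, ST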
Remark 7.20 Aspects of the coarsening process. In the coarsening process, one
has to pay attention to several aspects.
• The number of coarse nodes (C-nodes) should not be too large, such that the
dimension of the coarse system is considerably smaller than the dimension of
the fine system.
• Nodes i and j which are strongly coupled have a small relative error

$(e_i - e_j)^2 / e_i^2,$

such that a coarse grid correction of this error is not necessary. That means it would be inefficient to define both nodes as coarse nodes.
• All fine nodes (F-nodes) should have a substantial coupling to neighboring C-
nodes. In this way, the F-nodes obtain sufficient information from the C-nodes.
• The distribution of the C-nodes and F-nodes in the graph should be reasonably
uniform.
2
To each undecided variable i, a measure of importance is assigned:

$\lambda_i = \left| S_i^{T} \cap U \right| + 2 \left| S_i^{T} \cap F \right|, \quad i \in U, \qquad (7.11)$
where U is the current set of undecided variables, F the current set of F-nodes
and |·| is the number of elements in a set. One of the undecided variables with
the largest value of λi will become the next C-node. After this choice, all variables
which are strongly coupled to the new C-node become F-nodes and for the remaining
undecided variables, one has to update their measure of importance.
With the measure of importance (7.11), there is initially the tendency to pick
variables which are strongly coupled with many other variables to become C-nodes,
because |U | is large and |F | is small, such that the first term dominates. Later,
the tendency is to pick as C-nodes those variables on which many F-nodes depend
strongly, since |F | is large and |U | is small such that the second term in λi becomes
dominant. Thus, the third point of Remark 7.20 is taken into account. 2
For an arbitrary $\varepsilon_{str}$, each node i is strongly coupled to its left, right, upper, and lower neighbors. Consider a 5 × 5 patch and choose some node as C-node. In the first step, one obtains

U U U U U
U U U U U
U U F U U
U F C F U
U U F U U
where for the undecided nodes diagonally adjacent to the C-node (two strong transposed couplings to F-nodes) it is λi = 2 + 2 · 2 = 6, and for the other undecided nodes it is either λi = 4 + 2 · 0 = 4 or λi = 3 + 2 · 1 = 5. The next step gives, e.g.,
U U U U U
U U U F U
U U F C F
U F C F U
U U F U U

and, after the next update of the measures of importance,

U U F U U
U F C F U
F C F C F
U F C F U
U U F U U

and so on.
In this particular example, one obtains a coarse grid similar to the one given by a geometric multigrid method. However, in general, especially for non-symmetric matrices, the coarse grid of an AMG looks considerably different from the coarse grid of a geometric multigrid method. 2
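The process of this example can be reproduced by a short sketch of the standard first pass of the coarsening, driven by the measure of importance (7.11) and the strong_couplings function from the sketch above:

    import numpy as np

    def coarsen(S, ST):
        # first-pass C/F splitting driven by the measure of importance (7.11)
        n = len(S)
        U, C, F = set(range(n)), set(), set()
        while U:
            lam = {i: len(ST[i] & U) + 2 * len(ST[i] & F) for i in U}
            c = max(lam, key=lam.get)        # undecided node of largest importance
            U.discard(c); C.add(c)           # it becomes a C-node; the nodes
            new_f = ST[c] & U                # strongly coupled to it become F-nodes
            U -= new_f; F |= new_f
        return C, F

    # five point stencil on a 5 x 5 interior grid (matrix of dimension 25)
    m = 5
    T = 2*np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    A = np.kron(np.eye(m), T) + np.kron(T, np.eye(m))
    S, ST = strong_couplings(A)
    C, F = coarsen(S, ST)
    print(sorted(C), sorted(F))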
7.4 Prolongation
Remark 7.24 Prolongation. The last component of an AMG which has to be defined is the prolongation. It will be matrix-dependent, in contrast to geometric multigrid methods. 2
Remark 7.25 Algebraically smooth error. An error e is called algebraically smooth if the smoother reduces it only slightly, i.e.,

$Se \approx e,$

or

$\|e\|_2 \ll \|e\|_1.$

In terms of the residual

$r = f - Av = Au - Av = A(u - v) = Ae,$

this inequality means that

$\left( D^{-1}Ae, Ae \right) \ll (Ae, e) \iff \left( D^{-1}r, r \right) \ll (r, e).$

One term in both inner products is the same. One can interpret this inequality in the way that, on the average, algebraically smooth errors are characterized by the scaled residual (first argument of the left-hand side inner product) being much smaller than the error (second argument of the right-hand side inner product). On the average, it follows that

$a_{ii}^{-1} r_i^2 \ll \left| r_i e_i \right| \iff \left| r_i \right| \ll a_{ii} \left| e_i \right|.$

Thus, $|r_i|$ is close to zero and one can use the approximation

$0 \approx r_i = a_{ii} e_i + \sum_{j \in N_i} a_{ij} e_j. \qquad (7.12)$
Let i be an F-node and let $P_i \subset C_{nod}$ be a subset of the C-nodes, the so-called interpolatory points, where the set of C-nodes is denoted by $C_{nod}$. The goal of the prolongation consists in obtaining a good approximation of $e_i$ using information from the coarse grid, i.e., from the C-nodes contained in $P_i$. Therefore, one wants to compute prolongation weights $\omega_{ij}$ such that

$e_i = \sum_{j \in P_i} \omega_{ij} e_j \qquad (7.13)$

and $e_i$ is a good approximation for any algebraically smooth error which satisfies (7.12).
2
Remark 7.26 Direct prolongation. Here, only the so-called direct prolongation in the case of A being an M-matrix will be considered. Direct prolongation means that $P_i \subset C_{nod} \cap N_i$, i.e., the interpolatory nodes are a subset of all coarse nodes which are coupled to i. Inserting the ansatz (7.13) into (7.12) gives

$e_i = \sum_{j \in P_i} \omega_{ij} e_j = -\frac{1}{a_{ii}} \sum_{j \in N_i} a_{ij} e_j. \qquad (7.14)$

If $P_i = N_i$, then the choice $\omega_{ij} = -a_{ij}/a_{ii}$ will satisfy this equation. But in general, $P_i \subsetneq N_i$. If sufficiently many nodes which are strongly connected to i are contained in $P_i$, then for the averages it holds

$\frac{1}{\sum_{j \in P_i} a_{ij}} \sum_{j \in P_i} a_{ij} e_j \approx \frac{1}{\sum_{j \in N_i} a_{ij}} \sum_{j \in N_i} a_{ij} e_j.$

Inserting this relation into (7.14) leads to the proposal of the matrix-dependent prolongation weights

$\omega_{ij} = -\left( \frac{\sum_{k \in N_i} a_{ik}}{\sum_{k \in P_i} a_{ik}} \right) \frac{a_{ij}}{a_{ii}} > 0, \quad i \in F,\ j \in P_i.$

Summation of the weights gives

$\sum_{j \in P_i} \omega_{ij} = -\left( \frac{\sum_{k \in N_i} a_{ik}}{\sum_{k \in P_i} a_{ik}} \right) \frac{\sum_{j \in P_i} a_{ij}}{a_{ii}} = \frac{a_{ii} - s_i}{a_{ii}} = 1 - \frac{s_i}{a_{ii}},$

where $s_i$ is the sum of the i-th row of A. Thus, if $s_i = 0$, then $\sum_{j \in P_i} \omega_{ij} = 1$ such that constants are prolongated exactly. 2
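The weights translate into a short function; by the computation above, the weights of an F-node with zero row sum add up to one.

    def direct_weights(A, i, P_i, N_i):
        # matrix-dependent weights of the direct prolongation for the F-node i
        alpha = sum(A[i, k] for k in N_i) / sum(A[i, k] for k in P_i)
        return {j: -alpha * A[i, j] / A[i, i] for j in P_i}

    # e.g. a row with a_ii = 4, four neighbors with a_ij = -1, two of them in P_i:
    # alpha = (-4)/(-2) = 2 and each weight is -2 * (-1)/4 = 0.5, summing to 1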
7.5 Concluding Remarks
Example 7.27 Behavior of an AMG for the Poisson equation. The same situation
as in Example 2.5 will be considered. In the code MooNMD, a simple AMG is
implemented. The number of iterations and computing times for applying this
method as solver or as preconditioner in a flexible GMRES method are presented
in Table 7.1.
Table 7.1: Example 7.27. Number of iterations and computing times (14/01/24 on a HP BL460c Gen8 2xXeon, Eight-Core 2700MHz). The number of degrees of freedom (d.o.f.) includes the Dirichlet values. The time for the setup of the AMG is included in the total solution time.
It can be seen that using AMG as a preconditioner is more efficient than using it as a solver. For both applications of AMG, the number of iterations is constant, independent of the level. However, the solution time does not scale with the number of degrees of freedom. The reason is that in the used implementation, the time for constructing the AMG does not scale in this way but much worse. Comparing the results with Table 2.2, one can see that AMG is not competitive with a geometric multigrid method if the geometric multigrid method works well.
2
Chapter 8
Outlook
Example 8.2 A convection-dominated convection-diffusion problem. For this class of problems, see the lecture notes of the course on numerical methods for convection-dominated problems. Considering ε = 10^{-8} with the Q1 finite element method and the SUPG stabilization, one obtains the iterations and computing times shown in Table 8.1. In these simulations, the multigrid methods were applied with the F(ν,ν)-cycle, where ν is the number of pre- and post-smoothing steps. In the geometric multigrid method, an SSOR smoother was used and in the algebraic multigrid method, an ILU smoother.
Table 8.1: Example 8.2. Number of iterations and computing times (14/01/23 on a HP BL460c Gen8 2xXeon, Eight-Core 2700MHz). The number of degrees of freedom (d.o.f.) includes the Dirichlet values.

level  h       d.o.f.   | FGMRES+MG F(3,3) | FGMRES+MG F(10,10) | FGMRES+AMG F(3,3) | FGMRES+AMG F(5,5) | UMFPACK
                        | ite   time       | ite   time         | ite   time        | ite    time       | ite  time
0      1/16        289  |   1      0       |   1      0         |   6      0        |   4       0       |  1      0
1      1/32       1089  |   6   0.03       |   2   0.01         |   8   0.05        |   5    0.05       |  1   0.01
2      1/64       4225  |   9   0.10       |   3   0.05         |  11   0.42        |   7    0.41       |  1   0.02
3      1/128     16641  |  15   0.44       |   5   0.25         |  17   3.19        |  12    4.64       |  1   0.16
4      1/256     66049  |  26   2.68       |   9   1.73         |  30  34.32        |  23   42.20       |  1   1.35
5      1/512    263169  |  47  20.09       |  16   8.59         |  no conv.         | 149  866.58       |  1  10.27
6      1/1024  1050625  | 145 252.80       |  29  66.69         |  no conv.         |                   |  1  75.17
7      1/2048  4198401  | 302 2057.39      |  76 838.18         |                   |                   |
One can see in Table 8.1 that none of the solvers behaves optimally, i.e., for none of the solvers the computing time scales with the number of degrees of freedom. The most efficient solvers in this example are the direct solver (note that this is a two-dimensional problem) and the geometric multigrid method as preconditioner with sufficiently many smoothing steps. On the finest grid, only the geometric multigrid approaches work, since the direct solver terminates because of internal memory limitations. In the multigrid methods, one can well observe the effect of increasing the number of smoothing steps.
Altogether, the linear systems obtained for convection-dominated problems are usually hard to solve, and so far an optimal solver is not known. 2
Remark 8.3 Multigrid methods with different finite element spaces. One can apply the multigrid idea also with different (finite element) spaces. For instance, consider just one grid. As coarse grid space, one can use P1 finite elements and as fine grid space P2 finite elements. With these two finite element spaces, one can perform a two-level method.
This idea has been used in the construction of multigrid methods for higher order finite elements. It is known from numerical studies that multigrid methods with the same finite element space on all levels might become inefficient for higher order elements, because it is hard to construct good smoothers. On the other hand, multigrid methods are usually more efficient for lower order elements. The idea consists in using the higher order finite element space on the finest grid as the finest level of the multigrid hierarchy, and using as next coarser level of this hierarchy a first order finite element space on the same geometric grid. On the coarser geometric grids, one also uses low order finite elements. In this way, one obtains a multigrid method for the higher order discretization which uses low order discretizations on the coarser grids. Some good experience with this approach is reported in the literature. 2
Bibliography
Briggs, W., E. Henson, and S. McCormick, 2000: A multigrid tutorial. 2d ed., SIAM,
Philadelphia, PA.
Davis, T. A., 2004: Algorithm 832: UMFPACK V4.3—an unsymmetric-pattern
multifrontal method. ACM Trans. Math. Software, 30 (2), 196–199, doi:10.1145/
992200.992206, URL http://dx.doi.org/10.1145/992200.992206.
65