Lecture Notes 2019
Dr Walter Mudzimbabwe
2019-02-01
Contents
Course Outline
  Course Structure and Details
  Course Assessment
  Course Topics
  Hardware Requirements
1 Numerical Differentiation
  1.1 Finite Difference Methods
    1.1.1 Approximations to f'(x)
    1.1.2 Approximations to f''(x)
      1.1.2.1 Mathematica Demonstration
      1.1.2.2 Example
    1.1.3 Errors in First and Second Order
  1.2 Exercises
  1.3 Richardson's Extrapolation
    1.3.1 Example
    1.3.2 Exercises
2 Numerical Integration
  2.1 Quadrature Rules
  2.2 Newton-Cotes Quadrature
    2.2.1 Trapezoidal Rule
    2.2.2 Example
      2.2.2.1 Exercise
    2.2.3 The Midpoint Method
      2.2.3.1 Comparing Trapezoidal vs Midpoint Method
    2.2.4 Simpson's Rule
      2.2.4.1 Exercise
    2.2.5 Convergence Rates
    2.2.6 Exercises
  2.3 Romberg Integration
    2.3.0.1 Example
    2.3.1 Exercises
  2.4 Double and Triple Integrals
    2.4.1 The Midpoint Method for Double Integrals
      2.4.1.1 Example
    2.4.2 The Midpoint Method for Triple Integrals
      2.4.2.1 Example
5 Interpolation
  5.1 Weierstrass Approximation Theorem
  5.2 Linear Interpolation
    5.2.0.1 Example
  5.3 Quadratic Interpolation
    5.3.0.1 Example
  5.4 Lagrange Interpolating Polynomials
    5.4.0.1 Example
    5.4.0.2 Example
  5.5 Newton's Divided Differences
    5.5.0.1 Exercise
    5.5.0.2 Example
    5.5.1 Errors of Newton's Interpolating Polynomials
  5.6 Cubic Splines Interpolation
    5.6.0.1 Example
    5.6.1 Runge's Phenomenon
    5.6.2 Exercises
6 Least Squares
  6.1 Linear Least Squares
    6.1.0.1 Example
  6.2 Polynomial Least Squares
    6.2.0.1 Exercise
  6.3 Least Squares Exponential Fit
    6.3.0.1 Example
    6.3.1 Exercises
    7.3.0.1 Example
  7.4 Runge-Kutta Methods
    7.4.1 Second Order Runge-Kutta Method
    7.4.2 Fourth Order Runge-Kutta Method
      7.4.2.1 Example
  7.5 Multistep Methods
    7.5.1 Adams-Bashforth-Moulton Method
      7.5.1.1 Example
    7.5.2 Advantages of Multistep Methods
  7.6 Systems of First Order ODEs
    7.6.1 R-K Method for Systems
  7.7 Converting an nth Order ODE to a System of First Order ODEs
    7.7.0.1 Exercise
    7.7.1 Exercises
Course Outline
Course Assessment
Course Topics
Hardware Requirements
The course will be very computational in nature; however, you do not need your own personal machine. MSL
already has Python installed. The labs will be running IDEs for Python (along with Jupyter), while
I will be using Jupyter for easier presentation and explanation in lectures. You will at some point need to
become familiar with Jupyter, as the tests will be conducted in the Maths Science Labs (MSL) using this
platform for autograding purposes.
If you do have your own machine and would prefer to work from that, you are more than welcome. Since all
the notes and code will be presented through Jupyter, please follow these steps:
• Install Anaconda from here: https://repo.continuum.io/archive/Anaconda3-5.2.0-Windows-x86_64.exe
– Make sure when installing Anaconda to select the option to add Anaconda to your PATH when prompted (it
will be deselected by default)
• To launch a Jupyter notebook, open the command prompt (cmd) and type jupyter notebook. This
should launch the browser and Jupyter. If you see any proxy issues while on campus, then you will need
to set the proxy to exclude localhost.
If you are not running Windows but rather Linux, then you can get Anaconda at: https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-x86_64.sh
Chapter 1
Numerical Differentiation
In certain situations it is difficult to work with the actual derivative of a function. In some cases a derivative
may fail to exist at a point. Another situation is when dealing with a function represented only by data and
no analytic expression. In such situations it is desirable to be able to approximate the derivative from the
available information. Presented below are methods used to approximate f'(x).
Numerical differentiation is not a particularly accurate process. It suffers from round-off errors (due to
machine precision) and errors through interpolation. Therefore, a derivative of a function can never be
computed with the same precision as the function itself.
\[
\frac{dy}{dx} = f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}. \qquad (1.1)
\]
The above definition uses values of f near the point of differentiation, x. To obtain a formula approximating
the first derivative, we typically use equally spaced points in the neighbourhood (e.g. x − h, x, x + h, where h
is some small positive value) and construct an approximation from these values for the function f (x) at these
points.
This obviously leads to an error caused by the discretisation (truncation error). This error will decrease as
h decreases, provided that the function f(x) is sufficiently smooth.
Given a smooth function f : R → R, we wish to approximate its first and second derivatives at a point x.
Consider the Taylor series expansions:
\[
f(x+h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + \frac{h^3}{6} f'''(x) + \dots \qquad (1.2)
\]
\[
f(x-h) = f(x) - h f'(x) + \frac{h^2}{2} f''(x) - \frac{h^3}{6} f'''(x) + \dots \qquad (1.3)
\]
Solving for f'(x) in Equation (1.2), we obtain the Forward Difference Formula:
\[
f'(x) = \frac{f(x+h) - f(x)}{h} - \frac{h}{2} f''(x) - \dots \approx \frac{f(x+h) - f(x)}{h},
\]
which is first order accurate, i.e. O(h).
Now, subtracting Equation (1.3) from Equation (1.2) gives the Central Difference Formula:
\[
f'(x) = \frac{f(x+h) - f(x-h)}{2h} - \frac{f'''(x)}{6} h^2 + \dots \approx \frac{f(x+h) - f(x-h)}{2h}, \qquad (1.6)
\]
which is second order accurate, i.e. O(h²). This approximation is a three-point formula (implying that three
points are required to make an approximation). Here the truncation error is \(-\frac{h^2}{6} f'''(\xi)\), where x − h ≤ ξ ≤ x + h.
Adding Equation (1.3) to Equation (1.2) gives the Central Difference Formula for the second derivative:
\[
f''(x) = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} - \frac{h^2}{12} f^{(4)}(x) + \dots \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2},
\]
which is also second order accurate, i.e. O(h²).
1.1.2.2 Example
Compute an approximation to f'(1) for f(x) = x² cos(x) using the central difference formula and h =
0.1, 0.05, 0.025, 0.0125.
from math import *

# Central finite difference approximation to f'(x)
cfd = lambda f, x, h: (f(x + h) - f(x - h))/(2*h)

x = 1
h = [0.1, 0.05, 0.025, 0.0125, 0.00625]
f = lambda x: (x**2)*cos(x)

for i in h:
    y = cfd(f, x, i)
    print("The derivative at x = 1 with h = {:.4f} is f'(x) = {:.10f}".format(i, y))
How do the relevant errors look with regard to the example above? We can plot the absolute error of the
approximations using both approaches compared to the true value. This is illustrated below:
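The figure produced at this point is not reproduced in these notes. As a rough sketch (not the original code), the comparison could be generated as follows, assuming the exact derivative f'(x) = 2x cos(x) − x² sin(x) is used as the reference and the second approach is the fourth order five-point formula derived in the Richardson's extrapolation section below:

import numpy as np
import matplotlib.pyplot as plt

f = lambda x: x**2*np.cos(x)
df_exact = lambda x: 2*x*np.cos(x) - x**2*np.sin(x)   # analytical derivative

# Second order central difference and fourth order five-point formula
cfd2 = lambda f, x, h: (f(x + h) - f(x - h))/(2*h)
cfd4 = lambda f, x, h: (f(x - h) - 8*f(x - h/2) + 8*f(x + h/2) - f(x + h))/(6*h)

x = 1.0
hs = np.logspace(-1, -12, 50)                          # decreasing stepsizes
err2 = [abs(cfd2(f, x, h) - df_exact(x)) for h in hs]
err4 = [abs(cfd4(f, x, h) - df_exact(x)) for h in hs]

plt.loglog(hs, err2, label='second order central difference')
plt.loglog(hs, err4, label='fourth order formula')
plt.xlabel('h'); plt.ylabel('absolute error'); plt.legend()
plt.show()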
The plot reveals a loss of accuracy as h decreases with our fourth order approximation. The reason for this is inherent in the
approximation formula. At small h the formula has instances of subtracting nearly equal numbers, and along
with the loss of significant digits, this is exacerbated by the division of small numbers. This illustrates the
effect that machine precision can have on our computations!
Optimal Stepsize:
If f is a function with continuous derivatives up to order 2, then the approximate stepsize h which minimises
the total error (truncation + round-off error) of the derivative of f at x is:
\[
h^* = 2\sqrt{\frac{\epsilon^* \, |f(x)|}{|f''(x)|}}. \qquad (1.8)
\]
Here ε* is the maximum relative error that occurs when real numbers are represented by floating-point
numbers and there is no underflow or overflow. A useful rule-of-thumb estimate for ε* is 7 × 10⁻¹⁷.
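As an illustration (not part of the original notes), this estimate can be evaluated for the running example f(x) = x² cos(x) at x = 1, using the rule-of-thumb value above:

from math import cos, sin, sqrt

f   = lambda x: x**2*cos(x)
d2f = lambda x: 2*cos(x) - 4*x*sin(x) - x**2*cos(x)  # analytical second derivative

eps_star = 7e-17                                     # rule-of-thumb relative error
x = 1.0
h_opt = 2*sqrt(eps_star*abs(f(x))/abs(d2f(x)))
print('Approximately optimal stepsize:', h_opt)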
1.2 Exercises
1. Consider the function f (x) = sin(x). Using the forward derivative approximation, compute an
approximation for f'(x) at x = 0.5. Investigate a gradual decrease in stepsize and determine a rough
estimate for an optimal stepsize h, i.e. one which avoids round-off error. Plot the behaviour of these
errors at the varying stepsizes. Does your estimation seem reasonable?
2. If we increased the number of bits for storing floating-point numbers, i.e. 128-bit precision, can we
obtain better numerical approximations to derivatives?
3. Will the approximation \(f'(x) \approx \frac{f(x+h) - f(x)}{h}\) give the exact answer (assuming no round-off error) if
the function f is linear?
4. Use the forward difference formula for f'(x) to find an approximation for f'(1) where f(x) = ln(x) and
h = 0.1, 0.01, 0.001.
5. Use the central difference formula for f'(x) to find an approximation for f'(0) where f(x) = exp(x)
and h = 0.1, 0.01, 0.001.
In numerical differentiation, and soon in numerical integration, we compute approximate values according
to some stepsize. Clearly the ideal case would be a stepsize approaching zero, as seen in our demo.
However, due to rounding error this is simply not possible. Using nonzero stepsizes, however, we may be able
to estimate what the value would be for a stepsize approaching zero. If we compute some value F from
some stepsizes hi and know something of the behaviour of F as h → 0, then it may be possible to extrapolate
from the known values an approximation of F at h = 0. This extrapolation will be of higher order accuracy
than any of the originally used values.
In summary:
The Richardson extrapolation method is a procedure which combines several approximations
of a certain quantity to yield a more accurate approximation of that quantity.
Suppose we are computing some quantity F and assume that the result depends on some stepsize h. Denoting
the approximation by f(h), we have F = f(h) + E(h), where E(h) represents an error. Richardson's
extrapolation can remove the error provided E(h) = ch^p, where c and p are constants. We start by computing
f(h) at some value of h, say h₁, giving:
\[
F = f(h_1) + c h_1^p,
\]
and at another value h = h₂:
\[
F = f(h_2) + c h_2^p.
\]
Then solving the above equations for F (eliminating c) we get:
\[
F \approx \frac{(h_1/h_2)^p f(h_2) - f(h_1)}{(h_1/h_2)^p - 1},
\]
which is Richardson's Extrapolation Formula. In this course we will only consider half-steps, thus
h₂ = h₁/2. This allows us to rewrite our formula as:
\[
\begin{aligned}
F_3(h) &= \frac{2^2 F_2(h/2) - F_2(h)}{2^2 - 1} \\
       &= \left[ 4\,\frac{f(x+h/2) - f(x-h/2)}{h} - \frac{f(x+h) - f(x-h)}{2h} \right] \Big/ 3 \\
       &= \frac{f(x-h) - 8f(x-h/2) + 8f(x+h/2) - f(x+h)}{6h},
\end{aligned}
\]
which is the five-point central difference formula, which is of order four, O(h⁴).
In order to apply this strategy, we need only compute a cheap approximation over an array of
halving stepsizes and then extrapolate a much higher order accurate approximation. We can do this by
building the Richardson's Extrapolation Table. To do this, let us rewrite our formula one last time:
\[
F_j^i = \frac{1}{4^j - 1}\left( 4^j F_{j-1}^i - F_{j-1}^{i-1} \right), \qquad j = 1, 2, \dots, m, \quad i = 1, 2, \dots, n. \qquad (1.9)
\]
Here j denotes iteration of the extrapolation and i the particular stepsize.
So if we use our difference formulae to compute our initial approximations F₀¹, F₀², . . . , F₀ⁿ (which we should
try to make as high order as possible), then we use the above formula to build up the table.
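A minimal sketch (not from the original notes) of how such a table could be built in Python; the function name richardson_table and its interface are illustrative only:

import numpy as np
from math import cos

def richardson_table(approx, h, m):
    # Build a Richardson extrapolation table.
    # approx(h) returns the initial (e.g. central difference) approximation,
    # h is the largest stepsize and m the number of halvings.
    F = np.zeros((m + 1, m + 1))
    for i in range(m + 1):
        F[i, 0] = approx(h/2**i)              # first column: initial approximations
    for j in range(1, m + 1):
        for i in range(j, m + 1):
            F[i, j] = (4**j*F[i, j - 1] - F[i - 1, j - 1])/(4**j - 1)
    return F

# Illustration with the running example f(x) = x^2 cos(x) at x = 1
f = lambda x: x**2*cos(x)
cfd = lambda h: (f(1 + h) - f(1 - h))/(2*h)
print(richardson_table(cfd, 0.1, 3))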
1.3.1 Example
Build a Richardson's extrapolation table for f(x) = x² cos(x) to evaluate f'(1) for h = 0.1, 0.05, 0.025, 0.0125.
Solution:
We have:
\[
\begin{aligned}
F_1^2 &= \tfrac{1}{3}\left(4F_0^2 - F_0^1\right) &
F_1^3 &= \tfrac{1}{3}\left(4F_0^3 - F_0^2\right) &
F_1^4 &= \tfrac{1}{3}\left(4F_0^4 - F_0^3\right) \\
F_2^3 &= \tfrac{1}{15}\left(16F_1^3 - F_1^2\right) &
F_2^4 &= \tfrac{1}{15}\left(16F_1^4 - F_1^3\right) &
F_3^4 &= \tfrac{1}{63}\left(64F_2^4 - F_2^3\right)
\end{aligned}
\]
In Tabular form:
Note: The F₀ⁱ values are computed from whatever was used for the initial approximation. In this case, it was
the central difference approximation.
1.3.2 Exercises
1. Develop a two point backward difference formula for approximating f'(x), including the error term.
2. Develop a second order method for approximating f'(x) that only uses the data f(x − h), f(x) and
f(x + 3h).
3. Extrapolate the formula obtained in exercise (2). Then demonstrate the order of this new formula by
approximating f'(π/3) where f(x) = sin(x) and h = 0.1, 0.01.
4. Use the forward difference formulas and backward difference formulas to determine the missing entries
in the following table:

   x      f(x)      f'(x)
   0.5    0.4794    Compute
   0.6    0.5646    Compute
   0.7    0.6442    Compute
5. Using the times and positions recorded for a moving car below, compute the velocity of the car at all
times listed:
Time (s) 0 3 5 8 10 13
Distance (m) 0 225 383 623 742 993
6. Apply Richardson's extrapolation to determine F₃(h), an approximation to f'(x), for the following
function:
f (x) = 2x sin(x), x0 = 1.05, h = 0.4
7. Use the centred difference formula to approximate the derivative of each of the following functions at
the specified location and for the specified size:
• y = tan x at x = 4, h = 0.1
• y = sin(0.5√x) at x = 1, h = 0.125
8. A jet fighter’s position on an aircraft carrier’s runway was timed during landing: where x is the distance
from the end of the carrier, measured in metres and t is the time in seconds. Estimate the velocity and
acceleration for each time point and plot these values accordingly.
9. The following data was collected when a large oil tanker was loading. Calculate the flow rate Q = dV/dt
for each time point.
t, min 0 15 30 45 60 90 120
V , 106 barrels 0.5 0.65 0.73 0.88 1.03 1.14 1.30
Chapter 2
Numerical Integration
Here we wish to compute the area under the curve f(x) over an interval [a, b] on the real line. The numerical
approximation of definite integrals is known as numerical quadrature. We will consider the interval of
integration to be finite and assume the integrand f is smooth and continuous.
Since integration is an infinite summation, we will need to approximate this infinite sum by a finite sum. This
finite sum involves sampling the integrand at some finite number of points within the interval; this is known
as a quadrature rule. Thus, our goal is to determine which sample points to take and how to weight
their contributions in the quadrature formula. We can design these to achieve a desired accuracy at an acceptable
computational cost. Generally, this computational cost is measured through the
number of integrand function evaluations required. Importantly, numerical integration is insensitive to
round-off error.
The points xᵢ are the values at which f is evaluated (called nodes), the multipliers wᵢ are called weights, and
Rₙ is the remainder. To approximate the value of the integral we compute:
\[
I = \sum_{i=1}^{n} w_i f(x_i). \qquad (2.3)
\]
This is the first and simplest of the Newton–Cotes closed integration formulae. It corresponds to the case when
the polynomial is of first degree. We partition the interval [a, b] of integration into n subintervals of equal
width, with n + 1 points x₀, x₁, · · · , xₙ, where x₀ = a and xₙ = b. Let
\[
x_{i+1} - x_i = h = \frac{b-a}{n}, \qquad i = 0, 1, 2, \dots, n-1.
\]
On each subinterval [xᵢ, xᵢ₊₁], we approximate f(x) with a first degree polynomial,
\[
P_1(x) = f_i + \frac{f_{i+1} - f_i}{x_{i+1} - x_i}(x - x_i) = f_i + \frac{f_{i+1} - f_i}{h}(x - x_i).
\]
Then we have:
\[
\begin{aligned}
\int_{x_i}^{x_{i+1}} f(x)\,dx &\approx \int_{x_i}^{x_{i+1}} P_1(x)\,dx \\
&= \int_{x_i}^{x_{i+1}} \left[ f_i + \frac{f_{i+1} - f_i}{h}(x - x_i) \right] dx \\
&= h f_i + \frac{f_{i+1} - f_i}{h}\,\frac{h^2}{2} \\
&= \frac{h}{2}\,(f_i + f_{i+1}).
\end{aligned}
\]
Geometrically, the trapezoidal rule is equivalent to approximating the area of the trapezoid under the straight
line connecting f (xi ) and f (xi+1 ). Summing over all subintervals and simplifying gives:
\[
I = \int_a^b f(x)\,dx = \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f(x)\,dx \approx \sum_{i=1}^{n} \frac{f(x_{i-1}) + f(x_i)}{2}\,h, \qquad (2.6)
\]
or:
\[
I \approx \frac{h}{2}\left[ f_0 + 2(f_1 + f_2 + \dots + f_{n-1}) + f_n \right], \qquad (2.7)
\]
2
which is known as the Composite Trapezoidal rule. In practice we would always used composite trapezoidal
rule since it is simply trapezoidal rule applied in a piecewise fashion. The error of the composite trapezoidal
rule is the difference between the value of the integral and the computed numerical result:
Z b
E= f (x)dx − I, (2.8)
a
So:
\[
E_T = -\frac{(b-a)h^2}{12} f''(\xi), \qquad \xi \in [a, b], \qquad (2.9)
\]
where ξ is a point which exists between a and b. We can also see that the error is of order O(h²). Therefore,
if the integrand is concave then the error is positive and the trapezoidal rule underestimates the true value.
Should the integrand be convex then the error is negative and we have overestimated the true value.
2.2.2 Example
Use the composite trapezoidal rule with n = 6 subintervals to approximate the integral
\[
I = \int_0^1 \frac{1}{1 + x^2}\,dx.
\]
Solution:
With h = 1/6 we have
\[
I \approx \frac{1/6}{2}\left[ f_0 + 2(f_1 + f_2 + f_3 + f_4 + f_5) + f_6 \right] = \frac{1}{12}\left[ f_0 + 2(f_1 + f_2 + f_3 + f_4 + f_5) + f_6 \right].
\]
import numpy as np
from math import *
import warnings
warnings.filterwarnings("ignore")
trap = lambda f, x, h: (h/2)*(f(x[0]) + sum(2*f(x[1:-1])) +f(x[-1]))
f = lambda x: (1 + x**2)**(-1)
print('Computed Inputs:')
## Computed Inputs:
x = np.linspace(0, 1, 7)
h = 1/6
ans = trap(f, x, h)
print('The trapezoidal method yields: {:.6f}'.format(ans))
2.2.2.1 Exercise
by splitting the interval into 2^k subintervals, for k = 1, 2, . . . , 10. Report each approximation juxtaposed with
the corresponding error.
Instead of approximating the area under the curve by trapezoids, we can also use rectangles. Approximating
with horizontal line segments rather than skew ones may seem less accurate; however, it is often more accurate.
In this approach, we construct a rectangle for every subinterval where the height equals f at the midpoint of
the subinterval.
Let us now derive the general formula for the midpoint method given n rectangles of equal width:
\[
\begin{aligned}
\int_a^b f(x)\,dx &= \int_{x_0}^{x_1} f(x)\,dx + \int_{x_1}^{x_2} f(x)\,dx + \dots + \int_{x_{n-1}}^{x_n} f(x)\,dx \\
&\approx h f\!\left(\tfrac{x_0 + x_1}{2}\right) + h f\!\left(\tfrac{x_1 + x_2}{2}\right) + \dots + h f\!\left(\tfrac{x_{n-1} + x_n}{2}\right) \qquad (2.10)\\
&\approx h \left[ f\!\left(\tfrac{x_0 + x_1}{2}\right) + f\!\left(\tfrac{x_1 + x_2}{2}\right) + \dots + f\!\left(\tfrac{x_{n-1} + x_n}{2}\right) \right]. \qquad (2.11)
\end{aligned}
\]
This can be rewritten as:
\[
\int_a^b f(x)\,dx \approx h \sum_{i=0}^{n-1} f(x_i), \qquad x_i = a + \frac{h}{2} + ih, \qquad (2.12)
\]
where the xᵢ are now the midpoints of the subintervals.
To compare the two methods, we will increase the number of panels used in each method, from n = 2 to
n = 1048576.
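The helper functions midpoint_method and trapezoidal used in the comparison loop below are not reproduced in the extracted notes; the following is a minimal sketch of how they could be implemented. The reference value printed after the table matches ∫₀² e^(−x²) dx, so that integrand is assumed as the test function g here:

import numpy as np

def midpoint_method(f, a, b, n):
    # Composite midpoint rule with n panels of width h
    h = (b - a)/n
    x = a + h/2 + h*np.arange(n)          # midpoints of each panel
    return h*np.sum(f(x))

def trapezoidal(f, a, b, n):
    # Composite trapezoidal rule with n panels of width h
    h = (b - a)/n
    x = np.linspace(a, b, n + 1)
    return (h/2)*(f(x[0]) + 2*np.sum(f(x[1:-1])) + f(x[-1]))

g = lambda x: np.exp(-x**2)               # test integrand (assumed)
a, b = 0, 2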
## n midpoint trapezoidal
for i in range(1, 21):
    n = 2**i
    m = midpoint_method(g, a, b, n)
    t = trapezoidal(g, a, b, n)
    print('{:7d} {:.16f} {:.16f}'.format(n, m, t))
## 2 0.8842000076332692 0.8770372606158094
## 4 0.8827889485397279 0.8806186341245393
## 8 0.8822686991994210 0.8817037913321336
## 16 0.8821288703366458 0.8819862452657772
## 32 0.8820933014203766 0.8820575578012112
## 64 0.8820843709743319 0.8820754296107942
## 128 0.8820821359746071 0.8820799002925637
## 256 0.8820815770754198 0.8820810181335849
## 512 0.8820814373412922 0.8820812976045025
## 1024 0.8820814024071774 0.8820813674728968
## 2048 0.8820813936736116 0.8820813849400392
## 4096 0.8820813914902204 0.8820813893068272
## 8192 0.8820813909443684 0.8820813903985197
## 16384 0.8820813908079066 0.8820813906714446
## 32768 0.8820813907737911 0.8820813907396778
## 65536 0.8820813907652575 0.8820813907567422
## 131072 0.8820813907631487 0.8820813907610036
## 262144 0.8820813907625702 0.8820813907620528
## 524288 0.8820813907624605 0.8820813907623183
## 1048576 0.8820813907624268 0.8820813907623890
print('True Solution to 16 decimals is:', 0.882081390762422)
The trapezoidal rule approximates the area under a curve by summing over the areas of trapezoids formed
by connecting successive points by straight lines. A more accurate estimate of the area can be achieved by
using polynomials of higher degree to connect the points. Simpson’s rule uses a second degree polynomial
(parabola) to connect adjacent points. Interpolating polynomials are convenient for this approximation. So
the interval [a, b] is subdivided into an even number of equal subintervals (n is even). Next we pass a parabolic
interpolant through three adjacent nodes. Therefore our approximation is:
\[
I = \frac{h}{3}\left[ f_{i-1} + 4 f_i + f_{i+1} \right]. \qquad (2.13)
\]
Summing the definite integrals over each subinterval [xi−1 , xi+1 ] for i = 1, 3, 5, · · · , n − 1 provides the
approximation:
\[
\int_a^b f(x)\,dx \approx \frac{h}{3}\left[ (f_0 + 4f_1 + f_2) + (f_2 + 4f_3 + f_4) + \dots + (f_{n-2} + 4f_{n-1} + f_n) \right]. \qquad (2.14)
\]
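A short sketch (not part of the original notes) of the composite Simpson's rule; the function name simpson is illustrative:

import numpy as np

def simpson(f, a, b, n):
    # Composite Simpson's rule; n must be even
    if n % 2 != 0:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a)/n
    x = np.linspace(a, b, n + 1)
    fx = f(x)
    return (h/3)*(fx[0] + 4*np.sum(fx[1:-1:2]) + 2*np.sum(fx[2:-1:2]) + fx[-1])

# Example: the integral from the trapezoidal example above
f = lambda x: 1/(1 + x**2)
print(simpson(f, 0, 1, 6))   # exact value is pi/4, approximately 0.7853981634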
2.2.4.1 Exercise
by splitting the interval into 2^k subintervals, for k = 1, 2, . . . , 10. Report each approximation juxtaposed with
the corresponding error.
Often when implementing numerical approximations we may assume certain asymptotic behaviours when
considering errors. For example, when examining experimental results for the problem \(\int_0^1 3t^2 e^{t^3}\,dt\), where n
is doubled in each run (n = 4, 8, 16) using the trapezoidal rule, the errors were 12%, 3% and 0.77% respectively.
This illustrates that the error was approximately reduced by a factor of 4 when n was doubled. Therefore,
the error converges to zero as n⁻² and we can say that the convergence rate is 2 (quadratic). Numerical
integration methods usually have an error that converges to zero as nᵖ for some p that depends on the method.
This implies that it does not matter what the actual approximation error is, since we know at what rate it is
being reduced. Therefore, running a method for two or more different n values allows us to check whether the
expected rate is indeed achieved.
The idea of a corresponding unit test is then to run the algorithm for some n values, compute the error (the
absolute value of the difference between the exact analytical result and the one produced by the numerical
method), and check that the error has approximately the correct asymptotic behaviour, i.e., that the error is
proportional to n⁻² in the case of the trapezoidal and midpoint methods.
More formally, assume that the error E depends on n according to:
\[
E = Cn^r,
\]
where C is an unknown constant and r is the convergence rate. Consider a set of experiments with various n,
i.e. n₀, n₁, . . . , n_q. We can compute the errors at each n, i.e. E₀, E₁, . . . , E_q. Therefore, for two consecutive
experiments, i and i − 1, we have the error model:
\[
E_i = Cn_i^r, \qquad (2.17)
\]
\[
E_{i-1} = Cn_{i-1}^r. \qquad (2.18)
\]
These are two equations in the two unknowns C and r. We eliminate C by dividing one equation by the other;
then solving for r gives:
\[
r_{i-1} = \frac{\ln(E_i/E_{i-1})}{\ln(n_i/n_{i-1})}. \qquad (2.19)
\]
We use a subscript i − 1 on r since the estimated value for r varies with i. Ideally, r_{i−1} approaches the
correct convergence rate as the number of intervals increases.
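A small sketch (not from the original notes) of such a test, applied to the trapezoidal rule on ∫₀¹ 3t² e^(t³) dt, whose exact value is e − 1. With this sign convention the estimated rates approach −2, i.e. second order convergence:

import numpy as np

def trapezoidal(f, a, b, n):
    x = np.linspace(a, b, n + 1)
    h = (b - a)/n
    return (h/2)*(f(x[0]) + 2*np.sum(f(x[1:-1])) + f(x[-1]))

f = lambda t: 3*t**2*np.exp(t**3)
exact = np.exp(1) - 1

ns = [2**k for k in range(1, 11)]
errors = [abs(trapezoidal(f, 0, 1, n) - exact) for n in ns]

# Observed rates r_{i-1} = ln(E_i/E_{i-1}) / ln(n_i/n_{i-1})
for i in range(1, len(ns)):
    r = np.log(errors[i]/errors[i - 1])/np.log(ns[i]/ns[i - 1])
    print('n = {:5d}, estimated rate = {:.4f}'.format(ns[i], r))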
2.2.6 Exercises
1. Since every point of measurement in the trapezoidal rule is used in two different subintervals, we must
evaluate the function we want to integrate at every point twice. Is this a true statement to make?
2. Apply the trapezoidal rule to approximate the integral:
\[
\int_0^1 x^2\,dx,
\]
and determine a value of h which guarantees that the absolute error is smaller than 10⁻¹⁰.
5. When using the trapezoidal rule and h is halved, some function values used with stepsize h/2 are the
same as those used when the stepsize was h. Derive a formula for the trapezoidal rule with step length
h/2 that allows one not to recompute the function values that were computed when the stepsize was h.
6. Is the Simpson’s Rule exact for polynomials of degree 3 or lower?
7. Compute an approximation for the integral:
\[
\int_0^{\pi/2} \frac{\sin(x)}{1 + x^2}\,dx,
\]
with Simpson's rule and 6 subintervals.
8. How many function evaluations does one need to calculate the integral:
\[
\int_0^1 \frac{dx}{1 + 2x},
\]
with the trapezoidal rule to ensure that the error is smaller than 10⁻¹⁰?
9. Repeat question 8 using the Simpson’s rule.
where,
\[
T(h) = \frac{h}{2}\left( f_0 + 2f_1 + 2f_2 + \dots + 2f_{n-1} + f_n \right), \qquad h = \frac{b-a}{n}.
\]
Consider two trapezoidal approximations with spacing 2h and h, where n is even:
\[
I = T(2h) + a_1 (2h)^2 + a_2 (2h)^4 + a_3 (2h)^6 + \dots \qquad (2.20)
\]
\[
I = T(h) + a_1 h^2 + a_2 h^4 + a_3 h^6 + \dots \qquad (2.21)
\]
If we subtract equation (2.20) from 4 times equation (2.21) we eliminate the leading error term (i.e. of O(h²))
and we get
\[
I = \frac{1}{3}\left(4T(h) - T(2h)\right) - 4a_2 h^4 - 20a_3 h^6 - \dots
\]
after dividing right through by 3. But:
\[
\begin{aligned}
\frac{1}{3}\left(4T(h) - T(2h)\right) &= \frac{h}{3}\left[ (2f_0 + 4f_1 + 4f_2 + \dots + 4f_{n-1} + 2f_n) - (f_0 + 2f_2 + 2f_4 + \dots + 2f_{n-2} + f_n) \right] \\
&= \frac{h}{3}\left( f_0 + 4f_1 + 2f_2 + 4f_3 + \dots + 2f_{n-2} + 4f_{n-1} + f_n \right) \\
&= S(h),
\end{aligned}
\]
so that
\[
I = S(h) + c_1 h^4 + c_2 h^6 + \dots \qquad (2.22)
\]
\[
I = S(h/2) + c_1 \left(\tfrac{h}{2}\right)^4 + c_2 \left(\tfrac{h}{2}\right)^6 + \dots \qquad (2.23)
\]
Eliminating the leading error term between these two expressions gives
\[
I = \frac{16 S(h/2) - S(h)}{15} + d_1 h^6 + \dots
\]
which is now more accurate, with an error of O(h⁶).
We now generalise the results for h_k = (b − a)/2^k, n = 2^k. Hence the trapezoidal rule for 2^k subintervals (i.e.
n is even) becomes
\[
T_{1,k} = \frac{h_k}{2}\left( f_0 + 2f_1 + 2f_2 + \dots + 2f_{2^k - 1} + f_{2^k} \right),
\]
\[
I = T_{1,k} + a_1 h_k^2 + a_2 h_k^4 + a_3 h_k^6 + \dots
\]
We define
\[
T_{2,k} = \frac{1}{3}\left( 4T_{1,k+1} - T_{1,k} \right), \qquad k = 1, 2, \dots
\]
which is Simpson's rule for h_k and hence has an error O(h_k^4).
In general, we define
\[
T_j^i = \frac{1}{4^j - 1}\left( 4^j T_{j-1}^i - T_{j-1}^{i-1} \right), \qquad j = 1, \dots, m, \quad i = 1, 2, \dots, n. \qquad (2.24)
\]
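A compact sketch (not from the original notes) of Romberg integration built on the composite trapezoidal rule; the function name romberg is illustrative:

import numpy as np

def romberg(f, a, b, m):
    # Build an (m+1) x (m+1) Romberg table; T[i, 0] is the trapezoidal
    # rule with 2**i subintervals, and each further column extrapolates.
    T = np.zeros((m + 1, m + 1))
    for i in range(m + 1):
        n = 2**i
        x = np.linspace(a, b, n + 1)
        h = (b - a)/n
        T[i, 0] = (h/2)*(f(x[0]) + 2*np.sum(f(x[1:-1])) + f(x[-1]))
    for j in range(1, m + 1):
        for i in range(j, m + 1):
            T[i, j] = (4**j*T[i, j - 1] - T[i - 1, j - 1])/(4**j - 1)
    return T

# Example from the next section: integral of exp(-x) over [0, 1]
f = lambda x: np.exp(-x)
print(romberg(f, 0, 1, 4))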
2.3.0.1 Example
Use Romberg integration to find the integral of f(x) = e^(−x) for x ∈ [0, 1]. Take the initial sub-interval as
h = (1 − 0)/2 = 0.5. Use 6 decimal places.
2.3.1 Exercises
• Use (a) the trapezoidal rule and (b) Simpson's rule to estimate I for the following:
  – (i) f(x) = 1/(1 + x²) over the interval [0, 1] for n = 4
  – (ii) f(x) = x e^(−x²) over the interval [0, 2] for n = 4
  Compare your numerical results with the analytical ones.
• Use Romberg's method to approximate the integral
\[
I = \int_0^1 \sqrt{1 - x^2}\,dx.
\]
• The period of a simple pendulum of length L is \(\tau = 4\sqrt{\frac{L}{g}}\, h(\theta_0)\), where g is the gravitational acceleration,
θ₀ represents the angular amplitude and:
\[
h(\theta_0) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - \sin^2(\theta_0/2)\sin^2\theta}}.
\]
Given a double integral over the rectangular domain [a, b] × [c, d]:
\[
\int_a^b \int_c^d f(x, y)\,dy\,dx.
\]
This looks slightly different than before, since we need to integrate in both x and y directions. Note when
integrating in the y direction we use ny for n, hy for h and index according to j. When integrating in the x
direction, we use hx , nx and i respectively.
So, each one-dimensional integral is approximated by the midpoint method:
\[
\int_a^b g(x)\,dx \approx h_x \sum_{i=0}^{n_x - 1} g(x_i), \qquad x_i = a + \frac{1}{2}h_x + ih_x.
\]
So finally, putting both approximations together, we get the composite midpoint method for the double integral:
\[
\begin{aligned}
\int_a^b \int_c^d f(x, y)\,dy\,dx &\approx h_x h_y \sum_{i=0}^{n_x - 1} \sum_{j=0}^{n_y - 1} f(x_i, y_j) \\
&= h_x h_y \sum_{i=0}^{n_x - 1} \sum_{j=0}^{n_y - 1} f\!\left(a + \frac{h_x}{2} + ih_x,\; c + \frac{h_y}{2} + jh_y\right). \qquad (2.25)
\end{aligned}
\]
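The implementation called in the example below is not shown in the extracted notes; here is a minimal sketch of what a midpoint rule for double integrals could look like. The integrand and limits defined at the end are placeholders for illustration only:

def midpoint_method_double(f, a, b, c, d, nx, ny):
    # Composite midpoint rule (2.25) over the rectangle [a, b] x [c, d]
    hx = (b - a)/nx
    hy = (d - c)/ny
    total = 0.0
    for i in range(nx):
        for j in range(ny):
            xi = a + hx/2 + i*hx
            yj = c + hy/2 + j*hy
            total += f(xi, yj)
    return hx*hy*total

# Illustrative integrand (not necessarily the one used in the example below)
f = lambda x, y: 2*x*y
a, b, c, d = 0, 2, 2, 3
print(midpoint_method_double(f, a, b, c, d, 5, 5))   # exact value is 10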
2.4.1.1 Example
import sympy
x, y = sympy.symbols('x y')
true_ans = sympy.integrate(f(x, y), (x, a, b), (y, c, d))
print('True analytical solution:', true_ans)
The idea used for double integrals can similarly be extended to three dimensions. Consider the triple integral:
\[
\int_a^b \int_c^d \int_e^f g(x, y, z)\,dz\,dy\,dx,
\]
which we wish to approximate via the midpoint rule. Utilising the same strategy as before, we split the
integral into one-dimensional integrals:
\[
p(x, y) = \int_e^f g(x, y, z)\,dz, \qquad
q(x) = \int_c^d p(x, y)\,dy, \qquad
\int_a^b \int_c^d \int_e^f g(x, y, z)\,dz\,dy\,dx = \int_a^b q(x)\,dx,
\]
where:
\[
z_k = e + \tfrac{1}{2}h_z + kh_z, \qquad y_j = c + \tfrac{1}{2}h_y + jh_y, \qquad x_i = a + \tfrac{1}{2}h_x + ih_x.
\]
So finally, starting with the formula for \(\int_a^b \int_c^d \int_e^f g(x, y, z)\,dz\,dy\,dx\) and combining the two previous formulas
we have:
\[
\int_a^b \int_c^d \int_e^f g(x, y, z)\,dz\,dy\,dx \approx
h_x h_y h_z \sum_{i=0}^{n_x - 1} \sum_{j=0}^{n_y - 1} \sum_{k=0}^{n_z - 1}
g\!\left(a + \tfrac{1}{2}h_x + ih_x,\; c + \tfrac{1}{2}h_y + jh_y,\; e + \tfrac{1}{2}h_z + kh_z\right). \qquad (2.26)
\]
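The function midpoint_method_triple called in the example below is not shown in the extracted notes; the following is a minimal sketch of one possible implementation of Equation (2.26) matching that call signature:

def midpoint_method_triple(g, a, b, c, d, e, f, nx, ny, nz):
    # Composite midpoint rule for a triple integral over [a,b] x [c,d] x [e,f]
    hx, hy, hz = (b - a)/nx, (d - c)/ny, (f - e)/nz
    total = 0.0
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                xi = a + hx/2 + i*hx
                yj = c + hy/2 + j*hy
                zk = e + hz/2 + k*hz
                total += g(xi, yj, zk)
    return hx*hy*hz*total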
2.4.2.1 Example
f1 = lambda x, y, z: 8*x*y*z
a = 2; b = 3; c = 1; d = 2; e = 0; f = 1; nx = 5; ny = 5; nz = 5
print(midpoint_method_triple(f1, a, b, c, d, e, f, nx, ny, nz))
# Check symbolic solution
## 15.000000000000009
import sympy
x, y, z = sympy.symbols('x y z')
true_ans = sympy.integrate(f1(x, y, z), (x, a, b), (y, c, d), (z, e, f))
print('True analytical answer:', true_ans)
Chapter 3
Numerical Solutions to Nonlinear Equations
Non-linear equations occur in many real world problems and are rarely solvable analytically. We frequently need
to solve an equation of the form
\[
f(x) = 0
\]
in many applications in science and engineering. The values of x that make f(x) = 0 are called the roots (or
the zeros) of this equation.
This type of problem also includes determining the points of intersection of curves. If f (x) and g(x) represent
equations of two curves, the intersection points correspond to the roots of the function F (x) = f (x) − g(x) = 0.
We shall examine two types of iterative methods for determining the roots of the equation f (x) = 0, namely:
• Bracketing methods, also known as interval methods.
• Fixed point methods
To obtain these intervals or initial approximations graphical methods are usually used.
These methods require an initial interval which is guaranteed to contain a root. The width of this interval
(bracket) is reduced iteratively until it encloses the root to a desired accuracy.
The bisection method is an incremental search method in which the interval is always divided in half.
Intermediate value theorem:
If f (x) is real and continuous in an interval [a, b] and f (a)f (b) < 0, then there exists a point c ∈ (a, b) such
that f (c) = 0.
\[
c = \frac{1}{2}(a + b),
\]
then:
• If f (a)f (c) < 0 then f (a) and f (c) have opposite signs and so the root must lie in the smaller interval
[a, c].
• If f (a)f (c) > 0 then f (a) and f (c) have the same signs and so f (b) and f (c) must have opposite signs,
so the root lies in [c, b].
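These two cases form the core of the bisection loop. A minimal sketch (not the original code, and separate from the bisection_plot helper used in the example below):

def bisection(f, a, b, tol):
    # Repeatedly halve the bracket [a, b] until it is shorter than tol
    if f(a)*f(b) >= 0:
        raise ValueError('f(a) and f(b) must have opposite signs')
    while (b - a) > tol:
        c = (a + b)/2
        if f(a)*f(c) < 0:
            b = c            # root lies in [a, c]
        else:
            a = c            # root lies in [c, b]
    return (a + b)/2

f = lambda x: x**2 - 1
print(bisection(f, 0, 3, 1e-3))   # root of x^2 - 1 in [0, 3] is 1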
3.1.2.1 Example
Perform two iterations of the bisection method on the function f (x) = x2 − 1, using [0, 3] as your initial
interval.
Answer: The root lies at 1, but after two iterations, the interval will be [0.75, 1.5].
f = lambda x: x**2 - 1
x = np.arange(-1, 3, 0.1)
y = f(x)
a = 0
b = 3
tol = 10**-3
val = bisection_plot(f, a, b, tol)
Stopping Criteria:
We use a stopping criterion of
\[
|b_n - a_n| < \epsilon.
\]
We have
\[
\begin{aligned}
|b_1 - a_1| &= |b - a| \\
|b_2 - a_2| &= \tfrac{1}{2}|b_1 - a_1| \\
&\;\;\vdots \\
|b_n - a_n| &= \tfrac{1}{2}|b_{n-1} - a_{n-1}| = \tfrac{1}{2^2}|b_{n-2} - a_{n-2}| = \dots = \tfrac{1}{2^{n-1}}|b_1 - a_1|.
\end{aligned}
\]
Setting
\[
\frac{1}{2^{n-1}}|b_1 - a_1| \approx \epsilon, \qquad \text{or} \qquad 2^n = \frac{2|b_1 - a_1|}{\epsilon},
\]
gives
\[
n = \log\!\left( \frac{2|b_1 - a_1|}{\epsilon} \right) \Big/ \log 2. \qquad (3.1)
\]
3.1.2.2 Example
Find the root of f(x) = sin(x) − 0.5 between 0 and 1. Iterate until the interval is of length 1/2³.
Definition 3.1 (Bisection Method Theorem). If the bisection algorithm is applied to a continuous function
f on an interval [a, b], where f (a)f (b) < 0, then, after n steps, an approximate root will have been computed
with error at most (b − a)/2n+1 .
The bisection method is attractive because of its simplicity and guaranteed convergence. Its disadvantage is
that it is, in general, extremely slow.
The Regula Falsi algorithm is a method of finding roots based on linear interpolation. Its convergence is linear,
but it is usually faster than bisection. On each iteration a line is drawn between the endpoints (a, f(a)) and
(b, f(b)), and the point where this line crosses the x-axis is taken as the point c.
f = lambda x: x**2 - 1
x = np.arange(-1, 3, 0.1)
y = f(x)
a = 0
b = 3
tol = 10**-3
val = false_position_plot(f, a, b, tol)
The equation of the line through (a, f(a)) and (b, f(b)) is
\[
y = f(a) + \frac{x - a}{b - a}\left(f(b) - f(a)\right).
\]
We require the point c where y = 0, i.e.
\[
f(a) + \frac{c - a}{b - a}\left(f(b) - f(a)\right) = 0,
\]
from which we solve for c to get:
\[
c = \frac{a f(b) - b f(a)}{f(b) - f(a)}. \qquad (3.2)
\]
The sign of f(c) determines which side of the interval does not contain the root; that side is discarded to
give a new, smaller interval containing the root. The procedure is continued until the interval is sufficiently
small.
3.1.3.1 Example
Perform two iterations of the false position method on the function f (x) = x2 − 1, using [0, 3] as your initial
interval. Compare your answers to those of the bisection method.
Answer: False position, in other words, performs a linear fit onto the function, and then directly solves that
fit.
With Bisection we obtain the following,
a c b
0 1.5 1.5
0.75 0.75 1.5
0.75 1.125 1.125
0.9375 0.9375 1.125
0.9375 1.03125 1.03125
0.984375 0.984375 1.03125
Stopping criteria
The false position method often approaches the root from one side only, so we require a different stopping
criterion from that of the bisection method. We usually choose:
\[
|c - c^*| < \epsilon,
\]
where c* is the approximation from the previous iteration.
• Normally faster than Bisection Method. Can decrease the interval by more than half at each iteration.
• Superlinear convergence rate. Linear convergence rate in the worst case.
• Usually approaches the root from one side.
3.1.3.2 Exercise
Use the bisection method and the false position method to find the root of f (x) = x2 − x − 2 that lies in the
interval [1, 4].
For these methods we start with an initial approximation to the root and produce a sequence of approximations,
each closer to the root than its predecessor.
This is one of the most widely used of all root-finding formulae. It works by taking as the new approximation
the point of intersection of the tangent to the curve y = f (x) at xi with the x–axis. Thus we seek to solve
the equation f (x) = 0, where f is assumed to have a continuous derivative f 0 .
Newton developed this method for solving equations while wanting to find the root of the
equation x³ − 2x − 5 = 0. Although he demonstrated the method only for polynomials, it is clear
he realised its broader applications.
Newton's method can be derived in several ways; we choose to do it using Taylor series.
Let x_{i+1} = x_i + h and obtain a Taylor expansion of f(x_{i+1}) about x_i:
\[
f(x_{i+1}) = f(x_i) + h f'(x_i) + \frac{h^2}{2} f''(x_i) + \dots \qquad (3.3)
\]
Setting f(x_{i+1}) = 0 and truncating after the linear term gives
\[
h = -\frac{f(x_i)}{f'(x_i)}, \qquad \text{provided } f'(x_i) \neq 0.
\]
Therefore
\[
x_{i+1} = x_i + h = x_i - \frac{f(x_i)}{f'(x_i)}, \qquad i = 0, 1, 2, \dots \qquad (3.4)
\]
which is called Newton's (or the Newton-Raphson) iterative formula.
• Requires the derivative of the function.
• Has quadratic convergence rate. Linear in worst case.
• May not converge if too far from the root.
• Could get caught in basins of attraction with certain sinusoidal functions.
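A minimal sketch (not from the original notes) of the Newton-Raphson iteration (3.4), applied to Newton's own example x³ − 2x − 5 = 0; the function name newton and its tolerances are illustrative:

def newton(f, df, x0, tol=1e-10, max_iter=50):
    # Newton-Raphson iteration: x_{i+1} = x_i - f(x_i)/df(x_i)
    x = x0
    for _ in range(max_iter):
        step = f(x)/df(x)
        x -= step
        if abs(step) < tol:
            return x
    return x

f  = lambda x: x**3 - 2*x - 5
df = lambda x: 3*x**2 - 2
print(newton(f, df, 2.0))   # converges to about 2.0945514815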
Newton's method may also be used to find roots of a system of two or more non-linear equations.
Consider a system of two equations:
\[
f(x, y) = 0, \qquad g(x, y) = 0. \qquad (3.5)
\]
Expanding about (x, y):
\[
f(x + h, y + k) = f(x, y) + h\frac{\partial f}{\partial x} + k\frac{\partial f}{\partial y} + \text{terms in } h^2, k^2, hk, \qquad (3.6)
\]
\[
g(x + h, y + k) = g(x, y) + h\frac{\partial g}{\partial x} + k\frac{\partial g}{\partial y} + \text{terms in } h^2, k^2, hk, \qquad (3.7)
\]
and if we keep only the first order terms, we are looking for a pair (h, k) such that:
\[
f(x + h, y + k) = 0 \approx f(x, y) + h\frac{\partial f}{\partial x} + k\frac{\partial f}{\partial y}, \qquad (3.8)
\]
\[
g(x + h, y + k) = 0 \approx g(x, y) + h\frac{\partial g}{\partial x} + k\frac{\partial g}{\partial y}. \qquad (3.9)
\]
In matrix form:
\[
\begin{bmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\[4pt] \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{bmatrix}
\begin{bmatrix} h \\ k \end{bmatrix}
= -\begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}. \qquad (3.10)
\]
The 2 × 2 matrix is called the Jacobian matrix (or Jacobian) and is sometimes denoted as:
\[
J(x, y) = \begin{bmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\[4pt] \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{bmatrix}.
\]
The general (n × n) Jacobian for a system of n equations in n variables (x₁, x₂, . . . , xₙ) is immediate:
\[
J = \begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_n}{\partial x_1} & \frac{\partial f_n}{\partial x_2} & \cdots & \frac{\partial f_n}{\partial x_n}
\end{bmatrix}.
\]
If we define x_{i+1} = x_i + h and y_{i+1} = y_i + k, then Equation (3.10) suggests the iteration formula:
\[
\begin{bmatrix} x_{i+1} \\ y_{i+1} \end{bmatrix}
= \begin{bmatrix} x_i \\ y_i \end{bmatrix}
- J^{-1}(x_i, y_i)
\begin{bmatrix} f(x_i, y_i) \\ g(x_i, y_i) \end{bmatrix}.
\]
Starting with an initial guess (x₀, y₀), and under certain conditions, it is possible to show that this iteration
process converges to a root of the system.
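A minimal sketch (not from the original notes) of the two-dimensional Newton iteration, applied to the system appearing in the exercise below; the function name newton_system and the starting guess are illustrative, and a linear solve is used in place of forming the inverse Jacobian:

import numpy as np

def newton_system(F, J, x0, tol=1e-8, max_iter=50):
    # Newton's method for a system: x_{i+1} = x_i - J(x_i)^{-1} F(x_i)
    x = np.array(x0, dtype=float)
    for i in range(1, max_iter + 1):
        step = np.linalg.solve(J(x), F(x))   # solve J*step = F instead of inverting J
        x -= step
        if np.linalg.norm(step) < tol:
            return x, i
    return x, max_iter

# System from the exercise below: f = x^3 - 3xy^2 - 1, g = 3x^2 y - y^3
F = lambda v: np.array([v[0]**3 - 3*v[0]*v[1]**2 - 1,
                        3*v[0]**2*v[1] - v[1]**3])
J = lambda v: np.array([[3*v[0]**2 - 3*v[1]**2, -6*v[0]*v[1]],
                        [6*v[0]*v[1],            3*v[0]**2 - 3*v[1]**2]])

root, iters = newton_system(F, J, [-1.0, 1.0])
print(root, iters)   # converges to about (-0.5, 0.8660254)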
3.2.0.1 Exercise
f(x, y) = x³ − 3xy² − 1
g(x, y) = 3x²y − y³
## [-0.5 0.86602539]
print('It took', o2, 'iterations')
## It took 4 iterations
3.2.1 Exercises
1. Show that the equation x = cos x has a solution in the interval [0, π/2]. Use the bisection method to
reduce the interval containing the solution to a length of 0.2.
2. Use the bisection method to solve
e^(−x) = ln x,   a = 1,   b = 2
3. Apply (i) the bisection method (ii) False Position and (iii) Newton’s method to solve each of the
following equations to, at least, 6D.
8. Explain the meaning of the phrase: A convergent numerical method is qualitatively just as good as an
analytical solution
9. Motivate the False-Position Method, why is it generally preferable to the Bisection Method?
10. Use Newton’s method to find a solution to the following system:
\[
v - u^3 = 0, \qquad (3.11)
\]
\[
u^2 + v^2 - 1 = 0, \qquad (3.12)
\]
given a starting value of (1, 2). Plot the curves along with successive approximations to determine if it
is indeed true that the approximations approach the intercept.
Chapter 4
Eigenvalues and Eigenvectors
For an n × n matrix A, the eigenvalue problem is to find scalars λ and nonzero vectors x such that
\[
Ax = \lambda x, \qquad (4.1)
\]
or equivalently
\[
(A - \lambda I)x = 0. \qquad (4.2)
\]
The set of homogeneous equations (4.2) admits a non-trivial solution if and only if
\[
\det(A - \lambda I) = |A - \lambda I| = 0.
\]
The determinant |A − λI| is an nth degree polynomial in λ and is called the characteristic polynomial of
A. Thus one way to find the eigenvalues of A is to obtain its characteristic polynomial and then find the n
zeros of this polynomial.
Although the characteristic polynomial is easy to work with, for large values of n finding the roots of the polynomial
equation is difficult and time consuming.
If we are interested in the eigenvalue of largest magnitude, then the power method becomes a popular
approach.
Solution: Since |−7| > |5| > |2| ≥ |−2| > 0, A has λ₁ = −7 as its dominant eigenvalue.
Not every matrix has a dominant eigenvalue. For instance the matrix
\[
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\]
has eigenvalues λ₁ = 1 and λ₂ = −1 and therefore has no dominant eigenvalue.
The eigenvectors v1 , v2 , · · · , vn form a basis of Rn so that any vector x can be written as a linear combination
of them:
x = c1 v1 + c2 v2 + · · · + cn vn .
Derive the following equations:
\[
\begin{aligned}
Ax &= c_1 A v_1 + c_2 A v_2 + \dots + c_n A v_n = c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \dots + c_n \lambda_n v_n, \\
A^2 x &= c_1 \lambda_1 A v_1 + c_2 \lambda_2 A v_2 + \dots + c_n \lambda_n A v_n = c_1 \lambda_1^2 v_1 + c_2 \lambda_2^2 v_2 + \dots + c_n \lambda_n^2 v_n.
\end{aligned}
\]
Continuing in this way, A^k x = c₁λ₁^k v₁ + · · · + cₙλₙ^k vₙ, and since |λᵢ/λ₁| < 1 for i > 1, repeated multiplication
by A aligns the vector with the dominant eigenvector v₁.
4.1.0.1 Example
Using the power method, find the dominant eigenvalue and eigenvector of the following matrix:
\[
A = \begin{bmatrix} 1 & 3 \\ 2 & 2 \end{bmatrix}.
\]
## {4: 1, -1: 1}
Remark: If x is an eigenvector corresponding to some given eigenvalue, then so is kx for any k. Thus, only
the direction of a vector matters. We can choose its length by changing k. Therefore we will seek eigenvectors
of length unity, called the normalised eigenvectors.
The power method can be carried out by hand with the following steps:
Step 1:
Let x₀ be an initial guess of the eigenvector corresponding to λ₁. Normalise it and call the result y₀:
\[
y_0 = \frac{x_0}{\|x_0\|}.
\]
Step 2:
Multiply y₀ once by A to get a new vector. Normalise the result and call it y₁:
\[
x_1 = A y_0, \qquad y_1 = \frac{x_1}{\|x_1\|}.
\]
Step 3:
Multiply y₁ once by A to get a new vector. Normalise the result and call it y₂:
\[
x_2 = A y_1, \qquad y_2 = \frac{x_2}{\|x_2\|}.
\]
Repeat m times. If m is sufficiently large, y_{m−1} should be approximately equal to y_m, and then we stop.
Thus y_m is approximately the normalised eigenvector of A corresponding to the dominant eigenvalue λ₁,
and hence:
\[
A y_{m-1} \approx A y_m = \lambda_1 y_m,
\]
which we can use to read off the eigenvalue.
The power method will converge if the initial estimate of the eigenvector has a nonzero component in
the direction of the eigenvector corresponding to the dominant eigenvalue. For this reason, starting vector
with all components equal to 1 is usually used in the computations of the power method.
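A minimal sketch (not from the original notes) of these steps in Python, applied to the matrix from the example above; the Rayleigh quotient of the final unit vector is used to read off the eigenvalue:

import numpy as np

def power_method(A, x0, m=20):
    # Repeatedly multiply by A and normalise to approximate the dominant eigenpair
    y = x0/np.linalg.norm(x0)
    for _ in range(m):
        x = A @ y
        y = x/np.linalg.norm(x)
    lam = y @ (A @ y)          # Rayleigh quotient of the unit eigenvector estimate
    return lam, y

A = np.array([[1.0, 3.0], [2.0, 2.0]])
lam, v = power_method(A, np.array([1.0, 1.0]))
print(lam, v)                  # dominant eigenvalue 4 and its eigenvector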
Exercise:
Use the power method to estimate the dominant eigenpair of the matrix:
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix}.
\]
To find the smallest eigenpair we can use the Inverse Power Method. If the Power Method is applied to
the inverse of the matrix, then the smallest eigenvalue can be found.
We should try to avoid computing the inverse of A as much as possible, so we may rewrite the application of
the Power Method to A⁻¹ from:
\[
x_{j+1} = A^{-1} x_j,
\]
to:
\[
A x_{j+1} = x_j,
\]
and solve for x_{j+1} using Gaussian elimination.
You will cover many further methods to compute eigenvalues and eigenvectors in the third year numerical
methods course. Therefore, we will only consider the above two for this course.
4.2.1 Exercises
1. Use the power method to calculate approximations to the dominant eigenpair (if a dominant eigenpair
exists). Perform 5 iterations in each case. Use x₀ = [1 1]ᵀ. Check your answers analytically and with
Python.
\[
\text{(a)}\;\; \begin{bmatrix} 1 & 5 \\ 5 & 6 \end{bmatrix} \qquad\qquad \text{(b)}\;\; \begin{bmatrix} 2 & 3 \\ -2 & 1 \end{bmatrix}
\]
2. Use the inverse power method to find the smallest eigenpair of the previous two matrices.
3. Find the characteristic polynomial and the eigenvalues and eigenvectors of the following matrices:
\[
\begin{bmatrix} 3.5 & -1.5 \\ -1.5 & 3.5 \end{bmatrix} \qquad\qquad \begin{bmatrix} 3.5 & -1.5 \\ -1.5 & 3.5 \end{bmatrix}
\]
Chapter 5
Interpolation
Typically, from experimental observations or statistical measurements we may have the value of a function f
at a set of points x0 , x1 , · · · , xn (x0 < x1 < · · · < xn ). However, we do not have an analytic expression for f
which would allow us to calculate the value of f at an arbitrary point.
You will frequently have occasion to estimate intermediate values between precise data points when dealing
with real world data sets. The most common method used for this purpose is polynomial interpolation.
Polynomial functions which fit the known data are commonly used to allow us to approximate these arbitrary
points. If we use this function to approximate f for some point x0 < x < xn then the process is called
interpolation. If we use it to approximate f for x < x0 or x > xn then it is called extrapolation.
Polynomials are used because:
• Computers can handle them easily. Which makes for fast and efficient programming.
• The integration and differentiation of polynomials is straightforward computationally.
• Polynomials are smooth functions - i.e. not only is a polynomial a continuous function, but all the
derivatives exist and are themselves continuous.
• Polynomials uniformly approximate continuous functions. This means that, given any function f
which is continuous on some interval [a, b] and any positive number ε (no matter how small), we can
find a polynomial P such that
\[
|f(x) - P(x)| < \epsilon, \qquad x \in [a, b].
\]
This result is known as the Weierstrass Approximation Theorem.
For n + 1 data points, there is one and only one polynomial of order n that passes through all the points. For
example, there is only one straight line (that is, a first-order polynomial) that connects two points. Similarly,
only one parabola connects a set of three points. Polynomial interpolation consists of determining the unique
nth-order polynomial that fits n + 1 data points. This polynomial then provides a formula to compute
intermediate values.
Recall that a polynomial of degree n has the form
\[
P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0,
\]
where n is a nonnegative integer and a0 , ..., an are real constants. One reason for their importance is that
they uniformly approximate continuous functions. By this we mean that given any function, defined and
continuous on a closed and bounded interval, there exists a polynomial that is as “close” to the given function
as desired. This result is expressed precisely in the Weierstrass Approximation Theorem.
Theorem 5.1 (Weierstrass Approximation Theorem). Suppose that f is defined and continuous on [a, b].
For each ε > 0, there exists a polynomial P(x) with the property that
\[
|f(x) - P(x)| < \epsilon, \qquad \text{for all } x \in [a, b].
\]
Note: Karl Weierstrass (1815-1897) is often referred to as the father of modern analysis
because of his insistence on rigour in the demonstration of mathematical results. He was
instrumental in developing tests for convergence of series, and determining ways to rigorously
define irrational numbers. He was the first to demonstrate that a function could be everywhere
continuous but nowhere differentiable, a result that shocked some of his contemporaries.
a0 + a1 x0 = f (x0 ) (5.1)
a0 + a1 x1 = f (x1 ) (5.2)
5.2.0.1 Example
Fit a straight line through the points (1, ln 1) and (6, ln 6) for f(x) = ln x, and use it to estimate ln 2.
Solution:
\[
P_1(2) = \ln 1 + \frac{\ln 6 - \ln 1}{6 - 1}(2 - 1) = 0.3583519.
\]
Calculator value ln 2 = 0.6931472.
In this case the error is large because for one the interval between the data points is large and secondly we
are linearly approximating a non-linear function.
5.3.0.1 Example
Fit a second degree polynomial that goes through the points x0 = 1, x1 = 4 and x2 = 6 for f (x) = ln x. Use
this polynomial to approximate ln 2.
Solution:
Polynomial,
P2 (x) = 0 + 0.46209813(x − 1) − 0.051873116(x − 1)(x − 4)
Estimate for ln 2, put x = 2 in P2 (x)
P2 (2) = 0 + 0.46209813(2 − 1) − 0.051873116(2 − 1)(2 − 4) = 0.56584436
This is a more accurate result than that obtained using linear interpolation. We now have a relative error of
18.4%. Thus, the curvature introduced by the quadratic formula improves the interpolation compared
with the result obtained using straight lines.
It is easy to verify that P (x0 ) = y0 and P (x1 ) = y1 . Thus the polynomial agrees with the functional values
at the two stipulated points. We also note the following about the quotients L0 (x) and L1 (x). When
x = x0 , L0 (x0 ) = 1 and L1 (x0 ) = 0. When x = x1 , L0 (x1 ) = 0 and L1 (x1 ) = 1. Thus we need to construct
the quotients L0 (x) and L1 (x) to determine the polynomial.
In general, to construct a polynomial of degree at most n that passes through the n + 1 points
(x₀, f(x₀)), (x₁, f(x₁)), . . . , (xₙ, f(xₙ)), we need to construct, for k = 0, 1, . . . , n, a quotient L_{n,k}(x) with the
property that L_{n,k}(xᵢ) = 0 when i ≠ k and L_{n,k}(x_k) = 1. To satisfy L_{n,k}(xᵢ) = 0 for each i ≠ k requires that
the numerator of L_{n,k} contain the term
\[
(x - x_0)(x - x_1)\dots(x - x_{k-1})(x - x_{k+1})\dots(x - x_n).
\]
To satisfy L_{n,k}(x_k) = 1, the denominator of L_{n,k} must equal this numerator evaluated at x = x_k. Thus:
\[
L_{n,k}(x) = \frac{(x - x_0)\dots(x - x_{k-1})(x - x_{k+1})\dots(x - x_n)}{(x_k - x_0)\dots(x_k - x_{k-1})(x_k - x_{k+1})\dots(x_k - x_n)} = \prod_{i=0,\, i\neq k}^{n} \frac{(x - x_i)}{(x_k - x_i)}.
\]
The interpolating polynomial is then
\[
P(x) = L_{n,0}(x) f(x_0) + L_{n,1}(x) f(x_1) + \dots + L_{n,n}(x) f(x_n). \qquad (5.7)
\]
If there is no confusion about the degree of the required polynomial we shall simply use L_k instead of L_{n,k}.
Error in Lagrange polynomial:
The error in the approximation by the Lagrange interpolating polynomial can be estimated, if f(x) is known,
as:
\[
E(x) = \frac{f^{(n+1)}(\xi(x))}{(n+1)!} \prod_{i=0}^{n} (x - x_i), \qquad (5.8)
\]
for some ξ(x) ∈ (a, b), where a ≤ x₀ ≤ x₁ ≤ . . . ≤ xₙ ≤ b, assuming f⁽ⁿ⁺¹⁾(x) is continuous on [a, b].
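A small sketch (not from the original notes) of evaluating the Lagrange form directly; it reproduces the first and second order estimates of ln 2 computed in the worked example below:

import numpy as np

def lagrange_interpolate(xs, ys, x):
    # Evaluate the Lagrange interpolating polynomial through (xs, ys) at x
    total = 0.0
    n = len(xs)
    for k in range(n):
        # Build the cardinal basis polynomial L_{n,k}(x)
        Lk = 1.0
        for i in range(n):
            if i != k:
                Lk *= (x - xs[i])/(xs[k] - xs[i])
        total += Lk*ys[k]
    return total

xs = [1, 4, 6]
ys = list(np.log(xs))
print(lagrange_interpolate(xs[:2], ys[:2], 2))  # first order: about 0.4620981
print(lagrange_interpolate(xs, ys, 2))          # second order: about 0.5658444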
5.4.0.1 Example
Use a Lagrange interpolating polynomial of the first and second order to evaluate ln(x) on the basis of the
given data points and estimate the value at x = 2
\[
L_i(x) = \prod_{k=0,\, k \neq i}^{n} \frac{x - x_k}{x_i - x_k}
\]
x₀ = 1; f(x₀) = 0
x₁ = 4; f(x₁) = 1.386294
x₂ = 6; f(x₂) = 1.791760
First Order:
We have the following equation for the first order Lagrange polynomial,
\[
P_1(x) = \frac{x - x_1}{x_0 - x_1} f(x_0) + \frac{x - x_0}{x_1 - x_0} f(x_1).
\]
Therefore we obtain,
\[
P_1(2) = \frac{2 - 4}{1 - 4}(0) + \frac{2 - 1}{4 - 1}(1.386294) = 0.4620981.
\]
Second Order:
We have the following equation for the second order Lagrange polynomial,
\[
P_2(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} f(x_0) + \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} f(x_1) + \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} f(x_2).
\]
Therefore we obtain,
\[
P_2(2) = 0.565844.
\]
5.4.0.2 Example
Use the following data to approximate f (1.5) using the Lagrange interpolating polynomial for n = 1, 2, and 3.
which gives,
P (1.5) = 0.508939.
We now fit an nth degree interpolating polynomial to the n + 1 data points (xᵢ, f(xᵢ)), i = 0, 1, · · · , n in the
form:
\[
P_n(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)(x - x_1) + \dots + a_n(x - x_0)(x - x_1)\cdots(x - x_{n-1}).
\]
Since the polynomial must pass through the points (xᵢ, fᵢ) we have:
• At x = x₀: Pₙ(x₀) = f₀ = a₀ = f[x₀].
• At x = x₁: Pₙ(x₁) = f₁ = f[x₀] + a₁(x₁ − x₀) = f[x₁], so
\[
a_1 = \frac{f[x_1] - f[x_0]}{x_1 - x_0} = f[x_0, x_1].
\]
• At x = x₂:
\[
P_n(x_2) = f_2 = f[x_2] = f[x_0] + f[x_0, x_1](x_2 - x_0) + a_2 (x_2 - x_0)(x_2 - x_1),
\]
and therefore:
\[
a_2 = \frac{f[x_2] - f[x_0] - f[x_0, x_1](x_2 - x_0)}{(x_2 - x_0)(x_2 - x_1)}.
\]
With some algebraic manipulation it can be shown that:
\[
a_2 = \frac{f[x_1, x_2] - f[x_0, x_1]}{x_2 - x_0} = f[x_0, x_1, x_2].
\]
In general:
\[
a_k = f[x_0, x_1, \dots, x_k],
\]
so that:
\[
P_n(x) = f[x_0] + \sum_{k=1}^{n} f[x_0, \dots, x_k](x - x_0)\cdots(x - x_{k-1})
       = f[x_0] + \sum_{k=1}^{n} f[x_0, \dots, x_k] \prod_{i=0}^{k-1} (x - x_i), \qquad (5.9)
\]
called Newton's divided difference interpolating polynomial. All divided differences are calculated
by a similar process and the results are usually tabulated in a divided difference table:
x_i    f[x_i]    f[x_i, x_{i+1}]    f[x_i, ..., x_{i+2}]    f[x_i, ..., x_{i+3}]    f[x_i, ..., x_{i+4}]
x_0    f[x_0]
                 f[x_0, x_1]
x_1    f[x_1]                       f[x_0, x_1, x_2]
                 f[x_1, x_2]                                f[x_0, x_1, x_2, x_3]
x_2    f[x_2]                       f[x_1, x_2, x_3]                                f[x_0, x_1, x_2, x_3, x_4]
                 f[x_2, x_3]                                f[x_1, x_2, x_3, x_4]
x_3    f[x_3]                       f[x_2, x_3, x_4]
                 f[x_3, x_4]
x_4    f[x_4]
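A minimal sketch (not from the original notes) of building the divided difference table and evaluating the Newton form; the data are taken from the worked example below:

import numpy as np

def divided_differences(xs, fs):
    # Return the divided difference table; row 0 holds the coefficients
    # f[x_0], f[x_0, x_1], ..., f[x_0, ..., x_n] of P_n(x)
    n = len(xs)
    table = np.zeros((n, n))
    table[:, 0] = fs
    for j in range(1, n):
        for i in range(n - j):
            table[i, j] = (table[i + 1, j - 1] - table[i, j - 1])/(xs[i + j] - xs[i])
    return table

def newton_eval(xs, coeffs, x):
    # Evaluate the Newton form with nested (Horner-like) multiplication
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result*(x - xs[k]) + coeffs[k]
    return result

xs = [-4, -1, 0, 2, 5]
fs = [1245, 33, 5, 9, 1335]
table = divided_differences(xs, fs)
print(newton_eval(xs, table[0], 3))   # evaluate P_4 at x = 3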
5.5.0.1 Exercise
Use a third degree polynomial passing through the points (1, ln 1), (4, ln 4), (5, ln 5) and (6, ln 6) to estimate
ln 2. (Ans: P3 (2) = 0.62876869).
5.5.0.2 Example
Find a polynomial satisfied by (−4, 1245), (−1, 33), (0, 5), (2, 9), (5, 1335).
Solution:
x_i    f(x_i)    1st       2nd    3rd    4th
-4     1245
                 -404
-1     33                  94
                 -28              -14
 0     5                   10             3
                  2               13
 2     9                   88
                 442
 5     1335
Hence,
\[
\begin{aligned}
P_4(x) &= 1245 - 404(x + 4) + 94(x + 4)(x + 1) - 14(x + 4)(x + 1)x + 3(x + 4)(x + 1)x(x - 2) \qquad (5.10)\\
       &= 3x^4 - 5x^3 + 6x^2 - 14x + 5.
\end{aligned}
\]
Note: If an extra data point (x, f(x)) is added, we only need to add an additional term to the Pₙ(x) already
found.
In general, if Pₙ(x) is the interpolating polynomial through the (n + 1) points (xᵢ, fᵢ), i = 0, 1, · · · , n, then
Newton's divided difference formula gives P_{n+1} through these points plus one more point (x_{n+1}, f_{n+1}) as:
\[
P_{n+1}(x) = P_n(x) + f[x_0, x_1, \dots, x_n, x_{n+1}] \prod_{i=0}^{n} (x - x_i). \qquad (5.11)
\]
Let P_{n+1}(x) be the (n+1)th degree polynomial which fits y = f(x) at the n + 2 points
(x₀, f(x₀)), (x₁, f(x₁)), · · · , (xₙ, f(xₙ)), (x, f(x)). The last point is a general point. Then:
\[
P_{n+1}(x) = P_n(x) + f[x_0, x_1, \dots, x_n, x] \prod_{i=0}^{n} (x - x_i).
\]
Remarks: For n = 0,
\[
f[x_0, x] = \frac{f(x) - f(x_0)}{x - x_0}.
\]
We have:
• (Mean value theorem) \(f[x_0, x] = \frac{f(x) - f(x_0)}{x - x_0} = f'(\xi)\), for some ξ ∈ [x₀, x].
• (Definition of a derivative) \(\lim_{x \to x_0} f[x_0, x] = f'(x_0)\).
In general, it can be shown that
\[
f[x_0, x_1, \dots, x_n] = \frac{1}{n!} f^{(n)}(\xi), \qquad \xi \in [x_0, x_n],
\]
and hence:
\[
f[x_0, x_1, \dots, x_n, x] = \frac{1}{(n+1)!} f^{(n+1)}(\xi), \qquad \xi \in [x_0, x]. \qquad (5.12)
\]
The error is then:
\[
\epsilon_n(x) = f[x_0, x_1, \dots, x_n, x] \prod_{i=0}^{n} (x - x_i)
= \frac{1}{(n+1)!} f^{(n+1)}(\xi) \prod_{i=0}^{n} (x - x_i), \qquad \xi \in [x_0, x]. \qquad (5.13)
\]
\[
s_i(x) = f_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3. \qquad (5.16)
\]
\[
\begin{bmatrix}
1 & 0 & & & & \\
h_1 & 2(h_1 + h_2) & h_2 & & & \\
 & h_2 & 2(h_2 + h_3) & h_3 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & h_{n-1} & 2(h_{n-1} + h_n) & h_n \\
 & & & & 0 & 1
\end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \\ c_n \\ c_{n+1} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 3(f[x_3, x_2] - f[x_2, x_1]) \\ 3(f[x_4, x_3] - f[x_3, x_2]) \\ \vdots \\ 3(f[x_{n+1}, x_n] - f[x_n, x_{n-1}]) \\ 0 \end{bmatrix}.
\]
5.6.0.1 Example
Consider the table below. Fit cubic splines to the data and utilize the results to estimate the value at x = 5.
i xi fi
1 3 2.5
2 4.5 1
3 7 2.5
4 9 0.5
Solution:
The first step is to generate the set of simultaneous equations that will be utilized to determine the c
coefficients:
\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1.5 & 8 & 2.5 & 0 \\
0 & 2.5 & 9 & 2 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 3(0.6 + 1) \\ 3(-1 - 0.6) \\ 0 \end{bmatrix}.
\]
\[
\Rightarrow
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1.5 & 8 & 2.5 & 0 \\
0 & 2.5 & 9 & 2 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 4.8 \\ -4.8 \\ 0 \end{bmatrix}.
\]
Therefore:
\[
\bar{c} =
\begin{bmatrix} 0 \\ 0.839543726 \\ -0.766539924 \\ 0 \end{bmatrix}.
\]
Using our values for c we obtain the following for our d's:
d₁ = 0.186565272,  d₂ = −0.214144487,  d₃ = 0.127756654,
and for our b's:
b₁ = −1.419771863,  b₂ = −0.160456274,  b₃ = 0.022053232.
These results allow us to develop the cubic splines for each interval using Equation (5.16):
The three equations can then be employed to compute values within each interval. For example, the value at
x = 5, which falls within the second interval, is calculated as
\[
s_2(5) = 1 - 0.160456274(5 - 4.5) + 0.839543726(5 - 4.5)^2 - 0.214144487(5 - 4.5)^3 \approx 1.102890.
\]
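As a check (not part of the original notes), SciPy's CubicSpline with natural boundary conditions reproduces this value:

from scipy.interpolate import CubicSpline

x = [3, 4.5, 7, 9]
y = [2.5, 1, 2.5, 0.5]
cs = CubicSpline(x, y, bc_type='natural')   # natural spline: s'' = 0 at both ends
print(cs(5))                                # approximately 1.1029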
A major problem with interpolation is Runge’s Phenomenon. Let us consider an example in Mathematica:
ClearAll[data, x];
data = RandomReal[{-10, 10}, 20];
ListPlot[data]
Manipulate[
Show[
Plot[InterpolatingPolynomial[data[[1 ;; n]], x], {x, 1, n},
PlotRange -> All],
ListPlot[data, PlotStyle -> Directive[PointSize[Large], Red]],
PlotRange -> All
], {n, 2, Length[data], 1}]
tcentr[d_] :=
Module[{a},
a = Accumulate[
Table[Norm[d[[i + 1]] - d[[i]]]^(1/2), {i, Length[d] - 1}]];
N[Prepend[a/Last[a], 0]]]
noeudmoy[d_, param_] :=
Join[{0, 0, 0, 0},
Table[1/3*Sum[param[[i]], {i, j, j + 2}], {j, 2,
Length[param] - 3}], {1, 1, 1, 1}]
Manipulate[Module[{pCt},
pCt = pctrl[dpts[[1 ;; n]], tcentr[dpts[[1 ;; n]]],
noeudmoy[dpts[[1 ;; n]], tcentr[dpts[[1 ;; n]]]]];
Show[
ParametricPlot[
BSplineFunction[pCt,
SplineKnots ->
noeudmoy[dpts[[1 ;; n]], tcentr[dpts[[1 ;; n]]]]][x], {x, 0,
1}, PlotRange -> All],
ListPlot[data, PlotStyle -> Directive[PointSize[Large], Red]],
PlotRange -> All
]], {n, 4, Length[data], 1}]
Thus we can see that high order polynomials lead to an exponential growth of the infinity norm error. To
overcome this we used the splines technique from above; however, another method one could use is Chebyshev
polynomials, whose associated interpolation points are distributed more densely towards the bounds of the interval.
5.6.2 Exercises
determine y at x = 0 using (a) Lagrange’s method and (b) Newton’s Divided Differences.
2. Given the data:
Estimate f (0.6) from the data using: (a) a second degree Lagrange polynomial (b) a third degree Lagrange
polynomial
3. Given f (−2) = 46, f (−1) = 4, f (1) = 4, f (3) = 156, f (4) = 484, use Newton Divided Differences to
estimate f (0).
4. Let P (x) be the degree 5 polynomial that takes the value 10 at x = 1, 2, 3, 4, 5 and the value 15 at
x = 6. Find P (7).
5. Write down a polynomial of degree exactly 5 that interpolates the four points (1, 1), (2, 3), (3, 3), (4, 4).
6. Find P (0), where P (x) is the degree 10 polynomial that is zero at x = 1, . . . , 10 and satisfies P (12) = 44.
7. Write down the degree 25 polynomial that passes through the points (1, −1), (2, −2), . . . , (25, −25) and has constant term equal to 25.
8. Use the method of divided differences to find the degree 4 interpolating polynomial P4 (x) for the
data (0.6, 1.433329), (0.7, 1.632316), (0.8, 1.896481), (0.9, 2.247908) and (1.0, 2.718282). Next calculate
P4 (0.82) and P4 (0.98). The aforementioned data is sampled from the function f (x) = ex . Compute the
absolute and relative errors of your approximations at P4 (0.82) and P4 (0.98).
Chapter 6
Least Squares
Experimental data is commonly contaminated by noise, which may result from measurement error or some other experimental inconsistency. In these instances we want to find a curve that fits the data points “on the average”; that is, we do not want to overfit the data and thereby amplify the noise. With this in mind, the curve should have the simplest form possible (i.e. the lowest order polynomial possible).
Let:
f (x) = f (x, a1 , a2 , . . . , am ),
be the function that is to be fitted to the n data points (xi , yi ), i = 1, 2, . . . , n. Thus, we have a function
of x that contains the parameters aj , j = 1, 2, . . . , m, where m < n. The shape of f (x) is known a priori,
normally from the theory associated with the experiment in question. This means we are looking to fit the
best parameters. Thus curve fitting is a two-step process: (i) selecting the correct form of f (x) and (ii)
computing the parameters that produce the best fit to the data.
The notion of best fit (at least for the purpose of this course) assumes that the noise is confined to the y-coordinate. The most common measure of fit is the least squares criterion, which minimises:
\[
S(a_1, a_2, \ldots, a_m) = \sum_{i=1}^{n} \left[ y_i - f(x_i) \right]^2, \qquad (6.1)
\]
with respect to each a_j. The optimal values of the parameters are given by the solution of the equations:
\[
\frac{\partial S}{\partial a_k} = 0, \quad k = 1, 2, \ldots, m. \qquad (6.2)
\]
The residual r_i = y_i − f(x_i) in Equation (6.1) represents the discrepancy between the data point and the fitting function at x_i; the function S is the sum of the squares of all residuals.
A Least squares problem is said to be linear if the fitting function is chosen as a linear combination of
functions fj (x):
f (x) = a1 f1 (x) + a2 f2 (x) + . . . + am fm (x). (6.3)
For example, one could take f_1(x) = 1, f_2(x) = x, f_3(x) = x^2, and so on. If the fitting function is nonlinear in the parameters, the problem becomes considerably more difficult to solve; for the purpose of this course we will only consider linear least squares.
For a straight-line fit f(x) = a_0 + a_1 x, a necessary condition for S(a_0, a_1) to be a minimum is that the first partial derivatives of S w.r.t. a_0 and a_1 must be zero:
\[
\frac{\partial S}{\partial a_0} = -2 \sum_{i=1}^{n} \left[ y_i - a_0 - a_1 x_i \right] = 0 \qquad (6.4)
\]
\[
\frac{\partial S}{\partial a_1} = -2 \sum_{i=1}^{n} x_i \left[ y_i - a_0 - a_1 x_i \right] = 0 \qquad (6.5)
\]
These equations are called the normal equations. They can be solved simultaneously for a_1:
\[
a_1 = \frac{n \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n \sum_i x_i^2 - \left( \sum_i x_i \right)^2} \qquad (6.8)
\]
This result can then be used in conjunction with Equation (6.6) to solve for a_0:
\[
a_0 = \frac{1}{n} \left( \sum_{i=1}^{n} y_i - a_1 \sum_{i=1}^{n} x_i \right). \qquad (6.9)
\]
So in matrix form:
\[
\begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}
=
\begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}. \qquad (6.10)
\]
Therefore:
\[
\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}
=
\begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix}^{-1}
\begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}. \qquad (6.11)
\]
6.1.0.1 Example
xi 1 2 3 4 5 6 7
yi 0.5 2.5 2.0 4.0 3.5 6.0 5.5
To find the least squares line approximation of this data, extend the table and sum the columns, as below:
x_i     y_i     x_i^2     x_i y_i
1       0.5     1         0.5
2       2.5     4         5.0
3       2.0     9         6.0
4       4.0     16        16.0
5       3.5     25        17.5
6       6.0     36        36.0
7       5.5     49        38.5
Σ = 28  Σ = 24  Σ = 140   Σ = 119.5
\[
a_1 = \frac{7(119.5) - 28(24)}{7(140) - 28^2} = 0.8393
\]
and hence:
\[
a_0 = \frac{24 - 0.8393(28)}{7} = 0.0714.
\]
The least squares line is therefore y = 0.0714 + 0.8393x.
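A minimal NumPy sketch of the normal-equation computations above (variable names are illustrative); it should reproduce the values of the example:

import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5])
n = len(x)

# Slope and intercept from Equations (6.8) and (6.9)
a1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
a0 = (np.sum(y) - a1 * np.sum(x)) / n
print(a0, a1)   # approximately 0.0714 and 0.8393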
The least squares procedure above can be readily extended to fit the data to an mth degree polynomial
\[
P_m(x) = a_0 + a_1 x + \cdots + a_m x^m
\]
through some n data points (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n), where m ≤ n − 1. Then S takes the form:
\[
S = \sum_{i=1}^{n} \left[ y_i - P_m(x_i) \right]^2 \qquad (6.13)
\]
and the minimum is found by solving:
\[
\frac{\partial S}{\partial a_0} = 0, \quad \frac{\partial S}{\partial a_1} = 0, \quad \cdots, \quad \frac{\partial S}{\partial a_m} = 0
\]
For a quadratic fit (m = 2) this gives a system of three normal equations for a_0, a_1, and a_2.
Note: This system is symmetric and can be solved using Gauss elimination.
6.2.0.1 Exercise
Fit a second-degree least squares polynomial to the data:
x_i   0     1     2      3      4      5
y_i   2.1   7.7   13.6   27.2   40.9   61.1
import numpy as np
import numpy.linalg as LA
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])
n = len(x)
sumX = sum(x)
sumY = sum(y)
sumX2 = sum(x**2)
sumX3 = sum(x**3)
sumX4 = sum(x**4)
sumXY = sum(x * y)
sumXXY = sum(x**2 * y)
A = np.array([[n, sumX, sumX2],[sumX, sumX2, sumX3], [sumX2, sumX3, sumX4]])
print(A)
## [[ 6 15 55]
## [ 15 55 225]
## [ 55 225 979]]
b = np.array([sumY, sumXY, sumXXY])
print(b)
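The listing above only assembles the system; a short sketch of one way to finish the computation (assuming the arrays A and b from the listing) is:

# Solve the normal equations A a = b for the quadratic coefficients [a0, a1, a2]
coeffs = LA.solve(A, b)
print(coeffs)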
Remark: As the degree m increases, the coefficient matrix becomes extremely ill-conditioned. It is therefore not recommended to fit least squares polynomials of degree greater than 4 to given data points. Also, in practice it is common to use built-in libraries for these computations instead of programming them yourself, as shown below. In addition, any real-world scenario would likely involve a very large number of data points, in which case gradient descent techniques could also be applied; you may encounter these in machine learning courses.
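For example, the same quadratic fit can be obtained with NumPy's built-in polyfit routine (a sketch reusing the arrays x and y defined in the exercise above):

import numpy as np

coeffs2 = np.polyfit(x, y, 2)   # returns [a2, a1, a0], highest degree first
print(coeffs2)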
When the derivatives of S with respect to a and b are set equal to zero, the resulting equations are:
\[
\frac{\partial S}{\partial a} = -2 \sum_{i=1}^{n} e^{b x_i} \left[ y_i - a e^{b x_i} \right] = 0 \qquad (6.24)
\]
\[
\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} a x_i e^{b x_i} \left[ y_i - a e^{b x_i} \right] = 0 \qquad (6.25)
\]
These two equations in two unknowns are nonlinear and generally difficult to solve.
It is sometimes possible to “linearise” the normal equations through a change of variables. Taking the natural logarithm of equation (6.22) gives:
\[
\ln y = \ln a + b x.
\]
We introduce the variables Y = ln(y), a_0 = ln(a) and a_1 = b. Then the linearised equation becomes:
\[
Y(x) = a_0 + a_1 x, \qquad (6.26)
\]
and the ordinary least squares analysis may then be applied to the problem. Once the coefficients a_0 and a_1 have been determined, the original coefficients can be computed as a = e^{a_0} and b = a_1.
6.3.0.1 Example
To fit an exponential least squares model to this data, extend the table as follows:
xi yi Yi = ln yi x2i xi Yi
1.00 5.10 1.629 1.0000 1.629
1.25 5.79 1.756 1.5625 2.195
1.50 6.53 1.876 2.2500 2.814
1.75 7.45 2.008 3.0625 3.514
2.00 8.46 2.135 4.000 4.270
Σ = 7.5   Σ = 33.3   Σ = 9.404   Σ = 11.875   Σ = 14.422
import numpy as np
import numpy.linalg as LA
x = np.array([1.0, 1.25, 1.5, 1.75, 2.0])
y = np.array([5.1, 5.79, 6.53, 7.45, 8.46])
sumX = sum(x)
sumY = sum(y)
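The listing is cut off at the page break; a sketch of one way to complete the linearised fit in Python (consistent with the table above; variable names are illustrative) is:

Y = np.log(y)                      # linearise: Y = ln(y)
n = len(x)
a1 = (n * sum(x * Y) - sumX * sum(Y)) / (n * sum(x**2) - sumX**2)
a0 = (sum(Y) - a1 * sumX) / n
a, b = np.exp(a0), a1              # recover the original parameters
print(a, b)                        # the fitted model is y = a * exp(b * x)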
6.3.1 Exercises
• Find the least squares polynomials of degrees one, two and three for the data, computing the error S in
each case.
Ans:
y = 0.6209 + 1.2196x, y = 0.5966 + 1.2533x − 0.0109x2 ,
y = −0.01x3 + 0.0353x2 + 1.185x + 0.629
• An experiment is performed to define the relationship between applied stress and the time to fracture
for a stainless steel. Eight different values of stress are applied and the resulting data is:
Use a linear least squares fit to determine the fracture time for an applied stress of 33 kg/mm².
(Ans: t = 39.75 − 0.6x, t = 19.95 hours)
• Fit a least squares exponential model to:
(Ans: y = 530.8078e0.8157x )
Chapter 7
Ordinary Differential Equations (ODEs)
Ordinary differential equations govern a great number of important physical processes and phenomena. Not all differential equations can be solved using analytic techniques; consequently, numerical solution has become an essential alternative and a very large area of study.
Importantly, we note the following:
• By itself y 0 = f (x, y) does not determine a unique solution.
• This simply tells us the slope y 0 (x) of the solution function at each point, but not the actual value y(x)
at any point.
• There is an infinite family of functions satisfying an ODE.
• To single out a particular solution, a value y0 of the solution function must be specified at some point
x0 . These are called initial value problems.
Should members of the solution family of an ODE move away from each other over time, then the equation is
said to be unstable. If the family members move closer to one another with time then the equation is said
to be stable. Finally, if the solution curves do not approach or diverge from one another with time, then the
equation is said to be neutrally stable. So small perturbations to a solution of a stable equation will be
damped out with time since the solution curves are converging. Conversely, an unstable equation would see
the perturbation grow with time as the solution curves diverge.
To give physical meaning to the above, consider a 3D cone. If the cone is stood on its circular base, then
applying a perturbation to the cone will see it return to its original position standing up, implying a stable
position. If the cone is balanced on its tip, then a small perturbation will see the cone fall; therefore the position is unstable. Finally, if the cone rests on its side, a perturbation will simply roll the cone to some new position, and thus the position is neutrally stable.
An example of an unstable ODE is y' = y. Its family of solutions is given by the curves y(t) = ce^t. From the exponential growth of the solutions we can see that the solution curves move away from one another as time increases, implying that the equation is unstable. We can see this in the plot below.
Now consider the equation y' = −y. Here the family of solutions is given by y(t) = ce^{−t}. Since the solutions decay exponentially, the solution curves converge and the equation is stable, as seen in the figure below.
Finally, consider the ODE y' = a for a given constant a. Here the family of solutions is given by y(t) = at + c, where c again is any real constant. Thus, in the example plotted below where a = 12, the solutions are parallel straight lines which neither converge nor diverge. Therefore, the equation is neutrally stable.
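A short Python sketch (illustrative only; it assumes matplotlib is available) that plots a few members of each family makes the convergence, divergence and neutrality visible:

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2, 100)
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for c in [-2, -1, 0, 1, 2]:
    axes[0].plot(t, c * np.exp(t))       # unstable: y' = y, curves diverge
    axes[1].plot(t, c * np.exp(-t))      # stable: y' = -y, curves converge
    axes[2].plot(t, 0.5 * t + c)         # neutrally stable: y' = a, parallel lines
for ax, title in zip(axes, ["y' = y", "y' = -y", "y' = a"]):
    ax.set_title(title)
plt.show()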
\[
y_{i+1} = y_i + h\, f(x_i, y_i), \qquad y_i' = f(x_i, y_i). \qquad (7.4)
\]
This is a difference formula which can be evaluated step by step; it is the formula for Euler's (or Euler–Cauchy) method. Thus, given (x_0, y_0), we can calculate (x_i, y_i) for i = 1, 2, \cdots, n. Since the new value y_{i+1} can be calculated from known values of x_i and y_i, this method is said to be explicit.
Each time we apply an equation such as(7.4) we introduce two types of errors:
• Local truncation error introduced by ignoring the terms in h^2, h^3, \cdots in equation (7.2). For Euler's method, this error is
\[
E = \frac{h^2}{2!} y''(\xi), \quad \xi \in [x_i, x_{i+1}],
\]
i.e. E = O(h^2). Thus the local truncation error per step is O(h^2).
• A further error introduced in yi+1 because yi is itself in error. The size of this error will depend on the
function f (x, y) and the step size h.
The above errors are introduced at each step of the calculation.
7.2.2 Example
Consider the IVP y' = x + y, y(0) = 1, solved with Euler's method using step size h = 0.1. The approximate solutions at the points x_1 = 0.1, x_2 = 0.2, \ldots can be computed step by step; rounded to 3 decimal places, we obtain:
x_i     y_i      y_i' = f(x_i, y_i)     h y_i'
0 1.000 1.000 0.100
0.1 1.100 1.200 0.120
0.2 1.220 1.420 0.142
0.3 1.362 1.662 0.166
0.4 1.528 1.928 0.193
The analytical solution at x = 0.4 is 1.584. The numerical value is 1.528, hence the error is about 3.5%. The accuracy of Euler's method can be improved by using a smaller step size h; another alternative is to use a more accurate algorithm.
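A minimal Python sketch of Euler's method for this example (function and variable names are illustrative); it should reproduce the last row of the table above:

def f(x, y):
    return x + y

h, x, y = 0.1, 0.0, 1.0
for _ in range(4):                 # four steps: x = 0.1, 0.2, 0.3, 0.4
    y = y + h * f(x, y)            # Euler update y_{i+1} = y_i + h f(x_i, y_i)
    x = x + h
print(x, y)                        # approximately 0.4, 1.528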
which is assumed to represent a valid approximation of the average slope for the entire subinterval. This
slope is then used to extrapolate linearly from xi to xi+1 using Euler’s method to obtain:
For the modified Euler method, the truncation error can be shown to be:
\[
E = -\frac{h^3}{12} y'''(\xi), \quad \xi \in [x_i, x_{i+1}] \qquad (7.8)
\]
7.3.0.1 Example
Solve
\[
\frac{dy}{dx} = x + y, \quad y(0) = 1, \quad h = 0.1
\]
using the modified Euler’s method described above.
Solution:
x_i     y_i        y_{i+1/2}    y'_{i+1/2}    h y'_{i+1/2}
0       1.000      1.050        1.100         0.110
0.1     1.110      1.1705       1.3205        0.13205
0.2     1.24205    1.31415      1.56415       0.15642
0.3     1.39847    1.48339      1.83339       0.18334
0.4     1.58180
The numerical solution at x = 0.4 is now 1.5818, which is much more accurate than the result obtained using Euler's method. In this case the error is about 0.14%.
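A Python sketch of the same computation, using the midpoint form of the modified Euler method shown in the table (names are illustrative):

def f(x, y):
    return x + y

h, x, y = 0.1, 0.0, 1.0
for _ in range(4):
    y_mid = y + 0.5 * h * f(x, y)          # predict the midpoint value
    y = y + h * f(x + 0.5 * h, y_mid)      # use the midpoint slope for the full step
    x = x + h
print(x, y)                                # approximately 0.4, 1.5818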
7.4.2.1 Example
Solve the DE y 0 = x + y, y(0) = 1 using 4th order Runge–Kutta method. Compare your results with those
obtained from Euler’s method, modified Euler’s method and the actual value. Determine y(0.1) and y(0.2)
only.
The solution using Runge–Kutta is obtained as follows. For y_1:
\[
k_1 = 0.1(0 + 1) = 0.1, \quad k_2 = 0.1(0.05 + 1.05) = 0.11, \quad k_3 = 0.1(0.05 + 1.055) = 0.1105, \quad k_4 = 0.1(0.1 + 1.1105) \approx 0.1211,
\]
and therefore:
\[
y_1 = y_0 + \frac{1}{6}\left(0.1 + 2(0.11) + 2(0.1105) + 0.1211\right) = 1.1103.
\]
A similar computation yields
\[
y(0.2) = y_2 = 1.1103 + \frac{1}{6}\left(0.1210 + 2(0.1321) + 2(0.1326) + 0.1443\right) = 1.2428.
\]
A table comparing the approximate solutions from the required methods with the exact solution y = 2e^x − x − 1 is:
x       Euler     Modified Euler    RK4       Exact
0.1     1.100     1.110             1.1103    1.1103
0.2     1.220     1.24205           1.2428    1.2428
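A short Python sketch of the classical RK4 iteration for this IVP (names are illustrative); it should reproduce the RK4 values above:

def f(x, y):
    return x + y

h, x, y = 0.1, 0.0, 1.0
for _ in range(2):                           # two steps: y(0.1) and y(0.2)
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    y = y + (k1 + 2 * k2 + 2 * k3 + k4) / 6
    x = x + h
    print(x, y)                              # approximately 1.1103, then 1.2428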
As noted previously, Euler's method, the modified Euler method and the Runge–Kutta methods are single-step methods: they compute each successive value y_{i+1} using only information from the preceding value y_i. Another approach is multistep methods, in which values from several previously computed steps are used to obtain y_{i+1}. There are numerous methods using this approach; however, for the purpose of this course we will only consider one, the Adams–Bashforth(–Moulton) method.
7.5.1 Adams–Bashforth–Moulton Method
This multistep method is similar to the modified Euler method in that it is a predictor-corrector method, i.e. it uses one formula to predict a value y*_{i+1}, which is then used to obtain a corrected value y_{i+1}.
The predictor in this method is the Adams–Bashforth formula. Specifically,
\[
y^{*}_{i+1} = y_i + \frac{h}{24}\left(55 y_i' - 59 y_{i-1}' + 37 y_{i-2}' - 9 y_{i-3}'\right),
\]
where
\[
y_i' = f(x_i, y_i), \quad y_{i-1}' = f(x_{i-1}, y_{i-1}), \quad y_{i-2}' = f(x_{i-2}, y_{i-2}), \quad y_{i-3}' = f(x_{i-3}, y_{i-3}).
\]
7.5.1.1 Example
Use the Adam-Bashforth method with h = 0.2 to obtain an approximation to y(0.8) for the IVP:
y 0 = x + y − 1, y(0) = 1.
Solution:
Using the RK-4 method to get started, we obtain the following:
y1 = 1.0214, y2 = 1.09181796, y3 = 1.22210646.
Since h = 0.2, we know that x1 = 0.2, x2 = 0.4, x3 = 0.6 and f (x, y) = x + y − 1. Now we can proceed:
y_0' = f(x_0, y_0) = 0 + 1 − 1 = 0,
y_1' = f(x_1, y_1) = 0.2 + 1.0214 − 1 = 0.2214,
y_2' = f(x_2, y_2) = 0.4 + 1.09181796 − 1 = 0.49181796,
y_3' = f(x_3, y_3) = 0.6 + 1.22210646 − 1 = 0.82210646.
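Substituting these derivatives into the Adams–Bashforth predictor gives y(0.8) ≈ y*_4 = 1.22210646 + (0.2/24)[55(0.82210646) − 59(0.49181796) + 37(0.2214) − 9(0)] ≈ 1.4254, which compares well with the exact solution y = e^x − x, for which y(0.8) ≈ 1.4255. A Python sketch of this predictor step (names are illustrative):

h = 0.2
ys = [1.0, 1.0214, 1.09181796, 1.22210646]           # y_0 ... y_3 from RK4
dys = [0.0, 0.2214, 0.49181796, 0.82210646]          # y'_i = f(x_i, y_i)

# Adams-Bashforth predictor for y_4, an approximation to y(0.8)
y4_pred = ys[3] + h / 24 * (55 * dys[3] - 59 * dys[2] + 37 * dys[1] - 9 * dys[0])
print(y4_pred)                                        # approximately 1.4254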
There are a number of decisions to make when choosing a numerical method to solve a differential equation. While single-step explicit methods such as RK-4 are often chosen for their accuracy and easily programmable implementation, the right hand side of the equation needs to be evaluated many times; RK-4 requires four function evaluations at each step. On the other hand, if the function evaluations from the previous steps have been computed and stored, then a multistep method requires only one new function evaluation at each step, saving computational time. In general the Adams–Bashforth method requires slightly more than one quarter of the number of function evaluations required for the RK-4 method.
\[
\begin{aligned}
\frac{dy_1}{dx} &= f_1(x, y_1, y_2, \cdots, y_n), & y_1(x_0) &= \alpha_1 \\
\frac{dy_2}{dx} &= f_2(x, y_1, y_2, \cdots, y_n), & y_2(x_0) &= \alpha_2 \\
&\ \ \vdots & & \\
\frac{dy_n}{dx} &= f_n(x, y_1, y_2, \cdots, y_n), & y_n(x_0) &= \alpha_n,
\end{aligned}
\]
for x_0 ≤ x ≤ x_n.
The methods we have seen so far were for a single first order equation, in which we sought the solution y(x).
Methods to solve first order systems of IVPs are simple generalisations of the methods for a single equation, bearing in mind that we now seek n solutions y_1, y_2, \ldots, y_n, each with an initial condition y_k(x_0), k = 1, \ldots, n, at the points x_i, i = 1, 2, \ldots.
\[
\frac{dy}{dx} = f(x, y, z), \quad y(0) = y_0 \qquad (7.30)
\]
\[
\frac{dz}{dx} = g(x, y, z), \quad z(0) = z_0. \qquad (7.31)
\]
Let y = y1 , z = y2 , f = f1 , and g = f2 . The fourth order R-K method would be applied as follows. For each
j = 1, 2 corresponding to solutions yj,i , compute
\[
k_{1,j} = h f_j(x_i, y_{1,i}, y_{2,i}), \quad j = 1, 2 \qquad (7.32)
\]
\[
k_{2,j} = h f_j\!\left(x_i + \frac{h}{2},\; y_{1,i} + \frac{k_{1,1}}{2},\; y_{2,i} + \frac{k_{1,2}}{2}\right), \quad j = 1, 2 \qquad (7.33)
\]
\[
k_{3,j} = h f_j\!\left(x_i + \frac{h}{2},\; y_{1,i} + \frac{k_{2,1}}{2},\; y_{2,i} + \frac{k_{2,2}}{2}\right), \quad j = 1, 2 \qquad (7.34)
\]
\[
k_{4,j} = h f_j\!\left(x_i + h,\; y_{1,i} + k_{3,1},\; y_{2,i} + k_{3,2}\right), \quad j = 1, 2 \qquad (7.35)
\]
and:
\[
y_{i+1} = y_{1,i+1} = y_{1,i} + \frac{1}{6}\left(k_{1,1} + 2k_{2,1} + 2k_{3,1} + k_{4,1}\right) \qquad (7.36)
\]
\[
z_{i+1} = y_{2,i+1} = y_{2,i} + \frac{1}{6}\left(k_{1,2} + 2k_{2,2} + 2k_{3,2} + k_{4,2}\right). \qquad (7.37)
\]
Note that we must calculate k1,1 , k1,2 , k2,1 , k2,2 , k3,1 , k3,2 , k4,1 , k4,2 in that order.
Consider a second order ODE of the form y'' + a y' + b y = 0 with initial conditions y(0) = α_1 and y'(0) = α_2. If we let
\[
z = y', \qquad z' = y'',
\]
then the original ODE can be written as the first order system
\[
y' = z, \quad y(0) = \alpha_1 \qquad (7.38)
\]
\[
z' = -a z - b y, \quad z(0) = \alpha_2. \qquad (7.39)
\]
Once transformed into a system of first order ODEs, the methods for systems of equations apply, as in the sketch below.
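As an illustration, a Python sketch that applies the RK4 scheme above to such a transformed system (the coefficients a and b, the initial values and the step size below are illustrative assumptions, not taken from the notes):

import numpy as np

def rk4_system(f, x0, y0, h, steps):
    """RK4 for a first order system y' = f(x, y), with y a vector."""
    x, y = x0, np.asarray(y0, dtype=float)
    for _ in range(steps):
        k1 = h * f(x, y)
        k2 = h * f(x + h / 2, y + k1 / 2)
        k3 = h * f(x + h / 2, y + k2 / 2)
        k4 = h * f(x + h, y + k3)
        y = y + (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x = x + h
    return x, y

# y'' + a y' + b y = 0 written as y' = z, z' = -a z - b y
a, b = 0.0, 1.0                                   # illustrative choice: y'' + y = 0
f = lambda x, u: np.array([u[1], -a * u[1] - b * u[0]])
print(rk4_system(f, 0.0, [1.0, 0.0], 0.1, 10))    # y(1), z(1); exact y = cos(x)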
7.7.0.1 Exercise
(i) Second order R–K method (ii) 4th order R–K. Use h = 0.1. Do only two steps.
7.7.1 Exercises
Use (i) Euler's method (ii) the modified Euler formula to solve the following IVPs:
• y 0 = sin(x + y), y(0) = 0
• y 0 = yx2 − y, y(0) = 1 for h = 0.2 and h = 0.1.
• Determine y(0.4) for each of the above IVPs.
• Use Richardson’s extrapolation to get improved approximations to the solutions at x = 0.4
• If f is a function of x only, show that the fourth-order Runge–Kutta formula, applied to the differential equation dy/dx = f(x), is equivalent to the use of Simpson's rule (over one interval) for evaluating \(\int_0^x f(x)\,dx\).
• Use the fourth order Runge–Kutta method to solve the following IVPs:
– y' = 2xy, y(0) = 1
– y' = 1 + y^2, on the domain [0, 1] with the initial condition (a) y_0 = 0 and (b) y_0 = 1. Plot these solutions along with the exact solution. Use step sizes of h = 0.1 and h = 0.05.
• Given the IVP y' = (x + y − 1)^2, y(0) = 2, use the modified Euler method with h = 0.1 and h = 0.05 to obtain approximations to the solution at x = 0.5. Compare these values with the analytical solution.