Mathematics for Engineers and Scientists 4
Notes for F1.8XD2
2018
3 Geometry
3.1 Introduction
3.2 Revision of Vector Operations
3.2.1 Vector addition
3.2.2 Multiplication by a scalar
3.2.3 Scalar product
3.2.4 Vector product
3.3 Lines in Three Dimensions
3.3.1 Parametric equation of a line
3.3.2 Angle between two lines
3.4 Equations of a Plane
3.4.1 Non-parametric (Cartesian) equation for a plane
3.4.2 Parametric representation of a plane
3.4.3 Plane defined by three point vectors
3.4.4 Two intersecting planes
3.4.5 Parallel planes and the angle between two planes
3.4.6 Angle between a line and a plane
3.5 Problems
4 Vector Differentiation
4.1 Differentiation of Vectors
4.1.1 Differentiation of sums and products of vectors
4.1.2 Linear approximation of a curve in three dimensions
4.2 Gradient of a Scalar Function
4.2.1 Directional derivatives
4.2.2 Equations for a tangent plane and normal line
4.3 Introduction to div and curl
4.4 Summary of Vector Geometry
4.5 Problems
6 Matrices
6.1 Vectors and Matrices
6.2 Inverse Matrices
6.3 Determinants
6.4 Problems
Chapter 1
1.1 Introduction
Laplace transforms are an interesting field of mathematics that can be used to solve prob-
lems involving differential equations and integro-differential equations. The technique is
quite abstract in nature but it allows the solution of differential equations, without the
need to do any integration or differentiation. The processes of integration and differentia-
tion are replaced by algebraic manipulation, which is often considered easier to apply than
concepts taken from calculus. A further advantage of the Laplace transform method for
solving initial value problems is that the initial conditions are incorporated in an entirely
natural way.
The main reason for considering Laplace transforms at this stage is that most engineering
disciplines engage in a subject called control engineering, where Laplace transforms are
used to analyse the response of engineering systems to changes in inputs to the system,
whether it be a chemical reactor vessel or an auto-pilot in a plane.
Before we can attempt to solve differential equations using the Laplace transform, we
need to introduce it and consider the Laplace transform and inverse Laplace transform
for a number of simple functions and differential operators.
From (1.1) we see that the Laplace transform consists of an improper integral (one of the integration bounds is infinite). In the first instance with an improper integral you have to consider whether the integral converges or not. For example the Laplace transform of the function
\[ f(t) = e^{at} \]
only exists if a − s < 0; otherwise the Laplace transform would be the divergent integral
\[ \int_0^\infty e^{-st} f(t)\,dt = \int_0^\infty e^{(a-s)t}\,dt . \]
Returning to the terminology used to describe the Laplace transform (1.1), e−st is called
the kernel of the integral. In the original function, f (t), the independent variable is t,
which can be considered a time variable. The function f (t) is considered to exist in the
time domain. The Laplace transform F (s) exists in a frequency domain. s is the
independent variable of the Laplace transform and strictly speaking is a complex variable
although we shall for the most part only consider it to be real.
As this is an improper integral it should be evaluated with a finite upper bound, the limit to infinity being taken once the integral is evaluated. For example, for the constant function f(t) = c,
\[ \mathcal{L}\{f(t)\} = \int_0^\infty c\,e^{-st}\,dt = \lim_{b\to\infty}\int_0^b c\,e^{-st}\,dt , \]
where
\[ \int_0^b c\,e^{-st}\,dt = \left[-\frac{c}{s}e^{-st}\right]_0^b = -\frac{c}{s}\left(e^{-sb} - e^{0}\right) = \frac{c}{s}\left(1 - e^{-sb}\right) . \]
We can now let b tend to infinity (assuming s > 0),
\[ \lim_{b\to\infty}\frac{c}{s}\left(1 - e^{-sb}\right) = \frac{c}{s} . \]
To summarise,
\[ \mathcal{L}\{c\} = \frac{c}{s} , \]
i.e. the Laplace transform pair is
\[ f(t) = c , \qquad F(s) = \frac{c}{s} . \]
Worked Examples (Evaluating Laplace transforms by direct integration). Let us consider another example, f(t) = t, in detail.
Integrating by parts,
\[ \int_0^b u\,\frac{dv}{dt}\,dt = [uv]_0^b - \int_0^b v\,\frac{du}{dt}\,dt \]
with u = t so du/dt = 1, and dv/dt = e^{-st} so v = −e^{-st}/s,
\[ \int_0^b t\,e^{-st}\,dt = \left[-\frac{t}{s}e^{-st}\right]_0^b + \frac{1}{s}\int_0^b e^{-st}\,dt = -\frac{b}{s}e^{-bs} + \frac{1}{s}\int_0^b e^{-st}\,dt . \]
We could evaluate the integral on the right-hand side but we can let b tend to infinity now and avoid the integration,
\[ \lim_{b\to\infty}\int_0^b t\,e^{-st}\,dt = \frac{1}{s}\int_0^\infty e^{-st}\,dt = \frac{1}{s}\,\mathcal{L}\{1\} = \frac{1}{s^2} . \]
Hence the Laplace transform pair is
\[ f(t) = t , \qquad F(s) = \frac{1}{s^2} . \]
Exercise 1.1. Use the same approach (i.e. one integration by parts) to show that
\[ f(t) = t^2 , \qquad F(s) = \frac{2}{s^3} . \]
The main point of this section is not the integration process. We are interested in growing
our collection of functions that we know the Laplace transform for, although we shall not
do this by integrating the Laplace transform for every function we come across.
For example, inspecting the three Laplace transform pairs given above a pattern is emerg-
ing for algebraic terms. The general rule is:
General Rule for Laplace transforms of algebraic terms:
\[ \mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}} , \qquad n = 0, 1, 2, \ldots \]
Question 1.2. Evaluate the Laplace transform of the function f (t) = ekt .
Solution.
\[ \mathcal{L}\{e^{kt}\} = \int_0^\infty e^{kt}e^{-st}\,dt = \lim_{b\to\infty}\int_0^b e^{-(s-k)t}\,dt , \]
where
\[ \int_0^b e^{-(s-k)t}\,dt = \left[-\frac{1}{s-k}e^{-(s-k)t}\right]_0^b = -\frac{1}{s-k}\left(e^{-(s-k)b} - e^{0}\right) , \]
so
\[ \mathcal{L}\{e^{kt}\} = \lim_{b\to\infty}\left(-\frac{1}{s-k}\left(e^{-(s-k)b} - 1\right)\right) = \frac{1}{s-k} , \]
provided that s − k > 0, i.e. s > k.
Hence the Laplace transform pair is
\[ f(t) = e^{kt} , \qquad F(s) = \frac{1}{s-k} . \]
As you will appreciate from the above examples, deriving Laplace transforms by direct
integration is quite a tedious process. To avoid direct integration many ingenious math-
ematical tricks and theorems are often used to find Laplace transforms. For example
consider the following worked example.
Example 1.1. Evaluate the Laplace transform of the function f(t) = e^{iat}, where a is some real number and i² = −1.
Following the same steps as in Question 1.2, the solution comes to
\[ f(t) = e^{iat} , \qquad F(s) = \frac{1}{s - ia} . \]
So why have we done two worked examples that are so very similar? The answer is that
the second leads onto an additional useful result giving us two extra Laplace transforms.
Three Laplace transforms for the price of one!
Complex numbers given in exponential form can be represented using Euler's formula, e^{iat} = cos at + i sin at, so that
\[ \mathcal{L}\{e^{iat}\} = \mathcal{L}\{\cos at + i\sin at\} = \mathcal{L}\{\cos at\} + i\,\mathcal{L}\{\sin at\} . \qquad (1.3) \]
The separation of the Laplace transform into an application of the Laplace transform
to the two terms is possible as the Laplace transform is a linear operator. This will be
discussed further below, for now just accept it.
Again from Example 1.1,
\[ \mathcal{L}\{e^{iat}\} = \frac{1}{s-ia} = \frac{1}{s-ia}\times\frac{s+ia}{s+ia} = \frac{s+ia}{s^2+a^2} . \]
So, from (1.3),
\[ \mathcal{L}\{\cos at\} + i\,\mathcal{L}\{\sin at\} = \frac{s}{s^2+a^2} + i\,\frac{a}{s^2+a^2} . \qquad (1.4) \]
Therefore equating real and imaginary parts of (1.4) gives the following results:
Laplace Transforms of Trigonometric Functions:
\[ \mathcal{L}\{\cos at\} = \frac{s}{s^2+a^2} , \qquad \mathcal{L}\{\sin at\} = \frac{a}{s^2+a^2} . \]
The Laplace transform is a linear operator, so that
\[ \mathcal{L}\{\alpha f(t) + \beta g(t)\} = \alpha\,\mathcal{L}\{f(t)\} + \beta\,\mathcal{L}\{g(t)\} , \]
where α and β are constants and f(t) and g(t) are functions.
We can now apply the Laplace transforms derived in the previous section to get
\[ \mathcal{L}\{3t + 2e^{3t}\} = \frac{3}{s^2} + \frac{2}{s-3} . \]
Question 1.4. Determine L{5 − 3t + 4 sin 2t − 6e4t }.
Solution. Using the linearity of the Laplace transform and applying the Laplace transforms derived or presented in the previous section,
\[ \mathcal{L}\{5 - 3t + 4\sin 2t - 6e^{4t}\} = 5\times\frac{1}{s} - 3\times\frac{1}{s^2} + 4\times\frac{2}{s^2+4} - 6\times\frac{1}{s-4} = \frac{5}{s} - \frac{3}{s^2} + \frac{8}{s^2+4} - \frac{6}{s-4} . \]
Another mathematical tool that can be used to grow our collection of Laplace transforms
is called the First Shift Theorem.
\[ \mathcal{L}\{e^{at}f(t)\} = F(s-a) , \quad\text{where } F(s) = \mathcal{L}\{f(t)\} . \]
Here are a couple of worked examples using the first shift theorem to derive Laplace
transforms.
Solution (finding L{te^{−2t}}).
For f(t) = t,
\[ \mathcal{L}\{f(t)\} = \mathcal{L}\{t\} = F(s) = \frac{1}{s^2} . \]
Then by the first shift theorem,
\[ \mathcal{L}\{te^{-2t}\} = F(s)\big|_{s\to s+2} = \frac{1}{s^2}\bigg|_{s\to s+2} = \frac{1}{(s+2)^2} . \]
Solution (finding L{e^{−3t} sin 2t}).
For f(t) = sin 2t,
\[ \mathcal{L}\{f(t)\} = \mathcal{L}\{\sin 2t\} = F(s) = \frac{2}{s^2+4} . \]
Then by the first shift theorem,
\[ \mathcal{L}\{e^{-3t}\sin 2t\} = F(s)\big|_{s\to s+3} = \frac{2}{s^2+4}\bigg|_{s\to s+3} = \frac{2}{(s+3)^2+4} = \frac{2}{s^2+6s+13} . \]
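Both first-shift examples can be verified directly by transforming the shifted functions in SymPy (a sketch, assuming SymPy is available; SymPy may print the second result in expanded form):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# L{t e^(-2t)} and L{e^(-3t) sin 2t}: each is F(s) with s replaced by s + a
print(sp.laplace_transform(t*sp.exp(-2*t), t, s, noconds=True))            # 1/(s + 2)**2
print(sp.laplace_transform(sp.exp(-3*t)*sp.sin(2*t), t, s, noconds=True))  # 2/((s + 3)**2 + 4)
```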
1.2.5 Table of Laplace transforms
Deriving a Laplace transform every time a function crops up is a time-consuming process.
Remembering the Laplace transforms for all of the functions given above is also not a
realistic proposition for most students.
Therefore in the exam the list of Laplace transforms presented in Table 1.1 is
handed out.
The table together with the first shift theorem and the linearity property of the Laplace
transform allows you to determine the Laplace transform of many functions.
f(t)                                        F(s)
c                                           c/s
t                                           1/s^2
t^n                                         n!/s^(n+1)
e^(kt)                                      1/(s - k)
sin at                                      a/(s^2 + a^2)
cos at                                      s/(s^2 + a^2)
t sin at                                    2as/(s^2 + a^2)^2
δ(t - a)                                    e^(-as)
df/dt                                       sF(s) - f(0)
d^2 f/dt^2                                  s^2 F(s) - sf(0) - f'(0)
e^(at) f(t)                                 F(s - a)
f(t) = g(t - a) for t > a, 0 for t < a      e^(-as) G(s)
Table 1.1 summarises the results presented in the previous sections. In addition there are
Laplace transforms for derivatives and something called the Dirac delta function. These
are useful for the solution of differential equations and will be considered in detail in the
coming sections.
Exercise 1.2. Confirm by applying the first shift theorem to the Laplace transform of
teiat that
\[ \mathcal{L}\{t\cos at\} = \frac{s^2 - a^2}{(s^2+a^2)^2} \quad\text{and}\quad \mathcal{L}\{t\sin at\} = \frac{2as}{(s^2+a^2)^2} . \]
Example 1.2. As \( \mathcal{L}\{e^{kt}\} = \frac{1}{s-k} \), \( \mathcal{L}^{-1}\left\{\frac{1}{s-k}\right\} = e^{kt} \).
Example 1.3. As \( \mathcal{L}\{\sin at\} = \frac{a}{s^2+a^2} \), \( \mathcal{L}^{-1}\left\{\frac{a}{s^2+a^2}\right\} = \sin at \).
In practice the inverse Laplace transform of a rational function
\[ F(s) = \frac{p(s)}{q(s)} \qquad (1.5) \]
is often required, and some preliminary algebra is usually needed first. For example such functions might be represented as partial fractions, see Appendix B if your memory needs jogging.
Similar to the Laplace transformation, the inverse Laplace transform is a linear operator,
so
L−1 {αF (s) + βG(s)} = αL−1 {F (s)} + βL−1 {G(s)} .
Using partial fractions and the linear property given above we can calculate our first
inverse Laplace transforms that do not appear in the Laplace transform Table 1.1:
Solution. We are ultimately going to use the table of Laplace transforms but the first
thing to do is represent the Laplace transform using partial fractions:
\[ \frac{1}{(s+3)(s-2)} = \frac{A}{s+3} + \frac{B}{s-2} = \frac{A(s-2) + B(s+3)}{(s+3)(s-2)} \]
so
A(s − 2) + B(s + 3) = 1 .
Taking s = 2 then s = −3 gives B = 1/5 and A = −1/5 so
\[ \mathcal{L}^{-1}\left\{\frac{1}{(s+3)(s-2)}\right\} = \frac{1}{5}\,\mathcal{L}^{-1}\left\{\frac{1}{s-2}\right\} - \frac{1}{5}\,\mathcal{L}^{-1}\left\{\frac{1}{s+3}\right\} . \]
By inspection of the table of Laplace transforms, the inverse Laplace transform is then
\[ \mathcal{L}^{-1}\left\{\frac{1}{(s+3)(s-2)}\right\} = \frac{1}{5}e^{2t} - \frac{1}{5}e^{-3t} . \]
Solution. The first thing to do is represent the Laplace transform using partial fractions,
\[ \frac{s+1}{s^2(s^2+9)} = \frac{A}{s} + \frac{B}{s^2} + \frac{Cs+D}{s^2+9} = \frac{As(s^2+9) + B(s^2+9) + Cs^3 + Ds^2}{s^2(s^2+9)} = \frac{(A+C)s^3 + (B+D)s^2 + 9As + 9B}{s^2(s^2+9)} . \]
Equating terms; constant: 9B = 1; s: 9A = 1; s²: B + D = 0; s³: A + C = 0.
These give A = 1/9, B = 1/9, C = −1/9 and D = −1/9. Then
\[ \mathcal{L}^{-1}\left\{\frac{s+1}{s^2(s^2+9)}\right\} = \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{1}{s}\right\} + \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{1}{s^2}\right\} - \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{s}{s^2+9}\right\} - \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{1}{s^2+9}\right\} . \]
The first three terms are easy to evaluate using the table of Laplace transforms,
\[ \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{1}{s}\right\} + \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{1}{s^2}\right\} - \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{s}{s^2+9}\right\} = \frac{1}{9} + \frac{1}{9}t - \frac{1}{9}\cos 3t , \]
while the fourth term requires a little more work,
\[ \frac{1}{9}\,\mathcal{L}^{-1}\left\{\frac{1}{s^2+9}\right\} = \frac{3}{3\times 9}\,\mathcal{L}^{-1}\left\{\frac{1}{s^2+9}\right\} = \frac{1}{27}\,\mathcal{L}^{-1}\left\{\frac{3}{s^2+9}\right\} = \frac{1}{27}\sin 3t , \]
the key point being multiplying the rational function by 3 and dividing by 3 to give a
Laplace transform that has the same form as one of the ‘standard’ transforms.
This is a common operation in the business of finding inverse Laplace transforms.
Putting this all together,
\[ \mathcal{L}^{-1}\left\{\frac{s+1}{s^2(s^2+9)}\right\} = \frac{1}{9} + \frac{1}{9}t - \frac{1}{9}\cos 3t - \frac{1}{27}\sin 3t . \]
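The partial-fraction and inversion steps above can be checked with SymPy. The sketch below assumes SymPy is installed; SymPy typically includes a Heaviside(t) factor in its inverse transforms, which can be ignored for t > 0.

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

F = (s + 1) / (s**2 * (s**2 + 9))
print(sp.apart(F, s))                       # partial fraction decomposition of F(s)
y = sp.inverse_laplace_transform(F, s, t)
print(sp.simplify(y))                       # expect 1/9 + t/9 - cos(3t)/9 - sin(3t)/27
```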
1.3.2 Finding inverses using the first shift theorem
The first shift theorem applied to an inverse Laplace transform says that, if L^{-1}{F(s)} = f(t), then
\[ \mathcal{L}^{-1}\left\{F(s)\big|_{s\to s+a}\right\} = e^{-at}f(t) . \]
We will only do one example in this section. However, you will have ample opportunity
to see this technique throughout the rest of these notes.
Solution.
\[ \mathcal{L}^{-1}\left\{\frac{1}{(s+2)^2}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s^2}\bigg|_{s\to s+2}\right\} = te^{-2t} . \]
Solution (finding L^{-1}{2/(s² + 6s + 13)}). As stated above, the quotient cannot be represented as the sum of two partial fractions as the quadratic term s² + 6s + 13 has no real roots. The answer is to complete the square:
\[ s^2 + 6s + 13 = (s+3)^2 + 13 - 3^2 = (s+3)^2 + 4 . \]
Hence
\[ \mathcal{L}^{-1}\left\{\frac{2}{s^2+6s+13}\right\} = \mathcal{L}^{-1}\left\{\frac{2}{(s+3)^2+4}\right\} = \mathcal{L}^{-1}\left\{\frac{2}{s^2+2^2}\bigg|_{s\to s+3}\right\} = e^{-3t}\sin 2t . \]
Question 1.11. Find
\[ \mathcal{L}^{-1}\left\{\frac{s+7}{s^2+2s+5}\right\} . \]
This process, at least initially, has to be done step by step in a slow and methodical way,
otherwise silly mistakes might creep into the solution.
1.4 Problems
Problem 1.1. Compute the Laplace transforms for the following functions from first
principles (i.e. carrying out the integrations!):
Problem 1.2. Use the table of Laplace transforms and the first shift theorem to find the Laplace transforms of the following functions:
Problem 1.3. Show that L{eat f (t)} = F (s − a), i.e., prove the first shift theorem.
Problem 1.4. Show that \( \mathcal{L}\{tf(t)\} = -\frac{dF}{ds}(s) \). [Hint: Write F(s) using the definition of the Laplace transform, and differentiate the left- and right-hand sides of the expression with respect to s.]
Problem 1.5. (a) Use the results from Problem 1.1 (iv) to obtain Laplace transforms
of functions cos at and sin at.
(b) Use the results from Problem 1.1 (ii) and the first shift theorem to obtain Laplace
transforms of functions t cos at, t sin at.
Problem 1.6. (Advanced.) Show that the Laplace transform of f(t) = t^n is F(s) = n!/s^{n+1}. [Hint: Recursively employ integration by parts or use proof by induction.]
Problem 1.7. Use partial fractions, the first shift theorem and the table of Laplace
transforms to find the inverse Laplace transforms for the following functions
(a) \( \frac{1}{(s+3)(s+7)} \);   (d) \( \frac{3s}{(s-1)(s^2-4)} \);
(b) \( \frac{2s+6}{s^2+4} \);   (e) \( \frac{s}{(s-1)^2(s^2+4)} \);
(c) \( \frac{4s}{(s-1)(s+1)^2} \);   (f) \( \frac{s}{s^2+4s+8} \).
Answers
2.(e) 2/s^3 + 2/s^2 + 1/s
2.(f) (1/2)(1/s − s/(s^2 + 4)).
7.(a) (1/4)(e^{−3t} − e^{−7t})
7.(b) 2 cos 2t + 3 sin 2t
7.(c) e^t − e^{−t} + 2te^{−t}
7.(d) −e^t + (3/2)e^{2t} − (1/2)e^{−2t}
7.(e) (3/25)e^t + (1/5)te^t − (3/25) cos 2t − (4/25) sin 2t
7.(f) e^{−2t}(cos 2t − sin 2t).
Chapter 2
We have the Laplace transform of the first derivative of a function f(t):
\[ \mathcal{L}\left\{\frac{df}{dt}\right\} = sF(s) - f(0) . \]
Applying this result again gives the Laplace transform of the second derivative:
\[ \mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = s^2F(s) - sf(0) - f'(0) . \]
It has to be said that in the analysis above it is assumed that the function f (t) and its
derivatives are sufficiently nice for the integrals to exist.
Note the Laplace transforms of derivatives are included in the table of Laplace
transforms.
2.2 Constant-Coefficient Linear Differential Equations
2.2.1 Solution of first-order differential equations
Now that we have the Laplace transforms of derivatives and we can find inverse Laplace
transforms, we can solve differential equations without having to integrate or differentiate
anything!
We have solved our first differential equation using the Laplace transform.
If you review the example given above you will see the Laplace-transform method for solving differential equations can be separated into three steps:
Step 1. Take the Laplace transform of the given differential equation.
Step 2. Make the transformed variable (Y(s) above) the subject of the transformed equation.
Step 3. Apply the inverse Laplace transform to find y(t).
2.2.2 Solution of second-order differential equations
The general inhomogeneous linear second-order constant-coefficient differential equation
reads
\[ a\frac{d^2y}{dt^2} + b\frac{dy}{dt} + cy = f(t) \qquad (\text{with } a \neq 0) . \]
Note that by using Laplace transforms, we can solve this equation without the need to separate the solution into the complementary function (the solution to the homogeneous problem, i.e. with f(t) ≡ 0) and a particular integral to extend the solution to the inhomogeneous differential equation. The solution of second-order differential equations
follows a similar line to the solution of first-order differential equations using the Laplace-
transform method.
Solution (of d²y/dt² + 5 dy/dt + 6y = 0 with y(0) = 1, dy/dt(0) = 0). Apply the Laplace transform to both sides of the differential equation,
\[ \mathcal{L}\left\{\frac{d^2y}{dt^2}\right\} + 5\,\mathcal{L}\left\{\frac{dy}{dt}\right\} + \mathcal{L}\{6y\} = 0 \quad\text{so}\quad s^2Y(s) - sy(0) - \frac{dy}{dt}(0) + 5\left(sY(s) - y(0)\right) + 6Y(s) = 0 , \]
substitute the initial values into the transformed equation,
\[ s^2Y(s) - s + 5(sY(s) - 1) + 6Y(s) = 0 , \]
and reorganise the transformed equation such that Y(s) is the subject of the equation,
\[ (s^2 + 5s + 6)Y(s) = s + 5 , \quad\text{i.e.}\quad Y(s) = \frac{s+5}{s^2+5s+6} . \]
We now take the inverse Laplace transform of both sides,
\[ y(t) = \mathcal{L}^{-1}\{Y(s)\} = \mathcal{L}^{-1}\left\{\frac{s+5}{s^2+5s+6}\right\} . \qquad (2.3) \]
Considering the right-hand side of (2.3),
\[ \frac{s+5}{s^2+5s+6} = \frac{s+5}{(s+2)(s+3)} = \frac{A}{s+2} + \frac{B}{s+3} = \frac{A(s+3) + B(s+2)}{(s+2)(s+3)} \]
so A(s + 3) + B(s + 2) = s + 5 and putting s = −2 and then s = −3 gives A = 3 and B = −2.
Then
\[ y(t) = \mathcal{L}^{-1}\{Y(s)\} = \mathcal{L}^{-1}\left\{\frac{3}{s+2} + \frac{(-2)}{s+3}\right\} = 3\,\mathcal{L}^{-1}\left\{\frac{1}{s+2}\right\} - 2\,\mathcal{L}^{-1}\left\{\frac{1}{s+3}\right\} = 3e^{-2t} - 2e^{-3t} . \]
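The three steps of the Laplace-transform method can be mirrored directly in SymPy. This is a sketch, assuming SymPy is available, for the worked example just completed (SymPy's inverse transform typically carries a Heaviside(t) factor):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
Y = sp.Symbol('Y')

# Step 1: transformed equation for y'' + 5y' + 6y = 0 with y(0) = 1, y'(0) = 0
transformed = sp.Eq((s**2*Y - s*1 - 0) + 5*(s*Y - 1) + 6*Y, 0)
# Step 2: make Y(s) the subject
Ys = sp.solve(transformed, Y)[0]           # (s + 5)/(s**2 + 5*s + 6)
# Step 3: invert the transform
y = sp.inverse_laplace_transform(Ys, s, t)
print(sp.simplify(y))                      # expect 3*exp(-2*t) - 2*exp(-3*t)
```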
Let’s do another worked example, this time inhomogeneous, of degenerate type.
Solution (of d²y/dt² + 5 dy/dt + 6y = e^{−2t} with y(0) = 1, dy/dt(0) = −1). Apply the Laplace transform to both sides of the differential equation,
\[ \mathcal{L}\left\{\frac{d^2y}{dt^2}\right\} + 5\,\mathcal{L}\left\{\frac{dy}{dt}\right\} + 6\,\mathcal{L}\{y\} = \mathcal{L}\{e^{-2t}\} \quad\text{so}\quad s^2Y(s) - sy(0) - \frac{dy}{dt}(0) + 5(sY(s) - y(0)) + 6Y(s) = \frac{1}{s+2} , \]
substitute the initial values into the transformed equation,
\[ s^2Y(s) - s + 1 + 5(sY(s) - 1) + 6Y(s) = \frac{1}{s+2} , \]
and reorganise the transformed equation so that Y(s) is the subject of the equation,
\[ (s^2+5s+6)Y(s) = (s+2)(s+3)Y(s) = \frac{1}{s+2} + s + 4 = \frac{1 + (s+4)(s+2)}{s+2} = \frac{s^2+6s+9}{s+2} = \frac{(s+3)^2}{s+2} , \]
so
\[ Y(s) = \frac{(s+3)^2}{(s+2)^2(s+3)} = \frac{s+3}{(s+2)^2} . \]
We can now take the inverse Laplace transform of both sides,
\[ y(t) = \mathcal{L}^{-1}\{Y(s)\} = \mathcal{L}^{-1}\left\{\frac{s+3}{(s+2)^2}\right\} = \mathcal{L}^{-1}\left\{\frac{(s+2)+1}{(s+2)^2}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s+2} + \frac{1}{(s+2)^2}\right\} \]
\[ = \mathcal{L}^{-1}\left\{\frac{1}{s+2}\right\} + \mathcal{L}^{-1}\left\{\frac{1}{(s+2)^2}\right\} = e^{-2t} + \mathcal{L}^{-1}\left\{\frac{1}{s^2}\bigg|_{s\to s+2}\right\} = e^{-2t} + te^{-2t} = e^{-2t}(1+t) . \]
Solution (of d²y/dt² + 9y = 0 with y(0) = 0, dy/dt(0) = 1). Apply the Laplace transform to both sides of the differential equation,
\[ \mathcal{L}\left\{\frac{d^2y}{dt^2}\right\} + \mathcal{L}\{9y\} = s^2Y(s) - sy(0) - \frac{dy}{dt}(0) + 9Y(s) = 0 . \]
Substituting the initial values,
\[ s^2Y(s) - 1 + 9Y(s) = 0 . \]
Reorganise the transformed equation so that Y(s) is the subject of the equation,
\[ s^2Y(s) + 9Y(s) = 1 \quad\text{so}\quad Y(s) = \frac{1}{s^2+9} . \]
Take the inverse Laplace transform of both sides,
\[ y(t) = \mathcal{L}^{-1}\{Y(s)\} = \mathcal{L}^{-1}\left\{\frac{1}{s^2+9}\right\} = \frac{1}{3}\,\mathcal{L}^{-1}\left\{\frac{3}{s^2+9}\right\} = \frac{1}{3}\sin 3t . \]
[Figure: a series LRC circuit with inductance L, resistance R, capacitance C, source e(t) and a switch closed at t = 0; i(t) is the current and q(t) the charge on the capacitor.]
Before closing the switch at time t = 0 the charge, q, on the capacitor and the resulting
current, i = dq/dt, in the circuit are zero. Applying Kirchhoff’s second law to the circuit
gives a second-order inhomogeneous differential equation for the charge on the capacitor,
\[ L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{1}{C}q = e(t) , \]
where
\[ q(0) = 0 \quad\text{and}\quad \frac{dq}{dt}(0) = 0 . \]
In the circuit equation given above the different components have the following values,
R = 160 Ω, L = 1 H, C = 10−4 F and e(t) = 20 V.
Substitute the values of the electrical properties into the differential equation:
\[ \frac{d^2q}{dt^2} + 160\frac{dq}{dt} + 10^4 q = 20 . \]
Take the Laplace transform of the differential equation:
\[ \mathcal{L}\left\{\frac{d^2q}{dt^2}\right\} + 160\,\mathcal{L}\left\{\frac{dq}{dt}\right\} + 10^4\,\mathcal{L}\{q\} = s^2Q(s) - sq(0) - \frac{dq}{dt}(0) + 160(sQ(s) - q(0)) + 10^4 Q(s) = \mathcal{L}\{20\} = \frac{20}{s} . \]
Substitute in the initial values:
\[ (s^2 + 160s + 10^4)Q(s) = \frac{20}{s} . \]
Make the Laplace transform of the solution of the differential equation the subject of the equation:
\[ Q(s) = \frac{20}{s(s^2 + 160s + 10^4)} . \]
To take the inverse Laplace transform of the right-hand side, it needs to be represented using partial fractions. Note that s² + 160s + 10⁴ doesn't have real factors so we shall have to complete the square at some stage.
\[ \frac{20}{s(s^2+160s+10^4)} = \frac{A}{s} + \frac{Bs+C}{s^2+160s+10^4} = \frac{A(s^2+160s+10^4) + Bs^2 + Cs}{s(s^2+160s+10^4)} . \]
Equating terms, constant: 10⁴A = 20; s: 160A + C = 0; s²: A + B = 0.
These equations have the solution A = 1/500, B = −1/500, C = −160/500 (= −8/25), so
\[ Q(s) = \frac{20}{s(s^2+160s+10^4)} = \frac{1}{500}\left(\frac{1}{s} - \frac{s+160}{s^2+160s+10^4}\right) . \]
We now complete the square,
\[ s^2 + 160s + 10000 = (s+80)^2 + 3600 = (s+80)^2 + 60^2 . \]
Then
\[ Q(s) = \frac{1}{500}\left(\frac{1}{s} - \frac{s+160}{(s+80)^2+60^2}\right) = \frac{1}{500}\left(\frac{1}{s} - \frac{s+80}{(s+80)^2+60^2} - \frac{4}{3}\,\frac{60}{(s+80)^2+60^2}\right) . \]
Then
\[ \mathcal{L}^{-1}\left\{\frac{20}{s(s^2+160s+10^4)}\right\} = \frac{1}{500}\,\mathcal{L}^{-1}\left\{\frac{1}{s} - \frac{s}{s^2+60^2}\bigg|_{s\to s+80} - \frac{4}{3}\,\frac{60}{s^2+60^2}\bigg|_{s\to s+80}\right\} = \frac{1}{500}\left(1 - e^{-80t}\cos 60t - \frac{4}{3}e^{-80t}\sin 60t\right) . \]
Therefore the solution reads
\[ q(t) = \mathcal{L}^{-1}\{Q(s)\} = \frac{1}{500}\left(1 - e^{-80t}\left(\cos 60t + \frac{4}{3}\sin 60t\right)\right) . \]
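The inversion of Q(s), including the completion of the square, can be checked with SymPy (a sketch, assuming SymPy is installed; the call may take a few seconds and the result carries a Heaviside(t) factor):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

Q = 20 / (s * (s**2 + 160*s + 10**4))      # transform of the charge found above
q = sp.inverse_laplace_transform(Q, s, t)
# expect (1 - exp(-80*t)*(cos(60*t) + 4*sin(60*t)/3))/500
print(sp.simplify(q))
```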
2.3 Differential Equations and the Dirac Delta Function
So far we have solved differential equations using the Laplace-transform method without
the direct application of techniques taken from calculus. It has offered a number of
advantages compared to other techniques. We shall consider the application of the Laplace
transform method to a class of differential equations that are not easily solved any other
way.
Laplace transforms can be used to solve problems involving an impulsive force or current
(i.e., charge), where the impulse is delivered over a short time interval, say (t0 , t1 ),
\[ I = \int_{t_0}^{t_1} f(t)\,dt , \]
where I is the total momentum input to the system if f(t) is a force.
Suppose that the applied force is given by the function
\[ f_\varepsilon(t) = \begin{cases} 1/\varepsilon & \text{for } 0 < t_0 < t < t_0 + \varepsilon \\ 0 & \text{otherwise} \end{cases} . \]
The above function should be interpreted as a constant force 1/ε applied over a time
interval of length ε. By construction,
\[ I_\varepsilon = \int_0^\infty f_\varepsilon(t)\,dt = \int_{t_0}^{t_0+\varepsilon} \frac{1}{\varepsilon}\,dt = 1 \]
so that the total impulse Iε is the area under the curve fε (t) and is independent of ε.
Taking the limit ε → 0,
fε (t) → δ(t − t0 )
where δ(t − t0 ) is called the Dirac delta function. This is a peculiar name for this
construct as it does not have all of the properties of a function. For this reason it is
a member of a class called generalised functions. The Dirac delta function is zero
everywhere except at t = t0 where it has a singularity and is therefore undefined. The
significant point is the Dirac delta function has the property
\[ \int_0^\infty \delta(t-t_0)\,dt = 1 . \]
More generally,
\[ \int_0^\infty \delta(t-t_0)f(t)\,dt = f(t_0) \]
(as long as f is continuous at t0). This is called the sifting property of the Dirac delta function as it makes it possible to isolate a particular value of a function.
This means the Laplace transform of a Dirac delta function can be evaluated:
\[ \mathcal{L}\{\delta(t-a)\} = \int_0^\infty \delta(t-a)e^{-st}\,dt = e^{-as} . \]
Therefore in principle any differential equation involving an impulse delivered over a very
short time interval can be solved using Laplace transforms.
The last construct we need before we can start solving differential equations involving Dirac delta functions is the second shift theorem: if \( \mathcal{L}\{g(t)\} = G(s) \), then
\[ \mathcal{L}^{-1}\{e^{-as}G(s)\} = \begin{cases} g(t-a) & \text{for } t > a \\ 0 & \text{for } t < a \end{cases} . \]
This is pretty dry stuff. It will become a little clearer when you see an application of the second shift theorem, for example to
\[ F(s) = \frac{e^{-3s}}{s^2} . \]
Here G(s) = 1/s² so g(t) = t and a = 3, giving f(t) = t − 3 for t > 3 and f(t) = 0 for t < 3.
Another worked example to see how the second shift theorem can be used: find the inverse Laplace transform of F(s) = e^{−2s}/(s + 7).
Solution. Here G(s) = 1/(s + 7) so g(t) = L^{-1}{G(s)} = L^{-1}{1/(s + 7)} = e^{−7t}, while a = 2.
Applying the 2nd shift theorem,
\[ f(t) = \mathcal{L}^{-1}\{F(s)\} = \begin{cases} e^{-7(t-2)} & \text{for } t > 2 \\ 0 & \text{for } t < 2 \end{cases} . \]
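SymPy expresses second-shift inverses using the Heaviside step function, which encodes the same two-case answer in a single expression (a sketch, assuming SymPy is available):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

F = sp.exp(-2*s) / (s + 7)
f = sp.inverse_laplace_transform(F, s, t)
# Heaviside(t - 2) switches the answer on at t = 2, matching the two-case form above;
# SymPy may print the exponent as exp(14 - 7*t), equivalent to exp(-7*(t - 2))
print(f)
```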
We now have all of the tools in place to solve problems involving instantaneous impulses.
Consider a freely vibrating system without damping. The dynamics of the system are
governed by a differential equation of the form
\[ \frac{d^2y}{dt^2} + \omega^2 y = 0 . \]
(See Section 2.4 in the first semester's material.) If the system is subjected to an instantaneous impulse of magnitude b at a time t = a, the second-order differential equation is modified to
\[ \frac{d^2y}{dt^2} + \omega^2 y = b\,\delta(t-a) . \]
Example 2.2. Let's use Laplace transforms to solve the initial value problem
\[ \frac{d^2y}{dt^2} + 4y = \delta(t-3) , \]
with y(0) = 1 and dy/dt(0) = 0.
We follow the same solution strategy as in the previous examples of solving differential equations using the Laplace transform method. Taking the Laplace transform and substituting the initial values gives
\[ s^2Y(s) - s + 4Y(s) = e^{-3s} , \quad\text{so}\quad Y(s) = \frac{s}{s^2+4} + e^{-3s}\,\frac{1}{s^2+4} . \]
The first term has inverse Laplace transform cos 2t.
To find the second inverse Laplace transform,
\[ \mathcal{L}^{-1}\left\{e^{-3s}\,\frac{1}{s^2+4}\right\} , \]
we use the second shift theorem. As
\[ \mathcal{L}^{-1}\left\{\frac{1}{s^2+4}\right\} = \frac{1}{2}\,\mathcal{L}^{-1}\left\{\frac{2}{s^2+4}\right\} = \frac{1}{2}\sin 2t , \]
the 2nd shift theorem says
\[ \mathcal{L}^{-1}\left\{e^{-3s}\,\frac{1}{s^2+4}\right\} = \begin{cases} \frac{1}{2}\sin 2(t-3) & \text{for } t > 3 \\ 0 & \text{for } t < 3 \end{cases} . \]
Putting these results together gives the solution,
\[ y(t) = \mathcal{L}^{-1}\{Y(s)\} = \begin{cases} \cos 2t + \frac{1}{2}\sin 2(t-3) & \text{for } t > 3 \\ \cos 2t & \text{for } t < 3 \end{cases} . \]
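Example 2.2 can be reproduced end to end in SymPy; the delta forcing appears in the transformed equation as e^{-3s} (a sketch, assuming SymPy is available):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
Y = sp.Symbol('Y')

# Transformed equation for y'' + 4y = delta(t - 3) with y(0) = 1, y'(0) = 0
Ys = sp.solve(sp.Eq(s**2*Y - s + 4*Y, sp.exp(-3*s)), Y)[0]
y = sp.inverse_laplace_transform(Ys, s, t)
# expect cos(2*t)*Heaviside(t) + sin(2*(t - 3))*Heaviside(t - 3)/2
print(sp.simplify(y))
```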
Question 2.6. Use Laplace transforms to solve the initial value problem
\[ \frac{d^2y}{dt^2} + 6\frac{dy}{dt} + 8y = 2\delta(t-7) , \qquad y(0) = 0 , \quad \frac{dy}{dt}(0) = 6 . \qquad (2.5) \]
Taking Laplace transforms of (2.5) and substituting the initial values gives
\[ (s^2 + 6s + 8)Y(s) = 6 + 2e^{-7s} , \quad\text{so}\quad Y(s) = \frac{6}{s^2+6s+8} + \frac{2e^{-7s}}{s^2+6s+8} . \]
To find the inverses we have to represent the quotients using partial fractions, where
\[ s^2 + 6s + 8 = (s+2)(s+4) \]
so
\[ \frac{2}{s^2+6s+8} = \frac{2}{(s+2)(s+4)} = \frac{A}{s+2} + \frac{B}{s+4} = \frac{A(s+4) + B(s+2)}{(s+2)(s+4)} , \]
giving A(s + 4) + B(s + 2) = 2. Thus, by taking s = −2 and then s = −4, A = 1 and B = −1.
It follows that
\[ \frac{2}{s^2+6s+8} = \frac{1}{s+2} - \frac{1}{s+4} \quad\text{and}\quad \frac{6}{s^2+6s+8} = \frac{3}{s+2} - \frac{3}{s+4} , \]
so
\[ \mathcal{L}^{-1}\left\{\frac{2}{s^2+6s+8}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s+2}\right\} - \mathcal{L}^{-1}\left\{\frac{1}{s+4}\right\} = e^{-2t} - e^{-4t} \]
and
\[ \mathcal{L}^{-1}\left\{\frac{6}{s^2+6s+8}\right\} = 3e^{-2t} - 3e^{-4t} . \]
Then, by the second shift theorem,
\[ \mathcal{L}^{-1}\left\{\frac{2e^{-7s}}{s^2+6s+8}\right\} = \begin{cases} 0 & \text{for } t < 7 \\ e^{-2(t-7)} - e^{-4(t-7)} & \text{for } t > 7 \end{cases} . \]
Adding the two contributions gives the solution y(t), which is plotted in Fig. 2.3.
Figure 2.3: Graph of the solution of the initial value problem (2.5).
The Laplace Transform Method.
Step 1. Take the Laplace Transform of the given differential equation.
Step 2. Make the transformed variable (Y (s) above) the subject of the transformed
equation.
Step 3. Apply the inverse Laplace transform to find y(t).
For a system of differential equations the second step is more complicated as it involves
the solution of a system of (simultaneous) algebraic equations rather than one equation.
By way of example consider the following initial value problem.
Question 2.7. Solve the following initial value problem using the Laplace transform
method:
\[ \frac{dx_1}{dt} = x_1 + 2x_2 , \qquad \frac{dx_2}{dt} = 2x_1 - 2x_2 , \qquad x_1(0) = 2 , \quad x_2(0) = 1 . \]
Solution. The first step is to take the Laplace transform of the differential equations:
\[ \mathcal{L}\left\{\frac{dx_1}{dt}\right\} = \mathcal{L}\{x_1\} + 2\,\mathcal{L}\{x_2\} \quad\text{and}\quad \mathcal{L}\left\{\frac{dx_2}{dt}\right\} = 2\,\mathcal{L}\{x_1\} - 2\,\mathcal{L}\{x_2\} , \]
so, writing X1 = L{x1} and X2 = L{x2},
\[ sX_1 - x_1(0) = X_1 + 2X_2 \quad\text{and}\quad sX_2 - x_2(0) = 2X_1 - 2X_2 . \]
Substituting the initial values and rearranging gives the simultaneous algebraic equations
\[ (s-1)X_1 - 2X_2 = 2 , \qquad -2X_1 + (s+2)X_2 = 1 , \]
which can be solved to give X2 = 1/(s − 2).
Having found X2 we can substitute back into the first of the algebraic equations to give X1:
\[ X_1 = 2\left(1 + \frac{1}{s-2}\right)\frac{1}{s-1} = 2\,\frac{s-1}{s-2}\cdot\frac{1}{s-1} = \frac{2}{s-2} . \]
Now that the simultaneous equations have been solved, taking the inverse Laplace trans-
forms gives the result:
\[ x_1(t) = \mathcal{L}^{-1}\{X_1\} = \mathcal{L}^{-1}\left\{\frac{2}{s-2}\right\} = 2e^{2t} , \qquad x_2(t) = \mathcal{L}^{-1}\{X_2\} = \mathcal{L}^{-1}\left\{\frac{1}{s-2}\right\} = e^{2t} . \]
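For systems, the second step of the method becomes the solution of simultaneous algebraic equations, which SymPy handles directly (a sketch, assuming SymPy is available):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
X1, X2 = sp.symbols('X1 X2')

# Transformed system for x1' = x1 + 2x2, x2' = 2x1 - 2x2, x1(0) = 2, x2(0) = 1
eqs = [sp.Eq(s*X1 - 2, X1 + 2*X2), sp.Eq(s*X2 - 1, 2*X1 - 2*X2)]
sol = sp.solve(eqs, [X1, X2])              # {X1: 2/(s - 2), X2: 1/(s - 2)}
x1 = sp.inverse_laplace_transform(sol[X1], s, t)
x2 = sp.inverse_laplace_transform(sol[X2], s, t)
print(sp.simplify(x1), sp.simplify(x2))    # 2*exp(2*t) and exp(2*t)
```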
Question 2.8. Solve the following second-order system using the Laplace-transform
method:
\[ \frac{d^2x}{dt^2} + 2x - y = 0 , \qquad \frac{d^2y}{dt^2} - x + 2y = 0 , \]
with x(0) = 4, y(0) = 2, dx/dt(0) = dy/dt(0) = 0.
Solution. Some of the details of the solution will be omitted as it has essentially the
same steps as Question 2.7.
Take the Laplace transform of the differential equations to get
\[ s^2X - x(0)s - \frac{dx}{dt}(0) + 2X - Y = 0 , \]
\[ s^2Y - y(0)s - \frac{dy}{dt}(0) - X + 2Y = 0 . \]
Substitute for the initial values and rearrange so that the unknowns are on one side
\[ (s^2+2)X - Y = 4s , \qquad -X + (s^2+2)Y = 2s . \]
Eliminating Y gives
\[ \left((s^2+2)^2 - 1\right)X = (s^2+1)(s^2+3)X = 4s^3 + 10s , \quad\text{so}\quad X = \frac{4s^3 + 10s}{(s^2+1)(s^2+3)} . \]
To split this into standard forms note that
\[ \frac{2s}{(s^2+1)(s^2+3)} = \frac{s}{s^2+1} - \frac{s}{s^2+3} \]
and
\[ \frac{2s^3}{(s^2+1)(s^2+3)} = s\left(\frac{s^2}{s^2+1} - \frac{s^2}{s^2+3}\right) = s\left(\frac{(s^2+1)-1}{s^2+1} - \frac{(s^2+3)-3}{s^2+3}\right) = s\left(1 - \frac{1}{s^2+1} - 1 + \frac{3}{s^2+3}\right) = s\left(\frac{3}{s^2+3} - \frac{1}{s^2+1}\right) . \]
Hence
\[ X = \frac{6s}{s^2+3} - \frac{2s}{s^2+1} + \frac{5s}{s^2+1} - \frac{5s}{s^2+3} = \frac{3s}{s^2+1} + \frac{s}{s^2+3} , \]
and, from Y = (s² + 2)X − 4s,
\[ Y = \frac{3s}{s^2+1} - \frac{s}{s^2+3} . \]
Taking the inverse Laplace transforms,
\[ x(t) = \mathcal{L}^{-1}\{X(s)\} = 3\,\mathcal{L}^{-1}\left\{\frac{s}{s^2+1}\right\} + \mathcal{L}^{-1}\left\{\frac{s}{s^2+3}\right\} = 3\cos t + \cos\sqrt{3}\,t \]
and similarly
\[ y(t) = \mathcal{L}^{-1}\{Y(s)\} = 3\cos t - \cos\sqrt{3}\,t . \]
There are many examples of systems of differential equations that are formulated as
problems in vibration and circuit simulation. As an example application, a mathematical
model for the simulation of a circuit is given here.
[Figure: a two-loop circuit driven by the source V(t); the left loop contains R1 and L1 carrying the total current i(t) = i1(t) + i2(t), the shared branch contains R2 carrying i1(t), and the right loop contains L2 and R3 carrying i2(t); X marks the node where the current splits.]
Applying Kirchhoff’s 1st law at node X gives
i = i1 + i2 .
Applying Kirchhoff’s 2nd law to the left and right loops in turn then gives the differential
equations
\[ R_1(i_1 + i_2) + L_1\frac{d}{dt}(i_1 + i_2) + R_2 i_1 = V , \]
\[ L_2\frac{di_2}{dt} + R_3 i_2 - R_2 i_1 = 0 . \]
Initially no current flows: i1 (0) = i2 (0) = 0.
Given that R1 = R3 = 10 Ω, R2 = 20 Ω, L1 = L2 = 5 H and V (t) = 200 V, we wish to
find the currents i1 and i2 using the Laplace transform method.
We first substitute the numerical values for the constant terms into the differential equa-
tions:
\[ 10(i_1 + i_2) + 5\frac{d}{dt}(i_1 + i_2) + 20 i_1 = 200 , \]
\[ 5\frac{di_2}{dt} + 10 i_2 - 20 i_1 = 0 . \]
We divide through by common factors in the differential equations to simplify them:
\[ 2(i_1 + i_2) + \frac{di_1}{dt} + \frac{di_2}{dt} + 4 i_1 = 40 , \]
\[ \frac{di_2}{dt} + 2 i_2 - 4 i_1 = 0 . \]
Next we take Laplace transforms (with i1(0) = i2(0) = 0):
\[ 2(I_1 + I_2) + sI_1 + sI_2 + 4I_1 = \frac{40}{s} , \qquad sI_2 + 2I_2 - 4I_1 = 0 , \]
i.e.
\[ (s+6)I_1 + (s+2)I_2 = \frac{40}{s} , \qquad (s+2)I_2 = 4I_1 . \]
Substituting the second equation into the first gives
\[ (s+10)I_1 = \frac{40}{s} , \quad\text{so}\quad I_1 = \frac{40}{s(s+10)} = \frac{4}{s} - \frac{4}{s+10} \quad\text{and}\quad I_2 = \frac{4I_1}{s+2} = \frac{160}{s(s+2)(s+10)} = \frac{8}{s} - \frac{10}{s+2} + \frac{2}{s+10} , \]
using partial fractions again.
Taking inverse Laplace transforms,
\[ i_1(t) = \mathcal{L}^{-1}\{I_1\} = \mathcal{L}^{-1}\left\{\frac{4}{s}\right\} - \mathcal{L}^{-1}\left\{\frac{4}{s+10}\right\} = 4 - 4e^{-10t} , \]
and similarly,
\[ i_2(t) = \mathcal{L}^{-1}\{I_2\} = 8 - 10e^{-2t} + 2e^{-10t} . \]
Linear systems of differential equations are relatively easy to solve using the Laplace
transform method as you can follow the three-step process given above. The second step
for systems of differential equations includes the solution of simultaneous equations.
2.6 Problems
Problem 2.1. Solve the following initial value problems using Laplace transforms:
(a) \( \frac{dy}{dt} + 3y = e^{-2t} \), y(0) = 2;
(b) \( \frac{d^2y}{dt^2} + 2\frac{dy}{dt} + 5y = 1 \), y(0) = 0, \( \frac{dy}{dt}(0) = 0 \);
(c) \( \frac{d^2y}{dt^2} + 4\frac{dy}{dt} + 5y = 3e^{-2t} \), y(0) = 4, \( \frac{dy}{dt}(0) = -7 \);
(d) \( \frac{d^2y}{dt^2} + 8\frac{dy}{dt} + 16y = 16\sin 4t \), y(0) = −1/2, \( \frac{dy}{dt}(0) = 1 \);
(e) \( \frac{d^2y}{dt^2} - 2\frac{dy}{dt} + 2y = \cos t \), y(0) = 1, \( \frac{dy}{dt}(0) = 0 \).
Problem 2.2. Use the second shift theorem to find the function y(t) with the following
Laplace transforms:
(a) \( Y(s) = \frac{e^{-3s}}{s^4} \);   (b) \( Y(s) = e^{-2s} \);   (c) \( Y(s) = \frac{se^{-s}}{s^2+2s+5} \).
Problem 2.3. Solve the initial value problems:
(a) \( \frac{d^2y}{dt^2} + 3\frac{dy}{dt} + 2y = \delta(t-4) \), \( y(0) = \frac{dy}{dt}(0) = 0 \);
(b) \( \frac{1}{2}\frac{d^2y}{dt^2} + \frac{dy}{dt} + y = 2\delta(t-3) \), y(0) = 1, \( \frac{dy}{dt}(0) = 0 \).
Problem 2.4. Application of Kirchhoff’s laws for a two-loop circuit leads to the following
system of differential equations for the currents i1 and i2
\[ L_1\frac{d}{dt}(i_1+i_2) + R_1(i_1+i_2) + R_2 i_1 + L_3\frac{di_1}{dt} = V , \]
\[ L_2\frac{di_2}{dt} + R_3 i_2 - R_2 i_1 - L_3\frac{di_1}{dt} = 0 , \]
\[ i_1(0) = 1 , \qquad i_2(0) = 0 . \]
Solve the system of equations using Laplace transforms, taking R1 = R2 = R3 = 2,
L1 = L2 = L3 = 1 and V = 3.
Problem 2.5. Find the solution x1 (t), x2 (t) of the system of differential equations:
\[ \frac{dx_1}{dt} - x_1 - x_2 = e^{-2t} ; \qquad \frac{dx_2}{dt} - 4x_1 + 2x_2 = -2e^{t} ; \]
with x1 (0) = 0 and x2 (0) = 1.
Answers
3.(a) \( y(t) = \begin{cases} 0 & t < 4 \\ e^{-(t-4)} - e^{-2(t-4)} & t > 4 \end{cases} \);
3.(b) \( y(t) = \begin{cases} e^{-t}(\cos t + \sin t) & t < 3 \\ e^{-t}(\cos t + \sin t) + 4e^{-(t-3)}\sin(t-3) & t > 3 \end{cases} \).
Chapter 3
Geometry
3.1 Introduction
In this and the following chapter we look at vectors and combine them with ideas taken
from calculus. Our starting point is the concept of a vector and vector operations that
you are likely to be familiar with already. This area of mathematics forms the basis of all
CAD packages and is also useful in simulating the stress fields and fluid motion around
complex shapes. A final area of application of interest to students reading for physics and
electrical engineering is electromagnetism.
The vector calculus operators discussed at the end of this part make it possible to state
fundamental systems of partial differential equations in a concise way. Physical laws
amenable to such a treatment include the Navier–Stokes equations (fluid motion) and Maxwell's equations (electromagnetism).
[Figure: a vector with components 1, 2 and 3 along the x, y and z axes respectively.]
It is often convenient to calculate unit vectors. These are vectors with a given direction
and magnitude of one. They are often written as
r̂ .
To calculate a unit vector in the direction of a vector r, we divide r by its magnitude:
r̂ = r/|r| .
For example, by construction, the following vector is a unit vector:
\[ \hat{r} = \frac{1}{\sqrt{14}}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} . \]
[Figure: vector addition in the (x, y) plane, showing a, b and a + b by the triangle/parallelogram rule.]
[Figure: scalar multiples of a vector a, such as 2a, −a and −(1/2)a.]
Geometrically the scalar product is the magnitude of b multiplied by the projection of a
onto b, see Fig. 3.4.
[Figure 3.4: the projection of a onto b, where θ is the angle between the two vectors.]
The projection of a onto b is |a| cos θ, where θ is the angle between the directions of the
two vectors. If you do not see this immediately, note that in Fig. 3.4 the projection of
a onto b is the side adjacent to the angle θ, and the magnitude of a is the hypotenuse.
Hence
a · b = |a||b| cos θ . (3.2)
If the scalar product is zero then, assuming that neither vector is the zero vector, cos θ = 0
which implies that
θ = π/2 = 90◦ .
This is very important as finding vectors that are at right angles often involves finding
zero scalar products:
Given two non-zero vectors a and b then
a·b=0 ⇔ a and b are orthogonal. (3.3)
Orthogonal means at right angles. Sometimes such vectors are described as being nor-
mal to one another.
Another special case for the scalar product is if the scalar product of a vector is taken
with itself:
\[ a\cdot a = |a||a|\cos 0 = |a|^2 . \]
The square root of the scalar product of a vector with itself gives the vector's magnitude:
\[ \sqrt{a\cdot a} = |a| . \]
For scalar products the order of the vectors is immaterial so a · b = b · a.
In Cartesians the scalar product is given by the sum of the pair-wise products of the
coordinates:
\[ a\cdot b = a_1b_1 + a_2b_2 + a_3b_3 , \quad\text{where}\quad a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \text{ and } b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} . \]
Question 3.1. Find the angle between a = (1, −2, 2) and b = (0, 3, −4).
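A quick numerical check of Question 3.1 can be made with NumPy, using exactly the Cartesian formula above (a sketch, assuming NumPy is installed):

```python
import numpy as np

a = np.array([1.0, -2.0, 2.0])
b = np.array([0.0, 3.0, -4.0])

# cos(theta) = a . b / (|a| |b|)
cos_theta = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
theta = np.arccos(cos_theta)
print(np.degrees(theta))   # approximately 159 degrees
```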
3.2.4 Vector product
The vector (or cross) product of two vectors a and b is written
a × b.
As the name suggests the vector product produces a vector. Geometrically the vector
product is a vector with a magnitude given by the area of a parallelogram and direction
perpendicular to the two vectors a and b, see Fig. 3.5. The magnitude of the resulting
vector is given by the area of the parallelogram defined by a and b as shown in Fig. 3.5.
Two directions are perpendicular to the plane defined by the two vectors a and b, vertically
“up” and vertically “down”. In this case the vector is vertically up because of the “right-
hand rule”. This says that if your right thumb represents a and right index finger b, then
the right-hand middle finger (arranged at right angles to thumb and index finger) gives
“up”, the direction of a × b (and an “upward” normal vector, n, to the parallelogram).
An expression for the vector product is then
a × b = |a||b|n sin θ ,
where n denotes the unit vector in the vertical direction and θ is again the angle between
a and b.
If we consider the vector product b × a, it has the same magnitude as a × b but the right-
hand rule determines that the direction of the vector exactly opposite (“down” instead of
“up”). This implies that the order in the vector product is important:
a × b = −b × a .
Since a × b is perpendicular to both a and b,
\[ a\cdot(a\times b) = 0 \quad\text{and}\quad b\cdot(a\times b) = 0 . \]
In Cartesian components the vector product can be written as a determinant,
\[ a\times b = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} , \qquad (3.4) \]
where
\[ a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = a_1 i + a_2 j + a_3 k \quad\text{and}\quad b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = b_1 i + b_2 j + b_3 k . \]
In (3.4) the vector product is given by a 3 × 3 determinant. Recall that this can be
evaluated by breaking it down into three 2×2 determinants:
\[ \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = i\begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} - j\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} + k\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} . \]
If you do not find this an easy formula to remember, a better strategy is to remember (3.4) and then evaluate the 3×3 determinant each time. Note that determinants will be looked at again in Section 6.3.
Question 3.2. Calculate the vector product a × b for a = (2, 1, 0) and b = (2, 3, 0).
Hence find the angle between a and b.
Solution.
\[ a\times b = \begin{vmatrix} i & j & k \\ 2 & 1 & 0 \\ 2 & 3 & 0 \end{vmatrix} = i\begin{vmatrix} 1 & 0 \\ 3 & 0 \end{vmatrix} - j\begin{vmatrix} 2 & 0 \\ 2 & 0 \end{vmatrix} + k\begin{vmatrix} 2 & 1 \\ 2 & 3 \end{vmatrix} = i((1)(0)-(0)(3)) - j((2)(0)-(0)(2)) + k((2)(3)-(1)(2)) = 4k . \]
Thus
\[ a\times b = \begin{pmatrix} 0 \\ 0 \\ 4 \end{pmatrix} . \]
As a × b = |a||b| n sin θ with n a unit vector and θ the angle between a and b, here n = k = (0, 0, 1) and
\[ \sin\theta = \frac{|a\times b|}{|a||b|} = \frac{4}{\sqrt{4+1}\,\sqrt{4+9}} = \frac{4}{\sqrt{5}\sqrt{13}} = \frac{4}{\sqrt{65}} . \]
Therefore the angle θ = sin⁻¹(4/√65) ≈ 0.519 rad ≈ 29.7°.
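The same calculation in NumPy uses np.cross for the vector product (a sketch, assuming NumPy is installed):

```python
import numpy as np

a = np.array([2.0, 1.0, 0.0])
b = np.array([2.0, 3.0, 0.0])

n = np.cross(a, b)                                           # [0, 0, 4]
sin_theta = np.linalg.norm(n) / (np.linalg.norm(a) * np.linalg.norm(b))
print(n, np.degrees(np.arcsin(sin_theta)))                   # about 29.7 degrees
```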
Exercise 3.1. The angular momentum vector about the origin, H, of a particle of mass m moving with velocity v is given by
\[ H = r\times(mv) . \]
If the particle rotates about an axis through the origin with angular velocity ω, so that v = ω × r, use the identity
\[ a\times(b\times c) = (a\cdot c)b - (a\cdot b)c \]
to show that, when r is perpendicular to ω,
\[ H = mr^2\omega , \]
where r = |r|.
3.3 Lines in Three Dimensions
3.3.1 Parametric equation of a line
[Figure: the line through two points with position vectors a and b; b − a is a direction vector for the line.]
A general point r on the line through a and b can be written as
\[ r = a + t(b-a) . \qquad (3.6) \]
Here t can be thought of as a free parameter, t = 0 gives the point a, t = 1 gives the
point b. A value of t between zero and one gives a point between a and b on the line. t
outside of the unit interval gives points on the line either side of a and b.
A more general representation of the equation of a line using vectors is
r = a + tm ,
where a is a point on the line and m is a direction vector (a vector parallel to the line).
The advantage of this type of representation is it is the same whether the line is in 2D or
3D space.
As
\[ a = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix} \quad\text{and}\quad b - a = \begin{pmatrix} 1 \\ 5 \\ -1 \end{pmatrix} , \]
an equation of the line in this case is
\[ r = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix} + t\begin{pmatrix} 1 \\ 5 \\ -1 \end{pmatrix} , \]
or equivalently, r = (1 + t)i + (5t − 2)j + (2 − t)k.
For example if two lines are given by r = c + ta, r = d + tb, with a = (2, 1, 0) and
b = (2, 3, 0), the angle θ between the direction vectors is given by
\[ \cos\theta = \frac{a\cdot b}{|a||b|} = \frac{(2)(2) + (1)(3)}{\sqrt{5}\sqrt{13}} = \frac{7}{\sqrt{65}} . \]
The angle between the lines is then given by
\[ \theta = \cos^{-1}\left(7/\sqrt{65}\right) \approx 0.519 \approx 29.7^\circ . \]
3.4 Equations of a Plane
3.4.1 Non-parametric (Cartesian) equation for a plane
[Figure 3.8: a plane containing a point a and a general point r, with normal vector n; the vector r − a lies in the plane.]
The orientation of a plane is specified in terms of a vector normal to the plane, i.e. at right angles to its surface, here denoted by n, see Fig. 3.8.
In particular, a plane is defined by a vector normal to the plane, n, and a point vector in
the plane, a (see Fig. 3.8). With reference to the figure, if r is any point on the plane,
the vector r − a is a direction vector in the plane, and therefore r − a has an orientation
that is perpendicular to the normal vector, n. If we take a scalar product of r − a and
n, then as these two vectors are at right angles to each other,
\[ (r-a)\cdot n = |r-a||n|\cos\frac{\pi}{2} = 0 \quad\text{so}\quad n\cdot r = n\cdot a . \qquad (3.7) \]
This is the equation of a plane with normal n and having a point vector a lying on it.
One interesting property of the equation of a plane is that if the normal vector is a unit vector, i.e. has a magnitude of one, so that the plane can be described by
\[ \hat{n}\cdot r = d \]
with |n̂| = 1, then the size of the right-hand side, |d| in the equation above, is the perpendicular distance from the origin to the plane.
More generally, the distance h of a point b from a plane n · r = d is given by
h = |d − n · b|/|n| . (3.8)
This can be got by simple trigonometry (see Fig. 3.9). Taking a to be a point on the
plane, the plane’s equation is
n̂ · r = n̂ · a = d so
d = n̂ · a .
[Figure 3.9: a cross-section through the plane, showing the normal n, a point a in the plane, an external point b and the angle θ between a − b and n.]
The perpendicular distance, h, to the plane from b is then the length of the projection of
the vector from b to a, a − b, onto n: h = ||a − b| cos θ| where θ is the angle between
a − b and n. But remember that n · (a − b) = |a − b||n| cos θ. Hence
\[ h = \frac{|n\cdot(a-b)|}{|n|} = \frac{|n\cdot a - n\cdot b|}{|n|} = \frac{|d - n\cdot b|}{|n|} . \]
Note that we would generally write the non-parametric equation of a plane, n · r = d,
where
\[ n = \begin{pmatrix} n_1 \\ n_2 \\ n_3 \end{pmatrix} \quad\text{and}\quad r = \begin{pmatrix} x \\ y \\ z \end{pmatrix} , \]
as
\[ n_1 x + n_2 y + n_3 z = d . \]
Question 3.4. Suppose we have a plane with orientation defined by the normal vector
n = 3i − j + 2k and a point in the plane a = (1, 1, 3). Represent the non-parametric
equation of the plane in vector notation and in Cartesian form, and calculate the perpen-
dicular distance of the plane from the origin.
Solution. The equation of the plane is n · r = n · a. With r = (x, y, z) the left-hand side is n · r = 3x − y + 2z,
while the right-hand side is
\[ n\cdot a = \begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix} = 3 - 1 + 6 = 8 , \]
so the plane is
\[ \begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix}\cdot\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 8 . \]
Converting this to its Cartesian form is straightforward; we simply calculate the scalar product on the left-hand side of the equation:
3x − y + 2z = 8 . (3.9)
In getting the distance from the origin, we might alternatively write the plane n · r = d as n̂ · r = d̂, where n̂ is the unit vector in the direction of n: n̂ = n/|n|. An equation of a plane can simply be multiplied throughout by a scalar to get an alternative form. Here we multiply throughout by 1/|n| to get a new right-hand side d̂ = d/|n|. Here |n| = √(9 + 1 + 4) = √14. Eqn. (3.9) is then replaced by
\[ \frac{3}{\sqrt{14}}x - \frac{1}{\sqrt{14}}y + \frac{2}{\sqrt{14}}z = \frac{8}{\sqrt{14}} \approx 2.138 . \]
Note that this equation has precisely the correct form for reading off the perpendicular distance to the origin, as the left-hand side is the scalar product of the general point, r, with a unit vector normal to the plane.
The perpendicular distance from the plane to the origin, to 4 s.f., is then 2.138.
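Question 3.4 is easily checked numerically: form d = n · a and divide by |n| (a sketch, assuming NumPy is installed):

```python
import numpy as np

n = np.array([3.0, -1.0, 2.0])      # normal vector to the plane
a = np.array([1.0, 1.0, 3.0])       # a point lying in the plane

d = n.dot(a)                         # right-hand side of n . r = d, here 8
distance_to_origin = abs(d) / np.linalg.norm(n)
print(d, distance_to_origin)         # 8.0 and about 2.138
```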
3.4.2 Parametric representation of a plane
Recall the parametric equation of a line,
\[ r = a + td . \]
In a similar way a plane through a point a containing two (non-parallel) direction vectors b and c can be written parametrically as
\[ r = a + sb + tc . \qquad (3.10) \]
In the equation of a line the location of a point is given by a numerical value of t once all of the vectors are defined. For the equation of a plane the location of a point in the plane is prescribed using the two parameters s and t. A schematic diagram of a plane defined in this way is given in Fig. 3.10.
Given (3.7) and (3.10) are equivalent representations of a plane, it is instructive to see
how (3.10) can be converted into the general form (3.7).
The conversion is relatively straightforward. Both representations require a point in the
plane, a. The difference is that (3.7) requires a normal vector, n, perpendicular to the
Figure 3.10: The definition of a plane by a point vector and two direction vectors.
plane and (3.10) requires two direction vectors, b and c, parallel to the plane but not parallel to each other.
The link between n, and b and c, is the vector product
n = b × c.
This gives a vector that is perpendicular to both b and c, and hence it is perpendicular
to the plane, giving the equation of a plane of the form
(b × c) · r = (b × c) · a .
Question. A plane is given parametrically by r = (4, 1, 0) + s(2, −4, −3) + t(−1, 4, 0). Convert it into its Cartesian form and hence determine the perpendicular distance between the origin and the plane.
Solution. In the above, the point in the plane and the two direction vectors are given as
\[ a = \begin{pmatrix} 4 \\ 1 \\ 0 \end{pmatrix} , \quad b = \begin{pmatrix} 2 \\ -4 \\ -3 \end{pmatrix} \quad\text{and}\quad c = \begin{pmatrix} -1 \\ 4 \\ 0 \end{pmatrix} , \]
respectively. Therefore a vector normal to the plane is given by
\[ n = b\times c = \begin{vmatrix} i & j & k \\ 2 & -4 & -3 \\ -1 & 4 & 0 \end{vmatrix} = i\begin{vmatrix} -4 & -3 \\ 4 & 0 \end{vmatrix} - j\begin{vmatrix} 2 & -3 \\ -1 & 0 \end{vmatrix} + k\begin{vmatrix} 2 & -4 \\ -1 & 4 \end{vmatrix} = 12i + 3j + 4k . \]
Then the equation of the plane in the form given by (3.7) is
\[ n\cdot r = n\cdot a = d \quad\text{with}\quad d = n\cdot a = \begin{pmatrix} 12 \\ 3 \\ 4 \end{pmatrix}\cdot\begin{pmatrix} 4 \\ 1 \\ 0 \end{pmatrix} = 48 + 3 + 0 = 51 , \]
i.e., in Cartesian form,
\[ 12x + 3y + 4z = 51 . \]
Since |n| = √(144 + 9 + 16) = 13, the perpendicular distance from the origin to the plane is d/|n| = 51/13 ≈ 3.92.
3.4.3 Plane defined by three point vectors
[Figure 3.11: a plane through three points A, B and C with position vectors OA, OB and OC; the vectors b = OB − OA and c = OC − OA lie in the plane.]
To find the equation of a plane through three points A, B, C in the plane, we note that
the direction vectors
\[ b = \vec{OB} - \vec{OA} \quad\text{and}\quad c = \vec{OC} - \vec{OA} \qquad (3.11) \]
are parallel to the plane (see Fig. 3.11). As we have two vectors that are parallel to the
plane, if we take their vector product the result is a vector that is perpendicular to the
two vectors (3.11) and therefore perpendicular to the plane:
n = b × c.
We can now appeal to the original analysis that gave us the result for a plane given a
point in the plane a and a normal vector n.
Example 3.1. Suppose we have three point vectors a = (1, 0, 1), b = (−2, 5, 0) and
c = (3, 1, 1) lying on a plane.
The vectors
\[ b - a = \begin{pmatrix} -2 \\ 5 \\ 0 \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ 5 \\ -1 \end{pmatrix} \quad\text{and}\quad c - a = \begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} \]
are parallel to the plane, so a normal vector is
\[ n = (b-a)\times(c-a) = \begin{vmatrix} i & j & k \\ -3 & 5 & -1 \\ 2 & 1 & 0 \end{vmatrix} = i - 2j - 13k , \]
and, with d = n · a = 1 − 0 − 13 = −12, the Cartesian equation of the plane is
\[ x - 2y - 13z = -12 . \]
Figure 3.12: Two intersecting planes P1 , P2 defining a line L. Normals to the planes are
n1 and n2 , respectively.
To calculate the line of intersection of two planes we can proceed by finding two points
that are on the line. This allows us to derive a direction vector for the line and we already
have a point on the line. As we have seen in Subsection 3.3.1, this is enough information
to derive the equation of the line. To appreciate this more fully let us consider an example.
Question. Find the line of intersection of the planes 3x + 2y − z = 3 and x − y + z = 1.
Solution. Setting x = 0 in both plane equations gives the simultaneous equations
2y − z = 3 , −y + z = 1 .
These are solved by y = 4 and z = 5. One point on the line of intersection is therefore
r = a = (0, 4, 5).
Similarly, trying y = 0, to look for another point on the line of intersection, and therefore
common to both planes, gives the two simultaneous equations
3x − z = 3 , x + z = 1.
These equations have solution, x = 1 and z = 0. Therefore another point on the line of
intersection is r = b = (1, 0, 0).
Now that we have two points on the line of intersection we can write down its parametric equation:
\[ r = a + t(b-a) = \begin{pmatrix} 0 \\ 4 \\ 5 \end{pmatrix} + t\left(\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} - \begin{pmatrix} 0 \\ 4 \\ 5 \end{pmatrix}\right) = \begin{pmatrix} 0 \\ 4 \\ 5 \end{pmatrix} + t\begin{pmatrix} 1 \\ -4 \\ -5 \end{pmatrix} . \]
Note that it is possible to calculate a direction vector for the line using the normal vectors
for the planes. The line of intersection is in both planes so is perpendicular to both planes’
normals, say n1 and n2 , and therefore a direction vector is given by the vector product
m = n1 × n2 ,
see Fig. 3.12. Using this would mean we would only have to determine one point on the
line. However, the calculation of a vector product is more complicated than the solution
of the second set of simultaneous equations in two unknowns to find a second point on
the line.
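Taking the two planes used in the example above (3x + 2y − z = 3 and x − y + z = 1), both routes — the cross product of the normals and the fixed-coordinate trick — are easy to carry out numerically (a sketch, assuming NumPy is installed):

```python
import numpy as np

n1, d1 = np.array([3.0, 2.0, -1.0]), 3.0    # plane 3x + 2y - z = 3
n2, d2 = np.array([1.0, -1.0, 1.0]), 1.0    # plane x - y + z = 1

m = np.cross(n1, n2)                         # direction of the line, here [1, -4, -5]
# One point on the line: fix x = 0 and solve the remaining 2x2 system for y and z
A = np.array([[n1[1], n1[2]], [n2[1], n2[2]]])
y, z = np.linalg.solve(A, np.array([d1, d2]))
print(m, (0.0, y, z))                        # direction vector and the point (0, 4, 5)
```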
3.4.5 Parallel planes and the angle between two planes
Consider two planes
n1 · r = d1 and n2 · r = d2 .
These planes are parallel if the two normal vectors, n1 and n2, are parallel. This is true if one of them is proportional to the other,
\[ n_1 = C n_2 \]
for some non-zero scalar C. For example, planes with normal vectors n1 = (1, 2, −1) and n2 = (12, 24, −12) are parallel, as n2 = 12 n1.
Two planes that are not parallel will intersect, defining a line as discussed in Subsec-
tion 3.4.4. As the condition for two planes to be parallel is dependent on their normal
vectors, the angle between two intersecting planes can be calculated from the normal
vectors: the angle between the planes is the same as the angle between the normals, see
Fig. 3.13.
Figure 3.13: Two intersecting planes P1, P2, with normals n1 and n2, respectively. The angle between the planes (and between the normals) is θ.
The angle between the two planes is given by the scalar product of the two normal vectors.
In Fig. 3.13 the angle is denoted by θ. The angle between the planes can then be found
by using (3.2):
\[ n_1\cdot n_2 = |n_1||n_2|\cos\theta \quad\text{so}\quad \theta = \cos^{-1}\left(\frac{n_1\cdot n_2}{|n_1||n_2|}\right) . \]
3.4.6 Angle between a line and a plane
[Figure: a line L meeting a plane P; θ is the angle between the line and the normal to the plane, and φ = π/2 − θ is the angle between the line and the plane.]
Question 3.6. A laser located at the point a = (2, 4, 1) is directed at a mirror lying
in the plane given by 2x + 5y + 7z = 11. If the laser intersects the plane at the point
b = (1, −1, 2), what is the angle of incidence of the laser light with the mirror?
Solution. A direction vector for the laser beam is a − b = (1, 5, −1), while a normal to
the mirror is (2, 5, 7).
Hence the angle between the normal to the mirror and the laser is given by
\[ \cos\theta = \frac{n\cdot(a-b)}{|n||a-b|} = \frac{2+25-7}{\sqrt{4+25+49}\,\sqrt{1+25+1}} = \frac{20}{\sqrt{(78)(27)}} = \frac{20}{9\sqrt{26}} . \]
Then θ ≈ 1.120 ≈ 64.16◦ .
(Taking a direction vector or normal vector pointing the other way would have given cos θ = −20/(9√26). We would then want to subtract our value of θ from π (or 180°) to get a value between 0 and π/2 (0 and 90°).)
The angle between the mirror and the laser is therefore π/2 − θ ≈ 0.451 ≈ 25.84◦ .
Question 3.7. Where does the line r = (1, 2, −5) + t(2, −3, 1) intersect the plane
2x + 5y − 3z = 6?
Solution. In component form the line is
\[ x = 1 + 2t , \quad y = 2 - 3t , \quad z = -5 + t . \]
A point of intersection must satisfy this set of equations as well as the equation for the plane. Substituting x, y and z into the equation for the plane gives an equation for t:
\[ 2(1+2t) + 5(2-3t) - 3(-5+t) = 6 , \quad\text{i.e.}\quad 27 - 14t = 6 , \]
so t = 3/2 and the point of intersection is r = (4, −5/2, −7/2).
3.5 Problems
Problem 3.1. What is the angle between the vectors (2, 2, 1) and (1, −1, 1)?
Problem 3.2. Evaluate the dot and cross products of the following pairs of vectors.
Problem 3.3. Find the parametric equation of the line passing through each of the pairs
of points in the previous question.
Problem 3.5. If a = (1, 2, −1) and b = (2, −3, 1), find a unit vector perpendicular to
both a and b.
Problem 3.6. Find parametric and non-parametric equations of the planes passing
through each of the following sets of points.
Problem 3.7. Find a non-parametric equation of the plane passing through each of the
following points a, with the given normal vectors n.
Problem 3.8. The vertices of a triangle are given by A = (1, 1, −1), B = (2, −1, 1),
C = (−1, 1, 1). Use vector operations to determine the angle between the sides AB and
AC.
Problem 3.9. Find the angle between the planes 2x + y − 2z = 5 and 3x − 6y − 2z = 7.
2x + y − 2z = 5 ,
Problem 3.11. For each of the two planes, find the shortest distance from the origin:
(a) The plane passing through a = (2, −3, 1) and parallel to the vectors d1 = (2, 3, 4) and
d2 = (3, 2, 0);
(b) The plane passing through a = (0, 3, 6) and parallel to the vectors d1 = (3, 2, 2) and
d2 = (−3, 1, 4).
Problem 3.12. A plane is given by x + 2y + z = 6. A second plane passes through
a = (3, 2, 1) and is parallel to the first. Find the equation for the second plane and the
distance between the two planes.
Problem 3.13. For the following planes and lines, determine, in each case, whether or
not the line is parallel to the plane. If the plane and line are parallel, find their separation.
If they are not, find the angle between the line and the plane, and also determine the
point at which the line crosses the plane.
(a) P : x + y + z = 2 ; L : r = t(−1, 2, 2).
(b) P : x + 2y − 3z = 0 ; L : r = (1, 1, −1) + t(1, 1, 1).
(c) P : −2x + y + 2z = 1 ; L : r = (1, 1, −2) + t(−1, 6, −4).
(d) P : −2x + y + 2z = 1 ; L : r = (1, 1, −2) + t(2, 2, −1).
Answers
1. 1.377 radians.
2. (a) 32, (−3, 6, −3). (b) 7, (−7, 7, 7). (c) 17, (−5, 9, 2).
3. (a) (1, 2, 3) + t(3, 3, 3). (b) (1, −2, 3) + t(−2, −3, 1). (c) (3, 1, 3) + t(−1, −1, 2).
8. 76◦ (approx.).
9. 79◦ (approx.).
Chapter 4
Vector Differentiation
Example 4.1. Suppose we have a plane descending from 10,000 m towards an airport.
(Planes tend to do this by going in circles as they lose altitude.) If the plane’s location is
given by the co-ordinates x = 10000 cos(t/100), y = 10000 sin(t/100), z = 10000 − 5t, we
can determine its velocity.
Exercise 4.1. Given the vector function
\[ r = \begin{pmatrix} 1+t \\ t^2 \\ \tfrac{2}{3}t^3 \end{pmatrix} , \]
Figure 4.1: The tangent line (dashed) giving a linear approximation to the curve y = x2
(solid) around x = 1.
so that
\[ x = x_0 + s\frac{dx}{dt}(t_0) , \qquad y = y_0 + s\frac{dy}{dt}(t_0) . \]
Eliminating s from these two equations gives
\[ y = y_0 + \frac{dy/dt\,(t_0)}{dx/dt\,(t_0)}(x - x_0) = f(x_0) + f'(x_0)(x - x_0) , \]
on noting that
\[ f'(x) = \frac{dy}{dx}(x) = \frac{dy/dt}{dx/dt} . \]
The same idea can be applied in three dimensions. Consider the vector function x(t) =
(x(t), y(t), z(t)). We are interested in a linear approximation centred at the point vector
given by t = t0 , i.e. on x0 = (x(t0 ), y(t0 ), z(t0 )).
Let R(t) denote the linear approximation, then
\[ R(t) = x(t_0) + (t - t_0)\frac{dx}{dt}(t_0) . \]
Alternatively, the tangent line to the curve x = x(t) at the point x0 is given, on writing
s = t − t0 , by
\[ r = x_0 + s\frac{dx}{dt}(t_0) . \]
In two dimensions, since a vector in the tangent direction is (dx/dt, dy/dt), and (dy/dt, −dx/dt)·
(dx/dt, dy/dt) = 0, a vector directed perpendicularly to the curve is n = (dy/dt, −dx/dt).
Another normal vector is given by n/(dx/dt) = (f 0 (x), −1).
Question 4.1. Consider the trajectory of a particle as given by the vector function
\[ x(t) = \begin{pmatrix} 2t+3 \\ t^2+3t \\ t^3+2t^2 \end{pmatrix} . \qquad (4.6) \]
Find the linear approximation to the trajectory at t = 1.
Solution. Differentiating each component,
\[ \frac{dx}{dt}(t) = \begin{pmatrix} 2 \\ 2t+3 \\ 3t^2+4t \end{pmatrix} . \]
Substituting t = 1 into this, dx/dt(1) = (2, 5, 7), while x(1) = (5, 4, 3).
The linear approximation then reads
\[ R(t) = \begin{pmatrix} 5 \\ 4 \\ 3 \end{pmatrix} + (t-1)\begin{pmatrix} 2 \\ 5 \\ 7 \end{pmatrix} = \begin{pmatrix} 2t+3 \\ 5t-1 \\ 7t-4 \end{pmatrix} . \]
Given a function of the form f(x, y, z), we can define its gradient as
\[ \text{gradient of } f = \operatorname{grad} f = \nabla f = \begin{pmatrix} \partial f/\partial x \\ \partial f/\partial y \\ \partial f/\partial z \end{pmatrix} . \]
The gradient will be useful in a following subsection on tangent planes (as might be used
as approximations to surfaces in 3D).
The gradient has already appeared, in disguised form, in the lecture course in the previous
semester, F18XC.
Figure 4.2: In two dimensions, a vector a, its corresponding unit vector â, and angle of
direction, θ.
Question 4.2. Consider the function f (x, y, z) = xy + yz + xz. Find the “slope” in the
direction v = (1, 2, 3).
Solution. First of all we need the gradient, ∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z), where
\[ \frac{\partial f}{\partial x} = \frac{\partial}{\partial x}(xy+yz+xz) = y+z , \quad \frac{\partial f}{\partial y} = \frac{\partial}{\partial y}(xy+yz+xz) = x+z , \quad \frac{\partial f}{\partial z} = \frac{\partial}{\partial z}(xy+yz+xz) = x+y , \]
so
\[ \nabla f = \begin{pmatrix} y+z \\ x+z \\ x+y \end{pmatrix} . \]
\[ D_v f = \frac{v\cdot\nabla f}{|v|} = \frac{(1,2,3)\cdot(y+z,\,x+z,\,x+y)}{\sqrt{1+4+9}} = \frac{(y+z) + 2(x+z) + 3(x+y)}{\sqrt{14}} = \frac{1}{\sqrt{14}}(5x+4y+3z) . \]
Having defined the directional derivative, it is then possible to find the direction with greatest positive slope. Using the geometric definition of a scalar product,
\[ D_v f = \frac{v\cdot\nabla f}{|v|} = \frac{|v||\nabla f|\cos\theta}{|v|} = |\nabla f|\cos\theta , \]
for θ the angle between the direction vector v and the gradient of f .
Therefore the maximum directional derivative occurs when cos θ = 1, that is, for θ = 0
(v pointing in the direction of fastest increase of f ).
The direction with the greatest slope is
\[ v = \frac{\nabla f}{|\nabla f|} . \]
Question 4.3. Consider the function given in Question 4.2, f (x, y, z) = xy + yz + xz. In
which direction is f increasing the fastest from the point (1, 3, −1)?
Example 4.3. Following the release of a toxic chemical the concentration field is given, at a particular time, by
\[ c = Ce^{x^2+y^2+z^2} . \]
Due to diffusion the rate of change of the toxic chemical at any location is related to the gradient
\[ \nabla c = C\nabla e^{x^2+y^2+z^2} = C\begin{pmatrix} 2xe^{x^2+y^2+z^2} \\ 2ye^{x^2+y^2+z^2} \\ 2ze^{x^2+y^2+z^2} \end{pmatrix} = 2Ce^{x^2+y^2+z^2}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 2Ce^{x^2+y^2+z^2}\,r . \]
In particular, at a point (−1, 1, 2) the gradient vector is
\[ \nabla c = 2Ce^{6}\begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix} . \]
The direction of maximum negative slope (along which the toxic chemical diffuses) is −∇c = 2Ce⁶(1, −1, −2). An alternative, simpler, vector in the same direction is (1, −1, −2). The slope or size of the gradient in that direction (which gives the speed at which the chemical diffuses) is −|∇c| = −2Ce⁶|(−1, 1, 2)| = −2√6 e⁶C.
4.2.2 Equations for a tangent plane and normal line
A surface in three dimensions can be specified by a function as
\[ f(x, y, z) = c , \]
where c is a constant; for example
\[ f(x, y, z) = 2x + 3y + 4z = 1 . \]
Consider the simple case
\[ f(x, y, z) = z = 10 . \]
This is the equation of a plane that is parallel to the x and y axes. If we differentiate
the function with respect to the three independent variables this gives the gradient ∇f =
(0, 0, 1).
The individual components of the gradient give the derivatives in each of the three co-
ordinate directions. So, for example, the function above does not change in the x direction
so the x component of the derivative is zero. Another interpretation of the derivative
vector is as a direction vector. This interpretation leads to the result that in this case the
derivative direction is perpendicular, or normal, to the plane.
Important Result
This is a specific example of the general result that, given a surface, f (r) = c, then
the gradient evaluated at a point on the surface r 0 = (x0 , y0 , z0 ), ∇f (x0 , y0 , z0 ), is
perpendicular to the surface at r 0 , see Fig. 4.3.
(It follows from noting that following any curve r(t) = (x(t), y(t), z(t)) on the surface gives df/dt = dr/dt · ∇f (by the chain rule), while df/dt = dc/dt = 0, so ∇f must be normal to any curve in the surface, hence to all their tangent lines, and hence to the tangent plane.)
This is an important result as it makes it possible to calculate planes at a point on a
surface.
Given a point r 0 = (x0 , y0 , z0 ) on a surface f (r) = c, then n = ∇f (r 0 ) is perpendicular
to the surface at r 0 .
Recall the equation of a plane,
n · r = d,
where n is a vector normal to the plane. Therefore
∇f (r 0 ) · r = d ,
Figure 4.3: The tangent plane to a surface f (r) = c at the point r 0 = (x0 , y0 , z0 ).
is the equation of the tangent plane. The constant d can be found by noting that the point r0 is in the plane, so d = ∇f(r0) · r0.
Similarly, the normal line to the surface at r0 is
\[ r = r_0 + t\nabla f(r_0) , \]
i.e., the gradient at r0 is the direction vector of the normal line passing through r0.
Let us see this put into action: find the tangent plane and normal line to the surface
\[ f(x, y, z) = x^2 + 2y^2 + 3z^2 = 10 \]
at the point r0 = (1, √3, 1). The gradient is ∇f = (2x, 4y, 6z), which at r0 gives the normal vector n = (2, 4√3, 6).
The equation of the tangent plane is therefore
\[ \begin{pmatrix} 2 \\ 4\sqrt{3} \\ 6 \end{pmatrix}\cdot r = 20 . \]
We could tidy it up a little further as 2 is common to all terms and could be removed. The plane can then be written as x + 2√3 y + 3z = 10.
Using the normal vector above, or, more simply, n = (1, 2√3, 3), the equation of the normal line passing through r0 can be written as
\[ r = \begin{pmatrix} 1 \\ \sqrt{3} \\ 1 \end{pmatrix} + t\begin{pmatrix} 1 \\ 2\sqrt{3} \\ 3 \end{pmatrix} . \]
When considering derivatives of a vector function there are two possibilities. These arise
in the same way that “multiplication” of two vectors can take two forms, the scalar or
dot product and the vector or cross product. The divergence operator is related to the
scalar product, it is defined now.
Definition of the divergence operator
Given a vector function
\[ f(x, y, z) = \begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix} , \]
its divergence is the scalar function
\[ \operatorname{div} f = \nabla\cdot f = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} . \]
Note that div takes a vector function and produces a scalar one whereas grad takes a
scalar function and produces a vector one.
Example 4.5. A compressible fluid with varying density ρ(r, t) flows with velocity v =
(v1 , v2 , v3 ). To ensure that mass is conserved, the “continuity equation”,
\[ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho v) = 0 , \]
has to hold.
Suppose that, at some time, q = ρv = (2x − y², x² + 3z, 4y − z²). An evaluation of div q tells us how fast the density is decreasing: here ∇ · q = 2 + 0 − 2z = 2 − 2z, so ∂ρ/∂t = −(2 − 2z).
Question. Evaluate ∇ · f for the vector function f = (3x²y, z, x²).
Solution.
\[ \nabla\cdot f = \begin{pmatrix} \partial/\partial x \\ \partial/\partial y \\ \partial/\partial z \end{pmatrix}\cdot\begin{pmatrix} 3x^2y \\ z \\ x^2 \end{pmatrix} = \frac{\partial}{\partial x}(3x^2y) + \frac{\partial z}{\partial y} + \frac{\partial x^2}{\partial z} = 6xy + 0 + 0 = 6xy . \]
The other derivative operator for vector functions, based on the vector product, is the
curl, defined here:
Definition of curl
Given a vector function
\[ f(x, y, z) = \begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix} , \]
its curl is the vector function
\[ \operatorname{curl} f = \nabla\times f = \begin{vmatrix} i & j & k \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ f_1 & f_2 & f_3 \end{vmatrix} = \left(\frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z}\right)i - \left(\frac{\partial f_3}{\partial x} - \frac{\partial f_1}{\partial z}\right)j + \left(\frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y}\right)k . \]
Example 4.6. Curl can be thought of as giving a measure of how a vector field turns.
Suppose that a fluid rotates as a rigid body with angular velocity ω about an axis through
the origin O. Then the velocity v of the fluid at a point with position vector r = (x, y, z)
is given by
\[ v = \omega\times r = \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix}\times\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{vmatrix} i & j & k \\ \omega_1 & \omega_2 & \omega_3 \\ x & y & z \end{vmatrix} = (\omega_2 z - \omega_3 y)i + (\omega_3 x - \omega_1 z)j + (\omega_1 y - \omega_2 x)k . \]
Then
\[ \nabla\times v = \begin{vmatrix} i & j & k \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ \omega_2 z - \omega_3 y & \omega_3 x - \omega_1 z & \omega_1 y - \omega_2 x \end{vmatrix} = \left(\frac{\partial}{\partial y}(\omega_1 y - \omega_2 x) - \frac{\partial}{\partial z}(\omega_3 x - \omega_1 z)\right)i \]
\[ + \left(\frac{\partial}{\partial z}(\omega_2 z - \omega_3 y) - \frac{\partial}{\partial x}(\omega_1 y - \omega_2 x)\right)j + \left(\frac{\partial}{\partial x}(\omega_3 x - \omega_1 z) - \frac{\partial}{\partial y}(\omega_2 z - \omega_3 y)\right)k \]
\[ = 2\omega_1 i + 2\omega_2 j + 2\omega_3 k = 2\omega . \]
Returning to the vector field q = (2x − y², x² + 3z, 4y − z²) of Example 4.5, we can also compute its curl.
Solution.
curl q = ∇ × q = det[ i, j, k ; ∂/∂x, ∂/∂y, ∂/∂z ; 2x − y², x² + 3z, 4y − z² ]
= i ( ∂(4y − z²)/∂y − ∂(x² + 3z)/∂z ) − j ( ∂(4y − z²)/∂x − ∂(2x − y²)/∂z ) + k ( ∂(x² + 3z)/∂x − ∂(2x − y²)/∂y )
= (4 − 3)i − (0 − 0)j + (2x + 2y)k = i + 2(x + y)k .
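The divergence and curl of q found above can be spot-checked symbolically; the following is a minimal sympy sketch (names are illustrative).

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
q = (2*x - y**2, x**2 + 3*z, 4*y - z**2)

div_q = sp.diff(q[0], x) + sp.diff(q[1], y) + sp.diff(q[2], z)
curl_q = (sp.diff(q[2], y) - sp.diff(q[1], z),   # i component
          sp.diff(q[0], z) - sp.diff(q[2], x),   # j component
          sp.diff(q[1], x) - sp.diff(q[0], y))   # k component

print(div_q)    # 2 - 2*z
print(curl_q)   # (1, 0, 2*x + 2*y)
```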
In some circumstances the div and curl of a physical property can have physical mean-
ing, for example in the vector field of the velocity v of an incompressible fluid, mass
conservation can be expressed as
div v = ∂v1 /∂x + ∂v2 /∂y + ∂v3 /∂z = 0 .
This can be read as saying that the net flow of fluid into any control volume is zero.
Similarly a velocity field for a fluid where the curl is zero, curl v = 0, means the fluid has
no rotational motion.
The vector operators div and curl and the scalar operator grad can be combined. For
example,
div(grad f ) = ∇ · (∇f ) = ∂²f /∂x² + ∂²f /∂y² + ∂²f /∂z² .
Two further combinations vanish identically:
curl grad f = 0 ,        div curl f = 0 .
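The identity curl grad f = 0 just quoted can be spot-checked for any particular smooth scalar field; a minimal sympy sketch (the field f below is an arbitrary illustration, not taken from the notes):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 * y * z + sp.sin(y * z)            # any smooth scalar field will do

grad_f = [sp.diff(f, v) for v in (x, y, z)]
curl_grad_f = (sp.diff(grad_f[2], y) - sp.diff(grad_f[1], z),
               sp.diff(grad_f[0], z) - sp.diff(grad_f[2], x),
               sp.diff(grad_f[1], x) - sp.diff(grad_f[0], y))

print([sp.simplify(c) for c in curl_grad_f])   # [0, 0, 0]
```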
An advantage of working with the vector and scalar operators is that they provide a short-
hand for systems of partial differential equations. For example, in a solid material the
mechanism for heat transfer is conduction; this leads to a partial differential equation that
the temperature field T (x, y, z, t) satisfies. Without using vector and scalar operators the
relevant differential equation reads
ρc ∂T /∂t = ∂/∂x ( k ∂T /∂x ) + ∂/∂y ( k ∂T /∂y ) + ∂/∂z ( k ∂T /∂z ) ,
where ρ is the density, c is the specific heat and k is the thermal conductivity. Using
vector and scalar operators the differential equation reads
ρc ∂T /∂t = div (k grad T ) = ∇ · (k∇T ) .
One final example is the system of equations in electromagnetism, Maxwell’s equations.
Using vector and scalar operators these are, in a vacuum (without any electric charge or
current),
∇ · H = 0 ,    ∇ · E = 0 ,    ε0 ∂E/∂t = ∇ × H ,    µ0 ∂H/∂t = −∇ × E .
4.4 Summary of Vector Geometry
In Chapter 3 we saw that a line in three dimensions can be written in the parametric form
r = a + tb ,
and that a plane has the non-parametric (Cartesian) equation
n · r = d.
The equation of the plane can take a few different forms depending on what information
is available. For example if you have three points in the plane, a, b, and c, then the
normal vector defining the orientation of the plane must be constructed by deriving two
direction vectors parallel to the plane and using the vector product,
n = (a − b) × (b − c) .
Further relationships between planes and lines, for example the line of intersection between
planes or the angle between two intersecting planes, could then be determined without
the need to visualise them in three-dimensional space.
Once comfortable with the analysis of planes and lines, the calculus of vectors was in-
troduced. This made it possible to approximate three-dimensional curves by a linear
approximation and made it possible to construct planes as local approximations to three-
dimensional surfaces, i.e. ∇f (r 0 ) · r = d is the tangent (approximating) plane to the
surface f (r) = c at r 0 on taking d = ∇f (r 0 ) · r 0 .
We also found that for a scalar function f we could calculate the directional derivative in
any direction defined by a vector v, namely (v/|v|) · ∇f .
4.5 Problems
Problem 4.1. If a = 5t2 i + tj − t3 k and b = sin(t)i − cos(t)j, find
(a) d(a · b)/dt ,    (b) d(a × b)/dt ,    (c) d(a · a)/dt .
Problem 4.2. Using the laws given in the lectures, show that
d(a · b × c)/dt = a · b × dc/dt + a · db/dt × c + da/dt · b × c .
Problem 4.3. A particle moves along a curve r(t) = (x(t), y(t), z(t)), x = t, y = t2 ,
z = 2t3 /3.
(a) Find its velocity (dr/dt) and acceleration (d2 r/dt2 ) at time t = 1.
(b) Find the linear approximation of the particle trajectory, velocity and acceleration
at t = 1
Problem 4.4. Find the gradient ∇f of the following scalar functions.
(a) f (x, y, z) = x²y³z⁴ .
(c) f (x, y, z) = x²yz + xy²z .
Problem 4.5. Calculate the divergence and curl of the following vector fields.
Problem 4.6. Suppose that two vector fields are given by g(r) = r = (x, y, z) and
h(r) = ω × r, where ω is a constant vector. Show that:
(a) ∇ · g = 3; ∇ × g = (0, 0, 0).
(b) ∇ · h = 0; ∇ × h = 2ω.
Problem 4.7. For each of the scalar functions in Problem 4.4, calculate the directional
derivative a.∇f , at the point r = (π, π, 1), a = (1, 2, 2).
Problem 4.8. Find the equations for the normal line and tangent plane for the following
surfaces, at the points indicated.
Problem 4.9. Compute the directional derivative at position (2, 1, −1) of the scalar
field φ = x²yz³ in the direction (1, 0, 0). In which direction is the directional derivative
at (2, 1, −1) a maximum?
Problem 4.10. Show that if f (x, y, z) = (yz, xz, xy) then ∇ × f = 0, and find a simple
scalar field φ(x, y, z) such that f = grad φ. Can you find another φ(x, y, z) such that
f = grad φ still holds?
Problem 4.11 (Advanced). Show that if the scalar field φ(x, y, z) is a solution of
Laplace’s equation ∇ · (∇φ) = 0, then for grad φ we have that ∇ × grad φ = 0 and
∇ · grad φ = 0.
Problem 4.12 (Advanced). Suppose that φ(x, y, z) is a scalar field, and that f (x, y, z)
is a vector field. Prove the following:
1. curl(grad (φ)) = 0;
Answers
3. (b) (t, −1 + 2t, −4/3 + 2t), (1, 2t, −2 + 4t), (0, 2, 4t).
5. (a) 2xy 2 + 2xz + 2z, −2xyi + (2yz − 2x2 y)k. (b) 2xy 2 + 2yz 2 + 2x2 z, −2y 2 zi − 2xz 2 j −
2x2 yk. (c) cos x − sin y + sec2 z, 0.
7. (a) 8π⁴(1 + π)/3. (b) 1/3. (c) π²(9 + 4π)/3.
8. (a) (1, 2, 3) + u(5, 4, 3), 5x + 4y + 3z = 22. (b) (3, 2, 5) + u(6, 16, −10), 3x + 8y − 5z = 0.
(c) (−1, −1, 1) + u(−3, −3, 0), x + y = −2.
Chapter 5
• Hydraulic networks: hydraulic head at junctions and the rate of flow (discharge) for
connecting pipes;
• Airline scheduling: flight/ticket availability.
In any of these contexts, the system of algebraic equations that we must solve will in
many cases be linear or at least can be well approximated by a linear system of equations.
Linear algebraic equations are characterised by the property that no variable is raised to
a power other than one or is multiplied by any other variable. The question is: is there a
systematic procedure for solving such systems?
• Does it have a solution? What conditions are required on the coefficients aij and
the constants bj for the system to have a solution?
Consider first a single equation for a single unknown x,
a x = b .                                                       (5.2)
1. If a ≠ 0, equation (5.2) has the unique solution
x = b/a .
For example the equation
3x = 12
has unique solution x = 12/3 = 4.
2. If a = 0 but b ≠ 0, then (5.2) becomes
0 × x = b.
Since b is not zero, no value of x will make this statement true, so (5.2) has no solutions
in this case.
3. If both a = 0 and b = 0, then (5.2) becomes
0 × x = 0.
This is true for any x. So (5.2) has infinitely many solutions in this case.
We shall see that the three cases in the above example exactly mirror the general case.
The coefficients aij , i = 1, . . . , m, j = 1, . . . , n appearing on the left-hand side of the system of
equations (5.1) can be arranged into a rectangular array
A = ( a11  a12  . . .  a1n ; a21  a22  . . .  a2n ; . . . ; am1  am2  . . .  amn )          (5.3)
called an m × n matrix, more specifically, the coefficient matrix. Here m is the number
of rows in the matrix (the number of equations) and n is the number of columns (the
number of unknowns). A number positioned in the i-th row and the j-th column is called
the (ij) matrix entry. For the matrix at hand the i-th row thus corresponds to the i-th
equation while the j-th column corresponds to the j-th variable xj .
The coefficients bi , i = 1, . . . , m on the right-hand sides of the equations are arranged into
a column
b = (b1 , b2 , . . . , bm ) (written as a column).
We put all of the data specifying the system (5.1) into an m × (n + 1) augmented
matrix by adding to the matrix A one extra column on the right. To emphasise that the
coefficients bi are of a different nature than the coefficients aij we may separate the extra
column by a vertical bar:
(A|b) = ( a11  a12  . . .  a1n | b1 ; a21  a22  . . .  a2n | b2 ; . . . ; am1  am2  . . .  amn | bm ) .
Each row of the augmented matrix represents an equation of system (5.1). To facilitate
the solution of system (5.1), we initially define two matrix row operations:
1. Rj → Rj + kRi means: add k times row i to row j of the matrix;
2. Ri → cRi means: multiply row i of the matrix by a non-zero constant c.
Question 5.2. Solve the system of linear equations
x + 2y − 3z = 4 ,
x + 3y + z = 11 ,
2x + 5y − 4z = 13 .
Solution. The augmented matrix is
1∗  2  −3 |  4
1   3   1 | 11
2   5  −4 | 13 .
We use the first row to kill the rest of the first column. This corresponds to eliminating
x from all but the first equation. The top left entry of the matrix, which is used to drive
all the entries directly below it to zero, is called a pivot and is marked by an asterisk.
We apply
R2 → R2 − R1 ,    R3 → R3 − 2R1 ,
to obtain
1 2 −3 4
0 1∗ 4 7 .
0 1 2 5
The process of driving all the entries below a pivot to zero is called a down-sweep. Next
we use the second entry on the second row as a pivot to kill the elements below it. We do
this by applying R3 → R3 − R2 :
1 2 −3 4
0 1 4 7 .
0 0 −2 −2
We can further simplify the matrix by multiplying the last row by −1/2:
1 2 −3 4
0 1 4 7 .
0 0 1∗ 1
We may now reverse the strategy and use the third row to kill elements in the third
column. This process is called an up-sweep. After applying
R2 → R2 − 4R3 , R1 → R1 + 3R3 ,
we obtain
1 2 0 7
0 1∗ 0 3 .
0 0 1 1
We next use the second row to kill elements in the second column by applying R1 →
R1 − 2R2
1 0 0 1
0 1 0 3 .
0 0 1 1
The unique solution of the system can now be read off the rightmost column:
x = 1, y = 3, z = 1.
The process that we employed to solve the above problems is called Gaussian elimina-
tion. In it one first carries out a sequence of down-sweeps using pivots on the diagonal
successively starting with the first row. One next scales the last row and uses it for up-
sweeping. Then one scales the second row and uses it for up-sweeping, etc. until one
obtains a unit matrix in the left block of the augmented matrix. The right-most column
of the resulting matrix contains the solution. This algorithm assumes that all pivots
encountered in the course of its execution are non-zero.
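For a quick numerical check of an elimination exercise, the same system can be handed to numpy; np.linalg.solve carries out an LU factorisation, which is Gaussian elimination with partial pivoting in matrix form. A minimal sketch for Question 5.2 (variable names are illustrative):

```python
import numpy as np

# x + 2y - 3z = 4,  x + 3y + z = 11,  2x + 5y - 4z = 13
A = np.array([[1.0, 2.0, -3.0],
              [1.0, 3.0,  1.0],
              [2.0, 5.0, -4.0]])
b = np.array([4.0, 11.0, 13.0])

print(np.linalg.solve(A, b))   # [1. 3. 1.]  i.e. x = 1, y = 3, z = 1
```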
Question 5.3. Solve the system of linear equations
x + y + z = 1 ,
3x + 4y − z = 2 ,
−2x + y − z = 8 .
Finally up-sweeping in the second column yields
1 0 0 −3
0 1 0 3
0 0 1 1
and the unique solution is x = −3, y = 3, z = 1.
As an alternative to scaling the rows (to make all the pivots equal to one) and up-sweeping
following down-sweeping, we can carry out the alternative (but equivalent) process of back
substitution. Once down-sweeping is complete, the final row (last equation) is used to
determine the last variable. This is then substituted into the final version of the second
last equation to find the second last variable. Both are then put into the equation obtained
from the third last row, giving the third last variable, and so on.
Example 5.1. We can finish off Question 5.3 using back substitution.
After down-sweeping is complete, we have the augmented matrix (5.4). The third row
implies
13z = 13 so z = 1 .
The second row now gives
y − 4z = y − 4 = −1 ,
on using z = 1, so y = 3.
Then the first row leads to
x + y + z = x + 3 + 1 = x + 4 = 1 ,
so x = −3, as found before.
Consider next the system of linear equations
x + 2y + 3z = 1 ,
4x + 5y + 6z = 2 ,
7x + 8y + 9z = 1 .
Down-sweeping gives
1   2   3 |  1
0  −3  −6 | −2
0   0   0 | −2 .
From here on we are unable to proceed with up-sweeping because the entry we normally
use as a pivot is zero. However, we do not have to proceed because the last equation now
reads 0 × x + 0 × y + 0 × z = −2 which cannot be satisfied for any values of x,y, z. The
system therefore has no solutions.
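A sketch of how this "no solutions" verdict can be checked numerically: when the rank of the augmented matrix exceeds the rank of the coefficient matrix, the system is inconsistent. (This anticipates the notion of rank introduced later in this chapter; numpy is assumed.)

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)
b = np.array([[1], [2], [1]], dtype=float)

print(np.linalg.matrix_rank(A))                    # 2
print(np.linalg.matrix_rank(np.hstack([A, b])))    # 3 -> inconsistent, no solutions
```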
5.2 Gaussian Elimination: General Case
Sometimes pivots do not appear where we naively expect them to be and rows may need
to be interchanged. Consider the following problem:
Question 5.6. Solve the system of linear equations represented by the augmented matrix
∗
1 2 3 1
2 4 5 1 .
−1 1 2 4
Solution. Down-sweeping the first column gives
1   2   3 |  1
0   0  −1 | −1
0   3   5 |  5 .
We are blocked from using the natural pivot by a zero. However, we can interchange rows
2 and 3 and use a new pivot:
1   2   3 |  1
0   3∗  5 |  5
0   0  −1 | −1 .
Scaling and up-sweeping (or back substitution) then gives the unique solution z = 1, y = 0, x = −2.
Question 5.7. Solve the system of linear equations represented by the augmented matrix
1∗  2  3 | 1
2   4  5 | 1 .
−1 −2  2 | 4
Solution. All that is changed from the previous example is the entry A3,2 but it makes
a big difference. After down-sweeping the first column we get
1 2 3 1
0 0∗ −1 −1
0 0 5 5
and there is nothing further down in the second column. If we can’t move down we move
right:
1 2 3 1
0 0 −1∗ −1 .
0 0 5 5
We down-sweep column 3 to obtain
1 2 3 1
0 0 −1∗ −1 .
0 0 0 0
There are infinitely many solutions. We can express the pivot variables (those multiplied
by the units) in terms of the non-pivot ones:
x = −2y − 2 , z = 1.
A matrix is said to be in echelon form if both of the following are true:
1. Any rows consisting entirely of zeros lie below all the non-zero rows.
2. The first non-zero entry of any row (called the pivot entry) is strictly further right
than the first non-zero entry of any row above it.
If (A′ |b′ ) is the augmented matrix obtained by Gaussian elimination just after
the last down-sweep, then the matrix A′ is in echelon form. Here is an example of an
augmented matrix after the last down-sweep:
1 2 1 1 4 1 2
0 0 2 −2 6 2 2
.
0 0 0 0 0 3 6
0 0 0 0 0 0 0
The coefficient part A′ is in echelon form; its pivot entries are the 1, 2 and 3 at the start of the non-zero rows.
A matrix is said to be in reduced echelon form if all the following are true:
1. It is in echelon form.
2. Each pivot entry is equal to 1.
3. Each pivot is the only non-zero entry in its column.
If (A′′ |b′′ ) is the final augmented matrix obtained by Gaussian elimination (as described
above), then A′′ is in reduced echelon form. In the above example, the augmented
matrix, after scalings and up-sweeps, is finally in the form
1 2 0 2 1 0 1
0 0 1 −1 3 0 −1
.
0 0 0 0 0 1 2
0 0 0 0 0 0 0
Its coefficient part is in reduced echelon form; the pivotal 1's are in columns 1, 3 and 6.
Consider now a general system of m equations for n unknowns. The coefficient matrix A
has dimensions m × n. At the end of Gaussian elimination it has the form
0 . . . 0  1  . . .  0  . . .  0 | d1
0 . . . 0  0  . . .  1  . . .  0 | d2
 ..                              | ..
0 . . . 0  0  . . .  0  . . .  1 | dl
0 . . . 0  0  . . .  0  . . .  0 | dl+1
 ..                              | ..
0 . . . 0  0  . . .  0  . . .  0 | dm
The first l rows contain pivotal 1's while the last m − l rows have all coefficients zero. There
are the following possibilities:
1. If dj ≠ 0 for some l + 1 ≤ j ≤ m, the system has no solutions, since that row reads 0 = dj .
2. If dj = 0 for all l + 1 ≤ j ≤ m and l < n, the system has infinitely many solutions.
The non-pivotal variables can be taken as free (undetermined) variables. All pivotal
variables are expressed via (only) the non-pivotal ones.
3. If dj = 0 for all l + 1 ≤ j ≤ m and l = n, the system has a unique solution.
Question 5.8. Determine how many solutions there is to the system of equations
x + 2y − z = 0,
2x + 5y + 2z = 0,
x + 4y + 7z = 0,
x + 3y + 3z = 0.
Solution. Down-sweeping, we obtain
1  2 −1  0        1  2 −1  0        1  2 −1  0
2  5  2  0   →    0  1  4  0   →    0  1  4  0
1  4  7  0        0  2  8  0        0  0  0  0
1  3  3  0        0  1  4  0        0  0  0  0 .
We see that in the echelon form there are two equations for 3 unknowns (l = 2 < n = 3)
and the system has infinitely many solutions.
Consider next the system of two equations for four unknowns,
x + 3y − 5z + w = 4 ,
2x + 5y − 2z + 4w = 6 .
Down-sweeping, scaling and up-sweeping the augmented matrix gives
1  0  19   7 | −2
0  1  −8  −2 |  2 .
The coefficient matrix is now in reduced echelon form. We have l = 2 < n = 4 and the
system has infinitely many solutions. The pivotal variables are x and y and the general
solution is
(x, y, z, w) = (−2 − 19z − 7w, 2 + 8z + 2w, z, w) .
The non-pivotal variables w, z are arbitrary.
The number of non-zero rows in the echelon form of matrix A is called the rank of A,
and is denoted by rank(A). (It gives a measure of the “size” of A.)
As an example, let us find the rank of the 5 × 4 matrix on the left below.
Solution. To find the rank we bring the matrix to echelon form by down-sweeping:
 1  1  1  2       1  1  1  2       1  1  1  2       1  1  1  2
 3  4  4  7       0  1  1  1       0  1  1  1       0  1  1  1
−2  1  2 −3   →   0  3  4  1   →   0  0  1 −2   →   0  0  1 −2 .
 5  3  4  6       0 −2 −1 −4       0  0  1 −2       0  0  0  0
 4  5  3 13       0  1 −1  5       0  0 −2  4       0  0  0  0
The matrix is now in echelon form. There are 3 non-zero rows so rank(A) = 3.
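The rank just found can be confirmed numerically; a minimal numpy sketch (the matrix is the one reduced above):

```python
import numpy as np

A = np.array([[ 1, 1, 1,  2],
              [ 3, 4, 4,  7],
              [-2, 1, 2, -3],
              [ 5, 3, 4,  6],
              [ 4, 5, 3, 13]], dtype=float)

print(np.linalg.matrix_rank(A))   # 3
```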
Returning to the geometry of linear equations, a single equation
a11 x + a12 y = b1                                              (5.5)
represents a line in the plane (assuming that a11 and a12 aren't both zero). Taking a
second equation,
a21 x + a22 y = b2 , (5.6)
gives a second line, and having the two equations holding together, (5.5) and (5.6), leads
to the crossing point of the lines, assuming that the lines are not parallel, i.e. that
a11 /a12 ≠ a21 /a22 , or, equivalently, a11 a22 ≠ a21 a12 (see Fig. 5.1). If the lines are parallel,
a11 a22 = a21 a12 , e.g. for x + y = 1, 2x + 2y = 1, the lines, generally, do not intersect and
then (5.5) and (5.6) have no solution, see Fig. 5.2. If a11 /a12 = a21 /a22 = b1 /b2 , the lines
are coincident and there are infinitely many solutions (e.g. x + y = 1 and 2x + 2y = 2).
In three dimensions,
a11 x + a12 y + a13 z = b1 (5.7)
represents a plane.
Figure 5.1: Crossing of two non-parallel lines (a11 a22 ≠ a21 a12 ).
Combining this with the equation of a second plane,
a21 x + a22 y + a23 z = b2 ,                                    (5.8)
gives the planes’ line of intersection (see Subsection 3.4.4). If we solve (5.7) and (5.8), we
can find x and y in terms of the free parameter z, say, and this would then be one form
of the equation of the line of intersection.
Now adding a third equation, for a third plane,
a31 x + a32 y + a33 z = b3 ,                                    (5.9)
the solution of (5.7), (5.8) and (5.9) (assuming it exists) gives the point where all three
planes intersect. This is where the line of intersection crosses the third plane.
If there is no solution, a plane must be parallel to the line of intersection of the other two.
If there are infinitely many solutions, the line of intersection lies on the third plane.
Figure 5.2: Two parallel lines a11 x + a12 y = b1 and a21 x + a22 y = b2 (a11 a22 = a21 a12 ), which do not intersect.
5.4 Problems
Problem 5.1. Find how many solutions the following system has:
8x1 + 6x2 = 1 ,
20x1 + 15x2 = 0 .
[Use the quantities a11 a22 − a12 a21 and a22 b1 − a12 b2 appearing in lectures.]
Problem 5.2. For the system of linear equations
x1 + 2x2 − 3x3 = −1 ,
3x1 − x2 + 2x3 = 7,
5x1 + 3x2 − 4x3 = 2,
find the augmented matrix. Simplify it by performing the sequence of row operations
R2 → R2 − 3R1 , R3 → R3 − 5R1 , R3 → R3 − R2 .
How many solutions does this system have?
Problem 5.3. For the system of linear equations
x−z = 2,
−2x + y + 4z = −5 ,
2x − 3y − 3z = 8,
find the augmented matrix. Simplify it by down-sweeping in each column. Find the
solution by back substitution or by up-sweeping.
Problem 5.4. For the system
x + 2y + z + 3w = 1 ,
2x + 5y + 2z + 5w = 17 ,
−x − 2y − 2w = 4 ,
find the augmented matrix. Simplify it by down-sweeping in each column. Find the
general solution using back substitution or up-sweeping and taking z to be a free variable.
simplify the matrix by down-sweeping in each column. Find the general solution using
back substitution (or up-sweeping).
Problem 5.6. For the electric circuit appearing below, Kirchhoff’s laws give
i2 + i3 = i1 ,
10i1 + 5i3 = 95 ,
−5i3 + 10i2 = 35 .
[Circuit diagram: sources of 95 V and 35 V, resistors of 2 Ω, 3 Ω, 5 Ω, 5 Ω and 10 Ω, and branch currents i1 , i2 , i3 .]
Problem 5.7. Consider the chemical reaction
(x1 )CO + (x2 )CO2 + (x3 )H2 → (x4 )CH4 + (x5 )H2 O .          (5.10)
Find the smallest positive integer values of x1 , x2 , x3 , x4 , x5 so that (5.10) balances. [Note
that the equations for conservation of C, O and H atoms read, respectively,
x 1 + x2 = x4 ,
x1 + 2x2 = x5 ,
2x3 = 4x4 + 2x5 .]
Problem 5.8. Two weights of mass m1 = m2 = 3 kg are arranged with light ropes and
light pulleys as shown below.
[Diagram: masses m1 and m2 , with downward displacements x1 and x2 , connected by light ropes over light pulleys.]
Newton's laws and the equation giving constancy of rope length give
3a1 + 2T = 3g ,
3a2 + T = 3g ,
2a1 + a2 = 0 ,
where a1 = ẍ1 and a2 = ẍ2 are the downward accelerations, T is the tension in the rope
linking the first pulley and the second mass, and g is the acceleration due to gravity.
Use Gaussian elimination to find a1 , a2 and T in terms of g.
Problem 5.9. What condition should be placed on the parameter a so that the system
x + 2y + 3z = 0,
4x + 9y + (a + 12)z = 2,
−2x − 9y + (10 − 3a)z = −10 ,
Problem 5.10. What condition should be placed on the parameters a and b so that the
system
x + 2y = −a ,
−3x − 6y = 4a − 2b ,
2x + 7y = 1 − 2a ,
has a unique solution?
Answers
6. i1 = 8 A, i2 = 5 A, i3 = 3 A.
7. 1, 1, 7, 2, 3
9. −8.
10. a = 2b.
Chapter 6
Matrices
6.1 Vectors and Matrices
3. There is a special n-vector called the zero vector whose every component is a zero:
0 = (0, 0, . . . , 0) .
Let
x = (x1 , x2 , . . . , xn )
be an n-vector or a point in Rn . We can define an action of the matrix A on the vector
x to be a vector denoted Ax with components
Ax = ( a11 x1 + a12 x2 + · · · + a1n xn ; a21 x1 + a22 x2 + · · · + a2n xn ; . . . ; am1 x1 + am2 x2 + · · · + amn xn ) .   (6.1)
Example 6.1. For a 3 × 4 matrix A and a point x in R4 ,
TA (x) := Ax = (24, 10, −11)
is a point in R3 .
Since the action Ax is defined for any vector x ∈ Rn we say that A gives rise to a linear
transformation:
TA : Rn → Rm
by mapping
x ↦ Ax .
Note that, for any matrix A,
TA (0) = 0 ,
where the 0 on the left-hand side is the zero vector in Rn and the 0 on the right-hand
side is the zero vector in Rm .
Example 6.2. For the matrix A as in Example 6.1 and
x1 = (−12, 7, 1, −17) ,      x2 = (22, 1, −1, 5) ,
we have
TA (x1 ) = TA (x2 ) = (5, 2, 4) = y
and also many other vectors from R4 are mapped onto y. This happens because TA
“crushes” R4 into R3 .
A system (5.1) of m linear equations for n unknowns in matrix notation can be written
as
Ax = b . (6.2)
A linear transformation associated with A acts as
TA : Rn → Rm
TA (x) = b
and the problem of solving (6.2) is equivalent to finding all vectors in Rn which TA maps
onto b – a specific vector in Rm .
Suppose now we have an m × n matrix A and an n × k matrix B. The matrix B has n
rows and k columns. Each column in matrix B can be considered to be a vector from Rn .
Hence we can map each column in B into a column of m elements (a vector in Rm ) using
the rule (6.1). The resulting columns are then arranged into an m × k matrix called a
product of matrices A and B and written AB. (The order is important!) Thus we have
defined a matrix multiplication, C = AB, so that the (ij) entry in the product matrix
is given by the formula
Cij = ai1 b1j + ai2 b2j + · · · + ain bnj .
Note that in general not any two matrices can be multiplied. The number of columns in
the first matrix must match the number of rows in the second.
Example 6.3. In the above question one can also define BA:
BA = ( 7  −1 ; 2  0 ; 0  1 ) ( 1  −1  0 ; 4  1  1 ) = ( 3  −8  −1 ; 2  −2  0 ; 4  1  1 ) .
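Matrix products such as AB and BA can be checked with numpy's @ operator; a minimal sketch using the A and B of Example 6.3 (the numerical values of AB are simply printed, since only BA is quoted above):

```python
import numpy as np

A = np.array([[1, -1, 0],
              [4,  1, 1]])          # 2 x 3
B = np.array([[7, -1],
              [2,  0],
              [0,  1]])             # 3 x 2

print(A @ B)   # 2 x 2 product AB
print(B @ A)   # 3 x 3 product BA, matching Example 6.3
```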
6.2 Inverse Matrices
An n × n matrix A−1 is called the inverse of the n × n matrix A if
AA−1 = A−1 A = I ,
where I is the identity matrix.
Recall that a system of linear equations (5.1) can be written in a matrix form
Ax = b . (6.4)
Suppose Gaussian elimination reduces the augmented matrix (A|b) to (I|b′ ), so that the system has the unique
solution x = b′ . On the other hand, if we apply the inverse matrix A−1 to both sides of
our equation (6.4),
A−1 Ax = A−1 b
and therefore
Ix = x = A−1 b .
This means that b′ = A−1 b. Solving the system for different right-hand sides b gives us
information on the entries of the inverse matrix A−1 . Thus if b is a column vector with 1
on the i-th row and zeroes everywhere else, b = ei , the i-th elementary column vector
(in three dimensions, e1 , e2 , e3 are the three unit vectors i, j, k), A−1 b gives the i-th
column of the matrix A−1 . Solving the system for all such elementary column vectors
with i = 1, 2, . . . , n gives the entire matrix A−1 . We can do it all at once by applying
the same row operations to a large augmented matrix (A|I) where the second block is an
n × n identity matrix (the n elementary column vectors put together).
Question 6.2. Use Gaussian elimination to find the inverse for the matrix
2 1
A= .
5 3
Solution. We form the augmented matrix
(A|I) =  2  1 | 1  0
         5  3 | 0  1 ,
which we can treat as a single 2 × 4 matrix. Down-sweeping in the first column, we obtain
2 1 1 0
.
0 1/2 −5/2 1
Multiplying the second row by 2 and up-sweeping the second column, we obtain
2 0 6 −2
.
0 1 −5 2
Finally multiplying the first row by 1/2 brings our matrix to the form
1 0 3 −1
0 1 −5 2
and we see that when the left block became the identity matrix the right block became
the inverse matrix to the original coefficient matrix A .
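A quick numerical cross-check of the inverse found above; a minimal numpy sketch:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])

A_inv = np.linalg.inv(A)
print(A_inv)            # [[ 3. -1.]
                        #  [-5.  2.]]
print(A @ A_inv)        # identity matrix (up to rounding)
```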
Warning: Not every square matrix has an inverse! However, it can be proved that if
system (6.4) has a unique solution then A−1 exists for its matrix of coefficients A.
Question 6.3. Find B−1 for
6 11 5
B = 18 34 15 .
12 25 11
6.3 Determinants
A determinant of a 2 × 2 matrix A is the number
det(A) := det ( a  b ; c  d ) = | a  b ; c  d | = ad − bc .          (6.5)
Example 6.4.
det ( 1  −3  3 ; 3  −5  3 ; 6  −6  4 ) = 1 × det ( −5  3 ; −6  4 ) − (−3) × det ( 3  3 ; 6  4 ) + 3 × det ( 3  −5 ; 6  −6 )
= (−5 × 4 − 3 × (−6)) + 3 (3 × 4 − 3 × 6) + 3 (3 × (−6) − (−5) × 6)
= −2 + 3 × (−6) + 3 × 12 = 16 .
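The 3 × 3 determinant just expanded can be checked numerically; a minimal numpy sketch:

```python
import numpy as np

M = np.array([[1, -3, 3],
              [3, -5, 3],
              [6, -6, 4]], dtype=float)

print(np.linalg.det(M))   # 16.0 (up to floating-point rounding)
```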
Question 6.4. Find the value of the parameter a for which the homogeneous system of linear equations
x + 2y = 0 ,
−x + z = 0 ,
2x − 3y − az = 0 ,
has a non-trivial solution.
Solution. Computing
det ( 1  2  0 ; −1  0  1 ; 2  −3  −a ) = 7 − 2a
and setting it to zero, we conclude that the system has a non-trivial solution if and only
if a = 7/2.
One can also prove that if two matrices A and B can be turned into each other by
elementary row operations, their determinants are proportional to each other. Thus if
det(A) = 0 then det(B) = 0 and vice versa. This implies in particular that:
• if a matrix A has rows which are linearly dependent (i.e. one row can be written
as a sum of others, or a non-trivial sum of rows vanishes), its determinant vanishes.
(The same comments apply to columns.)
(If the row operations are restricted to adding or subtracting a multiple of one to or from
another, avoiding swaps and scalings, the value of the determinant is unchanged. The
same applies to manipulating columns.)
Question 6.5. For the matrices A and B below check that det(AB) = det(BA) =
det(A)det(B)
A = ( 1  2 ; −1  4 ) ,      B = ( −1  0 ; 2  5 ) .
Solution. We have
AB = ( 3  10 ; 9  20 ) ,      BA = ( −1  −2 ; −3  24 ) ,
and the determinants are det(A) = 6, det(B) = −5, det(AB) = det(BA) = −30 so that
indeed det(AB) = det(BA) = det(A)det(B).
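The determinant identity just checked can also be confirmed numerically; a minimal numpy sketch with the same A and B:

```python
import numpy as np

A = np.array([[ 1, 2],
              [-1, 4]], dtype=float)
B = np.array([[-1, 0],
              [ 2, 5]], dtype=float)

print(np.linalg.det(A), np.linalg.det(B))            # 6.0, -5.0
print(np.linalg.det(A @ B), np.linalg.det(B @ A))    # both -30.0 (up to rounding)
```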
The entries of the inverse matrix A−1 can be computed using the minors via the following formula:
(A−1 )ij = (−1)i+j Mji / det(A) .                        (6.7)
Note that the order of indices for the minor is interchanged compared with the order of
indices in the inverse matrix. In practice it is convenient first to compute the matrix
Cij = (−1)i+j Mij which is called the cofactor matrix of matrix A, and then take its
transpose, that is, interchange the rows and columns and finally divide the result by
det(A). In general transpose of a matrix A is denoted by AT and means that ATij = Aji .
For example
( 1  2  3 ; 4  5  6 ; 7  8  9 )T = ( 1  4  7 ; 2  5  8 ; 3  6  9 ) .
With this notation we have
A−1 = (1/det(A))CT .
Formula (6.7) assumes that det(A) ≠ 0, so that we can divide by that quantity. Having
det(A) ≠ 0 is a necessary and sufficient condition for the inverse matrix to A to exist.
In particular, for a 2 × 2 matrix,
A−1 = ( a  b ; c  d )−1 = 1/(ad − bc) ( d  −b ; −c  a ) .
Question 6.6. Compute the inverse of the matrix given in Example 6.4, using formula (6.7).
6.4 Problems
Problem 6.1. For a matrix A and vectors x1 ∈ R3 and x2 ∈ R5 , find vectors y = Ax1 ∈
R5 and z = 2y − 3x2 ∈ R5 , with
A = ( 1  0  −2 ; 7  1/2  −1 ; −1/3  0  2 ; 2  2  2 ; 0  1  0 ) ,   x1 = (6, −2, 1) ,   x2 = (1, −1, 0, 1/3, 11) .
Problem 6.2. (a) The matrix performing clockwise rotations about the x axis in 3-
dimensional space has the form
Rx (θ) = ( 1  0  0 ; 0  cos θ  sin θ ; 0  −sin θ  cos θ ) ,
where θ is the angle of rotation. Use this matrix to calculate the components of the vector
v = (1, −2.5, 0.3)
after a rotation through θ = 60◦ about the x axis.
Problem 6.6. Compute the determinants det(D), det(E) of the following matrices:
D = ( 11  −3 ; 5  2 ) ,      E = ( 1  3  0 ; 2  6  4 ; −1  0  2 ) .
Problem 6.7. Check, by computing a determinant, that the homogeneous system of
linear equations
x − z = 0,
2x + 7y − 3z = 0 ,
5x + 14y − 7z = 0 ,
has a non-trivial solution.
Problem 6.8. Find for what values of parameter a the following system of linear equa-
tions (for the unknowns x, y) has a non-trivial solution:
ax + 12y = 0 ,
3x + ay = 0 .
Problem 6.9. Using the general properties of determinants, compute the determinant
1   77   0   1
2    0   5  −1
0  −77   0   1 .
3  154   0   0
Problem 6.10. Find the cofactor matrix C and hence the inverse A−1 for the matrix
1 3 0
A= 2 6 4
−1 0 2
Answers
1.
y = (4, 40, 0, 10, −2) ,      z = (5, 83, 0, 19, −37) .
2. (1, −0.99, 2.31)T .
3. CD = ( 34  11 ; −4  −7 ) ,   DC = ( 33  −3  32 ; −3  −11  −17 ; 0  4  5 ) ,   (CD)² = ( 1112  297 ; −108  5 ) .
4.
27 −16 6
A−1 = 8 −5 2 .
−5 3 −1
5.
B−1 = ( 1/3  −2/3 ; 1/6  1/6 ) ,      C−1 = ( 4  −3 ; 2/3  0 ) .
8. a = ±6.
9. −1540.
Chapter 7
7.1 Introduction
In this chapter we study “eigenvalue” problems. These arise in many situations, for
example: calculating the natural frequencies of oscillation of a vibrating system; finding
principal axes of stress and strain; calculating oscillations of an electrical circuit; image
processing; data mining (web search engines); etc.
Suppose that A is a square n × n matrix then we can ask if there are any non-zero vectors
such that A just stretches when it acts on them:
Ax = λx
where λ is a number. For λ = 0 the corresponding vectors are said to belong to the null
space of A. For λ = 1 the corresponding vector is called a fixed point because it does not
change under the action of A.
If
Ax = λx .
for some non-zero vector x, then such a number λ is called an eigenvalue of A, and the corre-
sponding x is called an eigenvector.
A matrix can have more than one eigenvector and eigenvalue. In the previous example
the matrix A was shown to have an eigenvalue λ = 4 with eigenvector
x = (1, 1, 2) .
One can check that λ = −2 is also an eigenvalue of A with corresponding eigenvectors
x1 = (1, 1, 0)    and    x2 = (1, 0, −1) ,
that is, Ax1 = −2x1 and Ax2 = −2x2 . This example also shows that there can be
more than one eigenvector corresponding to the same eigenvalue. In fact, for the above
example, one can check that any linear combination
y = c1 x1 + c2 x2
where c1 , c2 are constants (not both equal to zero), is also an eigenvector of A with
eigenvalue −2.
Question 7.1. Find the eigenvalues of matrix
A = ( 1  2 ; 1  0 ) .
Solution. Row-reducing λI − A, with the top left entry λ − 1 as the pivot, leaves a matrix that has a non-trivial null space if its lower right entry is zero, that is, if
λ2 − λ − 2 = (λ − 2)(λ + 1) = 0 .
Therefore A has exactly 2 eigenvalues, λ = 2 and λ = −1.
The presence of a variable λ in the matrix made Gaussian elimination messier than usual.
(Even worse, our computation has been incomplete as we tacitly assumed that λ − 1 ≠ 0.
This assumption effectively excluded the case λ = 1 from our computations and one needs
to check separately that 1 is not an eigenvalue of A.)
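Eigenvalues can also be obtained numerically; a minimal numpy sketch for the matrix of Question 7.1 (the order of the returned eigenvalues is not guaranteed):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [1.0, 0.0]])

vals, vecs = np.linalg.eig(A)
print(vals)   # eigenvalues 2 and -1 (order may vary)
print(vecs)   # columns are the corresponding eigenvectors, normalised to unit length
```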
The characteristic polynomial factorises to give
(λ − 1)(λ + 1)(λ − 2) = 0 ,
so the eigenvalues are
λ1 = 1 , λ2 = −1 , λ3 = 2 .
For the matrix of Question 7.3 the same method gives the eigenvalues
λ1 = 0 , λ2 = 1 , λ3 = 2 .
Solving this system gives a corresponding eigenvector (or any multiple thereof)
x1 = (0, 1, −1) .
Solving this gives an eigenvector (or any scalar multiple thereof) corresponding to λ2 :
x2 = (1, 0, 0) .
Solving this gives an eigenvector (or any scalar multiple thereof) corresponding to λ3 :
x3 = (0, 1, 1) .
For the matrix A considered at the start of this section, the characteristic equation is
(4 − λ)(2 + λ)² = 0 ,
so that λ1 = 4 is a single root, and appears once as a solution of the equation, whereas
λ2 = −2 is a double root, and appears twice as a solution of the equation. We say that
λ1 = 4 is an eigenvalue of algebraic multiplicity 1 and that λ2 = −2 is an eigenvalue
of algebraic multiplicity 2. More generally, if the characteristic polynomial for a matrix
A has a factor (λi − λ)j , λi will be an eigenvalue of algebraic multiplicity j.
The number of independent eigenvectors corresponding to an eigenvalue is called its geometric multiplicity; it is always at least 1. In Example 7.1 we saw that there were 2 independent eigenvectors correspond-
ing to the eigenvalue of algebraic multiplicity 2. In general an eigenvalue's geometric
multiplicity is no greater than its algebraic multiplicity.
Question 7.4. Find the eigenvalues and their algebraic and geometric multiplicities for
A = ( 9  0  0 ; 0  9  0 ; 0  0  9 ) ,   B = ( 5  4  −2 ; −2  11  −1 ; 4  −4  11 ) ,   and   C = ( 9  3  6 ; −6  3  0 ; −3  0  15 ) .
Solution. The matrix A. The characteristic equation is (λ − 9)³ = 0, so there is the triple root
λ = 9 ,
an eigenvalue of algebraic multiplicity three.
Attempting to solve (A − 9I)x = 0 leaves x, y and z undetermined; the general eigenvector
is
(x, y, z) = x (1, 0, 0) + y (0, 1, 0) + z (0, 0, 1) .
There are three independent eigenvectors,
(1, 0, 0) ,   (0, 1, 0)   and   (0, 0, 1) ,
or, equivalently, the general eigenvector can be written in terms of three arbitrary con-
stants: λ = 9 is an eigenvalue of geometric multiplicity three.
The matrix B. The characteristic equation is
det(λI − B) = det ( λ−5  −4  2 ; 2  λ−11  1 ; −4  4  λ−11 ) = (λ − 9)³ = 0 ,
so again λ = 9 is the only eigenvalue, with algebraic multiplicity three. The eigenvectors satisfy (B − 9I)x = 0, that is
( −4  4  −2 ; −2  2  −1 ; 4  −4  2 ) (x, y, z) = (0, 0, 0) .
Subtracting twice the second row from the first and adding twice the second row to the
third gives
0 0 0 x 0
−2 2 −1 y = 0 .
0 0 0 z 0
The second equation then gives −2x+2y−z = 0 so z = 2y−2x with x and y undetermined;
the general eigenvector is
(x, y, 2y − 2x) = x (1, 0, −2) + y (0, 1, 2) .
or, equivalently, the general eigenvector can be written in terms of two arbitrary constants:
λ = 9 is an eigenvalue of geometric multiplicity two.
The matrix C. The characteristic equation is
det(λI − C) = (λ − 9)(λ² − 18λ + 45) + 18(λ − 15) + 18(λ − 3) = (λ − 9)(λ² − 18λ + 45) + 36(λ − 9)
= (λ − 9)(λ² − 18λ + 81) = (λ − 9)³ = 0 .
Once again there is the triple root
λ=9
so that C also has only one eigenvalue, λ = 9, of algebraic multiplicity three.
To now find the eigenvectors, we solve the homogeneous equation (C − λI)x = 0, so
0 3 6 x 0
−6 −6 0 y = 0 .
−3 0 6 z 0
Dividing the first and third rows by three and the second by minus six recasts the problem
as
0 1 2 x 0
1 1 0 y = 0 ,
−1 0 2 z 0
then adding the third row to the second gives
0 1 2 x 0
0 1 2 y = 0 ,
−1 0 2 z 0
and subtracting the second row from the first leads to
0 0 0 x 0
0 1 2 y = 0 .
−1 0 2 z 0
The second equation then gives y = −2z while the third gives x = 2z; the general
eigenvector is
(2z, −2z, z) = z (2, −2, 1) .
There is just one (independent) eigenvector,
(2, −2, 1) ,
or, equivalently, the general eigenvector can be written in terms of one arbitrary constant:
λ = 9 is an eigenvalue of geometric multiplicity one.
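The geometric multiplicities found in Question 7.4 can be checked numerically: the eigenspace for λ = 9 is the null space of M − 9I, whose dimension is 3 minus the rank. A minimal numpy sketch (matrices as in the question):

```python
import numpy as np

B = np.array([[ 5,  4, -2],
              [-2, 11, -1],
              [ 4, -4, 11]], dtype=float)
C = np.array([[ 9, 3,  6],
              [-6, 3,  0],
              [-3, 0, 15]], dtype=float)

for name, M in (("B", B), ("C", C)):
    # geometric multiplicity of lambda = 9: dimension of the null space of M - 9I
    geo = 3 - np.linalg.matrix_rank(M - 9 * np.eye(3))
    print(name, geo)    # B -> 2, C -> 1
```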
7.3 Practical Application: Mass-Spring Systems
Question 7.5. Two identical simple pendula oscillate in the plane as shown in Figure 7.1.
Both pendula consist of light rods of length ` = 10 and are suspended from the same ceil-
ing a distance L = 15 apart, with equal masses m = 1 attached to their ends. The angles
the pendula make to the downward vertical are θ1 and θ2 , and they are coupled through
the spring shown which has stiffness coefficient k = 1. The spring has unstretched length
L = 15. Assume that the acceleration due to gravity g = 10. Describe the system’s
dynamics by differential equations and find their general solutions.
Figure 7.1: Two pendula of length 10, suspended a distance L = 15 apart, with masses m = 1 at their ends and coupled by a spring of stiffness k = 1; θ1 and θ2 are the angles to the downward vertical.
Solution. Assuming that the oscillations of the spring remain small in amplitude, so
that |θ1 | ≪ 1 and |θ2 | ≪ 1, by applying Newton's second law and Hooke's law one finds
that the coupled pendula system gives rise to the system of differential equations
d²θ/dt² = Aθ ,    where   A = ( −2  1 ; 1  −2 ) ,              (7.3)
and
θ = (θ1 , θ2 )
is the vector of unknown angles for each of the pendula shown in Figure 7.1.
We further look for a solution of the form θ(t) = Re{veiωt } for a constant vector v.
Substituting θ(t) into (7.3) and dividing both sides by eiωt we obtain that the system of
differential equations (7.3) reduces to solving the eigenvalue problem
(A + ω 2 I)v = 0 . (7.4)
We obtain using the characteristic equation method explained in the previous section that
matrix A has two distinct eigenvalues λ1 = −1 and λ2 = −3. Since λ = −ω² we obtain
the characteristic frequencies ω1 = 1 and ω2 = √3 (it is usual to take frequencies to be
positive, as was done in F18XC). An eigenvector corresponding to λ1 = −1 is
v 1 = (1, 1) ,
and an eigenvector corresponding to λ2 = −3 is v 2 = (1, −1). The general solution is therefore
θ = (1, 1)(a1 cos t + a2 sin t) + (1, −1)(b1 cos(√3 t) + b2 sin(√3 t)) .
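The eigenvalues and mode shapes of the coupled-pendula matrix can be checked numerically; a minimal numpy sketch:

```python
import numpy as np

A = np.array([[-2.0,  1.0],
              [ 1.0, -2.0]])

vals, vecs = np.linalg.eig(A)
print(vals)              # -1 and -3 (order may vary)
print(np.sqrt(-vals))    # characteristic frequencies 1 and sqrt(3)
print(vecs)              # columns proportional to (1, 1) and (1, -1)
```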
7.4 Diagonalisation
An n × n matrix à is said to be similar to an n × n matrix A if
à = X−1 AX
i.e. the matrix whose diagonal entries are the eigenvalues of the matrix A and whose all
other entries are zero. Note that we have
A = XDX−1 .
(Also note that if our matrix has some eigenvalue whose algebraic multiplicity is greater
than its geometric multiplicity, the total number of independent eigenvectors will be less
than n and A will not be diagonalisable.)
Symmetric Matrices
A real square matrix A is said to be symmetric if transposition leaves it unchanged, i.e.
AT = A.
For such a matrix the eigenvalues are real, and eigenvectors corresponding to distinct eigenvalues are orthogonal: x(i) · x(j) = 0 for i ≠ j.
Note that as usual the eigenvectors are defined up to rescaling. We can use this freedom to
pick eigenvectors all to have length one (all to be unit vectors), so they satisfy x(i) ·x(i) = 1.
To do that we divide each eigenvector by its length:
x(i) → x(i) / √( x(i) · x(i) ) .
These relations imply that for the matrix X = [x(1) |x(2) | . . . |x(m) ] whose columns are the
eigenvectors of A, one has
X−1 = XT .
In general matrices for which the inverse coincides with the transposed matrix are called
orthogonal matrices.
Question 7.6. Find a matrix X which diagonalises the matrix A from the previous ex-
ample via a similarity transformation.
Solution. The eigenvalues of A are 1, 3 and 6, with normalised eigenvectors (−2, 1, 0)/√5, (0, 0, 1) and (1, 2, 0)/√5 respectively. Hence
X = ( −2/√5  0  1/√5 ; 1/√5  0  2/√5 ; 0  1  0 ) .
Now we can check that
X−1 = XT = ( −2/√5  1/√5  0 ; 0  0  1 ; 1/√5  2/√5  0 ) ,
and furthermore
X−1 AX = ( −2/√5  1/√5  0 ; 0  0  1 ; 1/√5  2/√5  0 ) ( 2  2  0 ; 2  5  0 ; 0  0  3 ) ( −2/√5  0  1/√5 ; 1/√5  0  2/√5 ; 0  1  0 ) = ( 1  0  0 ; 0  3  0 ; 0  0  6 ) .
Diagonalisation can be used to obtain large powers of a given matrix. Note that
An = (XDX−1 )(XDX−1 ) · · · (XDX−1 ) = XDn X−1 ,
and it is straightforward to take the n-th power of a diagonal matrix just by taking the
n-th power of each diagonal entry.
Question 7.7. Find An for the matrix from the previous example.
Solution. We have
An = XDn X−1 = ( −2/√5  0  1/√5 ; 1/√5  0  2/√5 ; 0  1  0 ) ( 1  0  0 ; 0  3^n  0 ; 0  0  6^n ) ( −2/√5  1/√5  0 ; 0  0  1 ; 1/√5  2/√5  0 )
= ( (4 + 6^n)/5   2(−1 + 6^n)/5   0 ; 2(−1 + 6^n)/5   (1 + 4·6^n)/5   0 ; 0   0   3^n ) .
The need to evaluate matrices to arbitrary powers often arises in connection with differ-
ential equations.
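As a numerical sanity check (not part of the original notes), the diagonalisation route to A^n can be compared with direct matrix multiplication; a minimal numpy sketch with n = 5:

```python
import numpy as np

A = np.array([[2.0, 2.0, 0.0],
              [2.0, 5.0, 0.0],
              [0.0, 0.0, 3.0]])

vals, X = np.linalg.eig(A)     # columns of X are eigenvectors of A

n = 5
An_direct = np.linalg.matrix_power(A, n)
An_diag = X @ np.diag(vals**n) @ np.linalg.inv(X)

print(np.allclose(An_direct, An_diag))   # True
print(An_direct[2, 2])                   # 243.0 = 3**5
```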
7.5 Systems of Linear Differential Equations
We have already seen how one (second-order) coupled system could be tackled in Ques-
tion 7.5. Other oscillatory problems like this proceed as in the following question.
Solution. First look for a special solution in complex form, x = veiωt with v a constant
vector.
Differentiating x twice, we get d2 x/dt2 = −ω 2 veiωt . Then substituting into (7.5) gives
−ω 2 veiωt + Bveiωt = 0
so Bv = ω 2 v .
Hence λ = ω 2 is an eigenvalue of B with v a corresponding eigenvector.
From λ1 = 4, so ω = 2, with eigenvector (1, −1), we obtain the solutions
x = (1, −1) cos(2t)    and    (1, −1) sin(2t) ,
and also, from λ2 = 2, so ω = √2, with eigenvector (1, 1),
x = (x1 , x2 ) = (1, 1) cos(√2 t)    and    (1, 1) sin(√2 t) .
(The first pair of solutions, coming from λ1 = 4, has x1 = −x2 – the components move
opposite to each other – and such motion is termed out of phase. The second pair of
solutions, coming from λ2 = 2, has x1 = x2 – the components move together – and such
motion is termed in phase.)
The general solution is then got by combining all four solutions:
x = (1, −1)(a1 cos 2t + a2 sin 2t) + (1, 1)(b1 cos(√2 t) + b2 sin(√2 t)) .
Example 7.3. Let’s first try solving the vector differential equation
dx
= Ax
dt
with A as in Question 7.3. We start by looking for a simple solution of the form x = veλt
with v a constant non-zero vector. Differentiating x and substituting into the original
equation, we get
λveλt = Aveλt so Av = λv .
Hence λ is an eigenvalue with v a corresponding eigenvector. From Question 7.3, we have:
λ1 = 0 , v 1 = (0, 1, −1) ;    λ2 = 1 , v 2 = (1, 0, 0) ;    λ3 = 2 , v 3 = (0, 1, 1) .
There are then three independent special solutions:
v 1 eλ1 t = (0, 1, −1) ;    v 2 eλ2 t = (1, 0, 0) et ;    v 3 eλ3 t = (0, 1, 1) e2t .
The general solution is given by taking a linear combination:
x = a1 v 1 eλ1 t + a2 v 2 eλ2 t + a3 v 3 eλ3 t = a1 (0, 1, −1) + a2 (1, 0, 0) et + a3 (0, 1, 1) e2t .
Alternatively, the individual components are x1 = a2 et , x2 = a1 +a3 e2t and x3 = a3 e2t −a1 .
Question 7.9. Suppose that x(t) = (x1 , x2 , x3 )T satisfies dx/dt = Ax with
−2 2 −3
A= 2 1 −6 .
−1 −2 0
Solution. Looking for a special solution of the form x = veλt gives, on substituting
this into the equation, λv = Av, i.e. λ is an eigenvalue of A and v is an associated
eigenvector.
We can find that the eigenvalues of A are λ1 = 5 and λ2 = −3, and that corresponding
eigenvectors are : v 1 = (−1, −2, 1) for λ = 5; and v 2 = (3, 0, 1)T and v 3 = (−2, 1, 0)T for
λ = −3.
Independent solutions to the system are then v 1 e5t , v 2 e−3t and v 3 e−3t , so the general solution is
x = α v 1 e5t + β v 2 e−3t + γ v 3 e−3t .
To determine the constants for an initial value problem, a system of linear equations has
to be solved (as in Chap. 5).
For example, the initial condition x(0) = (3, −5, 3) requires
−α + 3β − 2γ = 3 ,    −2α + γ = −5 ,    α + β = 3 .
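Both the eigen-data and the constants α, β, γ can be checked numerically; a minimal numpy sketch (V has the eigenvectors v1, v2, v3 as its columns):

```python
import numpy as np

A = np.array([[-2,  2, -3],
              [ 2,  1, -6],
              [-1, -2,  0]], dtype=float)

print(np.linalg.eigvals(A))        # 5 and -3 (the latter repeated)

V = np.array([[-1, 3, -2],
              [-2, 0,  1],
              [ 1, 1,  0]], dtype=float)
b0 = np.array([3.0, -5.0, 3.0])    # initial condition x(0)
print(np.linalg.solve(V, b0))      # [ 2.  1. -1.]  i.e. alpha, beta, gamma
```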
Systems of linear first-order ordinary differential equations with constant coefficients are
normally equivalent to a single ODE: an n × n system can usually be written as an
nth-order ODE, and an nth-order ODE can always be written as an n × n system.
For example, consider the equation
d²y/dt² + 2 dy/dt − 3y = 0 .
We can first write x1 = y and x2 = dy/dt. Then
dx1 /dt = x2    and    dx2 /dt = d²y/dt² = 3y − 2 dy/dt = 3x1 − 2x2 .
In matrix form the system is then
d/dt (x1 , x2 ) = ( 0  1 ; 3  −2 ) (x1 , x2 ) .
The eigenvalues of this matrix are λ = 1 and λ = −3 (the roots of λ² + 2λ − 3 = 0), giving the familiar general solution
y = x1 = αe−3t + βet .
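The eigenvalues of the companion matrix can be checked numerically; a minimal numpy sketch:

```python
import numpy as np

M = np.array([[0.0,  1.0],
              [3.0, -2.0]])       # matrix of the system equivalent to y'' + 2y' - 3y = 0

print(np.linalg.eigvals(M))       # -3 and 1 (in some order) -> y = alpha*exp(-3t) + beta*exp(t)
```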
Question 7.10. Use eigenvalues and eigenvectors to solve
d²y/dt² + 6 dy/dt + 13y = 0 ,     y(0) = −1 ,     dy/dt (0) = 11 .
7.6 Problems
Problem 7.1. Find the eigenvalues and eigenvectors for the matrices
A = ( 8  3 ; −10  −3 ) ,      B = ( 3  5 ; 5  3 ) .
Problem 7.2. Use the results of the previous question to find a general solution to the
system of equations
dx/dt = 8x + 3y ,
dy/dt = −10x − 3y .
Problem 7.3. Compute the eigenvalues and eigenvectors of the following 3×3 matrices:
D = ( 3  5  7 ; 5  3  −7 ; 0  0  2 ) ,      E = ( 1  1  0 ; 1  1  0 ; 0  0  2 ) .
Problem 7.4. Find the general solution to the system of differential equations
d²x/dt² = Bx ,
where
x(t) = (x1 (t), x2 (t))    and    B = ( −2  √2 ; √2  −3 ) .
Problem 7.5. Consider two equal masses m connected by three springs with stiffness
coefficients k as indicated in the diagram:
[Diagram: two masses m, with displacements x1 and x2 , joined in a line by three springs, each of stiffness k.]
Problem 7.6. In quantum physics the outcome of a measurement will correspond to
the eigenvalues of an operator. The spins of a particle in the x, y and z directions are
described by the three Pauli spin matrices
Sx = 1/2 ( 0  1 ; 1  0 ) ,    Sy = 1/2 ( 0  i ; −i  0 ) ,    Sz = 1/2 ( 1  0 ; 0  −1 ) ,
where i = √−1 is the imaginary unit. Calculate the values of any measurements of spins
along the x, y and z directions, which are given by the corresponding eigenvalues of Sx ,
Sy , Sz .
Problem 7.7. Determine the algebraic and geometric multiplicities of the eigenvalues for
the matrix
2 75 0
E= 0 2 0
0 0 2
Problem 7.8. For the matrix
4 2
B= ,
3 −1
find the diagonalisation transformation.
calculate D54 .
Answers
1. For A: λ1 = 2, x1 = α (1, −2) ; λ2 = 3, x2 = α (−3, 5) .
   For B: λ1 = 8, x1 = α (1, 1) ; λ2 = −2, x2 = α (1, −1) .
2. (x(t), y(t)) = C1 e2t (1, −2) + C2 e3t (−3, 5) .
3. For D: λ1 = −2, x1 = α (1, −1, 0) ; λ2 = 2, x2 = α (7, −7, 4) ; λ3 = 8, x3 = α (1, 1, 0) .
   For E: λ1 = 0, x1 = α (1, −1, 0) ; λ2 = 2, x2 = α1 (0, 0, 1) + α2 (1, 1, 0) .
4. x(t) = (√2, 1)(C1 cos t + C2 sin t) + (1, −√2)(C3 cos 2t + C4 sin 2t) .
5. The natural frequencies are ω1 = 1, ω2 = √3. The general solution is
   (1, 1)(A1 cos t + A2 sin t) + (1, −1)(A3 cos(√3 t) + A4 sin(√3 t)) .
10. D^54 = 1/2 ( 6^54 + 4^54   6^54 − 4^54 ; 6^54 − 4^54   6^54 + 4^54 ) .
Appendix A
Standard Derivatives:
F (x)        F ′ (x)
xn           nxn−1
Standard Integrals:
f (x)        ∫ f (x) dx
Integration by Parts :
∫a^b u(x) (dv/dx) dx = [u(x) v(x)]a^b − ∫a^b v(x) (du/dx) dx
Trigonometrical Formulæ :
sin2 A + cos2 A = 1 , sec2 A = tan2 A + 1 ,
sin(A + B) = sin A cos B + cos A sin B , sin(A − B) = sin A cos B − cos A sin B
cos(A + B) = cos A cos B − sin A sin B , cos(A − B) = cos A cos B + sin A sin B
sin 2A = 2 sin A cos A , cos 2A = 2 cos2 A − 1 = 1 − 2 sin2 A
sin A sin B = ½ (cos(A − B) − cos(A + B))
cos A cos B = ½ (cos(A − B) + cos(A + B))
sin A cos B = ½ (sin(A + B) + sin(A − B))
sin A + sin B = 2 sin((A + B)/2) cos((A − B)/2)
sin A − sin B = 2 sin((A − B)/2) cos((A + B)/2)
cos A + cos B = 2 cos((A + B)/2) cos((A − B)/2)
cos A − cos B = −2 sin((A + B)/2) sin((A − B)/2)
Laplace transforms :
f (t) F (s)
c c/s
t 1/s2
tn n!/sn+1
ekt 1/(s − k)
sin at a/(s2 + a2 )
cos at s/(s2 + a2 )
t sin at 2as/(s2 + a2 )2
δ(t − a) e−as
eat f (t) F (s − a)
f (t) = g(t − a) for t > a, 0 for t < a          e−as G(s)
Appendix B
Partial Fractions
In the same way you can add algebraic fractions together you can also separate them into
combinations of algebraic fractions; such terms are called partial fractions. For example, suppose we write
(7x + 10)/((x + 2)(x + 1)) = A/(x + 2) + B/(x + 1) .
Next, multiply both sides of the equation by the denominator (x + 2)(x + 1) and tidy up:
7x + 10 = A(x + 1) + B(x + 2). Comparing coefficients (or setting x = −2 and x = −1 in turn) gives A = 4 and B = 3, so that
(7x + 10)/(x² + 3x + 2) = 4/(x + 2) + 3/(x + 1) .
If instead the denominator has a repeated root, as in (x − 2)/(x + 1)², the above approach does not work.
In this case you have to represent the algebraic fraction as the sum of two partial fractions
in the form
(x − 2)/(x + 1)² = A/(x + 1) + B/(x + 1)² .
Now multiplying both sides by (x + 1)2 gives
x − 2 = A(x + 1) + B . (B.2)
You can now equate terms to get two simultaneous equations in A and B to solve:
x : 1 = A; 1 : −2 = A + B .
Hence A = 1 and then B = −3. (Note that you could instead take x = −1 in (B.2) to
get B = −3 immediately.) Then
(x − 2)/(x + 1)² = 1/(x + 1) − 3/(x + 1)² .
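Partial fraction decompositions like the two above can be checked symbolically; a minimal sympy sketch:

```python
import sympy as sp

x = sp.symbols('x')
print(sp.apart((7*x + 10) / ((x + 2)*(x + 1))))   # 4/(x + 2) + 3/(x + 1)
print(sp.apart((x - 2) / (x + 1)**2))             # 1/(x + 1) - 3/(x + 1)**2
```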
You can also get algebraic fractions where the denominator includes quadratic terms
(ax2 + bx + c) that cannot be factorised into two linear factors (without using complex
numbers) as the roots are complex. Such cases give rise to partial fractions of the form
(Ax + B)/(ax² + bx + c)
where A and B are real constants.
Once we have the partial fraction representation of the algebraic fraction, we find the
unknowns, A, B and C, in the same way as in the other examples.
Step 1. Recombine the partial fractions over a common denominator:
Step 3. Tidy up:
−3(−2B) + B = 7B = 7 so B = 1.
(Again, some work can be saved by seeing what you get by trying x = 2.)
In summary, the forms of partial fractions to use are:
(mx + n)/((px + q)(rx + s)) = A/(px + q) + B/(rx + s) ;
(mx + n)/(px + q)² = A/(px + q) + B/(px + q)² ;
(lx² + mx + n)/((ax² + bx + c)(px + q)) = (Ax + B)/(ax² + bx + c) + C/(px + q) ;
(lx² + mx + n)/((px + q)²(rx + s)) = A/(px + q) + B/(px + q)² + C/(rx + s) .
In all cases, you can multiply through by the denominator on the left-hand side of the
equation, tidy up and compare terms in the top line. (Work is often saved by seeing what
x = −q/p (or −s/r) gives.)
Appendix C
Completing the Square
In any quadratic function you can complete the square. Consider the function
x2 − 3x + 2 . (C.1)
You can complete the square by recalling that, for any number a,
(x + a)² = x² + 2ax + a² ,
so that x² + bx + c = (x + b/2)² + c − b²/4.
Thus, to complete the square in (C.1), where b = −3, we add and subtract (b/2)² = 9/4 to
get
x² − 3x + 2 = (x − 3/2)² − 9/4 + 2 = (x − 3/2)² − 1/4 .
Once you have completed a square you can find the roots of a quadratic equation, although
in the Laplace transform method that is not the point of the procedure.