Matlab Notes
2019 Coursebook
Computational Mathematics
Copyright ©2019
Department of Mathematics
The University of Auckland
Front Cover: A single parotid acinar cell (in red) from a mouse parotid gland, with the associated
acinar lumen (in green). That is, it’s a spit-making cell, together with the tube that collects up the spit
from multiple cells. Figure reconstructed by John Rugis (University of Auckland) using experimental
data from David Yule (University of Rochester).
Contents
1 Introduction
2 Cryptography
2.1 Introduction
2.2 Ciphers
3.1 Introduction
3.4.2 Exercises
3.9 Loans
3.10.3 An exercise
3.10.4 Epidemics
4.1 Introduction
4.2.1 Randomness
4.2.2 Probability
Chapter 1
Introduction
Contents
1.1 What is this course about?
1.2 How to do well in this course
1.1 What is this course about?
There are many areas of modern mathematics where computational methods play a large role. In
this course you will learn about the mathematical underpinnings of several areas of mathematics
where computation plays a big part, and also learn about the practical aspects of programming and
algorithmic thinking. We will discuss some of these areas in the first lecture. By the end of this
course you will not only understand the theoretical basis for a number of fascinating areas of modern
mathematics, but also be able to develop and program algorithms to solve these problems.
In this course we will use the programming language MATLAB. MATLAB is commonly used in
mathematics, and it will be useful to learn some specific aspects of programming in MATLAB.
MATLAB is available on the lab computers, and you can purchase a copy for your personal computer
from the bookstore at a discounted student rate. You will need to become comfortable with MATLAB
in order to do well in this course. More broadly, you will learn algorithmic thinking and programming
skills which are applicable to many environments beyond MATLAB.
There are additional online resources, based on MATLAB, which we will also use throughout for extra
practice. One is MATLAB Onramp, a web-based tutorial which will help you learn the basics
of MATLAB.[1]
The other is Cody, a MATLAB-based online environment which tests your skills; we will
use it for additional exercises throughout the course.
There will also be useful resources posted on canvas, including this coursebook, links to matlab
resources, lecture recordings, sample code, assignments, practice tests and exams, and more.
1.2 How to do well in this course
You should plan to spend about 10 hours each week working on this course. This includes attending
lectures, reading this book and doing assignment questions.
Try hard not to miss lectures. If you miss a lecture, read the lecture notes and watch the lecture
recording (if available) before the next lecture. Lecture recordings are a valuable resource, but they
do not contain everything that occurs in lecture, and are not intended as a substitute for lectures.
You can only learn mathematics by doing mathematics and it is important to supplement lecture
material by trying some of the recommended problems. Try some of the problems every week. Don’t
wait until it is time to study for the exam.
Attempt all assignments and all questions on the assignment. Once your assignment is marked, go
over the assignment to check where you made mistakes. Sample solutions to the assignments will be
made available - read them, as they contain helpful information such as alternative ways to answer
questions.
If you are having problems with material in the course, first make sure you have read the appropriate
parts of the lecture notes. Then speak to your lecturer, either in lectures or by making an appointment
with your lecturer for another time. Good ways to make an appointment are by speaking to your
lecturer after class or by emailing your lecturer. Don’t be scared to approach your lecturers for help
- they are happy to help students who are trying to help themselves.
If you need help with computer use in the computer laboratory, ask a demonstrator in the laboratory.
Demonstrators on duty will be wearing a sash and there will always be a demonstrator on duty when
the basement computer laboratory is open. If the demonstrators are unable to help you with details
of the MATLAB package used, then ask your lecturer for help.
To prepare for the test or exam, first make sure you understand your lecture notes and make sure you
can do all assignment and tutorial questions. Go over some old exam papers (these can be downloaded
from the University Library website).
[1] You will find a link to MATLAB Onramp on canvas.
Chapter 2
Cryptography
Contents
2.1 Introduction
2.2 Ciphers
2.3 Modular Arithmetic
2.4 Caesar cipher
2.5 Brute force attack
2.6 Affine ciphers
2.7 Cribs and frequency analysis
2.8 One-time pads
2.9 Euclid’s algorithm
2.10 Factoring integers
2.11 Public-key cryptography: Diffie-Hellman
2.11.1 Modular exponentiation
2.12 Element-wise operations
2.13 Modulus as a congruence relation
2.1 Introduction
Cryptography is the study of writing and solving codes; specifically, techniques for secure communi-
cations. Cryptography forms the basis of many modern communications systems, and this includes
not just messaging but also the secure online transactions that we take for granted in the modern
world. In this chapter we will explore both the mathematical theory behind such systems, and also
the computational methods and algorithms used for both creating and breaking codes.
We will also introduce, throughout this chapter, the basic concepts of matlab (and programming in
general) that you will need as we progress.
2.2 Ciphers
Theory
The simplest cryptographic systems are substitution ciphers. The central idea is that each letter of
the message to be encoded is replaced by a pre-defined coded letter; thus the original message (the
plaintext) is encoded into the ciphertext. Perhaps the simplest such cipher is known as the Atbash
cipher, which encodes letters by reversing the alphabet. Thus A goes to Z, B to Y, etc., as in the
following table:
plaintext A B C D E F G H I J K L M
ciphertext Z Y X W V U T S R Q P O N
(continued)
plaintext N O P Q R S T U V W X Y Z
ciphertext M L K J I H G F E D C B A
The Atbash cipher can then easily be read off, given the key. For example, from the plaintext FOX we
convert F→U, O→L and X→C so that the plaintext FOX becomes the ciphertext ULC.
Practice
Implementing the atbash cipher in code is easiest if instead of working with letters, we convert our
letters into integers and work with those. Taking A to be 0, B to be 1, etc., up to Z= 25, each letter
of the cipher (C) is computed from the plaintext P easily as
C = 25 − P. (2.1)
Introduction to matlab
In its simplest form, you can simply use matlab as a calculator; for example:
>> 2+2
ans =
4
>> 5*7
ans =
35
>> exp(3)
ans =
20.085536923187668
This last line, exp(3), calls one of matlab’s built-in functions. This particular one is the exponential
function, i.e. e^3. There are lots of other built-in functions that we will use later on.
It is also possible to write your own functions to perform particular tasks. In order to implement our
cipher, we will first need to be able to convert letters into integers, and also the reverse. There are two
function files posted on canvas which we will use for this. Download chartoint.m and inttochar.m
from canvas, and change matlab’s “Current Folder” to the folder where you have put these files. The
function chartoint converts a character (that is, a letter) into the corresponding integer value:
>> chartoint('A')
ans =
0
>> chartoint('B')
ans =
1
>> chartoint('M')
ans =
12
>> chartoint('Z')
ans =
25
Letters must be entered using single quotation marks. Similarly, inttochar converts an integer (0-25)
back to the corresponding letter. Here we are using only capital letters.
>> inttochar(0)
ans =
A
>> inttochar(25)
ans =
Z
To get the most out of matlab, we need to use it not just as a calculator, but also to work with variables.
>> x=5
x =
5
>> y=3
y =
3
>> z=x*y
z =
15
We can also use variables to hold our characters (letters), and pass those to our functions.
>> my_character='G'
my_character =
G
>> my_integer=chartoint(my_character)
my_integer =
6
>> inttochar(my_integer)
ans =
G
>> P=chartoint('G')
P =
6
>> C=25-P
C =
19
>> inttochar(C)
ans =
T
However, we don’t really want to work with just one letter at a time; we want to encrypt entire
messages. In order to handle this, in matlab, we will need the concept of vectors.
Vectors in matlab are much like the mathematical concept that you already know. You can have either
row or column vectors, and you access individual elements of a vector by using parentheses ().
>> myvector=[1 2 3 4]
myvector =
1 2 3 4
>> myvector=[1; 2; 3; 4]
myvector =
1
2
3
4
>> myvector(3)
ans =
3
>> myvector(3)=11;
>> myvector
myvector =
1
2
11
4
Because we need to think about messages, we also need strings, which are essentially vectors of
characters. Individual elements can be accessed and changed just as with vectors.
>> mystring='SECRETMESSAGE'
mystring =
SECRETMESSAGE
>> mystring(5)
ans =
E
>> mystring(5)='Z'
mystring =
SECRZTMESSAGE
Now we can operate directly on strings and vectors, rather than just one integer or character at a
time.
>> message='FOX'
message =
FOX
>> V=chartoint(message)
V =
5 14 23
>> inttochar(V)
ans =
FOX
How, then, do we write a function to apply the cipher? This will be a three-step process:
1. Convert the plaintext string into a vector of integers, one for each character.
2. Use the equation C = 25 − P to encode the ciphertext, with one integer corresponding to each
character.
3. Convert the encoded integers back into a string of characters.
The first and third steps are just as we did above, with chartoint() and inttochar(). For the second
step, to apply the cipher to each letter, we will use a for loop.
for loops
The for loop is a basic programming construct which uses the same code repeatedly, under specified
conditions. At its most basic, something like
for i=[1 2 3]
2*i
end
executes the code inside the block (2*i) for each of the i values specified in the loop definition (1,2,3).
Thus this loop is executed three times, and the calculations at each stage are 2, 4 and 6.
for loops can also operate on other variables, not just the counter variable (i in the example above).
x=1;
for i=[1 2 3 4]
x=2*x
end
This loop executes four times, and the value of the variable x is doubled each time; initially we have
x=1; after each pass through the loop we have 2, 4, 8 and 16.
One natural extension is to construct our for loops to work with vectors; that is, each pass through
the loop deals with one element in the vector. For example
>> for i=[1 2 3 4 5]
x(i)=i^2
end
x =
1
x =
1 4
x =
1 4 9
x =
1 4 9 16
x =
1 4 9 16 25
Now we have the tools needed to implement the atbash cipher in matlab, using a for loop to work
with each element of the vector to be encoded. Here is an example of one way to do this:
function ciphertext=atbash(plaintext)
The first thing to note is that here we are writing a function, and so this code would go into its own
file (atbash.m) to be executed when we call it. One way to know that this is the case is to see the
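keyword function at the top. Breaking down this top line, we have the keyword function, the output
variable (ciphertext), the name of the function (atbash) and, in parentheses, the input variable (plaintext).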
The body of the function (the following lines) takes the input variable (plaintext in this case) and
computes the output (ciphertext). It is important to distinguish between function files, and interactive
commands entered at the matlab prompt. In order to do this, we have colour-coded each in this
coursebook as follows. Examples of interaction with the matlab prompt appear this way:
>> 2+2
ans =
4
while the contents of function files appear this way:
function output=myfunctionname(input)
output=input^2;
For more details on writing your own function, see the matlab reference chapter (Ch 7).
Back to the atbash cipher then. Using our new function, we can easily encode our plaintext strings.
>> atbash('SUPERSECRETMESSAGE')
ans =
HFKVIHVXIVGNVHHZTV
How, then, do we decode the ciphertext? As it happens, we already have the means to do that – the
atbash cipher decodes itself!
>> ciphertext=atbash('SUPERSECRETMESSAGE')
ciphertext =
HFKVIHVXIVGNVHHZTV
>> atbash(ciphertext)
ans =
SUPERSECRETMESSAGE
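The three steps described above can be sketched as a short script. This is only a minimal sketch, not necessarily the contents of the course's atbash.m: the helpers c2i and i2c are hypothetical stand-ins for chartoint.m and inttochar.m, assuming those map 'A' to 0 through 'Z' to 25 and accept whole strings.

```matlab
% Hypothetical stand-ins for chartoint.m and inttochar.m
% (assumed behaviour: 'A' -> 0, ..., 'Z' -> 25, applied to whole strings)
c2i = @(s) double(s) - double('A');
i2c = @(v) char(v + double('A'));

plaintext = 'FOX';
P = c2i(plaintext);            % step 1: letters to integers
C = zeros(1, length(P));       % space for the encoded integers
for j = 1:length(P)
    C(j) = 25 - P(j);          % step 2: equation (2.1), one letter at a time
end
ciphertext = i2c(C);           % step 3: integers back to letters
disp(ciphertext)               % ULC

% Applying the same rule to the ciphertext recovers the plaintext,
% since 25 - (25 - P) = P (here applied to the whole vector at once)
recovered = i2c(25 - c2i(ciphertext));
disp(recovered)                % FOX
```

The last two lines show why the cipher decodes itself: applying C = 25 − P twice returns the original integer.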
2.3 Modular Arithmetic
Theory
One key idea which we will need in order to understand cryptographic systems is that of modular
arithmetic, in particular the modulo operator, written b (mod n).
The idea is the same as the remainder after division. For example, 14 (mod 3) is 2, because 14 = 4×3+2.
For cryptography we are particularly interested in operations (mod 26), for the 26 letters in the Latin
(English) alphabet.
Thinking of the integers (mod 26) as representing the letters A-Z, then applying 3x + 5 (mod 26), we
have D→M (3 → 14) and I→D (8 → 3). Clearly, operations (mod 26) will be helpful in constructing
cryptosystems!
Practice
MATLAB has a built-in function for the modulus, mod(b,n) which computes b (mod n).
>> mod(14,3)
ans =
2
>> P=chartoint('MESSAGE')
P =
12 4 18 18 0 6 4
>> C=mod(3*P+5,26)
C =
15 17 7 7 5 23 17
>> inttochar(C)
ans =
PRHHFXR
2.4 Caesar cipher
Theory
The Caesar cipher is a simple substitution cipher, like the atbash cipher, but here the idea is to replace
each plaintext letter by the letter which is three steps down the alphabet. Hence A becomes D, B
becomes E, etc. At the end, we “roll” around, so that X becomes A, Y becomes B, and Z becomes C.
Figure 2.1: Illustration of the Caesar cipher. Image credit: public domain, wikipedia.
Thinking again of the letters A-Z as the integers (mod 26), so that A= 0, B= 1 . . . , Z= 25, then the
caesar cipher can be written using the modulo operator as
C = (P + 3) (mod 26). (2.3)
Then the decryption function is
P = (C − 3) (mod 26). (2.4)
In fact, this works with any shift key, not just 3; let’s call it k. Then we encrypt with
C = (P + k) (mod 26) (2.5)
and the decryption function is
P = (C − k) (mod 26). (2.6)
Practice
As with the atbash cipher, we first convert our plaintext string into the integers (mod 26) using our
chartoint() function. Then we use a for loop over the length of the message, and apply our encoding
function. Finally we convert the integer encoded message back to a string using inttochar().
function C = caesarcipher(P,k)
% inputs: P, plaintext string, all capitals, no spaces. k, integer key
% returns: C, ciphertext string
Example 2.4.1. How can you use the function above to decrypt caesar cipher messages?
Observe that the decryption function is simply the encryption function, but with a key of the opposite
sign (k → −k).
>> message='ISTHISSECURE';
>> C=caesarcipher(message,5) % encrypt with key +5
C =
NXYMNXXJHZWJ
>> caesarcipher(C,-5) % decrypt with key -5
ans =
ISTHISSECURE
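For reference, equations (2.5) and (2.6) can be sketched as a short script. This is a minimal sketch rather than the course's caesarcipher.m: c2i and i2c are hypothetical stand-ins for chartoint.m and inttochar.m, assumed to map 'A' to 0 through 'Z' to 25.

```matlab
% Hypothetical stand-ins for chartoint.m and inttochar.m
c2i = @(s) double(s) - double('A');
i2c = @(v) char(v + double('A'));

message = 'ISTHISSECURE';
k = 5;                               % the shift key
P = c2i(message);                    % plaintext as integers (mod 26)

C = zeros(1, length(P));
for j = 1:length(P)
    C(j) = mod(P(j) + k, 26);        % encrypt, equation (2.5)
end
ciphertext = i2c(C);
disp(ciphertext)                     % NXYMNXXJHZWJ

D = zeros(1, length(C));
for j = 1:length(C)
    D(j) = mod(C(j) - k, 26);        % decrypt, equation (2.6)
end
recovered = i2c(D);
disp(recovered)                      % ISTHISSECURE
```

Note that mod returns a value between 0 and 25 even when C(j) − k is negative, which is what lets the same formula "roll around" the alphabet in both directions.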
2.5 Brute force attack
Theory
At the end of the last section, we observed that the decryption function for the Caesar cipher is simply
the encryption function where the key k now has the opposite sign (k → −k).
This suggests a method of breaking the cipher, using what is known as a brute force attack. Suppose
we have only the ciphertext, but not the key used to encode it, but that nonetheless we wish to decode
the message. If we know that this is a Caesar cipher, then all we must do is try all possible keys!
The question, then, is how many possible keys are there? If this number is 10^15, then it will not be
practical to try them all. However, in the case of the Caesar cipher, there are only 25 possible keys.
There are two ways of understanding this. The first is to realize that because the cipher is a shift,
there are only 25 possible shifts. Formally, in terms of the modulo operator,
(P + k) (mod 26) = (P + (k (mod 26))) (mod 26).
That is, we need only check keys from 1 up to 25. (Why not k = 0 or k = 26?)
Practice
Example 2.5.1. Decode the ciphertext STEADNIWTHTRGTILTPEDC by brute force attack. You may
assume that it is encrypted with a Caesar cipher, but you do not know the key.
Solution
The brute force approach is simply to try every possible key; we will use a for-loop to cover all
possibilities. We already wrote a function for applying the encryption/decryption function, so let’s use
that.
function bruteforce(C)
for j=1:25
caesarcipher(C,j) % try all possible keys
end
>> bruteforce('STEADNIWTHTRGTILTPEDC')
TUFBEOJXUIUSHUJMUQFED
UVGCFPKYVJVTIVKNVRGFE
VWHDGQLZWKWUJWLOWSHGF
WXIEHRMAXLXVKXMPXTIHG
XYJFISNBYMYWLYNQYUJIH
YZKGJTOCZNZXMZORZVKJI
ZALHKUPDAOAYNAPSAWLKJ
ABMILVQEBPBZOBQTBXMLK
BCNJMWRFCQCAPCRUCYNML
CDOKNXSGDRDBQDSVDZONM
DEPLOYTHESECRETWEAPON
EFQMPZUIFTFDSFUXFBQPO
FGRNQAVJGUGETGVYGCRQP
GHSORBWKHVHFUHWZHDSRQ
HITPSCXLIWIGVIXAIETSR
IJUQTDYMJXJHWJYBJFUTS
JKVRUEZNKYKIXKZCKGVUT
KLWSVFAOLZLJYLADLHWVU
LMXTWGBPMAMKZMBEMIXWV
MNYUXHCQNBNLANCFNJYXW
NOZVYIDROCOMBODGOKZYX
OPAWZJESPDPNCPEHPLAZY
PQBXAKFTQEQODQFIQMBAZ
QRCYBLGURFRPERGJRNCBA
RSDZCMHVSGSQFSHKSODCB
So we can see that the shift j = 11 recovers the message ’DEPLOYTHESECRETWEAPON’. Because
bruteforce adds j to each letter, the message must have been encrypted with the key 26 − 11 = 15
(equivalently, −11).
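The same attack can be sketched as a self-contained script that labels each candidate with its shift. This is only a sketch: c2i and i2c are hypothetical stand-ins for chartoint.m and inttochar.m (assumed 'A' to 0 through 'Z' to 25), and the shift is applied to the whole vector at once rather than by calling caesarcipher.

```matlab
% Hypothetical stand-ins for chartoint.m and inttochar.m
c2i = @(s) double(s) - double('A');
i2c = @(v) char(v + double('A'));

ciphertext = 'STEADNIWTHTRGTILTPEDC';
C = c2i(ciphertext);

candidates = cell(1, 25);                    % one candidate plaintext per shift
for j = 1:25
    candidates{j} = i2c(mod(C + j, 26));     % add j to every letter, mod 26
    fprintf('%2d  %s\n', j, candidates{j});  % print the shift next to the text
end
```

Printing the shift beside each line makes it easy to read off which j produced readable English, and hence which key was used to encrypt.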
2.6 Affine ciphers
Theory
The idea of the Caesar cipher can be extended in a simple way. Instead of the simple shift ((P + k)
(mod 26)) we can instead take our encryption function in a slightly more general form as
C = (aP + b) (mod 26).
This is known as an affine cipher (because the encryption function is an affine function, mod 26).
Now the cipher key is not just a single integer, but instead we require both a and b (both integers).
However, we cannot just go choosing a and b freely. In order for the affine cipher
to work, we need a and 26 to be coprime. That is, the only positive integer which divides them both
is 1. Without this condition, it is not possible to decrypt the cipher. The decryption function is
P = A(C − b) (mod 26),
where 1 = aA (mod 26). Hence if a and 26 are not coprime, then we can’t find A to decrypt the
message.[2] But, so long as we choose a coprime with 26, we can select any integer b and have a usable
affine cipher. Note that if a = 1 we have the Caesar cipher again.
Example 2.6.1. Suppose we wish to use an affine cipher with the key a = 3, b = 5. In order to find
the decryption function, we must find A such that 1 = 3A (mod 26). Because 3 and 26 are coprime,
this has a solution: A = 9. Observe that aA (mod 26) = 3 × 9 (mod 26) = 27 (mod 26) = 1.
Example 2.6.2. Is the affine cipher harder to break by brute force, compared with the Caesar cipher?
How many keys would you need to try? Hint: the only numbers coprime with 26, and less than 26,
are: 1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, and 25.
[2] If you find yourself particularly interested in why, you should consider MATHS 328 in future.
Practice
As with our other ciphers, we have a three step process: first convert the message into integers
(mod 26); then use a for loop to apply the encryption function to each entry; finally convert the
encrypted integers back into a string.
function C = affinecipher(P,a,b)
%inputs: P, plaintext string
% a, b: integer key. assuming a is coprime with 26
How do we decrypt affine cipher messages? In the example above, we found A = 9 for the key a = 3,
b = 5, so the decryption function is
P = 9(C − 5) (mod 26) = (9C − 9 × 5) (mod 26).
But this is just another affine cipher! So we can both encrypt and decrypt using the same function,
just by altering the key.
>> C=affinecipher('TOPSECRETMOONBASEPLANS',3,5)
C =
KVYHRLERKPVVSIFHRYMFSH
>> affinecipher(C,9,-9*5)
ans =
TOPSECRETMOONBASEPLANS
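The whole round trip, including the search for A, can be sketched as follows. This is a minimal sketch and not the course's affinecipher.m: c2i and i2c are hypothetical stand-ins for chartoint.m and inttochar.m, the formulas are applied to the whole vector at once, and A is found by simply trying every value from 1 to 25.

```matlab
% Hypothetical stand-ins for chartoint.m and inttochar.m
c2i = @(s) double(s) - double('A');
i2c = @(v) char(v + double('A'));

a = 3; b = 5;                        % the key; a must be coprime with 26

% Find A with a*A = 1 (mod 26) by trying every candidate
A = 0;
for t = 1:25
    if mod(a*t, 26) == 1
        A = t;                       % for a = 3 this gives A = 9
    end
end

P = c2i('TOPSECRETMOONBASEPLANS');
C = mod(a*P + b, 26);                % encrypt: C = (aP + b) (mod 26)
ciphertext = i2c(C);
disp(ciphertext)                     % KVYHRLERKPVVSIFHRYMFSH

recovered = i2c(mod(A*(C - b), 26)); % decrypt: P = A(C - b) (mod 26)
disp(recovered)                      % TOPSECRETMOONBASEPLANS
```

If a shared a factor with 26, the search loop would leave A at 0, which is one way to see in practice that no decryption key exists.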
2.7 Cribs and frequency analysis
Theory
We have already discussed brute force attacks in Sec. 2.5, but this is certainly not the only way of
attempting to decrypt messages (without knowing the key). In this section we consider additional
methods which take advantage of the likely content of the plaintext message.
The first is the idea of the “crib”, a term which originated with the codebreakers at Bletchley Park,
working to break the Enigma machine, during WWII. The key concept is that in some situations,
certain parts of the plaintext may be known (or guessed). Historical examples include messages from
certain stations which often stated only “Nothing to report”, or weather reports at the same time
each day which contain the word “weather” (and also common weather terms for that day’s weather).
Such information can be extremely valuable in decoding messages.
Example 2.7.1. Suppose that you have a ciphertext, encoded with an affine cipher, which you wish
to break: “HSYJINJJHAGRANIJSGV”. Instead of resorting to brute force (see example 2.6.2), here
we have a crib: suppose we know that the first two letters of each message are “HI”.
Writing the decryption function as P = A(C − b) (mod 26),
we can use the first two letters of the ciphertext (H= 7, S= 18) and the first two letters of the
plaintext from the crib (H= 7, I= 8) to find:
7 = A(7 − b) (mod 26)
8 = A(18 − b) (mod 26)
This is now a system of two equations and two unknowns; all we have to do is solve these simultaneous
equations in order to find A and b.
Subtracting the first equation from the second gives 1 = 11A (mod 26), so A = 19, and then the
first equation gives b = 8. Remembering the relationship between the affine encryption
and decryption functions, we can decode our message:
>> affinecipher('HSYJINJJHAGRANIJSGV',19,-8*19)
ans =
HISTARTTHEOPERATION
For more complex cryptosystems, cribs can still provide valuable information, if not an out-and-out
solution, as in this example.
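If solving the simultaneous equations by hand is awkward, the crib can also be used to search for the key directly. The following is a minimal sketch under the assumption that the decryption function has the form P = A(C − b) (mod 26); c2i and i2c are hypothetical stand-ins for chartoint.m and inttochar.m.

```matlab
% Hypothetical stand-ins for chartoint.m and inttochar.m
c2i = @(s) double(s) - double('A');
i2c = @(v) char(v + double('A'));

% Crib: ciphertext H (7) decodes to H (7), and ciphertext S (18) decodes to I (8)
found = [];
for A = [1 3 5 7 9 11 15 17 19 21 23 25]     % values coprime with 26
    for b = 0:25
        if mod(A*(7 - b), 26) == 7 && mod(A*(18 - b), 26) == 8
            found = [found; A b];            % keep every pair consistent with the crib
        end
    end
end
disp(found)                                  % 19 8

C = c2i('HSYJINJJHAGRANIJSGV');
plaintext = i2c(mod(found(1,1)*(C - found(1,2)), 26));
disp(plaintext)                              % HISTARTTHEOPERATION
```

Only 12 × 26 = 312 pairs need checking, and the two crib letters are enough to leave a single candidate.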
Frequency analysis
Frequency analysis also exploits properties of the plaintext, in particular the fact that letters are not
equally used in natural language. For example, in English, approximately 12.7% of letters are ‘E’ and
8.2% ‘A’, while less than 0.1% are ‘Q’ or ‘Z’.
The patterns can also be extended to groups of letters, for example with ‘TH’ being the most common
pair of letters (bigram) in English, and ‘THE’ the most common triplet (trigram).
This method is more usable, of course, with longer messages; the longer the message, the more likely
that it adheres closely to the known distributions.
Suppose that the ciphertext (stored in the variable C)
BPQAQABPMAWVOBPIBVMDMZMVLAQBOWMAWVIVLWVUGNZQMVLAAWUMXMWXTMABIZBMLAQVOQV
OQBVWBSVWEVQVOEPIBQBEIAIVLVWEBPMGSMMXWVAQVOQVOQBNWZMDMZRCABJMKICAM
is encoded with a Caesar cipher, and the plaintext is in English, but we do not know the key.
The most common letters in this ciphertext are, in descending order: V, M, B and A. Knowing that
E is the most common letter in English texts, we first guess that V = E, which would mean that the
Caesar cipher has a key of 21 − 4 = 17. Checking we find
>> caesarcipher(C,-17)
ans =
KYZJZJKYVJFEXKYRKEVMVIVEUJZKXFVJFEREUFEDPWIZVEUJJFDVGVFGCVJKRIKVUJZEXZE
XZKEFKBEFNEZEXNYRKZKNRJREUEFNKYVPBVVGFEJZEXZEXZKWFIVMVIALJKSVTRLJV
Oh dear. No luck this time. But, we’re not done yet: the next most common letter in the ciphertext
was M. Guessing M = E, the key would be 12 − 4 = 8.
>> caesarcipher(C,-8)
ans =
THISISTHESONGTHATNEVERENDSITGOESONANDONMYFRIENDSSOMEPEOPLESTARTEDSINGING
ITNOTKNOWNINGWHATITWASANDNOWTHEYKEEPONSINGINGITFOREVERJUSTBECAUSE
Success!
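Counting the letters by hand is tedious, so here is a minimal sketch of how the counts might be computed and a guessed key tested. The helpers c2i and i2c are hypothetical stand-ins for chartoint.m and inttochar.m, and the variable names are our own.

```matlab
% Hypothetical stand-ins for chartoint.m and inttochar.m
c2i = @(s) double(s) - double('A');
i2c = @(v) char(v + double('A'));

C = ['BPQAQABPMAWVOBPIBVMDMZMVLAQBOWMAWVIVLWVUGNZQMVLAAWUMXMWXTMABIZBMLAQVOQV', ...
     'OQBVWBSVWEVQVOEPIBQBEIAIVLVWEBPMGSMMXWVAQVOQVOQBNWZMDMZRCABJMKICAM'];

counts = zeros(1, 26);                  % counts(1) is for A, ..., counts(26) for Z
for j = 1:length(C)
    idx = c2i(C(j)) + 1;                % MATLAB indices start at 1, letters at 0
    counts(idx) = counts(idx) + 1;
end
[~, order] = sort(counts, 'descend');   % letter positions, most frequent first
disp(i2c(order(1:4) - 1))               % the four most frequent ciphertext letters

k = mod(c2i('M') - c2i('E'), 26);       % guess M = E, which gives k = 8
plain = i2c(mod(c2i(C) - k, 26));       % undo the shift
disp(plain(1:20))                       % inspect the start of the candidate plaintext
```

Replacing 'M' in the guess by each of the most frequent letters in turn reproduces the trial-and-error process described above.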
Obviously, the Caesar cipher is particularly prone to this, but more sophisticated cryptosystems
can also be vulnerable to frequency analysis.[3]
2.8 One-time pads
Theory
So far we have looked at simple ciphers, which are easy to implement (but also easy to break).
The one-time pad is different. The central idea is to have a “code book” which must be exchanged
in advance. Then to encode your
plaintext message, you take each character in your message, and add it to the corresponding character
in the code book, taking the result (mod 26).
For example, suppose the code book is a secret book from the library – here, “The Complete Poems
of Emily Dickinson.” The first unused poem from the book supplies the key, here BESTWITCHCRAFTIS,
which we add letter by letter to the plaintext BUYTELECOMSHARES:
plaintext      B  U  Y  T  E  L  E  C  O  M  S  H  A  R  E  S
               1 20 24 19  4 11  4  2 14 12 18  7  0 17  4 18
key            B  E  S  T  W  I  T  C  H  C  R  A  F  T  I  S
               1  4 18 19 22  8 19  2  7  2 17  0  5 19  8 18
sum (mod 26)   2 24 16 12  0 19 23  4 21 14  9  7  5 10 12 10
ciphertext     C  Y  Q  M  A  T  X  E  V  O  J  H  F  K  M  K
[3] It’s also worth noting that the letter frequencies are different in other languages!
The decryption process then is to use the same letters from the key (code book) and take the difference
(mod 26). If the key is not predictable, then this is an unbreakable code. This means, however, that
the key cannot be re-used; hence the name one-time pad. This disadvantage is the reason that our
chapter on cryptography doesn’t end here! In many situations, exchanging a one-time pad key in
advance is impractical.
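To see why the key cannot be re-used, consider what happens if two messages are encrypted with the same key. The sketch below is illustrative only: the second plaintext is a made-up example, and c2i and i2c are hypothetical stand-ins for chartoint.m and inttochar.m.

```matlab
% Hypothetical stand-ins for chartoint.m and inttochar.m
c2i = @(s) double(s) - double('A');
i2c = @(v) char(v + double('A'));

key = c2i('BESTWITCHCRAFTIS');
P1  = c2i('BUYTELECOMSHARES');
P2  = c2i('SELLTELECOMSTOCK');      % hypothetical second message of the same length

C1 = mod(P1 + key, 26);             % both messages encrypted with the SAME key
C2 = mod(P2 + key, 26);
disp(i2c(C1))                       % CYQMATXEVOJHFKMK

% Eve sees only C1 and C2, but subtracting them cancels the key entirely:
% C1 - C2 = (P1 + key) - (P2 + key) = P1 - P2 (mod 26)
leak = mod(C1 - C2, 26);
disp(isequal(leak, mod(P1 - P2, 26)))   % 1 (true)
```

The difference of the two ciphertexts depends only on the two plaintexts, so once any part of one message is guessed (for example from a crib), the matching part of the other follows immediately.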
Practice
As with our other ciphers, we have a three step process: first convert the message and key into
integers (mod 26); then use a for loop to apply the encryption function to each entry; finally convert
the encrypted integers back into a string.
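function C = onetimepad(P,key)
% inputs P, plaintext string
% key, key string (at least as long as P)
% output C ciphertext string
P = chartoint(P); % convert the plaintext to integers (mod 26)
key = chartoint(key); % convert the key to integers (mod 26)
N = length(P);
C = zeros(1,N);
for j=1:N
C(j) = mod(P(j)+key(j),26); % add the matching key letter, mod 26
end
C = inttochar(C); % convert the result back to a string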
>> C=onetimepad('BUYTELECOMSHARES','BESTWITCHCRAFTIS')
C =
CYQMATXEVOJHFKMK
Example 2.8.1. Modify the code given above to provide the decryption function for a one-time pad.
2.9 Euclid’s algorithm
When working with integers (mod 26) for cryptosystems, we are often interested in the greatest
common divisor (gcd) of two integers (let’s call them a and b). The greatest common divisor of a and
b (gcd(a,b)) is the largest integer which divides both a and b without leaving a remainder. For example,
gcd(8, 12) = 4. In this section we discuss Euclid’s algorithm, which is a method of computing the gcd;
it is not a cryptosystem in itself, but it will help us to understand more sophisticated cryptosystems
later on.
Theory
Euclid’s algorithm is a famous method for computing the greatest common divisor. The key idea
behind Euclid’s algorithm, which dates back at least 2000 years, is an iterative process of trying
possible divisors.
Suppose we are trying to find gcd(8, 22). We might first try dividing 8 into 22, supposing that 8 could
be the greatest common divisor, but find that this leaves a remainder:
22 = 2 × 8 + 6
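The key observation is that now we need to find the greatest common divisor of 8 and 6. That is, if
8 = mG and 6 = nG, where G = gcd(6, 8), (and so m and n are integers) then
22 = 2 × 8 + 6 = (2m + n)G,
so G also divides 22. Continuing in the same way with 8 and 6:
8 = 1 × 6 + 2
6 = 3 × 2 + 0.
The last non-zero remainder is 2, so gcd(8, 22) = 2. That is
22 = 2 × 8 + 6 = 2 × (4 × 2) + 3 × 2 = 11 × 2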
Euclid’s algorithm is based on the same process. In more succinct form, at each step of the algorithm
we are given r_{k−1} and r_{k−2} and need to find the quotient (q_k) and remainder (r_k)
r_{k−2} = q_k r_{k−1} + r_k.
We start with r_{k−2} = a and r_{k−1} = b, and stop when r_k = 0; then gcd(a, b) = r_{k−1}. Here we have
assumed that a > b; if not, swap a and b before starting.
Practice
Euclid’s algorithm uses a type of control that we have not so far considered. That is, we perform the
main body of the algorithm until some condition is met (r_k = 0), rather than for a fixed number of
iterations. This means that we cannot use a for loop as we have been doing for ciphers up to this
point.
Instead we need another common programming control construct, the while loop. The idea here is
that we continue to execute our loop for as long as the condition holds, e.g.:
while(condition)
code-to-execute
end
More specifically, here is a while loop that squares x, starting at 2, so long as x is smaller than
1,000,000.
>> x=2;
>> while(x<1000000)
x=x^2
end
x =
4
x =
16
x =
256
x =
65536
x =
4.2950e+09
The while loop is a powerful concept that we will use repeatedly throughout the rest of the course.
We also use it to implement Euclid’s algorithm.
function z=euclidalgo(a,b)
z=rm1;
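As a minimal sketch of how the while loop fits Euclid's algorithm, here is a script version. It is written under our own assumptions (positive integer inputs, and the names rm1 and rm2 standing for r_{k−1} and r_{k−2}) and is not necessarily identical to the course's euclidalgo.m.

```matlab
a = 22; b = 8;            % the two positive integers from the worked example

rm2 = max(a, b);          % r_{k-2}: start with the larger number
rm1 = min(a, b);          % r_{k-1}: and the smaller number
r = mod(rm2, rm1);        % r_k: the first remainder (22 = 2*8 + 6, so r = 6)

while r ~= 0              % keep going for as long as the remainder is non-zero
    rm2 = rm1;            % shift everything along by one step
    rm1 = r;
    r = mod(rm2, rm1);    % next remainder
end

z = rm1;                  % the last non-zero remainder is the gcd
disp(z)                   % 2
```

The loop body runs twice here (remainders 6, then 2, then 0), matching the hand calculation of gcd(8, 22) in the theory section. MATLAB's built-in gcd(a,b) can be used to check the answer.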
2.10 Factoring integers
Theory
One of the key ideas in cryptography is the idea of a “one-way function”. The idea is that there are
certain operations which are computationally easy in one direction, but very difficult in the opposite
direction – such operations can then be used at the heart of cryptosystems.
One such operation is decomposing integers into prime factors, for example 4578 = 2 × 3 × 7 × 109.
In one direction, starting with 4578, and needing to find the prime factors, this problem is hard. In
the opposite, having the prime factors and simply needing to multiply them together, is easy.
To illustrate this, let us see how long it takes to factor some very big numbers.
Practice
There is code posted on canvas which uses matlab’s built-in factor() function to factor integers into
their prime factors. It does this for different numbers of digits (from 30 digits up to 50), and for each
size it generates 50 random numbers of this size and measures the time taken to factor each number.
This code is more complex than we have considered so far, and you aren’t necessarily expected to
understand all of it at this stage; but, you may find it useful to look through it and see if you can
understand the outline.
The main point of interest, though, is how long it takes to factor very large numbers. Running the
code, we can try to extrapolate the trend to understand how long it might take to factor even larger
numbers. The key conclusion, if you fit the trend line to the running time data, is that the amount of
time required grows very quickly as the size of the number increases.[4]
For example, on my laptop it took about 1.6 s to factor a 50-digit number. Not prohibitive. But
extrapolating to larger and larger numbers, a 100-digit number would take about 5 days. Again,
this still wouldn’t be adequate security for most purposes. But exponential growth will take care of
this; for 309-digit numbers, a size commonly used in modern cryptosystems, factoring in this manner
would be expected to take 7 × 10^20 years, considerably longer than the age of the universe. The
multiplication time, by contrast, grows only very slowly with the number of digits, and is trivial
even for numbers of this size. This is the key idea of the one-way function; the factorisation direction is
hard, but the reverse (multiplication) direction is easy.
Obviously there are faster computers, and better algorithms, which would reduce these factorisa-
tion times considerably. But the fundamental concept remains the same – very fast growth of the
computation time will make the computational cost prohibitive if the numbers are big enough.
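The two directions can be tried on a small scale with the built-in functions factor, prod, tic and toc. This is only a small illustration using the example from the theory section, not the code posted on canvas; numbers with 30 or more digits cannot be stored exactly in MATLAB's default number type, so the posted code must handle them differently.

```matlab
n = 4578;

tic;
f = factor(n);            % hard direction: find the prime factors
t_factor = toc;

tic;
m = prod(f);              % easy direction: multiply them back together
t_multiply = toc;

disp(f)                   % 2 3 7 109
disp(m)                   % 4578
fprintf('factor: %g s, multiply: %g s\n', t_factor, t_multiply);
```

For a number this small both directions are effectively instant, so the timings mostly reflect overhead; the difference only becomes dramatic as the number of digits grows.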
2.11 Public-key cryptography: Diffie-Hellman
Theory
Most of the cryptographic systems we have discussed so far rely on some sort of prior key exchange;
that is, that both parties already have a shared, common key.
But, this is a severe limitation. In many situations there is no obvious way to go about exchanging
keys securely. Fortunately there are mathematical solutions to this problem, which allow secure key
exchange over unsecured channels.
In this chapter we will illustrate the idea of public key cryptography by explaining the Diffie-Hellman
key exchange system. The commonly used RSA protocol, which encrypts much internet traffic today,
is based on similar ideas.
[4] In fact, faster than any polynomial.
Alice and Bob wish to communicate securely with one another, but they have not previously exchanged
any information (e.g. no shared, secure keys). Eve is attempting to eavesdrop on the communication
between Alice and Bob.
Alice and Bob begin by choosing two integers: a prime p, and an integer g (with 1 < g < (p − 1)).
Both p and g are transmitted in the clear – that is, Eve can intercept them.
Alice and Bob then both choose a secret integer; let’s call Alice’s secret integer a and Bob’s b.
Alice then computes A = g^a (mod p), using her secret integer, and sends this (in the clear) to Bob.
Bob computes B = g^b (mod p), using his secret integer, and sends this (in the clear) to Alice. Eve is
able to intercept both A and B.
Finally Alice computes B^a (mod p), and Bob computes A^b (mod p). Both Alice and Bob now
have a shared secret key s! Why is this? Because B^a = (g^b)^a = g^(ab) = (g^a)^b = A^b (mod p), both
calculations give the same number, s = g^(ab) (mod p).
But why can Eve not also compute the secret key s? The answer is that it is difficult to solve
A = g^a (mod p)
for a if g and p are large, in much the same way that factoring integers was difficult in the previous
section. So this is a one-way function; it is easy for Alice and Bob to do their calculations, but the
reverse calculation required of Eve is vastly harder.
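As a rough illustration of the exchange, here is a sketch with deliberately tiny numbers. The values p = 23, g = 5, a = 6 and b = 15 are illustrative choices, not taken from these notes, and are far too small to be secure; they are small enough that g^a can be formed directly before taking the modulus.

```matlab
% Toy Diffie-Hellman exchange with deliberately tiny numbers (not secure).
p = 23;  g = 5;          % public prime and base (illustrative values)
a = 6;   b = 15;         % Alice's and Bob's secret integers (illustrative)

A = mod(g^a, p);         % Alice sends A to Bob in the clear
B = mod(g^b, p);         % Bob sends B to Alice in the clear

sAlice = mod(B^a, p);    % Alice's version of the shared key
sBob   = mod(A^b, p);    % Bob's version of the shared key

fprintf('A = %d, B = %d, shared key: %d and %d\n', A, B, sAlice, sBob)
```

With these values A = 8, B = 19, and both parties arrive at the shared key 2. Eve sees p, g, A and B, but not a, b or the key.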
While calculating the power a may be difficult, going in the other direction, i.e. computing g^a (mod p),
can be very efficient. One way to do modular exponentiation would be to raise the base, g, to the
power, a, and then apply the modulus operator; however, there are more efficient ways. For any
integers x and y,
(x × y) (mod p) = ((x (mod p)) × (y (mod p))) (mod p),
so the modulus can be applied after every multiplication and the intermediate numbers stay small.
Based on this we can use the computationally efficient method of exponentiation by squaring. We can
work out g^2, g^4, g^8, etc. by repeatedly squaring, and then g^a can be calculated by taking products.
3^2 (mod 53) = 9
3^4 (mod 53) = 9^2 (mod 53) = 81 (mod 53) = 28
3^8 (mod 53) = 28^2 (mod 53) = 784 (mod 53) = 42
3^16 (mod 53) = 42^2 (mod 53) = 1764 (mod 53) = 15
⇒ 3^21 (mod 53) = 3^(16+4+1) (mod 53) = (15 × 28 × 3) (mod 53) = 1260 (mod 53) = 41
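The worked example above can be automated. The following is a minimal sketch of exponentiation by squaring, written as a short script; the variable names (result, square, e) are our own choices.

```matlab
% Exponentiation by squaring: compute g^a (mod p) without forming g^a.
g = 3;  a = 21;  p = 53;     % the worked example: 3^21 (mod 53)

result = 1;                  % running product of the squares we need
square = mod(g, p);          % holds g^(2^k) (mod p), starting with k = 0
e = a;                       % copy of the exponent, consumed bit by bit
while e > 0
    if mod(e, 2) == 1                      % this binary digit of a is 1,
        result = mod(result * square, p);  % so include the current square
    end
    square = mod(square * square, p);      % g^(2^k) -> g^(2^(k+1))
    e = floor(e / 2);                      % move to the next binary digit
end
disp(result)
```

For these values the script displays 41, in agreement with the hand calculation. Because the modulus is applied after every multiplication, no intermediate number exceeds (p − 1)^2.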
One key MATLAB idea that we have not used so far is the idea of operating directly on vectors, specifically
element-wise operations. These are natural ways of implementing ciphers, as an alternative to using
a for loop.
The idea is simple: we want to perform an operation on a vector, acting on each element
separately. For example, if we want the square of each element in a vector:
>> x=[1 2 3 4 5]
x =
1 2 3 4 5
>> x.^2
ans =
1 4 9 16 25
The ‘dot’ (e.g. x.^2 rather than x^2) indicates the element-wise operation. The same applies to
multiplying two vectors element by element:
>> x=[1 2 3 4]
x =
1 2 3 4
>> y=[5 6 7 8]
y =
5 6 7 8
>> x.*y
ans =
5 12 21 32
Most MATLAB built-in functions, for example mod(), also work naturally on vectors in an element-wise
manner:
>> x=[1 2 5 56 135]
x =
1 2 5 56 135
>> mod(x,26)
ans =
1 2 5 4 5
This naturally brings up the notion that we could implement our cipher codes using element-wise
operations instead of for loops. Here is an implementation of the affine cipher using element-wise
operations.
C=mod(a*P+b,26);
Compare this with the approach, using a for loop, that we took in Sec. 2.6.
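To see the one-line cipher in context, here is a small sketch that encrypts and decrypts a whole word. The key (a = 5, b = 8), the message and the letter-to-number mapping (A = 0, ..., Z = 25) are assumptions made for this illustration.

```matlab
% Affine cipher on a whole message at once, using element-wise operations.
a = 5;  b = 8;                       % illustrative key; a must share no factor with 26
plaintext = 'HELLO';                 % capital letters only in this sketch

P = double(plaintext) - 65;          % map 'A'..'Z' to 0..25
C = mod(a*P + b, 26);                % encrypt every letter in one statement
ciphertext = char(C + 65);           % map 0..25 back to 'A'..'Z'

aInv = 21;                           % inverse of 5 modulo 26 (5*21 = 105 = 4*26 + 1)
D = mod(aInv*(C - b), 26);           % decrypt every letter in one statement
recovered = char(D + 65);

disp(ciphertext)
disp(recovered)
```

This displays RCLLA and then HELLO. Note that mod() returns a non-negative result even when C − b is negative, which is what makes the one-line decryption work.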
2.13. MODULUS AS A CONGRUENCE RELATION 29
Theory
Up to this point, we have thought of the modulus as an operator. That is, if we consider 34 (mod 16),
we are thinking of the operation of computing 34 = (16 × 2) + 2 and finding that the remainder is
2. There is, however, another way of thinking about the modulus which is often theoretically more
powerful, and that is to think of the modulus as a congruence relation. In this case, when we write
A ≡ B (mod C)
we mean that A and B leave the same remainder when divided by C.
Consider the integers modulo 3. For any integer, if we compute that integer (mod 3), we will get
either 0, 1 or 2. We sometimes refer to these as equivalence classes, which is to say that all integers
which end up in the same group (mod 3) are in the same equivalence class. For example, 4 and 13 both
end up in the 1 group, so:
4 ≡ 13 (mod 3)
and we say that 4 is congruent to 13 modulo 3. (This is because 4 (mod 3) = 1, and also 13
(mod 3) = 1; they are both the same, modulo 3. Note the difference between the equals sign (=) and
congruence (≡).)
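A quick way to experiment with congruences in MATLAB is sketched below. The second test, that C divides A − B, is a standard equivalent way of checking a congruence; the range −6 to 8 is an arbitrary choice.

```matlab
% Two ways to test whether A is congruent to B modulo C.
A = 4;  B = 13;  C = 3;

sameRemainder = (mod(A, C) == mod(B, C));   % both leave remainder 1
divisibleDiff = (mod(A - B, C) == 0);       % C divides the difference A - B
disp([sameRemainder, divisibleDiff])

% The integers from -6 to 8, with their equivalence class modulo 3 underneath
n = -6:8;
classes = mod(n, 3);                        % always 0, 1 or 2, even for n < 0
disp([n; classes])
```

Both tests agree, and every integer in the second display lands in exactly one of the three classes.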
• reflexive: A ≡ A (mod C)
Contents
3.1 Introduction
3.2 Discrete Population Models
3.2.1 Evaluating using MATLAB
3.2.2 Script Files
3.2.3 Some exercises
3.3 A nonlinear difference equation for population growth
3.3.1 Some exercises
3.3.2 Long-term behaviour and fixed points
3.4 More on long-term behaviour
3.4.1 Steps to draw a cobweb diagram
3.4.2 Exercises
3.5 Discrete logistic equation with a parameter
3.5.1 Long-term behaviour and bifurcation diagrams
3.6 Fibonacci and his rabbits
3.6.1 Evaluating using MATLAB
3.6.2 A lot more rabbits
3.7 Plutonium-239 - Decay
3.8 Money in the bank
3.8.1 Using MATLAB
3.8.2 Some exercises
3.9 Loans
3.10 Systems of Difference Equations
3.10.1 Red Blood Cells
3.10.2 Predator-prey model
3.10.3 An exercise
3.10.4 Epidemics
32 CHAPTER 3. DIFFERENCE EQUATIONS, DYNAMIC MODELS AND MODELLING
3.1 Introduction
Difference equations are often used to model quantities that vary with time.
Perhaps we want to know how much money we can make from interest. Or perhaps we might want
to know how populations are changing over time; or further yet know how an epidemic would spread
through a population. Difference equations are a mathematical tool we can use to investigate such
problems.
In this section, you will learn to solve difference equations numerically using MATLAB.
Populations change with time. In a given unit of time some new members will be born, and others
will die. It should be intuitive to say that a population in a year’s time (P_{n+1}) will depend on the
population now (P_n). Mathematically, we can use a difference equation to describe this relationship:
P_{n+1} = k P_n.
Notes
1. This difference equation is an example of a linear difference equation. The unknown, P in this
example, only appears as terms raised to the first power.
2. As in P_0, n = 0 is conventionally used to describe what is happening at ‘time 0’. This is referred
to as the initial condition.
Solution
First, we’ll give the MATLAB commands, then we’ll describe them.
ans =
1464.1
Description
3. (i) The values in P are accessed using P(1), P(2), and so on; see also the discussion in Section 7.2.4.
MATLAB does not permit an array subscript that is zero or negative.
THIS IS IMPORTANT. This means P(0) is not permitted. So, we have stored our first
value in P(1) instead of P(0). Because we had to start our array at P(1) = P_0, the value
we want, P_4, is stored in P(5).
(ii) The last element in the array P is P(end).
(iii) We have written the difference equation as P(n+1) = k*P(n). As such, every iteration of
the for loop is calculating the next population. It is for this reason that numit must be
one less than the index of the last array element we want. E.g. we chose numit=4 to calculate P(5) as our last
value.
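The points above can be sketched as a short script. The values k = 1.1 and P_0 = 1000 are assumptions, inferred from the answer 1464.1 and the plot legends in this section; the original function file may have been organised differently.

```matlab
% Iterate P_{n+1} = k*P_n, storing P_0 in P(1), P_1 in P(2), and so on.
k = 1.1;          % growth factor (assumed)
P = 1000;         % P(1) holds the initial condition P_0 (assumed)
numit = 4;        % four iterations take us from P(1) = P_0 to P(5) = P_4

for n = 1:numit
    P(n+1) = k*P(n);    % next population from the current one
end

disp(P(end))      % P_4, stored in P(5)
```

After the loop P has five elements, and P(end) and P(5) refer to the same value, P_4.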
Solution
Because we want to plot, it would be better for us if we altered our simple_pop.m function to output
the entire P vector. This is as simple as changing P(end) to P. E.g.
Typing the following into prompt will give us the plots below
[Figure: three plots of P_n against n, for n = 0 to 4.]
Note: Doing this in prompt means that we have to type out all of this code line by line. This is
usually an inefficient way of writing code. Some reasons for this are as follows:
2. If a mistake is made, you have to re-enter the line in question and potentially retype all the
following lines.
3. If blocks of code are repeated, with or without differences, it is inefficient to rewrite the
same thing over and over.
In MATLAB there is another type of save-able file that lets us do everything we would need to do in
prompt, but circumvents such tedious problems. These files are called Script files.
Script files are used as a ‘notepad’ version of the prompt window. We write down all of the instructions
we would want to do in prompt. But having them in a script file means that i) we don’t have to execute the
code line-by-line, ii) we can save our code for access later, iii) it is usually neater and easier to read,
iv) script files make it easier to deal with repeated code.
Note: Script files are created similarly to function files. They can be found under the new button in
the tool bar. For more information on creating and saving Script files refer to section 7.2.9.
Solution
We can write and save the following code as a Script file called simple_pop_script.m
%%% Plotting:
n = 0:4; % The years for the x axis
plot(n,P_case1,'*-')
hold on
plot(n,P_case2,'*-')
plot(n,P_case3,'*-')
grid
legend('k=1.1','k=1','k=0.9')
xlabel('n')
ylabel('P_n','rotation',0)
To run this Script file we can either click the Run button on the toolbar or type the filename into
prompt. Typing the following into prompt tells MATLAB to go read and run all of the code in
simple_pop_script.m.
[Figure: P_n against n for k = 1.1, k = 1 and k = 0.9.]
Our script file version may not look any shorter than what we would have typed into prompt. But it
is easier to read, edit, save and troubleshoot (and we don’t have to rewrite the same code three times
like before).
In an animal park we initially have 5000 animals. After one year there are 6000. Use the exponential
growth model Pn = kPn−1 where n is measured in years.
where 0 < k ≪ 1 and A > 0. Use MATLAB to create a plot of P_n for n = 0, 1, . . . , 200. Use k = 0.001,
A = 100 and P_0 = 2.
Solution
In this example there are four parameters, k, A, P0 , and N . So we start by writing a function that
takes these four input parameters and outputs one vector, P .
We use the following MATLAB commands to evaluate the difference equation and plot the results
nfinal = 200;
P = logistic_pop(0.001,100,2,nfinal);
[Figure: P_n against n for n = 0 to 200.]
3.3. A NONLINEAR DIFFERENCE EQUATION FOR POPULATION GROWTH 39
You can see from the graph that P_n levels off. This is more realistic than having the population
grow indefinitely.
Notes
2. The term −k P_{n−1}^2 in the equation is called a self-crowding term. As the population grows, the
self-crowding term gets larger, and the increase in the population slows.
3. The variable A is called the carrying capacity. It indicates the ideal population that a particular
region can sustain comfortably.
5. The '+' in the plot command plot([0:1:nfinal],P,'+') tells MATLAB to plot each point
as a +, and not to join the points up with a continuous line (we had a continuous line in previous
examples).
6. You can use other symbols besides + such as ., x or *. For more information, type help plot
in the Command window.
7. An important point to understand about this difference equation is why Pn levelled off. If you
look closely at
Pn = Pn−1 + kPn−1 (A − Pn−1 ),
you can see that if Pn−1 = A, the term k Pn−1 (A − Pn−1 ) will be zero. This implies Pn = Pn−1
and hence Pn+1 , Pn+2 etc. will all equal Pn−1 . This means the population has levelled off.
You should now be able to explain the general shape of the graph. If P0 is smaller than A, then
the term k P0 (A − P0 ) will be positive and P1 will be greater than P0 . In a similar way, you can
show that P2 > P1 , P3 > P2 , etc. However, the bigger Pn−1 is, the smaller (A − Pn−1 ) is and
after a while the increase in Pn slows. Eventually P levels off.
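The argument in note 7 can be checked numerically. This is a minimal sketch that iterates the difference equation directly in a script, rather than through the logistic_pop function, whose listing is not reproduced here.

```matlab
% Iterate P_n = P_{n-1} + k*P_{n-1}*(A - P_{n-1}) and watch it level off.
k = 0.001;  A = 100;     % values from the example above
nfinal = 200;
P = 2;                   % P(1) holds P_0 = 2

for n = 1:nfinal
    P(n+1) = P(n) + k*P(n)*(A - P(n));   % growth term shrinks as P nears A
end

increase = diff(P);      % P_n - P_{n-1} for each step
fprintf('Largest single-step increase: %.3f\n', max(increase))
fprintf('Final value P(end): %.6f\n', P(end))
```

Every entry of increase is non-negative, so the population never decreases, and the final value is very close to the carrying capacity A = 100.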
1. The maximum sustainable population for animals living in a particular area is 50,000. Write
a difference equation for the population, if there are initially 5,000. What extra information is
needed?
2. For the previous exercise, assume that initially the population growth is exponential with the
model
Pn = 1.1 Pn−1 .
Does this give any more information?
In Example 3.3.1, we used A = 100 and k = 0.001. The MATLAB plot shows that in the long term
the population will be 100. The long-term value for the population will be at a fixed point of the
difference equation. This is a value P so that when Pn−1 = P , then Pn = P also.
Example 3.3.2. Find the fixed points of the difference equation
Pn = Pn−1 + k Pn−1 (A − Pn−1 ).
Solution
In the difference equation, we substitute Pn−1 = P and Pn = P , which gives
P = P + k P (A − P ),
which is equivalent to
k P (A − P ) = 0.
The fixed points are the solutions of this equation, i.e., P = 0 and P = A.
When k = 0.001 and A = 100, the fixed points are 0 and 100. This agrees with the plot in Exam-
ple 3.3.1. It can be shown that except for P0 = 0, no other initial value will give a solution which is
zero in the long term.
Example 3.3.3. Write down the function f so that Pn = f (Pn−1 ) for the difference equation in
Example 3.3.2. For k = 0.001 and A = 100, show a graph of Pn against Pn−1 and on the same axes
Pn = Pn−1 .
Solution
Pn = Pn−1 + kPn−1 (A − Pn−1 ) = f (Pn−1 ), where f (P ) = P + kP (A − P ) which in MATLAB can be
written:
k=0.001; A=100;
P=[0:1:140];
f = logistic_vec(k,A,P);
plot(P,f,'-',P,P,'g--') % f is solid line, P_n=P_{n-1} is dashed line
grid
xlabel('P_{n-1}')
ylabel('P_n', 'rotation', 0)
If you look carefully, you will see that the graphs intersect at the fixed points. Why? Does this agree
with the fixed points in Example 3.3.2?
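The listing of logistic_vec is not shown here, so the following sketch is only a guess at what it computes: f evaluated on a whole vector with element-wise operations. It then picks out the grid values where the curve meets the line P_n = P_{n−1}.

```matlab
% Evaluate f(P) = P + k*P*(A - P) on a whole vector and locate fixed points.
k = 0.001;  A = 100;
P = 0:1:140;                      % candidate values of P_{n-1}

f = P + k*P.*(A - P);             % note .* : element-wise product of vectors
fixedPoints = P(f == P);          % grid values where the curve meets P_n = P_{n-1}

disp(fixedPoints)
```

This displays 0 and 100. The exact comparison f == P works here only because both fixed points lie exactly on the grid; in general a tolerance such as abs(f - P) < 1e-10 would be safer.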
3.4. MORE ON LONG-TERM BEHAVIOUR 41
[Figure: P_n against P_{n−1} for P_{n−1} from 0 to 140, showing the curve P_n = f(P_{n−1}) and the line P_n = P_{n−1}.]
(a) x0 = 1 and
(b) x0 = 4.
Solution
We could use the following MATLAB commands:
The plots are below. Note that the solutions for both initial values are tending to 2.
[Figure: two plots of x_k against k for k = 0 to 15, one for each initial value.]
Notes
1. The plots for both the initial values have been produced in one run of the MATLAB code by
storing these initial values in a vector and using the outer for loop to go through the initial
values.
2. The two graphs have appeared as one plot with two subplots. The MATLAB statements
subplot(1,2,j),plot(x,'*-') break the Figure window into a 1 by 2 matrix of smaller
plots and plot x in plot j. To find out more, type help subplot. These MATLAB statements
are not examinable.
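The structure described in these notes might look like the sketch below. It assumes the difference equation is x_k = 0.5 x_{k−1} + 1, the one used in the cobweb example later in this section, and 15 iterations; the matrix X is our own addition for keeping the results.

```matlab
% Iterate x_k = 0.5*x_{k-1} + 1 for two initial values in one run.
x0 = [1 4];                 % initial values (a) and (b), stored in a vector
nfinal = 15;
X = zeros(length(x0), nfinal+1);   % row j will hold the iterates for x0(j)

for j = 1:length(x0)        % outer loop: one pass per initial value
    x = x0(j);              % x(1) holds x_0
    for k = 1:nfinal        % inner loop: apply the difference equation
        x(k+1) = 0.5*x(k) + 1;
    end
    X(j,:) = x;             % keep the iterates; to plot them as in Note 2,
                            % use subplot(1,2,j), plot(0:nfinal, x, '*-') here
end

disp(X(:,end))              % final iterate for each initial value
```

Both final iterates are very close to 2, one approaching from below and the other from above.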
It is possible to investigate the long-term behaviour without calculating a list of iterates. We can
determine the long-term behaviour by graphical means using a cobweb diagram, as described below.
xn = g(xn−1 ), x0 = a.
3. Find x1 by moving vertically to the curve y = g(x). The vertical component of this point is x1 .
4. To find x1 on the horizontal axis, move horizontally to the line y = x, and then vertically to the
horizontal axis.
xk = 0.5 xk−1 + 1
Solution
On the same set of axes, we draw y = 0.5 x + 1 to represent the difference equation, and also y = x.
[Figure: the lines y = x and y = 0.5 x + 1, with x_{k−1} on the horizontal axis and x_k on the vertical axis.]
Then we start at the initial value, say x0 = 4, on the x-axis and move vertically to the line y = 0.5 x+1
which represents the difference equation. The y coordinate of this point will be x1 ; in order to locate
it on the x-axis, move horizontally to the line y = x. This point will have x and y coordinates equal
to x1 and we can then move vertically to y = 0.5 x + 1 to find x2 .
Notice that for x0 = 4, xk is moving closer and closer to 2, which was what we expected above. What
will happen if x0 = 1?
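The path traced out in a cobweb diagram can also be generated numerically. This is a sketch under our own naming (cx and cy hold the corner points of the path); the number of steps is arbitrary.

```matlab
% Corner points of a cobweb diagram for x_k = g(x_{k-1}).
g = @(x) 0.5*x + 1;     % right-hand side of the difference equation
x = 4;                  % initial value x_0, a point on the horizontal axis
nsteps = 10;

cx = x;  cy = 0;        % cobweb path starts at (x_0, 0)
for k = 1:nsteps
    y = g(x);
    cx(end+1) = x;  cy(end+1) = y;   % move vertically to the curve y = g(x)
    cx(end+1) = y;  cy(end+1) = y;   % move horizontally to the line y = x
    x = y;                           % the new x-value is the next iterate
end

disp(x)                 % the iterate x_10
% plot(cx, cy) would draw the cobweb itself on top of y = g(x) and y = x
```

Changing the second line to x = 1 answers the question posed above: the path then climbs towards 2 from the other side.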
Example 3.4.3. Use a cobweb diagram to investigate the long-term behaviour of the difference
equation
x_n = x_{n−1}^2,
for each of the initial conditions (a) x_0 = 0.9; and (b) x_0 = 1.1.
Solution
First draw y = x^2 and y = x on the same axes. Then start at (a) x = 0.9 and follow the solid line.
For (b) start at x = 1.1 and follow the dashed line.
[Figure: cobweb diagrams on the curves y = x^2 and y = x, with x_{n−1} on the horizontal axis and x_n on the vertical axis.]
From following the solid line, we can see that if x0 = 0.9 then the solution tends to zero. However,
looking at the dashed line, it is apparent that if x0 = 1.1 then the solution tends to infinity.
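The same conclusion can be reached by computing a few iterates. A minimal sketch, with ten iterations chosen arbitrarily:

```matlab
% Iterate x_n = x_{n-1}^2 from either side of the fixed point x = 1.
nfinal = 10;
xa = 0.9;               % case (a)
xb = 1.1;               % case (b)
for n = 1:nfinal
    xa(n+1) = xa(n)^2;  % shrinks towards 0
    xb(n+1) = xb(n)^2;  % grows without bound
end
fprintf('(a) x_10 = %g\n(b) x_10 = %g\n', xa(end), xb(end))
```

After only ten steps the two solutions differ by dozens of orders of magnitude, even though the initial values differ by just 0.2.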
3.4.2 Exercises
xn = 0.8xn−1 + 2,
[Blank axes for drawing the cobweb diagrams.]
This is a good model to use if there are limited resources for the population.
Solution
First note that P_n = M x_n. Then
x_n = P_n / M
= (q/M) P_{n−1} (1 − P_{n−1}/M)
= (q/M) M x_{n−1} (1 − M x_{n−1}/M)
⇒ x_n = q x_{n−1} (1 − x_{n−1}).
This difference equation is called the discrete logistic equation. Note that it has only one parameter,
q, but the original difference equation had two parameters, A and k.
Example 3.5.2. Use MATLAB to create a plot of xn for n = 0, 1, . . . , 200 where q = 1.1 and
x0 = 0.08.
Solution
[Figure: x_n against n for n = 0 to 200.]
Solution
The plot produced is
[Figure: x_n against n for n = 0 to 200.]
This time the proportion decreases but still levels off at 0.091. But does the long-term behaviour of
x always look like this? We will investigate.
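Before moving to other values of q, the case q = 1.1 can be checked with a short script. The starting value 0.08 is from Example 3.5.2; the second starting value, 0.10, is an assumption made here to give a start above the level-off value.

```matlab
% Discrete logistic equation x_n = q*x_{n-1}*(1 - x_{n-1}) with q = 1.1.
q = 1.1;
nfinal = 200;
starts = [0.08 0.10];   % 0.08 is from Example 3.5.2; 0.10 is an assumed second start
finals = zeros(size(starts));

for j = 1:length(starts)
    x = starts(j);                      % x(1) holds x_0
    for n = 1:nfinal
        x(n+1) = q*x(n)*(1 - x(n));
    end
    finals(j) = x(end);
end

xstar = 1 - 1/q;        % non-zero solution of x = q*x*(1 - x)
fprintf('x_200 = %.6f and %.6f; 1 - 1/q = %.6f\n', finals, xstar)
```

Both runs settle at the non-zero fixed point 1 − 1/q, which is the value 0.091 seen in the plots.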
Example 3.5.4. Repeat the plots for q = 1.5, 2, 3, 3.2, 3.5, 3.6 using x0 = 0.05. Also draw the
corresponding cobweb diagrams.
Solution
• q = 1.5
[Figure: x_n against n (left) and the cobweb diagram of x_n against x_{n−1} (right).]
• q = 2
[Figure: x_n against n (left) and the cobweb diagram of x_n against x_{n−1} (right).]
3.5. DISCRETE LOGISTIC EQUATION WITH A PARAMETER 49
• q = 3
[Figure: x_n against n (left) and the cobweb diagram of x_n against x_{n−1} (right).]
• q = 3.2
[Figure: x_n against n (left) and the cobweb diagram of x_n against x_{n−1} (right).]
• q = 3.5
[Figure: x_n against n (left) and the cobweb diagram of x_n against x_{n−1} (right).]
• q = 3.6
[Figure: x_n against n (left) and the cobweb diagram of x_n against x_{n−1} (right).]
We can use an orbit diagram to summarise the information we have about the long-term behaviour
for different values of q. The long term behaviour of x is given on the vertical axis.
[Figure: orbit diagram showing the long-term values of x (vertical axis) against q (horizontal axis).]
For more information on the discrete logistic equation see section 8.1 of the Maths 260 textbook,
Differential Equations by Blanchard, Devaney and Hall (3rd edition).
nfinal=150;
values_used=20;
clf
hand1=subplot(1,2,1); % assumed layout: the lines creating hand1 and hand2 are missing here
hand2=subplot(1,2,2);
q=2.8:.008:3.59;
pause
for j=1:length(q) % repeat for each value of q
hold(hand1,'off')
P=[1/2]; % calculate P, starting at critical point
for n=1:nfinal
P(n+1)=q(j)*P(n)*(1-P(n)); % find P_n for q_j
end
plot(hand1,1:nfinal,P(1:nfinal),'+-') % plot P_n using + to show pts
grid(hand1,'on')
axis(hand1,[1,nfinal,0.3,1])
title(hand1,[' q=',num2str(q(j))],'FontSize',14) % title shows q
xlabel(hand1,'n','FontSize',16)
ylabel(hand1,'x', 'Rotation', 0, 'FontSize',16)
bv=(P(end-values_used:end)); % Take last values to see the long term
hold(hand2,'on')
for i=1:length(bv)
plot(hand2,q(j),bv(i),'*','MarkerSize',5) % plot long-term values
end
grid(hand2,'on')
box(hand2,'on')
axis(hand2, [2.8,3.6,0.3,1])
xlabel(hand2, 'q','FontSize',16)
ylabel(hand2, 'x', 'Rotation', 0, 'FontSize',16)
if j==3 || j==6 || j==29 || j==87 % put in pauses to watch diagram appear
pause
else
pause(0.1)
end
end
In 1202, Leonardo of Pisa, also known as Fibonacci, published a simple population model for an
isolated group of rabbits.
The population begins with just 1 pair of newly born rabbits (1 male, 1 female). Let Fn be the number
of pairs of rabbits at the end of the nth breeding season (that is the nth month). Then F0 = 1 and
F1 = 1, because the first pair of baby rabbits needs one month to mature before they start breeding.
After two months, the first pair will have produced a pair of baby rabbits, so including the mature
pair, there are now two pairs of rabbits and F2 = 2.
One month later, the baby rabbits will have matured and the mature pair will have produced another
pair of baby rabbits, which means there are now three pairs of rabbits and F3 = 3. At the end of
month four, the new baby rabbits will have matured and the two mature pairs will each have produced
a pair of baby rabbits, so the total is now F4 = 5; and so on.
Fn = Fn−1 + Fn−2 , n ≥ 2.
In words, the number of pairs of rabbits at the end of the nth breeding season is equal to the sum of the
number at the end of the (n − 1)st and the number at the end of the (n − 2)nd season. The above
equation is an example of a difference equation.
Solution
3.6. FIBONACCI AND HIS RABBITS 53
F_2 = F_1 + F_0 = 1 + 1 = 2 ✓
F_3 = F_2 + F_1 = 2 + 1 = 3 ✓
F_4 = F_3 + F_2 = 3 + 2 = 5 ✓
From the discussion above, we know that F4 is composed of three mature pairs and two baby pairs.
The baby pairs mature the following month and the mature pairs each produce a baby pair. Hence,
F5 is composed of five mature pairs and three baby pairs, so F5 = 8. Then F6 will be composed of
eight mature pairs and five baby pairs, so F6 = 13, and F7 will be composed of thirteen mature pairs
and eight baby pairs, giving F7 = 21. Using the formula, we find:
F_5 = F_4 + F_3 = 5 + 3 = 8 ✓
F_6 = F_5 + F_4 = 8 + 5 = 13 ✓
F_7 = F_6 + F_5 = 13 + 8 = 21 ✓
Example 3.6.2. Use a for loop in a MATLAB function that solves for Fk . Use your function to
calculate F7 .
Solution
>> fib([1,1], 7)
ans =
21
Notes:
1. MATLAB does not permit an array subscript that is zero or negative. This means F(0) is
not permitted; we have to store F_0 in F(1) instead. As a consequence, F_n is stored in F(n+1).
Thus, the last element in the array F is F(nfinal + 1); you can also access the last element as
F(end).
Solution
Calculating F314 by hand would be boring and take too long. We modify the MATLAB statements
in the previous example to
ans =
3.0312e+65
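The fib function itself is not reproduced here, so the following is only a sketch of the loop it presumably contains, written as a script. It uses the storage convention from the notes above: F(n+1) holds F_n.

```matlab
% Fibonacci numbers with F_0 = F_1 = 1, stored so that F(n+1) holds F_n.
nfinal = 314;
F = [1 1];                      % F(1) = F_0 and F(2) = F_1
for n = 2:nfinal
    F(n+1) = F(n) + F(n-1);     % F_n = F_{n-1} + F_{n-2}
end

disp(F(8))                      % F_7
disp(F(end))                    % F_314, held only approximately as a double
```

F(8) is exactly 21. F_314 has 66 digits, far more than a double can hold exactly, so MATLAB reports it in scientific notation, rounded to about 16 significant digits.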
Let Pn be the amount of Pu-239 left after n years. Pn satisfies (approximately) the difference equation
Pn = (0.999971119284533)Pn−1 .
i) Estimate the amount of Pu-239 left after one million years if P_0 = 1 × 10^12.
Solution
Recognising that this difference equation is very similar to our previous examples, we can solve it in
MATLAB using the functions we created before with some very small modifications.
ans =
0.286481153781130
ii) Estimate the amount of Pu-239 left after one million years for any P0 .
Solution
Here we do not know P0 so we don’t have enough information to compute a solution in MATLAB .
Thus, our first step is to find the solution to the difference equation. We can notice that if P_n = k P_{n−1}, then
P_n = k P_{n−1}
= k (k P_{n−2}) = k^2 P_{n−2}
= k^2 (k P_{n−3}) = k^3 P_{n−3}
= k^4 P_{n−4}
...
= k^n P_0.
Hence
P_n = (0.999971119284533)^n P_0.
The amount of Pu-239 left after one million years will then be
P_1000000 = (0.999971119284533)^1000000 P_0 ≈ 2.86 × 10^−13 P_0.
In words, after one million years, 2.86 × 10^−13 of the original amount of Pu-239 will be left.
Notes
1. How do we get the constant 0.999971119284533 in the difference equation? Let the value be k
so that
P_n = k P_{n−1}.
Since the half-life is 24,000 years, we know that
P_24000 = (1/2) P_0.
From the solution of the difference equation, we also have
P_24000 = k^24000 P_0.
Therefore,
k^24000 = 1/2
⇔ k = (1/2)^(1/24000)
⇔ k = 0.999971119284533
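This calculation is easy to reproduce in MATLAB. A short sketch, with variable names of our own choosing:

```matlab
% Decay constant for a half-life of 24,000 years, and the fraction left later.
halfLife = 24000;
k = (1/2)^(1/halfLife);         % solves k^24000 = 1/2
fractionLeft = k^1000000;       % P_n / P_0 after one million years

fprintf('k = %.15f\n', k)
fprintf('Fraction left after 10^6 years: %.3e\n', fractionLeft)
```

Using the closed-form solution P_n = k^n P_0 avoids running a loop of one million iterations, and works for any P_0.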
Example 3.8.1. Let Pn be the amount of money in an account at the beginning of the (n + 1)-st
year. If the money earns i percent interest each year and this interest is added to the account at the
end of the year, write down a difference equation for Pn . You may assume the interest is not taxed
and there are no bank fees.
Solution
You should be able to write down the difference equation by inspection. It is
P_n = (1 + i/100) P_{n−1}.
Solution
Based on our work from the previous section we should be able to recognise the solution to be
P_n = P_0 (1 + i/100)^n.
Example 3.8.3. Suppose in the previous example, P0 = 1000 and i = 8. Use MATLAB to find P20 .
Solution
We need to evaluate
P_20 = 1000 (1 + 8/100)^20.
The MATLAB command
>> 1000*(1+8/100)^20
ans =
4.6610e+03
which gives 4.660957143849309e+03 as the answer. Hence, at the beginning of the twenty-first year
there will be $4660.96 in the account. (Here, we assume that the effect of round-off error for the
annual payments rounded to cents can be ignored.)
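As a check, the difference equation can be iterated year by year and compared with the closed-form solution. A minimal sketch (note that using i as a variable name, as the notes do, hides MATLAB's built-in imaginary unit within this script):

```matlab
% Compound interest: iterate P_n = (1 + i/100)*P_{n-1} and compare with
% the closed-form solution P_n = P_0*(1 + i/100)^n.
P0 = 1000;  i = 8;  nfinal = 20;

P = P0;                              % P(1) holds P_0
for n = 1:nfinal
    P(n+1) = (1 + i/100)*P(n);       % add one year's interest
end
closedForm = P0*(1 + i/100)^nfinal;

fprintf('Loop: %.2f   Formula: %.2f\n', P(end), closedForm)
```

The two approaches agree to well within a cent. The loop has the advantage that it also gives the balance for every intermediate year.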
2. Write the difference equation for money owing on a mortgage if you are charged 10% interest each
year and pay off $20000 each year. Assume that the interest is charged and the payment made
at the end of each year. Can this difference equation be solved with the analytical technique we
used in Example 3.8.2?
3.9 Loans
In Section 3.8, we considered money in the bank and interest. We will now look at loans on which
interest is charged.
Example 3.9.1. Write the difference equation for money owing on a $200,000 loan if you are charged
10% interest each year and pay off $24,000 each year.
Note: Here we are assuming that the interest is paid only once a year and the repayments are also made only
once a year and at the same time. This will not be realistic for a house mortgage, but the same ideas can be
used.
Solution
Let A_n be the amount owing after n years. Then the amount after n + 1 years will be the amount
that was owing after n years, plus the interest on that amount, less the money paid off. We can write
this as the difference equation
A_{n+1} = A_n + (r/100) A_n − M,
where r is the interest rate as a percentage and M is the money paid off. We can use MATLAB to
plot A_n. We do not know how long it will take to pay off the loan so we guess and calculate up to
A_30.
title('Mortgage')
xlabel('Years')
ylabel('Amount owing ($)')
The plot is
[Figure: ‘Mortgage’, amount owing ($) against years, for 0 to 30 years.]
The loan is paid off as soon as the graph dips below zero on the y-axis. Hence, it takes about
19 years to clear the balance.
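The full script for this example is not reproduced here. The sketch below rebuilds the calculation from the description in the solution (amount owing, plus interest, less the repayment) and finds the payoff year without reading it off a plot; the names idx and yearsToRepay are our own.

```matlab
% Loan balance A_{n+1} = A_n + (r/100)*A_n - M for Example 3.9.1.
A0 = 200000;  r = 10;  M = 24000;
nfinal = 30;

A = A0;                                  % A(1) holds A_0
for n = 1:nfinal
    A(n+1) = A(n) + (r/100)*A(n) - M;    % add interest, subtract repayment
end

idx = find(A < 0, 1);                    % first array position with A < 0
yearsToRepay = idx - 1;                  % A(idx) holds A_{idx-1}
fprintf('Balance first drops below zero after %d years\n', yearsToRepay)
```

The balance first becomes negative after 19 years, in agreement with the plot. The final payment would in practice be smaller than M, since the model lets the balance overshoot zero.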
Example 3.9.2. Allison would like to borrow $200,000 but she doesn’t know if she can afford the
repayments. Use MATLAB to plot the amount owing if she repays $15,000 per year. Do this for the
interest rates 5%, 6%, 7.5%, and 8%. Use the plots to estimate how long before the loan would be
repaid for each interest rate.
Solution
We will modify the script file for the last example.
[Figure: ‘Mortgage’, amount owing ($) against years for r = 5.]
The loan will be repaid in about 22 years. We will now investigate what happens for some higher
interest rates. Change r into 6 then 7.5 then 8.
[Figure: two ‘Mortgage’ plots of amount owing ($) against years, for r = 6 (left) and r = 7.5 (right).]
• r = 6: The loan is repaid in about 27 years.
• r = 7.5: The amount owing is constant. Why?
[Figure: ‘Mortgage’, amount owing ($) against years for r = 8.]
Interest rate (%)   Result
5                   Repaid in about 22 years.
6                   Repaid in about 27 years.
7.5                 Amount owing does not change.
8                   Amount owing increases.
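Rather than editing r by hand and re-running, the four cases can be handled in one loop. This is a sketch of one possible approach, not the script used to produce the plots; a value of 0 in firstNegative is our own convention for "not repaid within 30 years".

```matlab
% Amount owing on a $200,000 loan repaid at $15,000 per year, for several rates.
A0 = 200000;  M = 15000;  nfinal = 30;
rates = [5 6 7.5 8];                     % interest rates in percent
finalBalance = zeros(size(rates));
firstNegative = zeros(size(rates));      % 0 means "not repaid within nfinal years"

for j = 1:length(rates)
    r = rates(j);
    A = A0;                              % A(1) holds A_0
    for n = 1:nfinal
        A(n+1) = A(n) + (r/100)*A(n) - M;
    end
    finalBalance(j) = A(end);
    idx = find(A < 0, 1);                % first position where the balance is negative
    if ~isempty(idx)
        firstNegative(j) = idx - 1;      % A(idx) holds A_{idx-1}
    end
end

disp([rates; firstNegative])
```

The balance is first negative at year 23 for 5% and year 28 for 6%, so the graphs cross zero between years 22 and 23 and between years 27 and 28, consistent with the estimates read from the plots. For 7.5% the yearly interest, 0.075 × 200,000 = 15,000, exactly matches the repayment.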
Example 3.9.3. Alan wants to buy a house for which he would need to borrow $200,000. The interest
rate is 7.25% per year and he can pay off $15,000 each year. Can he afford the house?
Solution
We have used r as a parameter of the difference equation and saw that the type of long-term behaviour
depended on r. For r < 7.5, we notice that the loan is paid off; for r = 7.5 it is not paid off. At 7.25%
he can pay it off. Are we making any assumptions here?
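One assumption worth testing is how long "paid off" actually takes. The sketch below iterates the same difference equation for r = 7.25 until the balance first drops below zero; the cap of 200 years is an arbitrary safeguard against an endless loop.

```matlab
% Example 3.9.3: r = 7.25, iterated until the balance first drops below zero.
A = 200000;  r = 7.25;  M = 15000;
years = 0;
while A >= 0 && years < 200              % the cap of 200 years is an arbitrary safeguard
    A = A + (r/100)*A - M;               % one year's interest and repayment
    years = years + 1;
end
fprintf('Balance first drops below zero after %d years\n', years)
```

The loan is eventually repaid, but only after 49 years, well beyond the 30 years plotted in the earlier examples. Whether Alan can "afford" the house therefore depends on how long he is able to keep making repayments.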
Example 3.10.1. Red blood cells (RBC) carry oxygen around the body. They are constantly being
destroyed and replaced. We want the number to be maintained at some fixed level. We assume that
the spleen filters out and destroys a constant fraction of RBC and the bone marrow produces a number
proportional to those lost on the previous day.
Define:
(a) Explain the terms in the equation based on the assumptions listed above.
(b) Write a MATLAB script file to plot M and R over time for (i) R_0 = 10^6, M_0 = 100 and (ii)
R_0 = 10^6, M_0 = 1000. Use f = 0.5, c = 1. What do you notice about the plots?
Solution
(a) The spleen filters out a fraction f of Rn on day n, which means that (1 − f ) Rn remains. This
is supplemented by the amount Mn produced by the bone marrow. Hence, the number of RBC
on day n + 1 is the sum of these two. The amount produced by the bone marrow for the next
day is proportional to the amount f Rn of RBC on the current day n; the fraction is given by c.
(b) The function file rbc.m:
function [R,M] = rbc(f, c, ICs, nfinal) %Note: ICs are input as a vector
R = ICs(1); %Getting ICs from the input vector
M = ICs(2);
for n = 1:nfinal %Update both equations at the same time
R(n+1) = (1-f)*R(n) + M(n);
M(n+1) = c*f*R(n);
end
The script file:
nfinal = 20;
[R,M] = rbc(0.5,1,[1e6,100],nfinal); %Inputting the parameters and ICs
plot(0:nfinal,R,'r-',0:nfinal,M,'b--')
legend('R','M') %Creates a legend to identify the curves
3.10. SYSTEMS OF DIFFERENCE EQUATIONS 63
The left hand plot is (i) R_0 = 10^6, M_0 = 100, and the right hand one is (ii) R_0 = 10^6, M_0 = 1000.
[Figure: two plots of R and M against n for n = 0 to 20, both titled ‘f=0.5 and c=1’.]
Notice that the initial value of M makes little difference to the plot.
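This observation can be quantified. The sketch below runs the same two equations for both initial conditions in a plain script and compares the final values of R; the names Rfinal and relDiff are our own.

```matlab
% Red blood cell model for the two initial conditions of part (b).
f = 0.5;  c = 1;  nfinal = 20;
M0 = [100 1000];                         % the two initial values of M
Rfinal = zeros(size(M0));

for j = 1:length(M0)
    R = 1e6;  M = M0(j);                 % R(1) = R_0 and M(1) = M_0
    for n = 1:nfinal
        R(n+1) = (1-f)*R(n) + M(n);      % cells kept plus cells produced
        M(n+1) = c*f*R(n);               % production replaces yesterday's loss
    end
    Rfinal(j) = R(end);
end

relDiff = abs(Rfinal(2) - Rfinal(1)) / Rfinal(1);
fprintf('R_20 = %.0f and %.0f (relative difference %.4f)\n', Rfinal, relDiff)
```

The two final values of R differ by well under one percent, which is why the two plots look almost identical.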
Notes
1. The for loop contains two command lines; you can have as many as you like in a for loop,
which is why you must indicate where it stops by writing end.
2. In the function rbc we have two outputs R and M. In MATLAB we use a vector [R,M] as output.
You can have as many outputs as you like in a function; just put them in the output vector.
3. Initial conditions in MATLAB functions can be input as single variables or in vectors. Here f
and c are single inputs, but ICs expects a vector containing both initial conditions.
4. The plot command put the graphs of Rn and Mn on the same set of axes.
5. The 'r-' in the plot command told MATLAB to draw the graph for R_n as a continuous line
in red. The 'b--' in the plot command told MATLAB to draw the graph for M_n as a dashed
line in blue.
6. If we had said 'b+' instead of 'b--', MATLAB would have plotted M_n as blue points with a
+ at each point.
7. If we had said 'b+-' instead of 'b--', MATLAB would have plotted M_n as a continuous blue
curve with a + added at each point.
8. Since the y-axis represents both Rn and Mn , we use the legend command instead of labelling
the y-axis.
Example 3.10.2. The plots below are for R_0 = 10^6 and M_0 = 1000, with different values of f and c
as shown in the titles. What would be a suitable value of c to keep the number of RBC at a constant
level?
Solution
[Figure: four plots of R and M against n for n = 0 to 20, titled ‘f=0.8 and c=1’, ‘f=0.3 and c=1’, ‘f=0.5 and c=1.1’ and ‘f=0.5 and c=0.9’.]
The value of c = 1 seems to keep it constant. To be sure, try it with different values of f .
Just as for the linear growth examples, linear systems always have relatively simple long-term be-
haviour: the values tend to 0; the values tend to infinity; or the parameters are special such that the
values remain constant. More interesting behaviour can be obtained in models including nonlinear
terms. The following two sections give examples of systems of nonlinear equations.
Example 3.10.3. The predator-prey model gives the populations, at a discrete set of times, of two
species interacting with each other. One species is the predators. They consume the prey and if there
were no prey, the predators would die off. The other species is the prey. They are consumed by the
predators and if there were no predators, the number of prey would grow indefinitely.
Let xn and yn be the number of predators and prey at the nth time interval. One set of difference
where the constants a, b, c and d are all positive. Can you tell from the equations which is the predator
and which the prey?
Suppose a = b = c = d = 0.005, x0 = 2, and y0 = 4, where x and y are in units of one thousand. Use
MATLAB to produce a plot of xn and yn .
Solution
nfinal = 5000;
[x, y] = pred_prey([0.005,0.005,0.005,0.005], 2,4,nfinal);
plot([0:nfinal],x,'b-',[0:nfinal],y,'g--')
legend('predator','prey')
xlabel('n')
[Figure: predator and prey populations (vertical axis from 0 to 5) plotted against n from 0 to 5000.]
Notes
1. You can see from the plot that xn and yn oscillate with time. When the number of prey (yn ) is
large, the predators (xn ) will have a plentiful supply of food and xn will increase. As xn increases,
more and more prey will be consumed and after a while yn will start decreasing. Eventually yn
will get so small that the number of predators will start decreasing. When xn becomes small,
the number of prey will start increasing, and so on.
One activity of the Hudson’s Bay Company was buying furs from trappers. The plot
below, taken with permission from
http://www.globalchange.umich.edu/globalchange1/current/lectures/predation/predation.html
gives the number of snowshoe rabbits and one of their main predators, the Canada lynx,
bought for most years from the 1840s to the 1930s. A predator-prey cycle of eight to ten
years is readily observed.
3.10.3 An exercise
The difference equations below model the populations of foxes and rabbits, measured in thousands.
The foxes kill the rabbits for food.
\[
\begin{cases}
x_{n+1} = x_n + 0.03\,x_n - 0.001\,x_n y_n, \\
y_{n+1} = y_n - 0.02\,y_n + 0.005\,x_n y_n.
\end{cases}
\]
(a) Which population does x represent and which population does y represent? Give a reason for
your answer.
3.10.4 Epidemics
In this section we will look at a model for the spread of diseases. A population will be divided into
three groups:
• Those who are susceptible to the disease - people who could catch the disease from an infected
person.
• Those who are infectious with the disease - these people have the disease and can infect suscep-
tible people.
• Those who are recovered from the disease - they are now immune to the disease.
Let Sn be the number of people susceptible at time n, In be the number infectious at time n and Rn
be the number recovered at time n.
Example 3.10.4. Use the following assumptions to write expressions for these values at time n in
terms of their values at time n − 1. Between time n − 1 and n, we will assume that:
• the number of susceptibles who are infected is proportional to Sn−1 In−1 ;
• a fixed proportion of the infected people will recover.
Solution
Between the time n − 1 and n, the number of people infected will be β Sn−1 In−1 for some constant β.
These must be subtracted from the susceptible group and added to the infectious group. A proportion
γ of the infectious will recover and they will be subtracted from the infectious group and added to the
recovered group. We can summarise this as
\[
\begin{aligned}
S_n &= S_{n-1} - \beta S_{n-1} I_{n-1}, \\
I_n &= I_{n-1} + \beta S_{n-1} I_{n-1} - \gamma I_{n-1}, \\
R_n &= R_{n-1} + \gamma I_{n-1}.
\end{aligned}
\]
Example 3.10.5. Suppose that β = 0.001, γ = 0.005, S0 = 200, I0 = 1, and R0 = 0. Use MATLAB
to produce a plot of S, I and R all on the same axes.
Solution
Note that S, I and R in the equations are expressed in terms of their values at time n − 1. Think about
whether it makes a difference to define Sn in terms of Sn−1 or Sn+1 in terms of Sn , etc. You should
conclude that you can still write the MATLAB script files as before. To test your understanding of
working with arrays and using the for loop in MATLAB, it is a good idea to try writing a script file
using time n − 1 on the right-hand side of the equations. The plot should be identical to the one
produced by the script file on the next page.
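A minimal sketch of such a script is given here, using the parameter values of Example 3.10.5 and the n − 1 form of the equations. The variable names (infRate, recRate, nfinal and so on) and the choice of 200 time steps are illustrative assumptions, not necessarily those of the course script. Because MATLAB arrays are indexed from 1, the value at time n is stored in entry n + 1.

```matlab
% Sketch only: discrete SIR model of Example 3.10.4 with the data of Example 3.10.5.
infRate = 0.001;                  % beta, the infection constant
recRate = 0.005;                  % gamma, the proportion of infectious who recover
nfinal  = 200;                    % number of time steps (an assumed choice)

S = zeros(1, nfinal+1);           % entry k holds the value at time n = k-1
I = zeros(1, nfinal+1);
R = zeros(1, nfinal+1);
S(1) = 200;  I(1) = 1;  R(1) = 0; % initial conditions S0, I0, R0

for k = 2:nfinal+1
    newInfected  = infRate*S(k-1)*I(k-1);   % moved from S to I
    newRecovered = recRate*I(k-1);          % moved from I to R
    S(k) = S(k-1) - newInfected;
    I(k) = I(k-1) + newInfected - newRecovered;
    R(k) = R(k-1) + newRecovered;
end

% The total population should stay at S0 + I0 + R0 throughout.
fprintf('Total at n = %d: %.6f\n', nfinal, S(end) + I(end) + R(end));
```

Adding plot(0:nfinal,S,'b-',0:nfinal,I,'r--',0:nfinal,R,'g-.') followed by legend('S','I','R') and xlabel('n') would then draw the three curves on the same axes.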
[Figure: S, I and R plotted against n from 0 to 200.]
Note: observe that Sn + In + Rn = Sn−1 + In−1 + Rn−1 . What does this tell you about the model,
and its behaviour?
3. Use
\[ F_n = \frac{5+\sqrt{5}}{10}\left(\frac{1+\sqrt{5}}{2}\right)^n + \frac{5-\sqrt{5}}{10}\left(\frac{1-\sqrt{5}}{2}\right)^n \]
4. Why is Fibonacci’s model an unrealistic model? You might like to read the short description of
the model in Wikipedia.
5. A person puts $23,000 in a bank account July 1, 2010. The account earns six percent interest
per year and the interest is added to the account on June 30 each year. How much money will
be in the account July 1, 2020? Assume the interest is not taxed and there are no bank fees.
6. Suppose for the previous example, the interest is taxed at a rate of 30 percent. If the tax is paid
July 1 each year using money from the account, how much money will be in the account on July
1, 2020?
7. A person puts $23,000 in a bank account July 1, 2010. Every July 1 thereafter, the person adds
$1000 to the account. The account earns six percent interest per year and the interest is added
to the account on June 30 each year. How much money will be in the account on July 1, 2020?
Assume the interest is not taxed and there are no bank fees (you may use MATLAB to answer
this question).
Pn = (0.999971119284533) Pn−1 .
9. Let Sn be the amount of Strontium-90 (Sr-90) in a pile after n years. Sn satisfies (approximately)
the difference equation
Sn = (0.97638) Sn−1 .
Suppose S0 = 2. Use MATLAB to calculate S100 .
10. Let Kn be the amount of Krypton-85 (Kr-85) in a pile after n years. Kn satisfies (approximately)
the difference equation
Kn = r Kn−1 .
Use the fact the half-life of Kr-85 is 10 years to find a numerical value for r to five decimal places.
Tn = Tn−1 − k (Tn−1 − Ta ).
We know this approximation can be used when Ta < T0 . Can the approximation be used when
Ta > T0 ?
Tn = Tn−1 − k (Tn−1 − Ta )
13. A hot piece of lead is dropped in a large tank of cold water. The temperature Tn of the lead
after n seconds satisfies (approximately) the difference equation
The initial temperature of the lead is 200◦ C. Estimate the temperature of the lead after 10
seconds.
(i) P0 = 50
(ii) P0 = 200
(iii) P0 = 100.
for k = 0.01 and A = 100. Use your diagram to describe the behaviour of Pn when (a) P0 = 20,
and (b) P0 = 120,
(a) Write a MATLAB script file that plots a graph of S, I and R when S0 = 400, I0 = 4 and
R0 = 0 for $\beta = 10^{-3}$, $\gamma = 5 \times 10^{-3}$.
(b) Use different values of β in your script file, to find a value that gives a peak value of
infectious people of less than 200.
(a) write a population model for the rabbits with three difference equations for mn , the number
of mature rabbits after n months, tn , the number of one month old rabbits after n months
and bn , number of newborn rabbits after n months;
(b) starting with a single pair of baby rabbits, calculate the populations for up to 10 months
(check correctness of your model by comparing the total population with the 10th Fibonacci
number);
(c) incorporate death in your model by letting a proportion dm of the mature rabbits, and a
proportion db of the baby rabbits die each month;
(d) Let dm = 0.1, then find db such that the total population remains constant.
Chapter 4
Stochastic Methods and Stochastic Modelling
Contents
4.1 Introduction
4.2 Randomness, probability and simulation
4.2.1 Randomness
4.2.2 Probability
4.2.3 Simulation for probability
4.3 Discrete and continuous random variables
4.3.1 Discrete random variables and histograms
4.3.2 Continuous random variables
4.4 Uniform and normal distributions, probability distributions
4.4.1 Probability Density Functions
4.4.2 The Uniform Distribution
4.4.3 Normal distribution
4.4.4 Modelling error
4.5 Simulating Discrete Random Variables
4.5.1 Simple examples
4.5.2 More complicated discrete distributions
4.6 Estimating probabilities, Monty Hall
4.6.1 Estimating probabilities
4.6.2 Supermarket workers
4.6.3 Monty Hall problem
4.7 Estimating expectations, Gambler’s ruin
4.7.1 Estimating expectations
4.7.2 Expectations of Functions
4.7.3 Gambler’s ruin
4.8 Monte Carlo integration
4.8.1 Estimating integrals
4.1 Introduction
A deterministic model will always produce the same output given the same initial input; the models
in Chapter 3 were all of this type. However, in real life there is variability, so a stochastic model
will include randomness. Reasons for including randomness are:
• To better represent reality - the result of a dice throw, radioactive decay, animal movement...
• Some phenomena are not random, but we model them as random because they are too compli-
cated.
• Measurement error
In this chapter we review the properties of random variables and use MATLAB to simulate ran-
dom variables with special properties. We will then see how simulations can be used in a range of
applications including modelling and estimating probabilities, expectations, and integrals.
Theory
4.2.1 Randomness
In this chapter, we consider the case where there is uncertainty in the underlying situation. For
example, consider a coin toss and let X be the side of the coin that shows when the coin lands. What
is X? We expect that X will be either Heads or Tails but we cannot be certain about X unless we
toss the coin and observe the result. In other words, the value of X depends on the experiment, and
depending on the experiment, the value of X may change: X is sometimes heads and sometimes tails.
It is useful to think of variables like X in this example as behaving randomly (with the value determined
by chance), and variables like X are called random variables. We often denote random variables
(RV for short) by capital letters (such as X).
One way to get information about possible values for an RV like X is to perform some experiments.
However, there are also ways to simulate RVs and then make predictions about their behaviour. For
example, we could simulate the numbers of customers at the checkouts of a supermarket to work out
whether more staff need to be employed before actually employing more staff.
A random process is a sequence of random variables, one for each point in time. A random process is
usually denoted by a capital letter and a subscript representing time; for example, the daily exchange
rate of the New Zealand dollar versus the US dollar can be regarded as a random process, and might
be denoted Rt .
[Figure: a sample path of a random process X plotted against time from 0 to 1000.]
Back to the coin-toss problem. Let Xt be the random process of the results when we successively toss
the coin. Although we cannot be sure about the exact value of Xt before tossing the coin for the t-th
time, we expect it to be either Heads or Tails. In other words, we know the range or the set of all
possible values that Xt can take. This range or set is called the state space Ω of the random variable.
Practice
In the simulated coin tosses, zeros are assigned as heads and ones are assigned as tails.
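One possible way to simulate such a sequence of coin tosses is sketched below. It assumes a fair coin and uses rand, MATLAB's generator of uniform random numbers on [0, 1]; the variable names N and X are illustrative.

```matlab
% Sketch only: simulate N tosses of a fair coin, coding Heads as 0 and Tails as 1.
N = 10;                         % number of tosses (illustrative)
X = double(rand(1, N) >= 0.5)   % each entry is 0 or 1 with equal probability
```

Each run gives a different row of zeros and ones, which is exactly the behaviour we expect from a random process.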
4.2.2 Probability
Theory
Since the exact value of a random variable Xt cannot be determined before performing an experiment,
it makes sense to consider the probability of observing an event of the state space. For example, if Xt
is the result of the coin toss at the t-th time, what is the probability that Xt is Heads? We denote
this probability by Prob(Xt = Heads) or P(Xt = Heads), or P(Xt = 0), where we assign Heads = 0.
There are analytical methods to calculate probabilities. For a simple example like P(Xt = Heads),
we already know that the chances of observing Heads or Tails are equal for a fair coin, so there is a
50% chance that Xt is Heads. Thus, P(Xt = Heads) = $\frac{1}{2}$ = 0.5. However, calculating probabilities
analytically is not always straightforward, especially for complicated examples.
There are many courses on analytical probability calculation. In this chapter, we only aim to estimate
probabilities by running experiments or simulations (e.g. computational experiments). Monte Carlo
methods use repeated sampling to solve problems, instead of analytical methods, and can be faster
and easier.
Let S be a subset of the state space Ω and consider Xt to be a random process on Ω. Suppose that
the values of Xt are known by experiment or simulation for t ∈ {1, 2, . . . , N }. Then the probability
P(Xt ∈ S) is estimated by
\[ P(X_t \in S) \approx \frac{\text{number of times that } X_t \in S}{N}. \]
For example, consider S to be {Heads} as a subset of Ω = {Heads, Tails}. Suppose that we toss a fair
coin 30 times and 18 times it shows Heads. Then,
\[ P(X_t = \text{Heads}) \approx \frac{18}{30} = 0.6. \]
The estimated probability is close to the analytical value 0.5 but is, of course, not exactly equal; its
value depends on the experiment. The estimated probability can be improved (i.e., getting closer to the
analytical probability) by increasing the total number of experiments, provided that some conditions
are satisfied with respect to the nature of the random process. Intuitively, you would expect that a
probability estimate improves if we increase the total number of experiments. While a proper proof
of the above statement, in a general case, is beyond the scope of this course, we shall assume that our
estimate of a particular probability does improve if we increase the number of experiments.
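The effect of the number of experiments can be explored with a short simulation. The sketch below assumes a fair coin simulated with rand, with Heads coded as 0; the sample sizes are arbitrary choices.

```matlab
% Sketch only: estimate P(Xt = Heads) from N simulated tosses, for growing N.
for N = [30 300 3000 30000]
    X = double(rand(1, N) >= 0.5);   % 0 = Heads, 1 = Tails
    pHeads = sum(X == 0) / N;        % proportion of tosses showing Heads
    fprintf('N = %5d   estimate = %.4f\n', N, pHeads);
end
```

The estimates vary from run to run, but for the larger values of N they typically lie closer to the analytical value 0.5.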
The subset S can have more than one element; for example, what happens if S is chosen to be equal
to Ω, i.e. S = {Heads, Tails}? In this case,
P(Xt ∈ S) = P(Xt = Heads or Xt = Tails) = 1
since the number of times that Heads or Tails are observed is equal to the total number of experiments.
In other words, no matter how many times we toss a coin, we will observe an element in Ω (Heads or
Tails) because no other outcome is possible! This conclusion seems trivial, yet, it is a very important
theorem in statistics and probability:
Theorem 4.2.1. For any random process Xt with state space Ω, we have P(Xt ∈ Ω) = 1.
Practice
In the following, we make the definitions of random variables more specific by considering some
application problems. These are thought experiments (or can be done with dice); the MATLAB
methods for simulation will be introduced later.
Example 4.2.2. What is the probability that four dice sum up to a value more than 10?
We could solve the problem exactly (with probability theory), but we could also get good estimates
by simulation. We would throw the four dice many times and determine the proportion of times that
we get a value more than 10. Better, we get a computer to throw the dice! The same strategy works
for many problems in the real world.
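As a preview, getting a computer to throw the dice might look like the sketch below. It assumes fair six-sided dice and uses the MATLAB function randi, which returns uniformly distributed random integers; the number of repetitions N is an arbitrary choice.

```matlab
% Sketch only: estimate the probability that four fair dice sum to more than 10.
N = 100000;                  % number of simulated throws of the four dice
rolls  = randi(6, N, 4);     % each row is one throw of four dice
totals = sum(rolls, 2);      % sum of the four dice in each throw
pEst   = mean(totals > 10)   % proportion of throws with a sum above 10
```

For comparison, counting the 6^4 = 1296 equally likely outcomes gives the exact value 1090/1296, which is about 0.841, so the estimate should be close to this.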
Example 4.2.3. What is the probability that it will be sunny tomorrow?
We can construct a simulator for the weather (not easy!) and then run it many times to compute the
probability.
In fact we can use the same idea to help with integration. It is often extremely difficult to find the
integral of a function analytically (at least for most real-world functions). Later on in this chapter we
will see how to compute approximate integrals using simulation. That is usually much much faster
than trying to evaluate the integral exactly.
Theory
Random variables can be categorised as discrete or continuous and can be visualised using histograms.
A random variable X is said to be discrete if its state space Ω is a discrete set; for example, the
discrete set of integer numbers {0, 1, 2, . . . }.
Example 4.3.1. Little Ben received a mini keyboard as a gift for his first birthday. It has 7 keys for
the notes Do, Re, Mi, etc. For simplicity in notation, we denote them by the numbers 1, 2, 3, . . . , 7.
The table below shows a random process of the keys that he presses.
time key time key time key time key time key time key time key
1 3 4 5 7 1 10 7 13 4 16 1 19 7
2 1 5 2 8 7 11 3 14 3 17 2 20 6
3 4 6 6 9 2 12 6 15 5 18 2 21 1
Solution
We can estimate the probability by counting the number of times that he pushes 1 (four times according
to the table) and 7 (three times). Thus,
\[ P(X_t = 1 \text{ or } X_t = 7) \approx \frac{4+3}{21} = \frac{1}{3} \approx 0.33. \]
Histograms
Note that counting the numbers is not an efficient way of estimating probabilities for large samples;
for example, counting the number of events in a sample of 2100 would be enormously time consuming!
A common method to visualise and estimate probabilities is to use a histogram.
Consider a random process Xt with the state space Ω. Assume that we partition Ω into subsets S1 , S2 ,
· · · , SN . The histogram is a graph that gives the number of times that each S1 , · · · , SN is observed
in Xt .
The division of the histogram of a discrete random process by its total number of experiments is called
the probability mass function of the random process.
The probability mass function of any discrete random process has the following properties:
• the sum of its values, that is, the total height of the bars for the probability mass function over
Ω is 1. (Why?)
Practice
The command hist in MATLAB plots histograms. (Tip: typing help (or doc) followed by a command
or function name gives you help in the command window (or in a separate window).) The hist function
defines subsets by their centres. Since the histogram gives the number of times each Si is observed, for
discrete random variables we can estimate probabilities by dividing the histogram by the total number
of experiments.
To visualise the keys pressed by Ben in Example 4.3.1, we consider S1 = {1}, up to S7 = {7}. We
use the following code to obtain the probability estimates and generate the graph shown below. (The
code can be written in a MATLAB script file, which you can save in the current directory/folder.)
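A sketch of code along these lines is given here. It enters the data of Example 4.3.1 by hand and follows the hist pattern used elsewhere in these notes; the variable names are illustrative.

```matlab
% Sketch only: probability estimates for the keys pressed in Example 4.3.1.
X = [3 1 4 5 2 6 1 7 2 7 3 6 4 3 5 1 2 2 7 6 1];  % keys pressed at times 1 to 21
N = length(X);      % total number of experiments
S = 1:7;            % subset centres, one for each key
h = hist(X, S);     % number of times each key was pressed
p = h/N;            % probability estimates
disp([S; p])        % first row: key, second row: probability estimate
```

The bar chart can then be drawn with bar(S, p), followed by xlabel('Key') and ylabel('Probability estimate').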
[Figure: bar chart of the probability estimate (vertical axis, 0 to 0.2) for each key from 1 to 7.]
Note that the semi-colon ’;’ at the end of a line of MATLAB code will suppress its output in the
command window.
Alternatively, in the MATLAB 2014b release a new function histogram was introduced. The de-
fault histogram settings give a plot of the number of observations in each subset defined by subset
edges. The histogram function has a number of options that allow us to choose how the data is
displayed. The 'BinMethod' option allows us to define the subsets as integers instead of by edges.
The 'Normalization' option allows us to show the results as probabilities rather than the default
counts. Hence the following code can be used:
X = [3 1 4 5 2 6 1 7 2 7 3 6 4 3 5 1 2 2 7 6 1];
histogram(X,'BinMethod','integers','Normalization','probability')
xlabel('Key')
ylabel('Probability estimate')
In this example we entered the data sample into MATLAB. However, large data samples are usually
provided in separate data files, which should be loaded by MATLAB before plotting the histogram.
We do not consider such cases in this course.
Theory
A random variable X is said to be continuous if its state space Ω is a continuous set; for example,
the interval [0, 10] of real numbers, R, between 0 and 10.
Practice
Example 4.3.3. The pigeon PJ flies from her nest every day, searching for food. On average, she
flies Xt metres away from her nest (as a random process) and she never gets further than 100 metres.
A researcher recorded Xt for 200 days and stored the values in a data file. The distances that the
researcher has recorded during the first 21 days are given in the following table.
Assuming that the data file X is available, explain how to estimate P(50 ≤ Xt ≤ 60) from a visual
examination of the data.
Solution
In this case the random process Xt is continuous with state space Ω = [0, 100]. We need to consider
the subsets S1 , S2 , etc, as intervals as well. For example we could select the intervals S1 = (0, 10],
S2 = (10, 20], S3 = (20, 30], · · · , S10 = (90, 100]. Also consider the following code:
For a sample of 200 days this code has given the graph on the following page.
[Figure: bar chart of the probability estimate (vertical axis, 0 to 0.35) against S from 0 to 100.]
Therefore P(50 ≤ Xt ≤ 60) is approximately equal to the height of the bar between 50 and 60, which
is approximately 0.25.
Example 4.3.4. According to the graph, what is the estimate of P(60 ≤ Xt ≤ 80)?
Example 4.3.5. Without the graph, what is P(0 ≤ Xt ≤ 100)? How do you interpret this probability
in the graph?
This section outlines common continuous random variables and how to simulate them. We will then
look at the use of continuous random variables for modelling error.
Theory
In Example 4.3.3, we used 10 subsets to generate the histogram and, therefore, obtain an estimate
of the probability mass function. We could, of course, assign the subsets differently or use more of
them; for example, we could use S1 = [0, 1], S2 = (1, 2], . . . , S100 = (99, 100]. By using more subsets,
we get more bars in the probability mass function but the bars get thinner and (mostly) shorter. If
we use a lot of very very small subsets (for example, S1 = [0, 0.01], etc.) the bars tend to lines and
the probability mass function tends to a continuous curve. This makes sense, because we expect that
the probability mass function of a continuous random variable would be a continuous function rather
than a discrete set of bars. This continuous function is called the probability density function.
More formally, the probability density function of a continuous random variable X is the limit of its
probability mass function when the number of subsets Si ’s tends to infinity and the subset widths go
to zero.
In Statistics and Probability Theory, probability density functions, sometimes abbreviated to p.d.f.,
are very important; in fact, continuous random variables are defined by their p.d.f.’s. The next figure
shows an example of a p.d.f. of a random variable X. The probability that the value of the random
variable lies between a and b is the shaded area. (Why?)
[Figure: graph of a p.d.f. f(x) against x, with the area under the curve between a and b shaded.]
Or, put another way: $\Pr(a < X < b) = \int_a^b f(x)\,dx$.
There are three important properties of p.d.f.’s for any continuous random variable:
1. The value of the p.d.f. function is always non-negative.
2. The integral over the whole real line of any p.d.f. equals one.
Example 4.4.1. Let X be a continuous random variable with the p.d.f. shown below. If this is a
p.d.f., can we find the value of c?
[Figure: a triangular graph with its peak of height c at x = 0, falling to zero at x = −1 and at x = 1.]
Solution
We want the area under the graph to equal one. The area of each triangle is $\frac{1}{2} \times 1 \times c$ so the total
area is c. Hence, in order for this to be a p.d.f., we must have c = 1.
2. Prob(X = 0)
Solution
1. Exactly half the area is on the positive side, so Prob(X > 0) = 0.5.
2. The area under the graph from 0 to 0 is zero! So Prob(X = 0) = 0. With continuous random
variables the probability of a single point is zero. (This is a good trick question for exams.)
3. This is a little harder, but not much. We want the area of the triangle from 0.5 to 1, which has
height 0.5. This is $\frac{1}{2} \times 0.5 \times 0.5 = 0.125$.
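These areas can also be checked numerically. The sketch below assumes that the p.d.f. of Example 4.4.1 with c = 1 is f(x) = 1 − |x| on [−1, 1] and zero elsewhere, which is our reading of the figure, and uses the MATLAB function integral for numerical integration.

```matlab
% Sketch only: numerical check of the areas under the triangular p.d.f.
f = @(x) max(1 - abs(x), 0);        % assumed form of the p.d.f. with c = 1
totalArea = integral(f, -1, 1);     % area under the whole graph
pPositive = integral(f, 0, 1);      % area on the positive side
pUpper    = integral(f, 0.5, 1);    % area of the triangle from 0.5 to 1
fprintf('total = %.4f, positive side = %.4f, from 0.5 to 1 = %.4f\n', ...
        totalArea, pPositive, pUpper);
```

The three values should agree with the answers 1, 0.5 and 0.125 found above, up to the accuracy of the numerical integration.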
In the following, we consider two famous p.d.f.’s that are widely used.
The uniform distribution is the simplest continuous distribution. More complicated distributions
can be built from this distribution and it is the basis of all random number generators.
The p.d.f. for the uniform distribution between a and b looks like a box: it is zero for X < a, and
zero for X > b. Between a and b the height is $\frac{1}{b-a}$, so that the total area under the graph is one.
\[
f(x) =
\begin{cases}
\dfrac{1}{b-a} & \text{if } a < x < b, \\
0 & \text{otherwise.}
\end{cases}
\]
Note that all of the real values between a and b are possible, not just the integers. Examples of
different uniform distributions are shown in the figure below.
Figure 4.2: Uniform distributions on the intervals [0, 1], [0, 2] and [1, 2]
Practice
The rand function in MATLAB produces a random number from the uniform distribution on the
interval [0, 1] (this is the standard uniform distribution). The command rand(m,n) produces an
m × n matrix of random numbers in [0, 1]. To generate just one random number in [0, 1], type rand(),
which is the same as rand(1) and rand(1,1).
There are two rules that we can use to generate different uniform random variables.
Example 4.4.3. Use MATLAB to generate a uniform random number in the interval [2, 5].
Solution
If X is uniform on [0, 1] then 3 X + 2 will be uniform on [2, 5]. The MATLAB code is
>> 3*rand()+2
Example 4.4.4. Use MATLAB to generate a uniform random number in the interval [a, b].
Solution
If X is uniform on [0, 1] then (b − a) X is uniform on [0, b − a] and (b − a) X + a is uniform on [a, b].
The MATLAB code is
>> (b-a)*rand()+a
Example 4.4.5. Write MATLAB code to generate 20 uniformly distributed random variables that
lie between 1 and 5.
Solution
>> x=4*rand(20,1)+1
Notes
1. The MATLAB command rand(20,1) generates a 20 × 1 matrix of [0, 1] uniform random num-
bers. Then 4*rand(20,1) is the same matrix with all entries multiplied by 4. When we add 1
to the matrix, MATLAB adds 1 to every element in the matrix.
2. It is important to understand that rand(a,b) does NOT generate a number from the uniform
distribution on the interval [a, b].
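The two notes above can be checked with a short script. This is only a sketch; the names a, b, x and M are illustrative.

```matlab
% Sketch only: scaling and shifting rand, and what rand(a,b) really does.
a = 1;  b = 5;
x = (b - a)*rand(20, 1) + a;   % 20 uniform random numbers on [a, b]
fprintf('smallest = %.3f, largest = %.3f\n', min(x), max(x));

M = rand(2, 3);                % a 2-by-3 matrix of uniform numbers on [0, 1],
disp(size(M))                  % NOT numbers from the interval [2, 3]
```

The smallest and largest values printed always lie between 1 and 5, while every entry of M lies between 0 and 1.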
Example 4.4.6. Consider Example 4.3.3 again. Write MATLAB code to generate 200 random vari-
ables that are uniformly distributed between 5m and 100m. Plot the histogram of your generated data,
using the same Si ’s as in Example 4.3.3. Does your histogram look like the one that the researcher
got in the example? What do you conclude about the distribution of the distance travelled by PJ?
X = 5+95*rand(200,1);
N=length(X);% total number of experiments
S = [0:10:100]; % subset definitions
h = hist(X, S); % histogram
bar(S, h/N); % probability plotted versus S
xlabel('S')
ylabel('Probability estimate')
Theory
One of the foundational observations of statistics is that if you take enough little random variables,
the distribution of their average approaches a density function that looks just like a bell curve (this is
essentially what the central limit theorem means).
The histogram below shows the chest measurements of 5738 Scottish soldiers collected by a Belgian
scholar with statistical interests.
[Figure: histogram of the chest measurements, with horizontal axis "Measurement" from 32 to 50.]
It turns out that we can approximate the histogram by a smooth symmetric bell-shaped curve called
the Normal probability density function curve:
\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
• the probability that x lies in an interval [a − h, a + h], for fixed h > 0, decreases with the
distance between a and µ.
We say that such a random variable X is normally distributed with mean µ and variance σ 2 (or
standard deviation σ). It is often written as X ∼ N(µ, σ 2 ).
The mean tells us approximately where the data is (the peak), and the variance or standard deviation
tells us how much the data is spread out. This is illustrated in the figures below which show us the
p.d.f.s for several normal distributions with different means and variances. The normal distribution
with mean 0 and variance 1 (i.e., N(0,1)) is sometimes called the standard normal distribution; its
p.d.f. is shown below.
[Figure: p.d.f. of the standard normal distribution, plotted against x from −5 to 5.]
P.d.f.s for some other normal distributions are shown in the next figure.
[Figure: p.d.f.s of N(3,1), N(-5,4), N(3,4) and N(3,25), plotted on a horizontal axis from −15 to 20.]
Note that N(3, 1), N(3, 4) and N(3, 25) are all centred at 3. As the variance increases, the p.d.f. becomes
more spread. Note also that N(−5, 4) and N(3, 4) are exactly the same shape and differ only in location.
Roughly 68% of the area under the curve falls within one standard deviation of the mean, which is
the interval [µ − σ, µ + σ]; roughly 95% falls within two standard deviations of the mean, that is, in
the interval [µ − 2 σ, µ + 2 σ]; and roughly 99.7% falls within three standard deviations, that is, in the
interval [µ − 3 σ, µ + 3 σ]. You can check this on the p.d.f. of the standard normal distribution plotted
earlier. Similarly, a normally distributed random variable with mean 10 and standard deviation 2 will
lie in the interval [6, 14] with probability approximately 0.95.
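These percentages can be checked by simulation. The sketch below uses the MATLAB function randn, which generates standard normal random numbers, scaled to mean 10 and standard deviation 2; the sample size N and the variable names are arbitrary choices.

```matlab
% Sketch only: empirical check of the 68%, 95% and 99.7% figures.
mu = 10;  sigma = 2;  N = 100000;
X = sigma*randn(N, 1) + mu;          % N samples with mean mu and std dev sigma
frac = zeros(1, 3);
for k = 1:3
    % proportion of samples within k standard deviations of the mean
    frac(k) = mean(abs(X - mu) <= k*sigma);
end
disp(frac)
```

The three proportions printed should be close to 0.68, 0.95 and 0.997; the second one is the estimated probability of lying in the interval [6, 14].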
Practice
The randn function in MATLAB produces a random number from the standard normal distribution
N(0, 1). This function works the same way as rand: the command randn() generates a single number
with normal distribution; the command randn(m,n) produces an m × n matrix of normal random
numbers.
Warning: It is very easy to confuse rand and randn in your MATLAB code!
There is a general rule that we can use to generate normal random variables with given means and
variances, starting with a number generated by randn.
• If X is normally distributed with mean µ and variance σ 2 , then X + a is normal with mean µ + a
and variance σ 2 .
• If we multiply a normal random variable by a positive number we get another normal random
variable. If b > 0 and X is normally distributed with mean µ and variance σ 2 , then b X has
mean b µ and variance b2 σ 2 (and standard deviation b σ).
Example 4.4.7. What if we want a normal random number with mean 1 and standard deviation 5?
Solution
>> 5*randn()+1
or
Example 4.4.8. Write MATLAB code to generate 100 random numbers normally distributed with
mean 12 and variance 4. Use the vector form of randn to produce a (column) vector z with 100
numbers from the N(12, 4) distribution.
Solution
The standard deviation will be 2, since the variance is 4. (The variance equals the square of the
standard deviation.)
>> z=2*randn(100,1)+12;
Example 4.4.9. Consider Example 4.3.3 again. Write MATLAB code to generate 200 random variables
that are normally distributed with mean 40 m and standard deviation 10 m. Plot the histogram
of your generated data, using the same S_i’s as in Example 4.3.3. Does your histogram look like the
one that the researcher got in the example? What do you conclude about the distribution of the
distance travelled by PJ? Try some other mean and standard deviation values to get your histogram
more similar to the graph in Example 4.3.3.
Theory
It is also useful to know that there are many other distributions, and random variables from these can
often be generated by using combinations of uniform and normal random variables.
$$f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
with λ > 0
$$X = -\frac{1}{\lambda}\log(U)$$
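The density above is that of the exponential distribution with rate λ, and the formula for X turns a uniform random number U on (0, 1) into an exponential one. A minimal sketch, assuming U comes from rand and taking λ = 2 purely for illustration:

```matlab
lambda = 2;                 % rate parameter (arbitrary illustrative value)
n = 100000;                 % number of samples
U = rand(n,1);              % uniform random numbers on (0,1)
X = -log(U)/lambda;         % the formula X = -(1/lambda)*log(U)
sampleMean = mean(X)        % an exponential with rate lambda has mean 1/lambda
```

Since log(U) is negative for U in (0, 1), every generated value is positive, as the density requires.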
If Y is a standard normal random variable, then X = Y² has a chi-squared distribution (also written
as a χ²-distribution). For this reason, you will come across chi-squared distributions a lot in classical
statistics.
For example, one such random number can be generated with
>> Y=randn();
>> X=Y^2;
Theory
The following is an example of the application of the simulation of a continuous random variable to
modelling. Often when modelling a system we want to be able to include uncertainty due to parameter
estimation, and uncertainty due to measurement error.
Where X_n is the fish population at time n, r is the growth rate, and K is the carrying capacity of the
environment. (This is a difference equation, as in Ch. 3.)
Practice
Often the parameters in models are estimates only. For example, we estimate r = 0.36. To account for
our uncertainty we can generate a value of r from a normal distribution (say with standard deviation
0.05), run the simulation for fish stock and repeat.
First we can write a function to give N time steps for the fish stock model:
We can then use the fish function to simulate 50 fish stock populations through time, given our
uncertainty in the growth rate, r, and the carrying capacity, K.
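The sketch below shows one way such a simulation could look. It is not the fish function from the notes: it assumes a logistic difference equation of the form X_{n+1} = X_n + r X_n (1 − X_n/K), and the initial population, the mean and standard deviation of K, and the number of time steps are all illustrative values. Only the growth rate estimate 0.36 and its standard deviation 0.05 come from the text.

```matlab
N = 50;                        % number of time steps (assumed)
numPops = 50;                  % number of simulated populations
X1 = 1e6;                      % initial population (illustrative value)
X = zeros(N, numPops);         % column j holds population j through time
for j = 1:numPops
    r = 0.36 + 0.05*randn();   % growth rate drawn from N(0.36, 0.05^2)
    K = 4.5e6 + 2e5*randn();   % carrying capacity (illustrative mean and s.d.)
    X(1,j) = X1;
    for n = 1:N-1
        % assumed logistic difference equation; the course model may differ
        X(n+1,j) = X(n,j) + r*X(n,j)*(1 - X(n,j)/K);
    end
end
plot(1:N, X)                   % one curve per simulated population
xlabel('Time step'); ylabel('Fish population')
```

Each curve uses its own randomly drawn r and K, so the spread of the curves gives a picture of how parameter uncertainty feeds through to the predicted stock.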
[Figure: 50 simulated fish stock populations (×10^6) plotted against time step, 0 to 50.]
Theory
Sometimes we want to model random fluctuations in time. The most well-known model for this is
Brownian motion.
Recurrence equation:
• The value X_{k+1} after k steps has a normal distribution with variance kσ².
• Good for short-term models, but not good for exploring long-term behaviour: the process is highly
likely to whiz up towards infinity (or down towards negative infinity).
Practice
The following code gives the general form for simulating a Brownian random walk:
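A minimal sketch of such a simulation, assuming the recurrence X_{k+1} = X_k + σ Z_k with independent standard normal Z_k (this is consistent with the variance kσ² quoted above). The starting value, step size σ and number of steps are illustrative.

```matlab
N = 50;                              % number of time steps (illustrative)
sigma = 1;                           % standard deviation of each step (illustrative)
X = zeros(N,1);
X(1) = 0;                            % starting value (illustrative)
for k = 1:N-1
    X(k+1) = X(k) + sigma*randn();   % add a N(0, sigma^2) increment
end
plot(X); xlabel('Time step'); ylabel('X')
```

The loop could also be replaced by the vector command X = X(1) + [0; cumsum(sigma*randn(N-1,1))], which adds up all the increments at once.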
Example 4.4.10. Use MATLAB to model the environmental fluctuation in carrying capacity.
Solution
We model carrying capacity as a random walk: each step K changes by a small (random) amount.
We begin by creating a function that simulates a fish population and that models the fluctuations in
carrying capacity as a Brownian random walk.
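As a rough illustration of the idea (not the brownFish function itself), the script below lets K take one random-walk step per time step while the population follows an assumed logistic update. The update rule, the starting values and the step size sigmaK are assumptions made for this sketch.

```matlab
N = 50;                 % number of time steps
r = 0.36;               % growth rate estimate used earlier in the notes
X = zeros(N,1);
K = zeros(N,1);
X(1) = 1e6;             % initial population (illustrative)
K(1) = 4.5e6;           % initial carrying capacity (illustrative)
sigmaK = 1e5;           % s.d. of each random step in K (illustrative)
for n = 1:N-1
    % assumed logistic update using the current carrying capacity
    X(n+1) = X(n) + r*X(n)*(1 - X(n)/K(n));
    % carrying capacity takes one Brownian random-walk step
    K(n+1) = K(n) + sigmaK*randn();
end
plot(1:N, K); xlabel('Time step'); ylabel('Carrying capacity K')
```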
The following figure shows what the changing carrying capacity for our fish populations might look
like through time.
We could then simulate several possible fish populations through time using the brownFish function.
[Figure: carrying capacity K (×10^6) plotted against time step, 0 to 50.]
hold on
for i=1:10 % number of simulated populations (10 is an assumed value)
X = brownFish(X1,r,K,N);
plot(X);
end
hold off
Theory
In the previous section we considered simulating continuous random variables (or processes), where
we assumed that their p.d.f.s were either uniform or normal. In this section we consider simulating
discrete random variables using a uniform distribution. In other words, we want to use a uniform
distribution, which is a continuous distribution, to simulate a discrete random process. We will also
show some alternative methods for simulating discrete random variables. Whichever method is used,
we first need to know the probability mass function of the discrete random process. We show how to
simulate discrete random variables using some examples.
Here we simulate random discrete variables where the probabilities for each state are equal.
Practice
Solution
We did this earlier using the randi MATLAB function. Here we solve this problem in two different
ways.
(i) If we generate a uniform random variable between 0 and 1 then there is a 50% probability that
the number is less than 0.5. You can think of this as subdividing the line from 0 to 1 into two
segments:
0 0.5 1
HEADS TAILS
When we generate our uniform random variable we are selecting a random position along the
line segment [0,1]. We can generate X from the uniform distribution on (0, 1) using MATLAB
as follows. If the random variable from the uniform distribution is less than 0.5 we will say the
result is Heads, otherwise it is Tails. Equivalently (where Heads=1, Tails = 2)
function flip=coinflip()
X = rand(); % generate X from uniform distribution
if (X<0.5)
flip=1; % case Heads
else
flip=2; % case Tails
end
end
(ii) Alternatively, we could also use the ceil function from MATLAB. This takes a real number
and rounds up to the nearest integer. To generate 1 or 2 with equal probability we can generate
a random variable that is uniformly distributed between 0 and 2 and round up:
>> flip=ceil(2*rand())
Simulating a dice
Solution
State space: Ω = {1, 2, 3, 4, 5, 6}. Each outcome has probability 1/6. We will do this in two ways:
[Figure: the interval [0, 1] divided into six equal segments labelled 1 to 6.]
We let the variable out contain the result of rolling the dice.
function out=dice()
x = rand(); % Uniform random real number from 0 to 1.
if (x<1/6)
out=1; % dice gives 1
elseif (x<2/6)
out=2; % dice gives 2
elseif (x<3/6)
out=3; % dice gives 3
elseif (x<4/6)
out=4; % dice gives 4
elseif (x<5/6)
out=5; % dice gives 5
else
out=6; % dice gives 6
end
end
(ii) If we generate a uniform random variable between 0 and 6, and then round up to the nearest
integer, we get 1, 2, . . . , 6 with equal probabilities.
>> ceil(6*rand())
Note that this method, using ceil, only works when the probabilities are all equal.
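To see that the ceil method really does give equal probabilities, one can roll many dice at once and look at the proportion of each outcome. A small sketch, where the number of rolls is an arbitrary choice:

```matlab
n = 60000;                       % number of simulated rolls (illustrative)
rolls = ceil(6*rand(n,1));       % each entry is one of 1, 2, ..., 6
freq = zeros(6,1);
for k = 1:6
    freq(k) = sum(rolls == k)/n; % proportion of rolls equal to k
end
freq                             % each entry should be close to 1/6
```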
Example 4.5.3. Write MATLAB code to simulate the sum obtained when three dice are thrown.
Solution
We can use a loop performing three simulations:
total=0; % total sum at start
for i=1:3 % run three simulations
total = total + dice(); % use dice function we made
end
Theory
The same general approach works for more complicated distributions where the state probabilities are
unequal.
Benford’s law
A set of numbers satisfies Benford’s law if the leading digit d ∈ {1, 2, . . . , 9} occurs with probability
$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right).$$
That is, the leading digit is likely to be small, and it turns out that this is very common in lots of datasets.
For example, the first digits of randomly selected street addresses, atomic weights, and population
sizes have a discrete distribution roughly following Benford’s law.
4.5. SIMULATING DISCRETE RANDOM VARIABLES 95
[Figure: bar chart of the Benford’s law probability of each leading digit, 1 to 9.]
Practice
We can simulate random variables using a uniform function and then assign them to a discrete random
variable based on the cumulative distribution.
[Figure: the interval [0, 1] divided into nine segments labelled 1 to 9.]
Note that the interval widths in the above figure are not to scale!
function digit=Benford()
x = rand(); % Uniform random real number from 0 to 1.
if x< log10(2)
digit=1;
elseif x<log10(3)
digit=2;
elseif x<log10(4)
digit=3;
elseif x<log10(5)
digit=4;
elseif x<log10(6)
digit=5;
elseif x<log10(7)
digit=6;
elseif x<log10(8)
digit=7;
elseif x<log10(9)
digit=8;
else
digit=9;
end
Suppose that we have been given a vector p containing the probabilities for states 1 to m. We want
to generate a random variable with these probabilities.
We can continue the same idea as before, except that we use a for loop to go through the different
states.
function state=NewRandomVariable()
p = [0.4 0.4 0.2]; % example only: replace with the vector containing your probabilities
m = length(p); % m is the number of segments (states)
r=rand(); % generate uniform random number
total=0; % start of first interval segment
for i=1:m % check each segment
if total<r && r<=total+p(i)
state=i; % r lies in segment (total, total+p(i)]
end
total=total+p(i); % update to start of next segment
end
We are looping through the intervals until we get the one that contains r.
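The same segment search can be written more compactly with the cumulative sums of p. The sketch below is one possible variant, not the method from the notes; the probability vector is an example, and the last line checks the simulated proportions against p.

```matlab
p = [0.4 0.4 0.2];            % example probabilities for states 1 to 3 (must sum to 1)
c = cumsum(p);                % right-hand ends of the segments
c(end) = 1;                   % guard against rounding error in the final sum
n = 100000;                   % number of simulated values
states = zeros(n,1);
for k = 1:n
    r = rand();                   % uniform random number
    states(k) = find(r <= c, 1);  % first segment whose right end is at least r
end
freq = [mean(states==1) mean(states==2) mean(states==3)]   % should be close to p
```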
4.6. ESTIMATING PROBABILITIES, MONTY HALL 97
Theory
Suppose, for example, that we want to know the probability that a random variable X is less than
some value, k. One way of doing this is to generate lots of instances of X and determine the proportion
of times you generate a value less than k (i.e. using simulation as outlined in Section 4.2.3).
Practice
Example 4.6.1. Write MATLAB code to estimate the probability that a normal random variable
with mean 100 and variance 2² is less than 98.
Solution
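One possible script, sketched here using 1000 simulations and the rule for shifting and scaling randn:

```matlab
count = 0;                      % number of simulated values below 98
numsims = 1000;                 % number of simulations
for n = 1:numsims
    x = 100 + 2*randn();        % normal with mean 100 and standard deviation 2
    if x < 98
        count = count + 1;
    end
end
prob = count/numsims            % estimated probability that X < 98
```

For reference, 98 is exactly one standard deviation below the mean, so the true probability is about 0.159.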
Every time you run this script you may get a slightly different answer. As the number of simulations
(in this case 1000) increases, the variability between runs will decrease.
There is a general recipe that we can follow. Suppose that you want to know the probability that
some event happens. The simulation algorithm is:
count = 0;
numsims = 1000; % (or 10000, or more)
for n=1: numsims
(simulate something)
if (the event occurred in the simulation)
count = count+1;
end
end
prob = count/numsims
• 'simulate something' Use the techniques of the previous chapter to generate a random instance.
In the above example, this meant generating a random normal with mean 100 and
standard deviation 2.
• 'if the event occurred in the simulation' Here you test the outcome of your simulation
to see if what you are looking for occurred. In the above example, we wish to test if the
random variable was less than 98.
Note that the number of simulations needs to be high enough so that the resulting proportion is close
to the actual probability of the event and not largely affected by fluctuations.
Example 4.6.2. What is the probability that four dice sum up to a value more than 10?
Solution
From earlier we saw that we could simulate a single dice using
>> ceil(rand()*6)
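The sum of four dice can be simulated by adding four such rolls in a loop, or in a single vector command: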
>> sum(ceil(6*rand(4,1)))
numsims=1000;
count = 0; % counts the number of times the sum is greater than 10
for n=1:numsims
% simulate summation of the numbers
total = sum(ceil(6*rand(4,1)));
if total>10
count = count+1; % count the simulation
end
end
prob = count/numsims % probability estimate
Example 4.6.3. Suppose it has been projected that the number of people who will visit the Auckland
Zoo each day during next January is normally distributed with a mean of 1200 and standard deviation
of 400. Assuming that the zoo is open every day and each visitor pays an entry fee of $7.50, estimate
the probability that the zoo will receive at least $250,000 in entry fees during next January.
Solution
count = 0;
numsims = 1000;
for n=1:numsims
total = 0; % number of people that came in the month
for k=1:31 % 31 days in January
z = 1200 + 400 * randn(); % simulate the number of people coming this day
% add the number of people in one day to the total number in the month
total = total + z;
end
if 7.50*total >= 250000 % entry fees for the month are at least $250,000
count = count+1;
end
end
prob = count/numsims % probability estimate
A casual supermarket worker can get 3, 4 or 5 shifts a week with probabilities 0.4, 0.4 and 0.2
respectively. A random vector of shifts worked over 10 weeks can be simulated using the code:
function shifts=shifts10()
shifts=zeros(10,1); % setting up vector for storage
for i=1:10
r=rand(); % generate variable from standard uniform distribution
if r<0.4
shifts(i)=3;
elseif r<0.8 % sum of the 1st two probabilities
shifts(i)=4;
else
shifts(i)=5;
end
end
end
Suppose the output from our previous simulation of the shifts for a supermarket worker is
The command shifts == 5 will produce a vector which has zero for every element not equal to 5 and
one for every element equal to five. Hence shifts==5 produces the vector [0,0,0,1,1,0,0,0,0,1].
These vector commands are especially useful when evaluating the outcome of a single simulation. The
MATLAB command sum produces the sum of a vector. This means that sum(shifts==5) has the
value 3. In one line we can count the number of entries equal to 5 (or less than 4, or more than 4).
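To see these commands in action, the sketch below uses a made-up vector of shifts. It is chosen only so that shifts==5 matches the vector quoted above; the positions of the 3s and 4s are hypothetical.

```matlab
shifts = [3 4 3 5 5 4 3 4 3 5];    % hypothetical output; only the positions of the 5s follow the text
isFive = (shifts == 5)             % logical vector: 1 where the entry is 5, 0 elsewhere
numFive = sum(isFive)              % number of weeks with 5 shifts
numLessThanFour = sum(shifts < 4)  % number of weeks with fewer than 4 shifts
```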
Example 4.6.4. Write a MATLAB program to estimate the probability that a supermarket worker
gets 5 shifts in more than 6 weeks out of 10.
Solution
Above we saw how to generate a vector shifts of 10 numbers giving the shifts worked each week.
The number of weeks in which the worker worked 5 shifts is
>> sum(shifts==5)
and we can test if this number is greater than six. In order to get an estimate, we perform many (10000)
runs of the simulation, and determine the proportion that pass the test. Putting everything together:
numsim = 10000;
count = 0;
for n=1:numsim
shifts=shifts10(); % using the previous code
% check if the worker worked 5 shifts in more than six weeks out of ten.
if sum(shifts==5) > 6
count = count+1;
end
end
prob = count/numsim % probability estimate
Suppose you’re on a game show, and you’re given the choice of three doors; behind one door is a car;
behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the
doors, opens another door, say No. 3, which has a goat.
He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?
To answer this we can use simulation to estimate the probability of getting the car if we don’t switch,
and the probability if we do switch. We’ll assume we always pick door one to begin with. The position
of the car is a random variable with equal probability for each state.
Using the car function we can start to see how to simulate the Monty Hall problem. For example, we
can use car() to choose a door at random, too. How then can we determine success or failure in the
game, and implement the switching strategy? We will develop a full code for this in lecture, which
will then be posted to the MATLAB code resource page on Canvas.
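In the meantime, here is one possible sketch of the whole simulation. It is not the code developed in lectures: it places the car with ceil(3*rand()) instead of calling a car function, and it assumes the host picks at random when both remaining doors hide goats.

```matlab
numsims = 10000;
stayWins = 0;                            % games won by keeping door 1
switchWins = 0;                          % games won by switching
for n = 1:numsims
    carDoor = ceil(3*rand());            % car is behind door 1, 2 or 3 with equal probability
    myDoor = 1;                          % we always pick door 1 first
    % the host opens a goat door that is neither our door nor the car door
    if carDoor == myDoor
        hostDoor = 2 + (rand() < 0.5);   % both other doors hide goats: host opens 2 or 3 at random
    else
        hostDoor = 6 - myDoor - carDoor; % the only remaining goat door (door numbers sum to 6)
    end
    switchDoor = 6 - myDoor - hostDoor;  % the other unopened door
    stayWins = stayWins + (myDoor == carDoor);
    switchWins = switchWins + (switchDoor == carDoor);
end
pStay = stayWins/numsims                 % estimate of winning without switching
pSwitch = switchWins/numsims             % estimate of winning with switching
```

Because exactly one of the two strategies wins in every game, the two estimates always add up to 1.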
Theory
The mean of a finite set of numbers is the sum of the numbers divided by the size of the set of
numbers.
The expectation of a random variable is a number which, roughly, describes the average value you’d
get if you generated lots of instances of that random variable.
Example 4.7.1. In section 4.6.2, a worker got 3, 4 or 5 shifts a week with probabilities 0.4, 0.4 and
0.2 respectively. Let X be the number of shifts. Show that E(X) = 3.8
Solution
By the definition
E(X) = 3 × 0.4 + 4 × 0.4 + 5 × 0.2 = 3.8
Another example is the expected value of a dice. 1/6 of the time you get a one, 1/6 of the time you
get a two, and so on. So when you generate dice rolls a large number of times, close to 1/6 of the
generated numbers will be one, 1/6 will be two, and so on. This gives:
$$E(X) = \frac{1}{6}\times 1 + \frac{1}{6}\times 2 + \cdots + \frac{1}{6}\times 6 = 3.5$$
The expectation of a continuous random variable with probability density function p is defined using
integrals:
$$E(X) = \int_{-\infty}^{\infty} x\,p(x)\,dx.$$
Example 4.7.4. If X is a normal random variable with mean µ and variance σ² then E(X) = µ.
There are three very important rules about expectations that you need to learn:
For example, what is the expectation of the sum of a standard uniform variable X and a normal
random variable Y with mean 5 and variance 8?
$$E(X + Y) = E(X) + E(Y) = \frac{1}{2} + 5 = 5.5$$
If the random variable is simple, it is easy enough to calculate the expectation. In general, this is not
so simple, so we investigate by simulation.
Law of large numbers: An important property of random variables is that if we generate lots of values
and look at their mean, then as the sample gets larger the mean will get closer and closer to the
expectation. A sample from X is a set of values generated using the distribution of X. The number
of values generated is called the sample size.
Practice
Example 4.7.5. Use 10000 simulations to estimate the mean of a normal random variable with mean
0 and standard deviation 1.
Solution
Of course in this case we already know the ‘true’ expectation, i.e., zero! We will do this example two
ways. First, we use a loop:
numsims = 10000;
total = 0; %initial sum
for i=1:numsims
x = randn(); % simulate a standard normal random variable
total = total + x; % add x to the sum
end
expectation = total/numsims % estimate the expectation
We can also generate a whole vector of normal random variables in one go. This gives the simpler
(and faster) code:
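A sketch of the vector version, using randn(numsims,1) to generate all the values at once and mean to average them:

```matlab
numsims = 10000;
x = randn(numsims,1);        % vector of numsims standard normal random numbers
expectation = mean(x)        % sample mean; should be close to 0
```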
You can try bigger values of numsims to get better accuracy. The bigger the sample, the more accurate
the estimation.
Theory
We can also define the expectation of a function of a random variable. Suppose that X is a real-valued
random variable and g is some function defined on the real numbers. Then
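$$E(g(X)) = \int_{-\infty}^{\infty} g(x)\,p(x)\,dx.$$
If X is instead a discrete random variable taking the values x_1, . . . , x_m, the integral is replaced by a sum:
$$E(g(X)) = \sum_{i=1}^{m} g(x_i)\,P(X = x_i)$$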
Practice
Example 4.7.6. Estimate the expectation of the square of a uniform random variable on [0, 1].
Solution
n = 10000;
total = 0; % initial sum
for i=1:n
% simulate a standard uniform random variable, square it and add it to the sum
total = total + (rand())^2;
end
expectation = total/n % expectation estimate
>> n = 10000;
>> x = rand(n,1); % generate a vector of standard uniform distribution RVs
>> gx = x.^2; % g(x) for each x is x^2, using the dot to make operation vector friendly
>> expectation = sum(gx)/n % expectation estimate
The last line can be simplified using the mean command:
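A sketch of that simplification, repeating the setup so that it runs on its own:

```matlab
n = 10000;
x = rand(n,1);               % vector of standard uniform random numbers
expectation = mean(x.^2)     % estimate of E(X^2); the exact value is 1/3
```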
Here are two general algorithms for estimating the expectation of g(X), the first with loops, the second
without.
Or,
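Both variants might be sketched as follows. The function handle g and the use of rand to simulate X are example choices; replace them with the function and the distribution you are interested in.

```matlab
g = @(x) x.^2;              % the function g (example choice; replace with your own)
n = 10000;                  % number of simulations

% Version 1: with a loop
total = 0;
for i = 1:n
    x = rand();             % simulate X (here standard uniform; replace as needed)
    total = total + g(x);   % accumulate g(X)
end
expectationLoop = total/n   % estimate of E(g(X))

% Version 2: without a loop
x = rand(n,1);              % simulate n copies of X at once
expectationVec = mean(g(x)) % estimate of E(g(X))
```

Writing g with the dot operator (x.^2) means the same handle works for a single number and for a whole vector.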
Exercises:
Estimate E[X² − X] where X is normally distributed, with mean 1 and variance 3. Do not use loops.
4.7. ESTIMATING EXPECTATIONS, GAMBLER’S RUIN 105
Theory
Each turn has equal odds, but we have a finite number of coins. We want to explore things like the
probability I lose all my money, or I get all your money, etc.
Practice
$$X_n = \begin{cases} X_{n-1} + 1 & \text{with probability } 1/2 \\ X_{n-1} - 1 & \text{with probability } 1/2 \end{cases}$$
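This recurrence can be simulated directly with rand. The script below is a sketch of one game, assuming a starting amount of $20 and 10 turns; it ignores the possibility of going bust, which is dealt with separately.

```matlab
x0 = 20;                     % starting money in dollars
N = 10;                      % number of turns
X = zeros(N,1);
X(1) = x0;
for n = 2:N
    if rand() < 0.5
        X(n) = X(n-1) + 1;   % win one dollar
    else
        X(n) = X(n-1) - 1;   % lose one dollar
    end
end
plot(1:N, X); xlabel('turn'); ylabel('dollars')
```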
Here’s the output from 5 runs of gamble(20,10), i.e. I begin with $20 and we play 10 turns:
[Figure: dollars held against turn (1 to 10) for 5 runs of gamble(20,10).]
The players only have a finite amount of money, so we need to impose limits. The game ends when
we hit 0 (I’m bust) or 2 × 20 = 40 (you’re bust).
A note on the use of break in MATLAB. The break command tells MATLAB to jump out of the loop it
is currently in. The routine breaks out of the loop if X(n) is zero or is too large. In the above code
the output will be the index n if this happens (or N if it never does).
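A script along these lines illustrates the use of break. It is a sketch of the idea, not the howLong function itself, and it assumes both players start with $20 and that at most 1000 turns are played.

```matlab
x0 = 20;                      % starting money for each player
N = 1000;                     % maximum number of turns
X = zeros(N,1);
X(1) = x0;
L = N;                        % game length; stays N if nobody goes bust
for n = 2:N
    if rand() < 0.5
        X(n) = X(n-1) + 1;
    else
        X(n) = X(n-1) - 1;
    end
    if X(n) <= 0 || X(n) >= 2*x0
        L = n;                % record the index at which the game ended
        break;                % leave the for loop early
    end
end
L
```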
How long is the average game? (I.e. how long do we expect the game to last if we start with $20)
total = 0;
N=1000; %number of simulated games
for i = 1:N
L=howLong(20,1000);
total = total + L;
end
expect = total/N;
What is the probability that I get all your money at the end of a game?
X = zeros(1000,1);
total=0;
for t=1:1000 % run several games
X(1) = 20; % initial money I have
win=0; % a game still unfinished after 1000 turns is not counted as a win
for n=2:1000 % upper limit on turns
if rand()<0.5
X(n) = X(n-1)+1;
else
X(n) = X(n-1)-1;
end
if X(n)<=0 % You win
win=0; break;
elseif X(n)>=2*20 % I win (assuming you started with the same amount of money)
win=1; break;
end
end
total=total+win;
end
pwin=total/1000 % probability of winning
But what happens if one of us starts with more money? Or, in the more realistic casino scenario,
what happens if the game is slightly biased?
Theory
There is a range of methods for evaluating integrals, some analytical, and some numerical. There are
whole fields of applied mathematics and statistics devoted to this problem. In this section we use the
tools of simulation in order to carry out integration. We will only look at one technique for doing this,
which we will derive below:
$$E(g(X)) = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$$
where X is drawn from the p.d.f. f. Suppose we have X uniformly distributed on an interval [a, b]
and we have some function g(x) for which we want to estimate the mean.
Example 4.8.1. What is the exact expectation of the square of a uniform random number on [0, 2]?
Solution
The answer is the following integral (where a = 0 and b = 2)
$$E(g(X)) = \int_a^b g(x)\,\frac{1}{b-a}\,dx,$$
where g(x) = x². Therefore the mean of a uniform random number from (0, 2) squared is given by
$$\int_0^2 x^2\,\frac{1}{2}\,dx = \frac{1}{2}\left[\frac{1}{3}x^3\right]_0^2 = \frac{4}{3}.$$
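The exact answer can be compared with a simulation estimate. A short sketch, with the sample size chosen arbitrarily:

```matlab
n = 100000;
x = 2*rand(n,1);            % uniform random numbers on (0, 2)
estimate = mean(x.^2)       % should be close to the exact value 4/3
```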
But wait: this gives us a formula for the integral, in terms of the expectation. Flip the sides and
multiply by (b − a). Then we get
$$\int_a^b g(x)\,dx = (b - a)\,E(g(X)).$$
This approach is called Monte Carlo integration (after the gambling capital, Monte Carlo). Relative
to other numerical methods for integration, Monte Carlo integration is better suited to high-dimensional
problems, but we’ll just look at one dimension in this course.
4.8. MONTE CARLO INTEGRATION 109
Practice
n = 1000;
total = 0; % initial sum
for i=1:n
x = (pi/2)*rand(); % simulate a uniform RV between 0 and pi/2
y = cos(x); % the function of x as given
total = total + y; % adding y to the total sum
end
expectation = total/n; % estimate expectation
integral = expectation * pi/2 % estimate the integral
>> n = 1000;
% simulate a vector of n uniform RVs between 0 and pi/2
>> x = (pi/2)*rand(n,1);
>> y = cos(x); % the function of x as given
>> expectation = mean(y); % estimate expectation
>> integral = expectation * pi/2 % estimate the integral
Remember that if x is a vector then, in MATLAB, cos(x) is the vector formed by applying cos to
all the entries of x.
MATLAB returns different values each time, but the larger n is, the closer this value is expected to
get to the true answer.
Example 4.8.3. Using Monte Carlo integration evaluate
$$\int_{-1}^{1} e^{-x^2}\,dx$$
Solution
n=1000;
x=-1+2*rand(n,1); % uniform RV between -1 and 1
y=exp(-x.^2); % elementwise square, since x is a vector
expectation=mean(y);
integral=expectation*2 % estimate the integral, using b - a = 2
$$\int_a^b f(x)\,dx$$
Exercise: Write a general function for estimating integrals Monte Carlo style using vectors.
Chapter 5
Contents
5.1 Introduction
5.1.1 Theory and Notation
5.1.2 Representing Graphs
5.1.3 Graphs in MATLAB
5.2 Graph Dynamical Systems
5.2.1 Disease Spread
5.3 Walks and Circuits
5.3.1 Where Graph Theory Started: The Bridges of Königsberg
5.3.2 The First Graph Theory Proof
5.3.3 Graph Theory Today: The Traveling Salesman Problem
5.3.4 Solving the Traveling Salesman Problem
5.4 Colourings
5.4.1 Where Graph Theory Became Famous: Map Colouring
5.4.2 Vertex Colourings
5.4.3 Graph Colourings: Why and How
5.4.4 Edge Colourings
5.1 Introduction
In this chapter, we will be studying graphs: a mathematical concept designed to let us talk about
objects and the connections between them. While graph theory was originally seen as being a recreational
branch of mathematics with few applications, in recent years its power to describe networks
has made it the perfect tool for studying many problems in the modern world. Graphs can be used
to model and study the internet, social networks, the spread of diseases through a population, travel,
computer chip design, and countless other phenomena; they are everywhere, and are at the heart of
exciting new frontiers of research in data science, mathematics, and computer science.
112 CHAPTER 5. NETWORKS AND GRAPHS
In these notes, we’ll start by building up some definitions that will let us talk about graphs and their
properties, and practice some MATLAB commands that will let us write programs to interact with
graphs. From there, we’ll move to solving various famous problems in graph theory, ranging from the
Bridges of Königsberg and the Four-Colour Theorem to more recent phenomena such as the traveling
salesman problem and tournament design. It’ll be fun!
A graph, in mathematics, is just a way to describe a set of objects and the relations between them.
In a graph, we call the objects vertices; to represent a relation between two objects, we will draw a
connection between those two vertices, and call that connection an edge.
Formally, we define a graph G as a pair of sets (V, E), where V is the set of vertices and E is the set
of edges. Individual vertices are usually denoted by using lower-case alphabetical letters, like a, b, c
or x, y, z; graphs are typically denoted by capital letters like K, G, H. Edges are typically denoted by
writing pairs of vertices in a set; for instance, we can describe the edge connecting a vertex a to a
vertex b by writing {a, b}. You can also describe an edge by writing a ↔ b; either is acceptable!
2. The drawing at right represents G = (V, E), where V = {a, b}, E = {a ↔ b}.
3. The graph drawn at right has vertex set V = {a, b, c, d} and edge set E = {a ↔ b, a ↔ c, a ↔ d, c ↔ d}.
There are a number of ways to describe a graph in MATLAB. In this section, we will discuss the two
most straightforward ways to input graphs into MATLAB, and the commands needed to do so.
Edge Lists
Edge lists are probably the easiest way to create a graph in MATLAB. To do this, do the following:
1
Vertex is the singular of vertices. People also sometimes call a vertex a “node;” in particular, MATLAB often uses
the word node to refer to vertices. In general, because graph theory is a relatively young field its terminology is still
developing; there are many cases where people have different notations or names for the same things! Also { thing1 ,
thing2 , thing3 } is the notation we use to describe a set containing the three things thing1 , thing2 , thing3 . Writing
something like {} denotes an “empty” set, that contains nothing; you can also write ∅ to denote the empty set, if you
like.
Example 5.1.1.
Edge lists are simple and take up very little space. But for large graphs these lists can become large
and cumbersome to search through! If we needed to find whether the graph contains a particular
edge, for instance, we would have to search through the entire list (which can be an exhausting job.)
In cases like this, the adjacency matrix is often a good way to describe a graph:
Adjacency Matrices
Similar to edge lists, adjacency matrices describe a graph by describing the edges between vertices.
Given a graph G on n vertices, we can build the adjacency matrix A_G for G as follows:
Example 5.1.2.
[Figure: a graph on the vertices a, b, c, d, e, f.]
    a  b  c  d  e  f
a   0  0  0  0  0  0
b   0  0  1  0  0  0
c   0  1  0  0  0  0
d   0  0  0  0  1  1
e   0  0  0  1  0  1
f   0  0  0  1  1  0
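The matrix in this example can be built in MATLAB from the list of edges. The sketch below assumes the vertices a to f are numbered 1 to 6 and that each undirected edge is recorded as a 1 in both entries (u, v) and (v, u), which matches the symmetric matrix shown.

```matlab
n = 6;                                  % vertices a..f are numbered 1..6
edges = [2 3; 4 5; 4 6; 5 6];           % edge list: b-c, d-e, d-f, e-f
A = zeros(n);                           % start with no edges
for k = 1:size(edges,1)
    u = edges(k,1);
    v = edges(k,2);
    A(u,v) = 1;                         % edge from u to v ...
    A(v,u) = 1;                         % ... and from v to u (undirected graph)
end
A
hasEdge = (A(4,5) == 1)                 % is there an edge between d and e?
numNeighbours = sum(A, 2)'              % row sums: number of neighbours of each vertex
```

Looking up a single entry answers the question "is there an edge?" without searching a list, and a row sum counts the neighbours of a vertex.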
2
If two vertices v_i, v_j are connected by an edge, then we say that the vertices are adjacent, or that v_i and v_j are
neighbours.
Adjacency matrices usually take up more space than edge lists, and can be harder to make. However,
the trade-off is that they are incredibly easy to search through! To find out if there’s an edge from v_i
to v_j, you just find the corresponding element A_G(i, j) of the matrix and check if it is a 0 or a 1.
It’s also very easy to see how many neighbours any vertex has: we can just count how many ones
are in the corresponding row (or column). In our example, it’s fast to see that vertex a has 0 neighbours and
is isolated from every other vertex in the graph, just from reading the top row of the matrix.
Note: We call the number of neighbours a vertex has the degree of that vertex. For instance, we could
write deg(a) = 0 to denote that the vertex a in our earlier example has no neighbours.
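In MATLAB, a graph can be created from an adjacency matrix with the graph command. The sketch below assumes the four-vertex graph with edges a ↔ b, a ↔ c, a ↔ d and c ↔ d from earlier, with the vertices a, b, c, d numbered 1, 2, 3, 4; the name G1 is chosen to match the plotting commands later in this section.

```matlab
% Adjacency matrix for vertices a,b,c,d numbered 1,2,3,4,
% with edges a-b, a-c, a-d and c-d (an assumed reading of the graph in the text)
A = [0 1 1 1;
     1 0 0 0;
     1 0 0 1;
     1 0 1 0];
G1 = graph(A);          % build an undirected graph from a symmetric adjacency matrix
numEdges = numedges(G1) % number of edges in the graph
degA = degree(G1, 1)    % number of neighbours of vertex 1 (vertex a)
```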
Alternatively, we can use the edge list technique to make the same graph:
>> s = [1 1 1 4];
t = [2 3 4 3];
G2 = graph(s,t);
Notice that we’ve used numbers to keep track of the vertices in our code. Here, for example, we used
the numbers 1,2,3,4 instead of the letters a,b,c,d. This is because in most situations MATLAB indexes
vertices using numbers (which is what you’d expect a computer language to do!), and doing this can
make it much easier to write clean and usable code.
If we want to plot the two graphs G1, G2 from above, we can use the plot() command as follows:
>> plot(G1); title('G1')
figure
plot(G2); title('G2')
5.2. GRAPH DYNAMICAL SYSTEMS 115
Note: MATLAB might not draw the graph in the same way as you. This doesn’t mean your
code is wrong! There are often many ways to draw the same graph.
In this example, for instance, we can see that while G1 and G2 look the same, they do not look like
our original graph! This is not because we’ve done anything wrong: indeed, if we look at the actual
connections between our vertices we can see that these are the same graphs, except MATLAB has
decided to move the “central” vertex outside of the triangle.
We can use graphs to model processes that evolve over time. For example, we can use graphs to study
how a genetic trait might spread through a population over time, or how traffic patterns evolve over
the course of a day. In this section, we’ll turn to a type of problem we’ve encountered earlier in this
course: modeling how a disease might spread through a social network.
In this section, we will build a program that simulates disease spread using graphs. To start with
something tangible, let’s look at a small class of students, with social connections used to draw our
edges:
[Figure: friendship graph of the class; the surviving vertex labels are Rebekah and Ielyaas.]
Here the connections represent friendships between students who spend time together or who share
the same office together. We want a program that allows us to choose people who are sick, and then
models how the sickness spreads throughout the rest of the people in the class.
If we want to set this problem up in MATLAB we need to first create this graph. Recall that MATLAB
indexes vertices with numbers. So we should first rewrite the graph using numbers, and then write
down the edge list for the graph.
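The sketch below shows the shape such code might take. The edge list here is hypothetical: it is a made-up set of friendships between 11 students numbered 1 to 11, and the real one should be read off the class graph.

```matlab
% Hypothetical friendships between 11 students numbered 1 to 11;
% the real edge list should be read off the class graph in the notes.
s = [1 1 2 3 3 4 5 6 7 8 9 10];
t = [2 3 3 4 5 6 6 7 8 9 10 11];
Classroom = graph(s, t);       % undirected graph built from the edge list
numStudents = numnodes(Classroom)   % number of vertices
plot(Classroom)                % MATLAB chooses the layout
```

The graph gets as many vertices as the largest number appearing in s or t, so every student must appear in at least one edge or be added separately.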
Now that we have our graph in MATLAB we’re going to need a model of how sickness spreads. That
is, we need some rules to tell us how a person gets sick, and what happens when they are sick. For
our model we’re going to assume the following rules:
1. If a healthy person comes into contact with someone that is sick, then they get sick themselves.
2. If a sick person is around less than two other sick people, then they start recovering. Otherwise
they stay sick.
3. A recovering person will become healthy. While they are recovering they can’t become sick.
These rules form a simple model for how sickness spreads amongst people. The next question becomes
how are we going to implement this model in our program.
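Before turning to MATLAB's graph tools, it may help to see one update step of the three rules written out on a small example. This sketch is only an illustration: it stores the graph as an adjacency matrix and the states as a character vector ('H' healthy, 'S' sick, 'R' recovering), which are choices made here and not necessarily those of the course code.

```matlab
% Four-vertex example graph with edges 1-2, 1-3, 1-4 and 3-4
A = [0 1 1 1;
     1 0 0 0;
     1 0 0 1;
     1 0 1 0];
state = 'SHHH';                    % vertex 1 starts sick, the others healthy
newState = state;                  % work on a copy so all vertices update together
for v = 1:length(state)
    nbrs = find(A(v,:));           % neighbours of vertex v
    numSick = sum(state(nbrs) == 'S');
    if state(v) == 'H' && numSick > 0
        newState(v) = 'S';         % rule 1: healthy with a sick contact becomes sick
    elseif state(v) == 'S' && numSick < 2
        newState(v) = 'R';         % rule 2: fewer than two sick contacts, start recovering
    elseif state(v) == 'R'
        newState(v) = 'H';         % rule 3: recovering becomes healthy
    end
end
state = newState
```

Starting from 'SHHH', one step gives 'RSSS': vertex 1 has no sick neighbours and starts recovering, while its three healthy neighbours all catch the sickness.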
Node Properties
To begin implementing this model we first have to learn about node properties. We need to be able
to describe each of our vertices as being healthy, infected, or recovering. How do we do that? In
MATLAB we would call this - being sick or not - a property that vertices can take on. In MATLAB
we can access and assign ‘properties’ to our graph’s vertices using some special notation. We named
our graph Classroom so we can access all the vertices by writing
>> Classroom.Nodes
ans =
11x0 empty table
The details of what a table is aren’t hugely important. The important thing is that we have access to
where our vertices are stored. Currently the table is size 11x0. This means that our 11 vertices are
there, but they have zero properties so far. (By default, vertices start with no properties in MATLAB.)
5.2. GRAPH DYNAMICAL SYSTEMS 117
So: let’s establish a new property! Specifically, let’s give each node the property of being sick or not.
We’ll do this by initialising every vertex as being a healthy person (‘H’). We do this by using the general
notation Graph.Nodes.<property> = <vector>, where <property> can be named by us, and
the vector is going to contain the value of the property for each vertex.
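A minimal sketch of this notation, assuming a stand-in 11-vertex graph with made-up edges and the property name Sickornot used in these notes:

```matlab
% Stand-in graph with 11 vertices (hypothetical edges, for illustration only).
Classroom = graph([1 1 2 3 4 5 6 6 7 8 9], [2 3 4 4 5 6 7 8 9 10 11]);

% Create a node property called Sickornot with one value per vertex.
% repmat('H', n, 1) is an n-by-1 column of the character 'H' (healthy).
Classroom.Nodes.Sickornot = repmat('H', numnodes(Classroom), 1);

% Properties can be read and changed one vertex at a time:
Classroom.Nodes.Sickornot(6) = 'S';             % vertex 6 becomes sick
stateOfSix = Classroom.Nodes.Sickornot(6);      % read the state back
```

Storing each state as a single character is only one possible choice; it keeps comparisons such as state == 'S' simple.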
Now if we look at the vertices of the graph, we’ll be able to see the new property has been
established:
>> Classroom.Nodes
ans =
11x1 table
Sickornot
H
H
H
H
H
H
H
H
H
H
H
Great! Our code works so far, and everyone currently starts out healthy.
Now, we need to be able to set some vertices to ‘sick’ (‘S’). We want to make it so we can choose
anyone to start out sick, so we write our function to have the input variable whoIsSick, and then
set those nodes to be sick later in the code.
That is, we want to write something like the code below:
function out = SickClassroom(whoIsSick)
...
% We store the vertex number of everyone who starts sick
% in the vector whoIsSick.
% Then we can use it to denote who starts off sick.
Classroom.Nodes.Sickornot([whoIsSick]) = 'S';
...
Now that we have our graph all set up, we’re ready to start modelling! To do this, we use the following
algorithm to implement the “disease spread” rules we came up with earlier:
Let’s talk about how we’d implement this algorithm in MATLAB. The first two steps (copying the
graph, and then going through each vertex) are pretty straightforward:
3 This is so that we only change our graph at the end, not while we’re working on it!
Now, in this for loop, we need to check the state of our vertex and its neighbors. Checking the state
of our vertex is pretty easy: we can just write something like
To find all the neighbouring vertices, we can use the MATLAB function neighbors(G,i) (note the
American spelling); given a graph G and vertex i in G, this gives a list of all vertices adjacent to i in G.
For example, the following code will count the number of infected neighbors of a vertex i:
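One way this count might be written is sketched below. The graph and its edges are hypothetical stand-ins, and the sketch assumes the states are stored as single characters in Sickornot as above.

```matlab
% Stand-in graph (hypothetical edges) in which vertices 5 and 7 are sick.
Classroom = graph([1 1 2 3 4 5 6 6 7 8 9], [2 3 4 4 5 6 7 8 9 10 11]);
Classroom.Nodes.Sickornot = repmat('H', numnodes(Classroom), 1);
Classroom.Nodes.Sickornot([5 7]) = 'S';

i = 6;                              % the vertex we are inspecting
nbrs = neighbors(Classroom, i);     % all vertices adjacent to i

% Compare each neighbour's state with 'S' and count the matches.
numSick = sum(Classroom.Nodes.Sickornot(nbrs) == 'S');
```

In this stand-in graph vertex 6 is adjacent to 5, 7 and 8, so two of its three neighbours are counted.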
Once we have the number of infected neighbours, we have all the information we need to be able to
implement our three rules we came up with for how sickness spreads. We can implement these rules
using if and else statements:
The very last line of code here just saves over the previous graph with our nice, new, updated graph: we
wouldn’t want to lose all our work!
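Putting the pieces described above together, a single time step might be sketched as follows. This is not the version on Canvas: the edge list is hypothetical, and the letter 'R' for a recovering person is an assumed label (the notes only fix 'H' and 'S').

```matlab
% Stand-in graph (hypothetical edges); vertex 6 starts off sick.
Classroom = graph([1 1 2 3 4 5 6 6 7 8 9], [2 3 4 4 5 6 7 8 9 10 11]);
Classroom.Nodes.Sickornot = repmat('H', numnodes(Classroom), 1);
Classroom.Nodes.Sickornot(6) = 'S';

% Work on a copy, so every vertex is updated using the OLD states.
NewClassroom = Classroom;
for i = 1:numnodes(Classroom)
    state = Classroom.Nodes.Sickornot(i);
    nbrs = neighbors(Classroom, i);
    numSick = sum(Classroom.Nodes.Sickornot(nbrs) == 'S');

    if state == 'H' && numSick > 0
        NewClassroom.Nodes.Sickornot(i) = 'S';   % rule 1: catches the illness
    elseif state == 'S' && numSick < 2
        NewClassroom.Nodes.Sickornot(i) = 'R';   % rule 2: starts recovering
    elseif state == 'R'
        NewClassroom.Nodes.Sickornot(i) = 'H';   % rule 3: healthy again
    end
    % In every other case the vertex keeps its current state.
end

% Save over the previous graph with the updated one.
Classroom = NewClassroom;
statesAfterOneStep = Classroom.Nodes.Sickornot';
```

Wrapping the loop in an outer for loop over the number of time steps would give a multi-step simulation.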
If you put all of this together, you should have a program that lets you simulate how a disease might
spread in this classroom! Try it on your own, and check Canvas if you get stuck; we’ve uploaded a
fully-functional version of this code there for you to look at.
To illustrate how it works, we show what our code does if Elias starts off sick and we ask it to go
forward in time by three steps (e.g. if we type SickClassroom(3,[6]) into the command line in
MATLAB):
[Figures: the classroom graph at successive time steps. Figure 5.3: Second time step. Figure 5.4: Third time step.]
In the previous section, you did a lot with graphs: you learned what a graph is, picked up several
pieces of vocabulary that are useful when describing graphs, and then saw how to use MATLAB to
model real-life situations with graphs!
Graph theory can do a lot more than just model things, however: it’s a fascinating area of study that
can actually solve tricky problems through its own ideas and techniques. In this section, we’re going
to study two such concepts in the field of graph theory, the first of which started the field of graph
theory proper:
Puzzle.
In the early 18th century, the Prussian city of Königsberg was famous for its beautiful city centre.
At the time, the city was divided by the river Pregel into four parts: a northern region, a southern
region, and two islands. These regions were connected by seven ornate bridges, drawn in red in the
map at right.
[Figure: map of Königsberg in Euler’s time. Image from public domain, namely Wikipedia’s Seven
Bridges of Königsberg page.]
On nice days, residents of the city would often go out for a walking tour through the city that would
try to cross every bridge. No matter how hard they tried, though, they found it impossible to make a
route that would start and end at the same place and cross every bridge exactly once; every route
they made would accidentally “double back” and walk on some bridges more than once.
Despite this, though, no-one in the city could come up with a reason why this was impossible! Trying
to resolve this conundrum became popular with the citizens of Königsberg, and was soon shared by
its mayor with other cities in a hope that someone could find a solution. Eventually it made its way
to the prolific mathematician Leonhard Euler, who heard the problem described as follows:
“Can you come up with a path through Königsberg that starts and ends at the same place,
and walks over each bridge exactly once?”
Before reading further, take out pen and paper and try to solve the problem yourself! Can you do it?
Or is it impossible (and if so, can you explain why?)
The first key to this problem is the idea of a graph that you’ve developed earlier. Specifically, consider
the following way to turn our map of Königsberg into a graph K:
5.3. WALKS AND CIRCUITS 121
• Take each of the four regions of Königsberg, and turn each into a vertex: that is, make a vertex
N for the northern region, a vertex S for the southern region, and vertices I1, I2 for the two islands.
• Now, take each bridge and turn it into an edge: that is, draw two edges between N and I1, two
edges between I1 and S, and an edge from I2 to each of N, I1 and S.
• This gives you a graph K (drawn at right); we think of this graph as representing the important
connections of the city of Königsberg, but without all of the other extra details that could get in
the way.
[Figure: the graph K, with vertices N, I1, I2 and S. Caption: The city of Königsberg, now in graph form!]
Doing this gives you a sort of funny-looking graph, where we have some pairs of vertices linked by
multiple edges! This is OK, though. In graph theory, we call these kinds of graphs multigraphs;
conversely, if we want to talk about graphs where we only allow up to one edge between any two
vertices, we’ll call those kinds of graphs simple graphs.
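As a sketch, the multigraph K described by the bullet points above can be entered in MATLAB as follows. The vertex numbering is an arbitrary choice made here, and repeated edges in graph(s,t) need a reasonably recent MATLAB release (R2018a or later, as far as we know).

```matlab
% Vertex numbering (our own choice): 1 = N, 2 = I1, 3 = I2, 4 = S.
% Each bridge is one edge, so the doubled bridges appear twice in the list.
s = [1 1 2 2 3 3 3];
t = [2 2 4 4 1 2 4];
K = graph(s, t);

numBridges = numedges(K);      % one edge per bridge
hasRepeats = ismultigraph(K);  % true when some pair of vertices is joined twice
deg = degree(K)';              % number of bridge-ends at N, I1, I2, S
```

The degree vector counts how many bridges touch each region, which turns out to be exactly the information needed later in this section.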
With this concept in place, we can now rigorously define the idea of a walk:
Definition 5.3.1. In a graph G = (V, E), we define a walk of length n to be any sequence of n edges
{v_0, v_1}, {v_1, v_2}, {v_2, v_3}, ..., {v_{n-1}, v_n}, such that all of these edges are in our graph G. We say that
this walk starts at v_0 and ends at v_n. If v_0 = v_n (that is, if we start and end at the same place) we
call this walk a circuit.
Notice that circuits are allowed to repeat edges and vertices if they want, and also not use all of the
vertices or edges in a graph.
If a circuit contains every edge in G exactly once, we call it Eulerian, in honour of the mathematician
Leonhard Euler. Similarly, if a circuit contains every vertex in G exactly once, we call it Hamiltonian
(named for William Rowan Hamilton, another famous mathematician.)
Using the language of graph theory, our question from earlier can be phrased as follows: does the graph
K drawn earlier have an Eulerian circuit? In general, what kinds of graphs have Eulerian circuits,
and which do not? How can we quickly tell if a graph has an Eulerian circuit or not?
Theorem 5.3.2. If G is a graph that has at least one vertex of degree 1, then G does not have an
Eulerian circuit.
Proof. To prove a claim like this, we need to make an argument that applies to any graph G with a
vertex of degree 1! So, we can’t just look at an example like H above, because that wouldn’t persuade
someone if they were skeptical: they’d just say that that example was dumb, and that there could be
other graphs that did work.
Instead, we need to make an argument about all graphs G that have a vertex of degree 1. That is:
suppose that G is a graph, and x is a vertex in G with deg(x) = 1. Why does G not have an Eulerian
circuit?
• If we start a walk at the vertex x in our graph G, we can never return: there’s only one edge
connecting x to other vertices of our graph, because deg(x) = 1.
• If we didn’t start at x, though, then as soon as our walk gets to x we’re stuck: there’s only one
route into x!
Because we’re trying to make an Eulerian circuit, our walk needs to use all of the edges in G; this
means that we’ll eventually have to visit x sometime, and then we’ll get stuck (and thus not be
able to form a circuit.)
So: this gives us a condition under which Eulerian circuits don’t exist. However, there are graphs
(like our graph K from Königsberg, or the one at right) that don’t have any vertices of degree 1, but
also don’t seem to have any Eulerian circuits!
[Figure: another graph with no Eulerian circuit.]
How can we deal with graphs like these?
The answer is the following theorem of Euler, famously presented to the St. Petersburg Academy in
1735 as the first theorem in graph theory:
Theorem 5.3.3. A connected graph G has an Eulerian circuit precisely whenever it has no vertices
of odd degree. That is: if a graph G has any vertex with odd degree (like a vertex of degree 1, or
degree 3) it cannot have an Eulerian circuit. Conversely, if every vertex in G has even degree, then it
must have an Eulerian circuit!
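The condition in this theorem is easy to test by computer. The sketch below is one possible way to do it, using three small example graphs chosen here for illustration: it checks that the graph is connected (a single connected component) and that every degree is even.

```matlab
% Three small test graphs (our own examples, not from the notes).
C4 = graph([1 2 3 4], [2 3 4 1]);                      % a 4-cycle
P3 = graph([1 2], [2 3]);                              % a path on 3 vertices
twoTriangles = graph([1 2 3 4 5 6], [2 3 1 5 6 4]);    % two separate triangles

% conncomp labels each vertex with its component number, so the graph is
% connected exactly when the largest label is 1.
% degree(G) lists the degree of every vertex.
hasEulerianCircuit = @(G) max(conncomp(G)) == 1 && all(mod(degree(G), 2) == 0);

resultC4 = hasEulerianCircuit(C4);                      % even degrees, connected
resultP3 = hasEulerianCircuit(P3);                      % has degree-1 vertices
resultTriangles = hasEulerianCircuit(twoTriangles);     % even, but not connected
```

The third example shows why the theorem asks for G to be connected: every degree is even, but no single circuit can reach both triangles.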
♥ First, we should show that if G has an Eulerian circuit, it has no odd-degree vertices.
♣ Then, we should show the reverse direction: that is, we should prove that if G has no vertices
of odd degree, then it must have an Eulerian circuit.
We start with (♥). Suppose that G = (V, E) is a graph with an Eulerian circuit. Write down that
Eulerian circuit here, as {v_0, v_1}, {v_1, v_2}, {v_2, v_3}, ..., {v_{n-1}, v_n}, {v_n, v_0}.
Pick any vertex x ∈ V. Notice that each time x comes up in the above circuit, it does so twice: if
x = v_i for some i, it shows up in both {v_{i-1}, v_i} and {v_i, v_{i+1}}. You can think of this as saying that
each time our circuit “enters” a vertex along some edge, it must “leave” it along another edge!
As a result, any vertex x shows up an even number of times in the circuit we’ve come up with here.
But our circuit is Eulerian; that is, it contains every edge in E exactly once. As a result, every vertex
x shows up in an even number of edges in E; that is, deg(x) is even for every vertex x, as claimed! So
we’ve proven this half of our claim.
We now proceed to (♣). Suppose that G is a connected graph in which all of our vertices have even
degree; we want to find an Eulerian circuit in G.
Init: Pick a vertex v_0 at random from V. Think of v_0 as our current location, and our
current path as the empty path.
1. If we are currently at some vertex v_i, randomly choose a vertex v_{i+1} so that the edge
{v_i, v_{i+1}} is not yet in our path. Add {v_i, v_{i+1}} to our path, and update our current
location to v_{i+1}.
2. Repeatedly do step 1 above until we get back to v_0.
Notice that because the degree of every vertex in G is even, step 1 in this process can never fail: if we
are able to “enter” a vertex along some edge, there must be a corresponding edge we can “leave” on!
Because G has a finite number of edges, we also can’t keep repeating step 1 forever; so we must
eventually get back to v_0. In other words, the process above generates a circuit! Call it C.
If this circuit is Eulerian, sweet; we’re done. If not, though, it’s not too hard to make it Eulerian!
Simply do the following:
Init: Take G, and delete C’s edges from G. Because every vertex shows up an even number
of times in a circuit (as shown earlier!), this doesn’t change our “all vertices have
even degree” property.
1. If G has edges that aren’t in C, then (because G’s connected) there must be some
vertex v_i in our circuit that still has nonzero degree.
2. Starting from v_i, run our “find a circuit” algorithm, to get another circuit C′ that
starts and ends at v_i.
3. Now, “paste” that circuit C′ into our original circuit, by traveling along C until
we get to v_i, then taking the circuit C′ which starts and ends back at v_i, and then
resuming the original circuit C. We’ve made a bigger circuit!
4. If G still has edges, go to 1 and do it all again!
This process will “grow” our circuit on each pass, and is again guaranteed to work because our degrees
stay even on each loop of our algorithm. So doing this repeatedly will generate an Eulerian circuit for
us, and thus complete our proof!
If the (♣) half of the argument above was a bit too complex for you, try drawing out a graph where
all of its degrees are even, and then “running” by hand the process described for making an Eulerian
circuit.
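If you would rather watch a computer run the process, here is a rough sketch of the “find a circuit, then paste in more circuits” idea from the proof. It works directly on an edge list rather than a graph object, assumes the graph is connected with all degrees even, and (unlike the proof) always takes the first unused edge instead of a random one so that the result is repeatable. The example graph is made up.

```matlab
% Example edge list: two triangles sharing vertex 1 (chosen for illustration).
% Every vertex has even degree and the graph is connected.
s = [1 2 3 1 4 5];
t = [2 3 1 4 5 1];
used = false(size(s));   % used(k) becomes true once edge k is in our circuit
circuit = 1;             % sequence of vertices visited; we start at vertex 1

while ~all(used)
    % Find a position in the circuit whose vertex still has an unused edge.
    pos = 0;
    for p = 1:numel(circuit)
        v = circuit(p);
        if any(~used & (s == v | t == v))
            pos = p;
            break
        end
    end

    % Walk along unused edges from that vertex until we come back to it.
    start = circuit(pos);
    sub = start;
    cur = start;
    while true
        k = find(~used & (s == cur | t == cur), 1);   % first unused edge at cur
        used(k) = true;
        if s(k) == cur
            cur = t(k);
        else
            cur = s(k);
        end
        sub(end+1) = cur;
        if cur == start
            break
        end
    end

    % "Paste" the new circuit into the old one at position pos.
    circuit = [circuit(1:pos-1), sub, circuit(pos+1:end)];
end
```

On this example the first pass finds one triangle, and the second pass pastes the other triangle in at vertex 1, giving a circuit that uses all six edges.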
1. What does the theorem above tell you about the Seven Bridges of Königsberg problem?
3. Write MATLAB code that when given a graph G, tells you whether G has an Eulerian circuit.
4. Write MATLAB code that when given a graph G that has an Eulerian circuit, can actually draw
an Eulerian circuit in that graph.
(You can find coded answers to the MATLAB questions on Canvas, if you’re stuck or curious!)
While taking scenic walks is certainly enjoyable, most modern applications of graph theory are much
more practical at their heart. Consider the following task, known as the traveling salesman
problem:
Puzzle. Suppose that you’re a traveling salesman. In particular, you’re traveling the South Island,
and trying to sell rugby tickets for the nine rugby teams there (illustrated in the map below.)
[Figure: map of the nine South Island rugby regions (Tasman, Buller, West Coast, Canterbury,
Mid-Canterbury, South Canterbury, North Otago, Otago, Southland), drawn next to the corresponding
graph with travel times labeled on its edges.]
You want to start and finish in Mid-Canterbury, and visit each other region exactly once to sell tickets
in it. The travel times between adjacent regions are labeled on the edges of the graph at right. What
circuit can you take through these cities that minimizes your total travel time, while still visiting
each city exactly once?
Without knowing any mathematics, you’d probably guess that the shortest route is to just go around
the perimeter of the island. Intuitively, at the least, this makes sense: avoiding the Southern Alps is
probably a good way to save time!
In real life, however, maps can get a lot messier than this. Consider a
map of all of the airports in the world, or even just in New Zealand (at
right.) If you were an Air New Zealand representative and wanted to
visit each airport, how would you do so in the shortest amount of time
and still return home to Auckland?
These sorts of tasks are known as traveling salesman problems, and companies all over the world
solve them daily to move pilots, cargo, and people to where they need to be.
It’s also a task that’s remarkably similar to the Eulerian circuits we were looking at before! From the
graph theory perspective, a solution to the traveling salesman problem is a circuit that visits every
vertex exactly once, in such a way that the total “travel time,” as measured by labels we put on all
of the edges of our graph, is minimized. (Note that we don’t have to use every edge in these solutions:
we just need to visit each vertex once and start and end in the same place.)
Given that humans solved the Eulerian circuit problem in 1735, and that the traveling salesman problem
sounds a lot more practical than touring bridges, you’d think that we’d have a good solution to this
problem by now, right?
. . . not so much. Finding a “quick” solution to the traveling salesman problem is an open problem
in mathematics; if you could find an efficient solution to the traveling salesman problem for certain
specialized notions of efficient, you would solve a problem that’s stumped mathematicians for nearly a
century, advance mathematics and computer science into a new golden age, and quite likely go down
in history as one of the greatest minds of the millennium4.
4 So, uh, extra-credit problem.
This is a fancy way of saying “this problem is really hard.” So: why mention it here? Well: in
mathematics in general, and graph theory in particular, we often find ourselves having to solve problems
that don’t have known good or efficient algorithms. Despite this, people expect us to find answers
anyways: so it’s useful to know how to find “good enough” solutions in cases like this!
For the traveling salesman problem, one brute-force approach you could use to find the answer could
be coded like this:
Init: Take our graph G, containing n vertices. Let s be the vertex we start and end at.
Let c be a cost function, that given any edge {x, y} in G outputs the cost of traveling
along that edge.
1. Write down every possible order in which we can list the n vertices of G, starting
and ending at s.
2. For each order {s, v_1}, {v_1, v_2}, ..., {v_{n-2}, v_{n-1}}, {v_{n-1}, s}, calculate c({s, v_1}) +
c({v_1, v_2}) + ... + c({v_{n-1}, s}). Assume that c({x, y}) is infinite if the edge doesn’t
exist (i.e. that it would take “forever” to travel along a path that is impossible to
travel along.)
3. Output the smallest number/path you find.
Points in favor of this algorithm: it works! Also, it’s not too hard to code (try it!)
Points against this algorithm: if you were trying to visit 25 cities in a week, it would take the world’s
fastest supercomputer over ten thousand years to answer your problem. (If you were trying to visit
75 cities, I think the heat death of the universe occurs before this algorithm is likely to terminate.)
This is because the algorithm needs us to consider every possible order of the n vertices in G to
complete. There are (n − 1)! = (n − 1) · (n − 2) · (n − 3) · ... · 3 · 2 · 1 many ways in which we can order
our n cities5, and the factorial function grows incredibly quickly: in general, if you have a program
whose runtime can be measured with a factorial function, it very, very quickly stops being something
you can run.
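For a handful of cities, though, the brute-force steps above are perfectly usable. The sketch below is one way to code them; it assumes the costs are stored in a matrix C (with Inf for missing edges) rather than as edge labels on a graph object, and the five-city cost matrix is made up for illustration.

```matlab
% Hypothetical 5-city cost matrix: C(x,y) is the cost of the edge {x,y},
% and Inf marks pairs of cities with no edge between them.
C = [  0    2    9  Inf    3;
       2    0    4    7  Inf;
       9    4    0    1    8;
     Inf    7    1    0    5;
       3  Inf    8    5    0];
n = size(C, 1);
s = 1;                          % the vertex we start and end at

others = setdiff(1:n, s);       % the cities we still have to put in order
orders = perms(others);         % one row per ordering: (n-1)! rows in total

bestCost = Inf;
bestTour = [];
for r = 1:size(orders, 1)
    tour = [s, orders(r, :), s];
    % Look up the cost of each consecutive pair of cities and add them up.
    idx = sub2ind(size(C), tour(1:end-1), tour(2:end));
    cost = sum(C(idx));
    if cost < bestCost
        bestCost = cost;
        bestTour = tour;
    end
end
numOrdersChecked = size(orders, 1);
```

With five cities there are only 4! = 24 orders to check; the trouble described above starts when n grows, because the number of rows in orders grows like (n − 1)!.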
Another approach (which, as someone who would like to book their travel before the heat death of
the universe, I’m in favor of) is to use randomness to solve this problem! Consider the following
algorithm:
Init: Take our graph G, starting vertex s, and cost function c just like before.
1. Start from s and randomly choose a city we haven’t visited, and then go to that city.
2. Keep randomly picking new cities until we’ve run out of new choices, and then return
to s.
3. Calculate the total cost of that path.
4. Run this process, say, ten thousand times (which, while large, will be much smaller
than n! for almost all values of n that you’ll run into!)
5. Output the smallest number/path you find.
5 To see why, think about how you’d make an ordering of the cities. You’d start by choosing a city to travel to from
s: there are n − 1 choices here, as we can possibly go anywhere other than s. From there, we have n − 2 choices for our
second city, and then n − 3 for our third city, and so on/so forth!
5.4. COLOURINGS 127
Points against this algorithm: strictly speaking, it probably won’t work. That is: we’re just repeatedly
randomly picking paths and measuring their length. There’s no guarantee that we’ll ever pick the
“shortest” path!
Points in favor of this algorithm: it’s easy to code, it’s really fast, and if you only care about just
getting close-ish to the right answer it’s actually6 not too bad in many situations!
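A sketch of the randomized approach, under the same assumptions as for brute force (costs stored in a made-up matrix C, with Inf for missing edges). Visiting the other cities in a uniformly random order is the same as repeatedly picking a random unvisited city, so the sketch uses randperm.

```matlab
% Hypothetical 5-city cost matrix; Inf means "no edge between these cities".
C = [  0    2    9  Inf    3;
       2    0    4    7  Inf;
       9    4    0    1    8;
     Inf    7    1    0    5;
       3  Inf    8    5    0];
n = size(C, 1);
s = 1;                          % start and end vertex
others = setdiff(1:n, s);
numTrials = 10000;              % how many random tours to try

bestCost = Inf;
bestTour = [];
for trial = 1:numTrials
    % A random order of the remaining cities, with s at both ends.
    tour = [s, others(randperm(numel(others))), s];
    idx = sub2ind(size(C), tour(1:end-1), tour(2:end));
    cost = sum(C(idx));         % Inf if the tour uses a missing edge
    if cost < bestCost
        bestCost = cost;
        bestTour = tour;
    end
end
```

Because the tours are random, different runs can return different answers; on a tiny example like this one the best tour is found almost every time, but there is no such comfort on large graphs.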
In the long run, it’s probably better if your phone gives you slightly suboptimal directions in a second
rather than taking two years to find the absolute best path to the Countdown, so in general this is
probably a better way to go. But in certain small situations (or times when you randomly have a
supercomputer at hand) brute-force can also be the way to go: it really depends on what you’re trying
to solve! This is a small preview of the “applied” side of applied mathematics: often, it’s not enough
to just know how to solve the problem. You usually have to solve it efficiently as well!
1. Look at the graph we drew earlier for the South Island and its travel times. What is the shortest
route that starts and ends at Tasman? Is it actually the “go around the coast” route, or does it
cross the Southern Alps somewhere?
2. Draw a map for where you grew up. Label your home, school, local grocery store, and a couple
of your favorite places to visit outside of home. Use Google Maps or something similar to find
the distances between these things. What’s the shortest path that visits all of them, starting
and ending from home?
3. Write a MATLAB program that takes in a graph G with edge labels that give you the cost
function for those edges, and uses the brute-force method to solve the traveling salesman problem
on that graph.
4. Write a MATLAB program that takes in a graph G with edge labels that give you the cost
function for those edges, and uses a randomization method to solve the traveling salesman
problem on that graph.
As before, you can find code for the MATLAB questions on Canvas; check it out if you’re stuck!
5.4 Colourings
In the last section we studied Eulerian circuits, which were the objects studied in the first major
graph theory proof. In this section, we’ll switch over to studying probably the most famous result in
graph theory: the four-colour theorem.
The four-colour theorem was first posed by Francis Guthrie in 1852. He was colouring in a map
of counties in England, and was attempting to do so such that no two counties that bordered each
other got the same colour. When he was doing this, he noticed that only four colours were needed! He
mentioned this to his brother, who was the student of Augustus De Morgan (a famous set theoretician),
who then sent the problem out to all of his colleagues and friends.
6 In particular there are lots of tweaks you can apply here to make this pretty decent in most cases, while still keeping
it fast.
The problem quickly became infamous amongst mathematicians7 , and attracted hundreds of false
proofs in the coming years. It stood until 1976, when Appel and Haken wrote a proof that reduced
the problem to checking a few (thousand) individual cases, which they did by computer. Since then
no entirely human-made proof has been found: every proof that a map only needs four colours has
needed a computer to check at least some of its cases!
So: why mention this in a graph theory class? Well: as you’ve seen earlier, we can turn any map into
a graph by assigning a vertex to each region, and by drawing an edge between two regions when they
share a border.
[Figure: the map of the South Island regions and its corresponding graph, as drawn earlier.]
Under this idea, a colouring of our map that doesn’t give adjacent regions the same colour is just a
way to paint each vertex a colour, so that no edge in our graph has both endpoints with the same
colour. This idea is an important one — indeed, it’s the focus of this section! — so we should give it
a name.
Definition 5.4.1. Take a graph G. A proper vertex colouring of G with k colours is any way
to take k different colours and use them to paint the vertices of G, so that no edge in G has both
endpoints receiving the same colour.
The chromatic number of a graph G, χ(G), denotes the smallest number of colours k such that G’s
vertices can be properly k-coloured.
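In code, checking whether a given colouring is proper only takes a couple of lines. The sketch below assumes a colouring is stored as a vector whose v-th entry is the colour of vertex v (a convention chosen here), and tries two candidate colourings on a 5-cycle.

```matlab
% A cycle on 5 vertices (our own small example).
G = graph(1:5, [2:5 1]);

% G.Edges.EndNodes has one row per edge, listing that edge's two endpoints.
ends = G.Edges.EndNodes;

% Candidate colourings: entry v is the colour painted on vertex v.
threeColours = [1 2 1 2 3];
twoColours   = [1 2 1 2 1];

% A colouring is proper when the two endpoints of every edge differ.
isProperThree = all(threeColours(ends(:,1)) ~= threeColours(ends(:,2)));
isProperTwo   = all(twoColours(ends(:,1)) ~= twoColours(ends(:,2)));
```

Here the first colouring passes, while the second fails because vertices 5 and 1 are adjacent and both get colour 1. Note that one failed attempt does not by itself show that two colours can never work; that needs an argument.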
7 Ironically the result itself was of little interest to mapmakers, who had found that in practice you could colour most
maps with three colours anyways.
To illustrate this idea, look at the four graphs below for a moment:
[Figure: four graphs: C7 (cycle graph), K5 (complete graph), L5 (ladder graph), O (octahedron).]
Try to colour the vertices of each with the smallest number of colours possible, then try to explain
why you can’t use fewer than that number of colours. Once you think you’ve got the right answers,
read on for the solutions!
Solutions: We claim that χ(C7 ) = 3, χ(K5 ) = 5, χ(L5 ) = 2, and χ(O) = 3. To see why, we first
show that these graphs can indeed be coloured with 3, 5, 2 and 3 colours, respectively:
[Figure: proper colourings of C7 (cycle graph), K5 (complete graph), L5 (ladder graph) and O (octahedron).]
• You can’t colour the vertices of the cycle graph C7 with just two colours. To see why, try it!
Make one vertex red to start; then if there are only two colours (say red and blue,) you know
that the neighbors of that vertex are both blue. This forces their neighbors to be red, and
forces their neighbors to be blue; this then forces a blue-blue edge, which causes a problem. (If
this argument didn’t make sense, take a pen and actually try to do the two-colouring that it
describes!)
• In the complete graph K5, every pair of vertices is connected by an edge. Therefore, because
no edge can connect two vertices of the same colour, all five vertices need different colours: we
need at least five colours to colour this graph.
• In the ladder graph L5 , we clearly need at least two colours (as the only graphs that can be
1-coloured are ones with no edges at all!)
• The octahedron graph O contains a triangle graph. Colouring a triangle requires three colours
(Why? Prove this to yourself!) As a result, the octahedron needs at least three colours as well.
So: we know what a proper vertex colouring of a graph is, have some history about where it came
from, and have seen a few examples calculated. The next natural questions to ask are the following:
(1) what can we do with graph colourings, and (2) how can we tell a computer to colour a graph?
While map colourings are nice to make, the reason that we care deeply about graph colourings in the
modern world is their application to scheduling problems. Consider the following task:
Puzzle. Suppose that you’re running a business. In the coming day, you have a set of jobs (like
picking up supplies with the company car, taking clients out to lunch, running/cleaning the store) to
complete and a set of time slots for those jobs.
Some jobs might conflict with each other, however, because they depend on a shared resource; for
example, if you only have one car, you can’t have someone both picking up supplies from Manukau
and taking clients out to lunch in Ponsonby at the same time! Similarly, you probably can’t schedule
someone to wax the floors of your store while it’s open.
The goal, then, is to assign jobs to time slots so that no two conflicting jobs occur at the same time.
How can you do this efficiently?
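One simple solution is to simply give every job its own time slot; this ensures that you won’t have any
conflicts! However, the number of time slots this takes will probably make you quite sad. Another
solution, that’s much less likely to lead to burnout, is to use the language of graph theory: make a
vertex for each job, draw an edge between any two jobs that conflict, and think of the time slots as
colours. A proper vertex colouring of this graph is then exactly a conflict-free schedule.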
This sort of task is particularly common in computer science: there, your “jobs” are often calculations
that a program wants to perform, and any two jobs that rely on accessing the same bits of memory
at the same time are thought to be in “conflict.”
So: just like the traveling salesman problem, we’ve come across a graph theory idea that’s useful, easy
to describe, and has been studied by thousands of mathematicians for over a hundred years. Surely
we’ve got an efficient way to find the smallest number of colours needed to properly vertex-colour a
graph by now, right?
. . . sadly, not so much8 . While we’ve discovered tons of graph colouring techniques and ideas over the
past century (enough to spend your entire life studying!), we have not yet discovered a truly efficient
way to find the chromatic number of an arbitrary graph.
Like before, we could describe brute-force and random algorithms for colouring a graph (and indeed,
you can find code to do this on Canvas!) Instead, we’ll use this section to introduce a third kind of
algorithm that’s useful in graph theory: a greedy algorithm.
Init: Take a graph G on n vertices that we want to properly vertex-colour. List the
vertices of G as v1, v2, ..., vn.
1. As well, make a list of possible colours that we’d want to use on this graph. Because
we’re mathematicians, let’s name these colours 1, 2, 3, ... instead of things like red,
blue, green; this means that we’ve got a nice ordering built into our colours, and
that we’ll never run out of colours!
2. Paint v1 the colour 1.
3. Now, paint v2 the smallest colour that we can without causing a conflict with v1.
That is: if v1 and v2 don’t share an edge, give v2 the colour 1. If they do, however, we
can’t give v2 the colour 1 without making a conflict; so give v2 the colour 2.
4. Do the same thing for v3; that is, give v3 the smallest colour that doesn’t cause
conflicts with v1, v2.
5. Keep going through our list of vertices. At the end, we will have a properly coloured
graph!
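The steps above translate almost line for line into MATLAB. The following is a minimal sketch rather than the Canvas solution; it uses 0 to mean “not painted yet” (a convention chosen here) and is tried out on the cycle graph C7.

```matlab
% The cycle graph C7, with vertices listed in the order 1, 2, ..., 7.
G = graph(1:7, [2:7 1]);
n = numnodes(G);

colour = zeros(1, n);      % colour(v) = 0 means v has not been painted yet
for v = 1:n
    % Colours already used on the neighbours of v (0 for unpainted ones).
    nbrColours = colour(neighbors(G, v));

    % Find the smallest colour 1, 2, 3, ... that no neighbour is using.
    c = 1;
    while any(nbrColours == c)
        c = c + 1;
    end
    colour(v) = c;
end
numColoursUsed = max(colour);
```

On C7 this alternates colours 1 and 2 around the cycle and is forced to use colour 3 on the last vertex, matching the chromatic number found earlier.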
To illustrate the idea, here’s a sample run of the greedy algorithm (where blue is the first colour, and
yellow is the second colour):
[Figure: six snapshots of the greedy algorithm colouring a graph on the vertices v1, ..., v6, drawn
with v1, v2, v3 in the left column and v4, v5, v6 in the right column.]
Points in favour of this algorithm: it’s pretty easy to code (try it in MATLAB!), and it runs quickly.
Also, unlike the randomization algorithm, it’s predictable: that is, every time you run the greedy
algorithm on a graph G with the vertices in the same order, you’ll always get the same output! This
can be important for designing processes that humans will interact with, as people are often unhappy
when things randomly change without them knowing why.
Points against this algorithm: it’s sometimes very inefficient depending on how you’ve listed the
vertices in your graph. For example, suppose we took the graph above with a slightly different
ordering on its vertices:
[Figure: six snapshots of the greedy algorithm colouring the same graph, now labeled with v1, v3, v5
in the left column and v2, v4, v6 in the right column.]
8 This is something of a theme in mathematics in general, and graph theory in particular: so many things are both
(a) simple to define, (b) incredibly useful and (c) a nightmare to actually calculate. If you want to help change this,
become a mathematician! We need new ideas.
We saw before that this graph only needs two colours; when ordered as above, however, the greedy
algorithm used three colours to colour this graph!
Unfortunately for us, you can often get massive gaps between what the greedy algorithm comes up
with for a graph colouring and what is optimal. If you generalize the graph from earlier to something
like the drawing at right, you’ll see that you can find graphs on n vertices whose chromatic number
is 2, but where the greedy algorithm will try to use n/2 different colours!
[Figure: the generalized graph, with vertices v1, v3, v5, ..., vn-1 in the left column and
v2, v4, v6, ..., vn in the right column.]
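To see this gap on a computer, here is a sketch that runs the greedy algorithm with two different vertex orders. It assumes the drawing is the graph in which each odd-numbered vertex is joined to every even-numbered vertex except its own partner (v1 with v2, v3 with v4, and so on); that reading of the figure is a guess made here, and the choice n = 8 is arbitrary.

```matlab
n = 8;
% All (odd, even) pairs of vertices, minus the partner pairs {v1,v2}, {v3,v4}, ...
[L, R] = meshgrid(1:2:n, 2:2:n);
keep = (R ~= L + 1);
G = graph(L(keep), R(keep));

% Two ways of listing the vertices: v1, v2, ..., vn, and "all odds, then all evens".
orders = {1:n, [1:2:n, 2:2:n]};
coloursUsed = zeros(1, numel(orders));

for k = 1:numel(orders)
    colour = zeros(1, n);                 % 0 means "not painted yet"
    for v = orders{k}
        nbrColours = colour(neighbors(G, v));
        c = 1;
        while any(nbrColours == c)        % smallest colour not on a neighbour
            c = c + 1;
        end
        colour(v) = c;
    end
    coloursUsed(k) = max(colour);
end
```

With the first order the greedy algorithm ends up using n/2 = 4 colours, while the second order gets away with 2, even though the graph is exactly the same.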
With that said, in many situations the greedy graph colouring works pretty well; try it out on the four
graphs whose chromatic numbers we calculated earlier, and see if it gives you the same values! As is
often the case with algorithms, there is no one-size-fits-all solution: you need to look at the specific
sorts of graphs you’re encountering in your work, and figure out which algorithm gives the best results
for you.
We close this section by introducing a cute variant on the idea of a vertex colouring: edge colourings!
Definition 5.4.2. Given a graph G, an edge colouring of G with k colours is any way to assign
each edge of G one of k different colours, so that no two edges of the same colour share an endpoint
in common.
[Figure: two example graphs, captioned “a 3-edge-coloring” and “a 4-edge-coloring”.]
Just like with vertex colourings, we can use a greedy algorithm to find an edge-colouring of a graph:
Init: Take a graph G containing m edges that we want to properly edge-colour. List the
edges of G as e1 , e2 , . . . , em .
1. Paint e1 the colour 1.
2. Now, paint e2 the smallest colour that we can without causing a conflict with e1 .
3. Do the same thing for e3 ; that is, give e3 the smallest colour that doesn’t cause
conflicts with e1 or e2 .
4. Keep going through our list of edges. At the end, we will have a properly edge-
coloured graph!
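The steps above can be sketched in MATLAB as follows. This is only a minimal illustration, not the course's official solution: the edge list E (a 4-cycle with one diagonal) and the variable names are made up for the example, and each edge is assumed to be stored as a row containing its two endpoints.

```matlab
% Hypothetical example graph: a 4-cycle 1-2-3-4 with the diagonal 1-3.
% Each row of E is one edge, stored as [endpoint endpoint].
E = [1 2; 2 3; 3 4; 4 1; 1 3];
m = size(E,1);          % number of edges
colour = zeros(m,1);    % colour(k) will hold the colour given to edge k

for k = 1:m
    % collect the colours of earlier edges that share an endpoint with edge k
    used = [];
    for j = 1:k-1
        if any(ismember(E(j,:), E(k,:)))
            used(end+1) = colour(j);
        end
    end
    % give edge k the smallest colour that causes no conflict
    c = 1;
    while ismember(c, used)
        c = c + 1;
    end
    colour(k) = c;
end
disp(colour')
```

For this edge ordering the sketch uses three colours. As with vertex colourings, a different ordering of the edges may give a different number of colours.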
5.4. COLOURINGS 133
To check your understanding in this section, try out some of the exercises below!
2. A ladder graph Ln can be drawn by making an n-rung ladder and placing vertices at intersections,
as drawn below:
[Figure: the ladder graphs L1, L2, L3, L4, L5, ..., Ln.]
4. Write a program that uses the greedy algorithm described in the notes to vertex-colour a graph.
5. Write a program that uses the greedy algorithm described in the notes to edge-colour a graph.
6. Find a map of where you grew up, broken apart into counties or regions. Try to colour it with
four colours. Can you succeed?
As before, you can find code on Canvas that answers the MATLAB questions!
9 This graph is the Petersen graph, which is famous for being a counterexample to a tremendous number of things
you might otherwise believe about graphs. There are entire textbooks centered around the Petersen graph and its
generalizations. Maths is weird.
10 This graph is one of the “flower snarks,” which were named in reference to the Lewis Carroll poem The Hunting of
the Snark! Yay, poetry references in a maths class.
Chapter 6
Markov Chains
Contents
6.1 Introduction
6.1 Introduction
A Markov chain is an example of a random process (see Ch 4) and is used to model a sequence of
random variables whose probabilities depend on the previous state.
In this chapter we will cover how to construct and interpret Markov chains. We will look at how to
calculate probabilities after multiple steps of the Markov chain, as well as in the long run. Building
on skills from Ch 4, we will also simulate steps of a Markov chain.
Theory
In many cases the probability of an event may not be the same every time we repeat the process.
Consider the following example: the probabilities of Chris being fit, sick or tired tomorrow are 0.6,
0.1 and 0.3, respectively. However, will the same probabilities of 0.6, 0.1 and 0.3 be appropriate for
every day? Maybe Chris is more likely to be tired on Wednesday if he has been sick on Tuesday. It
could be that we need to look even further back to how he was on Monday, but this becomes more
complicated. In this chapter we will consider situations where the probability of an event depends
only on the previous state; so the probabilities of the next state will depend on Chris’s current state.
This is a Markov Chain model where the probabilities depend on the state the process is in. For
each possible state we define a set of probabilities. For Chris we might have
• If Chris is fit, then the probability that he will be fit on the following day is 0.7, that he will be
sick is 0.1 and that he will be tired is 0.2.
• If Chris is sick, then the probability that he will be fit on the following day is 0.3, that he will
be sick is 0.4 and that he will be tired is 0.3.
• If Chris is tired, then the probability that he will be fit on the following day is 0.6, that he will
be sick is 0.1 and that he will be tired is 0.3.
The transition diagram for a Markov chain has one node for each state and arrows indicating
possible transitions. The values on the arrows are given by the transition probabilities. Thus, the
transition diagram for the example just described is
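[Figure: transition diagram with three nodes FIT, SICK and TIRED; each arrow is labelled with the corresponding
transition probability listed above, for example 0.7 from FIT to FIT and 0.4 from SICK to SICK.]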
                 To
From      Fit    Sick   Tired
Fit       0.7    0.1    0.2
Sick      0.3    0.4    0.3
Tired     0.6    0.1    0.3
Practice
Notice that each row of the table represents a set of probabilities and so it sums to 1. For mathematical
use, it is more convenient to use a matrix instead of a table. The matrix will give the probabilities for
changing between states, and we call it the transition matrix. For the above example,
$$P = \begin{pmatrix} 0.7 & 0.1 & 0.2 \\ 0.3 & 0.4 & 0.3 \\ 0.6 & 0.1 & 0.3 \end{pmatrix}.$$
We use X(t) to denote the state at time t (recall the definition of a state space from Ch 4). The
transition matrix gives the probabilities for different states at time t + 1, for all the different states
that can occur at time t. More formally, the transition matrix, $P = (p_{ij})$, is defined by
$$p_{ij} = \mathbb{P}\big(X(t+1) = j \mid X(t) = i\big).$$
So $p_{ij}$ gives the probability that the system is in state j at time t + 1 given it was in state i at time t.
Example 6.2.1. What is the probability that Chris is sick tomorrow if he is tired today?
Solution
Looking in the ‘tired’ row, the probability of being sick is 0.1.
Example 6.2.2. What is the probability that Chris is sick for the next two days if he is tired today?
Solution
Looking in the ‘tired’ row, the probability of being sick tomorrow is 0.1. Furthermore if he is sick
tomorrow, there is a 0.4 probability of being sick the following day. So the chance he is sick for the
next two days is 0.1 × 0.4 = 0.04.
Note that to get the probability of going from one state to another then to another we just multiplied
the transition probabilities. Think of this as ‘this transition happens and this transition happens’.
Example 6.2.3. What is the probability that Chris will be tired or sick tomorrow, given that he is
tired today?
Solution
There are two possibilities. If Chris is tired today he is tired tomorrow with probability 0.3. If Chris
is tired today, he is sick tomorrow with probability 0.1. Hence if he is tired today then he is either
tired or sick with probability 0.3 + 0.1 = 0.4.
Note that to get the probability of going from one state to either of two options we just added the
transition probabilities. Think of this as ‘this transition happens or this transition happens’.
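The three examples above can be checked by reading entries straight out of the transition matrix. The short sketch below is only an illustration; the variable names are made up, and the states are assumed to be numbered 1 = fit, 2 = sick, 3 = tired.

```matlab
% Transition matrix for Chris: states 1 = fit, 2 = sick, 3 = tired
P = [0.7 0.1 0.2;
     0.3 0.4 0.3;
     0.6 0.1 0.3];
sick = 2; tired = 3;

% Example 6.2.1: a single transition, read directly from the 'tired' row
p_sick = P(tired,sick)

% Example 6.2.2: 'this transition AND this transition' means multiply
p_sick_twice = P(tired,sick) * P(sick,sick)

% Example 6.2.3: 'this transition OR this transition' means add
p_tired_or_sick = P(tired,tired) + P(tired,sick)
```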
Theory
In this section we will investigate how to calculate the probabilities of transitioning between states
after multiple steps of the chain. We’ll run through an example to illustrate the process of constructing
a transition matrix, and show how to solve for the probability of a particular path (i.e. an ordered
sequence of events). From here we’ll look at the problem of solving for the probability of reaching a
particular future state, a number of steps down the line, via any path.
Practice
Example 6.3.1. A lift is in a building with four floors. The floors will make up our state space,
Ω = {1, 2, 3, 4}, and t = 1, 2, 3, . . . represents the trips for the lift. X(t) is the floor visited on trip t.
The lift starts on floor 1 in the morning, X(1) = 1.
Some assumptions
• If the lift is at floor 1, then it is equally likely to go to any of the other floors. Remember that
the lift must go to a different floor each trip, i.e. the probability of recurrence is zero.
• If the lift is above floor 1, then the probability of going to floor 1 is 1/2 and the other floors are
equally likely.
Solution
Remember that the rows of the transition matrix, P , must sum to 1.
6.3. PROBABILITIES AFTER MULTIPLE STEPS 139
We will do this row by row. First we said that if the lift is on the first floor then it goes to each other
floor with equal probability. There are 3 other floors, so the probability of going to each one is 1/3. The
probability of staying on the same floor is 0. Hence the first row of the transition matrix is
      1     2     3     4
1     0    1/3   1/3   1/3
Now floor 2. The lift goes to floor 1 with probability 1/2, so p21 = 1/2. The other floors are equally likely.
There are two of them, and we need the probabilities to sum to one, so the probability of going to
each of the other floors (3 or 4) is 1/4. We can now fill in the second row:
      1     2     3     4
1     0    1/3   1/3   1/3
2    1/2    0    1/4   1/4
We can fill in rows 3 and 4 using the same arguments that we used for row 2:
      1     2     3     4
1     0    1/3   1/3   1/3
2    1/2    0    1/4   1/4
3    1/2   1/4    0    1/4
4    1/2   1/4   1/4    0
As a matrix:
>>P = [
0 1/3 1/3 1/3;
0.5 0 0.25 0.25;
0.5 0.25 0 0.25;
0.5 0.25 0.25 0];
The probability of a specific path in the Markov chain is the product of the transition probabilities in
the path. So, if we start in state 1, the probability of jumping to state 2, then state 3, then state 1 is
the probability of a transition from 1 to 2, times the probability of a transition from 2 to 3, times the
probability of a transition from 3 to 1.
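As a small illustration of this rule, the following sketch multiplies the transition probabilities along the path 1, 2, 3, 1 for the lift. The variable name route and the loop are choices made for this example only.

```matlab
% Lift transition matrix from Example 6.3.1
P = [0   1/3  1/3  1/3;
     0.5 0    0.25 0.25;
     0.5 0.25 0    0.25;
     0.5 0.25 0.25 0];

route = [1 2 3 1];   % the states visited, in order
prob = 1;
for s = 1:length(route)-1
    % multiply by the probability of the transition route(s) -> route(s+1)
    prob = prob * P(route(s), route(s+1));
end
prob   % equals p12 * p23 * p31 = 1/3 * 1/4 * 1/2
```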
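Example 6.3.2. Consider Example 6.3.1. Suppose that the lift is on floor 1. What is the probability
of the lift going next to floor 3 then floor 2? What is the probability that it instead remains on floor
1 and then goes to floor 2?
Solution
For the first probability, we multiply the transition probabilities:
$$p_{13}\,p_{32} = \tfrac{1}{3} \times \tfrac{1}{4} = \tfrac{1}{12}.$$
For the second, the lift never remains on the same floor, so
$$p_{11}\,p_{12} = 0 \times \tfrac{1}{3} = 0.$$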
The original transition probabilities are the probabilities after a single step. To determine the
probabilities after more than one step we need to sum over all the paths that could have been taken.
Example 6.3.3. We continue with the lift example. What is the probability that the system is in
state 2 exactly 2 steps later if it starts in state 1?
Solution
Firstly we will consider solving the problem by hand. There are four possible paths we need to
consider:
Path Probability
1 → 1 → 2 p11 p12 = 0 × 1/3
1 → 2 → 2 p12 p22 = 1/3 × 0
1 → 3 → 2 p13 p32 = 1/3 × 1/4
1 → 4 → 2 p14 p42 = 1/3 × 1/4
The probability is
$$\sum_{k} p_{1k}\,p_{k2} = \frac{1}{6}.$$
Alternatively, we could calculate the whole transition matrix for the probability of states after two
steps.
The ij term in the two step transition matrix is the probability of moving to state j in 2 steps when
we start in state i. To find this term we sum over all possible pathways, i.e. over all possible states
k after just one step. The ij term is therefore given by
$$\text{probability} = \sum_{k=1}^{4} P_{ik} P_{kj} = (P^2)_{ij},$$
where $P^2$ is the matrix multiplication square of the transition matrix P. So the transition matrix for
2 steps of the chain is $P^2$. Using MATLAB:
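>> P^2
ans =
0.5000 0.1667 0.1667 0.1667
0.2500 0.2917 0.2292 0.2292
0.2500 0.2292 0.2917 0.2292
0.2500 0.2292 0.2292 0.2917
Note that we could have written P*P instead of P^2, but that this is not the same as P.^2.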
Putting everything together, we have that the probability that a system in state i is in state j exactly
2 steps later is given by
$$(P^2)_{ij},$$
the ij element of the square of P. Likewise the probability a system in state i is in state j exactly n
steps later is $(P^n)_{ij}$.
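The difference between the matrix power and the elementwise power is easy to check numerically. The following is a minimal sketch using the lift matrix; the variable names n, i, j, Pn and elementwise are chosen only for this illustration.

```matlab
% Lift transition matrix from Example 6.3.1
P = [0   1/3  1/3  1/3;
     0.5 0    0.25 0.25;
     0.5 0.25 0    0.25;
     0.5 0.25 0.25 0];

n = 2; i = 1; j = 2;

Pn = P^n;             % matrix power: the n-step transition matrix
Pn(i,j)               % probability of being in state j, n steps after state i

elementwise = P.^n;   % squares each entry separately: NOT an n-step matrix
sum(Pn, 2)'           % rows of the n-step matrix still sum to 1
sum(elementwise, 2)'  % rows of the elementwise power do not
```

The entry Pn(1,2) agrees with the value 1/6 found by hand in Example 6.3.3.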
Example 6.3.4. Each week a cellphone is working (State 1), broken (State 2) or lost/thrown out
(State 3).
Ω = {1, 2, 3} and t = 1, 2, 3, 4, . . . counts the weeks.
Assume that
• Broken phones are lost more easily than working ones (probabilities 0.1 and 0.01)
• Phones break with probability 0.01 and are repaired with probability 0.5
Use this to draw a transition diagram and write the transition matrix.
Solution
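Transition diagram:
[Figure: transition diagram with three nodes 1, 2 and 3. Arrows: 1 to 1 (0.98), 1 to 2 (0.01), 1 to 3 (0.01),
2 to 1 (0.5), 2 to 2 (0.4), 2 to 3 (0.1), 3 to 3 (1).]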
Transition matrix:
$$P = \begin{pmatrix} 0.98 & 0.01 & 0.01 \\ 0.5 & 0.4 & 0.1 \\ 0 & 0 & 1 \end{pmatrix}.$$
Example 6.3.5. Assume that a phone is working now. What is the probability that it will be lost in
week 2?
Solution
We can use MATLAB to check this.
>> P=[.98,.01,.01;.5,.4,.1;0,0,1]
P =
0.9800 0.0100 0.0100
0.5000 0.4000 0.1000
0 0 1.0000
>> P*P
ans =
0.9654 0.0138 0.0208
0.6900 0.1650 0.1450
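0 0 1.0000
The (1,3) entry, 0.0208, is the probability that a phone which is working now is in the lost state two steps later.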
Example 6.3.6. What is the probability that a working phone is lost 3 weeks later?
Solution
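Continuing on from the previous code we can figure out $P^3$:
>> P^3
ans =
0.9530 0.0152 0.0318
0.7587 0.0729 0.1684
0 0 1.0000
The (1,3) entry, 0.0318, is the probability that a working phone is lost 3 weeks later.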
6.4. SIMULATING MARKOV CHAINS 143
Theory
Simulating paths in Markov chains is just like simulating any discrete random variable, except that
1. we have to generate a new random variable for every step of the path;
2. the distribution of the next step in the path depends on the current step.
In some ways, the code for simulating Markov chain paths is a cross between the code for computing
difference equations and the code for generating discrete random variables.
Practice
Example 6.4.1. Recall, in section 6.2, the transition matrix for the Markov chain describing whether
Chris is fit, sick or tired.
>> P = [0.7 0.1 0.2;
0.3 0.4 0.3;
0.6 0.1 0.3];
Write MATLAB code for simulating 10 days of Chris, assuming that he starts off fit.
Solution
We will use a vector C to store the values for each day, where having 1,2, or 3 in C(i) means Chris
is fit, sick or tired on day i. Each day, the probabilities for the next day are read off the appropriate
row in the transition matrix.
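days = 10; % number of days
C = zeros(days,1); % storage vector for the state each day
C(1) = 1; % the first day, Chris is fit
for i = 1:days-1 % -1 as the first day is already known
    today = C(i);
    x = rand(); % one random number decides tomorrow's state
    if today == 1 % Chris is fit today... determine his state tomorrow
        if x < 0.7, tomorrow = 1; elseif x < 0.8, tomorrow = 2; else, tomorrow = 3; end
    elseif today == 2 % Chris is sick today
        if x < 0.3, tomorrow = 1; elseif x < 0.7, tomorrow = 2; else, tomorrow = 3; end
    else % Chris is tired today
        if x < 0.6, tomorrow = 1; elseif x < 0.7, tomorrow = 2; else, tomorrow = 3; end
    end
    C(i+1) = tomorrow;
end
Note that the code we used for generating tomorrow’s state is cut-and-pasted from the code we used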
to generate discrete random variables, and modified depending on the state today.
Alternatively, the MATLAB code could be made a bit more compact (and perhaps a bit easier to
read) by using the transition matrix P .
The key trick to notice is that if today is the current state, then the probabilities for the next state
are given by P(today,1), P(today,2) and P(today,3).
function C = Chris(days,initial) % simulate Chris's well-being each day from an initial state
P = [0.7 0.1 0.2; % transition matrix
     0.3 0.4 0.3;
     0.6 0.1 0.3];
C = zeros(days,1);
C(1) = initial; % the initial state (1, 2, or 3)
for i = 1:days-1
    today = C(i); % state on day i
    x = rand();
    % compare x with the probability of moving from today's state to state 1
    if x < P(today,1)
        tomorrow = 1;
    % otherwise compare x with the probability of moving to state 1 or state 2
    elseif x < P(today,1) + P(today,2)
        tomorrow = 2;
    else
        % otherwise the chain moves to state 3
        tomorrow = 3;
    end
    C(i+1) = tomorrow;
end
We can make this code even more general, so that it works for any number of states and any transition
matrix (see Exercises).
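One possible way to generalise, sketched here under some assumptions, is to replace the chain of if/elseif tests with cumulative sums of the current row of the transition matrix. The function name simulate_chain and its argument names are hypothetical, not something defined elsewhere in these notes, and the sketch assumes each row of P sums to 1.

```matlab
function C = simulate_chain(P, n, initial)
% SIMULATE_CHAIN  Illustrative sketch: simulate a path of a Markov chain.
%   P       - m-by-m transition matrix (each row assumed to sum to 1)
%   n       - number of entries in the simulated path
%   initial - starting state, an integer between 1 and m
C = zeros(n,1);
C(1) = initial;
for i = 1:n-1
    cumulative = cumsum(P(C(i),:)); % running totals of the current state's row
    cumulative(end) = 1;            % guard against rounding error in the last total
    x = rand();
    % the next state is the first one whose running total exceeds x
    C(i+1) = find(x < cumulative, 1);
end
end
```

Saved as simulate_chain.m, it could be called as C = simulate_chain(P, 10, 1) to reproduce the ten days of Chris with P set to his transition matrix.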
Now that we can simulate, we can test some of the techniques we’ve developed.
Example 6.4.2. Using simulation, estimate the probability that Chris is sick in five days time, given
that he is tired today.
Solution
We use the same ‘simulation recipe’ as before, except this time plugging in our code for simulating
Markov chains. Note that if today is day one, then five days time is day six (not day five!)
Note that the true answer is given by the (3,2) element of P^5, or 0.1425.
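A simulation of this kind might look like the sketch below. It is not the notes' own listing: the number of simulated paths numSims is an assumed value, and the chain is stepped inline rather than through the Chris function so that the block runs on its own.

```matlab
% Chris's transition matrix: states 1 = fit, 2 = sick, 3 = tired
P = [0.7 0.1 0.2;
     0.3 0.4 0.3;
     0.6 0.1 0.3];

numSims = 100000;   % number of simulated paths (an assumed choice)
count = 0;          % number of paths in which Chris is sick on day six
for s = 1:numSims
    state = 3;      % day one: Chris is tired
    for step = 1:5  % five transitions take us from day one to day six
        x = rand();
        if x < P(state,1)
            state = 1;
        elseif x < P(state,1) + P(state,2)
            state = 2;
        else
            state = 3;
        end
    end
    count = count + (state == 2);
end
estimate = count / numSims   % should be close to the exact value below
exact = P^5;
exact(3,2)
```

The estimate changes from run to run because it depends on the random numbers generated.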
Example 6.4.3. After careful observation of a share price in the market, a financial adviser has
modelled the share price as a Markov chain. She believes that the share price mostly remains constant
over one day, or changes (increases or decreases) by either about 2% or 5%. Other changes have such
a small probability of occurrence that the adviser decided to exclude them from her model.
• Starting from any of the states, the probability of 2% increase in the share price is equal to the
probability of 2% decrease. The probabilities of 5% increase and 5% decrease are also equal.
• If the share price stays constant over a day, there is 10% chance that it remains constant the
following day. The chance of 2% increase on the following day is twice the chance of 5% increase.
• If the share price increases by 2% on a day, it has 34% chance of another 2% increase on the
following day and 32% chance of no increase or decrease. The probability of 5% increase on the
following day is equal to the probability of 2% decrease. Similarly, if the share price decreases
by 2% on a day, it has 34% chance of another 2% decrease on the following day and 32% chance
of no increase or decrease. The probability of 5% decrease on the following day is equal to the
probability of 2% increase.
• If the share price increases by 5% on a day, with 50% chance it increases by 2% the following
day and with 50% it remains constant. Similarly, if the share price decreases by 5% on a day,
with 50% chance it decreases by 2% the following day and with 50% it remains constant.
(a) Write down a suitable state space for this Markov chain.
(b) Write down the transition matrix for the Markov chain.
(c) Suppose that the share price is $p on Monday, which has increased by 2% compared to the
Sunday price. What is the probability that the share has the same price on Wednesday (i.e.,
after two days)?
(d) Write a MATLAB script file to estimate the probability that a share with initial price of $10
becomes more expensive than $25 after 30 days. Assume that the share price remains constant
over the first day.
Solution
(a) The state space is {1, 2, 3, 4, 5}, where 1 is the share price staying constant, 2 is 2% increase, 3
is 5% increase, 4 is 2% decrease and 5 is 5% decrease.
(c) The share price starts in state 2. There are several possible ways that the share price could have
the same price after two days.
$$P_{21}P_{11} + P_{22}P_{24} + P_{24}P_{42} + P_{23}P_{35} + P_{25}P_{53} = 0.32 \times 0.1 + 0.34 \times 0.17 + 0.17 \times 0.17 + 0.17 \times 0 + 0 \times 0 = 0.1187$$
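The calculation in part (c) can be checked numerically. The matrix in the sketch below is a reconstruction from the bullet points of the example, using the state ordering from part (a); it should be treated as an assumption rather than the notes' own answer to part (b). Row 1 comes from solving 0.1 + 2a + 2b = 1 with a = 2b, and rows 2 and 4 use the values 0.32, 0.34 and 0.17 that appear in part (c).

```matlab
% States: 1 = constant, 2 = +2%, 3 = +5%, 4 = -2%, 5 = -5%
% Reconstructed transition matrix (an assumption, see the text above)
P = [0.10 0.30 0.15 0.30 0.15;
     0.32 0.34 0.17 0.17 0;
     0.50 0.50 0    0    0;
     0.32 0.17 0    0.34 0.17;
     0.50 0    0    0.50 0];

rowSums = sum(P, 2)'   % every row should sum to 1

% Part (c): start in state 2 and return to the same price after two days
prob = P(2,1)*P(1,1) + P(2,2)*P(2,4) + P(2,4)*P(4,2) ...
     + P(2,3)*P(3,5) + P(2,5)*P(5,3)
```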
prob =
0.0678
Theory
One of the more important questions we can ask about a Markov chain is what its long term behaviour
is. In the previous section we saw how to determine transition probabilities after a few steps. However
the same techniques can be used to study what the transition probabilities are like after thousands,
or millions, or billions of steps. This is the long term behaviour.
Practice
We can investigate the long term behaviour of a Markov chain by seeing what happens to the transition
matrix as we increase the number of steps.
$$\lim_{n \to \infty} P^n$$
Example 6.5.1. What are the long term transition probabilities for the cellphone in Example 6.3.4?
Solution
Use MATLAB again to examine high powers of the transition matrix
>> P^50
ans =
0.5527 0.0094 0.4379
0.4696 0.0080 0.5224
0 0 1.0000
>> P^100
ans =
0.3099 0.0053 0.6848
0.2633 0.0045 0.7322
0 0 1.0000
>> P^200
ans =
0.0974 0.0017 0.9009
0.0828 0.0014 0.9158
0 0 1.0000
6.5. LONG TERM BEHAVIOUR 149
>> P^10000
ans =
0.0000 0.0000 1.0000
0.0000 0.0000 1.0000
0 0 1.0000
In the long term, the system will be in state 3 with probability 1.0, no matter which state it started in.
State 3 (lost) in the cellphone example is called an absorbing state: once the chain reaches that
state it never leaves it. You can recognise a state i as an absorbing state by the fact that Pii = 1.0.
In the example above, P33 = 1.0, which means that once a cellphone is lost, it is always lost.
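This test is easy to automate: the diagonal of the transition matrix holds the values Pii. The sketch below is a small illustration for the cellphone chain; the variable name absorbing is made up for the example.

```matlab
% Cellphone transition matrix from Example 6.3.4
P = [0.98 0.01 0.01;
     0.5  0.4  0.1;
     0    0    1];

% a state i is absorbing exactly when the diagonal entry P(i,i) equals 1
absorbing = find(diag(P) == 1)'
```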
In the previous example the long term behaviour was a particular state. However, this is not always
the case.
Example 6.5.2. What is the long term behaviour of the lift in Example 6.3.1?
Solution
First observe that we never have Pii = 1.0, so the system does not have any absorbing states. We use
MATLAB to look at higher powers of the transition matrix P .
>> P^10
ans =
0.3340 0.2220 0.2220 0.2220
0.3330 0.2223 0.2223 0.2223
0.3330 0.2223 0.2223 0.2223
0.3330 0.2223 0.2223 0.2223
>> P^20
ans =
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
>> P^10000
ans =
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
Note that every row of this matrix is the same, even though this didn’t hold for the original transition
matrix. That means that, in the long term, the probability of where the chain ends up doesn’t depend
on where it started. In this case, the long term probabilities are
(0.3333, 0.2222, 0.2222, 0.2222).
In the long term, if we look at the lift at a random time, then it will be on floors 1, 2, 3, 4 with
probabilities 1/3, 2/9, 2/9, 2/9.
We say that a vector π is the equilibrium distribution of a Markov chain if, for all states i, the long
term probability of state i is πi , independent of which state the chain started in. Not every Markov
chain has an equilibrium distribution. You can find the equilibrium distribution by
1. calculating a high power of the transition matrix, such as P^10000;
2. checking if all the rows are the same (if not, there is no equilibrium distribution);
3. reading the equilibrium distribution off any one of the rows.
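The row check can be done numerically rather than by eye. The sketch below applies it to the lift chain; the power 10000 and the tolerance tol are assumed choices for this illustration, and comparing against a tolerance (rather than testing exact equality) allows for rounding error.

```matlab
% Lift transition matrix from Example 6.3.1
P = [0   1/3  1/3  1/3;
     0.5 0    0.25 0.25;
     0.5 0.25 0    0.25;
     0.5 0.25 0.25 0];

Pbig = P^10000;   % a high power of the transition matrix
tol = 1e-8;       % assumed tolerance for 'the same'

% compare every row with the first row
firstRow = repmat(Pbig(1,:), size(P,1), 1);
rowsAgree = all(all(abs(Pbig - firstRow) < tol));

if rowsAgree
    equilibrium = Pbig(1,:)   % any row gives the equilibrium distribution
else
    disp('Rows differ: no equilibrium distribution.')
end
```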
Example 6.5.3. In the long run, for what proportion of the time will Chris, in Section 6.2, be tired?
Solution
States are fit (1), sick (2), tired (3). The transition matrix is
>> P = [0.7 0.1 0.2; 0.3 0.4 0.3; 0.6 0.1 0.3];
>> P^10000
ans =
0.6190 0.1429 0.2381
0.6190 0.1429 0.2381
0.6190 0.1429 0.2381
All of the rows are the same, so there is an equilibrium distribution. This equilibrium distribution is
(0.6190, 0.1429, 0.2381), so Chris is tired for about 0.24 of the time.
Example 6.5.4. You are playing at coin tossing with a friend. You start with $2 and she starts with
$3. Each turn, if it’s heads then your friend gives you one dollar, otherwise you give her one dollar.
The game continues until one of you runs out of money. What is the probability that you will win?
Solution
There are six possible states, that you have 0, 1, 2, 3, 4, 5 dollars. The transition matrix is
>> P = [1.0000 0 0 0 0 0;
0.5000 0 0.5000 0 0 0;
0 0.5000 0 0.5000 0 0;
0 0 0.5000 0 0.5000 0;
0 0 0 0.5000 0 0.5000;
0 0 0 0 0 1.0000]
Note that state 1 (corresponding to $0) and state 6 (corresponding to $5) are both absorbing states.
>> P^100000
ans =
1.0000 0 0 0 0 0
0.8000 0 0 0 0 0.2000
0.6000 0 0 0 0 0.4000
0.4000 0 0 0 0 0.6000
0.2000 0 0 0 0 0.8000
0 0 0 0 0 1.0000
Every row is different, so there is no equilibrium distribution. We started with $2, which corresponds
to state 3. The third row is (0.6 0 0 0 0 0.4). So with probability 0.6 we finish in state 1 (and lose) and
with probability 0.4 we finish in state 6 (and win). Notice the pattern in these probabilities: perhaps
you can prove a general rule?
6.6 Exercises
1. A taxi company serves three small towns, Augustine, Berkeley and Camus. The company has
the following information:
• The towns are so small that people only catch a taxi to another town.
• Taxis wait for a customer in the town they travel to.
• Of the customers who catch a taxi from Augustine 50% go to Berkeley and 50% go to
Camus.
• Of the customers who catch a taxi from Berkeley 30% go to Augustine and 70% go to
Camus.
• Of the customers who catch a taxi from Camus 40% go to Augustine and 60% go to Berkeley.
(a) Write down a suitable state space for this Markov chain.
[Figure: transition diagram for a Markov chain on four states 1, 2, 3 and 4.]
(a) Find the probability for a Markov chain X(t), t = 0, 1, 2, .. with transition matrix P , started
in state X(0) = 1, to be in state X(3) = 2 at time t = 3.
(b) What can be said about the long term behaviour of the system state X(t) at large t?
4. Assume that the 3 × 3 transition matrix for a 3 state Markov chain is assigned to the variable
Pmatrix in MATLAB. Write MATLAB code, using the variable Pmatrix, to calculate and
display the probability of going from state 1 to state 3 in 5 steps.
5. Write a general MATLAB function that takes an m × m transition matrix P and a number n, and
simulates a path of length n that starts in the first state.
6. Most of this exercise was an exam question in first semester 2010.
Each minute of a Maths 162 lecture, a student is either interested, fascinated or excited.
• A student who is interested one minute has a 50% chance of being interested the next
minute and a 20% chance of being fascinated the next minute.
• A student who is fascinated one minute has a 60% chance of being interested the next
minute and a 20% chance of being fascinated the next minute.
• A student who is excited one minute will be interested the next minute.
We will model the state of the student using a Markov chain.
(a) Draw a transition diagram for this Markov chain.
(b) Write down the transition matrix for the Markov chain.
(c) If a student is excited, find the probability that she is excited in three minutes.
(d) Suppose a student is excited for the first minute of a lecture. Write a MATLAB script file
to estimate the probability that he is excited for at least 15 of the 50 minutes in the lecture.
7. An airport has 3 terminals. An airport taxi carries passengers from one terminal to another.
It takes passengers to whichever terminal they want and picks up its next passengers from that
terminal. Given that the taxi is at one terminal, the following table shows the probability that
it will be sent to any other terminal:
• Terminal 1: to Terminal 2, 10%, to Terminal 3, 90%
• Terminal 2: to Terminal 1, 50%, to Terminal 3, 50%
• Terminal 3: to Terminal 1, 90%, to Terminal 2, 10%
Each day the taxi starts at terminal 1.
Write the transition matrix. Use MATLAB to find:
(a) the probability that a taxi starts at Terminal 1 and is at Terminal 3 after four trips
(b) the long term probabilities
8. Students in 162 can be divided into two categories, those who are taking the course for the first
time, and those who are repeating the course.
The university has the following information:
• Students who are taking the course for the first time have an 80% chance of passing, a 10%
chance of repeating the course and a 10% chance of not passing and never taking the
course again.
• Students who are repeating the course (i.e. taking the course for the second, third, fourth
time etc.) have a 50% chance of passing, a 25% chance of repeating the course and a 25%
chance of never taking the course again.
(a) Draw a transition diagram for this Markov chain. (Hint: states 3 and 4 are absorbing
states.)
(b) Write down the transition matrix for the Markov chain.
(c) If a student enrols in 162 for the first time, find the probability she will pass only at her
third attempt.
(d) Explain (but do not calculate) how you would find the proportion of students who enrol in
162 for the first time who will eventually pass the course.
• Customers who rented a car last year have a 50% probability of renting a car this year.
• Customers who rented a car two years ago, but who did not rent a car last year have a 25%
probability of renting a car this year.
• Customers who last rented a car more than two years ago have a 10% probability of renting
a car this year.
We will model this using a Markov chain using a time step of one year.
(a) Write down a suitable state space for this Markov chain.
(b) Draw a transition diagram for this Markov chain.
(c) Write down the transition matrix for the Markov chain.
(d) If a customer rents a car this year, find the probability that she will NOT rent a car in any
of the subsequent three years.
(e) Write a MATLAB script file to estimate the probability that if a customer rents a car this
year then she will rent one in at least four of the next six years.
Chapter 7
Contents
7.1 Introduction
7.2 MATLAB Tools
7.2.1 Standard Operations
7.2.2 Variables
7.2.3 Data Types
7.2.4 Arrays, storage and indexing
7.2.5 Functions
7.2.6 Documentation
7.2.7 Conditionals
7.2.8 Loops
7.2.9 Script Files
7.2.10 Five Steps for Problem Solving
7.2.11 Examples
7.2.12 Exercises
7.1 Introduction
So far in this course we have covered a lot of MATLAB tools, coding conventions, and mathematical
concepts.
This chapter is here to summarise all of the coding practices we should have obtained from this course.
This chapter also serves to remind you that the tools that we have learnt here are not just for the
specific mathematical examples we’ve used to teach them to you. The coding ideas learned in Maths
162 can - and should - be used on any problem where they are useful.
Now is also a good time to note that although we have been learning how to code in MATLAB
specifically, most of the computing concepts used in MATLAB are transferable to other programming
languages. What follows is a summary of the most important programming concepts learnt in this
course.
156 CHAPTER 7. MATLAB REFERENCE CHAPTER
MATLAB can be used as a calculator. As such, it handles all of the standard operations we’d expect.
7.2.2 Variables
In algebra we know we can represent numbers with letters. For example we can say that x = 3 and
we then know that the value of x is 3. Here x is considered a variable.
In computing, we can think of variables in a similar way. Dealing with variables is analogous to: “We
have some data. We want to store it. Let’s put it in a box and label it, then we can use it later.”
name = data;
The equals sign is used to assign values to variables. Variable assignment either creates the variable,
or if it already exists changes the variable value.
From here we can access the data by calling the variable. For example, in MATLAB we could store
a value and then call it back:
>> beeronthewall = 99;
>> beeronthewall
beeronthewall =
99
7.2. MATLAB TOOLS 157
Once a variable is declared it will show up as stored in the MATLAB workspace. It is good practice
to keep track of what variables are defined and occupy the workspace. The command clear can be
used to clear variables from the workspace.
Note: Variable names are only allowed to contain alphanumeric characters and underscores. They
can’t begin with a number or contain spaces. Some names are reserved for functions and special
values. To know what these are see the MATLAB function list in the appendix. Beyond that you
are allowed to use almost any name you would like. BUT... it is proper etiquette to give variables
descriptive names where appropriate.
Double: This is the default data type for numbers in MATLAB. Vectors and matrices of numbers are
also considered type double.
E.g. iamnumber = 1;
Characters (Char): Letters are stored in MATLAB as characters. Strings are stored as vectors of
characters. So strings are also considered type char.
E.g. str = 'stringy';
Logical (Boolean): Logical values of 1 or 0, representing true or false.
E.g. truth = (2+2==5) would give truth = 0
From these examples we can also see that in MATLAB we can save any of these data types as variables.
An array is a variable that can hold multiple values. Arrays are useful for storing lists of values - like
we would need to store data. They are also ideal for representing vectors and matrices.
If a scalar variable (single valued variable) is like a cardboard box, then an array is like a filing cabinet
where each drawer can store a value.
Creating Arrays
Scalar: a = 12.45
Row vector: a = [3 4 5 -1]
Column vector: a = [3;4;5;-1] or a = [3 4 5 -1].'
Matrices: a = [1 2 3
4 5 6
7 8 9] or
a = [1 2 3; 4 5 6; 7 8 9]
The colon operator : and the linspace function are particularly useful. Using the colon operator,
you can specify an array that contains a sequence of increasing or decreasing values.
>> x = [10:-2:0]
x =
10 8 6 4 2 0
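As a short illustration of the two tools side by side (the particular numbers are made up for the example): the colon operator fixes the step between values, while linspace fixes the number of points.

```matlab
% colon operator: start:step:stop
x = 10:-2:0            % counts down from 10 to 0 in steps of 2

% linspace: first value, last value, number of points
y = linspace(0, 1, 5)  % 5 equally spaced values from 0 to 1

numel(x)               % number of entries produced by the colon operator
numel(y)               % always equals the third argument of linspace
```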
There are also MATLAB functions that automatically create matrices. You can find these in the
Function List (Sec. 7.2.12).
Elementwise Operations
Sometimes we want matrix arithmetic to be done element wise. In MATLAB we can do this with
elementwise operations:
Element wise multiplication, division and exponentiating can also be done on row and column vectors
of different sizes. For example:
>> a = [1 2 3];
b = (1:6)';
c = a.*b
c =
1 2 3
2 4 6
3 6 9
4 8 12
5 10 15
6 12 18
>> [1,5,4].^0
ans =
1 1 1
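For vectors of the same size, the dotted operators simply act on matching entries. The vectors a and b in the sketch below are made-up values for illustration.

```matlab
a = [1 2 3];
b = [4 5 6];

prod_ab = a .* b   % multiplies matching entries: [1*4 2*5 3*6]
quot_ab = a ./ b   % divides matching entries:    [1/4 2/5 3/6]
sq_a    = a .^ 2   % squares each entry:          [1 4 9]
```

Without the dot, a*b would attempt matrix multiplication of two 1-by-3 arrays and give an error.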
7.2.5 Functions
Built in functions
MATLAB also has many built in functions that take care of other common mathematical operations as
well as more complicated functions. Many functions can be used on scalars or arrays. In the appendix
you will find a comprehensive list of built in MATLAB functions. Calling a MATLAB function is
simple:
>> sqrt(4)
ans =
2
>> sin([0,1/2,1,3/2]*pi)
ans =
0 1.0000 0.0000 -1.0000
Note the use of brackets around the function's input argument. There should be no spaces between
the function name and the opening bracket. Most functions in MATLAB follow this name(input)
style.
Note: MATLAB functions are not just designed for doing mathematics. MATLAB also contains many
functions that would be useful for commerce, statistics, engineering, signal processing, modelling and
more.
plot(x,y,'conditions')
The data are stored in x and y, which must have the same length. The 'conditions' string is a character
string made up of at most one element from each of colour, symbol and line-type. For example, 'r*--'
plots red asterisks joined by a dashed line.
A full table of the different plot colours, symbols and line-types as well as a summary of some of the
different plot types is given in the MATLAB function list in the appendix.
Titles, legends and other plot enhancements can also be added. For a list of these commands also see
the Plotting Commands section in the MATLAB function list.
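As an illustration of the plot command together with a few of the enhancements mentioned above, here is a short sketch. The data (a sine and a cosine curve), the titles and the labels are arbitrary choices.

```matlab
x = linspace(0, 2*pi, 50);     % 50 points between 0 and 2*pi
y1 = sin(x);
y2 = cos(x);

figure                         % open a new figure window
plot(x, y1, 'r*--')            % red, asterisk markers, dashed line
hold on                        % keep the first curve while adding the second
plot(x, y2, 'b-')              % blue solid line
hold off

title('Sine and cosine')
xlabel('x')
ylabel('y')
legend('sin(x)', 'cos(x)')
grid on
```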
>> figure
In MATLAB we can define our own functions by creating and storing function files that follow the
general form:
function <outputs> = <function name>(<inputs>)
<Code to be executed>
end
(Note: the < > are not required. They are only used here to show the general form.)
For example, a function named my_square must be saved in a file called my_square.m. It can be used in the
Command Window as follows:
b =
4 8 12 16 20
Note: We have written our function so that it can work with arrays. Functions that are written in
this way are deemed to be ‘vectorised’. Where possible, it is usually a good idea to vectorise functions
to make them more general.
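To see what vectorising means in practice, here is a minimal sketch. It uses an anonymous function (a one-line function defined with @) only so that the example fits in a single script; the name cube_all is hypothetical, and the same body could equally be placed in a function file.

```matlab
% Hypothetical one-line function: cube every element of the input.
cube_all = @(x) x.^3;      % .^ acts on each element separately

cube_all(2)                % scalar input: 8
cube_all([1 2 3])          % row vector input: 1 8 27
cube_all([1 2; 3 4])       % matrix input: [1 8; 27 64]
```

Had the body been written as x^3, the vector call would fail, because ^ is the matrix power and requires a square matrix.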
• Functions have their own workspace, separate from the base workspace. I.e. any variables
created inside a function will not be stored in the base workspace unless they are returned as
outputs or declared to be global variables.
• In MATLAB it is possible to nest functions. This means that we can have functions inside other
functions, both in use and when we are writing functions. An example of this can be found at
the end of this chapter in the fractal trees example.
Functions can be considered as our tools of programming. We use functions whenever we need either
1) the same code to be repeated, or 2) the same code to be run with different input
parameters.
Functions allow us to generalize our code, and do away with copying and pasting with minor changes.
They are very useful.
7.2.6 Documentation
Note: This is probably the most important part of this chapter and the best skill you can learn if you
want to learn to program.
If you ever want to check what a function does, MATLAB has the following functions:
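A short sketch of typical usage, assuming the functions meant here include help and doc (help also appears in the function list at the end of these notes):

```matlab
help sqrt                      % prints a short description in the Command Window
help_text = help('sqrt');      % the same text, captured in a variable
% doc sqrt                     % opens the full documentation page (uncomment to try)
```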
Beyond this, the single best piece of general programming advice is: Google is your best friend.
Most programming hurdles you will face in your careers have likely come up for other people. A
simple Google search usually provides valuable information when programming. Knowing how to
effectively use Google as a troubleshooting tool can be an incredibly valuable skill as a programmer
and mathematician.
7.2.7 Conditionals
Logical statements
In MATLAB there are a number of ‘logical operators’ that can be used to form logical statements.
E.g. == (equal to), < (less than).
The full list of logical operators can be found in the MATLAB Function List.
>> a = 1; c = -3*a;
ans =
0
>> check = c*a==0 || (a>0 && c<0) %(c*a = 0) OR [(a > 0) AND (c < 0)]
check =
1
The output of a logical statement is boolean, where 0 represents ‘false’ and 1 represents ‘true’.
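A few more logical statements built from the same two variables may help. This is only a sketch; the names on the left-hand side are illustrative.

```matlab
a = 1;
c = -3*a;

is_equal   = (a == c)               % 0: 1 is not equal to -3
is_smaller = (c < a)                % 1: -3 is less than 1
both       = (a > 0) && (c < 0)     % 1: both conditions hold
either     = (c*a == 0) || (a > 0)  % 1: the second condition holds
negated    = ~(a > 0)               % 0: NOT true is false
```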
Conditional Statements
Sometimes we want to compute one set of commands, or another, depending on the result of a relational
test. Conditional statements allow code to be sectioned off based on logical (true or false) conditions.
There are four main ‘statements’ that we can use:
elseif Allows for another condition other than the original ‘if’
statement. Note: It is possible to have multiple elseif
statements; they are checked in order.
if( <expression1> )
<commands evaluated if expression1 is True>
elseif( <expression2> )
<commands evaluated if expression1 was false and expression2 is True>
else
<commands evaluated if all other expressions are False>
end
We aren’t limited to 4 expressions here; we could add more. We could use this to price pens according
to quantity, as follows.
Note that leaving off the last part (else and the related commands following it) will result in no action
being taken if none of the previous expressions result in True values.
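A minimal sketch of a quantity-based pricing rule of the kind described above. The price breaks and unit prices are made-up values for illustration, not figures from the course.

```matlab
quantity = 120;                 % number of pens ordered (hypothetical)

if quantity >= 500
    unit_price = 0.50;          % largest orders get the lowest price
elseif quantity >= 100
    unit_price = 0.75;
elseif quantity >= 10
    unit_price = 0.90;
else
    unit_price = 1.00;          % fewer than 10 pens: full price
end

total_cost = quantity * unit_price;
fprintf('%d pens at $%.2f each cost $%.2f\n', quantity, unit_price, total_cost);
```

Because the conditions are checked in order, each elseif only needs to state its lower bound: an order of 120 pens fails the first test and is caught by the second.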
Conditionals on Arrays
• Logical operators also work on arrays. Most logical operators work element wise on arrays.
>> a = -3:3;
a>0
ans =
0 0 0 0 1 1 1
• There are some built in MATLAB functions that also help with arrays and logicals. Three of
the most useful are
>> a = -3:3;
a(a>0) = a(a>0)*(-1) %make all positive elements negative.
a =
-3 -2 -1 0 -1 -2 -3
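A brief sketch of three such functions, assuming the ones referred to are any, all and find (all three appear in the function list at the end of these notes):

```matlab
a = -3:3;

any(a > 0)     % 1: at least one element is positive
all(a > 0)     % 0: not every element is positive
find(a > 0)    % 5 6 7: positions of the positive elements
a(a > 0)       % 1 2 3: the positive elements themselves
```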
7.2.8 Loops
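The for loop repeats the commands that are inside of it, once for each value of the loop variable. The basic structure is
for <variable> = <array of values>
<commands to be repeated>
end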
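As a small worked example of a for loop, the following sketch adds up the squares of the numbers 1 to 5. The variable names are illustrative.

```matlab
total = 0;
for k = 1:5
    total = total + k^2;     % add the square of the current value of k
end
total                        % 1 + 4 + 9 + 16 + 25 = 55
```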
Sometimes we want to repeat commands until a condition is satisfied. If we don’t know how many
repetitions are required in advance, a while loop can be useful.
The loop will continue to go around as long as the <logical statement> returns a value of True, provided
that the statement returns a scalar. If the logical statement evaluates to an array, then the loop
continues only while all of the values in that array are True.
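A minimal sketch of a while loop where the number of repetitions is not fixed in advance: keep halving a number until it is no longer greater than 1. The starting value of 100 is an arbitrary choice.

```matlab
value = 100;
halvings = 0;
while value > 1              % keep going while the condition is true
    value = value / 2;
    halvings = halvings + 1;
end
halvings                     % 7, since 100/2^6 = 1.5625 but 100/2^7 = 0.78125
```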
Sometimes it is useful to exit loops part way through. While loops allow some control over this, but
the MATLAB function break allows for greater control. break terminates the execution of a while
or for loop (i.e. it exits the loop). Note: If using nested loops, break exits the innermost loop only.
Example 7.2.1. Sum a sequence of random numbers until the next random number is greater than
an upper limit. Then, exit the loop using a break statement.
Solution
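One possible solution is sketched below, under the assumptions that the random numbers are drawn with rand (uniform between 0 and 1) and that the upper limit is 0.8. Both choices are illustrative.

```matlab
upper_limit = 0.8;      % illustrative threshold (assumption)
total = 0;              % running sum
count = 0;              % how many numbers were added

while true
    next_value = rand;              % uniform random number between 0 and 1
    if next_value > upper_limit
        break                       % exit the loop before adding the large value
    end
    total = total + next_value;
    count = count + 1;
end

fprintf('Summed %d numbers, total = %.4f\n', count, total);
```

The loop condition is always true, so break is the only way out; each new number exceeds the limit with probability 0.2, so the loop ends after a handful of passes on average.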
Scripts are the simplest type of program file. They store commands exactly as you would type them
at the command line. Scripts are collections of MATLAB commands stored in plain text files. To
open a new script file we can click the New Script button in the top left corner of the MATLAB
window.
When a script file is run, the code written inside is executed as if you had typed it in
from the keyboard. Scripts can be run by using the run command >> run <filename>,
by pressing the run button in the toolbar,
or by pressing F5 to run the script open in the script editor.
Unlike function files, script files do not have input and output parameters. Script files can
only operate on the variables that are hard-coded into their m-file or already present in the
workspace. Scripts are useful for tasks that don’t change, and are a way to document a sequence
of commands. Function files may be called inside script files. (Note: a script called inside a
function file runs in that function's workspace, so this is best avoided.) Thus, scripts can be
thought of as where the overall program is written, using the functions and tools available in
MATLAB to code a particular program. As such, scripts are useful for setting global behaviour of
a MATLAB session.
Notes:
• In order for scripts to be run they must be found in the local (current) directory or on the MATLAB search path.
• Scripts do not take input or output; these must be hard coded in.
• Variables declared in a MATLAB script are stored in the workspace when run.
• When a script is run, it has access to variables in the workspace. By contrast, functions do not
have access to the base workspace; they see only their input variables.
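As a sketch, the lines below could be the entire contents of a script file. The file name circle_area_script.m and the variable names are hypothetical.

```matlab
% Contents of a hypothetical script file, e.g. circle_area_script.m
radius = 2.5;                 % the input is hard-coded
area = pi * radius^2;         % the result stays in the workspace afterwards
disp(area)
```

After the script has run, radius and area remain in the workspace and can be inspected or reused from the Command Window.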
Recall that the reason we are learning how to develop software is so that we may write computer
programs to solve problems. When confronted with a new problem it can be tricky to know how to
approach it, so having problem solving approaches in mind can prove useful.
There are many different problem solving methodologies available. The following five steps provide a
simple framework which can help you approach a problem.
3. Work the problem by hand (or with a calculator) for a simple set of data (Testing the problem)
Example 7.2.2. Write MATLAB code to compute the distance between two points in a plane, where
the points are given as the coordinates (x1, y1), (x2, y2).
Solution
Our inputs are the information given that we require to solve the problem. Note that sometimes we
will be given irrelevant information, so not all given information may be required. Our outputs are
the values we need to compute.
[Diagram: the inputs Point 1 and Point 2, and the output Distance.]
Working the problem by hand is a very important step. If you are having difficulty with this step
Working the problem by hand will help you understand what steps need to be taken to solve the
problem. It will also give you a known solution value for a simple data set, which you can use later to
test your program.
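For the points (2,1) and (6,4):
\begin{align*}
\text{distance} &= \sqrt{(\text{side}_1)^2 + (\text{side}_2)^2} \\
&= \sqrt{(6-2)^2 + (4-1)^2} \\
&= \sqrt{4^2 + 3^2} = \sqrt{25} \\
&= 5.
\end{align*}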
Decompose the problem into a set of steps and write pseudocode or a flowchart for code. Then write
the code.
Simple problems give simple steps. Complex problems give complex steps.
If we are dealing with a complex problem we still decompose the problem into a series of steps. Each
complex step may also require the problem solving process. We will discuss how to create pseudocode
and flowcharts for complex problems shortly.
Pseudocode is integral to this process. Pseudocode is code written in words and symbols, as opposed
to code typed into the computer as a complete program. If the problem is well understood then we should be
able to write pseudocode for the problem. Then, if we can write pseudocode, we should be able to
code our solution up on a computer.
It is then important to test your program and think very carefully about any data that might cause
errors. It is always better to be thorough. Always test your programs!
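Putting the steps together for the distance problem of Example 7.2.2, a minimal sketch might look as follows. It uses the two points from the hand-worked calculation, so the result can be checked against the known answer of 5; the variable names are illustrative.

```matlab
% Inputs: the two points from the hand-worked example
x1 = 2;  y1 = 1;
x2 = 6;  y2 = 4;

% Steps: find the two side lengths, then apply Pythagoras
side1 = x2 - x1;
side2 = y2 - y1;
distance = sqrt(side1^2 + side2^2)

% Test: compare with the value found by hand
if distance == 5
    disp('Matches the hand calculation.')
else
    disp('Does not match the hand calculation; check the code.')
end
```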
7.2.11 Examples
Maths 162 has taught you many things. Amongst these things have been the fundamental program-
ming tools that have been reviewed in this chapter. We have learnt
These tools make up the fundamentals of programming, in MATLAB and in other languages. This
section aims to show that you can write a wide variety of programs using these tools. It
contains a collection of example programs and exercises that are quite different to what we
have seen so far, but only use the tools we have learnt in Maths 162.
Example 7.2.3. Use MATLAB to write a program that converts any word into pig latin. Pig latin
is a ‘secret’ language formed from English by transferring the initial consonant of each word to the
end of the word and adding ‘ay’ to the end of the word. E.g. ‘cat’ would become ‘atcay’.
Solution
We can use a collection of conditionals, arrays, and for loops for this program.
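function out = pig_latin(in_word)        % function name assumed for illustration
consonants = 'bcdfghjklmnpqrstvwxyz';    % assumed definition of the consonants
first_letter = in_word(1);               % the letter that may need to move
cons_check = 0;                          % becomes 1 if the first letter is a consonant
for i = 1:length(consonants)
    if (first_letter==consonants(i))
        cons_check=1;
        break                            % no need to check the remaining consonants
    end
end
if (cons_check)
    out_word = [in_word(2:end),first_letter,'ay'];
else
    out_word = [in_word,'ay'];
end
out = out_word;
end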
Alternatively, instead of cycling through each consonant in a loop, we can use conditional indexing to
speed up the process.
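consonants = 'bcdfghjklmnpqrstvwxyz';    % assumed definition, as before
check = (in_word(1) == consonants);      % assumed: 1 where the first letter matches a consonant
if (any(check))
    out_word = [in_word(2:end),consonants(check),'ay'];
else
    out_word = [in_word,'ay'];
end
out = out_word;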
>> sentence =
appyhay iggypay
Example 7.2.4. Use MATLAB to write a program that makes fractal trees. A fractal tree starts
with a trunk, and then branches out left and right by a certain angle creating new branches with a
length that has some ratio of the initial trunk. This process continues from those new branches until
you have a full tree. The program should allow the angle, length ratio and number of branch iterations
as the input.
An illustration of how a fractal tree is built is shown.
[Figure: a trunk with left and right branches, each offset by the angle Θ.]
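To do this exercise we can use the fact that the matrices $M_r$ and $M_l$ rotate a vector by $\theta$ radians right and
left respectively, where
\[
M_r = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix},
\qquad
M_l = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}.
\]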
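A quick numerical check of the two rotation matrices, as a sketch: rotating a vector that points straight up by a quarter turn should give a vector pointing right (for the right rotation) or left (for the left rotation). The names M_r, M_l and trunk are illustrative.

```matlab
theta = pi/2;                                             % a quarter turn
M_r = [cos(theta) sin(theta); -sin(theta) cos(theta)];    % rotate right (clockwise)
M_l = [cos(theta) -sin(theta); sin(theta) cos(theta)];    % rotate left (anticlockwise)

trunk = [0; 1];                  % unit vector pointing straight up
right_branch = M_r * trunk       % approximately [1; 0], pointing right
left_branch  = M_l * trunk       % approximately [-1; 0], pointing left
```

The results are only approximately [1; 0] and [-1; 0] because cos(pi/2) is computed as a tiny nonzero number rather than exactly zero.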
Solution
%%% %%%
x_start = 0; y_start = 0;
x_new = 0; y_new = 1;
%%% %%%
%%% The function draw_right_branch draws the right hand side of every
%%% offshoot:
% The next section of code uses a rotation matrix, and the previous
% branch to find the next right hand branch.
rotate_matrix_r = [cos(theta) sin(theta); -sin(theta) cos(theta)];
% This next section of code says to draw the next set of branches
% from the branch we just made. I.e. from the end point of the
% latest right branch draw the next right and left offshoots.
%%% %%%
%%% The function draw_left_branch draws the left hand side of every
%%% offshoot:
% draw_left_branch works the same way as draw_right_branch, only the
% rotation happens to the left instead of the right.
if branches_left>1
    draw_right_branch(new_start(1), new_start(2),...
        new_final(1), new_final(2), branches_left-1);
    draw_left_branch(new_start(1), new_start(2),...
        new_final(1), new_final(2), branches_left-1);
end
end
end
This function can produce a wide variety of fractal trees. The following figure shows a few.
[Figure: four example fractal trees.]
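For readers who would like something that runs as a single script, the following is a loop-based sketch of the same idea. It is not the nested-function program described above: it keeps a list of the current branches and grows a right and a left offshoot from each one. The values of theta, ratio and n_iter, and all variable names, are illustrative assumptions.

```matlab
theta = pi/6;        % branching angle in radians (illustrative)
ratio = 0.7;         % length of each branch relative to its parent (illustrative)
n_iter = 6;          % number of branch iterations (illustrative)

M_r = [cos(theta) sin(theta); -sin(theta) cos(theta)];   % rotate right
M_l = [cos(theta) -sin(theta); sin(theta) cos(theta)];   % rotate left

% Each column of starts/finals holds the start and end point of one branch.
starts = [0; 0];
finals = [0; 1];     % the trunk runs from (0,0) to (0,1)

figure
hold on
plot([starts(1) finals(1)], [starts(2) finals(2)], 'k-')   % draw the trunk

for level = 1:n_iter
    new_starts = [];
    new_finals = [];
    for k = 1:size(starts, 2)
        direction = finals(:,k) - starts(:,k);              % parent branch as a vector
        right_end = finals(:,k) + ratio * (M_r * direction);
        left_end  = finals(:,k) + ratio * (M_l * direction);
        plot([finals(1,k) right_end(1)], [finals(2,k) right_end(2)], 'k-')
        plot([finals(1,k) left_end(1)],  [finals(2,k) left_end(2)],  'k-')
        % The two new branches become the parents for the next level.
        new_starts = [new_starts, finals(:,k), finals(:,k)];
        new_finals = [new_finals, right_end, left_end];
    end
    starts = new_starts;
    finals = new_finals;
end
axis equal
hold off
```

Each level doubles the number of branch tips, so after n_iter levels there are 2^n_iter of them. Changing theta and ratio changes the shape of the tree considerably.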
7.2.12 Exercises
1. Write a program that can take a whole sentence and convert it into pig latin.
7. Create a program that can calculate a payment program for students paying off their loans.
MATLAB : function list
Management
clc Clears prompt window
clear Clears variables from memory
clear all Clears all variables in memory/workspace
close Close figures
close all Closes all plots/figures
global Declares variables to be global
help Prints help about a given function
Input/Output Commands
disp Displays contents of an array or string to screen
input Displays prompts and waits for input from user in prompt
; Suppresses printing to screen
Array Commands
max Gives the largest element
min Gives the smallest element
sum Sums each column
length Number of elements in a vector (largest dimension of an array)
cat Concatenates arrays
find Finds indices of nonzero elements
size Array size
Creating Arrays
linspace Creates regularly spaced vector
logspace Creates logarithmically spaced vector
eye Creates an identity matrix
ones Creates an array of ones
zeros Creates an array of zeros
repmat Replicate and tile an array
Plotting Commands:
Plots
plot Generates standard xy plot
loglog xy plot with logarithmically scaled axes
semilogx xy plot with logarithmic x-axis
semilogy xy plot with logarithmic y-axis
scatter Scatter plot
histogram Histogram
bar Bar graph
plot3 Plots a line in xyz (3D)
scatter3 Scatter plot of 3D data
surf Creates 3D surface plot of matrix data
mesh Draws a 3D mesh of a surface of matrix data
contour Draws a contour plot of a matrix
Plot Enhancements
figure Opens a new figure window
hold Hold current plot
title Gives plot a title
xlabel Labels the x-axis
ylabel Labels the y-axis
legend Creates a plot legend
grid Shows grid lines on plot
subplot Creates plots in subwindows
Logic:
Logical Operators
== Equal to
< Less than
<= Less than or equal to
> Greater than
>= Greater than or equal to
& AND
| OR
~ NOT
xor EXCLUSIVE OR
Logical Functions
any True if any elements are nonzero
all True if all elements are nonzero
isnan True if elements are NaN (not a number)
isinf True if elements are infinite
isempty True if matrix/array is empty
isreal True if all elements are real
Mathematical Functions:
Trigonometric Functions
cos(x) Cosine; cos(x)
sin(x) Sine; sin(x)
tan(x) Tangent; tan(x)
csc(x) Cosecant; csc(x) = 1/sin(x)
sec(x) Secant; sec(x) = 1/cos(x)
cot(x) Cotangent; cot(x) = 1/tan(x)
acos(x) Inverse cosine; arccos(x) = cos^{-1}(x).
Note: all of the other inverse functions are similar, e.g. asin(x).
Complex Functions
abs(x) Absolute value; |x|
angle(x) Angle of a complex number
conj(x) Complex conjugate
imag(x) Imaginary part of a complex number
real(x) Real part of a complex number
Statistical Functions
mean Calculates the average
median Calculates the median
mode Calculates the mode
std Calculates the standard deviation
rand Generates random numbers between 0 and 1 from a uniform distribution
randn Generates random numbers from a normal distribution
Rounding Functions
round Rounds to the nearest integer
ceil ⌈x⌉ Rounds to the nearest integer toward ∞
floor ⌊x⌋ Rounds to the nearest integer toward −∞
sign Signum function; i.e. tells you if positive or negative
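The differences between the rounding functions show up most clearly on negative numbers and on values that end in .5. A short sketch on an arbitrary test vector:

```matlab
x = [-2.5 -1.2 0 1.5 2.7];

round(x)    % -3 -1  0  2  3   (halves are rounded away from zero)
ceil(x)     % -2 -1  0  2  3   (always toward +infinity)
floor(x)    % -3 -2  0  1  2   (always toward -infinity)
sign(x)     % -1 -1  0  1  1
```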