Matlab Notes

Uploaded by

binghuan li
Mathematics 162

2019 Coursebook

Computational Mathematics

Copyright ©2019

Department of Mathematics
The University of Auckland

Front Cover: A single parotid acinar cell (in red) from a mouse parotid gland, with the associated
acinar lumen (in green). That is, it’s a spit-making cell, together with the tube that collects up the spit
from multiple cells. Figure reconstructed by John Rugis (University of Auckland) using experimental
data from David Yule (University of Rochester).
Contents

1 Introduction
    1.1 What is this course about?
    1.2 How to do well in this course

2 Cryptography
    2.1 Introduction
    2.2 Ciphers
    2.3 Modular Arithmetic
    2.4 Caesar cipher
    2.5 Brute force attack
    2.6 Affine ciphers
    2.7 Cribs and frequency analysis
    2.8 One-time pads
    2.9 Euclid’s algorithm
    2.10 Factoring integers
    2.11 Public-key cryptography: Diffie-Hellman
        2.11.1 Modular exponentiation
    2.12 Element-wise operations
    2.13 Modulus as a congruence relation

3 Difference Equations, dynamic models and modelling
    3.1 Introduction
    3.2 Discrete Population Models
        3.2.1 Evaluating using MATLAB
        3.2.2 Script Files
        3.2.3 Some exercises
    3.3 A nonlinear difference equation for population growth
        3.3.1 Some exercises
        3.3.2 Long-term behaviour and fixed points
    3.4 More on long-term behaviour
        3.4.1 Steps to draw a cobweb diagram
        3.4.2 Exercises
    3.5 Discrete logistic equation with a parameter
        3.5.1 Long-term behaviour and bifurcation diagrams
    3.6 Fibonacci and his rabbits
        3.6.1 Evaluating using MATLAB
        3.6.2 A lot more rabbits
    3.7 Plutonium-239 - Decay
    3.8 Money in the bank
        3.8.1 Using MATLAB
        3.8.2 Some exercises
    3.9 Loans
    3.10 Systems of Difference Equations
        3.10.1 Red Blood Cells
        3.10.2 Predator-prey model
        3.10.3 An exercise
        3.10.4 Epidemics
    3.11 Review Exercises

4 Stochastic methods and stochastic modelling
    4.1 Introduction
    4.2 Randomness, probability and simulation
        4.2.1 Randomness
        4.2.2 Probability
        4.2.3 Simulation for probability
    4.3 Discrete and continuous random variables
        4.3.1 Discrete random variables and histograms
        4.3.2 Continuous random variables
    4.4 Uniform and normal distributions, probability distributions
        4.4.1 Probability Density Functions
        4.4.2 The Uniform Distribution
        4.4.3 Normal distribution
        4.4.4 Modelling error
    4.5 Simulating Discrete Random Variables
        4.5.1 Simple examples
        4.5.2 More complicated discrete distributions
    4.6 Estimating probabilities, Monty Hall
        4.6.1 Estimating probabilities
        4.6.2 Supermarket workers
        4.6.3 Monty Hall problem
    4.7 Estimating expectations, Gambler’s ruin
        4.7.1 Estimating expectations
        4.7.2 Expectations of Functions
        4.7.3 Gambler’s ruin
    4.8 Monte Carlo integration
        4.8.1 Estimating integrals

5 Networks and Graphs
    5.1 Introduction
        5.1.1 Theory and Notation
        5.1.2 Representing Graphs
        5.1.3 Graphs in MATLAB
    5.2 Graph Dynamical Systems
        5.2.1 Disease Spread
    5.3 Walks and Circuits
        5.3.1 Where Graph Theory Started: The Bridges of Königsberg
        5.3.2 The First Graph Theory Proof
        5.3.3 Graph Theory Today: The Traveling Salesman Problem
        5.3.4 Solving the Traveling Salesman Problem
    5.4 Colourings
        5.4.1 Where Graph Theory Became Famous: Map Colouring
        5.4.2 Vertex Colourings
        5.4.3 Graph Colourings: Why and How
        5.4.4 Edge Colourings

6 Markov Chains
    6.1 Introduction
    6.2 Markov Chain Basics
    6.3 Probabilities after multiple steps
        6.3.1 Constructing a transition matrix
        6.3.2 Path probabilities
        6.3.3 Probabilities after multiple steps
    6.4 Simulating Markov chains
    6.5 Long term behaviour
        6.5.1 Absorbing States
        6.5.2 Equilibrium probabilities
        6.5.3 Dependency on initial conditions
    6.6 Exercises

7 MATLAB reference chapter
    7.1 Introduction
    7.2 MATLAB Tools
        7.2.1 Standard Operations
        7.2.2 Variables
        7.2.3 Data Types
        7.2.4 Arrays, storage and indexing
        7.2.5 Functions
        7.2.6 Documentation
        7.2.7 Conditionals
        7.2.8 Loops
        7.2.9 Script Files
        7.2.10 Five Steps for Problem Solving
        7.2.11 Examples
        7.2.12 Exercises

MATLAB: function list


Chapter 1

Introduction

Contents
1.1 What is this course about?
1.2 How to do well in this course

1.1 What is this course about?

There are many areas of modern mathematics where computational methods play a large role. In
this course you will learn about the mathematical underpinnings of several areas of mathematics
where computation plays a big part, and also learn about the practical aspects of programming and
algorithmic thinking. We will discuss some of these areas in the first lecture. By the end of this
course you will not only understand the theoretical basis for a number of fascinating areas of modern
mathematics, but also be able to develop and program algorithms to solve these problems.

1.2 How to do well in this course

In this course we will use the programming language MATLAB. MATLAB is commonly used in
mathematics, and it will be useful to learn some specific aspects of programming in MATLAB.
MATLAB is available on the lab computers, and you can purchase a copy for your personal computer
from the bookstore at a discounted student rate. You will need to become comfortable with MATLAB
in order to do well in this course. More broadly, you will learn algorithmic thinking and programming
skills which are applicable to many environments beyond MATLAB.

There are additional online resources, based on MATLAB, which we will also use throughout for extra
practice. One is MATLAB Onramp, a web-based tutorial which will help you learn the basics
of MATLAB.¹

The other is Cody, a MATLAB-based online environment which tests your skills; we will
use it for additional exercises throughout the course.

There will also be useful resources posted on Canvas, including this coursebook, links to MATLAB
resources, lecture recordings, sample code, assignments, practice tests and exams, and more.

You should plan to spend about 10 hours each week working on this course. This includes attending
lectures, reading this book and doing assignment questions.

Try hard not to miss lectures. If you miss a lecture, read the lecture notes and watch the lecture
recording (if available) before the next lecture. Lecture recordings are a valuable resource, but they
do not contain everything that occurs in lecture, and are not intended as a substitute for lectures.

You can only learn mathematics by doing mathematics and it is important to supplement lecture
material by trying some of the recommended problems. Try some of the problems every week. Don’t
wait until it is time to study for the exam.

Attempt all assignments and all questions on the assignment. Once your assignment is marked, go
over the assignment to check where you made mistakes. Sample solutions to the assignments will be
made available - read them, as they contain helpful information such as alternative ways to answer
questions.

If you are having problems with material in the course, first make sure you have read the appropriate
parts of the lecture notes. Then speak to your lecturer, either in lectures or by making an appointment
with your lecturer for another time. Good ways to make an appointment are by speaking to your
lecturer after class or by emailing your lecturer. Don’t be scared to approach your lecturers for help
- they are happy to help students who are trying to help themselves.

If you need help with computer use in the computer laboratory, ask a demonstrator in the laboratory.
Demonstrators on duty will be wearing a sash and there will always be a demonstrator on duty when
the basement computer laboratory is open. If the demonstrators are unable to help you with details
of the MATLAB package used, then ask your lecturer for help.

To prepare for the test or exam, first make sure you understand your lecture notes and make sure you
can do all assignment and tutorial questions. Go over some old exam papers (these can be downloaded
from the University Library website).

¹ You will find a link to MATLAB Onramp on Canvas.

Chapter 2

Cryptography

Contents
2.1 Introduction
2.2 Ciphers
2.3 Modular Arithmetic
2.4 Caesar cipher
2.5 Brute force attack
2.6 Affine ciphers
2.7 Cribs and frequency analysis
2.8 One-time pads
2.9 Euclid’s algorithm
2.10 Factoring integers
2.11 Public-key cryptography: Diffie-Hellman
    2.11.1 Modular exponentiation
2.12 Element-wise operations
2.13 Modulus as a congruence relation

2.1 Introduction

Cryptography is the study of writing and solving codes; specifically, techniques for secure
communications. Cryptography forms the basis of many modern communications systems, and this includes
not just messaging but also the secure online transactions that we take for granted in the modern
world. In this chapter we will explore both the mathematical theory behind such systems, and also
the computational methods and algorithms used for both creating and breaking codes.

We will also introduce, throughout this chapter, the basic concepts of matlab (and programming in
general) that you will need as we progress.


2.2 Ciphers

Theory

The simplest cryptographic systems are substitution ciphers. The central idea is that each letter of
the message to be encoded is replaced by a pre-defined coded letter; thus the original message (the
plaintext) is encoded into the ciphertext. Perhaps the simplest such cipher is known as the Atbash
cipher, which encodes letters by reversing the alphabet. Thus A goes to Z, B to Y, etc., as in the
following table:

plaintext A B C D E F G H I J K L M
ciphertext Z Y X W V U T S R Q P O N
(continued)
plaintext N O P Q R S T U V W X Y Z
ciphertext M L K J I H G F E D C B A

The Atbash cipher can then easily be read off, given the key. For example, from the plaintext FOX we
convert F→U, O→L and X→C so that the plaintext FOX becomes the ciphertext ULC.

In the same way, BEGINTHEOPERATIONATDAWN becomes YVTRMGSVLKVIZGRLMZGWZDM.

Can you decode the ciphertext XRKSVIGVCG?

Practice

Implementing the Atbash cipher in code is easiest if, instead of working with letters, we convert our
letters into integers and work with those. Taking A to be 0, B to be 1, etc., up to Z = 25, each letter
of the ciphertext C is computed from the corresponding plaintext letter P as

C = 25 − P. (2.1)

First we will experiment with using matlab to make this conversion.

Introduction to matlab

In its simplest form, you can simply use matlab as a calculator; for example:

>> 2+2

ans =

4

>> 5*7

ans =

35

>> exp(3)

ans =

20.085536923187668

This last line, exp(3), calls one of MATLAB’s built-in functions. This particular one is the exponential
function, i.e. e^3. There are lots of other built-in functions that we will use later on.

It is also possible to write your own functions to perform particular tasks. In order to implement our
cipher, we will first need to be able to convert letters into integers, and also the reverse. There are two
function files posted on canvas which we will use for this. Download chartoint.m and inttochar.m
from canvas, and change matlab’s “Current Folder” to the folder where you have put these files. The
function chartoint converts a character (that is, a letter) into the corresponding integer value:

>> chartoint('A')

ans =

0

>> chartoint('B')

ans =

1

>> chartoint('M')

ans =

12

>> chartoint('Z')

ans =

25

Letters must be entered using single quotation marks. Similarly, inttochar converts an integer (0-25)
back to the corresponding letter. Here we are using only capital letters.

>> inttochar(0)

ans =

A

>> inttochar(25)

ans =

Z
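
The files chartoint.m and inttochar.m are not reproduced in this coursebook. As a rough sketch of what such a conversion might look like, assuming capital letters only and the convention A = 0, ..., Z = 25, the built-in functions double and char can be combined as below. The names letter2int and int2letter are made up for this illustration; the course files may well be written differently.

```matlab
% Hypothetical stand-ins for chartoint/inttochar (capital letters only).
% double('A') is 65, so subtracting it maps 'A'..'Z' to 0..25.
letter2int = @(s) double(s) - double('A');
int2letter = @(v) char(v + double('A'));

v = letter2int('FOX')   % vector of integers: 5 14 23
s = int2letter(v)       % back to the string: FOX
```

Both helpers accept a whole string or vector at once, because double and char act on every element.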

To get the most out of MATLAB, we need to use it not just as a calculator, but also to work with variables.

>> x=5

x =

5

>> y=3

y =

3

>> z=x*y

z =

15

We can also use variables to hold our characters (letters), and pass those to our functions.

>> my_character='G'

my_character =

G

>> my_integer=chartoint(my_character)

my_integer =

6

>> inttochar(my_integer)

ans =

G

Then we can apply the atbash cipher

>> P=chartoint('G')

P =

6

>> C=25-P

C =

19

>> inttochar(C)

ans =

T

That is, ‘G’ becomes ‘T’.

However, we don’t really want to work with just one letter at a time; we want to encrypt entire
messages. In order to handle this, in matlab, we will need the concept of vectors.

Vectors in matlab are much like the mathematical concept that you already know. You can have either
row or column vectors, and you access individual elements of a vector by using parentheses ().

>> myvector=[1 2 3 4] % row vector

myvector =

1 2 3 4

>> myvector=[1; 2; 3; 4] % column vector

myvector =

1
2
3
4

>> myvector(3) % access third element

ans =

3

>> myvector(3)=11; % change third element

>> myvector

myvector =

1
2
11
4

Because we need to think about messages, we also need strings, which are essentially vectors of
characters. Individual elements can be accessed and changed just as with vectors.

>> mystring='SECRETMESSAGE'

mystring =

SECRETMESSAGE

>> mystring(5) % access 5th element

ans =

E

>> mystring(5)='Z'; % change 5th element


>> mystring

mystring =

SECRZTMESSAGE

Now we can operate directly on strings and vectors, rather than just one integer or character at a
time.

>> message='FOX' % this is a string

message =

FOX

>> V=chartoint(message) % convert to a vector of integers

V =

5 14 23

>> inttochar(V) % convert back to the original string

ans =

FOX

How, then, do we write a function to apply the cipher? This will be a three-step process:

1. Convert the incoming plaintext string to a vector of integers.

2. Use the equation C = 25 − P to encode the ciphertext, with one integer corresponding to each
character.

3. Convert back to a ciphertext string.

The first and third steps are just as we did above, with chartoint() and inttochar(). For the second
step, to apply the cipher to each letter, we will use a for loop.¹

for loops

The for loop is a basic programming construct which uses the same code repeatedly, under specified
conditions. At its most basic, something like

for i=[1 2 3]
2*i
end

executes the code inside the block (2*i) for each of the i values specified in the loop definition (1,2,3).
Thus this loop is executed three times, and the calculations at each stage are 2, 4 and 6.

for loops can also operate on other variables, not just the counter variable (i in the example above).

x=1;
for i=[1 2 3 4]
x=2*x
end

This loop executes four times, and the value of the variable x is doubled each time; initially we have
x=1; after each pass through the loop we have 2, 4, 8 and 16.

One natural extension is to construct our for loops to work with vectors; that is, each pass through
the loop deals with one element in the vector. For example

>> for i=1:5
x(i)=i^2
end

x =

1

x =

1 4

x =

1 4 9

x =

1 4 9 16

x =

1 4 9 16 25

¹ You can also perform the cipher on the vector directly, C=25-P, without using a for loop at all. More on this later.

Now we have the tools needed to implement the atbash cipher in matlab, using a for loop to work
with each element of the vector to be encoded. Here is an example of one way to do this:

function ciphertext=atbash(plaintext)

P=chartoint(plaintext); % convert the incoming string into corresponding integers

N=length(P); % length of message

for j=1:N % for-loop, for each entry in our vector
    C(j)=25-P(j); % apply the Atbash cipher to each entry
end

ciphertext=inttochar(C); % convert back to a string

The first thing to note is that here we are writing a function, and so this code would go into its own
file (atbash.m) to be executed when we call it. One way to know that this is the case is to see the
keyword function at the top. Breaking down this top line, we have

function output_variable = function_name(input_variable)

The body of the function (the following lines) takes the input variable (plaintext in this case) and
computes the output (ciphertext). It is important to distinguish between function files and interactive
commands entered at the MATLAB prompt. In order to do this, we have colour-coded each in this
coursebook as follows. Examples of interaction with the MATLAB prompt appear this way:

>> 2+2

ans =

4

Function files are listed like so:

function output=myfunctionname(input)

output=input^2;

For more details on writing your own function, see the matlab reference chapter (Ch 7).

Back to the atbash cipher then. Using our new function, we can easily encode our plaintext strings.

>> atbash('SUPERSECRETMESSAGE')

ans =

HFKVIHVXIVGNVHHZTV

How, then, do we decode the ciphertext? As it happens, we already have the means to do that – the
atbash cipher decodes itself!

>> ciphertext=atbash('SUPERSECRETMESSAGE')

ciphertext =

HFKVIHVXIVGNVHHZTV

>> atbash(ciphertext)

ans =

SUPERSECRETMESSAGE
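
Why does this work? Applying equation (2.1) twice gives 25 − (25 − P) = P, so encoding a second time undoes the first. The sketch below checks this for every letter at once, working on the whole vector rather than in a loop. It uses only the integers 0 to 25, so it does not depend on the course files.

```matlab
P  = 0:25;        % every letter A..Z as an integer
C  = 25 - P;      % Atbash applied once
P2 = 25 - C;      % Atbash applied a second time
isequal(P2, P)    % true (displayed as 1): every letter returns to itself
```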

2.3 Modular Arithmetic

Theory

One key idea which we will need in order to understand cryptographic systems is that of modular
arithmetic, in particular the modulo operator :

b (mod n). (2.2)

The idea is the same as the remainder by division. For example, 14 (mod 3) is 2, because 14 = 4×3+2.

For cryptography we are particularly interested in operations (mod 26), for the 26 letters in the Latin
(English) alphabet.

Example 2.3.1. Consider the operator 3x + 5 (mod 26).

For x = 3, we have 9 + 5 (mod 26) = 14 (mod 26) = 14.

For x = 8, we have 24 + 5 (mod 26) = 29 (mod 26) = 3.

Thinking of the integers (mod 26) as representing the letters A-Z, then applying 3x + 5 (mod 26), we
have D→O (3 → 14) and I→D (8 → 3). Clearly, operations (mod 26) will be helpful in constructing
cryptosystems!

Practice

MATLAB has a built-in function for the modulus, mod(b,n) which computes b (mod n).

>> mod(8,3) % 8 (mod 3)

ans =

2

>> P=chartoint('MESSAGE') % convert string to integers

P =

12 4 18 18 0 6 4

>> C=mod(3*P+5,26) % 3P+5 (mod 26)

C =

15 17 7 7 5 23 17

>> inttochar(C) % convert back to string

ans =

PRHHFXR

2.4 Caesar cipher

Theory

The Caesar cipher is a simple substitution cipher, like the Atbash cipher, but here the idea is to replace
each plaintext letter by the letter which is three steps down the alphabet. Hence A becomes D, B
becomes E, etc. At the end, we “roll” around, so that X becomes A, Y becomes B, and Z becomes C.

Figure 2.1: Illustration of the Caesar cipher. Image credit: public domain, wikipedia.

Thinking again of the letters A-Z as the integers (mod 26), so that A = 0, B = 1, . . . , Z = 25, the
Caesar cipher can be written using the modulo operator as

C = (P + 3) (mod 26). (2.3)

Then the decryption function is

P = (C − 3) (mod 26). (2.4)

In fact, this works with any shift key, not just 3; let’s call it k. Then we encrypt with

C = (P + k) (mod 26) (2.5)

and the decryption function is

P = (C − k) (mod 26). (2.6)
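
Decryption subtracts the key, so C − k can be negative. The short check below, using only built-in functions, shows how MATLAB handles this: mod with a positive modulus returns a value between 0 and 25, whereas rem keeps the sign of its first argument, which would not correspond to a letter.

```matlab
k = 3;
C = 0;              % ciphertext letter A
P = mod(C - k, 26)  % 23, the letter X: A decrypts to X, as expected
r = rem(C - k, 26)  % -3, not a valid letter index
```

This is why mod, rather than rem, is the natural choice for the decryption step.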

Practice

As with the atbash cipher, we first convert our plaintext string into the integers (mod 26) using our
chartoint() function. Then we use a for loop over the length of the message, and apply our encoding
function. Finally we convert the integer encoded message back to a string using inttochar().

function C = caesarcipher(P,k)
% inputs: P, plaintext string, all capitals, no spaces. k, integer key
% returns: C, ciphertext string

P=chartoint(P); % convert string to integers mod 26

N=length(P); % how long is the message?

for j=1:N % loop over each entry in the message


C(j)=mod(P(j)+k,26); % apply the cipher: C = P + k mod 26
end

C=inttochar(C); % convert back to characters

Example 2.4.1. How can you use the function above to decrypt caesar cipher messages?

Observe that the decryption function is simply the encryption function, but with a key of the opposite
sign (k → −k).

>> message='ISTHISSECURE';
>> C=caesarcipher(message,5) % encrypt with key +5

C =

NXYMNXXJHZWJ

>> caesarcipher(C,-5) % decrypt with key of opposite sign (-5)

ans =

ISTHISSECURE

2.5 Brute force attack

Theory

At the end of the last section, we observed that the decryption function for the Caesar cipher is simply
the encryption function where the key k now has the opposite sign (k → −k).

This suggests a method of breaking the cipher, using what is known as a brute force attack. Suppose
we have only the ciphertext, and not the key used to encode it, but that nonetheless we wish to decode
the message. If we know that this is a Caesar cipher, then all we must do is try all possible keys!
The question, then, is how many possible keys are there? If this number is 10^15, then it will not be
practical to try them all. However, in the case of the Caesar cipher, there are only 25 possible keys.

There are two ways of understanding this. The first is to realize that because the cipher is a shift,
there are only 25 possible shifts. The second is more formal: in terms of the modulo operator,

(P + k) (mod 26) = (P + [k (mod 26)]) (mod 26). (2.7)

That is, we need only check keys from 1 up to 25. (Why not k = 0 or k = 26?)
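
Equation (2.7) can be checked numerically. The sketch below shifts every letter by k and by k + 26 and confirms that the results agree; the key value 7 is arbitrary and chosen only for illustration.

```matlab
P  = 0:25;                   % all plaintext letters as integers
k  = 7;                      % an arbitrary key, chosen for illustration
C1 = mod(P + k, 26);         % shift by k
C2 = mod(P + (k + 26), 26);  % shift by k + 26
isequal(C1, C2)              % true: both keys give the same cipher
```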

Practice

Example 2.5.1. Decode the ciphertext STEADNIWTHTRGTILTPEDC by brute force attack. You may
assume that it is encrypted with a Caesar cipher, but you do not know the key.
Solution
The brute force approach is simply to try every possible key; we will use a for loop to cover all
possibilities. We already wrote a function for applying the encryption/decryption function, so let’s use
that.

function bruteforce(C)
% input: C, ciphertext string, all capitals, no spaces

for j=1:25
    disp(caesarcipher(C,j)) % try all possible keys, printing each candidate
end

Now all we need to do is call this function on our ciphertext:

>> bruteforce('STEADNIWTHTRGTILTPEDC')
TUFBEOJXUIUSHUJMUQFED
UVGCFPKYVJVTIVKNVRGFE
VWHDGQLZWKWUJWLOWSHGF
WXIEHRMAXLXVKXMPXTIHG
XYJFISNBYMYWLYNQYUJIH
YZKGJTOCZNZXMZORZVKJI
ZALHKUPDAOAYNAPSAWLKJ
ABMILVQEBPBZOBQTBXMLK
BCNJMWRFCQCAPCRUCYNML
CDOKNXSGDRDBQDSVDZONM
DEPLOYTHESECRETWEAPON
EFQMPZUIFTFDSFUXFBQPO
FGRNQAVJGUGETGVYGCRQP
GHSORBWKHVHFUHWZHDSRQ
HITPSCXLIWIGVIXAIETSR
IJUQTDYMJXJHWJYBJFUTS
JKVRUEZNKYKIXKZCKGVUT
KLWSVFAOLZLJYLADLHWVU
LMXTWGBPMAMKZMBEMIXWV
MNYUXHCQNBNLANCFNJYXW

NOZVYIDROCOMBODGOKZYX
OPAWZJESPDPNCPEHPLAZY
PQBXAKFTQEQODQFIQMBAZ
QRCYBLGURFRPERGJRNCBA
RSDZCMHVSGSQFSHKSODCB

So we can see that a forward shift of 11 recovers the plaintext, ’DEPLOYTHESECRETWEAPON’. Since
decryption uses the key of opposite sign, the message was encoded with the key k = −11 (mod 26) = 15.
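
The forward shift that recovers the plaintext and the key used to encrypt are negatives of one another modulo 26, so a decrypting shift of 11 corresponds to an encryption key of 15. As a quick check, assuming the convention A = 0, ..., Z = 25 and using the built-ins double and char in place of the course conversion functions:

```matlab
shift = 11;                        % shift that produced readable text
k = mod(-shift, 26)                % original encryption key: 15

plain = 'DEPLOYTHESECRETWEAPON';
P = double(plain) - double('A');   % letters to integers 0..25
C = char(mod(P + k, 26) + double('A'))  % STEADNIWTHTRGTILTPEDC
```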

2.6 Affine ciphers

Theory

The idea of the Caesar cipher can be extended in a simple way. Instead of the simple shift ((P + k)
(mod 26)) we can instead take our encryption function in a slightly more general form as

C = aP + b (mod 26). (2.8)

This is known as an affine cipher (because the encryption function is an affine function, mod 26).
Now the cipher key is not just a single integer, but instead we require both a and b (both integers).

However, we cannot choose a and b freely. In order for the affine cipher
to work, we need a and 26 to be coprime. That is, the only positive integer which divides them both
is 1. Without this condition, it is not possible to decrypt the cipher.

The decryption function is


P = A(C − b) (mod 26) (2.9)

where 1 = aA (mod 26). Hence if a and 26 are not coprime, then we can’t find A to decrypt the
message [2]. But, so long as we choose a coprime with 26, we can select any integer b and have a usable
affine cipher. Note that if a = 1 we have the Caesar cipher again.

Example 2.6.1. Suppose we wish to use an affine cipher with the key a = 3, b = 5. In order to find
the decryption function, we must find A such that 1 = 3A (mod 26). Because 3 and 26 are coprime,
this has a solution: A = 9. Observe that aA (mod 26) = 3 × 9 (mod 26) = 27 (mod 26) = 1.
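For small moduli such as 26, the value of A can be found by simply testing every candidate. The lines below are a sketch under that assumption (the variable names are illustrative, and this search is not a method taken from the course code).

```matlab
% Find A with a*A = 1 (mod 26) by testing the candidates 1..25.
a = 3;
A = find(mod(a*(1:25), 26) == 1)   % the index equals the candidate value

% If a shares a factor with 26 (here a = 4) the search comes back empty.
no_inverse = isempty(find(mod(4*(1:25), 26) == 1))
```

For a = 3 the search returns the same A = 9 as in the example, while a = 4 has no inverse because 4A is always even and so can never leave remainder 1 when divided by 26.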

Example 2.6.2. Is the affine cipher harder to break by brute force, compared with the Caesar cipher?
How many keys would you need to try? Hint: the only numbers coprime with 26, and less than 26,
are: 1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, and 25.
[2] If you find yourself particularly interested in why, you should consider MATHS 328 in future.

Practice

As with our other ciphers, we have a three step process: first convert the message into integers
(mod 26); then use a for loop to apply the encryption function to each entry; finally convert the
encrypted integers back into a string.

function C = affinecipher(P,a,b)
%inputs: P, plaintext string
% a, b: integer key. assuming a is coprime with 26

P=chartoint(P); % convert plaintext to integers mod 26

N=length(P); % length of message

for j=1:N % loop over each entry


C(j) = mod(a*P(j)+b,26); % C = aP+b mod 26
end

C=inttochar(C); % convert back to characters

How do we decrypt affine cipher messages? In the example above, we found that the decryption
function is

P = A(C − b) (mod 26) (2.10)


= 9(C − 5) (mod 26) (2.11)
= 9C − 9 × 5 (mod 26). (2.12)

But this is just another affine cipher! So we can both encrypt and decrypt using the same function,
just by altering the key.

>> C=affinecipher('TOPSECRETMOONBASEPLANS',3,5)

C =

KVYHRLERKPVVSIFHRYMFSH

>> affinecipher(C,9,-9*5)

ans =

TOPSECRETMOONBASEPLANS

2.7 Cribs and frequency analysis

Theory

We have already discussed brute force attacks in Sec. 2.5, but this is certainly not the only way of
attempting to decrypt messages (without knowing the key). In this section we consider additional
methods which take advantage of the likely content of the plaintext message.

The first is the idea of the “crib”, a term which originated with the codebreakers at Bletchley Park,
working to break the Enigma machine, during WWII. The key concept is that in some situations,
certain parts of the plaintext may be known (or guessed). Historical examples include messages from
certain stations which often stated only “Nothing to report”, or weather reports at the same time
each day which contain the word “weather” (and also common weather terms for that day’s weather).
Such information can be extremely valuable in decoding messages.

Example 2.7.1. Suppose that you wish to break the ciphertext “HSYJINJJHAGRANIJSGV”, which was
encoded with an affine cipher. Instead of resorting to brute force (see Example 2.6.2), here
we have a crib: suppose we know that the first two letters of each message are “HI”.

Since the decoding function for an affine cipher is

P = A(C − b) (mod 26) (2.13)

we can use the first two letters of the ciphertext (H = 7, S = 18) and the first two letters of the
plaintext from the crib (H = 7, I = 8) to find:

7 = A(7 − b) (mod 26)


8 = A(18 − b) (mod 26)

This is now a system of two equations and two unknowns; all we have to do is solve these simultaneous
equations in order to find A and b.

Solving this we find A = 19 and b = 8, and remembering the relationship between the affine encryption
and decryption functions, we can decode our message:

>> affinecipher('HSYJINJJHAGRANIJSGV',19,-8*19)

ans =

HISTARTTHEOPERATION

For more complex cryptosystems, cribs can still provide valuable information, if not an out-and-out
solution, as in this example.
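The two crib equations can also be solved by machine. One simple approach, sketched below, is to test every pair (A, b) against both equations; the variable names are illustrative and this exhaustive search is an assumption of the sketch rather than the method used in the example.

```matlab
% Solve  7 = A(7 - b) (mod 26)  and  8 = A(18 - b) (mod 26)  by exhaustive search.
Ccrib = [7 18];                    % ciphertext letters H, S as integers
Pcrib = [7 8];                     % plaintext letters H, I as integers
found = [];                        % each row will hold one solution [A b]
for A = 1:25
    for b = 0:25
        if all(mod(A*(Ccrib - b), 26) == Pcrib)
            found = [found; A b];
        end
    end
end
found
```

Only 25 × 26 pairs are tested, so the search is immediate, and it confirms that A = 19, b = 8 is the only solution.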

Frequency analysis

Frequency analysis also exploits properties of the plaintext, in particular the fact that letters are not
equally used in natural language. For example, in English, approximately 12.7% of letters are ‘E’ and
8.2% ‘A’, while less than 0.1% are ‘Q’ or ‘Z’.

The patterns can also be extended to groups of letters, for example with ‘TH’ being the most common
pair of letters (bigram) in English, and ‘THE’ the most common triplet (trigram).

This method is more usable, of course, with longer messages; the longer the message, the more likely
that it adheres closely to the known distributions.

Example 2.7.2. Suppose that we know the ciphertext

BPQAQABPMAWVOBPIBVMDMZMVLAQBOWMAWVIVLWVUGNZQMVLAAWUMXMWXTMABIZBMLAQVOQV
OQBVWBSVWEVQVOEPIBQBEIAIVLVWEBPMGSMMXWVAQVOQVOQBNWZMDMZRCABJMKICAM

is encoded with Caesar cipher, and the plaintext is in English, but we do not know the key.

The most common letters in this ciphertext are, in descending order: V, M, B and A. Knowing that
E is the most common letter in English texts, we first guess that V = E, which would mean that the
Caesar cipher has a key of 21 − 4 = 17. Checking we find

>> C=['BPQAQABPMAWVOBPIBVMDMZMVLAQBOWMAWVIVLWVUGNZQMVLAAWUMXMWXTMABIZBMLAQVOQ', ...
'VOQBVWBSVWEVQVOEPIBQBEIAIVLVWEBPMGSMMXWVAQVOQVOQBNWZMDMZRCABJMKICAM'];

>> caesarcipher(C,-17)

ans =

KYZJZJKYVJFEXKYRKEVMVIVEUJZKXFVJFEREUFEDPWIZVEUJJFDVGVFGCVJKRIKVUJZEXZE
XZKEFKBEFNEZEXNYRKZKNRJREUEFNKYVPBVVGFEJZEXZEXZKWFIVMVIALJKSVTRLJV

Oh dear. No luck this time. But, we’re not done yet: the next most common letter in the ciphertext
was M. Guessing M = E, the key would be 12 − 4 = 8.

>> caesarcipher(C,-8)

ans =

THISISTHESONGTHATNEVERENDSITGOESONANDONMYFRIENDSSOMEPEOPLESTARTEDSINGING
ITNOTKNOWNINGWHATITWASANDNOWTHEYKEEPONSINGINGITFOREVERJUSTBECAUSE

Success!
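The letter counts used in this example can be produced with a few lines of MATLAB. This is a sketch only: it assumes the ciphertext contains nothing but the upper-case letters A to Z, and the variable names are illustrative.

```matlab
% Count letter frequencies in the ciphertext of Example 2.7.2.
C = ['BPQAQABPMAWVOBPIBVMDMZMVLAQBOWMAWVIVLWVUGNZQMVLAAWUMXMWXTMABIZBMLAQVOQV', ...
     'OQBVWBSVWEVQVOEPIBQBEIAIVLVWEBPMGSMMXWVAQVOQVOQBNWZMDMZRCABJMKICAM'];

idx = double(C) - double('A') + 1;          % map 'A'..'Z' to 1..26
counts = accumarray(idx(:), 1, [26 1]);     % counts(m) = occurrences of the m-th letter
[~, order] = sort(counts, 'descend');       % most frequent letters first

top4 = char(order(1:4)' + double('A') - 1)  % four most common ciphertext letters
keys = mod(order(1:4)' - 1 - 4, 26)         % candidate keys if each one stands for E (= 4)
```

Each entry of keys is a guess to try with caesarcipher(C,-key), exactly as was done by hand above.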

Obviously, the Caesar cipher is particularly prone to this, but more sophisticated cryptosystems are
also vulnerable if more information about the distribution of letters is used [3].

2.8 One-time pads

Theory

So far we have looked at simple ciphers, which are easy to implement (but also easy to break).

A related cryptosystem is the one-time pad. If done correctly, it is unbreakable.

The central idea is to have a “code book” which must be exchanged in advance. Then to encode your
plaintext message, you take each character in your message, and add it to the corresponding character
in the code book, taking the result (mod 26).

For example, suppose the code book is a secret book from the library – here, “The Complete Poems
of Emily Dickinson.” The first unused poem from the book is

Best Witchcraft is Geometry


To the magicians mind -
His ordinary acts are feats
To thinking of mankind.

Suppose we need to send a message “BUY TELECOM SHARES”. We calculate

BUYTELECOMSHARES

1 20 24 19 4 11 4 2 14 12 18 7 0 17 4 18

and the first 16 letters from the poem will be

BESTWITCHCRAFTIS

1 4 18 19 22 8 19 2 7 2 17 0 5 19 8 18

Adding these two messages, entry by entry, mod 26 we get

2 24 16 12 0 19 23 4 21 14 9 7 5 10 12 10

CYQMATXEVOJHFKMK

and send “CYQMATXEVOJHFKMK”.

[3] It’s also worth noting that the letter frequencies are different in other languages!

The decryption process then is to use the same letters from the key (code book) and subtract them
from the ciphertext (mod 26). If the key is not predictable and is never re-used, then this is an
unbreakable code; hence the name one-time pad. Needing a fresh key for every message is a serious
disadvantage, and it is the reason that our chapter on cryptography doesn’t end here! In many
situations, exchanging a one-time pad key in advance is impractical.

Practice

As with our other ciphers, we have a three step process: first convert the message and key into
integers (mod 26); then use a for loop to apply the encryption function to each entry; finally convert
the encrypted integers back into a string.

function C = onetimepad(P,key)
% inputs P, plaintext string
% key, plaintext string
% output C ciphertext string

P=chartoint(P); % convert string to integers mod 26


key=chartoint(key);

N=length(P); % length of message. assuming the key is at least as long

for j=1:N
C(j) = mod(P(j)+key(j),26);
end

C=inttochar(C); % convert back to letters

Using this to check our previous example:

>> C=onetimepad('BUYTELECOMSHARES','BESTWITCHCRAFTIS')

C =

CYQMATXEVOJHFKMK

Example 2.8.1. Modify the code given above to provide the decryption function for a one-time pad.

2.9 Euclid’s algorithm

When working with integers (mod 26) for cryptosystems, we are often interested in the greatest
common divisor (gcd) of two integers (let’s call them a and b). The greatest common divisor of a and
b (gcd(a,b)) is the largest integer which divides both a and b without leaving a remainder. For example,
gcd(8, 12) = 4. In this section we discuss Euclid’s algorithm, which is a method of computing the gcd;
it is not a cryptosystem in itself, but it will help us to understand more sophisticated cryptosystems
later on.

Theory

Euclid’s algorithm is a famous method for computing the greatest common divisor. The key idea
behind Euclid’s algorithm, which dates back at least 2000 years, is an iterative process of trying
possible divisors.

Suppose we are trying to find gcd(8, 22). We might first try dividing 8 into 22, supposing that 8 could
be the greatest common divisor, but find that this leaves a remainder:

22 = 2 × 8 + 6

The key observation is that now we need to find the greatest common divisor of 8 and 6. That is, if
8 = mG and 6 = nG, where G = gcd(6, 8), (and so m and n are integers) then

22 = 2 × (mG) + nG = (2m + n)G.

For this example, we have

8=1×6+2

and repeating the process

6 = 3 × 2 + 0.

That is

22 = 2 × 8 + 6 = 2 × (4 × 2) + 3 × 2 = 11 × 2

and so gcd(8, 22) = 2.

Euclid’s algorithm is based on the same process. In more succinct form, at each step of the algorithm
we are given r_{k-1} and r_{k-2} and need to find the quotient (q_k) and remainder (r_k)

r_{k-2} = q_k r_{k-1} + r_k.

We start with r_{k-2} = a and r_{k-1} = b, and stop when r_k = 0; then gcd(a, b) = r_{k-1}. Here we have
assumed that a > b; if not, swap a and b before starting.

Thus the algorithm is, in pseudo-code:


start: set r_{k-2} = a and r_{k-1} = b.          % set starting values

compute r_k = r_{k-2} (mod r_{k-1}).             % compute remainder

while r_k ≠ 0:                                    % loop while remainder is nonzero

    set r_{k-2} = r_{k-1} and r_{k-1} = r_k       % shift stored values: k ← k + 1

    compute r_k = r_{k-2} (mod r_{k-1})           % compute remainder

end: gcd(a, b) = r_{k-1}                          % finished; gcd is last nonzero remainder.

Practice

Euclid’s algorithm uses a type of control that we have not so far considered. That is, we perform the
main body of the algorithm until some condition is met (r_k = 0), rather than for a fixed number of
iterations. This means that we cannot use a for loop as we have been doing for ciphers up to this
point.

Instead we need another common programming control construct, the while loop. The idea here is
that we continue to execute the body of the loop for as long as the condition holds, e.g.:

while(condition)
code-to-execute
end

More specifically, here is a while loop that squares x, starting at 2, so long as x is smaller than
1,000,000.

>> x=2;
>> while(x<1000000)
x=x^2
end

x =

4

x =

16

x =

256

x =

65536

x =

4.2950e+09

The while loop is a powerful concept that we will use repeatedly throughout the rest of the course.
We also use it to implement Euclid’s algorithm.

function z=euclidalgo(a,b)

rm2=a; % initial values
rm1=b;

r=mod(rm2,rm1); % r_k = r_{k-2} mod r_{k-1}

while (r~=0) % keep going while the remainder is nonzero
    rm2=rm1; % shift our values along
    rm1=r;
    r=mod(rm2,rm1); % r_k = r_{k-2} mod r_{k-1}
end

z=rm1; % gcd is the last nonzero remainder

Matlab also has a built-in function for computing the greatest common divisor, gcd(a,b).
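To see the individual steps of the algorithm, the loop can be written so that it prints each division. The script below is a sketch and a slight variant of euclidalgo: it loops until the divisor becomes zero, and the variable names q and g are introduced here for illustration.

```matlab
% Trace Euclid's algorithm for gcd(22, 8), printing each division step.
a = 22; b = 8;
rm2 = a; rm1 = b;
while rm1 ~= 0
    q = floor(rm2/rm1);                        % quotient q_k
    r = mod(rm2, rm1);                         % remainder r_k
    fprintf('%d = %d x %d + %d\n', rm2, q, rm1, r);
    rm2 = rm1;                                 % shift stored values along
    rm1 = r;
end
g = rm2                                        % last nonzero remainder
```

The printed lines reproduce the three divisions from the Theory section (22 = 2 × 8 + 6, 8 = 1 × 6 + 2, 6 = 3 × 2 + 0), and g agrees with the built-in gcd(22,8).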

2.10 Factoring integers

Theory

One of the key ideas in cryptography is the idea of a “one-way function”. The idea is that there are
certain operations which are computationally easy in one direction, but very difficult in the opposite
direction – such operations can then be used at the heart of cryptosystems.

One such operation is decomposing integers into prime factors, for example 4578 = 2 × 3 × 7 × 109.

In one direction, starting with 4578 and needing to find the prime factors, the problem is hard. In
the opposite direction, having the prime factors and simply needing to multiply them together, it is easy.

To illustrate this, let us see how long it takes to factor some very big numbers.

Practice

There is code posted on canvas which uses matlab’s built-in factor() function to factor integers into
their prime factors. It does this for different numbers of digits (from 30 digits up to 50), and for each
size it generates 50 random numbers of this size and measures the time taken to factor each number.
This code is more complex than we have considered so far, and you aren’t necessarily expected to
understand all of it at this stage; but, you may find it useful to look through it and see if you can
understand the outline.

The main point of interest, though, is how long it takes to factor very large numbers. Running the
code, we can try to extrapolate the trend to understand how long it might take to factor even larger
numbers. The key conclusion, if you fit the trend line to the running time data, is that the amount of
time required grows very quickly as the size of the number increases [4].

For example, on my laptop it took about 1.6s to factor a 50-digit number. Not prohibitive. But
extrapolating to larger and larger numbers, a 100-digit number would take about 5 days. Again,
this still wouldn’t be adequate security for most purposes. But exponential growth will take care of
this; for 309-digit numbers, a size commonly used in modern cryptosystems, factoring in this manner
would be expected to take 7 × 10^20 years – considerably longer than the age of the universe. The
multiplication time, by contrast, grows only very slowly with the number of digits, and is trivial
even for numbers of this size. This is the key idea of the one-way function; the factorisation direction is
hard, but the reverse (multiplication) direction is easy.

Obviously there are faster computers, and better algorithms, which would reduce these factorisation
times considerably. But the fundamental concept remains the same – very fast growth of the
computation time will make the computational cost prohibitive if the numbers are big enough.
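The two directions can be tried on a small scale. The sketch below is not the code posted on Canvas; it uses two primes chosen here for illustration (65521 and 65537) whose product is small enough for factor() to handle in ordinary double precision, so both timings will be tiny.

```matlab
% Multiply two primes, then recover them again with factor().
p = 65521; q = 65537;              % illustrative primes (assumed values)

tic; n = p*q; t_mult = toc;        % easy direction: multiplication
tic; f = factor(n); t_fact = toc;  % hard direction: factorisation

fprintf('n = %d, multiply: %g s, factor: %g s\n', n, t_mult, t_fact);
f                                  % prime factors in increasing order
```

At this size the difference is barely measurable; it is the growth of the factoring time with the number of digits, described above, that makes the function one-way in practice.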

2.11 Public-key cryptography: Diffie-Hellman

Theory

Most of the cryptographic systems we have discussed so far rely on some sort of prior key exchange;
that is, that both parties already have a shared, common key.

But, this is a severe limitation. In many situations there is no obvious way to go about exchanging
keys securely. Fortunately there are mathematical solutions to this problem, which allow secure key
exchange over unsecured channels.

In this section we will illustrate the idea of public key cryptography by explaining the Diffie-Hellman
key exchange system. The commonly used RSA protocol, which encrypts much internet traffic today,
is based on similar ideas.

[4] In fact, faster than any polynomial.

Alice and Bob wish to communicate securely with one another, but they have not previously exchanged
any information (e.g. no shared, secure keys). Eve is attempting to eavesdrop on the communication
between Alice and Bob.

Alice and Bob begin by choosing two integers: a prime p, and an integer g (with 1 < g < (p − 1)).
Both p and g are transmitted in the clear – that is, Eve can intercept them.

Alice and Bob then both choose a secret integer; let’s call Alice’s secret integer a and Bob’s b.

Alice then computes A = g^a (mod p), using her secret integer, and sends this (in the clear) to Bob.

Bob computes B = g^b (mod p), using his secret integer, and sends this (in the clear) to Alice. Eve is
able to intercept both A and B.

Finally Alice computes B^a (mod p), and Bob computes A^b (mod p). But, both Alice and Bob now
have a shared secret key s! Why is this?

s = B^a (mod p) = g^{ab} (mod p) = g^{ba} (mod p) = A^b (mod p)

or more specifically (g^a (mod p))^b (mod p) = (g^b (mod p))^a (mod p).

But, why can Eve not also compute the secret key s? The answer is that it is difficult to solve

A = g^a (mod p)

for a if g and p are large [5], in much the same way that factoring integers was difficult in the previous
section. So this is a one-way function; it is easy for Alice and Bob to do their calculations, but the
reverse calculation required of Eve is vastly harder.

Alice                   transmitted in the clear (Eve)       Bob

                        p, g
a                                                            b
                        → A = g^a (mod p) →
                        ← B = g^b (mod p) ←
s = B^a (mod p)                                              s = A^b (mod p)
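The exchange can be followed with very small numbers. In this sketch the values p = 23, g = 5, a = 4 and b = 3 are chosen purely for illustration (real systems use far larger numbers), and the powers are computed directly, which only works because the numbers are small.

```matlab
% Toy Diffie-Hellman exchange with illustrative small numbers.
p = 23; g = 5;                     % public: sent in the clear
a = 4;                             % Alice's secret integer
b = 3;                             % Bob's secret integer

A = mod(g^a, p);                   % Alice sends A to Bob
B = mod(g^b, p);                   % Bob sends B to Alice

sA = mod(B^a, p);                  % Alice's copy of the shared key
sB = mod(A^b, p);                  % Bob's copy of the shared key
fprintf('A = %d, B = %d, sA = %d, sB = %d\n', A, B, sA, sB);
```

Both parties arrive at the same key (here 18), while Eve has only seen p, g, A and B.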

2.11.1 Modular exponentiation

While calculating the power a may be difficult, going in the other direction, i.e. computing g^a (mod p),
can be very efficient. One way to do modular exponentiation would be to raise the base, g, to the
power, a, and then apply the modulus operator. However, there are more efficient ways.

[5] and well-chosen.

Consider the following property of the modulus operator:

c (mod p) = (c_1 · c_2) (mod p) where c_1 and c_2 are integers that multiply to c.

Then

c (mod p) = [(c_1 (mod p)) · (c_2 (mod p))] (mod p)

Based on this we can use the computationally efficient method of exponentiation by squaring. We can
work out g^2, g^4, g^8, etc. by repeatedly squaring, and then g^a can be calculated by taking products.

Example 2.11.1. Find 3^21 (mod 53)

3^2 (mod 53) = 9
3^4 (mod 53) = 9^2 (mod 53) = 81 (mod 53) = 28
3^8 (mod 53) = 28^2 (mod 53) = 784 (mod 53) = 42
3^16 (mod 53) = 42^2 (mod 53) = 1764 (mod 53) = 15
⇒ 3^21 (mod 53) = 3^(16+4+1) (mod 53) = (15 × 28 × 3) (mod 53) = 1260 (mod 53) = 41
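The same calculation can be automated with a while loop. The following is one possible sketch of exponentiation by squaring, with illustrative variable names; it reads off the binary digits of the exponent from the lowest upwards and reduces mod p after every multiplication, so the intermediate numbers never get large.

```matlab
% Compute g^a (mod p) by repeated squaring, for the example 3^21 (mod 53).
g = 3; a = 21; p = 53;

result = 1;
base = mod(g, p);                  % holds g^1, g^2, g^4, g^8, ... (mod p)
e = a;                             % remaining exponent
while e > 0
    if mod(e, 2) == 1              % this power of two is part of a
        result = mod(result*base, p);
    end
    base = mod(base*base, p);      % square to get the next power
    e = floor(e/2);                % move on to the next binary digit
end
result
```

Since 21 = 16 + 4 + 1, the loop multiplies together the same three powers as the hand calculation and gives 41.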

2.12 Element-wise operations

One key matlab idea that we have not so far used is the idea of operating directly on vectors, specifically
element-wise operations. These are natural ways of implementing ciphers, as an alternative to using
a for loop.

The idea is simple: we want to perform an operation on a vector, but acting only on each element at
a time. For example, if we want the square of each element in a vector:

>> x=[1 2 3 4 5]

x =

1 2 3 4 5

>> x.^2

ans =

1 4 9 16 25

The ‘dot’ (e.g. x.^2 rather than x^2) indicates the element-wise operation.

Similarly, we can take element-wise products of vectors



>> x=[1 2 3 4]

x =

1 2 3 4

>> y=[5 6 7 8]

y =

5 6 7 8

>> x.*y

ans =

5 12 21 32

Most matlab built-in functions, for example mod(), also work naturally on vectors in an element-wise
manner:

>> x=[1 2 5 56 135]

x =

1 2 5 56 135

>> mod(x,26)

ans =

1 2 5 4 5

This naturally brings up the notion that we could implement our cipher codes using element-wise
operations instead of for loops. Here is an implementation of the affine cipher using element-wise
operations.

function C = affinecipher_elementwise(P,a,b)
%inputs: P, plaintext string
% a, b: integer key. assuming a is coprime with 26
% now using element-wise operations rather than a for loop

P=chartoint(P); % convert plaintext to integers mod 26

C=mod(a*P+b,26); % apply C = aP+b mod 26 to every entry at once

C=inttochar(C); % convert back to characters

Compare this with the approach, using a for-loop, that we took in Sec. 2.6.
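A quick way to make that comparison is to run both versions on the same input and check that the results agree. The sketch below avoids the course helper functions so that it runs on its own; converting letters with double(...) - double('A') is an assumption made here and stands in for chartoint and inttochar.

```matlab
% Compare the for-loop and element-wise versions of the affine cipher.
P = double('TOPSECRET') - double('A');   % letters as integers 0..25
a = 3; b = 5;

C_loop = zeros(size(P));                 % for-loop version
for j = 1:length(P)
    C_loop(j) = mod(a*P(j) + b, 26);
end

C_vec = mod(a*P + b, 26);                % element-wise version

same = isequal(C_loop, C_vec)
ciphertext = char(C_vec + double('A'))
```

The ciphertext matches the first nine letters produced in Sec. 2.6 for the same key.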

2.13 Modulus as a congruence relation

Theory

Up to this point, we have thought of the modulus as an operator. That is, if we consider 34 (mod 16),
we are thinking of the operation of computing 34 = (16 × 2) + 2 and finding that the remainder is
2. There is, however, another way of thinking about the modulus which is often theoretically more
powerful, and that is to think of the modulus as a congruence relation. In this case, when we write

A ≡ B (mod C)

we would say “A is congruent to B modulo (mod) C”. What do we mean by this?

Consider the integers modulo 3. For any integer, if we compute that integer (mod 3), we will get
either 0, 1 or 2. We sometimes refer to these as equivalence classes, which is to say that all integers
which end up in the same group (mod 3), for example the 1 group, are in the same equivalence class, so:

4 ≡ 13 (mod 3)

and we say that 4 is congruent to 13 modulo 3. (This is because 4 (mod 3) = 1, and also 13
(mod 3) = 1; they are both the same, modulo 3. Note the difference between the equals sign (=) and
congruence (≡).)

As an equivalence relation, then, we have several useful properties. For example:

• reflexive: A ≡ A (mod C)

• symmetric: if A ≡ B (mod C) then B ≡ A (mod C)

• transitive: if A ≡ B (mod C) and B ≡ D (mod C) then A ≡ D (mod C)
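Congruence is easy to test in MATLAB using the mod() operator. The sketch below shows two equivalent checks for 4 ≡ 13 (mod 3); the variable names are illustrative.

```matlab
% Two equivalent tests for A congruent to B modulo C.
A = 4; B = 13; C = 3;

same_remainder = (mod(A, C) == mod(B, C))      % both leave the same remainder
divides_difference = (mod(A - B, C) == 0)      % C divides the difference A - B
```

Both tests return true here. The second form is often the more convenient one in proofs, since A ≡ B (mod C) exactly when A − B is a multiple of C.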


Chapter 3

Difference Equations, dynamic models and modelling

Contents
3.1 Introduction
3.2 Discrete Population Models
    3.2.1 Evaluating using MATLAB
    3.2.2 Script Files
    3.2.3 Some exercises
3.3 A nonlinear difference equation for population growth
    3.3.1 Some exercises
    3.3.2 Long-term behaviour and fixed points
3.4 More on long-term behaviour
    3.4.1 Steps to draw a cobweb diagram
    3.4.2 Exercises
3.5 Discrete logistic equation with a parameter
    3.5.1 Long-term behaviour and bifurcation diagrams
3.6 Fibonacci and his rabbits
    3.6.1 Evaluating using MATLAB
    3.6.2 A lot more rabbits
3.7 Plutonium-239 - Decay
3.8 Money in the bank
    3.8.1 Using MATLAB
    3.8.2 Some exercises
3.9 Loans
3.10 Systems of Difference Equations
    3.10.1 Red Blood Cells
    3.10.2 Predator-prey model
    3.10.3 An exercise
    3.10.4 Epidemics
3.11 Review Exercises

3.1 Introduction

Difference equations are often used to model quantities that vary with time.

Perhaps we want to know how much money we can make from interest. Or perhaps we might want
to know how populations are changing over time; or further yet know how an epidemic would spread
through a population. Difference equations are a mathematical tool we can use to investigate such
problems.
In this section, you will learn to solve difference equations numerically using MATLAB.

3.2 Discrete Population Models

Populations change with time. In a given unit of time some new members will be born, and others
will die. It should be intuitive to say that a population in a year’s time (P_{n+1}) will depend on the
population now (P_n). Mathematically, we can use a difference equation to describe this relationship:

P_{n+1} = k P_n,

where k > 0 is some positive coefficient.


Example 3.2.1. Verify that this model works by calculating P_1, P_2, P_3, P_4 by hand. Start
with P_0 = 1000 and k = 1.1.
Solution

P_1 = k P_0 = 1.1 × 1000 = 1100
P_2 = k P_1 = 1.1 × 1100 = 1210
P_3 = k P_2 = 1.1 × 1210 = 1331
P_4 = k P_3 = 1.1 × 1331 = 1464 (4 s.f.)

Notes
1. This difference equation is an example of a linear difference equation. The unknown, P in this
example, only appears as terms raised to the first power.

2. As in P_0, n = 0 is conventionally used to describe what is happening at ‘time 0’. This is referred
to as the initial condition.

3. This kind of model is called an exponential model.
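The name comes from the fact that applying P_{n+1} = k P_n repeatedly gives P_n = k^n P_0, so the population is an exponential function of n. The sketch below checks this closed form against the step-by-step calculation of Example 3.2.1; the variable names are illustrative.

```matlab
% Compare the closed form P_n = k^n * P_0 with step-by-step iteration.
P0 = 1000; k = 1.1;
n = 0:4;

P_closed = P0 * k.^n;              % closed form, all n at once

P_iter = zeros(size(n));           % iterate the difference equation
P_iter(1) = P0;
for j = 1:4
    P_iter(j+1) = k*P_iter(j);
end

largest_gap = max(abs(P_closed - P_iter))
```

The two vectors agree up to rounding error, and the last entry of each is 1464.1, as found by hand.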



3.2.1 Evaluating using MATLAB

Example 3.2.2. Use a for loop in MATLAB to calculate P4 .

Solution
First, we’ll give the MATLAB commands, then we’ll describe them.

function out = simple_pop(P_0, k, numit)

P(1) = P_0; % Storing the initial value.

for n = 1:numit % Using a for loop
    P(n+1) = k*P(n); % Calculating the difference equation.
end

out = P(end); % This outputs the final population

end

>> simple_pop(1000,1.1,4)

ans =

1464.1

MATLAB gave the expected answer of 1464 (4 s.f.)

Description

1. We have stored P_0, P_1, . . . , P_4 in the array P, as P(1), P(2), . . . , P(5).

2. The statement for n=1:numit means

(i) let n have the values 1, 2, 3, 4, 5,. . . , numit;


(ii) for each value of n, execute the statements between the for and end statements. (In this
example, there is only one statement, but there can be many.)
(iii) numit represents the number of iterations we want to do. I.e. n will take on the value of
numit as it goes around the loop for the last time.

3. (i) The values in P are accessed using P(1), P(2), and so on; see also the discussion in Section 7.2.4.
MATLAB does not permit an array subscript that is zero or negative.
THIS IS IMPORTANT. This means P(0) is not permitted. So, we have stored our first
value in P(1) instead of P(0). Because we had to start our array at P(1) = P_0, the value
we want, P_4, is stored in P(5).
(ii) The last element in the array P is P(end).
(iii) We have written the difference equation as P(n + 1) = k P(n). As such, every iteration of
the for loop calculates the next population. It is for this reason that numit must be
one less than the index of the final array element we want. E.g. we chose numit=4 to
calculate P(5) as our last value.

What would happen if we wanted to change k?


Example 3.2.3. Use MATLAB to plot P for n = 0, 1, . . . , 4 for k = 1.1, k = 1, and k = 0.9 starting
with P0 = 1000.

Solution
Because we want to plot, it would be better for us if we altered our simple_pop.m function to output
the entire P vector. This is as simple as changing P(end) to P. E.g.

function out = simple_pop_vec(P_0, k, numit)

P(1) = P_0; % Storing the initial value.

for n = 1:numit % Using a for loop
    P(n+1) = k*P(n); % Calculating the difference equation.
end

out = P; % This outputs the entire population vector

end

Typing the following into prompt will give us the plots below

>> P = simple_pop_vec(1000,1.1,4);
figure
plot(0:4,P,'o-')
title('Population model with k=1.1')
grid
xlabel('n')
ylabel('P_n')

P = simple_pop_vec(1000,1,4);
figure
plot(0:4,P,'o-')
title('Population model with k=1')
grid
xlabel('n')
ylabel('P_n')

P = simple_pop_vec(1000,0.9,4);
figure
plot(0:4,P,'o-')
title('Population model with k=0.9')
grid
xlabel('n')
ylabel('P_n')
[Figure: three plots of P_n against n for n = 0 to 4, titled “Population model with k=1.1”, “Population model with k=1” and “Population model with k=0.9”.]

Note: Doing this in prompt means that we have to type out all of this code line by line. This is
usually an inefficient way of writing code. Some reasons for this are as follows:

1. You have to write out every line, every time.

2. If a mistake is made, you have to re-enter the line in question and potentially rewrite all the
following lines again.

3. If blocks of code are repeated, with or without differences, it is inefficient to rewrite the
same thing over and over.

4. The prompt doesn’t save your code.

3.2.2 Script Files

In MATLAB there is another type of save-able file that lets us do everything we would need to do in
prompt, but circumvents such tedious problems. These files are called Script files.

Script files are used as a ‘notepad’ version of the prompt window. We write down all of the instructions
we would want to do in prompt. But having them in a script file means that i) we don’t have to execute the
code line-by-line, ii) we can save our code for access later, iii) it is usually neater and easier to read,
iv) script files make it easier to deal with repeated code.

Note: Script files are created similarly to function files. They can be found under the new button in
the tool bar. For more information on creating and saving Script files refer to section 7.2.9.

Example 3.2.4. Use a MATLAB script file to solve Example 3.2.3.

Solution
We can write and save the following code as a Script file called simple_pop_script.m

%%% Setting parameters:
P_0 = 1000; % Initial Population
numit = 4; % Final year

%%% Solving for each value of k using our function file:
P_case1 = simple_pop_vec(P_0, 1.1, numit); % k = 1.1
P_case2 = simple_pop_vec(P_0, 1, numit); % k = 1
P_case3 = simple_pop_vec(P_0, 0.9, numit); % k = 0.9

%%% Plotting:
n = [0:4]; % The years for the x axis
plot(n,P_case1,'*-')
hold on
plot(n,P_case2,'*-')
plot(n,P_case3,'*-')

grid
legend('k=1.1','k=1','k=0.9')
xlabel('n')
ylabel('P_n','rotation',0)

To run this Script file we can either click the Run button on the toolbar or type the filename into
prompt. Typing the following into prompt tells MATLAB to go read and run all of the code in
simple_pop_script.m.

>> simple_pop_script

We then get the following plot.


[Figure: P_n plotted against n for n = 0 to 4, with one curve for each of k=1.1, k=1 and k=0.9.]

Our script file version may not look any shorter than what we would have typed into prompt. But it
is easier to read, edit, save and troubleshoot (and we don’t have to rewrite the same code three times
like before).

Note: From here on in this course book, code shown with a leading

>>

is typed into prompt, and code shown without the >> belongs in a script file.

3.2.3 Some exercises

In an animal park we initially have 5000 animals. After one year there are 6000. Use the exponential
growth model P_n = k P_{n-1} where n is measured in years.

1. Find the value of k.

2. Write a MATLAB script file to plot the population for 20 years.



3.3 A nonlinear difference equation for population growth

Example 3.3.1. Another difference equation for population growth is

P_n = P_{n-1} + k P_{n-1} (A − P_{n-1}),

where 0 < k ≪ 1 and A > 0. Use MATLAB to create a plot of P_n for n = 0, 1, . . . , 200. Use k = 0.001,
A = 100 and P_0 = 2.

Solution
In this example there are four parameters, k, A, P0 , and N . So we start by writing a function that
takes these four input parameters and outputs one vector, P .

function P = logistic_pop(k,A,P0,N)
% calculate population using logistic model
P = [P0]; % array P contains only P0 at first
for n = 1:N % update P until P contains N+1 values
    P(n+1) = P(n) + k*P(n)*(A - P(n));
end

We use the following MATLAB commands to evaluate the difference equation and plot the results

nfinal = 200;
P = logistic_pop(0.001,100,2,nfinal);
plot([0:1:nfinal],P,'+') % Plot the points
grid
xlabel('n')
ylabel('P_n', 'rotation', 0)

MATLAB produces the plot

[Figure: P_n against n for n = 0, ..., 200; the points rise from P_0 = 2 and level off at 100.]

You can see from the graph that $P_n$ levels off. This is more realistic than having the population
grow indefinitely.

Notes

1. The difference equation in this example is called a logistic equation.

2. The term $-k P_{n-1}^2$ in the equation is called a self-crowding term. As the population grows, the
self-crowding term gets larger in magnitude, and the increase in the population slows.

3. The variable A is called the carrying capacity. It indicates the ideal population that a particular
region can sustain comfortably.

4. The [0:1:nfinal] in the plot command plot([0:1:nfinal],P,'+') creates a vector with
the values 0, 1, 2, ..., nfinal. These are the values of n.

5. The '+' in the plot command plot([0:1:nfinal],P,'+') tells MATLAB to plot each point
as a +, and not to join the points up with a continuous line (we had a continuous line in previous
examples).

6. You can use other symbols besides + such as ., x or *. For more information, type help plot
in the Command window.

7. An important point to understand about this difference equation is why Pn levelled off. If you
look closely at
Pn = Pn−1 + kPn−1 (A − Pn−1 ),
you can see that if Pn−1 = A, the term k Pn−1 (A − Pn−1 ) will be zero. This implies Pn = Pn−1
and hence Pn+1 , Pn+2 etc. will all equal Pn−1 . This means the population has levelled off.
You should now be able to explain the general shape of the graph. If P0 is smaller than A, then
the term k P0 (A − P0 ) will be positive and P1 will be greater than P0 . In a similar way, you can
show that P2 > P1 , P3 > P2 , etc. However, the bigger Pn−1 is, the smaller (A − Pn−1 ) is and
after a while the increase in Pn slows. Eventually P levels off.

3.3.1 Some exercises

1. The maximum sustainable population for animals living in a particular area is 50,000. Write
a difference equation for the population, if there are initially 5,000. What extra information is
needed?

2. For the previous exercise, assume that initially the population growth is exponential with the
model
Pn = 1.1 Pn−1 .
Does this give any more information?

3.3.2 Long-term behaviour and fixed points

In Example 3.3.1, we used A = 100 and k = 0.001. The MATLAB plot shows that in the long term
the population will be 100. The long-term value for the population will be at a fixed point of the
difference equation. This is a value P so that when Pn−1 = P , then Pn = P also.
Example 3.3.2. Find the fixed points of the difference equation
Pn = Pn−1 + k Pn−1 (A − Pn−1 ).
Solution
In the difference equation, we substitute Pn−1 = P and Pn = P , which gives
P = P + k P (A − P ),
which is equivalent to
k P (A − P ) = 0.
The fixed points are the solutions of this equation, i.e., P = 0 and P = A.

When $k = 0.001$ and $A = 100$, the fixed points are 0 and 100. This agrees with the plot in
Example 3.3.1. It can be shown that, for these parameter values, no initial value with
$0 < P_0 < A + 1/k$ gives a solution which is zero in the long term.
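
As a quick numerical check of the fixed points, the sketch below evaluates the right-hand side of the difference equation at P = 0 and P = A, and then iterates from a small positive starting value. The anonymous function f and the starting value 0.01 are chosen here purely for illustration; they are not part of the course files.

```matlab
k = 0.001; A = 100;                  % parameters from Example 3.3.1
f = @(P) P + k.*P.*(A - P);          % right-hand side of the difference equation (illustrative name)

fixed_pts = [0, A];                  % the two fixed points
residual = f(fixed_pts) - fixed_pts  % zero at both points, since f(P) = P there

% A small positive starting value (assumed for illustration) moves away from 0
P = 0.01;
for n = 1:200
    P = f(P);
end
P                                    % close to A after 200 steps
```

The second part suggests that 0 is a fixed point that nearby solutions move away from, while A is one they move towards.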

To investigate the fixed points, we will first show Pn as a function of Pn−1 .

Example 3.3.3. Write down the function f so that Pn = f (Pn−1 ) for the difference equation in
Example 3.3.2. For k = 0.001 and A = 100, show a graph of Pn against Pn−1 and on the same axes
Pn = Pn−1 .
Solution
Pn = Pn−1 + kPn−1 (A − Pn−1 ) = f (Pn−1 ), where f (P ) = P + kP (A − P ) which in MATLAB can be
written:

function f = logistic_vec(k,A,P)
f = P + k.*P.*(A-P);

To plot the graph:

k=0.001; A=100;
P=[0:1:140];
f = logistic_vec(k,A,P);
plot(P,f,'-',P,P,'g--') % f is solid line, P_n=P_{n-1} is dashed line
grid
xlabel('P_{n-1}')
ylabel('P_n', 'rotation', 0)

If you look carefully, you will see that the graphs intersect at the fixed points. Why? Does this agree
with the fixed points in Example 3.3.2?
[Figure: P_n against P_{n-1} for P_{n-1} from 0 to 140, showing the curve P_n = f(P_{n-1}) (solid) and the line P_n = P_{n-1} (dashed).]

3.4 More on long-term behaviour

Example 3.4.1. Consider the difference equation

\[ x_k = 0.5\, x_{k-1} + 1. \tag{3.1} \]

Calculate and plot x1 , x2 , . . . x10 for the initial conditions (ICs)

(a) x0 = 1 and

(b) x0 = 4.

What do you notice?

Solution
We could use the following MATLAB commands:

function out = diffeqn1(x_0,nfinal)
x(1) = x_0;
for k = 1:nfinal % iterate nfinal times, so that x holds x_0 ... x_nfinal
    x(k+1) = 0.5*x(k) + 1;
end
out = x;

xc = [1,4]; % list the ICs as a vector
for j = 1:length(xc) % length(xc) determines the number of ICs
    x_0 = xc(j);
    x = diffeqn1(x_0,10); % using our function to find and store x_1 ... x_10
    subplot(1,2,j)
    plot([0:1:10],x,'*-') % plot x versus k
    xlabel('k')
    ylabel('x_k', 'rotation', 0);
end

The plots are below. Note that the solutions for both initial values are tending to 2.
[Figure: two subplots of x_k against k, for x_0 = 1 (left) and x_0 = 4 (right).]

Notes
1. The plots for both the initial values have been produced in one run of the MATLAB code by
storing these initial values in a vector and using the outer for loop to go through the initial
values.

2. The two graphs have appeared as one plot with two subplots. The MATLAB statements
subplot(1,2,j) and plot([0:1:10],x,'*-') break the Figure window into a 1 by 2 matrix of smaller
plots and plot x in plot j. To find out more, type help subplot. These MATLAB statements
are not examinable.

It is possible to investigate the long-term behaviour without calculating a list of iterates. We can
determine the long-term behaviour by graphical means using a cobweb diagram, as described below.

3.4.1 Steps to draw a cobweb diagram

To draw a cobweb diagram and investigate the behaviour of

\[ x_n = g(x_{n-1}), \quad x_0 = a: \]

1. On the same axes, draw y = x and y = g(x).

2. Start at x = a on the horizontal axis. This represents x0 .

3. Find x1 by moving vertically to the curve y = g(x). The vertical component of this point is x1 .

4. To find x1 on the horizontal axis, move horizontally to the line y = x, and then vertically to the
horizontal axis.

5. Repeat steps 3 and 4 to get x2 , x3 , etc.
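
The steps above can be sketched in MATLAB as follows. This is only one possible way to draw the diagram: the map g is taken from equation (3.1), while the variable names cx and cy, the starting value a = 4, and the plotting range are choices made here for illustration. The path goes from the line y = x straight back to the curve rather than returning to the horizontal axis each time.

```matlab
g = @(x) 0.5*x + 1;      % map for x_n = g(x_{n-1}), taken from equation (3.1)
a = 4;                   % initial value x_0 (illustrative choice)
nsteps = 10;             % number of iterates to draw

x = a;
cx = a; cy = 0;          % the cobweb path starts at (x_0, 0) on the horizontal axis
for n = 1:nsteps
    y = g(x);            % the next iterate
    cx = [cx, x, y];     % step 3: move vertically to the curve at (x, g(x)),
    cy = [cy, y, y];     % step 4: then horizontally to the line y = x at (g(x), g(x))
    x = y;               % continue from the new iterate
end

xs = linspace(0, 7, 200);
plot(xs, xs, 'k-', xs, g(xs), 'b-', cx, cy, 'r-')
grid
legend('y = x', 'y = g(x)', 'cobweb path')
xlabel('x_{n-1}')
ylabel('x_n', 'rotation', 0)
```

Changing g and a lets the same sketch be reused for other difference equations of the form x_n = g(x_{n-1}).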

Example 3.4.2. Draw a cobweb diagram for the difference equation

\[ x_k = 0.5\, x_{k-1} + 1 \]

and investigate the long-term behaviour for various values of x0 .

Solution
On the same set of axes, we draw y = 0.5 x + 1 to represent the difference equation, and also y = x.

[Figure: cobweb diagram of x_k against x_{k-1}, showing the lines y = x and y = 0.5 x + 1.]
Then we start at the initial value, say x0 = 4, on the x-axis and move vertically to the line y = 0.5 x+1
which represents the difference equation. The y coordinate of this point will be x1 ; in order to locate
it on the x-axis, move horizontally to the line y = x. This point will have x and y coordinates equal
to x1 and we can then move vertically to y = 0.5 x + 1 to find x2 .

Notice that for x0 = 4, xk is moving closer and closer to 2, which was what we expected above. What
will happen if x0 = 1? 

Example 3.4.3. Use a cobweb diagram to investigate the long-term behaviour of the difference
equation
\[ x_n = x_{n-1}^2, \]

for each of the initial conditions (a) x0 = 0.9; and (b) x0 = 1.1.

Solution
First draw $y = x^2$ and $y = x$ on the same axes. Then start at (a) x = 0.9 and follow the solid line.
For (b) start at x = 1.1 and follow the dashed line.
[Figure: cobweb diagram of x_n against x_{n-1}, showing y = x^2 and y = x, with the path from x_0 = 0.9 (solid) and the path from x_0 = 1.1 (dashed).]

From following the solid line, we can see that if x0 = 0.9 then the solution tends to zero. However,
looking at the dashed line, it is apparent that if x0 = 1.1 then the solution tends to infinity. 

3.4.2 Exercises

1. Determine the long-term behaviour of the solution to

xn = 0.8xn−1 + 2,

for initial values (a) x0 = 20, (b) x0 = 0, (c) x0 =10.

[Figure: axes for the cobweb diagram, with the horizontal axis running from 0 to 20.]

2. Determine the long-term behaviour of the solution to

\[ x_n = \frac{3 x_{n-1}}{x_{n-1} + 1}, \]

for initial values (a) $x_0 = 1$, (b) $x_0 = 4$, (c) $x_0 = 2$.

[Figure: axes for the cobweb diagram, with the horizontal axis running from 0 to 4.]

3.5 Discrete logistic equation with a parameter

In Section 3.3, we used a nonlinear difference equation to model population,

Pn = Pn−1 + kPn−1 (A − Pn−1 ).

This is a good model to use if there are limited resources for the population.

We can rewrite this in a different form.


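\begin{align*}
P_n &= P_{n-1} \bigl(1 + k(A - P_{n-1})\bigr) \\
    &= k P_{n-1} \Bigl(A + \frac{1}{k} - P_{n-1}\Bigr) \\
    &= k M P_{n-1} \Bigl(1 - \frac{P_{n-1}}{M}\Bigr) = q P_{n-1} \Bigl(1 - \frac{P_{n-1}}{M}\Bigr),
\end{align*}

where $M = A + 1/k$ and $q = kM$.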

As an alternative to considering the population Pn , we can consider the proportion xn = Pn /M .

Example 3.5.1. Write a difference equation for xn = Pn /M .

Solution
First note that Pn = M xn . Then

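\begin{align*}
x_n &= P_n / M \\
    &= \frac{q}{M} P_{n-1} \Bigl(1 - \frac{P_{n-1}}{M}\Bigr) \\
    &= \frac{q}{M} M x_{n-1} \Bigl(1 - \frac{M x_{n-1}}{M}\Bigr) \\
\Rightarrow\quad x_n &= q x_{n-1} (1 - x_{n-1}).
\end{align*}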

This difference equation is called the discrete logistic equation. Note that it has only one parameter,
q, but the original difference equation had two parameters, A and k. 
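
The change of variables can also be checked numerically. The sketch below iterates the original equation for P_n and the scaled equation for x_n side by side and compares P_n / M with x_n. The parameter values are those of Example 3.3.1; the number of steps N = 50 and the name max_diff are arbitrary choices made for this illustration.

```matlab
k = 0.001; A = 100; P0 = 2;   % values from Example 3.3.1
N = 50;                       % number of steps (illustrative choice)
M = A + 1/k;                  % scaling constant
q = k*M;                      % the single parameter of the scaled equation

P = P0; x = P0/M;
for n = 1:N
    P(n+1) = P(n) + k*P(n)*(A - P(n));   % original logistic difference equation
    x(n+1) = q*x(n)*(1 - x(n));          % discrete logistic equation
end

max_diff = max(abs(P/M - x))  % only round-off if the two forms agree
```

With these values M = 1100 and q = 1.1.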

Example 3.5.2. Use MATLAB to create a plot of xn for n = 0, 1, . . . , 200 where q = 1.1 and
x0 = 0.08.

Solution

function out = discrete_logistic(q,x_0,nfinal)
x = [x_0];
for n = 1:nfinal
    x(n+1) = q*x(n)*(1-x(n));
end
out = x;

nfinal = 200; % set the number of iterations
x = discrete_logistic(1.1,0.08,nfinal); % compute the iterations
plot(0:nfinal,x,'+-') % plot the graph
grid on
xlabel('n')
ylabel('x_n', 'rotation', 0)

The plot produced is

[Figure: x_n against n for n = 0, ..., 200, rising from 0.08 and levelling off just above 0.09.]

We note that the proportion increases and levels off at approximately 0.091.

Example 3.5.3. Repeat the previous example with x0 = 0.1

Solution
The plot produced is

[Figure: x_n against n for n = 0, ..., 200, falling from 0.1 and levelling off just above 0.09.]

This time the proportion decreases but still levels off at 0.091. But does the long-term behaviour of
x always look like this? We will investigate. 

Example 3.5.4. Repeat the plots for q = 1.5, 2, 3, 3.2, 3.5, 3.6 using x0 = 0.05. Also draw the
corresponding cobweb diagrams.

Solution
[Figures: for each of q = 1.5, 2, 3, 3.2, 3.5 and 3.6, a plot of x_n against n (n = 0, ..., 100) beside the corresponding cobweb diagram of x_n against x_{n-1}; both vertical axes run from 0 to 1.]
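
A minimal sketch for producing the time-series plots is given below; it does not draw the cobweb diagrams. The layout (a 2 by 3 grid of subplots) and the variable name qvals are choices made here for illustration, and the loop is written inline rather than calling a function file so that the script runs on its own.

```matlab
qvals = [1.5, 2, 3, 3.2, 3.5, 3.6];   % parameter values from Example 3.5.4
nfinal = 100; x0 = 0.05;

for j = 1:length(qvals)
    x = x0;
    for n = 1:nfinal
        x(n+1) = qvals(j)*x(n)*(1 - x(n));   % discrete logistic equation
    end
    subplot(2, 3, j)                         % one panel per value of q
    plot(0:nfinal, x, '+-')
    axis([0, nfinal, 0, 1])
    title(['q = ', num2str(qvals(j))])
    xlabel('n')
    ylabel('x_n', 'rotation', 0)
end
```

Comparing the panels shows how the long-term behaviour changes as q increases.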

3.5.1 Long-term behaviour and bifurcation diagrams

Can we describe the long-term behaviours we found in the examples above?

We can use an orbit diagram to summarise the information we have about the long-term behaviour
for different values of q. The long term behaviour of x is given on the vertical axis.

[Figure: orbit diagram showing the long-term values of x (vertical axis, 0.3 to 0.9) against q (horizontal axis, 2.8 to 3.5).]

For more information on the discrete logistic equation see section 8.1 of the Maths 260 textbook,
Differential Equations by Blanchard, Devaney and Hall (3rd edition).

Bifurcation diagram using Matlab (not examined)

The Matlab script file that produced the above diagram is

nfinal = 150;
values_used = 20;
clf

hand2 = subplot(1,2,2); % orbit diagram
hold(hand2,'on')
hand1 = subplot(1,2,1); % time series

q = 2.8:.008:3.59;

pause
for j = 1:length(q) % repeat for each value of q
    hold(hand1,'off')
    P = [1/2]; % calculate P, starting at critical point
    for n = 1:nfinal
        P(n+1) = q(j)*P(n)*(1-P(n)); % find P_n for q_j
    end
    plot(hand1,1:nfinal,P(1:nfinal),'+-') % plot P_n using + to show pts
    grid(hand1,'on')
    axis(hand1,[1,nfinal,0.3,1])
    title([' q=',num2str(q(j))],'FontSize',14) % title shows q
    xlabel('n','FontSize',16)
    ylabel('x', 'Rotation', 0, 'FontSize',16)
    bv = P(end-values_used:end); % Take last values to see the long term
    hold on

    for i = 1:length(bv)
        plot(hand2,q(j),bv(i),'*','MarkerSize',5) % plot long-term values
    end
    grid(hand2,'on')
    box(hand2,'on')
    axis(hand2, [2.8,3.6,0.3,1])
    xlabel(hand2, 'q','FontSize',16)
    ylabel(hand2, 'x', 'Rotation', 0, 'FontSize',16)
    if j==3 || j==6 || j==29 || j==87 % put in pauses to watch diagram appear
        pause
    else
        pause(0.1)
    end

end

3.6 Fibonacci and his rabbits

In 1202, Leonardo of Pisa, also known as Fibonacci, published a simple population model for an
isolated group of rabbits.

Fibonacci used the following assumptions to derive his model:

• each mature pair of rabbits produces a pair of baby rabbits each month

• each pair of baby rabbits takes 1 month to mature

• each pair always consists of 1 male and 1 female rabbit

• rabbits never die

[Image: Smudge Taylor (also known as Bunny). Image credit: Stephen Taylor.]

The population begins with just 1 pair of newly born rabbits (1 male, 1 female). Let Fn be the number
of pairs of rabbits at the end of the nth breeding season (that is the nth month). Then F0 = 1 and
F1 = 1, because the first pair of baby rabbits needs one month to mature before they start breeding.
After two months, the first pair will have produced a pair of baby rabbits, so including the mature
pair, there are now two pairs of rabbits and F2 = 2.

One month later, the baby rabbits will have matured and the mature pair will have produced another
pair of baby rabbits, which means there are now three pairs of rabbits and F3 = 3. At the end of
month four, the new baby rabbits will have matured and the two mature pairs will each have produced
a pair of baby rabbits, so the total is now F4 = 5; and so on.

Fibonacci found that he could write the model as

Fn = Fn−1 + Fn−2 , n ≥ 2.

In words, the number of pairs of rabbits at the end of the nth breeding season is equal to the sum of the
number at the end of the (n − 1)st and the number at the end of the (n − 2)nd season. The above
equation is an example of a difference equation.

Example 3.6.1. Verify that Fibonacci's model works by calculating $F_2, F_3, \ldots, F_7$ by hand. As
above, start with $F_0 = 1$ and $F_1 = 1$.

Solution
We already found $F_2$, $F_3$ and $F_4$. The formula gives:

\begin{align*}
F_2 &= F_1 + F_0 = 1 + 1 = 2, \\
F_3 &= F_2 + F_1 = 2 + 1 = 3, \\
F_4 &= F_3 + F_2 = 3 + 2 = 5.
\end{align*}

From the discussion above, we know that F4 is composed of three mature pairs and two baby pairs.
The baby pairs mature the following month and the mature pairs each produce a baby pair. Hence,
F5 is composed of five mature pairs and three baby pairs, so F5 = 8. Then F6 will be composed of
eight mature pairs and five baby pairs, so F6 = 13, and F7 will be composed of thirteen mature pairs
and eight baby pairs, giving F7 = 21. Using the formula, we find:

\begin{align*}
F_5 &= F_4 + F_3 = 5 + 3 = 8, \\
F_6 &= F_5 + F_4 = 8 + 5 = 13, \\
F_7 &= F_6 + F_5 = 13 + 8 = 21.
\end{align*}

3.6.1 Evaluating using MATLAB

Example 3.6.2. Use a for loop in a MATLAB function that solves for Fk . Use your function to
calculate F7 .

Solution

function out = fib(ICs, nfinal)
F = [ICs];
for n = 2:1:nfinal
    F(n+1) = F(n) + F(n-1);
end
out = F(nfinal + 1);

>> fib([1,1], 7)

ans =

21

MATLAB gave the expected answer of 21. 

Notes:

1. MATLAB does not permit an array subscript that is zero or negative. This means F(0) is
not permitted; we have to store $F_0$ in F(1) instead. As a consequence, $F_n$ is stored in F(n+1).
Thus, the last element in the array F is F(nfinal + 1); you can also access the last element as
F(end).

3.6.2 A lot more rabbits

Example 3.6.3. Calculate F314 .

Solution
Calculating F314 by hand would be boring and take too long. We modify the MATLAB statements
in the previous example to

>> fib([1,1], 314)

ans =

3.0312e+65

MATLAB quickly calculates $F_{314}$ as approximately $3.0312 \times 10^{65}$. That's a lot of rabbits.
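
A caveat worth keeping in mind: MATLAB stores these numbers in double precision, so very large Fibonacci numbers are only approximations. The sketch below repeats the loop inline and uses the built-in flintmax (the largest consecutive integer a double can represent) to find where the values first exceed that limit. The variable name n_big is introduced here for illustration.

```matlab
F = [1, 1];                          % F_0 and F_1, stored in F(1) and F(2)
for n = 2:314
    F(n+1) = F(n) + F(n-1);
end

n_big = find(F > flintmax, 1) - 1    % smallest n with F_n above flintmax (subtract 1 because F_n is in F(n+1))
F(end)                               % F_314, reliable to roughly 16 significant digits only
```

Beyond n_big the sums may be rounded, which does not matter for an estimate such as the one above.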



3.7 Plutonium-239 - Decay

Example 3.7.1. Plutonium-239 (Pu-239) is a radioactive substance that decays to Uranium-235 by
emitting an alpha particle. The half-life of Pu-239 is approximately 24,000 years. This means that if
you start with a pile of Pu-239, the amount of Pu-239 left in the pile after 24,000 years will be half of
what it was at 0 years. Continuing on, the amount left after 48,000 years will be one-quarter of what
it was at 0 years, and so on.

Let $P_n$ be the amount of Pu-239 left after n years. $P_n$ satisfies (approximately) the difference equation

\[ P_n = (0.999971119284533) P_{n-1}. \]

i) Estimate the amount of Pu-239 left after one million years if $P_0 = 1 \times 10^{12}$.

Solution
Recognising that this difference equation is very similar to our previous examples, we can solve it in
MATLAB using the functions we created before with some very small modifications.

function out = decay(P_0, k, nfinal)
P(1) = P_0; % Storing the initial value P_0 in P(1).
for n = 1:nfinal % Using a for loop
    P(n+1) = k*P(n); % Calculating the difference equation; P_n is stored in P(n+1).
end
out = P(end); % This outputs the final amount, P_nfinal
end

>> decay(1*10^12, 0.999971119284533, 1*10^6)

ans =

0.2865

ii) Estimate the amount of Pu-239 left after one million years for any P0 .

Solution
Here we do not know $P_0$, so we don't have enough information to compute a solution in MATLAB.
Thus, our first step is to find the solution to the difference equation. We can notice that if $P_n = k P_{n-1}$,
then $P_{n-1} = k P_{n-2}$. This must then mean that

\begin{align*}
P_n &= k P_{n-1} \\
    &= k (k P_{n-2}) = k^2 P_{n-2} \\
    &= k^2 (k P_{n-3}) = k^3 P_{n-3} \\
    &= k^4 P_{n-4} \\
    &\;\;\vdots \\
    &= k^n P_0.
\end{align*}

So we can recognise that the solution to our decay problem is

\[ P_n = (0.999971119284533)^n P_0. \]

The amount of Pu-239 left after one million years will then be

\[ (0.999971119284533)^{1000000} P_0 \approx 2.86 \times 10^{-13} P_0. \]

In words, after one million years, a fraction of about $2.86 \times 10^{-13}$ of the original amount of Pu-239 will be left.

Notes

1. How do we get the constant 0.999971119284533 in the difference equation? Let the value be $k$
so that
\[ P_n = k P_{n-1}. \]
Since the half-life is 24,000 years, we know that
\[ P_{24000} = \tfrac{1}{2} P_0. \]

We also know from the difference equation that

\[ P_{24000} = k^{24000} P_0. \]

Therefore,

\begin{align*}
k^{24000} &= \tfrac{1}{2} \\
\Leftrightarrow\quad k &= \left(\tfrac{1}{2}\right)^{1/24000} \\
\Leftrightarrow\quad k &= 0.999971119284533.
\end{align*}

Note that this is the value of k correct to 15 decimal places.
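
The constant can be reproduced directly at the prompt or in a short script. The sketch below computes k from the half-life and then checks it in two ways; the variable names half_life, check and fraction_left are introduced here for illustration.

```matlab
format long
half_life = 24000;               % half-life of Pu-239 in years
k = (1/2)^(1/half_life)          % decay factor per year

check = k^half_life              % should be 0.5 up to round-off
fraction_left = k^1e6            % fraction of P_0 left after one million years
```

The last value agrees with the closed-form solution: it is the factor multiplying P_0 when n is one million.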



3.8 Money in the bank

Example 3.8.1. Let Pn be the amount of money in an account at the beginning of the (n + 1)-st
year. If the money earns i percent interest each year and this interest is added to the account at the
end of the year, write down a difference equation for Pn . You may assume the interest is not taxed
and there are no bank fees.

Solution
You should be able to write down the difference equation by inspection. It is
\[ P_n = \left(1 + \frac{i}{100}\right) P_{n-1}. \]

Example 3.8.2. Solve the difference equation in the previous example.

Solution
Based on our work from the previous section we should be able to recognise the solution to be

\[ P_n = P_0 \left(1 + \frac{i}{100}\right)^n. \]

3.8.1 Using MATLAB

Example 3.8.3. Suppose in the previous example, P0 = 1000 and i = 8. Use MATLAB to find P20 .

Solution
We need to evaluate
\[ P_{20} = 1000 \left(1 + \frac{8}{100}\right)^{20}. \]
The MATLAB command

>> 1000*(1 + 8/100)^20

ans =

4.6610e+03

gives the answer, which is 4.660957143849309e+03 to full precision. Hence, at the beginning of the twenty-first year
there will be $4660.96 in the account. (Here, we assume that the effect of round-off error for the
annual payments rounded to cents can be ignored.)
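
As a cross-check, the same value can be obtained by iterating the difference equation year by year and comparing with the closed-form solution. In this sketch the interest rate is stored in a variable called i_pct (a name chosen here to avoid clashing with MATLAB's built-in imaginary unit i); closed_form is likewise an illustrative name.

```matlab
P0 = 1000; i_pct = 8; N = 20;        % values from Example 3.8.3
P = P0;
for n = 1:N
    P(n+1) = (1 + i_pct/100)*P(n);   % difference equation from Example 3.8.1
end
closed_form = P0*(1 + i_pct/100)^N;  % solution from Example 3.8.2

P(end)                               % amount at the beginning of the twenty-first year
abs(P(end) - closed_form)            % the two approaches differ only by round-off
```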

3.8.2 Some exercises

1. Repeat Example 3.8.3 with tax at 30%.

2. Write the difference equation for money owing on a mortgage if you are charged 10% interest each
year and pay off $20000 each year. Assume that the interest is charged and the payment made
at the end of each year. Can this difference equation be solved with the analytical technique we
used in Example 3.8.2?

3.9 Loans

In Section 3.8, we considered money in the bank and interest. We will now look at loans on which
interest is charged.

Example 3.9.1. Write the difference equation for money owing on a $200,000 loan if you are charged
10% interest each year and pay off $24,000 each year.

Note: Here we are assuming that the interest is paid only once a year and the repayments are also made only
once a year and at the same time. This will not be realistic for a house mortgage, but the same ideas can be
used.

Solution
Let An be the amount owing after n years. Then the amount after n + 1 years will be the amount
that was owing after n years, plus the interest on that amount, less the money paid off. We can write
this as the difference equation

\[ A_{n+1} = A_n + 0.1 A_n - 24000 = 1.1 A_n - 24000. \tag{3.2} \]

We could write this more generally as

\[ A_{n+1} = \left(1 + \frac{r}{100}\right) A_n - p, \tag{3.3} \]

where $r$ is the interest rate as a percentage and $p$ is the money paid off each year. We can use MATLAB to
plot $A_n$. We do not know how long it will take to pay off the loan so we guess and calculate up to
$A_{30}$.

function out = loan(r, p, A_0, nfinal)
A = [A_0];
for n = 1:nfinal
    A(n+1) = (1 + r/100)*A(n) - p;
end
out = A;

A = loan(10, 24000, 200000, 30);
plot(0:30,A)
grid
title('Mortgage')
xlabel('Years')
ylabel('Amount owing ($)')

The plot is

[Figure: "Mortgage", amount owing ($) against years, for years 0 to 30.]

The loan is paid off as soon as the graph dips below zero on the y-axis. Hence, it takes about
19 years to clear the balance.

Example 3.9.2. Allison would like to borrow $200,000 but she doesn’t know if she can afford the
repayments. Use MATLAB to plot the amount owing if she repays $15,000 per year. Do this for the
interest rates 5%, 6%, 7.5%, and 8%. Use the plots to estimate how long before the loan would be
repaid for each interest rate.

Solution
We will modify the script file for the last example.

A = loan(5, 15000, 200000, 30);


plot(0:30,A)
grid
title('Mortgage')
xlabel('Years')
ylabel('Amount owing ($)')
The plot produced is

[Figure: "Mortgage", amount owing ($) against years, for years 0 to 30, with r = 5.]

The loan will be repaid in about 22 years. We will now investigate what happens for some higher
interest rates. Change r into 6 then 7.5 then 8.

[Figures: three "Mortgage" plots of amount owing ($) against years, for years 0 to 30, one for each interest rate.]

• r = 6: The loan is repaid in about 27 years.

• r = 7.5: The amount owing is constant. Why?

• r = 8: The loan increases and is never repaid.

We can summarise these results in a table.

Interest rate (%)    Long-term behaviour
5                    Repaid in about 22 years
6                    Repaid in about 27 years
7.5                  Amount owing does not change
8                    Amount owing increases


Example 3.9.3. Alan wants to buy a house for which he would need to borrow $200,000. The interest
rate is 7.25% per year and he can pay off $15,000 each year. Can he afford the house?

Solution
We have used r as a parameter of the difference equation and saw that the type of long-term behaviour
depended on r. For r < 7.5, we notice that the loan is paid off; for r = 7.5 it is not paid off. At 7.25%
he can pay it off. Are we making any assumptions here? 
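
To see how long the repayment would actually take, equation (3.3) can be iterated until the amount owing is no longer positive. The sketch below does this inline; the cap of 200 years and the name break_even_rate are assumptions made for this illustration.

```matlab
r = 7.25; p = 15000; A = 200000;   % values from Example 3.9.3
n = 0;
while A > 0 && n < 200             % stop once repaid; the 200-year cap is an arbitrary safeguard
    A = (1 + r/100)*A - p;         % equation (3.3)
    n = n + 1;
end
n                                  % number of yearly payments needed

break_even_rate = 100*p/200000     % rate at which one year's interest equals the repayment
```

With these numbers the loop needs 49 yearly payments, which bears on the question of what assumptions are being made. The break-even rate of 7.5% is the rate at which the amount owing stays constant.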

3.10 Systems of Difference Equations

3.10.1 Red Blood Cells

(from Mathematical Models in Biology by L. Edelstein-Keshet)

Example 3.10.1. Red blood cells (RBC) carry oxygen around the body. They are constantly being
destroyed and replaced. We want the number to be maintained at some fixed level. We assume that
the spleen filters out and destroys a constant fraction of RBC and the bone marrow produces a number
proportional to those lost on the previous day.

Define:

R_n: number of RBC in circulation on day n
M_n: number of RBC made by bone marrow on day n
f: constant fraction filtered out by spleen each day
c: proportional constant (number produced per number lost)

The numbers can be modelled by a system of difference equations:

\begin{align*}
R_{n+1} &= (1 - f) R_n + M_n, \\
M_{n+1} &= c f R_n.
\end{align*}

(a) Explain the terms in the equations based on the assumptions listed above.

(b) Write a MATLAB script file to plot M and R over time for (i) $R_0 = 10^6$, $M_0 = 100$ and (ii)
$R_0 = 10^6$, $M_0 = 1000$. Use f = 0.5, c = 1. What do you notice about the plots?

Solution
(a) The spleen filters out a fraction f of $R_n$ on day n, which means that $(1 - f) R_n$ remains. This
is supplemented by the amount $M_n$ produced by the bone marrow. Hence, the number of RBC
on day n + 1 is the sum of these two. The amount produced by the bone marrow for the next
day is proportional to the amount $f R_n$ of RBC lost on the current day n; the constant of proportionality is c.

(b)
function [R,M] = rbc(f, c, ICs, nfinal) %Note: ICs are input as a vector
R = ICs(1); %Getting ICs from the input vector
M = ICs(2);
for n = 1:nfinal %Update both equations at the same time
    R(n+1) = (1-f)*R(n) + M(n);
    M(n+1) = c*f*R(n);
end

nfinal = 20;
[R,M] = rbc(0.5,1,[1e6,100],nfinal); %Inputting the parameters and ICs
plot(0:nfinal,R,'r-',0:nfinal,M,'b--')
legend('R','M') %Creates a legend to identify the curves
title('f = 0.5 and c = 1')
xlabel('n')

Change the initial conditions from [1e6,100] to [1e6,1000] to plot case (ii).

The left-hand plot is (i) $R_0 = 10^6$, $M_0 = 100$, and the right-hand one is (ii) $R_0 = 10^6$, $M_0 = 1000$.

[Figure: two panels, each titled "f=0.5 and c=1", showing R and M against n for n = 0, ..., 20.]
Notice that the initial value of M makes little difference to the plot. 

Notes

1. The for loop contains two command lines; you can have as many as you like in a for loop,
which is why you must indicate where it stops by writing end.

2. In the function rbc we have two outputs R and M. In MATLAB we use a vector [R,M] as output.
You can have as many outputs as you like in a function; just put them in the output vector.

3. Initial conditions in MATLAB functions can be input as single variables or in vectors. Here f
and c are single inputs, but ICs expects a vector containing both initial conditions.

4. The plot command put the graphs of Rn and Mn on the same set of axes.

5. The 'r-' in the plot command told MATLAB to draw the graph for R_n as a continuous line
in red. The 'b--' in the plot command told MATLAB to draw the graph for M_n as a dashed
line in blue.

6. If we had said 'b+' instead of 'b--', MATLAB would have plotted M_n as blue points with a
+ at each point.

7. If we had said 'b+-' instead of 'b--', MATLAB would have plotted M_n as a continuous blue
curve with a + added at each point.

8. Since the y-axis represents both Rn and Mn , we use the legend command instead of labelling
the y-axis.

Example 3.10.2. The plots below are for $R_0 = 10^6$ and $M_0 = 1000$, with different values of f and c
as shown in the titles. What would be a suitable value of c to keep the number of RBC at a constant
level?
Solution
[Figure: four panels showing R and M against n for n = 0, ..., 20, titled "f=0.8 and c=1", "f=0.3 and c=1", "f=0.5 and c=1.1" and "f=0.5 and c=0.9".]
The value of c = 1 seems to keep it constant. To be sure, try it with different values of f . 
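
One way to see why c = 1 is special: adding the two difference equations gives R_{n+1} + M_{n+1} = R_n + M_n + (c − 1) f R_n, so the total R + M is unchanged from day to day exactly when c = 1. At a steady level we also need M = f R, which suggests the long-term value R = (R_0 + M_0)/(1 + f). The sketch below tests this prediction for a few values of f; the names fvals, R_final and R_predicted, and the choice of 200 days, are illustrative assumptions.

```matlab
c = 1; R0 = 1e6; M0 = 1000;     % c = 1 with the initial values of Example 3.10.2
nfinal = 200;                   % number of days (illustrative choice)
fvals = [0.3, 0.5, 0.8];        % values of f to try

for j = 1:length(fvals)
    f = fvals(j);
    R = R0; M = M0;
    for n = 1:nfinal
        Rnew = (1 - f)*R + M;   % both updates use the values from day n
        M = c*f*R;
        R = Rnew;
    end
    R_final(j) = R;
    R_predicted(j) = (R0 + M0)/(1 + f);   % steady level predicted for c = 1
end
[R_final; R_predicted]          % the two rows should agree closely
```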

Just as for the linear growth examples, linear systems always have relatively simple long-term be-
haviour: the values tend to 0; the values tend to infinity; or the parameters are special such that the
values remain constant. More interesting behaviour can be obtained in models including nonlinear
terms. The following two sections give examples of systems of nonlinear equations.

3.10.2 Predator-prey model

Example 3.10.3. The predator-prey model gives the populations, at a discrete set of times, of two
species interacting with each other. One species is the predators. They consume the prey and if there
were no prey, the predators would die off. The other species is the prey. They are consumed by the
predators and if there were no predators, the number of prey would grow indefinitely.

Let $x_n$ and $y_n$ be the number of predators and prey at the nth time interval. One set of difference
equations for $x_n$ and $y_n$ is

\begin{align*}
x_n &= x_{n-1} - a x_{n-1} + b x_{n-1} y_{n-1}, \\
y_n &= y_{n-1} + c y_{n-1} - d x_{n-1} y_{n-1},
\end{align*}

where the constants a, b, c and d are all positive. Can you tell from the equations which is the predator
and which the prey?

Suppose a = b = c = d = 0.005, x0 = 2, and y0 = 4, where x and y are in units of one thousand. Use
MATLAB to produce a plot of xn and yn .

Solution

function [x,y] = pred_prey(pars, x_0, y_0, nfinal)
a = pars(1); b = pars(2); c = pars(3); d = pars(4); % constants
x = [x_0]; y = [y_0]; % initial populations
for n = 1:nfinal
    x(n+1) = x(n) - a*x(n) + b*x(n)*y(n); % predators
    y(n+1) = y(n) + c*y(n) - d*x(n)*y(n); % prey
end

nfinal = 5000;
[x, y] = pred_prey([0.005,0.005,0.005,0.005], 2,4,nfinal);
plot([0:nfinal],x,'b-',[0:nfinal],y,'g--')
legend('predator','prey')
xlabel('n')

MATLAB produced the plot

[Figure: predator and prey populations against n for n = 0, ..., 5000; both curves oscillate.]

Notes

1. You can see from the plot that xn and yn oscillate with time. When the number of prey (yn ) is
large, the predators (xn ) will have a plentiful supply of food and xn will increase. As xn increases,
more and more prey will be consumed and after a while yn will start decreasing. Eventually yn
will get so small that the number of predators will start decreasing. When xn becomes small,
the number of prey will start increasing, and so on.
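
The oscillations take place around a point where neither population changes. Setting x_n = x_{n-1} and y_n = y_{n-1} in the model (with both populations non-zero) gives y = a/b and x = c/d. The sketch below checks this for the parameter values used above; the names xs, ys, x_next and y_next are introduced here for illustration.

```matlab
a = 0.005; b = 0.005; c = 0.005; d = 0.005;   % constants from Example 3.10.3
xs = c/d;                        % predator level at which the prey do not change
ys = a/b;                        % prey level at which the predators do not change

x_next = xs - a*xs + b*xs*ys;    % one step of the predator equation
y_next = ys + c*ys - d*xs*ys;    % one step of the prey equation
[x_next - xs, y_next - ys]       % both differences are zero up to round-off
```

For these constants the point is x = 1, y = 1 (in units of one thousand), whereas the example starts at x_0 = 2, y_0 = 4.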

Confirmation in the wild

One activity of the Hudson’s Bay Company was buying furs from trappers. The plot
below, taken with permission from

http://www.globalchange.umich.edu/globalchange1/current/lectures/predation/predation.html

gives the number of snowshoe rabbits and one of their main predators, the Canada lynx,
bought for most years from the 1840s to the 1930s. A predator-prey cycle of eight to ten
years is readily observed.

3.10.3 An exercise

The difference equations below model the populations of foxes and rabbits, measured in thousands.
The foxes kill the rabbits for food.
\begin{align*}
x_{n+1} &= x_n + 0.03 x_n - 0.001 x_n y_n, \\
y_{n+1} &= y_n - 0.02 y_n + 0.005 x_n y_n.
\end{align*}

Use this model to answer the following questions.

(a) Which population does x represent and which population does y represent? Give a reason for
your answer.

(b) Describe briefly what will happen to:

(i) the fox population if there are no rabbits;

(ii) the rabbit population if there are no foxes.

(c) If there are initially 15,000 foxes and 25,000 rabbits, how many of each will there be after 2
years?

3.10.4 Epidemics

In this section we will look at a model for the spread of diseases. A population will be divided into
three groups:
• Those who are susceptible to the disease - people who could catch the disease from an infected
person.
• Those who are infectious with the disease - these people have the disease and can infect suscep-
tible people.
• Those who are recovered from the disease - they are now immune to the disease.

Let Sn be the number of people susceptible at time n, In be the number infectious at time n and Rn
be the number recovered at time n.
Example 3.10.4. Use the following assumptions to write expressions for these values at time n in
terms of their values at time n − 1. Between time n − 1 and n, we will assume that:
• the number of susceptibles who are infected is proportional to Sn−1 In−1 ;
• a fixed proportion of the infected people will recover.
Solution
Between the time n − 1 and n, the number of people infected will be β Sn−1 In−1 for some constant β.
These must be subtracted from the susceptible group and added to the infectious group. A proportion
γ of the infectious will recover and they will be subtracted from the infectious group and added to the
recovered group. We can summarise this as

\[
\begin{cases}
S_n = S_{n-1} - \beta\, S_{n-1} I_{n-1},\\
I_n = I_{n-1} + \beta\, S_{n-1} I_{n-1} - \gamma\, I_{n-1},\\
R_n = R_{n-1} + \gamma\, I_{n-1}.
\end{cases}
\]

Example 3.10.5. Suppose that β = 0.001, γ = 0.005, S0 = 200, I0 = 1, and R0 = 0. Use MATLAB
to produce a plot of S, I and R all on the same axes.
Solution
Note that S, I and R in the equations are expressed in terms of their values at time n − 1. Think about
whether it makes a difference to define Sn in terms of Sn−1, or Sn+1 in terms of Sn, etc. You should
conclude that you can still write the MATLAB script files as before. To test your understanding of
working with arrays and the for loop in MATLAB, it is a good idea to try writing a script file that
uses time n − 1 on the right-hand side of the equations. The plot should be identical to the one
produced by the code below.

% --- function file (e.g. epidemic.m) ---
function [S,I,R] = epidemic(beta, gamma, ICs, nfinal)
% ICs = [S0, I0, R0] holds the initial conditions
S = ICs(1); I = ICs(2); R = ICs(3);
for n = 1:nfinal
    S(n+1) = S(n) - beta*S(n)*I(n);
    I(n+1) = I(n) + beta*S(n)*I(n) - gamma*I(n);
    R(n+1) = R(n) + gamma*I(n);
end

% --- script file that calls epidemic ---
nfinal = 200; % choose reasonable total time
[S,I,R] = epidemic(0.001,0.005,[200,1,0],nfinal);
plot(0:nfinal,S,0:nfinal,I,0:nfinal,R) % let MATLAB choose colours
legend('Susceptible','Infectious','Recovered')
title('Epidemic')
xlabel('n')
grid on

The plot produced is


[Figure: plot titled "Epidemic" of the Susceptible, Infectious and Recovered populations (vertical axis, 0 to 200) against n (horizontal axis, 0 to 200).]

Note: observe that Sn + In + Rn = Sn−1 + In−1 + Rn−1 . What does this tell you about the model,
and its behaviour?
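
The observation in this note can be checked numerically. The following is a minimal sketch, assuming the parameter values of Example 3.10.5; the variable names total and max_dev are introduced here for illustration only.

```matlab
% Iterate the epidemic model and track the total population S + I + R.
beta = 0.001; gamma = 0.005;       % parameters from Example 3.10.5
nfinal = 200;                      % number of time steps
S = 200; I = 1; R = 0;             % initial conditions
for n = 1:nfinal
    S(n+1) = S(n) - beta*S(n)*I(n);
    I(n+1) = I(n) + beta*S(n)*I(n) - gamma*I(n);
    R(n+1) = R(n) + gamma*I(n);
end
total = S + I + R;                 % total population at each time step
max_dev = max(abs(total - total(1)))   % largest deviation from the initial total
```

The deviation should be zero up to rounding error, which is consistent with the identity in the note.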

3.11 Review Exercises

Some assignment questions will be similar to the exercises below.

1. The Fibonacci sequence is defined by


Fn = Fn−1 + Fn−2 , n ≥ 2.

If F0 = 1 and F1 = 1, use the above difference equation to calculate F11 by hand.

2. Calculate F36 using a for loop in MATLAB.

3. Use

\[
F_n = \frac{5+\sqrt{5}}{10}\left(\frac{1+\sqrt{5}}{2}\right)^{n} + \frac{5-\sqrt{5}}{10}\left(\frac{1-\sqrt{5}}{2}\right)^{n}
\]

to find the exact values of F5 and F6.

4. Why is Fibonacci’s model an unrealistic model? You might like to read the short description of
the model in Wikipedia.

5. A person puts $23,000 in a bank account July 1, 2010. The account earns six percent interest
per year and the interest is added to the account on June 30 each year. How much money will
be in the account July 1, 2020? Assume the interest is not taxed and there are no bank fees.

6. Suppose that, for the previous exercise, the interest is taxed at a rate of 30 percent. If the tax is
paid on July 1 each year using money from the account, how much money will be in the account on
July 1, 2020?

7. A person puts $23,000 in a bank account July 1, 2010. Every July 1 thereafter, the person adds
$1000 to the account. The account earns six percent interest per year and the interest is added
to the account on June 30 each year. How much money will be in the account on July 1, 2020?
Assume the interest is not taxed and there are no bank fees (you may use MATLAB to answer
this question).

8. In the notes, we have the difference equation for Plutonium-239 as

Pn = (0.999971119284533) Pn−1 .

Does it make sense to give r (= 0.9999711 . . .) to fifteen digits?

9. Let Sn be the amount of Strontium-90 (Sr-90) in a pile after n years. Sn satisfies (approximately)
the difference equation
Sn = (0.97638) Sn−1 .
Suppose S0 = 2. Use MATLAB to calculate S100 .

10. Let Kn be the amount of Krypton-85 (Kr-85) in a pile after n years. Kn satisfies (approximately)
the difference equation
Kn = r Kn−1 .
Use the fact that the half-life of Kr-85 is 10 years to find a numerical value for r to five decimal places.

11. Newton’s Law of Cooling is given approximately by

Tn = Tn−1 − k (Tn−1 − Ta ).

We know this approximation can be used when Ta < T0 . Can the approximation be used when
Ta > T0 ?

12. Suppose k = 0.005, Ta = 0 and T0 = 20. Implement the difference equation

Tn = Tn−1 − k (Tn−1 − Ta )

in MATLAB as a for loop. Then use this loop to calculate T100 .

13. A hot piece of lead is dropped in a large tank of cold water. The temperature Tn of the lead
after n seconds satisfies (approximately) the difference equation

Tn = Tn−1 − 0.05 (Tn−1 − 15).

The initial temperature of the lead is 200 °C. Estimate the temperature of the lead after 10
seconds.

14. Consider the difference equation

Pn = Pn−1 + 0.001 Pn−1 (100 − Pn−1 ).

Use MATLAB to plot Pn for

(i) P0 = 50
(ii) P0 = 200
(iii) P0 = 100.

15. Draw a cobweb diagram for the population model

Pn = Pn−1 + k Pn−1 (A − Pn−1 ),

for k = 0.01 and A = 100. Use your diagram to describe the behaviour of Pn when (a) P0 = 20
and (b) P0 = 120.

16. Consider the epidemic model in Section 3.10.4.

(a) Write a MATLAB script file that plots a graph of S, I and R when S0 = 400, I0 = 4 and
R0 = 0 for β = 10^{-3}, γ = 5 × 10^{-3}.
(b) Use different values of β in your script file, to find a value that gives a peak value of
infectious people of less than 200.

17. Using the assumptions in Section 3.6 on page 52

(a) write a population model for the rabbits with three difference equations for mn , the number
of mature rabbits after n months, tn , the number of one month old rabbits after n months
and bn , number of newborn rabbits after n months;
(b) starting with a single pair of baby rabbits, calculate the populations for up to 10 months
(check correctness of your model by comparing the total population with the 10th Fibonacci
number);
(c) incorporate death in your model by letting a proportion dm of the mature rabbits, and a
proportion db of the baby rabbits die each month;
(d) let dm = 0.1, then find db such that the total population remains constant.
Chapter 4

Stochastic methods and stochastic modelling

Contents
4.1 Introduction
4.2 Randomness, probability and simulation.
    4.2.1 Randomness
    4.2.2 Probability
    4.2.3 Simulation for probability
4.3 Discrete and continuous random variables
    4.3.1 Discrete random variables and histograms
    4.3.2 Continuous random variables
4.4 Uniform and normal distributions, probability distributions
    4.4.1 Probability Density Functions
    4.4.2 The Uniform Distribution
    4.4.3 Normal distribution
    4.4.4 Modelling error
4.5 Simulating Discrete Random Variables
    4.5.1 Simple examples
    4.5.2 More complicated discrete distributions
4.6 Estimating probabilities, Monty Hall
    4.6.1 Estimating probabilities
    4.6.2 Supermarket workers
    4.6.3 Monty Hall problem
4.7 Estimating expectations, Gambler’s ruin
    4.7.1 Estimating expectations
    4.7.2 Expectations of Functions
    4.7.3 Gambler’s ruin
4.8 Monte Carlo integration
    4.8.1 Estimating integrals


4.1 Introduction

A deterministic model will always produce the same output given the same initial input; the models
in Chapter 3 were all of this type. In real life, however, there is variability. A stochastic model
accounts for this variability by including randomness.

Reasons to include randomness in a model include:

• To better represent reality - the result of a dice throw, radioactive decay, animal movement...

• Some phenomena are not random, but we model them as random because they are too complicated to model exactly.

• Measurement error

• Simulation to obtain statistics

• Simulation of behaviour to investigate qualitative patterns - e.g. climate possibilities

• Monte-Carlo methods for solving complex problems

In this chapter we review the properties of random variables and use MATLAB to simulate random
variables with special properties. We will then see how simulations can be used in a range of
applications including modelling and estimating probabilities, expectations, and integrals.

4.2 Randomness, probability and simulation.

Theory

4.2.1 Randomness

In this chapter, we consider the case where there is uncertainty in the underlying situation. For
example, consider a coin toss and let X be the side of the coin that shows when the coin lands. What
is X? We expect that X will be either Heads or Tails but we cannot be certain about X unless we
toss the coin and observe the result. In other words, the value of X depends on the experiment, and
depending on the experiment, the value of X may change: X is sometimes heads and sometimes tails.

It is useful to think of variables like X in this example as behaving randomly (with the value determined
by chance), and variables like X are called random variables. We often denote random variables
(RV for short) by capital letters (such as X).

Some examples of phenomena that might sometimes be regarded as random include:



1. the outcome of tossing dice

2. cards dealt in a game of blackjack

3. numbers of vehicles at various parts of a traffic network

4. numbers of customers at checkout queues in a supermarket

5. the motion of a pollen grain suspended in water (Brownian motion)

6. the spread of a disease through a population (epidemics)

One way to get information about possible values for an RV like X is to perform some experiments.
However, there are also ways to simulate RVs and then make predictions about their behaviour. For
example, we could simulate the numbers of customers at the checkouts of a supermarket to work out
whether more staff needed to be employed before actually employing more staff.

A random process is a sequence of random variables, one for each point in time. A random process is
usually denoted by a capital letter and a subscript representing time; for example, the daily exchange
rate of the New Zealand dollar versus the US dollar can be regarded as a random process, and might
be denoted Rt .

[Figure: a sample path of X (vertical axis) against time (horizontal axis, 0 to 1000).]

Figure 4.1: Example of a random process

Back to the coin-toss problem. Let Xt be the random process of the results when we successively toss
the coin. Although we cannot be sure about the exact value of Xt before tossing the coin for the t-th
time, we expect it to be either Heads or Tails. In other words, we know the range or the set of all
possible values that Xt can take. This range or set is called the state space Ω of the random variable.

Practice

We can use MATLAB to simulate the toss of a coin:



>> randi([0,1]) % random integers from 0 to 1


ans=
0

Here 0 represents Heads and 1 represents Tails.

4.2.2 Probability

Theory

Since the exact value of a random variable Xt cannot be determined before performing an experiment,
it makes sense to consider the probability of observing an event of the state space. For example, if Xt
is the result of the coin toss at the t-th time, what is the probability that Xt is Heads? We denote
this probability by Prob(Xt = Heads) or P(Xt = Heads), or P(Xt = 0), where we assign Heads = 0.

There are analytical methods to calculate probabilities. For a simple example like P(Xt = Heads),
we already know that the chances of observing Heads or Tails are equal for a fair coin, so there is a
50% chance that Xt is Heads. Thus, P(Xt = Heads) = 1/2 = 0.5. However, calculating probabilities
analytically is not always straightforward, especially for complicated examples.

4.2.3 Simulation for probability

There are many courses on analytical probability calculation. In this chapter, we only aim to estimate
probabilities by running experiments or simulations (e.g. computational experiments). Monte Carlo
methods use repeated sampling to solve problems, instead of analytical methods, and can be faster
and easier.

Let S be a subset of the state space Ω and let Xt be a random process with state space Ω. Suppose that
the values of Xt are known by experiment or simulation for t ∈ {1, 2, . . . , N }. Then the probability
P(Xt ∈ S) is estimated by

\[
P(X_t \in S) \approx \frac{\text{the number of times that a state in } S \text{ has been observed}}{\text{total number of experiments}}.
\]

For example, consider S to be {Heads} as a subset of Ω = {Heads, Tails}. Suppose that we toss a fair
coin 30 times and 18 times it shows Heads. Then,

\[
P(X_t = \text{Heads}) \approx \frac{\text{the number of times that Heads is observed}}{\text{total number of experiments}} = \frac{18}{30} = 0.6.
\]

The estimated probability is close to the analytical value 0.5 but is, of course, not exactly equal; its
value depends on the experiment. The estimated probability can be improved (i.e., getting closer to the
analytical probability) by increasing the total number of experiments, provided that some conditions
are satisfied with respect to the nature of the random process. Intuitively, you would expect that a

probability estimate improves if we increase the total number of experiments. While a proper proof
of the above statement, in a general case, is beyond the scope of this course, we shall assume that our
estimate of a particular probability does improve if we increase the number of experiments.
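
The estimate above can be tried out on a computer. The following is a minimal sketch, assuming a fair coin simulated with randi and the convention 0 = Heads, 1 = Tails; the number of tosses N and the name p_heads are arbitrary choices made for this illustration.

```matlab
% Estimate P(Heads) from N simulated tosses of a fair coin.
N = 10000;                   % number of simulated tosses (arbitrary choice)
X = randi([0,1], N, 1);      % each entry is 0 (Heads) or 1 (Tails)
p_heads = sum(X == 0)/N      % proportion of tosses showing Heads
```

Re-running the code gives a slightly different estimate each time; trying a smaller or larger N gives a feel for how the estimate behaves as the number of experiments changes.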

The subset S can have more than one element; for example, what happens if S is chosen to be equal
to Ω, i.e. S = {Heads, Tails}? In this case,
P(Xt ∈ S) = P(Xt = Heads or Xt = Tails) = 1
since the number of times that Heads or Tails are observed is equal to the total number of experiments.
In other words, no matter how many times we toss a coin, we will observe an element in Ω (Heads or
Tails) because no other outcome is possible! This conclusion seems trivial, yet, it is a very important
theorem in statistics and probability:
Theorem 4.2.1. For any random process Xt with state space Ω, we have P(Xt ∈ Ω) = 1.

Practice

In the following, we make the definitions of random variables more specific by considering some
application problems. These are thought experiments (or can be carried out with real dice); the
MATLAB methods for simulation will be introduced later.
Example 4.2.2. What is the probability that four dice sum up to a value more than 10?

We could solve the problem exactly (with probability theory), but we could also get good estimates
by simulation. We would throw the four dice many times and determine the proportion of times that
we get a value more than 10. Better, we get a computer to throw the dice! The same strategy works
for many problems in the real world.
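
As a preview of how a computer can throw the dice, here is a minimal sketch that assumes four fair six-sided dice; the names rolls, totals and p_est, and the number of repetitions N, are chosen here for illustration.

```matlab
% Estimate the probability that four dice sum to more than 10.
N = 100000;                    % number of simulated throws (arbitrary choice)
rolls = randi([1,6], N, 4);    % each row holds one throw of four dice
totals = sum(rolls, 2);        % sum of the four dice in each throw
p_est = sum(totals > 10)/N     % proportion of throws with a sum above 10
```

For comparison, counting the outcomes exactly gives 1090/1296 ≈ 0.84, so the simulated estimate should be close to this value.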
Example 4.2.3. What is the probability that it will be sunny tomorrow?

We can construct a simulator for the weather (not easy!) and then run it many times to compute the
probability.

In fact we can use the same idea to help with integration. It is often extremely difficult to find the
integral of a function analytically (at least for most real-world functions). Later in this chapter we
will see how to compute approximate integrals using simulation, which is usually much faster
than trying to evaluate the integral exactly.

4.3 Discrete and continuous random variables

Theory

Random variables can be categorised as discrete or continuous and can be visualised using histograms.

4.3.1 Discrete random variables and histograms

A random variable X is said to be discrete if its state space Ω is a discrete set; for example, the
discrete set of integer numbers {0, 1, 2, . . . }.

Example 4.3.1. Little Ben received a mini keyboard as a gift for his first birthday. It has 7 keys for
the notes Do, Re, Mi, etc. For simplicity in notation, we denote them by the numbers 1, 2, 3, . . . , 7.
The table below shows a random process of the keys that he presses.

time  key    time  key    time  key    time  key    time  key    time  key    time  key
   1    3       4    5       7    1      10    7      13    4      16    1      19    7
   2    1       5    2       8    7      11    3      14    3      17    2      20    6
   3    4       6    6       9    2      12    6      15    5      18    2      21    1

Estimate the probability that he pushes 1 or 7 based on this sample.

Solution
We can estimate the probability by counting the number of times that he pushes 1 (four times according
to the table) and 7 (three times). Thus,

\[
P(X_t = 1 \text{ or } X_t = 7) \approx \frac{4+3}{21} = \frac{1}{3} \approx 0.33.
\]


Histograms

Note that counting the numbers is not an efficient way of estimating probabilities for large samples;
for example, counting the number of events in a sample of 2100 would be enormously time consuming!
A common method to visualise and estimate probabilities is to use a histogram.

Consider a random process Xt with the state space Ω. Assume that we partition Ω into subsets S1 , S2 ,
· · · , SN . The histogram is a graph that gives the number of times that each S1 , · · · , SN is observed
in Xt .

Dividing the histogram of a discrete random process by the total number of experiments gives an
estimate of the probability mass function of the random process.

The probability mass function of any discrete random process has the following properties:

• its values lie between 0 and 1 for each subset Si .

• the sum of its values, that is, the total height of the bars for the probability mass function over
Ω is 1. (Why?)

Practice

The command hist in MATLAB plots histograms. (Tip: typing help or doc followed by the name of a
command or function gives you help in the command window or in a separate window, respectively.)
The hist function defines subsets by their centres. Since the histogram gives the number of times
each Si is observed, for discrete random variables we can estimate probabilities by dividing the
histogram by the total number of experiments.

To visualise the keys pressed by Ben in Example 4.3.1, we consider S1 = {1}, up to S7 = {7}. We
use the following code, which can be written in a MATLAB script file saved in the current folder,
to obtain the probability estimates and generate the graph shown below.

X = [3 1 4 5 2 6 1 7 2 7 3 6 4 3 5 1 2 2 7 6 1]; % data sample


Omeg = 1:7; % state space
N=length(X); % total number of experiments
h=hist(X, Omeg); % histogram
bar(Omeg,h/N); % probability estimate
xlabel('Key')
ylabel('Probability estimate')

[Figure: bar chart of the probability estimate (vertical axis, 0 to 0.2) for each key 1 to 7 (horizontal axis).]

According to this graph P(Xt = 1 or Xt = 7) ≈ 0.19 + 0.145 ≈ 0.33.
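
The same estimate can be computed from the counts rather than read off the graph. This is a minimal sketch using the data of Example 4.3.1; the names p and p_1_or_7 are introduced here for illustration.

```matlab
% Probability estimates for Ben's key presses, computed from the counts.
X = [3 1 4 5 2 6 1 7 2 7 3 6 4 3 5 1 2 2 7 6 1];   % data sample
N = length(X);               % total number of experiments
h = hist(X, 1:7);            % number of times each key 1..7 was pressed
p = h/N;                     % probability estimate for each key
p_1_or_7 = p(1) + p(7)       % estimate of P(Xt = 1 or Xt = 7)
total = sum(p)               % sum of all the probability estimates
```

Here p_1_or_7 equals 7/21, and the estimates over all seven keys add up to 1.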

Note that the semi-colon ’;’ at the end of a line of MATLAB code will suppress its output in the
command window.

Alternatively, MATLAB release R2014b introduced a new function, histogram. With its default
settings, histogram plots the number of observations in each subset, where the subsets are defined
by their edges. The histogram function has a number of options that allow us to choose how the data
is displayed. The 'BinMethod' option with the value 'integers' centres the subsets on the integers
instead of defining them by edges. The 'Normalization' option allows us to show the results as
probabilities rather than the default counts. Hence the following code can be used:

X = [3 1 4 5 2 6 1 7 2 7 3 6 4 3 5 1 2 2 7 6 1];
histogram(X,'BinMethod','integers','Normalization','probability')
xlabel('Key')
ylabel('Probability estimate')

In this example we entered the data sample into MATLAB. However, large data samples are usually
provided in separate data files, which should be loaded by MATLAB before plotting the histogram.
We do not consider such cases in this course.

Example 4.3.2. For Example 4.3.1 estimate P(3 ≤ Xt ≤ 6).

4.3.2 Continuous random variables

Theory

A random variable X is said to be continuous if its state space Ω is a continuous set; for example,
the interval [0, 10] of real numbers, R, between 0 and 10.

Practice

Example 4.3.3. The pigeon PJ flies from her nest every day, searching for food. On average, she
flies Xt metres away from her nest (as a random process) and she never gets further than 100 metres.
A researcher recorded Xt for 200 days and stored the values in a data file. The distances that the
researcher has recorded during the first 21 days are given in the following table.

day  d (m)    day  d (m)    day  d (m)    day  d (m)    day  d (m)    day  d (m)    day  d (m)
  1  56.45      4  60.35      7  44.80     10  83.23     13  58.70     16  47.54     19  66.91
  2  72.01      5  53.82      8  54.11     11  33.80     14  49.24     17  48.51     20  67.01
  3  22.89      6  34.31      9  92.94     12  86.42     15  58.58     18  67.88     21  58.06

Assuming that the data file X is available, explain how to estimate P(50 ≤ Xt ≤ 60) from a visual
examination of the data.

Solution
In this case the random process Xt is continuous with state space Ω = [0, 100]. We need to consider
the subsets S1 , S2 , etc, as intervals as well. For example we could select the intervals S1 = (0, 10],
S2 = (10, 20], S3 = (20, 30], · · · , S10 = (90, 100]. Also consider the following code:

% Assume that X is already available



S = 0:10:100; % subset edge definitions, with interval width 10


histogram(X, S,'Normalization','probability'); % histogram with defined subsets
xlabel('S')
ylabel('Probability estimate')

For a sample of 200 days this code has given the graph on the following page.

[Figure: histogram of the probability estimate (vertical axis, 0 to 0.35) for each subset S (horizontal axis, 0 to 100 in steps of 10).]

Therefore, P(50 ≤ Xt ≤ 60) is approximately equal to the height of the bar between 50 and 60, which
is approximately 0.25.
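
An estimate can also be computed directly from data instead of being read off a graph. The sketch below uses only the 21 values listed in the table (the full 200-day data file is not reproduced in these notes); the names X21, in_range and p_est are hypothetical.

```matlab
% Estimate P(50 <= Xt <= 60) from the first 21 recorded distances.
X21 = [56.45 72.01 22.89 60.35 53.82 34.31 44.80 54.11 92.94 83.23 33.80 ...
       86.42 58.70 49.24 58.58 47.54 48.51 67.88 66.91 67.01 58.06];
in_range = (X21 >= 50) & (X21 <= 60);    % true where the distance is in [50, 60]
p_est = sum(in_range)/length(X21)        % proportion of days in the range
```

With only 21 observations, this gives 6/21 ≈ 0.29, which differs somewhat from the value read off the 200-day graph.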

Example 4.3.4. According to the graph, what is the estimate of P(60 ≤ Xt ≤ 80)?

Example 4.3.5. Without the graph, what is P(0 ≤ Xt ≤ 100)? How do you interpret this probability
in the graph?

4.4 Uniform and normal distributions, probability distributions

This section outlines common continuous random variables and how to simulate them. We will then
look at the use of continuous random variables for modelling error.

Theory

4.4.1 Probability Density Functions

In Example 4.3.3, we used 10 subsets to generate the histogram and, therefore, obtain an estimate
of the probability mass function. We could, of course, assign the subsets differently or use more of
them; for example, we could use S1 = [0, 1], S2 = (1, 2], . . . , S100 = (99, 100]. By using more subsets,
we get more bars in the probability mass function but the bars get thinner and (mostly) shorter. If
we use a lot of very very small subsets (for example, S1 = [0, 0.01], etc.) the bars tend to lines and
the probability mass function tends to a continuous curve. This makes sense, because we expect that
the probability mass function of a continuous random variable would be a continuous function rather
than a discrete set of bars. This continuous function is called the probability density function.
More formally, the probability density function of a continuous random variable X is the limit of its
probability mass function, with the height of each bar divided by the subset width so that the area of
a bar represents probability, when the number of subsets Si tends to infinity and the subset widths
go to zero.
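
To see how bar heights can be turned into a density estimate, the following sketch divides each probability estimate by the width of its subset. It assumes a hypothetical sample drawn uniformly from [0, 100] and uses the histcounts function (available from release R2014b); all variable names are chosen for illustration.

```matlab
% Turn probability estimates into a density estimate by dividing by bin width.
X = 100*rand(100000,1);            % hypothetical sample on [0, 100]
edges = 0:0.5:100;                 % subset edges, with interval width 0.5
counts = histcounts(X, edges);     % number of observations in each subset
p = counts/length(X);              % probability estimate for each subset
density = p ./ diff(edges);        % bar height per unit width
total_area = sum(density .* diff(edges))   % total area under the bars
```

The individual probabilities p shrink as the subsets get narrower, but the total area under the density bars stays equal to 1.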

In Statistics and Probability Theory, probability density functions, sometimes abbreviated to p.d.f.,
are very important; in fact, continuous random variables are defined by their p.d.f.’s. The next figure
shows an example of a p.d.f. of a random variable X. The probability that the value of the random
variable lies between a and b is the shaded area. (Why?)

[Figure: graph of a p.d.f. f(x) against x, with the area under the curve between a and b shaded. The probability that a < X < b equals this area.]

Or, put another way: Pr(a < X < b) = \int_a^b f(x)\,dx.

There are three important properties of p.d.f.’s for any continuous random variable:
1. The value of the p.d.f. function is always non-negative.

2. The integral over the whole real line of any p.d.f. equals one.

3. The probability that the random variable X equals a particular value x is 0.

Example 4.4.1. Let X be a continuous random variable with the p.d.f. shown below. If this is a
p.d.f., can we find the value of c?

[Figure: triangular p.d.f. that rises linearly from 0 at x = −1 to height c at x = 0, then falls linearly to 0 at x = 1.]

Solution
We want the area under the graph to equal one. The area of each triangle is 1/2 × 1 × c so the total
area is c. Hence, in order for this to be a p.d.f., we must have c = 1.

Example 4.4.2. Using the p.d.f. shown above (with c = 1) find:


1. Prob(X > 0)

2. Prob(X = 0)

3. Prob(0.5 < X < 1)

Solution
1. Exactly half the area is on the positive side, so Prob(X > 0) = 0.5.

2. The area under the graph from 0 to 0 is zero! So Prob(X = 0) = 0. With continuous random
variables the probability of a single point is zero. (This is a good trick question for exams.)

3. This is a little harder, but not much. We want the area of the triangle from 0.5 to 1, which has
height 0.5. This is 1/2 × 0.5 × 0.5 = 0.125.
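
These areas can be checked numerically. The sketch below assumes that the triangular p.d.f. with c = 1 can be written as max(1 − |x|, 0), and uses MATLAB's integral function; the names f, total_area, p_pos and p_tail are introduced here for illustration.

```matlab
% Numerical check of the areas under the triangular p.d.f. with c = 1.
f = @(x) max(1 - abs(x), 0);         % triangular p.d.f. on [-1, 1]
total_area = integral(f, -1, 1)      % should be close to 1
p_pos      = integral(f, 0, 1)       % Prob(X > 0)
p_tail     = integral(f, 0.5, 1)     % Prob(0.5 < X < 1)
```

The three results should agree with the hand calculations above (1, 0.5 and 0.125) up to the accuracy of the numerical integration.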


In the following, we consider two famous p.d.f.’s that are widely used.

4.4.2 The Uniform Distribution

The uniform distribution is the simplest continuous distribution. More complicated distributions
can be built from this distribution and it is the basis of all random number generators.

The p.d.f. for the uniform distribution between a and b looks like a box: it is zero for X < a, and
zero for X > b. Between a and b the height is 1/(b − a), so that the total area under the graph is one.

\[
f(x) =
\begin{cases}
\dfrac{1}{b-a} & \text{if } a < x < b,\\
0 & \text{otherwise.}
\end{cases}
\]

Note that all of the real values between a and b are possible, not just the integers. Examples of
different uniform distributions are shown in the figure below.

[Figure: three box-shaped p.d.f.s, of height 1 on [0, 1], height 0.5 on [0, 2] and height 1 on [1, 2].]

Figure 4.2: Uniform distributions on the intervals [0, 1], [0, 2] and [1, 2]

Practice

The rand function in MATLAB produces a random number from the uniform distribution on the
interval [0, 1] (this is the standard uniform distribution). The command rand(m,n) produces an
m × n matrix of random numbers in [0, 1]. To generate just one random number in [0, 1], type rand(),
which is the same as rand(1) and rand(1,1).

There are two rules that we can use to generate different uniform random variables.

• If X is uniform in [a, b] and c is a real number, then X + c is uniform in [a + c, b + c].

• If X is uniform in [a, b] and k > 0 is a real scalar, then k X is uniform in [k a, k b].

So if X is uniform on (0, 1) then (b − a)X + a is uniform on (a,b).

Example 4.4.3. Use MATLAB to generate a uniform random number in the interval [2, 5].

Solution
If X is uniform on [0, 1] then 3 X + 2 will be uniform on [2, 5]. The MATLAB code is

>> 3*rand()+2

Example 4.4.4. Use MATLAB to generate a uniform random number in the interval [a, b].

Solution
If X is uniform on [0, 1] then (b − a) X is uniform on [0, b − a] and (b − a) X + a is uniform on [a, b].
The MATLAB code is

>> (b-a)*rand()+a

Example 4.4.5. Write MATLAB code to generate 20 uniformly distributed random variables that
lie between 1 and 5.

Solution

>> x=4*rand(20,1)+1


Notes

1. The MATLAB command rand(20,1) generates a 20 × 1 matrix of [0, 1] uniform random numbers.
Then 4*rand(20,1) is the same matrix with all entries multiplied by 4. When we add 1
to the matrix, MATLAB adds 1 to every element in the matrix.

2. It is important to understand that rand(a,b) does NOT generate a number from the uniform
distribution on the interval [a, b].
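
The scaling and shifting rules can be checked on a large sample. This is a minimal sketch, assuming the interval [2, 5] from Example 4.4.3; the sample size N and the variable names are arbitrary choices.

```matlab
% Generate uniform random numbers on [a, b] and inspect their range and mean.
a = 2; b = 5;                   % end points of the interval
N = 10000;                      % sample size (arbitrary choice)
x = (b - a)*rand(N,1) + a;      % N uniform random numbers on [a, b]
range_x = [min(x), max(x)]      % smallest and largest values generated
mean_x = mean(x)                % should be close to (a + b)/2
```

All generated values lie between a and b, and the sample mean should be close to the midpoint (a + b)/2 = 3.5.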

Example 4.4.6. Consider Example 4.3.3 again. Write MATLAB code to generate 200 random variables
that are uniformly distributed between 5 m and 100 m. Plot the histogram of your generated data,
using the same subsets Si as in Example 4.3.3. Does your histogram look like the one that the researcher
got in the example? What do you conclude about the distribution of the distance travelled by PJ?

X = 5+95*rand(200,1);
N = length(X); % total number of experiments
S = 5:10:95; % subset centres: ten subsets of width 10 covering [0, 100]
h = hist(X, S); % histogram
bar(S, h/N); % probability plotted versus S
xlabel('S')
ylabel('Probability estimate')

4.4.3 Normal distribution

Theory

One of the foundational observations of statistics is that if you average a large number of independent
random variables, the distribution of their average approaches a density function that looks just like
a bell curve (this is essentially what the central limit theorem means).

The histogram below shows the chest measurements of 5738 Scottish soldiers collected by a Belgian
scholar with statistical interests.

[Figure: histogram of relative frequency (vertical axis, 0 to 0.2) against chest measurement (horizontal axis, 32 to 50).]

It turns out that we can approximate the histogram by a smooth symmetric bell-shaped curve called
the Normal probability density function curve:

\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{(x-\mu)^2}{2\sigma^2}}
\]

What do we notice about the histogram?

• the value of x is “normally” close to a certain number µ (µ = 40 in the histogram above)

• the probability that x lies in an interval [a − h, a + h], for fixed h > 0, decreases with the
distance between a and µ.

We say that such a random variable X is normally distributed with mean µ and variance σ^2 (or
standard deviation σ). It is often written as X ∼ N(µ, σ^2).

The mean tells us approximately where the data is (the peak), and the variance or standard deviation
tells us how much the data is spread out. This is illustrated in the figures below which show us the
p.d.f.s for several normal distributions with different means and variances. The normal distribution

with mean 0 and variance 1 (i.e., N(0,1)) is sometimes called the standard normal distribution; its
p.d.f. is shown below.

[Figure: p.d.f. of the standard normal distribution plotted against x, for x from −5 to 5.]

P.d.f.s for some other normal distributions are shown in the next figure.

[Figure: p.d.f.s of the normal distributions N(3,1), N(−5,4), N(3,4) and N(3,25), plotted for x from −15 to 20.]

Note that N(3, 1), N(3, 4) and N(3, 25) are all centred at 3. As the variance increases, the p.d.f. becomes
more spread out. Note also that N(−5, 4) and N(3, 4) are exactly the same shape and differ only in location.

Roughly 68% of the area under the curve falls within one standard deviation of the mean, which is
the interval [µ − σ, µ + σ]; roughly 95% falls within two standard deviations of the mean, that is, in
the interval [µ − 2 σ, µ + 2 σ]; and roughly 99.7% falls within three standard deviations, that is, in the
interval [µ − 3 σ, µ + 3 σ]. You can check this on the p.d.f. of the standard normal distribution plotted
earlier. Similarly, a normally distributed random variable with mean 10 and standard deviation 2 will
lie in the interval [6, 14] with probability approximately 0.95.
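
These percentages can be checked by simulation. The following is a minimal sketch, using the randn function described in the Practice part that follows; the sample size n and the variable names are our own illustrative choices.

```matlab
n = 1e5;                        % number of samples (an arbitrary, illustrative choice)
z = randn(n,1);                 % n samples from N(0,1), so mu = 0 and sigma = 1
within1 = mean(abs(z) < 1);     % proportion in [mu - sigma, mu + sigma]
within2 = mean(abs(z) < 2);     % proportion in [mu - 2 sigma, mu + 2 sigma]
within3 = mean(abs(z) < 3);     % proportion in [mu - 3 sigma, mu + 3 sigma]
fprintf('%.3f %.3f %.3f\n', within1, within2, within3);
```

The three printed proportions should be close to 0.68, 0.95 and 0.997, and will vary slightly from run to run.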

Practice

The randn function in MATLAB produces a random number from the standard normal distribution
N(0, 1). This function works the same way as rand: the command randn() generates a single number
with normal distribution; the command randn(m,n) produces an m × n matrix of normal random
numbers.

Warning: It is very easy to confuse rand and randn in your MATLAB code!

There is a general rule that we can use to generate normal random variables with given means and
variances, starting with a number generated by randn.

• If X is normally distributed with mean µ and variance σ², then X + a is normal with mean µ + a
and variance σ².

• If we multiply a normal random variable by a positive number we get another normal random
variable. If b > 0 and X is normally distributed with mean µ and variance σ², then b X has
mean b µ and variance b² σ² (and standard deviation b σ).

Example 4.4.7. What if we want a normal random number with mean 1 and standard deviation 5?

Solution
>> 5*randn()+1

For a random number from N(µ, σ²) we use

>> sqrt(variance)*randn() + mean

or

>> std_devn*randn() + mean

where mean = µ, variance = σ² and std_devn = σ. (In your own code, a variable name such as mu
avoids hiding MATLAB's built-in mean function.)
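
As a quick sanity check of this rule, the sketch below generates many numbers intended to have mean 1 and standard deviation 5 (the values of Example 4.4.7) and compares the sample mean and sample standard deviation with the targets. The names mu, sigma and n are our own choices.

```matlab
mu = 1;                         % target mean
sigma = 5;                      % target standard deviation
n = 1e5;                        % number of samples (illustrative)
x = sigma*randn(n,1) + mu;      % scale a standard normal, then shift it
sampleMean = mean(x);           % should be close to mu
sampleStd = std(x);             % should be close to sigma
fprintf('mean %.3f, std %.3f\n', sampleMean, sampleStd);
```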

Example 4.4.8. Write MATLAB code to generate 100 random numbers normally distributed with
mean 12 and variance 4. Use the vector form of randn to produce a (column) vector z with 100
numbers from the N(12, 4) distribution.

Solution
The standard deviation will be 2, since the variance is 4. (The variance equals the square of the
standard deviation.)

>> z=2*randn(100,1)+12;

Example 4.4.9. Consider Example 4.3.3 again. Write MATLAB code to generate 200 random variables
that are normally distributed with mean 40 m and standard deviation 10 m. Plot the histogram
of your generated data, using the same S_i's as in Example 4.3.3. Does your histogram look like the
one that the researcher got in the example? What do you conclude about the distribution of the
distance travelled by PJ? Try some other mean and standard deviation values to make your histogram
more similar to the graph in Example 4.3.3.

Generating other continuous random variables

Theory

It is also useful to know that there are many other distributions, and random variables from these can
often be generated by using combinations of uniform and normal random variables.

For example, the exponential random variable with parameter λ > 0 has p.d.f.

\[
f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0, \\ 0 & \text{otherwise.} \end{cases}
\]

If U is a standard uniform random variable, then

\[
X = -\frac{1}{\lambda} \log(U)
\]

is an exponential random variable with parameter λ.
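
The transformation above can be sketched in MATLAB as follows. The value of lambda is an arbitrary illustration; the check relies on the standard fact that an exponential random variable with parameter λ has mean 1/λ.

```matlab
lambda = 2;                     % rate parameter (illustrative value)
n = 1e5;                        % number of samples
U = rand(n,1);                  % standard uniform random numbers
X = -log(U)/lambda;             % transformed values, intended to be exponential
sampleMean = mean(X);           % should be close to 1/lambda = 0.5
fprintf('sample mean %.4f, 1/lambda %.4f\n', sampleMean, 1/lambda);
```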

If Y is a standard normal random variable, then X = Y² has a chi-squared distribution (also written
as a χ²-distribution). For this reason, you will come across chi-squared distributions a lot in classical
statistics.

Note that

>> Y=randn();
>> X(i)=Y^2;

is not the same as X(i)=randn()*randn(); because the two calls to randn() return two different
random numbers.
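
The difference is easy to see by simulation. In the sketch below (variable names are ours), the squared values are never negative and average about 1, whereas the product of two separate randn values is negative about half the time and averages about 0.

```matlab
n = 1e5;                                  % number of samples (illustrative)
Y = randn(n,1);
squared = Y.^2;                           % the same random number multiplied by itself
product = randn(n,1).*randn(n,1);         % two different random numbers multiplied
meanSquared = mean(squared);              % close to 1
meanProduct = mean(product);              % close to 0
fracNegative = mean(product < 0);         % close to 0.5
fprintf('%.3f %.3f %.3f\n', meanSquared, meanProduct, fracNegative);
```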



4.4.4 Modelling error

Theory

The following is an example of the application of the simulation of a continuous random variable to
modelling. Often when modelling a system we want to be able to include uncertainty due to parameter
estimation, and uncertainty due to measurement error.

Consider this model for fish stocks:

\[
X_n = X_{n-1} \exp\!\left( r \left( 1 - \frac{X_{n-1}}{K} \right) \right)
\]

where X_n is the fish population at time n, r is the growth rate, and K is the carrying capacity of the
environment. (This is a difference equation, as in Ch. 3.)

Practice

Parameter uncertainty and measurement error in modelling

Often the parameters in models are estimates only. For example, we estimate r = 0.36. To account for
our uncertainty we can generate a value of r from a normal distribution (say with standard deviation
0.05), run the simulation for fish stock and repeat.

First we can write a function to give N time steps for the fish stock model:

function X = fish(X1,r,K,N) % function for fish population at N time steps
X = zeros(N,1); % setting up storage vector for fish population, increases code efficiency
X(1) = X1; % initial population
for n=2:N
    if X(n-1)<=0 % in the case where the population dies out
        X(n) = 0;
    else
        X(n) = X(n-1) * exp(r*(1-X(n-1)/K));
    end
end

We can then use the fish function to simulate 50 fish stock populations through time, given our
uncertainty in the growth rate, r, and the carrying capacity, K.

X1=1e6; r=0.36; k=3.5e6; N=50; % setting up the model parameters
clf; % clear current figure window
hold on % plot all subsequent graphs on the same figure window
for s=1:50 % simulate 50 populations
    R = r + 0.05*randn(); % parameter uncertainty in growth rate
    K = k + 5e5*randn(); % parameter uncertainty in carrying capacity
    X = fish(X1,R,K,N);
    plot(X);
end

[Figure: the 50 simulated fish populations (vertical axis ×10^6) over time steps 0 to 50.]

Brownian random walks

Theory

Sometimes we want to model random fluctuations in time. The most well-known model for this is
Brownian motion.

Recurrence equation:

X_n = X_{n-1} + Random normal variable (with mean = 0 and standard deviation σ)

Properties of Brownian Motion

• The value X_{k+1} after k steps has a normal distribution with variance kσ².

• As k increases, so does the variance (and standard deviation).

• Good for short-term models, but not good for exploring long term behaviour - process is highly
likely to whiz up towards infinity (or down towards negative infinity).
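
The growth of the variance can be checked numerically. The sketch below simulates many walks at once, using cumsum to add up the steps instead of an explicit loop, and compares the sample variance after k steps with kσ². All parameter values and names are our own illustrative choices.

```matlab
sigma = 2;                                % standard deviation of each step
k = 100;                                  % number of steps in each walk
numWalks = 5000;                          % number of independent walks
steps = sigma*randn(k, numWalks);         % one column of steps per walk
X = [zeros(1,numWalks); cumsum(steps,1)]; % every walk starts at X(1) = 0
sampleVar = var(X(end,:));                % variance of X(k+1) across the walks
fprintf('sample variance %.1f, k*sigma^2 = %.1f\n', sampleVar, k*sigma^2);
```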

Practice

The following code gives the general form for simulating a Brownian random walk:

function X=Brownian(X1,N,sigma) % function for simulating K at N steps
X=zeros(N,1);
X(1)=X1;
for n=2:N
    X(n)=X(n-1)+sigma*randn();
end

Example 4.4.10. Use MATLAB to model the environmental fluctuation in carrying capacity.

Solution
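We model carrying capacity as a random walk: at each step, K changes by a small (random) amount.

\[
X_n = X_{n-1} \exp\!\left( r \left( 1 - \frac{X_{n-1}}{K_{n-1}} \right) \right)
\]

\[
K_n = K_{n-1} + \varepsilon_{n-1}
\]

where ε_{n-1} is the random normal step (with mean 0) added to the carrying capacity.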

We begin by creating a function that simulates a fish population and that models the fluctuations in
carrying capacity as a Brownian random walk.

function X = brownFish(X1,r,K1,N) % function for simulating fish population
% with changes in carrying capacity
X = zeros(N,1);
X(1) = X1;
K=K1;
for n=2:N % N time points
    if X(n-1)<=0
        X(n) = 0; % if population dies
    else
        X(n) = X(n-1) * exp(r*(1-X(n-1)/K));
    end
    K = K + 2e5*randn(); % Random fluctuation in carrying capacity
end

The following figure shows what the changing carrying capacity for our fish populations might look
like through time.

[Figure: carrying capacity K (vertical axis ×10^6) against time step, 0 to 50.]

We could then simulate several possible fish populations through time using the brownFish function.

function simBrownFish(X1,r,K,N) % for plotting simulated fish populations through time
clf;
hold on
for k=1:50 % number of simulations
    X = brownFish(X1,r,K,N);
    plot(X);
end
hold off
end

4.5 Simulating Discrete Random Variables

Theory

In the previous section we considered simulating continuous random variables (or processes), where
we assumed that their p.d.f.s were either uniform or normal. In this section we consider simulating
discrete random variables using a uniform distribution. In other words, we want to use a uniform
distribution, which is a continuous distribution, to simulate a discrete random process. We will also
show some alternative methods for simulating discrete random variables. Whichever method is used,
we first need to know the probability mass function of the discrete random process. We show how to
simulate discrete random variables using some examples.

4.5.1 Simple examples

Here we simulate discrete random variables where the probabilities for each state are equal.

Practice

Simulating a coin toss

Example 4.5.1. Use MATLAB to simulate a coin toss.

Solution
We did this earlier using the randi MATLAB function. Here we solve this problem in two different
ways.

(i) If we generate a uniform random variable between 0 and 1 then there is a 50% probability that
the number is less than 0.5. You can think of this as subdividing the line from 0 to 1 into two
segments:

[Figure: the line from 0 to 1 split at 0.5; the left half is HEADS and the right half is TAILS.]

When we generate our uniform random variable we are selecting a random position along the
line segment [0,1]. If the number generated from the uniform distribution on (0, 1) is less than 0.5
we say the result is Heads, otherwise it is Tails. In MATLAB (where Heads = 1, Tails = 2):

function flip=coinflip()
X = rand(); % generate X from uniform distribution
if (X<0.5)
    flip=1; % case Heads
else
    flip=2; % case Tails
end
end

(ii) Alternatively, we could also use the ceil function from MATLAB. This takes a real number
and rounds up to the nearest integer. To generate 1 or 2 with equal probability we can generate
a random variable that is uniformly distributed between 0 and 2 and round up:

>> flip=ceil(2*rand())

Simulating a dice

Example 4.5.2. Use MATLAB to simulate throwing dice.



Solution
State space: Ω = {1, 2, 3, 4, 5, 6}. Each outcome has probability 1/6. We will do this in two ways:

(i) Subdivide [0, 1] into six parts and select accordingly:

[Figure: the line from 0 to 1 split into six equal segments at 1/6, 2/6, ..., 5/6; the k-th segment corresponds to outcome k.]

We let the variable out contain the result of rolling the dice.

function out=dice()
x = rand(); % Uniform random real number from 0 to 1.
if (x<1/6)
    out=1; % dice gives 1
elseif (x<2/6)
    out=2; % dice gives 2
elseif (x<3/6)
    out=3; % dice gives 3
elseif (x<4/6)
    out=4; % dice gives 4
elseif (x<5/6)
    out=5; % dice gives 5
else
    out=6; % dice gives 6
end
end

(ii) If we generate a uniform random variable between 0 and 6, and then round up to the nearest
integer, we get 1, 2, . . . , 6 with equal probabilities.

>> ceil(6*rand())

This is useful if we want to create vectors of dice rolls:


>> ceil(6*rand(100,1))

Note that this method, using ceil, only works when the probabilities are all equal.
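
A simple way to convince yourself that the ceil method gives equal probabilities is to roll many times and look at the proportion of each face. This is only a sketch; n and the variable names are our own.

```matlab
n = 60000;                      % number of simulated rolls (illustrative)
rolls = ceil(6*rand(n,1));      % vector of dice rolls, as above
freq = zeros(6,1);              % storage for the proportion of each face
for face = 1:6
    freq(face) = sum(rolls == face)/n;
end
disp(freq')                     % each entry should be close to 1/6
```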

Example 4.5.3. Write MATLAB code to simulate the sum obtained when three dice are thrown.

Solution
We can use a loop performing three simulations:

total=0; % total sum at start
for i=1:3 % run three simulations
    total = total + dice(); % use dice function we made
end

or we could do it in one line:


>> total = sum(ceil(6*rand(3,1)))

4.5.2 More complicated discrete distributions

Theory

The same general approach works for more complicated distributions where the state probabilities are
unequal.

Benford’s law

A set of numbers satisfies Benford’s law if the leading digit d ∈ {1, 2, . . . , 9} occurs with probability

P (d) = log10 (d + 1) − log10 (d)

I.e. the leading digit is likely to be small, and it turns out that this is very common in lots of datasets.
For example, the first digits of randomly selected street addresses, atomic weights, and population
sizes have a discrete distribution roughly following Benford’s law.
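
The nine probabilities can be computed directly from the formula. In the short sketch below (variable names are ours), the probabilities add up to 1 because the sum telescopes to log10(10) − log10(1).

```matlab
d = 1:9;                        % possible leading digits
P = log10(d+1) - log10(d);      % Benford probability of each digit
total = sum(P);                 % telescopes to log10(10) - log10(1) = 1
disp(P)
```

The first entry, for the digit 1, is about 0.301, so roughly 30% of leading digits are expected to be 1.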

[Figure: bar chart of probability (vertical axis, 0 to 0.35) against leading digit (1 to 9) under Benford's law.]

Practice

Example 4.5.4. Simulate digits according to Benford’s law.

We can simulate random variables using a uniform function and then assign them to a discrete random
variable based on the cumulative distribution.

Identifying the cumulative distribution of the leading digit d:

\[
\begin{aligned}
P(d \le 1) &= \log_{10}(2) - \log_{10}(1) = \log_{10}(2) \\
P(d \le 2) &= \log_{10}(3) - \log_{10}(2) + P(d \le 1) = \log_{10}(3) \\
P(d \le 3) &= \log_{10}(4) - \log_{10}(3) + P(d \le 2) = \log_{10}(4) \\
P(d \le k) &= \log_{10}(k + 1)
\end{aligned}
\]

[Figure: the line from 0 to 1 divided at log10(2), log10(3), ..., log10(9); the k-th segment corresponds to digit k.]

Note that the interval widths in the above figure are not to scale!

function digit=Benford()
x = rand(); % Uniform random real number from 0 to 1.
if x<log10(2)
    digit=1;
elseif x<log10(3)
    digit=2;
elseif x<log10(4)
    digit=3;
elseif x<log10(5)
    digit=4;
elseif x<log10(6)
    digit=5;
elseif x<log10(7)
    digit=6;
elseif x<log10(8)
    digit=7;
elseif x<log10(9)
    digit=8;
else
    digit=9;
end

The general case

Suppose that we have been given a vector p containing the probabilities for states 1 to m. We want
to generate a random variable with these probabilities.

We can continue the same idea as before, except that we use a for loop to go through the different
states.
function state=NewRandomVariable(p) % p is the vector containing the probabilities
m=length(p); % number of states (segments)
r=rand(); % generate uniform random number
total=0; % start of first interval segment
state=m; % default to the last state, in case rounding makes sum(p) slightly less than 1
for i=1:m % check each segment
    if total<r && r<=total+p(i)
        state=i; % r lies in segment (total, total+p(i)]
        break; % stop once the segment is found
    end
    total=total+p(i); % update to start of next segment
end

We are looping through the intervals until we get the one that contains r.
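
The same search can be written with the cumulative sums of the probabilities. The sketch below is one possible alternative, not the method of the notes: cumsum gives the right-hand end of each segment and find returns the first segment whose end is at or beyond r. The probability vector is an arbitrary example, and the names edges, idx and states are our own.

```matlab
p = [0.4 0.4 0.2];              % example probabilities for states 1 to 3
edges = cumsum(p);              % right-hand ends of the segments: 0.4, 0.8, 1.0
n = 1e5;                        % number of random states to generate
states = zeros(n,1);
for k = 1:n
    r = rand();
    idx = find(r <= edges, 1);  % first segment whose right-hand end is >= r
    if isempty(idx)             % guard against rounding in sum(p)
        idx = numel(p);
    end
    states(k) = idx;
end
freq = [mean(states==1), mean(states==2), mean(states==3)];
disp(freq)                      % should be close to p
```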

4.6 Estimating probabilities, Monty Hall

4.6.1 Estimating probabilities

Theory

Suppose, for example, that we want to know the probability that a random variable X is less than
some value, k. One way of doing this is to generate lots of instances of X and determine the proportion
of times you generate a value less than k (i.e. using simulation as outlined in Section 4.2.3).

Practice

Example 4.6.1. Write MATLAB code to estimate the probability that a normal random variable
with mean 100 and variance 2² is less than 98.

Solution

count = 0; % initialise the count of events
numsims = 1000; % total number of simulations
for n=1:numsims % perform each simulation
    z = 100+2*randn(); % generate a normal random variable with mean 100 and std 2
    if z<98 % check the condition
        count = count+1; % count the simulation if the condition is satisfied
    end
end
prob = count/numsims % estimate probability

Every time you run this script you may get a slightly different answer. As the number of simulations
(in this case 1000) increases, the variability between runs will decrease. 
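
For this particular example the estimate can be compared with the exact probability, because for a standard normal Z we have P(Z < z) = erfc(−z/√2)/2, where erfc is MATLAB's complementary error function. The sketch below (our own variable names, and a vectorised simulation instead of the loop) computes both values.

```matlab
mu = 100; sigma = 2; k = 98;                  % values from Example 4.6.1
exact = 0.5*erfc(-(k-mu)/(sigma*sqrt(2)));    % exact P(X < k) for X ~ N(mu, sigma^2)
numsims = 1e5;                                % more simulations than in the example
z = mu + sigma*randn(numsims,1);              % all simulations generated at once
estimate = mean(z < k);                       % proportion of values below k
fprintf('exact %.4f, estimate %.4f\n', exact, estimate);
```

Both numbers should be close to 0.159.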

There is a general recipe that we can follow. Suppose that you want to know the probability that
some event happens. The simulation algorithm is:

count = 0;
numsims = 1000; % (or 10000, or more)
for n=1: numsims
(simulate something)
if (the event occurred in the simulation)
count = count+1;
end
end
prob = count/numsims

There are two main parts to fill in:

• 'simulate something' Use the techniques of the previous sections to generate a random instance.
In the above example, this meant generating a random normal with mean 100 and
standard deviation 2.
• 'if the event occurred in the simulation' Here you test the outcome of your simulation
to see if what you are looking for occurred. In the above example, we wish to test if the
random variable was less than 98.

Note that the number of simulations needs to be high enough so that the resulting proportion is close
to the actual probability of the event and not largely affected by fluctuations.
Example 4.6.2. What is the probability that four dice sum up to a value more than 10?
Solution
From earlier we saw that we could simulate a single dice using

>> ceil(rand()*6)

Hence the sum of four dice is

>> ceil(rand()*6) + ceil(rand()*6) + ceil(rand()*6) + ceil(rand()*6)

or

>> sum(ceil(6*rand(4,1)))

We’re now ready to code the simulation.

numsims=1000;
count = 0; % initialise the count of totals greater than 10
for n=1:numsims
    % simulate summation of the numbers
    total = sum(ceil(6*rand(4,1)));
    if total>10
        count = count+1; % count the simulation
    end
end
prob = count/numsims % probability estimate

MATLAB returned an estimate of 0.84. 
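
With only 6^4 = 1296 equally likely outcomes, this probability can also be found exactly by listing every outcome, which gives a useful check on the simulation. The sketch below uses ndgrid to build all combinations; the variable names are our own.

```matlab
[a,b,c,d] = ndgrid(1:6);                      % all 6^4 = 1296 outcomes of four dice
totals = a + b + c + d;                       % sum of the four dice for each outcome
exact = sum(totals(:) > 10)/numel(totals);    % proportion of outcomes with sum > 10
fprintf('exact probability %.4f\n', exact);
```

The exact value is 1090/1296, about 0.841, in agreement with the simulated estimate.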

Example 4.6.3. Suppose it has been projected that the number of people who will visit the Auckland
Zoo each day during next January is normally distributed with a mean of 1200 and standard deviation
of 400. Assuming that the zoo is open every day and each visitor pays an entry fee of $7.50, estimate
the probability that the zoo will receive at least $250,000 in entry fees during next January.

Solution
count = 0;
numsims = 1000;
for n=1:numsims
    total = 0; % number of people that came in the month
    for k=1:31 % 31 days in January
        z = 1200 + 400 * randn(); % simulate the number of people coming this day
        % add the number of people in one day to the total number in the month
        total = total + z;
    end
    money = total * 7.5; % fee for the total number of people
    if money >= 250000
        count = count+1; % count the event of earning at least $250,000 in fees
    end
end
prob = count/numsims % probability estimate
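
The two nested loops can be replaced by a single matrix of random numbers. The following is a sketch of such a vectorised version, under the same modelling assumptions as the loop above; the layout (one column per simulated January) and the variable names are our own choices.

```matlab
numsims = 10000;                              % number of simulated Januaries
visitors = 1200 + 400*randn(31, numsims);     % rows are days, columns are simulations
money = 7.5*sum(visitors, 1);                 % total entry fees for each simulated month
prob = mean(money >= 250000);                 % proportion of months reaching $250,000
fprintf('estimated probability %.3f\n', prob);
```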

4.6.2 Supermarket workers

A casual supermarket worker can get 3, 4 or 5 shifts a week with probabilities 0.4, 0.4 and 0.2
respectively. A random vector of shifts worked over 10 weeks can be simulated using the code:

function shifts=shifts10()
shifts=zeros(10,1); % setting up vector for storage
for i=1:10
    r=rand(); % generate variable from standard uniform distribution
    if r<0.4
        shifts(i)=3;
    elseif r<0.8 % sum of the 1st two probabilities
        shifts(i)=4;
    else
        shifts(i)=5;
    end
end
end

Suppose the output from our previous simulation of the shifts for a supermarket worker is

>> shifts = [3,4,4,5,5,3,4,4,4,5]



The command shifts == 5 will produce a vector which has zero for every element not equal to 5 and
one for every element equal to five. Hence shifts==5 produces the vector [0,0,0,1,1,0,0,0,0,1].

In the same way, shifts<4 produces [1,0,0,0,0,1,0,0,0,0]


and shifts>4 produces [0,0,0,1,1,0,0,0,0,1].

These vector commands are especially useful when evaluating the outcome of a single simulation. The
MATLAB command sum produces the sum of a vector. This means that sum(shifts==5) has the
value 3. In one line we can count the number of entries equal to 5 (or less than 4, or more than 4).
Example 4.6.4. Write a MATLAB program to estimate the probability that a supermarket worker
gets 5 shifts in more than 6 weeks out of 10.
Solution
Above we saw how to generate a vector shifts of 10 numbers giving the shifts worked each week.
The number of weeks in which the worker worked 5 shifts is

>> sum(shifts==5)

and we can test if this number is greater than six. In order to get an estimate, we perform many (10000)
runs of the simulation, and determine the proportion that pass the test. Putting everything together:

numsim = 10000;
count = 0;
for n=1:numsim
    shifts=shifts10(); % using the previous code
    % check if the worker worked 5 shifts in more than six weeks out of ten.
    if sum(shifts==5) > 6
        count = count+1;
    end
end
prob = count/numsim % probability estimate
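
Because the ten weeks are simulated independently, each with probability 0.2 of five shifts, the number of five-shift weeks follows a binomial distribution, and the exact probability can be computed for comparison. The sketch below does this with nchoosek; the variable names are our own.

```matlab
p5 = 0.2;                       % probability of 5 shifts in any one week
weeks = 10;                     % number of weeks
exact = 0;
for k = 7:weeks                 % "more than 6 weeks" means 7, 8, 9 or 10 weeks
    exact = exact + nchoosek(weeks,k) * p5^k * (1-p5)^(weeks-k);
end
fprintf('exact probability %.6f\n', exact);
```

The exact value is a little under 0.001, so in 10000 runs the event occurs only a handful of times and the simulated estimate can vary noticeably from run to run.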

4.6.3 Monty Hall problem

Suppose you’re on a game show, and you’re given the choice of three doors; behind one door is a car;
behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the
doors, opens another door, say No. 3, which has a goat.

He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

To answer this we can use simulation to estimate the probability of getting the car if we don’t switch,
and the probability if we do switch. We’ll assume we always pick door one to begin with. The position
of the car is a random variable with equal probability for each state.

function c=car() % simulating position of car


c=randi(3); % generates integers from 1 to 3 with equal probability
end

Using the car function we can start to see how to simulate the Monty Hall problem. For example, we
can use car() to choose a door at random, too. How then can we determine success or failure in the
game, and implement the switching strategy? We will develop full code for this in lectures, which
will then be posted to the MATLAB code resource page on Canvas.
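
In the meantime, here is one possible sketch of such a simulation. It is not the lecture code, and the variable names are our own. It assumes we always pick door 1 and that the host always opens a goat door we did not pick, so switching wins exactly when our first pick was wrong.

```matlab
numsims = 10000;                % number of simulated games
pick = 1;                       % we always pick door 1 to begin with
stayWins = 0;                   % games won by keeping the first pick
switchWins = 0;                 % games won by switching
for n = 1:numsims
    c = randi(3);               % position of the car, as in car()
    if c == pick
        stayWins = stayWins + 1;      % first pick was right: staying wins
    else
        switchWins = switchWins + 1;  % host reveals the other goat: switching wins
    end
end
probStay = stayWins/numsims
probSwitch = switchWins/numsims
```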

4.7 Estimating expectations, Gambler’s ruin

4.7.1 Estimating expectations

Theory

The mean of a finite set of numbers is the sum of the numbers divided by the size of the set of
numbers.

The expectation of a random variable is a number which, roughly, describes the average value you’d
get if you generated lots of instances of that random variable.

The expectation E(X) of a discrete random variable is defined as

\[
E(X) = \sum_{i \in \Omega} i \, \mathrm{Prob}(X = i).
\]

Example 4.7.1. In section 4.6.2, a worker got 3, 4 or 5 shifts a week with probabilities 0.4, 0.4 and
0.2 respectively. Let X be the number of shifts. Show that E(X) = 3.8

Solution
By the definition
E(X) = 3 × 0.4 + 4 × 0.4 + 5 × 0.2 = 3.8

Another example is the expected value of a dice. 1/6 of the time you get a one, 1/6 of the time you
get a two, and so on. So when you generate dice rolls a large number of times, close to 1/6 of the
generated numbers will be one, 1/6 will be two, and so on. This gives:

\[
E(X) = \frac{1}{6} \times 1 + \frac{1}{6} \times 2 + \cdots + \frac{1}{6} \times 6 = 3.5
\]
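
In MATLAB, a sum of this kind is a one-line calculation on vectors. The sketch below (variable names are ours) reproduces the two expectations worked out above.

```matlab
vals = [3 4 5];                 % possible numbers of shifts
probs = [0.4 0.4 0.2];          % their probabilities
EX = sum(vals .* probs);        % expectation of the number of shifts

faces = 1:6;                    % outcomes of a dice
Edice = sum(faces * (1/6));     % each outcome has probability 1/6
fprintf('%.1f %.1f\n', EX, Edice);
```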

The expectation of a continuous random variable with probability density function p is defined using
integrals:

\[
E(X) = \int_{-\infty}^{\infty} x \, p(x) \, dx.
\]

Example 4.7.2. If X has a standard uniform distribution then E(X) = 1/2.

Example 4.7.3. If X has a standard normal distribution then E(X) = 0.

Example 4.7.4. If X is a normal random variable with mean µ and variance σ² then E(X) = µ.

There are three very important rules about expectations that you need to learn:

• If c is a constant, then E(c) = c.

• If X is a random variable and a is a real valued number, then E(aX) = aE(X).

• If X and Y are random variables, then E(X + Y ) = E(X) + E(Y ).

For example, what is the expectation of the sum of a standard uniform variable X and a normal
random variable Y with mean 5 and variance 8?

\[
E(X + Y) = E(X) + E(Y) = \frac{1}{2} + 5 = 5.5
\]

If the random variable is simple, it is easy enough to calculate the expectation. In general, this is not
so simple, so we investigate by simulation.

Law of large numbers: An important property of random variables is that if we generate lots of values
and look at their mean, then as the sample gets larger the mean will get closer and closer to the
expectation. A sample from X is a set of values generated using the distribution of X. The number
of values generated is called the sample size.
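
The law of large numbers can be seen in action with dice rolls, whose expectation is 3.5. The sketch below (sample sizes and names are our own choices) prints the mean of the first 10, 100, 1000, ... rolls of one long sample.

```matlab
n = 1e5;                                  % largest sample size
rolls = ceil(6*rand(n,1));                % n simulated dice rolls
runningMean = cumsum(rolls) ./ (1:n)';    % mean of the first k rolls, for every k
for s = [10 100 1000 10000 100000]
    fprintf('sample size %6d   mean %.4f\n', s, runningMean(s));
end
```

The printed means typically wander for small sample sizes and settle near 3.5 as the sample size grows.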

Practice

Example 4.7.5. Use 10000 simulations to estimate the mean of a normal random variable with mean
0 and standard deviation 1.

Solution
Of course in this case we already know the ‘true’ expectation, i.e., zero! We will do this example two
ways. First, we use a loop:

numsims = 10000;
total = 0; % initial sum
for i=1:numsims
    x = randn(); % simulate a standard normal random variable
    total = total + x; % add x to the sum
end
expectation = total/numsims % estimate the expectation

We can also generate a whole vector of normal random variables in one go. This gives the simpler
(and faster) code:

>> numsims = 10000;


>> vals = randn(numsims,1); % generate a vector of length numsims
>> expectation = sum(vals)/numsims % estimate the expectation

You can try bigger values of numsims to get better accuracy. The bigger the sample, the more accurate
the estimation. 

4.7.2 Expectations of Functions

Theory

We can also define the expectation of a function of a random variable. Suppose that X is a real-valued
random variable and g is some function defined on the real numbers. Then

\[
E(g(X)) = \int_{-\infty}^{\infty} g(x) \, p(x) \, dx.
\]

Or, if X is discrete with m states:

\[
E(g(X)) = \sum_{i=1}^{m} g(x_i) \, P(X = x_i)
\]

Note: usually E(g(X)) ≠ g(E(X)).

Practice

Example 4.7.6. Estimate the expectation of the square of a uniform random variable on [0, 1].

Solution

n = 10000;
total = 0; % initial sum
for i=1:n
    % simulate a standard uniform random variable, square it and add it to the sum
    total = total + (rand())^2;
end
expectation = total/n % expectation estimate

We could also avoid loops:

>> n = 10000;
>> x = rand(n,1); % generate a vector of standard uniform distribution RVs
>> gx = x.^2; % g(x) for each x is x^2, using the dot to make operation vector friendly
>> expectation = sum(gx)/n % expectation estimate

The last line can be simplified using the mean command:

>> expectation = mean(gx)
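
This example also illustrates the earlier note that E(g(X)) and g(E(X)) usually differ. For a standard uniform X, integrating x² over [0, 1] gives E(X²) = 1/3, whereas (E(X))² = (1/2)² = 1/4. A short sketch comparing the two estimates (variable names are ours):

```matlab
n = 1e5;                        % number of simulations (illustrative)
x = rand(n,1);                  % standard uniform random numbers
meanOfSquare = mean(x.^2);      % estimates E(X^2), which is 1/3
squareOfMean = mean(x)^2;       % estimates (E(X))^2, which is 1/4
fprintf('%.4f %.4f\n', meanOfSquare, squareOfMean);
```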

Here are two general algorithms for estimating the expectation of g(X), the first with loops, the second
without.

n = 10000; % number of simulations


total = 0; % initial sum
for i=1:n
(generate the value x);
(evaluate y = g(x));
total = total + y;
end
expectation = total/n

Or,

>> n = 10000; % number of simulations


>> (generate a vector x of n values of the random variable)
>> (apply function g to the entries of x to get vector y)
>> expectation = mean(y)

Exercises:

Estimate E[sin(X) − X] with X uniform on (−1, 1). Use loops.

Estimate E[X² − X] where X is normally distributed, with mean 1 and variance 3. Do not use loops.

4.7.3 Gambler’s ruin

Theory

Suppose we play a game. Every turn I flip a coin:

• If I get heads, you pay me $1.

• If I get tails, I pay you $1.

Each turn has equal odds, but we have a finite number of coins. We want to explore things like the
probability I lose all my money, or I get all your money, etc.

Practice

To start, we will just simulate the process.

Let X_n be the money I have at the beginning of the nth turn.

\[
X_n = \begin{cases} X_{n-1} + 1 & \text{with probability } 1/2, \\ X_{n-1} - 1 & \text{with probability } 1/2. \end{cases}
\]

function X=gamble(X1,N) % simulates the money I have for N tosses
X=zeros(N,1);
X(1)=X1; % amount of money at start
for n=2:N
    if rand()<0.5 % heads - I win
        X(n)=X(n-1)+1;
    else % tails - you win
        X(n)=X(n-1)-1;
    end
end

Here’s the output from 5 runs of gamble(20,10), i.e. I begin with $20 and we play 10 turns:

[Figure: dollars (16 to 25) against turn (1 to 10) for 5 runs of gamble(20,10).]

The players only have a finite amount of money, so we need to impose limits. The game ends when
we hit 0 (I’m bust) or 2 × 20 = 40 (you’re bust).

function X=gambleLimits(X1,N) % simulates the money I have for realistic limits
X=zeros(N,1);
X(1)=X1;
for n=2:N
    if X(n-1)<=0 % I'm bust
        X(n)=0;
    elseif X(n-1)>=2*X1 % You're bust
        X(n)=X(n-1);
    elseif rand()<0.5 % heads
        X(n)=X(n-1)+1;
    else % tails
        X(n)=X(n-1)-1;
    end
end

How long until somebody wins?

function L = howLong(X1,N) % number of turns until someone wins
% inputs are the money gambled, and an upper limit on the number of turns
X = zeros(N,1);
X(1) = X1;
L=N; % limit on no. of turns so MATLAB doesn't get stuck in an infinite loop
for n=2:N
    if rand()<0.5
        X(n) = X(n-1)+1;
    else
        X(n) = X(n-1)-1;
    end
    if X(n)<=0 % You win
        L=n; break;
    elseif X(n)>=2*X1 % I win
        L=n; break;
    end
end

A note on the use of break in MATLAB. The break command tells MATLAB to jump out of the loop it
is currently in. The routine breaks out of the loop if X(n) is zero or is too large. In the above code
the output will be the index n if this happens (or N if it never does).

How long is the average game? (I.e. how long do we expect the game to last if we start with $20)

total = 0;
N=1000; % number of simulated games
for i = 1:N
    L=howLong(20,1000);
    total = total + L;
end
expect = total/N;

What is the probability that I get all your money at the end of a game?

X = zeros(1000,1);
total=0;
for t=1:1000 % run several games
    X(1) = 20; % initial money I have
    win=0; % reset each game; a game still unfinished after 1000 turns is not counted as a win
    for n=2:1000 % upper limit on turns
        if rand()<0.5
            X(n) = X(n-1)+1;
        else
            X(n) = X(n-1)-1;
        end
        if X(n)<=0 % You win
            win=0; break;
        elseif X(n)>=2*20 % I win (assuming you started with the same amount of money)
            win=1; break;
        end
    end
    total=total+win;
end
pwin=total/1000 % probability of winning

But what happens if one of us starts with more money? Or, in the more realistic casino scenario,
what happens if the game is slightly biased?
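
One way to start exploring these questions is to make the win probability and the two money limits into parameters. The following is only a sketch: the names p, start, target, numGames and maxTurns and all the values are our own illustrative choices. Setting p = 0.5 recovers the fair game, and changing target changes how much money the other player starts with.

```matlab
p = 0.49;                       % probability that I win a turn (0.5 is the fair game)
start = 20;                     % my starting money
target = 2*start;               % total money in the game; I win when I hold it all
numGames = 2000;                % number of simulated games
maxTurns = 100000;              % safety limit on the number of turns per game
wins = 0;
for g = 1:numGames
    money = start;
    for n = 1:maxTurns
        if rand() < p
            money = money + 1;  % I win this turn
        else
            money = money - 1;  % you win this turn
        end
        if money <= 0 || money >= target
            break;              % somebody is bust
        end
    end
    if money >= target
        wins = wins + 1;
    end
end
pwin = wins/numGames
```

Even with a bias as small as p = 0.49, the estimated probability of winning drops well below one half.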

4.8 Monte Carlo integration

Theory

4.8.1 Estimating integrals

There is a range of methods for evaluating integrals, some analytical, and some numerical. There are
whole fields of applied mathematics and statistics devoted to this problem. In this section we use the
tools of simulation in order to carry out integration. We will only look at one technique for doing this,
which we will derive below:

Remember, from section 4.7.2, that:

\[
E(g(X)) = \int_{-\infty}^{\infty} g(x) \, f(x) \, dx
\]

where X is drawn from the p.d.f. f. Suppose we have X uniformly distributed on an interval [a,b]
and we have some function g(x) for which we want to estimate the mean.

Example 4.8.1. What is the exact expectation of the square of a uniform random number on [0, 2]?

Solution
The answer is the following integral (where a = 0 and b = 2)

\[
E(g(X)) = \int_a^b g(x) \, \frac{1}{b-a} \, dx
\]

where g(x) = x². Therefore the mean of a uniform random number from (0, 2) squared is given by

\[
\int_0^2 \frac{x^2}{2} \, dx = \frac{1}{2} \left[ \frac{1}{3} x^3 \right]_0^2 = \frac{4}{3}.
\]


In general, if X is uniform on (a, b) then

\[
E(g(X)) = \int_a^b \frac{g(x)}{b-a} \, dx.
\]

But wait: this gives us a formula for the integral, in terms of the expectation. Flip the sides and
multiply by (b − a). Then we get

\[
\int_a^b g(x) \, dx = (b-a) \, E(g(X)).
\]

That formula gives us a way to estimate the value of the integral:

1. First estimate the value of the expectation E(g(X)) using simulations.

2. Then multiply the value by (b − a) to get an estimate of ∫_a^b g(x) dx.

This approach is called Monte Carlo integration (after the gambling capital, Monte Carlo). Relative
to other numerical methods for integration, Monte Carlo integration is better for high dimensional
problems, but we’ll just look at one dimension in this course.

Practice

Example 4.8.2. Using Monte Carlo integration, check that

\[
\int_0^{\pi/2} \cos x \, dx = [\sin x]_0^{\pi/2} = 1.
\]
Solution
We have a = 0, b = π/2, g(x) = cos x. We first estimate E(g(X)) where X is uniform on [0, π/2]. The
integral is then (π/2)E(g(X)).

n = 1000;
total = 0; % initial sum
for i=1:n
    x = (pi/2)*rand(); % simulate a uniform RV between 0 and pi/2
    y = cos(x); % the function of x as given
    total = total + y; % adding y to the total sum
end
expectation = total/n; % estimate expectation
integral = expectation * pi/2 % estimate the integral

An alternative, more efficient, MATLAB code uses vectors:

>> n = 1000;
% simulate a vector of n uniform RVs between 0 and pi/2
>> x = (pi/2)*rand(n,1);
>> y = cos(x); % the function of x as given
>> expectation = mean(y); % estimate expectation
>> integral = expectation * pi/2 % estimate the integral

Remember that if x is a vector then, in MATLAB , cos(x) is the vector formed by applying cos to
all the entries of x. 

MATLAB returns different values each time, but the larger n is, the closer this value is expected to
get to the true answer.
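
To see the effect of n, the sketch below repeats the vectorised estimate of the integral of cos x for three sample sizes and prints the error relative to the true value 1. The sample sizes are our own illustrative choices.

```matlab
for n = [100 10000 1000000]
    x = (pi/2)*rand(n,1);                 % n uniform RVs between 0 and pi/2
    estimate = (pi/2)*mean(cos(x));       % Monte Carlo estimate of the integral
    fprintf('n = %7d   estimate = %.5f   error = %.5f\n', n, estimate, abs(estimate-1));
end
```

The error typically shrinks as n grows, although for any single run a larger n is not guaranteed to give a smaller error.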

Example 4.8.3. Using Monte Carlo integration evaluate

\[
\int_{-1}^{1} e^{-x^2} \, dx
\]

Solution

n=1000;
x=-1+2*rand(n,1); % uniform RV between -1 and 1
y=exp(-x.^2); % element-wise square, since x is a vector
expectation=mean(y);
integral=expectation*2 % multiply by (b-a) = 2
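
This integral has no elementary antiderivative, but it can be written with MATLAB's error function as √π erf(1), which gives a value to compare the estimate against. A sketch of the comparison (larger n and our own variable names):

```matlab
n = 1e5;                                  % number of simulations (illustrative)
x = -1 + 2*rand(n,1);                     % uniform RVs between -1 and 1
estimate = 2*mean(exp(-x.^2));            % Monte Carlo estimate, with (b-a) = 2
exact = sqrt(pi)*erf(1);                  % value of the integral via the error function
fprintf('estimate %.4f, exact %.4f\n', estimate, exact);
```

Both values should be close to 1.494.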

Here is the general loop function for estimating

\[
\int_a^b f(x) \, dx
\]

function integral = estimateIntegral(a,b,N)
total = 0;
for i=1:N
    x = (b-a)*rand() + a; % uniform on (a,b)
    y = f(x); % use a function or MATLAB expression
    total = total + y;
end
expectation = total/N;
integral = (b-a)*expectation;
end

Exercise: Write a general function for estimating integrals Monte Carlo style using vectors.

Chapter 5

Networks and Graphs

Contents
5.1 Introduction
    5.1.1 Theory and Notation
    5.1.2 Representing Graphs
    5.1.3 Graphs in MATLAB
5.2 Graph Dynamical Systems
    5.2.1 Disease Spread
5.3 Walks and Circuits
    5.3.1 Where Graph Theory Started: The Bridges of Königsberg
    5.3.2 The First Graph Theory Proof
    5.3.3 Graph Theory Today: The Traveling Salesman Problem
    5.3.4 Solving the Traveling Salesman Problem
5.4 Colourings
    5.4.1 Where Graph Theory Became Famous: Map Colouring
    5.4.2 Vertex Colourings
    5.4.3 Graph Colourings: Why and How
    5.4.4 Edge Colourings

5.1 Introduction

In this chapter, we will be studying graphs: a mathematical concept designed to let us talk about
objects and the connections between them. While graph theory was originally seen as being a recre-
ational branch of mathematics with few applications, in recent years its power to describe networks
has made it the perfect tool for studying many problems in the modern world. Graphs can be used
to model and study the internet, social networks, the spread of diseases through a population, travel,
computer chip design, and countless other phenomena; they are everywhere, and are at the heart of
exciting new frontiers of research in data science, mathematics, and computer science.


In these notes, we’ll start by building up some definitions that will let us talk about graphs and their
properties, and practice some MATLAB commands that will let us write programs to interact with
graphs. From there, we’ll move to solving various famous problems in graph theory, ranging from the
Bridges of Königsberg and the Four-Color Theorem to more recent phenomena such as the traveling
salesman problem and tournament design. It’ll be fun!

5.1.1 Theory and Notation

A graph, in mathematics, is just a way to describe a set of objects and the relations between them.
In a graph, we call the objects vertices; to represent a relation between two objects, we will draw a
connection between those two vertices, and call that connection an edge.

Formally, we define a graph G as a pair of sets (V, E), where V is the set of vertices and E is the set
of edges. Individual vertices are usually denoted by using lower-case alphabetical letters, like a, b, c
or x, y, z; graphs are typically denoted by capital letters like K, G, H. Edges are typically denoted by
writing pairs of vertices in a set; for instance, we can describe the edge connecting a vertex a to a
vertex b by writing {a, b}. You can also describe an edge by writing a ↔ b; either is acceptable!

We give a few examples of graphs here:


1. Consider[1] the single vertex a. Here V = {a}, E = {}.

2. The graph G = (V, E), where V = {a, b}, E = {a ↔ b}: two vertices joined by a single edge.

3. The graph with vertex set V = {a, b, c, d} and edge set E = {a ↔ b, a ↔ c, a ↔ d, c ↔ d}.

5.1.2 Representing Graphs

There are a number of ways to describe a graph in MATLAB. In this section, we will discuss the two
most straightforward ways to input graphs into MATLAB, and the commands needed to do so.

Edge Lists
Edge lists are probably the easiest way to create a graph in MATLAB. To do this, do the following:
[1] Vertex is the singular of vertices. People also sometimes call a vertex a “node”; in particular, MATLAB often uses
the word node to refer to vertices. In general, because graph theory is a relatively young field, its terminology is still
developing; there are many cases where people have different notations or names for the same things! Also, { thing1,
thing2, thing3 } is the notation we use to describe a set containing the three things thing1, thing2, thing3. Writing
something like {} denotes an “empty” set that contains nothing; you can also write ∅ to denote the empty set, if you
like.

• Make a list of all of the edges in your graph.


• For each edge {s, t}, pick one vertex to be the “starting” vertex, and the other to be the “end”
vertex.
• Make a list [s_1, s_2, ..., s_m] of all of the starting vertices, and make a corresponding list [t_1, ..., t_m]
of all of the ending vertices, so that the start and end vertices “match up” in our lists.
• That’s it: you’ve described all of the information we need to build a graph!

To illustrate this idea, we look at an example here:

Example 5.1.1.

(Figure: a tree with a at the top, joined to b and c; b is joined to d, and c is joined to e, f and g.)

The edge list of this graph is the two vectors [a,a,b,c,c,c], [b,c,d,e,f,g]. This is because we can
represent the edges of the graph (a↔b, a↔c, b↔d, c↔e, c↔f, c↔g) as

[a, a, b, c, c, c]
 ↕  ↕  ↕  ↕  ↕  ↕
[b, c, d, e, f, g].
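
As a rough sketch of how the edge list in Example 5.1.1 might be typed into MATLAB, the block below passes the two lists to graph() as cell arrays of vertex names and then uses findedge() to ask whether particular edges are present. The variable names are made up for this illustration.

```matlab
% Edge list from Example 5.1.1, using the vertex names as labels.
s = {'a','a','b','c','c','c'};   % starting vertices
t = {'b','c','d','e','f','g'};   % matching end vertices
G = graph(s,t);

numberOfEdges = numedges(G);     % how many edges the graph has
idx_cf = findedge(G,'c','f');    % nonzero: the edge c <-> f is in the list
idx_ad = findedge(G,'a','d');    % zero: there is no edge a <-> d
```

findedge() returns the position of the edge in the graph's edge table, or 0 when the edge does not exist.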

Edge lists are simple and take up very little space. But for large graphs these lists can become large
and cumbersome to search through! If we needed to find whether the graph contains a particular
edge, for instance, we would have to search through the entire list (which can be an exhausting job.)
In cases like this, the adjacency matrix is often a good way to describe a graph:

Adjacency Matrices
Similar to edge lists, adjacency matrices describe a graph by describing the edges between vertices.
Given a graph G on n vertices, we can build the adjacency matrix A_G for G as follows:

• Label the n vertices of your graph G as v_1, v_2, ..., v_n.
• Build an n × n matrix A_G as follows: for each entry (i, j) in the matrix, make A_G(i, j) = 1 if
there is an edge connecting[2] vertices v_i, v_j, and make A_G(i, j) = 0 otherwise.
• That’s it: you’ve described all of the information we need to build a graph!

Example 5.1.2.

(Figure: the graph on vertices a, b, c, d, e, f with edges b ↔ c, d ↔ e, d ↔ f and e ↔ f; the vertex a has no edges.)

The adjacency matrix for this graph is

      a  b  c  d  e  f
  a   0  0  0  0  0  0
  b   0  0  1  0  0  0
  c   0  1  0  0  0  0
  d   0  0  0  0  1  1
  e   0  0  0  1  0  1
  f   0  0  0  1  1  0

[2] If two vertices v_i, v_j are connected by an edge, then we say that the vertices are adjacent, or that v_i and v_j are
neighbours.

Adjacency matrices usually take up more space than edge lists, and can be harder to make. However,
the trade-off is that they are incredibly easy to search through! To find out if there’s an edge from v_i
to v_j, you just find the corresponding element A_G(i, j) of the matrix and check if it is a 0 or a 1.

It’s also very easy to see how many neighbours any vertex has: we can just count how many ones
are in the corresponding row (or column, since the matrix is symmetric). In our example, it’s fast to
see that vertex a has 0 neighbours and is isolated from every other vertex in the graph, just from
reading the top row of the matrix.

Note: We call the number of neighbours a vertex has the degree of that vertex. For instance, we could
write deg(a) = 0 to denote that the vertex a in our earlier example has no neighbors.
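
A short sketch of both lookups, using the adjacency matrix from Example 5.1.2 with the vertices a to f numbered 1 to 6 (the numbering and variable names are assumptions made for this illustration):

```matlab
% Adjacency matrix from Example 5.1.2 (a=1, b=2, c=3, d=4, e=5, f=6).
A = [0 0 0 0 0 0;
     0 0 1 0 0 0;
     0 1 0 0 0 0;
     0 0 0 0 1 1;
     0 0 0 1 0 1;
     0 0 0 1 1 0];

hasEdge_bc = (A(2,3) == 1);   % is there an edge between b and c?
hasEdge_ab = (A(1,2) == 1);   % is there an edge between a and b?
degrees = sum(A,2);           % row sums: the degree of each vertex
```

Each lookup is a single matrix access, and sum(A,2) counts the ones in every row at once.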

5.1.3 Graphs in MATLAB


To generate a graph in MATLAB we use the command graph(). Let’s try to create a few graphs!

(Figure: the graph on vertices a, b, c, d with edges a ↔ b, a ↔ c, a ↔ d and c ↔ d.)

We can use an adjacency matrix to create a graph:

>> A = [0,1,1,1;
        1,0,0,0;
        1,0,0,1;
        1,0,1,0];
G1 = graph(A);

Alternately, we can use the edge list technique to make the same graph:

>> s = [1 1 1 4];
t = [2 3 4 3];
G2 = graph(s,t);

Notice that we’ve used numbers to keep track of the vertices in our code. Here, for example, we used
the numbers 1,2,3,4 instead of the letters a,b,c,d. This is because in most situations MATLAB indexes
vertices using numbers (which is what you’d expect a computer language to do!), and doing this can
make it much easier to write clean and usable code.

If we want to plot the two graphs G1 , G2 from above, we can use the plot() command as follows:
>> plot(G1); title('G1')
figure
plot(G2); title('G2')

Note: MATLAB might not draw the graph in the same way as you. This doesn’t mean your
code is wrong! There are often many ways to draw the same graph.

In this example, for instance, we can see that while G1 and G2 look the same, they do not look like
our original graph! This is not because we’ve done anything wrong: indeed, if we look at the actual
connections between our vertices we can see that these are the same graphs, except MATLAB has
decided to move the “central” vertex outside of the triangle.
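
If you would rather not compare drawings by eye, one possible check (a sketch, not the only way) is to compare the adjacency matrices of the two graphs, or to ask MATLAB whether the graphs have the same structure with isisomorphic():

```matlab
% Rebuild G1 and G2 exactly as in the text above.
A = [0,1,1,1;
     1,0,0,0;
     1,0,0,1;
     1,0,1,0];
G1 = graph(A);
G2 = graph([1 1 1 4],[2 3 4 3]);

% adjacency() returns a sparse matrix; full() converts it for comparison.
sameAdjacency = isequal(full(adjacency(G1)), full(adjacency(G2)));
sameStructure = isisomorphic(G1,G2);
```

Both checks look only at which vertices are connected, so they are unaffected by where plot() decides to place the vertices.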

5.2 Graph Dynamical Systems

We can use graphs to model processes that evolve over time. For example, we can use graphs to study
how a genetic trait might spread through a population over time, or how traffic patterns evolve over
the course of a day. In this section, we’ll turn to a type of problem we’ve encountered earlier in this
course: modeling how a disease might spread through a social network.

5.2.1 Disease Spread

In this section, we will build a program that simulates disease spread using graphs. To start with
something tangible, let’s look at a small class of students, with social connections used to draw our
edges:

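(Figure: the classroom friendship graph. Its vertices are the students Anna, Elle, Anand, Nathan,
Pascal, Susana, Vivien, Mostafa, Elias, Rebekah and Ielyaas, and its edges are their social connections.)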

Here the connections represent friendships between students who spend time together or who share
the same office. We want a program that allows us to choose which people start out sick, and then
models how the sickness spreads throughout the rest of the people in the class.

Creating the Graph

If we want to set this problem up in MATLAB we need to first create this graph. Recall that MATLAB
indexes vertices with numbers. So we should first rewrite the graph using numbers, and then write
down the edge list for the graph.

(Figure: the classroom graph with its vertices relabelled by numbers: Anna = 1, Pascal = 2, Susana = 3,
Elle = 4, Rebekah = 5, Elias = 6, Ielyaas = 7, Vivien = 8, Mostafa = 9, Anand = 10, Nathan = 11.)

Edge list:
[1,1,1,2,2,3,3,4,5,5,6,7,7,8,8,8,9,9,10],
[2,3,4,3,4,4,5,10,6,7,7,8,9,9,10,11,10,11,11].

function Classroom = sickClassroom()
% We are writing a program that will model a
% disease spreading through a small classroom.
% We use a predefined classroom for this
% program.
% Creating the Edge Lists:
s=[1,1,1,2,2,3,3,4,5,5,6,7,7,8,8,8,9,9,10];
t=[2,3,4,3,4,4,5,10,6,7,7,8,9,9,10,11,10,11,11];
% Creating the graph from edge list:
Classroom = graph(s,t);
end

Defining Rules of Sickness

Now that we have our graph in MATLAB we’re going to need a model of how sickness spreads. That
is, we need some rules to tell us how a person gets sick, and what happens when they are sick. For
our model we’re going to assume the following rules:

1. If a healthy person comes into contact with someone that is sick, then they get sick themselves.
2. If a sick person is around fewer than two other sick people, then they start recovering. Otherwise
they stay sick.
3. A recovering person will become healthy. While they are recovering they can’t become sick.

These rules form a simple model for how sickness spreads amongst people. The next question becomes
how are we going to implement this model in our program.

Node Properties

To begin implementing this model we first have to learn about node properties. We need to be able
to describe each of our vertices as being healthy, infected, or recovering. How do we do that? In
MATLAB we would call this - being sick or not - a property that vertices can take on. In MATLAB
we can access and assign ‘properties’ to our graph’s vertices using some special notation. We named
our graph Classroom so we can access all the vertices by writing

>> Classroom.Nodes
ans =
11x0 empty table

The details of what a table is aren’t hugely important. The important thing is that we have access to
where our vertices are stored. Currently the table is size 11x0. This means that our 11 vertices are
there, but they have zero properties so far. (By default, vertices start with no properties in MATLAB.)

So: let’s establish a new property! Specifically, let’s give our node the property of being sick or not.
We’ll do this by initialising every vertex as being a healthy person. We do this by using the general
notation Graph.Nodes.<property> = <vector>, where <property> can be named by us, and
the vector is going to contain the value of the property for each vertex.

% We establish the new property Sickornot using Graph.Nodes.Property.


% We initialise this property by setting all 11 vertices to 'H' for 'healthy'.
% We will be able to access this information later.
% The command repmat creates a vector of 'H' of size 11x1.
Classroom.Nodes.Sickornot = repmat('H',11,1);

Now if we look at the vertices of the graph, we’ll be able to see the new property has been established:

>> Classroom.Nodes
ans =
  11x1 table
    Sickornot
        H
        H
        H
        H
        H
        H
        H
        H
        H
        H
        H

Great! Our code works so far, and everyone currently starts out healthy.
Now, we need to be able to set some vertices to ‘sick’ (‘S’). We want
to make it so we can choose anyone to start out sick, so we write our
function to have the input variable whoIsSick, and then set those nodes
to be sick later in the code.
That is, we want to write something like the code below:

function out = SickClassroom(whoIsSick)

...

% We store the vertex number of everyone who starts sick
% in the vector whoIsSick.
% Then we can use it to denote who starts off sick.
Classroom.Nodes.Sickornot([whoIsSick]) = 'S';

...
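
The fragments above can be tried out in a plain script before wrapping them in a function. The sketch below does this; the choice of vertices 6 and 9 as the initially sick people, and the helper variable sickList, are assumptions made only for this illustration.

```matlab
% Build the classroom graph from its edge list.
s = [1,1,1,2,2,3,3,4,5,5,6,7,7,8,8,8,9,9,10];
t = [2,3,4,3,4,4,5,10,6,7,7,8,9,9,10,11,10,11,11];
Classroom = graph(s,t);

% Everyone starts out healthy ...
Classroom.Nodes.Sickornot = repmat('H',11,1);

% ... except the people listed in whoIsSick (hypothetical choice).
whoIsSick = [6 9];
Classroom.Nodes.Sickornot(whoIsSick) = 'S';

% Read the property back: which vertices are currently sick?
sickList = find(Classroom.Nodes.Sickornot == 'S')';
```

Typing Classroom.Nodes afterwards shows the updated table, with 'S' in the rows of the chosen vertices.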

Updating the Vertices

Now that we have our graph all set up, we’re ready to start modelling! To do this, we use the following
algorithm to implement the “disease spread” rules we came up with earlier:

Init: Take our graph G, and make a copy[3] Copy_G of it.

1. One by one, take each vertex v in G.
2. In Copy_G, look up v’s current state, as well as the states of all of its neighbors.
3. Use this information to update v’s state in G.

[3] This is so that we only change our graph at the end, not while we’re working on it!

Let’s talk about how we’d implement this algorithm in MATLAB. The first two steps (copying the
graph, and then going through each vertex) are pretty straightforward:

% Copy the current graph so we don't overwrite the current information:
Classroom_update = Classroom;
% Use a for loop to cycle through each vertex:
for current_node = 1:11
    ...
end

Now, in this for loop, we need to check the state of our vertex and its neighbors. Checking the state
of our vertex is pretty easy: we can just write something like

current_state = Classroom.Nodes.Sickornot(current_node);

To find all the neighbouring vertices, we can use the MATLAB function neighbors(G,i) (note that
MATLAB uses the spelling “neighbors”); given a graph G and vertex i in G, this gives a list of all
vertices adjacent to i in G.

For example, the following code will count the number of infected neighbors of a vertex i:

current_neighbors = neighbors(Classroom,i);
num_sick_neighbors = sum(Classroom.Nodes.Sickornot(current_neighbors)=='S');

Notice that the statement sum(Classroom.Nodes.Sickornot(current_neighbors)=='S') is using
logical statements to compare the state of all the neighbours with the character 'S'. That is:
Classroom.Nodes.Sickornot(current_neighbors) will give us a vector of entries that are all
either H, S or R depending on whether their corresponding people are healthy, sick or recovering. If
we write =='S' after this vector, it turns this into a vector where each entry is either 0 if it was not
equal to S, or 1 if it was. To finish this up, the sum() command adds up all of the ones, and gives us
a count of sick people adjacent to i.
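
Here is the same comparison on a tiny made-up vector of states, which may make the intermediate 0/1 vector easier to see (the variable names are chosen just for this illustration):

```matlab
states = ['H';'S';'R';'S'];   % hypothetical states of four neighbours
isSick = (states == 'S');     % logical vector: 1 where the state is 'S'
numSick = sum(isSick);        % adding the ones counts the sick neighbours
```

Here isSick has a 1 in its second and fourth entries, so numSick counts two sick neighbours.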

Once we have the number of infected neighbours, we have all the information we need to be able to
implement our three rules we came up with for how sickness spreads. We can implement these rules
using if and else statements:

Classroom_update = Classroom;
for i = 1:11
    current_state = Classroom.Nodes.Sickornot(i);
    current_neighbors = neighbors(Classroom,i);
    num_sick_neighbors = sum(Classroom.Nodes.Sickornot(current_neighbors) == 'S');

    if (current_state == 'H' && num_sick_neighbors >= 1)
        % Rule 1: a healthy person with a sick neighbour gets sick.
        Classroom_update.Nodes.Sickornot(i) = 'S';
    elseif (current_state == 'S' && num_sick_neighbors < 2)
        % Rule 2: a sick person with fewer than two sick neighbours starts recovering.
        Classroom_update.Nodes.Sickornot(i) = 'R';
    elseif (current_state == 'R')
        % Rule 3: a recovering person becomes healthy.
        Classroom_update.Nodes.Sickornot(i) = 'H';
    end
end
Classroom = Classroom_update;

The very last line of code here just saves over the previous graph with the nice, new updated graph - we
wouldn’t want to lose all our work!

The Final Program

If you put all of this together, you should have a program that lets you simulate how a disease might
spread in this classroom! Try it on your own, and check Canvas if you get stuck; we’ve uploaded a
fully-functional version of this code there for you to look at.
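
For orientation, here is one way the pieces from this section might be assembled. It is only a sketch and not the version posted on Canvas: it is written as a script rather than a function, it assumes that the two inputs are the number of time steps (called numSteps here) and the list of initially sick vertices, and it uses "fewer than two sick neighbours" for rule 2.

```matlab
numSteps = 3;        % assumed input: how many time steps to simulate
whoIsSick = 6;       % assumed input: vertex 6 (Elias) starts off sick

% Create the classroom graph and mark everyone healthy, then the sick.
s = [1,1,1,2,2,3,3,4,5,5,6,7,7,8,8,8,9,9,10];
t = [2,3,4,3,4,4,5,10,6,7,7,8,9,9,10,11,10,11,11];
Classroom = graph(s,t);
Classroom.Nodes.Sickornot = repmat('H',11,1);
Classroom.Nodes.Sickornot(whoIsSick) = 'S';

for step = 1:numSteps
    Classroom_update = Classroom;      % work on a copy of the graph
    for i = 1:11
        current_state = Classroom.Nodes.Sickornot(i);
        current_neighbors = neighbors(Classroom,i);
        num_sick_neighbors = ...
            sum(Classroom.Nodes.Sickornot(current_neighbors) == 'S');
        if (current_state == 'H' && num_sick_neighbors >= 1)
            Classroom_update.Nodes.Sickornot(i) = 'S';   % rule 1
        elseif (current_state == 'S' && num_sick_neighbors < 2)
            Classroom_update.Nodes.Sickornot(i) = 'R';   % rule 2
        elseif (current_state == 'R')
            Classroom_update.Nodes.Sickornot(i) = 'H';   % rule 3
        end
    end
    Classroom = Classroom_update;      % save the new states
    disp(Classroom.Nodes.Sickornot')   % print the states after this step
end
```

Reading every state from Classroom and writing every change to Classroom_update means that all vertices are updated from the same snapshot within a time step.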

To illustrate how it works, we show what our code does if Elias starts off sick and we ask it to go
forward in time by three steps (e.g. if we type SickClassroom(3,[6]) into the command line in
MATLAB):

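(Figures: the classroom graph, drawn with the students' names and with the vertex numbers
Anna = 1, Pascal = 2, Susana = 3, Elle = 4, Rebekah = 5, Elias = 6, Ielyaas = 7, Vivien = 8,
Mostafa = 9, Anand = 10, Nathan = 11, at each stage of the simulation.)

Figure 5.1: Initial Graph
Figure 5.2: First time step
Figure 5.3: Second time step
Figure 5.4: Third time step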

5.3 Walks and Circuits

In the previous section, you did a lot with graphs: you learned what a graph is, picked up several
pieces of vocabulary that are useful when describing graphs, and then saw how to use MATLAB to
model real-life situations with graphs!

Graph theory can do a lot more than just model things, however: it’s a fascinating area of study that
can actually solve tricky problems through its own ideas and techniques. In this section, we’re going
to study two such concepts in the field of graph theory, the first of which started the field of graph
theory proper:

5.3.1 Where Graph Theory Started: The Bridges of Königsberg

Puzzle.

In the early 18th century, the Prussian city of Königsberg was famous for its beautiful downtown.
At the time, the city was divided by the river Pregel into four parts: a northern region, a southern
region, and two islands. These regions were connected by seven ornate bridges, drawn in red in the
map at right.

(Figure: Map of Königsberg in Euler’s time. Image from public domain, namely Wikipedia’s Seven
Bridges of Königsberg page.)

On nice days, residents of the city would often go out for a walking tour through the city that
would try to cross every bridge. No matter how hard they tried, though, they found it impossible
to make a route that would start and end at the same place and cross every bridge; every route
they made would accidentally “double back” and walk on some bridges more than once.

Despite this, though, no-one in the city could come up with a reason why this was impossible!
Trying to resolve this conundrum became popular with the citizens of Königsberg, and was soon
shared by its mayor with other cities in a hope that someone could find a solution. Eventually it
made its way to the prolific mathematician Leonhard Euler, who heard the problem described as
follows:

“Can you come up with a path through Königsberg that starts and ends at the same place,
and walks over each bridge exactly once?”

Before reading further, take out pen and paper and try to solve the problem yourself! Can you do it?
Or is it impossible (and if so, can you explain why?)

The first key to this problem is the idea of a graph that you’ve developed earlier. Specifically, consider
the following way to turn our map of Königsberg into a graph K:

• Take each of the four regions of Königsberg, and turn each into a vertex: that is, make a vertex N
for the northern region, a vertex S for the southern region, and vertices I_1, I_2 for the two islands.
• Now, take each bridge and turn it into an edge: that is, draw two edges between N and I_1, two
edges between I_1 and S, and an edge from I_2 to each of N, I_1 and S.
• This gives you a graph K (drawn at right); we think of this graph as representing the important
connections of the city of Königsberg, but without all of the other extra details that could get in
the way.

(Figure: The city of Königsberg, now in graph form!)

Doing this gives you a sort of funny-looking graph, where we have some pairs of vertices linked by
multiple edges! This is OK, though. In graph theory, we call these kinds of graphs multigraphs;
conversely, if we want to talk about graphs where we only allow up to one edge between any two
vertices, we’ll call those kinds of graphs simple graphs.
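
As a sketch of how K could be entered into MATLAB, the block below lists each bridge as one entry of an edge list, using the (assumed) numbering N = 1, I_1 = 2, I_2 = 3, S = 4. It relies on graph() accepting repeated edges, which to the best of our knowledge is supported from MATLAB R2018a onwards; in older versions this call may fail.

```matlab
% One entry per bridge; vertices numbered N=1, I1=2, I2=3, S=4 (assumed).
s = [1 1 2 2 3 3 3];
t = [2 2 4 4 1 2 4];
K = graph(s,t);              % repeated pairs give multiple edges

isMulti = ismultigraph(K);   % true when some pair of vertices is joined twice
bridgeCount = numedges(K);   % one edge per bridge
degs = degree(K);            % number of bridge ends at each region
```

The vector degs lists how many bridges touch each of the four regions, which is exactly the information the rest of this section uses.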

With this concept in place, we can now rigorously define the idea of a walk:

Definition 5.3.1. In a graph G = (V, E), we define a walk of length n to be any sequence of n edges
{v_0, v_1}, {v_1, v_2}, {v_2, v_3}, ..., {v_{n-1}, v_n}, such that all of these edges are in our graph G. We say that
this walk starts at v_0 and ends at v_n. If v_0 = v_n (that is, if we start and end at the same place) we
call this walk a circuit.

Notice that circuits are allowed to repeat edges and vertices if they want, and also not use all of the
vertices or edges in a graph.

If a circuit contains every edge in G exactly once, we call it Eulerian, in honour of the mathematician
Leonhard Euler. Similarly, if a circuit contains every vertex in G exactly once, we call it Hamiltonian
(named for William Rowan Hamilton, another famous mathematician.)

Using the language of graph theory, our question from earlier can be phrased as follows: does the graph
K drawn earlier have an Eulerian circuit? In general, what kinds of graphs have Eulerian circuits,
and which do not? How can we quickly tell if a graph has an Eulerian circuit or not?

As a warm-up, let’s look at some simpler graphs: take the graph H drawn at right, for instance. We
can quickly see that H does not have an Eulerian circuit:

(Figure: A graph H with no Eulerian circuit. Its labelled vertices are x, y and z, and x has only one edge.)

• If we were to start our circuit at the vertex x in our graph H, we could never return: there’s only
one edge connecting x to other vertices of our graph, and so once we leave x we can’t return
without repeating an edge.
• If we didn’t start at x, though, then as soon as our walk gets to x we’re stuck: there’s only one
route into x, and so if we don’t repeat edges we can’t leave once we’re there!

With a bit of creativity, you can extend this result as follows:



Theorem 5.3.2. If G is a graph that has at least one vertex of degree 1, then G does not have an
Eulerian circuit.

Proof. To prove a claim like this, we need to make an argument that applies to any graph G with a
vertex of degree 1! So, we can’t just look at an example like H above, because that wouldn’t persuade
someone if they were skeptical: they’d just say that that example was dumb, and that there could be
other graphs that did work.

Instead, we need to make an argument about all graphs G that have a vertex of degree 1. That is:
suppose that G is a graph, and x is a vertex in G with deg(x) = 1. Why does G not have an Eulerian
circuit?

Well: look at the logic we used earlier!

• If we start a walk at the vertex x in our graph G, we can never return: there’s only one edge
connecting x to other vertices of our graph, because deg(x) = 1.
• If we didn’t start at x, though, then as soon as our walk gets to x we’re stuck: there’s only one
route into x!
Because we’re trying to make an Eulerian circuit, our walk needs to use all of the edges in G; this
means that we’ll eventually have to visit x sometime, and then we’ll get stuck (and thus not be
able to form a circuit.)

So we’ve proven our claim!

So: this gives us a condition under which Eulerian circuits don’t exist. However, there are graphs
(like our graph K from Königsberg, or the one at right) that don’t have any vertices of degree 1,
but also don’t seem to have any Eulerian circuits! How can we deal with graphs like these?

(Figure: Another graph with no Eulerian circuit.)

The answer is the following theorem of Euler, famously presented to the St. Petersburg Academy in
1735 as the first theorem in graph theory:

5.3.2 The First Graph Theory Proof

Theorem 5.3.3. A connected graph G has an Eulerian circuit precisely whenever it has no vertices
of odd degree. That is: if a graph G has any vertex with odd degree (like a vertex of degree 1, or
degree 3) it cannot have an Eulerian circuit. Conversely, if every vertex in G has even degree, then it
must have an Eulerian circuit!

Proof. This is a tricky proof! It has two main parts:

♥ First, we should show that if G has an Eulerian circuit, it has no odd-degree vertices.

♣ Then, we should show the reverse direction: that is, we should prove that if G has no vertices
of odd degree, then it must have an Eulerian circuit.

We start with (♥). Suppose that G = (V, E) is a graph with an Eulerian circuit. Write down that
Eulerian circuit here, as {v_0, v_1}, {v_1, v_2}, {v_2, v_3}, ..., {v_{n-1}, v_n}, {v_n, v_0}.

Pick any vertex x ∈ V. Notice that each time x comes up in the above circuit, it does so twice: if
x = v_i for some i, it shows up in both {v_{i-1}, v_i} and {v_i, v_{i+1}}. You can think of this as saying that
each time our circuit “enters” a vertex along some edge, it must “leave” it along another edge!

As a result, any vertex x shows up an even number of times in the circuit we’ve come up with here.
But our circuit is Eulerian; that is, it contains every edge in E exactly once. As a result, every vertex
x shows up in an even number of edges in E; that is, deg(x) is even for every vertex x, as claimed! So
we’ve proven this half of our claim.

We now proceed to (♣). Suppose that G is a connected graph in which all of our vertices have even
degree; we want to find an Eulerian circuit in G.

To do this, consider the following process for generating a circuit in G:

Init: Pick a vertex v_0 at random from V. Think of v_0 as our current location, and our
current path as the empty path.
1. If we are currently at some vertex v_i, randomly choose a vertex v_{i+1} so that the edge
{v_i, v_{i+1}} is not yet in our path. Add {v_i, v_{i+1}} to our path, and update our current
location to v_{i+1}.
2. Repeatedly do step 1 above until we get back to v_0.

Notice that because the degree of every vertex in G is even, step 1 in this process can never fail: if we
are able to “enter” a vertex along some edge, there must be a corresponding edge we can “leave” on!
Because G has a finite number of edges, we also can’t keep repeating step 1 forever; so we must
eventually get back to v_0. In other words, the process above generates a circuit! Call it C.
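
To make the process concrete, here is a rough sketch of it on a small made-up graph: two triangles that share vertex 1, so every degree is even. It covers only this first "find a circuit" stage, works on an adjacency matrix rather than a graph object, and assumes a simple graph; all of the names are chosen for this illustration.

```matlab
% Two triangles (1-2-3 and 1-4-5) sharing vertex 1: all degrees are even.
A = [0 1 1 1 1;
     1 0 1 0 0;
     1 1 0 0 0;
     1 0 0 0 1;
     1 0 0 1 0];

remaining = A;        % edges that are not yet in our path
start = 1;            % plays the role of v_0
current = start;
circuit = start;      % the vertices visited so far, in order

while true
    candidates = find(remaining(current,:));           % unused edges here
    nextVertex = candidates(randi(numel(candidates))); % choose one at random
    remaining(current,nextVertex) = 0;                 % mark the edge as used
    remaining(nextVertex,current) = 0;
    circuit(end+1) = nextVertex;                       % extend the path
    current = nextVertex;
    if current == start                                % back at v_0: stop
        break
    end
end
```

On this graph the loop always stops after walking around one of the two triangles; the edges still marked in remaining are the ones the "pasting" step described next would have to pick up.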

If this circuit is Eulerian, sweet; we’re done. If not, though, it’s not too hard to make it Eulerian!
Simply do the following:

Init: Take G, and delete C’s edges from G. Because every vertex shows up an even number
of times in a circuit (as shown earlier!), this doesn’t change our “all vertices have
even degree” property.
1. If G has edges that aren’t in C, then (because G’s connected) there must be some
vertex v_i in our circuit that still has nonzero degree.
2. Starting from v_i, run our “find a circuit” algorithm, to get another circuit C′ that
starts and ends at v_i.
3. Now, “paste” that circuit C′ into our original circuit, by traveling along C until
we get to v_i, then taking the circuit C′ which starts and ends back at v_i, and then
resuming the original circuit C. We’ve made a bigger circuit!
4. If G still has edges, go to 1 and do it all again!

This process will “grow” our circuit on each pass, and is again guaranteed to work because our degrees
stay even on each loop of our algorithm. So doing this repeatedly will generate an Eulerian circuit for
us, and thus complete our proof!

If the (♣) half of the argument above was a bit too complex for you, try drawing out a graph where
all of its degrees are even, and then “running” by hand the process described for making an Eulerian
circuit.

To check your understanding, here are a few exercises to try out:

1. What does the theorem above tell you about the Seven Bridges of Königsberg problem?

2. Four graphs are drawn at right. Which of them have Eulerian circuits? Which do not?

3. Write MATLAB code that when given a graph G, tells you whether G has an Eulerian circuit.
4. Write MATLAB code that when given a graph G that has an Eulerian circuit, can actually draw
an Eulerian circuit in that graph.

(You can find coded answers to the MATLAB questions on Canvas, if you’re stuck or curious!)

5.3.3 Graph Theory Today: The Traveling Salesman Problem

While taking scenic walks is certainly enjoyable, most modern applications of graph theory are much
more practical at their heart. Consider the following task, known as the traveling salesman prob-
lem:

Puzzle. Suppose that you’re a traveling salesman. In particular, you’re traveling the South Island,
and trying to sell rugby tickets for the nine rugby teams there (illustrated in the map below.)

(Figure: a map of the South Island rugby regions Tasman, Buller, West Coast, Canterbury,
Mid-Canterbury, South Canterbury, North Otago, Otago and Southland, alongside a graph of the same
regions with the travel times between adjacent regions labelled on its edges.)

You want to start and finish in Mid-Canterbury, and visit each other region exactly once to sell tickets
in it. The travel times between adjacent regions are labeled on the edges of the graph at right. What
circuit can you take through these cities that minimizes your total travel time, while still visiting
each city exactly once?
Without knowing any mathematics, you’d probably guess that the shortest route is to just go around
the perimeter of the island. Intuitively, at the least, this makes sense: avoiding the Southern Alps is
probably a good way to save time!

In real life, however, maps can get a lot messier than this. Consider a map of all of the airports in
the world, or even just in New Zealand (at right.) If you were an Air New Zealand representative and
wanted to visit each airport, how would you do so in the shortest amount of time and still return
home to Auckland?

(Figure: Publicly-available map sourced from http://www.airlineroutemaps.com’s Air New Zealand page.)

In general: suppose you have n cities C_1, ..., C_n that you need to visit for work, and you’re trying
to come up with an order to visit them in that’s the fastest. For each pair of cities {C_i, C_j}, assume
that you know the time it takes to travel from C_i to C_j. How can you find the fastest way to visit
each city exactly once, so that you start and end at the same place?

These sorts of tasks are known as traveling salesman problems, and companies all over the world
solve them daily to move pilots, cargo, and people to where they need to be.

It’s also a task that’s remarkably similar to the Eulerian circuits we were looking at before! From the
graph theory perspective, a solution to the traveling salesman problem is a circuit that visits every
vertex exactly once, in such a way that the total “travel time,” as measured by labels we put on all
of the edges of our graph, is minimized. (Note that we don’t have to use every edge in these solutions:
we just need to visit each vertex once and start and end in the same place.)

5.3.4 Solving the Traveling Salesman Problem

Given that humans solved the Eulerian path problem in 1735, and that the traveling salesman problem
sounds a lot more practical than touring bridges, you’d think that we’d have a good solution to this
problem by now, right?

. . . not so much. Finding a “quick” solution to the traveling salesman problem is an open problem
in mathematics; if you could find an efficient solution to the traveling salesman problem for certain
specialized notions of efficient, you would solve a problem that’s stumped mathematicians for nearly a
century, advance mathematics and computer science into a new golden age, and quite likely go down
in history as one of the greatest minds of the millennium⁴.

⁴ So, uh, extra-credit problem.

This is a fancy way of saying “this problem is really hard.” So: why mention it here? Well: in math-
ematics in general, and graph theory in particular, we often find ourselves having to solve problems
that don’t have known good or efficient algorithms. Despite this, people expect us to find answers
anyways: so it’s useful to know how to find “good enough” solutions in cases like this!

For the traveling salesman problem, one brute-force approach you could use to find the answer could
be coded like this:

Init: Take our graph G, containing n vertices. Let s be the vertex we start and end at.
Let c be a cost function, that given any edge {x, y} in G outputs the cost of traveling
along that edge.
1. Write down every possible order in which we can list the n vertices of G, starting
and ending at s.
2. For each order {s, v1}, {v1, v2}, ..., {vn−2, vn−1}, {vn−1, s}, calculate c({s, v1}) +
c({v1, v2}) + ... + c({vn−1, s}). Assume that c({x, y}) is infinite if the edge doesn’t
exist (i.e. that it would take “forever” to travel along a path that is impossible to
travel along.)
3. Output the smallest number/path you find.

Points in favor of this algorithm: it works! Also, it’s not too hard to code (try it!)
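
As a rough illustration of the steps above, here is a minimal sketch in MATLAB. The 4-city matrix T of travel times is made up for this example (it is not the South Island graph), and city 1 plays the role of s. A missing edge could be recorded as Inf, as the algorithm suggests.

```matlab
% Hypothetical example: T(i,j) is the travel time between cities i and j.
T = [ 0  2  9 10;
      2  0  6  4;
      9  6  0  3;
     10  4  3  0];
n = size(T, 1);
orders = perms(2:n);              % every ordering of the other n-1 cities
bestCost = Inf;
bestTour = [];
for r = 1:size(orders, 1)
    tour = [1 orders(r, :) 1];    % start and end at city 1
    cost = 0;
    for k = 1:numel(tour) - 1
        cost = cost + T(tour(k), tour(k+1));
    end
    if cost < bestCost            % keep the cheapest tour seen so far
        bestCost = cost;
        bestTour = tour;
    end
end
disp(bestTour)
disp(bestCost)
```

For this made-up matrix the cheapest tour costs 18; it can be traversed in either direction, so which of the two the code reports depends on the order in which perms lists the orderings.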

Points against this algorithm: if you were trying to visit 25 cities in a week, it would take the world’s
fastest supercomputer over ten thousand years to answer your problem. (If you were trying to visit
75 cities, I think the heat death of the universe occurs before this algorithm is likely to terminate.)

This is because the algorithm needs us to consider every possible order of the n vertices in G.
There are (n − 1)! = (n − 1) · (n − 2) · (n − 3) · ... · 3 · 2 · 1 ways in which we can order
our n cities⁵, and the factorial function grows incredibly quickly: in general, if you have a program
whose runtime is measured by a factorial function, it stops being something you can run very,
very quickly.
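
To get a feel for this growth, the following small sketch prints (n − 1)! for a few values of n. The particular sizes are chosen only for illustration.

```matlab
sizes = [5 10 15 20 25];            % illustrative numbers of cities
orderings = factorial(sizes - 1);   % (n-1)! tours that start and end at s
for k = 1:numel(sizes)
    fprintf('n = %2d cities: %.3g possible orderings\n', sizes(k), orderings(k));
end
```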

Another approach (which, as someone who would like to book their travel before the heat death of
the universe, I’m in favor of) is to use randomness to solve this problem! Consider the following
algorithm:

Init: Take our graph G, starting vertex s, and cost function c just like before.
1. Start from s and randomly choose a city we haven’t visited, and then go to that city.
2. Keep randomly picking new cities until we’ve run out of new choices, and then return
to s.
3. Calculate the total cost of that path.
4. Run this process like ten thousand times (which, while large, will be much smaller
than n! for almost all values of n that you’ll run into!)
5. Output the smallest number/path you find.
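
A minimal sketch of this randomized approach, again on a made-up 4-city matrix T with city 1 as s. Picking unvisited cities one at a time at random is the same as choosing a random ordering of the other cities, which is what randperm does here. The number of trials is an arbitrary choice.

```matlab
% Hypothetical example: T(i,j) is the travel time between cities i and j.
T = [ 0  2  9 10;
      2  0  6  4;
      9  6  0  3;
     10  4  3  0];
n = size(T, 1);
numTrials = 10000;                 % arbitrary number of random tours to try
bestCost = Inf;
bestTour = [];
for trial = 1:numTrials
    others = 1 + randperm(n - 1);  % random order of cities 2..n
    tour = [1 others 1];           % start and end at city 1
    cost = 0;
    for k = 1:numel(tour) - 1
        cost = cost + T(tour(k), tour(k+1));
    end
    if cost < bestCost             % remember the best tour found so far
        bestCost = cost;
        bestTour = tour;
    end
end
disp(bestTour)
disp(bestCost)
```

On a graph this small the random search will almost certainly stumble on an optimal tour; the interesting comparison is on larger graphs, where brute force is no longer feasible.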

⁵ To see why, think about how you’d make an ordering of the cities. You’d start by choosing a city to travel to from
s: there are n − 1 choices here, as we can possibly go anywhere other than s. From there, we have n − 2 choices for our
second city, and then n − 3 for our third city, and so on/so forth!

Points against this algorithm: strictly speaking, it probably won’t work. That is: we’re just repeatedly
randomly picking paths and measuring their length. There’s no guarantee that we’ll ever pick the
“shortest” path!

Points in favor of this algorithm: it’s easy to code, it’s really fast, and if you only care about just
getting close-ish to the right answer it’s actually⁶ not too bad in many situations!

In the long run, it’s probably better if your phone gives you slightly suboptimal directions in a second
rather than taking two years to find the absolute best path to the Countdown, so in general this is
probably a better way to go. But in certain small situations (or times when you randomly have a
supercomputer at hand) brute-force can also be the way to go: it really depends on what you’re trying
to solve! This is a small preview of the “applied” side of applied mathematics: often, it’s not enough
to just know how to solve the problem. You usually have to solve it efficiently as well!

As before, here are a few exercises to try out:

1. Look at the graph we drew earlier for the South Island and its travel times. What is the shortest
route that starts and ends at Tasman? Is it actually the “go around the coast” route, or does it
cross the Southern Alps somewhere?
2. Draw a map for where you grew up. Label your home, school, local grocery store, and a couple
of your favorite places to visit outside of home. Use Google Maps or something similar to find
the distances between these things. What’s the shortest path that visits all of them, starting
and ending from home?
3. Write a MATLAB program that takes in a graph G with edge labels that give you the cost
function for those edges, and uses the brute-force method to solve the traveling salesman problem
on that graph.
4. Write a MATLAB program that takes in a graph G with edge labels that give you the cost
function for those edges, and uses a randomization method to solve the traveling salesman
problem on that graph.

As before, you can find code for the MATLAB questions on Canvas; check it out if you’re stuck!

5.4 Colourings

5.4.1 Where Graph Theory Became Famous: Map Colouring

In the last section we studied Eulerian circuits, which were the objects studied in the first major
graph theory proof. In this section, we’ll switch over to studying probably the most famous result in
graph theory: the four-colour theorem.

The four-colour theorem was first posed by Francis Guthrie in 1852. He was colouring in a map
of counties in England, and was attempting to do so such that no two counties that bordered each
other got the same colour. When he was doing this, he noticed that only four colours were needed! He
mentioned this to his brother, who was a student of Augustus De Morgan (a famous set theoretician),
who then sent the problem out to all of his colleagues and friends.

⁶ In particular there are lots of tweaks you can apply here to make this pretty decent in most cases, while still keeping
it fast.

The problem quickly became infamous amongst mathematicians⁷, and attracted hundreds of false
proofs in the coming years. It stood until 1976, when Appel and Haken wrote a proof that reduced
the problem to checking a few (thousand) individual cases, which they did by computer. Since then
no entirely human-made proof has been found: every proof that a map only needs four colours has
needed a computer to check at least some of its cases!

So: why mention this in a graph theory class? Well: as you’ve seen earlier, we can turn any map into
a graph by assigning a vertex to each region, and by drawing an edge between two regions when they
share a border.

[Figure: a map of the South Island regions and the corresponding graph, with vertices Tasman, Buller,
West Coast, Canterbury, Mid-Canterbury, South Canterbury, North Otago, Otago and Southland.]

Under this idea, a colouring of our map that doesn’t give adjacent regions the same colour is just a
way to paint each vertex a colour, so that no edge in our graph has both endpoints with the same
colour. This idea is an important one — indeed, it’s the focus of this section! — so we should give it
a name.

5.4.2 Vertex Colourings

Definition 5.4.1. Take a graph G. A proper vertex colouring of G with k colours is any way
to take k different colours and use them to paint the vertices of G, so that no edge in G has both
endpoints receiving the same colour.

The chromatic number of a graph G, χ(G), denotes the smallest number of colours k such that G’s
vertices can be properly k-coloured.
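
To make the definition concrete, here is a small sketch that checks whether a given assignment of colours is proper. The edge-list representation (one row per edge) and the variable names are choices made for this example, and colours are recorded as numbers.

```matlab
% The 7-cycle: vertex i is joined to vertex i+1, and vertex 7 back to vertex 1.
E = [(1:7)' [2:7 1]'];
threeColours = [1 2 1 2 1 2 3];    % a candidate colouring using 3 colours
twoColours   = [1 2 1 2 1 2 1];    % an attempt with only 2 colours
% A colouring is proper when every edge has differently coloured endpoints.
isProper3 = all(threeColours(E(:, 1)) ~= threeColours(E(:, 2)));
isProper2 = all(twoColours(E(:, 1)) ~= twoColours(E(:, 2)));
disp(isProper3)
disp(isProper2)
```

The first colouring passes the check; the second fails because the edge joining vertices 7 and 1 has both endpoints coloured 1.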

To illustrate this idea, look at the four graphs below for a moment:
[Figure: four graphs: C7 (cycle graph), K5 (complete graph), L5 (ladder graph) and O (octahedron).]

⁷ Ironically the result itself was of little interest to mapmakers, who had found that in practice you could colour most
maps with three colours anyways.

Try to colour the vertices of each with the smallest number of colours possible, then try to explain
why you can’t use less than that number of colours. Once you think you’ve got the right answers,
read on for the solutions!

Solutions: We claim that χ(C7) = 3, χ(K5) = 5, χ(L5) = 2, and χ(O) = 3. To see why, we first
show that these graphs can indeed be coloured with 3, 5, 2 and 3 colours, respectively:

[Figure: proper colourings of C7 (cycle graph), K5 (complete graph), L5 (ladder graph) and O (octahedron).]

As well, notice the following properties:

• You can’t colour the vertices of the cycle graph C7 with just two colours. To see why, try it!
Make one vertex red to start; then if there are only two colours (say red and blue,) you know
that the neighbors of that vertex are both blue. This forces their neighbors to be red, and
forces their neighbors to be blue; this then forces a blue-blue edge, which causes a problem. (If
this argument didn’t make sense, take a pen and actually try to do the two-colouring that it
describes!)
• In the complete graph K5, every pair of vertices is connected by an edge. Therefore, because
no edge can connect two vertices of the same colour, we need at least five colours to colour this
graph.
• In the ladder graph L5 , we clearly need at least two colours (as the only graphs that can be
1-coloured are ones with no edges at all!)
• The octahedron graph O contains a triangle graph. Colouring a triangle requires three colours
(Why? Prove this to yourself!) As a result, the octahedron needs at least three colours as well.

So: we now know what a proper vertex colouring of a graph is, have some history about where the idea
came from, and have seen a few examples calculated. The next natural questions to ask are the following:
(1) what can we do with graph colourings, and (2) how can we tell a computer to colour a graph?

5.4.3 Graph Colourings: Why and How

While map colourings are nice to make, the reason that we care deeply about graph colourings in the
modern world is their application to scheduling problems. Consider the following task:

Puzzle. Suppose that you’re running a business. In the coming day, you have a set of jobs (like
picking up supplies with the company car, taking clients out to lunch, running/cleaning the store) to
complete and a set of time slots for those jobs.

Some jobs might conflict with each other, however, because they depend on a shared resource; for
example, if you only have one car, you can’t have someone both picking up supplies from Manukau
and taking clients out to lunch in Ponsonby at the same time! Similarly, you probably can’t schedule
someone to wax the floors of your store while it’s open.

The goal, then, is to assign jobs to time slots so that no two conflicting jobs occur at the same time.
How can you do this efficiently?

One simple solution is to give every job its own time slot; this ensures that you won’t have any
conflicts! However, the number of time slots this takes will probably make you quite sad. Another
solution, that’s much less likely to lead to burnout, is to use the language of graph theory:

• Make a graph G, with a vertex for every job you have.
• Draw an edge between two jobs whenever they depend on a common shared resource.
• Now, find a proper vertex colouring of this graph G.
• If you associate each colour used in your colouring to a different time slot, then you’ve come up
with a way to assign jobs to time slots. Finally, because your colouring was a proper colouring,
no edge in your graph has both endpoints given the same colour: that is, no two jobs that depend
on a common shared resource were scheduled for the same time slot!
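
The recipe above can be sketched in MATLAB as follows. The four jobs are the ones named in this section, but the resources each job uses (the matrix R) are invented for the example, and the colouring step simply gives each job the smallest time slot not already taken by a conflicting job.

```matlab
% Hypothetical data: rows are jobs, columns are resources (car, clients, supplies).
% Jobs: 1 pick up supplies, 2 meet clients, 3 show clients store, 4 stock store.
R = [1 0 1;
     1 1 0;
     0 1 0;
     0 0 1];
numJobs = size(R, 1);
% Two different jobs conflict when they share at least one resource.
A = (R * R' > 0) & ~eye(numJobs);
slot = zeros(1, numJobs);          % 0 means "not yet scheduled"
for v = 1:numJobs
    used = slot(A(v, :));          % slots taken by jobs that conflict with job v
    s = 1;
    while any(used == s)           % smallest slot that causes no conflict
        s = s + 1;
    end
    slot(v) = s;
end
disp(slot)
```

With these made-up resources two time slots are enough: jobs 1 and 3 can share one slot, and jobs 2 and 4 the other.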

[Figure: a conflict graph on four jobs: Pick up supplies, Meet clients, Show clients store, and
Stock store with supplies.]

This sort of task is particularly common in computer science: there, your “jobs” are often calculations
that a program wants to perform, and any two jobs that rely on accessing the same bits of memory
at the same time are thought to be in “conflict.”

So: just like the traveling salesman problem, we’ve come across a graph theory idea that’s useful, easy
to describe, and has been studied by thousands of mathematicians for over a hundred years. Surely
we’ve got an efficient way to find the smallest number of colours needed to properly vertex-colour a
graph by now, right?

... sadly, not so much⁸. While we’ve discovered tons of graph colouring techniques and ideas over the
past century (enough to spend your entire life studying!), we have not yet discovered a truly efficient
way to find the chromatic number of an arbitrary graph.

Like before, we could describe brute-force and random algorithms for colouring a graph (and indeed,
you can find code to do this on Canvas!) Instead, we’ll use this section to introduce a third kind of
algorithm that’s useful in graph theory: a greedy algorithm.

Init: Take a graph G on n vertices that we want to properly vertex-colour. List the
vertices of G as v1, v2, ..., vn.
1. As well, make a list of possible colours that we’d want to use on this graph. Because
we’re mathematicians, let’s name these colours 1, 2, 3, ... instead of things like red,
blue, green; this means that we’ve got a nice ordering built into our colours, and
that we’ll never run out of colours!
2. Paint v1 the colour 1.
3. Now, paint v2 the smallest colour that we can without causing a conflict with v1.
That is: if v1 and v2 don’t share an edge, give v2 the colour 1. If they do, however, we can’t
give v2 the colour 1 without making a conflict; so give v2 the colour 2.
4. Do the same thing for v3; that is, give v3 the smallest colour that doesn’t cause
conflicts with v1 or v2.
5. Keep going through our list of vertices. At the end, we will have a properly coloured
graph!
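
One way these steps might be coded is sketched below. The graph is stored as an adjacency matrix A, and the 6-vertex example (vertices 1 to 3 on one side, 4 to 6 on the other, with vertex i joined to vertex 3+j whenever i and j differ) is an assumption made for illustration, not necessarily the graph drawn in the notes.

```matlab
% Hypothetical 6-vertex graph: vertex i (i = 1..3) is joined to 3+j for j ~= i.
A = zeros(6);
for i = 1:3
    for j = 1:3
        if i ~= j
            A(i, 3 + j) = 1;
            A(3 + j, i) = 1;
        end
    end
end
n = size(A, 1);
colour = zeros(1, n);               % 0 means "not yet coloured"
for v = 1:n
    used = colour(A(v, :) == 1);    % colours already on v's neighbours
    c = 1;
    while any(used == c)            % smallest colour not used by a neighbour
        c = c + 1;
    end
    colour(v) = c;
end
disp(colour)
```

With the vertices listed in this order the greedy algorithm uses two colours: vertices 1 to 3 get colour 1 and vertices 4 to 6 get colour 2.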

To illustrate the idea, here’s a sample run of the greedy algorithm (where blue is the first colour, and
yellow is the second colour):

[Figure: six snapshots of the greedy algorithm colouring a six-vertex graph, drawn with v1, v2, v3 in
the left column and v4, v5, v6 in the right column.]

Points in favour of this algorithm: it’s pretty easy to code (try it in MATLAB!), and it runs quickly.
Also, unlike the randomization algorithm, it’s predictable: that is, every time you run the greedy
algorithm on a graph G with the vertices in the same order, you’ll always get the same output! This
can be important for designing processes that humans will interact with, as people are often unhappy
when things randomly change without them knowing why.

Points against this algorithm: it’s sometimes very inefficient depending on how you’ve listed the
vertices in your graph. For example, suppose we took the graph above with a slightly different
ordering on its vertices:

[Figure: six snapshots of the greedy algorithm colouring the same graph, now drawn with v1, v3, v5 in
the left column and v2, v4, v6 in the right column.]

⁸ This is something of a theme in mathematics in general, and graph theory in particular: so many things are both
(a) simple to define, (b) incredibly useful and (c) a nightmare to actually calculate. If you want to help change this,
become a mathematician! We need new ideas.

We saw before that this graph only needs two colours; when ordered as above, however, the greedy
algorithm used three colours to colour this graph!

Unfortunately for us, you can often get massive gaps between what the greedy algorithm comes up
with for a graph colouring and what is optimal. If you generalize the graph from earlier to something
like the drawing at right, you’ll see that you can find graphs on n vertices whose chromatic number
is 2, but where the greedy algorithm will try to use n/2 different colours!

[Figure: the same pattern extended to n vertices, with rows v1 v2, v3 v4, v5 v6, ..., vn-1 vn.]

With that said, in many situations the greedy graph colouring works pretty well; try it out on the four
graphs whose chromatic numbers we calculated earlier, and see if it gives you the same values! As is
often the case with algorithms, there is no one-size-fits-all solution: you need to look at the specific
sorts of graphs you’re encountering in your work, and figure out which algorithm gives the best results
for you.
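
The gap described above can be reproduced with a short experiment. The construction below is an assumed reading of the drawing: the n = 2k vertices are arranged in k rows, vertex 2i−1 sits on the left of row i and vertex 2i on the right, and each left vertex is joined to every right vertex except the one in its own row. The value of k is arbitrary.

```matlab
k = 5;                              % number of rows (arbitrary choice)
n = 2 * k;
A = zeros(n);
for i = 1:k
    for j = 1:k
        if i ~= j                   % join left vertex of row i to right vertex of row j
            A(2*i - 1, 2*j) = 1;
            A(2*j, 2*i - 1) = 1;
        end
    end
end
colour = zeros(1, n);               % 0 means "not yet coloured"
for v = 1:n                         % greedy colouring in the order v1, v2, ..., vn
    used = colour(A(v, :) == 1);
    c = 1;
    while any(used == c)
        c = c + 1;
    end
    colour(v) = c;
end
fprintf('greedy used %d colours on %d vertices\n', max(colour), n);
```

Under this ordering the greedy algorithm gives both vertices in row i the colour i, so it uses k = n/2 colours, even though colouring all left vertices 1 and all right vertices 2 would be a proper colouring.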

5.4.4 Edge Colourings

We close this section by introducing a cute variant on the idea of a vertex colouring: edge colourings!
Definition 5.4.2. Given a graph G, an edge colouring of G with k colours is any way to assign
each edge of G one of k different colours, so that no two edges of the same colour share an endpoint
in common.

Two examples of graphs with edge colourings are given here:

[Figure: two graphs, one with a 3-edge-colouring and one with a 4-edge-colouring.]

Just like with vertex colourings, we can use a greedy algorithm to find an edge-colouring of a graph:

Init: Take a graph G containing m edges that we want to properly edge-colour. List the
edges of G as e1, e2, ..., em.
1. Paint e1 the colour 1.
2. Now, paint e2 the smallest colour that we can without causing a conflict with e1 .
3. Do the same thing for e3 ; that is, give e3 the smallest colour that doesn’t cause
conflicts with e1 or e2 .
4. Keep going through our list of edges. At the end, we will have a properly edge-
coloured graph!
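
A sketch of the greedy edge-colouring, using an edge list with one row per edge. The example graph (the complete graph K4 with its edges listed in this particular order) is chosen only for illustration.

```matlab
% Hypothetical example: the six edges of K4, in the order we will colour them.
E = [1 2; 1 3; 1 4; 2 3; 2 4; 3 4];
m = size(E, 1);
colour = zeros(m, 1);               % 0 means "not yet coloured"
for e = 1:m
    % Rows of E that share an endpoint with edge e (this includes e itself,
    % which is harmless because it is still uncoloured).
    touches = any(E == E(e, 1) | E == E(e, 2), 2);
    used = colour(touches);
    c = 1;
    while any(used == c)            % smallest colour not on a touching edge
        c = c + 1;
    end
    colour(e) = c;
end
disp(colour')
```

For this edge order the result uses three colours, with each colour class consisting of two edges that share no endpoint.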

To check your understanding in this section, try out some of the exercises below!

1. A graph⁹ P is drawn at right. Find the chromatic number χ(P) of P.

2. A ladder graph Ln can be drawn by making an n-rung ladder and placing vertices at intersections,
as drawn below:

[Figure: the ladder graphs L1, L2, L3, L4, L5, ..., Ln.]

Find the chromatic number χ(Ln) of each ladder graph Ln.

3. A graph¹⁰ G is drawn at right. Find an edge-colouring of G that uses only four colours.

4. Write a program that uses the greedy algorithm described in the notes to vertex-colour a graph.
5. Write a program that uses the greedy algorithm described in the notes to edge-colour a graph.
6. Find a map of where you grew up, broken apart into counties or regions. Try to colour it with
four colours. Can you succeed?

As before, you can find code on Canvas that answers the MATLAB questions!

⁹ This graph is the Petersen graph, which is famous for being a counterexample to a tremendous number of things
you might otherwise believe about graphs. There are entire textbooks centered around the Petersen graph and its
generalizations. Maths is weird.

¹⁰ This graph is one of the “flower snarks,” which were named in reference to the Lewis Carroll poem The Hunting of
the Snark! Yay, poetry references in a maths class.

Chapter 6

Markov Chains

Contents
6.1 Introduction
6.2 Markov Chain Basics
6.3 Probabilities after multiple steps
    6.3.1 Constructing a transition matrix
    6.3.2 Path probabilities
    6.3.3 Probabilities after multiple steps
6.4 Simulating Markov chains
6.5 Long term behaviour
    6.5.1 Absorbing States
    6.5.2 Equilibrium probabilities
    6.5.3 Dependency on initial conditions
6.6 Exercises

6.1 Introduction

A Markov chain is an example of a random process (see Ch 4) and is used to model a sequence of
random variables whose probabilities depend on the previous state.

In this chapter we will cover how to construct and interpret Markov chains. We will look at how to
calculate probabilities after multiple steps of the Markov chain, as well as in the long run. Building
on skills from Ch 4 we will also simulate steps of a Markov chain.

6.2 Markov Chain Basics

Theory

In many cases the probability of an event may not be the same every time we repeat the process.
Consider the following example: the probabilities of Chris being fit, sick or tired tomorrow are 0.6,
0.1 and 0.3, respectively. However, will the same probabilities of 0.6, 0.1 and 0.3 be appropriate for
every day? Maybe Chris is more likely to be tired on Wednesday if he has been sick on Tuesday. It
could be that we need to look even further back to how he was on Monday, but this becomes more
complicated. In this chapter we will consider situations where the probability of an event depends
only on the previous state; so the probabilities of the next state will depend on Chris’s current state.

This is a Markov Chain model where the probabilities depend on the state the process is in. For
each possible state we define a set of probabilities. For Chris we might have

• If Chris is fit, then the probability that he will be fit on the following day is 0.7, that he will be
sick is 0.1 and that he will be tired is 0.2.

• If Chris is sick, then the probability that he will be fit on the following day is 0.3, that he will
be sick is 0.4 and that he will be tired is 0.3.

• If Chris is tired, then the probability that he will be fit on the following day is 0.6, that he will
be sick is 0.1 and that he will be tired is 0.3.

The transition diagram for a Markov chain has one node for each state and arrows indicating
possible transitions. The values on the arrows are given by the transition probabilities. Thus, the
transition diagram for the example just described is

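[Transition diagram: nodes FIT, SICK and TIRED, with an arrow for each possible transition labelled
by its probability, as listed in the bullet points above.]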

The transition probabilities can also be written as an array:


                     To
               Fit   Sick  Tired
        Fit    0.7   0.1   0.2
From    Sick   0.3   0.4   0.3
        Tired  0.6   0.1   0.3

Both contain the same information.

Practice

Notice that each row of the table represents a set of probabilities and so it sums to 1. For mathematical
use, it is more convenient to use a matrix instead of a table. The matrix will give the probabilities for
changing between states, and we call it the transition matrix. For the above example,

>> P = [0.7 0.1 0.2;
        0.3 0.4 0.3;
        0.6 0.1 0.3];

We use X(t) to denote the state at time t (recall the definition of a state space from Ch 4). The
transition matrix gives the probabilities for different states at time t + 1, for all the different states
that can occur at time t. More formally, the transition matrix, P = (pij), is defined by

$$p_{ij} = \text{the probability of moving from state } i \text{ to state } j = \mathrm{Prob}(X(t+1) = j \mid X(t) = i).$$

So pij gives the probability that the system is in state j at time t + 1 given it was in state i at time t.

Example 6.2.1. What is the probability that Chris is sick tomorrow if he is tired today?
Solution
Looking in the ‘tired’ row, the probability of being sick is 0.1. 

Example 6.2.2. What is the probability that Chris is sick for the next two days if he is tired today?
Solution
Looking in the ‘tired’ row, the probability of being sick tomorrow is 0.1. Furthermore if he is sick
tomorrow, there is a 0.4 probability of being sick the following day. So the chance he is sick for the
next two days is 0.1 × 0.4 = 0.04. 

Note that to get the probability of going from one state to another then to another we just multiplied
the transition probabilities. Think of this as ‘this transition happens and this transition happens’.
Example 6.2.3. What is the probability that Chris will be tired or sick tomorrow, given that he is
tired today?
Solution
There are two possibilities. If Chris is tired today he is tired tomorrow with probability 0.3. If Chris
is tired today, he is sick tomorrow with probability 0.1. Hence if he is tired today then he is either
tired or sick with probability 0.3 + 0.1 = 0.4. 

Note that to get the probability of going from one state to either of two options we just added the
transition probabilities. Think of this as ‘this transition happens or this transition happens’.
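
The three examples above can be checked directly from the transition matrix. The sketch below names the states with the variables fit, sick and tired (names introduced here for readability) and reads the probabilities off P.

```matlab
P = [0.7 0.1 0.2;
     0.3 0.4 0.3;
     0.6 0.1 0.3];
fit = 1; sick = 2; tired = 3;       % row/column index of each state
rowSums = sum(P, 2);                % each row should sum to 1
p1 = P(tired, sick);                             % Example 6.2.1
p2 = P(tired, sick) * P(sick, sick);             % Example 6.2.2: "and" multiplies
p3 = P(tired, tired) + P(tired, sick);           % Example 6.2.3: "or" adds
disp([p1 p2 p3])
```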

6.3 Probabilities after multiple steps

Theory

In this section we will investigate how to calculate the probabilities of transitioning between states
after multiple steps of the chain. We’ll run through an example to illustrate the process of constructing
a transition matrix, and show how to solve for the probability of a particular path (i.e. an ordered
sequence of events). From here we’ll look at the problem of solving for the probability of reaching a
particular future state, a number of steps down the line, via any path.

Practice

6.3.1 Constructing a transition matrix

Example 6.3.1. A lift is in a building with four floors. The floors will make up our state space,
Ω = {1, 2, 3, 4}, and t = 1, 2, 3, . . . represents the trips for the lift. X(t) is the floor visited on trip t.
The lift starts on floor 1 in the morning, X(1) = 1.

Some assumptions

• If the lift is at floor 1, then it is equally likely to go to any of the other floors. Remember that
the lift must go to a different floor each trip, i.e. the probability of staying on the same floor is zero.

• If the lift is above floor 1, then the probability of going to floor 1 is 1/2 and the other floors are
equally likely.

Use this to write the transition matrix.

Solution
Remember that the rows of the transition matrix, P , must sum to 1.

We will do this row by row. First we said that if the lift is on the first floor then it goes to each other
floor with equal probability. There are 3 other floors, so the probability of going to each one is 1/3. The
probability of staying on the same floor is 0. Hence the first row of the transition matrix is

1 2 3 4
1 0 1/3 1/3 1/3

Now floor 2. The lift goes to floor 1 with probability 1/2, so p21 = 1/2. The other floors are equally likely.
There are two of them, and we need the probabilities to sum to one, so the probability of going to
each of the other floors (3 or 4) is 1/4. We can now fill in the second row:

1 2 3 4
1 0 1/3 1/3 1/3
2 1/2 0 1/4 1/4

We can fill in rows 3 and 4 using the same arguments that we used for row 2:

1 2 3 4
1 0 1/3 1/3 1/3
2 1/2 0 1/4 1/4
3 1/2 1/4 0 1/4
4 1/2 1/4 1/4 0

As a matrix:

>>P = [
0 1/3 1/3 1/3;
0.5 0 0.25 0.25;
0.5 0.25 0 0.25;
0.5 0.25 0.25 0];
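
The same matrix can also be built from the two assumptions rather than typed in by hand. This is only a sketch; the variable names are choices made here, and setting n to another value would give the analogous matrix for a taller building.

```matlab
n = 4;                               % number of floors
P = zeros(n);
P(1, 2:n) = 1 / (n - 1);             % from floor 1: other floors equally likely
for i = 2:n
    P(i, 1) = 1/2;                   % from above floor 1: go to floor 1 half the time
    others = setdiff(2:n, i);        % remaining floors, excluding the current one
    P(i, others) = (1/2) / numel(others);
end
disp(P)
disp(sum(P, 2)')                     % every row should sum to 1
```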

6.3.2 Path probabilities

The probability of a specific path in the Markov chain is the product of the transition probabilities in
the path. So, if we start in state 1, the probability of jumping to state 2, then state 3, then state 1 is
the probability of a transition from 1 to 2, times the probability of a transition from 2 to 3, times the
probability of a transition from 3 to 1.

Example 6.3.2. Consider Example 6.3.1. Suppose that the lift is on floor 1. What is the probability
of the lift going next to floor 3 and then floor 2? What is the probability that it instead remains on
floor 1 and then goes to floor 2?
Solution
For the first probability, we multiply the transition probabilities:

p13 p32 = 1/3 × 1/4 = 1/12.

For the second probability, we again multiply the transition probabilities:

p11 p12 = 0 × 1/3 = 0.
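
A path probability like these can be computed by walking along the path and multiplying as we go. In this sketch the vector called path (a name chosen here) lists the floors visited in order.

```matlab
P = [0   1/3  1/3  1/3;
     0.5 0    0.25 0.25;
     0.5 0.25 0    0.25;
     0.5 0.25 0.25 0];
path = [1 3 2];                      % floor 1, then floor 3, then floor 2
prob = 1;
for k = 1:numel(path) - 1
    prob = prob * P(path(k), path(k+1));   % multiply one transition at a time
end
disp(prob)
```

Changing path to [1 1 2] gives 0, since the lift cannot stay on floor 1.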

6.3.3 Probabilities after multiple steps

The original transition probabilities are the probabilities after a single step. To determine the proba-
bilities after more than one step we need to sum over all the paths that could have been taken.

Example 6.3.3. We continue with the lift example. What is the probability that the system is in
state 2 exactly 2 steps later if it starts in state 1?
Solution
Firstly we will consider solving the problem by hand. There are four possible paths we need to
consider:
Path           Probability
1 → 1 → 2      p11 p12 = 0 × 1/3
1 → 2 → 2      p12 p22 = 1/3 × 0
1 → 3 → 2      p13 p32 = 1/3 × 1/4
1 → 4 → 2      p14 p42 = 1/3 × 1/4

The probability is

$$\sum_{k} p_{1k}\, p_{k2} = \frac{1}{6}.$$


Alternatively, we could calculate the whole transition matrix for the probability of states after two
steps.

The ij term in the two step transition matrix is the probability of moving to state j in 2 steps when
we start in state i. To find this term we sum over all possible pathways, i.e. over all possible states
k after just one step. The ij term is therefore given by

$$\text{probability} = \sum_{k=1}^{4} P_{ik} P_{kj} = (P^2)_{ij},$$

where $P^2$ is the matrix product of the transition matrix P with itself. So the transition matrix for
2 steps of the chain is $P^2$. Using MATLAB:

>> P = [ % start matrix
0 1/3 1/3 1/3; % ; means next row
0.5 0 0.25 0.25;
0.5 0.25 0 0.25;
0.5 0.25 0.25 0]; % end matrix

>> P^2

ans =

    0.5000 0.1667 0.1667 0.1667
    0.2500 0.2917 0.2292 0.2292
    0.2500 0.2292 0.2917 0.2292
    0.2500 0.2292 0.2292 0.2917

Note that we could have written P*P instead of P^2, but that this is not the same as P.^2.

Putting everything together, we have that the probability that a system in state i is in state j exactly
2 steps later is given by $(P^2)_{ij}$, the ij element of the square of P. Likewise the probability a
system in state i is in state j exactly n steps later is $(P^n)_{ij}$.
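
As a quick check of this claim on the lift example, the sketch below sums over the intermediate floor by hand, compares the result with the matrix square, and also shows that the element-wise square gives something different. The variable names are choices made here.

```matlab
P = [0   1/3  1/3  1/3;
     0.5 0    0.25 0.25;
     0.5 0.25 0    0.25;
     0.5 0.25 0.25 0];
twoStep = 0;
for k = 1:4
    twoStep = twoStep + P(1, k) * P(k, 2);   % path 1 -> k -> 2
end
P2 = P^2;                  % matrix power: two steps of the chain
Psq = P.^2;                % element-wise square: NOT a transition matrix
P5 = P^5;                  % probabilities after five steps
disp([twoStep P2(1, 2) Psq(1, 2)])
```

The first two numbers agree (both are 1/6), while the element-wise square gives (1/3)^2 = 1/9 in that position.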

Example 6.3.4. Each week a cellphone is working (State 1), broken (State 2) or lost/thrown out
(State 3).
Ω = {1, 2, 3} and t = 1, 2, 3, 4, . . . counts the weeks.

Assume that

• Lost phones stay lost

• Broken phones are lost more easily than working ones (probabilities 0.1 and 0.01)

• Phones break with probability 0.01 and are repaired with probability 0.5

Use this to draw a transition diagram and write the transition matrix.

Solution
Transition diagram:
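[Transition diagram: nodes 1, 2 and 3, with arrows 1→1 (0.98), 1→2 (0.01), 1→3 (0.01),
2→1 (0.5), 2→2 (0.4), 2→3 (0.1) and 3→3 (1).]

Transition matrix:

$$\begin{pmatrix} 0.98 & 0.01 & 0.01 \\ 0.5 & 0.4 & 0.1 \\ 0 & 0 & 1 \end{pmatrix}.$$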


Example 6.3.5. Assume that a phone is working now. What is the probability that it will be lost
2 weeks later?

Solution
We can use MATLAB to check this.
>> P=[.98,.01,.01;.5,.4,.1;0,0,1]
P =
0.9800 0.0100 0.0100
0.5000 0.4000 0.1000
0 0 1.0000

>> P*P
ans =
0.9654 0.0138 0.0208
0.6900 0.1650 0.1450
0 0 1.0000

We want $(P^2)_{13}$, that is, 0.0208.

Example 6.3.6. What is the probability that a working phone is lost 3 weeks later?

Solution
Continuing on from the previous code we can figure out $P^3$:
>> P^3
ans =
    0.9530 0.0152 0.0318
    0.7587 0.0729 0.1684
         0      0 1.0000

We want $(P^3)_{13}$, that is, 0.0318.
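
The same calculation extends to any number of weeks. This sketch tracks the probability that a phone which is working now has been lost n weeks later, for n up to an arbitrarily chosen 10, by multiplying by P once per week.

```matlab
P = [0.98 0.01 0.01;
     0.5  0.4  0.1;
     0    0    1];
weeks = 10;                          % arbitrary horizon
lost = zeros(weeks, 1);
Pn = eye(3);                         % P^0 is the identity matrix
for n = 1:weeks
    Pn = Pn * P;                     % Pn is now P^n
    lost(n) = Pn(1, 3);              % working (state 1) -> lost (state 3)
end
disp([(1:weeks)' lost])
```

Because lost phones stay lost, the values in the second column can only increase from week to week.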

6.4 Simulating Markov chains

Theory

Simulating paths in Markov chains is just like simulating any discrete random variable, except that

1. we have to generate a new random variable for every step of the path;

2. the distribution of the next step in the path depends on the current step.

In some ways, the code for simulating Markov chain paths is a cross between the code for computing
difference equations and the code for generating discrete random variables.

Practice

Example 6.4.1. Recall, in section 6.2, the transition matrix for the Markov chain describing whether
Chris is fit, sick or tired.
>> P = [0.7 0.1 0.2;
0.3 0.4 0.3;
0.6 0.1 0.3];

Write MATLAB code for simulating 10 days of Chris, assuming that he starts off fit.

Solution
We will use a vector C to store the values for each day, where having 1,2, or 3 in C(i) means Chris
is fit, sick or tired on day i. Each day, the probabilities for the next day are read off the appropriate
row in the transition matrix.
days = 10;           % number of days
C = zeros(days,1);   % storage vector for the state each day
C(1) = 1;            % the first day, Chris is fit.
for i = 1:days-1     % -1 as the first day is already known
    today = C(i);
    if today == 1        % Chris is fit today... determine his state tomorrow
        x = rand();
        if x < 0.7
            tomorrow = 1;
        elseif x < 0.8
            tomorrow = 2;
        else
            tomorrow = 3;
        end
    elseif today == 2    % Chris is sick today... determine his state tomorrow
        x = rand();
        if x < 0.3
            tomorrow = 1;
        elseif x < 0.7
            tomorrow = 2;
        else
            tomorrow = 3;
        end
    else                 % Chris is tired today... determine his state tomorrow
        x = rand();
        if x < 0.6
            tomorrow = 1;
        elseif x < 0.7
            tomorrow = 2;
        else
            tomorrow = 3;
        end
    end
    C(i+1) = tomorrow;
end

Note that the code we used for generating tomorrow’s state is cut-and-pasted from the code we used
to generate discrete random variables, and modified depending on the state today.

Alternatively, the MATLAB code could be made a bit more compact (and perhaps a bit easier to
read) by using the transition matrix P .

The key trick to notice is that if today is the current state, then the probabilities for the next state
are given by P(today,1), P(today,2) and P(today,3).

function C = Chris(days,initial) % simulating Chris's well-being each day from an initial state
P = [0.7 0.1 0.2;   % transition matrix
     0.3 0.4 0.3;
     0.6 0.1 0.3];
C = zeros(days,1);
C(1) = initial;     % the initial state (1, 2, or 3)
for i = 1:days-1
    today = C(i);   % state on day i
    x = rand();
    if x < P(today,1)                    % move from today's state to state 1
        tomorrow = 1;
    elseif x < P(today,1) + P(today,2)   % move from today's state to state 2
        tomorrow = 2;
    else                                 % otherwise move to state 3
        tomorrow = 3;
    end
    C(i+1) = tomorrow;
end


We can make this code even more general, so that it works for any number of states and any transition
matrix (see Exercises).

Now that we can simulate, we can test some of the techniques we’ve developed.

Example 6.4.2. Using simulation, estimate the probability that Chris is sick in five days’ time, given
that he is tired today.

Solution
We use the same ‘simulation recipe’ as before, except this time plugging in our code for simulating
Markov chains. Note that if today is day one, then five days’ time is day six (not day five!).

count = 0;              % number of times Chris is sick in five days' time
numsims = 10000;        % number of simulations
days = 6;               % five days' time
for n = 1:numsims
    % simulate the five days
    C = Chris(days,3);  % initially tired, i.e. state 3
    % check if Chris is sick in five days' time
    if C(days) == 2     % is Chris sick on the last day?
        count = count + 1;
    end
end
prob = count/numsims    % probability estimate

Note that the true answer is given by the (3,2) element of P^5, or 0.1425.
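The exact value quoted above can be reproduced directly from the transition matrix. The following is a minimal sketch; the variable names P5 and exact are chosen here for illustration and do not appear in the text.

```matlab
% Transition matrix for Chris (states: 1 = fit, 2 = sick, 3 = tired)
P = [0.7 0.1 0.2;
     0.3 0.4 0.3;
     0.6 0.1 0.3];
P5 = P^5;           % five-step transition probabilities
exact = P5(3,2)     % tired today -> sick in five days' time
```

With numsims = 10000, a simulated estimate would typically be expected to land within about 0.01 of this value.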

Example 6.4.3. After careful observation of a share price in the market, a financial adviser has
modelled the share price as a Markov chain. She believes that the share price mostly remains constant
over one day, or changes (increases or decreases) by either about 2% or 5%. Other changes have such
a small probability of occurrence that the adviser decided to exclude them from her model.

She considers the following information in her model:

• Starting from any of the states, the probability of 2% increase in the share price is equal to the
probability of 2% decrease. The probabilities of 5% increase and 5% decrease are also equal.

• If the share price stays constant over a day, there is 10% chance that it remains constant the
following day. The chance of 2% increase on the following day is twice the chance of 5% increase.

• If the share price increases by 2% on a day, it has 34% chance of another 2% increase on the
following day and 32% chance of no increase or decrease. The probability of 5% increase on the
following day is equal to the probability of 2% decrease. Similarly, if the share price decreases
by 2% on a day, it has 34% chance of another 2% decrease on the following day and 32% chance
of no increase or decrease. The probability of 5% decrease on the following day is equal to the
probability of 2% increase.

• If the share price increases by 5% on a day, with 50% chance it increases by 2% the following
day and with 50% it remains constant. Similarly, if the share price decreases by 5% on a day,
with 50% chance it decreases by 2% the following day and with 50% it remains constant.

(a) Write down a suitable state space for this Markov chain.

(b) Write down the transition matrix for the Markov chain.

(c) Suppose that the share price is $p on Monday, which has increased by 2% compared to the
Sunday price. What is the probability that the share has the same price on Wednesday (i.e.,
after two days)?

(d) Write a MATLAB script file to estimate the probability that a share with initial price of $10
becomes more expensive than $25 after 30 days. Assume that the share price remains constant
over the first day.

Solution

(a) The state space is {1, 2, 3, 4, 5}, where 1 is the share price staying constant, 2 is 2% increase, 3
is 5% increase, 4 is 2% decrease and 5 is 5% decrease.

(b) The transition matrix is

\[
P = \begin{pmatrix}
0.1  & 0.3  & 0.15 & 0.3  & 0.15 \\
0.32 & 0.34 & 0.17 & 0.17 & 0    \\
0.5  & 0.5  & 0    & 0    & 0    \\
0.32 & 0.17 & 0    & 0.34 & 0.17 \\
0.5  & 0    & 0    & 0.5  & 0
\end{pmatrix}
\]

(c) The share price starts in state 2. There are several possible ways that the share price could have
the same price after two days.

• The price does not change over Tuesday and Wednesday.
• The price has 2% increase over Tuesday and 2% decrease over Wednesday or vice versa.
• The price has 5% increase over Tuesday and 5% decrease over Wednesday or vice versa.

Thus, the probability will be

\[
P_{21}P_{11} + P_{22}P_{24} + P_{24}P_{42} + P_{23}P_{35} + P_{25}P_{53}
= 0.32 \times 0.1 + 0.34 \times 0.17 + 0.17 \times 0.17 + 0.17 \times 0 + 0 \times 0 = 0.1187.
\]

(d) P = [0.1  0.3  0.15 0.3  0.15
         0.32 0.34 0.17 0.17 0
         0.5  0.5  0    0    0
         0.32 0.17 0    0.34 0.17
         0.5  0    0    0.5  0];
numsims = 10000;    % number of simulations
count = 0;          % number of times that the price is > 25
days = 30;          % 30 days for each simulation
for n = 1:numsims
    sp = 10;                % share price starts at $10 in every simulation
    C = zeros(days+1,1);    % day 1 plus the 30 simulated price changes
    C(1) = 1;               % the price is constant over the first day
    for i = 1:days
        ptoday = C(i);
        x = rand();         % uniform RV to simulate the states
        % find the state s whose cumulative probability interval contains x
        for s = 1:5
            if x >= sum(P(ptoday,1:s-1)) && x < sum(P(ptoday,1:s))
                C(i+1) = s;
            end
        end
        % increase or decrease in price
        if C(i+1) == 2          % 2% increase
            sp = sp + 0.02*sp;
        elseif C(i+1) == 3      % 5% increase
            sp = sp + 0.05*sp;
        elseif C(i+1) == 4      % 2% decrease
            sp = sp - 0.02*sp;
        elseif C(i+1) == 5      % 5% decrease
            sp = sp - 0.05*sp;
        end
    end
    if sp > 25              % price after 30 days
        count = count + 1;
    end
end
prob = count/numsims

Each simulation must start again from $10, so sp is set inside the outer loop. Reaching a price above
$25 from $10 needs at least twelve 5% increases in the 30 days, with a 2% increase on nearly every
other day. Such paths are so unlikely that, with 10000 simulations, MATLAB will almost always return

prob =

0
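The hand calculation in part (c) can be cross-checked in MATLAB by adding up the same five two-step paths. This is only a sketch; the variable name samePrice is an assumption made here for illustration.

```matlab
P = [0.1  0.3  0.15 0.3  0.15;
     0.32 0.34 0.17 0.17 0;
     0.5  0.5  0    0    0;
     0.32 0.17 0    0.34 0.17;
     0.5  0    0    0.5  0];
% Start in state 2 (2% increase on Monday) and add the two-step paths
% that leave the price where it started
samePrice = P(2,1)*P(1,1) + P(2,2)*P(2,4) + P(2,4)*P(4,2) ...
          + P(2,3)*P(3,5) + P(2,5)*P(5,3)
```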



6.5 Long term behaviour

Theory

One of the more important questions we can ask about a Markov chain is what its long term behaviour
is. In the previous section we saw how to determine transition probabilities after a few steps. However
the same techniques can be used to study what the transition probabilities are like after thousands,
or millions, or billions of steps. This is the long term behaviour.

Practice

We can investigate the long term behaviour of a Markov chain by seeing what happens to the transition
matrix as we increase the number of steps, that is, by looking at

\[
\lim_{n \to \infty} P^n .
\]

I.e. try P to the power of a large number.

6.5.1 Absorbing States

Example 6.5.1. What are the long term transition probabilities for the cellphone in Example 6.3.4?
Solution
Use MATLAB again to examine high powers of the transition matrix

>> P^50
ans =
0.5527 0.0094 0.4379
0.4696 0.0080 0.5224
0 0 1.0000

>> P^100
ans =
0.3099 0.0053 0.6848
0.2633 0.0045 0.7322
0 0 1.0000

>> P^200
ans =
0.0974 0.0017 0.9009
0.0828 0.0014 0.9158
0 0 1.0000

>> P^10000
ans =
0.0000 0.0000 1.0000
0.0000 0.0000 1.0000
0 0 1.0000

In the long term, the system will be in state 3 with probability 1.0, no matter which state it started in.

State 3 (lost) in the cellphone example is called an absorbing state: once the chain reaches that
state it never leaves it. You can recognise a state i as an absorbing state by the fact that $P_{ii} = 1$.
In the example above, $P_{33} = 1$, which means that once a cellphone is lost, it is always lost.
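The diagonal test for absorbing states is easy to automate. The sketch below uses the cellphone transition matrix from earlier in the chapter; the variable name absorbing is chosen here for illustration.

```matlab
P = [0.98 0.01 0.01;
     0.5  0.4  0.1;
     0    0    1];
% diag(P) holds the entries P(i,i); an absorbing state has P(i,i) equal to 1
absorbing = find(diag(P) == 1)'
```

The exact comparison with 1 is safe here because the entry was typed in as 1. For a matrix produced by a calculation, a small tolerance would be a more cautious choice.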

6.5.2 Equilibrium probabilities

In the previous example the long term behaviour was a particular state. However, this is not always
the case.

Example 6.5.2. What is the long term behaviour of the lift in Example 6.3.1?

Solution
First observe that we never have $P_{ii} = 1$, so the system does not have any absorbing states. We use
MATLAB to look at higher powers of the transition matrix P.

>> P^10
ans =
0.3340 0.2220 0.2220 0.2220
0.3330 0.2223 0.2223 0.2223
0.3330 0.2223 0.2223 0.2223
0.3330 0.2223 0.2223 0.2223

>> P^20
ans =
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222

>> P^10000
ans =
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222
0.3333 0.2222 0.2222 0.2222

Note that every row of this matrix is the same, even though this didn’t hold for the original transition
matrix. That means that, in the long term, the probability of where the chain ends up doesn’t depend
on where it started. In this case, the long term probabilities are

π = (1/3, 2/9, 2/9, 2/9).

In the long term, if we look at the lift at a random time, then it will be on floors 1, 2, 3, 4 with
probabilities 1/3, 2/9, 2/9, 2/9.

We say that a vector π is the equilibrium distribution of a Markov chain if, for all states i, the long
term probability of state i is $\pi_i$, independent of which state the chain started in. Not every Markov
chain has an equilibrium distribution. You can find the equilibrium distribution by

1. taking the transition matrix P to a high power (e.g. $P^{10000}$);

2. checking if all the rows are the same (if not, there is no equilibrium distribution);

3. copying the probabilities from one of the rows.
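The three steps above can be sketched in MATLAB as follows. The two-state matrix is a hypothetical example, not one from the text, and the tolerance 1e-8 is an arbitrary choice for deciding when two rows count as "the same".

```matlab
P = [0.9 0.1;          % hypothetical two-state chain, not from the text
     0.5 0.5];
Pn = P^10000;                                    % step 1: a high power of P
firstRow = repmat(Pn(1,:), size(Pn,1), 1);       % row 1 copied into every row
sameRows = all(all(abs(Pn - firstRow) < 1e-8));  % step 2: do all rows agree?
if sameRows
    equilibrium = Pn(1,:)                        % step 3: copy one of the rows
else
    disp('The rows differ, so there is no equilibrium distribution.')
end
```

For this matrix the rows agree and the equilibrium distribution is (5/6, 1/6).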

Example 6.5.3. In the long run, for what proportion of the time will Chris, in Section 6.2, be tired?

Solution
States are fit (1), sick (2), tired (3). The transition matrix is

>> P = [0.7000 0.1000 0.2000;
0.3000 0.4000 0.3000;
0.6000 0.1000 0.3000]

>> P^10000
ans =
0.6190 0.1429 0.2381
0.6190 0.1429 0.2381
0.6190 0.1429 0.2381

All of the rows are the same, so there is an equilibrium distribution. This equilibrium distribution is
(0.6190, 0.1429, 0.2381), so Chris is tired for about 0.24 of the time.

6.5.3 Dependency on initial conditions

Example 6.5.4. You are playing at coin tossing with a friend. You start with $2 and she starts with
$3. Each turn, if it’s heads then your friend gives you one dollar, otherwise you give her one dollar.
The game continues until one of you runs out of money. What is the probability that you will win?

Solution
There are six possible states, that you have 0, 1, 2, 3, 4, 5 dollars. The transition matrix is

>> P = [1.0000 0 0 0 0 0;
0.5000 0 0.5000 0 0 0;
0 0.5000 0 0.5000 0 0;
0 0 0.5000 0 0.5000 0;
0 0 0 0.5000 0 0.5000;
0 0 0 0 0 1.0000]

Note that state 1 (corresponding to $0) and state 6 (corresponding to $5) are both absorbing states.

We examine $P^{100000}$ to check the long term transition probabilities.

>> P^100000
ans =

1.0000 0 0 0 0 0
0.8000 0 0 0 0 0.2000
0.6000 0 0 0 0 0.4000
0.4000 0 0 0 0 0.6000
0.2000 0 0 0 0 0.8000
0 0 0 0 0 1.0000

Every row is different, so there is no equilibrium distribution. We started with $2, which corresponds
to state 3. The third row is (0.6 0 0 0 0 0.4). So with probability 0.6 we finish in state 1 (and lose) and
with probability 0.4 we finish in state 6 (and win). Notice the pattern in these probabilities: perhaps
you can prove a general rule? 
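The probability of winning can also be estimated by simulating the game itself, in the same style as the earlier simulation examples. This is a minimal sketch; the variable names money and wins are assumptions made here for illustration.

```matlab
numsims = 10000;    % number of simulated games
wins = 0;           % number of games you win
for n = 1:numsims
    money = 2;                      % you start with $2
    while money > 0 && money < 5    % play until one player has all $5
        if rand() < 0.5
            money = money + 1;      % heads: your friend gives you $1
        else
            money = money - 1;      % tails: you give her $1
        end
    end
    if money == 5
        wins = wins + 1;
    end
end
prob = wins/numsims     % should be close to 0.4
```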

6.6 Exercises

1. A taxi company serves three small towns, Augustine, Berkeley and Camus. The company has
the following information:
• The towns are so small that people only catch a taxi to another town.
• Taxis wait for a customer in the town they travel to.
• Of the customers who catch a taxi from Augustine 50% go to Berkeley and 50% go to
Camus.
• Of the customers who catch a taxi from Berkeley 30% go to Augustine and 70% go to
Camus.
• Of the customers who catch a taxi from Camus 40% go to Augustine and 60% go to Berkeley.

(a) Write down a suitable state space for this Markov chain.

(b) Draw a transition diagram for this Markov chain.


(c) Write down the transition matrix for the Markov chain.
(d) What is the probability that if the taxi starts in Augustine, that it will go to Berkeley then
to Camus?
(e) What is the probability that if the taxi starts in Augustine then it will take two trips and
end up back in Augustine?
(f) Write a MATLAB script file to estimate the probability that a taxi which starts in Augustine
will visit Berkeley at least 8 times in its first 20 trips.
(g) Find the equilibrium distribution, if there is one, for the taxi.

2. Below is a transition diagram for a Markov chain.

[Figure: transition diagram on states 1, 2, 3 and 4; edge labels 0.5 (four times), 0.2 (three times) and 0.4.]

(a) Find the transition matrix.


(b) Describe the long-term behaviour of the Markov chain.

3. The following MATLAB operations are given:

>> P=[0.6 0.1 0.3; 0.1 0.9 0; 0.2 0 0.8];


>> P*P*P
ans =
0.357 0.178 0.465
0.178 0.753 0.069
0.31 0.046 0.644
>> P^1000
ans =
0.28571 0.28571 0.42857
0.28571 0.28571 0.42857
0.28571 0.28571 0.42857

(a) Find the probability for a Markov chain X(t), t = 0, 1, 2, .. with transition matrix P , started
in state X(0) = 1, to be in state X(3) = 2 at time t = 3.
(b) What can be said about the long term behaviour of the system state X(t) at large t?

4. Assume that the 3 × 3 transition matrix for a 3 state Markov chain is assigned to the variable
Pmatrix in MATLAB. Write MATLAB code, using the variable Pmatrix, to calculate and
display the probability of going from state 1 to state 3 in 5 steps.
5. Write a general MATLAB function that takes an m × m transition matrix P and a number n, and
simulates a path of length n that starts in the first state.
6. Most of this exercise was an exam question in first semester 2010.
Each minute of a Maths 162 lecture, a student is either interested, fascinated or excited.
• A student who is interested one minute has a 50% chance of being interested the next
minute and a 20% chance of being fascinated the next minute.
• A student who is fascinated one minute has a 60% chance of being interested the next
minute and a 20% chance of being fascinated the next minute.
• A student who is excited one minute will be interested the next minute.
We will model the state of the student using a Markov chain.
(a) Draw a transition diagram for this Markov chain.
(b) Write down the transition matrix for the Markov chain.
(c) If a student is excited, find the probability that she is excited in three minutes.
(d) Suppose a student is excited for the first minute of a lecture. Write a MATLAB script file
to estimate the probability that he is excited for at least 15 of the 50 minutes in the lecture.
7. An airport has 3 terminals. An airport taxi carries passengers from one terminal to another.
It takes passengers to whichever terminal they want and picks up its next passengers from that
terminal. Given that the taxi is at one terminal, the following table shows the probability that
it will be sent to any other terminal:
• Terminal 1: to Terminal 2, 10%, to Terminal 3, 90%
• Terminal 2: to Terminal 1, 50%, to Terminal 3, 50%
• Terminal 3: to Terminal 1, 90%, to Terminal 2, 10%
Each day the taxi starts at terminal 1.
Write the transition matrix. Use MATLAB to find:
(a) the probability that a taxi starts at Terminal 1 and is at Terminal 3 after four trips
(b) the long term probabilities
8. Students in 162 can be divided into two categories, those who are taking the course for the first
time, and those who are repeating the course.
The university has the following information:
• Students who are taking the course for the first time have an 80% chance of passing, a 10%
chance of repeating the course and a 10% chance of not passing and never taking the
course again.
• Students who are repeating the course (i.e. taking the course for the second, third, fourth
time etc.) have a 50% chance of passing, a 25% chance of repeating the course and a 25%
chance of never taking the course again.

We will model this using a Markov chain, with four states.

• State 1: Students taking the course for the first time.


• State 2: Students taking the course for the second or subsequent time.
• State 3: Students who have passed 162.
• State 4: Students who have failed 162 and never passed, and who will never take the course
again.

(a) Draw a transition diagram for this Markov chain. (Hint: states 3 and 4 are absorbing
states.)
(b) Write down the transition matrix for the Markov chain.
(c) If a student enrols in 162 for the first time, find the probability she will pass only at her
third attempt.
(d) Explain (but do not calculate) how you would find the proportion of students who enrol in
162 for the first time who will eventually pass the course.

9. A rental car company has the following information:

• Customers who rented a car last year have a 50% probability of renting a car this year.
• Customers who rented a car two years ago, but who did not rent a car last year have a 25%
probability of renting a car this year.
• Customers who last rented a car more than two years ago have a 10% probability of renting
a car this year.

We will model this using a Markov chain using a time step of one year.

(a) Write down a suitable state space for this Markov chain.
(b) Draw a transition diagram for this Markov chain.
(c) Write down the transition matrix for the Markov chain.
(d) If a customer rents a car this year, find the probability that she will NOT rent a car in any
of the subsequent three years.
(e) Write a MATLAB script file to estimate the probability that if a customer rents a car this
year then she will rent one in at least four of the next six years.
Chapter 7

MATLAB reference chapter

Contents
7.1 Introduction
7.2 MATLAB Tools
7.2.1 Standard Operations
7.2.2 Variables
7.2.3 Data Types
7.2.4 Arrays, storage and indexing
7.2.5 Functions
7.2.6 Documentation
7.2.7 Conditionals
7.2.8 Loops
7.2.9 Script Files
7.2.10 Five Steps for Problem Solving
7.2.11 Examples
7.2.12 Exercises

7.1 Introduction

So far in this course we have covered a lot of MATLAB tools, coding conventions, and mathematical
concepts.
This chapter is here to summarise all of the coding practices we should have obtained from this course.
This chapter also serves to remind you that the tools that we have learnt here are not just for the
specific mathematical examples we’ve used to teach them to you. The coding ideas learned in Maths
162 can - and should - be used on any problem where they are useful.
Now is also a good time to note that although we have been learning how to code in MATLAB
specifically, most of the computing concepts used in MATLAB are transferable to other programming
languages. What follows is a summary of the most important programming concepts learnt in this
course.


7.2 MATLAB Tools

7.2.1 Standard Operations

MATLAB can be used as a calculator. As such, it handles all of the standard operations we’d expect.

Operator    MATLAB    Example

+           +         2 + 5 = 7
−           -         2 - 5 = -3
×           *         2*5 = 10
÷           /         2/5 = 0.4
a^b         ^         2^5 = 32

7.2.2 Variables

In algebra we know we can represent numbers with letters. For example we can say that x = 3 and
we then know that the value of x is 3. Here x is considered a variable.

In computing, we can think of variables in a similar way. Dealing with variables is analogous to: “We
have some data. We want to store it. Let’s put it in a box and label it, then we can use it later.”

Storing variables in MATLAB is as easy as following the format

name = data;

The equals sign is used to assign values to variables. Variable assignment either creates the variable,
or if it already exists changes the variable value.

From here we can access the data by calling the variable. For example, in MATLAB we could store

>> beeronthewall = 99;

then we could access our new variable by just typing in

>> beeronthewall

beeronthewall =

99

Once a variable is declared it will show up as stored in the MATLAB workspace. It is good practice
to keep track of what variables are defined and occupy the workspace. The command clear can be
used to clear variables from the workspace.

Note: Variable names are only allowed to contain alphanumeric characters and underscores. They
can’t begin with a number or contain spaces. Some names are reserved for functions and special
values. To know what these are see the MATLAB function list in the appendix. Beyond that you
are allowed to use almost any name you would like. BUT... it is proper etiquette to give variables
descriptive names where appropriate.

7.2.3 Data Types

There are several different data types in MATLAB.

Double: This is the default data type for numbers in MATLAB. Vectors and matrices of numbers
are also considered type double.
E.g. iamnumber = 1;

Characters (Char): Letters are stored in MATLAB as characters. Strings are stored as vectors of
characters. So strings are also considered type char.
E.g. str = 'stringy';

Logical (Boolean): Logical values of 1 or 0, representing true or false.
E.g. truth = (2+2==5) would give truth = 0

From these examples we can also see that in MATLAB we can save any of these data types as variables.

7.2.4 Arrays, storage and indexing

An array is a variable that can hold multiple values. Arrays are useful for storing lists of values, such
as a set of data. They are also ideal for representing vectors and matrices.

If a scalar variable (single valued variable) is like a cardboard box, then an array is like a filing cabinet
where each drawer can store a value.

Creating Arrays

MATLAB arrays are created using square brackets [ ].


Scalar:         a = 12.45
Row vector:     a = [3 4 5 -1]
Column vector:  a = [3;4;5;-1] or a = [3 4 5 -1].'
Matrices:       a = [1 2 3
                     4 5 6
                     7 8 9] or
                a = [1 2 3; 4 5 6; 7 8 9]

There are also a number of ways to automatically create arrays in MATLAB .

The colon operator : and the linspace function are particularly useful. Using the colon operator,
you can specify an array that contains a sequence of increasing or decreasing values.

>> x = [10:-2:0]
x =
10 8 6 4 2 0
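The linspace function mentioned above takes a start value, an end value and the number of points wanted, rather than a step size. A short sketch for comparison with the colon operator (the variable names are chosen here for illustration):

```matlab
x = linspace(0, 1, 5)   % 5 equally spaced points from 0 to 1
y = 0:0.25:1            % the same values using the colon operator
```

linspace is convenient when the number of points is known, while the colon operator is convenient when the spacing is known.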

There are also MATLAB functions that automatically create matrices. You can find these in the
Function List (Sec. 7.2.12).

Indexing and accessing Arrays

To access the elements of an array we use round brackets ( ).

>> evenOdd = [[10:-2:2];[1:2:9]]
evenOdd =
10 8 6 4 2
1 3 5 7 9
>> evenOdd(3) % Scalar indexing goes down columns then moves to the next
ans =
8
>> evenOdd(1,3) % Array indexing is (row,column)
ans =
6
>> evenOdd(2,4:end)
ans =
7 9
>> evenOdd(:,end+1) = [0;11] % Note we can also extend arrays this way
evenOdd =
10 8 6 4 2 0
1 3 5 7 9 11

We call the position of each number in the array its index.



Note: Arrays start counting from index 1.

In MATLAB indices must either be real positive integers or logicals.
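Both kinds of index can be seen in a short sketch. The vector v and the other variable names are made up here for illustration.

```matlab
v = [10 20 30 40];
second = v(2)                       % a positive integer index
picked = v([1 3])                   % several integer indices at once
masked = v(logical([1 0 0 1]))      % a logical index keeps the 'true' positions
```

An index of 0, such as v(0), is not allowed and produces an error.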

Elementwise Operations

Sometimes we want matrix arithmetic to be done element wise. In MATLAB we can do this with
elementwise operations:

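Operator: + and −
Algebra:
\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
+
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{pmatrix}
\]
MATLAB: [2,4,6]+[1,2,3]=[3,6,9]
Note: The arrays must be the same size.

Operator: .*
Algebra:
\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\mathbin{\texttt{.*}}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix} a_{11}b_{11} & a_{12}b_{12} \\ a_{21}b_{21} & a_{22}b_{22} \end{pmatrix}
\]
MATLAB: [1,-1].*[1,-1]=[1,1]

Operator: ./
Algebra:
\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\mathbin{\texttt{./}}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix} a_{11}/b_{11} & a_{12}/b_{12} \\ a_{21}/b_{21} & a_{22}/b_{22} \end{pmatrix}
\]
MATLAB: [2,4,6]./[1,2,3]=[2,2,2]

Operator: .^
Algebra:
\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\mathbin{\texttt{.\^{}}}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix} a_{11}^{b_{11}} & a_{12}^{b_{12}} \\ a_{21}^{b_{21}} & a_{22}^{b_{22}} \end{pmatrix}
\]
MATLAB: [1,2,-1].^[2,0,2]=[1,1,1]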

Element wise multiplication, division and exponentiation can also be done between a row vector and a
column vector of different lengths; MATLAB (R2016b and later) expands both to a common size. For example:

>> a = [1 2 3];
b = (1:6)';
c = a.*b
c =
1 2 3
2 4 6
3 6 9
4 8 12
5 10 15
6 12 18

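Any of the operations + - * / applied to a matrix and a single number will automatically be done
element wise. I.e. that operation will be applied to every element in the array. The exception is ^:
for a square matrix, A^n is the matrix power (as used for transition matrices in Chapter 6), so use .^
to raise every element to a power. E.g.

>> [1,5,4].^0
ans =
1 1 1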
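The difference between the two power operators is easiest to see on a small square matrix. In this sketch the matrix A and the variable names are made up for illustration.

```matlab
A = [1 2; 3 4];
matrixPower = A^2     % matrix product A*A, giving [7 10; 15 22]
elementwise = A.^2    % each entry squared, giving [1 4; 9 16]
```

This is why powers of a transition matrix are written with ^ and not .^ when computing multi-step transition probabilities.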

7.2.5 Functions

Built in functions

MATLAB also has many built in functions that take care of other common mathematical operations as
well as more complicated functions. Many functions can be used on scalars or arrays. In the appendix
you will find a comprehensive list of built in MATLAB functions. To call a MATLAB function is
simple:

>> sqrt(4)
ans =
2

>> sin([0,1/2,1,3/2]*pi)
ans =
0 1.0000 0.0000 -1.0000

Note the use of brackets around the function’s input argument. There should be no spaces between
the function name and the opening bracket. Most functions in MATLAB follow this name(input)
style.

Note: MATLAB functions are not just designed for doing mathematics. MATLAB also contains many
functions that would be useful for commerce, statistics, engineering, signal processing, modelling and
more.

Plotting functions and tools

Plotting in MATLAB follows the general form

plot(x,y,'conditions')

The data is stored in x and y, which must have the same length. The 'conditions' string is a character
string made from one element from any or all of colour, symbol and line-type. For example:

Setting the independent variable:
x = [-3:0.01:2]
Setting the dependent variable:
y = exp(x) % y = exp(x)
Plot the points:
plot(x,y,'g+-') % Plot green line with + symbols
Plot two curves on the same set of axes:
plot(x,y1,'b-',x,y2,'r--') % Plot y1 as a continuous blue line and y2 as a dashed red line

A full table of the different plot colours, symbols and line-types as well as a summary of some of the
different plot types is given in the MATLAB function list in the appendix.
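Putting these pieces together, a complete two-curve plot might look like the sketch below. The second curve y2 = x.^2 and the label text are arbitrary choices made here for illustration.

```matlab
x  = -3:0.01:2;         % independent variable
y1 = exp(x);            % first curve
y2 = x.^2;              % second curve (an arbitrary choice for illustration)
plot(x, y1, 'b-', x, y2, 'r--')     % blue solid line and red dashed line
xlabel('x')
legend('exp(x)', 'x^2')
title('Two curves on one set of axes')
```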

Titles, legends and other plot enhancements can also be added. For a list of these commands also see
the Plotting Commands section in the MATLAB function list.

If a new figure is required the MATLAB command

>> figure

will open a new figure for plotting.

User defined functions

In MATLAB we can define our own functions by creating and storing function files that follow the
general form:
(Note: the < > are not required. They are only used here to show the general form.)

function <output variables> = <name of function>( <input variables go here> )


% Comments about the function

<Code to be executed>

<output variables> = <what ever you want to have as output>

For example:

function [s_area,s_perimeter] = my_square(y)
% function to calculate the area and perimeter of a square(s) with side length y
s_area = y.^2; % Note this function is vectorised
s_perimeter = 4*y;

The function must be saved in a file called my_square.m. It can be used in the Command Window as
follows:

>> [a,b] = my_square([1:5])
a =
1 4 9 16 25

b =
4 8 12 16 20

Note: We have written our function so that it can work with arrays. Functions that are written in
this way are deemed to be ‘vectorised’. Where possible, it is usually a good idea to vectorise functions
to make them more general.

Other notes about functions:



• Functions have their own workspace, separate from the base workspace. I.e. Any variables
declared inside a function will not be stored in the global workspace unless they are declared to
be global variables prior to function execution.

• In MATLAB it is possible to nest functions. This means that we can have functions inside other
functions, both in use and when we are writing functions. An example of this can be found at
the end of this chapter in the fractal trees example.

Functions can be considered as our tools of programming. We use functions whenever we need either
1) the same code to be repeated, or 2) the same code to be run with different input parameters.
Functions allow us to generalize our code, and do away with copying and pasting with minor changes.
They are very useful.

7.2.6 Documentation

Note: This is probably the most important part of this chapter and the best skill you can learn if you
want to learn to program.

If you ever want to check what a function does MATLAB has the following functions:

help - gives command line help about a given function.

doc - opens up the MATLAB documentation in a help window



Beyond this, the single best piece of general programming advice is: Google is your best friend.

Most programming hurdles you will face in your careers have likely come up for other people. A
simple Google search usually provides valuable information when programming. Knowing how to
effectively use Google as a troubleshooting tool can be an incredibly valuable skill as a programmer
and mathematician.

7.2.7 Conditionals

Logical statements

In MATLAB there are a number of ‘logical operators’ that can be used to form logical statements.
E.g. == (equal to), < (less than).
The full list of logical operators can be found in the MATLAB Function List.

Logical statements can be built as such

>> a = 1; c = -3*a;

>> a<c % (a less than c)
ans =
0

>> check = c*a==0 || (a>0 && c<0) % (c*a = 0) OR [(a > 0) AND (c < 0)]
check =
1

The output of a logical statement is a boolean, where 0 represents ‘false’ and 1 represents ‘true’.

Conditional Statements

Sometimes we want to compute one set of commands, or another, depending on the result of a relational
test. Conditional statements allow code to be sectioned off based on logical (true or false) conditions.
There are four main ‘statements’ that we can use:

if: Starts an ‘if’ statement. It is followed by some logical condition and then code which will
only be executed if the ‘if’ statement is ‘true’ (i.e. 1).

elseif: Allows for another condition other than the original ‘if’ statement. Note: It is possible to
have multiple elseif statements; they are checked in order.

else: If none of the conditional statements return true, the code following else will be executed.

end: This concludes the if statement.

Using conditional statements tends to follow the following structure

if( <expression1> )
<commands evaluated if expression1 is True>
elseif( <expression2> )
<commands evaluated if expression1 was false and expression2 is True>
else
<commands evaluated if all other expressions are False>
end

We aren’t limited to two expressions here; we could add more elseif branches. We could use this to
price pens according to quantity, as follows.

function out = pens4sale(pens)
sale = false;
if (pens >= 0) && (pens < 20)
    cost = pens;            % Pens cost $1
    sale = true;
elseif (pens >= 20) && (pens < 40)
    cost = pens*0.95;       % Pens cost $0.95
    sale = true;
elseif (pens >= 40) && (pens <= 100)
    cost = pens*0.90;       % Pens cost $0.90
    sale = true;
else
    disp('not a valid number of pens')
    return                  % Returns nothing in case of this error
end
if sale
    disp(sprintf('Cost of %3.0f pens is $ %5.2f',pens,cost))
    out = cost;
end

Note that leaving off the last part (else and the related commands following it) will result in no action
being taken if none of the previous expressions result in True values.
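A minimal sketch of this behaviour is given below. The variable names and the value of x are made up for illustration. Since there is no else, and neither condition is true, the default value of label is left untouched.

```matlab
x = -4;                  % illustrative value (assumed)
label = 'unchanged';     % default, kept if no branch runs
if x > 0
    label = 'positive';
elseif x == 0
    label = 'zero';
end                      % no else: for x = -4 neither branch runs
disp(label)
```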

Conditionals on Arrays

• Logical operators also work on arrays. Most logical operators work elementwise on arrays.

>> a = -3:3;
>> a>0
ans =
0 0 0 0 1 1 1

• There are some built in MATLAB functions that also help with arrays and logicals. Three of
the most useful are

any True if any elements are nonzero (True)

all True if all elements are nonzero (True)

find Finds indices of nonzero (True) elements

• Logical conditions can also be used to index arrays.
This can sometimes be used instead of using both for loops and if statements to cycle through
arrays. This makes logical indexing quite a useful trick.

>> a = -3:3;
>> a(a>0) = a(a>0)*(-1) %make all positive elements negative.
a =
-3 -2 -1 0 -1 -2 -3
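The three functions listed above can be tried on the same array. This is only a sketch; the variable names mask, idx and so on are our own choices.

```matlab
a = -3:3;
mask = a > 0;            % logical array: [0 0 0 0 1 1 1]
has_pos = any(mask);     % true, at least one element is positive
all_pos = all(mask);     % false, not every element is positive
idx = find(mask);        % positions of the positive elements: [5 6 7]
pos = a(mask);           % the positive elements themselves: [1 2 3]
n_pos = sum(mask);       % number of positive elements: 3
```

Note that a(mask) and a(idx) select the same elements; indexing with the logical array directly saves the call to find.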

7.2.8 Loops

When we want to repeat commands or cycle through arrays we use loops.

The for loop repeats the commands inside it once for each value in a list. The basic structure is

for <variable> = <list of values>
    <code to be executed>
end
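As a small illustration of this structure, the sketch below adds up the squares of the integers 1 to 5. The names total and k are arbitrary.

```matlab
total = 0;               % running sum
for k = 1:5              % k takes the values 1, 2, 3, 4, 5 in turn
    total = total + k^2;
end
disp(total)              % 1 + 4 + 9 + 16 + 25 = 55
```

The list of values does not have to be a range; any row vector, such as [2 7 1], can be used.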

Sometimes we want to repeat commands until a condition is satisfied. If we don’t know how many
repetitions are required in advance, a while loop can be useful.

while <logical statement>


<code to be executed>
end

The loop will continue to go around as long as the <logical statement> returns a value of True, provided
that the statement returns a scalar. If the logical statement evaluates to an array, then the loop
continues only while all of the values in that array are True.
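The sketch below shows both cases. The first loop halves a number until it drops to 0.01 or below, which takes a number of steps we would not normally know in advance. The second loop has an array as its condition; since not every element is true, its body never runs. The tolerance and variable names are assumptions made for this example.

```matlab
x = 1;
count = 0;
while x > 0.01           % scalar condition, checked before every pass
    x = x/2;
    count = count + 1;
end
disp(count)              % 7 halvings, since 1/2^7 = 0.0078125

ran = false;
while [1 1 0]            % array condition: not all elements are true
    ran = true;          % so this body is never executed
    break
end
```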

Sometimes it is useful to exit loops part way through. While loops allow some control over this, but
the MATLAB function break allows for greater control. break terminates the execution of a while
or for loop (i.e. it exits the loop). Note: If using nested loops, break exits the innermost loop only.

Example 7.2.1. Sum a sequence of random numbers until the next random number is greater than
an upper limit. Then, exit the loop using a break statement.

Solution

>> limit = 0.8; % Upper limit
S = 0; % The sum
while 1 % Loop infinitely until break statement is reached
    new_number = rand;
    if new_number > limit
        break
    end
    S = S + new_number;
end



7.2.9 Script Files

Scripts are the simplest type of program file. They store commands exactly as you would type them
at the command line. Scripts are collections of MATLAB commands stored in plain text files. To
open a new script file we can click the New Script button in the top left corner of the MATLAB
window.

When a script file is run, the code written inside is executed as if you had typed it in
from the keyboard. Scripts can be run by using the run command, >> run <filename>,
by pressing the Run button in the toolbar,
or by pressing F5 to run the script open in the script editor.

Unlike function files, script files do not have input and output parameters. Any values a
script needs must be hard-coded into its m-file or already exist in the workspace. Scripts are
useful for tasks that don’t change, and are a way to document a sequence of commands. Function
files may be called inside script files. (Note: the reverse is best avoided, because a script
called from inside a function runs in that function's workspace rather than the base workspace.)
Thus, scripts can be thought of as where the overall program is written, using the
functions and tools available in MATLAB to code a particular program. As such, scripts
are useful for setting the global behaviour of a MATLAB session.

An introduction to using script files can be found in section 3.2.2

Notes:

• We use scripts to call functions and commands.

• In order for scripts to be run they must be found in the current directory or on the MATLAB search path.

• Scripts do not take input or output; these must be hard coded in.

• Variables declared in a MATLAB script are stored in the workspace when run.

• When a script is run, it has access to variables in the workspace. By contrast, functions do not
have access to the workspace; they can see only their input variables and the variables they create.
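A minimal sketch of a script is shown below. The file name circle_area_script.m and the variable names are hypothetical, chosen only for this illustration. The input is hard-coded, and there are no input or output parameters.

```matlab
% Contents of a hypothetical script file, circle_area_script.m
radius = 2;                 % hard-coded input
area = pi*radius^2;         % remains in the workspace after the script runs
fprintf('Area of a circle of radius %g is %.4f\n', radius, area)
```

After the script has been run, both radius and area appear in the workspace and can be used at the command line.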

7.2.10 Five Steps for Problem Solving

Recall that the reason we are learning how to develop software is so that we may write computer
programs to solve problems. When confronted with a new problem it can be tricky to know how to
approach it so having problem solving approaches in mind can prove useful.

There are many different problem solving methodologies available. The following five steps provide a
simple framework which can help you approach a problem.

1. State the problem clearly

2. Describe the input and output information

3. Work the problem by hand (or with a calculator) for a simple set of data (Testing the problem)

4. Develop a solution, write pseudocode and convert it to a computer program

5. Test the solution with a variety of data

Example 7.2.2. Write MATLAB code to compute the distance between two points in a plane, where
the points are given as the coordinates $(x_1, y_1)$ and $(x_2, y_2)$.

Solution

Step 1: State the problem clearly

Compute the straight-line distance between two points in a plane.

Step 2: Describe the input and output information

Our inputs are the information given that we require to solve the problem. Note that sometimes we
will be given irrelevant information, so not all given information may be required. Our outputs are
the values we need to compute.

Sometimes it is useful to draw an I/O (input/output) diagram.

[I/O diagram: inputs Point 1 and Point 2; output Distance.]

Step 3: Testing the problem (working by hand)

Working the problem by hand is a very important step. If you are having difficulty with this step

• Read the problem again

• Consult reference material (See Section 7.2.6 on documentation)

• Diagrams can be useful

Working the problem by hand will help you understand what steps need to be taken to solve the
problem. It will also give you a known solution value for a simple data set, which you can use later to
test your program.

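For the points (2,1) and (6,4):

\[
\begin{aligned}
\text{distance} &= \sqrt{(\text{side}_1)^2 + (\text{side}_2)^2} \\
&= \sqrt{(6-2)^2 + (4-1)^2} \\
&= \sqrt{4^2 + 3^2} = \sqrt{25} \\
&= 5.
\end{aligned}
\]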

Step 4: Develop a solution and convert it to a computer program

Decompose the problem into a set of steps and write pseudocode or a flowchart for code. Then write
the code.
Simple problems give simple steps. Complex problems give complex steps.

If we are dealing with a complex problem we still decompose the problem into a series of steps. Each
complex step may also require the problem solving process. We will discuss how to create pseudocode
and flowcharts for complex problems shortly.

Pseudocode is integral to this process. Pseudocode is code written in words and symbols, as opposed
to being typed into the computer as a complete program. If the problem is well understood then we
should be able to write pseudocode for it. Then, if we can write pseudocode, we should be able to
code our solution up on a computer.

Pseudocode for this problem might look like

% Step1: Get x- and y-values for two points
% Step2: Compute length of two sides of right angle triangle generated by points
% Step3: Use hypotenuse calculation to get distance
% Step4: Display the distance

The associated MATLAB program might then be



function distance = CalculateDistance( x1,y1, x2,y2 )
%%% Function that calculates the distance between two given coordinates

% Calculate the differences between coordinates
side1 = x2 - x1;
side2 = y2 - y1;

% Find the distance between the points by calculating the hypotenuse
distance = sqrt(side1^2 + side2^2);

% Display the answer to the Command Window
disp('The distance between the two points is d = :')
disp(distance)

Step 5: Test the solution

>> CalculateDistance(6,4, 2,1);


The distance between the two points is d = :
5

It is then important to test your program and think very carefully about any data that might cause
errors. It is always better to be thorough. Always test your programs!
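One way to test with a variety of data is sketched below. To keep the example self-contained, it uses an anonymous function dist as a stand-in for the distance formula; this stand-in, the tolerance tol, and the test points other than the hand-worked case are our own assumptions. Each assert line stops with an error if its check fails.

```matlab
% Stand-in for the distance calculation, without the display step (assumed)
dist = @(x1,y1,x2,y2) sqrt((x2-x1)^2 + (y2-y1)^2);
tol = 1e-12;                                      % allowance for rounding error

assert(abs(dist(6,4,2,1) - 5) < tol)              % the hand-worked case
assert(dist(3,3,3,3) == 0)                        % identical points
assert(abs(dist(2,1,6,4) - dist(6,4,2,1)) < tol)  % order of points is irrelevant
assert(abs(dist(-1,-1,2,3) - 5) < tol)            % negative coordinates, sides 3 and 4
assert(abs(dist(0,0,1,1) - sqrt(2)) < tol)        % non-integer answer
disp('All distance tests passed')
```

The cases are chosen to cover the known answer from Step 3, a zero distance, swapped inputs, negative coordinates and a result that is not a whole number.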



7.2.11 Examples

Maths 162 has taught you many things. Amongst these have been the fundamental programming
tools that have been reviewed in this chapter. We have learnt

• Variables
• Arrays
• Functions
• Conditionals and Logic
• Loops
• Scripts
• Problem Solving

These tools make up the fundamentals of programming, in MATLAB and in other languages. This
section aims to show that you can write a wide variety of programs using these tools. It
contains a collection of example programs and exercises that are quite different to what we
have seen so far, but that only use the tools we have learnt in Maths 162.

Example 7.2.3. Use MATLAB to write a program that converts any word into pig latin. Pig latin
is a ‘secret’ language formed from English by transferring the initial consonant of each word to the
end of the word and adding ‘ay’ to the end of the word. E.g. ‘cat’ would become ‘atcay’.

Solution
We can use a collection of conditionals, arrays, and for loops for this program.

function out = pig_latinizer(in_word)
% Converts a single word to pig latin by looping over the consonants.

consonants = ['b','c','d','f','g','h','j','k','l',...
              'm','n','p','q','r','s','t','v','w','x','y','z'];
first_letter = in_word(1);
cons_check = 0;

% Check the first letter against each consonant in turn
for i = 1:length(consonants)
    if (first_letter==consonants(i))
        cons_check=1;
        break
    end
end

if (cons_check)
    out_word = [in_word(2:end),first_letter,'ay'];
else
    out_word = [in_word,'ay'];
end
out = out_word;

Alternatively, instead of cycling through each consonant in a loop, we can use conditional indexing to
speed up the process.

function out = pig_latin(in_word)
% Converts a single word to pig latin using a logical comparison.

consonants = ['b','c','d','f','g','h','j','k','l',...
              'm','n','p','q','r','s','t','v','w','x','y','z'];
first_letter = in_word(1);
check = (first_letter==consonants);   % True where the first letter matches

if (any(check))
    out_word = [in_word(2:end),consonants(check),'ay'];
else
    out_word = [in_word,'ay'];
end
out = out_word;

>> first_word = pig_latinizer('happy');
>> second_word = pig_latin('piggy');
>> sentence = [first_word,' ',second_word]

sentence =

appyhay iggypay
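A third variation, sketched below, tests for a vowel rather than a consonant, using the built-in function ismember. This is an alternative of our own rather than part of the solutions above, and it behaves slightly differently: any first character that is not a vowel (including digits or punctuation) is treated as a consonant. The call to lower means a capital first letter is classified correctly.

```matlab
word = 'happy';                         % example input
vowels = 'aeiou';
if ismember(lower(word(1)), vowels)
    out = [word, 'ay'];                 % starts with a vowel: just add 'ay'
else
    out = [word(2:end), word(1), 'ay']; % move the first letter to the end
end
disp(out)
```

Checking against five vowels instead of twenty-one consonants gives a shorter list to maintain.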

Example 7.2.4. Use MATLAB to write a program that makes fractal trees. A fractal tree starts
with a trunk, and then branches out left and right by a certain angle creating new branches with a
length that has some ratio of the initial trunk. This process continues from those new branches until
you have a full tree. The program should allow the angle, length ratio and number of branch iterations
as the input.
An illustration of how a fractal tree is built is shown.

[Figure: fractal tree construction, with the branching angles labelled Θ.]

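To do this exercise we use the fact that the matrices $M_r$ and $M_l$ rotate a vector by θ radians
to the right (clockwise) and to the left (anticlockwise) respectively, where

\[
M_r = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix},
\qquad
M_l = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}.
\]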
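A quick numerical check of these two matrices is sketched below, using a quarter turn applied to a vector pointing straight up. The variable names are our own. Because of rounding error the results are only approximately equal to the exact values, so they should be compared using a small tolerance.

```matlab
theta = pi/2;                                            % a quarter turn
M_r = [cos(theta) sin(theta); -sin(theta) cos(theta)];   % rotate right
M_l = [cos(theta) -sin(theta); sin(theta) cos(theta)];   % rotate left
trunk = [0; 1];              % vector pointing straight up
right = M_r*trunk;           % approximately [1; 0], pointing right
left  = M_l*trunk;           % approximately [-1; 0], pointing left
undo  = M_l*M_r;             % rotating right then left: identity matrix
```

Rotating right and then left by the same angle returns the original vector, which is why the product of the two matrices is the identity.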

Solution

function frac_tree(theta,length_ratio,branches)
%%% frac_tree.m creates a fractal tree that starts at (0,0) with a trunk of
%%% length 1 where the offshoots are defined by <theta> and <length_ratio>.
%%% The number of iterations is input as <branches>.
%%% Note the angle is in radians.

%%% This draws the `trunk' of the tree:
plot([0,0],[0,1],'k')
hold on

%%% Defining the starting point for the offshoots and starting the tree
%%% building:
x_start = 0; y_start = 0;
x_new = 0; y_new = 1;

draw_right_branch(x_start,y_start,x_new,y_new,branches);
draw_left_branch(x_start,y_start,x_new,y_new,branches);

%%% The function draw_right_branch draws the right hand side of every
%%% offshoot:
    function draw_right_branch(x_start,y_start,x_new,y_new,branches_left)
        % draw_right_branch takes in the endpoints of the previous (parent)
        % branch and the number of iterations (branches) that are left.

        % The next section of code uses a rotation matrix, and the previous
        % branch to find the next right hand branch.
        rotate_matrix_r = [cos(theta) sin(theta); -sin(theta) cos(theta)];

        old_start = [x_start;y_start]; % Start of old branch
        new_start = [x_new;y_new];     % End of old branch (Start of new branch)
        new_final = rotate_matrix_r*(new_start - old_start)*length_ratio...
            + new_start; % Rotates the previous branch and changes its
                         % length, then moves it back to the new starting
                         % position. This is the end point of the new
                         % branch.

        % Plotting the new branch
        plot([new_start(1),new_final(1)],[new_start(2),new_final(2)],'b')
        hold on

        % This next section of code says to draw the next set of branches
        % from the branch we just made. I.e. from the end point of the
        % latest right branch draw the next right and left offshoots.
        if branches_left>1 % I.e. If there aren't any more branches to
                           % draw don't draw anymore.
            draw_right_branch(new_start(1), new_start(2),...
                new_final(1), new_final(2), branches_left-1);
            draw_left_branch(new_start(1), new_start(2),...
                new_final(1), new_final(2), branches_left-1);

            % Note: Every time a new offshoot happens the function is
            % called with branches_left-1. This ensures that the number of
            % iterations left to our fractal tree are decreasing. This
            % ensures that the tree has the correct number of offshoots and
            % that it does not run forever.
        end
    end

%%% The function draw_left_branch draws the left hand side of every
%%% offshoot:
    % draw_left_branch works the same way as draw_right_branch, only the
    % rotation happens to the left instead of the right.
    function draw_left_branch(x_start,y_start,x_new,y_new,branches_left)
        rotate_matrix_l = [cos(theta) -sin(theta); sin(theta) cos(theta)];
        % This rotation matrix is slightly different.

        old_start = [x_start;y_start];
        new_start = [x_new;y_new];
        new_final = rotate_matrix_l*(new_start - old_start)*length_ratio...
            + new_start;

        plot([new_start(1),new_final(1)],[new_start(2),new_final(2)],'r')
        hold on

        if branches_left>1
            draw_right_branch(new_start(1), new_start(2),...
                new_final(1), new_final(2), branches_left-1);
            draw_left_branch(new_start(1), new_start(2),...
                new_final(1), new_final(2), branches_left-1);
        end
    end
end

This function can produce a wide array of fractal trees. The following figure shows a few examples.


Figure 7.1: A variety of fractal trees: (a) frac_tree(pi/3,0.6,8), (b) frac_tree(pi/5,0.7,14),
(c) frac_tree(pi/2,0.7,10), (d) frac_tree(pi/10,1,13).

7.2.12 Exercises

1. Write a program that can take a whole sentence and convert it into pig latin.

2. Write a program that randomly creates sentences from stored words.

3. Create a text choose-your-path adventure game in MATLAB.

4. Write the classic video game pong in MATLAB.

5. Program Conway’s game of life in MATLAB.

6. Create a program that calculates the area of any regular polygon.

7. Create a program that can calculate a payment program for students paying off their loans.
MATLAB: function list

General Purpose Commands:

Operators and Special Characters


+ Addition
- Subtraction
* Scalar and Matrix multiplication
.* Array elementwise multiplication
^ Scalar and Matrix exponentiation
.^ Array elementwise exponentiation
/ Division
./ Array elementwise division
\ Left-division; Pre-multiplication by the inverse of a matrix
: Generates regularly spaced elements; Represents an entire row or column
( ) Encloses function arguments; Array indexing; Overrides precedence
[ ] Encloses array elements
, Separates statements and elements in a row
; Separates rows; Suppresses display
... Continuation over a line
% Comments
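The difference between the matrix operators and their elementwise (dotted) versions is easiest to see on a small example. The sketch below uses two 2 × 2 matrices chosen purely for illustration.

```matlab
A = [1 2; 3 4];          % rows are separated by semicolons
B = [5 6; 7 8];
P = A*B;                 % matrix product:       [19 22; 43 50]
E = A.*B;                % elementwise product:  [5 12; 21 32]
M = A^2;                 % matrix power A*A:     [7 10; 15 22]
S = A.^2;                % elementwise squares:  [1 4; 9 16]
b = [5; 6];
x = A\b;                 % left-division: solves A*x = b
```

Here x is approximately [-4; 4.5], which can be confirmed by computing A*x and comparing the result with b.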

Management
clc Clears the Command Window
clear Clears variables from memory
clear all Clears all variables in memory/workspace
close Close figures
close all Closes all plots/figures
global Declares variables to be global
help Prints help about a given function


Special Variables and Constants


ans Most recent answer
i,j The imaginary unit $\sqrt{-1}$
Inf Infinity
NaN Undefined numerical result (not a number)
pi The number π

Input/Output and Formatting Commands:

Input/Output Commands
disp Displays contents of an array or string to screen
input Displays prompts and waits for input from user in prompt
; Suppresses printing to screen


Numeric Display Formats


format short Four decimal digits
format long 16 decimal digits
format short e Five digits with exponent
format long e 16 digits with exponent

Vector, Matrix and Array Commands:

Array Commands
max Gives the largest element
min Gives the smallest element
sum Sums each column
length Number of elements in a vector (length of the largest dimension of an array)
cat Concatenates arrays
find Finds indices of nonzero elements
size Array size

Creating Arrays
linspace Creates regularly spaced vector
logspace Creates logarithmically spaced vector
eye Creates an identity matrix
ones Creates an array of ones
zeros Creates an array of zeros
repmat Replicate and tile an array

Vector & Matrix Arithmetic and operations


dot Dot product
cross Cross product
det Determinant
inv Inverse
rank Rank of a matrix
rref Computes reduced row echelon form
eig Computes the eigenvalues/vectors of a matrix

Plotting Commands:

Plots
plot Generates standard xy plot
loglog xy plot with logarithmically scaled axes
semilogx xy plot with logarithmic x-axis
semilogy xy plot with logarithmic y-axis
scatter Scatter plot
histogram Histogram
bar Bar graph
plot3 Plots a line in xyz (3D)
scatter3 Scatter plot of 3D data
surf Creates 3D surface plot of matrix data
mesh Draws a 3D mesh of a surface of matrix data
contour Draws a contour plot of a matrix

Plot Enhancements
figure Opens a new figure window
hold Hold current plot
title Gives plot a title
xlabel Labels the x-axis
ylabel Labels the y-axis
legend Creates a plot legend
grid Shows grid lines on plot
subplot Creates plots in subwindows

Colours and Line types


Colour:
y yellow
m magenta
c cyan
r red
g green
b blue
w white
k black
Symbol:
. point
o circle
x cross
+ plus
* star
Line-type:
- solid
: dotted
-- dashed
-o Line with symbol (works with most symbols)

Logic:

Logical Operators
== Equal to
< Less than
<= Less than or equal to
> Greater than
>= Greater than or equal to
& AND
| OR
~ NOT
xor EXCLUSIVE OR

Logical Functions
any True if any elements are nonzero
all True if all elements are nonzero
isnan True if elements are undefined
isinf True if elements are infinite
isempty True if matrix/array is empty
isreal True if all elements are real

Mathematical Functions:

Exponential and Logarithmic Functions


exp(x) Exponential; $e^x$
log(x) Natural Logarithm; ln(x)
log10(x) Log base 10; $\log_{10}(x)$
sqrt(x) Square root; $\sqrt{x}$

Trigonometric Functions
cos(x) cos(x)
sin(x) sin(x)
tan(x) tan(x)
csc(x) Cosecant; csc(x) = 1/sin(x)
sec(x) Secant; sec(x) = 1/cos(x)
cot(x) Cotangent; cot(x) = 1/tan(x)
acos(x) Inverse cosine; arccos(x) = $\cos^{-1}(x)$.
Note: all of the other inverse functions are similar, e.g. asin(x).

Complex Functions
abs(x) Absolute value; |x|
angle(x) Angle of a complex number
conj(x) Complex conjugate
imag(x) Imaginary part of a complex number
real(x) Real part of a complex number

Statistical Functions
mean Calculates the average
median Calculates the median
mode Calculates the mode
std Calculates the standard deviation
rand Generates random numbers between 0 and 1 from a uniform distribution
randn Generates random numbers from a normal distribution

Rounding Functions
round Rounds to the nearest integer
ceil Rounds up to the nearest integer toward ∞; ⌈x⌉
floor Rounds down to the nearest integer toward −∞; ⌊x⌋
sign Signum function; i.e. tells you if a number is positive or negative
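The rounding functions differ most visibly on negative numbers and on values exactly halfway between two integers. The sketch below applies each of them to the same illustrative vector; the values in x are our own choice.

```matlab
x = [-2.5 -0.4 0.4 2.5];
r = round(x);            % nearest integer, halves away from zero: [-3 0 0 3]
c = ceil(x);             % toward plus infinity:                   [-2 0 1 3]
f = floor(x);            % toward minus infinity:                  [-3 -1 0 2]
s = sign(x);             % -1 for negative, 1 for positive:        [-1 -1 1 1]
```

Note that sign returns 0 for an input of exactly 0.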

Data types and Conversions:

Data types and Conversions


class Returns what data type a given variable is
int2str Converts integers to character array
num2str Converts numbers to character array
mat2str Converts matrix to character vector
str2num Converts character array to numeric array
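A common use of these conversions is building a message that mixes text and numbers. The sketch below is illustrative only; the variable names and values are made up.

```matlab
n = 42;
msg = ['The answer is ', num2str(n)];   % numbers must be converted before joining
kind = class(msg);                      % 'char'
v = str2num('[1 2 3]');                 % character array to numeric array
t = mat2str([1 2; 3 4]);                % '[1 2;3 4]'
disp(msg)
```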
