Mathematical Foundations for Finance
1 FINANCIAL MARKETS IN FINITE DISCRETE TIME
Example. If we think of a market where assets can be traded once each day (so that
the time index k numbers days), then the price of a stock will usually be adapted because
date k prices are known at date k. But if one wants to invest by selling or buying shares,
one must make that decision before one knows where prices go in the next step; hence
trading strategies must be predictable, unless one allows insiders or prophets. For a more
$$\tilde S^0_k := \prod_{j=1}^{k} (1 + r_j), \qquad \tilde S^1_k := S^1_0 \prod_{j=1}^{k} Y_j$$
for k = 0, 1, . . . , T. Note that we use here and throughout the convention that an empty product equals 1 and an empty sum equals 0. Suppose also that r_k > −1 and Y_k > 0 P-a.s. for k = 1, . . . , T. Then we have
$$\frac{\tilde S^0_k}{\tilde S^0_{k-1}} = 1 + r_k, \qquad \frac{\tilde S^1_k}{\tilde S^1_{k-1}} = Y_k,$$
or equivalently $\tilde S^0_k = (1 + r_k)\,\tilde S^0_{k-1}$ and $\tilde S^1_k = Y_k\,\tilde S^1_{k-1}$.
Interpretation. r_k describes the (simple) interest rate for the period (k−1, k]; so S̃^0 models a bank account with that interest rate evolution, and r_k > −1 ensures that S̃^0 > 0, in the sense that S̃^0_k > 0 P-a.s. for k = 0, 1, . . . , T. Similarly, S̃^1 models a stock, say, and Y_k is the growth factor for the time period (k−1, k]. Of course, we could strengthen the analogy by writing Y_k = 1 + R_k; then R_k > −1 would describe the (simple) return on the stock for the period (k−1, k].
How about the filtration in this example? For a general discussion, see Remark 1.1 below. The most usual choice for IF is the filtration generated by Y, i.e.,
$$F_k = \sigma(Y_1, \dots, Y_k) \quad \text{for } k = 0, 1, \dots, T;$$
this is the smallest σ-field that makes all stock prices up to time k observable. Then S̃^1 is obviously adapted to IF. The bank account is naturally less risky than a stock, and in
particular the interest rate for the period (k−1, k] is usually known at the beginning, i.e. at time k−1. So each r_k ought to be F_{k−1}-measurable, i.e. the process r = (r_k)_{k=1,...,T} should be predictable. Then S̃^0 is also predictable (and vice versa). In particular, the interest rate r_k for the period (k−1, k] then only depends on Y_1, . . . , Y_{k−1}, or equivalently on the stock prices S̃^1_0, S̃^1_1, . . . , S̃^1_{k−1}, but not on other factors. This can be generalised.
Example (binomial model). Suppose all the r_k are constant with a value r > −1; this means that we have the same nonrandom interest rate over each period. Then the bank account evolves as $\tilde S^0_k = (1+r)^k$ for k = 0, 1, . . . , T.
Suppose also that Y_1, . . . , Y_T are independent and only take two values, 1 + u with probability p, and 1 + d with probability 1 − p. In particular, this means that all the Y_k have the same distribution; they are identically distributed (with a particular two-point distribution). Usually, one also has u > 0 and −1 < d < 0 so that 1 + u > 1 and 0 < 1 + d < 1. Then the stock price at each step moves either up (by a factor 1 + u) or down (by a factor 1 + d), because
$$\frac{\tilde S^1_k}{\tilde S^1_{k-1}} = Y_k = \begin{cases} 1 + u & \text{with probability } p,\\ 1 + d & \text{with probability } 1 - p.\end{cases}$$
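To make the binomial dynamics concrete, here is a small simulation sketch in Python; the parameter values (u = 0.05, d = −0.03, r = 0.01, p = 0.6) are illustrative and not from the text.

```python
import random

def binomial_path(T, S0, u, d, r, p, rng):
    """Simulate one path of the bank account and the stock in the binomial model."""
    bank, stock = [1.0], [S0]
    for _ in range(T):
        Y = 1 + u if rng.random() < p else 1 + d  # growth factor: 1+u w.p. p, else 1+d
        bank.append(bank[-1] * (1 + r))           # bank account compounds deterministically
        stock.append(stock[-1] * Y)               # stock is multiplied by the random factor
    return bank, stock

rng = random.Random(0)
bank, stock = binomial_path(T=10, S0=100.0, u=0.05, d=-0.03, r=0.01, p=0.6, rng=rng)
```

Note that the bank account path is deterministic, while each stock-price ratio is one of the two values 1 + u or 1 + d, exactly as in the display above.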
Remark. If in the general multiplicative model, the r_k are all constant with the same value and Y_1, . . . , Y_T are i.i.d., we have the i.i.d. returns model. If in addition the Y_k only take finitely many values (two or more), we get the multinomial model. ⋄
Remark 1.1. (This remark is for mathematicians, but not only.) In the general multiplicative model, one could also start with the filtration IF' generated by both Y and r, or equivalently by both assets S̃^0 and S̃^1. In general, this filtration IF' is bigger than IF, meaning that F'_k ⊇ F_k for all k. But if one also assumes
that the process r (or, equivalently, the bank account S̃^0) is predictable, one can show by induction that
$$F'_k = \sigma(Y_1, \dots, Y_k) = F_k \quad \text{for all } k.$$
This explains a posteriori why we have started above directly with IF generated by Y. ⋄
we think of S̃^0 as a bank account and then in addition also assume that S̃^0 is predictable; see Section 1.1. In contrast, S̃ = (S̃^1, . . . , S̃^d) describes the prices of d genuinely risky assets (often called stocks); so S̃^i_k is the price of asset i at time k, and because this becomes known at time k, but usually not earlier, each S̃^i and hence also the vector process S̃ is adapted. For financial reasons, one might want S̃^i_k ≥ 0 P-a.s. for all i and k, but

simply S^0_k := S̃^0_k/S̃^0_k = 1 at all times, and the discounted asset prices S = (S_k)_{k=0,1,...,T} are given by S_k := S̃_k/S̃^0_k. If S̃^0 is viewed as a bank account, then in terms of interest rates, using discounted prices is equivalent to working with zero interest. We shall explain later how to re-incorporate interest rates; but our basic (discounted) model always has S^0 ≡ 1, and we usually call asset 0 the bank account.
Remark 2.1. 1) It is important for this simplification by discounting that the reference asset 0 is also tradable. So while we have only d risky assets with discounted prices S^1, . . . , S^d, there are actually d + 1 assets available for trading. This is almost always implicitly assumed in the literature, but not always stated explicitly.
2) Economically, it should not matter whether one works in original or in discounted prices (except that one of course has different units and different numbers). Mathematically, however, things are more subtle. In finite discrete time, there is indeed an equivalence between undiscounted and discounted formulations, as discussed in Delbaen/Schachermayer [4, Section 2.5]. But in models with infinitely many trading dates (whether in infinite discrete time or in continuous time), one must be more careful because there are pitfalls. ⋄
We assume that we have a frictionless financial market, which includes quite a lot of
assumptions. There are no transaction costs so that assets can be bought or sold at the
same price (at any given time); money (in the bank account) can be borrowed or lent at
the same (zero) interest rate; assets are available in arbitrarily small or large quantities;
there are no constraints on the numbers of assets one holds, and in particular, one may
decide to own a negative number of shares (so-called short selling); and investors are
small so that their trading activities have no effect on asset prices (which means that S
is an exogenously and a priori given and fixed stochastic process). All this is of course
unrealistic; but for explaining and understanding basic concepts, one has to start with
the simplest case, and a frictionless financial market is in many cases at least a reasonable
first approximation.
Definition. A trading strategy is an IR^{d+1}-valued stochastic process φ = (φ^0, ϑ), where φ^0 = (φ^0_k)_{k=0,1,...,T} is real-valued and adapted, and ϑ = (ϑ_k)_{k=0,1,...,T} with ϑ_0 = 0 is IR^d-valued and predictable. The (discounted) value process of a strategy φ is the real-valued adapted process V(φ) = (V_k(φ))_{k=0,1,...,T} given by
$$(2.1)\qquad V_k(\varphi) := \varphi^0_k S^0_k + \vartheta^{tr}_k S_k = \varphi^0_k + \sum_{i=1}^{d} \vartheta^i_k S^i_k \quad \text{for } k = 0, 1, \dots, T.$$
Remark. If the numeraire S̃^0 is just strictly positive and adapted, but not necessarily
predictable, then also φ^0 must be predictable. We shall see later in Proposition 2.3 that this is automatically satisfied if the strategy φ is self-financing. ⋄
Note that this is again in units of the bank account, hence discounted; and note also that (2.2) is just a book-keeping identity with no room for alternative or artificial definitions. Finally, the initial cost for φ at time 0 comes from putting φ^0_0 into the bank account; so
$$C_0(\varphi) = V_0(\varphi) = \varphi^0_0.$$
We also point out that it is to some extent arbitrary whether we associate the above cost increment ΔC_{k+1}(φ) to the time interval (k, k+1] or to [k, k+1). The choice we have made simplifies notation, but is not financially compelling.
Remark. φ^0, ϑ and S are all stochastic processes, and so φ^0_{k+1}, φ^0_k, ϑ_{k+1}, ϑ_k and S_k are all random variables, i.e., functions on Ω (to IR or IR^d). In consequence, the equality in (2.2) is really an equality between functions, and so (2.2) means that we have this equality whenever we plug in an argument, i.e. for all ω. In particular, what looks like one simple equation is in fact an entire system of equations.
Of course, this comment applies not only to (2.2), but to all equalities or inequalities between random variables. In addition, it is usually enough if the set of all ω for which the relevant equality or inequality holds has probability 1; so e.g. (2.2) only needs to hold P-a.s., and a similar comment applies again in general. We often do not write P-a.s. explicitly unless this becomes important for some reason. ⋄
Notation. For any stochastic process X = (X_k)_{k=0,1,...,T}, we denote the increment of X from k−1 to k by
$$\Delta X_k := X_k - X_{k-1}.$$
But now we note that ϑ_{k+1} is the share portfolio we have when arriving at time k+1, and ΔS_{k+1} is the asset price change at time k+1; hence ϑ^{tr}_{k+1} ΔS_{k+1} is the (discounted) incremental gain or loss arising over (k, k+1] from our trading strategy due to the price fluctuations of S. (There is no such gain or loss from the bank account because its price S^0 ≡ 1 does not change over time.) This justifies the following
Definition. Let φ = (φ^0, ϑ) be a trading strategy. The (discounted) gains process associated to φ or to ϑ is the real-valued adapted process G(ϑ) = (G_k(ϑ))_{k=0,1,...,T} with
$$(2.5)\qquad G_k(\vartheta) := \sum_{j=1}^{k} \vartheta^{tr}_j\, \Delta S_j \quad \text{for } k = 0, 1, \dots, T$$
(where G_0(ϑ) = 0 by the usual convention that a sum over an empty set is 0). The (discounted) cost process of φ is defined by
$$(2.6)\qquad C_k(\varphi) := V_k(\varphi) - G_k(\vartheta) \quad \text{for } k = 0, 1, \dots, T.$$
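As a sanity check on the value and gains processes and on the cost process, one can compute V, G and C for a toy strategy in a model with one risky asset. This is only an illustrative sketch; the numbers are arbitrary and not from the text.

```python
def value_gains_cost(phi0, theta, S):
    """Discounted value, gains and cost processes for one risky asset.

    phi0[k] = units of the bank account at time k, theta[k] = shares held,
    S[k]    = discounted stock price; all lists are indexed k = 0..T.
    """
    T = len(S) - 1
    V = [phi0[k] + theta[k] * S[k] for k in range(T + 1)]          # value, as in (2.1)
    G = [sum(theta[j] * (S[j] - S[j - 1]) for j in range(1, k + 1))
         for k in range(T + 1)]                                    # gains, as in (2.5)
    C = [V[k] - G[k] for k in range(T + 1)]                        # cost = value - gains
    return V, G, C

# buy one share at time 0 and hold it, financed from the bank account
S = [100.0, 104.0, 101.0, 106.0]
theta = [0.0, 1.0, 1.0, 1.0]           # theta_0 = 0 by convention
phi0 = [100.0, 0.0, 0.0, 0.0]          # start with V_0 = 100 in the bank
V, G, C = value_gains_cost(phi0, theta, S)
# C is constant (= 100), so this buy-and-hold strategy is self-financing
```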
Remark 2.2. If we think of a continuous-time model where successive trading dates are infinitely close together, then the increment ΔS in (2.5) becomes a differential dS and the sum becomes an integral. This explains why the stochastic integral G(ϑ) = ∫ ϑ dS provides the natural description of gains from trade in a continuous-time financial market model. As a mathematical aside, we also note that we should think of this stochastic integral as "G(ϑ) = ∫ Σ_{i=1}^d ϑ^i dS^i", not as "Σ_{i=1}^d ∫ ϑ^i dS^i". It turns out in stochastic calculus that this does make a difference. ⋄
By construction, C_k(φ) = C_0(φ) + Σ_{j=1}^k ΔC_j(φ) describes the cumulative (total) costs for the strategy φ on the time interval [0, k]. If we do not want to worry about how to pay these costs, we ideally try to make sure they never occur, by imposing this as a condition on φ. This motivates the next definition.
Definition. A trading strategy φ = (φ^0, ϑ) is called self-financing if its cost process C(φ) is constant over time (and hence equal to C_0(φ) = V_0(φ) = φ^0_0).
As it should from economic intuition, this means that changing the portfolio from φ_k to φ_{k+1} at time k can be done cost-neutrally, i.e. with zero gains or losses at that time. In particular, all losses from the portfolio due to stock price changes must be fully compensated by gains from the bank account holdings and vice versa, without infusing or draining extra funds. Due to (2.6), another equivalent description of a self-financing strategy φ = (φ^0, ϑ) is that it satisfies C(φ) ≡ C_0(φ) or
$$(2.8)\qquad V(\varphi) = V_0(\varphi) + G(\vartheta)$$
(in the sense that V_k(φ) = V_0(φ) + G_k(ϑ) P-a.s. for each k). This gives the following very useful result.
Proposition 2.3. Any self-financing trading strategy φ = (φ^0, ϑ) is uniquely determined by its initial wealth V_0 and its "risky asset component" ϑ. In particular, any pair (V_0, ϑ), where V_0 is an F_0-measurable random variable and ϑ is an IR^d-valued predictable process with ϑ_0 = 0, specifies in a unique way a self-financing strategy. We sometimes write φ ≙ (V_0, ϑ) for the resulting strategy φ.
Moreover, if φ = (φ^0, ϑ) is self-financing, then (φ^0_k)_{k=1,...,T} is automatically predictable.
Proof of Proposition 2.3. By (2.8) (or directly from the definitions of self-financing and of C(φ) in (2.6)), a strategy φ is self-financing if and only if for each k,
$$\varphi^0_k + \vartheta^{tr}_k S_k = V_k(\varphi) = V_0(\varphi) + G_k(\vartheta),$$
which already shows that φ^0 is determined from V_0 and ϑ by the self-financing condition. To see that φ^0 is predictable, we note that
$$G_k(\vartheta) = G_{k-1}(\vartheta) + \vartheta^{tr}_k \Delta S_k = G_{k-1}(\vartheta) + \vartheta^{tr}_k S_k - \vartheta^{tr}_k S_{k-1}.$$
Therefore
$$\varphi^0_k = V_0(\varphi) + G_k(\vartheta) - \vartheta^{tr}_k S_k = V_0(\varphi) + G_{k-1}(\vartheta) - \vartheta^{tr}_k S_{k-1}$$
is directly seen to be F_{k−1}-measurable, because G(ϑ) and S are adapted and ϑ is predictable. q.e.d.
which means exactly that we hold one unit of S up to and including time τ(ω), but no further. The value process of the corresponding self-financing "strategy" φ ≙ (V_0, ϑ) is
$$V_k(\varphi) = S_0 + \sum_{j=1}^{k} I_{\{j \le \tau\}} (S_j - S_{j-1}) = S_0 + \begin{cases} S_k - S_0 & \text{if } \tau > k\\ S_\tau - S_0 & \text{if } \tau \le k \end{cases} = S_{k \wedge \tau} = \begin{cases} S_k & \text{if } k < \tau\\ S_\tau & \text{if } k \ge \tau. \end{cases}$$
is called the process S stopped at τ, because it clearly behaves like S up to time τ and remains constant after time τ. Of course, for every ω ∈ Ω, this operation and notation per se make sense for any stochastic process and any "random time" τ as above.
However, a closer look shows that one must be a little more careful. For one thing, S^τ could fail to be a stochastic process because S^τ_k = S_{k∧τ} could fail to be a random variable, i.e. could fail to be measurable. But (in discrete time like here) this is not a problem if we assume that τ is measurable, which is mild and reasonable enough.
While the measurability question is mainly technical, there is a second and financially much more relevant issue. For φ to be a strategy, we need ϑ to be predictable, and this translates into the equivalent requirement that τ should be a so-called stopping time, meaning that τ : Ω → {0, 1, . . . , T} satisfies
$$\{\tau \le k\} \in F_k \quad \text{for } k = 0, 1, \dots, T.$$
(the first time that stock i's price exceeds that of stock j) or
$$\tau'(\omega) := \inf\Big\{k : S^1_k(\omega) \ge 10 \max_{j=0,1,\dots,k-1} S^1_j(\omega)\Big\} \wedge T$$
(the first time that stock 1's price goes above ten times its past maximum value). On the other hand, times looking at the future like
$$\tau''(\omega) := \sup\{k : S^\ell_k(\omega) > 5\}$$
(the last time that stock ℓ's price exceeds 5) are typically not stopping times; so they cannot be used for constructing such buy-and-hold strategies. This makes intuitive sense.
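For illustration, a first-hitting time like τ' above can be computed path by path; the sketch below uses made-up price values. The point is that deciding whether τ' ≤ k only requires the prices up to time k, which is exactly the stopping-time property.

```python
def tau_prime(S1, T):
    """First time stock 1's price reaches 10 times its running past maximum,
    capped at T (so tau' is always defined).  S1 is indexed k = 0..T."""
    for k in range(1, T + 1):
        if S1[k] >= 10 * max(S1[:k]):   # only uses the path up to time k
            return k
    return T

# example path: the jump to 40.0 at time 3 exceeds 10 * max(1, 2, 3) = 30
S1 = [1.0, 2.0, 3.0, 40.0, 5.0]
```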
Example (A doubling strategy). Suppose we have a model where the stock price can in each step only go up or down. A well-known idea for a strategy to force winnings is then to bet on a rise and keep on betting, doubling the stakes at each date, until the rise occurs.
More formally, consider the binomial model with parameters u > 0, −1 < d < 0 and r = 0; so the stock price S_k is either (1+u)S_{k−1} or (1+d)S_{k−1}. To simplify computations, suppose u = −d so that the growth factors Y_k = S_k/S_{k−1} are symmetric around 1. Note that as seen earlier,
$$(2.11)\qquad \Delta S_k = S_k - S_{k-1} = S_{k-1}(Y_k - 1).$$
Now denote by
$$(2.12)\qquad \tau := \inf\{k : Y_k = 1 + u\} \wedge T$$
the (random) time of the first stock price rise and define
$$(2.13)\qquad \vartheta_k := 2^{k-1} \frac{1}{S_{k-1}}\, I_{\{k \le \tau\}}.$$
Note that
$$\{\tau \le j\} = \{\max(Y_1, \dots, Y_j) \ge 1 + u\} \in F_j$$
for each j, and so ϑ is predictable because each ϑ_k is F_{k−1}-measurable. Note that this uses {k ≤ τ} = {τ < k}^c = {τ ≤ k−1}^c. Moreover,
$$\vartheta_k S_{k-1} = 2^{k-1} I_{\{k \le \tau\}}$$
shows that while we are not successful, the value of our stock holdings (not the amount of shares of the strategy itself) indeed doubles from one step to the next.
For V_0 := 0, we now take the self-financing strategy φ corresponding to (V_0, ϑ). Its value process is by (2.8) and (2.5) given by
$$V_k(\varphi) = G_k(\vartheta) = \sum_{j=1}^{k} \vartheta_j\, \Delta S_j = \sum_{j=1}^{k} 2^{j-1} I_{\{j \le \tau\}} (Y_j - 1),$$
using (2.11) and (2.13). By the definition (2.12) of τ, we have Y_j = 1 + d for j < τ and Y_j = 1 + u for j = τ; so
$$V_k(\varphi) = I_{\{\tau > k\}} \sum_{j=1}^{k} 2^{j-1} d + I_{\{\tau \le k\}} \bigg( \sum_{j=1}^{\tau-1} 2^{j-1} d + 2^{\tau-1} u \bigg).$$
Because u = −d = |d|, summing the geometric series gives V_k(φ) = −I_{{τ>k}}(2^k − 1)|d| + I_{{τ≤k}}|d|, which says that we obtain a value, and hence net gain, of |d| in all the (usually many) cases that S goes up at least once up to time k, and make a (big) loss of |d|(2^k − 1) in the (hopefully unlikely) event that S always goes down up to time k.
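This computation can be double-checked by brute force: enumerate all 2^T paths of growth factors with r = 0 and u = −d and run the doubling strategy on each. A sketch (T = 4 and u = 0.1 are illustrative choices):

```python
from itertools import product

def doubling_value(Y, u):
    """Terminal value of the doubling strategy along one path of growth factors Y
    (each entry 1+u or 1-u): stakes double until the first up-move, then we stop."""
    V, stake = 0.0, 1.0
    for y in Y:
        V += stake * (y - 1.0)      # period gain: theta_j * Delta S_j = 2^(j-1) (Y_j - 1)
        if y == 1.0 + u:            # first rise occurred: stop betting
            break
        stake *= 2.0                # otherwise double the stake
    return V

u, T = 0.1, 4
results = {path: doubling_value(path, u)
           for path in product([1.0 + u, 1.0 - u], repeat=T)}
# every path with at least one up-move yields +u = |d|;
# the all-down path loses (2**T - 1) * |d|
```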
One problem with the doubling strategy in the above example is that while it does produce a gain in many cases, its value process goes very far below 0 in those cases where "things go badly". In continuous time or over an infinite time horizon, one obtains quite pathological effects if one does not forbid such strategies in some way. The next definition aims at that.
Definition. For a ≥ 0, a trading strategy φ is called a-admissible if its value process V(φ) is uniformly bounded from below by −a, i.e. V(φ) ≥ −a in the sense that V_k(φ) ≥ −a P-a.s. for all k. A trading strategy is admissible if it is a-admissible for some a ≥ 0.
Interpretation. An admissible strategy has some credit line which imposes a lower bound on the associated value process; so one may make debts, but only within clearly defined limits. Note that while every admissible strategy has some credit line, the level of that credit line can be different for different strategies.
Remarks. 1) If Ω (or more generally F) is finite, any random variable can only take finitely many values; so for any model with finite discrete time, every trading strategy is then admissible. But if F (or the time horizon) is infinite or time is continuous, imposing admissibility is usually a genuine and important restriction. We return to this point later.
2) Note that all our prices and values are discounted and hence expressed in units of the reference asset 0. Imposing a constant lower bound on a value process, like admissibility does, is therefore obviously not invariant if we change to a different reference asset for discounting. This is the root of the pitfalls mentioned earlier in Remark 2.1. ⋄
Intuitively, this means that the best prediction for the later value X_ℓ given the earlier information F_k is just the current value X_k; so the changes of a martingale cannot be predicted. If we have "≤" in (3.1) (a tendency to go down), X is called a supermartingale; if we have "≥", then X is a submartingale. An IR^d-valued process X is a martingale if each coordinate X^i is a martingale.
It is important to note that the property of being a martingale depends on the probability measure we use to look at a process. The same process can very well be a martingale under some Q, but not a martingale under another Q' or P.
Example. In the binomial model on (Ω, F, IF, P) with parameters r, u, d, the discounted stock price S̃^1/S̃^0 is a P-martingale if and only if r = pu + (1−p)d.
Indeed, S̃^1/S̃^0 is obviously adapted and takes only finitely many values; so it is bounded and hence integrable. Moreover, by induction, one easily sees that it is enough to check (the one-step martingale property) that
$$E_P\bigg[\frac{\tilde S^1_{k+1}}{\tilde S^0_{k+1}} \,\bigg|\, F_k\bigg] = \frac{\tilde S^1_k}{\tilde S^0_k} \quad \text{for each } k,$$
or equivalently that
$$1 = E_P\bigg[\frac{\tilde S^1_{k+1}/\tilde S^0_{k+1}}{\tilde S^1_k/\tilde S^0_k} \,\bigg|\, F_k\bigg] = E_P\bigg[\frac{Y_{k+1}}{1+r} \,\bigg|\, F_k\bigg].$$
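Numerically, the condition r = pu + (1 − p)d is easy to check: since Y_{k+1} is independent of F_k, the conditional expectation E_P[Y_{k+1}/(1+r) | F_k] is just the plain expectation E_P[Y_{k+1}]/(1+r). A sketch with illustrative parameters:

```python
def is_martingale_measure(p, u, d, r, tol=1e-12):
    """Check E[Y]/(1+r) = 1, i.e. the condition r = p*u + (1-p)*d."""
    EY = p * (1 + u) + (1 - p) * (1 + d)   # E_P[Y_{k+1}], the same for every k
    return abs(EY / (1 + r) - 1.0) < tol

u, d, p = 0.05, -0.03, 0.5
r = p * u + (1 - p) * d   # the unique rate making the discounted stock a P-martingale
```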
Theorem 3.1. Suppose X = (X_k)_{k=0,1,...,T} is an IR^d-valued martingale or local martingale null at 0. For any IR^d-valued predictable process ϑ, the stochastic integral process ϑ • X defined by
$$\vartheta \bullet X_k := \sum_{j=1}^{k} \vartheta^{tr}_j\, \Delta X_j \quad \text{for } k = 0, 1, \dots, T$$
is again a local martingale null at 0; if moreover X is a martingale and ϑ is bounded, then ϑ • X is even a martingale.
Proof of Theorem 3.1. This result is important enough to deserve at least a partial proof. So suppose X is a Q-martingale and ϑ is bounded. Then ϑ • X is also Q-integrable, it is always adapted, and
$$E_Q[\vartheta \bullet X_{k+1} - \vartheta \bullet X_k \,|\, F_k] = E_Q[\vartheta^{tr}_{k+1} \Delta X_{k+1} \,|\, F_k] = \sum_{i=1}^{d} E_Q[\vartheta^i_{k+1} \Delta X^i_{k+1} \,|\, F_k].$$
Because ϑ^i_{k+1} is F_k-measurable (by predictability) and bounded, it can be pulled out of the conditional expectation, so that
$$E_Q[\vartheta^i_{k+1} \Delta X^i_{k+1} \,|\, F_k] = \vartheta^i_{k+1}\, E_Q[\Delta X^i_{k+1} \,|\, F_k] = 0$$
by the martingale property of X^i.
We have seen earlier that if τ is any stopping time, then ϑ_k := I_{{k≤τ}} is predictable, and of course bounded. So if we note that ϑ • X = X^τ − X_0, an immediate consequence of Theorem 3.1 is
Corollary 3.2. For any martingale X and any stopping time τ, the stopped process X^τ is again a martingale. In particular, E_Q[X_{k∧τ}] = E_Q[X_0] for all k.
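Corollary 3.2 can be illustrated by brute force on a small example: take a symmetric ±1 random walk (a martingale), stop it at the first time it hits +1, and check that E[X_{k∧τ}] = E[X_0] = 0 for every k. This is only a sketch with illustrative numbers.

```python
from itertools import product

def stopped_expectation(T, k):
    """E[X_{k ∧ tau}] for a symmetric ±1 random walk X started at 0,
    where tau = first time X hits +1 (capped at T); enumerates all 2^T paths."""
    total = 0.0
    for steps in product([1, -1], repeat=T):
        X, tau = [0], T
        for s in steps:
            X.append(X[-1] + s)
        for j in range(1, T + 1):
            if X[j] == 1:
                tau = j
                break
        total += X[min(k, tau)]    # value of the stopped walk at time k
    return total / 2**T

# E[X_{k ∧ tau}] = 0 for every k, even though tau deliberately targets +1
```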
Interpretation. A martingale describes a fair game in the sense that one cannot predict
where it goes next. Corollary 3.2 says that one cannot change this fundamental character
by cleverly stopping the game — and Theorem 3.1 says that as long as one can only use
information from the past, not even complicated clever betting (in the form of trading
strategies) will help.
In general, the stochastic integral with respect to a local martingale is only a local
martingale — and in continuous time, it may fail to be even that in the most general
case. But there is one situation where things are very nice in discrete time, and this is
tailor-made for applications in mathematical finance, as one can see by looking at the
definition of self-financing and admissible strategies.
Theorem 3.3. Suppose that X is an IR^d-valued local Q-martingale null at 0 and ϑ is an IR^d-valued predictable process. If the stochastic integral process ϑ • X is uniformly bounded below (i.e. ϑ • X_k ≥ −b Q-a.s. for all k, with a constant b ≥ 0), then ϑ • X is a Q-martingale.
Proof. See Föllmer/Schied [9, Theorem 5.15]. A bit more generally, this relies on the result that in discrete (possibly infinite) time, a local martingale that is uniformly bounded below is a true martingale. More precisely: If L = (L_k)_{k∈IN_0} is a local Q-martingale with E_Q[|L_0|] < ∞ and T ∈ IN is such that E_Q[L_T^-] < ∞, then the stopped process L^T = (L_k)_{k=0,1,...,T} is a Q-martingale. q.e.d.
discrete time. The same definitions and results also apply for the setting k ∈ IN_0 of infinite discrete time; the only required change is that one must replace T by ∞ in an appropriate manner. ⋄
$$\frac{\tilde S^0_k}{\tilde S^0_{k-1}} = 1 + r > 0 \quad \text{for all } k, \qquad \frac{\tilde S^1_k}{\tilde S^1_{k-1}} = Y_k \quad \text{for all } k,$$
where S̃^0_0 = 1, S̃^1_0 = S^1_0 > 0 is a constant, and Y_1, . . . , Y_T are i.i.d. and take the finitely many values 1 + y_1, . . . , 1 + y_m with respective probabilities p_1, . . . , p_m. To avoid degeneracies and fix the notation, we assume that all the probabilities p_j are > 0 and that y_m > y_{m−1} > · · · > y_1 > −1. This also ensures that S̃^1 remains strictly positive.
The interpretation for this model is very simple. At each step, the bank account changes by a factor of 1 + r, while the stock changes by a random factor that can only take the m different values 1 + y_j, j = 1, . . . , m. The choice of these factors happens randomly, with the same mechanism (identically distributed) at each date, and independently across dates. Intuition suggests that for a reasonable model, the sure factor 1 + r should lie between the minimal and maximal values 1 + y_1 and 1 + y_m of the (uncertain) random factor; we come back to this issue in the next chapter when we discuss absence of arbitrage.
The simplest and in fact canonical model for this setup is a path space. Let
$$\Omega = \{1, \dots, m\}^T$$
be the set of all strings of length T formed by elements of {1, . . . , m}. Take F = 2^Ω, the family of all subsets of Ω, and define P by setting
$$(4.1)\qquad P[\{\omega\}] = p_{x_1} p_{x_2} \cdots p_{x_T} = \prod_{k=1}^{T} p_{x_k} \quad \text{for } \omega = (x_1, \dots, x_T).$$
Finally, define Y_1, . . . , Y_T by
$$Y_k(\omega) := 1 + y_{x_k} \quad \text{for } \omega = (x_1, \dots, x_T),$$
and take the filtration
$$F_k = \sigma(Y_1, \dots, Y_k) \quad \text{for } k = 0, 1, \dots, T.$$
Intuitively, this means that up to time k, we can observe the values of Y_1, . . . , Y_k and hence the first k "bits" of the trajectory or string ω. Formally, this translates as follows.
Recall that for a general probability space (Ω, F, P), a set B is an atom of a σ-field G ⊆ F if B ∈ G, P[B] > 0 and any C ∈ G with C ⊆ B has either P[C] = 0 or P[C] = P[B]. In that sense, atoms of a σ-field G are minimal elements of G, where minimal is measured with the help of P.
In the above path-space setting, the only set of probability zero is the empty set, and so P[C] = 0 and P[C] = P[B] translate into C = ∅ and C = B, respectively. A set A ⊆ Ω is therefore an atom of F_k if and only if there exists a string (x̄_1, . . . , x̄_k) of length k with elements x̄_i ∈ {1, . . . , m} such that A consists of all those ω ∈ Ω that start with the substring (x̄_1, . . . , x̄_k), i.e.
$$A = A_{\bar x_1, \dots, \bar x_k} := \big\{\omega \in \Omega : \omega = (\bar x_1, \dots, \bar x_k, x_{k+1}, \dots, x_T) \text{ for some } x_{k+1}, \dots, x_T \in \{1, \dots, m\}\big\}.$$
– When going from time k to time k + 1, each atom A = A_{x̄_1,...,x̄_k} of F_k splits into precisely m subsets A_1 = A_{x̄_1,...,x̄_k,1}, . . . , A_m = A_{x̄_1,...,x̄_k,m} that are atoms of F_{k+1}. So we can see very precisely and graphically how information about the past, i.e. the initial part of trajectories ω, is growing and refining over time.
It is clear from the above description that for any k, the atoms of F_k are pairwise disjoint and their union is Ω; in other words, the atoms of F_k form a partition of Ω, so that we can write
$$\Omega = \bigcup_{(\bar x_1, \dots, \bar x_k) \in \{1,\dots,m\}^k} A_{\bar x_1, \dots, \bar x_k} \quad \text{with the } A_{\bar x_1, \dots, \bar x_k} \text{ pairwise disjoint.}$$
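The path-space construction and the atoms of F_k are easy to code up. The sketch below (with illustrative values m = 3, T = 2) builds Ω, the product probabilities from (4.1), and the partition of Ω into the atoms A_{x̄_1,...,x̄_k}.

```python
from itertools import product

def path_space(m, T, p):
    """Canonical path space {1,...,m}^T with the product probabilities (4.1)."""
    Omega = list(product(range(1, m + 1), repeat=T))
    P = {}
    for w in Omega:
        prob = 1.0
        for x in w:
            prob *= p[x - 1]        # P[{w}] = p_{x_1} * ... * p_{x_T}
        P[w] = prob
    return Omega, P

def atoms(Omega, k):
    """Atoms of F_k: group paths by their first k coordinates."""
    result = {}
    for w in Omega:
        result.setdefault(w[:k], []).append(w)
    return result

m, T, p = 3, 2, [0.5, 0.3, 0.2]
Omega, P = path_space(m, T, p)
A = atoms(Omega, 1)   # m atoms, each containing m paths; together they partition Omega
```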
Remark. For many (but not all) purposes in the multinomial model, it is enough if one looks at time k only at the current value S̃^1_k of the stock. In graphical terms, this means that one makes the underlying tree recombining by collapsing at each time k into one (big) node all those nodes where S̃^1_k has the same value. In terms of σ-fields, this amounts
With the help of the atoms introduced above, we can also give a very precise and intuitive description of all probability measures Q on F_T. First of all, we identify each atom in F_k with a node at time k of the non-recombining tree, namely that node which is reached via the substring (x̄_1, . . . , x̄_k) that parametrises the atom. For any atom A = A_{x̄_1,...,x̄_k} of F_k, we then look at its m successor atoms A_1 = A_{x̄_1,...,x̄_k,1}, . . . , A_m = A_{x̄_1,...,x̄_k,m} of F_{k+1}, and we define the one-step transition probabilities for Q at the node corresponding to A by the conditional probabilities (note that A_j ∩ A = A_j as A_j ⊆ A)
$$(4.3)\qquad Q[A_j \,|\, A] = \frac{Q[A_j]}{Q[A]} \quad \text{for } j = 1, \dots, m.$$
1 FINANCIAL MARKETS IN FINITE DISCRETE TIME 29
$$Q[\{\bar\omega\}] = Q[A_{\bar x_1, \dots, \bar x_T}] = Q[A_{\bar x_1, \dots, \bar x_T} \,|\, A_{\bar x_1, \dots, \bar x_{T-1}}]\; Q[A_{\bar x_1, \dots, \bar x_{T-1}}] = q_{\bar x_T}(\bar x_1, \dots, \bar x_{T-1})\; Q[A_{\bar x_1, \dots, \bar x_{T-1}}]$$
time k (but they can still differ across dates k). Finally, Y_1, . . . , Y_T are i.i.d. under Q if and only if at each node throughout the tree, the one-step transition probabilities are the same. Probability measures with this particular structure can therefore be described by m − 1 parameters; recall that the m one-step transition probabilities at any given node must sum to 1, which eliminates one degree of freedom.
Remark. We have discussed the path space formulation for the multinomial model, where each node in the tree has the same number of successor nodes and in that sense is homogeneous in time. But of course, the same considerations can be done for any model where the final σ-algebra F_T is finite. The only difference is that the corresponding event tree is no longer nicely symmetric and homogeneous, which makes the notation (but not the basic considerations) more complicated. ⋄
2 ARBITRAGE AND MARTINGALE MEASURES
2.1 Arbitrage
Recall from Proposition 1.2.3 that any pair (V_0, ϑ) consisting of V_0 ∈ L^0(F_0) and an IR^d-valued IF-predictable process ϑ can be identified with a self-financing strategy φ, whose value process is then given by V(φ) = V_0 + G(ϑ) = V_0 + ∫ ϑ dS = V(V_0, ϑ). We shortly write φ ≙ (V_0, ϑ). (Of course, we work throughout in units of asset 0.) Hence G(ϑ) = V(0, ϑ) describes the cumulative gains or losses one can generate from initial capital 0 through self-financing trading via φ ≙ (0, ϑ). We also recall that a strategy φ is a-admissible if V(φ) ≥ −a, and admissible if it is a-admissible for some a ≥ 0. Note that these notions depend on the chosen accounting unit or numeraire (here S^0), except for 0-admissibility.
Definition. An arbitrage opportunity is an admissible self-financing strategy φ ≙ (0, ϑ) with zero initial wealth, with V_T(φ) ≥ 0 P-a.s. and with P[V_T(φ) > 0] > 0. The financial market (Ω, F, IF, P, S^0 ≡ 1, S), or shortly S, is called arbitrage-free if there exist no arbitrage opportunities. Sometimes one also says that S satisfies (NA).
Example. If there exist an asset i_0 and a date k_0 such that S^{i_0}_{k_0+1} ≤ S^{i_0}_{k_0} P-a.s. and P[S^{i_0}_{k_0+1} < S^{i_0}_{k_0}] > 0, then S admits arbitrage.
Indeed, the price process S^{i_0} can only go down from time k_0 to k_0 + 1 and does so in some cases (i.e., with positive probability); so if we sell short that asset at time k_0, we run no risk and have the chance of a genuine profit. Formally, the strategy φ ≙ (0, ϑ) with
$$\vartheta^i_k := -\,I_{\{i = i_0\}}\, I_{\{k = k_0+1\}}$$
gives an arbitrage opportunity, as one easily checks. [→ Exercise] This also illustrates the well-known wisdom that "bad news is better than no news".
Let us introduce a useful notation. For any σ-field G ⊆ F, we denote by L^0(G) the space of all (equivalence classes, for the relation of equality P-a.s., of) G-measurable random variables, and by L^0_+(G) the subset of all nonnegative ones. Then for example, we can write "V_T(φ) ≥ 0 P-a.s. and P[V_T(φ) > 0] > 0" more compactly as V_T(φ) ∈ L^0_+(F_T) \ {0}.
Proposition 1.1. For a discounted financial market in finite discrete time, the following are equivalent:
1) S is arbitrage-free.
2) For every 0-admissible self-financing strategy φ with V_0(φ) = 0 P-a.s. and V_T(φ) ≥ 0 P-a.s., we have V_T(φ) = 0 P-a.s. (This condition is sometimes denoted by (NA').)
3) For every (not necessarily admissible) self-financing strategy φ with V_0(φ) = 0 P-a.s. and V_T(φ) ≥ 0 P-a.s., we have V_T(φ) = 0 P-a.s.
4) For the set G := {G_T(ϑ) : ϑ is IR^d-valued and predictable} of all final wealths that one can generate from zero initial wealth through some self-financing trading φ ≙ (0, ϑ), we have G ∩ L^0_+(F_T) = {0}.
Remarks. 1) Proposition 1.1 and its proof substantiate the above comment that all three above formulations for absence of arbitrage are equivalent in finite discrete time.
2) The mathematical relevance of Proposition 1.1 is that it translates the no-arbitrage condition (NA) into the formulation in 4), which has a very useful geometric interpretation. We shall exploit this in the next section. ⋄
Proof of Proposition 1.1. "2) ⇔ 3)" is obvious, and "2) ⇔ 4)" is a direct consequence of the parametrisation of self-financing strategies in Proposition 1.2.3. It is also clear that (NA') as in 2) implies (NA) as in 1). Finally, the argument for "1) ⇒ 2)" is indirect and even shows a bit more: We claim that if one has a self-financing strategy φ which produces something out of nothing, one can construct from φ a 0-admissible self-financing strategy φ̃ which also produces something out of nothing. Indeed, if φ is not already 0-admissible itself, then the set A_k := {V_k(φ) < 0} has P[A_k] > 0 for some k. We take as k_0 the largest of these k and then define φ̃ simply as the strategy φ on A_{k_0} after time k_0. In words, we wait until we can start on some set with a negative initial capital and transform that via φ into something nonnegative. As this turns something nonpositive
Our next intermediate goal is to give a simple probabilistic condition that excludes arbitrage opportunities. Recall that two probability measures Q and P on F are equivalent (on F), written as Q ≈ P (on F), if they have the same nullsets (in F), i.e. if for each set A (in F), we have P[A] = 0 if and only if Q[A] = 0. Intuitively, this means that while P and Q may differ in their quantitative assessments, they qualitatively agree on what is "possible or impossible".
Example. If we construct the multinomial model as in Section 1.4 as an event tree on the canonical path space Ω = {1, . . . , m}^T with F = 2^Ω, then we know that any probability measure on (Ω, F) can be described by its collection of one-step transition probabilities, which all lie between 0 and 1, i.e. in [0, 1].
Now consider two probability measures P and Q on (Ω, F). If some of the transition probabilities p_{ij} of P are 0 (or 1), a characterisation of Q being equivalent to P is a bit involved, and so we assume (as for example in the multinomial model) that P[{ω}] > 0 for all ω ∈ Ω. This means that all one-step transition probabilities p_{ij} for P lie in the open interval (0, 1), and then we have Q ≈ P if and only if all one-step transition probabilities q_{ij} for Q lie in (0, 1) as well.
for some a ≥ 0). By Theorem 1.3.3, V(φ) is thus also a Q-martingale, and so
$$E_Q[V_T(\varphi)] = E_Q[V_0(\varphi)] = 0.$$
Now suppose in addition that Q ≈ P on F_T, so that Q-a.s. and P-a.s. are the same thing for all events in F_T. If φ ≙ (0, ϑ) is an admissible self-financing strategy with V_T(φ) ≥ 0 P-a.s., then also V_T(φ) ≥ 0 Q-a.s. But E_Q[V_T(φ)] = 0 by the above argument, and so we must have V_T(φ) = 0 Q-a.s., hence also V_T(φ) = 0 P-a.s. By Proposition 1.1, S is therefore arbitrage-free. q.e.d.
$$(\vartheta_k^{(n)})^{\mathrm{tr}}\,\Delta S_k = \vartheta_k^{\mathrm{tr}}\,\Delta S_k\, I_{\{|\vartheta_k|\le n\}} = \vartheta_k^{\mathrm{tr}}\,\Delta S_k\, I_{\{\vartheta_k^{\mathrm{tr}}\Delta S_k \ge 0\}}\, I_{\{|\vartheta_k|\le n\}} + \vartheta_k^{\mathrm{tr}}\,\Delta S_k\, I_{\{\vartheta_k^{\mathrm{tr}}\Delta S_k < 0\}}\, I_{\{|\vartheta_k|\le n\}},$$
so that $\bigl((\vartheta_k^{(n)})^{\mathrm{tr}}\Delta S_k\bigr)^- \le (\vartheta_k^{\mathrm{tr}}\Delta S_k)^-$ for all k and hence $(\vartheta^{(n)} \bullet S)^- \le (\vartheta \bullet S)^-$. But
V(φ) is bounded below by −a because φ is admissible, and therefore the entire sequence (G(ϑ^{(n)}))_{n∈ℕ} = (ϑ^{(n)} • S)_{n∈ℕ} is also bounded below by −a. This allows us to use Fatou's lemma and conclude from the martingale property of each G(ϑ^{(n)}) that V(φ) = ϑ • S is a Q-supermartingale; indeed,
$$E_Q[G_k(\vartheta) \mid F_{k-1}] = E_Q\bigl[\lim_{n\to\infty} G_k(\vartheta^{(n)}) \bigm| F_{k-1}\bigr] \le \liminf_{n\to\infty} E_Q[G_k(\vartheta^{(n)}) \mid F_{k-1}] = \liminf_{n\to\infty} G_{k-1}(\vartheta^{(n)}) = G_{k-1}(\vartheta).$$
Example. Consider the multinomial model on the canonical path space Ω = {1, . . . , m}^T and suppose as usual that P[{ω}] > 0 for all ω ∈ Ω. (We can also assume that the returns Y₁, . . . , Y_T are i.i.d. under P, but this is actually not needed for the subsequent reasoning.) To find Q ≈ P such that S¹ = S̃¹/S̃⁰ is a Q-martingale (recall that we always work in units of asset 0), we need to find one-step transition probabilities in the open interval (0, 1) such that
$$E_Q[\tilde S_k^1/\tilde S_k^0 \mid F_{k-1}] = \tilde S_{k-1}^1/\tilde S_{k-1}^0 \quad\text{for all } k.$$
Because
$$\frac{\tilde S_k^1/\tilde S_k^0}{\tilde S_{k-1}^1/\tilde S_{k-1}^0} = \frac{\tilde S_k^1/\tilde S_{k-1}^1}{\tilde S_k^0/\tilde S_{k-1}^0} = \frac{Y_k}{1+r},$$
this amounts to finding transition probabilities such that E_Q[Y_k | F_{k−1}] = 1 + r for all k. If q_j(A^{(k−1)}) denotes the conditional probability, given the atom A^{(k−1)} of F_{k−1}, that Y_k takes the value 1 + y_j, then
$$E_Q[Y_k \mid F_{k-1}] = 1 + \sum_{j=1}^m q_j(A^{(k-1)})\, y_j \quad\text{on } A^{(k-1)},$$
or, written out over all atoms,
$$E_Q[Y_k \mid F_{k-1}] = 1 + \sum_{\text{atoms } A^{(k-1)} \in F_{k-1}} I_{A^{(k-1)}} \sum_{j=1}^m q_j(A^{(k-1)})\, y_j,$$
and we want this to equal 1 + r. Note that although we have started with a particular time k and atom A^{(k−1)}, the resulting condition always looks the same; this is due to the homogeneity in the structure of the multinomial model. The above conditional expectation equals 1 + r if and only if the equation
$$\sum_{j=1}^m q_j(A^{(k-1)})\, y_j = r$$
holds.
Corollary 1.4. In the multinomial model with parameters y₁ < · · · < y_m and r, there exists a probability measure Q ≈ P such that S̃¹/S̃⁰ is a Q-martingale if and only if y₁ < r < y_m.
The interpretation of the condition y₁ < r < y_m is very intuitive. It says that in comparison to the riskless bank account S̃⁰, the stock S̃¹ has the potential for both higher and lower growth than S̃⁰. Hence S̃¹ is genuinely more risky than S̃⁰. One has the feeling that this should not only be sufficient to exclude arbitrage opportunities, but necessary as well. That feeling is correct, as we shall see in the next section; alternatively, one can also prove this directly. [→ Exercise]
For the special case of the binomial model, we can even say a bit more.
Corollary 1.5. In the binomial model with parameters u > d and r, there exists a probability measure Q ≈ P such that S̃¹/S̃⁰ is a Q-martingale if and only if u > r > d. In that case, Q is unique (on F_T) and characterised by the property that Y₁, . . . , Y_T are i.i.d. under Q with parameter
$$Q[Y_k = 1+u] = q^* = \frac{r-d}{u-d} = 1 - Q[Y_k = 1+d].$$
Proof. The martingale condition $\sum_{j=1}^m q_j(A^{(k-1)})\, y_j = r$ reduces, with m = 2, y₁ = d, y₂ = u and q := q₂(A^{(k−1)}), to the equation (1 − q)d + qu = r, which has the unique solution q*. Because the one-step transition probabilities for Q are thus the same in each node throughout the tree, the i.i.d. description under Q follows as in Section 1.4 and in the preceding discussion. q.e.d.
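The one-step computation in this proof can be checked numerically; the values of u, d and r below are hypothetical example parameters, and Python is used purely for illustration:

```python
# One-step martingale condition in the binomial model of Corollary 1.5.
u, d, r = 0.10, -0.05, 0.02   # hypothetical parameters with d < r < u

q_star = (r - d) / (u - d)    # unique risk-neutral probability of an up-move
assert 0 < q_star < 1

# Martingale condition for one step: q*(1+u) + (1-q*)(1+d) = 1 + r,
# i.e. the expected discounted growth factor under Q* equals 1.
expected_growth = q_star * (1 + u) + (1 - q_star) * (1 + d)
print(q_star, expected_growth)   # approximately 0.4667 and 1.02
```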
Saying that IPe(,loc) (S) is non-empty is the same as saying that there exists an equiv-
alent (local) martingale measure Q for S. By Lemma 1.2 and the discussion around it,
both these properties imply that S is arbitrage-free or, equivalently, that S satisfies (NA).
It is very remarkable and important that the converse implication holds as well.
theorem). Theorem 2.1 can be viewed as a converse; it says that if one cannot win by
betting on a given process, then that process must be a martingale — at least after an
equivalent change of probability measure.
3) Note that we make no integrability assumptions about S (under P); so it is also noteworthy that S, being a Q-martingale, is automatically integrable under (some) Q. (To put this into perspective, one should add that it is a minor point; one can always easily construct [→ Exercise] a probability measure R equivalent to P such that S becomes under R as nicely integrable as one wants. But of course such an R will in general not be a martingale measure for S.)
Proving Theorem 2.1 is not elementary if one wants to allow models where the underlying probability space (Ω, F, P) is infinite, or more precisely if one of the σ-fields F_k, k ≤ T, is infinite. This level of generality is needed very quickly, for instance as soon as we want to work with returns which take more than only a finite number of values; the simplest example would be to have the Y_k lognormal, and other typical examples come up when one wants to study GARCH-type models. In that sense, the result in Theorem 2.1 is really needed in full generality. However, we content ourselves here with an explanation of the key geometric idea behind the proof, and with the exact argument for the case where Ω (or rather F_T) is finite (like for instance in the canonical setting for the multinomial model).
Due to Lemma 1.2 (plus Remark 1.3) and IP_e ⊆ IP_{e,loc}, we only need to prove that absence of arbitrage implies the existence of an equivalent martingale measure for S. By Proposition 1.1, (NA) is equivalent to G⁰ ∩ L⁰₊(F_T) = {0}, where
$$G^0 := \{G_T(\vartheta) : \vartheta \text{ is } \mathbb{R}^d\text{-valued and predictable}\}$$
is the space of all final positions one can generate from initial wealth 0 by self-financing (but not necessarily admissible) trading. In geometric terms, this means that the upper-right quadrant of nonnegative random variables, L⁰₊(F_T), intersects the linear subspace G⁰ only in the point 0.
As a consequence, the two sets L⁰₊(F_T) and G⁰ can be separated by a hyperplane, and the normal vector defining that hyperplane then yields (after suitable normalisation) the (density of the) desired EMM.
As one can see from the above scheme of proof, the existence of an EMM follows from the existence of a separating hyperplane between two sets. In that sense, the proof is (not surprisingly) not constructive, and it is also clear that we cannot expect uniqueness of an EMM in general. The latter fact can also easily be seen directly: Because the set IP_e(S) is obviously convex [→ Exercise], it is either empty, or contains exactly one element, or contains infinitely (uncountably) many elements.
Proof of Theorem 2.1 for Ω (or F_T) finite. If Ω (or F_T) is finite, then every random variable on (Ω, F_T) can take only a finite number (n, say) of values, and so we can identify L⁰(F_T) with the finite-dimensional space ℝⁿ and L⁰₊(F_T) with ℝⁿ₊. (More precisely, as pointed out below, we must take n as the number of atoms of F_T.) The set G⁰ ⊆ L⁰(F_T), which is obviously linear, can then be identified with a linear subspace H of ℝⁿ, and so (NA) translates into H ∩ ℝⁿ₊ = {0} due to Proposition 1.1.
Recall that a set A ∈ F_T is an atom in F_T if P[A] > 0 and if any B ∈ F_T with B ⊆ A has either P[B] = 0 or P[B] = P[A]. Then any F_T-measurable random variable Z has the form Z = Σ_{A atom in F_T} Z I_A = Σ_{A atom in F_T} z_A I_A with z_A ∈ ℝ. We consider the set of all F_T-measurable Z ≥ 0 with Σ_{A atom in F_T} z_A = 1 and identify this with the subset
$$K = \Bigl\{ z \in \mathbb{R}^n_+ : \sum_{i=1}^n z_i = 1 \Bigr\}$$
of ℝⁿ₊, where n denotes the (finite, by assumption) number of atoms in F_T. Then K ⊆ ℝⁿ₊ and K does not contain the vector 0, so that we must have H ∩ K = ∅. Moreover, K is convex and compact, and so a classical separation theorem for sets in ℝⁿ (see e.g. Lamberton/Lapeyre [12, Theorem A.3.2]) implies that there exists a vector λ ∈ ℝⁿ with λ ≠ 0 such that
$$(2.1)\qquad \lambda^{\mathrm{tr}} h = 0 \quad\text{for all } h \in H$$
(which says that λ is a normal vector to the hyperplane separating H and K) and
$$(2.2)\qquad \lambda^{\mathrm{tr}} z > 0 \quad\text{for all } z \in K.$$
Applying (2.2) to the unit vectors, which all lie in K, shows that each λ_i is strictly positive. Hence the numbers
$$\rho_i := \frac{\lambda_i}{\sum_{i=1}^n \lambda_i}$$
lie in (0, 1) and sum to 1 so that they define a probability measure Q on F_T via Q[A_i] := ρ_i for the atoms A₁, . . . , A_n of F_T; recall that F_T by assumption has only n atoms because it is finite, and any set in F_T is a union of atoms in F_T. Because P[A] > 0 for all n atoms A ∈ F_T, it is clear that Q is equivalent to P on F_T; and the property (2.1) that λ^tr h = 0 for all h ∈ H translates via the identification of H and G⁰ and the definition of G⁰ into E_Q[G_T(ϑ)] = 0 for every predictable ϑ, which is exactly the Q-martingale property of S. q.e.d.
In continuous time or with an infinite time horizon, existence of an EMM still implies
(NA), but the converse is not true. One needs a sort of topological strengthening which
excludes not only arbitrage from each single strategy, but also the possibility of creating
“arbitrage in the limit by using a sequence of strategies”. The resulting condition is called
(NFLVR) for “no free lunch with vanishing risk”, and the corresponding equivalence the-
orem, due to Freddy Delbaen and Walter Schachermayer in its most general form, is called
the fundamental theorem of asset pricing (FTAP). (To be accurate, we should mention
that also the concept of EMM must be generalised a little to obtain that theorem.) The
basic idea for proving the FTAP is still the same as in our above proof, but the techniques
and arguments are much more advanced. One reason is that for infinite F_k, k ≤ T, already the proof of Theorem 2.1 needs separation arguments for infinite-dimensional spaces. The second, more important reason is that the continuous-time formulation also needs the full arsenal and machinery of general stochastic calculus for semimartingales. This is rather difficult. For a detailed treatment, we refer to Delbaen/Schachermayer [4, Chapters 8, 9, 14].
Remark. While Theorem 2.1 is a very nice result, one should also be aware of its as-
sumptions and in consequence its limitations. The most important of these assumptions
are frictionless markets and small investors — and if one tries to relax these to have more
realism, the theory even in finite discrete time becomes considerably more complicated
and partly does not even exist yet. The same of course applies to continuous-time models and theorems. ◇
In some specific models, we have already studied when there exists a probability measure Q ≈ P such that S̃¹/S̃⁰ is a Q-martingale; see Corollaries 1.4 and 1.5. Combining this with Theorem 2.1 now immediately gives the following results.
Corollary 2.2. The multinomial model with parameters y1 < · · · < ym and r is arbitrage-
free if and only if y1 < r < ym .
Note that this confirms the intuition stated after Corollary 1.4.
Corollary 2.3. The binomial model with parameters u > d and r is arbitrage-free if and only if u > r > d. In that case, the EMM Q* for S̃¹/S̃⁰ is unique (on F_T) and is given as in Corollary 1.5.
We begin with (Ω, F) and a filtration IF = (F_k)_{k=0,1,...,T} in finite discrete time. On F, we have two probability measures Q and P, and we assume that Q ≈ P. Then the Radon–Nikodým theorem tells us that there exists a density dQ/dP =: D; this is a random variable D > 0 P-a.s. (because Q ≈ P) such that Q[A] = E_P[D I_A] for all A ∈ F, or more generally
$$E_Q[U] = E_P[DU] \quad\text{and}\quad E_P[U] = E_Q[U/D]$$
for every random variable U ≥ 0, which explains the notation to some extent. The point of these formulae is that they tell us how to compute Q-expectations in terms of P-expectations and vice versa. Sometimes one also writes D = dQ/dP |_F to emphasise that we have Q[A] = E_P[D I_A] for all A ∈ F, and one sometimes explicitly calls D the density of Q with respect to P on F.
To get similar transformation rules for conditional expectations, we introduce the P-martingale Z (sometimes denoted more explicitly by Z^Q or Z^{Q;P}) by
$$Z_k := E_P[D \mid F_k] = E_P\Bigl[\frac{dQ}{dP} \Bigm| F_k\Bigr] \quad\text{for } k = 0, 1, \dots, T.$$
Because D > 0 P-a.s., the process Z = (Z_k)_{k=0,1,...,T} is strictly positive in the sense that Z_k > 0 P-a.s. for each k, or also P[Z_k > 0 for all k] = 1. Z is called the density process (of Q, with respect to P); the next result makes it clear why.
respectively. This means that Z_k is the density of Q with respect to P on F_k, and we also sometimes write Z_k = dQ/dP |_{F_k}.
2) For j ≤ k and every F_k-measurable random variable U_k ≥ 0, we have
$$(3.2)\qquad E_Q[U_k \mid F_j] = \frac{1}{Z_j}\, E_P[Z_k U_k \mid F_j] \quad Q\text{-a.s.}$$
This tells us how conditional expectations under Q and P are related to each other.
3) A process N = (Nk )k=0,1,...,T which is adapted to IF is a Q-martingale if and only
if the product ZN is a P -martingale. This tells us how martingale properties under P
and Q are related to each other.
The proof of Lemma 3.1 is a standard exercise from probability theory in the use of conditional expectations. We do not give it here, but strongly recommend doing it as an [→ Exercise]. Note that if F_T is smaller than F, we have Z_T ≠ D in general.
$$D_k := \frac{Z_k}{Z_{k-1}} \quad\text{for } k = 1, \dots, T.$$
The process D = (D_k)_{k=1,...,T} is adapted, strictly positive and satisfies by its definition
$$E_P[D_k \mid F_{k-1}] = 1 \quad\text{and}\quad Z_k = Z_0 \prod_{j=1}^k D_j \quad\text{for } k = 0, 1, \dots, T.$$
So every Q ≈ P induces via Z a pair (Z₀, D). If we conversely start with a pair (Z₀, D) with the above properties (i.e. Z₀ is F₀-measurable, Z₀ > 0 P-a.s. with E_P[Z₀] = 1, and D is adapted and strictly positive with E_P[D_k | F_{k−1}] = 1 for all k), we can define a probability measure Q ≈ P via
$$\frac{dQ}{dP} := Z_0 \prod_{j=1}^T D_j.$$
This shows that the ratios D_k play the role of "one-step conditional densities" of Q with respect to P.
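The parametrisation by (Z₀, D) can be illustrated numerically. The sketch below (with hypothetical up-probabilities p under P and q under Q on a binomial path space; Python is used purely for illustration) checks that E_P[D_k | F_{k−1}] = 1 and that Z_T = Z₀ ∏ D_j really is the density dQ/dP on each path:

```python
# One-step conditional densities D_k on a binomial path space with
# node-independent up-probabilities p (under P) and q (under Q).
from itertools import product

T = 3
p, q = 0.6, 0.4   # hypothetical P[up] and Q[up] at every node

def prob(path, up):  # path is a tuple of 0/1, where 1 = "up"
    out = 1.0
    for w in path:
        out *= up if w == 1 else 1 - up
    return out

def D(w):
    """One-step density: q/p on an up-move, (1-q)/(1-p) on a down-move."""
    return q / p if w == 1 else (1 - q) / (1 - p)

# E_P[D_k | F_{k-1}] = 1 (the conditional law of each step is the same at
# every node here, so this reduces to an unconditional expectation):
assert abs(p * (q / p) + (1 - p) * ((1 - q) / (1 - p)) - 1.0) < 1e-12

# Z_T = Z_0 * prod D_j (with Z_0 = 1) is the density dQ/dP on each path:
for path in product((0, 1), repeat=T):
    Z_T = 1.0
    for w in path:
        Z_T *= D(w)
    assert abs(prob(path, p) * Z_T - prob(path, q)) < 1e-12
print("dQ/dP = Z_0 * prod D_j verified on all paths")
```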
The above parametrisation is very simple and yet very useful when we want to construct an equivalent martingale measure for a given process S. All we need to find are an F₀-measurable random variable Z₀ > 0 P-a.s. with E_P[Z₀] = 1 and an adapted strictly positive process D = (D_k)_{k=1,...,T} satisfying E_P[D_k | F_{k−1}] = 1 for all k (these are the properties required to get an equivalent probability measure Q), and in addition E_P[D_k (S_k − S_{k−1}) | F_{k−1}] = 0 for all k. Indeed, the latter condition is, in view of (3.3), simply the martingale property of S under the measure Q determined by (Z₀, D). (To be accurate, we also need to make sure that S is Q-integrable, meaning that E_Q[|S_k|] < ∞ for all k; this amounts to the integrability requirement that E_P[Z_k |S_k|] < ∞ for all k, where Z_k = Z₀ ∏_{j=1}^k D_j.)
The simplest choice for Z₀ is clearly the constant Z₀ ≡ 1; this amounts to saying that Q and P should coincide on F₀. If F₀ is P-trivial (i.e. P[A] ∈ {0, 1} for all A ∈ F₀), as is often the case, then every F₀-measurable random variable is P-a.s. constant, and then Z₀ ≡ 1 is actually the only possible choice (because we must have E_P[Z₀] = 1).
Concerning the D_k, not much can be said in this generality because we do not have any specific structure for our model. To get more explicit results, we therefore specialise and consider a setting with i.i.d. returns under P; this means that
$$\tilde S_k^1 = S_0^1 \prod_{j=1}^k Y_j, \qquad \tilde S_k^0 = (1+r)^k,$$
where Y₁, . . . , Y_T are > 0 and i.i.d. under P. The filtration we use is generated by (S̃⁰, S̃¹) or equivalently by S̃¹ or by Y; so F₀ is P-trivial and Y_k is under P independent of F_{k−1} for each k. The Q-martingale condition for S¹ = S̃¹/S̃⁰ in multiplicative form is then by (3.3) given by
$$1 = E_Q\Bigl[\frac{S_k^1}{S_{k-1}^1} \Bigm| F_{k-1}\Bigr] = E_Q\Bigl[\frac{\tilde S_k^1/\tilde S_k^0}{\tilde S_{k-1}^1/\tilde S_{k-1}^0} \Bigm| F_{k-1}\Bigr] = E_P\Bigl[\frac{D_k Y_k}{1+r} \Bigm| F_{k-1}\Bigr].$$
Because S¹ > 0, this also implies by iteration that E_Q[|S_k¹|] = E_Q[S_k¹] = E_Q[S₀¹] = S₀¹ < ∞, so that Q-integrability is automatically included in the martingale condition.
To keep things as simple as possible, we now might try to choose D_k, like Y_k, independent of F_{k−1}. Then [one can prove that] we must have D_k = g_k(Y_k) for some measurable function g_k, and we have to choose g_k in such a way that we get
$$1 = E_P[D_k \mid F_{k-1}] = E_P[g_k(Y_k)]$$
and
$$1 + r = E_P[D_k Y_k \mid F_{k-1}] = E_P[Y_k g_k(Y_k)].$$
(Note that these calculations both exploit the P-independence of Y_k from F_{k−1}.) If this choice is possible, we can then choose all the g_k ≡ g₁, because the Y_k are (assumed) i.i.d. under P and so the distribution of Y_k under P is the same as that of Y₁. To ensure that D_k > 0, we can impose g_k > 0.
If we find such a function g₁ > 0 with E_P[g₁(Y₁)] = 1 and E_P[Y₁ g₁(Y₁)] = 1 + r, setting
$$\frac{dQ}{dP} := \prod_{j=1}^T g_1(Y_j)$$
defines an EMM Q for S¹ = S̃¹/S̃⁰. Moreover, [one can show that] the returns Y₁, . . . , Y_T are again i.i.d. under Q (but of course not necessarily under an arbitrary EMM Q′ for S¹).
Example. We still assume that we have i.i.d. returns under P. If the Y_k are discrete random variables taking values (1 + y_j)_{j∈ℕ} with probabilities P[Y_k = 1 + y_j] = p_j, then g₁ is (for our purposes) determined by its values g₁(1 + y_j), and Q ≈ P means that we need q_j := Q[Y_k = 1 + y_j] > 0 for all those j with p_j > 0. If we set
$$q_j := p_j\, g_1(1 + y_j),$$
we are thus in more abstract terms looking for q_j having q_j > 0 whenever p_j > 0 and satisfying
$$1 = E_P[g_1(Y_1)] = \sum_{j\in\mathbb{N}} p_j\, g_1(1 + y_j) = \sum_{j\in\mathbb{N}} q_j$$
and
$$1 + r = E_P[Y_1 g_1(Y_1)] = \sum_{j\in\mathbb{N}} p_j (1 + y_j)\, g_1(1 + y_j) = \sum_{j\in\mathbb{N}} q_j (1 + y_j) = 1 + \sum_{j\in\mathbb{N}} q_j y_j,$$
or equivalently
$$\sum_{j\in\mathbb{N}} q_j y_j = r.$$
Note that the actual values of the pj are not relevant here; it only matters which of them
are strictly positive.
Example. In the multinomial model with parameters y₁, . . . , y_m and r, the above recipe boils down to finding q₁, . . . , q_m > 0 with Σ_{j=1}^m q_j = 1 and Σ_{j=1}^m q_j y_j = r. If m > 2 and the y_j are as usual all distinct, there is clearly an infinite number of solutions (provided of course that there is at least one).
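This non-uniqueness can be made concrete with a small numerical sketch (the values of y₁, y₂, y₃ and r below are hypothetical): with m = 3, the two conditions leave one free parameter, so one can exhibit a whole family of EMMs.

```python
# Trinomial model: q1 + q2 + q3 = 1 and q1*y1 + q2*y2 + q3*y3 = r are two
# equations in three unknowns, so a one-parameter family of EMMs exists.
y1, y2, y3 = -0.10, 0.00, 0.20   # hypothetical return parameters
r = 0.02                          # hypothetical interest rate

def emm(t):
    """One-step probabilities with q3 = t as free parameter (y2 = 0 here)."""
    q3 = t
    q1 = (r - t * y3) / y1
    q2 = 1.0 - q1 - q3
    return q1, q2, q3

for t in (0.2, 0.25, 0.3):        # each t in (0.1, 0.4) gives a valid EMM
    q1, q2, q3 = emm(t)
    assert q1 > 0 and q2 > 0 and q3 > 0
    assert abs(q1 + q2 + q3 - 1.0) < 1e-12
    assert abs(q1 * y1 + q2 * y2 + q3 * y3 - r) < 1e-12
print("three distinct EMMs found")
```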
Example. If we have i.i.d. lognormal returns, then $Y_i = e^{\sigma U_i + b}$ with random variables U₁, . . . , U_T i.i.d. ∼ N(0, 1) under P. Instead of D_i = g₁(Y_i), we here try (equivalently) with D_i = g̃₁(U_i), and more specifically with $D_i = e^{\alpha U_i + \beta}$. Then we have
$$E_P[D_i] = e^{\beta + \frac12\alpha^2} = 1 \quad\text{for } \beta = -\tfrac12\alpha^2,$$
and we get
$$E_P[D_i Y_i] = E_P[e^{b + \beta + (\alpha+\sigma)U_i}] = e^{b + \beta + \frac12(\alpha+\sigma)^2} = 1 + r$$
for
$$\log(1+r) = b + \beta + \tfrac12(\alpha+\sigma)^2 = b + \tfrac12\sigma^2 + \alpha\sigma,$$
hence
$$\alpha = \frac{1}{\sigma}\Bigl(\log(1+r) - b - \tfrac12\sigma^2\Bigr)$$
with
$$\beta = -\tfrac12\alpha^2, \qquad -\alpha = \frac{b + \tfrac12\sigma^2 - \log(1+r)}{\sigma}.$$
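These moment computations can be checked in closed form; the sketch below uses hypothetical example parameters σ, b, r and the Gaussian identity E[e^{cU}] = e^{c²/2} for U ∼ N(0, 1) (Python is used purely for illustration):

```python
# Closed-form check of the lognormal EMM construction above.
import math

sigma, b, r = 0.25, 0.01, 0.03    # hypothetical parameters

alpha = (math.log(1 + r) - b - sigma**2 / 2) / sigma
beta = -alpha**2 / 2

# For U ~ N(0,1): E[e^{cU}] = e^{c^2/2}, hence
E_D  = math.exp(beta + alpha**2 / 2)                  # E_P[D_i]
E_DY = math.exp(b + beta + (alpha + sigma)**2 / 2)    # E_P[D_i Y_i]

print(E_D, E_DY)   # 1.0 and 1 + r (= 1.03), as required for an EMM
```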
3 VALUATION AND HEDGING IN COMPLETE MARKETS 51
The interpretation is that H describes the net payoff (in units of asset 0) that the owner of this instrument obtains at time T; so having H ≥ 0 is natural and also avoids integrability issues. (A bit more generally, one could instead impose that H is bounded below P-a.s. by some constant.) As H is F_T-measurable, the payoff can depend on the entire information up to time T; and "European" means that the time for the payoff is fixed at the end T.
Remark. We could also deal with an F_k-measurable payoff made at time k; but as S⁰ ≡ 1, it is financially equivalent whether such a payoff is made at k or at T, because we can use the bank account to transfer money over time without changing it or its value in any way. By using linearity, we could then also deal with payoff streams having a payoff at every date k (with, of course, the time k payoff being F_k-measurable, i.e. the payoff stream being an adapted process). However, we do not consider here American-type payoffs where the owner of the financial instrument has some additional freedom in choosing the time of the payoff; the theory for that is a bit more complicated. ◇
Example. A European call option on asset i with maturity T and strike K gives its owner the right, but not the obligation, to buy at time T one unit of asset i for the price K, irrespective of what the actual asset price S_T^i then is. Any rational person will make use of (exercise) that right if and only if S_T^i(ω) > K, because it is in that, and only in that, situation that the right is more valuable than the asset itself. In that case, in purely monetary terms, the net payoff is then S_T^i(ω) − K, and this is obtained by buying asset i at the low price K and immediately selling it on the market at the high price S_T^i(ω). In the other case S_T^i(ω) ≤ K, the option is clearly worthless; it makes no monetary sense to pay K for one unit of asset i if one can get this on the market for less, namely for S_T^i(ω). So here we have for the option a net payoff, in monetary terms, of
$$H(\omega) = \max\bigl(0,\, S_T^i(\omega) - K\bigr) = \bigl(S_T^i(\omega) - K\bigr)^+.$$
Remark. In the above example, and more generally by identifying an option with its net payoff in units of S⁰, we are implicitly restricting ourselves to so-called cash delivery of options. However, there might be other contractual agreements. For instance, with a call option with physical delivery, one actually obtains at time T in case of exercise the shares or units of the specified asset and has to pay in cash the agreed amount K. If the underlying asset is some commodity like e.g. oil or grain, this distinction becomes quite important. However, we do not discuss this here any further. ◇
In words, this option pays at time T one unit of money if and only if all stocks remain
between the levels a and b up to time T . This H is also FT -measurable, but now depends
on the asset price evolution over the whole time range k = 0, 1, . . . , T ; it cannot be written
as a function of the final stock price ST alone.
gives a payoff which depends on the average price (over time) of asset i, but which is only due in case that a certain event A occurs. In insurance, the set A could for instance be the event of the death up to time T of an insured person; then H would describe the payoff from an index-linked insurance policy. This is an example where H depends on more than only the basic asset prices. To get interesting examples of this type, we need the filtration IF to be strictly larger than the filtration IF^S generated by asset prices.
The basic question studied in this chapter is the following: Given a contingent claim H ∈ L⁰₊(F_T), how can we assign to H a value at any time k < T in such a way that this creates no arbitrage opportunities (if the claim is made available for trading at these values)? And having sold H, what can one do to insure oneself against the risk involved in having to pay the random, uncertain amount H at time T?
The key idea for answering both questions is very simple. With the help of the basic traded assets S⁰ and S, we try to construct an artificial product that looks as similar to H as possible. The value of this product is then known because the product is constructed from the given assets; and this value should by absence of arbitrage be a good approximation for the value of H.
Let us first look at the ideal case. Suppose that we can find a self-financing strategy φ ≙ (V₀, ϑ) such that V_T(φ) = H P-a.s. Then both the strategy φ and just holding H have costs of 0 at all intermediate times k = 1, . . . , T − 1 because φ is self-financing, and both have at time T a value of H. To avoid arbitrage, the values of both structures must therefore coincide at time 0 as well, because we can otherwise buy the cheaper and sell the more expensive product to make a riskless profit. (Note that this argument crucially exploits that in finite discrete time, (NA) and (NA0) are equivalent, so that we need not worry about any admissibility condition for the "strategy", in the extended market, of combining two products.) In consequence, the value or price of H at time 0 must be V₀. An analogous argument and conclusion are valid for any time k, where the value or price of H must then be V_k(φ).
Definition. A payoff H ∈ L⁰₊(F_T) is called attainable if there exists an admissible self-financing strategy φ ≙ (V₀, ϑ) with V_T(φ) = H P-a.s. The strategy φ is then said to replicate H and is called a replicating strategy for H.
Remark. Even in finite discrete time, it is important (and exploited below) that a replicating strategy should be admissible. In continuous or infinite discrete time, this becomes indispensable. ◇
The next result formalises the key idea explained just before the above definition. In addition, it also provides an efficient way of computing the resulting option price.
Theorem 1.1. Consider a discounted financial market in finite discrete time and suppose that S is arbitrage-free and F₀ is trivial. Then every attainable payoff H has a unique price process V^H = (V_k^H)_{k=0,1,...,T} which admits no arbitrage (in the extended market consisting of 1, S and V^H). It is given by
$$V_k^H = E_Q[H \mid F_k] = V_k(V_0, \vartheta) \quad\text{for } k = 0, 1, \dots, T,$$
for any equivalent martingale measure Q for S and for any replicating strategy φ ≙ (V₀, ϑ) for H.
Proof. By the DMW theorem in Theorem 2.2.1, IP_e(S) is nonempty because S is arbitrage-free; so there is at least one EMM Q. By assumption, H is attainable; so there is at least one replicating strategy φ. Because φ and H provide the same payoff structures, they must by absence of arbitrage in the extended market have the same value processes; so V^H = V(φ), and this holds for any replicating φ. Because any such φ ≙ (V₀, ϑ) is admissible by definition, V(φ) = V₀ + ϑ • S = V(V₀, ϑ) is a Q-martingale by Theorem 1.3.3, for any Q ∈ IP_e(S), and as its final value is V_T(φ) = H (P-a.s., hence also Q-a.s.), we get V_k(φ) = E_Q[V_T(φ) | F_k] = E_Q[H | F_k] for k = 0, 1, . . . , T. More precisely, V₀ is a constant because F₀ is trivial, and φ is admissible so that V(φ) is bounded from below. So ϑ • S = V(φ) − V₀ is also bounded from below, which justifies the use of Theorem 1.3.3. q.e.d.
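The valuation formula and the replication idea can be made concrete in the binomial model. The sketch below (with hypothetical parameters; Python is used purely for illustration) computes V_k^H = E_{Q*}[H | F_k] by backward induction for a call in discounted units, and checks pathwise that the associated self-financing strategy replicates H:

```python
# Backward induction and pathwise replication check in a binomial model.
from itertools import product

u, d, r, S0, T, K = 0.10, -0.05, 0.02, 100.0, 3, 100.0
q = (r - d) / (u - d)                           # unique EMM parameter q*

up, dn = (1 + u) / (1 + r), (1 + d) / (1 + r)   # discounted one-step factors

def S(path):                      # discounted stock price after the given path
    s = S0
    for w in path:
        s *= up if w else dn
    return s

def H(path):                      # discounted call payoff (S~_T - K)^+/(1+r)^T
    return max(S(path) - K / (1 + r) ** T, 0.0)

def V(path):                      # V_k^H = E_{Q*}[H | F_k], where k = len(path)
    if len(path) == T:
        return H(path)
    return q * V(path + (1,)) + (1 - q) * V(path + (0,))

def theta(path):                  # replicating number of shares for next step
    s = S(path)
    return (V(path + (1,)) - V(path + (0,))) / (s * up - s * dn)

# Forward check: V_0 + sum_k theta_k * (S_k - S_{k-1}) = H on every path.
for path in product((0, 1), repeat=T):
    value = V(())
    for k in range(T):
        value += theta(path[:k]) * (S(path[:k + 1]) - S(path[:k]))
    assert abs(value - H(path)) < 1e-10
print("replication verified; V_0 =", round(V(()), 4))
```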
The next result shows how the last question can be answered by again using E(L)MMs for S.
Theorem 1.2. For a payoff H ∈ L⁰₊(F_T), the following statements are equivalent:
1) H is attainable.
2) sup_{Q∈IP_{e,loc}(S)} E_Q[H] < ∞ is attained in some Q* ∈ IP_{e,loc}(S), i.e. the supremum is finite and a maximum; in other words, we have sup_{Q∈IP_{e,loc}(S)} E_Q[H] = E_{Q*}[H] < ∞ for some Q* ∈ IP_{e,loc}(S).
3) The mapping IP_e(S) → ℝ, Q ↦ E_Q[H] is constant, i.e. H has the same and finite expectation under all EMMs Q for S.
Proof. While some of the implications are rather straightforward, the full proof, and in particular the implication "2) ⇒ 1)", is difficult because it relies on the so-called optional decomposition theorem. For the case where prices S are nonnegative, see Föllmer/Schied [9, Remark 7.17 and Theorem 5.32]. The general case is more delicate; the simplification for S ≥ 0 is due to the fact that the sets IP_e(S) and IP_{e,loc}(S) then coincide. A full proof is for instance given in the lecture "Introduction to Mathematical Finance". q.e.d.
Remark. For models with continuous or infinite discrete time, the equivalence between 1) and 2) in Theorem 1.2 still holds (with a slightly stronger definition of attainability), but the equivalence between 2) and 3) may (surprisingly!) fail. More precisely, "3) ⇒ 2)" remains valid if we replace IP_e by IP_{e,loc} in 3), but "2) ⇒ 3)" in general only holds if H is bounded; see Delbaen/Schachermayer [4, Chapter 10] for a counterexample. ◇
3) Compute E_Q[H] for all ELMMs Q for S and determine the supremum of E_Q[H] over Q.
4a) If the supremum is finite and a maximum, i.e. attained in some Q* ∈ IP_{e,loc}(S), then H is attainable and its price process can be computed as V_k^H = E_Q[H | F_k], for any Q ∈ IP_e(S).
4b) If the supremum is not attained (or, equivalently for finite discrete time, there is a pair of EMMs Q₁, Q₂ with E_{Q₁}[H] ≠ E_{Q₂}[H]), then H is not attainable.
In case 4a), Theorem 1.1 tells us how to value H; but if we also want to find a
replicating strategy, then more work is required.
In case 4b), we are faced with a genuine problem: It is impossible to replicate H, so our whole conceptual approach up to here breaks down. We then have the difficult problem of valuation and hedging for a non-attainable payoff, and there are in the literature several competing approaches to that, all involving in some way the specification of preferences or subjective views of the option seller.
2) If H is not attainable, it is at best not clear how to hedge H in any reasonably safe
way, and at worst, this may be impossible to achieve.
Both of these issues are often ignored in the literature; whether this happens intentionally
or through ignorance is not always clear. One area where this used to be particularly
prominent is credit risk. One can of course argue that having some approach to obtain
a valuation is better than nothing; but a value which has substantial arbitrariness and
perhaps no clear risk management outlook should certainly be treated with care and
respect.
Definition. A financial market model (in finite discrete time) is called complete if every payoff H ∈ L⁰₊(F_T) is attainable. Otherwise it is called incomplete.
for any EMM Q for S and any replicating strategy φ ≙ (V₀, ϑ) for H.
While Theorem 2.1 looks very nice, it raises the important question of how to recognise a complete market, because completeness is a statement about all payoffs H ∈ L⁰₊(F_T). But very fortunately, there is a very simple criterion, and it should be no surprise by now that this again involves EMMs Q.
Theorem 2.2. Consider a discounted financial market model in finite discrete time and assume that S is arbitrage-free, F₀ is trivial and F_T = F. Then S is complete if and only if there is a unique equivalent martingale measure for S. In brief:
$$\text{complete} \iff \text{IP}_e(S) \text{ contains exactly one element.}$$
Proof. "⇐": If IP_e(S) contains only one element, then Q ↦ E_Q[H] is of course constant over Q ∈ IP_e(S) for any H ∈ L⁰₊(F_T). Hence H is attainable by Theorem 1.2.
[To be accurate and avoid the case that Q ↦ E_Q[H] ≡ +∞, one also needs to check a priori some integrability issues, namely that E_Q[H] < ∞ for at least one Q ∈ IP_e(S); see Föllmer/Schied [9, Theorems 5.30 and 5.26] for details.]
"⇒": For any A ∈ F_T, the payoff H := I_A is attainable; so by Theorem 1.1, we have for any pair of EMMs Q₁, Q₂ for S that
$$Q_1[A] = E_{Q_1}[I_A] = V_0^H = E_{Q_2}[I_A] = Q_2[A].$$
So Q₁ and Q₂ coincide on F_T = F, which means that there can be at most one EMM for S. By the DMW theorem in Theorem 2.2.1, there is at least one EMM because S is arbitrage-free, and so the proof is complete. q.e.d.
Theorem 2.2 is sometimes called the second fundamental theorem of asset pricing. Combining it with the first FTAP in Theorem 2.2.1, we have a very simple and beautiful description of discounted financial market models in finite discrete time: such a model is arbitrage-free if and only if there exists at least one EMM, and an arbitrage-free model is in addition complete if and only if that EMM is unique. For continuous or infinite discrete time, such statements become more subtle to formulate and more difficult to prove.
Remarks. 1) We can see from the proof of Theorem 2.2 where the assumption F_T = F is used. But it is also clear from looking at the statement why it is needed; after all, completeness is only an assertion about F_T-measurable quantities.
2) One can show that if a financial market in finite discrete time is complete, then F_T must be finite; see Föllmer/Schied [9, Theorem 5.38]. In effect, finiteness of F_T means that Ω can also be taken finite. This shows that while it makes the theory nice and simple, completeness is also a very restrictive property; complete financial markets in finite discrete time are effectively given by finite tree models. ◇
Example. The multinomial model with a bank account and one stock (d = 1) is incomplete whenever m > 2, i.e. as soon as there is some node in the tree which allows more than two possible stock price evolutions. This follows from Theorem 2.2 because in that situation, there are infinitely many EMMs; see Section 2.3.
Example. Consider any model with d = 1 (one risky asset) and i.i.d. returns Y₁, . . . , Y_T under P. If Y₁ has a density (e.g. if we have lognormal returns), then S is incomplete. This is because F₁ (and hence also F_T) must be infinite for Y₁ to have a density. Alternatively, one can easily construct different EMMs if there is at least one. [→ Exercise]
where Q* is the unique EMM for S¹. We also recall from Corollary 2.2.3 that the Y_j are under Q* again i.i.d., but with
$$Q^*[Y_1 = 1+u] = q^* := \frac{r-d}{u-d} \in (0,1).$$
All the above quantities S¹, H, V^H are discounted with S̃⁰, i.e. expressed in units of asset 0. The undiscounted quantities are the stock price S̃¹ = S¹ S̃⁰, the payoff H̃ := H S̃_T⁰ and its price process Ṽ^H with Ṽ_k^H := V_k^H S̃_k⁰ for k = 0, 1, . . . , T. Putting together all we know then yields
Corollary 3.1. In the binomial model with u > r > d, the undiscounted arbitrage-free
e 2 L0+ (FT ) is given by
price process of any undiscounted payo↵ H
e e0 e0
e e H e Sk Fk = Sk EQ⇤ [H
e | Fk ]
ṼkH 0
= Sk E Q ⇤ Fk = EQ⇤ H for k = 0, 1, . . . , T .
e
ST0 e0
ST e0
ST
Example. For a European call option on S̃^1 with maturity T and undiscounted strike
K̃, we have

H̃ = (S̃^1_T − K̃)^+ = (S̃^1_T − K̃) I_{{S̃^1_T > K̃}}.
Now

{S̃^1_T > K̃} = { S̃^1_k ∏_{j=k+1}^T Y_j > K̃ } = { ∑_{j=k+1}^T log Y_j > log(K̃/S̃^1_k) }.
If we define

W_j := I_{{Y_j = 1+u}} = 1 if Y_j = 1 + u and 0 if Y_j = 1 + d,

then

∑_{j=k+1}^T log Y_j = W_{k,T} log((1+u)/(1+d)) + (T − k) log(1 + d),

where W_{k,T} := ∑_{j=k+1}^T W_j ∼ Bin(T − k, q*) is independent of F_k under Q*.
So we get

{S̃^1_T > K̃} = { W_{k,T} log((1+u)/(1+d)) > log(K̃/S̃^1_k) − (T − k) log(1 + d) }

and therefore

Q*[S̃^1_T > K̃ | F_k] = Q*[ W_{k,T} > (log(K̃/s) − (T − k) log(1 + d)) / log((1+u)/(1+d)) ] |_{s = S̃^1_k},

because W_{k,T} is independent of F_k under Q* and S̃^1_k is F_k-measurable. The above
probability can be computed explicitly because W_{k,T} has a binomial distribution; and as

E_{Q*}[H̃ | F_k] = E_{Q*}[ S̃^1_T I_{{S̃^1_T > K̃}} | F_k ] − K̃ Q*[S̃^1_T > K̃ | F_k],

we already have the second half of the so-called binomial call pricing formula.
For the first term, one can either use explicit (and lengthy) computations or more
elegantly a so-called change of numeraire to obtain that

(3.1)   E_{Q*}[ S̃^1_T I_{{S̃^1_T > K̃}} | F_k ]
        = S̃^1_k (S̃^0_T/S̃^0_k) E_{Q*}[ (S̃^1_T S̃^0_k)/(S̃^1_k S̃^0_T) I_{{S̃^1_T > K̃}} | F_k ]
        = S̃^1_k (S̃^0_T/S̃^0_k) Q**[ S̃^1_T > K̃ | F_k ]
        = S̃^1_k (S̃^0_T/S̃^0_k) Q**[ W_{k,T} > (log(K̃/s) − (T − k) log(1 + d)) / log((1+u)/(1+d)) ] |_{s = S̃^1_k}

with a new probability measure Q** determined by

q** := Q**[Y_1 = 1 + u] := q* (1+u)/(1+r),   hence   1 − q** = (1 − q*) (1+d)/(1+r).
Indeed, because S̃^1/S̃^0 = S^1 is under Q* a positive martingale, one can use it to define
via dQ**/dQ* := S^1_T/S^1_0 a probability measure Q** ≈ Q* on F_T; then the Q*-martingale
S^1/S^1_0 starting at 1 is by construction the density process Z^{Q**;Q*} of Q** with respect to
Q*, and the second equality in (3.1) is due to the Bayes formula (2.3.2) in Lemma 2.3.1.
One then easily verifies [→ exercise] that Q** is the unique probability measure equivalent
to P on F_T such that S̃^0/S̃^1 = 1/S^1 becomes a Q**-martingale, and one can also check
that Y_1, …, Y_T are under Q** i.i.d. with Q**[Y_1 = 1 + u] = q**. Indeed, this is not
really surprising: by Lemma 2.3.1, 3), the process 1/S^1 is a Q**-martingale because the
product Z^{Q**;Q*}(1/S^1) = (S^1/S^1_0)(1/S^1) ≡ 1/S^1_0 is obviously a Q*-martingale, and 1/S^1
has a binomial structure exactly like S^1 itself. The measure Q** is sometimes called the dual
martingale measure.
So all in all, we obtain the fairly simple formula

(3.2)   Ṽ^H̃_k = S̃^1_k Q**[W_{k,T} > x] − K̃ (S̃^0_k/S̃^0_T) Q*[W_{k,T} > x]

with

(3.3)   x = (log(K̃/s) − (T − k) log(1 + d)) / log((1+u)/(1+d))   for s = S̃^1_k,
and where W_{k,T} has a binomial distribution with parameters T − k and q* under Q*,
respectively T − k and q** under Q**. This binomial call pricing formula is the discrete
analogue of the famous Black–Scholes formula.
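To make (3.2) and (3.3) concrete, here is a small sketch (for time k = 0; the parameter values and the function name are illustrative choices, not from the text) that evaluates the binomial call pricing formula with Python's standard library:

```python
import math

def binomial_call_price(s0, strike, T, u, d, r):
    """Evaluate the binomial call pricing formula (3.2)-(3.3) at time k = 0."""
    q_star = (r - d) / (u - d)              # q* under the EMM Q*
    q_sstar = q_star * (1 + u) / (1 + r)    # q** under the dual measure Q**
    # threshold x from (3.3), with k = 0 and s = s0
    x = (math.log(strike / s0) - T * math.log(1 + d)) / math.log((1 + u) / (1 + d))
    def tail(n, p):                         # P[Bin(n, p) > x]
        return sum(math.comb(n, j) * p**j * (1 - p)**(n - j)
                   for j in range(max(0, math.floor(x) + 1), n + 1))
    return s0 * tail(T, q_sstar) - strike * (1 + r) ** (-T) * tail(T, q_star)

print(round(binomial_call_price(100.0, 100.0, 3, 0.1, -0.1, 0.0), 4))  # → 7.475
```

For u = 0.1, d = −0.1, r = 0 one gets q* = 1/2 and q** = 0.55, and the three-period at-the-money call on S̃^1_0 = 100 comes out at 7.475, which can be cross-checked by computing E_{Q*}[(S̃^1_3 − 100)^+] directly over the eight paths.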
This provides a very simple recursive algorithm by using that the filtration IF in the
binomial model has the structure of a (binary) tree. Indeed, if we fix some node (corresponding
to some atom) at time k − 1 (respectively of F_{k−1}) and denote by v_{k−1} the value
of V^H_{k−1} there (on that atom), then there are only two possible successor nodes (atoms of
F_k) and V^H_k can only take two values there, say v^u_k and v^d_k. The Q*-martingale property
then says that

v_{k−1} = q* v^u_k + (1 − q*) v^d_k,

because the one-step transition probabilities of Q* are the same throughout the tree and
given by q*, 1 − q*. In undiscounted terms, we have

Ṽ^H̃_{k−1}/S̃^0_{k−1} = E_{Q*}[ Ṽ^H̃_k/S̃^0_k | F_{k−1} ]

or

Ṽ^H̃_{k−1} = (1/(1+r)) E_{Q*}[Ṽ^H̃_k | F_{k−1}],

which translates at the level of node values to the recursion

(3.4)   ṽ_{k−1} = (1/(1+r)) ( q* ṽ^u_k + (1 − q*) ṽ^d_k ).
To work out the replicating strategy, also for a general payoff H, we recall from Theorem 1.1 that

V^H_k = V_k(V_0, ϑ) = V_0 + ∑_{j=1}^k ϑ_j ΔS^1_j   for k = 0, 1, …, T.
Now let us look again at some fixed node at time k − 1 (atom of F_{k−1}). Because ϑ is
predictable, ϑ_k is F_{k−1}-measurable and so the value of ϑ_k is already known at time k − 1,
hence in that node (on that atom), and it cannot change as we move forward to time k.
If we denote as before by v_{k−1} the value of V^H_{k−1} in the chosen node (on the chosen atom)
at time k − 1 and by s_{k−1} the value of S^1_{k−1} there, we know that v_{k−1} evolves to either v^u_k
or v^d_k, and s_{k−1} evolves to s^u_k = s_{k−1}(1+u)/(1+r) or s^d_k = s_{k−1}(1+d)/(1+r), respectively, in the next step.
But the relation (3.5) between increments must hold in all nodes (on all atoms) and at
all times; so if ξ_k denotes the value of ϑ_k in the chosen node (on the chosen atom) at time
k − 1, we obtain the two equations

v^u_k − v_{k−1} = ξ_k (s^u_k − s_{k−1}),
v^d_k − v_{k−1} = ξ_k (s^d_k − s_{k−1}).

Note that we have the same ξ_k in both equations because the value of ϑ_k cannot change
as we go from time k − 1 to time k. The above two equations are readily solved to give

(3.6)   ξ_k = (v^u_k − v^d_k)/(s^u_k − s^d_k).
Again, the right-hand side is known at time k = T because we know that V^H_T = H. So
both the price process V^H and the hedging strategy ϑ can be computed in parallel while
working backward through the tree.
If the payoff is path-independent, i.e. of the form H̃ = h̃(S̃^1_T) for
some function h̃, then the above formulas and computation scheme simplify considerably.
Then we can write

Ṽ^H̃_k = ṽ(k, S̃^1_k)   for k = 0, 1, …, T

and

ϑ_k = ξ̃(k, S̃^1_{k−1})   for k = 1, …, T

for functions ṽ and ξ̃. From (3.4), the function ṽ satisfies the recursion

ṽ(k − 1, s) = (1/(1+r)) ( q* ṽ(k, s(1+u)) + (1 − q*) ṽ(k, s(1+d)) )

and, from (3.6) multiplied in both numerator and denominator by S̃^0_k = (1 + r)^k, we get

ξ̃(k, s) = ( ṽ(k, s(1+u)) − ṽ(k, s(1+d)) ) / ( s(1+u) − s(1+d) ).

Because the tree for S̃^1 recombines, the number of distinct values to be handled is
also massively smaller, and so are computation times and storage requirements.
[It is a very good [→ exercise] to either derive the above relations for the path-independent
case directly or deduce them from the preceding general results. In both cases, one uses
a backward induction argument.]
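As a sketch of the backward induction (parameter values and function names are illustrative, not from the text), the recursion for ṽ and the hedge ratio ξ̃ can be computed together on the recombining tree:

```python
def price_and_hedge(s0, T, u, d, r, payoff):
    """Backward induction: v(k-1, s) = (q* v(k, s(1+u)) + (1-q*) v(k, s(1+d)))/(1+r);
    the hedge ratio is the difference quotient over the two successor stock prices."""
    q = (r - d) / (u - d)                   # q*
    # terminal values on the T+1 recombining nodes s0 (1+u)^j (1+d)^(T-j)
    v = [payoff(s0 * (1 + u) ** j * (1 + d) ** (T - j)) for j in range(T + 1)]
    hedge_t1 = None
    for k in range(T, 0, -1):
        s = [s0 * (1 + u) ** j * (1 + d) ** (k - 1 - j) for j in range(k)]
        xi = [(v[j + 1] - v[j]) / (sj * (u - d)) for j, sj in enumerate(s)]
        v = [(q * v[j + 1] + (1 - q) * v[j]) / (1 + r) for j in range(k)]
        if k == 1:
            hedge_t1 = xi[0]                # hedge ratio held over (0, 1]
    return v[0], hedge_t1

price, xi1 = price_and_hedge(100.0, 3, 0.1, -0.1, 0.0, lambda s: max(s - 100.0, 0.0))
print(round(price, 4))  # → 7.475
```

With u = 0.1, d = −0.1, r = 0 and an at-the-money call over T = 3 periods, the recursion reproduces the price 7.475 obtained from the direct expectation under Q*, together with the initial hedge ratio ξ̃(1, 100) = 0.525.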
4 Basics about Brownian motion
For technical reasons, we should also assume (or make sure, if we construct the filtration
in some way) that IF satisfies the so-called usual conditions of being right-continuous and
P-complete, but we do not dwell on this technical mathematical issue in more detail.
(BM2) W has continuous trajectories, meaning that for P-almost all ω ∈ Ω, the function
t ↦ W_t(ω) on [0, ∞) is continuous.
Remarks. 1) One can prove that Brownian motion exists, but this is a nontrivial
mathematical result. See the course on “Brownian Motion and Stochastic Calculus” (in
short BMSC) for more details.
2) The letter W is used in honour of Norbert Wiener who gave the first rigorous
proof of the existence of Brownian motion in 1923. It is historically interesting to note,
however, that Brownian motion was already introduced and used considerably earlier in
both finance and physics — by Louis Bachelier in his PhD thesis in 1900 for finance and
by Albert Einstein in 1905 for physics.
3) Brownian motion in IR^m is simply an adapted IR^m-valued stochastic process null at
0 with (BM2) and such that (BM1) holds with N(0, t − s) replaced by N(0, (t − s) I_{m×m}),
where I_{m×m} denotes the m × m identity matrix. This is equivalent to saying that the m
components are all real-valued Brownian motions and independent (as processes). ⋄
There is also a definition of Brownian motion (BM for short) without any filtration IF.
This is a (real-valued) stochastic process W = (W_t)_{t≥0} which starts at 0, satisfies (BM2)
and instead of (BM1) the following property:

(BM1′) For any n ∈ IN and any times 0 = t_0 < t_1 < · · · < t_n < ∞, the increments
W_{t_i} − W_{t_{i−1}}, i = 1, …, n, are independent (under P ) and we have (under P ) that
W_{t_i} − W_{t_{i−1}} ∼ N(0, t_i − t_{i−1}), or ∼ N(0, (t_i − t_{i−1}) I_{m×m}) if W is IR^m-valued.

Instead of (BM1′), one also says (in words) that W has independent stationary increments
with a (specific) normal distribution.
The two definitions of BM are equivalent if one chooses as IF the filtration IF^W generated
by W (and made right-continuous and P-complete). This (like many other subsequent
results and facts) needs a proof, which we do not give. More details can be found
in the lecture notes on “Brownian Motion and Stochastic Calculus”.
There are several transformations that produce a new Brownian motion from a given
one, and this can in turn be used to prove results about BM. More precisely:

1) W^1 := −W is a BM.

[Parts 2)–5) of Proposition 1.1 list further such transformations; in particular, part 5)
is the time inversion given by W^5_t := t W_{1/t} for t > 0 and W^5_0 := 0.]

(Note that we always use here the definition of BM without an exogenous filtration.)
While parts 1)–4) of Proposition 1.1 are easy to prove, part 5) is a bit more tricky.
However, it is also very useful because it relates the asymptotic behaviour of BM as t → ∞
to the behaviour of BM close to time 0, and vice versa.
The next result gives some information about how trajectories of BM behave.

Proposition 1.2. Suppose W = (W_t)_{t≥0} is a BM.

1) Law of large numbers: lim_{t→∞} W_t/t = 0 P-a.s., i.e. BM grows more slowly than linearly
as t → ∞.

2) (Global) Law of the iterated logarithm (LIL): With Ψ_glob(t) := √(2t log(log t)), we
have

lim sup_{t→∞} W_t/Ψ_glob(t) = +1   and   lim inf_{t→∞} W_t/Ψ_glob(t) = −1   P-a.s.,

i.e., for P-almost all ω, the function t ↦ W_t(ω) for t → ∞ oscillates precisely
between t ↦ ±Ψ_glob(t).
3) (Local) Law of the iterated logarithm (LIL): With Ψ_loc(h) := √(2h log(log(1/h))), we
have for every t ≥ 0

lim sup_{h↘0} (W_{t+h} − W_t)/Ψ_loc(h) = +1   and   lim inf_{h↘0} (W_{t+h} − W_t)/Ψ_loc(h) = −1   P-a.s.,

i.e., for P-almost all ω, to the right of t, the trajectory u ↦ W_u(ω) around the level
W_t(ω) oscillates precisely between h ↦ ±Ψ_loc(h).
One immediate consequence of 2) and 3) is that BM crosses the level 0 (or, with a
bit more effort for the proof, any level a) infinitely many times; and once it is at that
level, it even manages to achieve infinitely many crossings in an arbitrarily short amount
of time. This is already a first indication of the amazingly strong activity of BM.
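The law of large numbers in part 1) is easy to see in simulation. The sketch below (seed, grid and horizons are arbitrary choices, not from the text) approximates W_T by a sum of independent normal increments and shows that W_T/T becomes small as T grows:

```python
import random

def brownian_terminal(T, n, rng):
    """Simulate W_T as a sum of n independent N(0, T/n) increments."""
    dt = T / n
    return sum(rng.gauss(0.0, dt ** 0.5) for _ in range(n))

rng = random.Random(0)
# W_T / T should shrink as T grows, illustrating lim W_t / t = 0:
for T in (10.0, 1000.0, 100000.0):
    print(T, abs(brownian_terminal(T, 1000, rng) / T))
```

Since W_T/T has the N(0, 1/T) distribution, the printed ratios shrink like 1/√T.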
We remark that part 1) of Proposition 1.2 is easily proved by using part 5) of Proposition 1.1.
Moreover, part 2) follows directly from part 3) via part 5) of Proposition 1.1,
and for proving part 3), it is enough to take t = 0, by part 2) of Proposition 1.1, and to
prove the lim sup result, by part 1) of Proposition 1.1. But then the easy reductions stop
and the proof becomes difficult.
The oscillation results in Proposition 1.2 already make it clear that the trajectories of
BM behave rather wildly. Another result in that direction is
Proposition 1.3. Suppose W = (W_t)_{t≥0} is a BM. Then for P-almost all ω ∈ Ω, the
function t ↦ W_t(ω) from [0, ∞) to IR is continuous, but nowhere differentiable.
The deeper reason behind the wild behaviour of Brownian trajectories, and the key to
understanding stochastic calculus and Itô's formula for BM, is that Brownian trajectories
are continuous functions having a nonzero quadratic variation. Heuristically, this can be
seen as follows. By definition, Brownian motion increments W_{t+h} − W_t have a normal
distribution N(0, h), which implies they are symmetric around 0 with variance h so that
roughly, "W_{t+h} − W_t ≈ ±√h with probability 1/2 each". In very loose and purely formal
terms, this means that infinitesimal increments "dW_t = W_t − W_{t−dt}" of BM have the
property that

"(dW_t)² = dt".

While this is very helpful for an intuitive understanding, we emphasise that it is purely
formal and must not be used for rigorous mathematical arguments. A more precise
description is as follows.
Call a partition of [0, ∞) any set Π ⊆ [0, ∞) of time points with 0 ∈ Π and such that
Π ∩ [0, T] is finite for all T ∈ [0, ∞). This implies that Π is at most countable and can
be ordered increasingly as Π = {0 = t_0 < t_1 < · · · < t_m < · · · < ∞}. The mesh size of Π
is then defined as |Π| := sup{t_i − t_{i−1} : t_{i−1}, t_i ∈ Π}, i.e. the size of the biggest time-step
in Π. For any partition Π of [0, ∞), any function g : [0, ∞) → IR and any p > 0, we first
define the p-variation of g on [0, T] along Π as

V^p_T(g, Π) := ∑_{t_i ∈ Π} |g(t_i ∧ T) − g(t_{i−1} ∧ T)|^p.
If we define the total variation of g on [0, T] as the supremum of V^1_T(g, Π),
where the supremum is taken over all partitions Π of [0, ∞), then g is said to be of
finite variation on [0, T] if this supremum is finite. For a sequence (Π_n)_{n∈IN}
of partitions of [0, ∞) with lim_{n→∞} |Π_n| = 0, one can also define the p-variation of g on
[0, T] along (Π_n)_{n∈IN} as

lim_{n→∞} V^p_T(g, Π_n),

provided this limit exists. If we define similarly the arc length of g on [0, T] as the
supremum of ∑_{t_i ∈ Π} √( (t_i ∧ T − t_{i−1} ∧ T)² + (g(t_i ∧ T) − g(t_{i−1} ∧ T))² ),
with the supremum again taken over all partitions Π of [0, ∞), then g has finite variation
on [0, T] if and only if it has finite arc length on [0, T]. This can be checked by using the
inequality √(a + b) ≤ √a + √b for a, b ≥ 0.
Any monotonic (increasing or decreasing) function is clearly of finite variation, because
the absolute values above disappear and we get a telescoping sum. Moreover, one can
show that any function of finite variation can be written as the difference of two increasing
functions (and vice versa).
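As a small illustration (the function and partition below are arbitrary choices, not from the text), the quantity V^p_T(g, Π) is straightforward to compute, and for a monotone g the 1-variation indeed telescopes to g(T) − g(0):

```python
def p_variation(g, partition, T, p):
    """V^p_T(g, Pi): sum of |g(t_i ^ T) - g(t_{i-1} ^ T)|^p along the partition."""
    pts = sorted(min(t, T) for t in partition)
    return sum(abs(g(b) - g(a)) ** p for a, b in zip(pts, pts[1:]))

# For an increasing g the absolute values drop and the sum telescopes,
# so the 1-variation equals g(T) - g(0) whatever the partition:
g = lambda t: t ** 2
grid = [k / 10 for k in range(31)]      # partition points of [0, 3]
print(round(p_variation(g, grid, 2.0, 1), 10))  # → 4.0, i.e. g(2) - g(0)
```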
Now let us return to Brownian motion, taking p = 2 and as g one trajectory W.(ω).
Then

Q^Π_T := ∑_{t_i ∈ Π} (W_{t_i ∧ T} − W_{t_{i−1} ∧ T})² = V²_T(W., Π)

is the sum up to time T of the squared increments of BM along Π. With the above formal
intuition "(dW_t)² = dt", we then expect, at least for |Π| very small so that time points
are close together, that (W_{t_i ∧ T} − W_{t_{i−1} ∧ T})² ≈ t_i ∧ T − t_{i−1} ∧ T and hence

Q^Π_T ≈ ∑_{t_i ∈ Π} (t_i ∧ T − t_{i−1} ∧ T) = T   for |Π| small.
Even if the above reasoning is only heuristic, the result surprisingly is correct:

Theorem 1.4. Suppose W = (W_t)_{t≥0} is a BM. For any sequence (Π_n)_{n∈IN} of partitions
of [0, ∞) which is refining (i.e. Π_n ⊆ Π_{n+1} for all n) and satisfies lim_{n→∞} |Π_n| = 0, we
have

P[ lim_{n→∞} Q^{Π_n}_t = t for every t ≥ 0 ] = 1.

We express this by saying that along (Π_n)_{n∈IN}, the Brownian motion W has (with probability
1) quadratic variation t on [0, t] for every t ≥ 0, and we write ⟨W⟩_t = t. (We
sometimes also say, with a certain abuse of terminology, that P-almost all trajectories
W.(ω) : [0, ∞) → IR of BM have quadratic variation t on [0, t], for each t ≥ 0.)
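A quick simulation (a sketch with an arbitrary seed and grid, not from the text) illustrates Theorem 1.4: along a fine partition of [0, 1], the sum of squared increments of a simulated Brownian path is close to T = 1:

```python
import random

rng = random.Random(42)
T, n = 1.0, 100_000
dt = T / n
w = [0.0]
for _ in range(n):                      # Brownian path on a grid of mesh dt
    w.append(w[-1] + rng.gauss(0.0, dt ** 0.5))
qv = sum((b - a) ** 2 for a, b in zip(w, w[1:]))   # sum of squared increments
print(abs(qv - T) < 0.05)               # → True: Q^Pi_T is close to T
```

The fluctuation of Q^Π_T around T has variance 2T|Π| here, so it shrinks with the mesh size.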
Remark 1.5. 1) It is a very nice and useful [→ exercise] in analysis to prove that every
continuous function f which has nonzero quadratic variation along a sequence (Π_n) as
above must have infinite variation, i.e. unbounded oscillations. (This will come up again
later in Section 6.1.) More generally, if lim_{n→∞} V^q_T(f, Π_n) > 0 for some q > 0, then
lim_{n→∞} V^p_T(f, Π_n) = +∞ for any p with 0 < p < q, and if lim_{n→∞} V^p_T(f, Π_n) < ∞ for
some p > 0, then lim_{n→∞} V^q_T(f, Π_n) = 0 for all q > p. We also recall that a classical
result due to Lebesgue says that any function of finite variation is almost everywhere
differentiable. So Proposition 1.3 implies that Brownian trajectories must have infinite
variation, and Theorem 1.4 makes this even quantitative.
2) Caution: The comment in 1) is only true for continuous functions. With RCLL
functions, this breaks down in general.
3) It is important in Theorem 1.4 that the partitions Π_n do not depend on the trajectory
W.(ω), but are fixed a priori. One can show that for P-almost all trajectories W.(ω),
the (true) quadratic variation of W.(ω) is +∞.
4) There is an extension of Theorem 1.4 to general local martingales M instead of
Brownian motion W. But then the limit, called [M]_t, of the sequence (Q^{Π_n}_t(M))_{n∈IN}
is not t, but some (F_t-measurable) random variable, and the convergence holds not P-almost
surely, but only in probability. (Alternatively, one can obtain P-a.s. convergence
along a sequence of partitions, but then this cannot be chosen, but is only shown to exist.)
Moreover, t ↦ [M]_t(ω) is then always increasing (for P-almost all ω), but only continuous
if M itself has continuous trajectories. Finally, as for Brownian motion, the limit does
not depend on the sequence (Π_n)_{n∈IN} of partitions. ⋄
Remark 2.1. Because our filtration satisfies the usual conditions, a general result from
the theory of stochastic processes says that any martingale has a version with nice (RCLL,
i.e. right-continuous with left limits, to be precise) trajectories. We can and do therefore
always assume that our martingales have nice trajectories in that sense, and this is
important for some of the subsequent results. We shall point this out more explicitly when
it is used. ⋄
Again exactly like in discrete time, a stopping time with respect to IF is a mapping
τ : Ω → [0, ∞] such that {τ ≤ t} ∈ F_t for all t ≥ 0. One of the standard examples is the
first time that some adapted right-continuous process X (e.g. Brownian motion W) hits
an open set B (e.g. (a, ∞)), i.e.

τ_B := inf{t ≥ 0 : X_t ∈ B}.

We remark that checking the stopping time property above uses that the filtration is
right-continuous; and we mention that τ_B above is still a stopping time if B is allowed to
be a Borel set, but the proof of this apparently minor extension is surprisingly difficult.
One of the most useful properties of martingales is that the martingale property (2.1)
and its consequences very often extend to the case where the fixed times s ≤ t are replaced
by stopping times σ ≤ τ. "Very often" means under additional conditions, as we shall see
presently. To make sense of (2.1) for σ and τ, we also first need to define, for a stopping
time σ, the σ-field of events observable up to time σ as

F_σ := { A ∈ F : A ∩ {σ ≤ t} ∈ F_t for all t ≥ 0 }.

(One must and can check that F_σ is a σ-field, and that one has F_σ ⊆ F_τ for σ ≤ τ.) We
also need to define M_τ, the value of M at the stopping time τ, by

M_τ(ω) := M_{τ(ω)}(ω).

Note that this implicitly assumes that we have a random variable M_∞, because τ can
take the value +∞. One can then also prove that if τ is a stopping time and M is
an adapted process with RC trajectories, then M_τ is F_τ-measurable (as one intuitively
expects). Finally, we also recall the stopped process M^τ = (M^τ_t)_{t≥0} which is defined by
M^τ_t := M_{t∧τ} for all t ≥ 0. Again, if M is adapted with RC trajectories and τ is a stopping
time, then also M^τ is adapted and has RC trajectories.
After the above preliminaries, we now have

2) If M is an RC martingale and τ is any stopping time, then we always have for any
t ≥ 0 that E[M_{τ∧t}] = E[M_0]. If either τ is bounded or M is uniformly integrable,
then we also obtain E[M_τ] = E[M_0].
For future use, let us also recall the notion of a local martingale null at 0, now in
continuous time. An adapted process X = (X_t)_{t≥0} null at 0 (i.e. with X_0 = 0) is called
a local martingale null at 0 (with respect to P and IF) if there exists a sequence of
stopping times (τ_n)_{n∈IN} increasing to ∞ such that for each n ∈ IN, the stopped process
X^{τ_n} = (X_{t∧τ_n})_{t≥0} is a (P, IF)-martingale. We then call (τ_n)_{n∈IN} a localising sequence. (If
X is defined on [0, T] for some T ∈ (0, ∞), the requirement for a localising sequence is
that (τ_n) increases to T stationarily, i.e. τ_n ↗ T P-a.s. and P[τ_n < T] → 0 as n → ∞.)
The next result presents a number of martingales directly related to Brownian motion.

Proposition 2.3. Suppose W = (W_t)_{t≥0} is a (P, IF)-Brownian motion. Then the following
processes are all (P, IF)-martingales:

1) W itself.

2) W_t² − t, t ≥ 0.

3) e^{αW_t − α²t/2}, t ≥ 0, for any α ∈ IR.
Proof. We do this argument (in part) because it illustrates how to work with the properties
of BM. For each of the above processes, adaptedness is obvious, and integrability is
also clear because each W_t has a normal distribution and hence all exponential moments.
Finally, as W_t − W_s is independent of F_s and ∼ N(0, t − s), we get 1) from

E[W_t − W_s | F_s] = E[W_t − W_s] = 0.

Using this with W_t² − W_s² = (W_t − W_s)² + 2W_s(W_t − W_s) and F_s-measurability of W_s
then gives

E[W_t² − W_s² | F_s] = E[(W_t − W_s)² | F_s] = E[(W_t − W_s)²] = Var[W_t − W_s] = t − s,

hence 2). Finally, setting M_t := e^{αW_t − α²t/2} yields

E[M_t/M_s | F_s] = E[ e^{α(W_t − W_s) − α²(t−s)/2} | F_s ] = e^{−α²(t−s)/2} E[e^{α(W_t − W_s)}] = 1

because E[e^Z] = e^{μ + σ²/2} for Z ∼ N(μ, σ²). So we have 3) as well. q.e.d.
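The constant-expectation property in 3) can also be checked numerically; the following Monte Carlo sketch (seed and parameter values chosen arbitrarily) verifies that E[exp(αW_t − α²t/2)] ≈ 1:

```python
import math, random

rng = random.Random(1)
alpha, t, n = 0.7, 2.0, 200_000
# E[exp(alpha W_t - alpha^2 t / 2)] = 1, since E[e^Z] = exp(mu + sigma^2/2)
# for Z ~ N(mu, sigma^2) and W_t ~ N(0, t):
mean = sum(math.exp(alpha * rng.gauss(0.0, t ** 0.5) - 0.5 * alpha ** 2 * t)
           for _ in range(n)) / n
print(abs(mean - 1.0) < 0.05)           # → True
```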
Example. To illustrate that the conditions in Theorem 2.2 are really needed, consider
a Brownian motion W and the stopping time

τ := inf{t ≥ 0 : W_t = 1}.

Due to the law of the iterated logarithm in part 2) of Proposition 1.2, we have τ < ∞
P-a.s., and because W has continuous trajectories, we get W_τ = 1 P-a.s. For σ = 0, if
(2.2) were valid for W and τ, σ, we should get by taking expectations that

1 = E[W_τ] = E[W_0] = 0,

which is clearly false. So τ cannot be bounded by a constant (in fact, one can even show
that E[τ] = +∞), and W is a martingale, but not uniformly integrable. Finally, we also
see that (2.2) is not true in general (i.e. without assumptions on M and/or τ).
One useful application of the above martingale results is the computation of the
Laplace transforms of certain hitting times. More precisely, let W = (W_t)_{t≥0} be a Brownian
motion and define for a > 0, b > 0 the stopping times

τ_a := inf{t ≥ 0 : W_t = a}   and   σ_{a,b} := inf{t ≥ 0 : W_t = a + bt}.

Note that τ_a < ∞ P-a.s. by the (global) law of the iterated logarithm in part 2) of
Proposition 1.2, whereas σ_{a,b} can be +∞ with positive probability (see below).
Proposition 2.4. Let W be a BM and a > 0, b > 0. Then for any λ > 0, we have

(2.3)   E[e^{−λτ_a}] = e^{−a√(2λ)}

and

(2.4)   E[e^{−λσ_{a,b}}] = E[ e^{−λσ_{a,b}} I_{{σ_{a,b} < ∞}} ] = e^{−a(b + √(b² + 2λ))}.
Proof. We give this argument because it illustrates how to use the preceding martingale
results. First of all, take α > 0 and define M_t := exp(αW_t − α²t/2), t ≥ 0. Then M is
a martingale by part 3) of Proposition 2.3, and hence so is the stopped process M^τ by
(the first comment after) Theorem 2.2, for τ ∈ {τ_a, σ_{a,b}}. This implies (as in the second
comment after Theorem 2.2) that

1 = E[M_0] = E[M_{τ∧t}] = E[ e^{αW_{τ∧t} − α²(τ∧t)/2} ].

Remark. If we let λ ↘ 0 in (2.4), we obtain P[σ_{a,b} < ∞] = e^{−2ab} so that indeed
P[σ_{a,b} = +∞] = 1 − e^{−2ab} > 0. ⋄
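As a numerical sanity check on (2.4) (a sketch; the parameter values are arbitrary), one can evaluate the right-hand side for small λ and watch it approach e^{−2ab}:

```python
import math

def laplace_sigma(a, b, lam):
    """Right-hand side of (2.4): E[exp(-lam * sigma_{a,b})]."""
    return math.exp(-a * (b + math.sqrt(b * b + 2 * lam)))

a, b = 1.0, 0.5
for lam in (1e-2, 1e-4, 1e-6):          # the values increase towards the limit
    print(lam, laplace_sigma(a, b, lam))
print(math.exp(-2 * a * b))             # the limit P[sigma_{a,b} < infinity]
```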
For a general random variable U ≥ 0, the function λ ↦ E[e^{−λU}] for λ > 0 is called the
Laplace transform of U. Its general importance in probability theory is that it uniquely
determines the distribution of U.
if (W_t)_{t≥0} is a Brownian motion. This means that if we restart a BM from level 0 at some
fixed time, it behaves exactly as if it had only just started. Moreover, one can show that
the independence of increments of BM implies that

This is called the Markov property of BM, and it is already very useful in many situations.

Exactly as with martingales, we suspect that it might be interesting and helpful if one
could in (3.3) replace the fixed time T ∈ (0, ∞) by a stopping time τ. Note, however,
that quite apart from the difficulties of writing down an analogue of (3.3) for a random
time τ(ω), it is even not clear whether this should then be true, because after all, τ itself
can explicitly depend on the past behaviour of BM. Nevertheless, it turns out that such
a result is true; one says that BM even has the strong Markov property.
Because a precise analogue of (3.3) for a stopping time becomes a bit technical, we
formulate things a bit differently. If we denote almost as above by IF^W the filtration
generated by W (and made right-continuous, to be accurate), and if τ is a stopping time
with respect to IF^W and such that τ < ∞ P-a.s., then

Of course, this includes (3.1) and (3.2) as special cases, and one can easily believe that it
is even more useful than (3.3). However, the proof is too difficult to be given here.
5 Stochastic integration
From the discrete-time theory developed in Chapters 1–3, we know that the trading gains
or losses from a self-financing strategy φ =̂ (V_0, ϑ) are described by the stochastic integral

G(ϑ) = ϑ • S = ∫ ϑ dS = ∑_j ϑ_j^tr ΔS_j = ∑_j ϑ_j^tr (S_j − S_{j−1}).
like in courses on measure and integration theory. But unfortunately, this works well (i.e.,
for many integrands ϑ) only if the function t ↦ S_t(ω) is of finite variation, and this
would immediately exclude as integrator a process like Brownian motion which does not
have this property. So one must use a different approach, and this will be explained in
this chapter. For an amplification (and proof) of the above point that "naive stochastic
integration is impossible", we refer to Protter [13, Section I.8]; the idea originally goes
back to C. Stricker.
integrals, this is different: choosing the left endpoint t̃_i = t_i leads to the Itô integral, the
right endpoint t̃_i = t_{i+1} yields the forward integral, and the midpoint choice t̃_i = (t_i + t_{i+1})/2
produces the Stratonovich integral. However, for applications in finance, it is clear that
one must choose t̃_i = t_i (and hence the Itô integral) because the strategy must be decided
before the price move. ⋄
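The dependence on the evaluation point can be seen in simulation. The sketch below (arbitrary seed and grid, not from the text) compares left-endpoint (Itô) and midpoint (Stratonovich) Riemann sums for ∫_0^1 W dW on a simulated path; the Itô sum should be near W_1²/2 − 1/2 (a value derived at the end of this chapter from the quadratic variation of W), while the midpoint sum should be near W_1²/2:

```python
import random

rng = random.Random(7)
T, n = 1.0, 50_000
dt = T / (2 * n)
w = [0.0]
for _ in range(2 * n):                  # Brownian path on a grid of 2n steps
    w.append(w[-1] + rng.gauss(0.0, dt ** 0.5))
# partition = every second grid point, so midpoints lie on the grid as well
left = sum(w[2 * i] * (w[2 * i + 2] - w[2 * i]) for i in range(n))     # Ito
mid = sum(w[2 * i + 1] * (w[2 * i + 2] - w[2 * i]) for i in range(n))  # Stratonovich
wT = w[-1]
print(abs(left - (0.5 * wT ** 2 - 0.5 * T)) < 0.05)   # left sum: W_T^2/2 - T/2
print(abs(mid - 0.5 * wT ** 2) < 0.05)                # midpoint sum: W_T^2/2
```

The two limits differ by exactly half the quadratic variation T, which is why the choice of evaluation point matters for integrators of infinite variation.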
null at 0 (as defined before Proposition 4.2.3) and having RCLL (right-continuous with
left limits) trajectories. (The latter property, as pointed out earlier in Remark 4.2.1, is not
a restriction; we can always find an RCLL version of M thanks to the usual conditions
on IF.) Because we want to define stochastic integrals ∫_a^b H dM and these are always
over half-open intervals of the form (a, b] with 0 ≤ a < b ≤ ∞, the value of M at 0 is
irrelevant and it is enough to look at processes H = (H_t) defined for t > 0. This will
simplify some definitions. For any process Y = (Y_t)_{t≥0} with RCLL trajectories, we denote
by ΔY_t := Y_t − Y_{t−} := Y_t − lim_{s↗t} Y_s the jump of Y at time t > 0.
Theorem 1.1. For any local martingale M = (M_t)_{t≥0} null at 0, there exists a unique
adapted increasing RCLL process [M] = ([M]_t)_{t≥0} null at 0 with Δ[M] = (ΔM)² and
having the property that M² − [M] is also a local martingale. This process [M] can be
obtained as the quadratic variation of M in the following sense: There exists a sequence
(Π_n)_{n∈IN} of partitions of [0, ∞) with |Π_n| → 0 as n → ∞ such that

P[ [M]_t(ω) = lim_{n→∞} ∑_{t_i ∈ Π_n} (M_{t_i ∧ t}(ω) − M_{t_{i−1} ∧ t}(ω))² for all t ≥ 0 ] = 1.

Proof. See Protter [13, Section II.6] or Dellacherie/Meyer [5, Theorem VII.42] or
Jacod/Shiryaev [11, Section I.4c].
For two local martingales M , N null at 0, we define the (optional) covariation process
7) The key difference between [M] and ⟨M⟩ is that [M] exists for any local martingale
M null at 0, whereas the existence of ⟨M⟩ requires some extra local integrability of M. ⋄
Definition. We denote by bE the set of all bounded elementary processes of the form

H = ∑_{i=0}^{n−1} h_i I_{(t_i, t_{i+1}]}

with n ∈ IN, times 0 ≤ t_0 < t_1 < · · · < t_n < ∞ and each h_i bounded and F_{t_i}-measurable.
For H ∈ bE and any process X, we define the stochastic integral process

∫_0^t H_s dX_s := (H • X)_t := ∑_{i=0}^{n−1} h_i (X_{t_{i+1} ∧ t} − X_{t_i ∧ t})   for t ≥ 0.

Note that if X is RCLL, then so is ∫H dX = H • X.
If X and H are both IR^d-valued, the integral is still real-valued, and we simply replace
products by scalar products everywhere. But then Lemma 1.3 below looks more
complicated.
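Since the definition is just a finite sum, it can be written out directly; the following sketch (a hypothetical helper, not from the text) computes (H • X)_t for an elementary integrand:

```python
def elementary_integral(h, times, X, t):
    """(H . X)_t for H = sum_i h[i] 1_{(times[i], times[i+1]]}."""
    return sum(hi * (X(min(times[i + 1], t)) - X(min(times[i], t)))
               for i, hi in enumerate(h))

# deterministic sanity check with X(s) = s: the integral is just the
# Lebesgue integral of the step function H = 2 on (0,1] plus 5 on (1,3]:
val = elementary_integral([2.0, 5.0], [0.0, 1.0, 3.0], lambda s: s, 2.0)
print(val)  # → 7.0, i.e. 2*1 + 5*(2 - 1)
```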
Note that the last d[M]-integral can be defined ω by ω via classical measure and
integration theory, because t ↦ [M]_t(ω) is increasing and hence of finite variation. But
of course it is here also just a finite sum, because H has such a simple form.

where we have used twice that h_i is F_{t_i}-measurable and bounded, and in the fourth step
also that M² − [M] is a martingale. Summing up and taking expectations then gives (1.1).
Moreover, it is not very difficult to argue that

Δ( ∫H² d[M] ) = H² Δ[M] = H² (ΔM)² = ( Δ(H • M) )²

Remark. The argument in the proof of Lemma 1.3 actually shows that the process
(H • M)² − ∫H² d[M] is a martingale. [→ Exercise: Prove this in detail.] See also
Remark 1.2. ⋄
Our goal is now to extend the above results from H ∈ bE to a larger class of integrands.
To that end, it is useful to view stochastic processes as random variables on the product
space Ω̄ := Ω × (0, ∞). (Recall that the values at 0 are irrelevant for stochastic integrals.)
We define the predictable σ-field P on Ω̄ as the σ-field generated by all adapted left-continuous
processes, and we call a stochastic process H = (H_t)_{t>0} predictable if it is
P-measurable when viewed as a mapping H : Ω̄ → IR. As a consequence, every H ∈ bE is
then predictable as it is adapted and left-continuous. We also define the (possibly infinite)
measure P_M := P ⊗ [M] on (Ω̄, P) by setting

∫_Ω̄ Y dP_M := E_M[Y] := E[ ∫_0^∞ Y_s(ω) d[M]_s(ω) ]

and we denote by L²(M) the space of (equivalence classes of) predictable processes H with

‖H‖_{L²(M)} := (E_M[H²])^{1/2} = ( E[ ∫_0^∞ H_s² d[M]_s ] )^{1/2} < ∞.
(As usual, taking equivalence classes means that we identify H and H′ if they agree
P_M-a.e. on Ω̄ or, equivalently, if E[ ∫_0^∞ (H_s − H_s′)² d[M]_s ] = 0.)
With the above notations, we can restate the first half of Lemma 1.3 as follows:

The last assertion is true because each H • M remains constant after some t_n given by
H ∈ bE, and because Doob's inequality gives for any martingale N and any t ≥ 0 that

E[ sup_{0≤s≤t} |N_s|² ] ≤ 4 E[N_t²].
Now the martingale convergence theorem implies that each N ∈ M²₀ admits a limit
N_∞ = lim_{t→∞} N_t P-a.s., and we have N_∞ ∈ L² by Fatou's lemma, and the process
(N_t)_{0≤t≤∞} defined up to ∞, i.e. on the closed interval [0, ∞], is still a martingale. Moreover,
Doob's maximal inequality implies that two martingales N and N′ which have the
same final value, i.e. N_∞ = N′_∞ P-a.s., must coincide. Therefore we can identify N ∈ M²₀
with its limit N_∞ ∈ L²(F_∞, P), and so M²₀ becomes a Hilbert space with the norm

‖N‖_{M²₀} := ‖N_∞‖_{L²} = (E[N_∞²])^{1/2}.

In this notation, Lemma 1.3 says that H ↦ H • M maps bE ⊆ L²(M) isometrically into M²₀:

(1.2)   ‖H • M‖_{M²₀} = (E[(H • M)_∞²])^{1/2} = ( E[ ∫_0^∞ H_s² d[M]_s ] )^{1/2} = ‖H‖_{L²(M)}.
By general principles, this mapping can therefore be uniquely extended to the closure
of bE in L²(M); in other words, we can define a stochastic integral process H • M for
every H that can be approximated, with respect to the norm ‖·‖_{L²(M)}, by processes
from bE, and the resulting H • M is again a martingale in M²₀ and still satisfies the
isometry property (1.2).
(The argument behind these general principles is quite standard. If (H^n)_{n∈IN} is a
sequence of predictable processes converging to H with respect to ‖·‖_{L²(M)}, then (H^n)
is also a Cauchy sequence with respect to ‖·‖_{L²(M)}. If all the H^n are in bE, then the
stochastic integral process H^n • M is well defined and in M²₀ for each n by Lemma 1.3.
Moreover, by the isometry property in Lemma 1.3 for integrands in bE, the sequence
(H^n • M)_{n∈IN} is then also a Cauchy sequence in M²₀, and because M²₀ is a Hilbert space,
hence complete, that Cauchy sequence must have a limit which is again in M²₀. This
limit is then defined to be the stochastic integral H • M of H with respect to M. That
the isometry property extends to the limit is also standard.)
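The isometry that drives this extension can be illustrated numerically. The sketch below (arbitrary seed and coefficients, not from the text) checks E[((H • W)_2)²] = ∫_0^2 H_s² ds for a deterministic elementary integrand and a Brownian integrator, for which [W]_t = t:

```python
import random

rng = random.Random(3)
n = 100_000
h0, h1 = 2.0, -1.0                      # deterministic elementary integrand:
acc = 0.0                               # H = h0 on (0,1] and h1 on (1,2]
for _ in range(n):
    dw1 = rng.gauss(0.0, 1.0)           # W_1 - W_0
    dw2 = rng.gauss(0.0, 1.0)           # W_2 - W_1
    acc += (h0 * dw1 + h1 * dw2) ** 2   # ((H . W)_2)^2 for this path
print(abs(acc / n - (h0 ** 2 + h1 ** 2)) < 0.2)   # → True: isometry holds
```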
The crucial question now is of course how we can describe the closure of bE and
especially how big it is — the bigger the better, because we then have many integrands.
give here. However, we point out that the assumption M ∈ M²₀ is used to ensure that
P_M is a finite measure. Assertion 2) is then clear from the discussion above. q.e.d.

For M ∈ M²₀,loc and H ∈ L²_loc(M), defining the stochastic integral is straightforward;
we simply set

H • M := (H I_{]]0,τ_n]]}) • M^{τ_n}   on ]]0, τ_n]].
Remarks. 1) A closer look at the developments so far shows that the definitions (but
not the preceding results and arguments) for P_M and L²(M) only need [M]; hence one
can introduce and use them for any local martingale M, due to Theorem 1.1.
2) One can also define a stochastic integral process H • M for H ∈ L²_loc(M) when M is
a general local martingale, but this requires substantially more theory. For more details,
see Dellacherie/Meyer [5, Theorem VIII.37].
3) If M is IR^d-valued with components M^i that all are local martingales null at 0, one
can also define the so-called vector stochastic integral H • M for IR^d-valued predictable
processes in a suitable space L²_loc(M); the result is then a real-valued process. Details
can be found in Jacod/Shiryaev [11, Sections III.4a and III.6a]. However, one warning is
indicated: L²_loc(M) is not obtained by just asking that each component H^i should be in
L²_loc(M^i) and then setting H • M = ∑_i H^i • M^i. In fact, it can happen that H • M is well
defined whereas the individual H^i • M^i are not. So the intuition for the multidimensional
case is that

"∫H dM = ∫ ∑_i H^i dM^i ≠ ∑_i ∫H^i dM^i".
To end this section on a positive note, let us consider the case where M is a continuous
local martingale null at 0, briefly written as M ∈ M^c_{0,loc}. This includes in particular the
case of a Brownian motion W. Then M is in M²₀,loc because it is even locally bounded:
For the stopping times τ_n := inf{t ≥ 0 : |M_t| ≥ n}, each stopped process M^{τ_n} is bounded
in absolute value by n, because continuity of M gives |M_{τ_n}| = n on {τ_n < ∞}.
(Note that continuity of M is only used to obtain the equality |M_{τ_n}| = n; everything else
works just as well if M is only assumed to be adapted and RCLL.) The set L²_loc(M) of
nice integrands for M can here be explicitly described as

L²_loc(M) = { all predictable processes H = (H_t)_{t>0} such that
∫_0^t H_s² d[M]_s = ∫_0^t H_s² d⟨M⟩_s < ∞ P-a.s. for each t ≥ 0 }.

Finally, the resulting stochastic integral H • M = ∫H dM is then (as we shall see from
the properties in Section 5.2 below) also a continuous local martingale, and of course null
at 0.
5 STOCHASTIC INTEGRATION 97
5.2 Properties
As with usual integrals, one very rarely computes a stochastic integral by passing to the
limit from some approximation. One works with stochastic integrals by using a set of
rules and properties. These are listed in this section, without proofs.
• Linearity:
– If M is a local martingale and H, H′ are in L^2_loc(M) and a, b ∈ IR, then aH + bH′
is also in L^2_loc(M) and

    (aH + bH′) • M = a (H • M) + b (H′ • M).
• Associativity: For suitable integrands K and H,

    K • (H • M) = (KH) • M,

i.e.

    ∫ K d( ∫ H dM ) = ∫ KH dM.

• Stopping: For any stopping time τ,

    (H • M)^τ = H • (M^τ) = (H I_{]]0,τ]]}) • M = (H I_{]]0,τ]]}) • (M^τ).

• Covariation:

    [ ∫ H dM, ∫ K dN ] = ∫ HK d[M, N].

• Jumps:

    Δ(H • M) = H ΔM.
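These rules are easy to sanity-check in discrete time, where the stochastic integral is just a cumulative sum of predictable holdings times increments. A minimal sketch (the processes below are illustrative, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
dM = rng.standard_normal(n)          # martingale increments (random walk)
M = np.concatenate([[0.0], np.cumsum(dM)])

def sint(H, X):
    """Discrete stochastic integral (H . X)_k = sum_{j<=k} H_j (X_j - X_{j-1})."""
    return np.concatenate([[0.0], np.cumsum(H * np.diff(X))])

H = np.sin(M[:-1])                   # predictable: uses values up to time k-1
K = np.cos(M[:-1])

lhs = sint(K, sint(H, M))            # K . (H . M)
rhs = sint(K * H, M)                 # (KH) . M
assert np.allclose(lhs, rhs)         # associativity

# jump rule: increments of H . M are H times the increments of M
assert np.allclose(np.diff(sint(H, M)), H * np.diff(M))
```

Linearity holds in the same way, since cumulative sums are linear in the integrand.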
Example. To illustrate why the direct use of the definitions is complicated, let us
compute the stochastic integral ∫ W dW for a Brownian motion W. This is well defined
because M := W is in M^2_{0,loc} (it is even continuous) and H := W is predictable and
locally bounded, because it is adapted and continuous.
Because

    2 W_{t_i} (W_{t_{i+1}} − W_{t_i}) = W_{t_{i+1}}² − W_{t_i}² − (W_{t_{i+1}} − W_{t_i})²,

summing over the points of a partition Π_n of [0, t] gives a telescoping sum on the left,

    2 Σ_i W_{t_i} (W_{t_{i+1}} − W_{t_i}) = W_t² − Σ_i (W_{t_{i+1}} − W_{t_i})².

If the mesh size |Π_n| of the partition sequence (Π_n) goes to 0, then the sum on the right-
hand side converges P-a.s. to t by Theorem 4.1.4, if the partitions are also refining. We
therefore expect to obtain

    ∫_0^t W_s dW_s = ½ W_t² − ½ t,

and we shall see later from Itô's formula that this is indeed correct. Note that we should
expect the first term ½ W_t² from classical calculus (where we have ∫_0^x y dy = ½ x²); the
second-order correction term −½ t appears due to the quadratic variation of Brownian tra-
jectories.
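The heuristic above can be checked by simulation: left-endpoint Riemann sums of W against its own increments approach ½ W_t² − ½ t as the mesh shrinks. A small sketch with an illustrative step size:

```python
import numpy as np

# Left-endpoint Riemann sums for the Ito integral of W dW on [0, t],
# compared with (W_t^2 - t)/2. Step count and horizon are illustrative.
rng = np.random.default_rng(1)
t, n = 1.0, 200_000
dt = t / n
dW = rng.standard_normal(n) * np.sqrt(dt)
W = np.concatenate([[0.0], np.cumsum(dW)])

riemann = np.sum(W[:-1] * dW)        # left endpoints -> Ito integral
ito = 0.5 * W[-1] ** 2 - 0.5 * t

print(riemann, ito)                   # close for a fine mesh
assert abs(riemann - ito) < 0.05
```

Evaluating the integrand at right endpoints instead would add Σ (ΔW)² ≈ t and give the Stratonovich-type answer ½ W_t² + ½ t instead.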
Exercise: Prove directly (without using the above result) that the stochastic integral
process ∫ W dW is a martingale, but not in M^2_0.
Exercise: Compute the Stratonovich integral and the backward integral for ∫ W dW, and
analyse their properties.
Exercise: Prove that if H is predictable and bounded, then ∫ H dW is a square-integrable
martingale.
Exercise: For any local martingale M null at 0 and any stopping time τ, prove that we
have [M]^τ = [M^τ].
a) In Section 5.1, we have taken for X = M a local martingale null at 0 and for H
a process in L^2_loc(M); this means that H must be predictable and possess some
integrability.

b) For X = A an adapted process null at 0 with RCLL trajectories of finite variation,
the integral H • A can be defined pathwise as a Lebesgue–Stieltjes integral, e.g. for
H predictable and locally bounded.

Because integration is a linear operation, the obvious and easy idea for an extension is
therefore to look at processes that are sums of the above two types, because we can then
define an integral with respect to the sum as the sum of the two integrals.
One can show that this is well defined and does not depend on the chosen decomposition
of X. Moreover, [X] can also be obtained as a quadratic variation similarly as in Theo-
rem 1.1; see Section 6.1 below for more details. However, X² − [X] is no longer a local
martingale, but only a semimartingale in general. ⋄
If X is a semimartingale, we can define a stochastic integral H • X = ∫ H dX at least
for any process H which is predictable and locally bounded. We simply set

    H • X := H • M + H • A,

where X = X_0 + M + A is a decomposition of X into a local martingale M and a finite
variation process A as above.
The resulting stochastic integral then has all the properties from Section 5.2 except
those that rest in an essential way on the (local) martingale property; so the isometry
property, for example, is of course lost. But we still have, for H predictable and locally
bounded:

– H • X is a semimartingale.

This can also be viewed as a continuity property of the stochastic integral operator
H ↦ H • X, because (pointwise and locally bounded) convergence of (H^n) implies con-
vergence of (H^n • X), in the ucp sense of (3.1).
From the whole approach above, the definition of a semimartingale looks completely
ad hoc and rather artificial. But it turns out that this concept is in fact very natural and
has a number of very good properties:

3) If X is any adapted process with RCLL trajectories, we can always define the (ele-
mentary) stochastic integral H • X for processes H in bE. If X is such that this
mapping on bE also has the continuity property (3.1) for any sequence (H^n)_{n∈IN} in
bE converging pointwise to 0 and with |H^n| ≤ 1 for all n, then X must in fact be
a semimartingale. This deep result is due to Bichteler and Dellacherie and shows
that semimartingales are a natural class of integrators.
One direct consequence of 2) for finance is that semimartingales are the natural pro-
cesses to model discounted asset prices in financial markets. In fact, the fundamental
theorem of asset pricing (in a suitably general version for continuous-time models) essen-
tially says that a suitably arbitrage-free model should be such that S is a local martingale
(or more generally a σ-martingale) under some Q ≈ P. But then S is a Q-semimartingale
and thus by 2) also a P-semimartingale.
Put differently, the above result implies that if we start with any model where S is
not a semimartingale, there will be arbitrage of some kind. Things become different if
one includes transaction costs; but in frictionless markets, one must be careful about this
issue.
Remark. We have explained so far how to obtain a stochastic integral H • X for semi-
martingales X and locally bounded predictable H. The Bichteler–Dellacherie result shows
that one cannot go beyond semimartingales without a serious loss; but because not every
predictable process is locally bounded, one can ask if, for a given semimartingale X, there
are more possible integrands H for X. This leads to the notion and definition of the class
L(X) of X-integrable processes; but the development of this requires rather advanced
results and techniques from stochastic calculus, and so we cannot go into details here. See
Dellacherie/Meyer [5, Section VIII.3] or Jacod/Shiryaev [11, Section III.6]. Alternatively,
this is usually presented in the course “Mathematical Finance”. ⋄
6 Stochastic calculus
Our goal in this chapter is to provide the basic tools, results and techniques for working
with stochastic processes and especially stochastic integrals in continuous time. This
will be used in the next chapter when we discuss continuous-time option pricing and in
particular the famous Black–Scholes formula.
Throughout this chapter, we work on a probability space (Ω, F, P) with a filtration
IF = (F_t) satisfying the usual conditions of right-continuity and P-completeness. For all
local martingales, we then can and tacitly do choose a version with RCLL trajectories.
For the time parameter t, we have either t ∈ [0, T] with a fixed time horizon T ∈ (0, ∞)
or t ≥ 0. In the latter case, we set

    F_∞ := ⋁_{t≥0} F_t := σ( ⋃_{t≥0} F_t ).
Recall the classical chain rule: for differentiable f and x,

    d/dt (f ∘ x)(t) = df/dx (x(t)) · dx/dt (t),

or more compactly

    (f ∘ x)˙(t) = f′(x(t)) ẋ(t),

where the dot ˙ denotes the derivative with respect to t and the prime ′ the derivative
with respect to x. In formal differential notation, we can rewrite this as

    d(f ∘ x)(t) = f′(x(t)) dx(t),

or in integral form

(1.2)    f(x(t)) − f(x(0)) = ∫_0^t f′(x(s)) dx(s).

In this last form, the chain rule can be extended to the case where f is in C¹ and x is
continuous and of finite variation.
Unfortunately, this classical result does not help us a lot. For one thing, X might have
only RCLL instead of continuous trajectories. This is still manageable if X has trajectories
of finite variation. But even if X is continuous, we cannot hope that its trajectories are of
finite variation, as the example of X being a Brownian motion clearly demonstrates. So
we need a different result, namely a chain rule for functions having a nonzero quadratic
variation.
Let us now connect the above idea to semimartingales. Recall that a semimartingale
is a stochastic process of the form X = X_0 + M + A, where M is a local martingale null
at 0 and A is an adapted process null at 0 with RCLL trajectories of finite variation. For
any such A and any fixed, i.e. nonrandom, sequence (Π_n)_{n∈IN} of partitions of [0, ∞) with
lim_{n→∞} |Π_n| = 0, the quadratic variation of A along (Π_n)_{n∈IN} is given by the sum of the
squared jumps of A, i.e.

    [A]_t = lim_{n→∞} Σ_{t_i ∈ Π_n} (A_{t_{i+1}∧t} − A_{t_i∧t})² = Σ_{0<s≤t} (ΔA_s)² = Σ_{0<s≤t} (A_s − A_{s−})²   for t ≥ 0.

This partly repeats Remark 5.3.1. If A is continuous, we obtain that [X] = [M], even if
X (hence M) is only RCLL.
(Note that this is a variant of the result already mentioned in Remark 4.1.5 in Chapter 4.)
So if the semimartingale X is continuous, then its (unique) finite variation part A has
zero quadratic variation, and its (unique) local martingale part M has quadratic variation
[M] = ⟨M⟩; see Remark 5.1.2 in Chapter 5. The covariation of M and A is then also
zero by Cauchy–Schwarz. A continuous semimartingale X with canonical decomposition
X = X_0 + M + A therefore has the quadratic variation [X] = ⟨X⟩ = [M] = ⟨M⟩, which is
again continuous.
Theorem 1.1 (Itô's formula I). Suppose X = (X_t)_{t≥0} is a continuous semimartingale
and f : IR → IR is in C². Then f(X) is again a continuous semimartingale, and we have
P-a.s.

(1.4)    f(X_t) = f(X_0) + ∫_0^t f′(X_s) dX_s + ½ ∫_0^t f″(X_s) d⟨X⟩_s

for all t ≥ 0.
Remarks. 1) Not only the result is important, but also the basic idea for its proof.

2) The dX-integral in (1.4) is a stochastic integral; it is well defined because X is
a semimartingale and f′(X) is adapted and continuous, hence predictable and locally
bounded. The d⟨X⟩-integral is a classical Lebesgue–Stieltjes integral because ⟨X⟩ has
increasing trajectories; it is also well defined because f″(X) is also predictable and locally
bounded.

3) In purely formal differential notation, (1.4) is usually written more compactly as

(1.5)    df(X_t) = f′(X_t) dX_t + ½ f″(X_t) d⟨X⟩_t = f′(X_t) dX_t + ½ f″(X_t) d⟨M⟩_t,

where M is the local martingale part of X.
Proof of Theorem 1.1. The easiest way to remember both the result and its proof for
the case where X is continuous is via the following quick and dirty argument: “A Taylor
expansion at the infinitesimal level gives

    df(X_t) = f(X_t) − f(X_{t−dt}) = f′(X_t) dX_t + ½ f″(X_t) (dX_t)²,

and (dX_t)² = d⟨X⟩_t.” To make this precise, one writes the Taylor expansion

    f(X_{t_{i+1}∧t}) − f(X_{t_i}) = f′(X_{t_i}) (X_{t_{i+1}∧t} − X_{t_i}) + ½ f″(X_{t_i}) (X_{t_{i+1}∧t} − X_{t_i})² + R_i,

where R_i stands for the error term in the Taylor expansion and the t_i come from a partition
Π_n of [0, ∞). Now we sum over the t_i ≤ t and obtain on the left-hand side a telescoping
sum which equals f(X_t) − f(X_0). When we study the terms on the right-hand side, we
first recall the convergence
    Q_t^{Π_n} := Σ_{t_i ∈ Π_n, t_i ≤ t} (X_{t_{i+1}∧t} − X_{t_i})² → ⟨X⟩_t   as |Π_n| → 0
from Theorem 5.1.1; see also Remark 5.3.1. This implies firstly by a weak convergence
argument that

    ½ Σ_{t_i ∈ Π_n, t_i ≤ t} f″(X_{t_i}) (X_{t_{i+1}∧t} − X_{t_i})² → ½ ∫_0^t f″(X_s) d⟨X⟩_s,
(This is exactly the point where the mathematical analysis shows why the second order
is the correct order of expansion.) As a consequence, the sums

    Σ_{t_i ∈ Π_n, t_i ≤ t} f′(X_{t_i}) (X_{t_{i+1}∧t} − X_{t_i})

must also converge, and the dominated convergence theorem for stochastic integrals then
implies that the limit is ∫_0^t f′(X_s) dX_s. q.e.d.
Using W_0 = 0 and the fact that BM has quadratic variation ⟨W⟩_t = t, hence d⟨W⟩_s = ds,
gives

    W_t² = 2 ∫_0^t W_s dW_s + ∫_0^t ds = 2 ∫_0^t W_s dW_s + t,

or rewritten

    ∫_0^t W_s dW_s = ½ W_t² − ½ t.
Theorem 1.2 (Itô's formula II). Suppose X = (X_t)_{t≥0} is a general IR^d-valued semi-
martingale and f : IR^d → IR is in C². Then f(X) = (f(X_t))_{t≥0} is again a (real-valued)
semimartingale, and we explicitly have P-a.s. for all t ≥ 0:

1) if X has continuous trajectories:

(1.6)    f(X_t) = f(X_0) + Σ_{i=1}^d ∫_0^t ∂f/∂x_i (X_s) dX^i_s + ½ Σ_{i,j=1}^d ∫_0^t ∂²f/(∂x_i ∂x_j) (X_s) d⟨X^i, X^j⟩_s,

or in differential notation

    df(X_t) = Σ_{i=1}^d f_{x_i}(X_t) dX^i_t + ½ Σ_{i,j=1}^d f_{x_i x_j}(X_t) d⟨X^i, X^j⟩_t;

2) if d = 1, for a general (RCLL) semimartingale X:

(1.7)    f(X_t) = f(X_0) + ∫_0^t f′(X_{s−}) dX_s + ½ ∫_0^t f″(X_{s−}) d[X]_s
                + Σ_{0<s≤t} ( f(X_s) − f(X_{s−}) − f′(X_{s−}) ΔX_s − ½ f″(X_{s−}) (ΔX_s)² ).
Remark. There is of course also a version of Itô's formula for general IR^d-valued semi-
martingales (which contains both 1) and 2) as special cases). It looks similar to part 2)
of Theorem 1.2, but has in addition sums like in part 1), with ⟨·, ·⟩ replaced by [·, ·].
And of course one could also write (1.7) in differential form. ⋄
    (S̃^0_k − S̃^0_{k−1}) / S̃^0_{k−1} = r,

    (S̃^1_k − S̃^1_{k−1}) / S̃^1_{k−1} = Y_k − 1 =: R_k = E[R_k] + (R_k − E[R_k]).

Note that the term in brackets above has expectation 0 and a variance which depends
on the distribution of the R_k. Passing from time steps of size 1 to dt and noting that
Brownian increments have expectation 0 like the term R_k − E[R_k], a continuous-time
analogue would be of the form

(1.8)    dS̃^0_t / S̃^0_t = r dt,

(1.9)    dS̃^1_t / S̃^1_t = μ dt + σ dW_t.

(More accurately, we should write dS̃^0_t / S̃^0_{t−} and dS̃^1_t / S̃^1_{t−}. But as both S̃^0 and S̃^1 turn out
to be continuous, the difference does not matter.)
Of course, the equation (1.8) for S̃^0 is just a very simple ordinary differential equation
(ODE), whose solution for the starting value S̃^0_0 = 1 is S̃^0_t = e^{rt}. The equation (1.9) for
S̃^1 is a stochastic differential equation (SDE), and its solution is given by the geometric
Brownian motion (GBM)

(1.10)    S̃^1_t = S̃^1_0 exp( σ W_t + (μ − ½σ²) t )   for t ≥ 0.

Note the possibly surprising term −½σ². To see that this is indeed a solution, we write
S̃^1_t = f(W_t, t) with f(x, t) := S̃^1_0 exp( σx + (μ − ½σ²) t ).
We now apply Itô's formula (1.6) for d = 2 to X_t = (W_t, t). As the second component
X^{(2)}_t = t is continuous and increasing, it has finite variation; so (1.6) simplifies and we
only need the derivatives

    f_x = ∂f/∂x = σ f,

    f_t = ∂f/∂t = (μ − ½σ²) f,

    f_xx = ∂²f/∂x² = σ² f.

This yields

    dS̃^1_t = df(W_t, t) = f_t dt + f_x dW_t + ½ f_xx d⟨W⟩_t = S̃^1_t ( (μ − ½σ²) dt + σ dW_t + ½σ² dt ) = S̃^1_t (μ dt + σ dW_t),

exactly as claimed. Note that we did not argue (as one should and can) that the above
explicit process in (1.10) is the only solution of (1.9).
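One can at least check the claim numerically: an Euler–Maruyama discretisation of (1.9) stays close to the closed-form GBM (1.10) along the same simulated Brownian path. A sketch with illustrative parameter values:

```python
import numpy as np

# Euler-Maruyama scheme for dS = S (mu dt + sigma dW) against the closed-form
# solution S_0 exp(sigma W_t + (mu - sigma^2/2) t) on one Brownian path.
# All parameter values are illustrative.
rng = np.random.default_rng(2)
mu, sigma, S0, T, n = 0.05, 0.2, 1.0, 1.0, 100_000
dt = T / n
dW = rng.standard_normal(n) * np.sqrt(dt)
W = np.cumsum(dW)

S = S0
for dw in dW:                          # S_{k+1} = S_k (1 + mu dt + sigma dW_k)
    S *= 1.0 + mu * dt + sigma * dw

closed_form = S0 * np.exp(sigma * W[-1] + (mu - 0.5 * sigma**2) * T)
print(S, closed_form)                  # agree up to discretisation error
assert abs(S - closed_form) < 1e-2
```

The gap between the two shrinks as the step size decreases, which is exactly the −½σ² correction at work: the product of the factors (1 + μ dt + σ ΔW) behaves like exp(μ dt + σ ΔW − ½σ² dt) per step.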
    dZ_t = Z_{t−} dX_t,   Z_0 = 1.

Checking that the above Z does satisfy the above SDE, as well as proving uniqueness of
the solution, is a good exercise in the use of Itô's formula.
    dZ_t = Z_{t−} dX_t,   Z_0 = 1,

i.e.

    Z_t = 1 + ∫_0^t Z_{s−} dX_s   for all t ≥ 0, P-a.s.

From the preceding example, we have the explicit formula E(X) = exp(X − ½⟨X⟩)
when X is continuous and null at 0. For general X, an explicit formula is given in
Protter [13, Theorem II.37]. Note that Z = E(X) can become 0 or negative when X has
jumps; in fact, the properties of jumps of stochastic integrals yield

    ΔZ_t = Z_t − Z_{t−} = Δ( 1 + ∫ Z_{s−} dX_s )_t = Z_{t−} ΔX_t,

and this shows that Z_t = Z_{t−}(1 + ΔX_t), so that Z = E(X) changes sign between t−
and t whenever 1 + ΔX_t < 0, i.e. when X has a jump ΔX_t < −1.
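The jump behaviour is transparent in discrete time, where the stochastic exponential is the running product Z_k = Z_{k−1}(1 + ΔX_k). A small sketch with illustrative increments, including one jump below −1:

```python
import numpy as np

# Discrete stochastic exponential: Z_k = Z_{k-1} (1 + dX_k), i.e. the solution
# of dZ = Z_- dX. A jump dX_k < -1 makes Z change sign. Increments are
# illustrative, not from the notes.
dX = np.array([0.1, -0.2, 0.3, -1.5, 0.2])    # fourth increment jumps below -1
Z = np.concatenate([[1.0], np.cumprod(1.0 + dX)])

print(Z)
assert Z[3] > 0 and Z[4] < 0                   # sign flip exactly at the big jump

# For continuous X (here a scaled Brownian path), E(X) stays positive:
rng = np.random.default_rng(3)
sigma, n, dt = 0.5, 10_000, 1e-4
dXc = sigma * rng.standard_normal(n) * np.sqrt(dt)
Zc = np.cumprod(1.0 + dXc)
assert (Zc > 0).all()
```

In the continuous case the increments 1 + ΔX stay close to 1, so no sign change can ever occur, matching the formula E(X) = exp(X − ½⟨X⟩) > 0.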
    M_t := E[h(W_T) | F_t]   for 0 ≤ t ≤ T.

Because W has independent increments, we can write

    M_t = E[h(W_t + W_T − W_t) | F_t] = E[h(x + W_T − W_t)] |_{x=W_t} = f(W_t, t)

with

    f(x, t) = E[h(x + W_T − W_t)] = ∫_{−∞}^{∞} h(x + y) (1/√(2π(T − t))) e^{−y²/(2(T−t))} dy.
Now one can check by laborious analysis that the function f(x, t) satisfies the partial
differential equation (PDE) f_t + ½ f_xx = 0; or one can use the fact that the canonical
decomposition of a special semimartingale (like the martingale M) is unique. (Alterna-
tively, one can use that any continuous local martingale of finite variation is constant.)
Any of these leads to the conclusion that the ds-integral in (1.12) must vanish identically,
because it is continuous and adapted, hence predictable, and of finite variation like any
ds-integral. By letting t ↗ T in (1.12), we therefore obtain the representation

    h(W_T) = M_T = M_0 + ∫_0^T f_x(W_s, s) dW_s

of the random variable h(W_T) as an initial value M_0 plus a stochastic integral with respect
to the Brownian motion W. A more general result in that direction is given in Section 6.3.
for some Brownian motion W, where μ and σ are predictable processes satisfying ap-
propriate integrability conditions (e.g. ∫_0^T (|μ_s| + |σ_s|²) ds < ∞ P-a.s. for every T < ∞).
Example. For any two real-valued (RCLL) semimartingales X and Y, the product rule
is obtained by applying Itô's formula with the function f(x, y) = xy. The result says that

    X_t Y_t = X_0 Y_0 + ∫_0^t Y_{s−} dX_s + ∫_0^t X_{s−} dY_s + [X, Y]_t,

or in differential notation

    d(XY) = Y_− dX + X_− dY + d[X, Y].

If X and Y are continuous, this simplifies to

    d(XY) = Y dX + X dY + d⟨X, Y⟩.
the first time that BM leaves the interval [a, b] around 0 (with a < 0 < b). Then classical
results about the ruin problem for Brownian motion say that

    E[τ_{a,b}] = |a| b

and

(1.13)    P[W_{τ_{a,b}} = b] = |a| / (b − a) = 1 − P[W_{τ_{a,b}} = a].
In order to compute the covariance of τ_{a,b} and W_{τ_{a,b}}, we start with the function
f(x, t) = −⅓ x³ + tx. Then clearly f_t + ½ f_xx ≡ 0, so that Itô's formula shows that

    M_t := f(W_t, t) = 0 + ∫_0^t f_x(W_s, s) dW_s

is like W a continuous local martingale, and so is then the stopped process M^{τ_{a,b}}. But

    M_t^{τ_{a,b}} = M_{t∧τ_{a,b}} = −⅓ (W_t^{τ_{a,b}})³ + (t ∧ τ_{a,b}) W_t^{τ_{a,b}},

and one can check that this stopped process is even a true martingale. Therefore

    0 = E[M_0^{τ_{a,b}}] = E[M_T^{τ_{a,b}}] = −⅓ E[ (W_{τ_{a,b}∧T})³ ] + E[ (τ_{a,b} ∧ T) W_{τ_{a,b}∧T} ],

and letting T → ∞ yields

    0 = −⅓ E[ (W_{τ_{a,b}})³ ] + E[ τ_{a,b} W_{τ_{a,b}} ].

Hence we find

    Cov[τ_{a,b}, W_{τ_{a,b}}] = E[ τ_{a,b} W_{τ_{a,b}} ] = ⅓ E[ (W_{τ_{a,b}})³ ] = ⅓ |a| b (b − |a|),

where the last equality is obtained by computing with the known (two-point) distribution
of W_{τ_{a,b}} given in (1.13).
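Both (1.13) and the covariance formula are easy to test by Monte Carlo on a fine time grid (grid monitoring introduces a small boundary bias; all parameter values below are illustrative):

```python
import numpy as np

# Monte Carlo check of the ruin probability (1.13) and of
# Cov[tau, W_tau] = |a| b (b - |a|)/3, for a = -1, b = 2, on a Euler grid.
rng = np.random.default_rng(4)
a, b = -1.0, 2.0
dt, n_paths, n_steps = 1e-3, 20_000, 20_000

W = np.zeros(n_paths)
tau = np.full(n_paths, np.nan)
W_tau = np.full(n_paths, np.nan)
alive = np.ones(n_paths, dtype=bool)

for k in range(1, n_steps + 1):
    W[alive] += np.sqrt(dt) * rng.standard_normal(alive.sum())
    exited = alive & ((W <= a) | (W >= b))
    tau[exited] = k * dt
    W_tau[exited] = np.where(W[exited] >= b, b, a)   # record the exit side
    alive &= ~exited
    if not alive.any():
        break

ok = ~np.isnan(tau)                                   # paths that exited
p_b = np.mean(W_tau[ok] == b)
cov = np.mean(tau[ok] * W_tau[ok]) - tau[ok].mean() * W_tau[ok].mean()

print(p_b, abs(a) / (b - a))                          # both near 1/3
print(cov, abs(a) * b * (b - abs(a)) / 3)             # both near 2/3
assert abs(p_b - abs(a) / (b - a)) < 0.02
assert abs(cov - abs(a) * b * (b - abs(a)) / 3) < 0.2
```

The positive covariance matches intuition: reaching the farther boundary b = 2 tends to take longer than hitting a = −1.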
the density process of Q with respect to P on [0, T], choosing an RCLL version of this
P-martingale on [0, T]. Because Q ≈ P on F_T, we have Z > 0 on [0, T], meaning
that P[Z_t > 0 for all t ∈ [0, T]] = 1, and because Z is a P-(super)martingale, we even have
inf_{0≤t≤T} Z_t > 0 P-a.s. by the so-called minimum principle for supermartingales; see
Dellacherie/Meyer [5, Theorem VI.17]. This implies that also Z_− > 0 on [0, T], so that
1/Z_− is well defined and adapted and left-continuous, hence also predictable and locally
bounded.

In perfect analogy to Lemma 2.3.1, we now have the Bayes rule

    E_Q[U_t | F_s] = (1/Z_s) E_P[Z_t U_t | F_s]   Q-a.s.
Of course, if Q ≈_loc P (i.e. Q ≈ P on F_T for every T < ∞), we can use Lemma 2.1 for
any T < ∞ and hence obtain a statement for processes Y = (Y_t)_{t≥0} on [0, ∞). One
consequence of part 2) of Lemma 2.1 (with Y := 1/Z) is also that 1/Z is a Q-martingale,
more precisely on [0, T] if Q ≈ P on F_T, or even on [0, ∞) if Q ≈_loc P. Furthermore, it is
easy to check that 1/Z is the density process of P with respect to Q (again on [0, T] or
on [0, ∞), respectively).
Theorem 2.2 (Girsanov). Suppose that Q ≈_loc P with density process Z. If M is a local
P-martingale null at 0, then

    M̃ := M − ∫ (1/Z) d[Z, M]

is a local Q-martingale null at 0.
Proof. The second assertion is very easy to prove from the first; we simply write

    X = X_0 + M + A = X_0 + M̃ + ( A + ∫ (1/Z) d[Z, M] ) = X_0 + M̃ + Ã

and observe that Ã := A + ∫ (1/Z) d[Z, M] is of finite variation. Note that ∫ (1/Z) d[Z, M]
is well defined pathwise as a Lebesgue–Stieltjes integral because [Z, M] has RCLL
trajectories of finite variation.
It is also clear that L is continuous if and only if Y is continuous, and that L is a local
P-martingale if and only if Y is a local P-martingale. This L is called the stochastic
logarithm of Y. Note that because of the quadratic variation, we do not have L = log Y,
not even if Y is continuous; see the explicit formula (1.11) in Section 6.1.

In the situation here, Z is a P-martingale > 0, hence has Z_− > 0 as discussed above,
and so applying the above with Y := Z yields Z = Z_0 E(L), where L is like Z a local
P-martingale.
Theorem 2.3 (Girsanov, continuous version). Suppose that Q ≈_loc P with a density
process Z which is continuous. Write Z = Z_0 E(L). If M is a local P-martingale null at
0, then

    M̃ := M − [L, M] = M − ⟨L, M⟩

is a local Q-martingale null at 0. If for example W is a P-Brownian motion and L = ∫ ν dW,
then ⟨L, W⟩ = ∫ ν_s ds, so that the P-Brownian motion W = W̃ + ∫ ν_s ds becomes under Q
a Brownian motion with (instantaneous) drift ν.
Proof. Because Z = Z_0 E(L) satisfies dZ = Z dL, we have [Z, M] = ∫ Z d[L, M]
and hence ∫ (1/Z) d[Z, M] = ∫ (Z/Z) d[L, M] = [L, M], by continuity of Z. So the first assertion
follows directly from Theorem 2.2, and [L, M] = ⟨L, M⟩ because L is continuous like Z.
The assertion for W̃ needs some extra work as it relies on the so-called Lévy charac-
terisation of Brownian motion that we have not discussed here. q.e.d.
In all the above discussions, we have assumed that Q is already given and have then
studied its effect on given processes. But in mathematical finance, we often want to
proceed the other way round: We start with a process S = (S_t)_{0≤t≤T} of discounted asset
prices and want to find or construct some Q ≈ P on F_T such that S becomes a local
Q-martingale. Let us now see how we can tackle this problem by reverse-engineering
the preceding theory. We begin very generally and successively become more specific.
Moreover, the goal here is not to remember a specific result, but rather to understand
how to approach the problem in a systematic way.
We start with a local P-martingale L null at 0 and define Z := E(L) so that Z is
like L a local P-martingale, with Z_0 = 1. If we also have ΔL > −1 (and this holds of
course in particular if L is continuous), then we have in addition Z > 0. This uses that
ΔZ = Z_− ΔL so that Z = Z_−(1 + ΔL), which implies that Z never changes sign as long
as ΔL > −1.
Suppose now that Z is a true P-martingale on [0, T]; this amounts to imposing suitable
extra conditions on L. Then we can define a probability measure Q ≈ P on F_T by
setting dQ := Z_T dP, and the density process of Q with respect to P on [0, T] is then by
construction the P-martingale Z. In particular, if L is continuous, also Z is continuous.

In a bit more detail, Z = E(L) is in the present situation a local P-martingale > 0 on
[0, T] and therefore a P-supermartingale starting at 1. So t ↦ E[Z_t] is decreasing, and one
can easily check that Z is a P-martingale on [0, T] if and only if t ↦ E[Z_t] is identically
1 on [0, T], or also if and only if E[Z_T] = 1. However, expressing this directly in terms of
L is more tricky, and one has only sufficient conditions on L that ensure E[E(L)_T] = 1.
The most famous of these is the Novikov condition: If L is a continuous local martingale
null at 0 and E[e^{½⟨L⟩_T}] < ∞, then Z = E(L) is a martingale on [0, T].
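For the simplest case L = ν • W with a constant ν, the Novikov condition clearly holds, Z_T = exp(νW_T − ½ν²T), and one can check by simulation both E[Z_T] = 1 and the Girsanov drift shift; ν and T below are illustrative:

```python
import numpy as np

# For L = nu * W, E(L)_T = exp(nu * W_T - nu^2 T / 2) has expectation 1
# (Novikov holds), and reweighting by it shifts the mean of W_T to nu * T
# (Girsanov). nu, T and the sample size are illustrative.
rng = np.random.default_rng(5)
nu, T, n = 0.7, 1.0, 1_000_000
WT = rng.standard_normal(n) * np.sqrt(T)
ZT = np.exp(nu * WT - 0.5 * nu**2 * T)

print(ZT.mean())                 # near 1
print((ZT * WT).mean())          # near E_Q[W_T] = nu * T = 0.7
assert abs(ZT.mean() - 1.0) < 0.01
assert abs((ZT * WT).mean() - nu * T) < 0.02
```

The first expectation being exactly 1 is what makes dQ := Z_T dP a probability measure; the second is the Q-mean of W_T computed as a P-expectation via the Bayes rule.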
Now start with an IR^d-valued process S = (S_t)_{0≤t≤T} and suppose that S is a P-semi-
martingale. For each i, the coordinate S^i can then (in general non-uniquely) be written as

    S^i = S^i_0 + M^i + A^i

with a local P-martingale M^i and an adapted process A^i of finite variation, both null at
0. By Theorem 2.2,

    M̃^i = M^i − ∫ (1/Z) d[Z, M^i]

is then a local Q-martingale, and of course we have

    S^i = S^i_0 + M^i + A^i = S^i_0 + M̃^i + ( A^i + ∫ (1/Z) d[Z, M^i] ) = S^i_0 + M̃^i + Ã^i.
So Q is an ELMM for S as soon as

(2.3)    Ã^i = A^i + ∫ (1/Z) d[Z, M^i] ≡ 0   for each i.

Moreover, because dZ = Z_− dL, we have d[Z, M^i] = Z_− d[L, M^i] and

    Z_−/Z = 1/(1 + ΔL).

So in terms of L, the sufficient condition (2.3) can be written as

    A^i + ∫ 1/(1 + ΔL) d[L, M^i] ≡ 0.

If L is continuous, this simplifies to

    A^i + ⟨L, M^i⟩ ≡ 0;
this could alternatively also be derived directly from Theorem 2.3. As a condition on
L in terms of M and A, this is fairly explicit. Note that this is actually a system of d
conditions (one for each S^i) imposed on a single process L.
In Chapter 7, we shall see how the above ideas can be used to construct explicitly an
equivalent martingale measure in the Black–Scholes model of geometric Brownian motion
for S. But before that, we study in the next section how local martingales L can (or
must) look if we impose more structure on the underlying filtration IF .
Remark. Instead of using Theorem 2.2, we could also argue more directly. Suppose
again that Z = E(L) is a true P-martingale > 0 on [0, T], and define Q ≈ P on F_T by
dQ := Z_T dP. By Lemma 2.1, S^i is then a local Q-martingale if and only if ZS^i is a local
P-martingale, and therefore we compute, using the product rule and dZ = Z_− dL,

    d(Z S^i) = Z_− dS^i + S^i_− dZ + d[Z, S^i] = Z_− dM^i + S^i_− Z_− dL + Z_− d( A^i + [L, S^i] ).

Because both Z and M^i, and hence also their stochastic integrals above, are local P-mar-
tingales, we see that Q is an ELMM for S^i if and only if A^i + [L, S^i] is a local P-martingale.
A sufficient condition for this is that

    A^i + [L, S^i] ≡ 0.
If L is continuous, this reduces to

    A^i + ⟨L, M^i⟩ ≡ 0,

because then [L, S^i] = [L, M^i] + [L, A^i] = ⟨L, M^i⟩, due to [L, A^i] = Σ ΔL ΔA^i ≡ 0. ⋄
and construct the filtration IF^W = (F^W_t)_{0≤t≤∞} by adding to each F^0_t the class N of
all subsets of P-nullsets in F^0_∞ to obtain F^W_t = F^0_t ∨ N. This so-called P-augmented
filtration IF^W is then P-complete (in (Ω, F^0_∞, P), to be accurate) by construction, and
one can show, by using the strong Markov property of Brownian motion, that IF^W is also
automatically right-continuous (so that it satisfies the usual conditions). We usually call
IF^W, slightly misleadingly, the filtration generated by W. One can show that W is also
a Brownian motion with respect to IF^W; the key point is to argue that W_t − W_s is still
independent of F^W_s ⊇ F^0_s, even though F^W_s contains some sets from F^0_∞. If one works
on [0, T], one replaces ∞ by T; then F^0_∞ is not needed separately because we use the
P-nullsets from the “last” σ-field F^0_T.
Remark. The assumptions on H say that H is integrable and F^W_∞-measurable. The latter
means intuitively that H(ω) can depend in a measurable way on the entire trajectory
W_·(ω) of Brownian motion, but not on any other source of randomness. ⋄
Proof. For a localizing sequence (τ_k)_{k∈IN}, each (L − L_0)^{τ_k} is a uniformly integrable
martingale N^k, say, and therefore of the form

    N^k_t = E[N^k_∞ | F^W_t]   for 0 ≤ t ≤ ∞,

for some N^k_∞ ∈ L¹(F^W_∞, P). So Theorem 3.1 and the martingale property of ∫ ψ dW give
that N^k = ∫ ψ^k dW for some integrand ψ^k (note that N^k_0 = 0). In particular,
N^k = (L − L_0)^{τ_k} is continuous, which means that L is continuous on [[0, τ_k]]. As τ_k ↗ ∞,
L is continuous, and ψ is obtained by piecing together the ψ^k via ψ := ψ^k on [[0, τ_k]]. q.e.d.
While the above results are remarkable, the next result is bizarre. Note that in its
formulation, the filtration IF is even allowed to be general; but of course we could also
take IF = IF^W.
Then every F_∞-measurable random variable H with |H| < ∞ P-a.s. (for example every
H ∈ L¹(F_∞, P)) can be written as

    H = ∫_0^∞ ψ_s dW_s   P-a.s.
Remark. It is not important for the above results that we work on the infinite interval
[0, ∞] or [0, ∞); everything could be done equally well on [0, T] for any T ∈ (0, ∞). ⋄
7 THE BLACK–SCHOLES FORMULA 127
(1.6)    dS̃^1_t / S̃^1_t = μ dt + σ dW_t.

This means that the bank account has a relative price change (S̃^0_t − S̃^0_{t−dt})/S̃^0_{t−dt} over a
short time period (t − dt, t] of r dt; so r is the growth rate of the bank account. In the same
way, the relative price change of the stock has a part μ dt giving a growth at rate μ, and a
second part σ dW_t “with mean 0 and variance σ² dt” that causes random fluctuations. We
call μ the drift (rate) and σ the (instantaneous) volatility of S̃^1. The formulation (1.5),
(1.6) also makes it clear why this model is the continuous-time analogue of the CRR
binomial model; see Example 6.1.3 in Section 6.1 for a more detailed discussion. (Because
S̃^0 and S̃^1 are both continuous, we can replace S̃^0_{t−dt} and S̃^1_{t−dt} in the denominators above
by S̃^0_t and S̃^1_t, respectively.)
As usual, we pass to quantities discounted with S̃^0; so we have S^0 = S̃^0/S̃^0 ≡ 1, and
S^1 = S̃^1/S̃^0 is by (1.1) and (1.2) given by

(1.7)    S^1_t = S^1_0 exp( σ W_t + (μ − r − ½σ²) t ).
Either from (1.7) or from (1.3), (1.4), we obtain via Itô’s formula that S 1 solves the SDE
Remark 1.1. Because the coefficients μ, r, σ are all constant and σ > 0, the undiscounted
prices (S̃^0, S̃^1), the discounted prices (S^0, S^1), the discounted stock price S^1 alone, and
the Brownian motion W all generate the same filtration. This means that there is here
essentially only one natural choice of filtration.
As in discrete time, we should like to have an equivalent martingale measure for the
discounted stock price process S^1. To get an idea how to find this, we rewrite (1.8) as

(1.10)    dS^1_t = S^1_t σ ( dW_t + ((μ − r)/σ) dt ) = S^1_t σ dW^*_t,

where W^*_t := W_t + λt. The quantity

    λ := (μ − r)/σ

is often called the instantaneous market price of risk or infinitesimal Sharpe ratio of S^1.
By looking at Girsanov's theorem in the form of Theorem 6.2.3, we see that W^* is a
Brownian motion on [0, T] under the probability measure Q^* given by

    dQ^*/dP := E( −λ ∫ dW )_T = exp( −λ W_T − ½ λ² T )   on F_T.

Moreover, the explicit form S^1 = S^1_0 E(σ W^*) follows from (1.10) by Itô's formula, and
so we can use Proposition 4.2.3 under Q^*.
All in all, then, S^1 admits an equivalent martingale measure, explicitly given by Q^*,
and so we expect that S^1 should be “arbitrage-free” in any reasonable sense. However,
we cannot make this precise here before defining more carefully what “trading strategy”,
“self-financing”, “arbitrage opportunity” etc. should mean in this context.
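The last two statements can be checked by simulation: Z_T := exp(−λW_T − ½λ²T) has P-expectation 1, and reweighting by Z_T makes the discounted stock price a martingale, E_P[Z_T S^1_T] = S^1_0. A sketch with illustrative parameters:

```python
import numpy as np

# Check that Z_T = exp(-lam W_T - lam^2 T / 2) is a density (mean 1) and that
# the discounted stock S_T^1 from (1.7) satisfies E_P[Z_T S_T^1] = S_0^1,
# i.e. S^1 is a Q*-martingale. Parameter values are illustrative.
rng = np.random.default_rng(6)
mu, r, sigma, S0, T, n = 0.08, 0.02, 0.3, 100.0, 1.0, 2_000_000
lam = (mu - r) / sigma

WT = rng.standard_normal(n) * np.sqrt(T)
ZT = np.exp(-lam * WT - 0.5 * lam**2 * T)
ST = S0 * np.exp(sigma * WT + (mu - r - 0.5 * sigma**2) * T)

print(ZT.mean())            # near 1
print((ZT * ST).mean())     # near S0 = 100
assert abs(ZT.mean() - 1.0) < 0.01
assert abs((ZT * ST).mean() - S0) < 1.0
```

The exponents cancel exactly because σλ = μ − r: the market price of risk removes the excess drift under Q^*.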
and therefore dZ_t = Z_t dL_t = Z_t ν_t dW_t (as Z is automatically continuous like L).

Now suppose in addition that S^1 is a local Q-martingale, i.e. Q is an ELMM for S^1.
By the Bayes rule in Lemma 6.2.1, this implies that ZS^1 is a local P-martingale. But the
product rule, (1.8) and the rules for computing covariations of stochastic integrals give

    d(Z_t S^1_t) = Z_t dS^1_t + S^1_t dZ_t + d⟨Z, S^1⟩_t
                 = Z_t S^1_t (μ − r) dt + Z_t S^1_t σ dW_t + S^1_t Z_t ν_t dW_t + Z_t ν_t S^1_t σ d⟨W, W⟩_t
                 = Z_t S^1_t (σ + ν_t) dW_t + Z_t S^1_t (μ − r + ν_t σ) dt,

and so the finite variation part A_t := ∫_0^t Z_s S^1_s (μ − r + ν_s σ) ds
must also be a local P-martingale. But A is adapted and continuous (hence predictable)
and of finite variation; so it has quadratic variation 0, hence must be constant, and so
its integrand must be 0. This implies that ν_s ≡ −λ, because Z, S^1, σ are all > 0, and
therefore we get

    Z = Z_0 E(L) = Z_0 E( ∫ ν dW ) = Z_0 E( −λ ∫ dW ).
Finally, Z_0 has E_P[Z_0] = E_P[Z_T] = Q[Ω] = 1 and is measurable with respect to F_0 = F^W_0,
which is P-trivial (because W_0 is constant P-a.s.); so Z_0 = E_P[Z_0] = 1 and therefore

    Z = E( −λ ∫ dW ) = Z^*,   or   Q = Q^*.
Thus we have shown that in the Black–Scholes model, there is a unique equivalent
martingale measure, which is given explicitly by Q^*. So we expect that the Black–Scholes
model is not only “arbitrage-free”, but also “complete” in a suitable sense. Note that the
latter point (as well as the above proof of uniqueness) depends via Itô's representation
theorem in a crucial way on the assumption that the filtration IF is generated by W. ⋄
Now take any H ∈ L^0_+(F_T) and view H as a random payoff (in discounted units) due
at time T. Recall that IF is generated by W and that W^*_t = W_t + λt, 0 ≤ t ≤ T, is a
Q^*-Brownian motion. Because λ is deterministic, W and W^* generate the same filtration,
and so we can also apply Itô's representation theorem with Q^* and W^* instead of P and
W. So if H is also in L¹(Q^*), the Q^*-martingale V^*_t := E_{Q^*}[H | F_t], 0 ≤ t ≤ T, can be
represented as

    V^*_t = E_{Q^*}[H] + ∫_0^t ψ^H_s dW^*_s   for 0 ≤ t ≤ T,

with some unique ψ^H ∈ L^2_loc(W^*) such that ∫ ψ^H dW^* is a Q^*-martingale. Recall from
(1.10) that

    dS^1_t = S^1_t σ dW^*_t.
So if we define for 0 ≤ t ≤ T

    ϑ^H_t := ψ^H_t / (σ S^1_t),

    η^H_t := V^*_t − ϑ^H_t S^1_t,

then φ^H with holdings ϑ^H in the stock and η^H in the bank account is a self-financing
strategy whose (discounted) value process is V(φ^H) = V^*. Moreover,

    V_T(φ^H) = V^*_T = H   a.s.
In summary, then, every H ∈ L¹_+(F_T, Q^*) is attainable in the sense that it can be
replicated by a dynamic strategy trading in the stock and the bank account in such a way
that the strategy is self-financing and admissible, and its value process is a Q^*-martingale.
In that sense, we can say that the Black–Scholes model is complete. By analogous argu-
ments as in discrete time, we then also obtain the arbitrage-free value at time t of any
payoff H ∈ L¹_+(F_T, Q^*) as its conditional expectation

    V^H_t = E_{Q^*}[H | F_t]

under the unique equivalent martingale measure Q^* for S^1. This is in perfect parallel to
the results we have seen for the CRR binomial model; see Section 3.3.
Remarks. 1) All the above computations and results are in discounted units. Of course,
we could also go back to undiscounted units.

2) Itô's representation theorem gives the existence of a strategy, but does not tell us
how it looks. To get more explicit results, additional structure (for the payoff H) and
more work is needed. (Exercise)
3) The SDE (1.8) for discounted prices is

    dS^1_t / S^1_t = (μ − r) dt + σ dW_t,
and this is rather restrictive as μ, r, σ are all constant. An obvious extension is to allow the
coefficients μ, r, σ to be (suitably integrable) predictable processes, or possibly functionals
of S or S̃. This brings up several issues:
4) From the point of view of finance, the natural filtration to work with would be the
one generated by S or S̃, i.e. by prices, not by W. From the explicit formulae (1.1), (1.2),
one can see that S̃ and W generate the same filtrations when the coefficients μ, r, σ are
deterministic. (This has already been pointed out in Remark 1.1.) But in general (i.e. for
more general coefficients), working with the price filtration is rather difficult because it is
hard to describe.
5) A closer look at the no-arbitrage argument for valuing H shows that in continuous time, we can only say that the arbitrage-free seller price process for the payoff H is given by V^H = V^*. The reason is that the strategy φ^H is admissible, but −φ^H is not, in general, unless H is in addition bounded from above. In finite discrete time, this phenomenon does not appear because absence of arbitrage for admissible or for general self-financing strategies is the same there. ⋄
function is h(x) = (x − K̃e^{−rT})^+ =: (x − K)^+. Our goal, for general h, is to compute the value process V^* and the strategy ϑ^H more explicitly.
We start with the value process. Because we have V^*_t = E_{Q^*}[H | F_t] = E_{Q^*}[h(S^1_T) | F_t], we look at the explicit expression for S^1 in (1.11) and write

    S^1_T = S^1_t (S^1_T / S^1_t) = S^1_t exp( σ(W^*_T − W^*_t) − ½σ²(T − t) ).

In the last expression, the first factor S^1_t is obviously F_t-measurable. Moreover, W^* is a Q^*-Brownian motion with respect to IF, and so in the second factor, W^*_T − W^*_t is under Q^* independent of F_t and has an N(0, T − t)-distribution. Therefore we get V^*_t = v(t, S^1_t), where

(2.2)    v(t, x) = ∫_{−∞}^{∞} h( x exp( σ√(T − t) y − ½σ²(T − t) ) ) (1/√(2π)) e^{−y²/2} dy.
This already gives a fairly precise structural description of V^*_t as a function of (t and) S^1_t, instead of a general F_t-measurable random variable.
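This structural description is easy to check numerically: since W^*_T − W^*_t is N(0, T − t) under Q^* and independent of F_t, the conditional expectation is a plain integral over a standard normal variable, which a Monte Carlo average approximates. A sketch (function name and all parameter values are illustrative):

```python
from math import exp, sqrt
import random

def v_monte_carlo(x, tau, sigma, h, n=200_000, seed=0):
    """Monte Carlo estimate of
    v(t, x) = E[ h( x * exp(sigma*sqrt(tau)*Y - 0.5*sigma^2*tau) ) ]
    with Y ~ N(0, 1) under Q*, tau = T - t."""
    rng = random.Random(seed)
    a = sigma * sqrt(tau)
    total = 0.0
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        total += h(x * exp(a * y - 0.5 * a * a))
    return total / n
```

Taking h(x) = x recovers the martingale property of the discounted stock: the estimate is close to x itself.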
Because we have an explicit formula for the function v as essentially the convolution of
h with a very smooth function (the density of a lognormally distributed random variable),
one can prove that the function v is sufficiently smooth to allow the use of Itô’s formula.
This gives, writing subscripts in the function v for partial derivatives and using (1.10)
and (1.9),
(2.3)    dV^*_t = dv(t, S^1_t)
               = v_t(t, S^1_t) dt + v_x(t, S^1_t) dS^1_t + ½ v_xx(t, S^1_t) d⟨S^1⟩_t
               = σ v_x(t, S^1_t) S^1_t dW^*_t + ( v_t(t, S^1_t) + ½ σ² (S^1_t)² v_xx(t, S^1_t) ) dt.
But V^* is a local (even a true) Q^*-martingale, by its definition, and so is the integrated dW^*-term on the right-hand side above. Therefore the integrated dt-term on the right-hand side of (2.3) is at the same time continuous and adapted and of finite variation, and a local Q^*-martingale. Hence it must vanish, and so (2.3) and (1.12) yield

(2.4)    ϑ^H_t = v_x(t, S^1_t).
A closer look at the above argument also allows us to extract some information about the function v. This is similar to our arguments in Example 6.1.4 for the representation of the random variable h(W_T) as a stochastic integral of W. Indeed, the fact that the dt-term vanishes means that the function v_t(t, x) + ½ σ² x² v_xx(t, x) must vanish along the trajectories of the space-time process (t, S^1_t)_{0<t<T}. But by the explicit expression in (1.11), each S^1_t is lognormally distributed and hence has all of (0, ∞) in its support. So the support of the space-time process contains (0, T) × (0, ∞), and so v(t, x) must satisfy the (linear, second-order) partial differential equation (PDE)

(2.5)    0 = ∂v/∂t + ½ σ² x² ∂²v/∂x²    on (0, T) × (0, ∞).
In addition, v satisfies the terminal condition

(2.6)    v(T, x) = h(x)    for x ∈ (0, ∞),

because v(T, S^1_T) = V^*_T = H = h(S^1_T) and the support of the distribution of S^1_T contains (0, ∞). So even if we cannot compute the integral in (2.2) explicitly, we can at least obtain v(t, x) numerically by solving the PDE (2.5), (2.6).
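A minimal numerical illustration of this route (not part of the notes; the explicit backward finite-difference scheme, the call payoff, the grid sizes and all names are illustrative choices, and the scheme is only stable because the time step respects dt ≤ dx²/(σ² x_max²)):

```python
def solve_pde(h, sigma, T, x_max=300.0, nx=150, safety=0.9):
    """Explicit backward finite differences for v_t + 0.5*sigma^2*x^2*v_xx = 0
    with terminal condition v(T, x) = h(x); returns the grid and v(0, .)."""
    dx = x_max / nx
    xs = [i * dx for i in range(nx + 1)]
    v = [h(x) for x in xs]                 # terminal condition
    # stability bound for the explicit scheme at the largest grid point
    dt = safety * dx * dx / (sigma * sigma * x_max * x_max)
    nt = int(T / dt) + 1
    dt = T / nt
    for _ in range(nt):                    # step backward from T to 0
        v_new = v[:]                       # boundary values stay fixed
        for i in range(1, nx):
            v_xx = (v[i + 1] - 2.0 * v[i] + v[i - 1]) / (dx * dx)
            v_new[i] = v[i] + dt * 0.5 * sigma**2 * xs[i] ** 2 * v_xx
        v = v_new
    return xs, v
```

For a call payoff h(x) = (x − 100)^+ with σ = 0.2 and T = 1, the value on the grid agrees with the closed-form result to within the discretisation error.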
Remarks. 1) Instead of using the above probabilistic argument, one can also derive the PDE (2.5) analytically. Using in (2.2) the substitution u = x exp( σ√(T − t) y − ½σ²(T − t) ) gives y = ( log(u/x) + ½σ²(T − t) ) / ( σ√(T − t) ), hence dy = du / ( u σ√(T − t) ), and then

    v(t, x) = ∫_0^∞ h(u) (1/√(2π σ²(T − t))) exp( − ( log(u/x) + ½σ²(T − t) )² / ( 2σ²(T − t) ) ) du/u.
One can now first check, by using that h(S^1_T) is in L¹(Q^*), that v may be differentiated by differentiating under the integral sign, and by brute-force computations, one can then check in this way that v indeed satisfies the PDE (2.5). The deeper reason behind this is the fact that the density function φ(t, z) = (1/√(2πt)) e^{−z²/(2t)} of an N(0, t)-distribution satisfies the heat equation ∂φ/∂t = ½ ∂²φ/∂z².
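As a sanity check on the convolution formula for v above, the integral can be evaluated by simple trapezoidal quadrature (an illustrative sketch; the payoff, the truncation point u_max and the step count are arbitrary choices):

```python
from math import log, sqrt, exp, pi

def v_quadrature(x, tau, sigma, h, u_max=500.0, n=10_000):
    """Trapezoidal evaluation of the lognormal convolution integral
    for v(t, x), with tau = T - t."""
    a2 = sigma * sigma * tau
    du = u_max / n
    total = 0.0
    for i in range(1, n + 1):          # the integrand vanishes at u = 0
        u = i * du
        dens = exp(-(log(u / x) + 0.5 * a2) ** 2 / (2.0 * a2)) \
               / (u * sqrt(2.0 * pi * a2))
        w = 0.5 if i == n else 1.0     # trapezoid endpoint weight
        total += w * h(u) * dens * du
    return total
```

Taking h ≡ 1 checks that the lognormal density integrates to 1 over the chosen grid.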
When comparing the PDE (2.5), (2.6) to some of those found in the literature, one might be puzzled by the simple form of (2.5). This is because we have expressed everything in discounted units. If the undiscounted payoff is H̃ = h̃(S̃^1_T) and the undiscounted value process is given by a function ṽ via

    ṽ(t, S̃^1_t) = e^{rt} v(t, S^1_t),

then

    v(t, x) = e^{−rt} ṽ(t, x e^{rt}),    ṽ(t, x̃) = e^{rt} v(t, x̃ e^{−rt}).
For the function ṽ, we can then compute the partial derivatives

    ∂ṽ/∂t (t, x̃) = r ṽ(t, x̃) + e^{rt} ∂v/∂t (t, x̃ e^{−rt}) − e^{rt} ∂v/∂x (t, x̃ e^{−rt}) x̃ r e^{−rt},

    ∂ṽ/∂x̃ (t, x̃) = e^{rt} ∂v/∂x (t, x̃ e^{−rt}) e^{−rt} = ∂v/∂x (t, x̃ e^{−rt}),

    ∂²ṽ/∂x̃² (t, x̃) = ∂²v/∂x² (t, x̃ e^{−rt}) e^{−rt}.

Plugging these into (2.5) then turns the simple PDE for v into the familiar undiscounted Black–Scholes PDE

    0 = ∂ṽ/∂t + r x̃ ∂ṽ/∂x̃ + ½ σ² x̃² ∂²ṽ/∂x̃² − r ṽ    on (0, T) × (0, ∞).
[It is a nice [→ exercise] to convince oneself that this is correct. Possible ways include straightforward but tedious calculus, or alternatively again a martingale argument.]
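A quick alternative to the tedious calculus is a purely numerical check (a sketch, not a proof): evaluate the left-hand side of the undiscounted PDE by central finite differences of the closed-form call value and verify that the residual is numerically zero. The function names and parameter values are illustrative, and the residual vanishes only because the default parameters of the two functions agree:

```python
from math import log, sqrt, exp, erf

def norm_cdf(y):
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def v_tilde(t, x, K=100.0, r=0.05, sigma=0.2, T=1.0):
    """Undiscounted call value from the Black-Scholes formula."""
    tau = T - t
    d1 = (log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return x * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d1 - sigma * sqrt(tau))

def pde_residual(t, x, r=0.05, sigma=0.2, h=1e-3):
    """Central-difference approximation of
    v_t + r*x*v_x + 0.5*sigma^2*x^2*v_xx - r*v for v = v_tilde;
    r and sigma here must match the defaults of v_tilde."""
    vt = (v_tilde(t + h, x) - v_tilde(t - h, x)) / (2.0 * h)
    vx = (v_tilde(t, x + h) - v_tilde(t, x - h)) / (2.0 * h)
    vxx = (v_tilde(t, x + h) - 2.0 * v_tilde(t, x) + v_tilde(t, x - h)) / (h * h)
    return vt + r * x * vx + 0.5 * sigma**2 * x**2 * vxx - r * v_tilde(t, x)
```

The residual is of the order of the finite-difference error, i.e. many orders of magnitude smaller than the individual terms of the PDE.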
    H̃ = (S̃^1_T − K̃)^+.

Then H = H̃/S̃^0_T = (S^1_T − K̃e^{−rT})^+ =: (S^1_T − K)^+, and we obtain from (2.2), with a := σ√(T − t) and b := K, that

    v(t, x) = ∫_{−∞}^{∞} ( x e^{ay − ½a²} − b )^+ (1/√(2π)) e^{−y²/2} dy
            = x Φ( (log(x/b) + ½a²)/a ) − b Φ( (log(x/b) − ½a²)/a ),

where

    Φ(y) = Q^*[Y ≤ y] = ∫_{−∞}^{y} (1/√(2π)) e^{−z²/2} dz
is the cumulative distribution function of the standard normal distribution N(0, 1). Plugging in x = S^1_t, a = σ√(T − t), b = K and then passing to undiscounted quantities via S^1_t = S̃^1_t e^{−rt}, K = K̃e^{−rT} therefore yields the famous Black–Scholes formula in the form

(3.1)    Ṽ^H̃_t = ṽ(t, S̃^1_t) = S̃^1_t Φ(d₁) − K̃ e^{−r(T−t)} Φ(d₂)
with

(3.2)    d_{1,2} = ( log(S̃^1_t / K̃) + (r ± ½σ²)(T − t) ) / ( σ√(T − t) ).
Note that the drift µ of the stock does not appear here; this is analogous to the result
that the probability p of an up move in the CRR binomial model does not appear in
the binomial option pricing formula (3.2), (3.3) in Section 3.3. What does appear is the volatility σ, in analogy to the difference log(1 + u) − log(1 + d), which gives an indication of the spread between future stock prices from one time point to the next.
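In code, the Black–Scholes formula (3.1), (3.2) reads as follows (a sketch; the function name and the example parameters are illustrative, and Φ is expressed through the error function from the standard library):

```python
from math import log, sqrt, exp, erf

def norm_cdf(y):
    # Phi(y) = (1 + erf(y / sqrt(2))) / 2
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, tau):
    """Undiscounted call value (3.1), (3.2); tau = T - t is time to maturity."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d2)
```

Note that µ indeed appears nowhere in the code, in line with the remark above; only the volatility σ enters.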
To compute the replicating strategy, we recall from (2.4) that the stock price holdings at time t are given by

    ϑ^H_t = ∂v/∂x (t, S^1_t).

Moreover, v(t, x) = e^{−rt} ṽ(t, x e^{rt}), so that

(3.3)    ϑ^H_t = ∂ṽ/∂x̃ (t, S̃^1_t) = Φ(d₁).
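One can verify that the hedge ratio Φ(d₁) is indeed the partial derivative of the option price in the stock variable by comparing it with a central finite difference of the closed-form price (a sketch in undiscounted units; parameters illustrative):

```python
from math import log, sqrt, exp, erf

def norm_cdf(y):
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def call_price(S, K, r, sigma, tau):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return S * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d1 - sigma * sqrt(tau))

def call_delta(S, K, r, sigma, tau):
    # the hedge ratio Phi(d1)
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return norm_cdf(d1)

# central finite difference of the price in the stock variable
h = 1e-4
fd_delta = (call_price(100.0 + h, 100.0, 0.05, 0.2, 1.0)
            - call_price(100.0 - h, 100.0, 0.05, 0.2, 1.0)) / (2.0 * h)
```

The two numbers agree up to the finite-difference error, which is far below any practically relevant accuracy.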
One very useful feature of the above results is that the explicit formula (3.1), (3.2) allows one to compute all partial derivatives of the option price with respect to the various parameters. These sensitivities are usually called Greeks and denoted by (genuine or invented) Greek letters. Examples are

• Delta: the partial derivative with respect to the asset price S̃^1_t, computed in (3.3), also called hedge ratio.
• Gamma: the second partial derivative with respect to S̃^1_t; it measures the reaction of Delta to a stock price change.
• Vanna: the partial derivative of Delta with respect to σ, or the second partial derivative of the option price, once with respect to S̃^1_t and once with respect to σ.
• Vomma: the second partial derivative of the option price with respect to σ.
• Charm: the partial derivative of Delta with respect to T − t, the time to maturity.

Of course, the above definitions per se make sense for any model; but in the Black–Scholes model, one even has explicit expressions for them.
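Even without deriving the explicit expressions, all of these Greeks can be obtained from the formula (3.1) by finite differences (a sketch; the step size and parameter values are illustrative choices):

```python
from math import log, sqrt, exp, erf

def norm_cdf(y):
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def price(S, K, r, sigma, tau):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return S * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d1 - sigma * sqrt(tau))

S, K, r, sigma, tau, h = 100.0, 100.0, 0.05, 0.2, 1.0, 1e-3

# Gamma: second central difference in the stock price
gamma = (price(S + h, K, r, sigma, tau) - 2.0 * price(S, K, r, sigma, tau)
         + price(S - h, K, r, sigma, tau)) / (h * h)
# Vomma: second central difference in the volatility
vomma = (price(S, K, r, sigma + h, tau) - 2.0 * price(S, K, r, sigma, tau)
         + price(S, K, r, sigma - h, tau)) / (h * h)
# Vanna: mixed difference, once in the stock price and once in the volatility
vanna = (price(S + h, K, r, sigma + h, tau) - price(S + h, K, r, sigma - h, tau)
         - price(S - h, K, r, sigma + h, tau)
         + price(S - h, K, r, sigma - h, tau)) / (4.0 * h * h)
```

In the Black–Scholes model these numbers can of course also be checked against the closed-form expressions, e.g. Gamma equals the standard normal density at d₁ divided by S̃^1_t σ √(T − t).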
Remark. One can find in the literature many di↵erent derivations for the Black–Scholes
formula. One especially popular approach is to first derive the binomial call pricing
formula in the CRR model via arbitrage arguments, as we have done in Section 3.3, and
to then pass to the limit by appropriately rescaling the parameters. More precisely, one considers for each n ∈ IN a binomial model with time step T/n, so that letting n increase corresponds to more and more frequent trading. It is intuitively plausible that the CRR
models should then converge in some sense to the BS model, and one can make this
mathematically precise via Donsker’s theorem. Obtaining the Black–Scholes formula as
a limit is similar but simpler; it is essentially an application of the central limit theorem.
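The convergence can be observed numerically. Below is a sketch of an n-step CRR model under one standard rescaling, u = e^{σ√(T/n)}, d = 1/u (this particular choice of u, d is an illustrative convention, not the only possible one); its call price approaches the Black–Scholes value as n grows:

```python
from math import exp, sqrt

def crr_call(S0, K, r, sigma, T, n):
    """Call price in an n-step CRR model with u = exp(sigma*sqrt(T/n)),
    d = 1/u, computed by backward induction under the risk-neutral measure."""
    dt = T / n
    u = exp(sigma * sqrt(dt))
    d = 1.0 / u
    disc = exp(-r * dt)
    q = (exp(r * dt) - d) / (u - d)    # risk-neutral up probability
    # terminal payoffs, then one-step conditional expectations backward
    values = [max(S0 * u**j * d ** (n - j) - K, 0.0) for j in range(n + 1)]
    for _ in range(n):
        values = [disc * (q * values[j + 1] + (1.0 - q) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]
```

For n of a few hundred, the binomial price agrees with the Black–Scholes formula to within a fraction of a cent.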
The above limiting “derivation” of the Black–Scholes formula is mathematically much
simpler; but it is also far less satisfactory, especially at the conceptual level. Most impor-
tantly, it does not give the key insight of the methodology behind the formula, namely that
the price is the initial capital for a self-financing replication strategy in the continuous-
time model. We do have the corresponding insight for each binomial model; but the
elementary analysis usually done in the literature does not study whether that important
structural property is preserved when passing to the limit. To obtain that insight (and to
develop it further in other applications or maybe generalisations), stochastic calculus in
continuous time is indispensable.
It is interesting to note that the above view was also shared by the Nobel Prize
Committee; when it awarded the 1997 Nobel Prize in Economics to Robert C. Merton
and Myron Scholes (Fischer Black had died in 1995), the award was given “for a new
method to determine the value of derivatives”. The emphasis here is clearly on “method”,
as opposed to “formula”. ⋄
8 APPENDIX: SOME BASIC CONCEPTS AND RESULTS 143
    X^{−1}(B) := {X ∈ B} := {ω ∈ Ω : X(ω) ∈ B}.
This is sometimes called the pre-image of the set B under the mapping X. We say that X is measurable (or more precisely Borel-measurable) if for every B ∈ B(IR), we have {X ∈ B} ∈ F. One can show that this is equivalent to having {X ≤ c} ∈ F for every c ∈ IR. More precisely, we could also say that X : Ω → IR is measurable with respect to F and B(IR). If we replace the measurable space (IR, B(IR)) by another measurable space (Ω′, F′), say, we have an analogous definition of a measurable function from Ω to Ω′, with respect to F and F′.
is a P-nullset, i.e. has P[A] = 0. We sometimes also use instead the formulation that a statement holds for P-almost all ω. For example, X ≥ Y P-a.s. means that P[X < Y] = 0 or, equivalently, P[X ≥ Y] = 1. Note that we also use here the shorthand notation
is the family of all equivalence classes of random variables that are bounded by a constant
c, say (where the constant can depend on the random variable).
(2.1) Y is G-measurable.
(2.2) E[U I_A] = E[Y I_A] for all A ∈ G.
Y is then called a version of the conditional expectation and is denoted by E[U | G].
Proof. 1) is nontrivial and not proved here; possible proofs use the Radon–Nikodým theorem or a projection argument in L²(P) combined with an extension argument.
2) Due to (2.1), the set A := {Y > Y′} is in G, so that (2.2) implies

    0 = E[(Y − Y′) I_A].
We next recall without proofs some properties of and computation rules for conditional expectations. Let U, U′ be integrable random variables so that E[U | G] and E[U′ | G] exist. We denote by bG the set of all bounded G-measurable random variables. Then we have:
(2.3)    E[U Z] = E[ E[U | G] Z ]    for all Z ∈ bG.

Linearity: E[aU + bU′ | G] = a E[U | G] + b E[U′ | G] P-a.s., for all a, b ∈ IR.
Monotonicity: If U ≥ U′ P-a.s., then E[U | G] ≥ E[U′ | G] P-a.s.
Projectivity: E[U | H] = E[ E[U | G] | H ] P-a.s., for every σ-field H ⊆ G.
In fact, (2.4) is clear from the definition, (2.5) follows immediately from (2.2) with A = Ω, and (2.6) follows from (2.3) with the help of the definition. The right-hand side of (2.7) is clearly G-measurable, and U and I_A are by assumption independent for every A ∈ G; hence we obtain

    E[U I_A] = E[U] E[I_A] = E[ E[U] I_A ].
Lemma 2.2. Let U, V be random variables such that U is G-measurable and V is independent of G. For every measurable function F ≥ 0 on IR², we then have

(2.8)    E[F(U, V) | G] = f(U) P-a.s., where f(u) := E[F(u, V)].
Proof. For F of the form F(u, v) = g(u)h(v) with g, h ≥ 0 and measurable, we have on the one hand f(u) = E[F(u, V)] = g(u) E[h(V)], and on the other hand E[g(U)h(V) | G] = g(U) E[h(V)] = f(U), because g(U) is G-measurable and h(V) is, like V, independent of G. For general F, one then uses an argument via the so-called monotone class theorem. q.e.d.
Intuitively, (2.8) says that under the assumptions of Lemma 2.2, one can compute
the conditional expectation E[F (U, V ) | G] by “fixing the known value U and taking the
expectation over the independent quantity V ”.
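This recipe can be illustrated by a small Monte Carlo check of the defining property (2.2) of the conditional expectation (an illustrative sketch; the choice F(u, v) = (u + v)², for which f(u) = u² + 1, and the set A = {U > 0} are arbitrary):

```python
import random

# U, V independent N(0,1), G generated by U; F(u, v) = (u + v)^2,
# so f(u) = E[F(u, V)] = u^2 + 1.  Property (2.2) requires
# E[F(U,V) I_A] = E[f(U) I_A] for A in G; we check A = {U > 0}.
rng = random.Random(1)
n = 200_000
lhs = rhs = 0.0
for _ in range(n):
    u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    if u > 0.0:                 # restrict to the event A
        lhs += (u + v) ** 2     # F(U, V)
        rhs += u * u + 1.0      # f(U)
lhs /= n
rhs /= n
```

Both averages estimate the same number (here E[(U² + 1) I_{U>0}] = 1), in line with “fixing the known value and averaging over the independent quantity”.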
In analogy to Fatou’s lemma and the dominated convergence theorem, one has the
following convergence results for conditional expectations.
2) If (U_n) converges to some random variable U P-a.s. and if |U_n| ≤ X P-a.s. for all n and some integrable random variable X, then

(2.9)    E[ lim_{n→∞} U_n | G ] = E[U | G] = lim_{n→∞} E[U_n | G] P-a.s.
Remark. In analogy to what happens for usual expectations, one might be tempted to think that (2.9) is still true if one replaces the assumption that all the U_n are dominated by an integrable random variable by the weaker requirement that the sequence (U_n) is uniformly integrable. But while this is still enough to conclude that E[U] = lim_{n→∞} E[U_n] (in fact, one even has convergence of (U_n) to U in L¹(P)), it does not imply that the conditional expectations converge P-a.s. (although they then do converge in L¹).
We say that a stochastic process is continuous if all or P -almost all its trajectories
are continuous functions. We call a stochastic process RCLL if all or P -almost all its
trajectories are right-continuous (RC) functions admitting left limits (LL). We say that a
stochastic process is of finite variation if all or P-almost all its trajectories are functions of finite variation. Recall that a function is of finite variation if and only if it can be written as the difference of two increasing functions.
Finally, we say that a stochastic process has a property locally if there exists a sequence of stopping times (τ_n)_{n∈IN} increasing to ∞ P-a.s. such that when restricted to the stochastic interval [[0, τ_n]] = {(ω, t) ∈ Ω × [0, T] : 0 ≤ t ≤ τ_n(ω)}, the process has the property under consideration. (Actually, this is a bit tricky. In some cases, for example when considering integrators, one can simply keep the process constant after τ_n at its time-τ_n level; in other cases, for example when considering integrands, one must set the process to 0 after time τ_n.)
10 Index