Dynamic Optimization
1. Math Preliminaries
We will study functional equations of the form $v = T v$, where T is an operator that maps functions into functions.
This operator T takes the function v as input and spits out a new function T v. In this sense
T is like a regular function, but it takes as inputs not scalars z ∈ R or vectors z ∈ Rn , but
functions v from some subset of possible functions. A solution to the functional equation
is then a fixed point of this operator, i.e., a function v ∗ such that v ∗ = T v ∗ .
We want to find out under what conditions the operator T has a fixed point (existence),
under what conditions this fixed point is unique, and under what conditions we can start from an ar-
bitrary function v and converge, by applying the operator T repeatedly, to $v^*$. More pre-
cisely, by defining the sequence of functions $\{v_k\}_{k=0}^{\infty}$ recursively by $v_0 = v$ and $v_{k+1} = T v_k$,
we want to ask under what conditions $\lim_{k\to\infty} v_k = v^*$.
In order to make these questions (and the answers to them) precise we have to define
the domain and range of the operator T, and we have to define what we mean by $\lim_{k\to\infty} v_k$.
This requires the discussion of complete metric spaces. In the next subsection we will
first define what a metric space is and then what makes a metric space complete.
Then we will state and prove the contraction mapping theorem. This theorem states
that an operator T, defined on a metric space, has a unique fixed point if this operator
T is a contraction (we will also define what we mean by a contraction). Furthermore, the theorem
ensures that, starting from any guess v, repeated applications of the operator T converge to
this unique fixed point.
Finally we will prove a theorem, Blackwell’s theorem, that provides sufficient condi-
tions for an operator to be a contraction.
The function d is called a metric and is used to measure the distance between two
elements of S. Examples of metric spaces (S, d) include
Example 2. Let X ⊆ Rn and S = C(X) be the set of all continuous and bounded functions
f : X → R. Define the metric d : C(X) × C(X) → R as d(f, g) = supx∈X |f (x) − g(x)|. Then
(S, d) is a metric space.
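To make the sup metric concrete, here is a small numerical sketch (added for illustration, not part of the original notes) that approximates $d(f,g) = \sup_{x\in X}|f(x) - g(x)|$ on a grid; the functions, the interval $X=[0,1]$ and the grid size are arbitrary choices.

```python
import numpy as np

# Grid approximation of the sup metric d(f, g) = sup_{x in X} |f(x) - g(x)|
# for bounded continuous functions on X = [0, 1]. Purely illustrative.
def sup_metric(f, g, grid):
    return np.max(np.abs(f(grid) - g(grid)))

grid = np.linspace(0.0, 1.0, 10_001)
f = np.sin                       # bounded and continuous on [0, 1]
g = lambda x: x                  # bounded and continuous on [0, 1]
print(sup_metric(f, g, grid))    # approximately sup_x |sin(x) - x| on [0, 1]
```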
Definition 3. A sequence {xk } with xk ∈ S for all k is said to be a Cauchy sequence if for
every ϵ > 0 there exists a K such that d(xk , xl ) < ϵ for all k, l ≥ K.
Theorem 1.1. Suppose that (S, d) is a metric space and that the sequence $\{x_k\}$ converges to
$x \in S$. Then the sequence $\{x_k\}$ is a Cauchy sequence.
Proof. Fix $\epsilon > 0$. Since $\{x_k\}$ converges to $x$, there exists $M$ such that $d(x_k, x) < \frac{\epsilon}{2}$ for all $k \ge M$.
Therefore, if $k, l \ge M$ we have
$$d(x_k, x_l) \le d(x_k, x) + d(x_l, x) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
Thus $\{x_k\}$ is a Cauchy sequence.
1.3 Complete Metric Spaces
Definition 4. A metric space (S, d) is complete if every Cauchy sequence {xk } in S con-
verges to some x ∈ S.
Lemma 1.2. Let (S, d) be a metric space and T : S → S be a function mapping S into itself. If T
is a contraction mapping, then T is continuous.
Given $v_0 \in S$, define the sequence $\{v_k\}$ recursively by $v_{k+1} = T v_k$, so that
$$v_k = T v_{k-1} = T(T v_{k-2}) = \cdots = T^k v_0.$$
Then, we have:
Theorem 1.3. Let (S, d) be a complete metric space and suppose that T : S → S is a contraction
mapping with modulus β. Then, T has exactly one fixed point v ∗ ∈ S.
Proof. Start with an arbitrary $v_0$ and define $v_{k+1} = T v_k$. As our candidate for a fixed point we take $v^* = \lim_{k\to\infty} v_k$.
We first have to establish that the sequence $\{v_k\}$ in fact converges to a function $v^*$. We then
have to show that this $v^*$ satisfies $v^* = T v^*$, and finally that there is no other
$\hat{v} \ne v^*$ that also satisfies $\hat{v} = T \hat{v}$.
Since T is a contraction with modulus β, we have
$$d(v_{k+1}, v_k) = d(T v_k, T v_{k-1}) \le \beta\, d(v_k, v_{k-1}) \le \cdots \le \beta^k d(v_1, v_0).$$
For any $l > k$ it then follows from the triangle inequality that
$$d(v_l, v_k) \le d(v_l, v_{l-1}) + \cdots + d(v_{k+1}, v_k) \le \left(\beta^{l-1} + \cdots + \beta^k\right) d(v_1, v_0) \le \frac{\beta^k}{1-\beta}\, d(v_1, v_0).$$
By making k large we can make $d(v_l, v_k)$ as small as we want. Hence the sequence $\{v_k\}$
is a Cauchy sequence. Since (S, d) is a complete metric space, the sequence converges in
S and therefore $v^* = \lim_k v_k$ is well-defined.
Now we establish that $v^*$ is a fixed point of T. We have
$$T v^* = T\big(\lim_k v_k\big) = \lim_k T(v_k) = \lim_k v_{k+1} = v^*.$$
Note that the fact that $T(\lim_k v_k) = \lim_k T(v_k)$ follows from the continuity of T (Lemma 1.2).
Now we want to prove that the fixed point of T is unique. Suppose there exists another
$\hat{v} \in S$ such that $\hat{v} = T\hat{v}$ and $\hat{v} \ne v^*$. Then there exists $a > 0$ such that $d(\hat{v}, v^*) = a$. But
$$a = d(\hat{v}, v^*) = d(T\hat{v}, T v^*) \le \beta\, d(\hat{v}, v^*) = \beta a < a,$$
which is a contradiction.
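As a purely illustrative sketch of the theorem at work (not from the original notes), the following snippet iterates the scalar contraction $T(x) = 0.5x + 1$ on $(\mathbb{R}, |\cdot|)$, which has modulus $\beta = 0.5$ and unique fixed point $x^* = 2$; from any starting guess, the distance to $x^*$ shrinks by the factor $\beta$ in every step.

```python
# A scalar contraction on (R, |.|): T(x) = 0.5*x + 1, modulus beta = 0.5,
# unique fixed point x* = 2. Repeated application converges from any start.
def T(x):
    return 0.5 * x + 1.0

x_star = 2.0
x = -37.0                        # arbitrary starting guess v0
for k in range(25):
    x = T(x)
    print(k, abs(x - x_star))    # distance to the fixed point halves each step
```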
The Contraction Mapping Theorem is extremely useful for establishing that our
functional equation of interest has a unique fixed point. It is, however, not very opera-
tional as long as we do not know how to determine whether a given operator is a contrac-
tion mapping. Blackwell (1965) provided sufficient conditions for an operator to be a
contraction mapping, and it turns out that these conditions can be easily checked in many
applications. Since they are only sufficient, however, failure of these conditions does not
imply that the operator is not a contraction. We next state Blackwell's theorem.
Theorem 1.4. Let $X \subseteq \mathbb{R}^n$ and let $B(X)$ be the space of bounded functions $f: X \to \mathbb{R}$, with d
being the sup-metric $d(f,g) = \sup_{x\in X}|f(x) - g(x)|$. Let $T: B(X) \to B(X)$ be an operator satisfying
1. Monotonicity: If $f, g \in B(X)$ are such that $f(x) \le g(x)$ for all $x \in X$, then $(Tf)(x) \le (Tg)(x)$ for all $x \in X$.
2. Discounting: Let the function $f + a$, for $f \in B(X)$ and $a \in \mathbb{R}_+$, be defined by $(f + a)(x) = f(x) + a$. There exists $\beta \in (0, 1)$ such that for all $f \in B(X)$, all $a \ge 0$ and all $x \in X$,
$$(T(f + a))(x) \le (Tf)(x) + \beta a.$$
If these two conditions are satisfied, then the operator T is a contraction with modulus β.
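The following sketch (my own illustration, not from the notes) spot-checks both conditions numerically for a discretized operator of the Bellman type $(Tf)(x) = \max_a \{r(x,a) + \beta f(a)\}$ on a finite grid; the reward array $r$, the grid size and $\beta$ are arbitrary assumptions.

```python
import numpy as np

# Spot-check Blackwell's conditions for a discretized Bellman-type operator
#   (T f)(x) = max_a { r(x, a) + beta * f(a) }
# on a finite state/action grid. r, beta and the grid size are illustrative choices.
rng = np.random.default_rng(0)
n, beta = 50, 0.9
r = rng.normal(size=(n, n))            # r[x, a]: reward from action a in state x

def T(f):
    return np.max(r + beta * f[None, :], axis=1)

f = rng.normal(size=n)
g = f + np.abs(rng.normal(size=n))     # g(x) >= f(x) for all x
a = 0.7

print(np.all(T(f) <= T(g)))                    # monotonicity: True
print(np.all(T(f + a) <= T(f) + beta * a))     # discounting: True (holds with equality here)
```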
2. The Cake-Eating Problem
Consider a consumer who starts period 1 with a cake of size $W_1$ and receives utility $u(c_t)$ from eating $c_t$ units of the cake in period t. Lifetime utility over a horizon of T periods is $\sum_{t=1}^{T} \beta^{t-1} u(c_t)$, where $0 < \beta < 1$ is the discount factor.
For now, assume that the cake does not depreciate (melt) or grow. Hence, the evolution
of the cake over time is governed by:
$$W_{t+1} = W_t - c_t \tag{2.1}$$
for $t = 1, 2, \ldots, T$. How would you find the optimal path of consumption, $\{c_t\}_{t=1}^{T}$?
2.1 Direct Attack
One approach is to solve the constrained optimization problem directly. This is called
the sequence problem. Consider the problem of:
$$\max_{\{c_t\}_{t=1}^{T},\ \{W_t\}_{t=2}^{T+1}}\ \sum_{t=1}^{T} \beta^{t-1} u(c_t) \tag{2.2}$$
subject to Eq. (2.1), which holds for t = 1, 2, . . . , T. Also, there are non-negativity con-
straints on consumption and the cake given by: ct ≥ 0 and Wt ≥ 0. For this problem, W1
is given.
Alternatively, the flow constraints imposed by Eq. (2.1) for each t could be combined
yielding:
$$\sum_{t=1}^{T} c_t + W_{T+1} = W_1. \tag{2.3}$$
For now, we will work with the single resource constraint. This is a well-behaved problem
as the objective is concave and continuous and the constraint set is compact. So there is a
solution to this problem.
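As an illustrative sketch of this direct attack (added here, not part of the original notes), the problem can be handed to a numerical optimizer; log utility, $\beta = 0.95$, $T = 10$ and $W_1 = 1$ are arbitrary assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Direct attack on the sequence problem: maximize sum_t beta^(t-1) * log(c_t)
# subject to sum_t c_t = W_1 and c_t >= 0 (log utility assumed for concreteness).
beta, T, W1 = 0.95, 10, 1.0
discount = beta ** np.arange(T)

def negative_lifetime_utility(c):
    return -np.sum(discount * np.log(c))

result = minimize(
    negative_lifetime_utility,
    x0=np.full(T, W1 / T),                                    # a feasible starting guess
    bounds=[(1e-10, W1)] * T,                                 # keeps c_t strictly positive
    constraints=[{"type": "eq", "fun": lambda c: np.sum(c) - W1}],
    method="SLSQP",
)
c = result.x
print(c)                 # optimal consumption path
print(c[1:] / c[:-1])    # each ratio is approximately beta (the Euler equation below)
```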
Letting λ be the multiplier on Eq. (2.3), the first order conditions are given by
$$\beta^{t-1} u'(c_t) = \lambda \qquad \text{for } t = 1, 2, \ldots, T.$$
Combining the conditions for two adjacent periods yields
$$u'(c_t) = \beta u'(c_{t+1}). \tag{2.4}$$
This is a necessary condition for optimality for any t: if it were violated, the agent
could do better by adjusting $c_t$ and $c_{t+1}$. Frequently, Eq. (2.4) is referred to as an Euler
equation.
To understand this condition, suppose that you have a proposed (candidate) solution
for this problem given by $\{c_t^*\}_{t=1}^{T}$, $\{W_t^*\}_{t=2}^{T+1}$. Essentially, the Euler equation says that the
marginal utility cost of reducing consumption by ϵ in period t equals the marginal utility
gain from consuming the extra ϵ of cake in the next period, which is discounted by β. If
the Euler equation holds, then it is impossible to increase utility by moving consumption
across adjacent periods given a candidate solution.
It should be clear though that this condition may not be sufficient: it does not cover
deviations that last more than one period. For example, could utility be increased by re-
ducing consumption by ϵ in period t saving the “cake” for two periods and then increas-
ing consumption in period t + 2? Clearly this is not covered by a single Euler equation.
However, by combining the Euler equation that holds across periods t and t + 1 with that
which holds for periods t + 1 and t + 2, we can see that such a deviation will not increase
utility. This is simply because the combination of Euler equations implies:
$$u'(c_t) = \beta^2 u'(c_{t+2}),$$
so that the two-period deviation from the candidate solution will not increase utility.
As long as the problem is finite, the fact that the Euler equation holds across all adja-
cent periods implies that any finite deviations from a candidate solution that satisfies the
Euler equations will not increase utility.
Is this enough? Not quite. Imagine a candidate solution that satisfies all of the Euler
equations but has the property that WT > cT so that there is cake left over. This is clearly
an inefficient plan: having the Euler equations holding is necessary but not sufficient.
Hence the optimal solution will satisfy the Euler equation for each period and the agent
will consume the entire cake!
Formally, this involves showing that the non-negativity constraint on $W_{T+1}$ must bind. In
fact, this constraint is binding in the above solution: $\lambda = \phi > 0$, where φ is the multiplier on the
constraint $W_{T+1} \ge 0$. This non-negativity constraint serves two important purposes. First, in the absence of a constraint that $W_{T+1} \ge 0$,
the agent would clearly want to set WT +1 = −∞ and thus die with outstanding obliga-
tions. This is clearly not feasible. Second, the fact that the constraint is binding in the
optimal solution guarantees that cake is not being thrown away after period T.
So, in effect, the problem is pinned down by an initial condition (W1 is given) and by
a terminal condition (WT +1 = 0). The set of (T − 1) Euler equations and Eq. (2.3) then
determine the time path of consumption.
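As a worked illustration (not in the original notes), assume log utility: the Euler equations then imply $c_t = \beta^{t-1} c_1$, and the terminal condition $W_{T+1} = 0$ together with Eq. (2.3) pins down $c_1 = W_1(1-\beta)/(1-\beta^T)$. A minimal sketch:

```python
import numpy as np

# Build the optimal path from the Euler equations u'(c_t) = beta*u'(c_{t+1}) plus the
# terminal condition W_{T+1} = 0, assuming u(c) = log(c). Then c_t = beta**(t-1)*c_1
# and exhausting the cake gives c_1 = W_1*(1 - beta)/(1 - beta**T).
beta, T, W1 = 0.95, 10, 1.0
c1 = W1 * (1 - beta) / (1 - beta ** T)
c = c1 * beta ** np.arange(T)
W = W1 - np.concatenate(([0.0], np.cumsum(c)))   # W_1, W_2, ..., W_{T+1}
print(c)        # consumption path
print(W[-1])    # W_{T+1} is (numerically) zero: the whole cake is eaten
```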
Let the solution to this problem be denoted by VT (W1 ) where T is the horizon of the
problem and W1 is the initial size of the cake. VT (W1 ) represents the maximal utility flow
from a T period problem given a size W1 cake. From now on, we call this a value function.
As in those problems, a slight increase in the size of the cake leads to an increase in
lifetime utility equal to the marginal utility of consumption in any period (appropriately discounted). That is,
$$V_T'(W_1) = \lambda = \beta^{t-1} u'(c_t) \qquad \text{for } t = 1, 2, \ldots, T.$$
It doesn’t matter when the extra cake is eaten given that the consumer is acting optimally.
This is analogous to the point raised above about the effect on utility of an increase in
income in the consumer choice problem with multiple goods.
Now suppose we add an initial period, period 0, to the problem and write it recursively as
$$V_{T+1}(W_0) = \max_{c_0} \{u(c_0) + \beta V_T(W_1)\}, \tag{2.5}$$
where $W_1 = W_0 - c_0$ and $W_0$ is given.
In this formulation, the choice of consumption in period 0 determines the size of the
cake that will be available starting in period 1, W1 . So instead of choosing a sequence of
consumption levels, we are just choosing c0 . Once c0 and thus W1 are determined, the
value of the problem from then on is given by VT (W1 ). This function completely summa-
rizes optimal behavior from period 1 onwards. For the purposes of the dynamic program-
ming problem, it doesn’t matter how the cake will be consumed after the initial period.
All that is important is that the agent will be acting optimally and thus generating utility
given by VT (W1 ). This is the principle of optimality, due to Richard Bellman, at work.
With this knowledge, an optimal decision can be made regarding consumption in period
0.
Note that the first order condition (assuming that $V_T(W_1)$ is differentiable) is given by
$$u'(c_0) = \beta V_T'(W_1),$$
so that the marginal gain from reducing consumption a little in period 0
is summarized by the derivative of the value function. As noted in the earlier discussion
of the T period sequence problem,
$$V_T'(W_1) = \beta^{t-1} u'(c_t)$$
for $t = 1, 2, \ldots, T - 1$. Using these two conditions together yields $u'(c_t) = \beta u'(c_{t+1})$, for
$t = 0, 1, \ldots, T - 1$, a familiar necessary condition for an optimal solution.
Since the Euler conditions for the other periods underlie the creation of the value func-
tion, one might suspect that the solution to the T + 1 problem using this dynamic pro-
gramming approach is identical to that from using the sequence approach. This is clearly
true for this problem: the sets of first order conditions for the two problems are identical
and thus, given the strict concavity of u(c), the solutions will be identical as
well.
The apparent ease of this approach though is a bit misleading. We were able to make
the problem look simple by pretending that we actually knew VT (W1 ). Of course, we had
to solve for this either by tackling a sequence problem directly or by building it recursively
starting from an initial single period problem.
On this latter approach, we could start with the single period problem implying V1 (W1 ).
We could then solve Eq. (2.5) to build V2 (W1 ). Given this function, we could move to a
solution of the T = 3 problem and proceed iteratively, using Eq. (2.5) to build VT (W1 ) for
any T.
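The following sketch (added for illustration; log utility, $\beta = 0.95$, $T = 10$ and a 500-point grid are arbitrary assumptions) builds $V_T$ in exactly this recursive way on a grid, starting from the one-period value $V_1(W) = u(W)$:

```python
import numpy as np

# Build V_1, V_2, ..., V_T by the recursion V_{k+1}(W) = max_{W'} {log(W - W') + beta*V_k(W')}
# on a grid, assuming u(c) = log(c), so V_1(W) = log(W) (eat everything in the last period).
beta, T = 0.95, 10
grid = np.linspace(1e-6, 1.0, 500)                   # grid for the cake size W
V = np.log(grid)                                     # V_1 on the grid

for k in range(T - 1):                               # builds V_2, ..., V_T
    c = grid[:, None] - grid[None, :]                # consumption W - W' for every pair
    payoff = np.where(c > 0, np.log(np.maximum(c, 1e-300)), -np.inf) + beta * V[None, :]
    V = np.max(payoff, axis=1)
    # (the smallest grid point has no feasible choice and gets value -inf; harmless here)

print(V[-1])   # grid approximation of V_T(1)

# For comparison: a closed form one can conjecture under log utility, matching the
# two- and three-period constants derived below:
#   B_T = sum_t beta^(t-1),  A_T = sum_t beta^(t-1)*log(beta^(t-1)/B_T),  V_T(1) = A_T.
B_T = sum(beta ** t for t in range(T))
A_T = sum(beta ** t * np.log(beta ** t / B_T) for t in range(T))
print(A_T)
```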
For a concrete example, suppose (as we will again below) that $u(c) = \log(c)$ and solve the one and two period problems directly. The two period value function then takes the form
$$V_2(W_1) = A_2 + B_2 \log(W_1), \tag{2.6}$$
where $A_2$ and $B_2$ are constants associated with the two period problem. These constants
are given by
$$A_2 = \log\frac{1}{1+\beta} + \beta \log\frac{\beta}{1+\beta}, \qquad B_2 = 1 + \beta.$$
Importantly, Eq. (2.6) does not include the max operator as we are substituting the optimal
decisions in the construction of the value function, V2 (W1 ).
Using this function, the $T = 3$ problem can then be written as
$$V_3(W_1) = \max_{W_2 \in [0, W_1]} \{\log(W_1 - W_2) + \beta V_2(W_2)\},$$
where the choice variable is the state in the subsequent period. The first order condition
is $\frac{1}{c_1} = \beta V_2'(W_2)$.
Using Eq. (2.6) evaluated at a cake of size W2 , we can solve for V2′ (W2 ) implying:
$$\frac{1}{c_1} = \beta \frac{B_2}{W_2} = \frac{\beta}{c_2}.$$
Here $c_2$ is the consumption level in the second period of the three-period problem and thus
is the same as the level of consumption in the first period of the two-period problem.
Further, we know from the 2-period problem that $\frac{1}{c_2} = \frac{\beta}{c_3}$.
This plus the resource constraint allows us to construct the solution of the 3-period
problem:
$$c_1 = \frac{W_1}{1+\beta+\beta^2}, \qquad c_2 = \frac{\beta W_1}{1+\beta+\beta^2}, \qquad c_3 = \frac{\beta^2 W_1}{1+\beta+\beta^2},$$
so that $V_3(W_1) = A_3 + B_3 \log(W_1)$ with
$$A_3 = \log\frac{1}{1+\beta+\beta^2} + \beta\log\frac{\beta}{1+\beta+\beta^2} + \beta^2\log\frac{\beta^2}{1+\beta+\beta^2}, \qquad B_3 = 1+\beta+\beta^2.$$
This solution can be verified from a direct attack on the 3 period problem.
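As a quick numerical spot-check (added for illustration; $\beta = 0.95$ and $W_1 = 1$ are arbitrary), a brute-force direct attack over a grid of $(c_1, c_2)$ reproduces the closed-form consumptions:

```python
import numpy as np

# Brute-force direct attack on the 3-period problem with u(c) = log(c):
# search over (c_1, c_2) on a grid, with c_3 = W_1 - c_1 - c_2.
beta, W1 = 0.95, 1.0
grid = np.linspace(1e-4, W1 - 1e-4, 800)
c1, c2 = np.meshgrid(grid, grid, indexing="ij")
c3 = W1 - c1 - c2
value = np.where(
    c3 > 0,
    np.log(c1) + beta * np.log(c2) + beta ** 2 * np.log(np.maximum(c3, 1e-300)),
    -np.inf,
)
i, j = np.unravel_index(np.argmax(value), value.shape)
print(c1[i, j], c2[i, j], c3[i, j])                            # brute-force optimum

denom = 1 + beta + beta ** 2
print(W1 / denom, beta * W1 / denom, beta ** 2 * W1 / denom)   # closed form above
```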
3.1 Infinite Horizon
Suppose that we consider the above problem and allow the horizon to go to infinity.
As before, one can consider solving the infinite horizon sequence problem given by:
$$\max_{\{c_t\}_{t=1}^{\infty},\ \{W_t\}_{t=2}^{\infty}}\ \sum_{t=1}^{\infty} \beta^{t-1} u(c_t)$$
subject to the flow constraints $W_{t+1} = W_t - c_t$, $c_t \ge 0$ and $W_t \ge 0$, with $W_1$ given. Alternatively, consider the dynamic programming formulation
$$V(W) = \max_{0 \le c \le W} \{u(c) + \beta V(W - c)\}$$
for all W. Here u(c) is again the utility from consuming c units in the current period. V(W)
is the value of the infinite horizon problem starting with a cake of size W. So in the given
period, the agent chooses current consumption and thus reduces the size of the cake to
W ′ = W − c, as in the transition equation. We use variables with primes to denote future
values. The value of starting the next period with a cake of that size is then given by
V (W − c) which is discounted at rate β < 1.
For this problem, the state variable is the size of the cake W that is given at the start of
any period. The state completely summarizes all information from the past that is needed
for the forward looking optimization problem. The control variable is the variable that
is being chosen. In this case, it is the level of consumption in the current period, c. Note
that c lies in a compact set, [0, W]. The dependence of the state tomorrow on the state today and
the control today, given by $W' = W - c$, is called the transition equation.
Alternatively, we can specify the problem so that, instead of choosing today's consumption, we choose tomorrow's state:
$$V(W) = \max_{W' \in [0, W]} \{u(W - W') + \beta V(W')\} \tag{3.1}$$
for all W. Either specification yields the same result. But choosing tomorrow’s state often
makes the algebra a bit easier so we will work with Eq. (3.1).
This expression is known as a functional equation and is often called a Bellman equa-
tion after Richard Bellman, one of the originators of dynamic programming. Note that the
unknown in the Bellman equation is the value function itself: the idea is to find a function
V (W ) that satisfies this condition for all W. Unlike the finite horizon problem, there is no
terminal period to use to derive the value function. In effect, the fixed point restriction
of having V (W ) on both sides of Eq. (3.1) will provide us with a means of solving the
functional equation Eq. (3.1).
Note too that time itself does not enter into Bellman’s equation: we can express all
relations without an indication of time. This is the essence of stationarity. In fact, we will
ultimately use the stationarity of the problem to make arguments about the existence of a
value function satisfying the functional equation.
A final very important property of this problem is that all information about the past
that bears on current and future decisions is summarized by W, the size of the cake at the
start of the period. Whether the cake is of this size because we initially had a large cake
and ate a lot or a small cake and were frugal is not relevant. All that matters is that we
have a cake of a given size. This property partly reflects the fact that the preferences of
the agent do not depend on past consumption. But if they did, we could
amend the problem (for instance, by including past consumption in the state) to allow for this possibility.
We next address the question of whether there exists a value function that satisfies Eq.
(3.1).
The operator T corresponding to our functional equation is
$$(T V)(W) = \max_{0 \le W' \le W} \{u(W - W') + \beta V(W')\}. \tag{3.2}$$
It can be shown that this operator T satisfies both monotonicity and discounting. Using
Blackwell’s theorem (see Theorem 1.4) we can conclude that the functional equation Eq.
(3.1) has a unique solution.
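To see the contraction at work, here is an illustrative value function iteration on a grid (my own sketch; log utility, $\beta = 0.95$, the grids and the tolerance are arbitrary assumptions). To keep every state feasible, the choice variable is the saved fraction $s = W'/W$, and $V(W')$ is evaluated by linear interpolation. Starting from the guess $V \equiv 0$, successive sup-norm distances between iterates shrink at least geometrically at rate β, as the Contraction Mapping Theorem predicts.

```python
import numpy as np

# Value function iteration for V(W) = max_{W' in [0,W]} { log(W - W') + beta*V(W') },
# discretized on a grid for W. The choice is the saved fraction s = W'/W, so that
# consumption c = (1 - s)*W stays strictly positive; V(W') uses linear interpolation.
beta = 0.95
grid = np.linspace(0.01, 1.0, 200)            # grid for the cake size W
s = np.linspace(0.0, 0.99, 200)               # saved fractions s = W'/W
Wp = grid[:, None] * s[None, :]               # W' for every (W, s) pair
c = grid[:, None] - Wp                        # implied consumption, strictly positive

V = np.zeros(len(grid))                       # arbitrary starting guess v0 = 0
for it in range(600):
    TV = np.max(np.log(c) + beta * np.interp(Wp, grid, V), axis=1)
    gap = np.max(np.abs(TV - V))              # sup-norm distance d(T v_k, v_k)
    V = TV
    if it % 50 == 0:
        print(it, gap)                        # successive gaps fall at least at rate beta
    if gap < 1e-8:
        break

best = np.argmax(np.log(c) + beta * np.interp(Wp, grid, V), axis=1)
print(s[best][-1])    # saved fraction at W = 1; compare with the closed-form policy below
```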
We next explore the properties of the solution to the functional equation.
The first order condition for the optimization problem in Eq. (3.1) can be written as
u′ (c) = βV ′ (W ′ ).
This looks simple but what is the derivative of the value function? This seems particularly
hard to answer since we do not know V (W ). However, we use the fact that V (W ) satisfies
Eq. (3.1) for all W to calculate $V'$. Assuming that this value function is differentiable,
differentiating Eq. (3.1) with respect to W and using the first order condition (an envelope argument) yields
$V'(W) = u'(c)$, a result we have seen before. Since this holds for all W, it will hold in the
following period, yielding $V'(W') = u'(c')$.
Substitution leads to the familiar Euler equation: $u'(c) = \beta u'(c')$. The solution to the
cake eating problem will satisfy this necessary condition for all W.
The link from the level of consumption and next period's cake (the controls from the
different formulations) to the size of the cake (the state) is given by the policy functions $W' = \varphi(W)$ and $c = W - \varphi(W)$.
Using these in the Euler equation reduces the problem to these policy functions alone:
$$u'\big(W - \varphi(W)\big) = \beta u'\big(\varphi(W) - \varphi(\varphi(W))\big)$$
for all W.
These policy functions are very important for applied research since they provide the
mapping from the state to actions. When elements of the state as well as the action are
observable, then these policy functions will provide the foundation for estimation of the
underlying parameters.
In general, actually finding closed form solutions for the value function and the re-
sulting policy functions is not possible. In those cases, we try to characterize certain
properties of the solution and, for some exercises, we solve these problems numerically.
However, as suggested by the analysis of the finite horizon examples, there are some
versions of the problem we can solve completely. Suppose then, as above, that u(c) =
log(c). Given the results for the T -period problem, we might conjecture that the solution to
the functional equation takes the form of: V (W ) = A + B log(W ) for all W. With this guess
we have reduced the dimensionality of the unknown function V (W ) to two parameters,
A and B. But can we find values for A and B such that V (W ) will satisfy the functional
equation?
Taking this guess as given and using the special preferences, the functional equation
becomes
$$A + B\log(W) = \max_{W' \in [0, W]} \{\log(W - W') + \beta (A + B\log(W'))\}$$
for all W. The first order condition for the maximization on the right hand side implies
$$W' = \varphi(W) = \frac{\beta B}{1 + \beta B}\, W.$$
Substituting this policy back into the functional equation yields
$$A + B\log(W) = \log\!\left(\frac{W}{1+\beta B}\right) + \beta\left(A + B\log\!\left(\frac{\beta B}{1+\beta B}\, W\right)\right)$$
for all W. Collecting terms into a constant and terms that multiply log(W), and then im-
posing the requirement that the functional equation must hold for all W, we find that matching
the coefficients on log(W) requires $B = 1 + \beta B$, so that
$B = \frac{1}{1-\beta}$ is required for a solution. Given this, there is a more complicated expression that can
be used to find A. To be clear, then, we have indeed guessed a solution to the functional
equation: we know that because we can solve for (A, B) such that the functional equation
holds for all W, using the optimal consumption and savings decision rules.
With this solution, we know that
$$c = (1 - \beta)W, \qquad W' = \beta W.$$
Evidently, the optimal policy is to save the constant fraction β of the cake and eat the remain-
ing fraction $1 - \beta$.
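A short numerical check (illustrative only; $\beta = 0.95$ and the test points are arbitrary) confirms the guess: with $B = 1/(1-\beta)$, applying the Bellman operator to $V(W) = A + B\log(W)$ using the savings rule $W' = \frac{\beta B}{1+\beta B}W = \beta W$ returns $B\log(W)$ plus a constant that does not depend on W, so the log-linear form is self-consistent.

```python
import numpy as np

beta = 0.95
B = 1.0 / (1.0 - beta)      # conjectured coefficient on log(W)
A = 0.0                     # arbitrary; only the coefficient on log(W) is checked here

def TV(W):
    # Apply the Bellman operator to V(W) = A + B*log(W), using the savings rule
    # W' = beta*B/(1 + beta*B) * W implied by the first order condition (= beta*W here).
    Wp = beta * B / (1 + beta * B) * W
    return np.log(W - Wp) + beta * (A + B * np.log(Wp))

W = np.linspace(1.0, 10.0, 5)
print(TV(W) - B * np.log(W))    # the same constant at every W: the guess is consistent
```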
Interestingly, the solution for B could have been guessed from the solution to the T-horizon
problems, where $B_T = \sum_{t=1}^{T} \beta^{t-1}$. Evidently, $B = \lim_{T \to \infty} B_T = \frac{1}{1-\beta}$.