
Mathematical Methods - Part 1

Leo Kaas

Goethe University Frankfurt

GSEFM
Winter Semester 2022-23

Organization

Lecturer: Leo Kaas, HoF 3.57

Class times: Tuesdays and Wednesdays, 8.30–10.00

7 weeks with 11 lectures and 3 tutorials.

Exam for part 1 in early December.

Part 2 on computational economics, taught by Alexander Ludwig.

Joint grade based on both exams.

All course material at OLAT.

Why this course?
Mathematics is essential for research in economics, finance and management.

All courses in GSEFM, but especially your own original research, require analytic thinking and quantitative methods.

In this class, you
• acquire the mathematics skills necessary to read and understand academic papers
• learn to evaluate the proofs of others (are they correct?)
• learn to develop (simple) proofs of formal statements
• practice thinking in rigorous terms: assumptions, statement, proof.

These skills are needed in all branches of GSEFM (empirical and theoretical).

The aim is to reduce heterogeneity in maths backgrounds while challenging everyone.
General recommendations
Learning mathematics requires much more than reading slides or books, or listening to lectures.

Learning by doing: You only learn what you practice!

This obviously applies to other Ph.D. courses as well.

Active listening and active reading (no memorizing).

Practice problem sets!

Working in groups is strongly encouraged.

But first try to solve problems on your own. Can you explain them to others?

Ask as much as needed!
Overview
Preliminaries
Sets, numbers, Euclidean space, norms, supremum and infimum, functions

Real analysis
Metric spaces, sequences, open and closed sets, continuity, compactness,
fixed-point theorems, convex and concave functions, correspondences, theorem of
the maximum

Differential calculus and static optimization
Derivatives of multivariate functions, Taylor approximation, implicit function
theorem, necessary and sufficient conditions of constrained and unconstrained
optimization problems

Dynamical systems
Stability, linear difference equations, local analysis of nonlinear difference
equations, Markov chains

Dynamic programming
Books

de la Fuente, Mathematical Methods and Models for Economists

Simon and Blume, Mathematics for Economists

Stokey and Lucas, Recursive Methods in Economic Dynamics

Many other sources online

1. Preliminaries

Basic notation
A ⇒ B: If A, then B (A is sufficient for B)

A ⇔ B: A if and only if (“iff”) B (A is necessary and sufficient for B)

¬ A “A is false”

∧ “and”

∨ “or”

∀ “for all . . . ”

∃ “there exists . . . ”

∃! “there exists a unique . . . ”

∄ “there does not exist . . . ”

x ≡ y “x is defined as y”
Sets
Let X , Y be sets and A, B subsets of X
x ∈ A: x is an element of A.

x ∉ A: x is not an element of A.

A ⊂ B: A is a subset of B (∀x ∈ A ⇒ x ∈ B)

A ∩ B ≡ {x ∈ X |x ∈ A ∧ x ∈ B}: Intersection of A and B

A ∪ B ≡ {x ∈ X |x ∈ A ∨ x ∈ B}: Union of A and B

∅: empty set

Aᶜ ≡ X \ A ≡ {x ∈ X | x ∉ A}: Complement of A

|X | cardinality of X (“number of elements”)

X × Y ≡ {(x, y) | x ∈ X, y ∈ Y}: Cartesian product


Method of proof

Deduction

Contraposition

Induction

Contradiction

Numbers
Definitions
Natural numbers N ≡ {1, 2, . . .}
Integers Z ≡ {. . . , −2, −1, 0, 1, 2, . . .}
Rational numbers Q ≡ {m/n | m ∈ Z, n ∈ N}
Real numbers R: All decimals (with possibly infinitely many digits)

Complex numbers C ≡ {a + ib | a, b ∈ R} where i ≡ √(−1).

Remarks
Every complex number can also be written in trigonometric form

a + ib = r(cos θ + i sin θ) = re^{iθ},

with r ≡ √(a² + b²) and cos θ = a/r.
N, Z and Q are countable; R (and C) are uncountable.
Further definitions
Open and closed intervals

(a, b) ≡ {x ∈ R|a < x < b} , [a, b] ≡ {x ∈ R|a ≤ x ≤ b} .

Similarly for half-open intervals (a, b], [a, b)

R+ = [0, ∞), R++ = (0, ∞)

Absolute value |x| ≡ max(x, −x) ∈ R+ for x ∈ R

n-dimensional Euclidean space

Rn ≡ {x = (x1 , . . . , xn )|xi ∈ R, i = 1, . . . , n}

The Euclidean space is an example of a vector space: vector addition and scalar multiplication are possible.
Norms in Rn

Definitions
Euclidean norm ‖x‖ ≡ ‖x‖₂ ≡ (Σ_{i=1}^n x_i²)^{1/2}

sup-norm ‖x‖_∞ ≡ max {|x_i| | i = 1, . . . , n}

Taxicab norm ‖x‖₁ ≡ Σ_{i=1}^n |x_i|

All are special cases of the p-norms (for p ≥ 1)

‖x‖_p ≡ (Σ_{i=1}^n |x_i|^p)^{1/p}
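These definitions translate directly into code. A small plain-Python sketch (an illustration added here, not part of the slides):

```python
def p_norm(x, p):
    """p-norm of vector x for p >= 1: (sum_i |x_i|^p)^(1/p)."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def sup_norm(x):
    """sup-norm: max_i |x_i| (the limit of the p-norm as p -> infinity)."""
    return max(abs(xi) for xi in x)

x = [3.0, -4.0]
print(p_norm(x, 1))   # taxicab norm: 7.0
print(p_norm(x, 2))   # Euclidean norm: 5.0
print(sup_norm(x))    # sup-norm: 4.0
```

For large p, `p_norm(x, p)` approaches `sup_norm(x)`, illustrating why the sup-norm is also written ‖x‖_∞.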

Supremum and infimum
Definition
Set A ⊂ R is bounded above (bounded below) if there exists a number b such
that x ≤ b (x ≥ b) for all x ∈ A. In this case b is an upper bound (lower
bound) of A. A is bounded if it is bounded above and below.

Definition
If A ⊂ R is bounded above, the supremum of A is sup A = b* such that b* is an upper bound of A and b ≥ b* for every other upper bound b.
If A ⊂ R is bounded below, the infimum of A is inf A = b* such that b* is a lower bound of A and b ≤ b* for every other lower bound b.

Remarks
The existence of supremum and infimum requires the completeness axiom of
the real numbers.
If sup A ∈ A, it is the maximum of A, max A = sup A.
If inf A ∈ A, it is the minimum of A, min A = inf A.
Functions

Definitions
A function f : X → Y assigns to every x ∈ X a unique f(x) ∈ Y, in short x ↦ f(x).

For A ⊂ X , f (A) ≡ {y = f (x)|x ∈ A} ⊂ Y is the image of A

For B ⊂ Y , f −1 (B) ≡ {x|f (x) ∈ B} ⊂ X is the pre-image of B

Function f is one-to-one (injective) if f (x) = f (y ) implies x = y .

Function f is onto (surjective) if ∀y ∈ Y ∃x ∈ X : f (x) = y .

Bijective functions are one-to-one and onto.

Functions

Further definitions
Function f : X → R is bounded (above, below) if f (X ) is bounded
(above, below).

The set of all functions f : X → Y is denoted Y X .

For every A ⊂ X, define the indicator function

I_A : X → {0, 1},   I_A(x) = 1 if x ∈ A, 0 if x ∉ A.

2^X = {0, 1}^X is the power set of X, the set of all subsets of X.
2. Real analysis

Metric space
Definition
Set X together with distance function d : X × X → R+ is called metric space
if for all x, y ∈ X :
1 d(x, y ) = 0 ⇔ x = y (Positive distance between distinct points)
2 d(x, y ) = d(y , x) (Symmetry)
3 d(x, y ) ≤ d(x, z) + d(z, y ) (Triangle inequality)

Examples
Rn with dp (x, y ) = kx − y kp is a metric space (for every p-norm)
For every set X, the discrete metric is d(x, y) = 1 for x ≠ y (and d(x, x) = 0).
B(X ) ≡ {f : X → R bounded} is a metric space with metric induced by the
sup-norm
d∞ (f , g ) ≡ sup {|f (x) − g (x)| | x ∈ X }

Sequences

Let (X , d) be a metric space.

Definitions
A sequence in X is a function N → X , n 7→ xn . Write (xn ).

Sequence (xn ) converges if there exists a limit x̄ ∈ X such that

∀ε > 0 ∃N ∈ N ∀n ≥ N : d(xn , x̄) < ε .

Notation is xn → x̄ or limn→∞ xn = x̄.

Sequence (xn) in Rⁿ is bounded if there exists C > 0 such that ‖xn‖ ≤ C for all n.

Sequences

Remarks
Are the following sequences in (R, |.|) convergent?

√ 4n2 + n
(1) xn = n (2) xn = e −n (3) xn =
n2 + 5

Every convergent sequence is bounded. (Proof as exercise.)

Can a sequence have more than one limit? How about xn = (−1)ⁿ?
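A quick numerical look at the three example sequences (an illustration added here, not in the slides): (1) is unbounded and hence divergent, (2) converges to 0, and (3) converges to 4.

```python
import math

x1 = lambda n: math.sqrt(n)                  # unbounded, hence divergent
x2 = lambda n: math.exp(-n)                  # converges to 0
x3 = lambda n: (4 * n**2 + n) / (n**2 + 5)   # converges to 4

for n in (10, 1000, 100000):
    print(n, x1(n), x2(n), x3(n))
```

The tail terms make the limits plausible; a proof of course still requires the ε–N definition above.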

Limit points

Definitions
x̂ is called limit point of (xn ) if

∀ε > 0 ∀N ∈ N ∃n ≥ N : d(xn , x̂) < ε

A limit point requires the existence of a convergent subsequence (x_{n_k}) with lim_{k→∞} x_{n_k} = x̂ (here, k ↦ n_k is strictly increasing).

Questions
Can a convergent sequence have multiple limit points?

If a sequence has a unique limit point, does it converge to it?

Limit points

Theorem (Bolzano-Weierstrass)
Every bounded sequence in (Rn , k.k) has a limit point.

Proof (Sketch): First show that a bounded sequence in R has a monotone subsequence. Then the supremum or infimum of this subsequence is a limit point. Extend this argument by induction to Rⁿ. □

Remarks about convergence in Rn

If xn converges to x̄ in the d_p-metric for some p ≥ 1, it also converges (to the same limit) in any other d_{p′}-metric for p′ ≠ p.

All p-norms are equivalent.

How about convergence in the discrete metric?

Sums and products of convergent sequences converge (to the sum and product of the limits).

Convergence of functions
Consider B(X ), the space of real-valued, bounded functions on X .

Definition
A sequence fn ∈ B(X ) converges point-wise to a function f ∈ B(X ) if

∀x ∈ X ∀ε > 0 ∃N ∈ N ∀n ≥ N : |fn (x) − f (x)| < ε

Definition
A sequence fn ∈ B(X ) converges uniformly to a function f ∈ B(X ) if

∀ε > 0 ∃N ∈ N ∀n ≥ N ∀x ∈ X : |fn (x) − f (x)| < ε

Remarks
Uniform convergence is equivalent to convergence on (B(X ), d∞ ).

Uniform convergence implies point-wise convergence, but not vice versa.
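The classic counterexample fn(x) = xⁿ on [0, 1] shows the gap (a sketch added here, not from the slides): fn converges point-wise to the discontinuous limit f(x) = 0 for x < 1, f(1) = 1, but the sup-distance d∞(fn, f) stays near 1 for every n.

```python
def f_n(x, n):
    return x ** n

def f_lim(x):
    """Point-wise limit of f_n on [0, 1]."""
    return 1.0 if x == 1.0 else 0.0

def sup_distance(n, grid_size=10001):
    """Approximate sup_x |f_n(x) - f_lim(x)| on a grid over [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(f_n(x, n) - f_lim(x)) for x in grid)

print(f_n(0.9, 100))           # point-wise: small for fixed x < 1 and large n
for n in (1, 10, 100):
    print(n, sup_distance(n))  # stays close to 1: convergence is not uniform
```

For each fixed x < 1 the terms vanish, yet no single N works for all x at once, which is exactly the order-of-quantifiers difference between the two definitions.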

Open and closed sets
Let (X , d) be a metric space.

Definitions
B(x0 , ε) ≡ {x ∈ X |d(x, x0 ) < ε} is the open ball at x0 ∈ X of radius ε > 0.
Set A ⊂ X is open if ∀x ∈ A ∃ε > 0 : B(x, ε) ⊂ A.
Set A ⊂ X is closed if Ac is open.

Remarks
What do open balls in R² look like for a given metric d_p?
An open ball is open!
The (finite or infinite) union of open sets is open. Proof?
The finite intersection of open sets is open. Proof?
The finite union of closed sets is closed.
The (finite or infinite) intersection of closed sets is closed.

Limit points and closed sets

Definition
x̄ ∈ X is a limit point of set A ⊂ X if there exists a sequence (xn) with xn ∈ A such that lim_{n→∞} xn = x̄.

Lemma
A is closed if and only if it contains all its limit points.

Closure and interior
Definition
The closure of set A is the intersection of all closed sets containing A:

cl A = ∩_{B ⊃ A, B closed} B

The interior of set A is the union of all open sets contained in A:

int A = ∪_{B ⊂ A, B open} B

Remarks
clA is closed, intA is open.
x ∈ clA iff every open ball at x intersects A.
x ∈ intA iff there exists an open ball at x within A.
cl(Aᶜ) = (int A)ᶜ
Bounded and compact sets

Definition
A ⊂ X is bounded if ∃x0 ∈ X ∃C > 0 ∀x ∈ A : d(x, x0 ) ≤ C .

Definition
A ⊂ X is compact if every sequence in A has a convergent subsequence with
limit in A.

Remarks
A ⊂ Rn (with any dp metric) is compact iff A is closed and bounded.

In general metric space, closed and bounded sets need not be compact.

But compact sets are closed and bounded. Proof as exercise.

Continuity
Definition
Let (X , dX ) and (Y , dY ) be metric spaces. Function f : X → Y is continuous at
x0 ∈ X if
∀ε > 0 ∃δ > 0 ∀x ∈ B(x0 , δ) : f (x) ∈ B(f (x0 ), ε) .
Function f is continuous if it is continuous at every x0 ∈ X .

Remarks
An equivalent property is ∀xn → x0 : f (xn ) → f (x0 ).
The usual intuition “continuity = no jumps of the function” is correct in many cases, but....
There are examples of functions f : [0, 1] → R which are
• nowhere continuous
• continuous at only a single point
• continuous at all x ∉ Q and discontinuous at all x ∈ Q.

Continuity

Remarks
Equivalent properties of continuity of f : X → Y are
• f⁻¹(A) is open for every open set A ⊂ Y, or
• f⁻¹(A) is closed for every closed set A ⊂ Y.

If f : X → Y and g : Y → Z are continuous, so is the composition g ∘ f : X → Z, x ↦ g(f(x)).

Corollary of this result: for all real-valued functions, addition and multiplication preserve continuity.

If f : X → Y is continuous and one-to-one, is the inverse f⁻¹ : f(X) → X continuous?

Weierstrass Theorem

Theorem (Weierstrass)
If A ⊂ X is compact and f : X → Y is continuous, f (A) is compact.

Proof: Take any sequence (yn) contained in f(A). Then there are xn such that f(xn) = yn. Since A is compact, (xn) has a convergent subsequence (x_{n_k}) with limit x̄ ∈ A. By continuity of f, lim_k y_{n_k} = f(x̄). Therefore, (yn) has a convergent subsequence with limit in f(A). This implies that f(A) is compact. □

Corollary
If A ⊂ X is compact and f : X → R is continuous, then f attains its maximum
and minimum on set A.

Examples

Does the Theorem of Weierstrass apply? Do these functions have a maximum?


1. X = (0, 1) and f(x) = 1/x
2. X = [0, 1] and f(x) = 1/x for x > 0, f(0) = 0
3. X = [0, ∞) and f(x) = x²
4. X = (0, 1) and f(x) = 1 for x ∈ Q, f(x) = 0 for x ∉ Q

Implications of the Weierstrass Theorem

Theorem
Let X be compact.
1 All real-valued, continuous functions on X are bounded, i.e.

C (X ) = {f : X → R continuous} ⊂ B(X )

2 Let fn ∈ C (X ) converge to f in metric d∞ . Then f ∈ C (X ).

Corollary
(C (X ), d∞ ) is a metric space and closed in B(X ).

Fixed-point theorems
Fixed-point problems arise in many applications.

For instance, if a dynamic model (in discrete time) is described by xt+1 = F(xt), all stationary solutions (steady states) are solutions of the fixed-point equation x = F(x).

In a (symmetric) game with n players, the best response of every player (out of set X) is a function of the strategies of all other players. The vector of best responses can then be described by a function F : Xⁿ → Xⁿ. A Nash equilibrium is a fixed point of F.

Finding a solution to n equations in n unknowns can often be transformed into a fixed-point problem.

These functions are usually non-linear, so we cannot use linear algebra to solve such equation systems.
Cauchy sequences

Definition
Sequence (xn ) in metric space (X , d) is a Cauchy sequence if

∀ε > 0 ∃N ∈ N ∀m, n ≥ N : d(xm , xn ) < ε .

Remarks
The Cauchy property can be used to verify convergence without knowing the
limit.
Yet, Cauchy sequences do not automatically converge. Why?
They only converge if the metric space is complete.

Complete spaces

Definition
Metric space (X , d) is complete if every Cauchy sequence converges in X .

Theorem
1 Rn is complete.
2 For every compact metric space, the space of real-valued continuous
functions C (X ) is complete.
3 A subset of a complete metric space is complete if and only if it is closed.

Are the following sets complete in (R, |.|)?


1 (0, 1)
2 N

Contractions

Definition
A function F : X → X on metric space (X , d) is a contraction with modulus
β < 1 if
∀x, y ∈ X : d(F (x), F (y )) ≤ βd(x, y ) .

Remarks
Contractions are continuous.

In fact, the contraction property implies Lipschitz continuity with Lipschitz constant β.

For differentiable functions in R, the property holds if |F′(x)| ≤ β for all x in the domain of F. This follows from the mean value theorem.

Banach Fixed Point Theorem

Theorem
Let (X , d) be a complete metric space and let F : X → X be a contraction. Then
F has a unique fixed point.

Proof (Sketch): Take arbitrary x0 ∈ X and define xn = F(xn−1) for n ≥ 1. Use the contraction property to show that d(xn, xn+1) ≤ βⁿ d(x1, x0). Then, for arbitrary n < m, show that d(xn, xm) ≤ βⁿ/(1 − β) · d(x1, x0).
Since β < 1, (xn) is a Cauchy sequence, and since X is complete, this sequence has a limit x̄. Since F is continuous, F(xn) converges to F(x̄). Since xn+1 = F(xn), the sequence (xn) must converge to F(x̄), which implies x̄ = F(x̄).
Uniqueness follows by a simple contradiction argument. □
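The proof is constructive, so it doubles as an algorithm. A minimal numerical sketch (added here, not part of the slides): F(x) = cos(x) maps [−1, 1] into itself and, by the mean value theorem, is a contraction there with modulus β = sin(1) < 1, so iteration from any starting point converges to the unique fixed point.

```python
import math

def fixed_point(F, x0, tol=1e-12, max_iter=10_000):
    """Banach iteration x_{n+1} = F(x_n), stopped when successive terms agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_next = F(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter")

x_bar = fixed_point(math.cos, x0=0.3)
print(x_bar)                         # ≈ 0.739085 (the unique fixed point of cos)
print(abs(math.cos(x_bar) - x_bar))  # ≈ 0: x_bar solves x = F(x)
```

Starting from any other x0 yields the same limit, reflecting the uniqueness part of the theorem.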

Exponential convergence
Remarks
The Banach fixed point theorem not only guarantees existence and
uniqueness, but also delivers a method for approximating the fixed point.

From every initial point, iterations converge to the fixed point at exponential
speed.

Let Fⁿ = F ∘ . . . ∘ F (n times) denote the n-th iterate of F.

Theorem
If F : X → X is a contraction on metric space X with modulus β and if x̄ is a
fixed point of F , then

∀x ∈ X ∀n ∈ N : d(Fⁿ(x), x̄) ≤ βⁿ d(x, x̄) .

N-stage contraction theorem

Theorem
Let F : X → X, and suppose that F^N is a contraction with modulus β < 1 for some integer N. Then F has a unique fixed point x̄, lim_{n→∞} Fⁿ(x) = x̄, and d(F^{kN}(x), x̄) ≤ β^k d(x, x̄) for all x ∈ X.

Proof: By the Banach fixed point theorem, F^N has a fixed point x̄. Suppose F(x̄) ≠ x̄. Then

d(F(x̄), x̄) = d(F^{N+1}(x̄), F^N(x̄)) ≤ β d(F(x̄), x̄) < d(F(x̄), x̄) ,

which is a contradiction. Therefore, x̄ is a fixed point of F. Any other fixed point of F must also be a fixed point of F^N, hence x̄ is the unique fixed point. For any x ∈ X, Fⁿ(x) = F^{kN+m}(x) with k = 0, 1, 2, . . . and m ∈ {0, 1, . . . , N − 1}. Since lim_{k→∞} F^{kN}(F^m(x)) = x̄ for all m, lim_{n→∞} Fⁿ(x) = x̄. □

Brouwer’s fixed point theorem

Definition
Subset A of Rn is convex if

∀x, y ∈ A ∀λ ∈ [0, 1] : λx + (1 − λ)y ∈ A .

Theorem
Let A be a compact and convex subset of Rn . Then every continuous function
F : A → A has a fixed point.

Concave and convex functions
Definition
Let X be a convex subset of Rn . Function f : X → R is
concave if f (λx + (1 − λ)y ) ≥ λf (x) + (1 − λ)f (y )
convex if f (λx + (1 − λ)y ) ≤ λf (x) + (1 − λ)f (y )
for all x ≠ y ∈ X and λ ∈ (0, 1). If the inequality is strict, the function is strictly concave (strictly convex).

Remarks
f is concave iff −f is convex.
The sum of concave (convex) functions is concave (convex).
f is concave iff {(x, z) ∈ X × R|f (x) ≥ z} is convex.
f is convex iff {(x, z) ∈ X × R|f (x) ≤ z} is convex.
A concave function is continuous on intX (Theorem 2.14 in de la Fuente).
Concavity (convexity) is a cardinal property: It is not preserved under a
monotonic transformation.
Quasiconcave functions
Definition
Let X be a convex subset of Rn . Function f : X → R is quasiconcave if

f (λx + (1 − λ)y ) ≥ min(f (x), f (y ))

for all x ≠ y ∈ X and λ ∈ (0, 1). If the inequality is strict, the function is strictly quasiconcave.

Remarks
f is quasiconcave iff the upper-level sets {x ∈ X |f (x) ≥ z} are convex for
all z ∈ R.
Implication: For n = 1 every monotone function is quasiconcave.
Concavity implies quasiconcavity, but not vice versa.
Quasiconcavity is preserved under monotonic transformation.
Sums of quasiconcave functions may not be quasiconcave.
Quasiconcave functions may be discontinuous at interior points.
Maximum of quasiconcave functions

Definition
x ∗ is a local maximum of f : X → R if ∃ε > 0 such that x ∗ is a maximum of f
restricted to the domain B(x ∗ , ε).

Theorem
Let X ⊂ Rn be convex and f : X → R be strictly quasiconcave.
1 If x ∗ is a local maximum of f , x ∗ is a global maximum.
2 If f has a maximum, it is unique.
3 If X is compact and f is continuous, f has a unique maximum.

Correspondences

Definition
A correspondence f : X ⇉ Y is a function from X into 2^Y, the set of subsets of Y.

Definition
Compact-valued correspondence f : X ⇉ Y is upper hemicontinuous
(uhc) at x ∈ X if for every sequence xn → x and every yn ∈ f (xn ), (yn ) has
a convergent subsequence with limit in f (x).
Correspondence f : X ⇉ Y is lower hemicontinuous (lhc) at x ∈ X if for
every sequence xn → x and every y ∈ f (x) there exist yn ∈ f (xn ) such that
yn → y .
Compact-valued correspondence f is continuous at x ∈ X if it is both uhc
and lhc at x.

Correspondences

Remarks
uhc requires that f does not “implode in the limit”.

lhc requires that f does not “explode in the limit”.

If the graph of a compact-valued correspondence is closed and Y is compact, the correspondence is uhc. (Theorem 11.6 in Ch. 2 of de la Fuente)

If f is a function, either uhc or lhc implies continuity.

Examples of correspondences in economics: budget sets, constraint sets, demand or best-response correspondences.
Theorem of the Maximum
How do the maximizer and the maximal value depend on parameters?

Theorem
Let X ⊂ Rⁿ, Ω ⊂ Rᵖ, let f : X × Ω → R be continuous, and let C : Ω ⇉ X be a compact-valued continuous correspondence. Then the problem

max_{x ∈ C(ω)} f(x, ω)

has a non-empty, compact-valued and uhc solution correspondence S : Ω ⇉ X,

S(ω) ≡ {x ∈ C(ω) | f(x, ω) ≥ f(x′, ω) ∀x′ ∈ C(ω)} ,

and the value function V (ω) ≡ maxx∈C (ω) f (x, ω) is continuous.

Corollary
If, in addition to the previous requirements, f is strictly quasiconcave in x, then
the solution correspondence is a continuous function.

3. Differential calculus and static optimization

Calculus

A univariate real function f : X → R, X ⊂ R, is differentiable at x0 ∈ int X if the limit

f′(x0) ≡ lim_{h→0} [f(x0 + h) − f(x0)]/h

exists.

In this case, the affine-linear function

f̂(x) = f(x0) + f′(x0)(x − x0)

approximates function f around the point x0.

For a multivariate function f : X → Rᵐ with X ⊂ Rⁿ, the derivative is similarly defined, as a linear approximation of function f.

Derivative

Definition
Function f : X ⊂ Rⁿ → Rᵐ is differentiable at point x0 ∈ int X if there exists a matrix Df(x0) ∈ R^{m,n} such that

lim_{h→0} ‖f(x0 + h) − f(x0) − Df(x0)h‖ / ‖h‖ = 0 ,

where ‖.‖ is the Euclidean norm and h ∈ Rⁿ. Function f is differentiable if it is differentiable at every point x0 ∈ int X. Df(x0) is called the derivative of f or the Jacobian matrix.

Remarks
Differentiability implies continuity.

But the derivative may not be continuous.

Partial derivatives
Definition
Function f : X ⊂ Rⁿ → Rᵐ is partially differentiable at point x0 ∈ int X if there are numbers ∂f^j/∂x_i(x0) ∈ R for all 1 ≤ i ≤ n and 1 ≤ j ≤ m, the partial derivatives of f at x0, such that

lim_{h→0} [f^j(x0 + h eⁱ) − f^j(x0) − ∂f^j/∂x_i(x0) h] / h = 0 ,

where eⁱ ∈ Rⁿ is the i-th unit vector and h ∈ R.

Remarks
If function f is differentiable at x0 , it is partially differentiable at x0 . Partial
derivatives are the elements of the Jacobian matrix.
The converse is not true. Example f (x, y ) = sign(x · y ) at point (0, 0).
Partial differentiability does not even imply continuity (same example).
If f is partially differentiable at all points and if partial derivatives are
continuous, f is differentiable (Theorem 3.4 in Ch. 4 of de la Fuente).
Continuous differentiability
Definition
Function f : X ⊂ Rⁿ → Rᵐ is continuously differentiable of class C^k(X, Rᵐ) if all partial derivatives up to order k, i.e. ∂^ℓ f^j/(∂x_{i1} · · · ∂x_{iℓ})(x0) for ℓ = 1, . . . , k and i1, . . . , iℓ ∈ {1, . . . , n}, exist and are continuous functions of x0.
If partial derivatives of any order exist, function f is infinitely differentiable and belongs to class C^∞(X, Rᵐ).

Remarks
The usual functions ln(x), e x , sin(x), cos(x), polynomials are infinitely
differentiable on their respective domains.
Addition, multiplication, division (except at zero) preserve differentiability of any order.
Compositions of C^k functions are of class C^k. Chain rule for first derivatives:

D(g ∘ f)(x0) = Dg(f(x0)) · Df(x0)

for f : X ⊂ Rⁿ → Y ⊂ Rᵐ, g : Y ⊂ Rᵐ → Rᵖ.
Taylor expansion for univariate functions
Theorem
Let f ∈ C^k(X, R) with X ⊂ R open. Then for every x0 ∈ X,

f(x) = f(x0) + Σ_{ℓ=1}^k [f^{(ℓ)}(x0)/ℓ!] (x − x0)^ℓ + g(x − x0) ,

with lim_{x→x0} g(x − x0)/(x − x0)^k = 0, where g is a real-valued function defined around 0 ∈ R.

Remarks
The Taylor approximation says that the function is locally well approximated
by a polynomial of order k.
But this does not say that every function is globally identical to an infinite
polynomial.
Examples: e^x, (1 − x)⁻¹, or

f(x) = e^{−1/x²} for x ≠ 0,   f(0) = 0.
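A short numerical sketch of the local approximation (added here, not in the slides) for f(x) = e^x around x0 = 0, where f^{(ℓ)}(0) = 1 for every ℓ, so the order-k Taylor polynomial is Σ_{ℓ=0}^k x^ℓ/ℓ!:

```python
import math

def taylor_exp(x, k):
    """Order-k Taylor polynomial of e^x around x0 = 0."""
    return sum(x ** l / math.factorial(l) for l in range(k + 1))

x = 0.5
for k in (1, 2, 4, 8):
    err = abs(taylor_exp(x, k) - math.exp(x))
    print(k, err)   # the approximation error shrinks rapidly with the order k
```

For e^x the Taylor series happens to converge globally; the third example above is precisely a function whose Taylor series at 0 (identically zero) does not recover the function anywhere except at 0.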
Gradient and Hessian matrix
Definition
Let f ∈ C¹(X, R) with X ⊂ Rⁿ open. The first derivative Df(x0) is a vector of dimension 1 × n, the gradient vector denoted

∇f(x0) = (∂f/∂x1(x0), . . . , ∂f/∂xn(x0))

Definition
Let f ∈ C²(X, R) with X ⊂ Rⁿ open. The second derivative is a matrix of dimension n × n, the Hessian matrix denoted

D²f(x0) = (∂²f/∂xi∂xj(x0))_{i,j=1,...,n}

Remark
The Hessian matrix is symmetric.
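Gradient and Hessian can be approximated by central differences. A sketch on a hypothetical example (not from the slides), f(x, y) = x²y + y³, whose gradient at (1, 2) is (4, 13) and whose Hessian there is [[4, 2], [2, 12]], symmetric as stated:

```python
def f(v):
    x, y = v
    return x**2 * y + y**3

def gradient(f, v, h=1e-5):
    """Central-difference approximation of the gradient vector."""
    g = []
    for i in range(len(v)):
        vp = list(v); vp[i] += h
        vm = list(v); vm[i] -= h
        g.append((f(vp) - f(vm)) / (2 * h))
    return g

def hessian(f, v, h=1e-4):
    """Central-difference approximation of the Hessian matrix."""
    n = len(v)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            vpp = list(v); vpp[i] += h; vpp[j] += h
            vpm = list(v); vpm[i] += h; vpm[j] -= h
            vmp = list(v); vmp[i] -= h; vmp[j] += h
            vmm = list(v); vmm[i] -= h; vmm[j] -= h
            H[i][j] = (f(vpp) - f(vpm) - f(vmp) + f(vmm)) / (4 * h**2)
    return H

print(gradient(f, [1.0, 2.0]))   # ≈ [4.0, 13.0]
print(hessian(f, [1.0, 2.0]))    # ≈ [[4, 2], [2, 12]], symmetric
```

The symmetry of the numerical Hessian (H[0][1] ≈ H[1][0]) reflects Young's theorem for C² functions.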

Taylor approximation in Rn

Theorem
Let f ∈ C²(X, R) with X ⊂ Rⁿ open. Then for every x0 ∈ X,

f(x) = f(x0) + ∇f(x0) · (x − x0) + ½ (x − x0)′ D²f(x0)(x − x0) + g(x − x0) ,

with lim_{x→x0} |g(x − x0)|/‖x − x0‖² = 0. (Here x − x0 is a column vector and g is a real-valued function defined around 0 ∈ Rⁿ.)

Implicit function theorem
Often we want to know how the solution of an equation depends on
parameters.

For example, how does the steady state of a dynamic model or the market
equilibrium in a static model depend on policy parameters? In economics
this is referred to as comparative statics.

Answers are possible even when the equation cannot be solved explicitly.

Univariate case: f(x, ω) = 0 with (x, ω) ∈ X × Ω ⊂ R² and f : X × Ω → R can be solved for x = X(ω) locally at (x0, ω0), i.e. f(X(ω), ω) = 0 for ω in an ε-neighborhood around ω0, if

∂f/∂x (x0, ω0) ≠ 0 .

Then,

X′(ω0) = − (∂f/∂x (x0, ω0))⁻¹ · ∂f/∂ω (x0, ω0) .
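A numerical check of the univariate formula on a hypothetical example (not from the slides): f(x, ω) = x³ + ωx − 1 satisfies f(1, 0) = 0 with ∂f/∂x(1, 0) = 3 ≠ 0 and ∂f/∂ω(1, 0) = 1, so the formula predicts X′(0) = −1/3.

```python
def f(x, w):
    return x**3 + w * x - 1

def solve_x(w):
    """Solve f(x, w) = 0 for x near x0 = 1 by Newton's method."""
    x = 1.0
    for _ in range(50):
        x -= f(x, w) / (3 * x**2 + w)   # df/dx = 3x^2 + w
    return x

h = 1e-6
numeric_slope = (solve_x(h) - solve_x(-h)) / (2 * h)  # finite-difference X'(0)
implicit_slope = -(1.0 / 3.0) * 1.0   # -(df/dx)^(-1) * df/dw at (1, 0)
print(numeric_slope, implicit_slope)  # both ≈ -1/3
```

The cubic has no convenient closed-form inverse in ω, yet the implicit function theorem delivers the comparative-statics derivative exactly, which the finite-difference slope confirms.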
Implicit function theorem

Theorem
Let f : X × Ω ⊂ Rn × Rp → Rn be a continuously differentiable function on open
set X × Ω and consider (x0 , ω0 ) ∈ X × Ω such that f (x0 , ω0 ) = 0. Suppose that
the determinant of the n × n-Jacobian matrix Dx f (x0 , ω0 ) is non-zero. Then there
exists ε > 0 and a function X : B(ω0 , ε) → X such that
1 X (ω0 ) = x0
2 f (X (ω), ω) = 0 for all ω ∈ B(ω0 , ε)
3 X is continuously differentiable with derivative

DX(ω) = − (Dx f(X(ω), ω))⁻¹ Dω f(X(ω), ω)

for all ω ∈ B(ω0, ε).

Inverse function theorem
Corollary (Inverse function theorem)
Let f : X ⊂ Rn → Rn be a continuously differentiable function on open set X and
consider f (x0 ) = y0 . If the determinant of the n × n-Jacobian matrix Df (x0 ) is
non-zero, there exists ε > 0 and a function f −1 : B(y0 , ε) → X such that
1 f −1 (y0 ) = x0
2 f ◦ f −1 (y ) = y for all y ∈ B(y0 , ε)
3 f⁻¹ is continuously differentiable with derivative

Df⁻¹(y) = (Df(x))⁻¹

for all y ∈ B(y0, ε) and x = f⁻¹(y).

Remark
The inverse function is only local. Function f may not be one-to-one (and onto)
on the full domain.
Static optimization without constraints

Theorem (Necessary first-order conditions)
Let f ∈ C¹(X, R) with open set X ⊂ Rⁿ. If f has a (local) maximum at x0, then ∇f(x0) = 0.

Remarks
Solutions of ∇f (x0 ) = 0 are called critical points.
Generally these can be a local maximum, a local minimum or a saddle point.

Static optimization without constraints
Theorem (Sufficient second-order conditions)
Let f ∈ C 2 (X , R) with open set X ⊂ Rn . If ∇f (x0 ) = 0 and D 2 f (x0 ) is negative
definite, then x0 is a local maximum of f .

Remarks
The proof of this theorem uses the 2nd-order Taylor expansion.
Symmetric matrix A ∈ R^{n,n} is negative definite if h′Ah < 0 for every h ∈ Rⁿ \ {0}.
This is equivalent to all eigenvalues being negative.
The conditions in the theorem are not necessary for a local maximum; example: f(x) = −x⁴.
The necessary second-order condition requires D²f(x0) to be negative semidefinite (h′Ah ≤ 0 for all h ∈ Rⁿ).
f ∈ C² is concave iff D²f(.) is negative semidefinite.
f ∈ C² is strictly concave if D²f(.) is negative definite (not vice versa).
Definiteness of a symmetric matrix

Definition
Let A ∈ R^{n,n} be a symmetric matrix. The k-th order leading principal submatrix A_k is the k × k submatrix of A with the last n − k rows and columns deleted. The k-th leading principal minor of A is the determinant of A_k, det A_k.

Theorem
Symmetric matrix A is negative definite iff, for every k = 1, . . . , n, the sign of det A_k is the same as the sign of (−1)^k.
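The sign test is easy to mechanize. A plain-Python sketch (an illustration added here, not in the slides) that checks the leading principal minors of a small symmetric matrix:

```python
def det(M):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(n))

def is_negative_definite(A):
    """Symmetric A is negative definite iff sign(det A_k) = sign((-1)^k) for all k."""
    n = len(A)
    for k in range(1, n + 1):
        A_k = [row[:k] for row in A[:k]]   # leading principal submatrix
        if det(A_k) * (-1) ** k <= 0:
            return False
    return True

A = [[-2.0, 1.0], [1.0, -3.0]]   # det A_1 = -2 < 0, det A_2 = 5 > 0
print(is_negative_definite(A))                           # True
print(is_negative_definite([[1.0, 0.0], [0.0, -1.0]]))   # False
```

The second matrix is indefinite (eigenvalues 1 and −1), and the test fails already at k = 1.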

Optimization with constraints
Most optimization problems involve constraints.

We specify the case where all constraints are specified as inequalities.

The case with equality constraints is covered separately in textbooks. But it is much less relevant in economic applications.

We specify inequality constraints in the form

gj (x) ≥ 0 , (1)

j = 1, . . . , k, where x ∈ X ⊂ Rn .

The first-order conditions for the problem to maximize f(x) subject to constraints (1) are derived from the Lagrange function

L(x, λ) ≡ f(x) + Σ_{j=1}^k λj gj(x) ,

with Lagrange multipliers λ = (λ1, . . . , λk) ≥ 0.


Static optimization with inequality constraints
Theorem (Kuhn-Tucker, necessary first-order conditions)
Let f ∈ C 1 (X , R) and g ∈ C 1 (X , Rk ) with open set X ⊂ Rn . If x0 solves the
problem max f (x) subject to the constraints g (x) ≥ 0, then there are Lagrange
multipliers λ ∈ Rk+ satisfying the conditions:

∇f(x0) + λ′Dg(x0) = 0 ,   (2)

g(x0) ≥ 0 and λ′g(x0) = 0 .   (3)

Remarks
(2) are n conditions obtained from differentiating the Lagrange function
w.r.t. xi , i = 1, . . . , n.
(3) are k complementary slackness conditions λj gj (x0 ) = 0. If constraint
gj (x) ≥ 0 is slack, multiplier λj is zero.
If some constraints bind, then ∇f (x0 ) is a linear combination of the
gradients of the binding constraints with positive scalars λj .
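A hypothetical one-dimensional illustration (not from the slides): take f(x) = −(x − 2)² with the single constraint g(x) = 1 − x ≥ 0. The unconstrained maximizer x = 2 is infeasible, so the constraint binds at the candidate x0 = 1, and the Kuhn-Tucker conditions pin down λ = 2:

```python
# f(x) = -(x - 2)**2, g(x) = 1 - x >= 0; candidate solution x0 = 1.
def df(x):   # f'(x)
    return -2 * (x - 2)

def g(x):
    return 1 - x

def dg(x):   # g'(x)
    return -1

x0 = 1.0
lam = df(x0) / -dg(x0)        # stationarity f'(x0) + lam*g'(x0) = 0  =>  lam = 2
print(lam)                    # 2.0
print(df(x0) + lam * dg(x0))  # 0.0: condition (2) holds
print(g(x0), lam * g(x0))     # 0.0 0.0: feasible, complementary slackness (3)
print(lam >= 0)               # True: multiplier has the right sign
```

Since f is strictly concave and g is linear (hence quasiconcave) with ∇f(x0) = 2 ≠ 0, the sufficiency theorem on the next slide applies and x0 = 1 is the unique global maximizer.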
Static optimization with inequality constraints

Theorem (sufficient conditions for global maximum)


Let f ∈ C¹(X, R) and gj ∈ C¹(X, R), j = 1, . . . , k, be quasiconcave functions defined on the open, convex set X ⊂ Rⁿ. If x0 solves the first-order conditions of the Kuhn-Tucker theorem and ∇f(x0) ≠ 0, then x0 is a solution of the problem max f(x) subject to the constraints gj(x) ≥ 0, j = 1, . . . , k. If f is strictly quasiconcave, x0 is the unique solution.

Remarks
Second-order conditions sufficient for a local maximum can also be specified.
They involve negative definiteness of the Hessian of the Lagrange function
w.r.t. x at the point (x0 , λ), see Theorem 19.8 in Simon/Blume.

4. Dynamical systems

Dynamical systems

Dynamic economic models are described by dynamical systems.

These describe the time path of a state vector x ∈ X ⊂ Rn .

Two approaches:
• Discrete time t = 0, 1, 2, . . .: difference equation

xt+1 = F(xt) with F : X → X .

• Continuous time t ∈ R+: differential equation

ẋt = dx/dt = F(xt) with F : X → Rⁿ .

Dynamical systems
xt+1 = F (xt ) or ẋt = F (xt )
Remarks
Straightforward extensions:
• Time t enters function F: non-autonomous dynamical system.
• Difference equations with additional time lags, e.g. xt+1 = F(xt, xt−1).
• Differential equations with higher-order derivatives.

Generalizations:
• Stochastic states enter function F: stochastic dynamical system.
• X may not be a subset of Rⁿ (e.g. it may be defined on a space of distribution functions).
• Partial differential equations, differential equations with time lags (difference-differential equations).
Dynamical systems

Consider difference equations of the form xt+1 = F (xt ) with F : X → X .

They define sequences on X ⊂ Rn in recursive form.

The initial value x0 is taken as given, a vector of predetermined variables, such as the capital stock, asset holdings, etc.

Economic models typically also include jump variables, such as interest rates or asset prices.

The dynamical system defined by F should be understood as the solution of the dynamics of predetermined variables when all other non-predetermined variables are already solved for.

We are interested in the asymptotic dynamics of this system.
Further notation and stationary solutions

Notation
Write (X , F ) for the (discrete) dynamical system defined by F : X → X .

Write F t : X → X for the t-composite of function F , defined recursively by


F t = F ◦ F t−1 , t ≥ 2, and F 1 = F .

The time-t state of the system is then xt = F t (x0 ).

Definition
x ∗ ∈ X is a stationary solution (fixed point, steady state) of the dynamical
system (X , F ) if x ∗ = F (x ∗ ).
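As a numerical illustration, the recursion xt+1 = F (xt ) and the fixed-point property can be checked directly. The map below is a hypothetical one-dimensional example (not from the slides), chosen so that x∗ = 2:

```python
def F(x):
    # Hypothetical affine map on X = R with fixed point x* = 2,
    # since 2 = 0.5 * 2 + 1.
    return 0.5 * x + 1.0

def composite(F, t, x0):
    # F^t(x0): apply F t times, i.e. the recursion x_{s+1} = F(x_s).
    x = x0
    for _ in range(t):
        x = F(x)
    return x

assert F(2.0) == 2.0                              # x* = F(x*): stationary solution
assert abs(composite(F, 60, 0.0) - 2.0) < 1e-9    # iterates from x0 = 0 approach x*
```

Here the steady state is even globally asymptotically stable, since |F ′(x)| = 0.5 < 1 everywhere.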

69 / 98
Stability
Definition
A steady state x ∗ of the dynamical system (X , F ) is locally asymptotically
stable if
∃δ > 0 ∀x0 ∈ B(x ∗ , δ) : limt→∞ F t (x0 ) = x ∗

Definition
A steady state x ∗ of the dynamical system (X , F ) is globally asymptotically
stable if
∀x0 ∈ X : limt→∞ F t (x0 ) = x ∗

Remarks
Steady states may not exist and they may not be unique.
Even when a unique steady state exists, it may not be stable.
Global stability implies local stability but not vice versa.

70 / 98
(Affine-)linear systems
Consider affine-linear systems of the form
xt+1 = x 1 + Axt
with x 1 ∈ Rn and A ∈ Rn,n .

If all eigenvalues of matrix A are distinct from 1, matrix I − A is invertible


(where I is the n-dimensional identity matrix), and this system has a unique
steady state
x ∗ = [I − A]−1 x 1 .

With the shift x ↦ x′ = x − x ∗ , the system can be rewritten

x′t+1 = Ax′t

Therefore it suffices to consider linear systems whose (unique) steady state


is the zero vector.
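For n = 2 the steady state x ∗ = [I − A]−1 x 1 can be computed with the explicit 2 × 2 inverse; all numbers below are hypothetical illustration values:

```python
# Affine-linear system x_{t+1} = x1 + A x_t with hypothetical 2x2 numbers.
a11, a12, a21, a22 = 0.5, 0.2, 0.1, 0.3   # entries of A
x1 = (1.0, 2.0)                           # the constant vector

# I - A and its determinant
m11, m12, m21, m22 = 1 - a11, -a12, -a21, 1 - a22
det = m11 * m22 - m12 * m21
assert det != 0                           # no eigenvalue of A equals 1

# x* = (I - A)^{-1} x1 via the explicit 2x2 inverse formula
xs0 = (m22 * x1[0] - m12 * x1[1]) / det
xs1 = (-m21 * x1[0] + m11 * x1[1]) / det

# Verify stationarity: x* = x1 + A x*
assert abs(xs0 - (x1[0] + a11 * xs0 + a12 * xs1)) < 1e-9
assert abs(xs1 - (x1[1] + a21 * xs0 + a22 * xs1)) < 1e-9
```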

What happens if matrix A has eigenvalue 1?


71 / 98
Stability of linear systems

The stability of the system xt+1 = Axt hinges on the eigenvalues of matrix A.

Suppose that A has n distinct eigenvalues λ1 , . . . , λn ∈ C.

Then there are n linearly independent (and possibly complex-valued)


eigenvectors e1 , . . . , en ∈ Cn . That is, λi ei = Aei for all i = 1, . . . , n.

With E = (e1 , . . . , en ) ∈ Cn,n ,

AE = E Λ

where Λ = diag(λ1 , . . . , λn ) ∈ Cn,n , the diagonal matrix with entries λi .

Since E is invertible,
E −1 AE = Λ .

Matrix A is diagonalizable.

72 / 98
Stability of linear systems

E −1 AE = Λ ⇔ A = E ΛE −1

With change of coordinates yt = E −1 xt , solutions of

xt+1 = Axt

with initial value x0 are transformed into solutions of

yt+1 = Λyt

with diagonal matrix Λ and initial value y0 = E −1 x0 .

Solution yt,i = (λi )t y0,i for i = 1, . . . , n.

Unless all eigenvalues are real, yt ∈ Cn .

73 / 98
Stability of linear systems
Every eigenvalue λ ∈ C can be written

λ = a + ib = re iθ = r [cos θ + i sin θ]

with r = √(a² + b²) and cos θ = a/r .

a and b are the real and imaginary parts of λ.

r ≥ 0 is the modulus of λ, a measure of its distance from zero.

If b = 0, the modulus coincides with the absolute value of the real number a.

The t-th power of λ is

λt = r t e iθt = r t [cos θt + i sin θt]

The modulus of λt grows without bounds if r > 1 and it shrinks to zero if


r < 1.
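This behavior can be checked with Python's cmath module; r and θ below are arbitrary illustrative values:

```python
import cmath

r, theta = 0.9, 0.5                      # hypothetical modulus and angle
lam = r * cmath.exp(1j * theta)          # λ = r e^{iθ}

# |λ^t| = r^t for every t, so the modulus shrinks to zero when r < 1.
for t in (1, 10, 50):
    assert abs(abs(lam ** t) - r ** t) < 1e-9

# Moduli decrease along powers because r < 1.
assert abs(lam ** 50) < abs(lam ** 10) < abs(lam ** 1)
```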

74 / 98
Stability of linear systems

Theorem
x ∗ = 0 is an asymptotically stable steady state of the linear dynamical system
xt+1 = Axt iff all eigenvalues of A have modulus smaller than one.

Remarks
This result does not require distinct eigenvalues, in which case A may not be diagonalizable (the proof then relies on the Jordan normal form).

75 / 98
Dynamics with real eigenvalues

Theorem
If matrix A has n distinct real eigenvalues λ1 , . . . , λn with eigenvector matrix
E = (e1 , . . . , en ), the solution of xt+1 = Axt with given x0 takes the form
xt = ∑_{i=1}^{n} ci λti ei ,

where (c1 , · · · , cn )′ = E −1 x0 .

76 / 98
Dynamics with complex eigenvalues
Complex eigenvalues and eigenvectors of a real-valued matrix come in
conjugate pairs:
I Eigenvalues λ1 , λ2 = a ± ib = re ±iθ
I Eigenvectors e1 , e2 = e Re ± ie Im
with b ≠ 0, θ ≠ 0, π, and e Re , e Im ∈ Rn .

Then
λt1 = r t [cos(θt) + i sin(θt)] , λt2 = r t [cos(−θt) + i sin(−θt)] .

Then λt1 e1 = u + iv and λt2 e2 = u − iv with

u ≡ r t [e Re cos(θt) − e Im sin(θt)] , v ≡ r t [e Im cos(θt) + e Re sin(θt)]

Real-valued linear combinations of these vectors take the form
(c1 − ic2 )(u + iv ) + (c1 + ic2 )(u − iv ) with c1 , c2 ∈ R, hence

2r t (c1 [e Re cos(θt) − e Im sin(θt)] + c2 [e Im cos(θt) + e Re sin(θt)])

77 / 98
Dynamics with complex eigenvalues

Theorem
If matrix A has two complex eigenvalues λ1 , λ2 = re ±iθ with eigenvectors
e1 , e2 = e Re ± ie Im and n − 2 distinct real eigenvalues λ3 , . . . , λn with
eigenvectors e3 , . . . , en , all solutions of xt+1 = Axt take the form
xt = c1 r t [e Re cos(θt) − e Im sin(θt)] + c2 r t [e Im cos(θt) + e Re sin(θt)] + ∑_{i=3}^{n} ci λti ei ,

with c1 , · · · , cn ∈ R.

Remark
Straightforward extension to multiple complex eigenvalues.

78 / 98
Special case: n = 2
The two eigenvalues λ1 , λ2 of 2 × 2-matrix A satisfy
trA = λ1 + λ2 , det A = λ1 λ2 .

Solution
λ1 , λ2 = ½ [ trA ± √((trA)² − 4 det A) ] .
Complex eigenvalues iff det A > (trA)2 /4.

Asymptotic stability requires


det A < 1
det A > trA − 1
det A > −trA − 1

Stability triangle in (trA, det A) space.
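The three inequalities are easy to test numerically; the following sketch uses hypothetical example matrices:

```python
def is_asymptotically_stable_2x2(a11, a12, a21, a22):
    # Stability triangle conditions in (tr A, det A) space:
    # det A < 1, det A > tr A - 1, det A > -tr A - 1.
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    return det < 1 and det > tr - 1 and det > -tr - 1

# Hypothetical examples: eigenvalues roughly 0.57 and 0.23 (stable),
# versus a diagonal matrix with eigenvalue 1.5 (unstable).
assert is_asymptotically_stable_2x2(0.5, 0.2, 0.1, 0.3)
assert not is_asymptotically_stable_2x2(1.5, 0.0, 0.0, 0.5)
```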


79 / 98
Nonlinear systems

Consider the nonlinear system

xt+1 = F (xt ) (N)

with continuously differentiable function F and stationary solution x ∗ .

The local dynamics of the nonlinear system is approximated by the linear


system
(xt+1 − x ∗ ) = DF (x ∗ )(xt − x ∗ ) (L)

80 / 98
Nonlinear systems

Theorem (Hartman-Grobman)
If all eigenvalues of the Jacobian matrix DF (x ∗ ) have modulus different from one,
there exist open neighborhoods U N and U L of x ∗ and a continuous bijection
g : U N → U L such that for all x, F (x) ∈ U N ,

F (x) = g −1 (x ∗ + DF (x ∗ )(g (x) − x ∗ )) .

That is, the two systems (N) and (L) are topologically equivalent.

Corollary
If all eigenvalues of DF (x ∗ ) have modulus less than one, x ∗ is locally
asymptotically stable for system (N).
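A minimal numerical sketch of the corollary, using a hypothetical one-dimensional map with steady state x∗ = 0 and derivative of modulus 0.5 < 1 at the steady state:

```python
import math

def F(x):
    # Hypothetical nonlinear map; steady state x* = 0 with F'(0) = 0.5.
    return 0.5 * math.sin(x)

# The (single) eigenvalue of DF(x*) has modulus 0.5 < 1, so iterates
# starting near x* = 0 converge to it.
x = 0.3
for _ in range(100):
    x = F(x)
assert abs(x) < 1e-12
```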

81 / 98
Remarks on differential equations

Initial value problem


ẋt = F (xt ) , x0 = x̄

If F is continuous, the initial value problem has a local solution (Peano


theorem).
If F is Lipschitz continuous, the solution is unique (Picard-Lindelöf theorem).
If F is linear (i.e. ẋ = Ax), the steady state x = 0 is globally asymptotically
stable iff all eigenvalues of A have negative real part.
Explicit solutions of linear differential equations can be obtained by
diagonalization.
The local stability properties of a steady state x ∗ of a non-linear differential equation ẋt = F (xt ) hinge on the eigenvalues of DF (x ∗ ) (Hartman-Grobman theorem for differential equations).

82 / 98
Markov chains
A Markov chain describes sequences of random draws st from finite set
S = {1, . . . , n} (“state space”) such that the probability distribution of st+1
depends only on st .

Let πji = Prob(st+1 = j|st = i) be the transition probability from state i to


state j.

The Markov chain is characterized by the transition matrix (Markov


matrix, stochastic matrix)
 
    [ π11 · · · π1n ]
Π = [  ..   ..  ..  ]
    [ πn1 · · · πnn ]

With unit vector ei , Πt ei is the probability distribution of st conditional on


s0 = i.

83 / 98
Markov chains

Let ∆ ≡ {p ∈ Rn+ | ∑i pi = 1} denote the unit simplex, the set of probability distributions on S.

Alternative interpretation of the Markov chain: Πp ∈ ∆ is the population


distribution in period t + 1 if p ∈ ∆ is the population distribution in period
t.

The Markov chain is a dynamical system F : ∆ → ∆, p ↦ Πp.

A fixed point of F is an invariant probability distribution.

Theorem
Every Markov matrix has eigenvalue 1. Equivalently, the Markov chain has an
invariant probability distribution.

Proof: Apply Brouwer’s fixed-point theorem to F . 

84 / 98
Examples
What are the invariant probability distributions of the following Markov chains?
Does Πt converge?
 
1. Π = [ 0.75 0.25
         0.25 0.75 ]

2. Π = [ 1 0
         0 1 ]

3. Π = [ 0 1
         1 0 ]

4. Π = [ 0.5  0   0
         0.25 0.5 0.5
         0.25 0.5 0.5 ]
These examples show that there can be transient states, distinct ergodic sets
(i.e. subsets of S which are invariant under the Markov chain), cyclically moving
subsets of S, and multiple invariant distributions.
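For instance, Example 1 can be checked by iterating the population distribution p ↦ Πp in plain Python:

```python
def step(P, p):
    # One step of the Markov chain on distributions: p_{t+1} = Π p_t,
    # with P[j][i] = Prob(s_{t+1} = j | s_t = i).
    n = len(p)
    return [sum(P[j][i] * p[i] for i in range(n)) for j in range(n)]

# Example 1 from the slide.
P = [[0.75, 0.25],
     [0.25, 0.75]]

p = [1.0, 0.0]          # start in state 1 with certainty
for _ in range(100):
    p = step(P, p)

# Iterates converge to the invariant distribution (0.5, 0.5).
assert abs(p[0] - 0.5) < 1e-9 and abs(p[1] - 0.5) < 1e-9
```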
85 / 98
Ergodic theorem
Even if Πt does not converge, the time average (1/T) ∑_{t=1}^{T} Πt does:

Theorem
For every Markov matrix Π, limT →∞ (1/T) ∑_{t=1}^{T} Πt = Q exists. All columns of Q
(and convex combinations thereof) are invariant distributions of Π.
Proof: The sequence of Markov matrices AT = (1/T) ∑_{t=1}^{T} Πt ∈ Rn,n is bounded, hence has a convergent
subsequence (ATk ) with limit Q. It can then be shown that QΠn = Πn Q = Q for every n ∈ N.
For any other convergent subsequence (ATk′ ) with limit Q̂,

Q̂Q = limk (1/Tk′ ) ∑_{t=1}^{Tk′} Πt Q = limk (1/Tk′ ) ∑_{t=1}^{Tk′} Q = Q

which uses Πt Q = Q. Similarly, Q Q̂ = Q. With the roles of Q and Q̂ reversed, it follows Q Q̂ = Q̂ = Q̂Q, and therefore
Q = Q̂. Therefore the bounded sequence AT cannot have multiple limit points, hence it must converge to a unique limit Q.

Finally, ΠQ = Q implies that each column of Q is an invariant probability distribution. 

Corollary:
For every p ∈ ∆, the invariant distribution Qp is identical to the time average
limT →∞ (1/T) ∑_{t=1}^{T} Πt p of population distributions starting from p in t = 0.
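Example 3 above (the cyclic chain) illustrates the theorem: Πt itself does not converge, but its time average does. A small sketch:

```python
def matmul(A, B):
    # Product of two square matrices given as lists of rows.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Example 3: the cyclic chain, for which Π^t alternates and never converges.
P = [[0.0, 1.0],
     [1.0, 0.0]]

T = 1000
S = [[0.0, 0.0], [0.0, 0.0]]        # running sum for (1/T) Σ_{t=1}^T Π^t
Pt = [[1.0, 0.0], [0.0, 1.0]]       # Π^0 = I
for _ in range(T):
    Pt = matmul(P, Pt)              # Π^t
    for i in range(2):
        for j in range(2):
            S[i][j] += Pt[i][j] / T

# The time average converges to Q; each column of Q is the (here unique)
# invariant distribution (0.5, 0.5).
assert all(abs(S[i][j] - 0.5) < 1e-9 for i in range(2) for j in range(2))
```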
86 / 98
Convergence to a unique invariant distribution

Lemma
Suppose that
∃j ∈ S ∀i ∈ S : πji > 0 (∗)
for Markov matrix Π = (πji ). Then F : ∆ → ∆, p ↦ Πp is a contraction
w.r.t. the d1 metric.

Proof: See Lemma 11.3 in Lucas and Stokey (1989).

Theorem
Let Π be a Markov matrix and suppose that ΠT satisfies property (*) for some
T ∈ N. Then there exists a unique invariant distribution p ∗ and β < 1 such that
limt→∞ Πt p0 = p ∗ and d1 (ΠTk p0 , p ∗ ) < β k d1 (p0 , p ∗ ) for all p0 ∈ ∆ and k ∈ N.

Proof: Application of the N-stage contraction theorem.

87 / 98
5. Dynamic programming

88 / 98
Dynamic optimization problem
Consider dynamic (infinite-horizon) optimization problems of the form

V ∗ (x0 ) = sup_{(xt )t≥0} ∑_{t=0}^{∞} β t U(xt , xt+1 ) ,

s.t. xt+1 ∈ G (xt ) for all t ≥ 0, x0 given.


where
X ⊂ Rn is the state space
G : X ⇒ X is a non-empty-valued correspondence defining the feasible states for the next period
GrG = {(x, x+ )|x+ ∈ G (x)} is the graph of G
U : GrG → R is the period payoff function
β ∈ (0, 1) is the discount factor
V ∗ : X → R ∪ {∞} is the value function

89 / 98
Recursive formulation
Because the objective function is additively separable, discounting is
exponential, and U, G are time-invariant, the sequential problem has a
recursive formulation:
V (x) = sup_{x+ ∈G (x)} {U(x, x+ ) + βV (x+ )}          (RP)

If this problem can be solved for value function V , and if


limt→∞ β t V (xt ) = 0 for all feasible sequences (i.e. xt+1 ∈ G (xt )), then
V = V ∗ and every sequence of solutions of the recursive problem (RP)
(starting from x0 ) solves the sequential problem (Theorems 4.3 and 4.5 in
Lucas and Stokey, 1989).

If the supremum in (RP) is a maximum,


π : X → 2^X
x ↦ {x+ ∈ G (x) | V (x) = U(x, x+ ) + βV (x+ )} ≠ ∅
is the policy correspondence.
90 / 98
Existence of a value function

Assumption 1. X is compact and G : X ⇒ X is continuous and compact-valued.
Assumption 2. U : GrG → R is continuous.

Theorem (Existence of solutions)


Under (A1) and (A2), there exists a unique continuous function V : X → R
solving (RP). Moreover, there exists a compact-valued and uhc policy
correspondence π : X ⇒ X , x ↦ x+ .

Remark
Since V is bounded, limt→∞ β t V (xt ) = 0 holds for all (xt ). Hence V = V ∗ and
all solutions of (RP) solve the sequential problem.

91 / 98
Proof
Consider C (X ) (continuous real-valued functions on X ), endowed with the sup-norm
‖.‖∞ . For V ∈ C (X ), define the function

TV (x) = max_{x+ ∈G (x)} {U(x, x+ ) + βV (x+ )} .

TV is well–defined because of (A1) and (A2). The theorem of the maximum implies
that TV is a continuous function. Hence, T maps the metric space C (X ) into itself.
It can be verified that
(a) V ≤ W ⇒ TV ≤ TW , for all V , W ∈ C (X ).

(b) T (V + c) ≤ TV + βc for all c ∈ R and V ∈ C (X ).


From Blackwell’s theorem (Lucas and Stokey, 1989, Thm. 3.3) follows that T is a
contraction with modulus β, i.e.

‖TV − TW ‖∞ ≤ β‖V − W ‖∞ for all V , W ∈ C (X ) .

Banach’s fixed point theorem implies that T has a unique fixed point, and the theorem
of the maximum implies that the policy correspondence is compact-valued and uhc. 

92 / 98
Value function iteration

Theorem (Convergence of VFI)


Under (A1) and (A2), and for every V0 ∈ C (X ), the sequence Vn defined
recursively by Vn = TVn−1 converges to the unique value function V and it
satisfies ‖Vn − V ‖∞ ≤ β n ‖V0 − V ‖∞ .

Remarks
The proof follows from the contraction property.

Value function iteration is very useful for computational applications.
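A minimal sketch of value function iteration on a finite grid. The problem specification (cake-eating with payoff √(x − x+) and feasible set G (x) = {x+ ≤ x}) is a hypothetical example chosen to satisfy (A1)-(A2):

```python
import math

beta = 0.9
grid = [i * 0.1 for i in range(11)]     # state space X = {0.0, 0.1, ..., 1.0}
n = len(grid)

def T_op(V):
    # Bellman operator: (TV)(x) = max_{x+ <= x} { sqrt(x - x+) + beta V(x+) }
    return [max(math.sqrt(grid[i] - grid[j]) + beta * V[j]
                for j in range(i + 1))
            for i in range(n)]

V = [0.0] * n                           # initial guess V_0
for _ in range(500):                    # V_n = T V_{n-1}
    V = T_op(V)

# T is a contraction with modulus beta, so V_n converges to the unique
# fixed point; one more application of T barely changes V.
TV = T_op(V)
assert max(abs(TV[i] - V[i]) for i in range(n)) < 1e-8
```

Consistently with the monotonicity theorem below, the computed V is increasing along the grid.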

93 / 98
Concavity and existence of a policy function

Assumption 3. U is strictly concave and GrG is a convex set.

Theorem (Concavity and policy function)


Under (A1)-(A3), let V be the unique solution of (RP). Then V is strictly
concave and there is a continuous policy function π : X → X satisfying

V (x) = U(x, π(x)) + βV (π(x)) for all x ∈ X .

Proof (sketch): Let C ′ (X ) be the set of concave functions and let C ″ (X ) be the
set of strictly concave functions. Show that T maps C ′ (X ) into C ″ (X ) ⊂ C ′ (X ).
Since C ′ (X ) ⊂ C (X ) is a complete metric space, it follows that V ∈ C ′ (X ), and
hence V = TV ∈ C ″ (X ).
Since the RHS of (RP) is strictly concave in x+ and since G (x) is convex, a
unique maximum x+ = π(x) exists. Continuity of function π follows from uhc of
the policy correspondence. 

94 / 98
Monotonicity of the value function

Assumption 4. U(., x+ ) is strictly increasing in each of its arguments for all x+ ∈ X , and G is monotone in the sense that G (x) ⊂ G (x′ ) whenever x ≤ x′ .

Theorem (Monotonicity)
Under (A1)-(A2) and (A4), let V be the unique solution of (RP). Then V is
strictly increasing in all of its arguments.

Proof (sketch): Similar to the proof of the previous theorem: (A4) implies that
max_{x+ ∈G (x)} {U(x, x+ ) + βV (x+ )} is strictly increasing in x. 

95 / 98
Differentiability of the value function (Envelope Theorem)

Assumption 5. U is continuously differentiable on the interior of GrG .

Theorem (Differentiability)
Under (A1)-(A3) and (A5), let V be the unique solution of (RP) and let π be the
policy function. Then, for every x0 ∈ intX with π(x0 ) ∈ intG (x0 ), V is
continuously differentiable at x0 with gradient

∇V (x0 ) = Dx U(x0 , π(x0 )) .

Proof: See Lucas and Stokey, Theorem 4.11.

96 / 98
The Euler equation
Differentiability can be used to characterize optimal solutions without
knowing value or policy functions!

Consider the one–dimensional case, X ⊂ R.

Then the last theorem implies

V ′ (x) = ∂U/∂x (x, π(x)) .

The first-order condition of the maximization problem is

∂U/∂x+ (x, π(x)) + βV ′ (π(x)) = 0 .

Putting these together yields

∂U/∂x+ (x, π(x)) + β ∂U/∂x (π(x), π²(x)) = 0 ,

or, in sequential notation,

∂U/∂x+ (xt , xt+1 ) + β ∂U/∂x (xt+1 , xt+2 ) = 0 .
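The Euler equation can be verified numerically on a hypothetical example with a known solution: cake-eating with U(x, x+) = log(x − x+), whose optimal policy is x+ = βx (consume the fraction 1 − β of the cake each period):

```python
beta = 0.95
x = [1.0]
for t in range(50):
    x.append(beta * x[-1])            # candidate optimal path x_{t+1} = beta x_t

def dU_dxplus(xt, xt1):
    # ∂U/∂x+ (x_t, x_{t+1}) for U(x, x+) = log(x - x+)
    return -1.0 / (xt - xt1)

def dU_dx(xt, xt1):
    # ∂U/∂x (x_t, x_{t+1}) for U(x, x+) = log(x - x+)
    return 1.0 / (xt - xt1)

# The Euler equation holds along the whole path:
# -1/((1-b)x_t) + b/((1-b) b x_t) = 0 exactly.
for t in range(48):
    resid = dU_dxplus(x[t], x[t + 1]) + beta * dU_dx(x[t + 1], x[t + 2])
    assert abs(resid) < 1e-6
```

The transversality condition also holds here, since β^T x_T/((1−β)x_T) = β^T /(1−β) → 0.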
97 / 98
Euler equation and transversality condition

We established that the Euler equation

∂U/∂x+ (xt , xt+1 ) + β ∂U/∂x (xt+1 , xt+2 ) = 0          (4)

is necessary for optimality of (xt )t≥0 with xt ∈ intX .

It is not sufficient, however. What is further needed is the transversality
condition (TVC)

limT →∞ β T ∂U/∂x (xT , xT +1 ) xT = 0 .          (5)

Theorem (Sufficiency of Euler equations and transversality condition)


Under (A1)-(A3) and (A5), an interior sequence (xt )t≥0 is optimal for the
sequential problem iff it satisfies (4) and (5).

Proof: See Lucas and Stokey, Theorem 4.15.

98 / 98
