Constrained Optimization I:
First Order Conditions
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & g_i(x) \le b_i, \quad i = 1, \dots, k,\\
& h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
¹ The letters “s.t.” can be read as “subject to”, “such that”, or “so that”. However one renders it, it indicates the constraint equations.
\[
\begin{aligned}
c(w, q) = \min_z\ & w \cdot z\\
\text{s.t. } & f(z) \ge q,\\
& z \ge 0.
\end{aligned}
\]
Figure 18.2.1: Three indifference curves are shown in the diagram. Indifference curve u2 is the highest indifference curve the consumer can afford. This happens at x∗, where the indifference curve is tangent to the budget line.
\[
\frac{p_1}{p_2} = \frac{\partial u/\partial x_1}{\partial u/\partial x_2}.
\]
This can also be expressed in terms of the marginal utility per dollar,
which must be the same for both goods. If not, the consumer would gain
by spending more on the good with greater per dollar value, and less on
the good with less bang for the buck.
\[
\frac{\partial u/\partial x_1}{p_1} = \frac{\partial u/\partial x_2}{p_2}.
\]
Another way to think about this is that we are lining up the tangent
spaces of both the budget constraint and the optimal indifference curve.
Both the budget line and the indifference curves are level sets of func-
tions. As such, their tangent spaces are the null spaces of their derivatives.
The tangents v to the budget line obey (p1, p2)·v = 0 and the tangents w to the optimal indifference curve obey Du(x∗)w = 0. This implies that the normal vectors ∇u(x∗) and p = (p1, p2)ᵀ must be collinear. Thus ∇u(x∗) = µp for some µ ∈ R.
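To make the collinearity concrete, here is a minimal sympy sketch. The utility, prices, and income are made up for illustration (u = √(x1x2), p = (2, 1), m = 8); none of them come from the text.

```python
# Check that grad u(x*) is collinear with p at the consumer's optimum.
# Hypothetical example: u = sqrt(x1*x2), p = (2, 1), m = 8.
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
p = (sp.Integer(2), sp.Integer(1))
m = sp.Integer(8)
u = sp.sqrt(x1 * x2)

# Cobb-Douglas with equal exponents spends half the budget on each good.
xstar = {x1: m / (2 * p[0]), x2: m / (2 * p[1])}     # x* = (2, 4)

grad = [sp.diff(u, v).subs(xstar) for v in (x1, x2)]
ratios = [g / pi for g, pi in zip(grad, p)]          # candidate mu's
print(ratios)                                        # both equal sqrt(2)/4
assert sp.simplify(ratios[0] - ratios[1]) == 0       # grad u(x*) = mu * p
```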
Figure 18.4.1: Here x∗ is the consumer’s utility maximum. At the maximum,
the vectors p and ∇u(x∗ ) are collinear, with the heavier arrow being ∇u(x∗ ).
With an equality constraint, it doesn’t matter whether the vectors point in
the same or opposite directions. It will matter for inequality constraints. The
right panel writes the budget constraint as −p·x = −m to emphasize that we
flipped the direction of p. Either way, there is a µ ∈ R with ∇u(x∗ ) = µp.
Figure 18.5.1: Here the prices have been changed to p′ so that the indifference curve u2 (blue tangent) and the budget line are no longer tangent. Because of the acute angle between v and ∇u(x∗), moves along the budget line in the direction v increase utility. Here, we can also see that utility has not only increased from u2, but is even above u3 at x∗ + v.
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
That is, we are attempting to maximize f(x) under the constraints that
h(x) = c. The key result is the Tangent Space Theorem.²
Tangent Space Theorem. Let U ⊂ R^m and f : U → R, h : R^m → R^ℓ be C¹ functions. Suppose x∗ either maximizes or minimizes f(x) over the set M = {x : h(x) = c} with Df(x∗) ≠ 0 and rank Dh(x∗) = ℓ. Then the tangent space T F of the differentiable manifold F = {x : f(x) = f(x∗)} at x∗ contains the tangent space at x∗ of the differentiable manifold M. Moreover, there are unique µ∗j, j = 1, …, ℓ with
\[
\sum_{j=1}^{\ell} \mu_j^* \, Dh_j(x^*) = Df(x^*).
\]
² The name Tangent Space Theorem is not in general use. I gave it a name since we use it a number of times.
The tangent space of M at x∗ has dimension m − rank Dh(x∗) = m − ℓ.³
³ Be careful, the notation differs. The m, ℓ, and m − ℓ on this page are the m + k, k, and m of Theorem 15.38.1. The notation is the way it is in order to highlight different things in the two sections.
⁴ See Joseph-Louis Lagrange (1762), “Essai sur une nouvelle méthode pour déterminer les maxima et minima des formules intégrales indéfinies”, Miscellanea Taurinensia II, 173–195.
Figure 18.5.1: The budget line in the diagram is T M and the tangent (blue) to the indifference curve u2 is T F. Of course, u is the objective. As you can see, movements in the direction v increase utility when the tangent spaces do not coincide.
for some c(t) ∈ ℓ(x(t), x∗), the segment from x(t) to x∗. For t small enough that x(t) ∈ Bε(x∗), we have
\[
f(x(t)) = f(x^*) + Df(c(t))\,v > f(x^*),
\]
\[
Df(x^*) = \sum_{j=1}^{\ell} \mu_j^* \, Dh_j(x^*).
\]
By NDCQ, rank Dh(x∗) = ℓ. This means the ℓ vectors Dhj are linearly
independent, implying that the µ∗j are unique.
⁵ Suppose A ⊂ B. The orthogonal complement of A is the set of all vectors perpendicular to everything in A. That is, A⊥ = {x : x·a = 0 for all a ∈ A}. Since A ⊂ B, there are additional conditions that must be met to be in B⊥. Such vectors must not only be perpendicular to all elements of A, but also to those elements of B that are not in A. So B⊥ ⊂ A⊥.
\[
L(x, \mu) = f(x) - \mu^T\bigl(h(x) - c\bigr) = f(x) - \sum_{j=1}^{\ell} \mu_j \bigl(h_j(x) - c_j\bigr).
\]
This allows us to rewrite the key conclusions of the Tangent Space Theorem as follows:⁶
⁶ Theorems such as this are sometimes referred to as the “Kuhn-Tucker Theorem” or “Karush-Kuhn-Tucker Theorem”. Lagrange multipliers date back to Lagrange in the 18th century, where they proved useful in the calculus of variations and for problems with equality constraints. William Karush (1917–1997) proved the first theorem with inequality constraints in an unpublished 1939 Chicago master’s thesis. Fritz John (1910–1994) proved another version; after being rejected by the Duke Mathematical Journal, it was published in a 1948 collection in honor of Richard Courant’s 60th birthday. Harold W. Kuhn (1925–2014) and Albert W. Tucker (1905–1995) published their version two years later, to instant acclaim. Timing is everything! See T.H. Kjeldsen (2000), “A Contextualized Historical Analysis of the Kuhn-Tucker Theorem in Nonlinear Programming: The Impact of World War II”, Historia Mathematica 27, 331–361.
That is,
\[
\frac{\partial L}{\partial x_i}(x^*, \mu^*) = 0 \ \text{ for } i = 1, \dots, m,
\qquad\text{and}\qquad
\frac{\partial L}{\partial \mu_j}(x^*, \mu^*) = 0 \ \text{ for } j = 1, \dots, \ell.
\]
\[
Df(x^*) = (\mu^*)^T Dh(x^*) = \sum_{j=1}^{\ell} \mu_j^* \, Dh_j(x^*).
\]
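In practice, these equations can be solved mechanically. Below is a small sympy sketch for a hypothetical equality-constrained problem of my own choosing, max xy subject to x + y = 10:

```python
# Solve dL/dx = dL/dy = dL/dmu = 0 for L = f - mu*(h - c).
# Hypothetical problem: max xy subject to x + y = 10.
import sympy as sp

x, y, mu = sp.symbols('x y mu', real=True)
f, h, c = x * y, x + y, 10

L = f - mu * (h - c)
foc = [sp.diff(L, v) for v in (x, y, mu)]
print(sp.solve(foc, [x, y, mu], dict=True))   # [{x: 5, y: 5, mu: 5}]
# At the critical point, Df = (y, x) = (5, 5) = 5*(1, 1) = mu*Dh.
```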
Since m > 0, any solution other than zero must have every xi > 0.
Theorem 30.3.1 then ensures that this problem has a solution.
In many economic problems, we will make assumptions that have an impact on optimization via the Lagrangian. Here, Dh = p ≫ 0, so the NDCQ condition is satisfied.
Now we form the Lagrangian and take first order conditions. Taking ratios of the first order conditions yields
\[
\frac{x_2}{3x_1} = \frac{p_1}{p_2}, \qquad \frac{x_3}{2x_1} = \frac{p_1}{p_3}, \qquad \frac{3x_3}{2x_2} = \frac{p_2}{p_3}.
\]
The third equation is redundant, being the ratio of the other two. That
leaves us with
\[
v(p, 100) = u(x^*) = \frac{100}{(6p_1)^{1/6} (2p_2)^{1/2} (3p_3)^{1/3}}
\]
and
\[
\mu^* = \frac{u(x^*)}{100} = \frac{1}{(6p_1)^{1/6} (2p_2)^{1/2} (3p_3)^{1/3}}.
\]
◭
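As a numerical sanity check, we can hand the problem to an off-the-shelf optimizer. The sketch below assumes the utility in this example was u = x1^{1/6} x2^{1/2} x3^{1/3}, which is consistent with the first order condition ratios and the indirect utility above; the test prices are arbitrary.

```python
# Compare closed-form demands and indirect utility with a numerical
# optimizer, assuming u = x1**(1/6) * x2**(1/2) * x3**(1/3) and m = 100.
import numpy as np
from scipy.optimize import minimize

p, m = np.array([1.0, 2.0, 4.0]), 100.0        # arbitrary test prices

def neg_u(x):
    return -(x[0]**(1/6) * x[1]**(1/2) * x[2]**(1/3))

res = minimize(neg_u, x0=[10.0, 10.0, 10.0],
               constraints=({'type': 'eq', 'fun': lambda x: m - p @ x},),
               bounds=[(1e-9, None)] * 3)

x_star = np.array([m/(6*p[0]), m/(2*p[1]), m/(3*p[2])])  # shares 1/6, 1/2, 1/3
v = m / ((6*p[0])**(1/6) * (2*p[1])**(1/2) * (3*p[2])**(1/3))
print(res.x, x_star)     # numerical and closed-form demands agree
print(-res.fun, v)       # both equal v(p, 100)
```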
\[
\begin{aligned}
\max_x\ & u(x)\\
\text{s.t. } & g(x) \le b
\end{aligned}
\]
[Figure: two panels showing the constraint set g(x) ≤ b with indifference curves u1, u2, u3; at the maximum x∗, the gradients ∇u(x∗) and ∇g(x∗) are drawn.]
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & g_i(x) \le b_i.
\end{aligned}
\]
If rank Dĝ(x∗) = k̂ holds (NDCQ), then there are multipliers λ∗ such that
(a) The pair (x∗, λ∗) is a critical point of the Lagrangian:
\[
\frac{\partial L}{\partial x_i}(x^*, \lambda^*) = 0 \ \text{ for } i = 1, \dots, m,
\qquad
\frac{\partial L}{\partial \lambda_j}(x^*, \lambda^*) = g_j(x^*) - b_j = 0 \ \text{ for binding } j = 1, \dots, k,
\]
(b) The complementary slackness conditions hold:
\[
\lambda_i^* \bigl(g_i(x^*) - b_i\bigr) = 0 \ \text{ for } i = 1, \dots, k.
\]
Suppose we try to rewrite the budget constraint
\[
p_1 x_1 + p_2 x_2 = m
\]
as
\[
p_1 x_1 + p_2 x_2 \le m, \qquad -p_1 x_1 - p_2 x_2 \le -m.
\]
This doesn’t work. It runs afoul of the NDCQ. When both bind, as they must if both are obeyed, we have
\[
D\hat g = \begin{pmatrix} p_1 & p_2 \\ -p_1 & -p_2 \end{pmatrix}.
\]
This has rank 1 when it needs rank 2. In this case, the failure of constraint
qualification is minor. If you do this with Cobb-Douglas utility, you won’t
be able to uniquely determine the two multipliers. However, you can
determine their difference.
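A two-line rank computation makes the failure concrete; the prices here are made up:

```python
# Doubling the budget constraint gives Dg rank 1 where NDCQ needs rank 2.
import numpy as np

p1, p2 = 2.0, 3.0                          # made-up prices
Dg = np.array([[p1, p2], [-p1, -p2]])
print(np.linalg.matrix_rank(Dg))           # 1, so NDCQ fails
```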
\[
\begin{aligned}
\max_x\ & x_1^{1/3} x_2^{2/3}\\
\text{s.t. } & p_1 x_1 + p_2 x_2 \le m,\\
& x_1 \ge 0, \quad x_2 \ge 0,
\end{aligned}
\]
where p1, p2, m > 0.
We form the Lagrangian by first rewriting the non-negativity constraints in the proper form: −x1 ≤ 0, −x2 ≤ 0. The Lagrangian is
\[
L = x_1^{1/3} x_2^{2/3} - \lambda_0 (p_1 x_1 + p_2 x_2 - m) + \lambda_1 x_1 + \lambda_2 x_2,
\]
and the derivative of the constraint system is
\[
Dg = \begin{pmatrix} p_1 & p_2 \\ -1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
Figure 18.21.1: Without complementary slackness, we would have to check
separately for maxima seven ways: at each of the three corner points, in the
relative interior of each of the three boundary segments of the budget set, and
in the interior of the budget set.
\[
\frac{p_1}{p_2} = \frac{1}{2} \frac{x_2}{x_1},
\]
implying
\[
p_1 x_1 = \frac{1}{2} p_2 x_2.
\]
We substitute in the budget constraint to find 3p1x1 = m, so the Marshallian demands are
\[
x_1^* = \frac{m}{3p_1} \quad\text{and}\quad x_2^* = \frac{2m}{3p_2}.
\]
The multiplier on the budget constraint is
\[
\lambda_0^* = \frac{2^{2/3}}{3 p_1^{1/3} p_2^{2/3}},
\]
and the indirect utility is
\[
v(p, m) = u(x_1^*, x_2^*) = \frac{2^{2/3} m}{3 p_1^{1/3} p_2^{2/3}}.
\]
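To double-check the algebra, the sympy sketch below substitutes the demands and multiplier back into the first order conditions of the interior case; all three derivatives should simplify to zero.

```python
# Verify the demands and multiplier satisfy the interior FOCs exactly.
import sympy as sp

p1, p2, m = sp.symbols('p1 p2 m', positive=True)
x1, x2, lam = sp.symbols('x1 x2 lambda0', positive=True)

L = x1**sp.Rational(1, 3) * x2**sp.Rational(2, 3) - lam*(p1*x1 + p2*x2 - m)

star = {x1: m / (3*p1),
        x2: 2*m / (3*p2),
        lam: 2**sp.Rational(2, 3) / (3 * p1**sp.Rational(1, 3) * p2**sp.Rational(2, 3))}

for v in (x1, x2, lam):
    print(sp.simplify(sp.diff(L, v).subs(star)))   # prints 0 three times
```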
It is impossible for all four constraints to bind. If they did, we would have x1 = x2 = x3 = 0, so 0 = p·x = m > 0, a contradiction. Any collection of 1, 2, or 3 rows of Dg is linearly independent, so constraint qualification (NDCQ) is satisfied everywhere else.
\[
L = a_1 x_1 + a_2 x_2 + a_3 x_3 - \lambda_0 (p \cdot x - m) + \lambda_1 x_1 + \lambda_2 x_2 + \lambda_3 x_3.
\]
\[
\begin{aligned}
0 &= \frac{\partial L}{\partial x_1} = a_1 - \lambda_0 p_1 + \lambda_1,\\
0 &= \frac{\partial L}{\partial x_2} = a_2 - \lambda_0 p_2 + \lambda_2,\\
0 &= \frac{\partial L}{\partial x_3} = a_3 - \lambda_0 p_3 + \lambda_3.
\end{aligned}
\]
\[
\begin{aligned}
\lambda_0 p_1 &= a_1 + \lambda_1,\\
\lambda_0 p_2 &= a_2 + \lambda_2,\\
\lambda_0 p_3 &= a_3 + \lambda_3.
\end{aligned}
\]
At this point, you may be wondering how to proceed with such first
order conditions. None of the variables are in the first order equations!
None!
What can we do?
What can we possibly do??
\[
\begin{aligned}
\lambda_0 p_1 &= a_1 + \lambda_1,\\
\lambda_0 p_2 &= a_2 + \lambda_2, \qquad (18.26.3)\\
\lambda_0 p_3 &= a_3 + \lambda_3.
\end{aligned}
\]
Since λ0 > 0, complementary slackness requires p·x = m > 0: the consumer spends their entire budget.
At this point, we don’t know whether or not the maximum is in the
relative interior of the budget frontier.
Figure 18.26.1: The budget frontier is the green triangle in R3 with vertices
(m/p1 , 0, 0), (0, m/p2, 0), and (0, 0, m/p3).
Then
\[
\lambda_0 \ge \max_{i=1,2,3} \frac{a_i}{p_i}.
\]
\[
\frac{a_h}{p_h} = \max_{i=1,2,3} \frac{a_i}{p_i}.
\]
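The upshot is a simple demand rule: spend the entire budget on a good with the highest bang per buck, with λ0 equal to that maximal ratio. A small sketch with made-up numbers:

```python
# Linear utility u(x) = a . x: all income goes to a good attaining
# max a_i/p_i, and the multiplier is lambda_0 = a_h/p_h.
import numpy as np

def linear_demand(a, p, m):
    a, p = np.asarray(a, float), np.asarray(p, float)
    h = int(np.argmax(a / p))        # a good with the best bang per buck
    x = np.zeros_like(p)
    x[h] = m / p[h]                  # the whole budget goes to good h
    return x, a[h] / p[h]

print(linear_demand(a=[3, 1, 2], p=[2, 1, 4], m=100))   # ([50, 0, 0], 1.5)
```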
\[
\begin{aligned}
0 &= \frac{\partial L}{\partial x} = 1 - \lambda_0 p_x + \lambda_x,\\
0 &= \frac{\partial L}{\partial y} = \frac{1}{2\sqrt{y}} - \lambda_0 p_y + \lambda_y. \qquad (18.29.5)
\end{aligned}
\]
\[
\begin{aligned}
\lambda_0 p_x &= 1 + \lambda_x,\\
\lambda_0 p_y &= \frac{1}{2\sqrt{y}} + \lambda_y. \qquad (18.30.6)
\end{aligned}
\]
\[
\frac{p_y}{p_x} = \frac{1}{2\sqrt{y}} + \lambda_y.
\]
\[
\frac{p_x}{2\sqrt{p_y m}} = 1 + \lambda_x \ge 1.
\]
\[
\lambda_0 p_x = 1, \qquad \lambda_0 p_y = \frac{1}{2\sqrt{y}}.
\]
Then λ0 = 1/px and py/px = 1/(2√y). It follows that y = px²/(4py²). Using the budget constraint, we find px x = m − py y = m − px²/(4py). We’ve assumed x > 0, which requires m − px²/(4py) > 0. In other words, Case III requires 4py m > px².
So what happens now? Which solutions do we use?
• Case I didn’t have solutions,
• Case II only worked when px² ≥ 4py m, and
• Case III required px² < 4py m.
It all comes down to the parameter values px , py , and m.
Organizing the results by the parameter values, we obtain
\[
(x, y) =
\begin{cases}
\left(0, \dfrac{m}{p_y}\right) & \text{when } p_x^2 \ge 4 p_y m \ \text{(Case II)},\\[2ex]
\left(\dfrac{m}{p_x} - \dfrac{p_x}{4 p_y}, \ \dfrac{p_x^2}{4 p_y^2}\right) & \text{when } p_x^2 < 4 p_y m \ \text{(Case III)}.
\end{cases}
\]
If we apply the Case III formula when px² = 4py m, it agrees with Case II.
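Here is a quick numerical check of the piecewise formula against a brute-force search along the budget line. It assumes the utility in this example is u(x, y) = x + √y, consistent with the first order conditions above; the prices and income are made up.

```python
# Piecewise demand for u(x, y) = x + sqrt(y), checked by grid search.
import numpy as np

def demand(px, py, m):
    if px**2 >= 4 * py * m:                        # Case II: corner, x = 0
        return 0.0, m / py
    return m/px - px/(4*py), px**2 / (4*py**2)     # Case III: x > 0

px, py, m = 1.0, 1.0, 2.0                          # made-up parameters
ys = np.linspace(0.0, m / py, 200001)
xs = (m - py * ys) / px                            # spend the whole budget
i = np.argmax(xs + np.sqrt(ys))
print(demand(px, py, m), (xs[i], ys[i]))           # both near (1.75, 0.25)
```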
\[
\begin{aligned}
\max_{(x,y)}\ & x\\
\text{s.t. } & x^3 + y^2 = 0.
\end{aligned}
\]
The Lagrangian is
\[
L = x - \mu(x^3 + y^2).
\]
Figure 18.34.1: The graph of the constraint is illustrated in the diagram. It’s
pretty obvious from the graph that the maximum value of x on the constraint
is 0 and occurs at x∗ = (0, 0).
In a later section, we will see that this type of problem can be solved using a modified Lagrangian.
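The trouble is visible in the derivatives: at the maximizer, Dh vanishes, so rank Dh(x∗) = 0 < 1 and no µ can satisfy Df(x∗) = µDh(x∗). A quick sympy check:

```python
# At x* = (0, 0): Df = (1, 0) but Dh = (3x**2, 2y) = (0, 0), so
# Df = mu*Dh has no solution and NDCQ fails.
import sympy as sp

x, y = sp.symbols('x y')
f, h = x, x**3 + y**2
at0 = {x: 0, y: 0}
Df = [sp.diff(f, v).subs(at0) for v in (x, y)]
Dh = [sp.diff(h, v).subs(at0) for v in (x, y)]
print(Df, Dh)    # [1, 0] [0, 0]
```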
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & g_i(x) \le b_i, \quad i = 1, \dots, k,\\
& h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
Set
\[
G(x) = \begin{pmatrix} \hat g(x) \\ h(x) \end{pmatrix}.
\]
If rank DG(x∗) = k̂ + ℓ holds (NDCQ), then there are multipliers λ∗ and µ∗ such that
(a) The triplet (x∗, λ∗, µ∗) is a critical point of the Lagrangian:
\[
\begin{aligned}
\frac{\partial L}{\partial x_i}(x^*, \lambda^*, \mu^*) &= 0 && \text{for } i = 1, \dots, m,\\
\frac{\partial L}{\partial \lambda_i}(x^*, \lambda^*, \mu^*) &= g_i(x^*) - b_i = 0 && \text{for binding } i = 1, \dots, k,\\
\frac{\partial L}{\partial \mu_j}(x^*, \lambda^*, \mu^*) &= h_j(x^*) - c_j = 0 && \text{for } j = 1, \dots, \ell,
\end{aligned}
\]
(b) The complementary slackness conditions hold:
\[
\lambda_i^* \bigl(g_i(x^*) - b_i\bigr) = 0 \ \text{ for all } i = 1, \dots, k.
\]
\[
\begin{aligned}
\max_{(x,y)}\ & f(x, y) = x^2 - y^2\\
\text{s.t. } & x \ge 0, \; y \ge 0,\\
& x^2 + y^2 = 4.
\end{aligned}
\]
Here
\[
DG = \begin{pmatrix} 2x & 2y \\ -1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
Any 2 × 2 submatrix of this has rank 2 and the top row is non-zero because x2 + y2 = 4 and x, y ≥ 0. Finally, it is impossible for all three constraints to bind, implying the NDCQ condition is satisfied.
\[
L = x^2 - y^2 - \mu(x^2 + y^2 - 4) + \lambda_x x + \lambda_y y.
\]
\[
\begin{aligned}
0 &= \frac{\partial L}{\partial x} = 2x - 2x\mu + \lambda_x,\\
0 &= \frac{\partial L}{\partial y} = -2y - 2y\mu + \lambda_y,\\
0 &= \frac{\partial L}{\partial \mu} = -x^2 - y^2 + 4.
\end{aligned}
\]
The last one is the equality constraint. The solution must also obey the two complementary slackness conditions
\[
\lambda_x x = 0, \qquad \lambda_y y = 0,
\]
and the non-negativity conditions x ≥ 0, y ≥ 0, λx ≥ 0, λy ≥ 0.
Rearranging the first two first order conditions,
\[
2x + \lambda_x = 2\mu x, \qquad 2y + 2\mu y = \lambda_y.
\]
Multiplying the first by x and the second by y, complementary slackness (λx x = 0, λy y = 0) gives
\[
2x^2 = 2\mu x^2, \qquad 2y^2 + 2\mu y^2 = 0.
\]
Based on these results, (2, 0) is the maximum point and (0, 2) is the
minimum point.
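A brute-force scan over the constraint set confirms both points:

```python
# Parametrize the quarter circle x**2 + y**2 = 4, x, y >= 0, and scan f.
import numpy as np

t = np.linspace(0, np.pi / 2, 100001)
x, y = 2 * np.cos(t), 2 * np.sin(t)
f = x**2 - y**2
print(x[f.argmax()], y[f.argmax()])   # ~ (2, 0), the maximum, f = 4
print(x[f.argmin()], y[f.argmin()])   # ~ (0, 2), the minimum, f = -4
```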
Since the sign of µ doesn’t matter, this is equivalent to using the Lagrangian
\[
L = f(x) - \lambda^T \bigl(g(x) - b\bigr)
\]
where λ ≥ 0, or using
\[
L = f(x) + \lambda^T \bigl(g(x) - b\bigr)
\]
with λ ≤ 0. Yet another method is the one favored in the book, writing
the inequality constraints the opposite way, setting g′ = −g and b′ =
−b, yielding constraints g′(x) ≥ b′ instead of g(x) ≤ b. We will state the
result in this form, which is more natural when thinking about duality.
It doesn’t really matter which form of Lagrangian you use. Just be sure
not to mix them!
\[
\begin{aligned}
\min_x\ & f(x)\\
\text{s.t. } & g_i(x) \ge b_i, \quad i = 1, \dots, k,\\
& h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
If
\[
\operatorname{rank} D \begin{pmatrix} \hat g(x^*) \\ h(x^*) \end{pmatrix} = \hat k + \ell
\]
holds (NDCQ), then there are multipliers λ∗ and µ∗ such that
\[
\begin{aligned}
\frac{\partial L}{\partial x_i}(x^*, \lambda^*, \mu^*) &= 0 && \text{for } i = 1, \dots, m,\\
\frac{\partial L}{\partial \lambda_i}(x^*, \lambda^*, \mu^*) &= g_i(x^*) - b_i = 0 && \text{for binding } i = 1, \dots, k,\\
\frac{\partial L}{\partial \mu_j}(x^*, \lambda^*, \mu^*) &= h_j(x^*) - c_j = 0 && \text{for } j = 1, \dots, \ell,
\end{aligned}
\]
the complementary slackness conditions hold:
\[
\lambda_i^* \bigl(g_i(x^*) - b_i\bigr) = 0 \ \text{ for all } i = 1, \dots, k,
\]
and the multipliers on the inequality constraints are non-negative:
\[
\lambda_1^* \ge 0, \ \dots, \ \lambda_k^* \ge 0.
\]
\[
\begin{aligned}
\min_{K, L}\ & rK + wL\\
\text{s.t. } & K \ge 0, \; L \ge 0,\\
& K^{1/2} L^{1/2} \ge q.
\end{aligned}
\]
The Lagrangian is
\[
L = rK + wL - \lambda_0 \bigl(K^{1/2} L^{1/2} - q\bigr) - \lambda_K K - \lambda_L L.
\]
The first order conditions for K and L are
\[
r = \frac{\lambda_0}{2} \frac{L^{1/2}}{K^{1/2}} + \lambda_K,
\qquad
w = \frac{\lambda_0}{2} \frac{K^{1/2}}{L^{1/2}} + \lambda_L.
\]
and
\[
K^* = q \sqrt{\frac{w}{r}}.
\]
Finally, the minimum cost is
\[
c(r, w, q) = rK^* + wL^* = 2q\sqrt{rw}.
\]
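As a final sanity check, a numerical minimizer recovers the same factor demands and cost for made-up values of r, w, and q:

```python
# Verify K* = q*sqrt(w/r), L* = q*sqrt(r/w), c = 2q*sqrt(rw) numerically.
import numpy as np
from scipy.optimize import minimize

r, w, q = 3.0, 2.0, 5.0                               # made-up test values

res = minimize(lambda z: r*z[0] + w*z[1],             # cost rK + wL
               x0=[q, q],
               constraints=({'type': 'ineq',          # sqrt(K*L) >= q
                             'fun': lambda z: np.sqrt(z[0]*z[1]) - q},),
               bounds=[(1e-9, None)] * 2)

print(res.x, (q*np.sqrt(w/r), q*np.sqrt(r/w)))        # numerical vs closed form
print(res.fun, 2*q*np.sqrt(r*w))                      # both ~ 24.495
```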
December 6, 2022