Constrained Optimization I:
First Order Conditions
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & g_i(x) \le b_i, \quad i = 1, \dots, k,\\
& h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
¹ The letters “s.t.” can be read as “subject to”, “such that”, or “so that”. However one renders it, it indicates the constraint equations.
\[
\begin{aligned}
c(w, q) = \min_z\ & w \cdot z\\
\text{s.t. } & f(z) \ge q,\\
& z \ge 0.
\end{aligned}
\]
Figure 18.2.1: Three indifference curves are shown in the diagram. Indifference curve u2 is the highest indifference curve the consumer can afford. This happens at x∗, where the indifference curve is tangent to the budget line.
\[
\frac{p_1}{p_2} = \frac{\partial u/\partial x_1}{\partial u/\partial x_2}.
\]
This can also be expressed in terms of the marginal utility per dollar,
which must be the same for both goods. If not, the consumer would gain
by spending more on the good with greater per dollar value, and less on
the good with less bang for the buck.
\[
\frac{\partial u/\partial x_1}{p_1} = \frac{\partial u/\partial x_2}{p_2}.
\]
Another way to think about this is that we are lining up the tangent
spaces of both the budget constraint and the optimal indifference curve.
Both the budget line and the indifference curves are level sets of func-
tions. As such, their tangent spaces are the null spaces of their derivatives.
The tangents v to the budget line obey (p1, p2)·v = 0 and the tangents w to the optimal indifference curve obey Du(x∗)w = 0. This implies that the normal vectors ∇u(x∗) and p = (p1, p2)ᵀ must be collinear. Thus ∇u(x∗) = µp for some µ ∈ R.
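To make the collinearity concrete, here is a minimal sympy sketch. The utility, prices, and income are made up for illustration (u = √(x1x2), p = (2, 1), m = 8); none of them come from the text.

```python
# Check that grad u(x*) is collinear with p at the consumer's optimum.
# Hypothetical example: u = sqrt(x1*x2), p = (2, 1), m = 8.
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
p = (sp.Integer(2), sp.Integer(1))
m = sp.Integer(8)
u = sp.sqrt(x1 * x2)

# Cobb-Douglas with equal exponents spends half the budget on each good.
xstar = {x1: m / (2 * p[0]), x2: m / (2 * p[1])}     # x* = (2, 4)

grad = [sp.diff(u, v).subs(xstar) for v in (x1, x2)]
ratios = [g / pi for g, pi in zip(grad, p)]          # candidate mu's
print(ratios)                                        # both equal sqrt(2)/4
assert sp.simplify(ratios[0] - ratios[1]) == 0       # grad u(x*) = mu * p
```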
Figure 18.4.1: Here x∗ is the consumer’s utility maximum. At the maximum,
the vectors p and ∇u(x∗ ) are collinear, with the heavier arrow being ∇u(x∗ ).
With an equality constraint, it doesn’t matter whether the vectors point in
the same or opposite directions. It will matter for inequality constraints. The
right panel writes the budget constraint as −p·x = −m to emphasize that we
flipped the direction of p. Either way, there is a µ ∈ R with ∇u(x∗ ) = µp.
Figure 18.5.1: Here the prices have been changed to p′ so that the indifference curve u2 (blue tangent) and the budget line are no longer tangent. Because of the acute angle between v and ∇u(x∗), moves along the budget line in the direction v increase utility. Here, we can also see that utility has not only increased from u2, but is even above u3 at x∗ + v.
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
That is, we are attempting to maximize f(x) under the constraints that
h(x) = c. The key result is the Tangent Space Theorem.²
Tangent Space Theorem. Let U ⊂ R^m and f : U → R, h : R^m → R^ℓ be C¹ functions. Suppose x∗ either maximizes or minimizes f(x) over the set M = {x : h(x) = c} with Df(x∗) ≠ 0 and rank Dh(x∗) = ℓ. Then the tangent space T F of the differentiable manifold F = {x : f(x) = f(x∗)} at x∗ contains the tangent space at x∗ of the differentiable manifold M. Moreover, there are unique µ∗j, j = 1, …, ℓ with
\[
\sum_{j=1}^{\ell} \mu_j^* \, Dh_j(x^*) = Df(x^*).
\]
² The name Tangent Space Theorem is not in general use. I gave it a name since we use it a number of times.
The tangent space of M at x∗ has dimension m − rank Dh(x∗) = m − ℓ.³
³ Be careful, the notation differs. The m, ℓ, and m − ℓ on this page are the m + k, k, and m of Theorem 15.38.1. The notation is the way it is in order to highlight different things in the two sections.
⁴ See Joseph-Louis Lagrange (1762), “Essai sur une nouvelle méthode pour déterminer les maxima et minima des formules intégrales indéfinies”, Miscellanea Taurinensia II, 173–195.
Figure 18.5.1: The budget line in the diagram is T M and the tangent (blue) to the indifference curve u2 is T F. Of course, u is the objective. As you can see, movements in the direction v increase utility when the tangent spaces do not coincide.
for some c(t) ∈ ℓ(x(t), x∗), the segment from x(t) to x∗. For t small enough that x(t) ∈ Bε(x∗), we have
\[
f(x(t)) = f(x^*) + Df(c(t))\,v > f(x^*),
\]
\[
Df(x^*) = \sum_{j=1}^{\ell} \mu_j^* \, Dh_j(x^*).
\]
By NDCQ, rank Dh(x∗) = ℓ. This means the ℓ vectors Dhj are linearly
independent, implying that the µ∗j are unique.
⁵ Suppose A ⊂ B. The orthogonal complement of A is the set of all vectors perpendicular to everything in A. That is, A⊥ = {x : x·a = 0 for all a ∈ A}. Since A ⊂ B, there are additional conditions that must be met to be in B⊥. Such vectors must not only be perpendicular to all elements of A, but also to those elements of B that are not in A. So B⊥ ⊂ A⊥.
\[
L(x, \mu) = f(x) - \mu^T\bigl(h(x) - c\bigr) = f(x) - \sum_{j=1}^{\ell} \mu_j \bigl(h_j(x) - c_j\bigr).
\]
This allows us to rewrite the key conclusions of the Tangent Space Theorem as follows:⁶
⁶ Theorems such as this are sometimes referred to as the “Kuhn-Tucker Theorem” or “Karush-Kuhn-Tucker Theorem”. Lagrange multipliers date back to Lagrange in the 18th century, where they proved useful in the calculus of variations and for problems with equality constraints. William Karush (1917–1997) proved the first theorem with inequality constraints in an unpublished 1939 Chicago master’s thesis. Fritz John (1910–1994) proved another version; after being rejected by the Duke Mathematical Journal, it was published in a 1948 collection in honor of Richard Courant’s 60th birthday. Harold W. Kuhn (1925–2014) and Albert W. Tucker (1905–1995) published their version two years later, to instant acclaim. Timing is everything! See T.H. Kjeldsen (2000), “A Contextualized Historical Analysis of the Kuhn-Tucker Theorem in Nonlinear Programming: The Impact of World War II”, Historia Mathematica 27, 331–361.
That is,
\[
\frac{\partial L}{\partial x_i}(x^*, \mu^*) = 0 \ \text{ for } i = 1, \dots, m,
\qquad\text{and}\qquad
\frac{\partial L}{\partial \mu_j}(x^*, \mu^*) = 0 \ \text{ for } j = 1, \dots, \ell.
\]
\[
Df(x^*) = (\mu^*)^T Dh(x^*) = \sum_{j=1}^{\ell} \mu_j^* \, Dh_j(x^*).
\]
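In practice, these equations can be solved mechanically. Below is a small sympy sketch for a hypothetical equality-constrained problem of my own choosing, max xy subject to x + y = 10:

```python
# Solve dL/dx = dL/dy = dL/dmu = 0 for L = f - mu*(h - c).
# Hypothetical problem: max xy subject to x + y = 10.
import sympy as sp

x, y, mu = sp.symbols('x y mu', real=True)
f, h, c = x * y, x + y, 10

L = f - mu * (h - c)
foc = [sp.diff(L, v) for v in (x, y, mu)]
print(sp.solve(foc, [x, y, mu], dict=True))   # [{x: 5, y: 5, mu: 5}]
# At the critical point, Df = (y, x) = (5, 5) = 5*(1, 1) = mu*Dh.
```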
Since m > 0, any solution other than zero must have every xi > 0.
Theorem 30.3.1 then ensures that this problem has a solution.
In many economic problems, we will make assumptions that have an impact on optimization via the Lagrangian. Here, Dh = p ≫ 0, so the NDCQ condition is satisfied.
Now we form the Lagrangian and take first order conditions. Taking ratios of the first order conditions yields
\[
\frac{x_2}{3x_1} = \frac{p_1}{p_2}, \qquad \frac{x_3}{2x_1} = \frac{p_1}{p_3}, \qquad \frac{3x_3}{2x_2} = \frac{p_2}{p_3}.
\]
The third equation is redundant, being the ratio of the other two. That
leaves us with
\[
v(p, 100) = u(x^*) = \frac{100}{(6p_1)^{1/6} (2p_2)^{1/2} (3p_3)^{1/3}}
\]
and
\[
\mu^* = \frac{u(x^*)}{100} = \frac{1}{(6p_1)^{1/6} (2p_2)^{1/2} (3p_3)^{1/3}}.
\]
◭
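As a numerical sanity check, we can hand the problem to an off-the-shelf optimizer. The sketch below assumes the utility in this example was u = x1^{1/6} x2^{1/2} x3^{1/3}, which is consistent with the first order condition ratios and the indirect utility above; the test prices are arbitrary.

```python
# Compare closed-form demands and indirect utility with a numerical
# optimizer, assuming u = x1**(1/6) * x2**(1/2) * x3**(1/3) and m = 100.
import numpy as np
from scipy.optimize import minimize

p, m = np.array([1.0, 2.0, 4.0]), 100.0        # arbitrary test prices

def neg_u(x):
    return -(x[0]**(1/6) * x[1]**(1/2) * x[2]**(1/3))

res = minimize(neg_u, x0=[10.0, 10.0, 10.0],
               constraints=({'type': 'eq', 'fun': lambda x: m - p @ x},),
               bounds=[(1e-9, None)] * 3)

x_star = np.array([m/(6*p[0]), m/(2*p[1]), m/(3*p[2])])  # shares 1/6, 1/2, 1/3
v = m / ((6*p[0])**(1/6) * (2*p[1])**(1/2) * (3*p[2])**(1/3))
print(res.x, x_star)     # numerical and closed-form demands agree
print(-res.fun, v)       # both equal v(p, 100)
```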
\[
\begin{aligned}
\max_x\ & u(x)\\
\text{s.t. } & g(x) \le b
\end{aligned}
\]
[Figure: two panels showing the constraint set g(x) ≤ b with indifference curves u1, u2, u3; at the maximum x∗, the gradients ∇u(x∗) and ∇g(x∗) are drawn.]
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & g_i(x) \le b_i.
\end{aligned}
\]
If rank Dĝ(x∗) = k̂ holds (NDCQ), then there are multipliers λ∗ such that
(a) The pair (x∗, λ∗) is a critical point of the Lagrangian:
\[
\frac{\partial L}{\partial x_i}(x^*, \lambda^*) = 0 \ \text{ for } i = 1, \dots, m,
\qquad
\frac{\partial L}{\partial \lambda_j}(x^*, \lambda^*) = g_j(x^*) - b_j = 0 \ \text{ for binding } j = 1, \dots, k,
\]
(b) The complementary slackness conditions hold:
\[
\lambda_i^* \bigl(g_i(x^*) - b_i\bigr) = 0 \ \text{ for } i = 1, \dots, k.
\]
Suppose we try to rewrite the budget constraint
\[
p_1 x_1 + p_2 x_2 = m
\]
as
\[
p_1 x_1 + p_2 x_2 \le m, \qquad -p_1 x_1 - p_2 x_2 \le -m.
\]
This doesn’t work. It runs afoul of the NDCQ. When both bind, as they must if both are obeyed, we have
\[
D\hat g = \begin{pmatrix} p_1 & p_2 \\ -p_1 & -p_2 \end{pmatrix}.
\]
This has rank 1 when it needs rank 2. In this case, the failure of constraint
qualification is minor. If you do this with Cobb-Douglas utility, you won’t
be able to uniquely determine the two multipliers. However, you can
determine their difference.
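A two-line rank computation makes the failure concrete; the prices here are made up:

```python
# Doubling the budget constraint gives Dg rank 1 where NDCQ needs rank 2.
import numpy as np

p1, p2 = 2.0, 3.0                          # made-up prices
Dg = np.array([[p1, p2], [-p1, -p2]])
print(np.linalg.matrix_rank(Dg))           # 1, so NDCQ fails
```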
\[
\begin{aligned}
\max_x\ & x_1^{1/3} x_2^{2/3}\\
\text{s.t. } & p_1 x_1 + p_2 x_2 \le m,\\
& x_1 \ge 0, \quad x_2 \ge 0,
\end{aligned}
\]
where p1, p2, m > 0.
We form the Lagrangian by first rewriting the non-negativity constraints in the proper form: −x1 ≤ 0, −x2 ≤ 0. The Lagrangian is
\[
L = x_1^{1/3} x_2^{2/3} - \lambda_0 (p_1 x_1 + p_2 x_2 - m) + \lambda_1 x_1 + \lambda_2 x_2,
\]
and the derivative of the constraint system is
\[
Dg = \begin{pmatrix} p_1 & p_2 \\ -1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
Figure 18.21.1: Without complementary slackness, we would have to check
separately for maxima seven ways: at each of the three corner points, in the
relative interior of each of the three boundary segments of the budget set, and
in the interior of the budget set.
\[
\frac{p_1}{p_2} = \frac{1}{2} \frac{x_2}{x_1},
\]
implying
\[
p_1 x_1 = \frac{1}{2} p_2 x_2.
\]
We substitute in the budget constraint to find 3p1x1 = m, so the Marshallian demands are
\[
x_1^* = \frac{m}{3p_1} \quad\text{and}\quad x_2^* = \frac{2m}{3p_2}.
\]
The multiplier on the budget constraint is
\[
\lambda_0^* = \frac{2^{2/3}}{3 p_1^{1/3} p_2^{2/3}},
\]
and the indirect utility is
\[
v(p, m) = u(x_1^*, x_2^*) = \frac{2^{2/3} m}{3 p_1^{1/3} p_2^{2/3}}.
\]
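To double-check the algebra, the sympy sketch below substitutes the demands and multiplier back into the first order conditions of the interior case; all three derivatives should simplify to zero.

```python
# Verify the demands and multiplier satisfy the interior FOCs exactly.
import sympy as sp

p1, p2, m = sp.symbols('p1 p2 m', positive=True)
x1, x2, lam = sp.symbols('x1 x2 lambda0', positive=True)

L = x1**sp.Rational(1, 3) * x2**sp.Rational(2, 3) - lam*(p1*x1 + p2*x2 - m)

star = {x1: m / (3*p1),
        x2: 2*m / (3*p2),
        lam: 2**sp.Rational(2, 3) / (3 * p1**sp.Rational(1, 3) * p2**sp.Rational(2, 3))}

for v in (x1, x2, lam):
    print(sp.simplify(sp.diff(L, v).subs(star)))   # prints 0 three times
```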
It is impossible for all four constraints to bind. If they did, we would have x1 = x2 = x3 = 0, so 0 = p·x = m > 0, a contradiction. Any collection of 1, 2, or 3 rows of Dg is linearly independent, so constraint qualification (NDCQ) is satisfied everywhere else.
\[
L = a_1 x_1 + a_2 x_2 + a_3 x_3 - \lambda_0 (p \cdot x - m) + \lambda_1 x_1 + \lambda_2 x_2 + \lambda_3 x_3.
\]
\[
\begin{aligned}
0 &= \frac{\partial L}{\partial x_1} = a_1 - \lambda_0 p_1 + \lambda_1,\\
0 &= \frac{\partial L}{\partial x_2} = a_2 - \lambda_0 p_2 + \lambda_2,\\
0 &= \frac{\partial L}{\partial x_3} = a_3 - \lambda_0 p_3 + \lambda_3.
\end{aligned}
\]
\[
\begin{aligned}
\lambda_0 p_1 &= a_1 + \lambda_1,\\
\lambda_0 p_2 &= a_2 + \lambda_2,\\
\lambda_0 p_3 &= a_3 + \lambda_3.
\end{aligned}
\]
At this point, you may be wondering how to proceed with such first
order conditions. None of the variables are in the first order equations!
None!
What can we do?
What can we possibly do??
\[
\begin{aligned}
\lambda_0 p_1 &= a_1 + \lambda_1,\\
\lambda_0 p_2 &= a_2 + \lambda_2, \qquad (18.26.3)\\
\lambda_0 p_3 &= a_3 + \lambda_3.
\end{aligned}
\]
Since λ0 > 0, complementary slackness requires p·x = m > 0: the consumer spends their entire budget.
At this point, we don’t know whether or not the maximum is in the
relative interior of the budget frontier.
Figure 18.26.1: The budget frontier is the green triangle in R3 with vertices
(m/p1 , 0, 0), (0, m/p2, 0), and (0, 0, m/p3).
Then
\[
\lambda_0 \ge \max_{i=1,2,3} \frac{a_i}{p_i}.
\]
\[
\frac{a_h}{p_h} = \max_{i=1,2,3} \frac{a_i}{p_i}.
\]
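The upshot is a simple demand rule: spend the entire budget on a good with the highest bang per buck, with λ0 equal to that maximal ratio. A small sketch with made-up numbers:

```python
# Linear utility u(x) = a . x: all income goes to a good attaining
# max a_i/p_i, and the multiplier is lambda_0 = a_h/p_h.
import numpy as np

def linear_demand(a, p, m):
    a, p = np.asarray(a, float), np.asarray(p, float)
    h = int(np.argmax(a / p))        # a good with the best bang per buck
    x = np.zeros_like(p)
    x[h] = m / p[h]                  # the whole budget goes to good h
    return x, a[h] / p[h]

print(linear_demand(a=[3, 1, 2], p=[2, 1, 4], m=100))   # ([50, 0, 0], 1.5)
```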
\[
\begin{aligned}
0 &= \frac{\partial L}{\partial x} = 1 - \lambda_0 p_x + \lambda_x,\\
0 &= \frac{\partial L}{\partial y} = \frac{1}{2\sqrt{y}} - \lambda_0 p_y + \lambda_y. \qquad (18.29.5)
\end{aligned}
\]
\[
\begin{aligned}
\lambda_0 p_x &= 1 + \lambda_x,\\
\lambda_0 p_y &= \frac{1}{2\sqrt{y}} + \lambda_y. \qquad (18.30.6)
\end{aligned}
\]
\[
\frac{p_y}{p_x} = \frac{1}{2\sqrt{y}} + \lambda_y.
\]
\[
\frac{p_x}{2\sqrt{p_y m}} = 1 + \lambda_x \ge 1.
\]
\[
\lambda_0 p_x = 1, \qquad \lambda_0 p_y = \frac{1}{2\sqrt{y}}.
\]
Then λ0 = 1/px and py/px = 1/(2√y). It follows that y = px²/(4py²). Using the budget constraint, we find px x = m − py y = m − px²/(4py). We’ve assumed x > 0, which requires m − px²/(4py) > 0. In other words, Case III requires 4py m > px².
So what happens now? Which solutions do we use?
• Case I didn’t have solutions,
• Case II only worked when px² ≥ 4py m, and
• Case III required px² < 4py m.
It all comes down to the parameter values px , py , and m.
Organizing the results by the parameter values, we obtain
\[
(x, y) =
\begin{cases}
\left(0, \dfrac{m}{p_y}\right) & \text{when } p_x^2 \ge 4 p_y m \ \text{(Case II)},\\[2ex]
\left(\dfrac{m}{p_x} - \dfrac{p_x}{4 p_y}, \ \dfrac{p_x^2}{4 p_y^2}\right) & \text{when } p_x^2 < 4 p_y m \ \text{(Case III)}.
\end{cases}
\]
If we apply the Case III formula when px² = 4py m, it agrees with Case II.
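Here is a quick numerical check of the piecewise formula against a brute-force search along the budget line. It assumes the utility in this example is u(x, y) = x + √y, consistent with the first order conditions above; the prices and income are made up.

```python
# Piecewise demand for u(x, y) = x + sqrt(y), checked by grid search.
import numpy as np

def demand(px, py, m):
    if px**2 >= 4 * py * m:                        # Case II: corner, x = 0
        return 0.0, m / py
    return m/px - px/(4*py), px**2 / (4*py**2)     # Case III: x > 0

px, py, m = 1.0, 1.0, 2.0                          # made-up parameters
ys = np.linspace(0.0, m / py, 200001)
xs = (m - py * ys) / px                            # spend the whole budget
i = np.argmax(xs + np.sqrt(ys))
print(demand(px, py, m), (xs[i], ys[i]))           # both near (1.75, 0.25)
```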
\[
\begin{aligned}
\max_{(x,y)}\ & x\\
\text{s.t. } & x^3 + y^2 = 0.
\end{aligned}
\]
The Lagrangian is
\[
L = x - \mu(x^3 + y^2).
\]
Figure 18.34.1: The graph of the constraint is illustrated in the diagram. It’s
pretty obvious from the graph that the maximum value of x on the constraint
is 0 and occurs at x∗ = (0, 0).
In a later section, we will see that this type of problem can be solved using a modified Lagrangian.
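The trouble is visible in the derivatives: at the maximizer, Dh vanishes, so rank Dh(x∗) = 0 < 1 and no µ can satisfy Df(x∗) = µDh(x∗). A quick sympy check:

```python
# At x* = (0, 0): Df = (1, 0) but Dh = (3x**2, 2y) = (0, 0), so
# Df = mu*Dh has no solution and NDCQ fails.
import sympy as sp

x, y = sp.symbols('x y')
f, h = x, x**3 + y**2
at0 = {x: 0, y: 0}
Df = [sp.diff(f, v).subs(at0) for v in (x, y)]
Dh = [sp.diff(h, v).subs(at0) for v in (x, y)]
print(Df, Dh)    # [1, 0] [0, 0]
```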
\[
\begin{aligned}
\max_x\ & f(x)\\
\text{s.t. } & g_i(x) \le b_i, \quad i = 1, \dots, k,\\
& h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
Set
\[
G(x) = \begin{pmatrix} \hat g(x) \\ h(x) \end{pmatrix}.
\]
If rank DG(x∗) = k̂ + ℓ holds (NDCQ), then there are multipliers λ∗ and µ∗ such that
(a) The triplet (x∗, λ∗, µ∗) is a critical point of the Lagrangian:
\[
\begin{aligned}
\frac{\partial L}{\partial x_i}(x^*, \lambda^*, \mu^*) &= 0 && \text{for } i = 1, \dots, m,\\
\frac{\partial L}{\partial \lambda_i}(x^*, \lambda^*, \mu^*) &= g_i(x^*) - b_i = 0 && \text{for binding } i = 1, \dots, k,\\
\frac{\partial L}{\partial \mu_j}(x^*, \lambda^*, \mu^*) &= h_j(x^*) - c_j = 0 && \text{for } j = 1, \dots, \ell,
\end{aligned}
\]
(b) The complementary slackness conditions hold:
\[
\lambda_i^* \bigl(g_i(x^*) - b_i\bigr) = 0 \ \text{ for all } i = 1, \dots, k.
\]
\[
\begin{aligned}
\max_{(x,y)}\ & f(x, y) = x^2 - y^2\\
\text{s.t. } & x \ge 0, \; y \ge 0,\\
& x^2 + y^2 = 4.
\end{aligned}
\]
Here
\[
DG = \begin{pmatrix} 2x & 2y \\ -1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
Any 2 × 2 submatrix of this has rank 2 and the top row is non-zero because x2 + y2 = 4 and x, y ≥ 0. Finally, it is impossible for all three constraints to bind, implying the NDCQ condition is satisfied.
\[
L = x^2 - y^2 - \mu(x^2 + y^2 - 4) + \lambda_x x + \lambda_y y.
\]
\[
\begin{aligned}
0 &= \frac{\partial L}{\partial x} = 2x - 2x\mu + \lambda_x,\\
0 &= \frac{\partial L}{\partial y} = -2y - 2y\mu + \lambda_y,\\
0 &= \frac{\partial L}{\partial \mu} = -x^2 - y^2 + 4.
\end{aligned}
\]
The last one is the equality constraint. The solution must also obey the two complementary slackness conditions
\[
\lambda_x x = 0, \qquad \lambda_y y = 0,
\]
and the non-negativity conditions x ≥ 0, y ≥ 0, λx ≥ 0, λy ≥ 0.
Rearranging the first two first order conditions,
\[
2x + \lambda_x = 2\mu x, \qquad 2y + 2\mu y = \lambda_y.
\]
Multiplying the first by x and the second by y, complementary slackness (λx x = 0, λy y = 0) gives
\[
2x^2 = 2\mu x^2, \qquad 2y^2 + 2\mu y^2 = 0.
\]
Based on these results, (2, 0) is the maximum point and (0, 2) is the
minimum point.
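A brute-force scan over the constraint set confirms both points:

```python
# Parametrize the quarter circle x**2 + y**2 = 4, x, y >= 0, and scan f.
import numpy as np

t = np.linspace(0, np.pi / 2, 100001)
x, y = 2 * np.cos(t), 2 * np.sin(t)
f = x**2 - y**2
print(x[f.argmax()], y[f.argmax()])   # ~ (2, 0), the maximum, f = 4
print(x[f.argmin()], y[f.argmin()])   # ~ (0, 2), the minimum, f = -4
```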
Since the sign of µ doesn’t matter, this is equivalent to using the Lagrangian
\[
L = f(x) - \lambda^T \bigl(g(x) - b\bigr)
\]
where λ ≥ 0, or using
\[
L = f(x) + \lambda^T \bigl(g(x) - b\bigr)
\]
with λ ≤ 0. Yet another method is the one favored in the book, writing
the inequality constraints the opposite way, setting g′ = −g and b′ =
−b, yielding constraints g′(x) ≥ b′ instead of g(x) ≤ b. We will state the
result in this form, which is more natural when thinking about duality.
It doesn’t really matter which form of Lagrangian you use. Just be sure
not to mix them!
\[
\begin{aligned}
\min_x\ & f(x)\\
\text{s.t. } & g_i(x) \ge b_i, \quad i = 1, \dots, k,\\
& h_j(x) = c_j, \quad j = 1, \dots, \ell.
\end{aligned}
\]
If
\[
\operatorname{rank} D \begin{pmatrix} \hat g(x^*) \\ h(x^*) \end{pmatrix} = \hat k + \ell
\]
holds (NDCQ), then there are multipliers λ∗ and µ∗ such that
\[
\begin{aligned}
\frac{\partial L}{\partial x_i}(x^*, \lambda^*, \mu^*) &= 0 && \text{for } i = 1, \dots, m,\\
\frac{\partial L}{\partial \lambda_i}(x^*, \lambda^*, \mu^*) &= g_i(x^*) - b_i = 0 && \text{for binding } i = 1, \dots, k,\\
\frac{\partial L}{\partial \mu_j}(x^*, \lambda^*, \mu^*) &= h_j(x^*) - c_j = 0 && \text{for } j = 1, \dots, \ell,
\end{aligned}
\]
the complementary slackness conditions hold:
\[
\lambda_i^* \bigl(g_i(x^*) - b_i\bigr) = 0 \ \text{ for all } i = 1, \dots, k,
\]
and the multipliers on the inequality constraints are non-negative:
\[
\lambda_1^* \ge 0, \ \dots, \ \lambda_k^* \ge 0.
\]
\[
\begin{aligned}
\min_{K, L}\ & rK + wL\\
\text{s.t. } & K \ge 0, \; L \ge 0,\\
& K^{1/2} L^{1/2} \ge q.
\end{aligned}
\]
The Lagrangian is
\[
L = rK + wL - \lambda_0 \bigl(K^{1/2} L^{1/2} - q\bigr) - \lambda_K K - \lambda_L L.
\]
The first order conditions for K and L are
\[
r = \frac{\lambda_0}{2} \frac{L^{1/2}}{K^{1/2}} + \lambda_K,
\qquad
w = \frac{\lambda_0}{2} \frac{K^{1/2}}{L^{1/2}} + \lambda_L.
\]
and
\[
K^* = q \sqrt{\frac{w}{r}}.
\]
Finally, the minimum cost is
\[
c(r, w, q) = rK^* + wL^* = 2q\sqrt{rw}.
\]
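As a final sanity check, a numerical minimizer recovers the same factor demands and cost for made-up values of r, w, and q:

```python
# Verify K* = q*sqrt(w/r), L* = q*sqrt(r/w), c = 2q*sqrt(rw) numerically.
import numpy as np
from scipy.optimize import minimize

r, w, q = 3.0, 2.0, 5.0                               # made-up test values

res = minimize(lambda z: r*z[0] + w*z[1],             # cost rK + wL
               x0=[q, q],
               constraints=({'type': 'ineq',          # sqrt(K*L) >= q
                             'fun': lambda z: np.sqrt(z[0]*z[1]) - q},),
               bounds=[(1e-9, None)] * 2)

print(res.x, (q*np.sqrt(w/r), q*np.sqrt(r/w)))        # numerical vs closed form
print(res.fun, 2*q*np.sqrt(r*w))                      # both ~ 24.495
```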
December 6, 2022