
Constrained Optimization

Joshua Wilde, revised by Isabel Tecu, Takeshi Suzuki and María José Boccardi

August 13, 2013

1 General Problem

Consider the following general constrained optimization problem:


max_{x ∈ Rⁿ} f(x1, . . . , xn) subject to:

g1 (x1 , . . . , xn ) ≤ b1 , . . . , gk (x1 , . . . , xn ) ≤ bk ,
h1 (x1 , . . . , xn ) = c1 , . . . , hm (x1 , . . . , xn ) = cm .
The function f(x) is called the objective function, g(x) is called an inequality constraint, and
h(x) is called an equality constraint. In the above problem there are k inequality constraints and
m equality constraints. In the following we will always assume that f, g and h are C¹ functions, i.e.
that they are differentiable and their derivatives are continuous.
Notice that this problem differs from the regular unconstrained optimization problem in that instead
of finding the maximum of f(x), we are finding the maximum of f(x) only over the points which
satisfy the constraints.
Example: Maximize f(x) = x² subject to 0 ≤ x ≤ 1.
Solution: We know that f(x) is strictly monotonically increasing over the domain, therefore the
maximum (if it exists) must lie at the largest number in the domain. Since we are optimizing over a
compact set, the point x = 1 is the largest number in the domain, and therefore it is the maximum.

This problem was easy because we could visualize the graph of f(x) in our minds and see that it
was strictly monotonically increasing over the domain. However, we now seek a method to find constrained
maxima of functions even when we can't picture them in our minds.

2 Equality Constraints

2.1 One Constraint

Consider a simple optimization problem with only one constraint:


max_{x ∈ Rⁿ} f(x1, . . . , xn) subject to:

h(x1 , . . . , xn ) = c.
Now draw level sets of the function f(x1, . . . , xn). Since we might not be able to achieve the
unconstrained maxima of the function due to our constraint, we seek to find the value of x which gets
us onto the highest level curve of f(x) while remaining on the constraint set h(x) = c. Notice also that at
such a point the curve h(x) = c will be just tangent to a level curve of f(x).
Call the point which maximizes the optimization problem x∗ (also referred to as the maximizer).
Since at x∗ the level curve of f(x) is tangent to the curve h(x) = c, it must also be the case that the
gradient of f(x∗) points in the same direction as the gradient of h(x∗), or
∇f (x∗ ) = λ∇h(x∗ ),

where λ is merely a constant.


Now define the Lagrangian the following way:
L(x1, . . . , xn, λ) = f(x1, . . . , xn) − λ [h(x1, . . . , xn) − c]

Then setting the partial derivatives of this function with respect to x equal to zero will yield the first
order conditions for a constrained maximum:
∇f (x∗ ) − λ∇h(x∗ ) = 0.

Setting the partial derivative with respect to λ equal to zero gives us our original constraint back:
h(x∗ ) − c = 0.

So the first order conditions for this problem are simply ∇L(x, λ) = 0.

Example: Maximize f(x1, x2) = x1x2 subject to h(x1, x2) ≡ x1 + 4x2 = 16.

Solution: Form the Lagrangian

L(x1, x2, λ) = x1x2 − λ (x1 + 4x2 − 16)

The first order conditions are


∂L/∂x1 = x2 − λ = 0
∂L/∂x2 = x1 − 4λ = 0
∂L/∂λ = −(x1 + 4x2 − 16) = 0

From the first two equations we have
x1 = 4x2.
Plugging this into the last equation we have that x2 = 2, which in turn implies that x1 = 8, and that
λ = 2. Thus the only candidate for the solution is (x1, x2, λ) = (8, 2, 2).
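As a quick check of this computation, here is a small sympy sketch (our own addition, not part of the original notes) that solves the same first order conditions symbolically:

import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
L = x1*x2 - lam*(x1 + 4*x2 - 16)                     # the Lagrangian formed above
foc = [sp.diff(L, v) for v in (x1, x2, lam)]          # first order conditions
print(sp.solve(foc, [x1, x2, lam], dict=True))        # -> [{x1: 8, x2: 2, lam: 2}]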

Remember that points obtained using this formula may or may not be a maximum or minimum, since
the first order conditions are only necessary conditions. They only give us candidate solutions.
There is another more subtle way that this process may fail, however. Consider the case where
∇h(x∗ ) = 0, or in other words, the point which maximizes f (x) is also a critical point of h(x).
Remember our necessary condition for a maximum
∇f (x∗ ) = λ∇h(x∗ ),

Since ∇h(x∗) = 0, this implies that ∇f(x∗) = 0. However, this is the necessary condition for an
unconstrained optimization problem, not a constrained one! In effect, when ∇h(x∗) = 0, the constraint
is no longer taken into account in the problem, and therefore we arrive at the wrong solution.
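To see this failure concretely, here is a small sympy sketch; the specific problem (maximize f(x, y) = x subject to h(x, y) ≡ x³ + y² = 0) is our own illustration and does not appear in the notes. On the constraint set x³ = −y² ≤ 0, so the maximizer is (0, 0); but ∇h(0, 0) = (0, 0), and the Lagrange conditions have no solution at all:

import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x
h = x**3 + y**2                                  # constraint h(x, y) = 0, with c = 0

# Lagrange conditions: grad f = lam * grad h, together with the constraint
eqs = [sp.Eq(sp.diff(f, x), lam * sp.diff(h, x)),
       sp.Eq(sp.diff(f, y), lam * sp.diff(h, y)),
       sp.Eq(h, 0)]
print(sp.solve(eqs, [x, y, lam], dict=True))     # -> []  no candidates, yet a maximizer exists at (0, 0)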

2.2 Many Constraints

Consider the problem

max_{x ∈ Rⁿ} f(x1, . . . , xn) subject to:

h1 (x1 , . . . , xn ) = c1 , . . . , hm (x1 , . . . , xn ) = cm .
Let's first talk about how the Lagrangian approach might fail. As we saw for one constraint, if
∇h(x∗) = 0, then the constraint drops out of the equation. Now consider the Jacobian matrix, the
matrix whose rows are the gradients of the different hi(x∗):
Dh(x∗) = [ ∇h1(x∗) ; . . . ; ∇hm(x∗) ],   i.e. the m × n matrix whose (i, j) entry is ∂hi(x∗)/∂xj.

Notice that if any of the ∇hi(x∗) is zero, then that constraint will not be taken into account in
the analysis. Also, there will then be a row of zeros in the Jacobian, and therefore the Jacobian will not
be of full rank. The generalization of the condition ∇h(x∗) ≠ 0 from the case m = 1 is that
the Jacobian matrix must be of rank m. Otherwise, one of the constraints is not being taken into
account, and the analysis fails. We call this condition the non-degenerate constraint qualification
(NDCQ).
Note that we only have to check whether the NDCQ holds at points in the constraint set, since points
outside the constraint set are not solution candidates anyway. If we test for the NDCQ and find that
it is violated at some point within our constraint set, we have to add this point to our
candidate solution set. The Lagrangian technique simply does not give us any information about this
point.
The Lagrangian for the multi-constraint optimization problem is

L(x1, . . . , xn, λ1, . . . , λm) = f(x1, . . . , xn) − Σ_{i=1}^{m} λi [hi(x1, . . . , xn) − ci]

And therefore the necessary conditions for a maximum are

∂L/∂x1 = 0, . . . , ∂L/∂xn = 0
∂L/∂λ1 = 0, . . . , ∂L/∂λm = 0.

Example: Maximize f(x, y, z) = xyz subject to h1(x, y, z) ≡ x² + y² = 1 and h2(x, y, z) ≡ x + z = 1.

First let us form the Jacobian matrix

Dh(x, y, z) = [ 2x  2y  0 ; 1  0  1 ]   (a 2 × 3 matrix whose rows are ∇h1 and ∇h2).

Notice that the rank is 1 if and only if x = y = 0. However, if this is the case, then our first
constraint fails to hold. Therefore, the rank is 2 for all points in the constraint set, and so we don't
need to worry about the NDCQ.
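A quick sympy check of this rank argument (our own sketch, not part of the notes): the Jacobian drops rank only at x = y = 0, which lies outside the constraint set.

import sympy as sp

x, y = sp.symbols('x y', real=True)
Dh = sp.Matrix([[2*x, 2*y, 0],
                [1,   0,   1]])
print(Dh.subs({x: 0, y: 0}).rank())   # -> 1 : rank drops only at x = y = 0 (infeasible)
print(Dh.subs({x: 1, y: 0}).rank())   # -> 2 : e.g. at the feasible point (1, 0, 0)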


Form the Lagrangian
L(x, y, z, λ1, λ2) = xyz − λ1 (x² + y² − 1) − λ2 (x + z − 1).


The first order conditions are


∂L/∂x = yz − 2λ1x − λ2 = 0
∂L/∂y = xz − 2λ1y = 0
∂L/∂z = xy − λ2 = 0
∂L/∂λ1 = −(x² + y² − 1) = 0
∂L/∂λ2 = −(x + z − 1) = 0
In order to solve for λ1 and λ2 we divide by y, so we have to assume y ≠ 0. We will treat the case
y = 0 later. Assuming y ≠ 0,

λ1 = xz/(2y),   λ2 = xy.
Plug these into the first equation to get

yz − x²z/y − xy = 0

Multiply both sides by y ≠ 0:

y²z − x²z − xy² = 0
Now we want to solve the two constraints for y and z in terms of x, plug them into this equation, and
get one equation in terms of x:

(1 − x²)(1 − x) − x²(1 − x) − x(1 − x²) = 0
(1 − x) [(1 − x²) − x² − x(1 + x)] = 0
(1 − x) (1 − 3x² − x) = 0

This yields x ∈ {1, (−1 + √13)/6, (−1 − √13)/6}.
Let's analyze x = 1 first. From the second constraint we have that z = 0, and from the first constraint
we have that y = 0. That contradicts our assumption that y ≠ 0, so this cannot be a solution.
Plugging in the other values, we get four candidate solutions:

x = 0.4343, y = ±0.9008, z = 0.5657
x = −0.7676, y = ±0.6409, z = 1.7676


Finally, let's look at the case y = 0. Then either x = 1 and z = 0 or x = −1 and z = 2, from the
equality constraints. In the first case, all the other first order conditions hold as well, so (1, 0, 0) is
another candidate solution. In the second case, the second FOC gives xz = −2 ≠ 0, and therefore this
point cannot be a candidate solution.
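As a quick numerical sanity check (our own addition, not part of the original notes), one can confirm that the five candidates satisfy both constraints and compare the objective values; since the constraint set is closed and bounded, a maximum exists and is attained at the candidate with the largest value of f.

import numpy as np

# the five candidate solutions found above
candidates = [( 0.4343,  0.9008, 0.5657),
              ( 0.4343, -0.9008, 0.5657),
              (-0.7676,  0.6409, 1.7676),
              (-0.7676, -0.6409, 1.7676),
              ( 1.0,     0.0,    0.0)]

for x, y, z in candidates:
    f  = x * y * z                # objective value
    h1 = x**2 + y**2 - 1          # should be approximately 0
    h2 = x + z - 1                # should be approximately 0
    print(f"f = {f: .4f}   h1 = {h1: .4f}   h2 = {h2: .4f}")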

3 Inequality Constraints

3.1 One Constraint

Consider the problem


max_{x ∈ Rⁿ} f(x1, . . . , xn) subject to:

g(x1 , . . . , xn ) ≤ b.
In this case the solution is not constrained to the curve g(x) = b, but merely bounded by it.
In order to understand the new conditions, imagine the graph of the level sets which we talked about
before. Instead of being constrained to the curve g(x) = b, the domain is now bounded by it instead.
The boundary of the constraint set, however, is still the curve g(x) = b. Notice that there may still be a point
where this boundary is tangent to some level set of f(x). The question now is whether the constraint
is binding or not binding.
Remember that we are trying to find candidate points for a global maximum. Restricting our original
domain X to the set of points where g(x) ≤ b gives us two types of candidate points: points on the
boundary g(x) = b where g(x) = b is tangent to the level curves of f(x), and local maxima of f(x) for
which g(x) ≤ b. The first type we can find by the constrained FOC ∇f(x) = λ∇g(x), and the second
type we can find by the unconstrained FOC ∇f(x) = 0. Let's look at each of these in turn.

Case 1: Candidates along the boundary (constraint binding). This is the case where an
unconstrained maximum lies outside of the constraint set. In other words, the inequality constrains
us from reaching a maximum of f. In this case, the gradient of f(x) points in the direction of steepest
increase of f. The gradient of g(x) points toward the set where g(x) ≥ b (since it points in the direction
of increasing g(x)). Therefore ∇g(x) points in the same direction as ∇f(x), which implies that
λ ≥ 0. So necessary conditions for a solution on the boundary are

∇f (x) − λ∇g(x) = 0 and λ ≥ 0.

Case 2: Candidates not on the boundary (constraint not binding). This is the case where
an unconstrained maximum lies inside the constraint set. In other words, the inequality does not
constrain us from reaching this maximum of f. The first order condition is simply ∇f(x) = 0, which
we can rewrite (to take the same shape as above) as
∇f (x) − λ∇g(x) = 0 and λ = 0.

In summary, either the constraint is binding (tight), that is g(x) − b = 0 and λ ≥ 0, or the constraint
is non-binding (slack), and then λ = 0. We can summarize this new set of conditions in what is called
the complementary slackness condition
[g(x) − b] λ = 0.

This works because if the constraint is binding, then g(x) − b = 0, and if the constraint is not binding,
then we want to ignore it by having λ = 0.
As we can see, whether or not the constraint binds, the Lagrange multiplier must always be greater
than or equal to 0. Therefore, another new set of conditions is
λi ≥ 0 ∀ i.

Finally, we have to include our original inequality constraint g(x) ≤ b.



We can summarize all these FOC in terms of the Lagrangian, which we define as before to be L(x, λ) =
f(x) − λ(g(x) − b):

∂L(x, λ)/∂xi = ∂f(x)/∂xi − λ ∂g(x)/∂xi = 0   for all i = 1, . . . , n
λ [g(x) − b] = 0   (complementary slackness)
g(x) − b ≤ 0   (original constraint)
λ ≥ 0
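To see these conditions at work on a concrete problem, here is a small sympy sketch; the objective and constraint (maximize f(x, y) = −(x − 2)² − (y − 2)² subject to x + y ≤ 2) are our own illustration and do not appear in the notes. We check the binding and the slack case separately.

import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = -(x - 2)**2 - (y - 2)**2
g, b = x + y, 2
L = f - lam*(g - b)

# Case 1: constraint binding (g = b), and we need lam >= 0
binding = sp.solve([sp.diff(L, x), sp.diff(L, y), sp.Eq(g, b)], [x, y, lam], dict=True)
print(binding)   # -> [{x: 1, y: 1, lam: 2}]; lam = 2 >= 0, so (1, 1) is a candidate

# Case 2: constraint slack (lam = 0), unconstrained FOC
slack = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(slack)     # -> [{x: 2, y: 2}]; but there g = 4 > b, infeasible, so discard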

3.2 Many Constraints

Consider the problem:

max_{x ∈ Rⁿ} f(x1, . . . , xn) subject to:

g1 (x1 , . . . , xn ) ≤ b1 , . . . , gk (x1 , . . . , xn ) ≤ bk .
In order to understand the new NDCQ, we must realize that if a constraint does not bind, we don't
care if it drops out of the equation. The point of the NDCQ was to ensure that binding constraints do
not drop out. Therefore, the NDCQ with inequality constraints is the same as with equality constraints,
except that we only require the Jacobian matrix of the binding constraints, i.e. the constraints with
λi > 0, to have full rank. (Notice that we cannot tell in advance which constraints will
be binding, so we need to check all of them when we check the NDCQ before computing the solution
candidates.)
The following first order conditions will tell us the candidate points for a maximum:

∂L/∂x1 = 0, . . . , ∂L/∂xn = 0

[g1 (x1 , . . . , xn ) − b1 ] λ1 = 0, . . . , [gk (x1 , . . . , xn ) − bk ] λk = 0

g1 (x1 , . . . , xn ) ≤ b1 , . . . , gk (x1 , . . . , xn ) ≤ bk

λ1 ≥ 0, . . . , λk ≥ 0
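One way to organize the search implied by these conditions is to enumerate which constraints are assumed to bind, solve the resulting equality-constrained system, and keep only solutions that are feasible and have non-negative multipliers. The sketch below does this with sympy for a small made-up problem (maximize f = x + 2y subject to x + y ≤ 2 and y ≤ 1); both the problem and the helper code are our own illustration, not part of the notes.

import itertools
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x + 2*y
gs = [(x + y, 2), (y, 1)]                          # constraints g_i(x) <= b_i
lams = sp.symbols('l1 l2', real=True)

candidates = []
# try every subset of constraints as the binding set
# (this simple sketch assumes each binding set yields isolated solutions)
for r in range(len(gs) + 1):
    for binding in itertools.combinations(range(len(gs)), r):
        L = f - sum(lams[i]*(gs[i][0] - gs[i][1]) for i in binding)
        eqs = [sp.diff(L, v) for v in (x, y)]
        eqs += [sp.Eq(gs[i][0], gs[i][1]) for i in binding]
        unknowns = [x, y] + [lams[i] for i in binding]
        for sol in sp.solve(eqs, unknowns, dict=True):
            feasible = all((g.subs(sol) - b) <= 0 for g, b in gs)
            mult_ok = all(sol[lams[i]] >= 0 for i in binding)
            if feasible and mult_ok:
                candidates.append(sol)

print(candidates)   # only the case with both constraints binding survives: x = 1, y = 1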

4 Mixed Constraints

Now consider the general problem, in which we have equality and inequality constraints:
max_{x ∈ Rⁿ} f(x1, . . . , xn) subject to:

g1 (x1 , . . . , xn ) ≤ b1 , . . . , gk (x1 , . . . , xn ) ≤ bk ,

h1 (x1 , . . . , xn ) = c1 , . . . , hm (x1 , . . . , xn ) = cm .
Summarizing the conditions for equality and inequality constraints that we found above, we can
formulate the following theorem:

Theorem: Suppose that x∗ is a local maximizer of f on the constraint set given by the
k inequality constraints and m equality constraints above. Without loss of generality, assume that the first k0
inequality constraints are binding and that the other k − k0 inequality constraints are not
binding. Further suppose that the Jacobian of the equality constraints and the binding
inequality constraints at x∗ has full rank. Form the Lagrangian
L(x, λ1, . . . , λk, µ1, . . . , µm) = f(x) − Σ_{i=1}^{k} λi [gi(x) − bi] − Σ_{i=1}^{m} µi [hi(x) − ci]

Then there exist multipliers λ∗1, . . . , λ∗k and µ∗1, . . . , µ∗m such that
1. ∂L/∂xi (x∗, λ∗, µ∗) = 0 for all i ∈ {1, . . . , n}
2. λ∗i [gi(x∗) − bi] = 0 for all i ∈ {1, . . . , k}
3. hi(x∗) = ci for all i ∈ {1, . . . , m}
4. gi(x∗) ≤ bi for all i ∈ {1, . . . , k}
5. λ∗i ≥ 0 for all i ∈ {1, . . . , k}

Note again that this theorem gives us merely first order, necessary conditions: If x∗ is a maximum
and if the NDCQ holds at x∗, then there exist Lagrange multipliers for which the conditions hold
true. Finding all tuples (x, λ, µ) for which the conditions hold will therefore give us a set of candidate
solutions. We still need to check whether these candidates are actually maximizers. (Conditions that
can be used to do so will be taught in your first semester math class, or you can read Simon & Blume,
chapter 19.) Notice also that we may not find all candidate solutions using the Lagrangian method if
the NDCQ is violated at some points in the constraint set.
Meaning of the multiplier: See Simon & Blume, Chapter 19.1.
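For a purely numerical cross-check of a specific instance, scipy's SLSQP solver handles equality and inequality constraints simultaneously. The problem below (maximize f = x1·x2 subject to x1 + x2 = 4 and x1 ≤ 3) and the solver settings are our own illustration, not part of the notes; note that scipy minimizes, so we pass −f, and its 'ineq' constraints must be written in the form fun(x) ≥ 0.

import numpy as np
from scipy.optimize import minimize

f = lambda v: -(v[0] * v[1])                                  # maximize x1*x2 by minimizing its negative
cons = [{'type': 'eq',   'fun': lambda v: v[0] + v[1] - 4},   # equality constraint h(x) = c
        {'type': 'ineq', 'fun': lambda v: 3 - v[0]}]          # inequality g(x) <= b written as b - g(x) >= 0
res = minimize(f, x0=np.array([0.5, 0.5]), method='SLSQP', constraints=cons)
print(res.x, -res.fun)   # expect roughly [2, 2] with value 4 (the inequality is slack here)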

5 Minimization Problems

So far we have only discussed maximization problems. What happens if we are instead looking to
minimize a function given certain constraints? There are different ways in which we can change the
above technique of using the Lagrangian to apply it to minimization problems. The easiest is probably
the following:
Suppose we are given a minimization problem
min_{x ∈ Rⁿ} f(x1, . . . , xn) subject to:

g1(x1, . . . , xn) ≤ b1, . . . , gk(x1, . . . , xn) ≤ bk,
h1(x1, . . . , xn) = c1, . . . , hm(x1, . . . , xn) = cm.
Finding the minimum of f on a certain domain is really the same as finding the maximum of −f on
that domain. So we can transform the above problem into the maximization problem

max_{x ∈ Rⁿ} −f(x1, . . . , xn) subject to:

g1(x1, . . . , xn) ≤ b1, . . . , gk(x1, . . . , xn) ≤ bk,
h1(x1, . . . , xn) = c1, . . . , hm(x1, . . . , xn) = cm.
We can find candidate solutions for this problem as discussed above. They will also be candidate
solutions of the original minimization problem.
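As a small sketch of this transformation (the example is our own, not from the notes): to minimize f(x) = (x − 3)² subject to x ≤ 1, maximize −f subject to the same constraint and apply the inequality-constraint conditions from Section 3.

import sympy as sp

x, lam = sp.symbols('x lam', real=True)
f = (x - 3)**2                      # we want to minimize f subject to x <= 1
L = -f - lam*(x - 1)                # Lagrangian for the equivalent problem: maximize -f

# binding case (x = 1):
print(sp.solve([sp.diff(L, x), sp.Eq(x, 1)], [x, lam], dict=True))   # -> [{x: 1, lam: 4}], lam >= 0
# slack case (lam = 0): the stationary point of -f is x = 3, which violates x <= 1, so discard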

6 Kuhn-Tucker Notation

The most common problems in economics are maximization problems dealing with only inequality
constraints. Many of these constraints come in the form of non-negativity constraints, such as requiring
consumption to be weakly positive. Consider the following problem:
max_{x ∈ Rⁿ} f(x1, . . . , xn)
subject to g1(x1, . . . , xn) ≤ b1, . . . , gm(x1, . . . , xn) ≤ bm
and x1 ≥ 0, . . . , xn ≥ 0.
The Lagrangian function we would write would take the form

L(x, λ, v) = f(x) − Σ_{i=1}^{m} λi (gi(x) − bi) + Σ_{i=1}^{n} vi xi,

which would lead to the following first order conditions:

∂L/∂x1 = ∂f/∂x1 − Σ_{i=1}^{m} λi ∂gi/∂x1 + v1 = 0, . . . , ∂L/∂xn = ∂f/∂xn − Σ_{i=1}^{m} λi ∂gi/∂xn + vn = 0
λ1 [g1(x) − b1] = 0, . . . , λm [gm(x) − bm] = 0
g1(x1, . . . , xn) ≤ b1, . . . , gm(x1, . . . , xn) ≤ bm
v1x1 = 0, . . . , vnxn = 0, and λi, xj ≥ 0 ∀ i = 1, . . . , m and ∀ j = 1, . . . , n

That's a lot of conditions! (3m + 3n)
Now consider the Lagrangian without the nonnegativity constraints, and call it the Kuhn-Tucker
Lagrangian:
L(x, λ, v) = L̃(x, λ) + Σ_{i=1}^{n} vi xi

The first n first order conditions can be rewritten as

∂L/∂xj = ∂L̃/∂xj + vj = 0 ∀ j = 1, . . . , n,

which implies

∂L̃/∂xj = −vj ∀ j = 1, . . . , n.
Plugging this into the conditions vj xj = 0 and vj ≥ 0 (the multipliers on the non-negativity constraints
must themselves be non-negative), we have that

xj ∂L̃/∂xj = 0 and ∂L̃/∂xj ≤ 0.

Also notice that

∂L/∂λj = ∂L̃/∂λj = bj − gj(x) ≥ 0 ∀ j = 1, . . . , m,
which implies

∂L̃/∂λj ≥ 0 and λj ∂L̃/∂λj = 0 ∀ j = 1, . . . , m.

In summary, the following are the first order conditions for the Kuhn-Tucker Lagrangian:

∂L̃/∂xj ≤ 0 and xj ∂L̃/∂xj = 0 ∀ j = 1, . . . , n
∂L̃/∂λj ≥ 0 and λj ∂L̃/∂λj = 0 ∀ j = 1, . . . , m
λj ≥ 0 ∀ j = 1, . . . , m
This is 2n + 3m conditions, n fewer than before. If we are dealing with a lot of non-negativity
constraints it is therefore faster to use the Kuhn-Tucker Lagrangian.
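A loosely related practical remark (our own aside, not part of the notes): numerical solvers also treat non-negativity constraints specially. In scipy, for instance, they can be passed as bounds rather than as general inequality constraints; the toy problem below (maximize x1·x2 subject to x1 + x2 ≤ 2 with x1, x2 ≥ 0) is only an illustration.

import numpy as np
from scipy.optimize import minimize

f = lambda v: -(v[0] * v[1])                                  # maximize x1*x2 by minimizing its negative
cons = [{'type': 'ineq', 'fun': lambda v: 2 - v[0] - v[1]}]   # x1 + x2 <= 2, written as 2 - x1 - x2 >= 0
bnds = [(0, None), (0, None)]                                 # non-negativity handled separately as bounds
res = minimize(f, x0=np.array([0.5, 0.5]), method='SLSQP', bounds=bnds, constraints=cons)
print(res.x, -res.fun)   # expect roughly [1, 1] with value 1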

Exercises
1. There are two commodities: x and y. Let the consumer's consumption set be R²₊ and his
preference relation on his consumption set be represented by u(x, y) = −(x − 4)² − y². When
his initial wealth is 2 and the relative price is 1, solve his utility maximization problem if it is
well defined.
2. Let f : R₊ → R and f(x) = −(x + 1)² + 2. Solve the maximization problem if it is well defined.
3. Let f : R²₊ → R and f(x, y) = 2y − x². When (x, y) must be on the unit disc, i.e., x² + y² ≤ 1,
solve the minimization problem if it is well defined.¹

7 Homework

1. In chapter 18 of Simon and Blume, numbers 3, 6, 7, 10, 11, and 12.

¹ This is the same problem as in Example 18.11 of Simon and Blume (1994).
