Conditions for optimality

Problem | Necessary conditions for optimality | Also sufficient if:
One-variable unconstrained | $\dfrac{df}{dx} = 0$ | $f(x)$ concave
Multivariable unconstrained | $\dfrac{\partial f}{\partial x_j} = 0$, $j = 1, 2, \ldots, n$ | $f(x)$ concave
Constrained, nonnegative constraints only | $\dfrac{\partial f}{\partial x_j} = 0$, $j = 1, 2, \ldots, n$ (or $\le 0$, if $x_j = 0$) | $f(x)$ concave
General constrained problem | Karush-Kuhn-Tucker conditions | $f(x)$ concave and $g_i(x)$ convex ($i = 1, 2, \ldots, m$)

Karush-Kuhn-Tucker conditions

Theorem: Assume that $f(x), g_1(x), g_2(x), \ldots, g_m(x)$ are differentiable functions satisfying certain regularity conditions. Then $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$ can be an optimal solution for the nonlinear programming problem only if there exist $m$ numbers $u_1, u_2, \ldots, u_m$ such that all the KKT conditions are satisfied:
1. $\dfrac{\partial f}{\partial x_j} - \sum_{i=1}^{m} u_i \dfrac{\partial g_i}{\partial x_j} \le 0$ at $x = x^*$, for $j = 1, 2, \ldots, n$.
2. $x_j^* \left( \dfrac{\partial f}{\partial x_j} - \sum_{i=1}^{m} u_i \dfrac{\partial g_i}{\partial x_j} \right) = 0$ at $x = x^*$, for $j = 1, 2, \ldots, n$.
3. $g_i(x^*) - b_i \le 0$, for $i = 1, 2, \ldots, m$.
4. $u_i \left[ g_i(x^*) - b_i \right] = 0$, for $i = 1, 2, \ldots, m$.
5. $x_j^* \ge 0$, for $j = 1, 2, \ldots, n$.
6. $u_i \ge 0$, for $i = 1, 2, \ldots, m$.

Karush-Kuhn-Tucker conditions

Similarly, conditions 1 and 2 can be combined:
(1,2) $\dfrac{\partial f}{\partial x_j} - \sum_{i=1}^{m} u_i \dfrac{\partial g_i}{\partial x_j} = 0$ (or $\le 0$ if $x_j^* = 0$), for $j = 1, 2, \ldots, n$.
Conditions 2 and 4 require that one of the two quantities in each product must be zero.
Thus, conditions 3 and 4 can be combined:
(3,4) $g_i(x^*) - b_i = 0$ (or $\le 0$, if $u_i = 0$), for $i = 1, 2, \ldots, m$.

Karush-Kuhn-Tucker conditions

The variables $u_i$ correspond to dual variables in linear programming.
The previous conditions are necessary but not sufficient to ensure optimality.
Corollary: assume that $f(x)$ is concave and $g_1(x), g_2(x), \ldots, g_m(x)$ are convex functions, where all functions satisfy the regularity conditions. Then $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$ is an optimal solution if and only if all the conditions of the theorem are satisfied.

Example

Maximize $f(x) = \ln(x_1 + 1) + x_2$
subject to
$2x_1 + x_2 \le 3$
and
$x_1 \ge 0,\ x_2 \ge 0$
Thus, $m = 1$, and $g_1(x) = 2x_1 + x_2$ is convex.
Further, $f(x)$ is concave (check it using Appendix 2).
Thus, any solution that verifies the KKT conditions is an optimal solution.
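As a quick illustration of the concavity/convexity check suggested above (the slides point to Appendix 2 of Hillier's book for the formal test), here is a small sketch that inspects the Hessians with sympy; the tool choice is an assumption, not part of the slides:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = sp.log(x1 + 1) + x2      # objective of the example
g1 = 2*x1 + x2               # constraint function, g1(x) <= 3

# Hessian of f is diag(-1/(x1+1)^2, 0): negative semidefinite, so f is concave.
print(sp.hessian(f, (x1, x2)))

# Hessian of g1 is the zero matrix: g1 is linear, hence (trivially) convex.
print(sp.hessian(g1, (x1, x2)))
```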
Example: KKT conditions

1. (j = 1) $\dfrac{1}{x_1 + 1} - 2u_1 \le 0$
   (j = 2) $1 - u_1 \le 0$
2. (j = 1) $x_1 \left( \dfrac{1}{x_1 + 1} - 2u_1 \right) = 0$
   (j = 2) $x_2 (1 - u_1) = 0$
3. $2x_1 + x_2 - 3 \le 0$
4. $u_1 (2x_1 + x_2 - 3) = 0$
5. $x_1 \ge 0,\ x_2 \ge 0$
6. $u_1 \ge 0$

Example: solving KKT conditions

From condition 1 (j = 2), $u_1 \ge 1$. Since $x_1 \ge 0$ from condition 5, $\dfrac{1}{x_1 + 1} - 2u_1 < 0$ (because $\dfrac{1}{x_1+1} \le 1 < 2u_1$).
Therefore, $x_1 = 0$, from condition 2 (j = 1).
$u_1 > 0$ implies that $2x_1 + x_2 - 3 = 0$, from condition 4.
The two previous steps imply that $x_2 = 3$.
$x_2 > 0$ implies that $u_1 = 1$, from condition 2 (j = 2).
No conditions are violated for $x_1 = 0$, $x_2 = 3$, $u_1 = 1$.
Consequently, $x^* = (0, 3)$.
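The same example can be checked numerically. The sketch below solves it with scipy (the solver choice and starting point are assumptions) and then verifies the KKT conditions at the solution found by hand:

```python
import numpy as np
from scipy.optimize import minimize

# Maximize ln(x1 + 1) + x2  s.t.  2*x1 + x2 <= 3, x1, x2 >= 0 (scipy minimizes, so negate).
neg_f = lambda x: -(np.log(x[0] + 1.0) + x[1])
res = minimize(neg_f, x0=[0.5, 0.5], method="SLSQP",
               bounds=[(0, None), (0, None)],
               constraints=[{"type": "ineq", "fun": lambda x: 3.0 - 2.0*x[0] - x[1]}])
print(res.x)                                          # should be close to x* = (0, 3)

# Check the KKT conditions at (x1, x2, u1) = (0, 3, 1).
x1, x2, u1 = 0.0, 3.0, 1.0
print(1/(x1 + 1) - 2*u1 <= 0, 1 - u1 <= 0)            # condition 1
print(x1*(1/(x1 + 1) - 2*u1) == 0, x2*(1 - u1) == 0)  # condition 2
print(2*x1 + x2 - 3 <= 0, u1*(2*x1 + x2 - 3) == 0)    # conditions 3 and 4
```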
Quadratic Programming

Maximize $f(x) = cx - \dfrac{1}{2} x^T Q x$
subject to
$Ax \le b$, and $x \ge 0$
The objective function can be expressed as:
$f(x) = cx - \dfrac{1}{2} x^T Q x = \sum_{j=1}^{n} c_j x_j - \dfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} q_{ij} x_i x_j$

Example

Maximize $f(x) = 15x_1 + 30x_2 + 4x_1 x_2 - 2x_1^2 - 4x_2^2$
subject to
$x_1 + 2x_2 \le 30$, and $x_1 \ge 0,\ x_2 \ge 0$
In this case,
$c = \begin{bmatrix} 15 & 30 \end{bmatrix}$, $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, $Q = \begin{bmatrix} 4 & -4 \\ -4 & 8 \end{bmatrix}$, $A = \begin{bmatrix} 1 & 2 \end{bmatrix}$, $b = [30]$
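A small numeric sketch of this formulation, solving the example with scipy's SLSQP (one possible solver; the slides themselves use a modified simplex method later on):

```python
import numpy as np
from scipy.optimize import minimize

# Maximize c x - (1/2) x^T Q x  subject to  A x <= b, x >= 0, with the example data.
c = np.array([15.0, 30.0])
Q = np.array([[4.0, -4.0],
              [-4.0, 8.0]])
A = np.array([[1.0, 2.0]])
b = np.array([30.0])

neg_f = lambda x: -(c @ x - 0.5 * x @ Q @ x)          # scipy minimizes
res = minimize(neg_f, x0=np.zeros(2), method="SLSQP",
               bounds=[(0, None), (0, None)],
               constraints=[{"type": "ineq", "fun": lambda x: b - A @ x}])
print(res.x, -res.fun)   # should converge to approximately (12, 9), found by solving the KKT system by hand
```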
Solving QP problems

The objective function is concave if $x^T Q x \ge 0$ for all $x$, i.e., if $Q$ is a positive semidefinite matrix.
The KKT conditions for quadratic programming problems can be transformed into equality constraints by introducing slack variables ($y_1$, $y_2$, $v_1$).
The KKT conditions can be condensed by exploiting the pairs of complementary variables $(x_1, y_1)$, $(x_2, y_2)$, $(u_1, v_1)$, introducing the complementarity constraint (obtained by combining conditions 2 and 4).
Applying KKT conditions to example

1. (j = 1) $15 + 4x_2 - 4x_1 - u_1 \le 0$
   (j = 2) $30 + 4x_1 - 8x_2 - 2u_1 \le 0$
2. (j = 1) $x_1 (15 + 4x_2 - 4x_1 - u_1) = 0$
   (j = 2) $x_2 (30 + 4x_1 - 8x_2 - 2u_1) = 0$
3. $x_1 + 2x_2 - 30 \le 0$
4. $u_1 (x_1 + 2x_2 - 30) = 0$
5. $x_1 \ge 0,\ x_2 \ge 0$
6. $u_1 \ge 0$
Solving QP problems

Introducing the slack variables, the KKT conditions can be rewritten as:
1. (j = 1) $-4x_1 + 4x_2 - u_1 + y_1 = -15$
   (j = 2) $4x_1 - 8x_2 - 2u_1 + y_2 = -30$
2. (j = 1) $x_1 y_1 = 0$
   (j = 2) $x_2 y_2 = 0$
3. $x_1 + 2x_2 + v_1 = 30$
4. $u_1 v_1 = 0$
2 (j = 1) + 2 (j = 2) + 4. $x_1 y_1 + x_2 y_2 + u_1 v_1 = 0$
Collecting these relations together with the nonnegativity requirements gives:
$4x_1 - 4x_2 + u_1 - y_1 = 15$
$-4x_1 + 8x_2 + 2u_1 - y_2 = 30$
$x_1 + 2x_2 + v_1 = 30$
$x_1 \ge 0,\ x_2 \ge 0,\ u_1 \ge 0,\ y_1 \ge 0,\ y_2 \ge 0,\ v_1 \ge 0$
$x_1 y_1 + x_2 y_2 + u_1 v_1 = 0$

Solving QP problems

In general matrix form:
$Qx + A^T u - y = c^T$
$Ax + v = b$
$x \ge 0,\ u \ge 0,\ y \ge 0,\ v \ge 0$
$x^T y + u^T v = 0$
The first three sets of relations are linear programming constraints; $x^T y + u^T v = 0$ is the complementarity constraint.
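As a sanity check of this system, the snippet below plugs in a candidate solution (x = (12, 9), u1 = 3, all slacks zero, obtained by solving the KKT equations by hand; these values are not stated in the slides) and confirms that every relation holds:

```python
import numpy as np

Q = np.array([[4.0, -4.0], [-4.0, 8.0]])
A = np.array([[1.0, 2.0]])
c = np.array([15.0, 30.0])
b = np.array([30.0])

x = np.array([12.0, 9.0])   # candidate solution (hand-computed, see lead-in)
u = np.array([3.0])
y = np.zeros(2)             # slacks of condition 1
v = np.zeros(1)             # slack of the functional constraint

print(np.allclose(Q @ x + A.T @ u - y, c))   # Qx + A^T u - y = c^T
print(np.allclose(A @ x + v, b))             # Ax + v = b
print(x @ y + u @ v == 0.0)                  # complementarity constraint
```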
Solving QP problems

Using the previous properties, QP problems can be solved using a modified simplex method.
See the example of a QP problem in Hillier's book (pages 580-581).
Excel, LINGO, LINDO, and MPL/CPLEX can all solve quadratic programming problems.

Separable Programming

It is assumed that $f(x)$ is concave and the $g_i(x)$ are convex.
Moreover, $f(x)$ is separable:
$f(x) = \sum_{j=1}^{n} f_j(x_j)$
$f(x)$ is a (concave) piecewise linear function (see example).
If the $g_i(x)$ are linear, this problem can be reformulated as an LP problem by using a separate variable for each line segment.
The same technique can be used for nonlinear $g_i(x)$.

Example

(Figure: example of a piecewise linear objective function; not reproduced here.)
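A minimal sketch of the segment-variable reformulation just described, for one concave piecewise linear term (the breakpoints, slopes and second variable are illustrative assumptions, and scipy.optimize.linprog is used for the resulting LP):

```python
import numpy as np
from scipy.optimize import linprog

# Assumed illustrative problem: maximize f1(x1) + 2*x2  s.t.  x1 + x2 <= 5, x >= 0,
# where f1 is concave piecewise linear with slopes 3, 2, 1 on three segments of length 2.
slopes = [3.0, 2.0, 1.0]                   # decreasing slopes -> f1 is concave
seg_len = 2.0

# One LP variable per segment of x1, plus x2: z = (x11, x12, x13, x2).
c = -np.array(slopes + [2.0])              # linprog minimizes, so negate the objective
A_ub = np.array([[1.0, 1.0, 1.0, 1.0]])    # x11 + x12 + x13 + x2 <= 5
b_ub = np.array([5.0])
bounds = [(0, seg_len)] * 3 + [(0, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
x1 = res.x[:3].sum()                       # recover x1 from its segment variables
print(x1, res.x[3], -res.fun)
# Because the slopes decrease (concavity), the LP fills the segments in order,
# so no extra logic is needed to enforce the correct segment sequence.
```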
Convex Programming

Many algorithms can be used, falling into 3 categories:
1. Gradient algorithms, where the gradient search procedure is modified to avoid violating a constraint. Example: generalized reduced gradient (GRG).
2. Sequential unconstrained algorithms, which include penalty function and barrier function methods. Example: sequential unconstrained minimization technique (SUMT).
3. Sequential approximation algorithms, which include linear and quadratic approximation methods. Example: Frank-Wolfe algorithm.

Frank-Wolfe algorithm

It is a sequential linear approximation algorithm.
It replaces the objective function $f(x)$ by the first-order Taylor expansion of $f(x)$ around $x = x'$, namely:
$f(x) \approx f(x') + \sum_{j=1}^{n} \dfrac{\partial f(x')}{\partial x_j} (x_j - x_j') = f(x') + \nabla f(x')(x - x')$
As $f(x')$ and $\nabla f(x') x'$ have fixed values, they can be dropped to give a linear objective function:
$g(x) = \nabla f(x') x = \sum_{j=1}^{n} c_j x_j$, where $c_j = \dfrac{\partial f(x')}{\partial x_j}$ at $x = x'$.
Frank-Wolfe algorithm

The simplex method is applied to find a solution $x_{LP}$.
Then, choose the point that maximizes the nonlinear objective function along the line segment between the current trial solution and $x_{LP}$.
This can be done using a one-variable unconstrained optimization algorithm.
The algorithm continues the iterations until the stopping condition is satisfied.

Summary of Frank-Wolfe algorithm

Initialization: Find a feasible initial trial solution $x^{(0)}$, e.g. using LP to find an initial BF solution. Set $k = 1$.
Iteration k:
1. For $j = 1, 2, \ldots, n$, evaluate $\dfrac{\partial f(x)}{\partial x_j}$ at $x = x^{(k-1)}$ and set $c_j$ equal to this value.
2. Find an optimal solution $x_{LP}^{(k)}$ by solving the LP problem:
Maximize $g(x) = \sum_{j=1}^{n} c_j x_j$,
subject to
$Ax \le b$ and $x \ge 0$.
Summary of Frank-Wolfe algorithm

3. For the variable $t \in [0, 1]$, set
$h(t) = f(x)$ for $x = x^{(k-1)} + t\,(x_{LP}^{(k)} - x^{(k-1)})$,
so that $h(t)$ gives the value of $f(x)$ on the line segment between $x^{(k-1)}$ (where $t = 0$) and $x_{LP}^{(k)}$ (where $t = 1$).
Use one-variable unconstrained optimization to maximize $h(t)$ and obtain $x^{(k)}$.
Stopping rule: If $x^{(k-1)}$ and $x^{(k)}$ are sufficiently close, stop; $x^{(k)}$ is the estimate of the optimal solution. Otherwise, reset $k = k + 1$ and perform another iteration.

Example

Maximize $f(x) = 5x_1 - x_1^2 + 8x_2 - 2x_2^2$
subject to
$3x_1 + 2x_2 \le 6$, and $x_1 \ge 0,\ x_2 \ge 0$
As
$\dfrac{\partial f}{\partial x_1} = 5 - 2x_1, \qquad \dfrac{\partial f}{\partial x_2} = 8 - 4x_2$,
the unconstrained maximum $x = (2.5, 2)$ violates the functional constraint.
Example (2)

Iteration 1: $x = (0, 0)$ is feasible and serves as the initial trial solution $x^{(0)}$. Step 1 gives $c_1 = 5$ and $c_2 = 8$, so $g(x) = 5x_1 + 8x_2$.
Step 2: solving graphically yields $x_{LP}^{(1)} = (0, 3)$.
Step 3: the points between (0, 0) and (0, 3) are
$(x_1, x_2) = (0, 0) + t[(0, 3) - (0, 0)] = (0, 3t)$ for $t \in [0, 1]$.
This expression gives
$h(t) = f(0, 3t) = 8(3t) - 2(3t)^2 = 24t - 18t^2$,
so the value $t = t^*$ that maximizes $h(t)$ is given by
$\dfrac{dh(t)}{dt} = 24 - 36t = 0$,
so $t^* = 2/3$. This result leads to the next trial solution (see figure):
$x^{(1)} = (0, 0) + \dfrac{2}{3}[(0, 3) - (0, 0)] = (0, 2)$

Example (3)

Iteration 2: following the same procedure leads to the next trial solution $x^{(2)} = (5/6, 7/6)$.
Example (4)

The figure shows the next iterations.
Note that the trial solutions alternate between two trajectories that intersect at the point $x = (1, 1.5)$.
This is the optimal solution (it satisfies the KKT conditions).

Example (5)

Using quadratic instead of linear approximations leads to much faster convergence.
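A compact sketch of the Frank-Wolfe iteration summarized above, applied to this example; using scipy for the LP subproblem and the one-variable line search is an implementation choice, not part of the slides:

```python
import numpy as np
from scipy.optimize import linprog, minimize_scalar

# Example: maximize f(x) = 5*x1 - x1^2 + 8*x2 - 2*x2^2  s.t.  3*x1 + 2*x2 <= 6, x >= 0.
f = lambda x: 5*x[0] - x[0]**2 + 8*x[1] - 2*x[1]**2
grad = lambda x: np.array([5 - 2*x[0], 8 - 4*x[1]])
A_ub, b_ub = np.array([[3.0, 2.0]]), np.array([6.0])

x = np.zeros(2)                              # feasible initial trial solution x(0)
for k in range(1, 51):
    c = grad(x)                              # step 1: linearize f at x(k-1)
    lp = linprog(-c, A_ub=A_ub, b_ub=b_ub,   # step 2: maximize c @ x over the feasible region
                 bounds=[(0, None), (0, None)], method="highs")
    x_lp = lp.x
    # step 3: one-variable maximization of h(t) = f(x + t*(x_lp - x)) on [0, 1]
    h = lambda t: -f(x + t * (x_lp - x))
    t_star = minimize_scalar(h, bounds=(0.0, 1.0), method="bounded").x
    x_new = x + t_star * (x_lp - x)
    if np.linalg.norm(x_new - x) < 1e-6:     # stopping rule
        x = x_new
        break
    x = x_new

print(x)   # the slides report convergence toward (1, 1.5)
```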
SUMT

Sequential Unconstrained Minimization Technique.
Main versions of SUMT:
exterior-point algorithm: deals with infeasible solutions and uses a penalty function;
interior-point algorithm: deals with feasible solutions and uses a barrier function.
It takes advantage of the fact that unconstrained problems are much easier to solve.
Each unconstrained problem in the sequence chooses a smaller and smaller value of $r$, and solves for $x$ to
Maximize $P(x; r) = f(x) - rB(x)$

SUMT

$B(x)$ is a barrier function with the following properties:
1. $B(x)$ is small when $x$ is far from the boundary of the feasible region.
2. $B(x)$ is large when $x$ is close to the boundary of the feasible region.
3. $B(x) \to \infty$ as the distance from the boundary of the feasible region $\to 0$.
Most common choice of $B(x)$:
$B(x) = \sum_{i=1}^{m} \dfrac{1}{b_i - g_i(x)} + \sum_{j=1}^{n} \dfrac{1}{x_j}$
Summary of SUMT

Initialization: Find a feasible initial trial solution $x^{(0)}$ that is not on the boundary of the feasible region. Set $k = 1$. Choose values for $r$ and $\theta < 1$ (e.g. $r = 1$ and $\theta = 0.01$).
Iteration k: starting from $x^{(k-1)}$, apply a multivariable unconstrained optimization procedure (e.g. the gradient search procedure) to find a local maximum $x^{(k)}$ of
$P(x; r) = f(x) - r \left[ \sum_{i=1}^{m} \dfrac{1}{b_i - g_i(x)} + \sum_{j=1}^{n} \dfrac{1}{x_j} \right]$
Stopping rule: If the change from $x^{(k-1)}$ to $x^{(k)}$ is very small, stop and use $x^{(k)}$ as the local maximum. Otherwise, set $k = k + 1$ and $r = \theta r$ for another iteration.
SUMT can be extended to handle equality constraints.
Note that SUMT is quite sensitive to numerical instability, so it should be applied cautiously.

Example
Maximize $f(x) = x_1 x_2$
subject to
$x_1^2 + x_2 \le 3$, and $x_1 \ge 0,\ x_2 \ge 0$
$g_1(x) = x_1^2 + x_2$ is convex, but $f(x) = x_1 x_2$ is not concave.
Initialization: $(x_1, x_2) = x^{(0)} = (1, 1)$, $r = 1$ and $\theta = 0.01$.
For each iteration:
$P(x; r) = x_1 x_2 - r \left[ \dfrac{1}{3 - x_1^2 - x_2} + \dfrac{1}{x_1} + \dfrac{1}{x_2} \right]$

Example (2)

For $r = 1$, maximization leads to $x^{(1)} = (0.90, 1.36)$.
The table below shows the convergence to (1, 2):

r | $x_1^{(k)}$ | $x_2^{(k)}$
$1$ | 0.90 | 1.36
$10^{-2}$ | 0.987 | 1.925
$10^{-4}$ | 0.998 | 1.993
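A rough sketch of the SUMT iteration for this example; scipy's Nelder-Mead stands in for the multivariable unconstrained search, and points outside the interior are simply rejected (an implementation choice, not part of the slides):

```python
import numpy as np
from scipy.optimize import minimize

# Example: maximize f(x) = x1*x2  s.t.  x1^2 + x2 <= 3, x1 >= 0, x2 >= 0.
def neg_P(x, r):
    x1, x2 = x
    slack = 3.0 - x1**2 - x2
    if slack <= 0 or x1 <= 0 or x2 <= 0:      # outside the interior: reject
        return np.inf
    barrier = 1.0/slack + 1.0/x1 + 1.0/x2
    return -(x1*x2 - r*barrier)               # minimize -P(x; r)

x = np.array([1.0, 1.0])                      # x(0), strictly inside the feasible region
r, theta = 1.0, 0.01
for k in range(1, 4):
    res = minimize(neg_P, x, args=(r,), method="Nelder-Mead")
    x = res.x
    print(k, r, x)                            # should approach (1, 2) as r shrinks
    r *= theta
```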
Nonconvex Programming
The assumptions of convex programming often fail.
Nonconvex programming problems can be much more difficult to solve.
Dealing with nondifferentiable and discontinuous objective functions is usually very complicated.
LINDO, LINGO and MPL have efficient algorithms to deal with these problems.
Simple problems can be solved by running a hill-climbing procedure several times, from different starting points, to find a local maximum each time (see the sketch below).
An example is given in Hillier's book using the Excel Solver to solve simple problems.
More difficult problems can use the Evolutionary Solver.
It uses metaheuristics based on genetics, evolution and survival of the fittest: a genetic algorithm.
The next section presents some well-known metaheuristics.
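A minimal sketch of the repeated hill-climbing idea mentioned above: restart a local optimizer from several random points and keep the best result (the test function, bounds and number of restarts are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative nonconcave objective with several local maxima on [0, 10].
f = lambda x: np.sin(3*x[0]) + 0.1*x[0]

rng = np.random.default_rng(0)
best_x, best_val = None, -np.inf
for _ in range(20):                                   # multistart: repeat the local search
    x0 = rng.uniform(0.0, 10.0, size=1)
    res = minimize(lambda x: -f(x), x0, bounds=[(0.0, 10.0)], method="L-BFGS-B")
    if -res.fun > best_val:
        best_x, best_val = res.x, -res.fun
print(best_x, best_val)                               # best local maximum found
```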