
1 Structure of Class

Formulation: Lec. 1

Geometry: Lec. 2-4

Simplex Method: Lec. 5-8

Duality Theory: Lec. 9-11

Sensitivity Analysis: Lec. 12

Robust Optimization: Lec. 13

Large scale optimization: Lec. 14-15

Network Flows: Lec. 16-17

The Ellipsoid method: Lec. 18-19

Interior point methods: Lec. 20-21

Semidefinite optimization: Lec. 22

Discrete Optimization: Lec. 24-25

2 Requirements
Homeworks: 30%

Midterm Exam: 30%

Final Exam: 40%

Important tie breaker: contributions to class

Use of CPLEX for solving optimization problems

3 Lecture Outline
History of Optimization

Where do LOPs Arise?

Examples of Formulations
4 History of Optimization
Fermat, 1638; Newton, 1670

min f(x),  x: scalar

Euler, 1755

min f(x1, ..., xn)

Lagrange, 1797

min f(x1, ..., xn)
s.t. gk(x1, ..., xn) = 0,  k = 1, ..., m

Euler, Lagrange: problems in infinite dimensions, calculus of variations.

5 Nonlinear Optimization
5.1 The general problem

6 What is Linear Optimization?

6.1 Formulation
minimize 3x1 + x2

subject to x1 + 2x2 ≥ 2

2x1 + x2 ≥ 3
x1 ≥ 0, x2 ≥ 0

minimize c′x
subject to Ax ≥ b
x ≥ 0
7 History of LO

7.1 The pre-algorithmic period

Fourier, 1826: Method for solving systems of linear inequalities.

de la Vallée Poussin: simplex-like method for objective functions with absolute values.

von Neumann, 1928: game theory, duality.

Farkas, Minkowski, Carathéodory, 1870-1930: Foundations.

7.2 The modern period

George Dantzig, 1947: Simplex method

1950s Applications.

1960s Large Scale Optimization.

1970s Complexity theory.

1979 The ellipsoid algorithm.

1980s Interior point algorithms.

1990s Semidefinite and conic optimization.

2000s Robust Optimization.

8 Where do LOPs Arise?

8.1 Wide Applicability
Transportation

Air traffic control, Crew scheduling,

Movement of Truck Loads
9 Transportation Problem
9.1 Data
m plants, n warehouses

si: supply of ith plant, i = 1, ..., m

dj: demand of jth warehouse, j = 1, ..., n

9.2 Decision Variables

9.2.1 Formulation
xij = number of units to send i → j
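A sketch of the resulting LP, assuming a per-unit shipping cost cij from plant i to warehouse j (the cost data are not shown on this slide):

min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij
s.t. Σ_{j=1}^{n} xij ≤ si, i = 1, ..., m
Σ_{i=1}^{m} xij ≥ dj, j = 1, ..., n
xij ≥ 0.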

10 Sorting through LO

11 Investment under taxation
You have purchased si shares of stock i at price qi, i = 1, ..., n

Current price of stock i is pi

You expect that the price of stock i one year from now will be ri

You pay a capital-gains tax at the rate of 30% on any capital gains at the time of the sale.

You want to raise C amount of cash after taxes.

You pay 1% in transaction costs.

Example: You sell 1,000 shares at $50 per share; you had bought them at $30 per share; net cash is:

50 × 1,000 − 0.30 × (50 − 30) × 1,000

Five investment choices A, B, C, D, E.

A, C, and D are available in 1993.

B is available in 1994.

Cash earns 6% per year.

$1,000,000 in 1993.

12.1 Cash Flow per Dollar Invested

12.2 Formulation
12.2.1 Decision Variables
A, ..., E: amount invested in each choice, in $ millions
Casht: amount invested in cash in period t, t = 1, 2, 3

max 1.06Cash3 + 1.00B + 1.75D + 1.40E

s.t. A + C + D + Cash1 ≤ 1
Cash2 + B ≤ 0.3A + 1.1C + 1.06Cash1
Cash3 + 1.0E ≤ 1.0A + 0.3B + 1.06Cash2
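A numerical sketch of this LP with scipy.optimize.linprog; the cash-flow coefficients are the ones reconstructed above, so treat them as assumptions rather than the lecture's exact data. Variables are ordered (A, B, C, D, E, Cash1, Cash2, Cash3), in $ millions.

from scipy.optimize import linprog

#       A     B    C     D      E    Cash1 Cash2 Cash3
obj = [0.0, -1.0, 0.0, -1.75, -1.40, 0.0, 0.0, -1.06]   # maximize => negate
A_ub = [
    [1.0,  0.0,  1.0, 1.0, 0.0,  1.0,  0.0, 0.0],       # 1993 budget <= 1
    [-0.3, 1.0, -1.1, 0.0, 0.0, -1.06, 1.0, 0.0],       # 1994 cash balance
    [-1.0, -0.3, 0.0, 0.0, 1.0,  0.0, -1.06, 1.0],      # 1995 cash balance
]
b_ub = [1.0, 0.0, 0.0]
res = linprog(obj, A_ub=A_ub, b_ub=b_ub)   # all variables >= 0 by default
print(res.x, -res.fun)                     # optimal plan, final wealth ($M)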

13 Manufacturing
13.1 Data
n products, m raw materials

bi: available units of material i.

aij: # units of material i product j needs in order to be produced.

13.2 Formulation
13.2.1 Decision variables
xj = amount of product j produced.

max Σ_{j=1}^{n} qj xj
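The constraints implied by the data above (a sketch; here qj is taken to be the profit per unit of product j, which the slide does not define):

s.t. Σ_{j=1}^{n} aij xj ≤ bi, i = 1, ..., m
xj ≥ 0, j = 1, ..., n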
14 Capacity Expansion
14.1 Data and Constraints
Dt: forecasted demand for electricity at year t
Et: existing capacity (in oil) available at t
ct: cost to produce 1MW using coal capacity

nt: cost to produce 1MW using nuclear capacity

No more than 20% nuclear

Coal plants last 20 years

Nuclear plants last 15 years

14.2 Decision Variables

xt: amount of coal capacity brought on line in year t.
yt: amount of nuclear capacity brought on line in year t.

wt: total coal capacity in year t.

zt: total nuclear capacity in year t.
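A sketch of a formulation consistent with these data; the exact constraints are not recoverable from the slide, so the lifetimes and the 20% rule are modeled directly as stated:

min Σ_t (ct xt + nt yt)
s.t. wt = Σ_{s=t−19}^{t} xs   (coal plants last 20 years)
zt = Σ_{s=t−14}^{t} ys   (nuclear plants last 15 years)
wt + zt + Et ≥ Dt   (meet demand)
zt ≤ 0.2 (wt + zt + Et)   (no more than 20% nuclear)
xt, yt ≥ 0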

15 Scheduling
15.1 Decision variables
Hospital wants to make a weekly night-shift schedule for its nurses

dj: demand for nurses, j = 1, ..., 7

Every nurse works 5 days in a row
Goal: hire minimum number of nurses
Decision Variables
xj: # nurses starting their week on day j
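A sketch of the complete LP implied by these data, assuming the week wraps around (indices taken mod 7) and ignoring integrality of the xj:

min Σ_{j=1}^{7} xj
s.t. xj−4 + xj−3 + xj−2 + xj−1 + xj ≥ dj, j = 1, ..., 7
xj ≥ 0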

16 Revenue Management
16.1 The industry
Deregulation in 1978

- Carriers were only allowed to fly certain routes. Hence airlines such as Northwest, Eastern, Southwest, etc.

- Fares determined by Civil Aeronautics Board (CAB) based on mileage and other costs (CAB no longer exists)

Post Deregulation
anyone can fly, anywhere
fares determined by carrier (and the market)

17 Revenue Management

Huge sunk and fixed costs

Very low variable costs per passenger ($10/passenger or less)
Strong economically competitive environment
Near-perfect information and negligible cost of information
Highly perishable inventory
Result: Multiple fares
18 Revenue Management
18.1 Data

n origins, 1 hub, n destinations

2 classes (for simplicity): Q-class, Y-class

Revenues: r^Q_ij, r^Y_ij, i = 1, ..., n; j = 1, ..., n

Capacities:

Expected demands: D^Q_ij, D^Y_ij

18.2 LO Formulation

18.2.1 Decision Variables
Qij: # of Q-class customers we accept from i to j
Yij: # of Y-class customers we accept from i to j

maximize Σ_{i,j} ( r^Q_ij Qij + r^Y_ij Yij )
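A sketch of the full formulation, assuming each itinerary i → j uses the legs i → hub and hub → j, with leg capacities written Ci (into the hub) and C̄j (out of the hub); these capacity symbols are illustrative, since the slide's capacity data did not survive:

maximize Σ_{i,j} ( r^Q_ij Qij + r^Y_ij Yij )
s.t. Σ_j (Qij + Yij) ≤ Ci, i = 1, ..., n
Σ_i (Qij + Yij) ≤ C̄j, j = 1, ..., n
0 ≤ Qij ≤ D^Q_ij, 0 ≤ Yij ≤ D^Y_ij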

19 Revenue Management

We estimate that RM has generated $1.4 billion in incremental revenue for American Airlines in the last three years alone. This is not a one-time benefit. We expect RM to generate at least $500 million annually for the foreseeable future. As we continue to invest in the enhancement of DINAMO we expect to capture an even larger revenue premium.
20 Messages
20.1 How to formulate?
1. Define your decision variables clearly.
2. Write constraints and objective function.

What is a good LO formulation?

A formulation with a small number of variables and constraints, and in which the matrix A is sparse.

21.1 The general problem

22 Convex functions

f : S → R
For all x1, x2 ∈ S and λ ∈ [0, 1]:

f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2)

f(x) is concave if −f(x) is convex.

23 On the power of LO

min f(x) = max_i ( d′i x + ci )

s.t. Ax ≥ b
24 On the power of LO

min Σ_j cj |xj|

s.t. Ax ≥ b
Idea: |xj| = max{ xj, −xj }

min Σ_j cj zj
s.t. Ax ≥ b
xj ≤ zj
−xj ≤ zj
Message: Minimizing piecewise linear convex functions can be modelled by LO
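A minimal sketch of this reformulation in Python with scipy.optimize.linprog; the data (c, A, b) below are made-up illustrations, not from the lecture.

import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 1.0])        # costs c_j >= 0 multiplying |x_j|
A = np.array([[1.0, -1.0]])     # one constraint: x1 - x2 >= 2
b = np.array([2.0])

n = len(c)
# Variables (x, z).  Minimize sum_j c_j z_j subject to
#   Ax >= b,  x_j <= z_j,  -x_j <= z_j   (so z_j >= |x_j|).
obj = np.concatenate([np.zeros(n), c])
# linprog uses A_ub @ v <= b_ub, so Ax >= b becomes -Ax <= -b.
A_ub = np.vstack([
    np.hstack([-A, np.zeros((A.shape[0], n))]),   # -Ax <= -b
    np.hstack([np.eye(n), -np.eye(n)]),           #  x - z <= 0
    np.hstack([-np.eye(n), -np.eye(n)]),          # -x - z <= 0
])
b_ub = np.concatenate([-b, np.zeros(2 * n)])
res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * n + [(0, None)] * n)
print(res.x[:n], res.fun)       # optimal x and the value of sum_j c_j |x_j|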
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming
Lecture 2: Geometry of Linear Optimization I
1 Outline: Slide 1
1. What is the central problem?
2. Standard Form.
3. Preliminary Geometric Insights.
4. Geometric Concepts (Polyhedra, "Corners").
5. Equivalence of algebraic and geometric concepts.

2 Central Problem Slide 2

minimize c′x

subject to a′i x = bi, i ∈ M1
a′i x ≥ bi, i ∈ M2
a′i x ≤ bi, i ∈ M3
xj ≥ 0, j ∈ N1
xj free, j ∈ N2

2.1 Standard Form Slide 3

minimize c′x

subject to Ax = b
x ≥ 0
Characteristics
• Minimization problem
• Equality constraints
• Non-negative variables

2.2 Transformations Slide 4

max c′x ⇔ −min(−c′x)

a′i x ≤ bi ⇔ a′i x + si = bi, si ≥ 0

a′i x ≥ bi ⇔ a′i x − si = bi, si ≥ 0

xj free ⇔ xj = xj+ − xj−, xj+ ≥ 0, xj− ≥ 0

2.3 Example Slide 5

maximize x1 − x2
subject to x1 + x2 ≤ 1
x1 + 2x2 ≥ 1
x1 free, x2 ≥ 0

−minimize −x1+ + x1− + x2
subject to x1+ − x1− + x2 + s1 = 1
x1+ − x1− + 2x2 − s2 = 1
x1+, x1−, x2, s1, s2 ≥ 0

+ ;

3 Preliminary Insights
Slide 6
minimize −x1 − x2

subject to x1 + 2x2 ≤ 3

2x1 + x2 ≤ 3

x1, x2 ≥ 0

[Figure: the feasible region, with the cost lines −x1 − x2 = 0, −x1 − x2 = z, −x1 − x2 = −2 and the optimal vertex (1, 1).]

Slide 7
−x1 + x2 ≤ 1
x1 ≥ 0

x2 ≥ 0
Slide 8
• There exists a unique optimal solution.
• There exist multiple optimal solutions; in this case, the set of optimal solutions can be either bounded or unbounded.

[Figure: the feasible set above, with the cost vectors c = (1, 0), c = (−1, −1), c = (0, 1), c = (1, 1) illustrating the different cases.]

• The optimal cost is −∞, and no feasible solution is optimal.

• The feasible set is empty.

4 Polyhedra
4.1 Definitions Slide 9
• The set { x | a′x = b } is called a hyperplane.

• The set { x | a′x ≤ b } is called a halfspace.

• The intersection of many halfspaces is called a polyhedron.

• A polyhedron is a convex set, i.e., if x, y ∈ P, then λx + (1 − λ)y ∈ P for λ ∈ [0, 1].

[Figure: (a) a hyperplane a′x = b and the two halfspaces it defines; (b) a polyhedron bounded by the hyperplanes a′i x = bi, i = 1, ..., 5.]
5 Corners
5.1 Extreme Points Slide 10
• Polyhedron P = { x | Ax ≤ b }

• x ∈ P is an extreme point of P

if ∄ y, z ∈ P (y ≠ x, z ≠ x):

x = λy + (1 − λ)z, 0 < λ < 1

[Figure: a polyhedron with extreme points (such as w, x) and non-extreme points (such as u, v, y, z).]

5.2 Vertex Slide 11

• x ∈ P is a vertex of P if ∃ c:

x is the unique optimum of

minimize c′y

subject to y ∈ P

5.3 Basic Feasible Solution Slide 12

P = { (x1, x2, x3) | x1 + x2 + x3 = 1, x1, x2, x3 ≥ 0 }
Slide 13
Points A, B, C: 3 constraints active
Point E: 2 constraints active
Suppose we add 2x1 + 2x2 + 2x3 = 2.

4
'w }
.
w

=c
c 'y
{y | P

c
.
x
'x }
{y | c
'y = c

x3

A.
E . P . C
x2

D . .
B
x1

Then 3 hyperplanes are tight, but the constraints are not linearly independent.
Slide 14
Intuition: a point at which n inequalities are tight and the corresponding equations are linearly independent.

P = { x ∈ ℜn | Ax ≤ b }
• a1, ..., am rows of A
• x ∈ P

• I = { i | a′i x = bi }

Definition: x is a basic feasible solution if the subspace spanned by {ai, i ∈ I} is ℜn.
5.3.1 Degeneracy
Slide 15
• If |I| = n, then ai, i ∈ I, are linearly independent: x nondegenerate.
• If |I| > n, then there exist n linearly independent vectors among {ai, i ∈ I}: x degenerate.

[Figure: (a) a nondegenerate and (b) a degenerate corner of a polyhedron P, with points B, C, E.]

5.3.2 Example
Slide 16
min x1 + 5x2 − 2x3
s.t. x1 + x2 + x3 ≤ 4
x1 ≤ 2
x3 ≤ 3
3x2 + x3 ≤ 6
x1, x2, x3 ≥ 0
Slide 17

6 Equivalence of definitions
Slide 18
Theorem: P = { x | Ax ≤ b }. Let x ∈ P.

x is a vertex ⇔ x is an extreme point ⇔ x is a BFS.


6.1 Proof Slide 19
1. Vertex ⇒ extreme point
∃c : c′x < c′y ∀y ∈ P, y ≠ x

If x is not an extreme point, ∃ y, z ≠ x, λ ∈ (0, 1):

x = λy + (1 − λ)z. But c′x < c′y and c′x < c′z

⇒ c′x = λc′y + (1 − λ)c′z > c′x, contradiction
Slide 20
2. Extreme point ⇒ BFS
Suppose x is not a BFS.

Let I = { i : a′i x = bi }. But the ai, i ∈ I, do not span all of ℜn ⇒ ∃z ≠ 0 : a′i z = 0, i ∈ I

Let
x1 = x + ǫz,

x2 = x − ǫz ⇒

a′i x1 = bi,
a′i x2 = bi, i ∈ I

Slide 21
i ∉ I : a′i x < bi ⇒ a′i (x + ǫz) < bi,
a′i (x − ǫz) < bi
for ǫ small enough.
⇒ x1, x2 ∈ P; yet x = (x1 + x2)/2 ⇒

x not an extreme point: contradiction Slide 22


3. BFS ⇒ vertex

x∗ BFS

I = { i : a′i x∗ = bi }

Let di = 1 for i ∈ I, di = 0 for i ∉ I, and set

c′ = −d′A

Then c′x∗ = −d′Ax∗ = −Σ_{i=1}^{m} di a′i x∗ = −Σ_{i∈I} a′i x∗ = −Σ_{i∈I} bi. Slide 23

But ∀x ∈ P : a′i x ≤ bi ⇒

c′x = −Σ_{i∈I} a′i x ≥ −Σ_{i∈I} bi = c′x∗

⇒ x∗ optimum of min c′x, x ∈ P.
Why unique?
Equality holds only if a′i x = bi for all i ∈ I; since {ai, i ∈ I} spans ℜn, a′i x = bi, i ∈ I, has a unique solution x = x∗.

15.081J/6.251J Introduction to Mathematical
Programming
Lecture 3: Geometry of Linear Optimization II
1 Outline Slide 1
BFS for standard form polyhedra
Deeper understanding of degeneracy
Existence of extreme points
Optimality of Extreme Points
Representation of Polyhedra

2 BFS for standard form polyhedra


Slide 2
Ax = b and x ≥ 0

m × n matrix A has linearly independent rows

x ∈ ℜn is a basic solution if and only if Ax = b, and there exist indices B(1), ..., B(m) such that:

- The columns AB(1), ..., AB(m) are linearly independent

- If i ≠ B(1), ..., B(m), then xi = 0


2.1 Construction of BFS Slide 3

Procedure for constructing basic solutions
1. Choose m linearly independent columns AB(1), ..., AB(m)
2. Let xi = 0 for all i ≠ B(1), ..., B(m)
3. Solve Ax = b for xB(1), ..., xB(m)

Ax = b → BxB + N xN = b
xN = 0, xB = B −1 b

2.2 Example 1 Slide 4

[ 1 1 2 1 0 0 0 ]       [ 8 ]
[ 0 1 6 0 1 0 0 ]  x =  [ 12 ]
[ 1 0 0 0 0 1 0 ]       [ 4 ]
[ 0 1 0 0 0 0 1 ]       [ 6 ]

A4, A5, A6, A7 basic columns

Solution: x = (0, 0, 0, 8, 12, 4, 6), a BFS

Another basis: A3, A5, A6, A7 basic columns.

Solution: x = (0, 0, 4, 0, −12, 4, 6), not a BFS
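A quick numerical check of the second basis, assuming numpy is available:

import numpy as np

A = np.array([[1, 1, 2, 1, 0, 0, 0],
              [0, 1, 6, 0, 1, 0, 0],
              [1, 0, 0, 0, 0, 1, 0],
              [0, 1, 0, 0, 0, 0, 1]], dtype=float)
b = np.array([8, 12, 4, 6], dtype=float)

basic = [2, 4, 5, 6]               # 0-indexed: columns A3, A5, A6, A7
B = A[:, basic]
x = np.zeros(7)
x[basic] = np.linalg.solve(B, b)   # x = (0, 0, 4, 0, -12, 4, 6)
print(x)                           # x5 < 0: basic, but not feasible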

2.3 Geometric intuition Slide 5

[Figure: the columns A1, A2, A3 and A4 = −A1 drawn as vectors in the plane.]

2.4 Example 2

Slide 6
General form

Slide 7
x1 + x2 + x3 ≤ 4

x1 ≤ 2

x3 ≤ 3

3x2 + x3 ≤ 6

x1, x2, x3 ≥ 0

Standard form
x1 + x2 + x3 + s1 = 4
x1 + s2 = 2
x3 + s3 = 3
3x2 + x3 + s4 = 6
x1, x2, x3, s1, ..., s4 ≥ 0

Slide 8
Using the definition for BFS in polyhedra in general form:

Choose tight constraints at (1, 0, 3): x1 + x2 + x3 = 4, x3 = 3, x2 = 0

Check if (1, 1, 1)′, (0, 0, 1)′, (0, 1, 0)′ span ℜ3 (they do)

Slide 9

Using the definition for BFS in polyhedra in standard form:

Pick the basic variables: x1, x3, s2, s4: xB = (x1, x3, s2, s4)

Pick the nonbasic variables: x2, s1, s3: xN = (x2, s1, s3)

Slide 10

Partition A:

      x1 x2 x3 s1 s2 s3 s4
A = [  1  1  1  1  0  0  0 ] = [B  N]
    [  1  0  0  0  1  0  0 ]
    [  0  3  1  0  0  1  0 ]
    [  0  0  1  0  0  0  1 ]

Slide 11

B = [ 1 1 0 0 ]    N = [ 1 1 0 ]    B non-singular
    [ 1 0 1 0 ]        [ 0 0 0 ]
    [ 0 1 0 0 ]        [ 0 0 1 ]
    [ 0 1 0 1 ]        [ 3 0 0 ]

xN = 0, xB = B −1 b ⇒ (x1, x3, s2, s4)′ = (1, 3, 1, 3)′

3 Degeneracy for standard form polyhedra

3.1 Definition Slide 12
A BFS x of P = { x ∈ ℜn : Ax = b, A : m × n, x ≥ 0 } is called degenerate if it contains more than n − m zeros.

x is non-degenerate if it contains exactly n − m zeros.

3.2 Example 2, revisited Slide 13

In previous example:
(2, 2, 0, 0, 0, 3, 0) degenerate: n = 7, m = 4

More than n − m = 7 − 4 = 3 zeros.

Ambiguity about which are basic variables:
(x1, x2, x3, x6) one choice
(x1, x2, x6, x7) another choice

3.3 Extreme points and BFS Slide 14

Consider again the extreme point (2, 2, 0, 0, 0, 3, 0)

How do we construct the basis?

B = { A1, A2, A6 } = { (1, 1, 0, 0)′, (1, 0, 0, 3)′, (0, 0, 1, 0)′ }

Slide 15
Columns in B are linearly independent.

Rank(A) = 4

|B| = 3 < 4

Can we augment B?

Choices:
- B′ = B ∪ {A3}: basic variables x1, x2, x3, x6
- B′ = B ∪ {A7}: basic variables x1, x2, x6, x7

- How many choices do we have?

3.4 Degeneracy and geometry Slide 16


Whether a BFS is degenerate may depend on the particular representation
of a polyhedron.

P = { (x1, x2, x3) | x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1, x2, x3 ≥ 0 }.

n = 3, m = 2 and n − m = 1. (1, 1, 0) is nondegenerate, while (0, 0, 1) is degenerate.

Consider the representation P = { (x1, x2, x3) | x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1 ≥ 0, x3 ≥ 0 }: (0, 0, 1) is now nondegenerate.

3.5 Conclusions Slide 17


An extreme point corresponds to possibly many bases in the presence of
degeneracy.
A basic feasible solution, however, corresponds to a unique extreme point.
Degeneracy is not a purely geometric property.

4 Existence of extreme points
Slide 18

[Figure: two polyhedra P and Q]

Note that P = { (x1, x2) : 0 ≤ x1 ≤ 1 } does not have an extreme point, while P′ = { (x1, x2) : x1 ≥ x2, x1 ≥ 0, x2 ≥ 0 } has one. Why?
4.1 Definition Slide 19
A polyhedron P ⊂ ℜn contains a line if there exists a vector x ∈ P and a nonzero vector d ∈ ℜn such that x + λd ∈ P for all scalars λ.
4.2 Theorem Slide 20
Suppose that the polyhedron P = { x ∈ ℜn | a′i x ≥ bi, i = 1, ..., m } is nonempty. Then, the following are equivalent:
(a) The polyhedron P has at least one extreme point.

(b) The polyhedron P does not contain a line.

(c) There exist n vectors out of the family a1, ..., am which are linearly independent.
4.3 Corollary Slide 21
Polyhedra in standard form contain an extreme point.
Bounded polyhedra contain an extreme point.
4.4 Proof Slide 22
Let P = { x | Ax = b, x ≥ 0 } ≠ ∅, rank(A) = m. If there exists a feasible solution in P, then there is an extreme point.
Proof

Let x = (x1, ..., xt, 0, ..., 0), s.t. x ∈ P. Consider B = {A1, A2, ..., At}

If {A1, A2, ..., At} are linearly independent we can augment, to find a basis, and thus a BFS exists.

If {A1, A2, ..., At} are dependent:

d1 A1 + · · · + dt At = 0 (some di ≠ 0)
But x1 A1 + · · · + xt At = b
⇒ (x1 + θd1)A1 + · · · + (xt + θdt)At = b

Consider xj(θ) = xj + θdj for j = 1, ..., t, and 0 otherwise.
Slide 23
Clearly A x(θ) = b

Let: θ1 = max_{j: dj > 0} ( −xj/dj )  (θ1 = −∞ if all dj ≤ 0)

θ2 = min_{j: dj < 0} ( −xj/dj )  (θ2 = +∞ if all dj ≥ 0)

For θ1 ≤ θ ≤ θ2: x(θ) ≥ 0, since xj + θdj ≥ 0 ⇔ θ ≥ −xj/dj when dj > 0 and θ ≤ −xj/dj when dj < 0.
Slide 24
Since at least one of (d1, ..., dt) ≠ 0, at least one of θ1, θ2 is finite, say θ1.
But then x(θ1) ≥ 0 and the number of nonzeros has decreased.
4.5 Example 3 Slide 25
P = { x | x1 + x2 + x3 = 2, x1 + x4 = 1, x1, ..., x4 ≥ 0 }

x = ( 1/2, 1/2, 1, 1/2 )

A1 = (1, 1)′, A2 = (1, 0)′, A3 = (1, 0)′, A4 = (0, 1)′

1 · A1 − 1 · A2 + 0 · A3 − 1 · A4 = 0
Slide 26
Consider: x(θ) = ( 1/2 + θ, 1/2 − θ, 1, 1/2 − θ ) for −1/2 ≤ θ ≤ 1/2:

x(θ) ∈ P.

Note x(−1/2) = (0, 1, 1, 1) and x(1/2) = (1, 0, 1, 0).



5 Optimality of Extreme Points

5.1 Theorem Slide 27

Consider
min c′x

s.t. x ∈ P = { x ∈ ℜn | Ax ≥ b }.

P has no line and it has an optimal solution.

Then, there exists an optimal solution which is an extreme point of P.
5.2 Proof Slide 28
v: optimal value of the cost c′x.

Q: set of optimal solutions, i.e.,

Q = { x | c′x = v, Ax ≥ b }

Q ⊂ P and P contains no lines, so Q does not contain any lines, hence it has an extreme point x∗.

Slide 29

Claim: x∗ is an extreme point of P.

Suppose not: ∃ y, w ≠ x∗ : x∗ = λy + (1 − λ)w, y, w ∈ P, 0 < λ < 1.

v = c′x∗ = λc′y + (1 − λ)c′w

c′y ≥ v,
c′w ≥ v ⇒ c′y = c′w = v ⇒ y, w ∈ Q

⇒ x∗ is NOT an extreme point of Q, CONTRADICTION.
6 Representation of Polyhedra

6.1 Theorem Slide 30

A nonempty and bounded polyhedron is the convex hull of its extreme points.

[Figure: a bounded polyhedron P with points y, z, u, and the set Q cut out by the hyperplane a′i∗ x = bi∗.]
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 4: Geometry of Linear Optimization III


1 Outline
Slide 1
1. Projections of Polyhedra
2. Fourier-Motzkin Elimination Algorithm
3. Optimality Conditions

2 Projections of polyhedra
Slide 2
• πk : ℜn → ℜk projects x onto its first k coordinates:

πk (x) = πk (x1 , . . . , xn ) = (x1 , . . . , xk ).

• Πk (S) = { πk (x) | x ∈ S };

Equivalently

Πk (S) = { (x1 , . . . , xk ) | there exist xk+1 , . . . , xn s.t. (x1 , . . . , xn ) ∈ S }.

[Figure: a set S in ℜ3 and its projections Π2 (S) and Π1 (S).]

2.1 The Elimination Algorithm


2.1.1 By example
Slide 3
• Consider the polyhedron

x1 + x2 ≥ 1

x1 + x2 + 2x3 ≥ 2
2x1 + 3x3 ≥ 3
x1 − 4x3 ≥ 4
−2x1 + x2 − x3 ≥ 5.

• We rewrite these constraints

0 ≥ 1 − x1 − x2
x3 ≥ 1 − (x1 /2) − (x2 /2)
x3 ≥ 1 − (2x1 /3)
−1 + (x1 /4) ≥ x3
−5 − 2x1 + x2 ≥ x3 .

• Eliminate variable x3 , obtaining the polyhedron Q

0 ≥ 1 − x1 − x2
−1 + x1 /4 ≥ 1 − (x1 /2) − (x2 /2)
−1 + x1 /4 ≥ 1 − (2x1 /3)
−5 − 2x1 + x2 ≥ 1 − (x1 /2) − (x2 /2)
−5 − 2x1 + x2 ≥ 1 − (2x1 /3).

2.2 The Elimination Algorithm

Slide 4
1. Rewrite Σ_{j=1}^{n} aij xj ≥ bi in the form

ain xn ≥ − Σ_{j=1}^{n−1} aij xj + bi , i = 1, . . . , m;

if ain ≠ 0, divide both sides by ain . By letting x̄ = (x1 , . . . , xn−1 ), P is represented by:

xn ≥ di + f ′i x̄, if ain > 0,
dj + f ′j x̄ ≥ xn , if ajn < 0,
0 ≥ dk + f ′k x̄, if akn = 0.

2. Let Q be the polyhedron in ℜn−1 defined by:

dj + f ′j x̄ ≥ di + f ′i x̄, if ain > 0 and ajn < 0,

0 ≥ dk + f ′k x̄, if akn = 0.

Theorem:

The polyhedron Q constructed by the elimination algorithm is equal to the projection Πn−1 (P ) of P .
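A minimal Python sketch of one elimination step, assuming each constraint is stored as a pair (a, b) meaning a′x ≥ b; the function name and data layout are illustrative, not from the lecture.

def eliminate_last(rows):
    """rows: list of (a, b) meaning a'x >= b; returns the projection of
    the polyhedron onto the first n-1 variables (the slide's Q)."""
    pos = [(a, b) for a, b in rows if a[-1] > 0]    # lower bounds on x_n
    neg = [(a, b) for a, b in rows if a[-1] < 0]    # upper bounds on x_n
    out = [(a[:-1], b) for a, b in rows if a[-1] == 0]
    for ap, bp in pos:
        for an, bn in neg:
            # Combining the lower bound from (ap, bp) with the upper
            # bound from (an, bn) yields a new ">=" constraint.
            alpha, beta = ap[-1], -an[-1]
            a_new = [alpha * u + beta * v for u, v in zip(an[:-1], ap[:-1])]
            out.append((a_new, alpha * bn + beta * bp))
    return out

# The five constraints from the example above; eliminating x3 gives Q.
P = [([1, 1, 0], 1), ([1, 1, 2], 2), ([2, 0, 3], 3),
     ([1, 0, -4], 4), ([-2, 1, -1], 5)]
Q = eliminate_last(P)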

2.3 Implications
Slide 5
• Let P ⊂ ℜn+k be a polyhedron. Then, the set

{ x ∈ ℜn | there exists y ∈ ℜk such that (x, y) ∈ P }

is also a polyhedron.

• Let P ⊂ ℜn be a polyhedron and let A be an m × n matrix. Then, the

set Q = {Ax | x ∈ P } is also a polyhedron.

• The convex hull of a finite number of vectors is a polyhedron.

2.4 Algorithm for LO


Slide 6
• Consider min c′ x subject to x ∈ P .
• Define a new variable x0 and introduce the constraint x0 = c′ x.
• Apply the elimination algorithm n times to eliminate the variables x1 , . . . , xn
• We are left with the set
Q = { x0 | there exists x ∈ P such that x0 = c′ x },

and the optimal cost is equal to the smallest element of Q.

3 Optimality Conditions
3.1 Feasible directions
Slide 7
• We are at x ∈ P and we contemplate moving away from x, in the direction

of a vector d ∈ ℜn .

• We need to consider those choices of d that do not immediately take us

outside the feasible set.

• A vector d ∈ ℜn is said to be a feasible direction at x, if there exists a

positive scalar θ for which x + θd ∈ P .

Slide 8

Slide 9
• x is a BFS to the standard form problem corresponding to a basis B.
• xi = 0, i ∈ N , xB = B −1 b.
• We consider moving away from x, to a new vector x + θd, by selecting a nonbasic variable xj and increasing it to a positive value θ, while keeping the remaining nonbasic variables at zero.

• Algebraically, dj = 1, and di = 0 for every nonbasic index i other than j.

• The vector xB of basic variables changes to xB + θdB .
• Feasibility: A(x + θd) = b ⇒ Ad = 0.
• 0 = Ad = Σ_{i=1}^{n} Ai di = Σ_{i=1}^{m} AB(i) dB(i) + Aj = BdB + Aj ⇒ dB = −B −1 Aj .

• Nonnegativity constraints?

– If x nondegenerate, xB > 0; thus xB + θdB ≥ 0 for θ sufficiently small.

– If x degenerate, then d is not always a feasible direction. Why?

• Effects in cost?

Cost change: c′ d = cj − c′B B −1 Aj . This quantity is called the reduced cost c̄j of the variable xj .

3.2 Theorem
Slide 10
• x BFS associated with basis B
• c̄ reduced costs

Then

• If c̄ ≥ 0 ⇒ x optimal
• x optimal and non-degenerate ⇒ c̄ ≥ 0

3.3 Proof
• y arbitrary feasible solution
• d = y − x ⇒ Ax = Ay = b ⇒ Ad = 0
Slide 11
⇒ BdB + Σ_{i∈N} Ai di = 0

⇒ dB = − Σ_{i∈N} B −1 Ai di

⇒ c′ d = c′B dB + Σ_{i∈N} ci di
Slide 12
= Σ_{i∈N} (ci − c′B B −1 Ai ) di = Σ_{i∈N} c̄i di

• Since yi ≥ 0 and xi = 0, i ∈ N , then di = yi − xi ≥ 0, i ∈ N
• c′ d = c′ (y − x) ≥ 0 ⇒ c′ y ≥ c′ x
⇒ x optimal

(b) Your turn
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 5: The Simplex Method I


1 Outline
Slide 1
• Reduced Costs
• Optimality conditions
• Improving the cost
• Unboundness
• The Simplex algorithm
• The Simplex algorithm on degenerate problems

2 Matrix View
Slide 2
min c′ x
s.t. Ax = b
x≥0

x = (xB , xN ) xB basic variables


xN non-basic variables

A = [B, N ]

Ax = b ⇒ B · xB + N · xN = b

⇒ xB + B −1 N xN = B −1 b
⇒ xB = B −1 b − B −1 N xN

2.1 Reduced Costs


Slide 3
z = c′B xB + c′N xN
= c′B (B −1 b − B −1 N xN ) + c′N xN
= c′B B −1 b + (c′N − c′B B −1 N )xN

c̄j = cj − c′B B −1 Aj : reduced cost
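A numerical sketch of this formula, assuming numpy and the basis B = [A1, A3, A6, A7] from the example later in this lecture:

import numpy as np

A = np.array([[1, 1, 1, 1, 0, 0, 0],
              [1, 0, 0, 0, 1, 0, 0],
              [0, 0, 1, 0, 0, 1, 0],
              [0, 3, 1, 0, 0, 0, 1]], dtype=float)
c = np.array([1, 5, -2, 0, 0, 0, 0], dtype=float)
basic = [0, 2, 5, 6]                 # columns A1, A3, A6, A7

B = A[:, basic]
p = np.linalg.solve(B.T, c[basic])   # p' = c_B' B^{-1}
c_bar = c - A.T @ p                  # reduced costs
print(np.round(c_bar, 6))            # (0, 7, 0, 2, -3, 0, 0)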

2.2 Optimality Conditions


Slide 4
Recall Theorem:

• x BFS associated with basis B


• c̄ reduced costs

Then

• If c̄ ≥ 0 ⇒ x optimal
• x optimal and non-degenerate ⇒ c̄ ≥ 0

3 Improving the Cost
Slide 5
• Suppose c̄j = cj − c′B B −1 Aj < 0

Can we improve the cost?

• Let dB = −B −1 Aj

dj = 1, di = 0, i ≠ B(1), . . . , B(m), j.

• Let y = x + θ · d, θ > 0 scalar


Slide 6
c′ y − c′ x = θ · c′ d
= θ · (c′B dB + cj dj )
= θ · (cj − c′B B −1 Aj )
= θ · c̄j
Thus, if c̄j < 0 the cost will decrease.

4 Unboundness
Slide 7
• Is y = x + θ · d feasible?

Since Ad = 0 ⇒ Ay = Ax = b

• y ≥ 0 ?

If d ≥ 0 ⇒ x + θ · d ≥ 0 ∀ θ ≥ 0

⇒ objective unbounded.

5 Improvement
Slide 8
If di < 0, then
xi + θdi ≥ 0 ⇒ θ ≤ −xi /di

⇒ θ∗ = min_{i | di < 0} ( −xi /di )

⇒ θ∗ = min_{i=1,...,m | dB(i) < 0} ( −xB(i) /dB(i) )

5.1 Example
Slide 9
min x1 + 5x2 −2x3
s.t. x1 + x2 + x3 ≤4
x1 ≤2
x3 ≤3
3x2 + x3 ≤6
x1 , x2 , x3 ≥0

[Figure: the feasible set, with vertices (0,0,3), (1,0,3), (2,0,2), (0,1,3), (0,2,0), (2,2,0).]
Slide 10

Slide 11
A1 A2 A3 A4 A5 A6 A7

[ 1 1 1 1 0 0 0 ]   [ x1 ]   [ 4 ]
[ 1 0 0 0 1 0 0 ] · [ x2 ] = [ 2 ]
[ 0 0 1 0 0 1 0 ]   [ .. ]   [ 3 ]
[ 0 3 1 0 0 0 1 ]   [ x7 ]   [ 6 ]

B = [A1 , A3 , A6 , A7 ]
BFS: x = (2, 0, 2, 0, 0, 1, 4)′   Slide 12

B = [ 1 1 0 0 ]    B −1 = [  0  1 0 0 ]    c̄′ = (0, 7, 0, 2, −3, 0, 0)
    [ 1 0 0 0 ]           [  1 −1 0 0 ]
    [ 0 1 1 0 ]           [ −1  1 1 0 ]
    [ 0 1 0 1 ]           [ −1  1 0 1 ]

d5 = 1, d2 = d4 = 0, (d1 , d3 , d6 , d7 )′ = −B −1 A5 = (−1, 1, −1, −1)′   Slide 13

y ′ = x′ + θd′ = (2 − θ, 0, 2 + θ, 0, θ, 1 − θ, 4 − θ)
What happens as θ increases?
θ∗ = min_{i=1,...,m | dB(i) < 0} ( −xB(i) /dB(i) ) = min{ 2/1, 1/1, 4/1 } = 1.
l = 6 (A6 exits the basis).
New solution
y = (1, 0, 3, 0, 1, 0, 3)′   Slide 14
New basis B = [A1 , A3 , A5 , A7 ]   Slide 15

[Figure: the feasible set again, showing the move from (2,0,2) to (1,0,3).]

B = [ 1 1 0 0 ]    B −1 = [  1 0 −1 0 ]
    [ 1 0 1 0 ]           [  0 0  1 0 ]
    [ 0 1 0 0 ]           [ −1 1  1 0 ]
    [ 0 1 0 1 ]           [  0 0 −1 1 ]

c̄′ = c′ − c′B B −1 A = (0, 4, 0, −1, 0, 3, 0)
Need to continue, column A4 enters the basis.

6 Correctness
Slide 16
−xB(l) /dB(l) = min_{i=1,...,m, dB(i) < 0} ( −xB(i) /dB(i) ) = θ∗

Theorem

• B̄ = {AB(i) , i ≠ l, Aj } is a basis

• y = x + θ∗ d is a BFS associated with basis B̄.
7 The Simplex Algorithm

Slide 17
1. Start with basis B = [AB(1) , . . . , AB(m) ] and a BFS x.

2. Compute c̄j = cj − c′B B −1 Aj
• If c̄j ≥ 0 for all j: x optimal; stop.
• Else select j : c̄j < 0.

Slide 18
3. Compute u = −d = B −1 Aj .
• If u ≤ 0 ⇒ cost unbounded; stop
• Else
4. θ∗ = min_{1≤i≤m, ui > 0} xB(i) /ui = xB(l) /ul
5. Form a new basis by replacing AB(l) with Aj .
6. yj = θ∗ ,
yB(i) = xB(i) − θ∗ ui
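A bare-bones Python/numpy sketch of the steps above, assuming a full-rank A and a known initial feasible basis (no Phase I and no anticycling rule); it mirrors the algorithm rather than being a robust implementation.

import numpy as np

def simplex(A, b, c, basic):
    m, n = A.shape
    basic = list(basic)
    while True:
        B = A[:, basic]
        x_B = np.linalg.solve(B, b)                # current BFS values
        p = np.linalg.solve(B.T, c[basic])         # p' = c_B' B^{-1}
        c_bar = c - A.T @ p                        # reduced costs
        if np.all(c_bar >= -1e-9):                 # step 2: optimal
            x = np.zeros(n); x[basic] = x_B
            return x, c @ x
        j = int(np.argmin(c_bar))                  # entering column
        u = np.linalg.solve(B, A[:, j])            # step 3
        if np.all(u <= 1e-9):
            raise ValueError("cost unbounded")
        ratios = [x_B[i] / u[i] if u[i] > 1e-9 else np.inf for i in range(m)]
        l = int(np.argmin(ratios))                 # step 4: exiting index
        basic[l] = j                               # step 5: new basis

# The example from this lecture, starting from basis (A1, A3, A6, A7).
A = np.array([[1., 1, 1, 1, 0, 0, 0], [1, 0, 0, 0, 1, 0, 0],
              [0, 0, 1, 0, 0, 1, 0], [0, 3, 1, 0, 0, 0, 1]])
b = np.array([4., 2, 3, 6])
c = np.array([1., 5, -2, 0, 0, 0, 0])
print(simplex(A, b, c, [0, 2, 5, 6]))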

7.1 Finite Convergence


Slide 19
Theorem:
• P = {x | Ax = b, x ≥ 0} ≠ ∅
• Every BFS non-degenerate

Then

• Simplex method terminates after a finite number of iterations

• At termination, we have an optimal basis B or we have a direction d : Ad = 0, d ≥ 0, c′ d < 0 and the optimal cost is −∞.

7.2 Degenerate problems


Slide 20
• θ∗ can equal zero (why?) ⇒ y = x, although B̄ ≠ B.
• Even if θ∗ > 0, there might be a tie in

min_{1≤i≤m, ui > 0} xB(i) /ui ⇒

next BFS degenerate.
• Finite termination not guaranteed; cycling is possible.
Slide 21
[Figure: a degenerate vertex y at which the constraints x1 = 0, . . . , x6 = 0 define several coinciding bases; the simplex method can cycle among them.]
7.3 Pivot Selection
Slide 22
• Choices for the entering column:
(a) Choose a column Aj , with c̄j < 0, whose reduced cost is the most negative.

(b) Choose a column with c̄j < 0 for which the corresponding cost decrease θ∗ |c̄j | is largest.

• Choices for the exiting column:

smallest subscript rule: out of all variables eligible to exit the basis, choose

one with the smallest subscript.

7.4 Avoiding Cycling


Slide 23
• Cycling can be avoided by carefully selecting which variables enter and

exit the basis.

• Example: among all variables with c̄j < 0, pick the smallest subscript;

among all variables eligible to exit the basis, pick the one with the smallest

subscript.

15.081J/6.251J Introduction to Mathematical
Programming

Lecture 6: The Simplex Method II


1 Outline
Slide 1
• Revised Simplex method
• The full tableau implementation
• Anticycling

2 Revised Simplex
Slide 2
Initial data: A, b, c
1. Start with basis B = [AB(1) , . . . , AB(m) ]
and B −1 .
2. Compute p′ = c′B B −1 and c̄j = cj − p′ Aj
• If c̄j ≥ 0 for all j: x optimal; stop.
• Else select j : c̄j < 0.
Slide 3
3. Compute u = B −1 Aj .
• If u ≤ 0 ⇒ cost unbounded; stop
• Else
4. θ∗ = min_{1≤i≤m, ui > 0} xB(i) /ui = xB(l) /ul
5. Form a new basis B by replacing AB(l) with Aj .
6. yj = θ∗ , yB(i) = xB(i) − θ∗ ui
Slide 4
7. Form [B −1 |u]
8. Add to each one of its rows a multiple of the lth row in order to make the

last column equal to the unit vector el .

The first m columns form B̄ −1 .

2.1 Example
Slide 5
min x1 + 5x2 −2x3
s.t. x1 + x2 + x3 ≤4
x1 ≤2
x3 ≤3
3x2 + x3 ≤6
x1 , x2 , x3 ≥0
Slide 6

B = [A1 , A3 , A6 , A7 ], BFS: x = (2, 0, 2, 0, 0, 1, 4)′

c̄ = (0, 7, 0, 2, −3, 0, 0)

B = [ 1 1 0 0 ]    B −1 = [  0  1 0 0 ]
    [ 1 0 0 0 ]           [  1 −1 0 0 ]
    [ 0 1 1 0 ]           [ −1  1 1 0 ]
    [ 0 1 0 1 ]           [ −1  1 0 1 ]

(u1 , u3 , u6 , u7 )′ = B −1 A5 = (1, −1, 1, 1)′
θ∗ = min{ 2/1, 1/1, 4/1 } = 1, l = 6 (A6 exits the basis). Slide 7

[B −1 | u] = [  0  1 0 0 |  1 ]
             [  1 −1 0 0 | −1 ]
             [ −1  1 1 0 |  1 ]
             [ −1  1 0 1 |  1 ]

⇒ B̄ −1 = [  1 0 −1 0 ]
          [  0 0  1 0 ]
          [ −1 1  1 0 ]
          [  0 0 −1 1 ]

2.2 Practical issues


Slide 8
• Numerical Stability
B −1 needs to be computed from scratch once in a while, as errors accumulate
• Sparsity
B −1 is represented in terms of sparse triangular matrices

3 Full tableau implementation


Slide 9
−c′B B −1 b | c′ − c′B B −1 A
B −1 b     | B −1 A

or, in more detail,

−c′B xB | c̄1 . . . c̄n
xB(1)   |
  ...   | B −1 A1 . . . B −1 An
xB(m)   |

3.1 Example
Slide 10
min −10x1 − 12x2 − 12x3
s.t. x1 + 2x2 + 2x3 ≤ 20
2x1 + x2 + 2x3 ≤ 20
2x1 + 2x2 + x3 ≤ 20
x1 , x2 , x3 ≥ 0
min −10x1 − 12x2 − 12x3
s.t. x1 + 2x2 + 2x3 + x4 = 20
2x1 + x2 + 2x3 + x5 = 20
2x1 + 2x2 + x3 + x6 = 20
x1 , . . . , x6 ≥ 0
BFS: x = (0, 0, 0, 20, 20, 20)′
B=[A4 , A5 , A6 ] Slide 11

x1 x2 x3 x4 x5 x6
0 −10 −12 −12 0 0 0
x4 = 20 1 2 2 1 0 0
x5 = 20 2* 1 2 0 1 0
x6 = 20 2 2 1 0 0 1

c̄′ = c′ − c′B B −1 A = c′ = (−10, −12, −12, 0, 0, 0) Slide 12

x1 x2 x3 x4 x5 x6
100 0 −7 −2 0 5 0
x4 = 10 0 1.5 1* 1 −0.5 0
x1 = 10 1 0.5 1 0 0.5 0
x6 = 0 0 1 −1 0 −1 1
Slide 13

x1 x2 x3 x4 x5 x6
120 0 −4 0 2 4 0
x3 = 10 0 1.5 1 1 −0.5 0

x1 = 0 1 −1 0 −1 1 0

x6 = 10 0 2.5* 0 1 −1.5 1

Slide 14

x1 x2 x3 x4 x5 x6
136 0 0 0 3.6 1.6 1.6
x3 = 4 0 0 1 0.4 0.4 −0.6
x1 = 4 1 0 0 −0.6 0.4 0.4
x2 = 4 0 1 0 0.4 −0.6 0.4
Slide 15
[Figure: the feasible region with vertices A = (0,0,0), B = (0,0,10), C = (0,10,0), D = (10,0,0) and the optimum E = (4,4,4).]

4 Comparison of implementations
Slide 16
                   Full tableau    Revised simplex
Memory             O(mn)           O(m^2)
Worst-case time    O(mn)           O(mn)
Best-case time     O(mn)           O(m^2)

5 Anticycling
5.1 Degeneracy in Practice
Slide 17
Does degeneracy really happen in practice? Consider the assignment polytope:

Σ_{j=1}^{n} xij = 1, i = 1, . . . , n

Σ_{i=1}^{n} xij = 1, j = 1, . . . , n

xij ≥ 0

n! vertices: for each vertex there exist 2^{n−1} n^{n−2} different bases; for n = 8, each vertex has 33,554,432 bases.
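A one-line check of the count just quoted, assuming the stated formula:

n = 8
print(2**(n - 1) * n**(n - 2))   # 33554432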

5.2 Perturbations
Slide 18
(P ) min c′ x              (Pǫ ) min c′ x
s.t. Ax = b                s.t. Ax = b + (ǫ, ǫ^2 , . . . , ǫ^m )′
x ≥ 0                      x ≥ 0.

5.2.1 Theorem
Slide 19
∃ ǫ1 > 0 such that for all 0 < ǫ < ǫ1

Ax = b + (ǫ, . . . , ǫ^m )′
x ≥ 0

is non-degenerate.

5.2.2 Proof
Slide 20
Let B1 , . . . , Br be all the bases.

Br^{−1} ( b + (ǫ, . . . , ǫ^m )′ ) = ( b̄1^r + β11^r ǫ + · · · + β1m^r ǫ^m , . . . , b̄m^r + βm1^r ǫ + · · · + βmm^r ǫ^m )′

where Br^{−1} = (βij^r ) and Br^{−1} b = (b̄1^r , . . . , b̄m^r )′ .
Slide 21
• b̄i^r + βi1^r θ + · · · + βim^r θ^m is a polynomial in θ
• Roots θ^r_{i,1} , θ^r_{i,2} , . . . , θ^r_{i,m}

• If ǫ ≠ θ^r_{i,1} , . . . , θ^r_{i,m} ⇒ b̄i^r + βi1^r ǫ + · · · + βim^r ǫ^m ≠ 0.
• Let ǫ1 be the smallest positive root ⇒ for 0 < ǫ < ǫ1 all right-hand sides are ≠ 0 ⇒ non-degeneracy.
5.3 Lexicography
Slide 22
• u is lexicographically larger than v, written u >L v, if u ≠ v and the first nonzero component of u − v is positive.

• Example:
(0, 2, 3, 0) >L (0, 2, 1, 4),
(0, 4, 5, 0) <L (1, 2, 1, 2).

5.4 Lexicography-Perturbation
5.4.1 Theorem
Slide 23
Let B be a basis of Ax = b, x ≥ 0. Then B is feasible for Ax = b + (ǫ, . . . , ǫ^m )′ , x ≥ 0 for sufficiently small ǫ if and only if

ui = (b̄i , βi1 , . . . , βim ) >L 0, ∀ i

where B −1 = (βij ) and (B −1 b)i = b̄i .

5.4.2 Proof
Slide 24
B is feasible for the perturbed problem ⇔ B −1 ( b + (ǫ, . . . , ǫ^m )′ ) ≥ 0 ⇔
b̄i + βi1 ǫ + · · · + βim ǫ^m ≥ 0 ∀ i
⇔ the first non-zero component of ui = (b̄i , βi1 , . . . , βim ) is positive ∀ i.

5.5 Summary
Slide 25
1. We start with (P ): Ax = b, x ≥ 0
2. We introduce (Pǫ ): Ax = b + (ǫ, . . . , ǫ^m )′ , x ≥ 0
3. A basis is feasible + non-degenerate in (Pǫ ) ⇔ ui >L 0 in (P ).
4. If we maintain ui >L 0 in (P ) ⇒ (Pǫ ) is non-degenerate ⇒ Simplex is finite in (Pǫ ) for sufficiently small ǫ.

5.6 Lexicographic pivoting rule


Slide 26
1. Choose an entering column Aj arbitrarily, as long as c̄j < 0; u = B −1 Aj .
2. For each i with ui > 0, divide the ith row of the tableau (including the
entry in the zeroth column) by ui and choose the lexicographically smallest
row. If row l is lexicographically smallest, then the lth basic variable xB(l)
exits the basis.

6
5.6.1 Example
Slide 27
• j = 3

1 0 5 3 ···
• 2 4 6 −1 ···
3 0 7 9 ···

• xB(1) /u1 = 1/3 and xB(3) /u3 = 3/9 = 1/3.


• We divide the first and third rows of the tableau by u1 = 3 and u3 = 9,

respectively, to obtain:

1/3 0 5/3 1 ···


• ∗ ∗ ∗ ∗ ···
1/3 0 7/9 1 ···

• Since 7/9 < 5/3, the third row is chosen to be the pivot row, and the

variable xB(3) exits the basis.

5.6.2 Uniqueness
Slide 28
• Why lexicographic pivoting rule always leads to a unique choice for the

exiting variable?

• Otherwise, two rows in tableau proportional ⇒ rank(B −1 A) < m ⇒

rank(A) < m

5.7 Theorem
Slide 29
If simplex starts with all the rows in the simplex tableau, other than the zeroth
row, lexicographically positive and the lexicographic pivoting rule is followed,
then
(a) Every row of the simplex tableau, other than the zeroth row, remains

lexicographically positive throughout the algorithm.

(b) The zeroth row strictly increases lexicographically at each iteration.


(c) The simplex method terminates after a finite number of iterations.

5.8 Smallest subscript


pivoting rule
Slide 30
1. Find the smallest j for which the reduced cost c̄j is negative and have the

column Aj enter the basis.

2. Out of all variables xi that are tied in the test for choosing an exiting

variable, select the one with the smallest value of i.

15.081J/6.251J Introduction to Mathematical
Programming

Lecture 7: The Simplex Method III


1 Outline
Slide 1
• Finding an initial BFS
• The complete algorithm
• The column geometry
• Computational efficiency
• The diameter of polyhedra and the Hirsch conjecture

2 Finding an initial BFS


Slide 2
• Goal: Obtain a BFS of Ax = b, x ≥ 0

or decide that LOP is infeasible.

• Special case: b ≥ 0

Ax ≤ b, x ≥ 0

⇒ Ax + s = b, x, s ≥ 0
s = b, x=0

2.1 Artificial variables


Slide 3
Ax = b, x≥0
1. Multiply rows with −1 to get b ≥ 0.
2. Introduce artificial variables y, start with initial BFS y = b, x = 0, and

apply simplex to auxiliary problem

min y1 + y2 + . . . + ym
s.t. Ax + y = b
x, y ≥ 0
Slide 4
3. If cost > 0 ⇒ LOP infeasible; stop.

4. If cost = 0 and no artificial variable is in the basis, then a BFS was found.

5. Else, all yi∗ = 0, but some are still in the basis. Say we have AB(1) , . . . , AB(k)

in basis k < m. There are m − k additional columns of A to form a basis.

Slide 5

6. Drive artificial variables out of the basis: If the lth basic variable is artificial, examine the lth row of B −1 A. If all elements = 0 ⇒ row redundant. Otherwise pivot on a ≠ 0 element.

2.2 Example
Slide 6
min x1 + x2 + x3
s.t. x1 + 2x2 + 3x3 = 3
−x1 + 2x2 + 6x3 = 2
4x2 + 9x3 = 5
3x3 + x4 = 1
x1 , . . . , x4 ≥ 0.
min x5 + x6 + x7 + x8
s.t. x1 + 2x2 + 3x3 + x5 = 3
−x1 + 2x2 + 6x3 + x6 = 2
4x2 + 9x3 + x7 = 5
3x3 + x4 + x8 = 1
x1 , . . . , x8 ≥ 0.
Slide 7
x1 x2 x3 x4 x5 x6 x7 x8
−11 0 −8 −21 −1 0 0 0 0
x5 = 3 1 2 3 0 1 0 0 0
x6 = 2 −1 2 6 0 0 1 0 0
x7 = 5 0 4 9 0 0 0 1 0
x8 = 1 0 0 3 1* 0 0 0 1

x1 x2 x3 x4 x5 x6 x7 x8
−10 0 −8 −18 0 0 0 0 1
x5 = 3 1 2 3 0 1 0 0 0
x6 = 2 −1 2 6 0 0 1 0 0
x7 = 5 0 4 9 0 0 0 1 0
x4 = 1 0 0 3* 1 0 0 0 1
Slide 8

x1 x2 x3 x4 x5 x6 x7 x8
−4 0 −8 0 6 0 0 0 7
x5 = 2 1 2 0 −1 1 0 0 −1
x6 = 0 −1 2* 0 −2 0 1 0 −2
x7 = 2 0 4 0 −3 0 0 1 −3
x3 = 1/3 0 0 1 1/3 0 0 0 1/3

x1 x2 x3 x4 x5 x6 x7 x8
−4 −4 0 0 −2 0 4 0 −1
x5 = 2 2* 0 0 1 1 −1 0 1
x2 = 0 −1/2 1 0 −1 0 1/2 0 −1
x7 = 2 2 0 0 1 0 −2 1 1
x3 = 1/3 0 0 1 1/3 0 0 0 1/3
Slide 9

x1 x2 x3 x4 x5 x6 x7 x8
0 0 0 0 0 2 2 0 1
x1 = 1 1 0 0 1/2 1/2 −1/2 0 1/2
x2 = 1/2 0 1 0 −3/4 1/4 1/4 0 −3/4
x7 = 0 0 0 0 0 −1 −1 1 0
x3 = 1/3 0 0 1 1/3 0 0 0 1/3
Slide 10
x1 x2 x3 x4
∗ ∗ ∗ ∗ ∗
x1 = 1 1 0 0 1/2
x2 = 1/2 0 1 0 −3/4
x3 = 1/3 0 0 1 1/3

3 A complete Algorithm for LO


Slide 11
Phase I:
1. By multiplying some of the constraints by −1, change the problem so that

b ≥ 0.

�m
2. Introduce y1 , . . . , ym , if necessary, and apply the simplex method to min i=1 yi .
3. If cost> 0, original problem is infeasible; STOP.
4. If cost= 0, a feasible solution to the original problem has been found.
5. Drive artificial variables out of the basis, potentially eliminating redundant

rows.

Slide 12
Phase II:

1. Let the final basis and tableau obtained from Phase I be the initial basis
and tableau for Phase II.
2. Compute the reduced costs of all variables for this initial basis, using the
cost coefficients of the original problem.
3. Apply the simplex method to the original problem.

3.1 Possible outcomes


Slide 13
1. Infeasible: Detected at Phase I.
2. A has linearly dependent rows: Detected at Phase I, eliminate redundant
rows.
3. Unbounded (cost= −∞): detected at Phase II.
4. Optimal solution: Terminate at Phase II in optimality check.

4 The big-M method


Slide 14
min Σ_{j=1}^{n} cj xj + M Σ_{i=1}^{m} yi
s.t. Ax + y = b
x, y ≥ 0

5 The Column Geometry


Slide 15
min c′ x
s.t. Ax = b
e′ x = 1
x ≥ 0
x1 (A1 ; c1 ) + x2 (A2 ; c2 ) + · · · + xn (An ; cn ) = (b ; z),

where (Ai ; ci ) stacks the column Ai on top of its cost ci .
Slide 16
Slide 17
6 Computational efficiency
Slide 18
Exceptional practical behavior: linear in n
Worst case
max xn
s.t. ǫ ≤ x1 ≤ 1
ǫxi−1 ≤ xi ≤ 1 − ǫxi−1 , i = 2, . . . , n
Slide 19
Theorem Slide 20

[Figure: the column geometry: the points (Ai ; ci ) in ℜ^{m+1}, the requirement line through b, and the sequence of bases (initial, next, optimal) tracing the lower envelope. Below: (a) the unit cube and (b) its ǫ-perturbed (Klee-Minty) version.]

• The feasible set has 2^n vertices

• The vertices can be ordered so that each one is adjacent to and has lower cost than the previous one.

• There exists a pivoting rule under which the simplex method requires 2^n − 1 changes of basis before it terminates.

7 The Diameter of polyhedra


Slide 21
• Given a polyhedron P , and x, y vertices of P , the distance d(x, y) is the

minimum number of jumps from one vertex to an adjacent one to reach y

starting from x.

• The diameter D(P ) is the maximum of d(x, y) ∀x, y.


Slide 22
• Δ(n, m) as the maximum of D(P ) over all bounded polyhedra in ℜn that

are represented in terms of m inequality constraints.

• Δu (n, m) is like Δ(n, m) but for possibly unbounded polyhedra.

7.1 The Hirsch Conjecture


Slide 23
• Δ(2, m) = ⌊m/2⌋, Δu (2, m) = m − 2

[Figure: (a) a bounded and (b) an unbounded two-dimensional polyhedron attaining these diameters.]

• Hirsch Conjecture: Δ(n, m) ≤ m − n.


Slide 24
• We know that

Δu (n, m) ≥ m − n + ⌊n/5⌋

Δ(n, m) ≤ Δu (n, m) < m^{1+log2 n} = (2n)^{log2 m}

15.081J/6.251J Introduction to Mathematical
Programming

Lecture 8: Duality Theory I


1 Outline
Slide 1
• Motivation of duality
• General form of the dual
• Weak and strong duality
• Relations between primal and dual
• Economic Interpretation
• Complementary Slackness

2 Motivation
2.1 An idea from Lagrange
Slide 2
Consider the LOP, called the primal with optimal solution x∗

min c′ x
s.t. Ax = b
x≥0

Relax the constraint


g(p) = min c′ x + p′ (b − Ax)
s.t. x ≥ 0

g(p) ≤ c′ x∗ + p′ (b − Ax∗ ) = c′ x∗
Get the tightest lower bound, i.e.,

max g(p)

g(p) = min_{x≥0} ( c′ x + p′ (b − Ax) )
= p′ b + min_{x≥0} (c′ − p′ A)x

Note that
min_{x≥0} (c′ − p′ A)x = 0 if c′ − p′ A ≥ 0′ , and −∞ otherwise.

Dual: max g(p) ⇔ max p′ b
s.t. p′ A ≤ c′

3 General form of the dual

Slide 3
Primal                          Dual
min c′ x                        max p′ b
s.t. a′i x ≥ bi , i ∈ M1        s.t. pi ≥ 0, i ∈ M1
a′i x ≤ bi , i ∈ M2             pi ≤ 0, i ∈ M2
a′i x = bi , i ∈ M3             pi free, i ∈ M3
xj ≥ 0, j ∈ N1                  p′ Aj ≤ cj , j ∈ N1
xj ≤ 0, j ∈ N2                  p′ Aj ≥ cj , j ∈ N2
xj free, j ∈ N3                 p′ Aj = cj , j ∈ N3

3.1 Example
Slide 4
min x1 + 2x2 + 3x3 max 5p1 + 6p2 + 4p3
s.t. −x1 + 3x2 =5 s.t. p1 free
2x1 − x2 + 3x3 ≥ 6 p2 ≥0
x3 ≤ 4 p3 ≤0
x1 ≥ 0 −p1 + 2p2 ≤1
x2 ≤ 0 3p1 − p2 ≥2
x3 free, 3p2 + p3 = 3.
Slide 5
Primal          min       max       Dual
constraints     ≥ bi      ≥ 0       variables
                ≤ bi      ≤ 0
                = bi      free
variables       ≥ 0       ≤ cj      constraints
                ≤ 0       ≥ cj
                free      = cj

Theorem: The dual of the dual is the primal.

3.2 A matrix view


Slide 6
min c′ x max p′ b
s.t. Ax = b s.t. p′ A ≤ c′
x ≥ 0
min c′ x max p′ b
s.t. Ax ≥ b s.t. p′ A = c′
p≥0

4 Weak Duality
Slide 7
Theorem:
If x is primal feasible and p is dual feasible then p′ b ≤ c′ x
Proof
p′ b = p′ Ax ≤ c′ x

Corollary:

If x is primal feasible, p is dual feasible, and p′ b = c′ x, then x is optimal in

the primal and p is optimal in the dual.

5 Strong Duality
Slide 8
Theorem: If the LOP has optimal solution, then so does the dual, and optimal

costs are equal.

Proof:

min c′ x
s.t. Ax = b
x ≥ 0
Apply Simplex; optimal solution x, basis B.
Optimality conditions:
c′ − c′B B −1 A ≥ 0′
Slide 9
Define p′ = c′B B −1 ⇒ p′ A ≤ c′
⇒ p dual feasible for
max p′ b
s.t. p′ A ≤ c′

p′ b = c′B B −1 b = c′B xB = c′ x
⇒ x, p are primal and dual optimal

5.1 Intuition
Slide 10
[Figure: at the optimal vertex x∗ the cost vector c lies in the cone of the active constraint vectors: c = p1 a1 + p2 a2 , p1 , p2 ≥ 0.]
6 Relations between primal and dual
Slide 11
              Finite opt.   Unbounded   Infeasible
Finite opt.        *
Unbounded                                   *
Infeasible                      *           *
7 Economic Interpretation
Slide 12
• x optimal nondegenerate solution: B −1 b > 0
• Suppose b changes to b + d for some small d
• How is the optimal cost affected?
• For small d feasibilty unaffected
• Optimality conditions unaffected
• New cost c′B B −1 (b + d) = p′ (b + d)
• If resource i changes by di , cost changes by pi di : “Marginal Price”

8 Complementary slackness
8.1 Theorem
Slide 13
Let x primal feasible and p dual feasible. Then x, p optimal if and only if

pi (a′i x − bi ) = 0, ∀i

xj (cj − p′ Aj ) = 0, ∀j

8.2 Proof
Slide 14
• ui = pi (a′i x − bi ) and vj = (cj − p′ Aj )xj
• If x primal feasible and p dual feasible, we have ui ≥ 0 and vj ≥ 0 for all

i and j.

• Also c′ x − p′ b = Σ_i ui + Σ_j vj .

• By the strong duality theorem, if x and p are optimal, then c′ x = p′ b ⇒

ui = vj = 0 for all i, j.

• Conversely, if ui = vj = 0 for all i, j, then c′ x = p′ b,

• ⇒ x and p are optimal.

8.3 Example
Slide 15

min 13x1 + 10x2 + 6x3          max 8p1 + 3p2
s.t. 5x1 + x2 + 3x3 = 8        s.t. 5p1 + 3p2 ≤ 13
3x1 + x2 = 3                   p1 + p2 ≤ 10
x1 , x2 , x3 ≥ 0               3p1 ≤ 6

Is x∗ = (1, 0, 1)′ optimal? Slide 16

Complementary slackness: x1 > 0 ⇒ 5p1 + 3p2 = 13; x3 > 0 ⇒ 3p1 = 6

⇒ p1 = 2, p2 = 1
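A numerical verification of the complementary slackness conditions for this example, assuming numpy:

import numpy as np

A = np.array([[5., 1, 3], [3, 1, 0]])
b = np.array([8., 3])
c = np.array([13., 10, 6])
x = np.array([1., 0, 1])
p = np.array([2., 1])

print(A @ x - b)             # equality constraints: both zero
print(x * (c - A.T @ p))     # x_j (c_j - p'A_j) = 0 for all j
print(c @ x, p @ b)          # both 19: equal costs confirm optimality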

15.081J/6.251J Introduction to Mathematical
Programming

Lecture 9: Duality Theory II


1 Outline
Slide 1
• Strict complementary slackness
• Geometry of duality
• The dual simplex algorithm
• Duality and degeneracy

2 Strict Complementary Slackness


Slide 2
Assume that both problems have an optimal solution:

min c′ x max p′ b
s.t. Ax ≥ b s.t. p′ A ≤ c′
x ≥ 0, p ≥ 0.

There exist optimal solutions to the primal and to the dual that satisfy
• For every j, either xj > 0 or p′ Aj < cj .
• For every i, we have either a′i x > bi or pi > 0.

2.1 Example
Slide 3
min 5x1 + 5x2
s.t. x1 + x2 ≥ 2
2x1 − x2 ≥ 0
x1 , x2 ≥ 0.

• Is (2/3, 4/3) strictly complementary?


• Which are all the strictly complementary solutions?

3 The Geometry of Duality


Slide 4
min c′ x
s.t. a′i x ≥ bi , i = 1, . . . , m

max p′ b
s.t. Σ_{i=1}^{m} pi ai = c
p ≥ 0

[Figure: four situations A-D showing when the cost vector c lies in the cone generated by the active constraint vectors ai at a point, and hence when a feasible dual solution exists; the last panel shows the optimal x∗.]

4 Dual Simplex Algorithm


4.1 Motivation
Slide 5
• In simplex method B −1 b ≥ 0
• Primal optimality condition

c′ − c′B B −1 A ≥ 0′

same as dual feasibility


• Simplex is a primal algorithm: maintains primal feasibility and works

towards dual feasibility

• Dual algorithm: maintains dual feasibility and works towards primal


feasibility
Slide 6
−c′B xB | c̄1 . . . c̄n
xB(1)   |
  ...   | B −1 A1 . . . B −1 An
xB(m)   |

• Do not require B −1 b ≥ 0
• Require c̄ ≥ 0 (dual feasibility)
• Dual cost is p′ b = c′B B −1 b = c′B xB

• If B −1 b ≥ 0 then both dual feasibility and primal feasibility, and also

same cost ⇒ optimality

• Otherwise, change basis

4.2 An iteration
Slide 7
1. Start with basis matrix B and all reduced costs ≥ 0.

2. If B −1 b ≥ 0 optimal solution found; else, choose l s.t. xB(l) < 0.

3. Consider the lth row (pivot row) xB(l) , v1 , . . . , vn . If ∀i vi ≥ 0 then dual

optimal cost = +∞ and algorithm terminates.

Slide 8
4. Else, let j s.t. c̄j /|vj | = min_{i | vi < 0} c̄i /|vi |
5. Pivot element vj : Aj enters the basis and AB(l) exits.

[Figure: (a) the primal feasible set and (b) the dual feasible set for the example below, with the corresponding basic solutions A-E and the dual simplex path.]
4.3 An example
Slide 9
min x1 + x2
s.t. x1 + 2x2 ≥ 2
x1 ≥ 1
x1 , x2 ≥ 0
min x1 + x2 max 2p1 + p2
s.t. x1 + 2x2 − x3 = 2 s.t. p1 + p2 ≤ 1
x1 − x4 = 1 2p1 ≤ 1
x1 , x2 , x3 , x4 ≥ 0 p1 , p2 ≥ 0
Slide 10
x1 x2 x3 x4
0 1 1 0 0
x3 = −2 −1 −2* 1 0
x4 = −1 −1 0 0 1
Slide 11

x1 x2 x3 x4
−1 1/2 0 1/2 0
x2 = 1 1/2 1 −1/2 0
x4 = −1 −1* 0 0 1

x1 x2 x3 x4
−3/2 0 0 1/2 1/2
x2 = 1/2 0 1 −1/2 1/2
x1 = 1 1 0 0 −1

[Figure: (a) primal and (b) dual feasible sets for the degenerate example below, showing basic solutions A, A′, A′′, B, C, D.]
5 Duality and Degeneracy


Slide 12
• Any basis matrix B leads to the dual basic solution p′ = c′B B −1 .
• The dual constraint p′ Aj = cj is active if and only if the reduced cost c̄j is zero.

• Since p is m-dimensional, dual degeneracy implies more than m reduced

costs that are zero.

• Dual degeneracy is obtained whenever there exists a nonbasic variable

whose reduced cost is zero.

5.1 Example
Slide 13
min 3x1 + x2 max 2p1
s.t. x1 + x2 − x3 = 2 s.t. p1 + 2p2 ≤ 3
2x1 − x2 − x4 = 0 p1 − p2 ≤ 1
x1 , x2 , x3 , x4 ≥ 0, p1 , p2 ≥ 0.
Equivalent primal problem

min 3x1 + x2
s.t. x1 + x2 ≥ 2
2x1 − x2 ≥ 0
x1 , x2 ≥ 0.
Slide 14
Slide 15
• Four basic solutions in primal: A, B, C, D.
• Six distinct basic solutions in dual: A, A′ , A′′ , B, C, D.
• Different bases may lead to the same basic solution for the primal, but

to different basic solutions for the dual. Some are feasible and some are

infeasible.

5.2 Degeneracy and uniqueness
Slide 16
• If dual has a nondegenerate optimal solution, the primal problem has a

unique optimal solution.

• It is possible, however, for the dual to have a degenerate optimal solution while the primal still has a unique optimal solution.

15.081J/6.251J Introduction to Mathematical
Programming

Lecture 10: Duality Theory III


[Figure: the cone generated by the columns A1 , A2 , A3 , and the vector b.]

1 Outline
Slide 1
• Farkas lemma
• Asset pricing
• Cones and extreme rays
• Representation of Polyhedra

2 Farkas lemma
Slide 2
Theorem:

Exactly one of the following two alternatives hold:

1. ∃x ≥ 0 s.t. Ax = b.
2. ∃p s.t. p′ A ≥ 0′ and p′ b < 0.
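A sketch of testing the two alternatives numerically with scipy.optimize.linprog; the data (A, b) below are only an illustration:

import numpy as np
from scipy.optimize import linprog

A = np.array([[1., 1], [1, -1]])
b = np.array([1., 2])

# Alternative 1: is there x >= 0 with Ax = b?
res = linprog(np.zeros(2), A_eq=A, b_eq=b)   # bounds default to x >= 0
print(res.status)    # 0 if feasible, 2 if infeasible

# Alternative 2: minimize p'b subject to p'A >= 0, p free.  If alternative 1
# fails, this problem is unbounded, and any feasible p with p'b < 0 is a
# certificate of infeasibility.
res2 = linprog(b, A_ub=-A.T, b_ub=np.zeros(2),
               bounds=[(None, None)] * 2)
print(res2.status)   # 3 (unbounded) exactly when Ax = b, x >= 0 fails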

2.1 Proof
Slide 3
"⇒"
If ∃x ≥ 0 s.t. Ax = b, and if p′ A ≥ 0′ , then p′ b = p′ Ax ≥ 0
"⇐"
Assume there is no x ≥ 0 s.t. Ax = b

(P ) max 0′ x          (D) min p′ b
s.t. Ax = b            s.t. p′ A ≥ 0′
x ≥ 0

(P) infeasible ⇒ (D) either unbounded or infeasible


Since p = 0 is feasible ⇒ (D) unbounded
⇒ ∃p : p′ A ≥ 0′ and p′ b < 0

3 Asset Pricing
Slide 4
• n different assets
• m possible states of nature
• one dollar invested in some asset i, and state of nature is s, we receive a

payoff of rsi

• m × n payoff matrix:

R = [ r11 . . . r1n ]
    [  .  . . .  .  ]
    [ rm1 . . . rmn ]
Slide 5
• xi : amount held of asset i. A portfolio of assets is x = (x1 , . . . , xn ).
• A negative value of xi indicates a “short” position in asset i: this amounts

to selling |xi | units of asset i at the beginning of the period, with a promise

to buy them back at the end. Hence, one must pay out rsi |xi | if state s

occurs, which is the same as receiving a payoff of rsi xi

Slide 6

• Wealth in state s from a portfolio x:

ws = Σ_{i=1}^{n} rsi xi .

• w = (w1 , . . . , wm )′ , i.e., w = Rx
• pi : price of asset i, p = (p1 , . . . , pn )
• Cost of acquiring x is p′ x.

3.1 Arbitrage
Slide 7
• Central problem: Determine pi
• Absence of arbitrage: no investor can get a guaranteed nonnegative

payoff out of a negative investment. In other words, any portfolio that pays

off nonnegative amounts in every state of nature, must have nonnegative

cost.

if Rx ≥ 0, then p′ x ≥ 0.
Slide 8

2
• Theorem: The absence of arbitrage condition holds if and only if there

exists a nonnegative vector q = (q1 , . . . , qm ), such that the price of each

asset i is given by

pi = Σ_{s=1}^{m} qs rsi .

• Applications to options pricing

4 Cones and extreme rays


4.1 Definitions
Slide 9
• A set C ⊂ ℜn is a cone if λx ∈ C for all λ ≥ 0 and all x ∈ C
• A polyhedron of the form P = {x ∈ ℜn | Ax ≥ 0} is called a polyhedral

cone

4.2 Applications
Slide 10
• P = { x ∈ ℜn | Ax ≥ b }, y ∈ P
• The recession cone at y:

RC = { d ∈ ℜn | y + λd ∈ P, ∀ λ ≥ 0 }

• It turns out that

RC = { d ∈ ℜn | Ad ≥ 0 }

• RC independent of y
Slide 11

4.3 Extreme rays


Slide 12
A nonzero element x of a polyhedral cone C ⊂ ℜn is called an extreme ray if there are n − 1 linearly independent constraints that are active at x.

4.4 Unbounded LPs


Slide 13
Theorem: Consider the problem of minimizing c′ x over a polyhedral cone C = {x ∈ ℜn | a′i x ≥ 0, i = 1, . . . , m} that has zero as an extreme point. The optimal cost is equal to −∞ if and only if some extreme ray d of C satisfies c′ d < 0.

Theorem: Consider the problem of minimizing c′ x subject to Ax ≥ b, and assume that the feasible set has at least one extreme point. The optimal cost is equal to −∞ if and only if some extreme ray d of the feasible set satisfies c′ d < 0. Slide 14

What happens when the simplex method detects an unbounded problem?

[Figure: (a) a polyhedral cone {x | a′1 x ≥ 0, a′2 x ≥ 0} with its extreme rays; (b) a polyhedron, a point y in it, and its recession cone spanned by the extreme rays w1 , w2 .]
5 Resolution Theorem
Slide 15
P = x ∈ ℜn | Ax ≥ b
� �

be a nonempty polyhedron with at least one extreme point. Let x1 , . . . , xk be


the extreme points, and let w 1 , . . . , w r be a complete set of extreme rays of P .
� k r k

� � � �
i j �
Q= λi x + θj w � λi ≥ 0, θj ≥ 0, λi = 1 .
i=1 j=1 i=1

Then, Q = P .

5.1 Example
Slide 16
x1 − x2 ≥ −2
x1 + x2 ≥ 1
x1 , x2 ≥ 0
Slide 17
• Extreme points: x1 = (0, 2), x2 = (0, 1), and x3 = (1, 0).
• Extreme rays w1 = (1, 1) and w2 = (1, 0).
• y = (2, 2) = (0, 1) + (1, 1) + (1, 0) = x2 + w1 + w2 .

5.2 Proof
Slide 18
• Q ⊂ P . Let x ∈ Q:

x = Σ_{i=1}^{k} λi xi + Σ_{j=1}^{r} θj wj

λi , θj ≥ 0, Σ_{i=1}^{k} λi = 1.

• y = Σ_{i=1}^{k} λi xi ∈ P and satisfies Ay ≥ b.

• Awj ≥ 0 for every j: z = Σ_{j=1}^{r} θj wj satisfies Az ≥ 0.
• x = y + z satisfies Ax ≥ b and belongs to P .
Slide 19
For the reverse, assume there is a z ∈ P such that z ∉ Q.

max Σ_{i=1}^{k} 0 · λi + Σ_{j=1}^{r} 0 · θj

s.t. Σ_{i=1}^{k} λi xi + Σ_{j=1}^{r} θj wj = z

Σ_{i=1}^{k} λi = 1

λi ≥ 0, i = 1, . . . , k,
θj ≥ 0, j = 1, . . . , r.

Is this feasible? Slide 20
• Dual
min p′ z + q
s.t. p′ xi + q ≥ 0, i = 1, . . . , k,
p′ wj ≥ 0, j = 1, . . . , r.
• This is unbounded. Why?
• There exists a feasible solution (p, q) whose cost p′ z + q < 0
• p′ z < p′ xi for all i and p′ wj ≥ 0 for all j.
Slide 21

min p′ x
s.t. Ax ≥ b.

• If the optimal cost is finite, there exists an extreme point xi which is

optimal. Since z is a feasible solution, we obtain p′ xi ≤ p′ z, which is a

contradiction.

• If the optimal cost is −∞, there exists an extreme ray wj such that

p′ w j < 0, which is again a contradiction

15.081J/6.251J Introduction to Mathematical
Programming

Lecture 11: Duality Theory IV


1 Outline
Slide 1
• Overview and objectives
• Weistrass Theorem
• Separating hyperplanes theorem
• Farkas lemma revisited
• Duality theorem revisited

2 Overview and objectives


Slide 2
• So far: Simplex −→ Duality −→ Farkas lemma
• Disadvantages: specialized to LP, relied on a particular algorithm

• Plan today: Separation (A Geometric property) −→ Farkas lemma −→

Duality

• Purely geometric, generalizes to general nonlinear problems, more fundamental

3 Closed sets
Slide 3
• A set S ⊂ ℜn is closed if x1 , x2 , . . . is a sequence of elements of S that

converges to some x ∈ ℜn , then x ∈ S.

• Every polyhedron is closed.

4 Weierstrass’ theorem
Slide 4
If f : ℜn → ℜ is a continuous function, and if S is a nonempty, closed, and
bounded subset of ℜn , then there exists some x∗ ∈ S such that f (x∗ ) ≤ f (x)
for all x ∈ S. Similarly, there exists some y ∗ ∈ S such that f (y ∗ ) ≥ f (x) for
all x ∈ S.
Note: Weierstrass’ theorem is not valid if the set S is not closed. Consider,
S = {x ∈ ℜ | x > 0}, f (x) = x

5 Separation
Slide 5
Theorem: Let S be a nonempty closed convex subset of ℜn and let x∗ ∈ ℜn , x∗ ∉ S. Then, there exists some vector c ∈ ℜn such that c′ x∗ < c′ x for all x ∈ S.

[Figure: a closed convex set S, a point x∗ ∉ S, the ball B around x∗ through w ∈ S, the closest point y ∈ S ∩ B, and the separating direction c = y − x∗ .]

5.1 Proof
Slide 6
• Fix w ∈ S
• B = { x | ‖x − x∗ ‖ ≤ ‖w − x∗ ‖ }
• D = S ∩ B
• D ≠ ∅, closed and bounded. Why?
• Consider min_{x∈D} ‖x − x∗ ‖
Slide 7
Slide 8
• By Weierstrass' theorem there exists some y ∈ D such that

‖y − x∗ ‖ ≤ ‖x − x∗ ‖, ∀ x ∈ D.

• ∀x ∈ S with x ∉ D: ‖x − x∗ ‖ > ‖w − x∗ ‖ ≥ ‖y − x∗ ‖.
• So y minimizes ‖x − x∗ ‖ over all x ∈ S.
• Let c = y − x∗
Slide 9
• x ∈ S. ∀λ satisfying 0 < λ ≤ 1, y + λ(x − y) ∈ S (S convex)
• ‖y − x∗ ‖^2 ≤ ‖y + λ(x − y) − x∗ ‖^2

= ‖y − x∗ ‖^2 + 2λ(y − x∗ )′ (x − y) + λ^2 ‖x − y‖^2

• 2λ(y − x∗ )′ (x − y) + λ^2 ‖x − y‖^2 ≥ 0.

• Divide by λ and let λ → 0: (y − x∗ )′ (x − y) ≥ 0, i.e.,

(y − x∗ )′ x ≥ (y − x∗ )′ y
= (y − x∗ )′ x∗ + (y − x∗ )′ (y − x∗ )
> (y − x∗ )′ x∗ .

• c = y − x∗ proves the theorem

6 Farkas’ lemma
Slide 10
Theorem: If Ax = b, x ≥ 0 is infeasible, then there exists a vector p such that
p′ A ≥ 0′ and p′ b < 0.
� � �
• S = y � there exists x such that y = Ax, x ≥ 0 b ∈ / S.

• S is convex; nonempty; closed;

S is the projection of {(x, y) | y = Ax, x ≥ 0} onto the y coordinates,

is itself a polyhedron and is therefore closed.

/ S: ∃p such that p′ b < p′ y for every y ∈ S.


• b ∈
• Since 0 ∈ S, we must have p′ b < 0.
• ∀Ai and ∀λ > 0, λAi ∈ S and p′ b < λp′ Ai
• Divide by λ and then take limit as λ tends to infinity: p′ Ai ≥ 0 ⇒ p′ A ≥
0′

7 Duality theorem
Slide 11
min c′ x max p′ b
s.t. Ax ≥ b s.t. p′ A = c ′
p≥0
and we assume that the primal has an optimal solution x∗ . We will show that
the dual problem also has a feasible solution with the same cost. Strong duality
follows then from weak duality. Slide 12

• I = {i | a′i x∗ = bi }
• We next show: if a′i d ≥ 0 for every i ∈ I, then c′ d ≥ 0

3
• a′i (x∗ + ǫd) ≥ ai x∗ = bi for all i ∈ I.
/ I, a′i x∗ > bi hence a′i (x∗ + ǫd) > bi .
• If i ∈
• x∗ + ǫd is feasible
Slide 13

• By optimality x∗ , c′ d ≥ 0
• By Farkas’ lemma �
c= pi a i .
i∈I

/ I, we define pi = 0, so p′ A = c′ .
• For i ∈
• � �
p′ b = pi b i = pi ai′ x∗ = c′ x∗ ,
i∈I i∈I

4
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 12: Sensitivity Analysis


1 Motivation
1.1 Questions
Slide 1
z = min c′ x
s.t. Ax = b

x ≥ 0

• How does z depend globally on c? on b?


• How does z change locally if either b, c, A change?
• How does z change if we add new constraints, introduce new variables?
• Importance: Insight about LO and practical relevance

2 Outline
Slide 2
1. Global sensitivity analysis
2. Local sensitivity analysis
(a) Changes in b
(b) Changes in c
(c) A new variable is added
(d) A new constraint is added
(e) Changes in A
3. Detailed example

3 Global sensitivity analysis


3.1 Dependence on c
Slide 3
G(c) = min c′ x
s.t. Ax = b
x≥0
i
G(c) = mini=1,...,N c′ x is a concave function of c

3.2 Dependence on b
Slide 4
Primal Dual
F (b) = min c′ x
F (b) = max p′ b
s.t. Ax = b
s.t. p′ A ≤ c′
x≥0
F (b) = maxi=1,...,N (pi )′ b is a convex function of b

1
( c + q d) ' ( x3)

( c + q d) ' ( x2
)

( c + q d) ' ( x1) ( c + q d) ' ( x4


)

x1 o p t i m a l
. x2 o p t i m a l
. x3 o p t i m a l
. x4 o p t i m a l q

f( q)

( p1) ' ( b* + q d)

( p3) ' ( b* + q d)
( p2) ' ( b* + q d)

q1 q2 q

4 Local sensitivity analysis


Slide 5
z = min c′ x
s.t. Ax = b
x≥0
What does it mean that a basis B is optimal?

1. Feasibility conditions: B −1 b ≥ 0
2. Optimality conditions: c′ − cB

B −1 A ≥ 0′
Slide 6
• Suppose that there is a change in either b or c for example
• How do we find whether B is still optimal?
• Need to check whether the feasibility and optimality conditions are satis­

fied

5 Local sensitivity analysis


5.1 Changes in b
Slide 7
bi becomes bi + Δ, i.e.
(P ) min c′ x (P ′ ) min c′ x
s.t. Ax = b → s.t. Ax = b + Δei

x≥0 x ≥ 0

• B optimal basis for (P )


• Is B optimal for (P ′ )?
Slide 8
Need to check:

1. Feasibility: B −1 (b + Δei ) ≥ 0
2. Optimality: c′ − cB

B −1 A ≥ 0′

Observations:
1. Changes in b affect feasibility
2. Optimality conditions are not affected
Slide 9
B −1 (b + Δei ) ≥ 0
βij = [B −1 ]ij
bj = [B −1 b]j
Thus,
(B −1 b)j + Δ(B −1 ei )j ≥ 0 ⇒ bj + Δβji ≥ 0 ⇒

3
   
bj bj
max − ≤ Δ ≤ min −
βji >0 βji βji <0 βji
Slide 10

Δ≤Δ≤Δ

Within this range


• Current basis B is optimal
• z = c′B B −1 (b + Δei ) = cB

B −1 b + Δpi
• What if Δ = Δ?
• What if Δ > Δ?
Current solution is infeasible, but satisfies optimality conditions → use

dual simplex method

5.2 Changes in c
Slide 11
cj → cj + Δ

Is current basis B optimal?

Need to check:

1. Feasibility: B −1 b ≥ 0, unaffected
2. Optimality: c′ − cB

B −1 A ≥ 0′ , affected

There are two cases:


• xj basic

• xj nonbasic

5.2.1 xj nonbasic
Slide 12
cB unaffected
(cj + Δ) − c′B B −1 Aj ≥ 0 ⇒ cj + Δ ≥ 0
Solution optimal if Δ ≥ −cj
What if Δ = −cj ?
What if Δ < −cj ?

4
5.2.2 xj basic
Slide 13

cB ← ĉB = cB + Δej

Then,
[c′ − ĉB

B −1 A]i ≥ 0 ⇒ ci − [cB + Δej ]′ B −1 Ai ≥ 0
[B −1 A]ji = aji
ci ci
ci − Δaji ≥ 0 ⇒ max ≤ Δ ≤ min
aji <0 aji aji >0 aji

What if Δ is outside this range? use primal simplex

5.3 A new variable is added


Slide 14
min c′ x min c′ x + cn+1 xn+1
s.t. Ax = b → s.t. Ax + An+1 xn+1 = b
x≥0 x≥0
In the new problem is xn+1 = 0 or xn+1 > 0? (i.e., is the new activity prof­
itable?) Slide 15
Current basis B. Is solution x = B −1 b, xn+1 = 0 optimal?

• Feasibility conditions are satisfied


• Optimality conditions:

cn+1 − c′B B −1 An+1 ≥ 0 ⇒ cn+1 − p′ An+1 ≥ 0?

• If yes, solution x = B −1 b, xn+1 = 0 optimal


• Otherwise, use primal simplex

5.4 A new constraint is added


Slide 16
′ min c′ x
min c x
s.t. Ax = b
s.t. Ax = b →
a′m+1 x = bm+1
x≥0
x≥0
If current solution feasible, it is optimal; otherwise, apply dual simplex

5
5.5 Changes in A
Slide 17
• Suppose aij ← aij + Δ
• Assume Aj does not belong in the basis
• Feasibility conditions: B −1 b ≥ 0, unaffected
• Optimality conditions: cl − c′B B −1 Al ≥ 0, l 6= j, unaffected
• Optimality condition: cj − p′ (Aj + Δei ) ≥ 0 ⇒ cj − Δpi ≥ 0

• What if Aj is basic? BT, Exer. 5.3

6 Example
6.1 A Furniture company
Slide 18
• A furniture company makes desks, tables, chairs
• The production requires wood, finishing labor, carpentry labor

Desk Table (ft) Chair Avail.


Profit 60 30 20 -
Wood (ft) 8 6 1 48
Finish hrs. 4 2 1.5 20
Carpentry hrs. 2 1.5 0.5 8

6.2 Formulation
Slide 19
Decision variables:
x1 = # desks, x2 = # tables, x3 = # chairs

max 60x1 + 30x2 + 20x3


s.t. 8x1 + 6x2 + x3 ≤ 48
4x1 + 2x2 + 1.5x3 ≤ 20
2x1 + 1.5x2 + 0.5x3 ≤8
x1 , x2 , x3 ≥0

6.3 Simplex tableaus


Slide 20
Initial tableau: s1 s2 s3 x1 x2 x3
0 0 0 0 -60 -30 -20
s1 = 48 1 8 6 1
s2 = 20 1 4 2 1.5
s2 = 8 1 2 1.5 0.5

6
Final tableau: s1 s2 s3 x1 x2 x3
280 0 10 10 0 5 0
s1 = 24 1 2 -8 0 -2 0
x3 = 8 0 2 -4 0 -2 1
x1 = 2 0 -0.5 1.5 1 1.25 0

6.4 Information in tableaus


Slide 21
• What is B?  
1 1 8
B= 0 1.5 4 
0 0.5 2

• What is B −1 ?  
1 2 −8
B −1 = 0 2 −4 
0 −0.5 1.5
Slide 22
• What is the optimal solution?
• What is the optimal solution value?
• Is it a bit surprising?
• What is the optimal dual solution?
• What is the shadow price of the wood constraint?
• What is the shadow price of the finishing hours constraint?
• What is the reduced cost for x2 ?

6.5 Shadow prices


Slide 23
Why the dual price of the finishing hours constraint is 10?

• Suppose that finishing hours become 21 (from 20).


• Currently only desks (x1 ) and chairs (x3 ) are produced
• Finishing and carpentry hours constraints are tight
• Does this change leaves current basis optimal?
Slide 24
New Previous
8x1 + x3 + s1 = 48 s1 = 26 24
New solution:
4x1 + 1.5x3 = 21 ⇒ x1 = 1.5 2
2x1 + 0.5x3 =8 x3 = 10 8
Solution change:
z ′ − z = (60 ∗ 1.5 + 20 ∗ 10) − (60 ∗ 2 + 20 ∗ 8) = 10
Slide 25

7
• Suppose you can hire 1h of finishing overtime at $7. Would you do it?
• Another check
 
1 2 −8
c′B B −1 = (0, −20, −60)  0 2 −4  =
0 −0.5 1.5

(0, −10, −10)

6.6 Reduced costs


Slide 26
• What does it mean that the reduced cost for x2 is 5?
• Suppose you are forced to produce x2 = 1 (1 table)
• How much will the profit decrease?

8x1 + x3 + s1 + 6·1 = 48 s1 = 26
4x1 + 1.5x3 + 2·1 = 20 ⇒ x1 = 0.75
2x1 + 0.5x3 + 1.5 · 1 = 8 x3 = 10
z ′ − z = (60 ∗ 0.75 + 20 ∗ 10) − (60 ∗ 2 + 20 ∗ 8 + 30 ∗ 1) = −35 + 30 = −5 Slide 27
Another way to calculate the same thing: If x2 = 1

Direct profit from table +30


Decrease wood by -6 −6 ∗ 0 = 0
Decrease finishing hours by -2 −2 ∗ 10 = −20
Decrease carpentry hours by -1.5 −1.5 ∗ 10 = −15
Total Effect −5

Suppose profit from tables increases from $30 to $34. Should it be produced?
At $35? At $36?

6.7 Cost ranges


Slide 28
Suppose profit from desks becomes 60 + Δ. For what values of Δ does current

basis remain optimal?

Optimality conditions:

cj − c′B B −1 Aj ≥ 0 ⇒
1 2 −8
" #

p = c′B B −1 = [0, −20, −(60 + Δ)] 0 2 −4

0 −0.5 1.5

= −[0, 10 − 0.5Δ, 10 + 1.5Δ]


Slide 29
s1 , x3 , x1 are basic
Reduced costs of non-basic variables

8
 
6
c2 = c2 − p′ A2 = −30 + [0, 10 − 0.5Δ, 10 + 1.5Δ]  2  = 5 + 1.25Δ
1.5
cs2 = 10 − 0.5Δ
cs3 = 10 + 1.5Δ
Current basis optimal:

5 + 1.25Δ ≥ 0 
10 − 0.5Δ ≥ 0 −4 ≤ Δ ≤ 20
10 + 1.5Δ ≥ 0

⇒ 56 ≤ c1 ≤ 80 solution remains optimal.


If c1 < 56, or c1 > 80 current basis is not optimal.
Suppose c1 = 100(Δ = 40) What would you do?

6.8 Rhs ranges


Slide 30
Suppose
 finishing hours
 change by Δ  becoming
 (20+ Δ) What happens?
48 1 2 −8 48
B −1  20 + Δ  =  0 2 −4   20 + Δ 
8 0 −0.5 1.5 8
 
24 + 2Δ

=  8 + 2Δ  ≥ 0

2 − 0.5Δ

⇒ −4 ≤ Δ ≤ 4 current basis optimal Slide 31


Note that even if current basis is optimal, optimal solution variables change:

s1 = 24 + 2Δ
x3 = 8 + 2Δ
x1 = 2 − 0.5Δ
z = 60(2 − 0.5Δ) + 20(8 + 2Δ) = 280 + 10Δ
Slide 32
Suppose
 Δ =
 10 then

s1 44
 x3  =  25  ← inf. (Use dual simplex)
x1 −3

6.9 New activity


Slide 33
Suppose the company has the opportunity to produce stools
Profit $15; requires 1 ft of wood, 1 finishing hour, 1 carpentry hour
Should the company produce stools?

max 60x1 +30x2 +20x3 +15x4

8x1 +6x2 +x3 +x4 +s1 = 48

4x1 +2x2 +1.5x3 +x4 +s2 = 20

2x1 +1.5x2 +0.5x3 +x4 +s3 = 8

xi ≥ 0

9
1
!
c4 −c′B B −1 A4 = −15 − (0, −10, −10) 1 =5≥0
1
Current basis still optimal. Do not produce stools

10
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 13: Robust Optimization


1 Papers
Slide 1
• B. and Sim, The Price of Robustness, Operations Research, 2003.
• B. and Sim, Robust Discrete optimization, Mathematical Programming,

2003.

2 Structure
Slide 2
• Motivation

• Data Uncertainty

• Robust Mixed Integer Optimization

• Robust 0-1 Optimization

• Robust Approximation Algorithms

• Robust Network Flows

• Experimental Results

• Summary and Conclusions

3 Motivation
Slide 3
• The classical paradigm in optimization is to develop a model that assumes

that the input data is precisely known and equal to some nominal values.

This approach, however, does not take into account the influence of data

uncertainties on the quality and feasibility of the model.

• Can we design solution approaches that are immune to data uncertainty,

that is they are robust?

Slide 4

• Ben-Tal and Nemirovski (2000):


In real-world applications of Linear Optimization (Net Lib li­
brary), one cannot ignore the possibility that a small uncer­
tainty in the data can make the usual optimal solution com­
pletely meaningless from a practical viewpoint.

1
3.1 Literature
Slide 5
• Ellipsoidal uncertainty; Robust convex optimization Ben-Tal and Nemirovski

(1997), El-Ghaoui et. al (1996)

• Flexible adjustment of conservativism


• Nonlinear convex models
• Not extendable to discrete optimization

4 Goal
Slide 6
Develop an approach to address data uncertainty for optimization problems
that:
• It allows to control the degree of conservatism of the solution;
• It is computationally tractable both practically and theoretically.

5 Data Uncertainty
Slide 7
minimize c′ x
subject to Ax ≤ b
l≤x≤u
xi ∈ Z, i = 1, . . . , k,
WLOG data uncertainty affects only A and c, but not the vector b. Slide 8

• (Uncertainty for matrix A): aij , j ∈ Ji is independent, symmetric

and bounded random variable (but with unknown distribution) ãij , j ∈ Ji

that takes values in [aij − âij , aij + âij ].

• (Uncertainty for cost vector c): cj , j ∈ J0 takes values in [cj , cj + dj ].

6 Robust MIP
Slide 9
• Consider an integer Γi ∈ [0, |Ji |], i = 0, 1, . . . , m.
• Γi adjusts the robustness of the proposed method against the level of

conservativeness of the solution.

• Speaking intuitively, it is unlikely that all of the aij , j ∈ Ji will change.

We want to be protected against all cases that up to Γi of the aij ’s are

allowed to change.

Slide 10
• Nature will be restricted in its behavior, in that only a subset of the

coefficients will change in order to adversely affect the solution.

2
• We will guarantee that if nature behaves like this then the robust solution

will be feasible deterministically. Even if more than Γi change, then the

robust solution will be feasible with very high probability.

6.1 Problem
( ) Slide 11
X

minimize c x+ max dj |xj |
{S0 | S0 ⊆J0 ,|S0 |≤Γ0 }
j∈S0
( )
X X
subject to aij xj + max ˆij |xj |
a ≤ bi , ∀i
{Si | Si ⊆Ji ,|Si |≤Γi }
j j∈Si
l ≤ x ≤
u
xi ∈ Z, ∀i = 1, . . . k.

6.2 Theorem 1
Slide 12
The robust problem can be reformulated has an equivalent MIP:
P
minimize c ′ x + z 0 Γ0 + p
X j∈J0X0j
subject to aij xj + zi Γi + pij ≤ bi ∀i
j j∈Ji
z0 + p0j ≥ dj yj ∀j ∈ J0
zi + pij ≥ âij yj ∀i �= 0, j ∈ Ji
pij , yj , zi ≥ 0 ∀i, j ∈ Ji
−yj ≤ xj ≤ yj ∀j
lj ≤ xj ≤ uj ∀j
xi ∈ Z i = 1, . . . , k.

6.3 Proof
Slide 13
Given a vector x∗ , we define:
( )
X
βi (x∗ ) = max âij |x∗j | .
{Si | Si ⊆Ji ,|Si |=Γi }
j∈Si

This equals to: X


βi (x∗ ) = max âij |xj∗ |zij
j∈Ji
X
s.t. zij ≤ Γi
j∈Ji
0 ≤ zij ≤ 1 ∀i, j ∈ Ji .
Slide 14
Dual: X
βi (x∗ ) = min pij + Γi zi
j∈Ji
s.t. zi + pij ≥ âij |x∗j | ∀j ∈ Ji
pij ≥ 0 ∀j ∈ Ji
zi ≥ 0 ∀i.

3
|Ji | Γi

5 5

10 8.3565

100 24.263

200 33.899

Table 1: Choice of Γi as a function of |Ji | so that the probability of constraint


violation is less than 1%.

6.4 Size
Slide 15
• Original Problem has n variables and m constraints
Pm
• Robust counterpart has 2n + m + l variables, where l = i=0 |Ji | is the

number of uncertain coefficients, and 2n + m + l constraints.

6.5 Probabilistic Guarantee


6.5.1 Theorem 2
Slide 16
Let x∗ be an optimal solution of robust MIP.
(a) If A is subject to the model of data uncertainty U:
!  
n   n  

1

X X n X n
Pr ãij x∗j > bi ≤ (1 − µ) +µ ,

2n  l l 
j l=⌊ν⌋ l=⌊ν⌋+1

n = |Ji |, ν = Γi2+n and µ = ν − ⌊ν⌋; bound is tight.


(b) As n → ∞
 
n   n    
1 Γi − 1
 X n X n
(1 − µ) +µ ∼1−Φ √ .
2n  l l  n
l=⌊ν⌋ l=⌊ν⌋+1
Slide 17
Slide 18

7 Experimental Results
7.1 Knapsack Problems
• Slide 19
X
maximize ci xi
i∈N
X
subject to wi xi ≤ b
i∈N
x ∈ {0, 1}n.

4
0
10
Approx bound
Bound 2

−1
10

−2
10

−3
10

−4
10
0 1 2 3 4 5 6 7 8 9 10
Γi

Γ Violation Probability Optimal Value Reduction


0 0.5 5592 0%
2.8 4.49 × 10−1 5585 0.13%
36.8 5.71 × 10−3 5506 1.54%
82.0 5.04 × 10−9 5408 3.29%
200 0 5283 5.50%

• w̃i are independently distributed and follow symmetric distributions in

[wi − δi , wi + δi ];

• c is not subject to data uncertainty.

7.1.1 Data
Slide 20
• |N | = 200, b = 4000,
• wi randomly chosen from {20, 21, . . . , 29}.
• ci randomly chosen from {16, 17, . . . , 77}.
• δi = 0.1wi .

5
7.1.2 Results
Slide 21

8 Robust 0-1 Optimization


Slide 22

• Nominal combinatorial optimization:

minimize c′ x
subject to x ∈ X ⊂ {0, 1}n .

• Robust Counterpart:
X
Z∗ = minimize c′ x + max dj x j
{S| S⊆J,|S|=Γ}
j∈S

subject to x ∈ X,

• WLOG d1 ≥ d2 ≥ . . . ≥ dn .

8.1 Remarks
Slide 23

• Examples: the shortest path, the minimum spanning tree, the minimum
assignment, the traveling salesman, the vehicle routing and matroid inter­
section problems.

• Other approaches to robustness are hard. Scenario based uncertainty:

minimize max(c′1 x, c′2 x)


subject to x ∈ X.

is NP-hard for the shortest path problem.

8.2 Approach
X Slide 24

Primal :Z ∗ = min c′ x + max dj xj uj


x∈X
j
s.t. 0 ≤ uj ≤ 1, ∀j
X
uj ≤ Γ
j
X
Dual :Z ∗ = min c′ x + min θΓ + yj
x∈X
j
s.t. yj + θ ≥ dj xj , ∀j
yj , θ ≥ 0

6
8.3 Algorithm A
Slide 25
• Solution: yj = max(dj xj − θ, 0)
• X
Z∗ = min θΓ + (cj xj + max(dj xj − θ, 0))
x∈X,θ≥0
j

• Since X ⊂ {0, 1}n,


max(dj xj − θ, 0) = max(dj − θ, 0) xj

• X
Z∗ = min θΓ + (cj + max(dj − θ, 0)) xj
x∈X,θ≥0
j
Slide 26
• d1 ≥ d2 ≥ . . . ≥ dn ≥ dn+1 = 0.
• For dl ≥ θ ≥ dl+1 ,
n l
X X
min θΓ + cj xj + (dj − θ)xj =
x∈X,dl ≥θ≥dl+1
j=1 j=1

n l
X X
dl Γ + min cj xj + (dj − dl )xj = Zl
x∈X
j=1 j=1

n l
X X
Z∗ = min dl Γ + min cj xj + (dj − dl )xj .
l=1,...,n+1 x∈X
j=1 j=1

8.4 Theorem 3
Slide 27
• Algorithm A correctly solves the robust 0-1 optimization problem.
• It requires at most |J| + 1 solutions of nominal problems. Thus, If the

nominal problem is polynomially time solvable, then the robust 0-1 coun­

terpart is also polynomially solvable.

• Robust minimum spanning tree, minimum assignment, minimum match­

ing, shortest path and matroid intersection, are polynomially solvable.

9 Experimental Results
9.1 Robust Sorting
X Slide 28
minimize ci xi
i∈N
X
subject to xi = k
i∈N
x ∈ {0, 1}n .

7
Γ ¯
Z(Γ) ¯
% change in Z(Γ)
σ(Γ) % change in σ(Γ)
0 8822 0 %
501.0 0.0 %
10 8827 0.056 %
493.1 -1.6 %
20 8923 1.145 %
471.9 -5.8 %
30 9059 2.686 %
454.3 -9.3 %
40 9627 9.125 %
396.3 -20.9 %
50 10049 13.91 %
371.6 -25.8 %
60 10146 15.00 %
365.7 -27.0 %
70 10355 17.38 %
352.9 -29.6 %
80 10619 20.37 %
342.5 -31.6 %
100 10619 20.37 %
340.1 -32.1 %

X
Z ∗ (Γ) = minimize c′ x + max dj x j
{S| S⊆J,|S|=Γ}
j∈S
X
subject to xi = k
i∈N
x ∈ {0, 1}n .

9.1.1 Data
Slide 29
• |N | = 200;
• k = 100;
• cj ∼ U [50, 200]; dj ∼ U [20, 200];
• For testing robustness, generate instances such that each cost component

independently deviates with probability ρ = 0.2 from the nominal value

cj to cj + dj .

9.1.2 Results
Slide 30
10 Robust Network Flows
Slide 31
• Nominal
X
min cij xij

(i,j)∈A

X X
s.t. xij − xji = bi ∀i ∈ N
{j:(i,j)∈A} {j:(j,i)∈A}

0 ≤ xij ≤ uij ∀(i, j) ∈ A.

• X set of feasible solutions flows.


• Robust X
Z ∗ = min c′ x + max dij xij
{S| S⊆A,|S|≤Γ}
(i,j)∈S
subject to x ∈ X.

8
(cost, capacity)
(cij, uij)
i j

) j’ ( 0,
,� �)
(d ij

i i’ j
(cij, uij) (0, θ/ dij)

10.1 Reformulation
Slide 32

Z ∗ = min Z(θ),
θ≥0
X

Z(θ) = Γθ + min c x+ pij
(i,j)∈A
subject to pij ≥ dij xij − θ ∀(i, j) ∈ A
pij ≥ 0 ∀(i, j) ∈ A
x ∈ X.
• Equivalently
 

X θ
Z(θ) = Γθ + min c x+ dij max xij − ,0
dij
(i,j)∈A

subject to x ∈ X.

10.2 Network Reformulation


Slide 33
Theorem: For fixed θ we can solve the robust problem as a network flow problem

10.3 Complexity
Slide 34
• Z(θ) is a convex function and for all θ1 , θ2 ≥ 0, we have

|Z(θ1 ) − Z(θ2 )| ≤ |A||θ1 − θ2 |.

ˆ ∈
X with robust

• For any fixed Γ ≤ |A| and every ǫ > 0, we can find a solution x
objective value X
Ẑ = c′ x̂ + max dij x̂ij
{S| S⊆A,|S|≤Γ}
(i,j)∈S

such that

Z ∗ ≤ Ẑ ≤ (1 + ǫ)Z ∗

by solving 2⌈log 2 (|A|θ/ǫ)⌉ + 3 network flow problems, where θ = max{uij dij :


(i, j) ∈ A}.

9
3000
Γ=0
Γ=3
Γ=6
Γ = 10
2500

2000

1500

1000

500

3 4 5 6 7 8 9
Distributions of path cost

11 Experimental Results
Slide 35
12 Conclusions
Slide 36
• Robust counterpart of a MIP remains a MIP, of comparable size.
• Approach permits flexibility of adjusting the level of conservatism in terms of
probabilistic bound of constraint violation

• For polynomial solvable 0-1 optimization problems with cost uncertainty, the

robust counterpart is polynomial solvable.


Slide 37
• Robust network flows are solvable as a series of nominal network flow problems.
• Robust optimization is tractable for stochastic optimization problems without
the curse of dimensionality

10
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 14: Large Scale Optimization, I


1 Outline
Slide 1
1. The idea of column generation
2. The cutting stock problem
3. Stochastic programming

2 Column Generation
Slide 2
• For x ∈ ℜn and n large consider the LOP:

min c′ x
s.t. Ax = b
x≥0

• Restricted problem �
min ci xi
i∈I

s.t. Ai xi = b
i∈I
x≥0

2.1 Two Key Ideas


Slide 3
• Generate columns Aj only as needed.
• Calculate mini ci efficiently without enumerating all columns.

3 The Cutting Stock Problem


Slide 4
• Company has a supply of large rolls of paper of width W .
• bi rolls of width wi , i = 1, . . . , m need to be produced.
• Example: w = 70 inches, can be cut in 3 rolls of width w1 = 17 and 1 roll

of width w2 = 15, waste:

70 − (3 × 17 + 1 × 15) = 4
Slide 5
• Given w1 , . . . , wm and W there are many cutting patterns: (3, 1) and (2, 2)

for example

3 × 17 + 1 × 15 ≤ 70
2 × 17 + 2 × 15 ≤ 70

1
• Pattern: (a1 , . . . , am ) integers:


ai wi ≤ W
i=1

3.1 Problem
Slide 6
• Given wi , bi , i = 1, . . . , m (bi : number of rolls of width wi demanded,

and W (width of large rolls):

• Find how to cut the large rolls in order to minimize the number of rolls

used.

3.2 Concrete Example


Slide 7
• What is the solution for W = 70, w1 = 21, w2 = 9, b1 = 20, b2 = 21?
• feasible patterns: (2, 3), (3, 0), (0, 7), (2, 0)
• Solution 1: (2, 3) : 7 rolls; (3, 0) : 2 rolls: 9 rolls total

• Solution 2: (0, 7) : 3, (3, 0) : 6, (2, 0) : 1 : 10 rolls total


Slide 8
• W = 70, w1 = 20, w2 = 11, b1 = 12, b2 = 17
• Feasible patterns: 10 , 20 , 30 , 01 , 11 , 21 , 02 , 12 , 22 , 03 , 13 ,

�� �� �� �� � � �� �� �� �� �� ��
�0� �1� �0� �0

4 , 4 , 5 , 6

• x1 , . . . , x15 = # of feasible patterns of the type 10 , . . . , 06 respectively


�� ��

min x1 �+ ·�· · + x15

� � � � � �
1 2 0 12
s.t. x1 + x2 + · · · + x15 =
0 0 6 17
x1 , . . . , x15 ≥ 0
Slide 9
� � � � � � � �
0 0 3 12
• Example: 2 +1 +4 = 7 rolls used
6 5 0 17
� � � � � � � �
0 0 3 12
4 + +4 = 9 rolls used
4 1 0 17
• Any ideas?

2
3.3 Formulation
Slide 10
Decision variables: xj = number of rolls cut by pattern j characterized by vector
Aj :
�n
min xj
j=1 
b1
n
Aj · xj =  ... 
�  
j=1
bm
xj ≥ 0 ( integer)
Slide 11
• Huge number of variables.
• Can we apply column generation, that is generate the patterns Aj on the

fly?

3.4 Algorithm
Slide 12
Idea: Generate feasible patterns as needed.
 W       
⌊ w1 ⌋ 0 0 0
 0   ⌊W ⌋   0
   0
1) Start with initial patterns:    w2
 0 , 0
, W , 
  ⌊ ⌋  
w3
 0
W
0 0 0 ⌊w4

Slide 13
2) Solve:
min x1 + · · · + xm

x1 A1 + · · · + xm Am = b

xi ≥ 0

Slide 14
3) Compute reduced costs

cj = 1 − p′ Aj for all patterns j


If cj ≥ 0 current set of patterns optimal
If cs < 0 ⇒ xs needs to enter basis
How are we going to compute reduced costs cj = 1 − p′ Aj for all j? (huge
number)

3
3.4.1 Key Idea
Slide 15
4) Solve
m

z ∗ = max p i ai
i=1
�m
s.t. wi ai ≤ W
i=1
ai ≥ 0, integer
This is the integer knapsack problem
Slide 16
• If z ∗ ≤ 1 ⇒ 1 − p′ Aj > 0 ∀j ⇒ current solution optimal
• If z ∗ > 1 ⇒ ∃ s: 1 − p′ As < 0 ⇒ Variable xs becomes basic, i.e., a new

pattern As will enter the basis.

• Perform min-ratio test and update the basis.

3.5 Dynamic Programming


Slide 17
F (u) = max p1 a1 + · · · + pm am
s.t. w1 a1 + · · · + wm am ≤ u
ai ≥ 0, integer

• For u ≤ wmin , F (u) = 0.


• For u ≥ wmin

F (u) = max {pi + F (u − wi )}

i=1,...,m

Why ?

3.6 Example
Slide 18
max 11x1 + 7x2 + 5x3 + x4
s.t. 6x1 + 4x2 + 3x3 + x4 ≤ 25
xi ≥ 0, xi integer

F (0) = 0

F (1) = 1

F (2) = 1 + F (1) = 2

Slide 19
F (3) = max(5 + F (0)∗ , 1 + F (2)) = 5

F (4) = max(7 + F (0)∗ , 5 + F (1), 1 + F (3)) = 7

F (5) = max(7 + F (1)∗ , 5 + F (2), 1 + F (4)) = 8

F (6) = max(11 + F (0)∗ , 7 + F (2), 5 + F (3), 1 + F (5)) = 11

F (7) = max(11 + F (1)∗ , 7 + F (2), 5 + F (3), 1 + F (4)) = 12

F (8) = max(11 + F (2), 7 + F (4)∗ , 5 + F (5), 1 + F (7)) = 14

F (9) = 11 + F (3) = 16

F (10) = 11 + F (4) = 18

F (u) = 11 + F (u − 6) = 16 u ≥ 11

4
⇒ F (25) = 11 + F (19) = 11 + 11 + F (13) = 11 + 11 + 11 + F (7) = 33 + 12 = 45
x∗ = (4, 0, 0, 1)

4 Stochastic Programming
4.1 Example
Slide 20
Wrenches
Pliers Cap.
Steel (lbs)
1.5
1.0 27,000
Molding machine (hrs)
1.0
1.0 21,000
Assembly machine (hrs)
0.3
0.5 9,000* Slide 21
Demand limit (tools/day)
15,000
16,000
Contribution to earnings
$130*
$100
($/1000 units)
max 130W + 100P
s.t. W ≤ 15

P ≤ 16

1.5W + P ≤ 27

W + P ≤ 21

0.3W + 0.5P ≤ 9

W, P ≥ 0

4.1.1 Random data


Slide 22
1

 8000
 with probability
• Assembly capacity is random: 2
 10, 000 with probability
 1
2
1

 160 with probability

• Contribution from wrenches: 2
 90 1
with probability

2

4.1.2 Decisions
Slide 23
• Need to decide steel capacity in the current quarter. Cost 58$/1000lbs.
• Soon after, uncertainty will be resolved.
• Next quarter, company will decide production quantities.

4.1.3 Formulation
Slide 24

5
State
Cap. W. contr.
Prob.
1
8,000 160
0.25
2
10,000 160
0.25
3
8,000 90
0.25
4
10,000 90
0.25
Decision Variables: S: steel capacity,

Pi , Wi : i = 1, . . . , 4 production plan under state i. Slide 25

max −58S + 0.25Z1 + 0.25Z2 + 0.25Z3 + 0.25Z4


s.t.

Ass. 1 0.3W1 + 0.5P1 ≤ 8

Mol. 1 W1 + P1 ≤ 21

Ste. 1 −S + 1.5W1 + P1 ≤ 0
W.d. 1 W1 ≤ 15

P.d. 1 P1 ≤ 16

Obj. 1 −Z1 + 160W1 + 100P1 = 0

Slide 26

Ass. 2 0.3W2 + 0.5P2 ≤ 8

Mol. 2 W2 + P2 ≤ 21

Ste. 2 −S + 1.5W2 + P2 ≤ 0
W.d. 2 W2 ≤ 15

P.d. 2 P2 ≤ 16

Obj. 2 −Z2 + 160W2 + 100P2 = 0

Slide 27

Ass. 3 0.3W3 + 0.5P3 ≤ 8

Mol. 3 W3 + P3 ≤ 21

Ste. 3 −S + 1.5W3 + P3 ≤ 0
W.d. 3 W3 ≤ 15

P.d. 3 P3 ≤ 16

Obj. 3 −Z3 + 160W3 + 100P3 = 0

Slide 28

Ass. 4 0.3W4 + 0.5P4 ≤ 8

Mol. 4 W4 + P4 ≤ 21

Ste. 4 −S + 1.5W4 + P4 ≤ 0
W.d. 4 W4 ≤ 15

P.d. 4 P4 ≤ 16

Obj. 4 −Z4 + 160W4 + 100P4 = 0

S, Wi , Pi ≥ 0

4.1.4 Solution
Slide 29

Solution: S = 27, 250lb.


Wi Pi
1
15,000 4,750
2
15,000 4,750
3
12,500 8,500
4
5,000 16,000

4.2 Two-stage problems


Slide 30
• Random scenarios indexed by w = 1, . . . , k. Scenario w has probability

αw .

• First stage decisions: x: Ax = b, x ≥ 0.


• Second stage decisions: yw : w = 1, . . . , k.
• Constraints:

Bw x + Dw yw = dw , yw ≥ 0.

4.2.1 Formulation
Slide 31
min c′ x + α1 f1′ y1 + ··· + αk fk′ yk
Ax =b
B1 x + D 1 y1 = d1
B2 x + D 2 y2 = d2 Slide 32
. . .. ..
.
. .
Bk x + D k yk = dk

x, y1 , y2 , . . . , yk ≥ 0.

Structure: x y1 y2 y3 y4
Objective

7
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 15: Large Scale Optimization, II


1 Outline
Slide 1
1. Dantzig-Wolfe decomposition
2. Key Idea
3. Bounds

2 Decomposition
Slide 2
min c′1 x1 + c′2 x2
s.t. D 1 x1 + D 2 x2 = b 0
F 1 x 1 = b1
F 2 x 2 = b2
x1 , x 2 ≥ 0
• Relation with stochastic programming?
• Firm’s problem

2.1 Reformulation
� � Slide 3
• Pi = xi ≥ 0 | F i xi = bi , i = 1, 2
• xji , j ∈ Ji extreme points of Pi
• w ki , k ∈ Ki , extreme rays of Pi .
• For all xi ∈ Pi � �
xi = λji xji + θik wki ,
j∈Ji k∈Ki

λji ≥ 0 and θik ≥0 �


λji = 1, i = 1, 2
j∈Ji
Slide 4
� � � �
min λj1 c′1 xj1 + θ1k c′1 w k1 + λj2 c′2 xj2 + θ2k c′2 w k2
j∈J1 k∈K1 j∈J2 k∈K2
� � �
s.t. λj1 D 1 xj1 + θ1k D 1 w k1 + λj2 D 2 xj2
j∈J1 k∈K1 j∈J2

+ θ2k D 2 w 2k = b0
k∈K2

λj1 =1
j∈J1

λj2 = 1
j∈J2

λji ≥ 0, θik ≥ 0, ∀ i, j, k.

1
Huge # variables, m0 + 2 constraints Slide 5
• A bfs is available with a basis matrix B
• p′ = c′B B −1 ; p = (q, r1 , r2 )
• Is B optimal?
• Check reduced costs

(c′1 − q ′ D 1 )xj1 − r1

(c′1 − q ′ D 1 )wk1
• Huge number of them

3 Key idea
Slide 6
Consider subproblem:

min (c′1 − q ′ D 1 )x1


s.t. x1 ∈ P1 ,
• If optimal cost of subproblem is −∞, an extreme ray w k1 is generated:

(c′1 − q ′ D 1 )w k1 < 0, i.e., reduced cost of θ1k is negative; Generate column

[D 1 wk1 , 0, 0]′

• If optimal cost is finite and smaller than r1 , then, an extreme point xj


1
is generated: (c′1 − q ′ D1 )xj1 < r1 , i.e., reduced cost of λj1 is negative;

Generate column [D1 xj1 ,1 , 0]′

• Otherwise, reduced costs are nonnegative


• Repear for subproblem:
min (c′2 − q ′ D2 )x2
s.t. x2 ∈ P2 ,

4 Remarks
Slide 7
• Economic interpretation
• Applicability of the method

min c′1 x1 + c′2 x2 + · · · + c′t xt


s.t. D1 x1 + D2 x2 + · · · + D t xt = b0
F i xi = bi , i = 1, 2, . . . , t
x1 , x2 , . . . , xt ≥ 0.

2
min c′ x
s.t. Dx = b0
F x = b
x ≥ 0,

4.1 Termination
Slide 8
• Finite termination
• Algorithm makes substantial progress in the beginning, but very slow later

on

• no faster than the revised simplex method applied to the original problem
• Storage with t subproblems
• Original: O (m0 + tm1 )2
� �

• Decomposition algorithm O (m0 + t)2 for the tableau of the master prob­

� �

lem, and t times O(m21 ) for subproblems.

• If t = 10 and if m0 = m1 is much larger than t, memory requirements for

decomposition algorithm are about 100 times smaller than revised simplex

method.

5 Example
Slide 9

min −4x1 − x2 − 6x3
s.t. 3x1 + 2x2 + 4x3 = 17
1 ≤ x1 ≤2
1 ≤ x2 ≤2
1 ≤ x3 ≤ 2.

• P = {x ∈ ℜ3 | 1 ≤ xi ≤ 2, i = 1, 2, 3}; eight extreme points;


• Master problem:
8

λj Dxj = 17,
j=1
8

λj = 1,
j=1

Slide 10
• x1 = (2, 2, 2) and x2 = (1, 1, 2); Dx1 = 18, Dx2 = 13

3
� � � �
18 13 0.2 −2.6
• B= ; B −1 =
1 1 −0.2 3.6
•  
2
= c′ x1 = − 4 − 1 − 6  2  = −22,
� �
cB(1)
2
 
1
= c′ x2 = − 4 − 1 − 6  1  = −17.
� �
cB(2)
2
� � � � � �
• p′ = q ′ r = c′B B −1 = − 22 − 17 B −1 = − 1 − 4 .
• � � �
c′ − q ′ D = − 4 − 1 − 6] − (−1) 3 2 4 = [−1 1 − 2],
optimal solution is x3 = (2, 1, 2) with optimal cost −5 ≤ r = −4
• Generate the column corresponding to λ3 .
Slide 11
x3
x2 = ( 1 ,1,2 ) ( 1 ,2 , 2 )

. .
A
B
x3 = ( 2 , 1,2 ) x1 = ( 2 , 2 , 2 )
x2
(1,1,1) ( 1 ,2 , 1)

( 2 , 1,1) ( 2 , 2 , 1)

x1

6 Starting the algorithm


Slide 12
m0

min yt

t=1

 
� � �
s.t.  λji Di xji + θik D i wki  + y = b0
i=1,2 j∈Ji k∈Ki

λj1 = 1
j∈J1

4

λj2 = 1
j∈J2

λji ≥ 0, θik ≥ 0, yt ≥ 0, ∀ i, j, k, t.

7 Bounds
Slide 13
• Optimal cost z ∗
• z cost of feasible solution obtained at some intermediate stage ofe decom­

position algorithm.

• ri be the value of the dual variable associated with the convexity constraint

for the ith subproblem

• zi optimal cost in the ith subproblem


• Then, �
z+ (zi − ri ) ≤ z ∗ ≤ z.
i

7.1 Proof
Slide 14
Dual of master problem

max q ′ b0 + r1 + r2
s.t. q ′ D 1 xj1 + r1 ≤ c′1 xj1 , ∀ j ∈ J1 ,
q ′
D 1 wk1 ≤ c′1 w k1 , ∀ k ∈ K1 ,
q ′ D 2 xj2 + r2 ≤ c′2 xj2 , ∀ j ∈ J2 ,
q ′
D 2 wk2 ≤ c′2 w k2 , ∀ k ∈ K2 .
Slide 15

• (q, r1 , r2 ) dual variables

q ′ b0 + r1 + r2 = z

• z1 is the optimal cost in the first subproblem:

min (c′1 xj1 − q ′ D 1 xj1 ) = z1 ,


j∈J1

min (c′1 wk1 − q ′ D1 wk1 ) ≥ 0.


k∈K1

• (q, z1 , z2 ) is a feasible solution to the dual of master problem

5
• By weak duality,

z ∗ ≥ q ′ b0 + z1 + z2
= q ′ b0 + r1 + r2 + (z1 − r1 ) + (z2 − r2 )
= z + (z1 − r1 ) + (z2 − r2 ),

7.2 Example
Slide 16
• (λ1 , λ2 ) = (0.8, 0.2)

• cB = (−22, −17), z = (−22, −17)′(0.8, 0.2) = −21


• r = −4; z1 = (−1, 1, −2)′(2, 1, 2) = −5.
• −21 ≥ z ∗ ≥ −21 + (−5) − (−4) = −22
• z ∗ = −21.5

6
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 16: Network Flows, I


1 Networks
Slide 1
• Electrical & Power Networks
• Road Networks
• Airline Routes
• Internet Backbone
• Printed Circuit Board
• Social Networks

2 Common Thrust
Slide 2
Move some entity (electricity, a consumer product, a person, a vehicle, a mes­
sage, . . . ) from one point to another in the underlying network, as efficiently as
possible.

1. Learn how to model application settings as network flow problems.


2. Study ways to solve the resulting models.

3 Shortest Path
3.1 Description
Slide 3
• Identify a shortest path from a given source node to a given sink node.
• Finding a path of minimum length.
• Finding a path taking minimum time.
• Finding a path of maximum reliability.

4 Maximum Flow
4.1 Description
Slide 4
• Determine the maximum flow that can be sent from a given source node

to a sink node in a capacitated network.

• Determining maximum steady-state flow of


– petroleum products in a pipeline network
– cars in a road network
– messages in a telecommunication network
– electricity in an electrical network

1
5 Min-Cost Flow
5.1 Description
Slide 5
• Determine a least cost shipment of a commodity through a network in

order to satisfy demands at certain nodes from available supplies at other

nodes. Arcs have capacities and cost associated with them

• Distribution of products
• Flow of items in a production line
• Routing of cars through street networks
• Routing of telephone calls

5.2 In LOP Form


Slide 6
• Network G = (N, A).
• Arc costs c : A → R.
• Arc capacities u : A → N .
• Node balances b : N → R.


min cij xij

(i,j)∈A

� �
s.t. xij − xji = bi for all i ∈ N
j:(i,j)∈A j:(j,i)∈A
xij ≤ uij for all (i, j) ∈ A
xij ≥ 0 for all (i, j) ∈ A

6 Outline
Slide 7
• Shortest path applications
• Maximum Flow applications
• Minimum cost flow applications

7 Shortest Path
7.1 Interword Spacing in LATEX
Slide 8
The spacing between words and characters is normally set

automatically by LATEX. Interword spacing within one line

is uniform. LATEX also attempts to keep the word spacing

for different lines as nearly the same as possible.

2
The spacing between words and characters is normally set auto­

matically by LATEX. Interword spacing within one line is uniform.

LATEX also attempts to keep the word spacing for different lines

as nearly the same as possible.

7.2 Interword Spacing in LATEX (2)


Slide 9
• The paragraph consists of n words, indexed by 1, 2, . . . , n.
• cij is the attractiveness of a line if it begins with i and ends with j − 1.
• (LATEX uses a formula to compute the value of each cij .)

For instance,

c12 = −10, 000 c13 = −1, 000


c14 = 100 c1,37 = −100, 000
...

7.3 Interword Spacing in LATEX (3)


Slide 10
• The problem of decomposing a paragraph into several lines of text to max­

imize total attractiveness can be formulated as a shortest path problem.

• Nodes? Arcs? Costs?

7.4 Project Management


Slide 11
• A project consists of a set of jobs and a set of precedence relations
• Given a set A of job pairs (i, j) indicating that job i cannot start before

job j is completed.

• ci duration of job i
• Find the least possible duration of the project

7.4.1 Formulation
Slide 12
• Introduce two artificial jobs s and t, of zero duration, that signify the

beginning and the completion of the project

• Add (s, i) and (i, t) to A


• pi time that job i begins
• (i, j) ∈ A: pj ≥ pi + ci
• Project duration: pt − ps

3
Slide 13


min pt − ps
s.t pj − pi ≥ ci , ∀ (i, j) ∈ A.

• Dual �
max ci fij

(i,j)∈A

� �
s.t. fji − fij = bi
{j|(j,i)∈A} {j|(i,j)∈A}

fij ≥ 0
Slide 14
• bs = −1, bt = 1, and bi = 0 for i �= s, t.
• Shortest path problem, where each precedence relation (i, j) ∈ A corre­
sponds to an arc with cost of −ci .
Slide 15
Activity
Immediate Predecessor
Time(ci )

s
0

A
S
14

B
S
3

C
A,B
5

D
A
7

E
C,D
10

t
E
0

Slide 16

0 14
S A D T
7
0 14
10
3 5
B C E

7.5 DNA Sequencing


Slide 17
• Given two sequences of letters, say

B = b1 · · · bp and D = d1 · · · dq

• How similar are the two sequences?

• What is the min cost of transforming B to D?

4
7.5.1 Transformation costs
Slide 18
• α = cost of inserting a letter in B
• β = cost of deleting a letter from B
• g(bi , dj ) = cost of mutating a letter bi into dj

7.5.2 Transformation steps


Slide 19
1. Add or delete letters from B so as to make |B ′ | = |D|.
2. Align B ′ and D
3. Mutate letters of B ′ so that B ′′ = D.

7.5.3 Algorithm
Slide 20
• f (b1 · · · bp , d1 · · · dq ): the min cost of transforming B into D by the three

steps above. We obtain this cost by a recursive way.

f (∅ · · · ∅, d1 · · · dj ) = jα, j = 1, ..., q

f (b1 · · · bi , ∅ · · · ∅) = iβ, i = 1, ..., p.

Slide 21
Substitution
B′ = b1 ··· bi
D= d1 ··· dj
f (b1 · · · bi , d1 · · · dj )
= f (b1 · · · bi−1 , d1 · · · dj−1 ) + g(bi , dj )
Slide 22
• Addition of dj

B′ = b1 ··· bi ··· ∅

D= d1 ··· ··· dj
f (b1 · · · bi , d1 · · · dj ) = f (b1 · · · bi , d1 · · · dj−1 ) + α.
• Deletion of bi :

f (b1 · · · bi , d1 · · · dj ) = f (b1 · · · bi−1 , d1 · · · dj ) + β

Slide 23
Recursion
f (b1 · · · bi , d1 · · · dj )
= min{f (b1 · · · bi−1 , d1 · · · dj−1 ) + g(bi , dj ),
f (b1 · · · bi , d1 · · · dj−1 ) + α,
f (b1 · · · bi−1 , d1 · · · dj ) + β}
Slide 24
The shortest path from 00 to 32

5
8 Maximum Flow
8.1 The tournament problem
Slide 25
• Each of n teams plays against every other team a total of k games.
• Each game ends in a win or a loss (no draws)

• xi : the number of wins of team i.


• X set of all possible outcome vectors (x1 , ..., xn )
• Given x = (x1 , ..., xn ) decide whether x ∈ X

8.1.1 Formulation
Slide 26
• Supply nodes T1 , ..., Tn represent teams with supply x1 , ..., xn
• Since total number of wins total number of games, we must have

xi = n(n − 1)k/2

• Demand nodes

G12 , ..., G1n , G23 , ..., G2n , ..., Gij , ..., Gn−1,n

denote games between Ti and Tj with demand k.

• Arcs: (Ti , Gij ), (Tj , Gij ). The flow from Ti to Gij represents the total

number of games between i and j won by i

• Transportation model feasible if and only if x ∈ X

6
8.2 Preemptive Scheduling
Slide 27
• m identical machines to process n jobs
• Job j must be processed for pj periods, j = 1, ..., n
• It can not start before period rj and must be competed before period dj
• We allow preemption, i.e., we can disrupt the processing of one job with
another
Slide 28
• Problem Find a schedule (which job is processed by which machine at
which period) such that all jobs are processed after their release times and
completed before their deadlines
• Cj : completion time of job j: We need to have

rj + pj ≤ Cj ≤ dj for all j

8.3 Formulation
Slide 29
• Rank all release times and deadlines in ascending order. The ordered
list of numbers divides the time horizon into a number of nonoverlapping
intervals.
• Tkl be the interval that starts in the period k and ends in period l. During
Tkl , we can process any job j that has been released (rj ≤ k) and its
deadline has not yet been reached (l ≤ dj ).

8.3.1 Example
Slide 30
• 4 jobs with release times 3, 1, 3, 5, and deadlines 5, 4, 7, 9.
• The ascending list of release times and deadlines is 1, 3, 4, 5, 7, 9.
• Five intervals: T13 , T34 , T45 , T57 , T79 .

8.3.2 Network
Slide 31
• Nodes: source s, sink t, a node corresponding to each job j, and a node
corresponding to each interval Tkl .
• Arcs: (s, j), with capacity pj . Flow represents the number of periods of
processing that job j receives.
• Arcs: (Tkl , t), with capacity m(l − k). Flow represents the total number
of machine-periods of processing during Tkl .
Slide 32
• Arcs: (j, Tkl ) if rj ≤ k ≤ l ≤ dj with capacity l − k. Flow represents the
number of periods that job j is processed during Tkl .

7
j
pj
l-k
s t

Tkl m( l - k)

9 Min-Cost Flow
9.1 Passenger Routing
Slide 33
• United Airlines has seven daily flights from BOS to SFO, every two hours,

starting at 7am.

• Capacities are 100, 100, 100, 150, 150, 150, and ∞.


• Passengers suffering from overbooking are diverted to later flights.
• Delayed passengers get $200 plus $20 for every hour of delay.
• Suppose that today the first six flighs have 110, 160, 103, 149, 175, and 140

confirmed reservations.

Determine the most economical passenger routing strategy!

8
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
-15
2
2 4
2
5 3 1

4 4
10 1 5 7 10

3 6
6
3 6
5
-5

1
1

2 7

3 6

4 5

1 1

2 7 2 7

0
3 6 3 6

0
4 5 4 5
0

What is the flow in 1 1 What is the flow in 1 1


arc (4,3)? arc (5,3)?
-6 -6
2 7 3 2 7 3

1 3 6 0 1 3 6 0
-4 -4
2
0 0
4 5 4 5
2 0 3 2 0 3

2
What is the flow in 1 1 What is the flow in 1 1
arc (3,2)? arc (2,6)?
-6 -6
2 7 3 2 7 3
6
1 3 6 0 1 3 6 0
-4 -4
2 3 2 3
0 0
4 5 4 5
2 0 3 2 0 3

What is the flow in 1 1 What is the flow in 1 1


arc (7,1)? arc (1,2)? 3
-6 -6
2 7 3 2 7 3
6 4 6 4
1 3 6 0 1 3 6 0
-4 -4
2 3 2 3
0 0
4 5 4 5
2 0 3 2 0 3

Note: there are 1 1


two different ways 4 3
of calculating the -6
flow on (1,2), and 2 7 3
6 4
both ways give a
flow of 4. Is this a 1 3 6 0
coincidence? -4
2 3
0
4 5
2 0 3

3
3 1 3 1
4 4 4
2 2
1
7 7
5 2 5
3 1 2 3 2

3 3 6 6
4 4
5 4 5 4

2 1 1 1
3 2
2 2
7 7
6 7
3 1 3 0

6 6
5 4
5 3 5 2

1 1
2
2
7
7
3
6
4
5 2

4
0
1 Here is a spanning 1 There is a redundant constraint
5 -6 tree with arc costs. 5 -6 in the minimum cost flow
How can one choose problem.
2 7 node potentials so 2 7
3 -4 3 -4 One can set p1 arbitrarily. We
that reduced costs of
will let p1 = 0.
3 6 tree arcs are 0? 3 6
-2 1 -2 1
What is the node potential for 2?
4 5 4 5

0 0
1 1
5 -6 5 -6
-5 -5
-6
2 7 2 7
3 -4 3 -4
3 6 3 6
-2 1 -2 1
4 5 4 5
What is thenode potential for 7? What is the potential for node 3?

0 0
1 1
5 -6 5 -6
-5 -5
-6 -6
2 7 2 7
3 -4 3 -4
-2 -2
3 6 3 6 -1
-2 1 -2 1
4 5 4 5
What is the potential for node 6? What is the potential for node 4?

5
0 0
1 1
5 -6 5 -6
-5 -5
-6 -6
2 7 2 7
3 -4 3 -4
-2 -2
3 6 -1 3 6 -1
These are the node potentials
-2 1 -2 1 associated with this tree. They
4 5 4 5 do not depend on arc flows, nor
What is the potential for node 5? on costs of non-tree arcs.
-4 -4 -1

0
Node potentials 1
Flow on arcs 1
Original costs Reduced costs 4
3
-5
-6
2 7 2 7
7 4 2
6
-2
3 6 -1 3 6
-3 2 3 -3
4 5 4 5
-4 2 5
-1

Flow on arcs 1 1
4 4
3 3
2 7 2 7
4 1
6 3
3 6 3 6
2 3 0 2 0 3
4 5 4 5

6
1

2 7

3 6

4 5

7
1 1

2 7 2 7

3 6 3 6

4 5 4 5

2 7

3 6

4 5

8
9
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 18: The Ellipsoid method


1 Outline
Slide 1
• Efficient algorithms and computational complexity
• The key geometric result behind the ellipsoid method
• The ellipsoid method for the feasibility problem
• The ellipsoid method for optimization

2 Efficient algorithms
Slide 2
• The LO problem
min c′ x
s.t. Ax = b
x≥0

• A LO instance

min 2x + 3y

s.t. x + y ≤ 1
x , y ≥ 0

• A problem is a collection of instances

2.1 Size
Slide 3
• The size of an instance is the number of bits used to describe the instance,

according to a prespecified format

• A number r ≤ U

r = ak 2k + ak−1 2k−1 + · · ·
+ a1 21 + a0

is represented by (a0 , a1 , . . . , ak ) with k ≤ ⌊log 2 U ⌋

• Size of r is ⌊log 2 U ⌋ + 2
• Instance of LO: (c, A, b)
• Size is � �
(mn + m + n) ⌊log2 U ⌋ + 2

2.2 Running Time


Slide 4
Let A be an algorithm which solves the optimization problem Π.

If there exists a constant α > 0 such that A terminates its computation after at most

α f (I) elementary steps for each instance I, then A runs in O(f ) time.

Elementary operations are


• variable assignments • comparison of numbers
• random access to variables • arithmetic operations
• conditional jumps • ... Slide 5

1
A “brute force” algorithm for solving the min-cost flow problem:
Consider all spanning trees and pick the best tree solution among the feasible ones.
Suppose we had a computer to check 1015 trees in a second. It would need more than
109 years to find the best tree for a 25-node min-cost flow problem.
It would need 1059 years for a 50-node instance.
That’s not efficient! Slide 6
Ideally, we would like to call an algorithm “efficient” when it is sufficiently fast to be
usable in practice, but this is a rather vague and slippery notion.

The following notion has gained wide acceptance:


An algorithm is considered efficient if the number of steps it performs for
any input is bounded by a polynomial function of the input size.
Polynomials are, e.g., n, n3 , or 106 n8 .

2.3 The Tyranny of


Exponential Growth
Slide 7
100 n log n 10 n2 n3.5 2n n! nn−2
109 /sec 1.19 · 109 600, 000 3, 868 41 15 13
1010 /sec 1.08 · 1010 1, 897, 370 7, 468 45 16 13
Maximum input sizes solvable within one hour.

2.4 Punch line


Slide 8
The equation

efficient = polynomial

has been accepted as the best available way of tying the empirical

notion of a “practical algorithm” to a precisely formalized mathe­

matical concept.

2.5 Definition
Slide 9
An algorithm runs in polynomial time if its running time is O(|I|k ), where |I|
is the input size, and all numbers in intermediate computations can be stored
with O(|I|k ) bits.

3 The Ellipsoid method


Slide 10
• D is an n × n positive definite symmetric matrix
• A set E of vectors in ℜn of the form
E = E(z, D) = x ∈ ℜn | (x − z)′ D −1 (x − z) ≤ 1
� �

is called an ellipsoid with center z ∈ ℜn

2
3.1 The algorithm intuitively
Slide 11
• Problem: Decide whether a given polyhedron

P = x ∈ ℜn | Ax ≥ b
� �

is nonempty Slide 12

Et+1

P
xt 00
11
Et
11
00 xt+1
a′ x ≥ b

a′ x ≥ a′ xt

• Key property: We can find a new ellipsoid Et+1 that covers the half­
ellipsoid and whose volume is only a fraction of the volume of the previous
ellipsoid Et

3.2 Key Theorem


Slide 13
• E = E(z, D) be an ellipsoid in ℜn ; a nonzero n-vector.
• H = {x ∈ ℜn | a′ x ≥ a′ z}
1 Da
z = z+ √ ,
n + 1 a′ Da
� �
n2 2 Daa′ D
D = 2
D− .
n −1 n + 1 a′ Da

• The matrix D is symmetric and positive definite and thus E ′ = E(z, D) is an


ellipsoid
• E ∩ H ⊂ E

• Vol(E ′ ) < e−1/(2(n+1)) Vol(E)

3
x2

E'
E

x1

3.3 Illustration
Slide 14
3.4 Assumptions
Slide 15
• A polyhedron P is full-dimensional if it has positive volume
• The polyhedron P is bounded: there exists a ball E0 = E(x0 , r2 I), with

volume V , that contains P

• Either P is empty, or P has positive volume, i.e., Vol(P ) > v for some

v > 0

• E0 , v, V , are a priori known


• We can make our calculations in infinite precision; square roots can be

computed exactly in unit time

3.5 Input-Output
Slide 16
Input:
• A matrix A and a vector b that define the polyhedron P = {x ∈ ℜn |
a′i x ≥ bi , i = 1, . . . , m}
• A number v, such that either P is empty or Vol(P ) > v

4
• A ball E0 = E(x0 , r2 I) with volume at most V , such that P ⊂ E0
Output: A feasible point x∗ ∈ P if P is nonempty, or a statement that P is
empty

3.6 The algorithm


Slide 17
1. (Initialization)
Let t∗ = 2(n + 1) log(V /v) ; E0 = E(x0 , r 2 I); D 0 = r 2 I; t = 0.
� �

2. (Main iteration)
• If t = t∗ stop; P is empty.
• If xt ∈ P stop; P is nonempty.
/ P find a violated constraint, that is, find an i such that a′i xt < bi .
• If xt ∈
• Let Ht = {x ∈ ℜn | a′i x ≥ a′i xt }. Find an ellipsoid Et+1 containing Et ∩ Ht :

Et+1 = E(xt+1 , D t+1 ) with

1 D a
xt+1 = xt + � t i ,
n + 1 a′i D t ai
n2
� �
2 D t ai a′i D t
D t+1 = 2 Dt − .
n −1 n + 1 a′i D t ai

• t := t + 1.

3.7 Correctness
Slide 18
Theorem: Let P be a bounded polyhedron that is either empty or full-dimensional
and for which the prior information x0 , r, v, V is available. Then, the ellipsoid
method decides correctly whether P is nonempty or not, i.e., if xt∗ −1 ∈
/ P , then
P is empty

3.8 Proof
Slide 19
• If xt ∈ P for t < t∗ , then the algorithm correctly decides that P is

nonempty

• Suppose x0 , . . . , xt∗ −1 ∈
/ P . We will show that P is empty.
• We prove by induction on k that P ⊂ Ek for k = 0, 1, . . . , t∗ . Note

that P ⊂ E0 , by the assumptions of the algorithm, and this starts the

induction.

Slide 20

• Suppose P ⊂ Ek for some k < t∗ . Since xk ∈ / P , there exists a violated


inequality: a′ i(k) x ≥ bi(k) be a violated inequality, i.e., ai(k)

xk < bi(k) ,

where xk is the center of the ellipsoid Ek

5
• For any x ∈ P , we have
a′i(k) x ≥ bi(k) > a′i(k) xk

• Hence, P ⊂ Hk = x ∈ ℜn | a′i(k) x ≥ a′i(k) xk


� �

• Therefore, P ⊂ Ek ∩ Hk
Slide 21
By key geometric property, Ek ∩ Hk ⊂ Ek+1 ; hence P ⊂ Ek+1 and the induction is
complete

Vol(Et+1 )
< e−1/(2(n+1))
Vol(Et )
Vol(Et∗ ) ∗
< e−t /(2(n+1))
Vol(E0 )
V V
Vol(Et∗ ) < V e−⌈2(n+1) log v ⌉/(2(n+1)) ≤ V e− log v = v
If the ellipsoid method has not terminated after t∗ iterations, then Vol(P ) ≤ Vol(Et∗ ) ≤
v. This implies that P is empty

3.9 Binary Search


� � Slide 22
• P = x ∈ ℜ | x ≥ 0, x ≥ 1, x ≤ 2, x ≤ 3
• E0 = [0, 5], centered at x0 = 2.5
• Since x0 ∈
/ P , the algorithm chooses the violated inequality x ≤ 2 and

constructs E1 that contains the interval E0 ∩ {x | x ≤ 2.5} = [0, 2.5]

• The ellipsoid E1 is the interval [0, 2.5] itself


• Its center x1 = 1.25 belongs to P
• This is binary search

3.10 Boundedness of P
Slide 23
Let A be an m × n integer matrix and let b a vector in ℜn . Let U be the largest
absolute value of the entries in A and b.
Every extreme point of the polyhedron P = {x ∈ ℜn | Ax ≥ b} satisfies
−(nU )n ≤ xj ≤ (nU )n , j = 1, . . . , n
Slide 24
• All extreme points of P are contained in
PB = x ∈ P �
|xj | ≤ (nU )n , j = 1, . . . , n
� � �

2n
� �

• Since
� PB ⊆ 2n E 0,
� n(nU ) I , we can start the ellipsoid method with E0 =
E 0, n(nU ) I
• �n 2
V ol(E0 ) ≤ V = 2n(nU )n = (2n)n (nU )n

6
3.11 Full-dimensionality
Slide 25
Let P = {x ∈ ℜn | Ax ≥ b}. We assume that A and b have integer entries,
which are bounded in absolute value by U . Let
1 � �−(n+1)
ǫ= (n + 1)U .
2(n + 1)
Let
Pǫ = x ∈ ℜn | Ax ≥ b − ǫe ,
� �

where e = (1, 1, . . . , 1).


(a) If P is empty, then Pǫ is empty.
(b) If P� is nonempty, then� Pǫ is full-dimensional. Slide 26
Let P = x ∈ ℜn | Ax ≥ b be a full-dimensional bounded polyhedron, where
the entries of A and b are integer and have absolute value bounded by U . Then,
2
(n+1)
Vol(P ) > v = n−n (nU )−n

3.12 Complexity
Slide 27
• P = {x ∈ ℜn | Ax ≥ b}, where A, b have integer entries with magni­

tude bounded by some U and has full rank. If P is bounded and either

empty
� or full-dimensional,
� the ellipsoid method decides if P is empty in

O n log(V /v) iterations


2 2
(n+1)
• v = n−n (nU )−n V = (2n)n (nU )n
,
• Number of iterations O n4 log(nU )
� �
Slide 28
• If P is arbitrary, we first form PB , then perturb PB to form PB,ǫ and apply the

ellipsoid method to PB,ǫ

• Number of iterations is O n6 log(nU ) .


� �

• It has been shown that only O(n3 log U ) binary digits of precision are needed,

and the numbers computed during the algorithm have polynomially bounded

size

• The linear programming feasibility problem with integer data can be solved in

polynomial time

4 The ellipsoid method for optimization


Slide 29
min c′ x max b′ π
s.t. Ax ≥ b, s.t. A′ π = c
π ≥ 0.
By strong duality, both problems have optimal solutions if and only if the following
system of linear inequalities is feasible:
b′ p = c′ x, Ax ≥ b, A′ p = c, p ≥ 0.
LO with integer data can be solved in polynomial time.

7
4.1 Sliding objective
Slide 30
• �
We first run the ellipsoid method to find a feasible solution x0 ∈ P =
x ∈ ℜn | Ax ≥ b .

• We apply the ellipsoid method to decide whether the set

P ∩ x ∈ ℜn | c′ x < c′ x0
� �

is empty.
• If it is empty, then x0 is optimal. If it is nonempty, we find a new solution
x1 in P with objective function value strictly smaller than c′ x0 .
Slide 31

• More generally, every time a better feasible solution xt is found, we take

P ∩ {x ∈ ℜn | c′ x < c′ xt } as the new set of inequalities and reapply the

ellipsoid method.

- c

.xt+1 c' x < c ' xt+1

. xt c' x < c ' xt

4.2 Performance in practice


Slide 32
• Very slow convergence, close to the worst case
• Contrast with simplex method
• The ellipsoid method is a tool for classifying the complexity of linear

programming problems

8
MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
15.081J/6.251J Introduction to Mathematical
Programming

Lecture 19: Problems with exponentially


many constraints
1 Outline
Slide 1
• Problems with exponentially many constraints
• The separation problem
• Polynomial solvability
• Examples: MST, TSP, Probability
• Conclusions

2 Problems
2.1 Example
� Slide 2
min ci xi
i

ai xi ≥ |S|, for all subsets S of {1, . . . , n}
i∈S

• There are 2n constraints, but are described concisely in terms of the n

scalar parameters a1 , . . . , an

• Question: Suppose we apply the ellipsoid algorithm. Is it polynomial?


• In what?

2.2 The input


Slide 3
• Consider min c′ x s.t. x ∈ P
• P belongs to a family of polyhedra of special structure
• A typical polyhedron is described by specifying the dimension n and an

integer vector h of primary data, of dimension O(nk ), where k ≥ 1 is some

constant.

• In example, h = (a1 , . . . , an ) and k = 1


• U0 be the largest entry of h
Slide 4

• Given n and h, P is described as Ax ≥ b


• A has an arbitrary number of rows
• U largest entry in A and b. We assume

log U ≤ Cnℓ logℓ U0

1
3 The separation problem
Slide 5
Given a polyhedron P ⊂ ℜn and a vector x ∈ ℜn , the separation problem is
to:
• Either decide that x ∈ P , or
• Find a vector d such that d′ x < d′ y for all y ∈ P
What is the separation problem for

ai xi ≥ |S|, for all subsets S of {1, . . . , n}?
i∈S

4 Polynomial solvability
4.1 Theorem
Slide 6
If we can solve the separation problem (for a family of polyhedra) in time
polynomial in n and log U , then we can also solve linear optimization problems
in time polynomial in n and log U . If log U ≤ Cnℓ logℓ U0 , then it is also
polynomial in log U0

• Proof ?
• Converse is also true
• Separation and optimization are polynomially equivalent

4.2 Minimum Spanning


Tree (MST)
Slide 7
• How do telephone companies bill you?
• It used to be that rate/minute: Boston → LA proportional to distance in

MST

• Other applications: Telecommunications, Transportation (good lower bound

for TSP)

Slide 8
• Given a graph G = (V, E) undirected and Costs ce , e ∈ E.
• Find a tree of minimum cost spanning all the nodes.

1, if edge e is included in the tree
• Decision variables xe =
0, otherwise
Slide 9
• The tree should be connected. How can you model this requirement?

2
• Let S be a set of vertices. Then S and V \ S should be connected

i∈S
• Let δ(S) = {e = (i, j) ∈ E :
j ∈V \S
• Then, �
xe ≥ 1
e∈δ(S)

• What is the number of edges in a tree?



• Then, xe = n − 1
e∈E

4.2.1 Formulation
� Slide 10
IZMST = min ce xe

 e∈E


 xe ≥ 1 ∀ S ⊆ V, S �= ∅, V
 e∈δ(S)


H xe = n − 1
 e∈E


xe ∈ {0, 1}.
How can you solve the LP relaxation?

4.3 The Traveling Salesman


Problem
Slide 11
Given G = (V, E) an undirected graph. V = {1, . . . , n}, costs ce ∀ e ∈ E. Find
a tour that minimizes total length.

4.3.1 Formulation
� Slide 12
1, if edge e is included in the tour.
xe =
0, otherwise.

min ce xe

e∈E


s.t. xe ≥ 2, S⊆E
e∈δ(S)

xe = 2, i∈V
e∈δ(i)
xe ∈ {0, 1}

How can you solve the LP relaxation?

3
4.4 Probability Theory
Slide 13
• Events A1 , A2
• P (A1 ) = 0.5, P (A2 ) = 0.7, P (A1 ∩ A2 ) ≤ 0.1
• Are these beliefs consistent?
• General problem: Given n events Ai i ∈ N = {1, . . . , n}, beliefs
P(Ai ) ≤ pi , i ∈ N,
P(Ai ∩ Aj ) ≥ pij , i, j ∈ N, i < j.
• Given the numbers pi and pij , which are between 0 and 1, are these beliefs

consistent?

4.4.1 Formulation
� � � � �� Slide 14
x(S) = P ∩i∈S Ai ∩ ∩i∈S
/ Ai ,

x(S) ≤ pi , i ∈ N,
{S|i∈S}

x(S) ≥ pij , i, j ∈ N, i < j,
{S|i,j∈S}

x(S) = 1,
S
x(S) ≥ 0, ∀ S.
Slide 15
The previous LP is feasible if and only if there does not exist a vector (u, y, z) such
that � �
yij + ui + z ≥ 0, ∀ S,
i,j∈S,i<j i∈S
� �
pij yij + pi ui + z ≤ −1,
i,j∈N,i<j i∈N

yij ≤ 0, ui ≥ 0, i, j ∈ N, i < j.
Slide 16
Separation problem:
� �
z ∗ + min f (S) = ∗
yij + ui∗ ≥ 0?
S
i,j∈S,i<j i∈S

∗ ∗ ∗ ∗ ∗ ∗
Example: y12 = −2, y13 = −4, y14 = −4, y23 = −4, y24 = −1, y34 = −7,
∗ ∗ ∗ ∗ ∗
u1 = 9, u2 = 6, u3 = 4, u4 = 2, and z = 2 Slide 17
Slide 18
• The minimum cut corresponds to S0 = {3, 4} with value c(S0 ) = 21.
� �

• f (S0 ) = yij + u∗i
= −7 + 4 + 2 = −1
i,j∈S0 ,i<j i∈S0

∗ ∗
• f (S) + z ≥ f (S0 ) + z = −1 + 2 = 1 > 0, ∀S
• Given solution (y ∗ , u∗ , z ∗ ) is feasible

4
1, 2

2
1, 3 1
4 9

4 1, 4 2 6
s t
4 4
2,3 3
1
2
2,4 4

7
3,4

5 Conclusions
Slide 19
• Ellipsoid algorithm can characterize the complexity of solving LOPs with

an exponential number of constraints

• For practical purposes use dual simplex


• Ellipsoid method is an important theoretical development, not a practical

one

5

Lecture 20: The Affine Scaling Algorithm


1 Outline
Slide 1
• History
• Geometric intuition
• Algebraic development
• Affine Scaling
• Convergence
• Initialization
• Practical performance

2 History
Slide 2
• In 1984, Karmarkar at AT&T “invented” the interior point method
• In 1985, affine scaling was “invented” at IBM + AT&T, seeking an intuitive version of Karmarkar’s algorithm
• In early computational tests, A.S. far outperformed simplex and Karmarkar’s algorithm
• In 1989, it was realized that Dikin had invented A.S. in 1967

3 Geometric intuition
3.1 Notation
Slide 3
min c′ x
s.t. Ax = b
x ≥ 0
and its dual
max p′ b
s.t. p′ A ≤ c′

• P = {x | Ax = b, x ≥ 0}
• {x ∈ P | x > 0} the interior of P and its elements interior points

[Figure: geometric intuition — iterates x^0, x^1, x^2 in the interior of the feasible polyhedron, moving against the cost vector c.]

3.2 The idea


Slide 4
4 Algebraic development
4.1 Theorem
Slide 5
β ∈ (0, 1), y ∈ ℜ^n with y > 0, and
    S = { x ∈ ℜ^n : Σ_{i=1}^n (x_i − y_i)²/y_i² ≤ β² }.
Then, x > 0 for every x ∈ S


Proof
• Let x ∈ S; then for each i:
• (x_i − y_i)² ≤ β² y_i² < y_i²
• |x_i − y_i| < y_i; in particular y_i − x_i < y_i, and hence x_i > 0
Slide 6
x ∈ S is equivalent to ||Y^{−1}(x − y)|| ≤ β.
Replace the original LP by:
    min c′x
    s.t. Ax = b
         ||Y^{−1}(x − y)|| ≤ β.
With d = x − y:
    min c′d
    s.t. Ad = 0
         ||Y^{−1}d|| ≤ β

4.2 Solution
Slide 7
If the rows of A are linearly independent and c is not a linear combination of the rows of A, then
• the optimal solution is
    d* = −β Y²(c − A′p) / ||Y(c − A′p)||,   p = (AY²A′)^{−1} AY²c.
• x = y + d* ∈ P
• c′x = c′y − β||Y(c − A′p)|| < c′y

4.2.1 Proof
Slide 8
• AY²A′ is invertible; if not, there exists some z ≠ 0 such that z′AY²A′z = 0
• With w = YA′z: w′w = 0 ⇒ w = 0
• Hence A′z = 0, a contradiction
• Since c is not a linear combination of the rows of A, c − A′p ≠ 0 and d* is well defined
• d* is feasible:
    Y^{−1}d* = −β Y(c − A′p) / ||Y(c − A′p)||  ⇒  ||Y^{−1}d*|| = β
    Ad* = 0, since AY²(c − A′p) = 0
• For any feasible d:
    c′d = (c′ − p′A)d = (c′ − p′A)Y Y^{−1}d
        ≥ −||Y(c − A′p)|| · ||Y^{−1}d||
        ≥ −β ||Y(c − A′p)||.
Slide 9

For d* itself:
    c′d* = (c′ − p′A)d*
         = −β (c′ − p′A) Y²(c − A′p) / ||Y(c − A′p)||
         = −β (Y(c − A′p))′(Y(c − A′p)) / ||Y(c − A′p)||
         = −β ||Y(c − A′p)||.
• c′x = c′y + c′d* = c′y − β||Y(c − A′p)||
4.3 Interpretation
Slide 10
• Let y be a nondegenerate BFS with basis B
• A = [B N]
• Y = diag(y_1, . . . , y_m, 0, . . . , 0) and Y_0 = diag(y_1, . . . , y_m); then AY = [BY_0  0]

    p = (AY²A′)^{−1} AY²c
      = (B′)^{−1} Y_0^{−2} B^{−1} · B Y_0² c_B
      = (B′)^{−1} c_B

• The vectors p are dual estimates
• r = c − A′p becomes the vector of reduced costs:
    r = c − A′(B′)^{−1} c_B
• Under degeneracy?

4.4 Termination
Slide 11
Let y and p be primal and dual feasible solutions with
    c′y − b′p < ǫ,
and let y* and p* be optimal primal and dual solutions. Then,
    c′y* ≤ c′y < c′y* + ǫ,
    b′p* − ǫ < b′p ≤ b′p*

4.4.1 Proof
Slide 12
• c′y* ≤ c′y
• By weak duality, b′p ≤ c′y*
• Since c′y − b′p < ǫ,
    c′y < b′p + ǫ ≤ c′y* + ǫ
    b′p* = c′y* ≤ c′y < b′p + ǫ
5 Affine Scaling
5.1 Inputs
Slide 13
• (A, b, c);
• an initial primal feasible solution x0 > 0
• the optimality tolerance ǫ > 0
• the parameter β ∈ (0, 1)

5.2 The Algorithm


Slide 14
1. (Initialization) Start with some feasible x^0 > 0; let k = 0.
2. (Computation of dual estimates and reduced costs) Given some feasible x^k > 0, let
    X_k = diag(x_1^k, . . . , x_n^k),
    p^k = (A X_k² A′)^{−1} A X_k² c,
    r^k = c − A′p^k.
3. (Optimality check) Let e = (1, 1, . . . , 1). If r^k ≥ 0 and e′X_k r^k < ǫ, then stop; the current solution x^k is primal ǫ-optimal and p^k is dual ǫ-optimal.
4. (Unboundedness check) If −X_k² r^k ≥ 0 then stop; the optimal cost is −∞.
5. (Update of primal solution) Let
    x^{k+1} = x^k − β X_k² r^k / ||X_k r^k||.
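A compact numpy sketch of one iteration (Steps 2–5, omitting the unboundedness check); affine_scaling_step is a hypothetical helper, not part of any library, and assumes a strictly positive feasible x and full-row-rank A:

    import numpy as np

    def affine_scaling_step(A, c, x, beta=0.5, eps=1e-9):
        X2 = np.diag(x**2)                                  # X_k^2
        p = np.linalg.solve(A @ X2 @ A.T, A @ X2 @ c)       # dual estimates
        r = c - A.T @ p                                     # reduced costs
        if np.all(r >= 0) and x @ r < eps:                  # e'X_k r^k < eps
            return x, p, True                               # eps-optimal
        return x - beta * (X2 @ r) / np.linalg.norm(x * r), p, False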

5.3 Variants
Slide 15
• ||u||_∞ = max_i |u_i|,  γ(u) = max{u_i | u_i > 0}
• γ(u) ≤ ||u||_∞ ≤ ||u||
• Short-step method: the Euclidean-norm update of Step 5.
• Long-step variants:
    x^{k+1} = x^k − β X_k² r^k / ||X_k r^k||_∞
    x^{k+1} = x^k − β X_k² r^k / γ(X_k r^k)
6 Convergence
6.1 Assumptions
Slide 16
Assumptions A:
(a) The rows of the matrix A are linearly independent.
(b) The vector c is not a linear combination of the rows of A.
(c) There exists an optimal solution.
(d) There exists a positive feasible solution.
Assumptions B:
(a) Every BFS to the primal problem is nondegenerate.
(b) At every BFS to the primal problem, the reduced cost of every nonbasic
variable is nonzero.

6.2 Theorem
Slide 17
If we apply the affine scaling algorithm with ǫ = 0, the following hold:
(a) For the first long-step variant, under Assumptions A and B and if 0 < β < 1, x^k and p^k converge to the optimal primal and dual solutions.
(b) For the second long-step variant, under Assumption A and if 0 < β < 2/3, the sequences x^k and p^k converge to some primal and dual optimal solutions, respectively.

7 Initialization
Slide 18
    min c′x + M x_{n+1}
    s.t. Ax + (b − Ae) x_{n+1} = b
         x, x_{n+1} ≥ 0

8 Example
Slide 19
max x1 + 2x2
s.t. x1 + x2 ≤ 2
−x1 + x2 ≤ 1
x1 , x2 ≥0

9 Practical Performance
Slide 20
• Excellent practical performance, simple
• Major step: invert A X_k² A′
• Imitates the simplex method near the boundary

[Figure: the affine scaling iterates on the example above, approaching the optimal vertex of the feasible region in the (x1, x2)-plane.]

Lecture 21: Primal Barrier Interior Point Algorithm

1 Outline
Slide 1
1. Barrier Methods
2. The Central Path
3. Approximating the Central Path
4. The Primal Barrier Algorithm
5. Correctness and Complexity

2 Barrier methods
Slide 2
    min f(x)
    s.t. g_j(x) ≤ 0, j = 1, . . . , p
         h_i(x) = 0, i = 1, . . . , m

    S = { x | g_j(x) < 0, j = 1, . . . , p;  h_i(x) = 0, i = 1, . . . , m }

2.1 Strategy
Slide 3
• A barrier function G(x) is a continuous function with the property that it approaches ∞ as one of the g_j(x) approaches 0 from negative values.
• Examples:
    G(x) = −Σ_{j=1}^p log(−g_j(x)),    G(x) = −Σ_{j=1}^p 1/g_j(x)

Slide 4
• Consider a sequence {µ_k}: 0 < µ_{k+1} < µ_k and µ_k → 0.
• Consider the problem
    x^k = argmin_{x∈S} f(x) + µ_k G(x)
• Theorem: Every limit point of the sequence {x^k} generated by a barrier method is a global minimum of the original constrained problem.

[Figure: the central path — points x(10), x(1), x(0.1), x(0.01) running from the analytic center toward the optimal solution x* as µ decreases.]
2.2 Primal path-following IPMs for LO
Slide 5
    (P) min c′x            (D) max b′p
        s.t. Ax = b            s.t. A′p + s = c
             x ≥ 0                  s ≥ 0
Barrier problem:
    min B_µ(x) = c′x − µ Σ_{j=1}^n log x_j
    s.t. Ax = b
Minimizer: x(µ)

3 Central Path
Slide 6
• As µ varies, the minimizers x(µ) form the central path
• lim_{µ→0} x(µ) exists and is an optimal solution x* to the initial LP
• For µ = ∞, x(∞) is called the analytic center:
    min − Σ_{j=1}^n log x_j
    s.t. Ax = b
Slide 7

3.1 Example
Slide 8
min x2
s.t. x1 + x2 + x3 = 1
x1 , x2 , x3 ≥ 0

[Figure: the simplex P = {x ≥ 0 : x1 + x2 + x3 = 1} with analytic center (1/3, 1/3, 1/3), the optimal face Q with analytic center (1/2, 0, 1/2), and the central path joining them.]

• Q = {x | x = (x1, 0, x3), x1 + x3 = 1, x ≥ 0} is the set of optimal solutions to the original LP
• The analytic center of Q is (1/2, 0, 1/2)

    min x2 − µ log x1 − µ log x2 − µ log x3
    s.t. x1 + x2 + x3 = 1

Eliminating x3 = 1 − x1 − x2:
    min x2 − µ log x1 − µ log x2 − µ log(1 − x1 − x2).

    x1(µ) = (1 − x2(µ))/2
    x2(µ) = (1 + 3µ − √(9µ² + 2µ + 1))/2
    x3(µ) = (1 − x2(µ))/2

The analytic center: (1/3, 1/3, 1/3)
Slide 9
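The closed form above is easy to probe numerically; a small sketch showing x(µ) → (1/3, 1/3, 1/3) as µ → ∞ and x(µ) → (1/2, 0, 1/2) as µ → 0:

    import numpy as np

    def x_of_mu(mu):
        x2 = (1 + 3*mu - np.sqrt(9*mu**2 + 2*mu + 1)) / 2
        x1 = x3 = (1 - x2) / 2              # by symmetry of x1 and x3
        return x1, x2, x3

    for mu in [100.0, 1.0, 0.01, 1e-6]:
        print(mu, x_of_mu(mu))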

3.2 Solution of Central Path
Slide 10
• Barrier problem for the dual:
    max p′b + µ Σ_{j=1}^n log s_j
    s.t. A′p + s = c
• Solution (KKT):
    Ax(µ) = b,  x(µ) ≥ 0
    A′p(µ) + s(µ) = c,  s(µ) ≥ 0
    X(µ)S(µ)e = µe
Slide 11
• Theorem: If x*, p*, and s* satisfy the optimality conditions, then they are optimal solutions to the primal and dual barrier problems.
• Goal: Solve the barrier problem
    min B_µ(x) = c′x − µ Σ_{j=1}^n log x_j
    s.t. Ax = b

4 Approximating the central path
Slide 12
    ∂B_µ(x)/∂x_i = c_i − µ/x_i
    ∂²B_µ(x)/∂x_i² = µ/x_i²
    ∂²B_µ(x)/∂x_i∂x_j = 0,  i ≠ j

Given a vector x > 0:
Slide 13
    B_µ(x + d) ≈ B_µ(x) + Σ_{i=1}^n (∂B_µ(x)/∂x_i) d_i + (1/2) Σ_{i,j=1}^n (∂²B_µ(x)/∂x_i∂x_j) d_i d_j
               = B_µ(x) + (c′ − µe′X^{−1})d + (µ/2) d′X^{−2}d
with X = diag(x_1, . . . , x_n)
Slide 14
Approximating problem:
    min (c′ − µe′X^{−1})d + (µ/2) d′X^{−2}d
    s.t. Ad = 0
Solution (from Lagrange):
    c − µX^{−1}e + µX^{−2}d − A′p = 0
    Ad = 0
Slide 15

• A system of m + n linear equations in m + n unknowns (d_j, j = 1, . . . , n, and p_i, i = 1, . . . , m).
• Solution:
    d(µ) = ( I − X²A′(AX²A′)^{−1}A ) ( Xe − (1/µ) X²c )
    p(µ) = (AX²A′)^{−1} A (X²c − µXe)

4.1 The Newton connection
Slide 16
• d(µ) is the Newton direction; the process of calculating this direction is called a Newton step
• Starting with x, the new primal solution is x + d(µ)
• The corresponding dual solution becomes (p, s) = ( p(µ), c − A′p(µ) )
• We then decrease µ to µ̄ = αµ, 0 < α < 1

4.2 Geometric Interpretation
Slide 17
• Take one Newton step so that x would be close to x(µ)
• Measure of closeness:
    || (1/µ) XSe − e || ≤ β,
  where 0 < β < 1, X = diag(x_1, . . . , x_n), S = diag(s_1, . . . , s_n)
• As µ → 0, the complementarity slackness condition will be satisfied
Slide 18
5 The Primal Barrier Algorithm
Slide 19
Input
(a) (A, b, c); A has full row rank;
(b) x^0 > 0, s^0 > 0, p^0;
(c) optimality tolerance ǫ > 0;
(d) µ^0, and α, where 0 < α < 1.
Slide 20
1. (Initialization) Start with some primal and dual feasible x^0 > 0, s^0 > 0, p^0, and set k = 0.
2. (Optimality test) If (s^k)′x^k < ǫ stop; else go to Step 3.
3. Let
    X_k = diag(x_1^k, . . . , x_n^k),
    µ^{k+1} = α µ^k
Slide 21
4. (Computation of directions) Solve the linear system
    µ^{k+1} X_k^{−2} d − A′p = µ^{k+1} X_k^{−1} e − c
    Ad = 0
5. (Update of solutions) Let
    x^{k+1} = x^k + d,
    p^{k+1} = p,
    s^{k+1} = c − A′p.
6. Let k := k + 1 and go to Step 2.
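A minimal numpy sketch of the whole loop, assuming a strictly feasible starting point close enough to the central path (dense linear algebra, no safeguards); primal_barrier is a hypothetical name:

    import numpy as np

    def primal_barrier(A, b, c, x, mu=10.0, alpha=0.5, eps=1e-6, max_iter=200):
        m, n = A.shape
        for _ in range(max_iter):
            mu *= alpha
            # Newton system:  mu X^{-2} d - A'p = mu X^{-1} e - c,  A d = 0
            K = np.block([[mu * np.diag(1.0 / x**2), -A.T],
                          [A, np.zeros((m, m))]])
            rhs = np.concatenate([mu / x - c, np.zeros(m)])
            sol = np.linalg.solve(K, rhs)
            d, p = sol[:n], sol[n:]
            x = x + d
            s = c - A.T @ p
            if s @ x < eps:                  # duality gap test
                break
        return x, p, s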

6 Correctness
Slide 22
Theorem: Given α = 1 − (√β − β)/(√β + √n), β < 1, and (x^0, s^0, p^0) with x^0 > 0, s^0 > 0 and
    || (1/µ^0) X_0 S_0 e − e || ≤ β,
then after
    K = ( (√β + √n)/(√β − β) ) log( (s^0)′x^0 (1 + β) / (ǫ(1 − β)) )
iterations, (x^K, s^K, p^K) is found with
    (s^K)′x^K ≤ ǫ.
6
6.1 Proof
Slide 23
• Claim (by induction): || (1/µ^k) X_k S_k e − e || ≤ β
• For k = 0 we have assumed it
• Assume it holds for k; then
    || (1/µ^{k+1}) X_k S_k e − e || = || (1/(αµ^k)) X_k S_k e − e ||
        = || (1/α) ( (1/µ^k) X_k S_k e − e ) + ((1 − α)/α) e ||
        ≤ (1/α) || (1/µ^k) X_k S_k e − e || + ((1 − α)/α) ||e||
        ≤ β/α + ((1 − α)/α) √n
        = √β
• We next show that ||X_k^{−1} d|| ≤ √β < 1, where d = x^{k+1} − x^k.
• d solves
    µ^{k+1} X_k^{−2} d − A′p = µ^{k+1} X_k^{−1} e − c,
    Ad = 0
• Left-multiplying the first equation by d′ (and using Ad = 0, so d′A′p = 0):
    µ^{k+1} d′X_k^{−2} d = d′( µ^{k+1} X_k^{−1} e − c )

    ||X_k^{−1} d||² = d′X_k^{−2} d
        = ( X_k^{−1} e − (1/µ^{k+1}) c )′ d
        = ( X_k^{−1} e − (1/µ^{k+1}) (s^k + A′p^k) )′ d
        = ( X_k^{−1} e − (1/µ^{k+1}) s^k )′ d
        = − ( (1/µ^{k+1}) X_k S_k e − e )′ X_k^{−1} d
        ≤ || (1/µ^{k+1}) X_k S_k e − e || · ||X_k^{−1} d||
        ≤ √β ||X_k^{−1} d||,

hence ||X_k^{−1} d|| ≤ √β < 1.

• We next show that x^{k+1} and (p^{k+1}, s^{k+1}) are primal and dual feasible. Since Ad = 0, we have Ax^{k+1} = b and
    x^{k+1} = x^k + d = X_k (e + X_k^{−1} d) > 0,
because ||X_k^{−1} d|| < 1. Also A′p^{k+1} + s^{k+1} = c by construction, and
    s^{k+1} = c − A′p^{k+1} = µ^{k+1} X_k^{−1} (e − X_k^{−1} d) > 0,
because ||X_k^{−1} d|| < 1.
• Componentwise,
    x_j^{k+1} = x_j^k (1 + d_j/x_j^k),
    s_j^{k+1} = (µ^{k+1}/x_j^k) (1 − d_j/x_j^k).

Therefore,
    (1/µ^{k+1}) x_j^{k+1} s_j^{k+1} − 1 = (1 + d_j/x_j^k)(1 − d_j/x_j^k) − 1 = −(d_j/x_j^k)².
• Let D = diag(d_1, . . . , d_n) and ||u||_1 = Σ_i |u_i|; note that ||u|| ≤ ||u||_1. Then
    || (1/µ^{k+1}) X_{k+1} S_{k+1} e − e || = || X_k^{−2} D² e ||
        ≤ || X_k^{−2} D² e ||_1
        = e′ X_k^{−2} D² e
        = e′ D X_k^{−2} D e
        = d′ X_k^{−2} d
        = || X_k^{−1} d ||²
        ≤ (√β)² = β,
and hence the induction is complete.


• Since at every iteration || (1/µ^k) X_k S_k e − e || ≤ β,
    −β ≤ (1/µ^k) x_j^k s_j^k − 1 ≤ β,
so
    n µ^k (1 − β) ≤ (s^k)′x^k ≤ n µ^k (1 + β)

•   µ^k = α^k µ^0 = ( 1 − (√β − β)/(√β + √n) )^k µ^0 ≤ e^{−k(√β − β)/(√β + √n)} µ^0
• After
    ( (√β + √n)/(√β − β) ) log( µ^0 n (1 + β)/ǫ ) ≤ ( (√β + √n)/(√β − β) ) log( (s^0)′x^0 (1 + β)/(ǫ(1 − β)) ) = K
iterations, the primal barrier algorithm finds primal and dual solutions x^K, (p^K, s^K) that have duality gap (s^K)′x^K less than or equal to ǫ.

7 Complexity
Slide 24
• Work per iteration involves solving a linear system with m + n equations in m + n unknowns. Given that m ≤ n, the work per iteration is O(n³).
• ǫ^0 = (s^0)′x^0: initial duality gap. The algorithm needs
    O( √n log(ǫ^0/ǫ) )
iterations to reduce the duality gap from ǫ^0 to ǫ, with O(n³) arithmetic operations per iteration.

Lecture 22: Primal-Dual Barrier Interior Point Algorithm

1 Outline
Slide 1
1. The Barrier Problem
2. Solving Equations
3. The Primal-Dual Barrier Algorithm
4. Insight on Behavior
5. Computational Aspects
6. Conclusions

2 The Barrier Problem


Slide 2
Barrier problem:
    min B_µ(x) = c′x − µ Σ_{j=1}^n log x_j
    s.t. Ax = b
KKT:
    c − µ (1/x_1(µ), . . . , 1/x_n(µ))′ + A′p(µ) = 0
    Ax(µ) = b,  x(µ) ≥ 0

2.1 Optimality Conditions
Slide 3
Set s_j(µ) = µ / x_j(µ). Then
    Ax(µ) = b,  x(µ) ≥ 0
    A′p(µ) + s(µ) = c,  s(µ) ≥ 0
    s_j(µ) x_j(µ) = µ,  i.e.,  X(µ)S(µ)e = µe
where X(µ) = diag( x_1(µ), . . . , x_n(µ) ), S(µ) = diag( s_1(µ), . . . , s_n(µ) )
3 Solving Equations
Slide 4
    F(z) = [ Ax − b ;  A′p + s − c ;  XSe − µe ]
with z = (x, p, s) and r = 2n + m. Solve
    F(z*) = 0

3.1 Newton’s method
Slide 5
    F(z^k + d) ≈ F(z^k) + J(z^k) d
Here J(z^k) is the r × r Jacobian matrix whose (i, j)th element is ∂F_i(z)/∂z_j evaluated at z = z^k. Set
    F(z^k) + J(z^k) d = 0
and z^{k+1} = z^k + d (d is the Newton direction).
Slide 6
(x^k, p^k, s^k): current primal and dual feasible solution. The Newton direction d = (d_x^k, d_p^k, d_s^k) solves
    [ A    0    0   ] [ d_x^k ]     [ Ax^k − b          ]
    [ 0    A′   I   ] [ d_p^k ] = − [ A′p^k + s^k − c   ]
    [ S_k  0    X_k ] [ d_s^k ]     [ X_k S_k e − µ^k e ]

3.2 Step lengths
Slide 7
    x^{k+1} = x^k + β_P^k d_x^k
    p^{k+1} = p^k + β_D^k d_p^k
    s^{k+1} = s^k + β_D^k d_s^k
To preserve nonnegativity, take
    β_P^k = min{ 1, α min_{i : (d_x^k)_i < 0} ( −x_i^k / (d_x^k)_i ) },
    β_D^k = min{ 1, α min_{i : (d_s^k)_i < 0} ( −s_i^k / (d_s^k)_i ) },
with 0 < α < 1.
4 The Primal-Dual Barrier Algorithm
Slide 8
1. (Initialization) Start with x^0 > 0, s^0 > 0, p^0, and set k = 0.
2. (Optimality test) If (s^k)′x^k < ǫ stop; else go to Step 3.
3. (Computation of Newton directions) Let
    µ^k = (s^k)′x^k / n,
    X_k = diag(x_1^k, . . . , x_n^k),
    S_k = diag(s_1^k, . . . , s_n^k).
Solve the linear system
    [ A    0    0   ] [ d_x^k ]     [ Ax^k − b          ]
    [ 0    A′   I   ] [ d_p^k ] = − [ A′p^k + s^k − c   ]
    [ S_k  0    X_k ] [ d_s^k ]     [ X_k S_k e − µ^k e ]
Slide 9
4. (Find step lengths)
    β_P^k = min{ 1, α min_{i : (d_x^k)_i < 0} ( −x_i^k / (d_x^k)_i ) }
    β_D^k = min{ 1, α min_{i : (d_s^k)_i < 0} ( −s_i^k / (d_s^k)_i ) }
5. (Solution update)
    x^{k+1} = x^k + β_P^k d_x^k
    p^{k+1} = p^k + β_D^k d_p^k
    s^{k+1} = s^k + β_D^k d_s^k
6. Let k := k + 1 and go to Step 2.
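One full iteration (Steps 3–5), sketched with numpy; primal_dual_step is a hypothetical name, and the dense Jacobian solve stands in for the structured solves used in real implementations:

    import numpy as np

    def primal_dual_step(A, b, c, x, p, s, alpha=0.99):
        m, n = A.shape
        mu = (s @ x) / n
        J = np.block([[A, np.zeros((m, m)), np.zeros((m, n))],
                      [np.zeros((n, n)), A.T, np.eye(n)],
                      [np.diag(s), np.zeros((n, m)), np.diag(x)]])
        rhs = -np.concatenate([A @ x - b, A.T @ p + s - c, x * s - mu])
        d = np.linalg.solve(J, rhs)
        dx, dp, ds = d[:n], d[n:n+m], d[n+m:]
        # step lengths preserving x, s > 0
        bP = min(1.0, alpha * min((-x[i] / dx[i] for i in range(n) if dx[i] < 0),
                                  default=np.inf))
        bD = min(1.0, alpha * min((-s[i] / ds[i] for i in range(n) if ds[i] < 0),
                                  default=np.inf))
        return x + bP * dx, p + bD * dp, s + bD * ds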

5 Insight on behavior
Slide 10
• Affine scaling:
    d_affine = −X² ( I − A′(AX²A′)^{−1} A X² ) c
• Primal barrier:
    d_primal-barrier = ( I − X²A′(AX²A′)^{−1}A ) ( Xe − (1/µ) X²c )
• For µ = ∞:
    d_centering = ( I − X²A′(AX²A′)^{−1}A ) Xe
• Note that
    d_primal-barrier = d_centering + (1/µ) d_affine
• When µ is large, the centering direction dominates, i.e., in the beginning, the barrier algorithm takes steps towards the analytic center
• When µ is small, the affine scaling direction dominates, i.e., towards the end, the barrier algorithm behaves like the affine scaling algorithm

6 Computational aspects of IPMs
Slide 11
Simplex vs. interior point methods (IPMs)
• The simplex method tends to perform poorly on large, massively degenerate problems, whereas IPMs are much less affected.
• Key step in IPMs: solve
    ( A X_k² A′ ) d = f
• In implementations of IPMs, A X_k² A′ is usually written as
    A X_k² A′ = LL′,
where L is a square lower triangular matrix called the Cholesky factor
• Solve the system ( A X_k² A′ ) d = f by solving the triangular systems
    Ly = f,   L′d = y
• The construction of L requires O(n³) operations, but the actual computational effort is highly dependent on the sparsity (number of nonzero entries) of L
• Large scale implementations employ heuristics (reordering rows and columns of A) to improve the sparsity of L. If L is sparse, IPMs are stronger.
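In code, the factor-once, two-triangular-solves pattern looks like this (scipy sketch; normal_equation_solve is a hypothetical name, and the dense diag build ignores the sparsity issues discussed above):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def normal_equation_solve(A, x, f):
        M = A @ np.diag(x**2) @ A.T          # A X_k^2 A' = L L'
        return cho_solve(cho_factor(M), f)   # Ly = f, then L'd = y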

7 Conclusions
Slide 12
• IPMs represent the present and future of Optimization.
• Very successful in solving very large problems.
• Extend to general convex problems


Lecture 23: Semidefinite Optimization


1 Outline
Slide 1
1. Preliminaries
2. SDO
3. Duality
4. SDO Modeling Power
5. Barrier Algorithm for SDO

2 Preliminaries
Slide 2
• A symmetric matrix A is positive semidefinite (A ⪰ 0) if and only if
    u′Au ≥ 0 ∀ u ∈ R^n
• A ⪰ 0 if and only if all eigenvalues of A are nonnegative
• A • B = Σ_{i=1}^n Σ_{j=1}^n A_ij B_ij

2.1 The trace
Slide 3
• The trace of a matrix A is defined as
    trace(A) = Σ_{j=1}^n A_jj
• trace(AB) = trace(BA)
• A • B = trace(A′B)
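These identities are easy to confirm numerically; a quick numpy check (a sketch, not part of the lecture):

    import numpy as np

    A, B = np.random.randn(4, 4), np.random.randn(4, 4)
    assert np.isclose(np.trace(A @ B), np.trace(B @ A))
    assert np.isclose(np.sum(A * B), np.trace(A.T @ B))   # A . B = trace(A'B)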

3 SDO
Slide 4
• C: symmetric n × n matrix
• A_i, i = 1, . . . , m: symmetric n × n matrices
• b_i, i = 1, . . . , m: scalars
• Semidefinite optimization problem (SDO):
    (P): min C • X
         s.t. A_i • X = b_i,  i = 1, . . . , m
              X ⪰ 0
3.1 Example
Slide 5
n = 3 and m = 2:
    A_1 = [1 0 1; 0 3 7; 1 7 5],   A_2 = [0 2 8; 2 6 0; 8 0 4],   C = [1 2 3; 2 9 0; 3 0 7]
    b_1 = 11,  b_2 = 19
    X = [x11 x12 x13; x21 x22 x23; x31 x32 x33]
Slide 6
    (P): min x11 + 4x12 + 6x13 + 9x22 + 7x33
         s.t. x11 + 2x13 + 3x22 + 14x23 + 5x33 = 11
              4x12 + 16x13 + 6x22 + 4x33 = 19
              X = [x11 x12 x13; x21 x22 x23; x31 x32 x33] ⪰ 0
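For illustration, this example can be handed directly to a modeling tool; a sketch using cvxpy (assuming cvxpy and an SDP solver such as SCS are installed):

    import cvxpy as cp
    import numpy as np

    A1 = np.array([[1, 0, 1], [0, 3, 7], [1, 7, 5]])
    A2 = np.array([[0, 2, 8], [2, 6, 0], [8, 0, 4]])
    C  = np.array([[1, 2, 3], [2, 9, 0], [3, 0, 7]])

    X = cp.Variable((3, 3), PSD=True)          # X is symmetric PSD
    prob = cp.Problem(cp.Minimize(cp.trace(C @ X)),     # C . X = trace(CX)
                      [cp.trace(A1 @ X) == 11, cp.trace(A2 @ X) == 19])
    prob.solve()
    print(prob.value, X.value)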

3.2 LO as SDO
Slide 7
    LO: min c′x
        s.t. Ax = b, x ≥ 0
    A_i = diag(a_i1, a_i2, . . . , a_in),   C = diag(c_1, c_2, . . . , c_n)
Slide 8
    (P): min C • X
         s.t. A_i • X = b_i,  i = 1, . . . , m
              X_ij = 0,  i = 1, . . . , n, j = i + 1, . . . , n
              X ⪰ 0
    X = diag(x_1, x_2, . . . , x_n)
4 Duality
Slide 9
    (D): max Σ_{i=1}^m y_i b_i
         s.t. Σ_{i=1}^m y_i A_i + S = C
              S ⪰ 0
Equivalently,
    (D): max Σ_{i=1}^m y_i b_i
         s.t. C − Σ_{i=1}^m y_i A_i ⪰ 0
4.1 Example
Slide 10
    (D): max 11y_1 + 19y_2
         s.t. y_1 [1 0 1; 0 3 7; 1 7 5] + y_2 [0 2 8; 2 6 0; 8 0 4] + S = [1 2 3; 2 9 0; 3 0 7]
              S ⪰ 0
Equivalently,
    (D): max 11y_1 + 19y_2
         s.t. [ 1 − y_1,        2 − 2y_2,          3 − y_1 − 8y_2;
                2 − 2y_2,       9 − 3y_1 − 6y_2,   −7y_1;
                3 − y_1 − 8y_2, −7y_1,             7 − 5y_1 − 4y_2 ] ⪰ 0

4.2 Weak Duality
Slide 11
Theorem: Given a feasible solution X of (P) and a feasible solution (y, S) of (D),
    C • X − Σ_{i=1}^m y_i b_i = S • X ≥ 0.
If C • X − Σ_{i=1}^m y_i b_i = 0, then X and (y, S) are each optimal solutions to (P) and (D), and SX = 0.
4.3 Proof
Slide 12
• We must show that if S ⪰ 0 and X ⪰ 0, then S • X ≥ 0
• Let S = PDP′ and X = QEQ′, where P, Q are orthonormal matrices and D, E are nonnegative diagonal matrices
    S • X = trace(S′X) = trace(SX)
          = trace(PDP′QEQ′)
          = trace(DP′QEQ′P) = Σ_{j=1}^n D_jj (P′QEQ′P)_jj ≥ 0,
since D_jj ≥ 0 and the diagonal of P′QEQ′P must be nonnegative.
• Suppose that trace(SX) = 0. Then
    Σ_{j=1}^n D_jj (P′QEQ′P)_jj = 0
• Then, for each j = 1, . . . , n, D_jj = 0 or (P′QEQ′P)_jj = 0.
• The latter case implies that the jth row of P′QEQ′P is all zeros. Therefore, DP′QEQ′P = 0, and so SX = PDP′QEQ′ = 0.

4.4 Strong Duality
Slide 13
• (P) or (D) might not attain their respective optima
• There might be a duality gap, unless certain regularity conditions hold
Theorem:
• If there exist feasible solutions X̂ for (P) and (ŷ, Ŝ) for (D) such that X̂ ≻ 0, Ŝ ≻ 0,
• then both (P) and (D) attain their optimal values z_P* and z_D*,
• and z_P* = z_D*

5 SDO vs LO
Slide 14
• There may be a finite or infinite duality gap. The primal and/or dual may or may not attain their optima. Both problems will attain their common optimal value if both programs have feasible solutions in the interior of the semidefinite cone.
• There is no finite algorithm for solving SDO. There is a simplex algorithm, but it is not a finite algorithm. There is no direct analog of a “basic feasible solution” for SDO.
• Given rational data, the feasible region may have no rational solutions. The optimal solution may not have rational components or rational eigenvalues.

6 SDO Modeling Power

6.1 Quadratically Constrained Problems
Slide 15
    min (A_0 x + b_0)′(A_0 x + b_0) − c_0′x − d_0
    s.t. (A_i x + b_i)′(A_i x + b_i) − c_i′x − d_i ≤ 0,  i = 1, . . . , m

    (Ax + b)′(Ax + b) − c′x − d ≤ 0  ⇔  [ I, Ax + b; (Ax + b)′, c′x + d ] ⪰ 0
Slide 16
    min t
    s.t. (A_0 x + b_0)′(A_0 x + b_0) − c_0′x − d_0 − t ≤ 0
         (A_i x + b_i)′(A_i x + b_i) − c_i′x − d_i ≤ 0, ∀ i
Slide 17
    min t
    s.t. [ I, A_0 x + b_0; (A_0 x + b_0)′, c_0′x + d_0 + t ] ⪰ 0
         [ I, A_i x + b_i; (A_i x + b_i)′, c_i′x + d_i ] ⪰ 0, ∀ i

6.2 Eigenvalue Problems
Slide 18
• X: symmetric n × n matrix
• λ_max(X) = largest eigenvalue of X
• λ_1(X) ≥ λ_2(X) ≥ · · · ≥ λ_n(X): eigenvalues of X
• Theorem: λ_max(X) ≤ t ⇔ t·I − X ⪰ 0
• Σ_{i=1}^k λ_i(X) ≤ t  ⇔  there exist s and Z with
    t − k·s − trace(Z) ≥ 0,
    Z ⪰ 0,
    Z − X + sI ⪰ 0
• Recall trace(Z) = Σ_{i=1}^n Z_ii
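The first equivalence is easy to probe numerically; a small sketch (the matrix X below is an arbitrary example):

    import numpy as np

    X = np.array([[2.0, 1.0], [1.0, 3.0]])
    t = np.linalg.eigvalsh(X).max()                      # lambda_max(X)
    # t*I - X is positive semidefinite exactly when t >= lambda_max(X):
    print(np.all(np.linalg.eigvalsh(t * np.eye(2) - X) >= -1e-9))   # True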

6.3 Optimizing Structural Dynamics
Slide 19
• Select x_i, the cross-sectional area of structure i, i = 1, . . . , n
• M(x) = M_0 + Σ_i x_i M_i: mass matrix
• K(x) = K_0 + Σ_i x_i K_i: stiffness matrix
• Structure weight: w = w_0 + Σ_i x_i w_i
• Dynamics:
    M(x) d̈ + K(x) d = 0
Slide 20
• d(t): vector of displacements
• d_i(t) = Σ_{j=1}^n α_ij cos(ω_j t − φ_j)
• det(K(x) − M(x) ω²) = 0;  ω_1 ≤ ω_2 ≤ · · · ≤ ω_n
• Fundamental frequency: ω_1 = λ_min^{1/2}(M(x), K(x))
• We want to bound the fundamental frequency:
    ω_1 ≥ Ω  ⇐⇒  M(x) Ω² − K(x) ⪯ 0
• Minimize weight
Slide 21
Problem: Minimize weight subject to
    fundamental frequency ω_1 ≥ Ω
    limits on cross-sectional areas
Formulation:
    min w_0 + Σ_i x_i w_i
    s.t. M(x) Ω² − K(x) ⪯ 0
         l_i ≤ x_i ≤ u_i
6.4 Measurements with Noise
Slide 22
• x: ability of a random student on k tests
    E[x] = x̄,  E[(x − x̄)(x − x̄)′] = Σ
• y: score of a random student on k tests
• v: testing error on the k tests, independent of x
    E[v] = 0,  E[vv′] = D, diagonal (unknown)
• y = x + v;  E[y] = x̄,
    E[(y − x̄)(y − x̄)′] = Σ̂ = Σ + D
• Objective: estimate x̄ and Σ reliably
Slide 23
• Take samples of y from which we can estimate x̄ and Σ̂
• e′x: total ability on the tests
• e′y: total test score
• Reliability of the test:
    Var[e′x] / Var[e′y] = e′Σe / e′Σ̂e = 1 − e′De / e′Σ̂e
Slide 24
We can find a lower bound on the reliability of the test:
    min e′Σe
    s.t. Σ + D = Σ̂
         Σ, D ⪰ 0
         D diagonal
Equivalently,
    max e′De
    s.t. 0 ⪯ D ⪯ Σ̂
         D diagonal

6.5 Further Tricks
Slide 25
    A = [ B, C′; C, D ] ⪰ 0  ⇐⇒  D − CB^{−1}C′ ⪰ 0   (for B ≻ 0; the Schur complement)

    x′Ax + 2b′x + c ≥ 0, ∀ x  ⇐⇒  [ c, b′; b, A ] ⪰ 0
6.6 MAXCUT
Slide 26
• Given G = (N, E) undirected graph, weights w_ij ≥ 0 on edges (i, j) ∈ E
• Find a subset S ⊆ N such that Σ_{i∈S, j∈S̄} w_ij is maximized
• x_j = 1 for j ∈ S and x_j = −1 for j ∈ S̄
    MAXCUT: max (1/4) Σ_{i=1}^n Σ_{j=1}^n w_ij (1 − x_i x_j)
            s.t. x_j ∈ {−1, 1},  j = 1, . . . , n

6.6.1 Reformulation
Slide 27
• Let Y = xx′, i.e., Y_ij = x_i x_j
• Let W = [w_ij]
• Equivalent formulation:
    MAXCUT: max (1/4) ( Σ_{i=1}^n Σ_{j=1}^n w_ij − W • Y )
            s.t. x_j ∈ {−1, 1},  j = 1, . . . , n
                 Y_jj = 1,  j = 1, . . . , n
                 Y = xx′

6.6.2 Relaxation
Slide 28
• Y = xx′ ⪰ 0
• Relaxation:
    RELAX: max (1/4) ( Σ_{i=1}^n Σ_{j=1}^n w_ij − W • Y )
           s.t. Y_jj = 1,  j = 1, . . . , n
                Y ⪰ 0
Slide 29
    MAXCUT ≤ RELAX
• It turns out that:
    0.87856 RELAX ≤ MAXCUT ≤ RELAX
• The value of the SDO relaxation is guaranteed to be no more than about 12% higher than the value of the very difficult to solve problem MAXCUT
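The 0.87856 guarantee comes from the Goemans–Williamson rounding of this relaxation: factor Y and cut with a random hyperplane. A numpy sketch (random_hyperplane_cut is a hypothetical name; Y is assumed to come from an SDP solver):

    import numpy as np

    def random_hyperplane_cut(Y, rng=np.random.default_rng(0)):
        # Factor Y = V'V (clip tiny negative eigenvalues from numerical noise)
        w, Q = np.linalg.eigh(Y)
        V = np.sqrt(np.clip(w, 0, None))[:, None] * Q.T   # columns v_j, v_i'v_j = Y_ij
        r = rng.standard_normal(V.shape[0])               # random hyperplane normal
        return np.sign(r @ V)                             # x_j = sign(r'v_j) in {-1, +1}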

8
7 Barrier Algorithm for SDO
Slide 30
• X ⪰ 0 ⇔ λ_1(X) ≥ 0, . . . , λ_n(X) ≥ 0
• Natural barrier to repel X from the boundary (λ_1(X) > 0, . . . , λ_n(X) > 0):
    − Σ_{i=1}^n log(λ_i(X)) = − log( Π_{i=1}^n λ_i(X) ) = − log(det(X))
Slide 31
• Logarithmic barrier problem:
    min B_µ(X) = C • X − µ log(det(X))
    s.t. A_i • X = b_i,  i = 1, . . . , m,
         X ≻ 0
• Derivative: ∇B_µ(X) = C − µX^{−1}
• KKT:
    A_i • X = b_i,  i = 1, . . . , m,
    X ≻ 0,
    C − µX^{−1} = Σ_{i=1}^m y_i A_i.
• Since X is symmetric positive definite, X = LL′. With
    S = µX^{−1} = µ(L′)^{−1}L^{−1},
we have (1/µ) L′SL = I, and the conditions become
    A_i • X = b_i,  i = 1, . . . , m,
    X ≻ 0,  X = LL′,
    Σ_{i=1}^m y_i A_i + S = C,
    I − (1/µ) L′SL = 0
• Nonlinear equations: take a Newton step analogously to IPMs for LO.
• The barrier algorithm needs O( √n log(ǫ^0/ǫ) ) iterations to reduce the duality gap from ǫ^0 to ǫ
8 Conclusions
Slide 32
• SDO is a very powerful modeling tool
• SDO represents the present and future in continuous optimization
• Barrier Algorithm is very powerful
• Research software available


Lecture 24: Discrete Optimization


1 Outline
Slide 1
• Modeling with integer variables
• What is a good formulation?
• Theme: The Power of Formulations

2 Integer Programming
2.1 Mixed IP
Slide 2
    (MIP) max c′x + h′y
          s.t. Ax + By ≤ b
               x ∈ Z_+^n (x ≥ 0, x integer)
               y ∈ R_+^n (y ≥ 0)

2.2 Pure IP
Slide 3
    (IP) max c′x
         s.t. Ax ≤ b
              x ∈ Z_+^n
Important special case: binary IP
    (BIP) max c′x
          s.t. Ax ≤ b
               x ∈ {0, 1}^n

2.3 LP
Slide 4
    (LP) max c′y
         s.t. By ≤ b
              y ∈ R_+^n

3 Modeling with Binary Variables

3.1 Binary Choice
Slide 5
    x = 1 if the event occurs, 0 otherwise
Example 1: IP formulation of the knapsack problem
    n: projects, total budget b
    a_j: cost of project j
    c_j: value of project j
Slide 6
    x_j = 1 if project j is selected, 0 otherwise.

    max Σ_{j=1}^n c_j x_j
    s.t. Σ_{j=1}^n a_j x_j ≤ b
         x_j ∈ {0, 1}

3.2 Modeling relations
Slide 7
• At most one event occurs:
    Σ_j x_j ≤ 1
• Neither or both events occur:
    x_2 − x_1 = 0
• If one event occurs, then another occurs:
    0 ≤ x_2 ≤ x_1
• If x = 0, then y = 0; if x = 1, then y is unconstrained:
    0 ≤ y ≤ U x,  x ∈ {0, 1}

3.3 The assignment problem
Slide 8
    n people, m jobs
    c_ij: cost of assigning person j to job i
    x_ij = 1 if person j is assigned to job i, 0 otherwise

    min Σ_{i,j} c_ij x_ij
    s.t. Σ_{j=1}^n x_ij = 1,  ∀ i   (each job is assigned)
         Σ_{i=1}^m x_ij ≤ 1,  ∀ j   (each person can do at most one job)
         x_ij ∈ {0, 1}

3.4 Multiple optimal solutions
Slide 9
• Generate all optimal solutions to a BOP:
    max c′x
    s.t. Ax ≤ b
         x ∈ {0, 1}^n
• Generate the third best?
• Extensions to MIO?
3.5 Nonconvex functions
Slide 10
• How to model min c(x), where c(x) is piecewise linear but not convex?

4 What is a good formulation?

4.1 Facility Location
Slide 11
• Data
    N = {1 . . . n}: potential facility locations
    I = {1 . . . m}: set of clients
    c_j: cost of a facility placed at j
    h_ij: cost of satisfying client i from facility j
• Decision variables
    x_j = 1 if a facility is placed at location j, 0 otherwise
    y_ij = fraction of the demand of client i satisfied by facility j
Slide 12
    IZ_1 = min Σ_{j=1}^n c_j x_j + Σ_{i=1}^m Σ_{j=1}^n h_ij y_ij
    s.t. Σ_{j=1}^n y_ij = 1,  ∀ i
         y_ij ≤ x_j,  ∀ i, j
         x_j ∈ {0, 1},  0 ≤ y_ij ≤ 1.
Slide 13
Consider an alternative formulation:
    IZ_2 = min Σ_{j=1}^n c_j x_j + Σ_{i=1}^m Σ_{j=1}^n h_ij y_ij
    s.t. Σ_{j=1}^n y_ij = 1,  ∀ i
         Σ_{i=1}^m y_ij ≤ m · x_j,  ∀ j
         x_j ∈ {0, 1},  0 ≤ y_ij ≤ 1.

Are both valid?
Which one is preferable?

4.2 Observations
Slide 14
• IZ_1 = IZ_2, since both formulations define the same set of integer points.
•   P_1 = { (x, y) : Σ_{j=1}^n y_ij = 1,  y_ij ≤ x_j,  0 ≤ x_j ≤ 1,  0 ≤ y_ij ≤ 1 }
    P_2 = { (x, y) : Σ_{j=1}^n y_ij = 1,  Σ_{i=1}^m y_ij ≤ m · x_j,  0 ≤ x_j ≤ 1,  0 ≤ y_ij ≤ 1 }
Slide 15
• Let
    Z_1 = min{ c′x + h′y : (x, y) ∈ P_1 },   Z_2 = min{ c′x + h′y : (x, y) ∈ P_2 }
• Z_2 ≤ Z_1 ≤ IZ_1 = IZ_2  (since P_1 ⊆ P_2)

4.3 Implications
Slide 16
• Finding IZ_1 (= IZ_2) is difficult.
• Finding Z_1 or Z_2 means solving an LP. Since Z_1 is closer to IZ_1, several methods (branch and bound) work better (actually much better) with Formulation 1.
• Suppose that solving min{ c′x + h′y : (x, y) ∈ P_1 } yields an integral solution. Have we solved the facility location problem?
Slide 17
• Formulation 1 is better than Formulation 2, despite the fact that 1 has a larger number of constraints than 2.
• What, then, is the criterion?

4.4 Ideal Formulations
Slide 18
• Let P be an LP relaxation for a problem
• Let
    H = { (x, y) : x ∈ {0, 1}^n } ∩ P
• Consider the convex hull of H:
    CH(H) = { x : x = Σ_i λ_i x^i,  Σ_i λ_i = 1,  λ_i ≥ 0,  x^i ∈ H }
Slide 19
• The extreme points of CH(H) have {0, 1} coordinates.
• So, if we know CH(H) explicitly, then by solving min{ c′x + h′y : (x, y) ∈ CH(H) } we solve the problem.
• Message: the quality of a formulation is judged by its closeness to CH(H):
    CH(H) ⊆ P_1 ⊆ P_2
5 Minimum Spanning Tree (MST)
Slide 20
• How do telephone companies bill you?
• It used to be that the rate per minute from Boston to LA was proportional to the distance in the MST
• Other applications: telecommunications, transportation (good lower bound for the TSP)
Slide 21
• Given a graph G = (V, E) undirected and costs c_e, e ∈ E.
• Find a tree of minimum cost spanning all the nodes.
• Decision variables: x_e = 1 if edge e is included in the tree, 0 otherwise
Slide 22
• The tree should be connected. How can you model this requirement?
• Let S be a set of vertices. Then S and V \ S should be connected
• Let δ(S) = {e = (i, j) ∈ E : i ∈ S, j ∈ V \ S}
• Then,
    Σ_{e∈δ(S)} x_e ≥ 1
• What is the number of edges in a tree?
• Then,
    Σ_{e∈E} x_e = n − 1

5.1 Formulation
Slide 23
    IZ_MST = min Σ_{e∈E} c_e x_e
    s.t.  Σ_{e∈δ(S)} x_e ≥ 1,  ∀ S ⊆ V, S ≠ ∅, V
          Σ_{e∈E} x_e = n − 1
          x_e ∈ {0, 1}.
Is this a good formulation?
Slide 24
    P_cut = { x ∈ R^|E| : 0 ≤ x ≤ e,
              Σ_{e∈E} x_e = n − 1,
              Σ_{e∈δ(S)} x_e ≥ 1  ∀ S ⊆ V, S ≠ ∅, V }
Is P_cut the CH(H)?

5
5.2 What is CH(H)?
Slide 25
Let
    P_sub = { x ∈ R^|E| : Σ_{e∈E} x_e = n − 1,
              Σ_{e∈E(S)} x_e ≤ |S| − 1  ∀ S ⊆ V, S ≠ ∅, V }
    E(S) = { e = (i, j) : i ∈ S, j ∈ S }
Why is this a valid IP formulation?
Slide 26
• Theorem: P_sub = CH(H).
• ⇒ P_sub is the best possible formulation.
• MESSAGE: Good formulations can have an exponential number of constraints.

6 The Traveling Salesman Problem
Slide 27
Given G = (V, E) an undirected graph, V = {1, . . . , n}, costs c_e ∀ e ∈ E. Find a tour that minimizes total length.

6.1 Formulation I
Slide 28
    x_e = 1 if edge e is included in the tour, 0 otherwise.

    min Σ_{e∈E} c_e x_e
    s.t. Σ_{e∈δ(S)} x_e ≥ 2,  ∀ S ⊂ V, S ≠ ∅, V
         Σ_{e∈δ(i)} x_e = 2,  ∀ i ∈ V
         x_e ∈ {0, 1}

6.2 Formulation II
Slide 29
    min Σ_{e∈E} c_e x_e
    s.t. Σ_{e∈E(S)} x_e ≤ |S| − 1,  ∀ S ⊂ V, S ≠ ∅, V
         Σ_{e∈δ(i)} x_e = 2,  ∀ i ∈ V
         x_e ∈ {0, 1}
Slide 30

    P_cut^TSP = { x ∈ R^|E| : Σ_{e∈δ(S)} x_e ≥ 2,  Σ_{e∈δ(i)} x_e = 2,  0 ≤ x_e ≤ 1 }

    P_sub^TSP = { x ∈ R^|E| : Σ_{e∈δ(i)} x_e = 2,  Σ_{e∈E(S)} x_e ≤ |S| − 1,  0 ≤ x_e ≤ 1 }
Slide 31
• Theorem: P_cut^TSP = P_sub^TSP ⊇ CH(H), and the inclusion is strict in general
• Nobody knows CH(H) for the TSP

7 Minimum Matching
Slide 32
• Given G = (V, E) with costs c_e on e ∈ E. Find a perfect matching of minimum cost.
• Formulation:
    min Σ_{e∈E} c_e x_e
    s.t. Σ_{e∈δ(i)} x_e = 1,  ∀ i ∈ V
         x_e ∈ {0, 1}
• Is the LP relaxation CH(H)?
Slide 33
Let
    P_MAT = { x ∈ R^|E| : Σ_{e∈δ(i)} x_e = 1,
              Σ_{e∈δ(S)} x_e ≥ 1  for all S with |S| = 2k + 1, S ≠ ∅,
              x_e ≥ 0 }
Theorem: P_MAT = CH(H)

8 Observations
Slide 34
• For MST and matching there are efficient algorithms; CH(H) is known.
• For the TSP no efficient algorithm is known; TSP is an NP-hard problem, and CH(H) is not known.
• Conjecture: The convex hulls of problems that are polynomially solvable are explicitly known.
9 Summary
Slide 35
1. An IP formulation is better than another one if the polyhedron of its LP relaxation is closer to the convex hull of the IP.
2. A good formulation can have an exponential number of constraints.
3. Conjecture: Formulations characterize the complexity of problems. If a problem is solvable in polynomial time, then the convex hull of solutions is known.

Lecture 25: Exact Methods for Discrete Optimization

1 Outline
Slide 1
• Cutting plane methods
• Branch and bound methods

2 Cutting plane methods


Slide 2
    min c′x
    s.t. Ax = b
         x ≥ 0
         x integer
LP relaxation:
    min c′x
    s.t. Ax = b
         x ≥ 0.

2.1 Algorithm
Slide 3
• Solve the LP relaxation. Let x∗ be an optimal solution.
• If x∗ is integer stop; x∗ is an optimal solution to IP.
• If not, add to the LP relaxation a linear inequality constraint that all integer solutions satisfy but x* does not; go to Step 1.

2.2 Example
Slide 4
• Let x* be an optimal BFS to the LP relaxation with at least one fractional basic variable.
• N: set of indices of the nonbasic variables.
• Is this a valid cut?
    Σ_{j∈N} x_j ≥ 1.

2.3 The Gomory cutting plane algorithm
Slide 5
• Let x* be an optimal BFS and B an optimal basis:
    x_B + B^{−1} A_N x_N = B^{−1} b.
• Let a_ij = (B^{−1}A_j)_i and a_i0 = (B^{−1}b)_i. Then
    x_i + Σ_{j∈N} a_ij x_j = a_i0.
• Since x_j ≥ 0 for all j,
    x_i + Σ_{j∈N} ⌊a_ij⌋ x_j ≤ x_i + Σ_{j∈N} a_ij x_j = a_i0.
• Since the x_j are integer,
    x_i + Σ_{j∈N} ⌊a_ij⌋ x_j ≤ ⌊a_i0⌋.
• This is a valid cut.

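Generating the cut from a tableau row is a one-liner in code; a sketch with a hypothetical gomory_cut helper (the row coefficients come from the example in the next section):

    import math

    def gomory_cut(a_row, a_i0):
        """Cut  x_i + sum_j floor(a_ij) x_j <= floor(a_i0)  from row coefficients."""
        return {j: math.floor(a) for j, a in a_row.items()}, math.floor(a_i0)

    # Row  x2 + (1/10) x3 + (2/5) x4 = 5/2  gives the cut  x2 <= 2:
    print(gomory_cut({'x3': 0.1, 'x4': 0.4}, 2.5))   # ({'x3': 0, 'x4': 0}, 2)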
2.4 Example
Slide 6
    min x1 − 2x2
    s.t. −4x1 + 6x2 ≤ 9
         x1 + x2 ≤ 4
         x1, x2 ≥ 0, integer.
We transform the problem into standard form:
    min x1 − 2x2
    s.t. −4x1 + 6x2 + x3 = 9
         x1 + x2 + x4 = 4
         x1, . . . , x4 ≥ 0, integer.
LP relaxation: x^1 = (3/2, 5/2).
Slide 7
One row of the optimal tableau reads
    x2 + (1/10) x3 + (2/5) x4 = 5/2.
• Gomory cut:
    x2 ≤ 2.
• Add the constraint x2 + x5 = 2, x5 ≥ 0
• New optimum: x^2 = (3/4, 2).
• One of the equations in the optimal tableau is
    x1 − (1/4) x3 + (3/2) x5 = 3/4.
• New Gomory cut:
    x1 − x3 + x5 ≤ 0,
which in the original variables reads −3x1 + 5x2 ≤ 7.
• The new optimal solution is x^3 = (1, 2), which is integer.
Slide 8

[Figure: the feasible region in the (x1, x2)-plane with the iterates x^1, x^2, x^3 and the two Gomory cuts x2 ≤ 2 and −3x1 + 5x2 ≤ 7.]
3 Branch and bound
Slide 9
1. Branching: Select an active subproblem F_i.
2. Pruning: If the subproblem is infeasible, delete it.
3. Bounding: Otherwise, compute a lower bound b(F_i) for the subproblem.
4. Pruning: If b(F_i) ≥ U, the current best upper bound, delete the subproblem.
5. Partitioning: If b(F_i) < U, either obtain an optimal solution to the subproblem (stop), or break the corresponding problem into further subproblems, which are added to the list of active subproblems.

3.1 LP Based
Slide 10
• Compute the lower bound b(F) by solving the LP relaxation of the discrete optimization problem.
• From the LP solution x*, if there is a component x_i* which is fractional, we create two subproblems by adding either one of the constraints
    x_i ≤ ⌊x_i*⌋   or   x_i ≥ ⌈x_i*⌉.
Note that both constraints are violated by x*.
• If there is more than one fractional component, we use selection rules such as maximum infeasibility to determine the inequality to be added to the problem.
• Select the active subproblem using either depth-first or breadth-first search strategies.

3.2 Example
Slide 11
max 12x1 + 8x2 + 7x3 + 6x4
s.t. 8x1 + 6x2 + 5x3 + 4x4 ≤ 15
x1 , x2 , x3 , x4 are binary.

LP relaxation:
Slide 12
    max 12x1 + 8x2 + 7x3 + 6x4
    s.t. 8x1 + 6x2 + 5x3 + 4x4 ≤ 15
         x1 ≤ 1, x2 ≤ 1, x3 ≤ 1, x4 ≤ 1
         x1, x2, x3, x4 ≥ 0
LP solution: x1 = 1, x2 = 0, x3 = 0.6, x4 = 1; profit = 22.2

3.2.1 Branch and bound tree
Slide 13
    Root: value 22.2 at x = (1, 0, 0.6, 1); branch on x3.
      x3 = 0: value 22 at x = (1, 0.5, 0, 1).
      x3 = 1: value 22 at x = (1, 0, 1, 0.5); branch on x4.
        x4 = 0: value 21.66 at x = (1, 0.3, 1, 0).
        x4 = 1: value 22 at x = (0.75, 0, 1, 1); branch on x1.
          x1 = 0: value 21 at x = (0, 1, 1, 1), integral — incumbent.
          x1 = 1: infeasible.
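The whole search can be reproduced with a few lines of LP-based branch and bound; a sketch using scipy.optimize.linprog with maximum-infeasibility branching and depth-first search (branch_and_bound is a hypothetical helper; its tree order may differ from the slides, but it finds the same optimum, 21 at x = (0, 1, 1, 1)):

    import numpy as np
    from scipy.optimize import linprog

    c = np.array([12, 8, 7, 6]); a = np.array([8, 6, 5, 4]); b = 15

    def branch_and_bound(bounds, best=(-np.inf, None)):
        res = linprog(-c, A_ub=[a], b_ub=[b], bounds=bounds)   # maximize via -c
        if not res.success or -res.fun <= best[0]:
            return best                           # prune: infeasible or bound <= U
        x = res.x
        i = int(np.argmax(np.abs(x - np.round(x))))            # most fractional
        if abs(x[i] - round(x[i])) < 1e-6:
            return (-res.fun, np.round(x))        # integral: new incumbent
        lo, hi = list(bounds), list(bounds)
        lo[i] = (bounds[i][0], np.floor(x[i]))    # branch  x_i <= floor(x_i*)
        hi[i] = (np.ceil(x[i]), bounds[i][1])     # branch  x_i >= ceil(x_i*)
        best = branch_and_bound(lo, best)
        return branch_and_bound(hi, best)

    print(branch_and_bound([(0, 1)] * 4))         # (21.0, array([0., 1., 1., 1.]))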

3.3 Pigeonhole Problem
Slide 14
Slide 15
• There are n + 1 pigeons and n holes. We want to place the pigeons in the holes in such a way that no two pigeons go into the same hole.
Slide 16
• Let x_ij = 1 if pigeon i goes into hole j, 0 otherwise.
Slide 17
• Formulation 1:
    Σ_j x_ij = 1,  i = 1, . . . , n + 1
    x_ij + x_kj ≤ 1,  ∀ j, i ≠ k
4

• Formulation 2:
    Σ_j x_ij = 1,  i = 1, . . . , n + 1
    Σ_{i=1}^{n+1} x_ij ≤ 1,  ∀ j
Which formulation is better for the problem?
Slide 18


• The pigeonhole problem is infeasible.
• Formulation 1 has the feasible LP solution x_ij = 1/n for all i, j, and O(n³) constraints. Nearly complete enumeration is needed for LP-based BB, since the LP relaxation remains feasible after fixing many variables.
• Formulation 2: the LP relaxation is already infeasible, with O(n) constraints.
• Message: the formulation of the problem is important!

3.4 Preprocessing
Slide 19
• An effective way of improving integer programming formulations prior to and during branch-and-bound.
• Logical tests:
    – removal of empty (all zeros) rows and columns;
    – removal of rows dominated by multiples of other rows;
    – strengthening the bounds within rows by comparing individual variables and coefficients to the right-hand side;
    – additional strengthening may be possible for integral variables using rounding.
• Probing: temporarily setting a 0–1 variable to 0 or 1 and redoing the logical tests. This forces logical connections between variables. For example, if 5x + 4y + z ≤ 8, x, y, z ∈ {0, 1}, then setting x = 1 forces y = 0. This leads to the inequality x + y ≤ 1.

[Figure: a 7-node example in which the assignment solution decomposes into two subtours, on nodes {1, 2, 3, 4} and {5, 6, 7}, rather than a single tour — which is why the assignment value only bounds the tour length from below.]
4 Application

4.1 Directed TSP

4.1.1 Assignment Lower Bound
Slide 20
Given a directed graph G = (N, A) with n nodes, and a cost c_ij for every arc, find a tour (a directed cycle that visits all nodes) of minimum cost.
    min Σ_{i=1}^n Σ_{j=1}^n c_ij x_ij
    s.t. Σ_{i=1}^n x_ij = 1,  j = 1, . . . , n,
         Σ_{j=1}^n x_ij = 1,  i = 1, . . . , n,
         x_ij ∈ {0, 1}.

Slide 21
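The assignment relaxation can be solved exactly and fast; a scipy sketch (assignment_bound is a hypothetical helper, and the large diagonal cost is an assumed big-M that forbids self-assignment):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def assignment_bound(cost, big=1e9):
        C = np.asarray(cost, dtype=float).copy()
        np.fill_diagonal(C, big)              # no city is assigned to itself
        rows, cols = linear_sum_assignment(C) # Hungarian-type algorithm
        return C[rows, cols].sum()            # <= length of the optimal tour

Because the assignment may split into subtours (as in the figure above), its value is a valid lower bound b(F) for branch and bound on the directed TSP.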

4.2 Improving BB
Slide 22
• Better LP solver
• Use problem structure to derive better branching strategy
• Better choice of lower bound b(F ) - better relaxation
• Better choice of upper bound U - heuristic to get good solution
• KEY: Start pruning the search tree as early as possible

MIT OpenCourseWare
http://ocw.mit.edu

6.251J / 15.081J Introduction to Mathematical Programming


Fall 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
