
OPTIMIZATION USING SEQUENTIAL QUADRATIC PROGRAMMING

A Project Report Submitted

by

NISHANT KUMAR
(1201EE25)

In Partial Fulfilment
of the Requirements for the award of the degree

BACHELOR OF TECHNOLOGY

DEPARTMENT OF ELECTRICAL ENGINEERING


INDIAN INSTITUTE OF TECHNOLOGY PATNA.
MAY 2016
THESIS CERTIFICATE

This is to certify that the work contained in the thesis titled OPTIMIZATION USING
SEQUENTIAL QUADRATIC PROGRAMMING, submitted by Nishant Kumar,
to the Indian Institute of Technology, Patna, for the award of the degree of Bachelor of
Technology, is a bona fide record of the research work done by him under my supervision.
The contents of this thesis, in full or in parts, have not been submitted to any other
Institute or University for the award of any degree or diploma.

Dr. S. Sivasubramani
Supervisor
Assistant Professor
Dept. of Electrical Engineering
IIT-Patna, 800 013

Place: Patna
Date: 1st December 2015
ACKNOWLEDGEMENTS

I would like to thank Dr. S. Sivasubramani, Assistant Professor, Department of Electrical
Engineering, IIT Patna, my Project Supervisor, for his guidance, support, motivation and
encouragement throughout the period this work was carried out. His readiness for consultation
at all times, his educative comments, his concern and assistance have been
invaluable.
Last but not least, I would also like to extend my sincere thanks to the Department of
Electrical Engineering, IIT Patna, for providing the facilities required for this investigation.

ABSTRACT

Solving optimization problems is a central theme not only in operational research but
also in several other research areas. In this thesis the basic approach to optimization
through sequential quadratic programming is covered. The quadratic approximation of
a scalar-valued function of a vector argument under appropriate assumptions is discussed.
Emphasis is placed on the formulation of a quadratic subproblem that is assumed to
reflect the local properties of the original problem.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS i

ABSTRACT ii

NOTATION iv

1 INTRODUCTION 1

2 The Basic SQP Method 3


2.1 Assumptions and Conditions . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Quadratic Approximation . . . . . . . . . . . . . . . . . . . . . . . 4
2.3 The Quadratic Subproblem . . . . . . . . . . . . . . . . . . . . . . 4

3 Unconstrained Optimization 8
3.1 Gradient Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1.1 Steepest Descent Method . . . . . . . . . . . . . . . . . . . 9
3.1.2 Conjugate Gradient Method . . . . . . . . . . . . . . . . . 10
3.1.3 Newton’s Method . . . . . . . . . . . . . . . . . . . . . . . 10
3.1.4 Inverse Hessian Update . . . . . . . . . . . . . . . . . . . . 11
3.1.5 Direct Hessian Updating . . . . . . . . . . . . . . . . . . . 11
3.2 Sequential Quadratic Method . . . . . . . . . . . . . . . . . . . . . 11
3.2.1 SQP and Inverse Hessian Update Method . . . . . . . . . . 12

4 Constrained Optimization 14

A Implementation of Quadratic Approximation 15


NOTATION

α Steplength parameter
dx Direction of next iteration
L Lagrangian function
∇ Gradient operator
H Hessian operator

CHAPTER 1

INTRODUCTION

Optimization is central to any problem involving decision making, whether in robotics,
medicine or economics. It is the act of achieving the best possible result under given
circumstances. In design, construction and maintenance, engineers have to take decisions.
The goal of all such decisions is either to minimize effort or to maximize benefit. The
task of decision making entails choosing among various alternatives. This choice is
governed by our desire to make the best decision. The measure of goodness of the
alternatives is described by an objective function or performance index. Optimization
theory and methods deal with selecting the best alternative in the sense of the given
objective function. The effort or the benefit can be usually expressed as a function of
certain design variables. Hence, optimization is the process of finding the conditions
that give the maximum or the minimum value of a function.
It is obvious that if a point x∗ corresponds to the minimum value of a function f (x),
the same point corresponds to the maximum value of the function −f (x). Thus, opti-
mization can be taken to be minimization.
There is no single method available for solving all optimization problems efficiently.
Hence, a number of methods have been developed for solving different types of prob-
lems. The most frequently occurring problems involve nonlinear constrained optimization.
The Sequential Quadratic Programming (SQP) algorithm has been one of the
most successful general methods for solving nonlinear constrained optimization problems.
As with most optimization methods, SQP is not a single algorithm but rather a
conceptual method from which numerous specific algorithms have evolved.
The nonlinear programming problem (NLP) to be solved is

minimize (over x):  f(x)
subject to:         h(x) = 0
                    g(x) ≤ 0
where f : Rn → R, h : Rn → Rm, and g : Rn → Rp. Such problems arise in a variety
of applications in science, engineering, industry and management. In the form (NLP) the
problem is quite general: it includes as special cases linear and quadratic programs, in
which the constraint functions h and g are affine and f is linear or quadratic. While
these problems are important and numerous, the great strength of the SQP method is its
ability to solve problems with nonlinear constraints. For this reason it is assumed that
(NLP) contains at least one nonlinear constraint function.
SQP meets many optimization demands, for example in the formulation and development
of a mathematical framework for the solution of the contingency constrained optimal
power flow (OPF), which minimizes the total cost of a base case operating state [1].
The basic idea of SQP is to model (NLP) at a given approximate solution, say xk, by a
quadratic programming subproblem and then to use the solution to this subproblem to
construct a better approximation xk+1. This process is iterated to create a sequence of
approximations that, it is hoped, will converge to a solution x∗. The key to understand-
ing the performance and theory of SQP is the fact that, with an appropriate choice of
quadratic subproblem, the method can be viewed as the natural extension of Newton
and quasi-Newton methods to the constrained optimization setting. Thus one would
expect SQP methods to share the characteristics of Newton-like methods, namely, rapid
convergence when the iterates are close to the solution but possible erratic behavior that
needs to be carefully controlled when the iterates are far from a solution. While this
correspondence is valid in general, the presence of constraints makes both the analysis
and implementation of SQP methods significantly more complex.

CHAPTER 2

The Basic SQP Method

2.1 Assumptions and Conditions

The basic SQP method deals with problems of the form (NLP) under the assumption
that all the functions in (NLP) are three times continuously differentiable.
The gradient of a scalar-valued function is denoted by ∇, e.g., ∇f(x).
All vectors are assumed to be column vectors and the superscript t is used to denote the
transpose.
A key function, one that plays a central role in all of the theory of constrained optimization,
is the scalar-valued Lagrangian function defined by

L(x, u, v) = f(x) + uᵗh(x) + vᵗg(x)

where u ∈ Rm and v ∈ Rp are the multiplier vectors. Now x∗ will represent any
particular local solution of (NLP). The following conditions are assumed to hold true
for each such solution.

1. The first order necessary conditions hold, i.e., there exist optimal multiplier
vectors u∗ and v ∗ ≥ 0 such that

∇L(x∗ , u∗ , v ∗ ) = ∇f (x∗ ) + ∇h(x∗ )u∗ + ∇g(x∗ )v ∗ = 0.

2. The columns of G(x∗), the matrix whose columns are the gradients of the equality constraints and of the active inequality constraints at x∗, are linearly independent.

3. Strict complementary slackness holds, i.e.,

gi(x∗)vi∗ = 0 for i = 1, ..., p and, if gi(x∗) = 0, then vi∗ > 0.

4. The Hessian of the Lagrangian function with respect to x is positive definite on
the null space of G(x∗); i.e.,

dᵗ HL∗ d > 0

for all d ≠ 0 such that G(x∗)ᵗ d = 0.
The above conditions, sometimes called the strong second order sufficient condi-
tions, in fact guarantee that x∗ is an isolated local minimum of (NLP) and that the
optimal vectors u∗ and v ∗ are unique. [2]

2.2 Quadratic Approximation

Any real or complex valued function f which is infinitely differentiable (and whose Taylor
series converges) has a series expansion with respect to a point x0 given by

f(x) = f(x0) + (1/1!) f'(x0)(x − x0) + (1/2!) f''(x0)(x − x0)² + (1/3!) f'''(x0)(x − x0)³ + ...

For approximating a function about the point x0 the higher order differential terms
are neglected. The quadratic approximation of f about x0 will be

q(x) = f(x0) + (1/1!) f'(x0)(x − x0) + (1/2!) f''(x0)(x − x0)²

The same series, when written for a scalar-valued function of a vector argument, gives the
quadratic approximation as:

q(X) = f(X0) + ∇f(X0)ᵗ(X − X0) + (1/2) (X − X0)ᵗ ∇²xx f(X0) (X − X0)

where X is the argument vector of the scalar function f and X0 is the point in the vector
space about which the approximation is carried out. [2]
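As a quick numerical illustration of this expansion, the following MATLAB sketch builds q(X) for a sample function using central finite differences for the gradient and the Hessian. The sample function, the expansion point X0 and the step size h are illustrative assumptions only (Appendix A constructs the same approximation symbolically).

% Numerical sketch of the quadratic approximation q(X) of f about X0.
f  = @(X) 3*X(1)^2 + X(2)^4;   % sample function (assumption, reused from Chapter 3)
X0 = [4; 5];                   % expansion point (assumption)
h  = 1e-4;                     % finite-difference step (tuning assumption)
n  = numel(X0);
g  = zeros(n,1);  H = zeros(n,n);
for i = 1:n
    ei = zeros(n,1);  ei(i) = h;
    g(i)   = (f(X0+ei) - f(X0-ei)) / (2*h);              % central-difference gradient
    H(i,i) = (f(X0+ei) - 2*f(X0) + f(X0-ei)) / h^2;      % diagonal Hessian entries
    for j = i+1:n
        ej = zeros(n,1);  ej(j) = h;
        H(i,j) = (f(X0+ei+ej) - f(X0+ei-ej) ...
                - f(X0-ei+ej) + f(X0-ei-ej)) / (4*h^2);  % mixed second derivatives
        H(j,i) = H(i,j);
    end
end
q = @(X) f(X0) + g'*(X - X0) + 0.5*(X - X0)'*H*(X - X0); % quadratic model of f about X0
fprintf('f = %.4f,  q = %.4f near X0\n', f(X0 + 0.1), q(X0 + 0.1));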

2.3 The Quadratic Subproblem

The SQP method is an iterative method in which, at a current iterate xk , the step to
the next iterate is obtained through information generated by solving a quadratic sub-
problem. The subproblem is assumed to reflect in some way the local properties of the
original problem. The major reason for using a quadratic subproblem, i.e., a problem
with a quadratic objective function and linear constraints, is that such problems are rel-
atively easy to solve and yet, in their objective function, can reflect the nonlinearities of

the original problem.
A major concern in SQP methods is the choice of appropriate quadratic subproblems.
At a current approximation xk a reasonable choice for the constraints is a linearization
of the actual constraints about xk . Thus the quadratic subproblem will have the form

minimize (over dx):  (rk)ᵗ dx + (1/2) dxᵗ Bk dx
subject to:          ∇h(xk)ᵗ dx + h(xk) = 0
                     ∇g(xk)ᵗ dx + g(xk) ≤ 0

where dx = x − xk. The vector rk and the symmetric matrix Bk remain to be chosen.


The most obvious choice for the objective function in this quadratic program is the local
quadratic approximation to f at xk . That is, Bk is taken as the Hessian and rk as the
gradient of f at xk .
To take nonlinearities in the constraints into account while maintaining the linearity
of the constraints in the sub-problem, the SQP method uses a quadratic model of the
Lagrangian function as the objective. This can be justified by noting that conditions 1-4
imply that x∗ is a local minimum for the problem

minimize (over x):  L(x, u∗, v∗)
subject to:         h(x) = 0
                    g(x) ≤ 0

Note that the constraint functions are included in the objective function for this equiva-
lent problem. Although the optimal multipliers are not known, approximations uk and vk
to the multipliers can be maintained as part of the iterative process. Then, given a current
iterate (xk, uk, vk), the quadratic Taylor series approximation in x for the Lagrangian
is

L(xk, uk, vk) + ∇L(xk, uk, vk)ᵗ dx + (1/2) dxᵗ HL(xk, uk, vk) dx
A strong motivation for using this function as the objective function in the quadratic
subproblem is that it generates iterates that are identical to those generated by Newton’s
method when applied to the system composed of the first order necessary condition
(condition 1) and the constraint equations (including the active inequality constraints).
This means that the resulting algorithm will have good local convergence properties. In

spite of these local convergence properties there are good reasons to consider choices
other than the actual Hessian of the Lagrangian, for example approximating matrices
that have properties that permit the quadratic subproblem to be solved at any xk and the
resulting algorithm to be amenable to a global convergence analysis. Letting Bk be an
approximation of HL(xk, uk, vk), we can write the quadratic subproblem as:

minimize (over dx):  ∇L(xk, uk, vk)ᵗ dx + (1/2) dxᵗ Bk dx
subject to:          ∇h(xk)ᵗ dx + h(xk) = 0
                     ∇g(xk)ᵗ dx + g(xk) ≤ 0

The form of the quadratic subproblem most often found in the literature, and the one
that will be employed here, denoted (QP), is

minimize (over dx):  ∇f(xk)ᵗ dx + (1/2) dxᵗ Bk dx
subject to:          ∇h(xk)ᵗ dx + h(xk) = 0
                     ∇g(xk)ᵗ dx + g(xk) ≤ 0

These two forms are equivalent for problems with only equality constraints since, by
virtue of the linearized constraints, the term ∇h(xk)ᵗ dx is constant and the objective
function becomes ∇f(xk)ᵗ dx + (1/2) dxᵗ Bk dx. The two subproblems are not quite equivalent
in the inequality-constrained case unless the multiplier estimate vk is zero for all
inactive linear constraints. However, (QP) is equivalent to the Lagrangian-based subproblem
above for the slack-variable formulation of (NLP) given by

minimize (over x, z):  f(x)
subject to:            h(x) = 0
                       g(x) + z = 0
                       z ≥ 0

where z ∈ Rp is the vector of slack variables. Therefore (QP) can be considered an


appropriate quadratic subproblem for (NLP).
The solution dx of (QP) can be used to generate a new iterate xk+1 , by taking a step
from xk in the direction of dx . But to continue to the next iteration new estimates for

the multipliers are needed. There are several ways in which these can be chosen, but
one obvious approach is to use the optimal multipliers of the quadratic subproblem.
Letting the optimal multipliers of (QP) be denoted by uqp and vqp, and setting

du = uqp − uk

dv = vqp − v k

allows the updates of (x, u, v) to be written in the compact form

xk+1 = xk + αdx

uk+1 = uk + αdu

v k+1 = v k + αdv

for some selection of the steplength parameter α. Once the new iterates are constructed,
the problem functions and derivatives are evaluated and a prescribed choice of
Bk+1 is calculated. [2]
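For concreteness, a single iteration of this update can be sketched in MATLAB by solving (QP) with quadprog (which requires the Optimization Toolbox). The problem data below (f, h, g, their gradients, the starting point, the multiplier estimates and Bk = I) are illustrative assumptions rather than anything taken from the references, and the sign convention of the multipliers returned by quadprog may need adjusting to match the Lagrangian used here.

% One SQP iteration for: minimize f(x) s.t. h(x) = 0, g(x) <= 0  (illustrative data)
f     = @(x) x(1)^2 + x(2)^2;            % assumed objective
gradf = @(x) [2*x(1); 2*x(2)];
h     = @(x) x(1) + x(2) - 1;            % assumed equality constraint
gradh = @(x) [1; 1];
g     = @(x) -x(1);                      % assumed inequality constraint, g(x) <= 0
gradg = @(x) [-1; 0];

xk = [2; 2];  Bk = eye(2);  alpha = 1;   % current iterate, Hessian approximation, steplength
uk = 0;  vk = 0;                         % current multiplier estimates

% Quadratic subproblem (QP): min gradf'*dx + 0.5*dx'*Bk*dx
%   s.t. gradh'*dx + h = 0  and  gradg'*dx + g <= 0
[dx,~,~,~,lambda] = quadprog(Bk, gradf(xk), gradg(xk)', -g(xk), ...
                             gradh(xk)', -h(xk));

du = lambda.eqlin   - uk;                % multiplier steps from the QP multipliers
dv = lambda.ineqlin - vk;
xk1 = xk + alpha*dx;                     % compact update of (x, u, v)
uk1 = uk + alpha*du;
vk1 = vk + alpha*dv;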

CHAPTER 3

Unconstrained Optimization

To understand the benefit of optimizing using SQP, unconstrained optimization will first
be considered. The problem statement is

minimize (over x):  f(x)

where f(x) can be any linear or nonlinear function to be minimized under no
constraints. There are several algorithms for the unconstrained minimization of a
multivariable function; a few of the widely used ones are:

• Steepest Descent method

• Conjugate Gradient method

• Newton’s method

• Quasi Newton’s method


• Inverse Hessian Updating - (DFP Method)
• Direct Hessian Updating - (BFGS Method)

3.1 Gradient Methods

Gradient methods rely on an important idea, iterative descent, which works as follows to
minimize this problem.

• Start at some point x0 .

• Generate x1 , x2 ,..., such that f is decreased at each iteration.

f (xk+1 ) < f (xk )

where xk+1 = xk + αdk


α- step length (scalar)
dk - search direction

• The steps are repeated successively till ∇f(xk) = 0.


Now
f (xk + αdk ) < f (xk )

By Taylor series (first order),

f (xk ) + α∇f (xk )T dk < f (xk )

α∇f (xk )T dk < 0

Since α > 0,
∇f (xk )T dk < 0

After the computation of dk at each step, the next point for the iteration is obtained by:

xk+1 = xk + αdk

For the evaluation of α, the step length chosen so as to obtain the largest possible decrease
of the function along the direction dk, Armijo's Rule is implemented. Define

φ(α) = f(xk + αdk)

The value of α is initially chosen large, here α = 1. Then the following condition is checked:

φ(α) ≤ φ(0) + σ φ'(0) α

where σ ∈ (0, 1); here σ = 1/2 is used.

If the above condition is not satisfied, then α is multiplied by β < 1 (here β = 0.98).
With the updated value of α the inequality is checked again, and the process is carried
out iteratively till the condition is met.
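A minimal MATLAB sketch of this backtracking rule, with σ = 1/2 and β = 0.98 as stated above; the function and gradient handles are assumed inputs and dk is assumed to be a descent direction.

function alpha = ArmijoStep(f, gradf, xk, dk)
% Backtracking line search using Armijo's rule as described above.
sigma = 0.5;                    % sufficient-decrease parameter
beta  = 0.98;                   % shrink factor used when the test fails
alpha = 1;                      % start with a large step
phi0  = f(xk);
dphi0 = gradf(xk)'*dk;          % phi'(0); negative when dk is a descent direction
while f(xk + alpha*dk) > phi0 + sigma*alpha*dphi0
    alpha = beta*alpha;         % shrink the step and test again
end
end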
The different gradient methods are classified on the basis of the computation of dk.

3.1.1 Steepest Descent Method

dk = −∇f (xk )
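Combining this direction with the Armijo rule of Section 3.1 gives the sketch below, shown on the first test function of Section 3.2.1; the starting point and tolerance follow Table 3.1, but this is only an illustration, not the exact code used to produce the tables.

% Steepest descent with Armijo backtracking on f = 3x^2 + y^4
f     = @(x) 3*x(1)^2 + x(2)^4;
gradf = @(x) [6*x(1); 4*x(2)^3];
xk = [4; 5];  eps_tol = 1e-5;            % starting point and tolerance from Table 3.1
sigma = 0.5;  beta = 0.98;               % Armijo parameters from Section 3.1
while norm(gradf(xk)) > eps_tol
    gk = gradf(xk);
    dk = -gk;                            % steepest descent direction
    alpha = 1;                           % Armijo backtracking
    while f(xk + alpha*dk) > f(xk) + sigma*alpha*(gk'*dk)
        alpha = beta*alpha;
    end
    xk = xk + alpha*dk;
end
disp(xk)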

3.1.2 Conjugate Gradient Method

dk = −∇f (xk ) + βk dk−1

where

βk = ( ‖∇f(xk)‖ / ‖∇f(xk−1)‖ )²
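The same loop with the conjugate direction update is sketched below; the βk formula follows the gradient-norm ratio above (the Fletcher–Reeves form), and the steepest-descent restart at the end of each iteration is a common safeguard that is not part of the text.

% Conjugate gradient method on the same test function
f     = @(x) 3*x(1)^2 + x(2)^4;
gradf = @(x) [6*x(1); 4*x(2)^3];
xk = [4; 5];  eps_tol = 1e-5;
gk = gradf(xk);  dk = -gk;               % first direction is the steepest descent one
while norm(gk) > eps_tol
    alpha = 1;                           % Armijo backtracking as in Section 3.1
    while f(xk + alpha*dk) > f(xk) + 0.5*alpha*(gk'*dk)
        alpha = 0.98*alpha;
    end
    xk  = xk + alpha*dk;
    gk1 = gradf(xk);
    betak = (norm(gk1)/norm(gk))^2;      % beta_k from the gradient-norm ratio above
    dk = -gk1 + betak*dk;                % conjugate direction
    if gk1'*dk >= 0, dk = -gk1; end      % safeguard restart (not part of the text)
    gk = gk1;
end
disp(xk)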

3.1.3 Newton’s Method

f(x) is expanded using Taylor's series around its optimum, with ∆x the step to the optimum:

f(x + ∆x) = f(x) + ∇f(x)ᵀ ∆x + (1/2) ∆xᵀ ∇²f(x) ∆x

Since the gradient vanishes at the optimum, ∇f(x + ∆x) = 0, which for this quadratic model gives (with H = ∇²f(x))

∇f(x) + H ∆x = 0

∆x = −H⁻¹ ∇f(x)

xk+1 = xk − Hk⁻¹ ∇f(xk)

dk = −Hk⁻¹ ∇f(xk)

Since the basic condition of Descent is

∇f (xk )T dk < 0

for Newton's method it becomes

−∇f(xk)ᵀ Hk⁻¹ ∇f(xk) < 0

which implies that Hk must be positive definite.


To avoid having to check and ensure that the Hessian matrix is positive definite at each
step, quasi-Newton methods are used.
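A sketch of the Newton iteration on the first test function of Section 3.2.1, with the gradient and Hessian coded by hand; the full step (α = 1) is used, as in the update formula above.

% Newton's method on f = 3x^2 + y^4 with analytic gradient and Hessian
f     = @(x) 3*x(1)^2 + x(2)^4;
gradf = @(x) [6*x(1); 4*x(2)^3];
hessf = @(x) [6, 0; 0, 12*x(2)^2];
xk = [4; 5];  eps_tol = 1e-5;
while norm(gradf(xk)) > eps_tol
    dk = -(hessf(xk) \ gradf(xk));       % Newton direction d_k = -H_k^{-1} grad f(x_k)
    xk = xk + dk;                        % full Newton step (alpha = 1)
end
disp(xk)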

3.1.4 Inverse Hessian Update

At the start of the iterations the inverse Hessian approximation is taken to be the identity,
(Hk)⁻¹ = I; it is then updated as the iterations proceed:

(Hk+1)⁻¹ = (Hk)⁻¹ + (sk skᵗ)/(sk · yk) − (zk zkᵗ)/(yk · zk)

where sk = αdk; yk = ∇f(xk) − ∇f(xk−1); zk = (Hk)⁻¹ yk.
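Written as a small MATLAB function the update reads as follows; Hinv is the current inverse-Hessian approximation and the other inputs follow the definitions of sk and yk given above.

function Hinv_new = InverseHessianUpdate(Hinv, sk, yk)
% One inverse-Hessian (DFP-type) update as in Section 3.1.4.
% sk = alpha*dk, yk = gradient difference between successive iterates.
zk = Hinv*yk;                                            % z_k = H_k^{-1} y_k
Hinv_new = Hinv + (sk*sk')/(sk'*yk) - (zk*zk')/(yk'*zk);
end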

3.1.5 Direct Hessian Updating

Initially the Hessian approximation is taken to be the identity, Hk = I; the Hessian
matrix is then updated as the iterations proceed:

Hk+1 = Hk + (yk ykᵗ)/(sk · yk) + (∇f(xk) ∇f(xk)ᵗ)/(∇f(xk) · dk)

where sk = αdk; yk = ∇f(xk) − ∇f(xk−1).
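The corresponding direct update can be sketched the same way; with dk = −Hk⁻¹∇f(xk) the second term above reproduces the usual BFGS correction, since ∇f(xk)·dk is negative for a descent direction.

function H_new = DirectHessianUpdate(H, sk, yk, gk, dk)
% One direct Hessian (BFGS-type) update as in Section 3.1.5.
% gk = grad f(x_k) before the step, dk = search direction, sk = alpha*dk,
% yk = gradient difference; with dk = -H\gk this matches the standard BFGS update.
H_new = H + (yk*yk')/(sk'*yk) + (gk*gk')/(gk'*dk);
end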

3.2 Sequential Quadratic Method

In Sequential Quadratic Programming the function to be minimized is quadratically


approximated at each step.

Q(X) = f(X0) + ∇f(X0)ᵗ(X − X0) + (1/2) (X − X0)ᵗ ∇²xx f(X0) (X − X0)

Then the approximated function is minimized at each step by

∇Q(X) = 0

which results in a set of n linear equations:

a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
a21 x1 + a22 x2 + a23 x3 + ... + a2n xn = b2
                   ...
an1 x1 + an2 x2 + an3 x3 + ... + ann xn = bn

where n is the dimension of the input variable.


After solving the set of linear equations a new point is obtained, which will be the
starting point for the next iteration.
This process is repeated till ‖∇f(X)‖ < ε, where ε is the threshold value.

Algorithm for unconstrained SQP


• Start at xk where k = 1, 2, 3, ...

• Check whether ‖∇f(xk)‖ ≤ ε. If yes, then the point is the required optimizer. Else
proceed to the next step.

• Compute the Quadratic approximation of f (x) at point xk .

• Solve the set of linear equations obtained by setting ∇Q(X) = 0 at the point xk.

• The point so obtained is xk+1. The above steps are then repeated iteratively from the convergence check. A MATLAB sketch of this loop is given below.
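The sketch below runs this loop on the second test function of Section 3.2.1, with the gradient and Hessian coded by hand rather than built symbolically as in Appendix A; it should behave similarly to, but need not exactly reproduce, the SQP column of Table 3.2.

% Unconstrained SQP loop of Section 3.2 on f = 3y^6 + x^8 + 1
f     = @(x) 3*x(2)^6 + x(1)^8 + 1;
gradf = @(x) [8*x(1)^7; 18*x(2)^5];
hessf = @(x) [56*x(1)^6, 0; 0, 90*x(2)^4];
xk = [3; 2];  eps_tol = 1e-3;            % starting point and threshold from Table 3.2
while norm(gradf(xk)) > eps_tol          % step 2: convergence check
    % steps 3-4: grad Q(X) = 0 gives the linear system hessf(xk)*dx = -gradf(xk)
    dx = -(hessf(xk) \ gradf(xk));
    xk = xk + dx;                        % step 5: new point for the next iteration
end
disp(xk), disp(f(xk))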

3.2.1 SQP and Inverse Hessian Update Method

To compare the performance of SQP and the Inverse Hessian Update method, two
sample examples were taken:

min f = 3x² + y⁴

Here x0 = [4; 5]; ε = 10⁻⁵.

Table 3.1

Parameters       Sequential Quadratic Programming    Inverse Hessian Update
x1               0                                   −9.20657 × 10⁻⁷
x2               0.01141                             0.00209
f                1.69982 × 10⁻⁸                      2.191121 × 10⁻¹¹
Iterations       16                                  20
Time elapsed     2.444299 sec                        2.984575 sec

min f = 3y⁶ + x⁸ + 1

Here x0 = [3; 2]; ε = 10⁻³.

Table 3.2

Parameters       Sequential Quadratic Programming    Inverse Hessian Update
x1               0.2546                              0.15750
x2               0.05629                             −0.14078
f                1.00001778                          1.00002373
Iterations       17                                  276
Time elapsed     2.106474 sec                        39.571973 sec

In Table 3.1 the results were quite comparable, i.e.,

• The number of iterations carried out for the minimization of the function 3x² + y⁴
was 16 for SQP and 20 for Inverse Hessian Update.

• The run times of the two algorithms were close.

In Table 3.2 the results differ greatly.

• The number of iterations was 17 for SQP and 276 for Inverse Hessian Update.

• The run times of the algorithms differed greatly due to the difference in the number
of iterations carried out.

CHAPTER 4

Constrained Optimization

The constrained optimization of the objective function is carried out by using the SQP
algorithm.

4.1 Equality Constrained Optimization


APPENDIX A

Implementation of Quadratic Approximation

Matlab Codes

function QuadraticFunction
f_str = input('enter the function: ', 's');
f = inline(f_str);
n = nargin(f);           %% provides no. of arguments
X_char = argnames(f);    %% X_char is the nx1 matrix of argument names
X = sym('X', [n,1]);     %% declare matrix X of dimension nx1 as sym
i = 1;
while (i <= n)
    X(i,1) = X_char(i,1);
    i = i+1;
end
Xk = input('suggest a point: ');

G = Grad(f, X, Xk)
F = Fvalue(f, X, Xk)
H = Hvalue(f, X, Xk)

q = F + transpose(G)*(X-Xk) +1/2*transpose(X-Xk)*H*(X-Xk)
end

function func_val = Fvalue(f, arg_vector, Xc)


n = nargin(f);
syms temp2;
m = 1;
temp2 = f;
while(m <= n)
temp2 = subs(temp2, arg_vector(m,1), Xc(m,1));
m = m+1;
end
func_val = temp2;
end

function grad_val = Grad(F1, arg_vector, Xc)


n = nargin(F1);
temp1 = sym('temp', [n,1]);
F = formula(F1);
temp1 = gradient(F, arg_vector);   %% temp1 is the gradient column vector
j = 1;
k = 1;
while (j <= n)
    % j    %% for checking the iterations
    while (k <= n)
        % k
        temp1(j,1) = subs(temp1(j,1), arg_vector(k,1), Xc(k,1));
        k = k+1;
    end
    k = 1;
    j = j+1;
end

grad_val = temp1;
end

function hes_val = Hvalue(f, arg_vector, Xc)


n = nargin(f);
F = formula(f);
temp3 = hessian(F, arg_vector);
i =1;
j =1;

k =1;
while(i<=n)
    % i
    while(j<=n)
        % j
        while(k<=n)
            % k
            temp3(i,j) = subs(temp3(i,j), arg_vector(k,1), Xc(k,1));
            k = k+1;
        end
        k = 1;
        j = j+1;
    end
    j = 1;
    i = i+1;
end
hes_val = temp3;
end

REFERENCES

[1] S. Pajic, “Sequential quadratic programming-based contingency constrained optimal power flow,” 2003.

[2] P. T. Boggs and J. W. Tolle, “Sequential quadratic programming,” Acta Numerica, vol. 4, pp. 1–51, 1995.

