
Nonlinear Programming 20 34

Uploaded by

villegasm10002
Copyright
© © All Rights Reserved

Chapter One

The Nonlinear Programming Problem, Preliminary Concepts, and Notation

1. The nonlinear programming problem†

The nonlinear programming problem that will concern us has three fundamental ingredients: a finite number of real variables, a finite number of constraints which the variables must satisfy, and a function of the variables which must be minimized (or maximized). Mathematically speaking we can state the problem as follows: Find specific values $(\bar{x}_1, \ldots, \bar{x}_n)$, if they exist, of the variables $(x_1, \ldots, x_n)$ that will satisfy the inequality constraints

$$g_i(x_1, \ldots, x_n) \leq 0, \quad i = 1, \ldots, m \tag{1}$$

the equality constraints

$$h_j(x_1, \ldots, x_n) = 0, \quad j = 1, \ldots, k \tag{2}$$

and minimize (or maximize) the objective function

$$\theta(x_1, \ldots, x_n) \tag{3}$$

over all values of $x_1, \ldots, x_n$ satisfying 1 and 2. Here, $g_i$, $h_j$, and $\theta$ are numerical functions‡ of the variables $x_1, \ldots, x_n$, which are defined for all finite values of the variables. The fundamental difference between this problem and that of the classical constrained minimization problem of the ordinary calculus [Courant 47, Fleming 65]§ is the presence of the inequalities 1. As such, inequalities will play a crucial role in nonlinear programming and will be studied in some detail.

† In order to introduce the problem in the first section of the book, some undefined terms (function, real variable, constraints, etc.) must be interpreted intuitively for the time being. The problem will be stated rigorously at the end of this chapter (see 1.6.9 to 1.6.12).

‡ The concept of a numerical function will be defined precisely in Sec. 1.5. For the present by a numerical function of $x_1, \ldots, x_n$ we mean a correspondence which assigns a single real number for each n-tuple of real values that the variables $x_1, \ldots, x_n$ assume.

§ This refers to the works by Courant, written in 1947, and by Fleming, written in 1965, as listed in the Bibliography at the back of the book. This system of references will be used throughout the book with one exception: [Gordan 73] refers to Gordan's paper written in 1873.

As an example of the above problem consider the case shown in Fig. 1.1.1. Here we have $n = 2$ (two variables $x_1, x_2$), $m = 3$ (three inequality constraints), and $k = 1$ (one equality constraint). Each curve in Fig. 1.1.1 is obtained by setting some numerical function equal to a real number such as $\theta(x_1, x_2) = 5$ or $g_2(x_1, x_2) = 0$. The little arrows on the curves $g_i(x_1, x_2) = 0$ indicate the side in the direction of which $g_i$ increases, and hence all $(x_1, x_2)$ must lie on the opposite side of these curves if they are to satisfy 1. All such $(x_1, x_2)$ lie in the shaded area of Fig. 1.1.1. To satisfy 2, $(x_1, x_2)$ must lie on the curve $h_1(x_1, x_2) = 0$. The solution to the problem is $(\bar{x}_1, \bar{x}_2)$. This is the point on the curve $h_1(x_1, x_2) = 0$ at which $\theta$ assumes its lowest value over the set of all $(x_1, x_2)$ satisfying $g_i(x_1, x_2) \leq 0$, $i = 1, 2, 3$. In more complicated situations where $n$, $m$, and $k$ may be large, it will not be easy to solve the above problem. We shall then be concerned with obtaining necessary and/or sufficient conditions that a point $(x_1, \ldots, x_n)$ must satisfy in order for it to solve the nonlinear programming problem 1 to 3. These optimality conditions form the crux of nonlinear programming.

Fig. 1.1.1 A typical nonlinear programming problem in two variables $(x_1, x_2)$.

In dealing with problems of the above type we shall confine ourselves to minimization problems only. Maximization problems can be easily converted to minimization problems by employing the identity

$$\text{maximum } \theta(x_1, \ldots, x_n) = -\text{minimum } [-\theta(x_1, \ldots, x_n)]$$
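The max/min identity can be checked numerically on a finite sample of points. The following is only a sketch: the objective `theta` and the grid of candidate points are hypothetical choices made for illustration and do not come from the text.

```python
# Check that maximum theta = -minimum(-theta) over a finite set of points.
# `theta` and `grid` are hypothetical, chosen only for illustration.

def theta(x1, x2):
    """A sample objective with its largest value at (1.0, -0.5)."""
    return -(x1 - 1.0) ** 2 - (x2 + 0.5) ** 2 + 3.0

# Candidate points (x1, x2) on a coarse grid covering [-2, 2] x [-2, 2].
grid = [(i / 4.0, j / 4.0) for i in range(-8, 9) for j in range(-8, 9)]

maximum = max(theta(x1, x2) for x1, x2 in grid)
minimum_of_negative = min(-theta(x1, x2) for x1, x2 in grid)

print(maximum, -minimum_of_negative)
```

Both printed values coincide, which is why it suffices to treat minimization problems only.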
Problem
Solve graphically as indicated in Fig. 1.1.1 the following nonlinear
programming problem:
minimize $(-x_1 - x_2)$
subject to

2. Sets and symbols


We shall use some symbols and elementary concepts from set theory [Anderson-Hall 63, Hamilton-Landin 61, Berge 63]. In particular a set $\Gamma$ is a collection of objects of any kind which are by definition elements or points of $\Gamma$. For example if we let $R$ (the reals or the real line) denote the set of all real numbers, then 7 is an element or point of $R$. We use the symbol $\in$ to denote the fact that an element belongs to a set. For example we write $7 \in R$. For simplicity we also write sometimes $5, 7 \in R$ instead of $5 \in R$ and $7 \in R$.
If $\Gamma$ and $\Lambda$ are two sets, we say that $\Gamma$ is contained in $\Lambda$, $\Gamma$ is in $\Lambda$, $\Gamma$ is a subset of $\Lambda$, or $\Lambda$ contains $\Gamma$, if each element of $\Gamma$ is also an element of $\Lambda$, and we write $\Gamma \subset \Lambda$ or $\Lambda \supset \Gamma$. If $\Gamma \subset \Lambda$ and $\Lambda \subset \Gamma$ we write $\Gamma = \Lambda$. A slash across a symbol denotes its negation. Thus $x \notin \Gamma$ and $\Gamma \not\subset \Lambda$ denote respectively that $x$ is not an element of $\Gamma$ and that $\Gamma$ is not a subset of $\Lambda$. The empty set is the set which contains no elements and is denoted by $\emptyset$. We denote a set sometimes by $\{x, y, z\}$, if the set is formed by the elements $x$, $y$, $z$. Sometimes a set is characterized by a property that its elements must have, in which case we write

$$\{x \mid x \text{ satisfying property } P\}$$

For example the set of all nonnegative real numbers can be written as

$$\{x \mid x \in R, \; x \geq 0\}$$
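The set of elements belonging to either of two sets $\Gamma$ or $\Lambda$ is called the union of the sets $\Gamma$ and $\Lambda$ and is denoted by $\Gamma \cup \Lambda$. We have then

$$\Gamma \cup \Lambda = \{x \mid x \in \Gamma \text{ or } x \in \Lambda\}$$

The set of elements belonging to at least one of the sets of the (finite or infinite) family of sets $(\Gamma_i)_{i \in I}$ is called the union of the family and is denoted by $\bigcup_{i \in I} \Gamma_i$. Then

$$\bigcup_{i \in I} \Gamma_i = \{x \mid x \in \Gamma_i \text{ for some } i \in I\}$$

The set of elements belonging to both sets $\Gamma$ and $\Lambda$ is called the intersection of the sets $\Gamma$ and $\Lambda$ and is denoted by $\Gamma \cap \Lambda$. We then have

$$\Gamma \cap \Lambda = \{x \mid x \in \Gamma \text{ and } x \in \Lambda\}$$

The set of elements belonging to all the sets of the (finite or infinite) family of sets $(\Gamma_i)_{i \in I}$ is called the intersection of the family and is denoted by $\bigcap_{i \in I} \Gamma_i$. Then

$$\bigcap_{i \in I} \Gamma_i = \{x \mid x \in \Gamma_i \text{ for each } i \in I\}$$

Two sets $\Gamma$ and $\Lambda$ are disjoint if they do not intersect, that is, if $\Gamma \cap \Lambda = \emptyset$.
The difference of the sets $\Lambda$ and $\Gamma$ is the set of those elements of $\Lambda$ not contained in $\Gamma$ and is denoted by $\Lambda \sim \Gamma$. We have then

$$\Lambda \sim \Gamma = \{x \mid x \in \Lambda, \; x \notin \Gamma\}$$

In the above it is not assumed in general that $\Gamma \subset \Lambda$. If however $\Gamma \subset \Lambda$, then $\Lambda \sim \Gamma$ is called the complement of $\Gamma$ relative to $\Lambda$.
The product of two sets $\Gamma$ and $\Lambda$, denoted by $\Gamma \times \Lambda$, is defined as the set of ordered pairs $(x, y)$ of which $x \in \Gamma$ and $y \in \Lambda$. We have then

$$\Gamma \times \Lambda = \{(x, y) \mid x \in \Gamma, \; y \in \Lambda\}$$

Fig. 1.2.1 The product $\Gamma \times \Lambda$ of the sets $\Gamma$ and $\Lambda$.

The product of $n$ sets $\Gamma_1, \ldots, \Gamma_n$, denoted by $\Gamma_1 \times \Gamma_2 \times \cdots \times \Gamma_n$, is defined as the set of ordered n-tuples $(x_1, \ldots, x_n)$ of which $x_1 \in \Gamma_1, \ldots, x_n \in \Gamma_n$. We have then

$$\Gamma_1 \times \Gamma_2 \times \cdots \times \Gamma_n = \{(x_1, \ldots, x_n) \mid x_1 \in \Gamma_1, \ldots, x_n \in \Gamma_n\}$$

If $\Gamma_1 = \Gamma_2 = \cdots = \Gamma_n = \Gamma$, then we write $\Gamma^n = \Gamma \times \Gamma \times \cdots \times \Gamma$.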


If we let

then

Figure 1.2.1 depicts the set $\Gamma \times \Lambda$. The set $R^2 = R \times R$, which can be represented by points on a plane, is called the Euclidean plane.
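For finite sets, the operations of this section map directly onto Python's built-in `set` type. The sketch below uses two small hypothetical sets, named `gamma` and `lam` here purely for illustration.

```python
from itertools import product

# Two small hypothetical finite sets standing in for the sets of the text.
gamma = {1, 2, 3}
lam = {3, 4}

union = gamma | lam                    # elements in gamma or in lam
intersection = gamma & lam             # elements in both gamma and lam
difference = lam - gamma               # elements of lam not contained in gamma
cartesian = set(product(gamma, lam))   # ordered pairs (x, y), x in gamma, y in lam

is_subset = {1, 2} <= gamma            # subset test
are_disjoint = gamma.isdisjoint({7, 8})  # True when the intersection is empty

print(union, intersection, difference, len(cartesian))
```

The product contains one ordered pair for each choice of an element from each set, so here it has 3 × 2 = 6 elements.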
The following symbols will also be used:

$(\forall x)$ reads for each $x$
$(\exists x)$ reads there exists an $x$ such that
$\Rightarrow$ reads implies
$\Leftarrow$ reads is implied by
$\Leftrightarrow$ reads is equivalent to
(A slash [/] across any one of the last three symbols denotes its negation.)

For example the statement "for each $x$ there exists a $y$ such that $\theta(x, y) = 1$" can be written as

$$(\forall x)(\exists y): \; \theta(x, y) = 1$$

The negation of the above statement can be automatically written as

$$(\exists x)(\forall y): \; \theta(x, y) \neq 1$$

Frequently we shall refer to a certain relationship such as an equation or an inequality by a number or Roman numeral such as I or II. The notation I $\Rightarrow$ II means relationship I implies relationship II. An overbar on I or II ($\overline{\mathrm{I}}$ or $\overline{\mathrm{II}}$) denotes the negation of the relationship referred to by that numeral. Obviously then the statement that I $\Rightarrow$ II is logically equivalent to $\overline{\mathrm{I}} \Leftarrow \overline{\mathrm{II}}$. Thus

$$(\mathrm{I} \Rightarrow \mathrm{II}) \Leftrightarrow (\overline{\mathrm{I}} \Leftarrow \overline{\mathrm{II}})$$
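Over finite ranges, "for each" and "there exists" correspond to Python's `all` and `any`, which makes the mechanical negation rule easy to test. In this sketch the function `theta` and the ranges `X` and `Y` are hypothetical.

```python
# Quantifiers over finite ranges: all(...) plays "for each", any(...) plays "there exists".
X = range(-3, 4)   # hypothetical finite set of x values
Y = range(-3, 4)   # hypothetical finite set of y values

def theta(x, y):
    """A sample numerical function of two variables (an assumption)."""
    return x + y

# "for each x there exists a y such that theta(x, y) = 1"
statement = all(any(theta(x, y) == 1 for y in Y) for x in X)
# Its negation: "there exists an x such that for each y, theta(x, y) != 1"
negation = any(all(theta(x, y) != 1 for y in Y) for x in X)

def implies(p, q):
    """Truth value of 'p implies q'."""
    return (not p) or q

# (I implies II) is equivalent to (not II implies not I) for every truth assignment.
contrapositive_holds = all(
    implies(p, q) == implies(not q, not p)
    for p in (False, True)
    for q in (False, True)
)

print(statement, negation, contrapositive_holds)
```

Here the statement fails (for x = -3 the required y = 4 is outside `Y`), so its negation holds; exactly one of the two is always true.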

3. Vectors
n-vector
An n-vector or n-dimensional vector $x$, for any positive integer $n$, is an n-tuple $(x_1, \ldots, x_n)$ of real numbers. The real number $x_i$ is referred to as the $i$th component or element of the vector $x$.
$R^n$
The n-dimensional (real) Euclidean space $R^n$, for any positive integer $n$, is the set of all n-vectors.
The notation $x \in R^n$ means that $x$ is an element of $R^n$, and hence, $x$ is an n-vector. Frequently we shall also refer to $x$ as a point in $R^n$. $R^1$, or simply $R$, is then the Euclidean line (the set of all real numbers), $R^2$ is the Euclidean plane (the set of all ordered pairs of real numbers), and $R^n = R \times R \times \cdots \times R$ ($n$ times).
Vector addition and multiplication by a real number
Let $x, y \in R^n$ and $\alpha \in R$. The sum $x + y$ is defined by

$$x + y = (x_1 + y_1, \ldots, x_n + y_n)$$

and the multiplication by a real number $\alpha x$ is defined by

$$\alpha x = (\alpha x_1, \ldots, \alpha x_n)$$

Linear dependence and independence


The vectors $x^1, \ldots, x^m \in R^n$ are said to be linearly independent if

$$\lambda^1 x^1 + \cdots + \lambda^m x^m = 0, \; \lambda^1, \ldots, \lambda^m \in R \;\Rightarrow\; \lambda^1 = \cdots = \lambda^m = 0$$

otherwise they are linearly dependent. (Here and elsewhere 0 denotes the real number zero or a vector each element of which is zero.)
Linear combination
The vector $x \in R^n$ is a linear combination of $x^1, \ldots, x^m \in R^n$ if

$$x = \lambda^1 x^1 + \cdots + \lambda^m x^m \quad \text{for some } \lambda^1, \ldots, \lambda^m \in R$$

and it is a nonnegative linear combination of $x^1, \ldots, x^m$ if in addition to the above equality $\lambda^1, \ldots, \lambda^m \geq 0$. The numbers $\lambda^1, \ldots, \lambda^m$ are called weights.
The above concepts involving vector addition and multiplication by a scalar define the vector space structure of $R^n$. They are not enough however to define the concept of distance. For that purpose we introduce the scalar product of two vectors.

Scalar product
The scalar product $xy$ of two vectors $x, y \in R^n$ is defined by

$$xy = x_1 y_1 + \cdots + x_n y_n$$

Norm of a vector
The norm $\|x\|$ of a vector $x \in R^n$ is defined by

$$\|x\| = +(xx)^{1/2}$$

Cauchy-Schwarz inequality
Let $x, y \in R^n$. Then

$$|xy| \leq \|x\| \, \|y\|$$

where $|xy|$ is the absolute value of the real number $xy$.

PROOF Let $x, y \in R^n$ be fixed. For any $\alpha \in R$

$$(\alpha x + y)(\alpha x + y) = xx(\alpha)^2 + 2xy\alpha + yy \geq 0$$

Hence the roots of the quadratic equation in $\alpha$

$$xx(\alpha)^2 + 2xy\alpha + yy = 0$$

cannot be distinct real numbers, and so

$$(xy)^2 \leq (xx)(yy)$$

which implies the Cauchy-Schwarz inequality. ∎
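The scalar product, the norm, and the Cauchy-Schwarz inequality can be spot-checked numerically. This is a minimal sketch; the helper names `scalar_product` and `norm` and the sample vectors are assumptions introduced for illustration.

```python
import math

def scalar_product(x, y):
    """xy = x_1*y_1 + ... + x_n*y_n for two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """The nonnegative square root of the scalar product of x with itself."""
    return math.sqrt(scalar_product(x, x))

# Two hypothetical vectors in R^3.
x = (1.0, -2.0, 3.0)
y = (4.0, 0.0, -1.0)

lhs = abs(scalar_product(x, y))   # |xy|
rhs = norm(x) * norm(y)           # ||x|| ||y||

print(lhs, rhs, lhs <= rhs)
```

A single pair of vectors is of course no proof; the check only illustrates what the inequality asserts.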

Distance between two points


Let $x, y \in R^n$. The nonnegative number $\delta(x, y) = \|x - y\|$ is called the distance between the two points $x$ and $y$ in $R^n$.

Problem
Establish the fact that $R^n$ is a metric space by showing that $\delta(x, y)$ satisfies the following conditions

$$\delta(x, y) \geq 0$$
$$\delta(x, y) = 0 \;\Leftrightarrow\; x = y$$
$$\delta(x, y) = \delta(y, x)$$
$$\delta(x, z) \leq \delta(x, y) + \delta(y, z) \quad \text{(triangle inequality)}$$

(Hint: Use the Cauchy-Schwarz inequality to establish the triangle inequality.)

Angle between two vectors


Let $x$ and $y$ be two nonzero vectors in $R^n$. The angle $\psi$ between $x$ and $y$ is defined by the formula

$$\cos \psi = \frac{xy}{\|x\| \, \|y\|}, \quad 0 \leq \psi \leq \pi$$

This definition of angle agrees for $n = 2, 3$ with the one in analytic geometry. The nonzero vectors $x$ and $y$ are orthogonal if $xy = 0$ ($\psi = \pi/2$); form an acute angle with each other if $xy \geq 0$ ($0 \leq \psi \leq \pi/2$), a strict acute angle if $xy > 0$ ($0 \leq \psi < \pi/2$), an obtuse angle if $xy \leq 0$ ($\pi/2 \leq \psi \leq \pi$), and a strict obtuse angle if $xy < 0$ ($\pi/2 < \psi \leq \pi$).
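The angle between two nonzero vectors follows from the scalar product and the norm. In the sketch below the function name `angle` and the test vectors are hypothetical; the cosine is clamped to [-1, 1] only to guard against floating-point rounding.

```python
import math

def scalar_product(x, y):
    """xy = x_1*y_1 + ... + x_n*y_n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm of x."""
    return math.sqrt(scalar_product(x, x))

def angle(x, y):
    """Angle in [0, pi] between two nonzero vectors x and y."""
    cosine = scalar_product(x, y) / (norm(x) * norm(y))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.acos(cosine)

orthogonal = angle((1.0, 0.0), (0.0, 1.0))         # xy = 0
strict_acute = angle((1.0, 0.0), (1.0, 1.0))       # xy > 0
strict_obtuse = angle((1.0, 0.0), (-1.0, 1.0))     # xy < 0

print(orthogonal, strict_acute, strict_obtuse)
```

The sign of the scalar product alone decides whether the angle is acute or obtuse, so the norms are needed only when the angle itself is wanted.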

4. Matrices
Although our main concern is nonlinear problems, linear systems of the following type will be encountered very frequently:

$$\begin{aligned} A_{11} x_1 + \cdots + A_{1n} x_n &= b_1 \\ &\;\;\vdots \\ A_{m1} x_1 + \cdots + A_{mn} x_n &= b_m \end{aligned} \tag{1}$$

where $A_{ij}$ and $b_i$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, are given real numbers. We can abbreviate the above system by using the concepts of the previous section. If we let $A_i$ denote an n-vector whose $n$ components are $A_{ij}$, $j = 1, \ldots, n$, and if we let $x \in R^n$, then the above system is equivalent to

$$A_i x = b_i, \quad i = 1, \ldots, m \tag{2}$$

In 2 we interpret $A_i x$ as the scalar product 1.3.6 of $A_i$ and $x$. If we further let $Ax$ denote an m-vector whose $m$ components are $A_i x$, $i = 1, \ldots, m$, and $b$ an m-vector whose $m$ components are $b_i$, then the equivalent systems 1 and 2 can be further simplified to

$$Ax = b \tag{3}$$
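In order to be consistent with ordinary matrix theory notation, we define the $m \times n$ matrix $A$ as follows

$$A = \begin{bmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mn} \end{bmatrix} \tag{4}$$

The $i$th row of the matrix $A$ will be denoted by $A_i$ and will be an n-vector. Hence

$$A_i = (A_{i1}, \ldots, A_{in}), \quad i = 1, \ldots, m \tag{5}$$

The $j$th column of the matrix $A$ will be denoted by $A_{.j}$ and will be an m-vector. Hence

$$A_{.j} = (A_{1j}, \ldots, A_{mj}), \quad j = 1, \ldots, n \tag{6}$$

The transpose of the matrix $A$ is denoted by $A'$ and is defined by

$$A' = \begin{bmatrix} A_{11} & \cdots & A_{m1} \\ \vdots & & \vdots \\ A_{1n} & \cdots & A_{mn} \end{bmatrix} \tag{7}$$

Obviously the $i$th row of $A$ is equal to the $i$th column of $A'$, and the $j$th column of $A$ is equal to the $j$th row of $A'$. Hence

$$A_i = (A')_{.i} = A'_{.i} \tag{8}$$
$$A_{.j} = (A')_j = A'_j \tag{9}$$

The last equalities of 8 and 9 are to be taken as the definitions of $A'_{.i}$ and $A'_j$ respectively. Since $A_{ij}$ is the real number in the $i$th row of the $j$th column of $A$, then if we define $A'_{ji}$ as the real number in the $j$th row of the $i$th column of $A'$, we have

$$A'_{ji} = A_{ij} \tag{10}$$

The equivalent systems 1, 2, and 3 can be written still in another form as follows

$$\sum_{j=1}^{n} A_{.j} x_j = b \tag{11}$$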

Here $A_{.j}$ and $b$ are vectors in $R^m$ and $x_j$ are real numbers. The representation 2 can be interpreted as a problem in $R^n$ whereas 11 can be interpreted as a problem in $R^m$. In 2 we are required to find an $x \in R^n$ that makes the appropriate scalar products $b_i$ (or angles, see 1.3.11) with the n-vectors $A_i$, $i = 1, \ldots, m$. In 11, we are given the $n + 1$ vectors in $R^m$, $A_{.j}$, $j = 1, \ldots, n$ and $b$, and we are required to find $n$ weights $x_1, \ldots, x_n$ such that $b$ is a linear combination of the vectors $A_{.j}$. These two dual representations of the same linear system will be used in interpreting some of the important theorems of the alternative of the next chapter.
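The row interpretation (scalar products of the rows with x) and the column interpretation (a linear combination of the columns with weights x_j) produce the same vector. A minimal sketch with a hypothetical 3 × 2 matrix `A` and vector `x`:

```python
# A hypothetical 3 x 2 matrix, stored as a list of rows, and a vector in R^2.
A = [[1.0, 2.0],
     [0.0, 1.0],
     [3.0, -1.0]]
x = [2.0, 1.0]

def scalar_product(u, v):
    """u_1*v_1 + ... + u_n*v_n."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Row view: the ith component of Ax is the scalar product of row i with x.
row_view = [scalar_product(row, x) for row in A]

# Column view: Ax is the sum over j of (column j) * x_j.
columns = [[row[j] for row in A] for j in range(len(x))]
column_view = [0.0] * len(A)
for j, column in enumerate(columns):
    for i, entry in enumerate(column):
        column_view[i] += entry * x[j]

print(row_view, column_view)
```

Solving the system then means either finding an x with prescribed scalar products against the rows, or finding weights that express b through the columns.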
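The $m \times n$ matrix $A$ of 4 can generate another linear system $yA$, defined as follows

$$yA = \sum_{i=1}^{m} A_i y_i \tag{12}$$

where $y \in R^m$. Hence, $yA$ is an n-dimensional vector whose $j$th component is given by

$$(yA)_j = \sum_{i=1}^{m} A_{ij} y_i, \quad j = 1, \ldots, n \tag{13}$$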
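As a sketch of the product of a vector with a matrix from the left, the following computes yA both componentwise and as a combination of the rows of A weighted by the components of y. The matrix `A` and the vector `y` are hypothetical sample data.

```python
# A hypothetical 3 x 2 matrix (list of rows) and a vector y with one entry per row.
A = [[1.0, 2.0],
     [0.0, 1.0],
     [3.0, -1.0]]
y = [1.0, 2.0, -1.0]

m, n = len(A), len(A[0])

# Componentwise: the jth component of yA sums y_i * A_ij over the rows i.
yA = [sum(y[i] * A[i][j] for i in range(m)) for j in range(n)]

# Equivalently: yA is the sum of the rows of A, each weighted by y_i.
as_row_combination = [0.0] * n
for i in range(m):
    for j in range(n):
        as_row_combination[j] += y[i] * A[i][j]

print(yA, as_row_combination)
```

Note that y needs as many components as A has rows, while the result has as many components as A has columns.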

In general we shall follow the convention of using upper case Latin letters to denote matrices. If $A$ is an $m \times n$ matrix, and if we let

then we define the following submatrices of $A$ (which are matrices with rows and columns extracted respectively from the rows and columns of $A$)

Fig. 1.4.1 An m X n matrix and its submatrices.


jth column of
ith row of

It follows then that

and

Figure 1.4.1 depicts some of the above submatrices of $A$.


Nonvacuous matrix
A matrix $A$ is said to be nonvacuous if it contains at least one element $A_{ij}$. An $m \times n$ matrix $A$ with $m \geq 1$ and $n \geq 1$ is nonvacuous even if all its elements $A_{ij} = 0$.

5. Mappings and functions


Mapping
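Let $X$ and $Y$ be two sets. A mapping $\Gamma$ from $X$ into $Y$ is a correspondence which associates with every element $x$ of $X$ a subset of $Y$. For each $x \in X$, the set $\Gamma(x)$ is called the image of $x$. The domain $X^*$ of $\Gamma$ is the subset of points of $X$ for which the image $\Gamma(x)$ is nonempty, that is,

$$X^* = \{x \mid x \in X, \; \Gamma(x) \neq \emptyset\}$$

The range $\Gamma(X^*)$ of $\Gamma$ is the union of the images of all the points of $X^*$, that is

$$\Gamma(X^*) = \bigcup_{x \in X^*} \Gamma(x)$$

EXAMPLE Let $X = Y = R$, $\Gamma(x) = \{y \mid \cos y = x\}$. Then

$$X^* = \{x \mid -1 \leq x \leq 1\}, \quad \Gamma(X^*) = R$$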

Function
A function $f$ is a single-valued mapping from a set $X$ into a set $Y$. That is for each $x \in X$, the image set $f(x)$ consists of a single element of $Y$. The domain of $f$ is $X$, and we say that $f$ is defined on $X$. The range of $f$ is $f(X) = \bigcup_{x \in X} f(x)$. (For convenience we will write the image of a function not as a set but as the unique element of that set.)

Numerical function
A numerical function $\theta$ is a function from a set $X$ into $R$. In other words a numerical function is a correspondence which associates a real number with each element $x$ of $X$.

EXAMPLES If $X = R$, then $\theta$ is the familiar real single-valued function of a real variable, such as $\theta(x) = \sin x$. If $X$ is the set of positive integers, then $\theta$ assigns a real number for each positive integer, for example $\theta(x) = 1/x!$. If $X = R^n$, then $\theta$ is the real single-valued function of $n$ variables.
Vector function
An m-dimensional vector function $f$ is a function from a set $X$ into $R^m$. In other words a vector function is a correspondence which associates a vector from $R^m$ with each element $x$ of $X$. The $m$ components of the vector $f(x)$ are denoted by $f_1(x), \ldots, f_m(x)$. Each $f_i$ is a numerical function on $X$. A vector function $f$ has a certain property (for example continuity) whenever each of its components $f_i$ has that property.

EXAMPLE If $X = R^n$, then the $m$ components $f_i$, $i = 1, \ldots, m$, of $f$ are numerical functions on $R^n$.
Linear vector functions on Rn
An m-dimensional vector function $f$ defined on $R^n$ is said to be linear if

$$f(x) = Ax + b$$

where $A$ is some fixed $m \times n$ matrix and $b$ is some fixed vector in $R^m$. It follows that if $f$ is a linear function on $R^n$ then

(Conversely, the last two relations could be used to define a linear vector function on $R^n$, from which it could be shown that $f(x) = Ax + b$ [Berge 63, p. 159].)
If $m = 1$ in the above, then we have a numerical linear function $\theta$ on $R^n$ and

$$\theta(x) = cx + \gamma$$

where $c$ is a fixed vector in $R^n$ and $\gamma$ is a fixed real number.
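A function of the form f(x) = Ax + b can be evaluated row by row, and one identity it always satisfies is f(x¹ + x²) = f(x¹) + f(x²) − f(0), which follows directly from the definition. The sketch below checks that identity for hypothetical data; it is an illustration only and is not claimed to be one of the relations the text refers to. The names `A`, `b`, `add`, and `sub` are assumptions.

```python
# Hypothetical data: a 3 x 2 matrix A (list of rows) and a vector b in R^3.
A = [[1.0, 2.0],
     [0.0, -1.0],
     [3.0, 1.0]]
b = [1.0, 0.0, -2.0]

def f(x):
    """f(x) = Ax + b, computed one row at a time."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]

def add(u, v):
    """Componentwise sum of two vectors."""
    return [ui + vi for ui, vi in zip(u, v)]

def sub(u, v):
    """Componentwise difference of two vectors."""
    return [ui - vi for ui, vi in zip(u, v)]

x1 = [1.0, 2.0]
x2 = [-3.0, 0.5]

lhs = f(add(x1, x2))                             # f(x1 + x2)
rhs = sub(add(f(x1), f(x2)), f([0.0, 0.0]))      # f(x1) + f(x2) - f(0)

print(lhs, rhs)
```

The term f(0) = b appears because such a function is additive only after its constant part is removed.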

Inequalities or equalities involving linear vector functions (or linear numerical functions) will be naturally called linear inequalities or equalities.

6. Notation
Vectors and real numbers
In general we shall follow the convention that small Latin letters will denote vectors such as $a$, $b$, $c$, $x$, $y$, $z$, or vector functions such as $f$, $g$, $h$. Exceptions will be the letters $i$, $j$, $k$, $m$, $n$, and sometimes others, which will denote integers. Small Greek letters will denote a real number (a point in $R$) such as $\alpha$, $\beta$, $\gamma$, $\xi$, $\eta$, $\zeta$, or a numerical function such as $\theta$, $\phi$, $\psi$.

Subscripts
A small Latin letter with an integer subscript or a small Latin letter subscript will denote a component of a vector, in general, and on occasion will denote a vector. For example, if $x \in R^n$, then $x_3$ and $x_i$ denote respectively the third and $i$th components of $x$. On the other hand we will have occasion to let $x_1 \in R^m$, $x_2 \in R^m$, etc., in which case this intent will be made explicit. Small Greek letters with integer or Latin subscripts will occasionally be used to denote real numbers such as $\lambda_1$, $\lambda_i$. If $x \in R^n$, $K \subset N = \{1, \ldots, n\}$, and $K$ contains $k$ elements each of which is distinct, then $x_{i \in K}$ is a vector in $R^k$ with the components $\{x_i \mid i \in K\}$ and is denoted by $x_K$. Thus a small Latin letter with a capital Latin letter subscript denotes a vector in a space of smaller or equal dimension to that of the space of the unsubscripted vector.

Superscripts
A small Latin or Greek letter with a superscript or an elevated symbol will denote a fixed vector or real number, for example $x^1$, $x^2$, $x^i$, $\bar{x}$, $\hat{x}$, $\xi^*$, $\bar{\xi}$, etc. Exponentiation on the other hand will be distinguished by parentheses enclosing the quantity raised to a power, for example $(x)^2$.

Zero
The number 0 will denote either the real number zero or a vector
in Rn all components of which are zero.

Matrices
Matrices will be denoted by capital Latin letters as described in
detail in a previous section, Sec. 1.4.


Sets
Sets will always be denoted by capital Greek or Latin letters such as $\Gamma$, $\Lambda$, $\Omega$, $R$, $I$, $X$, $Y$. Capital letters with subscripts, such as $\Gamma_1$, $\Gamma_2$, $\Gamma_i$, and capital letters with elevated symbols, such as $\Gamma^*$, $X^0$ will also denote sets. (See also Sec. 1.2.)

Ordering relations
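The following convention for equalities and inequalities will be used. If $x, y \in R^n$, then

$$x = y \;\Leftrightarrow\; x_i = y_i, \quad i = 1, \ldots, n$$
$$x \geqq y \;\Leftrightarrow\; x_i \geq y_i, \quad i = 1, \ldots, n$$
$$x \geq y \;\Leftrightarrow\; x \geqq y \text{ and } x \neq y$$
$$x > y \;\Leftrightarrow\; x_i > y_i, \quad i = 1, \ldots, n$$

If $x \geqq 0$, $x$ is said to be nonnegative, if $x \geq 0$ then $x$ is said to be semipositive, and if $x > 0$ then $x$ is said to be positive. The relations $=$, $\geqq$, $\geq$, $>$ defined above are called ordering relations (in $R^n$).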
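The distinction between nonnegative, semipositive, and positive vectors is easy to get wrong, so a small sketch may help. The function names below are hypothetical, and the definitions assume the componentwise reading: nonnegative means every component is at least zero, semipositive means nonnegative and not the zero vector, positive means every component exceeds zero.

```python
def vec_eq(x, y):
    """x = y: all components equal."""
    return all(xi == yi for xi, yi in zip(x, y))

def vec_geqq(x, y):
    """Every component of x is at least the matching component of y."""
    return all(xi >= yi for xi, yi in zip(x, y))

def vec_geq(x, y):
    """Componentwise at least, and x differs from y in some component."""
    return vec_geqq(x, y) and not vec_eq(x, y)

def vec_gt(x, y):
    """Every component of x strictly exceeds the matching component of y."""
    return all(xi > yi for xi, yi in zip(x, y))

zero = (0.0, 0.0)
for x in [(0.0, 0.0), (0.0, 1.0), (1.0, 2.0)]:
    print(x, vec_geqq(x, zero), vec_geq(x, zero), vec_gt(x, zero))
```

Under these assumptions the zero vector is nonnegative but not semipositive, (0, 1) is semipositive but not positive, and (1, 2) is all three.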

The nonlinear programming problem


By using the notation introduced above, the nonlinear programming problem 1.1.1 to 1.1.3 can be rewritten in a slightly more general form as follows. Let $X^0 \subset R^n$, let $g$, $h$, and $\theta$ be respectively an m-dimensional vector function, a k-dimensional vector function, and a numerical function, all defined on $X^0$. Then the problem becomes this: Find an $\bar{x}$, if such exists, such that

$$\theta(\bar{x}) = \min_{x \in X} \theta(x), \quad \bar{x} \in X = \{x \mid x \in X^0, \; g(x) \leqq 0, \; h(x) = 0\} \tag{9}$$

The set $X$ is called the feasible region, $\bar{x}$ the minimum solution, and $\theta(\bar{x})$ the minimum. All points $x$ in the feasible region $X$ are referred to as feasible points or simply as feasible.
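When the underlying set is finite, the feasible region and a minimum solution can be found by direct enumeration. The sketch below is a toy illustration only: the grid `X0` and the functions `g`, `h`, and `theta` are hypothetical, and enumeration is not a method proposed by the text.

```python
# Hypothetical finite set X0: a grid of points in the plane.
X0 = [(i / 2.0, j / 2.0) for i in range(-4, 5) for j in range(-4, 5)]

def g(x):
    """Two inequality constraints, each required to be <= 0 (assumed for illustration)."""
    return (x[0] ** 2 + x[1] ** 2 - 4.0, -x[0])

def h(x):
    """One equality constraint, required to be 0 (assumed for illustration)."""
    return (x[0] - x[1],)

def theta(x):
    """Objective function to be minimized (assumed for illustration)."""
    return -x[0] - x[1]

# Feasible region: points of X0 satisfying every inequality and equality constraint.
X = [x for x in X0
     if all(gi <= 0 for gi in g(x)) and all(hj == 0 for hj in h(x))]

# A minimum solution and the minimum over the feasible region.
x_bar = min(X, key=theta)
minimum = theta(x_bar)

print(sorted(X), x_bar, minimum)
```

If the feasible region were empty, `min` would raise an error, which mirrors the "if such exists" clause of the problem statement.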
Another way of writing the same problem which is quite common in the literature is the following:

$$\min_{x \in X^0} \theta(x) \tag{10}$$

subject to

$$g(x) \leqq 0 \tag{11}$$
$$h(x) = 0 \tag{12}$$
We favor the more precise and brief designation 9 of the problem instead of 10 to 12. Notice that if we let $X^0 = R^n$ in the above problem, then we obtain the nonlinear programming problem 1.1.1 to 1.1.3.
If $X^0 = R^n$ and $\theta$, $g$, and $h$ are all linear functions on $R^n$, then problem 9 becomes a linear programming problem: Find an $\bar{x}$, if such exists, such that

where $b$, $c$, and $d$ are given fixed vectors in $R^n$, $R^m$, and $R^k$ respectively, and $A$ and $B$ are given fixed $m \times n$ and $k \times n$ matrices respectively. There exists a vast literature on the subject of linear programming [Dantzig 63, Gass 64, Hadley 62, Simmonard 66]. It should be remarked that problem 13 is equivalent to finding an $\bar{x}$ such that

When B and d are absent from this formulation, 14 becomes the standard
dual form of the linear programming problem [Simmonard 66, p. 95].
