Convex Functions

Our next topic is that of convex functions. Again, we will concentrate on the context of a
map $f : \mathbb{R}^n \to \mathbb{R}$, although the situation can be generalized immediately by replacing
$\mathbb{R}^n$ with any real vector space $V$. We will state many of the definitions below in this more
general setting.

We will also find it useful, and in fact modern algorithms reflect this usefulness, to consider
functions $f : \mathbb{R}^n \to \mathbb{R}^*$, where $\mathbb{R}^*$ is the set of extended real numbers introduced earlier.¹

Before beginning with the main part of the discussion, we want to keep a couple of
examples in mind.

The prime example of a convex function is $x \mapsto x^2$, $x \in \mathbb{R}$. As we learn in elementary
calculus, this function is infinitely differentiable and has a single critical point, at
which the function in fact takes on not just a relative minimum but an absolute minimum.

Figure 1: The Generic Parabola

A critical point is, by definition, a solution of the equation $\frac{d}{dx} x^2 = 0$, that is, of $2x = 0$. We
can apply the second derivative test at the point $x = 0$ to determine the nature of the
critical point and we find that, since $\frac{d^2}{dx^2} x^2 = 2 > 0$, the function is "concave up" and
the critical point is indeed a point of relative minimum. To see that this point gives an absolute
minimum to the function, we need only remark that the function values are bounded
below by zero, since $x^2 > 0$ for all $x \neq 0$.

¹ See p. 10 of the handout on Preliminary Material.

We can give a similar example in $\mathbb{R}^2$.

Example 1.1 We consider the function

$$(x, y) \mapsto \tfrac{1}{2} x^2 + \tfrac{1}{3} y^2 =: z ,$$

whose graph appears in the next figure.

Figure 2: An Elliptic Paraboloid

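This is an elliptic paraboloid. In this case we expect that, once again, the minimum will
occur at the origin of coordinates and, setting $f(x, y) = z$, we can compute

$$\operatorname{grad}(f)(x, y) = \begin{pmatrix} x \\ \tfrac{2}{3} y \end{pmatrix} , \quad \text{and} \quad H(f)(x, y) = \begin{pmatrix} 1 & 0 \\ 0 & \tfrac{2}{3} \end{pmatrix} .$$

Notice that, in our terminology, the Hessian matrix $H(f)$ is positive definite at all points
$(x, y) \in \mathbb{R}^2$. Here the critical points are exactly those for which $\operatorname{grad}(f)(x, y) = 0$, whose
only solution is $x = 0$, $y = 0$. The second derivative test is just that

$$\det H(f)(x, y) = \frac{\partial^2 f}{\partial x^2} \, \frac{\partial^2 f}{\partial y^2} - \left( \frac{\partial^2 f}{\partial x \, \partial y} \right)^2 > 0$$

(together with $\partial^2 f / \partial x^2 > 0$), which is clearly satisfied: here the determinant equals
$1 \cdot \tfrac{2}{3} - 0 = \tfrac{2}{3}$.

Again, since for all $(x, y) \neq (0, 0)$, $z > 0$, the origin is a point where $f$ has an absolute
minimum.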
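
The computation in Example 1.1 can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the helper names `grad_f`, `HESSIAN` and `det2` are illustrative choices, and the partial derivatives are entered by hand from the formulas above.

```python
# Sketch: check the critical point and the second derivative test for
# f(x, y) = x^2/2 + y^2/3 (Example 1.1). All names are illustrative.

def f(x, y):
    return x * x / 2 + y * y / 3

def grad_f(x, y):
    # Partial derivatives computed by hand: (x, 2y/3).
    return (x, 2 * y / 3)

# The Hessian is constant for this quadratic: [[1, 0], [0, 2/3]].
HESSIAN = ((1.0, 0.0), (0.0, 2.0 / 3.0))

def det2(m):
    # Determinant of a 2 x 2 matrix given as nested tuples.
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

origin_is_critical = grad_f(0.0, 0.0) == (0.0, 0.0)
passes_second_derivative_test = det2(HESSIAN) > 0 and HESSIAN[0][0] > 0
```

Both flags evaluate to true, in agreement with the text: the origin is the only critical point and the test identifies it as a minimum.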

As the idea of convex set lies at the foundation of our analysis, we want to describe the
notion of convex functions in terms of convex sets. We recall that, if $A$ and $B$ are two
non-empty sets, then the Cartesian product of these two sets, $A \times B$, is defined as the set
of ordered pairs $\{(a, b) : a \in A,\ b \in B\}$. Notice that order does matter here and that, in general,
$A \times B \neq B \times A$! Simple examples are

1. Let $A = [-1, 1]$, $B = [-1, 1]$, so that $A \times B = \{(x, y) : -1 \le x \le 1,\ -1 \le y \le 1\}$,
which is just the square centered at the origin, of side two.

2. $\mathbb{R}^2$ itself can be identified (and we usually do!) with the Cartesian product $\mathbb{R} \times \mathbb{R}$.

3. Let $C \subset \mathbb{R}^2$ be convex and let $S := \mathbb{R}_+ \times C$. Then $S$ is called a right cylinder and is
just $\{(z, x) \in \mathbb{R}^3 : z > 0,\ x \in C\}$. If, in particular, $C = \{(u, v) \in \mathbb{R}^2 : u^2 + v^2 \le 1\}$,
then $S$ is the usual right circular cylinder lying above the $x, y$-plane (without the bottom!).

This last example shows us a situation where $A \times B$ is convex. In fact it is a general
result that if $A$ and $B$ are two non-empty convex sets in a vector space $V$, then $A \times B$ is
likewise a convex set in $V \times V$.

Exercise 1.2 Prove this last statement. (Take $V = \mathbb{R}^n$ if you must!)

Previously, we introduced the idea of the epigraph of a function $f : X \to \mathbb{R}$ where
$X \subset \mathbb{R}^n$. For convenience, we repeat the definition here.

Definition 1.3 Let $X \subset \mathbb{R}^n$ be a non-empty set. If $f : X \to \mathbb{R}$ then $\operatorname{epi}(f)$ is defined by

$$\operatorname{epi}(f) := \{(x, z) \in X \times \mathbb{R} \mid z \ge f(x)\} .$$

Convex functions are defined in terms of their epigraphs:

Definition 1.4 Let $C \subset \mathbb{R}^n$ be convex and $f : C \to \mathbb{R}^*$. Then the function $f$ is called
a convex function provided $\operatorname{epi}(f) \subset \mathbb{R}^n \times \mathbb{R}$ is a convex set.

We emphasize that this definition has the advantage of directly relating the theory of
convex sets to the theory of convex functions. However, a more traditional definition is
that a function is convex provided that, for any x, y ∈ C and any λ ∈ [0, 1]

f ( (1 − λ) x + λ y) ≤ (1 − λ) f (x) + λ f (y) ,
which is sometimes referred to as Jensen’s inequality.

In fact, these definitions turn out to be equivalent. Indeed, we have the following result.

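Theorem 1.5 Let $C \subset \mathbb{R}^n$ be convex and $f : C \to \mathbb{R}^*$. Then the following are equivalent:

(a) $\operatorname{epi}(f)$ is convex.

(b) For all $\lambda_1, \lambda_2, \ldots, \lambda_n$ with $\lambda_i \ge 0$ and $\sum_{i=1}^{n} \lambda_i = 1$, and points $x^{(i)} \in C$,
$i = 1, 2, \ldots, n$, we have

$$f\left( \sum_{i=1}^{n} \lambda_i x^{(i)} \right) \le \sum_{i=1}^{n} \lambda_i f(x^{(i)}) .$$

(c) For any $x, y \in C$ and $\lambda \in [0, 1]$,

$$f((1 - \lambda) x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y) .$$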

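Proof: To see that (a) implies (b) we note that, for all $i = 1, 2, \ldots, n$, $(x^{(i)}, f(x^{(i)})) \in \operatorname{epi}(f)$.
Since this latter set is convex, we have

$$\sum_{i=1}^{n} \lambda_i \left( x^{(i)}, f(x^{(i)}) \right) = \left( \sum_{i=1}^{n} \lambda_i x^{(i)}, \ \sum_{i=1}^{n} \lambda_i f(x^{(i)}) \right) \in \operatorname{epi}(f) ,$$

which, in turn, implies that

$$f\left( \sum_{i=1}^{n} \lambda_i x^{(i)} \right) \le \sum_{i=1}^{n} \lambda_i f(x^{(i)}) .$$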

This establishes (b). It is obvious that (b) implies (c). So it remains only to show that
(c) implies (a) in order to establish the equivalence.
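To this end, suppose that $(x^{(1)}, z_1), (x^{(2)}, z_2) \in \operatorname{epi}(f)$ and take $0 \le \lambda \le 1$. Then

$$(1 - \lambda) (x^{(1)}, z_1) + \lambda (x^{(2)}, z_2) = \left( (1 - \lambda) x^{(1)} + \lambda x^{(2)}, \ (1 - \lambda) z_1 + \lambda z_2 \right) ,$$

and since $f(x^{(1)}) \le z_1$ and $f(x^{(2)}) \le z_2$ we have, since $1 - \lambda \ge 0$ and $\lambda \ge 0$, that

$$(1 - \lambda) f(x^{(1)}) + \lambda f(x^{(2)}) \le (1 - \lambda) z_1 + \lambda z_2 .$$

Hence, by the assumption (c), $f((1 - \lambda) x^{(1)} + \lambda x^{(2)}) \le (1 - \lambda) z_1 + \lambda z_2$, which shows that
the point $((1 - \lambda) x^{(1)} + \lambda x^{(2)}, \ (1 - \lambda) z_1 + \lambda z_2)$ is in $\operatorname{epi}(f)$. □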
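
Part (b) of the theorem lends itself to a quick numerical sanity check. The sketch below is an illustration only, under the assumption that a random sample is informative: it evaluates the gap between the two sides of the inequality for $x \mapsto x^2$ on random convex combinations. The function names `jensen_gap` and `random_weights` are hypothetical helpers, not anything from the notes, and a sampled check of this kind can refute convexity but never prove it.

```python
import random

def jensen_gap(f, points, weights):
    """Return sum(w_i f(x_i)) - f(sum(w_i x_i)); non-negative when f is convex."""
    combination = sum(w * x for w, x in zip(weights, points))
    return sum(w * f(x) for w, x in zip(weights, points)) - f(combination)

def random_weights(n, rng):
    # Positive numbers normalised to sum to one (a convex combination).
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def square(x):
    return x * x

rng = random.Random(0)
gaps = []
for _ in range(1000):
    pts = [rng.uniform(-5, 5) for _ in range(4)]
    gaps.append(jensen_gap(square, pts, random_weights(4, rng)))
min_gap = min(gaps)  # stays non-negative up to rounding error
```

For a non-convex function such as $x \mapsto -x^2$ the same helper returns a negative gap already for the points $-1, 1$ with equal weights.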

We pause to remark that some authors, particularly in applications to Economics, discuss
concave functions. These latter functions are simply related to convex functions. Indeed,
a function $f$ is concave if and only if the function $-f$ is convex.² Again, we take as our
basic object of study the class of convex functions.

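We can see another connection between convex sets and convex functions if we introduce
the indicator function, $\psi_K$, of a set $K \subset \mathbb{R}^n$. Indeed, $\psi_K : \mathbb{R}^n \to \mathbb{R}^*$ is defined by

$$\psi_K(x) = \begin{cases} 0 & \text{if } x \in K , \\ +\infty & \text{if } x \notin K . \end{cases}$$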

Proposition 1.6 A non-empty subset $D \subset \mathbb{R}^n$ is convex if and only if its indicator
function is convex.

Proof: The result follows immediately from the fact that $\operatorname{epi}(\psi_D) = D \times \mathbb{R}_{\ge 0}$.

Certain simple properties follow immediately from the analytic form of the definition (part
(c) of the equivalence theorem above). Indeed, it is easy to see, and we leave it as an
exercise for the reader, that if f and g are convex functions defined on a convex set C,
then f + g is likewise convex on C provided there is no point for which f (x) = ∞ and
g(x) = −∞. The same is true if β ∈ R , β > 0 and we consider βf .

Moreover, we have the following simple result which is useful.

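Proposition 1.7 Let $f : \mathbb{R}^n \to \mathbb{R}$ be given and, for $x^{(1)}, x^{(2)} \in \mathbb{R}^n$, define a function
$\varphi : [0, 1] \to \mathbb{R}$ by $\varphi(\lambda) := f((1 - \lambda) x^{(1)} + \lambda x^{(2)})$. Then the function $f$ is convex on $\mathbb{R}^n$ if
and only if, for every choice of $x^{(1)}$ and $x^{(2)}$, the function $\varphi$ is convex on $[0, 1]$.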

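Proof: Suppose, first, that $f$ is convex on $\mathbb{R}^n$. Then it is sufficient to show that $\operatorname{epi}(\varphi)$
is a convex subset of $\mathbb{R}^2$. To see this, let $(\lambda_1, z_1), (\lambda_2, z_2) \in \operatorname{epi}(\varphi)$ and let

$$y^{(1)} = (1 - \lambda_1) x^{(1)} + \lambda_1 x^{(2)} , \qquad y^{(2)} = (1 - \lambda_2) x^{(1)} + \lambda_2 x^{(2)} .$$

Then

$$f(y^{(1)}) = \varphi(\lambda_1) \le z_1 \quad \text{and} \quad f(y^{(2)}) = \varphi(\lambda_2) \le z_2 .$$

Hence $(y^{(1)}, z_1) \in \operatorname{epi}(f)$ and $(y^{(2)}, z_2) \in \operatorname{epi}(f)$. Since $\operatorname{epi}(f)$ is a convex set, we also
have $(\mu y^{(1)} + (1 - \mu) y^{(2)}, \ \mu z_1 + (1 - \mu) z_2) \in \operatorname{epi}(f)$ for every $\mu \in [0, 1]$. It follows that
$f(\mu y^{(1)} + (1 - \mu) y^{(2)}) \le \mu z_1 + (1 - \mu) z_2$.

Now

$$\begin{aligned} \mu y^{(1)} + (1 - \mu) y^{(2)} &= \mu \left( (1 - \lambda_1) x^{(1)} + \lambda_1 x^{(2)} \right) + (1 - \mu) \left( (1 - \lambda_2) x^{(1)} + \lambda_2 x^{(2)} \right) \\ &= \left[ \mu (1 - \lambda_1) + (1 - \mu)(1 - \lambda_2) \right] x^{(1)} + \left[ \mu \lambda_1 + (1 - \mu) \lambda_2 \right] x^{(2)} , \end{aligned}$$

and since

$$1 - [\mu \lambda_1 + (1 - \mu) \lambda_2] = [\mu + (1 - \mu)] - [\mu \lambda_1 + (1 - \mu) \lambda_2] = \mu (1 - \lambda_1) + (1 - \mu)(1 - \lambda_2) ,$$

we have from the definition of $\varphi$ that $f(\mu y^{(1)} + (1 - \mu) y^{(2)}) = \varphi(\mu \lambda_1 + (1 - \mu) \lambda_2)$ and so
$(\mu \lambda_1 + (1 - \mu) \lambda_2, \ \mu z_1 + (1 - \mu) z_2) \in \operatorname{epi}(\varphi)$, i.e., $\varphi$ is convex.

We leave the proof of the converse statement as an exercise. □

² This will also be true of quasi-convex and quasi-concave functions which we will define below.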

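We also point out that if $f : \mathbb{R}^n \to \mathbb{R}$ is linear or affine, then $f$ is convex. Indeed,
suppose that for a vector $a \in \mathbb{R}^n$ and a real number $b$, the function $f$ is given by
$f(x) = \langle a, x \rangle + b$. Then we have, for any $\lambda \in [0, 1]$,

$$\begin{aligned} f((1 - \lambda) x + \lambda y) &= \langle a, (1 - \lambda) x + \lambda y \rangle + b \\ &= (1 - \lambda) \langle a, x \rangle + \lambda \langle a, y \rangle + (1 - \lambda) b + \lambda b \\ &= (1 - \lambda) (\langle a, x \rangle + b) + \lambda (\langle a, y \rangle + b) = (1 - \lambda) f(x) + \lambda f(y) , \end{aligned}$$

and so $f$ is convex, the weak inequality being an equality in this case.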

In the case that $f$ is linear, that is $f(x) = \langle a, x \rangle$ for some $a \in \mathbb{R}^n$, it is easy to see
that the map $\varphi : x \mapsto [f(x)]^2$ is also convex. Indeed, if $x, y \in \mathbb{R}^n$ then, setting $\alpha = f(x)$
and $\beta = f(y)$, and taking $0 < \lambda < 1$, we have

$$\begin{aligned} (1 - \lambda) \varphi(x) + \lambda \varphi(y) - \varphi((1 - \lambda) x + \lambda y) &= (1 - \lambda) \alpha^2 + \lambda \beta^2 - ((1 - \lambda) \alpha + \lambda \beta)^2 \\ &= (1 - \lambda) \lambda (\alpha - \beta)^2 \ge 0 . \end{aligned}$$

Note, in particular, that the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x$ is linear and that
$[f(x)]^2 = x^2$, so that we have a proof that the function that we usually write $y = x^2$ is a
convex function.
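
The final equality in the computation for $\varphi$ compresses some algebra. One way to fill in the intermediate steps, using only the symbols $\alpha$, $\beta$, $\lambda$ already introduced, is the following expansion:

```latex
\begin{aligned}
(1-\lambda)\alpha^2 + \lambda\beta^2 - \bigl((1-\lambda)\alpha + \lambda\beta\bigr)^2
  &= (1-\lambda)\alpha^2 + \lambda\beta^2
     - (1-\lambda)^2\alpha^2 - 2\lambda(1-\lambda)\alpha\beta - \lambda^2\beta^2 \\
  &= (1-\lambda)\bigl[1-(1-\lambda)\bigr]\alpha^2
     + \lambda(1-\lambda)\beta^2 - 2\lambda(1-\lambda)\alpha\beta \\
  &= \lambda(1-\lambda)\bigl(\alpha^2 - 2\alpha\beta + \beta^2\bigr)
   = \lambda(1-\lambda)(\alpha-\beta)^2 .
\end{aligned}
```

The factor $\lambda(1-\lambda)$ is non-negative on $[0, 1]$, which is where the sign of the whole expression comes from.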

The next result expands our repertoire of convex functions.

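Proposition 1.8 (a) If $A : \mathbb{R}^m \to \mathbb{R}^n$ is linear and $f : \mathbb{R}^n \to \mathbb{R}^*$ is convex, then
$f \circ A$ is convex as a map from $\mathbb{R}^m$ to $\mathbb{R}^*$.

(b) If $f$ is as in part (a) and $\varphi : \mathbb{R} \to \mathbb{R}$ is convex and non-decreasing, then
$\varphi \circ f : \mathbb{R}^n \to \mathbb{R}$ is convex.

(c) If $\{f_\alpha\}_{\alpha \in A}$ is a family of convex functions $f_\alpha : \mathbb{R}^n \to \mathbb{R}^*$, then the pointwise
supremum $\sup_{\alpha \in A} f_\alpha$ is convex.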

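Proof: To prove (a) we use Jensen's inequality:

Given any $x, y \in \mathbb{R}^m$ and $\lambda \in [0, 1]$ we have, by the linearity of $A$,

$$\begin{aligned} (f \circ A)((1 - \lambda) x + \lambda y) &= f((1 - \lambda)(Ax) + \lambda (Ay)) \le (1 - \lambda) f(Ax) + \lambda f(Ay) \\ &= (1 - \lambda)(f \circ A)(x) + \lambda (f \circ A)(y) . \end{aligned}$$

For part (b), again we take $x, y \in \mathbb{R}^n$ and $\lambda \in [0, 1]$. Then

$$\begin{aligned} (\varphi \circ f)[(1 - \lambda) x + \lambda y] &\le \varphi[(1 - \lambda) f(x) + \lambda f(y)] \\ &\le (1 - \lambda) \varphi(f(x)) + \lambda \varphi(f(y)) = (1 - \lambda)(\varphi \circ f)(x) + \lambda (\varphi \circ f)(y) , \end{aligned}$$

where the first inequality comes from the convexity of $f$ and the monotonicity of $\varphi$, and
the second from the convexity of this latter function. This proves part (b).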

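To establish part (c) we note that, since the arbitrary intersection of convex sets is convex,
it suffices to show that

$$\operatorname{epi}\Bigl( \sup_{\alpha \in A} f_\alpha \Bigr) = \bigcap_{\alpha \in A} \operatorname{epi}(f_\alpha) .$$

To check the equality of these two sets, start with a point

$$(x, z) \in \operatorname{epi}\Bigl( \sup_{\alpha \in A} f_\alpha \Bigr) .$$

Then $z \ge \sup_{\alpha \in A} f_\alpha(x)$ and so, for all $\beta \in A$, $z \ge f_\beta(x)$. Hence, by definition, $(x, z) \in \operatorname{epi}(f_\beta)$
for all $\beta$ and therefore

$$(x, z) \in \bigcap_{\alpha \in A} \operatorname{epi}(f_\alpha) .$$

Conversely, suppose $(x, z) \in \operatorname{epi}(f_\alpha)$ for all $\alpha \in A$. Then $z \ge f_\alpha(x)$ for all $\alpha \in A$ and
hence $z \ge \sup_{\alpha \in A} f_\alpha(x)$. But this, by definition, implies $(x, z) \in \operatorname{epi}(\sup_{\alpha \in A} f_\alpha)$. This
completes the proof of part (c) and the proposition. □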

Next, we introduce the definition:

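Definition 1.9 Let $f : \mathbb{R}^n \to \mathbb{R}^*$, and $\alpha \in \mathbb{R}$. Then the sets

$$S(f, \alpha) := \{x \in \mathbb{R}^n \mid f(x) < \alpha\} \quad \text{and} \quad \overline{S}(f, \alpha) := \{x \in \mathbb{R}^n \mid f(x) \le \alpha\}$$

are called lower sections of the function $f$.

Proposition 1.10 If $f : \mathbb{R}^n \to \mathbb{R}^*$ is convex, then its lower sections are likewise convex.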

The proof of this result is trivial and we omit it.

The converse of this last proposition is false, as can be easily seen from the function
$x \mapsto x^{1/2}$ from $\mathbb{R}_{>}$ to $\mathbb{R}$. However, the class of functions whose lower sections
$\{x \mid f(x) \le \alpha\}$ (or equivalently the sets $\{x \mid f(x) < \alpha\}$) are all convex is likewise an
important class of functions; such functions are called quasi-convex. These functions appear in
game theory, nonlinear programming (optimization) problems, and mathematical economics. For
example, quasi-concave utility functions (the negatives of quasi-convex functions) imply that
consumers have convex preferences. Quasi-convex functions are obviously generalizations of convex
functions since every convex function is clearly quasi-convex. However they are not as easy to
work with. In particular, while the sum of two convex functions
is convex, the same is not true of quasi-convex functions as the following example shows.

Example 1.11 Define

$$f(x) = \begin{cases} 0 & x \le -2 \\ -(x + 2) & -2 < x \le -1 \\ x & -1 < x \le 0 \\ 0 & x > 0 \end{cases} \quad \text{and} \quad g(x) = \begin{cases} 0 & x \le 0 \\ -x & 0 < x \le 1 \\ x - 2 & 1 < x \le 2 \\ 0 & x > 2 \end{cases} .$$

Here neither function is convex, but the lower sections of each function are convex (they are
intervals), so that each is quasi-convex. Yet the lower section corresponding to $\alpha = -1/2$ for the
sum $f + g$ is not convex. Hence the sum is not quasi-convex.
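
The claim about the sum can be verified at three points. The sketch below transcribes the two piecewise definitions into Python (the names `f`, `g`, `h` and `in_section` are illustrative) and tests membership in the lower section of $f + g$ at level $\alpha = -1/2$.

```python
def f(x):
    # Piecewise definition of f from Example 1.11.
    if x <= -2:
        return 0.0
    if x <= -1:
        return -(x + 2)
    if x <= 0:
        return x
    return 0.0

def g(x):
    # Piecewise definition of g from Example 1.11.
    if x <= 0:
        return 0.0
    if x <= 1:
        return -x
    if x <= 2:
        return x - 2
    return 0.0

def h(x):
    return f(x) + g(x)

ALPHA = -0.5

def in_section(x):
    # Membership in the lower section {x : (f + g)(x) <= alpha}.
    return h(x) <= ALPHA

endpoints_inside = in_section(-1.0) and in_section(1.0)
midpoint_inside = in_section(0.0)
```

Here `endpoints_inside` is true while `midpoint_inside` is false: the points $-1$ and $1$ belong to the section but their midpoint $0$ does not, so the section is not convex.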

It is useful for applications to have an analytic criterion for quasi-convexity. This is the
content of the next result.

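Proposition 1.12 A function $f : \mathbb{R}^n \to \mathbb{R}^*$ is quasi-convex if and only if, for any
$x, y \in \mathbb{R}^n$ and any $\lambda \in [0, 1]$ we have

$$f((1 - \lambda) x + \lambda y) \le \max\{f(x), f(y)\} .$$

Proof: Suppose that the lower sections $\overline{S}(f, \alpha) = \{x \mid f(x) \le \alpha\}$ are convex for every $\alpha$. Let
$x, y \in \mathbb{R}^n$ and let $\tilde{\alpha} := \max\{f(x), f(y)\}$. Then $\overline{S}(f, \tilde{\alpha})$ is convex and, since both
$f(x) \le \tilde{\alpha}$ and $f(y) \le \tilde{\alpha}$, we have that both $x$ and $y$ belong to $\overline{S}(f, \tilde{\alpha})$. Since this latter
set is convex, we have

$$(1 - \lambda) x + \lambda y \in \overline{S}(f, \tilde{\alpha}) \quad \text{or} \quad f((1 - \lambda) x + \lambda y) \le \tilde{\alpha} = \max\{f(x), f(y)\} .$$

Conversely, suppose the inequality holds and let $x, y$ satisfy $f(x) \le \alpha$ and $f(y) \le \alpha$. Then
$f((1 - \lambda) x + \lambda y) \le \max\{f(x), f(y)\} \le \alpha$, so the lower section at level $\alpha$ is convex. □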
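
The analytic criterion suggests a simple numerical test for functions of one variable. The following Python sketch is only a necessary check on a finite grid, under the assumption that a few sample values of $\lambda$ are enough to expose a violation; the helper name `is_quasiconvex_on_grid` and the two test functions are illustrative choices.

```python
import itertools

def is_quasiconvex_on_grid(func, grid, lambdas=(0.25, 0.5, 0.75)):
    """Check f((1-l)x + l*y) <= max(f(x), f(y)) for all grid pairs.

    Passing this check does not prove quasi-convexity; failing it disproves it.
    """
    for x, y in itertools.combinations(grid, 2):
        bound = max(func(x), func(y))
        for lam in lambdas:
            if func((1 - lam) * x + lam * y) > bound + 1e-12:
                return False
    return True

grid = [i / 10 for i in range(0, 51)]  # sample points in [0, 5]

# The square root is increasing, hence passes the test although it is not convex.
sqrt_ok = is_quasiconvex_on_grid(lambda x: x ** 0.5, grid)

# A downward parabola peaks in the interior, so the test must fail.
neg_parabola_ok = is_quasiconvex_on_grid(lambda x: -(x - 2.5) ** 2, grid)
```

The first flag is true and the second false, matching the earlier remark that $x \mapsto x^{1/2}$ has convex lower sections without being convex.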

As we have seen above, the sum of two quasi-convex functions may well not be quasi-
convex. With this analytic test for quasi-convexity, we can check that there are certain
operations which preserve quasi-convexity. We leave the proof of the following result to
the reader.

Proposition 1.13 (a) If the functions $f_1, \ldots, f_k$ are quasi-convex and $\alpha_1, \ldots, \alpha_k$ are
non-negative real numbers, then the function $f := \max\{\alpha_1 f_1, \ldots, \alpha_k f_k\}$ is quasi-convex.

(b) If $\varphi : \mathbb{R} \to \mathbb{R}$ is a non-decreasing function and $f : \mathbb{R}^n \to \mathbb{R}$ is quasi-convex, then the
composition $\varphi \circ f$ is a quasi-convex function.

We now return to the study of convex functions.

A simple sketch of the parabola $y = x^2$ and any horizontal chord (which necessarily lies
above the graph) will convince the reader that all points in the domain corresponding to
the values of the function which lie below that horizontal line form a convex set in the
domain. Indeed, this is a property of convex functions which is often useful.

Proposition 1.14 If $C \subset \mathbb{R}^n$ is a convex set and $f : C \to \mathbb{R}$ is a convex function, then
the level sets $\{x \in C \mid f(x) \le \alpha\}$ and $\{x \in C \mid f(x) < \alpha\}$ are convex for all scalars $\alpha$.

Proof: We leave this proof as an exercise.

Notice that, since the intersection of convex sets is convex, the set of points simultaneously
satisfying m inequalities f1 (x) ≤ c1 , f2 (x) ≤ c2 , . . . , fm (x) ≤ cm where each fi is a convex
function, defines a convex set. In particular, the polygonal region defined by a set of such
inequalities when the fi are affine is convex.

From this result, we can obtain an important fact about points at which a convex function
attains a minimum.

Proposition 1.15 Let $C \subset \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a convex function. Then
the set of points $M \subset C$ at which $f$ attains its minimum is convex. Moreover, any relative
minimum is an absolute minimum.

Proof: If the function does not attain its minimum at any point of $C$, then the set of
such points is empty, which is a convex set. So, suppose that the set of points at which
the function attains its minimum is non-empty and let $m$ be the minimal value attained
by $f$. If $x, y \in M$ and $\lambda \in [0, 1]$ then certainly $(1 - \lambda) x + \lambda y \in C$ and so

$$m \le f((1 - \lambda) x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y) = m ,$$

and so the point $(1 - \lambda) x + \lambda y \in M$. Hence $M$, the set of minimal points, is convex.

Now, suppose that $x^\star \in C$ is a relative minimum point of $f$, but that there is another
point $\hat{x} \in C$ such that $f(\hat{x}) < f(x^\star)$. On the line segment $(1 - \lambda) \hat{x} + \lambda x^\star$, $0 < \lambda < 1$, we have

$$f((1 - \lambda) \hat{x} + \lambda x^\star) \le (1 - \lambda) f(\hat{x}) + \lambda f(x^\star) < f(x^\star) .$$

Letting $\lambda \to 1$, such points lie arbitrarily close to $x^\star$, contradicting the fact that $x^\star$ is a
relative minimum point. □

Again, the example of the simple parabola, shows that the set M may well contain only
a single point, i.e., it may well be that the minimum point is unique. We can guarantee
that this is the case for an important class of convex functions.

Definition 1.16 A real-valued function $f$, defined on a convex set $C \subset \mathbb{R}^n$, is said to be
strictly convex provided, for all $x, y \in C$, $x \neq y$, and $\lambda \in (0, 1)$, we have

$$f((1 - \lambda) x + \lambda y) < (1 - \lambda) f(x) + \lambda f(y) .$$

Proposition 1.17 If $C \subset \mathbb{R}^n$ is a convex set and $f : C \to \mathbb{R}$ is a strictly convex
function then $f$ attains its minimum at, at most, one point.

Proof: Suppose that the set of minimal points $M$ is not empty and contains two distinct
points $x$ and $y$. Then, for any $0 < \lambda < 1$, since $M$ is convex, we have $(1 - \lambda) x + \lambda y \in M$.
But $f$ is strictly convex. Hence

$$m = f((1 - \lambda) x + \lambda y) < (1 - \lambda) f(x) + \lambda f(y) = m ,$$

which is a contradiction. □

If a function is differentiable then, as is the case in elementary calculus, we can give
characterizations of convex functions using derivatives. If $f$ is a continuously differentiable
function defined on an open convex set $C \subset \mathbb{R}^n$ then we denote its gradient at $x \in C$, as
usual, by $\nabla f(x)$. The excess function

$$E(x, y) := f(y) - f(x) - \langle \nabla f(x), y - x \rangle$$

is a measure of the discrepancy between the value of $f$ at the point $y$ and the value of
the tangent approximation at $x$ to $f$ at the point $y$. This is illustrated in the next figure.

Now we introduce the notion of a monotone derivative.

Definition 1.18 The map $x \mapsto \nabla f(x)$ is said to be monotone on $C \subset \mathbb{R}^n$ provided

$$\langle \nabla f(y) - \nabla f(x), y - x \rangle \ge 0$$

for all $x, y \in C$.

Figure 3: The Tangent Approximation

We can now characterize convexity in terms of the function E and the monotonicity
concept just introduced. However, before stating and proving the next theorem, we need
a lemma.

Lemma 1.19 Let $f$ be a real-valued, differentiable function defined on an open interval
$I \subset \mathbb{R}$. Then if the first derivative $f'$ is a non-decreasing function on $I$, the function $f$ is
convex on $I$.

Proof: Choose $x, y \in I$ with $x < y$, and for any $\lambda \in [0, 1]$, define $z_\lambda := (1 - \lambda) x + \lambda y$. By
the Mean Value Theorem, there exist $u, v \in \mathbb{R}$, $x \le v \le z_\lambda \le u \le y$, such that

$$f(y) = f(z_\lambda) + (y - z_\lambda) f'(u) , \quad \text{and} \quad f(z_\lambda) = f(x) + (z_\lambda - x) f'(v) .$$

But $y - z_\lambda = y - (1 - \lambda) x - \lambda y = (1 - \lambda)(y - x)$ and $z_\lambda - x = (1 - \lambda) x + \lambda y - x = \lambda (y - x)$,
and so the two expressions above may be rewritten as

$$f(y) = f(z_\lambda) + (1 - \lambda)(y - x) f'(u) , \quad \text{and} \quad f(z_\lambda) = f(x) + \lambda (y - x) f'(v) .$$

Since, by choice, $v \le u$, and since $f'$ is non-decreasing, this latter equation yields

$$f(z_\lambda) \le f(x) + \lambda (y - x) f'(u) .$$

Hence, multiplying this last inequality by $(1 - \lambda)$ and the expression for $f(y)$ by $-\lambda$ and
adding we get

$$(1 - \lambda) f(z_\lambda) - \lambda f(y) \le (1 - \lambda) f(x) + \lambda (1 - \lambda)(y - x) f'(u) - \lambda f(z_\lambda) - \lambda (1 - \lambda)(y - x) f'(u) ,$$

which can be rearranged to yield

$$(1 - \lambda) f(z_\lambda) + \lambda f(z_\lambda) = f(z_\lambda) \le (1 - \lambda) f(x) + \lambda f(y) ,$$

and this is just the condition for the convexity of $f$. □

We can now prove a theorem which gives three different characterizations of convexity for
continuously differentiable functions.

Theorem 1.20 Let $f$ be a continuously differentiable function defined on an open convex
set $C \subset \mathbb{R}^n$. Then the following are equivalent:

(a) $E(x, y) \ge 0$ for all $x, y \in C$;

(b) the map $x \mapsto \nabla f(x)$ is monotone in $C$;

(c) the function $f$ is convex on $C$.

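Proof: Suppose that (a) holds, i.e. $E(x, y) \ge 0$ on $C \times C$. Then we have both

$$f(y) - f(x) \ge \langle \nabla f(x), y - x \rangle ,$$

and

$$f(x) - f(y) \ge \langle \nabla f(y), x - y \rangle = - \langle \nabla f(y), y - x \rangle .$$

Then, from the second inequality, $f(y) - f(x) \le \langle \nabla f(y), y - x \rangle$, and so

$$\begin{aligned} \langle \nabla f(y) - \nabla f(x), y - x \rangle &= \langle \nabla f(y), y - x \rangle - \langle \nabla f(x), y - x \rangle \\ &\ge (f(y) - f(x)) - (f(y) - f(x)) = 0 . \end{aligned}$$

Hence, the map $x \mapsto \nabla f(x)$ is monotone in $C$.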

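Now suppose the map $x \mapsto \nabla f(x)$ is monotone in $C$, and choose $x, y \in C$. Define a
function $\varphi : [0, 1] \to \mathbb{R}$ by $\varphi(t) := f(x + t(y - x))$. We observe, first, that if $\varphi$ is convex
on $[0, 1]$ then $f$ is convex on $C$. To see this, let $u, v \in [0, 1]$ be arbitrary. On the one hand,

$$\varphi((1 - \lambda) u + \lambda v) = f(x + [(1 - \lambda) u + \lambda v](y - x)) = f((1 - [(1 - \lambda) u + \lambda v]) x + ((1 - \lambda) u + \lambda v) y) ,$$

while, on the other hand,

$$\varphi((1 - \lambda) u + \lambda v) \le (1 - \lambda) f(x + u(y - x)) + \lambda f(x + v(y - x)) .$$

Setting $u = 0$ and $v = 1$ in the above expressions yields

$$f((1 - \lambda) x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y) ,$$

so the convexity of $\varphi$ on $[0, 1]$ (for every choice of $x, y \in C$) implies the convexity of $f$ on $C$.

Now, choose any $\alpha, \beta$, $0 \le \alpha < \beta \le 1$. Then

$$\varphi'(\beta) - \varphi'(\alpha) = \langle \nabla f(x + \beta (y - x)) - \nabla f(x + \alpha (y - x)), y - x \rangle .$$

Setting $u := x + \alpha (y - x)$ and $v := x + \beta (y - x)$³ we have $v - u = (\beta - \alpha)(y - x)$ and
so

$$\varphi'(\beta) - \varphi'(\alpha) = \frac{1}{\beta - \alpha} \langle \nabla f(v) - \nabla f(u), v - u \rangle \ge 0 .$$

Hence $\varphi'$ is non-decreasing, so that, by Lemma 1.19, the function $\varphi$ is convex.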

Finally, if $f$ is convex on $C$, then, for fixed $x, y \in C$ define

$$h(\lambda) := (1 - \lambda) f(x) + \lambda f(y) - f((1 - \lambda) x + \lambda y) .$$

Then $\lambda \mapsto h(\lambda)$ is a non-negative, differentiable function on $[0, 1]$ with $h(0) = 0$, so it attains
its minimum at $\lambda = 0$. Therefore $0 \le h'(0) = E(x, y)$, and the proof is complete. □
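
Conditions (a) and (b) of the theorem are easy to probe numerically. The sketch below is an illustration under stated assumptions: it reuses the function $f(x, y) = \tfrac{1}{2} x^2 + \tfrac{1}{3} y^2$ of Example 1.1 with its hand-computed gradient and evaluates the excess function and the monotonicity expression on random pairs of points. The helper names `excess` and `monotonicity` are hypothetical, and random sampling can only fail to find a counterexample, not prove convexity.

```python
import random

def excess(f, grad, x, y):
    """E(x, y) = f(y) - f(x) - <grad f(x), y - x> for points given as tuples."""
    inner = sum(g * (yi - xi) for g, xi, yi in zip(grad(x), x, y))
    return f(y) - f(x) - inner

def monotonicity(grad, x, y):
    """<grad f(y) - grad f(x), y - x>."""
    return sum((gy - gx) * (yi - xi)
               for gx, gy, xi, yi in zip(grad(x), grad(y), x, y))

def f(p):
    return p[0] ** 2 / 2 + p[1] ** 2 / 3

def grad(p):
    return (p[0], 2 * p[1] / 3)

rng = random.Random(1)

def random_point():
    return (rng.uniform(-3, 3), rng.uniform(-3, 3))

pairs = [(random_point(), random_point()) for _ in range(500)]
min_excess = min(excess(f, grad, x, y) for x, y in pairs)
min_monotone = min(monotonicity(grad, x, y) for x, y in pairs)
```

Both minima come out non-negative (up to rounding), as (a) and (b) predict for a convex function.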

As an immediate corollary, we have

Corollary 1.21 Let $f$ be a continuously differentiable convex function defined on a convex
set $C$. If there is a point $x^\star \in C$ such that, for all $y \in C$, $\langle \nabla f(x^\star), y - x^\star \rangle \ge 0$, then $x^\star$
is an absolute minimum point of $f$ over $C$.

Proof: By the preceding theorem, the convexity of $f$ implies that

$$f(y) - f(x^\star) \ge \langle \nabla f(x^\star), y - x^\star \rangle ,$$

and so, by hypothesis,

$$f(y) \ge f(x^\star) + \langle \nabla f(x^\star), y - x^\star \rangle \ge f(x^\star) .$$

³ Note that $u$ and $v$ are convex combinations of points in the convex set $C$ and so $u, v \in C$.

The inequality $E(x, y) \ge 0$ shows that local information about a convex function (given
in terms of the derivative at a point) gives us global information in terms of a global
underestimator of the function $f$. In a way, this is the key property of convex functions.
For example, suppose that $\nabla f(x) = 0$. Then, for all $y \in \operatorname{dom}(f)$, $f(y) \ge f(x)$, so that
$x$ is a global minimizer of the convex function $f$.

It is also important to remark that the hypothesis that the convex function $f$ is defined
on a convex set is crucial, both for the first order conditions as well as for the second order
conditions. Indeed, consider the function $f(x) = 1/x^2$ with domain $\{x \in \mathbb{R} \mid x \neq 0\}$.
The usual second order condition $f''(x) > 0$ holds for all $x \in \operatorname{dom}(f)$, yet $f$ is not convex there,
so that the second order test fails.

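The condition $E(x, y) \ge 0$ can be given an important geometrical interpretation in terms
of epigraphs. Indeed, if $f$ is convex and $x, y \in \operatorname{dom}(f)$, then for $(y, z) \in \operatorname{epi}(f)$ the inequality

$$z \ge f(y) \ge f(x) + \nabla f(x)^\top (y - x)$$

can be expressed as

$$\begin{pmatrix} \nabla f(x) \\ -1 \end{pmatrix}^{\!\top} \left( \begin{pmatrix} y \\ z \end{pmatrix} - \begin{pmatrix} x \\ f(x) \end{pmatrix} \right) \le 0 .$$

This shows that the hyperplane defined by $(\nabla f(x), -1)^\top$ supports $\operatorname{epi}(f)$ at the boundary
point $(x, f(x))$.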

We now turn to so-called second order criteria for convexity. The discussion involves
the Hessian matrix of a twice continuously differentiable function, and depends on the
question of whether this matrix is positive semi-definite or even positive definite (for strict
convexity).⁴ Let us recall some definitions.

Definition 1.22 A real symmetric $n \times n$ matrix $A$ is said to be

(a) Positive definite provided $x^\top A x > 0$ for all $x \in \mathbb{R}^n$, $x \neq 0$.

(b) Negative definite provided $x^\top A x < 0$ for all $x \in \mathbb{R}^n$, $x \neq 0$.

(c) Positive semidefinite provided $x^\top A x \ge 0$ for all $x \in \mathbb{R}^n$, $x \neq 0$.

(d) Negative semidefinite provided $x^\top A x \le 0$ for all $x \in \mathbb{R}^n$, $x \neq 0$.

(e) Indefinite provided $x^\top A x$ takes on values that differ in sign.

⁴ Notice that the smoothness assumption on the function $f$ is sufficient to ensure that the Hessian
matrix of second partial derivatives is symmetric.

It is important to be able to determine if a matrix is indeed positive definite. In order
to do this, a number of criteria have been developed. Perhaps the most important
characterization is in terms of the eigenvalues.

Theorem 1.23 Let $A$ be a real symmetric $n \times n$ matrix. Then $A$ is positive definite if
and only if all its eigenvalues are positive.

Proof: If $A$ is positive definite and $\lambda$ is an eigenvalue of $A$, then, for any eigenvector $x$
belonging to $\lambda$,

$$x^\top A x = \lambda x^\top x = \lambda \|x\|^2 .$$

Hence

$$\lambda = \frac{x^\top A x}{\|x\|^2} > 0 .$$

Conversely, suppose that all the eigenvalues of $A$ are positive. Let $\{x_1, \ldots, x_n\}$ be an
orthonormal set of eigenvectors of $A$.⁵ Hence any $x \in \mathbb{R}^n$, $x \neq 0$, can be written as

$$x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n$$

with

$$\alpha_i = x^\top x_i \ \text{ for } i = 1, 2, \ldots, n , \quad \text{and} \quad \sum_{i=1}^{n} \alpha_i^2 = \|x\|^2 > 0 .$$

It follows that

$$\begin{aligned} x^\top A x &= (\alpha_1 x_1 + \cdots + \alpha_n x_n)^\top (\alpha_1 \lambda_1 x_1 + \cdots + \alpha_n \lambda_n x_n) \\ &= \sum_{i=1}^{n} \alpha_i^2 \lambda_i \ge \bigl( \min_i \{\lambda_i\} \bigr) \|x\|^2 > 0 . \end{aligned}$$

Hence $A$ is positive definite. □

⁵ The so-called Spectral Theorem for real symmetric matrices states that such a matrix can be diagonalized,
and hence has $n$ linearly independent eigenvectors. These can be replaced by a set of $n$ orthonormal
eigenvectors.

In simple cases where we can compute the eigenvalues easily, this is a useful criterion.

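Example 1.24 Let

$$A = \begin{pmatrix} 2 & -2 \\ -2 & 5 \end{pmatrix} .$$

Then the eigenvalues are the roots of

$$\det(A - \lambda I) = (2 - \lambda)(5 - \lambda) - 4 = (\lambda - 1)(\lambda - 6) .$$

Hence the eigenvalues are both positive and hence the matrix is positive definite. In this
particular case it is easy to check directly that $A$ is positive definite. Indeed, for $(x_1, x_2) \neq (0, 0)$,

$$\begin{aligned} (x_1, x_2) \begin{pmatrix} 2 & -2 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} &= (x_1, x_2) \begin{pmatrix} 2 x_1 - 2 x_2 \\ -2 x_1 + 5 x_2 \end{pmatrix} \\ &= 2 x_1^2 - 4 x_1 x_2 + 5 x_2^2 \\ &= 2 [x_1^2 - 2 x_1 x_2 + x_2^2] + 3 x_2^2 = 2 (x_1 - x_2)^2 + 3 x_2^2 > 0 . \end{aligned}$$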
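
For a symmetric $2 \times 2$ matrix the eigenvalues have a closed form, so the eigenvalue criterion can be checked without any linear algebra library. The following is a small Python sketch under that restriction; the function name `symmetric_2x2_eigenvalues` is an illustrative choice, not a library call.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], returned in increasing order.

    They are the roots of t^2 - (a + d) t + (a d - b^2), i.e.
    (a + d)/2 -/+ sqrt(((a - d)/2)^2 + b^2).
    """
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

# The matrix of Example 1.24.
low, high = symmetric_2x2_eigenvalues(2.0, -2.0, 5.0)
is_positive_definite = low > 0
```

For this matrix the two values are $1$ and $6$, in agreement with the factorisation of the characteristic polynomial in the example.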

This last theorem has some immediate useful consequences. First, if $A$ is positive definite,
then $A$ must be nonsingular, since singular matrices have $\lambda = 0$ as an eigenvalue.
Moreover, since we know that $\det(A)$ is the product of the eigenvalues, and since
each eigenvalue is positive, $\det(A) > 0$. Finally, we have the following result which
depends on the notion of leading principal submatrices.

Definition 1.25 Given any $n \times n$ matrix $A$, let $A_r$ denote the matrix formed by deleting
the last $n - r$ rows and columns of $A$. Then $A_r$ is called the $r$-th leading principal submatrix of
$A$.

Proposition 1.26 If $A$ is a symmetric positive definite matrix then the leading principal
submatrices $A_1, A_2, \ldots, A_n$ of $A$ are all positive definite. In particular, $\det(A_r) > 0$.

Proof: Let $x_r = (x_1, x_2, \ldots, x_r)^\top$ be any non-zero vector in $\mathbb{R}^r$. Set

$$x = (x_1, x_2, \ldots, x_r, 0, \ldots, 0)^\top .$$

Since $x_r^\top A_r x_r = x^\top A x > 0$, it follows that $A_r$ is positive definite, by definition. □

This proposition is half of the famous criterion of Sylvester for positive definite matrices.

Theorem 1.27 A real, symmetric matrix $A$ is positive definite if and only if all of its
leading principal minors are positive.

We will not prove this theorem here but refer the reader to his or her favorite treatise on
linear algebra.

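Example 1.28 Let

$$A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix} .$$

Then

$$A_1 = (2) , \quad A_2 = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} , \quad A_3 = A .$$

Then

$$\det A_1 = 2 , \quad \det A_2 = 4 - 1 = 3 , \quad \text{and} \quad \det A = 4 .$$

Hence, according to Sylvester's criterion, the matrix $A$ is positive definite.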
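
Sylvester's criterion translates directly into a short procedure: form each leading principal submatrix and test the sign of its determinant. The sketch below is one possible rendering in plain Python; the helper names are illustrative, the determinant uses Laplace expansion (adequate only for small matrices), and the input is assumed to be real and symmetric.

```python
def determinant(matrix):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to obtain the minor.
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * determinant(minor)
    return total

def leading_principal_minors(matrix):
    """Determinants of A_1, ..., A_n (upper-left r x r blocks)."""
    return [determinant([row[:r] for row in matrix[:r]])
            for r in range(1, len(matrix) + 1)]

def is_positive_definite(matrix):
    """Sylvester's criterion; assumes the matrix is real and symmetric."""
    return all(m > 0 for m in leading_principal_minors(matrix))

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
minors = leading_principal_minors(A)
```

For the matrix of Example 1.28 the list `minors` contains the three determinants computed in the text, and `is_positive_definite(A)` holds.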

Now we are ready to look at second order conditions for convexity.

Proposition 1.29 Let $D \subset \mathbb{R}^n$ be an open convex set and let $f : D \to \mathbb{R}$ be twice
continuously differentiable in $D$. Then $f$ is convex if and only if the Hessian matrix of $f$
is positive semidefinite throughout $D$.

Proof: By Taylor's Theorem we have

$$f(y) = f(x) + \langle \nabla f(x), y - x \rangle + \frac{1}{2} \langle y - x, \nabla^2 f(x + \lambda (y - x))(y - x) \rangle ,$$

for some $\lambda \in [0, 1]$. Clearly, if the Hessian is positive semi-definite, we have

$$f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle ,$$

which, in view of the definition of the excess function, means that $E(x, y) \ge 0$, which
implies that $f$ is convex on $D$.

Conversely, suppose that the Hessian is not positive semi-definite at some point $x \in D$.
Then, by the continuity of the Hessian, there is a $y \in D$ so that, for all $\lambda \in [0, 1]$,

$$\langle y - x, \nabla^2 f(x + \lambda (y - x))(y - x) \rangle < 0 ,$$

which, in light of the second order Taylor expansion, implies that $E(x, y) < 0$ and so $f$
cannot be convex. □

Let us consider, as an example, the quadratic function $f : \mathbb{R}^n \to \mathbb{R}$ with $\operatorname{dom}(f) = \mathbb{R}^n$,
given by

$$f(x) = \frac{1}{2} x^\top Q x + q^\top x + r ,$$

with $Q$ an $n \times n$ symmetric matrix, $q \in \mathbb{R}^n$ and $r \in \mathbb{R}$. Then since, as we have seen
previously, $\nabla^2 f(x) = Q$, the function $f$ is convex if and only if $Q$ is positive semidefinite.
Strict convexity of $f$ is likewise characterized by the positive definiteness of $Q$.

These first and second-order conditions give us methods of showing that a
given function is convex. Thus, we either check the definition (Jensen's inequality, using
the equivalence that is given by Theorem 2.1.3) or show that the Hessian is positive
semi-definite. Let us look at some simple examples.

Example 1.30 (a) Consider the real-valued function defined on $\mathbb{R}_+$⁶ given by $x \mapsto x \ln(x)$.
Then, since this function is in $C^2(\mathbb{R}_+)$ and $f'(x) = \ln(x) + 1$ and $f''(x) = 1/x > 0$,
we see that $f$ is (even strictly) convex.

(b) The max function $f(x) = \max\{x_1, \ldots, x_n\}$ is convex on $\mathbb{R}^n$. Here we can use Jensen's
inequality. Let $\lambda \in [0, 1]$; then

$$\begin{aligned} f((1 - \lambda) x + \lambda y) &= \max_{1 \le i \le n} ((1 - \lambda) x_i + \lambda y_i) \le (1 - \lambda) \max_{1 \le i \le n} x_i + \lambda \max_{1 \le i \le n} y_i \\ &= (1 - \lambda) f(x) + \lambda f(y) . \end{aligned}$$

⁶ Recall that by $\mathbb{R}_+$ we mean the set $\{x \in \mathbb{R} \mid x > 0\}$.

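(c) The function $q : \mathbb{R} \times \mathbb{R}_+ \to \mathbb{R}$ given by $q(x, y) = x^2 / y$ is convex. In this case,
$\nabla q(x, y) = (2x/y, \ -x^2/y^2)^\top$ while an easy computation shows

$$\nabla^2 q(x, y) = \frac{2}{y^3} \begin{pmatrix} y^2 & -xy \\ -xy & x^2 \end{pmatrix} .$$

Since $y > 0$ and

$$(u_1, u_2) \begin{pmatrix} y^2 & -xy \\ -xy & x^2 \end{pmatrix} (u_1, u_2)^\top = (u_1 y - u_2 x)^2 \ge 0 ,$$

the Hessian of $q$ is positive semidefinite and the function is convex.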
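
The quadratic form of this Hessian can be evaluated at a sample point to see that it is non-negative but not strictly positive: it vanishes in the direction $(u_1, u_2) = (x, y)$, so the Hessian is positive semidefinite rather than positive definite. The Python sketch below hard-codes the Hessian formula from the text; the names `hessian_q` and `quadratic_form` and the sample point $(1, 2)$ are illustrative choices.

```python
def hessian_q(x, y):
    """Hessian of q(x, y) = x^2 / y for y > 0, using the formula in the text."""
    c = 2.0 / y ** 3
    return ((c * y * y, -c * x * y), (-c * x * y, c * x * x))

def quadratic_form(h, u):
    """u^T h u for a symmetric 2 x 2 matrix h and a vector u."""
    return (h[0][0] * u[0] * u[0]
            + 2 * h[0][1] * u[0] * u[1]
            + h[1][1] * u[1] * u[1])

H = hessian_q(1.0, 2.0)                          # [[1, -0.5], [-0.5, 0.25]]
value_generic = quadratic_form(H, (1.0, 0.0))    # strictly positive direction
value_null = quadratic_form(H, (1.0, 2.0))       # direction (x, y): form vanishes
```

The first value is positive and the second is zero, consistent with the identity $(u_1 y - u_2 x)^2 \ge 0$ above.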

Convex Optimization 1 - Charalampos Salis
12 pages
convex-fns-scribed
No ratings yet
convex-fns-scribed
6 pages
CS 726: Nonlinear Optimization 1 Lecture 2: Background Material
No ratings yet
CS 726: Nonlinear Optimization 1 Lecture 2: Background Material
14 pages
Lecture Notes PDF
No ratings yet
Lecture Notes PDF
143 pages
Convexsol 1
No ratings yet
Convexsol 1
40 pages
Convex Analysis and Optimization Solution Manual
100% (2)
Convex Analysis and Optimization Solution Manual
193 pages
1 Theory of Convex Functions
No ratings yet
1 Theory of Convex Functions
14 pages
Convex Functions: Renu M. R
No ratings yet
Convex Functions: Renu M. R
43 pages
Chapter 2, Lecture 3: Building Convex Functions
No ratings yet
Chapter 2, Lecture 3: Building Convex Functions
4 pages
Cobb Douglas
No ratings yet
Cobb Douglas
14 pages
Calculus Criterion For Concavity
No ratings yet
Calculus Criterion For Concavity
14 pages
Chapter 3
No ratings yet
Chapter 3
43 pages
Convex
No ratings yet
Convex
63 pages
(Strong, Strict) Convexity (Princeton. Lecture 14 Pages. ORF523 - Lec7)
No ratings yet
(Strong, Strict) Convexity (Princeton. Lecture 14 Pages. ORF523 - Lec7)
14 pages
Closed Functions: September 4, 2007
No ratings yet
Closed Functions: September 4, 2007
19 pages
optimzation 공부자료0
No ratings yet
optimzation 공부자료0
38 pages
03 Convex Functions
No ratings yet
03 Convex Functions
31 pages
Leonetti Convexity Arxiv
No ratings yet
Leonetti Convexity Arxiv
4 pages
Differential Forms
From Everand
Differential Forms
Henri Cartan
5/5 (2)
Theory of Approximation
From Everand
Theory of Approximation
N. I. Achieser
No ratings yet
Calculus-II (Mathematics) Question Bank
From Everand
Calculus-II (Mathematics) Question Bank
Mohmmad Khaja Shareef
No ratings yet
Algebraic Equations
From Everand
Algebraic Equations
Demetrios P. Kanoussis
No ratings yet
Hyperbolic Functions (Trigonometry) Mathematics E-Book For Public Exams
From Everand
Hyperbolic Functions (Trigonometry) Mathematics E-Book For Public Exams
Mohmmad Khaja Shareef
No ratings yet
Long-Memory Time Series: Theory and Methods
From Everand
Long-Memory Time Series: Theory and Methods
Wilfredo Palma
No ratings yet
Math 220A Practice Final Exam I Solutions - Fall 2002
No ratings yet
Math 220A Practice Final Exam I Solutions - Fall 2002
12 pages
Foundation of Proofs
No ratings yet
Foundation of Proofs
50 pages
An Inquiry-Based Introduction To Proofs
No ratings yet
An Inquiry-Based Introduction To Proofs
23 pages
Parvati Shastri - Lectures On Modules Over Principal Ideal Domains
No ratings yet
Parvati Shastri - Lectures On Modules Over Principal Ideal Domains
26 pages
Iygb Gce: Core Mathematics C2 Advanced Subsidiary
No ratings yet
Iygb Gce: Core Mathematics C2 Advanced Subsidiary
7 pages
JR Mat-A Vsaqs
No ratings yet
JR Mat-A Vsaqs
12 pages
Zimsec Nov 2011 p1 Maths
100% (1)
Zimsec Nov 2011 p1 Maths
6 pages
9.5 Trig
No ratings yet
9.5 Trig
35 pages
XI Maths QP 161
No ratings yet
XI Maths QP 161
4 pages
Mathematical Economics
No ratings yet
Mathematical Economics
33 pages
Mathematics Syllabi B.sc.
No ratings yet
Mathematics Syllabi B.sc.
22 pages
Defuzzification To Scalars
No ratings yet
Defuzzification To Scalars
13 pages
Advanced: You Were Here You Are Here You Are Going Here
No ratings yet
Advanced: You Were Here You Are Here You Are Going Here
35 pages
02 Polynomials
No ratings yet
02 Polynomials
7 pages
Questions On Frequency Analysis of Signals and Systems
100% (1)
Questions On Frequency Analysis of Signals and Systems
42 pages
WRM Y9 Autumn b2 Forming and Solving Equations Assessment B
No ratings yet
WRM Y9 Autumn b2 Forming and Solving Equations Assessment B
2 pages
Trusses and Frames PDF
No ratings yet
Trusses and Frames PDF
63 pages
Crank-Nicolson Type Method For Burgers Equation
No ratings yet
Crank-Nicolson Type Method For Burgers Equation
6 pages
Basic Lowry Model: by Dr. Jean-Paul Rodrigue
No ratings yet
Basic Lowry Model: by Dr. Jean-Paul Rodrigue
14 pages
Ejemplo 8-5 Fogler
No ratings yet
Ejemplo 8-5 Fogler
12 pages
DLL Math Q1 W6
No ratings yet
DLL Math Q1 W6
8 pages
Worksheet Grade 10 Villa
No ratings yet
Worksheet Grade 10 Villa
1 page
Cal 11 Q4 0903 Final
No ratings yet
Cal 11 Q4 0903 Final
16 pages
Mathematics Cheat Sheet
No ratings yet
Mathematics Cheat Sheet
212 pages
1 (A) Suppose The Following Facts Are Known About The Function G and Its Derivative
No ratings yet
1 (A) Suppose The Following Facts Are Known About The Function G and Its Derivative
5 pages
RCCM Notes Final
No ratings yet
RCCM Notes Final
17 pages
MHF4U Exam Review Q&A
100% (1)
MHF4U Exam Review Q&A
31 pages
% DFT and IDFT Computation Using Equations
No ratings yet
% DFT and IDFT Computation Using Equations
5 pages
σ-Automata and Chebyshev-Polynomials: Klaus Sutner
No ratings yet
σ-Automata and Chebyshev-Polynomials: Klaus Sutner
24 pages
IITPKD Math PHD Test
No ratings yet
IITPKD Math PHD Test
6 pages
Chapter 5 - Determinants
No ratings yet
Chapter 5 - Determinants
34 pages
Curves - Concepts and Design Part-2 24 25-02-2015
No ratings yet
Curves - Concepts and Design Part-2 24 25-02-2015
82 pages