SYS 6003: Optimization Fall 2016
Lecture 7
Instructor: Quanquan Gu Date: September 14th
We continue to illustrate the application of the second-order condition for convex functions
with more examples.
Example 1 (Quadratic over Linear Function)
f(x, y) = x²/y,   y > 0.
f (x, y) is convex over R × (0, +∞). To show this, we first calculate the partial derivatives.
The first order partial derivatives are:
∂f(x, y)/∂x = 2x/y,    ∂f(x, y)/∂y = −x²/y².
The second order partial derivatives of f(x, y) are:
∂²f(x, y)/∂x² = 2/y,    ∂²f(x, y)/∂y² = 2x²/y³,    ∂²f(x, y)/∂x∂y = −2x/y².
Then we can write down the Hessian matrix of f(x, y) as:
∇²f(x, y) = [ 2/y       −2x/y² ]
            [ −2x/y²    2x²/y³ ].
Factoring out 2/y³, we obtain:
∇²f(x, y) = (2/y³) [ y²    −xy ]
                   [ −xy   x²  ].
Note that the matrix can be factorized as the outer product of two vectors, yielding
∇²f(x, y) = (2/y³) (y, −x)ᵀ(y, −x),
where we notice that:
(y, −x)ᵀ(y, −x) ⪰ 0.
Therefore we have:
∇²f(x, y) ⪰ 0.
By the second order condition, we know that f (x, y) is convex.
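As a quick sanity check (not part of the original notes), this Hessian can also be evaluated numerically at random points with y > 0 and its eigenvalues verified to be nonnegative. A minimal NumPy sketch:

```python
import numpy as np

def hessian_quad_over_lin(x, y):
    """Hessian of f(x, y) = x^2 / y, valid for y > 0."""
    return np.array([[2.0 / y,         -2.0 * x / y**2],
                     [-2.0 * x / y**2,  2.0 * x**2 / y**3]])

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-10, 10)
    y = rng.uniform(0.1, 10)                     # restrict to y > 0
    eigvals = np.linalg.eigvalsh(hessian_quad_over_lin(x, y))
    assert eigvals.min() >= -1e-9                # PSD up to numerical tolerance
print("Hessian is PSD at every sampled point with y > 0.")
```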
Example 2 (Log-sum-exponential Function) f : Rᵈ → R is defined as follows:
f(x) = log( ∑ᵢ₌₁ᵈ exp(xᵢ) ).    (1)
It is a convex function.
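Convexity of the log-sum-exponential function can also be tested empirically by checking the defining inequality f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y) at random pairs of points. The following sketch is an illustration only; the numerically stable logsumexp helper is an assumed implementation, not from the notes:

```python
import numpy as np

def logsumexp(x):
    """Numerically stable log(sum_i exp(x_i))."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

rng = np.random.default_rng(1)
d = 5
for _ in range(1000):
    x, y = rng.normal(size=d), rng.normal(size=d)
    alpha = rng.uniform()
    lhs = logsumexp(alpha * x + (1 - alpha) * y)
    rhs = alpha * logsumexp(x) + (1 - alpha) * logsumexp(y)
    assert lhs <= rhs + 1e-9   # defining inequality of convexity
print("Convexity inequality holds at all sampled pairs.")
```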
Example 3 (Geometric Mean) f : Rᵈ → R is defined as follows:
f(x) = ( ∏ᵢ₌₁ᵈ xᵢ )^(1/d).    (2)
It is a concave function.
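The same kind of empirical check, with the inequality reversed, illustrates the concavity of the geometric mean on vectors with positive entries. Again a sanity-check sketch, not a proof:

```python
import numpy as np

def geometric_mean(x):
    """f(x) = (prod_i x_i)^(1/d), computed in log-space, for x with positive entries."""
    return np.exp(np.mean(np.log(x)))

rng = np.random.default_rng(2)
d = 5
for _ in range(1000):
    x, y = rng.uniform(0.1, 10, size=d), rng.uniform(0.1, 10, size=d)
    alpha = rng.uniform()
    lhs = geometric_mean(alpha * x + (1 - alpha) * y)
    rhs = alpha * geometric_mean(x) + (1 - alpha) * geometric_mean(y)
    assert lhs >= rhs - 1e-9   # concavity: inequality is reversed
print("Concavity inequality holds at all sampled pairs.")
```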
For a convex function, every local minimum is also a global minimum, as the following
theorem shows.
Theorem 1 (Local Minimum is also a Global Minimum) Let f : Rᵈ → R be convex. If
x∗ is a local minimum of f over a convex set D, then x∗ is also a global minimum of f over D.
Proof: Since D is a convex set, for any y ∈ D, y − x∗ is a feasible direction at x∗. Since x∗ is a local
minimum, we can choose a small enough α ∈ (0, 1] such that
f (x∗ ) ≤ f (x∗ + α(y − x∗ )). (3)
Furthermore, since f is convex, we have
f (x∗ + α(y − x∗ )) = f (αy + (1 − α)x∗ ) ≤ αf (y) + (1 − α)f (x∗ ). (4)
Combining (3) and (4), we have
f (x∗ ) ≤ αf (y) + (1 − α)f (x∗ ),
which, after subtracting (1 − α)f(x∗) from both sides and dividing by α > 0, implies that
f(x∗) ≤ f(y). Since y is an arbitrary point in D, this proves that x∗ is a global minimum.
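Theorem 1 can be illustrated numerically: a simple local method such as projected gradient descent, started from many different feasible points, should always return the same objective value when the objective and the feasible set are convex. The example below is a hypothetical illustration with an assumed box constraint D = [0, 1]³ and objective f(x) = ‖x − c‖², not part of the original proof:

```python
import numpy as np

def projected_gradient_descent(grad, x0, lower, upper, step=0.1, iters=500):
    """Minimize a smooth convex function over a box via projected gradient steps."""
    x = x0.copy()
    for _ in range(iters):
        x = np.clip(x - step * grad(x), lower, upper)   # gradient step, then projection
    return x

# Assumed illustrative problem: f(x) = ||x - c||^2 over the box D = [0, 1]^3.
c = np.array([2.0, -1.0, 3.0])
f = lambda x: np.sum((x - c) ** 2)
grad = lambda x: 2.0 * (x - c)

rng = np.random.default_rng(3)
values = [f(projected_gradient_descent(grad, rng.uniform(0.0, 1.0, size=3), 0.0, 1.0))
          for _ in range(20)]

# Every run reaches the same local (hence global) minimum, here at x* = (1, 0, 1).
print(max(values) - min(values))   # essentially zero
```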
Theorem 2 (First-order Condition for a Global Minimum) Let the function f : Rᵈ →
R be convex and continuously differentiable. Then x∗ is a global minimum of f over a convex set
D if and only if
∇f(x∗)ᵀ(x − x∗) ≥ 0, for all x ∈ D.    (5)
Proof: “⇒”
Since x∗ is a global minimum, x∗ must also be a local minimum. By the first-order necessary
condition for a local minimum, we have ∇f(x∗)ᵀd ≥ 0 for any feasible direction d. For
any x ∈ D, d = x − x∗ is a feasible direction, so we obtain
∇f(x∗)ᵀ(x − x∗) ≥ 0,
which completes the proof of the forward direction.
“⇐”
Since f is convex and differentiable, the first-order characterization of convexity gives:
f(x) ≥ f(x∗) + ∇f(x∗)ᵀ(x − x∗) for any x ∈ D.
Thus, if ∇f(x∗)ᵀ(x − x∗) ≥ 0 for all x ∈ D, then f(x) − f(x∗) ≥ ∇f(x∗)ᵀ(x − x∗) ≥ 0, which means x∗
is a global minimum of f over D.
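Condition (5) can likewise be checked numerically on a small example. In the sketch below (assumed data, for illustration only), the minimizer of f(x) = ‖x − c‖² over the box D = [0, 1]³ is the projection of c onto the box, and the inner product ∇f(x∗)ᵀ(x − x∗) is nonnegative at every sampled feasible x:

```python
import numpy as np

# Assumed illustrative problem: f(x) = ||x - c||^2 over the box D = [0, 1]^3.
c = np.array([2.0, -1.0, 0.5])
grad = lambda x: 2.0 * (x - c)          # gradient of f

# For this separable quadratic over a box, the minimizer is the projection of c onto the box.
x_star = np.clip(c, 0.0, 1.0)

rng = np.random.default_rng(4)
for _ in range(1000):
    x = rng.uniform(0.0, 1.0, size=3)   # arbitrary feasible point in D
    assert grad(x_star) @ (x - x_star) >= -1e-9   # condition (5)
print("Condition (5) holds at all sampled feasible points.")
```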
In the following, we will show another way to prove that a function is convex. First of
all, let’s introduce the restriction of a function to a line.
Let f : Rᵈ → R be a function. The restriction of f to the line {x + tv : t ∈ R} is defined as
g : R → R, g(t) = f(x + tv), where dom(g) = {t : x + tv ∈ dom(f)}.
Theorem 3 (Restriction of a convex function to a line) f : Rᵈ → R is a convex
function if and only if the function g : R → R, g(t) = f(x + tv), with dom(g) = {t : x + tv ∈
dom(f)}, is convex for any x ∈ dom(f) and v ∈ Rᵈ.
Proof: “⇒”: f is convex → g is convex.
For any t1 , t2 ∈ dom(g) and any α ∈ [0, 1], we have
g(αt1 + (1 − α)t2 ) = f (x + (αt1 + (1 − α)t2 )v)
= f (αx + αt1 v + (1 − α)x + (1 − α)t2 v)
= f (α(x + t1 v) + (1 − α)(x + t2 v))
Since f (x) is convex, it then follows that
g(αt1 + (1 − α)t2 ) ≤ αf (x + t1 v) + (1 − α)f (x + t2 v)
= αg(t1 ) + (1 − α)g(t2 ),
where the last equality follows by the definition of g(t). Thus, by definition, g(t) is convex.
“⇐” g is convex → f is convex.
For any x, y ∈ dom(f ) and any α ∈ [0, 1], we want to show
f (αx + (1 − α)y) ≤ αf (x) + (1 − α)f (y).
Let v = y − x, and consider g(t) = f (x + t(y − x)). It is easy to verify that g(0) = f (x),
g(1) = f (y), and g(1 − α) = f (αx + (1 − α)y). We then have
f (αx + (1 − α)y) = g(1 − α) (6)
= g(α · 0 + (1 − α) · 1)
≤ αg(0) + (1 − α)g(1)
= αf (x) + (1 − α)f (y).
Therefore, by definition, f (x) is a convex function.
Theorem 3 says that a function is convex if and only if the restriction of this
function to any line is convex. It enables us to check the convexity of f by checking the convexity
of functions of one variable.
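In practice, Theorem 3 also gives a convenient numerical sanity check: restrict a candidate function to random lines and test the one-dimensional convexity inequality. The sketch below (an assumed example, not from the notes) applies this to the log-sum-exponential function of Example 2:

```python
import numpy as np

def f(x):
    """Log-sum-exponential, used here as the candidate function."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def restriction_is_convex(f, x, v, num_checks=200, rng=None):
    """Test convexity of g(t) = f(x + t v) via the defining inequality at random t's."""
    rng = rng if rng is not None else np.random.default_rng()
    g = lambda t: f(x + t * v)
    for _ in range(num_checks):
        t1, t2 = rng.uniform(-5, 5, size=2)
        alpha = rng.uniform()
        if g(alpha * t1 + (1 - alpha) * t2) > alpha * g(t1) + (1 - alpha) * g(t2) + 1e-9:
            return False
    return True

rng = np.random.default_rng(5)
d = 4
checks = [restriction_is_convex(f, rng.normal(size=d), rng.normal(size=d), rng=rng)
          for _ in range(50)]
print(all(checks))   # True: every sampled one-dimensional restriction is convex
```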