1 Continuous Extensions of Submodular Functions
Submodular functions are functions assigning values to all subsets of a finite set N. Equivalently,
we can regard them as functions on the boolean hypercube, f : {0,1}^N → R. It has often been
said that submodular functions are analogous to convex or perhaps concave functions. But this
analogy is somewhat nebulous, and indeed it is not very clear whether submodular functions are
convex or rather concave. A related question is: what is a natural extension of a
function f : {0,1}^N → R to the domain [0,1]^N? This question does not have a unique answer.
1.1 Convex and Concave Closures
First, every function f : {0,1}^N → R has two canonical extensions f^+, f^- : [0,1]^N → R, where f^+
is concave and f^- is convex. These functions are the concave closure and convex closure of f.

Definition 1 For f : {0,1}^N → R, we define

the concave closure f^+(x) = max{ Σ_{S⊆N} α_S f(S) : Σ_{S⊆N} α_S 1_S = x, Σ_{S⊆N} α_S = 1, α_S ≥ 0 };

the convex closure f^-(x) = min{ Σ_{S⊆N} α_S f(S) : Σ_{S⊆N} α_S 1_S = x, Σ_{S⊆N} α_S = 1, α_S ≥ 0 }.
It is easy to see by compactness that the maximum and minimum are well defined. Equivalently, we
can say that f^+(x) = max_D E_{R∼D}[f(R)], where the maximum is taken over all distributions D such
that E[1_R] = x. Similarly, f^-(x) is obtained by taking the minimum over all such distributions.
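For a small ground set, both closures can be computed directly by solving the defining linear programs over all 2^n subsets. The following sketch (a helper of our own devising, exponential in |N|, assuming SciPy is available) does exactly that with scipy.optimize.linprog:

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog


def closures(f, n, x):
    """Compute (f^-(x), f^+(x)) by brute-force LP over all 2^n subsets.

    f maps frozensets to values; n = |N|; x is a point in [0,1]^n.
    """
    subsets = [frozenset(c) for r in range(n + 1)
               for c in combinations(range(n), r)]
    cost = np.array([f[S] for S in subsets])
    # Constraints: sum_S alpha_S 1_S = x and sum_S alpha_S = 1, alpha >= 0.
    A_eq = np.array([[float(i in S) for S in subsets] for i in range(n)]
                    + [[1.0] * len(subsets)])
    b_eq = np.append(np.asarray(x, dtype=float), 1.0)
    convex = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))    # f^-(x)
    concave = linprog(-cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))  # f^+(x)
    return convex.fun, -concave.fun
```

For instance, for the submodular function f(∅) = 0, f({0}) = f({1}) = f({0,1}) = 1 and x = (1/2, 1/2), this yields f^-(x) = 1/2 but f^+(x) = 1, so the two closures genuinely differ.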
Lemma 2 For any f : {0,1}^N → R, f^+ is concave and f^- is convex.
Proof: We prove the claim for f^+; the claim for f^- follows by considering −f.
Let x, y ∈ [0,1]^N and z = λx + (1−λ)y, where λ ∈ [0,1]. Choose optimal distributions for x and y:
f^+(x) = Σ_{S⊆N} α_S f(S) with Σ_{S⊆N} α_S 1_S = x, and similarly f^+(y) = Σ_{S⊆N} β_S f(S) with
Σ_{S⊆N} β_S 1_S = y. Then γ_S = λα_S + (1−λ)β_S defines a distribution with Σ_{S⊆N} γ_S 1_S = z, and hence

f^+(z) ≥ Σ_{S⊆N} γ_S f(S) = λ f^+(x) + (1−λ) f^+(y).

Therefore f^+ is concave. □
1.2 Lovász extension and submodular minimization
The convex closure of a submodular function is identical to another concept, the Lovász extension.
Lovász's definition was as follows.
Definition 3 For a function f : {0,1}^N → R, f^L : [0,1]^N → R is defined by

f^L(x) = Σ_{i=0}^{n} λ_i f(S_i),

where S_0 ⊂ S_1 ⊂ · · · is a chain of subsets such that Σ_i λ_i 1_{S_i} = x, Σ_i λ_i = 1, and λ_i ≥ 0.
An equivalent way to define the Lovász extension is: f^L(x) = E[f({i : x_i > λ})], where λ is
uniformly random in [0,1]. Note that the Lovász extension is easy to compute, given oracle access
to f.
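Indeed, the threshold formulation gives an evaluation procedure using one sort and n+1 oracle calls: sorting the coordinates splits [0,1] into intervals of λ-values on which the set {i : x_i > λ} is constant. A minimal sketch (the helper name is ours), assuming f is given as a Python function on frozensets:

```python
def lovasz_extension(f, x):
    """Evaluate f^L(x) = E_lambda[f({i : x_i > lambda})], lambda ~ U[0,1]."""
    n = len(x)
    order = sorted(range(n), key=lambda i: -x[i])      # decreasing x_i
    thresholds = [1.0] + [x[i] for i in order] + [0.0]
    total, S = 0.0, set()
    for k in range(n + 1):
        width = thresholds[k] - thresholds[k + 1]      # measure of lambdas picking S
        total += width * f(frozenset(S))               # S = top-k coordinates of x
        if k < n:
            S.add(order[k])
    return total
```

At a vertex x = 1_S this reduces to f(S), confirming that f^L is an extension of f.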
Lemma 4 The Lovász extension f^L and the convex closure f^- are identical if and only if f is submodular.
Proof: Let us assume that f is submodular. Then consider f^-(x) = min{ Σ_{S⊆N} α_S f(S) :
Σ_{S⊆N} α_S 1_S = x, Σ_{S⊆N} α_S = 1, α_S ≥ 0 }. Let us pick a probability distribution achieving f^-(x)
which in addition maximizes Σ_S α_S |S|^2; we claim that this distribution must be supported on a
chain. If not, let α_A, α_B > 0 be such that A ⊄ B and B ⊄ A. By submodularity, we have
f(A ∪ B) + f(A ∩ B) ≤ f(A) + f(B). Let us replace an ε-amount of A and B by A ∪ B and
A ∩ B. This clearly does not increase Σ_S α_S f(S), and it increases Σ_S α_S |S|^2:

|A ∪ B|^2 + |A ∩ B|^2 = (|A| + |B \ A|)^2 + (|B| − |B \ A|)^2 = |A|^2 + |B|^2 + 2|B \ A|(|A| − |B| + |B \ A|) > |A|^2 + |B|^2,

since 0 ≠ |B \ A| > |B| − |A|. Therefore, we get a contradiction. The minimizing probability
distribution α_S must be supported on a chain, and this is the unique chain defining f^L(x).

If f is not submodular, consider S and i, j ∉ S such that f(S) + f(S + i + j) > f(S + i) + f(S + j),
and take x = 1_S + (1/2) 1_{{i,j}}. The Lovász extension evaluates to

f^L(x) = (1/2) f(S) + (1/2) f(S + i + j).

However, an alternative probability distribution for x is α_{S+i} = α_{S+j} = 1/2, which implies

f^-(x) ≤ (1/2) f(S + i) + (1/2) f(S + j) < f^L(x). □
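The second direction of the proof can be checked numerically on the smallest possible example. Take N = {0, 1}, S = ∅, and the supermodular (hence non-submodular) function f(∅) = f({0}) = f({1}) = 0, f({0,1}) = 1; the names below are our own. A sketch:

```python
def f(S):
    """Non-submodular: f(empty) + f({0,1}) = 1 > 0 = f({0}) + f({1})."""
    return 1.0 if len(S) == 2 else 0.0


# x = 1_S + (1/2) 1_{i,j} with S = emptyset, i = 0, j = 1:
x = (0.5, 0.5)

# Lovasz extension: for lambda < 1/2 the thresholded set is {0,1},
# for lambda > 1/2 it is the empty set, each with probability 1/2.
f_L = 0.5 * f(frozenset()) + 0.5 * f(frozenset({0, 1}))

# The alternative distribution alpha_{S+i} = alpha_{S+j} = 1/2 is feasible
# for x, so its value upper-bounds the convex closure f^-(x).
upper = 0.5 * f(frozenset({0})) + 0.5 * f(frozenset({1}))

assert upper < f_L  # hence f^-(x) < f^L(x): the two extensions differ here
```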
This means that for submodular functions, the convex closure can be evaluated efficiently. This
is not at all clear a priori, and it is not true for the concave closure, which is NP-hard to evaluate
for submodular functions. Since convex functions can be minimized efficiently, this explains why
submodular functions can also be minimized efficiently.
Theorem 5 (Grötschel, Lovász, Schrijver '88) The problem min_{S⊆N} f(S) can be solved in
time poly(|N|), for any submodular function f : 2^N → R.
The GLS algorithm is based on the ellipsoid method; later, more efficient combinatorial algorithms were found by Schrijver and by Fleischer-Fujishige-Iwata.
In contrast, maximizing a submodular function is NP-hard, as can be seen from the special case
of Max Cut. In this sense, submodular functions are closer to convex functions than concave ones.
1.3 Multilinear extension
Still, submodular functions exhibit some aspects of concavity. For instance, the function f(S) =
φ(|S|) is submodular if and only if φ is a concave function (of one variable). Intuitively, this concave
aspect is useful in maximization problems such as max{f(S) : |S| ≤ k}, and it makes sense to look
for a continuous extension of f which would capture this. Unfortunately, the concave closure is
hard to evaluate, so this is not the right extension for algorithmic applications. The extension
which turns out to be useful here is the multilinear extension.
Definition 6 (Multilinear extension) For a set function f : 2^N → R, we define its multilinear
extension F : [0,1]^N → R by

F(x) = Σ_{S⊆N} f(S) Π_{i∈S} x_i Π_{j∈N\S} (1 − x_j).

We remark that an alternative way to define F is to set F(x) = E[f(x̂)], where x̂ is a random
set in which each element i appears independently with probability x_i. This is clearly equivalent to the
definition above.
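For a small ground set, F can be evaluated exactly by summing over all 2^n subsets. A brute-force sketch (the helper name is ours), assuming f is a Python function on frozensets:

```python
from itertools import combinations


def multilinear(f, n, x):
    """F(x) = sum_S f(S) * prod_{i in S} x_i * prod_{j not in S} (1 - x_j)."""
    total = 0.0
    for r in range(n + 1):
        for combo in combinations(range(n), r):
            S = frozenset(combo)
            p = 1.0  # probability that the random set equals S
            for i in range(n):
                p *= x[i] if i in S else 1.0 - x[i]
            total += p * f(S)
    return total
```

For example, with f(S) = min(|S|, 1) and x = (1/2, 1/2), each of the four subsets has probability 1/4, so F(x) = (0 + 1 + 1 + 1)/4 = 3/4.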
The multilinear extension can be defined for any set function but it acquires particularly nice
properties for submodular functions.
Lemma 7 Let F : [0,1]^N → R be the multilinear extension of a set function f : 2^N → R.

If f is non-decreasing, then ∂F/∂x_i ≥ 0 everywhere.

If f is submodular, then ∂²F/∂x_i∂x_j ≤ 0 everywhere.
Proof: Given x ∈ [0,1]^N, let R be a random set where elements appear independently with
probabilities x_i. Since F is multilinear, the first partial derivative ∂F/∂x_i is constant when only x_i
varies. Hence, it can be written as follows:

∂F/∂x_i = F(x_1,...,x_{i−1},1,x_{i+1},...,x_n) − F(x_1,...,x_{i−1},0,x_{i+1},...,x_n) = E[f(R + i) − f(R − i)] ≥ 0,

since f(R + i) ≥ f(R − i) by monotonicity.

To prove the second part, observe that the first partial derivatives themselves are multilinear,
and hence the second partial derivatives can be written as follows:

∂²F/∂x_i∂x_j = ∂F/∂x_j |_{(x_1,...,x_{i−1},1,x_{i+1},...,x_n)} − ∂F/∂x_j |_{(x_1,...,x_{i−1},0,x_{i+1},...,x_n)}
= E[f(R + i + j) − f(R + i − j)] − E[f(R − i + j) − f(R − i − j)] ≤ 0,

because f(R + i + j) − f(R + i − j) ≤ f(R − i + j) − f(R − i − j) by submodularity. □
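Because F is multilinear, the partial derivatives above are exactly the differences between F with one coordinate fixed to 1 and to 0, so the lemma can be checked numerically with no calculus at all. A sketch using a brute-force F (all helper names are ours):

```python
from itertools import combinations


def multilinear(f, n, x):
    """Brute-force evaluation of the multilinear extension F."""
    total = 0.0
    for r in range(n + 1):
        for combo in combinations(range(n), r):
            S = frozenset(combo)
            p = 1.0
            for i in range(n):
                p *= x[i] if i in S else 1.0 - x[i]
            total += p * f(S)
    return total


def dF(f, n, x, i):
    """dF/dx_i = F(..., x_i = 1, ...) - F(..., x_i = 0, ...)."""
    hi, lo = list(x), list(x)
    hi[i], lo[i] = 1.0, 0.0
    return multilinear(f, n, hi) - multilinear(f, n, lo)


def d2F(f, n, x, i, j):
    """d^2F/dx_i dx_j, by differencing dF/dx_j in the x_i coordinate."""
    hi, lo = list(x), list(x)
    hi[i], lo[i] = 1.0, 0.0
    return dF(f, n, hi, j) - dF(f, n, lo, j)
```

For the non-decreasing submodular function f(S) = min(|S|, 1), one finds dF ≥ 0 and d2F ≤ 0 at every point, as the lemma predicts.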
We remark that the converse is also true: non-negativity of the first partial derivatives and nonpositivity of the second partial derivatives imply monotonicity and submodularity, respectively. We
leave this as an exercise for the reader. As direct consequences of Lemma 7, we obtain the following
convexity properties of the multilinear relaxation.
Corollary 8 Let F : [0,1]^N → R be the multilinear extension of a set function f : 2^N → R. Then

If f is non-decreasing, then F is non-decreasing along any line of direction d ≥ 0.

If f is submodular, then F is concave along any line of direction d ≥ 0.

If f is submodular, then F is convex along any line of direction e_i − e_j for i, j ∈ N.
Proof: Let φ(λ) = F(x_0 + λd) denote the function along some line of direction d ≥ 0. By the
chain rule and Lemma 7, we have

φ'(λ) = Σ_{i∈N} d_i ∂F/∂x_i |_{x_0+λd} ≥ 0

if f is non-decreasing, since d_i ≥ 0 and ∂F/∂x_i ≥ 0; hence F is non-decreasing along the line. Similarly,

φ''(λ) = Σ_{i,j∈N} d_i d_j ∂²F/∂x_i∂x_j |_{x_0+λd} ≤ 0

if f is submodular, since d_i d_j ≥ 0 and ∂²F/∂x_i∂x_j ≤ 0; hence F is concave along the line.
Finally, for a line of direction e_i − e_j, let ψ(λ) = F(x_0 + λ(e_i − e_j)); then

ψ''(λ) = ∂²F/∂x_i² − 2 ∂²F/∂x_i∂x_j + ∂²F/∂x_j²,

with all the derivatives evaluated at x_0 + λ(e_i − e_j). We have ∂²F/∂x_i² = ∂²F/∂x_j² = 0 because F is
multilinear, and ∂²F/∂x_i∂x_j ≤ 0; hence ψ''(λ) ≥ 0 and F is convex along the line. □

1.4 Evaluation
There is a technical point here that we need to deal with: evaluating the multilinear extension
exactly requires 2^n queries to the value oracle of f. Obviously, this is something we cannot afford,
and hence we will evaluate F(x) only approximately.
Lemma 9 If F is the multilinear extension of f, x ∈ [0,1]^n, and R_1, ..., R_t are independent
samples of random sets, where element i appears independently with probability x_i, then

Pr[ |(1/t) Σ_{i=1}^{t} f(R_i) − F(x)| > ε max_{S⊆N} |f(S)| ] < e^{−ε²t/4}.
Proof: We can write F(x) = E[f(R)], where R is random as in the lemma. Let M = max_S |f(S)|;
then f(R) is a random variable in the range [−M, M]. Let Y_i = (1/M) f(R_i), where R_i is the i-th random
sample. We have Y_i ∈ [−1, 1] and Σ_{i=1}^{t} E[Y_i] = (t/M) F(x). By the Chernoff bound,

Pr[ |Σ_{i=1}^{t} Y_i − (t/M) F(x)| > εt ] < e^{−ε²t²/4t} = e^{−ε²t/4}. □
In the following, we assume that we can evaluate F(x) to an arbitrary precision (more precisely,
with additive error max_{S⊆N} |f(S)|/poly(n)). In some cases, further discussion is needed to ensure that
this does not affect the approximation factors significantly, and we shall return to this issue when
necessary.
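Lemma 9 translates directly into a sampling estimator: t = O(ε^{-2} log(1/δ)) independent samples suffice for additive error ε·max|f(S)| with probability 1 − δ. A minimal sketch (the helper name is ours), assuming f is a Python function on frozensets:

```python
import random


def estimate_multilinear(f, x, t, seed=0):
    """Estimate F(x) = E[f(R)] by averaging f over t independent samples
    of R, where element i is included independently with probability x_i."""
    rng = random.Random(seed)
    n = len(x)
    total = 0.0
    for _ in range(t):
        R = frozenset(i for i in range(n) if rng.random() < x[i])
        total += f(R)
    return total / t
```

With f(S) = min(|S|, 1), x = (1/2, 1/2), and t = 20000, the estimate concentrates tightly around the exact value F(x) = 3/4.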
1.5 Summary

We have seen four (in effect three) possible extensions of a submodular function. They are ordered
as follows:

f^+(x) ≥ F(x) ≥ f^-(x) = f^L(x).

This can be seen from the fact that each extension can be written as E[f(R)] for some distribution
of R such that E[1_R] = x. The concave closure f^+(x) maximizes this expectation, f^-(x) minimizes
the expectation, and the multilinear extension F(x) is somewhere in between.
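This chain of inequalities can be verified numerically on a small example by computing all the extensions by brute force (a script-style sketch with helpers of our own, assuming SciPy is available):

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

n = 2
x = [0.5, 0.5]
f = lambda S: 1.0 if S else 0.0   # coverage-type: monotone and submodular
subsets = [frozenset(c) for r in range(n + 1)
           for c in combinations(range(n), r)]

# Convex and concave closures: LPs over distributions on subsets.
cost = np.array([f(S) for S in subsets])
A_eq = np.array([[float(i in S) for S in subsets] for i in range(n)]
                + [[1.0] * len(subsets)])
b_eq = np.append(np.array(x), 1.0)
f_minus = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
f_plus = -linprog(-cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun

# Multilinear extension, exactly.
F_x = sum(f(S) * np.prod([x[i] if i in S else 1 - x[i] for i in range(n)])
          for S in subsets)

# Lovasz extension via the threshold formula E[f({i : x_i > lambda})].
order = sorted(range(n), key=lambda i: -x[i])
th = [1.0] + [x[i] for i in order] + [0.0]
f_L = sum((th[k] - th[k + 1]) * f(frozenset(order[:k])) for k in range(n + 1))

assert f_plus >= F_x - 1e-6        # f^+(x) >= F(x)
assert F_x >= f_minus - 1e-6       # F(x) >= f^-(x)
assert abs(f_minus - f_L) < 1e-6   # f^-(x) = f^L(x) for submodular f
```

Here the values come out as f^+(x) = 1, F(x) = 3/4, and f^-(x) = f^L(x) = 1/2, so all three inequalities are strict or tight exactly as the summary predicts.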