Disc Math 3

Chapter 3
Some Counting Problems; Binomial Coefficients

3.1 Counting Permutations and Functions

In this short section, we consider some simple counting problems. Let us begin with
permutations. Recall that a permutation of a set, A, is any bijection between A and itself. If A is a
finite set with n elements, we mentioned earlier (without proof) that A has n! permutations,
where the factorial function, n ↦ n! (n ∈ N), is given recursively by:
\[
0! = 1, \qquad (n + 1)! = (n + 1)\, n!.
\]
The reader should check that the existence of the function, n ↦ n!, can be justified using
the Recursion Theorem (Theorem 2.5.1).
Proposition 3.1.1 The number of permutations of a set of n elements is n!.
Proof. We prove that if A and B are any two finite sets of the same cardinality, n, then the
number of bijections between A and B is n!. In the special case where B = A, we get
our proposition.
The proof is by induction on n. For n = 0, the empty set has one bijection (the empty
function). So, there are 0! = 1 permutations, as desired.
Assume inductively that if A and B are any two finite sets of the same cardinality, n,
then the number of bijections between A and B is n!. If A and B are sets with n + 1
elements, then pick any element, a ∈ A, and write A = A′ ∪ {a}, where A′ = A − {a} has
n elements. Now, any bijection, f : A → B, must assign some element of B to a and then
the restriction f ↾ A′ is a bijection between A′ and B′ = B − {f(a)}. By the induction
hypothesis, there are n! bijections between A′ and B′. Since there are n + 1 ways of picking
f(a) in B, the total number of bijections between A and B is (n + 1)n! = (n + 1)!,
establishing the induction step.
Let us also count the number of functions between two finite sets.
Proposition 3.1.2 If A and B are finite sets with |A| = m and |B| = n, then the set of
functions, B^A, from A to B has n^m elements.
Proof. We proceed by induction on m. For m = 0, we have A = ∅, and the only function is
the empty function. In this case, n^0 = 1 and the base case holds.
Assume the induction hypothesis holds for m and assume |A| = m + 1. Pick any element,
a ∈ A, and let A′ = A − {a}, a set with m elements. Any function, f : A → B, assigns an
element, f(a) ∈ B, to a and f ↾ A′ is a function from A′ to B. By the induction hypothesis,
there are n^m functions from A′ to B. Since there are n ways of assigning f(a) ∈ B to a,
there are n · n^m = n^{m+1} functions from A to B, establishing the induction step.
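The count in Proposition 3.1.2 can be checked by brute force for small sets. The following is a minimal sketch, assuming A = {0, ..., m − 1} and B = {0, ..., n − 1}; the function name `count_functions` is illustrative and does not come from the text.

```python
from itertools import product


def count_functions(m: int, n: int) -> int:
    """Count the functions from an m-element set to an n-element set by enumeration."""
    # A function A -> B chooses one image in B for each of the m elements of A,
    # so it corresponds to exactly one m-tuple over B.
    return sum(1 for _ in product(range(n), repeat=m))


if __name__ == "__main__":
    for m in range(4):
        for n in range(4):
            print(m, n, count_functions(m, n), n ** m)
```

For m = 0 the enumeration yields only the empty tuple, which matches the empty function in the base case of the proof.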
As a corollary, we determine the cardinality of a finite power set.
Corollary 3.1.3 For any finite set, A, if |A| = n, then |2^A| = 2^n.
Proof. By Proposition 2.9.4, there is a bijection between 2^A and the set of functions {0, 1}^A.
Since |{0, 1}| = 2, we get |2^A| = |{0, 1}^A| = 2^n, by Proposition 3.1.2.
Computing the value of the factorial function for a few inputs, say n = 1, 2, . . . , 10, shows
that it grows very fast. For example,
\[
10! = 3{,}628{,}800.
\]
Is it possible to quantify how fast factorial grows compared to other functions, say n^n or e^n?
Remarkably, the answer is yes. A beautiful formula due to James Stirling (1692-1770) tells
us that
\[
n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n},
\]
which means that
\[
\lim_{n\to\infty} \frac{n!}{\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}} = 1.
\]
Here, of course,
\[
e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \cdots
\]
is the base of the natural logarithm. It is even possible to estimate the error. It turns out that
\[
n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} e^{\lambda_n},
\]
where
\[
\frac{1}{12n + 1} < \lambda_n < \frac{1}{12n},
\]
a formula due to Jacques Binet (1786-1856).
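A quick numerical check makes the quality of Stirling's approximation concrete. The sketch below is illustrative only: it computes λ_n from the defining equation n! = √(2πn)(n/e)^n e^{λ_n} in floating point and compares it with the two bounds; the names `stirling` and `lambda_n` are assumptions introduced here.

```python
import math


def stirling(n: int) -> float:
    """Stirling's approximation sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


def lambda_n(n: int) -> float:
    """The exponent lambda_n defined by n! = stirling(n) * exp(lambda_n)."""
    return math.log(math.factorial(n) / stirling(n))


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        lower, upper = 1 / (12 * n + 1), 1 / (12 * n)
        print(n, lower, lambda_n(n), upper)
```

Because the computation uses floating-point arithmetic, it is only meaningful for moderate n; it illustrates the bounds and does not prove them.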

Let us introduce some notation used for comparing the rate of growth of functions. We
begin with the “Big oh” notation.
Given any two functions, f : N → R and g : N → R, we say that f is O(g) (or f (n) is
O(g(n))) iff there is some N > 0 and a constant c > 0 such that
|f (n)| ≤ c|g(n)|, for all n ≥ N.
In other words, for n large enough, |f(n)| is bounded by c|g(n)|. We sometimes write n >> 0
to indicate that n is “large.”
For example, λ_n is O(1/(12n)). By abuse of notation, we often write f(n) = O(g(n)) even
though this does not make sense.
The “Big omega” notation means the following: f is Ω(g) (or f (n) is Ω(g(n))) iff there
is some N > 0 and a constant c > 0 such that
|f (n)| ≥ c|g(n)|, for all n ≥ N.
The reader should check that f (n) is O(g(n)) iff g(n) is Ω(f (n)).
We can combine O and Ω to get the “Big theta” notation: f is Θ(g) (or f (n) is Θ(g(n)))
iff there is some N > 0 and some constants c1 > 0 and c2 > 0 such that
c1 |g(n)| ≤ |f (n)| ≤ c2 |g(n)|, for all n ≥ N.
Finally, the “Little oh” notation expresses the fact that a function, f, has much slower
growth than a function g. We say that f is o(g) (or f(n) is o(g(n))) iff
\[
\lim_{n\to\infty} \frac{f(n)}{g(n)} = 0.
\]
For example, √n is o(n).
3.2 Counting Subsets of Size k; Binomial Coefficients

Let us now count the number of subsets of cardinality k of a set of cardinality n, with
0 ≤ k ≤ n. Denote this number by $\binom{n}{k}$ (say “n choose k”). Actually, in the proposition
below, it will be more convenient to assume that k ∈ Z.

Proposition 3.2.1 For all n ∈ N and all k ∈ Z, if $\binom{n}{k}$ denotes the number of subsets of
cardinality k of a set of cardinality n, then
\begin{align*}
\binom{0}{0} &= 1 \\
\binom{n}{k} &= 0 \quad \text{if } k \notin \{0, 1, \ldots, n\} \\
\binom{n}{k} &= \binom{n-1}{k} + \binom{n-1}{k-1} \quad (n \geq 1).
\end{align*}
Proof. We proceed by induction on n ≥ 0. Clearly, we may assume that our set is
[n] = {1, . . . , n} ([0] = ∅). The base case n = 0 is trivial since the empty set is the only
subset of size 0. When n ≥ 1, there are two kinds of subsets of {1, . . . , n} having k elements:
those containing 1, and those not containing 1. Now, there are as many subsets of k elements
from {1, . . . , n} containing 1 as there are subsets of k − 1 elements from {2, . . . , n}, namely
$\binom{n-1}{k-1}$, and there are as many subsets of k elements from {1, . . . , n} not containing 1 as there
are subsets of k elements from {2, . . . , n}, namely $\binom{n-1}{k}$. Thus, the number of subsets of
{1, . . . , n} consisting of k elements is $\binom{n-1}{k} + \binom{n-1}{k-1}$, which is equal to $\binom{n}{k}$.

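The numbers $\binom{n}{k}$ are also called binomial coefficients, because they arise in the expansion
of the binomial expression (a + b)^n, as we will see shortly. The binomial coefficients can be
computed inductively using the formula
\[
\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}
\]
(sometimes known as Pascal’s recurrence formula) by forming what is usually called Pascal’s
triangle, which is based on the recurrence for $\binom{n}{k}$:
\[
\begin{array}{c|cccccccc}
n & \binom{n}{0} & \binom{n}{1} & \binom{n}{2} & \binom{n}{3} & \binom{n}{4} & \binom{n}{5} & \binom{n}{6} & \binom{n}{7} \\
\hline
0 & 1 \\
1 & 1 & 1 \\
2 & 1 & 2 & 1 \\
3 & 1 & 3 & 3 & 1 \\
4 & 1 & 4 & 6 & 4 & 1 \\
5 & 1 & 5 & 10 & 10 & 5 & 1 \\
6 & 1 & 6 & 15 & 20 & 15 & 6 & 1 \\
7 & 1 & 7 & 21 & 35 & 35 & 21 & 7 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{array}
\]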
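Pascal's recurrence translates directly into a row-by-row computation of the triangle. The following is a minimal sketch; the function name `pascal_rows` is illustrative, and entries outside the range 0 ≤ k ≤ n − 1 of the previous row are treated as 0, as in Proposition 3.2.1.

```python
def pascal_rows(n_max: int) -> list[list[int]]:
    """Return rows 0..n_max of Pascal's triangle, built with Pascal's recurrence."""
    rows = [[1]]  # row 0: C(0, 0) = 1
    for n in range(1, n_max + 1):
        prev = rows[-1]  # row n - 1, with entries C(n-1, 0), ..., C(n-1, n-1)
        row = []
        for k in range(n + 1):
            above = prev[k] if k <= n - 1 else 0       # C(n-1, k), zero if k = n
            above_left = prev[k - 1] if k >= 1 else 0  # C(n-1, k-1), zero if k = 0
            row.append(above + above_left)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in pascal_rows(7):
        print(row)
```

Only additions are used, so no factorials need to be computed to fill in the table.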
We can also give the following explicit formula for $\binom{n}{k}$ in terms of the factorial function:
Proposition 3.2.2 For all n, k ∈ N, with 0 ≤ k ≤ n, we have
\[
\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.
\]
Proof. Left as an exercise to the reader (use induction on n and Pascal’s recurrence formula).
Then, it is very easy to see that
\[
\binom{n}{k} = \binom{n}{n-k}.
\]
Remark: The binomial coefficients were already known in the twelfth century to the Indian
scholar Bhaskara. Pascal’s triangle was taught back in 1265 by the Persian philosopher
Nasir-Ad-Din.
We now prove the “binomial formula” (also called “binomial theorem”).
Proposition 3.2.3 (Binomial Formula) For any two reals a, b ∈ R (or more generally, any
two commuting variables a, b, i.e., satisfying ab = ba), we have the formula:
\[
(a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{k} a^{n-k} b^k + \cdots + \binom{n}{n-1} a b^{n-1} + b^n.
\]
The above can be written concisely as
\[
(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.
\]
Proof. We proceed by induction on n. For n = 0, we have (a + b)^0 = 1 and the sum on the
right-hand side is also 1, since $\binom{0}{0} = 1$.
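Assume inductively that the formula holds for n. Since
\[
(a + b)^{n+1} = (a + b)^n (a + b),
\]
using the induction hypothesis, we get
\begin{align*}
(a + b)^{n+1} &= (a + b)^n (a + b) \\
&= \left(\sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k\right)(a + b) \\
&= \sum_{k=0}^{n} \binom{n}{k} a^{n+1-k} b^k + \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^{k+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{n+1-k} b^k + \sum_{k=0}^{n-1} \binom{n}{k} a^{n-k} b^{k+1} + b^{n+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{n+1-k} b^k + \sum_{k=1}^{n} \binom{n}{k-1} a^{n+1-k} b^k + b^{n+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \left(\binom{n}{k} + \binom{n}{k-1}\right) a^{n+1-k} b^k + b^{n+1} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} a^{n+1-k} b^k,
\end{align*}
where we used Proposition 3.2.1 to go from the next to the last line to the last line. This
establishes the induction step and thus, proves the binomial formula.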
We also stated earlier that the number of injections between a set with m elements and
a set with n elements, where m ≤ n, is given by n!/(n − m)!, and we now prove it.
Proposition 3.2.4 The number of injections between a set, A, with m elements and a set,
B, with n elements, where m ≤ n, is given by
\[
\frac{n!}{(n-m)!} = n(n - 1) \cdots (n - m + 1).
\]
Proof. We proceed by induction on m ≤ n. If m = 0, then A = ∅ and there is only one
injection, namely the empty function from ∅ to B. Since
\[
\frac{n!}{(n-0)!} = \frac{n!}{n!} = 1,
\]
the base case holds.
Assume the induction hypothesis holds for m and consider a set, A, with m + 1 elements,
where m + 1 ≤ n. Pick any element a ∈ A and let A′ = A − {a}, a set with m elements. Any
injection, f : A → B, assigns some element, f(a) ∈ B, to a and then f ↾ A′ is an injection
from A′ to B′ = B − {f(a)}, a set with n − 1 elements. By the induction hypothesis, there
are
\[
\frac{(n-1)!}{(n-1-m)!}
\]
injections from A′ to B′. Since there are n ways of picking f(a) in B, the number of injections
from A to B is
\[
n\,\frac{(n-1)!}{(n-1-m)!} = \frac{n!}{(n-(m+1))!},
\]
establishing the induction step.
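The count of injections can also be confirmed by enumeration for small m and n. The sketch below assumes A = {0, ..., m − 1} and B = {0, ..., n − 1}; an injection is then the same thing as an m-tuple of distinct elements of B. The function names are illustrative.

```python
from itertools import permutations
from math import factorial


def count_injections(m: int, n: int) -> int:
    """Count injections from an m-element set into an n-element set by enumeration."""
    # itertools.permutations(B, m) yields every m-tuple of distinct elements of B.
    return sum(1 for _ in permutations(range(n), m))


def falling_factorial(n: int, m: int) -> int:
    """Compute n (n - 1) ... (n - m + 1), which equals n! / (n - m)! for m <= n."""
    result = 1
    for i in range(m):
        result *= n - i
    return result


if __name__ == "__main__":
    print(count_injections(3, 5), falling_factorial(5, 3), factorial(5) // factorial(2))
```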
Counting the number of surjections between a set with n elements and a set with p
elements, where n ≥ p, is harder. We state the following formula without proof, leaving the
proof as an interesting exercise.
Proposition 3.2.5 The number of surjections, S_{n p}, between a set, A, with n elements and
a set, B, with p elements, where n ≥ p, is given by
\[
S_{n\,p} = p^n - \binom{p}{1}(p-1)^n + \binom{p}{2}(p-2)^n + \cdots + (-1)^{p-1}\binom{p}{p-1}.
\]
Remarks:
1. It can be shown that S_{n p} satisfies the following peculiar version of Pascal’s identity:
\[
S_{n\,p} = p\,(S_{n-1\,p} + S_{n-1\,p-1}).
\]
2. The numbers, S_{n p}, are intimately related to the so-called Stirling numbers of the
second kind, denoted $\genfrac{\{}{\}}{0pt}{}{n}{p}$, S(n, p), or $S_n^{(p)}$, which count the number of partitions of a
set of n elements into p nonempty pairwise disjoint blocks. In fact,
\[
S_{n\,p} = p!\,\genfrac{\{}{\}}{0pt}{}{n}{p}.
\]
The binomial coefficients can be generalized as follows. For all n, m, k_1, . . . , k_m ∈ N, with
k_1 + · · · + k_m = n and m ≥ 2, we have the multinomial coefficient,
\[
\binom{n}{k_1 \cdots k_m},
\]
which counts the number of ways of splitting a set of n elements into m disjoint subsets, the
ith subset having k_i elements. Note that when m = 2, the number of ways of splitting a set of
n elements into two disjoint subsets where one of the two subsets has k_1 elements and the
other subset has k_2 = n − k_1 elements is precisely the number of subsets of size k_1 of a set
of n elements, that is
\[
\binom{n}{k_1\; k_2} = \binom{n}{k_1}.
\]
Proposition 3.2.6 For all n, m, k_1, . . . , k_m ∈ N, with k_1 + · · · + k_m = n and m ≥ 2, we have
\[
\binom{n}{k_1 \cdots k_m} = \frac{n!}{k_1! \cdots k_m!}.
\]
Proof. There are $\binom{n}{k_1}$ ways of forming a subset of k_1 elements from the set of n elements; there
are $\binom{n-k_1}{k_2}$ ways of forming a subset of k_2 elements from the remaining n − k_1 elements; there
are $\binom{n-k_1-k_2}{k_3}$ ways of forming a subset of k_3 elements from the remaining n − k_1 − k_2 elements
and so on; finally, there are $\binom{n-k_1-\cdots-k_{m-2}}{k_{m-1}}$ ways of forming a subset of k_{m−1} elements from
the remaining n − k_1 − · · · − k_{m−2} elements and there remains a set of n − k_1 − · · · − k_{m−1} = k_m
elements. This shows that
\[
\binom{n}{k_1 \cdots k_m} = \binom{n}{k_1}\binom{n-k_1}{k_2} \cdots \binom{n-k_1-\cdots-k_{m-2}}{k_{m-1}}.
\]
But then, using the fact that k_m = n − k_1 − · · · − k_{m−1}, we get
\begin{align*}
\binom{n}{k_1 \cdots k_m} &= \frac{n!}{k_1!\,(n-k_1)!}\,\frac{(n-k_1)!}{k_2!\,(n-k_1-k_2)!} \cdots \frac{(n-k_1-\cdots-k_{m-2})!}{k_{m-1}!\,(n-k_1-\cdots-k_{m-1})!} \\
&= \frac{n!}{k_1! \cdots k_m!},
\end{align*}
as claimed.
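The two expressions appearing in this proof, the product of binomial coefficients and the quotient of factorials, can be compared numerically. The following is a small sketch; the function names are illustrative, and `ks` stands for the tuple (k_1, ..., k_m).

```python
from math import comb, factorial


def multinomial_by_choices(ks: tuple[int, ...]) -> int:
    """Product C(n, k1) * C(n - k1, k2) * ... as in the proof of Proposition 3.2.6."""
    remaining = sum(ks)  # number of elements not yet placed in a block
    result = 1
    for k in ks[:-1]:    # the last block receives whatever remains
        result *= comb(remaining, k)
        remaining -= k
    return result


def multinomial_by_factorials(ks: tuple[int, ...]) -> int:
    """Closed form n! / (k1! * ... * km!)."""
    denominator = 1
    for k in ks:
        denominator *= factorial(k)
    return factorial(sum(ks)) // denominator


if __name__ == "__main__":
    print(multinomial_by_choices((2, 1, 1)), multinomial_by_factorials((2, 1, 1)))
```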
As in the binomial case, it is convenient to set
\[
\binom{n}{k_1 \cdots k_m} = 0
\]
if k_i < 0 or k_i > n, for any i, with 1 ≤ i ≤ m. Then, Proposition 3.2.1 is generalized as
follows:
Proposition 3.2.7 For all n, m, k_1, . . . , k_m ∈ N, with k_1 + · · · + k_m = n, n ≥ 1 and m ≥ 2,
we have
\[
\binom{n}{k_1 \cdots k_m} = \sum_{i=1}^{m} \binom{n-1}{k_1 \cdots (k_i - 1) \cdots k_m}.
\]
Proof. Note that we have k_i − 1 = −1 when k_i = 0. If we observe that
\[
k_i \binom{n}{k_1 \cdots k_m} = n \binom{n-1}{k_1 \cdots (k_i - 1) \cdots k_m}
\]
even if k_i = 0, then we have
\begin{align*}
\sum_{i=1}^{m} \binom{n-1}{k_1 \cdots (k_i - 1) \cdots k_m} &= \left(\frac{k_1}{n} + \cdots + \frac{k_m}{n}\right)\binom{n}{k_1 \cdots k_m} \\
&= \binom{n}{k_1 \cdots k_m},
\end{align*}
since k_1 + · · · + k_m = n.
Remark: Proposition 3.2.7 shows that Pascal’s triangle generalizes to “higher dimensions”,
that is, to m ≥ 3. Indeed, it is possible to give a geometric interpretation of Proposition 3.2.7
in which the multinomial coefficients corresponding to those k_1, . . . , k_m with k_1 + · · · + k_m = n
lie on the hyperplane of equation x_1 + · · · + x_m = n in R^m, and all the multinomial coefficients
for which n ≤ N, for any fixed N, lie in a generalized tetrahedron called a simplex. When
m = 3, the multinomial coefficients for which n ≤ N lie in a tetrahedron whose faces are the
planes of equations, x = 0; y = 0; z = 0; and x + y + z = N.
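We also have the following generalization of Proposition 3.2.3:
Proposition 3.2.8 (Multinomial Formula) For all n, m ∈ N with m ≥ 2, for all pairwise
commuting variables a_1, . . . , a_m, we have
\[
(a_1 + \cdots + a_m)^n = \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n}} \binom{n}{k_1 \cdots k_m} a_1^{k_1} \cdots a_m^{k_m}.
\]
Proof. We proceed by induction on n and use Proposition 3.2.7. The case n = 0 is trivially
true.
Assume the induction hypothesis holds for n ≥ 0, then we have
\begin{align*}
(a_1 + \cdots + a_m)^{n+1} &= (a_1 + \cdots + a_m)^n (a_1 + \cdots + a_m) \\
&= \left(\sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n}} \binom{n}{k_1 \cdots k_m} a_1^{k_1} \cdots a_m^{k_m}\right)(a_1 + \cdots + a_m) \\
&= \sum_{i=1}^{m} \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n}} \binom{n}{k_1 \cdots k_i \cdots k_m} a_1^{k_1} \cdots a_i^{k_i + 1} \cdots a_m^{k_m} \\
&= \sum_{i=1}^{m} \sum_{\substack{k_1, \ldots, k_m \geq 0,\; k_i \geq 1 \\ k_1 + \cdots + k_m = n+1}} \binom{n}{k_1 \cdots (k_i - 1) \cdots k_m} a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m}.
\end{align*}
We seem to hit a snag, namely, that k_i ≥ 1, but recall that
\[
\binom{n}{k_1 \cdots {-1} \cdots k_m} = 0,
\]
so we have
\begin{align*}
(a_1 + \cdots + a_m)^{n+1} &= \sum_{i=1}^{m} \sum_{\substack{k_1, \ldots, k_m \geq 0,\; k_i \geq 1 \\ k_1 + \cdots + k_m = n+1}} \binom{n}{k_1 \cdots (k_i - 1) \cdots k_m} a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m} \\
&= \sum_{i=1}^{m} \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n+1}} \binom{n}{k_1 \cdots (k_i - 1) \cdots k_m} a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m} \\
&= \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n+1}} \left(\sum_{i=1}^{m} \binom{n}{k_1 \cdots (k_i - 1) \cdots k_m}\right) a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m} \\
&= \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n+1}} \binom{n+1}{k_1 \cdots k_i \cdots k_m} a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m},
\end{align*}
where we used Proposition 3.2.7 to justify the last equation. Therefore, the induction step
is proved and so is our proposition.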
How many terms occur on the right-hand side of the multinomial formula? After a
moment of reflection, we see that this is the number of finite multisets of size n whose
elements are drawn from a set of m elements, which is also equal to the number of m-tuples,
k_1, . . . , k_m, with k_i ∈ N and
\[
k_1 + \cdots + k_m = n.
\]
The following proposition is left as an exercise:
Proposition 3.2.9 The number of finite multisets of size n ≥ 0 whose elements come from
a set of size m ≥ 1 is
\[
\binom{m+n-1}{n}.
\]
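Since the proof of Proposition 3.2.9 is left as an exercise, a brute-force check for small values may help build confidence in the formula. The sketch below counts multisets in two ways, assuming the ground set is {0, ..., m − 1}; the function names are illustrative.

```python
from itertools import combinations_with_replacement, product
from math import comb


def count_multisets(m: int, n: int) -> int:
    """Count multisets of size n drawn from an m-element set by enumeration."""
    # Each multiset corresponds to one nondecreasing n-tuple over the ground set.
    return sum(1 for _ in combinations_with_replacement(range(m), n))


def count_tuples_with_sum(m: int, n: int) -> int:
    """Count m-tuples (k1, ..., km) of natural numbers with k1 + ... + km = n."""
    return sum(1 for ks in product(range(n + 1), repeat=m) if sum(ks) == n)


if __name__ == "__main__":
    print(count_multisets(3, 2), count_tuples_with_sum(3, 2), comb(3 + 2 - 1, 2))
```

The second function reflects the remark above: a multiset is determined by the multiplicities k_1, ..., k_m of the m elements.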
3.3 The Inclusion-Exclusion Principle

We close this chapter with the proof of a powerful formula for determining the cardinality
of the union of a finite number of (finite) sets in terms of the cardinalities of the various
intersections of these sets. This identity, variously attributed to Nicholas Bernoulli, de Moivre,
Sylvester and Poincaré, has many applications to counting problems and to probability theory.
We begin with the “baby case” of two finite sets.
Proposition 3.3.1 Given any two finite sets, A, and B, we have
|A ∪ B| = |A| + |B| − |A ∩ B|.

Proof . This formula is intuitively obvious because if some element, a ∈ A ∪ B, belongs to

both A and B then it is counted twice in |A|+|B| and so we need to subtract its contribution
to A ∩ B. Now,
A ∪ B = (A − (A ∩ B)) ∪ (A ∩ B) ∪ (B − (A ∩ B)),
where the three sets on the right-hand side are pairwise disjoint. If we let a = |A|, b = |B|
and c = |A ∩ B|, then it is clear that
|A − (A ∩ B)| = a − c
|B − (A ∩ B)| = b − c,
so we get
|A ∪ B| = |A − (A ∩ B)| + |A ∩ B| + |B − (A ∩ B)|
= a−c+c+b−c=a+b−c
= |A| + |B| − |A ∩ B|,
as desired. One can also give a proof by induction on n = |A ∪ B|.

We would like to generalize the formula of Proposition 3.3.1 to any finite collection of
finite sets, A_1, . . . , A_n. A moment of reflection shows that when n = 3, we have
|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|.
One of the obstacles in generalizing the above formula to n sets is purely notational: We
need a way of denoting arbitrary intersections of sets belonging to a family of sets indexed
by {1, . . . , n}. We can do this by using indices ranging over subsets of {1, . . . , n}, as opposed
to indices ranging over integers. So, for example, for any nonempty subset, I ⊆ {1, . . . , n},
the expression $\bigcap_{i \in I} A_i$ denotes the intersection of all the sets A_i whose index, i, belongs to
I.
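Theorem 3.3.2 (Inclusion-Exclusion Principle) For any finite sequence, A_1, . . . , A_n, of
n ≥ 2 subsets of a finite set, X, we have
\[
\left|\bigcup_{k=1}^{n} A_k\right| = \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \neq \emptyset}} (-1)^{(|I|-1)} \left|\bigcap_{i \in I} A_i\right|.
\]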
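Proof. We proceed by induction on n ≥ 2. The base case, n = 2, is exactly Proposition
3.3.1. Let us now consider the induction step. We can write
\[
\bigcup_{k=1}^{n+1} A_k = \left(\bigcup_{k=1}^{n} A_k\right) \cup A_{n+1}
\]
and so, by Proposition 3.3.1, we have
\begin{align*}
\left|\bigcup_{k=1}^{n+1} A_k\right| &= \left|\left(\bigcup_{k=1}^{n} A_k\right) \cup A_{n+1}\right| \\
&= \left|\bigcup_{k=1}^{n} A_k\right| + |A_{n+1}| - \left|\left(\bigcup_{k=1}^{n} A_k\right) \cap A_{n+1}\right|.
\end{align*}
We can apply the induction hypothesis to the first term and we get
\[
\left|\bigcup_{k=1}^{n} A_k\right| = \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ J \neq \emptyset}} (-1)^{(|J|-1)} \left|\bigcap_{j \in J} A_j\right|.
\]
Using distributivity of intersection over union, we have
\[
\left(\bigcup_{k=1}^{n} A_k\right) \cap A_{n+1} = \bigcup_{k=1}^{n} (A_k \cap A_{n+1}).
\]
Again, we can apply the induction hypothesis and obtain
\begin{align*}
-\left|\bigcup_{k=1}^{n} (A_k \cap A_{n+1})\right| &= -\sum_{\substack{J \subseteq \{1, \ldots, n\} \\ J \neq \emptyset}} (-1)^{(|J|-1)} \left|\bigcap_{j \in J} (A_j \cap A_{n+1})\right| \\
&= \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ J \neq \emptyset}} (-1)^{|J|} \left|\bigcap_{j \in J \cup \{n+1\}} A_j\right| \\
&= \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ J \neq \emptyset}} (-1)^{(|J \cup \{n+1\}|-1)} \left|\bigcap_{j \in J \cup \{n+1\}} A_j\right|.
\end{align*}
Putting all this together, we get
\begin{align*}
\left|\bigcup_{k=1}^{n+1} A_k\right| &= \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ J \neq \emptyset}} (-1)^{(|J|-1)} \left|\bigcap_{j \in J} A_j\right| + |A_{n+1}| + \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ J \neq \emptyset}} (-1)^{(|J \cup \{n+1\}|-1)} \left|\bigcap_{j \in J \cup \{n+1\}} A_j\right| \\
&= \sum_{\substack{J \subseteq \{1, \ldots, n+1\} \\ J \neq \emptyset,\; n+1 \notin J}} (-1)^{(|J|-1)} \left|\bigcap_{j \in J} A_j\right| + \sum_{\substack{J \subseteq \{1, \ldots, n+1\} \\ n+1 \in J}} (-1)^{(|J|-1)} \left|\bigcap_{j \in J} A_j\right| \\
&= \sum_{\substack{I \subseteq \{1, \ldots, n+1\} \\ I \neq \emptyset}} (-1)^{(|I|-1)} \left|\bigcap_{i \in I} A_i\right|,
\end{align*}
establishing the induction step and finishing the proof.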
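The right-hand side of the Inclusion-Exclusion Principle can be evaluated mechanically by ranging over all nonempty index sets. The following is a minimal sketch, assuming the sets A_1, ..., A_n are given as a list of Python `set` objects (indexed from 0 rather than 1); the function name is illustrative.

```python
from itertools import combinations


def union_size_by_inclusion_exclusion(sets: list[set]) -> int:
    """Evaluate the alternating sum over all nonempty index sets I."""
    n = len(sets)
    total = 0
    for size in range(1, n + 1):                 # size = |I|
        sign = (-1) ** (size - 1)
        for index_set in combinations(range(n), size):
            # Intersection of the sets A_i with i in I.
            intersection = set.intersection(*(sets[i] for i in index_set))
            total += sign * len(intersection)
    return total


if __name__ == "__main__":
    family = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
    print(union_size_by_inclusion_exclusion(family), len(set().union(*family)))
```

The sum has 2^n − 1 terms, so this direct evaluation is only practical for a small number of sets.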

As an application of the Inclusion-Exclusion Principle, let us prove the formula for counting
the number of surjections from {1, . . . , n} to {1, . . . , p}, with p ≤ n, given in Proposition
3.2.5.
Recall that the total number of functions from {1, . . . , n} to {1, . . . , p} is p^n. The trick is
to count the number of functions that are not surjective. Any such function has the property
that its image misses at least one element from {1, . . . , p}. So, if we let
\[
A_i = \{f : \{1, \ldots, n\} \to \{1, \ldots, p\} \mid i \notin \mathrm{Im}(f)\},
\]
we need to count |A_1 ∪ · · · ∪ A_p|. But, we can easily do this using the Inclusion-Exclusion
Principle.
Indeed, for any nonempty subset, I, of {1, . . . , p}, with |I| = k, the functions in
$\bigcap_{i \in I} A_i$ are exactly the functions whose range misses I. But, these are exactly the functions
from {1, . . . , n} to {1, . . . , p} − I and there are (p − k)^n such functions. Thus,
\[
\left|\bigcap_{i \in I} A_i\right| = (p - k)^n.
\]
As there are $\binom{p}{k}$ subsets, I ⊆ {1, . . . , p}, with |I| = k, the contribution of all k-fold
intersections to the Inclusion-Exclusion Principle is
\[
\binom{p}{k}(p - k)^n.
\]
Note that A_1 ∩ · · · ∩ A_p = ∅, since functions have a nonempty image. Therefore, the
Inclusion-Exclusion Principle yields
\[
|A_1 \cup \cdots \cup A_p| = \sum_{k=1}^{p-1} (-1)^{k-1} \binom{p}{k} (p - k)^n,
\]
and so, the number of surjections, S_{n p}, is
\begin{align*}
S_{n\,p} = p^n - |A_1 \cup \cdots \cup A_p| &= p^n - \sum_{k=1}^{p-1} (-1)^{k-1} \binom{p}{k} (p - k)^n \\
&= \sum_{k=0}^{p-1} (-1)^{k} \binom{p}{k} (p - k)^n \\
&= p^n - \binom{p}{1}(p-1)^n + \binom{p}{2}(p-2)^n + \cdots + (-1)^{p-1}\binom{p}{p-1},
\end{align*}
which is indeed the formula of Proposition 3.2.5.
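The surjection formula can be compared against a direct enumeration of all p^n functions. The sketch below assumes the domain is {0, ..., n − 1} and the codomain is {0, ..., p − 1}, with 1 ≤ p ≤ n; the function names are illustrative.

```python
from itertools import product
from math import comb


def surjections_formula(n: int, p: int) -> int:
    """Sum over k = 0..p-1 of (-1)^k * C(p, k) * (p - k)^n."""
    return sum((-1) ** k * comb(p, k) * (p - k) ** n for k in range(p))


def surjections_brute_force(n: int, p: int) -> int:
    """Count the functions {0..n-1} -> {0..p-1} whose image is the whole codomain."""
    codomain = set(range(p))
    # Each n-tuple over the codomain lists the images f(0), ..., f(n-1).
    return sum(1 for f in product(range(p), repeat=n) if set(f) == codomain)


if __name__ == "__main__":
    print(surjections_formula(4, 3), surjections_brute_force(4, 3))
```

For n = 4 and p = 3, both counts agree with the recurrence S_{n p} = p(S_{n−1 p} + S_{n−1 p−1}), since 3 · (6 + 6) = 36.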

Another amusing application of the Inclusion-Exclusion Principle is the formula giving
the number, p_n, of permutations of {1, . . . , n} that leave no element fixed (i.e., f(i) ≠ i, for
all i ∈ {1, . . . , n}). Such permutations are often called derangements. We get
\begin{align*}
p_n &= n!\left(1 - \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{(-1)^k}{k!} + \cdots + \frac{(-1)^n}{n!}\right) \\
&= n! - \binom{n}{1}(n-1)! + \binom{n}{2}(n-2)! + \cdots + (-1)^n.
\end{align*}
Remark: We know (using the series expansion for e^x in which we set x = −1) that
\[
\frac{1}{e} = 1 - \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{(-1)^k}{k!} + \cdots.
\]
Consequently, the factor of n! in the above formula for p_n is the sum of the first n + 1 terms
of the series for 1/e and so,
\[
\lim_{n\to\infty} \frac{p_n}{n!} = \frac{1}{e}.
\]
It turns out that the series for 1/e converges very rapidly, so p_n ≈ n!/e. The ratio p_n/n! has
an interesting interpretation in terms of probabilities. Assume n persons go to a restaurant
(or to the theatre, etc.) and that they all check their coats. Unfortunately, the clerk loses
all the coat tags. Then, p_n/n! is the probability that nobody will get her or his own coat
back! As we just explained, this probability is roughly 1/e ≈ 1/3, a surprisingly large number.
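The derangement formula and the limit p_n/n! → 1/e are easy to check for small n. The following is a minimal sketch using permutations of {0, ..., n − 1} instead of {1, ..., n}; the function names are illustrative.

```python
from itertools import permutations
from math import comb, e, factorial


def derangements_formula(n: int) -> int:
    """n! - C(n,1)(n-1)! + C(n,2)(n-2)! - ... + (-1)^n."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def derangements_brute_force(n: int) -> int:
    """Count the permutations of {0..n-1} with no fixed point."""
    return sum(
        1
        for perm in permutations(range(n))
        if all(perm[i] != i for i in range(n))
    )


if __name__ == "__main__":
    for n in range(1, 9):
        p_n = derangements_formula(n)
        print(n, p_n, p_n / factorial(n), 1 / e)
```

Already for n = 8 the ratio p_n/n! agrees with 1/e to about five decimal places, which illustrates how quickly the series converges.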
The Inclusion-Exclusion Principle can be easily generalized in a useful way as follows:
Given a finite set, X, let m be any given function, m : X → R+, and for any nonempty
subset, A ⊆ X, set
\[
m(A) = \sum_{a \in A} m(a),
\]
with the convention that m(∅) = 0 (recall that R+ = {x ∈ R | x ≥ 0}). For any x ∈ X,
the number m(x) is called the weight (or measure) of x and the quantity m(A) is often
called the measure of the set A. For example, if m(x) = 1 for all x ∈ A, then m(A) = |A|,
the cardinality of A, which is the special case that we have been considering. For any two
subsets, A, B ⊆ X, it is obvious that
\begin{align*}
m(A \cup B) &= m(A) + m(B) - m(A \cap B) \\
m(X - A) &= m(X) - m(A) \\
m(\overline{A \cup B}) &= m(\overline{A} \cap \overline{B}) \\
m(\overline{A \cap B}) &= m(\overline{A} \cup \overline{B}),
\end{align*}
where $\overline{A} = X - A$. Then, we have the following version of Theorem 3.3.2:

Theorem 3.3.3 (Inclusion-Exclusion Principle, Version 2) Given any measure function,
m : X → R+, for any finite sequence, A_1, . . . , A_n, of n ≥ 2 subsets of a finite set, X, we
have
\[
m\left(\bigcup_{k=1}^{n} A_k\right) = \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \neq \emptyset}} (-1)^{(|I|-1)}\, m\left(\bigcap_{i \in I} A_i\right).
\]
Proof. The proof is obtained from the proof of Theorem 3.3.2 by changing everywhere any
expression of the form |B| to m(B).
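A useful corollary of Theorem 3.3.3, often known as Sylvester’s formula, is:
Theorem 3.3.4 (Sylvester’s Formula) Given any measure, m : X → R+, for any finite
sequence, A_1, . . . , A_n, of n ≥ 2 subsets of a finite set, X, the measure of the set of elements
of X that do not belong to any of the sets A_i is given by
\[
m\left(\bigcap_{k=1}^{n} \overline{A_k}\right) = m(X) + \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \neq \emptyset}} (-1)^{|I|}\, m\left(\bigcap_{i \in I} A_i\right).
\]
Proof. Observe that
\[
\bigcap_{k=1}^{n} \overline{A_k} = X - \bigcup_{k=1}^{n} A_k.
\]
Consequently, using Theorem 3.3.3, we get
\begin{align*}
m\left(\bigcap_{k=1}^{n} \overline{A_k}\right) &= m\left(X - \bigcup_{k=1}^{n} A_k\right) \\
&= m(X) - m\left(\bigcup_{k=1}^{n} A_k\right) \\
&= m(X) - \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \neq \emptyset}} (-1)^{(|I|-1)}\, m\left(\bigcap_{i \in I} A_i\right) \\
&= m(X) + \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \neq \emptyset}} (-1)^{|I|}\, m\left(\bigcap_{i \in I} A_i\right),
\end{align*}
establishing Sylvester’s formula.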

Note that if we use the convention that when the index set, I, is empty then
\[
\bigcap_{i \in \emptyset} A_i = X,
\]
then the term m(X) can be included in the above sum by removing the condition that
I ≠ ∅. Sometimes, it is also convenient to regroup terms involving subsets, I, having the
same cardinality and another way to state Sylvester’s formula is as follows:
\[
m\left(\bigcap_{k=1}^{n} \overline{A_k}\right) = \sum_{k=0}^{n} (-1)^k \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = k}} m\left(\bigcap_{i \in I} A_i\right). \qquad \text{(Sylvester’s Formula)}
\]
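Finally, Sylvester’s formula can be generalized to a formula usually known as the “Sieve
Formula”:
Theorem 3.3.5 (Sieve Formula) Given any measure, m : X → R+, for any finite sequence,
A_1, . . . , A_n, of n ≥ 2 subsets of a finite set, X, the measure of the set of elements of X that
belong to exactly p of the sets A_i (0 ≤ p ≤ n) is given by
\[
T_n^p = \sum_{k=p}^{n} (-1)^{k-p} \binom{k}{p} \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = k}} m\left(\bigcap_{i \in I} A_i\right).
\]
Proof. For any subset, I ⊆ {1, . . . , n}, apply Sylvester’s formula to $X = \bigcap_{i \in I} A_i$ and to the
subsets $A_j \cap \bigcap_{i \in I} A_i$, with j ∉ I. We get
\[
m\left(\bigcap_{i \in I} A_i \cap \bigcap_{j \notin I} \overline{A_j}\right) = \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ I \subseteq J}} (-1)^{|J|-|I|}\, m\left(\bigcap_{j \in J} A_j\right).
\]
Hence,
\begin{align*}
T_n^p &= \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = p}} m\left(\bigcap_{i \in I} A_i \cap \bigcap_{j \notin I} \overline{A_j}\right) \\
&= \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = p}} \; \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ I \subseteq J}} (-1)^{|J|-|I|}\, m\left(\bigcap_{j \in J} A_j\right) \\
&= \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ |J| \geq p}} \; \sum_{\substack{I \subseteq J \\ |I| = p}} (-1)^{|J|-|I|}\, m\left(\bigcap_{j \in J} A_j\right) \\
&= \sum_{k=p}^{n} (-1)^{k-p} \binom{k}{p} \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ |J| = k}} m\left(\bigcap_{j \in J} A_j\right),
\end{align*}
establishing the Sieve formula.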
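The Sieve Formula can be tested on a small example with the counting measure m(x) = 1, so that m(A) = |A|. The sketch below is illustrative only: it assumes the sets are given as Python `set` objects inside a finite universe X, uses the convention that the empty intersection is X, and the function names are introduced here.

```python
from itertools import combinations
from math import comb


def exactly_p_by_sieve(universe: set, sets: list[set], p: int) -> int:
    """Evaluate the Sieve Formula for the counting measure m(A) = |A|."""
    n = len(sets)
    total = 0
    for k in range(p, n + 1):
        inner = 0
        for index_set in combinations(range(n), k):
            intersection = set(universe)  # empty intersection is X by convention
            for i in index_set:
                intersection &= sets[i]
            inner += len(intersection)
        total += (-1) ** (k - p) * comb(k, p) * inner
    return total


def exactly_p_directly(universe: set, sets: list[set], p: int) -> int:
    """Count the elements of X that belong to exactly p of the sets."""
    return sum(1 for x in universe if sum(x in s for s in sets) == p)


if __name__ == "__main__":
    X = set(range(1, 8))
    family = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
    for p in range(4):
        print(p, exactly_p_by_sieve(X, family, p), exactly_p_directly(X, family, p))
```

Taking p = 0 gives the number of elements belonging to none of the sets, which is the quantity computed by Sylvester's formula.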

Observe that Sylvester’s Formula is the special case of the Sieve Formula for which p = 0.
The Inclusion-Exclusion Principle (and its relatives) plays an important role in combinatorics
and probability theory, as the reader will verify by consulting any text on combinatorics. A
classical reference on combinatorics is Berge [2]; a more recent one is Cameron [8]; a more
recent and more advanced one is Stanley [39]. Another fascinating (but deceptively tough)
reference covering discrete mathematics and including a lot of combinatorics is Graham,
Knuth and Patashnik [24].
We are now ready to study special kinds of relations: partial orders and equivalence
relations.
