Disc Math 3
0! = 1
(n + 1)! = (n + 1)n!.
The reader should check that the existence of the function, n → n!, can be justified using
the Recursion Theorem (Theorem 2.5.1).
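The recursive definition above translates directly into a short program. The following is a minimal sketch in Python; the function name `factorial` is chosen here for illustration and is not part of the text.

```python
def factorial(n: int) -> int:
    """Compute n! from the definition 0! = 1 and (n + 1)! = (n + 1) * n!."""
    if n < 0:
        raise ValueError("n must be a natural number")
    result = 1                     # base case: 0! = 1
    for k in range(1, n + 1):      # recursive step: k! = k * (k - 1)!
        result *= k
    return result


# Tabulate n! for n = 0, ..., 10 to see how quickly the values grow.
for n in range(11):
    print(n, factorial(n))
```

Already at n = 10 the value has seven digits, which illustrates the rapid growth of the factorial function.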
Proof . We prove that if A and B are any two finite sets of the same cardinality, n, then the
number of bijections between A and B is n!. Now, in the special case where B = A, we get
our theorem.
The proof is by induction on n. For n = 0, the empty set has one bijection (the empty
function). So, there are 0! = 1 permutations, as desired.
Assume inductively that if A and B are any two finite sets of the same cardinality, n,
then the number of bijections between A and B is n!. If A and B are sets with n + 1
elements, then pick any element, a ∈ A, and write A = A′ ∪ {a}, where A′ = A − {a} has
n elements. Now, any bijection, f : A → B, must assign some element of B to a and then
f ↾ A′ is a bijection between A′ and B′ = B − {f(a)}. By the induction hypothesis, there are
n! bijections between A′ and B′. Since there are n + 1 ways of picking f(a) in B, the total
number of bijections between A and B is (n + 1)n! = (n + 1)!, which establishes the induction step.
108 CHAPTER 3. SOME COUNTING PROBLEMS; BINOMIAL COEFFICIENTS
Proposition 3.1.2 If A and B are finite sets with |A| = m and |B| = n, then the set of
functions, B^A, from A to B has n^m elements.
Proof. By Proposition 2.9.4, there is a bijection between 2^A and the set of functions {0, 1}^A.
Since |{0, 1}| = 2, we get |2^A| = |{0, 1}^A| = 2^n, by Proposition 3.1.2.
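The three counts discussed so far (n! bijections, n^m functions, 2^n subsets) can be checked by brute-force enumeration on small sets. The sketch below is only an illustration; the helper names are hypothetical and the sets are modelled as `range(n)`.

```python
from itertools import combinations, permutations, product


def count_functions(m: int, n: int) -> int:
    """Count functions from an m-element set to an n-element set.

    A function is represented by the tuple of its m values.
    """
    return sum(1 for _ in product(range(n), repeat=m))


def count_bijections(n: int) -> int:
    """Count bijections between two n-element sets (permutations of range(n))."""
    return sum(1 for _ in permutations(range(n)))


def count_subsets(n: int) -> int:
    """Count all subsets of an n-element set, grouped by size k."""
    return sum(1 for k in range(n + 1) for _ in combinations(range(n), k))


print(count_functions(3, 4), 4 ** 3)
print(count_bijections(4))
print(count_subsets(5), 2 ** 5)
```

Enumeration is only feasible for very small m and n, which is exactly why closed formulas are useful.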
Computing the value of the factorial function for a few inputs, say n = 1, 2 . . . , 10, shows
that it grows very fast. For example,
Is it possible to quantify how fast factorial grows compared to other functions, say n^n or e^n?
Remarkably, the answer is yes. A beautiful formula due to James Stirling (1692-1770) tells
us that
\[ n! \sim \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}, \]
which means that
\[ \lim_{n\to\infty} \frac{n!}{\sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}} = 1. \]
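Here, of course,
\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \cdots \]
is the base of the natural logarithm. It is even possible to estimate the error. It turns out that
\[ n! = \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n} e^{\lambda_n}, \]
where
\[ \frac{1}{12n + 1} < \lambda_n < \frac{1}{12n}. \]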
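Stirling's formula and the error bounds on λ_n can be checked numerically. The sketch below recovers λ_n as ln(n!) minus the logarithm of Stirling's approximation; the function names are chosen for illustration, and floating-point arithmetic limits the range of n for which the comparison is meaningful.

```python
import math


def stirling(n: int) -> float:
    """Stirling's approximation sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


def lam(n: int) -> float:
    """lambda_n, defined by n! = stirling(n) * exp(lambda_n)."""
    log_stirling = 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)
    return math.log(math.factorial(n)) - log_stirling


for n in (1, 2, 5, 10, 20):
    ratio = math.factorial(n) / stirling(n)
    print(n, ratio, 1 / (12 * n + 1), lam(n), 1 / (12 * n))
```

The printed ratio approaches 1 and, in each row, λ_n sits between the two bounds.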
3.2. COUNTING SUBSETS OF SIZE K; BINOMIAL COEFFICIENTS 109
In other words, for n large enough, |f (n)| is bounded by c|g(n)|. We sometimes write n >> 0
to indicate that n is “large.”
For example, λn is O(1/(12n)). By abuse of notation, we often write f(n) = O(g(n)) even
though this does not make sense.
The “Big omega” notation means the following: f is Ω(g) (or f(n) is Ω(g(n))) iff there
is some N > 0 and a constant c > 0 such that
\[ |f(n)| \geq c\,|g(n)|, \quad \text{for all } n \geq N. \]
The reader should check that f (n) is O(g(n)) iff g(n) is Ω(f (n)).
We can combine O and Ω to get the “Big theta” notation: f is Θ(g) (or f(n) is Θ(g(n)))
iff there is some N > 0 and some constants c1 > 0 and c2 > 0 such that
\[ c_1\,|g(n)| \leq |f(n)| \leq c_2\,|g(n)|, \quad \text{for all } n \geq N. \]
Finally, the “Little oh” notation expresses the fact that a function, f, has much slower
growth than a function g. We say that f is o(g) (or f(n) is o(g(n))) iff
\[ \lim_{n\to\infty} \frac{f(n)}{g(n)} = 0. \]
For example, √n is o(n).
n    k=0  k=1  k=2  k=3  k=4  k=5  k=6  k=7  ...
0     1
1     1    1
2     1    2    1
3     1    3    3    1
4     1    4    6    4    1
5     1    5   10   10    5    1
6     1    6   15   20   15    6    1
7     1    7   21   35   35   21    7    1
...
We can also give the following explicit formula for $\binom{n}{k}$
in terms of the factorial function:
Proof . Left as an exercise to the reader (use induction on n and Pascal’s recurrence formula).
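The suggested induction can be mirrored by a small computation. The sketch below assumes that the explicit formula in question is the usual n!/(k!(n − k)!) and that Pascal's recurrence is the rule that each inner entry of the triangle is the sum of the two entries above it; both function names are hypothetical.

```python
from math import factorial


def pascal_rows(n_max: int) -> list[list[int]]:
    """Build rows 0..n_max of Pascal's triangle using only the recurrence."""
    rows = [[1]]
    for n in range(1, n_max + 1):
        prev = rows[-1]
        # inner entries: sum of the two neighbours in the previous row
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        rows.append(row)
    return rows


def binom_factorial(n: int, k: int) -> int:
    """Binomial coefficient from the factorial formula (assumed form)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


rows = pascal_rows(7)
print(rows[7])
print(all(rows[n][k] == binom_factorial(n, k)
          for n in range(8) for k in range(n + 1)))
```

Agreement on the first few rows is of course not a proof, but it shows exactly what the induction has to establish.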
Remark: The binomial coefficients were already known in the twelfth century by the Indian
scholar Bhaskara. Pascal’s triangle was taught back in 1265 by the Persian philosopher,
Nasir-Ad-Din.
Proposition 3.2.3 (Binomial Formula) For any two reals a, b ∈ R (or more generally, any
two commuting variables a, b, i.e., satisfying ab = ba), we have the formula:
\[ (a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{k} a^{n-k} b^k + \cdots + \binom{n}{n-1} a b^{n-1} + b^n. \]
Equivalently,
\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k. \]
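Proof. We proceed by induction on n. For n = 0, we have (a + b)^0 = 1 and the sum on the
righthand side is also 1, since $\binom{0}{0} = 1$. Assume inductively that the formula holds for n. Then,
\begin{align*}
(a + b)^{n+1} &= (a + b)^n (a + b) \\
&= \left( \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k \right) (a + b) \\
&= \sum_{k=0}^{n} \binom{n}{k} a^{n+1-k} b^k + \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^{k+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{n+1-k} b^k + \sum_{k=0}^{n-1} \binom{n}{k} a^{n-k} b^{k+1} + b^{n+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{n+1-k} b^k + \sum_{k=1}^{n} \binom{n}{k-1} a^{n+1-k} b^k + b^{n+1} \\
&= a^{n+1} + \sum_{k=1}^{n} \left( \binom{n}{k} + \binom{n}{k-1} \right) a^{n+1-k} b^k + b^{n+1} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} a^{n+1-k} b^k,
\end{align*}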
where we used Proposition 3.2.1 to go from the next to the last line to the last line. This
establishes the induction step and thus, proves the binomial formula.
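As a sanity check of the binomial formula, the right-hand side can be evaluated for concrete integers and compared with (a + b)^n computed directly. This is a small sketch; `binomial_expansion` is a name chosen here, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binomial_expansion(a: int, b: int, n: int) -> int:
    """Evaluate sum_{k=0}^{n} C(n, k) * a**(n - k) * b**k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))


for a, b, n in [(3, 5, 6), (2, -7, 5), (1, 1, 10)]:
    print(a, b, n, binomial_expansion(a, b, n), (a + b) ** n)
```

Taking a = b = 1 gives the sum of row n of Pascal's triangle, namely 2^n.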
We also stated earlier that the number of injections between a set with m elements and
a set with n elements, where m ≤ n, is given by n!/(n − m)! and we now prove it.
Proposition 3.2.4 The number of injections between a set, A, with m elements and a set,
B, with n elements, where m ≤ n, is given by n!/(n − m)! = n(n − 1) · · · (n − m + 1).
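The formula of Proposition 3.2.4 can be compared with a direct enumeration for small m and n. In the sketch below an injection from an m-element set into an n-element set is modelled as an ordered selection of m distinct values; the function names are illustrative only.

```python
from itertools import permutations
from math import factorial


def count_injections(m: int, n: int) -> int:
    """Enumerate injections as ordered m-selections without repetition."""
    return sum(1 for _ in permutations(range(n), m))


def falling_factorial(n: int, m: int) -> int:
    """Compute n(n - 1)...(n - m + 1), a product of m factors."""
    result = 1
    for i in range(m):
        result *= n - i
    return result


for m, n in [(0, 3), (2, 4), (3, 5), (4, 4)]:
    print(m, n, count_injections(m, n), falling_factorial(n, m),
          factorial(n) // factorial(n - m))
```

When m = n the count reduces to n!, the number of bijections.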
Counting the number of surjections between a set with n elements and a set with p
elements, where n ≥ p, is harder. We state the following formula without proof, leaving the
proof as an interesting exercise.
Proposition 3.2.5 The number of surjections, $S_{n\,p}$, between a set, A, with n elements and
a set, B, with p elements, where n ≥ p, is given by
\[ S_{n\,p} = p^n - \binom{p}{1} (p - 1)^n + \binom{p}{2} (p - 2)^n + \cdots + (-1)^{p-1} \binom{p}{p-1}. \]
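The alternating sum in Proposition 3.2.5 is easy to evaluate and to compare with an exhaustive count. The following sketch assumes p ≥ 1 and models a function as the tuple of its n values; both function names are hypothetical.

```python
from itertools import product
from math import comb


def surjections_formula(n: int, p: int) -> int:
    """Evaluate sum_{k=0}^{p-1} (-1)**k * C(p, k) * (p - k)**n."""
    return sum((-1) ** k * comb(p, k) * (p - k) ** n for k in range(p))


def surjections_brute(n: int, p: int) -> int:
    """Count the functions {0..n-1} -> {0..p-1} whose image is all of {0..p-1}."""
    return sum(1 for f in product(range(p), repeat=n) if len(set(f)) == p)


for n, p in [(3, 3), (4, 2), (5, 3)]:
    print(n, p, surjections_formula(n, p), surjections_brute(n, p))
```

For n = p the count is n!, as expected, since a surjection between finite sets of equal size is a bijection.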
Remarks:
1. It can be shown that $S_{n\,p}$ satisfies the following peculiar version of Pascal’s identity:
2. The numbers, $S_{n\,p}$, are intimately related to the so-called Stirling numbers of the
second kind, denoted $\left\{ {n \atop p} \right\}$, S(n, p), or $S_n^{(p)}$, which count the number of partitions of a
set of n elements into p nonempty pairwise disjoint blocks. In fact,
\[ S_{n\,p} = p!\, \left\{ {n \atop p} \right\}. \]
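The relation between surjections and Stirling numbers of the second kind can be checked numerically. The sketch below computes the Stirling numbers with the standard recurrence S(n, p) = p S(n − 1, p) + S(n − 1, p − 1), which is an assumption brought in from outside this text, and compares p! S(n, p) with the alternating sum of Proposition 3.2.5.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n: int, p: int) -> int:
    """Number of partitions of an n-element set into p nonempty blocks."""
    if n == 0 and p == 0:
        return 1
    if n == 0 or p == 0:
        return 0
    # the last element either joins one of p blocks or forms a new block
    return p * stirling2(n - 1, p) + stirling2(n - 1, p - 1)


def surjections(n: int, p: int) -> int:
    """Alternating-sum formula for the number of surjections (p >= 1)."""
    return sum((-1) ** k * comb(p, k) * (p - k) ** n for k in range(p))


for n, p in [(4, 2), (5, 3), (6, 4)]:
    print(n, p, stirling2(n, p), factorial(p) * stirling2(n, p), surjections(n, p))
```

The factor p! accounts for the p! ways of assigning the p blocks of a partition to the p elements of the target set.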
which counts the number of ways of splitting a set of n elements into m disjoint subsets, the
ith subset having ki elements. Note that when m = 2, the number of ways of splitting a set of
n elements into two disjoint subsets where one of the two subsets has k1 elements and the
other subset has k2 = n − k1 elements is precisely the number of subsets of size k1 of a set
of n elements, that is
\[ \binom{n}{k_1\, k_2} = \binom{n}{k_1}. \]
Remark: Proposition 3.2.7 shows that Pascal’s triangle generalizes to “higher dimensions”,
that is, to m ≥ 3. Indeed, it is possible to give a geometric interpretation of Proposition 3.2.7
in which the multinomial coefficients corresponding to those k1 , . . . , km with k1 +· · ·+km = n
lie on the hyperplane of equation x1 + · · · + xm = n in R^m, and all the multinomial coefficients
for which n ≤ N , for any fixed N , lie in a generalized tetrahedron called a simplex . When
m = 3, the multinomial coefficients for which n ≤ N lie in a tetrahedron whose faces are the
planes of equations, x = 0; y = 0; z = 0; and x + y + z = N .
Proposition 3.2.8 (Multinomial Formula) For all n, m ∈ N with m ≥ 2, for all pairwise
commuting variables a1, . . . , am, we have
\[ (a_1 + \cdots + a_m)^n = \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n}} \binom{n}{k_1 \cdots k_m} a_1^{k_1} \cdots a_m^{k_m}. \]
Proof . We proceed by induction on n and use Proposition 3.2.7. The case n = 0 is trivially
true.
\[ \binom{n}{k_1 \cdots -1 \cdots k_m} = 0, \]
so we have
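\begin{align*}
(a_1 + \cdots + a_m)^{n+1}
&= \sum_{i=1}^{m} \sum_{\substack{k_1, \ldots, k_m \geq 0,\ k_i \geq 1 \\ k_1 + \cdots + k_m = n+1}} \binom{n}{k_1 \cdots (k_i - 1) \cdots k_m} a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m} \\
&= \sum_{i=1}^{m} \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n+1}} \binom{n}{k_1 \cdots (k_i - 1) \cdots k_m} a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m} \\
&= \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n+1}} \left( \sum_{i=1}^{m} \binom{n}{k_1 \cdots (k_i - 1) \cdots k_m} \right) a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m} \\
&= \sum_{\substack{k_1, \ldots, k_m \geq 0 \\ k_1 + \cdots + k_m = n+1}} \binom{n+1}{k_1 \cdots k_i \cdots k_m} a_1^{k_1} \cdots a_i^{k_i} \cdots a_m^{k_m},
\end{align*}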
where we used Proposition 3.2.7 to justify the last equation. Therefore, the induction step
is proved and so is our proposition.
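The multinomial formula can be checked on concrete integers. The sketch below assumes the usual expression n!/(k1! · · · km!) for the multinomial coefficient; the helper names `multinomial`, `compositions` and `multinomial_expansion` are hypothetical.

```python
from math import factorial


def multinomial(n: int, ks: tuple[int, ...]) -> int:
    """Multinomial coefficient n! / (k1! ... km!), assumed form; 0 if invalid."""
    if any(k < 0 for k in ks) or sum(ks) != n:
        return 0
    denom = 1
    for k in ks:
        denom *= factorial(k)
    return factorial(n) // denom


def compositions(n: int, m: int):
    """Yield all m-tuples of natural numbers (k1, ..., km) with sum n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, m - 1):
            yield (first,) + rest


def multinomial_expansion(values: tuple[int, ...], n: int) -> int:
    """Evaluate the right-hand side of the multinomial formula."""
    total = 0
    for ks in compositions(n, len(values)):
        term = multinomial(n, ks)
        for a, k in zip(values, ks):
            term *= a ** k
        total += term
    return total


print(multinomial_expansion((2, 3, 5), 4), (2 + 3 + 5) ** 4)
```

With m = 2 the function `multinomial` reduces to the ordinary binomial coefficient, in line with the remark above.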
How many terms occur on the right-hand side of the multinomial formula? After a
moment of reflexion, we see that this is the number of finite multisets of size n whose
elements are drawn from a set of m elements, which is also equal to the number of m-tuples,
k1 , . . . , km , with ki ∈ N and
k1 + · · · + km = n.
The following proposition is left as an exercise:
Proposition 3.2.9 The number of finite multisets of size n ≥ 0 whose elements come from
a set of size m ≥ 1 is
\[ \binom{m + n - 1}{n}. \]
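Proposition 3.2.9 can be tested by listing multisets explicitly. In the sketch below a multiset of size n over an m-element set is represented as a nondecreasing n-tuple, which is what `itertools.combinations_with_replacement` generates; the function name is illustrative.

```python
from itertools import combinations_with_replacement
from math import comb


def count_multisets(m: int, n: int) -> int:
    """Count multisets of size n with elements drawn from range(m)."""
    return sum(1 for _ in combinations_with_replacement(range(m), n))


for m, n in [(3, 4), (2, 5), (4, 0), (5, 3)]:
    print(m, n, count_multisets(m, n), comb(m + n - 1, n))
```

The same number counts the terms on the right-hand side of the multinomial formula, since each m-tuple (k1, . . . , km) with sum n records the multiplicities of one multiset.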
A ∪ B = (A − (A ∩ B)) ∪ (A ∩ B) ∪ (B − (A ∩ B)),
where the three sets on the right-hand side are pairwise disjoint. If we let a = |A|, b = |B|
and c = |A ∩ B|, then it is clear that
|A − (A ∩ B)| = a − c
|B − (A ∩ B)| = b − c,
so we get
|A ∪ B| = |A − (A ∩ B)| + |A ∩ B| + |B − (A ∩ B)|
= (a − c) + c + (b − c) = a + b − c
= |A| + |B| − |A ∩ B|,
One of the obstacles in generalizing the above formula to n sets is purely notational: We
need a way of denoting arbitrary intersections of sets belonging to a family of sets indexed
by {1, . . . , n}. We can do this by using indices ranging over subsets of {1, . . . , n}, as opposed
to indices ranging over integers. So, for example, for any nonempty subset, I ⊆ {1, . . . , n},
the expression $\bigcap_{i \in I} A_i$ denotes the intersection of all the subsets whose index, i, belongs to
I.
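We can apply the induction hypothesis to the first term and we get
\[ \Bigl| \bigcup_{k=1}^{n} A_k \Bigr| = \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ J \neq \emptyset}} (-1)^{(|J| - 1)} \Bigl| \bigcap_{j \in J} A_j \Bigr|. \]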
\[ A_i = \{ f : \{1, \ldots, n\} \to \{1, \ldots, p\} \mid i \notin \mathrm{Im}(f) \}, \]
we need to count |A1 ∪ · · · ∪ Ap |. But, we can easily do this using the Inclusion-Exclusion
Principle.
Indeed, for any nonempty subset, I, of {1, . . . , p}, with |I| = k, the functions in
$\bigcap_{i \in I} A_i$ are exactly the functions whose range misses I. But, these are exactly the functions
from {1, . . . , n} to {1, . . . , p} − I and there are (p − k)^n such functions. Thus,
\[ \Bigl| \bigcap_{i \in I} A_i \Bigr| = (p - k)^n. \]
As there are $\binom{p}{k}$ subsets, I ⊆ {1, . . . , p}, with |I| = k, the contribution of all k-fold
intersections to the Inclusion-Exclusion Principle is
\[ \binom{p}{k} (p - k)^n. \]
Note that A1 ∩ · · · ∩ Ap = ∅, since functions have a nonempty image. Therefore, the
Inclusion-Exclusion Principle yields
\[ |A_1 \cup \cdots \cup A_p| = \sum_{k=1}^{p-1} (-1)^{k-1} \binom{p}{k} (p - k)^n, \]
Remark: We know (using the series expansion for e^x in which we set x = −1) that
\[ \frac{1}{e} = 1 - \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{(-1)^k}{k!} + \cdots. \]
Consequently, the factor of n! in the above formula for pn is the sum of the first n + 1 terms
of the series for 1/e and so,
\[ \lim_{n\to\infty} \frac{p_n}{n!} = \frac{1}{e}. \]
It turns out that the series for 1/e converges very rapidly, so pn ≈ n!/e. The ratio pn/n! has
an interesting interpretation in terms of probabilities. Assume n persons go to a restaurant
(or to the theatre, etc.) and that they all check their coats. Unfortunately, the clerk loses
all the coat tags. Then, pn/n! is the probability that nobody will get her or his own coat
back! As we just explained, this probability is roughly 1/e ≈ 1/3, a surprisingly large number.
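The rapid convergence of pn/n! to 1/e is easy to observe numerically. The sketch below assumes that pn denotes the number of permutations of n elements without fixed points and that it is given by n! times the sum of (−1)^k/k! for k = 0, . . . , n, as the remark describes; the function names are hypothetical.

```python
from itertools import permutations
from math import e, factorial


def derangements_formula(n: int) -> int:
    """n! * sum_{k=0}^{n} (-1)**k / k!, evaluated in exact integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def derangements_brute(n: int) -> int:
    """Count permutations s of range(n) with s[i] != i for every i."""
    return sum(1 for s in permutations(range(n))
               if all(s[i] != i for i in range(n)))


for n in range(1, 11):
    p_n = derangements_formula(n)
    print(n, p_n, p_n / factorial(n), 1 / e)
```

In the coat-check story, the printed ratio is the probability that nobody gets the right coat back; it barely changes once n exceeds 5 or 6.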
The Inclusion-Exclusion Principle can be easily generalized in a useful way as follows:
Given a finite set, X, let m be any given function, m : X → R+ , and for any nonempty
subset, A ⊆ X, set
\[ m(A) = \sum_{a \in A} m(a), \]
with the convention that m(∅) = 0 (Recall that R+ = {x ∈ R | x ≥ 0}). For any x ∈ X,
the number m(x) is called the weight (or measure) of x and the quantity m(A) is often
called the measure of the set A. For example, if m(x) = 1 for all x ∈ A, then m(A) = |A|,
the cardinality of A, which is the special case that we have been considering. For any two
subsets, A, B ⊆ X, it is obvious that
Proof . The proof is obtained from the proof of Theorem 3.3.2 by changing everywhere any
expression of the form |B| to m(B).
A useful corollary of Theorem 3.3.3 often known as Sylvester’s formula is:
Theorem 3.3.4 (Sylvester’s Formula) Given any measure, m : X → R+, for any finite
sequence, A1, . . . , An, of n ≥ 2 subsets of a finite set, X, the measure of the set of elements
of X that do not belong to any of the sets Ai is given by
\[ m\Bigl( \bigcap_{k=1}^{n} \overline{A_k} \Bigr) = m(X) + \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \neq \emptyset}} (-1)^{|I|}\, m\Bigl( \bigcap_{i \in I} A_i \Bigr). \]
then the term m(X) can be included in the above sum by removing the condition that
I ≠ ∅. Sometimes, it is also convenient to regroup terms involving subsets, I, having the
same cardinality and another way to state Sylvester’s formula is as follows:
\[ m\Bigl( \bigcap_{k=1}^{n} \overline{A_k} \Bigr) = \sum_{k=0}^{n} (-1)^k \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = k}} m\Bigl( \bigcap_{i \in I} A_i \Bigr). \qquad \text{(Sylvester’s Formula)} \]
Finally, Sylvester’s formula can be generalized to a formula usually known as the “Sieve
Formula”:
Theorem 3.3.5 (Sieve Formula) Given any measure, m : X → R+, for any finite sequence,
A1, . . . , An, of n ≥ 2 subsets of a finite set, X, the measure of the set of elements of X that
belong to exactly p of the sets Ai (0 ≤ p ≤ n) is given by
\[ T_n^p = \sum_{k=p}^{n} (-1)^{k-p} \binom{k}{p} \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = k}} m\Bigl( \bigcap_{i \in I} A_i \Bigr). \]
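The Sieve Formula can be tried out on a small example with the counting measure, m(A) = |A|. The sketch below takes X = {1, . . . , 30} and the sets of multiples of 2, 3 and 5 as an illustrative choice, and uses the convention that the empty intersection is X; the function names are hypothetical. The case p = 0 is Sylvester's formula.

```python
from itertools import combinations
from math import comb


def exactly_p_direct(X: set, sets: list, p: int) -> int:
    """Count elements of X lying in exactly p of the given sets."""
    return sum(1 for x in X if sum(x in A for A in sets) == p)


def exactly_p_sieve(X: set, sets: list, p: int) -> int:
    """Evaluate the Sieve Formula with the counting measure m(A) = |A|."""
    n = len(sets)
    total = 0
    for k in range(p, n + 1):
        inner = 0
        for I in combinations(range(n), k):
            common = set(X)            # empty intersection is X by convention
            for i in I:
                common &= sets[i]
            inner += len(common)
        total += (-1) ** (k - p) * comb(k, p) * inner
    return total


X = set(range(1, 31))
sets = [{x for x in X if x % d == 0} for d in (2, 3, 5)]
for p in range(4):
    print(p, exactly_p_direct(X, sets, p), exactly_p_sieve(X, sets, p))
```

For p = 0 both counts give the 8 integers between 1 and 30 that are divisible by none of 2, 3 and 5.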
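Proof. For any subset, I ⊆ {1, . . . , n}, apply Sylvester’s formula to $X = \bigcap_{i \in I} A_i$ and to the
subsets $A_j \cap \bigcap_{i \in I} A_i$. We get
\[ m\Bigl( \bigcap_{i \in I} A_i \cap \bigcap_{j \notin I} \overline{A_j} \Bigr) = \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ I \subseteq J}} (-1)^{|J| - |I|}\, m\Bigl( \bigcap_{j \in J} A_j \Bigr). \]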
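Hence,
\begin{align*}
T_n^p &= \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = p}} m\Bigl( \bigcap_{i \in I} A_i \cap \bigcap_{j \notin I} \overline{A_j} \Bigr) \\
&= \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = p}} \; \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ I \subseteq J}} (-1)^{|J| - |I|}\, m\Bigl( \bigcap_{j \in J} A_j \Bigr) \\
&= \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ |J| \geq p}} \; \sum_{\substack{I \subseteq J \\ |I| = p}} (-1)^{|J| - |I|}\, m\Bigl( \bigcap_{j \in J} A_j \Bigr) \\
&= \sum_{k=p}^{n} (-1)^{k-p} \binom{k}{p} \sum_{\substack{J \subseteq \{1, \ldots, n\} \\ |J| = k}} m\Bigl( \bigcap_{j \in J} A_j \Bigr),
\end{align*}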
3.3. THE INCLUSION-EXCLUSION PRINCIPLE 123