Lecture Note Network Problem Set 3

This document contains the solutions to problems from Problem Set 3 of Networks 14.15. Problem 1 examines phase transitions for two events - whether a node has at least l neighbors, and whether a cycle of length k emerges. Threshold functions are derived for both events. Problem 2 shows that the mean degree of a vertex is 2c, and that the degree distribution follows a Poisson distribution with parameter c for even degrees and 0 for odd degrees.

14.15 NETWORKS
PROBLEM SET 3

JOHN WANG

Collaborators: Ryan Liu, Bonny Jain

1. Problem 1
1.1. Problem 1.a. Problem: Let A1 denote the event that node 1 has at least l ∈ Z+ neighbors. Do we
observe a phase transition for this event? If so, find the threshold function and explain your reasoning.
Solution: Yes, we observe a phase transition at t(n) = l/(n − 1). To show that there exists a phase
transition, let us first examine the case when p(n)/t(n) = p(n)(n − 1)/l → 0. In this case we can use Markov's inequality.
We define X as the number of neighbors of node 1, and we note that P(A1) = P(X ≥ l).
Thus, we want the expected value of X. Each of the other n − 1 nodes is connected to node 1 independently with probability p(n), so E[X] = (n − 1)p(n). Markov's inequality then gives the following bound:

(1) P(A1) = P(X ≥ l) ≤ E[X]/l = p(n)(n − 1)/l

Since p(n)/t(n) = p(n)(n − 1)/l → 0, we see that P(A1) → 0 as well. This establishes the first
part of the phase transition: below the threshold, A1 does not occur with high probability. Now we show that
above the threshold the event occurs almost surely, i.e. we examine what happens when p(n)/t(n) = p(n)(n − 1)/l → ∞. We shall use Chebyshev's inequality:
(2) P(|X − E[X]| ≥ |E[X] − l|) ≤ Var(X)/(E[X] − l)²
(3) = p(n − 1)/((n − 1)p − l)²
(4) = p(n − 1)/((n − 1)²p² + l² − 2(n − 1)pl)
(5) = 1/((n − 1)p + l²/((n − 1)p) − 2l)

Here Var(X) = E[X] = (n − 1)p because X can be approximated as a Poisson random
variable for large n (see Newman p. 402). Moreover, l is a constant, so as p(n)(n − 1)/l → ∞
we have l²/((n − 1)p) → 0, and the bound above behaves like 1/((n − 1)p − 2l), which goes to zero as n → ∞. In other words, the
probability that X deviates by more than E[X] − l from its expected value goes to zero, so X ≥ l with probability approaching 1. This shows that
P(A1) → 1 above the threshold. □
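As a sanity check, the two regimes can be simulated directly. This is a sketch, not part of the original solution; n, l, and the factors 0.01 and 100 are illustrative choices, and the degree of node 1 in G(n, p) is sampled as a Binomial(n − 1, p) variable:

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_at_least_l(n, p, l, trials=20000):
    # Degree of node 1 in G(n, p) is Binomial(n - 1, p).
    degrees = rng.binomial(n - 1, p, size=trials)
    return float(np.mean(degrees >= l))

n, l = 10000, 3
t = l / (n - 1)                          # threshold function t(n) = l/(n-1)
below = prob_at_least_l(n, 0.01 * t, l)  # p(n)/t(n) small
above = prob_at_least_l(n, 100 * t, l)   # p(n)/t(n) large
print(below, above)  # below ≈ 0, above ≈ 1
```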

1.2. Problem 1.b. Problem: Let B denote the event that a cycle with k edges (for a fixed k) emerges in
the graph. Do we observe a phase transition for this event? If so, find the threshold function and explain
your reasoning.
Solution: Yes, a threshold function does exist for this event: take t(n) = 1/n.
We will first show that as p/t(n) = pn → 0, we have P(B) → 0. We first want to find the expected number of
cycles of length k in the graph. Let X denote the number of cycles of length k. There are
C(n, k) ways to select k nodes, and for each of these subsets of nodes, there are (k − 1)!/2 ways to create a
cycle. This follows because we fix the first node of the cycle, then there are k − 1 ways of picking the second
node, k − 2 ways of picking the third node, and so on. We divide by two because we could traverse the cycle either backwards or
forwards (clockwise or counterclockwise). Each of these cycles has probability p^k of emerging.
Therefore, we have:

(6) E[X] = C(n, k) · ((k − 1)!/2) · p^k
(7) = (n!/(k!(n − k)!)) · ((k − 1)!/2) · p^k
(8) = (n!/(2k(n − k)!)) · p^k

However, we know that n!/(n − k)! = n(n − 1)···(n − k + 1) ≤ n^k. This means that E[X] ≤ (np)^k/(2k).
Thus, as np → 0, we see that E[X] → 0, which implies P(X ≥ 1) ≤ E[X]/1 = E[X] → 0 by Markov's inequality. Since
P(X ≥ 1) = P(B), we see that P(B) → 0 as p/t(n) → 0. To show the second half of the phase transition,
we need to show that as p/t(n) → ∞ we have P(B) → 1.
To show this we note that P(X = 0) ≤ P(E[X] − X ≥ E[X]) ≤ Var(X)/E[X]² by Chebyshev. We can also
bound E[X] from below by using the fact that n!/(n − k)! = n(n − 1)···(n − k + 1) ≥ (n − k)^k. This shows
that E[X] ≥ (n − k)^k p^k/(2k). Since k is a constant, we know that (n − k)p → ∞ when np → ∞, which implies that E[X] → ∞ when np → ∞.
Now we use the fact that X can be approximated as a Poisson distribution, so that Var(X) = E[X]. This means that P(X = 0) ≤ 1/E[X] → 0. This implies that P(X ≥ 1) → 1, which means that
P(B) → 1, just as we wanted. □
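The expected-count formula can be checked numerically for the special case k = 3 (triangles). This sketch, with illustrative parameters, compares a Monte Carlo average of the triangle count in G(n, p) against E[X] = C(n, 3)p³:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)

def count_triangles(n, p):
    # Sample G(n, p) and count 3-cycles via trace(A^3)/6.
    upper = np.triu(rng.random((n, n)) < p, 1)
    A = (upper | upper.T).astype(float)
    return np.trace(A @ A @ A) / 6

n, p, trials = 60, 0.05, 2000
avg = np.mean([count_triangles(n, p) for _ in range(trials)])
expected = comb(n, 3) * p**3  # E[X] = C(n,k) * (k-1)!/2 * p^k with k = 3
print(avg, expected)  # both ≈ 4.28
```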

2. Problem 2
2.1. Problem 2.a. Problem: Show that the mean degree of a vertex in this network is 2c.
Solution: The expected number of triangles is the total number of triples of
nodes times the probability each triple becomes a triangle. This is

C(n, 3) · c/C(n − 1, 2) = (n!/(3!(n − 3)!)) · (c · 2!(n − 3)!/(n − 1)!) = nc/3.

Since each triangle contributes 3 edges, the total expected number of edges is
nc. Moreover, the expected total degree is 2 times the total number of expected edges
(since each edge has two endpoints). This means we expect total degree 2nc in the graph, and since there
are n nodes, each node has an expected degree of 2nc/n = 2c. □
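A quick simulation of the random-triangle model confirms this. It is a sketch with illustrative n and c; rather than testing every triple, the total number of triangles is drawn directly as a Binomial(C(n,3), p) count, which is equivalent:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(2)

def mean_degree(n, c, trials=200):
    # Each of the C(n,3) triples independently becomes a triangle with
    # probability p = c / C(n-1, 2); each triangle adds degree 2 at each
    # of its 3 corners, so total degree = 6 * (#triangles).
    p = c / comb(n - 1, 2)
    m = rng.binomial(comb(n, 3), p, size=trials)  # triangles per trial
    return float(np.mean(6 * m / n))

n, c = 200, 2.0
md = mean_degree(n, c)
print(md)  # ≈ 2c = 4
```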

2.2. Problem 2.b. Problem: Show that the degree distribution is


(9) pk = e^(−c) c^(k/2) / (k/2)!  if k is even, and pk = 0 if k is odd

Solution: We can assume that the number of triangles at a node is a binomial random variable, because
each triangle is selected independently with the same probability, and a binomial distribution
converges to a Poisson distribution as n grows large. Therefore, we only need to find the expected number of triangles at
a node. The probability that a node has degree k is equal to the probability that it is inside of k/2
triangles (because each triangle provides two edges). The expected number of triangles at a node is given by the number of triangles
that a node can connect to times the probability of each occurring. This is just C(n − 1, 2) · c/C(n − 1, 2) = c.
Therefore, we see that λ = c in this Poisson distribution and that there are m = k/2 triangles at a node of degree k. Substituting λ = c and m = k/2 into the probability distribution of a Poisson random variable, λ^m e^(−λ)/m!, gives
c^(k/2) e^(−c)/(k/2)!. This is the degree distribution for when k is even;
k cannot be odd, because whenever a new triangle is added, 2 more edges are added to each of its
nodes, and thus one cannot have an odd degree. Therefore, we have shown that the degree distribution is
given by:
given by:
(10) pk = e^(−c) c^(k/2) / (k/2)!  if k is even, and pk = 0 if k is odd. □
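This distribution can be checked by sampling the number of triangles at a single node (a sketch; n, c, and the trial count are illustrative). The degree is twice a Binomial(C(n−1,2), p) count, which is approximately Poisson(c):

```python
import numpy as np
from math import comb, exp

rng = np.random.default_rng(3)

n, c, trials = 500, 1.5, 50000
p = c / comb(n - 1, 2)
t = rng.binomial(comb(n - 1, 2), p, size=trials)  # triangles at one node
degrees = 2 * t                                   # each triangle adds 2 edges

assert np.all(degrees % 2 == 0)                   # odd degrees never occur
emp = float(np.mean(degrees == 2))                # P(degree = 2) = P(t = 1)
theory = exp(-c) * c                              # e^{-c} c^1 / 1!
print(emp, theory)  # both ≈ 0.335
```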



2.3. Problem 2.c. Problem: Show that the clustering coefficient is C = 1/(2c + 1).
Solution: We want to figure out the number of triangles, as well as the number of connected triples, that we expect
to appear in this graph model. We will find the clustering coefficient in the case where multiedges cannot
occur, then show that multiedges occur with low enough probability that this approximation is valid. Now
consider the case where the graph is simple.
First, we shall find the expected number of triangles. This is just the probability that three vertices i, j, k
end up forming a triangle, which can happen in two ways. The first is that the nodes form a trio and this
trio becomes a triangle; this happens with probability p = c/C(n − 1, 2). The other possibility is that the nodes
do not form a trio, but all three edges form from other triangles. This happens if {i, j}, {j, k}, and {k, i}
are each part of some other triangle; this surrounds the area between i, j, k and forms a triangle. Each such edge
appears with probability p(n − 3), because for each pair of nodes there are n − 3 other nodes that
the pair can form a triangle with (the pair cannot form a triangle with the remaining node of the trio, because
then we would be in the first case). Forming these triangles are independent events, so we have (p(n − 3))³
probability of having our trio surrounded by triangles. The total probability of the trio being a triangle is
thus p + (1 − p)(p(n − 3))³, because in the second case the trio itself must not have formed a triangle.
Now, let us examine how many triples we obtain. Again consider three vertices i, j, k. When i, j, k form
a triangle, 3 triples are formed. Otherwise, a triple is created whenever {i, j} and {j, k} are each part
of a triangle. Each pair of nodes has n − 3 other nodes to form a triangle with, and the events of forming
these triangles are independent, so the probability that the edges {i, j} and {j, k} both form is (p(n − 3))². The
probability that only these two edges form is (1 − p)(p(n − 3))², because otherwise the original trio i, j, k
would form a triangle. There are 3 different ways to select the two edges to create. Therefore, the
expected number of triples per trio is 3 times the probability of forming a triangle plus 3 times the probability of
a configuration like {i, j}, {j, k}: in total, 3(p + (1 − p)(p(n − 3))³ + (1 − p)(p(n − 3))²).
The clustering coefficient is therefore:

(11) Cl(G) = 3(p + (1 − p)(p(n − 3))³) / [3(p + (1 − p)(p(n − 3))³ + (1 − p)(p(n − 3))²)]
= [p/(p(n − 3))² + (1 − p)p(n − 3)] / [p/(p(n − 3))² + (1 − p)p(n − 3) + (1 − p)]

where the second expression divides numerator and denominator by 3(p(n − 3))².

We can take the limit as n → ∞ and note that p = c/C(n − 1, 2) = 2c/((n − 1)(n − 2)) ≈ 2c/n² to obtain the
following expression:

(12) Cl(G) = [1/(pn²) + pn] / [1/(pn²) + pn + 1 − 2c/n²]
(13) = (1/(2c)) / (1/(2c) + 1)
(14) = 1/(2c + 1)

since 1/(pn²) → 1/(2c) and pn → 0.
Now we want to show that multiedges do not happen with high probability. In particular, I will show
that the ratio of multiedges to single edges goes to zero, which means that our above analysis is correct as
n → ∞. First, we want to figure out the probability of a k-edge for two nodes i, j. There are n − 2 total other
nodes that these nodes can form triangles with. Thus, to have a k-edge, there must be exactly k triangles
formed between i, j and k other nodes, so there are C(n − 2, k) ways to choose these k other nodes.
Each such choice yields exactly those k triangles (and no others) with probability p^k (1 − p)^(n−2−k). Thus, the total
probability of forming a k-edge on nodes i, j is C(n − 2, k) p^k (1 − p)^(n−2−k). The ratio R of multiedges to single
edges is:
(15) R = C(n − 2, k) p^k (1 − p)^(n−2−k) / [(n − 2) p (1 − p)^(n−3)]
(16) = [(n − 2)!/((n − 2 − k)! k!)] p^k (1 − p)^(n−2−k) / [(n − 2) p (1 − p)^(n−3)]
(17) = [(n − 3)···(n − 1 − k)/k!] · p^(k−1)/(1 − p)^(k−1)
(18) ≤ n^(k−1) p^(k−1) / (k!(1 − p)^(k−1))

where the product in (17) has k − 1 factors, each at most n.

Now, as n → ∞ we know that p ≈ 2c/n², so that we have the following:

(19) R ≤ n^(k−1) (2c)^(k−1) / (k! n^(2(k−1)))
(20) = (2c)^(k−1) / (k! n^(k−1))

which goes to zero for every k > 1. Therefore, we see that the probability of single edges is much higher than the
probability of k-edges for k > 1, so much so that the ratio goes to zero for all k > 1. Thus, we can
ignore multiedges in our analysis, and our previous analysis of the clustering coefficient is sound. □
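The limit Cl(G) = 1/(2c + 1) can be checked by simulation. This is a sketch: n, c, and the number of runs are illustrative; triangles are placed on uniformly random triples (equivalent to per-triple Bernoulli draws up to rare collisions), and multiedges are simply collapsed, which the argument above justifies:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(4)

def global_clustering(n, c):
    # Draw the number of triangles, place each on a random triple,
    # then measure 3 * (#closed triangles) / (#connected triples).
    p = c / comb(n - 1, 2)
    A = np.zeros((n, n))
    for _ in range(rng.binomial(comb(n, 3), p)):
        i, j, k = rng.choice(n, size=3, replace=False)
        A[i, j] = A[j, i] = A[j, k] = A[k, j] = A[i, k] = A[k, i] = 1
    deg = A.sum(axis=1)
    triangles = np.trace(A @ A @ A) / 6
    triples = (deg * (deg - 1) / 2).sum()
    return 3 * triangles / triples

n, c = 200, 2.0
est = np.mean([global_clustering(n, c) for _ in range(20)])
print(est, 1 / (2 * c + 1))  # both ≈ 0.2
```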

2.4. Problem 2.d. Problem: Show that when there is a giant component in the network, its expected
size S, as a fraction of the network size, satisfies S = 1 − e^(−cS(2−S)).
Solution: Suppose that ui is the probability that node i does not belong to the giant component. For each
potential triangle containing i, either 1) the triangle is not formed, which occurs
with probability 1 − p, or 2) the triangle is formed but neither of its other two nodes j, k belongs to the giant component, which occurs with probability p·uj·uk, since both nodes of the connecting
triangle must avoid the giant component. Since each node is independent and triangles are randomly selected, we can set ui = uj = uk = u, so the probability that i fails to reach the
giant component through a given triangle is 1 − p + pu². Since there
are C(n − 1, 2) potential triangles containing i, we see the following:

(22) u = (1 − p + pu²)^(C(n−1,2))
(23) = (1 − c/C(n − 1, 2) + (c/C(n − 1, 2))u²)^(C(n−1,2))

Taking logs of both sides and using the fact that ln(1 + x) ≈ x for small x, we can simplify the expression:

(24) ln u = C(n − 1, 2) ln(1 − c/C(n − 1, 2) + (c/C(n − 1, 2))u²)
(25) = C(n − 1, 2) · c(u² − 1)/C(n − 1, 2)
(26) = cu² − c

Exponentiating both sides gives u = e^(c(u²−1)). However, we know that S = 1 − u, because u is the probability of
any particular node not belonging to the giant component, and S is the probability of belonging to the giant
component. Therefore, we see that 1 − S = e^(−c+c(1−S)²), which simplifies to S = 1 − e^(−cS(2−S)), which proves
our result. □
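The fixed-point equation is transcendental, but easy to solve numerically. This sketch iterates S ← 1 − e^(−cS(2−S)) starting from S = 1, so it converges to the nonzero root when one exists; the value of c used is the one derived in part 2.e:

```python
import math

def giant_component_size(c, iters=200):
    # Fixed-point iteration for S = 1 - exp(-c * S * (2 - S)).
    # Starting at S = 1 converges to the nonzero root when one exists.
    S = 1.0
    for _ in range(iters):
        S = 1 - math.exp(-c * S * (2 - S))
    return S

print(giant_component_size(4 / 3 * math.log(2)))  # ≈ 0.5
```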

2.5. Problem 2.e. Problem: What is the value of the clustering coefficient when the giant component
fills half of the network?
Solution: We choose S = 1/2 and solve for c. We have:

(27) 1/2 = 1 − e^(−c·(1/2)·(2 − 1/2))
(28) 1/2 = 1 − e^(−3c/4)
(29) e^(−3c/4) = 1/2
(30) −3c/4 = −ln(2)
(31) c = (4/3) ln(2)

Now, we substitute into our equation for the clustering coefficient C = 1/(2c + 1) to obtain C = 1/(8 ln(2)/3 + 1). □
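As a quick numerical check of the arithmetic:

```python
import math

c = 4 / 3 * math.log(2)   # c from equation (31)
C = 1 / (2 * c + 1)       # clustering coefficient from part 2.c
print(round(C, 4))  # 0.3511
```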


3. Problem 3
3.1. Problem 3.a. Problem: A k-regular graph (i.e. a graph where all nodes have degree k).
Solution: First we note that the smallest component that is possible in a k-regular graph is of size
k + 1. This is because each node must have k edges, so there must be at least k other nodes in the
component starting from some node v. Thus, naively, if k = Θ(n), then there must exist a giant component.
More precisely, we will look at the sub/supercritical thresholds for this with the branching process.
Let us start at some node 1. It has k neighbors in the first branch of the branching process, and
each subsequent node contributes k − 1 new children (its remaining edges). Thus, we have µ = k − 1. As long as k > 2, we have µ > 1
and we are in the supercritical regime. However, for k ≤ 2, we have µ ≤ 1 and we are in the subcritical
regime. Thus, we see that a k-regular graph has a giant component in expectation as long as k > 2. □
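A stub-matching simulation illustrates the threshold. This is a sketch (the configuration model with constant degree; self-loops and multiedges are left in, since they barely affect component sizes). The critical case k = 2 is omitted, since random 2-regular graphs are unions of cycles whose largest cycle fluctuates heavily; k = 1 and k = 3 show the two regimes cleanly:

```python
import numpy as np

rng = np.random.default_rng(5)

def largest_component_fraction(n, k):
    # Pair up the n*k stubs uniformly at random, then union-find components.
    stubs = np.repeat(np.arange(n), k)
    rng.shuffle(stubs)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in zip(stubs[0::2], stubs[1::2]):
        parent[find(a)] = find(b)
    sizes = np.bincount([find(i) for i in range(n)])
    return sizes.max() / n

n = 20000
sup = largest_component_fraction(n, 3)  # mu = 2 > 1: supercritical
sub = largest_component_fraction(n, 1)  # mu = 0: components of size 2
print(sup, sub)  # sup ≈ 1, sub = 2/n
```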

3.2. Problem 3.b. Problem: A power law graph (pk = ck^(−α), α < 3).
Solution: First we consider the case where α ≤ 2. In the branching process starting from some node 1,
we want to figure out the expected number of children. We can find this using the degree distribution: we
see that there will be k − 1 children if the node has degree k. Thus, we have the expected number of
children as Σ_{k≥1} ck^(−α)(k − 1). We can bound this from below when α ≤ 2:

(32) Σ_{k≥1} ck^(−α)(k − 1) ≥ c Σ_{k≥1} (k − 1)/k²
(33) = c Σ_{k≥2} (1/k − 1/k²)
(34) ≥ c Σ_{k≥2} 1/(2k) = ∞

where the last step uses 1/k² ≤ 1/(2k) for k ≥ 2.

Since the sum diverges for α ≤ 2, the expected number of children is greater than 1 and we
are in a supercritical regime, so we will have a giant component. Now let us consider 2 < α < 3. In this
case we notice that we can use the Riemann zeta function:

(35) Σ_{k≥1} ck^(−α)(k − 1) = c Σ_{k≥1} (1/k^(α−1) − 1/k^α)
(36) = c(ζ(α − 1) − ζ(α))

This follows because Σ_k 1/k^j converges when j > 1 and is denoted ζ(j). Moreover, ζ(j) is
monotonically decreasing as j increases, which implies that ζ(α − 1) > ζ(α) and hence
ζ(α − 1) − ζ(α) > 0. This further implies that c(ζ(α − 1) − ζ(α)) > 0 for α ∈ (2, 3), so
the branching process always has a positive expected number of children, and for large enough values of c the expectation exceeds 1 and there
exists a giant component.
Therefore, we see that there is indeed a giant component in expectation for the power law graph when
α < 3 (for α ∈ (2, 3), provided c is large enough). □
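The expected number of children for 2 < α < 3 can be checked by partial sums. This is a sketch; the truncation point is arbitrary, and the series converges slowly for α near 2:

```python
# Expected children: c * sum_{k>=1} (k-1)/k^alpha = c * (zeta(alpha-1) - zeta(alpha))
def expected_children(c, alpha, terms=1_000_000):
    return c * sum((k - 1) / k**alpha for k in range(1, terms + 1))

# For alpha = 2.5: zeta(1.5) - zeta(2.5) ≈ 2.612 - 1.341 ≈ 1.271
mu = expected_children(1.0, 2.5)
print(mu)
```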

3.3. Problem 3.c. Problem: A graph in which node degrees can only take values in {0, 1, 2, 3}.
Solution: We will examine when this graph becomes subcritical and supercritical. Let us assume that
each node has degree k with probability pk, so that Σ_{k=0}^{3} pk = 1. We know that all nodes with degree 0 will
not form giant components because they are isolated, so we look at the children contributed by nodes
of degree 1, 2, or 3. In the branching process starting from node 1, a newly reached
vertex of degree k contributes k − 1 children. Thus, the expected number of children is
µ = Σ_{k=1}^{3} pk(k − 1) = p2 + 2p3.
We see by the branching process argument that we will obtain a giant component in expectation whenever
µ = p2 + 2p3 > 1. For example, if the degree distribution is uniform, pk = 1/4 for each k, then
µ = 1/4 + 2/4 = 3/4 < 1 and there will not exist a giant component. Other distributions that put similarly little weight
on degrees 2 and 3 will not have a giant component either. □

4. Problem 4
4.1. Problem 4.a. Problem: Consider a graph G with n nodes generated according to the configuration
model with a particular degree distribution P(d). Show that the overall clustering coefficient is given by
Cl(G) = (⟨d⟩/n) · [(⟨d²⟩ − ⟨d⟩)/⟨d⟩²]².

Solution: We note that the clustering coefficient Cl is the average probability that two neighbors of
a vertex are neighbors of each other as well. Thus, we are looking for the probability that i and j are
connected, given that both are connected to some vertex v. First, we know that the probability of an edge between nodes
i and j is given by ki·kj/(2m), where ki is the degree of node i. This is because there are kj edge endpoints at node
j out of 2m total, so any given edge endpoint leads to j with probability kj/(2m), and since there are ki edges leaving node i,
the probability of an edge existing between nodes i and j is ki·kj/(2m).
Next, we notice that the excess degree distribution for node i is given by qk = (k + 1)p_{k+1}/⟨k⟩. This can
be seen from the derivation in Newman (Equation 13.46). Thus, the probability that a node i, which is
already connected to some node v, has k further edges is given by qk. Thus, we can write the clustering coefficient
as the sum over all possible excess degrees ki, kj for nodes i and j of the probability of obtaining ki and kj times
the probability of an edge between nodes i and j. This is given by:
(37) Cl(G) = Σ_{ki=0}^∞ Σ_{kj=0}^∞ q_{ki} q_{kj} · (ki·kj)/(2m)
(38) = (1/(2m)) (Σ_{k=0}^∞ k·qk)²
(39) = (1/(2m)) (Σ_{k=0}^∞ k(k + 1)p_{k+1}/⟨k⟩)²
(40) = (1/(2m⟨k⟩²)) (Σ_{k=0}^∞ k(k + 1)p_{k+1})²
(41) = (1/(2m⟨k⟩²)) (Σ_{k=1}^∞ (k − 1)k·pk)²

Here (39) uses qk = (k + 1)p_{k+1}/⟨k⟩, (40) pulls the factor 1/⟨k⟩² out of the square, and (41) makes the change of variable k → k − 1.
Since 2m is the total degree of the graph, we know that 2m = n⟨k⟩, so we can substitute in the
denominator and expand (k − 1)k·pk = k²pk − k·pk to obtain:

(42) Cl(G) = (1/(n⟨k⟩³)) (Σ_k (k²pk − k·pk))²
(43) = (1/(n⟨k⟩³)) (⟨k²⟩ − ⟨k⟩)²

Now, set d = k and we see that Cl(G) = (⟨d⟩/n) · [(⟨d²⟩ − ⟨d⟩)/⟨d⟩²]², which is what we wanted. □
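The formula can be checked against a stub-matching simulation of the configuration model. This is a sketch: the degree sequence, n, and the number of runs are illustrative, and self-loops/multiedges are collapsed, which introduces a small finite-n bias relative to the formula:

```python
import numpy as np

rng = np.random.default_rng(6)

def transitivity(degrees):
    # One stub-matching draw; clustering = 3 * triangles / connected triples.
    n = len(degrees)
    stubs = np.repeat(np.arange(n), degrees)
    rng.shuffle(stubs)
    A = np.zeros((n, n))
    for a, b in zip(stubs[0::2], stubs[1::2]):
        if a != b:
            A[a, b] = A[b, a] = 1
    deg = A.sum(axis=1)
    return 3 * (np.trace(A @ A @ A) / 6) / (deg * (deg - 1) / 2).sum()

n = 600
degrees = np.array([2] * (n // 2) + [10] * (n // 2))
d1, d2 = degrees.mean(), (degrees**2).mean()
theory = (d1 / n) * ((d2 - d1) / d1**2) ** 2
sim = np.mean([transitivity(degrees) for _ in range(10)])
print(sim, theory)  # both ≈ 0.016
```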
