
Graph Course Notes yx

The document discusses algorithms for finding strongly connected components (SCC) in directed graphs using depth-first search (DFS) and explores the properties of cycles, topological sorting, and the relationship between SCCs and directed acyclic graphs (DAGs). It also covers the 2-SAT problem, explaining how to construct a directed graph from a CNF formula and the implications of satisfying assignments. The document concludes with an overview of the algorithm for solving 2-SAT using SCCs, highlighting its polynomial-time complexity.

Uploaded by

saeb2saeb

Gt Ga Notes Gr

Posted Sep 20, 2024 • Updated Oct 20, 2024


By yxlow 18 min read

Strongly Connected Components (GR1)

Here is a recap of DFS (for an undirected graph):

 Plaintext 
DFS(G):
    input: G=(V,E) in adjacency list representation
    output: vertices labeled by connected components
    cc = 0
    for all v in V: visited(v) = False, prev(v) = NULL
    for all v in V:
        if not visited(v):
            cc++
            Explore(v)

Now let's define Explore:

 Plaintext 
Explore(z):
    ccnum(z) = cc
    visited(z) = True
    for all (z,w) in E:
        if not visited(w):
            Explore(w)
            prev(w) = z

Here ccnum(z) is the connected component number of z.

The overall running time is O(n + m), where n = |V| and m = |E|. This is because you visit each node once and, at each node, you try its edges, so the total work is proportional to the number of nodes plus the number of edges.
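The two routines above can be sketched in Python as follows. This is a minimal sketch, not the lecture's code: it assumes vertices are numbered 0..n-1 and the graph is given as a list of undirected edges, and the function name `connected_components` is made up for illustration.

```python
from collections import defaultdict

def connected_components(n, edges):
    """Label vertices 0..n-1 of an undirected graph with component numbers."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)      # undirected: store both directions

    ccnum = [0] * n           # 0 doubles as "not visited yet"
    prev = [None] * n
    cc = 0

    def explore(z):
        ccnum[z] = cc
        for w in adj[z]:
            if ccnum[w] == 0:
                prev[w] = z
                explore(w)

    for v in range(n):
        if ccnum[v] == 0:
            cc += 1           # a new component starts here
            explore(v)
    return ccnum, prev

ccnum, prev = connected_components(5, [(0, 1), (1, 2), (3, 4)])
print(ccnum)  # [1, 1, 1, 2, 2]
```

Recursion keeps the sketch close to the pseudocode; for very large graphs an explicit stack would avoid Python's recursion limit.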

What if our graph is now directed? We can still use DFS, but we add pre and post order numbers and remove the component counters.

Notice the difference and the addition of the clock variable:

 Plaintext 
DFS(G):
    clock = 1
    for all v in V: visited(v) = False, prev(v) = NULL
    for all v in V:
        if not visited(v):
            Explore(v)

Now let's define Explore:

 Plaintext 
Explore(z):
    pre(z) = clock; clock++
    visited(z) = True
    for all (z,w) in E:
        if not visited(w):
            Explore(w)
            prev(w) = z
    post(z) = clock; clock++
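A rough Python version of the clocked DFS is below. It assumes the graph is a dict mapping every vertex to its list of out-neighbours (an assumption of this sketch), and the three-vertex graph is a made-up example.

```python
def dfs_pre_post(graph):
    """Return (pre, post) numbers from a DFS over a directed graph."""
    pre, post = {}, {}
    clock = 1

    def explore(z):
        nonlocal clock
        pre[z] = clock
        clock += 1
        for w in graph[z]:
            if w not in pre:      # "w in pre" plays the role of visited(w)
                explore(w)
        post[z] = clock
        clock += 1

    for v in graph:
        if v not in pre:
            explore(v)
    return pre, post

# Hypothetical example: A <-> B, A -> C
pre, post = dfs_pre_post({"A": ["B", "C"], "B": ["A"], "C": []})
print(pre)   # {'A': 1, 'B': 2, 'C': 4}
print(post)  # {'B': 3, 'C': 5, 'A': 6}
```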

Here is an example:

[Figure: example directed graph on the vertices A to H.]

Assuming we start at B, this is what our DFS looks like, with each vertex written as Node(pre,post):

Blue edges are examined during the DFS but not followed, because the vertex they lead to has already been visited.

[Figure: DFS tree rooted at B with labels B(1,16), A(2,11), D(3,10), E(4,7), G(5,6), H(8,9), C(12,15), F(13,14).]

There are various types of edges. For a given edge z → w:

• Tree edges, such as B → A, A → D

◦ post(z) > post(w), because Explore(w) finishes before the recursion returns to z

• Back edges, such as E → A, F → B

◦ post(z) < post(w)

◦ Edges that go back up to an ancestor in the DFS tree.

• Forward edges, such as D → G, B → E

◦ Same as the tree edges, except that they skip down to a descendant that is not a child

◦ post(z) > post(w)

• Cross edges, such as F → H, H → G

◦ post(z) > post(w)

Notice that only back edges have post(z) < post(w); every other type of edge has post(z) > post(w).
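The classification can be read straight off the pre and post numbers. Here is a small sketch; the function name `classify_edges` and the four-vertex graph are made up for illustration and are not the graph from the figure.

```python
def classify_edges(graph):
    """Label every edge of a directed graph as tree/forward/back/cross."""
    pre, post, parent = {}, {}, {}
    clock = 1

    def explore(z):
        nonlocal clock
        pre[z] = clock
        clock += 1
        for w in graph[z]:
            if w not in pre:
                parent[w] = z
                explore(w)
        post[z] = clock
        clock += 1

    for v in graph:
        if v not in pre:
            parent[v] = None
            explore(v)

    kinds = {}
    for z in graph:
        for w in graph[z]:
            if pre[z] < pre[w] and post[w] < post[z]:
                # w is a descendant of z
                kinds[(z, w)] = "tree" if parent[w] == z else "forward"
            elif pre[w] <= pre[z] and post[z] <= post[w]:
                # w is an ancestor of z (or z itself): post(z) <= post(w)
                kinds[(z, w)] = "back"
            else:
                kinds[(z, w)] = "cross"
    return kinds

# Hypothetical graph: a->b, a->c, b->c, c->a, d->b
kinds = classify_edges({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["b"]})
print(kinds[("c", "a")])  # back
print(kinds[("a", "c")])  # forward
```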

Cycles

A graph G has a cycle if and only if its DFS tree has a back edge.

Proof:

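Suppose G has a cycle a → b → c → … → j → a. Let i be the first vertex on the cycle that DFS visits. Every other vertex on the cycle is reachable from i, so the whole cycle lies in the subtree (descendants) of i. In particular, the vertex just before i on the cycle is a descendant of i, and its edge into i is a back edge.

The other direction is straightforward: a back edge z → w goes from a descendant to an ancestor, so together with the tree path from w down to z it forms a cycle.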
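As a sketch of how this is used in practice: a back edge is exactly an edge into a vertex that has been entered but not yet finished, so cycle detection only needs three states per vertex. The names below (`has_cycle` and the colour constants) are assumptions of this sketch.

```python
def has_cycle(graph):
    """True if the directed graph (dict: vertex -> out-neighbours) has a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited / on the DFS stack / finished
    color = {v: WHITE for v in graph}

    def explore(z):
        color[z] = GRAY
        for w in graph[z]:
            if color[w] == GRAY:      # edge to an ancestor: a back edge
                return True
            if color[w] == WHITE and explore(w):
                return True
        color[z] = BLACK
        return False

    return any(color[v] == WHITE and explore(v) for v in graph)

print(has_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))       # True
print(has_cycle({"a": ["b", "c"], "b": ["c"], "c": []}))     # False
```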

Topological sorting

Topologically sorting a DAG (a directed graph with no cycles) means ordering the vertices so that all edges go from lower → higher. Recall that since a DAG has no cycles, it has no back edges. So the post order numbers must satisfy post(z) > post(w) for every edge z → w.

So, to do this, we can order the vertices by decreasing post order number. Note that since we have n vertices, the post order numbers lie between 1 and 2n, so we can create an array of size 2n and place each vertex at the index given by its post order number. Sorting by post order number therefore takes O(n) time.

[Figure: example DAG on the vertices X, Y, Z, W, U.]

Which topological orders are valid? XYZWU, XYZUW, and XYUZW.

Note that, for instance, XYUWZ is not a valid topological order! There is a directed edge from Z to W, so Z must come before W in any valid topological ordering of this graph.
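Here is a sketch of topological sorting by decreasing post order, using the size-2n array trick instead of a comparison sort. The edge list below is an assumption: it is one set of edges (X→Y, Y→Z, Y→W, Y→U, Z→W) consistent with the valid orders listed above, since the figure itself is not reproduced here.

```python
def topological_order(graph):
    """Order the vertices of a DAG by decreasing post order number."""
    post = {}
    visited = set()
    clock = 1

    def explore(z):
        nonlocal clock
        visited.add(z)
        clock += 1                    # pre number (not needed, so not stored)
        for w in graph[z]:
            if w not in visited:
                explore(w)
        post[z] = clock
        clock += 1

    for v in graph:
        if v not in visited:
            explore(v)

    # Post numbers lie in 1..2n, so bucket them instead of sorting.
    slots = [None] * (2 * len(graph) + 1)
    for v, p in post.items():
        slots[p] = v
    return [v for v in reversed(slots) if v is not None]

# Assumed edges for the example DAG
dag = {"X": ["Y"], "Y": ["Z", "W", "U"], "Z": ["W"], "W": [], "U": []}
print("".join(topological_order(dag)))  # XYUZW
```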

Analogy: Imagine youʼre assembling a toy. Piece Y is needed for both piece Z and piece W. But, piece Z
is also needed for piece W. You must attach Y to Z first, then Z to W. Even though Y is needed for W, you
donʼt attach it directly to W.

DAG Structure

• Source vertex: no incoming edges

◦ Highest post order

• Sink vertex = no outgoing edges

◦ Lowest post order

By now, you are probably wondering what all this has to do with strongly connected components (SCCs). We will show that the SCCs of a graph can be found with two runs of DFS.

Connectivity in directed graphs

Vertices v and w are strongly connected if there is a path v → w and a path w → v.

A strongly connected component (SCC) is then a maximal set of strongly connected vertices.

Example:

[Figure: example directed graph on the vertices A to L.]

How many strongly connected components (SCCs) does the graph have? 5

• A is an SCC by itself, since it can reach many other nodes but no other node can reach A
• {H,I,J,K,L}, which can all reach each other
• {C,F,G}

• {B,E}

• {D}

We can simplify the above graph to the following meta graph:

[Figure: metagraph on the five SCCs {A}, {B,E}, {C,F,G}, {D}, {H,I,J,K,L}.]

Notice that this metagraph is a DAG, and this is always the case. This should be obvious: if two strongly connected components were involved in a cycle, they would be combined to form a bigger SCC.

So, every directed graph is a DAG of its strongly connected components. You can take any graph, break it up into SCCs, and then topologically sort these SCCs so that all edges go left to right.

Motivation

There are several ways we could go about this, such as starting from sink vertices or from source vertices. Instead, we work with whole components: find a sink SCC S, output it, remove it, and repeat. It turns out that sink SCCs are easier to work with!

Take any v ∈ S, where S is a sink SCC, and run Explore(v). For example, in the earlier graph, if we run Explore from any of the vertices in {H,I,J,K,L}, we will explore exactly these vertices and no others, because it is a sink SCC.

What if we start from a vertex in a source component? We end up exploring the whole graph, too bad! So, how can we be smart about this and find a vertex that lies in a sink component? If we can do so (somehow magically select a vertex in a sink SCC), then we are guaranteed to find exactly the nodes of that sink SCC.

So, how can we find such a vertex?


Recall that in a DAG, the vertex with the lowest postorder number is a sink.

In a general directed graph G, can we use the same property? Does the vertex v with the lowest post order number always lie in a sink SCC? Ha, of course not. Did you think it would be that easy?

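[Figure: counterexample graph on the vertices A, B, C; a DFS from A gives A(1,6), B(2,3), C(4,5).]

Notice that B has the lowest post order number (3), but it belongs to the SCC {A,B}, which is not a sink SCC.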

What about the other way around? Does the vertex v with the highest post order number always lie in a source SCC? It turns out this is true! How can we make use of this? Simple, just reverse the graph! The source SCCs of the reversed graph are the sink SCCs of the original graph.

So, for a directed G = (V, E), look at G^R = (V, E^R), where every edge is reversed. A source SCC in G^R is a sink SCC in G. So we just flip the graph, run DFS, and take the vertex with the highest post order number: it lies in a source SCC of G^R, which is a sink SCC of G.
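The trick can be sketched in a few lines. The helper name `vertex_in_sink_scc` is hypothetical, and the example edges (A→B, B→A, A→C) are an assumption read off the pre/post numbers of the counterexample above.

```python
def vertex_in_sink_scc(graph):
    """Return a vertex lying in a sink SCC of `graph` (dict: vertex -> out-neighbours)."""
    # Build G^R by flipping every edge.
    rev = {v: [] for v in graph}
    for v in graph:
        for w in graph[v]:
            rev[w].append(v)

    visited = set()
    last_finished = None

    def explore(z):
        nonlocal last_finished
        visited.add(z)
        for w in rev[z]:
            if w not in visited:
                explore(w)
        last_finished = z             # the last vertex to finish has the highest post number

    for v in rev:
        if v not in visited:
            explore(v)
    return last_finished

# Assumed edges of the three-vertex counterexample
print(vertex_in_sink_scc({"A": ["B", "C"], "B": ["A"], "C": []}))  # C
```

Here {C} is the only sink SCC, and the function finds it even though B had the lowest post order number in the original graph.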

Example of SCC

Let's consider the same graph, but now we reverse it and start the DFS at node C:

[Figure: DFS on the reversed graph G^R with labels C(1,12), G(2,5), F(3,4), B(6,11), A(7,8), E(9,10), D(13,14), L(15,24), K(16,23), J(17,22), H(18,19), I(20,21).]

We start from C in the reverse graph G^R and visit {C, G, F, B, A, E}, then we proceed to {D} before going to L. You may wonder whether the choice of starting vertex matters. Suppose you start at any vertex in {H,I,J,K,L} instead: it can reach all vertices except D, so the vertex with the highest post number will still lie in a source SCC of G^R.

Then, we sort by decreasing post number and get the following order: L, K, J, I, H, D, C, B, E, A, G, F. So now, we run DFS on the original graph, starting at L.

• At first step, we start at L, so, we will reach {L,K,J,I,H} label as 1 and strike them out.
• We then visit D, and reach {D} , label as 2 and strike them out

• We do the same for C , reach {C,F,G} label as 3 and strike them out

• Then do the same for {B,E} label as 4


• and finally {A} label as 5

So, L, K, J, I, H, D, C, B, E, A, G, F is mapped to {1,1,1,1,1,2,3,4,4,5,3,3}. Another interesting observation: using the new labels {1,2,3,4,5}, we get the following graph:

[Figure: metagraph on the component labels 1 to 5.]

Notice that the edges of this metagraph go from higher labels to lower labels, from 5 down to 1. So the labels {1,1,1,1,1,2,3,4,4,5,3,3} also give the topological order of the SCCs in reverse! So we can take any graph, run two iterations of DFS, find its SCCs, and arrange these SCCs in topological order.

SCC algorithm

 Plaintext 
SCC(G):
    input: directed G=(V,E) in adjacency list
    1. Construct G^R
    2. Run DFS on G^R
    3. Order V by decreasing post order number
    4. Run the undirected connected components algorithm on G, trying start vertices in that order
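The four steps can be sketched in Python as follows. This is a minimal sketch, assuming the graph is a dict that lists every vertex as a key; the function name `scc` and the example graph are made up.

```python
def scc(graph):
    """Return {vertex: label}; label 1 is a sink SCC, higher labels come earlier topologically."""
    # 1. Construct G^R.
    rev = {v: [] for v in graph}
    for v in graph:
        for w in graph[v]:
            rev[w].append(v)

    # 2. DFS on G^R, recording vertices in order of increasing post number.
    visited, finish = set(), []

    def explore_rev(z):
        visited.add(z)
        for w in rev[z]:
            if w not in visited:
                explore_rev(w)
        finish.append(z)

    for v in rev:
        if v not in visited:
            explore_rev(v)

    # 3 + 4. Connected-components style DFS on G, by decreasing post number.
    label, cc = {}, 0

    def explore(z):
        label[z] = cc
        for w in graph[z]:
            if w not in label:
                explore(w)

    for v in reversed(finish):
        if v not in label:
            cc += 1
            explore(v)
    return label

# Hypothetical graph: {a, b} is a source SCC, {c, d} is a sink SCC
print(scc({"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"]}))
# {'c': 1, 'd': 1, 'a': 2, 'b': 2}
```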

Proof of claim:
Given two SCCs S and S′ and an edge v → w with v ∈ S and w ∈ S′, the claim is that the max post number in S is always greater than the max post number in S′.

The first case is when DFS reaches some z ∈ S′ before any vertex of S. S′ cannot reach S (otherwise they would be one SCC), so we finish exploring S′ before moving to S, and the post numbers in S will be bigger.

The second case is when DFS reaches some z ∈ S first. Every vertex of S ∪ S′ is reachable from z, so they all end up in the subtree rooted at z. Since z is the root of that subtree, it must have the highest post order number.

BFS & Dijkstra's

DFS : connectivity

BFS:

input: G = (V , E) & s ∈ V

output: for all v ∈ V , dist(v) = min number of edges from s to v and prev(v).
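A minimal BFS sketch matching this input/output description is below. It assumes the graph is a dict of adjacency lists, and it simply leaves unreachable vertices out of `dist` (a choice of this sketch).

```python
from collections import deque

def bfs(graph, s):
    """Return (dist, prev): edge-count distances from s and BFS tree parents."""
    dist = {s: 0}
    prev = {s: None}
    queue = deque([s])
    while queue:
        z = queue.popleft()
        for w in graph[z]:
            if w not in dist:         # first time we reach w is via a shortest path
                dist[w] = dist[z] + 1
                prev[w] = z
                queue.append(w)
    return dist, prev

dist, prev = bfs({"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": [], "d": []}, "s")
print(dist)  # {'s': 0, 'a': 1, 'b': 1, 'c': 2}
```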

Dijkstraʼs:

input: G = (V , E) & s ∈ V , ℓ(e) > 0∀e ∈ E

output: ∀v ∈ V , dist(v) = length of shortest s → v path.

Note that Dijkstra's uses a min-heap (as a priority queue), where each insertion or removal takes O(log n) time. So, the overall runtime for Dijkstra's is O((n + m) log n).
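Here is a sketch of Dijkstra's using Python's `heapq` as the min-heap. It assumes the graph maps each vertex to a list of (neighbour, length) pairs with positive lengths. Instead of a decrease-key operation it pushes a fresh entry and skips stale ones, which is a common substitute and not necessarily what the lecture uses.

```python
import heapq

def dijkstra(graph, s):
    """Shortest path lengths from s; graph: vertex -> list of (neighbour, positive length)."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, z = heapq.heappop(heap)
        if d > dist[z]:
            continue                  # stale heap entry, a shorter path was found already
        for w, length in graph[z]:
            nd = d + length
            if w not in dist or nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

g = {"s": [("a", 4), ("b", 1)], "a": [("t", 1)], "b": [("a", 2)], "t": []}
print(dijkstra(g, "s"))  # {'s': 0, 'a': 3, 'b': 1, 't': 4}
```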

2-Satisfiability (GR2)

Boolean formula:

• n variables x1, …, xn

• 2n literals x1, x̄1, …, xn, x̄n, where x̄i = ¬xi
• We use ∧ for the AND condition and ∨ for the OR condition.

CNF
Now, we define CNF (conjunctive normal form):

Clause: OR of several literals, e.g. (x3 ∨ x̄5 ∨ x̄1 ∨ x2)

F in CNF: AND of m clauses, e.g. (x2) ∧ (x̄3 ∨ x4) ∧ (x3 ∨ x̄5 ∨ x̄1 ∨ x2) ∧ (x̄2 ∨ x̄1)

Notice that for F to be true, each clause needs at least one literal to be true.

SAT

Input: formula f in CNF with n variables and m clauses

Output: a satisfying assignment (assign T or F to each variable) if one exists, NO if none exists.

Example: f = (x̄1 ∨ x̄2 ∨ x3) ∧ (x2 ∨ x3) ∧ (x̄3 ∨ x̄1) ∧ (x̄3)

An assignment that works is x1 = F, x2 = T, x3 = F.
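Checking an assignment against a CNF formula is a one-liner. The encoding below is an assumption of this sketch (not from the lecture): the integer i stands for xi and -i for x̄i, and the second clause is taken to be (x2 ∨ x3).

```python
def satisfies(cnf, assignment):
    """True if every clause has at least one true literal under `assignment`."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

# f = (x̄1 ∨ x̄2 ∨ x3) ∧ (x2 ∨ x3) ∧ (x̄3 ∨ x̄1) ∧ (x̄3)
f = [[-1, -2, 3], [2, 3], [-3, -1], [-3]]
print(satisfies(f, {1: False, 2: True, 3: False}))  # True
print(satisfies(f, {1: True, 2: True, 3: False}))   # False (first clause fails)
```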

K-SAT

For k-SAT, the input is a formula f in CNF with n variables and m clauses, each of size ≤ k. So the above formula f is an example with k = 3. In general:

• SAT is NP-complete
• k-SAT is NP-complete for all k ≥ 3
• 2-SAT has a poly-time algorithm using SCCs

For example, consider the following input f for 2-SAT:

f = (x3 ∨ x̄2) ∧ (x̄1) ∧ (x1 ∨ x4) ∧ (x̄4 ∨ x2) ∧ (x̄3 ∨ x4)

We first want to eliminate unit clauses, which are clauses with a single literal such as (x̄1). This is because there is only one way to satisfy (x̄1): set x1 = F.

• Take a unit clause, say with literal ai

• Satisfy it (set ai = T)

• Remove the clauses containing ai and drop the literal āi from the remaining clauses

• Let f′ be the resulting formula

For example:

f = (x3 ∨ x̄2) ∧ (x̄1) ∧ (x1 ∨ x4) ∧ (x̄4 ∨ x2) ∧ (x̄3 ∨ x4)

f′ = (x3 ∨ x̄2) ∧ (x4) ∧ (x̄4 ∨ x2) ∧ (x̄3 ∨ x4)

So, the original f is satisfiable if and only if f′ is. Notice that f′ contains a new unit clause (x4), which we can remove in the same way. Eventually we are left with either an empty formula or a formula where all clauses are of size 2.
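The unit-clause simplification can be sketched as a loop. The function name is hypothetical, literals use the integer encoding i / -i for xi / x̄i (an assumption of this sketch), and returning `None` when a clause becomes empty is an addition of this sketch that signals f is unsatisfiable.

```python
def simplify_unit_clauses(cnf):
    """Return (remaining clauses, literals forced to T), or None on a conflict."""
    clauses = [list(c) for c in cnf]
    forced = []
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, forced    # every remaining clause has size >= 2
        forced.append(unit)           # satisfy the unit clause
        remaining = []
        for c in clauses:
            if unit in c:
                continue              # clause is satisfied, remove it
            reduced = [lit for lit in c if lit != -unit]
            if not reduced:
                return None           # clause became empty: unsatisfiable
            remaining.append(reduced)
        clauses = remaining

# f = (x3 ∨ x̄2) ∧ (x̄1) ∧ (x1 ∨ x4) ∧ (x̄4 ∨ x2) ∧ (x̄3 ∨ x4)
f = [[3, -2], [-1], [1, 4], [-4, 2], [-3, 4]]
print(simplify_unit_clauses(f))  # ([], [-1, 4, 2, 3])
```

For this f the cascade of unit clauses empties the formula entirely, forcing x1 = F and x4 = x2 = x3 = T.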

SAT-graph

Take f with all clauses of size = 2, n variables and m clauses, we create a directed graph:

• 2n vertices corresponding to x1, x̄1, …, xn, x̄n

• 2m edges corresponding to 2 “implications” per clause

Consider the following example: f = (x̄1 ∨ x̄2) ∧ (x2 ∨ x3) ∧ (x̄3 ∨ x̄1)

• Notice that setting x1 = T forces x2 = F, and likewise x2 = T forces x1 = F

[Figure: implication graph for f on the vertices x1, x̄1, x2, x̄2, x3, x̄3.]

In general, given a clause (α ∨ β), we add the edges ᾱ → β and β̄ → α.

If we observe the graph, we notice that there is a path from x1 to x̄1 (namely x1 → x̄2 → x3 → x̄1), so setting x1 = T leads to a contradiction. If x1 = F, then it might still be fine. In general, if there are paths both x1 → x̄1 and x̄1 → x1, then f is not satisfiable, because x1 and x̄1 are in the same SCC.
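Building the graph is mechanical: each clause (α ∨ β) contributes the two edges ᾱ → β and β̄ → α. A small sketch, again with the assumed integer encoding i / -i for xi / x̄i and a made-up function name:

```python
def implication_graph(n, clauses):
    """Adjacency lists on the 2n literals; each clause (a, b) adds -a -> b and -b -> a."""
    graph = {lit: [] for i in range(1, n + 1) for lit in (i, -i)}
    for a, b in clauses:
        graph[-a].append(b)
        graph[-b].append(a)
    return graph

# f = (x̄1 ∨ x̄2) ∧ (x2 ∨ x3) ∧ (x̄3 ∨ x̄1)
g = implication_graph(3, [(-1, -2), (2, 3), (-3, -1)])
print(g[1])   # [-2, -3]  i.e. x1 -> x̄2 and x1 -> x̄3
print(g[-2])  # [3]       i.e. x̄2 -> x3
print(g[3])   # [-1]      i.e. x3 -> x̄1
```

Following 1 → -2 → 3 → -1 in the output reproduces the path x1 → x̄1 discussed above.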

In general:

• If for some i, xi and x̄i are in the same SCC, then f is not satisfiable.
• If for all i, xi and x̄i are in different SCCs, then f is satisfiable.

2-SAT Algo

• Take a source SCC S′ and set every literal in S′ to F

• Take the sink SCC S̄′ and set every literal in S̄′ to T
• Once we have done this, we can remove all these literals from the graph!

This works because of a key fact: if for all i, xi and x̄i are in different SCCs, then S is a sink SCC if and only if S̄ is a source SCC.

 Plaintext 
2SAT(f):
    1. Construct graph G for f
    2. If some x_i and bar(x_i) are in the same SCC, return NO
    3. Take a sink SCC S
        - Set S = T (and bar(S) = F)
        - Remove S and bar(S)
        - Repeat until the graph is empty
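Putting the pieces together, here is a compact sketch of the whole 2-SAT procedure. Several things are assumptions of this sketch rather than the lecture's presentation: literals are the integers i / -i, every clause is a pair (a unit clause (a) can be written as (a, a)), and instead of literally removing sink SCCs one at a time it uses the sink-first labels from the SCC routine, setting a literal to T when its component would be removed before its negation's component.

```python
def solve_2sat(n, clauses):
    """Return {variable: bool} satisfying all 2-clauses, or None if unsatisfiable."""
    lits = [lit for i in range(1, n + 1) for lit in (i, -i)]
    graph = {lit: [] for lit in lits}
    rev = {lit: [] for lit in lits}
    for a, b in clauses:
        for u, v in ((-a, b), (-b, a)):   # the two implications per clause
            graph[u].append(v)
            rev[v].append(u)

    # DFS on the reversed graph, recording vertices by increasing post number.
    visited, finish = set(), []

    def explore_rev(z):
        visited.add(z)
        for w in rev[z]:
            if w not in visited:
                explore_rev(w)
        finish.append(z)

    for lit in lits:
        if lit not in visited:
            explore_rev(lit)

    # Label SCCs in sink-first order: label 1 is a sink SCC.
    comp, cc = {}, 0

    def explore(z):
        comp[z] = cc
        for w in graph[z]:
            if w not in comp:
                explore(w)

    for lit in reversed(finish):
        if lit not in comp:
            cc += 1
            explore(lit)

    assignment = {}
    for i in range(1, n + 1):
        if comp[i] == comp[-i]:
            return None                   # x_i and its negation share an SCC
        assignment[i] = comp[i] < comp[-i]  # the side closer to the sink is set to T
    return assignment

f = [(-1, -2), (2, 3), (-3, -1)]
a = solve_2sat(3, f)
print(a[1])  # False, since there is a path x1 -> x̄1
print(all(any(a[abs(l)] == (l > 0) for l in c) for c in f))  # True
```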

Proof:

The first claim is that there is a path α → β ⟺ there is a path β̄ → ᾱ.

Take a path α → β, say γ0 → γ1 → … → γl where γ0 = α, γl = β.

Recall that the clause (γ̄1 ∨ γ2) is represented in the graph by the edge γ1 → γ2, since if γ1 = T then γ2 must also be T. The same clause is also represented by the edge γ̄2 → γ̄1. Applying this to every edge of the path gives γ̄0 ← γ̄1 ← … ← γ̄l, which is a path β̄ → ᾱ since γ0 = α, γl = β.

Using this claim, we can show two more things:

If α, β ∈ S, then ᾱ, β̄ ∈ S̄. This is true because α and β being in the same SCC means there are paths α ↔ β, so by the above claim there are paths β̄ ↔ ᾱ, and hence ᾱ and β̄ belong to the same SCC.

It remains to show that if S is a sink SCC, then S̄ is a source SCC.

Take a sink SCC S. For α ∈ S, there are no edges α → β leaving S, which implies there are no edges β̄ → ᾱ entering S̄. In other words, no outgoing edges from α means no incoming edges to ᾱ. This shows that S̄ is a source SCC!

MST (GR3)

For the minimum spanning tree, we are going to go through Kruskal's algorithm, but mainly focus on its correctness. Note that Kruskal's algorithm is a greedy algorithm that makes use of the cut property. This property is also useful in proving Prim's algorithm.

MST algorithm

Given: undirected G = (V, E) with weights w(e) for e ∈ E

Goal: find a connected subgraph of minimum size and minimum weight. This connected subgraph is known as a minimum spanning tree (refer to it as T). So we want T ⊂ E minimizing w(T) = ∑e∈T w(e).

 Plaintext 
Kruskals(G):
    input: undirected G = (V,E) with weights w(e)
    1. Sort E by increasing weight
    2. Set X = {} (the empty set)
    3. For e=(v,w) in E (in increasing order):
        if X U e does not have a cycle:
            X = X U e (U here denotes union)
    4. Return X

Runtime analysis:

1. Step one takes O(m log n) time, where m = |E|, n = |V|

1. This was a little confusing to me: why did the lecture say O(m log m) = O(m log n)? It turns out that the max of m is n² for a fully connected graph, so O(m log m) = O(m log n²) = O(2m log n) = O(m log n).

2. For step three we can make use of the union-find data structure.

1. Let c(v) be the component containing v in (V, X)

2. Let c(w) be the component containing w in (V, X)
3. We check whether c(v) ≠ c(w) (by them having different representatives); if so, we add e to X.
4. We then apply union to both of them.

3. Each union-find operation takes O(log n) time

1. Since we do this for all edges, the total is O(m log n).
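Here is a sketch of Kruskal's algorithm with a simple union-find. It assumes vertices are numbered 0..n-1 and edges are given as (weight, v, w) triples. The union-find uses union by rank without path compression, which is enough for the O(log n) bound per operation quoted above.

```python
def kruskal(n, edges):
    """Return the MST edges of a connected undirected graph as (weight, v, w) triples."""
    parent = list(range(n))           # parent[v] == v means v is a representative
    rank = [0] * n

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    def union(a, b):                  # a and b must be representatives
        if rank[a] < rank[b]:
            a, b = b, a
        parent[b] = a                 # hang the shallower tree under the deeper one
        if rank[a] == rank[b]:
            rank[a] += 1

    X = []
    for weight, v, w in sorted(edges):
        rv, rw = find(v), find(w)
        if rv != rw:                  # different components: adding e creates no cycle
            X.append((weight, v, w))
            union(rv, rw)
    return X

edges = [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2)]
print(kruskal(4, edges))  # [(1, 0, 1), (2, 1, 2), (4, 2, 3)]
```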

Cut property
To prove the correctness, we first need to define a cut:

A cut of a graph is the set of edges that cross a partition of the vertices into two sets S and S̄. Later on, we will look at problems such as minimum/maximum cut, which partition a graph into two components.

The core of the proof is:

• Use induction, and assume that X ⊂ T for some MST T. Take S ⊂ V such that no edge of X crosses the cut (S, S̄), and let e∗ be a minimum-weight edge crossing that cut. The claim is that when we add e∗ to X, we still have X ∪ e∗ ⊂ T′ for some MST T′.

Proof outline
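So, we need to consider two cases: e∗ ∈ T or e∗ ∉ T.

• If e∗ ∈ T, our job is done, as there is nothing to show.

• If e∗ ∉ T, then we modify T in order to add the edge e∗ and construct a new MST T′. Adding e∗ to T creates a cycle, and that cycle must contain another edge e′ crossing the cut (S, S̄). We set T′ = T ∪ e∗ − e′. Since no edge of X crosses the cut, e′ ∉ X, so X ∪ e∗ ⊂ T′.

Next, we show that T′ is still a tree:

• T′ is connected (any path that used e′ can go around the rest of the cycle instead) and has n − 1 edges, and a connected graph on n vertices with n − 1 edges must be a tree.

• Since e∗ is a minimum-weight edge across the cut, w(e∗) ≤ w(e′), so w(T′) ≤ w(T). Actually, it turns out that w(T′) = w(T), otherwise it would contradict the fact that T is an MST.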

Prim’s algorithm

Prim's algorithm is an MST algorithm akin to Dijkstra's algorithm, and the cut property is used to prove its correctness.

Prim's algorithm selects a root vertex at the beginning and then grows the tree from vertex to adjacent vertex. Kruskal's algorithm, on the other hand, builds the minimum spanning tree starting from the smallest-weight edge.
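To make the comparison with Dijkstra's concrete, here is a sketch of Prim's algorithm using `heapq`. It assumes a connected undirected graph stored as a dict mapping each vertex to (neighbour, weight) pairs, with every edge listed in both directions. As in the Dijkstra sketch, stale heap entries are skipped rather than updated.

```python
import heapq

def prim(graph, root):
    """Return MST edges as (v, w, weight), grown outward from `root`."""
    in_tree = {root}
    tree_edges = []
    heap = [(weight, root, w) for w, weight in graph[root]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        weight, v, w = heapq.heappop(heap)
        if w in in_tree:
            continue                  # edge no longer crosses the cut (tree, rest)
        in_tree.add(w)                # lightest edge across the cut: safe by the cut property
        tree_edges.append((v, w, weight))
        for x, wx in graph[w]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, w, x))
    return tree_edges

g = {
    0: [(1, 1), (2, 3)],
    1: [(0, 1), (2, 2)],
    2: [(1, 2), (0, 3), (3, 4)],
    3: [(2, 4)],
}
print(prim(g, 0))  # [(0, 1, 1), (1, 2, 2), (2, 3, 4)]
```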
