Breadth First Search, Dijkstra's Algorithm For Shortest Paths
3.1
3.1.0.1 Overview
1. BFS is obtained from BasicSearch by processing edges using a data structure called a queue.
2. It processes the vertices in the graph in the order of their shortest distance from the vertex s (the start vertex).
As such:
(A) DFS is good for exploring graph structure.
(B) BFS is good for exploring distances.
3.1.0.2 Queues
A queue is a list of elements which supports the following operations:
(A) enqueue: adds an element to the end of the list
(B) dequeue: removes an element from the front of the list
Elements are extracted in first-in first-out (FIFO) order, i.e., elements are picked in the order in which they were inserted.
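The two queue operations can be illustrated with a minimal Python sketch. It assumes `collections.deque` as the underlying list; the helper name `fifo_order` is hypothetical and only demonstrates that elements leave in the order they entered.

```python
from collections import deque

def fifo_order(items):
    """Enqueue every item, then dequeue until empty; return the dequeue order."""
    q = deque()
    for x in items:
        q.append(x)              # enqueue: add at the end of the list
    out = []
    while q:
        out.append(q.popleft())  # dequeue: remove from the front of the list
    return out
```

Both `append` and `popleft` take constant time on a `deque`, which is what the O(n + m) bound for BFS relies on.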
3.1.0.3 BFS Algorithm
BFS(s)
    Mark all vertices as unvisited and for each v set dist(v) = $\infty$
    Initialize search tree T to be empty
    Mark vertex s as visited and set dist(s) = 0
    Set Q to be the empty queue
    enq(s)
    while Q is nonempty do
        u = deq(Q)
        for each vertex $v \in Adj(u)$ do
            if v is not visited do
                add edge (u, v) to T
                Mark v as visited, enq(v) and set dist(v) = dist(u) + 1
3.1.0.4 BFS: An Example in Undirected Graphs
Contents of the queue at each step, starting from vertex 1:
1. [1]
2. [2,3]
3. [3,4,5]
4. [4,5,7,8]
5. [5,7,8]
6. [7,8,6]
7. [8,6]
8. [6]
9. []
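The BFS(s) pseudocode can be sketched in Python as follows. This is a minimal sketch, assuming the graph is given as a dictionary of adjacency lists; vertices missing from `dist` are treated as unvisited with distance infinity. The graph `example_adj` is an assumption: its edges were chosen only to be consistent with the queue trace in the undirected example, since the original figure is not available.

```python
from collections import deque

def bfs(adj, s):
    """Return (dist, tree) for BFS from s; adj maps each vertex to its neighbor list."""
    dist = {s: 0}              # a vertex is "visited" exactly when it has a dist entry
    tree = []                  # edges of the search tree T
    q = deque([s])
    while q:
        u = q.popleft()        # u = deq(Q)
        for v in adj[u]:
            if v not in dist:  # v is not visited
                tree.append((u, v))
                dist[v] = dist[u] + 1
                q.append(v)    # enq(v)
    return dist, tree

# Hypothetical undirected graph on vertices 1..8 (edge set is an assumption).
example_adj = {
    1: [2, 3], 2: [1, 4, 5], 3: [1, 7, 8], 4: [2, 5],
    5: [2, 4, 6], 6: [5], 7: [3, 8], 8: [3, 7],
}
```

Calling `bfs(example_adj, 1)` visits the vertices in the order 1, 2, 3, 4, 5, 7, 8, 6, matching the queue contents listed in the example.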
3.1.0.7
Proposition 3.1.2 The following properties hold upon termination of BFS(s):
1. The search tree contains exactly the set of vertices in the connected component of s.
2. If dist(u) < dist(v) then u is visited before v.
3. For every vertex u, dist(u) is indeed the length of a shortest path from s to u.
4. If u, v are in the connected component of s and e = {u, v} is an edge of G, then either e is an edge in the search tree, or $|\mathrm{dist}(u) - \mathrm{dist}(v)| \le 1$.
Proof: Exercise.
3.1.0.8 Properties of BFS: Directed Graphs
Proposition 3.1.3 The following properties hold upon termination of BFS(s):
1. The search tree contains exactly the set of vertices reachable from s.
2. If dist(u) < dist(v) then u is visited before v.
3. For every vertex u, dist(u) is indeed the length of a shortest path from s to u.
4. If u is reachable from s and e = (u, v) is an edge of G, then either e is an edge in the search tree, or $\mathrm{dist}(v) - \mathrm{dist}(u) \le 1$. It is not necessarily the case that $\mathrm{dist}(u) - \mathrm{dist}(v) \le 1$.
Proof: Exercise.
3.1.0.9 BFS with Layers
BFSLayers(s):
    Mark all vertices as unvisited and initialize T to be empty
    Mark s as visited and set $L_0 = \{s\}$
    i = 0
    while $L_i$ is not empty do
        initialize $L_{i+1}$ to be an empty list
        for each u in $L_i$ do
            for each edge $(u, v) \in Adj(u)$ do
                if v is not visited
                    mark v as visited
                    add (u, v) to tree T
                    add v to $L_{i+1}$
        i = i + 1
Running time: O(n + m)
3.1.0.10 Example
3.1.0.11 BFS with Layers: Properties
Proposition 3.1.4 The following properties hold on termination of BFSLayers(s).
(A) BFSLayers(s) outputs a BFS tree.
(B) $L_i$ is the set of vertices at distance exactly i from s.
(C) If G is undirected, each edge e = {u, v} is one of three types:
    (A) tree edge between two consecutive layers
    (B) non-tree forward/backward edge between two consecutive layers
    (C) non-tree cross-edge with both u, v in the same layer
    (D) $\Longrightarrow$ Every edge in the graph is between two vertices that are either (i) in the same layer, or (ii) in two consecutive layers.
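A minimal Python sketch of BFSLayers, together with a check of the last property for undirected graphs, might look as follows. The function names and the adjacency-dictionary representation are assumptions made for illustration; the check assumes every vertex is reachable from s.

```python
def bfs_layers(adj, s):
    """Return the list of layers [L_0, L_1, ...] of BFS from s."""
    visited = {s}
    layers = [[s]]
    while layers[-1]:                 # while L_i is not empty
        nxt = []                      # L_{i+1}
        for u in layers[-1]:
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    nxt.append(v)
        layers.append(nxt)
    layers.pop()                      # drop the trailing empty layer
    return layers

def edges_span_at_most_one_layer(adj, layers):
    """True if every edge joins vertices in the same layer or in consecutive layers."""
    level = {v: i for i, layer in enumerate(layers) for v in layer}
    return all(abs(level[u] - level[v]) <= 1 for u in level for v in adj[u])
```

The second function only tests the property on a given graph; it is not a proof of the proposition.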
3.1.1
3.1.1.1
Proposition 3.1.5 The following properties hold on termination of BFSLayers(s), if G is directed. Each edge e = (u, v) is one of four types:
(A) a tree edge between consecutive layers, $u \in L_i$, $v \in L_{i+1}$ for some $i \ge 0$
(B) a non-tree forward edge between consecutive layers
(C) a non-tree backward edge
(D) a cross-edge with both u, v in the same layer
3.2
3.2.0.2
Definition 3.2.1 (Bipartite Graph) An undirected graph G = (V, E) is a bipartite graph if V can be partitioned into X and Y such that all edges in E are between X and Y.
3.2.0.3
Question: When is a graph bipartite?
Proposition 3.2.2 Every tree is a bipartite graph.
Proof: Root tree T at some node r. Let $L_i$ be all nodes at level i, that is, $L_i$ is all nodes at distance i from root r. Now define X to be all nodes at even levels and Y to be all nodes at odd levels. The only edges in T are between consecutive levels, so every edge has one endpoint in X and the other in Y.
Proposition 3.2.3 An odd length cycle is not bipartite.
3.2.0.4
Proposition 3.2.4 An odd length cycle is not bipartite.
Proof: Let $C = u_1, u_2, \ldots, u_{2k+1}, u_1$ be an odd cycle. Suppose C is a bipartite graph and let X, Y be the bipartition. Without loss of generality $u_1 \in X$. This implies $u_2 \in Y$, which implies $u_3 \in X$. Inductively, $u_i \in X$ if i is odd and $u_i \in Y$ if i is even. But $\{u_1, u_{2k+1}\}$ is an edge and both endpoints belong to X!
3.2.0.5 Subgraphs
Definition 3.2.5 Given a graph G = (V, E), a subgraph of G is another graph $H = (V', E')$ where $V' \subseteq V$ and $E' \subseteq E$.
Proposition 3.2.6 If G is bipartite then any subgraph H of G is also bipartite.
Proposition 3.2.7 A graph G is not bipartite if G has an odd cycle C as a subgraph.
Proof: If G is bipartite then since C is a subgraph, C is also bipartite (by the above proposition). However, C is not bipartite!
3.2.0.6 Bipartite Graph Characterization
Theorem 3.2.8 A graph G is bipartite if and only if it has no odd length cycle as a subgraph.
Proof:
Only If: G has an odd cycle implies G is not bipartite.
If: G has no odd length cycle. Assume without loss of generality that G is connected.
(A) Pick u arbitrarily and do BFS(u).
(B) $X = \bigcup_{i \text{ even}} L_i$ and $Y = \bigcup_{i \text{ odd}} L_i$.
(C) Claim: X and Y form a valid bipartition if G has no odd length cycle.
3.2.0.7 Proof of Claim
Claim 3.2.9 In BFS(u) if $a, b \in L_i$ and (a, b) is an edge then there is an odd length cycle containing (a, b).
Proof: Let v be the least common ancestor of a, b in the BFS tree T. v is in some level j < i (it could be u itself). The path from v to a in T has length $i - j$. The path from v to b in T has length $i - j$. These two paths plus (a, b) form an odd cycle of length $2(i - j) + 1$.
Corollary 3.2.10 There is an O(n+m) time algorithm to check if G is bipartite and output an odd cycle if it is not.
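The algorithm behind the corollary can be sketched in Python as below. This is a minimal sketch under stated assumptions: the graph is connected, undirected, has no self-loops, and is given as an adjacency dictionary; the function name and the return format are hypothetical. It runs BFS, looks for an edge inside a layer, and if one exists climbs to the least common ancestor as in the proof of the claim.

```python
from collections import deque

def bipartition_or_odd_cycle(adj, s):
    """Return ("bipartite", X, Y) or ("odd_cycle", list_of_cycle_vertices)."""
    level = {s: 0}
    parent = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                parent[v] = u
                q.append(v)
    for a in adj:
        for b in adj[a]:
            if level[a] == level[b]:
                # Climb from a and b in lockstep until the paths meet at the LCA.
                pa, pb = [a], [b]
                while pa[-1] != pb[-1]:
                    pa.append(parent[pa[-1]])
                    pb.append(parent[pb[-1]])
                # a -> ... -> lca -> ... -> b; the edge (b, a) closes the cycle.
                return ("odd_cycle", pa + pb[-2::-1])
    X = {v for v in level if level[v] % 2 == 0}
    Y = {v for v in level if level[v] % 2 == 1}
    return ("bipartite", X, Y)
```

The BFS and the edge scan take O(n + m) time, and the single climb to the common ancestor adds at most O(n).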
3.3
3.3.0.8 Shortest Path Problems
Input: A (undirected or directed) graph G = (V, E) with edge lengths (or costs). For edge e = (u, v), $\ell(e) = \ell(u, v)$ is its length.
(A) Given nodes s, t find a shortest path from s to t.
(B) Given node s find shortest paths from s to all other nodes.
(C) Find shortest paths for all pairs of nodes.
Many applications!
3.3.0.9 Single-Source Shortest Paths: Non-Negative Edge Lengths
Single-Source Shortest Path Problems
Input: A (undirected or directed) graph G = (V, E) with non-negative edge lengths. For edge e = (u, v), $\ell(e) = \ell(u, v)$ is its length.
(A) Given nodes s, t find a shortest path from s to t.
(B) Given node s find shortest paths from s to all other nodes.
(A) Restrict attention to directed graphs.
(B) The undirected graph problem can be reduced to the directed graph problem. How?
    (A) Given an undirected graph G, create a new directed graph G' by replacing each edge {u, v} in G by (u, v) and (v, u) in G'.
    (B) Set $\ell(u, v) = \ell(v, u) = \ell(\{u, v\})$.
    (C) Exercise: show that the reduction works.
3.3.0.10 Single-Source Shortest Paths via BFS
Special case: All edge lengths are 1.
(A) Run BFS(s) to get shortest path distances from s to all other nodes.
(B) O(m + n) time algorithm.
Special case: Suppose $\ell(e)$ is an integer for all e. Can we use BFS? Reduce to the unit edge-length problem by placing $\ell(e) - 1$ dummy nodes on e.
Let $L = \max_e \ell(e)$. The new graph has O(mL) edges and O(mL + n) nodes. BFS takes O(mL + n) time. Not efficient if L is large.
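The dummy-node reduction can be sketched in Python as follows. This is a minimal sketch, assuming vertices are numbered 0..n-1, edges are directed triples (u, v, length), and all lengths are positive integers; the function name and the numbering of dummy nodes are choices made here for illustration.

```python
from collections import deque

def shortest_by_subdivision(n, edges, s):
    """Subdivide each edge into unit edges, run BFS, return distances of real nodes."""
    adj = {v: [] for v in range(n)}
    next_id = n                          # ids n, n+1, ... are dummy nodes
    for u, v, length in edges:
        prev = u
        for _ in range(length - 1):      # place length - 1 dummy nodes on the edge
            adj[next_id] = []
            adj[prev].append(next_id)
            prev = next_id
            next_id += 1
        adj[prev].append(v)
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return {v: d for v, d in dist.items() if v < n}   # drop dummy nodes
```

The subdivided graph has one node per unit of edge length, which is why the running time grows with L.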
3.3.0.11 Towards an algorithm
Why does BFS work? BFS(s) explores nodes in increasing distance from s
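Lemma 3.3.1 Let G be a directed graph with non-negative edge lengths. Let dist(s, v) denote the shortest path length from s to v. If $s = v_0 \to v_1 \to v_2 \to \ldots \to v_k$ is a shortest path from s to $v_k$ then for $1 \le i < k$:
(A) $s = v_0 \to v_1 \to v_2 \to \ldots \to v_i$ is a shortest path from s to $v_i$
(B) $\mathrm{dist}(s, v_i) \le \mathrm{dist}(s, v_k)$.
Proof: Suppose (A) does not hold. Then for some i < k there is a path P from s to $v_i$ of length strictly less than that of $s = v_0 \to v_1 \to \ldots \to v_i$. Then P concatenated with $v_i \to v_{i+1} \to \ldots \to v_k$ contains a strictly shorter path to $v_k$ than $s = v_0 \to v_1 \to \ldots \to v_k$, a contradiction. Part (B) follows from (A) because the remaining edges $v_i \to \ldots \to v_k$ have non-negative lengths.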
3.3.0.12 A proof by picture
[Figure: a shortest path from $s = v_0$ to $v_6$; image not recovered.]
3.3.0.13 A Basic Strategy
Explore vertices in increasing order of distance from s. (For simplicity assume that nodes are at different distances from s and that no edge has zero length.)
    Initialize for each node v, dist(s, v) = $\infty$
    Initialize S = $\emptyset$
    for i = 1 to |V| do
        (* Invariant: S contains the i - 1 closest nodes to s *)
        Among nodes in $V \setminus S$, find the node v that is the ith closest to s
        Update dist(s, v)
        $S = S \cup \{v\}$
How can we implement the step in the for loop?
3.3.0.14 Finding the ith closest node
(A) S contains the i - 1 closest nodes to s.
(B) Want to find the ith closest node from $V \setminus S$.
What do we know about the ith closest node?
Claim 3.3.2 Let P be a shortest path from s to v where v is the ith closest node. Then, all intermediate nodes in P belong to S.
Proof: If P had an intermediate node u not in S then u would be closer to s than v. This implies v is not the ith closest node to s; recall that S already has the i - 1 closest nodes.
3.3.1
3.3.1.1
[Figure: example weighted graph with source a at distance 0; image not recovered.]
3.3.1.2
3.3.1.3
(A) S contains the i - 1 closest nodes to s.
(B) Want to find the ith closest node from $V \setminus S$.
(A) For each $u \in V \setminus S$ let P(s, u, S) be a shortest path from s to u using only nodes in S as intermediate vertices.
(B) Let $d'(s, u)$ be the length of P(s, u, S).
Observations: for each $u \in V \setminus S$,
(A) $\mathrm{dist}(s, u) \le d'(s, u)$ since we are constraining the paths
(B) $d'(s, u) = \min_{a \in S} (\mathrm{dist}(s, a) + \ell(a, u))$. Why?
Lemma 3.3.4 If v is the ith closest node to s, then $d'(s, v) = \mathrm{dist}(s, v)$.
3.3.1.4 Finding the ith closest node
Lemma 3.3.5 If v is an ith closest node to s, then $d'(s, v) = \mathrm{dist}(s, v)$.
Proof: Let v be the ith closest node to s. Then there is a shortest path P from s to v that contains only nodes in S as intermediate nodes (see the previous claim). Therefore $d'(s, v) = \mathrm{dist}(s, v)$.
3.3.1.5 Finding the ith closest node
Lemma 3.3.6 If v is an ith closest node to s, then $d'(s, v) = \mathrm{dist}(s, v)$.
Corollary 3.3.7 The ith closest node to s is the node $v \in V \setminus S$ such that $d'(s, v) = \min_{u \in V \setminus S} d'(s, u)$.
Proof: For every node $u \in V \setminus S$, $\mathrm{dist}(s, u) \le d'(s, u)$ and for the ith closest node v, $\mathrm{dist}(s, v) = d'(s, v)$. Moreover, $\mathrm{dist}(s, u) \ge \mathrm{dist}(s, v)$ for each $u \in V \setminus S$.
3.3.1.6 Algorithm
    Initialize for each node v: dist(s, v) = $\infty$
    Initialize S = $\emptyset$, $d'(s, s) = 0$
    for i = 1 to |V| do
        (* Invariant: S contains the i - 1 closest nodes to s *)
        (* Invariant: $d'(s, u)$ is the shortest path distance from s to u using only S as intermediate nodes *)
        Let v be such that $d'(s, v) = \min_{u \in V \setminus S} d'(s, u)$
        $\mathrm{dist}(s, v) = d'(s, v)$
        $S = S \cup \{v\}$
        for each node u in $V \setminus S$
            compute $d'(s, u) = \min_{a \in S} (\mathrm{dist}(s, a) + \ell(a, u))$
Correctness: By induction on i using the previous lemmas.
Running time: $O(n \cdot (n + m))$ time.
(A) n outer iterations. In each iteration, compute $d'(s, u)$ for each u by scanning all edges out of nodes in S; O(m + n) time per iteration.
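The algorithm above, which recomputes every d'(s, u) from scratch in each iteration, can be sketched in Python as follows. This is a minimal sketch, assuming vertices 0..n-1 and a dictionary `length` mapping directed edges (a, u) to non-negative lengths; the function name is hypothetical, and the dictionary `d` plays the role of d'(s, ·).

```python
import math

def shortest_paths_naive(n, length, s):
    """O(n (n + m)) version: rebuild d' by scanning all edges leaving S each round."""
    dist = {}
    S = set()
    d = {v: math.inf for v in range(n)}
    d[s] = 0
    for _ in range(n):
        # v minimizes d'(s, u) over u in V \ S
        v = min((u for u in range(n) if u not in S), key=lambda u: d[u])
        dist[v] = d[v]
        S.add(v)
        # recompute d'(s, u) for every u outside S in one pass over the edges
        d = {u: math.inf for u in range(n) if u not in S}
        for (a, b), w in length.items():
            if a in S and b not in S:
                d[b] = min(d[b], dist[a] + w)
    return dist
```

Nodes that are not reachable from s end up with distance `math.inf`.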
3.3.1.7 3.3.1.8
(A) Main work is to compute the $d'(s, u)$ values in each iteration.
(B) $d'(s, u)$ changes from iteration i to i + 1 only because of the node v that is added to S in iteration i.
    Initialize for each node v, $\mathrm{dist}(s, v) = d'(s, v) = \infty$
    Initialize S = $\emptyset$, $d'(s, s) = 0$
    for i = 1 to |V| do
        // S contains the i - 1 closest nodes to s,
        // and the values of $d'(s, u)$ are current
        Let v be such that $d'(s, v) = \min_{u \in V \setminus S} d'(s, u)$
        $\mathrm{dist}(s, v) = d'(s, v)$
        $S = S \cup \{v\}$
        Update $d'(s, u)$ for each u in $V \setminus S$ as follows:
            $d'(s, u) = \min(d'(s, u), \mathrm{dist}(s, v) + \ell(v, u))$
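The incremental update can be sketched in Python as follows. This is a minimal sketch, assuming the graph is an adjacency dictionary `adj` in which every vertex is a key and each entry is a list of (neighbor, length) pairs with non-negative lengths; the function name is hypothetical, and `d` again stands for d'(s, ·).

```python
import math

def dijkstra_array(adj, s):
    """Array-based version: O(n) scan to pick v, then update only v's out-edges."""
    d = {v: math.inf for v in adj}
    d[s] = 0
    dist = {}
    S = set()
    for _ in range(len(adj)):
        v = min((u for u in adj if u not in S), key=d.__getitem__)  # O(n) scan
        dist[v] = d[v]
        S.add(v)
        for u, w in adj[v]:              # only edges out of v can change d'
            if u not in S:
                d[u] = min(d[u], dist[v] + w)
    return dist
```

Each vertex enters S once, so the inner loop touches every edge once in total, while the minimum search costs O(n) per iteration.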
Running time: $O(m + n^2)$ time.
(A) n outer iterations, and in each iteration the following steps:
(B) updating $d'(s, u)$ after v is added takes O(deg(v)) time, so the total work is O(m) since a node enters S only once
(C) finding v from the $d'(s, u)$ values takes O(n) time
3.3.1.9 Dijkstra's Algorithm
(A) Eliminate $d'(s, u)$ and let dist(s, u) maintain it.
(B) Update dist values after adding v by scanning edges out of v.
    Initialize for each node v, dist(s, v) = $\infty$
    Initialize S = $\emptyset$, dist(s, s) = 0
    for i = 1 to |V| do
        Let v be such that $\mathrm{dist}(s, v) = \min_{u \in V \setminus S} \mathrm{dist}(s, u)$
        $S = S \cup \{v\}$
        for each u in Adj(v) do
            $\mathrm{dist}(s, u) = \min(\mathrm{dist}(s, u), \mathrm{dist}(s, v) + \ell(v, u))$
Priority queues can maintain the dist values for a faster running time:
(A) Using heaps and standard priority queues: O((m + n) log n)
(B) Using Fibonacci heaps: O(m + n log n)
3.3.2 Priority Queues
3.3.2.1 Priority Queues
Data structure to store a set S of n elements where each element $v \in S$ has an associated real/integer key k(v), such that the following operations can be performed in O(log n) time each:
(A) makeQ: create an empty queue
(B) findMin: find the minimum key in S
(C) extractMin: remove $v \in S$ with smallest key and return it
(D) add(v, k(v)): add new element v with key k(v) to S
(E) delete(v): remove element v from S
(F) decreaseKey(v, k'(v)): decrease key of v from k(v) (current key) to k'(v) (new key). Assumption: $k'(v) \le k(v)$
(G) meld: merge two separate priority queues into one
decreaseKey can be implemented via delete and add.
3.3.2.2 Dijkstra's Algorithm using Priority Queues
    Q = makePQ()
    insert(Q, (s, 0))
    for each node $u \ne s$ do
        insert(Q, (u, $\infty$))
    S = $\emptyset$
    for i = 1 to |V| do
        (v, dist(s, v)) = extractMin(Q)
        $S = S \cup \{v\}$
        for each u in Adj(v) do
            decreaseKey(Q, (u, $\min(\mathrm{dist}(s, u), \mathrm{dist}(s, v) + \ell(v, u))$))
Priority Queue operations:
(A) O(n) insert operations
(B) O(n) extractMin operations
(C) O(m) decreaseKey operations
3.3.2.3 Implementing Priority Queues via Heaps
Using Heaps
Store elements in a heap based on the key value.
(A) All operations can be done in O(log n) time.
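A heap-based version can be sketched in Python with the standard `heapq` module. One caveat, stated as an assumption of this sketch: `heapq` has no decreaseKey operation, so instead of decreasing a key the code pushes a new (distance, vertex) entry and skips stale entries when they are popped. This is a substitute for the decreaseKey-based pseudocode, not the same procedure; the heap can hold up to O(m) entries, which still gives O(log n) per operation. The function name and the adjacency format (vertex to list of (neighbor, length) pairs, every vertex a key, vertices mutually comparable) are also assumptions.

```python
import heapq
import math

def dijkstra_heap(adj, s):
    """Dijkstra with a binary heap, using lazy deletion in place of decreaseKey."""
    dist = {v: math.inf for v in adj}
    dist[s] = 0
    done = set()                       # the set S of finalized vertices
    heap = [(0, s)]
    while heap:
        d, v = heapq.heappop(heap)     # extractMin
        if v in done:
            continue                   # stale entry left over from an earlier push
        done.add(v)
        for u, w in adj[v]:
            if d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(heap, (dist[u], u))   # stands in for decreaseKey
    return dist
```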
Dijkstra's algorithm can be implemented in O((n + m) log n) time.
3.3.2.4 Priority Queues: Fibonacci Heaps/Relaxed Heaps
Fibonacci Heaps
(A) extractMin, add, delete, meld in O(log n) time
(B) decreaseKey in O(1) amortized time: $\ell$ decreaseKey operations for $\ell \ge n$ take together $O(\ell)$ time
(C) Relaxed Heaps: decreaseKey in O(1) worst case time but at the expense of meld (not necessary for Dijkstra's algorithm)
Dijkstra's algorithm can be implemented in O(n log n + m) time. If $m = \Omega(n \log n)$, the running time is linear in the input size.