Unit 4 - Data Structure
Tech
Subject Name: Data Structure
Subject Code: CS-303
Semester: 3rd
Downloaded from be.rgpvnotes.in
Unit 4
Syllabus:
Graphs: Introduction, Classification of graph: Directed and Undirected graphs, etc, Representation, Graph
Traversal: Depth First Search (DFS), Breadth First Search (BFS), Graph algorithm: Minimum Spanning Tree
(MST)- Kruskal, Prim's algorithms. Dijkstra's shortest path algorithm; Comparison between different graph
algorithms, Application of graphs.
Graphs:
A graph is a mathematical abstraction used to represent "connectivity information". A graph consists of
vertices and edges that connect them.
A graph G = (V, E) is:
a set of vertices V and a set of edges E = { (u, v): u and v are vertices }.
Classification of graph:
Directed and Undirected graphs:
A graph is a data structure that consists of the following two components:
1. A finite set of vertices, also called nodes.
2. A finite set of pairs of the form (u, v), called edges.
In a directed graph (di-graph) the pair is ordered, because (u, v) is not the same as (v, u): the pair (u, v)
indicates that there is an edge from vertex u to vertex v. In an undirected graph the pair is unordered, so
(u, v) and (v, u) denote the same edge. The edges may carry a weight/value/cost.
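As a rough illustration of the two components above, the sketch below stores a graph as an adjacency list in Python. The helper name add_edge, the dictionary layout and the sample vertices are assumptions made for this example, not part of the notes.

```python
from collections import defaultdict

def add_edge(graph, u, v, weight=1, directed=False):
    """Store edge (u, v); for an undirected graph also store (v, u)."""
    graph[u][v] = weight
    if directed:
        graph[v]  # touch v so it is recorded as a vertex even without outgoing edges
    else:
        graph[v][u] = weight

# Hypothetical sample data: one weighted edge between A and B
undirected = defaultdict(dict)
add_edge(undirected, "A", "B", weight=4)

directed = defaultdict(dict)
add_edge(directed, "A", "B", weight=4, directed=True)

print("B" in undirected["A"], "A" in undirected["B"])  # True True
print("B" in directed["A"], "A" in directed["B"])      # True False
```

In the directed case only (A, B) is stored, which mirrors the point that (u, v) is not the same as (v, u).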
Graphs are used to represent many real-life applications, most commonly networks. The
networks may include paths in a city, a telephone network or a circuit network. Graphs are also used in social
networks like LinkedIn and Facebook. For example, in Facebook, each person is represented with a vertex (or
node). Each node is a structure and contains information like person id, name, gender and locale.
Following is an example of an undirected graph with 5 vertices.
Graph Traversal:
Depth First Search (DFS):
Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to remember
which vertex to resume the search from when a dead end occurs in any iteration.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it in a stack.
Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack. (It will pop up all the vertices
from the stack that have no unvisited adjacent vertices.)
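The two rules above can be sketched in Python as follows. This is a minimal illustration, assuming the graph is given as a dictionary of adjacency lists; the function name dfs and the sample graph are hypothetical.

```python
def dfs(graph, start):
    """Iterative DFS; returns the vertices in the order they are visited."""
    visited = {start}
    order = [start]   # Rule 1 applied to the start vertex: visit and mark it
    stack = [start]   # ... and push it onto the stack
    while stack:
        top = stack[-1]
        for neighbour in graph[top]:
            if neighbour not in visited:
                # Rule 1: visit an adjacent unvisited vertex, mark it, push it
                visited.add(neighbour)
                order.append(neighbour)
                stack.append(neighbour)
                break
        else:
            # Rule 2: no unvisited neighbour is left, so pop the vertex
            stack.pop()
    return order

# Hypothetical undirected graph, neighbours listed in alphabetical order
graph = {
    "S": ["A", "B", "C"],
    "A": ["D", "S"],
    "B": ["D", "S"],
    "C": ["D", "S"],
    "D": ["A", "B", "C"],
}
print(dfs(graph, "S"))  # ['S', 'A', 'D', 'B', 'C']
```

The visiting order depends on the order in which neighbours are listed, so a different adjacency ordering may give a different but equally valid traversal.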
Breadth First Search (BFS):
Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a queue to
remember which vertex to continue the search from when a dead end occurs in any iteration.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a queue.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
Step 2: We start from visiting S (starting node), and mark it as visited.
Step 7: From A we have D as unvisited adjacent node. We mark it as visited and enqueue it.
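A minimal Python sketch of the queue-based traversal described above. The function name bfs and the sample graph are assumptions. The graph is chosen only so that S is the starting node and D is reached from A, as in the steps quoted above.

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first traversal; returns the vertices in visiting order."""
    visited = {start}
    order = [start]
    queue = deque([start])
    while queue:
        front = queue[0]
        for neighbour in graph[front]:
            if neighbour not in visited:
                # Visit each unvisited adjacent vertex, mark it and enqueue it
                visited.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
        # No unvisited adjacent vertex remains: remove the first vertex
        queue.popleft()
    return order

# Hypothetical undirected graph, neighbours listed in alphabetical order
graph = {
    "S": ["A", "B", "C"],
    "A": ["D", "S"],
    "B": ["D", "S"],
    "C": ["D", "S"],
    "D": ["A", "B", "C"],
}
print(bfs(graph, "S"))  # ['S', 'A', 'B', 'C', 'D']
```

Compared with DFS on the same graph, all neighbours of S are visited before D, because the queue serves vertices in the order they were discovered.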
Graph algorithm:
A spanning tree is a subgraph of a graph G that covers all the vertices with the minimum possible number of
edges (n − 1 edges for n vertices). Hence, a spanning tree does not have cycles and it cannot be disconnected.
By this definition, we can draw a conclusion that every connected and undirected graph G has at least one
spanning tree. A disconnected graph does not have any spanning tree, as no tree can reach all of its
vertices.
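Kruskal's Algorithm:
Kruskal's algorithm builds a minimum cost spanning tree by considering the edges in increasing order of weight.
Remove all loops and parallel edges from the given graph. In case of parallel edges, keep the one which has the least cost associated and remove all others.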
Now we start adding edges to the graph, beginning with the one that has the least weight. Throughout, we
keep checking that the spanning tree properties remain intact. If adding an edge would violate the spanning
tree property, for example by creating a circuit, we do not include that edge in the graph.
The least cost is 2 and edges involved are B,D and D,T. We add them. Adding them does not violate spanning
tree properties, so we continue to our next edge selection.
Next cost is 3, and associated edges are A,C and C,D. We add them again.
Next cost in the table is 4, and we observe that adding it will create a circuit in the graph.
We ignore it. In the process we shall ignore/avoid all edges that create a circuit.
We observe that edges with cost 5 and 6 also create circuits. We ignore them and move on.
Now we are left with only one node to be added. Between the two least cost edges available 7 and 8, we shall
add the edge with cost 7.
By adding edge S, A we have included all the nodes of the graph and we now have minimum cost spanning
tree.
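The walkthrough above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the circuit check uses a union-find structure, which is one common choice and is not named in the notes. The edges B,D and D,T (cost 2), A,C and C,D (cost 3) and S,A (cost 7) come from the text. The endpoints of the edges with cost 4, 5, 6 and 8 are not given in the text and are assumed here.

```python
def kruskal(vertices, edges):
    """Return (total_cost, chosen_edges) of a minimum cost spanning tree."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # shorten the path as we go
            v = parent[v]
        return v

    total, chosen = 0, []
    for cost, u, v in sorted(edges):       # least weight first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:               # different components: no circuit
            parent[root_u] = root_v
            total += cost
            chosen.append((u, v, cost))
    return total, chosen

vertices = ["S", "A", "B", "C", "D", "T"]
edges = [
    # (cost, u, v) taken from the walkthrough
    (2, "B", "D"), (2, "D", "T"), (3, "A", "C"), (3, "C", "D"), (7, "S", "A"),
    # Assumed endpoints: the text gives only the costs 4, 5, 6 and 8
    (4, "B", "C"), (5, "B", "T"), (6, "A", "B"), (8, "S", "C"),
]
total, chosen = kruskal(vertices, edges)
print(total)   # 17
print(chosen)
```

With these assumed endpoints, the edges of cost 4, 5 and 6 are rejected because they would create a circuit, and the edge S,A of cost 7 completes the tree, as in the walkthrough.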
Prim’s Algorithm:
Prim's algorithm, like Kruskal's algorithm, finds a minimum cost spanning tree using the greedy approach. Prim's
algorithm shares a similarity with the shortest path first algorithms.
Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes as a single tree and keeps on adding
new nodes to the spanning tree from the given graph.
To contrast with Kruskal's algorithm and to understand Prim's algorithm better, we shall use the same
example.
Remove all loops and parallel edges from the given graph. In case of parallel edges, keep the one which has
the least cost associated and remove all others.
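We take S as the root node. The edges going out from S have costs 7 and 8, so we select the least cost edge, S-7-A.
Now, the tree S-7-A is treated as one node and we check for all edges going out from it. We select the one
which has the lowest cost and include it in the tree.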
After this step, S-7-A-3-C tree is formed. Now we'll again treat it as a node and will check all the edges again.
However, we will choose only the least cost edge. In this case, C-3-D is the new edge, which is less than other
edges' cost 8, 6, 4, etc.
After adding node D to the spanning tree, we now have two edges going out of it having the same cost, i.e.
D-2-T and D-2-B. Thus, we can add either one. But the next step will again yield edge 2 as the least cost. Hence,
we are showing a spanning tree with both edges included.
We may find that the output spanning tree of the same graph using the two different algorithms is the same.
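A minimal Python sketch of the tree-growing procedure described above, using a priority queue (heapq) to pick the least cost outgoing edge. The priority queue is an implementation choice, not something the notes prescribe. The function name prim is hypothetical. The edges S-A (7), A-C (3), C-D (3), D-B (2) and D-T (2) come from the text, and the endpoints of the edges with cost 4, 5, 6 and 8 are assumed.

```python
import heapq
from collections import defaultdict

def prim(graph, root):
    """Grow a spanning tree from root; return (total_cost, chosen_edges)."""
    in_tree = {root}
    frontier = [(cost, root, v) for v, cost in graph[root].items()]
    heapq.heapify(frontier)
    total, chosen = 0, []
    while frontier and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(frontier)   # least cost edge leaving the tree
        if v in in_tree:
            continue                           # both ends already in the tree
        in_tree.add(v)
        total += cost
        chosen.append((u, v, cost))
        for w, c in graph[v].items():
            if w not in in_tree:
                heapq.heappush(frontier, (c, v, w))
    return total, chosen

edges = [
    # (u, v, cost) taken from the walkthrough
    ("S", "A", 7), ("A", "C", 3), ("C", "D", 3), ("D", "B", 2), ("D", "T", 2),
    # Assumed endpoints: the text gives only the costs 4, 5, 6 and 8
    ("B", "C", 4), ("B", "T", 5), ("A", "B", 6), ("S", "C", 8),
]
graph = defaultdict(dict)
for u, v, cost in edges:
    graph[u][v] = cost
    graph[v][u] = cost

total, chosen = prim(graph, "S")
print(total)   # 17
print(chosen)  # [('S', 'A', 7), ('A', 'C', 3), ('C', 'D', 3), ('D', 'B', 2), ('D', 'T', 2)]
```

The edges are selected in the same order as in the text: S-7-A, then A-3-C, then C-3-D, then the two edges of cost 2 out of D.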
Dijkstra’s Shortest Path Algorithm:
Dijkstra's algorithm can be used to determine the shortest path from one node in a graph to every other
node within the same graph data structure, provided that the nodes are reachable from the starting node.
This algorithm will continue to run until all of the reachable vertices in a graph have been visited, which means
that we could run Dijkstra's algorithm, find the shortest path between any two reachable nodes, and then
save the results somewhere. Once we run Dijkstra's algorithm just once, we can look up our results from our
algorithm again and again, without having to actually run the algorithm itself.
The abstracted rules to solve the algorithm are as follows:
1. Every time that we set out to visit a new node, we will choose the node with the smallest known
distance/cost to visit first.
2. Once we've moved to the node we're going to visit, we will check each of its neighboring nodes.
3. For each neighboring node, we'll calculate the distance/cost for the neighboring nodes by summing the
cost of the edges that lead to the node we're checking from the starting vertex.
4. Finally, if the distance/cost to a node is less than a known distance, we'll update the shortest distance
that we have on file for that vertex.
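The four rules above can be sketched in Python as follows. This is a minimal illustration, assuming non-negative edge weights and a graph stored as a dictionary of dictionaries. The function name dijkstra and the small sample graph are hypothetical and are not the example from the notes.

```python
import heapq

def dijkstra(graph, source):
    """Return the shortest known distance from source to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)       # rule 1: smallest known distance first
        if u in done:
            continue                     # outdated heap entry, already settled
        done.add(u)
        for v, w in graph[u].items():    # rule 2: check each neighbouring node
            new_d = d + w                # rule 3: sum the edge costs from the start
            if new_d < dist.get(v, float("inf")):
                dist[v] = new_d          # rule 4: keep the shorter distance
                heapq.heappush(heap, (new_d, v))
    return dist

# Hypothetical directed graph with non-negative weights
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 5},
    "C": {"B": 2, "D": 8},
    "D": {},
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 3, 'C': 1, 'D': 8}
```

The returned dictionary is the saved result mentioned above: once it has been computed for a source, distances to any reachable node can be looked up without running the algorithm again.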
Example: Find the shortest path in the following multistage graph-
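[Figure: multistage graph with source S in stage 1, vertices A, B, C in stage 2, vertices D, E in stage 3 and target T in stage 4. Edge costs as used in the computations below: S-A 1, S-B 2, S-C 7, A-D 3, A-E 6, B-D 4, B-E 10, C-E 3, C-T 10, D-T 8, E-T 2.]
Figure 4.7: Prim's Algorithm Example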
There is a single vertex in stage 1, then 3 vertices in stage 2, then 2 vertices in stage 3 and only one vertex
in stage 4 (this is a target stage).
Backward approach
d(S,T)=min{1+d(A,T), 2+d(B,T), 7+d(C,T)} … (1)
We will compute d(A,T), d(B,T) and d(C,T).
d(A,T)=min{3+d(D,T), 6+d(E,T)} … (2)
d(B,T)=min{4+d(D,T), 10+d(E,T)} … (3)
d(C,T)=min{3+d(E,T), 10} … (4) (10 is the cost of the direct edge C-T)
Now let us compute d(D,T) and d(E,T).
d(D,T)=8
d(E,T)=2 backward vertex=E
Let us put these values in equations (2), (3) and (4)
d(A,T)=min{3+8, 6+2}
d(A,T)=8 A-E-T
d(B,T)=min{4+8,10+2}
d(B,T)=12 B-D-T
d(C,T)=min(3+2,10)
d(C,T)=5 C-E-T
d(S,T)=min{1+d(A,T), 2+d(B,T), 7+d(C,T)}
=min{1+8, 2+12,7+5}
=min{9,14,12}
d(S,T)=9 S-A-E-T
The path with minimum cost is S-A-E-T with the cost 9.
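The backward computation above can be checked with a short Python sketch. The edge costs in COST are read off the worked equations (for example, d(A,T)=min{3+8, 6+2} gives A-D 3 and A-E 6). The dictionary layout and the function name d are assumptions made for this illustration.

```python
from functools import lru_cache

# Edge costs recovered from the worked equations in the notes
COST = {
    "S": {"A": 1, "B": 2, "C": 7},
    "A": {"D": 3, "E": 6},
    "B": {"D": 4, "E": 10},
    "C": {"E": 3, "T": 10},
    "D": {"T": 8},
    "E": {"T": 2},
    "T": {},
}

@lru_cache(maxsize=None)
def d(u):
    """Return (cost, path) of the cheapest route from u to the target T."""
    if u == "T":
        return 0, ("T",)
    # Try every edge u -> v and keep the smallest total cost
    cost, nxt = min((w + d(v)[0], v) for v, w in COST[u].items())
    return cost, (u,) + d(nxt)[1]

print(d("A"))  # (8, ('A', 'E', 'T'))
print(d("C"))  # (5, ('C', 'E', 'T'))
print(d("S"))  # (9, ('S', 'A', 'E', 'T'))
```

For B both routes cost 12, so the sketch simply reports the first one it finds in alphabetical order of the next vertex, B-D-T.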
Forward approach
d(S,A)=1
d(S,B)=2
d(S,C)=7
d(S,D)=min{1+d(A,D),2+d(B,D)}
=min{1+3,2+4}
d(S,D)=4
d(S,E)=min{1+d(A,E), 2+d(B,E),7+d(C,E)}
=min {1+6,2+10,7+3}
=min {7,12,10}
d(S,E)=7 i.e. Path S-A-E is chosen.
d(S,T)=min{d(S,D)+d(D,T),d(S,E)+d(E,T),d(S,C)+d(C,T)}
=min {4+8,7+2,7+10}
=min {12,9,17}
d(S,T)=9 i.e. Path S-A-E, E-T is chosen, giving S-A-E-T.
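A minimal Python sketch of the forward computation, processing the vertices stage by stage from S. As before, the edge costs are read off the worked equations. The names forward_costs, COST and ORDER are hypothetical.

```python
# Edge costs recovered from the worked equations in the notes
COST = {
    "S": {"A": 1, "B": 2, "C": 7},
    "A": {"D": 3, "E": 6},
    "B": {"D": 4, "E": 10},
    "C": {"E": 3, "T": 10},
    "D": {"T": 8},
    "E": {"T": 2},
    "T": {},
}
ORDER = ["S", "A", "B", "C", "D", "E", "T"]  # vertices listed stage by stage

def forward_costs(cost, order, source="S"):
    """Compute d(S, v) for every vertex v, plus the predecessor on the best path."""
    dist = {source: 0}
    parent = {}
    for u in order:
        for v, w in cost[u].items():
            if dist[u] + w < dist.get(v, float("inf")):
                dist[v] = dist[u] + w   # cheaper way to reach v found via u
                parent[v] = u
    return dist, parent

dist, parent = forward_costs(COST, ORDER)

# Walk the predecessors back from T to recover the chosen path
path = ["T"]
while path[-1] != "S":
    path.append(parent[path[-1]])
path.reverse()

print(dist["D"], dist["E"], dist["T"])  # 4 7 9
print(path)                             # ['S', 'A', 'E', 'T']
```

Both approaches give the same minimum cost of 9 along S-A-E-T. They differ only in whether distances are accumulated from the source or towards the target.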
Application of graphs:
Since they are powerful abstractions, graphs can be very important in modeling data. In fact, many
problems can be reduced to known graph problems. Here we outline just some of the many applications of
graphs.
1. Social network graphs: To tweet or not to tweet. Graphs that represent who knows whom, who
communicates with whom, who influences whom or other relationships in social structures. An example is
the Twitter graph of who follows whom. These can be used to determine how information flows, how
topics become hot, how communities develop, or even who might be a good match for who, or is that
whom.
2. Transportation networks: In road networks vertices are intersections and edges are the road segments
between them, and for public transportation networks vertices are stops and edges are the links between
them. Such networks are used by many map programs such as Google Maps, Bing Maps and now Apple iOS
6 Maps (well perhaps without the public transport) to find the best routes between locations. They are
also used for studying traffic patterns, traffic light timings, and many aspects of transportation.
3. Utility graphs: The power grid, the Internet, and the water network are all examples of graphs where
vertices represent connection points, and edges the wires or pipes between them. Analyzing properties of
these graphs is very important in understanding the reliability of such utilities under failure or attack, or in
minimizing the costs to build infrastructure that matches required demands.
4. Document link graphs: The best known example is the link graph of the web, where each web page is a
vertex, and each hyperlink a directed edge. Link graphs are used, for example, to analyze relevance of web
pages, the best sources of information, and good link sites.
5. Protein-protein interactions graphs: Vertices represent proteins and edges represent interactions
between them that carry out some biological function in the cell. These graphs can be used, for example,
to study molecular pathways—chains of molecular interactions in a cellular process. Humans have over
120K proteins with millions of interactions among them.
6. Network packet traffic graphs: Vertices are IP (Internet protocol) addresses and edges are the packets
that flow between them. Such graphs are used for analyzing network security, studying the spread of
worms, and tracking criminal or non-criminal activity.
7. Scene graphs: In graphics and computer games scene graphs represent the logical or spatial
relationships between objects in a scene. Such graphs are very important in the computer games industry.
8. Finite element meshes: In engineering many simulations of physical systems, such as the flow of air
over a car or airplane wing, the spread of earthquakes through the ground, or the structural vibrations of a
building, involve partitioning space into discrete elements. The elements along with the connections
between adjacent elements form a graph that is called a finite element mesh.
9. Robot planning: Vertices represent states the robot can be in and the edges the possible transitions
between the states. This requires approximating continuous motion as a sequence of discrete steps. Such
graph plans are used, for example, in planning paths for autonomous vehicles.
10. Neural networks: Vertices represent neurons and edges the synapses between them. Neural networks
are used to understand how our brain works and how connections change when we learn. The human
brain has about 10^11 neurons and close to 10^15 synapses.
11. Graphs in quantum field theory: Vertices represent states of a quantum system and the edges the
transitions between them. The graphs can be used to analyze path integrals and summing these up
generates a quantum amplitude (yes, I have no idea what that means).
12. Semantic networks: Vertices represent words or concepts and edges represent the relationships
among the words or concepts. These have been used in various models of how humans organize their
knowledge, and how machines might simulate such an organization.
13. Graphs in epidemiology: Vertices represent individuals and directed edges the transfer of an infectious
disease from one individual to another. Analyzing such graphs has become an important component in
understanding and controlling the spread of diseases.
14. Graphs in compilers: Graphs are used extensively in compilers. They can be used for type inference,
for so-called data flow analysis, register allocation and many other purposes. They are also used in
specialized compilers, such as query optimization in database languages.
15. Constraint graphs: Graphs are often used to represent constraints among items. For example the GSM
network for cell phones consists of a collection of overlapping cells. Any pair of cells that overlap must
operate at different frequencies. These constraints can be modeled as a graph where the cells are vertices
and edges are placed between cells that overlap.
16. Dependence graphs: Graphs can be used to represent dependences or precedence among items. Such
graphs are often used in large projects in laying out what components rely on other components and used
to minimize the total time or cost to completion while abiding by the dependences.