7BCE3C1 Data Structures and Computer Algorithms
SEMESTER: III
Unit - I
Introduction: Basic Terminology, Elementary Data Organization, Algorithm, Efficiency of an
Algorithm, Time and Space Complexity, Asymptotic notations: Big-Oh, Time-Space trade-
off.
Abstract Data Types (ADT)
Arrays: Definition, Single and Multidimensional Arrays, Representation of Arrays: Row Major
Order, and Column Major Order, Application of arrays, Sparse Matrices and their
representations.
Linked lists: Array Implementation and Dynamic Implementation of Singly Linked Lists,
Doubly Linked List, Circularly Linked List, Operations on a Linked List: Insertion, Deletion,
Traversal, Polynomial Representation and Addition, Generalized Linked List.
Unit – II
Stacks: Abstract Data Type, Primitive Stack operations: Push & Pop, Array and Linked
Implementation of Stack in C, Application of stack: Prefix and Postfix Expressions,
Evaluation of postfix expression, Recursion, Tower of Hanoi Problem, Simulating Recursion,
Principles of recursion, Tail recursion, Removal of recursion. Queues: Operations on Queue:
Create, Add, Delete, Full and Empty, Circular queues, Array and linked implementation of
queues in C, Dequeue and Priority Queue.
Unit – III
Trees: Basic terminology, Binary Trees, Binary Tree Representation: Array Representation
and Dynamic Representation, Complete Binary Tree, Algebraic Expressions, Extended Binary
Trees, Array and Linked Representation of Binary trees, Tree Traversal algorithms: Inorder,
Preorder and Postorder, Threaded Binary trees, Traversing Threaded Binary trees, Huffman
algorithm.
Unit – IV
Graphs: Terminology, Sequential and linked Representations of Graphs: Adjacency Matrices,
Adjacency List, Adjacency Multi list, Graph Traversal : Depth First Search and Breadth First
Search, Connected Components, Spanning Trees, Minimum Cost Spanning Trees: Prim's and
Kruskal's algorithms. Transitive Closure and Shortest Path algorithms: Warshall's Algorithm and
Dijkstra's Algorithm, Introduction to Activity Networks.
Unit – V
Searching: Sequential Search, Binary Search, Comparison and Analysis. Internal Sorting:
Insertion Sort, Selection Sort, Bubble Sort, Quick Sort, Two-Way Merge Sort, Heap Sort, Radix
Sort, Practical considerations for Internal Sorting. Search Trees: Binary Search Trees (BST),
Insertion and Deletion in BST, Complexity of Search Algorithm, AVL Trees, Introduction to
m-way Search Trees, B Trees & B+ Trees. Hashing: Hash Function, Collision Resolution
Strategies. Storage Management: Garbage Collection and Compaction.
Unit - I
Introduction to Data Structures and their Characteristics
Introduction
A data structure can be defined as a group of data elements which provides an
efficient way of storing and organizing data in the computer so that it can be used
efficiently. Some examples of data structures are arrays, linked lists, stacks,
queues, etc. Data structures are widely used in almost every aspect of computer
science, e.g. operating systems, compiler design, artificial intelligence, graphics
and many more.
Data structures are the main part of many computer science algorithms, as they
enable the programmers to handle data in an efficient way, supporting three basic
activities:
1. Storing
2. Accessing
3. Manipulation
A data structure:
gives different levels of organization of data;
tells how data can be stored and accessed at the elementary level;
provides operations on groups of data, such as adding an item or
looking up the highest priority item;
provides a means to manage huge amounts of data efficiently;
provides fast searching and sorting of data.
As applications are getting complex and the amount of data is increasing day by day, the
following problems may arise:
Processor speed: To handle very large amounts of data, high-speed processing is required; but as
data grows day by day, to billions of records per entity, the processor may fail to deal with
that much data.
Data search: Consider an inventory of 10^6 items in a store. If our application needs to
search for a particular item, it has to traverse up to 10^6 items every time, slowing down
the search process.
Multiple requests: If thousands of users are searching the data simultaneously on a web
server, there is a chance that even a very large server can fail during that process.
Data structures are used to solve the above problems.
Data Type
A data type is a way to classify various types of data, such as integer, string, etc., which
determines the values that can be used with the corresponding type of data and the type of
operations that can be performed on it. There are two kinds of data types:
Built-in data types:
Integers
Boolean (true, false)
Floating point (decimal numbers)
Characters and strings
Derived data types:
List
Array
Stack
Queue
ARRAY
An array refers to a collection of homogeneous elements stored in contiguous locations in
memory.
Array Representation
Arrays can be declared in various ways in different languages. For illustration, let's take C
array declarations:

int arr[10]; char arr[10]; float arr[5];

Following are the important points to be considered:
The index starts with 0.
An array declared with length 10 can store 10 elements.
Each element can be accessed via its index; for example, arr[6] fetches the element
stored at index 6.
Range of an Array
The range of an array can mean the difference between the maximum and minimum elements
stored in it (input format: n+1 integers, where the first integer is n, the number of elements,
and the next n integers are the elements themselves). In the context of a declaration, the range
is the number of elements, determined by the lower and upper bounds:
1. int data[10]
   a. Array of 10 integer values
   b. Lower bound 0, upper bound 9, and range 10
2. char code[20]
   a. Array of 20 character values
   b. Lower bound 0, upper bound 19, and range 20
For example, in the array data[10], the elements are accessed as data[0], data[1], ..., data[9].
Single-Dimensional Array Initialization
datatype arrayName [ size ] = {value1, value2, ...} ;
Example Code
int marks [6] = { 89, 90, 76, 78, 98, 86 } ;
The above declaration of a single-dimensional array reserves 6 contiguous memory locations of
2 bytes each (assuming 2-byte integers) with the name marks, and initializes them with the
values 89, 90, 76, 78, 98 and 86, in that order.
We can also use the following general syntax to initialize a single dimensional array without specifying size and with
initial values...
Example Code
int marks [] = { 89, 90, 76, 78, 98, 86 } ;
Two-Dimensional Arrays
The simplest form of the multidimensional array is the two-dimensional array. A two-
dimensional array is, in essence, a list of one-dimensional arrays. To declare a two-
dimensional integer array of size x, y you would write something as follows:
type arrayName [ x ][ y ];
where type can be any valid C data type and arrayName is a valid C identifier. A
two-dimensional array can be thought of as a table with x rows and y columns. A
2-dimensional array a with three rows and four columns can be pictured as a 3 x 4
grid of elements a[i][j].
There are two systematic compact layouts for a two-dimensional array. For example, consider the
3 x 3 matrix
1 2 3
4 5 6
7 8 9
In the row-major order layout (adopted by C for statically declared arrays), the elements of each
row are stored in consecutive positions, and all of the elements of a row have lower addresses than
any of the elements of the following row:
1 2 3 4 5 6 7 8 9
In column-major order (traditionally used by Fortran), the elements of each column are consecutive
in memory, and all of the elements of a column have lower addresses than any of the elements of
the following column:
1 4 7 2 5 8 3 6 9
Row Major Order: the elements of the first row are stored linearly, followed by the
elements of the next row, and so on.
To determine the address of element A[i, j] (e = element size, COL = number of columns):
Location ( A[ i,j ] ) = Base Address + ((i*COL) + j)*e
For example, taking a 5 x 5 array of 2-byte elements with base address BA = 100, the
location of A[2,3] is:
= 100 + ((2*5) + 3)*2
= 100 + 13*2
= 100 + 26
= 126
Column Major Order: the elements of the first column are stored linearly, followed by the
elements of the next column.
To determine the address of element A[i, j] (e = element size, ROW = number of rows):
Location ( A[ i,j ] ) = Base Address + ((j*ROW) + i)*e
For the same 5 x 5 array of 2-byte elements with BA = 100, the location of A[2,3] is:
= 100 + ((3*5) + 2)*2
= 100 + 17*2
= 100 + 34
= 134
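A small C sketch that reproduces the two address computations above (the function names are
illustrative, not from the text):

#include <stdio.h>

/* Address of A[i][j] in row-major and column-major layouts
   (e = element size in bytes; indices start at 0). */
unsigned long row_major(unsigned long base, int i, int j, int cols, int e)
{
    return base + (unsigned long)((i * cols) + j) * e;
}

unsigned long col_major(unsigned long base, int i, int j, int rows, int e)
{
    return base + (unsigned long)((j * rows) + i) * e;
}

int main(void)
{
    /* the worked example: a 5 x 5 array of 2-byte elements, base address 100 */
    printf("%lu\n", row_major(100, 2, 3, 5, 2));   /* prints 126 */
    printf("%lu\n", col_major(100, 2, 3, 5, 2));   /* prints 134 */
    return 0;
}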
Triangular Matrices
The diagonal of a square matrix helps define two types of matrices: upper-triangular and
lower-triangular. Indeed, the diagonal subdivides the matrix into two blocks: one above the
diagonal and the other below it. If the lower block consists of zeros, we call such a matrix
upper-triangular; if the upper block consists of zeros, we call it lower-triangular.
A square matrix [a_ij] is called an upper triangular matrix if a_ij = 0 whenever i > j. For
example,
1 2 3
0 5 6
0 0 9
is an upper triangular matrix of order 3 x 3.
Tridiagonal matrix
In linear algebra, a tridiagonal matrix is a band matrix that has nonzero elements on the main
diagonal, the first diagonal below this, and the first diagonal above the main diagonal only.
For example, the following matrix is tridiagonal:
2 1 0 0
1 3 1 0
0 5 2 7
0 0 9 0
Sparse matrix
A sparse matrix can be represented by a 2D array (or an array of triples) with three fields
per non-zero element:
Row: index of the row where the non-zero element is located
Column: index of the column where the non-zero element is located
Value: value of the non-zero element located at index (row, column)
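A minimal C sketch of the triple representation (the sample matrix here is an assumed
example, not from the text):

#define MAXTERMS 10

struct term
{
    int row;     /* index of the row of the non-zero element */
    int col;     /* index of the column of the non-zero element */
    int value;   /* the non-zero value itself */
};

/* Triple representation of an assumed sample matrix
       0 0 3
       0 5 0
       2 0 0   (indices starting at 0) */
struct term sparse[MAXTERMS] =
{
    { 0, 2, 3 },
    { 1, 1, 5 },
    { 2, 0, 2 }
};
int nterms = 3;   /* number of non-zero elements stored */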
LINKED LIST
INTRODUCTION
A linked list can be defined as a collection of objects called nodes that are randomly
stored in memory.
A node contains two fields: the data stored at that particular address, and the pointer
which contains the address of the next node in memory.
The last node of the list contains a null pointer.
BENEFITS AND LIMITATIONS OF LINKED LISTS
BENEFITS
Dynamic Data Structure
A linked list is a dynamic data structure, so it can grow and shrink at runtime by allocating
and deallocating memory. There is no need to give an initial size for a linked list.
No Memory Wastage
As the size of a linked list can increase or decrease at run time, there is no memory
wastage; in a linked list, memory is allocated only when required.
Implementation
Data structures such as stacks and queues can be easily implemented using a
linked list.
LIMITATIONS
Memory Usage
More memory is required to store elements in a linked list than in an array, because
in a linked list each node also contains a pointer, which requires extra memory.
Traversal
Traversal of elements or nodes is slow in a linked list. For example, if we want to access a
node at position n, we have to traverse all the nodes before it, so the time required to
access a node can be large.
Reverse Traversing
In a singly linked list, reverse traversal is really difficult. In a doubly linked list it is easier,
but extra memory is required for the back pointer, hence more memory is consumed.
Types of linked list
1. Singly Linked list
2. Double linked list
3. Circular linked list
Single Linked List:
A linked list allocates space for each element separately in its own block of memory
called a "node". The list gets an overall structure by using pointers to connect all its
nodes together, like the links in a chain. Each node contains two fields: a "data" field to
store the element, and a "next" field, which is a pointer used to link to the next
node.
(Figure: stack and heap view of a singly linked list; a start pointer on the stack holds the
address of the first node, and the nodes 10, 20, 30, 40 are linked through the heap, the last
node holding a null pointer.)
The beginning of the linked list is stored in a "start" pointer which points to the first
node. The first node contains a pointer to the second node. The second node contains a
pointer to the third node, ... and so on. The last node in the list has its next field set to
NULL to mark the end of the list.
We need to create a start node, used to create and access other nodes in the linked
list, by creating a structure with one data item and a next pointer pointing to the next
node of the list. This is called a self-referential structure.
struct slinklist
{
    int data;
    struct slinklist *next;
};
typedef struct slinklist node;
node *start = NULL;
Figure 3.2.2. Structure definition, single link node and empty list
The basic operations in a single linked list are:
Creation.
Insertion.
Deletion.
Traversing.
Creating a singly linked list starts with creating a node. Sufficient memory has to be
allocated for the node; the information is stored in memory allocated by the malloc()
function. The function getnode() creates a node: after allocating memory for the
structure of type node, the information for the item (i.e., data) is read from the user,
the next field is set to NULL, and finally the address of the node is returned.

node* getnode()
{
    node* newnode;
    newnode = (node *) malloc(sizeof(node));
    printf("\n Enter data: ");
    scanf("%d", &newnode -> data);
    newnode -> next = NULL;
    return newnode;
}
One of the most primitive operations that can be done in a singly linked list is the
insertion of a node. Memory is allocated for the new node before reading the data. The
new node initially has an empty data field and an empty next field. The data field of
the new node is then stored with the information read from the user, and the next field
is assigned NULL. The new node can then be inserted at three different places, namely:
Inserting a node at the beginning.
Inserting a node at the end.
Inserting a node at intermediate position.
The following steps are to be followed to insert a new node at the beginning of the list:
get the new node using getnode(); if the list is empty, make it the start node; otherwise,
link it in front of the current first node and update start. Figure 3.2.5 shows inserting a
node into the single linked list at the beginning.
(Figure 3.2.5: the new node 5 at address 500 becomes the new start and points to the old
first node at address 100.)
void insert_at_beg()
{
node *newnode;
newnode = getnode();
if(start == NULL)
{
start = newnode;
}
else
{
newnode -> next = start;
start = newnode;
}
}
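The text lists insertion at the end and at an intermediate position as well; insertion at the
end can be sketched in the same style, reusing getnode() and start (this is not the text's
own listing):

void insert_at_end()
{
    node *newnode, *temp;
    newnode = getnode();
    if(start == NULL)
    {
        start = newnode;
    }
    else
    {
        temp = start;
        while(temp -> next != NULL)   /* walk to the last node */
            temp = temp -> next;
        temp -> next = newnode;       /* link the new node after it */
    }
}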
Deletion of a node:
Another primitive operation that can be done in a singly linked list is the deletion of a
node. Memory is to be released for the node to be deleted. A node can be deleted from
the list at three different places, namely: at the beginning, at the end, and at an
intermediate position.
The following steps are followed to delete a node at the beginning of the list: point a
temporary pointer at the first node, advance start to the second node, and free the old
first node. Figure 3.2.8 shows deleting a node at the beginning of a single linked list.
(Figure 3.2.8: temp points to the old first node 10 at address 100; start is moved to the
second node 20 at address 200, and the old node is freed.)
The function delete_at_beg() is used for deleting the first node in the list.

void delete_at_beg()
{
    node *temp;
    if(start == NULL)
    {
        printf("\n No nodes exist..");
        return;
    }
    else
    {
        temp = start;
        start = temp -> next;
        free(temp);
        printf("\n Node deleted ");
    }
}
Array based linked lists:
A linked list can also be implemented inside an array: each element holds the data together
with the index of the next element, instead of a pointer.
(Figure: the conceptual structure of the list a, b, c, d and its array implementation.)
Double Linked List:
A double linked list is a two-way list in which all nodes will have two links. This helps in
accessing both successor node and predecessor node from the given node position. It
provides bi-directional traversing. Each node contains three fields:
Left link.
Data.
Right link.
The left link points to the predecessor node and the right link points to the successor
node. The data field stores the required data.
The basic operations in a double linked list are:
Creation.
Insertion.
Deletion.
Traversing.
A double linked list is shown in figure 3.3.1.
(Figure 3.3.1: stack and heap view of a double linked list. The start pointer holds the
address of the first node (100). Each node stores the previous node address, the data, and
the next node address; the right field of the last node is NULL.)
The beginning of the double linked list is stored in a "start" pointer which points to the
first node. The first node’s left link and last node’s right link is set to NULL.
struct dlinklist
{
    struct dlinklist *left;
    int data;
    struct dlinklist *right;
};
typedef struct dlinklist node;
node *start = NULL;
Figure 3.4.1. Structure definition, double link node and empty list
Creating a double linked list starts with creating a node. Sufficient memory has to be
allocated for the node; the information is stored in memory allocated by the malloc()
function. The function getnode() creates a node: after allocating memory for the
structure of type node, the information for the item (i.e., data) is read from the user,
both the left and right fields are set to NULL, and finally the address of the node is
returned.

node* getnode()
{
    node* newnode;
    newnode = (node *) malloc(sizeof(node));
    printf("\n Enter data: ");
    scanf("%d", &newnode -> data);
    newnode -> left = NULL;
    newnode -> right = NULL;
    return newnode;
}
The function dbl_insert_end() is used for inserting a node at the end. Figure 3.4.5
shows inserting a node into the double linked list at the end.
(Figure 3.4.5: the new node 40 at address 400 is linked after the old last node at address
300; its right field is NULL and its left field points back to 300.)
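A possible body for dbl_insert_end(), in the same style as the other routines (a sketch
reusing getnode() and start, not the text's own listing):

void dbl_insert_end()
{
    node *newnode, *temp;
    newnode = getnode();
    if(start == NULL)
    {
        start = newnode;
    }
    else
    {
        temp = start;
        while(temp -> right != NULL)   /* walk to the last node */
            temp = temp -> right;
        temp -> right = newnode;       /* link the new node at the end */
        newnode -> left = temp;        /* back pointer to the old last node */
    }
}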
The following steps are followed to delete a node at the beginning of the list:
temp = start;
start = start -> right;
start -> left = NULL;
free(temp);
The function dbl_delete_beg() is used for deleting the first node in the list. The figure
shows deleting a node at the beginning of a double linked list.
(Figure: start is moved to the second node at address 200, and the old first node is freed.)
Circular Linked List:
It is just a single linked list in which the link field of the last node points back to the
address of the first node. A circular linked list has no beginning and no end, so it is
necessary to establish a special pointer, called the start pointer, always pointing to the first
node of the list. Circular linked lists are frequently used instead of ordinary linked lists
because many operations are easier to implement. In a circular linked list no null
pointers are used; hence all pointers contain valid addresses.
(Figure: a circular single linked list; start points to the first node at address 100 and the
last node points back to it.)
The basic operations in a circular single linked list are:
Creation.
Insertion.
Deletion.
Traversing.
Creating a circular single linked list with 'n' number of nodes: repeat the following for each
of the n nodes, and finally link the last node back to the first.

newnode = getnode();
if(start == NULL)
    start = newnode;
else
{
    temp = start;
    while(temp -> next != NULL)
        temp = temp -> next;
    temp -> next = newnode;
}

After all 'n' nodes are created:
newnode -> next = start;   /* the last node points back to the first */
The following steps are to be followed to insert a new node at the beginning of the
circular list:

newnode = getnode();
if(start == NULL)
{
    start = newnode;
    newnode -> next = start;
}
else
{
    last = start;
    while(last -> next != start)
        last = last -> next;
    newnode -> next = start;
    start = newnode;
    last -> next = start;
}
The function cll_insert_beg() is used for inserting a node at the beginning. The figure
shows inserting a node into the circular single linked list at the beginning.
(Figure: the new node 5 at address 500 becomes start and points to the old first node at
100; the last node's next field is updated to 500.)
Deleting a node at the end:
The following steps are followed to delete a node at the end of the list:

temp = start;
prev = start;
while(temp -> next != start)
{
    prev = temp;
    temp = temp -> next;
}
prev -> next = start;
free(temp);

After deleting the node, if the list becomes empty, then start = NULL.
The function cll_delete_last() is used for deleting the last node in the list.
Figure 3.6.5 shows deleting a node at the end of a circular single linked list.
(Figure 3.6.5: the node before the last one is made to point back to start, and the old last
node is freed.)
UNIT II
STACK
A stack is a list of elements in which an element may be inserted or deleted only at
one end, called the TOP of the stack. The elements are removed in the reverse order of
that in which they were inserted into the stack (last in, first out).
STACK OPERATIONS
There are two ways to represent Stack in memory. One is using array and
other is using linked list.
Push Operation
Push an item onto the top of the stack (insert an item)
Algorithm for PUSH:
1. Check whether the stack is full (overflow).
2. If not, increment TOP.
3. Store the new item at the TOP position.
Here are the minimal operations we'd need for an abstract stack (and their typical
names): push (insert an item on top), pop (remove and return the top item), peek or top
(examine the top item without removing it), and isEmpty (test whether the stack is empty).
We can represent a stack as a linked list. In a stack, push and pop operations are
performed at one end, called the top. We can perform similar operations at one end of a
list using a top pointer. The linked stack looks as shown in figure 4.3.
(Figure 4.3: a linked stack; start points to the first node 10 at address 100, the chain
continues through 20 (200) and 30 (300) to the node 40 (400), whose next field is NULL;
top points to this last node.)
Source code for stack operations, using linked list:
#include<stdio.h>
#include<conio.h>
#include <stdlib.h>
struct stack
{
int data;
struct stack *next;
};
void push(node *newnode);
void pop();
void display();
typedef struct stack node;
node *start = NULL;
node *top = NULL;
node* getnode()
{
    node *temp;
    temp = (node *) malloc(sizeof(node));
    printf("\n Enter data ");
    scanf("%d", &temp -> data);
    temp -> next = NULL;
    return temp;
}
void push(node *newnode)
{
node *temp;
    if( newnode == NULL )
    {
        printf("\n Stack Overflow..");
        return;
    }
    if(start == NULL)
    {
        start = newnode;
        top = newnode;
    }
    else
    {
        temp = start;
        while( temp -> next != NULL)
            temp = temp -> next;
        temp -> next = newnode;
        top = newnode;
    }
printf("\n\n\t Data pushed into stack");
}
void pop()
{
    node *temp;
    if(top == NULL)
    {
        printf("\n\n\t Stack underflow");
        return;
    }
    temp = start;
    if( start -> next == NULL)
    {
        printf("\n\n\t Popped element is %d ", top -> data);
        start = NULL;
        free(top);
        top = NULL;
    }
    else
    {
        while(temp -> next != top)
            temp = temp -> next;
        temp -> next = NULL;
        printf("\n\n\t Popped element is %d ", top -> data);
        free(top);
        top = temp;
    }
}

void display()
{
    node *temp;
    if(top == NULL)
    {
        printf("\n\n\t Stack is empty ");
    }
    else
    {
        temp = start;
        printf("\n\n\n\t\t Elements in the stack: \n");
        printf("%5d ", temp -> data);
        while(temp != top)
        {
            temp = temp -> next;
            printf("%5d ", temp -> data);
        }
    }
}
char menu()
{
    char ch;
    clrscr();
    printf("\n \tStack operations using pointers.. ");
    printf("\n -----------********** \n");
    printf("\n 1. Push ");
    printf("\n 2. Pop ");
    printf("\n 3. Display");
    printf("\n 4. Quit ");
    printf("\n Enter your choice: ");
    ch = getche();
    return ch;
}
void main()
{
    char ch;
    node *newnode;
    do
    {
        ch = menu();
        switch(ch)
        {
            case '1':
                newnode = getnode();
                push(newnode);
                break;
            case '2':
                pop();
                break;
            case '3':
                display();
                break;
            case '4':
                return;
        }
        getch();
    } while( ch != '4' );
}
The stack can also be implemented as an array. The following C++ fragment shows the
primitive operations; it assumes global declarations such as #define STACKSIZE 10,
int Stack[STACKSIZE]; and int Top = -1;, driven by a menu-based main() similar to the
linked-list version shown below.

void Push(int item)
{
    Stack[++Top] = item;    // advance Top, then store the item
}
int Pop( )
{
    return Stack[Top--];    // return the top item, then lower Top
}

bool IsEmpty( )
{
    if(Top == -1) return true; else return false;
}

bool IsFull( )
{
    if(Top == STACKSIZE-1) return true; else return false;
}

void Traverse( )
{
    int TopTemp = Top;
    do { cout << Stack[TopTemp--] << endl; } while(TopTemp >= 0);
}
struct node { int info; struct node *next; };
struct node *TOP = NULL;

void Push(int item)
{
    struct node *NewNode = new node;
    NewNode->info = item;
    NewNode->next = NULL;
    if(TOP == NULL)
        TOP = NewNode;
    else
    {
        NewNode->next = TOP;
        TOP = NewNode;
    }
}

struct node* pop()
{
    struct node *T = TOP;
    TOP = TOP->next;
    return T;
}
void Traverse()
{
    struct node *T;
    for(T = TOP; T != NULL; T = T->next)
        cout << T->info << endl;
}

bool IsEmpty()
{
    if(TOP == NULL) return true; else return false;
}

int main()
{
    struct node *T;
    int item, ch;
    while(1)
    {
        cout << "\n\n ***** Stack Operations *****\n";
        cout << "\n\n 1- Push Item \n 2- Pop Item \n";
        cout << " 3- Traverse/Print stack-values\n 4- Exit\n\n";
        cout << "\n Your Choice --> ";
        cin >> ch;
        switch(ch)
        {
            case 1:
                cout << "\nPut a value: ";
                cin >> item;
                Push(item);
                break;
            case 2:
                if(IsEmpty()) { cout << "\n\n Stack is Empty\n"; break; }
                T = pop();
                cout << T->info << " has been deleted \n";
                delete T;     // release the popped node
                break;
            case 3:
                if(IsEmpty()) { cout << "\n\n Stack is Empty\n"; break; }
                Traverse();
                break;
            case 4:
                exit(0);
        } // end of switch block
    } // end of while loop
    return 0;
} // end of main function
Applications of stack
Operator precedence, from highest to lowest:
1. parentheses (highest)
2. unary operators
3. ^ (exponentiation)
4. *, /
5. +, - (lowest)
Syntax checking
Consider expressions of the form wcw', where w is a string over {a, b} and w' is its
reverse; this denotes sentences consisting of a string, followed by c, followed by the
reverse of that string, e.g. abcba. A stack can be used to check that an input has this
form: push the symbols of w, and on reading c, pop one symbol for each remaining input
symbol and compare.
Infix, Prefix and Postfix Notation
We are accustomed to write arithmetic expressions with the operation between the two
operands: a+b or c/d. If we write a+b*c, however, we have to apply precedence rules
to avoid the ambiguous evaluation
Infix: (a + b) * (c - d)    Prefix: *+ab-cd    Postfix: ab+cd-*
Example: evaluating the postfix expression 6 2 3 + - 3 8 2 / + * 2 ^ 3 + with a stack:

Symbol   Operand 1   Operand 2   Value   Stack
6                                        6
2                                        6, 2
3                                        6, 2, 3
+        2           3           5       6, 5
-        6           5           1       1
3                                        1, 3
8                                        1, 3, 8
2                                        1, 3, 8, 2
/        8           2           4       1, 3, 4
+        3           4           7       1, 7
*        1           7           7       7
2                                        7, 2
^        7           2           49      49
3                                        49, 3
+        49          3           52      52

The final value of the expression is 52.
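The evaluation procedure traced above can be sketched in C (a minimal version assuming
single-digit operands; the helper names are illustrative, not from the text):

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <math.h>

#define MAXSTACK 50

static double stk[MAXSTACK];
static int tos = -1;

static void push(double v) { stk[++tos] = v; }
static double pop(void)    { return stk[tos--]; }

double eval_postfix(const char *expr)
{
    double op1, op2;
    const char *p;
    for (p = expr; *p; p++)
    {
        if (isdigit((unsigned char)*p))
            push(*p - '0');              /* operand: push its value */
        else if (strchr("+-*/^", *p))
        {
            op2 = pop();                 /* right operand is on top */
            op1 = pop();
            switch (*p)
            {
                case '+': push(op1 + op2); break;
                case '-': push(op1 - op2); break;
                case '*': push(op1 * op2); break;
                case '/': push(op1 / op2); break;
                case '^': push(pow(op1, op2)); break;
            }
        }
        /* any other character (e.g. a space) is skipped */
    }
    return pop();
}

int main(void)
{
    printf("%g\n", eval_postfix("6 2 3 + - 3 8 2 / + * 2 ^ 3 +"));  /* prints 52 */
    return 0;
}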
Infix to Postfix
This process uses a stack as well. We have to hold information that's expressed inside
parentheses while scanning to find the closing ')'. We also have to hold operators that
are of lower precedence on the stack. The algorithm is:
1. Scan the infix expression from left to right.
2. If the scanned symbol is an operand, append it to the output.
3. If it is a '(', push it onto the stack.
4. If it is an operator, pop to the output every stacked operator of greater or equal
precedence, then push the scanned operator.
5. If it is a ')', pop operators to the output until the matching '(' is found; discard both
parentheses.
6. When the input is exhausted, pop any remaining operators to the output.
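A compact C sketch of this conversion (treating ^ as left-associative for simplicity; the
function names are illustrative):

#include <stdio.h>
#include <ctype.h>
#include <string.h>

static int prec(char op)   /* precedence per the table above */
{
    switch (op)
    {
        case '^':           return 3;
        case '*': case '/': return 2;
        case '+': case '-': return 1;
        default:            return 0;   /* '(' stays on the stack */
    }
}

void infix_to_postfix(const char *in, char *out)
{
    char opstack[100];
    int top = -1, k = 0;
    const char *p;

    for (p = in; *p; p++)
    {
        if (isalnum((unsigned char)*p))
            out[k++] = *p;                       /* operands go straight to output */
        else if (*p == '(')
            opstack[++top] = *p;
        else if (*p == ')')
        {
            while (top >= 0 && opstack[top] != '(')
                out[k++] = opstack[top--];       /* pop until '(' */
            top--;                               /* discard '(' */
        }
        else if (strchr("+-*/^", *p))
        {
            while (top >= 0 && prec(opstack[top]) >= prec(*p))
                out[k++] = opstack[top--];       /* pop higher/equal precedence */
            opstack[++top] = *p;
        }
    }
    while (top >= 0)
        out[k++] = opstack[top--];               /* flush remaining operators */
    out[k] = '\0';
}

int main(void)
{
    char post[100];
    infix_to_postfix("(a+b)*(c-d)", post);
    printf("%s\n", post);                        /* prints ab+cd-* */
    return 0;
}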
Recursion:
A function is recursive if a statement in the body of the function calls itself. Recursion is
the process of defining something in terms of itself. For a computer language to be
recursive, a function must be able to call itself.
For example, let us consider the function factr() shown below, which computes the
factorial of an integer.

int factr(int n)
{
    int result;
    if (n == 0)
        return 1;
    else
    {
        result = n * factr(n - 1);
        return (result);
    }
}
Differences between recursion and iteration:

Iteration                                          Recursion
Explicitly uses a repetition structure (a loop).   Achieves repetition through repeated
                                                   function calls.
Terminates when the loop-continuation              Terminates when a base case is
condition fails.                                   recognized.
Keeps modifying a counter until the loop-          Keeps producing simpler versions of the
continuation condition fails.                      original problem until the base case is
                                                   reached.
Normally occurs within a loop, so no extra         Causes another copy of the function to be
memory is assigned.                                created, so considerable memory space is
                                                   occupied.
Reduces the processor's operating time.            Increases the processor's operating time.
Start out with some natural number N (in our example, 5). The recursive definition is:
n = 0:  n! = 1               (Base Case)
n > 0:  n! = n * (n - 1)!    (Recursive Case)

Recursion factorials:
5! = 5 * 4! = 5 * 24 = 120        factr(5) = 5 * factr(4) = 120
4! = 4 * 3! = 4 * 6  = 24         factr(4) = 4 * factr(3) = 24
3! = 3 * 2! = 3 * 2  = 6          factr(3) = 3 * factr(2) = 6
2! = 2 * 1! = 2 * 1  = 2          factr(2) = 2 * factr(1) = 2
1! = 1 * 0! = 1 * 1  = 1          factr(1) = 1 * factr(0) = 1
0! = 1                            factr(0) = 1
In the game of Towers of Hanoi, there are three towers labeled 1, 2, and 3. The game
starts with n disks on tower 1. For simplicity, let n be 3. The disks are numbered from 1
to 3, and without loss of generality we may assume that the diameter of each disk is
the same as its number: disk 1 has diameter 1 (in some unit of measure), disk
2 has diameter 2, and disk 3 has diameter 3. All three disks start on tower 1 in the
order 1, 2, 3. The objective of the game is to move all the disks from tower 1 to
tower 3 using tower 2, subject to the rule that at no time can a larger disk be placed on a
smaller disk. Figure 3.11.1 illustrates the initial setup of the towers of Hanoi, and figure
3.11.2 illustrates the final setup.
The rules to be followed in moving the disks from tower 1 to tower 3 using tower 2 are:
1. Only one disk can be moved at a time, and only the top disk of a tower may be moved.
2. A larger disk may never be placed on top of a smaller disk.
(Fig. 3.11.1: Initial setup of Towers of Hanoi.)
(Fig. 3.11.2: Final setup of Towers of Hanoi.)
The towers of Hanoi problem can be easily implemented using recursion. To move the
largest disk to the bottom of tower 3, we move the remaining n – 1 disks to tower 2
and then move the largest disk to tower 3. Now we have the remaining n – 1 disks to
be moved to tower 3. This can be achieved by using the remaining two towers. We can
also use tower 3 to place any disk on it, since the disk placed on tower 3 is the largest
disk and continue the same operation to place the entire disks in tower 3 in order.
MoveTower(disk, source, dest, spare):
IF disk == 0, THEN:
    move disk from source to dest
ELSE:
    MoveTower(disk - 1, source, spare, dest)   // Step 1: move the smaller disks to spare
    move disk from source to dest              // Step 2: move the largest disk
    MoveTower(disk - 1, spare, dest, source)   // Step 3: move the smaller disks onto it
END IF
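The recursive procedure translates directly into C; a minimal sketch (tower numbering as
in the text, with a base case of a single disk):

#include <stdio.h>

/* Move n disks from tower 'source' to tower 'dest' using 'spare'. */
void move_tower(int n, int source, int dest, int spare)
{
    if (n == 1)
    {
        printf("Move disk 1 from tower %d to tower %d\n", source, dest);
        return;
    }
    move_tower(n - 1, source, spare, dest);                               /* step 1 */
    printf("Move disk %d from tower %d to tower %d\n", n, source, dest); /* step 2 */
    move_tower(n - 1, spare, dest, source);                               /* step 3 */
}

int main(void)
{
    move_tower(3, 1, 3, 2);   /* the 3-disk game described above */
    return 0;
}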
Queue:
A queue is another special kind of list, where items are inserted at one end called the
rear and deleted at the other end called the front. Another name for a queue is a
“FIFO” or “First-in-first-out” list.
The operations for a queue are analogous to those for a stack; the difference is that the
insertions go at the end of the list, rather than the beginning. We shall use the
following operations on queues:
enqueue: which inserts an element at the end of the queue.
dequeue: which deletes an element at the start of the queue.
Representation of Queue:
Let us consider a queue which can hold a maximum of five elements (indices 0 to 4).
Initially the queue is empty: FRONT = REAR = 0.
Now, insert 11. REAR = REAR + 1 = 1, FRONT = 0. Queue: 11
Next, insert 22. REAR = REAR + 1 = 2, FRONT = 0. Queue: 11 22
Again insert another element, 33. REAR = REAR + 1 = 3, FRONT = 0. Queue: 11 22 33
Now, delete an element. The element deleted is the element at the front of the queue,
i.e. 11. FRONT = FRONT + 1 = 1, REAR = 3. Queue: 22 33
Again, delete an element. The element to be deleted is always pointed to by the FRONT
pointer, so 22 is deleted. FRONT = FRONT + 1 = 2, REAR = 3. Queue: 33
Now insert 44 and 55, bringing REAR to the end of the array. At this point it is not
possible to insert an element 66, even though there are two vacant positions in the linear
queue. To overcome this problem, the elements of the queue are shifted towards the
beginning of the queue so that vacant positions are created at the rear end, and FRONT
and REAR are adjusted properly. The element 66 can then be inserted at the rear end.
After this operation, the queue status is:
Queue: 33 44 55 66, FRONT = 0, REAR = 4.
Source code for Queue operations using array:
In order to create a queue we require a one dimensional array Q(1:n) and two
variables front and rear. The conventions we shall adopt for these two variables are
that front is always 1 less than the actual front of the queue and rear always points to
the last element in the queue. Thus, front = rear if and only if there are no elements in
the queue. The initial condition then is front = rear = 0. The various queue operations
to perform creation, deletion and display the elements in a queue are as follows:
#include <stdio.h>
#include <conio.h>
#define MAX 6

int Q[MAX];
int front = 0;
int rear = 0;
void insertQ()
{
int data;
    if(rear == MAX)
    {
        printf("\n Linear Queue is full");
        return;
    }
    else
    {
        printf("\n Enter data: ");
        scanf("%d", &data);
        Q[rear] = data;
        rear++;
        printf("\n Data Inserted in the Queue ");
    }
}
void deleteQ()
{
    if(rear == front)
    {
        printf("\n\n Queue is Empty..");
        return;
    }
    else
    {
        printf("\n Deleted element is: %d", Q[front]);
        front++;
    }
}

void displayQ()
{
    int i;
    if(front == rear)
    {
        printf("\n\n\t Queue is Empty ");
        return;
    }
    printf("\n Elements in Queue are: ");
    for(i = front; i < rear; i++)
    {
        printf("%d\t", Q[i]);
    }
}
int menu()
{
    int ch;
    clrscr();
    printf("\n \tQueue operations using ARRAY..");
    printf("\n -----------********** \n");
    printf("\n 1. Insert ");
    printf("\n 2. Delete ");
    printf("\n 3. Display");
    printf("\n 4. Quit ");
    printf("\n Enter your choice: ");
    scanf("%d", &ch);
    return ch;
}
void main()
{
    int ch;
    do
    {
        ch = menu();
        switch(ch)
        {
            case 1: insertQ(); break;
            case 2: deleteQ(); break;
            case 3: displayQ(); break;
            case 4: return;
        }
        getch();
    } while(1);
}
Linked List Implementation of Queue:
We can represent a queue as a linked list. In a queue data is deleted from the front end and
inserted at the rear end. We can perform similar operations on the two ends of a list. We use two
pointers front and rear for our linked queue implementation.
(Figure: a linked queue; front points to the first node at address 100 and rear to the last
node at address 400.)
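A minimal C sketch of these two operations (the node type and function names are
illustrative, not the text's own listing):

#include <stdlib.h>

struct qnode
{
    int data;
    struct qnode *next;
};
struct qnode *front = NULL, *rear = NULL;

void enqueue(int item)                 /* insert at the rear end */
{
    struct qnode *n = (struct qnode *) malloc(sizeof(struct qnode));
    n->data = item;
    n->next = NULL;
    if (rear == NULL)
        front = rear = n;              /* first node: both pointers refer to it */
    else
    {
        rear->next = n;
        rear = n;
    }
}

int dequeue(void)                      /* delete at the front end; caller must check for empty */
{
    struct qnode *t = front;
    int item = t->data;
    front = front->next;
    if (front == NULL)
        rear = NULL;                   /* queue became empty */
    free(t);
    return item;
}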
Dequeue:
In the preceding section we saw that a queue in which we insert items at one end and from which
we remove items at the other end. In this section we examine an extension of the queue, which
provides a means to insert and remove items at both ends of the queue. This data structure is a
deque. The word deque is an acronym derived from double-ended queue. Figure 4.5 shows the
representation of a deque.
(Figure 4.5: representation of a deque holding 36, 16, 56, 62, 19; insertion and deletion
are possible at both the front and the rear.)
A deque provides four operations. Figure 4.6 shows the basic operations on a deque.
(Figure 4.6: the four basic operations, enqueue_front, dequeue_front, enqueue_rear and
dequeue_rear, applied to a sample deque 11 22 33.)
Algorithm to delete an element from the deque:
deq_front
step 1. Start.
step 2. Check whether the queue is empty: if (f == r), the queue is empty.
step 3. If not empty, update the pointer f as f = f + 1 and delete the element at position f
as element = Q[f].
step 4. If (f == r), reset the pointers as f = r = -1.
step 5. Stop.
deq_back
step 1. Start.
step 2. Check whether the queue is empty: if (f == r), the queue is empty.
step 3. If not empty, delete the element at position r as element = Q[r] and update the
pointer r as r = r - 1.
step 4. If (f == r), reset the pointers as f = r = -1.
step 5. Stop.
Priority Queue:
A priority queue is a collection of elements such that each element has been assigned a
priority, and the order in which elements are deleted and processed comes from the
following rules:
1. An element of higher priority is processed before any element of lower priority.
2. Two elements with the same priority are processed according to the order in which
they were added to the queue.
A prototype of a priority queue is a time-sharing system: programs of high priority are
processed first, and programs with the same priority form a standard queue. An efficient
implementation of a priority queue is to use a heap, which in turn can be used for
sorting; this is called heap sort.
Applications of Queue:
1. When a resource is shared among multiple consumers (e.g. CPU scheduling), the
requests are served on a first-come first-served basis.
2. When multiple users send print jobs to a printer, each printing job is kept in the printing
queue, and the printer prints those jobs on a first in first out (FIFO) basis.
3. Breadth first search uses a queue data structure to find an element in a graph.
Circular Queue:
A more efficient queue representation is obtained by regarding the array Q[MAX] as circular.
Any number of items could be placed on the queue. This implementation of a queue is called a
circular queue because it uses its storage array as if it were a circle instead of a linear list.
Representation of Circular Queue:
Let us consider a circular queue which can hold a maximum (MAX) of six elements.
Initially the queue is empty: MAX = 6, FRONT = REAR = 0, COUNT = 0.
Now, insert 11 into the circular queue. FRONT = 0, REAR = (REAR + 1) % 6 = 1,
COUNT = 1. Queue: 11
Insert new elements 22, 33, 44 and 55 into the circular queue. FRONT = 0,
REAR = (REAR + 1) % 6 = 5, COUNT = 5. Queue: 11 22 33 44 55
Insert CircularQueue ( )
If (FRONT == (REAR % SIZE) + 1) Then [Check for overflow]
    Print: Queue Overflow
    Return
If (FRONT == 0) Then [Queue initially empty]
    Set FRONT = 1
    Set REAR = 1
Else If (REAR == SIZE) Then [Wrap around]
    Set REAR = 1
Else
    Set REAR = REAR + 1
Set QUEUE[REAR] = ITEM
Delete CircularQueue ( )
If (FRONT == 0) Then [Check for underflow]
    Print: Underflow
Else
    ITEM = QUEUE[FRONT]
    If (FRONT == REAR) Then [Queue contains a single element]
        Set FRONT = 0
        Set REAR = 0
    Else If (FRONT == SIZE) Then [Wrap around]
        Set FRONT = 1
    Else
        Set FRONT = FRONT + 1
Exit
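The same logic can be written in C with 0-based indices, using a COUNT field as in the
trace above (the names here are illustrative):

#define MAX 6

int CQ[MAX];
int cfront = 0, crear = 0, count = 0;    /* count = number of stored items */

int cq_insert(int item)
{
    if (count == MAX)
        return 0;                        /* overflow */
    CQ[crear] = item;
    crear = (crear + 1) % MAX;           /* wrap around the end of the array */
    count++;
    return 1;
}

int cq_delete(int *item)
{
    if (count == 0)
        return 0;                        /* underflow */
    *item = CQ[cfront];
    cfront = (cfront + 1) % MAX;
    count--;
    return 1;
}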
Unit III
TREES:
A tree is a hierarchical collection of nodes. One of the nodes, known as the root, is at the
top of the hierarchy. Each node can have at most one link coming into it. The node
where the link originates is called the parent node. The root node has no parent. The
links leaving a node (any number of links are allowed) point to child nodes. Trees are
recursive structures. Each child node is itself the root of a subtree. At the bottom of
the tree are leaf nodes, which have no children.
(Figure: two sample trees rooted at a; one with children b, c and leaves d, e, f, and one
with children b, c and leaves d, e.)
BINARY TREE:
In general, tree nodes can have any number of children. In a binary tree, each node
can have at most two children. A binary tree is either empty or consists of a node
called the root together with two binary trees called the left subtree and the right
subtree.
A tree with no nodes is called as a null tree. A binary tree is shown in figure 5.2.1.
right subtree
left subtree D E F G
H I
Tree Terminology:
Leaf node
A node with no children is called a leaf (or external node). A node which is not a leaf is called an
internal node.
Path
A sequence of nodes n1, n2, . . ., nk, such that ni is the parent of ni + 1 for i = 1,
2,. . ., k - 1. The length of a path is 1 less than the number of nodes on the
path. Thus there is a path of length zero from a node to itself.
For the tree shown in figure 5.2.1, the path between A and I is A, B, D, I.
Siblings
The children of the same parent are called siblings. For the tree shown in figure 5.2.1,
F and G are siblings (children of the parent node C), and H and I are siblings (children of
the parent node D).
Subtree
Any node of a tree, together with all of its descendants, is a subtree.
Height
The maximum level in a tree determines its height. The height of a node in a
tree is the length of a longest path from the node to a leaf. The term depth is
also used to denote height of the tree. The height of the tree of Figure 5.2.1 is
3.
Depth
The depth of a node is the number of nodes along the path from the root to that
node. For instance, node ‘C’ in figure 5.2.1 has a depth of 1.
The nodes of a binary tree can be numbered in a natural way, level by level, left
to right. The nodes of a complete binary tree can be numbered so that the root
is assigned the number 1, a left child is assigned twice the number assigned its
parent, and a right child is assigned one more than twice the number assigned
its parent. For example, see Figure 5.2.2.
(Figure 5.2.2: level 0 holds node 1; level 1 holds nodes 2, 3; level 2 holds nodes 4, 5, 6, 7;
level 3 holds nodes 8, 9.)
3. Since a binary tree can contain at most one node at level 0 (the root), it can
contain at most 2^l nodes at level l.
If every non-leaf node in a binary tree has nonempty left and right subtrees, the
tree is termed as strictly binary tree. Thus the tree of figure 5.2.3(a) is strictly
binary. A strictly binary tree with n leaves always contains 2n - 1 nodes.
A full binary tree of height h has all its leaves at level h. Alternatively; All non
leaf nodes of a full binary tree have two children, and the leaf nodes have no
children.
(Figure 5.2.3: (a) a strict binary tree; (b) a full binary tree of height 3 with nodes 1 to 15;
(c) a complete binary tree with nodes 1 to 10.)
A complete binary tree of height h looks like a full binary tree down to level h-1,
and the level h is filled from left to right.
A complete binary tree with n leaves that is not strictly binary has 2n nodes. For
example, the tree of Figure 5.2.3(c) is a complete binary tree having 5 leaves
and 10 nodes.
(Figure: a binary tree that is complete but not strict, and trees that are neither complete
nor strict.)
We define two terms: internal nodes and external nodes. An internal node is a tree
node having at least one key and possibly some children. It is sometimes convenient
to have another type of node, called an external node; an external node doesn't
exist, but serves as a conceptual placeholder for nodes to be inserted.
We draw internal nodes using circles, with letters as labels; external nodes are denoted
by squares. The square-node version is sometimes called an extended binary tree. A
binary tree with n internal nodes has n+1 external nodes. Figure 5.2.6 shows a sample
tree illustrating both internal and external nodes.
1. Preorder
2. Inorder
3. Postorder
4. Level order or breadth first traversal
In the first three traversal methods, the left subtree of a node is traversed before the
right subtree. The difference among them comes from the difference in the time at
which a root node is visited.
Inorder Traversal:
In the case of inorder traversal, the root of each subtree is visited after its left subtree
has been traversed but before the traversal of its right subtree begins. The steps for
traversing a binary tree in inorder traversal are:
1. Visit the left subtree, using inorder.
2. Visit the root.
3. Visit the right subtree, using inorder.
Preorder Traversal:
In a preorder traversal, each root node is visited before its left and right subtrees are
traversed. Preorder search is also called backtracking. The steps for traversing a binary
tree in preorder traversal are:
1. Visit the root.
2. Visit the left subtree, using preorder.
3. Visit the right subtree, using preorder.
Postorder Traversal:
In a postorder traversal, each root is visited after its left and right subtrees have been
traversed. The steps for traversing a binary tree in postorder traversal are:
1. Visit the left subtree, using postorder.
2. Visit the right subtree, using postorder.
3. Visit the root.
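All three traversals are naturally recursive; a minimal C sketch (the node type name is
illustrative):

struct tnode
{
    int data;
    struct tnode *left, *right;
};

void inorder(struct tnode *t)     /* left subtree, root, right subtree */
{
    if (t == NULL) return;
    inorder(t->left);
    printf("%d ", t->data);
    inorder(t->right);
}

void preorder(struct tnode *t)    /* root, left subtree, right subtree */
{
    if (t == NULL) return;
    printf("%d ", t->data);
    preorder(t->left);
    preorder(t->right);
}

void postorder(struct tnode *t)   /* left subtree, right subtree, root */
{
    if (t == NULL) return;
    postorder(t->left);
    postorder(t->right);
    printf("%d ", t->data);
}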
In a level order traversal, the nodes are visited level by level starting from the root,
and going from left to right. The level order traversal requires a queue data structure.
So, it is not possible to develop a recursive procedure to traverse the binary tree in
level order. This is nothing but a breadth first search technique.
Algorithm for level order traversal is as follows:
void levelorder()
{
    int j;
    for(j = 0; j < ctr; j++)
    {
        if(tree[j] != NULL)
            printf("%d ", tree[j] -> data);
    }
}
Here tree[] is the array representation of the binary tree and ctr is the number of slots
in use.
Example 1:
Traverse the following binary tree in pre, post, in and level order.
(Figure: a binary tree together with its pre, post, inorder and level order traversals.)
Example 2:
Traverse the following binary tree in pre, post, in and level order.
(Figure: a second binary tree together with its traversals.)
Binary Search Tree:
A binary search tree is a binary tree with the following properties:
1. Every element has a key and no two elements have the same key.
2. The keys in the left subtree are smaller than the key in the root.
3. The keys in the right subtree are larger than the key in the root.
4. The left and right subtrees are also binary search trees.
(Figure: two binary trees rooted at 16; tree (a) is a binary search tree, while tree (b) is
not, because 17 appears in the left subtree of 16.)
struct TreeNode {
    int key;
    int value;
    struct TreeNode *left;
    struct TreeNode *right;
};
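Using the structure above, a search that follows properties 2 and 3 can be sketched as
follows (not the text's own listing):

/* Returns the node whose key matches, or NULL if the key is absent. */
struct TreeNode* bst_search(struct TreeNode *root, int key)
{
    while (root != NULL)
    {
        if (key == root->key)
            return root;
        else if (key < root->key)
            root = root->left;    /* smaller keys lie in the left subtree */
        else
            root = root->right;   /* larger keys lie in the right subtree */
    }
    return NULL;
}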
General Trees (m-ary tree):
If in a tree, the outdegree of every node is less than or equal to m, the tree is called
general tree. The general tree is also called as an m-ary tree. If the outdegree of every
node is exactly equal to m or zero then the tree is called a full or complete m-ary tree.
For m = 2, the trees are called binary and full binary trees.
The subtrees in a tree are unordered. The subtrees of each element in a binary
tree are ordered (i.e. we distinguish
between left and right subtrees).
There is a one-to-one mapping between general ordered trees and binary trees. So,
every tree can be uniquely represented by a binary tree. Furthermore, a forest can also
be represented by a binary tree.
Stage 1:
As a first step, we delete all the branches originating in every node except
the left most branch.
We draw edges from a node to the node on the right, if any, which is
situated at the same level.
Stage 2:
Once this is done then for any particular node, we choose its left and right
sons in the following manner:
The left son is the node, which is immediately below the given node,
and the right son is the node to the immediate right of the given node
on the same horizontal line. Such a binary tree will not have a right
subtree.
Example 1:
(Figure: a general tree with root 1, whose children are 2, 3, 4, 5, with further children
6 to 11 on the next level.)
Solution:
(Figure: the same tree after stage 1 and stage 2 of the transformation, giving the
equivalent binary tree.)
Huffman coding
Every piece of information in computer science is encoded as strings of 1s and 0s. The
objective of information theory is to transmit information using the fewest number of bits
in such a way that every encoding is unambiguous. This section discusses fixed-length
and variable-length encoding, along with Huffman encoding, which is the basis for many
data compression schemes.
Encoding, in computers, can be defined as the process of transmitting or storing
sequences of characters efficiently. Fixed-length and variable-length are two types of
encoding schemes, explained as follows.
Fixed-Length encoding - Every character is assigned a binary code using same
number of bits. Thus, a string like “aabacdad” can require 64 bits (8 bytes) for storage or
transmission, assuming that each character uses 8 bits.
Variable-length encoding - As opposed to fixed-length encoding, this scheme uses a
variable number of bits for encoding the characters depending on their frequency in the
given text. Thus, for a given string like “aabacdad”, the frequencies of characters ‘a’, ‘b’,
‘c’ and ‘d’ are 4, 1, 1 and 2 respectively. Since ‘a’ occurs more frequently than ‘b’, ‘c’
and ‘d’, it uses the least number of bits, followed by ‘d’, ‘b’ and ‘c’. Suppose we randomly
assign binary codes to each character as follows:
a  0
b  011
c  111
d  11
… and so on
To prevent such ambiguities during decoding, the encoding phase should satisfy
the “prefix rule”, which states that no binary code should be a prefix of another code.
This produces uniquely decodable codes. The above codes for ‘a’, ‘b’, ‘c’ and ‘d’ do
not follow the prefix rule, since the binary code for a, i.e. 0, is a prefix of the binary code
for b, i.e. 011, resulting in ambiguity during decoding.
Let's reconsider assigning the binary codes to characters ‘a’, ‘b’, ‘c’ and ‘d’:
a  0
b  11
c  101
d  100
Problem Statement-
Input: Set of symbols to be transmitted or stored along with their frequencies/
probabilities/ weights
Output: Prefix-free and variable-length binary codes with minimum expected codeword
length. Equivalently, a tree-like data structure with minimum weighted path length from
root can be used for generating the binary codes
Huffman Encoding-
Huffman Encoding can be used for finding solution to the given problem statement.
Developed by David Huffman in 1951, this technique is the basis for all data
compression and encoding schemes
It is a famous algorithm used for lossless data encoding
It follows a Greedy approach, since it deals with generating minimum length
prefix-free binary codes
It uses variable-length encoding scheme for assigning binary codes to characters
depending on how frequently they occur in the given text. The character that occurs most
frequently is assigned the smallest code and the one that occurs least frequently gets the
largest code
A Huffman tree, similar to a binary tree data structure, needs to be created having n leaf
nodes, where n is the number of distinct characters.
A priority queue is used for building the Huffman tree, such that nodes with the lowest
frequency have the highest priority. A min heap data structure can be used to implement
this priority queue.
Initially, all nodes are leaf nodes containing the character itself along with its weight/
frequency. Internal nodes, on the other hand, contain the weight and links to two child nodes.
Step II - Assigning the binary codes to each symbol by traversing Huffman tree
Generally, bit ‘0’ represents the left child and bit ‘1’ represents the right child
Step 1- Create a leaf node for each character and build a min heap using all the nodes
(The frequency value is used to compare two nodes in min heap)
Step 2- Repeat Steps 3 to 5 while heap has more than one node
Step 3- Extract two nodes, say x and y, with minimum frequency from the heap
Step 4- Create a new internal node z with x as its left child and y as its right child.
Also frequency(z)= frequency(x)+frequency(y)
Step 5- Add z to min heap
Step 6- Last node in the heap is the root of Huffman tree
Let’s try and create Huffman Tree for the following characters along with their
frequencies using the above algorithm-
Characters Frequencies
a 10
e 15
i 12
o 3
u 4
s 13
t 1
Step A- Create leaf nodes for all the characters and add them to the min heap.
Step 1- Create a leaf node for each character and build a min heap using all the nodes
Step 3- Extract two nodes, say x and y, with minimum frequency from the heap
Step 4- Create a new internal node z with x as its left child and y as its right child. Also
frequency(z)= frequency(x)+frequency(y)
Fig 2: Combining
nodes o and t
i. Extract and combine node a with the internal node having 8 as its frequency.
ii. Extract and combine node e with the internal node having 18 as its frequency.
iii. Finally, extract and combine the internal nodes having 25 and 33 as their frequencies.
Fig 7: Final Huffman tree obtained by combining internal nodes having 25 and 33
as frequency
Now, since we have only one node in the queue, the control will exit out of the loop
Step C- Since internal node with frequency 58 is the only node in the queue, it becomes
the root of Huffman tree.
Step 6- Last node in the heap is the root of Huffman tree
To assign the codes, traverse the Huffman tree from the root to each leaf, keeping an
auxiliary array of bits: add 0 to the array while traversing a left child and add 1 while
traversing a right child, and output the array when a leaf is reached. The resulting codes
are:
i 00
s 01
e 10
u 1100
t 11010
o 11011
a 111
And finally, 11010 gets decoded to ‘t’, thus returning the string “stakeout” back.
Decoding
The decoding procedure is deceptively simple. Starting with the first bit in the stream, one
uses successive bits from the stream to determine whether to go left or right in the
decoding tree; on reaching a leaf, its character is emitted and the walk restarts at the
root. ... The next bit in the input stream is the first bit of the next character.
HEAP
Heap is a data structure, which permits one to insert elements into a set and also to
find the largest element efficiently. A data structure, which provides these two
operations, is called a priority queue.
Max and Min Heap data structures:
A max heap is an almost complete binary tree such that the value of each node is
greater than or equal to those in its children.
(Figure: a max heap with root 95, children 85 and 45, and leaves 75, 25, 35, 15; and a
min heap with root 15, children 45 and 25, and leaves 55, 65, 35, 75.)
A min heap is an almost complete binary tree such that the value of each node is less
than or equal to those in its children.
Since a heap is a complete binary tree, a heap tree can be efficiently represented using
a one-dimensional array. This provides a very convenient way of figuring out where
children belong: the children of the node stored at x[i] are stored at x[2i] and x[2i+1],
and its parent is at x[i/2].
The elements of the array can be thought of as lying in a tree structure. For example, the
heap tree stored as
x[1]=65, x[2]=45, x[3]=60, x[4]=40, x[5]=25, x[6]=50, x[7]=55, x[8]=30
corresponds to a tree with root 65, children 45 and 60, and so on.
Operations on heap tree:
1. Insertion,
2. Deletion and
3. Merging.
Insertion:
This operation is used to insert a node into an existing heap tree satisfying the
properties of heap tree. Using repeated insertions of data, starting from an empty heap
tree, one can build up a heap tree.
Let us consider the heap (max) tree. The principle of insertion is that, first we have to
adjoin the data in the complete binary tree. Next, we have to compare it with the data
in its parent; if the value is greater than that at parent then interchange the values.
This continues between two nodes on the path from the newly inserted node to the root
node, till we get a parent whose value is greater than its child's, or we reach the root.
For illustration, 35 is added as the right child of 80. Its value is compared with its
parent’s value, and to be a max heap, parent’s value greater than child’s value is
satisfied, hence interchange as well as further comparisons are no more required.
As another illustration, let us consider the case of insertion 90 into the resultant heap
tree. First, 90 will be added as left child of 40, when 90 is compared with 40 it requires
interchange. Next, 90 is compared with 80, another interchange takes place. Now, our
process stops here, as 90 is now in root node. The path on which these comparisons
and interchanges have taken places are shown by dashed line.
The algorithm Max_heap_insert to insert a data item into a max heap tree is as follows:

Max_heap_insert (a, n)
{
    // inserts the value in a[n] into the heap which is stored at a[1] to a[n-1]
    int i;
    i = n;
    item = a[n];
    while ( (i > 1) and (a[i/2] < item) ) do
    {
        a[i] = a[i/2];   // move the parent down
        i = i/2;
    }
    a[i] = item;
    return true;
}
Example:
Form a heap using the above algorithm for the data: 40, 80, 35, 90, 45, 50, 70.
(Each heap below is listed in level order.)
1. Insert 40: heap = 40
2. Insert 80: 80 is exchanged with its parent 40; heap = 80, 40
3. Insert 35: heap = 80, 40, 35
4. Insert 90: 90 is exchanged with 40 and then with 80; heap = 90, 80, 35, 40
5. Insert 45: heap = 90, 80, 35, 40, 45
6. Insert 50: 50 is exchanged with 35; heap = 90, 80, 50, 40, 45, 35
7. Insert 70: 70 is exchanged with 50; heap = 90, 80, 70, 40, 45, 35, 50
Deletion of a node from heap tree:
Any node can be deleted from a heap tree, but from the application point of view,
deleting the root node has special importance. The principle of deletion is as
follows:
Replace the root node by the last node in the heap tree, then re-heap the
tree as stated below:
Let the newly modified root node be the current node. Compare its value
with the values of its two children; let X be the child whose value is the
largest, and interchange the value of X with the value of the current node.
Repeat with X as the current node until the current node is not smaller than
any of its children, or it becomes a leaf.
delmax (a, n, x)
// delete the maximum from the heap a[n] and store it in x
{
    if (n == 0) then
    {
        write (“heap is empty”);
        return false;
    }
    x = a[1];
    a[1] = a[n];
    adjust (a, 1, n-1);
    return true;
}
adjust (a, i, n)
// The complete binary trees with roots a[2*i] and a[2*i+1] are combined with a[i] to
// form a single heap, 1 <= i <= n. No node has an address greater than n or less than 1.
{
    j = 2 * i;
    item = a[i];
    while (j <= n) do
    {
        if ((j < n) and (a[j] < a[j+1])) then
            j = j + 1;            // compare left and right child; let j be the larger child
        if (item >= a[j]) then
            break;                // a position for item is found
        a[j/2] = a[j];            // move the larger child up a level
        j = 2 * j;
    }
    a[j/2] = item;
}
Here the root node is 99. The last node is 26; it is at level 3. So 99 is replaced by 26,
and the node with data 26 is removed from the tree. Next, 26 at the root is compared
with its two children 45 and 63; as 63 is greater, they are interchanged. Now 26 is
compared with its children, namely 57 and 42; as 57 is greater, they are interchanged.
Now 26 appears as a leaf node, hence re-heaping is completed.
(Figure: deleting the node with data 99; 26 moves to the root, is exchanged with 63 and
then with 57, and finally settles as a leaf, leaving 63 at the root.)
Merging two heap trees:
Consider two heap trees H1 and H2. Merging the tree H2 with H1 means including all
the nodes from H2 in H1. H2 may be a min heap or a max heap, and the resultant tree
will be a min heap if H1 is a min heap, else it will be a max heap. The merging operation
consists of two steps, repeated while H2 is not empty:
1. Delete the root node, say x, from H2.
2. Insert the node x into H1, satisfying the property of H1.
(Figure: heap trees H1 (root 92) and H2 (root 13) before merging, and the merged max
heap with root 96.)
HEAP SORT:
A heap sort algorithm works by first organizing the data to be sorted into a special type
of binary tree called a heap. Any kind of data can be sorted, either in ascending or in
descending order, using a heap tree. It does this with the following steps:
1. Build a heap tree with the given set of data.
2. a) Remove the top-most item (the largest) and replace it with the last element in
      the heap.
   b) Re-heapify the complete binary tree.
   c) Place the deleted node in the output.
3. Continue step 2 until the heap tree is empty.
Algorithm:
This algorithm sorts the elements a[n]. Heap sort rearranges them in place in non-
decreasing order. First transform the elements into a heap.
heapsort(a, n)
{
    heapify(a, n);
    for i = n to 2 by -1 do
    {
        temp = a[i];
        a[i] = a[1];
        a[1] = temp;
        adjust (a, 1, i-1);
    }
}

heapify (a, n)
// Readjust the elements in a[n] to form a heap.
{
    for i = n/2 to 1 by -1 do
        adjust (a, i, n);
}
adjust (a, i, n) is as defined above, under deletion of a node from a heap tree.
TimeComplexity:
Each of the ‘n’ insertion operations takes O(log k), where ‘k’ is the number of elements in the
heap at the time. Likewise, each of the ‘n’ remove operations also runs in time O(log
k), where ‘k’ is the number of elements in the heap at the time.
Since we always have k ≤ n, each such operation runs in O(log n) time in the worst
case.
Thus, for ‘n’ elements it takes O(n log n) time, so the priority queue sorting algorithm
runs in O(n log n) time when we use a heap to implement the priority queue.
Example 1:
Form a heap from the set of elements (40, 80, 35, 90, 45, 50, 70) and sort the data
using heap sort.
Solution:
First form a heap tree from the given set of data and then sort by repeated deletion
operation:
(Figure: the complete binary tree for the data and the max heap obtained from it; in
level order the heap is 90; 80, 70; 40, 45, 50, 35.)
1. Exchange root 90 with the last element 35 of the array and re-heapify:
   heap = 80; 45, 70; 40, 35, 50   sorted tail: 90
2. Exchange root 80 with the last element 50 of the array and re-heapify:
   heap = 70; 45, 50; 40, 35   sorted tail: 80 90
3. Exchange root 70 with the last element 35 of the array and re-heapify:
   heap = 50; 45, 35; 40   sorted tail: 70 80 90
4. Exchange root 50 with the last element 40 of the array and re-heapify:
   heap = 45; 40, 35   sorted tail: 50 70 80 90
5. Exchange root 45 with the last element 35 of the array and re-heapify:
   heap = 40; 35   sorted tail: 45 50 70 80 90
6. Exchange root 40 with the last element 35 of the array and re-heapify:
   heap = 35   sorted tail: 40 45 50 70 80 90
The sorted elements are 35, 40, 45, 50, 70, 80, 90.
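The pseudocode above can be rendered in C so that the driver program below compiles;
a sketch using the same 1-based indexing:

void adjust(int a[], int i, int n)      /* sift a[i] down into the heap a[1..n] */
{
    int j = 2 * i;
    int item = a[i];
    while (j <= n)
    {
        if (j < n && a[j] < a[j + 1])
            j = j + 1;                  /* let j index the larger child */
        if (item >= a[j])
            break;                      /* a position for item is found */
        a[j / 2] = a[j];                /* move the larger child up a level */
        j = 2 * j;
    }
    a[j / 2] = item;
}

void heapsort(int a[], int n)
{
    int i, temp;
    for (i = n / 2; i >= 1; i--)        /* heapify: build the initial max heap */
        adjust(a, i, n);
    for (i = n; i >= 2; i--)            /* repeatedly move the maximum to the end */
    {
        temp = a[i]; a[i] = a[1]; a[1] = temp;
        adjust(a, 1, i - 1);
    }
}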
void main()
{
    int i, n, a[20];
    clrscr();
    printf("\n How many elements you want: ");
    scanf("%d", &n);
    printf("Enter %d elements: ", n);
    for (i = 1; i <= n; i++)
        scanf("%d", &a[i]);
    heapsort(a, n);
    printf("\n The sorted elements are: \n");
    for (i = 1; i <= n; i++)
        printf("%5d", a[i]);
    getch();
}
Priority queue implementation using heap tree:
A priority queue can be implemented using a circular array, a linked list, etc. Another
simplified implementation is possible using a heap tree; the heap, in turn, can be
represented using an array. This implementation is therefore free from the complexities
of circular arrays and linked lists while keeping the simplicity of an array.
Heap trees allow duplicate data in them. Elements are stored together with their priority
values in the form of a heap tree, which can be formed based on their priority values.
The top-priority element that has to be processed first is at the root, so it can be deleted
and the heap rebuilt to get the next element to be processed, and so on. As an
illustration, consider the following processes with their priorities:
Process P1 P2 P3 P4 P5 P6 P7 P8 P9 P10
Priority 5 4 3 4 5 5 3 2 1 5
These processes enter the system in the order as listed above at time 0, say. Assume
that a process having higher priority value will be serviced first. The heap tree can be
formed considering the process priority values. The order of servicing the process is
successive deletion of roots from the heap.
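As a sketch of this idea, a max-heap based priority queue storing priority values in a 1-based array might look as follows in C (illustrative names; it reuses the adjust() routine shown with heap sort above):

#define MAXSIZE 100

int heap[MAXSIZE + 1];   /* heap[1..count] holds the priority values */
int count = 0;

/* Insert a priority value: place it at the end and sift it up. */
void pq_insert(int value)
{
    int i = ++count;
    while (i > 1 && heap[i / 2] < value)
    {
        heap[i] = heap[i / 2];     /* move the smaller parent down */
        i = i / 2;
    }
    heap[i] = value;
}

/* Remove and return the highest-priority value (the root). */
int pq_delete(void)
{
    int top = heap[1];
    heap[1] = heap[count--];       /* move the last element to the root */
    adjust(heap, 1, count);        /* sift it down, as defined above */
    return top;
}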
Inorder traversal of a binary tree can be done either using recursion or with the use of an
auxiliary stack. The idea of threaded binary trees is to make inorder traversal faster and
do it without a stack and without recursion. A binary tree is made threaded by making all
right child pointers that would normally be NULL point to the inorder successor of the
node.
There are two types of threaded binary trees.
Single Threaded: each NULL right pointer is made to point to the inorder
successor (if the successor exists).
Double Threaded: both left and right NULL pointers are made to point to the
inorder predecessor and inorder successor respectively. The predecessor
threads are useful for reverse inorder traversal and postorder traversal.
The threads are also useful for fast accessing ancestors of a node.
The following diagram shows an example of a single threaded binary tree. The dotted lines
represent threads.
// Requires <stdio.h> and <stdbool.h>. Node of a single threaded binary tree:
struct Node
{
    int data;
    struct Node *left, *right;
    bool rightThread;   // true if 'right' is a thread to the inorder successor
};
// Returns the leftmost node under n
struct Node *leftmost(struct Node *n)
{
    if (n == NULL)
        return NULL;
    while (n->left != NULL)
        n = n->left;
    return n;
}
// Inorder traversal using threads: no stack and no recursion
void inorder(struct Node *root)
{
    struct Node *cur = leftmost(root);
    while (cur != NULL)
    {
        printf("%d ", cur->data);
        if (cur->rightThread)            // thread leads to the inorder successor
            cur = cur->right;
        else                             // else the successor is the leftmost
            cur = leftmost(cur->right);  // node of the right subtree
    }
}
Their idea is to replace the null links by pointers called Threads to other nodes in the
tree.
If the RCHILD(p) is normally equal to zero, we will replace it by a pointer to the node
which would be printed after P when traversing the tree in inorder.
A null LCHILD link at node P is replaced by a pointer to the node which immediately
precedes node P in inorder. For example, Let us consider the tree:
[Figure: a binary tree with root A; B and C are the children of A; D and E the children
of B; F and G the children of C; H and I the children of D. The same tree is shown again
with its null links replaced by threads.]
The tree has 9 nodes and 10 null links which have been replaced by Threads. If we
traverse T in inorder the nodes will be visited in the order H D I B E A F C G.
For example, node 'E' has a predecessor thread which points to 'B' and a successor
thread which points to 'A'. In the memory representation, threads and normal pointers
are distinguished by adding two extra one-bit fields, LBIT and RBIT.
Expression Trees:
Expression tree is a binary tree, because all of the operations are binary. It is also
possible for a node to have only one child, as is the case with the unary minus
operator. The leaves of an expression tree are operands, such as constants or variable
names, and the other (non leaf) nodes contain operators.
Figure 5.4.1 shows some more expression trees that represent arithmetic expressions
given in infix form.
[Figure 5.4.1: expression trees for several arithmetic expressions given in infix form.]
An expression tree can be generated from an infix or a postfix expression.
Example 1:
Construct an expression tree for the postfix expression: a b + c d e + * *
Solution:
The first two symbols are operands, so we create one-node trees and push pointers to
them onto a stack.
Stack: a, b (two one-node trees).
Next, a ‘+’ is read, so two pointers to trees are popped, a new tree is formed, and a
pointer to it is pushed onto the stack.
Stack: (a + b).
Next, c, d, and e are read, and for each one–node tree is created and a pointer to the
corresponding tree is pushed onto the stack.
Stack: (a + b), c, d, e.
Next, a '+' is read, so d and e are popped, a new tree with '+' as root and d, e as
children is formed, and a pointer to it is pushed onto the stack.
Stack: (a + b), c, (d + e).
Continuing, a ‘*’ is read, so we pop two tree pointers and form a new tree with a ‘*’ as
root.
Stack: (a + b), (c * (d + e)).
Finally, the last symbol is read, two trees are merged, and a pointer to the final tree is
left on the stack.
Final tree: '*' at the root, with the tree for (a + b) as the left subtree and the tree
for c * (d + e) as the right subtree.
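A compact C sketch of this stack-based construction is shown below (illustrative; it assumes single-character operands and operators, and the helper name build is hypothetical):

#include <stdlib.h>
#include <ctype.h>

struct tnode { char data; struct tnode *left, *right; };

/* Build an expression tree from a postfix string such as "ab+cde+**". */
struct tnode *build(const char *postfix)
{
    struct tnode *stack[50];
    int top = -1;
    for (; *postfix; postfix++)
    {
        struct tnode *p = malloc(sizeof *p);
        p->data = *postfix;
        p->left = p->right = NULL;
        if (!isalnum((unsigned char)*postfix))  /* operator: pop two subtrees */
        {
            p->right = stack[top--];
            p->left = stack[top--];
        }
        stack[++top] = p;                       /* push the (sub)tree */
    }
    return stack[top];                          /* pointer to the final tree */
}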
Algorithm
An algorithm is a finite sequence of instructions, each of which has a clear meaning and can be performed with a finite
amount of effort in a finite length of time. No matter what the input values may be, an algorithm terminates after
executing a finite number of instructions. In addition, every algorithm must satisfy the following criteria:
Input: there are zero or more quantities, which are externally supplied;
Finiteness: if we trace out the instructions of an algorithm, then for all cases the algorithm will
terminate after a finite number of steps;
Effectiveness: every instruction must be sufficiently basic that it can in principle be carried out
by a person using only pencil and paper. It is not enough that each operation be definite, but it
must also be feasible.
Choosing an efficient algorithm or data structure is just one part of the design process. Next, we will look at some design
issues that are broader in scope. There are some basic design goals that we should strive for in a program:
Performance of a program:
The performance of a program is the amount of computer memory and time needed to run a program. We use two
approaches to determine the performance of a program. One is analytical, and the other experimental. In performance
analysis we use analytical methods, while in performance measurement we conduct experiments.
Time Complexity:
The time needed by an algorithm expressed as a function of the size of a problem is called the TIME COMPLEXITY of
the algorithm. The time complexity of a program is the amount of computer time it needs to run to completion.
The limiting behavior of the complexity as size increases is called the asymptotic time complexity. It is the asymptotic
complexity of an algorithm, which ultimately determines the size of problems that can be solved by the algorithm.
Space Complexity:
The space complexity of a program is the amount of memory it needs to run to completion. The space needed by a
program has the following components:
Instruction space: Instruction space is the space needed to store the compiled version of the program instructions.
Data space: Data space is the space needed to store all constant and variable values. Data space has two components:
Environment stack space: The environment stack is used to save information needed to resume execution of partially
completed functions.
Instruction space: The amount of instruction space that is needed depends on factors such as the compiler used to compile the program into machine code and the target computer.
Classification of Algorithms
If ‘n’ is the number of data items to be processed or degree of polynomial or the size of the file to be
sorted or searched or the number of nodes in a graph etc.
1        Most instructions of most programs are executed once or at most only a few times. If all
         the instructions of a program have this property, we say that its running time is a
         constant.
log n    When the running time of a program is logarithmic, the program gets slightly slower as n
         grows. This running time commonly occurs in programs that solve a big problem by
         transforming it into a smaller problem, cutting the size by some constant fraction.
         Whenever n doubles, log n increases by a constant, but log n does not double until n
         increases to n².
n When the running time of a program is linear, it is generally the case that a small amount
of processing is done on each input element. This is the optimal situation for an algorithm
that must process n inputs.
n log n  This running time arises for algorithms that solve a problem by breaking it up into smaller
         sub-problems, solving them independently, and then combining the solutions. When n
         doubles, the running time more than doubles.
n²       When the running time of an algorithm is quadratic, it is practical for use only on
         relatively small problems. Quadratic running times typically arise in algorithms that
         process all pairs of data items (perhaps in a double nested loop). Whenever n doubles, the
         running time increases fourfold.
n³       Similarly, an algorithm that processes triples of data items (perhaps in a triple-nested loop)
         has a cubic running time and is practical for use only on small problems. Whenever n
         doubles, the running time increases eightfold.
2ⁿ       Few algorithms with exponential running time are likely to be appropriate for practical
         use; such algorithms arise naturally as "brute-force" solutions to problems. Whenever n
         doubles, the running time squares.
Complexity of Algorithms
The complexity of an algorithm M is the function f(n) which gives the running time and/or storage space requirement
of the algorithm in terms of the size ‘n’ of the input data. Mostly, the storage space required by an algorithm is simply a
multiple of the data size ‘n’. Complexity shall refer to the running time of the algorithm.
The function f(n), gives the running time of an algorithm, depends not only on the size ‘n’ of the input data but also on
the particular data. The complexity function f(n) for certain cases are:
1. Best Case : The minimum possible value of f(n) is called the best case.
2. Average Case : The expected value of f(n).
3. Worst Case : The maximum value of f(n) for any possible input.
The field of computer science, which studies efficiency of algorithms, is known as analysis of
algorithms.
One way to compare the function f(n) with standard functions is to use the functional 'O'
notation. Suppose f(n) and g(n) are functions defined on the positive integers with the property that
f(n) is bounded by some multiple of g(n) for almost all n. Then,
f(n) = O(g(n))
which is read as "f(n) is of order g(n)". For example:
Suppose the first program takes 100n² milliseconds while the second takes 5n³ milliseconds.
Is the 5n³ program better than the 100n² program?
The programs can be evaluated by comparing their running-time functions, with constants of
proportionality neglected. Since
5n³ / 100n² = n/20,
for inputs n < 20 the program with running time 5n³ will be faster than the one with running time
100n². Therefore, if the program is to be run mainly on inputs of small size, we would indeed prefer
the program whose running time is O(n³).
However, as n gets large, the ratio of the running times, which is n/20, gets arbitrarily large. Thus, as
the size of the input increases, the O(n³) program will take significantly more time than the O(n²)
program. So it is always better to prefer a program whose running time has the lower growth rate.
Low-growth-rate functions such as O(n) or O(n log n) are always better.
Following are some problems that are solved using the divide and conquer approach.
Naïve Method
Naïve method is a basic method to solve any problem. In this method, the maximum
and minimum number can be found separately. To find the maximum and minimum
numbers, the following straightforward algorithm can be used.
Algorithm: Max-Min-Element (numbers[])
max := numbers[1]
min := numbers[1]
for i = 2 to n do
if numbers[i] > max then
max := numbers[i]
if numbers[i] < min then
min := numbers[i]
return (max, min)
Analysis
The number of comparisons in the Naïve method is 2n − 2.
The number of comparisons can be reduced using the divide and conquer approach.
Following is the technique.
In this approach, the array is divided into two halves. Then, using the recursive approach,
the maximum and minimum numbers in each half are found. Finally, the maximum of the two
maxima and the minimum of the two minima are returned.
In the given problem, the number of elements in the array is y − x + 1, where y is
greater than or equal to x.
Max-Min(x, y) will return the maximum and minimum values of the array numbers[x...y].
Analysis
Let T(n) be the number of comparisons made by Max-Min(x, y), where the number of
elements is n = y − x + 1. The recurrence relation is:
T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 2    for n > 2
T(n) = 1                          for n = 2
T(n) = 0                          for n = 1
Let us assume that n is a power of 2, i.e. n = 2^k, where k is the height of the
recursion tree.
So,
T(n) = 2·T(n/2) + 2 = 2·(2·T(n/4) + 2) + 2 = ... = (3n/2) − 2
Compared to Naïve method, in divide and conquer approach, the number of
comparisons is less. However, using the asymptotic notation both of the approaches are
represented by O(n).
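A C sketch of the recursive Max-Min approach described above might look like this (the function and parameter names are illustrative):

/* Divide and conquer maximum and minimum of numbers[x..y]. */
void MaxMin(int numbers[], int x, int y, int *max, int *min)
{
    if (x == y)                                   /* one element */
    {
        *max = *min = numbers[x];
    }
    else if (y == x + 1)                          /* two elements: one comparison */
    {
        if (numbers[x] > numbers[y]) { *max = numbers[x]; *min = numbers[y]; }
        else                         { *max = numbers[y]; *min = numbers[x]; }
    }
    else
    {
        int max1, min1, max2, min2, mid = (x + y) / 2;
        MaxMin(numbers, x, mid, &max1, &min1);    /* solve each half */
        MaxMin(numbers, mid + 1, y, &max2, &min2);
        *max = (max1 > max2) ? max1 : max2;       /* combine: two comparisons */
        *min = (min1 < min2) ? min1 : min2;
    }
}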
Search Techniques
Searching is used to find the location where an element is available. There are two
types of search techniques:
1. Linear (sequential) search
2. Binary search
Sorting techniques include:
1. Bubble sort
2. Quick sort
3. Selection sort and
4. Heap sort
These fall into two broad categories:
1. Internal sorting
2. External sorting
If all the elements to be sorted are present in the main memory, then such sorting is
called internal sorting. On the other hand, if some of the elements to be sorted are
kept on secondary storage, it is called external sorting. Here we study only
internal sorting techniques.
Linear Search:
Suppose there are 'n' elements organized sequentially in a list. The number of
comparisons required to retrieve an element from the list depends purely on where the
element is stored in the list. If it is the first element, one comparison will do; if it is
the second element, two comparisons are necessary, and so on. On average you need
[(n+1)/2] comparisons to search for an element. If the search is not successful, you
need 'n' comparisons.
Algorithm:
Let array a[n] stores n elements. Determine whether element ‘x’ is present or not.
linsrch(a[n], x)
{
index = 0;
flag = 0;
while (index < n) do
{
if (x == a[index])
{
flag = 1;
break;
}
index ++;
}
if(flag == 1)
printf(“Data found at %d position“, index);
else
printf("Data not found");
}
Example 1:
Suppose we have the following unsorted list: 45, 39, 8, 54, 77, 38, 24, 16, 4, 7, 9, 20
For any element not in the list, we’ll look at 12 elements before failure.
for (i = 0; i < n; i++)
{
    if (number[i] == data)
    {
        flag = 1;
        break;
    }
}
if (flag == 1)
    printf("\n Data found at location: %d", i + 1);
else
    printf("\n Data not found");
BINARY SEARCH
If we have 'n' records ordered by key so that x1 < x2 < … < xn, and we are given an
element 'x', binary search is used to find the corresponding element in the list. If 'x' is
present, we determine a value 'j' such that a[j] = x (successful search). If 'x' is not in
the list, then j is set to zero (unsuccessful search).
In Binary search we jump into the middle of the file, where we find key a[mid], and
compare ‘x’ with a[mid]. If x = a[mid] then the desired record has been found.
If x < a[mid] then ‘x’ must be in that portion of the file that precedes a[mid]. Similarly,
if a[mid] > x, then further search is only necessary in that part of the file which follows
a[mid].
If we use recursive procedure of finding the middle key a[mid] of the un-searched
portion of a file, then every un-successful comparison of ‘x’ with a[mid] will eliminate
roughly half the un-searched portion from consideration.
Since the array size is roughly halved after each comparison between 'x' and a[mid],
and since an array of length 'n' can be halved only about log₂ n times before reaching a
trivial length, the worst case complexity of binary search is about log₂ n.
Algorithm:
Given an array a[n] of elements in increasing order, n ≥ 0, determine whether 'x' is
present, and if so, set j such that x = a[j]; else return 0.
binsrch(a[], n, x)
{
low = 1; high = n;
while (low <= high) do
{
mid = (low + high)/2
if (x < a[mid])
high = mid – 1;
else if (x > a[mid])
low = mid + 1;
else return mid;
}
return 0;
}
low and high are integer variables such that each time through the loop either ‘x’ is
found or low is increased by at least one or high is decreased by at least one. Thus we
have two sequences of integers approaching each other and eventually low will become
greater than high causing termination in a finite number of steps if ‘x’ is not present.
Example 1:
For a binary search to work, it is mandatory for the target array to be sorted. We shall
learn the process of binary search with a pictorial example. Assume the following sorted
10-element array, indexed 0 to 9 (this particular array is an assumption, chosen to match
the steps below):
10 14 19 26 27 31 33 35 42 44
Suppose we need to search for the location of value 31 using binary search.
First we find the mid of the array: 0 + (9 − 0) / 2 = 4 (integer value of 4.5). So, 4 is
the mid of the array.
Now we compare the value stored at location 4, with the value being searched, i.e. 31.
We find that the value at location 4 is 27, which is not a match. As the value is greater
than 27 and we have a sorted array, so we also know that the target value must be in
the upper portion of the array.
We change our low to mid + 1 and find the new mid value again.
low = mid + 1
Our new mid is 7 now. We compare the value stored at location 7 with our target value
31.
The value stored at location 7 is not a match; rather, it is more than what we are looking
for. So, the value must be in the lower part from this location. We therefore set
high = mid − 1, compute the new mid as 5, and find that the value at location 5 is 31, a
match. The search terminates successfully.
Time Complexity:
The time complexity of binary search in a successful search is O(log n) and for an
unsuccessful search is O(log n).
Merge sort
Merge Sort is a Divide and Conquer algorithm. It is a simple sort that performs the following
steps: divide the array into two halves, sort each half recursively, and merge the two sorted
halves.
We know that merge sort first divides the whole array iteratively into equal halves until
atomic (single-element) values are reached. We see here that an array of 8 items is divided
into two arrays of size 4.
This does not change the sequence of appearance of items in the original. Now we
divide these two arrays into halves.
We further divide these arrays and we achieve atomic value which can no more be
divided.
Now, we combine them in exactly the same manner as they were broken down. Please
note the color codes given to these lists.
We first compare the element for each list and then combine them into another list in a
sorted manner. We see that 14 and 33 are in sorted positions. We compare 27 and 10
and in the target list of 2 values we put 10 first, followed by 27. We change the order of
19 and 35 whereas 42 and 44 are placed sequentially.
In the next iteration of the combining phase, we compare lists of two data values and
merge them into a list of four data values, placing all in sorted order.
After the final merging, the list should look like this:
10 14 19 27 33 35 42 44
MergeSort(arr[], l, r)
If r > l
1. Find the middle point to divide the array into two halves:
middle m = (l+r)/2
2. Call mergeSort for first half:
Call mergeSort(arr, l, m)
3. Call mergeSort for second half:
Call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in step 2 and 3:
Call merge(arr, l, m, r)
Merge sort is quite fast, and has a time complexity of O(n log n).
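A minimal C sketch of merge sort following the pseudocode above (illustrative; the merge() helper assumes arrays of at most 100 elements):

/* Merge the sorted runs arr[l..m] and arr[m+1..r]. */
void merge(int arr[], int l, int m, int r)
{
    int temp[100];                  /* assumes at most 100 elements */
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r)        /* pick the smaller head element */
        temp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= m) temp[k++] = arr[i++];
    while (j <= r) temp[k++] = arr[j++];
    for (k = 0, i = l; i <= r; i++, k++)
        arr[i] = temp[k];           /* copy the merged run back */
}

void mergeSort(int arr[], int l, int r)
{
    if (l < r)
    {
        int m = (l + r) / 2;        /* divide */
        mergeSort(arr, l, m);       /* conquer each half */
        mergeSort(arr, m + 1, r);
        merge(arr, l, m, r);        /* combine */
    }
}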
Selection Sort
The Selection sort algorithm is based on the idea of finding the minimum or maximum element in an unsorted
array and then putting it in its correct position in a sorted array.
The minimum element in the array i.e. 2 is searched for and then swapped with the element that is currently
located at the first position, i.e. 7. Now the minimum element in the remaining unsorted array is searched for
and put in the second position, and so on.
To find the minimum element from the array of N elements, N−1 comparisons are required. After putting the
minimum element in its proper position, the size of an unsorted array reduces to N−1 and then N−2 comparisons
are required to find the minimum in the unsorted array.
Therefore, (N−1) + (N−2) + ... + 1 = N(N−1)/2 comparisons and N swaps result in an overall complexity
of O(N²).
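A minimal C sketch of selection sort as described above (illustrative names):

/* Selection sort of a[0..n-1]. */
void selectionSort(int a[], int n)
{
    int i, j, minimum, t;
    for (i = 0; i < n - 1; i++)
    {
        minimum = i;                    /* index of the smallest element */
        for (j = i + 1; j < n; j++)     /* in the unsorted part a[i..n-1] */
            if (a[j] < a[minimum])
                minimum = j;
        t = a[i]; a[i] = a[minimum]; a[minimum] = t;   /* put it in place */
    }
}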
Introduction
Strassen showed in 1969 how the product of two matrices can be computed faster than by the
brute-force algorithm. Using the divide and conquer technique, the overall complexity of
multiplying two matrices is reduced. This happens by decreasing the total number of
multiplications performed, at the expense of a slight increase in the number of additions.
For multiplying two 2*2 matrices, Strassen's method uses formulas with seven multiplications
and eighteen additions and subtractions, whereas the brute-force algorithm uses eight
multiplications and four additions. The utility of Strassen's formulas is shown by their
asymptotic superiority as the order n of the matrix grows toward infinity. Let us consider two
matrices A and B of dimension n*n, where n is a power of two. We can partition A, B and their
product C into four n/2*n/2 submatrices each. C is the resultant matrix of A and B.
1. Divide the matrices of order n*n recursively till we get matrices of order 2*2.
2. Use the set of formulas below to carry out the 2*2 matrix multiplications.
3. In this, seven multiplications and eighteen additions/subtractions are performed.
4. Combine the results to find the final product matrix.
Formulas for Strassen's matrix multiplication
In Strassen's matrix multiplication there are seven multiplications and eighteen additions/subtractions in
total.
C11 = d1 + d4 – d5 + d7
C12 = d3 + d5
C21 = d2 + d4
C22 = d1 + d3 – d2 + d6
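For reference, the seven products d1 through d7 used in these formulas are Strassen's standard products; they correspond to the recursive calls in the algorithm below:
d1 = (a11 + a22) · (b11 + b22)
d2 = (a21 + a22) · b11
d3 = a11 · (b12 − b22)
d4 = a22 · (b21 − b11)
d5 = (a11 + a12) · b22
d6 = (a21 − a11) · (b11 + b12)
d7 = (a12 − a22) · (b21 + b22)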
Algorithm Strassen(n, a, b, d)
begin
If n = threshold then compute
C = a * b using the conventional matrix multiplication method.
Else
Partition a into four sub matrices a11, a12, a21, a22.
Partition b into four sub matrices b11, b12, b21, b22.
Strassen ( n/2, a11 + a22, b11 + b22, d1)
Strassen ( n/2, a21 + a22, b11, d2)
Strassen ( n/2, a11, b12 – b22, d3)
Strassen ( n/2, a22, b21 – b11, d4)
Strassen ( n/2, a11 + a12, b22, d5)
Strassen (n/2, a21 – a11, b11 + b12, d6)
Strassen (n/2, a12 – a22, b21 + b22, d7)
C = | d1+d4−d5+d7    d3+d5        |
    | d2+d4          d1+d3−d2+d6  |
end if
return (C)
end.
UNIT-V
Greedy Method
Most networking algorithms use the greedy approach. Here is a list of few of them −
Knapsack Problem
Job Scheduling Problem with deadlines
Optimal Storage on Tapes
Optimal Merge Pattern
The Knapsack problem
The Knapsack problem is both tricky and interesting. For the sake of completeness, the problem
statement is as follows:
Problem:
Given a knapsack of maximum capacity W and N items, each with its own value and weight, place items
in the knapsack such that the final contents have the maximum value.
Here is the general way the problem is explained. Consider a thief who gets into a home to rob and
carries a knapsack. There are a fixed number of items in the home, each with its own weight and value:
jewellery, with little weight and the highest value, versus tables, with little value but a lot of
weight. To add fuel to the fire, the thief has an old knapsack which has limited capacity. Obviously,
he can't split the table into half or the jewellery into three-quarters. He either takes an item or
leaves it.
Example :
A cursory look at the example data tells us that the max value that we could accommodate with the limit of max
weight of 10 is 50 + 40 = 90 with a weight of 7.
Algorithm:
o Assume knapsack holds weight W and items have value vi and weight wi
o Rank items by value/weight ratio: vi / wi
Thus: vi / wi ≥ vj / wj, for all i ≤ j
o Consider items in order of decreasing ratio
o Take as much of each item as possible
Code:
-- Assumes value and weight arrays are sorted by vi/wi
Fractional-Knapsack(v, w, W)
load := 0
i := 1
while load < W and i ≤ n loop
if wi ≤ W - load then
take all of item i
else
take (W-load) / wi of item i
end if
add weight of what was taken to load
i := i + 1
end loop
return load
Item A B C D
Value 50 140 60 60
Size 5 20 10 12
Ratio 10 7 6 5
Solution (with knapsack capacity W = 30):
o All of A, all of B, and ((30−25)/10) of C (and none of D)
o Size: 5 + 20 + 10·(5/10) = 30
o Value: 50 + 140 + 60·(5/10) = 190 + 30 = 220
In 0-1 Knapsack, items cannot be broken, which means the thief should take an item as
a whole or leave it. This is the reason behind calling it 0-1 Knapsack.
Hence, in case of 0-1 Knapsack, the value of xi can be either 0 or 1, where other
constraints remain the same.
0-1 Knapsack cannot be solved by the Greedy approach. The Greedy approach does not
ensure an optimal solution here, although in some instances it may happen to give one.
The following examples will establish our statement.
Example-1
Let us consider that the capacity of the knapsack is W = 25 and the items are as shown
in the following table.
Item A B C D
Profit 24 18 18 10
Weight 24 10 10 7
Without considering the profit per unit weight (pi/wi), if we apply Greedy approach to
solve this problem, first item A will be selected as it will contribute maximum profit
among all the elements.
After selecting item A, no more items can be selected. Hence, for this given set of items
the total profit is 24, whereas the optimal solution can be achieved by selecting
items B and C, where the total profit is 18 + 18 = 36.
Dynamic-Programming Approach
Let i be the highest-numbered item in an optimal solution S for capacity W. Then S' = S −
{i} is an optimal solution for capacity W − wi, and the value of the solution S is vi plus the
value of the sub-problem.
We can express this fact in the following formula: define c[i, w] to be the solution for
items 1,2, … , i and the maximum weight w.
The algorithm takes the following inputs
The maximum weight W
The number of items n
The two sequences v = <v1, v2, …, vn> and w = <w1, w2, …, wn>
Dynamic-0-1-knapsack (v, w, n, W)
for w = 0 to W do
c[0, w] = 0
for i = 1 to n do
c[i, 0] = 0
for w = 1 to W do
if wi ≤ w then
if vi + c[i-1, w-wi] > c[i-1, w] then
c[i, w] = vi + c[i-1, w-wi]
else c[i, w] = c[i-1, w]
else
c[i, w] = c[i-1, w]
The set of items to take can be deduced from the table, starting at c[n, w] and tracing
backwards where the optimal values came from.
If c[i, w] = c[i-1, w], then item i is not part of the solution, and we continue tracing
with c[i-1, w]. Otherwise, item i is part of the solution, and we continue tracing with
c[i-1, w-wi].
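A C sketch of this dynamic programming recurrence (illustrative; the array bounds are assumptions chosen for the sketch):

/* 0/1 knapsack by dynamic programming.
   v[1..n] are values, w[1..n] are weights, W is the capacity. */
int knapsack(int v[], int w[], int n, int W)
{
    int c[20][100];                 /* assumes n < 20 and W < 100 */
    int i, j;
    for (j = 0; j <= W; j++)
        c[0][j] = 0;                /* no items: value 0 */
    for (i = 1; i <= n; i++)
    {
        c[i][0] = 0;
        for (j = 1; j <= W; j++)
        {
            if (w[i] <= j && v[i] + c[i-1][j - w[i]] > c[i-1][j])
                c[i][j] = v[i] + c[i-1][j - w[i]];   /* take item i */
            else
                c[i][j] = c[i-1][j];                 /* leave item i */
        }
    }
    return c[n][W];
}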
Greedy algorithm for job sequencing with deadlines (jobs sorted in non-increasing order of profit; d[i] is the deadline of job i and J holds the accepted jobs):
d[0] := J[0] := 0; J[1] := 1; k := 1;
for i := 2 to n do
{
    r := k;
    while (d[J[r]] > d[i]) and (d[J[r]] != r) do r := r - 1;
    if (d[J[r]] <= d[i]) and (d[i] > r) then
    {
        for l := k to r + 1 by -1 do J[l + 1] := J[l];
        J[r + 1] := i; k := k + 1;
    }
}
Example: consider the following set of jobs with deadlines and profits:
Job:       J1   J2   J3   J4   J5
Deadline:   2    1    3    2    1
Profit:    60  100   20   40   20
Solution
To solve this problem, the given jobs are sorted according to their profit in a
descending order.
Hence, after sorting, the jobs are ordered as shown in the following table.
Job:       J2   J1   J4   J3   J5
Deadline:   1    2    2    3    1
Profit:   100   60   40   20   20
From this set of jobs, first we select J2, as it can be completed within its deadline and
contributes maximum profit.
Optimal Storage on Tapes
Example: let n = 3 and the program lengths be (l1, l2, l3) = (8, 12, 2). The possible
orderings and their total retrieval times are:
Ordering    Total retrieval time       Cost
1, 2, 3     8 + (8+12) + (8+12+2)      50
1, 3, 2     8 + (8+2) + (8+2+12)       40
2, 1, 3     12 + (12+8) + (12+8+2)     54
2, 3, 1     12 + (12+2) + (12+2+8)     48
3, 1, 2     2 + (2+8) + (2+8+12)       34
3, 2, 1     2 + (2+12) + (2+12+8)      38
The optimal ordering is 3, 1, 2.
Greedy solution: store the programs in non-decreasing order of their lengths; this
ordering (here 3, 1, 2) minimizes the total retrieval time.
Optimal Merge Pattern
Optimal merge pattern is a pattern that relates to the merging of two or more
sorted files in a single sorted file. This type of merging can be done by the two-
way merging method.
If we have two sorted files containing n and m records respectively then they
could be merged together, to obtain one sorted file in time O (n+m).
There are many ways in which pairwise merge can be done to get a single sorted
file. Different pairings require a different amount of computing time.
The aim is to choose the pairwise merges of the n sorted files so that the total number
of comparisons is minimized.
Algorithm Tree(n)
// list is a global list of n single-node binary trees, one per file.
{
    for i := 1 to n - 1 do
    {
        pt := new treenode;                    // get a new tree node
        (pt -> lchild) := Least(list);         // merge the two trees with
        (pt -> rchild) := Least(list);         // the smallest weights
        (pt -> weight) := ((pt -> lchild) -> weight) + ((pt -> rchild) -> weight);
        Insert(list, pt);
    }
    return Least(list);                        // tree of the optimal merge pattern
}
Example: given a set of unsorted files with lengths 5, 3, 2, 7, 9, 13, first arrange
these elements in ascending order: 2, 3, 5, 7, 9, 13. After this, pick the two smallest
numbers, merge them, insert the result back into the list, and repeat until only one
number is left; for instance, 2 and 3 are merged first (insert 5), then the two 5s
(insert 10), and so on.
A spanning tree for a connected graph is a tree whose vertex set is the same as the
vertex set of the given graph, and whose edge set is a subset of the edge set of the
given graph. i.e., any connected graph will have a spanning tree.
Weight of a spanning tree w(T) is the sum of weights of all edges in T. Minimum
spanning tree (MST) is a spanning tree with the smallest possible weight.
Example:
[Figure: a weighted graph G, and three (of many possible) spanning trees from graph G.]
A minimum spanning tree can be constructed using either of the following two algorithms:
1. Kruskal's algorithm
2. Prim's algorithm
Both algorithms differ in their methodology, but both eventually end up with the MST. Kruskal's
algorithm uses edges, and Prim's algorithm uses vertex connections in determining the MST. At any
intermediate stage Prim's algorithm maintains a single tree, whereas Kruskal's intermediate result
may or may not be a tree (in general, it is a forest).
Kruskal’s Algorithm
This is a greedy algorithm. A greedy algorithm chooses a local optimum at each step
(here, picking an available edge with the least weight).
Kruskal's algorithm works as follows: Take a graph with 'n' vertices, keep on adding the
shortest (least cost) edge, while avoiding the creation of cycles, until (n - 1) edges
have been added. Sometimes two or more edges may have the same cost.
The order in which the edges are chosen, in this case, does not matter. Different MST’s
may result, but they will all have the same total cost, which will always be the
minimum cost.
The algorithm proceeds as follows:
1. Sort the edges of E in order of non-decreasing cost, and initialize T to be empty.
2. Repeat steps 3, 4 and 5 as long as T contains fewer than n - 1 edges and E is
   not empty; otherwise, proceed to step 6.
3. Choose the edge (v, w) of least cost from E.
4. Delete (v, w) from E.
5. If (v, w) does not create a cycle in T, add it to T; otherwise discard it.
6. If T contains fewer than n - 1 edges, the graph is not connected and no spanning
   tree exists.
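A C sketch of Kruskal's algorithm using a simple union-find to detect cycles (illustrative; edges are given as parallel arrays, and selection sort is used on the edges for brevity):

int parent[100];

int find(int x)                       /* root of x's component */
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* Returns the total cost of the MST; vertices are 1..n, edges 0..e-1. */
int kruskal(int n, int e, int u[], int v[], int cost[])
{
    int i, j, t, total = 0, edges = 0;
    for (i = 0; i < e - 1; i++)       /* sort the edges by cost */
        for (j = i + 1; j < e; j++)
            if (cost[j] < cost[i])
            {
                t = cost[i]; cost[i] = cost[j]; cost[j] = t;
                t = u[i]; u[i] = u[j]; u[j] = t;
                t = v[i]; v[i] = v[j]; v[j] = t;
            }
    for (i = 1; i <= n; i++)
        parent[i] = i;                /* each vertex is its own component */
    for (i = 0; i < e && edges < n - 1; i++)
    {
        int a = find(u[i]), b = find(v[i]);
        if (a != b)                   /* accept the edge: no cycle */
        {
            parent[a] = b;
            total += cost[i];
            edges++;
        }                             /* otherwise the edge is rejected */
    }
    return total;
}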
Example 1:
Construct the minimal spanning tree for the graph shown below:
[Figure: a weighted graph on vertices 1–6; the edges and their costs are listed in the
table below.]
Cost 10 15 20 25 30 35 40 45 50 55
Edge (1, 2) (3, 6) (4, 6) (2, 6) (1, 4) (3, 5) (2, 5) (1, 5) (2, 3) (5, 6)
The stages in Kruskal's algorithm are as follows: the edges are examined in
non-decreasing order of cost. Edges (1, 2), (3, 6), (4, 6) and (2, 6) are accepted;
edge (1, 4) of cost 30 is rejected because it would create a cycle; edge (3, 5) of
cost 35 is then accepted, completing the tree with total cost
10 + 15 + 20 + 25 + 35 = 105.
Construct the minimal spanning tree for the graph shown below:
[Figure: a weighted graph on vertices 1–7; the edges and their costs are listed in the
table below.]
Solution:
Cost 10 12 14 16 18 22 24 25 28
Edge (1, 6) (3, 4) (2, 7) (2, 3) (4, 7) (4, 5) (5, 7) (5, 6) (1, 2)
The stages in Kruskal's algorithm are as follows: edges (1, 6), (3, 4) and (2, 7) are
accepted first. The edge (2, 3) of cost 16 is next included in the tree. Edge (4, 7) of
cost 18 is rejected because it would create a cycle; edge (4, 5) of cost 22 is accepted,
and finally edge (5, 6) of cost 25 completes the tree, giving total cost
10 + 12 + 14 + 16 + 22 + 25 = 99.
A given graph can have many spanning trees. From these, we have to select the cheapest
one. This tree is called the minimal cost spanning tree. A minimal cost spanning tree is
defined for a connected undirected graph G in which each edge is labeled with a number
(edge labels may signify lengths or weights rather than costs). A minimal cost spanning
tree is a spanning tree for which the sum of the edge labels is as small as possible.
The slight modification of the spanning tree algorithm yields a very simple algorithm for
finding an MST. In the spanning tree algorithm, any vertex not in the tree but
connected to it by an edge can be added. To find a Minimal cost spanning tree, we
must be selective - we must always add a new vertex for which the cost of the new
edge is as small as possible.
This simple modification of the spanning tree algorithm is called Prim's algorithm for
finding a minimal cost spanning tree. Prim's algorithm is an example of a greedy algorithm.
Prim’s Algorithm:
E is the set of edges in G. cost[1:n, 1:n] is the cost adjacency matrix of an n-vertex
graph such that cost[i, j] is either a positive real number or ∞ if no edge (i, j) exists.
A minimum spanning tree is computed and stored as a set of edges in the array t[1:n-1,
1:2]. (t[i, 1], t[i, 2]) is an edge in the minimum-cost spanning tree. The final cost is
returned.
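A C sketch of Prim's algorithm over a cost adjacency matrix, in the spirit of the description above (illustrative; INF stands for the absence of an edge, and the array bounds are assumptions):

#define INF 9999

/* Returns the cost of the MST; cost[1..n][1..n] holds edge costs (INF if absent),
   and t[1..n-1][0..1] receives the tree edges. */
int prim(int n, int cost[][10], int t[][2])
{
    int near[10], i, j, k, mincost = 0;
    for (i = 1; i <= n; i++)
        near[i] = 1;                    /* nearest tree vertex so far */
    near[1] = 0;                        /* vertex 1 starts in the tree */
    for (i = 1; i <= n - 1; i++)        /* add n-1 edges */
    {
        k = 0;
        for (j = 1; j <= n; j++)        /* pick the vertex closest to the tree */
            if (near[j] != 0 && (k == 0 || cost[j][near[j]] < cost[k][near[k]]))
                k = j;
        t[i][0] = k; t[i][1] = near[k]; /* record the edge */
        mincost += cost[k][near[k]];
        near[k] = 0;                    /* k joins the tree */
        for (j = 1; j <= n; j++)        /* update nearest tree vertex */
            if (near[j] != 0 && cost[j][near[j]] > cost[j][k])
                near[j] = k;
    }
    return mincost;
}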
EXAMPLE:
Use Prim’s Algorithm to find a minimal spanning tree for the graph shown below
starting with the vertex A.
[Figure: a weighted graph on vertices A–G; its cost adjacency matrix is shown below,
with blank entries denoting the absence of an edge.]
The cost adjacency matrix is:
      A   B   C   D   E   F   G
A     0   3   6
B     3   0   2   4
C     6   2   0   1   4   2
D         4   1   0   2       4
E             4   2   0   2   1
F             2       2   0   1
G                 4   1   1   0
The stepwise progress of Prim's algorithm is as follows (Status 0 means the vertex is in
the tree, 1 means not yet; Dist is the cost of the cheapest edge joining the vertex to
the tree; Next is the tree vertex at the other end of that edge):

Step  Added  Status (A B C D E F G)  Dist (A B C D E F G)  Next (A B C D E F G)
1     A      0 1 1 1 1 1 1           0 3 6 ∞ ∞ ∞ ∞         *  A  A  A  A  A  A
2     B      0 0 1 1 1 1 1           0 3 2 4 ∞ ∞ ∞         *  A  B  B  A  A  A
3     C      0 0 0 1 1 1 1           0 3 2 1 4 2 ∞         *  A  B  C  C  C  A
4     D      0 0 0 0 1 1 1           0 3 2 1 2 2 4         *  A  B  C  D  C  D
5     E      0 0 0 0 0 1 1           0 3 2 1 2 2 1         *  A  B  C  D  C  E
6     G      0 0 0 0 0 1 0           0 3 2 1 2 1 1         *  A  B  C  D  G  E
7     F      0 0 0 0 0 0 0           0 3 2 1 2 1 1         *  A  B  C  D  G  E

The resulting minimal spanning tree consists of the edges (A, B), (B, C), (C, D),
(D, E), (E, G) and (G, F), with total cost 3 + 2 + 1 + 2 + 1 + 1 = 10.
Warshall's algorithm requires knowing which edges exist and which do not. It does not
need to know the lengths of the edges in the given directed graph. This information is
conveniently displayed by the adjacency matrix for the graph, in which a '1' indicates the
existence of an edge and '0' indicates non-existence.
It begins with the adjacency matrix for the given graph, which is called A 0, and then
updates the matrix ‘n’ times, producing matrices called A1, A2,........... , An and then
stops.
A one entry indicates a pair of vertices, which are connected and zero entry indicates a pair, which are not.
This matrix is called a reachability matrix or path matrix for the graph. It is also called the transitive closure of
the original adjacency matrix.
The update rule for computing Ai from Ai-1 in warshall’s algorithm is:
Ai[x, y] = Ai-1[x, y] ∨ (Ai-1[x, i] ∧ Ai-1[i, y])    ---- (1)
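A C sketch of Warshall's algorithm implementing update rule (1) in place (illustrative; N is the number of vertices):

#define N 4

/* Warshall's transitive closure: a[][] is the adjacency matrix (0/1);
   on return, a[x][y] = 1 iff there is a path from x to y. */
void warshall(int a[N][N])
{
    int i, x, y;
    for (i = 0; i < N; i++)            /* allow i as an intermediate vertex */
        for (x = 0; x < N; x++)
            for (y = 0; y < N; y++)
                a[x][y] = a[x][y] || (a[x][i] && a[i][y]);
}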
Example 1:
Use warshall’s algorithm to calculate the reachability matrix for the graph:
[Figure: a directed graph on vertices 1–4 with edges 1→2, 1→3, 2→3, 2→4, 4→1, 4→2
and 4→3.]
The first step is to compute the A1 matrix. To do so we use updating rule (1). Before
doing so, we notice that any entry that is one in A0 must remain one in A1, since in
Boolean algebra 1 ∨ (anything) = 1. Since there are only nine zero entries in A0, there
are only nine entries in A0 that need to be updated. The result is:
A1:
       1  2  3  4
  1    0  1  1  0
  2    0  0  1  1
  3    0  0  0  0
  4    1  1  1  0
A2[1, 1] = A1[1, 1] ∨ (A1[1, 2] ∧ A1[2, 1]) = 0 ∨ (1 ∧ 0) = 0
A2[1, 4] = A1[1, 4] ∨ (A1[1, 2] ∧ A1[2, 4]) = 0 ∨ (1 ∧ 1) = 1
A2[2, 1] = A1[2, 1] ∨ (A1[2, 2] ∧ A1[2, 1]) = 0 ∨ (0 ∧ 0) = 0
A2[2, 2] = A1[2, 2] ∨ (A1[2, 2] ∧ A1[2, 2]) = 0 ∨ (0 ∧ 0) = 0
A2[3, 1] = A1[3, 1] ∨ (A1[3, 2] ∧ A1[2, 1]) = 0 ∨ (0 ∧ 0) = 0
A2[3, 2] = A1[3, 2] ∨ (A1[3, 2] ∧ A1[2, 2]) = 0 ∨ (0 ∧ 0) = 0
A2[3, 3] = A1[3, 3] ∨ (A1[3, 2] ∧ A1[2, 3]) = 0 ∨ (0 ∧ 1) = 0
A2[3, 4] = A1[3, 4] ∨ (A1[3, 2] ∧ A1[2, 4]) = 0 ∨ (0 ∧ 1) = 0
A2[4, 4] = A1[4, 4] ∨ (A1[4, 2] ∧ A1[2, 4]) = 0 ∨ (1 ∧ 1) = 1
A2:
       1  2  3  4
  1    0  1  1  1
  2    0  0  1  1
  3    0  0  0  0
  4    1  1  1  1
This matrix has only seven 0 entries, and so to compute A 3, we need to do only
seven computations.
A3:
       1  2  3  4
  1    0  1  1  1
  2    0  0  1  1
  3    0  0  0  0
  4    1  1  1  1
Once A3 is calculated, we use the update rule to calculate A 4 and stop.
This matrix is the reachability matrix for the graph.
A4:
       1  2  3  4
  1    1  1  1  1
  2    1  1  1  1
  3    0  0  0  0
  4    1  1  1  1
Note that according to the algorithm vertex 3 is not reachable from itself. This is
because, as can be seen in the graph, there is no path from vertex 3 back to itself.
GRAPH
Introduction to Graphs:
A graph G is a pair (V, E), where V is a finite set of vertices and E is a finite set of edges. We will often
denote n = |V| and e = |E|.
A graph is generally displayed as in figure 6.5.1, in which the vertices are represented by circles and the
edges by lines.
An edge with an orientation (i.e., an arrow head) is a directed edge, while an edge with no orientation is an
undirected edge.
If all the edges in a graph are undirected, then the graph is an undirected graph. If all the edges are
directed, then the graph is a directed graph, also called a digraph. A graph G is connected if and only if there is
a path between any two nodes in G.
A graph G is said to be complete if every node u in G is adjacent to every other node v in G. A complete graph
with n nodes will have n(n-1)/2 edges. For example, figure 6.5.1(a) and figure 6.5.1(d) are complete graphs.
A directed graph G is said to be connected, or strongly connected, if for each pair (u, v) of nodes in G there is a
path from u to v and also a path from v to u. On the other hand, G is said to be unilaterally connected if for each pair
(u, v) of nodes in G there is a path from u to v or a path from v to u. For example, the digraph shown in figure 6.5.1
(e) is strongly connected.
[Figure 6.5.1: example graphs, including an undirected graph on vertices A–G and several
digraphs on vertices v1–v4.]
We can assign a weight function to the edges: wG(e) is the weight of edge e ∈ E. A graph
with such a function assigned is called a weighted graph.
In-degree and Out degree
The number of incoming edges to a vertex v is called the in-degree of the vertex (denoted
indeg(v)). The number of outgoing edges from a vertex is called the out-degree (denoted
outdeg(v)). For example, consider the digraph shown in figure 6.5.1(f):
indegree(v1) = 2 outdegree(v1) = 1
indegree(v2) = 2 outdegree(v2) = 0
A path is a sequence of vertices (v1, v2, ..., vk), where (vi, vi+1) ∈ E for all i. A path
is simple if all vertices in the path are distinct. If there is a path containing one or more
edges which starts from a vertex Vi and terminates into the same vertex then the path
is known as a cycle. For example, there is a cycle in figure 6.5.1(a), figure 6.5.1(c) and
figure 6.5.1(d).
If a graph (digraph) does not have any cycle then it is called acyclic graph. For
example, the graphs of figure 6.5.1 (f) and figure 6.5.1 (g) are acyclic graphs.
A Forest is a set of disjoint trees. If we remove the root node of a given tree then it
becomes forest. The following figure shows a forest F that consists of three trees T1, T2
and T3.
[Figure: a forest F consisting of three trees: T1 with root A (containing B, C, D, E and
F), T2 with root P (containing Q and R), and T3 with root X (containing Y and Z).]
A graph that has either self loop or parallel edges or both is called multi-graph.
Tree is a connected acyclic graph (there aren’t any sequences of edges that go around
in a loop). A spanning tree of a graph G = (V, E) is a tree that contains all vertices of V
and is a subgraph of G. A single graph can have multiple spanning trees.
A spanning tree T of G has the following properties:
1. T contains all the vertices of G.
2. If we remove any edge from T, then T becomes disconnected.
3. If we add any edge to T, then the new graph will contain a cycle.
A graph can be represented in three ways:
1. Adjacency matrix.
2. Adjacency list.
3. Incidence matrix.
Adjacency matrix:
In this representation, a graph G with n vertices is stored as an n × n matrix A = (ai,j),
where ai,j = 1 if there is an edge from vertex vi to vertex vj and ai,j = 0 otherwise.
The matrix is symmetric in the case of an undirected graph, while it may be asymmetric if
the graph is directed. This matrix is also called a Boolean matrix or bit matrix.
[Figure 6.5.2(a): the directed graph G1 on vertices 1–5.]
Its adjacency matrix (figure 6.5.2(b)) is:
       1  2  3  4  5
  1    0  1  1  0  1
  2    0  0  1  1  1
  3    0  0  0  1  0
  4    0  0  0  0  0
  5    0  0  1  1  0
Figure 6.5.2(b) shows the adjacency matrix representation of the graph G1 shown in
figure 6.5.2(a). The adjacency matrix is also useful for storing multigraphs as well as
weighted graphs. In the case of a multigraph, instead of an entry 0 or 1, the entry
will be the number of edges between the two vertices.
In case of weighted graph, the entries are weights of the edges between the vertices.
The adjacency matrix for a weighted graph is called as cost adjacency matrix. Figure
6.5.3(b) shows the cost adjacency matrix representation of the graph G2 shown in
figure 6.5.3(a).
[Figure 6.5.3(a): the weighted graph G2 on vertices A–G.]
Its cost adjacency matrix (figure 6.5.3(b)) is (blank entries denote the absence of an
edge):
      A   B   C   D   E   F   G
A     0   3   6
B     3   0   2   4
C     6   2   0   1   4   2
D         4   1   0   2       4
E             4   2   0   2   1
F             2       2   0   1
G                 4   1   1   0
Adjacency list:
In this representation, the n rows of the adjacency matrix are represented as n linked
lists. An array Adj[1, 2, ..., n] of pointers is kept where, for 1 ≤ v ≤ n, Adj[v] points
to a linked list containing the vertices adjacent to v (i.e. the vertices that can be
reached from v by a single edge). If the edges have weights, these weights may also be
stored in the linked list elements. For the graph G in figure 6.5.4(a), the adjacency
list is shown in figure 6.5.4(b).
       1  2  3        Adjacency lists:
  1    1  1  1        1 -> 1 -> 2 -> 3
  2    0  0  1        2 -> 3
  3    0  1  0        3 -> 2
Incidence Matrix:
In this representation, if G is a graph with n vertices, e edges and no self loops, then
the incidence matrix A is defined as an n by e matrix, say A = (ai,j), where
ai,j = 1, if edge j is incident on vertex vi
ai,j = 0, otherwise
[Figure 6.5.4(a): a graph on vertices A–G with edges labeled a–l.]
Its incidence matrix (figure 6.5.4(b)) is:
      a  b  c  d  e  f  g  h  i  j  k  l
A     1  0  0  0  0  0  1  0  0  0  0  0
B     1  1  1  0  0  0  0  0  0  0  0  0
C     0  1  0  1  0  0  1  1  0  0  1  0
D     0  0  1  1  1  1  0  0  0  0  0  0
E     0  0  0  0  1  0  0  1  1  1  0  0
F     0  0  0  0  0  0  0  0  0  1  1  1
G     0  0  0  0  0  1  0  0  1  0  0  1
Figure 6.5.4(b) shows the incidence matrix representation of the graph G1 shown in
figure 6.5.4(a).
Traversing a Graph
Many graph algorithms require one to systematically examine the nodes and edges of a
graph G. There are two standard ways to do this:
1. Breadth first traversal (BFT)
2. Depth first traversal (DFT)
During the execution of these algorithms, each node N of G will be in one of three
states, called the status of N, as follows:
STATUS = 1 (ready state): the initial state of the node N.
STATUS = 2 (waiting state): the node N is on the queue or stack, waiting to be processed.
STATUS = 3 (processed state): the node N has been processed.
Both BFS and DFS impose a tree (the BFS/DFS tree) on the structure of graph. So, we
can compute a spanning tree in a graph. The computed spanning tree is not a
minimum spanning tree. The spanning trees obtained using depth first search are
called depth first spanning trees. The spanning trees obtained using breadth first
search are called Breadth first spanning trees.
The general idea behind a breadth first traversal beginning at a starting node A is as
follows. First we examine the starting node A. Then we examine all the neighbors of A.
Then we examine all the neighbors of neighbors of A. And so on. We need to keep track
of the neighbors of a node, and we need to guarantee that no node is processed more
than once. This is accomplished by using a QUEUE to hold nodes that are waiting to be
processed, and by using a field STATUS that tells us the current status of any node.
The spanning trees obtained using BFS are called Breadth first spanning trees.
1. Put the starting node A in QUEUE and change its status to the waiting
   state (STATUS = 2).
2. Repeat steps (a) and (b) until QUEUE is empty:
   a. Remove the front node N of QUEUE. Process N and change its status to
      the processed state (STATUS = 3).
   b. Add to the rear of QUEUE all the neighbors of N that are in the
      ready state (STATUS = 1), and change their status to the waiting
      state (STATUS = 2).
3. Exit.
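A C sketch of breadth first traversal following the steps above, using the same STATUS convention (illustrative; the graph is given as an adjacency matrix):

#include <stdio.h>

#define MAX 20

/* Breadth first traversal from vertex s; adj[][] is the adjacency matrix,
   status: 1 = ready, 2 = waiting, 3 = processed. */
void bft(int adj[MAX][MAX], int n, int s)
{
    int status[MAX], queue[MAX], front = 0, rear = 0, u, w;
    for (u = 0; u < n; u++)
        status[u] = 1;                 /* all nodes start in the ready state */
    queue[rear++] = s;
    status[s] = 2;                     /* s is waiting in the queue */
    while (front < rear)
    {
        u = queue[front++];            /* remove the front node */
        printf("%d ", u);
        status[u] = 3;                 /* processed */
        for (w = 0; w < n; w++)
            if (adj[u][w] && status[w] == 1)
            {
                queue[rear++] = w;     /* enqueue ready neighbors */
                status[w] = 2;
            }
    }
}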
Depth first search and traversal:
Depth first search of undirected graph proceeds as follows: First we examine the
starting node V. Next an unvisited vertex 'W' adjacent to 'V' is selected and a depth
first search from 'W' is initiated. When a vertex 'U' is reached such that all its adjacent
vertices have been visited, we back up to the last vertex visited, which has an unvisited
vertex 'W' adjacent to it and initiate a depth first search from W. The search terminates
when no unvisited vertex can be reached from any of the visited ones.
This algorithm is similar to the preorder traversal of a binary tree. The DFT algorithm
is similar to BFT, except that it uses a STACK instead of the QUEUE. Again, the field
STATUS is used to tell us the current status of a node.
1. Initialize all nodes to the ready state (STATUS = 1).
2. Push the starting node A onto STACK and change its status to the waiting state
   (STATUS = 2).
3. Repeat steps (a) and (b) until STACK is empty:
   a. Pop the top node N from STACK. Process N and change its status to
      the processed state (STATUS = 3).
   b. Push all the neighbors of N that are in the ready state (STATUS = 1), and
      change their status to the waiting state (STATUS = 2).
4. Exit.
Example 1:
Consider the graph shown below. Traverse the graph shown below in breadth first
order and depth first order.
[Figure: a graph G on vertices A, B, C, D, E, F, G, J, K.]
Adjacency list for graph G:
Node    Adjacency List
A       F, C, B
B       A, C, G
C       A, B, D, E, F, G
D       C, F, E, J
E       C, D, G, J, K
F       A, C, D
G       B, C, E, K
J       D, E, K
K       E, G, J
Breadth-first search and traversal:

Current                                   Status
Node     QUEUE       Processed Nodes      A B C D E F G J K
                                          1 1 1 1 1 1 1 1 1
         A                                2 1 1 1 1 1 1 1 1
A        F C B       A                    3 2 2 1 1 2 1 1 1
F        C B D       A F                  3 2 2 2 1 3 1 1 1
C        B D E G     A F C                3 2 3 2 2 3 2 1 1
B        D E G       A F C B              3 3 3 2 2 3 2 1 1
D        E G J       A F C B D            3 3 3 3 2 3 2 2 1
E        G J K       A F C B D E          3 3 3 3 3 3 2 2 2
G        J K         A F C B D E G        3 3 3 3 3 3 3 2 2
J        K           A F C B D E G J      3 3 3 3 3 3 3 3 2
K        EMPTY       A F C B D E G J K    3 3 3 3 3 3 3 3 3
For the above graph the breadth first traversal sequence is: A F C B D E G J K.
Depth-first search and traversal:

Current                                   Status
Node     STACK       Processed Nodes      A B C D E F G J K
                                          1 1 1 1 1 1 1 1 1
         A                                2 1 1 1 1 1 1 1 1
A        B C F       A                    3 2 2 1 1 2 1 1 1
F        B C D       A F                  3 2 2 2 1 3 1 1 1
D        B C E J     A F D                3 2 2 3 2 3 1 2 1
J        B C E K     A F D J              3 2 2 3 2 3 1 3 2
K        B C E G     A F D J K            3 2 2 3 2 3 2 3 3
G        B C E       A F D J K G          3 2 2 3 2 3 3 3 3
E        B C         A F D J K G E        3 2 2 3 3 3 3 3 3
C        B           A F D J K G E C      3 2 3 3 3 3 3 3 3
B        EMPTY       A F D J K G E C B    3 3 3 3 3 3 3 3 3
For the above graph the depth first traversal sequence is: A F D J K G E C B.
Example 2:
Traverse the graph shown below in breadth first order, depth first order and construct
the breadth first and depth first spanning trees.
[Figure: a graph G on vertices A–M.]
The adjacency list for the graph G:
Node    Adjacency List
A       F, B, C, G
B       A
C       A, G
D       E, F
E       G, D, F
F       A, E, D
G       A, L, E, H, J, C
H       G, I
I       H
J       G, L, K, M
K       J
L       G, J, M
M       L, J
If the depth first traversal is initiated from vertex A, then the vertices of graph G are
visited in the order: A F E G L J K M H I C D B. The depth first spanning tree is shown
in the figure given below:
[Figure: the depth first spanning tree; its edges are A–F, A–B, F–E, E–G, E–D, G–L,
G–H, G–C, L–J, H–I, J–K and J–M.]
If the breadth first traversal is initiated from vertex A, then the vertices of graph G are
visited in the order: A F B C G E D L H J M I K. The breadth first spanning tree is
shown in the figure given below:
[Figure: the breadth first spanning tree; A is the root, with children F, B, C and G;
F's children are E and D; G's children are L, H and J; L's child is M; H's child is I;
J's child is K.]
Traverse the graph shown below in breadth first order, depth first order and construct
the breadth first and depth first spanning trees.
[Figure: a graph G on vertices 1–8; vertex 1 is adjacent to 2 and 3, vertex 2 to 4 and
5, vertex 3 to 6 and 7, and vertices 4, 5, 6 and 7 are all adjacent to 8.]
The adjacency lists (head nodes) for the graph G:
1 -> 2 -> 3
2 -> 1 -> 4 -> 5
3 -> 1 -> 6 -> 7
4 -> 2 -> 8
5 -> 2 -> 8
6 -> 3 -> 8
7 -> 3 -> 8
8 -> 4 -> 5 -> 6 -> 7
If the breadth first search is initiated from vertex 1, then the vertices of G are visited in
the order: 1, 2, 3, 4, 5, 6, 7, 8. The breadth first spanning tree is as follows:
[Figure: the breadth first spanning tree; 1 is the root, with children 2 and 3; 2's
children are 4 and 5; 3's children are 6 and 7; 8 is a child of 4.]