Data Structures and Algorithms

UNIT – 1 : INTRODUCTION TO DATA STRUCTURES
Structure
1.0 Objectives
1.1 Introduction
1.2 Overview of Data structures
1.3 Abstract Data types and their Implementation using arrays and Linked lists
1.6 Summary
1.7 Keywords
1.9 Questions
1.10 Reference
1.0 OBJECTIVES
After going through this lesson you will be able to
 Describe Overview of Data structures

 Elucidate Abstract Data types and their Implementation using arrays and Linked lists
1.1 INTRODUCTION
Data structures are a crucial aspect of computer science and programming, providing a way
to organize and store data efficiently for various computational tasks. They are essential for
designing algorithms and solving problems. Here's an overview of some common data
structures:
1. Arrays:
 A collection of elements, each identified by an index or a key.
 Elements are stored in contiguous memory locations.
 Access time is constant, O(1), but insertion and deletion can be inefficient,
especially in the middle of the array.
2. Linked Lists:
 A linear data structure consisting of nodes, where each node contains data and
a reference (or link) to the next node in the sequence.
 Dynamic size and efficient insertion and deletion at any position.
1
 Access time is O(n) in the worst case.
3. Stacks:
 A Last In, First Out (LIFO) data structure where elements are added and
removed from the same end, known as the top.
 Common operations include push (addition) and pop (removal).
4. Queues:
 A First In, First Out (FIFO) data structure where elements are added at the rear
and removed from the front.
 Common operations include enqueue (addition) and dequeue (removal).
5. Trees:
 A hierarchical data structure with a root node and branches of nodes, forming
a tree-like structure.
 Binary trees are particularly common, with each node having at most two
children.
 Special types include binary search trees (BST), AVL trees, and red-black
trees.
6. Graphs:
 A collection of nodes (vertices) and edges connecting these nodes.
 Can be directed or undirected, and may have weights associated with edges.
 Common algorithms include depth-first search (DFS) and breadth-first search
(BFS).
7. Hash Tables:
 A data structure that maps keys to values using a hash function.
 Provides constant-time average case complexity for basic operations like
insertion, deletion, and lookup.
8. Heaps:
 A specialized tree-based data structure used for heap sorting and priority
queue implementations.
 Common types include min-heap and max-heap.
9. Tries:
 An ordered tree data structure used to store a dynamic set or associative array
where keys are strings.
 Particularly efficient for string-related operations.
2
10. Sets and Maps:
 Sets store unique elements, and maps associate keys with values.
 Implementations include hash sets, linked sets, hash maps, and tree maps.
Understanding the strengths and weaknesses of different data structures is crucial for
choosing the right one for a specific problem and optimizing algorithm performance.
Different data structures excel in different scenarios, and the choice often depends on the
requirements of the task at hand.
1.2 Overview of Data Structures
Data Structure is a systematic way to organize data in order to use it efficiently. Following
terms are the foundation terms of a data structure.
 Interface − Each data structure has an interface. Interface represents the set of
operations that a data structure supports. An interface only provides the list of
supported operations, type of parameters they can accept and return type of these
operations.
 Implementation − Implementation provides the internal representation of a data
structure. Implementation also provides the definition of the algorithms used in the
operations of the data structure.
Characteristics of a Data Structure
 Correctness − Data structure implementation should implement its interface
correctly.
 Time Complexity − Running time or the execution time of operations of data
structure must be as small as possible.
 Space Complexity − Memory usage of a data structure operation should be as little as
possible.
Need for Data Structure
As applications are getting complex and data rich, there are three common problems that
applications face now-a-days.
3
 Data Search − Consider an inventory of 1 million(106) items of a store. If the
application is to search an item, it has to search an item in 1 million(106) items every
time slowing down the search. As data grows, search will become slower.
 Processor speed − Processor speed although being very high, falls limited if the data
grows to billion records.
 Multiple requests − As thousands of users can search data simultaneously on a web
server, even the fast server fails while searching the data.
To solve the above-mentioned problems, data structures come to rescue. Data can be
organized in a data structure in such a way that all items may not be required to be searched,
and the required data can be searched almost instantly.
Execution Time Cases
There are three cases which are usually used to compare various data structure's execution
time in a relative manner.
 Worst Case − This is the scenario where a particular data structure operation takes
maximum time it can take. If an operation's worst case time is ƒ(n) then this operation
will not take more than ƒ(n) time where ƒ(n) represents function of n.
 Average Case − This is the scenario depicting the average execution time of an
operation of a data structure. If an operation takes ƒ(n) time in execution, then m
operations will take mƒ(n) time.
 Best Case − This is the scenario depicting the least possible execution time of an
operation of a data structure. If an operation takes ƒ(n) time in execution, then the
actual operation may take time as the random number which would be maximum as
ƒ(n).
Basic Terminology
 Data − Data are values or set of values.
 Data Item − Data item refers to single unit of values.
 Group Items − Data items that are divided into sub items are called as Group Items.
 Elementary Items − Data items that cannot be divided are called as Elementary
Items.
 Attribute and Entity − An entity is that which contains certain attributes or
properties, which may be assigned values.
4
 Entity Set − Entities of similar attributes form an entity set.
 Field − Field is a single elementary unit of information representing an attribute of an
entity.
 Record − Record is a collection of field values of a given entity.
 File − File is a collection of records of the entities in a given entity set.
1.3 Abstract Data types and their Implementation using arrays and Linked lists
Abstract Data types and their Implementation using arrays and Linked lists:
Abstract Data Types (ADTs) : are high-level descriptions of data structures that define a set
of operations and the behavior of those operations without specifying how the data structure
is implemented. ADTs provide a way to understand and interact with data structures at a
conceptual level, abstracting away the implementation details. The primary goal is to
encapsulate data and operations into a unified interface, allowing users to work with the data
structure without needing to know the underlying implementation.
Here are some commonly used Abstract Data Types:
1. List:
 An ordered collection of elements with dynamic size.
 Operations: Insert, Delete, Find, Traverse, Get Size, etc.
2. Stack:
 A Last In, First Out (LIFO) structure.
 Operations: Push, Pop, Peek, Is Empty, etc.
3. Queue:
 A First In, First Out (FIFO) structure.
 Operations: Enqueue, Dequeue, Peek, Is Empty, etc.
4. Set:
 An unordered collection of unique elements.
 Operations: Add, Remove, Contains, Union, Intersection, etc.
5. Map (Dictionary):
 A collection of key-value pairs.
5
 Operations: Insert, Delete, Find, Get Value by Key, etc.
6. Tree:
 A hierarchical structure with nodes and edges.
 Operations: Traverse (Inorder, Preorder, Postorder), Search, Insert, Delete, etc.
7. Graph:
 A collection of nodes and edges.
 Operations: Traverse (Depth-First Search, Breadth-First Search), Shortest
Path, Connect, Disconnect, etc.
8. Heap:
 A specialized tree-based structure often used for priority queues.
 Operations: Insert, Extract Min/Max, Heapify, etc.
9. Priority Queue:
 A data structure where each element has an associated priority.
 Operations: Insert, Extract Min/Max, Peek, etc.
10. Hash Table:
 A data structure that maps keys to values using a hash function.
 Operations: Insert, Delete, Search, etc.
These abstract definitions allow programmers to choose and implement specific data
structures based on the requirements of their applications. The same abstract data type can be
implemented using different underlying data structures (e.g., lists with arrays or linked lists).
ADTs provide a powerful way to reason about and design algorithms, promoting modular and
maintainable code.
Abstract Data Types (ADTs) provide a high-level description of the behavior and properties
of a data structure, independent of its implementation details. Two common ADTs are Lists
and Stacks. Let's explore their implementation using arrays and linked lists:
1. List ADT:
 Represents an ordered collection of elements with dynamic size.
Implementation using Arrays:
6
#include <stdio.h>
#include <stdlib.h>
#define INITIAL_CAPACITY 10
typedef struct {
int *array;
int size;
int capacity;
} List;
void initialize(List *list) {
list->array = (int *)malloc(INITIAL_CAPACITY * sizeof(int));
if (list->array == NULL) {
printf("Memory allocation failed!\n");
exit(EXIT_FAILURE);
list->size = 0;
list->capacity = INITIAL_CAPACITY;
7
void append(List *list, int value) {
if (list->size >= list->capacity) {
list->capacity *= 2;
list->array = (int *)realloc(list->array, list->capacity * sizeof(int));
if (list->array == NULL) {
printf("Memory reallocation failed!\n");
exit(EXIT_FAILURE);
list->array[list->size++] = value;
int get(List *list, int index) {
if (index < 0 || index >= list->size) {
printf("Index out of bounds!\n");
exit(EXIT_FAILURE);
return list->array[index];
8
void display(List *list) {
printf("List elements: ");
for (int i = 0; i < list->size; i++) {
printf("%d ", list->array[i]);
printf("\n");
void freeList(List *list) {
free(list->array);
int main() {
List myList;
initialize(&myList);
// Adding elements to the list
append(&myList, 10);
9
// Displaying the list
display(&myList);
// Accessing an element by index
printf("Element at index 2: %d\n", get(&myList, 2));
// Freeing the memory allocated for the list
freeList(&myList);
return 0;
In this implementation:
 The List struct contains an array to store the elements, along with variables to track
the size and capacity of the array.
 The initialize function initializes the list with an initial capacity and allocates memory
for the array.
 The append function adds an element to the end of the list. If the array is full, it
dynamically reallocates memory to double its capacity.
 The get function retrieves an element from the list at a specified index.
 The display function prints the elements of the list.
10
 The freeList function frees the memory allocated for the list array when it is no
longer needed.
This implementation provides a basic dynamic array-based list, allowing for the addition of
elements and retrieval of elements by index. It dynamically resizes the array to accommodate
more elements as needed.
Implementation using Linked Lists:
Here's an implementation of a List Abstract Data Type (ADT) using singly linked lists in C:
#include <stdio.h>
#include <stdlib.h>
typedef struct Node {

int data;
struct Node* next;
} Node;
typedef struct {
Node* head;
int size;
} List;
void initialize(List *list) {

list->head = NULL;
list->size = 0;
}
void append(List *list, int value) {

Node* newNode = (Node*)malloc(sizeof(Node));
if (newNode == NULL) {
printf("Memory allocation failed!\n");
exit(EXIT_FAILURE);
11
}
newNode->data = value;
newNode->next = NULL;
if (list->head == NULL) {
list->head = newNode;
} else {
Node* current = list->head;
while (current->next != NULL) {
current = current->next;
}
current->next = newNode;
}
list->size++;
}
int get(List *list, int index) {

if (index < 0 || index >= list->size) {
printf("Index out of bounds!\n");
exit(EXIT_FAILURE);
}
for (int i = 0; i < index; i++) {
}
return current->data;
}
void display(List *list) {

printf("List elements: ");
while (current != NULL) {
printf("%d ", current->data);
12
}
printf("\n");
}
void freeList(List *list) {

Node* temp = current;
free(temp);
}
}
int main() {
List myList;
// Adding elements to the list

// Displaying the list

display(&myList);
// Accessing an element by index

printf("Element at index 2: %d\n", get(&myList, 2));
// Freeing the memory allocated for the list

freeList(&myList);
return 0;
}
13
In this implementation:
 The List struct contains a pointer to the head of the linked list and a variable to track
the size of the list.
 The initialize function initializes the list with a null head pointer and size zero.
 The append function adds a new node with the given value to the end of the linked
list.
 The get function retrieves the value at the specified index in the linked list.
 The display function prints all the elements in the linked list.
 The freeList function frees the memory allocated for all nodes in the linked list when
it is no longer needed.
This implementation provides a basic singly linked list-based list, allowing for the addition of
elements and retrieval of elements by index. It dynamically allocates memory for new nodes
as elements are added and frees the memory when the list is no longer needed.
1.7 SUMMARY
In this unit we have learnt about Overview of Data structures, Abstract Data types and their
Implementation using arrays and Linked lists.
An abstract data type is an abstraction of a data structure that provides only the interface to
which the data structure must adhere. The interface does not give any specific details about
something should be implemented or in what programming language.
In other words, we can say that abstract data types are the entities that are definitions of data
and operations but do not have implementation details. In this case, we know the data that we
are storing and the operations that can be performed on the data, but we don't know about the
implementation details. The reason for not having implementation details is that every
programming language has a different implementation strategy for example; a C data
structure is implemented using structures while a C++ data structure is implemented using
objects and classes.
14
For example, a List is an abstract data type that is implemented using a dynamic array and
linked list. A queue is implemented using linked list-based queue, array-based queue, and
stack-based queue. A Map is implemented using Tree map, hash map, or hash table.
Abstract data type model
Before knowing about the abstract data type model, we should know about abstraction and
encapsulation.
Abstraction: It is a technique of hiding the internal details from the user and only showing the
necessary details to the user.
Encapsulation: It is a technique of combining the data and the member function in a single
unit is known as encapsulation.
The above figure shows the ADT model. There are two types of models in the ADT model,
i.e., the public function and the private function. The ADT model also contains the data
structures that we are using in a program. In this model, first encapsulation is performed, i.e.,
all the data is wrapped in a single unit, i.e., ADT. Then, the abstraction is performed means
showing the operations that can be performed on the data structure and what are the data
structures that we are using in a program.
1.8 KEYWORDS
15
Data structures, Linked lists., Arrays.
1.9 QUESTIONS
1. Discuss Overview of Data structures..

2. Explain Abstract Data Types.
3. Describe Abstract Data types and their Implementation using arrays and
Linked lists.
1.10 REFERENCES
 "Data Structures and Algorithms" by Michael T. Goodrich, Roberto Tamassia, and

Michael H. Goldwasser
 "Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Rivest, and Clifford Stein
 "Algorithms" by Robert Sedgewick and Kevin Wayne
 "Data Structures and Algorithms in C++" by Adam Drozdek
 "Data Structures and Algorithms Made Easy" by Narasimha Karumanchi
 "Data Structures and Algorithm Analysis in C++" by Mark A. Weiss
 "Data Structures and Algorithms with Object-Oriented Design Patterns in C++" by
Bruno R. Preiss
 "Data Structures and Algorithm Analysis in Java" by Mark A. Weiss Internet and
ChatGpt
16
UNIT 2: Algorithm Analysis
Structure:
2.0 Objectives
2.1 Introduction
2.2 Measuring the efficiency of algorithms, time and space complexity.
2.3 Big-O notation and Performance Analysis Techiques.
2.4 Summary
2.5 Keywords
2.6 Questions
2.7 Reference
2.0 OBJECTIVES
After going through this unit you will be able to
 Measuring the efficiency of algorithms, time and space complexity.

 Big-O notation and Performance Analysis Techiques.
2.1 INTRODUCTION
Algorithm analysis involves studying the efficiency and performance characteristics of

algorithms. It helps us understand how well an algorithm solves a problem and how it
behaves as the input size increases. Key aspects of algorithm analysis include time
complexity, space complexity, and Big O notation.
1. Time Complexity:
 Time complexity measures the amount of time an algorithm takes to complete
as a function of the input size.
 It is expressed using Big O notation, which provides an upper bound on the
growth rate of an algorithm.
 Common time complexities include O(1) (constant time), O(log n)
(logarithmic time), O(n) (linear time), O(n log n) (linearithmic time), O(n^2)
(quadratic time), etc.
17
2. Space Complexity:
 Space complexity measures the amount of memory an algorithm uses as a
function of the input size.
 Like time complexity, space complexity is also expressed using Big O
notation.
 It considers both the auxiliary space (extra space required for algorithm
execution) and input space.
3. Big O Notation:
 Big O notation describes the upper bound or worst-case scenario of an
algorithm's growth rate.
 It provides a simplified representation of the algorithm's efficiency, ignoring
constant factors and lower-order terms.
 Common Big O notations include O(1), O(log n), O(n), O(n log n), O(n^2),
O(2^n), etc.
4. Best, Worst, and Average Case Analysis:
 Algorithms may have different time complexities in different scenarios.
 Best-case time complexity represents the minimum time an algorithm takes for
any input.
 Worst-case time complexity represents the maximum time an algorithm takes
for any input.
 Average-case time complexity considers the expected time over all possible
inputs.
5. Amortized Analysis:
 Amortized analysis provides the average performance of an algorithm over a
sequence of operations.
 It considers the total cost of a sequence of operations divided by the number of
operations, providing a more accurate picture of average performance.
6. Space-Time Tradeoff:
 Some algorithms optimize for time complexity at the cost of increased space
complexity and vice versa.
 Analyzing the tradeoff between time and space complexity is crucial for
choosing the most suitable algorithm for a particular application.
7. Empirical Analysis:
18
 In addition to theoretical analysis, empirical analysis involves measuring the
actual performance of an algorithm on real-world data.
 This can help validate theoretical predictions and identify practical
considerations that may impact performance.
Efficient algorithms strike a balance between time and space complexity, and algorithm
analysis provides insights into the scalability and practicality of different solutions. It helps in
making informed decisions about algorithm selection based on the requirements and
constraints of a given problem.
2.2 Measuring the efficiency of algorithms, time and space complexity.
Measuring the efficiency of algorithms, time and space complexity.
Measuring the efficiency of algorithms involves analyzing their time complexity and space
complexity. Time complexity refers to the amount of time an algorithm takes to complete as a
function of the input size, while space complexity measures the amount of memory the
algorithm uses. Here's how you can analyze and measure these complexities:
Time Complexity:
1. Counting Basic Operations:
 Identify the basic operations that contribute to the running time of the
algorithm.
 Count the number of these operations as a function of the input size.
2. Asymptotic Analysis:
 Express the count of basic operations as a mathematical function, ignoring
constant factors and lower-order terms.
 Use Big O notation to represent the upper bound of the growth rate.
 Determine the best-case, worst-case, and average-case time complexity.
 Consider scenarios that lead to the minimum, maximum, and average running
times.
19
4. Recurrence Relations:
 For recursive algorithms, formulate recurrence relations to express the time
complexity.
 Solve the recurrence relation to obtain a closed-form expression.
5. Time Complexity Classes:
 Classify the algorithm into a time complexity class (e.g., O(1), O(log n), O(n),
O(n log n), O(n^2), etc.).
 Compare the growth rates to assess scalability.
Space Complexity:
1. Counting Memory Usage:
 Identify the memory requirements of the algorithm, including variables, data
structures, and recursion stack.
 Count the amount of memory used as a function of the input size.
2. Asymptotic Analysis for Space:
 Express the count of memory usage as a mathematical function using Big O
notation.
 Consider both auxiliary space (extra space required for algorithm execution)
and input space.
3. Space Complexity Classes:
 Classify the algorithm into a space complexity class (e.g., O(1), O(log n),
O(n), O(n^2), etc.).
 Evaluate the tradeoff between time and space complexity.
Practical Considerations:
1. Empirical Testing:
 Implement the algorithm and test it on real-world data.
 Measure the actual running time and memory usage using profiling tools.
2. Benchmarking:
 Compare the algorithm's performance with other algorithms solving the same
problem.
 Consider the constants hidden by Big O notation in practical scenarios.
3. Optimizations:
 Explore opportunities for algorithmic optimizations or improvements to
reduce time or space complexity.
20
 Evaluate the impact of these optimizations on overall efficiency.
4. Scalability Analysis:
 Analyze how the algorithm performs as the input size increases.
 Identify any points of diminishing returns or potential bottlenecks.
Measuring the efficiency of algorithms requires a combination of theoretical analysis,

empirical testing, and consideration of practical factors. It's essential to choose algorithms
that meet the specific requirements and constraints of the application while optimizing for
time and space complexity based on the problem at hand.
2.3 Big-O notation and Performance Analysis Techiques.
Big-O notation and Performance Analysis Techiques.:
Big-O Notation:
Big-O notation is a mathematical notation used to describe the upper bound or worst-case
performance of an algorithm in terms of its input size. It provides an asymptotic upper bound
on the growth rate of the algorithm's time or space complexity. Here are some common
complexities expressed in Big-O notation:
1. O(1): Constant time complexity. The algorithm's performance is constant, regardless

of the input size.
2. O(log n): Logarithmic time complexity. Common in algorithms that divide the
problem into smaller subproblems.
3. O(n): Linear time complexity. The running time increases linearly with the input size.
4. O(n log n): Linearithmic time complexity. Often found in efficient sorting algorithms
like merge sort and heap sort.
5. O(n^2): Quadratic time complexity. Common in algorithms with nested iterations.
6. O(2^n): Exponential time complexity. Often indicates brute-force algorithms that
explore all possible solutions.
7. O(n!): Factorial time complexity. Rare and highly inefficient, usually seen in
algorithms with combinatorial problems.
21
Performance Analysis Techniques:
1. Counting Operations:
 Identify basic operations and count the number of times each operation is
executed as a function of the input size.
2. Asymptotic Analysis:
 Express the count of operations using Big-O notation to provide an upper
bound on the growth rate.
 Consider scenarios that lead to the minimum, maximum, and average running
times to understand the algorithm's behavior.
4. Recurrence Relations:
 For recursive algorithms, formulate recurrence relations to express the time
complexity and solve them to obtain a closed-form expression.
5. Amortized Analysis:
 Analyze the average performance of a sequence of operations to provide a
more accurate picture of average efficiency.
6. Benchmarking:
 Compare the algorithm's performance with other algorithms solving the same
problem to understand its relative efficiency.
7. Empirical Testing:
 Implement the algorithm and test it on real-world data to measure the actual
running time and memory usage using profiling tools.
8. Space Complexity Analysis:
 Identify the memory requirements of the algorithm and express them using
Big-O notation.
9. Optimizations:
 Explore opportunities for algorithmic optimizations to reduce time or space
complexity while preserving correctness.
10. Scalability Analysis:
 Analyze how the algorithm performs as the input size increases to ensure it scales
well with growing data.
22
Big-O notation and performance analysis techniques provide a framework for understanding,
comparing, and optimizing algorithms. They are crucial for making informed decisions about
algorithm selection based on the specific requirements and constraints of a given problem.
What is Big O Notation in Data Structure?
Big O Notation in Data Structure is used to express algorithmic complexity using algebraic
terms. It describes the upper bound of an algorithm's runtime and calculates the time and
amount of memory needed to execute the algorithm for an input value.
Mathematical Definition
Consider the functions f(n) and g(n), where functions f and g are defined on an unbounded set
of positive real numbers. g(n) is strictly positive for every large value of n.
The function f is said to be O(g) (read as big- oh of g), if, for a constant c>0 and a natural
number n0,
f (n) ≤ CG(n) for all n >=

n0
This can be written as:
f(n) = O(g(n)), where n tends to infinity (n → ∞)
We can simply write the above expression as:
f(n) = O(g(n))
23
Properties of Big O Notation
The most important properties of Big O Notation in Data Structure are:
 Constant Multiplication:
If f(n) = CG(n), then O(f(n)) = O(g(n)) for a constant c > 0
 Summation Function:
If f(n) = f1(n) + f2(n) + -- + FM(n) and fi(n)≤ fi+1(n) ∀ i=1, 2, --, m,
then O(f(n)) = O(max(f1(n), f2(n), --, fm(n)))
 Logarithmic Function:
If f(n) = log an and g(n)=log bn, then
O(f(n)) = O(g(n))
 Polynomial Function:
If f(n) = a0 + a1.n + a2.n2 + -- + am.nm, then
O(f(n)) = O(nm)
How Does Big O Notation Make a Runtime Analysis of an Algorithm?
In order to analyze and calculate an algorithm's performance, we must calculate and compare
the worst-case runtime complexities of the algorithm. The order of O(1) - known as the
Constant Running Time - is the fastest running time for an algorithm, with the time taken by
the algorithm being equal for different input sizes. Although the Constant Running Time is
24
the ideal runtime for an algorithm, it can be rarely achieved because the runtime depends on
the size of n inputted.
For example, runtime analysis of an algorithm for a size of n = 20:
n=20,
log (20) = 2.996
20 = 20
20 log (20) = 59.9
20^2 = 400
2^20 = 1084576
20! = 2.432902 + 1818
 Runtime complexity of some common algorithmic examples:
 Runtime Complexity for Linear Search – O(n)
 Runtime Complexity for Binary Search – O(log n)
 Runtime Complexity for Bubble Sort, Insertion Sort, Selection Sort, Bucket Sort -
O(n^c).
 Runtime Complexity for Exponential algorithms like Tower of Hanoi - O(c^n).
 Runtime Complexity for Heap Sort, Merge Sort - O(n log n).
25
How Does Big O Notation Analyze Space Complexity?
It is also essential to determine the Space Complexity of an algorithm. This is because space
complexity indicates how much memory space the algorithm occupies. We compare the
worst-case space complexities of the algorithm.
Before the Big O notation analyzes the Space complexity, the following tasks need to be
implemented:
1. Implementation of the program for a particular algorithm.
2. The size of input n needs to be known to calculate the memory each item will hold.
Space Complexities of some common algorithms:
Linear Search, Binary Search, Bubble sort, Selection sort, Heap sort, Insertion sort - Space
Complexity is O(1).
 Radix sort - Space complexity is O(n+k).
 Quick Sort - Space complexity is O(n).
 Merge sort - Space complexity is O(log n).
Example of Big O Notation in C
Implementation of Selection Sort algorithm in C to find worst-case complexity (Big O

Notation) of the algorithm:
for(int i=0; i<n; i++)
int min = i;
26
for(int j=i; j<n; j++)
if(array[j]<array[min])
min=j;
int temp = array[i];
array[i] = array[min];
array[min] = temp;
Explanation:
The range of the first (outer) for loop is i<n, meaning the order of the loop is O(n).
The range for the second (inner) for loop is j<n; so, the order of the loop is again O(n).
Average efficiency is calculated as n/2 for a constant c, but we ignore the constant. Thus, the
order comes to be O(n).
We get runtime complexity by multiplying the inner and outer loop order. It is O(n^2).
In this way, you can implement other algorithms in C, and analyze and determine the
complexities.
27
2.4 SUMMARY
Analysis of Algorithms in Data Structure involves evaluating the efficiency and

performance of algorithms when applied to various data structures. This analysis is crucial for
understanding how well algorithms perform in different scenarios and helps make informed
decisions about their usage in specific applications.
Data structure analysis assesses how algorithms interact with and manipulate different data
structures. This entails determining the best-case, worst-case, and average-case situations for
algorithms, assessing their time and space complexity, and comprehending how they behave
with different input quantities.
Developers can choose algorithms with greater knowledge and efficiency, resulting in
scalable and more effective software solutions, by thoroughly analyzing algorithms in the
context of data structures.
2.5 KEYWORDS
Analysis , Algorithms , Data Structure
2.6 QUESTIONS
1. What is Big O Notation in Data Structure?

2. How Does Big O Notation Make a Runtime Analysis of an Algorithm?
3. How Does Big O Notation Analyze Space Complexity?
4. Explain with an Example of Big O Notation in C

5. Write a note on Big-O notation and Performance Analysis Techiques.
6. Discuss Measuring the efficiency of algorithms, time and space complexity.
28
2.7 REFERENCES
 "Data Structures and Algorithms" by Michael T. Goodrich, Roberto Tamassia, and

 "Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson, Ronald L.
 "Algorithms" by Robert Sedgewick and Kevin Wayne
 "Data Structures and Algorithms in C++" by Adam Drozdek
 "Data Structures and Algorithms Made Easy" by Narasimha Karumanchi
 "Data Structures and Algorithm Analysis in C++" by Mark A. Weiss
 "Data Structures and Algorithms with Object-Oriented Design Patterns in C++" by
Bruno R. Preiss
 "Data Structures and Algorithm Analysis in Java" by Mark A. Weiss
 Internet and ChatGpt
29
UNIT 3: Stacks and Queues
Structure:
3.0 Objectives
3.1 Introduction
3.2 Implementing Stack and Queue data structures, their applications , and usage in design
algorithm.
3.3 Summary
3.4 Keywords
3.5 Questions
3.6 Reference
3.0 OBJECTIVES
• Describe Stacks and Queues

• Describe Implementation of Stack and Queue data structures, their applications , and
usage in design algorithm.
3.1 INTRODUCTION
A stack is a last-in, first-out (LIFO) data structure, where the last element added is the first
one to be removed. It follows the principle of adding elements to the "top" and removing
them from the same "top" position.
 Operations:
 Push: Add an element to the top of the stack.
 Pop: Remove the element from the top of the stack.
 Peek (or Top): View the element at the top without removing it.
 isEmpty: Check if the stack is empty.
 Applications:
 Function call management (call stack in programming).
 Expression evaluation (postfix, prefix, and infix notations).
 Undo mechanisms in applications.
30
 Backtracking algorithms.
 Implementation:
 Can be implemented using arrays or linked lists.
Queues:
A queue is a first-in, first-out (FIFO) data structure, where the first element added is the first
one to be removed. It follows the principle of adding elements to the "rear" and removing
them from the "front."
 Operations:
 Enqueue: Add an element to the rear of the queue.
 Dequeue: Remove the element from the front of the queue.
 Front: View the element at the front without removing it.
 isEmpty: Check if the queue is empty.
 Applications:
 Task scheduling in operating systems.
 Print job management.
 Breadth-first search in graph algorithms.
 Handling requests in networking.
 Implementation:
 Can be implemented using arrays, linked lists, or specialized data structures
like a circular queue.
Differences:
 Order of Removal:
 Stack: Last In, First Out (LIFO).
 Queue: First In, First Out (FIFO).
 Operations:
 Stack: Push, Pop, Peek.
 Queue: Enqueue, Dequeue, Front.
 Implementation:
 Stacks can be easily implemented using arrays or linked lists.
31
 Queues can be implemented using arrays, linked lists, or circular buffers.
 Applications:
 Stacks are used in scenarios where the last operation needs to be undone or
revisited.
 Queues are used when tasks are processed in the order they arrive.
Both stacks and queues are fundamental data structures, and their simplicity makes them
useful in various applications. Choosing between them depends on the specific requirements
of a problem.
3.2 Implementing Stack and Queue data structures, their applications , and usage in
design algorithm
Implementing Stack and Queue data structures, their applications , and usage in
design algorithm
let's implement basic versions of a Stack and a Queue in Python and discuss their
applications and usage in algorithm design.
Stack Implementation:
Stack Implementation:
#include <stdio.h>
#include <stdlib.h>
#define MAX_SIZE 100
typedef struct {
int data[MAX_SIZE];
int top;
} Stack;
32
void initialize(Stack *stack) {
stack->top = -1;
}
int isEmpty(Stack *stack) {

return stack->top == -1;
}
int isFull(Stack *stack) {

return stack->top == MAX_SIZE - 1;
}
void push(Stack *stack, int value) {

if (!isFull(stack)) {
stack->data[++stack->top] = value;
} else {
printf("Stack overflow!\n");
}
}
int pop(Stack *stack) {

if (!isEmpty(stack)) {
return stack->data[stack->top--];
} else {
printf("Stack underflow!\n");
return -1; // or some other error value
}
}
33
Queue Implementation:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int data[MAX_SIZE];
int front;
int rear;
} Queue;
void initialize(Queue *queue) {

queue->front = -1;
queue->rear = -1;
}
int isEmpty(Queue *queue) {

return queue->front == -1;
}
int isFull(Queue *queue) {

return (queue->rear + 1) % MAX_SIZE == queue->front;
}
void enqueue(Queue *queue, int value) {

if (!isFull(queue)) {
if (isEmpty(queue))
queue->front = 0;
queue->rear = (queue->rear + 1) % MAX_SIZE;
queue->data[queue->rear] = value;
34
} else {
printf("Queue overflow!\n");
}
}
int dequeue(Queue *queue) {

if (!isEmpty(queue)) {
int value = queue->data[queue->front];
if (queue->front == queue->rear)
initialize(queue);
else
queue->front = (queue->front + 1) % MAX_SIZE;
return value;
} else {
printf("Queue underflow!\n");
}
}
Applications and Usage in Algorithm Design:

Stack:
 Applications:
 Function call stack in programming languages for managing function calls
and local variables.
 Undo/Redo functionality in text editors and other applications.
 Expression evaluation and conversion (infix to postfix, postfix to infix, etc.).
 Backtracking algorithms (e.g., depth-first search in graphs).
 Usage in Algorithm Design:
 Used to implement depth-first search (DFS) algorithm in graph traversal.
 Helps in solving problems requiring last-in-first-out (LIFO) behavior, such as
backtracking algorithms.
 Can be used to reverse a sequence of elements efficiently.
Queue:
35
 Applications:
 Job scheduling in operating systems.
 Print spooling in printers.
 Breadth-first search (BFS) in graph traversal.
 Buffer management in networking and I/O systems.
 Usage in Algorithm Design:
 Used to implement breadth-first search (BFS) algorithm in graph traversal.
 Helps in solving problems requiring first-in-first-out (FIFO) behavior, such as
scheduling problems.
 Can be used to simulate processes where entities arrive and leave in a
sequential manner.
In algorithm design, stacks and queues are fundamental data structures used in
various problem-solving scenarios. Understanding their properties and operations is
crucial for efficiently solving problems in computer science and related fields.
Implementation of Stack in Data Structures
You can perform the implementation of stacks in data structures using two data structures
that are an array and a linked list.
 Array: In array implementation, the stack is formed using an array. All the
operations are performed using arrays. You will see how all operations can be
implemented on the stack in data structures using an array data structure.
36
 Linked-List: Every new element is inserted as a top element in the linked list
implementation of stacks in data structures. That means every newly inserted
element is pointed to the top. Whenever you want to remove an element from the
stack, remove the node indicated by the top, by moving the top to its previous node
in the list.
37
Application of Stack in Data Structures
Here are the top 7 applications of the stack in data structure:
 Expression Evaluation and Conversion
 Backtracking
 Function Call
 Parentheses Checking
 String Reversal
 Syntax Parsing
 Memory Management
Now you will understand all the applications one at a time.
38
1. Expression Evaluation and Conversion
There are three types of expression that you use in programming, they are:
Infix Expression: An infix expression is a single letter or an operator preceded by one single
infix string followed by another single infix string.
 X
 X+Y
 (X + Y ) + (A - B)
Prefix Expression: A prefix expression is a single letter or an operator followed by two prefix
strings.
 X
 +XY
 ++XY-AB
Postfix Expression: A postfix expression (also called Reverse Polish Notation) is a single
letter or an operator preceded by two postfix strings.
 X
 XY+
 XY+CD-+
Similarly, the stack is used to evaluate these expressions and convert these expressions like
infix to prefix or infix to postfix.
2. Backtracking
Backtracking is a recursive algorithm mechanism that is used to solve optimization problems.
39
To solve the optimization problem with backtracking, you have multiple solutions; it does not
matter if it is correct. While finding all the possible solutions in backtracking, you store the
previously calculated problems in the stack and use that solution to resolve the following
issues.
The N-queen problem is an example of backtracking, a recursive algorithm where the stack is
used to solve this problem.
3. Function Call
Whenever you call one function from another function in programming, the reference of
calling function stores in the stack. When the function call is terminated, the program control
moves back to the function call with the help of references stored in the stack.
So stack plays an important role when you call a function from another function.
40
4. Parentheses Checking
Stack in data structures is used to check if the parentheses like ( ), { } are valid or not in
programing while matching opening and closing brackets are balanced or not.
So it stores all these parentheses in the stack and controls the flow of the program.
For e.g ((a + b) * (c + d)) is valid but {{a+b})) *(b+d}] is not valid.
5. String Reversal
Another exciting application of stack is string reversal. Each character of a string gets stored
in the stack.
The string's first character is held at the bottom of the stack, and the last character of the
string is held at the top of the stack, resulting in a reversed string after performing the pop
operation.
6. Syntax Parsing
Since many programming languages are context-free languages, the stack is used for syntax
parsing by many compilers.
7. Memory Management
Memory management is an essential feature of the operating system, so the stack is heavily
used to manage memory.
41
Array representation of Queue
We can easily represent queue by using linear arrays. There are two variables i.e. front and
rear, that are implemented in the case of every queue. Front and rear variables point to the
position from where insertions and deletions are performed in a queue. Initially, the value of
front and queue is -1 which represents an empty queue. Array representation of a queue
containing 5 elements along with the respective values of front and rear, is shown in the
following figure.
The above figure shows the queue of characters forming the English word "HELLO". Since,
No deletion is performed in the queue till now, therefore the value of front remains -1 .
However, the value of rear increases by one every time an insertion is performed in the
queue. After inserting an element into the queue shown in the above figure, the queue will
look something like following. The value of rear will become 5 while the value of front
remains same.
42
After deleting an element, the value of front will increase from -1 to 0. however, the queue
will look something like following.
Algorithm to insert any element in a queue
Check if the queue is already full by comparing rear to max - 1. if so, then return an overflow
error.
43
If the item is to be inserted as the first element in the list, in that case set the value of front
and rear to 0 and insert the element at the rear end.
Otherwise keep increasing the value of rear and insert each element one by one having rear as
the index.
Algorithm
o Step 1: IF REAR = MAX - 1
Write OVERFLOW
Go to step
[END OF IF]
o Step 2: IF FRONT = -1 and REAR = -1
SET FRONT = REAR = 0
ELSE
SET REAR = REAR + 1
[END OF IF]
o Step 3: Set QUEUE[REAR] = NUM
o Step 4: EXIT
C Function
void insert (int queue[], int max, int front, int rear, int item)
{
if (rear + 1 == max)
{
printf("overflow");
}
else
{
if(front == -1 && rear == -1)
{
front = 0;
rear = 0;
}
else
{
rear = rear + 1;
}
queue[rear]=item;
44
}
}
Algorithm to delete an element from the queue
If, the value of front is -1 or value of front is greater than rear , write an underflow message
and exit.
Otherwise, keep increasing the value of front and return the item stored at the front end of the
queue at each time.
Algorithm
o Step 1: IF FRONT = -1 or FRONT > REAR
Write UNDERFLOW
ELSE
SET VAL = QUEUE[FRONT]
SET FRONT = FRONT + 1
[END OF IF]
o Step 2: EXIT
C Function
int delete (int queue[], int max, int front, int rear)
{
int y;
if (front == -1 || front > rear)
{
printf("underflow");
}
else
{
y = queue[front];
if(front == rear)
{
front = rear = -1;
else
front = front + 1;
45
}
return y;
}
}
3.3 SUMMARY
In this unit we have discussed in detail about . Stacks and Queues. Implementation of Stack
and Queue data structures, their applications , and usage in design algorithm.
3.4 KEYWORDS
Stacks ,
Queues.
design algorithm.
3.5 QUESTIONS
1. Discuss Stacks and Queues.

2. Describe Implementation of Stack and Queue data structures, their applications , and
usage in design algorithm.
3.6 REFERENCES
1. "Data Structures and Algorithms" by Michael T. Goodrich, Roberto Tamassia, and

2. "Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson, Ronald L.
3. "Algorithms" by Robert Sedgewick and Kevin Wayne
46
4. "Data Structures and Algorithms in C++" by Adam Drozdek
5. "Data Structures and Algorithms Made Easy" by Narasimha Karumanchi
6. "Data Structures and Algorithm Analysis in C++" by Mark A. Weiss
7. "Data Structures and Algorithms with Object-Oriented Design Patterns in C++" by
Bruno R. Preiss
8. "Data Structures and Algorithm

Analysis in Java" by Mark A. Weiss
9. ChatGpt and Internet.
47
UNIT 4: Linked Lists
Structure:
4.0 Objectives
4.1 Introduction
4.2 Singly Linked Lists, Doubly Linked Lists and Circular Linked Lists, their
Implementation and applications.
4.3 Summary
4.4 Keywords
4.5 Questions
4.6 Reference
4.0 OBJECTIVES
 Singly Linked Lists, Doubly Linked Lists and Circular Linked Lists, their
4.1 INTRODUCTION
A singly linked list is a data structure in which each element (node) contains a data part and a
link to the next node in the sequence.
A singly linked list is a fundamental data structure consisting of a sequence of

elements called nodes. Each node contains two parts:
1. Data Part: This part stores the actual data or payload associated with the
node. It could be of any data type depending on the application.
2. Link (or Pointer) Part: This part contains a reference (or pointer) to the next
node in the sequence. It establishes the logical connection between nodes
and allows traversal through the list.
48
Here's a simple visual representation of a singly linked list:
+-------+ +-------+ +-------+ +-------+
| Node 1| -> | Node 2| -> | Node 3| -> | Node 4| -> NULL
+-------+ +-------+ +-------+ +-------+
In this example:
 Each rectangular box represents a node.

 The arrow symbol (->) represents the link from one node to the next.
 The last node points to NULL, indicating the end of the list.
Key characteristics of singly linked lists include:
 Dynamic Size: Singly linked lists can dynamically grow or shrink in size as elements
are added or removed.
 Sequential Access: Elements are accessed sequentially starting from the head node
and traversing through subsequent nodes.
 Efficient Insertion and Deletion: Insertion and deletion operations are efficient at
the beginning or end of the list, but may require traversal for operations in the middle.
 No Random Access: Unlike arrays, singly linked lists do not support direct access to
elements by index; traversal is required to reach a specific node.
Singly linked lists are widely used in various applications, such as:
 Implementing other data structures like stacks, queues, and trees.

 Managing memory dynamically, especially in situations where the size of data is not
known beforehand.
 Representing sequences of data that can be easily modified without the need for
shifting elements (unlike arrays).
 Performing operations like insertion and deletion efficiently, particularly when
dealing with large datasets.
49
Understanding singly linked lists is fundamental in computer science and forms the basis for
more complex data structures and algorithms.
Implementation of a stack data structure in C:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int data[MAX_SIZE];
int top;
} Stack;
void initialize(Stack *stack) {
stack->top = -1;
int isEmpty(Stack *stack) {
return stack->top == -1;
int isFull(Stack *stack) {
return stack->top == MAX_SIZE - 1;
50
}
void push(Stack *stack, int value) {
if (!isFull(stack)) {
stack->data[++stack->top] = value;
} else {
printf("Stack overflow!\n");
int pop(Stack *stack) {
return stack->data[stack->top--];
} else {
printf("Stack underflow!\n");
int peek(Stack *stack) {
51
return stack->data[stack->top];
} else {
printf("Stack is empty!\n");
int main() {
Stack myStack;
initialize(&myStack);
// Pushing elements onto the stack
push(&myStack, 10);
push(&myStack, 20);
push(&myStack, 30);
push(&myStack, 40);
// Peeking at the top element
printf("Top element: %d\n", peek(&myStack));
52
// Popping elements from the stack
printf("Popped element: %d\n", pop(&myStack));
printf("Popped element: %d\n", pop(&myStack));
// Peeking at the top element after popping
printf("Top element: %d\n", peek(&myStack));
return 0;
This implementation includes the following functionalities:
 initialize: Initializes the stack.

 isEmpty: Checks if the stack is empty.
 isFull: Checks if the stack is full.
 push: Pushes an element onto the stack.
 pop: Pops (removes and returns) the top element from the stack.
 peek: Returns the top element of the stack without removing it.
Doubly Linked List:
A doubly linked list is a data structure similar to a singly linked list, with the addition of each
node having a pointer/reference to the previous node as well as the next node. This
bidirectional linkage allows traversal of the list in both forward and backward directions.
Here's a visual representation of a doubly linked list:
53
NULL <-> [Node 1] <-> [Node 2] <-> [Node 3] <-> ... <-> [Node N] <-> NULL
In a doubly linked list:
 Each node contains two pointers: prev (pointer to the previous node) and next
(pointer to the next node).
 The first node's prev pointer and the last node's next pointer are NULL, indicating
the start and end of the list, respectively.
 Nodes can be traversed in both forward and backward directions.
Here's a simple implementation of a doubly linked list in C:
#include <stdio.h>
#include <stdlib.h>
int data;
struct Node* prev;
struct Node* next;
} Node;
typedef struct {
Node* head;
Node* tail;
} DoublyLinkedList;
54
void initialize(DoublyLinkedList *list) {
list->head = NULL;
list->tail = NULL;
void insertAtBeginning(DoublyLinkedList *list, int value) {
newNode->prev = NULL;
newNode->next = list->head;
if (list->head != NULL) {
list->head->prev = newNode;
} else {
list->tail = newNode; // if list was empty, newNode is now both head and tail
55
void insertAtEnd(DoublyLinkedList *list, int value) {
newNode->prev = list->tail;
if (list->tail != NULL) {
list->tail->next = newNode;
} else {
list->head = newNode; // if list was empty, newNode is now both head and tail
list->tail = newNode;
void displayForward(DoublyLinkedList *list) {
56
printf("\n");
void displayBackward(DoublyLinkedList *list) {
Node* current = list->tail;
current = current->prev;
printf("\n");
int main() {
DoublyLinkedList myList;
// Inserting elements at the beginning
insertAtBeginning(&myList, 10);
57
// Inserting elements at the end
insertAtEnd(&myList, 40);
// Displaying the list forward and backward
printf("Forward: ");
displayForward(&myList);
printf("Backward: ");
displayBackward(&myList);
return 0;
This implementation demonstrates:
 initialize: Initializes the doubly linked list.

 insertAtBeginning: Inserts a node with the given value at the beginning of the list.
 insertAtEnd: Inserts a node with the given value at the end of the list.
 displayForward: Displays the elements of the list in forward direction.
58
 displayBackward: Displays the elements of the list in backward direction.
Doubly linked lists are used in scenarios where bidirectional traversal is required, such as
implementing text editors (for undo/redo functionality), implementing cache algorithms, and
representing sparse matrices.
Applications:
Doubly linked lists find application in various scenarios where bidirectional traversal and
efficient insertion and deletion operations at both ends of the list are required. Some common
applications include:
1. Text Editors:
 Doubly linked lists can be used to implement the data structure behind text
editors for features like undo and redo operations.
 Each node in the list can represent a text buffer, and the bidirectional links
allow for efficient navigation and manipulation of the text.
2. Cache Implementation:
 Doubly linked lists are used in implementing LRU (Least Recently Used) and
MRU (Most Recently Used) cache eviction policies.
 Each node in the list represents a cached item, and when an item is accessed, it
is moved to the front or end of the list depending on the eviction policy.
3. Browser History:
 Doubly linked lists can be used to implement browser history functionalities.
 Each visited page can be stored as a node in the list, and bidirectional links
allow for efficient navigation through the browsing history.
4. Music Playlist:
 Doubly linked lists are suitable for implementing music playlist functionalities
in media players.
 Each song can be represented as a node in the list, and bidirectional links
allow for moving forward and backward through the playlist.
5. Deque (Double-ended Queue) Implementation:
59
 Doubly linked lists can be used to implement deque data structures, where
insertion and deletion operations are allowed at both ends of the list.
 This is particularly useful in scenarios where both FIFO (First-In-First-Out)
and LIFO (Last-In-First-Out) operations are required.
6. Sparse Matrix Representation:
 Doubly linked lists are used to represent sparse matrices efficiently.
 Each node in the list represents a non-zero element of the matrix, and the
bidirectional links allow for easy traversal and manipulation of the matrix
elements.
7. Undo/Redo Functionality in Software:
 Doubly linked lists are commonly used to implement undo/redo functionality
in various software applications.
 Each action performed by the user can be stored as a node in the list, and
bidirectional links allow for efficient navigation through the action history.
Overall, doubly linked lists are versatile data structures with applications in various domains
where bidirectional traversal and efficient insertion and deletion operations are required.
Circular Linked List:
A circular linked list is similar to a singly linked list, but the last node points back to the first
node.
A circular linked list is a variation of the linked list data structure where the last node's next
pointer points back to the first node, forming a circular loop. This means that the list has no
ending node; it loops back to the beginning.
Here
60
's a visual representation:
+-------+ +-------+ +-------+ +-------+
| Node 1| -> | Node 2| -> | Node 3| -> | Node 4|
+-------+ +-------+ +-------+ +-------+
^ |
In a circular linked list:
 Each node contains two parts: data and a pointer to the next node.
 The last node's next pointer points back to the first node, forming a loop.
 Circular linked lists can be singly or doubly linked.
Here's a simple implementation of a singly circular linked list in C:
#include <stdio.h>
#include <stdlib.h>
int data;
struct Node* next;
} Node;
typedef struct {
Node* head;
61
} CircularLinkedList;
void initialize(CircularLinkedList *list) {
list->head = NULL;
void insertAtEnd(CircularLinkedList *list, int value) {
newNode->next = list->head; // Point to itself to create the circular loop
} else {
Node* temp = list->head;
while (temp->next != list->head) {
temp = temp->next;
62
temp->next = newNode;
newNode->next = list->head; // Point back to the head to complete the loop
void display(CircularLinkedList *list) {
printf("Circular linked list is empty.\n");
return;
do {
} while (current != list->head);
printf("\n");
63
int main() {
CircularLinkedList myList;
// Inserting elements at the end
// Displaying the circular linked list
display(&myList);
return 0;
This implementation includes:
 initialize: Initializes the circular linked list.

 insertAtEnd: Inserts a node with the given value at the end of the list.
 display: Displays the elements of the circular linked list.
Applications of circular linked lists include:
64
1. Round-robin Scheduling: Used in scheduling algorithms where tasks are executed in
a circular manner.
2. Circular Buffers: Used in data streaming applications to manage continuous data
flow.
3. Implementation of Circular Queue: Used in data structures where elements are
inserted and removed in a circular manner, such as in the BFS algorithm's queue
implementation.
These types of linked lists have their own advantages and applications, and the choice
depends on the specific requirements of the problem. Singly linked lists are simple and
memory-efficient, doubly linked lists provide bidirectional traversal, and circular linked lists
have applications where nodes need to be traversed in a loop.
4.2 Singly Linked Lists, Doubly Linked Lists and Circular Linked Lists, their
Singly Linked Lists, Doubly Linked Lists and Circular Linked Lists, their
REPRESENTATION OF A SINGLY LINKED LIST:
A singly linked list is a collection of nodes, which can be used to effectively implement
linear data structures such as stacks and queues. In general, each node in a singly linked list
consists of several fields out which one field is used to hold the address of the next node in a
list and all other fields are used to hold the data items of type primitive. Figure 4.1 shows an
example of a singly linked list with three fields. The first two fields hold the data and the
third field contains the address of the next node in the list.
(a) Singly linked list
65
(b) Structure of the node
Fig. 4.1 A singly linked list and its node structure
Every node is a chunk of memory having an address. When a set of data elements to be
used by an application are represented using a linked list, each data element is represented by
a node. Depending on the information content of the data element, one or more data fields are
used in the node. However, in singly linked list only a single link field is used to point to
node, which represents its neighbouring element in the list. The last node in the linked list has
its link empty. The empty link is normally denoted using a cross mark.
Figure 4.2 shows the physical representation of a linked list where its logical
representation is shown in Figure 4.1. Note that the nodes are distributed all over the memory
and are not physically contiguous. Also observe that the link field of each node contains the
address of the node of its logical neighbour. The link field of the last node is NUL indicated
by a cross symbol. In the above example, the node at address 4026 containing the data
elements E and 50 is the last node of the linked list. This can be easily understood with the
Figure 4.1.
Fig. 4.2 Physical representation of a singly linked list
66
OPERATIONS ON SINGLY LINKED LISTS:
We have briefly described the various operations that can be performed on linked list, in
general, in Unit-I. In fact, we can realize all those operations in singly linked list. The
following sections describe the logical representation of all these operations along with their
associated algorithms.
1. Inserting a node at the front end of the list:
Algorithm: Insert-First-SLL
Input: F, address of first node.
e, element to be inserted.
Output: F, updated.
Method:
1. n = getnode()
2. n.data = e
3. n.link = F
4. F = n
Algorithm ends
In the above algorithm, n indicates the node to be added into the list. The getnode() function
allocates the required memory locations to a node n and assigns the address to it. The
statement n.data = e copies the element into the data field of the node n. The statement n.link
= F copies the address of the first node into the link field of the node n. Finally, the address of
the node n is placed in the pointer variable F, which always points to the first node of the list.
Thus an element is inserted into a singly linked list at the front. The Figure 4.3 below shows
the logical representation of the above operation. In figure 4.3(a) the pointer F contains the
address of the first node i.e. 1000. Figure 4.3(b) shows the node n to be inserted, which has
67
its address 5000 holding the data A. Figure 4.3(c) shows the linked list after inserting the
node n at the front end of the list. Now, F contains the address of the new node i.e. 5000.
(a) Linked list before insertion (b) Node to be inserted.
(c) Linked list after insertion
Figure 4.3 Logical representation of the insertion operation.
2. Inserting a node at the rear end of the list:
Algorithm: Insert-Last-SLL
Output: F, updated.
Method: 1. n = getnode()
5. n.data = e
6. n.link = Null
7. if (F = Null) then F = n
68
else
T=F
While (T.link ≠ Null) Do
T = T.Link
End-of-While
T.Link = n
Endif
Algorithm ends
Algorithm above illustrates the insertion of node at the end of a singly linked list. Observe
that the link field of the node to be inserted is Null. This is because the inserted node
becomes the end of the list, whose link field always contains the Null value. If the list is
empty, the inserted node becomes the first node and hence the pointer F will hold the address
of this inserted node. Otherwise, the list is traversed using a temporary variable till it reaches
the end of the list. This is accomplished using a statement T = T.Link. Once the last
node is found, the address of the new node is copied to the link field. Figure 4.4 below shows
the logical representation of inserting a node at the end of the linked list.
(a) Linked list before insertion (b) Node to be inserted
69
Figure 4.4 Logical representation of the insertion operation
3. Inserting a node into a sorted singly linked list:
Algorithm: Insert-Sorted-SLL
Output: F, updated.
Method: 1. n = getnode()
2. n.data = e
3. if (F = Null) then n.link = Null
F=n
else
If (e ≤ F.data) then n.link = F
F=n
Else
T=F
While (T.link ≠ Null) and (T.link.data < e) Do
T = T.Link
End-of-While
n.link = T.link
T.link = n
Endif
70
Endif
Algorithm ends
The above algorithm illustrates the operation of inserting an element into a sorted linked list.
This operation requires the information about the immediate neighbour of the current
element. This is done using the statement T.link.data. This is because, we need to insert an
element between two nodes. The rest of the statements are self explanatory and can be traced
with an example for better understanding of the algorithm. Figure 4.5 below shows the
logical representation of this operation.
(a) Linked list before insertion (b) Node to be inserted.
Figure 4.5 Logical representation of the insertion operation.
4. Deleting a first node from a singly linked list:
Algorithm: Delete-First-SLL
Input: F, address of the first node.
71
Output: F, updated.
Method:
1. if (F = Null) then Display “List is empty”

else
T=F
F = F.link
Dispose(T)
ifend
Algorithm ends
The above Algorithm is self explanatory. The function Dispose(T) is a logical deletion. It
releases the memory occupied by the node and freed memory is added to available memory
list. Figure 3.6 below shows the logical representation of this operation.
(a) Linked list before deletion
(a) Linked list after deletion
Figure 4.6 Logical representation of the deletion operation
5. Deleting a node at the end of a singly linked list:
Algorithm: Delete-Last-SLL
72
Output: F, updated.
Method: 1. if (F = Null) then Display “ List is empty”
else
If (F.link = Null) then Dispose(F)
F = Null
else
T=F
While ((T.link).link ≠ Null) Do
T = T.Link
End-of-While
T.link = Null
T.link = n
endif
endif
Algorithm ends
From the algorithm above, we understand that we may come across three cases while deleting
an element from a list. When the list is empty, we just display an appropriate message. When
a list contains only one element, we remove it and the pointer holding the address of the first
node will be assigned a Null value. When a list contains more than one element, we traverse
through the linked list to reach the end of the linked list. Since we need to update the link
field of the node, predecessor to the node to be deleted, it requires checking the link field of
the next node being in the current node. This is accomplished using the statement
(T.link).link. Figure 4.7 below shows the logical representation of this operation.
73
(a) Linked list after deletion
Figure 4.7 Logical representation of the deletion operation
6. Deleting a node with a given element from a singly linked list:
Algorithm: Delete-Element-SLL
e, element to be deleted.
Output: F, updated.
Method:
1. if (F = Null) then Display “ List is empty”
else
if (F.data = e) then F = F.link
else
74
T=F
While (((T.link).link ≠ Null) and ((T.link).data) ≠ e) Do
T = T.Link
End-of-While
if (T.link = Null) then Display “ Element e is not found”
else
V = T.link
T.link = (T.link).link
Dispose(V)
endif
endif
endif
Algorithm ends
The above algorithm is self explanatory as described in the earlier algorithm. Figure 4.8
below shows the logical representation of this operation.
75
(a) Linked list after deleting a node containing element 40
Figure 4.8
Note: The combination of Insertion-First-SLL and Deletion-First-SLL or the combination

of Insert-Last-SLL and Delete-Last-SLL can be used to realize the basic stack
operation using singly linked list.
The Efficiency of Linked Lists
Insertion and deletion at the beginning of a linked list are very fast. They involve changing
only one or two pointers, which takes O(1) time.
Finding or deleting a specified item requires searching through, on the average, half the items
in the list. This requires O(N) comparisons. An array is also O(N) for these operations, but
the linked list is nevertheless faster because nothing needs to be moved when an item is
inserted or removed. The increased efficiency can be significant, especially if a copy takes
much longer than a comparison.
Of course, another important advantage of linked lists over arrays is that the linked list uses
exactly as much memory as it needs, and can expand to fill all available memory.
The size of an array is fixed when it is created; this usually leads to inefficiency because the
array is too large, or to running out of room because the array is too small.
CIRCULARLY LINKED LIST AND ITS REPRESENTATION:
Definition: A linear linked list organized in such a way that link field of the last node
contains the address of the first node is called as a circularly linked list. Figure 4.9 illustrates
the representation of a circularly linked list.
76
Figure 4.9. Logical representation of a circularly linked list
From the above representation of circularly linked list, it is understood that during processing
one has to make sure that one does not get into an infinite loop owing to the circular nature of
pointers in the list. A solution to this problem is to designate a special node to act as the head
of the list. This node is usually referred to as header node. This header node has its
advantage other than pointing to the beginning of a list. The list can never be empty and
represented by a hanging pointer (F = Null) as was the case with empty singly linked lists. A
circular linked list becomes empty when head points to the head node of the list i.e.
(Head.link = Head). A circular linked list with an header node is called a headed circularly
linked list. Figure 4.10 shows an empty headed circularly linked list. Figure 4.11 shows the
logical representation of a headed circularly linked list.
4.10. An empty circularly linked list
4.11 Logical representation of a headed circularly linked list
Observe that the header node has the same structure as the other nodes in the list. The data
field of the header node is unused and is indicated as a shaded field in the pictorial
representation. However, in practical applications these fields may be utilized to represent
any useful information about the list relevant to the application.
77
OPERATIONS ON CIRCULARLY LINKED LISTS
Let us understand the various primitive operations that can be performed on circularly linked
lists.
1. Inserting an element at the beginning of the circularly linked list.

To insert an element at the beginning of the circularly linked list, we need to know the
address of the header node and an element to be inserted. The algorithm below is used to
implement this operation. Figure 4.12 shows the logical representation of this operation.
Algorithm: Insert-First-CLL
Input: H, address of the header node.
Output: H, updated.
Method:
1. n = Getnode()
2. n.data = e
3. n.link = H.link
4. H.link = n
Algorithm ends
(a) Before insertion
(b) After insertion

Figure 4.12. Pictorial representation of Insert-First-CLL operation
2. Inserting an element at the end of the circularly linked list.
78
To insert an element at the end of the circularly linked list, we need to know the address of
the header node and an element to be inserted. The algorithm below is used to implement this
operation. Figure 4.13 shows the pictorial representation of this operation.
Algorithm: Insert-Last-CLL
Output: H, updated.
Method:
1. n = Getnode()
2. n.data = e
3. n.link = H
4. T = H
5. While (T.link ≠ H) DO
T = T.link
End-of-while
6. T.link = n
Algorithm ends
(a) Before insertion
(a) After insertion

Figure 4.13 Pictorial representation of Insert-Last-CLL operation
We can observe from the above figure that inserting a node (element) at end of the list is very
simple. First, we have to traverse the list till we reach the end of the list. The last node of the
list is the one whose link field contains the address of the header node as shown in the Figure
79
3.6(a). Once we find this node, its link field is replaced with the address of the new node to
be inserted and the address of the header node is copied to the link field of the inserted node.
3. Deleting an element at the beginning of the circularly linked list.

To delete an element at the beginning of the circularly linked list, we need to know the
address of the header node. Deletion is done by just replacing the link field of the header
node with address contained in the link field of the node to be deleted. The algorithm below
is used to implement this operation and is self explanatory. Figure 4.14 shows the pictorial
representation of this operation.
Algorithm: Delete-First-CLL
Output: H, updated.
Method:
1. If (H.link = Null) then Display “List is empty”
else
H.link = (H.link).link
endif
Algorithm ends
(a) Before deletion
(a) After deletion

Figure 4.14 Pictorial representation of Delete-First-CLL operation
4. Deleting an element at the end of the circularly linked list.
80
To delete an element at the end of the circularly linked list, first, we have to traverse the list
till we reach the node previous to the last node of the list. Deletion is done by just replacing
the link field of the last but one node with the address of the header node. The algorithm
below is used to implement this operation and is self explanatory. Figure 4.15 shows the
pictorial representation of this operation.
Algorithm: Delete-Last-CLL
Output: H, updated.
Method:
1. If (H.link = Null) then Display “List is empty”
else
T=H
While ((T.link).link ≠ H) DO
T = T.link
End-of-while
2. T.link = H
Algorithm ends
(a) Before deletion
(a) After deletion

Figure 4.15 Pictorial representation of Delete-Last-CLL operation
4. Splitting circularly linked list into two lists
We can split the given circularly linked list into two lists such that a node containing the
element e becomes the first node of the second list. Let H1 be the address of the header node
81
of the list. To accomplish this task, we have to traverse the list to find the node containing the
element e. Once found, the link field of the node previous to this node is replaced with the
address of the header node H1. The address of the node containing the element e is copied to
another header node say H2 and this list is traversed till we reach the end i.e. the node whose
link field contains the address of the header node H1. Now, the link field of this node is
replaced with the address of the header node H2. Thus we get the two lists with header nodes
H1 and H2. The following algorithm is used to implement this operation and the Figure 4.16
shows the pictorial representation of this operation.
Algorithm: Split-CLL
Input: H, address of the header node and element e.
Output: H1, H2, headers.
Method: 1. H2 = Get-Header()
2. T = H
3. While (((T.link).link ≠ H) and ((T.link).data ≠ e)) DO
T = T.link
End-of-while
4. if (T.link ≠ H) then
H2.link = T.link
T.link = H
T = H2
While (((T.link).link ≠ H) DO
T = T.link
End-of-while
T.link = H2 ; H1 = H
Else
H2.link = H2
endif
Algorithm ends
82
(a) Before splitting
(a) List-1 with header H1

Figure 4.16 Pictorial representation of Split-CLL operation
4. Combining two circularly linked list into a single list

We can combine the given two circularly linked lists in to a single list. This operation is very
simple and straight forward when compared to splitting operation. Let H1 be the address of
the header node of the first list and H2 be the address of the header node of the second list. To
accomplish this task, we have to traverse the first list to find the last node of the list i.e. the
node whose link field contains the address of the header node H1. Once found, the link field
of this node is replaced with the address of the header node H2. Then we have to traverse the
second list to find the last node of the list i.e. the node whose link field contains the address
of the header node H2. Once found, the link field of this node is replaced with the address of
the header node H1. Now the header node H2 is no longer required and is disposed. The
following algorithm is used to implement this operation.
Algorithm: Combine-CLL
Input: H1 and H2, address of the header nodes.
Output: H1, Header.
Method: 1. T = H1
2. While (T.link ≠ H1) DO
T = T.link
End-of-while
83
3. T.link = H2.link
4. While (T.link ≠ H2) DO
T = T.link
End-of-while
5. T.link = H1
6. Dispose(H2)
Algorithm ends
(b) List-2 with header H2
(c) After combining
DOUBLY LINKED LIST AND ITS REPRESENTATION
Definition: A linear linked list organized in such a way that every node consists of one or
more data fields and two link fields that contain references to the previous and to the next
node in the sequence. It can be viewed as two singly-linked lists formed from the same data
items, in two opposite orders.
The two links allow walking along the list in either direction with equal ease. Compared to a
singly-linked list, modifying a doubly-linked list usually requires changing more pointers, but
is sometimes simpler because there is no need to keep track of the address of the previous
84
node. The link fields are often called forward and backwards, or next and previous. A
pointer to any node of a doubly-linked list gives access to the whole list. Figure 3.10 shows
the pictorial representation of a doubly linked list and Figure 4.17 shows the pictorial
representation of node structure in doubly linked list
Figure 4.17 Pictorial representation of a doubly linked list.
Figure 4.17 Pictorial representation of node structure in doubly linked list.
OPERATIONS ON DOUBLY LINKED LISTS
Let us understand the various primitive operations that can be performed on doubly linked
lists.
1. Inserting an element at the beginning of the doubly linked list.
To insert an element at the beginning of the doubly linked list, we need to know the address
of the first and last node of the list and an element to be inserted. The algorithm below is used
to implement this operation. Figure 4.18 shows the pictorial representation of this operation.
Algorithm: Insert-First-DLL
Input: F, R, address of first and last nodes.
Output: F and R, updated.
Method:
1. n = Getnode()
2. n.data = e
3. n.blink = null
4. n.flink = F
5. If (F = null) then R = n
Else F.blink = n
endif
85
6. F = n
Algorithm ends
(a) List before insertion
(a) List after insertion
Figure 4.18 The pictorial representation of insertion operation.
From the above pictorial representation, we can understand that inserting an element at the
beginning of the doubly linked list requires updation of blink and flink pointers of the node to
be inserted as well as the node pointed by the F pointer. After insertion, the pointer F is also
updated to point to the inserted node. If the node inserted is the first node then we have to
update the pointer R as well.
2. Inserting an element at the beginning of the doubly linked list.
To insert an element at the end of the doubly linked list, we need to know the address of the
first and last node of the list and an element to be inserted. The algorithm below is used to
implement this operation. Figure 4.19 shows the pictorial representation of this operation.
Algorithm: Insert-Last-DLL
Input: F, R, address of first and last nodes
Method:
1. n = Getnode()
2. n.data = e
3. n.flink = null
4. n.blink = R
5. If (R = null) then F = n
86
Else R.flink = n
endif
6. R = n
Algorithm ends
(a) List after insertion
From the above pictorial representation, we can understand that inserting an element at the
end of the doubly linked list requires updation of blink and flink pointers of the node to be
inserted as well as the node pointed by the R pointer. After insertion, the pointer R is also
updated to point to the inserted node. If the node inserted is the first node then we have to
update the pointer F as well.
3. Deleting an element at the beginning of the doubly linked list.

To delete an element at the beginning of the doubly linked list, we need to know the address
of the first and last node of the list. The algorithm below is used to implement this operation.
Figure 3.14 shows the pictorial representation of this operation.
Algorithm: Delete-First-DLL
Method:
1. If (F=null) then Display “List is empty”)
Else
T=F
87
F = F.flink
If (F = null) then R = null
Else
F.blink = null
Endif
Dispose(T)
Endif
Algorithm ends
Deleting the first node from a doubly linked list is a simple task. We need the address of the
first (F) and the last node (R) of the list. If (F=null) means the list is empty. Otherwise, F is
made to point to next node using the statement F = F.flink thus deleting the first node. After
deletion, if (F=null) then R is also null. That means if there is only one element then after
deletion, the list will become empty; otherwise, the node next to the one deleted will become
the first node of the list and hence its blink will become null. That is F.blink = null. Finally,
the memory occupied by the node is released. The Figure 4.20 shows the pictorial
(a) List before deletion
(b) List after deletion
Figure 4.20 The pictorial representation of deletion operation.
4. Deleting an element at the end of the doubly linked list.

To delete an element at the end of the doubly linked list, we need to know the address of the
first and last node of the list. The algorithm below is used to implement this operation. Figure
3.15 shows the pictorial representation of this operation.
88
Algorithm: Delete-Last-DLL
Method:
1. If (R=null) then Display “List is empty”)
Else
T=R
R = R.blink
If (R = null) then F = null
Else
R.flink = null
Endif
Dispose(T)
Endif
Algorithm ends
Deleting the last node from a doubly linked list is a simple task. We need the address of the
first (F) and the last node (R) of the list. If (R=null) means the list is empty. Otherwise, R is
made to point to next node using the statement R = R.blink thus deleting the last node. After
deletion, if (R=null) then F is also made to null. That means if there is only one element then
after deletion, the list will become empty; otherwise, the node previous to the one deleted will
become the last node of the list and hence its flink will become null. That is R.flink = null.
Finally, the memory occupied by the node is released. The Figure 4.21 shows the pictorial
(b) List after deletion
89
5. Inserting an element into a sorted doubly linked list.

To insert an element e into a sorted doubly linked list, we need to know the address of the
first and last node of the list. If the elements in the linked list are arranged in ascending order,
then we have to traverse the list starting with the first node, searching for a node whose
element is greater than or equal to the element e. If found then the element e is inserted as a
node predecessor to this node and the link fields of the current node the neighbouring nodes
are updated. The algorithm below is used to implement this operation. Figure 4.22 shows the
pictorial representation of this operation.
Algorithm: Insert-Sort-DLL
Method: 1. n = Getnode()
2. n.data = e
3. if (F=null) then
n.blink = null
n.flink = null
F=n
R=n
else
if (e ≤ F.data) then
n.flink = F
n.blink = null
F.blink = n
F=n
else
T=F
While ((T.flink ≠ null) and (T.flink).data < e) DO
90
T = T.flink
End-of-while
n.flink = T.flink
n.blink = T
T.flink = n
If (T = R) then R = n
else (n.flink).blink = n
endif
endif
endif
Algorithm ends
(b) List after insertion
6. Deleting an element e from a doubly linked list.

To delete an element e from a doubly linked list, we need to know the address of the first and
last node of the list. We have to traverse the list starting with the first node, searching for a
node whose element is equal to the element e. To delete the node containing element e, we
must be at the node predecessor to this node. After deletion the link fields of the
neighbouring nodes are updated. The algorithm below is used to implement this operation.
Figure 4.23 shows the pictorial representation of this operation.
91
(a) List after deletion
Algorithm: Delete-Element-DLL
Method: 1 if (F=null) then Display “List is empty“
else
if (e = F.data) then
F = F.flink
if (F = null) then R = null
else
F.blink = null
endif
else
T=F
While ((T.flink ≠ null) and (T.flink).data ≠ e) DO
T = T.flink
End-of-while
if (T = R) then Display “Element e is not found”
else
T.flink = (T.flink).flink
if (T.flink = null) then R = T
else
92
(T.flink).blink = T
endif
endif
endif
endif
Algorithm ends
4.3 Summary
 A singly linked list is the simplest of a linked representation.

 We can realize the operations of stack data structure using linked list functions: insert-
rear(), delete-front() and display().
 We can realize the operations of queue data structure using linked list functions:
insert-front(), delete-rear() and display().
 A circularly linked list is an enhancement of the singly linked list representation, in
that the nodes are circularly linked. This provides better flexibility in handling
delete operation
 The doubly linked list has one or more data item fields but two link fields blink and
flink, respectively pointing to the predecessor and successor of the node. Though the
list exhibits the advantages of greater flexibility and efficient delete operation, it
suffers from the drawback of increased storage requirement and data movement
operations when compared to singly linked lists.
4.4 Keywords
Linear data structure, Singly Linked lists, circularly linked lists, doubly linked lists, header node.
4.5 Questions
1. What are singly linked lists? Explain with an illustrative example.
93
2. Mention the various operations performed on singly linked lists.
3. Briefly explain the implementation of stack using singly linked list.
4. Briefly explain the implementation of queue using singly linked list.
5. Design an algorithm to delete all the nodes with a given element from a singly linked
list.
6. Design an algorithm to delete alternative occurrence of an element from a singly
linked list.
7. Design an algorithm to delete an element e from a sorted singly linked list.
8. What are circular linked lists? Explain with an illustrative example.
9. Mention the various operations performed on circularly linked lists.
10. What is an header? Why is it required? Explain with an example.
11. What are the advantages of circularly linked lists?
12. Design an algorithm to delete all the nodes with a given element from a circularly linked list.
13. Write a note on Singly Linked Lists, Doubly Linked Lists and Circular Linked Lists,
their Implementation and applications.
4.6 Reference

Bruno R. Preiss
8. "Data Structures and Algorithm

Analysis in Java" by Mark A. Weiss
9. ChatGpt and Internet.
94
95
Unit-5 TREES
Structure
5.0 Objectives
5.1 Introduction
5.2 Terminology and Definition of Tree
5.3 Binary Tree
5.4 Binary Search Tree
5.5 AVL trees,
5.6 B-trees, and their implementation
5.7 Traversal algorithms, and applications.
5.8 Summary
5.9 Keywords
5.10 Questions
5.11 References
5.0 OBJECTIVES
After studying this unit, we will be able to
 Explain the basic terminologies of trees.

 Discuss the importance of Binary Tree
 Describe the representation of binary trees based on Sequential Allocation.
5.1 INTRODUCTION
The trees are also non-linear data structures, which are very useful in representing
hierarchical relationships among the data items. For example, in real life, if we want to
express the relationship exists among the members of the family then we use nonlinear
structures like trees. Organizing the data in a hierarchical structure plays a very important
role for most of the applications, which involve searching. Trees are the most useful and
widely used data structure in Computer Science in the areas of data storage, parsing,
evaluation of expressions, and compiler design.
5.2 DEFINITION AND BASIC TERMINOLOGIES OF TREES
Definition: A tree is defined as a finite set of one or more nodes such that
(i) there is a specially designated node called the root and
(ii) the rest of the nodes could be partitioned in to t disjoint sets (t ≥ 0) each set
representing a tree Ti, i = 1, 2, 3, …, t known as subtree of the tree.
A node in the definition of the tree represents an item of information and the links between
the nodes termed as branches, represent an association between the items of information.
Figure 5.2.1 shows a tree.
Figure 5.2.1 An example tree
In the above figure, node 1 represents the root of the tree, nodes 2, 3, 4 and 9 are all
intermediate nodes and nodes 5, 6, 7, 8, 10, 11 and 12 are the leaf nodes of the tree. The
definition of the tree emphasizes on the aspect of (i) connectedness and (ii) absence of loops
or cycles. Beginning from the root node, the structure of the tree permits connectivity of the
root to every other node in the tree. In general, any node is reachable from any where in the
tree. Also, with branches providing links between the nodes, the structure ensures that no set
of nodes link together to form a closed loop or cycle.
Some Properties of Tree

1. There is one and only one path between every pair of vertices in a tree, T.
2. A tree with n vertices has n-1 edges.
3. Any connected graph with n vertices and n-1 edges is a tree.
4. A graph is a tree if and only if it is minimally connected.
Therefore, a graph with n vertices is called a tree if
1. G is connected and is circuit less, or
2. G is connected and has n-1 edges, or
3. G is circuit less and has n-1 edges, or
4. There is exactly one path between every pair of vertices in G, or
5. G is a minimally connected graph.
There are several basic terminologies associated with trees. There is a specially designated
node called the root node. The number of subtrees of a node is known as the degree of the
node. Nodes that have zero degree are called leaf nodes or terminal nodes. The rest of them
are called intermediate nodes. The nodes, which hang from branches emanating from a node,
are called as children and the node from which the branches emanate is known as the parent
node. Children of the same parent node are referred to as siblings. The ancestors of a given
node are those nodes that occur on the path from the root to the given node. The degree of a
tree is the maximum degree of the node in the tree. The level of the node is defined by letting
the root node to occupy level 0. The rest of the nodes occupy various levels depending on
their association. Thus, if parent node occupies level i then, its children should occupy level
i+1. This renders a tree to have a hierarchical structure with root occupying the top most
level of 0. The height or depth of a tree is defined to be the maximum level of any node in
the tree. A forest is a set of zero or more disjoint trees. The removal of the root node from a
tree result in a forest.
5. 3 BINARY TREE
A binary tree has the characteristic of all nodes having at most two branches, that is, all nodes
have a degree of at most 2. Therefore, a binary tree can be empty or consist of a root node
and two disjointed binary trees termed left subtree and right subtree. Figure 5.3.1 shows an
example binary tree.
Figure 5.3.1 An example binary tree

The number of levels in the tree is called the “depth” of the tree. A “complete” binary tree is
one which allows sequencing of the nodes and all the previous levels are maximally
accommodated before the next level is accommodated. i.e., the siblings are first
accommodated before the children of any one of them. And a binary tree, which is maximally
accommodated with all leaves at the same level is called “full” binary tree. A full binary tree
is always complete but a complete binary tree need not be full. Fig. 5.3.2 is an example for a
full binary tree and Figure 2.3 illustrates a complete binary tree.
Figure 5.3.2 A complete binary tree
The maximum number of vertices at each level in a binary tree can be found out as follows:
At level 0: 20 number of vertices
…
At level i: 2i number of vertices
Therefore, maximum number of vertices in a binary tree of depth ‘l’ is:
20 + 21 + 22 + … + 2l
i.e., ∑2k = 2 l+1 – 1 for k = 0 to l
5.4 BINARY SEARCH TREE
A binary tree has the characteristic of all nodes having at most two branches, that is, all
nodes have a degree of at most 2. Therefore, a binary tree can be empty or consist of a
root node and two disjointed binary trees termed left subtree and right subtree. Figure
5.4 shows an example binary tree.
Figure 5.4 An example binary tree
The number of levels in the tree is called the “depth” of the tree. A “complete”
binary tree is one which allows sequencing of the nodes and all the previous levels
are maximally accommodated before the next level is accommodated. i.e., the
siblings are first accommodated before the children of any one of them. And a
binary tree, which is maximally accommodated with all leaves at the same level is
called “full” binary tree. A full binary tree is always complete but a complete
binary tree need not be full. Fig. 5.4 is an example for a full binary tree and Figure
5.4.1 illustrates a complete binary tree.
Figure 5.4.1 A complete binary tree
The maximum number of vertices at each level in a binary tree can be found out as
follows:

At level i: 2i number of vertices
Therefore, maximum number of vertices in a binary tree of depth ‘l’ is:
20 + 21 + 22 + … + 2l
i.e., ∑2k = 2 l+1 – 1 for k = 0 to l

Binary Tree can be represented using sequential as well as linked data structures. In
sequential data structures, we have two ways of representing the binary tree. One is through
the use of Adjacency matrices and the other is through the use of Single dimensional array
representation.
 Adjacency Matrix Representation

A two dimensional array can be used to store the adjacency relations very easily and can be
used to represent a binary tree. In this representation, to represent a binary tree with n vertices
we use n×n matrix. Figure 5.4.2(a) shows a binary tree and Figure 5.4.2(b) shows its
adjacency matrix representation.
(a) A binary tree (b) Adjacency matrix representation
Figure 5.4.2 A binary tree and its adjacency matrix representation
Here, the row indices correspond to the parent nodes and the column corresponds to the child
nodes. i.e., a row corresponding to the vertex vi having the entries ‘L’ and ‘R’ indicate that vi
has its left child, the index corresponding to the column with the entry ‘L’ and has its right
child, the index corresponding to the column with the entry ‘R’. The column corresponds to
vertex vi with no entries indicate that it is the root node. All other columns have only one
entry. Each row may have 0, 1 or 2 entries. Zero entry in the row indicates that the
corresponding vertex vi is a leaf node, only one entry indicates that the node has only one
child and two entries indicate that the node has both the left and right children. The entry “L”
is used to indicate the left child and “R” is used to indicate the right child entries.
From the above representation, we can understand that the storage space utilization is not
efficient. Now, let us see the space utilization of this method of binary tree representation. Let
‘n’ be the number of vertices. The space allocated is n x n matrix. i.e., we have n2 number of
locations allocated, but we have only n-1 entries in the matrix. Therefore, the percentage of
space utilization is calculated as follows:
The percentage of space utilized decreases as n increases. For large ‘n’, the percentage of
utilization becomes negligible. Therefore, this way of representing a binary tree is not
efficient in terms of memory utilization.
 Single Dimensional Array Representation

Since the two-dimensional array is a sparse matrix, we can consider the prospect of mapping
it onto a single dimensional array for better space utilization. In this representation, we have
to note the following points:
 The left child of the ith node is placed at the 2ith position.
 The right child of the ith node is placed at the (2i+1)th position.
 The parent of the ith node is at the (i/2)th position in the array.
If l is the depth of the binary tree then, the number of possible nodes in the binary tree is 2 l+1-
1. Hence it is necessary to have 2l+1-1 locations allocated to represent the binary tree.
If ‘n’ is the number of nodes, then the percentage of utilization is
Figure 5.4.3 shows a binary tree and Figure 5.4.4 shows its one-dimensional array
representation.
Figure 5.4.3 A binary tree
Figure 5.4.4 One-dimensional array representation
For a complete and full binary tree, there is 100% utilization and there is a maximum wastage
if the binary tree is right skewed or left skewed, where only l+1 spaces are utilized out of the
2l+1 – 1 spaces.
An important observation to be made here is that the organization of the data in the binary
tree decides the space utilization of the representation used.
5.5 AVL TREES
The first type of self-balancing binary search tree to be invented is the AVL tree. The name
AVL tree is coined after its inventor's names − Adelson-Velsky and Landis.In AVL trees, the
difference between the heights of left and right subtrees, known as the Balance Factor, must
be at most one. Once the difference exceeds one, the tree automatically executes the
balancing algorithm until the difference becomes one again.
BALANCE FACTOR = HEIGHT(LEFT SUBTREE) – HEIGHT(RIGHT SUBTREE)
There are usually four cases of rotation in the balancing algorithm of AVL trees: LL, RR, LR,
RL.
 LL Rotations
LL rotation is performed when the node is inserted into the right subtree leading to an
unbalanced tree. This is a single left rotation to make the tree balanced again −
Fig : 5.5.1LL Rotation

The node where the unbalance occurs becomes the left child and the newly added node
becomes the right child with the middle node as the parent node.
 RR Rotations
RR rotation is performed when the node is inserted into the left subtree leading to an
unbalanced tree. This is a single right rotation to make the tree balanced again −
Fig :5.5.2 RR Rotation

The node where the unbalance occurs becomes the right child and the newly added node
becomes the left child with the middle node as the parent node.
 LR Rotations
LR rotation is the extended version of the previous single rotations, also called a double
rotation. It is performed when a node is inserted into the right subtree of the left subtree. The
LR rotation is a combination of the left rotation followed by the right rotation. There are
multiple steps to be followed to carry this out.
 Consider an example with “A” as the root node, “B” as the left child of “A” and “C”
as the right child of “B”.
 Since the unbalance occurs at A, a left rotation is applied on the child nodes of A, i.e.
B and C.
 After the rotation, the C node becomes the left child of A and B becomes the left child
of C.
 The unbalance still persists, therefore a right rotation is applied at the root node A and
the left child C.
 After the final right rotation, C becomes the root node, A becomes the right child and
B is the left child.
Fig :5.5.3 LR Rotation
 RL Rotations
RL rotation is also the extended version of the previous single rotations, hence it is called a
double rotation and it is performed if a node is inserted into the left subtree of the right
subtree. The RL rotation is a combination of the right rotation followed by the left rotation.
There are multiple steps to be followed to carry this out.
 Consider an example with “A” as the root node, “B” as the right child of “A” and “C”
as the left child of “B”.
 Since the unbalance occurs at A, a right rotation is applied on the child nodes of A,
i.e. B and C.
 After the rotation, the C node becomes the right child of A and B becomes the right
child of C.
 The unbalance still persists, therefore a left rotation is applied at the root node A and
the right child C.
 After the final left rotation, C becomes the root node, A becomes the left child and B
is the right child.
Fig :5.5.4 RL Rotation
 Basic Operations of AVL Trees

The basic operations performed on the AVL Tree structures include all the operations
performed on a binary search tree, since the AVL Tree at its core is actually just a binary
search tree holding all its properties. Therefore, basic operations performed on an AVL Tree
are − Insertion and Deletion.
Insertion
The data is inserted into the AVL Tree by following the Binary Search Tree property of
insertion, i.e. the left subtree must contain elements less than the root value and right subtree
must contain all the greater elements. However, in AVL Trees, after the insertion of each
element, the balance factor of the tree is checked; if it does not exceed 1, the tree is left as it
is. But if the balance factor exceeds 1, a balancing algorithm is applied to readjust the tree
such that balance factor becomes less than or equal to 1 again.
Algorithm
The following steps are involved in performing the insertion operation of an AVL Tree −
Step 1 − Create a node
Step 2 − Check if the tree is empty
Step 3 − If the tree is empty, the new node created will become the root node of the AVL
Tree.
Step 4 − If the tree is not empty, we perform the Binary Search Tree insertion operation and
check the balancing factor of the node in the tree.
Step 5 − Suppose the balancing factor exceeds ±1, we apply suitable rotations on the said
node and resume the insertion from Step 4.
START
if node == null then:
return new node
if key < node.key then:
node.left = insert (node.left, key)
else if (key > node.key) then:
node.right = insert (node.right, key)
else
return node
node.height = 1 + max (height (node.left), height (node.right))
balance = getBalance (node)
if balance > 1 and key < node.left.key then:
rightRotate
if balance < -1 and key > node.right.key then:
leftRotate
if balance > 1 and key > node.left.key then:
node.left = leftRotate (node.left)
rightRotate
if balance < -1 and key < node.right.key then:
node.right = rightRotate (node.right)
leftRotate (node)
return node
END
 Insertion Example
Let us understand the insertion operation by constructing an example AVL tree with 1 to 7
integers.
Starting with the first element 1, we create a node and measure the balance, i.e., 0.
Since both the binary search property and the balance factor are satisfied, we insert another
element into the tree.
The balance factor for the two nodes are calculated and is found to be -1 (Height of left
subtree is 0 and height of the right subtree is 1). Since it does not exceed 1, we add another
element to the tree.
Now, after adding the third element, the balance factor exceeds 1 and becomes 2. Therefore,
rotations are applied. In this case, the RR rotation is applied since the imbalance occurs at
two right nodes.
The tree is rearranged as
Similarly, the next elements are inserted and rearranged using these rotations. After
rearrangement, we achieve the tree as −
The balance of the tree still remains 1, therefore we leave the tree as it is without performing
any rotations.
 Example
Following are the implementations of this operation in various programming languages
#include <stdio.h>
#include <stdlib.h>
struct Node {
int data;
struct Node *leftChild;
struct Node *rightChild;
int height;
};
int max(int a, int b);
int height(struct Node *N){
if (N == NULL)
return 0;
return N->height;
}
int max(int a, int b){
return (a > b) ? a : b;
}
struct Node *newNode(int data){
struct Node *node = (struct Node *) malloc(sizeof(struct Node));
node->data = data;
node->leftChild = NULL;
node->rightChild = NULL;
node->height = 1;
return (node);
}
struct Node *rightRotate(struct Node *y){
struct Node *x = y->leftChild;
struct Node *T2 = x->rightChild;
x->rightChild = y;
y->leftChild = T2;
y->height = max(height(y->leftChild), height(y->rightChild)) + 1;
x->height = max(height(x->leftChild), height(x->rightChild)) + 1;
return x;
}
struct Node *leftRotate(struct Node *x){
struct Node *y = x->rightChild;
struct Node *T2 = y->leftChild;
y->leftChild = x;
x->rightChild = T2;
return y;
}
int getBalance(struct Node *N){
if (N == NULL)
return 0;
return height(N->leftChild) - height(N->rightChild);
}
struct Node *insertNode(struct Node *node, int data){
if (node == NULL)
return (newNode(data));
if (data < node->data)
node->leftChild = insertNode(node->leftChild, data);
else if (data > node->data)
node->rightChild = insertNode(node->rightChild, data);
else
return node;
node->height = 1 + max(height(node->leftChild),
height(node->rightChild));
int balance = getBalance(node);
if (balance > 1 && data < node->leftChild->data)
return rightRotate(node);
if (balance < -1 && data > node->rightChild->data)
return leftRotate(node);
if (balance > 1 && data > node->leftChild->data) {
node->leftChild = leftRotate(node->leftChild);
}
if (balance < -1 && data < node->rightChild->data) {
node->rightChild = rightRotate(node->rightChild);
}
return node;
}
struct Node *minValueNode(struct Node *node){
struct Node *current = node;
while (current->leftChild != NULL)
current = current->leftChild;
return current;
}
void printTree(struct Node *root){
if (root == NULL)
return;
if (root != NULL) {
printTree(root->leftChild);
printf("%d ", root->data);
printTree(root->rightChild);
}
}
int main(){
struct Node *root = NULL;
root = insertNode(root, 22);
printf("AVL Tree: ");
printTree(root);
return 0;
}
Output
AVL Tree: 14 22 25 44 63 72 98
After deletion: 14 22 44 63 72 98
 Implementation of AVL Trees

In the following implementation, we consider the inputs in ascending order and store them in
AVL Trees by calculating the balance factor and applying rotations.
Example
Following are the implementations of this operation in various programming languages.
#include <stdio.h>
#include <stdlib.h>
struct Node {
int data;
struct Node *leftChild;
struct Node *rightChild;
int height;
};
int max(int a, int b);
int height(struct Node *N){
if (N == NULL)
return 0;
return N->height;
}
int max(int a, int b){
return (a > b) ? a : b;
}
struct Node *newNode(int data){
struct Node *node = (struct Node *) malloc(sizeof(struct Node));
node->data = data;
node->leftChild = NULL;
node->rightChild = NULL;
node->height = 1;
return (node);
}
struct Node *rightRotate(struct Node *y){
struct Node *x = y->leftChild;
struct Node *T2 = x->rightChild;
x->rightChild = y;
y->leftChild = T2;
return x;
}
struct Node *leftRotate(struct Node *x){
struct Node *y = x->rightChild;
struct Node *T2 = y->leftChild;
y->leftChild = x;
x->rightChild = T2;
return y;
}
int getBalance(struct Node *N){
if (N == NULL)
return 0;
return height(N->leftChild) - height(N->rightChild);
}
struct Node *insertNode(struct Node *node, int data){
if (node == NULL)
return (newNode(data));
if (data < node->data)
node->leftChild = insertNode(node->leftChild, data);
else if (data > node->data)
node->rightChild = insertNode(node->rightChild, data);
else
return node;
node->height = 1 + max(height(node->leftChild),
height(node->rightChild));
int balance = getBalance(node);
if (balance > 1 && data < node->leftChild->data)
if (balance < -1 && data > node->rightChild->data)
if (balance > 1 && data > node->leftChild->data) {
node->leftChild = leftRotate(node->leftChild);
}
if (balance < -1 && data < node->rightChild->data) {
node->rightChild = rightRotate(node->rightChild);
}
return node;
}
struct Node *minValueNode(struct Node *node){
struct Node *current = node;
while (current->leftChild != NULL)
current = current->leftChild;
return current;
}
struct Node *deleteNode(struct Node *root, int data){
if (root == NULL)
return root;
if (data < root->data)
root->leftChild = deleteNode(root->leftChild, data);
else if (data > root->data)
root->rightChild = deleteNode(root->rightChild, data);
else {
if ((root->leftChild == NULL) || (root->rightChild == NULL)) {
struct Node *temp = root->leftChild ? root->leftChild : root->rightChild;
if (temp == NULL) {
temp = root;
root = NULL;
} else
*root = *temp;
free(temp);
} else {
struct Node *temp = minValueNode(root->rightChild);
root->data = temp->data;
root->rightChild = deleteNode(root->rightChild, temp->data);
}
}
if (root == NULL)
return root;
root->height = 1 + max(height(root->leftChild),
height(root->rightChild));
int balance = getBalance(root);
if (balance > 1 && getBalance(root->leftChild) >= 0)
return rightRotate(root);
if (balance > 1 && getBalance(root->leftChild) < 0) {
root->leftChild = leftRotate(root->leftChild);
return rightRotate(root);
}
if (balance < -1 && getBalance(root->rightChild) <= 0)
return leftRotate(root);
if (balance < -1 && getBalance(root->rightChild) > 0) {
root->rightChild = rightRotate(root->rightChild);
return leftRotate(root);
}
return root;
}
// Print the tree

void printTree(struct Node *root){
if (root != NULL) {
printTree(root->leftChild);
printf("%d ", root->data);
printTree(root->rightChild);
}
}
int main(){
struct Node *root = NULL;
printf("AVL Tree: ");
printTree(root);
root = deleteNode(root, 25);
printf("\nAfter deletion: ");
printTree(root);
return 0;
}
Output
AVL Tree: 14 22 25 44 63 72 98
After deletion: 14 22 44 63 72 98
5.6 B-TREES AND THEIR IMPLEMENTATION
Here we will see what are the B-Trees. The B-Trees are specialized m-way search tree.
This can be widely used for disc access. A B-tree of order m, can have maximum m-1
keys and m children. This can store large number of elements in a single node. So the
height is relatively small. This is one great advantage of B-Trees.
B-Tree has all of the properties of one m-way tree. It has some other properties.
• Every node in B-Tree will hold maximum m children
• Every node except root and leaves, can hold at least m/2 children The root
nodes must have at least two children.
• All leaf nodes must have at same level
Example of B-Tree
Fig 5.6.1 B-tree
This supports basic operations like searching, insertion, deletion. In each node, the
item will be sorted. The element at position i has child before and after it. So
children sored before will hold smaller values, and children present at right will
hold bigger values. Here we will see, how to perform the insertion into a B-Tree.
Suppose we have a B-Tree like below
Example of B-Tree
Fig 5.6.2
To insert an element, the idea is very similar to the BST, but we have to follow
some rules. Each node has m children, and m-1 elements. If we insert an element
into one node, there are two situations. If the node has elements less than m-1, then
the new element will be inserted directly into the node. If it has m-1 elements, then
by taking all elements, and the element which will be inserted, then take the median
of them, and the median value is send to the root of that node by performing the
same criteria, then create two separate lists from left half and right half of the node.
Suppose we want to insert 79 into the tree. At first it will be checked with root, this
is greater than 56. Then it will come to the right most sub-tree. Now it is less than
81, so move to the left sub-tree. After that it will be inserted into the root. Now
there are three elements [66, 78, 79]. The median value is 78, so 78 will go up, and
the root node becomes [79, 81], and the elements of the node will be split into two
nodes. One will hold 66, and another will hold 79.
B-Tree after inserting 79.

Fig 5.6.3
 Algorithm BTree Insert (root, key)
Input − The root of the tree, and key to insert We will assume, that the key is not
present into the list x := Read root
If x is full then
Y:= new node
Z:=new node
Locate the middle object oi, stored in x, move the objects to the left of oi in to node y
Move the object to the right of oi into node z.
If x is an index node, then move the child pointers

Accordingly x->child[1] := address of y x-
>child[2] := address of z end if
Deletion of B-Tree
Here we will see, how to perform the deletion of a node from B-Tree. Suppose we have a
BTree like below
 Example of B-Tree
Fig 5.6.4
Deletion has two parts. At first we have to find the element. That strategy is like the
querying. Now for deletion, we have to care about some rules. One node must have
at-least m/2 elements. So if we delete, one element, and it has less than m-1
elements remaining, then it will adjust itself. If the entire node is deleted, then its
children will be merged, and if their size issame as m, then split them into two
parts, and again the median value will go up.
Suppose we want to delete 46. Now there are two children. [45], and [47, 49], then
they will be merged, it will be [45, 47, 49], now 47 will go up.
 Algorithm BTree Delete (x, key)
Here we will see, how to perform the deletion of a node from B-Tree. Suppose we have a
BTree like below
Input − The root of the tree, and key to delete We will assume, that the key is present into
the list if x is leaf, then delete object with key ‘key’ from x else if x does not contain the
object with key ‘key’, then locate the child x->child[i] whose key range is holding ‘key’
y := x->child[i] if y has m/2 elements, then If the sibling node z immediate to the left or
right of y, has at least one more object than m/2, add one more object by moving x-
>key[i] from x to y, and move that last or first object from z to x. If y is non-leaf node,
then last or first child pointer in z is also moved to y else any immediate sibling of y has
m/2 elements, merge y with immediate sibling end if
BTree Delete(y, key) else if y that precedes ‘key’ in x, has at-least m/2 + 1 objects, then
find predecessor k of ‘key’, in the sub-tree rooted by y. then recursively delete k from the
sub-tree and replace key with k in x else if ys has m/2 elements, then check the child z,
which is immediately follows ‘key’ in x if z has at least m/2+1 elements, then find
successor k of ‘key’, in the sub-tree rooted by z. recursively delete k from sub-tree, and
replace key with k in x else
both y and z has m/2 elements, then merge then into one node, and push ‘key’ down
to the new node as well. Recursively delete ‘key’ from this new node end if end if.
5.7 TRAVERSAL ALGORITHMS, AND APPLICATIONS.
A traversal of a binary tree is where its nodes are visited in a particular but repetitive order,
rendering a linear order of nodes or information represented by them. There are three simple
ways to traverse a tree. They are called preorder, inorder, and postorder. In each technique,
the left subtree is traversed recursively, the right subtree is traversed recursively, and the root
is visited. What distinguishes the techniques from one another is the order of those three
tasks. The following sections discuss these three different ways of traversing a binary tree.
 Preorder Traversal
In this traversal, the nodes are visited in the order of root, left child and then right child.
o Process the root node first.
o Traverse left sub-tree.
 Traverse right sub-tree.

Repeat the same for each of the left and right subtrees encountered. Here, the leaf nodes
represent the stopping criteria. The pre-order traversal sequence for the binary tree shown in
Figure 4.1 is: A B D E H I C F G
Figure 5.7.1 A binary tree
 Inorder Traversal
In this traversal, the nodes are visited in the order of left child, root and then right child. i.e.,
the left sub-tree is traversed first, then the root is visited and then the right sub-tree is
traversed. The function must perform only three tasks.
o Traverse the left subtree.

o Process the root node.
o Traverse the right subtree.
Remember that visiting a node means doing something to it: displaying it, writing it to a file
and so on. The inorder traversal sequence for the binary tree shown in Figure 4.1 is: D B H E
I A F C G.
 Postorder Traversal
In this traversal, the nodes are visited in the order of left child, right child and then the root.
i.e., the left sub-tree is traversed first, then the right sub-tree is traversed and finally the root
is visited. The function must perform the following tasks.
o Traverse the left subtree.
o Traverse the right subtree.
o Process the root node.
The postorder traversal sequence for the binary tree shown in Figure 5.8.1 is: D H I E B F G
C A.
 Traversal of a Binary Tree Represented in an Adjacency Matrix

The steps involved in traversing a binary tree from the adjacency matrix representation,
firstly requires finding out the root node. Then it entails to traverse through the left subtree
and then the right subtree in specific orders. In order to remember the nodes already visited, it
is necessary to maintain a stack data structure. Thus, following are the steps involved in
traversing trough the binary tree given in an adjacency matrix representation.
o Locate the root (the column sum is zero for the root)
o Display
o Push in to the stack
o Scan the row in search of ‘L’ for the left child information
o Pop from the stack
o Scan the row in search of ‘R’ for the right child information
o Check if array Is Empty ().
o Stop
Sequencing the above stated steps helps us in arriving at preorder, inorder and postorder
traversal sequences.
 Binary Tree Traversal from 1D Array Representation

Preorder Traversal
Algorithm: Preorder Traversal
Input: A[], one dimensional array representing the binary tree
i, the root address //initially i =1
Output: Preorder sequence
Method:
If (A[i] ≠ 0)
Display (A[i])
Preorder Traversal (A, 2i)
Preorder Traversal (A, 2i +1)
If end
Algorithm ends
Inorder Traversal
Algorithm: Inorder Traversal
Output: Inorder sequence
Method:
If (A[i] ≠ 0)
Inorder Traversal (A, 2i)
Display (A[i])
Inorder Traversal (A, 2i +1)
If end
Algorithm ends
Postorder Traversal
Algorithm: Postorder Traversal
Output: Postorder sequence
Method:
If (A[i] ≠ 0)
Postorder Traversal (A, 2i)
Postorder Traversal (A, 2i +1)
Display(A[i])
If end
Algorithm ends
Binary Tree Traversal in Linked Representation
We have already studied that every node of a binary tree in linked representation has a
structure which has links to the left and right children. The algorithms for traversing the
binary tree in linked representation are given below.
Algorithm: Preorder Traversal

Input: bt, address of the root node
Output: Preorder sequence
Method:
If (bt ≠ NULL)
Display ([bt].data)
Preorder Traversal ([bt].Lchild)
Preorder Traversal ([bt].Rchild)
If end
Algorithm ends.
Algorithm: Inorder Traversal

Output: Inorder sequence
Method:
If (bt ≠ NULL)
Inorder Traversal ([bt].Lchild)
Display ([bt].data)
Inorder Traversal ([bt].Rchild)
If end
Algorithm ends.
Algorithm: Postorder Traversal

Output: Postorder sequence
Method:
If (bt ≠ NULL)
Postorder Traversal ([bt].Lchild)
Postorder Traversal ([bt].Rchild)
Display ([bt].data)
If end
Algorithm ends.
 OPERATIONS ON BINARY TREE

We have already discussed an important operation performed on binary trees called traversal.
The various other operations that can be performed on binary trees are discussed as follows.
 Insertion
To insert a node containing an item into a binary tree, first we need to find the position for
insertion. Suppose the node pointed to by temp has to be inserted whose information field
contains the item J as shown in Figure 5.8.2, we need to maintain an array say D, which
contains only the directions where the node temp has to be inserted.
Figure 5.7.2 To insert a item J
If D contains ‘LRLR’, from the root node, first move towards left (L), then right (R), then
left (L) and finally move towards right (R). If the pointer points to null at that position, node
temp can be inserted otherwise, it cannot be inserted. To achieve this, one has to start from
the root node. Let us use two pointers prev and cur where prev always points to parent node
and cur points to child node. Initially cur points to root node and prev points to null. To start
with one can write the following statements.
prev = null
cur = root
Now, keep updating the node pointed to by cur towards left if the direction is ‘L’ otherwise,
update towards right. Once all directions are over, if current points to null, insert the node
temp towards left or right based on the last direction. Otherwise, display error message. This
procedure can be algorithmically expressed as follows.
Algorithm: Insert Node

Input: root, address of the root node
e, element to be inserted
D, direction array
Output: Tree updated
Method:
1. temp = Getnode()
2. temp.info = e
3. temp.llink = temp.rlink = null
4. if (root = null) return temp // Node inserted for the first time
5. prev = null, cur = root, k = 0
6. While ( k < strlen(D)) DO
If (cur = null) exit
prev = cur
if (D[k] = ‘L’) then
cur = cur.llink
else
cur = cur.rlink
Whileend
7. If ((cur ≠ null) OR k ≠ strlen(D)) then
Display “Insertion not possible”
Free(temp)
Return(root)
8. If (D[k – 1] = ‘L’) then
prev.llink = temp
Else
prev.rlink = temp
Ifend
9. root
10. stop
Algorithm ends
 Searching
To search for an item in a tree, we can traverse a tree in any of the (inorder, preorder,
postorder) order to visit the node. As we visit the node, we can compare the item to be
searched with the data item stored in information field of the node. If found then the search is
successful otherwise, search is unsuccessful. A recursive inorder traversal technique used for
searching an item in binary tree is presented below.
Algorithm: Search(item, root, flag)
Input: item, data to be searched

root, address of the root node
flag, status variable
Output: Item found or not found
Method:
1. if (root = null then
flag = false
exit
ifend
2. Search (item, root.llink, flag)
3. if (item = root.info) then
flag = true
exit
ifend
4. Search (item, root.rlink, flag)
5. if (flag = true) then display “Data item is found”

else display “Data item is not found”
ifend
6. stop
Algorithm ends
 Deletion
Deletion of a node from a binary tree involves searching for a node which contains the data
item. If such a node is found then that node is deleted; otherwise, appropriate message is
displayed. If the node to be deleted is a leaf node then the deletion operation is a simple task.
Otherwise, appropriate modifications need to be done to update the binary tree after deletion.
This operation is explained in detail considering another form of a binary tree called binary
search tree.
 Binary Search Tree
Binary Search Tree (BST) is an ordered Binary Tree in that it is an empty tree or value of
root node is greater than all the values in Left Sub Tree (LST) and less than all the values of
Right Sub Tree (RST). Right and Left sub trees are again binary sub trees by themselves.
Figure 5.7.3 (a) shows an example binary tree where as Figure 5.7.3 (b) is not a binary search
tree but a binary tree.
Figure 5.7.3(a) A binary search tree Figure 5.7.3 (b) Not a binary search tree
We will be using BST structure to demonstrate features of Binary Trees. The operations
possible on a binary tree are
 Create a Binary Tree
 Insert a node in a Binary tree
 Delete a node in a Binary Tree
 Search for a node in Binary search Tree
Algorithm: Creating Binary Tree

Step 1: Do step 2 to 3 till stopped by the user
Step 2: Obtain a new node and assign value to the node
Step 3: Insert on to a Binary Search tree
Step 4: return
Algorithm: Insertion of node into a Binary Search Tree (BST)

InsertNode (node, value)
Check if Tree is empty
if (empty ) then Enter the node as root
else // find the proper location for insertion
if (value < value of current node)
If (left child is present)
InsertNode( LST, Value)
ifend
else
allocate new node and make LST pointer point to it
ifend
else if (value > value of current node)
if ( right child is present)
InsertNode( RST, Value);
else
allocate new node and make RST pointer point to it
ifend
ifend
ifend
Figure 5.8.4 (a – b) illustrates an insertion operation.
Figure 5.7.4 (a) Before insertion Figure 5.7.4 (b) After insertion of item 5
 Deleting a node from a Binary Search Tree

There are three distinct cases to be considered when deleting a node from a BST. They are
a) Node to be deleted is a leaf node.
Make its parent to point to NULL and free the node. For example, to delete node 4 the
right pointer of its parent node 5 is made to point to NULL and free node 4. Figure 5.8.5
shows an instance of delete operation.
Figure 5.7.5 Deleting a leaf node 4
(b) Deleting a node with one child only, either left child or Right child.
For example, delete node 9 that has only a right child as shown in Figure 5.7.6 The right
pointer of node 7 is made to point to node 11. The new tree after deletion is shown in 5.7.7
Figure 5.7.6 Deletion of node with only one child Figure 5.7.7 New tree after deletion
(c) Node to be deleted is an intermediate node.

 To perform operation RemoveElement(k), we search for key k
 Assume key k is in the tree, and let v be the node storing k
 If node v has a leaf child w, we remove v and w from the tree with operation
RemoveAboveExternal(w)
 Example: Remove 4
Figure 5.7.8 Before deletion Figure 5.7.9 After deletion

Searching for a node in a Binary Search Tree
 To search for a key k, we trace a downward path starting at the root
 The next node visited depends on the outcome of the comparison of k with the key of
the current node
 If we reach a leaf, the key is not found and we return NO_SUCH_KEY
 Example: findElement(4)
Algorithm: findElement (k, v)

if T.isExternal (v)
return NO_SUCH_KEY
if k < key(v)
return findElement(k, T.leftChild(v))
else if k = key(v)
return element(v)
else { k > key(v) }
return findElement(k, T.rightChild(v))
Algorithm ends
Figure 5.7.10 shows an instance of searching for a node containing element 4.
Figure 5.7.10 Searching operation
5.8 SUMMARY
In this unit Trees and binary trees are non-linear data structures, which are inherently two
dimensional in structures. While trees are non-empty and may have nodes of any degree, a
binary tree may be empty r hold nodes of degree at most two. The terminologies of root node,
height, level, parent, children, sibling, ancestors, leaf or terminal nodes and non-terminal
nodes are applicable to both trees and binary trees and traversal algorithms and their
applications.
5.9 KEYWORDS
Trees
binary trees
non-linear data structures
root node
5.10 QUESTIONS FOR SELF STUDY
1) Define tree and binary tree.

2) Differentiate complete and full binary trees
3) What is the maximum number of nodes in a binary tree of level 7, 8 and 9
4) Explain the two techniques used to represent a binary tree based on sequential
allocation method.
5) What are the advantages of single dimensional array based representation over
adjacency matrix representation of a binary tree?
6) Describe the node structure used to represent a node in a binary tree.
7) Describe binary search tree with an example diagram.
8) Explain the insertion, deletion and searching operations on binary search trees.
9) Design algorithms to traverse a binary tree represented using linked data structure.
10) Design algorithms to perform insertion, deletion and searching operations on binary
search trees.
5.11 REFERENCES
 Sartaj Sahni, 2000, Data structures, algorithms and applications in C++, McGraw Hill
international edition.
 Horowitz and Sahni, 1983, Fundamentals of Data structure, Galgotia publications
 Horowitz and Sahni, 1998, Fundamentals of Computer algorithm, Galgotia
publications.
 Narsingh Deo, 1990, Graph theory with applications to engineering and computer
science, Prentice hall publications.
 Tremblay and Sorenson, 1991, An introduction to data structures with applications,
McGraw Hill edition.
UNIT-6.0 GRAPHS
Structure
6.0 Objectives
6.1 Introduction
6.2 Basic Definitions
6.3 Graph Data Structure
6.4 Representation of Graphs
6.5 adjacency matrix
6.6 Adjacency list and graph traversal algorithms.
6.7 Summary
6.8 Keywords
6.9 Questions for self-study
6.10 References
6.0 OBJECTIVES
After studying this unit, we will be able to explain the following:
 Basic terminologies of graph

 Graph Data Structure.
 Graph Representation based on Sequential Allocation
 Graph Representation based on Linked Allocation.
6.1 INTRODUCTION
we have defined non-linear data structure and we mentioned that trees and graphs are the
examples of non-linear data structure. To recall, in non-linear data structures unlike linear
data structures, an element is permitted to have any number of adjacent elements. Graph is an
important mathematical representation of a physical problem, for example finding optimum
shortest path from a city to another city for a traveling sales man, so as to minimize the cost.
A graph can have unconnected node. Further there can be more than one path between two
nodes. Graphs and directed graphs are important to computer science for many real-world
applications from building compilers to modeling physical communication networks. A graph
is an abstract notion of a set of nodes (vertices or points) and connection relations (edges or
arcs) between them.
6.2 BASIC DEFINITIONS
Definition1: A graph G = (V,E) is a finite nonempty set V of objects called vertices

together with a (possibly empty) set E of unordered pairs of distinct vertices of G called
edges.
Definition2: A digraph G = (V,E) is a finite nonempty set V of vertices together with a

(possibly empty) set E of ordered pairs of vertices of G called arcs
An arc that begins and ends at a same vertex u is called a loop. We usually (but not always)
disallow loops in our digraphs. By being defined as a set, E does not contain duplicate (or
multiple) edges/arcs between the same two vertices. For a given graph (or digraph) G, we
also denote the set of vertices by V (G) and the set of edges (or arcs) by E (G) to lessen any
ambiguity.
Definition3: The order of a graph (digraph) G = (V, E) is |V| sometimes denoted by |G| and
the size of this graph is |E|
Sometimes we view a graph as a digraph where every unordered edge (u, v) is replaced by
two directed arcs (u, v) and (v, u). In this case, the size of a graph is half the size of the
corresponding digraph.
Definition 4: A walk in a graph (digraph) G is a sequence of vertices v0,v1…vn such that for
all 0 ≤ i < n, (vi,vi+1) is an edge (arc) in G. The length of the walk v0,v1…vn is the number n. A
path is a walk in which no vertex is repeated. A cycle is a walk (of length at least three for
graphs) in which v0 = vn and no other vertex is repeated; sometimes, it is understood, we omit
vn from the sequence.
In the next example, we display a graph G1 and a digraph G2 both of order 5. The size of the
graph G1 is 6 where E(G1) = {(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4) while the size of the
graph G2 is 7 where E(G2) = {(0, 2), (1, 0), (1, 2), (1, 3), (3, 1), (3, 4), (4, 2)}.
A pictorial example of a graph G1 and a digraph G2 is given in figure 6.1
Figure 6.1 A Graph G1 and a digraph G2
Example 1: For the graph G1 of Figure 6.1, the following sequences of vertices are classified
as being walks, paths, or cycles.
v0,v1…vn is walk? is path? is cycle?

01234 Yes Yes No
0120 Yes No Yes
012 Yes Yes Yes
032 No No No
010 Yes No No
Example 2: For the graph G1 of Figure 6.1, the following sequences of vertices are classified
as being walks, paths, or cycles.
v0,v1…vn is walk? is path? is cycle?

01234 No No No
024 No No No
312 Yes Yes No
131 Yes No Yes
31310 Yes No No
Definition 5: A graph G is connected if there is a path between all pairs of vertices u and v of
V(G). A digraph G is strongly connected if there is a path from vertex u to vertex v for all
pairs u an v in V(G).
In Figure 6.1, the graph G1 is connected by the digraph G2 is not strongly connected because
there are no arcs leaving vertex 2. However, the underlying graph G2 is connected.
Definition 6: In a graph, the degree of a vertex v, denoted by deg(v), is the number of edges
incident to v. For digraphs, the out-degree of a vertex v is the number of arcs {(v, x) Є E | x Є
V} incident from v (leaving v) and the in-degree of vertex v is the number of arcs {(v, x) Є E |
x Є V} incident to v (entering v).
For a graph, the in-degree and out-degree’s are the same as the degree. For out graph G1, we
have deg(0) = 2, deg(2) = 4, deg(3) = 2 and deg(4) = 2. We may concisely write this as a
degree sequence (2, 2, 4, 2, 2) id there is a natural ordering (e.g., 0, 1, 2, 3, 4) of the vertices.
The in-degree sequence and out-degree sequence of the digraph G2 are (1, 1, 3, 1, 1) and (1,
3, 0, 2, 1), respectively. The degree of a vertex of a digraph is sometimes defined as the sum
of its in-degree and out-degree. Using this definition, a degree sequence of G2 would be (2, 4,
3, 3, 2).
Definition 7: A weighted graph is a graph whose edges have weights. These weights can be
thought as cost involved in traversing the path along the edge. Figure 6.2 shows a weighted
graph.
Figure 6.2 A weighted graph
Definition 8: If removal of an edge makes a graph disconnected then that edge is called
cutedge or bridge.
Definition 9: If removal of a vertex makes a graph disconnected then that vertex is called
cutvertex.
Definition 10: A connected graph without a cycle in it is called a tree. The pendent vertices
of a tree are called leaves.
Definition 11: A graph without self loop and parallel edges is called a simple graph.
Definition 12: A graph which can be traced without repeating any edge is called an Eulerian
graph. If all vertices of a graph happen to be even degree then the graph is called an Eulerian
graph.
Definition 13: If two vertices of a graph are odd degree and all other vertices are even then it
is called open Eulerian graph. In open Eulerian graph the starting and ending points must be
odd degree vertices.
Definition 14: A graph in which all vertices can be traversed without repeating any edge but
can have any number of edges is called Hamiltonian graph.
Definition 15: Total degree of a graph is twice the number of edges. That is, the total degree
= 2* |E|
Corollary: Number of odd degree vertices of a graph is always even.
 Total degree = Sum of degrees of all vertices = 2 * |E| = Even.
 Sum of degrees of all even degree vertices + Sum of degrees of all odd degree
vertices = Even.
 Even + Sum of vertices of all odd degree vertices = Even.
 Sum of vertices of all odd degree vertices = Even – Even = Even.
6.3 GRAPH DATA STRUCTURE
We can formally define graph as an abstract data type with data objects and operations on it
as follows:
Data objects: A graph G of vertices and edges. Vertices represent data objects.
Operations:
 Check-Graph-Empty(G): Check if graph G is empty - Boolean function
 Insert-Vertex(G, V): Insert an isolated vertex V into a graph G. Ensure that vertex V
does not exist in G before insertion.
 Insert-Edge(G, u, v): Insert an edge connecting vertices u, v into a graph G. Ensure
that an edge does not exist in G before insertion.
 Delete-Vertex(G, V): Delete vertex V and all the edges incident on it from the graph
G. Ensure that such a vertex exists in the graph G before deletion.
 Delete-Edge(G, u, v): Delete an edge from the graph G connecting the vertices u, v.
Ensure that such an edge exists before deletion.
 Store-Data(G, V, Item): Store Item into a vertex V of graph G.
 Retrieve-Data(G, V, Item): Retrieve data of a vertex V in the graph G and return it
in Item.
 BFT(G): Perform Breath First Traversal of a graph.
 DFT(G): Perform Depth First Traversal of a graph.
6.4 REPRESENTATION OF GRAPHS
A graph is a mathematical structure and it is required to be represented as a suitable data

structure so that very many applications can be solved using digital computer. The
representation of graphs in a computer can be categorized as (i) sequential representation and
(ii) linked representation. The sequential representation makes use of an array data structure
where as the linked representation of a graph makes use of a singly linked list as its
fundamental data structure.
 Sequential Representation of Graphs

The sequential or the matrix representations of graphs have the following methods:
o Adjacency Matrix Representation
o Incidence Matrix Representation
 Adjacency Matrix Representation

A graph with n nodes can be represented as n x n Adjacency Matrix A such that an element
Ai j
1 if there is an edge between nodes i and j

Ai j =
0 Otherwise
Note that the number of 1s in a row represents the out degree of a node. In case of undirected
graph, the number of 1s represents the degree of the node. Total number of 1s in the matrix
represents number of edges. Figure 6.4(a) shows a graph and Figure 6.4(b) shows its
adjacency matrix.
Figure 6.4(a) Graph Figure 6.4(b) Adjacency matrix
Figure 6.4(a) shows a digraph and Figure 6.4(b) shows its adjacency matrix.
Figure 6.4(a) Digraph Figure 6.4(b) Adjacency matrix
 Incidence Matrix Representation

Let G be a graph with n vertices and e edges. Define an n x e matrix M = [mij] whose n rows
corresponds to n vertices and e columns correspond to e edges, as
1 ej incident upon vi
Aij =
0 Otherwise
Matrix M is known as the incidence matrix representation of the graph G. Figure 6.4.1 (a)
shows a graph and Figure 6.4.1 (b) shows its incidence matrix.
e1 e2 e3 e4 e5 e6 e7
v1 1 0 0 0 1 0 0
v2 1 1 0 0 0 1 1
v3 0 1 1 0 0 0 0
v4 0 0 1 1 0 0 1
v5 0 0 0 1 1 1 0
Figure 6.4.1(a) Undirected graph Figure 6.4.1 (b) Incidence matrix
The incidence matrix contains only two elements, 0 and 1. Such a matrix is called a binary
matrix or a (0, 1)-matrix.
The following observations about the incidence matrix can readily be made:
1. Since every edge is incident on exactly two vertices, each column of in an incidence
matrix has exactly two1’s.
2. The number of 1’s in each row equals the degree of the corresponding vertex.
3. A row with all 0’s, therefore, represents an isolated vertex.
 Linked Representation of Graphs

The linked representation of graphs is referred to as adjacency list representation and is
comparatively efficient with regard to adjacency matrix representation. Given a graph G with
n vertices and e edges, the adjacency list opens n head nodes corresponding to the n vertices
of graph G, each of which points to a singly linked list of nodes, which are adjacent to the
vertex representing the head node. Figure 6.4.2 (a-b) shows an undirected its linked
representation. Similarly, Figure 6.4.2 (a-b) shows a digraph and its linked representation.
Figure 6.4.1 (a) Undirected graph Figure 6.4.1 (b) Linked representation of a graph
Figure 6.4.2 (a) Digraph Figure 6.4.2 (b) Linked representation of a graph
6.6 ADJACENCY MATRIX

In graph theory, an adjacency matrix is nothing but a square matrix utilised to describe a
finite graph. The components of the matrix express whether the pairs of a finite set of vertices
(also called nodes) are adjacent in the graph or not. In graph representation, the networks are
expressed with the help of nodes and edges, where nodes are the vertices and edges are the
finite set of ordered pairs.
Table of Contents:
o Definition
o Creation from a Graph
o Properties
o Undirected Graph
o Directed Graph
o Example
Graphs can also be defined in the form of matrices. To perform the calculation of paths and
cycles in the graphs, matrix representation is used. It is calculated using matrix operations.
The two most common representation of the graphs are:
o Adjacency Matrix
o Adjacency List
We will discuss here about the matrix, its formation and its properties.
 Adjacency Matrix Definition
The adjacency matrix, also called the connection matrix, is a matrix containing rows and
columns which is used to represent a simple labelled graph, with 0 or 1 in the position of (Vi ,
Vj) according to the condition whether Vi and Vj are adjacent or not. It is a compact way to
represent the finite graph containing n vertices of a m x m matrix M. Sometimes adjacency
matrix is also called as vertex matrix and it is defined in the general form as
If the simple graph has no self-loops, Then the vertex matrix should have 0s in the diagonal.
It is symmetric for the undirected graph. The connection matrix is considered as a square
array where each row represents the out-nodes of a graph and each column represents the in-
nodes of a graph. Entry 1 represents that there is an edge between two nodes.
The adjacency matrix for an undirected graph is symmetric. This indicates the value in the ith
row and jth column is identical with the value in the jth row and ith column. Additionally, a
fascinating fact includes matrix multiplication. If the adjacency matrix is multiplied by itself
(matrix multiplication), if there is a nonzero value present in the ith row and jth column, there
is a route from Vi to Vj of length equal to two. It does not specify the path though there is a
path created. The nonzero value indicates the number of distinct paths present.
 How to create an Adjacency Matrix?
If a graph G with n vertices, then the vertex matrix n x n is given by
Where, the value aij equals the number of edges from the vertex i to j. For an undirected
graph, the value aij = aji for all i, j , so that the adjacency matrix becomes a symmetric
matrix.Mathematically, this can be explained as:Let G be a graph with vertex set {v1, v2, v3, .
. . , vn}, then the adjacency matrix of G is the n × n matrix that has a 1 in the (i, j)-position if
there is an edge from vi to vj in G and a 0 in the (i, j)-position otherwise.From the given
directed graph, the adjacency matrix is written as
 Properties
The vertex matrix is an array of numbers which is used to represent the information about the
graph. Some of the properties of the graph correspond to the properties of the adjacency
matrix, and vice versa. The properties are given as follows:
 Matrix Powers
The most well-known approach to get information about the given graph from operations on
this matrix is through its powers. The entries of the powers of the matrix give information
about paths in the given graph. The theorem is given below to represent the powers of the
adjacency matrix.
Theorem: Let us take, A be the connection matrix of a given graph. Then the entries i, j of
An counts n-steps walks from vertex i to j.
 Spectrum
The study of the eigenvalues of the connection matrix of a graph is clearly defined in spectral
graph theory. Assume that, A be the connection matrix of a k-regular graph and v be the all-
ones column vector in Rn. Then the i-th entry of Av is equal to the sum of the entries in the
ith row of A. This represents the number of edges proceeds from vertex i, which is exactly k.
 Isomorphisms
The given two graphs are said to be isomorphic if one graph can be obtained from the other
by relabeling vertices of another graph. It is noted that the isomorphic graphs need not have
the same adjacency matrix. Because this matrix depends on the labelling of the vertices. But
the adjacency matrices of the given isomorphic graphs are closely related.
Theorem: Assume that, G and H be the graphs having n vertices with the adjacency matrices
A and B. Then G and H are said to be isomorphic if and only if there is an occurrence of
permutation matrix P such that B=PAP-1.
 Adjacency Matrix Undirected Graph
For an undirected graph, the protocol followed will depend on the lines and loops. That
means each edge (i.e., line) adds 1 to the appropriate cell in the matrix, and each loop adds 2.
Thus, using this practice, we can find the degree of a vertex easily just by taking the sum of
the values in either its respective row or column in the adjacency matrix. This can be
understood using the below example.
From this, the adjacency matrix can be shown as:
 Adjacency Matrix Directed Graph
As explained in the previous section, the directed graph is given as:

The adjacency matrix for this type of graph is written using the same conventions that are
followed in the earlier examples.
 Adjacency Matrix Example
Question:
Write down the adjacency matrix for the given undirected weighted graph
Solution:
The weights on the edges of the graph are represented in the entries of the adjacency matrix
as follows:
6.7 ADJACENCY LIST AND GRAPH TRAVERSAL ALGORITHMS.
The graph is a non-linear data structure. This represents data using nodes, and their relations
using edges. A graph G has two sections. The vertices, and edges. Vertices are represented
using set V, and Edges are represented as set E. So the graph notation is G(V,E). Let us see
one example to get the idea.
In this graph, there are five vertices and five edges. The edges are directed. As an example, if
we choose the edge connecting vertices B and D, the source vertex is B and destination is D.
So we can move B to D but not move from D to B.
The graphs are non-linear, and it has no regular structure. To represent a graph in memory,
there are few different styles. These styles are
o Adjacency matrix representation

o Edge list representation
o Adjacency List representation
Here we will see the adjacency list representation −
 Adjacency List Representation
This representation is called the adjacency List. This representation is based on Linked Lists.
In this approach, each Node is holding a list of Nodes, which are Directly connected with that
vertices. At the end of list, each node is connected with the null values to tell that it is the end
node of that list.
 Pros of Adjacency List

o An adjacency list is efficient in terms of storage because we only need to store the
values for the edges. For a sparse graph with millions of vertices and edges, this can
mean a lot of saved space.
o It also helps to find all the vertices adjacent to a vertex easily.
 Cons of Adjacency List

o Finding the adjacent list is not quicker than the adjacency matrix because all the
connected nodes must be first explored to find them.
 Graph traversal algorithms.

A graph is a non-linear data structure composed of vertices and edges. Edges are lines or arcs
that connect any two nodes in the network. Vertices are also known as nodes. A graph, in
more technical terms, is made up of vertices (V) and edges (E). The representation of a graph
is G(E, V). So, in this article, we will look at some Graph Traversal Techniques.
Graph Traversal in Data Structure
We can traverse a graph in two ways :
1. BFS ( Breadth First Search )
2. DFS ( Depth First Search )
BFS Graph Traversal in Data Structure
Breadth-first search (BFS) traversal is a technique for visiting all nodes in a given network.
This traversal algorithm selects a node and visits all nearby nodes in order. After checking all
nearby vertices, examine another set of vertices, then recheck adjacent vertices. This
algorithm uses a queue as a data structure as an additional data structure to store nodes for
further processing. Queue size is the maximum total number of vertices in the graph.
Graph Traversal: BFS Algorithm
Pseudo Code :
def bfs(graph, start_node):
queue = [start_node]
visited = set()
while queue:
node = queue.pop(0)
if node not in visited:

visited.add(node)
print(node)
for neighbor in graph[node]:

queue.append(neighbor)
 Explanation of the above Pseudocode

o The technique starts by creating a queue with the start node and an empty set to keep
track of visited nodes.
o It then starts a loop that continues until all nodes have been visited.
o During each loop iteration, the algorithm dequeues the first node from the queue,
checks if it has been visited and if not, marks it as visited, prints it (or performs any
other desired action), and adds all its adjacent nodes to the queue.
o The operation is repeated until the queue is empty, indicating that all nodes have been
visited.
 Let us understand the algorithm using a diagram.
In the above diagram, the full way of traversing is shown using arrows.
 Step 1: Create a Queue with the same size as the total number of vertices in the graph.
 Step 2: Choose 12 as your beginning point for the traversal. Visit 12 and add it to the
Queue.
 Step 3: Insert all the adjacent vertices of 12 that are in front of the Queue but have not
been visited into the Queue. So far, we have 5, 23, and 3.
 Step 4: Delete the vertex in front of the Queue when there are no new vertices to visit
from that vertex. We now remove 12 from the list.
 Step 5: Continue steps 3 and 4 until the queue is empty.
 Step 6: When the queue is empty, generate the final spanning tree by eliminating
unnecessary graph edges.
from collections import deque
def bfs(graph, start):

visited = set()
queue = deque([start])
while queue:
vertex = queue.popleft()
if vertex not in visited:
visited.add(vertex)
print(vertex)
queue.extend(graph[vertex] - visited)
return visited
graph = {
'A': {'B', 'C'},
'B': {'A', 'D', 'E'},
'C': {'A', 'F'},
'D': {'B'},
'E': {'B', 'F'},
'F': {'C', 'E'}
}
bfs(graph, 'A')
Output: A B C E D F
 DFS Graph Traversal in Data Structure

When traversing a graph, the DFS method goes as far as it can before turning around. This
algorithm explores the graph in depth-first order, starting with a given source node and then
recursively visiting all of its surrounding vertices before backtracking. DFS will analyze the
deepest vertices in a branch of the graph before moving on to other branches. To implement
DFS, either recursion or an explicit stack might be utilized.
Graph Traversal: DFS Algorithm
Pseudo Code :
def dfs(graph, start_node, visited=set()):
visited.add(start_node)
print(start_node)
for neighbor in graph[start_node]:

if neighbor not in visited:
dfs(graph, neighbor, visited)
Explanation of the above Pseudocode
o The method starts by marking the start node as visited and publishing it (or doing
whatever additional action is needed).
o It then visits all adjacent nodes that have not yet been visited recursively. This
procedure is repeated until all nodes have been visited.
o The algorithm identifies the current node as visited and prints it (or does any other
required action) throughout each recursive call.
o It then invokes itself on all neighboring nodes that have yet to be visited.
Let us understand the algorithm using a diagram.
The entire path of traversal is depicted in the diagram above with arrows.
o Step 1: Create a Stack with the total number of vertices in the graph as its size.
o Step 2: Choose 12 as your beginning point for the traversal. Go to that vertex and
place it on the Stack.
o Step 3: Push any of the adjacent vertices of the vertex at the top of the stack that has
not been visited onto the stack. As a result, we push 5
o Step 4: Repeat step 3 until there are no new vertices to visit from the stack’s top
vertex.
o Step 5: Use backtracking to pop one vertex from the stack when there is no new
vertex to visit.
o Step 6: Repeat steps 3, 4, and 5.
o Step 7: When the stack is empty, generate the final spanning tree by eliminating
unnecessary graph edges.
 Code Implementation
def dfs(graph, start, visited=None):
if visited is None:
visited = set()
visited.add(start)
print(start)
for next_vertex in graph[start] - visited:
dfs(graph, next_vertex, visited)
return visited
graph = {
'A': {'B', 'C'},
'B': {'A', 'D', 'E'},
'C': {'A', 'F'},
'D': {'B'},
'E': {'B', 'F'},
'F': {'C', 'E'}
}
dfs(graph, 'A')
Output: A C F E B D B
6.8 SUMMARY
In this unit we have discussed in detail about the Graphs are non-linear data structures. Graph
is an important mathematical representation of a physical problem. Graphs and directed
graphs are important to computer science for many real-world applications from building
compilers to modeling physical communication networks. A graph is an abstract notion of a
set of nodes (vertices or points) and connection relations (edges or arcs) between them.
6.6 KEYWORDS
Non-linear data structures,

Undirected graphs,
Directed graphs,
Walk, Path,
6. 7 QUESTIONS FOR SELF STUDY
1. Define graph and directed graph.

2. Define walk, path and cycle with reference to graph.
3. Define connected graph and strongly connected graph and give examples.
4. Define in-degree and out-degree of a graph and give some examples.
5. Define cutedge, cutvertex, pendent vertex, Hamiltonian graph, Eulerian graph.
6. Show that the number of odd degree vertices of a graph is always even.
7. Explain graphs as a data structure.
8. Explain two different ways of sequential representation of a graph with an example.
9. Explain the linked representation of an undirected and directed graph.
10. Explain adjacency list and graph traversal algorithms.
6.8 REFERENCES
 Sartaj Sahni, 2000, Data structures, algorithms and applications in C++, McGraw Hill
international edition.
 Horowitz and Sahni, 1983, Fundamentals of Data structure, Galgotia publications
 Narsingh Deo, 1990, Graph theory with applications to engineering and computer
science, Prentice Hall publications.
 Tremblay and Sorenson, 1991, An introduction to data structures with applications,
McGraw Hill edition.
 C and Data Structures by Practice- Ramesh, Anand and Gautham.
 Data Structures and Algorithms: Concepts, Techniques and Applications by GAV Pai.
Tata McGraw Hill, New Delhi.
UNIT 7.0 SORTING ALGORITHMS
Structure
7.0 Objectives
7.1 Introduction to sorting techniques
7.2 Conventional sort
7.3 Selection sort
7.4 Insertion sort
7.5 Bubble sort,
7.6 Quicksort,
7.7 Merge sort,
7.8 Heap sort algorithms,
7.9 Applications of sorting
7.10 Summary
7.11 Keywords
7.12 Questions
7.13 References
7.0 OBJECTIVES
After reading this unit you should be able to

 Explain the basic concept of sorting techniques
 Discuss the working principle of conventional sorting technique
 List out the limitations of conventional sorting technique
 Discuss two other sorting techniques such as selection sort and insertion sort
7.1 INTRODUCTION TO SORTING TECHNIQUES
Sorting is a process of arranging a set of unordered elements in some predefined ordered. The
simplest way of ordering any unordered set of elements in some predefined orders is to
consider randomly one element at a time and array them, continue this procedure for all
elements in the set. If the desired ordered set is obtained stop the procedure else continue
with other combinations. This process shall be continued until the desired order is obtained.
The important thing to be noted in any sorting is to select any desired property on which the
sorting will be performed and that property shall be called as sorting key.
For example, consider an unordered set of records arranged sequentially containing details of
students. If we want to arrange the details of the students based on the alphabetical order of
the student’s name, then we need to sort the records based on the student’s name and student
name will be treated as sorting key. Consider Table 1.1 where we have nine students and
their details which are stored in an unordered way. If we want to store those details in an
ordered way then we need to sort the entire table using any one of the attribute values as a
sorting key.
Table 7.1: Unordered details of students
Register Student Name Course City Date of birth

Number
KSOU123 Harish M.A Mysore 13/03/1983

KSOU125 Mallikarjuna M.Com Davanageri 21/12/1978
KSOU121 Nagasundara M.Sc Nanjanagud 12/02/1980
KSOU120 Sharath M.A Mandya 18/06/1988
KSOU122 Vinay M.Sc Thumkur 11/02/1985
KSOU124 Imran M.Com Challkere 13/01/1983
KSOU126 Anamica M.A Bangalore 01/01/1985
KSOU127 Gowri M.Sc Mysore 06/04/1986
KSOU128 Nayana M.Com Mandya 05/12/1986
Table 7.2 shows an ordered table containing details of students. In this case table is sorted
based on the alphabetical order of the students and hence the sorting key is student name.
Table 7.1: Details of students ordered by their name in alphabetical order
Register Student Name Course City Date of birth

Number
KSOU126 Anamica M.A Bangalore 01/01/1985

KSOU127 Gowri M.Sc Mysore 06/04/1986
KSOU123 Harish M.A Mysore 13/03/1983
KSOU124 Imran M.Com Challkere 13/01/1983
KSOU125 Mallikarjuna M.Com Davanageri 21/12/1978
KSOU121 Nagasundara M.Sc Nanjanagud 12/02/1980
KSOU128 Nayana M.Com Mandya 05/12/1986
KSOU120 Sharath M.A Mandya 18/06/1988
KSOU122 Vinay M.Sc Thumkur 11/02/1985
Note:
The sorting key can be created from two or more sort keys. The first key is termed as primary
sort key and second is called as secondary sort key and so on.
In the literature we can a good number of sorting techniques which are classified as (i)
internal techniques (ii) External techniques. Internal techniques are those which can be used
when the records are small enough to be sorted within the main memory and external
techniques are those which can be used when the records are huge and requires main memory
and secondary memory to sort entire set of records. In this unit we shall consider only
internal techniques.
In the following section we present three simple sorting techniques viz., conventional sorting
technique, selection sorting technique and insertion sorting technique. In all the techniques
discussed in the following sections we consider an unordered set of integer numbers. While
sorting we can sort in ascending order (i.e., smallest to largest) or in descending order (i.e.,
largest to smallest). For simplicity in all the sections we have considered sorting of integer
numbers in ascending order and the sorting in descending order is left as assignment for the
students for practice.
7.2 CONVENTIONAL SORT
The basic step in this sorting technique is to bring the smallest element of the unordered list
to the recent position. In this technique we will consider two pointers ‘i’ and ‘j’. Initially
pointer ‘i’ will be pointing to first data point and pointer ‘j’ will be pointing to the next data
point. Compare the value pointed by pointers ‘i’ and ‘j’, if the value pointed by pointer ‘j’ is
smaller than the value pointed by pointer ‘i’, then swap those two values else do not swap the
values. Increment the pointer ‘j’ so that it point to the next position. Check whether the value
pointed by pointers ‘i’ and ‘j’, if the value pointed by pointer ‘j’ is smaller than the value
pointed by pointer ‘i’, then swap those two values else do not swap the values and increment
the pointer ‘j’ one at a time. Continue the same process until the pointer ‘j’ reaches the last
position. At the end of this process we can see that the smallest element in the list is at the
location pointed by the pointer ‘i’.
Now increment the pointer ‘i’ by one and initialize the pointer ‘j’ to the next location of the
pointer ‘i’. Continue the above discussed process until the pointer ‘i’ reaches the last but one
position of the list. The algorithm for the conventional list is as given below.
Algorithm Conventional sort

Input Unordered list - A
Output Ordered list – A
Method
For i = 1 to n-1 do
For j = i + 1 to n do
if (A(i) > A(j) )
Swap(A(i), A(j))
if end
For end
For end
Algorithm ends
In order to illustrate the conventional sorting technique let us consider an example where a
list A={10, 6, 8, 2, 4, 11} contains unordered set of integer numbers.
i j
10 6 8 2 4 11
Swap (10, 6) as 10 is greater than 6 and increment the pointer j
j
i
10 6 8 2 4 11
6 10 8 2 4 11
As the value 6 is less than 8 do not swap the values, only increment the pointer j
i j
10 6 8 2 4 11
6 10 8 2 4 11
Swap (6, 2) as 6 is greater than 2 and increment the pointer j
i j
10 6 8 2 4 11
2 10 8 6 4 11
i j
10 6 8 2 4 11
2 10 8 6 4 11
This complete one iteration and you can observe that the smallest element is present in the
first position. For the second iteration, increment the pointer ‘i’ such that it points to next
location and initialize the pointer ‘j’ to next position of pointer ‘i’. Carry out the same
procedure as explained above and it can be observed that at the end of 2nd iteration the list
will be as follows.
2 4 10 8 6 11
Continue the same process as explained above and at the end of the process it can be
observed that the list will be sorted in ascending order and it looks as follows
2 4 6 8 10 11
If we assume that each step will take 1 unit of time, then the total time taken to sort the
considered 6 elements is 5+4+3+2+1 = 15 unit of times. In general if there are n elements in
n2  3n  2
the unordered list the time taken to sort is (n  1)  (n  2)  (n  3)  ...  3  2  1  .
2
7.3 SELECTION SORT
In case of conventional sort we saw that in each step the smallest element will be brought to
the respective positions. In the conventional algorithm it should be noted that there is possible
of swapping elements in each step, which is more time consuming part. In order to reduce
this swapping time we have another sorting technique called selection sort in which we first
select the smallest element in each iteration and swap that smallest number with the first
element of the unsorted list part.
Consider the same example which was considered in demonstrating the conventional sorting
technique. Similar to conventional sorting technique in this selection sorting technique also
we consider same two pointers ‘i’ and ‘j’. As usual ‘i’ will be pointing to first position and ‘j’
will be pointing to the next position of ‘i’. Find the minimum for all values of ‘j’ i.e.,
i  1  j  n . Let the smallest element be present in kth position. If the element pointed by ‘i’
is larger than the element at kth position swap those values else increment ‘i’ and set the value
of j = i+1.
It should be observed clearly that in case of selection sort there is only one swap where as in
case of conventional sorting technique there is possibility of having more than one swapping
of elements in each iteration.
k j
i
10 6 8 2 4 11
10 6 8 2 4 11
Swap (10, 2) as 2 is the smallest among the elements covered by the jth pointer.
2 6 8 10 4 11
Similarly do the process for reaming set and the resultant steps are as follows.
2 6 8 10 4 11
2 4 8 10 6 11
2 4 6 10 8 11
2 4 6 8 10 11
2 4 6 8 10 11
2 4 6 8 11 11
The algorithm for selection sort is as follows.

Algorithm Selection sort
Method
For i = 1 to n-1 do
For j = i + 1 to n do
k=i
if (A(j) < A(i) )
k=j
if end
For end
If(i ≠j)
Swap(A(i), A(k))
If end
For end
Algorithm ends
7.4 INSERTION SORT
The basic step in this method is to insert an element ‘e’ into a sequence of ordered elements
e1, e2, e3, .. ej in such a way that the resulting sequence of size i+1 is also ordered. We start
with an array of size 1 which is in sorted order. By inserting the second element into its
appropriate position we get an ordered list of two elements. Similarly, each subsequent
element will be added into their respective positions obtaining a partial sorted array. The
algorithm for the insertion sort is a follows.
Algorithm Insertion sort

Method
For i = 2 to n do
j = i-1, k=A(i)
While ( (j > 0) and A(j) > k)
A(j+1) = A(j)
j=j–1
While end
A(j+1) = k
For end
Algorithm ends
To illustrate the working principle of the insertion sort let us consider the same set of
elements considered in the above sections.
10 6 8 2 4 11
Consider the first element, as there is only one element it is already sorted.
10 6 8 2 4 11
Now consider the second element 6 and insert that element in the respective position of the
sorted list. As 6 is less than 10 insert 6 before the value 10.
6 10 8 2 4 11
The third element is 8 and the value of 8 is greater than 6 and less than 10. Hence insert the
element 8 in between 6 and 10.
6 8 10 2 4 11
The fourth element is 2 and it is the smallest among all the elements in the partially sorted
list. So insert the value 2 in the beginning of the partially sorted list.
2 6 8 10 4 11
The fifth element is 4 and the value of 4 is greater than 2 and less than remaining elements of
the partially sorted list {6, 8, 10} and hence insert the element 4 in between 2 and {6, 8, 10}.
2 4 6 8 10 11
The remaining element is 11 and it is largest of all elements of the partially sorted list {2, 4,
6, 8, 10}, so leave the element in its place only. The finally sorted list is as follows.
2 4 6 8 10 11
.
The insertion sort works faster than the conventional sort and selection sort. The computation
of time taken to sort an unordered set of elements using insertion sort is left as assignment to
the students (Refer question number 3).
7.5 BUBBLE SORT
Bubble sort is a simple sorting algorithm. This sorting algorithm is comparison-based

algorithm in which each pair of adjacent elements is compared and the elements are swapped
if they are not in order. This algorithm is not suitable for large data sets as its average and
worst-case complexity are of Ο(n2) where n is the number of items.
 How Bubble Sort Works?
We take an unsorted array for our example. Bubble sort takes Ο(n2) time so we're keeping it
short and precise.
Bubble sort starts with very first two elements, comparing them to check which one is
greater.
In this case, value 33 is greater than 14, so it is already in sorted locations. Next, we compare
33 with 27.
We find that 27 is smaller than 33 and these two values must be swapped.
The new array should look like this
Next we compare 33 and 35. We find that both are in already sorted positions.
Then we move to the next two values, 35 and 10.

We know then that 10 is smaller 35. Hence they are not sorted.
We swap these values. We find that we have reached the end of the array. After one iteration,
the array should look like this
To be precise, we are now showing how an array should look like after each iteration. After
the second iteration, it should look like this
Notice that after each iteration, at least one value moves at the end.
And when there's no swap required, bubble sorts learns that an array is completely sorted.
Now we should look into some practical aspects of bubble sort.
 Algorithm
We assume list is an array of n elements. We further assume that swap function swaps the
values of the given array elements.
begin BubbleSort(list)
for all elements of list

if list[i] > list[i+1]
swap(list[i], list[i+1])
end if
end for
return list
end BubbleSort
 Pseudocode
We observe in algorithm that Bubble Sort compares each pair of array element unless the
whole array is completely sorted in an ascending order. This may cause a few complexity
issues like what if the array needs no more swapping as all the elements are already
ascending.
To ease-out the issue, we use one flag variable swapped which will help us see if any swap
has happened or not. If no swap has occurred, i.e. the array requires no more processing to be
sorted, it will come out of the loop.
Pseudocode of BubbleSort algorithm can be written as follows −
procedure bubbleSort( list : array of items )
loop = list.count;
for i = 0 to loop-1 do:

swapped = false
for j = 0 to loop-1 do:
/* compare the adjacent elements */

if list[j] > list[j+1] then
/* swap them */
swap( list[j], list[j+1] )
swapped = true
end if
end for
/*if no number was swapped that means

array is sorted now, break the loop.*/
if(not swapped) then

break
end if
end for
end procedure return list

 Implementation
One more issue we did not address in our original algorithm and its improvised pseudocode,
is that, after every iteration the highest values settles down at the end of the array. Hence, the
next iteration need not include already sorted elements. For this purpose, in our
implementation, we restrict the inner loop to avoid already sorted values.
#include <stdio.h>
#include <stdbool.h>
#define MAX 10
int list[MAX] = {1,8,4,6,0,3,5,2,7,9};
void display() {
int i;
printf("[");
// navigate through all items

for(i = 0; i < MAX; i++) {
printf("%d ",list[i]);
}
printf("]\n");
}
void bubbleSort() {
int temp;
int i,j;
bool swapped = false;
// loop through all numbers
for(i = 0; i < MAX-1; i++) {
swapped = false;
// loop through numbers falling ahead
for(j = 0; j < MAX-1-i; j++) {
printf(" Items compared: [ %d, %d ] ", list[j],list[j+1]);
// check if next number is lesser than current no
// swap the numbers.
// (Bubble up the highest number)
if(list[j] > list[j+1]) {

temp = list[j];
list[j] = list[j+1];
list[j+1] = temp;
swapped = true;
printf(" => swapped [%d, %d]\n",list[j],list[j+1]);
} else {
printf(" => not swapped\n");
}
}
// if no number was swapped that means
// array is sorted now, break the loop.
if(!swapped) {
break;
}
printf("Iteration %d#: ",(i+1));
display();
}
}
void main() {
printf("Input Array: ");
display();
printf("\n");
bubbleSort();
printf("\nOutput Array: ");
display();
}
Output
Input Array:[1 8 4 6 0 3 5 2 7 9 ]
Items compared: [ 1, 8 ] => not swapped
Items compared: [ 8, 4 ] => swapped [4, 8]
Iteration 1#: [1 4 6 0 3 5 2 7 8 9 ]
Iteration 2#: [1 4 0 3 5 2 6 7 8 9 ]
Iteration 3#: [1 0 3 4 2 5 6 7 8 9 ]
Iteration 4#: [0 1 3 2 4 5 6 7 8 9 ]
Iteration 5#: [0 1 2 3 4 5 6 7 8 9 ]
Output Array: [0 1 2 3 4 5 6 7 8 9 ]
7.6 QUICKSORT
Quick sort is a highly efficient sorting algorithm and is based on partitioning of array of data
into smaller arrays. A large array is partitioned into two arrays one of which holds values
smaller than the specified value, say pivot, based on which the partition is made and another
array holds values greater than the pivot value.
Quicksort partitions an array and then calls itself recursively twice to sort the two resulting
subarrays. This algorithm is quite efficient for large-sized data sets as its average and worst-
case complexity are O(n2), respectively.
 Partition in Quick Sort
Following animated representation explains how to find the pivot value in an array.
The pivot value divides the list into two parts. And recursively, we find the pivot for each
sub-lists until all lists contains only one element.
 Quick Sort Pivot Algorithm
Based on our understanding of partitioning in quick sort, we will now try to write an
algorithm for it, which is as follows.
Step 1 − Choose the highest index value has pivot
Step 2 − Take two variables to point left and right of the list excluding pivot
Step 3 − left points to the low index
Step 4 − right points to the high
Step 5 − while value at left is less than pivot move right
Step 6 − while value at right is greater than pivot move left
Step 7 − if both step 5 and step 6 does not match swap left and right
Step 8 − if left ≥ right, the point where they met is new pivot
 Quick Sort Pivot Pseudocode
The pseudocode for the above algorithm can be derived as −
function partitionFunc(left, right, pivot)
leftPointer = left
rightPointer = right - 1
while True do
while A[++leftPointer] < pivot do
//do-nothing
end while
while rightPointer > 0 && A[--rightPointer] > pivot do

//do-nothing
end while
if leftPointer >= rightPointer

break
else
swap leftPointer,rightPointer
end if
end while
swap leftPointer,right
return leftPointer
end function
 Quick Sort Algorithm
Using pivot algorithm recursively, we end up with smaller possible partitions. Each partition
is then processed for quick sort. We define recursive algorithm for quicksort as follows −
Step 1 − Make the right-most index value pivot
Step 2 − partition the array using pivot value
Step 3 − quicksort left partition recursively
Step 4 − quicksort right partition recursively
Quick Sort Pseudocode
To get more into it, let see the pseudocode for quick sort algorithm −
procedure quickSort(left, right)
if right-left <= 0
return
else
pivot = A[right]
partition = partitionFunc(left, right, pivot)
quickSort(left,partition-1)
quickSort(partition+1,right)
end if
end procedure
#include <stdio.h>
#define MAX 7
int intArray[MAX] = {4,6,3,2,1,9,7};
void printline(int count) {
int i;
for(i = 0;i < count-1;i++) {
printf("=");
}
printf("=\n");
}
void display() {
int i;
printf("[");
// navigate through all items

for(i = 0;i < MAX;i++) {
printf("%d ",intArray[i]);
}
printf("]\n");
}
void swap(int num1, int num2) {
int temp = intArray[num1];
intArray[num1] = intArray[num2];
intArray[num2] = temp;
}
int partition(int left, int right, int pivot) {
int leftPointer = left -1;
int rightPointer = right;
while(true) {
while(intArray[++leftPointer] < pivot) {
//do nothing
}
while(rightPointer > 0 && intArray[--rightPointer] > pivot) {
//do nothing
}
if(leftPointer >= rightPointer) {
break;
} else {
printf(" item swapped :%d,%d\n", intArray[leftPointer],intArray[rightPointer]);
swap(leftPointer,rightPointer);
}
}
printf(" pivot swapped :%d,%d\n", intArray[leftPointer],intArray[right]);
swap(leftPointer,right);
printf("Updated Array: ");
display();
return leftPointer;
}
void quickSort(int left, int right) {
if(right-left <= 0) {
return;
} else {
int pivot = intArray[right];
int partitionPoint = partition(left, right, pivot);
quickSort(left,partitionPoint-1);
quickSort(partitionPoint+1,right);
}
}
int main() {
printf("Input Array: ");
display();
printline(50);
quickSort(0,MAX-1);
printf("Output Array: ");
display();
printline(50);
}
Output
Array elements before quick sort are: [4 6 3 2 1 9 7 ]
**************************************************
pivot swapped: 9, 7
Updated Array: [4 6 3 2 1 7 9 ]
pivot swapped: 4, 1
Updated Array: [1 6 3 2 4 7 9 ]
item swapped: 6, 2
pivot swapped: 6, 4
Updated Array: [1 2 3 4 6 7 9 ]
pivot swapped: 3, 3
Updated Array: [1 2 3 4 6 7 9 ]
Array elements after quick sort are: [1 2 3 4 6 7 9 ]
7.7 MERGESORT
The basic concept of merge sort is like this. Consider a series of n numbers, say A(1), A(2)
……A(n/2) and A(n/2 + 1), A(n/2 + 2) ……. A(n). Suppose we individually sort the first set
and also the second set. To get the final sorted list, we merge the two sets into one common
set.
We first look into the concept of arranging two individually sorted series of numbers into a
common series using an example:
Let the first set be A = {3, 5, 8, 14, 27, 32}. Let the second set be B = {2, 6, 9, 15, 18, 30}.
The two lists need not be equal in length. For example the first list can have 8 elements and
the second 5. Now we want to merge these two lists to form a common list C. Look at the
elements A(1) and B(1), A(1) is 3, B(1) is 2. Since B(1) < A(1), B(1) will be the first element
of C i.e., C(1)=2. Now compare A(1) =3 with B(2) =6. Since A(1) is smaller then B(2), A(1)
will become the second element of C. C[ ] = {2, 3}
Similarly compare A(2) with B(2), since A(2) is smaller, it will be the third element and so
on. Finally, C is built up as C[ ]= {2, 3, 5, 6, 8, 9, 14, 15, 18, 27, 30, 32}.
However the main problem remains. In the above example, we presume that both A & B are
originally sorted. Then only they can be merged. But, how do we sort them in the first? To do
this and show the consequent merging process, we look at the following example. Consider
the series A= (7 5 15 6 4). Now divide A into 2 parts (7, 5, 15) and (6, 4). Divide (7, 5, 15)
again as ((7, 5) and (15)) and (6, 4) as ((6) (4)). Again (7, 5) is divided and hence ((7, 5) and
(15)) becomes (((7) and (5)) and (15)).
Now since every element has only one number, we cannot divide again. Now, we start
merging them, taking two lists at a time. When we merge 7 and 5 as per the example above,
we get (5, 7) merge this with 15 to get (5, 7, 15). Merge this with 6 to get (5, 6, 7, 15).
Merging this with 4, we finally get (4, 5, 6, 7 and 15). This is the sorted list.
You are now expected to take different sets of examples and see that the method always
works.
We design two algorithms in the following. The main algorithm is a recursive algorithm
(somewhat similar to the binary search algorithm that we saw earlier) which calls at times the
other algorithm called MERGE. The algorithm MERGE does the merging operation as
discussed earlier.
Algorithm: MERGESORT
Input: low, high, the lower and upper limits of the list to be sorted
A, the list of elements
Output: A, Sorted list
Method:
If (low<high)
Mid = (low + high)/2
MERGESORT(low, mid)
MERGESORT (mid, high)
MERGE(A, low, mid, high)
If end
Algorithm ends
Algorithm: Merge
Input: low, mid, high, limits of two lists to be merged i.e., A(low, mid) and A(mid+1, high)
A, the list of elements
Output: B, the merged and sorted list
Method:
h = low, i = low, j = mid + 1;
While ((h dŠ mid) and (j dŠ high)) do
If (A(h) dŠ A(j) )
B(i) = a(h);
h = h+1;
else
B(i) = A(j);
j = j+1;
If end
i = i+1;
If (h > mid)
For k = j to high
B(i) = A(k);
i = i+1;
For end
Else
For k = h to mid
B(i) = A(k);
i = i+1
For end
If end
While end
Algorithm ends
7.8 HEAP SORT ALGORITHMS
We now look at the concept of heap sort - a method of arranging numbers in ascending order. A
heap is defined as a collection of numbers, normally arranged in the form of a tree - A binary tree
to be more precise (students can look into a later block to look at the concept of binary trees). The
condition is that the parent node is always larger than the child nodes. For example in the
following figures the first two represent heaps, while the others do not.
What we have shown are only two levels, but the same can be rearranged for any number of
levels and any number of elements.
To insert an element into the heap, one adds it “at the bottom” of the heap and then compares it
with its parent, grandparent, great grandparent and so on, until it is less than or equal to one of
these values.
Let us consider the incoming numbers in the sequence (75, 103, 65, 105, 88, 96, 100). Now we
want to construct a heap for the same
Note that though we have ended up in a balanced tree in this case, it may not be
so always. Now take out the largest element 105, put it at the end of the sorted list and rearrange
the heap. 103 comes to top, remove it, rearrange the heap again etc.We now write algorithms to
do the same. The first one is the most important procedure that actually creates the heap. The
other two call on this to produce the required sorted array.
Algorithm Insert
Input: A, An unordered set
N, size of set A
Output:
Insert A[n] into the heap which is stored in A[1: n-1]
Method:
i=n
item=A[n];
While ( (i>n) and (A[ i!/2 ] < item)) do
A[ i ] = A[i/2];
i =i/2;
While end
A[i]=item;
Algorithm end
Algorithm Adjust
Input: A, i, n
Output: Updated A
Method:
j=2i
item=A[i]
While (j<=n) do
If ((j<=n) and (A[j] < A[j+1])) then
j=j+1
//compare left and right child and let j be the right //child
If (item >= A[i]) then break
// a position for item is found
A[i/2]=A[j]
j=2i
While end
A[j/2]=item;
Algorithm end
Algorithm: Delmax
Input: A n x
// Delete the maximum from the heap A[1..n] and store it in x
If (n=0) then
Write (‘heap is empty”);
If end
x=A[1];
A[1] =A[n];
Adjust (A, 1, n-1);
Return (true);
Algorithm end
To delete the maximum key from the max heap, we use an algorithm called Adjust. Adjust takes
asinput the array A[ ] and integer I and n. It regards A[1..n] as a complete binary tree. If the sub
treesrooted at 2I and 2I+1 are max heaps, then adjust will rearrange elements of A[ ] such that the
tree rooted at I is also a max heap. The maximum elements from the max heap A[1..n] can be
deleted by deleting the root of the corresponding complete binary tree. The last element of the
array, i.e. A[n], is copied to the root, and finally we call Adjust(A, 1, n-1).
Algorithm: sort
Input: Unsorted list - A, size of the list - n
Output: Sorted list A
Method:
For i=1 to n do
Insert (A, i)
For end
For i= n to 1 step –1 do
Delmax (A, i, x);
A[i] =x;
For end
Algorithm end
7.9 APPLICATIONS OF SORTING
Applications of sorting can find in many areas. Some applications of sorting are
(1). Speeding up the searching process is one of the most important applications of the
sorting technique. Example. To search an element for its existence in any list,
applying sorting prior to searching makes the process to work faster.
(2). In many of the statistical procedures sorting is necessary. Example, to find median of
any set of elements sorting is must.
(3). To find duplicate elements present in the list.
7.10 SUMMARY
In this unit we have presented the basics of sorting technique. We have presented three
different sorting techniques such as conventional sort, selection sort and insertion sort. The
algorithm for all the three-sorting technique is given in this unit and demonstrated with
suitable example.
7.11 KEYWORDS
(1). Sorting technique

(2). Conventional sort
(3). Selection sort
(4). Insertion sort
(1). Define sorting. Mention the applications of sorting with respect the computer science
field.
(2). Design and develop an algorithm to sort unordered set of element in descending order
using (i) Conventional sort (ii) Selection sort (iii) Insertion sort.
(3). Calculate the time taken to sort n elements present in an unordered list using insertion
sort.
(4). Sort the given unordered set {12,2,16,30,8,28,4,10,20,6,18} using conventional sort
and count the number of swapping took during the sorting process
(5). For the same unordered set given in question 1 sort using selection sort and insertion
sort and count the number of swapping too during the sorting process.
(6). Design an algorithm to sort given set of n numbers using merge sorting technique
(7). Design an algorithm to sort n number using heap sort. Consider a set A =
{12,2,16,30,8,28,4,10,20,6,18}, illustrate the designed algorithm on set A.
7.13REFERENCES
 Ellis Horowitz, Sartaj Sahni, and Dinesh Mehta. Fundamental of Data Structures in
C++
 Alfred V. Aho , Jeffrey D. Ullman, John E. Hopcroft. Data Structures and Algorithms.
Addison Wesley (January 11, 1983)
UNIT-8.0 SEARCHING ALGORITHMS
8.0 Objectives
8.1 Introduction to Searching Techniques
8.2 Linear Search
8.3 Binary Search
8.4 Depth First Search
8.5 Breadth First Search
8.6 Summary
8.7 Keywords
8.8 Questions
8.9 Reference
8,0 OBJECTIVES
After reading this unit you should be able to

 Discuss the basic concept of searching techniques
 Explain linear search technique
 Elucidate binary search technique
 Design algorithms for depth first and breadth first search technique
8.1 INTRODUCTION TO SEARCHING TECHNIQUES
Searching is a technique of finding whether a given element is present in a list of elements. If

the search element is present in the list the searching technique should return the index where
the given searching element is present in the list. If the search element is not present in the
list then the searching technique should return NULL indicating that search element is not
present in the list. Similar to sorting there are number of searching technique available in the
literature and no single algorithm suits all applications. Some algorithms work faster but
require more memory on the other hand some techniques are too fast but they assume that the
given list is already in sorted order. On the other hand searching techniques can be classified
based on the data structures used to store the list. If array is used, then we must use different
searching technique. Searching an element in a non linear data structure requires different
searching techniques. In this unit we present four different searching techniques. First, we
present linear search technique which is simplest of all searching technique. Next we present
another fast working technique called binary search. Later we present two other searching
techniques viz., breadth first search and depth first search technique. The former two are for
array type of data structure and later two are for graph data structure.
8.2 LINEAR SEARCH (SEQUENTIAL SEARCH)
Linear search is a method of searching an element in the list where the given search element
‘e’ is compared against all elements sequentially until there is atleast one match (i.e.,
Success) or search process reaches the end of the list without finding any match (i.e.,
Failure). Let A = [10 15 6 23 8 96 55 44 66 11 2 30 69 96] and searching element ‘e’ = 11.
Consider a pointer ‘i’, to begin with the process initialize the pointer ‘i’ = 1. Compare the
value pointed by the pointer with the searching element ‘e’ = 11. As A(1) = 10 and its is not
equal to element ‘e’ increment the pointer i by i+1. Compare the value pointed by pointer
i.e., A(2) = 15 and it is also not equal to element ‘e’. Continue the process until the search
element is found or the pointer ‘i’ reaches the end of the list.
Step 1 10 ≠11
A= 10 15 6 23 11 96 55 44 66 8 2 30 69 96
15 ≠11
Step 2
A= 10 15 6 23 11 96 55 44 66 8 2 30 69 96
Step 3 6 ≠11
A= 10 15 6 23 11 96 55 44 66 8 2 30 69 96
Step 4 23 ≠11
A= 10 15 6 23 11 96 55 44 66 8 2 30 69 96
Step 5 11 = 11
A= 10 15 6 23 11 96 55 44 66 8 2 30 69 96
Element found in the list A at the position 5.

Now, let us consider the same list with different search element ‘e’ = 12.
If you search the list using the above mechanism, the pointer moves till the end of the list.
This indicates that the given search element ‘e’ = 12 is not present in the list and returns
NULL. The algorithm for linear search is as given below.
Algorithm Linear search
Input A – List of elements
‘e’ element to be searched
‘n’ size of the list
Output Success or Unsuccess.
Method
For i = 1 : n
If (A(i) = = e)
display ‘Element e is present at the position i’
Flag = 1
break
If end
For end
If(Flag = = 0)
Display ‘Element e is not present in the list A’
If end
Algorithm ends
The time taken to search an element in the list is ‘n+1’ in the worst case when the element is
not present in the list. In case if element is present the time taken will be the time taken to
reach the position of the element. If the element is present at the end of the list of size ‘n’ the
time taken is ‘n’ and if the element is present in the first position the time required is 1 unit.
Linear search or sequential search suits the applications where the size of the list is less. If the
size of the list is more, then linear search may perform very poor.
8.3 BINARY SEARCH

Binary search is an efficient technique for searching in a sorted array. It works by comparing
a searching element ‘e’ with the array’s middle element A(mid). If they match, the algorithm
stops; otherwise, the same operation is repeated recursively for the first half of the array if
e < A(mid) and for the second half if e > A(mid).Consider the same example of sequential
search and let us apply Binary search technique. Let A be the list and ‘e’=11 be the searching
element. As binary search technique assumes that the list will be in sorted order, sorting of
elements in the list will be the first step.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
A= 10 15 6 23 11 96 55 44 66 8 2 30 69 96
Let Low = 1; High = 14 be the initial values, where Low is the starting position and High is
the last position of the list.
Sort the given list using any of the sorting techniques. In this illustrative we assume that we
have sorted list of A.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
A= 2 6 8 10 11 15 23 30 44 55 66 69 96 96
Compute the mid of list; Mid = (Low + High)/2;

Mid = (1+14)/2 = 15/2 = 7.5 ≈ 7.
Check whether A(Mid) = = Search element.
A(7) = 23 and it is not equal to search element 11.
Low Mid High
1 2 3 4 5 6 7 8 9 10 11 12 13 14
A= 2 6 8 10 11 15 23 30 44 55 66 69 96 96
A(Mid) ≠11 and A(Mid) >

11
Since the search element is not equal to A(mid) and A(mid) is greater than the search
element, the search element should be present in the first half of the list. Consider first half as
the new list and continue the same procedure neglecting the second half of the list. For
further processing initialization are as follows
High = Mid – 1; i.e., High =7 – 1 = 6; Mid = (Low+High)/2; i.e., Mid = (1+6)/2 = 7/2 ≈3.
The value of the Low remains unchanged ie., Low = 1.
Low Mid High
1 2 3 4 5 6
A= 2 6 8 10 11 15
A(Mid) ≠11 and A(Mid) <

11
Since the search element is not equal to A(mid) and A(mid) is less than the search element,
the search element should be present in the second half of the list. Consider second half as the
new list and continue the same procedure neglecting the first half of the list. For further
processing initialization are as follows
Low = Mid+1, i.e., Mid =3+1 = 4; Mid = (Low+High)/2; i.e., Mid = (4+6)/2 = 10/2 = 5. In
this case the value of High remains same i.e., High = 6.
Low Mid High
4 5 6
A= 10 11 15
A(Mid) =
11
As A(Mid) is equal to search element, it can be announced that Search element is found in
position 5. The recursive algorithm for binary search is as follows.
Algorithm Binary Search

Input A – List of elements
‘e’ element to be searched
‘n’ size of the list
Low = 1, High = n;
Output Success or Unsuccess.
Method
While (Low <= High)
Mid = (Low+High)/2;
If(A(Mid) = = e)
Display ‘Element found in Mid Position’
Else
If(A(Mid) > E)
High = Mid – 1
Binary Search(A(Low : High),e)
else
Low = Mid+1
Binary Search(A(Low : High),e)
If end
If end
While end
Algorithm ends
The techniques that were discussed in above two sections are only for linear type of
structures (Specifically for arrays). In next two sections we present two different approaches
called Depth first search and breadth first search algorithms, which works on graph data
structures. Normally these two approaches are used to search a node in a graph or to traverse
any undirected graph. Traversing a graph without revisiting the same node is a difficult task
as it involves loops, circuits. In this unit we consider the problem of searching a node in a
graph.
8.4 DEPTH FIRST SEARCH
Depth first search algorithm starts visiting nodes of a graph arbitrarily, marking that node as
visited node. Soon after visiting any node (current node) we consider any of its adjacent
nodes as next node for traversal and the current node address will be stored in stack data
structure and traverse to the next adjacent node. The same thing is processed until no node
can be processed further. If there are any nodes which are not visited then backtracking is
used until all the nodes are visited. In depth first search stack will be used as a storage
structure to store information about the nodes which will be used during backtracking.
Before knowing how to search a node in a graph using depth first search we need to
understand how depth first search can be used for traversal of graph. Consider a graph G as
shown in Figure 1(a). The traversal starts with node 1 (Figure 1(b)), mark the node as
traversed (Gray shading is used to indicate that the node is traversed) and push the node
number 1 into the stack. As it has only one adjacent node 4 we will move to node number 4.
Mark the node number 4 Figure 1(c) and push 4 into the stack. For node number 4 there are 2
adjacent nodes i.e., 2 and 5.
(a) (b) (c)
(d) (e) (f)
(g) (h) (i)
(j) (k)
Figure 1: Traversal of a graph using depth first search algorithm
Select one node arbitrarily (For implementation purpose we can select the node with smallest
number) and move to that node, in this case we will move to node 2 and push the node
number 2 to stack. Similarly we will move to node 5 from node 2, pushing 5 into stack and
then move to node 3 from node 5 and push node 3 into the stack (Figure (1(d, e, f, g, h, I, j)).
Figure 1(k) shows the elements present in the stack at the end. From node 3 there is no
possibility to traverse further. From this point onwards we will backtrack to check whether
there are any other nodes which are not traversed. Pop the top node 3 from stack. Now,
check is there any possibility to traverse from the element present in the top of the stack. The
top element is 5 and there is an edge with has not been traversed from the node 5 (See Figure
2(b), the line marked in red color is untraversed edge). This edge leads to 4 which has been
already visited and there is no other possibility for traversing from node 5, pop node 5 from
the stack. Do the same process and at the end there will be no elements in the stack indicating
that all the vertices of the graph have been traversed.
(a) (b) (c)
(d) (e) (f)
(g)
Figure 2. Backtracking operations for the depth first search algorithm
Figure 1 and Figure 2 demonstrated the depth first search for traversal purpose. The same
technique can be used to search an element in the graph. Given a graph with n nodes we can
search whether given node is present in the graph or not. Each time we visit a node we check
whether that node is same as the search node, if it is stop the procedure declaring that the
node is present else push that node into the stack and traverse until you the stack become
empty.
Let us consider a tree example and illustrate the working principle of the depth first search.
Let the search element be F.
Figure 3. A binary tree with 8 nodes.

Figure 4(a)
Figure 4(b).
Figure 4(c)
Figure 4(d)
Figure 4(a) – 4(d) : Various steps in depth first search algorithm.
Note: Depth first search method uses stack as a data structure.
8.5 BREADTH FIRST SEARCH
Analogous to depth first search which search the nodes from top to bottom fashion
postponing the traversal of adjacent elements, the breadth first search algorithm first traverse
adjacent nodes of a starting node, then all unvisited nodes in a connected graph will be
traversed in the same manner.It is convenient to use a queue to trace the operation of breadth
first search. The queue is initialized with the traversal’s starting node, which is marked as
visited. On each iteration, the algorithm identifies all unvisited nodes that are adjacent to the
front node, marks them as visited, and adds them to the queue; after that front node is
removed from the queue.
Let us consider the same example of tree traversal Figure 3. Starting node is A, Insert A into
queue mark A as traversed. Move to its successor element {B, C}, push them to queue and
mark them as traversed. Since there is no other adjacent element to node A, remove A from
which is first element in the queue. The next element in the queue is B, check for its
successor node. Since B has no successor elements remove B from the Queue. The next
element in the queue is C, find its successor elements i.e., {D, F}. Insert them into the queue
and correspondingly marks the, as traversed. Since C has no other elements as its successor
remove C from the Queue. The next element in the queue is D, its successor is E insert it into
the queue and mark it as traversed. Now, D has no successor node hence remove D from the
Queue. The next element in the Queue is F, find out is successor i.e., {H, I}. Insert them into
the queue and mark them as visited. Once again the element F has no successor so remove it
from queue and check for next element in the queue. The next element is E and E has no
successor remove it and next elements are H and I. traverse them in the same way.
For searching an element using breadth first search, similar to depth first search we traverse
the graph using breadth first traversal and while traversing the graph if a node same as search
element occurs we declare that search element is present in the graph.
8.6 SUMMARY
In this unit we have presented the basics of searching technique. We have presented four
different searching techniques such as linear search, binary search, depth first search and
depth first search. The first two are for conventional type of data whereas last two are for
searching of elements stored in the form of graph.
8.7 KEYWORDS
Searching technique
Linear search
Binary search
Depth first search
1. Design an algorithm to search an element ‘e’ from a given list using linear search
technique.
2. Design and develop an algorithm to find a given element ‘e’ using binary search method.
Discuss how it is efficient than linear search method.
3. Mention the difference between depth first search and breadth first search algorithm.
4. Mention the applications of searching algorithms?
5. Consider a list A= {12,2,16,30,8,28,4,10,20,6,18}. Check whether element 6 is present in
the list using binary search technique. Illustrate the searching technique
8.9 REFERENCES
• Ellis Horowitz, Sartaj Sahni, and Dinesh Mehta. Fundamental of Data Structures in
C++
• Alfred V. Aho , Jeffrey D. Ullman, John E. Hopcroft. Data Structures and Algorithms.
Addison Wesley (January 11, 1983)
UNIT 9: HASHING
Structure
9.0 Objectives
9.1 Introduction
9.2 Hashing
9.3 Collision resolution techniques
9.4 Implementation of hashing
9.5 Summary
9.6 Key words
9.7 Questions
9.9 References
9.0 OBJECTIVES
At the end of this unit, you will be able to

 Discuss Hashing
 Elucidate Collision resolution techniques
 Explain Implementation of hashing
9.1 INTRODUCTION
Hashing refers to the process of generating a fixed-size output from an input of variable
size using the mathematical formulas known as hash functions. This technique determines
an index or location for the storage of an item in a data structure.
Need for Hash data structure
Every day, the data on the internet is increasing multifold and it is always a struggle to
store this data efficiently. In day-to-day programming, this amount of data might not be that
big, but still, it needs to be stored, accessed, and processed easily and efficiently. A very
common data structure that is used for such a purpose is the Array data structure.
Now the question arises if Array was already there, what was the need for a new data
structure! The answer to this is in the word “efficiency“. Though storing in Array takes
O(1) time, searching in it takes at least O(log n) time. This time appears to be small, but for
a large data set, it can cause a lot of problems and this, in turn, makes the Array data
structure inefficient.
So now we are looking for a data structure that can store the data and search in it in
constant time, i.e. in O(1) time. This is how hashing data structure came into play. With the
introduction of the Hash data structure, it is now possible to easily store data in constant
time and retrieve them in constant time as well.
Components of Hashing
There are majorly three components of hashing:
1. Key: A Key can be anything string or integer which is fed as input in the hash
function the technique that determines an index or location for storage of an item
in a data structure.
2. Hash Function: The hash function receives the input key and returns the index of
an element in an array called a hash table. The index is known as the hash index.
3. Hash Table: Hash table is a data structure that maps keys to values using a
special function called a hash function. Hash stores the data in an associative
manner in an array where each data value has its own unique index.
Components of Hashing
How does Hashing work?

Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store it in a
table.
Our main objective here is to search or update the values stored in the table quickly in O(1)
time and we are not concerned about the ordering of strings in the table. So the given set of
strings can act as a key and the string itself will act as the value of the string but how to
store the value corresponding to the key?
 Step 1: We know that hash functions (which is some mathematical formula) are
used to calculate the hash value which acts as the index of the data structure
where the value will be stored.
 Step 2: So, let’s assign
 “a” = 1,
 “b”=2, .. etc, to all alphabetical characters.
 Step 3: Therefore, the numerical value by summation of all characters of the
string:
 “ab” = 1 + 2 = 3,
 “cd” = 3 + 4 = 7 ,
 “efg” = 5 + 6 + 7 = 18
 Step 4: Now, assume that we have a table of size 7 to store these strings. The
hash function that is used here is the sum of the characters in key mod Table
size. We can compute the location of the string in the array by taking
the sum(string) mod 7.
 Step 5: So we will then store
 “ab” in 3 mod 7 = 3,
 “cd” in 7 mod 7 = 0, and
 “efg” in 18 mod 7 = 4.
Mapping key with indices of array
The above technique enables us to calculate the location of a given string by using a simple
hash function and rapidly find the value that is stored in that location. Therefore the idea of
hashing seems like a great way to store (key, value) pairs of the data in a table.
9.2 HASHING
Hashing is an important data structure designed to solve the problem of efficiently finding
and storing data in an array. For example, if you have a list of 20000 numbers, and you
have given a number to search in that list- you will scan each number in the list until you
find a match.
The hash function in the data structure verifies the file which has been imported from
another source. A hash key for an item can be used to accelerate the process. It increases
the efficiency of retrieval and optimises the search. This is how we can simply give
hashing definition in data structure.
It requires a significant amount of your time to search in the entire list and locate that
specific number. This manual process of scanning is not only time-consuming but
inefficient too. With hashing in the data structure, you can narrow down the search and
find the number within seconds.
What is Hashing in Data Structure?

Hashing in the data structure is a technique of mapping a large chunk of data into small
tables using a hashing function. It is also known as the message digest function. It is a
technique that uniquely identifies a specific item from a collection of similar items.
It uses hash tables to store the data in an array format. Each value in the array has been
assigned a unique index number. Hash tables use a technique to generate these unique
index numbers for each value stored in an array format. This technique is called the hash
technique.
You only need to find the index of the desired item, rather than finding the data. With
indexing, you can quickly scan the entire list and retrieve the item you wish. Indexing also
helps in inserting operations when you need to insert data at a specific location. No matter
how big or small the table is, you can update and retrieve data within seconds.
The hash table is basically the array of elements and the hash techniques of search are
performed on a part of the item i.e. key. Each key has been mapped to a number, the range
remains from 0 to table size 1
Types of hashing in data structure is a two-step process.
1. The hash function converts the item into a small integer or hash value. This integer is
used as an index to store the original data.
2. It stores the data in a hash table. You can use a hash key to locate data quickly.
Examples of Hashing in Data Structure
The following are real-life examples of hashing in the data structure –
 In schools, the teacher assigns a unique roll number to each student. Later, the teacher
uses that roll number to retrieve information about that student.
 A library has an infinite number of books. The librarian assigns a unique number to
each book. This unique number helps in identifying the position of the books on the
bookshelf.
Hash Function
The hash function in a data structure maps the arbitrary size of data to fixed-sized data. It
returns the following values: a small integer value (also known as hash value), hash
codes, and hash sums. The hashing techniques in the data structure are very interesting,
such as:
hash = hashfunc(key)
index = hash % array_size
The hash function must satisfy the following requirements:
 A good hash function is easy to compute.
 A good hash function never gets stuck in clustering and distributes keys evenly across
the hash table.
 A good hash function avoids collision when two elements or items get assigned to the
same hash value.
One of the hashing techniques of using a hash function is used for data integrity. If using a
hash function one change in a message will create a different hash.
The three characteristics of the hash function in the data structure are:
1. Collision free
2. Property to be hidden
3. Puzzle friendly
Hash Table
Hashing in data structure uses hash tables to store the key-value pairs. The hash table
then uses the hash function to generate an index. Hashing uses this unique index to
perform insert, update, and search operations.
It can be defined as a bucket where the data are stored in an array format. These data have
their own index value. If the index values are known then the process of accessing the data
is quicker.
How does Hashing in Data Structure Works?
In hashing, the hashing function maps strings or numbers to a small integer value. Hash
tables retrieve the item from the list using a hashing function. The objective of hashing
technique is to distribute the data evenly across an array. Hashing assigns all the elements
a unique key. The hash table uses this key to access the data in the list.
Hash table stores the data in a key-value pair. The key acts as an input to the hashing
function. Hashing function then generates a unique index number for each value stored.
The index number keeps the value that corresponds to that key. The hash function returns
a small integer value as an output. The output of the hashing function is called the hash
value.
Let us understand hashing in a data structure with an example. Imagine you need to
store some items (arranged in a key-value pair) inside a hash table with 30 cells.
The values are: (3,21) (1,72) (40,36) (5,30) (11,44) (15,33) (18,12) (16,80) (38,99)
The hash table will look like the following:
Serial Array
Key Hash
Number Index
1 3 3%30 = 3 3
2 1 1%30 = 1 1
3 40 40%30 = 10 10
4 5 5%30 = 5 5
5 11 11%30 = 11 11
6 15 15%30 = 15 15
7 18 18%30 = 18 18
8 16 16%30 = 16 16
9 38 38%30 = 8 8
The process of taking any size of data and then converting that into smaller data value
which can be named as hash value. This hash value can be used in an index accessible in
hash table. This process defines hashing in data structure.
9.3 COLLISION RESOLUTION TECHNIQUES
Hashing in data structure falls into a collision if two keys are assigned the same index
number in the hash table. The collision creates a problem because each index in a hash
table is supposed to store only one value. Hashing in data structure uses several collision
resolution techniques to manage the performance of a hash table.
In this section, we have explored the idea of collision in hashing and explored different
collision resolution techniques such as:
 Open Hashing (Separate chaining)
 Closed Hashing (Open Addressing)

 Liner Probing
 Quadratic probing
 Double hashing
 Hash table: a data structure where the data is stored based upon its hashed key which
is obtained using a hashing function.
 Hash function: a function which for a given data, outputs a value mapped to a fixed
range. A hash table leverages the hash function to efficiently map data such that it can
be retrieved and updated quickly. Simply put, assume S = {s1, s2, s3, ...., sn} to be a
set of objects that we wish to store into a map of size N, so we use a hash function H,
such that for all s belonging to S; H(s) -> x, where x is guaranteed to lie in the range
[1,N]
 Perfect Hash function: a hash function that maps each item into a unique slot (no
collisions).
Hash Collisions: As per the Pigeonhole principle if the set of objects we intend to store
within our hash table is larger than the size of our hash table we are bound to have two or
more different objects having the same hash value; a hash collision. Even if the size of the
hash table is large enough to accommodate all the objects finding a hash function which
generates a unique hash for each object in the hash table is a difficult task. Collisions are
bound to occur (unless we find a perfect hash function, which in most of the cases is hard to
find) but can be significantly reduced with the help of various collision resolution techniques.
Following are the collision resolution techniques used:
 Open Hashing (Separate chaining)
 Closed Hashing (Open Addressing)

 Liner Probing
 Quadratic probing
 Double hashing
1. Open Hashing (Separate chaining)

Collisions are resolved using a list of elements to store objects with the same key together.
 Suppose you wish to store a set of numbers = {0,1,2,4,5,7} into a hash table of size 5.
 Now, assume that we have a hash function H, such that H(x) = x%5
 So, if we were to map the given data with the given hash function we'll get the
corresponding values
H(0)-> 0%5 = 0
H(1)-> 1%5 = 1
H(2)-> 2%5 = 2
H(4)-> 4%5 = 4
H(5)-> 5%5 = 0
H(7)-> 7%5 = 2
 Clearly 0 and 5, as well as 2 and 7 will have the same hash value, and in this case
we'll simply append the colliding values to a list being pointed by their hash keys.
Obviously in practice the table size can be significantly large and the hash function can be
even more complex, also the data being hashed would be more complex and non-primitive,
but the idea remains the same.
This is an easy way to implement hashing but it has its own demerits.
 The lookups/inserts/updates can become linear [O(N)] instead of constant time [O(1)]
if the hash function has too many collisions.
 It doesn't account for any empty slots which can be leveraged for more efficient
storage and lookups.
 Ideally, we require a good hash function to guarantee even distribution of the values.
 Say, for a load factor
λ=number of objects stored in table/size of the table (can be >1)
a good hash function would guarantee that the maximum length of list associated with
each key is close to the load factor.
Note that the order in which the data is stored in the lists (or any other data structures) is
based upon the implementation requirements. Some general ways include insertion order,
frequency of access etc.
2. Closed Hashing (Open Addressing)

This collision resolution technique requires a hash table with fixed and known size. During
insertion, if a collision is encountered, alternative cells are tried until an empty bucket is
found. These techniques require the size of the hash table to be supposedly larger than the
number of objects to be stored (something with a load factor < 1 is ideal).
There are various methods to find these empty buckets:
a. Liner Probing
b. Quadratic probing
c. Double hashing
Linear Probing
Hashing in data structure results in an array index that is already occupied to store a
value. In such a case, hashing performs a search operation and probes linearly for the next
empty cell.
Linear probing in hash techniques is known to be the easiest way to resolve any collisions
in hash tables. A sequential search can be performed to find any collision that occurred.
Linear Probing Example

Imagine you have been asked to store some items inside a hash table of size 30. The items
are already sorted in a key-value pair format. The values given are: (3,21) (1,72) (63,36)
(5,30) (11,44) (15,33) (18,12) (16,80) (46,99).
The hash(n) is the index computed using a hash function and T is the table size. If slot
index = ( hash(n) % T) is full, then we look for the next slot index by adding 1 ((hash(n) +
1) % T). If (hash(n) + 1) % T is also full, then we try (hash(n) + 2) % T. If (hash(n) + 2) %
T is also full, then we try (hash(n) + 3) % T.
The hash table will look like the following:
Array
Serial Array Index after
Key Hash
Number Index Linear
Probing
1 3 3%30 = 3 3 3
2 1 1%30 = 1 1 1
3 63 63%30 = 3 3 4
4 5 5%30 = 5 5 5
5 11 11%30 = 11 11 11
6 15 15%30 = 15 15 15
7 18 18%30 = 18 18 18
8 16 16%30 = 16 16 16
9 46 46%30 = 8 16 17
The idea of linear probing is simple, we take a fixed sized hash table and every time we face
a hash collision, we linearly traverse the table in a cyclic manner to find the next empty slot.
 Assume a scenario where we intend to store the following set of numbers =
{0,1,2,4,5,7} into a hash table of size 5 with the help of the following hash function
H, such that H(x) = x%5.
 So, if we were to map the given data with the given hash function we'll get the
corresponding values
H(0)-> 0%5 = 0
H(1)-> 1%5 = 1
H(2)-> 2%5 = 2
H(4)-> 4%5 = 4
H(5)-> 5%5 = 0
 in this case we see a collision of two terms (0 & 5). In this situation we move linearly
down the table to find the first empty slot. Note that this linear traversal is cyclic in
nature, i.e. in the event we exhaust the last element during the search we start again
from the beginning until the initial key is reached.
 In this case our hash function can be considered as this:

H(x, i) = (H(x) + i)%N
where N is the size of the table and i represents the linearly increasing variable which
starts from 1 (until empty bucket is found).
Despite being easy to compute, implement and deliver best cache performance, this suffers
from the problem of clustering (many consecutive elements get grouped together, which
eventually reduces the efficiency of finding elements or empty buckets).
b. Quadratic Probing
This method lies in the middle of great cache performance and the problem of clustering. The
general idea remains the same, the only difference is that we look at the Q(i) increment at
each iteration when looking for an empty bucket, where Q(i) is some quadratic expression of
i. A simple expression of Q would be Q(i) = i^2, in which case the hash function looks
something like this:
H(x, i) = (H(x) + i^2)%N
 In general, H(x, i) = (H(x) + ((c1\*i^2 + c2\*i + c3)))%N, for some choice of
constants c1, c2, and c3
 Despite resolving the problem of clustering significantly it may be the case that in
some situations this technique does not find any available bucket, unlike linear
probing which always finds an empty bucket.
 Luckily, we can get good results from quadratic probing with the right combination of
probing function and hash table size which will guarantee that we will visit as many
slots in the table as possible. In particular, if the hash table's size is a prime number
and the probing function is H(x, i) = i^2, then at least 50% of the slots in the table will
be visited. Thus, if the table is less than half full, we can be certain that a free slot will
eventually be found.
 Alternatively, if the hash table size is a power of two and the probing function is H(x,
i) = (i^2 + i)/2, then every slot in the table will be visited by the probing function.
 Assume a scenario where we intend to store the following set of numbers = {0,1,2,5}
into a hash table of size 5 with the help of the following hash function H, such
that H(x, i) = (x%5 + i^2)%5.
Clearly 5 and 0 will face a collision, in which case we'll do the following:
- we look at 5%5 = 0 (collision)
- we look at (5%5 + 1^2)%5 = 1 (collision)
- we look at (5%5 + 2^2)%5 = 4 (empty -> place element here)
c. Double Hashing
Double Hashing
The double hashing technique uses two hash functions. The second hash function comes
into use when the first function causes a collision. It provides an offset index to store the
value.
The formula for the double hashing technique is as follows:
(firstHash(key) + i * secondHash(key)) % sizeOfTable
Where i is the offset value. This offset value keeps incremented until it finds an empty
slot.
For example, you have two hash functions: h1 and h2. You must perform the following
steps to find an empty slot:
1. Verify if hash1(key) is empty. If yes, then store the value on this slot.
2. If hash1(key) is not empty, then find another slot using hash2(key).
3. Verify if hash1(key) + hash2(key) is empty. If yes, then store the value on this slot.
4. Keep incrementing the counter and repeat with hash1(key)+2hash2(key),
hash1(key)+3hash2(key), and so on, until it finds an empty slot.
This method is based upon the idea that in the event of a collision we use an another hashing
function with the key value as an input to find where in the open addressing scheme the data
should actually be placed at.
 In this case we use two hashing functions, such that the final hashing function looks
like:
H(x, i) = (H1(x) + i*H2(x))%N
 Typically for H1(x) = x%N a good H2 is H2(x) = P - (x%P), where P is a prime
number smaller than N.
 A good H2 is a function which never evaluates to zero and ensures that all the cells of
a table are effectively traversed.
 Assume a scenario where we intend to store the following set of numbers = {0,1,2,5}
into a hash table of size 5 with the help of the following hash function H, such that
H(x, i) = (H1(x) + i*H2(x))%5
H1(x) = x%5 and H2(x) = P - (x%P), where P = 3
(3 is a prime smaller than 5)
Clearly 5 and 0 will face a collision, in which case we'll do the following:
- we look at 5%5 = 0 (collision)
- we look at (5%5 + 1*(3 - (5%3)))%5 = 1 (collision)
- we look at (5%5 + 2*(3 - (5%3)))%5 = 2 (collision)
- we look at (5%5 + 3*(3 - (5%3)))%5 = 3 (empty -> place element here)
9.4 IMPLEMENTATION OF HASHING
Following are the basic primary operations of a hash table.
 Search − Searches an element in a hash table.

 Insert − inserts an element in a hash table.
 delete − Deletes an element from a hash table.
DataItem
Define a data item having some data and key, based on which the search is to be conducted in
a hash table.
struct DataItem {
int data;
int key;
};
Hash Method
Define a hashing method to compute the hash code of the key of the data item.
int hashCode(int key){
return key % SIZE;
}
Search Operation
Whenever an element is to be searched, compute the hash code of the key passed and locate
the element using that hash code as index in the array. Use linear probing to get the element
ahead if the element is not found at the computed hash code.
Example
struct DataItem *search(int key) {
//get the hash
int hashIndex = hashCode(key);
//move in array until an empty
while(hashArray[hashIndex] != NULL) {
if(hashArray[hashIndex]->key == key)
return hashArray[hashIndex];
//go to next cell
++hashIndex;
//wrap around the table
hashIndex %= SIZE;
}
return NULL;
}
Insert Operation
Whenever an element is to be inserted, compute the hash code of the key passed and locate
the index using that hash code as an index in the array. Use linear probing for empty location,
if an element is found at the computed hash code.
Example
void insert(int key,int data) {
struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem));
item->data = data;
item->key = key;
//get the hash

//move in array until an empty or deleted cell
while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1) {
//go to next cell
++hashIndex;
hashIndex %= SIZE;
}
hashArray[hashIndex] = item;
}
Delete Operation
Whenever an element is to be deleted, compute the hash code of the key passed and locate the
index using that hash code as an index in the array. Use linear probing to get the element
ahead if an element is not found at the computed hash code. When found, store a dummy
item there to keep the performance of the hash table intact.
Example
struct DataItem* delete(struct DataItem* item) {
int key = item->key;
//get the hash
//move in array until an empty
while(hashArray[hashIndex] !=NULL) {
if(hashArray[hashIndex]->key == key) {
struct DataItem* temp = hashArray[hashIndex];
//assign a dummy item at deleted position
hashArray[hashIndex] = dummyItem;
return temp;
} //go to next cell
++hashIndex;
hashIndex %= SIZE;
}
return NULL;
}
9.5 SUMMARY
In this unit we have studied hashing in detail. We also learnt collision resolution techniques.
At the end of this unit dealt with implementation of hashing.
9.6 KEYWORDS
Key, Hashing, Hash function, Hash table, Search, Insert and Delete .
9.7 QUESTIONS
1. Define hashing.
2. Discuss components of hashing.
3. Explain hash function.
4. Describe open hashing.
5. Briefly explain liner probing.
9.8 REFERENCES

Bruno R. Preiss
8. "Data Structures and Algorithm Analysis in Java" by Mark A. Weiss
UNIT 10: PRIORITY QUEUES AND HEAPS
10.0 Objectives
10.1 Introduction
10.2 Priority queue
10.3 Implementation of priority queue
10.4 Heap data structure
10.5 Implementation of heap data structure
10.6 Summary
10.7 Key words
10.8 Questions
10.9 References
10.0 OBJECTIVES

 Discuss Priority queue
 Describe Implementation of priority queue
 Elucidate Heap data structure
 Explain Implementation of heap data structure
10.1 INTRODUCTION
The priority queue in the data structure is an extension of the “normal” queue. It is an
abstract data type that contains a group of items. It is like the “normal” queue except that
the dequeuing elements follow a priority order. The priority order dequeues those items
first that have the highest priority.
It is an abstract data type that provides a way to maintain the dataset. The “normal” queue
follows a pattern of first-in-first-out. It dequeues elements in the same order followed at
the time of insertion operation. However, the element order in a priority queue depends on
the element’s priority in that queue. The priority queue moves the highest priority
elements at the beginning of the priority queue and the lowest priority elements at the back
of the priority queue.
It supports only those elements that are comparable. Hence, a priority queue in the data
structure arranges the elements in either ascending or descending order.
You can think of a priority queue as several patients waiting in line at a hospital. Here, the
situation of the patient defines the priority order. The patient with the most severe injury
would be the first in the queue.
Characteristics of a Priority Queue
A queue is termed as a priority queue if it has the following characteristics:
 Each item has some priority associated with it.

 An item with the highest priority is moved at the front and deleted first.
 If two elements share the same priority value, then the priority queue follows the first-
in-first-out principle for de queue operation.
A priority queue is of two types:
 Ascending Order Priority Queue

 Descending Order Priority Queue
Implementation of the Priority Queue in Data Structure
You can implement the priority queues in one of the following ways:
 Linked list
 Binary heap
 Arrays
 Binary search tree
The binary heap is the most efficient method for implementing the priority queue in
the data structure.
The below tables summarize the complexity of different operations in a priority queue.
Unordered Ordered Binary Binary Search

Operation
Array Array Heap Tree
Insert 0(1) 0(N) 0(log(N)) 0(log(N))
Peek 0(N) 0(1) 0(1) 0(1)
Delete 0(N) 0(1) 0(log (N)) 0(log(N))
Heap Data Structure is also known as Binary Heap that is in the form of a tree and follows
the property of a complete binary tree such that all the levels of the tree are filled by nodes
except the last level, that can be partially filled.
Although it is represented in the form of a tree, it is stored in the memory as an array unlike a
tree that is obtained through referring to child nodes. As the elements are stored contiguously
in an array, it is more cache-friendly while the complete binary tree ensures that there are the
least number of tree levels possible for total elements.
The following mathematical formula can help us find the left child and right child or a tree or
even the parent in the array.
Here, i, means index such that if we want any such relationship, we can substitute the value
of the index and find out our necessary requirements easily.
Now that we have an idea on what is heap in data structure, we move on looking at the use
cases of heap data structure. Heap data structure is used to implement priority queues in
problem-solving while it is extensively used in heap sort which will be covered in the further
sections alongside heap sort algorithm.
Types of Heap in Data Structure

Now that we have some general understanding of heap data structure, there are two types of
heap which can be classified as:-
1. Min Heap :- The smallest element is present at the root of the tree in the min heap such
that it is easier to extract the smallest element when heap pop is performed.
2. Max Heap :- The greatest element is present at the root of the tree in the max heap such
that it is easier to extract the largest element when heap pop is performed.
The illustration stating the both is represented below:
10.2 PRIORITY QUEUE

A priority queue is a type of queue that arranges elements based on their priority values.
Elements with higher priority values are typically retrieved before elements with lower
priority values.
In a priority queue, each element has a priority value associated with it. When you add an
element to the queue, it is inserted in a position based on its priority value. For example, if
you add an element with a high priority value to a priority queue, it may be inserted near
the front of the queue, while an element with a low priority value may be inserted near the
back.
There are several ways to implement a priority queue, including using an array, linked list,
heap, or binary search tree. Each method has its own advantages and disadvantages, and the
best choice will depend on the specific needs of your application.
Priority queues are often used in real-time systems, where the order in which elements are
processed can have significant consequences. They are also used in algorithms to improve
their efficiencies, such as Dijkstra’s algorithm for finding the shortest path in a graph and
the A* search algorithm for pathfinding.
Properties of Priority Queue
So, a priority Queue is an extension of the queue with the following properties.
 Every item has a priority associated with it.
 An element with high priority is dequeued before an element with low priority.
 If two elements have the same priority, they are served according to their order
in the queue.
In the below priority queue, an element with a maximum ASCII value will have the highest
priority. The elements with higher priority are served first.
How is Priority assigned to the elements in a Priority Queue?

In a priority queue, generally, the value of an element is considered for assigning the
priority.
For example, the element with the highest value is assigned the highest priority and the
element with the lowest value is assigned the lowest priority. The reverse case can also be
used i.e., the element with the lowest value can be assigned the highest priority. Also, the
priority can be assigned according to our needs.
Operations of a Priority Queue:
A typical priority queue supports the following operations:
1) Insertion in a Priority Queue
When a new element is inserted in a priority queue, it moves to the empty slot from top to
bottom and left to right. However, if the element is not in the correct place then it will be
compared with the parent node. If the element is not in the correct order, the elements are
swapped. The swapping process continues until all the elements are placed in the correct
position.
2) Deletion in a Priority Queue
As you know that in a max heap, the maximum element is the root node. And it will
remove the element which has maximum priority first. Thus, you remove the root node
from the queue. This removal creates an empty slot, which will be further filled with new
insertion. Then, it compares the newly inserted element with all the elements inside the
queue to maintain the heap invariant.
3) Peek in a Priority Queue
This operation helps to return the maximum element from Max Heap or the minimum
element from Min Heap without deleting the node from the priority queue.
Types of Priority Queue:
1) Ascending Order Priority Queue
As the name suggests, in ascending order priority queue, the element with a lower priority
value is given a higher priority in the priority list. For example, if we have the following
elements in a priority queue arranged in ascending order like 4,6,8,9,10. Here, 4 is the
smallest number, therefore, it will get the highest priority in a priority queue and so when
we dequeue from this type of priority queue, 4 will remove from the queue and dequeue
returns 4.
2) Descending order Priority Queue
The root node is the maximum element in a max heap, as you may know. It will also
remove the element with the highest priority first. As a result, the root node is removed
from the queue. This deletion leaves an empty space, which will be filled with fresh
insertions in the future. The heap invariant is then maintained by comparing the newly
inserted element to all other entries in the queue.
Types of Priority Queues

Difference between Priority Queue and Normal Queue?
There is no priority attached to elements in a queue, the rule of first-in-first-out(FIFO) is
implemented whereas, in a priority queue, the elements have a priority. The elements with
higher priority are served first.
Priority queue can be implemented using the following data structures:
 Arrays
 Linked list
 Heap data structure
 Binary search tree
10.3 IMPLEMENTATION OF PRIORITY QUEUE

1) Implement Priority Queue Using Array:
A simple implementation is to use an array of the following structure.
struct item {
int item;
int priority;
}
 enqueue(): This function is used to insert new data into the queue.
 dequeue(): This function removes the element with the highest priority from the
queue.
 peek()/top(): This function is used to get the highest priority element in the
queue without removing it from the queue.
Arrays enqueue() dequeue() peek()
Time Complexity O(1) O(n) O(n)
// C++ program to implement Priority Queue

// using Arrays
#include <bits/stdc++.h>
using namespace std;
// Structure for the elements in the
// priority queue
struct item {
int value;
int priority;
};
// Store the element of a priority queue
item pr[100000];
// Pointer to the last index
int size = -1;
// Function to insert a new element
// into priority queue
void enqueue(int value, int priority)
{
// Increase the size
size++;
// Insert the element
pr[size].value = value;
pr[size].priority = priority;
}
// Function to check the top element
int peek()
{
int highestPriority = INT_MIN;
int ind = -1;
// Check for the element with
// highest priority
for (int i = 0; i <= size; i++) {
// If priority is same choose
// the element with the
// highest value
if (highestPriority == pr[i].priority && ind > -1
&& pr[ind].value < pr[i].value) {
highestPriority = pr[i].priority;
ind = i;
}
else if (highestPriority < pr[i].priority) {
highestPriority = pr[i].priority;
ind = i;
}
}
// Return position of the element
return ind;
}
// Function to remove the element with
// the highest priority
void dequeue()
{
// Find the position of the element
// with highest priority
int ind = peek();
// Shift the element one index before
// from the position of the element
// with highest priority is found
for (int i = ind; i < size; i++) {
pr[i] = pr[i + 1];
}
// Decrease the size of the
// priority queue by one
size--;
}
// Driver Code
int main()
{
// Function Call to insert elements
// as per the priority
enqueue(10, 2);
enqueue(14, 4);
enqueue(16, 4);
enqueue(12, 3);
// Stores the top element
// at the moment
int ind = peek();
cout << pr[ind].value << endl;
// Dequeue the top element
dequeue();
// Check the top element
ind = peek();
// Dequeue the top element
dequeue();
// Check the top element
ind = peek();
return 0;
}
Output
16
14
12
2) Implement Priority Queue Using Linked List:
In a LinkedList implementation, the entries are sorted in descending order based on their
priority. The highest priority element is always added to the front of the priority queue,
which is formed using linked lists. The functions like push(), pop() and peek() are used to
implement a priority queue using a linked list and are explained as follows:
 push(): This function is used to insert new data into the queue.
 pop(): This function removes the element with the highest priority from the
queue.
 peek() / top(): This function is used to get the highest priority element in the
queue without removing it from the queue.
Linked List push() pop() peek()

Linked List push() pop() peek()
Time Complexity O(n) O(1) O(1)
// C++ code to implement Priority Queue

// using Linked List
#include <bits/stdc++.h>
using namespace std;
// Node
typedef struct node {
int data;
// Lower values indicate
// higher priority
int priority;
struct node* next;
} Node;
// Function to create a new node
Node* newNode(int d, int p)
{
Node* temp = (Node*)malloc(sizeof(Node));
temp->data = d;
temp->priority = p;
temp->next = NULL;
return temp;
}
// Return the value at head
int peek(Node** head) { return (*head)->data; }
// Removes the element with the
// highest priority form the list
void pop(Node** head)
{
Node* temp = *head;
(*head) = (*head)->next;
free(temp);
}
// Function to push according to priority
void push(Node** head, int d, int p)
{
Node* start = (*head);
// Create new Node
Node* temp = newNode(d, p);
// Special Case: The head of list has
// lesser priority than new node
if ((*head)->priority < p) {
// Insert New Node before head
temp->next = *head;
(*head) = temp;
}
else {
// Traverse the list and find a

// position to insert new node
while (start->next != NULL
&& start->next->priority > p) {
start = start->next;
}
// Either at the ends of the list
// or at required position
temp->next = start->next;
start->next = temp;
}
}
// Function to check is list is empty
int isEmpty(Node** head) { return (*head) == NULL; }
// Driver code
int main()
{
// Create a Priority Queue
// 7->4->5->6
Node* pq = newNode(4, 1);
push(&pq, 5, 2);
push(&pq, 6, 3);
push(&pq, 7, 0);
while (!isEmpty(&pq)) {
cout << " " << peek(&pq);
pop(&pq);
}
return 0;
}
Output
6547
Note: We can also use Linked List, time complexity of all operations with linked list
remains same as array. The advantage with linked list is deleteHighestPriority() can be
more efficient as we don’t have to move items.
3) Implement Priority Queue Using Heaps:
Binary Heap is generally preferred for priority queue implementation because heaps
provide better performance compared to arrays or LinkedList. Considering the properties of
a heap, The entry with the largest key is on the top and can be removed immediately. It
will, however, take time O(log n) to restore the heap property for the remaining keys.
However if another entry is to be inserted immediately, then some of this time may be
combined with the O(log n) time needed to insert the new entry. Thus the representation of
a priority queue as a heap proves advantageous for large n, since it is represented efficiently
in contiguous storage and is guaranteed to require only logarithmic time for both insertions
and deletions. Operations on Binary Heap are as follows:
 insert(p): Inserts a new element with priority p.
 extractMax(): Extracts an element with maximum priority.
 remove(i): Removes an element pointed by an iterator i.
 getMax(): Returns an element with maximum priority.
 changePriority(i, p): Changes the priority of an element pointed by i to p.
Binary Heap insert() remove() peek()
Time Complexity O(log n) O(log n) O(1)
4) Implement Priority Queue Using Binary Search Tree:

A Self-Balancing Binary Search Tree like AVL Tree, Red-Black Tree, etc. can also be used
to implement a priority queue. Operations like peek(), insert() and delete() can be
performed using BST.
Binary Search Tree peek() insert() delete()
Time Complexity O(1) O(log n) O(log n)
Applications of Priority Queue:

 CPU Scheduling
 Graph algorithms like Dijkstra’s shortest path algorithm, Prim’s Minimum
Spanning Tree, etc.
 Stack Implementation
 All queue applications where priority is involved.
 Data compression in Huffman code
 Event-driven simulation such as customers waiting in a queue.
 Finding Kth largest/smallest element.
Advantages of Priority Queue:
 It helps to access the elements in a faster way. This is because elements in a
priority queue are ordered by priority, one can easily retrieve the highest priority
element without having to search through the entire queue.
 The ordering of elements in a Priority Queue is done dynamically. Elements in a
priority queue can have their priority values updated, which allows the queue to
dynamically reorder itself as priorities change.
 Efficient algorithms can be implemented. Priority queues are used in many
algorithms to improve their efficiency, such as Dijkstra’s algorithm for finding
the shortest path in a graph and the A* search algorithm for pathfinding.
 Included in real-time systems. This is because priority queues allow you to
quickly retrieve the highest priority element, they are often used in real-time
systems where time is of the essence.
Disadvantages of Priority Queue:
 High complexity. Priority queues are more complex than simple data structures
like arrays and linked lists, and may be more difficult to implement and
maintain.
 High consumption of memory. Storing the priority value for each element in a
priority queue can take up additional memory, which may be a concern in
systems with limited resources.
 It is not always the most efficient data structure. In some cases, other data
structures like heaps or binary search trees may be more efficient for certain
operations, such as finding the minimum or maximum element in the queue.
 At times it is less predictable:. This is because the order of elements in a priority
queue is determined by their priority values, the order in which elements are
retrieved may be less predictable than with other data structures like stacks or
queues, which follow a first-in, first-out (FIFO) or last-in, first-out (LIFO) order.
10.4 HEAP DATA STRUCTURE
What is a Heap in Data Structures?
A heap is a complete binary tree structure where each element satisfies a heap property. In a
complete binary tree, all levels are full except the last level, i.e., nodes in all levels except the
last level will have two children. The last level will be filled from the left. Here, each heap
node stores a value key, which defines the relative position of that node inside the heap.
Visualization of a complete binary tree

Examples of In-complete binary tree
Examples of complete binary tree
There are two types of heap data structures: max heap and min heap.
What is Max-Heap Data Structure?
All elements in this heap satisfy the property that the key of the parent node is greater than or
equal to the keys of its child nodes i.e. key of a node >= key of its children. So, moving up
from any node, we get a nondecreasing sequence of keys, and moving down from any node,
we get a nonincreasing sequence of keys. In particular: The largest key in a max-heap is
found at the root.
What is Min-Heap Data Structure?
All elements in this heap satisfy the property that the key of the parent node is less than or
equal to the keys of its child nodes i.e. key of a node <= key of its children. So, moving up
from any node, we get a nonincreasing sequence of keys, and moving down from any node,
we get a nondecreasing sequence of keys. In particular: The smallest key in a min-heap is
found at the root.
Special notes related to the Heap
 A binary heap is a binary tree that satisfies two properties: (1) Shape property: all
levels, except the last level, are fully filled and the last level is filled from left to right
(2) Heap property
 Level order traversal of the heap will give the order in which elements are filled in the
array.
 Heap is a complete tree structure, so we define the height of a node in a heap as the
number of edges on the longest path from the node to a leaf.
 We define the height of the heap to be the height of its root. Since a heap of n
elements is based on a complete binary tree, its height is O(logn).
 In the worst case, we shall see that the basic operations on heaps run in time
proportional to the tree's height and thus take O(logn) time.
Array implementation of Heap
A binary heap can be represented using an array where the indices of the array capture the
parent-child relationship. Suppose A[] be a heap array of size n:
 The root of the binary heap is stored at A[0].

 Given element A[i], the children of this element are stored in A[2i + 1] and A[2i + 2],
if they exist.
 The left child of i denoted as left(i) = A[2i + 1], if 2i + 1 < n
 The right child of i denoted as right(i) = A[2i + 2], if 2i + 2 < n
 The parent of A[i] is stored in A[(i−1)/2].
Note: In most programming languages, these operations can be implemented efficiently using
bitwise operators. Therefore, an array representation is a space-efficient approach as we don’t
need to store extra 3 pointers per node in the heap.
 Thus, for a max-heap, we can say that A[i] ≥ A[2i + 1] and A[i] ≥ A[2*i + 2], where
(2*i + 1) and (2i + 2) are < n. As we know, the key of each node in a max-heap is
greater than the key of its children, hence, the maximum key in the heap will be stored
at the root, that is, at A[0].
 Similarly, min-heap will satisfy the property that for any index i, A[i] ≤ A[2i + 1] and
A[i] ≤ A[2*i + 2], where (2*i + 1) and (2i + 2) are < n. Thus, for a min-heap, the
minimum element will be at the root of the heap and thus, at A[0].
Operations supported by Max-Heap
Various operations supported by max heap is described below on a high level and will be
covered in more detail in subsequent articles:
 maxHeapify(A[], i): It is a method to rearrange the elements of the heap in order to
maintain the heap property. This process is required when a certain node at index i
causes an imbalance in the heap due to some operation on that node.
 buildMaxHeap(A[]): We can use the procedure to convert an input array into a max-
heap.
 findMax(heap[]): This operation returns the maximum value in the heap and its time
complexity is O(1) as it just needs to return A[0].
 extractMax(A[]): This operation removes the maximum element from the heap and
returns it. The time complexity of this operation is O(logn) as we replace A[0] with
A[n-1] — the last element of the heap, and then do some operations to maintain the
max-heap property.
 increaseKey(A[], i, v): This operation increases the value at index i in the array to
value v. This operation is only valid if A[i] ≤ v, that is the new value is greater than
the existing value at index i. This ensures that the subtree rooted at index i is still a
max-heap. The complexity of this operation is O(logn) as after increasing the key at
index i, the max-heap property of Parent(i) might be violated, and we might need to
perform some operations to restore it.
 insertKey(A[], v): This operation inserts the element v in a heap, and its complexity
is O(lg n). To implement this operation, we add an element at the end of the heap
(at A[n-1]) and then perform some operations to restore the heap property.
 deleteKey(A[], i): This operation is used to delete an element at index i, and the
complexity of this operation is O(logn). To delete any element, we can replace it with
the last element of the heap, and then again perform operations to restore the heap
property in case it is violated.
In a similar fashion min-heap support these operations:
 minHeapify(A[], i)
 buildMinHeap(A[])
 findMin(A[])
 extractMin(A[])
 decreaseKey(A[], i, v)
 insertKey(A[], v)
 deleteKey(A[], i).
Applications of Heap Data Structure
 Heaps are used to efficiently implement a priority queue, an important data structure
in computer science. One of the applications of priority queues is in process
scheduling in operating systems.
 Heaps are used by the Heapsort Algorithm, which is one of the fastest sorting
algorithms known. Its complexity is O(nlogn).
 Heaps are also used in the efficient implementations of algorithms like Dijkstra’s
shortest-path algorithm, where we need to pick the node closest to a given node. If
distances to all the nodes are stored in a heap then the closest node can be extracted
efficiently using min-heap.
 Heaps provide an efficient way to get the kth smallest or largest element in an array.
Heap data structure is a complete binary tree that satisfies the heap property, where any
given node is
 always greater than its child node/s and the key of the root node is the largest among
all other nodes. This property is also called max heap property.
 always smaller than the child node/s and the key of the root node is the smallest
among all other nodes. This property is also called min heap property.
Min-heap
Max-heap
This type of data structure is also called a binary heap.

Heap Operations
Some of the important operations performed on a heap are described below along with their
algorithms.
Heapify
Heapify is the process of creating a heap data structure from a binary tree. It is used to create
a Min-Heap or a Max-Heap.
Let the input array be initial array
Create a complete binary tree from the array
Complete binary tree
1. Start from the first index of non-leaf node whose index is given by n/2 - 1.
2. Start from the first on leaf node

3. Set current element i as largest.
4. The index of left child is given by 2i + 1 and the right child is given by 2i + 2.
If leftChild is greater than currentElement (i.e. element at ith index),
set leftChildIndex as largest.
If rightChild is greater than element in largest, set rightChildIndex as largest.
5. Swap largest with currentElementSwap if necessary
6. Repeat steps 3-7 until the subtrees are also heapified.
Algorithm
Heapify(array, size, i)
set i as largest
leftChild = 2i + 1
rightChild = 2i + 2
if leftChild > array[largest]

set leftChildIndex as largest
if rightChild > array[largest]
set rightChildIndex as largest
swap array[i] and array[largest]
To create a Max-Heap:
MaxHeap(array, size)
loop from the first index of non-leaf node down to zero
call heapify
For Min-Heap, both leftChild and rightChild must be larger than the parent for all nodes.
Insert Element into Heap
Algorithm for insertion in Max Heap
If there is no node,
create a newNode.
else (a node is already present)
insert the newNode at the end (last node from left to right.)
heapify the array
1. Insert the new element at the end of the tree.
Insert at the end

2. Heapify the tree.
Heapify the array

For Min Heap, the above algorithm is modified so that parentNode is always smaller
than newNode.
Delete Element from Heap
Algorithm for deletion in Max Heap
If nodeToBeDeleted is the leafNode

remove the node
Else swap nodeToBeDeleted with the lastLeafNode
remove noteToBeDeleted
heapify the array
1. Select the element to be deleted.
Select the element to be deleted
2. Swap it with the last element.
Swap with the last element
3. Remove the last element.

Remove the last element
4. Heapify the tree.
Heapify the array
For Min Heap, above algorithm is modified so that both childNodes are greater smaller
than currentNode.
Peek (Find max/min)
Peek operation returns the maximum element from Max Heap or minimum element from Min
Heap without deleting the node.
For both Max heap and Min Heap
return rootNode
Extract-Max/Min
Extract-Max returns the node with maximum value after removing it from a Max Heap
whereas Extract-Min returns the node with minimum after removing it from Min Heap.
10.5 IMPLEMENTATION OF HEAP DATA STRUCTURE
/ Max-Heap data structure in C

#include <stdio.h>
int size = 0;
void swap(int *a, int *b)
{
int temp = *b;
*b = *a;
*a = temp;
}
void heapify(int array[], int size, int i)
{
if (size == 1)
{
printf("Single element in the heap");
}
else
{
int largest = i;
int l = 2 * i + 1;
int r = 2 * i + 2;
if (l < size && array[l] > array[largest])
largest = l;
if (r < size && array[r] > array[largest])
largest = r;
if (largest != i)
{
swap(&array[i], &array[largest]);
heapify(array, size, largest);
}
}
}
void insert(int array[], int newNum)
{
if (size == 0)
{
array[0] = newNum;
size += 1;
}
else
{
array[size] = newNum;
size += 1;
for (int i = size / 2 - 1; i >= 0; i--)
{
heapify(array, size, i);
}
}
}
void deleteRoot(int array[], int num)
{
int i;
for (i = 0; i < size; i++)
{
if (num == array[i])
break;
}
swap(&array[i], &array[size - 1]);
size -= 1;
for (int i = size / 2 - 1; i >= 0; i--)
{
heapify(array, size, i);
}
}
void printArray(int array[], int size)
{
for (int i = 0; i < size; ++i)
printf("%d ", array[i]);
printf("\n");
}
int main()
{
int array[10];
insert(array, 3);
insert(array, 4);
insert(array, 9);
insert(array, 5);
insert(array, 2);
printf("Max-Heap array: ");

printArray(array, size);
deleteRoot(array, 4);
printf("After deleting an element: ");
printArray(array, size);
}
10.6 SUMMARY
In this unit we have discussed priority queue in detail. We also discussed implementation of
priority queue. At the end of this unit studied heap data structure and also implementation of
heap data structure.
10.7 KEYWORDS
Priority queue, Heapify, Peek(), Insert(), Remove(), Push() and Pop().
10.8 QUESTIONS
1. Write the characteristics of priority queue.

2. Explain types of heaps in data structure.
3. Discuss implementation of priority queue.
4. Write the advantages and disadvantages of priority queue.
5. Explain heap data structure.
10.9REFERENCES

Bruno R. Preiss
UNIT 11: ADVANCED TREES
Structure
11.0 Objectives
11.1 Introduction
11.2 Red black tree in data structure
11.3 Implementation of Red black tree
11.3 Splay tree
11.5 Implementation of splay tree
11.4 Summary
11.5 Key words
11.6 Questions
11.8 References
11.0 OBJECTIVES
The objective of this unit is to learn about:
 Red black tree in data structure

 Implementation of Red black tree
 Splay tree
 Implementation of splay tree
11.1 INTRODUCTION
When it comes to searching and sorting data, one of the most fundamental data structures is
the binary search tree. However, the performance of a binary search tree is highly
dependent on its shape, and in the worst case, it can degenerate into a linear structure with a
time complexity of O(n). This is where Red Black Trees come in, they are a type of
balanced binary search tree that use a specific set of rules to ensure that the tree is always
balanced. This balance guarantees that the time complexity for operations such as insertion,
deletion, and searching is always O(log n), regardless of the initial shape of the tree.
Red Black Trees are self-balancing, meaning that the tree adjusts itself automatically after
each insertion or deletion operation. It uses a simple but powerful mechanism to maintain
balance, by coloring each node in the tree either red or black.
Splay tree is a self-adjusting binary search tree data structure, which means that the tree
structure is adjusted dynamically based on the accessed or inserted elements. In other
words, the tree automatically reorganizes itself so that frequently accessed or inserted
elements become closer to the root node.
1. The splay tree was first introduced by Daniel Dominic Sleator and Robert Endre
Tarjan in 1985. It has a simple and efficient implementation that allows it to
perform search, insertion, and deletion operations in O(log n) amortized time
complexity, where n is the number of elements in the tree.
2. The basic idea behind splay trees is to bring the most recently accessed or
inserted element to the root of the tree by performing a sequence of tree
rotations, called splaying. Splaying is a process of restructuring the tree by
making the most recently accessed or inserted element the new root and
gradually moving the remaining nodes closer to the root.
3. Splay trees are highly efficient in practice due to their self-adjusting nature,
which reduces the overall access time for frequently accessed elements. This
makes them a good choice for applications that require fast and dynamic data
structures, such as caching systems, data compression, and network routing
algorithms.
4. However, the main disadvantage of splay trees is that they do not guarantee a
balanced tree structure, which may lead to performance degradation in worst-
case scenarios. Also, splay trees are not suitable for applications that require
guaranteed worst-case performance, such as real-time systems or safety-critical
systems.
Overall, splay trees are a powerful and versatile data structure that offers fast and efficient
access to frequently accessed or inserted elements. They are widely used in various
applications and provide an excellent tradeoff between performance and simplicity.
11.2 RED-BLACK TREE IN DATA STRUCTURE

The red-black tree is a binary search tree. The prerequisite of the red-black tree is that we
should know about the binary search tree. In a binary search tree, the values of the nodes in
the left subtree should be less than the value of the root node, and the values of the nodes in
the right subtree should be greater than the value of the root node.
Each node in the Red-black tree contains an extra bit that represents a color to ensure that the
tree is balanced during any operations performed on the tree like insertion, deletion, etc. In
a binary search tree, the searching, insertion and deletion take O(log2n) time in the average
case, O(1) in the best case and O(n) in the worst case.
Let's understand the different scenarios of a binary search tree.
In the above tree, if we want to search the 80. We will first compare 80 with the root node. 80
is greater than the root node, i.e., 10, so searching will be performed on the right subtree.
Again, 80 is compared with 15; 80 is greater than 15, so we move to the right of the 15, i.e.,
20. Now, we reach the leaf node 20, and 20 is not equal to 80. Therefore, it will show that the
element is not found in the tree. After each operation, the search is divided into half. The
above BST will take O(logn) time to search the element.
The above tree shows the right-skewed BST. If we want to search the 80 in the tree, we will
compare 80 with all the nodes until we find the element or reach the leaf node. So, the above
right-skewed BST will take O(N) time to search the element.
In the above BST, the first one is the balanced BST, whereas the second one is the
unbalanced BST. We conclude from the above two binary search trees that a balanced tree
takes less time than an unbalanced tree for performing any operation on the tree.
Therefore, we need a balanced tree, and the Red-Black tree is a self-balanced binary search
tree. Now, the question arises that why do we require a Red-Black tree if AVL is also a
height-balanced tree. The Red-Black tree is used because the AVL tree requires many
rotations when the tree is large, whereas the Red-Black tree requires a maximum of two
rotations to balance the tree. The main difference between the AVL tree and the Red-Black
tree is that the AVL tree is strictly balanced, while the Red-Black tree is not completely
height-balanced. So, the AVL tree is more balanced than the Red-Black tree, but the Red-
Black tree guarantees O(log2n) time for all operations like insertion, deletion, and searching.
Insertion is easier in the AVL tree as the AVL tree is strictly balanced, whereas deletion and
searching are easier in the Red-Black tree as the Red-Black tree requires fewer rotations.
As the name suggests that the node is either colored in Red or Black color. Sometimes no
rotation is required, and only recoloring is needed to balance the tree.
Properties of Red-Black tree

o It is a self-balancing Binary Search tree. Here, self-balancing means that it balances
the tree itself by either doing the rotations or recoloring the nodes.
o This tree data structure is named as a Red-Black tree as each node is either Red or
Black in color. Every node stores one extra information known as a bit that represents
the color of the node. For example, 0 bit denotes the black color while 1 bit denotes
the red color of the node. Other information stored by the node is similar to the binary
tree, i.e., data part, left pointer and right pointer.
o In the Red-Black tree, the root node is always black in color.
o In a binary tree, we consider those nodes as the leaf which have no child. In contrast,
in the Red-Black tree, the nodes that have no child are considered the internal nodes
and these nodes are connected to the NIL nodes that are always black in color. The
NIL nodes are the leaf nodes in the Red-Black tree.
o If the node is Red, then its children should be in Black color. In other words, we can
say that there should be no red-red parent-child relationship.
o Every path from a node to any of its descendant's NIL node should have same number
of black nodes.
Is every AVL tree can be a Red-Black tree?
Yes, every AVL tree can be a Red-Black tree if we color each node either by Red or Black
color. But every Red-Black tree is not an AVL because the AVL tree is strictly height-
balanced while the Red-Black tree is not completely height-balanced.
Insertion in Red Black tree
The following are some rules used to create the Red-Black tree:
1. If the tree is empty, then we create a new node as a root node with the color black.
2. If the tree is not empty, then we create a new node as a leaf node with a color red.
3. If the parent of a new node is black, then exit.
4. If the parent of a new node is Red, then we have to check the color of the parent's
sibling of a new node.
4a) If the color is Black, then we perform rotations and recoloring.
4b) If the color is Red then we recolor the node. We will also check whether the parents'
parent of a new node is the root node or not; if it is not a root node, we will recolor and
recheck the node.
Let's understand the insertion in the Red-Black tree.
10, 18, 7, 15, 16, 30, 25, 40, 60

Step 1: Initially, the tree is empty, so we create a new node having value 10. This is the first
node of the tree, so it would be the root node of the tree. As we already discussed, that root
node must be black in color, which is shown below:
Step 2: The next node is 18. As 18 is greater than 10 so it will come at the right of 10 as
shown below.
We know the second rule of the Red Black tree that if the tree is not empty then the newly
created node will have the Red color. Therefore, node 18 has a Red color, as shown in the
below figure:
Now we verify the third rule of the Red-Black tree, i.e., the parent of the new node is black or
not. In the above figure, the parent of the node is black in color; therefore, it is a Red-Black
tree.
Step 3: Now, we create the new node having value 7 with Red color. As 7 is less than 10,
so it will come at the left of 10 as shown below.
Now we verify the third rule of the Red-Black tree, i.e., the parent of the new node is black or
not. As we can observe, the parent of the node 7 is black in color, and it obeys the Red-Black
tree's properties.
Step 4: The next element is 15, and 15 is greater than 10, but less than 18, so the new node
will be created at the left of node 18. The node 15 would be Red in color as the tree is not
empty.
The above tree violates the property of the Red-Black tree as it has Red-red parent-child
relationship. Now we have to apply some rule to make a Red-Black tree. The rule 4 says
that if the new node's parent is Red, then we have to check the color of the parent's sibling
of a new node. The new node is node 15; the parent of the new node is node 18 and the
sibling of the parent node is node 7. As the color of the parent's sibling is Red in color, so we
apply the rule 4a. The rule 4a says that we have to recolor both the parent and parent's sibling
node. So, both the nodes, i.e., 7 and 18, would be recolored as shown in the below figure.
We also have to check whether the parent's parent of the new node is the root node or not. As
we can observe in the above figure, the parent's parent of a new node is the root node, so we
do not need to recolor it.
Step 5: The next element is 16. As 16 is greater than 10 but less than 18 and greater than 15,
so node 16 will come at the right of node 15. The tree is not empty; node 16 would be Red in
color, as shown in the below figure:
In the above figure, we can observe that it violates the property of the parent-child
relationship as it has a red-red parent-child relationship. We have to apply some rules to make
a Red-Black tree. Since the new node's parent is Red color, and the parent of the new node
has no sibling, so rule 4a will be applied. The rule 4a says that some rotations and recoloring
would be performed on the tree.
Since node 16 is right of node 15 and the parent of node 15 is node 18. Node 15 is the left of
node 18. Here we have an LR relationship, so we require to perform two rotations. First, we
will perform left, and then we will perform the right rotation. The left rotation would be
performed on nodes 15 and 16, where node 16 will move upward, and node 15 will move
downward. Once the left rotation is performed, the tree looks like as shown in the below
figure:
In the above figure, we can observe that there is an LL relationship. The above tree has a
Red-red conflict, so we perform the right rotation. When we perform the right rotation, the
median element would be the root node. Once the right rotation is performed, node 16 would
become the root node, and nodes 15 and 18 would be the left child and right child,
respectively, as shown in the below figure.
After rotation, node 16 and node 18 would be recolored; the color of node 16 is red, so it will
change to black, and the color of node 18 is black, so it will change to a red color as shown in
the below figure:
Step 6: The next element is 30. Node 30 is inserted at the right of node 18. As the tree is not
empty, so the color of node 30 would be red.
The color of the parent and parent's sibling of a new node is Red, so rule 4b is applied. In rule
4b, we have to do only recoloring, i.e., no rotations are required. The color of both the parent
(node 18) and parent's sibling (node 15) would become black, as shown in the below image.
We also have to check the parent's parent of the new node, whether it is a root node or not.
The parent's parent of the new node, i.e., node 30 is node 16 and node 16 is not a root node,
so we will recolor the node 16 and changes to the Red color. The parent of node 16 is node
10, and it is not in Red color, so there is no Red-red conflict.
Step 7: The next element is 25, which we have to insert in a tree. Since 25 is greater than 10,
16, 18 but less than 30; so, it will come at the left of node 30. As the tree is not empty, node
25 would be in Red color. Here Red-red conflict occurs as the parent of the newly created is
Red color.
Since there is no parent's sibling, so rule 4a is applied in which rotation, as well as recoloring,
are performed. First, we will perform rotations. As the newly created node is at the left of its
parent and the parent node is at the right of its parent, so the RL relationship is formed.
Firstly, the right rotation is performed in which node 25 goes upwards, whereas node 30 goes
downwards, as shown in the below figure.
After the first rotation, there is an RR relationship, so left rotation is performed. After right
rotation, the median element, i.e., 25 would be the root node; node 30 would be at the right of
25 and node 18 would be at the left of node 25.
Now recoloring would be performed on nodes 25 and 18; node 25 becomes black in color,
and node 18 becomes red in color.
Step 8: The next element is 40. Since 40 is greater than 10, 16, 18, 25, and 30, so node 40
will come at the right of node 30. As the tree is not empty, node 40 would be Red in color.
There is a Red-red conflict between nodes 40 and 30, so rule 4b will be applied.
As the color of parent and parent's sibling node of a new node is Red so recoloring would be
performed. The color of both the nodes would become black, as shown in the below image.
After recoloring, we also have to check the parent's parent of a new node, i.e., 25, which is
not a root node, so recoloring would be performed, and the color of node 25 changes to Red.
After recoloring, red-red conflict occurs between nodes 25 and 16. Now node 25 would be
considered as the new node. Since the parent of node 25 is red in color, and the parent's
sibling is black in color, rule 4a would be applied. Since 25 is at the right of the node 16 and
16 is at the right of its parent, so there is an RR relationship. In the RR relationship, left
rotation is performed. After left rotation, the median element 16 would be the root node, as
shown in the below figure.
After rotation, recoloring is performed on nodes 16 and 10. The color of node 10 and node 16
changes to Red and Black, respectively as shown in the below figure.
Step 9: The next element is 60. Since 60 is greater than 16, 25, 30, and 40, so node 60 will
come at the right of node 40. As the tree is not empty, the color of node 60 would be Red.
As we can observe in the above tree that there is a Red-red conflict occurs. The parent node is
Red in color, and there is no parent's sibling exists in the tree, so rule 4a would be applied.
The first rotation would be performed. The RR relationship exists between the nodes, so left
rotation would be performed.
When left rotation is performed, node 40 will come upwards, and node 30 will come
downwards, as shown in the below figure:
After rotation, the recoloring is performed on nodes 30 and 40. The color of node 30 would
become Red, while the color of node 40 would become black.
The above tree is a Red-Black tree as it follows all the Red-Black tree properties.
Deletion in Red Back tree
Let's understand how we can delete the particular node from the Red-Black tree. The
following are the rules used to delete the particular node from the tree:
Step 1: First, we perform BST rules for the deletion.
Step 2:
Case 1: if the node is Red, which is to be deleted, we simply delete it.
Let's understand case 1 through an example.
Suppose we want to delete node 30 from the tree, which is given below.
Initially, we are having the address of the root node. First, we will apply BST to search the
node. Since 30 is greater than 10 and 20, which means that 30 is the right child of node 20.
Node 30 is a leaf node and Red in color, so it is simply deleted from the tree.
If we want to delete the internal node that has one child. First, replace the value of the
internal node with the value of the child node and then simply delete the child node.
Let's take another example in which we want to delete the internal node, i.e., node 20.
We cannot delete the internal node; we can only replace the value of that node with another
value. Node 20 is at the right of the root node, and it is having only one child, node 30. So,
node 20 is replaced with a value 30, but the color of the node would remain the same, i.e.,
Black. In the end, node 20 (leaf node) is deleted from the tree.
If we want to delete the internal node that has two child nodes. In this case, we have to decide
from which we have to replace the value of the internal node (either left subtree or right
subtree). We have two ways:
o Inorder predecessor: We will replace with the largest value that exists in the left
subtree.
o Inorder successor: We will replace with the smallest value that exists in the right
subtree.
Suppose we want to delete node 30 from the tree, which is shown below:
Node 30 is at the right of the root node. In this case, we will use the inorder successor. The
value 38 is the smallest value in the right subtree, so we will replace the value 30 with 38, but
the node would remain the same, i.e., Red. After replacement, the leaf node, i.e., 30, would
be deleted from the tree. Since node 30 is a leaf node and Red in color, we need to delete it
(we do not have to perform any rotations or any recoloring).
Case 2: If the root node is also double black, then simply remove the double black and make
it a single black.
Case 3: If the double black's sibling is black and both its children are black.
o Remove the double black node.

o Add the color of the node to the parent (P) node.
1. If the color of P is red then it becomes black.
2. If the color of P is black, then it becomes double black.
o The color of double black's sibling changes to red.
o If still double black situation arises, then we will apply other cases.
Let's understand this case through an example.
Suppose we want to delete node 15 in the below tree.
We cannot simply delete node 15 from the tree as node 15 is Black in color. Node 15 has two
children, which are nil. So, we replace the 15 value with a nil value. As node 15 and nil node
are black in color, the node becomes double black after replacement, as shown in the below
figure.
In the above tree, we can observe that the double black's sibling is black in color and its
children are nil, which are also black. As the double black's sibling and its children have
black so it cannot give its black color to neither of these. Now, the double black's parent node
is Red so double black's node add its black color to its parent node. The color of the node 20
changes to black while the color of the nil node changes to a single black as shown in the
below figure.
After adding the color to its parent node, the color of the double black's sibling, i.e., node 30
changes to red as shown in the below figure.
In the above tree, we can observe that there is no longer double black's problem exists, and it
is also a Red-Black tree.
Case 4: If double black's sibling is Red.
o Swap the color of its parent and its sibling.
o Rotate the parent node in the double black's direction.
o Reapply cases.
Let's understand this case through an example.
Suppose we want to delete node 15.
Initially, the 15 is replaced with a nil value. After replacement, the node becomes double
black. Since double black's sibling is Red so color of the node 20 changes to Red and the
color of the node 30 changes to Black.
Once the swapping of the color is completed, the rotation towards the double black would be
performed. The node 30 will move upwards and the node 20 will move downwards as shown
in the below figure.
In the above tree, we can observe that double black situation still exists in the tree. It satisfies
the case 3 in which double black's sibling is black as well as both its children are black. First,
we remove the double black from the node and add the black color to its parent node. At the
end, the color of the double black's sibling, i.e., node 25 changes to Red as shown in the
below figure.
In the above tree, we can observe that the double black situation has been resolved. It also
satisfies the properties of the Red Black tree.
Case 5: If double black's sibling is black, sibling's child who is far from the double black is
black, but near child to double black is red.
o Swap the color of double black's sibling and the sibling child which is nearer to the
double black node.
o Rotate the sibling in the opposite direction of the double black.
o Apply case 6
Suppose we want to delete the node 1 in the below tree.
First, we replace the value 1 with the nil value. The node becomes double black as both the
nodes, i.e., 1 and nil are black. It satisfies the case 3 that implies if DB's sibling is black and
both its children are black. First, we remove the double black of the nil node. Since the
parent of DB is Black, so when the black color is added to the parent node then it becomes
double black. After adding the color, the double black's sibling color changes to Red as
shown below.
We can observe in the above screenshot that the double black problem still exists in the tree.
So, we will reapply the cases. We will apply case 5 because the sibling of node 5 is node 30,
which is black in color, the child of node 30, which is far from node 5 is black, and the child
of the node 30 which is near to node 5 is Red. In this case, first we will swap the color of
node 30 and node 25 so the color of node 30 changes to Red and the color of node 25 changes
to Black as shown below.
Once the swapping of the color between the nodes is completed, we need to rotate the sibling
in the opposite direction of the double black node. In this rotation, the node 30 moves
downwards while the node 25 moves upwards as shown below.
As we can observe in the above tree that double black situation still exists. So, we need to
case 6. Let's first see what is case 6.
Case 6: If double black's sibling is black, far child is Red

o Swap the color of Parent and its sibling node.
o Rotate the parent towards the Double black's direction
o Remove Double black
o Change the Red color to black.
Now we will apply case 6 in the above example to solve the double black's situation.
In the above example, the double black is node 5, and the sibling of node 5 is node 25, which
is black in color. The far child of the double black node is node 30, which is Red in color as
shown in the below figure:
First, we will swap the colors of Parent and its sibling. The parent of node 5 is node 10, and
the sibling node is node 25. The colors of both the nodes are black, so there is no swapping
would occur.
In the second step, we need to rotate the parent in the double black's direction. After rotation,
node 25 will move upwards, whereas node 10 will move downwards. Once the rotation is
performed, the tree would like, as shown in the below figure:
In the next step, we will remove double black from node 5 and node 5 will give its black
color to the far child, i.e., node 30. Therefore, the color of node 30 changes to black as shown
in the below figure.
Real-time application of red-Black Tree:

 RB trees are utilized in practical programming to build affiliated exhibits.
 In this application, RB trees work related to 2-4 trees, a self-adjusting
information structure where each hub with kids has either two, three, or four kid
hubs. For each 2-4 tree, there are comparing RB trees with information
components in a similar request. It’s feasible to show
that INSERT and DELETE procedures on 2-4 trees are comparable to variety
of flipping and turns in RB trees.
 This outcome is summed up to exhibit that RB trees can be isometric to 2-3 trees
or 2-4 trees, an outcome because of Guibas and Sedgewick (1978).
Advantages of Red-Black Tree:
 Red-black trees balance the level of the parallel tree.
 Red-black tree gets some margin to structure the tree by reestablishing the level
of the parallel tree.
 The time intricacy for search activity is O(log n)
 It has similarly low constants in a wide scope of situations.
 Red-black trees are dynamic in nature.
 Red-black trees are relatively easy to implement and understand.
 Red-black trees can reduce the time it takes to search for a specific item.
 Red-black trees are suitable for wide range of applications, like database
indexing, memory management, network routing. They can handle ordered as
well as unordered data, making them a flexible option for many types of data
structures.
Disadvantages of Red-Black Tree:
 Complicated to use due to all the activity edge cases; generally you’d need to
utilize a standard library execution (for example: TreeSet in Java, STL set in
C++, and so forth) instead of carrying out one yourself without any preparation.
 On the off chance that you plan to just form the tree once and just perform read
activities from there on, AVL trees offer better execution. (By and by, this
presentation gain is normally irrelevant, so most well-known standard libraries
just give a red-dark tree execution and no AVL tree execution.)
 Since B-trees can have a variable number of children, they are regularly liked
over red-black trees for ordering and putting away a lot of data on plates, since
they can be kept somewhat shallow to restrict circle tasks.
 Locking red-black trees perform inefficiently with simultaneous access
compared to locking skip lists, which (1) very fast even with simultaneous
access; (2) are frequently less difficult to carry out; and (3) offer basically all the
advantages of locking red-dark trees.
 Red-black trees are difficult to manage as the number of nodes in the tree
increases.
 Insertions in red-black tree can be relatively slow compared to other data
structures like AVL Tree.
 Not suitable for large datasets.
 The self-balancing nature of Red-Black trees comes at the cost of added
overhead. Insertion and deletion operations require extra steps to maintain the
balance of the tree.
 While Red-Black trees offer good average-case performance, their worst-case
performance can be slow compared to other data structures.
11.3 IMPLEMENTATION OF RED-BLACK TREE DATA STRUCTURE
/*
* A sample Java Program to Implement the Red-Black Tree Data Structure
*/
// The Scanner class from the util is imported to take input from the user
import java.util.Scanner;
// A class named Node_Red_Black_Tree is created whose each object will work as a
Node of the Red-Black Tree
class Node_Red_Black_Tree
{
// Each element of the Red Black tree node has four members.
// Out of these four members two variables are of Node_Red_Black_Tree class type na
med left_node_addr and right_node_addr storing the left and right nodes of the previous n
ode
Node_Red_Black_Tree left_node_addr, right_node_addr;
// The node_data Integer variable is used to store the data present in that particular node
int node_data;
// The node_data Integer variable is used to store the color of that particular node
int colour_of_node;
/* Constructor of Node_Red_Black_Tree class */

public Node_Red_Black_Tree(int thenode_data)
{
this( thenode_data, null, null );
}
/* Constructor of Node_Red_Black_Tree class */
public Node_Red_Black_Tree(int thenode_data, Node_Red_Black_Tree lt, Node_Re
d_Black_Tree rt)
{
left_node_addr = lt;
right_node_addr = rt;
node_data = thenode_data;
colour_of_node = 1;
}
}
// A class named Node_Red_Black_Tree is created whose each object will work as the R
ed-Black Tree
class Red_Black_Tree
{
private Node_Red_Black_Tree current_node;
private Node_Red_Black_Tree parent_node;
private Node_Red_Black_Tree grand_node;
private Node_Red_Black_Tree great_node;
private Node_Red_Black_Tree header_node;
private static Node_Red_Black_Tree node_null;
/* static block to initialize the data */

static
{
node_null = new Node_Red_Black_Tree(0);
node_null.left_node_addr = node_null;
node_null.right_node_addr = node_null;
}
// color coding
/* Black - 1 RED - 0 */
static final int BLACK = 1;
static final int RED = 0;
/* Constructor of the Red_Black_Tree class */

public Red_Black_Tree(int negInf)
{
header_node = new Node_Red_Black_Tree(negInf);
header_node.left_node_addr = node_null;
header_node.right_node_addr = node_null;
}
/* Function to check if tree is empty */
public boolean isEmpty()
{
return header_node.right_node_addr == node_null;
}
/* Make the tree logically empty */
public void makeEmpty()
{
header_node.right_node_addr = node_null;
}
/* Function to insert item in the Red_Black_Tree class object */
public void insert(int item )
{
current_node = parent_node = grand_node = header_node;
node_null.node_data = item;
while (current_node.node_data != item)
{
great_node = grand_node;
grand_node = parent_node;
parent_node = current_node;
current_node = item < current_node.node_data ? current_node.left_node_addr : c
urrent_node.right_node_addr;
// Check if two red children and fix if so
if (current_node.left_node_addr.colour_of_node == RED && current_node.right
_node_addr.colour_of_node == RED)
handleReorient( item );
}
// Insertion fails if already present
if (current_node != node_null)
return;
current_node = new Node_Red_Black_Tree(item, node_null, node_null);
// Attach to parent_node
if (item < parent_node.node_data)
parent_node.left_node_addr = current_node;
else
parent_node.right_node_addr = current_node;
handleReorient( item );
}
private void handleReorient(int item)
{
// Do the colour_of_node flip
current_node.colour_of_node = RED;
current_node.left_node_addr.colour_of_node = BLACK;
current_node.right_node_addr.colour_of_node = BLACK;
if (parent_node.colour_of_node == RED)
{
// Have to rotate
grand_node.colour_of_node = RED;
if (item < grand_node.node_data != item < parent_node.node_data)
parent_node = rotate( item, grand_node ); // Start dbl rotate
current_node = rotate(item, great_node );
current_node.colour_of_node = BLACK;
}
// Make root black
header_node.right_node_addr.colour_of_node = BLACK;
}
private Node_Red_Black_Tree rotate(int item, Node_Red_Black_Tree parent_node)
{
if(item < parent_node.node_data)
return parent_node.left_node_addr = item < parent_node.left_node_addr.node_d
ata ? rotateWithleft_node_addrChild(parent_node.left_node_addr) : rotateWithright_node
_addrChild(parent_node.left_node_addr) ;
else
return parent_node.right_node_addr = item < parent_node.right_node_addr.node
_data ? rotateWithleft_node_addrChild(parent_node.right_node_addr) : rotateWithright_n
ode_addrChild(parent_node.right_node_addr);
}
/* Rotate binary tree node with left_node_addr child */
private Node_Red_Black_Tree rotateWithleft_node_addrChild(Node_Red_Black_Tr
ee k2)
{
Node_Red_Black_Tree k1 = k2.left_node_addr;
k2.left_node_addr = k1.right_node_addr;
k1.right_node_addr = k2;
return k1;
}
/* Rotate binary tree node with right_node_addr child */
private Node_Red_Black_Tree rotateWithright_node_addrChild(Node_Red_Black_T
ree k1)
{
Node_Red_Black_Tree k2 = k1.right_node_addr;
k1.right_node_addr = k2.left_node_addr;
k2.left_node_addr = k1;
return k2;
}
/* Functions to count number of nodes */

public int countNodes()
{
return countNodes(header_node.right_node_addr);
}
private int countNodes(Node_Red_Black_Tree r)
{
if (r == node_null)
return 0;
else
{
int l = 1;
l += countNodes(r.left_node_addr);
l += countNodes(r.right_node_addr);
return l;
}
}
/* Functions to search for an node_data */
public boolean search(int val)
{
return search(header_node.right_node_addr, val);
}
private boolean search(Node_Red_Black_Tree r, int val)
{
boolean found = false;
while ((r != node_null) && !found)
{
int rval = r.node_data;
if (val < rval)
r = r.left_node_addr;
else if (val > rval)
r = r.right_node_addr;
else
{
found = true;
break;
}
found = search(r, val);
}
return found;
}
/* Function for in order traversal of the Red_Black_Tree class object*/
public void inorder()
{
inorder(header_node.right_node_addr);
}
private void inorder(Node_Red_Black_Tree r)
{
if (r != node_null)
{
inorder(r.left_node_addr);
char c = 'B';
if (r.colour_of_node == 0)
c = 'R';
System.out.print(r.node_data +""+c+" ");
inorder(r.right_node_addr);
}
}
/* Function for pre-order traversal of the Red_Black_Tree class object*/
public void preorder()
{
preorder(header_node.right_node_addr);
}
private void preorder(Node_Red_Black_Tree r)
{
if (r != node_null)
{
char c = 'B';
c = 'R';
preorder(r.left_node_addr);
preorder(r.right_node_addr);
}
}
/* Function for post-order traversal of the Red_Black_Tree class object*/
public void postorder()
{
postorder(header_node.right_node_addr);
}
private void postorder(Node_Red_Black_Tree r)
{
if (r != node_null)
{
postorder(r.left_node_addr);
postorder(r.right_node_addr);
char c = 'B';
c = 'R';
}
}
}
/* Class Red_Black_Tree_Run */
class Red_Black_Tree_Run
{
public static void main(String[] args)
{
Scanner scannner_object = new Scanner(System.in);
/* Creating object of RedBlack Tree */
Red_Black_Tree red_black_tree_object = new Red_Black_Tree(Integer.MIN_VAL
UE);
System.out.println("Red Black Tree Test\n");
char ch;
/* Perform tree operations */
do
{
System.out.println("\nThe options list for Red Black Tree::\n");
System.out.println("1. To add a new node in the Red-Black Tree");
System.out.println("2. To search the Red-Black Tree for a node");
System.out.println("3. To get node count of nodes in Red Black Tree");
System.out.println("4. To check if the Red_Black_Tree is Empty or not?");
System.out.println("5. To Clear the Red_Black_Tree.");
int option_list_choice = scannner_object.nextInt();

switch (option_list_choice)
{
case 1 :
System.out.println("Enter integer node_data to insert");
red_black_tree_object.insert( scannner_object.nextInt() );
break;
case 2 :
System.out.println("Enter integer node_data to search");
System.out.println("Search result : "+ red_black_tree_object.search( scannner_
object.nextInt() ));
break;
case 3 :
System.out.println("Nodes = "+ red_black_tree_object.countNodes());
break;
case 4 :
System.out.println("Empty status = "+ red_black_tree_object.isEmpty());
break;
case 5 :
System.out.println("\nTree Cleared");
red_black_tree_object.makeEmpty();
break;
default :
System.out.println("Wrong Entry \n ");
break;
}
System.out.print("\nRBT in Post-order: ");

red_black_tree_object.postorder();
System.out.print("\nRBT in Pre-order: ");
red_black_tree_object.preorder();
System.out.print("\nRBT in In-order: ");
red_black_tree_object.inorder();
System.out.println("\nWanna proceed further(Type y or n)? \n");

ch = scannner_object.next().charAt(0);
} while (ch == 'Y'|| ch == 'y');
}
}
11.4 SPLAY TREE
Have you ever thought that AVL and Red-Black trees are also self-adjusted trees? Then, what
makes the Splay Tree different from the AVL and Red-Black trees? Yes, there is one operation
called splaying, which makes it different from both AVL and the Red-Black Tree.
A splay tree contains all the operations of a binary search tree, like insertion, deletion, and
searching. But it also contains one more operation, which is called splaying. In a splay tree,
every operation is performed at the root of the tree. All operations in the splay tree involve
one common operation called splaying.
You may have questioned what splaying is and why it differentiates splay trees from AVL
and Red-Black trees. So, let me tell you about splaying. Splaying is the process of bringing
an element to the root by performing suitable rotations.
By splaying elements in the tree, we can bring more frequently used elements closer to the
root of the tree so that any operations like insertion, searching, and deletion can be performed
quickly. This means, that after applying the splaying operation, more frequently used
elements come closer to the root.
Suppose we have been given a binary search tree with different nodes and we know that in
the binary search tree, elements to the left are smaller and those to the right are greater than
the root node.
For searching, we will perform the binary search method. Let’s say we want to search for
element 9. As 9 is less than 11, we will come to the left of the root node. After performing a
search operation, we need to do one thing called splaying. This means, that after splaying, the
element on which we are operating should come to the root. Elements would come to the root
after performing some rearrangements of elements or rotations in the tree.
Rotations in Splay Tree
To rearrange the tree, we need to perform some rotations. The rotations given below are the
rotations that we are going to perform in the splay tree.
 Zig rotation [Right Rotation]

 Zig zig [Two Right Rotations]
 Zag rotation [Left Rotation]
 Zag zag [Two Left Rotations]
 Zig zag [Zig followed by Zag]
 Zag zig [Zag followed by Zig]
Zig rotation:
This rotation is similar to the right rotation in the AVL tree. In zig rotation, every node moves
one position to the right of its current position. We use Zig rotation when the item which is to
be searched is either a root node or a left child of the root node.
Let’s take a scenario where the search item is the left child of the root node.
In the above example, we have to search for node 9 in the tree. To search for the given node
in the binary search tree, we need to perform the following steps:
Step 1: First, we compare 9 with the root node. As 9 is less than 11, it is a left child of the
root node.
Step 2: We have already seen above that once the element is found, we will perform
splaying. Here the right rotation is performed so that 9 becomes the root node of the tree.
Have a look at the diagram below.
In the above diagram, we can see that node 9 has become the root node of the tree, this shows
that the searching is completed.
Zig zig rotation:
It’s a kind of double zig rotation. Here we perform zig rotation two times. Every node
moves two positions to the right of its current position. But why are we doing this?
We are doing this because sometimes situations arise where we need to search for the item
that has both parent and grandparent. In such cases, we have to perform four rotations for
splaying.
In the example given below, suppose we have to search for element 3.

Step 1: First, we have to perform a search operation in the tree as we did previously, which
means a BST operation. As 3 is less than 6 and 4, it will be at the left of node 4. So we can
say that element 3 has a parent of 4 and a grandparent of 6.
Step 2: We have to perform splaying, which means we have to make node 3 the root node.
So here we will perform two right rotations because the elements to be searched have both
parent and grandparent.
Zag rotation:
This rotation is similar to the left rotation in the AVL tree. In zag rotation, every node moves
one position to the left of its current position. We use Zag rotation when the item which is to
be searched is either a root node or a right child of the root node.
Let’s see the case where the element to be searched for is present on the right of the root node
of the tree. Now let’s say we have to search for 13, which is present on the right of the root
node of the tree.
The steps involved in searching are given below:
Step 1: First, we compare 13 with the root node. As 13 is greater than the root node of the
tree, it is the right child of the root node.
Step 2: Once the element is found, we will perform splaying as we did in the previous
examples. Here we will perform a left rotation so that 13 becomes the root node of the tree.
In the above diagram, we can see that node 13 has become the root node of the tree, this
shows that the searching is completed.
Zag Zag Rotation:
It’s a kind of double zag rotation. Here we perform zag rotation two times. Every node moves
two positions to the left of its current position. But why are we doing this?
We are doing this because sometimes situations arise where we need to search for the item
that has both parent and grandparent. In such cases, we have to perform four rotations for
splaying.
Now we will try to understand this case with some examples:
In the example given below, suppose we have to search for element 7.
Step 1: First, we have to perform a search operation in the tree as we did previously, which
means a BST operation. As 7 is greater than 4 and 6, it will be at the right of node 6. So we
can say that element 7 has a parent of 6 and a grandparent of 4.
Step 2: We have to perform splaying, which means we have to make node 7 the root node.
So here we will perform two left rotations because the elements to be searched have both
parent and grandparent.
Zig Zag rotation
This type of rotation is a sequence of zig rotations followed by zag rotations. So far, we've
seen that both the parent and the grandparent are in a RR or LL relationship. Now, here we
will see the RL and LR kinds of relationships between parent and grandparent. Every node
moves one position to the right, followed by one position to the left of its current position.
Suppose we want to search for element 5. Here first we perform a BST searching operation.
Like 5, it is greater than the root node 4 and smaller than the node 6. So 5 would be the left
child of node 6.
So an RL relationship exists since node 5 is to the left of node 6 and node 6 is to the right of
node 4. So first we will perform the right rotation on node 6, and then node 6 will move
downwards, and node 5 will come upwards, as you can see in the example given below. After
that, we will perform zag rotation(left rotation) at 4, and we will see that 5 will become the
root node of the tree.
As we can observe in the above tree, node 5 has become the root node; therefore, the search
is completed. In this case, first, we have performed the zig rotation and then the zag rotation.
So it is known as zig-zag rotation.
Zag Zig Rotation
This rotation is similar to the Zig-zag rotation, the only difference is that here every node
moves one position to the left, followed by one position to the right of its current position.
Suppose we want to search for element 5. Here first we perform a BST searching operation.
Like 5, it is smaller than the root node 6 and greater than node 4. So 5 would be the right
child of node 4, and 4 would be the left child of the root node 6.
So here LR relationship exists since node 5 is to the right of node 4 and node 4 is to the left of
node 6. So first we will perform the left rotation on node 4, and then node 4 will move
downwards, and node 5 will come upwards, as you can see in the example given below. And
after that, we will perform one zig rotation (right rotation) at 6, so finally, we will get 5 as the
root node of the tree.
As we can observe in the above tree, node 5 has become the root node; therefore, the
searching is completed. In this case, first, we have performed the zag rotation and then the zig
rotation. So it is known as zag-zig rotation.
Advantages of Splay Tree
 In the AVL and Red-Black trees, we need to store some information. Like in AVL
trees, we need to store the balance factor of each node, and in the red-black trees, we
also need to store one extra bit of information that denotes the color of the node,
either black or red.
 Splay tree is the fastest type of binary search tree, which is used in a variety of
practical applications such as GCC compilers.
 Improve searching by moving frequently accessed nodes closer to the root node.
One of the practical uses is cache implementation, in which recently used data is
saved in the cache so that we can access the data more quickly without going into
memory.
Disadvantages of Splay Tree
The main disadvantage of the splay tree is that trees are not strictly balanced, but
rather roughly balanced. When the splay trees are linear, the time complexity is O(n).
11.5 IMPLEMENTATION OF SPLAY TREE
C
#include <stdio.h>
#include <stdlib.h>
// Tree Node
struct node
{
int key;
struct node *left, *right;
};
// Allocates a new node with the given key and NULL left and right pointers.
struct node *TreeNode(int key)
{
struct node *node = (struct node *)malloc(sizeof(struct node));
node->key = key;
node->left = node->right = NULL;
return (node);
}
// A utility function for right-rotating a y-rooted subtree.

struct node *rightRotate(struct node *x)
{
struct node *y = x->left;
x->left = y->right;
y->right = x;
return y;
}
// A utility function for right-rotating a x-rooted subtree.

struct node *leftRotate(struct node *x)
{
struct node *y = x->right;
x->right = y->left;
y->left = x;
return y;
}
// If the key is present in the tree, this function moves it to the root.
// If the key is not present, it returns the last item accessed by root.
// This function modifies the tree and returns the modified root.
struct node *splay(struct node *root, int key)
{
// Root is NULL or key is present at root.
if (root == NULL || root->key == key)
return root;
// Key lies in left subtree!

if (root->key > key)
{
if (root->left == NULL)
return root;
// Zig-Zig (Left Left)

if (root->left->key > key)
{
// First, recursively bring the key as the root of left-left.
root->left->left = splay(root->left->left, key);
// Do the first rotation for the root, followed by the second rotation.
root = rightRotate(root);
}
else if (root->left->key < key) // Zig-Zag (Left Right)
{
// First, recursively bring the key as the root of left-right.
root->left->right = splay(root->left->right, key);
// Do first rotation for root->left

if (root->left->right != NULL)
root->left = leftRotate(root->left);
}
// Do second rotation for root

return (root->left == NULL) ? root : rightRotate(root);
}
// Key lies in right subtree
else
{
if (root->right == NULL)
return root;
// Zag-Zig (Right Left)

if (root->right->key > key)
{
// Bring the key as root of right-left
root->right->left = splay(root->right->left, key);
// Do first rotation for root->right
if (root->right->left != NULL)
root->right = rightRotate(root->right);
}
// Zag-Zag (Right Right)
else if (root->right->key < key)
{
// Bring the key as root of right-right and do first rotation
root->right->right = splay(root->right->right, key);
root = leftRotate(root);
}
// Do second rotation for root

return (root->right == NULL) ? root : leftRotate(root);
}
}
// The search function for Splay Tree. Note that this function returns the new root of the
splay-tree.
// If a key is present in the tree, then it is moved to the root.
struct node *bstSearch(struct node *root, int key)
{
return splay(root, key);
}
// A utility function to print preorder traversal of the tree.

void preOrder(struct node *root)
{
if (root != NULL)
{
printf("%d ", root->key);
preOrder(root->left);
preOrder(root->right);
}
}
// main function
int main()
{
struct node *root = TreeNode(100);
root->left = TreeNode(50);
root->right = TreeNode(200);
root->left->left = TreeNode(40);
root->left->left->left = TreeNode(30);
root->left->left->left->left = TreeNode(20);
root = bstSearch(root, 20);
preOrder(root);
return 0;
}
11.6 SUMMARY
In this unit we learnt about red-black tree in data structure. We also learnt about
implementation of red-black tree in data structure. At the end of this unit, we have discussed
splay tree and also implementation of splay tree in data structure
11.7 KEYWORDS
Splay tree, Red-black tree, zig rotation, Zig-zig, Zag-zag and Zag-zig.
11.8 QUESTIONS
1. Define red-black tree.
2. Write the properties of red-black tree.
3. Explain insertion in red-black tree with example.
4. Write the advantages and disadvantages of red-black tree.
5. Explain rotations in splay tree
11.10REFERENCES
Bruno R. Preiss
UNIT 12: GRAPH ALGORITHMS
Structure
12.0 Objectives
12.1 Introduction
12.2 Shortest path algorithms
12.3 Minimum spanning tree algorithms
12.7 Summary
12.8 Key words
12.9 Questions
12.10 References
12.0 OBJECTIVES
- Discuss Shortest path algorithms.
- Understand the Minimum spanning tree algorithms.
12.1 INTRODUCTION
Graph algorithms are a subset of tools for graph analytics. Graph analytics is something we
do—it’s the use of any graph-based approach to analyze connected data. There are various
methods we could use: we might query the graph data, use basic statistics, visually explore
the graphs, or incorporate graphs into our machine learning tasks. Graph pattern–based
querying is often used for local data analysis, whereas graph computational algorithms
usually refer to more global and iterative analysis. Although there is overlap in how these
types of analysis can be employed, we use the term graph algorithms to refer to the latter,
more computational analytics and data science uses.
Graph algorithms provide one of the most potent approaches to analyzing connected data
because their mathematical calculations are specifically built to operate on relationships.
They describe steps to be taken to process a graph to discover its general qualities or specific
quantities. Based on the mathematics of graph theory, graph algorithms use the relationships
between nodes to infer the organization and dynamics of complex systems. Network
scientists use these algorithms to uncover hidden information, test hypotheses, and make
predictions about behavior.
Graph algorithms have widespread potential, from preventing fraud and optimizing call
routing to predicting the spread of the flu. For instance, we might want to score particular
nodes that could correspond to overload conditions in a power system. Or we might like to
discover groupings in the graph which correspond to congestion in a transport system.
12.2 SHORTEST PATH ALGORITHMS

The shortest path problem is about finding a path between 2 vertices in a graph such that the
total sum of the edges weights is minimum.
This problem could be solved easily using (BFS) if all edge weights were (1), but here
weights can take any value. Three different algorithms are discussed below depending on the
use-case.
Bellman Ford's Algorithm:

Bellman Ford's algorithm is used to find the shortest paths from the source vertex to all other
vertices in a weighted graph. It depends on the following concept: shortest path contains at
most n−1 edges, because the shortest path couldn't have a cycle.
So why shortest path shouldn't have a cycle

There is no need to pass a vertex again, because the shortest path to all other vertices could be
found without the need for a second visit for any vertices.
Algorithm Steps:
 The outer loop traverses from 0: n−1.

 Loop over all edges, check if the next node distance > current node distance + edge
weight, in this case update the next node distance to "current node distance + edge
weight".
This algorithm depends on the relaxation principle where the shortest distance for all vertices
is gradually replaced by more accurate values until eventually reaching the optimum solution.
In the beginning all vertices have a distance of "Infinity", but only the distance of the source
vertex = 0, then update all the connected vertices with the new distances (source vertex
distance + edge weights), then apply the same concept for the new vertices with new
distances and so on.
Implementation:
Assume the source node has a number (0):
vector <int> v [2000 + 10];

int dis [1000 + 10];
for(int i = 0; i < m + 2; i++){
v[i].clear();
dis[i] = 2e9;
}
for(int i = 0; i < m; i++){
scanf("%d%d%d", &from , &next , &weight);
v[i].push_back(from);
v[i].push_back(next);
v[i].push_back(weight);
}
dis[0] = 0;
for(int i = 0; i < n - 1; i++){
int j = 0;
while(v[j].size() != 0){
if(dis[ v[j][0] ] + v[j][2] < dis[ v[j][1] ] ){

dis[ v[j][1] ] = dis[ v[j][0] ] + v[j][2];
}
j++;
}
}
A very important application of Bellman Ford is to check if there is a negative cycle in the
graph,
Time Complexity of Bellman Ford algorithm is relatively high O(V * E), in case
Let's discuss an optimized algorithm.

Dijkstra's Algorithm
Dijkstra's algorithm has many variants but the most common one is to find the shortest paths
from the source vertex to all other vertices in the graph.
Algorithm Steps:
 Set all vertices distances = infinity except for the source vertex, set the source
distance = 0.
 Push the source vertex in a min-priority queue in the form (distance, vertex), as the
comparison in the min-priority queue will be according to vertices distances.
 Pop the vertex with the minimum distance from the priority queue (at first the popped
vertex = source).
 Update the distances of the connected vertices to the popped vertex in case of "current
vertex distance + edge weight < next vertex distance", then push the vertex
with the new distance to the priority queue.
 If the popped vertex is visited before, just continue without using it.
 Apply the same algorithm again until the priority queue is empty.
Implementation:
Assume the source vertex = 1.
#define SIZE 100000 + 1
vector < pair < int , int > > v [SIZE]; // each vertex has all the connected vertices with the
edges weights
int dist [SIZE];
bool vis [SIZE];
void dijkstra(){
// set the vertices distances as infinity
memset(vis, false , sizeof vis); // set all vertex as unvisited
dist[1] = 0;
multiset < pair < int , int > > s; // multiset do the job as a min-priority queue
s.insert({0 , 1}); // insert the source node with distance = 0
while(!s.empty()){
pair <int , int> p = *s.begin(); // pop the vertex with the minimum distance
s.erase(s.begin());
int x = p.s; int wei = p.f;

if( vis[x] ) continue; // check if the popped vertex is visited before
vis[x] = true;
for(int i = 0; i < v[x].size(); i++){

int e = v[x][i].f; int w = v[x][i].s;
if(dist[x] + w < dist[e] ){ // check if the next vertex distance could be
minimized
dist[e] = dist[x] + w;
s.insert({dist[e], e} ); // insert the next vertex with the updated distance
}
}
}
}
Time Complexity of Dijkstra's Algorithm is but with min-priority queue it drops
down to .
However, if we have to find the shortest path between all pairs of vertices, both of the above
methods would be expensive in terms of time. Discussed below is another algorithm designed
for this case.
Floyd\u Warshall's Algorithm
Floyd\u Warshall's Algorithm is used to find the shortest paths between between all pairs of
vertices in a graph, where each edge in the graph has a weight which is positive or negative.
The biggest advantage of using this algorithm is that all the shortest distances between
any 2 vertices could be calculated in , where V is the number of vertices in a graph.
The Algorithm Steps:
For a graph with N vertices:
 Initialize the shortest paths between any 2 vertices with Infinity.

 Find all pair shortest paths that use 0 intermediate vertices, then find the shortest paths
that use 1 intermediate vertex and so on.. until using all N vertices as intermediate
nodes.
 Minimize the shortest paths between any 2 pairs in the previous operation.
for(int k = 1; k <= n; k++){
for(int i = 1; i <= n; i++){

for(int j = 1; j <= n; j++){
dist[i][j] = min( dist[i][j], dist[i][k] + dist[k][j] );
}
}
}
Time Complexity of Floyd\u2013Warshall's Algorithm is where is the number of

vertices in a graph.
12.3 MINIMUM SPANNING TREE ALGORITHM

The minimum spanning tree algorithm is developed by referencing the field of graph theory
in mathematics. Thus, to understand this algorithm, we shall first understand what a spanning
tree is?
A spanning tree of a connected undirected graph is a subgraph, i.e., a tree structure that binds
all vertices with a minimum edge cost sum. If you have graph G with vertices V and edges E,
then that graph can be represented as G(V, E). For this graph G(V, E), if you construct a tree
structure G’(V’, E’) such that the formed tree structure follows constraints mentioned below,
then that structure can be called a Spanning Tree.
1. V’ = V (number of Vertices in G’ must be equal to the number of vertices in G)

2. E’ = |V| - 1 (Edges of G’ must be equal to the number of vertices in graph G
minus 1)
Let's create spanning trees for a given graph topology to understand this concept better.
For the graph above, possible spanning tree structures are:
Now, let’s look at the sum of edge weight costs for all these spanning trees represented in the
table below:
Sum of Edge
Spanning Tree
Costs
ST - 1 22
ST - 2 35
ST - 3 36
Table - 1: Sum of Edge costs

The spanning tree structure 1 has an overall edge weight cost of 22. Which is the least
possible substructure cost for given graph G. Thus, ST-1 is considered the minimum
spanning tree of graph G. A minimum spanning tree in a weighted graph is one that has the
least weight of all the other spanning tree structures.
How to Find the Minimum Spanning Tree

The naive method we discussed above is not an ideal approach to find out possible MST
structures. In that method, we will have to generate all possible spanning trees at first and
then calculate the overall sum of edge weights for each generated spanning tree. After that,
we will have to determine the minima of all those spanning trees, which will cost us more
time and memory.
A better method is to locate a vital attribute of the MST, which will provide clarification if
some particular edge should be included in it or not. And then, we can use that property to
build up the MST gradually. Let’s find out how we can do that with the help of the greedy
programming paradigm. The greedy algorithms that we can use are Prim’s Algorithm and
Kruskal’s Algorithm.
First, we shall look into Prim’s algorithm.
1. Prim’s Algorithm
Prim's algorithm begins with a single node and adds up adjacent nodes one by one by
discovering all of the connected edges along the way. Edges with the lowest weights that
don't generate cycles are chosen for inclusion in the MST structure. As a result, we can claim
that Prim's algorithm finds the globally best answer by making locally optimal decisions.
Steps involved in Prim’s algorithms are mentioned below:
 Step 1: Choose any vertex as a starting vertex.

 Step 2: Pick an edge connecting any tree vertex and fringe vertex (adjacent vertex
to visited vertex) having the minimum edge weight.
 Step 3: Add the selected edge to MST only if it doesn't form any closed cycle.
 Step 4: Keep repeating steps 2 and 3 until the fringe vertices exist.
 Step 5: End.
Let’s formulate a program to implement an MST using the C programming language.
//Library for accessing the maximum integer variable value
#include <limits.h>
//Library for creating the set
#include <stdio.h>
#define Vertices 5
//Finding the vertex with the lowest key value from a set of vertices not included in MST
int Least_Key(int key[], bool Min_Span_Tree[])
{
int least = INT_MAX, min_index;
for (int v = 0; v < Vertices; v++)
if (Min_Span_Tree[v] == false && key[v] < least)
least = key[v], min_index = v;
return min_index;
}
//Utility function to print MST
int print_Prims_MST(int parent[], int graph[Vertices][Vertices])
{
printf("Edge \tWeight\n");
for (int i = 1; i < Vertices; i++)
printf("%d - %d \t%d \n", parent[i], i, graph[i][parent[i]]);
}
//Function for generating an MST
void prims_MST(int graph[Vertices][Vertices])
{
int parent[Vertices];
int key[Vertices];
bool Min_Span_Tree[Vertices];
for (int i = 0; i < Vertices; i++)
key[i] = INT_MAX, Min_Span_Tree[i] = false;
key[0] = 0;
parent[0] = -1;
for (int count = 0; count < Vertices - 1; count++) {
int u = Least_Key(key, Min_Span_Tree);
Min_Span_Tree[u] = true;
for (int v = 0; v < Vertices; v++)
if (graph[u][v] && Min_Span_Tree[v] == false && graph[u][v] < key[v])
parent[v] = u, key[v] = graph[u][v];
}
printf("Created Spanning Tree for Given Graph is: \n");
printf("\n");
print_Prims_MST(parent, graph);
}
//Driver method
int main()
{
int graph[Vertices][Vertices] = { { 0, 3, 2, 0, 0 },
{ 3, 0, 16, 12, 0 },
{ 2, 16, 0, 0, 5 },
{ 0, 12, 0, 0, 0 },
{ 0, 0, 5, 0, 0 } };
prims_MST(graph);
return 0;
}
Output:
You can verify the output of this code by comparing it with ST-1 as the graph that we have
passed to this Prim’s algorithm is the same graph G(V, E) represented previously.
2. Kruskal’s Algorithm
Kruskal's approach sorts all the edges in ascending order of edge weights and only adds
nodes to the tree if the chosen edge does not form a cycle. It also selects the edge with the
lowest cost first and the edge with the highest cost last. As a result, we can say that
the Kruskal algorithm makes a locally optimum decision in the hopes of finding the global
optimal solution. Hence, this algorithm can also be considered as a Greedy Algorithm.
The steps involved in Kruskal’s algorithm to generate a minimum spanning tree are:
 Step 1: Sort all edges in increasing order of their edge weights.

 Step 2: Pick the smallest edge.
 Step 3: Check if the new edge creates a cycle or loop in a spanning tree.
 Step 4: If it doesn’t form the cycle, then include that edge in MST. Otherwise,
discard it.
 Step 5: Repeat from step 2 until it includes |V| - 1 edges in MST.
Now, we will develop a C program to implement MST using Kruskal’s algorithm for the
graph given below:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
//structure for denoting an edge
struct Edge {
int source, destination, weight;
};
//structure for representing a weighted, connected and undirected graph
struct Graph {
int Node, E;
struct Edge* edge;
};
//memory allocation for storing graph with V vertices and E edges
struct Graph* create_Graph(int Node, int E)
{
struct Graph* gph = (struct Graph*)(malloc(sizeof(struct Graph)));
gph->Node = Node;
gph->E = E;
gph->edge = (struct Edge*)malloc(sizeof( struct Edge));
return gph;
}
//Union-Find Subset
struct tree_subset {
int parent;
int rank;
};
//finding the set of selected nodes
int DisjointSet_find(struct tree_subset subsets[], int i)
{
//find root and make root as parent of i
if (subsets[i].parent != i)
subsets[i].parent
= DisjointSet_find(subsets, subsets[i].parent);
return subsets[i].parent;
}
void DisjointSet_Union(struct tree_subset subsets[], int x, int y)
{
int xroot = DisjointSet_find(subsets, x);
int yroot = DisjointSet_find(subsets, y);
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
else
{
subsets[yroot].parent = xroot;
subsets[xroot].rank++;
}
}
//Comparing edges with qsort() in C
int myComp(const void* a, const void* b)
{
struct Edge* a1 = (struct Edge*)a;
struct Edge* b1 = (struct Edge*)b;
return a1->weight > b1->weight;
}
//function for creating MST using Kruskal’s Approach
void MST_Kruskal(struct Graph* gph)
{
int Node = gph->Node;
struct Edge
result[Node];
int e = 0;
int i = 0;
//edge sorting
qsort(gph->edge, gph->E, sizeof(gph->edge[0]),
myComp);
//allocating memory for v subsets
struct tree_subset* subsets
= (struct tree_subset*)malloc(Node * sizeof(struct tree_subset));
for (int v = 0; v < Node; ++v) {
subsets[v].parent = v;
subsets[v].rank = 0;
}
//V-1 : Path traversal limit
while (e < Node - 1 && i < gph->E) {
struct Edge next_edge = gph->edge[i++];
int x = DisjointSet_find(subsets, next_edge.source);
int y = DisjointSet_find(subsets, next_edge.destination);
if (x != y) {
result[e++] = next_edge;
DisjointSet_Union(subsets, x, y);
}
}
//prompting state of MST
printf(
"Edges created in MST are as below: \n");
int minimumCost = 0;
//calculating minimum cost using for loop
for (i = 0; i < e; ++i)
{
printf("%d -- %d == %d\n", result[i].source,
result[i].destination, result[i].weight);
minimumCost += result[i].weight;
}
printf("The Cost for created MST is : %d",minimumCost);
return;
}
//driver function
int main()
{
int Node = 4;
int E = 6;
struct Graph* gph = create_Graph(Node, E);
//graph creation
gph->edge[0].source = 0;
gph->edge[0].destination = 1;
gph->edge[0].weight = 2;
MST_Kruskal(gph);
return 0;
}
Output:
You can verify this output’s accuracy by generating an MST for the graph given above.
Applications of Minimum Spanning Tree

The following are the applications of minimum spanning tree in data structures:
 Telecommunication Network Building: A basic naive approach will be more

expensive if we develop a telecommunication network for the entire city. Using the
MST approach in data structures, we can design a communication system at a
much lesser cost. The distinction between Naive and MST routing is illustrated in
the diagram below.
 Constructing Highways or Railroads: The Minimum Spanning Tree (MST)

technique is used globally for building roadways or railroads. The MST approach
determines the best route between two cities depending on all potential routes.
Essentially, the algorithm treats cities as vertices and roads connecting them as
edges to produce a subtree that connects two cities with less cost.
12.4 SUMMARY
In this chapter we learnt about shortest path algorithms in detail and also discussed
implementations of shortest path algorithms. At the end of this unit, we also learnt minimum
spanning tree algorithms and implementation.
12.5 KEYWORDS
Spanning tree, Ford’s algorithm, Dijkstra’s algorithm and Kruskal’s algorithm.
12.6 QUESTIONS
1. Explain bellman ford’s algorithm.
2. Discuss dijkstra’s algorithm.
3. Write a short note on minimum spanning tree algorithm.
4. Describe prim’s algorithm.
5. Briefly explain Kruskal’s algorithm.
12.7 REFERENCES

Bruno R. Preiss

Data Structures and Algorithms

Uploaded by

Data Structures and Algorithms

Uploaded by

UNIT – 1 : INTRODUCTION TO DATA STRUCTURES

After going through this lesson you will be able to

 Describe Overview of Data structures

1.2 Overview of Data Structures

Execution Time Cases

Here are some commonly used Abstract Data Types:

void initialize(List *list) {

list->array = (int *)malloc(INITIAL_CAPACITY * sizeof(int));

printf("Memory allocation failed!\n");

if (list->size >= list->capacity) {

list->array = (int *)realloc(list->array, list->capacity * sizeof(int));

printf("Memory reallocation failed!\n");

int get(List *list, int index) {

if (index < 0 || index >= list->size) {

printf("Index out of bounds!\n");

printf("List elements: ");

for (int i = 0; i < list->size; i++) {

printf("%d ", list->array[i]);

void freeList(List *list) {

// Adding elements to the list

// Displaying the list

// Accessing an element by index

printf("Element at index 2: %d\n", get(&myList, 2));

// Freeing the memory allocated for the list

Implementation using Linked Lists:

typedef struct Node {

void initialize(List *list) {

void append(List *list, int value) {

int get(List *list, int index) {

void display(List *list) {

void freeList(List *list) {

// Adding elements to the list

// Displaying the list

// Accessing an element by index

// Freeing the memory allocated for the list

Abstract data type model

1. Discuss Overview of Data structures..

 "Data Structures and Algorithms" by Michael T. Goodrich, Roberto Tamassia, and

After going through this unit you will be able to

 Measuring the efficiency of algorithms, time and space complexity.

Algorithm analysis involves studying the efficiency and performance characteristics of

2.2 Measuring the efficiency of algorithms, time and space complexity.

Measuring the efficiency of algorithms, time and space complexity.

Measuring the efficiency of algorithms requires a combination of theoretical analysis,

2.3 Big-O notation and Performance Analysis Techiques.

Big-O notation and Performance Analysis Techiques.:

1. O(1): Constant time complexity. The algorithm's performance is constant, regardless

What is Big O Notation in Data Structure?

f (n) ≤ CG(n) for all n >=

This can be written as:

f(n) = O(g(n)), where n tends to infinity (n → ∞)

We can simply write the above expression as:

The most important properties of Big O Notation in Data Structure are:

If f(n) = CG(n), then O(f(n)) = O(g(n)) for a constant c > 0

If f(n) = f1(n) + f2(n) + -- + FM(n) and fi(n)≤ fi+1(n) ∀ i=1, 2, --, m,

then O(f(n)) = O(max(f1(n), f2(n), --, fm(n)))

If f(n) = log an and g(n)=log bn, then

If f(n) = a0 + a1.n + a2.n2 + -- + am.nm, then

How Does Big O Notation Make a Runtime Analysis of an Algorithm?

For example, runtime analysis of an algorithm for a size of n = 20:

log (20) = 2.996

20 log (20) = 59.9

20! = 2.432902 + 1818

 Runtime complexity of some common algorithmic examples:

 Runtime Complexity for Linear Search – O(n)

 Runtime Complexity for Binary Search – O(log n)

 Runtime Complexity for Exponential algorithms like Tower of Hanoi - O(c^n).

1. Implementation of the program for a particular algorithm.

Space Complexities of some common algorithms:

 Radix sort - Space complexity is O(n+k).

 Quick Sort - Space complexity is O(n).

 Merge sort - Space complexity is O(log n).

Example of Big O Notation in C

Implementation of Selection Sort algorithm in C to find worst-case complexity (Big O

for(int i=0; i<n; i++)

int temp = array[i];

list->array = (int )malloc(INITIAL_CAPACITY sizeof(int));

list->array = (int )realloc(list->array, list->capacity sizeof(int));