CSC 409 Algorithms and Complexity Analysis
Prepared by
Mr. K. O. Oluborode
MAU, Yola
CENTER FOR DISTANCE LEARNING
TABLE OF CONTENTS for Study Guide
Introduction
What you will learn in this course
Course Aims
Course Objectives
Working through this Course
Course Materials
Study Units
Presentation Schedule
Assessment
Tutor Marked Assignment (TMA/SAQ)
Final Examination and Grading
Course Marking Structure
Course Overview
How to benefit most from this course
INTRODUCTION
This course on Algorithms and Complexity Analysis lays emphasis on understanding the concept
of a computer algorithm as a set of steps of operations for solving a problem, performing
calculation, data processing, and automated reasoning tasks. It covers how to develop
algorithms, i.e. the best way to represent the solution of a particular problem in a simple and
efficient way before translating it into a viable and workable program, and the analysis of
algorithms, i.e. the process of analyzing the problem-solving capability of an algorithm in terms
of the time and storage (the size of memory required during implementation) it needs. This
course is specifically tailored to those students who are studying computing and are interested in
developing and testing computer algorithms and in applying them towards developing programs
in any programming language.
What students will learn in this course: This course combines theoretical content with
self-exercises. Throughout the semester, students will complete 3 modules of 16 units,
self-assessment exercises and workable assignments expected to meet specific course criteria.
COURSE AIMS
The aim of the course is to guide learners of Computing and Computer Programming on how to
design and test algorithms, and to help them identify different types of algorithm design
paradigms. It is also to help them simplify the task of understanding the theory behind computer
algorithms.
COURSE OBJECTIVES
Below are the objectives of the course which are to:
1. Provide sound understanding of computer algorithms.
2. Provide a clear understanding of how algorithms work and the techniques they use.
3. Provide suitable examples of different types of algorithms and why algorithms are very
important in computing.
STUDY UNITS
The study units in this course are as follows:
Unit 1 Basic Algorithm Concepts
Unit 2 Analysis and Complexity of Algorithms
Unit 3 Algorithm Design Techniques
Unit 4 Recursion and Recursive Algorithms
Unit 5 Recurrence Relations
PRESENTATION SCHEDULE
The course materials and assignments have important submission deadlines. Learners should
guard against falling behind the stipulated deadlines.
ASSESSMENT
Assessment of the course is carried out in three ways. The first is made up of
self-assessment exercises, the second consists of the tutor-marked assignments, and the last is the
written end-of-course examination.
You are expected to do all the self-assessment exercises by applying what you have read in the
units. The tutor marked assignments should be submitted to your facilitators for formal
assessment in accordance with the deadlines stated in the presentation schedule and the
assignment files. The total assessment will carry 30% of the total course score. At the end of the
course, the final examination will not be more than three hours and will carry 70% of the total
marks.
COURSE MARKING STRUCTURE
Assignments: 30%
Examination: 70%
Total: 100%
date (at least two working days are required). The assignments will be marked by your tutor and
returned to you as soon as possible.
Do not delay in contacting your facilitator by telephone or e-mail if you need assistance. Such
assistance could be needed in any of the following situations:
Having difficulties in understanding any part of the study unit or assigned readings
Difficulties in the self-assessment exercises
Questions or problems with assignment or grading of assignments
The only way to have face-to-face contact with your facilitator and to ask questions is to attend
tutorials. To gain the most from the tutorials, prepare a list of questions and participate actively in
discussions.
SUMMARY
This course provides an overview of computer algorithms and complexity analysis. In
particular, we will learn more about the nature and design of algorithms, why they are so
important in the field of computing and computer programming, and the several algorithm design
paradigms that will be explained. Learners will also learn how to do basic run-time
and space-complexity analysis of computer algorithms. Some examples of algorithms applied in
the fields of Searching and Sorting will also be examined.
I wish you success in the course and I hope you will find the course both interesting and useful.
ALGORITHMS AND COMPLEXITY ANALYSIS
CONTENT PAGE
Module 1 Basic Algorithm Analysis
Unit 1 Basic Algorithm Concepts
Unit 2 Analysis and Complexity of Algorithms
Unit 3 Algorithm Design Techniques
Unit 4 Recursion and Recursive Algorithms
Unit 5 Recurrence Relations
MODULE 1 BASIC ALGORITHMIC ANALYSIS
1.0 INTRODUCTION
The word algorithm literally means “a step-by-step procedure used in solving a problem”. An
algorithm is an effective method consisting of a finite list of well-defined instructions for
completing a task which, given an initial state, will proceed through a well-defined series of
successive states, eventually terminating in an end-state. The concept of an algorithm originated
as a means of recording procedures for solving mathematical problems such as finding the
common divisor of two numbers or multiplying two numbers.
2.0 OBJECTIVES
By the end of this unit, you should be able to:
Define and describe what an algorithm is
Enumerate the different characteristics of an algorithm
Examine some of the advantages of algorithms
Identify some shortcomings or disadvantages of algorithms
Look at the concept of a pseudocode
Examine some benefits and shortcomings of a pseudocode
Make a comparison between an algorithm and a pseudocode
Look at the various reasons why an algorithm is needed
3.1 Characteristics of an Algorithm
Definiteness: Each instruction should be clear and unambiguous.
Finiteness: An algorithm should terminate after executing a finite number of steps.
Effectiveness: Every instruction should be sufficiently basic that it can, in principle, be
carried out by a person using only pen and paper.
Feasible: It must be feasible to carry out each instruction.
Flexibility: It must be flexible enough to carry out desired changes with minimal effort.
Efficient: Efficiency is measured in terms of the time and space an algorithm requires
when implemented. Thus, an algorithm should take as little time and memory as possible
while meeting the acceptable limit of development time.
Independent: An algorithm must be language independent, which means that it should
mainly focus on the input and the procedure required to derive the output instead of
depending upon the language.
3.2 Pseudocode
Pseudocode refers to an informal high-level description of the operating principle of a computer
program or algorithm. It uses the structural conventions of a standard programming language
but is intended for human reading rather than machine reading.
Page 13 of 223
3.2.4 Problem Case/ Example:
Suppose there are 60 students in a class. How will you calculate the number of absentees in the
class?
i. Pseudocode Approach:
1. Initialize a variable called Count to zero, absent to zero, total to 60
2. FOR EACH Student PRESENT DO the following:
Increase the Count by One
3. Then Subtract Count from total and store the result in absent
4. Display the number of absent students
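A minimal C++ sketch of this pseudocode follows; the attendance data (which students are present) is a hypothetical input added here only for illustration.
#include <iostream>
int main()
{
    const int total = 60;              // total students in the class
    bool present[total] = {};          // true for each student marked present
    present[0] = present[1] = true;    // hypothetical: two students showed up
    int count = 0;
    for (int i = 0; i < total; i++)
        if (present[i]) count++;       // Step 2: increase Count by one
    int absent = total - count;        // Step 3: subtract Count from total
    std::cout << "Absent students: " << absent << std::endl;   // Step 4
    return 0;
}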
11. With the help of an algorithm, we can also identify the resources (memory, input-output)
cycles required by the algorithm.
12. With the help of algorithms, we convert an art into a science.
13. To understand the principle of designing.
14. We can measure and analyze the complexity (time and space) of the problems concerning
input size without implementing and running it; it will reduce the cost of design.
Self-Assessment Exercise
1. What is an algorithm?
2. Differentiate between an algorithm and a pseudocode
3. Highlight some of the basic reasons why algorithms are needed.
4. How is an algorithm similar to and different from a program?
5. Why must every good computer programmer understand an algorithm first?
6. State an algorithm for adding three numbers A, B, and C
4.0 CONCLUSION
The concept of understanding and writing computer algorithms is very essential to understanding
the task of programming and every computing student has to imbibe the concepts of algorithms.
In fact, algorithms are the basic key to understanding the theory and practice of computing.
5.0 SUMMARY
In this unit we have considered an overview of algorithms and their basic characteristics. In
addition, we looked at some of the benefits and shortcomings of algorithms and also examined
the concept of a pseudocode as well as some of its benefits and shortcomings. We also made a
brief comparison between a pseudocode and an algorithm and finally looked at some of the
reasons why an algorithm is needed.
6.0 TUTOR MARKED ASSIGNMENT
1. Define the following: a. Algorithms b. Pseudocode c. Computer Programs
2. State five properties or features of an algorithm.
3. State some basic differences between an algorithm and a pseudocode and also between an
algorithm and a computer program
4. Give four benefits each of an algorithm and a pseudocode
UNIT 2 ANALYSIS AND COMPLEXITY OF ALGORITHMS
1.0 Introduction
2.0 Objectives
3.0 Analysis of Algorithms
3.1 Types of Time Complexity Analysis
3.1.1 Worst-case Time Complexity
3.1.2 Average-case Time Complexity
3.1.3 Best-case Time Complexity
3.2 Complexity of Algorithms
3.3 Typical Complexities of an Algorithm
3.3.1 Constant complexity
3.3.2 Logarithmic complexity
3.3.3 Linear complexity
3.3.4 Quadratic complexity
3.3.5 Cubic complexity
3.3.6 Exponential complexity
3.4 How to approximate the Time taken by an Algorithm
3.4.1 Some Examples
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignments
7.0 Further Reading/ References
1.0 INTRODUCTION
Analysis of an algorithm means estimating the efficiency of the algorithm. There are two
fundamental parameters on which we can analyze an algorithm: Space Complexity and Time
Complexity. Within Time Complexity there is also the concept of estimating the running time of
an algorithm, for which we have the Best-case, Average-case and Worst-case analyses.
2.0 OBJECTIVES
By the end of this unit, you will be able to
Understand runtime and space analysis or complexity of algorithms
Know the different types of analysis
Understand the typical complexities of an algorithm
Learn how to approximate the time taken by an algorithm
3.0 Analysis of Algorithms
Before you implement any algorithm as a program, it is better to find out which among the
candidate algorithms is good in terms of time and memory. It is best to analyze every algorithm
in terms of Time, which relates to which one executes faster, and Memory or Space, which
relates to which one takes less memory.
So, the Design and Analysis of Algorithm talks about how to design various algorithms and how
to analyze them. After designing and analyzing, choose the best algorithm that takes the least
time and the least memory and then implement it as a program in C or any preferable language.
We will be looking more at time than at space because time is the more limiting parameter in
terms of hardware. It is not easy to take a computer and change its speed, so if we are running an
algorithm on a particular platform, we are more or less stuck with the performance that platform
can give us in terms of speed.
However, memory is relatively more flexible. We can increase memory as and when required by
simply adding a memory card. So, we will focus more on time than on space.
The running time measured on a particular piece of hardware is not a robust measure. When we
run the same algorithm on a different computer (which might be faster) or use a different
programming language (which may be designed to compile faster), we will find that the same
algorithm takes a different time.
3.1.1 Worst-case time complexity: For 'n' input size, the worst-case time complexity can be
defined as the maximum amount of time needed by an algorithm to complete its execution. Thus,
it is nothing but a function defined by the maximum number of steps performed on an instance
having an input size of n. Computer Scientists are more interested in this.
3.1.2 Average case time complexity: For 'n' input size, the average case time complexity can be
defined as the average amount of time needed by an algorithm to complete its execution. Thus, it
is nothing but a function defined by the average number of steps performed on an instance
having an input size of n.
3.1.3 Best case time complexity: For 'n' input size, the best-case time complexity can be defined
as the minimum amount of time needed by an algorithm to complete its execution. Thus, it is
nothing but a function defined by the minimum number of steps performed on an instance having
an input size of n.
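As a simple illustration (a sketch not in the original text), linear search exhibits all three cases on an input of size n: the best case is the key at position 0 (1 step), the worst case is a missing key (n steps), and the average case is about n/2 steps when the key is equally likely to be anywhere.
// Linear search: returns the index of key in a[0..n-1], or -1 if absent.
int linearSearch(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key) return i;   // best case: found immediately
    return -1;                       // worst case: n comparisons made
}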
For N = 1,000,000, an algorithm with a complexity of O(log N) would take about 20 steps
(up to a constant factor). Here, the base of the logarithm does not affect the order of the
operation count, so it is usually omitted.
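As a quick worked check: log2(1,000,000) = log2(10^6) ≈ 19.93, so roughly 20 halvings reduce one million items to a single item.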
The factorial function N! grows even faster than the exponential function. It is computed as
N! = N*(N-1)*(N-2)*(N-3)*…*2*1, multiplying by successively smaller integers until the
factor reaches 1.
For example, if N = 5, then
N! = 5*4*3*2*1 = 120.
Likewise, if N = 10, then
N! = 10! = 10*9*8*7*6*5*4*3*2*1 = 3,628,800, and so on.
Since constants do not have a significant effect on the order of the operation count, it is better to
ignore them. Thus, algorithms that perform N, N/2 or 3*N operations on the same number of
elements are all considered linear and roughly equally efficient.
Self-Assessment Exercises
1. Compare the Worst-case and the Best-case analysis of an algorithm
2. Why is the Worst-case analysis the most important in algorithm analysis?
3. Among the different complexity types of an algorithm, which do you consider as the worst?
4. Presently we can solve problem instances of size 30 in 1 minute using algorithm A, which is a
Θ(2^n) algorithm. On the other hand, we will soon have to solve problem instances twice this
large in 1 minute. Do you think it would help to buy a faster (and more expensive) computer?
Example1:
void A(int n)
{
    int i;
    for (i = 1; i <= n; i++)
        printf("Abdullahi\n");
}
Since i runs from 1 to n, the above program will print Abdullahi n times. Thus, the
complexity will be O(n).
Example2:
void A(int n)
{
    int i, j;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            printf("Hello, how are you\n");
}
Each time the outer loop runs, the inner loop runs n times before the outer loop can be
incremented by 1. This continues until the outer loop has repeated n times; thus, the time
complexity is O(n^2).
This means that the program will print Hello, how are you 9 times if n is given to be 3. It will
look like the output below:
Hello, how are you
Hello, how are you
Hello, how are you
Hello, how are you
Hello, how are you
Hello, how are you
Hello, how are you
Hello, how are you
Hello, how are you
Example3:
void A(int n)
{
    int i = 1, S = 1;
    while (S <= n)
    {
        i++;
        S = S + i;
        printf("I am excited\n");
    }
}
As we can see from the above example, we have two variables, i and S, and a loop condition
S <= n, which means S starts at 1 and the entire loop stops as soon as the value of S becomes
greater than n.
Here i increments in steps of one, and S increments by the value of i; i.e., the increment in i is
linear, while the increment in S depends on i.
Initially;
i=1, S=1
After 1st iteration;
i=2, S=3
After 2nd iteration;
i=3, S=6
After 3rd iteration;
i=4, S=10 … and so on.
Since we don't know the value of n, so let's suppose it to be k. Now, if we notice the value of S in
the above case is increasing; for i=1, S=1;
i=2, S=3; i=3, S=6; i=4, S=10; …
Thus, it is nothing but the series of sums of the first natural numbers; i.e., by the time i reaches k,
the value of S will be k(k+1)/2.
To stop the loop, k(k+1)/2 has to be greater than n; solving this, we get (k^2 + k)/2 > n,
i.e. k is on the order of √n.
Hence, it can be concluded that we get a complexity of O(√n) in this case.
For example, when n = 9 the program gives the following output:
I am excited
I am excited
I am excited
Example: Find the running time of a recursive routine that does one unit of work and then
recurses on n-1, i.e. T(n) = 1 + T(n-1), with T(1) = 1.
Solution:
Here we will use the simple Back Substitution method to solve the above problem.
T(n) = 1 + T(n-1) … Eqn. (1)
Step1: Substitute n-1 at the place of n in Eqn. (1)
T(n-1) = 1 + T(n-2) .. .Eqn. (2)
Step2: Substitute n-2 at the place of n in Eqn. (1)
T(n-2) = 1 + T(n-3) … Eqn. (3)
Step3: Substitute Eqn. (2) in Eqn. (1)
T(n)= 1 + 1+ T(n-2) = 2 + T(n-2) … Eqn. (4)
Step4: Substitute eqn. (3) in Eqn. (4)
T(n) = 2 + 1 + T(n-3) = 3 + T(n-3) = …... = k + T(n-k) …Eqn. (5)
Now, according to Eqn. (1), i.e. T(n) = 1 + T(n-1), the algorithm will run until n > 1. Basically, n
starts from a very large number and decreases gradually. When the argument reaches 1, the
algorithm eventually stops; such a terminating condition is called the anchor condition, base
condition or stopping condition.
Thus, the recursion stops when k = n-1.
Step5: Substitute k = n-1 in Eqn. (5)
T(n) = (n-1) + T(n-(n-1)) = (n-1) + T(1) = n-1+1 = n
Hence, T(n) = n, or O(n).
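As an illustration (a sketch not in the original text), a routine whose running time satisfies this recurrence is a simple recursive countdown:
// Each call does constant work (the "1") plus one recursive call on n-1,
// so T(n) = 1 + T(n-1) with T(1) = 1, giving n calls in total: O(n).
void countdown(int n)
{
    if (n <= 1) return;   // base (anchor) condition
    countdown(n - 1);
}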
4.0 CONCLUSION
Analysis of algorithms helps us to determine how good or how bad they are in terms of speed or
time taken and memory or space utilized. Designing good programs is dependent on how good or
how bad the algorithm is and the analysis helps us to determine the efficiency of such
algorithms.
5.0 SUMMARY
In the unit, we have learnt the meaning of algorithm analysis and the different types of analysis.
We also examined the complexity of algorithms and the different types of complexities.
Jena, S. R. and Patro, S. (2018) – Design and Analysis of Algorithms, ISBN 978-93-935274-
311-7
UNIT 3 ALGORITHM DESIGN TECHNIQUES
1.0 INTRODUCTION
The design of any algorithm follows some planning as there are different design techniques,
strategies or paradigms that could be adopted depending on the problem domain and a better
understanding by the designer. Some of these techniques could be combined also while the
limiting behaviour of the algorithm can be represented with asymptotic analysis of which we
shall be looking at examples of algorithm design techniques and asymptotic notations.
2.0 OBJECTIVES
By the end of this unit, you will be able to
Understand several design techniques or paradigms of algorithms
Know the meaning of Asymptotic notations
Understand some popular Asymptotic notations
Learn how to apply some of the Asymptotic notations learnt
The divide-and-conquer paradigm often helps in the discovery of efficient algorithms. It is a top-
down approach. The algorithms which follow the divide & conquer techniques involve three
steps:
Divide the original problem into a set of sub-problems.
Solve every sub-problem individually, recursively.
Combine the solution of the sub-problems (top level) into a solution of the whole original
problem.
Following are some standard algorithms of the Divide and Conquer variety (a small worked
sketch follows the list):
Binary Search, a searching algorithm.
Quicksort, a sorting algorithm.
Merge Sort, also a sorting algorithm.
Closest Pair of Points: the problem of finding the closest pair of points in a set of points in the
x-y plane.
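As a small illustration of the three steps (a sketch, not from the original text; the function name and interface are hypothetical), the following C++ function finds the largest element of an array by divide and conquer:
// Largest element of arr[lo..hi] by divide and conquer.
int maxDC(const int arr[], int lo, int hi)
{
    if (lo == hi) return arr[lo];          // base case: one element
    int mid = lo + (hi - lo) / 2;          // divide
    int left = maxDC(arr, lo, mid);        // conquer left half
    int right = maxDC(arr, mid + 1, hi);   // conquer right half
    return left > right ? left : right;    // combine
}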
Some standard problems commonly solved with the Greedy technique include:
Graph – Map Coloring.
Kruskal's Minimal Spanning Tree Algorithm.
Dijkstra's Shortest Path Algorithm.
Graph – Vertex Cover.
Knapsack Problem.
Job Scheduling Problem.
Branch and Bound:
It is used for solving optimization and minimization problems. If we are given a maximization
problem, we can convert it using the Branch and Bound technique by simply converting it into a
minimization problem.
An important advantage of branch-and-bound algorithms is that we can control the quality of
the solution to be expected, even if it has not yet been found: the cost of an optimal solution is
only up to a known bound smaller than the cost of the best solution computed so far.
Branch and bound is an algorithm design paradigm which is generally used for solving
combinatorial optimization problems.
Backtracking:
In any backtracking algorithm, the algorithm seeks a path to a feasible solution that includes
some intermediate checkpoints. If the checkpoints do not lead to a viable solution, the algorithm
can return to a checkpoint and take another path to find a solution (a minimal sketch appears
after the list of uses below).
There are the following scenarios in which you can use the backtracking:
It is used to solve a variety of problems. You can use it, for example, to find a feasible
solution to a decision problem.
Backtracking algorithms were also discovered to be very effective for solving
optimization problems.
In some cases, it is used to find all feasible solutions to the enumeration problem.
Backtracking, on the other hand, is not regarded as an optimal problem-solving
technique. It is useful when the solution to a problem does not have a time limit.
Backtracking algorithms are used in;
Finding all Hamiltonian paths present in a graph
Solving the N-Queen problem
Knights Tour problem, etc
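A minimal backtracking sketch (not from the original text; names are illustrative) asks whether any subset of an array sums to a target: the algorithm tries taking an element, and if that checkpoint leads to a dead end it backtracks and tries skipping it.
// Does any subset of arr[i..n-1] sum exactly to target?
bool subsetSum(const int arr[], int n, int i, int target)
{
    if (target == 0) return true;   // feasible solution found
    if (i == n) return false;       // dead end: backtrack
    if (subsetSum(arr, n, i + 1, target - arr[i]))  // try taking arr[i]
        return true;
    return subsetSum(arr, n, i + 1, target);        // backtrack: skip arr[i]
}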
Self-Assessment Exercise
1. What do you understand by an Algorithm design paradigm?
2. How does the Greedy Technique work and give an example?
3. Give a difference between the Backtracking and Randomized algorithm techniques
3.2 Asymptotic Analysis of algorithms (Growth of function)
Resources for an algorithm are usually expressed as a function of the input. Often this
function is messy and complicated to work with, so to study function growth efficiently we
reduce the function down to its important part.
Asymptotic notations are used to write fastest and slowest possible running time for an
algorithm. These are also referred to as 'best case' and 'worst case' scenarios respectively. "In
asymptotic notations, we derive the complexity concerning the size of the input. (Example in
terms of n)"
"These notations are important because without expanding the cost of running the algorithm, we
can estimate the complexity of the algorithms."
3.3.1. Big-oh notation:
Big-oh is the formal method of expressing the upper bound of an algorithm's running time. It is
a measure of the longest amount of time. The function f(n) = O(g(n)) [read as "f of n is big-
oh of g of n"] if and only if there exist a positive constant k and a value n0 such that
f(n) ≤ k·g(n) for all n > n0.
Hence, the function g(n) is an upper bound for the function f(n), as g(n) grows at least as fast as f(n).
Examples:
1. 3n+2=O(n) as 3n+2≤4n for all n≥2
2. 3n+3=O(n) as 3n+3≤4n for all n≥3
Hence, the complexity of f(n) can be represented as O (g (n))
3.3.2. Big-omega (Ω) notation:
Big-omega is the formal method of expressing the lower bound of an algorithm's running time:
f(n) = Ω(g(n)) if there exist positive constants k and n0 such that f(n) ≥ k·g(n) for all n > n0.
Example:
f(n) = 8n^2 + 2n - 3 ≥ 8n^2 - 3 = 7n^2 + (n^2 - 3) ≥ 7n^2 for all sufficiently large n,
with g(n) = n^2.
Thus, k1 = 7.
Hence, the complexity of f(n) can be represented as Ω(g(n))
3.3.3. Theta (θ) notation:
Theta expresses a tight bound: f(n) = θ(g(n)) if there exist positive constants k1, k2 and n0
such that k1·g(n) ≤ f(n) ≤ k2·g(n) for all n > n0.
For example:
3n+2 = θ(n), as 3n+2 ≥ 3n and 3n+2 ≤ 4n for all n ≥ 2;
k1 = 3, k2 = 4, and n0 = 2
Hence, the complexity of f (n) can be represented as θ (g(n)).
The Theta Notation is more precise than both the big-oh and Omega notation. The function f (n)
= θ (g (n)) if g(n) is both an upper and lower bound.
Self-Assessment Exercise
1. Which of the Asymptotic notations do you consider more important and why?
2. What do you understand by a Backtracking algorithm?
3. What do you understand by the Upper and Lower bound of an algorithm?
4.0 CONCLUSION
Algorithm design techniques present us with different paradigms or methods of representing or
designing computer algorithms, and as an algorithm executes and grows within its bounds
(upper or lower), the asymptotic notations help us to determine the levels of growth.
5.0 SUMMARY
Several design techniques or paradigms are available for specifying algorithms and they range
from the popular Divide-and-Conquer, Greedy techniques and Randomized algorithms amongst
others. In the same vein, we have three main notations for carrying out the Asymptotic analysis
of algorithms and they are the Big O, Big Omega and Big Theta notations.
UNIT 4 RECURSION AND RECURSIVE ALGORITHMS
3.2 Types of Recursions
3.2.1 Direct Recursion
3.2.2 Indirect Recursion
3.3 Recursion versus Iteration
3.4 Some Recursive Algorithms (Examples)
3.4.1 Reversing an Array
3.4.2 Fibonacci Sequence
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and other Resources
1.0 INTRODUCTION
Recursion is a method of solving problems that involves breaking a problem down into smaller
and smaller sub-problems until you get to a small enough problem that it can be solved trivially.
In computer science, recursion involves a function calling itself. While it may not seem like
much on the surface, recursion allows us to write elegant solutions to problems that may
otherwise be very difficult to program.
2.0 OBJECTIVES
By the end of this unit, you will be able to
Know the meaning of Recursion and a Recursive algorithm
Understand the different types of recursive algorithms
See some examples of recursive algorithms
Understand how the recursive algorithm works
Know the difference between recursion and iteration
Know the reasons why recursion is preferred in programming
Know the runtime and space complexity of different recursive Algorithms
specified condition is met at which time the rest of each repetition is processed from the last one
called to the first.”
Recursion is the process of defining something in terms of itself. A physical world example
would be to place two parallel mirrors facing each other. Any object in between them would be
reflected recursively.
A recursive algorithm is an algorithm which calls itself with "smaller (or simpler)" input values,
and which obtains the result for the current input by applying simple operations to the returned
value for the smaller (or simpler) input.
There are two main instances of recursion. The first is when recursion is used as a technique in
which a function makes one or more calls to itself. The second is when a data structure uses
smaller instances of the exact same type of data structure when it represents itself.
4! = 4 ⋅ 3! = 24
This means we can rewrite the formal definition of factorial in terms of recursion like so:
n! = n ⋅ (n−1)!
Note: if n = 0, then n! = 1. This means the base case occurs when n = 0; the recursive cases are
defined in the equation above. Whenever you are trying to develop a recursive solution it is very
important to think about the base case, as your solution will need to return the base case once all
the recursive cases have been worked through. Let’s look at how we can create the factorial
function in Python:
def fact(n):
    '''Returns factorial of n (n!). Note use of recursion'''
    # BASE CASE!
    if n == 0:
        return 1
    # Recursion!
    else:
        return n * fact(n-1)
3.1.2 Purpose of Recursions
Recursive functions have many uses, but like any other kind of code, their necessity should be
considered. As discussed above, consider the differences between recursions and loops, and use
the one that best fits your needs. If you decide to go with recursions, decide what you want the
function to do before you start to compose the actual code.
function. However, in the case of tail-end recursion, the return value still calls a function but gets
the value of that function right away. The establishment of base cases is commonly achieved by
having a conditional observe some quality, such as the length of an array or the amount of a
number, just like loops. However, there are multiple ways to go about it, so feel free to alter the
complexity as needed.
Self-Assessment Exercises
1. What do you understand by the term “base case”?
2. Why must a stopping criterion be specified in a recursive algorithm?
3. What happens when a recursive algorithm calls itself recursively?
a. Tail Recursion:
If a recursive function calls itself and that recursive call is the last statement in the function,
it is known as Tail Recursion. After that call, the recursive function performs nothing more: it
does all of its processing at the time of calling and nothing at returning time.
Example:
// Code Showing Tail Recursion
#include <iostream>
using namespace std;
// Recursion function
void fun(int n)
{
if (n > 0) {
cout << n << " ";
// Last statement in the function
fun(n - 1);
}
}
// Driver Code
int main()
{
int x = 3;
fun(x);
return 0;
}
Output:
3 2 1
The same tail-recursive function can be converted into a loop:
// Converting Tail Recursion into a loop
#include <iostream>
using namespace std;
void fun(int y)
{
    while (y > 0) {
cout << y << " ";
y--;
}
}
// Driver code
int main()
{
int x = 3;
fun(x);
return 0;
}
Output:
3 2 1
b. Head Recursion:
If the recursive call is the first statement in the function, so that all processing is done at
returning time, it is known as Head Recursion.
Example:
// C++ program showing Head Recursion
#include <bits/stdc++.h>
using namespace std;
// Recursive function
void fun(int n)
{
if (n > 0) {
// First statement in the function
fun(n - 1);
cout << " "<< n;
}
}
// Driver code
int main()
{
int x = 3;
fun(x);
return 0;
}
Output:
1 2 3
c. Tree Recursion:
To understand Tree Recursion, let us first understand Linear Recursion. If a recursive function
calls itself once, it is known as Linear Recursion. If a recursive function calls itself more
than once, it is known as Tree Recursion.
Example:
// C++ program showing Tree Recursion
#include <iostream>
using namespace std;
// Recursive function
void fun(int n)
{
if (n > 0)
{
cout << " " << n;
// Calling once
fun(n - 1);
// Calling twice
fun(n - 1);
}
}
// Driver code
int main()
{
fun(3);
return 0;
}
Output:
3 2 1 1 2 1 1
d. Nested Recursion:
If a recursive function passes a recursive call as its parameter (recursion inside recursion), it is
known as Nested Recursion.
Example:
// C++ program to show Nested Recursion
#include <iostream>
using namespace std;
int fun(int n)
{
if (n > 100)
return n - 10;
// A recursive function passing parameter
// as a recursive call or recursion inside
// the recursion
return fun(fun(n + 11));
}
// Driver code
int main()
{
int r;
r = fun(95);
cout << " " << r;
return 0;
}
Output:
91
3.2.2 Indirect Recursion
In indirect recursion, functions call one another in a cycle: fun(A) calls fun(B), fun(B) calls
fun(C), and fun(C) calls fun(A), thus making a cycle.
Example:
// C++ program to show Indirect Recursion
#include <iostream>
using namespace std;
void funB(int n);
void funA(int n)
{
if (n > 0) {
cout <<" "<< n;
// Fun(A) is calling fun(B)
funB(n - 1);
}
}
void funB(int n)
{
if (n > 1) {
cout <<" "<< n;
// Fun(B) is calling fun(A)
funA(n / 2);
}
}
// Driver code
int main()
{
funA(20);
return 0;
}
Output:
20 19 9 8 4 3 1
3.3 Recursion versus Iteration
Recursion and iteration both repeatedly execute a set of instructions. Recursion is when a
statement in a function calls the function itself repeatedly. Iteration is when a loop repeatedly
executes until the controlling condition becomes false. The primary difference is that recursion
is a process always applied to a function, while iteration is applied to the set of instructions we
want executed repeatedly. A short code sketch contrasting the two follows the lists below.
Recursion
Recursion uses selection structure.
Infinite recursion occurs if the recursion step does not reduce the problem in a manner
that converges on some condition (base case) and Infinite recursion can crash the system.
Recursion terminates when a base case is recognized.
Recursion is usually slower than iteration due to the overhead of maintaining the stack.
Recursion uses more memory than iteration.
Recursion makes the code smaller.
Iteration
Iteration uses repetition structure.
An infinite loop occurs with iteration if the loop condition test never becomes false and
Infinite looping uses CPU cycles repeatedly.
An iteration terminates when the loop condition fails.
An iteration does not use the stack so it's faster than recursion.
Iteration consumes less memory.
Iteration makes the code longer.
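To make the contrast concrete, here is a small sketch (not from the original text) computing the same sum both ways:
// Recursion: uses the call stack, O(n) extra space.
int sumRec(int n)
{
    if (n == 0) return 0;        // base case
    return n + sumRec(n - 1);
}
// Iteration: a loop, O(1) extra space.
int sumIter(int n)
{
    int s = 0;
    for (int i = 1; i <= n; i++) s += i;
    return s;
}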
Self-Assessment Exercises
1. Try and find the Sum of the elements of an array recursively
2. Find the maximum number of elements in an array A of n elements using recursion
3. How is recursion different from iteration?
3.4 Some Recursive Algorithms (Examples)
3.4.1 Reversing an Array
Let us consider the problem of reversing the n elements of an array, A, so that the first element
becomes the last, the second element becomes the second to the last, and so on. We can solve
this problem using the linear recursion, by observing that the reversal of an array can be achieved
by swapping the first and last elements and then recursively reversing the remaining elements in
the array.
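A minimal C++ sketch of this linear-recursive reversal (the function name and interface are illustrative):
// Reverse A[i..j] by swapping the ends and recursing on the middle.
void reverseArray(int A[], int i, int j)
{
    if (i >= j) return;              // base case: zero or one element left
    int tmp = A[i];                  // swap first and last elements
    A[i] = A[j];
    A[j] = tmp;
    reverseArray(A, i + 1, j - 1);   // recursively reverse the rest
}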
3.4.2 Fibonacci Sequence
Algorithm Fib(n) {
if (n < 2) return 1
else return Fib(n-1) + Fib(n-2)
}
The above recursion is called binary recursion since it makes two recursive calls instead of one.
How many calls are needed to compute the kth Fibonacci number? Let nk denote the
number of calls performed in the execution.
n0 = 1
n1 = 1
n2 = n1 + n0 + 1 = 3 > 2^1
n3 = n2 + n1 + 1 = 5 > 2^2
n4 = n3 + n2 + 1 = 9 > 2^3
n5 = n4 + n3 + 1 = 15 > 2^3
...
nk > 2^(k/2)
This means that the Fibonacci recursion makes a number of calls that is exponential in k. In
other words, using binary recursion to compute Fibonacci numbers is very inefficient. Compare
this with binary search, which is very efficient in searching items: why is this binary
recursion inefficient? The main problem with the approach above is that there are multiple
overlapping recursive calls.
We can compute F(n) much more efficiently using linear recursion. One way to accomplish this
conversion is to define a recursive function that computes a pair of consecutive Fibonacci
numbers F(n) and F(n-1) using the convention F(-1) = 0.
Algorithm LinearFib(n) {
Input: A nonnegative integer n
Output: Pair of Fibonacci numbers (Fn, Fn-1)
if (n <= 1) then
return (n, 0)
else
(i, j) <-- LinearFib(n-1)
return (i + j, i)
}
Since each recursive call to LinearFib decreases the argument n by 1, the original call results in a
series of n-1 additional calls. This performance is significantly faster than the exponential time
needed by binary recursion. Therefore, when using binary recursion, we should first try to
fully partition the problem in two, or we should be sure that overlapping recursive calls are really
necessary.
Let's use iteration to generate the Fibonacci numbers. What's the complexity of this algorithm?
public static int IterationFib(int n) {
if (n < 2) return n;
int f0 = 0, f1 = 1, f2 = 1;
for (int i = 2; i < n; i++) {
f0 = f1;
f1 = f2;
f2 = f0 + f1;
}
return f2;
}
Each loop iteration does constant work and the loop runs about n times, so this iterative version
runs in O(n) time and O(1) space.
Self-Assessment Exercises
1. Either write the pseudo-code or the Java code for the following problems. Draw the recursion
trace of a simple case. What is the running time and space requirement?
Recursively searching a linked list
Forward printing a linked list
Reverse printing a linked list
4.0 CONCLUSION
Recursive algorithms are very important in programming as they help us write very good
programs and also allow us to understand the concept of computing well. So many programs are
naturally recursive and many others can be turned into a recursive algorithm.
5.0 SUMMARY
In computer science, recursion is a method of solving a problem where the solution depends on
solutions to smaller instances of the same problem. Such problems can generally be solved by
iteration, but this needs to identify and index the smaller instances at programming time.
There exist several natural examples of recursive algorithms while other programming
algorithms that are iterative can be turned into recursive algorithms.
The concept of recursion is very important to developers of algorithms and also to programmers.
6.0 TUTOR MARKED ASSIGNMENT
1. The Fibonacci sequence is defined by:
F0 = 1, F1 = 1
Fn = Fn-1 + Fn-2 for n ≥ 2
Find F10 and F15 by simulating it manually
2. Mathematically, the greatest common divisor, gcd, is given as:
gcd(p, q) = p, if q = 0
gcd(p, q) = gcd(q, remainder(p, q)), if p ≥ q and q > 0
Compute: i. gcd(48, 12) ii. gcd(1035, 759)
3. What makes recursion better than iteration and what makes iteration better than recursion.
4. Give a vital difference between Head recursion and Tail recursion.
UNIT 5 RECURRENCE RELATIONS
1.0 Introduction
2.0 Objectives
3.0 Recurrence Relations
3.1 Methods for Resolving Recurrence Relations
3.1.1 Guess-and-Verify Method
3.1.2 Iteration Method
3.1.3 Recursion Tree method
3.1.4 Master Method
3.2 Example of Recurrence relation: Tower of Hanoi
3.2.1 Program for Tower of Hanoi
3.2.2 Applications of Tower of Hanoi Problem
3.2.3 Finding a Recurrence
3.2.4 Closed-form Solution
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and Other References
1.0 INTRODUCTION
A recurrence or recurrence relation defines an infinite sequence by describing how to calculate
the n-th element of the sequence given the values of smaller elements, as in:
T(n) = T(n/2) + n, T(0) = T(1) = 1.
In principle such a relation allows us to calculate T(n) for any n by applying the first equation
until we reach the base case. To solve a recurrence, we will find a formula that calculates T(n)
directly from n, without this recursive computation.
2.0 OBJECTIVES
By the end of this unit, you will be able to
Know more about Recurrences and Recurrence relations
Understand the different methods for resolving recurrence relations
Know the areas of applications of recurrence relations
3.0 RECURRENCE RELATIONS
A typical recurrence of this kind, for a divide-and-conquer algorithm such as Merge Sort, is:
T(n) = θ(1) if n = 1
T(n) = 2T(n/2) + θ(n) if n > 1
3.1.1 Guess-and-Verify Method
Example 1: Consider the recurrence
T(n) = T(n/2) + 1
We have to show that it is asymptotically bounded by O(log n).
Solution:
For T(n) = O(log n), we have to show that for some constant c,
T(n) ≤ c log n.
Putting this in the given recurrence equation:
T(n) ≤ c log(n/2) + 1 = c log n − c log 2 + 1 ≤ c log n, provided c log 2 ≥ 1.
Hence, T(n) = O(log n).
Example 2: Consider the recurrence
T(n) = 2T(n/2) + n for n > 1.
Find an asymptotic bound on T.
Solution:
We guess the solution is O(n log n). Thus, for some constant c,
T(n) ≤ c n log n.
Putting this in the given recurrence equation:
T(n) ≤ 2·c(n/2) log(n/2) + n = c n log n − c n log 2 + n ≤ c n log n, provided c log 2 ≥ 1.
Thus, T(n) = O(n log n).
3.1.2 Iteration Method
Example 1: Consider the recurrence
T(n) = 1 if n = 1
T(n) = 2T(n-1) if n > 1
Solution:
T(n) = 2T(n-1)
= 2[2T(n-2)] = 2^2 T(n-2)
= 4[2T(n-3)] = 2^3 T(n-3)
= 8[2T(n-4)] = 2^4 T(n-4) … (Eq.1)
Repeating the procedure i times gives
T(n) = 2^i T(n-i)
Put n-i = 1, i.e. i = n-1, in (Eq.1):
T(n) = 2^(n-1) T(1)
= 2^(n-1) · 1 {T(1) = 1 … given}
= 2^(n-1)
3.1.3 Recursion Tree Method
In this method, we draw a recursion tree and sum the costs within each level of the tree to
obtain a set of per-level costs, and then sum all per-level costs to determine the total cost of all
levels of the recursion.
A recursion tree is best used to generate a good guess, which can then be verified by the
Substitution Method.
Example 1
Consider T(n) = 2T(n/2) + n^2.
We have to obtain the asymptotic bound using the recursion tree method.
Solution: In the recursion tree for the above recurrence (figure omitted), the per-level costs are
n^2, n^2/2, n^2/4, …, a decreasing geometric series dominated by the root, so T(n) = O(n^2).
Example 2: Consider the following recurrence
T(n) = 4T(n/2) + n
Obtain the asymptotic bound using the recursion tree method.
Solution: In the recursion tree for the above recurrence (figure omitted), level i contributes
4^i · (n/2^i) = 2^i · n, an increasing geometric series dominated by the leaves, so T(n) = O(n^2).
Example 3: Consider the recurrence T(n) = T(n/3) + T(2n/3) + n.
When we add the values across the levels of the recursion tree, we get a value of n for every
level. The longest path from the root to a leaf is n → (2/3)n → (2/3)^2 n → … → 1, of length
log_(3/2) n, so the total cost is O(n log n).
3.1.4 Master Method
The Master Method is used for solving the following type of recurrence:
T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f(n) is a function.
Let T(n) be defined on the non-negative integers by the recurrence
T(n) = aT(n/b) + f(n)
In applying this to the analysis of a recursive algorithm, the constants and function take on the
following significance:
n is the size of the problem.
a is the number of sub-problems in the recursion.
n/b is the size of each sub-problem. (Here it is assumed that all sub-problems are
essentially the same size.)
f (n) is the sum of the work done outside the recursive calls, which includes the sum of
dividing the problem and the sum of combining the solutions to the sub-problems.
It is not always possible to bound the function as required, so we distinguish three cases
which tell us what kind of bound we can apply to the function.
Master Theorem:
It is possible to give an asymptotically tight bound in these three cases:
Case 1: If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
Example:
T(n) = 8T(n/2) + 1000n^2
a = 8, b = 2, f(n) = 1000n^2, log_b a = log_2 8 = 3
Since f(n) = O(n^(3−ε)) for ε = 1, Case 1 applies and T(n) = Θ(n^(log_b a)).
Therefore: T(n) = Θ(n^3)
Case 2: If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) log n).
Example:
T(n) = 2T(n/2) + 10n
a = 2, b = 2, f(n) = 10n, log_b a = log_2 2 = 1
Putting all the values in f(n) = Θ(n^(log_b a)), we get 10n = Θ(n^1) = Θ(n), which is true.
Therefore: T(n) = Θ(n^(log_b a) log n)
= Θ(n log n)
Case 3: If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and it is also true that
a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then:
T(n) = Θ(f(n))
Example: Solve the recurrence relation:
T(n) = 2T(n/2) + n^2
Solution:
Compare the given problem with T(n) = aT(n/b) + f(n):
a = 2, b = 2, f(n) = n^2, log_b a = log_2 2 = 1
Here f(n) = Ω(n^(1+ε)) for ε = 1, and a·f(n/b) = 2(n/2)^2 = n^2/2 ≤ c·n^2.
If we choose c = 1/2, this is true for all n ≥ 1.
So it follows that T(n) = Θ(f(n)), i.e.
T(n) = Θ(n^2)
Self-Assessment Exercises
1. How is the Guess-and-Verify method better than the Iteration method
2. Is a recurrence relation similar to a recursive algorithm? Discuss.
3. What is the essence of the base case in every recurrence relation?
3.2 Example of a Recurrence Relation: Tower of Hanoi
In the 7 steps above (figure omitted), all the disks are transferred from peg A to peg C subject to
the following conditions:
1. Only one disk may be moved at a time.
2. A smaller disk can be placed on a larger disk, but never the reverse.
Let T (n) be the total time taken to move n disks from peg A to peg C
1. Moving n-1 disks from the first peg to the second peg. This can be done in T(n-1) steps.
2. Moving the largest disk from the first peg to the third peg. This requires one step.
3. Recursively moving the n-1 disks from the second peg to the third peg. This again requires
T(n-1) steps.
So, the total time taken is T(n) = T(n-1) + 1 + T(n-1).
We get T(n) = 2T(n-1) + 1, and solving this recurrence gives
T(n) = 2^n − 1 … (B)
Equation (B) is the required complexity of the Tower of Hanoi technique when we have to move
n disks from one peg to another.
T(3) = 2^3 − 1 = 8 − 1 = 7
[As shown in the concept above, there are 7 steps; this is now proved by the general equation.]
3.2.1 Program for Tower of Hanoi
void towers(int num, char frompeg, char topeg, char auxpeg)
{
    if (num == 1)
    {
        printf("\n Move disk 1 from peg %c to peg %c", frompeg, topeg);
        return;
    }
    towers(num - 1, frompeg, auxpeg, topeg);
    printf("\n Move disk %d from peg %c to peg %c", num, frompeg, topeg);
    towers(num - 1, auxpeg, topeg, frompeg);
}
In the other cases, the monks follow our three-step procedure. First they move the (n-1)-disk
tower to the spare peg; this takes M(n-1) moves. Then the monks move the nth disk, taking 1
move. And finally they move the (n-1)-disk tower again, this time on top of the nth disk, taking
M(n-1) moves. This gives us our recurrence relation,
M(n) = 2 M(n-1) + 1.
Since the monks are handling a 64-disk tower, all we need to do is to compute M(64), and that
tells us how many moves they will have to make.
This would be more convenient if we had M(n) in closed form - that is, if we could write a
formula for M(n) without using recursion. Do you see what it should be? (It may help to go
ahead and compute the first few values, like M(2), M(3), and M(4).)
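As a quick worked check: M(2) = 2·1 + 1 = 3, M(3) = 2·3 + 1 = 7, M(4) = 2·7 + 1 = 15 — each value is one less than a power of two.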
Since our expression 2^n − 1 is consistent with all the recurrence's cases, this is the closed-form
solution. So the monks will make 2^64 − 1 (about 18.45×10^18) moves. If they are really good and
can move one disk a millisecond, then they'll have to work for 584.6 million years. It looks like
we're safe.
Self-Assessment Exercise
1. Simulate the Tower-of-Hanoi problem for N = 7 disks and N = 12 disks.
2. Can we solve the Tower of Hanoi problem for any value of Tn without using a
Recurrence relation? Discuss.
3. What are the application areas for the Tower of Hanoi problem?
4.0 CONCLUSION
A recurrence relation permits us to compute the members of a sequence one after another,
starting from one or more initial values. Recurrence relations apply recursion completely, and
there exist one or more base cases that determine the stopping criterion.
5.0 SUMMARY
In mathematics and computer science, a recurrence relation is an equation that expresses the nth
term of a sequence as a function of the k preceding terms, for some fixed k, which is called the
order of the relation. Recurrence relations can be solved by several methods ranging from the
popular Guess-and-Verify method to the Master method and they help us understand the
workings of algorithms better.
Michalewicz, Z. and Fogel, D. (2004). How to Solve It: Modern Heuristics. Second Edition.
Springer.
Pearl, J. (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving.
Addison-Wesley, 1984.
1.0 Introduction
2.0 Objectives
3.0 What is Search?
3.1 What is Binary Search?
3.2 How Binary Search Work
3.2.1 Example Binary Search
3.2.2 Example 2
3.2.3 Why do you need Binary Search?
3.3 Sequential Search
3.3.1 With and Without Sentinels
3.3.2 Weighted Sequential Search
3.3.3 Self-Organizing Lists
3.3.4 Which Heuristic is Better?
3.4 Binary Search
3.4.1 Implementing Binary Search
3.4.2 Recursive Binary Search Implementation
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and other Resources
1.0 INTRODUCTION
We introduce here a special search tree called the Binary Search Tree and a derivative of it
known as the Red-Black Tree. A binary search tree, also known as an ordered binary tree, is a
binary tree wherein the nodes are arranged in an order.
The order is:
a) All the values in the left sub-tree are less than the value of the root node.
b) All the values in the right sub-tree are greater than the value of the root node.
On the other hand, a red-black tree is a binary tree where each node has a color, either red or
black, as an extra attribute. By checking the node colors on any simple path from the root to a
leaf, red-black trees ensure that no such path is more than twice as long as any other, so that the
tree is generally balanced.
2.0 OBJECTIVES
At the end of this unit, you will be able to:
Understand the meaning of a Binary Search Tree.
Know the different methods of traversing a Binary Search Tree
List and explain the different ways a Binary Search Tree can be queried
Understand the Red Black Trees
Learn the different properties of Red Black Trees
Know the different operations done on Red Black Trees
3.0 What is Search?
Search is a utility that enables its user to find documents, files, media, or any other type of data
held inside a database. Search works on the simple principle of matching the criteria with the
records and displaying them to the user. This is how the most basic search function works.
3.2.1 Example Binary Search
Let us look at the example of a dictionary. If you need to find a certain word, no one goes
through each word in a sequential manner but randomly locates the nearest words to search for
the required word.
F. These iterations continue until the array is reduced to only one element, or the item to be
found becomes the middle of the array.
3.2.2 Example 2
Let’s look at the following example to understand the binary search working
A. You have an array of sorted values ranging from 2 to 20 and need to locate 18.
B. The average of the lower and upper limits is (l + r) / 2 = 4. The value being searched is
greater than the mid which is 4.
C. The array values less than the mid are dropped from search and values greater than the
mid-value 4 are searched.
D. This is a recurrent dividing process until the actual item to be searched is found.
Binary search performs comparisons of the sorted data based on an ordering principle
rather than using equality comparisons alone, which are slower and often inaccurate.
After every cycle of search, the algorithm divides the size of the array into half; hence in
the next iteration it will work only on the remaining half of the array.
Summary
Search is a utility that enables its user to search for documents, files, and other types of
data. A binary search is an advanced type of search algorithm that finds and fetches data
from a sorted list of items.
Binary search is commonly known as a half-interval search or a logarithmic search
It works by dividing the array into half on every iteration until the required element is
found.
The binary algorithm takes the middle of the array by dividing the sum of the left and
rightmost index values by 2. Now, the algorithm drops either the lower or upper bound of
elements from the middle of the array, depending on the element to be found.
The algorithm randomly accesses the data to find the required element. This makes the
search cycles shorter and more accurate.
Binary search performs comparisons of the sorted data based on an ordering principle
rather than using equality comparisons alone, which are slow and inaccurate.
A binary search is not suitable for unsorted data.
A sentinel is a value placed at the end of an array to ensure that the normal case of searching
returns something even if the item is not found. It is a way to simplify coding by eliminating the
special case.
MODULE LinearSearch EXPORTS Main; (*1.12.94. LB*)
(* Linear search without a sentinel *)
...
i:= FIRST(a);
WHILE (i <= last) AND NOT Text.Equal(a[i], x) DO INC(i) END;
MODULE SentinelSearch EXPORTS Main; (*27.10.93. LB*)
(* Linear search with sentinel. *)
...
(* Do search *)
a[LAST(a)]:= x; (*sentinel at position N+1*)
i:= FIRST(a);
WHILE x # a[i] DO INC(i) END;
(* Output result *)
IF i = LAST(a) THEN
SIO.PutText("NOT found");
ELSE
SIO.PutText("Found at position: "); SIO.PutInt(i)
END;
SIO.Nl();
END SentinelSearch.
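The same idea in C++ (a minimal sketch, not from the original Modula-3 code; names are illustrative): placing the key one past the end lets the loop drop its bounds test.
#include <vector>
// Returns the index of x in a, or -1 if absent.
int sentinelSearch(std::vector<int> a, int x)   // by value: we append a sentinel
{
    a.push_back(x);                  // sentinel at position n
    int i = 0;
    while (a[i] != x) ++i;           // guaranteed to stop at the sentinel
    return i < (int)a.size() - 1 ? i : -1;
}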
Why? If p(i) is the probability of searching for the ith key, which is a distance i from the front,
the expected search time is
E = 1·p(1) + 2·p(2) + … + n·p(n),
which is minimized by placing the list in decreasing probability of access order.
For the list (Cheryl, 0.4), (Lisa, 0.25), (Lori, 0.2), (Lauren, 0.15), the expected search time is:
1(0.4) + 2(0.25) + 3(0.2) + 4(0.15) = 2.1
If the access probability had been uniform, the expected search time would have been:
1(0.25) + 2(0.25) + 3(0.25) + 4(0.25) = 2.5
So I win using this order, and win even more if the access probabilities are further skewed.
But how do I find the probabilities?
Self-Organizing Lists
Since it is often impractical to compute usage frequencies, and because usage frequencies often
change in the middle of a program (locality), we would like our data structure to automatically
adjust to the distribution.
Such data structures are called self-organizing.
The idea is to use a heuristic to move an element forward in the list whenever it is accessed.
There are two possibilities:
Move forward one is the ``conservative'' approach. (1,2,3,4,5) becomes (1,2,4,3,5) after
a Find(4).
Move to front is the ``liberal'' approach. (1,2,3,4,5) becomes (4,1,2,3,5) after a Find(4).
Binary Search
Binary Search is an incredibly powerful technique for searching an ordered list. It is familiar to
everyone who uses a telephone book!
The basic algorithm is to find the middle element of the list, compare it against the key, decide
which half of the list must contain the key, and repeat with that half.
Two requirements to support binary search:
Random access of the list elements, so we need arrays instead of linked lists.
The array must contain elements in sorted order by the search key.
Why Do Twenty Questions Suffice?
With one question, I can distinguish between two words, A and B: ``Is the key B?''
With two questions, I can distinguish between four words, A, B, C, D: ``Is the key ≥ C?''
Each question I ask doubles the number of words I can search in my dictionary.
Although the algorithm is simple to describe informally, it is tricky to produce a working binary
search function. The first published binary search algorithm appeared in 1946, but the
first correct published program appeared in 1962!
The difficulty is maintaining the following two invariants with each iteration:
The key must always remain between the low and high indices.
The low or high index must advance on each iteration.
The boundary cases are very tricky: zero elements left, one element left, two elements left, and
an even or odd number of elements!
Versions of Binary Search
There are at least two different versions of binary search, depending upon whether we want to
test for equality at each query or only at the end.
For the latter, suppose we want to search for ``k'':
iteration bottom top mid
---------------------------------------
1 2 14 (1+14)/2=7
2 1 7 (1+7)/2=4
3 5 7 (5+7)/2=6
4 6 7 (7+7)/2=7
Since the interval has narrowed to a single position, 7 is the right spot. However, we must now
test whether entry[7] = 'k'. If not, the item isn't in the array.
Alternately, we can test for equality at each comparison. Suppose we search for ``c'':
iteration bottom top mid
------------------------------------
1 1 14 (1+14)/2 = 7
2 1 6 (1+6)/2 = 3
3 1 2 (1+2)/2 = 1
4 2 2 (2+2)/2 = 2
Now it will be found!
left, right: [0 .. MaxInd - 1];
argument: INTEGER): [0..MaxInd] =
(*Implements binary search in an array*)
VAR
middle := left + (right - left) DIV 2;
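A complete recursive version in C++ (a sketch corresponding to the Modula-3 fragment above; the interface and names are illustrative):
// Returns the index of key in the sorted array a[left..right], or -1.
int binarySearch(const int a[], int left, int right, int key)
{
    if (left > right) return -1;              // empty range: not found
    int middle = left + (right - left) / 2;   // avoids overflow of (left+right)/2
    if (a[middle] == key) return middle;
    if (a[middle] < key)
        return binarySearch(a, middle + 1, right, key);   // upper half
    return binarySearch(a, left, middle - 1, key);        // lower half
}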
4.0 CONCLUSION
The sorting problem enables us to find better algorithms that would help arrange the numbers in
a list or sequence in any order. Ascending order is when it is arranged from Smallest to Biggest
while Descending order is when the list is arranged from biggest item to the smallest item. We
looked at the case of the bubble sort and the Selection sort algorithms which are well suited for
sorting a small-sized list efficiently.
5.0 SUMMARY
In simple terms, the Sorting algorithm arranges a list from either smallest item consecutively to
the biggest item (Ascending order) or from the biggest item consecutively to the smallest item
(Descending order).
Two methods of Sorting small-sized lists (Bubble sort and Selection Sort) were introduced and
incidentally, they both have the same worst-case running time of O(n^2).
3.2 Binary Search
3.2.1 Complexity of Binary Search
3.2.2 Advantages of Binary Search
3.2.3 Disadvantages of Binary Search
3.2.4 Applications of Binary Search
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 Further Reading and other Resources
1.0 INTRODUCTION
Divide-and-Conquer is a useful problem-solving technique that divides a large instance of a
problem into smaller and smaller instances and then solves these smaller instances to give a
complete solution of the bigger problem. There are several strategies for implementing the
Divide-and-Conquer approach and we shall first examine the Binary Search algorithm which
first requires that a list be sorted and then proceeds to find any requested item on the list and is
very efficient for large lists since it uses logarithmic time.
2.0 OBJECTIVES
By the end of this unit, you should be able to:
Know the meaning of a Divide-and-Conquer Algorithm
Know how to use a Divide-and-Conquer algorithm
Know the different applications of Divide-and-Conquer algorithms
Understand the Binary Search algorithm,
Know why the Binary Search algorithm is useful
Understand the benefits and shortcomings of Binary search
Know the different application areas of Binary Search
small pieces, and then merge the piecewise solutions into a global solution. This mechanism of
solving the problem is called the Divide & Conquer Strategy.
The Divide and Conquer approach solves a problem using the following three steps:
1. Divide the original problem into a set of sub-problems.
2. Conquer: Solve every sub-problem individually, recursively.
3. Combine: Put together the solutions of the sub-problems to get the solution to the whole
problem.
There are two fundamentals of the Divide & Conquer strategy:
1. Relational Formula
2. Stopping Condition
1. Relational Formula: This is the formula we generate from the given technique. After
generating the formula, we apply the D&C strategy, i.e. we break the problem recursively and
solve the broken sub-problems.
2. Stopping Condition: When we break the problem using the Divide & Conquer strategy, we
need to know how long to keep applying it. The condition at which we need to stop our
recursive steps is called the Stopping Condition.
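A minimal sketch (not from the original text; names are illustrative) showing both ingredients for summing an array by divide and conquer:
// Relational formula: sum(lo,hi) = sum(lo,mid) + sum(mid+1,hi).
// Stopping condition: a single element is its own sum.
int dcSum(const int arr[], int lo, int hi)
{
    if (lo == hi) return arr[lo];                           // stop
    int mid = lo + (hi - lo) / 2;                           // divide
    return dcSum(arr, lo, mid) + dcSum(arr, mid + 1, hi);   // conquer + combine
}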
Self-Assessment Exercise
1. The steps in the Divide-and-Conquer process that takes a recursive approach is said to
be?
2. Given the recurrence f(n) = 4 f(n/2) + 1, how many sub-problems will a divide-and-
conquer algorithm divide the original problem into, and what will be the size of those
sub-problems?
3. Design a divide-and-conquer algorithm to compute k^n for k > 0 and integer n >= 0.
4. Define divide and conquer approach to algorithm design
2. If the value held there is a match, the search ends.
3. If the value at the midpoint is less than the value to be found, the list is divided in half.
The lower half of the list is ignored and the search keeps to the upper half of the list.
4. Otherwise, if the value at the midpoint is greater than the value to be found, the upper half
of the list is ignored and the search keeps to the lower half of the list.
5. The search moves to the midpoint of the remaining items. Steps 2 through 4 continue until
a match is made or there are no more items to be found.
Suppose we were to search for the value 11 in an ordered list. Check at position 4, which has the
value 7. 7 is less than 11, so the bottom half of the list
(including the midpoint) is discarded.
14 is greater than 11, so the top half of the list (including the midpoint) is discarded.
11/2 = 5.5, which rounds up to 6. Check at position 6.
The value held at position 6 is 11, a match. The search ends. A binary search in pseudocode
might look like this:
find = 11
found = False
length = list.length
lowerBound = 0
upperBound = length - 1
while found == False and lowerBound <= upperBound
    midpoint = int((lowerBound + upperBound) / 2)
    if list[midpoint] == find then
        print('Found at', midpoint)
        found = True
    else
        if list[midpoint] > find then
            upperBound = midpoint - 1
        else
            lowerBound = midpoint + 1
        endif
    endif
endwhile
if found == False then
    print('Not found')
endif
A binary search is a much more efficient algorithm than a linear search. In an ordered list of
every number from 0 to 100, a linear search would take 99 steps to find the value 99. A binary
search would only require seven steps.
However, a binary search can only work if a list is ordered.
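As an illustration, a minimal iterative version in C (a sketch; the list must already be sorted, and the example values are chosen for this illustration) could look like this:

#include <stdio.h>

// Iterative binary search over a sorted array; returns the index of
// 'find', or -1 if it is absent.
int binarySearch(int list[], int length, int find) {
    int lowerBound = 0;
    int upperBound = length - 1;
    while (lowerBound <= upperBound) {
        int midpoint = (lowerBound + upperBound) / 2;
        if (list[midpoint] == find)
            return midpoint;
        else if (list[midpoint] > find)
            upperBound = midpoint - 1;   // discard the top half
        else
            lowerBound = midpoint + 1;   // discard the bottom half
    }
    return -1;
}

int main(void) {
    int list[] = {1, 3, 5, 7, 9, 11, 14, 17, 20, 23, 25, 29};
    printf("Found at %d\n", binarySearch(list, 12, 11));  // prints: Found at 5
    return 0;
}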
The time complexity of the binary search algorithm is O(log n). The best-case time complexity
would be O(1) when the central index would directly match the desired value. The worst-case
scenario could be the values at either extremity of the list or values not in the list.
The space complexity of the binary search algorithm depends on the implementation of the
algorithm. There are two ways of implementing it:
Iterative method
Recursive method
The two methods are essentially the same, with two differences in implementation. First, there is
no loop in the recursive method. Second, rather than passing the new values to the next iteration
of the loop, it passes them to the next recursive call. In the iterative method, the iterations are
controlled through the looping condition, while in the recursive method, the maximum and
minimum bounds are used as the base condition.
In the iterative method, the space complexity is O(1), while in the recursive method, the space
complexity is O(log n) because of the call stack.
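For comparison, a recursive C sketch is shown below; each call halves the search range, so the call stack grows to O(log n) frames, which is exactly where the O(log n) space bound of the recursive method comes from:

// Recursive binary search on list[lowerBound..upperBound].
int binarySearchRec(int list[], int lowerBound, int upperBound, int find) {
    if (lowerBound > upperBound)
        return -1;                    // base case: empty range, not found
    int midpoint = (lowerBound + upperBound) / 2;
    if (list[midpoint] == find)
        return midpoint;
    if (list[midpoint] > find)        // recurse into the lower half
        return binarySearchRec(list, lowerBound, midpoint - 1, find);
    return binarySearchRec(list, midpoint + 1, upperBound, find);  // upper half
}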
Self-Assessment Exercise
1. Which type of lists or data sets are binary searching algorithms used for?
2. A binary search is to be performed on the list: [3 5 9 10 23]. How many comparisons
would it take to find number 9?
3. How many binary searches will it take to find the value 7 in the list [1,4,7,8,10,28]?
4. Given an array arr = {45,77,89,90,94,99,100} and key = 100, what are the mid values
(corresponding array elements) generated in the first and second iterations?
4.0 CONCLUSION
In computer science, divide and conquer is an algorithm design paradigm. A divide-and-conquer
algorithm recursively breaks down a problem into two or more sub-problems of the same or
related type, until these become simple enough to be solved directly.
A binary search algorithm is a widely used algorithm in the computational domain. It is a fast and
accurate search algorithm that can work well on both big and small datasets. A binary search
algorithm is a simple and reliable algorithm to implement. With time and space analysis, the
benefits of using this particular technique are evident.
5.0 SUMMARY
We looked at the meaning of Divide-and-Conquer algorithms and how they work and then
considered a very good example of a Divide-and-Conquer algorithm called Binary Search which
is very efficient for large lists as its worst case complexity is given in logarithmic time.
5. The following recursion computes x^n:
x^n = 1, if n = 0
x^n = (x^(n/2))^2, if n > 0 and n is even
x^n = x * (x^((n-1)/2))^2, if n > 0 and n is odd
Use this recursion to give a divide-and-conquer algorithm for computing x^n. Explain how your
algorithm meets the definition of “divide and conquer.”
6. What is the maximum number of comparisons required to find a value in a list of 20 items
using a binary search?
1.0 INTRODUCTION
Insertion sort is one of the simplest sorting algorithms because it inserts a single element into
the sorted portion of the list at each step. It is not the best sorting algorithm in terms of
performance, but it is slightly more efficient than selection sort and bubble sort in practical
scenarios. It is an intuitive sorting technique.
2.0 OBJECTIVES
By the end of this unit, you will be able to:
Know how Insertion sort and Linear search works
Understand the complexities of both Linear search and Insertion sort
Know the advantages and disadvantages of Linear search
Know the advantages and disadvantages of Insertion sort
Use the Linear Search and Insertion sort algorithms to write good programs in any
programming language of your choice.
Insertion sort is one of the simplest sorting algorithms. It is not the best in terms of performance,
but it is slightly more efficient than selection sort and bubble sort in practical scenarios. It is an intuitive
sorting technique. Let's consider the example of cards to have a better understanding of the logic
behind the insertion sort. Suppose we have a set of cards in our hand, such that we want to
arrange these cards in ascending order. To sort these cards, we have a number of intuitive ways.
One such thing we can do is initially we can hold all of the cards in our left hand, and we can
start taking cards one after other from the left hand, followed by building a sorted arrangement in
the right hand.
Assuming the first card to be already sorted, we select the next unsorted card. If the unsorted
card is greater than the cards already sorted, we place it to their right; otherwise, we place it to
the left at its correct position. At any stage during this whole process, the left hand holds the
unsorted cards, and the right hand holds the sorted ones.
In the same way, we will sort the rest of the unsorted cards by placing them in the correct
position. At each iteration, the insertion algorithm places an unsorted element at its right place.
1. We start by comparing the second element with the first element; if the second element is
smaller, we swap them, so that the first two elements are sorted.
2. Now, we will move on to the third element and compare it with the left-hand side
elements. If it is the smallest element, then we will place the third element at the first
index. Else if it is greater than the first element and smaller than the second element, then
we will interchange its position with the third element and place it after the first element.
After doing this, we will have our first three elements in a sorted manner.
3. Similarly, we will sort the rest of the elements and place them in their correct position.
Consider the following example of an unsorted array that we will sort with the help of the
Insertion Sort algorithm.
A = (41, 22, 63, 14, 55, 36)
Initially, the array is:
41 22 63 14 55 36
1st Iteration:
Set key = 22
Compare a1 with a0
22 41 63 14 55 36
2nd Iteration:
Set key = 63
Compare a2 with a1 and a0
Since a2 is greater than a1, no shift is needed:
22 41 63 14 55 36
3rd Iteration:
Set key = 14
Compare a3 with a2, a1 and a0
Since a3 is the smallest among all the elements on the left-hand side, place a3 at the beginning of
the array.
14 22 41 63 55 36
4th Iteration:
Set key = 55
Compare a4 with a3, a2, a1 and a0.
14 22 41 63 63 36
After 63 is shifted one place to the right, 55 is placed in its correct position:
14 22 41 55 63 36
5th Iteration:
Set key = 36
Compare a5 with a4, a3, a2, a1 and a0.
Since a5 is smaller than a4, a3 and a2 but greater than a1, we place the elements in their correct
positions:
14 22 36 41 55 63
Output:
14 22 36 41 55 63
In the worst case, the total number of comparisons made is (n-1) + (n-2) + (n-3) + … + 1 =
n(n-1)/2, which gives the O(n^2) worst-case running time of insertion sort.
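A compact C implementation of the procedure walked through above (a sketch, using the example array from this section) is:

#include <stdio.h>

// Insertion sort: at each iteration, the key a[i] is inserted into its
// correct place among the already-sorted elements a[0..i-1].
void insertionSort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {   // shift larger elements right
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;                  // place the key in its position
    }
}

int main(void) {
    int a[] = {41, 22, 63, 14, 55, 36};
    insertionSort(a, 6);
    for (int i = 0; i < 6; i++)
        printf("%d ", a[i]);             // prints: 14 22 36 41 55 63
    return 0;
}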
Self-Assessment Exercise:
1. What is the worst case time complexity of insertion sort where position of the data to be
inserted is calculated using binary search?
2. Consider an array of elements arr[5] = {5,4,3,2,1}. Show the steps of the insertions done
while performing insertion sort on the array.
3. How many passes does an insertion sort algorithm consist of?
4. What is the average case running time of an insertion sort algorithm?
A linear search checks each position of a list in turn. Consider a list whose first three values are
3, 5 and 2, and suppose we were to search for the value 2. The search would start at position 0
and check the value held there, in this case 3. 3 does not match 2, so we move on to the next position.
The value at position 1 is 5. 5 does not match 2, so we move on to the next position.
The value at position 2 is 2 - a match. The search ends.
A linear search in pseudocode might look like this:
find = 2
found = False
length = list.length
counter = 0
while found == False and counter < length
    if list[counter] == find then
        found = True
        print('Found at position', counter)
    else
        counter = counter + 1
    endif
endwhile
if found == False then
    print('Not found')
endif
A linear search, although simple, can be quite inefficient. Suppose the data set contained 100
items of data, and the item searched for happens to be the last item in the set. All of the previous
99 items would have to be searched through first.
However, linear searches have the advantage that they will work on any data set, whether it is
ordered or unordered.
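A minimal C sketch of linear search (returning the first matching position, or -1 when the item is absent) is:

// Linear search: check each position in turn; works on ordered or
// unordered data.
int linearSearch(int list[], int length, int find) {
    for (int counter = 0; counter < length; counter++) {
        if (list[counter] == find)
            return counter;
    }
    return -1;
}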
Self-Assessment Exercises
1. Given a list of numbers 12, 45, 23, 7, 9, 10, 22, 87, 45, 23, 34, 56
a. Use the linear search algorithm to search for the number 10
b. Comment on the worst-case running time of your algorithm
4.0 CONCLUSION
The Insertion sort is a simple sorting algorithm that builds the final sorted array one item at a
time. It is much less efficient on large lists than more advanced algorithms such as quicksort, or
merge sort while a linear search or sequential search is a method for finding an element within a
list. It sequentially checks each element of the list until a match is found or the whole list has
been searched
5.0 SUMMARY
We examined the Insertion sort algorithm and how it can be used to sort or arrange a list in any
order while at the same time noting its complexity, advantages and disadvantages. A Linear
Search algorithm which is also known as Sequential search is used in finding a given element in
a list and returns a positive answer once the element is located else it returns a negative answer.
Linear search is very efficient for searching an item within a small-sized list.
1.0 INTRODUCTION
Radix sort is a non-comparative sorting algorithm: instead of comparing items against each
other, it distributes them into buckets according to their individual digits, processing one digit
position at a time. It relies on a stable sort (such as counting sort) for each digit position and is
efficient in practice when the keys are short relative to the number of items.
2.0 OBJECTIVES
By the end of this unit, you will be able to:
Know how the Radix sort algorithm works and its complexity
Understand stability in sorting algorithms
Differentiate between stable and unstable sorting algorithms.
Example: The first column is the input. The remaining columns show the list after successive
sorts on increasingly significant digit positions.
The arrows indicate the flow from each list to the next, and the bracketed digits mark the digit
position sorted on to produce each list from the previous one.
576 49[4] 9[5]4 [1]76 176
494 19[4] 5[7]6 [1]94 194
194 95[4] 1[7]6 [2]78 278
296 → 57[6] → 2[7]8 → [2]96 → 296
278 29[6] 4[9]4 [4]94 494
176 17[6] 1[9]4 [5]76 576
954 27[8] 2[9]6 [9]54 954
In the average case, we consider the distribution of the number of digits. There are d passes,
and each digit can take up to b possible values. Radix sort's behaviour does not depend on the
order of the input sequence, so we may treat n as a constant.
The running time of radix sort is T(n) = d(n + b). Taking expectations of both sides and using
linearity of expectation, the average-case time complexity of radix sort is O(d(n + b)).
Space Complexity
In this algorithm, we have two auxiliary arrays cnt of size b (base) and tempArray of size n
(number of elements), and an input array arr of size n.
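Putting the pieces together, a C sketch of radix sort using the auxiliary arrays cnt and tempArray described above (base b = 10; the function names and the reuse of this unit's example list are choices made for this illustration):

#include <stdio.h>
#include <string.h>

#define N 7
#define B 10   // the base b

// One stable counting-sort pass on the digit with place value 'exp'
// (exp = 1 for units, 10 for tens, and so on).
void countingPass(int arr[], int n, int exp) {
    int tempArray[N];
    int cnt[B];
    memset(cnt, 0, sizeof(cnt));
    for (int i = 0; i < n; i++)          // count occurrences of each digit
        cnt[(arr[i] / exp) % B]++;
    for (int d = 1; d < B; d++)          // prefix sums give final positions
        cnt[d] += cnt[d - 1];
    for (int i = n - 1; i >= 0; i--)     // place from the back: stability
        tempArray[--cnt[(arr[i] / exp) % B]] = arr[i];
    for (int i = 0; i < n; i++)
        arr[i] = tempArray[i];
}

// d passes, one per digit position of the largest key: O(d(n + b)) time.
void radixSort(int arr[], int n, int maxValue) {
    for (int exp = 1; maxValue / exp > 0; exp *= B)
        countingPass(arr, n, exp);
}

int main(void) {
    int arr[N] = {576, 494, 194, 296, 278, 176, 954};
    radixSort(arr, N, 954);
    for (int i = 0; i < N; i++)
        printf("%d ", arr[i]);           // prints: 176 194 278 296 494 576 954
    return 0;
}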
Self-Assessment Exercises
1. If we use Radix Sort to sort n integers in the range (n^(k/2), n^k], for some k > 0 which is
independent of n, the time taken would be?
2. The maximum number of comparisons needed to sort 9 items using radix sort is? (assume
each item is 5 digit octal number):
3. Sort the following list in descending order using the Radix sort algorithm
Stability is important to preserve order over multiple sorts on the same data set. For example, say
that student records consisting of name and class section are sorted, first by name, then by class
section. If a stable sorting algorithm is used in both cases, the sort-by-class-section operation will
not change the name order; with an unstable sort, it could be that sorting by section shuffles the
name order, resulting in a nonalphabetical list of students.
More formally, the data being sorted can be represented as a record or tuple of values, and the
part of the data that is used for sorting is called the key. In the card example, cards are
represented as a record (rank, suit), and the key is the rank. A sorting algorithm is stable if
records with equal keys appear in the sorted output in the same order as they appear in the input.
When equal elements are indistinguishable, such as with integers, or more generally, any data
where the entire element is the key, stability is not an issue. Stability is also not an issue if all
keys are different.
An example of stable sort on playing cards. When the cards are sorted by rank with a stable sort,
the two 5s must remain in the same order in the sorted output that they were originally in. When
they are sorted with a non-stable sort, the 5s may end up in the opposite order in the sorted
output.
Unstable sorting algorithms can be specially implemented to be stable. One way of doing this is
to artificially extend the key comparison, so that comparisons between two objects with
otherwise equal keys are decided using the order of the entries in the original input list as a
tiebreaker.
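For example, a C sketch of this key-extension idea uses the standard qsort function on records that remember their original position (the struct and function names are chosen for this illustration):

#include <stdlib.h>

// A record whose key may repeat; 'index' remembers its original position.
struct Record {
    int key;
    int index;
};

// Compare by key; break ties using the original input order, so even an
// unstable comparison sort produces a stable result.
int stableCompare(const void *a, const void *b) {
    const struct Record *ra = a, *rb = b;
    if (ra->key != rb->key)
        return (ra->key > rb->key) - (ra->key < rb->key);
    return (ra->index > rb->index) - (ra->index < rb->index);
}

// Usage: set records[i].index = i for each i, then call
// qsort(records, n, sizeof(struct Record), stableCompare);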
Remembering this order, however, may require additional time and space. One application for
stable sorting algorithms is sorting a list using a primary and secondary key. For example,
suppose we wish to sort a deck of cards so that, within each suit, the cards are in order of rank.
This can be done by first sorting the cards by rank (using any sort), and then doing a stable sort
by suit:
Within each suit, the stable sort preserves the ordering by rank that was already done. This idea
can be extended to any number of keys and is utilized by radix sort. The same effect can be
achieved with an unstable sort by using a lexicographic key comparison, which, e.g., compares
first by suit, and then compares by rank if the suits are the same.
Suppose you need to sort key-value pairs in the increasing order of keys, and the input contains
the pair (4,5) before the pair (4,3). There are two possible outputs for the two pairs whose key is
the same, i.e. (4,5) and (4,3), as shown below:
The sorting algorithm which produces the first output is known as a stable sorting algorithm,
because the original order of equal keys is maintained; you can see that (4,5) comes before (4,3)
in the sorted order, which was the original order, i.e. in the given input, (4,5) comes before (4,3).
On the other hand, the algorithm which produces the second output will be known as an unstable
sorting algorithm, because the order of objects with the same key is not maintained in the sorted order.
You can see that in the second output, the (4,3) comes before (4,5) which was not the case in the
original input.
Self-Assessment Exercise
1. Can any unstable sorting algorithm be altered to become stable? If so, how?
2. What is the use of differentiating algorithms on the basis of stability?
3. When is it definitely unnecessary to look at the nature of stability of a sorting algorithm?
4. What are some stable sorting techniques?
5. What properties of sorting algorithms are most likely to get affected when a typically
unstable sorting algorithm is implemented to be stable?
4.0 CONCLUSION
In computer science, radix sort is a non-comparative sorting algorithm. It avoids comparison by
creating and distributing elements into buckets according to their radix. Stable sorting algorithms
on the other hand maintain the relative order of records with equal keys (i.e. values). That is, a
sorting algorithm is stable if whenever there are two records R and S with the same key and with
R appearing before S in the original list, R will appear before S in the sorted list.
5.0 SUMMARY
We considered another good example of a sorting algorithm known as Radix sort which,
perhaps without our realising it, resembles one of the commonest methods we use when sorting
items in everyday life. On the other hand, we looked at stability in sorting algorithms and how to
identify stable and unstable sorting algorithms.
1.0 Introduction
Hash Table is a data structure which stores data in an associative manner. In a hash table, data is
stored in an array format, where each data value has its own unique index value. Access of data
becomes very fast if we know the index of the desired data.
Thus, it becomes a data structure in which insertion and search operations are very fast
irrespective of the size of the data. Hash Table uses an array as a storage medium and uses hash
technique to generate an index where an element is to be inserted or is to be located from.
2.0 OBJECTIVES
By the end of this unit, you should be able to:
Know what a Hash Table is and how it stores data
Understand how a hash function maps keys to array indexes
Know how collisions arise and how they can be handled, e.g. by chaining or linear probing.
The concept of a key enables this fast, index-based access to happen. Each element e∈C can be
mapped to a key value k=key(e) such that if ei=ej then key(ei)=key(ej). (Note that the reverse is
not necessarily true, since key values do not have to be unique.) A hash function h=hash(e) uses
the key value key(e) to determine the bin A[h] into which to insert e, where 0≤h<b. Once the
hash table A is constructed, then searching for an item t is transformed into a search for t within
A[h] where h=hash(t).
The general pattern for hash-based searching is shown in Figure 5-4 with a small example. The
components of HASH-BASED SEARCH are:
• The universe U that defines the set of possible keys. Each element e∈C maps to a key k∈U.
• The hash table, A, which stores the n elements from the original collection C. A may contain
just the elements themselves or it may contain the key values of the elements. There are b
locations within A.
• The hash function, hash, which computes an integer index h into the hash table using key(e),
where 0≤h<b.
There are two main concerns when implementing HASH-BASED SEARCH: the design of the
hash function, and how to handle collisions (when two keys map to the same bin in A).
Collisions occur in almost all cases where b<<|U|, that is, when b is much smaller than the
number of potential keys that exist in the universe U. Note that if b<n there won’t be enough
space in the hash table A
to store all of the n elements from the original collection. When this happens, it is common for A
to store a set of elements (typically in a linked list) in each of its b bins, as shown in options
“store elements” and “store keys” in Figure 5-4.*
Improper hash function design can lead to poor distribution of the keys in the primary storage. A
poorly designed hash function has two consequences: many slots in the hash table may be
unused—wasting space—and there will be many collisions where the keys map into the same
slot, which worsens performance.
3.1.1 Input/Output
To search for a target item t, it must have one or more properties that can be used as a key k;
these keys determine the universe U. Unlike BINARY SEARCH, the original collection C does
not need to be ordered. Indeed, even if the elements in C were ordered in some way, the hashing
method that inserts elements into the hash table A does not attempt to replicate this ordering
within A. (Figure 5-4. General approach to hashing. Note: alternatively, if the elements
themselves are stored directly in A, as shown by the “store elements, no lists” option in Figure
5-4, then you need to deal with collisions; otherwise, elements inserted into the hash table may
be lost.)
The input to HASH-BASED SEARCH is the computed hash table, A, and the target element t
being sought. The algorithm returns true if t exists in the linked list stored by A[h] where
h=hash(t). If A[h] is empty or t does not exist within the linked list stored by A[h], then false is
returned to indicate that t is not present in A (and by implication, it does not exist in C). The
pseudocode in Figure 5-3 shows the simplified version where A stores lists containing the
elements themselves as key values.
3.1.2 Assumptions
The variable n represents the number of elements in the original collection C and b represents the
number of bins in the indexed collection, A.
3.1.3 Context
Suppose we are creating a text editor and want to add a component that will check the spelling of
words as the user types. There are several free word lists available that can be downloaded from
the Internet (see http://www.wordlist.com).
Performance is essential for this application. We must be able to search the word list quickly, or
the program will be unusable for even the slowest typists. We will probably check word spellings
in a separate thread that must keep up with the changes to the document or file. (Note that
hash-based searches require the complete word before the search can begin. With a binary
search approach, one may search for the word incrementally as it is being typed, but this
introduces additional complexity to the program.)
We know from the earlier section “Binary Search” that we can expect about 18 string
comparisons on average if we use BINARY SEARCH, since log2(200,000) = 17.61. String
comparisons can be expensive, even when we optimize the code; there is a loop that compares bytes.
Figure 5-5. Storing strings for hash-based searching
Sometimes we write these loops by hand in an assembly language to ensure that we optimize for
the specific architecture, such as making sure that we don’t stall the instruction pipeline in the
more common cases and unroll the loop to fill the pipeline when possible. One goal, therefore, is
to minimize the number of string comparisons.
We first need to define a function to compute the key for a string, s, in the word list. One goal for
the key function is to produce as many different values as possible but it is not required for the
values to all be unique. A popular technique is to produce a value based on each piece of
information from the original string:
key(s) = s[0]*31^(len-1) + s[1]*31^(len-2) + ... + s[len-1], where s[i] is the ith character (as a
value between 0 and 255) and len is the length of the string s. Computing this function is simple,
as the sketch below shows.
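The Java code referred to is not reproduced here; a C sketch of the same computation (without the caching discussed next) is:

// key(s) = s[0]*31^(len-1) + s[1]*31^(len-2) + ... + s[len-1],
// computed by Horner's rule in a single pass over the string.
unsigned int key(const char *s) {
    unsigned int h = 0;
    for (int i = 0; s[i] != '\0'; i++)
        h = 31 * h + (unsigned char) s[i];
    return h;
}

// hash(s) = key(s) % b, where b is the number of bins in the table.
unsigned int hash(const char *s, unsigned int b) {
    return key(s) % b;
}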
Because this hashCode method tries to be efficient, it caches the value of the computed hash to
avoid re-computation (i.e., it computes the value only if hash is 0). Next, we construct the hash
table. We have n strings but what should be the size of the hash table A? In an ideal case, A could
be b=n bins where the hash function is a one-to-one function from the set of strings in the word
collection onto the integers [0,n). This does not occur in the normal case, so we instead try to
have a hash table that has as few empty bins as possible. If our hash function distributes the keys
evenly, we can achieve reasonable success by selecting an array size approximately as large as
the collection. We define hash(s)=key(s)%b, where % is the modulo operator that returns the
remainder when dividing key(s) by b.
The advanced reader should, at this point, question the use of a basic hash function and hash
table for this problem. Since the word list is static, we can do better by creating a perfect hash
function. A perfect hash function is one that guarantees no collisions for a specific set of keys. In
this case a perfect hash function can be used; this is discussed in the upcoming “Variations”
section. Let’s first try to solve the problem without one.
For our first try at this problem, we choose a primary array A that will hold b = 2^18 - 1 = 262,143
elements. Our word list contains 213,557 words. If our hash function perfectly distributed the
keys, each bin would contain at most one word.
You may be surprised that this is quite a good hashing function and finding one with better
distribution will require a more complex scheme. For the record, there were only five pairs of
strings with identical key values (for example, both “hypoplankton” and “unheavenly” have a
computed key value of 427,589,249)!
Table 5-4. Hash distribution using the Java String.hashCode() method as key with b=262,143
Finally, we need to decide on a strategy for handling collisions. One approach is to store a
pointer to a list of entries in each slot of the primary hash table, rather than a single object. This
approach, shown in Figure 5-6, is called chaining.
The overhead with this approach is that you have either a list or the value nil (representing “no
list”) in each slot. When a slot has only a single search item, it must be accessed using the list
capability. For our first approximation of a solution, we start with this approach and refine it if
the performance becomes an issue.
Hashing is a technique to convert a range of key values into a range of indexes of an array. We
are going to use the modulo operator to get a range of key values. Consider an example of a hash
table of size 20, in which the following items are to be stored. Items are in the (key, value) format.
(1,20)
(2,70)
(42,80)
(4,25)
(12,44)
(14,32)
(17,11)
(13,78)
(37,98)
Sr.No. Key Hash Array Index
1 1 1 % 20 = 1 1
2 2 2 % 20 = 2 2
3 42 42 % 20 = 2 2
4 4 4 % 20 = 4 4
5 12 12 % 20 = 12 12
6 14 14 % 20 = 14 14
7 17 17 % 20 = 17 17
8 13 13 % 20 = 13 13
9 37 37 % 20 = 17 17
As we can see, it may happen that the hashing technique is used to create an already used index
of the array. In such a case, we can search the next empty location in the array by looking into
the next cell until we find an empty cell. This technique is called linear probing.
3.3.1 DataItem
Define a data item having some data and key, based on which the search is to be conducted in a
hash table.
struct DataItem {
int data;
int key;
};
Define a hashing method to compute the hash code of the key of the data item.
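A minimal sketch of such a method, assuming the table size of 20 used in the example above, is:

#define SIZE 20

// Compute the hash code of a key: the remainder of the key divided by
// the size of the table.
int hashCode(int key) {
    return key % SIZE;
}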
Whenever an element is to be searched, compute the hash code of the key passed and locate the
element using that hash code as index in the array. Use linear probing to get the element ahead if
the element is not found at the computed hash code.
Example
struct DataItem *search(int key) {
   //get the hash
   int hashIndex = hashCode(key);
   //move in the array until an empty cell is found
   while(hashArray[hashIndex] != NULL) {
      if(hashArray[hashIndex]->key == key)
         return hashArray[hashIndex];
      //go to the next cell, wrapping around the table
      hashIndex = (hashIndex + 1) % SIZE;
   }
   return NULL;
}
Whenever an element is to be inserted, compute the hash code of the key passed and locate the
index using that hash code as an index in the array. Use linear probing to find the next empty
location if an element is already present at the computed hash code.
Example
void insert(int key,int data) {
   struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem));
   item->data = data;
   item->key = key;
   //get the hash
   int hashIndex = hashCode(key);
   //move in the array until an empty or deleted cell (dummy key -1) is found
   while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1) {
      hashIndex = (hashIndex + 1) % SIZE;
   }
   hashArray[hashIndex] = item;
}
Whenever an element is to be deleted, compute the hash code of the key passed, locate the index
using that hash code, and search for the element using linear probing. When found, replace it
with a dummy item so that the probe chain for later searches remains intact.
Example
struct DataItem* delete(struct DataItem* item) {
   int key = item->key;
   //get the hash
   int hashIndex = hashCode(key);
   //move in the array until an empty cell is found
   while(hashArray[hashIndex] != NULL) {
      if(hashArray[hashIndex]->key == key) {
         struct DataItem* temp = hashArray[hashIndex];
         //assign a dummy item at the deleted position
         hashArray[hashIndex] = dummyItem;
         return temp;
      }
      //go to the next cell, wrapping around the table
      hashIndex = (hashIndex + 1) % SIZE;
   }
   return NULL;
}
Self-Assessment Exercise
4.0 CONCLUSION
A hash table stores data in an associative manner: a hash function maps each key to an index in
an array, so insertion and search can be performed very quickly irrespective of the size of the
data. The performance of hash-based searching depends on the design of the hash function and
on how collisions are handled, for example by chaining or by linear probing.
5.0 SUMMARY
We examined the Hash Table data structure and hash-based searching, including keys, hash
functions and the two main implementation concerns: hash function design and collision
handling. We also looked at linear probing, together with example routines for searching,
inserting and deleting items in a hash table.
2.0 OBJECTIVES
By the end of this unit, you should be able to:
Know some of the techniques for representing graphs
Identify how to analyse a graph and its attributes
Know some benefits of graphs
3.0 Overview
Graphs are fundamental structures used in computer science to represent complex structured
information. The images in Figure 6-1 are all sample graphs.
In this chapter we investigate common ways to represent graphs and some associated algorithms
that occur frequently. Inherently, a graph contains a set of elements, known as vertices, and
relationships between pairs of these elements, known as edges. In this chapter we consider only
simple graphs that avoid (a) self-edges from a vertex to itself, and (b) multiple edges between the
same pair of vertices.
What is a Graph?
A graph G = (V, E) is defined by a set of vertices, V, and a set of edges, E, over pairs of these
vertices. There are distinct types of graphs that occur commonly in algorithms:
Undirected graphs
Model relationships between vertices (u,v) without caring about the direction of the relationship.
These graphs are useful for capturing symmetric information. For example, a road from town A
to town B can be traversed in either direction.
Directed graphs
Model relationships between vertices (u,v) that are distinct from, say, the relationship between
(v,u), which may or may not exist. For example, a program to provide driving directions must
store information on one-way streets to avoid giving illegal directions.
Weighted graphs
Model relationships where there is a numeric value known as a weight associated with the
relationship between vertices (u,v). Sometimes these values can store arbitrary non-numeric
information. For example, the edge between towns A and B could store the mileage between the
towns; alternatively, it could store estimated traveling time in minutes.
Hyper graphs
Model relationships that involve more than two vertices at a time; a single hyperedge may
connect any number of vertices.
There are two standard data structures to store such a graph; both data structures explicitly store
the weights and implicitly store the directed edges. One could store the graph as n adjacency
lists, as shown in Figure 6-3, where each vertex vi maintains a linked list of nodes, each of which
stores the weight of the edge leading to an adjacent vertex of vi. Thus the base structure is a one-
dimensional array of vertices in the graph. Adding an edge requires additional processing to
ensure that no duplicate edges are added.
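A minimal C sketch of this adjacency-list structure (the names EdgeNode, adj and addEdge, and the graph size, are choices made for this illustration) is:

#include <stdlib.h>

#define NUM_VERTICES 6

// A node in vertex u's linked list: a directed edge u -> vertex with a weight.
struct EdgeNode {
    int vertex;
    int weight;
    struct EdgeNode *next;
};

// The base structure: a one-dimensional array of list heads, one per vertex.
struct EdgeNode *adj[NUM_VERTICES];

// Add a directed, weighted edge u -> v (duplicate-edge checking omitted).
void addEdge(int u, int v, int weight) {
    struct EdgeNode *node = malloc(sizeof(struct EdgeNode));
    node->vertex = v;
    node->weight = weight;
    node->next = adj[u];   // push onto the front of u's list
    adj[u] = node;
}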
We can use adjacency lists and matrices to store undirected graphs as well. Consider the
undirected graph in Figure 6-5. We use the notation <v0, v1, …, vk–1> to describe a path of k
vertices in a graph that traverses k–1 edges (vi,vi+1) for 0≤i<k–1; paths in a directed graph honor
the direction of the edge. In Figure 6-5, the path <v3, v1, v5, v4> is valid. In this graph there is a
cycle, which is a path of vertices that includes the same vertex multiple times. A cycle is
typically represented in its most minimal form.
In Figure 6-5, a cycle exists in the path <v3, v1, v5, v4, v2, v1, v5, v4, v2>, and this cycle is best
represented by the notation <v1, v5, v4, v2, v1>. Note that in the directed, weighted graph in
Figure 6-2, there is a cycle < v3, v5, v3>.
When using an adjacency list to store an undirected graph, the same edge (u,v) appears twice—
once in the linked list of neighbor vertices for u and once for v.
A graph is a pictorial representation of a set of objects where some pairs of objects are connected
by links. The interconnected objects are represented by points termed as vertices, and the links
that connect the vertices are called edges.
Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges,
connecting the pairs of vertices. Take a look at the following graph −
V = {a, b, c, d, e}
Mathematical graphs can be represented in data structure. We can represent a graph using an
array of vertices and a two-dimensional array of edges. Before we proceed further, let's
familiarize ourselves with some important terms −
Vertex − Each node of the graph is represented as a vertex. In the example, the labeled
circles represent vertices; thus, A to G are vertices. We can represent them using an array,
where A is identified by index 0, B by index 1, and so on.
Edge − An edge represents a path or line between two vertices. In the example, the lines
from A to B, B to C, and so on represent edges. We can use a two-dimensional array to
represent the edges: AB is represented as a 1 at row 0, column 1, BC as a 1 at row 1,
column 2, and so on, keeping the other combinations as 0.
Adjacency − Two nodes or vertices are adjacent if they are connected to each other
through an edge. In the example, B is adjacent to A, C is adjacent to B, and so on.
Path − A path represents a sequence of edges between two vertices. In the example,
ABCD represents a path from A to D.
We shall now learn about traversing a graph.
Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to
remember to get the next vertex to start a search, when a dead end occurs in any iteration.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it in a
stack.
Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack. (It will pop up all
the vertices from the stack, which do not have adjacent vertices.)
Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
Step 1 − Initialize the stack.
As C does not have any unvisited adjacent node so we keep popping the stack until we find a
node that has an unvisited adjacent node. In this case, there's none and we keep popping until the
stack is empty.
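A minimal C sketch of DFS following the three rules above (the adjacency matrix adjMatrix, the graph size MAX and the sample edges are assumptions made for this illustration):

#include <stdio.h>

#define MAX 5

int adjMatrix[MAX][MAX];
int visited[MAX];
int stack[MAX];
int top = -1;

// Return an unvisited vertex adjacent to v, or -1 if there is none.
int unvisitedAdjacent(int v) {
    for (int i = 0; i < MAX; i++)
        if (adjMatrix[v][i] == 1 && !visited[i])
            return i;
    return -1;
}

void dfs(int start) {
    visited[start] = 1;
    printf("%d ", start);          // Rule 1: visit, mark, display...
    stack[++top] = start;          // ...and push it on the stack
    while (top >= 0) {
        int next = unvisitedAdjacent(stack[top]);
        if (next == -1)
            top--;                 // Rule 2: dead end, pop the stack
        else {
            visited[next] = 1;     // Rule 1 for the newly found vertex
            printf("%d ", next);
            stack[++top] = next;
        }
    }                              // Rule 3: repeat until the stack is empty
}

int main(void) {
    int edges[4][2] = {{0,1},{0,2},{1,3},{2,4}};   // a small undirected graph
    for (int i = 0; i < 4; i++) {
        adjMatrix[edges[i][0]][edges[i][1]] = 1;
        adjMatrix[edges[i][1]][edges[i][0]] = 1;
    }
    dfs(0);                        // prints: 0 1 3 2 4
    return 0;
}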
Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a
queue to remember to get the next vertex to start a search, when a dead end occurs in any
iteration.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a
queue.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.
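A matching C sketch of BFS is shown below. It reuses adjMatrix, visited and unvisitedAdjacent from the DFS sketch above (with visited[] reset to zero beforehand); the array-based queue is an assumption for this illustration:

int queue[MAX];
int front = 0, rear = 0;

void bfs(int start) {
    visited[start] = 1;
    printf("%d ", start);              // Rule 1: visit, mark, display...
    queue[rear++] = start;             // ...and insert it in the queue
    while (front < rear) {
        int next = unvisitedAdjacent(queue[front]);
        if (next == -1)
            front++;                   // Rule 2: remove the first vertex
        else {
            visited[next] = 1;         // Rule 1 for the newly found vertex
            printf("%d ", next);
            queue[rear++] = next;
        }
    }                                  // Rule 3: repeat until queue is empty
}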
Step 1 − Initialize the queue.
Step 2 − We start by visiting S (the starting node) and mark it as visited.
At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm we keep
on dequeuing in order to get all unvisited nodes. When the queue gets emptied, the program is
over.
3.5 Tree
Tree represents the nodes connected by edges. We will discuss binary tree or binary search tree
specifically.
Binary Tree is a special data structure used for data storage purposes. A binary tree has a special
condition that each node can have a maximum of two children. A binary tree has the benefits of
both an ordered array and a linked list, as search is as quick as in a sorted array and insertion or
deletion operations are as fast as in a linked list. The important terms with respect to a tree are:
Path − Path refers to the sequence of nodes along the edges of a tree.
Root − The node at the top of the tree is called root. There is only one root per tree and
one path from the root node to any node.
Parent − Any node except the root node has one edge upward to a node called parent.
Child − The node below a given node connected by its edge downward is called its child
node.
Leaf − The node which does not have any child node is called the leaf node.
Subtree − Subtree represents the descendants of a node.
Visiting − Visiting refers to checking the value of a node when control is on the node.
Traversing − Traversing means passing through nodes in a specific order.
Levels − Level of a node represents the generation of a node. If the root node is at level 0,
then its next child node is at level 1, its grandchild is at level 2, and so on.
keys − Key represents a value of a node based on which a search operation is to be carried
out for a node.
Binary Search tree exhibits a special behavior. A node's left child must have a value less than its
parent's value and the node's right child must have a value greater than its parent value.
We're going to implement the tree using node objects, connecting them through references.
The code for a tree node would be similar to what is given below. It has a data part and
references to its left and right child nodes.
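A sketch of such a node in C is:

struct node {
    int data;                  // the data part
    struct node *leftChild;    // reference to the left child
    struct node *rightChild;   // reference to the right child
};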
The basic operations that can be performed on a binary search tree data structure are the
following −
Insert − Inserts an element into the tree.
Search − Searches for an element in the tree.
Traversal (preorder, inorder, postorder) − Visits the nodes in a specific order.
We shall learn about creating (inserting into) a tree structure and searching for a data item in a
tree in this chapter. We shall learn about tree traversing methods in the coming chapter.
The very first insertion creates the tree. Afterwards, whenever an element is to be inserted, first
locate its proper location. Start searching from the root node, then if the data is less than the key
value, search for the empty location in the left subtree and insert the data. Otherwise, search for
the empty location in the right subtree and insert the data.
Algorithm
If root is NULL
   then create root node and return
else locate the insertion position:
   If data is greater than node.data, goto the right subtree,
   else goto the left subtree, repeating until an empty location is found;
   insert data there
end If
Implementation
void insert(int data) {
   struct node *tempNode = (struct node*) malloc(sizeof(struct node));
   struct node *current, *parent;
   tempNode->data = data;
   tempNode->leftChild = NULL;
   tempNode->rightChild = NULL;
   if(root == NULL) { root = tempNode; return; } //empty tree: new node is root
   current = root;
   while(1) {
      parent = current;
      current = (data < parent->data) ? parent->leftChild : parent->rightChild;
      if(current == NULL) { //empty location found: insert here
         if(data < parent->data) parent->leftChild = tempNode;
         else parent->rightChild = tempNode;
         return;
      }
   }
}
Algorithm
If root.data is equal to search.data
   return root
else
   while data not found
      If data is greater than node.data, goto the right subtree
      else goto the left subtree
      If data found
         return node
   endwhile
   return data not found
end if
Implementation
struct node* search(int data) {
   struct node *current = root;
   if(current == NULL)
      return NULL; //empty tree
   while(current->data != data) {
      if(current != NULL)
         printf("%d ",current->data);
      //go to the left or right subtree depending on the data
      if(current->data > data)
         current = current->leftChild;
      else
         current = current->rightChild;
      //not found
      if(current == NULL) {
         return NULL;
      }
   }
   return current;
}
A spanning tree is a subset of Graph G which has all the vertices covered with the minimum
possible number of edges. Hence, a spanning tree does not have cycles and it cannot be
disconnected.
By this definition, we can draw a conclusion that every connected and undirected Graph G has at
least one spanning tree. A disconnected graph does not have any spanning tree, as it cannot be
spanned to all its vertices.
We now understand that one graph can have more than one spanning tree. Following are a few
properties of the spanning tree connected to graph G −
Spanning tree has n-1 edges, where n is the number of nodes (vertices).
Thus, we can conclude that spanning trees are a subset of connected Graph G and disconnected
graphs do not have spanning tree.
Spanning trees are basically used to find a minimum path to connect all nodes in a graph. Let us
understand a common application through a small example. Consider a city network as a huge
graph and suppose we now plan to deploy telephone lines in such a way that, with the minimum
number of lines, we can connect all the city nodes. This is where the spanning tree comes into
the picture.
In a weighted graph, a minimum spanning tree is a spanning tree that has less weight than any
other spanning tree of the same graph. In real-world situations, this weight can be measured as
distance, congestion, traffic load or any arbitrary value assigned to the edges.
The two most important minimum spanning tree algorithms are:
Kruskal's Algorithm
Prim's Algorithm
Self-Assessment Exercise
1. What is a Graph G?
2. Analyse the steps involved in designing a graph.
4.0 CONCLUSION
A graph is a structure consisting of a set of vertices and a set of edges connecting pairs of
vertices, and it can be represented using adjacency lists or an adjacency matrix. Graphs can be
traversed depth-first using a stack or breadth-first using a queue, and a connected graph can be
reduced to a spanning tree, the minimum-weight version of which is found by Kruskal's or
Prim's algorithm.
5.0 SUMMARY
In this unit, we looked at graphs and their representations, traversal by Depth First Search and
Breadth First Search, trees and binary search trees, and spanning trees, including the minimum
spanning tree of a weighted graph.
1.0 Introduction
2.0 Objectives
3.0 Binary Search Trees
3.0.1 Binary Search Tree Property
3.1 Traversal in Binary Search Trees
3.1.1 Inorder Tree Walk
1.0 INTRODUCTION
We introduce here a special search tree called the Binary Search Tree and a derivative of it
known as the Red-Black Tree. A binary search tree, also known as an ordered binary tree, is a
binary tree wherein the nodes are arranged in an order.
The order is:
a) All the values in the left sub-tree are less than the value of the root node.
b) All the values in the right sub-tree are greater than the value of the root node.
On the other hand, a red-black tree is a binary tree where each node has a color, either red or
black, as an extra attribute. By checking the node colors on any simple path from the root to a
leaf, red-black trees ensure that no such path is more than twice as long as any other, so that the
tree is approximately balanced.
2.0 OBJECTIVES
By the end of this unit, you should be able to:
Understand the Binary Search Tree property and how such trees are traversed
Know how to search, insert into and delete from a Binary Search Tree
Understand Red-Black Trees and the operations that can be carried out on them
POSTORDER-TREE-WALK (x):
1. If x ≠ NIL.
2. then POSTORDER-TREE-WALK (left [x]).
3. POSTORDER-TREE-WALK (right [x]).
4. print key [x]
TREE-SEARCH (x, k)
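The body of this procedure is not reproduced in the text; the standard recursive version (following CLRS) is:
1. if x = NIL or k = key [x]
2. then return x
3. if k < key [x]
4. then return TREE-SEARCH (left [x], k)
5. else return TREE-SEARCH (right [x], k)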
TREE-MAXIMUM (x)
1. While right [x] ≠ NIL
2. do x ← right [x].
3. return x.
TREE-INSERT (T, z)
1. y ←NIL.
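The remaining lines of this procedure are not reproduced in the text; the standard version (following CLRS, and consistent with the worked example below) continues:
2. x ← root [T]
3. while x ≠ NIL
4. do y ← x
5. if key [z] < key [x]
6. then x ← left [x]
7. else x ← right [x]
8. p [z] ← y
9. if y = NIL
10. then root [T] ← z
11. else if key [z] < key [y]
12. then left [y] ← z
13. else right [y] ← z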
For Example: Working of TREE-INSERT
Suppose we want to insert an item with key 13 into a Binary Search Tree.
x ← 1 (the root)
y ← 1, as x ≠ NIL.
Is key [z] < key [x]?
13 < 12 is false, so we go right:
x ← right [x]
x ← 3
Again x ≠ NIL
y ←3
key [z] < key [x]
13 < 18
x←left [x]
x←6
Again x ≠ NIL, y←6
13 < 15
x←left [x]
x←NIL
p [z]←6
Now our node z will be either left or right child of its parent (y).
key [z] < key [y]
13 < 15
Left [y] ← z
Left [6] ← z
So, insert a node in the left of node index at 6.
The Procedure runs in O (h) time on a tree of height h. For Example: Deleting a node z from a
binary search tree. Node z may be the root, a left child of node q, or a right child of q.
Node z has two children; its left child is node l, its right child is its successor y, and y's right
child is node x. We replace z by y, updating y's left child to become l, but leaving x as y's right
child.
Node z has two children (left child l and right child r), and its successor y ≠ r lies within the
subtree rooted at r. We replace y with its own right child x, and we set y to be r's parent. Then,
we set y to be q's child and the parent of l.
Self-Assessment Exercises
1. What is the worst case time complexity for search, insert and delete operations in a
general Binary Search Tree?
2. We are given a set of n distinct elements and an unlabelled binary tree with n nodes. In
how many ways can we populate the tree with the given set so that it becomes a binary
search tree?
3. How many distinct binary search trees can be created out of 4 distinct keys?
A tree T is an almost red-black tree (ARB tree) if the root is red, but other conditions above hold.
3.4.1. Rotation:
Restructuring operations on red-black trees can generally be expressed more clearly in terms of
the rotation operation.
Example: Draw the complete binary tree of height 3 on the keys {1, 2, 3, ..., 15}. Add the NIL
leaves and color the nodes in three different ways such that the black heights of the resulting
trees are: 2, 3 and 4.
Solution:
3.4.2. Insertion:
Insert the new node the way it is done in Binary Search Trees.
RB-INSERT (T, z)
y ← nil [T]
x ← root [T]
while x ≠ NIL [T]
do y ← x
if key [z] < key [x]
then x ← left [x]
else x ← right [x]
p [z] ← y
if y = nil [T]
then root [T] ← z
else if key [z] < key [y]
then left [y] ← z
else right [y] ← z
left [z] ← nil [T]
right [z] ← nil [T]
color [z] ← RED
RB-INSERT-FIXUP (T, z)
After inserting the new node, coloring it black might violate the black-height condition, while
coloring it red might violate the coloring conditions (i.e. the root is black and a red node has no
red children). We know that black-height violations are hard to fix, so we color the new node
red. After this, if there is any color violation, we correct it with the RB-INSERT-FIXUP
procedure.
RB-DELETE-FIXUP (T, x)
1. while x ≠ root [T] and color [x] = BLACK
2. do if x = left [p[x]]
3. then w ← right [p[x]]
4. if color [w] = RED
5. then color [w] ← BLACK //Case 1
6. color [p[x]] ← RED //Case 1
7. LEFT-ROTATE (T, p [x]) //Case 1
8. w ← right [p[x]] //Case 1
9. if color [left [w]] = BLACK and color [right [w]] = BLACK
10. then color [w] ← RED //Case 2
11. x ← p[x] //Case 2
12. else if color [right [w]] = BLACK
13. then color [left [w]] ← BLACK //Case 3
14. color [w] ← RED //Case 3
15. RIGHT-ROTATE (T, w) //Case 3
16. w ← right [p[x]] //Case 3
17. color [w] ← color [p[x]] //Case 4
18. color [p[x]] ← BLACK //Case 4
19. color [right [w]] ← BLACK //Case 4
20. LEFT-ROTATE (T, p[x]) //Case 4
21. x ← root [T] //Case 4
22. else (same as the then clause with “right” and “left” exchanged)
23. color [x] ← BLACK
Example: In a previous example, we found the red-black tree that results from successively
inserting the keys 41, 38, 31, 12, 19, 8 into an initially empty tree. Now show the red-black trees
that result from the successive deletion of the keys in the order 8, 12, 19, 31, 38, 41.
Solution:
After the last key, 41, is deleted, no tree remains.
Self-Assessment Exercises
1. When deleting a node from a red-black tree, what condition might happen?
2. What is the maximum height of a Red-Black Tree with 14 nodes? (Hint: The black depth
of each external node in this tree is 2.) Draw an example of a tree with 14 nodes that
achieves this maximum height.
3. Why can't a Red-Black tree have a black node with exactly one black child and no red
child?
5.0 SUMMARY
In this unit, we considered the Binary Search Tree and looked at how such trees could be
traversed while also examining the various methods of querying or accessing information from a
Binary Search Tree. In addition, we looked at a special derivative of the Binary Search Tree
called Red Black Trees, its properties and also some operations that could be carried out on Red
Black Trees.
1.0 INTRODUCTION
Dynamic programming is both a mathematical optimization method and a computer
programming method. The method was developed by Richard Bellman in the 1950s and has
found applications in numerous fields, from aerospace engineering to economics. We look at
some of the techniques of Dynamic Programming in this unit as well as some benefits and
applications of Dynamic Programming.
2.0 OBJECTIVES
At the end of this unit, you should be able to
Explain better the concept of Dynamic Programming
Know the different methods for resolving a Dynamic Programming problem
Know when to use either of the methodologies learnt
Understand the different areas of applications of Dynamic Programming
Evaluate the basic differences between Dynamic Programming and the Divide-and-
Conquer paradigm.
From the definition of dynamic programming, it is a technique for solving a complex problem by
first breaking it into a collection of simpler sub-problems, solving each sub-problem just once,
and then storing their solutions to avoid repetitive computations.
Let's understand this approach through an example. Consider an example of the Fibonacci
series. The following series is the Fibonacci series:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
The numbers in the above series are not randomly calculated.
Mathematically, we could write each of the terms using the below formula:
F(n) = F(n-1) + F(n-2),
With the base values F(0) = 0, and F(1) = 1.
To calculate the other numbers, we follow the above relationship. For example, F(2) is the sum
of F(0) and F(1), which is equal to 1.
In the dynamic programming approach, we try to divide the problem into similar sub-problems.
We follow this approach in the above case when we divide F(20) into the similar sub-problems
F(19) and F(18). If we revisit the definition of dynamic programming, it says that a similar
sub-problem should not be computed more than once. Still, in a naive recursive solution, the
sub-problems are calculated repeatedly: F(18) is calculated two times and, similarly, F(17) is
also calculated twice. The technique is only useful if we store the result of each sub-problem
once it has been computed; failing to do so leads to a wastage of resources.
In the above example, recalculating F(18) in the right subtree leads to tremendous usage of
resources and decreases the overall performance.
The solution to the above problem is to save the computed results in an array. First, we calculate
F(16) and F(17) and save their values in an array. The F(18) is calculated by summing the values
of F(17) and F(16), which are already saved in an array. The computed value of F(18) is saved in
an array. The value of F(19) is calculated using the sum of F(18), and F(17), and their values are
already saved in an array. The computed value of F(19) is stored in an array. The value of F(20)
can be calculated by adding the values of F(19) and F(18), and the values of both F(19) and
F(18) are stored in an array. The final computed value of F(20) is stored in an array.
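The recursive code referred to below is not shown in the text; a minimal C version of the plain recursive approach would be:

// Naive recursive Fibonacci: the same sub-problems are recomputed
// many times, so the running time grows exponentially.
int fib(int n)
{
    if (n < 2)
        return n;          // base cases F(0) = 0 and F(1) = 1
    return fib(n - 1) + fib(n - 2);
}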
In the above code, we have used the recursive approach to find the Fibonacci series. When
the value of 'n' increases, the function calls and computations also increase. In this case, the
time complexity grows exponentially, and it becomes O(2^n).
Another solution to this problem is to use the dynamic programming approach. Rather than
generating the recursive tree again and again, we can reuse the previously calculated value. If we
use the dynamic programming approach, then the time complexity would be O(n).
When we apply the dynamic programming approach in the implementation of the Fibonacci
series, then the code would look like:
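A minimal C sketch of the memoized version (the array name cache and the bound MAX are choices made for this illustration) is:

#define MAX 100    // assumed upper bound on n for this sketch

int cache[MAX];    // cache[i] == 0 means fib(i) has not been computed yet

int fib(int n)
{
    if (n < 2)
        return n;                           // base cases F(0) = 0, F(1) = 1
    if (cache[n] == 0)                      // compute each sub-problem once
        cache[n] = fib(n - 1) + fib(n - 2);
    return cache[n];                        // reuse the stored result
}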
In the above code, we have used the memoization technique, in which we store the results in an
array to reuse the values. This is also known as a top-down approach, in which we move from
the top and break the problem into sub-problems.
The bottom-up approach is used to avoid recursion, thus saving memory space. A bottom-up
algorithm starts from the beginning, whereas the recursive algorithm starts from the end and
works backward. In the bottom-up approach, we start from the base cases and work up to the
answer for the end. As we know, the base cases in the Fibonacci series are 0 and 1, so we start
from 0 and 1.
0 1
a[0] a[1]
The value of a[2] is calculated by adding a[0] and a[1], and it becomes 1, as shown below:
0 1 1
a[0] a[1] a[2]
The value of a[3] will be calculated by adding a[1] and a[2] , and it becomes 2 shown below:
0 1 1 2
a[0] a[1] a[2] a[3]
The value of a[4] will be calculated by adding a[2] and a[3], and it becomes 3 shown below:
0 1 1 2 3
a[0] a[1] a[2] a[3] a[4]
The value of a[5] will be calculated by adding the value of a[4] and a[3], and it becomes 5 shown
below:
0 1 1 2 3 5
a[0] a[1] a[2] a[3] a[4] a[5]
The code for implementing the Fibonacci series using the bottom-up approach is given below:
int fib(int n)
{
    int A[n+1];
    int i;
    A[0] = 0; A[1] = 1;
    for(i = 2; i <= n; i++)
    {
        A[i] = A[i-1] + A[i-2];
    }
    return A[n];
}
In the above code, the base cases are 0 and 1, and we then used a for loop to find the other
values of the Fibonacci series.
Let's explain this with a diagrammatic view: initially, the array holds the first two values, 0 and
1. When i = 3, the values 1 and 1 are added to produce the next entry, and so on, the array being
filled starting from the bottom and reaching to the top.
Portability: - A program should be supported by many different computers. The program should
compile and run smoothly on different platforms. Because of rapid development in hardware and
software, platform change is a common phenomenon these days. So, portability is measured by
how a software application can be transferred from one computer environment to another
without failure. A program is said to be more portable if it is easily adopted on different
computer systems. Subsequently, if a program is developed only for a particular platform, its life
expectancy is seriously compromised.
Maintainability: - It is the process of fixing program errors and improving the program. If a
program is easy to read and understand, then its maintenance will be easier. It should also
prevent unwanted work so that the maintenance cost in the future will be low. It should also have
quality to easily meet new requirements. A maintainable software allows us to fix bugs quickly
and easily, improve usability and performance, add new features, make changes to support
multiple platforms, and so on.
Cost Effectiveness: - Cost Effectiveness is the key to measure the program quality. Cost must be
measured over the life of the program and must include both cost and human cost of producing
these programs.
Flexible: - The program should be written in such a manner that it allows one to add new
features without changing the existing module. The majority of the projects are developed for a
specific period, and they require modifications from time to time. It should always be ready to
meet new requirements. Highly flexible software is always ready for a new world of possibilities.
The plain recursive solution shown earlier also solves this problem, but its time complexity
is O(2^n). So, dynamic programming is used to reduce the time complexity from exponential
time to linear time.
In the memoized code shown earlier, we used a cache array. If cache[n] is not equal to zero, then
we return the result from the cache; otherwise, we calculate the value, store it in the cache and
then return it. The technique used there is the top-down approach, as it follows the recursive
approach and the cache is populated on demand. Suppose we want to calculate fib(4): first we
look into the cache, and if the value is not in the cache, then the value is calculated and stored in
the cache.
In the bottom-up code shown earlier, we declared an array of size n+1, with the base entries at
index 0 and 1 holding the values 0 and 1 respectively. As we can observe, we populate the array
from the bottom up, so it is known as the bottom-up approach. This approach avoids recursion,
but both approaches have the same time and space complexity, i.e., O(n).
In this case, we have used the FAST method to obtain the solution. The above is the best
solution we have got so far, but it is not yet the fully optimal one.
Efficient solution:
int fib(int n)
{
    int first = 0, second = 1, sum = 0;
    if(n < 2)
    {
        return n;
    }
    for(int i = 2; i <= n; i++)
    {
        sum = first + second;
        first = second;
        second = sum;
    }
    return sum;
}
The above solution is the most efficient one, as it does not use a cache and needs only constant extra space.
Problems such as matrix-chain multiplication, the longest common subsequence and coin
change (see the exercises below) can easily be solved using dynamic programming.
Self-Assessment Exercises
1. When do we consider using the dynamic programming approach?
2. Four matrices M1, M2, M3 and M4 of dimensions pxq, qxr, rxs and sxt respectively can
be multiplied in several ways with different numbers of total scalar multiplications. For
example, when multiplied as ((M1 X M2) X (M3 X M4)), the total number of
multiplications is pqr + rst + prt. When multiplied as (((M1 X M2) X M3) X M4), the
total number of scalar multiplications is pqr + prs + pst. If p = 10, q = 100, r = 20, s = 5
and t = 80, then the number of scalar multiplications needed is?
3. Consider two strings A = "qpqrr" and B = "pqprqrp". Let x be the length of the longest
common subsequence (not necessarily contiguous) between A and B and let y be the
number of such longest common subsequences between A and B. Then x + 10y=?
4. In dynamic programming, the technique of storing the previously calculated values is
called?
5. What happens when a top-down approach of dynamic programming is applied to a
problem?
4.0 CONCLUSION
Dynamic programming is nothing but recursion with memoization i.e. calculating and storing
values that can be later accessed to solve subproblems that occur again, hence making your code
faster and reducing the time complexity (the CPU cycles used in computing are reduced). Dynamic
programming is used where we have problems which can be divided into similar sub-problems,
so that their results can be re-used. Mostly, these algorithms are used for optimization.
5.0 SUMMARY
In this Unit, we considered a very important algorithm design paradigm known as Dynamic
programming and compared it with another useful method known as Divide-and-Conquer
technique. Several ways for resolving the Dynamic Programming problem were considered.
6.0 TUTOR MARKED ASSIGNMENTS
1. For each of the following problems, explain whether they could be solved or not using
dynamic programming?
A: Mergesort
B: Binary search
C: Longest common subsequence
D: Quicksort
2. Give at least three properties of a dynamic programming problem
3. You are given infinite coins of denominations 1, 3, 4. What is the total number of ways in
which a sum of 7 can be achieved using these coins if the order of the coins is not
important?
4. What is the main difference between the Top-down and Bottom-up approaches for solving
Dynamic Programming problems?
1.0 INTRODUCTION
In general, the amount of resources (or cost) that an algorithm requires in order to return the
expected result is called computational complexity, or just complexity. The complexity of an
algorithm can be measured in terms of time complexity and/or space complexity.
2.0 OBJECTIVES
By the end of this unit, you should be able to:
Know the meaning and focus of Computational Complexity theory
Identify the different cases of P and NP problems
Differentiate between Tractable and Intractable problems
Know what we mean by Deterministic and Non-Deterministic problems
Understand the differences between Deterministic and Non-Deterministic algorithms
Computational complexity theory, or just complexity theory, is the study of the difficulty of
computational problems. Rather than focusing on specific algorithms, complexity theory focuses
on problems. For example, the mergesort algorithm can sort a list of N numbers in O(N log N)
time. Complexity theory asks what you can learn about the task of sorting in general, not what
you can learn about a specific algorithm. It turns out that you can show that any sorting algorithm that sorts by using comparisons must perform at least on the order of N log N comparisons in the worst case.
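To see where this bound comes from, here is the standard decision-tree argument (our addition, not from the course text): a comparison sort corresponds to a binary decision tree with at least N! leaves, one per possible ordering of the input, so its height h, the worst-case number of comparisons, satisfies

\[
  h \;\ge\; \log_2(N!) \;=\; \sum_{i=1}^{N} \log_2 i \;\ge\; \frac{N}{2}\log_2\frac{N}{2} \;=\; \Omega(N \log N).
\]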
Complexity theory is a large and difficult topic, so there’s no room here to cover it fully.
However, every programmer who studies algorithms should know at least something about it.
A nondeterministic algorithm, in other words, is an algorithm in which the result of a run is not uniquely defined: the same input may lead to different outcomes, and the result could be random.
An algorithm that solves a problem in nondeterministic polynomial time can run in polynomial time or exponential time depending on the choices it makes during execution. Nondeterministic algorithms are often used to find an approximation to a solution when the exact solution would be too costly to obtain with a deterministic algorithm.
A nondeterministic algorithm is different from its more familiar deterministic counterpart in its
ability to arrive at outcomes using various routes. If a deterministic algorithm represents a single
path from an input to an outcome, a nondeterministic algorithm represents a single path
stemming into many paths, some of which may arrive at the same output and some of which may
arrive at unique outputs.
An algorithm behaves nondeterministically in cases such as the following (a small C sketch illustrating the first case appears after this list):
i. If it uses external state other than the input, such as user input, a global variable, a hardware timer value, a random value, or stored disk data.
ii. If it operates in a way that is timing-sensitive, for example if it has multiple processors
writing to the same data at the same time. In this case, the precise order in which each
processor writes its data will affect the result.
iii. If a hardware error causes its state to change in an unexpected way.
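The following is a small, hypothetical C sketch of the first case above: because the algorithm consults the hardware clock and a pseudo-random generator, two runs on the same input may print different results.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Illustration of nondeterminism, case (i): the outcome depends on
   external state (the clock) and a random value, so repeated runs
   with the same input n may produce different pivot indices. */
int pick_pivot(int n)
{
    return rand() % n;               /* random choice among n candidates */
}

int main(void)
{
    srand((unsigned) time(NULL));    /* external state: hardware timer */
    printf("pivot = %d\n", pick_pivot(10));
    return 0;
}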
3.5.3 IS P = NP?
The P versus NP problem is a major unsolved problem in computer science. It asks whether
every problem whose solution can be quickly verified can also be solved quickly.
An answer to the P versus NP question would determine whether problems that can be verified in
polynomial time can also be solved in polynomial time.
If it turns out that P ≠ NP, which is widely believed, it would mean that there are problems in NP
that are harder to compute than to verify: they could not be solved in polynomial time, but the
answer could be verified in polynomial time.
If P = NP, then all NP problems can be solved deterministically in polynomial time.
The Clay Mathematics Institute has offered a $1,000,000 prize to anyone who proves or
disproves P = NP.
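To make "verified in polynomial time" concrete, here is an illustrative C sketch (the problem choice and all names are ours): checking a proposed certificate for the NP-complete subset-sum problem takes only O(n) time, even though no polynomial-time algorithm is known for finding such a subset.

#include <stdbool.h>
#include <stdio.h>

/* Verifying an NP certificate in polynomial time: given a set, a target
   and a proposed subset (a true/false mask), checking the claimed
   solution is a single O(n) pass. */
bool verify_subset_sum(const int set[], const bool chosen[], int n, int target)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        if (chosen[i])
            sum += set[i];
    return sum == target;
}

int main(void)
{
    int set[] = {3, 34, 4, 12, 5, 2};
    bool chosen[] = {false, false, true, false, true, false};  /* 4 + 5 */
    printf("%s\n", verify_subset_sum(set, chosen, 6, 9) ? "valid" : "invalid");
    return 0;
}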
4.0 CONCLUSION
Computational complexity theory focuses on classifying computational problems according to
their resource usage, and relating these classes to each other. A computational problem is a task
solved by a computer. A computational problem is solvable by mechanical application of mathematical steps, such as an algorithm. Among the areas considered in this Unit were the P and NP classes of problems, together with the NP-hard and NP-complete problems.
5.0 SUMMARY
In this Unit we looked at the meaning and nature of Computational Complexity theory and also
examined the notion of Deterministic as well as Non Deterministic algorithms. Several examples
of the algorithms were listed and we also treated P, NP, NP-hard and NP-complete problems while also mentioning Tractable and Intractable problems. On a final note, we also looked at the unsolved problem of whether P = NP.
Bhasin, H. (2015). Algorithms: Design and Analysis. Oxford University Press. ISBN:
0199456666, 9780199456666
Sen, S. and Kumar, A. (2019). Design and Analysis of Algorithms: A Contemporary Perspective. Cambridge University Press. ISBN: 1108496822, 9781108496827.
1.0 INTRODUCTION
In computer science and operations research, approximation algorithms are efficient algorithms
that find approximate solutions to optimization problems (in particular NP-hard problems) with
provable guarantees on the distance of the returned solution to the optimal one. Approximation
algorithms are typically used when finding an optimal solution is intractable, but can also be
used in some situations where a near-optimal solution can be found quickly and an exact solution
is not needed.
2.0 OBJECTIVES
At the end of this Unit, you should be able to:
Know the meaning of an Approximate algorithm
Understand the performance ratio of approximate algorithms
Learn more about the Vertex Cover and Traveling Salesman problems
Understand the concept of Minimal Spanning Trees
Understand more of the concept of Performance Ratios
Intuitively, the approximation ratio measures how bad the approximate solution is compared with the optimal solution. A large approximation ratio indicates that the solution is much worse than an optimal one; a small ratio indicates that it is more or less the same.
Observe that P(n) is always ≥ 1; if the ratio does not depend on n, we may simply write P. Therefore, a 1-approximation algorithm gives an optimal solution. Some problems have polynomial-time approximation algorithms with small constant approximation ratios, while for others the best-known polynomial-time approximation algorithms have approximation ratios that grow with n.
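For reference, the standard textbook definition (our addition; P(n) here plays the role usually written ρ(n)): an algorithm has approximation ratio P(n) if, for any input of size n, the cost C of the solution it produces and the cost C* of an optimal solution satisfy

\[
  \max\!\left(\frac{C}{C^{*}},\; \frac{C^{*}}{C}\right) \;\le\; P(n).
\]

Taking the maximum of the two ratios lets the same definition cover both minimization and maximization problems.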
We can model the cities as a complete graph of n vertices, where each vertex represents a city.
It can be shown that TSP is NP-complete.
If we assume the cost function c satisfies the triangle inequality, then we can use the following
approximate algorithm.
Triangle inequality
Let u, v, w be any three vertices; then we have
c(u, w) ≤ c(u, v) + c(v, w).
One important observation used to develop an approximate solution is that if we remove an edge from the optimal tour H*, the tour becomes a spanning tree.
Approx-TSP (G = (V, E))
{
1. Compute a MST T of G;
2. Select any vertex r as the root of the tree;
3. Let L be the list of vertices visited in a preorder tree walk of T;
4. Return the Hamiltonian cycle H that visits the vertices in the order L;
}
The Traveling-salesman Problem
Intuitively, Approx-TSP first makes a full walk of the MST T, which visits each edge exactly twice. To create a Hamiltonian cycle from the full walk, it bypasses some vertices (which corresponds to taking a shortcut).
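Putting these observations together yields the well-known guarantee that Approx-TSP is a 2-approximation; a sketch of the standard argument (assuming the triangle inequality), writing H* for an optimal tour, W for the full walk of T and H for the returned cycle:

\[
  c(T) \;\le\; c(H^{*}), \qquad c(W) \;=\; 2\,c(T), \qquad c(H) \;\le\; c(W) \;\le\; 2\,c(H^{*}).
\]

The first inequality holds because deleting one edge from H* leaves a spanning tree, which cannot cost less than the minimum spanning tree T; the last holds because shortcutting never increases the cost under the triangle inequality.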
The above graph can be represented as G(V, E), where V is the set of vertices and E is the set of edges. A spanning tree of the graph is written G'(V', E'). Here V' = V, meaning that the spanning tree contains the same vertices as the graph, but the number of edges is different: a spanning tree on V vertices always has exactly V − 1 edges.
By Cayley's formula, the number of spanning trees that can be made from the above complete graph equals n^(n−2) = 4^(4−2) = 16.
Therefore, 16 spanning trees can be created from the above graph.
The maximum number of edges that can be removed to construct a spanning tree equals e − n + 1 = 6 − 4 + 1 = 3.
Self-Assessment Exercises
1. The traveling salesman problem involves visiting each city how many times?
2. What do you understand by the term MINIMUM SPANNING TREE?
3. An undirected graph G(V, E) contains n ( n > 2 ) nodes named v1 , v2 ,….vn. Two nodes
vi , vj are connected if and only if 0 < |i – j| <= 2. Each edge (vi, vj ) is assigned a weight
i + j. A sample graph with n = 4 is shown below. What will be the cost of the minimum
spanning tree (MST) of such a graph with n nodes?
4.0 CONCLUSION
An approximation or approximate algorithm is a way of dealing with NP-completeness for an optimization problem. The goal of an approximation algorithm is to come as close as possible to the optimal solution in polynomial time. Examples of problems treated with approximation algorithms are the Minimal Spanning Tree, Vertex Cover and Traveling Salesman problems.
3. In the graph given in question (2) above, what is the minimum possible weight of a path
P from vertex 1 to vertex 2 in this graph such that P contains at most 3 edges?
4. Consider a weighted complete graph G on the vertex set {v1, v2, ..., vn} such that the weight of the edge (vi, vj) is 2|i − j|. The weight of a minimum spanning tree of G is?
Greenbaum, A. and Chartier, T. P. (2012). Numerical Methods: Design, Analysis, and Computer
Implementation of Algorithms. Princeton University Press. ISBN: 1400842670, 9781400842674.
Heineman, G. T., Pollice, G. and Selkow, S. (2016). Algorithms in a Nutshell. O’Reilly Media,
Inc. USA.
1.0 INTRODUCTION
An approximation algorithm is a way of dealing with NP-completeness for an optimization problem. The goal of the approximation algorithm is to come as close as possible to the optimal solution in polynomial time.
We continue our study of approximate algorithms by looking at two methods for constructing the Minimal Spanning Tree: Kruskal's algorithm and Prim's algorithm.
2.0 OBJECTIVES
At the end of this Unit, you should be able to:
Understand the methods of the Minimal Spanning Tree (MST)
Know more about the Kruskal and the Prim algorithms
Analysis:
Where E is the number of edges in the graph and V is the number of vertices, Kruskal's Algorithm can be shown to run in O(E log E) time, or simply O(E log V) time, all with simple data structures. These running times are equivalent because:
E is at most V², and log V² = 2 log V is O(log V).
If we ignore isolated vertices, which will each form their own component of the minimum spanning tree, V ≤ 2E, so log V is O(log E).
Thus the total time is
O(E log E) = O(E log V).
Example:
Find the Minimum Spanning Tree of the following graph using Kruskal's algorithm.
Now, check for each edge (u, v) whether the endpoints u and v belong to the same tree. If they do, then the edge (u, v) cannot be added without creating a cycle.
Otherwise, the two vertices belong to different trees; the edge (u, v) is added to A, and the vertices of the two trees are merged by the union procedure.
Step 3: Then the (a, b) and (i, g) edges are considered, and the forest becomes:
Step 4: Now, edge (h, i). Both h and i are in the same set, so this edge creates a cycle and is discarded. Then edges (c, d), (b, c), (a, h), (d, e), (e, f) are considered, and the forest becomes:
Step 5: For edge (e, f), both endpoints e and f are in the same tree, so this edge is discarded. Then edge (b, h) is considered; it also creates a cycle.
Step 6: After that, edge (d, f) is added, and the final spanning tree is shown in dark lines.
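A compact C sketch of Kruskal's algorithm using a union-find structure follows; the sample graph and all names are ours for illustration (not the figures above), and it assumes at most 100 vertices.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int u, v, w; } Edge;

static int parent[100];              /* union-find forest, n <= 100 */

static int find(int x)               /* representative of x's tree */
{
    while (parent[x] != x)
        x = parent[x] = parent[parent[x]];   /* path halving */
    return x;
}

static int by_weight(const void *a, const void *b)
{
    return ((const Edge *)a)->w - ((const Edge *)b)->w;
}

int kruskal(Edge edges[], int e, int n)      /* returns MST weight */
{
    for (int i = 0; i < n; i++)
        parent[i] = i;                       /* each vertex its own tree */
    qsort(edges, e, sizeof(Edge), by_weight);   /* O(E log E) sort */
    int total = 0, taken = 0;
    for (int i = 0; i < e && taken < n - 1; i++) {
        int ru = find(edges[i].u), rv = find(edges[i].v);
        if (ru != rv) {                      /* different trees: no cycle */
            parent[ru] = rv;                 /* union the two trees */
            total += edges[i].w;
            taken++;
        }                                    /* same tree: edge discarded */
    }
    return total;
}

int main(void)
{
    Edge edges[] = {{0,1,4}, {0,2,3}, {1,2,1}, {1,3,2}, {2,3,5}};
    printf("MST weight = %d\n", kruskal(edges, 5, 4));   /* prints 6 */
    return 0;
}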
Example:
Generate minimum cost spanning tree for the following graph using
1. Prim's algorithm.
Solution:
In Prim's algorithm, we first initialize the priority queue Q to contain all the vertices, and set the key of each vertex to ∞ except for the root, whose key is set to 0. Suppose vertex 0 is the root, i.e., r. By the EXTRACT-MIN(Q) procedure, now u = r and Adj[u] = {5, 1}.
Remove u from the set Q and add it to the set V − Q of vertices in the tree. Then update the key and π fields of every vertex v adjacent to u but not in the tree.
u = EXTRACT-MIN(2, 6)
u = 2 [since key[2] < key[6], i.e., 12 < 18]
Now u = 2
Adj[2] = {3, 1}
3 is already in the heap
Taking 1: key[1] = 28, w(2, 1) = 16
Now all the vertices have been spanned. Using the above table we get the Minimum Spanning Tree:
0 → 5 → 4 → 3 → 2 → 1 → 6
[Because π[5] = 0, π[4] = 5, π[3] = 4, π[2] = 3, π[1] = 2, π[6] = 1]
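To complement the trace above, here is a self-contained O(V²) C sketch of Prim's algorithm on an adjacency matrix (the sample graph and names are ours, not the figure's; graph[u][v] = 0 means no edge).

#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

#define N 5                       /* vertices in the illustrative graph */

int prim(int graph[N][N])         /* returns MST weight, root is vertex 0 */
{
    int key[N];                   /* cheapest edge linking v to the tree */
    bool inTree[N];
    int total = 0;

    for (int v = 0; v < N; v++) {
        key[v] = INT_MAX;         /* every key starts at "infinity"... */
        inTree[v] = false;
    }
    key[0] = 0;                   /* ...except the root, whose key is 0 */

    for (int count = 0; count < N; count++) {
        int u = -1;
        for (int v = 0; v < N; v++)          /* EXTRACT-MIN by linear scan */
            if (!inTree[v] && (u == -1 || key[v] < key[u]))
                u = v;
        inTree[u] = true;
        total += key[u];
        for (int v = 0; v < N; v++)          /* update keys of u's neighbours */
            if (graph[u][v] && !inTree[v] && graph[u][v] < key[v])
                key[v] = graph[u][v];
    }
    return total;
}

int main(void)
{
    int graph[N][N] = {
        {0, 2, 0, 6, 0},
        {2, 0, 3, 8, 5},
        {0, 3, 0, 0, 7},
        {6, 8, 0, 0, 9},
        {0, 5, 7, 9, 0},
    };
    printf("MST weight = %d\n", prim(graph));   /* prints 16 */
    return 0;
}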
Self-Assessment Exercises
1. The number of distinct minimum spanning trees for the weighted graph below is?
4. Let G be a connected undirected graph of 100 vertices and 300 edges. The weight of a minimum spanning tree of G is 500.
4.0 CONCLUSION
An approximation algorithm returns a solution to a combinatorial optimization problem that is
provably close to optimal (as opposed to a heuristic that may or may not find a good solution).
Approximation algorithms are typically used when finding an optimal solution is intractable, but
can also be used in some situations where a near-optimal solution can be found quickly and an
exact solution is not needed.
Many NP-hard problems are also hard to approximate within certain factors, assuming P ≠ NP.
5.0 SUMMARY
In this Unit, we concluded our discussion of approximation algorithms by looking again at the Minimal Spanning Tree and methods for solving MST problems: we examined Prim's and Kruskal's algorithms, as well as the steps for finding an MST using either of them.
Greenbaum, A. and Chartier, T. P. (2012). Numerical Methods: Design, Analysis, and Computer
Implementation of Algorithms. Princeton University Press. ISBN: 1400842670, 9781400842674.
Heineman, G. T., Pollice, G. and Selkow, S. (2016). Algorithms in a Nutshell. O’Reilly Media,
Inc. USA.