1. Fundamentals of algorithms and data structures
The size of data is increasing exponentially on a daily basis, and it is becoming tough to manage. Performing basic operations such as searching, inserting and deleting a data item in minimum time is a big challenge for all of us. The design of effective algorithms to perform these basic operations is the need of the hour. In this very first lesson of this SLM you will learn the concept of an algorithm and what is meant by the complexity of an algorithm. You will also learn the notation used to represent the complexity of an algorithm. In the end you will learn the concept of a data structure and what operations can be performed on it.
Data is in raw form and needs to be processed before you can draw meaning from it. Data after processing is called information. The ultimate aim of collecting data is to help the management take effective decisions. But data by itself cannot provide the knowledge on which decisions can be taken. Knowledge is added to data by processing it and converting it into information. Information refers to the presentation of data in aggregated or summarised form.
For example, suppose you collect data about the attendance of all employees in a company. The attendance record in itself is of little help to management. Processing that data to find aggregations such as the number of leaves taken by each employee in the last month, the total leaves taken in the current year to date, the leave balance, and the number of employees on leave on a particular day produces information that can be really useful.
Problem analysis
There are two ways to express a problem:
1. Algorithm - a collection of well-defined steps that should be followed in order to find the solution to a particular problem.
2. Flowchart - a pictorial representation of a problem, used to show the flow of control that ultimately leads you to the desired output.
Algorithm
An algorithm is a collection of well-defined steps that should be followed in order to find the solution to a particular problem. It is also known as a step-by-step representation of the solution to a given problem. An algorithm is a sequence of logical steps followed to achieve the desired result. The language used is a simple representation of the tasks to be performed, and it is not directly understood by the computer. Hence an algorithm cannot be executed directly; it must first be converted into a program using a programming language. Most of the problems are from real-life scenarios.
An algorithm should give a solution to a problem in a finite number of computation steps. A solution that takes infinitely many steps is of no benefit. Even the solutions to the most complex problems in science or space are expressed in a finite number of steps, even if they are expected to take months to compute.
Some algorithms are as simple as making a cup of tea, but there are more complex problems that require the formulation of a proper sequence of steps to be followed.
For example,
Consider the following basic algorithm to add two numbers:
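Such an algorithm can be sketched in Python, with the classical algorithm steps shown as comments (the input values are illustrative):

```python
# A minimal sketch of an algorithm to add two numbers.
# Step 1: Start.
# Step 2: Read the two numbers.
number1 = 5   # illustrative input
number2 = 7   # illustrative input
# Step 3: Add the two numbers and store the result.
result = number1 + number2
# Step 4: Display the result.
print(result)
# Step 5: Stop.
```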
Algorithm complexity
You can design more than one algorithm as the solution to a problem. When you have two or more algorithms for the same problem, it is only appropriate to have a criterion to judge which one is better than the rest. In order to find a winner between two algorithms, you need to compare their complexity. The complexity of an algorithm can be expressed in two forms.
1. Time complexity: It is not possible to execute an algorithm directly, which means time complexity is not the amount of time it takes to execute an algorithm. The time complexity of an algorithm is the number of computational steps/operations it takes, or the number of comparisons performed, to find a correct solution to a specific problem. Sometimes an algorithm with fewer steps can have a higher time complexity than an algorithm with more steps, because the number of comparisons in the latter is less than in the former.
2. Space complexity: Space complexity refers to the space in memory needed by the variables and functions specified in the algorithm. More variables lead to more space complexity.
Time space trade off
Of the two complexities of an algorithm, which one is more important? It depends on the requirement and the availability of resources. For example, if there is enough space available and space is not a criterion, then the obvious requirement is to create an algorithm with minimal time complexity. On the other hand, if the data generated is humongous and with every step the available space is getting smaller and smaller, space becomes more important than time.
The time space trade off decides what is more important: time or space. Once an optimal, or close to optimal, algorithm has been generated, improving one of the two parameters means you have to compromise on the other.
For example, you can swap two numbers using two different algorithms:
Algorithm 1 (without a temporary variable):
number1 = number1 + number2
number2 = number1 - number2
number1 = number1 - number2
Algorithm 2 (with a temporary variable):
temp = number1
number1 = number2
number2 = temp
The first algorithm makes use of two variables and the second algorithm makes use of three variables. Hence the space complexity of the first algorithm is better.
In the first algorithm every step has two operations (an arithmetic operation and an assignment), while in the second algorithm every step has only one operation (an assignment), which means the time complexity of the second algorithm is better.
Depending upon your requirement, you can choose between time or space.
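The two swap algorithms above can be written as runnable Python functions (a sketch; the function names are my own):

```python
def swap_arithmetic(number1, number2):
    # First algorithm: only two variables, so less space,
    # but each step performs two operations (arithmetic + assignment).
    number1 = number1 + number2
    number2 = number1 - number2
    number1 = number1 - number2
    return number1, number2

def swap_temp(number1, number2):
    # Second algorithm: a third variable temp, so more space,
    # but each step is a single assignment.
    temp = number1
    number1 = number2
    number2 = temp
    return number1, number2

print(swap_arithmetic(3, 5))  # (5, 3)
print(swap_temp(3, 5))        # (5, 3)
```

Both produce the same result; they differ only in how many variables they use and how much work each step does.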
Asymptotic analysis/ notations
In computing, asymptotic analysis of an algorithm refers to defining the mathematical bound on its run-time performance based on the input size. For example, the running time of one operation may be computed as f(n) = n, and that of another operation as g(n) = n². This means the running time of the first operation will increase linearly with the increase in n, while the running time of the second operation will increase quadratically as n increases. However, the running times of both operations will be nearly the same if n is small.
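The difference in growth can be seen by counting steps for a linear and a quadratic operation at a few input sizes (an illustrative sketch; the function names are my own):

```python
def linear_steps(n):
    # An operation whose cost grows linearly: one step per item.
    steps = 0
    for _ in range(n):
        steps += 1
    return steps

def quadratic_steps(n):
    # An operation whose cost grows as the square of n:
    # a nested loop performs n * n steps.
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps

for n in (2, 10, 100):
    print(n, linear_steps(n), quadratic_steps(n))
```

For small n (n = 2) the counts are close, but at n = 100 the quadratic operation already takes 100 times as many steps as the linear one.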
Graphical representation
Theta – Θ Notation:
Theta (Θ) notation specifies an asymptotically tight bound for a function f(n): Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0}. Follow the steps below to calculate Θ for a program:
1. Break the program into smaller segments.
2. Find all types of inputs and calculate the number of operations each takes to execute. Make sure that the input cases are equally distributed.
3. Find the sum of all the calculated values and divide the sum by the total number of inputs. Let us say the function of n obtained, after removing all the constants, is g(n); then, in Θ notation, it is represented as Θ(g(n)).
Example: In a linear search problem, let us assume that all the cases are uniformly distributed (including the case when the key is absent in the array). Then sum the costs of the cases when the key is present at positions 1, 2, 3, ..., n and when it is not present, and divide the sum by n + 1.
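These steps can be carried out directly in a short sketch (the function name is my own):

```python
def average_comparisons(n):
    # Key at position i (1-based) costs i comparisons;
    # key absent costs n comparisons.
    # All n + 1 cases are assumed equally likely.
    total = sum(range(1, n + 1)) + n
    return total / (n + 1)

# The average grows proportionally to n (roughly n/2),
# so linear search is Theta(n) on average.
print(average_comparisons(1000))
```

For n = 3, the total is 1 + 2 + 3 + 3 = 9 over 4 cases, giving an average of 2.25 comparisons.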
Big – O Notation:
Big – O (O) notation specifies the asymptotic upper bound for a function f(n). For a given function
g(n), O(g(n)) is denoted by:
O (g(n)) = {f(n): there exist positive constants c and n0 such that f(n) ≤ c*g(n) for all n ≥ n0}.
This means that f(n) = O(g(n)) if there are positive constants n0 and c such that, to the right of n0, f(n) always lies on or below c*g(n).
Graphical representation
Follow the steps below to calculate O for a program:
1. Break the program into smaller segments.
2. Find the number of operations performed for each segment (in terms of the input size), assuming the given input is such that the program takes the maximum time, i.e. the worst-case scenario.
3. Add up all the operations and simplify; let us say the result is f(n).
4. Remove all the constants and choose the term having the highest order, because as n tends to infinity the constants and the lower-order terms in f(n) become insignificant. If the resulting function is g(n), then the big-O notation is O(g(n)).
It is the most widely used notation, as it is easier to calculate: there is no need to check every type of input, as there was in the case of theta notation. Also, since the worst-case input is taken into account, it gives a good upper bound on the time the program will take to execute.
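As an illustration of these steps, consider a hypothetical program segment whose operations can be counted directly (the segment and its operation counts are my own illustration, not from the text):

```python
def count_operations(n):
    # A hypothetical segment: one initialisation, a loop doing
    # two operations per iteration, and one final output step.
    ops = 1          # initialisation
    ops += 2 * n     # loop body: one comparison + one addition per pass
    ops += 1         # final output
    return ops       # f(n) = 2n + 2

print(count_operations(10))
```

Here f(n) = 2n + 2; removing the constants and keeping the highest-order term gives g(n) = n, so the segment is O(n).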
1. Best Case Time Complexity: It refers to the minimum number of steps needed by an algorithm to successfully complete for an input of size N. For the search algorithm called linear search, the best case time complexity is 1.
2. Worst Case Time Complexity: It refers to the maximum number of steps needed by an algorithm to successfully complete for an input of size N. For linear search, the worst case time complexity is N.
3. Average Case Time Complexity: Sometimes the worst case and the best case do not provide adequate information about the algorithm's behaviour. The average case time complexity is then computed by randomly selecting some inputs and computing the average complexity over all of them. For some algorithms the average complexity is the same as the worst case complexity.
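All three cases can be observed by instrumenting linear search to count its comparisons (a sketch; the function name is my own):

```python
def linear_search_comparisons(arr, key):
    # Returns (index, comparisons); index is -1 if the key is absent.
    comparisons = 0
    for i, value in enumerate(arr):
        comparisons += 1
        if value == key:
            return i, comparisons
    return -1, comparisons

data = [4, 8, 15, 16, 23, 42]
print(linear_search_comparisons(data, 4))    # best case: 1 comparison
print(linear_search_comparisons(data, 42))   # key last: N comparisons
print(linear_search_comparisons(data, 99))   # worst case (absent): N comparisons
```

The key at the first position gives the best case (1 comparison); a missing key forces a scan of all N elements, the worst case.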
Data Structure
A data structure is a logical representation of data in memory. As the size of data grows exponentially, it becomes very difficult to manage. You cannot expect the system to take an enormous amount of data and still perform the desired task quickly, simply because the size of the data is extremely large. A data structure is the solution to this problem: depending upon the type and size of the data in hand, you can select the appropriate data structure.
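As a small illustration of why this choice matters, membership testing behaves very differently for two of Python's built-in structures holding the same data (a sketch using illustrative values):

```python
# The same data stored in two different structures.
data_list = list(range(100_000))
data_set = set(data_list)

# 'in' on a list scans element by element: roughly O(n) comparisons.
found_in_list = 99_999 in data_list

# 'in' on a set uses hashing: close to O(1) on average.
found_in_set = 99_999 in data_set

print(found_in_list, found_in_set)
```

Both answers are the same, but the set answers the question in roughly constant time while the list may examine every element, so the right structure depends on which operations your data must support.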