DAA_unit_1_Introduction
UNIT-I - INTRODUCTION
Introduction, Algorithm, Pseudo code for expressing algorithms, Performance Analysis –
Space Complexity, Time Complexity, Asymptotic Notation - Big Oh Notation, Omega
notation, Theta Notation and Little oh notation, Probabilistic analysis, amortized analysis,
Performance Measurement, Randomized Algorithms.
* * *
What is an Algorithm: An algorithm is any well-defined computational procedure that takes some values as input and produces some values as output. It is thus a sequence of computational steps that transform the input into the output.
An algorithm is composed of a finite number of steps, each of which may require one or more operations. Each operation may be characterized as either simple or complex.
An algorithm is a step-by-step procedure for performing some task in a finite amount of time.
According to D. E. Knuth, a pioneer of the computer science discipline, an algorithm must have the following 5 basic properties:
1. Input
2. Output
3. Finiteness
4. Definiteness
5. Effectiveness
Input: An algorithm has 0 or more inputs that are given to it initially, before it begins, or dynamically as it runs.
Output: An algorithm has 1 or more outputs that have a specified relation to the inputs. An algorithm produces at least one output.
Finiteness: An algorithm must terminate after a finite number of steps.
Definiteness: Each operation specified in an algorithm must have a definite meaning; each step of the algorithm must be precisely defined. Instructions such as “compute x/0” or “add 6 or 7 to x” are not permitted, because it is not clear which of the two possibilities should be done or what the result is.
Effectiveness: Each operation of an algorithm should be effective, i.e. the operation must be able to be carried out in a finite amount of time. Tracing of each step should be possible.
Ex 13) Algorithm to find the nth Fibonacci number.
Algorithm fib(n)
{
if (n=1) then return 0;
else
if (n=2) then return 1;
else
return (fib(n-1)+fib(n-2));
}
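A minimal C rendering of this pseudocode (a sketch; the function name and the main driver are ours, with fib(1) = 0 and fib(2) = 1 as above):

#include <stdio.h>

/* Recursive Fibonacci, matching the pseudocode: fib(1)=0, fib(2)=1. */
long fib(int n)
{
    if (n == 1) return 0;
    if (n == 2) return 1;
    return fib(n - 1) + fib(n - 2);
}

int main(void)
{
    printf("%ld\n", fib(10)); /* prints 34 */
    return 0;
}

Note that this direct recursion recomputes the same subproblems repeatedly, which is why its running time grows exponentially with n.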
Ex 14) Algorithm to solve Towers of Hanoi problem.
There are three towers named A, B and C. Initially n disks are stacked on one tower (A) in decreasing order of size from bottom to top. The problem is to move the disks from tower A to tower B using tower C as intermediate storage. Only one disk can be moved at a time, and at no time may a bigger disk be placed on a smaller one.
Algorithm towersofhanoi(n,x,y,z)
// move the top n disks from tower x to tower y
{
if (n>=1) then
{
towersofhanoi(n-1,x,z,y);
write(“move disk “, n , “from tower”, x , “to tower”,y);
towersofhanoi(n-1,z,y,x);
}
}
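A direct C rendering of this pseudocode (a sketch; main and the disk count are ours):

#include <stdio.h>

/* Move the top n disks from tower x to tower y using tower z as
   intermediate storage; 2^n - 1 moves in total. */
void towersofhanoi(int n, char x, char y, char z)
{
    if (n >= 1) {
        towersofhanoi(n - 1, x, z, y);  /* move n-1 disks x -> z */
        printf("move disk %d from tower %c to tower %c\n", n, x, y);
        towersofhanoi(n - 1, z, y, x);  /* move n-1 disks z -> y */
    }
}

int main(void)
{
    towersofhanoi(3, 'A', 'B', 'C');  /* prints the 7 moves */
    return 0;
}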
3) Algorithm design techniques. For a given problem, there are many ways to design
algorithms for it.
i) Divide & Conquer (D&C)
ii) Greedy Method
iii) Dynamic programming
iv) Backtracking
v) Branch and bound
vi) Brute-force
vii) Decrease and conquer
viii) Transform and conquer
ix) Space and time trade-offs
Depending upon the problem, we will use suitable design method.
4) Prove Correctness
Once an algorithm has been specified, we next have to prove its correctness.
In practice, testing on sample inputs is usually used to check correctness, although testing alone cannot prove it.
5) Analyse Algorithm
Analysing the algorithm means studying its behaviour, i.e. calculating the time complexity and the space complexity. If the time complexity is too high, we go back and try another design technique so that the time complexity is as small as possible.
6) Coding an Algorithm
After all the phases are completed successfully, we code the algorithm. The specification should not depend on any particular programming language; we use a general notation (pseudo code) and English language statements.
Testing of Algorithms
2) Posteriori Analysis: Here we collect actual statistics about the algorithm's consumption of time and space while it is executing. Once the algorithm is written it has to be tested. Testing a program consists of two major phases:
i) Debugging: the process of executing the program on sample data sets to determine whether it produces correct results. If faulty results occur, the program has to be corrected.
ii) Profiling: the process of executing a correct program on actual data sets and measuring the time and space it takes to compute the results during execution. The measurement of the actual time taken by the algorithm to process the data is called profiling.
Space Complexity: The space complexity of an algorithm is the amount of memory it needs
to run to completion.
Time Complexity: The time complexity of an algorithm is the amount of computer time it
needs to run to completion.
Space Complexity: The space needed by the algorithms is seen to be the sum of the
following components:
1) A fixed part that is independent of the characteristics of the inputs and outputs. This
part typically includes the instruction space, space for simple variables, space for
constants, etc.
2) A variable part that consists of the space needed by component variables whose size is
dependent on the particular problem instance being solved; the space needed by
reference variables and the recursion stack space.
Consider, for example, Algorithm abc(a,b,c) { return a+b+c; }. Here the problem instance is characterized by the specific values of a, b and c. Making the assumption that one word is adequate to store the values of each of a, b, c and the result, we see that the space needed by abc is independent of the instance characteristics:
Sp(instance characteristics) = 0
Space requirement S(P) = 3 + 0 = 3
(one word for each of a, b and c)
Space complexity is O(1)
Algorithm addition(a,n)
// a is an array of size n
{
sum := 0.0;             // 1 step
for i := 1 to n do      // the loop control executes n+1 times
sum := sum + a[i];      // executes n times
return sum;             // 1 step
}
Space needed by n is one word; space needed by a is n words; space needed by i and sum is one word each.
S(addition) = 3 + n
Space complexity is O(n)
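A C version of this iterative algorithm is sketched below (the names are ours); the comments carry the step counts used above, which total 2n + 3:

/* Iterative sum of n array elements (C sketch of Algorithm addition). */
double addition(const double a[], int n)
{
    double sum = 0.0;               /* 1 step */
    for (int i = 0; i < n; i++)     /* loop control: n+1 executions */
        sum = sum + a[i];           /* n executions */
    return sum;                     /* 1 step */
}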
Algorithm add(a,n)
// a is an array of size n
{
if (n=0) then return 0.0;
else
return a[n]+add(a,n-1);
}
The instances are characterized by n. The recursion stack space includes space for the formal parameters, the local variables, and the return address. Assume that the return address requires only one word of memory. Each call to add requires at least three words (for n, the return address, and a pointer to a[]). Since the depth of recursion is n+1, the recursion stack space needed is >= 3(n+1).
Recursion-stack space = 3(n+1) = 3n + 3 = O(n)
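The same algorithm in C (a sketch; the pseudocode's 1-based a[1..n] becomes a 0-based C array):

/* Recursive sum of the first n elements of a (C sketch of Algorithm add).
   Each activation keeps n, the return address and the array pointer on
   the stack, and the recursion depth is n+1, so stack space is O(n). */
double add(const double a[], int n)
{
    if (n == 0) return 0.0;
    return a[n - 1] + add(a, n - 1);
}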
Time Complexity: The time taken by a program P is the sum of the compile time and the run
time. The compile time does not depend on the instance characteristics. Also we may assume
that a compiled program will run several times without recompilation. Consequently we
concern ourselves with just the run time of a program. This runtime is denoted by tp(instance
characteristics).
A step is any unit of computation whose execution time is independent of the instance characteristics; an entire statement can therefore be regarded as a single step when its execution time does not depend on them. The number of steps assigned to a program statement depends on the kind of statement. For example, comments count as 0 steps, and an assignment statement which does not involve any calls to other algorithms is counted as one step. In an iterative statement such as the for, while and repeat-until statements, we consider the step counts only for the control part of the statement.
Big-Oh Notation (O): Big-oh notation gives an upper bound for a function f(n). The upper bound on f(n) indicates that in the worst case the function f(n) does not consume more than this computing time.
Def: The function f(n) = O(g(n)) (read as “f of n is big-oh of g of n”) if and only if there exist positive constants c and n0 such that f(n) <= c·g(n) for all n >= n0.
Ex 6) for i := 1 to n div 2 do
for j := 1 to n*n do
x := x + 1
Time complexity = (n/2)·n² = n³/2 = O(n³)
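As a worked instance of the definition, explicit constants (our choice of c and n0) can be exhibited to show that 3n + 2 = O(n); in LaTeX:

% Claim: 3n + 2 = O(n). Choose c = 4 and n_0 = 2.
% Then for all n \ge n_0:
f(n) = 3n + 2 \le 3n + n = 4n = c \cdot g(n), \qquad g(n) = n,
% so the definition is satisfied and 3n + 2 = O(n).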
Theta Notation θ
For some functions the lower bound and the upper bound may be same. i.e. big Oh and
omega will have the same function. For example to find minimum or maximum of an array of
elements the computing time is O(n) and Ω(n).
There exists a special notation to denote for functions having the same time complexity for
lower and upper bounds and this notation is called Theta θ Notation.
Def: f(n) = θ(g(n)) iff there exist three positive constants c1, c2 and n0 with the constraint that
c1·g(n) <= f(n) <= c2·g(n) for all n >= n0
Ex 1) f(n) = (1/2)n² - 3n = θ(n²)
Ex 2) f(n) = 3n + 2 = θ(n)
Ex 3) f(n) = 10n² + 4n + 2 = θ(n²)
Ex 4) f(n) = 10·log n + 4 = θ(log n)
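For Ex 2), the constants of the definition can be exhibited explicitly (our choice of c1, c2 and n0); in LaTeX:

% Claim: 3n + 2 = \theta(n). Take c_1 = 3, c_2 = 4, n_0 = 2.
% Then for all n \ge 2:
c_1 g(n) = 3n \le 3n + 2 \le 4n = c_2 g(n), \qquad g(n) = n.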
Little-Oh Notation (o): The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. The bound 2n² = O(n²) is asymptotically tight, but the bound 2n = O(n²) is not. We use o-notation to denote an upper bound that is not asymptotically tight.
Definition: f(n) = o(g(n)) iff, for every constant c > 0, there exists an n0 > 0 such that f(n) < c·g(n) for all n > n0.
Or, equivalently,
lim (n→∞) f(n)/g(n) = 0
Examples
Ex 1) Given f(n) = 3n + 2, the time complexity is o(n²).
Ex 2) Given f(n) = 2ⁿ, prove that f(n) = o(n!).
lim (n→∞) 2ⁿ/n! = 0
Therefore f(n) = o(n!).
1) Best Case : It is the minimum number of steps that can be executed for a given
parameter.
2) Worst case : It is the maximum number of steps that can be executed for a given
parameter.
3) Average case: It is the average number of steps executed for a given parameter.
Average Case: Suppose we want to search for the element 8, whether it is present in the array or not. First A[1] is compared with 8; no match occurs. Then A[2], A[3] and A[4] are compared with 8; no match occurs. Up to now 4 comparisons have taken place. Now A[5] is compared with 8 and a match occurs, so the number of comparisons is 5. The search here takes an average number of comparisons, so it comes under the average case. If there are n elements, then on average we require n/2 comparisons. The time complexity is O(n/2), which is O(n), since we can neglect the constant.
Worst Case: Suppose we want to search for the element 13, whether it is present in the array or not. First A[1] is compared with 13; no match occurs. This process continues until the element is found or the list is exhausted. The element is found at the 9th comparison, so the number of comparisons is 9. The search takes the maximum number of comparisons, so it comes under the worst case.
Time complexity is O(n).
Note: If the element is not found in the list then we have to search the entire list, so that also comes under the worst case.
Ex 2) Binary Search: The array elements are in ascending/descending order. The target is
compared with middle element. If it is equal we stop our search otherwise we repeat the same
process either in the first half or second half of the array depending on the value of target.
Best case time complexity O(1): the target is found at the first comparison.
Average case time complexity O(log n): the average number of comparisons.
Worst case time complexity O(log n): the maximum number of comparisons.
Assume that the number of elements is n = 2^m, since every comparison halves the list:
n = 2^m
log n = log(2^m) = m·log 2
m = log n / log 2 = log₂ n
Maximum number of comparisons will be m in a binary search.
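A C sketch of iterative binary search on an ascending array (the function name and the sample array are ours):

#include <stdio.h>

/* Binary search on an ascending array: each comparison halves the
   remaining range, so at most about log2(n) + 1 probes are needed. */
int binary_search(const int a[], int n, int target)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;    /* middle element */
        if (a[mid] == target) return mid;    /* found: best case O(1) */
        if (a[mid] < target) low = mid + 1;  /* search second half */
        else high = mid - 1;                 /* search first half */
    }
    return -1;  /* not found after ~log2(n) halvings: worst case */
}

int main(void)
{
    int a[] = {2, 5, 8, 12, 16, 23, 38, 56};
    printf("%d\n", binary_search(a, 8, 23));  /* prints 5 */
    return 0;
}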
PERFORMANCE MEASUREMENT
A good algorithm is correct, but a great algorithm is both correct and efficient. The most
efficient algorithm is one that takes the least amount of execution time and memory usage
possible while still yielding a correct answer.
One way to measure the efficiency of an algorithm is to count how many operations it needs in
order to find the answer across different input sizes.
Let's start by measuring the linear search algorithm, which finds a value in a list. The
algorithm looks through each item in the list, checking each one to see if it equals the target
value. If it finds the value, it immediately returns the index. If it never finds the value after
checking every list item, it returns -1.
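Before tracing the iterations, here is a C sketch of that linear search. The six-element list is our assumption, chosen only to be consistent with the trace below (37 at position 2, 45 at position 3); indices are 1-based to match the trace:

#include <stdio.h>

/* Linear search: check each item in turn; return the (1-based) index
   of targetNumber, or -1 if the whole list is exhausted. */
int linear_search(const int numbers[], int length, int targetNumber)
{
    int index = 1;
    while (index <= length) {                   /* loop control check */
        if (numbers[index - 1] == targetNumber)
            return index;                       /* found: stop early */
        index = index + 1;                      /* increment */
    }
    return -1;                                  /* not found */
}

int main(void)
{
    int numbers[] = {11, 37, 45, 57, 93, 97};   /* assumed sample list */
    printf("%d\n", linear_search(numbers, 6, 45));  /* prints 3 */
    return 0;
}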
Iteration #2:
It checks if index is greater than LENGTH(numbers). Since 2 is not greater than 6, it executes
the code inside the loop.
It compares numbers[index] to targetNumber. Since 37 is not equal to 45, it
does not execute the code inside the conditional.
It increments index by 1, so it now stores 3.
Iteration #3:
It checks if index is greater than LENGTH(numbers). Since 3 is not greater than 6, it executes
the code inside the loop.
It compares numbers[index] to targetNumber. Since 45 is equal to 45, it executes the code
inside the conditional.
It returns the current index, 3.
Now let's count the number of operations that the linear search algorithm needed to find that
value, by counting how often each type of operation was called.
Operation                                          Number of times
Initialize index to 1                              1
Check if index is greater than LENGTH(numbers)     3
Compare numbers[index] to targetNumber             3
Increment index by 1                               2
Return index                                       1
That's a total of 10 operations to find the targetNumber at the index of 3. Notice the connection
to the number 3? The loop repeated 3 times, and each time, it executed 3 operations.
The best case for an algorithm is the situation which requires the least number of
operations. According to that table, the best case for linear search is
when targetNumber is the very first item in the list.
When the targetNumber is in the middle of the list, the number of operations grows linearly with the list size:
List size    Operations
60           91
600          901
Las Vegas − A Las Vegas randomized algorithm never gives an incorrect output; instead, its running time is the random variable. For example, in string matching, a Las Vegas algorithm starts over from the beginning once it encounters an error, which guarantees that the output is correct. E.g., the Randomized Quick Sort algorithm.
Monte Carlo − A Monte Carlo randomized algorithm focuses on finishing its execution within the given time constraint; its running time is therefore deterministic, although the output may occasionally be incorrect. For example, in string matching, if a Monte Carlo algorithm encounters an error, it restarts from the same point, thus saving time. E.g., Karger's Minimum Cut algorithm.
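As a concrete Las Vegas example, here is a C sketch of randomized quicksort (our own implementation, not taken from the text): the result is always correctly sorted, and only the running time depends on the random pivot choices.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partition a[low..high] around a pivot chosen uniformly at random. */
static int partition(int a[], int low, int high)
{
    int p = low + rand() % (high - low + 1);  /* random pivot index */
    swap(&a[p], &a[high]);
    int pivot = a[high], i = low - 1;
    for (int j = low; j < high; j++)
        if (a[j] <= pivot) swap(&a[++i], &a[j]);
    swap(&a[i + 1], &a[high]);
    return i + 1;
}

/* Las Vegas: output is always sorted; running time is random. */
void randomized_quicksort(int a[], int low, int high)
{
    if (low < high) {
        int p = partition(a, low, high);
        randomized_quicksort(a, low, p - 1);
        randomized_quicksort(a, p + 1, high);
    }
}

int main(void)
{
    srand((unsigned)time(NULL));
    int a[] = {9, 3, 7, 1, 8, 2};
    randomized_quicksort(a, 0, 5);
    for (int i = 0; i < 6; i++) printf("%d ", a[i]);  /* 1 2 3 7 8 9 */
    return 0;
}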
Need for Randomized Algorithms
This approach is usually adopted to reduce the time complexity and space complexity. There might be some ambiguity about how adding randomness can decrease the running time and memory used instead of increasing them; we will understand this using game theory.
The Game Theory and Randomized Algorithms
Game theory provides a few models that help us understand how decision-makers in a game interact with each other. These game-theoretical models use assumptions to figure out the decision-making structure of the players in a game. The popular assumptions made by these models are that the players are rational and take into account what the opponent would decide to do in a particular situation of the game. We will apply this theory to randomized algorithms.
Zero-sum game
The zero-sum game is a mathematical representation from game theory. It has two players, where the result is a gain for one player and an equivalent loss for the other. So the net improvement, i.e. the sum of both players' gains and losses, is zero.