LECTURE NOTES ON
DESIGN AND ANALYSIS OF ALGORITHMS
CONTENTS
CHAPTER 1 BASIC CONCEPTS
Algorithm
Performance of Programs
Algorithm Design Goals
Classification of Algorithms
Complexity of Algorithms
Rate of Growth
Analyzing Algorithms
The Rule of Sums
The Rule of products
The Running time of Programs
Measuring the running time of programs
Asymptotic Analyzing of Algorithms
Calculating the running time of programs
General rules for the analysis of programs
General Method
Control Abstraction of Divide and Conquer
Binary Search
External and Internal path length
Merge Sort
Strassen’s Matrix Multiplication
Quick Sort
Straight Insertion Sort
Graph Algorithms
CHAPTER 7 Backtracking
General method
Terminology
N-Queens problem
Sum of Subsets
Graph Coloring (for planar graphs)
Hamiltonian Cycles
0/1 Knapsack
Traveling Sales Person using Backtracking
General method
Least Cost (LC) Search
Control Abstraction for LC-Search
Bounding
The 15-Puzzle problem
LC Search for 15-Puzzle Problem
Job Sequencing with deadlines
Traveling Sales Person problem
0/1 Knapsack
Chapter 1
Basic Concepts
Algorithm
Input: there are zero or more quantities, which are externally supplied;
Finiteness: if we trace out the instructions of an algorithm, then for all cases
the algorithm will terminate after a finite number of steps;
Performance of a program:
The performance of a program is the amount of computer memory and time needed
to run a program. We use two approaches to determine the performance of a
program. One is analytical, and the other experimental. In performance analysis we
use analytical methods, while in performance measurement we conduct experiments.
Time Complexity:
The limiting behavior of the complexity as size increases is called the asymptotic time
complexity. It is the asymptotic complexity of an algorithm, which ultimately
determines the size of problems that can be solved by the algorithm.
Space Complexity:
Instruction space: Instruction space is the space needed to store the compiled
version of the program instructions.
Data space: Data space is the space needed to store all constant and variable
values. Data space has two components:
The three basic design goals that one should strive for in a program are:
Classification of Algorithms
In what follows, ‘n’ is the number of data items to be processed, the degree of a polynomial, the size of the file to be sorted or searched, the number of nodes in a graph, and so on.
Log n     When the running time of a program is logarithmic, the program gets only slightly slower as n grows. This running time commonly occurs in programs that solve a big problem by transforming it into a smaller problem, cutting the size by some constant fraction. When n grows from a thousand to a million, log n only doubles. Whenever n doubles, log n increases by a constant, but log n does not double until n increases to n².
n When the running time of a program is linear, it is generally the case that
a small amount of processing is done on each input element. This is the
optimal situation for an algorithm that must process n inputs.
n log n   This running time arises for algorithms that solve a problem by breaking it up into smaller sub-problems, solving them independently, and then combining the solutions. When n doubles, the running time more than doubles.
Complexity of Algorithms
The complexity of an algorithm M is the function f(n) which gives the running time
and/or storage space requirement of the algorithm in terms of the size ‘n’ of the
input data. Mostly, the storage space required by an algorithm is simply a multiple of
the data size ‘n’. Complexity shall refer to the running time of the algorithm.
The function f(n), which gives the running time of an algorithm, depends not only on the size ‘n’ of the input data but also on the particular data. The complexity function f(n) for certain cases is:
1. Best Case : The minimum possible value of f(n) is called the best case.
2. Average Case : The expected value of f(n) is called the average case.
3. Worst Case : The maximum value of f(n) for any possible input is called the worst case.
Rate of Growth:
The following notations are commonly used in performance analysis to characterize the complexity of an algorithm:
1. Big–OH (O)¹,
2. Big–OMEGA (Ω),
3. Big–THETA (Θ) and
4. Little–OH (o)
f(n) = O(g(n)), (pronounced order of or big oh), says that the growth rate of f(n) is less than or equal to (≤) that of g(n).
f(n) = Ω(g(n)) (pronounced omega), says that the growth rate of f(n) is greater than or equal to (≥) that of g(n).
f(n) = Θ(g(n)) (pronounced theta), says that the growth rate of f(n) equals the growth rate of g(n) [i.e. f(n) = O(g(n)) and f(n) = Ω(g(n))].
¹ In 1892, P. Bachmann invented a notation for characterizing the asymptotic behavior of functions. His invention has come to be known as big oh notation.
Little–OH (o)
T(n) = o(p(n)) (pronounced little oh), says that the growth rate of T(n) is strictly less than the growth rate of p(n) [i.e. T(n) = O(p(n)) and T(n) ≠ Θ(p(n))].
Analyzing Algorithms
Suppose ‘M’ is an algorithm, and suppose ‘n’ is the size of the input data. Clearly the
complexity f(n) of M increases as n increases. It is usually the rate of increase of f(n)
we want to examine. This is usually done by comparing f(n) with some standard
functions. The most common computing times are:
O(1), O(log₂ n), O(n), O(n log₂ n), O(n²), O(n³), O(2ⁿ), n! and nⁿ
The execution time for six of the typical functions is given below:
n      log₂ n   n·log₂ n   n²       n³           2ⁿ
1      0        0          1        1            2
2      1        2          4        8            4
4      2        8          16       64           16
8      3        24         64       512          256
16     4        64         256      4,096        65,536
32     5        160        1,024    32,768       4,294,967,296
64     6        384        4,096    262,144      Note 1
128    7        896        16,384   2,097,152    Note 2
256    8        2,048      65,536   16,777,216   ????????
Note 2: The value here is about 500 billion times the age of the universe in
nanoseconds, assuming a universe age of 20 billion years.
Graph of log n, n, n log n, n², n³, 2ⁿ, n! and nⁿ
O(log n) does not depend on the base of the logarithm. To simplify the analysis, the convention is that there are no particular units of time. Thus we throw away leading constants. We will also throw away low–order terms while computing a Big–Oh
running time. Since Big-Oh is an upper bound, the answer provided is a guarantee
that the program will terminate within a certain time period. The program may stop
earlier than this, but never later.
One way to compare the function f(n) with these standard functions is to use the functional ‘O’ notation. Suppose f(n) and g(n) are functions defined on the positive integers with the property that f(n) is bounded by some multiple of g(n) for almost all ‘n’. Then,
f(n) = O(g(n))
which is read as “f(n) is of order g(n)”. For example, the order of complexity for:
Suppose that T1(n) and T2(n) are the running times of two program fragments P1 and P2, and that T1(n) is O(f(n)) and T2(n) is O(g(n)). Then T1(n) + T2(n), the running time of P1 followed by P2, is O(max(f(n), g(n))); this is called the rule of sums.
For example, suppose that we have three steps whose running times are respectively O(n²), O(n³) and O(n log n). Then the running time of the first two steps executed sequentially is O(max(n², n³)), which is O(n³). The running time of all three together is O(max(n³, n log n)), which is O(n³).
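As an illustration only (this fragment is not from the notes), here is a C sketch of three sequential steps with running times O(n²), O(n³) and O(n log n); by the rule of sums the whole function is O(max(n², n³, n log n)) = O(n³). The names three_steps and cmp, and the use of the library qsort for the O(n log n) step, are assumptions made for this example:

#include <stdlib.h>

/* comparison function for qsort */
static int cmp(const void *p, const void *q)
{
    int x = *(const int *)p, y = *(const int *)q;
    return (x > y) - (x < y);
}

void three_steps(int a[], int n)
{
    long count = 0;

    /* step 1: O(n^2) - a double loop */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            count++;

    /* step 2: O(n^3) - a triple loop */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                count++;

    /* step 3: O(n log n) - sort the array with the standard library */
    qsort(a, (size_t)n, sizeof a[0], cmp);
}

By the rule of sums, the running time of the three steps executed one after another is dominated by the O(n³) step.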
If T1(n) and T2(n) are O(f(n)) and O(g(n)) respectively, then T1(n)*T2(n) is O(f(n) g(n)). It follows from the product rule that O(c f(n)) means the same thing as O(f(n)) if ‘c’ is any positive constant. For example, O(n²/2) is the same as O(n²).
Suppose that we have five algorithms A1–A5 with the following time complexities:
A1 : n
A2 : n log n
A3 : n²
A4 : n³
A5 : 2ⁿ
The time complexity is the number of time units required to process an input of size ‘n’. Assuming that one unit of time equals one millisecond, the size of the problems that can be solved by each of these five algorithms is:
The speed of computation has increased so much over the last thirty years that it might seem efficiency in algorithms is no longer important. But, paradoxically, efficiency matters more today than ever before. The reason is that our ambition has grown with our computing power: virtually all applications of computing, such as the simulation of physical systems, are demanding more speed.
The faster computers run, the greater the need for efficient algorithms to take advantage of their power. As computers become faster and we can handle larger problems, it is the complexity of an algorithm that determines the increase in problem size that can be achieved with an increase in computer speed.
Suppose the next generation of computers is ten times faster than the current generation; from the table we can see the increase in the size of the problem.
When solving a problem we are faced with a choice among algorithms. The basis for
this can be any one of the following:
ii. We would like an algorithm that makes efficient use of the computer’s resources, especially one that runs as fast as possible.
2. The quality of code generated by the compiler used to create the object
program.
3. The nature and speed of the instructions on the machine used to execute the
program, and
Ideally, the running time would depend not on the exact input but only on the size of the input. For many programs, however, the running time is really a function of the particular input, and not just of the input size. In that case we define T(n) to be the worst-case running time, i.e. the maximum, over all inputs of size ‘n’, of the running time on that input. We also consider Tavg(n), the average, over all inputs of size ‘n’, of the running time on that input. In practice, the average running time is often much harder to determine than the worst-case running time. Thus, we will use worst-case running time as the principal measure of time complexity.
In view of remarks (2) and (3), we cannot express the running time T(n) in standard time units such as seconds. Rather, we can only make remarks like “the running time of such and such an algorithm is proportional to n²”. The constant of proportionality will remain unspecified, since it depends so heavily on the compiler, the machine and other factors.
Our approach is based on the asymptotic complexity measure. This means that we
don’t try to count the exact number of steps of a program, but how that number
grows with the size of the input to the program. That gives us a measure that will
work for different operating systems, compilers and CPUs. The asymptotic complexity
is written using big-O notation.
The most important property is that big-O gives an upper bound only. If an algorithm is O(n²), it doesn’t have to take n² steps (or a constant multiple of n²). But it can’t
take more than a constant multiple of n² steps. So any algorithm that is O(n) is also an O(n²) algorithm. If this seems confusing, think of big-O as being like "<". Any number that is < n is also < n².
1. Ignoring constant factors: O(c f(n)) = O(f(n)), where c is a constant; e.g. O(20n³) = O(n³).
2. Ignoring smaller terms: If a < b then O(a + b) = O(b); for example O(n² + n) = O(n²).
4. n and log n are "bigger" than any constant, from an asymptotic view (that
means for large enough n). So if k is a constant, an O(n + k) algorithm is
also O(n), by ignoring smaller terms. Similarly, an O(log n + k) algorithm
is also O(log n).
Let us now look into how big-O bounds can be computed for some common
algorithms.
Example 1:
x = 3*y + 2;
z = z + 1;
If y, z are scalars, this piece of code takes a constant amount of time, which we write
as O(1). In terms of actual computer instructions or clock ticks, it’s difficult to say
exactly how long it takes. But whatever it is, it should be the same whenever this
piece of code is executed. O(1) means some constant; it might be 5, or 1, or 1000.
Example 2:
Example 3:
If the first program takes 100n² milliseconds and the second takes 5n³ milliseconds, might the 5n³ program not be better than the 100n² program?
The programs can be evaluated by comparing their running-time functions, with constants of proportionality neglected. Is the 5n³ program better than the 100n² program? Since
5n³ / 100n² = n/20,
for inputs n < 20 the program with running time 5n³ will be faster than the one with running time 100n². Therefore, if the program is to be run mainly on inputs of small size, we would indeed prefer the program whose running time is O(n³).
However, as ‘n’ gets large, the ratio of the running times, which is n/20, gets arbitrarily large. Thus, as the size of the input increases, the O(n³) program will take significantly more time than the O(n²) program. So it is always better to prefer a program whose running time has the lower growth rate. Low growth rate functions such as O(n) or O(n log n) are always better.
Example 4:
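The loop itself is not reproduced in this copy of the notes; the following is a minimal C sketch of the kind of loop being described, assuming an array a of n elements (the name sum_array is illustrative only):

/* sums the n elements of a[]; the loop body does constant work per element */
int sum_array(const int a[], int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)   /* executes exactly n times */
        sum = sum + a[i];
    return sum;
}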
This loop will run exactly n times, and because the inside of the loop takes constant
time, the total running time is proportional to n. We write it as O(n). The actual
number of instructions might be 50n, while the running time might be 17n
microseconds. It might even be 17n+3 microseconds because the loop needs some
time to start up. The big-O notation allows a multiplication factor (like 17) as well as
an additive factor (like 3). As long as it’s a linear function which is proportional to n,
the correct notation is O(n) and the code is said to have linear running time.
Example 5:
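Again the code is missing in this copy; a minimal C sketch of the doubly nested loop being analysed, assuming an n × n array a (the name clear_matrix is illustrative only):

/* sets every entry of an n x n matrix to zero */
void clear_matrix(int n, int a[n][n])
{
    for (int i = 0; i < n; i++)        /* outer loop: n iterations */
        for (int j = 0; j < n; j++)    /* inner loop: n iterations per outer iteration */
            a[i][j] = 0;               /* constant-time assignment, executed n * n times */
}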
The outer for loop executes n times, while the inner loop executes n times for every execution of the outer loop. That is, the inner loop body executes n × n = n² times. The assignment statement in the inner loop takes constant time, so the running time of the code is O(n²) steps. This piece of code is said to have quadratic running time.
Example 6:
Let’s start with an easy case: multiplying two n × n matrices. The code to compute the matrix product C = A * B is given below.
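The code referred to above did not survive in this copy; a minimal C sketch of the straightforward triple-loop product, assuming n × n arrays A, B and C (the name matrix_multiply is illustrative only):

/* computes C = A * B for n x n matrices */
void matrix_multiply(int n, const double A[n][n], const double B[n][n], double C[n][n])
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            C[i][j] = 0.0;                                 /* clear the entry first */
            for (int k = 0; k < n; k++)                    /* innermost loop: runs n times */
                C[i][j] = C[i][j] + A[i][k] * B[k][j];     /* one scalar product and sum */
        }
}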
There are 3 nested for loops, each of which runs n times. The innermost loop therefore executes n*n*n = n³ times. The innermost statement, which contains a scalar sum and product, takes constant O(1) time. So the algorithm overall takes O(n³) time.
Example 7:
The main body of the code for bubble sort looks something like this:
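The code itself is missing in this copy; a minimal C sketch of the main body of bubble sort, assuming an array a of n elements (the name bubble_sort is illustrative only):

/* sorts a[0..n-1] into ascending order by repeatedly swapping adjacent out-of-order pairs */
void bubble_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)              /* outer loop: n-1 passes */
        for (int j = 0; j < n - 1 - i; j++)      /* inner loop: n-1-i comparisons on pass i */
            if (a[j] > a[j + 1]) {               /* adjacent pair out of order: swap */
                int t = a[j];
                a[j] = a[j + 1];
                a[j + 1] = t;
            }
}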
This looks like the double loop we saw earlier. The innermost statement, the if, takes O(1) time. It doesn’t necessarily take the same time when the condition is true as it does when it is false, but both times are bounded by a constant. But there is an important difference here. The outer loop executes n times, but the inner loop executes a number of times that depends on i. The first time the inner for executes, it runs n-1 times. The second time it runs n-2 times, etc. The total number of times the inner if statement executes is therefore:
Σ_{i=1}^{n−1} (n − i) = n(n−1)/2 = n²/2 − n/2
The value of the sum is n(n-1)/2. So the running time of bubble sort is O(n(n-1)/2), which is O((n²-n)/2). Using the rules for big-O given earlier, this bound simplifies to O(n²/2) by ignoring a smaller term, and to O(n²) by ignoring a constant factor. Thus, bubble sort is an O(n²) algorithm.
Example 8:
Binary search is a little harder to analyze because it doesn’t have a for loop. But it’s still pretty easy because the search interval halves each time we iterate the search. The sequence of search interval sizes looks something like this: n, n/2, n/4, ..., 1. It’s not obvious how long this sequence is, but if we take logs, it is: log₂ n, log₂ n − 1, log₂ n − 2, ..., 0. Since the second sequence decrements by 1 each time down to 0, its length must be log₂ n + 1. It takes only constant time to do each test of binary search, so the total running time is just the number of times that we iterate, which is log₂ n + 1. So binary search is an O(log₂ n) algorithm. Since the base of the log doesn’t matter in an asymptotic bound, we can write that binary search is O(log n).
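For concreteness, a minimal C sketch of the binary search loop being analysed, assuming a sorted array a of n elements and a search key x (the name binary_search is illustrative only):

/* returns the index of x in the sorted array a[0..n-1], or -1 if x is not present */
int binary_search(const int a[], int n, int x)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {                   /* the interval [lo, hi] halves on every iteration */
        int mid = lo + (hi - lo) / 2;    /* midpoint, written this way to avoid overflow */
        if (a[mid] == x)
            return mid;                  /* found */
        else if (a[mid] < x)
            lo = mid + 1;                /* continue in the upper half */
        else
            hi = mid - 1;                /* continue in the lower half */
    }
    return -1;                           /* not present */
}

Each pass through the while loop costs O(1), and the loop runs at most log₂ n + 1 times, which matches the bound derived above.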
1. The running time of each assignment, read, and write statement can usually be taken to be O(1). (There are a few exceptions, such as in PL/1, where assignments can involve arbitrarily large arrays, and in any language that allows function calls in assignment statements.)
4. The time to execute a loop is the sum, over all times around the loop, of the time to execute the body and the time to evaluate the condition for termination (usually the latter is O(1)). Often this time is, neglecting constant factors, the product of the number of times around the loop and the largest possible time for one execution of the body, but we must consider each loop separately to make sure.