Complexity Analysis of Algorithms
Topics
1. Asymptotic Analysis
2. Best, Worst and Average Case Analysis
3. Asymptotic Notations
4. Analysis of Loops
5. Analysis of Recursive Algorithms: Solving Recurrences
6. Amortized Analysis
7. Time Complexity Table
8. Space Complexity
9. Problems & Solutions
Asymptotic Analysis
Given two algorithms for a task, how do we find out which one is better?
One naive way of doing this is to implement both algorithms, run the two programs on
your computer for different inputs, and see which one takes less time. There are several
problems with this approach to analyzing algorithms.
1. It might be possible that for some inputs the first algorithm performs better than the
second, while for other inputs the second performs better than the first.
2. It might also be possible that for some inputs the first algorithm performs better on one
machine, while for other inputs the second performs better on another machine.
Asymptotic Analysis is the big idea that handles the above issues in analyzing algorithms. In
Asymptotic Analysis, we evaluate the performance of an algorithm in terms of input size (we
don't measure the actual running time). We calculate how the time (or space) taken by
an algorithm increases with the input size.
For example, let us consider the search problem (searching for a given item) in a sorted array.
One way to search is Linear Search (order of growth is linear) and the other is Binary Search
(order of growth is logarithmic).
To understand how Asymptotic Analysis solves the above-mentioned problems in analyzing
algorithms, let us say we run Linear Search on a fast computer and Binary Search on a
slow computer. For small values of the input array size n, the fast computer may take less time.
But after a certain input size, Binary Search will definitely start taking less time than
Linear Search, even though Binary Search is running on the slower machine. The reason is
that the order of growth of Binary Search with respect to input size is logarithmic while the
order of growth of Linear Search is linear. So the machine-dependent constants can always
be ignored beyond a certain input size.
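As a rough, machine-independent illustration of order of growth (a sketch only, not a benchmark; the function names and the choice n = 10^6 are ours), the following C++ program counts how many comparisons each search makes in its worst case:

#include <iostream>
#include <vector>

// Count comparisons made by linear search.
long long linearSearchSteps(const std::vector<int>& a, int key) {
    long long steps = 0;
    for (size_t i = 0; i < a.size(); ++i) {
        ++steps;
        if (a[i] == key) break;
    }
    return steps;
}

// Count comparisons made by binary search on a sorted array.
long long binarySearchSteps(const std::vector<int>& a, int key) {
    long long steps = 0;
    int lo = 0, hi = (int)a.size() - 1;
    while (lo <= hi) {
        ++steps;
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key) break;
        if (a[mid] < key) lo = mid + 1;
        else hi = mid - 1;
    }
    return steps;
}

int main() {
    int n = 1000000;                          // input size
    std::vector<int> a(n);
    for (int i = 0; i < n; ++i) a[i] = i;     // sorted array 0..n-1
    int key = n - 1;                          // worst case for linear search
    std::cout << "Linear search steps: " << linearSearchSteps(a, key) << "\n";  // ~n
    std::cout << "Binary search steps: " << binarySearchSteps(a, key) << "\n";  // ~log2(n)
    return 0;
}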
Also, in Asymptotic Analysis, we always talk about input sizes larger than a constant value. It
might be possible that those large inputs are never given to your software, and an algorithm
which is asymptotically slower always performs better for your particular situation. So you
may end up choosing an algorithm that is asymptotically slower but faster for your software.
Best, Worst & Average Case Analysis
Asymptotic Notations
A. Θ Notation
The Θ (Theta) notation bounds a function from above and below, so it defines exact
asymptotic behavior. A simple way to get the Θ notation of an expression is to drop lower
order terms and ignore leading constants.
For example, consider the following expression: 3n^3 + 6n^2 + 6000 = Θ(n^3).
Dropping lower order terms is always fine because there will always be an n0 after
which Θ(n^3) beats Θ(n^2), irrespective of the constants involved.
For a given function g(n), we denote by Θ(g(n)) the following set of functions:
Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂ and n0 such that 0 ≤ c₁g(n) ≤ f(n)
≤ c₂g(n) for all n ≥ n0 }.
The above definition means: if f(n) is theta of g(n), then the value of f(n) is always
between c₁g(n) and c₂g(n) for large values of n (n ≥ n0). The definition of theta also
requires that f(n) must be non-negative for values of n greater than n0.
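For instance, for f(n) = 3n^3 + 6n^2 + 6000 and g(n) = n^3, one possible (not unique) choice of constants is c₁ = 3, c₂ = 10 and n0 = 10: for all n ≥ 10 we have 3n^3 ≤ 3n^3 + 6n^2 + 6000, and 3n^3 + 6n^2 + 6000 ≤ 10n^3 because 7n^3 ≥ 6n^2 + 6000 once n ≥ 10. Hence f(n) = Θ(n^3) by the definition above.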
B. Big O Notation
The Big O notation defines an upper bound of an algorithm; it bounds a function only
from above. For example, consider the case of Insertion Sort. It takes linear time
[O(n)] in the best case and quadratic time [O(n^2)] in the worst case. We can safely say that
the time complexity of Insertion Sort is O(n^2). Note that O(n^2) also covers linear time.
The Big O notation is useful when we only have an upper bound on the time complexity of
an algorithm. Many times we can easily find an upper bound simply by looking at the
algorithm.
O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for
all n ≥ n0 }.
C. Ω Notation
Just as Big O provides an asymptotic upper bound, the Ω (Omega) notation provides an
asymptotic lower bound; it bounds a function only from below:
Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for
all n ≥ n0 }.
Let us consider the same Insertion Sort example here. The time complexity of
Insertion Sort can be written as Ω(n), but this is not very useful information about
Insertion Sort, as we are generally interested in the worst case (Big O) and sometimes in the
average case (Big Θ).
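To connect these bounds to code, here is a standard Insertion Sort sketch (written here for illustration): the inner while loop does no work when the array is already sorted (best case, O(n)) and shifts up to i elements when it is reverse sorted (worst case, O(n^2)).

#include <vector>

// Insertion sort: best case O(n) (already sorted), worst case O(n^2) (reverse sorted).
void insertionSort(std::vector<int>& a) {
    for (size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        size_t j = i;
        // Shift larger elements one position to the right.
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
    }
}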
Analysis of Loops
3. O(n^c): The time complexity of nested loops is equal to the number of times the
innermost statement is executed. For example, the following sample loops have O(n^2)
time complexity.
a. for (int i = 1; i <= n; i += c) {
       for (int j = 1; j <= n; j += c) {
           // some O(1) expressions
       }
   }
b. for (int i = n; i > 0; i -= c) {
       for (int j = i + 1; j <= n; j += c) {
           // some O(1) expressions
       }
   }
How to calculate time complexity when there are many if-else statements inside loops?
As discussed earlier, the worst case time complexity is the most useful among the best,
average and worst case complexities, so we consider the worst case: we evaluate the situation
where the values in the if-else conditions cause the maximum number of statements to be executed.
For example, consider the linear search function, where we consider the case when the element is
present at the end or not present at all (see the sketch below). When the code is too complex to
consider all if-else cases, we can get an upper bound by ignoring if-else and other complex
control statements.
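A minimal linear search sketch (standard code, shown for illustration): the worst case is when x is at the last index or absent, so the condition inside the loop fails for (almost) every element and the loop runs all n times.

// Linear search: worst case O(n) occurs when x is the last element or not present.
int linearSearch(const int arr[], int n, int x) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == x)    // true at most once; false for every element in the worst case
            return i;
    }
    return -1;              // x not found
}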
Analysis of Recursive Algorithms: Solving Recurrences
The running time of a recursive algorithm is naturally expressed as a recurrence relation.
For example, in Merge Sort, to sort a given array, we divide it into two halves, recursively
repeat the process for the two halves, and finally merge the results. The time complexity of
Merge Sort can therefore be written as T(n) = 2T(n/2) + n (a sketch follows below). There are
many other recursive algorithms like Binary Search, Towers of Hanoi, Quicksort, etc.
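A sketch of Merge Sort (a standard implementation written here for illustration; this text does not show one) makes the recurrence visible: the two recursive calls contribute 2T(n/2) and the merge step contributes the +n term.

#include <vector>

// Merge two sorted halves a[lo..mid] and a[mid+1..hi]: O(n) work.
void merge(std::vector<int>& a, int lo, int mid, int hi) {
    std::vector<int> tmp;
    tmp.reserve(hi - lo + 1);
    int i = lo, j = mid + 1;
    while (i <= mid && j <= hi) tmp.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i <= mid) tmp.push_back(a[i++]);
    while (j <= hi)  tmp.push_back(a[j++]);
    for (int k = 0; k < (int)tmp.size(); ++k) a[lo + k] = tmp[k];
}

// T(n) = 2T(n/2) + n: two half-sized subproblems plus a linear merge.
void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;              // base case: O(1)
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);             // T(n/2)
    mergeSort(a, mid + 1, hi);         // T(n/2)
    merge(a, lo, mid, hi);             // + n
}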
(1) Substitution Method: Recurrence relations themselves are recursive. We solve the
recurrence relation by repeated substitution until we reach a generic form.
For example, consider the recurrence T(n) = T(n-1) + 1 [time to solve a problem of size n]
with T(0) = 1 [time to solve a problem of size 0 - base case].
T(n) = T(n-1) + 1
     = ( T(n-2) + 1 ) + 1    [T(n-1) = T(n-2) + 1]
     = T(n-2) + 2
     = ( T(n-3) + 1 ) + 2    [T(n-2) = T(n-3) + 1]
     = T(n-3) + 3
     ...
     = T(n-k) + k            [after k substitutions]
Since we know that T(0) = 1, we equate n-k to 0, which implies k = n.
∴ T(n) = T(n-k) + k = T(n-n) + n = T(0) + n = 1 + n = O(n), i.e., linear in n.
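As a concrete (hypothetical) function whose running time follows exactly this recurrence: each call does a constant amount of work and makes one recursive call on a problem of size n-1, so the total time is O(n).

// T(n) = T(n-1) + 1 with T(0) = 1, so the running time is O(n).
int sumFirstN(int n) {
    if (n == 0) return 0;            // base case: constant work
    return n + sumFirstN(n - 1);     // constant work plus a subproblem of size n-1
}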
(2) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the
total time taken at every level of the tree. Finally, we sum the work done at all levels. To draw
the recurrence tree, we start from the given recurrence and keep expanding it until we find a
pattern among the levels. The pattern is typically an Arithmetic Progression [AP] or a
Geometric Progression [GP].
For example, consider a recurrence such as T(n) = T(n/4) + T(n/2) + cn^2 (the worked example
assumed here): the root does cn^2 work, and its children together do c(n/4)^2 + c(n/2)^2 =
(5/16)cn^2 work, so each level does 5/16 of the work of the level above it.
To find the value of T(n), we calculate the sum of the tree nodes level by level. Summing the
tree level by level gives the series T(n) = cn^2 (1 + (5/16) + (25/256) + ....).
The above series is a GP with ratio 5/16. To get an upper bound, we can sum the infinite
series [Sum = a / (1 - r)]. We get the sum as (cn^2)/(1 - (5/16)), which is O(n^2).
(3) Master Method (Master's Theorem): The Master Method gives the solution directly for
recurrences of the form
T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1.
Let t = log_b(a) and suppose f(n) = Θ(n^c).
As in the recurrence tree method, we compare the total work done at different levels. If the
work done at the leaves is polynomially more (c < t), the leaves are the dominant part and the
result is the work done at the leaves, Θ(n^t) (Case 1). If the work done at the leaves and at the
root is asymptotically the same (c = t), the result is the height of the tree multiplied by the
work done at any level, Θ(n^t × log2(n)) (Case 2). If the work done at the root is
asymptotically more (c > t), the result is the work done at the root, Θ(f(n)) (Case 3).
Some standard algorithms whose time complexity can be evaluated using the Master's Theorem:
1. Merge Sort: T(n) = 2T(n/2) + n. It falls in Case 2, as c is 1 and t [= log_b(a)] is also 1.
So the solution is O(n×log2(n)).
2. Binary Search: T(n) = T(n/2) + 1. It also falls in Case 2, as c is 0 and t [= log_b(a)] is
also 0. So the solution is O(log2(n)). A sketch is shown below, after the note.
3. Binary Tree Traversal: T(n) = 2T(n/2) + 1. It falls in Case 1, as c is 0 and t [= log_b(a)]
is 1 [c < t]. So the solution is O(n).
Note that it is not necessary that a recurrence of the form T(n) = aT(n/b) + f(n) can be solved
using the Master's Theorem; the three cases have some gaps between them. For example,
the recurrence T(n) = 2T(n/2) + n/log2(n) cannot be solved using the Master's Theorem.
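For example 2 above, a recursive Binary Search sketch (standard code, shown for illustration) does O(1) work per call and recurses on one half, which is exactly T(n) = T(n/2) + 1:

// Binary search on a sorted array: T(n) = T(n/2) + 1, so O(log2(n)) by Case 2.
int binarySearch(const int arr[], int lo, int hi, int x) {
    if (lo > hi) return -1;                        // not found
    int mid = lo + (hi - lo) / 2;                  // O(1) work at this level
    if (arr[mid] == x) return mid;
    if (arr[mid] < x)
        return binarySearch(arr, mid + 1, hi, x);  // recurse on the right half (size ~n/2)
    return binarySearch(arr, lo, mid - 1, x);      // or on the left half; only one call happens
}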
Amortized Analysis
Amortized Analysis is used for algorithms where an occasional operation is very slow, but
most of the other operations are faster. In Amortized Analysis, we analyze a sequence of
operations and guarantee a worst case average time which is lower than the worst case time
of a particular expensive operation.
Examples of data structures whose operations are analyzed using Amortized Analysis are
Hash Tables, Disjoint Sets and Splay Trees.
Let us consider the example of a simple Hash-Table insertion. How do we decide the table size?
There is a trade-off between space and time: if we make our Hash-Table big, search
becomes fast, but the space required becomes high; if we make our Hash-Table smaller,
the search time increases.
Dynamic Table
The solution to this trade-off problem is to use a Dynamic Table (or Dynamic Array). The idea is
to increase the size of the table whenever it becomes full. Following are the steps to follow when
the table becomes full:
1. Allocate memory for a larger table, typically twice the size of the old table.
2. Copy the contents of the old table to the new table.
3. Free the old table.
If the table has space available, we simply insert the new item in the available space.
Examples of Dynamic Tables in different languages are: std::vector in C++, ArrayList in
Java, lists in Python, etc.
What is the time complexity of n insertions using the above scheme (dynamic table)?
If we use simple asymptotic analysis, the worst case cost of a single insertion is O(n). Therefore,
the worst case cost of n inserts is n×O(n), which is O(n^2). This analysis gives an upper bound, but
not a tight upper bound for n insertions, as not all insertions take O(n) time.
With amortized analysis, we can say that the complexity of n insertions is O(n), i.e., amortized O(1)
per insertion: creating and copying the table is a rare but expensive operation whose cost, spread
over all the cheap insertions, adds only a constant per insertion.
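A minimal Dynamic Table sketch in C++ (illustrative only; std::vector already behaves this way internally): the usual insertion costs O(1), and the occasional grow-and-copy costs O(n) but happens so rarely that the cost averaged over n insertions stays O(1) per insertion.

struct DynamicTable {
    int* data = nullptr;
    int size = 0;        // number of items currently stored
    int capacity = 0;    // number of allocated slots

    void insert(int x) {
        if (size == capacity) {                                 // table is full: grow it
            int newCap = (capacity == 0) ? 1 : 2 * capacity;    // step 1: allocate roughly twice the size
            int* bigger = new int[newCap];
            for (int i = 0; i < size; ++i) bigger[i] = data[i]; // step 2: copy old contents, O(n)
            delete[] data;                                      // step 3: free the old table
            data = bigger;
            capacity = newCap;
        }
        data[size++] = x;                                       // common case: O(1) insertion
    }

    ~DynamicTable() { delete[] data; }
};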
Time Complexity Table
To build a rough time complexity table we need three parameters: the size of the input, the
number of machine instructions executed, and the clock frequency of the processor. In the real
world these parameters are highly system dependent and keep changing from one system to another.
So, for simplicity, we fix the three parameters as follows:
1. Size of the input (n): n = 10^6. We take this input size because it is neither too big nor
too small for our analysis.
2. Number of machine instructions executed internally: For this parameter, we assume
that a single statement, such as one loop iteration, is internally just a single machine
instruction, i.e., 1 iteration = 1 machine instruction on our system.
3. Clock frequency = 1 GHz, i.e., if a processor's speed is 1 GHz, then in one second the
processor can generate 10^9 clock cycles.
○ If we assume that 1 clock cycle = 1 instruction, then
○ 10^9 clock cycles per second => 10^9 instructions per second.
○ If our processor can execute 10^9 instructions per second, then the
execution time for a single instruction is T = 10^-9 seconds = 1 ns.
Therefore, for the values n = 10^6 and T = 10^-9 seconds, our Time Complexity Table is:
Big-O                  n^3        n^2       n×log2(n)   n       log2(n)   1       2^n [n=30]   2^n [n=60]
Total #Instructions*   10^18      10^12     2×10^7      10^6    20        1       ~10^9        ~10^18
Time Taken**           31.7 yrs   16 mins   0.02 sec    1 ms    20 ns     1 ns    1 sec        31.7 yrs
*Total #Instructions: We have n = 10^6; therefore, an O(n^3) solution executes (10^6)^3 = 10^18 instructions.
**Time Taken: This is the total time taken by the processor to execute the given solution, i.e., if a
solution for some problem takes O(n^3) time, then for n = 10^6 we have 10^18 instructions to be executed
by the processor, as shown in the table above. Therefore, on a 1 GHz processor (T = 10^-9 seconds,
where T is the execution time for a single instruction):
If 1 instruction takes 10^-9 seconds, then 10^18 instructions take 10^18 × 10^-9 = 10^9 seconds
= 10^9 / (60×60×24×365) years ≈ 31.7 years.
Space Complexity
The term Space Complexity is often misused to mean Auxiliary Space. Following are the
correct definitions of Auxiliary Space and Space Complexity:
Auxiliary Space: The extra or temporary space used by an algorithm, not counting the space
taken by the input itself.
Space Complexity: The total space taken by the algorithm with respect to its input size. Space
complexity includes both auxiliary space and the space used by the input.
For example, if we want to compare standard sorting algorithms on the basis of space, then
auxiliary space is the better criterion. Merge Sort uses O(n) auxiliary space, while Insertion
Sort and Heap Sort use O(1) auxiliary space. The space complexity of all these sorting
algorithms is O(n), though, since each takes an input of size n.
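As a small (hypothetical) illustration of the distinction: both functions below take an input of size n, so both have O(n) space complexity, but only the second allocates an extra buffer, so their auxiliary space is O(1) and O(n) respectively.

#include <algorithm>
#include <vector>

// O(1) auxiliary space: reverses the array in place.
void reverseInPlace(std::vector<int>& a) {
    int n = (int)a.size();
    for (int i = 0, j = n - 1; i < j; ++i, --j)
        std::swap(a[i], a[j]);
}

// O(n) auxiliary space: builds and returns a reversed copy of the input.
std::vector<int> reversedCopy(const std::vector<int>& a) {
    return std::vector<int>(a.rbegin(), a.rend());   // extra buffer of size n
}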
Problems & Solutions
b. The order of growth of log2(n!) and n×log2(n) is the same for large values of n, i.e.,
O(log2(n!)) = O(n×log2(n)). So the time complexity of fun() is O(n×log2(n)).
c. The expression O(log2(n!)) = O(n×log2(n)) can be derived from Stirling's approximation
(Stirling's formula), ln(n!) = n×ln(n) - n + O(ln(n)), which gives log2(n!) = Θ(n×log2(n)).
In general, the asymptotic value can be written as (n^(k+1))/(k+1), i.e., O(n^(k+1)/(k+1)).