Analysis and Design of Algorithm Notes
What is an Algorithm?
An algorithm is a step-by-step set of instructions or a well-defined procedure for solving a
specific problem or accomplishing a particular task. Algorithms are used in various fields of
computer science, mathematics, and other disciplines to automate processes, make
decisions, and perform computations. Here are some key properties and characteristics of
algorithms:
1. Input: An algorithm takes one or more inputs, which are the data or information
necessary to perform a specific task. The input is typically provided as a set of
parameters or variables.
2. Output: An algorithm produces one or more outputs, which are the results or
solutions to the problem it aims to solve. The output should be clearly defined and
relevant to the problem.
3. Finiteness: Algorithms must be finite, meaning they have a well-defined start and
end. They should terminate after a finite number of steps or operations. Infinite
loops or processes are not considered algorithms.
4. Definiteness: Each step in an algorithm must be precisely defined and unambiguous,
leaving no room for interpretation. This ensures that anyone following the algorithm
will arrive at the same result.
5. Effectiveness: An algorithm must be effective, meaning that it can be executed using
a finite amount of resources (such as time and memory). It should be practical and
feasible to implement the algorithm on a real computer.
6. Determinism: Algorithms are deterministic, meaning that for a given input, they will
always produce the same output. There should be no randomness or uncertainty in
the algorithm's behavior.
7. Correctness: An algorithm is correct if it solves the problem it was designed for and
produces the expected output for all valid inputs. Correctness is typically established
through rigorous testing and mathematical proofs.
8. Efficiency: Efficiency is an important consideration in algorithm design. An algorithm
should perform its task in a reasonable amount of time and use a reasonable amount
of resources. This involves analyzing time and space complexity.
9. Termination: An algorithm must eventually terminate, meaning it will finish
executing and provide an output. Endless or non-terminating processes are not
considered algorithms.
10. Generality: Ideally, an algorithm should be designed to solve a class of problems
rather than a specific instance. It should be adaptable to various inputs within its
problem domain.
11. Modularity: Algorithms can be designed with modularity in mind, breaking down
complex tasks into smaller, more manageable subproblems. This can make
algorithms easier to understand, maintain, and reuse.
12. Optimality: In some cases, an algorithm may strive to find the optimal solution,
meaning the best possible outcome according to some criteria (e.g., shortest path in
a graph). Not all algorithms need to be optimal; some aim for approximations or
heuristics.
Big O Notation
Big-O (O) notation gives an asymptotic upper bound for a function f(n).
We write f(n) = O(g(n)) if there are positive constants n0 and c such that, to the right of n0, f(n) always lies on or below c*g(n).
O(g(n)) = {f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c*g(n), for all n ≥ n0}
Big Omega Notation
Big-Omega (Ω) notation gives an asymptotic lower bound for a function f(n).
We write f(n) = Ω(g(n)) if there are positive constants n0 and c such that, to the right of n0, f(n) always lies on or above c*g(n).
Ω(g(n)) = {f(n) : there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n), for all n ≥ n0}
Big Theta Notation
Big-Theta (Θ) notation gives a tight bound for a function f(n), to within constant factors.
We write f(n) = Θ(g(n)) if there are positive constants n0, c1, and c2 such that, to the right of n0, f(n) always lies between c1*g(n) and c2*g(n) inclusive.
Θ(g(n)) = {f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n), for all n ≥ n0}
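As a worked illustration (added here), consider f(n) = 3n^2 + 2n. For all n ≥ 1 we have 3n^2 + 2n ≤ 3n^2 + 2n^2 = 5n^2, so f(n) = O(n^2) with c = 5 and n0 = 1. Likewise 3n^2 + 2n ≥ 3n^2 for all n ≥ 1, so f(n) = Ω(n^2) with c = 3. Taking c1 = 3 and c2 = 5 together gives f(n) = Θ(n^2).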
Best, worst, and average case analysis are three different ways of analysing the performance
of an algorithm.
• Best case analysis: This analysis considers the minimum amount of time and space
that the algorithm will require for any input of a given size.
• Worst case analysis: This analysis considers the maximum amount of time and space
that the algorithm will require for any input of a given size.
• Average case analysis: This analysis considers the average amount of time and space
that the algorithm will require for all possible inputs of a given size.
Best case analysis is often used to give a theoretical lower bound on the performance of an
algorithm. Worst case analysis is often used to give a theoretical upper bound on the
performance of an algorithm. Average case analysis is often used to give a more realistic
estimate of the performance of an algorithm in practice.
• Best-case analysis provides insight into the lower performance bound and helps
identify scenarios where an algorithm excels.
• Worst-case analysis establishes an upper performance bound, ensuring that an
algorithm doesn't degrade unacceptably in any situation.
• Average-case analysis offers a probabilistic view of an algorithm's performance under
typical conditions, providing a more realistic assessment.
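As a concrete illustration (an added example), consider linear search in a list of n elements:

def linear_search(arr, target):
    """Return the index of target in arr, or -1 if it is absent."""
    for i, value in enumerate(arr):
        if value == target:
            return i          # found: stop early
    return -1

# Best case: target is the first element -> 1 comparison, O(1).
# Worst case: target is last or absent   -> n comparisons, O(n).
# Average case: target equally likely at any position -> about n/2 comparisons, O(n).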
The key idea behind amortized analysis is to spread the cost of an expensive operation over
several operations. For example, consider a dynamic array data structure that is resized when
it runs out of space. The cost of resizing the array is expensive, but it can be amortized over
several insertions into the array, so that the average time complexity of an insertion operation
is constant.
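A minimal sketch of that idea (added here for illustration), using a toy array that doubles its capacity when full:

class DynamicArray:
    """Toy dynamic array: append is O(n) when a resize happens,
    but O(1) amortized over a sequence of appends."""
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity

    def append(self, value):
        if self.size == self.capacity:        # expensive step: copy everything
            self.capacity *= 2
            new_data = [None] * self.capacity
            for i in range(self.size):
                new_data[i] = self.data[i]
            self.data = new_data
        self.data[self.size] = value          # cheap step
        self.size += 1

# Doubling means n appends cause at most n + n/2 + n/4 + ... < 2n copies in total,
# so the amortized cost per append is O(1).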
To analyze a programming code or algorithm, we must notice that each instruction affects
the overall performance of the algorithm and therefore, each instruction must be analyzed
separately to analyze overall performance. However, there are some algorithm control
structures which are present in each programming code and have a specific asymptotic
analysis.
1. Sequencing
2. If-then-else
3. for loop
4. While loop
1. Sequencing:
Suppose our algorithm consists of two parts A and B that are executed one after the other. A takes time tA and B takes time tB for its computation. By the sequence rule, the total computation time is tA + tB; by the maximum rule, this is asymptotically Θ(max(tA, tB)).
Example:
Computation time = tA + tB
= Θ(max(tA, tB))
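For instance (an added sketch in Python), part A below is a single O(n) loop and part B is a nested O(n^2) loop; executed in sequence they cost O(n) + O(n^2) = O(n^2):

def sequencing_example(arr):
    n = len(arr)
    total = 0
    for x in arr:              # part A: O(n)
        total += x
    pairs = 0
    for i in range(n):         # part B: O(n^2)
        for j in range(n):
            pairs += 1
    # Total time: O(n) + O(n^2) = O(max(n, n^2)) = O(n^2)
    return total, pairs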
2. If-then-else:
The total computation time of an "if-then-else" follows the condition rule: it is the time to evaluate the condition plus the time of the branch that is taken. By the maximum rule, the worst-case time is bounded by max(tA, tB), where tA and tB are the times of the two branches.
Example:
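The original worked example is not shown here; as an added sketch, consider a conditional whose two branches have different costs:

def if_then_else_example(arr, flag):
    if flag:                          # branch A: O(n)
        return sum(arr)
    else:                             # branch B: O(n^2)
        count = 0
        for i in range(len(arr)):
            for j in range(len(arr)):
                count += 1
        return count

# Worst-case time: O(1) for the test + max(O(n), O(n^2)) = O(n^2)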
3. For loop:
The outer loop executes N times. Every time the outer loop executes, the inner loop executes M times. As a result, the statements in the inner loop execute a total of N * M times, so the total complexity of the two loops is O(N * M), which is O(N^2) when M = N.
A simple loop of the form
for i ← 1 to n
{
    P(i)
}
takes time n*t if each call P(i) takes the same constant time t. If the computation time ti of P(i) varies as a function of i, then the total computation time for the loop is given not by a multiplication but by a sum, i.e. the loop takes time
t1 + t2 + ... + tn = Σ (i = 1 to n) ti.
If the algorithm consists of nested "for" loops, such as
for i ← 1 to n
{
    for j ← 1 to n
    {
        P(i, j)
    }
}
then the total computation time is Σ (i = 1 to n) Σ (j = 1 to n) tij, which becomes O(n^2) when each P(i, j) takes constant time.
Example:
Consider the following "for" loop and calculate its total computation time:
for i ← 2 to n-1
{
    for j ← 3 to i
    {
        sum ← sum + A[i][j]
    }
}
Solution:
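A short derivation of the answer (added here): for each value of i, the inner loop body executes i - 2 times when i ≥ 3 and 0 times when i = 2. The total number of executions is therefore
1 + 2 + ... + (n - 3) = (n - 3)(n - 2)/2,
so the total computation time is Θ(n^2).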
4. While loop:
A simple technique for analysing a "while" loop is to find a function of the variables involved whose value decreases each time around the loop. For the loop to terminate, it is necessary that this value remain a positive integer (or otherwise be bounded below). By keeping track of how many times the value of the function decreases, one can bound the number of repetitions of the loop. The other approach for analysing a "while" loop is to treat it as a recursive algorithm.
Algorithm:
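The pseudocode being analysed is not shown above; the following is a small Python sketch of the standard arrayMax routine to which the operation counts below refer (the constants 5n and 7n-2 come from counting index accesses, comparisons, assignments, and counter updates in the usual textbook way):

def array_max(A, n):
    """Return the maximum of the first n elements of A (n >= 1)."""
    current_max = A[0]
    for i in range(1, n):
        if A[i] > current_max:        # comparison done on every iteration
            current_max = A[i]        # assignment done only when a larger element appears
    return current_max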
Example:
The running time of the algorithm arrayMax, which computes the maximum element in an array of n integers, is O(n).
Solution:
1. 2 + 1 + n + 4(n - 1) + 1 = 5n
2. 2 + 1 + n + 6(n - 1) + 1 = 7n - 2
The best case T(n) = 5n occurs when A[0] is the maximum element, so the assignment inside the loop is never executed. The worst case T(n) = 7n - 2 occurs when the elements are sorted in increasing order, so the assignment is executed on every iteration.
We may, therefore, apply the big-Oh definition with c = 7 and n0 = 1 and conclude that the running time of this algorithm is O(n).
A loop invariant is a statement that is true before and after each iteration of a loop. It is a
useful tool for proving the correctness of an algorithm.
To prove the correctness of an algorithm using a loop invariant, we need to show three things:
1. Initialization: The loop invariant is true before the first iteration of the loop.
2. Maintenance: The loop invariant remains true after each iteration of the loop.
3. Termination: When the loop terminates, the loop invariant gives us a useful property
that helps show that the algorithm is correct.
If we can show these three things, then we can be confident that the algorithm is correct.
Here is an example of a loop invariant for the following algorithm for finding the maximum
element in an array:
Python
def find_max(array):
    max_element = array[0]
    for element in array:
        if element > max_element:
            max_element = element
    return max_element
The invariant here is: at the start of each loop iteration, max_element holds the maximum of the elements examined so far.
Initialization: The loop invariant is true before the first iteration of the loop, because the variable max_element is initialized to the first element in the array.
Maintenance: The loop invariant remains true after each iteration of the loop, because the
algorithm only updates the variable max_element if it finds an element that is greater than
the current maximum element.
Termination: When the loop terminates, the loop invariant tells us that the variable
max_element contains the maximum element of the array. Therefore, the algorithm is
correct.
Loop invariants can be used to prove the correctness of a wide range of algorithms. They are
a powerful tool for ensuring that our algorithms are reliable and efficient.
In practice, this means stating the invariant explicitly before writing the proof and then checking initialization, maintenance, and termination one by one for your algorithms.
SORTING ALGORITHMS AND THEIR ANALYSIS
1. Bubble Sort -->
Bubble Sort is a simple sorting algorithm that repeatedly steps through the list, compares
adjacent elements, and swaps them if they are in the wrong order. The pass through the list
is repeated until no swaps are needed, indicating that the list is sorted. Bubble Sort is
straightforward to understand and implement but is inefficient for large lists, making it
suitable primarily for educational purposes or for small datasets.
• Worst-case time complexity: O(n^2) - In the worst case, when the input list is in
reverse order, Bubble Sort will require n * (n-1) / 2 comparisons and swaps, where n
is the number of elements in the list.
• Average-case time complexity: O(n^2) - The average case also requires roughly the
same number of comparisons and swaps as the worst case.
• Best-case time complexity: O(n) - When the input list is already sorted and the early-exit optimization is used (stop as soon as a pass makes no swaps), Bubble Sort performs n-1 comparisons and zero swaps.
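A minimal Python sketch of Bubble Sort, including the early-exit optimization mentioned above (added for illustration):

def bubble_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        # After i passes, the last i elements are already in their final positions.
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:               # no swaps in this pass: list is sorted
            break
    return arr

# Example: bubble_sort([5, 1, 4, 2, 8]) -> [1, 2, 4, 5, 8]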
2. Selection Sort -->
Selection Sort is a simple and straightforward sorting algorithm that works by repeatedly
selecting the minimum (or maximum) element from an unsorted portion of the list and
moving it to its correct position in the sorted portion of the list. Selection Sort has a time
complexity of O(n^2), making it inefficient for large datasets, but it's easy to understand and
implement.
1. Start with the first element as the minimum (or maximum) element in the unsorted
portion of the list.
2. Compare the minimum (or maximum) element with the next element in the unsorted
portion.
3. If the next element is smaller (or larger) than the current minimum (or maximum),
update the minimum (or maximum) element to the next element.
4. Repeat steps 2 and 3 for the rest of the unsorted portion of the list, finding the
minimum (or maximum) element.
5. Swap the minimum (or maximum) element with the first element in the unsorted
portion, effectively moving it to its correct position in the sorted portion.
6. Repeat the above steps for the remaining unsorted portion of the list until the entire
list is sorted.
• Worst-case time complexity: O(n^2) - In the worst case, Selection Sort requires n * (n-1) / 2 comparisons and at most n - 1 swaps to sort an array of n elements.
• Average-case time complexity: O(n^2) - The average case also requires roughly the
same number of comparisons and swaps as the worst case.
• Best-case time complexity: O(n^2) - Selection Sort performs the same number of
comparisons and swaps in the best case as in the worst and average cases because it
doesn't take advantage of any pre-sorted or partially sorted portions of the input.
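A minimal Python sketch of Selection Sort (added for illustration):

def selection_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        min_index = i
        # Find the minimum element in the unsorted portion arr[i..n-1].
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        # Move it to the front of the unsorted portion.
        arr[i], arr[min_index] = arr[min_index], arr[i]
    return arr

# Example: selection_sort([29, 10, 14, 37, 13]) -> [10, 13, 14, 29, 37]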
3. Insertion Sort -->
Insertion Sort is a simple and efficient in-place sorting algorithm that builds the final sorted
array one item at a time. It works by repeatedly taking an element from the unsorted portion
of the array and inserting it into its correct position within the sorted portion of the array.
Insertion Sort is particularly effective for small datasets or lists that are nearly sorted.
1. Start with the second element (index 1) as the first element in the sorted portion.
2. Compare the first unsorted element (next element after the sorted portion) with the
elements in the sorted portion from right to left.
3. Shift the elements in the sorted portion to the right until the correct position is found
for the unsorted element.
4. Insert the unsorted element into its correct position in the sorted portion.
5. Repeat steps 2-4 for the remaining unsorted elements until the entire array is sorted.
• Worst-case time complexity: O(n^2) - In the worst case, when the input array is in
reverse order, Insertion Sort may require n * (n-1) / 2 comparisons and swaps to sort
an array of n elements.
• Average-case time complexity: O(n^2) - The average case also requires roughly the
same number of comparisons and swaps as the worst case.
• Best-case time complexity: O(n) - In the best case, when the input array is already
sorted, Insertion Sort performs n-1 comparisons but no swaps because each element
is already in its correct position.
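A minimal Python sketch of Insertion Sort (added for illustration):

def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]                  # next unsorted element
        j = i - 1
        # Shift larger sorted elements one position to the right.
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key              # insert into its correct position
    return arr

# Example: insertion_sort([12, 11, 13, 5, 6]) -> [5, 6, 11, 12, 13]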
4. Shell Sort -->
Shell Sort, also known as Shell's method, is an efficient variation of the insertion sort
algorithm. It was designed to improve upon the performance of insertion sort, especially
when dealing with larger datasets. Shell Sort works by sorting elements that are far apart
from each other and gradually reducing the gap between elements until the entire array is
sorted. This technique is known as "diminishing increment sort."
Here is the algorithm for Shell Sort:
1. Start with a gap (or interval) value, typically set to half the length of the array. The gap
value is reduced on each pass through the array.
2. Divide the array into multiple subarrays of equal length, determined by the gap value.
Each subarray is independently sorted using the insertion sort algorithm.
3. Decrease the gap value (often by dividing it by 2) and repeat step 2. Continue reducing
the gap and sorting subarrays until the gap becomes 1.
4. Perform one final pass with a gap of 1, effectively performing an insertion sort on the
entire array.
• The time complexity of Shell Sort depends on the choice of gap sequence. The most
commonly used sequence is the "N/2 to 1" sequence, where N is the length of the
array. In this case, the time complexity is generally considered between O(n^1.25) and
O(n^2). It's significantly faster than the simple insertion sort, especially for larger
datasets.
• The choice of gap sequence can affect the algorithm's performance. There are various
gap sequences, and some may yield better results for specific datasets.
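A minimal Python sketch of Shell Sort using the N/2, N/4, ..., 1 gap sequence (added for illustration):

def shell_sort(arr):
    n = len(arr)
    gap = n // 2
    while gap > 0:
        # Do a gapped insertion sort for this gap size.
        for i in range(gap, n):
            key = arr[i]
            j = i
            while j >= gap and arr[j - gap] > key:
                arr[j] = arr[j - gap]
                j -= gap
            arr[j] = key
        gap //= 2                     # shrink the gap (the "N/2 to 1" sequence)
    return arr

# Example: shell_sort([12, 34, 54, 2, 3]) -> [2, 3, 12, 34, 54]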
5. Heap Sort -->
Heap Sort is a comparison-based sorting algorithm that uses a binary heap data structure to
efficiently sort an array or list of elements. It is an in-place sorting algorithm, which means it
doesn't require additional memory to perform the sorting. Heap Sort has a time complexity
of O(n log n) for both its best-case and worst-case scenarios, making it suitable for sorting
large datasets.
1. Build a Max Heap: Convert the given array into a Max Heap. A Max Heap is a binary
tree in which the value of each node is greater than or equal to the values of its
children.
2. Heapify the Array: Starting from the end of the array, repeatedly heapify the
remaining elements. Heapify is an operation that ensures that the heap property is
maintained for a given node. In this case, we are building a Max Heap, so we ensure
that the largest element is at the root.
3. Swap and Remove: After the heap is built, the largest element (at the root) is swapped
with the last element in the array. Then, the heap size is reduced by one, effectively
removing the largest element from consideration.
4. Repeat: Repeat steps 2 and 3 for the remaining elements in the array. After each
iteration, the largest remaining element will be at the end of the array.
5. Continue the process until the entire array is sorted. The largest elements accumulate at the end of the array, so the final array ends up in ascending order.
• Worst-case time complexity: O(n log n) - Heap Sort consistently has a time complexity
of O(n log n) for both its best-case and worst-case scenarios, making it efficient for
large datasets.
• Average-case time complexity: O(n log n) - The average-case performance is also O(n
log n).
• Best-case time complexity: O(n log n) - Unlike some other sorting algorithms, Heap
Sort doesn't benefit from any special best-case scenarios. It consistently performs at
O(n log n).
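A Python sketch of Heap Sort (added for illustration); heapify restores the max-heap property at one node:

def heapify(arr, n, i):
    """Sift arr[i] down so the subtree rooted at i is a max heap,
    assuming the subtrees below it already are."""
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    # Build a max heap, starting from the last non-leaf node.
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # Repeatedly move the root (largest element) to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        heapify(arr, end, 0)
    return arr

# Example: heap_sort([4, 10, 3, 5, 1]) -> [1, 3, 4, 5, 10]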
DIVIDE AND CONQUER
Divide-and-conquer algorithms are often very efficient. Binary search, for example, runs in logarithmic time (O(log n)): the time it takes to run grows only logarithmically with the size of the input, because each step discards half of the remaining search range.
Example:
Suppose we search a sorted array of 10 elements (indices 0 through 9) for the target value 9.
Steps:
1. Initialize low and high to the beginning and end of the array, respectively:
low = 0
high = 9
2. While low is less than or equal to high:
o Find the middle element of the array:
mid = (low + high) // 2 = 4
o Compare the target value to the middle element:
if array[mid] == target:
    return mid
Since the target value is equal to the middle element, we return the middle index, which is 4.
Therefore, the binary search algorithm successfully finds the target value 9 in the
sorted array.
In another example, the target value 14 is not present in the array. The search interval keeps shrinking until low exceeds high, at which point the binary search algorithm correctly reports that the target value 14 is not in the sorted array.
Binary search is a very efficient algorithm for searching for a target value in a sorted
array. It has an average time complexity of O(log n), where n is the size of the array.
This means that the time it takes for the algorithm to run grows logarithmically with
the size of the input problem.
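A small Python sketch of iterative binary search (added for illustration; the sample array below is a hypothetical one chosen so that the value 9 sits at index 4, matching the worked example above):

def binary_search(array, target):
    low, high = 0, len(array) - 1
    while low <= high:
        mid = (low + high) // 2
        if array[mid] == target:
            return mid                # found the target
        elif array[mid] < target:
            low = mid + 1             # search the right half
        else:
            high = mid - 1            # search the left half
    return -1                         # target is not in the array

# Example: binary_search([1, 3, 5, 7, 9, 11, 13, 15, 17, 19], 9)  -> 4
#          binary_search([1, 3, 5, 7, 9, 11, 13, 15, 17, 19], 14) -> -1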
Merge sort:
Here is an example of how the merge sort algorithm works:
Suppose we have the following unsorted array:
[5, 3, 2, 1, 4]
We can use the merge sort algorithm to sort this array as follows.
1. Divide the array into two halves:
left = [5, 3, 2]
right = [1, 4]
2. Recursively sort the two halves:
left = merge_sort(left)
right = merge_sort(right)
3. Merge the two sorted halves back together:
result = merge(left, right)
After merging the two sorted halves, we get the following sorted array:
[1, 2, 3, 4, 5]
Therefore, the merge sort algorithm successfully sorted the original unsorted array.
Merge sort is a very versatile and efficient sorting algorithm. It is often used in a variety of
applications, such as database management systems, search engines, and operating systems.
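A compact Python sketch of merge sort and its merge step (added for illustration):

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])      # recursively sort the left half
    right = merge_sort(arr[mid:])     # recursively sort the right half
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    # Repeatedly take the smaller front element of the two sorted halves.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

# Example: merge_sort([5, 3, 2, 1, 4]) -> [1, 2, 3, 4, 5]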
Quick sort:
Suppose we have the following unsorted array:
[5, 3, 2, 1, 4]
We can use the quicksort algorithm to sort this array as follows:
1. Select a pivot element. Let's say we choose the first element, which is 5.
2. Partition the array around the pivot element. We get the following two subarrays:
left = [3, 2, 1, 4] (elements smaller than the pivot)
right = [] (elements larger than the pivot)
3. Recursively sort the two subarrays on either side of the pivot element:
left = quicksort(left) = [1, 2, 3, 4]
right = quicksort(right) = []
4. Return the sorted array:
result = left + [pivot] + right
The following is the sorted array:
[1, 2, 3, 4, 5]
Therefore, the quicksort algorithm successfully sorted the original unsorted array.
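A short Python sketch of quicksort using the first element as the pivot, as in the example above (added for illustration):

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]                                  # first element as pivot
    left = [x for x in arr[1:] if x <= pivot]       # elements no larger than the pivot
    right = [x for x in arr[1:] if x > pivot]       # elements larger than the pivot
    return quicksort(left) + [pivot] + quicksort(right)

# Example: quicksort([5, 3, 2, 1, 4]) -> [1, 2, 3, 4, 5]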
RECURRENCE
In the context of algorithm design and analysis, a recurrence relation (or recurrence) is a
mathematical equation that describes the time complexity or space complexity of an
algorithm in terms of the size of the input. Recurrence relations are often used to analyse
the efficiency of recursive algorithms and divide-and-conquer algorithms. They provide a
way to express the running time of an algorithm as a function of the problem size. Common methods for solving recurrences include:
1. Substitution Method
2. Iteration Method
3. Recursion Tree Method
4. Master Method
Example: solve the recurrence T(n) = T(n-1) + 1, with T(1) = Θ(1), using the iteration (repeated substitution) method.
Solution:
T(n) = T(n-1) + 1
     = (T(n-2) + 1) + 1 = T(n-2) + 2
     = T(n-3) + 3
     = T(n-4) + 4
     ...
     = T(n-k) + k
Where k = n-1:
T(n-k) = T(1) = Θ(1)
T(n) = Θ(1) + (n-1) = 1 + n - 1 = n = Θ(n).
DYNAMIC PROGRAMMING
The following are the two main properties of a problem that suggest that the given problem can be solved using dynamic programming:
1. Overlapping subproblems: the same subproblems are solved repeatedly, so their solutions can be stored and reused.
2. Optimal substructure (the principle of optimality): an optimal solution to the problem can be constructed from optimal solutions to its subproblems.
In problems with these properties, we can break the problem down into smaller subproblems, and then use the principle of optimality to find the optimal solution to each subproblem. By combining the optimal solutions to the subproblems, we can then find the optimal solution to the overall problem.
Here is a simple example of how to use the principle of optimality to solve a problem: finding the shortest path from node A to node B in a graph.
1. Consider each neighbor of node A.
2. For each neighbor, solve the subproblem of finding the shortest path from that neighbor to node B.
3. The shortest path from node A to node B is the cheapest combination of an edge from node A to one of its neighbors plus the shortest path from that neighbor to node B.
By using the principle of optimality to ensure that we always find the shortest path to
each subproblem, we can guarantee that we will also find the shortest path to the
overall problem.
The principle of optimality is a powerful tool that allows us to solve complex problems
efficiently. By breaking down problems into smaller subproblems and using the
principle of optimality to find the optimal solutions to the subproblems, we can find
the optimal solution to the overall problem.
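A minimal Python sketch of this idea on a small hypothetical weighted DAG (the graph and its edge weights are made up for illustration); memoization stores the optimal answer to each subproblem so it is computed only once:

from functools import lru_cache

# Hypothetical directed acyclic graph: node -> list of (neighbor, edge_weight).
graph = {
    'A': [('C', 1), ('D', 4)],
    'C': [('D', 2), ('B', 6)],
    'D': [('B', 1)],
    'B': [],
}

@lru_cache(maxsize=None)
def shortest(node, target='B'):
    """Length of the shortest path from node to target (assumes a DAG)."""
    if node == target:
        return 0
    best = float('inf')
    for neighbor, weight in graph[node]:
        # Principle of optimality: an optimal path from node to the target
        # uses an optimal path from the chosen neighbor to the target.
        best = min(best, weight + shortest(neighbor, target))
    return best

print(shortest('A'))   # 4  (A -> C -> D -> B : 1 + 2 + 1)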
GREEDY ALGORITHM
A greedy algorithm is an algorithmic paradigm that makes a series of choices at each step
with the hope of finding an overall optimal solution. The key characteristic of a greedy
algorithm is that it makes the locally optimal choice at each step without considering the
global picture. In other words, it makes the best decision based on the current information
available, without worrying about how that choice will affect future decisions. Greedy
algorithms are often used for optimization problems where the goal is to find the best
solution from a set of possible solutions.
Here are some general characteristics of greedy algorithms:
1. Greedy Choice Property: At each step of the algorithm, a greedy algorithm makes the
choice that appears to be the best at that particular moment. This choice is based
solely on the information available at that step and doesn't consider how it will affect
future choices or the overall solution.
2. Optimal Substructure: Many problems exhibit the optimal substructure property,
which means that an optimal solution to the problem can be constructed from
optimal solutions to its subproblems. Greedy algorithms often rely on this property,
solving subproblems in a way that contributes to the global optimal solution.
3. Greedy Algorithms are Easy to Design: One of the advantages of greedy algorithms is
that they are relatively easy to design and implement. This simplicity makes them a
practical choice for solving a wide range of problems.
4. Not Always Globally Optimal: While greedy algorithms are efficient and simple, they
do not guarantee finding the globally optimal solution in every case. In some
situations, making locally optimal choices at each step can lead to suboptimal overall
solutions.
5. Greedy Choice May Not Be Reversible: In some cases, the choice made by a greedy
algorithm cannot be undone or changed later. This lack of reversibility means that if
the initial choices turn out to be suboptimal, the algorithm may get stuck with a less-
than-optimal solution.
6. Greedy Algorithms Work Well for Certain Problems: Greedy algorithms are
particularly effective for solving problems where making the locally optimal choice at
each step leads to a globally optimal solution. These problems are often referred to
as "greedy-choice property" problems.
Exploring Graph
• Undirected graph
• Directed graph
• Traversing graph
DFS and BFS are both very powerful and versatile algorithms. They can be used to solve a
wide variety of problems, including:
• Finding the shortest path between two nodes in an unweighted graph (using BFS)
• Serving as building blocks in algorithms for minimum spanning trees and maximum flow in a network
• Detecting cycles in a graph
• Finding the connected components of a graph
DFS and BFS are essential tools for anyone who works with graphs.
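Minimal Python sketches of the two traversals (added for illustration), using an adjacency-list representation:

from collections import deque

def bfs(graph, start):
    """Breadth-first traversal; returns vertices in the order visited."""
    visited, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

def dfs(graph, start, visited=None, order=None):
    """Depth-first traversal; returns vertices in the order visited."""
    if visited is None:
        visited, order = set(), []
    visited.add(start)
    order.append(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited, order)
    return order

# Example on a small hypothetical graph:
# graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
# bfs(graph, 'A') -> ['A', 'B', 'C', 'D']
# dfs(graph, 'A') -> ['A', 'B', 'D', 'C']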
Topological sort:
Topological sorting is a linear ordering of the vertices of a directed acyclic graph (DAG) such
that for every directed edge from vertex u to vertex v, u comes before v in the ordering.
A DAG is a directed graph with no cycles. This means that there is no path from any vertex to
itself in the graph.
Topological sorting can be used to solve a variety of problems, such as:
• Scheduling tasks: Topological sorting can be used to schedule tasks in a way that
ensures that all dependencies are satisfied. For example, if task A depends on task B,
then task B must be scheduled before task A.
• Ordering data: Topological sorting can be used to order data in a way that ensures
that all references are resolved. For example, if a database table references another
database table, then the referenced table must be created before the referencing
table.
There are two main algorithms for topological sorting:
• Kahn's algorithm: Kahn's algorithm works by finding all of the vertices in the DAG
with no incoming edges. These vertices are placed in a queue. The algorithm then
repeatedly removes a vertex from the queue and visits all of its outgoing neighbors.
If an outgoing neighbor has no remaining incoming edges, it is placed in the queue.
The algorithm terminates when the queue is empty.
• DFS algorithm: A DFS algorithm can also be used to perform topological sorting. The
algorithm starts at a vertex and recursively explores all of its outgoing neighbors. If
an outgoing neighbor has not yet been visited, the algorithm recursively explores all
of its outgoing neighbors. The algorithm terminates when all of the vertices in the
DAG have been visited.
The following is an example of how to topologically sort the following DAG, whose edges are A → B, A → C, B → D, and C → D:
A -> B
|    |
v    v
C -> D
Kahn's algorithm would topologically sort the DAG as follows:
1. Find all of the vertices with no incoming edges: only A.
2. Place A in the queue: [A].
3. Remove A from the queue and visit its outgoing neighbors C and B; both now have no remaining incoming edges, so they are placed in the queue: [C, B].
4. Remove C from the queue and visit its outgoing neighbor D; D still has an incoming edge from B, so it is not yet added.
5. Remove B from the queue and visit its outgoing neighbor D; D now has no remaining incoming edges, so it is placed in the queue: [D].
6. Remove D from the queue. The queue is now empty, so the algorithm terminates.
The vertices were removed from the queue in the order A, C, B, D, so the topological order of the DAG is [A, C, B, D]. (Topological orders are generally not unique; [A, B, C, D] is also valid here.)
A DFS-based algorithm would topologically sort the DAG as follows:
1. Start at a vertex: A.
2. Recursively explore an unvisited outgoing neighbor of A, say B, and from B explore its neighbor D. D has no outgoing neighbors, so D finishes first, and then B finishes.
3. Back at A, explore its remaining unvisited neighbor C. C's only neighbor D has already been visited, so C finishes, and finally A finishes.
4. List the vertices in reverse order of their finishing times: A, C, B, D.
The final topological order of the DAG would be [A, C, B, D].
Topological sorting is a powerful tool that can be used to solve a variety of problems. It is a
relatively simple algorithm to implement, and it can be used on graphs of any size.
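A compact Python sketch of Kahn's algorithm (added for illustration), applied to the DAG from the example above:

from collections import deque

def topological_sort(graph):
    """Kahn's algorithm: graph maps each vertex to its list of outgoing neighbors."""
    in_degree = {v: 0 for v in graph}
    for v in graph:
        for neighbor in graph[v]:
            in_degree[neighbor] += 1
    queue = deque(v for v in graph if in_degree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for neighbor in graph[v]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order

# The DAG from the example above (edges A->B, A->C, B->D, C->D):
dag = {'A': ['C', 'B'], 'B': ['D'], 'C': ['D'], 'D': []}
print(topological_sort(dag))   # ['A', 'C', 'B', 'D']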